From 3e4b193f783c2d43547123518cadd1b2a9b11055 Mon Sep 17 00:00:00 2001
From: Gunes Bayir
Date: Sat, 16 Mar 2024 23:40:39 +0000
Subject: Fix quant. gemv kernel driver by adding set_quantized_bias()

arm_gemm fuses the actual bias addition with the output stage in
quantized gemm. The output stage, in its very basic form, is:

    A_offset * B_offset - sum(A_row_i) * B_offset - sum(B_col_j) * A_offset

Matrix B is usually constant (e.g. weight matrix in convolutions).
Therefore, except the middle term above, the expression is constant
across the same output row because the column sums of matrix B are
pre-calculated. The bias is also usually constant. When it is, it makes
sense to add the bias vector to the above sum and just perform a single
addition on top of the output tensor. For this to happen, the column sum
computation of B tensor must account for the bias. This is ensured by
set_quantized_bias() method in the interface. This function passes the
bias pointer and strides to arm_gemm.

Gemv_pretransposed does not implement set_quantized_bias() and uses the
parent function, which does nothing. Therefore, the bias is not added to
the output. This causes tests to fail.

Resolves: COMPMID-6928
Change-Id: Iba24fabc65fdc47edb12db6abff2fb47784c0743
Signed-off-by: Gunes Bayir
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11310
Benchmark: Arm Jenkins
Tested-by: Arm Jenkins
Reviewed-by: Jakub Sujak
---
 src/core/NEON/kernels/arm_gemm/gemm_quint8.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/core/NEON/kernels/arm_gemm/gemm_quint8.cpp b/src/core/NEON/kernels/arm_gemm/gemm_quint8.cpp
index 3baf9857da..b85b1c4fcf 100644
--- a/src/core/NEON/kernels/arm_gemm/gemm_quint8.cpp
+++ b/src/core/NEON/kernels/arm_gemm/gemm_quint8.cpp
@@ -67,7 +67,7 @@ static const GemmImplementation gemm_quint8_meth
 #ifdef ARM_COMPUTE_ENABLE_SME2
 // SME kernels
 {
-    GemmMethod::GEMM_HYBRID,
+    GemmMethod::GEMV_PRETRANSPOSED,
     "sme2_gemv_u8qa_dot_16VL",
     [](const GemmArgs &args, const Requantize32 &qp) { return args._ci->has_sme2() && quant_hybrid_asymmetric(qp) && args._Msize == 1 && !args._indirect_input && args._nbatches == 1; },
     nullptr,
-- 
cgit v1.2.1