author | Gian Marco Iodice <gianmarco.iodice@arm.com> | 2020-04-15 11:42:15 +0100 |
---|---|---|
committer | Gian Marco Iodice <gianmarco.iodice@arm.com> | 2020-04-20 13:04:42 +0000 |
commit | eb65f6da695ac0d3e495817145cceb1c4de4f048 (patch) | |
tree | 1e4980ba6d6ce2d738670c2ebadf4e24ebd172ce /arm_compute/core/CL/kernels/CLGEMMLowpOffsetContributionOutputStageKernel.h | |
parent | 47a899017e67556ffffef78571c9be61dd7bc3f0 (diff) | |
COMPMID-3304: Update OpenCL GEMM heuristic for Int8
Change-Id: I6b7ff678d8d0437a1639db2ff602ea1cdb155464
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3056
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Diffstat (limited to 'arm_compute/core/CL/kernels/CLGEMMLowpOffsetContributionOutputStageKernel.h')
-rw-r--r-- | arm_compute/core/CL/kernels/CLGEMMLowpOffsetContributionOutputStageKernel.h | 8 |
1 files changed, 4 insertions, 4 deletions
diff --git a/arm_compute/core/CL/kernels/CLGEMMLowpOffsetContributionOutputStageKernel.h b/arm_compute/core/CL/kernels/CLGEMMLowpOffsetContributionOutputStageKernel.h
index 02ed20e5af..032539b699 100644
--- a/arm_compute/core/CL/kernels/CLGEMMLowpOffsetContributionOutputStageKernel.h
+++ b/arm_compute/core/CL/kernels/CLGEMMLowpOffsetContributionOutputStageKernel.h
@@ -30,9 +30,9 @@ namespace arm_compute
 {
 class ICLTensor;
 
-/** OpenCL kernel used to add the offset contribution after @ref CLGEMMLowpMatrixMultiplyKernel and perform the output stage.
+/** OpenCL kernel used to add the offset contribution after the matrix multiplication and perform the output stage.
  *
- * This kernel takes a final int32 accumulator value (the output of @ref CLGEMMLowpMatrixMultiplyKernel), adds to it the offset contribution
+ * This kernel takes a final int32 accumulator value (the output of the matrix multiplication), adds to it the offset contribution
  * of matrix A and matrix B and performs the output stage defined by the output_stage argument
  *
  * @note For quantized computations the output data type for auto-initialization must be passed as part of the @ref GEMMLowpOutputStageInfo.
@@ -52,7 +52,7 @@ public:
     CLGEMMLowpOffsetContributionOutputStageKernel &operator=(CLGEMMLowpOffsetContributionOutputStageKernel &&) = default;
     /** Initialise the kernel's input and output.
      *
-     * @param[in] mm_result Input tensor containing the result of @ref CLGEMMLowpMatrixMultiplyKernel. Data type supported: S32
+     * @param[in] mm_result Input tensor containing the result of the matrix multiplication. Data type supported: S32
      * @param[in] vector_sum_col Input row-vector of sums of all the entries in each column of matrix B.
      * Note: vector_sum_col can be a nullptr in case a_offset = 0. Data type supported: same as @p mm_result
      * @param[in] vector_sum_row Input row-vector of sums of all the entries in each row of matrix A.
@@ -74,7 +74,7 @@ public:
     /** Initialise the kernel's input and output.
      *
      * @param[in] compile_context The compile context to be used.
-     * @param[in] mm_result Input tensor containing the result of @ref CLGEMMLowpMatrixMultiplyKernel. Data type supported: S32
+     * @param[in] mm_result Input tensor containing the result of the matrix multiplication. Data type supported: S32
      * @param[in] vector_sum_col Input row-vector of sums of all the entries in each column of matrix B.
      * Note: vector_sum_col can be a nullptr in case a_offset = 0. Data type supported: same as @p mm_result
      * @param[in] vector_sum_row Input row-vector of sums of all the entries in each row of matrix A.
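Note for readers of the doc comment touched by this change: the "offset contribution" it describes can be illustrated with a scalar reference. The sketch below is not the library's OpenCL kernel. The function name `add_offset_contribution`, the row-major layout, and the explicit `m`, `n`, `k` arguments are assumptions made for illustration, and the output stage (requantization) is left out. It assumes the usual expansion of sum_k (A[i][k] + a_offset) * (B[k][j] + b_offset), which is consistent with `vector_sum_col` being optional when `a_offset = 0`.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical scalar reference, not the library's implementation.
// mm_result      : M x N row-major int32 accumulators, sum_k A[i][k] * B[k][j]
// vector_sum_col : N entries, vector_sum_col[j] = sum_k B[k][j] (may be empty if a_offset == 0)
// vector_sum_row : M entries, vector_sum_row[i] = sum_k A[i][k] (may be empty if b_offset == 0)
// k              : shared dimension of A (M x K) and B (K x N)
std::vector<int32_t> add_offset_contribution(const std::vector<int32_t> &mm_result,
                                             const std::vector<int32_t> &vector_sum_col,
                                             const std::vector<int32_t> &vector_sum_row,
                                             int32_t a_offset, int32_t b_offset,
                                             int m, int n, int k)
{
    std::vector<int32_t> out(mm_result);
    for(int i = 0; i < m; ++i)
    {
        for(int j = 0; j < n; ++j)
        {
            int32_t acc = out[i * n + j];
            // a_offset multiplies the column sums of B, so they are not needed when a_offset == 0
            if(a_offset != 0)
            {
                acc += a_offset * vector_sum_col[j];
            }
            // b_offset multiplies the row sums of A
            if(b_offset != 0)
            {
                acc += b_offset * vector_sum_row[i];
            }
            // Constant term from the product of both offsets over the K dimension
            acc += a_offset * b_offset * k;
            out[i * n + j] = acc;
        }
    }
    return out;
}
```

In the real kernel this addition is fused with the output stage selected through `GEMMLowpOutputStageInfo`; the sketch stops at the corrected int32 accumulator.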