aboutsummaryrefslogtreecommitdiff
path: root/src/core/CL/kernels/CLGEMMMatrixMultiplyReshapedOnlyRHSKernel.cpp
diff options
context:
space:
mode:
authorGian Marco Iodice <gianmarco.iodice@arm.com>2020-08-11 14:14:06 +0100
committerGian Marco Iodice <gianmarco.iodice@arm.com>2020-08-12 14:58:38 +0000
commit088d63aae947efd8bbcfd4d27c1f50a6af79e3b9 (patch)
tree294b50056d5b85ef74beae997a8184a12b75d693 /src/core/CL/kernels/CLGEMMMatrixMultiplyReshapedOnlyRHSKernel.cpp
parentb10181b9b476a0b41e270472e97eb0b8e5e197d5 (diff)
downloadComputeLibrary-088d63aae947efd8bbcfd4d27c1f50a6af79e3b9.tar.gz
COMPMID-3337: Remove write paddings in both axes from CLGEMMMatrixMultiplyReshapedKernel
- Change the interface of STORE_BLOCK_BOUNDARY_AWARE passing the conditions on Y and X rather than the X/ coordinates. This allows to use the macro with both GEMM reshaped and GEMM reshaped rhs only - Remove padding from the output tensor of CLGEMMMatrixMultiplyReshapedKernel - Add tests for validating the zero padding requirement Change-Id: I13263cc71ce065c5be34ed198def320dd5823495 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3712 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Diffstat (limited to 'src/core/CL/kernels/CLGEMMMatrixMultiplyReshapedOnlyRHSKernel.cpp')
-rw-r--r--src/core/CL/kernels/CLGEMMMatrixMultiplyReshapedOnlyRHSKernel.cpp2
1 files changed, 1 insertions, 1 deletions
diff --git a/src/core/CL/kernels/CLGEMMMatrixMultiplyReshapedOnlyRHSKernel.cpp b/src/core/CL/kernels/CLGEMMMatrixMultiplyReshapedOnlyRHSKernel.cpp
index e65726b234..cf77c70bfa 100644
--- a/src/core/CL/kernels/CLGEMMMatrixMultiplyReshapedOnlyRHSKernel.cpp
+++ b/src/core/CL/kernels/CLGEMMMatrixMultiplyReshapedOnlyRHSKernel.cpp
@@ -162,7 +162,7 @@ std::pair<Status, Window> validate_and_configure_window(ITensorInfo *input0, ITe
input0->dimension(0),
input0->dimension(1));
AccessWindowStatic input1_access(input1, 0, 0,
- ceil_to_multiple(input1->dimension(0), num_elems_processed_per_iteration_x),
+ input1->dimension(0),
input1->dimension(1));
AccessWindowStatic output_access(output, 0, 0,
output->dimension(0),