| field | value | date |
|---|---|---|
| author | Viet-Hoa Do <viet-hoa.do@arm.com> | 2023-10-09 10:58:35 +0100 |
| committer | Viet-Hoa Do <viet-hoa.do@arm.com> | 2023-10-11 10:01:49 +0000 |
| commit | c210c85548c7f627690ed9259622d3ab342fe612 (patch) | |
| tree | 6385edb5083a805bac8ddd83567a1e1dac0715ce /src/core/NEON/kernels/arm_gemm/transforms/sme_transpose_interleave_16VL.hpp | |
| parent | fb9c25d27791d934300581596cce7c5875a79a80 (diff) | |
| download | ComputeLibrary-c210c85548c7f627690ed9259622d3ab342fe612.tar.gz | |

Optimize CL reduction operation

* Batch dimension is added to reduction operation.
  - All the dimensions higher than the batch dimension are collapsed
    so that the input and output tensors are always 3-4D.
  - CL kernel is called once instead of being repeatedly called
    to process each sliding window.
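
The dimension collapsing described above can be illustrated with a small standalone sketch. It is not code from this commit: `collapse_above_batch` is a hypothetical helper, and treating index 3 as the batch dimension is an assumption made only for illustration. The sketch folds every dimension at or above the batch index into one, so a shape of any rank becomes at most 4D while the element count is unchanged.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a ComputeLibrary API): multiply every dimension at
// index >= batch_dim into a single trailing "batch" dimension.
// Assumption: dimension 0 is the innermost one and batch_dim defaults to 3.
std::vector<std::size_t> collapse_above_batch(const std::vector<std::size_t> &shape,
                                              std::size_t batch_dim = 3)
{
    // Shapes with at most batch_dim + 1 dimensions have nothing to collapse.
    if (shape.size() <= batch_dim + 1)
    {
        return shape;
    }

    // Keep the lower dimensions unchanged.
    std::vector<std::size_t> out(shape.begin(),
                                 shape.begin() + static_cast<std::ptrdiff_t>(batch_dim));

    // Fold the batch dimension and everything above it into one value.
    std::size_t batch = 1;
    for (std::size_t i = batch_dim; i < shape.size(); ++i)
    {
        batch *= shape[i];
    }
    out.push_back(batch);

    // Collapsing must preserve the total number of elements.
    std::size_t before = 1;
    std::size_t after  = 1;
    for (std::size_t d : shape) { before *= d; }
    for (std::size_t d : out) { after *= d; }
    assert(before == after);

    return out;
}
```

Under these assumptions, a 6D shape `{8, 4, 3, 2, 5, 7}` collapses to `{8, 4, 3, 70}`, so one kernel launch can cover all 70 batch entries, while a 3D shape such as `{8, 4, 3}` is returned unchanged.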
Resolves: COMPMID-6443
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Change-Id: Icd99939d52d3bb648f08537e5f52ef27e894061b
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10456
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Diffstat (limited to 'src/core/NEON/kernels/arm_gemm/transforms/sme_transpose_interleave_16VL.hpp')
0 files changed, 0 insertions, 0 deletions