From c210c85548c7f627690ed9259622d3ab342fe612 Mon Sep 17 00:00:00 2001
From: Viet-Hoa Do
Date: Mon, 9 Oct 2023 10:58:35 +0100
Subject: Optimize CL reduction operation

* Batch dimension is added to reduction operation.
  - All the dimensions higher than the batch dimension
    are collapsed so that the input and output tensors
    are always 3-4D.
  - CL kernel is called once instead of being repeatedly
    called to process each sliding window.

Resolves: COMPMID-6443
Signed-off-by: Viet-Hoa Do
Change-Id: Icd99939d52d3bb648f08537e5f52ef27e894061b
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10456
Reviewed-by: Jakub Sujak
Tested-by: Arm Jenkins
Benchmark: Arm Jenkins
Comments-Addressed: Arm Jenkins
---
 docs/user_guide/release_version_and_change_log.dox | 1 +
 1 file changed, 1 insertion(+)

(limited to 'docs')

diff --git a/docs/user_guide/release_version_and_change_log.dox b/docs/user_guide/release_version_and_change_log.dox
index d1429b61d7..b2500944ca 100644
--- a/docs/user_guide/release_version_and_change_log.dox
+++ b/docs/user_guide/release_version_and_change_log.dox
@@ -56,6 +56,7 @@ v23.11 Public major release
  - Optimize @ref cpu::CpuReshape
  - Optimize @ref opencl::ClTranspose
  - Optimize @ref NEStackLayer
+ - Optimize @ref CLReductionOperation.
  - Add new OpenCL™ kernels:
    - @ref opencl::kernels::ClMatMulLowpNativeMMULKernel support for QASYMM8 and QASYMM8_SIGNED, with batch support
  - Deprecate support for Bfloat16 in @ref cpu::CpuCast.
-- 
cgit v1.2.1
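
Note on the commit message: the dimension collapsing it describes can be illustrated with a minimal sketch. This is not the code from the change itself. `collapse_higher_dims` and its `keep` parameter are hypothetical names chosen for illustration, and the sketch assumes that "collapsing" means folding every higher dimension into the last retained one by multiplying their extents.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of the Compute Library API.
// Folds every dimension at index >= keep - 1 into one trailing dimension,
// so the returned shape has at most `keep` dimensions.
std::vector<std::size_t> collapse_higher_dims(const std::vector<std::size_t> &shape, std::size_t keep)
{
    assert(keep >= 1);
    if (shape.size() <= keep)
    {
        return shape; // Already within the limit: nothing to collapse.
    }

    // Copy the lower dimensions unchanged.
    const auto split = shape.begin() + static_cast<std::ptrdiff_t>(keep - 1);
    std::vector<std::size_t> out(shape.begin(), split);

    // Multiply the remaining extents into a single dimension.
    std::size_t merged = 1;
    for (auto it = split; it != shape.end(); ++it)
    {
        merged *= *it;
    }
    out.push_back(merged);
    return out;
}
```

Under these assumptions, a 5D shape `{8, 4, 3, 2, 5}` with `keep = 4` becomes `{8, 4, 3, 10}`. The element count is unchanged, so a single kernel launch over the collapsed shape can cover what previously needed one launch per slice of the higher dimensions.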