From a23b4686a091a7960a4b336d0fe53f15db4ae538 Mon Sep 17 00:00:00 2001
From: Jakub Sujak
Date: Thu, 5 Oct 2023 10:20:59 +0100
Subject: Optimize CLTranspose operator

* Transpose higher dimensional tensors (>2D) by collapsing higher
  dimensions into the third dimension, thus avoiding multiple dispatches
  of the CL kernel
* Maximize tile size without register spilling

Resolves: COMPMID-6448

Change-Id: Iac094b8c428bdf319d9c28a8334cb55d58e2d14b
Signed-off-by: Jakub Sujak
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10443
Tested-by: Arm Jenkins
Reviewed-by: Viet-Hoa Do
Comments-Addressed: Arm Jenkins
Benchmark: Arm Jenkins
---
 docs/user_guide/release_version_and_change_log.dox | 1 +
 1 file changed, 1 insertion(+)

diff --git a/docs/user_guide/release_version_and_change_log.dox b/docs/user_guide/release_version_and_change_log.dox
index 3e04837c1e..882244d2f2 100644
--- a/docs/user_guide/release_version_and_change_log.dox
+++ b/docs/user_guide/release_version_and_change_log.dox
@@ -54,6 +54,7 @@ v23.11 Public major release
  - Remove legacy PostOps interface. PostOps was the experimental interface for kernel fusion and is replaced by the new Dynamic Fusion interface.
  - Performance optimizations:
    - Optimize @ref cpu::CpuReshape
+   - Optimize @ref opencl::ClTranspose
  - Add new OpenCL™ kernels:
    - @ref opencl::kernels::ClMatMulLowpNativeMMULKernel support for QASYMM8 and QASYMM8_SIGNED, with batch support
  - Deprecate support for Bfloat16 in @ref cpu::CpuCast.
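
The first bullet of the commit message (collapsing every dimension above the second into a single third dimension) can be illustrated with a small CPU-side sketch. This is not the ClTranspose kernel or its configuration code: the helper names `collapsed_batches` and `transpose_planes` are hypothetical, and the layout is assumed to be dense with dimension 0 (width) innermost. It only shows why one pass over (x, y, batch) is enough to transpose every 2D plane of a higher-dimensional tensor.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a Compute Library API): fold every dimension above
// the second into one batch count. shape[0] is the width (innermost dimension),
// shape[1] is the height, shape[2..] are the higher dimensions.
std::size_t collapsed_batches(const std::vector<std::size_t> &shape)
{
    std::size_t batches = 1;
    for (std::size_t i = 2; i < shape.size(); ++i)
    {
        batches *= shape[i];
    }
    return batches;
}

// Hypothetical CPU reference: swap the two innermost dimensions of every 2D
// plane, iterating once over (x, y, batch) instead of once per higher
// dimension. Assumes a dense layout with no padding between rows or planes.
std::vector<float> transpose_planes(const std::vector<float> &src,
                                    const std::vector<std::size_t> &shape)
{
    const std::size_t w = shape.size() > 0 ? shape[0] : 1;
    const std::size_t h = shape.size() > 1 ? shape[1] : 1;
    const std::size_t b = collapsed_batches(shape);
    assert(src.size() == w * h * b);

    std::vector<float> dst(src.size());
    for (std::size_t z = 0; z < b; ++z)
    {
        const std::size_t plane = z * w * h;
        for (std::size_t y = 0; y < h; ++y)
        {
            for (std::size_t x = 0; x < w; ++x)
            {
                // Source plane is h rows of w elements; destination plane is
                // w rows of h elements.
                dst[plane + x * h + y] = src[plane + y * w + x];
            }
        }
    }
    return dst;
}
```

With a shape of {3, 2, 4, 5}, for example, `collapsed_batches` returns 20, so the 4D tensor is treated as 20 planes of 3x2 handled in one 3D iteration space rather than through repeated 2D passes.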