Integrate multi-threaded pretranspose_B_array

This is required for the case where rhs (B) is dynamic and needs to be pretransposed in every run. In a multi-threaded setting, this means the previously single-threaded pretranspose_B_array would become the bottleneck.

Resolves COMPMID-5896

Signed-off-by: SiCong Li <sicong.li@arm.com>
Change-Id: Id508c46992188a0f76a505152931d4955d04c16d
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9455
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
author: SiCong Li <sicong.li@arm.com> 2023-04-06 16:30:18 +0100
committer: SiCong Li <sicong.li@arm.com> 2023-04-26 09:10:38 +0000
commit: dba672cec878966e465bb476e896c8f75bbd9145 (patch)
tree: fcc8df3dc3f3799a616d2a10d52dd9bfdf6d2e33 /src/runtime
parent: 7fefac722568d997b4d9e136925e93c7abeb564a (diff)
download: ComputeLibrary-dba672cec878966e465bb476e896c8f75bbd9145.tar.gz
1 files changed, 3 insertions, 3 deletions
diff --git a/src/runtime/IScheduler.cpp b/src/runtime/IScheduler.cpp
index 39f41555fa..436fd9ca16 100644
--- a/src/runtime/IScheduler.cpp
+++ b/src/runtime/IScheduler.cpp
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2016-2022 Arm Limited.
+ * Copyright (c) 2016-2023 Arm Limited.
  *
  * SPDX-License-Identifier: MIT
  *
@@ -139,7 +139,7 @@ void IScheduler::schedule_common(ICPPKernel *kernel, const Hints &hints, const W
                 default:
                     ARM_COMPUTE_ERROR("Unknown strategy");
             }
-            // Make sure the smallest window is larger than minimim workload size
+            // Make sure the smallest window is larger than minimum workload size
             num_windows = adjust_num_of_windows(max_window, hints.split_dimension(), num_windows, *kernel, cpu_info());
 
             std::vector<IScheduler::Workload> workloads(num_windows);
@@ -178,7 +178,7 @@ void IScheduler::run_tagged_workloads(std::vector<Workload> &workloads, const ch
 std::size_t IScheduler::adjust_num_of_windows(const Window &window, std::size_t split_dimension, std::size_t init_num_windows, const ICPPKernel &kernel, const CPUInfo &cpu_info)
 {
     // Mitigation of the narrow split issue, which occurs when the split dimension is too small to split (hence "narrow").
-    if(window.num_iterations(split_dimension) < init_num_windows )
+    if(window.num_iterations(split_dimension) < init_num_windows)
     {
         auto recommended_split_dim = Window::DimX;
         for(std::size_t dims = Window::DimY; dims <= Window::DimW; ++dims)
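
The comment in `adjust_num_of_windows` describes the "narrow split" issue: the split dimension has fewer iterations than the number of windows requested. The standalone sketch below illustrates that check in isolation. It is not the library's implementation: `IterationCounts`, `is_narrow_split`, and `widest_dimension` are hypothetical names, and the policy of recommending the dimension with the most iterations is an assumption, since the hunk above is cut off before the loop body.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a window: iteration counts for dimensions X, Y, Z, W.
using IterationCounts = std::array<std::size_t, 4>;

// Returns true when the chosen split dimension has fewer iterations than the
// number of windows requested, i.e. the split is "narrow".
bool is_narrow_split(const IterationCounts &iterations, std::size_t split_dimension, std::size_t num_windows)
{
    assert(split_dimension < iterations.size());
    return iterations[split_dimension] < num_windows;
}

// Assumed policy (not taken from the source): recommend the dimension with the
// most iterations. Ties keep the lowest dimension index.
std::size_t widest_dimension(const IterationCounts &iterations)
{
    std::size_t best = 0;
    for(std::size_t dim = 1; dim < iterations.size(); ++dim)
    {
        if(iterations[best] < iterations[dim])
        {
            best = dim;
        }
    }
    return best;
}
```

For example, with iteration counts `{2, 64, 1, 1}` and four requested windows, splitting along dimension 0 is narrow, while dimension 1 has enough iterations to give every window some work.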