Fix CPU multiplication layer threading overhead

* When the tensors are reinterpreted as 1D, any workload smaller than
  10KB is not split across multiple threads.

Resolves: COMPMID-5630
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Change-Id: Icff7089e37c85c8b325f099008a080a5805d36a2
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8581
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
author: Viet-Hoa Do <viet-hoa.do@arm.com> 2022-11-08 12:01:21 +0000
committer: Viet-Hoa Do <viet-hoa.do@arm.com> 2022-11-09 11:29:14 +0000
commit: d4a9cc00a666c7d4c2a35c49d71b322f27e369fc (patch)
tree: c46a58f9679aa1b7cfb14511a16f5052e0f50ca2 /src/cpu/kernels/CpuMulKernel.h
parent: d158609e9ab13069a0a4d2d01d3f1a739a678dd0 (diff)
download: ComputeLibrary-d4a9cc00a666c7d4c2a35c49d71b322f27e369fc.tar.gz
1 file changed, 1 insertion, 0 deletions
diff --git a/src/cpu/kernels/CpuMulKernel.h b/src/cpu/kernels/CpuMulKernel.h
index 5727b9d012..c92e1efdf4 100644
--- a/src/cpu/kernels/CpuMulKernel.h
+++ b/src/cpu/kernels/CpuMulKernel.h
@@ -79,6 +79,7 @@ public:
     // Inherited methods overridden
     void run_op(ITensorPack &tensors, const Window &window, const ThreadInfo &info) override;
     const char *name() const override;
+    size_t get_mws(const CPUInfo &platform, size_t thread_count) const override;
 
     /** Get the preferred dimension in which the scheduler splits the work into multiple jobs.
       *
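
The threshold described in the commit message can be pictured with a small standalone sketch. It does not use the Compute Library API: `kMinWorkloadBytes` and `num_splits` are hypothetical names, and treating the 10KB figure as a byte count that caps the number of chunks is an assumption about how a scheduler might consume a minimum workload size such as the one `get_mws` returns.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical threshold taken from the commit message ("smaller than 10KB").
// Interpreting it as a byte count is an assumption made for this sketch.
constexpr std::size_t kMinWorkloadBytes = 10 * 1024;

// Hypothetical helper: decide how many chunks a 1D workload is split into.
// Each chunk must hold at least `min_workload_bytes`, so small workloads
// stay on a single thread and avoid the cost of waking up extra workers.
std::size_t num_splits(std::size_t total_bytes,
                       std::size_t min_workload_bytes,
                       std::size_t thread_count)
{
    if (thread_count == 0 || min_workload_bytes == 0)
    {
        return 1; // Degenerate input: fall back to a single chunk.
    }

    // Largest number of chunks that still respects the minimum size,
    // never less than one.
    const std::size_t max_chunks =
        std::max<std::size_t>(1, total_bytes / min_workload_bytes);

    // Never use more chunks than there are threads.
    return std::min(thread_count, max_chunks);
}
```

With 8 threads available, a 4KB workload stays on one thread, a 40KB workload is split four ways, and a 1MB workload uses all 8 threads. The real kernel may use a different unit or rounding rule; only the general idea of a lower bound on per-thread work is taken from the commit.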