author: Pablo Marquez Tello <pablo.tello@arm.com> 2021-03-03 12:12:35 +0000
committer: Pablo Marquez Tello <pablo.tello@arm.com> 2021-04-19 15:02:29 +0000
commit: fe7ae817755577be29f4c07aa27d8ef9e821da45
tree: 459b1b22f59cf5144cd72b839fbfdf21fa341479 /src/core/CL/CLKernelLibrary.cpp
parent: 60c3b0e6821a80d78ffca5be30e05d062d071cd2
CLInstanceNormalizationLayer NHWC optimisation

* Make changes to split the workload into two kernels. One kernel
  precomputes mean and variance and the second kernel just loads these
  precomputed values.
* The new approach runs 30% faster than the original code for NHWC
  workloads like 32x192x256.
* Resolves MLCE-337

Change-Id: I8356fcefa2d131ab4dcb32268ce7142421d073e4
Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5355
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
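
The two-kernel split described in the commit message can be illustrated with a plain C++ reference sketch. This is not the OpenCL code from the commit: the names `MeanVar`, `compute_mean_var_ref` and `instance_normalize_ref`, as well as the `gamma`, `beta` and `epsilon` parameters, are assumptions made for illustration. The first stage reduces each (batch, channel) plane to a mean and variance; the second stage only reads those precomputed values.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper type; not part of the ComputeLibrary API.
struct MeanVar
{
    float mean;
    float var;
};

// Stage 1: one (mean, variance) pair per (batch, channel), reduced over H x W.
// The tensor is assumed to be NHWC: index = ((n * H + h) * W + w) * C + c.
// Variance uses E[x^2] - mean^2, which is simple but not the most stable form.
std::vector<MeanVar> compute_mean_var_ref(const std::vector<float> &src,
                                          std::size_t N, std::size_t H,
                                          std::size_t W, std::size_t C)
{
    std::vector<MeanVar> stats(N * C, MeanVar{ 0.f, 0.f });
    const float count = static_cast<float>(H * W);
    for(std::size_t n = 0; n < N; ++n)
    {
        for(std::size_t c = 0; c < C; ++c)
        {
            float sum    = 0.f;
            float sum_sq = 0.f;
            for(std::size_t hw = 0; hw < H * W; ++hw)
            {
                const float v = src[(n * H * W + hw) * C + c];
                sum += v;
                sum_sq += v * v;
            }
            const float mean = sum / count;
            stats[n * C + c] = MeanVar{ mean, sum_sq / count - mean * mean };
        }
    }
    return stats;
}

// Stage 2: normalise every element using only the precomputed statistics.
std::vector<float> instance_normalize_ref(const std::vector<float> &src,
                                          const std::vector<MeanVar> &stats,
                                          std::size_t N, std::size_t H,
                                          std::size_t W, std::size_t C,
                                          float gamma, float beta, float epsilon)
{
    std::vector<float> dst(src.size());
    for(std::size_t n = 0; n < N; ++n)
    {
        for(std::size_t hw = 0; hw < H * W; ++hw)
        {
            for(std::size_t c = 0; c < C; ++c)
            {
                const std::size_t idx = (n * H * W + hw) * C + c;
                const MeanVar    &mv  = stats[n * C + c];
                dst[idx] = gamma * (src[idx] - mv.mean) / std::sqrt(mv.var + epsilon) + beta;
            }
        }
    }
    return dst;
}
```

Separating the reduction from the element-wise pass means the second stage does no reduction work of its own, which is presumably where the reported NHWC speed-up comes from; the commit message does not state the mechanism in more detail.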
Diffstat (limited to 'src/core/CL/CLKernelLibrary.cpp')
-rw-r--r-- src/core/CL/CLKernelLibrary.cpp 1
1 files changed, 1 insertions, 0 deletions
diff --git a/src/core/CL/CLKernelLibrary.cpp b/src/core/CL/CLKernelLibrary.cpp
index eef204fde9..002a14400f 100644
--- a/src/core/CL/CLKernelLibrary.cpp
+++ b/src/core/CL/CLKernelLibrary.cpp
@@ -356,6 +356,7 @@ const std::map<std::string, std::string> CLKernelLibrary::_kernel_program_map =
     { "im2col9x9_nhwc", "im2col.cl" },
     { "im2col_generic_nhwc", "im2col.cl" },
     { "instance_normalization", "instance_normalization.cl" },
+    { "compute_mean_var", "instance_normalization.cl" },
     { "l2_normalize_x", "l2_normalize.cl" },
     { "l2_normalize_y", "l2_normalize.cl" },
     { "l2_normalize_z", "l2_normalize.cl" },
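
The hunk above registers the new kernel name against the existing program file. As a rough sketch of why that entry is needed, the snippet below shows how a kernel-name-to-program map of this shape can be queried. The free-standing `kernel_program_map` and the `program_name_for` helper are hypothetical stand-ins, not the library's actual members or lookup function.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the map in the diff; only a few entries are copied.
const std::map<std::string, std::string> kernel_program_map = {
    { "instance_normalization", "instance_normalization.cl" },
    { "compute_mean_var", "instance_normalization.cl" },
    { "l2_normalize_x", "l2_normalize.cl" },
};

// Hypothetical lookup: returns the program file that contains the kernel,
// or throws if the kernel name was never registered.
std::string program_name_for(const std::string &kernel_name)
{
    const auto it = kernel_program_map.find(kernel_name);
    if(it == kernel_program_map.end())
    {
        throw std::runtime_error("Kernel not registered: " + kernel_name);
    }
    return it->second;
}
```

In this sketch, a lookup for `compute_mean_var` would throw without the added entry, even though the kernel source lives in the same `.cl` file as `instance_normalization`.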