author | Pablo Marquez Tello <pablo.tello@arm.com> | 2021-03-03 12:12:35 +0000 |
---|---|---|
committer | Pablo Marquez Tello <pablo.tello@arm.com> | 2021-04-19 15:02:29 +0000 |
commit | fe7ae817755577be29f4c07aa27d8ef9e821da45 (patch) | |
tree | 459b1b22f59cf5144cd72b839fbfdf21fa341479 /src/graph/nodes/EltwiseLayerNode.cpp | |
parent | 60c3b0e6821a80d78ffca5be30e05d062d071cd2 (diff) | |
download | ComputeLibrary-fe7ae817755577be29f4c07aa27d8ef9e821da45.tar.gz |
CLInstanceNormalizationLayer NHWC optimisation
* Split the workload into two kernels: the first kernel precomputes the
mean and variance, and the second kernel just loads these precomputed values.
* The new approach runs 30% faster than the original code for NHWC workloads
like 32x192x256.
* Resolves MLCE-337
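
The two-kernel split described above can be sketched on the CPU as follows. This is only an illustrative reference, not the OpenCL kernels from this patch: the names MeanVar, compute_mean_var_nhwc and normalize_nhwc are hypothetical, and the gamma/beta/epsilon parameters are assumed to follow the usual instance normalization formula. The first pass reduces over H*W for each (batch, channel) pair; the second pass only reads the stored statistics.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical type for illustration; not part of the Compute Library API.
struct MeanVar
{
    float mean;
    float var;
};

// Pass 1: compute one (mean, variance) pair per (batch, channel) over the
// H*W spatial positions. The input is assumed to be laid out as NHWC.
std::vector<MeanVar> compute_mean_var_nhwc(const std::vector<float> &in,
                                           std::size_t N, std::size_t H,
                                           std::size_t W, std::size_t C)
{
    const std::size_t hw = H * W;
    std::vector<MeanVar> stats(N * C);
    for(std::size_t n = 0; n < N; ++n)
    {
        for(std::size_t c = 0; c < C; ++c)
        {
            double sum = 0.0, sum_sq = 0.0;
            for(std::size_t i = 0; i < hw; ++i)
            {
                const double v = in[(n * hw + i) * C + c];
                sum += v;
                sum_sq += v * v;
            }
            const double mean = sum / hw;
            // Population variance: E[x^2] - (E[x])^2
            stats[n * C + c] = { static_cast<float>(mean),
                                 static_cast<float>(sum_sq / hw - mean * mean) };
        }
    }
    return stats;
}

// Pass 2: normalize each element using only the precomputed statistics.
std::vector<float> normalize_nhwc(const std::vector<float> &in,
                                  const std::vector<MeanVar> &stats,
                                  std::size_t N, std::size_t H,
                                  std::size_t W, std::size_t C,
                                  float gamma, float beta, float epsilon)
{
    const std::size_t hw = H * W;
    std::vector<float> out(in.size());
    for(std::size_t n = 0; n < N; ++n)
    {
        for(std::size_t i = 0; i < hw; ++i)
        {
            for(std::size_t c = 0; c < C; ++c)
            {
                const std::size_t idx = (n * hw + i) * C + c;
                const MeanVar    &s   = stats[n * C + c];
                out[idx] = gamma * (in[idx] - s.mean) / std::sqrt(s.var + epsilon) + beta;
            }
        }
    }
    return out;
}
```

In this sketch the second pass walks memory in NHWC order (channel innermost), which is presumably why keeping the reduction in a separate kernel helps for this layout; the measured 30% figure comes from the commit itself, not from this code.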
Change-Id: I8356fcefa2d131ab4dcb32268ce7142421d073e4
Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5355
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>