path: root/src/runtime/CL/functions/CLReduceMean.cpp
author: Michele Di Giorgio <michele.digiorgio@arm.com> 2018-10-24 12:20:19 +0100
committer: Anthony Barbier <anthony.barbier@arm.com> 2018-11-02 16:55:45 +0000
commit: a1422fbf985c89ffebc8f5af8093e9cd987cfe29
tree: 4536c22bb96859e8d80cc0c11e4d106b780b7539 /src/runtime/CL/functions/CLReduceMean.cpp
parent: 33893c3e5a8d298f1a9fcc36ab89b73382fc1245
COMPMID-1673: Collapse window in CLArithmeticAddition when one operand is a vector
When one of the operands is a vector, the kernel performs a broadcast addition and the window is not collapsed. This is an issue because it leads to many enqueues, which increases the time spent in the OpenCL driver. This patch allows the window to be collapsed when one of the two operands is a vector. Furthermore, it adds the LWS tuner to the kernel and changes the number of elements processed per iteration to 8 to make better use of the cache.

Change-Id: I5f09ab0ddcffb3b7f9326a987c79a997b2d7fa8c
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155003
Reviewed-by: Giuseppe Rossini <giuseppe.rossini@arm.com>
Tested-by: bsgcomp <bsgcomp@arm.com>
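To illustrate why collapsing the window reduces the number of enqueues, here is a minimal standalone sketch. It does not use the Compute Library API: `Extents`, `count_enqueues`, and `collapse_from` are hypothetical names, and the model assumes that one enqueue covers one 3D slice (x, y, z) of the window, so every index in the higher dimensions costs a separate enqueue.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for an execution window: the iteration extent of
// each dimension, ordered x, y, z, then any higher dimensions.
using Extents = std::array<std::size_t, 5>;

// Assumption: each enqueue processes one 3D slice, so the number of
// enqueues is the product of the extents of dimensions 3 and above.
std::size_t count_enqueues(const Extents &w)
{
    std::size_t n = 1;
    for (std::size_t d = 3; d < w.size(); ++d)
    {
        n *= w[d];
    }
    return n;
}

// Fold every dimension above `first` into `first`. The total number of
// work items is unchanged; only the shape of the window changes.
Extents collapse_from(Extents w, std::size_t first)
{
    assert(first < w.size());
    for (std::size_t d = first + 1; d < w.size(); ++d)
    {
        w[first] *= w[d];
        w[d] = 1;
    }
    return w;
}

// Total number of work items covered by the window.
std::size_t total_items(const Extents &w)
{
    std::size_t n = 1;
    for (std::size_t e : w)
    {
        n *= e;
    }
    return n;
}
```

Under this model, a window with extents `{8, 4, 2, 16, 3}` needs 48 enqueues as is, and a single enqueue once the dimensions above z are folded into z with `collapse_from(w, 2)`, while the total work stays at 3072 items.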
Diffstat (limited to 'src/runtime/CL/functions/CLReduceMean.cpp')
0 files changed, 0 insertions, 0 deletions