path: root/SConscript
author: David Mansell <David.Mansell@arm.com> 2023-11-22 11:33:46 +0000
committer: David Mansell <David.Mansell@arm.com> 2024-01-25 13:24:10 +0000
commit: fb92e22c642985a5ea7906e7e7f46285d1d47718
tree: 4b5ff83a83fe3ef88ee6744e6b843a06ad0aaaa9 /SConscript
parent: 2aec5f1870b6cd5edd7de6403b5cf75530eb77f5
arm_gemm: convolution: optimize convolver.hpp.
The code in convolver.hpp generates pointers into either the appropriate point in the input activation tensor or the padding buffer for each kernel point of each output point of the convolution. This is done at runtime interspersed with the data transform and matrix multiplication steps. As such, it can have a significant impact on performance, particularly for low input channel counts.

This change improves the performance of this code by streamlining the checks for out of range input points (which must be directed to the padding buffer). The previous implementation checked all four borders for every point. The revised code does the checks one at a time, and for any failing check applies the result to as many output points as possible without repeating the other checks.

Signed-off-by: David Mansell <David.Mansell@arm.com>
Change-Id: I36a4fa114b425c1bcba2be40acf36718522519f5
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11004
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
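
Illustrative sketch (not the code from convolver.hpp): the idea in the commit message can be shown on a simplified 2D case. Everything below is an assumption made for illustration: the `ConvShape` struct, the function names, the use of integer offsets instead of pointers, and `kPad` standing in for the padding buffer. The first function tests all four borders for every output point; the second checks the row borders once per output row and resolves the column borders once per kernel point, so a failing check covers a whole run of output points.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical shape description; field names are illustrative only.
struct ConvShape {
    int in_w, in_h;        // input width and height
    int out_w, out_h;      // output width and height
    int stride;            // same stride in x and y
    int pad_left, pad_top; // implicit zero padding
};

constexpr int kPad = -1; // stands in for "points at the padding buffer"

// Baseline: all four border checks are evaluated for every output point.
std::vector<int> offsets_naive(const ConvShape &s, int kx, int ky) {
    std::vector<int> out(static_cast<std::size_t>(s.out_w) * s.out_h);
    for (int oy = 0; oy < s.out_h; ++oy) {
        for (int ox = 0; ox < s.out_w; ++ox) {
            const int iy = oy * s.stride - s.pad_top + ky;
            const int ix = ox * s.stride - s.pad_left + kx;
            const bool inside = iy >= 0 && iy < s.in_h && ix >= 0 && ix < s.in_w;
            out[static_cast<std::size_t>(oy) * s.out_w + ox] =
                inside ? iy * s.in_w + ix : kPad;
        }
    }
    return out;
}

// Streamlined: each border is handled once and its result reused for a run.
std::vector<int> offsets_streamlined(const ConvShape &s, int kx, int ky) {
    assert(s.stride > 0);
    // Start with every output point directed to padding.
    std::vector<int> out(static_cast<std::size_t>(s.out_w) * s.out_h, kPad);

    // Left/right borders: the valid range of ox is the same for every row.
    const int x_shift = kx - s.pad_left; // ix = ox * stride + x_shift
    int ox_begin = x_shift >= 0 ? 0 : (-x_shift + s.stride - 1) / s.stride;
    int ox_end = x_shift >= s.in_w
                     ? 0
                     : (s.in_w - x_shift + s.stride - 1) / s.stride;
    ox_begin = std::min(ox_begin, s.out_w);
    ox_end = std::max(ox_begin, std::min(ox_end, s.out_w));

    for (int oy = 0; oy < s.out_h; ++oy) {
        const int iy = oy * s.stride - s.pad_top + ky;
        // Top/bottom borders: one check settles the whole output row.
        if (iy < 0 || iy >= s.in_h) continue;
        // Inside [ox_begin, ox_end) no border check is needed at all.
        for (int ox = ox_begin; ox < ox_end; ++ox) {
            out[static_cast<std::size_t>(oy) * s.out_w + ox] =
                iy * s.in_w + ox * s.stride + x_shift;
        }
    }
    return out;
}
```

Both functions return the same table of offsets for a given kernel point `(kx, ky)`; the second only reduces the number of comparisons performed while building it.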
Diffstat (limited to 'SConscript')
0 files changed, 0 insertions, 0 deletions