author | David Mansell <David.Mansell@arm.com> | 2023-11-22 11:33:46 +0000 |
---|---|---|
committer | David Mansell <David.Mansell@arm.com> | 2024-01-25 13:24:10 +0000 |
commit | fb92e22c642985a5ea7906e7e7f46285d1d47718 (patch) | |
tree | 4b5ff83a83fe3ef88ee6744e6b843a06ad0aaaa9 /.pre-commit-config.yaml | |
parent | 2aec5f1870b6cd5edd7de6403b5cf75530eb77f5 (diff) | |
arm_gemm: convolution: optimize convolver.hpp.

The code in convolver.hpp generates pointers into either the
appropriate point in the input activation tensor or the padding buffer
for each kernel point of each output point of the convolution. This is
done at runtime, interspersed with the data transform and matrix
multiplication steps. As such, it can have a significant impact on
performance, particularly for low input channel counts.

This change improves the performance of this code by streamlining the
checks for out-of-range input points (which must be directed to the
padding buffer). The previous implementation checked all four borders
for every point. The revised code does the checks one at a time and,
for any failing check, applies the result to as many output points as
possible without repeating the other checks.
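
The difference between the two strategies can be illustrated with a small sketch. This is not the code from convolver.hpp: the `ConvGeometry` struct, the function names `pointers_naive` and `pointers_batched`, the single-channel row-major layout, and the single shared stride are all assumptions made for illustration. The first function checks all four borders for every output point; the second checks the vertical borders once per output row and then derives the range of output columns that fall inside the input, so each border test covers as many output points as possible.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical geometry for illustration; not the types used in convolver.hpp.
struct ConvGeometry {
    int input_height;
    int input_width;
    int stride;    // assumed equal in both dimensions
    int pad_top;
    int pad_left;
};

// Reference version: all four border checks are repeated for every output point.
std::vector<const float *> pointers_naive(const ConvGeometry &g, const float *input,
                                          const float *pad, int out_y, int out_width,
                                          int ky, int kx) {
    std::vector<const float *> ptrs(out_width);
    for (int ox = 0; ox < out_width; ++ox) {
        const int iy = out_y * g.stride - g.pad_top + ky;
        const int ix = ox * g.stride - g.pad_left + kx;
        const bool inside = iy >= 0 && iy < g.input_height &&
                            ix >= 0 && ix < g.input_width;
        ptrs[ox] = inside ? input + (iy * g.input_width + ix) : pad;
    }
    return ptrs;
}

// Streamlined version: each border is checked once, and the outcome is applied
// to every output point it covers.
std::vector<const float *> pointers_batched(const ConvGeometry &g, const float *input,
                                            const float *pad, int out_y, int out_width,
                                            int ky, int kx) {
    assert(g.stride > 0);
    // Start with every output point directed at the padding buffer.
    std::vector<const float *> ptrs(out_width, pad);

    // Top and bottom borders: one check decides the whole output row.
    const int iy = out_y * g.stride - g.pad_top + ky;
    if (iy < 0 || iy >= g.input_height) {
        return ptrs;
    }

    // Input column read by output column 0 for this kernel point.
    const int x0 = kx - g.pad_left;
    // Left border: first output column whose input column is >= 0.
    int first = x0 < 0 ? (-x0 + g.stride - 1) / g.stride : 0;
    // Right border: one past the last output column whose input column is < width.
    const int room = g.input_width - 1 - x0;
    int end = room >= 0 ? room / g.stride + 1 : 0;
    first = std::min(first, out_width);
    end = std::min(end, out_width);

    // Everything in [first, end) is in range; no per-point checks are needed.
    for (int ox = first; ox < end; ++ox) {
        ptrs[ox] = input + (iy * g.input_width + x0 + ox * g.stride);
    }
    return ptrs;
}
```

Both functions return the same pointers for the same arguments. The batched version replaces the per-point comparisons with two range computations per row, which is the kind of saving that matters most when there is little per-point work to hide the checks behind, as with low input channel counts.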
Signed-off-by: David Mansell <David.Mansell@arm.com>
Change-Id: I36a4fa114b425c1bcba2be40acf36718522519f5
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11004
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>