path: root/src/core/NEON/kernels/arm_gemm/convolver.hpp
Age  Commit message  Author
2024-01-25  arm_gemm: convolution: optimize convolver.hpp.  David Mansell
The code in convolver.hpp generates pointers into either the appropriate point in the input activation tensor or the padding buffer for each kernel point of each output point of the convolution. This is done at runtime interspersed with the data transform and matrix multiplication steps. As such, it can have a significant impact on performance, particularly for low input channel counts.

This change improves the performance of this code by streamlining the checks for out of range input points (which must be directed to the padding buffer). The previous implementation checked all four borders for every point. The revised code does the checks one at a time, and for any failing check applies the result to as many output points as possible without repeating the other checks.

Signed-off-by: David Mansell <David.Mansell@arm.com>
Change-Id: I36a4fa114b425c1bcba2be40acf36718522519f5
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11004
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
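
The border-check streamlining described in the 2024-01-25 commit can be illustrated with a small sketch. This is not the code from convolver.hpp: the `ConvShape` struct, both function names, the single-channel row-major layout, and the stride of 1 are all assumptions made for illustration. The first function tests all four borders for every point; the second tests the row borders once per output row and kernel row, then derives the valid range of output columns for each kernel column, so no per-point border test remains.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of a single-channel, stride-1 2D convolution.
struct ConvShape {
    int in_h, in_w;         // input height and width
    int k_h, k_w;           // kernel height and width
    int pad_top, pad_left;  // padding above and to the left of the input
    int out_h, out_w;       // output height and width
};

// Baseline: check all four borders for every (output point, kernel point) pair.
std::vector<const float*> pointers_naive(const ConvShape& s, const float* in, const float* pad) {
    assert(in != nullptr && pad != nullptr);
    std::vector<const float*> ptrs;
    for (int oy = 0; oy < s.out_h; ++oy)
        for (int ox = 0; ox < s.out_w; ++ox)
            for (int ky = 0; ky < s.k_h; ++ky)
                for (int kx = 0; kx < s.k_w; ++kx) {
                    const int iy = oy + ky - s.pad_top;
                    const int ix = ox + kx - s.pad_left;
                    const bool inside = iy >= 0 && iy < s.in_h && ix >= 0 && ix < s.in_w;
                    ptrs.push_back(inside ? in + iy * s.in_w + ix : pad);
                }
    return ptrs;
}

// Streamlined: one check at a time, each result applied to many output points.
std::vector<const float*> pointers_streamlined(const ConvShape& s, const float* in, const float* pad) {
    assert(in != nullptr && pad != nullptr);
    const int kpts = s.k_h * s.k_w;
    // Simplification for this sketch: every slot starts on the padding buffer.
    std::vector<const float*> ptrs(static_cast<std::size_t>(s.out_h) * s.out_w * kpts, pad);
    for (int oy = 0; oy < s.out_h; ++oy)
        for (int ky = 0; ky < s.k_h; ++ky) {
            const int iy = oy + ky - s.pad_top;
            // Top/bottom check: a failure covers this kernel row for the whole output row.
            if (iy < 0 || iy >= s.in_h) continue;
            for (int kx = 0; kx < s.k_w; ++kx) {
                // ix = ox + kx - pad_left lies in [0, in_w) exactly when ox lies in [lo, hi).
                const int lo = std::max(0, s.pad_left - kx);
                const int hi = std::min(s.out_w, s.in_w + s.pad_left - kx);
                for (int ox = lo; ox < hi; ++ox) {
                    const std::size_t idx =
                        (static_cast<std::size_t>(oy) * s.out_w + ox) * kpts + ky * s.k_w + kx;
                    ptrs[idx] = in + iy * s.in_w + (ox + kx - s.pad_left);
                }
            }
        }
    return ptrs;
}
```

Both functions produce the same pointer table, ordered by output point and then by kernel point. The second one replaces the per-point border tests with one row test per (output row, kernel row) pair and one range computation per kernel column.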
2020-11-17  COMPMID-3970: Failure when building with GCC < 6  Georgios Pinitas
Address pre-N4387 tuple usage

Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: Iefe6e08e27b8fe1e688d2ff9db8cb7e172b568f3
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4429
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
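
For context on the pre-N4387 issue named in the 2020-11-17 commit: in standard libraries that predate N4387, the element-wise constructor of `std::tuple` is explicit, so returning a braced list from a function that returns a tuple fails to compile. The commit message does not show the actual fix; the sketch below shows one common workaround, and the `locate` helper is hypothetical rather than taken from convolver.hpp.

```cpp
#include <cassert>
#include <tuple>

// Hypothetical helper: split a flat index into (row, column, valid).
std::tuple<int, int, bool> locate(int index, int width) {
    assert(width > 0);
    // With a pre-N4387 library, the following line is rejected, because
    // copy-list-initialization cannot use an explicit constructor:
    //     return {index / width, index % width, true};
    // Building the tuple explicitly compiles with both old and new libraries.
    return std::make_tuple(index / width, index % width, true);
}
```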
2020-11-12  COMPMID-3776: Indirect GEMM  Georgios Pinitas
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: I51a1b0f098bc3a8c408c50c92221e4df3061e12c
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4343
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Sang-Hoon Park <sang-hoon.park@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>