aboutsummaryrefslogtreecommitdiff
path: root/src/core/NEON/kernels
AgeCommit message (Collapse)Author
2020-11-23COMPMID-3987: Nightly failure - Android builds failing in dataset and validationManuel Bottini
Removing warnings from vector library in GCC 7.1+ Removing warning in wanted switch cases fall throughs GCAccessor moving constructor removed Removing parentheses equality checks in stb_image Small fixes in GEMM test suite Change-Id: I8ba8e3fa20b45c32e5b6219473e0f33ab787ca30 Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4483 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> (cherry picked from commit 827817e627acfdc50c3a8ed748932e5893cc8a18) Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4172 Tested-by: Georgios Pinitas <georgios.pinitas@arm.com>
2020-11-23Update tuning numbers for A55 for both fp16 and fp32Georgios Pinitas
Resolves: COMPMID-3974 Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I6d5189e44ebeda1575a80dd14ec3a09c75f68e03 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4521 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> (cherry picked from commit 40943df83026b66356f24e30f31f78c8b9e59c92) Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4169 Tested-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2020-11-17COMPMID-3962: Add Logical And, Or, Not support on NEONGeorgios Pinitas
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: Iabcd94d1ed6fe8bb27ce93924c35e25f48f39cf1 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4438 Reviewed-by: James Conroy <james.conroy@arm.com> Reviewed-by: Sang-Hoon Park <sang-hoon.park@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-11-17COMPMID-3970: Failure when building with GCC < 6Georgios Pinitas
Address pre-N4387 tuple usage Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: Iefe6e08e27b8fe1e688d2ff9db8cb7e172b568f3 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4429 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-11-13COMPMID-3851: Fix regression on NEDepthwiseConvolutionLayerNativeKernelSang-Hoon Park
The exit condition of some for loops in quantized version of the kernel with depth_multiplier=1 is decided during compilation to fix performance issue. Change-Id: I849b3d63b2a2cf5eb374ae681898ae1c296fb4fe Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4392 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-11-13COMPMID-3852: Fix NEReduction windowGeorgios Pinitas
ReductionOperations splits the kernel for scheduling on the X dimension when reduction axis is > 0. By setting the execution window to be unit one in the X dimension the execution was always restricted to a single thread. Alters the window to enable multi-threading Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: Idcbe2b78957678310bb8e895969f01de972d3667 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4389 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2020-11-12COMPMID-3960: Mismatch on NEArithmeticSubtractionGeorgios Pinitas
Corner-case failure when both input shapes had unit shape on the X axis. Broadcasting was enabled leading to invalid window execution. Check is updated to cross-validate the presence of broadcasting by checking the X dimension in both input shapes. Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I0b79542279e8d155d2661fddff9691d94a1f6855 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4391 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-11-12COMPMID-3776: Indirect GEMMGeorgios Pinitas
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I51a1b0f098bc3a8c408c50c92221e4df3061e12c Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4343 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Sang-Hoon Park <sang-hoon.park@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-11-10COMPMID-3958: Fix build error with Werror=1Sang-Hoon Park
Remove unused variable in anonymouse namespace. Change-Id: Id9775cd7982f2a2ebf68f20e0c4e33013c3382a0 Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4361 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-11-09COMPMID-3852: Fix complex multiplication remove padding performance regressionSheri Zhang
Signed-off-by: Sheri Zhang <sheri.zhang@arm.com> Change-Id: I2605baba63c9cca0370328860313b8ec09e04fb6 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4355 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-11-09COMPMID-2808: Add support for QASYMM8_SIGNED in NEROIAlignLayerSheri Zhang
Signed-off-by: Sheri Zhang <sheri.zhang@arm.com> Change-Id: Id4f4c96e1823a4b27886fee9baf70847172e619c Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4335 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-11-09COMPMID-3951 LargeGraph_FLOAT32_Rank4_25 CTS failures in Android Q in CL Fix1SiCong Li
* Fix CLSpaceToBatchLayerKernel and NESpaceToBatchLayerKernel validation errors by using the correctly calculated output tensor shape Signed-off-by: SiCong Li <sicong.li@arm.com> Change-Id: I21d61f870e6a23a2e38dcb95c939b0bf08082b6f Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4347 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: TeresaARM <teresa.charlinreyes@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-11-06COMPMID-3850: NEPooling regression for NHWCGeorgios Pinitas
Expand left-over loop to handle multiples of 8 for quantized data type during MaxPooling. Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I1304d174c45d2c98247470ac8b4bb6752bbc03a6 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4339 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-11-03COMPMID-3638: Move NEON kernelsMichalis Spyrou
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Change-Id: Ieed3e4bc8be7fef80c90c5094599b477a56fc473 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4285 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-10-30COMPMID-3926: Floor CTS failing in NeonMichele Di Giorgio
Depending n the value of `len`, the left-over loop might end up writing/reading out-of-bounds, therefore corrupting the memory. Change-Id: I1b0bb300f3e5ea668b585266e1aa6af7f93a5d1e Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4290 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-10-29COMPMID-3853: Decouple NEActivationLayerMichalis Spyrou
Decouple datatypes and remove Activation template. Binary size dropped by 25Kb. Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Change-Id: I32c207db124895fee25b56437f9495403315b867 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4217 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-10-29COMPMID-3827: Resize CTS failing in Neon after removing paddingManuel Bottini
Change-Id: I7cf27272e4e6e82b36a31a80ed47ae38fbbf9129 Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4269 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-10-20COMPMID-3637: Move utility headers from arm_compute to srcSang-Hoon Park
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: If9d6fa8c900b68c4b6fd373f2fc1f9abb83ea917 Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4145 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Sang-Hoon Park <sang-hoon.park@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-10-19COMPMID-3163: Remove padding from NEDepthwiseConvolutionLayerNativeKernelSang-Hoon Park
Change-Id: Ibbd6bee5c6a4ce4f212b207d17a65b9c33bcfa78 Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4106 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-10-16COMPMID-3805: Fix SQRT non-zero output for zero inputSang-Hoon Park
- For AArch64, NEActivationLayerKernel uses vsqrt rather than vinvsqrt. - For non-AArch64, it masks values to ensure zero input results in zero output without producing NaN. - Test cases for FP16 and FP32's positive boundary values are added. Change-Id: Ic0104ee5d7045059c2e9bd052616a4a3b43a315d Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4150 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2020-10-14COMPMID-3172: Remove padding from NEGEMMMatrixMultiplyKernelMichele Di Giorgio
Template parameter has been removed, which reduces the binary size by: - ~4 kB for armv8.2a - ~12 kB for armv8a Change-Id: Ib499a18a4980a3ee7b201507b943f900adf20a73 Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4122 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-10-14COMPMID-3144: Remove padding from NEDirectConvolutionLayerKernelManuel Bottini
Change-Id: I22b907eebfbe037e6e1c7bf604172f4709a9cbed Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4082 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2020-10-13COMPMID-3828: fix unsigned overflow in quantized reduce meanSang-Hoon Park
Change-Id: I9d3122b4858137d422548d1d417eb04a27ae9c7b Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4143 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: TeresaARM <teresa.charlinreyes@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-10-09COMPMID-3794: Fix window loops causing performance regressionMichalis Spyrou
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Change-Id: Id4d95c6ce5fed91bb079b8bfe1abceedefd20c97 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4117 Reviewed-by: Sheri Zhang <sheri.zhang@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-10-08COMPMID-3170: Remove padding in NEGEMMLowpMatrixMultiplyKernelmorgolock
Change-Id: Ie95442c6c6a145c1a45937b03cbd433bf08e36ab Signed-off-by: morgolock <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4094 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-10-08COMPMID-3684: Use case data type decouplingGeorgios Pinitas
Decouples data types for NEFloorKernel Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I6756300540bc5ef32a9990246eed8619a76855f2 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4084 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-10-07COMPMID-3821: NEON Reduction op PROD failuresMichalis Spyrou
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Change-Id: I8cfdd24c4e71a6a4be610ba67a75ad2943a43801 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4097 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-10-07COMPMID-3637: Move wrapper to srcGeorgios Pinitas
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I524b0c4b49c7a7035b7d078b9585d77b0d438e10 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4083 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-10-06COMPMID-3181: Remove padding from NEReductionOperationKernelSheri Zhang
COMPMID-3803: Remove padding from NEComplexPixelWiseMultiplicationKernel Signed-off-by: Sheri Zhang <sheri.zhang@arm.com> Change-Id: I309fc4ab62bacbca9203d2680a9d6d52f76f70e6 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4078 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-by: Pablo Marquez <pablo.tello@arm.com>
2020-10-02COMPMID-3145: Remove padding from NEScaleKernelManuel Bottini
Change-Id: I530b12c6270d7dbeb3ef7af62484842ebcb65925 Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4000 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
2020-10-02COMPMID-3183: Removed padding NEGEMMLowpReductionKernelmorgolock
Change-Id: Ibf7741ffdefcceb9683c919e79302fc35c36ea65 Signed-off-by: morgolock <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4031 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2020-09-30COMPMID-3802: Remove templates from NEDirectConvolutionLayerOutputStageKernelMichalis Spyrou
Removing bool template reduces the binary size by 20Kb. Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Change-Id: I652cea7d320a00b6c6e44cdacb61e77f3c10e56a Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4053 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-09-29COMPMID-3784 Add broadcast support to S32 NEPixelwiseMultiplicationSiCong Li
Signed-off-by: SiCong Li <sicong.li@arm.com> Change-Id: Ifae31c74eb46c561225394a387fc15332423bfa9 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4030 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2020-09-29COMPMID-3174: Remove padding from NEDirectConvolutionLayerOutputStageKernelMichalis Spyrou
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Change-Id: I3c5cfe50e9cee30b66f4094da105d383c077aaf9 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4044 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-09-22COMPMID-3757: (u)int8: Don't select the 16-bit route on A53 for cases with ↵David Mansell
very few rows Also added 2D version of the 16-bit route, and altered the selection heuristic so that 2D mode will be used in cases where 1D mode won't thread well. Change-Id: I0057fde08456771dc0090ac51f50d82f8bb86044 Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3903 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-09-15COMPMID-3752: NEPermuteKernel does not support permutations2Michele Di Giorgio
Solves also: - COMPMID-3766: CTS Failures in Transpose Neon + FP16 Change-Id: I9d323f45f49cc0bce9e6329790bcf2f0eeec8572 Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3949 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-by: Pablo Marquez <pablo.tello@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-09-10COMPMID-3159: Remove padding from NEPoolingLayerKernelMichalis Spyrou
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Change-Id: Ib5b252e1b65794a8f360276d03ff94922e1991f8 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3946 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-09-10COMPMID-3583: Add S32 support to NEElementwiseDivisionGeorgios Pinitas
Division follows the flooring division approach where for example 5/2=2 while -5/2=-3 Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I65756e0b31fe8d97f743a4c13dc5f96304722f75 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3929 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-09-09MLCE-229: Fixed requantization per channel in asm kernelmorgolock
Change-Id: Iaf1465f3144371e153ce123ac00da5cc092f77df Signed-off-by: morgolock <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3939 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-09-09COMPMID-3581 Add S32 support to NEPixelWiseMultiplicationSiCong Li
* Add S32 support to NEPixelWiseMultiplication and NEPixelWiseMultiplicationKernel * Scale == 1/255 is not supported for S32, as on non-aarch64 the precision requirement is not met, and scale is a non-standard parameter anyway. * Fix the data types validation logics to also test for all invalid data type combinations. * Add validation tests for S32 NEON PixelWiseMultiplication * The wrap tolerance for ScaleOther (scale == 1/2^n) cases is set to 1 instead of 0 because the reference uses floating point division followed by rounding, which is isn't bit accurate. Change-Id: I28839afda7a4f98c985d1763620e08d98f740142 Signed-off-by: SiCong Li <sicong.li@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3923 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-09-08COMPMID-3151: Remove NEDepthwiseConvolutionLayer3x3KernelGeorgios Pinitas
Prefer NEDepthwiseConvolutionLayerNativeKernel as it has a native format of NHWC avoiding extra transformation to the NCHW domain. Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: If5d8de11691b8ef7f4c3816941f87417d0c8646b Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3930 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-09-07COMPMID-3155: Remove padding from NEGEMMLowpOffsetContributionKernelMichele Di Giorgio
Change-Id: I93c3b795cf6fe0b27008543b6671a3be0a965603 Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3916 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-09-07COMPMID-3748: Compiler issue with Bfloat16 on gcc8Georgios Pinitas
Treat bf16 memory on memset as raw memory by casting to void*. This hides the class-memaccess warning and is safe for the current class layout of arm_compute::bfloat16 Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I5e242827d3737b4491d29abe7570eefee5b6edc1 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3928 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-09-07COMPMID-3580 Add S32 support to NEArithmeticSubtractionSiCong Li
* Fix convert policy validate logics and add missing validate test * Add S32 support to NEArithmeticSubtraction and NEArithmeticSubtractionKernel * Add S32 validation tests Change-Id: I1b6cb15b024613c202fe9f17747a83da43a5ddcf Signed-off-by: SiCong Li <sicong.li@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3908 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2020-09-04COMPMID-3157: Remove padding from NEGEMMTranspose1xWKernelGian Marco Iodice
- Remove padding from NEGEMMTranspose1xWKernel - Extend test for validating zero padding requirement Change-Id: I9ce4ca95a500229b045dc140cfff21fdf7373700 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3920 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-09-03COMPMID-3143: Remove padding from NEGEMMInterleave4x4KernelGian Marco Iodice
- Remove padding from NEGEMMInterleave4x4Kernel - Extend test for validating zero padding requirement Change-Id: I94abc271e005f9dd6e1721b185631f55f598dbfd Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3915 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-08-28COMPMID-3504: Add support for BOOL in NEON comparison operatorsMichele Di Giorgio
Change-Id: I81b0c2482bc20b1ab5124ed6179bb94cbced7875 Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3869 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-08-19MLCE-229: Support for negative shifts in asm kernelsmorgolock
Change-Id: I2c5e98aae7698963f106d7423df0e65cd00ee2a9 Signed-off-by: morgolock <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3710 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Sheri Zhang <sheri.zhang@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-08-10COMPMID-3700: ArmNN OOB test failureSheri Zhang
Fix ArmNN compiling issue through removing defulat template value for offset_no_padding() and pooling2_nchw_maxpool_indices(). Signed-off-by: Sheri Zhang <sheri.zhang@arm.com> Change-Id: Ie7114d102b8533e757b8e841afcac1adfbcb3b54 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3697 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Marquez <pablo.tello@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-08-05COMPMID-3392: Collapse TensorMaps into a single TensorPackGeorgios Pinitas
Collapse InputTensorMap and OutputTensorMap to a single TensorPack mechanism. Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: Ie2fdfc6b07d84ad589169ec99ca64fcf45a00bec Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/c/VisualCompute/ComputeLibrary/+/253783 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3641 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Reviewed-by: Sheri Zhang <sheri.zhang@arm.com>