path: root/src/core/NEON
Age | Commit message | Author
2021-01-07 | Clean up macro definitions in arm_compute headers | Giorgio Arena
- Expose loose macros by prefixing "ARM_COMPUTE_" Resolves: COMPMID-3701 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Change-Id: I4334b01c1a5cd8585f4a1ba2d870be956c61a83d Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4769 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-01-05 | COMPMID-3874: Create ArithmeticAddition SVE/SVE2 | Michalis Spyrou
Change-Id: I4ec7561a7f6a42a22b8187968ae302dbe75023bc Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4753 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Sang-Hoon Park <sang-hoon.park@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-01-05 | COMPMID-4076: ArmNN unittest failure with memory access violation in FuseReLUIntoBatchNormFloat32CpuAccTest | Sheri Zhang
1. Fix fusable and non-fusable configuration issue 2. Fix FP16 issue Signed-off-by: Sheri Zhang <sheri.zhang@arm.com> Change-Id: I6d0eacca7ac437f236ad403ddb283c10c8f419a6 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4761 Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-01-05 | Improve NEIm2Col validation for invalid shapes | Georgios Pinitas
Ensure that Im2Col transformation is valid for the given input meta-data. In more detail, validate that the combination of input shape, padding and kernel width leads to a valid execution window and output shape. Resolves: COMPMID-4040 Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: Id813373b2efdfdfbe71dc0d0acc1d7bf8ecd5e84 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4757 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
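The kind of check this commit message describes can be sketched roughly as follows. This is not the library's code: the function names, the floor-rounding formula, and the padding order are assumptions made for illustration. The idea is to compute how many positions the kernel can take along each dimension and reject configurations where that count is not positive.

```python
def conv_output_extent(in_size, kernel, pad_before, pad_after, stride=1):
    """Number of positions a kernel can take along one dimension (floor rounding)."""
    if kernel <= 0 or stride <= 0:
        raise ValueError("kernel and stride must be positive")
    padded = in_size + pad_before + pad_after
    if padded < kernel:
        return 0  # the kernel does not fit even once
    return (padded - kernel) // stride + 1


def validate_im2col(in_w, in_h, kernel_w, kernel_h, pad=(0, 0, 0, 0), stride=(1, 1)):
    """Return (out_w, out_h), or raise if the configuration gives an empty output."""
    pad_left, pad_right, pad_top, pad_bottom = pad  # assumed ordering
    out_w = conv_output_extent(in_w, kernel_w, pad_left, pad_right, stride[0])
    out_h = conv_output_extent(in_h, kernel_h, pad_top, pad_bottom, stride[1])
    if out_w < 1 or out_h < 1:
        raise ValueError("invalid Im2Col configuration: empty output shape")
    return out_w, out_h
```

A 2x2 input with a 3x3 kernel and no padding is rejected here, while adding one element of padding on each side makes it valid.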
2021-01-04 | Add utility functions for SVE | Sang-Hoon Park
- A few bit-width dependent intrinsics are added. - A few math functions are added. Partially implements: COMPMID-3872 Change-Id: Ia6ab46bd170fec9c7c8d4410b7ef4d84710b68ed Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4718 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-12-24 | COMPMID-3871: Create BatchNormalization SVE/SVE2 | Sheri Zhang
1. Decouple data type for NHWC 2. Add NHWC SVE support for BatchNormalization Signed-off-by: Sheri Zhang <sheri.zhang@arm.com> Change-Id: I0383b969b555b429d9acebb4efa17ecba9429ea7 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4755 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
2020-12-23 | Fix baremetal arm_compute_validation build errors | SiCongLi
* Add -C flag to instruct the preprocessor not to strip comments. This prevents marker comments like '// fall through', which suppress certain warnings, from being removed. * Fix unused variable warnings. * Add M_PI definition that's missing from certain toolchain standard libraries. Resolves COMPMID-4054 Change-Id: I1d641db668685d4b678f3d0efed84bfe9e630b4b Signed-off-by: SiCongLi <sicong.li@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4692 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-12-14 | COMPMID-3870: Create ActivationLayer SVE/SVE2 | Michalis Spyrou
Adds support for ActivationLayer for SVE and SVE2. Data types supported: FP32, FP16, QASYMM8, QASYMM8_SIGNED, QSYMM16. Change-Id: Ia3583891795cda4ca2f9fa27c440731a5c27710d Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4566 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-12-14 | COMPMID-3968 30% regression on FSSD v1 25 Grayscale | Giorgio Arena
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Change-Id: Ib1ecd7aa10fec0b7e2b3d929e212c1af34c0f58d Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4533 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-12-11 | Remove (CL/NE)UpsampleLayer in favor of (NE/CL)Scale | Georgios Pinitas
Upsample functions and kernels can be replaced with Scale, as it provides the same functionality. Partially resolves: COMPMID-3996 Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: Ic2f9ba352c183aa87d69d551d5c172d0f22119e8 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4679 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
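To see why an upsample operation is redundant once a general scale exists, consider the sketch below. It is illustrative only: `scale_nearest` and `upsample` are hypothetical helpers, and the sampling rule (integer floor of the source coordinate) is an assumption rather than the library's sampling policy.

```python
def scale_nearest(image, out_h, out_w):
    """Nearest-neighbour resize of a 2D list-of-lists image."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[(y * in_h) // out_h][(x * in_w) // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


def upsample(image, factor):
    """Integer upsampling expressed as a nearest-neighbour scale."""
    return scale_nearest(image, len(image) * factor, len(image[0]) * factor)
```

With an integer factor, every source pixel is simply repeated `factor` times in each direction, which is what a dedicated upsample kernel would produce.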
2020-12-10 | Remove (NE/CL)YoloLayer support | Georgios Pinitas
YOLO layer is too specialized and specific to a single model type. Can be decomposed using split, activation and concatenate layers Partially Resolves: COMPMID-3996 Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I3cde88f8d4cc7d8c70ce1bb3b32b00f8d09bdca2 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4678 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
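A decomposition along the lines mentioned in this message could look like the sketch below. The channel layout (x, y, w, h, objectness, then class scores per box) and the choice of which channels are activated are assumptions made for illustration; the commit message does not spell them out. All names are hypothetical.

```python
import math


def logistic(v):
    """Logistic (sigmoid) activation."""
    return 1.0 / (1.0 + math.exp(-v))


def yolo_decomposed(channels, num_classes, activation=logistic):
    """Apply a YOLO-style activation using only split, activation and concatenate."""
    box = 5 + num_classes  # assumed layout: x, y, w, h, objectness, class scores
    if len(channels) % box != 0:
        raise ValueError("channel count must be a multiple of 5 + num_classes")
    out = []
    for start in range(0, len(channels), box):
        chunk = channels[start:start + box]
        # Split: box centre, box size, and objectness plus class scores
        xy, wh, rest = chunk[:2], chunk[2:4], chunk[4:]
        # Activate the first and last group, pass the size through, concatenate
        out += [activation(v) for v in xy] + wh + [activation(v) for v in rest]
    return out
```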
2020-12-08 | Wrap Flatten layer over reshape | Georgios Pinitas
Flatten layer is lowered into a Reshape layer. Remove (CL/NE)FlattenLayerKernel. Partially Resolves: COMPMID-3996 Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: Id9e2ddfe2e2dd793541badff3490c05e4c908f88 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4660 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
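The lowering can be pictured as nothing more than a shape computation followed by a reshape. In this sketch the shape is assumed to be ordered (width, height, channels, batch...), and the helper names are hypothetical.

```python
import math


def flatten_shape(shape):
    """Collapse the first three dimensions; keep any remaining (batch) dimensions."""
    return (math.prod(shape[:3]),) + tuple(shape[3:])


def reshape(data, shape):
    """Reshape a flat buffer: only the shape metadata changes, not the element order."""
    if math.prod(shape) != len(data):
        raise ValueError("element count does not match the requested shape")
    return data, tuple(shape)


def flatten(data, shape):
    """Flatten expressed purely in terms of reshape."""
    return reshape(data, flatten_shape(shape))
```

Because the element order is untouched, a dedicated flatten kernel adds nothing that the reshape does not already do.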
2020-12-07 | COMPMID-3869: Update Sconstruct to support SVE/SVE2 | Manuel Bottini
Modifying scons to build with SVE/SVE2. Updating the documentation with examples. Change-Id: I80875206599d5444b9c21ac75c4a8e4efd30d8b5 Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4629 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Sang-Hoon Park <sang-hoon.park@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2020-12-03 | Update GEMV heuristics for quantized types for A53 | Georgios Pinitas
Switch assembly kernels to dispatch a 4x4 blocked GEMM kernel for A53 when M <= 4 instead of the 8x12 u16 based one. Resolves: COMPMID-3983 Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: Ic46a1b51a7c075e46dcb5cd578c75260ded0540c Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4640 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
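The dispatch rule stated in this message amounts to a small selection function. The sketch below uses made-up kernel and CPU identifiers; only the `M <= 4` threshold and the 4x4 versus 8x12 choice come from the commit message.

```python
def select_quantized_gemm_kernel(cpu_model, m):
    """Pick a kernel label for a quantized GEMM with m rows (labels are hypothetical)."""
    if cpu_model == "A53":
        # Small M (GEMV-like shapes): prefer the 4x4 blocked kernel
        return "gemm_4x4" if m <= 4 else "gemm_u16_8x12"
    return "default"
```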
2020-12-02 | Remove support for (NE/CL)LocallyConnectedLayer | Georgios Pinitas
Remove out-of-date and unmaintained LocallyConnectedLayer for both NEON and OpenCL. Resolves: COMPMID-3924 Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: Ia61398ed8cfa3876f41c1b342c4a80d1cca0ca83 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4634 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-12-02 | COMPMID-3862: Add support QASYMM8 LEAKY RELU activation | Sang-Hoon Park
- LEAKY RELU activation is supported for QASYMM8 data type - vquantize on NEON side has been modified to match with other backends (OpenCL and reference) Change-Id: I194631225c8d4f3cc96027d64812ec2be2b4328a Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4593 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Manuel Bottini <manuel.bottini@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-12-01 | Update default C++ standard to C++14 | Georgios Pinitas
(3RDPARTY_UPDATE) Resolves: COMPMID-3849 Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I6369f112337310140e2d6c8e79630cd11138dfa0 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4544 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-11-23 | COMPMID-3987: Nightly failure - Android builds failing in dataset and validation | Manuel Bottini
Removing warnings from the vector library in GCC 7.1+. Removing warnings for intentional switch-case fall-throughs. Removing the GCAccessor move constructor. Removing parenthesized equality checks in stb_image. Small fixes in the GEMM test suite. Change-Id: I8ba8e3fa20b45c32e5b6219473e0f33ab787ca30 Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4483 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-11-23 | Update tuning numbers for A55 for both fp16 and fp32 | Georgios Pinitas
Resolves: COMPMID-3974 Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I6d5189e44ebeda1575a80dd14ec3a09c75f68e03 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4521 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-11-17 | COMPMID-3962: Add Logical And, Or, Not support on NEON | Georgios Pinitas
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: Iabcd94d1ed6fe8bb27ce93924c35e25f48f39cf1 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4438 Reviewed-by: James Conroy <james.conroy@arm.com> Reviewed-by: Sang-Hoon Park <sang-hoon.park@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-11-17 | COMPMID-3970: Failure when building with GCC < 6 | Georgios Pinitas
Address pre-N4387 tuple usage Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: Iefe6e08e27b8fe1e688d2ff9db8cb7e172b568f3 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4429 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-11-13 | COMPMID-3851: Fix regression on NEDepthwiseConvolutionLayerNativeKernel | Sang-Hoon Park
The exit condition of some for loops in quantized version of the kernel with depth_multiplier=1 is decided during compilation to fix performance issue. Change-Id: I849b3d63b2a2cf5eb374ae681898ae1c296fb4fe Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4392 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-11-13 | COMPMID-3852: Fix NEReduction window | Georgios Pinitas
ReductionOperations splits the kernel for scheduling on the X dimension when reduction axis is > 0. By setting the execution window to be unit one in the X dimension the execution was always restricted to a single thread. Alters the window to enable multi-threading Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: Idcbe2b78957678310bb8e895969f01de972d3667 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4389 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
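The effect described here can be reproduced with a toy window splitter. This is a simplified stand-in for a scheduler, not the library's `Window` API; the function name and the even-split policy are assumptions.

```python
def split_window(start, end, num_threads):
    """Split [start, end) into at most num_threads contiguous, non-empty sub-ranges."""
    total = end - start
    parts = min(num_threads, total)
    ranges = []
    pos = start
    for i in range(parts):
        # Spread the remainder over the first few sub-ranges
        size = total // parts + (1 if i < total % parts else 0)
        ranges.append((pos, pos + size))
        pos += size
    return ranges
```

A window of extent one along the split dimension yields a single sub-range no matter how many threads are available, whereas a window that covers the full X extent can be shared among them.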
2020-11-12 | COMPMID-3960: Mismatch on NEArithmeticSubtraction | Georgios Pinitas
Corner-case failure when both input shapes had unit shape on the X axis. Broadcasting was enabled leading to invalid window execution. Check is updated to cross-validate the presence of broadcasting by checking the X dimension in both input shapes. Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I0b79542279e8d155d2661fddff9691d94a1f6855 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4391 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
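The corner case can be illustrated with two versions of a broadcast check. Both functions are hypothetical reconstructions, since the commit message does not show the original condition; shapes are assumed to list the X dimension first.

```python
def is_broadcast_across_x_naive(shape1, shape2):
    """Flags broadcasting whenever either input has a unit X dimension."""
    return shape1[0] == 1 or shape2[0] == 1


def is_broadcast_across_x(shape1, shape2):
    """Broadcasting along X is needed only when the two X dimensions differ."""
    return shape1[0] != shape2[0]
```

For two inputs of shape `(1, 4)`, the naive check reports broadcasting even though the shapes already match, while the cross-validated check does not.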
2020-11-12 | COMPMID-3776: Indirect GEMM | Georgios Pinitas
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I51a1b0f098bc3a8c408c50c92221e4df3061e12c Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4343 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Sang-Hoon Park <sang-hoon.park@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-11-10 | COMPMID-3958: Fix build error with Werror=1 | Sang-Hoon Park
Remove unused variable in anonymous namespace. Change-Id: Id9775cd7982f2a2ebf68f20e0c4e33013c3382a0 Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4361 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-11-10 | COMPMID-3639: Fix script to generate *Kernels.h | Sang-Hoon Park
Change-Id: Ie44fc807fe8d7ad04a97f0ea4f611b60cb8e0716 Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4325 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2020-11-09 | COMPMID-3852: Fix complex multiplication remove padding performance regression | Sheri Zhang
Signed-off-by: Sheri Zhang <sheri.zhang@arm.com> Change-Id: I2605baba63c9cca0370328860313b8ec09e04fb6 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4355 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-11-09 | COMPMID-2808: Add support for QASYMM8_SIGNED in NEROIAlignLayer | Sheri Zhang
Signed-off-by: Sheri Zhang <sheri.zhang@arm.com> Change-Id: Id4f4c96e1823a4b27886fee9baf70847172e619c Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4335 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-11-09 | COMPMID-3951 LargeGraph_FLOAT32_Rank4_25 CTS failures in Android Q in CL Fix1 | SiCong Li
* Fix CLSpaceToBatchLayerKernel and NESpaceToBatchLayerKernel validation errors by using the correctly calculated output tensor shape Signed-off-by: SiCong Li <sicong.li@arm.com> Change-Id: I21d61f870e6a23a2e38dcb95c939b0bf08082b6f Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4347 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: TeresaARM <teresa.charlinreyes@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-11-06 | COMPMID-3850: NEPooling regression for NHWC | Georgios Pinitas
Expand left-over loop to handle multiples of 8 for quantized data type during MaxPooling. Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I1304d174c45d2c98247470ac8b4bb6752bbc03a6 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4339 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-11-03 | COMPMID-3638: Move NEON kernels | Michalis Spyrou
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Change-Id: Ieed3e4bc8be7fef80c90c5094599b477a56fc473 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4285 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-10-30 | COMPMID-3926: Floor CTS failing in Neon | Michele Di Giorgio
Depending on the value of `len`, the left-over loop might end up writing/reading out-of-bounds, therefore corrupting the memory. Change-Id: I1b0bb300f3e5ea668b585266e1aa6af7f93a5d1e Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4290 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
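A common shape for such a kernel is a main loop over full vector-sized blocks followed by a left-over loop for the tail. The sketch below shows that pattern with both loops bounded by the buffer length; the block size of 4 and the function name are assumptions, and the list slices only stand in for vector loads and stores.

```python
import math


def floor_buffer(src, step=4):
    """Element-wise floor using a blocked main loop and a bounded left-over loop."""
    n = len(src)
    dst = [0.0] * n
    i = 0
    # Main loop: only full blocks of `step` elements (stand-in for one vector op)
    while i + step <= n:
        dst[i:i + step] = [float(math.floor(v)) for v in src[i:i + step]]
        i += step
    # Left-over loop: strictly bounded by n, so it never touches memory past the end
    while i < n:
        dst[i] = float(math.floor(src[i]))
        i += 1
    return dst
```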
2020-10-29 | COMPMID-3853: Decouple NEActivationLayer | Michalis Spyrou
Decouple datatypes and remove Activation template. Binary size dropped by 25Kb. Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Change-Id: I32c207db124895fee25b56437f9495403315b867 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4217 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-10-29 | COMPMID-3827: Resize CTS failing in Neon after removing padding | Manuel Bottini
Change-Id: I7cf27272e4e6e82b36a31a80ed47ae38fbbf9129 Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4269 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-10-20 | COMPMID-3637: Move utility headers from arm_compute to src | Sang-Hoon Park
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: If9d6fa8c900b68c4b6fd373f2fc1f9abb83ea917 Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4145 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Sang-Hoon Park <sang-hoon.park@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-10-19 | COMPMID-3163: Remove padding from NEDepthwiseConvolutionLayerNativeKernel | Sang-Hoon Park
Change-Id: Ibbd6bee5c6a4ce4f212b207d17a65b9c33bcfa78 Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4106 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-10-16 | COMPMID-3805: Fix SQRT non-zero output for zero input | Sang-Hoon Park
- For AArch64, NEActivationLayerKernel uses vsqrt rather than vinvsqrt. - For non-AArch64, it masks values to ensure zero input results in zero output without producing NaN. - Test cases for FP16 and FP32's positive boundary values are added. Change-Id: Ic0104ee5d7045059c2e9bd052616a4a3b43a315d Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4150 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
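Why masking is needed can be shown with scalar arithmetic. The sketch assumes the square root is derived as `x * rsqrt(x)`, which is one common formulation and not necessarily the one the kernel uses; all function names are hypothetical.

```python
import math


def rsqrt(x):
    """Reciprocal square root; +inf for zero, mirroring IEEE-754 behaviour."""
    return math.inf if x == 0.0 else 1.0 / math.sqrt(x)


def sqrt_via_rsqrt(x):
    """Unmasked: for x == 0 this computes 0 * inf, which is NaN."""
    return x * rsqrt(x)


def sqrt_via_rsqrt_masked(x):
    """Masked: zero input is forced to zero output before the NaN can appear."""
    return 0.0 if x == 0.0 else x * rsqrt(x)
```

A direct square-root instruction avoids the problem entirely, which matches the AArch64 approach mentioned in the message.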
2020-10-14 | COMPMID-3172: Remove padding from NEGEMMMatrixMultiplyKernel | Michele Di Giorgio
Template parameter has been removed, which reduces the binary size by: - ~4 kB for armv8.2a - ~12 kB for armv8a Change-Id: Ib499a18a4980a3ee7b201507b943f900adf20a73 Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4122 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-10-14 | COMPMID-3144: Remove padding from NEDirectConvolutionLayerKernel | Manuel Bottini
Change-Id: I22b907eebfbe037e6e1c7bf604172f4709a9cbed Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4082 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2020-10-13 | COMPMID-3828: fix unsigned overflow in quantized reduce mean | Sang-Hoon Park
Change-Id: I9d3122b4858137d422548d1d417eb04a27ae9c7b Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4143 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: TeresaARM <teresa.charlinreyes@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-10-09 | COMPMID-3794: Fix window loops causing performance regression | Michalis Spyrou
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Change-Id: Id4d95c6ce5fed91bb079b8bfe1abceedefd20c97 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4117 Reviewed-by: Sheri Zhang <sheri.zhang@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-10-08 | COMPMID-3170: Remove padding in NEGEMMLowpMatrixMultiplyKernel | morgolock
Change-Id: Ie95442c6c6a145c1a45937b03cbd433bf08e36ab Signed-off-by: morgolock <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4094 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-10-08 | COMPMID-3684: Use case data type decoupling | Georgios Pinitas
Decouples data types for NEFloorKernel Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I6756300540bc5ef32a9990246eed8619a76855f2 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4084 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-10-07 | COMPMID-3821: NEON Reduction op PROD failures | Michalis Spyrou
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Change-Id: I8cfdd24c4e71a6a4be610ba67a75ad2943a43801 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4097 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-10-07 | COMPMID-3637: Move wrapper to src | Georgios Pinitas
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I524b0c4b49c7a7035b7d078b9585d77b0d438e10 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4083 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-10-06 | COMPMID-3181: Remove padding from NEReductionOperationKernel | Sheri Zhang
COMPMID-3803: Remove padding from NEComplexPixelWiseMultiplicationKernel Signed-off-by: Sheri Zhang <sheri.zhang@arm.com> Change-Id: I309fc4ab62bacbca9203d2680a9d6d52f76f70e6 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4078 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-by: Pablo Marquez <pablo.tello@arm.com>
2020-10-02 | COMPMID-3145: Remove padding from NEScaleKernel | Manuel Bottini
Change-Id: I530b12c6270d7dbeb3ef7af62484842ebcb65925 Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4000 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
2020-10-02 | COMPMID-3183: Removed padding NEGEMMLowpReductionKernel | morgolock
Change-Id: Ibf7741ffdefcceb9683c919e79302fc35c36ea65 Signed-off-by: morgolock <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4031 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2020-09-30 | COMPMID-3802: Remove templates from NEDirectConvolutionLayerOutputStageKernel | Michalis Spyrou
Removing bool template reduces the binary size by 20Kb. Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Change-Id: I652cea7d320a00b6c6e44cdacb61e77f3c10e56a Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4053 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>