aboutsummaryrefslogtreecommitdiff
path: root/arm_compute/core/NEON/NEKernels.h
AgeCommit message (Collapse)Author
2019-12-04COMPMID-2826 Comply with DCL51-CPPMichalis Spyrou
Rename all header guards to be compliant with DCL51-CPP Change-Id: I47b09375bb1b8d39d80c275ce69a3f25fb385d75 Signed-off-by: Michalis Spyrou <micspy01@e123758.cambridge.arm.com> Reviewed-on: https://review.mlplatform.org/c/2393 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2019-11-27COMPMID-2805: Add QASYMM8_SIGNED support in NEGEMMLowpOutputStageGeorgios Pinitas
Add support from requantizing down from S32 to Int8 with fixed point requantization. This involves the following: - Compute fixed point multiplication between each entry of input by result_fixedpoint_multiplier - Add bias to final result if bias tensor is not a nullptr - Round to nearest division by a power-of-two using result_shift - Add offset to each result - Clamp the value between the specified min and max bounds - Cast to int8 data type Change-Id: I641b3fac0833c568d8565ccb859bbc561a24c17d Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-on: https://review.mlplatform.org/c/2340 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2019-11-06COMPMID-2308: NEConvolutionLayer: support QUANT8_SYMM_PER_CHANNEL filtersGeorgios Pinitas
Change-Id: Ic1bf5f0d21ccd525f84213a360f7e199d7f50577 Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-on: https://review.mlplatform.org/c/2177 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2019-10-21COMPMID-2708 NEDepthwiseConvolution Generic: support for QUANT8_PER_CHANNEL_SYMMGiorgio Arena
COMPMID-2470 Implement a new and generic depthwise convolution for NEON QASYMM8 NHWC COMPMID-2477 Enable FP16 data type for the new generic convolution on NEON for NHWC COMPMID-2625 Remove old implementation files for the generic NEDepthwiseConvolution Change-Id: I8f6deda4fc69dd7e472fba3228b1ed5dad172f3e Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/2094 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2019-09-20COMPMID-2257: Implement NEGenerateProposals.Pablo Tello
Change-Id: I8d751f8b09f842a214c305a4530a71d82f16db8f Signed-off-by: Pablo Tello <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/1943 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2019-09-17COMPMID-2314: Implement NEON INSTANCE_NORMALIZATION functionManuel Bottini
Change-Id: Ibaa574207aedf691953f8af8fa32b6408a1664ec Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/1905 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2019-09-09MLCE-129: NEPad 30x slower than TensorFlow's implementationManuel Bottini
Change-Id: I44770e6a3134c70c4bd58f890d06cb43c9bd8bff Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/1853 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2019-09-04COMPMID-2246: Implement NEBoundingBoxTransform.Pablo Tello
Change-Id: I69faa21ccdc295f5369450a12e6e217d873349e3 Signed-off-by: Pablo Tello <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/1854 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2019-08-29COMPMID-2318: Implement NEROIAlignLayerPablo Tello
Change-Id: I2665c98b5d6523a9ddaef0b7c220aecd22559ce6 Signed-off-by: Pablo Tello <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/1826 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2019-07-29COMPMID-2336: Rename the new generic depthwise convolution on NEONGian Marco Iodice
Change-Id: I45cacf75b08bb9d867343037507e56f200ad6ac0 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/1637 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2019-07-26COMPMID-2179 New generic depthwise convolution for NEON F32 NHWCGiorgio Arena
Change-Id: I2b883785c0500d4bdb6ee4700382ee058be2cd36 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/1538 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2019-06-28COMPMID-2234 : Add support for axis 3 in NE/CLConcatenateLayerVidhya Sudhan Loganathan
Change-Id: Ic86f89ece3afe72809bc69c6de6fee7d21daa1d4 Signed-off-by: Vidhya Sudhan Loganathan <vidhyasudhan.loganathan@arm.com> Reviewed-on: https://review.mlplatform.org/c/1440 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2019-06-25COMPMID-2406: Create a new GEMMLowpQuantizeDownInt32ToInt16ScaleKernel for NEONGian Marco Iodice
Change-Id: I3f3e247728fd6dafca066e41835f0ef9442d9b7a Signed-off-by: giuros01 <giuseppe.rossini@arm.com> Reviewed-on: https://review.mlplatform.org/c/1379 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2019-06-18COMPMID-2387: Add support for NEMeanStdDevNormalizationLayerMichele Di Giorgio
Change-Id: If5a9558b78f9c696dfc72633f73e9e21e5c2970f Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/1351 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Marquez <pablo.tello@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2019-05-29COMPMID-2237Manuel Bottini
Implement SPACE_TO_DEPTH for NEON Change-Id: I9f427bceca6da52671e0096be08772612f4be152 Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/1227 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
2019-05-24COMPMID-2240 Implement DEPTH_TO_SPACE for NEONMichalis Spyrou
Change-Id: I705aa0f804093c3628c691e46cca475f2819dc65 Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-on: https://review.mlplatform.org/c/1198 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giuseppe Rossini <giuseppe.rossini@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2019-05-01COMPMID-1963: Implement FFT (2D) on NEONgiuros01
Change-Id: I3b564be8d7949e00c6544071ef62dd51de838c96 Signed-off-by: giuros01 <giuseppe.rossini@arm.com> Reviewed-on: https://review.mlplatform.org/c/1048 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2019-04-26COMPMID-1961: Implement FFT (1D) on NEONgiuros01
Change-Id: I0bea3bfbc3b0cd9e8c9a0e0f6f430640573f08d1 Signed-off-by: giuros01 <giuseppe.rossini@arm.com> Reviewed-on: https://review.mlplatform.org/c/996 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2019-03-19COMPMID-1933: Implement NEHeightConcatenateLayer.Pablo Tello
Added support to concactenate tensors along the Y axis in NEConcatenateLayer. Change-Id: Ib714bfcf9954cc35918efa7d52fc9164bb08bdf6 Signed-off-by: Pablo Tello <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/841 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2019-03-15COMPMID-1694: Fuse offset contribution with the output stage when we use ↵George Wort
NEGEMMLowpMatrixMultiplyCore Change-Id: Ic1a681e4cc03e1eba3bf8485d9cdb17b3e926047 Signed-off-by: giuros01 <giuseppe.rossini@arm.com> Reviewed-on: https://review.mlplatform.org/c/561 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2019-03-05COMPMID-1843: Implement NECropGeorge Wort
Change-Id: I27e8b1a00c2315c72106e8e596f84ad48fb770e3 Signed-off-by: George Wort <george.wort@arm.com> Reviewed-on: https://review.mlplatform.org/c/648 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Marquez <pablo.tello@arm.com>
2019-03-04COMPMID-1946: Implement NEBatchToSpacegiuros01
Change-Id: I119645eb3ea437c7dfe59545da58b328a7184f3f Signed-off-by: giuros01 <giuseppe.rossini@arm.com> Reviewed-on: https://review.mlplatform.org/c/734 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2019-03-01COMPMID-1947: Implement NESpaceToBatchgiuros01
Change-Id: I59b3c17874ba24559b7fddf74f7659a1b9177759 Signed-off-by: giuros01 <giuseppe.rossini@arm.com> Reviewed-on: https://review.mlplatform.org/c/735 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
2019-01-21COMPMID-1763 : NEON: Implement GatherJohn Kesapides
Change-Id: I9a3808315290bd395f5acce4530ab8daccddf8be Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/167195 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-on: https://review.mlplatform.org/520 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Marquez <pablo.tello@arm.com>
2019-01-14COMPMID-1772: Implement PadV2 for NEONGeorgios Pinitas
Change-Id: Ia4604524a034c46b004fd850183480c5fbfd8cb3 Reviewed-on: https://review.mlplatform.org/437 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
2019-01-14COMPMID-1758: NEON: Implement RangeManuel Bottini
Change-Id: I56dff9462b85760fbed6db43224cadb90d283810 Reviewed-on: https://review.mlplatform.org/472 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2019-01-11COMPMID-1761: NEON: Implement PackIsabella Gottardi
Change-Id: Icc3392494b1e3361e8fd925da200827c494351b3 Reviewed-on: https://review.mlplatform.org/430 Reviewed-by: Manuel Bottini <manuel.bottini@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giuseppe Rossini <giuseppe.rossini@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2019-01-08COMPMID-1759 NEON: Implement ReverseMichalis Spyrou
Change-Id: I53852069ca223eb571a443e501278980fc60f3b4 Reviewed-on: https://review.mlplatform.org/474 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2019-01-08COMPMID-1756: NEON: Implement RSqrt, ExpGeorge Wort
Change-Id: I6b140b8868b04f7d3032a51831a80829e8e1560e Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/165590 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Reviewed-on: https://review.mlplatform.org/470 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2019-01-02COMPMID-1769: Add support for NEON StridedSlice,Split,Slice,UnstackGeorgios Pinitas
Change-Id: I7d3c5e6858fed090410720f76947327e39bc72f8 Reviewed-on: https://review.mlplatform.org/450 Reviewed-by: Giuseppe Rossini <giuseppe.rossini@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-by: VidhyaSudhan Loganathan <vidhyasudhan.loganathan@arm.com>
2019-01-02COMPMID-1767: NEON: Implement Where/SelectGeorge Wort
Change-Id: If8a1ab6d6a029a5c547b726e0692eecef9a2e97d Reviewed-on: https://review.mlplatform.org/415 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-12-17COMPMID-1754: NEON: Implement Maximum, Minumum, SquaredDifferencegiuros01
Change-Id: I77e8c6a8af6ad841293ed5e66ed582035cc1424b Reviewed-on: https://review.mlplatform.org/339 Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-12-13COMPMID-1741: Implement NEFuseBatchNormalizationKernelgiuros01
Change-Id: Ib3ba4b22804a94a1e8ef6d7076e28c2fc1cd2fa2 Reviewed-on: https://review.mlplatform.org/385 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Anthony Barbier <Anthony.barbier@arm.com>
2018-12-05COMPMID-1757: NEON: Implement Tilegiuros01
Change-Id: Ic6a1f55f14d53896725afe426bc2e2acb1546589 Reviewed-on: https://review.mlplatform.org/343 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-16COMPMID-1461 SSD support: Create NEON PriorBoxMichalis Spyrou
Change-Id: I99e1c3939cfea4b9cb0ddfa313706f31b213ca89
2018-11-08COMPMID-1579: Add support for ChannelShuffle operator in NEONGeorgios Pinitas
Change-Id: I6d5f91579850906e1eb973ff6c5612195255e631
2018-11-02COMPMID-1596 Create UpsampleLayer for NEONMichalis Spyrou
Change-Id: I82d95c4f1c5fed13b213a2591cc2b4e0d0e02a54 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/149676 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com> Tested-by: bsgcomp <bsgcomp@arm.com>
2018-11-02COMPMID-1540 Implement YOLOLayer on NEONMichalis Spyrou
Change-Id: Ice28996959dc666fff5e8ae486c1ff8093db083f Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/148367 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: bsgcomp <bsgcomp@arm.com>
2018-11-02COMPMID-1528: Add ReorgLayer on NEONGeorgios Pinitas
Change-Id: I44369b4a716767163e2233b7d87bff300c523383 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/146314 Reviewed-by: Pablo Tello <pablo.tello@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1047 Extract Flatten function from Im2Col for NEONGiorgio Arena
Change-Id: I80f3aaadc8cae8c9ca1a5a239e79bda302b89bd8 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/144813 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02COMPMID-1366 Implement NECopyMichalis Spyrou
Change-Id: I183e4b7081bf12de3546293a00da68b4f4a0dd5e Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143987 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1364: Add support for NHWC in NEDepthConcatenateLayerGeorgios Pinitas
Change-Id: I4f8e46d1c79afa9284f2c6dc00383c453a8e7bd5 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/140165 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1050 CL/NEON: Create a function to convert the 2D weights of FC ↵Giorgio Arena
layer from NHWC to NCHW and viceversa Change-Id: If77ffeb92b6eb883e5d2d2c97c2c4d1d23d17c8d Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/129257 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1074: Rename WinograLayer.cpp to WinogradConvolutionLayer.cppGeorgios Pinitas
Change-Id: Iccac7cd6cb458469568d0cd6fb36b262353f4188 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/129261 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
2018-11-02COMPMID-959: Removed Interleave blocked kernel.Pablo Tello
Change-Id: I775eecbc39da583aae2eb4e033c5930dfc402899 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/125684 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-881: RSH new arm_gemm interface.Pablo Tello
Change-Id: I1e2a1a77097d8017c274af3f97eba6964f80f5fa Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/122592 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-903: Implements NEPermute for NHWC conversionsGeorgios Pinitas
Change-Id: I4083e8d16bb23933634f229a1408dfd0e8f2922a Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120069 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-876: Integrate RSH native GEMM kernel.Georgios Pinitas
Change-Id: Iaae87e155fa673bf099c2bc21a7be072c5c08fc1 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/119118 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
2018-11-02COMPMID-866: Integrate SGEMV Neon Assembly from RSHMichele Di Giorgio
Change-Id: Icbb43de7642e2b433d7471d70b9dbbde850989d3 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118197 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
2018-11-02COMPMID-471 Implement Deconvolution on OpenCLMichalis Spyrou
Change-Id: Ie00c6b08a51d30c5ce2637d40ee3d165b8a68686 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/110311 Reviewed-by: Pablo Tello <pablo.tello@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>