path: root/src/core/NEON/kernels
Age  Commit message  Author
2018-09-17  COMPMID-472 : Implement Floor for CL and NEON.  Georgios Pinitas
Change-Id: I675a4545b1fe9ab665a07c834720bfe7ff589cee Reviewed-on: http://mpd-gerrit.cambridge.arm.com/82527 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-09-17  COMPMID-417 NEON/CL MeanStdDev bugfix using FillBorderKernel  Giorgio Arena
Change-Id: Ic48ba7f69783d0e1e80611264e2bc67d1732436e Reviewed-on: http://mpd-gerrit.cambridge.arm.com/81293 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-09-17  COMPMID-438: Add support for floating point Min-Max Location layer.  Michele Di Giorgio
Change-Id: I84ae564a40fc7320a6f94a84d53906ba51404f51 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/79797 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-09-17  COMPMID-417: Port NEDirectConvolution 1x1 to QS16.  Pablo Tello
Change-Id: Icae6a5091e836d0aca24375f43cca9e6d3a2090f Reviewed-on: http://mpd-gerrit.cambridge.arm.com/81662 Reviewed-by: Moritz Pflanzer <moritz.pflanzer@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-09-17  COMPMID-413: Add support for QS8 and QS16 CLNormalizationLayer.  Michele Di Giorgio
Change-Id: I1aaa9fb8d05796bbca9cfae584e084646552bb71 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/80155 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-09-17  COMPMID-456: Add support for QS16 NEON Normalization Layer.  Michele Di Giorgio
Change-Id: I1e542808cfd7774c67cc4e9a58e42449e4fb29aa Reviewed-on: http://mpd-gerrit.cambridge.arm.com/81735 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-09-17  COMPMID-417 - Bug Fix WarpPerspective kernel  Isabella Gottardi
Change-Id: Ic26fb3b1b60c1a1f4848d683862a25bd1ebc2cc8 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/82053 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Steven Niu <steven.niu@arm.com>
2018-09-17  COMPMID-421: Added FP16 support in BatchNormalizationLayer.  Pablo Tello
Change-Id: I7142e0e8466ef79e016ae56d285e8e9291573e52 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/79814 Reviewed-by: Moritz Pflanzer <moritz.pflanzer@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-09-17  COMPMID-421: Added FP16 support in the Neon Locally Connected Layer.  Pablo Tello
Change-Id: I4b52a209a5ce1a7e69494008538ed242b14b5593 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/81520 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-09-17  COMPMID-447: Support scaling factors different than 1 for QS8/QS16 NEPixelWiseMultiplication.  Michele Di Giorgio
Change-Id: I6d90a18df861d53546bdca982192b4ffc0dbb3c2 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/80794 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
2018-09-17  COMPMID-421: Added FP16 support to Softmax.  Pablo Tello
Change-Id: If48178689e7cdadf1858556438c7292128be5b92 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/80436 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Moritz Pflanzer <moritz.pflanzer@arm.com>
2018-09-17  COMPMID-421: Added FP16 support to the NEON Direct Convolution function.  Pablo Tello
Change-Id: I3a1aa2ce985ecf95fc5f441a6e6d43b4935306ee Reviewed-on: http://mpd-gerrit.cambridge.arm.com/79965 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-09-17  COMPMID-421: Added FP16 support in Pooling Layer  Pablo Tello
Change-Id: I6b6119c8770051c1656da40aa073c539c15b493e Reviewed-on: http://mpd-gerrit.cambridge.arm.com/78985 Reviewed-by: Moritz Pflanzer <moritz.pflanzer@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-09-17  COMPMID-421: Added FP16 support in ActivationLayer.  Pablo Tello
Change-Id: I7ba573b19d56e3c87996edb5218a00e5bfca451e Reviewed-on: http://mpd-gerrit.cambridge.arm.com/79755 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-09-17  COMPMID-421: Added FP16 support to Arithmetic Subtraction.  Pablo Tello
Change-Id: I2043531e8e81f28354a208ff91024c3954389422 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/80304 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-09-17  COMPMID-401: Implement FixedPointPosition conversion for NEON.  Georgios Pinitas
Adds support of changing the fixed point position of a tensor in DepthConvert. Change-Id: Ic3b50a4628fac7497a0217d92941c9d6f64d21cb Reviewed-on: http://mpd-gerrit.cambridge.arm.com/80438 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-09-17  COMPMID-410 Port BatchNormalization to use fixed point 16  Michalis Spyrou
Change-Id: I7d3e9ff70c717ef5e6de2bcfbfd277f39006702f Reviewed-on: http://mpd-gerrit.cambridge.arm.com/78956 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-09-17  COMPMID-417: Add Leaky RELU support for both NEON/CL.  Georgios Pinitas
-Adds parametrizable leaky relu (x>0) ? x : a*x. Change-Id: Ief19a435b5832a30b56f4aaaf55125787addee94 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/80575 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
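The expression `(x>0) ? x : a*x` in the message above can be sketched as plain scalar C++. This is only an illustration of the formula, not the NEON or CL kernel from the commit; the names `leaky_relu` and `leaky_relu_self_check` are hypothetical and do not come from the library.

```cpp
#include <cassert>
#include <vector>

// Hypothetical scalar helper (not the library's kernel):
// returns x for positive inputs and a * x otherwise.
inline float leaky_relu(float x, float a)
{
    return (x > 0.0f) ? x : a * x;
}

// Applies the scalar helper element-wise; `a` is the caller-chosen
// slope used for non-positive inputs.
inline std::vector<float> leaky_relu(const std::vector<float> &in, float a)
{
    std::vector<float> out;
    out.reserve(in.size());
    for(float x : in)
    {
        out.push_back(leaky_relu(x, a));
    }
    return out;
}

// Small usage check: positive values pass through, negative values are scaled.
inline void leaky_relu_self_check()
{
    assert(leaky_relu(2.0f, 0.1f) == 2.0f);
    assert(leaky_relu(-4.0f, 0.25f) == -1.0f);
}
```

With `a = 0` this reduces to the ordinary ReLU, which is presumably why the slope is exposed as a parameter.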
2018-09-17  COMPMID-444: Add support for QS8/QS16 NEON Arithmetic Add/Sub/Mul.  Michele Di Giorgio
Change-Id: Ia482498688ca1884272b5062e3415e736e03d36f Reviewed-on: http://mpd-gerrit.cambridge.arm.com/80448 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-09-17  COMPMID-417: Port DepthConcatenate to QS8/QS16 for NEON/CL.  Georgios Pinitas
Change-Id: I3dddae63043c7aa18d908a4fc8abacf3c64f98ca Reviewed-on: http://mpd-gerrit.cambridge.arm.com/80081 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Steven Niu <steven.niu@arm.com>
2018-09-17  COMPMID-421: Added FP16 support to NENormalizationLayer and NEPixelWiseMultiplication.  Pablo Tello
Change-Id: If174f8071502fc5cc94b27cd44a9b1d5e451a9e2 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/79553 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-09-17  COMPMID-421: Added FP16 support to Arithmetic Addition.  Pablo Tello
Change-Id: I728f0a856e6581db5b61494a9c4850b963a61573 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/80280 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-09-17  COMPMID-428: Port NESoftmaxLayer to 16-bit fixed point.  Georgios Pinitas
Change-Id: I65122950bab9124b9758c27096c0f458b77aeabb Reviewed-on: http://mpd-gerrit.cambridge.arm.com/79365 Reviewed-by: Moritz Pflanzer <moritz.pflanzer@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Steven Niu <steven.niu@arm.com>
2018-09-17  COMPMID-421: Added F16 support in FC Layer.  Pablo Tello
Change-Id: I9c3ab51ae024be69c7b1d83803b1a8f60a0cdbfd Reviewed-on: http://mpd-gerrit.cambridge.arm.com/79326 Reviewed-by: Moritz Pflanzer <moritz.pflanzer@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-09-17  COMPMID-409: Add support for QS8 and QS16 CLPixelWiseMultiplication.  Michele Di Giorgio
Change-Id: I7f66d49d746ba9fb6e726ccab83d3a97b8ddef80 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/78491 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-09-17  COMPMID-427: Port NEActivationLayer in 16bit fixed point.  Georgios Pinitas
Change-Id: Iebd61807f7b597c6bd990673bc7655c68ee16f4b Reviewed-on: http://mpd-gerrit.cambridge.arm.com/79085 Reviewed-by: Moritz Pflanzer <moritz.pflanzer@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-09-17  COMPMID-417: Fix assert in GEMMTranspose  Moritz Pflanzer
The assert was checking the wrong thing. Only if the window over the input is smaller than the number of processed elements, the output shape would be empty. However, the valid region will be empty if the input's first dimension is less than the number of elements processed. That required the changes in TensorShape. Change-Id: I36fed7893dfd502e26c5c776c9a2d774d6cd91c6 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/79813 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-09-17  COMPMID-417: Fix output access window in ChannelExtract Kernels.  Georgios Pinitas
Change-Id: I0349ef7205e316d85a01e83e86016310143f8886 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/79820 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Moritz Pflanzer <moritz.pflanzer@arm.com>
2018-09-17  COMPMID-417: Auto initialize for SoftmaxLayer NEON/CL.  Georgios Pinitas
Change-Id: I6f35ac7a15fecab93deec4c6266e5c9632e599f0 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/79628 Reviewed-by: Moritz Pflanzer <moritz.pflanzer@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-09-17  COMPMID-417: DepthConvert NEON for QS8/QS16.  Georgios Pinitas
Change-Id: Ieb120bccf146045b3a0001ceb3893d4e67fd19df Reviewed-on: http://mpd-gerrit.cambridge.arm.com/79763 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Steven Niu <steven.niu@arm.com>
2018-09-17  COMPMID-436, COMPMID-437 - Port NEConvolutionLayer & NEFullyConnectedLayer to support 16 bit fixed point  Gian Marco Iodice
Change-Id: I69edf2dac242f941bac95c8479d921e7be6abca7 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/79725 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
2018-09-17  COMPMID-433 - Port NEGEMM to support 16 bit fixed point  Gian Marco Iodice
Change-Id: I82de74d7027bbc8a00a4d6671e968785280d5f6c Reviewed-on: http://mpd-gerrit.cambridge.arm.com/79498 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Moritz Pflanzer <moritz.pflanzer@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-09-17  COMPMID-418 Add check and fix comments after preprocessor conditions  Anthony Barbier
Change-Id: I1353fd652ee180e3931e58b4ce13d651a48c7e2c Reviewed-on: http://mpd-gerrit.cambridge.arm.com/79567 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Moritz Pflanzer <moritz.pflanzer@arm.com>
2018-09-17  COMPMID-421: Fixed FP16 support in Neon GEMM.  Pablo Tello
Fixed GEMM FP16 problem with matrices that are not multiple of 32. Added a new test suite NEON/GEMM/Float16/SmallGEMM. Implemented FP16 function to multiply vector by a matrix. Change-Id: Ie6c692885a48d0206bd6fe748332fa83bc286d67 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/79118 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Moritz Pflanzer <moritz.pflanzer@arm.com>
2018-09-17  COMPMID-417: Auto configuration for Add/Sub/Mul Neon/CL.  Georgios Pinitas
Change-Id: I3580de76bc53d342b53443d1077b1407d75a672a Reviewed-on: http://mpd-gerrit.cambridge.arm.com/79570 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Reviewed-by: Moritz Pflanzer <moritz.pflanzer@arm.com>
2018-09-17  COMPMID-417: Auto initialization for PoolingLayer for NEON/CL.  Georgios Pinitas
Change-Id: I2c399c5fe30a9d68fb84742771e7ef10beadb071 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/79569 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-09-17  COMPMID-417: Autoconfigure for BatchNormalization CL/NEON.  Georgios Pinitas
Change-Id: I49a410ccd0102699543bfd23a4d9518d781df281 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/79563 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-09-17  COMPMID-417: Add autoconfigure in NormalizationLayer CL/NEON.  Georgios Pinitas
Change-Id: I6a6ad0c7a92322776de6a4d4cceeb1365859ef4d Reviewed-on: http://mpd-gerrit.cambridge.arm.com/79566 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-09-17  COMPMID-432 - Extended Convolution Layer to support rectangular kernels  Gian Marco Iodice
Change-Id: I99be1efede4de6dd63ce103fb11196c413757621 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/79252 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Moritz Pflanzer <moritz.pflanzer@arm.com>
2018-09-17  COMPMID-421: Fixed a problem in Convolution Layer reference values for FP16.  Pablo Tello
All methods in std::numeric_limits<float16_t> return 0. Change-Id: I2289e01853e1b2c38afdec119ef6fc8af8a9752e Reviewed-on: http://mpd-gerrit.cambridge.arm.com/79312 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Moritz Pflanzer <moritz.pflanzer@arm.com>
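One plausible reading of the message above, stated here as an assumption, is that the toolchain had no `std::numeric_limits` specialization for `float16_t`. In that case the primary template applies: its `is_specialized` member is `false` and its functions return a value-initialized `T()`. The sketch below shows this with a hypothetical stand-in type `Half`, which is not the real `float16_t`.

```cpp
#include <cassert>
#include <limits>

// Hypothetical stand-in for a 16-bit float type that has no
// std::numeric_limits specialization (assumption for illustration).
struct Half
{
    unsigned short bits = 0;
};

// True when numeric_limits<T> falls back to the primary template,
// whose member functions all return T().
template <typename T>
constexpr bool limits_are_placeholder()
{
    return !std::numeric_limits<T>::is_specialized;
}

// Usage check: Half gets zero-valued placeholder limits, float does not.
inline void limits_self_check()
{
    assert(limits_are_placeholder<Half>());
    assert(std::numeric_limits<Half>::max().bits == 0);
    assert(!limits_are_placeholder<float>());
}
```

Checking `is_specialized` before trusting `max()` or `lowest()` is one way for reference code to detect this situation.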
2018-09-17  COMPMID-359: Implement NEON ROIPoolingLayer  Georgios Pinitas
Change-Id: Ibffa738d4016d7221968bd43a4e6e1dab85baee8 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/78623 Reviewed-by: Moritz Pflanzer <moritz.pflanzer@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-09-17  COMPMID-414 - Port CLConvolutionLayer to support 8 bit fixed point - CLWeightsReshapeKernel  Gian Marco Iodice
Change-Id: Ie32e6bdd557a8243eb9988aa7eab4e4ca2291e79 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/78701 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Moritz Pflanzer <moritz.pflanzer@arm.com>
2018-09-17  COMPMID-417 - Adding support for rectangular kernels  Gian Marco Iodice
Change-Id: I4dde0929bc689c83582b95856dd0253228125df2 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/78994 Reviewed-by: Moritz Pflanzer <moritz.pflanzer@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-09-17  COMPMID-424 Add validation tests for Gaussian5x5  SiCong Li
* Fix apply_2d_spatial_filter to use double as intermediate type * Fix tensor_elem_at to use random value if on border and border_mode is UNDEFINED Change-Id: I7feea23c4664cc63c5bab936566dc92b98c723b9 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/78905 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Moritz Pflanzer <moritz.pflanzer@arm.com>
2018-09-17  COMPMID-403: Add 7x7 NEON Pooling support.  Michele Di Giorgio
Change-Id: I2f1e808884f215b9cf79e1f2015ef901e66b3e5f Reviewed-on: http://mpd-gerrit.cambridge.arm.com/78146 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-09-17  COMPMID-421: Added FP16 support in Convolutional layer (Neon)  Pablo Tello
The test suite for FP16 is conditionally compiled in when the target platform is arch=arm64-8.2-a Change-Id: I1686157e83809a00a91058bff80dbecf692fb356 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/78740 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-09-17  COMPMID-345 - In-place computation for Activation Layer  Gian Marco Iodice
Change-Id: I25ebfccc3d3e758cc8164e0b33805c0bb303891a Reviewed-on: http://mpd-gerrit.cambridge.arm.com/78226 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-09-17  COMPMID-345 - Auto-initialization for NECol2ImKernel and NEGEMMInterleave4x4Kernel  Gian Marco Iodice
Added support for 8 bit fixed point in CLTransposeKernel. Change-Id: I8257fa30e90e70825c97c16b0a11af73c426319c Reviewed-on: http://mpd-gerrit.cambridge.arm.com/78563 Reviewed-by: Moritz Pflanzer <moritz.pflanzer@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-09-17  COMPMID-411 - Ported CLGEMMInterleave4x4Kernel and CLGEMMTranspose1xWKernel to support 8 bit fixed point  Gian Marco Iodice
Change-Id: If236c9047ed536e808a0ed26e97e1799ca938e03 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/78529 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Moritz Pflanzer <moritz.pflanzer@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-09-17  COMPMID-345: Scale input valid region in TransposeWindow.  Georgios Pinitas
Change-Id: I880e85834acc42d9d15b38ceeaadbaee9690a484 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/78093 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Moritz Pflanzer <moritz.pflanzer@arm.com>