aboutsummaryrefslogtreecommitdiff
path: root/src/core/NEON/kernels
AgeCommit message (Collapse)Author
2018-11-02COMPMID-678 Align Convolution InterfacesGiorgio Arena
Change-Id: I257a09860dd82e7bb7a767edf96dcaf31b512855 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/95865 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02COMPMID-583 - Implemented reference implementation and validation tests ↵Isabella Gottardi
(NEON and CL) for Histogram Change-Id: Iccf6b4483cb8394dab2f861a737583126f9bed81 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/91601 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-675 - Fixed mismatches in GEMMLowpMatrixMultiplyKernel dotproduct pathPablo Tello
Change-Id: I791a08c1e333ce6fc5d537f50ab731fbe066e9c9 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/95737 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02COMPMID-671: Add global pooling layer support.Georgios Pinitas
Change-Id: Iead7497cc03e1e7bde440d2965a7bf54cbfa88bf Reviewed-on: http://mpd-gerrit.cambridge.arm.com/95579 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Joel Liang <joel.liang@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02COMPMID-675 - Reworked NEGEMMLowp interface/functionGian Marco
The new interface makes NEGEMMLowp able to work with ASYMM8 data types. Implemented 2 new functions: - NEGEMMLowpMatrixMultiplyCore - NEGEMMLowpOutputStage These functions should make the integration in android NN doable For more information about GEMMLowp: https://github.com/google/gemmlowp/blob/master/doc/low-precision.md Change-Id: Ie2c775f45234f68ca53dba644b3a912b997fd890 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/95504 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
2018-11-02COMPMID-556: Add support to build arm64-v8.2-a for Android platform (clang ↵Ioan-Cristian Szabo
compiler) Change-Id: Ibb779dd3a8d10786da6d8f70590e654e14654d7b Reviewed-on: http://mpd-gerrit.cambridge.arm.com/95530 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
2018-11-02COMPMID-662: Integrated the new a64_s8_gemm_12x8 + dot product kernel into ACL.Pablo Tello
Change-Id: Id8f919e486a132fc58346c9f84fccbeeb83d19b3 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/94233 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02COMPMID-616 - Optimizing GEMMLowp on NEON intrinsicsGian Marco Iodice
Change-Id: Ibbeff5d37249b6e8fc34ad496035a1511c9da5a3 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/94072 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
2018-11-02COMPMID-647: Exclude padding pixels from averaging factor.Georgios Pinitas
Adds support for excluding the padding pixels from the average scaling factor calculation. Change-Id: Ia13fbfeae235aff564db74191613921848231a01 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/93715 Reviewed-by: Robert Hughes <robert.hughes@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02COMPMID-464 Implement Depthwise convolution 3x3 on NEONMichalis Spyrou
Change-Id: Ie4e1803a52afac6b6c597c6e551729dad2347cd1 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/92607 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
2018-11-02COMPMID-575: Port Magnitude to new validationJohn Richardson
Change-Id: I2600947bef30853d00adfa4b919dbcb860de9bfd Reviewed-on: http://mpd-gerrit.cambridge.arm.com/91717 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-556: Cherry-picked minor fixes from GithubAnthony Barbier
- Added --api 21 to documentation - Removed include of a runtime header in Core - cherry-picked 2 small fixes from Github commit 869d424d6fd5df7b15a858f2c5f853536f7a0aca Author: giorgio-arena <arena.cpp@gmail.com> Date: Mon Oct 23 16:58:59 2017 +0100 Update 00_introduction.dox commit f054c210e493111061a458b887f7c4edaca06a9f Author: Forrest Iandola <fiandola@gmail.com> Date: Mon Oct 16 00:44:24 2017 -0700 fix comment typo kernerl's --> kernel's Change-Id: I0a3893148a9565acbfd18d340da41845ce3ad44f Reviewed-on: http://mpd-gerrit.cambridge.arm.com/93460 Reviewed-by: Pablo Tello <pablo.tello@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02COMPMID-634: Enable clang with libc++ to compile for Android (32 and 64 bits)Ioan-Cristian Szabo
Change-Id: I693f64e70cd478e93675a8b04360128ded3b60d4 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/93015 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-481: Optimized and cleaned up NEGEMMInterleaveBlockedKernel.Pablo Tello
Change-Id: Ia0fd86341bed7bb9022891544fa449f828dfba75 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/91746 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-641: Fix NENonLinearFilter invalid read.Georgios Pinitas
NENonLinearFilter was reading out-of-bounds for 5x5 disk configuration. Changed to load within bounds and shift the vector appropriately later on. Change-Id: Ieb63312200af4c8989776d2b9188e0f3128e4853 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/92726 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02IVGCVSW-601: support for asymetric padding in cl conv and depthwise convJaroslaw Rzepecki
Change-Id: I5c6c95091ae77dba96459c0640f9f6167a988c8c Reviewed-on: http://mpd-gerrit.cambridge.arm.com/91700 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02COMPMID-569 - Implemented reference implementation and validation tests ↵Isabella Gottardi
(NEON and CL) for Median3x3 Change-Id: I7028f0bcc4a502261210f536ed604a7651ab6726 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/91332 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-481: Add gemmlowp_aarch64_v8p4 kernel.Pablo Tello
Change-Id: I15496b16ffd636f5bff76572e750df7e15c80830 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/90532 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02COMPMID-470: Neon Deconvolution.Pablo Tello
Implemented by up-sampling the input with zeros insertions between the input samples and convolving the Deconvolution kernels on the up-sampled result. The upsampling is performed by the function NEDeconvolutionLayerUpsample. Convolving is done by NEDirectConvolutionLayer. Change-Id: I25f7ba7c6b99cd9310797972ede40aeff4a54900 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/85319 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-554 Add NodesMichalis Spyrou
- BatchNormalization - L2Normalize - Floor Change-Id: I03e06dea30e956f56a86f9c5642cd609c6696ad2 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/91364 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-463 - Extended Pooling Layer on NEON to support Global PoolingGian Marco Iodice
Change-Id: I8ae44187624deeab3d40d878e7b34ff651f1dad0 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/89834 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02COMPMID-417 Fix bare metal build for armv7aMichalis Spyrou
Change-Id: I566a41061b75c3a1dad5374fcdc84372e6cfbe89 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/89670 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-417: Fix clang-tidy failures for 32-bit runsGeorgios Pinitas
Change-Id: I2fbb6dda1c281627a4d64dce3b4c4d2ebaa8d022 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/89289 Reviewed-by: Pablo Tello <pablo.tello@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-424 Implemented reference implementation and tests for WarpAffineIsabella Gottardi
Change-Id: I4924ab1de17adc3b880a5cc22f2497abbc8e221b Reviewed-on: http://mpd-gerrit.cambridge.arm.com/85820 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Steven Niu <steven.niu@arm.com>
2018-11-02COMPMID-481: Add AArch32 GEMMMoritz Pflanzer
Change-Id: Idba0b30bfb27866a46a22388014ab81432ea28dc Reviewed-on: http://mpd-gerrit.cambridge.arm.com/86196 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02COMPMID-544: NEDirectConvolutionKernel optimization.Pablo Tello
The optimization works on tensors with width <= 8 and height <= 8. The new code is 0.5 faster than the old one as it uses fewer instrunctions to compute the same result. Change-Id: I408d6c73ebd3d266bdaaf92fcb6bcdd58f239977 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/88642 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-481: Add AArch64 GEMMMoritz Pflanzer
Change-Id: I34f94f99cb05f0eabafee13c5e623ee779b72360 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/83741 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
2018-11-02COMPMID-515: L2 Pooling for FP32/FP16 in CL.Georgios Pinitas
Change-Id: I43641fa672f5905ca62edd1f63fc93e0cf7ea382 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/85963 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02COMPMID-417: Add support for floats in scale.Georgios Pinitas
Change-Id: I7d714ba13861509080a89817f54e9d32da83e970 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/86026 Reviewed-by: Pablo Tello <pablo.tello@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02COMPMID-417: Fixed broken scons script to build bare_metal and related ↵Pablo Tello
compiler errors. Change-Id: I5f2d6c8b199698a5c2622254696da7034cef1b50 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/87928 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
2018-11-02COMPMID-462: Implement TensorReshape for NEON and CL.Georgios Pinitas
Change-Id: I11b39c2ceca26ade73822e29a384ef866ae05729 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/87707 Reviewed-by: Pablo Tello <pablo.tello@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02COMPMID-532: NEON ConvolutionLayer Quantised valgrind errorsGeorgios Pinitas
When matrix B was one dimensional the AccessWindowTranpose did not add bottom padding leading to invalid accesses. Switches the matrix B access window to AccessWindowStatic and allows AccessWindowStatic to add padding to 1D tensors. Change-Id: Ic7fbd20e0c85575b98a506c4c22d2f9ecd8995a9 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/87757 Reviewed-by: Moritz Pflanzer <moritz.pflanzer@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02COMPMID-417: Fix NEDirectConvolutionLayerGeorgios Pinitas
Change-Id: I62a1fc7253a4f597d0d63b80310e0c84c3602b1a Reviewed-on: http://mpd-gerrit.cambridge.arm.com/87436 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Moritz Pflanzer <moritz.pflanzer@arm.com>
2018-11-02COMPMID-417 Fix ROIPoolingSiCong Li
* Fix ROIPooling in NEON, CL and Reference. Change-Id: Id5066625e5073e0bfebe69391f7941e993003296 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/87435 Reviewed-by: Moritz Pflanzer <moritz.pflanzer@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-439 - Refactored NEQuantizationLayer and NEQuantizationLayer in ↵Gian Marco Iodice
order to support 3D input tensors Change-Id: I03eac2108a30bed56d40dfd52e75577a35d492e0 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/85783 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-514 (3RDPARTY_UPDATE)(DATA_UPDATE) Add support to load .npy dataSiCong Li
* Add tensorflow_data_extractor script. * Incorporate 3rdparty npy reader libnpy. * Port AlexNet system test to validation_new. * Port LeNet5 system test to validation_new. * Update 3rdparty/ and data/ submodules. Change-Id: I156d060fe9185cd8db810b34bf524cbf5cb34f61 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/84914 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02COMPMID-481: Add thread info parameterMoritz Pflanzer
Change-Id: Iebb50a88d017445b6b37a86563ebd4abd86c5cf5 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/86788 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-519: Add support for Lower and Upper Bounded RELU for CL/NEONGeorgios Pinitas
Change-Id: I7b16216ac59c899a33942bf17757b54535256d7a Reviewed-on: http://mpd-gerrit.cambridge.arm.com/86172 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-518 - Bare metal supportMichalis Spyrou
Change-Id: Ida6d3dc46476fd9a67b5860e5e5bf8b848a8ac23 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/85981 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Moritz Pflanzer <moritz.pflanzer@arm.com>
2018-11-02COMPMID-424 Implemented reference implementation, new output valid region ↵Isabella Gottardi
and validation tests (NEON and CL) for Scale Change-Id: I056fa3588b807a97cacf0b8afaec56e37ffc92af Reviewed-on: http://mpd-gerrit.cambridge.arm.com/83872 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-358 Implement OpenCL ROI PoolingSiCong Li
* Implement OpenCL ROI Pooling * Add CLROIPoolingLayer benchmarks Change-Id: I8786d01d551850a1b4d599a48fabe3925e0a27d0 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/79833 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02COMPMID-417: Port PoolingLayer to new validation.Georgios Pinitas
Change-Id: I7f2f5f5f81ad9932661fc4c660bf90614288bc96 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/85270 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-439: Implement NEON Dequantization Layer.Michele Di Giorgio
Change-Id: I2f4f9d0d3437e9d8142f0f82b330233d31ffd552 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/80086 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-345: Added support for 5x5 kernels in NEDirectConvolutionPablo Tello
Change-Id: I25cd8f057566b59ce40e2acf14714e83a286ae4e Reviewed-on: http://mpd-gerrit.cambridge.arm.com/83791 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-11-02COMPMID-439: Implement NEON Quantization Layer.Michele Di Giorgio
Change-Id: Iefbb421915e56d880d6a3e20c113913560f6ca10 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/79934 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-345: Optimization for NEFillBorder kernel.Pablo Tello
It's about 0.8 faster than the old code for the special cases where left and top borders are both of size 1. This should improve a bit the performance of many kernels but specially in DirectConvolution where the kernel size is 3. Change-Id: I7d150cac4b1d9bf3bbf897ef6151e139fc34b39c Reviewed-on: http://mpd-gerrit.cambridge.arm.com/83403 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-417: Cleanup NEON FullyConnectedLayerMoritz Pflanzer
Change-Id: Ie02a0a1a28ca2771e29a5e6552242caf0f6db1cf Reviewed-on: http://mpd-gerrit.cambridge.arm.com/83555 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-417: Add in-place support for batch-normalization.Georgios Pinitas
Change-Id: I4b0c9348f3bc2addc198a76fadd1b583abf42b60 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/84434 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02COMPMID-424 NEON/CL Harris Corners validation tests.Giorgio Arena
Change-Id: I82d2a73f515a8d45d16b9ddb702fea51ae05c82e Reviewed-on: http://mpd-gerrit.cambridge.arm.com/79687 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Moritz Pflanzer <moritz.pflanzer@arm.com>
2018-11-02COMPMID-412: Port PoolingLayer to use fixed point 16.Michalis Spyrou
Change-Id: I2005de4c7c14526996309826d33a0ec8e732d2d5 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/78720 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Steven Niu <steven.niu@arm.com>