aboutsummaryrefslogtreecommitdiff
path: root/src/runtime/NEON
AgeCommit message (Collapse)Author
2018-11-02COMPMID-1451: Fix allocation of weights in DeconvolutionMichele Di Giorgio
Change-Id: If3ca0b034a7448df1e5349b51a2b124f1b4e99c1 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/153956 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: bsgcomp <bsgcomp@arm.com>
2018-11-02COMPMID-1586: Add support for NHWC CLDeconvolutionLayerMichele Di Giorgio
COMPMID-1651: Fix QASYMM8 CLDeconvolutionLayer This patch also extends the range of values used for testing Convolution and Deconvolution to cover quantized [-1.0f, 1.0f]. Change-Id: I8b280669db67bb3ec25bf5d411c8f5954f5b0dab Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/149869 Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Tested-by: bsgcomp <bsgcomp@arm.com>
2018-11-02COMPMID-1621 Deconvolution wrong output calculationMichalis Spyrou
Change-Id: Ida71312bcf6dbd854f2ab1efc65f74910c79e152 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/151510 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
2018-11-02COMPMID-1519: Add support for 3D input/output in CLGEMMLowpOutputStageGeorgios Pinitas
Change-Id: I637add70310d2da4d82b236a6352af9d33be17a1 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/149706 Reviewed-by: Isabella Gottardi <isabella.gottardi@arm.com> Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Tested-by: bsgcomp <bsgcomp@arm.com>
2018-11-02COMPMID-1596 Create UpsampleLayer for NEONMichalis Spyrou
Change-Id: I82d95c4f1c5fed13b213a2591cc2b4e0d0e02a54 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/149676 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com> Tested-by: bsgcomp <bsgcomp@arm.com>
2018-11-02COMPMID-1590: Memory corruption in arm_compute_validation.Pablo Tello
RSH code only support padding SAME|VALID, this means we cannot call it with padx=1 for kernel size 5x5. The supporting padding values are 2 and 0. Fixed the problem by modifying the test shapes and added some asserts in NEWinogradConvolutionLayer. Change-Id: I4b73fa9d13c2200a47002965dc3b471d0f2cafba Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/149883 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-1540 Implement YOLOLayer on NEONMichalis Spyrou
Change-Id: Ice28996959dc666fff5e8ae486c1ff8093db083f Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/148367 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: bsgcomp <bsgcomp@arm.com>
2018-11-02COMPMID-1446 : Add support for 3D output in NEGEMMLowpOutputStageGeorgios Pinitas
Change-Id: I61e7d39d09a9936b1128ec04038fa2d8dfe6a2c8 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/149211 Reviewed-by: Isabella Gottardi <isabella.gottardi@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: bsgcomp <bsgcomp@arm.com>
2018-11-02COMPMID-1564: Add NEDepthwiseConvolution3x3 for QASYMM8Georgios Pinitas
Change-Id: I1f55508af6f220e5f41df7b56daffb4761ed0591 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/148253 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Isabella Gottardi <isabella.gottardi@arm.com>
2018-11-02COMPMID-1532: Add DepthwiseConvolution3x3 FP16 on NEONGeorgios Pinitas
Change-Id: I780970f317b979b3230e2b471ac01df7fda9ee14 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/148168 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1563: Fix name of NEGEMMInterleavedWrapperAnthony Barbier
Change-Id: I5f868091cae7bd86eeeb7216d44f32c190c5a604 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/147804 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-1566: Add broadcast to CLArithmeticSubtractionGeorgios Pinitas
Change-Id: I05d21f9a92013ecfd1128d12cf1561cfd6e5c5e9 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/147983 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1563: Added a tag to ISCheduler::run_workloads to identify workloadsAnthony Barbier
Change-Id: Ieac59e3ccf47feab8f88c65200eb8a81b2eb4196 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/147728 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-1552: support kernels sizes 1x7, 7x1, 1x5, 5x1 in NEWinogradPablo Tello
Refactored the validate method to make it easier to maintain in the future when adding support for new kernels sizes Change-Id: I12d9fe7af15ceb0e655cef61ca94407558fb29e8 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/146713 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1556 - Add ReorgLayer to graph APIGian Marco Iodice
Change-Id: I50c13b5808f3cceec36b92e7afc027f47ebbdea4 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/147369 Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02[COMPMID-386] Github: Support SoftmaxLayer on different number of dimensions?giuros01
Change-Id: I7422b977538ff29930a90f078badc2edee78af93 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/146638 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02[COMPMID-1482] Auto-initialize the output tensor of im2col and col2im in ↵Giuseppe Rossini
NEGEMMConvolutionLayer Change-Id: I56df03e5b491f0ea96f8f123229b116b35dd443f Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/146770 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-1548: NEON FP16 mismatches on CannyEdge and HarrisCorners.Georgios Pinitas
Removes FP16 from HarrisCorners and CannyEdge. Change-Id: I5e4f9205fdbe4de85f04f55ecf1568c837e56cc0 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/146247 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
2018-11-02COMPMID-1528: Add ReorgLayer on NEONGeorgios Pinitas
Change-Id: I44369b4a716767163e2233b7d87bff300c523383 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/146314 Reviewed-by: Pablo Tello <pablo.tello@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1514: Add validate to NEFloor and CLFloorGeorgios Pinitas
COMPMID-1515: Add FP16 support to NEFloor and CLFloor Change-Id: Ib63a62c7681056ee13be99ce081b4d3949da4217 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/146547 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
2018-11-02COMPMID-1247:Integrate kernel size 1x3 & 3x1 support in NEWinogradLayer.Pablo Tello
Change-Id: I6fe198881230e49864c841a3b2366ccf2a9247f9 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/145210 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-1469: Add validate in NEGEMMMatrixAdditionKernelGeorgios Pinitas
Change-Id: I228e2503eb40c12869fbd7e834ac1309aa613480 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/145878 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1304: NEDepthConvert : Add support for FP32 -> FP16 and FP16 -> FP32 ↵Michele Di Giorgio
+ validate() function Change-Id: I12e4696a454744f6d493ab3a53520d3acf3a1a26 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/145719 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02[COMPMID-1301] Add validate() method to NEReshapeLayerGiuseppe Rossini
Change-Id: Idc3b15f2421858bbf726cd9da82487ff2e1f2910 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/145335 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1246 Fix NEON mobilenet NCHWGiorgio Arena
Change-Id: I1dd6df9bd4a96cb7cbacce939a89c3a7ccee71c8 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/145397 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1534 - Fix GEMM and Magnitude test for FP16Gian Marco Iodice
On GEMM we had accuracy issue On Magnitude we have disabled the fp16 acceleration since we do not have feature parity with CL and this function is not used for ML Change-Id: Iaebe3bbbd2a9f45db0c714aa5ebaf48eb0b65741 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/145467 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1534: Fix LSTM/RNN Layers for NEON and FP16Georgios Pinitas
Switches default activation layer to the respective datasets to RELU from LOGISTIC Change-Id: I09f1ad09922ccdd6e1dc33c28a594f7ffbfe40f4 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/145436 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1047 Extract Flatten function from Im2Col for NEONGiorgio Arena
Change-Id: I80f3aaadc8cae8c9ca1a5a239e79bda302b89bd8 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/144813 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02COMPMID-1188: Set all arguments to const in ↵Georgios Pinitas
NEDepthwiseConvolutionLayer::validate() Change-Id: If922d5ea118910f651f986ff40f0c0a2b8bfc459 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/144614 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1060 LSTM FP32 NEONMichalis Spyrou
Change-Id: I0bdf874e61917903c26f713ec41a7ffc29e07233 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143892 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-1480 Add support for NHWC QASYMM8/FP32(non-optimized) to NEON ↵Giorgio Arena
DepthwiseConvolution Change-Id: I751f5d3fb74085d2e67f610ecf52da4736d0cfb5 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143870 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1366 Implement NECopyMichalis Spyrou
Change-Id: I183e4b7081bf12de3546293a00da68b4f4a0dd5e Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143987 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1498 - Enable grouping in CLGEMMConvolutionLayerGian Marco Iodice
Change-Id: I15c7df21773145b03f42b6f78bd7ad2e5b8a5219 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/144126 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-1246 Remove unused window iterator from NERNNLayer.Michalis Spyrou
Change-Id: Ia1ab755f85adb602c115f20e384fb459d3f91927 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143894 Reviewed-by: Pablo Tello <pablo.tello@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1485 - Add support for NHWC when running NEGEMMConvolutionLayer with ↵Gian Marco Iodice
FP16/QASYMM8 When the GEMM3D check fails, now we fallback to the classic implementation with im2col and col2im. In this manner the function can work with QASYMM8 and FP16 Change-Id: I359e9da3a63956f33b5acbc9bca4383b14af10e2 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143372 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1342 Add grouping support to CLIm2ColKernelGiorgio Arena
Change-Id: I4afb19751520a90fee27fb49b775cd10e92a94f5 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/140476 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02MLCE-13: Sanitizing matrix argument in the Warp.Pablo Tello
This changes help to prevent errors like passing a matrix with less elements than required into the warp functions. Change-Id: I863f933a5e0568258717cffed3a20788d3d03083 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143044 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1188 - Removed the multiplication by 4 in NEGEMMInterleavedWrapperGian Marco Iodice
Change-Id: Iaf8519bc483b947876a9b6ba83b4eb43b45b83a1 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143135 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1277 - Optimizing CLIm2ColKernel for NHWC.Gian Marco Iodice
This patch includes: - Im2Col optimizations for NHWC using a new data layout - Refactoring of CLIm2ColKernel adding validation method and auto-init - Removed im2col_reduced from CLIm2ColKernel and created a new kernel CLFlattenLayerKernel Change-Id: I1620640b6796baa268324b33ae92cdd8de53e27c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/141241 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
2018-11-02COMPMID-1248 Enabled memory manager in NEWinogradConvolutionLayerAnthony Barbier
Change-Id: I7bbab53f18a42f0879d80122a52bb6bdca4b8631 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142413 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02COMPMID-1406: Refactor gemm_interleaved to use our own types and schedulerAnthony Barbier
- Ported PrepareB kernel from gemm_interleave - Ported TransformA feature from gemm_interleave - Allocate reshaped a and b buffers - Added memory_manager / memory_group - MatrixMultiply kernel - Interleave kernels execution. - Fixed a few bugs: all nightly Convolution tests passing for threads=1 and threads=4 - Added Doxygen documentations and comments in the code - Added support for all data types supported Change-Id: Iffa1c09fda0bb9c61213bb83524d5a48e7ecb03c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/141281 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-872 - Rework NEGEMMConvolutionLayer to use NEGEMMGian Marco Iodice
Change-Id: I55f0018ac7214775ebbca63f58a3bf5c93732fec Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142632 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02MLCE-36: FC tranpose weightsGeorgios Pinitas
Change-Id: I3b8a6c00e61ba6da459ca5fc7275393f9d073aed Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142533 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1188 - Fixed CLGEMMConvolutionLayer/NEGEMMConvolutionLayer for NHWCGian Marco Iodice
We skipped im2col also without unit strides Change-Id: I04c63a6dda8553b3890e832a56ff6854349c829a Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142520 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1445: Fixed NEON GEMM assembly dispatch for QASYMM8Anthony Barbier
Change-Id: I2d20cd3c5f83a9ba4e0de6659b255337877d5bbc Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142252 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-1440: Access original B in gemm assembly when not pretransposed.Georgios Pinitas
Change-Id: I5f2c198f7ac4d8996180e204e763ab53f5e7ea3d Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142153 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Matteo Martincigh <matteo.martincigh@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1434: Fix NEWinograd for NHWC and sub-tensorsGeorgios Pinitas
Apply offsets and strides to winograd transform functions in NEON. Change-Id: Ia4f44d22244203a5f9d93d2fed73570396b0d28c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/141803 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1386: Add FC convert weights on NEONGeorgios Pinitas
Change-Id: I7a3c6db9285e3899494f496b2562d80cec1b6521 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/141407 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1401 Implement NEFullyConnectedLayer for QASYMM8Giorgio Arena
Change-Id: I0404df6d369855e2f458f2db8f26e81c80a1ee87 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/140148 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1419: Make NEGEMMAssemblyDispatch dynamically typed instead of templatedAnthony Barbier
This makes it easier to integrate in GEMMLowpMatrixMultiplyCore Change-Id: Ibf80803f016a2e6a24d943ffafb50b48f04ec545 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/140868 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>