aboutsummaryrefslogtreecommitdiff
path: root/src/core/NEON/kernels
AgeCommit message (Collapse)Author
2018-11-02COMPMID-1342 Add grouping support to CLIm2ColKernelGiorgio Arena
Change-Id: I4afb19751520a90fee27fb49b775cd10e92a94f5 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/140476 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02MLCE-13: Sanitizing matrix argument in the Warp.Pablo Tello
This changes help to prevent errors like passing a matrix with less elements than required into the warp functions. Change-Id: I863f933a5e0568258717cffed3a20788d3d03083 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143044 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1277 - Optimizing CLIm2ColKernel for NHWC.Gian Marco Iodice
This patch includes: - Im2Col optimizations for NHWC using a new data layout - Refactoring of CLIm2ColKernel adding validation method and auto-init - Removed im2col_reduced from CLIm2ColKernel and created a new kernel CLFlattenLayerKernel Change-Id: I1620640b6796baa268324b33ae92cdd8de53e27c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/141241 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
2018-11-02COMPMID-1406: Refactor gemm_interleaved to use our own types and schedulerAnthony Barbier
- Ported PrepareB kernel from gemm_interleave - Ported TransformA feature from gemm_interleave - Allocate reshaped a and b buffers - Added memory_manager / memory_group - MatrixMultiply kernel - Interleave kernels execution. - Fixed a few bugs: all nightly Convolution tests passing for threads=1 and threads=4 - Added Doxygen documentations and comments in the code - Added support for all data types supported Change-Id: Iffa1c09fda0bb9c61213bb83524d5a48e7ecb03c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/141281 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-872 - Rework NEGEMMConvolutionLayer to use NEGEMMGian Marco Iodice
Change-Id: I55f0018ac7214775ebbca63f58a3bf5c93732fec Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142632 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1434: Fix NEWinograd for NHWC and sub-tensorsGeorgios Pinitas
Apply offsets and strides to winograd transform functions in NEON. Change-Id: Ia4f44d22244203a5f9d93d2fed73570396b0d28c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/141803 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1429: NEGEMMNativeWrapperKernel fails when ran for WinogradGeorgios Pinitas
Change-Id: Ief9b717fe2bcf626660109ec491f8882d0ef06d7 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/141658 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1367: Enable NHWC in graph examplesGeorgios Pinitas
Change-Id: Iabc54a3a1bdcd46a9a921cda39c7c85fef672b72 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/141449 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1188: Remove FP16 capabilities check form NEReshapeLayer.Georgios Pinitas
Change-Id: I2dde22f70b5aa27be983cf6b6ee1d1926653aa99 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/141510 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1401 Implement NEFullyConnectedLayer for QASYMM8Giorgio Arena
Change-Id: I0404df6d369855e2f458f2db8f26e81c80a1ee87 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/140148 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1419: Make NEGEMMAssemblyDispatch dynamically typed instead of templatedAnthony Barbier
This makes it easier to integrate in GEMMLowpMatrixMultiplyCore Change-Id: Ibf80803f016a2e6a24d943ffafb50b48f04ec545 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/140868 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1405: Create our own gemm_native kernel / function.Anthony Barbier
Change-Id: Ie0a80bd6b4eb5632cac63ccf54bcb07d4309da19 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/140305 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-1386: Add support for converting weights for CL.Georgios Pinitas
Change-Id: I62e3ead903366baeeb1488f233a9b8b0c388c9de Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/140403 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1364: Add support for NHWC in NEDepthConcatenateLayerGeorgios Pinitas
Change-Id: I4f8e46d1c79afa9284f2c6dc00383c453a8e7bd5 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/140165 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1357: Stop passing around raw pointers in NEWinogradConvolutionAnthony Barbier
First step to allow us to enable the memory manager in this function Change-Id: Ic42fdac4c74cd21973c71130b59883e4a87d3dca Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/140167 Reviewed-by: Pablo Tello <pablo.tello@arm.com> Reviewed-by: Vidhya Sudhan Loganathan <vidhyasudhan.loganathan@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1409: Disable BN + Act fusion in the graph if activation is not ↵Georgios Pinitas
supported Change-Id: Ifba641d498638ac48bbc9624338a2168a56fe5c6 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/140075 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1271 Avoid memory leak in list of gemm methodsAnthony Barbier
Change-Id: I80764d09bf5fb87b3a98bc0e1803d25c6c682c1f Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/139859 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1167: Validation for NEDepthwiseConvolutionLayerAbe Mbise
Change-Id: I9689e1a0627dc015dd2ce98417e4c97bb55581bb Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/131327 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1271: New system for GEMM heuristicsDavid Mansell
This patch implements a system for separating the "validity" from "preferred" aspect of the current heuristics in gemm_*.cpp. Now, each gemm_*.cpp defines a list of candidate implementations, each of which supplies an is_valid() function (to check for validity), an is_preferred() function (the "heuristic" part), and an instantiate() function which actually produces the GemmCommon object pointer. The actual gemm() function is now templated and uses this list to select an implementation. This patch also implements a mechanism to identify the preferred implementation, and override it via the GemmConfig structure. Change-Id: Id49ab7af8bf2e3e9fd951a9698883ade234d40e1 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/139120 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1380: Pre-work for SVE support.David Mansell
This patch makes the needed infrastructure changes to allow SVE kernels to be added later on. Change-Id: Ide5bccac2f47278e93fff3d648231aee2d5f8c2e Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/139070 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-970 : Remove QS8 / QS16 supportVidhya Sudhan Loganathan
Removed QS32 references Change-Id: Ic7df02c08ae7aa1b7dcae15bdda113321af851b8 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/138703 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1374: Fix constraints on AArch32 SGEMM.David Mansell
The "cc" constraint was missing on the a53/a55r1 versions of this kernel. Added "memory" to these (and the generic kernel) as well for safety. Change-Id: I4df1b2fde43c20550ba7a51436b326f5e9e9871f Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/138812 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1369: Revert accidental formatting of RSH's repoAnthony Barbier
Pulled latest fixes from David's repo: commit f43ebe932c84083332b0b1a0348241b69dda63a7 Author: David Mansell <David.Mansell@arm.com> Date: Tue Jul 3 18:09:01 2018 +0100 Whitespace tidying, fixed comment in gemv_batched imported from ACL. Change-Id: Ie37a623f44e90d88072236cb853ac55ac82d5f51 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/138530 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-by: David Mansell <david.mansell@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-970 : Remove QS8 / QS16 supportVidhya Sudhan Loganathan
Removed fixed point related code. Change-Id: I487acf138dace3b0450e0d72ca7071eaec254566 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/137678 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1363: Handle null bias pointer in DirectConvolutionOutputStageKernelAnthony Barbier
Change-Id: I412db69f31811fe4a7d262a0f146ef545731fc02 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/138500 Reviewed-by: Aron Virginas-Tar <aron.virginas-tar@arm.com> Reviewed-by: Vidhya Sudhan Loganathan <vidhyasudhan.loganathan@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1287: Extending NEWinogradLayer test suitePablo Tello
Added NHWC to the dataset to the validation tests Fixed a problem in the output transform which made the Activation to fail because way/ordering the output transform wrote the data to the output tensor. Change-Id: I9609f86605dbfef70b47a0fb043287bf0e5d675b Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/136015 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-1162: Enable NHWC data layout support for NEWinogradConvolutionLayer ↵Pablo Tello
- part1 In this first part we reworked the configuration of the kernels as before we passed the raw pointer to the buffer within the configuration of the function Change-Id: I83d3cb64c562303093c7f0ae52395ecd080a5d52 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/133560 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02COMPMID-1256: Memory corruption in NEGEMMGeorgios Pinitas
Change-Id: I762a3c9add2e26b850f388a78a16861abb2bf0f9 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/134553 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-568: Implement Canny edge function for CL/NEONAbe Mbise
Change-Id: Ic5f197463f962bac4b23663bcef7ac744be6fc2a Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/114250 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-566: Implement reference and CL/NEON validation for ColorConvert ↵Sanghoon Lee
(part 1) - Image to MultiImage will be in part 2 Change-Id: Id2f22c39fb41a78a360d20d2c3bdecd57cdfd152 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/128321 Reviewed-by: Pablo Tello <pablo.tello@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1177: Improved native GEMM.David Mansell
Improve the native GEMM so it can cope with any value for M. Also change the selection code so that the native GEMM is selected if M is small and nmulti is large - Winograd needs GEMMs like this and they don't thread properly with the blocked GEMM. (also rename gemm_batched.hpp back to gemv_batched.hpp) Change-Id: I736c33373ada562cbc0c00540520a58103faa9d5 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/131739 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1163: NEON Scale NHWC failuresGeorgios Pinitas
Change-Id: Ice620385ce787b568b38fcbdddc94ef385396141 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/131355 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-814: Add validate method to scale.Georgios Pinitas
Change-Id: I5004c79ac7b10f988f25e14847f1ea2be01629da Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/131143 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1166: Fixed alpha/beta mixup for some merges.David Mansell
The default templated merge, and the specialised S8 12x8 merge, were using alpha and beta the wrong way round. Fixed. Change-Id: Ie559b665edf1eb012e8cb54ea0bca31612bcc072 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/131309 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1102 : Enable the use of 4x4 tile sizes in neon implementation of ↵Vidhya Sudhan Loganathan
winograd conv. Change-Id: Ibd2f2c6680b647a066255ea77d4a2a172ef76aa3 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/130418 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-948: Add validation to NEL2NormalizeLayerJohn Richardson
Change-Id: I0cfea24884066412c2f13d9acdb72ddbccac7545 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/130407 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1139: Fix SIGSEGV in arm_compute_benchmarkGeorgios Pinitas
Invalid boundary checks. Change-Id: I9c346dc15d2080f710b283364e83831fa3466941 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/131029 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-814: NEScale NHWC supportGeorgios Pinitas
Change-Id: Ibf5c624a5c5482faa42eb02bc8abe9ae0d65b0d1 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/130608 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1050 CL/NEON: Create a function to convert the 2D weights of FC ↵Giorgio Arena
layer from NHWC to NCHW and viceversa Change-Id: If77ffeb92b6eb883e5d2d2c97c2c4d1d23d17c8d Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/129257 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-810 Add NHWC data format support for NEON convolutionMichalis Spyrou
Change-Id: I2a7b49a12da7f3bc3f04749243b1dc111160de6e Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/129348 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1112: Enabled multithreading transforms in Winograd.Pablo Tello
Updated RSH code as well. Change-Id: I9452ff5c7f0ff0cd60b8c223cdd71077288eb0c1 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/130177 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-597: Port HOGMultiDetection to new frameworkJohn Richardson
Change-Id: I4b31b7f052a06bea4154d04c9926a0e076e28d73 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/126555 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: John Richardson <john.richardson@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1134: Fixing the Window reconstruction in gemm_nativeAnthony Barbier
Change-Id: Ibab955dbade4805c39cab003362be2bb3c74b166 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/130605 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
2018-11-02COMPMID-1097: Port mobilenet to NHWCGeorgios Pinitas
Change-Id: I789065bfa0d4ef133388e1904c5caf31e450f80f Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/129495 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-808 Add NHWC data format support for NEON direct convolutionGiorgio Arena
Change-Id: I5d4cc3d5b0d25f3fe4ed998c0f15b1b8e260a43a Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/125697 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1105: Fix mismatches conv layer when using NativeGemm with multiple ↵Pablo Tello
threads Change-Id: Id5ba16a7e3382070fda936c63d174df53596da04 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/129964 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-926 Add depth multiplier support to NEON/CL/GLES depthwise convolutionGiorgio Arena
Change-Id: I03f32c62350e5ea43e77bb15fc5a832d83719e3b Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/126657 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-1010: Remove RSH profiler headerGeorgios Pinitas
Change-Id: I2967ec94c3bead0b92ff1d1581ff6afea21c7f04 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/129405 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
2018-11-02COMPMID-806 Add NHWC data format support format for NEON poolingMichalis Spyrou
Change-Id: I7ab174c72f3d56134fcec259a137739061fd12e9 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/123065 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1074: Rename WinograLayer.cpp to WinogradConvolutionLayer.cppGeorgios Pinitas
Change-Id: Iccac7cd6cb458469568d0cd6fb36b262353f4188 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/129261 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>