path: root/src
Age | Commit message | Author
2018-11-02 | Revert "COMPMID-582: Add validation to channel_extract kernels." | Anthony Barbier
This reverts commit 9a0875951d43dda035f32d2e0728cf59d80cb4d3. Change-Id: I6af0bc64c656f91cf1e0357f8760defa08f2e78d Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/121190 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | Revert "COMPMID-915: Create ResNet50 example" | Anthony Barbier
This reverts commit 2e8c7ee2ecebd9783c97bbd602a61989e1247d6b. Change-Id: Id90691f427a68d01480889f8d5fff190fd72c5a3 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/121176 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-939 Fix mismatches and finalize CLSoftmaxLayer optimization | Giorgio Arena
Change-Id: I4404f91a270e0ba7bbb7451c4c43a485fd4a3f6c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/121105 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-915: Create ResNet50 example | Alex Gilday
ResidualLayer node (COMPMID-916) also created as required for the ResNet architecture. Change-Id: I3aef0b6d6fd5bfcd4916fed4d8d4466b8a92b70d Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120562 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02 | COMPMID-909: Enabling in-place computation for batchnormalization and activation at graph level | Michele Di Giorgio
Change-Id: I84d4a212629b21794451ab5fb5c5b187b5e28f98 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120127 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-925: Enabling OpenCL tuner in the graph examples | Michele Di Giorgio
Change-Id: I4fe501281f527e20e8fdd0253d59ea2c4629056b Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120354 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02 | COMPMID-845: Create a ConvolutionLayer for CL | Isabella Gottardi
Change-Id: Ifcc406d2d0a99c911d6b6c875657b0e0028255d5 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/119148 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02 | COMPMID-934: Return an error in Validate when we don't support asymmetric padding | Anthony Barbier
Currently an assert gets fired in debug mode, and we just ignore the asymmetric padding in release mode. Change-Id: Ia6278b5722f7e93f356a975ab3243e6bb07e44a8 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120840 Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
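As a rough illustration of the COMPMID-934 change, the sketch below reports asymmetric padding as an error value in every build type instead of relying on an assert. The `Padding` and `Status` types and the `validate_padding` function are hypothetical names made up for this example; they are not the library's actual API.

```cpp
#include <string>

// Hypothetical types for illustration only; not the library's actual API.
struct Padding
{
    unsigned int left;
    unsigned int right;
    unsigned int top;
    unsigned int bottom;
};

struct Status
{
    bool        ok;
    std::string message;
};

// Returns an error status when the padding is asymmetric, so the caller
// sees the problem in release builds too (an assert would be compiled out).
inline Status validate_padding(const Padding &pad)
{
    if (pad.left != pad.right || pad.top != pad.bottom)
    {
        return Status{ false, "asymmetric padding is not supported" };
    }
    return Status{ true, "" };
}
```

A caller would check `validate_padding(pad).ok` before configuring the function and surface `message` to the user on failure.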
2018-11-02 | COMPMID-828 - Add support for pool widths 4, 5 & 6 and for non square data sizes - Part 2 (NEON) | Isabella Gottardi
Change-Id: I64bc8e3f71236edb71494f431ee34077eb8814ca Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118203 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-938: OCLgrind: Mismatches in depthwise convolution on Bifrost | Georgios Pinitas
Invalid conversions in oclgrind when clamp is used. Removed the call to clamp in the CL kernel and replaced it with convert_sat. Change-Id: I3cd9b87dc10c65d307fbf6eb0aec1b671fba6e97 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/121062 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-784: Productise Winograd. | Pablo Tello
a) Added support for kernel size 5. b) Templatised data type for transforms and batched gemms kernels. Change-Id: Idb83dda7a5eec19e015888ab31902bd791913297 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120540 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-582: Add validation to channel_extract kernels. | Ioan-Cristian Szabo
Change-Id: I6413a05f6870a0d04f12d7348269b15297ae8493 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/114696 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-882 - Optimizing GEMMLowp on OpenCL reshaping matrices | Gian Marco
This new optimization achieves 36.3% MAC utilisation on Mate 9 @ 1GHz. The performance has been reported here: https://confluence.arm.com/display/MLENG/GEMMLowp+performance%3A+ACL+18.02 Change-Id: I71b6a217068763dfdc11bbf3574ee0eb94f93679 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118531 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-905 Optimize CLSoftmaxLayer for QASYMM8 | Giorgio Arena
Change-Id: I3512d67b8a72b17db1381842ca42780e39cc511c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120605 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02 | COMPMID-765 Move direct convolution output stage to the right file | Giorgio Arena
Change-Id: Ifb4d27ba05aa618babb79b1f8e95fbfa689c5f3a Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120792 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-856: CL Depthwise Convolution QASYMM8 support | Georgios Pinitas
Change-Id: Ic6097e7cf160e8b829fb521b7b99d9a57d9799d3 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118774 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-906: Use fused activation in NEON Batch normalization | Georgios Pinitas
Change-Id: I5a6413548b2c9b8972c91ddba57395509dffd87e Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120656 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-765: Fix CPPPermute error when permuting the strides. | Georgios Pinitas
Change-Id: I4ea57579d997dd6a2e248634e3b7cb58bb3e2838 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120693 Reviewed-by: Pablo Tello <pablo.tello@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-765 : NEON Wrapper initial traits and overloads | Georgios Pinitas
Change-Id: Iea4c4732d19e8cf9b245ac2a9f75b2aa70a5839e Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118149 Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-765: Switch 1x1 DeconvolutionLayer to use the ConvolutionLayer | Georgios Pinitas
-Switches the 1x1 DeconvolutionLayer to use the ConvolutionLayer instead of the DirectConvolutionLayer. Change-Id: I3ffe152c42c3b1c7ea572f264cd3215df01aedc2 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120292 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-765: Sanitize permutation vector for Permute. | Georgios Pinitas
If the permutation vector has more dimensions than the tensor shape being permuted, infer dimensions of size one for the extra dimensions. Change-Id: I5addb292f770d925f47f756902e16073039e8f71 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120473 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Stefana Simion <stefana.simion@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
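The sanitizing rule described in this commit can be sketched as follows. This is a minimal illustration, not the library's code: `permute_shape` is a hypothetical helper, and the convention that output dimension `i` takes input dimension `perm[i]` is an assumption made for the example.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper for illustration; not the library's actual API.
// Assumed convention: output dimension i takes input dimension perm[i].
inline std::vector<std::size_t> permute_shape(std::vector<std::size_t>        shape,
                                              const std::vector<std::size_t> &perm)
{
    // Sanitize: when the permutation vector is longer than the shape,
    // treat the missing trailing dimensions as having size one.
    if (perm.size() > shape.size())
    {
        shape.resize(perm.size(), 1);
    }
    // A shorter permutation vector is rejected in this sketch.
    if (perm.size() != shape.size())
    {
        throw std::invalid_argument("permutation vector shorter than shape");
    }

    std::vector<std::size_t> out(perm.size());
    for (std::size_t i = 0; i < perm.size(); ++i)
    {
        if (perm[i] >= shape.size())
        {
            throw std::out_of_range("permutation index out of range");
        }
        out[i] = shape[perm[i]];
    }
    return out;
}
```

For instance, permuting the two-dimensional shape `{4, 3}` with the three-element vector `{2, 0, 1}` first pads the shape to `{4, 3, 1}` and then yields `{1, 4, 3}`.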
2018-11-02 | COMPMID-905 Asymm functions support for all vec sizes | Giorgio Arena
Change-Id: Ie0c5885a60771f728f80a8c4bdb7f1e4085fa3ee Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120267 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-903: Implements NEPermute for NHWC conversions | Georgios Pinitas
Change-Id: I4083e8d16bb23933634f229a1408dfd0e8f2922a Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120069 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-765: Fix CLDeconvolutionLayerUpsampleKernel access window. | Georgios Pinitas
Change-Id: I4893060ee2fe46db16aac6ee762c45dd30f35cc0 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120216 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-875: Deconvolution 4x4 not working | Georgios Pinitas
-Enforces the use of the ConvolutionLayer function in the DeconvolutionLayer. -Adds tests for 4x4 Deconvolution. -Alters the ConvolutionLayer validation to support even kernels. Change-Id: Id27e285f078e690b8dd58490dd8ea6d875b3cec6 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118632 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-884: Valgrind: NEDirectConvolutionLayerKernel invalid read | Georgios Pinitas
Change-Id: I258f03b61446e8333645efe80f2857e8c725b9de Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118943 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-828 - Add support for pool widths 4, 5 & 6 and for non square data sizes - Part 2 (CL) | Isabella Gottardi
Change-Id: I004906b9b1f11158fe17b4aa2640a7f4685fb929 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118462 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-765 - Extended GEMM benchmark | Gian Marco
Added new GEMM benchmarks to evaluate the performance when the input matrix B has to be reshaped only once. Change-Id: I1c4790213704ce57ea7b28f6f362c56edccd1eb9 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118910 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-897 Merge batch normalization with bounded relu | Giorgio Arena
Change-Id: I9a607fe620f795cdea1a99fdd3f5f8c2fc76f980 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/119234 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02 | COMPMID-892: OCLGrind failures on both validation and benchmark | Georgios Pinitas
-Adds quantization info to the ActivationLayer benchmark fixture -Replaces clamp with convert_sat in depthwise conv kernel -Fixes ROIPooling execution slice Change-Id: Ie9bbe08abcfb8278456964e476b0948247c7ecba Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118957 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
2018-11-02 | COMPMID-907 Optimizing FixedPoint calculation in the output stage of GEMMLowp | Giorgio Arena
Change-Id: Ic26fed30f9a54e6adef7861c05c9d55d23ca52ca Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/119913 Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-765: Fixes DepthwiseConvolution weights shape | Georgios Pinitas
Change-Id: Id13be9b33fc9b96e058db917e242136f7920fad8 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/119570 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
2018-11-02 | COMPMID-765: Fix inclusion error. | Georgios Pinitas
Change-Id: I9d8eaadc1fa32716c109e64c9a8793d9b6f8cc6e Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/119746 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-873: Integrate RSH NEON Depthwise Convolution routine | Georgios Pinitas
Change-Id: Ida1e9a836bc518bfe5563e16bf7f92bde5fc13f7 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118472 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
2018-11-02 | COMPMID-876: Integrate RSH native GEMM kernel. | Georgios Pinitas
Change-Id: Iaae87e155fa673bf099c2bc21a7be072c5c08fc1 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/119118 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
2018-11-02 | COMPMID-578: Implement FAST corners for CL/NEON | Abe Mbise
Change-Id: Ifa74e2bf05546de9a49aa185e22fba50438d8ad6 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/113946 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
2018-11-02 | COMPMID-888 Valgrind invalid read in NEGEMVAArch64Kernel | Michalis Spyrou
Change-Id: I470f244718571e32ac55062b2b62fd0f6996efc6 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118940 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02 | COMPMID-901 - Optimizing CLCol2ImKernel | Gian Marco
This patch makes col2im on OpenCL 2 times faster Change-Id: I8d90f5a72a050355ca1fd13433d8c2c26e5e33f5 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/119442 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02 | COMPMID-879: Investigate CL mismatches in Convolution S16 | Georgios Pinitas
Changes and simplifies the validation to divide the scale in integer format instead of double. Change-Id: Ib9156e9515e4e542391eeda11548f3d15613a0af Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/119256 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02 | COMPMID-828 - Add support for non square pool size - Part1 | Isabella Gottardi
Change-Id: Ib8100e7c659c49694c746fa3f36ce20f44f6929f Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117804 Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-895 - Optimizing CLDepthwiseConvolution3x3Kernel | Gian Marco
This patch brings the MAC utilisation up to 25% when both stride_x and stride_y are equal to 1. Performance is reported on the following confluence page: https://confluence.arm.com/display/MLENG/Depthwise+convolution+3x3+FP32+performance%3A+ACL+18.02 Change-Id: Ida1b64be9a88805902a3d90194559b58eb1224a3 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/119068 Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-894: Segfault: neon_cnn on S5 neo | Georgios Pinitas
Removed double management of the same tensor object. Change-Id: Ibc74cd8c7bd199cd473ff68f692840cbf01b27b3 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/119119 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Isabella Gottardi <isabella.gottardi@arm.com>
2018-11-02 | COMPMID-765 - Added LWS hint in CLIm2Col | Gian Marco
The LWS hint has been applied for optimized cases 1x1 and 3x3 Change-Id: I6b4bfe2f9f7da627052336889b8a18d279fe2675 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/119162 Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-784: Added support for PADDING = SAME in Winograd layer. | Pablo Tello
Change-Id: I5a420da6a8041f9ff6d0811815f2fc74c85c56a8 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/119014 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-891 - Use OpenCL timer in CLTuner | Gian Marco
Change-Id: I84a914c13b162c4f74321c9cafc30a18ad4ebbdb Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118797 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02 | COMPMID-855 - Optimizing im2col on OpenCL (DCHW) | Gian Marco
Introduced optimizations for 1x1, 3x3, 5x5 and 11x11 Change-Id: Ibb7f7a9fbec01a7684746ed8513634078126e452 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118107 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
2018-11-02 | COMPMID-754: Add validation method to CLPermute kernel | Michele Di Giorgio
Change-Id: If6f3888a035b557a6c369efa22b56d6c8d3efbd3 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118789 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
2018-11-02 | COMPMID-874: Improve default number of threads choice in the Scheduler | Georgios Pinitas
Change-Id: Ia30ec2afce0aafcd39f41440efb972b18bbda9f8 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118657 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
2018-11-02 | COMPMID-890: Valgrind: NEON Convolution Layer validation fails | Isabella Gottardi
Change-Id: I5296815cf04e5f805d6523196567b6c01715c8b5 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118711 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02 | COMPMID-871: Remove vst4q/vld4q from NEActivationLayer. | Georgios Pinitas
Change-Id: Iebd2a8fece1af87c93d6795e176d8c37ca64bbf6 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118187 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>