path: root/src/core
Age Commit message Author
2018-11-02COMPMID-582: Add validation to channel_extract kernels.Ioan-Cristian Szabo
Change-Id: I5022d02f06f9d849dad76e3d9b8e48632c236429 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/121191 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-936: Convolution failure in NEON Convolution Layer.Georgios Pinitas
Change-Id: I68a98eff57c8db719a501b68541666e8bc5f2081 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/121180 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-853 Use tile 2 for CL depthwise convolution QASYMM8Giorgio Arena
Change-Id: I91f6a0b057f5eb84c6ac7db5abbc05c7520ed5d2 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120760 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-784: Fixed SAME padding in WinogradLayerPablo Tello
There were mismatches when using kernel size 5 and padding = SAME. Change-Id: Id834e96ebcf665616f99c995b48e302dcff8dc48 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/121144 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02Revert "COMPMID-582: Add validation to channel_extract kernels."Anthony Barbier
This reverts commit 9a0875951d43dda035f32d2e0728cf59d80cb4d3. Change-Id: I6af0bc64c656f91cf1e0357f8760defa08f2e78d Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/121190 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-939 Fix mismatches and finalize CLSoftmaxLayer optimizationGiorgio Arena
Change-Id: I4404f91a270e0ba7bbb7451c4c43a485fd4a3f6c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/121105 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-909: Enabling in-place computation for batchnormalization and activation at graph levelMichele Di Giorgio
Change-Id: I84d4a212629b21794451ab5fb5c5b187b5e28f98 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120127 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-934: Return an error in Validate when we don't support asymmetric paddingAnthony Barbier
Currently an assert gets fired in debug mode, and we just ignore the asymmetric padding in release mode. Change-Id: Ia6278b5722f7e93f356a975ab3243e6bb07e44a8 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120840 Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-828 - Add support for pool widths 4, 5 & 6 and for non square data sizes - Part 2 (NEON)Isabella Gottardi
Change-Id: I64bc8e3f71236edb71494f431ee34077eb8814ca Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118203 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-938: OCLgrind: Mismatches in depthwise convolution on BifrostGeorgios Pinitas
Invalid conversions in oclgrind when clamp is used. Removed the call to clamp in the CL kernel and replaced it with convert_sat. Change-Id: I3cd9b87dc10c65d307fbf6eb0aec1b671fba6e97 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/121062 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-784: Productise Winograd.Pablo Tello
a) Added support for kernel size 5. b) Templatised data type for transforms and batched gemms kernels. Change-Id: Idb83dda7a5eec19e015888ab31902bd791913297 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120540 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-582: Add validation to channel_extract kernels.Ioan-Cristian Szabo
Change-Id: I6413a05f6870a0d04f12d7348269b15297ae8493 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/114696 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-882 - Optimizing GEMMLowp on OpenCL reshaping matricesGian Marco
This new optimization achieves 36.3% MAC utilisation on Mate 9 @ 1GHz. The performance has been reported here: https://confluence.arm.com/display/MLENG/GEMMLowp+performance%3A+ACL+18.02 Change-Id: I71b6a217068763dfdc11bbf3574ee0eb94f93679 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118531 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-905 Optimize CLSoftmaxLayer for QASYMM8Giorgio Arena
Change-Id: I3512d67b8a72b17db1381842ca42780e39cc511c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120605 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02COMPMID-765 Move direct convolution output stage to the right fileGiorgio Arena
Change-Id: Ifb4d27ba05aa618babb79b1f8e95fbfa689c5f3a Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120792 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-856: CL Depthwise Convolution QASYMM8 supportGeorgios Pinitas
Change-Id: Ic6097e7cf160e8b829fb521b7b99d9a57d9799d3 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118774 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-906: Use fused activation in NEON Batch normalizationGeorgios Pinitas
Change-Id: I5a6413548b2c9b8972c91ddba57395509dffd87e Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120656 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-765: Fix CPPPermute error when permuting the strides.Georgios Pinitas
Change-Id: I4ea57579d997dd6a2e248634e3b7cb58bb3e2838 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120693 Reviewed-by: Pablo Tello <pablo.tello@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-765 : NEON Wrapper initial traits and overloadsGeorgios Pinitas
Change-Id: Iea4c4732d19e8cf9b245ac2a9f75b2aa70a5839e Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118149 Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-765: Sanitize permutation vector for Permute.Georgios Pinitas
If the permutation vector is bigger than the tensor shape to permute, infer dimensions of size one for the extra dimensions. Change-Id: I5addb292f770d925f47f756902e16073039e8f71 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120473 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Stefana Simion <stefana.simion@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-905 Asymm functions support for all vec sizesGiorgio Arena
Change-Id: Ie0c5885a60771f728f80a8c4bdb7f1e4085fa3ee Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120267 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-903: Implements NEPermute for NHWC conversionsGeorgios Pinitas
Change-Id: I4083e8d16bb23933634f229a1408dfd0e8f2922a Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120069 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-765: Fix CLDeconvolutionLayerUpsampleKernel access window.Georgios Pinitas
Change-Id: I4893060ee2fe46db16aac6ee762c45dd30f35cc0 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120216 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-884: Valgrind: NEDirectConvolutionLayerKernel invalid readGeorgios Pinitas
Change-Id: I258f03b61446e8333645efe80f2857e8c725b9de Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118943 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-828 - Add support for pool widths 4, 5 & 6 and for non square data sizes - Part 2 (CL)Isabella Gottardi
Change-Id: I004906b9b1f11158fe17b4aa2640a7f4685fb929 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118462 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-897 Merge batch normalization with bounded reluGiorgio Arena
Change-Id: I9a607fe620f795cdea1a99fdd3f5f8c2fc76f980 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/119234 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-892: OCLGrind failures on both validation and benchmarkGeorgios Pinitas
- Adds quantization info to the ActivationLayer benchmark fixture
- Replaces clamp with convert_sat in depthwise conv kernel
- Fixes ROIPooling execution slice
Change-Id: Ie9bbe08abcfb8278456964e476b0948247c7ecba Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118957 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
2018-11-02COMPMID-907 Optimizing FixedPoint calculation in the output stage of GEMMLowpGiorgio Arena
Change-Id: Ic26fed30f9a54e6adef7861c05c9d55d23ca52ca Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/119913 Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-765: Fix inclusion error.Georgios Pinitas
Change-Id: I9d8eaadc1fa32716c109e64c9a8793d9b6f8cc6e Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/119746 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-873: Integrate RSH NEON Depthwise Convolution routineGeorgios Pinitas
Change-Id: Ida1e9a836bc518bfe5563e16bf7f92bde5fc13f7 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118472 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
2018-11-02COMPMID-876: Integrate RSH native GEMM kernel.Georgios Pinitas
Change-Id: Iaae87e155fa673bf099c2bc21a7be072c5c08fc1 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/119118 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
2018-11-02COMPMID-578: Implement FAST corners for CL/NEONAbe Mbise
Change-Id: Ifa74e2bf05546de9a49aa185e22fba50438d8ad6 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/113946 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
2018-11-02COMPMID-888 Valgrind invalid read in NEGEMVAArch64KernelMichalis Spyrou
Change-Id: I470f244718571e32ac55062b2b62fd0f6996efc6 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118940 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-901 - Optimizing CLCol2ImKernelGian Marco
This patch makes col2im on OpenCL 2 times faster. Change-Id: I8d90f5a72a050355ca1fd13433d8c2c26e5e33f5 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/119442 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-879: Investigate CL mismatches in Convolution S16Georgios Pinitas
Changes and simplifies the validation to divide the scale in integer format instead of double. Change-Id: Ib9156e9515e4e542391eeda11548f3d15613a0af Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/119256 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02COMPMID-828 - Add support for non square pool size - Part1Isabella Gottardi
Change-Id: Ib8100e7c659c49694c746fa3f36ce20f44f6929f Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117804 Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-895 - Optimizing CLDepthwiseConvolution3x3KernelGian Marco
This patch brings the MAC utilisation up to 25% when both stride_x and stride_y are equal to 1. Performance is reported on the following Confluence page: https://confluence.arm.com/display/MLENG/Depthwise+convolution+3x3+FP32+performance%3A+ACL+18.02 Change-Id: Ida1b64be9a88805902a3d90194559b58eb1224a3 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/119068 Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-765 - Added LWS hint in CLIm2ColGian Marco
The LWS hint has been applied for the optimized cases 1x1 and 3x3. Change-Id: I6b4bfe2f9f7da627052336889b8a18d279fe2675 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/119162 Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-891 - Use OpenCL timer in CLTunerGian Marco
Change-Id: I84a914c13b162c4f74321c9cafc30a18ad4ebbdb Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118797 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-855 - Optimizing im2col on OpenCL (DCHW)Gian Marco
Introduced optimizations for 1x1, 3x3, 5x5 and 11x11. Change-Id: Ibb7f7a9fbec01a7684746ed8513634078126e452 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118107 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
2018-11-02COMPMID-754: Add validation method to CLPermute kernelMichele Di Giorgio
Change-Id: If6f3888a035b557a6c369efa22b56d6c8d3efbd3 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118789 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
2018-11-02COMPMID-871: Remove vst4q/vld4q from NEActivationLayer.Georgios Pinitas
Change-Id: Iebd2a8fece1af87c93d6795e176d8c37ca64bbf6 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118187 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-787: Add CL support for broadcast multiplyMichele Di Giorgio
Change-Id: I71f67789648ef05ccdedce77c7427bc0127b3a69 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116741 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-765: Fix direct convolution output stage.Georgios Pinitas
Change-Id: Ie4ac7f61675c1fb9b1748d6784fccb26f058832a Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118635 Reviewed-by: Robert Hughes <robert.hughes@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-765: Fixed clangtidy warnings.Pablo Tello
Change-Id: I83d0f2bc8e0ebfdc0b60931f2c5acf0469caf886 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118696 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-784: Winograd transforms refactoringPablo Tello
1) Removed the example files winograd_layer.hpp/cpp. 2) Templatized Winograd transform kernels. Change-Id: I7045fa0b801b9d30a11275914aaa2dafd254aed2 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118332 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02APPBROWSER-400: Implement the tensorshift kernel for fixing DC's alignment issue on OpenGL ESXinghang Zhou
Change-Id: I7a8489bb0fddc72899ea165e414ee87bdbfb45b3 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118106 Reviewed-by: Joel Liang <joel.liang@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02APPBROWSER-394: GLES fails to compile Dropout kernel on S5steli01
Change-Id: Ie480332e6e302edd406627e90be0d7df3e61dde5 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118303 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02IVGCVSW-863 Broadcast support in CL/NEON Arithmetic AddDiego Lopez Recas
Also, added instrumentation to support generic tensor broadcasting for NEON and CL backends. Change-Id: I1bc5747a286e1a4b464c209067581e103d473b9a Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/114201 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-866: Integrate SGEMV Neon Assembly from RSHMichele Di Giorgio
Change-Id: Icbb43de7642e2b433d7471d70b9dbbde850989d3 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118197 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>