aboutsummaryrefslogtreecommitdiff
path: root/src/runtime
AgeCommit message (Collapse)Author
2018-11-02COMPMID-978: Fixed bug in Load/Store tuning data and in the CLTuner interceptorAnthony Barbier
Change-Id: I6cdecef623ff5806cf82cb12a60aef8aefec32f5 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/122712 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-978 Load/Store tuning data from file (Part2)Anthony Barbier
Change-Id: I1819f42c0e456673543b267d51f730b6e80a0ad9 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/122629 Reviewed-by: Robert Hughes <robert.hughes@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-540 Replace NEDeconvolutionLayerUpsampleKernel with NEScaleKernelMichalis Spyrou
Change-Id: Ic29557cca24447ef40fa2cfca84f208b4d43f8de Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/122180 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
2018-11-02COMPMID-978 Load/Store tuning data from fileAnthony Barbier
Change-Id: I1d1f402df3a58704c021b9866d489844fb5e7d7a Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/122395 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-941 Add NEON broadcast multiply supportMichalis Spyrou
Change-Id: I1f808c25750461bec9a28b2f6615fbd0f624117a Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/122262 Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-959: Update valid region in DepthConcatenateGeorgios Pinitas
Change-Id: I8aaf15a64aab592bfbdb386fdb07631cad933fa6 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/122307 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-765 - Fix performance issues on OpenCLGian Marco
The problem was related to the reshape of the weights. The reshaping happened for each run Change-Id: Ie7d02fa6bb08df34e44213303e9eb0700ff77160 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/121877 Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-765: Fixed number of threads hint for set_num_threads(0)Anthony Barbier
Change-Id: I8a71a68b597ecba03581aa79e8fd481874d7e180 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/121796 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-617: Add validate support for NEON FullyConnectedLayerIoan-Cristian Szabo
Change-Id: I08987022c8d4cc335c00b8af27bd3edb8fe64d3b Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/111596 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Alexander Gilday <alexander.gilday@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-754: Add validation to kernels.Georgios Pinitas
Adds validation method to: - CLConvolutionLayer Change-Id: I95516e20cfb71c1e603c60fc6491ac695883a856 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117355 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-754: Add CLPermute validation methodIsabella Gottardi
Change-Id: I77ed920a43738effd55b086e3138f497057a72c5 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/121618 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-927: Adding support for FP16 in CLDepthwiseConvolutionLayer3x3Michele Di Giorgio
Change-Id: Ie5f299c7a7fbe3062cee22bb2b4ae5df818fe490 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/121178 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02APPBROWSER-391: Fix GLES COMPUTE alignment issuesFrank Lei
APPBROWSER-402: Performance optimization for squeezenet/xray model Change-Id: If31b186b99a6d6087164019fe94d3ac9279e3204 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/119526 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-582: Add validation to channel_extract kernels.Ioan-Cristian Szabo
Change-Id: I5022d02f06f9d849dad76e3d9b8e48632c236429 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/121191 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-846: Create a ConvolutionLayer for NEONIsabella Gottardi
Change-Id: I98bbef40bfac5b05134be4ef9fb54d14c0c9e8e8 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118806 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-765 - Fix get_convolution_method in order to return the correct method.Isabella Gottardi
Change-Id: Ia4be053b9f5399fe7e241cebb4292890e957ae54 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/121141 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-936: Convolution failure in NEON Convolution Layer.Georgios Pinitas
Change-Id: I68a98eff57c8db719a501b68541666e8bc5f2081 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/121180 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02Revert "COMPMID-582: Add validation to channel_extract kernels."Anthony Barbier
This reverts commit 9a0875951d43dda035f32d2e0728cf59d80cb4d3. Change-Id: I6af0bc64c656f91cf1e0357f8760defa08f2e78d Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/121190 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-909: Enabling in-place computation for batchnormalization and ↵Michele Di Giorgio
activation at graph level Change-Id: I84d4a212629b21794451ab5fb5c5b187b5e28f98 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120127 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-845: Create a ConvolutionLayer for CLIsabella Gottardi
Change-Id: Ifcc406d2d0a99c911d6b6c875657b0e0028255d5 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/119148 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-934: Return an error in Validate when we don't support asymmetric ↵Anthony Barbier
padding Currently an assert gets fired in debug mode, and we just ignore the asymmetric padding in release mode. Change-Id: Ia6278b5722f7e93f356a975ab3243e6bb07e44a8 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120840 Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-784: Productise Winograd.Pablo Tello
a) Added support for kernel size 5. b) Templatised data type for transforms and batched gemms kernels. Change-Id: Idb83dda7a5eec19e015888ab31902bd791913297 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120540 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-582: Add validation to channel_extract kernels.Ioan-Cristian Szabo
Change-Id: I6413a05f6870a0d04f12d7348269b15297ae8493 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/114696 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-882 - Optimizing GEMMLowp on OpenCL reshaping matricesGian Marco
This new optimization allows to achieve 36.3 % of MAC utilisation on Mate 9 @ 1GHz. The performance have been reported here https://confluence.arm.com/display/MLENG/GEMMLowp+performance%3A+ACL+18.02 Change-Id: I71b6a217068763dfdc11bbf3574ee0eb94f93679 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118531 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-905 Optimize CLSoftmaxLayer for QASYMM8Giorgio Arena
Change-Id: I3512d67b8a72b17db1381842ca42780e39cc511c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120605 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02COMPMID-856: CL Depthwise Convolution QASYMM8 supportGeorgios Pinitas
Change-Id: Ic6097e7cf160e8b829fb521b7b99d9a57d9799d3 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118774 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-906: Use fused activation in NEON Batch normalizationGeorgios Pinitas
Change-Id: I5a6413548b2c9b8972c91ddba57395509dffd87e Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120656 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-765: Fix CPPPermute error when permuting the strides.Georgios Pinitas
Change-Id: I4ea57579d997dd6a2e248634e3b7cb58bb3e2838 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120693 Reviewed-by: Pablo Tello <pablo.tello@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-765: Switch 1x1 DeconvolutionLayer to use the ConvolutionLayerGeorgios Pinitas
-Swithes the 1x1 DeconvolutionLayer to use the ConvolutionLayer instead of the DirectConvolutionLayer. Change-Id: I3ffe152c42c3b1c7ea572f264cd3215df01aedc2 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120292 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-903: Implements NEPermute for NHWC conversionsGeorgios Pinitas
Change-Id: I4083e8d16bb23933634f229a1408dfd0e8f2922a Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120069 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-875: Deconvolution 4x4 not workingGeorgios Pinitas
-Enforces the use of the ConvolutionLayer function in the DeconvolutionLayer. -Adds tests for 4x4 Deconvolution. -Alters the ConvolutionLayer validation to support even kernels. Change-Id: Id27e285f078e690b8dd58490dd8ea6d875b3cec6 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118632 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-765 - Extended GEMM benchmarkGian Marco
Added new benchmarks GEMM in order to evaluate the performance when the input matrix B has to be reshaped only once Change-Id: I1c4790213704ce57ea7b28f6f362c56edccd1eb9 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118910 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-897 Merge batch normalization with bounded reluGiorgio Arena
Change-Id: I9a607fe620f795cdea1a99fdd3f5f8c2fc76f980 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/119234 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-873: Integrate RSH NEON Depthwise Convolution routineGeorgios Pinitas
Change-Id: Ida1e9a836bc518bfe5563e16bf7f92bde5fc13f7 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118472 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
2018-11-02COMPMID-578: Implement FAST corners for CL/NEONAbe Mbise
Change-Id: Ifa74e2bf05546de9a49aa185e22fba50438d8ad6 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/113946 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
2018-11-02COMPMID-828 - Add support for non square pool size - Part1Isabella Gottardi
Change-Id: Ib8100e7c659c49694c746fa3f36ce20f44f6929f Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117804 Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-895 - Optimizing CLDepthwiseConvolution3x3KernelGian Marco
This patch brings the MACs utilisation up to 25 % when both stride_x and stride_y are equal to 1 Performance reported in the following confluence page: https://confluence.arm.com/display/MLENG/Depthwise+convolution+3x3+FP32+performance%3A+ACL+18.02 Change-Id: Ida1b64be9a88805902a3d90194559b58eb1224a3 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/119068 Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-894: Segfault: neon_cnn on S5 neoGeorgios Pinitas
Removed double managment of the same tensor object Change-Id: Ibc74cd8c7bd199cd473ff68f692840cbf01b27b3 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/119119 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Isabella Gottardi <isabella.gottardi@arm.com>
2018-11-02COMPMID-784: Added support for PADDING = SAME in Winograd layer.Pablo Tello
Change-Id: I5a420da6a8041f9ff6d0811815f2fc74c85c56a8 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/119014 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-891 - Use OpenCL timer in CLTunerGian Marco
Change-Id: I84a914c13b162c4f74321c9cafc30a18ad4ebbdb Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118797 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-874: Improve default number of threads choice in the SchedulerGeorgios Pinitas
Change-Id: Ia30ec2afce0aafcd39f41440efb972b18bbda9f8 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118657 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
2018-11-02COMPMID-890: Valgrind: NEON Convolution Layer validation failsIsabella Gottardi
Change-Id: I5296815cf04e5f805d6523196567b6c01715c8b5 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118711 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-787: Add CL support for broadcast multiplyMichele Di Giorgio
Change-Id: I71f67789648ef05ccdedce77c7427bc0127b3a69 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116741 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-765: Fixed clangtidy warnings.Pablo Tello
Change-Id: I83d0f2bc8e0ebfdc0b60931f2c5acf0469caf886 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118696 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-765: Fixed CPU detection code to read larger buffer (was failing on ↵Pablo Tello
Arm Cortex-A55 FPGA with 8 CPUs with lots of flags). Change-Id: I493fb1013c6c25d9b9c809705b1ee24abac1d8d1 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118456 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-784: Winograd tramsforms refactoringPablo Tello
1) Removed the example files winograd_layer.hpp/cpp 2) Teplatized winograd transform kernels Change-Id: I7045fa0b801b9d30a11275914aaa2dafd254aed2 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118332 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02APPBROWSER-400: Implement the tensorshift kernel for fixing DC's alignment ↵Xinghang Zhou
issue on OpenGL ES Change-Id: I7a8489bb0fddc72899ea165e414ee87bdbfb45b3 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118106 Reviewed-by: Joel Liang <joel.liang@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02IVGCVSW-863 Broadcast support in CL/NEON Arithmetic AddDiego Lopez Recas
Also, added instrumentation to support generic tensor broadcasting for NEON and CL backends. Change-Id: I1bc5747a286e1a4b464c209067581e103d473b9a Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/114201 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-866: Integrate SGEMV Neon Assembly from RSHMichele Di Giorgio
Change-Id: Icbb43de7642e2b433d7471d70b9dbbde850989d3 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118197 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
2018-11-02COMPMID-765: Allow RSH's code to not have default cases in their switchesAnthony Barbier
Change-Id: I2d3cc9668852a1ba414fc3148866df408f770dc8 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118308 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Anthony Barbier <anthony.barbier@arm.com>