aboutsummaryrefslogtreecommitdiff
path: root/arm_compute/runtime
AgeCommit message (Collapse)Author
2018-11-02COMPMID-935 - Implementing Convolution with Winograd on OpenCL (part 4)Gian Marco Iodice
Implemented Winograd Output Transform (2x2,3x3) on OpenCL Implemented CLWinogradConvolutionLayer on OpenCL Change-Id: I6a113fc5f052ca07f878d2b800d2ab003f84af65 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/125148 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-754: Add validation to LocallyConnected and NEDeconv layersAlex Gilday
Change-Id: Ifed8713f4d7f1315af684b30d11323db2b533f10 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/121783 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
2018-11-02COMPMID-1004 GLES: Add memory manager to GLES functionsMichalis Spyrou
Change-Id: I80fc9c0dd02afd79b501abde751036f9599b7bf2 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/125103 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-853 Fuse CL DepthwiseConvolution with Activation for QASYM8Giorgio Arena
Change-Id: I287908f76af458ad4b4d865d353dc37e33877250 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120839 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-754: Add validation to (De)QuantizationLayersAlex Gilday
Change-Id: If8fa1277e8dc5b8e28a8bcad4ff9fc672b00ce9a Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/123275 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
2018-11-02COMPMID-881: RSH new arm_gemm interface.Pablo Tello
Change-Id: I1e2a1a77097d8017c274af3f97eba6964f80f5fa Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/122592 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1002: Expose number of pools in PoolManagerGeorgios Pinitas
Change-Id: I3b14bd87d0650dc4378dc3ccbd14274b9c019657 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/124428 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-793 : Add graph intermediate representationGeorgios Pinitas
Change-Id: Ic1685de4e19e0ac79669ef2da64e1dc96c7ea0bf Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115248 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-935 Implementing Convolution with Winograd on OpenCL (part 3)Giorgio Arena
Change-Id: I51f92f30602fb0a02314f344fa67061f448694bf Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/122793 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02COMPMID-978: Fixed bug in Load/Store tuning data and in the CLTuner interceptorAnthony Barbier
Change-Id: I6cdecef623ff5806cf82cb12a60aef8aefec32f5 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/122712 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-978 Load/Store tuning data from file (Part2)Anthony Barbier
Change-Id: I1819f42c0e456673543b267d51f730b6e80a0ad9 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/122629 Reviewed-by: Robert Hughes <robert.hughes@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-540 Replace NEDeconvolutionLayerUpsampleKernel with NEScaleKernelMichalis Spyrou
Change-Id: Ic29557cca24447ef40fa2cfca84f208b4d43f8de Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/122180 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
2018-11-02COMPMID-978 Load/Store tuning data from fileAnthony Barbier
Change-Id: I1d1f402df3a58704c021b9866d489844fb5e7d7a Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/122395 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-941 Add NEON broadcast multiply supportMichalis Spyrou
Change-Id: I1f808c25750461bec9a28b2f6615fbd0f624117a Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/122262 Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-765 - Fix performance issues on OpenCLGian Marco
The problem was related to the reshape of the weights. The reshaping happened for each run Change-Id: Ie7d02fa6bb08df34e44213303e9eb0700ff77160 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/121877 Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-617: Add validate support for NEON FullyConnectedLayerIoan-Cristian Szabo
Change-Id: I08987022c8d4cc335c00b8af27bd3edb8fe64d3b Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/111596 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Alexander Gilday <alexander.gilday@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-754: Add validation to kernels.Georgios Pinitas
Adds validation method to: - CLConvolutionLayer Change-Id: I95516e20cfb71c1e603c60fc6491ac695883a856 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117355 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-754: Add CLPermute validation methodIsabella Gottardi
Change-Id: I77ed920a43738effd55b086e3138f497057a72c5 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/121618 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-927: Adding support for FP16 in CLDepthwiseConvolutionLayer3x3Michele Di Giorgio
Change-Id: Ie5f299c7a7fbe3062cee22bb2b4ae5df818fe490 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/121178 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02APPBROWSER-391: Fix GLES COMPUTE alignment issuesFrank Lei
APPBROWSER-402: Performance optimization for squeezenet/xray model Change-Id: If31b186b99a6d6087164019fe94d3ac9279e3204 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/119526 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-846: Create a ConvolutionLayer for NEONIsabella Gottardi
Change-Id: I98bbef40bfac5b05134be4ef9fb54d14c0c9e8e8 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118806 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-909: Enabling in-place computation for batchnormalization and ↵Michele Di Giorgio
activation at graph level Change-Id: I84d4a212629b21794451ab5fb5c5b187b5e28f98 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120127 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-845: Create a ConvolutionLayer for CLIsabella Gottardi
Change-Id: Ifcc406d2d0a99c911d6b6c875657b0e0028255d5 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/119148 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-784: Productise Winograd.Pablo Tello
a) Added support for kernel size 5. b) Templatised data type for transforms and batched gemms kernels. Change-Id: Idb83dda7a5eec19e015888ab31902bd791913297 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120540 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-905 Optimize CLSoftmaxLayer for QASYMM8Giorgio Arena
Change-Id: I3512d67b8a72b17db1381842ca42780e39cc511c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120605 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02COMPMID-856: CL Depthwise Convolution QASYMM8 supportGeorgios Pinitas
Change-Id: Ic6097e7cf160e8b829fb521b7b99d9a57d9799d3 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118774 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-906: Use fused activation in NEON Batch normalizationGeorgios Pinitas
Change-Id: I5a6413548b2c9b8972c91ddba57395509dffd87e Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120656 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-765: Switch 1x1 DeconvolutionLayer to use the ConvolutionLayerGeorgios Pinitas
-Swithes the 1x1 DeconvolutionLayer to use the ConvolutionLayer instead of the DirectConvolutionLayer. Change-Id: I3ffe152c42c3b1c7ea572f264cd3215df01aedc2 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120292 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-903: Implements NEPermute for NHWC conversionsGeorgios Pinitas
Change-Id: I4083e8d16bb23933634f229a1408dfd0e8f2922a Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120069 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-792: Update doctrings of CL kernels supporting broadcast operationsMichele Di Giorgio
Change-Id: I71146a83c67c4b193ef1e79d78bd80f9449781e2 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118748 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-875: Deconvolution 4x4 not workingGeorgios Pinitas
-Enforces the use of the ConvolutionLayer function in the DeconvolutionLayer. -Adds tests for 4x4 Deconvolution. -Alters the ConvolutionLayer validation to support even kernels. Change-Id: Id27e285f078e690b8dd58490dd8ea6d875b3cec6 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118632 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-765 - Extended GEMM benchmarkGian Marco
Added new benchmarks GEMM in order to evaluate the performance when the input matrix B has to be reshaped only once Change-Id: I1c4790213704ce57ea7b28f6f362c56edccd1eb9 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118910 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-897 Merge batch normalization with bounded reluGiorgio Arena
Change-Id: I9a607fe620f795cdea1a99fdd3f5f8c2fc76f980 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/119234 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-873: Integrate RSH NEON Depthwise Convolution routineGeorgios Pinitas
Change-Id: Ida1e9a836bc518bfe5563e16bf7f92bde5fc13f7 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118472 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
2018-11-02COMPMID-578: Implement FAST corners for CL/NEONAbe Mbise
Change-Id: Ifa74e2bf05546de9a49aa185e22fba50438d8ad6 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/113946 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
2018-11-02COMPMID-891 - Use OpenCL timer in CLTunerGian Marco
Change-Id: I84a914c13b162c4f74321c9cafc30a18ad4ebbdb Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118797 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-874: Improve default number of threads choice in the SchedulerGeorgios Pinitas
Change-Id: Ia30ec2afce0aafcd39f41440efb972b18bbda9f8 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118657 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
2018-11-02COMPMID-787: Add CL support for broadcast multiplyMichele Di Giorgio
Change-Id: I71f67789648ef05ccdedce77c7427bc0127b3a69 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116741 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-765: Fixed clangtidy warnings.Pablo Tello
Change-Id: I83d0f2bc8e0ebfdc0b60931f2c5acf0469caf886 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118696 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-784: Winograd tramsforms refactoringPablo Tello
1) Removed the example files winograd_layer.hpp/cpp 2) Teplatized winograd transform kernels Change-Id: I7045fa0b801b9d30a11275914aaa2dafd254aed2 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118332 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02APPBROWSER-400: Implement the tensorshift kernel for fixing DC's alignment ↵Xinghang Zhou
issue on OpenGL ES Change-Id: I7a8489bb0fddc72899ea165e414ee87bdbfb45b3 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118106 Reviewed-by: Joel Liang <joel.liang@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02IVGCVSW-863 Broadcast support in CL/NEON Arithmetic AddDiego Lopez Recas
Also, added instrumentation to support generic tensor broadcasting for NEON and CL backends. Change-Id: I1bc5747a286e1a4b464c209067581e103d473b9a Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/114201 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02IVGCVSW-798 Add Softmax NEON support for QASYMM8Diego Lopez Recas
Change-Id: I4f2cca52caf210fdb7d6bb7e9436ac51cb5088b4 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/112398 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-784: Added support for biases in WinogradLayer.Pablo Tello
1) Updated to the latest code from the RSH repo. 2) Moved winograd transforms into kernels. 3) Added support for biases Change-Id: I7f39f34a599b49d7d9b549cc10a4f4d4a8007ab8 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117474 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-791: Generic Depthwise Convolution Layer NEON QASYMM8Georgios Pinitas
Change-Id: I33cf54e68f6c097ac58b6f16c3f9a720978f09cd Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117289 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-790 - NEON: Add QASYMM8 support to ConvolutionIsabella Gottardi
Change-Id: Iec82a91ad351cfe8d07d0976a24bd42f4703177a Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116833 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02COMPMID-748 - Integrating optimized SGEMM for bifrostGian Marco
This patch introduces a new GEMM capable to improve the mac utilisation of 10% compared to the GEMM without reshape. However this implementation is not faster in all cases as we need to take into account the time for reshaping the matrices. For this reason an heuristic solution to select the optimal GEMM to use has been added to the function. More information about the heuristic implementation can be found at COMPMID-852. With this new patch, GoogleNet, MobileNet, VGG16 and SqueezeNet can improved the performance of 1.5x. More information about the performance uplift can be found here: https://confluence.arm.com/display/MLENG/GEMM+FP32+performance%3A+ACL+18.02 Change-Id: I024563c06b9aed02a211a974e452bae5c233b04c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117140 Reviewed-by: Pablo Tello <pablo.tello@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-798 Add instrumentation to NEON kernelsAnthony Barbier
Change-Id: I9dbb090cac731d68bd98a7d1c8ab0e1cb0a5c911 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116746 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com> Tested-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02APPBROWSER-377: GCConvoutionLayer support for FP16Stephen Li
Change-Id: I801b5e393a16a9f92c062826e6fcfd5982ca7bb3 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116584 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-815: Updated NEWinogradLayer with the lastest code from Research.Pablo Tello
Change-Id: I86d7f53b5f5d1dbc22078aea5c32b08a25d1f49e Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116634 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>