aboutsummaryrefslogtreecommitdiff
path: root/src/runtime/CL
AgeCommit message (Collapse)Author
2018-11-02COMPMID-1016: Optimize kernel reconfigurationGeorgios Pinitas
Optimizes kernel reconfiguration when memory manager is used. Note that this works only if every sub-sequent reconfigurations leads to sizes less than the first one. Change-Id: I08898e99929c3756147a02979b726c2380b6e11d Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/125114 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-959 Add missing validate check in CLRNNLayerMichalis Spyrou
Change-Id: I9346e62dc4eef4f3dc7a70160ac9370f4deeb8fa Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/126185 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-992 Implement CL RNN functionMichalis Spyrou
Change-Id: I8dbada5fabedbb8523e433ba73d504bd15b81466 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/125787 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-337: Adding OpenCL SVM support.Pablo Tello
Change-Id: I250d6a1daeccf91d97b6da65aec53b02cf6046a7 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116140 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-1019 Implement copy function CLMichalis Spyrou
Change-Id: Ica17528bf6c812d9caf9d66c612c11434ec1dc69 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/125542 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-1009 Support 4x4 output tile for Winograd Filter Transform on OpenCL.Giorgio Arena
Change-Id: I68c6453e0f192de659582404f109a89616b9fbb9 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/124811 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02COMPMID-935 - Implementing Convolution with Winograd on OpenCL (part 4)Gian Marco Iodice
Implemented Winograd Output Transform (2x2,3x3) on OpenCL Implemented CLWinogradConvolutionLayer on OpenCL Change-Id: I6a113fc5f052ca07f878d2b800d2ab003f84af65 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/125148 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-754: Add validation to LocallyConnected and NEDeconv layersAlex Gilday
Change-Id: Ifed8713f4d7f1315af684b30d11323db2b533f10 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/121783 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
2018-11-02COMPMID-959: Manage memory on pure GEMMGeorgios Pinitas
Change-Id: I30e605db5e54266c6af70ac9fe602437966d9c73 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/125107 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02COMPMID-853 Fuse CL DepthwiseConvolution with Activation for QASYM8Giorgio Arena
Change-Id: I287908f76af458ad4b4d865d353dc37e33877250 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120839 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-754: Add validation to (De)QuantizationLayersAlex Gilday
Change-Id: If8fa1277e8dc5b8e28a8bcad4ff9fc672b00ce9a Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/123275 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
2018-11-02COMPMID-594: Implement reference and CL/NEON validation for LocallyConnectedSanghoon Lee
Change-Id: I01e7abcf3f1b19458128e277044af850ad9fa224 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118610 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-793 : Add graph intermediate representationGeorgios Pinitas
Change-Id: Ic1685de4e19e0ac79669ef2da64e1dc96c7ea0bf Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115248 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-935 Implementing Convolution with Winograd on OpenCL (part 3)Giorgio Arena
Change-Id: I51f92f30602fb0a02314f344fa67061f448694bf Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/122793 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02COMPMID-886 Don't use LWS hints by default for GPU post Mali-G72Michalis Spyrou
Change-Id: I64cb2d7f9513d69aebd9307a803b1b2c9c0e04c3 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/121929 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPID-978: Throw exception on missing tuning fileAnthony Barbier
Change-Id: I09ad6f52cc32f5dcd38d2ba7c6143e7e9ab12b61 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/122767 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Matthew Bentham <matthew.bentham@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
2018-11-02COMPMID-978: Fixed bug in Load/Store tuning data and in the CLTuner interceptorAnthony Barbier
Change-Id: I6cdecef623ff5806cf82cb12a60aef8aefec32f5 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/122712 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-978 Load/Store tuning data from file (Part2)Anthony Barbier
Change-Id: I1819f42c0e456673543b267d51f730b6e80a0ad9 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/122629 Reviewed-by: Robert Hughes <robert.hughes@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-978 Load/Store tuning data from fileAnthony Barbier
Change-Id: I1d1f402df3a58704c021b9866d489844fb5e7d7a Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/122395 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-959: Update valid region in DepthConcatenateGeorgios Pinitas
Change-Id: I8aaf15a64aab592bfbdb386fdb07631cad933fa6 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/122307 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-765 - Fix performance issues on OpenCLGian Marco
The problem was related to the reshape of the weights. The reshaping happened for each run Change-Id: Ie7d02fa6bb08df34e44213303e9eb0700ff77160 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/121877 Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-617: Add validate support for NEON FullyConnectedLayerIoan-Cristian Szabo
Change-Id: I08987022c8d4cc335c00b8af27bd3edb8fe64d3b Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/111596 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Alexander Gilday <alexander.gilday@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-754: Add validation to kernels.Georgios Pinitas
Adds validation method to: - CLConvolutionLayer Change-Id: I95516e20cfb71c1e603c60fc6491ac695883a856 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117355 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-754: Add CLPermute validation methodIsabella Gottardi
Change-Id: I77ed920a43738effd55b086e3138f497057a72c5 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/121618 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-927: Adding support for FP16 in CLDepthwiseConvolutionLayer3x3Michele Di Giorgio
Change-Id: Ie5f299c7a7fbe3062cee22bb2b4ae5df818fe490 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/121178 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-582: Add validation to channel_extract kernels.Ioan-Cristian Szabo
Change-Id: I5022d02f06f9d849dad76e3d9b8e48632c236429 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/121191 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-765 - Fix get_convolution_method in order to return the correct method.Isabella Gottardi
Change-Id: Ia4be053b9f5399fe7e241cebb4292890e957ae54 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/121141 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-936: Convolution failure in NEON Convolution Layer.Georgios Pinitas
Change-Id: I68a98eff57c8db719a501b68541666e8bc5f2081 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/121180 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02Revert "COMPMID-582: Add validation to channel_extract kernels."Anthony Barbier
This reverts commit 9a0875951d43dda035f32d2e0728cf59d80cb4d3. Change-Id: I6af0bc64c656f91cf1e0357f8760defa08f2e78d Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/121190 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-909: Enabling in-place computation for batchnormalization and ↵Michele Di Giorgio
activation at graph level Change-Id: I84d4a212629b21794451ab5fb5c5b187b5e28f98 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120127 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-845: Create a ConvolutionLayer for CLIsabella Gottardi
Change-Id: Ifcc406d2d0a99c911d6b6c875657b0e0028255d5 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/119148 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-934: Return an error in Validate when we don't support asymmetric ↵Anthony Barbier
padding Currently an assert gets fired in debug mode, and we just ignore the asymmetric padding in release mode. Change-Id: Ia6278b5722f7e93f356a975ab3243e6bb07e44a8 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120840 Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-582: Add validation to channel_extract kernels.Ioan-Cristian Szabo
Change-Id: I6413a05f6870a0d04f12d7348269b15297ae8493 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/114696 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-882 - Optimizing GEMMLowp on OpenCL reshaping matricesGian Marco
This new optimization allows to achieve 36.3 % of MAC utilisation on Mate 9 @ 1GHz. The performance have been reported here https://confluence.arm.com/display/MLENG/GEMMLowp+performance%3A+ACL+18.02 Change-Id: I71b6a217068763dfdc11bbf3574ee0eb94f93679 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118531 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-905 Optimize CLSoftmaxLayer for QASYMM8Giorgio Arena
Change-Id: I3512d67b8a72b17db1381842ca42780e39cc511c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120605 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02COMPMID-856: CL Depthwise Convolution QASYMM8 supportGeorgios Pinitas
Change-Id: Ic6097e7cf160e8b829fb521b7b99d9a57d9799d3 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118774 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-875: Deconvolution 4x4 not workingGeorgios Pinitas
-Enforces the use of the ConvolutionLayer function in the DeconvolutionLayer. -Adds tests for 4x4 Deconvolution. -Alters the ConvolutionLayer validation to support even kernels. Change-Id: Id27e285f078e690b8dd58490dd8ea6d875b3cec6 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118632 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-897 Merge batch normalization with bounded reluGiorgio Arena
Change-Id: I9a607fe620f795cdea1a99fdd3f5f8c2fc76f980 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/119234 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-578: Implement FAST corners for CL/NEONAbe Mbise
Change-Id: Ifa74e2bf05546de9a49aa185e22fba50438d8ad6 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/113946 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
2018-11-02COMPMID-895 - Optimizing CLDepthwiseConvolution3x3KernelGian Marco
This patch brings the MACs utilisation up to 25 % when both stride_x and stride_y are equal to 1 Performance reported in the following confluence page: https://confluence.arm.com/display/MLENG/Depthwise+convolution+3x3+FP32+performance%3A+ACL+18.02 Change-Id: Ida1b64be9a88805902a3d90194559b58eb1224a3 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/119068 Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-891 - Use OpenCL timer in CLTunerGian Marco
Change-Id: I84a914c13b162c4f74321c9cafc30a18ad4ebbdb Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118797 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-787: Add CL support for broadcast multiplyMichele Di Giorgio
Change-Id: I71f67789648ef05ccdedce77c7427bc0127b3a69 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116741 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02IVGCVSW-863 Broadcast support in CL/NEON Arithmetic AddDiego Lopez Recas
Also, added instrumentation to support generic tensor broadcasting for NEON and CL backends. Change-Id: I1bc5747a286e1a4b464c209067581e103d473b9a Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/114201 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-765 - Added third dimension for CLTunerGian Marco
Change-Id: I0a7ea4cde1dbf8edd28908dfff80928ef7e996c4 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117647 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-588: Port Equalize Histogram to new validationJohn Richardson
Change-Id: Iff50adf2993bd69c2696a47559d6b2e0011fed87 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/110177 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-790 - NEON: Add QASYMM8 support to ConvolutionIsabella Gottardi
Change-Id: Iec82a91ad351cfe8d07d0976a24bd42f4703177a Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116833 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02COMPMID-748 - Integrating optimized SGEMM for bifrostGian Marco
This patch introduces a new GEMM capable to improve the mac utilisation of 10% compared to the GEMM without reshape. However this implementation is not faster in all cases as we need to take into account the time for reshaping the matrices. For this reason an heuristic solution to select the optimal GEMM to use has been added to the function. More information about the heuristic implementation can be found at COMPMID-852. With this new patch, GoogleNet, MobileNet, VGG16 and SqueezeNet can improved the performance of 1.5x. More information about the performance uplift can be found here: https://confluence.arm.com/display/MLENG/GEMM+FP32+performance%3A+ACL+18.02 Change-Id: I024563c06b9aed02a211a974e452bae5c233b04c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117140 Reviewed-by: Pablo Tello <pablo.tello@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-816 - Optimizing CLGEMMLowpMatrixMultiplyCore - Part1Gian Marco
The performance improvements have been reported at the following confluence page: https://confluence.arm.com/display/MLENG/GEMMLowp+performance%3A+ACL+18.02 Config3 of McVail looks improved by 29x Change-Id: I8b203c0b75fc368f85cea863b7eed398fab3e79a Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115783 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-838 Implement CLPermuteMichalis Spyrou
Change-Id: I6d97b649f1ebc289c9e6f8949e67740a6b3cbcb2 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116636 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-674 - Create Google InceptionV3 exampleGeorgios Pinitas
Change-Id: I389e0d4104b7dde60b7cdd612a83f3328517e44c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115804 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>