aboutsummaryrefslogtreecommitdiff
path: root/arm_compute/core/CL
AgeCommit message (Collapse)Author
2018-11-02COMPMID-1107: Add support for ChannelShuffle in CLMichele Di Giorgio
Change-Id: I56d2a02b316f0c69ff1fd7220e732f775414fe69 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/129709 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02COMPMID-926 Add depth multiplier support to NEON/CL/GLES depthwise convolutionGiorgio Arena
Change-Id: I03f32c62350e5ea43e77bb15fc5a832d83719e3b Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/126657 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-1043: Rework GCGEMMMatrixMultiplyKernel interface and allow auto ↵Michele Di Giorgio
initialization of the tensors This patch also: - removes support for already reshaped weights in GCConvolutionLayer - makes GCConvolutionLayer similar to CLGEMMConvolutionLayer - enables usage of the GCGEMM function in GCConvolution instead of calling the GEMM kernels directly Change-Id: I3e4a64335555e86e18585d38d8fda4bfdb44e265 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/127696 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-959 Refactor OpenCL interceptors to use lambda functionsAnthony Barbier
Change-Id: I29b73a311d7278255b77524f2a5eaaa4dccab711 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/128392 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-855: Get the library to work on non Mali GPUsAnthony Barbier
Change-Id: Ia6a7b7a9d8b10ebf6b3c6a0fffa10bdf5dd8d8ef Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/128381 Reviewed-by: Pablo Tello <pablo.tello@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-1026 - Add support for 4x4 output tile in CLWinogradConvolutionLayerGian Marco Iodice
The performance achieved can be found at the following confluence page: https://confluence.arm.com/display/MLENG/GEMM-based+convolution+vs+Winograd-based+convolution+on+OpenCL Change-Id: I4b690cfdd4eb4ff0cd17b14fdd49ccaa1d1dc85c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/127729 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-987: Make beta and gamma optional in BatchNormalizationMichele Di Giorgio
Currently we have beta and gamma compulsory in Batch normalization. There are network that might not need one or both of those. Thus these should be optional with beta(offset) defaulting to zero and gamma(scale) to 1. Will also reduce some memory requirements. Change-Id: I15bf1ec14b814be2acebf1be1a4fba9c4fbd3190 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/123237 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-577: Implement CL validation for GaussianPyramidSanghoon Lee
Change-Id: If879cbe15b14d97818c24d44b29fc69b6c8cb686 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/127601 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-959: Add accessors for the OpenCL program cacheAnthony Barbier
Change-Id: I7920ecdf6687341cbcf4d75aecc15c4164c64636 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/127722 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-811 Add NHWC data format support for CL depthwise convolution QASYMM8Giorgio Arena
Change-Id: I89de432f3fbcba7abf9e1d4f8396a4334b4fa2c2 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118324 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02COMPMID-1013 - Create WinogradInfo data structureGian Marco Iodice
COMPMID-1014 - Refactoring Winograd's dataset Change-Id: I6abdcbf9a90d663f4db666cd410afece9f1d034d Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/125899 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-994 : Check cl_arm_printf is supported in the CLSchedulerVidhya Sudhan Loganathan
Introduced static and dynamic checks before using printf vendor extension features (callbacks and buffers) Change-Id: Ib38cb3d8591bbb482d02a41918f4b52efde75267 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/126751 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-797: Switch to new graph.Georgios Pinitas
- Cleaned up build system Change-Id: If2faa27ee5b31fa8b972836960ab3ef671059c8d Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/126435 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
2018-11-02COMPMID-734: CLTuner reworkGeorgios Pinitas
Change-Id: I8f20d6ea8a09869d71003e7b08e0d33775282f6c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/125802 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1017: Implement dilated convolution in NEON, OpenCL, and GCAlex Gilday
Change-Id: If4626ec9e215e14dffe22e80812da5bac84a52e2 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/125734 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-337: Adding OpenCL SVM support.Pablo Tello
Change-Id: I250d6a1daeccf91d97b6da65aec53b02cf6046a7 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116140 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-1019 Implement copy function CLMichalis Spyrou
Change-Id: Ica17528bf6c812d9caf9d66c612c11434ec1dc69 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/125542 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-1008: Fix Doxygen issuesAlex Gilday
Change-Id: Ie73d8771f85d1f5b059f3a56f1bbd73c98e94a38 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/124723 Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1009 Support 4x4 output tile for Winograd Filter Transform on OpenCL.Giorgio Arena
Change-Id: I68c6453e0f192de659582404f109a89616b9fbb9 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/124811 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02COMPMID-935 - Implementing Convolution with Winograd on OpenCL (part 4)Gian Marco Iodice
Implemented Winograd Output Transform (2x2,3x3) on OpenCL Implemented CLWinogradConvolutionLayer on OpenCL Change-Id: I6a113fc5f052ca07f878d2b800d2ab003f84af65 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/125148 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-754: Add validation to LocallyConnected and NEDeconv layersAlex Gilday
Change-Id: Ifed8713f4d7f1315af684b30d11323db2b533f10 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/121783 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
2018-11-02COMPMID-853 Fuse CL DepthwiseConvolution with Activation for QASYM8Giorgio Arena
Change-Id: I287908f76af458ad4b4d865d353dc37e33877250 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120839 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-754: Add validation to (De)QuantizationLayersAlex Gilday
Change-Id: If8fa1277e8dc5b8e28a8bcad4ff9fc672b00ce9a Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/123275 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
2018-11-02COMPMID-935 - Implementing Convolution with Winograd on OpenCL (part 2)Gian Marco Iodice
Implemented Winograd Filter Transform 3x3 on OpenCL Change-Id: I8f2b2dd938c5c000ef7ce392a37fb7b8b4202a4e Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/122708 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-935 Implementing Convolution with Winograd on OpenCL (part 3)Giorgio Arena
Change-Id: I51f92f30602fb0a02314f344fa67061f448694bf Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/122793 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02COMPMID-886 Don't use LWS hints by default for GPU post Mali-G72Michalis Spyrou
Change-Id: I64cb2d7f9513d69aebd9307a803b1b2c9c0e04c3 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/121929 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-995 Add CL_DEVICE_VERSION to the test framework outputAnthony Barbier
Change-Id: Ic32742388fbd45c8acc395977586204980eff591 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/123541 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Kevin Petit <kevin.petit@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-754: Add validation to kernels.Georgios Pinitas
Adds validation method to: - CLConvolutionLayer Change-Id: I95516e20cfb71c1e603c60fc6491ac695883a856 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117355 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-864 Window::collapse_if_possible() is misused in several CL kernelsMichalis Spyrou
Removed unnecessary collapse_if_possible() calls. Change-Id: I6f3434bc4a26470c4de5bac4e3d90b4b019c2c9c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117993 Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-927: Adding support for FP16 in CLDepthwiseConvolutionLayer3x3Michele Di Giorgio
Change-Id: Ie5f299c7a7fbe3062cee22bb2b4ae5df818fe490 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/121178 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-909: Enabling in-place computation for batchnormalization and ↵Michele Di Giorgio
activation at graph level Change-Id: I84d4a212629b21794451ab5fb5c5b187b5e28f98 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120127 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-882 - Optimizing GEMMLowp on OpenCL reshaping matricesGian Marco
This new optimization allows to achieve 36.3 % of MAC utilisation on Mate 9 @ 1GHz. The performance have been reported here https://confluence.arm.com/display/MLENG/GEMMLowp+performance%3A+ACL+18.02 Change-Id: I71b6a217068763dfdc11bbf3574ee0eb94f93679 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118531 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-856: CL Depthwise Convolution QASYMM8 supportGeorgios Pinitas
Change-Id: Ic6097e7cf160e8b829fb521b7b99d9a57d9799d3 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118774 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-897 Merge batch normalization with bounded reluGiorgio Arena
Change-Id: I9a607fe620f795cdea1a99fdd3f5f8c2fc76f980 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/119234 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-891 - Use OpenCL timer in CLTunerGian Marco
Change-Id: I84a914c13b162c4f74321c9cafc30a18ad4ebbdb Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118797 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-855 - Optimizing im2col on OpenCL (DCHW)Gian Marco
Introduced optimizations for 1x1, 3x3, 5x5 and 11x11 Change-Id: Ibb7f7a9fbec01a7684746ed8513634078126e452 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118107 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
2018-11-02COMPMID-754: Add validation method to CLPermute kernelMichele Di Giorgio
Change-Id: If6f3888a035b557a6c369efa22b56d6c8d3efbd3 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118789 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
2018-11-02COMPMID-787: Add CL support for broadcast multiplyMichele Di Giorgio
Change-Id: I71f67789648ef05ccdedce77c7427bc0127b3a69 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116741 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02IVGCVSW-863 Broadcast support in CL/NEON Arithmetic AddDiego Lopez Recas
Also, added instrumentation to support generic tensor broadcasting for NEON and CL backends. Change-Id: I1bc5747a286e1a4b464c209067581e103d473b9a Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/114201 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-748 - Integrating optimized SGEMM for bifrostGian Marco
This patch introduces a new GEMM capable to improve the mac utilisation of 10% compared to the GEMM without reshape. However this implementation is not faster in all cases as we need to take into account the time for reshaping the matrices. For this reason an heuristic solution to select the optimal GEMM to use has been added to the function. More information about the heuristic implementation can be found at COMPMID-852. With this new patch, GoogleNet, MobileNet, VGG16 and SqueezeNet can improved the performance of 1.5x. More information about the performance uplift can be found here: https://confluence.arm.com/display/MLENG/GEMM+FP32+performance%3A+ACL+18.02 Change-Id: I024563c06b9aed02a211a974e452bae5c233b04c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117140 Reviewed-by: Pablo Tello <pablo.tello@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-841: Add CL QASYMM8 RELU ActivationMichele Di Giorgio
Change-Id: I8e0b7cad2f977942224d0116e8498bf9b2d6014d Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117229 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-838 Implement CLPermuteMichalis Spyrou
Change-Id: I6d97b649f1ebc289c9e6f8949e67740a6b3cbcb2 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116636 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-839 Added method to clear the Kernel Library's program cacheAnthony Barbier
Change-Id: If2e14c19f16686a2a8e05832845f8bfcf0f0cdaf Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116537 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-471 Implement Deconvolution on OpenCLMichalis Spyrou
Change-Id: Ie00c6b08a51d30c5ce2637d40ee3d165b8a68686 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/110311 Reviewed-by: Pablo Tello <pablo.tello@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-765 Merge pull request #325 from lukeiwanski/feature/no_exceptionsAnthony Barbier
ARM_COMPUTE_NO_EXCEPTIONS macro guard Cherry-picked public merge request from Codeplay Change-Id: Id819177fcc86a64dc4e82eefe46b2f646619e8c0 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/114924 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-617: Adds CLFullyConnectionLayer validation supportGeorgios Pinitas
Change-Id: I4d2eb9872a3165fdcaa7784596e441cbe563dbc2 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/112577 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Ioan-Cristian Szabo <ioan-cristian.szabo@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-766 Allow graph examples to run without OpenCL being present on the ↵Anthony Barbier
platform Change-Id: I4142e0720ecb58549a08d4e86ad21abb882f5f37 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/114552 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-556 Fix Color Convert documentationGiorgio Arena
Change-Id: Iec0728cbe33be1c006499c7892841baf584485f7 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/112908 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-556 Fix Channel Extract documentationGiorgio Arena
Change-Id: Iac5fd0800953f6db040f025313e5077fcad74af6 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/112931 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-556: Rename Error to Status and inverse logicGeorgios Pinitas
Change-Id: Ib57d4f7177cc6179302bda7ad870acb8bd3825f5 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/112115 Tested-by: BSG Visual Compute Jenkins server to access repositories on http://mpd-gerrit.cambridge.arm.com <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>