aboutsummaryrefslogtreecommitdiff
path: root/src/core/CL
AgeCommit message (Collapse)Author
2018-11-02COMPMID-1068 Create validate method to CLDepthWiseConvolutionGiorgio Arena
Change-Id: I3301b66a8a072c6ecd0d7f2dabef350017b55ac4 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/128677 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-922 - CLGEMM FP16 optimizations - part2Gian Marco Iodice
This patch improves of ~30 % GEMM fp16 when the reshape is required The results have been reported at the following confluence page: https://confluence.arm.com/display/MLENG/GEMM+FP16+performance%3A+ACL+18.05 Change-Id: I8233095a7e9ab06f1f915782a25dd41653b49140 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/128254 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1125: Add support for FP16 in CLDepthwiseConvolutionMichele Di Giorgio
Change-Id: I4838f5a8e4c33ed646cd05e0bb682fca635a29a3 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/130469 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
2018-11-02COMPMID-1117: TransposeAccessWindow leads to high paddingGeorgios Pinitas
Switches CLGEMMMatrixMultiplyKernel and CLGEMMTranspose1xWKernel to use AccessWindowStatic Change-Id: I21533d4218215d5b8f84b23c603062678eccb1ed Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/130244 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02COMPMID-1107: Add support for ChannelShuffle in CLMichele Di Giorgio
Change-Id: I56d2a02b316f0c69ff1fd7220e732f775414fe69 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/129709 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02COMPMID-808 Add NHWC data format support for NEON direct convolutionGiorgio Arena
Change-Id: I5d4cc3d5b0d25f3fe4ed998c0f15b1b8e260a43a Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/125697 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1089 CLPoolingLayer fix padding and add output shape computation to ↵Giorgio Arena
ShapeCalculator Change-Id: Ide83424e9fe6b8102ed9e3c355c099c3912e7e61 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/129635 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-926 Add depth multiplier support to NEON/CL/GLES depthwise convolutionGiorgio Arena
Change-Id: I03f32c62350e5ea43e77bb15fc5a832d83719e3b Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/126657 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-805 Add NHWC data format support for CL poolingMichalis Spyrou
Change-Id: I3d91fde78b971aba3f6349f633cd9b1c50e5cacf Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/124712 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-1096 - Add fast_math flag to CLConvolutionLayerGian Marco Iodice
COMPMID-1103 - CLWinogradConvolutionLayer mismatches Change-Id: Iceaa9482a1790ec39d2720c220261aaea8043978 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/129398 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-1085 : runtime_error thrown by QASYMM8 CLGEMMConvolutionLayer::validateVidhya Sudhan Loganathan
shape and quantization info were corrected. Error from validate() is forwarded. Validate() tests outside the context of configure()are added. Change-Id: I13f1a02eccda6b595089c4875b21853ca372f2f2 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/129323 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-803: Add NHWC data format support for CL batch normalisationMichele Di Giorgio
Change-Id: Ie37588f60b9cfc7b1d09b1e8628fcfb4b17e0717 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/123834 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1078 Fix CL DepthwiseConvolutionLayer QASYMM8 failing validationGiorgio Arena
Change-Id: Id540490e5faf11c466ff039a20880eeedd6e5ec7 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/128612 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-by: Isabella Gottardi <isabella.gottardi@arm.com>
2018-11-02COMPMID-855: Get the library to work on non Mali GPUsAnthony Barbier
Change-Id: Ia6a7b7a9d8b10ebf6b3c6a0fffa10bdf5dd8d8ef Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/128381 Reviewed-by: Pablo Tello <pablo.tello@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-1026 - Add support for 4x4 output tile in CLWinogradConvolutionLayerGian Marco Iodice
The performance achieved can be found at the following confluence page: https://confluence.arm.com/display/MLENG/GEMM-based+convolution+vs+Winograd-based+convolution+on+OpenCL Change-Id: I4b690cfdd4eb4ff0cd17b14fdd49ccaa1d1dc85c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/127729 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-1037 Add support for F(4x4, 5x5) in CLWinogradOutputTransformKernelGiorgio Arena
Change-Id: I0b126f03028f08687497b0d79d2e2764f7ed07c8 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/128001 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02COMPMID-987: Make beta and gamma optional in BatchNormalizationMichele Di Giorgio
Currently we have beta and gamma compulsory in Batch normalization. There are network that might not need one or both of those. Thus these should be optional with beta(offset) defaulting to zero and gamma(scale) to 1. Will also reduce some memory requirements. Change-Id: I15bf1ec14b814be2acebf1be1a4fba9c4fbd3190 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/123237 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-577: Implement CL validation for GaussianPyramidSanghoon Lee
Change-Id: If879cbe15b14d97818c24d44b29fc69b6c8cb686 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/127601 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1056 - Optimizing CLGEMMMatrixMultiplyKernel refactoring the inner loopGian Marco Iodice
Results reported at: https://confluence.arm.com/display/MLENG/GEMM+FP32+performance%3A+ACL+18.05 Change-Id: I3246c4f19c4d21a7d6a44e4593bc5caffc016f81 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/127838 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-959: Add accessors for the OpenCL program cacheAnthony Barbier
Change-Id: I7920ecdf6687341cbcf4d75aecc15c4164c64636 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/127722 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-922 - CLGEMM FP16 optimizations - part1Gian Marco Iodice
This patch improves of ~20% GEMM fp16. The results has been reported at the following confluence page: https://confluence.arm.com/display/MLENG/GEMM+FP32+performance%3A+ACL+18.05 I am aware with few cases we have a bit of degradation. However this cases are memory bound anyway (Fully connected layer cases) Change-Id: I183cbb7fba55a0b5eb86532c4dca5efe096096b0 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/128044 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-811 Add NHWC data format support for CL depthwise convolution QASYMM8Giorgio Arena
Change-Id: I89de432f3fbcba7abf9e1d4f8396a4334b4fa2c2 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118324 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02COMPMID-1037 Add support for F(4x4, 5x5) in CLWinogradInputTransformKernelGiorgio Arena
Change-Id: Iac26936f46d0f7cdd9d2f8393b0092cd5a223c45 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/127675 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02COMPMID-959: Fixed order of init/destruction of CLSymbols / CLKernelLibraryAnthony Barbier
Change-Id: I6871c28db69e1580c2ece73a9294742586db81f0 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/127954 Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1031: Use LWS hints for G51, G51BIG, G51LIT, and TNOXSam Laynton
Change-Id: Ie07d9225faaef778bdcfdcb56ae42ec95962e48d Signed-off-by: Sam Laynton <sam.laynton@arm.com> Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/126735 Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1037 Add support for F(4x4, 5x5) in CLWinogradFilterTransformKernelGiorgio Arena
Change-Id: I6dd639bf5df9bc0c133996f75bdee767f70a6cfb Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/127469 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02COMPMID-959: Fixed clang-tidy formatting, made GLES builds fail for standalone=1Anthony Barbier
Change-Id: I746ef0b2f8e02349e6067139e90c2c34949cad03 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/127690 Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-584: Add validation to channel_combine kernelsIoan-Cristian Szabo
Change-Id: I67fe3fcea08704d9f4b04d22fe34db83b2697b87 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/110562 Reviewed-by: Pablo Tello <pablo.tello@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-585: Port OpticalFlow to new validationJohn Richardson
Change-Id: Ia36bd11ca27420d3059eea15df81b237900149ec Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/125175 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: John Richardson <john.richardson@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1013 - Create WinogradInfo data structureGian Marco Iodice
COMPMID-1014 - Refactoring Winograd's dataset Change-Id: I6abdcbf9a90d663f4db666cd410afece9f1d034d Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/125899 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-994 : Check cl_arm_printf is supported in the CLSchedulerVidhya Sudhan Loganathan
Introduced static and dynamic checks before using printf vendor extension features (callbacks and buffers) Change-Id: Ib38cb3d8591bbb482d02a41918f4b52efde75267 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/126751 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-949: Optimizing CLDepthwiseConvolution3x3Kernel for FP16Michele Di Giorgio
Change-Id: I2af6544eab17004c5b3de56557cb2cc5efecc915 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/122181 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
2018-11-02COMPMID-797: Switch to new graph.Georgios Pinitas
- Cleaned up build system Change-Id: If2faa27ee5b31fa8b972836960ab3ef671059c8d Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/126435 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
2018-11-02COMPMID-734: CLTuner reworkGeorgios Pinitas
Change-Id: I8f20d6ea8a09869d71003e7b08e0d33775282f6c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/125802 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-596: Port HOGDetector to new validationJohn Richardson
Change-Id: I73231fc71c5166268e6c909b7930b7e034f3794e Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118876 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1017: Implement dilated convolution in NEON, OpenCL, and GCAlex Gilday
Change-Id: If4626ec9e215e14dffe22e80812da5bac84a52e2 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/125734 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1032 - Fixing bug in CLGEMM when is_interleaved_transposed=trueGian Marco Iodice
The bug concerned the collapse of the window in CLGEMMMatrixMultiplyKernel Change-Id: I5043bf37b72eeb615ebe7fb3f2c8e72d006bf341 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/126262 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-337: Adding OpenCL SVM support.Pablo Tello
Change-Id: I250d6a1daeccf91d97b6da65aec53b02cf6046a7 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116140 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-959: Fix valid region for ScaleDiego Lopez Recas
Change-Id: Ic9ce52d772a178916dfa60fbb6456d295c06b83d Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/122647 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1019 Implement copy function CLMichalis Spyrou
Change-Id: Ica17528bf6c812d9caf9d66c612c11434ec1dc69 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/125542 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-1008: Fix Doxygen issuesAlex Gilday
Change-Id: Ie73d8771f85d1f5b059f3a56f1bbd73c98e94a38 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/124723 Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1009 Support 4x4 output tile for Winograd Filter Transform on OpenCL.Giorgio Arena
Change-Id: I68c6453e0f192de659582404f109a89616b9fbb9 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/124811 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02COMPMID-959: Set output valid region to CLPermuteKernel.Georgios Pinitas
Change-Id: Ieb62347c71cef5ac8941e632c8afa678f5db6f21 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/124813 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-935 - Implementing Convolution with Winograd on OpenCL (part 4)Gian Marco Iodice
Implemented Winograd Output Transform (2x2,3x3) on OpenCL Implemented CLWinogradConvolutionLayer on OpenCL Change-Id: I6a113fc5f052ca07f878d2b800d2ab003f84af65 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/125148 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-754: Add validation to LocallyConnected and NEDeconv layersAlex Gilday
Change-Id: Ifed8713f4d7f1315af684b30d11323db2b533f10 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/121783 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
2018-11-02COMPMID-853 Fuse CL DepthwiseConvolution with Activation for QASYM8Giorgio Arena
Change-Id: I287908f76af458ad4b4d865d353dc37e33877250 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120839 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-754: Add validation to (De)QuantizationLayersAlex Gilday
Change-Id: If8fa1277e8dc5b8e28a8bcad4ff9fc672b00ce9a Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/123275 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
2018-11-02COMPMID-935 - Implementing Convolution with Winograd on OpenCL (part 2)Gian Marco Iodice
Implemented Winograd Filter Transform 3x3 on OpenCL Change-Id: I8f2b2dd938c5c000ef7ce392a37fb7b8b4202a4e Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/122708 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02Revert "COMPMID-959: Fix valid region for Scale by always setting full shape"Georgios Pinitas
This reverts commit 77fdc0420dfd3c5009370650f963748a73f0db58. Change-Id: I1440f56f29637821e20ad2edf1055196bb69a0e2 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/124290 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Isabella Gottardi <isabella.gottardi@arm.com>
2018-11-02COMPMID-959: Fix valid region for Scale by always setting full shapeDiego Lopez Recas
Change-Id: Idc2d004713768ae73e157674d15c928cca0992d7 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/122703 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>