Age | Commit message | Author |
|
Change-Id: I2af6544eab17004c5b3de56557cb2cc5efecc915
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/122181
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
|
|
Change-Id: I73231fc71c5166268e6c909b7930b7e034f3794e
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118876
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: If4626ec9e215e14dffe22e80812da5bac84a52e2
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/125734
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
The bug concerned the collapse of the window in CLGEMMMatrixMultiplyKernel
Change-Id: I5043bf37b72eeb615ebe7fb3f2c8e72d006bf341
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/126262
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: Ica17528bf6c812d9caf9d66c612c11434ec1dc69
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/125542
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: Ie73d8771f85d1f5b059f3a56f1bbd73c98e94a38
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/124723
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I68c6453e0f192de659582404f109a89616b9fbb9
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/124811
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Implemented Winograd Output Transform (2x2,3x3) on OpenCL
Implemented CLWinogradConvolutionLayer on OpenCL
Change-Id: I6a113fc5f052ca07f878d2b800d2ab003f84af65
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/125148
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I287908f76af458ad4b4d865d353dc37e33877250
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120839
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Implemented Winograd Filter Transform 3x3 on OpenCL
Change-Id: I8f2b2dd938c5c000ef7ce392a37fb7b8b4202a4e
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/122708
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I51f92f30602fb0a02314f344fa67061f448694bf
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/122793
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
This patch enables GEMM to execute multiple batches in parallel
https://confluence.arm.com/display/MLENG/Winograd%3A+batched+GEMM
Change-Id: I66222db041dd35e82af11fbb262fd1ebd3ca4b2f
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120866
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: Ica047a92d3ab199ffc65a512b9ba10e865639dfe
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/121806
Reviewed-by: Les Bell <les.bell@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: Ie5f299c7a7fbe3062cee22bb2b4ae5df818fe490
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/121178
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I5022d02f06f9d849dad76e3d9b8e48632c236429
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/121191
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I91f6a0b057f5eb84c6ac7db5abbc05c7520ed5d2
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120760
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I4404f91a270e0ba7bbb7451c4c43a485fd4a3f6c
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/121105
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Invalid conversions occur in oclgrind when clamp is used.
Removed the call to clamp in the CL kernel and replaced it with convert_sat.
Change-Id: I3cd9b87dc10c65d307fbf6eb0aec1b671fba6e97
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/121062
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
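
For readers unfamiliar with saturating conversions, the following is a minimal Python sketch of what a convert_sat style conversion does compared with a plain narrowing conversion. It is not the kernel code from this commit. The 8-bit unsigned target type and the helper names convert_uchar_sat and convert_uchar_wrap are assumptions chosen for illustration, since the message does not say which types are involved.

```python
def convert_uchar_sat(value: int) -> int:
    """Emulate a saturating int -> uchar conversion: clip to [0, 255]."""
    return max(0, min(255, value))


def convert_uchar_wrap(value: int) -> int:
    """Emulate a plain int -> uchar conversion: keep only the low 8 bits."""
    return value & 0xFF


# Hypothetical accumulator values, some outside the 8-bit range.
samples = [-7, 0, 128, 255, 300]
saturated = [convert_uchar_sat(v) for v in samples]
wrapped = [convert_uchar_wrap(v) for v in samples]

print(saturated)
print(wrapped)
```

The saturating version folds the range check into the conversion itself, so no separate clamp call is needed before narrowing.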
|
|
This new optimization achieves 36.3% MAC utilisation on Mate 9 @ 1GHz.
The performance has been reported here:
https://confluence.arm.com/display/MLENG/GEMMLowp+performance%3A+ACL+18.02
Change-Id: I71b6a217068763dfdc11bbf3574ee0eb94f93679
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118531
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I3512d67b8a72b17db1381842ca42780e39cc511c
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120605
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Change-Id: Ifb4d27ba05aa618babb79b1f8e95fbfa689c5f3a
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120792
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: Ic6097e7cf160e8b829fb521b7b99d9a57d9799d3
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118774
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: Ie0c5885a60771f728f80a8c4bdb7f1e4085fa3ee
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120267
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
sizes - Part 2 (CL)
Change-Id: I004906b9b1f11158fe17b4aa2640a7f4685fb929
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118462
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I9a607fe620f795cdea1a99fdd3f5f8c2fc76f980
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/119234
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
-Adds quantization info to the ActivationLayer benchmark fixture
-Replaces clamp with convert_sat in depthwise conv kernel
-Fixes ROIPooling execution slice
Change-Id: Ie9bbe08abcfb8278456964e476b0948247c7ecba
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118957
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
|
|
Change-Id: Ic26fed30f9a54e6adef7861c05c9d55d23ca52ca
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/119913
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: Ifa74e2bf05546de9a49aa185e22fba50438d8ad6
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/113946
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
|
|
This patch makes col2im on OpenCL 2 times faster
Change-Id: I8d90f5a72a050355ca1fd13433d8c2c26e5e33f5
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/119442
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
This patch brings MAC utilisation up to 25% when both stride_x and stride_y are equal to 1.
Performance is reported on the following Confluence page:
https://confluence.arm.com/display/MLENG/Depthwise+convolution+3x3+FP32+performance%3A+ACL+18.02
Change-Id: Ida1b64be9a88805902a3d90194559b58eb1224a3
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/119068
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Introduced optimizations for 1x1, 3x3, 5x5 and 11x11
Change-Id: Ibb7f7a9fbec01a7684746ed8513634078126e452
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118107
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
|
|
Changed CLReductionOperationKernel: each kernel now computes
a 2D slice instead of a 1D one. This reduces the memory footprint
for a 4k input image from around 1.6 GB to a few MB. The large footprint
was caused by the __local memory and was probably the cause of this bug.
Change-Id: I71ac71ff09b041c945a134177600f0f3475e48cf
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117835
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
This patch introduces a new GEMM that improves MAC utilisation
by 10% compared to the GEMM without reshape. However, this implementation
is not faster in all cases, as we need to take into account the time for
reshaping the matrices. For this reason, a heuristic to select
the optimal GEMM has been added to the function. More information
about the heuristic implementation can be found at COMPMID-852.
With this patch, GoogleNet, MobileNet, VGG16 and SqueezeNet can
improve their performance by 1.5x.
More information about the performance uplift can be found here:
https://confluence.arm.com/display/MLENG/GEMM+FP32+performance%3A+ACL+18.02
Change-Id: I024563c06b9aed02a211a974e452bae5c233b04c
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117140
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
The performance improvements have been reported on the following
Confluence page:
https://confluence.arm.com/display/MLENG/GEMMLowp+performance%3A+ACL+18.02
Config3 of McVail looks improved by 29x
Change-Id: I8b203c0b75fc368f85cea863b7eed398fab3e79a
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115783
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I8e0b7cad2f977942224d0116e8498bf9b2d6014d
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117229
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I6d97b649f1ebc289c9e6f8949e67740a6b3cbcb2
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116636
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I389e0d4104b7dde60b7cdd612a83f3328517e44c
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115804
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
necessary
Change-Id: Iea8a21f7c71025bfde6fdf7c7a7c92ba749b189b
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116673
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
MobileNet QASYMM8 dwc layers
Change-Id: I30eaea3f3625086e311ad201ef73a8f06a01e382
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116521
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: Ie00c6b08a51d30c5ce2637d40ee3d165b8a68686
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/110311
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I2021612e61de1b82aaeb49249d06929c7fceb15f
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115216
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Updated the following kernels to collapse their execution window and reduce
the number of kernel enqueues:
-CLArithmeticAddition
-CLArithmeticSubtraction
-CLPixelWiseMultiplication
Change-Id: I13d503515a20fa9be1401ead1e27e9bbc6627975
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/114878
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: Id69df4ce98d1d89bdf9c9aa5c4d909659909b30f
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/110456
Tested-by: BSG Visual Compute Jenkins server to access repositories on http://mpd-gerrit.cambridge.arm.com <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: Idaab987384d6a12a114f609abd50446fd94536b2
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/110879
Tested-by: BSG Visual Compute Jenkins server to access repositories on http://mpd-gerrit.cambridge.arm.com <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
DoD:
- Implement NEON kernel for quantizing down the gemmlowp result. The
result should be scaled by a fixed-point number
- Implement OpenCL kernel for quantizing down the gemmlowp result. The
result should be scaled by a fixed-point number
- Add test for validating the result
Required for:
- Integration of GEMMLowp in Android NN
- Convolution quantized
- Fully connected quantized
Change-Id: Ia963d25d695471e963961fb49a5600e78374ac4f
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/110981
Tested-by: BSG Visual Compute Jenkins server to access repositories on http://mpd-gerrit.cambridge.arm.com <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
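
To make "scaled by a fixed-point number" concrete, here is a minimal Python sketch of quantizing a 32-bit accumulator down to 8 bits using integer arithmetic only. It is an illustration under stated assumptions, not the NEON or OpenCL kernel from this commit. The function names, the Q0.31 multiplier format, the round-half-up rule, the output offset and the [0, 255] output range are all assumptions. A real implementation may round differently, for example with a rounding-doubling high multiply followed by a rounding shift.

```python
import math


def quantize_multiplier(real_scale: float) -> tuple[int, int]:
    """Split a scale in (0, 1) into a Q0.31 multiplier and a right shift.

    After this call: real_scale ~= multiplier / 2**31 / 2**shift.
    """
    if not 0.0 < real_scale < 1.0:
        raise ValueError("this sketch only handles scales in (0, 1)")
    # real_scale = mantissa * 2**exponent with mantissa in [0.5, 1).
    mantissa, exponent = math.frexp(real_scale)
    # Clip so the multiplier always fits in a signed 32-bit integer.
    multiplier = min(round(mantissa * (1 << 31)), (1 << 31) - 1)
    return multiplier, -exponent


def scale_by_fixed_point(acc: int, multiplier: int, shift: int) -> int:
    """Return acc * multiplier / 2**(31 + shift), rounded half up, integers only."""
    total_shift = 31 + shift
    rounding = 1 << (total_shift - 1)
    return (acc * multiplier + rounding) >> total_shift


def quantize_down(acc: int, multiplier: int, shift: int, offset: int = 0) -> int:
    """Scale the accumulator, add the (assumed) output offset, saturate to uint8."""
    scaled = scale_by_fixed_point(acc, multiplier, shift) + offset
    return max(0, min(255, scaled))


multiplier, shift = quantize_multiplier(0.25)
print(multiplier, shift)
print(quantize_down(1000, multiplier, shift))
print(quantize_down(2000, multiplier, shift))
print(quantize_down(-40, multiplier, shift))
```

Keeping the scale as an integer multiplier plus a shift avoids floating-point work in the inner loop, which is the usual motivation for this representation.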
|
|
Change-Id: I73a11ef3ff7265abce196b128413f54623d33cae
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/111294
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Tested-by: BSG Visual Compute Jenkins server to access repositories on http://mpd-gerrit.cambridge.arm.com <bsgcomp@arm.com>
|
|
Change-Id: I70e04d3a175ba366432ada98e9ca893c9f81b260
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/111094
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: BSG Visual Compute Jenkins server to access repositories on http://mpd-gerrit.cambridge.arm.com <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
-Changes the way clamping is done on the kernel side.
-Fills padding with quantized values
Change-Id: I94d17c341fd637fbb24390722162b551b62d16cb
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/111114
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: BSG Visual Compute Jenkins server to access repositories on http://mpd-gerrit.cambridge.arm.com <bsgcomp@arm.com>
|
|
Change-Id: I480eb8ad55b632c7d75b1a89e952e77b0ebbeda5
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/111158
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: BSG Visual Compute Jenkins server to access repositories on http://mpd-gerrit.cambridge.arm.com <bsgcomp@arm.com>
|
|
Change-Id: I4b5150476839649e6c3005a54f01e0788519bfb1
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/111101
Tested-by: BSG Visual Compute Jenkins server to access repositories on http://mpd-gerrit.cambridge.arm.com <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|