Age | Commit message | Author |
|
Change-Id: Id6dece059b521e50ef546c3ee2883acedf8e3b1c
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/134760
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
https://confluence.arm.com/display/MLENG/Winograd+Input+Transform%3A+NCHW+vs+NHWC+on+OpenCL
Change-Id: Iac35a54389266701b7d8f5434a7a37df85b7b187
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/133315
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I2e3f725ef5ed1454755086b9640ab84a81f4d40e
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/135170
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Also extended test coverage by adding kernel shapes 3x1, 1x5 and 7x7
Change-Id: Ia7c1d4da2368d5f5fbc1a41187f4ac1aca5f150f
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/127727
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Change-Id: Ifd125fcb5451dbac3c28b15a9471048a74fee0ad
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/128987
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
https://confluence.arm.com/display/MLENG/Winograd+Output+Transform%3A+NCHW+vs+NHWC+on+OpenCL
Change-Id: I6995f5cef759ba70ebd96d545b952041b6f1f36e
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/128729
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I03d6c6db13bcb565f117725bdab2b68c89a49e21
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/122185
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Change-Id: Ie218447c4f3f94a37b5dd2d3b33488c7f5869adf
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/128520
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I013d57f6e2becbd6d2d7700ce5fbbeca670443c4
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/133735
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
|
|
Added
* Compile time switches for kernels using FP16 extensions
* Validation for support of atomics extension
Change-Id: Ia88e601db054ff35f1508988b5e322bd27511ac5
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/133216
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I507b04680a4e88426b682bd0be03bccb560ec78d
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/132589
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I40faba421281b1cf080fa6a825d04a4366cdaeb0
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/130700
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
layer from NHWC to NCHW and vice versa
Change-Id: If77ffeb92b6eb883e5d2d2c97c2c4d1d23d17c8d
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/129257
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
This patch improves GEMM fp16 performance by ~30% when the reshape is required.
The results have been reported on the following Confluence page:
https://confluence.arm.com/display/MLENG/GEMM+FP16+performance%3A+ACL+18.05
Change-Id: I8233095a7e9ab06f1f915782a25dd41653b49140
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/128254
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I56d2a02b316f0c69ff1fd7220e732f775414fe69
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/129709
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Change-Id: I3d91fde78b971aba3f6349f633cd9b1c50e5cacf
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/124712
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: Ie37588f60b9cfc7b1d09b1e8628fcfb4b17e0717
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/123834
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: Ia6a7b7a9d8b10ebf6b3c6a0fffa10bdf5dd8d8ef
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/128381
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
The performance achieved can be found on the following Confluence page:
https://confluence.arm.com/display/MLENG/GEMM-based+convolution+vs+Winograd-based+convolution+on+OpenCL
Change-Id: I4b690cfdd4eb4ff0cd17b14fdd49ccaa1d1dc85c
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/127729
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: I0b126f03028f08687497b0d79d2e2764f7ed07c8
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/128001
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Change-Id: I7920ecdf6687341cbcf4d75aecc15c4164c64636
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/127722
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
This patch improves GEMM fp16 performance by ~20%.
The results have been reported on the following Confluence page:
https://confluence.arm.com/display/MLENG/GEMM+FP32+performance%3A+ACL+18.05
I am aware that in a few cases there is a slight degradation. However, these
cases are memory bound anyway (fully connected layer cases).
Change-Id: I183cbb7fba55a0b5eb86532c4dca5efe096096b0
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/128044
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I89de432f3fbcba7abf9e1d4f8396a4334b4fa2c2
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118324
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Change-Id: Iac26936f46d0f7cdd9d2f8393b0092cd5a223c45
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/127675
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Change-Id: I6871c28db69e1580c2ece73a9294742586db81f0
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/127954
Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I6dd639bf5df9bc0c133996f75bdee767f70a6cfb
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/127469
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Change-Id: I2af6544eab17004c5b3de56557cb2cc5efecc915
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/122181
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
|
|
Change-Id: I250d6a1daeccf91d97b6da65aec53b02cf6046a7
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116140
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: Ica17528bf6c812d9caf9d66c612c11434ec1dc69
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/125542
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: I68c6453e0f192de659582404f109a89616b9fbb9
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/124811
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Implemented Winograd Output Transform (2x2,3x3) on OpenCL
Implemented CLWinogradConvolutionLayer on OpenCL
Change-Id: I6a113fc5f052ca07f878d2b800d2ab003f84af65
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/125148
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Implemented Winograd Filter Transform 3x3 on OpenCL
Change-Id: I8f2b2dd938c5c000ef7ce392a37fb7b8b4202a4e
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/122708
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I51f92f30602fb0a02314f344fa67061f448694bf
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/122793
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Change-Id: I64cb2d7f9513d69aebd9307a803b1b2c9c0e04c3
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/121929
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: Ic32742388fbd45c8acc395977586204980eff591
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/123541
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Kevin Petit <kevin.petit@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: Ie5f299c7a7fbe3062cee22bb2b4ae5df818fe490
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/121178
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
This new optimization achieves 36.3% MAC utilisation on Mate 9 @ 1GHz.
The performance has been reported here:
https://confluence.arm.com/display/MLENG/GEMMLowp+performance%3A+ACL+18.02
Change-Id: I71b6a217068763dfdc11bbf3574ee0eb94f93679
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118531
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I3512d67b8a72b17db1381842ca42780e39cc511c
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120605
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Change-Id: Ifb4d27ba05aa618babb79b1f8e95fbfa689c5f3a
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120792
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: Ic6097e7cf160e8b829fb521b7b99d9a57d9799d3
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118774
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
sizes - Part 2 (CL)
Change-Id: I004906b9b1f11158fe17b4aa2640a7f4685fb929
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118462
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
This patch brings MAC utilisation up to 25% when both stride_x and stride_y are equal to 1.
Performance is reported on the following Confluence page:
https://confluence.arm.com/display/MLENG/Depthwise+convolution+3x3+FP32+performance%3A+ACL+18.02
Change-Id: Ida1b64be9a88805902a3d90194559b58eb1224a3
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/119068
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Introduced optimizations for 1x1, 3x3, 5x5 and 11x11
Change-Id: Ibb7f7a9fbec01a7684746ed8513634078126e452
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118107
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
|
|
This patch introduces a new GEMM that improves MAC utilisation by 10%
compared to the GEMM without reshape. However, this implementation is
not faster in all cases, as we need to take into account the time for
reshaping the matrices. For this reason, a heuristic to select the
optimal GEMM has been added to the function. More information about
the heuristic implementation can be found at COMPMID-852.
With this patch, GoogleNet, MobileNet, VGG16 and SqueezeNet can
improve their performance by 1.5x.
More information about the performance uplift can be found here:
https://confluence.arm.com/display/MLENG/GEMM+FP32+performance%3A+ACL+18.02
Change-Id: I024563c06b9aed02a211a974e452bae5c233b04c
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117140
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
The performance improvements have been reported on the following
Confluence page:
https://confluence.arm.com/display/MLENG/GEMMLowp+performance%3A+ACL+18.02
Config3 of McVail looks improved by 29x
Change-Id: I8b203c0b75fc368f85cea863b7eed398fab3e79a
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115783
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I6d97b649f1ebc289c9e6f8949e67740a6b3cbcb2
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116636
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: Ie00c6b08a51d30c5ce2637d40ee3d165b8a68686
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/110311
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: Idaab987384d6a12a114f609abd50446fd94536b2
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/110879
Tested-by: BSG Visual Compute Jenkins server to access repositories on http://mpd-gerrit.cambridge.arm.com <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
DoD:
- Implement NEON kernel for quantizing down the gemmlowp result. The
result should be scaled by a fixed-point number
- Implement OpenCL kernel for quantizing down the gemmlowp result. The
result should be scaled by a fixed-point number
- Add tests for validating the result
Required for:
- Integration of GEMMLowp in Android NN
- Convolution quantized
- Fully connected quantized
Change-Id: Ia963d25d695471e963961fb49a5600e78374ac4f
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/110981
Tested-by: BSG Visual Compute Jenkins server to access repositories on http://mpd-gerrit.cambridge.arm.com <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I73a11ef3ff7265abce196b128413f54623d33cae
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/111294
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Tested-by: BSG Visual Compute Jenkins server to access repositories on http://mpd-gerrit.cambridge.arm.com <bsgcomp@arm.com>
|