Age | Commit message (Collapse) | Author |
|
Change-Id: Ibabce61cf5427de80078a6468023bed05f5e7c2c
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/139006
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
OpenCL NHWC
Change-Id: Ia07e0dfcbcd07366c4bcb956e298369fb12a0369
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/138759
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
on OpenCL NCHW
Change-Id: Ia293cd89651146a0e27e5f7c74ca9c924807e83c
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/138707
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Removed QS32 references
Change-Id: Ic7df02c08ae7aa1b7dcae15bdda113321af851b8
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/138703
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Removed fixed point related code.
Change-Id: I487acf138dace3b0450e0d72ca7071eaec254566
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/137678
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I583227fc1a38b1a34de253e383d71cca66007f18
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/138273
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Change-Id: I574f7945f0be009c638d860028bce8b52b4120fd
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/136484
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
on OpenCL
Change-Id: I39667bab49daa4da009694163274a59fd3574c73
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/137595
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: Ice2bb644841fdea4e776872ff5481eb927e66bd1
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/137714
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I50e4f5e7d47e21c300f754bee2c216863075b5cf
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/136191
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Change-Id: I3dffdd1772b78db27a4374f074a24a15a9552189
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/134859
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I64b09c692a1da44413a03a3abb4b4534d138dc3d
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/136986
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Change-Id: I09adb8493fd2c438871c3d734cadf4b950c24d25
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/134822
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: Id6dece059b521e50ef546c3ee2883acedf8e3b1c
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/134760
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: If9385e6bcbf2242b973f42d6979b16ebc39f2cb4
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/136159
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
https://confluence.arm.com/display/MLENG/Winograd+Input+Transform%3A+NCHW+vs+NHWC+on+OpenCL
Change-Id: Iac35a54389266701b7d8f5434a7a37df85b7b187
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/133315
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I8c4823a0d909e19e9ef548f00b9ae98c66de61dd
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/123569
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I2e3f725ef5ed1454755086b9640ab84a81f4d40e
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/135170
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Check if the depth is multiple of tile size for NHWC if not write to
dummy padding.
Change-Id: Ie854dcbc75aa94bd1686f7769a009dd2654fdfed
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/135055
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I0e437a43d3ae0fb7d0e425e8cb8bb56314604297
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/135659
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
And extended tests coverage adding kernel shapes 3x1, 1x5 and 7x7
Change-Id: Ia7c1d4da2368d5f5fbc1a41187f4ac1aca5f150f
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/127727
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Change-Id: Ifd125fcb5451dbac3c28b15a9471048a74fee0ad
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/128987
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
https://confluence.arm.com/display/MLENG/Winograd+Output+Transform%3A+NCHW+vs+NHWC+on+OpenCL
Change-Id: I6995f5cef759ba70ebd96d545b952041b6f1f36e
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/128729
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Mismatches caused by the CL kernel computing the green value in
a different way than in NEON and C++.
Luminance values must be added after multiplying the input
UV values with the coefficients and not before.
Change-Id: I359573a98cf12f3be5c3437c28822175a5703dbb
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/134158
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: I03d6c6db13bcb565f117725bdab2b68c89a49e21
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/122185
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Change-Id: Ie218447c4f3f94a37b5dd2d3b33488c7f5869adf
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/128520
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: Ic5f197463f962bac4b23663bcef7ac744be6fc2a
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/114250
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Added
* Compile time switches for kernels using FP16 extensions
* Validation for support of atomics extension
Change-Id: Ia88e601db054ff35f1508988b5e322bd27511ac5
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/133216
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I791855edf6f821381ecb8ff0652fb14a5810d9d7
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/131912
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I717ec4d0e483966c5de0148206b9eaabe81b9179
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/132417
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: Ic82ca002220fa31d8618a55084ff1dfc2585bea7
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/131944
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Vidhya Sudhan Loganathan <vidhyasudhan.loganathan@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I1ea4db4e1ba37a736445ba991eeb08c247a6a61e
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/131393
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I22fe80393ec70e4501a4f9f9cad14014029d035d
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/129134
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Change-Id: I40faba421281b1cf080fa6a825d04a4366cdaeb0
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/130700
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
-Multiple definitions of COLS_MTX_B in gemm.cl one for FP32 and one for
FP16.
-GEMMTranspose1xWKernel invalid check fro small window sizes.
Change-Id: I9c7ddd3577aec9afc702731ca27a1e10d6eddb81
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/131023
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
layer from NHWC to NCHW and viceversa
Change-Id: If77ffeb92b6eb883e5d2d2c97c2c4d1d23d17c8d
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/129257
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
This patch improves of ~30 % GEMM fp16 when the reshape is required
The results have been reported at the following confluence page:
https://confluence.arm.com/display/MLENG/GEMM+FP16+performance%3A+ACL+18.05
Change-Id: I8233095a7e9ab06f1f915782a25dd41653b49140
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/128254
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I4838f5a8e4c33ed646cd05e0bb682fca635a29a3
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/130469
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
|
|
Change-Id: I56d2a02b316f0c69ff1fd7220e732f775414fe69
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/129709
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Change-Id: I03f32c62350e5ea43e77bb15fc5a832d83719e3b
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/126657
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: I3d91fde78b971aba3f6349f633cd9b1c50e5cacf
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/124712
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
COMPMID-1103 - CLWinogradConvolutionLayer mismatches
Change-Id: Iceaa9482a1790ec39d2720c220261aaea8043978
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/129398
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: Ie37588f60b9cfc7b1d09b1e8628fcfb4b17e0717
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/123834
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: Id540490e5faf11c466ff039a20880eeedd6e5ec7
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/128612
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Reviewed-by: Isabella Gottardi <isabella.gottardi@arm.com>
|
|
The performance achieved can be found at the following confluence page:
https://confluence.arm.com/display/MLENG/GEMM-based+convolution+vs+Winograd-based+convolution+on+OpenCL
Change-Id: I4b690cfdd4eb4ff0cd17b14fdd49ccaa1d1dc85c
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/127729
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: I0b126f03028f08687497b0d79d2e2764f7ed07c8
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/128001
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Currently we have beta and gamma compulsory in Batch normalization. There are
network that might not need one or both of those. Thus these should be optional
with beta(offset) defaulting to zero and gamma(scale) to 1. Will also reduce
some memory requirements.
Change-Id: I15bf1ec14b814be2acebf1be1a4fba9c4fbd3190
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/123237
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Results reported at:
https://confluence.arm.com/display/MLENG/GEMM+FP32+performance%3A+ACL+18.05
Change-Id: I3246c4f19c4d21a7d6a44e4593bc5caffc016f81
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/127838
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
This patch improves of ~20% GEMM fp16.
The results has been reported at the following confluence page:
https://confluence.arm.com/display/MLENG/GEMM+FP32+performance%3A+ACL+18.05
I am aware with few cases we have a bit of degradation. However this cases are
memory bound anyway (Fully connected layer cases)
Change-Id: I183cbb7fba55a0b5eb86532c4dca5efe096096b0
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/128044
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I89de432f3fbcba7abf9e1d4f8396a4334b4fa2c2
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118324
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|