Age | Commit message (Collapse) | Author |
|
- part1
In this first part we reworked the configuration of the kernels as before we
passed the raw pointer to the buffer within the configuration of the function
Change-Id: I83d3cb64c562303093c7f0ae52395ecd080a5d52
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/133560
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Check if the depth is multiple of tile size for NHWC if not write to
dummy padding.
Change-Id: Ie854dcbc75aa94bd1686f7769a009dd2654fdfed
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/135055
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I0e437a43d3ae0fb7d0e425e8cb8bb56314604297
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/135659
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
And extended tests coverage adding kernel shapes 3x1, 1x5 and 7x7
Change-Id: Ia7c1d4da2368d5f5fbc1a41187f4ac1aca5f150f
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/127727
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Change-Id: Id24c2f07c59d863f8e1af6a1afbf6a542b2b9954
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/135142
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I42bdb9f71f14f0d82306a990f7d8a066947a4290
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/135129
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
CLDepthwiseConvolution3x3NCHW
Change-Id: Ib2526f18bf303afd498ff85ca18c8df876f545ed
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/134546
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
In case of reconfiguration there might be the need for reallocating internal
data. This patch allows resusage of already allocated memory for CLTensors only
if the newly requested memory is smaller than the previous one, otherwise an
error is thrown.
Change-Id: Ibb545d0c521f87636f8a00154b879958570ee184
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/131022
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: Ifd125fcb5451dbac3c28b15a9471048a74fee0ad
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/128987
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: Iaabb1153c2abe0400ec79d51a21347debe92d642
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/134062
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I762a3c9add2e26b850f388a78a16861abb2bf0f9
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/134553
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I8c430f2efafa0f47e2b12e388713ba693a6df8ee
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/134467
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
|
|
https://confluence.arm.com/display/MLENG/Winograd+Output+Transform%3A+NCHW+vs+NHWC+on+OpenCL
Change-Id: I6995f5cef759ba70ebd96d545b952041b6f1f36e
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/128729
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
CLFullyConnectedLayer
Change-Id: I1c3b2197906cd4b905309bbd5f2012bbae6a7dba
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/133730
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: Id1c68c3bf442c3fcff265041b260d007db7593cb
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/134027
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Mismatches caused by the CL kernel computing the green value in
a different way than in NEON and C++.
Luminance values must be added after multiplying the input
UV values with the coefficients and not before.
Change-Id: I359573a98cf12f3be5c3437c28822175a5703dbb
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/134158
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
configuring window
Now max padding is equal to 15 instead of 127. If input width is less
than 128 we decrease the number of threads in the WG.
Change-Id: I5ff0b6fd8cb46143ba49e745ec9ad01f691bdd80
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/134152
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: I03d6c6db13bcb565f117725bdab2b68c89a49e21
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/122185
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Change-Id: I915461d3216ee8b181a592a89143ee8c6bb25661
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/134054
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I125660d412945aa152cb76c78280ca0d52264b86
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/133372
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: Ie218447c4f3f94a37b5dd2d3b33488c7f5869adf
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/128520
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
- Add an entry point to allow the user to parallelise an arbitrary queue of workloads (Will be used to interleave GEMM / BufferManager)
- Added a ThreadFeeder which acts as a thread-safe work distributor
Change-Id: I3a84fb7446c453cfcd337e21338c2ccf9f29f7b3
Note: This patch doesn't introduce any change in the default strategy, therefore it shouldn't have any impact on the performance
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/133058
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: I013d57f6e2becbd6d2d7700ce5fbbeca670443c4
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/133735
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
|
|
Nodes added:
-ChannelShuffle
-Resize
-Deconvolution
-Dummy (used for performance analysis and debugging)
Change-Id: Iad19960cbbce6e25532f77bfd34b2292c0ca9781
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/131672
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: Ic5f197463f962bac4b23663bcef7ac744be6fc2a
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/114250
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Added
* Compile time switches for kernels using FP16 extensions
* Validation for support of atomics extension
Change-Id: Ia88e601db054ff35f1508988b5e322bd27511ac5
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/133216
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I507b04680a4e88426b682bd0be03bccb560ec78d
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/132589
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I96fbca08c2ad3a7415d1578fe7ec56f8a6069783
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/131946
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
|
|
Change-Id: I791855edf6f821381ecb8ff0652fb14a5810d9d7
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/131912
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I0d981a06655cdd86c71fddbd07303d781577d0fd
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/132620
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
(part 1)
- Image to MultiImage will be in part 2
Change-Id: Id2f22c39fb41a78a360d20d2c3bdecd57cdfd152
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/128321
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I717ec4d0e483966c5de0148206b9eaabe81b9179
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/132417
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: Ic82ca002220fa31d8618a55084ff1dfc2585bea7
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/131944
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Vidhya Sudhan Loganathan <vidhyasudhan.loganathan@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Improve the native GEMM so it can cope with any value for M. Also
change the selection code so that the native GEMM is selected if M is
small and nmulti is large - Winograd needs GEMMs like this and they
don't thread properly with the blocked GEMM.
(also rename gemm_batched.hpp back to gemv_batched.hpp)
Change-Id: I736c33373ada562cbc0c00540520a58103faa9d5
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/131739
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I1e0fd08f1053678cec696f20fd2f3a68dd5f1deb
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/131423
Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: Ice620385ce787b568b38fcbdddc94ef385396141
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/131355
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I1ea4db4e1ba37a736445ba991eeb08c247a6a61e
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/131393
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I6736ba4486df5ab10685ce17d41147359a4f3e80
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/131091
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I42f0e7dab38e45b5eecfe6858eaecee8939c8585
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/129291
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: Iee4ad387c41dd8ccfe31b3044d797f2d7448e552
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/126655
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I5fcbbb3b6f22204f0aaebbc319dfdf03593577e8
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/130067
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Change-Id: I5004c79ac7b10f988f25e14847f1ea2be01629da
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/131143
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
The default templated merge, and the specialised S8 12x8 merge, were
using alpha and beta the wrong way round. Fixed.
Change-Id: Ie559b665edf1eb012e8cb54ea0bca31612bcc072
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/131309
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I22fe80393ec70e4501a4f9f9cad14014029d035d
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/129134
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Remove redudant code left over from validation method refactoring.
Update output shapes in CL/ReductionOperation Validate test suite.
Change-Id: Ica846dd7f65380fa21708472e10b5bc609a32027
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/131207
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I40faba421281b1cf080fa6a825d04a4366cdaeb0
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/130700
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Since now the input transform can be multi-threaded, I re-ebaled Winograd in all graph examples
Change-Id: I39ef78243bb47fdae135e18dcae2102af0675b3b
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/131048
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
winograd conv.
Change-Id: Ibd2f2c6680b647a066255ea77d4a2a172ef76aa3
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/130418
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I54e58cb0b0cdd90bbb8dc2be4f06b76af88dc26d
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/131054
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Isabella Gottardi <isabella.gottardi@arm.com>
|
|
Change-Id: I0cfea24884066412c2f13d9acdb72ddbccac7545
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/130407
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|