Age | Commit message | Author |
|
on OpenCL
Change-Id: I39667bab49daa4da009694163274a59fd3574c73
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/137595
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: Ice2bb644841fdea4e776872ff5481eb927e66bd1
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/137714
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I412420a4f02225708fcc8f446a5af5a9faf7d0a5
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/137846
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: Ie74bb71057027bca3b8a9b03b4a9f156d58b3253
Note: No performance impact as this part of the code is not currently used
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/137807
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: I50e4f5e7d47e21c300f754bee2c216863075b5cf
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/136191
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Change-Id: If0836522792717a843c1cab405afc9320ce53079
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/137162
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
During the mutating passes, accessors of optimized nodes were dropped
instead of being transferred to the appropriate tensors.
Change-Id: I29183984d94806bdfb5c92af3acefd928c0fd171
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/136036
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I3dffdd1772b78db27a4374f074a24a15a9552189
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/134859
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I6e642c8cd968240f883c327464519e57e5d0c3e3
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/136088
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Changes input_access to StaticWindow to manually add the bottom padding
that is not taken into account through RectangleAccess.
Change-Id: Id39223eaff08688c9ade37973023959faa6b42a6
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/136566
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Change-Id: I64b09c692a1da44413a03a3abb4b4534d138dc3d
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/136986
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Change-Id: I09adb8493fd2c438871c3d734cadf4b950c24d25
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/134822
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: Id6dece059b521e50ef546c3ee2883acedf8e3b1c
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/134760
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: If9385e6bcbf2242b973f42d6979b16ebc39f2cb4
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/136159
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Added NHWC to the dataset of the validation tests.
Fixed a problem in the output transform which made the Activation fail
because of the order in which the output transform wrote the data to the output tensor.
Change-Id: I9609f86605dbfef70b47a0fb043287bf0e5d675b
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/136015
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: I440df2b2af512fd874651baf28428caa6f8e0b41
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/134433
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I5b46764f9c3154ec3e3b9c951cc9e6dfbcb81dfb
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/134255
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
|
|
https://confluence.arm.com/display/MLENG/Winograd+Input+Transform%3A+NCHW+vs+NHWC+on+OpenCL
Change-Id: Iac35a54389266701b7d8f5434a7a37df85b7b187
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/133315
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
- Introduced some Hints allowing the function to set its favourite splitting method for a given workload
- Implemented the bucket split (Disabled by default)
Change-Id: I3a48dfb0bd0ec8b69a44d9c4a4c77ad3f6dc9827
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/133079
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Change-Id: I8c4823a0d909e19e9ef548f00b9ae98c66de61dd
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/123569
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I2e3f725ef5ed1454755086b9640ab84a81f4d40e
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/135170
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
- part1
In this first part we reworked the configuration of the kernels, because previously
the raw pointer to the buffer was passed within the configuration of the function.
Change-Id: I83d3cb64c562303093c7f0ae52395ecd080a5d52
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/133560
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Check if the depth is a multiple of the tile size for NHWC; if not, write to
dummy padding.
Change-Id: Ie854dcbc75aa94bd1686f7769a009dd2654fdfed
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/135055
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I0e437a43d3ae0fb7d0e425e8cb8bb56314604297
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/135659
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Also extended test coverage by adding kernel shapes 3x1, 1x5 and 7x7
Change-Id: Ia7c1d4da2368d5f5fbc1a41187f4ac1aca5f150f
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/127727
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Change-Id: Id24c2f07c59d863f8e1af6a1afbf6a542b2b9954
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/135142
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I42bdb9f71f14f0d82306a990f7d8a066947a4290
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/135129
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
CLDepthwiseConvolution3x3NCHW
Change-Id: Ib2526f18bf303afd498ff85ca18c8df876f545ed
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/134546
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
In case of reconfiguration there might be the need for reallocating internal
data. This patch allows reuse of already allocated memory for CLTensors only
if the newly requested memory is smaller than the previous allocation; otherwise an
error is thrown.
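
A minimal sketch of the reuse policy described above, not the actual CLTensor allocator code. `ReusableBuffer` and its members are hypothetical names, host memory stands in for the OpenCL buffer, and a request equal to the current capacity is assumed to count as fitting:

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a tensor's backing allocation (not the library API).
class ReusableBuffer
{
public:
    // The first request allocates. Later requests reuse the existing
    // allocation only if they fit inside it; otherwise an error is thrown.
    void request(std::size_t size)
    {
        if(_storage.empty())
        {
            _storage.resize(size);
        }
        else if(size > _storage.size())
        {
            throw std::runtime_error("Requested size exceeds the existing allocation");
        }
        _used = size;
    }

    std::size_t capacity() const { return _storage.size(); } // bytes actually allocated
    std::size_t used() const { return _used; }               // bytes of the last request

private:
    std::vector<unsigned char> _storage{};
    std::size_t                _used{ 0 };
};
```

Reconfiguring from 64 bytes down to 16 keeps the 64-byte allocation; asking for 128 afterwards throws.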
Change-Id: Ibb545d0c521f87636f8a00154b879958570ee184
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/131022
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: Ifd125fcb5451dbac3c28b15a9471048a74fee0ad
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/128987
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: Iaabb1153c2abe0400ec79d51a21347debe92d642
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/134062
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I762a3c9add2e26b850f388a78a16861abb2bf0f9
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/134553
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I8c430f2efafa0f47e2b12e388713ba693a6df8ee
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/134467
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
|
|
https://confluence.arm.com/display/MLENG/Winograd+Output+Transform%3A+NCHW+vs+NHWC+on+OpenCL
Change-Id: I6995f5cef759ba70ebd96d545b952041b6f1f36e
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/128729
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
CLFullyConnectedLayer
Change-Id: I1c3b2197906cd4b905309bbd5f2012bbae6a7dba
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/133730
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: Id1c68c3bf442c3fcff265041b260d007db7593cb
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/134027
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Mismatches were caused by the CL kernel computing the green value
differently from NEON and C++.
Luminance values must be added after multiplying the input
UV values with the coefficients and not before.
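
A sketch of the intended ordering for the green channel, not the kernel code itself. The BT.709-style coefficients, the 128 chroma offset and the helper names are assumptions made for illustration:

```cpp
#include <algorithm>
#include <cstdint>

// Assumed BT.709-style chroma weights for the green channel (illustrative only).
constexpr float kGreenFromU = 0.1873f;
constexpr float kGreenFromV = 0.4681f;

// Clamp to the 8-bit range, then truncate.
inline std::uint8_t saturate_u8(float value)
{
    return static_cast<std::uint8_t>(std::min(255.0f, std::max(0.0f, value)));
}

// u and v are raw 8-bit chroma samples centred on 128.
inline std::uint8_t green_from_yuv(std::uint8_t y, std::uint8_t u, std::uint8_t v)
{
    // 1) Multiply the centred chroma values by their coefficients.
    const float chroma = -(kGreenFromU * (u - 128)) - (kGreenFromV * (v - 128));
    // 2) Only then add the luminance and saturate.
    return saturate_u8(static_cast<float>(y) + chroma);
}
```

With neutral chroma (u = v = 128) the green value equals the luminance, which gives a quick sanity check across backends.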
Change-Id: I359573a98cf12f3be5c3437c28822175a5703dbb
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/134158
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
configuring window
The maximum padding is now 15 instead of 127. If the input width is less
than 128, we decrease the number of threads in the work-group (WG).
Change-Id: I5ff0b6fd8cb46143ba49e745ec9ad01f691bdd80
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/134152
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: I03d6c6db13bcb565f117725bdab2b68c89a49e21
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/122185
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Change-Id: I915461d3216ee8b181a592a89143ee8c6bb25661
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/134054
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I125660d412945aa152cb76c78280ca0d52264b86
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/133372
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: Ie218447c4f3f94a37b5dd2d3b33488c7f5869adf
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/128520
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
- Added an entry point to allow the user to parallelise an arbitrary queue of workloads (will be used to interleave GEMM / BufferManager)
- Added a ThreadFeeder which acts as a thread-safe work distributor
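
A minimal sketch of how a thread-safe work distributor of this kind could look, assuming an atomic counter hands out workload indices. `WorkFeeder`, `get_next` and `drain` are hypothetical names, not the ThreadFeeder interface from this patch:

```cpp
#include <atomic>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical distributor: hands out each index in [start, end) exactly once,
// even when several threads call get_next() concurrently.
class WorkFeeder
{
public:
    WorkFeeder(std::size_t start, std::size_t end)
        : _next(start), _end(end)
    {
    }

    // Writes the next workload index to `index`; returns false once the queue is drained.
    bool get_next(std::size_t &index)
    {
        index = _next.fetch_add(1, std::memory_order_relaxed);
        return index < _end;
    }

private:
    std::atomic<std::size_t> _next;
    const std::size_t        _end;
};

// Loop that each worker thread would run: keep pulling workloads until none are left.
inline std::size_t drain(WorkFeeder &feeder, const std::vector<std::function<void()>> &workloads)
{
    std::size_t executed = 0;
    std::size_t index    = 0;
    while(feeder.get_next(index))
    {
        workloads[index]();
        ++executed;
    }
    return executed;
}
```

Each worker calls `drain` on the same feeder, so faster threads simply pick up more workloads without a fixed split.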
Change-Id: I3a84fb7446c453cfcd337e21338c2ccf9f29f7b3
Note: This patch doesn't introduce any change in the default strategy, therefore it shouldn't have any impact on the performance
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/133058
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: I013d57f6e2becbd6d2d7700ce5fbbeca670443c4
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/133735
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
|
|
Nodes added:
-ChannelShuffle
-Resize
-Deconvolution
-Dummy (used for performance analysis and debugging)
Change-Id: Iad19960cbbce6e25532f77bfd34b2292c0ca9781
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/131672
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: Ic5f197463f962bac4b23663bcef7ac744be6fc2a
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/114250
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Added:
* Compile-time switches for kernels using FP16 extensions
* Validation for support of atomics extension
Change-Id: Ia88e601db054ff35f1508988b5e322bd27511ac5
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/133216
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I507b04680a4e88426b682bd0be03bccb560ec78d
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/132589
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I96fbca08c2ad3a7415d1578fe7ec56f8a6069783
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/131946
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
|
|
Change-Id: I791855edf6f821381ecb8ff0652fb14a5810d9d7
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/131912
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|