Age | Commit message (Collapse) | Author |
|
Change-Id: I2c6a744f174cfb6c78a9923b737f06537debaa0d
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/139758
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
element_size = data_type_size * num_channels
line_size = element_size * dimension[0] * num_channels
Change-Id: I72dd2ed7ee1f461f86232955c11e002a706cc89a
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/139833
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
|
|
COMPMID-1392: OCLGrind failures in im2col1x1_stridex1_dchw
COMPMID-1395: OCLGrind failures in output_stage_quantized
Change-Id: I35504bd1f701316df122be52d458c71bbd7e7909
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/139722
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
- Extend support for FP16 in CLReduction.
- For F16/F32 MeanStdDev we perform one reduction operation for mean
and one for stddev and we calculate the final result in the host CPU.
Change-Id: Iad2099f26c0ba7969737d22f00c6c275634d875c
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/135870
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: I9689e1a0627dc015dd2ce98417e4c97bb55581bb
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/131327
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I136f7aa4bca268abd4fbe4f6ce4bcc2708ec3671
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/139689
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
This patch implements a system for separating the "validity" from
"preferred" aspect of the current heuristics in gemm_*.cpp.
Now, each gemm_*.cpp defines a list of candidate implementations,
each of which supplies an is_valid() function (to check for
validity), an is_preferred() function (the "heuristic" part), and an
instantiate() function which actually produces the GemmCommon object
pointer.
The actual gemm() function is now templated and uses this list to
select an implementation. This patch also implements a mechanism to
identify the preferred implementation, and override it via the
GemmConfig structure.
Change-Id: Id49ab7af8bf2e3e9fd951a9698883ade234d40e1
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/139120
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Makes GEMM3D account top padding when jumping among planes.
Change-Id: Ia7c16cfa5498de106774ce42cbc4716e9f43195b
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/139612
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Seems OCLGrind to operate wrongly on some intrinsics when there is a
mixture of vectors and scalars passed to it.
Change-Id: I9e3782e739603ec59bacc3c77d91a70b1899fe3e
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/139474
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
clMeanStdDevKernel
Change-Id: I47e55d61a9d9464ef63b0da890e06f8a0b434796
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/139527
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: Id5e0795238f77c049df9c109dafc5ef878c1897d
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/139234
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: Ibabce61cf5427de80078a6468023bed05f5e7c2c
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/139006
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
This patch makes the needed infrastructure changes to allow SVE
kernels to be added later on.
Change-Id: Ide5bccac2f47278e93fff3d648231aee2d5f8c2e
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/139070
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
OpenCL NHWC
Change-Id: Ia07e0dfcbcd07366c4bcb956e298369fb12a0369
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/138759
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
on OpenCL NCHW
Change-Id: Ia293cd89651146a0e27e5f7c74ca9c924807e83c
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/138707
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Removed QS32 references
Change-Id: Ic7df02c08ae7aa1b7dcae15bdda113321af851b8
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/138703
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
The "cc" constraint was missing on the a53/a55r1 versions of this kernel.
Added "memory" to these (and the generic kernel) as well for safety.
Change-Id: I4df1b2fde43c20550ba7a51436b326f5e9e9871f
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/138812
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Pulled latest fixes from David's repo:
commit f43ebe932c84083332b0b1a0348241b69dda63a7
Author: David Mansell <David.Mansell@arm.com>
Date: Tue Jul 3 18:09:01 2018 +0100
Whitespace tidying, fixed comment in gemv_batched imported from ACL.
Change-Id: Ie37a623f44e90d88072236cb853ac55ac82d5f51
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/138530
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-by: David Mansell <david.mansell@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Removed fixed point related code.
Change-Id: I487acf138dace3b0450e0d72ca7071eaec254566
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/137678
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I412db69f31811fe4a7d262a0f146ef545731fc02
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/138500
Reviewed-by: Aron Virginas-Tar <aron.virginas-tar@arm.com>
Reviewed-by: Vidhya Sudhan Loganathan <vidhyasudhan.loganathan@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I583227fc1a38b1a34de253e383d71cca66007f18
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/138273
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Change-Id: I8e8dee355bbf708cc3abb22de867f848a22dccd6
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/138022
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
|
|
Change-Id: I78d7b4a53fe6525cc19fd49c5d555a4334e6de3b
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/137903
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
|
|
Change-Id: I574f7945f0be009c638d860028bce8b52b4120fd
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/136484
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
on OpenCL
Change-Id: I39667bab49daa4da009694163274a59fd3574c73
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/137595
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: Ice2bb644841fdea4e776872ff5481eb927e66bd1
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/137714
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I50e4f5e7d47e21c300f754bee2c216863075b5cf
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/136191
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Change-Id: I3dffdd1772b78db27a4374f074a24a15a9552189
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/134859
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Changes input_access to StaticWindow to manually add the bottom padding
that is not taken into account through RectangleAccess.
Change-Id: Id39223eaff08688c9ade37973023959faa6b42a6
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/136566
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Change-Id: I64b09c692a1da44413a03a3abb4b4534d138dc3d
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/136986
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Change-Id: I09adb8493fd2c438871c3d734cadf4b950c24d25
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/134822
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: Id6dece059b521e50ef546c3ee2883acedf8e3b1c
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/134760
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: If9385e6bcbf2242b973f42d6979b16ebc39f2cb4
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/136159
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Added NHWC to the dataset to the validation tests
Fixed a problem in the output transform which made the Activation to fail
because way/ordering the output transform wrote the data to the output tensor.
Change-Id: I9609f86605dbfef70b47a0fb043287bf0e5d675b
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/136015
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
https://confluence.arm.com/display/MLENG/Winograd+Input+Transform%3A+NCHW+vs+NHWC+on+OpenCL
Change-Id: Iac35a54389266701b7d8f5434a7a37df85b7b187
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/133315
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I8c4823a0d909e19e9ef548f00b9ae98c66de61dd
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/123569
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I2e3f725ef5ed1454755086b9640ab84a81f4d40e
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/135170
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
- part1
In this first part we reworked the configuration of the kernels as before we
passed the raw pointer to the buffer within the configuration of the function
Change-Id: I83d3cb64c562303093c7f0ae52395ecd080a5d52
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/133560
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Check if the depth is multiple of tile size for NHWC if not write to
dummy padding.
Change-Id: Ie854dcbc75aa94bd1686f7769a009dd2654fdfed
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/135055
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I0e437a43d3ae0fb7d0e425e8cb8bb56314604297
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/135659
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
And extended tests coverage adding kernel shapes 3x1, 1x5 and 7x7
Change-Id: Ia7c1d4da2368d5f5fbc1a41187f4ac1aca5f150f
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/127727
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
CLDepthwiseConvolution3x3NCHW
Change-Id: Ib2526f18bf303afd498ff85ca18c8df876f545ed
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/134546
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: Ifd125fcb5451dbac3c28b15a9471048a74fee0ad
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/128987
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: Iaabb1153c2abe0400ec79d51a21347debe92d642
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/134062
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I762a3c9add2e26b850f388a78a16861abb2bf0f9
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/134553
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
https://confluence.arm.com/display/MLENG/Winograd+Output+Transform%3A+NCHW+vs+NHWC+on+OpenCL
Change-Id: I6995f5cef759ba70ebd96d545b952041b6f1f36e
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/128729
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Mismatches caused by the CL kernel computing the green value in
a different way than in NEON and C++.
Luminance values must be added after multiplying the input
UV values with the coefficients and not before.
Change-Id: I359573a98cf12f3be5c3437c28822175a5703dbb
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/134158
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
configuring window
Now max padding is equal to 15 instead of 127. If input width is less
than 128 we decrease the number of threads in the WG.
Change-Id: I5ff0b6fd8cb46143ba49e745ec9ad01f691bdd80
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/134152
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: I03d6c6db13bcb565f117725bdab2b68c89a49e21
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/122185
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Change-Id: Ie218447c4f3f94a37b5dd2d3b33488c7f5869adf
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/128520
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|