Age | Commit message (Collapse) | Author |
|
- The array initializer for the TILE object cannot always be utilized and so we
do require to manually initialize the TILE with the LOOP_UNROLLING macro
- Resolves COMPMID-4371
Change-Id: I2598354b9fae84c5e3bd11219fffdcdc297215e1
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5417
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
- The cl_image object can be used for the weights
- cl_image can only work for f32/f16
- Fix the implicit padding on the first dimension X
Resolves COMPMID-4341
Change-Id: I04e0901c69e7765c42afceca38c4a840645b9123
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5393
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
The issue is related with clang version, clang 3.9 has the problem, clange 4.0 works. The workaround is to add an extra {} to make this work.
Resolves: COMPMID-4348
Signed-off-by: Sheri Zhang <sheri.zhang@arm.com>
Change-Id: I2d8fc6400f32af5406fbf2d2556127a53b2ce918
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5392
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
The biases input can be nullptr, hence we need to check before
referencing.
A test is also added to ensure a successful configure and run of Direct
Convolution when there is no bias.
Resolves: COMPMID-4315
Change-Id: I23223efd6ced81215aff490221fb4606945c139b
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5322
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: James Conroy <james.conroy@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
The new function can handle different block sizes (M0, N0)
New utility macros have been developed to simplify the work and the
future OpenCL kernel development. In particular the work has been done
to also consider cases with:
- the texture pipe support
- dynamic tensor shape support
Change-Id: Ife4c64baf07517938bb8ad18e6a5f4579345c40f
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5297
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- The ARM DOT macro was using wrong variables for performing the dot
product
- K0 could be a non power of 2 values when IFM was not a multiple of 16
- Refactor the test for direct convolution NHWC
Resolves COMPMID-4135, COMPMID-4155
Change-Id: I3a2dc89ef613ae20245cfc28e76ea36c55eaf81d
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4962
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: TeresaARM <teresa.charlinreyes@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
- Pass the quantized zero value to the opencl kernel
Fixes COMPMID-3908
Change-Id: I6454c2e49f5b150a99178f2d72e0afa0a2990b54
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4884
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- Refactor direct convolution for NHWC
- Remove old kernels for NHWC
- Change the heuristic in CLConvolutionLayer.cpp. The new direct
convolution implementation is faster than FFT
Resolves COMPMID-3908
Change-Id: Iee15ce7b04e21847b6eaae5c6d3c1b18180e7efc
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4876
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
* Add FP16 to validation tests.
* Complete benchmark tests for CL and NEON Direct Convolution.
Change-Id: Ie73d8580832372db01b82b39786fd9c8be560090
Reviewed-on: http://mpd-gerrit.cambridge.arm.com/82014
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
|
|
Change-Id: I1b44dc375045964e65557f0ead57a7c12d6bf097
Reviewed-on: http://mpd-gerrit.cambridge.arm.com/81418
Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|