aboutsummaryrefslogtreecommitdiff
path: root/src/core/CL/cl_kernels/direct_convolution.cl
AgeCommit message (Collapse)Author
2021-04-13Fix TILE initialization in direct convolution and winograd transformsGian Marco Iodice
- The array initializer for the TILE object cannot always be utilized and so we do require to manually initialize the TILE with the LOOP_UNROLLING macro - Resolves COMPMID-4371 Change-Id: I2598354b9fae84c5e3bd11219fffdcdc297215e1 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5417 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-04-12Add support for cl_image in CLDirectConvolutionLayerGian Marco Iodice
- The cl_image object can be used for the weights - cl_image can only work for f32/f16 - Fix the implicit padding on the first dimension X Resolves COMPMID-4341 Change-Id: I04e0901c69e7765c42afceca38c4a840645b9123 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5393 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-04-09Fix OpenCL kernel compiling failure with array initilizerSheri Zhang
The issue is related with clang version, clang 3.9 has the problem, clange 4.0 works. The workaround is to add an extra {} to make this work. Resolves: COMPMID-4348 Signed-off-by: Sheri Zhang <sheri.zhang@arm.com> Change-Id: I2d8fc6400f32af5406fbf2d2556127a53b2ce918 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5392 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-03-26Check biases pointer before referencing in CLDirectConvolutionLayerMichele Di Giorgio
The biases input can be nullptr, hence we need to check before referencing. A test is also added to ensure a successful configure and run of Direct Convolution when there is no bias. Resolves: COMPMID-4315 Change-Id: I23223efd6ced81215aff490221fb4606945c139b Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5322 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: James Conroy <james.conroy@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-03-23Extend direct convolution (F32/F16/QASYMM8)Gian Marco Iodice
The new function can handle different block sizes (M0, N0) New utility macros have been developed to simplify the work and the future OpenCL kernel development. In particular the work has been done to also consider cases with: - the texture pipe support - dynamic tensor shape support Change-Id: Ife4c64baf07517938bb8ad18e6a5f4579345c40f Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5297 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-02-03Fix OpenCL direct convolutionGian Marco Iodice
- The ARM DOT macro was using wrong variables for performing the dot product - K0 could be a non power of 2 values when IFM was not a multiple of 16 - Refactor the test for direct convolution NHWC Resolves COMPMID-4135, COMPMID-4155 Change-Id: I3a2dc89ef613ae20245cfc28e76ea36c55eaf81d Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4962 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: TeresaARM <teresa.charlinreyes@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-01-20Direct convolution fix for quantized data typeGian Marco Iodice
- Pass the quantized zero value to the opencl kernel Fixes COMPMID-3908 Change-Id: I6454c2e49f5b150a99178f2d72e0afa0a2990b54 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4884 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-01-19Remove padding from direct convolution - OpenCLGian Marco Iodice
- Refactor direct convolution for NHWC - Remove old kernels for NHWC - Change the heuristic in CLConvolutionLayer.cpp. The new direct convolution implementation is faster than FFT Resolves COMPMID-3908 Change-Id: Iee15ce7b04e21847b6eaae5c6d3c1b18180e7efc Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4876 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-09-17COMPMID-355 Implement CL DirectConvolution1x1SiCong Li
* Add FP16 to validation tests. * Complete benchmark tests for CL and NEON Direct Convolution. Change-Id: Ie73d8580832372db01b82b39786fd9c8be560090 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/82014 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-09-17COMPMID-355 Implement 3x3 CL direct convolutionsteniu01
Change-Id: I1b44dc375045964e65557f0ead57a7c12d6bf097 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/81418 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>