aboutsummaryrefslogtreecommitdiff
path: root/src/core/CL/cl_kernels/direct_convolution.cl
AgeCommit message (Collapse)Author
2021-01-20Direct convolution fix for quantized data typeGian Marco Iodice
- Pass the quantized zero value to the opencl kernel Fixes COMPMID-3908 Change-Id: I6454c2e49f5b150a99178f2d72e0afa0a2990b54 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4884 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-01-19Remove padding from direct convolution - OpenCLGian Marco Iodice
- Refactor direct convolution for NHWC - Remove old kernels for NHWC - Change the heuristic in CLConvolutionLayer.cpp. The new direct convolution implementation is faster than FFT Resolves COMPMID-3908 Change-Id: Iee15ce7b04e21847b6eaae5c6d3c1b18180e7efc Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4876 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-09-17COMPMID-355 Implement CL DirectConvolution1x1SiCong Li
* Add FP16 to validation tests. * Complete benchmark tests for CL and NEON Direct Convolution. Change-Id: Ie73d8580832372db01b82b39786fd9c8be560090 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/82014 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
2018-09-17COMPMID-355 Implement 3x3 CL direct convolutionsteniu01
Change-Id: I1b44dc375045964e65557f0ead57a7c12d6bf097 Reviewed-on: http://mpd-gerrit.cambridge.arm.com/81418 Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>