Rework OpenCL Depthwise Convolution

- Remove dedicated kernels for NCHW. Now we only use NHWC with permute
- Remove specialized kernels for 3x3 NHWC
- Simplify CLDepthwiseConvolutionLayer.cpp to call just the native implementation for both floating-point and quantized data types
- Develop two parametric opencl kernels for depthwise convolution layer NHWC (floating-point and quantized)
- Add support to export the weights to cl_image
- Extend test for depthwise convolution on opencl

Resolves COMPMID-4417

Change-Id: Ibe533f79c2860f9cac8e921895d5a8f947753a5c
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5893
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
author: Gian Marco Iodice <gianmarco.iodice@arm.com> 2021-04-16 15:08:59 +0100
committer: Gian Marco Iodice <gianmarco.iodice@arm.com> 2021-07-02 15:56:45 +0000
commit: 8155c0253c00aa9e26651361460c66feb39829a6 (patch)
tree: 41dacc432d4d1f1daa32d20d15e5120c11b9fa56 /src/core/CL/cl_kernels/direct_convolution.cl
parent: 2eb5d16b839cbc28c6cb7f0de7a0bf15290b425a (diff)
download: ComputeLibrary-8155c0253c00aa9e26651361460c66feb39829a6.tar.gz
1 files changed, 1 insertions, 2 deletions
diff --git a/src/core/CL/cl_kernels/direct_convolution.cl b/src/core/CL/cl_kernels/direct_convolution.cl
index c5444cd7cc..75a7a0f004 100644
--- a/src/core/CL/cl_kernels/direct_convolution.cl
+++ b/src/core/CL/cl_kernels/direct_convolution.cl
@@ -32,10 +32,9 @@
  *
  * @note Data layout supported: NHWC
  * @note Data type supported: F32/F16/QASYMM8/QASYMM8_SIGNED
- * @note The data type must be passed at compile time using -DDATA_TYPE (e.g. -DDATA_TYPE=half)
  * @note The accumulation data type must be passed at compile time using -DACC_DATA_TYPE (e.g. -DDATA_TYPE_PROMOTED=half)
  * @note The convolution padding (left and top) must be passed at compile time using -DPAD_LEFT and -DPAD_TOP (e.g. -DPAD_LEFT=2, -DPAD_TOP=2)
- * @note The convolution strides must be passed at compile time using -DSTRIDE and -DPAD_TOP (e.g. -DPAD_LEFT=2, -DPAD_TOP=2)
+ * @note The convolution strides must be passed at compile time using -DSTRIDE_X and -DSTRIDE_Y (e.g. -DSTRIDE_X=2, -DSTRIDE_Y=2)
  * @note The spatial dimensions of the weights must be passed at compile time using -DWEI_WIDTH and -DWEI_HEIGHT (e.g. -DWEI_WIDTH=9, -DWEI_HEIGHT=9)
  * @note The spatial dimensions of the source tensor must be passed at compile time using -DSRC_WIDTH and -DSRC_HEIGHT (e.g. -DSRC_WIDTH=96, -DSRC_HEIGHT=64)
  * @note The spatial dimensions of the destination tensor must be passed at compile time using -DDST_WIDTH and -DDST_HEIGHT (e.g. -DDST_WIDTH=96, -DDST_HEIGHT=64)
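
The notes in the hunk above list compile-time defines that the kernel expects. As a rough illustration only, the sketch below assembles those defines into a single build-options string of the kind passed to an OpenCL program build. The `ConvBuildParams` struct and the `make_build_options` function are hypothetical names invented for this example and are not part of Compute Library; the kernel may also require options that are not visible in this hunk.

```cpp
#include <sstream>
#include <string>

// Hypothetical parameter bundle (not a Compute Library type).
// Field names mirror the -D defines documented in the kernel header.
struct ConvBuildParams
{
    std::string acc_data_type; // value for -DACC_DATA_TYPE, e.g. "float"
    int pad_left;
    int pad_top;
    int stride_x;
    int stride_y;
    int wei_width;
    int wei_height;
    int src_width;
    int src_height;
    int dst_width;
    int dst_height;
};

// Hypothetical helper: joins the documented defines into one
// space-separated option string.
std::string make_build_options(const ConvBuildParams &p)
{
    std::ostringstream os;
    os << "-DACC_DATA_TYPE=" << p.acc_data_type
       << " -DPAD_LEFT=" << p.pad_left
       << " -DPAD_TOP=" << p.pad_top
       << " -DSTRIDE_X=" << p.stride_x
       << " -DSTRIDE_Y=" << p.stride_y
       << " -DWEI_WIDTH=" << p.wei_width
       << " -DWEI_HEIGHT=" << p.wei_height
       << " -DSRC_WIDTH=" << p.src_width
       << " -DSRC_HEIGHT=" << p.src_height
       << " -DDST_WIDTH=" << p.dst_width
       << " -DDST_HEIGHT=" << p.dst_height;
    return os.str();
}
```

The strides are emitted as separate `-DSTRIDE_X` and `-DSTRIDE_Y` defines, matching the corrected note in this patch.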