author | Gian Marco Iodice <gianmarco.iodice@arm.com> | 2021-04-16 15:08:59 +0100 |
---|---|---|
committer | Gian Marco Iodice <gianmarco.iodice@arm.com> | 2021-07-02 15:56:45 +0000 |
commit | 8155c0253c00aa9e26651361460c66feb39829a6 (patch) | |
tree | 41dacc432d4d1f1daa32d20d15e5120c11b9fa56 /docs/user_guide/release_version_and_change_log.dox | |
parent | 2eb5d16b839cbc28c6cb7f0de7a0bf15290b425a (diff) | |
download | ComputeLibrary-8155c0253c00aa9e26651361460c66feb39829a6.tar.gz |
Rework OpenCL Depthwise Convolution
- Remove dedicated kernels for NCHW. Now we only use NHWC with permute
- Remove specialized kernels for 3x3 NHWC
- Simplify CLDepthwiseConvolutionLayer.cpp to call just the native
implementation for both floating-point and quantized data types
- Develop two parametric OpenCL kernels for depthwise convolution layer NHWC
(floating-point and quantized)
- Add support to export the weights to cl_image
- Extend tests for depthwise convolution on OpenCL
Resolves COMPMID-4417
Change-Id: Ibe533f79c2860f9cac8e921895d5a8f947753a5c
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5893
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Diffstat (limited to 'docs/user_guide/release_version_and_change_log.dox')
-rw-r--r-- | docs/user_guide/release_version_and_change_log.dox | 12 |
1 files changed, 6 insertions, 6 deletions
diff --git a/docs/user_guide/release_version_and_change_log.dox b/docs/user_guide/release_version_and_change_log.dox
index 2d4d0358f2..0c8b57ff9f 100644
--- a/docs/user_guide/release_version_and_change_log.dox
+++ b/docs/user_guide/release_version_and_change_log.dox
@@ -62,7 +62,7 @@ v21.05 Public major release
  - @ref NEDeconvolutionLayer
  - Remove padding from OpenCL kernels:
  - @ref CLL2NormalizeLayerKernel
- - @ref CLDepthwiseConvolutionLayer3x3NHWCKernel
+ - CLDepthwiseConvolutionLayer3x3NHWCKernel
  - @ref CLNormalizationLayerKernel
  - @ref CLNormalizePlanarYUVLayerKernel
  - @ref opencl::kernels::ClMulKernel
@@ -271,7 +271,7 @@ v20.11 Public major release
  - @ref CLDepthwiseConvolutionLayerNativeKernel
  - CLDepthConvertLayerKernel
  - CLCopyKernel
- - @ref CLDepthwiseConvolutionLayer3x3NHWCKernel
+ - CLDepthwiseConvolutionLayer3x3NHWCKernel
  - CLActivationLayerKernel
  - CLWinogradFilterTransformKernel
  - CLWidthConcatenateLayerKernel
@@ -1032,10 +1032,10 @@ v18.11 Public major release
  - CLWidthConcatenateLayer
  - CLFlattenLayer
  - @ref CLSoftmaxLayer
- - Add dot product support for @ref CLDepthwiseConvolutionLayer3x3NHWCKernel non-unit stride
+ - Add dot product support for CLDepthwiseConvolutionLayer3x3NHWCKernel non-unit stride
  - Add SVE support
  - Fused batch normalization into convolution layer weights in @ref CLFuseBatchNormalization
- - Fuses activation in @ref CLDepthwiseConvolutionLayer3x3NCHWKernel, @ref CLDepthwiseConvolutionLayer3x3NHWCKernel and @ref NEGEMMConvolutionLayer
+ - Fuses activation in CLDepthwiseConvolutionLayer3x3NCHWKernel, CLDepthwiseConvolutionLayer3x3NHWCKernel and @ref NEGEMMConvolutionLayer
  - Added NHWC data layout support to:
  - @ref CLChannelShuffleLayer
  - @ref CLDeconvolutionLayer
@@ -1045,7 +1045,7 @@ v18.11 Public major release
  - NEDepthwiseConvolutionLayer3x3Kernel
  - CLPixelWiseMultiplicationKernel
  - Added FP16 support to the following kernels:
- - @ref CLDepthwiseConvolutionLayer3x3NHWCKernel
+ - CLDepthwiseConvolutionLayer3x3NHWCKernel
  - NEDepthwiseConvolutionLayer3x3Kernel
  - @ref CLNormalizePlanarYUVLayerKernel
  - @ref CLWinogradConvolutionLayer (5x5 kernel)
@@ -1286,7 +1286,7 @@ v17.09 Public major release
  - NEReshapeLayerKernel / @ref NEReshapeLayer
 
  - New OpenCL kernels / functions:
- - @ref CLDepthwiseConvolutionLayer3x3NCHWKernel @ref CLDepthwiseConvolutionLayer3x3NHWCKernel CLDepthwiseIm2ColKernel CLDepthwiseVectorToTensorKernel CLDepthwiseWeightsReshapeKernel / CLDepthwiseConvolutionLayer3x3 @ref CLDepthwiseConvolutionLayer CLDepthwiseSeparableConvolutionLayer
+ - CLDepthwiseConvolutionLayer3x3NCHWKernel CLDepthwiseConvolutionLayer3x3NHWCKernel CLDepthwiseIm2ColKernel CLDepthwiseVectorToTensorKernel CLDepthwiseWeightsReshapeKernel / CLDepthwiseConvolutionLayer3x3 @ref CLDepthwiseConvolutionLayer CLDepthwiseSeparableConvolutionLayer
  - CLDequantizationLayerKernel / CLDequantizationLayer
  - CLDirectConvolutionLayerKernel / @ref CLDirectConvolutionLayer
  - CLFlattenLayer