diff options
author | Gian Marco Iodice <gianmarco.iodice@arm.com> | 2021-04-16 15:08:59 +0100 |
---|---|---|
committer | Gian Marco Iodice <gianmarco.iodice@arm.com> | 2021-06-24 11:16:30 +0000 |
commit | 561c176598cd14245e2e7918fdf136d1c888d1da (patch) | |
tree | 82adfff6de30292dabbbcc7ced4ae35cac3d45cf /docs/user_guide/release_version_and_change_log.dox | |
parent | 31c7c26822270f1c4952c8973aa8bfb38e0a7c68 (diff) | |
download | ComputeLibrary-561c176598cd14245e2e7918fdf136d1c888d1da.tar.gz |
Rework OpenCL Depthwise Convolution
- Remove dedicated kernels for NCHW. Now we only use NHWC with permute
- Remove specialized kernels for 3x3 NHWC
- Simplify CLDepthwiseConvolutionLayer.cpp to call just the native
implementation for both floating-point and quantized data types
- Develop two parametric opencl kernels for depthwise convolution layer NHWC
(floating-point and quantized)
- Add support to export the weights to cl_image
- Extend test for depthwise convolution on opencl
Resolves COMPMID-4417
Change-Id: I253dd5d959a70783c82e62b1771a5e9f91621cb0
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5806
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Diffstat (limited to 'docs/user_guide/release_version_and_change_log.dox')
-rw-r--r-- | docs/user_guide/release_version_and_change_log.dox | 12 |
1 files changed, 6 insertions, 6 deletions
diff --git a/docs/user_guide/release_version_and_change_log.dox b/docs/user_guide/release_version_and_change_log.dox index b847761b8c..fd3806a19b 100644 --- a/docs/user_guide/release_version_and_change_log.dox +++ b/docs/user_guide/release_version_and_change_log.dox @@ -62,7 +62,7 @@ v21.05 Public major release - @ref NEDeconvolutionLayer - Remove padding from OpenCL kernels: - @ref CLL2NormalizeLayerKernel - - @ref CLDepthwiseConvolutionLayer3x3NHWCKernel + - CLDepthwiseConvolutionLayer3x3NHWCKernel - @ref CLNormalizationLayerKernel - @ref CLNormalizePlanarYUVLayerKernel - @ref opencl::kernels::ClMulKernel @@ -271,7 +271,7 @@ v20.11 Public major release - @ref CLDepthwiseConvolutionLayerNativeKernel - CLDepthConvertLayerKernel - CLCopyKernel - - @ref CLDepthwiseConvolutionLayer3x3NHWCKernel + - CLDepthwiseConvolutionLayer3x3NHWCKernel - CLActivationLayerKernel - CLWinogradFilterTransformKernel - CLWidthConcatenateLayerKernel @@ -1032,10 +1032,10 @@ v18.11 Public major release - CLWidthConcatenateLayer - CLFlattenLayer - @ref CLSoftmaxLayer - - Add dot product support for @ref CLDepthwiseConvolutionLayer3x3NHWCKernel non-unit stride + - Add dot product support for CLDepthwiseConvolutionLayer3x3NHWCKernel non-unit stride - Add SVE support - Fused batch normalization into convolution layer weights in @ref CLFuseBatchNormalization - - Fuses activation in @ref CLDepthwiseConvolutionLayer3x3NCHWKernel, @ref CLDepthwiseConvolutionLayer3x3NHWCKernel and @ref NEGEMMConvolutionLayer + - Fuses activation in CLDepthwiseConvolutionLayer3x3NCHWKernel, CLDepthwiseConvolutionLayer3x3NHWCKernel and @ref NEGEMMConvolutionLayer - Added NHWC data layout support to: - @ref CLChannelShuffleLayer - @ref CLDeconvolutionLayer @@ -1045,7 +1045,7 @@ v18.11 Public major release - NEDepthwiseConvolutionLayer3x3Kernel - CLPixelWiseMultiplicationKernel - Added FP16 support to the following kernels: - - @ref CLDepthwiseConvolutionLayer3x3NHWCKernel + - CLDepthwiseConvolutionLayer3x3NHWCKernel - NEDepthwiseConvolutionLayer3x3Kernel - @ref CLNormalizePlanarYUVLayerKernel - @ref CLWinogradConvolutionLayer (5x5 kernel) @@ -1286,7 +1286,7 @@ v17.09 Public major release - NEReshapeLayerKernel / @ref NEReshapeLayer - New OpenCL kernels / functions: - - @ref CLDepthwiseConvolutionLayer3x3NCHWKernel @ref CLDepthwiseConvolutionLayer3x3NHWCKernel CLDepthwiseIm2ColKernel CLDepthwiseVectorToTensorKernel CLDepthwiseWeightsReshapeKernel / CLDepthwiseConvolutionLayer3x3 @ref CLDepthwiseConvolutionLayer CLDepthwiseSeparableConvolutionLayer + - CLDepthwiseConvolutionLayer3x3NCHWKernel CLDepthwiseConvolutionLayer3x3NHWCKernel CLDepthwiseIm2ColKernel CLDepthwiseVectorToTensorKernel CLDepthwiseWeightsReshapeKernel / CLDepthwiseConvolutionLayer3x3 @ref CLDepthwiseConvolutionLayer CLDepthwiseSeparableConvolutionLayer - CLDequantizationLayerKernel / CLDequantizationLayer - CLDirectConvolutionLayerKernel / @ref CLDirectConvolutionLayer - CLFlattenLayer |