Diffstat (limited to 'docs')
-rw-r--r-- | docs/user_guide/release_version_and_change_log.dox | 10 |
1 files changed, 6 insertions, 4 deletions
diff --git a/docs/user_guide/release_version_and_change_log.dox b/docs/user_guide/release_version_and_change_log.dox
index d658b5354f..8bb2a3f305 100644
--- a/docs/user_guide/release_version_and_change_log.dox
+++ b/docs/user_guide/release_version_and_change_log.dox
@@ -46,11 +46,8 @@ v23.02 Public major release
  - Add the following operators to the experimental dynamic fusion API:
  - GpuAdd, GpuCast, GpuClamp, GpuDepthwiseConv2d, GpuMul, GpuOutput, GpuPool2d, GpuReshape, GpuResize, GpuSoftmax, GpuSub.
  - Add SME/SME2 kernels for GeMM, Winograd convolution, Depthwise convolution and Pooling.
+ - Add new CPU operator AddMulAdd for float and quantized types.
  - Add new flag @ref ITensorInfo::lock_paddings() to tensors to prevent extending tensor paddings.
- - Add new OpenCL kernel to compute indirect convolution:
- - \link opencl::kernels::ClIndirectConv2dKernel ClIndirectConv2dKernel \endlink
- - Add new OpenCL kernel to compute transposed convolution:
- - \link opencl::kernels::ClTransposedConvolutionKernel ClTransposedConvolutionKernel \endlink
  - Add experimental support for CPU only Bazel and CMake builds.
  - Performance optimizations:
  - Optimize CPU base-e exponential functions for FP32.
@@ -58,6 +55,11 @@ v23.02 Public major release
  - Optimize CPU quantized Subtraction by reusing the quantized Addition kernel.
  - Optimize CPU ReduceMean by removing quantization steps and performing the operation in integer domain.
  - Optimize GPU Scale and Dynamic Fusion GpuResize by removing quantization steps and performing the operation in integer domain.
+ - Update the heuristic for CLDepthwiseConvolutionNative kernel.
+ - Add new optimized OpenCL kernel to compute indirect convolution:
+ - \link opencl::kernels::ClIndirectConv2dKernel ClIndirectConv2dKernel \endlink
+ - Add new optimized OpenCL kernel to compute transposed convolution:
+ - \link opencl::kernels::ClTransposedConvolutionKernel ClTransposedConvolutionKernel \endlink
  - Update recommended/minimum NDK version to r20b.
  - Various optimizations and bug fixes.