author Jakub Sujak <jakub.sujak@arm.com> 2023-02-10 14:36:48 +0000
committer Jakub Sujak <jakub.sujak@arm.com> 2023-02-10 17:36:23 +0000
commit a801adbb72cda705f78df97144635a41f643338c
tree ea5daac6ad12a43d6570af650a84e49d16b08747
parent 63989ebaad913417feb77c5eff732bc64c0b644d
Update release version and change log documentation

Resolves: COMPMID-5565
Change-Id: I9dca679f57f6c3cc9489669b80a5da2aba500d34
Signed-off-by: Jakub Sujak <jakub.sujak@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9122
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
-rw-r--r-- docs/user_guide/release_version_and_change_log.dox 10
1 files changed, 6 insertions, 4 deletions
diff --git a/docs/user_guide/release_version_and_change_log.dox b/docs/user_guide/release_version_and_change_log.dox
index d658b5354f..8bb2a3f305 100644
--- a/docs/user_guide/release_version_and_change_log.dox
+++ b/docs/user_guide/release_version_and_change_log.dox
@@ -46,11 +46,8 @@ v23.02 Public major release
- Add the following operators to the experimental dynamic fusion API:
- GpuAdd, GpuCast, GpuClamp, GpuDepthwiseConv2d, GpuMul, GpuOutput, GpuPool2d, GpuReshape, GpuResize, GpuSoftmax, GpuSub.
- Add SME/SME2 kernels for GeMM, Winograd convolution, Depthwise convolution and Pooling.
+ - Add new CPU operator AddMulAdd for float and quantized types.
- Add new flag @ref ITensorInfo::lock_paddings() to tensors to prevent extending tensor paddings.
- - Add new OpenCL kernel to compute indirect convolution:
- - \link opencl::kernels::ClIndirectConv2dKernel ClIndirectConv2dKernel \endlink
- - Add new OpenCL kernel to compute transposed convolution:
- - \link opencl::kernels::ClTransposedConvolutionKernel ClTransposedConvolutionKernel \endlink
- Add experimental support for CPU only Bazel and CMake builds.
- Performance optimizations:
- Optimize CPU base-e exponential functions for FP32.
@@ -58,6 +55,11 @@ v23.02 Public major release
- Optimize CPU quantized Subtraction by reusing the quantized Addition kernel.
- Optimize CPU ReduceMean by removing quantization steps and performing the operation in integer domain.
- Optimize GPU Scale and Dynamic Fusion GpuResize by removing quantization steps and performing the operation in integer domain.
+ - Update the heuristic for CLDepthwiseConvolutionNative kernel.
+ - Add new optimized OpenCL kernel to compute indirect convolution:
+ - \link opencl::kernels::ClIndirectConv2dKernel ClIndirectConv2dKernel \endlink
+ - Add new optimized OpenCL kernel to compute transposed convolution:
+ - \link opencl::kernels::ClTransposedConvolutionKernel ClTransposedConvolutionKernel \endlink
- Update recommended/minimum NDK version to r20b.
- Various optimizations and bug fixes.