author Renato Arantes <renato.arantes@arm.com> 2024-01-26 17:31:18 +0000
committer Renato Barros Arantes <renato.arantes@arm.com> 2024-03-21 11:15:30 +0000
commit 36a75dafdbe6d6a3a6f50bd075fe01f5b7dace38 (patch)
tree 0701d615ef30444b9d0789db691b59b81fd9e86e /docs/user_guide/release_version_and_change_log.dox
parent d2191150736dde66d79eb97e0c8ee506eef3c8fc (diff)
[ONCPUML-1451] Add matmul kernel to enable bf16 to bf16 operations via PyTorch® autocast() function
The full range of tests must be added with [MLINFSW-482] epic due to the lack of reordering kernels implemented in Acl.

Co-Authored-By: David Mansell <David.Mansell@arm.com>
Change-Id: I820d316295a1ec94fdc89c37e4144a268f914c36
Signed-off-by: Renato Arantes <renato.arantes@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11169
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
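For readers unfamiliar with the term, the "bf16 to bf16" matmul named in the commit subject above can be illustrated numerically with a small pure-Python sketch. This is not the Compute Library kernel and does not call any Compute Library or PyTorch API. The helper names `to_bf16` and `matmul_bf16` are hypothetical, and the choice to accumulate in higher precision before rounding the result back to bfloat16 is an assumption made for illustration, not something stated in the commit.

```python
import struct


def to_bf16(x: float) -> float:
    """Round a finite float to the nearest bfloat16 value (ties to even).

    bfloat16 keeps the top 16 bits of an IEEE-754 float32: 1 sign bit,
    8 exponent bits, and 7 explicit mantissa bits.
    """
    # Reinterpret the float32 encoding of x as an unsigned 32-bit integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Bias so that discarding the low 16 bits rounds to nearest, ties to even.
    bias = 0x7FFF + ((bits >> 16) & 1)
    rounded = ((bits + bias) >> 16) << 16
    return struct.unpack("<f", struct.pack("<I", rounded & 0xFFFFFFFF))[0]


def matmul_bf16(a, b):
    """Multiply two matrices given as lists of rows, with bf16 inputs and outputs.

    Assumption for illustration: products are accumulated in Python floats
    (double precision) and only the final sums are rounded to bfloat16.
    """
    a = [[to_bf16(v) for v in row] for row in a]
    b = [[to_bf16(v) for v in row] for row in b]
    columns = list(zip(*b))  # transpose b so each column can be iterated directly
    return [
        [to_bf16(sum(x * y for x, y in zip(row, col))) for col in columns]
        for row in a
    ]
```

In PyTorch, code normally reaches a bfloat16 matmul through `torch.autocast(device_type="cpu", dtype=torch.bfloat16)`. Whether that call is dispatched to this Compute Library kernel depends on how PyTorch was built, which the commit does not specify.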
Diffstat (limited to 'docs/user_guide/release_version_and_change_log.dox')
-rw-r--r-- docs/user_guide/release_version_and_change_log.dox | 3
1 files changed, 2 insertions, 1 deletions
diff --git a/docs/user_guide/release_version_and_change_log.dox b/docs/user_guide/release_version_and_change_log.dox
index 2d46737e96..31b756070d 100644
--- a/docs/user_guide/release_version_and_change_log.dox
+++ b/docs/user_guide/release_version_and_change_log.dox
@@ -42,10 +42,11 @@ If there is more than one release in a month then an extra sequential number is
@section S2_2_changelog Changelog
v24.04 Public major release
+ - Add Bfloat16 data type support for @ref NEMatMul.
- Optimize start-up time of @ref NEConvolutionLayer for some input configurations where GeMM is selected as the convolution algorithm
- Optimize @ref NEConvolutionLayer for input tensor size > 1e7 bytes and weight tensor height > 7
- Performance optimizations:
- - Optimize @ref NESoftmaxLayer for axis != 0 by natively supporting higher axes up to axis 3.
+ - Optimize @ref NESoftmaxLayer for axis != 0 by natively supporting higher axes up to axis 3.
v24.02.1 Public patch release
- Fix performance regression in fixed-format kernels