author | Gian Marco Iodice <gianmarco.iodice@arm.com> | 2023-04-28 10:40:07 +0100 |
---|---|---|
committer | Gian Marco Iodice <gianmarco.iodice@arm.com> | 2023-05-02 15:53:28 +0000 |
commit | 60ab4e66ea3cb85042035fd1aafbfea666bb4ea7 | |
tree | 05818f7fafb0cf02d337b201756548152090436f /CMakeLists.txt | |
parent | d7113e4af5b5497d3a3a62dc9cf6b147e2a024cd | |
Fix export_to_cl_image issue in the fp16 GeMM implementation
- The issue affects Fp16 GeMM on Arm® Mali™-G78
- The issue was caused by a missing fallback implementation for the
case when export_to_cl_image cannot be used
- The new implementation fixes this issue and also makes the GeMM
implementation for M=1 faster (4-5% on various networks with a fully
connected layer at the end of the model)
- This patch also enables the H0=0 case in the GeMM examples
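
The fallback described above can be sketched roughly as follows. This is an illustrative sketch only, not the code from this patch: every type and function name below (RhsInfo, DeviceLimits, can_use_image, select_rhs_path) is hypothetical, and the specific conditions checked (extension support, image size limits, row-pitch alignment) are assumptions about why export_to_cl_image might be unusable.

```cpp
#include <cstddef>

// Hypothetical description of the RHS matrix; not a Compute Library type.
struct RhsInfo {
    std::size_t width_in_texels;   // row length when viewed as an image
    std::size_t height;            // number of rows
    bool        row_pitch_aligned; // assumed alignment requirement
};

// Hypothetical device capabilities; the fields are assumptions.
struct DeviceLimits {
    bool        supports_image_from_buffer;
    std::size_t max_image_width;
    std::size_t max_image_height;
};

enum class RhsPath { Image, Buffer };

// Returns true only if every assumed precondition for the image path holds.
bool can_use_image(const RhsInfo& rhs, const DeviceLimits& dev) {
    return dev.supports_image_from_buffer
        && rhs.row_pitch_aligned
        && rhs.width_in_texels <= dev.max_image_width
        && rhs.height <= dev.max_image_height;
}

// Picks the image path when requested and possible; otherwise falls back
// to the plain buffer path instead of failing.
RhsPath select_rhs_path(bool export_requested,
                        const RhsInfo& rhs,
                        const DeviceLimits& dev) {
    if (export_requested && can_use_image(rhs, dev)) {
        return RhsPath::Image;
    }
    return RhsPath::Buffer; // fallback when the image export cannot be used
}
```

The point of the sketch is that a request for export_to_cl_image is treated as a preference: when the preconditions are not met, the buffer-backed path is selected rather than leaving the case unhandled.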
Resolves COMPMID-5812, COMPMID-5688, and COMPMID-6147
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Change-Id: Ib7b355ae25337962598dd2ba21665b1a6b48686f
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/c/VisualCompute/ComputeLibrary/+/514664
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Comments-Addressed: bsgcomp <bsgcomp@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9526
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Diffstat (limited to 'CMakeLists.txt')
0 files changed, 0 insertions, 0 deletions