author | Gian Marco Iodice <gianmarco.iodice@arm.com> | 2020-10-29 13:36:50 +0000 |
---|---|---|
committer | Gian Marco Iodice <gianmarco.iodice@arm.com> | 2020-10-30 15:35:02 +0000 |
commit | 839e19865d4b654899d1da5cfb94304841e7f210 (patch) | |
tree | 10321574df9e263036a60689fb5fb03608b2f487 /src/runtime/CL/gemm/CLGEMMKernelSelectionBifrost.h | |
parent | c4d45559b00cdbdca80296c23be5939439fbbbd0 (diff) | |
COMPMID-3930: Update CLGEMM heuristic for fp16. Mali-G76
- Since the GEMM kernel can now work without padding, the heuristic
needs to be fine-tuned to exploit this feature
- The heuristic affects Mali-G76 FP16 only
Change-Id: Ia430627f02131ad956ce2219b80c83c8e7cabaf2
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4284
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Diffstat (limited to 'src/runtime/CL/gemm/CLGEMMKernelSelectionBifrost.h')
-rw-r--r-- | src/runtime/CL/gemm/CLGEMMKernelSelectionBifrost.h | 1 |
1 files changed, 1 insertions, 0 deletions
diff --git a/src/runtime/CL/gemm/CLGEMMKernelSelectionBifrost.h b/src/runtime/CL/gemm/CLGEMMKernelSelectionBifrost.h
index a495b48301..e3cc8e4a27 100644
--- a/src/runtime/CL/gemm/CLGEMMKernelSelectionBifrost.h
+++ b/src/runtime/CL/gemm/CLGEMMKernelSelectionBifrost.h
@@ -45,6 +45,7 @@ public:
 private:
     CLGEMMKernelType g76_f32(unsigned int m, unsigned int n, unsigned int k, unsigned int b, bool is_rhs_constant);
+    CLGEMMKernelType g76_f16(unsigned int m, unsigned int n, unsigned int k, unsigned int b, bool is_rhs_constant);
     CLGEMMKernelType g71_f16(unsigned int m, unsigned int n, unsigned int k, unsigned int b, bool is_rhs_constant);
     CLGEMMKernelType default_f32(unsigned int m, unsigned int n, unsigned int k, unsigned int b, bool is_rhs_constant);
     CLGEMMKernelType default_f16(unsigned int m, unsigned int n, unsigned int k, unsigned int b, bool is_rhs_constant);
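
The header diff above only declares `g76_f16`; its definition and the code that calls it are outside this view, which is limited to the header. As a rough illustration of how a per-GPU, per-data-type heuristic with this signature could be wired into a selector, here is a minimal self-contained sketch. Everything in it is an assumption made for illustration: `KernelType`, `GpuTarget`, `DataType`, `KernelSelector` and the rules inside `g76_f16` and `default_heuristic` are hypothetical stand-ins, not the Compute Library types or the tuned heuristic from this commit.

```cpp
#include <cassert>

// Hypothetical stand-ins; these are NOT the Compute Library definitions.
enum class KernelType { Native, Reshaped, ReshapedOnlyRhs };
enum class GpuTarget { G71, G76, Other };
enum class DataType { F32, F16 };

class KernelSelector
{
public:
    explicit KernelSelector(GpuTarget gpu) : _gpu(gpu) {}

    // Route to a GPU- and type-specific heuristic when one exists,
    // otherwise fall back to a generic rule.
    KernelType select(DataType dt, unsigned int m, unsigned int n, unsigned int k, unsigned int b,
                      bool is_rhs_constant) const
    {
        assert(m > 0 && n > 0 && k > 0 && b > 0);
        if(_gpu == GpuTarget::G76 && dt == DataType::F16)
        {
            return g76_f16(m, n, k, b, is_rhs_constant);
        }
        return default_heuristic(m, n, k, b, is_rhs_constant);
    }

private:
    // Same parameter list as the declaration added in the diff.
    // The rule below is a placeholder, not the tuned heuristic.
    static KernelType g76_f16(unsigned int m, unsigned int /*n*/, unsigned int /*k*/, unsigned int /*b*/,
                              bool is_rhs_constant)
    {
        if(!is_rhs_constant)
        {
            return KernelType::Native; // RHS changes every run, so reshaping it is not amortised
        }
        // Placeholder split: vector-by-matrix vs. matrix-by-matrix
        return (m == 1) ? KernelType::ReshapedOnlyRhs : KernelType::Reshaped;
    }

    // Generic fallback, also a placeholder.
    static KernelType default_heuristic(unsigned int /*m*/, unsigned int /*n*/, unsigned int /*k*/,
                                        unsigned int /*b*/, bool is_rhs_constant)
    {
        return is_rhs_constant ? KernelType::ReshapedOnlyRhs : KernelType::Native;
    }

    GpuTarget _gpu;
};
```

With this sketch, `KernelSelector(GpuTarget::G76).select(DataType::F16, 1, 64, 64, 1, true)` takes the `g76_f16` path, while any other GPU or data type falls through to `default_heuristic`. Keeping one private function per (GPU, data type) pair lets a single target be retuned, as this commit does for Mali-G76 FP16, without touching the others.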