author | Viet-Hoa Do <viet-hoa.do@arm.com> | 2022-09-21 11:31:46 +0100 |
---|---|---|
committer | Viet-Hoa Do <viet-hoa.do@arm.com> | 2022-10-03 16:46:42 +0000 |
commit | b5368fb3da65ca1d31e6acd6cd45b8b6b789f1eb (patch) | |
tree | 90786fcb5f55f90fec6124da6b241cb56ce0d4af /src/gpu/cl/kernels/ClDirectConv2dKernel.cpp | |
parent | 304dfdba67958f5987d88ad0ce538399c3e50bc8 (diff) | |
download | ComputeLibrary-b5368fb3da65ca1d31e6acd6cd45b8b6b789f1eb.tar.gz |
Force CL kernel compilation with 64 registers
* For DDK version 30 and higher, force the CL compiler to use
64 registers for NHWC direct convolution.
Resolves: COMPMID-5508
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Change-Id: I7d9ecc3b5a4eceaff44542cd26f6f05e30ab2c1f
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8351
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com>
Diffstat (limited to 'src/gpu/cl/kernels/ClDirectConv2dKernel.cpp')
-rw-r--r-- | src/gpu/cl/kernels/ClDirectConv2dKernel.cpp | 5 |
1 files changed, 5 insertions, 0 deletions
diff --git a/src/gpu/cl/kernels/ClDirectConv2dKernel.cpp b/src/gpu/cl/kernels/ClDirectConv2dKernel.cpp
index c4b70ca82b..722c802138 100644
--- a/src/gpu/cl/kernels/ClDirectConv2dKernel.cpp
+++ b/src/gpu/cl/kernels/ClDirectConv2dKernel.cpp
@@ -292,6 +292,11 @@ void ClDirectConv2dKernel::configure(const CLCompileContext &compile_context, IT
             build_options.add_option_if(act_info.enabled(), "-DA_VAL=" + float_to_string_with_full_precision(act_info.a()));
             build_options.add_option_if(act_info.enabled(), "-DB_VAL=" + float_to_string_with_full_precision(act_info.b()));
         }
+
+        if(compile_context.get_ddk_version() >= 30)
+        {
+            build_options.add_option("-fregister-allocation=64");
+        }
     }
     else
     {
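The version gate added by this patch can be tried in isolation with the minimal sketch below. It is not the library's code: `BuildOptionsSketch` and `add_register_allocation_option` are hypothetical names introduced here for illustration, and the plain `int ddk_version` parameter is an assumption standing in for the value returned by `compile_context.get_ddk_version()`. Only the `>= 30` condition and the `-fregister-allocation=64` string are taken from the diff.

```cpp
#include <set>
#include <string>

// Hypothetical stand-in for the build-options container used in the patch.
// It only models add_option(); it is not the real ComputeLibrary class.
struct BuildOptionsSketch
{
    std::set<std::string> options;

    void add_option(const std::string &option)
    {
        options.insert(option);
    }
};

// Mirrors the condition from the diff: the flag is added only when the
// DDK version is 30 or higher. `ddk_version` is an assumed plain integer
// standing in for compile_context.get_ddk_version().
inline void add_register_allocation_option(BuildOptionsSketch &build_options, int ddk_version)
{
    if(ddk_version >= 30)
    {
        build_options.add_option("-fregister-allocation=64");
    }
}
```

Under these assumptions, calling the helper with version 29 leaves the option set unchanged, while version 30 or later adds the single compiler flag.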