author | Sang-Hoon Park <sang-hoon.park@arm.com> | 2019-09-18 13:39:00 +0100 |
---|---|---|
committer | Georgios Pinitas <georgios.pinitas@arm.com> | 2019-10-01 12:02:45 +0000 |
commit | 2aa7fd011a4baff52dceb00a71b3674f819df8fc (patch) | |
tree | 081a8b0a75ff130d2c6179acf1fe1f1b58943412 /src/core/CL/kernels | |
parent | 5c4a8e96460eb83a6caef1c69ea5cbb4893858d7 (diff) | |
download | ComputeLibrary-2aa7fd011a4baff52dceb00a71b3674f819df8fc.tar.gz |
COMPMID-2601 [CL] add mixed precision support to PoolingLayer
* PoolingLayerInfo is updated with a new flag.
* CL Kernel is updated to use FP32 accumulation.
* CL pooling layer test cases are added for mixed precision.
* Reference pooling layer is updated to use FP32 accumulation.
Change-Id: I4ab2167cc7f86c86293cf50a0ca5119c04dc9c7e
Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com>
Reviewed-on: https://review.mlplatform.org/c/1973
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: VidhyaSudhan Loganathan <vidhyasudhan.loganathan@arm.com>
Diffstat (limited to 'src/core/CL/kernels')
-rw-r--r-- | src/core/CL/kernels/CLPoolingLayerKernel.cpp | 6 |
1 files changed, 6 insertions, 0 deletions
diff --git a/src/core/CL/kernels/CLPoolingLayerKernel.cpp b/src/core/CL/kernels/CLPoolingLayerKernel.cpp
index 8eaf5bf76f..8e69157fdb 100644
--- a/src/core/CL/kernels/CLPoolingLayerKernel.cpp
+++ b/src/core/CL/kernels/CLPoolingLayerKernel.cpp
@@ -236,6 +236,12 @@ void CLPoolingLayerKernel::configure(const ICLTensor *input, ICLTensor *output,
     build_opts.add_option_if(data_type == DataType::F16, "-DFP16");
+    const auto use_fp_mixed_precision = (data_type == DataType::F16) && pool_info.fp_mixed_precision();
+    const auto use_wider_accumulator = use_fp_mixed_precision && (pool_type != PoolingType::MAX);
+    const auto acc_data_type = get_cl_type_from_data_type(use_wider_accumulator ? DataType::F32 : data_type);
+    build_opts.add_option("-DACC_DATA_TYPE=" + acc_data_type);
+    build_opts.add_option_if(use_wider_accumulator, "-DFP_MIXED_PRECISION");
     // Create kernel
     switch(data_layout)
     {
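The selection logic in the hunk above can be sketched as a standalone function, which makes the three cases easier to see: F16 with mixed precision and a non-MAX pool type gets an FP32 accumulator, while everything else accumulates in the input type. This is a minimal sketch, not the library's code. The enums, `cl_type_name`, and `accumulator_build_options` are hypothetical stand-ins introduced for illustration, and the `half`/`float` strings are an assumption about what `get_cl_type_from_data_type` returns for these two types.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for the library's enums (not the real arm_compute definitions).
enum class DataType { F16, F32 };
enum class PoolingType { MAX, AVG, L2 };

// Hypothetical helper; assumed to mirror get_cl_type_from_data_type for these two types.
std::string cl_type_name(DataType dt)
{
    return dt == DataType::F16 ? "half" : "float";
}

// Returns the accumulator-related build options the hunk would add.
std::vector<std::string> accumulator_build_options(DataType data_type, PoolingType pool_type, bool fp_mixed_precision)
{
    // Mixed precision only applies to FP16 inputs.
    const bool use_fp_mixed_precision = (data_type == DataType::F16) && fp_mixed_precision;
    // MAX pooling only compares values, so a wider accumulator is not selected for it.
    const bool use_wider_accumulator = use_fp_mixed_precision && (pool_type != PoolingType::MAX);
    const DataType acc_data_type = use_wider_accumulator ? DataType::F32 : data_type;

    std::vector<std::string> opts;
    opts.push_back("-DACC_DATA_TYPE=" + cl_type_name(acc_data_type));
    if(use_wider_accumulator)
    {
        opts.push_back("-DFP_MIXED_PRECISION");
    }
    return opts;
}
```

Under these assumptions, calling `accumulator_build_options(DataType::F16, PoolingType::AVG, true)` yields both `-DACC_DATA_TYPE=float` and `-DFP_MIXED_PRECISION`, whereas the same call with `PoolingType::MAX` yields only `-DACC_DATA_TYPE=half`.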