author | Viet-Hoa Do <viet-hoa.do@arm.com> | 2022-09-22 10:24:23 +0100 |
---|---|---|
committer | Viet-Hoa Do <viet-hoa.do@arm.com> | 2022-10-03 08:57:23 +0000 |
commit | 40b441905760846e9fdaca283a4a4de038a6ef0d | |
tree | 38a4f6b5122bfaf44a2a33e90b331a2e1a30b113 /src/cpu/kernels/add/generic/neon/impl.h | |
parent | ff81de5a9a0f6b9331c3b112cc2aed552f0482a9 | |
Optimize CPU add layer on quantized data
* Use fixed-point arithmetic where possible.
* Various optimizations for the FP32-based implementation.
  This implementation is kept as the fallback
  for unrealistic quantization parameters that exceed
  the range of the fixed-point solution.
Resolves: COMPMID-5458
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Change-Id: I221d2d3801ecaae4fe0b7cf6ae8ef00ca3743665
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8317
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Diffstat (limited to 'src/cpu/kernels/add/generic/neon/impl.h')
-rw-r--r-- | src/cpu/kernels/add/generic/neon/impl.h | 5 |
1 files changed, 5 insertions, 0 deletions
diff --git a/src/cpu/kernels/add/generic/neon/impl.h b/src/cpu/kernels/add/generic/neon/impl.h
index f8f0f517b0..e6a12fb4c0 100644
--- a/src/cpu/kernels/add/generic/neon/impl.h
+++ b/src/cpu/kernels/add/generic/neon/impl.h
@@ -35,6 +35,11 @@ void add_same_neon(const ITensor *src0, const ITensor *src1, ITensor *dst, const
 
 template <typename ScalarType>
 void add_same_neon_as_1d_array(const ITensor *src0, const ITensor *src1, ITensor *dst, const ConvertPolicy &policy, const Window &window);
+
+bool add_q8_neon_fixedpoint_possible(const ITensorInfo *src0, const ITensorInfo *src1, const ITensorInfo *dst);
+
+template <typename ScalarType>
+void add_q8_neon_fixedpoint(const ITensor *src0, const ITensor *src1, ITensor *dst, const ConvertPolicy &policy, const Window &window);
 } // namespace cpu
 } // namespace arm_compute
 #endif // SRC_CORE_NEON_KERNELS_ADD_IMPL_H
\ No newline at end of file
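
The commit message describes a fixed-point path guarded by a feasibility check, with the FP32 path kept as the fallback. The scalar sketch below illustrates that pattern only; it is not the library's implementation, which this header merely declares. Everything in it is an assumption made for illustration: the `QInfo` and `FixedCoeffs` structs, the 11 fractional bits, the coefficient limits, and all function names are hypothetical stand-ins for the NEON kernels and the `ITensorInfo`-based check.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical stand-in for per-tensor quantization parameters.
struct QInfo
{
    float   scale;
    int32_t offset;
};

// Assumed number of fractional bits for the fixed-point coefficients.
constexpr int kFracBits = 11;

// Hypothetical precomputed coefficients: out = (q0*m0 + q1*m1 + bias) >> kFracBits.
struct FixedCoeffs
{
    int32_t m0;
    int32_t m1;
    int32_t bias;
};

// Illustrative feasibility check: the coefficients must be small enough that
// the 32-bit accumulator cannot overflow for any 8-bit inputs.
bool fixedpoint_possible(const QInfo &a, const QInfo &b, const QInfo &d, FixedCoeffs &out)
{
    if(!(d.scale > 0.f))
    {
        return false;
    }
    const float one = static_cast<float>(1 << kFracBits);
    const float r0  = a.scale / d.scale * one;
    const float r1  = b.scale / d.scale * one;
    const float c   = static_cast<float>(d.offset) * one
                      - r0 * static_cast<float>(a.offset)
                      - r1 * static_cast<float>(b.offset);
    // Written as "<=" so that NaN inputs are rejected as well.
    const bool fits = std::fabs(r0) <= 32767.f && std::fabs(r1) <= 32767.f
                      && std::fabs(c) <= 1073741824.f; // 2^30
    if(!fits)
    {
        return false;
    }
    out.m0   = static_cast<int32_t>(std::lround(r0));
    out.m1   = static_cast<int32_t>(std::lround(r1));
    out.bias = static_cast<int32_t>(std::lround(c));
    return true;
}

// Integer-only path: multiply-accumulate, round, shift, saturate.
template <typename T>
T add_q8_fixed(T q0, T q1, const FixedCoeffs &c)
{
    const int32_t acc = q0 * c.m0 + q1 * c.m1 + c.bias;
    // Arithmetic right shift of negative values is assumed (guaranteed since C++20).
    const int32_t v  = (acc + (1 << (kFracBits - 1))) >> kFracBits;
    const int32_t lo = std::numeric_limits<T>::min();
    const int32_t hi = std::numeric_limits<T>::max();
    return static_cast<T>(std::min(std::max(v, lo), hi));
}

// Float fallback: dequantize, add, requantize, saturate.
template <typename T>
T add_q8_float(T q0, T q1, const QInfo &a, const QInfo &b, const QInfo &d)
{
    const float real = a.scale * static_cast<float>(q0 - a.offset)
                       + b.scale * static_cast<float>(q1 - b.offset);
    const float q  = std::round(real / d.scale) + static_cast<float>(d.offset);
    const float lo = static_cast<float>(std::numeric_limits<T>::min());
    const float hi = static_cast<float>(std::numeric_limits<T>::max());
    return static_cast<T>(std::min(std::max(q, lo), hi));
}

// Dispatch: use fixed-point when the check passes, otherwise fall back to float.
template <typename T>
T add_q8(T q0, T q1, const QInfo &a, const QInfo &b, const QInfo &d)
{
    FixedCoeffs c{};
    return fixedpoint_possible(a, b, d, c) ? add_q8_fixed(q0, q1, c)
                                           : add_q8_float(q0, q1, a, b, d);
}
```

In this sketch the check runs on every call for brevity. A real kernel would more plausibly evaluate it once per tensor configuration and then select the kernel to run, which is consistent with the check above taking `ITensorInfo` pointers rather than tensor data.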