path: root/src/gpu/cl/kernels/ClCastKernel.h
author: Freddie Liardet <frederick.liardet@arm.com> 2022-05-16 14:09:10 +0100
committer: Gunes Bayir <gunes.bayir@arm.com> 2022-07-22 10:18:41 +0000
commit: e572dff7adc334a98ac4a0326d66037451d5d079
tree: 9c4db3d743078de9bda67dfed674e3f371a4e238 /src/gpu/cl/kernels/ClCastKernel.h
parent: e87120731ca65c54b082734af07f748ac9651427
Add GemmLowp MMUL Reshaped Only Rhs Support for QASYMM8/QASYMM8_SIGNED
This patch introduces a GEMMLowp routine that is optimized for Arm(R) Mali(TM)-G715 and Arm(R) Mali(TM)-G615

Resolves: COMPMID-5398
Signed-off-by: Freddie Liardet <frederick.liardet@arm.com>
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Change-Id: I8d06453645688f3658b6c7c06f1ebc25a2505661
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7932
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Diffstat (limited to 'src/gpu/cl/kernels/ClCastKernel.h')
-rw-r--r-- src/gpu/cl/kernels/ClCastKernel.h | 3
1 files changed, 2 insertions, 1 deletions
diff --git a/src/gpu/cl/kernels/ClCastKernel.h b/src/gpu/cl/kernels/ClCastKernel.h
index 5c223fc5fa..7fadfa73d0 100644
--- a/src/gpu/cl/kernels/ClCastKernel.h
+++ b/src/gpu/cl/kernels/ClCastKernel.h
@@ -1,5 +1,5 @@
/*
- * Copyright (c) 2016-2021 Arm Limited.
+ * Copyright (c) 2016-2022 Arm Limited.
*
* SPDX-License-Identifier: MIT
*
@@ -49,6 +49,7 @@ public:
*
* - QSYMM8_PER_CHANNEL -> QASYMM8 (ATTENTION: it is the user's responsibility to keep track of the quantization info in the TensorInfo meta-data)
* - U8 -> S8, U16, S16, U32, S32, F16, F32
+ * - S8 -> U8, U16, S16, U32, S32, F16, F32
* - U16 -> U8, S8, S16, U32, S32, F16, F32
* - S16 -> U8, S8, U16, U32, S32, F16, F32
* - U32 -> U8, S8, U16, S16, S32, F16, F32