Fix validation error in CL generate proposals kernel

This fix modifies some of the conversions in the generate proposals kernel that cause DDK issues when compiling the kernel. The issues are mostly related to conversion from i64 to fp16 and do not affect fp32.

First, the type size_t is replaced with unsigned int. This change alone compiled but caused mismatches, even on older devices where validation was passing before. Therefore, the fp16 conversion is delayed until vector construction, where the integers are now converted to fp32 and then to fp16. Although this may not be ideal, it seems like the best solution.

Resolves: COMPMID-6756
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Change-Id: Iee61216c908fe51431985b80c3653fc32add4741
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10879
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
author: Gunes Bayir <gunes.bayir@arm.com> 2023-12-13 11:51:05 +0000
committer: Gunes Bayir <gunes.bayir@arm.com> 2023-12-14 17:10:49 +0000
commit: feef9b98c0cdc9df773fdf2d7b17496e0fa269a0 (patch)
tree: 94a772f79d3dc3dd8c2b9b81be9f024709f138c0
parent: b7e7e1a1ea5c8e176f2099990ecd97acee5eb8ca (diff)
download: ComputeLibrary-feef9b98c0cdc9df773fdf2d7b17496e0fa269a0.tar.gz
1 files changed, 6 insertions, 8 deletions
diff --git a/src/core/CL/cl_kernels/common/generate_proposals.cl b/src/core/CL/cl_kernels/common/generate_proposals.cl
index 5b8502072a..bfe1922ac2 100644
--- a/src/core/CL/cl_kernels/common/generate_proposals.cl
+++ b/src/core/CL/cl_kernels/common/generate_proposals.cl
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2019-2021 Arm Limited.
+ * Copyright (c) 2019-2021, 2023 Arm Limited.
  *
  * SPDX-License-Identifier: MIT
  *
@@ -59,18 +59,16 @@ __kernel void generate_proposals_compute_all_anchors(
     Vector anchors = CONVERT_TO_VECTOR_STRUCT_NO_STEP(anchors);
     Vector rois    = CONVERT_TO_VECTOR_STRUCT(rois);
 
-    const size_t idx = get_global_id(0);
+    const unsigned int idx = get_global_id(0);
     // Find the index of the anchor
-    const size_t anchor_idx = idx % NUM_ANCHORS;
+    const unsigned int anchor_idx = idx % NUM_ANCHORS;
 
     // Find which shift is this thread using
-    const size_t shift_idx = idx / NUM_ANCHORS;
+    const unsigned int shift_idx = idx / NUM_ANCHORS;
 
     // Compute the shift on the X and Y direction (the shift depends exclusively by the index thread id)
-    const DATA_TYPE
-    shift_x = (DATA_TYPE)(shift_idx % WIDTH) * STRIDE;
-    const DATA_TYPE
-    shift_y = (DATA_TYPE)(shift_idx / WIDTH) * STRIDE;
+    const float shift_x = (float)(shift_idx % WIDTH) * STRIDE;
+    const float shift_y = (float)(shift_idx / WIDTH) * STRIDE;
 
     const VEC_DATA_TYPE(DATA_TYPE, NUM_ROI_FIELDS)
     shift = (VEC_DATA_TYPE(DATA_TYPE, NUM_ROI_FIELDS))(shift_x, shift_y, shift_x, shift_y);
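
The index arithmetic in the patched hunk can be reproduced on the host in plain C, which may help when checking expected shift values by hand. This is a minimal sketch, not the kernel itself. NUM_ANCHORS, WIDTH and STRIDE are compile-time macros in the kernel but are passed as ordinary parameters here, the names anchor_shift_t and compute_anchor_shift are hypothetical, and the final narrowing from fp32 to DATA_TYPE at vector construction is omitted because standard C has no half type.

```c
#include <assert.h>

/* Hypothetical result type: the per-work-item values computed in the hunk. */
typedef struct
{
    unsigned int anchor_idx; /* index of the anchor handled by this work item */
    unsigned int shift_idx;  /* index of the shift (grid cell) */
    float        shift_x;    /* shift along X, kept in fp32 */
    float        shift_y;    /* shift along Y, kept in fp32 */
} anchor_shift_t;

/* Mirrors the patched arithmetic: indices stay 32-bit unsigned integers and
 * are converted to float (fp32) before the multiplication by the stride. */
static anchor_shift_t compute_anchor_shift(unsigned int idx,
                                           unsigned int num_anchors,
                                           unsigned int width,
                                           float        stride)
{
    /* Guard against division by zero; the kernel relies on build options. */
    assert(num_anchors != 0u && width != 0u);

    anchor_shift_t out;
    out.anchor_idx = idx % num_anchors;
    out.shift_idx  = idx / num_anchors;
    out.shift_x    = (float)(out.shift_idx % width) * stride;
    out.shift_y    = (float)(out.shift_idx / width) * stride;
    return out;
}
```

For example, with 3 anchors, a width of 4 and a stride of 16, global id 23 maps to anchor 2 and shift index 7, giving a shift of (48, 16).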