aboutsummaryrefslogtreecommitdiff
path: root/src/core/CL/cl_kernels/common
AgeCommit message (Collapse)Author
2022-07-22Add GemmLowp MMUL Reshaped Only Rhs Support for QASYMM8/QASYMM8_SIGNEDFreddie Liardet
This patch introduces a GEMMLowp routine that is optimized for Arm(R) Mali(TM)-G715 and Arm(R) Mali(TM)-G615 Resolves: COMPMID-5398 Signed-off-by: Freddie Liardet <frederick.liardet@arm.com> Signed-off-by: Gunes Bayir <gunes.bayir@arm.com> Change-Id: I8d06453645688f3658b6c7c06f1ebc25a2505661 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7932 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2022-07-13Add Gemm MMUL Reshaped Only Rhs Support for FP32/FP16Gunes Bayir
This patch introduces a GEMM routine that is optimized for Arm(R) Mali(TM)-G715 and Arm(R) Mali(TM)-G615 Resolves: COMPMID-5216 Signed-off-by: Gunes Bayir <gunes.bayir@arm.com> Change-Id: I2e5d7806f5904347185bb3e250f73d73d6669dba Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7914 Reviewed-by: SiCong Li <sicong.li@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2022-06-27Implement new Elementwise Dynamic Fusion Operators: Div, FloorMichalis Spyrou
Resolves: COMPMID-5355 Change-Id: I92f73fbe885f28bbe7b07965b90cfd807c93602f Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7745 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com>
2022-05-31Add cl_khr_integer_dot_product extension supportViet-Hoa Do
* Replace arm_dot(_acc) with dot when cl_khr_integer_dot_product extension is available. Resolves: COMPMID-5206 Change-Id: I7fd763e2421987584e4dae271008972644ea2f41 Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7647 Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2022-03-08Merge kernel prototype patchGiorgio Arena
Resolves: COMPMID-5151 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Change-Id: Ic4024d5cd4819fe917a1d49621f1866ae2e90a37 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7260 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-02-11Improve start-up time for concatenation layersramelg01
- pass tensor's dimensions at runtime rather than compile time - Add guard macro to compile only kernel of internest Resolves: COMPMID-5121 Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com> Change-Id: I76b7c0cf56d803f58ebff5494c904ace2a86ef5a Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7097 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-02-02Revert "Rework gemm_mm_reshaped_only_rhs_ kernels with new macros"Ramy Elgammal
This reverts commit 10e88a7351 "Rework gemm_mm_reshaped_only_rhs_ kernels with new macros" Resolves: COMPMID-5095 Signed-off-by: Ramy Elgammal<ramy.elgammal@arm.com> Change-Id: I46e167882f072e7508b6101d295accb6e089e740 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7045 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2022-01-25Rework gemm_mm_reshaped_only_rhs_ kernels with new macrosGian Marco Iodice
- Rework gemm_reshaped_rhs_only with new TILE macros - Fuse post ops in gemm_reshaped_rhs_only Resolves COMPMID-4890 Change-Id: I944948ecec6d08deaf3545b80cd3eeac26e44205 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6944 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Sheri Zhang <sheri.zhang@arm.com>
2021-12-23Rework gemm_reshape_lhs_ with new macrosAdnan AlSinan
Resolves COMPMID-4892 Signed-off-by: Adnan AlSinan <adnan.alsinan@arm.com> Change-Id: I52f23ca293506fc693ae829daccc6e889a050752 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6833 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-11-26Rework gemm_reshape_rhs_(nt,t) with new macrosGian Marco Iodice
Resolves COMPMID-4891 Change-Id: Ifdf2a0eaed23347a1b4465ea8d58c11b72083952 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6741 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
2021-11-20Improve start-up timer for GeMM (floating-point):ramelg01
- Pass M,N,K at runtime as kernel parameters - Add a guard macro to compile only kernel of interest - Move reshpaing kernels to gemm_utils.cl - Remove the fallback reshaping kernel with Y-Padding support Resolves: COMPMID-4888 Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com> Change-Id: Ida3851326f0b77e410633271de9ecca106e37931 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6662 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-11-04Add validate tests for CLConvolutionLayer and CLGEMMConvolutionLayer with ↵SiCongLi
post ops * Add validate tests * Restrict post ops support in ClGemmConv2d to only those that do not need im2col or col2im. In practice this means we only support post ops in conv1x1 with stride = 1, dilation = 1 and data layout = NHWC Resolves COMPMID-4435 Change-Id: I1fdf0c5d565a4624857250075ac76db35c2f383b Signed-off-by: SiCongLi <sicong.li@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6573 Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-11-04Add PRelu to supported PostOps in:ramelg01
- ClGemmMatrixMultiplyReshapedKernel - ClGemmMatrixMultiplyNativeKernel - ClGemmMatrixMultiplyReshapedOnlyRhsKernel Resolves: COMPMID-4713 Change-Id: I3adcb1b3d4af37ebcbc3bee19cc1845885d08600 Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6553 Reviewed-by: SiCong Li <sicong.li@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-11-03Fix out-of-bound reads in cl gemm kernelsSiCongLi
* Revert "Remove padding in FP Cl Gemm kernels" This reverts commit 48717a3d38fef8d316cd4b9fd9a3bc1a43db736b. * Allow different boundary row handling strategies across native, reshaped and reshaped_only_rhs kernels by introducing a ELTWISE_OPERAND_ROW parameter to the macro Resolves COMPMID-4919 Change-Id: Icefc23c0760a6abb838fef1d0d5bda06b07c79e3 Signed-off-by: SiCongLi <sicong.li@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6569 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2021-11-02Add post ops to ClGemmMatrixMultiplyReshapedOnlyRHSKernel and ↵SiCongLi
ClGemmMatrixMultiplyNativeKernel Part 3 Partially resolves: COMPMID-4435 Change-Id: Ifc5affa3a24a70942ca2d001380205df09b03ad7 Signed-off-by: SiCongLi <sicong.li@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6550 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-11-01Remove padding in FP Cl Gemm kernelsSiCongLi
* Remove rhs and bias padding in ClGemmMatrixMultiplyNativeKernel * Rework ClGemmMatrixMultiplyReshapedOnlyRHSKernel to use the same padding boundary condition as the other kernels Partially resolves COMPMID-4435 Change-Id: I1c17af9cca0b5cb3be087ce160948b7b0e62d297 Signed-off-by: SiCongLi <sicong.li@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6549 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-10-28Add experimental PostOp interface to ClGemmMatrixMultiplyReshapedKernel Part 1SiCongLi
This interface supports the fusion of multiple elementwise operations Partially resolves: COMPMID-4435 Change-Id: If68dd7dd98dcf239fde7cb1f0a4a6d4d1e899a6f Signed-off-by: SiCongLi <sicong.li@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6483 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-10-18Remove legacy GeMM kernels on OpenCLGian Marco Iodice
Resolves COMPMID-4446 Change-Id: I1d3c2391b67681f4d3af440826aa95b47a1288a6 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6444 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-10-14Implement CLDirectConv3D f32/f16Giorgio Arena
Resolve COMPMID-4660 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Change-Id: Ibd66ec1eb6faa60086981b1e3a9c12561df3445f Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6420 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2021-10-13Improve performance of Softmax uint8 on GPUAdnan AlSinan
Resolves COMPMID-4805 Change-Id: I0acd4479f196cf9518995a60d3b57a9a49e0db57 Signed-off-by: Adnan AlSinan <adnan.alsinan@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6413 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Pablo Marquez Tello <pablo.tello@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Reviewed-by: Freddie Liardet <frederick.liardet@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2021-09-23Fix inefficient store in gemmlowp_mm_reshaped_only_rhs_tGian Marco Iodice
- The out-of-boundary condition was performed also for PARTIAL_STORE_N0 = 0 Resolves: COMPMID-4774 COMPMID-4771 Change-Id: I0d7e078c67615b513ffeb66860f224999b5135fa Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6302 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
2021-09-09Remove padding from ClGemmMatrixMultiplyReshapedOnlyRhsKernelGiorgio Arena
Resolve COMPMID-4450 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Change-Id: I6f280d5d66ec43fb5cb06c83fe15a1f227ad165d Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6232 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-09-07Remove padding from ClGemmMatrixMultiplyReshapedKernelGiorgio Arena
Create new macros for loading values from memory while being aware of boundaries of the tensor to not generate page faults. Resolves: COMPMID-4447 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Change-Id: Ia5fd0a5dcb40942bccd5e686307d0055e1a1dd82 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6226 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-09-06Revert "Remove padding from ClGemmMatrixMultiplyReshapedKernel"Pablo Marquez Tello
This reverts commit 50335fd3d0734157382741fcf1bfdaf630c60c4b. Resolves COMPMID-4792 Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com> Change-Id: Ia6580143d9cf5a7bd5c87ca4214022f7c241ec6f Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6214 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-by: Sheri Zhang <sheri.zhang@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-09-03Remove padding from ClPool2dKernel NCHWGiorgio Arena
- Simplify NCHW kernel structure by removing old optimized paths - Merge quantized with fp kernels Resolve COMPMID-4722 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Change-Id: I79016b119619aed6a6193295601cd6517f14b88c Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6183 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2021-09-01Remove padding from ClGemmMatrixMultiplyReshapedKernelMichele Di Giorgio
Create new macros for loading values from memory while being aware of boundaries of the tensor to not generate page faults. Resolves: COMPMID-4447 Change-Id: If9a455291e395ebd9070ebe5e120b3064d8fab29 Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6168 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-08-23Remove padding from ClScaleKernelGiorgio Arena
- Merge quantized kernels with fp for bilinear interpolation (both NCHW and NHWC) - Pass dimensions at compile time rather than at run time - Use tile-based approach to rework the NCHW kernels - Remove unused functions/files Resolve COMPMID-4723 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Change-Id: Ifcdf02beb9daa9f318395751b3c85eb2fe874082 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6138 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2021-07-25Reorganize the kernels into nhwc, nchw and common foldersAdnan AlSinan
The Following kernels have been split into nchw/nhwc kernels files: - batchnormalization_layer - batch_to_space - channel_shuffle - depth_to_space - dequantization_layer - im2col - normalization_layer - normalize_planar_yuv_layer - normalize_planar_yuv_layer_quantized - pooling_layer - pooling_layer_quantized - remap - reorg_layer - scale - scale_quantized - space_to_batch - space_to_depth - upsample_layer - winograd_filter_transform - winograd_input_transform - winograd_output_transform The following kernels have been moved to nchw folder: - direct_convolution1x1 - direct_convolution3x3 - direct_convolution5x5 - direct_convolution_quantized - prior_box_layer The following kernels have been moved to nhwc folder: - direct_convolution - dwc_native_fp_nhwc - dwc_native_quantized_nhwc The following kernels have been removed: - sobel_filter While the rest kerenls have been moved to the common folder. Partially resolves COMPMID-4453 Signed-off-by: Adnan AlSinan <adnan.alsinan@arm.com> Change-Id: Ic327ac935687ec351c610c65a3c6357f364a5a58 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5919 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>