aboutsummaryrefslogtreecommitdiff
path: root/src/core/CL/cl_kernels/nhwc/winograd_input_transform.cl
AgeCommit message (Collapse)Author
2023-04-26Improve Winograd performance on OpenCLGian Marco Iodice
- Performs more output elements per work-item in the case of Fp16 computation in Winograd Input/Output transform Resolves COMPMID-6018 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Change-Id: If5e6f5182eff8c1f05a3505c437d0a997490f0bd Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9447 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Jakub Sujak <jakub.sujak@arm.com> Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2022-02-09Improve start-up time for winograd_input_transform_*_nhwcramelg01
- pass tensor's dimensions at runtime rather than compile time - Add guard macro to compile only kernel(s) of internest Resolves: COMPMID-5119 Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com> Change-Id: Ib01098e397011a1201c2800c62a8954ec70e63e8 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7083 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-07-25Reorganize the kernels into nhwc, nchw and common foldersAdnan AlSinan
The Following kernels have been split into nchw/nhwc kernels files: - batchnormalization_layer - batch_to_space - channel_shuffle - depth_to_space - dequantization_layer - im2col - normalization_layer - normalize_planar_yuv_layer - normalize_planar_yuv_layer_quantized - pooling_layer - pooling_layer_quantized - remap - reorg_layer - scale - scale_quantized - space_to_batch - space_to_depth - upsample_layer - winograd_filter_transform - winograd_input_transform - winograd_output_transform The following kernels have been moved to nchw folder: - direct_convolution1x1 - direct_convolution3x3 - direct_convolution5x5 - direct_convolution_quantized - prior_box_layer The following kernels have been moved to nhwc folder: - direct_convolution - dwc_native_fp_nhwc - dwc_native_quantized_nhwc The following kernels have been removed: - sobel_filter While the rest kerenls have been moved to the common folder. Partially resolves COMPMID-4453 Signed-off-by: Adnan AlSinan <adnan.alsinan@arm.com> Change-Id: Ic327ac935687ec351c610c65a3c6357f364a5a58 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5919 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>