aboutsummaryrefslogtreecommitdiff
path: root/src/core/CL/kernels
AgeCommit message (Collapse)Author
2021-04-19CLInstanceNormalizationLayer NHWC optimisationPablo Marquez Tello
* Make changes to split the workload into two kernels. One kernel precomputes mean and variance and the second kernel just loads these precomputed values. * The new approach runs %30 faster than the original code for NHWC workloads like 32x192x256. * Resolves MLCE-337 Change-Id: I8356fcefa2d131ab4dcb32268ce7142421d073e4 Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5355 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2021-04-19Port DepthwiseConvolution to new APIMichalis Spyrou
Resolves: COMPMID-4185 Change-Id: Ib5f22356356a022d567bb18d44ea272b62d10ebf Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5424 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-04-19Remove padding from CLNormalizePlanarYUVLayerKernelSheri Zhang
Resolve: COMPMID-3911 Signed-off-by: Sheri Zhang <sheri.zhang@arm.com> Change-Id: Id5615b6a8b52030fb611a1a04bcd4664b8232e90 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5451 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-04-16Remove unnecessary use AccessWindowsMichele Di Giorgio
In these cases, no padding is introduced and the use of AccessWindows is not necessary and makes the code more confusing. Change-Id: Id712cba35bb0440eb40c69fdc7ad0084dc9a5ab3 Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5440 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-04-14Remove unused AccessWindow* includesMichele Di Giorgio
Change-Id: I9f8d0c6e17d58700cc01fc5134cd2dffd26bc742 Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5430 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2021-04-14Remove OpenCL padding: CLNormalizationLayerKernelManuel Bottini
Only for NHWC data layout Resolves: COMPMID-3910 Change-Id: Ie2d71482b3e3b55ac155e9af152032a5de8bbd50 Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5388 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-04-13Port CLConvertFullyConnectedWeights to new APITeresa Charlin
* Replace ICLKernel by IClKernel in other unrelated kernels Resolves partially: COMPMID-4187 Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com> Change-Id: I173b8f2ac645dbfd7d412f4b058c5c9655c229ee Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5402 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-04-09Fix bug on Implicit Padding for CL GEMMMatrixMultiplyInterleavedTransposedManuel Bottini
Resolves: COMPMID-4342 Change-Id: I468c6d68c0284e4ec76f22037a697fff7bc5638c Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5391 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-04-08Rework the OpenCL Winograd Input Transformations NHWCGian Marco Iodice
- Rework Winograd Input Transform 3x3 NHWC using the new macros - Rework Winograd Input Transform 5x5 NHWC using the new macros - Rework Winograd Input Transform 7x7 NHWC using the new macros - The new implementation is also faster than before - Winograd Input Transform 5x5/7x7 3x faster Resolves COMPMID-4139 Change-Id: Ia9c8af23a2d47d2db60ec4c44650a63a34ffa0d5 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5358 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2021-04-06Remove OpenCL padding: CLL2NormalizeLayerKernelManuel Bottini
Resolves: COMPMID-3909 Change-Id: I00a1705ed202002e2a6053702272181805fa6869 Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5360 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-04-01Added Qasymm8 datatype support to CLROIPoolingLayer with TestsSuhail Munshi
Also fixes RoiPoolingLayer not matching reference with Float32 datatype Issue Tests added to check ROIPooling Layer against reference with both Float32 and Qasymm8 input. Resolves : COMPMID-2320 Change-Id: Ib86d2e6b3803e74f922a545ea573da02c28e54cc Signed-off-by: Suhail Munshi <MohammedSuhail.Munshi@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5332 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
2021-03-31Remove Computer Vision generic interfaces and typesGeorgios Pinitas
Removes: - reference validation routines - CV related types and structures - CV related interfaces Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I3a203da12d9b04c154059b190aeba18a611149a9 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5340 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-03-29Port ClTranspose to new APITeresa Charlin
Partially Resolves: COMPMID-4277 (1/2) Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com> Change-Id: I704c2303135cbe1ba46d2fd5642c84c562204bc7 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5194 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-03-29Remove usage of valid window region CL - NHWCMichalis Spyrou
Resolves: COMPMID-4153 Change-Id: Ib0d60c9acaac8aaf3946c62fc2d740b5ec6cee5c Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5301 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-03-25Improve performance of Winograd Output Transform 3x3Gian Marco Iodice
This patch reworks the winograd output transform 3x3 NHWC on OpenCL - Use utility macros in tile_helpers.h to rewrite the kernel - Implement the tile utility macro for the activation Resolves COMPMID-4144 Change-Id: I86a9bb9ea96b9629a18642b56bb63750710e6af5 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5324 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-03-23Make ClDirectConvolutionKernel statelessSheri Zhang
ClDirectorConvolution triggers ClActivation (if enabled) Remove static tuner as the interface need to be changed base on new api. Remove functions in ClScaleKernel specific for static Tuner. Solves: COMPMID-4010 Signed-off-by: Sheri Zhang <sheri.zhang@arm.com> Change-Id: I7861c3462fda323a6fe1891834068a462245cb1b Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5262 Reviewed-by: Sang-Hoon Park <sang-hoon.park@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-03-23Make ClPixelWiseMultiplicationKernel statelessSheri Zhang
Partially resolves: COMPMID-4183 Signed-off-by: Sheri Zhang <sheri.zhang@arm.com> Change-Id: Ibc08d2d84d023ef8b23ed44d534aa1ca24515e4d Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5274 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-03-12Port OpenCL Scale to new APIManuel Bottini
Partially resolves: COMPMID-4190 Change-Id: I680dd80fcbe4e7568511792c60a725b2646fa6ff Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5197 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: TeresaARM <teresa.charlinreyes@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-03-11Port OpenCL Dequantization to new APIManuel Bottini
Partially resolves: COMPMID-4193 Change-Id: I4e14149d5b0a7f9c0dd3bfce800eaddca1e4d885 Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5238 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-03-10Port OpenCL Quantization to new APIManuel Bottini
Partially resolves: COMPMID-4193 Change-Id: Ie8367769c690442a0e30383c67851b50ab7c6742 Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5231 Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-03-08Make Softmax kernels on OpenCL statelessSang-Hoon Park
* ClSoftmaxKernel and ClSoftmax are created. * ClSoftmaxKernel is now state-less and ClSoftmax handles the internal tensors required for computation. * add_const_tensor() is added to TensorPack not only to have symmetric interface but also to benefit from implicit conversion. Implements: COMPMID-3998 Change-Id: I4f823121777be24260fd12b2cd71a6ff718c4eed Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5087 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-03-03Remove Compute Vision CL supportMichalis Spyrou
Resolves COMPMID-4151 Change-Id: I46f541efe8c4087f27794d2e158b6c1547d459ba Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5160 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2021-02-23Avoid nullptr dereference of vector_sum_colGeorgios Pinitas
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I4cc002da82da2219f3909a4e34463946cde4cf65 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5155 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-02-18Set CLDirectConvolutionLayerKernel NCHW _border_size to input paddingGiorgio Arena
Change-Id: I5802c470683647b7426b3b6e7d17280cabc32163 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5100 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-02-10Revert changes on tensor's strides and fix CLDepthwiseConvolution 3x3 QuantizedGiorgio Arena
- Revert changes in strides > num_dimensions. Set them to 0 - Fix offset calculcation in depthwise 3x3 quantized using select and stride_y for max offset Resolve COMPMID-4254 Change-Id: Ia99b9637f18b99b1fa3d4b7b4892046027d3e7e5 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5040 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-02-09Fix CLDepthwiseConvolutionLayer 3x3 QASYMM8Giorgio Arena
Fix errors when computing tensors with one element only - Replace Tensor3D with raw pointers so to get rid of offset to first element for NCHW layout - Add stronger out of bound constraints for NHWC layout - Set the border size to the input's padding for NHWC - Fill the strides == 0 with the largest stride, so to avoid accessing empty strides and multiplying by 0 Resolve COMPMID-4088 Change-Id: I751a4e6d7094b3c42306ff7f53af848fd35f19ac Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5024 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-02-08Make memset/copy functions state-lessSheri Zhang
Port following functions: - CLCopy - CLFill - CLPermute - CLReshapeLayer - CLCropResize Resolves: COMPMID-4002 Signed-off-by: Sheri Zhang <sheri.zhang@arm.com> Change-Id: I8392aa515aaeb5b44dab6122be6a795d08376d5f Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5003 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-02-03Fix OpenCL direct convolutionGian Marco Iodice
- The ARM DOT macro was using wrong variables for performing the dot product - K0 could be a non power of 2 values when IFM was not a multiple of 16 - Refactor the test for direct convolution NHWC Resolves COMPMID-4135, COMPMID-4155 Change-Id: I3a2dc89ef613ae20245cfc28e76ea36c55eaf81d Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4962 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: TeresaARM <teresa.charlinreyes@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-02-03Fix segfault in fsrcnn.tflite in GpuAccTeresa Charlin
* In CLDirectConvolution check for non-bias separately Resolves: COMPMID-4214 Change-Id: I83c0688e9b48d059665bbc6e1f0f050a516132d6 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4980 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-02-03Make CL Pooling kernels and functions state-lessMichele Di Giorgio
Resolves COMPMID-4000 Change-Id: I64878f93c033b4928fdefbb964c37c67fdecfaab Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4971 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2021-02-01Make data_layout an attribute of the Scale functionMichele Di Giorgio
Resolves COMPMID-4208 Change-Id: I61ca670134a005462ad0528a5aff9507a90860e7 Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4942 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-01-27Make CL Elementwise Unary kernels and functions state-lessMichele Di Giorgio
Resolves COMPMID-4004 Change-Id: I1dfe8bc52c1ff394ea208ba98b51033c738746a4 Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4922 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2021-01-26Make CLArithmeticAddition kernel and function state-lessMichele Di Giorgio
Resolves COMPMID-4006 Change-Id: Iddc32b0b250142aac9a4a7b9dc0eef462d196025 Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4913 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Sang-Hoon Park <sang-hoon.park@arm.com>
2021-01-22CTS failures in Android R and Q in GpuAcc in ArgMinMaxGiorgio Arena
- Fix ambiguosity with select in OpenCL - Define a new macro for signed integer data type of the same input data type's size. This is needed because some ops (e.g. logical operators) in OpenCL work in this way Resolves: COMPMID-4116, COMPMID-4110 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Change-Id: I560eda63fce24abd03d061f78f2f2ca951053fd0 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4898 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-01-21Make CLFloor and CLActivation kernels and functions state-lessGeorgios Pinitas
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: Icbe4e6a7c6732a59bdda0136af44c4852452dfd1 Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4891 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-01-20Make all CL Concatenate kernels and functions state-lessMichele Di Giorgio
Resolves COMPMID-3995 Change-Id: I84172bed20924f1d9ae3b4d14d7b321e9494296e Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4887 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2021-01-20Direct convolution fix for quantized data typeGian Marco Iodice
- Pass the quantized zero value to the opencl kernel Fixes COMPMID-3908 Change-Id: I6454c2e49f5b150a99178f2d72e0afa0a2990b54 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4884 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-01-19Remove padding from direct convolution - OpenCLGian Marco Iodice
- Refactor direct convolution for NHWC - Remove old kernels for NHWC - Change the heuristic in CLConvolutionLayer.cpp. The new direct convolution implementation is faster than FFT Resolves COMPMID-3908 Change-Id: Iee15ce7b04e21847b6eaae5c6d3c1b18180e7efc Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4876 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2021-01-14Remove OpenCL padding CLTransposeKernelManuel Bottini
By handling more general NxM blocks (where M and N can be 1,2,4,8,16) instead of only 4x4, 8x8, 16x16 and managing corner left values with partial stores Resolves: COMPMID-3923 Change-Id: I49b1a560c8325e00e061bd04edcf55034d04dcd8 Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4780 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-01-13Remove padding for CLArgMinMaxLayerKernel and fix CLRange mismatchesGiorgio Arena
- Cast the destination pointer to (__global DATA_TYPE*) when VEC_SIZE == 1 in range.cl Resolves: COMPMID-3906, COMPMID-4093 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Change-Id: Ic0a334d98785ea434ed81f89dbe34e7674991f82 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4792 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2021-01-13Remove OpenCL padding CLFloorKernelManuel Bottini
Use of proper vector size with boundary checking loads and stores Resolves: COMPMID-3922 Change-Id: Ib631d499603b860fcfdbe3da903b866a125359a8 Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4789 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-01-11Remove OpenCL padding: CLROIAlignLayerKernelManuel Bottini
Add padding checks in configure Resolves: COMPMID-3914 Change-Id: Ia5be67283402d8811ceb3007be3a666ab502d775 Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4787 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-01-05Remove OpenCL padding: CLPadLayerKernelGiorgio Arena
Resolves: COMPMID-3912 Change-Id: I1f8bd3bfec263ebfd70bc96f9183ccdc3089db13 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4754 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2020-12-23Fix baremetal arm_compute_validation build errorsSiCongLi
* Add -C flag to instruct preprocessor not to strip comments. This is to prevent marker comments like '// fall through' that suppresses certain warnings from being removed. * Fix unused variable warnings. * Add M_PI definition that's missing from certain toolchain standard libraries. Resolves COMPMID-4054 Change-Id: I1d641db668685d4b678f3d0efed84bfe9e630b4b Signed-off-by: SiCongLi <sicong.li@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4692 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-12-18Remove OpenCL padding CLScaleKernelManuel Bottini
Resolves COMPMID-3918 Change-Id: I970b1eaf2ae6f2f5a8cfc318cd1a3dfd3ba36fdb Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4668 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
2020-12-18Add new shapes to WinogradInputTransform dataset and fix border size for ↵Giorgio Arena
NCHW data layout Fix border size for CLWinogradInputTransformKernel with NCHW data layout by setting it to the input's paddings. Add new the new validation shapes to the WinogradInputTransform's dataset Resolves COMPMID-4042 Change-Id: Id93ac86e75c94ea3f2f35edcedebafada928f34a Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4694 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2020-12-18Adding no padding check asserts to specific CL KernelsManuel Bottini
Resolves COMPMID-3905 Updates following kernels:: - CLDeconvolutionLayerUpsampleKernel - CLDeconvolutionReshapeOutputKernel - CLInstanceNormalizationLayerKernel - CLMaxUnpoolingLayerKernel - CLPermuteKernel - CLQLSTMLayerNormalizationKernel - CLReorgLayerKernel - CLReverseKernel - CLSpaceToBatchLayerKernel - CLSpaceToDepthLayerKernel - CLGenerateProposalsLayerKernel - CLFFTDigitReverseKernel - CLFFTRadixStageKernel - CLFFTScaleKernel - CLFillBorderKernel - CLGatherKernel - CLStridedSliceKernel - CLBoundingBoxTransformKernel Change-Id: I067ec670ff9cceadb1dfbf60dabef311a567d99a Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4713 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-12-16COMPMID-3919: Remove OpenCL Padding CLSelectKernelManuel Bottini
Change-Id: I07222a9eb03c785bb63414f581152267b133e9fc Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4699 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2020-12-14Enable FFT for FP16Giorgio Arena
Resolves: COMPMID-4051 Change-Id: I0c0bf97212dd281c19d5081e6247e7dc0c23cd6b Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4687 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-12-11Remove (CL/NE)UpsampleLayer in favor to (NE/CL)ScaleGeorgios Pinitas
Upsample functions and kernels can be replaced with the Scale as they provide same functionality Partially resolves: COMPMID-3996 Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: Ic2f9ba352c183aa87d69d551d5c172d0f22119e8 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4679 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>