aboutsummaryrefslogtreecommitdiff
path: root/src/core/CL/cl_kernels
AgeCommit message (Collapse)Author
2021-03-01Fix bug on CLCast from float to int8Giorgio Arena
Resolve COMPMID-4288 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Change-Id: I340d18a13f3f00432df3a7d3fc83f93e2591b6d3 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5184 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2021-02-22Comply with Trademark rules for use of Neon, Arm and MaliSheri Zhang
Full trademarks available in README.md Resolves: COMPMID-4257 Signed-off-by: Sheri Zhang <sheri.zhang@arm.com> Change-Id: Ibfba2adf2eef3449433f467464ebd87d7198474d Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5116 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-02-10Revert changes on tensor's strides and fix CLDepthwiseConvolution 3x3 QuantizedGiorgio Arena
- Revert changes in strides > num_dimensions. Set them to 0 - Fix offset calculcation in depthwise 3x3 quantized using select and stride_y for max offset Resolve COMPMID-4254 Change-Id: Ia99b9637f18b99b1fa3d4b7b4892046027d3e7e5 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5040 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-02-09Fix CLDepthwiseConvolutionLayer 3x3 QASYMM8Giorgio Arena
Fix errors when computing tensors with one element only - Replace Tensor3D with raw pointers so to get rid of offset to first element for NCHW layout - Add stronger out of bound constraints for NHWC layout - Set the border size to the input's padding for NHWC - Fill the strides == 0 with the largest stride, so to avoid accessing empty strides and multiplying by 0 Resolve COMPMID-4088 Change-Id: I751a4e6d7094b3c42306ff7f53af848fd35f19ac Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5024 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-02-04Fix CL Compiler frontend failure for ArgMinMaxGiorgio Arena
- Change select condition's data type to satisfy its signature - Add failing test case with VEC_SIZE == 1 Resolve: COMPMID-4110 Change-Id: I52287bff7a2108f92fd12164e267df6c074d5508 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4978 Reviewed-by: TeresaARM <teresa.charlinreyes@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-02-03Fix OpenCL direct convolutionGian Marco Iodice
- The ARM DOT macro was using wrong variables for performing the dot product - K0 could be a non power of 2 values when IFM was not a multiple of 16 - Refactor the test for direct convolution NHWC Resolves COMPMID-4135, COMPMID-4155 Change-Id: I3a2dc89ef613ae20245cfc28e76ea36c55eaf81d Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4962 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: TeresaARM <teresa.charlinreyes@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-01-22CTS failures in Android R and Q in GpuAcc in ArgMinMaxGiorgio Arena
- Fix ambiguosity with select in OpenCL - Define a new macro for signed integer data type of the same input data type's size. This is needed because some ops (e.g. logical operators) in OpenCL work in this way Resolves: COMPMID-4116, COMPMID-4110 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Change-Id: I560eda63fce24abd03d061f78f2f2ca951053fd0 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4898 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-01-20Direct convolution fix for quantized data typeGian Marco Iodice
- Pass the quantized zero value to the opencl kernel Fixes COMPMID-3908 Change-Id: I6454c2e49f5b150a99178f2d72e0afa0a2990b54 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4884 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-01-19Remove padding from direct convolution - OpenCLGian Marco Iodice
- Refactor direct convolution for NHWC - Remove old kernels for NHWC - Change the heuristic in CLConvolutionLayer.cpp. The new direct convolution implementation is faster than FFT Resolves COMPMID-3908 Change-Id: Iee15ce7b04e21847b6eaae5c6d3c1b18180e7efc Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4876 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2021-01-19LOGICAL_NOT failTeresa Charlin
Explicitly cast scalar to vector for LOGICAL_NOT Related with COMPUTE-12536 and IVGCVSW-5617 Change-Id: I03accce000f8889fc4fb88c42c3c87845acb4f42 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4874 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-01-15[Nightly Failure] Fix DeconvolutionLayer OpenCL kernel compilationGiorgio Arena
- Add case for VEC_SIZE == 3 in the TRANSPOSED_U macro Resolves: COMPMID-4094 Change-Id: I31870e589e66d895f9bf65c87aa04f32038365c0 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4864 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-01-14Remove OpenCL padding CLTransposeKernelManuel Bottini
By handling more general NxM blocks (where M and N can be 1,2,4,8,16) instead of only 4x4, 8x8, 16x16 and managing corner left values with partial stores Resolves: COMPMID-3923 Change-Id: I49b1a560c8325e00e061bd04edcf55034d04dcd8 Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4780 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-01-13Remove padding for CLArgMinMaxLayerKernel and fix CLRange mismatchesGiorgio Arena
- Cast the destination pointer to (__global DATA_TYPE*) when VEC_SIZE == 1 in range.cl Resolves: COMPMID-3906, COMPMID-4093 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Change-Id: Ic0a334d98785ea434ed81f89dbe34e7674991f82 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4792 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2021-01-13[Nightly Failure] Fix CLDepthwiseConvolutionLayer 3x3 QASYMM8 on MidgardGiorgio Arena
- Add checks for pad top/bottom bigger than (kernel size / 2) Resolves: COMPMID-4088 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Change-Id: Ifc5ea2154847d447bc5643d7607e7256aeddfcbf Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4840 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-01-13Remove OpenCL padding CLFloorKernelManuel Bottini
Use of proper vector size with boundary checking loads and stores Resolves: COMPMID-3922 Change-Id: Ib631d499603b860fcfdbe3da903b866a125359a8 Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4789 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-01-08[Nightly Failure] Fix OpenCL kernel compilation for CLRangeGiorgio Arena
- Change raw pointers in OpenCL kernel to __global uchar* Resolves: COMPMID-4079 Change-Id: Ieeb99ced565bef59583216fd274958b29c7b2758 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4774 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-01-05Remove OpenCL padding: CLPadLayerKernelGiorgio Arena
Resolves: COMPMID-3912 Change-Id: I1f8bd3bfec263ebfd70bc96f9183ccdc3089db13 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4754 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2020-12-30Fix FFT FP16 failures for some OpenCL compilersGiorgio Arena
- Fix unsupported native_cos and native_sin for half data types. Change to regular cos and sin functions. Resolves: COMPMID-4064 Change-Id: Id07fa0fd811e00a93f5b848636ad4f4481e9a409 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4730 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
2020-12-23Fix Select errors at OpenCL kernel compile timeGiorgio Arena
- Fix erroneously typed pointers. Raw OpenCL pointers should be defined as pointing to 8bit values and then used with a cast to their true pointer types, due to offset calculation with strides Resolves: COMPMID-4065 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Change-Id: I7e792bc22fbbc2ab6b65a8f5c4dc599f63e657a6 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4731 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-12-18Remove OpenCL padding CLScaleKernelManuel Bottini
Resolves COMPMID-3918 Change-Id: I970b1eaf2ae6f2f5a8cfc318cd1a3dfd3ba36fdb Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4668 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
2020-12-16NeuralNetwork VTS tests LOGICAL_AND and LOGICAL_OR fail with R29Teresa Charlin
Explicitly cast scalar to vector for LOGICAL_AND and LOGICAL_OR Resolves COMPUTE-12536 Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com> Change-Id: Iabdf7feaef9cb9b41a2fc78e73473ebcfcc3e091 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4706 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Manuel Bottini <manuel.bottini@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-12-16COMPMID-3919: Remove OpenCL Padding CLSelectKernelManuel Bottini
Change-Id: I07222a9eb03c785bb63414f581152267b133e9fc Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4699 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2020-12-14Enable FFT for FP16Giorgio Arena
Resolves: COMPMID-4051 Change-Id: I0c0bf97212dd281c19d5081e6247e7dc0c23cd6b Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4687 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-12-10[Review Shape] CLDepthwiseConvolutionLayer mismatchesGiorgio Arena
- Fixed a bug that corrected the number of dimensions of a TensorShape for added trailing 1s - Avoided adding offset_first_element for the Depthwise 3x3 NCHW OpenCL kernels, since it wouldn't align with the window which is based on the output - Adjusted padding requirements along the x for Depthwise 3x3 NCHW. The kernel should always add 2 * dilation_(x/y) to the num_elems_read_x/y - Adjusted the kernel's border_size given to the border handler at function level - Added the dataset that previously made the tests fail Resolves: COMPMID-4041 Change-Id: Ifab7d38b263f12173fcc96a5f0bd3375756c3c53 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4673 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-12-10COMPMID-3921: Remove OpenCL Padding CLBitwiseKernelManuel Bottini
Adding BitwiseOperation enum class Generalizing CL Bitwise kernels with a single CLBitwiseKernel Removing CL padding from CLBitwiseKernel Change-Id: I79cd79c1e425b6da7d52308a420edf8cfb7a5a36 Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4646 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-12-08Wrap Flatten layer over reshapeGeorgios Pinitas
Flatten layer is lowered into a Reshape layer. Remove (CL/NE)FlatternLayerKernel. Partially Resolves: COMPMID-3996 Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: Id9e2ddfe2e2dd793541badff3490c05e4c908f88 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4660 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2020-12-02Remove support for (NE/CL)LocallyConnectedLayerGeorgios Pinitas
Remove out-of-date and unmaintained LocallyConnectedLayer for both NEON and OpenCL. Resolves: COMPMID-3924 Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: Ia61398ed8cfa3876f41c1b342c4a80d1cca0ca83 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4634 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-12-01COMPMID-3916: Remove OpenCL padding CLRangeKernelManuel Bottini
Change-Id: Id2cc77508b0f2fa36a298059476b01704cfbdcaf Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4580 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2020-11-27COMPMID-3961: Cleaning up logical operators on OpenCLSang-Hoon Park
Change-Id: I04cd23e9abcb1828e54cd59fee3bfa95a6dea3fb Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4461 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Manuel Bottini <manuel.bottini@arm.com>
2020-11-25COMPMID-4025 [Nightly failure] Fix FP16 CLWidthConcatenateLayer mismatchesGiorgio Arena
Change-Id: I62e09682fe42c17227208387135ff2a165357335 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4553 Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-11-24COMPMID-4022 Nightly failure: CL LogicalNot -45 error clCreateKernelGiorgio Arena
Change-Id: I62dab54582a677753bd9337f6a7db265e57d330d Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4536 Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Reviewed-by: Sheri Zhang <sheri.zhang@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-11-23Fix output address calculation in GEMM OpenCL kernelsMichele Di Giorgio
Resolves COMPMID-3977 Change-Id: I222e0d1726993e54699646323820fc4ae53ab520 Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4530 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-11-18COMPMID-3961: Add Logical OR/AND/NOT operator on CLSang-Hoon Park
Change-Id: I612aeed6affa17624fb9044964dd59c41a5c9888 Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4448 Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-11-17COMPMID-3979 Sanitise Padding Removal epicSiCong Li
* Add missing padding immutability asserts in all relevant CL kernels * Remove unnecessary zero padding validation tests. Change-Id: If93f9ccbc988e0286f5e7b135f812141476d5da0 Signed-off-by: SiCong Li <sicong.li@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4446 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-11-13COMPMID-3956: Nightly CL failure on G71 with error code -7Manuel Bottini
Change-Id: Iba02375df47d227feca07cc0215e3389e7c55ade Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4401 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-11-12COMPMID-3735 Remove OpenCL padding: CLSoftmaxLayerKernelGiorgio Arena
- Renamed SELECT_DATA_TYPE to SELECT_VEC_DATA_TYPE to reflect its usage with vectors. SELECT_DATA_TYPE(dt) will now return the primitive data type - Changed the interface of VEC_OFFS and V_OFFS in order to receive the primitive data type as a parameter rather than its vector form - Performed a general cleanup of the kernels, such as creating macro for sum and max reduces, remove reduntant macros, defines, variables, calculations, etc... - Using VEC_SIZE and VEC_SIZE_LEFTOVER in every kernel in order to allow computation for smaller shapes without adding paddings - Removed the actual padding from the kernel and adjusting its calculations accordingly. Added asserts for padding removal checks. Removed invalid Validate tests. Change-Id: If5ccbd5d34e255d38c7f6bfe8740e2b80b28e264 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4277 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-11-09COMPMID-3951 LargeGraph_FLOAT32_Rank4_25 CTS failures in Android Q in CL Fix1SiCong Li
* Fix CLSpaceToBatchLayerKernel and NESpaceToBatchLayerKernel validation errors by using the correctly calculated output tensor shape Signed-off-by: SiCong Li <sicong.li@arm.com> Change-Id: I21d61f870e6a23a2e38dcb95c939b0bf08082b6f Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4347 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: TeresaARM <teresa.charlinreyes@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-11-09COMPMID-3730: Remove CLGEMMMatrixMultiplyKernel Patch2SiCong Li
Change-Id: I56137938c9ebe1a5aeeaa750b39fcfc6818016f1 Signed-off-by: SiCong Li <sicong.li@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4332 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-11-05COMPMID-3730 Remove padding from CLGEMMMatrixMultiplyKernel Patch1SiCong Li
* Remove default definition for STORE_BLOCK_BOUNDARY_AWARE to avoid elusive bugs * Clean up gemm_mm_interleaved* and gemm_mm_floating_point* kernels * Relocate to gemm_v1.cl to avoid clashing with new kernels * Rename compile time arguments to conform with the established terminology(MNKB), and to facilitate the use of STORE_BLOCK_BOUNDARY_AWARE Change-Id: Ia85c746b2536cad87257a79685b459b5d2f9a1be Signed-off-by: SiCong Li <sicong.li@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4329 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-11-04COMPMID-3599: Fix rotate condition in concatenate_width_x4Michele Di Giorgio
Correct copy-paste error introduced in previous fix. Change-Id: I8a82a5a9acd9afbe30c760faf78d87818510642b Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4323 Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-11-03COMPMID-3931: Concat quant8 unittests, android VTS and CTS tests failingMichele Di Giorgio
Change-Id: Ib9a31b861f95caec72a1aa02dbe3c2b46ed25efc Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4309 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-11-03COMPMID-3721: Remove OpenCL padding ↵Manuel Bottini
CLGEMMLowpMatrixMultiplyReshapedOnlyRHSKernel Change-Id: I45d26d5f565f9a55f6b5e8d7652b14283ae616f7 Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4299 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2020-10-29COMPMID-3928: Fix output conversion in gemmlowp_mm_nativeMichele Di Giorgio
This patch solves the following issues that arose from nightly tests: - The accumulated result of gemmlowp_mm_native can be either uint or int and in order to be stored in memory we need to convert it to int. - The RHS matrix still needs padding on the X dimension. Hence, revert few changes to add the necessary padding elements. - Replace zero padding validation tests with assertion in the configure method of the kernel. Change-Id: Ib6614a91bd0e98f2b850f52eef14d4fbf55517c8 Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4259 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-10-29COMPMID-3720: Remove OpenCL padding CLGEMMLowpMatrixMultiplyReshapedKernelManuel Bottini
Change-Id: Ie70ba877f0356661a055f026124904bbf2181a33 Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4251 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-10-29COMPMID-3737: Remove OpenCL padding: CLWidthConcatenate2TensorsKernelSheri Zhang
Remove padding from CLWidthConcatenate2TensorsKernel Remove padding from CLWidthConcatenate4TensorsKernel Change-Id: I2142618e87bf11f831fe3b9375c4a7efda8d3a21 Signed-off-by: Sheri Zhang <sheri.zhang@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4266 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2020-10-28COMPMID-3710: Remove OpenCL padding: CLDepthConvertLayerKernelSheri Zhang
Signed-off-by: Sheri Zhang <sheri.zhang@arm.com> Change-Id: Iee1d4655012ce4cb699535697aeefec673f0bc63 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4157 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-10-28COMPMID-3793: Remove OpenCL padding: CLWidthConcatenateLayerKernelSheri Zhang
Signed-off-by: Sheri Zhang <sheri.zhang@arm.com> Change-Id: I705044a9429bb9a08268368b09463c2af85616d5 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4253 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Manuel Bottini <manuel.bottini@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-10-27COMPMID-3927 CTS failure for OpenCL ElementwiseOperationGiorgio Arena
Change-Id: I1a82315dfa9b021f72e0b687da658e2e02c9bb34 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4257 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-10-26COMPMID-3925: Dispatch CLGEMM with no padding y requirementGian Marco Iodice
- Add has_pad_y flag in GEMMKernelInfo - Skip reinterpret as 3D in CLGEMMMatrixMultiplyReshapedOnlyRHSKernel if has_pad_y = false - Add test to validate CLGEMMMatrixMultiplyReshapedOnlyRHSkernel with had_pad_y = false/true - Configure two variants of CLGEMMMatrixMultiplyReshapedOnlyRHSKernel to run with has_pad_y = false/true in CLGEMM - Check if the lhs/dst tensors have pad y. If not, run CLGEMMMatrixMultiplyReshapedOnlyRHSKernel without padding requirement Change-Id: I68bb43389789736d676b899ac7c77fd9138babaf Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4248 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-10-26COMPMID-3741: Remove OpenCL padding: CLWinogradOutputTransformKernelGian Marco Iodice
- Refactor the OpenCL kernels for Winograd output transform NHWC to avoid padding requirement - The kernel adopt the reverse store approach to avoid out-of-bound writes Change-Id: If9aad20354ff2146f57ead07ba0aaadb3df919f9 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4222 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>