aboutsummaryrefslogtreecommitdiff
path: root/src/core/CL/cl_kernels
AgeCommit message (Collapse)Author
2021-03-31Fix trademarks throughout the codebaseMichele Di Giorgio
Resolves: COMPMID-4299 Change-Id: Ie6a52c1371b9a2a7b5bb4f019ecd5e70a2008567 Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5338 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-03-29New variant of OpenCL Winograd (4x4,5x5) input transformationAleksandr Nikolaev
Resolves: COMPMID-4141 Signed-off-by: Aleksandr Nikolaev <aleksandr.nikolaev@arm.com> Change-Id: I1437680029ff25a3a5d4f6f258f30960545056a9 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5299 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2021-03-26Check biases pointer before referencing in CLDirectConvolutionLayerMichele Di Giorgio
The biases input can be nullptr, hence we need to check before referencing. A test is also added to ensure a successful configure and run of Direct Convolution when there is no bias. Resolves: COMPMID-4315 Change-Id: I23223efd6ced81215aff490221fb4606945c139b Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5322 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: James Conroy <james.conroy@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-03-25Improve performance of Winograd Output Transform 3x3Gian Marco Iodice
This patch reworks the winograd output transform 3x3 NHWC on OpenCL - Use utility macros in tile_helpers.h to rewrite the kernel - Implement the tile utility macro for the activation Resolves COMPMID-4144 Change-Id: I86a9bb9ea96b9629a18642b56bb63750710e6af5 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5324 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-03-23Fix gemmlowp kernel crash when n0==16SiCongLi
Resolves COMPMID-4296 Change-Id: Ib4e26fd3f9ba66f18ea8ef8b982cc88158564045 Signed-off-by: SiCongLi <sicong.li@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5277 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-03-23Improve performance of Winograd Input transform 3x3Giorgio Arena
Resolve COMPMID-4143 Change-Id: I71521f50ad47c53c963303251b5da94a7abd5783 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5302 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-03-23Extend direct convolution (F32/F16/QASYMM8)Gian Marco Iodice
The new function can handle different block sizes (M0, N0) New utility macros have been developed to simplify the work and the future OpenCL kernel development. In particular the work has been done to also consider cases with: - the texture pipe support - dynamic tensor shape support Change-Id: Ife4c64baf07517938bb8ad18e6a5f4579345c40f Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5297 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-03-03Remove Compute Vision CL supportMichalis Spyrou
Resolves COMPMID-4151 Change-Id: I46f541efe8c4087f27794d2e158b6c1547d459ba Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5160 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2021-03-01Fix bug on CLCast from float to int8Giorgio Arena
Resolve COMPMID-4288 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Change-Id: I340d18a13f3f00432df3a7d3fc83f93e2591b6d3 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5184 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2021-02-22Comply with Trademark rules for use of Neon, Arm and MaliSheri Zhang
Full trademarks available in README.md Resolves: COMPMID-4257 Signed-off-by: Sheri Zhang <sheri.zhang@arm.com> Change-Id: Ibfba2adf2eef3449433f467464ebd87d7198474d Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5116 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-02-10Revert changes on tensor's strides and fix CLDepthwiseConvolution 3x3 QuantizedGiorgio Arena
- Revert changes in strides > num_dimensions. Set them to 0 - Fix offset calculcation in depthwise 3x3 quantized using select and stride_y for max offset Resolve COMPMID-4254 Change-Id: Ia99b9637f18b99b1fa3d4b7b4892046027d3e7e5 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5040 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-02-09Fix CLDepthwiseConvolutionLayer 3x3 QASYMM8Giorgio Arena
Fix errors when computing tensors with one element only - Replace Tensor3D with raw pointers so to get rid of offset to first element for NCHW layout - Add stronger out of bound constraints for NHWC layout - Set the border size to the input's padding for NHWC - Fill the strides == 0 with the largest stride, so to avoid accessing empty strides and multiplying by 0 Resolve COMPMID-4088 Change-Id: I751a4e6d7094b3c42306ff7f53af848fd35f19ac Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5024 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-02-04Fix CL Compiler frontend failure for ArgMinMaxGiorgio Arena
- Change select condition's data type to satisfy its signature - Add failing test case with VEC_SIZE == 1 Resolve: COMPMID-4110 Change-Id: I52287bff7a2108f92fd12164e267df6c074d5508 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4978 Reviewed-by: TeresaARM <teresa.charlinreyes@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-02-03Fix OpenCL direct convolutionGian Marco Iodice
- The ARM DOT macro was using wrong variables for performing the dot product - K0 could be a non power of 2 values when IFM was not a multiple of 16 - Refactor the test for direct convolution NHWC Resolves COMPMID-4135, COMPMID-4155 Change-Id: I3a2dc89ef613ae20245cfc28e76ea36c55eaf81d Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4962 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: TeresaARM <teresa.charlinreyes@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-01-22CTS failures in Android R and Q in GpuAcc in ArgMinMaxGiorgio Arena
- Fix ambiguosity with select in OpenCL - Define a new macro for signed integer data type of the same input data type's size. This is needed because some ops (e.g. logical operators) in OpenCL work in this way Resolves: COMPMID-4116, COMPMID-4110 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Change-Id: I560eda63fce24abd03d061f78f2f2ca951053fd0 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4898 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-01-20Direct convolution fix for quantized data typeGian Marco Iodice
- Pass the quantized zero value to the opencl kernel Fixes COMPMID-3908 Change-Id: I6454c2e49f5b150a99178f2d72e0afa0a2990b54 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4884 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-01-19Remove padding from direct convolution - OpenCLGian Marco Iodice
- Refactor direct convolution for NHWC - Remove old kernels for NHWC - Change the heuristic in CLConvolutionLayer.cpp. The new direct convolution implementation is faster than FFT Resolves COMPMID-3908 Change-Id: Iee15ce7b04e21847b6eaae5c6d3c1b18180e7efc Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4876 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2021-01-19LOGICAL_NOT failTeresa Charlin
Explicitly cast scalar to vector for LOGICAL_NOT Related with COMPUTE-12536 and IVGCVSW-5617 Change-Id: I03accce000f8889fc4fb88c42c3c87845acb4f42 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4874 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-01-15[Nightly Failure] Fix DeconvolutionLayer OpenCL kernel compilationGiorgio Arena
- Add case for VEC_SIZE == 3 in the TRANSPOSED_U macro Resolves: COMPMID-4094 Change-Id: I31870e589e66d895f9bf65c87aa04f32038365c0 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4864 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-01-14Remove OpenCL padding CLTransposeKernelManuel Bottini
By handling more general NxM blocks (where M and N can be 1,2,4,8,16) instead of only 4x4, 8x8, 16x16 and managing corner left values with partial stores Resolves: COMPMID-3923 Change-Id: I49b1a560c8325e00e061bd04edcf55034d04dcd8 Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4780 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-01-13Remove padding for CLArgMinMaxLayerKernel and fix CLRange mismatchesGiorgio Arena
- Cast the destination pointer to (__global DATA_TYPE*) when VEC_SIZE == 1 in range.cl Resolves: COMPMID-3906, COMPMID-4093 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Change-Id: Ic0a334d98785ea434ed81f89dbe34e7674991f82 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4792 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2021-01-13[Nightly Failure] Fix CLDepthwiseConvolutionLayer 3x3 QASYMM8 on MidgardGiorgio Arena
- Add checks for pad top/bottom bigger than (kernel size / 2) Resolves: COMPMID-4088 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Change-Id: Ifc5ea2154847d447bc5643d7607e7256aeddfcbf Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4840 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-01-13Remove OpenCL padding CLFloorKernelManuel Bottini
Use of proper vector size with boundary checking loads and stores Resolves: COMPMID-3922 Change-Id: Ib631d499603b860fcfdbe3da903b866a125359a8 Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4789 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-01-08[Nightly Failure] Fix OpenCL kernel compilation for CLRangeGiorgio Arena
- Change raw pointers in OpenCL kernel to __global uchar* Resolves: COMPMID-4079 Change-Id: Ieeb99ced565bef59583216fd274958b29c7b2758 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4774 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-01-05Remove OpenCL padding: CLPadLayerKernelGiorgio Arena
Resolves: COMPMID-3912 Change-Id: I1f8bd3bfec263ebfd70bc96f9183ccdc3089db13 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4754 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2020-12-30Fix FFT FP16 failures for some OpenCL compilersGiorgio Arena
- Fix unsupported native_cos and native_sin for half data types. Change to regular cos and sin functions. Resolves: COMPMID-4064 Change-Id: Id07fa0fd811e00a93f5b848636ad4f4481e9a409 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4730 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
2020-12-23Fix Select errors at OpenCL kernel compile timeGiorgio Arena
- Fix erroneously typed pointers. Raw OpenCL pointers should be defined as pointing to 8bit values and then used with a cast to their true pointer types, due to offset calculation with strides Resolves: COMPMID-4065 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Change-Id: I7e792bc22fbbc2ab6b65a8f5c4dc599f63e657a6 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4731 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-12-18Remove OpenCL padding CLScaleKernelManuel Bottini
Resolves COMPMID-3918 Change-Id: I970b1eaf2ae6f2f5a8cfc318cd1a3dfd3ba36fdb Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4668 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
2020-12-16NeuralNetwork VTS tests LOGICAL_AND and LOGICAL_OR fail with R29Teresa Charlin
Explicitly cast scalar to vector for LOGICAL_AND and LOGICAL_OR Resolves COMPUTE-12536 Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com> Change-Id: Iabdf7feaef9cb9b41a2fc78e73473ebcfcc3e091 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4706 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Manuel Bottini <manuel.bottini@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-12-16COMPMID-3919: Remove OpenCL Padding CLSelectKernelManuel Bottini
Change-Id: I07222a9eb03c785bb63414f581152267b133e9fc Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4699 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2020-12-14Enable FFT for FP16Giorgio Arena
Resolves: COMPMID-4051 Change-Id: I0c0bf97212dd281c19d5081e6247e7dc0c23cd6b Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4687 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-12-10[Review Shape] CLDepthwiseConvolutionLayer mismatchesGiorgio Arena
- Fixed a bug that corrected the number of dimensions of a TensorShape for added trailing 1s - Avoided adding offset_first_element for the Depthwise 3x3 NCHW OpenCL kernels, since it wouldn't align with the window which is based on the output - Adjusted padding requirements along the x for Depthwise 3x3 NCHW. The kernel should always add 2 * dilation_(x/y) to the num_elems_read_x/y - Adjusted the kernel's border_size given to the border handler at function level - Added the dataset that previously made the tests fail Resolves: COMPMID-4041 Change-Id: Ifab7d38b263f12173fcc96a5f0bd3375756c3c53 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4673 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-12-10COMPMID-3921: Remove OpenCL Padding CLBitwiseKernelManuel Bottini
Adding BitwiseOperation enum class Generalizing CL Bitwise kernels with a single CLBitwiseKernel Removing CL padding from CLBitwiseKernel Change-Id: I79cd79c1e425b6da7d52308a420edf8cfb7a5a36 Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4646 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-12-08Wrap Flatten layer over reshapeGeorgios Pinitas
Flatten layer is lowered into a Reshape layer. Remove (CL/NE)FlatternLayerKernel. Partially Resolves: COMPMID-3996 Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: Id9e2ddfe2e2dd793541badff3490c05e4c908f88 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4660 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2020-12-02Remove support for (NE/CL)LocallyConnectedLayerGeorgios Pinitas
Remove out-of-date and unmaintained LocallyConnectedLayer for both NEON and OpenCL. Resolves: COMPMID-3924 Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: Ia61398ed8cfa3876f41c1b342c4a80d1cca0ca83 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4634 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-12-01COMPMID-3916: Remove OpenCL padding CLRangeKernelManuel Bottini
Change-Id: Id2cc77508b0f2fa36a298059476b01704cfbdcaf Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4580 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2020-11-27COMPMID-3961: Cleaning up logical operators on OpenCLSang-Hoon Park
Change-Id: I04cd23e9abcb1828e54cd59fee3bfa95a6dea3fb Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4461 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Manuel Bottini <manuel.bottini@arm.com>
2020-11-25COMPMID-4025 [Nightly failure] Fix FP16 CLWidthConcatenateLayer mismatchesGiorgio Arena
Change-Id: I62e09682fe42c17227208387135ff2a165357335 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4553 Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-11-24COMPMID-4022 Nightly failure: CL LogicalNot -45 error clCreateKernelGiorgio Arena
Change-Id: I62dab54582a677753bd9337f6a7db265e57d330d Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4536 Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Reviewed-by: Sheri Zhang <sheri.zhang@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-11-23Fix output address calculation in GEMM OpenCL kernelsMichele Di Giorgio
Resolves COMPMID-3977 Change-Id: I222e0d1726993e54699646323820fc4ae53ab520 Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4530 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-11-18COMPMID-3961: Add Logical OR/AND/NOT operator on CLSang-Hoon Park
Change-Id: I612aeed6affa17624fb9044964dd59c41a5c9888 Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4448 Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-11-17COMPMID-3979 Sanitise Padding Removal epicSiCong Li
* Add missing padding immutability asserts in all relevant CL kernels * Remove unnecessary zero padding validation tests. Change-Id: If93f9ccbc988e0286f5e7b135f812141476d5da0 Signed-off-by: SiCong Li <sicong.li@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4446 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-11-13COMPMID-3956: Nightly CL failure on G71 with error code -7Manuel Bottini
Change-Id: Iba02375df47d227feca07cc0215e3389e7c55ade Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4401 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-11-12COMPMID-3735 Remove OpenCL padding: CLSoftmaxLayerKernelGiorgio Arena
- Renamed SELECT_DATA_TYPE to SELECT_VEC_DATA_TYPE to reflect its usage with vectors. SELECT_DATA_TYPE(dt) will now return the primitive data type - Changed the interface of VEC_OFFS and V_OFFS in order to receive the primitive data type as a parameter rather than its vector form - Performed a general cleanup of the kernels, such as creating macro for sum and max reduces, remove reduntant macros, defines, variables, calculations, etc... - Using VEC_SIZE and VEC_SIZE_LEFTOVER in every kernel in order to allow computation for smaller shapes without adding paddings - Removed the actual padding from the kernel and adjusting its calculations accordingly. Added asserts for padding removal checks. Removed invalid Validate tests. Change-Id: If5ccbd5d34e255d38c7f6bfe8740e2b80b28e264 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4277 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-11-09COMPMID-3951 LargeGraph_FLOAT32_Rank4_25 CTS failures in Android Q in CL Fix1SiCong Li
* Fix CLSpaceToBatchLayerKernel and NESpaceToBatchLayerKernel validation errors by using the correctly calculated output tensor shape Signed-off-by: SiCong Li <sicong.li@arm.com> Change-Id: I21d61f870e6a23a2e38dcb95c939b0bf08082b6f Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4347 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: TeresaARM <teresa.charlinreyes@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-11-09COMPMID-3730: Remove CLGEMMMatrixMultiplyKernel Patch2SiCong Li
Change-Id: I56137938c9ebe1a5aeeaa750b39fcfc6818016f1 Signed-off-by: SiCong Li <sicong.li@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4332 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-11-05COMPMID-3730 Remove padding from CLGEMMMatrixMultiplyKernel Patch1SiCong Li
* Remove default definition for STORE_BLOCK_BOUNDARY_AWARE to avoid elusive bugs * Clean up gemm_mm_interleaved* and gemm_mm_floating_point* kernels * Relocate to gemm_v1.cl to avoid clashing with new kernels * Rename compile time arguments to conform with the established terminology(MNKB), and to facilitate the use of STORE_BLOCK_BOUNDARY_AWARE Change-Id: Ia85c746b2536cad87257a79685b459b5d2f9a1be Signed-off-by: SiCong Li <sicong.li@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4329 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-11-04COMPMID-3599: Fix rotate condition in concatenate_width_x4Michele Di Giorgio
Correct copy-paste error introduced in previous fix. Change-Id: I8a82a5a9acd9afbe30c760faf78d87818510642b Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4323 Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-11-03COMPMID-3931: Concat quant8 unittests, android VTS and CTS tests failingMichele Di Giorgio
Change-Id: Ib9a31b861f95caec72a1aa02dbe3c2b46ed25efc Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4309 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-11-03COMPMID-3721: Remove OpenCL padding ↵Manuel Bottini
CLGEMMLowpMatrixMultiplyReshapedOnlyRHSKernel Change-Id: I45d26d5f565f9a55f6b5e8d7652b14283ae616f7 Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4299 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>