path: root/src/core/CL/kernels
Age | Commit message | Author
2021-02-23 | Avoid nullptr dereference of vector_sum_col | Georgios Pinitas
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I4cc002da82da2219f3909a4e34463946cde4cf65 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5155 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-02-18 | Set CLDirectConvolutionLayerKernel NCHW _border_size to input padding | Giorgio Arena
Change-Id: I5802c470683647b7426b3b6e7d17280cabc32163 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5100 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-02-10 | Revert changes on tensor's strides and fix CLDepthwiseConvolution 3x3 Quantized | Giorgio Arena
- Revert changes in strides > num_dimensions. Set them to 0
- Fix offset calculation in depthwise 3x3 quantized using select and stride_y for max offset
Resolve COMPMID-4254
Change-Id: Ia99b9637f18b99b1fa3d4b7b4892046027d3e7e5 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5040 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-02-09 | Fix CLDepthwiseConvolutionLayer 3x3 QASYMM8 | Giorgio Arena
Fix errors when computing tensors with one element only
- Replace Tensor3D with raw pointers so as to get rid of the offset to the first element for NCHW layout
- Add stronger out-of-bounds constraints for NHWC layout
- Set the border size to the input's padding for NHWC
- Fill the strides == 0 with the largest stride, so as to avoid accessing empty strides and multiplying by 0
Resolve COMPMID-4088
Change-Id: I751a4e6d7094b3c42306ff7f53af848fd35f19ac Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5024 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-02-08 | Make memset/copy functions state-less | Sheri Zhang
Port following functions: - CLCopy - CLFill - CLPermute - CLReshapeLayer - CLCropResize Resolves: COMPMID-4002 Signed-off-by: Sheri Zhang <sheri.zhang@arm.com> Change-Id: I8392aa515aaeb5b44dab6122be6a795d08376d5f Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5003 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-02-03 | Fix OpenCL direct convolution | Gian Marco Iodice
- The ARM DOT macro was using the wrong variables for performing the dot product
- K0 could be a non-power-of-2 value when IFM was not a multiple of 16
- Refactor the test for direct convolution NHWC
Resolves COMPMID-4135, COMPMID-4155
Change-Id: I3a2dc89ef613ae20245cfc28e76ea36c55eaf81d Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4962 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: TeresaARM <teresa.charlinreyes@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-02-03 | Fix segfault in fsrcnn.tflite in GpuAcc | Teresa Charlin
* In CLDirectConvolution check for non-bias separately Resolves: COMPMID-4214 Change-Id: I83c0688e9b48d059665bbc6e1f0f050a516132d6 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4980 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-02-03 | Make CL Pooling kernels and functions state-less | Michele Di Giorgio
Resolves COMPMID-4000 Change-Id: I64878f93c033b4928fdefbb964c37c67fdecfaab Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4971 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2021-02-01 | Make data_layout an attribute of the Scale function | Michele Di Giorgio
Resolves COMPMID-4208 Change-Id: I61ca670134a005462ad0528a5aff9507a90860e7 Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4942 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-01-27 | Make CL Elementwise Unary kernels and functions state-less | Michele Di Giorgio
Resolves COMPMID-4004 Change-Id: I1dfe8bc52c1ff394ea208ba98b51033c738746a4 Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4922 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2021-01-26 | Make CLArithmeticAddition kernel and function state-less | Michele Di Giorgio
Resolves COMPMID-4006 Change-Id: Iddc32b0b250142aac9a4a7b9dc0eef462d196025 Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4913 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Sang-Hoon Park <sang-hoon.park@arm.com>
2021-01-22 | CTS failures in Android R and Q in GpuAcc in ArgMinMax | Giorgio Arena
- Fix ambiguity with select in OpenCL
- Define a new macro for the signed integer data type with the same size as the input data type. This is needed because some ops (e.g. logical operators) in OpenCL work in this way
Resolves: COMPMID-4116, COMPMID-4110
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Change-Id: I560eda63fce24abd03d061f78f2f2ca951053fd0 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4898 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
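The select issue can be illustrated with a small sketch. In OpenCL C, the vector form of select(a, b, c) expects the condition c to be an integer vector whose elements have the same bit width as those of a and b, and it picks b wherever the most significant bit of c is set. The C++ below models that rule for a single lane. The names signed_int_of_size, select_type_t and select_msb are hypothetical and are not the macro introduced by this commit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical trait: maps a byte width to the signed integer type of that width.
template <std::size_t Bytes>
struct signed_int_of_size;
template <> struct signed_int_of_size<1> { using type = std::int8_t; };
template <> struct signed_int_of_size<2> { using type = std::int16_t; };
template <> struct signed_int_of_size<4> { using type = std::int32_t; };
template <> struct signed_int_of_size<8> { using type = std::int64_t; };

// Signed integer type with the same size as T (the "condition" type for T).
template <typename T>
using select_type_t = typename signed_int_of_size<sizeof(T)>::type;

// Models one lane of OpenCL's vector select: returns b when the most
// significant bit of c is set (c is negative), otherwise a.
template <typename T>
T select_msb(T a, T b, select_type_t<T> c)
{
    return c < 0 ? b : a;
}
```

Because the condition type is derived from the data type, a call such as select_msb(1.0f, 2.0f, -1) has only one valid interpretation, which is the kind of ambiguity the commit message refers to.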
2021-01-21 | Make CLFloor and CLActivation kernels and functions state-less | Georgios Pinitas
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: Icbe4e6a7c6732a59bdda0136af44c4852452dfd1 Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4891 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-01-20 | Make all CL Concatenate kernels and functions state-less | Michele Di Giorgio
Resolves COMPMID-3995 Change-Id: I84172bed20924f1d9ae3b4d14d7b321e9494296e Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4887 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2021-01-20 | Direct convolution fix for quantized data type | Gian Marco Iodice
- Pass the quantized zero value to the opencl kernel Fixes COMPMID-3908 Change-Id: I6454c2e49f5b150a99178f2d72e0afa0a2990b54 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4884 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-01-19 | Remove padding from direct convolution - OpenCL | Gian Marco Iodice
- Refactor direct convolution for NHWC - Remove old kernels for NHWC - Change the heuristic in CLConvolutionLayer.cpp. The new direct convolution implementation is faster than FFT Resolves COMPMID-3908 Change-Id: Iee15ce7b04e21847b6eaae5c6d3c1b18180e7efc Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4876 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2021-01-14 | Remove OpenCL padding CLTransposeKernel | Manuel Bottini
By handling more general NxM blocks (where M and N can be 1, 2, 4, 8 or 16) instead of only 4x4, 8x8 and 16x16, and by handling leftover values at the edges with partial stores
Resolves: COMPMID-3923
Change-Id: I49b1a560c8325e00e061bd04edcf55034d04dcd8 Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4780 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
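As a rough host-side illustration of the idea (not the OpenCL kernel itself), the sketch below transposes a row-major matrix in blocks of arbitrary size and clamps each block at the matrix edges, which plays the role of a partial store. The function name transpose_blocked and its parameters are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Transposes a row-major rows x cols matrix block by block.
// Blocks at the right and bottom edges are clamped, so no element outside
// the matrix is ever read or written and no padding is needed.
std::vector<int> transpose_blocked(const std::vector<int> &src,
                                   std::size_t rows, std::size_t cols,
                                   std::size_t block_rows, std::size_t block_cols)
{
    assert(src.size() == rows * cols);
    assert(block_rows > 0 && block_cols > 0);

    std::vector<int> dst(src.size());
    for(std::size_t r0 = 0; r0 < rows; r0 += block_rows)
    {
        for(std::size_t c0 = 0; c0 < cols; c0 += block_cols)
        {
            // Clamp the block extent: the equivalent of a partial store.
            const std::size_t r_end = std::min(r0 + block_rows, rows);
            const std::size_t c_end = std::min(c0 + block_cols, cols);
            for(std::size_t r = r0; r < r_end; ++r)
            {
                for(std::size_t c = c0; c < c_end; ++c)
                {
                    dst[c * rows + r] = src[r * cols + c];
                }
            }
        }
    }
    return dst;
}
```

With a 3x5 input and 2x4 blocks, the last block in each direction is smaller than the nominal block size, yet the result is the same as an element-by-element transpose.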
2021-01-13 | Remove padding for CLArgMinMaxLayerKernel and fix CLRange mismatches | Giorgio Arena
- Cast the destination pointer to (__global DATA_TYPE*) when VEC_SIZE == 1 in range.cl Resolves: COMPMID-3906, COMPMID-4093 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Change-Id: Ic0a334d98785ea434ed81f89dbe34e7674991f82 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4792 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2021-01-13 | Remove OpenCL padding CLFloorKernel | Manuel Bottini
Use of proper vector size with boundary checking loads and stores Resolves: COMPMID-3922 Change-Id: Ib631d499603b860fcfdbe3da903b866a125359a8 Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4789 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-01-11 | Remove OpenCL padding: CLROIAlignLayerKernel | Manuel Bottini
Add padding checks in configure Resolves: COMPMID-3914 Change-Id: Ia5be67283402d8811ceb3007be3a666ab502d775 Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4787 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-01-05 | Remove OpenCL padding: CLPadLayerKernel | Giorgio Arena
Resolves: COMPMID-3912 Change-Id: I1f8bd3bfec263ebfd70bc96f9183ccdc3089db13 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4754 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2020-12-23 | Fix baremetal arm_compute_validation build errors | SiCongLi
* Add -C flag to instruct the preprocessor not to strip comments. This prevents marker comments like '// fall through', which suppress certain warnings, from being removed.
* Fix unused variable warnings.
* Add M_PI definition that's missing from certain toolchain standard libraries.
Resolves COMPMID-4054
Change-Id: I1d641db668685d4b678f3d0efed84bfe9e630b4b Signed-off-by: SiCongLi <sicong.li@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4692 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-12-18 | Remove OpenCL padding CLScaleKernel | Manuel Bottini
Resolves COMPMID-3918 Change-Id: I970b1eaf2ae6f2f5a8cfc318cd1a3dfd3ba36fdb Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4668 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
2020-12-18 | Add new shapes to WinogradInputTransform dataset and fix border size for NCHW data layout | Giorgio Arena
Fix border size for CLWinogradInputTransformKernel with NCHW data layout by setting it to the input's paddings. Add the new validation shapes to the WinogradInputTransform's dataset
Resolves COMPMID-4042
Change-Id: Id93ac86e75c94ea3f2f35edcedebafada928f34a Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4694 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2020-12-18 | Adding no padding check asserts to specific CL Kernels | Manuel Bottini
Resolves COMPMID-3905
Updates the following kernels:
- CLDeconvolutionLayerUpsampleKernel
- CLDeconvolutionReshapeOutputKernel
- CLInstanceNormalizationLayerKernel
- CLMaxUnpoolingLayerKernel
- CLPermuteKernel
- CLQLSTMLayerNormalizationKernel
- CLReorgLayerKernel
- CLReverseKernel
- CLSpaceToBatchLayerKernel
- CLSpaceToDepthLayerKernel
- CLGenerateProposalsLayerKernel
- CLFFTDigitReverseKernel
- CLFFTRadixStageKernel
- CLFFTScaleKernel
- CLFillBorderKernel
- CLGatherKernel
- CLStridedSliceKernel
- CLBoundingBoxTransformKernel
Change-Id: I067ec670ff9cceadb1dfbf60dabef311a567d99a Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4713 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-12-16 | COMPMID-3919: Remove OpenCL Padding CLSelectKernel | Manuel Bottini
Change-Id: I07222a9eb03c785bb63414f581152267b133e9fc Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4699 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2020-12-14 | Enable FFT for FP16 | Giorgio Arena
Resolves: COMPMID-4051 Change-Id: I0c0bf97212dd281c19d5081e6247e7dc0c23cd6b Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4687 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-12-11 | Remove (CL/NE)UpsampleLayer in favor of (NE/CL)Scale | Georgios Pinitas
Upsample functions and kernels can be replaced with Scale, as both provide the same functionality
Partially resolves: COMPMID-3996
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: Ic2f9ba352c183aa87d69d551d5c172d0f22119e8 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4679 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-12-10 | [Review Shape] CLDepthwiseConvolutionLayer mismatches | Giorgio Arena
- Fixed a bug that corrected the number of dimensions of a TensorShape for added trailing 1s - Avoided adding offset_first_element for the Depthwise 3x3 NCHW OpenCL kernels, since it wouldn't align with the window which is based on the output - Adjusted padding requirements along the x for Depthwise 3x3 NCHW. The kernel should always add 2 * dilation_(x/y) to the num_elems_read_x/y - Adjusted the kernel's border_size given to the border handler at function level - Added the dataset that previously made the tests fail Resolves: COMPMID-4041 Change-Id: Ifab7d38b263f12173fcc96a5f0bd3375756c3c53 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4673 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
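The "2 * dilation" term in the padding requirement matches the usual extent of a dilated kernel, (k - 1) * dilation + 1, which for k = 3 is 2 * dilation + 1. The sketch below works through that arithmetic under the assumption that one axis is considered at a time. The function names are hypothetical and this is not the kernel's actual window code.

```cpp
#include <cassert>

// Extent covered by a dilated kernel along one axis:
// (kernel_size - 1) gaps of `dilation` elements, plus the first tap.
int dilated_kernel_extent(int kernel_size, int dilation)
{
    assert(kernel_size >= 1 && dilation >= 1);
    return (kernel_size - 1) * dilation + 1;
}

// Number of input elements touched along one axis when producing
// `num_outputs` consecutive outputs with the given stride.
int input_elems_read(int num_outputs, int stride, int kernel_size, int dilation)
{
    assert(num_outputs >= 1 && stride >= 1);
    return (num_outputs - 1) * stride + dilated_kernel_extent(kernel_size, dilation);
}
```

For a 3x3 kernel, dilated_kernel_extent(3, d) evaluates to 2 * d + 1, so a read window sized for a single tap has to grow by 2 * d elements to cover all three taps.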
2020-12-10 | COMPMID-3921: Remove OpenCL Padding CLBitwiseKernel | Manuel Bottini
Adding BitwiseOperation enum class Generalizing CL Bitwise kernels with a single CLBitwiseKernel Removing CL padding from CLBitwiseKernel Change-Id: I79cd79c1e425b6da7d52308a420edf8cfb7a5a36 Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4646 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-12-10 | Remove (NE/CL)YoloLayer support | Georgios Pinitas
YOLO layer is too specialized and specific to a single model type. Can be decomposed using split, activation and concatenate layers Partially Resolves: COMPMID-3996 Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I3cde88f8d4cc7d8c70ce1bb3b32b00f8d09bdca2 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4678 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2020-12-08 | Wrap Flatten layer over reshape | Georgios Pinitas
Flatten layer is lowered into a Reshape layer. Remove (CL/NE)FlattenLayerKernel.
Partially Resolves: COMPMID-3996
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: Id9e2ddfe2e2dd793541badff3490c05e4c908f88 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4660 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
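Lowering Flatten to Reshape works because flattening a contiguous tensor changes only its shape, not the order of its elements. The sketch below computes a flattened shape under the assumption that the leading three dimensions (for example width, height and channels) are collapsed into one and any remaining dimensions (for example batch) are kept. The function name flatten_shape and its default argument are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

// Returns the shape obtained by collapsing the first `dims_to_collapse`
// dimensions into a single one. The element order in memory is unchanged,
// so the operation is a pure reshape.
std::vector<std::size_t> flatten_shape(const std::vector<std::size_t> &shape,
                                       std::size_t dims_to_collapse = 3)
{
    assert(!shape.empty());
    const std::size_t n = std::min(dims_to_collapse, shape.size());
    const auto split = shape.begin() + static_cast<std::ptrdiff_t>(n);

    // Product of the collapsed dimensions.
    const std::size_t collapsed =
        std::accumulate(shape.begin(), split, std::size_t{ 1 }, std::multiplies<std::size_t>());

    std::vector<std::size_t> out{ collapsed };
    out.insert(out.end(), split, shape.end());
    return out;
}
```

Under these assumptions a 7x5x3 tensor with a batch of 2 becomes a 105x2 tensor, and a reshape to that shape gives the same result as a dedicated flatten kernel.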
2020-12-02 | Remove support for (NE/CL)LocallyConnectedLayer | Georgios Pinitas
Remove out-of-date and unmaintained LocallyConnectedLayer for both NEON and OpenCL. Resolves: COMPMID-3924 Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: Ia61398ed8cfa3876f41c1b342c4a80d1cca0ca83 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4634 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-12-02 | Remove unused CLGEMMMatrixVectorMultiplyKernel | Georgios Pinitas
Partially Resolves: COMPMID-3924 Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: Ibc47bd5bf5203dbad8d0755608918fcb384053c3 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4633 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-12-02 | COMPMID-3862: Add support QASYMM8 LEAKY RELU activation | Sang-Hoon Park
- LEAKY RELU activation is supported for QASYMM8 data type - vquantize on NEON side has been modified to match with other backends (OpenCL and reference) Change-Id: I194631225c8d4f3cc96027d64812ec2be2b4328a Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4593 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Manuel Bottini <manuel.bottini@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-12-01 | COMPMID-4026 Fix FP32 CLDirectConvolutionLayer nightly mismatches | SiCong Li
The mismatches are caused by out of bound memory access on weight tensor due to lack of padding in the channel (first in NHWC) dimension. Signed-off-by: SiCong Li <sicong.li@arm.com> Change-Id: I5a73f190f8e131c67ed7769f6f716db9d79dc674 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4628 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-12-01 | COMPMID-3916: Remove OpenCL padding CLRangeKernel | Manuel Bottini
Change-Id: Id2cc77508b0f2fa36a298059476b01704cfbdcaf Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4580 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2020-11-27 | COMPMID-3961: Cleaning up logical operators on OpenCL | Sang-Hoon Park
Change-Id: I04cd23e9abcb1828e54cd59fee3bfa95a6dea3fb Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4461 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Manuel Bottini <manuel.bottini@arm.com>
2020-11-18 | COMPMID-3961: Add Logical OR/AND/NOT operator on CL | Sang-Hoon Park
Change-Id: I612aeed6affa17624fb9044964dd59c41a5c9888 Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4448 Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-11-17 | COMPMID-3979 Sanitise Padding Removal epic | SiCong Li
* Add missing padding immutability asserts in all relevant CL kernels * Remove unnecessary zero padding validation tests. Change-Id: If93f9ccbc988e0286f5e7b135f812141476d5da0 Signed-off-by: SiCong Li <sicong.li@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4446 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-11-16 | COMPMID-3973: CTS failure in QASYMM8_SIGNED Depthwise and Fully connected when fusing Bounded ReLU in Android R GpuAcc | Michele Di Giorgio
Change-Id: I6cfee002846d0c84de7e0a5f141dfc4807b93b33 Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4421 Reviewed-by: TeresaARM <teresa.charlinreyes@arm.com> Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-11-13 | COMPMID-3956: Nightly CL failure on G71 with error code -7 | Manuel Bottini
Change-Id: Iba02375df47d227feca07cc0215e3389e7c55ade Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4401 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-11-12 | COMPMID-3735 Remove OpenCL padding: CLSoftmaxLayerKernel | Giorgio Arena
- Renamed SELECT_DATA_TYPE to SELECT_VEC_DATA_TYPE to reflect its usage with vectors. SELECT_DATA_TYPE(dt) will now return the primitive data type
- Changed the interface of VEC_OFFS and V_OFFS in order to receive the primitive data type as a parameter rather than its vector form
- Performed a general cleanup of the kernels, such as creating macros for sum and max reductions and removing redundant macros, defines, variables, calculations, etc.
- Using VEC_SIZE and VEC_SIZE_LEFTOVER in every kernel in order to allow computation for smaller shapes without adding paddings
- Removed the actual padding from the kernel and adjusted its calculations accordingly. Added asserts for padding removal checks. Removed invalid Validate tests.
Change-Id: If5ccbd5d34e255d38c7f6bfe8740e2b80b28e264 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4277 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
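One common way to use a vector size together with a leftover count, sketched here on the host side, is to let the first work-item handle only the leftover elements and to shift every later work-item back so that its full-width access ends exactly at the row boundary. The code below is a model of that scheme under those assumptions, not a transcription of the softmax kernels. The function name scale_without_padding and the doubling operation are placeholders.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Models a vectorised element-wise kernel that needs no padding.
// Each "work-item" gid processes up to vec_size elements. When the row length
// is not a multiple of vec_size, work-item 0 processes only the leftover
// elements and the others are shifted back so that they stay in bounds.
std::vector<float> scale_without_padding(const std::vector<float> &src, int vec_size)
{
    assert(vec_size > 0);
    const int          n = static_cast<int>(src.size());
    std::vector<float> dst(src.size());
    if(n == 0)
    {
        return dst;
    }

    const int leftover  = n % vec_size;                  // plays the role of VEC_SIZE_LEFTOVER
    const int num_items = (n + vec_size - 1) / vec_size; // number of work-items

    for(int gid = 0; gid < num_items; ++gid)
    {
        // Start offset, shifted back by the unused part of the first vector.
        const int x = std::max(gid * vec_size - (vec_size - leftover) % vec_size, 0);
        // Work-item 0 performs a partial access when there is a leftover.
        const int count = (leftover != 0 && gid == 0) ? leftover : vec_size;
        for(int i = 0; i < count; ++i)
        {
            dst[x + i] = src[x + i] * 2.0f;
        }
    }
    return dst;
}
```

With 10 elements and a vector size of 4, the three work-items cover the index ranges 0 to 1, 2 to 5 and 6 to 9, so every element is written exactly once and nothing past the end of the row is touched.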
2020-11-09 | COMPMID-3951 LargeGraph_FLOAT32_Rank4_25 CTS failures in Android Q in CL Fix1 | SiCong Li
* Fix CLSpaceToBatchLayerKernel and NESpaceToBatchLayerKernel validation errors by using the correctly calculated output tensor shape Signed-off-by: SiCong Li <sicong.li@arm.com> Change-Id: I21d61f870e6a23a2e38dcb95c939b0bf08082b6f Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4347 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: TeresaARM <teresa.charlinreyes@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-11-09 | COMPMID-3730: Remove CLGEMMMatrixMultiplyKernel Patch2 | SiCong Li
Change-Id: I56137938c9ebe1a5aeeaa750b39fcfc6818016f1 Signed-off-by: SiCong Li <sicong.li@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4332 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-11-07 | COMPMID-3639: (3RDPARTY_UPDATE) Move CL kernels to src | Sang-Hoon Park
Change-Id: I10d27db788e5086adae1841e3e2441cd9b76ef84 Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4310 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-11-05 | COMPMID-3730 Remove padding from CLGEMMMatrixMultiplyKernel Patch1 | SiCong Li
* Remove default definition for STORE_BLOCK_BOUNDARY_AWARE to avoid elusive bugs * Clean up gemm_mm_interleaved* and gemm_mm_floating_point* kernels * Relocate to gemm_v1.cl to avoid clashing with new kernels * Rename compile time arguments to conform with the established terminology(MNKB), and to facilitate the use of STORE_BLOCK_BOUNDARY_AWARE Change-Id: Ia85c746b2536cad87257a79685b459b5d2f9a1be Signed-off-by: SiCong Li <sicong.li@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4329 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-11-03 | COMPMID-3939: Update GEMM heuristic Mali-G77 | Gian Marco Iodice
- Update heuristic for GEMM reshaped RHS only
- Fix left-over block size in CLGEMMMatrixMultiplyReshapedOnlyRHSKernel
Change-Id: I34c738821ed2e4a537da4a15058eec164cb6b61f Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4305 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-11-03 | COMPMID-3721: Remove OpenCL padding CLGEMMLowpMatrixMultiplyReshapedOnlyRHSKernel | Manuel Bottini
Change-Id: I45d26d5f565f9a55f6b5e8d7652b14283ae616f7 Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4299 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2020-10-29 | COMPMID-3928: Fix output conversion in gemmlowp_mm_native | Michele Di Giorgio
This patch solves the following issues that arose from nightly tests:
- The accumulated result of gemmlowp_mm_native can be either uint or int, and in order to be stored in memory it needs to be converted to int.
- The RHS matrix still needs padding on the X dimension. Hence, revert a few changes to add the necessary padding elements.
- Replace zero padding validation tests with an assertion in the configure method of the kernel.
Change-Id: Ib6614a91bd0e98f2b850f52eef14d4fbf55517c8 Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4259 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>