aboutsummaryrefslogtreecommitdiff
path: root/src
AgeCommit message (Collapse)Author
2021-05-12Fix GEMMLowp output stage validation crash when input's first dimension == 1Giorgio Arena
- Change vectors' fixed length of 4 in gemmlowp_output_stage_quantize_down_float with its kernel's adjusted vec_size Resolve COMPMID-4355 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Change-Id: I371ecf114356fb6a8d18c5c3727f09ae247484bd Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5631 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-05-12Fix bug in Select operator when input is 1xNFreddie Liardet
Fix issue where gpu select kernel would not compile when input was 1xN size. Also add relevant tests for 1xN input. Resolves: COMPMID-4357 Signed-off-by: Freddie Liardet <frederick.liardet@arm.com> Change-Id: Ib5f339379e9cc7ef05605312889dbad34cbfe5dd Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5620 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-05-12Fix 'ARM_DOT_K0XN0' macro redefinedGiorgio Arena
Resolve COMPMID-4339 Change-Id: Ieacd288ca8f69e04d88692a093325a24198ebe3e Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5623 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-05-12Suppress unused variable warning that is present on empty backend buildsGeorgios Pinitas
When neither OpenCL nor accelerated CPU backends are built then unused variable warnings are signalled. Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I548a440849178c8430f74ab4f75736d081a50903 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5604 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-05-12Remove unused CLCoreRuntimeContextGeorgios Pinitas
CLCoreRuntime context is currently unused and is planned to be replaced by the Context infrastructure Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: Ic2874800960ca954f647e8867e7db951ce823e1c Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5571 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-05-11Adding S32 Support to NEG operator in CLElementwiseUnaryLayerSuhail Munshi
Resolves : <COMPMID-3793> Signed-off-by: Suhail Munshi <MohammedSuhail.Munshi@arm.com> Change-Id: I4a6144ba788ae46d9637987455bec2dff8b6f561 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5586 Reviewed-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-by: Aleksandr Nikolaev <aleksandr.nikolaev@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-05-11Add support for additional CPU variantsGeorgios Pinitas
Resolves: COMPMID-4392 Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: Id0a1694fd479499347d5f820fc228a69cb55f4d1 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5610 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-05-07Fix missing DATA_TYPE in DOT_PRODUCT4_INTEGER8 OpenCL macroGian Marco Iodice
- DOT_PRODUCT8_INTEGER8 and DOT_PRODUCT16_INTEGER8 are calling DOT_PRODUCT4_INTEGER8 without passing DST_DATA_TYPE Resolves COMPMID-4491 Change-Id: I394bd2f9208489e820885e49ed40e607d6470620 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5594 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-05-07Remove TODOsSheri Zhang
Remove TODO/FIXME either already done or won't do. Partially resolve: COMPMID-4471 Signed-off-by: Sheri Zhang <sheri.zhang@arm.com> Change-Id: Iec8f25b9bf1b2a70c072dd17d44625fa93e84ed1 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5591 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Georgios Pinitas <georgios.pinitas@arm.com>
2021-05-07Update heuristic for CLConvolutionLayerGian Marco Iodice
- Call direct convolution when filter size height is greater than or equal to 5 Resolves COMPMID-4439 Change-Id: Ie8ccccc0629eb4c74bd62c4bb4ced47f6898a945 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5589 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-05-06Fix performance issues in NEReductionPablo Marquez Tello
* Revert "Workaround for compiler error in gcc-9.2 and 9.3" This reverts commit e81825bf68ebfce21f6839fa59ddb7e22884a206. * Resolves COMPMID-4462 * Solves issue when using GCC9.2 Change-Id: Id17a5eadb10083ea10665f1f42f21a3af3cafd36 Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com> Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5587 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-05-05Creates ClKerneLibrary to serve just as a kernel container without build logicGeorgios Pinitas
Moves all the kernels out of the legacy CLKernelLibrary into a separate class that its planned responsibility is to only handle the kernel strings. Currently CLKernelLibrary mixes kernel inventory and building logic, thus coupling multiple responsibilities. By separating compilation logic, the kernel library removes any dependencies from the OpenCL and simplifies maintainance but also increases flexibility. Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: Idcd491783b5a413bd944d385956075ead0204362 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5572 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Aleksandr Nikolaev <aleksandr.nikolaev@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-05-05Rename Quantization/Dequantization kernels/operators to imperative moodGeorgios Pinitas
Renames the following kernels/functions - [Cl|Cpu]DequantizationKernel -> [Cl|Cpu]DequantizeKernel - [Cl|Cpu]Dequantization -> [Cl|Cpu]CpuDequantize - [Cl|Cpu]QuantizationKernel -> [Cl|Cpu]QuantizeKernel - [Cl|Cpu]Quantization -> [Cl|Cpu]Quantize Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: Ic3c5eb3b7fe28f807294d159830eef99c2dd6219 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5566 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-05-05Adding S32 support to CLPixelWiseMultiplicationSuhail Munshi
Partially resolves : COMPMID-3793 Signed-off-by: Suhail Munshi <MohammedSuhail.Munshi@arm.com> Change-Id: Id82e00c784f0a039017fd896f11671bdda2dd4ab Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5530 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-05-05Fix for tanh at small argument valuesAleksandr Nikolaev
x - x^3/3 is more accurate approximation for |x| < 0.005 than (exp2x - 1)/(exp2x + 1). Resolves: COMPMID-4098 Signed-off-by: Aleksandr Nikolaev <aleksandr.nikolaev@arm.com> Change-Id: If6f9d7ce4d8d00d36d2dada7ab8f8d9f5b58f5c0 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/c/VisualCompute/ComputeLibrary/+/321354 Tested-by: bsgcomp <bsgcomp@arm.com> Comments-Addressed: bsgcomp <bsgcomp@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5563 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-05-04Fix bug on CLReductionOperationGiorgio Arena
Execution window along the X axis needs to be collapsed on the 3rd axis (rather than the 2nd) since there could be implicit padding added along the Y Resolve COMPMID-4425 Change-Id: I9623a31749b737fea7c623cabdcfbf77cbe8f6dc Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5551 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: TeresaARM <teresa.charlinreyes@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-05-04NEReduceMean failed on v8.2 debug build for AndroidManuel Bottini
vpadd is not correctly converted by some compilers in debug. Therefore we opted for a serial computation of the elements in the result vector for debug builds Resolves: COMPMID-4420 Change-Id: I2d32af8568852a419226a409e3849d08e4e649c7 Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5536 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-05-04Rename PixelwiseMultiplications to Mul for simplicityGeorgios Pinitas
Changes the names of the following: - PixelWiseMultiplicationKernel to MulKernel for all backends - PixelWiseMultiplication to Mul for all backends Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I88108c2d22c888fce37ea1028863026160b9da97 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5534 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-04-30Add support for new CPU variantsGeorgios Pinitas
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I6c9555f945a4a2a986f1b8dbb2782f2e3994c502 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5537 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-04-30Update operator list documentation. Part 2.Teresa Charlin
All data type and data layout information for the operators are store in the function header files Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com> Change-Id: I30b564f7eda6bbd99bf3ad36ddb6639ac118eb8b Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/c/VisualCompute/ComputeLibrary/+/319829 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Comments-Addressed: bsgcomp <bsgcomp@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5531 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-04-30Add optimization for global pooling in pooling_layer.clGian Marco Iodice
- Simplify the implementation when the pooling size has the same spatial dimensions of the input tensor - Rework the heuristic for F32/F16 - Add test for validating the global pooling path - Fix compare_dimensions in validation. The validation fails because we have different number of dimensions for NCHW and NHWC (e.g. 1,1,2,1(NCHW) -> 2,1,1,1(NHWC) Resolves COMPMID-4426 Change-Id: Ia53ee659a9fbc3d011f286a8150d1be9d6d2cd05 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5533 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-by: TeresaARM <teresa.charlinreyes@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-04-29Remove Global pooling optimizationMichalis Spyrou
It doesn't take into consideration the padding. Change-Id: Ie7a2a0c7aee93fd7007deb6e46f6bec0e7f06066 Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5529 Reviewed-by: TeresaARM <teresa.charlinreyes@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-04-29Remove OpenCL padding: CLReductionOperationKernelGiorgio Arena
Change the parallel implementation across the X, now every thread computes one row Add missing test for MEAN_SUM Make reduction on any axis != 0 work with num_channels > 1 Resolve COMPMID-3917 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Change-Id: Ib0f99540104e3c253bcd1ea637833db533f5e76e Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5522 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-04-29Add config_id for CLPixelWiseMultiplicationGian Marco Iodice
- Add missing config_id in ClPixelWiseMultiplicationKernel Resolves COMPMID-4410 Change-Id: I26103d3395ac2e03591b97f8de1ffcb91cd3bf21 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5528 Reviewed-by: Manuel Bottini <manuel.bottini@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-04-29Remove stale/solved TODOsMichele Di Giorgio
Change-Id: I5c440f4c6ca4186adcfa926e6b7d924086671f29 Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5520 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Manuel Bottini <manuel.bottini@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-04-29Fix Global Pooling failuresMichalis Spyrou
Correctly calculate the filter size when we exclude padding. Resolves: COMPMID-4408 Change-Id: Ia96d6ddbda58d23a5a0e02854ecb22089d538f94 Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5523 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Sheri Zhang <sheri.zhang@arm.com>
2021-04-28Add Queue supportGeorgios Pinitas
Queues are responsible for scheduling operators and performing other runtime related activities like for example tuning. Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I0366d9048470d277b8cbf59fa42f95c0ae57c5c9 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5487 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-04-28Fix for performance regression on G71 (mate9) for direct convolutionAleksandr Nikolaev
This variant with GEMM solves regression problem only partially, but guarantees precision. Resolves: COMPMID-4387 Signed-off-by: Aleksandr Nikolaev <aleksandr.nikolaev@arm.com> Change-Id: I7fdb7d6ecf9d67926d994a9f666a80aeac3147a5 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/c/VisualCompute/ComputeLibrary/+/317792 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Comments-Addressed: bsgcomp <bsgcomp@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5470 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-04-28Add per-channel quantization support for CLDeconvolutionLayerFreddie Liardet
Add QSYMM8_PER_CHANNEL support on weight input for CLDeconvolutionLayer. When weights are per-channel quantized type "Direct" method is always used. Also reduce number of QSYMM8_PER_CHANNEL tests for NEDeconvolutionLayer. Resolves: COMPMID-3438 Signed-off-by: Freddie Liardet <frederick.liardet@arm.com> Change-Id: I1330cac5142e19d21e322574fb8d912558745b02 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5484 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-04-28Add explicit cast on GEMMLowp kernelMichalis Spyrou
Oclgrind flags this as a warning. Resolves: COMPMID-4338 Change-Id: Id5a075894027d867bdd21937626f052acb05b531 Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5512 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Manuel Bottini <manuel.bottini@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-04-27Add optimization for global pooling in pooling_layer.clGian Marco Iodice
- Simplify the implementation when the pooling size has the same spatial dimensions of the input tensor - Rework the heuristic for F32/F16 - Add test for validating the global pooling path - Fix compare_dimensions in validation. The validation fails because we have different number of dimensions for NCHW and NHWC (e.g. 1,1,2,1(NCHW) -> 2,1,1,1(NHWC) Change-Id: Iba680cb30bf2a5d0952265a4cc9794f368549ca5 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5510 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-04-27Remove padding from CLROIPoolingLayerKernelGian Marco Iodice
Resolves COMPMID-3915 Change-Id: I26c103507717f16588f19c07bf20df5657022dec Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5489 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-04-27Fixed CTS failures CLInstanceNormPablo Tello
* Resolves COMPMID-4400 Change-Id: I54c33a017c735194fbf4437d1c7df465208bc0ca Signed-off-by: Pablo Tello <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5505 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Sheri Zhang <sheri.zhang@arm.com>
2021-04-24Fix Depthwise failure in Cpu backendMichalis Spyrou
Resolves: COMPMID-4395 Change-Id: Ib3dfdc42e95998c1e5713d6ec1bdaa83299b0360 Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5488 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-by: TeresaARM <teresa.charlinreyes@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-04-23[Nightly #1129] CL/Winograd/ConvolutionLayer/F16 mismatch on Mate9Manuel Bottini
Fixing Conv5x5, Conv5x1, Conv1x5 Resolves: COMPMID-4380 Change-Id: I5206d9b85b1d73f6010f02c119aae91266395ba7 Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5485 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Aleksandr Nikolaev <aleksandr.nikolaev@arm.com> Reviewed-by: Sheri Zhang <sheri.zhang@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-04-23Add missing limits includeMichalis Spyrou
This caused build failures on bare metal. Resolves: COMPMID-4399 Change-Id: I151012740a440e8939b76b978aa9e96741229245 Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5482 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-04-22Fix bug on CLPixelWiseMultiplication broadcasted for Quantized typesGiorgio Arena
Resolve COMPMID-4396 Change-Id: I9b16791f84d60bc4a5303a6393cdbe9db3a4f0e9 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5483 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: TeresaARM <teresa.charlinreyes@arm.com>
2021-04-21Add support for CLVKMichalis Spyrou
This patch enables CLVK through the graph API and inside the CLScheduler. By default the Native platform is selected. Selecting CLVK can be done via --target=clvk. Resolves COMPMID-4205 and COMPMID-4206 Change-Id: Ic60744980c6b8a60e776627ea677ed46be88f656 Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5475 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2021-04-21Remove unused yolo_layer OpenCL kernelMichele Di Giorgio
Change-Id: Ibab2095a1a5b525c8513f924cd5bcecd5172ed48 Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5467 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-04-20Fix biases check in NEDepthwiseMichalis Spyrou
Check if biases are null before passing them to CpuDepthwiseNativeKernel. Resolves: COMPMID-4389 Change-Id: I29acbdefb75b9e81c293801a265e9b850d8d00b9 Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5464 Reviewed-by: TeresaARM <teresa.charlinreyes@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-04-20Update assembly codeMichalis Spyrou
This patch brings performance uplift on Cortex-A35. Resolves: COMPMID-4316 Change-Id: I2b9c02a599373f780dd1b981b821e33bd59a3422 Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5461 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-04-20CLDepthwiseConvolutionLayer rework - Part 1Giorgio Arena
Remove the reshaped variant for CLDepthwiseConvolutionLayer 3x3 NHWC Quantized - Remove kernel selection by GPUTarget - Remove unused quantized support from the NHWC kernel - Remove CLDepthwiseConvolutionLayerReshapeWeightsKernel - Remove OpenCL kernels for reshaped dwc 3x3 quantized and weights reshape - Remove the "_bifrost" suffix in common OpenCL kernel - Remove the ICLDepthwiseConvolutionLayer3x3Kernel common interface Resolve COMPMID-3864, COMPMID-3907 Change-Id: Icfac0fb6c00e214985beb05dad7c0cdbbee7d830 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5447 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-04-20Port CpuConvertFullyConnectedWeights to new APITeresa Charlin
* Remove includes of NEConvertFullyConnectedWeightsKernel.h Resolves partially: COMPMID-4187 Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com> Change-Id: I1bf246546d3ef53edb4c5a8bc05a0db92d2d3bff Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5418 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-04-20Remove OpenCL padding: CLPixelWiseMultiplicationKernelGiorgio Arena
- Change kernel's vec_size to 16 / sizeof(output) - Change ICLKernel.cpp to handle broadcast without padding Resolve COMPMID-3913 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Change-Id: I03e884b250ef5784dc109bff8cf2c96b345d119f Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5450 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2021-04-20[Nightly #1129] CL/Winograd/ConvolutionLayer/F16 mismatch on Mate9Manuel Bottini
Computing the activation in FP32 and then converting in FP16 Resolves: COMPMID-4380 Change-Id: I8a857af65967c8017fb60a358b4f8f0d9fc2e1c2 Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5457 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-04-20Remove experimental tracing featurePablo Marquez Tello
* This commit removes the tracing code which has not been maintained for a few releases. * Resolves MLCE-445 Change-Id: I14793c82fe58ffef0cf936edf4af077b5dde85f8 Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5455 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-04-19Added S32 Integer support to DIV operator in CLElementWiseOperations with TestsSuhail Munshi
Partially Resolves : COMPMID-3793 Signed-off-by: Suhail Munshi <MohammedSuhail.Munshi@arm.com> Change-Id: I14d6884c34f33a6caee11fc1230f9d2d3ae6c4c1 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5425 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-by: Manuel Bottini <manuel.bottini@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-04-19CLInstanceNormalizationLayer NHWC optimisationPablo Marquez Tello
* Make changes to split the workload into two kernels. One kernel precomputes mean and variance and the second kernel just loads these precomputed values. * The new approach runs %30 faster than the original code for NHWC workloads like 32x192x256. * Resolves MLCE-337 Change-Id: I8356fcefa2d131ab4dcb32268ce7142421d073e4 Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5355 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2021-04-19Port DepthwiseConvolution to new APIMichalis Spyrou
Resolves: COMPMID-4185 Change-Id: Ib5f22356356a022d567bb18d44ea272b62d10ebf Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5424 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-04-19Remove padding from CLNormalizePlanarYUVLayerKernelSheri Zhang
Resolve: COMPMID-3911 Signed-off-by: Sheri Zhang <sheri.zhang@arm.com> Change-Id: Id5615b6a8b52030fb611a1a04bcd4664b8232e90 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5451 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>