aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2021-04-20Fix biases check in NEDepthwiseMichalis Spyrou
Check if biases are null before passing them to CpuDepthwiseNativeKernel. Resolves: COMPMID-4389 Change-Id: I29acbdefb75b9e81c293801a265e9b850d8d00b9 Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5464 Reviewed-by: TeresaARM <teresa.charlinreyes@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-04-20Update assembly codeMichalis Spyrou
This patch brings performance uplift on Cortex-A35. Resolves: COMPMID-4316 Change-Id: I2b9c02a599373f780dd1b981b821e33bd59a3422 Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5461 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-04-20CLDepthwiseConvolutionLayer rework - Part 1Giorgio Arena
Remove the reshaped variant for CLDepthwiseConvolutionLayer 3x3 NHWC Quantized - Remove kernel selection by GPUTarget - Remove unused quantized support from the NHWC kernel - Remove CLDepthwiseConvolutionLayerReshapeWeightsKernel - Remove OpenCL kernels for reshaped dwc 3x3 quantized and weights reshape - Remove the "_bifrost" suffix in common OpenCL kernel - Remove the ICLDepthwiseConvolutionLayer3x3Kernel common interface Resolve COMPMID-3864, COMPMID-3907 Change-Id: Icfac0fb6c00e214985beb05dad7c0cdbbee7d830 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5447 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-04-20Port CpuConvertFullyConnectedWeights to new APITeresa Charlin
* Remove includes of NEConvertFullyConnectedWeightsKernel.h Resolves partially: COMPMID-4187 Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com> Change-Id: I1bf246546d3ef53edb4c5a8bc05a0db92d2d3bff Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5418 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-04-20Remove OpenCL padding: CLPixelWiseMultiplicationKernelGiorgio Arena
- Change kernel's vec_size to 16 / sizeof(output) - Change ICLKernel.cpp to handle broadcast without padding Resolve COMPMID-3913 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Change-Id: I03e884b250ef5784dc109bff8cf2c96b345d119f Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5450 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2021-04-20[Nightly #1129] CL/Winograd/ConvolutionLayer/F16 mismatch on Mate9Manuel Bottini
Computing the activation in FP32 and then converting in FP16 Resolves: COMPMID-4380 Change-Id: I8a857af65967c8017fb60a358b4f8f0d9fc2e1c2 Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5457 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-04-20Remove experimental tracing featurePablo Marquez Tello
* This commit removes the tracing code which has not been maintained for a few releases. * Resolves MLCE-445 Change-Id: I14793c82fe58ffef0cf936edf4af077b5dde85f8 Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5455 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-04-19Added S32 Integer support to DIV operator in CLElementWiseOperations with TestsSuhail Munshi
Partially Resolves : COMPMID-3793 Signed-off-by: Suhail Munshi <MohammedSuhail.Munshi@arm.com> Change-Id: I14d6884c34f33a6caee11fc1230f9d2d3ae6c4c1 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5425 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-by: Manuel Bottini <manuel.bottini@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-04-19CLInstanceNormalizationLayer NHWC optimisationPablo Marquez Tello
* Make changes to split the workload into two kernels. One kernel precomputes mean and variance and the second kernel just loads these precomputed values. * The new approach runs %30 faster than the original code for NHWC workloads like 32x192x256. * Resolves MLCE-337 Change-Id: I8356fcefa2d131ab4dcb32268ce7142421d073e4 Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5355 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2021-04-19Port DepthwiseConvolution to new APIMichalis Spyrou
Resolves: COMPMID-4185 Change-Id: Ib5f22356356a022d567bb18d44ea272b62d10ebf Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5424 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-04-19Remove padding from CLNormalizePlanarYUVLayerKernelSheri Zhang
Resolve: COMPMID-3911 Signed-off-by: Sheri Zhang <sheri.zhang@arm.com> Change-Id: Id5615b6a8b52030fb611a1a04bcd4664b8232e90 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5451 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-04-19Add Tensor related utilities to the new APISang-Hoon Park
A couple of utility functions to get the information about tensors are added. Those functions are placed at an additional header file for better grouping. Related test cases are also added. Resolves: COMPMID-4376 Change-Id: I6bd09cbf60fddcf4fe651906982397afb0451392 Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5405 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-04-19Add padding consideration to pooling index computationSang-Hoon Park
Fix the pooling kernel which has been missing consideration of left padding, which can be implictly added by external kernels. Additionally, tests for FP16 have been added for the logic. Resolves: COMPMID-4363 Change-Id: I5655991cb80f749fb1ae9bbd3918b436a078f5d1 Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5421 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-04-16Fix bug on Implicit Padding for NEON FFT2DManuel Bottini
Include paddings in address computation for input and output Resolves: COMPMID-4362 Change-Id: I1b34cf47e3b80b98d55fc8fbdeecbfd850d33197 Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5439 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2021-04-16Remove unnecessary use AccessWindowsMichele Di Giorgio
In these cases, no padding is introduced and the use of AccessWindows is not necessary and makes the code more confusing. Change-Id: Id712cba35bb0440eb40c69fdc7ad0084dc9a5ab3 Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5440 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-04-16Add GEMM heuristic for Mali-G78Gian Marco Iodice
- Replace std::map with a basic container with std::array Change-Id: I76f53ca61676ca0e5136ce61a3f3adb10e22b4c3 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5441 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-04-15Add Implicit Padding in DirectConvolution FixtureGiorgio Arena
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Change-Id: I97ef222ce9ea5a9c01089697f67863673985dff8 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5410 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-04-15Fix validation bug in release mode for armv7Giorgio Arena
Resolve COMPMID-4377, COMPMID-4379 Change-Id: I302f08b5bf0afb5295d31843fea20181d9283658 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5435 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-04-14Removes redundant std::move of temporaryGeorgios Pinitas
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: Iff8d463eb5575dc7ae55a155e448488495298e16 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5431 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-04-14Remove unused AccessWindow* includesMichele Di Giorgio
Change-Id: I9f8d0c6e17d58700cc01fc5134cd2dffd26bc742 Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5430 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2021-04-14Fix NeDepthwiseConvolution bad_alloc issueSheri Zhang
The memory allocation of the failed tests's tensors are huge, all the virtual memories are used up. And then when assign the output to _reference, a copy constructor is called and allocate more memory and caused the crash. Delete one of the tensor shape and adjust one. Resolve: COMPMID-4366 Signed-off-by: Sheri Zhang <sheri.zhang@arm.com> Change-Id: I91a816384ec358481018a724a7b0cd59730698fc Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5426 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-04-14Add support for a global allocator for OpenCL tensorsGeorgios Pinitas
Give the ability to the user to specify an allocator that can be used by all the internal function tensors. This being a global needs to outlive all the tensors/functions that are using it. Resolves: COMPMID-4212, COMPMID-4213 Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I251871c242879976819ebca1452404133a8e62d7 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5420 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-04-14Port NEDirectConvolutionLayer to new APIManuel Bottini
Partially resolves: COMPMID-4009 Change-Id: I19ffb61c5c4541134a5028677d2d81228740e454 Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5419 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2021-04-14Remove OpenCL padding: CLNormalizationLayerKernelManuel Bottini
Only for NHWC data layout Resolves: COMPMID-3910 Change-Id: Ie2d71482b3e3b55ac155e9af152032a5de8bbd50 Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5388 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-04-13Port CLConvertFullyConnectedWeights to new APITeresa Charlin
* Replace ICLKernel by IClKernel in other unrelated kernels Resolves partially: COMPMID-4187 Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com> Change-Id: I173b8f2ac645dbfd7d412f4b058c5c9655c229ee Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5402 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-04-13Fix TILE initialization in direct convolution and winograd transformsGian Marco Iodice
- The array initializer for the TILE object cannot always be utilized and so we do require to manually initialize the TILE with the LOOP_UNROLLING macro - Resolves COMPMID-4371 Change-Id: I2598354b9fae84c5e3bd11219fffdcdc297215e1 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5417 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-04-13Check data type lower bound when creating a tensorGeorgios Pinitas
As underlying enum type can be an int we check also the lower bound of the data type when creating a tensor to avoid creation with negative values. Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I00fa3cae988c5f20a56115b1c1b85b70e699c966 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5413 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-04-13Disallow padding addittion when operating on CL imagesGeorgios Pinitas
Importing buffer to OpenCL images requires compliance to strict HW requirements. Thus, disabling testing implicit padding when importing buffer to image. Resolves: COMPMID-4368 Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: Ib75499040b6dc02b435125674dc2cc91eacbf7ba Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5416 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-04-13Fix T_LOAD too few parameters issueSheri Zhang
Resolve: COMPMID-4370 Signed-off-by: Sheri Zhang <sheri.zhang@arm.com> Change-Id: I4b2a8bf252405fe9006784fa1769ad5b6e708a71 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5414 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-04-13Implicit padding testing along the X axis on high priority operatorsGiorgio Arena
Add artificial implicit padding testing for the following fixtures: - Scale - FullyConnected - Pooling - DepthwiseConvolution - DirectConvolution - Winograd - FFT - GEMM/GEMMLowp Create utility function that loops through a list of tensor and adds random padding based on the global seed (only for NHWC layer layout). Remove GEMMLowpAssemblyFixture since it wasn't used Remove some AssetsLibrary headers since they weren't used Resolve COMPMID-4161 Change-Id: Ib6f4f7f113ae69b993d7b2a9e04abbf3de8c99fe Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5327 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-04-12Fix CLDepthwiseConvolutionLayer QSYMM8_PER_CHANNEL mismatchesGiorgio Arena
Resolve COMPMID-4367 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Change-Id: I5e65b62c2ca52cf65950c9c343864ef55b7122c3 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5407 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-04-12Add support for cl_image in CLDirectConvolutionLayerGian Marco Iodice
- The cl_image object can be used for the weights - cl_image can only work for f32/f16 - Fix the implicit padding on the first dimension X Resolves COMPMID-4341 Change-Id: I04e0901c69e7765c42afceca38c4a840645b9123 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5393 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-04-12Fix OpenCL kernel compiling failure with array initializerSheri Zhang
The issue is related with clang version, clang 3.9 has the problem, clange 4.0 works. The workaround is to add an extra {} to make this work. Partial resolves: COMPMID-4348 Signed-off-by: Sheri Zhang <sheri.zhang@arm.com> Change-Id: Ia079cbb3c44d617b1b42cb2af758b5a8ba1a032e Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5399 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Manuel Bottini <manuel.bottini@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-04-12Fix validation in reshape kernel [cpu,gpu]Gian Marco Iodice
- We were validating the output data type, shape and etc when the output was not initialized yet Change-Id: I71a3cda2aa2de500f5690ae8a1cfd05ece0c3858 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5398 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-04-09Fix bug on Implicit Padding for CL GEMMMatrixMultiplyInterleavedTransposedManuel Bottini
Resolves: COMPMID-4342 Change-Id: I468c6d68c0284e4ec76f22037a697fff7bc5638c Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5391 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-04-09Winograd Output transform 7x7 reworkGiorgio Arena
Resolve COMPMID-4140 Change-Id: I17db0ee596665598d08d4359a373160f21ab9acd Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5390 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-04-09Fix OpenCL kernel compiling failure with array initilizerSheri Zhang
The issue is related with clang version, clang 3.9 has the problem, clange 4.0 works. The workaround is to add an extra {} to make this work. Resolves: COMPMID-4348 Signed-off-by: Sheri Zhang <sheri.zhang@arm.com> Change-Id: I2d8fc6400f32af5406fbf2d2556127a53b2ce918 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5392 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-04-08Skip tests that expect exception when asserts are offMichele Di Giorgio
When asserts are off, no exceptions are thrown. Therefore, we need to skip those tests and provide a message to inform the user. Resolves: COMPMID-4346 Change-Id: I5e4696fb7c97a5a6527620b5372f5a1c5e42f931 Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5381 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2021-04-08Rewritten Winograd (4x4, 5x5) output transformationAleksandr Nikolaev
This patch takes advantage of tile_helpers.h and different data layout input and tmp matrices. Resolves: COMPMID-4142 Signed-off-by: Aleksandr Nikolaev <aleksandr.nikolaev@arm.com> Change-Id: I5d10bd3f08137414ee7520eef1e6d0aef8cbf160 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5382 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2021-04-08Fix incorrect return statement in gemm_uint8 heuristic selectionGeorgios Pinitas
Semantic fix that otherwise led to compilation errors when building for SVE and when MMLA instruction was enabled for int8. Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I4852d806789d52c4ed1d3b9132b2f20c2f9b41fa Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5384 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-04-08Ensure OpenCL runtimes are initialized firstMarco Antognini
The OpenCL API Specification states: > The behavior of OpenCL API functions called from global constructors > or destructors is therefore implementation-defined. This patch improves compatibility with OpenCL runtimes that use static objects to hold their internal state. Change-Id: I850be378e9c6f0b5aa8db926fe0c62833a936724 Signed-off-by: Marco Antognini <marco.antognini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5383 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Sheri Zhang <sheri.zhang@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-04-08Fix convolution with bias segmentation fault issueSheri Zhang
Indirect hybrid kernels read the full width of the bias. So we need to detect the case where we are writing a partial block and pad the bias for that block. Resolves: COMPMID-4321 Signed-off-by: Sheri Zhang <sheri.zhang@arm.com> Change-Id: Ib8d8637724e34d1eae6cc22223df8d81a6d0ded6 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5380 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-04-08Rework the OpenCL Winograd Input Transformations NHWCGian Marco Iodice
- Rework Winograd Input Transform 3x3 NHWC using the new macros - Rework Winograd Input Transform 5x5 NHWC using the new macros - Rework Winograd Input Transform 7x7 NHWC using the new macros - The new implementation is also faster than before - Winograd Input Transform 5x5/7x7 3x faster Resolves COMPMID-4139 Change-Id: Ia9c8af23a2d47d2db60ec4c44650a63a34ffa0d5 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5358 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2021-04-08Substitute CLFullyConnectedLayerReshapeWeights by CLTransposeTeresa Charlin
Resolves partially: COMPMID-4359 (1/2) Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com> Change-Id: Id1859f3cd530eb05f027226e2004cf518778147e Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5377 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-04-08Substitute NEFullyConnectedLayerReshapeWeights by NETransposeTeresa Charlin
Resolves partially: COMPMID-4359 (2/2) Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com> Change-Id: Id65ef04268575cc9d74be6114e82e116b8ed106d Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5378 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-04-07Implement Fanout mode in CPPSchedulerSiCongLi
This new scheduler mode is implemented to reduce runtime overhead on high thread counts by distributing the scheduling work to all threads. The fanout mode should only be enabled on high thread counts (e.g. > 8 threads). Alternatively the mode can be forced by setting the environment variable ARM_COMPUTE_CPP_SCHEDULER_MODE to be either "linear" (default) or "fanout". Note that on bare-metal this functionality is turned off but it does not matter as only multi-threading is not supported on bare-metal. Resolves COMPMID-4349 Signed-off-by: SiCongLi <sicong.li@arm.com> Change-Id: I46e2fab83ea24e616c82ae94dca7b2e72a73c7b8 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5352 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-04-07Add support for armv7 hardfloat buildSlava Barinov
Change-Id: I9286dd199670e6311aee3c040d7b02a401522dd4 Signed-off-by: Slava Barinov <v.barinov@samsung.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5369 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-04-07Add per channel quantization support for NEDeconvolutionLayerFreddie Liardet
Add QSYMM8_PER_CHANNEL support on weight input for NEDeconvolutionLayer and reference version. Resolves: COMPMID-3437 Signed-off-by: Freddie Liardet <frederick.liardet@arm.com> Change-Id: I7c9a28d4d0fea324ed8e5a24fbd0422e5ede145c Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5364 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2021-04-07Review all shapes in datasets to account for padding removal Part 2SiCong Li
* Fix validate compare_dimensions to account for cases where 1D tensor permuted into a 3D tensor * Add the following configurations to stress padding removal: * size = 1 * size = multiple of processing size * size = non-multiple of processing size Partially resolves COMPMID-3865 Change-Id: Iee0de4c9e72b3413c0807e2c86bc2331911a4c6f Signed-off-by: SiCong Li <sicong.li@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4393 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-04-06Correct Copyright datesMichalis Spyrou
Some dates where wrongly changed to 2021 when we moved some files over to the new API. Resolves: COMPMID-4312 Change-Id: I4aae61f7f4d01f69fcb664b0f71b9e508bd1f5f8 Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5361 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Manuel Bottini <manuel.bottini@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>