aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2022-08-17Revert "Fix performance regression in ClConv2D"v22.08branches/arm_compute_22_08Ramy Elgammal
- Reverting commit e54d8c07e75d70baeb80fecbb43088027ea45658 because it has caused unexpected regressions. Resolves: COMPMID-5504 Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com> Change-Id: I2a0bcc6a311009a81f20a146079758ad138fff5b Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8092 Benchmark: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2022-08-16Fix performance regression in ClConv2DGian Marco Iodice
- Call gemm-based convolution when the kernel is large and the stride is unit Resolves: COMPMID-5504 Change-Id: I8dd83bd012000ed76824b96a8e37c98c861c59e4 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8081 Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Adnan AlSinan <adnan.alsinan@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-08-12Fix note in guidelines docRamy Elgammal
Partially Resolves: COMPMID-5346 Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com> Change-Id: I26212bfb46cd451ef01956ade0a306bbdbf5940e Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8069 Reviewed-by: Adnan AlSinan <adnan.alsinan@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2022-08-12Fix performance regression in Conv2D on OpenCLAdnan AlSinan
- Affect CL backend. - For FP32 datatype, affect different platforms. Resolves COMPMID-5479 Signed-off-by: Adnan AlSinan <adnan.alsinan@arm.com> Change-Id: I705d718bc9b7def218034958f7ef86f2c2abe06d Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8064 Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-08-12Update release notes about armv8.6 build flag changeRamy Elgammal
Partially Resolves: COMPMID-5346 Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com> Change-Id: Ida1ad7ff19837203b3366466ba8018386cd6b18f Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8065 Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-08-11Disable unsafe FP optimizations in Winograd Output TransformGunes Bayir
The unsafe FP optimizations flag causes accuracy issues when certain conditions are met regarding the hardware type, data type and the activation function. Resolves: COMPMID-5375 Change-Id: I1b0b06549b8c108617962d006a20dd263d5e3c21 Signed-off-by: Gunes Bayir <gunes.bayir@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8061 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2022-08-11Update READMERamy Elgammal
Partially Resolves: COMPMID-5346 Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com> Change-Id: I41755295450b6ee698d8998d8a6d6bf9d4e4e7a9 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/c/VisualCompute/ComputeLibrary/+/443006 Comments-Addressed: bsgcomp <bsgcomp@arm.com> Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Sicong Li <sicong.li@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8052 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2022-08-11Fix CTS/SLTS failure related to Depthwise ConvolutionGunes Bayir
The issue is caused by GPUTarget not being set explicitly for the Depthwise convolution kernel, but it's being used in its build configuration. This causes the default value to be used and enables some unsafe FP optimizations. Resolves: COMPMID-5490 Change-Id: I5300a1168962cacb62cf49db795f052cf6740c7e Signed-off-by: Gunes Bayir <gunes.bayir@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8059 Reviewed-by: SiCong Li <sicong.li@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-08-08Fix for AI benchmark ResNet regressionViet-Hoa Do
* For 3x3 kernel, only choose the implementation with larger tile size if the input tensor is larger than the tile. Resolves: COMPMID-5467 Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com> Change-Id: I2cf95ddb25f477cb05da3b3501e0afe9548fc33a Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8022 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2022-08-08Update ErrataRamy Elgammal
Resolves: COMPMID-5345 Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com> Change-Id: I2fe4cb16bddf0f6a25906ab471e0a684d9ddf0ec Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8050 Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
2022-08-05Update SONAME_VERSION in SConscriptRamy Elgammal
Partially Resolves: COMPMID-5346 Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com> Change-Id: I1740d3244d7fcaf58613bbb508db7b46ba13f19e
2022-08-05Fix LeNet-f16 convolution regressionAdnan AlSinan
Resolves COMPMID-5420 Signed-off-by: Adnan AlSinan <adnan.alsinan@arm.com> Change-Id: I24dca916f49f82e7e5ec809500ae5fe32c8adc97 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8020 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2022-08-04[ONCPUML-970] Fast math mode for fixed format kernelsPablo Marquez Tello
Minor tweaks and test for running fixed format kernels with BF16 operations when specified by the user. Change-Id: Ic8167f67b86b1298da65e46cfebed9f3b86940e4 Signed-off-by: Milos Puzovic <milos.puzovic@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8000 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2022-08-03Add Dynamic Fusion Tests with BugFixesMohammed Suhail Munshi
- Allow fusing arbitrary number of existing elementwise operators - Fix issues with 3D and 4D tensors in Elementwise Addition and Floor components - Collapse the 3D/4D window in the same way as that used by Conv2d, i.e. collapse dim 1 and dim 2 together - Fix Floor component issues when used after other components - Add Dynamic Fusion Tests (Floor + Div, Conv2d + Add + Div) - Add Addition ElementWise Broadcasting Test Resolves: [COMPMID-5356] Change-Id: I58b93a90175bb0440d43531d18cac94b5f5c2689 Signed-off-by: Mohammed Suhail Munshi <MohammedSuhail.Munshi@arm.com> Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/c/VisualCompute/ComputeLibrary/+/433956 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com> Comments-Addressed: bsgcomp <bsgcomp@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7957 Reviewed-by: SiCong Li <sicong.li@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2022-08-03[ONCPUML-968] Fixed format kernel support in additional APIsMilos Puzovic
Implements required plumbing in order to be able to ask and execute fixed format kernels from NEFullyConnected, NEGEMM and NEGEMMConv2d. These APIs are used to accelerate oneDNN primitives (inner product, matrix multiplication and indirect GEMM respectively) and without changes it would not be possible to call fixed format kernels from those oneDNN primitives. Change-Id: I27534f0491ce28d0ccb98c19f318bd33dcdf2ff5 Signed-off-by: Milos Puzovic <milos.puzovic@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7999 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2022-08-02Update the GPUTarget listGian Marco Iodice
Resolves COMPMID-5405 Change-Id: I995b5e79bef13529097ed17f7854763a4cf89272 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7986 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2022-08-01Optimize add layer by considering the input tensors as 1D arrayGunes Bayir
Resolves: COMPMID-5108 Change-Id: I544f8160fbe5b4ffbef348d1fbd3dd626a6e1bdb Signed-off-by: Gunes Bayir <gunes.bayir@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8002 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2022-08-01Fix for OpenMP scheduler work breakdownMilos Puzovic
If number of work items is greater than number of available threads then OpenMP scheduler will only execute as many work items as there are threads. This fix makes sure that we iterate through all work items and execute all of them. Change-Id: I3ad4b732c01fadc70dacaf09af3007d2b31086c7 Signed-off-by: Milos Puzovic <milos.puzovic@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8001 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com>
2022-08-01Fix building failure with validate_examples set to trueMichalis Spyrou
Resolves: COMPMID-5402 Change-Id: Ic4abda36c475c3a7d86fb469d7ed1b23a62ea182 Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7991 Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-07-29Updated documentationPablo Marquez Tello
* Fixed an error caused by a newline missing in line 187 * Added section about building natively on Windows on ARM * Resolves MLCE-739 Change-Id: I2f452d77247b2a264e7f122d97cbb8f288716971 Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7992 Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-07-27Fix compilation error rasied in Nightly_NEWRamy Elgammal
- related to 91780021e2 Fix for inclusion of "arm_gemm" - arm_compute::WeightFormat was passed to a function expecting arm_gemm::WeightFormat Resolves: COMPMID-5415 Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com> Change-Id: Ib53fad2eab0148f466a5e2f11b931754569da3d6 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7989 Reviewed-by: SiCong Li <sicong.li@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2022-07-26Fix build android build errorPablo Tello
* Resolves COMPMID-5422 Change-Id: Ib59d85aead3f8f9559392957fbc42a4ddbb27b07 Signed-off-by: Pablo Tello <pablo.tello@arm.com> Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/c/VisualCompute/ComputeLibrary/+/440126 Comments-Addressed: bsgcomp <bsgcomp@arm.com> Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7987 Benchmark: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2022-07-26Fix for inclusion of "arm_gemm" from src into "Types.h" from coreRamy Elgammal
- Added arm_compute::WeightFormat and converted to/from arm_gemm::WeightFormat when needed through two map function. - Moved to_string(WeightFormat) to TypePrinter.h Resolves: COMPMID-5415 Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com> Change-Id: I65f7942100bcd4dbf2c5cf6c07f26c8e1e3bf86e Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/c/VisualCompute/ComputeLibrary/+/438511 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com> Reviewed-by: Sicong Li <sicong.li@arm.com> Comments-Addressed: bsgcomp <bsgcomp@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7985 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2022-07-25Enable integrated assembler for Android™ 13 and onwards onlySiCong Li
Since Android™ 13, the underlying NDK has deprecated system assembler and favors integrated assembler instead Please also see https://github.com/android/ndk/wiki/Changelog-r23 Resolves COMPMID-5410 Signed-off-by: SiCong Li <sicong.li@arm.com> Change-Id: Ib9fa8a22f6cdb5cb4f0095e15bf332484f709630 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7956 Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Reviewed-by: Ramy Elgammal <ramy.elgammal@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-07-25Enable march=armv8.6-a in non multi-isa buildsPablo Marquez Tello
* scons arch=armv8.6-a translates to -march=armv8.6-a * scons arch=armv8.6-a-sve translates to -march=armv8.6-a+sve * scons arch=armv8.6-a-sve2 translates to -march=armv8.6-a+sve2 * Resolves COMPMID-5408 Change-Id: I0901e1de864d00109759509af7cc2b5c9ae1cd75 Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7943 Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2022-07-25Mention Arm® Neoverse® in supported systems.Pablo Marquez Tello
* Resolves MLCE-888 Change-Id: I90291cb5f6eddbb889e86dd3295ff8b05b1e1311 Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7953 Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
2022-07-22Add GemmLowp MMUL Reshaped Only Rhs Support for QASYMM8/QASYMM8_SIGNEDFreddie Liardet
This patch introduces a GEMMLowp routine that is optimized for Arm(R) Mali(TM)-G715 and Arm(R) Mali(TM)-G615 Resolves: COMPMID-5398 Signed-off-by: Freddie Liardet <frederick.liardet@arm.com> Signed-off-by: Gunes Bayir <gunes.bayir@arm.com> Change-Id: I8d06453645688f3658b6c7c06f1ebc25a2505661 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7932 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2022-07-22Update ClConv2D heuristic to use direct convolutionAdnan AlSinan
Resolves COMPMID-5298 Change-Id: I4a7d788bc1f5f568bedcc22e7aca47ede6de71bf Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Signed-off-by: Adnan AlSinan <adnan.alsinan@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7891 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2022-07-21Fix direct convolution cases that were failing on OdroidAdnan AlSinan
- Affects OpenCL backend. - Resolves COMPMID-5416 Signed-off-by: Adnan AlSinan <adnan.alsinan@arm.com> Change-Id: I8953f9ac5c1ec9edf99399a651a544df4276ccf1 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7951 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2022-07-21Added CONTRIBUTING.mdPablo Marquez Tello
* Pointing out that development and contribution must be done on linaro * Asking not to submit pull requests on github * Resolves MLCE-774 Change-Id: Ifbcf4a7ab8e714b5c77425f0e4174520e0ece09f Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7946 Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-07-20Remove data extraction scriptsPablo Marquez Tello
* Resolved MLCE-886 Change-Id: I3b8fbe662c715b82c08c63fa27892471a572fdd8 Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7945 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Benchmark: Gunes Bayir <gunes.bayir@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-07-19[ONCPUML-951] Variable weight support for Convolution.Francesco Petrogalli
API changes for NEGEMMConvolutionLayer and CpuGemmConv2d Built with: scons neon=1 opencl=0 os=linux arch=armv8.2-a multi_isa=1 \ build=native -j32 Werror=false validation_tests=1 build_dir=opt \ standalone=1 asserts=1 experimental_fixed_format_kernels=1 . Tested with: ./build/opt/tests/arm_compute_validation Hardware where the test executable was run: Neoverse N1 Test coverage: * NEGEMMConvolutionLayer, CpuGemmConv2d * NHWC (the only one supported by the fixed-format kernels) * F16, F32 * Shapes: RunSmall Change-Id: I4fd3e495a7cbf61210ea02d37440ba9652934e99 Signed-off-by: Francesco Petrogalli <francesco.petrogalli@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7632 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2022-07-18Fix multi_isa build failure after Winograd integrationramelg01
Partially Resolves: COMPMID-5400 Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com> Change-Id: I9ed6b85b105a15b1a35c6a921f63ea94c665438b Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/c/VisualCompute/ComputeLibrary/+/437221 Tested-by: bsgcomp <bsgcomp@arm.com> Comments-Addressed: bsgcomp <bsgcomp@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7933 Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com>
2022-07-18Fix Neoverse V1 heuristics for FP32 fast moderamelg01
Resolves: COMPMID-5401 Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com> Change-Id: I432b304483236efd392dfc47d541e6759c135104 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7934 Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com>
2022-07-14Integrate new winograd APIs from MLTechramelg01
Resolves: COMPMID-5400 Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com> Change-Id: Ib4428436dd7a6e40d8b2d8a2f8dac1b079154551 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7894 Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2022-07-13Add Gemm MMUL Reshaped Only Rhs Support for FP32/FP16Gunes Bayir
This patch introduces a GEMM routine that is optimized for Arm(R) Mali(TM)-G715 and Arm(R) Mali(TM)-G615 Resolves: COMPMID-5216 Signed-off-by: Gunes Bayir <gunes.bayir@arm.com> Change-Id: I2e5d7806f5904347185bb3e250f73d73d6669dba Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7914 Reviewed-by: SiCong Li <sicong.li@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2022-07-13Fixed clang-cl errors on Windows native builds.Pablo Tello
Partially resolves MLCE-739 Change-Id: Ice06a96d6a8a26b31e334ba4e697cd41d352b026 Signed-off-by: Pablo Tello <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7364 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2022-07-11List licenses used by other projects in READMESiCong Li
Resolves COMPMID-5391 Signed-off-by: SiCong Li <sicong.li@arm.com> Change-Id: Iafd42c80ffc7a9659e2b7bc820aefb39338b56c9 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7900 Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-07-08Extended direct conv 2d interface for tuning the OpenCl kernelGian Marco Iodice
Resolves COMPMID-5298 Change-Id: Ie9b907e5dcf86aa6add8d08799fa7ba7c264edea Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7888 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2022-07-07Add missing flag when building cl graph examples and fixMichalis Spyrou
incorrect cl cache behaviour Resolves: COMPMID-5286 Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Change-Id: I1aa82825ef7d31d7830a00282d1a3523ccf1d746 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7883 Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-07-07Move build option explanations in how to build guide to scons help messageMichalis Spyrou
Resolves: COMPMID-5381 Change-Id: I556e2269ff6350f58509624a3d3d94cb0853cb8d Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7882 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2022-07-07Add OpenBSD® to the list of supported systemsPablo Marquez Tello
Resolves COMPMID-5396 Change-Id: I9bf296b81145c48aabdfb56b7ffdb453f5a76976 Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7892 Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-07-05Add G57 to GPUTargetSiCong Li
Relates to: COMPMID-5299 Signed-off-by: SiCong Li <sicong.li@arm.com> Change-Id: I19f820f698cf11020da019f4a1334cccb1e40c7e Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7880 Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-07-04Fix build errors on armv8.6 SVE2 with NDK 23 and 24Michalis Spyrou
Extensive use of templates resulted in a compiler crash on NDK 23 and 24. This rework solves the issue and also reduces the library size by 101Kb. Resolves: COMPMID-5384 Change-Id: I9c5c68c5e36f236b0891e44d25478743417fb16d Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7871 Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2022-07-01Fix OpenBSD build errorsPablo Marquez Tello
* Partially resolves COMPMID-5396 Change-Id: I7f3e692e73da548a901a07ed6d7212df40b8c26f Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7870 Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2022-06-30Wrong arguments for running activation function in CpuGemmDirectConv2dMichalis Spyrou
Resolves: COMPMID-5350 Change-Id: I2f5021ebe9fe4c71371f2cd6cb65cc9c0b2110b5 Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7861 Reviewed-by: Francesco Petrogalli <francesco.petrogalli@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2022-06-30Fixed native build toolchain and compiler prefixPablo Marquez Tello
* When build=native toolchain_prefix and compiler_prefix must be set to "" instead of auto * Resolves COMPMID-5395 Change-Id: I481ee4596e16d55d2a696af9124a0d06c916cef7 Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7855 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2022-06-29Add LUT-based leaky relu for QASYMM8 on CPUViet-Hoa Do
* Add LUT generation function for Leaky ReLU. * Some additional changes in the existing LUT implementation: + Bring back the NEON implementation of hard swish for 32-bit build. Library size of 64-bit build is not affected. + Add some extra #ifdef to remove unnecessary code in 32-bit build. Resolves: COMPMID-5386 Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com> Change-Id: I1ea49611cc922765ee741e31138c888401d33e9b Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7845 Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-06-29Fix armv7a on Android "end-of-support" build and documentationSiCong Li
* Only print a warning for this build * Fix the wording in documentation to avoid misunderstanding. User can still run armv7a on Android, but we stop testing it and this build is discouraged Resolves COMPMID-5379 Signed-off-by: SiCong Li <sicong.li@arm.com> Change-Id: I3d00baa2479671bcef8882db0e3d3cc3043d5811 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7838 Tested-by: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-06-28Fix OpenCL Winograd output transformGian Marco Iodice
- num_tiles_x was not initialized when we were passing the argument at compilation time Resolves COMPMID-5394 Change-Id: I6004a2d47656edda3c832f980c78622045de7068 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7857 Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>