aboutsummaryrefslogtreecommitdiff
path: root/src/core/NEON
AgeCommit message (Collapse)Author
2022-09-14INT8 Quantized MeanStdDevNorm (LayerNorm)Murray Kornelsen
Implements LayerNorm for qasymm8 tensors. Uses uint8x16 loads and stores. Summation is performed in integer arithmetic (vpaddl) Normalization is performed in float32 before requantizing back to int8. Signed-off-by: Murray Kornelsen <murray.kornelsen@mail.mcgill.ca> Change-Id: I2407c8b34717fb47adab98791bd76fb8a3c62f4a Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7922 Comments-Addressed: Pablo Marquez Tello <pablo.tello@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com> Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2022-08-08Fix for AI benchmark ResNet regressionViet-Hoa Do
* For 3x3 kernel, only choose the implementation with larger tile size if the input tensor is larger than the tile. Resolves: COMPMID-5467 Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com> Change-Id: I2cf95ddb25f477cb05da3b3501e0afe9548fc33a Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8022 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2022-08-04[ONCPUML-970] Fast math mode for fixed format kernelsPablo Marquez Tello
Minor tweaks and test for running fixed format kernels with BF16 operations when specified by the user. Change-Id: Ic8167f67b86b1298da65e46cfebed9f3b86940e4 Signed-off-by: Milos Puzovic <milos.puzovic@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8000 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2022-07-25Enable march=armv8.6-a in non multi-isa buildsPablo Marquez Tello
* scons arch=armv8.6-a translates to -march=armv8.6-a * scons arch=armv8.6-a-sve translates to -march=armv8.6-a+sve * scons arch=armv8.6-a-sve2 translates to -march=armv8.6-a+sve2 * Resolves COMPMID-5408 Change-Id: I0901e1de864d00109759509af7cc2b5c9ae1cd75 Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7943 Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2022-07-19[ONCPUML-951] Variable weight support for Convolution.Francesco Petrogalli
API changes for NEGEMMConvolutionLayer and CpuGemmConv2d Built with: scons neon=1 opencl=0 os=linux arch=armv8.2-a multi_isa=1 \ build=native -j32 Werror=false validation_tests=1 build_dir=opt \ standalone=1 asserts=1 experimental_fixed_format_kernels=1 . Tested with: ./build/opt/tests/arm_compute_validation Hardware where the test executable was run: Neoverse N1 Test coverage: * NEGEMMConvolutionLayer, CpuGemmConv2d * NHWC (the only one supported by the fixed-format kernels) * F16, F32 * Shapes: RunSmall Change-Id: I4fd3e495a7cbf61210ea02d37440ba9652934e99 Signed-off-by: Francesco Petrogalli <francesco.petrogalli@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7632 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2022-07-18Fix Neoverse V1 heuristics for FP32 fast moderamelg01
Resolves: COMPMID-5401 Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com> Change-Id: I432b304483236efd392dfc47d541e6759c135104 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7934 Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com>
2022-07-14Integrate new winograd APIs from MLTechramelg01
Resolves: COMPMID-5400 Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com> Change-Id: Ib4428436dd7a6e40d8b2d8a2f8dac1b079154551 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7894 Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2022-07-13Fixed clang-cl errors on Windows native builds.Pablo Tello
Partially resolves MLCE-739 Change-Id: Ice06a96d6a8a26b31e334ba4e697cd41d352b026 Signed-off-by: Pablo Tello <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7364 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2022-07-01Fix OpenBSD build errorsPablo Marquez Tello
* Partially resolves COMPMID-5396 Change-Id: I7f3e692e73da548a901a07ed6d7212df40b8c26f Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7870 Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2022-06-15Fix build error v8.2-a-svePablo Marquez Tello
* bf16 instrinsics should be used when __ARM_FEATURE_SVE_BF16 is present * Fixed NDK14 compiler warning declaring copy ctor for Window explicitly * Resolves MLCE-867 Change-Id: I84ac5f213d9700e2fda7da55d83bba7cf79ad52c Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7728 Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2022-06-13Add support for 2d and 3d indices for axis 1Pablo Marquez Tello
* Resolves COMPMID-5055 Change-Id: I2d14de29d3ec913d20c971bc8bbc9ad71e2d998f Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7547 Reviewed-by: SiCong Li <sicong.li@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2022-05-24[arm_gemm] Import fixed-format kernels from gemm_linux.Francesco.Petrogalli@arm.com
This is a No Functional Change Intended (NFCI) patch. It imports the kernel in the code, but the interface to select them and expose the format of the weight tensors to the user will be provided in a subsequent patch. Kernels and kernel selection code in arm_gemm has been provided by David.Mansell <David.Mansell@arm.com>. The kernels are not compiled in the library by default, but need to be selected via the `scons` option `experimental_fixed_format_kernels=1`. Resolves: ONCPUML-829 Signed-off-by: Francesco.Petrogalli@arm.com <francesco.petrogalli@arm.com> Change-Id: If00ccb2b9b7221e01b214cf9783111226ccc8bf4 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7380 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-05-17DepthwiseConv reports full assembly kernel namePablo Marquez Tello
* Resolves MLCE-706 Change-Id: Ia15c925c13464397c79056dffe2a756e06020682 Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7571 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2022-05-12Revert "Add support for 2d and 3d indices for axis 0"Mohammed Suhail Munshi
This reverts commit 0db8b8bbd941b3dab4238c03e734e7ac43c662ed. Relates to [COMPMID-5055] Signed-off-by: Mohammed Suhail Munshi <MohammedSuhail.Munshi@arm.com> Change-Id: I143e7965e21b956abb05ba5c41e12c5b73b7345a Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7558 Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2022-05-10Add support for 2d and 3d indices for axis 0Pablo Marquez Tello
* Partially resolves COMPMID-5055 Change-Id: Id05374b8c69e6b9ab4c2790a4de93d7172063b71 Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com> Change-Id: Ic6e2c2d1d34abbf6222c8d56859514e267447266 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7488 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-05-06Updating a64_gemm_u8 a64_gemm_s8 kernels headersramelg01
Resolves: COMPMID-5272 Signed-off-by: ramy.elgammal@arm.com Change-Id: I185182430ca952e5bb661e0a47163965b3565a49 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7517 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Sheri Zhang <sheri.zhang@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-05-06Use svcreate instead of list initializations.Michalis Spyrou
Partially resolves COMPMID-5250 when building with SVE2. Change-Id: I16bd74d4cd6c70371efd8235c507ba5e7f8f906f Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7498 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-05-05Fix for Neon™ Depthwise Android P VTS test failureramelg01
Resolves: COMPMID-5237 Signed-off-by: ramy.elgammal@arm.com Change-Id: Ib1f5e262030e915a038cef587001708bbaf14c56 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7508 Reviewed-by: David Mansell Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2022-04-26Update Neon™ depthwise kernelramelg01
- Reduce duplication and simplify overall structure. - Improve multi-threaded performance by sharing more data in lower-level caches. Partially Resolves: COMPMID-5054 Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com> Change-Id: Iac747f39b21c540122fa75218762631c4d787911 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7449 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Andrew Mundy Reviewed-by: Sheri Zhang <sheri.zhang@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-04-25Update Neon™ pooling kernelramelg01
- Reduce duplication and simplify overall structure. - Improve multi-threaded performance by sharing more data in lower-level caches. Partially Resolves: COMPMID-5054 Signed-off-by: Ramy Elgammal<ramy.elgammal@arm.com> Change-Id: I5f4dc50913401d5c1cbfc10b866fae9490cbc4d7 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7404 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Andrew Mundy Reviewed-by: Sheri Zhang <sheri.zhang@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-04-06[arm_gemm] Use static validate to find arm_gemm kernels.Francesco.Petrogalli@arm.com
The static method `CpuGemmAssemblyDispatch::validate` should look into the list of the available kernels to make sure the one requested by the user was found. Formatting changes in the files touched by the patch have been automatically inserted by the formatting script. Resolves: ONCPUML-840 Change-Id: Icd650a30e142284a942c64f8a2b72441ee7b3f4e Signed-off-by: Francesco.Petrogalli@arm.com <francesco.petrogalli@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7375 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-04-04Remove Non-Inclusive Term "Master"ramelg01
Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com> Resolves: COMPMID-5017 Change-Id: I377d04512df357191e5a60d6dcf35121e71bf153 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7360 Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2022-03-16Remove deprecated interface from arm_compute.Francesco.Petrogalli@arm.com
The function `get_gemm_method` in arm_compute is deprecated in favor of the method `arm_gemm::GemmCommon<TypeInput, TypeOutput>::get_config()`. Signed-off-by: francesco.petrogalli@arm.com Change-Id: Idd5d879180c3995d5a07a727aa9216b8f94f01ba Signed-off-by: Francesco.Petrogalli@arm.com <francesco.petrogalli@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7304 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Sheri Zhang <sheri.zhang@arm.com>
2022-03-10Added windows native build supportPablo Tello
Resolves MLCE-739 Signed-off-by: Pablo Tello <pablo.tello@arm.com> Change-Id: I30a11393e928061c82a5c93d8ec195c04a0e838b Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7279 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-03-08Decouple fuseBatchNormalizationKernelYair Schwarzbaum
- Decouple data type for CPU implementation supported data types are: fp32, fp16 Resolves COMPMID-4613 Signed-off-by: Yair Schwarzbaum <yair.schwarzbaum@arm.com> Change-Id: I8aff3ba2d446f64e4d182a866e3a3debc9ef613b Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7175 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-03-01Multi ISA Technical DebtDana Zlotnik
* Update json struct meet multi-ISA updates * Add impl.cpp in kernels where we only have impl.h Resolves COMPMID-5173 Change-Id: I5da3c4b016a5d0115c4ba46cbfefde7bce518ac1 Signed-off-by: Dana Zlotnik <dana.zlotnik@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7191 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-02-17Decouple NEL2NormalizeLayerKernelYair Schwarzbaum
Resolves: COMPMID-4615 Signed-off-by: Yair Schwarzbaum <yair.schwarzbaum@arm.com> Change-Id: Iadbfb3e45831a5072962b5b9f61e8ae2e674ccc4 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7016 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-02-14Port MaxUnpoolingLayer kernel and add KernelSelect vaidation testDana Zlotnik
Resolves COMPMID-4958 Change-Id: Ibed5155f2e3ece46635f6ea9617bf11cefc402b1 Signed-off-by: Dana Zlotnik <dana.zlotnik@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7028 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-02-09Remove deprecated remap functions.Adnan AlSinan
- Remove CLRemapKernel. - Remove NERemapKernel. Partially resolves COMPMID-4984 Signed-off-by: Adnan AlSinan <adnan.alsinan@arm.com> Change-Id: Ia61f9ac7447695d81178701cf0e9b7625a91eccc Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7056 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-01-25Add OpenBSD/arm64 support.Kevin Lo
Signed-off-by: Kevin Lo <kevlo@kevlo.org> Change-Id: I6f29bdb55caeec8893f128fdd50bdcc3d058cb3c Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6905 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Comments-Addressed: Pablo Marquez Tello <pablo.tello@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2022-01-24Select kernel decouplingAnton Vainer
Resolves COMPMID-4614 Signed-off-by: Anton Vainer <anton.vainer@arm.com> Change-Id: I19476d43b8e685de2eed973425d5d31b9cdb84ca Signed-off-by: Anton Vainer <anton.vainer@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6960 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
2022-01-21DepthwiseConv reports full assembly kernel namePablo Marquez Tello
* Fixed the kernel name in CpuDepthwiseConv2dAssemblyWrapperKernel * Resolves MLCE-706 Change-Id: I01ddbe2c030e22e5ba6761ed32110a35c314ccae Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6787 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-01-21A73 Devices Regression 300% fixMohammed Suhail Munshi
- Currently regresses on A73 devices (tested on android hikey, inceptionv3), this patch solves this - Changed mws for all cores to use default values - Existing mws value for A73 tuned for hikey-linux, caused regression on hikey-android Resolves [COMPMID-5044] Change-Id: Ifd6faaa34a0b405d0c390015566f2c75436dfb07 Signed-off-by: Mohammed Suhail Munshi <MohammedSuhail.Munshi@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6973 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com>
2022-01-12Decouple NEMeanStdDevNormalizationKernelDana Zlotnik
Resolves COMPMID-4617 Change-Id: Ic8793aaf64c6137f848f39c62e33b44ae79ad21d Signed-off-by: Dana Zlotnik <dana.zlotnik@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6870 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-01-12Decouple NEInstanceNormalizationLayerKernelDana Zlotnik
Resolves COMPMID-4620 Signed-off-by: Dana Zlotnik <dana.zlotnik@arm.com> Change-Id: I22c285339840493c9cfd4c1abfbc3768ad4db824 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6871 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-01-11Decouple NEBoundingBoxTransformKernelDana Zlotnik
Resolves COMPMID-4622 Signed-off-by: Dana Zlotnik <dana.zlotnik@arm.com> Change-Id: I18acd03e323f7734635284a763442d2cb4ded177 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6872 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2022-01-11Decouple NEROIAlignLayerKernelDana Zlotnik
Resolves COMPMID-4624 Change-Id: Ib8d24c44ae62c4f8272310c9305d863d8447eafa Signed-off-by: Dana Zlotnik <dana.zlotnik@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6873 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-01-11Decouple NEGenerateProposalsLayerKernelDana Zlotnik
Resolves COMPMID-4621 Change-Id: I3d89fa6d8273cc5f61a5cc0470fd730919bcd432 Signed-off-by: Dana Zlotnik <dana.zlotnik@arm.com> Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/c/VisualCompute/ComputeLibrary/+/385363 Tested-by: bsgcomp <bsgcomp@arm.com> Comments-Addressed: bsgcomp <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6867 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2022-01-06Decouple NEMaxUnpoolingLayerKernelDana Zlotnik
Resolves COMPMID-4619 Change-Id: I9c43dcd3fb3a688e1c0ccc858a02376741381ba7 Signed-off-by: Dana Zlotnik <dana.zlotnik@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6874 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-12-24Replacing non-inclusive terms with proper termsramelg01
Partially-Resolves: COMPMID-4854 Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com> Change-Id: Ic9757c89878b9b5a89680b5344de657f676c7bf2 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6859 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
2021-12-22Crop Kernel Decouplingalerah01
Change-Id: I1f39ba6a255847f4d21837a3dce4f867322203e6 Signed-off-by: alerah01 <alex.rahlis@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6799 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
2021-12-14Update A510 arm_gemm cpu Kernelsramelg01
Resolves: COMPMID-4910 Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com> Change-Id: I79b4aa51e07ad1fe81d9218ed8a8f34f0ec5ab06 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6803 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Sheri Zhang <sheri.zhang@arm.com>
2021-12-10Fix 300% Regression CPU - Change default mws value in Kernel filesMohammed Suhail Munshi
Resolves: COMPMID-5001 Change-Id: I13fbe859d2557be0459ba76da0136d0efb15f311 Signed-off-by: Mohammed Suhail Munshi <MohammedSuhail.Munshi@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6809 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Sheri Zhang <sheri.zhang@arm.com>
2021-11-23Decouple data type for NERangeKernelYair Schwarzbaum
- Decouple data type for CPU implementation, supported data types are: fp32, fp16, u8, u16, u32, s8, s16, s32 Resolves COMPMID-4612 Signed-off-by: Yair Schwarzbaum <yair.schwarzbaum@arm.com> Change-Id: Iec9aab9f59c5a344950c788281fc500290a19bbc Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6686 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
2021-11-16Implement 1D Adaptive Workload Splitting in CPPSchedulerDana Zlotnik
Resolves COMPMID-4649 Change-Id: I941d2f8a40737ff05c49f6695a42884731ef2dc9 Signed-off-by: Dana Zlotnik <dana.zlotnik@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6656 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-11-05Update GeMM heuristic on CPUGian Marco Iodice
Change-Id: I5a5537dc75d460b3fe2efb5cb0659c19e2972955 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6590 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com>
2021-10-18DirectConv3d support refineSheri Zhang
- Decouple data support of CpuDirectConv3dKernel - Update documentation for Conv3d Signed-off-by: Sheri Zhang <sheri.zhang@arm.com> Change-Id: I1d94aa28f821f45a1a3d39cc3335c8faeee89f0d Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6453 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-10-18Fix precision issue in ChannelShuffleKernelPablo Marquez Tello
* Fixed the issue in NHWC Neon * Fixed the rounding error in CL * Added a new test case to reproduce the problem * Resolves COMPMID-4831 Change-Id: I1613168cad580ca5acefe8ba340130af05cffaff Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6454 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-10-18Add user provided JSON operator list buildFreddie Liardet
Allow ACL to be built via a user provided JSON file containing operators, data types and data layouts. Modify TFLite file to JSON file script to output data layouts. Fix build issue with "fat_binary" and "high_priority" options. Resolves: COMPMID-4697, COMPMID-4837 Signed-off-by: Freddie Liardet <frederick.liardet@arm.com> Change-Id: I08d494151c98f804325707ffd922ffe216813023 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6427 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
2021-10-18Implement Minimum Workload Size (MWS) in all CPPKernels used by small networksDana Zlotnik
* create get_mws method in ICPPKernel class that retuns default value for all kernels * overwrite the default value for all the kernels used by small networks (according to banchmark case) Resolves COMPMID-4648 Change-Id: I46d7cae61217213279d2ee740edc73f600b6d576 Signed-off-by: Dana Zlotnik <dana.zlotnik@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6412 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>