aboutsummaryrefslogtreecommitdiff
path: root/Android.bp
AgeCommit message (Collapse)Author
2022-11-01Rewrite dynamic fusionSiCong Li
The new version introduces the following major changes: * Change public interface to simplify and standardize the user experience - Use the term "Workload" uniformly - Simplify operator interface to be a set of static methods: validate_op(), create_op() * Separate the kernel writing into its own component (template_writer). This is to allow the co-development of GpuKernelWriter, and to allow easy replacement once GpuKernelWriter is mature. * Optimize the core fusion algorithm used by the component graph. The details can be found in GpuKernelComponentGraph::fuse() * Use Gpu instead of Cl prefixes for most of the Workload interfaces (except for runtime and kernel components, which have to be language specific) This allows the potential extension to other Gpu langauges in the future. * Refactor runtime memory interface so that auxiliary tensor handling is separate from the user tensor passing. This is because the former is less stable and may require extension in the future. * Hide source code object from the user as it is not required at the moment * Deprecate the old prototype entirely by disabling it in SCons build Resolves COMPMID-5510, COMPMID-5512, COMPMID-5513 Change-Id: If69d2362856f2de4503546b7b6cf48a525cf3079 Signed-off-by: SiCong Li <sicong.li@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8406 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-by: Jakub Sujak <jakub.sujak@arm.com> Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2022-09-14INT8 Quantized MeanStdDevNorm (LayerNorm)Murray Kornelsen
Implements LayerNorm for qasymm8 tensors. Uses uint8x16 loads and stores. Summation is performed in integer arithmetic (vpaddl) Normalization is performed in float32 before requantizing back to int8. Signed-off-by: Murray Kornelsen <murray.kornelsen@mail.mcgill.ca> Change-Id: I2407c8b34717fb47adab98791bd76fb8a3c62f4a Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7922 Comments-Addressed: Pablo Marquez Tello <pablo.tello@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com> Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2022-08-17Add LUT for quantized sigmoid functionViet-Hoa Do
* Move LUT implementation to a seperate file. It will be used for both QASYMM8 and QASYMM8_SIGNED. * Fix wrong constant value related to QASYMM8_SIGNED leaky ReLU in 32-bit build. Resolves: COMPMID-5464 Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com> Change-Id: I2b24d52409a38f1b66fd532f431eff8a9e4547b6 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8066 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2022-07-22Add GemmLowp MMUL Reshaped Only Rhs Support for QASYMM8/QASYMM8_SIGNEDFreddie Liardet
This patch introduces a GEMMLowp routine that is optimized for Arm(R) Mali(TM)-G715 and Arm(R) Mali(TM)-G615 Resolves: COMPMID-5398 Signed-off-by: Freddie Liardet <frederick.liardet@arm.com> Signed-off-by: Gunes Bayir <gunes.bayir@arm.com> Change-Id: I8d06453645688f3658b6c7c06f1ebc25a2505661 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7932 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2022-07-21Fix direct convolution cases that were failing on OdroidAdnan AlSinan
- Affects OpenCL backend. - Resolves COMPMID-5416 Signed-off-by: Adnan AlSinan <adnan.alsinan@arm.com> Change-Id: I8953f9ac5c1ec9edf99399a651a544df4276ccf1 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7951 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2022-07-14Integrate new winograd APIs from MLTechramelg01
Resolves: COMPMID-5400 Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com> Change-Id: Ib4428436dd7a6e40d8b2d8a2f8dac1b079154551 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7894 Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2022-07-13Add Gemm MMUL Reshaped Only Rhs Support for FP32/FP16Gunes Bayir
This patch introduces a GEMM routine that is optimized for Arm(R) Mali(TM)-G715 and Arm(R) Mali(TM)-G615 Resolves: COMPMID-5216 Signed-off-by: Gunes Bayir <gunes.bayir@arm.com> Change-Id: I2e5d7806f5904347185bb3e250f73d73d6669dba Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7914 Reviewed-by: SiCong Li <sicong.li@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2022-07-08Extended direct conv 2d interface for tuning the OpenCl kernelGian Marco Iodice
Resolves COMPMID-5298 Change-Id: Ie9b907e5dcf86aa6add8d08799fa7ba7c264edea Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7888 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2022-06-27Implement new Elementwise Dynamic Fusion Operators: Div, FloorMichalis Spyrou
Resolves: COMPMID-5355 Change-Id: I92f73fbe885f28bbe7b07965b90cfd807c93602f Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7745 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com>
2022-05-24[arm_gemm] Import fixed-format kernels from gemm_linux.Francesco.Petrogalli@arm.com
This is a No Functional Change Intended (NFCI) patch. It imports the kernel in the code, but the interface to select them and expose the format of the weight tensors to the user will be provided in a subsequent patch. Kernels and kernel selection code in arm_gemm has been provided by David.Mansell <David.Mansell@arm.com>. The kernels are not compiled in the library by default, but need to be selected via the `scons` option `experimental_fixed_format_kernels=1`. Resolves: ONCPUML-829 Signed-off-by: Francesco.Petrogalli@arm.com <francesco.petrogalli@arm.com> Change-Id: If00ccb2b9b7221e01b214cf9783111226ccc8bf4 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7380 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-05-06Integrate Dynamic Fusion patchesSiCong Li
* Add public interfaces: * OperatorGraph: Describe a workload that could contain fused kernels * IWorkload: Generic interface for workloads built from OperatorGraph * ClWorkload: OpenCL workloads built from OperatorGraph * ClCompositeOperator: Runtime async operator to execute a ClWorkload * DependencyGraph (will likely be deprecated in later iterations) * Add example * cl_fused_conv2d_elementwise_add.cpp to explain how to use the new interfaces * Add internal translation layer * Refactor ClKernelBuildingAPI * Remove non-tile based gemm native kernel component * Minor interface changes * Add integration tests Resolves COMPMID-5161 Signed-off-by: SiCong Li <sicong.li@arm.com> Change-Id: Ib987ed79289ab0bcbd3130d54f5793408d9f1240 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7510 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2022-04-26Update Neon™ depthwise kernelramelg01
- Reduce duplication and simplify overall structure. - Improve multi-threaded performance by sharing more data in lower-level caches. Partially Resolves: COMPMID-5054 Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com> Change-Id: Iac747f39b21c540122fa75218762631c4d787911 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7449 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Andrew Mundy Reviewed-by: Sheri Zhang <sheri.zhang@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-04-25Update Neon™ pooling kernelramelg01
- Reduce duplication and simplify overall structure. - Improve multi-threaded performance by sharing more data in lower-level caches. Partially Resolves: COMPMID-5054 Signed-off-by: Ramy Elgammal<ramy.elgammal@arm.com> Change-Id: I5f4dc50913401d5c1cbfc10b866fae9490cbc4d7 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7404 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Andrew Mundy Reviewed-by: Sheri Zhang <sheri.zhang@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-04-19Add CLPool3d Int8 SupportMohammed Suhail Munshi
- Adds Qasymm8 and Qasymm8_signed support to the 3d pool operator Resolves: COMPMID-4669 Signed-off-by: Mohammed Suhail Munshi <MohammedSuhail.Munshi@arm.com> Change-Id: I36038c2b7c4f36baf67f7aae801356890e104538 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/c/VisualCompute/ComputeLibrary/+/410496 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Sheri Zhang <sheri.zhang@arm.com> Comments-Addressed: bsgcomp <bsgcomp@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7391 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-04-13Add support for int8 CpuPool3dAdnan AlSinan
- Add implementation for the CPU pooling 3d layer. - NDHWC data layout support. - Support QASYMM8/QASYMM8_SIGNED. - Add Pooling helper file for Pool3d/2d common functions. Resolves COMPMID-4668 Change-Id: Iadf042036b076099c2353d6e2fe9fc623bc263d8 Signed-off-by: Adnan AlSinan <adnan.alsinan@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7387 Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2022-04-13Add DirectConvolution2D kernel component for dynamic fusionGunes Bayir
Resolves: COMPMID-5156 Change-Id: I438da924cb80d3bce72106b06ca7181e0606bd01 Signed-off-by: Gunes Bayir <gunes.bayir@arm.com> Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7399 Reviewed-by: SiCong Li <sicong.li@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2022-04-12Fix CpuGemmAssemblyDispatch::has_opt_impl.Francesco.Petrogalli@arm.com
The QASYMM8 case was erroneously using the constructing template instead of the querying one. Change-Id: If9257df1aea0aecc3f82235d1cfcbb743fb6b852 Signed-off-by: Francesco.Petrogalli@arm.com <francesco.petrogalli@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7396 Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2022-04-01Add CPU Pool3d FP16/32 implementationAdnan AlSinan
- Add implementation for the CPU pooling 3d layer. - NDHWC data layout support - Support FP32/FP16. - Add Pool3d to the operator list. - Fix CL Pool3d kernel comments to generate the operator list. Resolves: COMPMID-4671 Signed-off-by: Adnan AlSinan <adnan.alsinan@arm.com> Change-Id: I92478a154beb12541525b648ed3dd5a58c8f27fa Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7311 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> (cherry picked from commit 572659a0e5dd1086b1c7d16fe331ff73d2acd93a)
2022-03-15Implementation of ClPooling3dramelg01
- For NDHWC layout - For F16 and F32 data types - Mixed Precision stil not supported Resolves: COMPMID-4670 Signed-off-by: ramy.elgammal@arm.com Change-Id: I0e14a13e4625569e8e5ee67e6033bd1efe0da469 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7262 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2022-03-08Decouple fuseBatchNormalizationKernelYair Schwarzbaum
- Decouple data type for CPU implementation supported data types are: fp32, fp16 Resolves COMPMID-4613 Signed-off-by: Yair Schwarzbaum <yair.schwarzbaum@arm.com> Change-Id: I8aff3ba2d446f64e4d182a866e3a3debc9ef613b Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7175 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-03-08Merge kernel prototype patchGiorgio Arena
Resolves: COMPMID-5151 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Change-Id: Ic4024d5cd4819fe917a1d49621f1866ae2e90a37 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7260 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-03-03Decouple castKernelYair Schwarzbaum
Resolves: COMPMID-4625 Signed-off-by: Yair Schwarzbaum <yair.schwarzbaum@arm.com> Change-Id: I3c30f007804b179e5e2b439f421fbd4e57fb02e1 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7149 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
2022-03-01Multi ISA Technical DebtDana Zlotnik
* Update json struct meet multi-ISA updates * Add impl.cpp in kernels where we only have impl.h Resolves COMPMID-5173 Change-Id: I5da3c4b016a5d0115c4ba46cbfefde7bce518ac1 Signed-off-by: Dana Zlotnik <dana.zlotnik@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7191 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-02-22Decouple CpuDirectConv2dKernelalerah01
Resolves COMPMID-4626 Exclude SVE & SVE2 paths from android.bp NDK version does not support these extensions. Change-Id: I49b147d2a84819975d3225f2920106fa1a0d742f Signed-off-by: alerah01 <alex.rahlis@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7136 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
2022-02-21Decouple CpuDepthwiseConv2dNativeKernelDana Zlotnik
Resolves COMPMID-4632 Change-Id: I5e2a9f0f7801a2afaa35de871ab29cd7238923fd Signed-off-by: Dana Zlotnik <dana.zlotnik@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7115 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-02-17Decouple NEL2NormalizeLayerKernelYair Schwarzbaum
Resolves: COMPMID-4615 Signed-off-by: Yair Schwarzbaum <yair.schwarzbaum@arm.com> Change-Id: Iadbfb3e45831a5072962b5b9f61e8ae2e674ccc4 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7016 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-02-14Exclude SVE & SVE2 paths from android.bpDana Zlotnik
NDK version does not support these extensions. Resolves COMPMID-4921 Signed-off-by: Dana Zlotnik <dana.zlotnik@arm.com> Change-Id: Icd506c00a5cecafafd682d07c32690ccf8ca587f Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/c/VisualCompute/ComputeLibrary/+/397247 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Signed-off-by: Gunes Bayir <gunes.bayir@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7131 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2022-02-14Decouple CpuGemmMatrixMultiplyKernel and CpuGemmMatrixAdditionKernelDana Zlotnik
Resolves COMPMID-4629, COMPMID-4631 Change-Id: Idceafc84735116ef63ec13a202895f954b87e32f Signed-off-by: Dana Zlotnik <dana.zlotnik@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7095 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-02-14Port MaxUnpoolingLayer kernel and add KernelSelect vaidation testDana Zlotnik
Resolves COMPMID-4958 Change-Id: Ibed5155f2e3ece46635f6ea9617bf11cefc402b1 Signed-off-by: Dana Zlotnik <dana.zlotnik@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7028 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-02-09Remove deprecated remap functions.Adnan AlSinan
- Remove CLRemapKernel. - Remove NERemapKernel. Partially resolves COMPMID-4984 Signed-off-by: Adnan AlSinan <adnan.alsinan@arm.com> Change-Id: Ia61f9ac7447695d81178701cf0e9b7625a91eccc Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7056 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-02-02Revert "Rework gemm_mm_reshaped_only_rhs_ kernels with new macros"Ramy Elgammal
This reverts commit 10e88a7351 "Rework gemm_mm_reshaped_only_rhs_ kernels with new macros" Resolves: COMPMID-5095 Signed-off-by: Ramy Elgammal<ramy.elgammal@arm.com> Change-Id: I46e167882f072e7508b6101d295accb6e089e740 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7045 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2022-01-25Rework gemm_mm_reshaped_only_rhs_ kernels with new macrosGian Marco Iodice
- Rework gemm_reshaped_rhs_only with new TILE macros - Fuse post ops in gemm_reshaped_rhs_only Resolves COMPMID-4890 Change-Id: I944948ecec6d08deaf3545b80cd3eeac26e44205 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6944 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Sheri Zhang <sheri.zhang@arm.com>
2022-01-24Select kernel decouplingAnton Vainer
Resolves COMPMID-4614 Signed-off-by: Anton Vainer <anton.vainer@arm.com> Change-Id: I19476d43b8e685de2eed973425d5d31b9cdb84ca Signed-off-by: Anton Vainer <anton.vainer@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6960 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
2022-01-19Decouple CpuElementwiseKernelDana Zlotnik
1- reorganize the folders struct according the new definition 2- separate between unary and binary implementations 3- decuple kernels - unary , binary op and binary comparision Resolves COMPMID-4634 Change-Id: I0195846cc372e74a63c659069a4508de53a22110 Signed-off-by: Dana Zlotnik <dana.zlotnik@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6860 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-01-13Decouple CpuSoftmaxKernelDana Zlotnik
Resolves COMPMID-4633 Change-Id: I9f93b28fbc3b18ccaeb453596dc8e0eddfe06b6a Signed-off-by: Dana Zlotnik <dana.zlotnik@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6861 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-01-12Decouple NEMeanStdDevNormalizationKernelDana Zlotnik
Resolves COMPMID-4617 Change-Id: Ic8793aaf64c6137f848f39c62e33b44ae79ad21d Signed-off-by: Dana Zlotnik <dana.zlotnik@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6870 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-01-12Decouple NEInstanceNormalizationLayerKernelDana Zlotnik
Resolves COMPMID-4620 Signed-off-by: Dana Zlotnik <dana.zlotnik@arm.com> Change-Id: I22c285339840493c9cfd4c1abfbc3768ad4db824 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6871 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-01-11Decouple NEBoundingBoxTransformKernelDana Zlotnik
Resolves COMPMID-4622 Signed-off-by: Dana Zlotnik <dana.zlotnik@arm.com> Change-Id: I18acd03e323f7734635284a763442d2cb4ded177 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6872 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2022-01-11Decouple NEROIAlignLayerKernelDana Zlotnik
Resolves COMPMID-4624 Change-Id: Ib8d24c44ae62c4f8272310c9305d863d8447eafa Signed-off-by: Dana Zlotnik <dana.zlotnik@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6873 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-01-11Decouple NEGenerateProposalsLayerKernelDana Zlotnik
Resolves COMPMID-4621 Change-Id: I3d89fa6d8273cc5f61a5cc0470fd730919bcd432 Signed-off-by: Dana Zlotnik <dana.zlotnik@arm.com> Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/c/VisualCompute/ComputeLibrary/+/385363 Tested-by: bsgcomp <bsgcomp@arm.com> Comments-Addressed: bsgcomp <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6867 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2022-01-06Decouple NEMaxUnpoolingLayerKernelDana Zlotnik
Resolves COMPMID-4619 Change-Id: I9c43dcd3fb3a688e1c0ccc858a02376741381ba7 Signed-off-by: Dana Zlotnik <dana.zlotnik@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6874 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-12-22Crop Kernel Decouplingalerah01
Change-Id: I1f39ba6a255847f4d21837a3dce4f867322203e6 Signed-off-by: alerah01 <alex.rahlis@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6799 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
2021-12-20Decouple CpuActivationKernelDana Zlotnik
1- Data types were already decoupled. This commit arrange the folder struct of the activation kernel. 2- Refactor NEON CpuActivationKernel for floating-point cases. Resolves COMPMID-4636 Change-Id: Ia4527244c84260dce1dd1d4bd4a9e3cfe2486d85 Signed-off-by: Dana Zlotnik <dana.zlotnik@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6739 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
2021-12-14Update A510 arm_gemm cpu Kernelsramelg01
Resolves: COMPMID-4910 Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com> Change-Id: I79b4aa51e07ad1fe81d9218ed8a8f34f0ec5ab06 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6803 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Sheri Zhang <sheri.zhang@arm.com>
2021-12-13Remove padding from ClDirectConv2dKernelAdnan AlSinan
- Delete old NCHW ClDirectConv2d kernels. - Merge all kernels on a single file. - Removed padding from ClDirectConv2dKernel Resolves COMPMID-4721 Signed-off-by: Adnan AlSinan <adnan.alsinan@arm.com> Change-Id: I624d218fb770e7b5f3c0acd4e85a21ae48470f55 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6779 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2021-11-28Decouple CpuAddKernelDana Zlotnik
1- NEON supported data types are : fp32, fp16, u8, s16, s32 , q8, q_s8 , q16 2- SVE supported data types are: fp32, fp16, u8, s16, s32 3- SVE2 supported data types are : q8, q_s8 , q16 4- Re-arange SVE folder sturct ** Need to remove gaurds and add testing after Multi ISA build system and validation tests will be avalible Resolves COMPMID-4635 Change-Id: I90e4f6a219478aa9ad5c4a6b9858496afa8af42d Signed-off-by: Dana Zlotnik <dana.zlotnik@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6711 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-11-23Decouple data type for NERangeKernelYair Schwarzbaum
- Decouple data type for CPU implementation, supported data types are: fp32, fp16, u8, u16, u32, s8, s16, s32 Resolves COMPMID-4612 Signed-off-by: Yair Schwarzbaum <yair.schwarzbaum@arm.com> Change-Id: Iec9aab9f59c5a344950c788281fc500290a19bbc Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6686 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
2021-11-20Improve start-up timer for GeMM (floating-point):ramelg01
- Pass M,N,K at runtime as kernel parameters - Add a guard macro to compile only kernel of interest - Move reshpaing kernels to gemm_utils.cl - Remove the fallback reshaping kernel with Y-Padding support Resolves: COMPMID-4888 Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com> Change-Id: Ida3851326f0b77e410633271de9ecca106e37931 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6662 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-11-02Add post ops to ClGemmMatrixMultiplyReshapedOnlyRHSKernel and ↵SiCongLi
ClGemmMatrixMultiplyNativeKernel Part 3 Partially resolves: COMPMID-4435 Change-Id: Ifc5affa3a24a70942ca2d001380205df09b03ad7 Signed-off-by: SiCongLi <sicong.li@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6550 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-10-28Add experimental PostOp interface to ClGemmMatrixMultiplyReshapedKernel Part 1SiCongLi
This interface supports the fusion of multiple elementwise operations Partially resolves: COMPMID-4435 Change-Id: If68dd7dd98dcf239fde7cb1f0a4a6d4d1e899a6f Signed-off-by: SiCongLi <sicong.li@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6483 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>