aboutsummaryrefslogtreecommitdiff
path: root/arm_compute/core
AgeCommit message (Collapse)Author
2022-05-11Fix inclusion guard for dynamic fusion moduleSiCong Li
Resolves COMPMID-5318 Signed-off-by: SiCong Li <sicong.li@arm.com> Change-Id: I59594632c9891b9569089764ae26cc7be6b78fcd Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7550 Reviewed-by: Nikhil Raj Arm <nikhil.raj@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2022-05-10Add support for 2d and 3d indices for axis 0Pablo Marquez Tello
* Partially resolves COMPMID-5055 Change-Id: Id05374b8c69e6b9ab4c2790a4de93d7172063b71 Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com> Change-Id: Ic6e2c2d1d34abbf6222c8d56859514e267447266 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7488 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-05-06Integrate Dynamic Fusion patchesSiCong Li
* Add public interfaces: * OperatorGraph: Describe a workload that could contain fused kernels * IWorkload: Generic interface for workloads built from OperatorGraph * ClWorkload: OpenCL workloads built from OperatorGraph * ClCompositeOperator: Runtime async operator to execute a ClWorkload * DependencyGraph (will likely be deprecated in later iterations) * Add example * cl_fused_conv2d_elementwise_add.cpp to explain how to use the new interfaces * Add internal translation layer * Refactor ClKernelBuildingAPI * Remove non-tile based gemm native kernel component * Minor interface changes * Add integration tests Resolves COMPMID-5161 Signed-off-by: SiCong Li <sicong.li@arm.com> Change-Id: Ib987ed79289ab0bcbd3130d54f5793408d9f1240 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7510 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2022-05-06Extend GemmLowp reference to support BATCH MATMUL for quantized data typeAdnan AlSinan
- Extends GEMMInfo class to include flags for transposing A and B. - Extends GEMMLowp fixtrues to have an option for transposing A and B. Resolves COMPMID-5075 Signed-off-by: Adnan AlSinan <adnan.alsinan@arm.com> Change-Id: If5e4b7e2b7b19ca30808a78a9641d8ba3f176a26 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7458 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2022-04-21Add missing noexcept.Michalis Spyrou
This fixes building faillures with gcc9. Change-Id: I993116866b0f1e41cb2518e880798ad1f8c2e0af Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7448 Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2022-03-15Implementation of ClPooling3dramelg01
- For NDHWC layout - For F16 and F32 data types - Mixed Precision stil not supported Resolves: COMPMID-4670 Signed-off-by: ramy.elgammal@arm.com Change-Id: I0e14a13e4625569e8e5ee67e6033bd1efe0da469 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7262 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2022-03-08Merge kernel prototype patchGiorgio Arena
Resolves: COMPMID-5151 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Change-Id: Ic4024d5cd4819fe917a1d49621f1866ae2e90a37 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7260 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-03-03Decouple castKernelYair Schwarzbaum
Resolves: COMPMID-4625 Signed-off-by: Yair Schwarzbaum <yair.schwarzbaum@arm.com> Change-Id: I3c30f007804b179e5e2b439f421fbd4e57fb02e1 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7149 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
2022-03-01Add Pool3d reference implementationGunes Bayir
This patch - adds the reference implementation for the 3D pooling layer - supports FP32/FP16 and INT8/UINT8 types - adds a function to calculate the output shape for 3D pooling - adds a new type for describing pool 3d info (Pool3DInfo) Resolves: COMPMID-4659 Change-Id: I22a18fa30625c98fa827ef1b50781db6893ba9c4 Signed-off-by: Gunes Bayir <gunes.bayir@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7219 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2022-02-21Decouple CpuDepthwiseConv2dNativeKernelDana Zlotnik
Resolves COMPMID-4632 Change-Id: I5e2a9f0f7801a2afaa35de871ab29cd7238923fd Signed-off-by: Dana Zlotnik <dana.zlotnik@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7115 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-02-10Int32 to Float32 implicit casting fixMohammed Suhail Munshi
- Int32 is cast to Float32 implicitly during range checking comparisons, value may change due to float32 range limitations. - Fixed by casting float32 var to double64 before comparisons. - This issue prevented NDK r23b compilation Resolves: [COMPMID-5051] Change-Id: I1695f9ad5fa147d81cbcfda9e0152aefab960c82 Signed-off-by: Mohammed Suhail Munshi <MohammedSuhail.Munshi@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7093 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Sheri Zhang <sheri.zhang@arm.com>
2022-02-09Remove deprecated remap functions.Adnan AlSinan
- Remove CLRemapKernel. - Remove NERemapKernel. Partially resolves COMPMID-4984 Signed-off-by: Adnan AlSinan <adnan.alsinan@arm.com> Change-Id: Ia61f9ac7447695d81178701cf0e9b7625a91eccc Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7056 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-02-07Adding support for Arm Mali-G710Gian Marco Iodice
Resolves COMPUTE-13901 Change-Id: Ib83d737066a55ab6452bdc34e3e4cba2d466d72a Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6971 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2022-01-12Enable kernel selection testing (Phase #1)Giorgio Arena
Change-Id: I1d65fb9d3a7583cf8d4163ca7c0fbee27dc52633 Signed-off-by: Yair Schwarzbaum <yair.schwarzbaum@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6767 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-12-25Add tests for FP Cpu Pooling where pool region is completely outside the inputSiCongLi
* Add floating point validation tests for this configuration * Fix reference implementation to return -inf for this configuration * Prohibit this config in Cl, as well as non-float cases in Cpu * Direct this config to non-asm path Resolves COMPMID-4998 Change-Id: If88025c51b14ea337aea2441c548f858e95e5819 Signed-off-by: SiCongLi <sicong.li@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6857 Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-12-24Replacing non-inclusive terms with proper termsramelg01
Partially-Resolves: COMPMID-4854 Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com> Change-Id: Ic9757c89878b9b5a89680b5344de657f676c7bf2 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6859 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
2021-12-10Fix 300% Regression CPU - Change default mws value in Kernel filesMohammed Suhail Munshi
Resolves: COMPMID-5001 Change-Id: I13fbe859d2557be0459ba76da0136d0efb15f311 Signed-off-by: Mohammed Suhail Munshi <MohammedSuhail.Munshi@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6809 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Sheri Zhang <sheri.zhang@arm.com>
2021-11-26Update OpenCL headersPablo Marquez Tello
* Make changes to include CL/opencl.hpp * Update CL C++ header to v2.0.15 * Update CL Headers to v2021.06.30 * Resolves MLCE-665 Change-Id: Ie2896e213519003531ecff0889d2112838d72d1b Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com> Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/c/VisualCompute/ComputeLibrary/+/377282 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Sheri Zhang <sheri.zhang@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6751 Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-11-26Fix Cpu Conv3d gcc 8.3 build issuesFreddie Liardet
Resolves: COMPMID-4986 Signed-off-by: Freddie Liardet <frederick.liardet@arm.com> Change-Id: I54b682d377f3bcfc57fec54113debc5e8a1d75df Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6745 Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-11-20Improve start-up timer for GeMM (floating-point):ramelg01
- Pass M,N,K at runtime as kernel parameters - Add a guard macro to compile only kernel of interest - Move reshpaing kernels to gemm_utils.cl - Remove the fallback reshaping kernel with Y-Padding support Resolves: COMPMID-4888 Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com> Change-Id: Ida3851326f0b77e410633271de9ecca106e37931 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6662 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-11-16Implement 1D Adaptive Workload Splitting in CPPSchedulerDana Zlotnik
Resolves COMPMID-4649 Change-Id: I941d2f8a40737ff05c49f6695a42884731ef2dc9 Signed-off-by: Dana Zlotnik <dana.zlotnik@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6656 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-11-12Fix PostOp dependencySiCongLi
In general src headers should not be included in any public header of other modules. Since there are modules (graph, tests) that rely on specific PostOp definitions in the previous src/core/experimental/PostOp.h, export it to the public arm_compute header Resolves COMPMID-4974 Signed-off-by: SiCongLi <sicong.li@arm.com> Change-Id: I0fa4da5108a34fe6bfff1e9d57839da4e51dc314 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6673 Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-11-09Enable fast_math in CpuFullyConnectedcfRod
ONCPUML-529 * Add support for passing fast_math for fullyconnected layers via fc_info. * Add support for passing fast_math to run ACL benchmark graphs. * Add validation test and accuracy tests (updated fixtures). Note: abs and rel. tolerance for fast math mode are set based on experimental data. Signed-off-by: cfRod <crefeda.rodrigues@arm.com> change-Id: Ib107d6264d3ae5e36555334f39a13e678f8618df Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6521 Reviewed-by: SiCong Li <sicong.li@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-11-04Add validate tests for CLConvolutionLayer and CLGEMMConvolutionLayer with ↵SiCongLi
post ops * Add validate tests * Restrict post ops support in ClGemmConv2d to only those that do not need im2col or col2im. In practice this means we only support post ops in conv1x1 with stride = 1, dilation = 1 and data layout = NHWC Resolves COMPMID-4435 Change-Id: I1fdf0c5d565a4624857250075ac76db35c2f383b Signed-off-by: SiCongLi <sicong.li@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6573 Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-11-04Add PRelu to supported PostOps in:ramelg01
- ClGemmMatrixMultiplyReshapedKernel - ClGemmMatrixMultiplyNativeKernel - ClGemmMatrixMultiplyReshapedOnlyRhsKernel Resolves: COMPMID-4713 Change-Id: I3adcb1b3d4af37ebcbc3bee19cc1845885d08600 Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6553 Reviewed-by: SiCong Li <sicong.li@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-11-01Add PostOp support to GEMM and CLGEMM operators and functions Part 2SiCongLi
* Implement PostOp interface changes * Remove spaces around "=" in TypePrinter Partially resolves COMPMID-4435 Signed-off-by: SiCongLi <sicong.li@arm.com> Change-Id: If1e2280554030a0f635e73339a2e86987f6dc41b Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6484 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Sheri Zhang <sheri.zhang@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-11-01Fix dst "widening" validationSiCongLi
* Auto-initialize the dst tensor before checking for PostOp shape compliance so that we catch the invalid case of "widening" dst tensor shape * Rework post op validate test cases to be more readable Partially resolves: COMPMID-4435 Change-Id: I79943994182942f962e4d59a7fa0d6f017ae9ac7 Signed-off-by: SiCongLi <sicong.li@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6548 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-10-28Add experimental PostOp interface to ClGemmMatrixMultiplyReshapedKernel Part 1SiCongLi
This interface supports the fusion of multiple elementwise operations Partially resolves: COMPMID-4435 Change-Id: If68dd7dd98dcf239fde7cb1f0a4a6d4d1e899a6f Signed-off-by: SiCongLi <sicong.li@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6483 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-10-27Improve conv3d validationFreddie Liardet
Improve validation of cpu conv3d and add validation test. Align Size3D to Size3D comparison with how Size2D implements it. Remove print statement in MaxUnpooling validation tests. Signed-off-by: Freddie Liardet <frederick.liardet@arm.com> Change-Id: I17048d56b08704cdbf1ad978af02009e57f3aa83 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6512 Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-10-18Implement Minimum Workload Size (MWS) in all CPPKernels used by small networksDana Zlotnik
* create get_mws method in ICPPKernel class that retuns default value for all kernels * overwrite the default value for all the kernels used by small networks (according to banchmark case) Resolves COMPMID-4648 Change-Id: I46d7cae61217213279d2ee740edc73f600b6d576 Signed-off-by: Dana Zlotnik <dana.zlotnik@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6412 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-10-12Remove padding in cpuPool2d NCHWFreddie Liardet
Remove padding from all cpuPool2d NCHW kernels (FP16,FP32 & Quantized) Resolves: COMPMID-4728, COMPMID-4823 Signed-off-by: Freddie Liardet <frederick.liardet@arm.com> Change-Id: Ida619f67cd6606b33828f2d9dee925aeb794cc50 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6358 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-10-07Add support for 5D data layout indexingGiorgio Arena
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Change-Id: Ib346bb6b90d2220ec5934c83a9a1f0cd540b8731 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6377 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-09-29Add support for non-constant weights and biases in CpuFullyConnectedGiorgio Arena
Changing the approach for specifying that weights and biases tensors are non-constant by making it a member of TensorInfo rather than an option of the functions. Resolves: COMPMID-4222, COMPMID-4811 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Change-Id: I9b0081ccbcf8271ce029ba6755563d64c59e1d32 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6313 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-09-23Use std::vector instead of Coordinates for ITensorInfo's _dims_stateGiorgio Arena
Resolve COMPMID-4798 Change-Id: I2661de88640dcfee92d207f7e884c066eb19ab94 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6306 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com>
2021-09-22Update OpenCL header file to version 2020.12.18Sheri Zhang
Resolves: COMPMID-4656 Signed-off-by: Sheri Zhang <sheri.zhang@arm.com> Change-Id: I7735b9828736baa7cdc4690e191a489c824530c6 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6280 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-09-22Provide logging for configure functions in all NEON functionsramelg01
Partially Resolves: COMPMID-4718 Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com> Change-Id: I655268c57fa126d9c99981c49d345a3aac75646e Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6286 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com>
2021-09-16Revert "Add support for non-constant weights and biases in CpuFullyConnected"Pablo Marquez Tello
This reverts commit aed63ee175e0d64c934389e9d1b2edd0cb1a5cdd. * Resolves COMPMID-4812 Change-Id: I16919e2f3b22c868ae146d0d10dae97a80e1ba46 Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6266 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Sheri Zhang <sheri.zhang@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-09-15Adds Conv3d reference implementation support.Adnan AlSinan
Expands the interface with the following items: - Size3D Class. - Conv3dInfo Struct. - Padding3D Struct. - Add 'NDHWC' to supported Tensor Data Layouts. - Add function to compute expected size of Conv3d. Resolves COMPMID-4658 & COMPMID-4657 Signed-off-by: Adnan AlSinan <adnan.alsinan@arm.com> Change-Id: Ic7452c48461eedaa38eaf3ac458f54b031e7dfa8 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6187 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-09-07Add support for non-constant weights and biases in CpuFullyConnectedMichele Di Giorgio
Changing the approach for specifying that weights and biases tensors are non-constant by making it a member of TensorInfo rather than an option of the functions. Resolves: COMPMID-4222 Change-Id: I96e6f3868f51785c9700a3ef6a1fe7b05747862c Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6162 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2021-09-01Printing operators parameters, currently for CpuAdd operator only.Ramy Elgammal
Resolves: COMPMID-4718 Change-Id: Id4dd762cd1b759bb814b9d0b1ea0c9ba4dfbae6f Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6139 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-09-01Fixed compiler errorPablo Marquez Tello
* Missing noexcept causes compilation to fail on GCC 9.3.0 * Resolves MLCE-595 Change-Id: I960608dbeaacac3699465da4b75740237d65559c Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6182 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-08-24Re-use auxiliary memory withing CpuWinogradConv2d operatorsGeorgios Pinitas
Input/Output transformation operations are independent and done in different time-steps of the algorithm, this memory can be re-used between this transformation stages. Moreover, reduce the allocation when extracting workspace sizes for Winograd trasformations. There is a mix return of sizes in bytes and elements, thus ensure the correct is in place. storage_size() member functions return elements while working_space() function bytes. Resolves: COMPMID-4781 Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I705445ba7ca818cead48369db3cacd49684c7192 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6145 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2021-08-13Avoid releasing weights if they are used by multiple functionsGeorgios Pinitas
Resolves: COMPMID-4769 Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: Iccadcbd68b0fd84ed3bf212e358a4ea944084a40 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/c/VisualCompute/ComputeLibrary/+/349845 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Comments-Addressed: bsgcomp <bsgcomp@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6107 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-08-09Remove TODO, FIXME and COMPMID comments where possibleFreddie Liardet
Signed-off-by: Freddie Liardet <frederick.liardet@arm.com> Change-Id: I968787603927bcfbeacb110570eb488061ee3e43 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6058 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-07-22Update GEMM assembly kernelsGeorgios Pinitas
- Introduce Fp32 kernels with internal calculations in Bfloat16 when fast_mode is enabled - Improve kernel selection heuristics Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I68a9e7e862b6fd2721b46e0d7cc791091c4ab279 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5965 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-07-16Avoid multiple Rhs matrix transformation on ClGemmGeorgios Pinitas
ClWinogradConv2d was performing Rhs transformation on every step impacting the performance. Adds scope logging support through ARM_COMPUTE_LOG_MSG_WITH_FUNCNAME Resolves: COMPMID-4596 Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: Ib329d3bc8d8aa21abae9fabfe61de35cc84d4819 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5925 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-07-06Fix manual LOOP_UNROLLINGGian Marco Iodice
The issue is caused by the number of iterations passed to LOOP_UNROLLING. When we use the manual LOOP_UNROLLING, the number of iterations must be less than or equal to 128. To overcome this problem, we create a utility function to check if any of the critical iterations (kernel dimensions) are beyond that limit. If so, the utility function, disable the manual loop unrolling. Resolves COMPMID-4609 Change-Id: I7221c967609e462a5abd1cbb74e2a120f344fcb3 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5913 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-07-02Rework OpenCL Depthwise ConvolutionGian Marco Iodice
- Remove dedicated kernels for NCHW. Now we only use NHWC with permute - Remove specialized kernels for 3x3 NHWC - Simplify CLDepthwiseConvolutionLayer.cpp to call just the native implementation for both floating-point and quantized data types - Develop two parametric opencl kernels for depthwise convolution layer NHWC (floating-point and quantized) - Add support to export the weights to cl_image - Extend test for depthwise convolution on opencl Resolves COMPMID-4417 Change-Id: Ibe533f79c2860f9cac8e921895d5a8f947753a5c Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5893 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-06-30Revert "Rework OpenCL Depthwise Convolution"Gian Marco Iodice
This reverts commit 561c176598cd14245e2e7918fdf136d1c888d1da. Reason for revert: <validation> Change-Id: I6f2d61c27520439bb538e9265736532104b24cf8 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5127 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-06-29Port the ClGemmLowp kernels to the new APIGeorgios Pinitas
Ported kernels: - CLGEMMLowpMatrixMultiplyNativeKernel - CLGEMMLowpMatrixMultiplyReshapedKernel - CLGEMMLowpMatrixMultiplyReshapedOnlyRHSKernel - CLGEMMLowpOffsetContributionKernel - CLGEMMLowpOffsetContributionOutputStageKernel - CLGEMMLowpQuantizeDownInt32ScaleByFixedPointKernel - CLGEMMLowpQuantizeDownInt32ScaleByFloatKernel - CLGEMMLowpQuantizeDownInt32ScaleKernel Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I9d5a744d6a2dd2f2726fdfb291bad000b6970de2 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5870 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>