aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-03-24Work around CLScale compiler-specific issueSiCong Li
Resolves COMPMID-5985 Signed-off-by: SiCong Li <sicong.li@arm.com> Change-Id: I0e789619f09e3adefe3655df347390f057300c0f Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9373 Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2023-03-24Add Texture Pipe Support for Matmul Lhs T/NT Rhs NT kernelsGunes Bayir
Resolves: COMPMID-5945, COMPMID-5954 Change-Id: I7b27021d21f8e08c4896f6b1f595a75125064f9e Signed-off-by: Gunes Bayir <gunes.bayir@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9356 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2023-03-23Round to nearest with ties to away from zero in ReluPablo Marquez Tello
* This patch adds support for rounding modes in vmlaq_qasymm8_signed which is used to compute Relu for quantized types * Partially resolves MLCE-1018 Change-Id: I2a267b84745430e1ffe92b8bc79828a39332db18 Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9354 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2023-03-21gemm_interleaved: Set up the accumulation buffer properly in alternateDavid Mansell
threading mode. Resolves: COMPMID-5844 Change-Id: Iceb0018114bbca2bfdac4d4406936f9b260539e9 Signed-off-by: David Mansell <David.Mansell@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9070 Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
2023-03-21Add dynamic weights for CPU fully connected layerViet-Hoa Do
Resolves: COMPMID-5917 Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com> Change-Id: I073067b490f2a1b96b81a037ea431c9a2e5c7503 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9322 Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2023-03-20Implement OpenCL MatMul for Lhs T Rhs T/NT FP32/16Gunes Bayir
- Implement opencl kernel for LHS transposed and RHS non-transposed - Implement opencl kernel for LHS transposed and RHS transposed - Add validation tests Resolves: COMPMID-5953, COMPMID-5955 Change-Id: I55589acbffe86c44e29807574975978a1ec09bad Signed-off-by: Gunes Bayir <gunes.bayir@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9345 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2023-03-17Implementation of RSQRT for quantized int8Ramy Elgammal
Resolves: COMPMID-5863 Change-Id: I9ff67face62826c1d335a6b941e8516be39bdac8 Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com> Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/c/VisualCompute/ComputeLibrary/+/488768 Tested-by: bsgcomp <bsgcomp@arm.com> Comments-Addressed: bsgcomp <bsgcomp@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9225 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2023-03-17Implement OpenCL MatMul for Lhs NT Rhs T/NT FP32/16Ramy Elgammal
- Implement ClNativeMatMulKernel class - Implement opencl kernel for LHS non-transposed and RHS non-transposed - Implement opencl kernel for LHS non-transposed and RHS transposed - Add test fixture and dataset for matmul - Implement transpose_tensor() for reference implementation to transpose high dimensional tensors Resolves: COMPMID-5944, COMPMID-5951 Co-authored-by: Gunes Bayir <gunes.bayir@arm.com> Co-authored-by: Ramy Elgammal <ramy.elgammal@arm.com> Change-Id: I1d5b8978f41be27baddb3153ade880472141573f Signed-off-by: Gunes Bayir <gunes.bayir@arm.com> Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9333 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2023-03-16Pin links to documentation to release versionJakub Sujak
Partially resolves: COMPMID-5969 Change-Id: I0ab9c93c81111ddd3ee9e7feb5033a7102cc98a5 Signed-off-by: Jakub Sujak <jakub.sujak@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9341 Reviewed-by: SiCong Li <sicong.li@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2023-03-14Update readmeJakub Sujak
Partially resolves: COMPMID-5969 Change-Id: I2fc6dcec53051886e46404857763ee69e2779014 Signed-off-by: Jakub Sujak <jakub.sujak@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9330 Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2023-03-14Update SONAME_VERSION in SConscriptJakub Sujak
Partially resolves: COMPMID-5969 Change-Id: I05f7c77d7ff6ba1e65b752b4f705d8964b04357f Signed-off-by: Jakub Sujak <jakub.sujak@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9331 Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2023-03-14Update release log for 23.02.1 patch releaseJakub Sujak
Partially resolves: COMPMID-5969 Change-Id: I09f41abbda378cd6551bdf5bf4866b2bf4ca3096 Signed-off-by: Jakub Sujak <jakub.sujak@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9332 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2023-03-14Add CropInfo to BatchToSpace reference and fixtureSiCong Li
Partially resolves COMPMID-5918, COMPMID-5865 Signed-off-by: SiCong Li <sicong.li@arm.com> Change-Id: Ib3b01e7dc1c944184a4c038045bf0469fbb9ff45 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9321 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2023-03-13arm_gemm: Add SME2 FP16 kernels.David Mansell
Resolves: COMPMID-5966 Change-Id: Ic0d694493178da029a297643855bd0cff01b174f Signed-off-by: David Mansell <David.Mansell@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9302 Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2023-03-13[ONCPUML-1174] Allow src/weights mismatch for fixed formatJonathan Deakin
Without this, we have to pass in weights to be NHWC, even if they are in fact blocked/interleaved for consumption by a fixed format kernel. Signed-off-by: Jonathan Deakin <jonathan.deakin@arm.com> Change-Id: I9ee8720a21a16b17816dbecf6308e1668ddda59c Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9285 Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Jakub Sujak <jakub.sujak@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2023-03-08Add support for arbitrary parameters for CPU GatherViet-Hoa Do
* The shape of input and indices tensors, and the gather axis can be any number, as long as these are valid and the output tensor doesn't have more dimensions than the library supports. * Update the reference code to be more generic and straightforward. * Add necessary test cases. Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com> Resolves: COMPMID-5919 Change-Id: Ic7e2032777aa97ecc147f61d5388528697508ab1 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9199 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2023-03-07Add sigmoid and tanh for dynamic fusionViet-Hoa Do
* Add sigmoid and tanh activation functions for dynamic fusion. * Add corresponding tests, but both activation functions share the same fixture implementation. Resolves: COMPMID-5939 Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com> Change-Id: I0aae0eaa18b746ce89680d2773c66e09b0f854ce Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9257 Reviewed-by: Jakub Sujak <jakub.sujak@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2023-03-07Add exporting compile_commands.json fileViet-Hoa Do
* Add an option in scons script to export compile_commands.json file to support development using language server. * Add .cache directory to the git ignore list. It is normally used by clangd as the local cache. Resolves: COMPMID-5940 Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com> Change-Id: I8c2a1ac85942d34ada22adea3e7de2baf2189eb2 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9258 Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Jakub Sujak <jakub.sujak@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2023-03-07GEMM: SME: Allow threading for quantized GEMMs.David Mansell
The SME kernels for quantized int8/uint8 GEMMs erroneously required that maxthreads==1 before they could be selected. This resulted in them not being available on multi-thread runs. Remove that restriction. Resolves COMPMID-5962 Change-Id: Ia7933d0c66020b5e2981604ca97ff7ead95ec14e Signed-off-by: David Mansell <David.Mansell@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9274 Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2023-03-07Resolve the presence of variables that are unused in release mode in Dynamic ↵Omar Al Khatib
Fusion source files Resloves: [COMPMID-5960] Signed-off-by: Omar Al Khatib <omar.alkhatib@arm.com> Change-Id: I1b11f01c51a029082ed05823717b4c4ae4897798 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9270 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2023-03-06Fix LWS search space used by CLTunerSiCong Li
* Ensure CLTuner uses the real GWS used by run(), instead of the static GWS (which is usually changed at run time), by caching GWS in each kernel Note this is a somewhat inelegant workaround. The real issue stems from the fact that execution window and scheduler are very much coupled with our operator run() / run_op() method. (Please see COMPMID-5934) * Restrict LWS values to explore within GWS bound for exhaustive mode * Refactor gws_from_window() to include all the information required to calculate GWS * Log lws search space used for tuning * Fix ClDirectConv2dKernel config id Resolves COMPMID-5892 Signed-off-by: SiCong Li <sicong.li@arm.com> Change-Id: I420490d8b94d13ada2e44eb0a12078f883379334 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9193 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2023-03-03Add weights_info as optional input for NEDeconvolutionLayerAnnop Wongwathanarat
This is so that we can leverage fixed format kernel when using gemm convolution method. Partially resolves: [ONCPUML-1129] Change-Id: I61ffa74f5cd9d75579dbc1f9aa187371f855e932 Signed-off-by: Annop Wongwathanarat <annop.wongwathanarat@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9248 Reviewed-by: Jakub Sujak <jakub.sujak@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2023-03-03NEGEMMLowpMatrixMultiplyCore should be configured for optimized int8 kernel.Ethan Doe
Currently the validation routine incorrectly prevents optimized INT8 Gemm kernel from being used if the input is QASYMM8 and output type is S32. This change allows QASYMM8 input and S32 output types to leverage optimized kernel. Signed-off-by: Ethan Doe <yidoe@amazon.com> Change-Id: I65b060f522795db07d6d4df86fb7c6ddd1c626d4 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9250 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2023-03-02Fix direct conv2d in dynamic fusionViet-Hoa Do
* Put input and output tensor shape value directly to the CL code. * Use texture for weights when it is possible. Resolves: COMPMID-5938 Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com> Change-Id: Ib53b310a80ce857eac36564b352136fdde55b131 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9249 Reviewed-by: SiCong Li <sicong.li@arm.com> Reviewed-by: Jakub Sujak <jakub.sujak@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2023-03-01Add support for kernel indices in MaxpoolAdnan AlSinan
- Add a max pooling implementation that returns kernel indices. - Add a parameter in pooling info object to pick kernel indices impl. - Add validation tests. Resolves: [ONCPUML-1187] Signed-off-by: Adnan AlSinan <adnan.alsinan@arm.com> Change-Id: I485ef1604f676ee14d5f7f62d33699e49c38e4d3 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9192 Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2023-02-28Add an option to use lowest for max-poolingAdnan AlSinan
- Add a parameter in PoolingLayerInfo class to pick which value to use as min for max-pooling. Resolves: [ONCPUML-1166] Signed-off-by: Adnan AlSinan <adnan.alsinan@arm.com> Change-Id: I34e1cccc15176bbf31523c61e99f3188ddca23e1 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8989 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2023-02-28Add MatMulInfo Information class to ComputeLibraryMohammed Suhail Munshi
Partially resolves : [COMPMID-5930] Signed-off-by: Mohammed Suhail Munshi <MohammedSuhail.Munshi@arm.com> Change-Id: Icb6c8a9d1b5a2c5c57e37cb7c877414ed500d0cc Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/c/VisualCompute/ComputeLibrary/+/494758 Tested-by: bsgcomp <bsgcomp@arm.com> Comments-Addressed: bsgcomp <bsgcomp@arm.com> Reviewed-by: Sicong Li <sicong.li@arm.com> Reviewed-by: Jakub Sujak <jakub.sujak@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9181 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2023-02-27Add build option to disable threads hintViet-Hoa Do
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com> Resolves: COMPMID-5936 Change-Id: Iedfcc632f5d900865f38bd7b164121af48546542 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9220 Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2023-02-24Fixes for CMake and Bazel builds, tests failing in sconsDavid Svantesson
- Fix 4 failing tests for multi_isa builds when experimental_fixed_format_kernels=1 - Fixes for CMake and Bazel builds to pass validation tests - Update documentation, remove “-DCPPTHREADS=1” flag from CMake build example Partially resolves: ONCPUML-1181 Signed-off-by: David Svantesson <david.svantesson@arm.com> Change-Id: I7101676260a0adcb7b6ff6f4342ae36f921e7120 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9189 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2023-02-22Fix configuration files required for Bazel and CMake buildsGunes Bayir
Partially Resolves: COMPMID-5909 Signed-off-by: Gunes Bayir <gunes.bayir@arm.com> Change-Id: Ia0ae7bfe22f8866e3d82f62098757e3bc55f46ca Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9183 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Jakub Sujak <jakub.sujak@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2023-02-21Add Microsoft Windows® trademarksJakub Sujak
Signed-off-by: Jakub Sujak <jakub.sujak@arm.com> Change-Id: Ie2332d197f274f6f44fa71093f4aff83e91c2a20 Signed-off-by: Jakub Sujak <jakub.sujak@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9182 Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2023-02-15Fix Intermittent Neon™ ReduceMean QASYMM8 MismatchMohammed Suhail Munshi
- Dividing scale by number of elements causes accuracy loss due to limitations in float datatype and truncation to int - Adds rounding after division on aarch64 to negate this. Resolves: [COMPMID-5839] Signed-off-by: Mohammed Suhail Munshi <MohammedSuhail.Munshi@arm.com> Change-Id: I54ef0f7e56f39da1fa5f30378f551b5ca419a61d Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/c/VisualCompute/ComputeLibrary/+/492456 Tested-by: bsgcomp <bsgcomp@arm.com> Comments-Addressed: bsgcomp <bsgcomp@arm.com> Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9110 Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2023-02-15Fix clang-tidy errorJakub Sujak
Resolves: COMPMID-5901 Change-Id: I975561cbbbe1b595ae5da597d20041cc478060b6 Signed-off-by: Jakub Sujak <jakub.sujak@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9153 Reviewed-by: SiCong Li <sicong.li@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2023-02-14Add absolute tolerance to f16 CLConv3D validation testsSiCong Li
This fixes faulty mismatch issues. In addition, this aligns with the methodology used by f32, as well as that of cpu f16 tests Resolves COMPMID-5897 Change-Id: Id4e2088a9fc5444265c69444cfa90961dd84047e Signed-off-by: SiCong Li <sicong.li@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9146 Reviewed-by: Jakub Sujak <jakub.sujak@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2023-02-14Extend skip upsampling for deconvolution for non-1x1 kernelsAnnop Wongwathanarat
Skip upsampling step for deconvolution when input strides are 1 regardless of kernel size. This is achieved by setting correct paddings for unit strides convolution. Resolve: [ONCPUML-1183] Change-Id: Ief88f9fe30f6f56d3358e3cf6a506ab8b5691f18 Signed-off-by: Annop Wongwathanarat <annop.wongwathanarat@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9134 Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2023-02-14Fix convolution layer fixtureViet-Hoa Do
* Fast math is enabled unexpectedly by convolution layer tests. Resolves: COMPMID-5843 Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com> Change-Id: Ib3bc36d3f9070dbfb2c76146eecbb1ce0ee90626 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9137 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2023-02-14Add missing cpu feature printouts during testsGunes Bayir
Change-Id: I85bfdc2ef65e06068659dbe1904db72e5cf98a64 Signed-off-by: Gunes Bayir <gunes.bayir@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9141 Benchmark: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2023-02-10Update release version and change log documentationJakub Sujak
Resolves: COMPMID-5565 Change-Id: I9dca679f57f6c3cc9489669b80a5da2aba500d34 Signed-off-by: Jakub Sujak <jakub.sujak@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9122 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2023-02-10Fix DeconvolutionLayer tolerance issues in FP16 testsGunes Bayir
This patch increases the tolerance value used for FP16 tests in Neon(TM) backend. The tolerance number means 0.01f means it is ok to have 1% mismatch in the resulting tensor between the reference and the target. The value adopts a slightly stricter threshold compared to ConvolutionLayer (which is currently at 7%). This increase makes sense because Deconvolution layer uses convolution under the hood. Resolves: COMPMID-5841 Signed-off-by: Gunes Bayir <gunes.bayir@arm.com> Change-Id: Ie0ebf5cce1e9753dc641a947d84128dd6da402d4 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9120 Reviewed-by: Jakub Sujak <jakub.sujak@arm.com> Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Reviewed-by: Sang Won Ha Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2023-02-10Update release version and change log documentationJakub Sujak
Partially resolves: COMPMID-5565 Change-Id: I058bb7b0ee2ba246bf0f0c67d68e35b57326e802 Signed-off-by: Jakub Sujak <jakub.sujak@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9098 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Omar Al Khatib <omar.alkhatib@arm.com> Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2023-02-10Remove legacy dynamic fusion example from documentationJakub Sujak
Partially resolves: COMPMID-5565 Signed-off-by: Jakub Sujak <jakub.sujak@arm.com> Change-Id: I592bc13a84451424d2fb1d6a54bb301a80306079 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9094 Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Omar Al Khatib <omar.alkhatib@arm.com> Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2023-02-10Update READMEJakub Sujak
Include a notice informing users of the new tensor lock paddings behavior. Partially resolves: COMPMID-5565 Signed-off-by: Jakub Sujak <jakub.sujak@arm.com> Change-Id: I2d6aaa68ac1f75acab3b97e9538b4f549fc066c7 Signed-off-by: Jakub Sujak <jakub.sujak@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9096 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Omar Al Khatib <omar.alkhatib@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2023-02-10Update SONAME_VERSION in SConscriptJakub Sujak
Partially resolves: COMPMID-5565 Signed-off-by: Jakub Sujak <jakub.sujak@arm.com> Change-Id: I1699f19df5558a99e00736892f4205c0bfc2cbda Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9095 Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Adnan AlSinan <adnan.alsinan@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2023-02-09Fix performance regression in Transposed ConvolutionGunes Bayir
Resolves: COMPMID-5849 Change-Id: I86f8bbc1f3a7c12c66d5ad8fcd74dd9e69629aa0 Signed-off-by: Gunes Bayir <gunes.bayir@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9102 Reviewed-by: Jakub Sujak <jakub.sujak@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Dynamic-Fusion: Jakub Sujak <jakub.sujak@arm.com>
2023-02-08Update CPU kernels to remove x19 and w19Michael Tyler
Resolves: COMPMID-5805 Change-Id: Idf720bbb136474810086f5089c5ed23b3f79835a Signed-off-by: Michael Tyler <michael.tyler@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9081 Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
2023-02-08Add support for dilation > 1 in assembly DepthwiseConvolutionPablo Marquez Tello
* Resolve COMPMID-5689 Change-Id: I81a3791ad054db59562b76d1c729f2b2168aee8b Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com> Signed-off-by: Andrew Mundy <andrew.mundy@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8919 Reviewed-by: Jakub Sujak <jakub.sujak@arm.com> Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2023-02-03Fix armv7a failing GEMMConvolutionLayer testsMohammed Suhail Munshi
Resolves: [COMPMID-5854] Signed-off-by: Mohammed Suhail Munshi <MohammedSuhail.Munshi@arm.com> Change-Id: Ib0228409be5e816acca7e123f2660eb01a79e38f Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9078 Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2023-02-03Disable AddMulAdd armv7a testsGunes Bayir
The AddMulAdd assembly kernels only support aarch64. This patch disables the tests associated in case the build is not for aarch64. Resolves: COMPMID-5850 Change-Id: Ib2768fd6bf2497420ff224daa243027d0a69c76b Signed-off-by: Gunes Bayir <gunes.bayir@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9076 Reviewed-by: Jakub Sujak <jakub.sujak@arm.com> Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2023-02-02Update recommended NDK to r20b in the documentationSiCong Li
Resolves COMPMID-5846 Signed-off-by: SiCong Li <sicong.li@arm.com> Change-Id: I0afbcda72879ccf8e9ef486d3c4af7a3c75ba270 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9071 Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2023-02-01Fix GEMMLowp/Batched MatMul mismatches on CPUMohammed Suhail Munshi
- Fixes Column Offset matrix is not being iterated through in y dimension Resolves : COMPMID-5795 Signed-off-by: Mohammed Suhail Munshi <MohammedSuhail.Munshi@arm.com> Change-Id: I0190474be404b4f0e171855739cfd0a48cbed5bc Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9020 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>