path: root/tests/validation/fixtures
Age | Commit message | Author
2024-02-14 | [QTest] Use dynamic output quantization in Depthwise Conv tests | Omar Al Khatib
Resolves: COMPMID-6483 Signed-off-by: Omar Al Khatib <omar.alkhatib@arm.com> Change-Id: I512102f5e27743098168101b5e02382f4ad4a22a Signed-off-by: Gunes Bayir <gunes.bayir@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11068 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Jakub Sujak <jakub.sujak@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2024-02-08 | Fix the bug in GpuTanh operator in dynamic fusion | Gunes Bayir
Tanh in dynamic fusion is a simple operator with no a and b coefficients, as its public interface implies. The Tanh operator follows the TOSA specification. Customization of the tanh calculation with a and b can be achieved via fusion as below:
  out = a * tanh(b * in)
  -->
  x = b * in
  y = tanh(x)
  out = a * y
Resolves: COMPMID-6873 Signed-off-by: Gunes Bayir <gunes.bayir@arm.com> Change-Id: I818765192f631ae82c2094b0fc376fb87bae4fa4 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11109 Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
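The decomposition described in this commit message can be checked numerically. The sketch below is plain Python, not ComputeLibrary code; the function names are hypothetical and stand in for a fused multiply, tanh, multiply chain.

```python
import math


def scaled_tanh_reference(a, b, x):
    """Single-expression form: out = a * tanh(b * x)."""
    return a * math.tanh(b * x)


def scaled_tanh_decomposed(a, b, x):
    """Same value built from three simple steps, mirroring the fusion
    described in the commit message (names are illustrative only)."""
    scaled_in = b * x                  # x = b * in
    activated = math.tanh(scaled_in)   # y = tanh(x)
    return a * activated               # out = a * y
```

Both functions perform the same floating-point operations in the same order, so they agree for any finite inputs.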
2024-02-01 | Use the stable CKW API in the GPU dynamic fusion backend | Gunes Bayir
- Refactor all kernels to work with the CKW stable API
- Add support for sub-tile in the op_load/op_store CKW operator
- Fix mismatch in resize
- Add comments in all kernels written with CKW to help developers understand the structure of the code
- Add texture image support in depthwise convolution written with CKW
- Add support for different block sizes in depthwise convolution
- Remove the use of the dynamic fusion helper functions.
- Add support for floor in the op_unary() of CKW
Resolves: COMPMID-6708, COMPMID-6743, COMPMID-6530 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Signed-off-by: Gunes Bayir <gunes.bayir@arm.com> Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com> Signed-off-by: Jakub Sujak <jakub.sujak@arm.com> Change-Id: I8104ce4d04a3138a1aeb0b84940e1f1c89e76069 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10914 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Jakub Sujak <jakub.sujak@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2024-01-23 | Make GpuWorkloadContext own all tensor info objects | Viet-Hoa Do
* The tensor info objects created by calling create_tensor_info are now solely owned by the context object. The user only receives pointers to those objects.
  - Internally, pointers to tensor info objects are used in various places. It's safer for dynamic fusion to manage these objects directly rather than relying on the users.
  - The validation test is updated to use the modified API.
* Make various changes in the dynamic fusion API to make it more friendly (e.g. making some of the objects moveable).
Partially resolves: COMPMID-6707 Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com> Change-Id: Ifee70e53c05f8e7b72bf9ef123701ff291c5ee80 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10990 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Jakub Sujak <jakub.sujak@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2024-01-11 | Fix test compilation error on GCC 13.2 | Jakub Sujak
Remove a std::move flagged by -Wpessimizing-move Resolves: COMPMID-6777 Change-Id: Ie082dc2eab0cb11e9a29f6f6fc98866306fd2cfa Signed-off-by: Jakub Sujak <jakub.sujak@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10957 Benchmark: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2024-01-04 | Implement dynamic quantization for GEMMLowp tests | SiCong Li
This patch calculates the output quantization info based on the inputs' quantization information. The previous approach was using the same quantization information for input, weights and output.
Remove the QSYMM8_PER_CHANNEL path from the fixture as there are no related tests.
Remove repeated shapes from the dataset now that the quantization info has been removed from the dataset.
Combine signed and unsigned SmallGEMMLowpFusedBatchedMatMulDataset into one as they become identical.
Resolves COMPMID-6481, COMPMID-6634 Change-Id: I9f5a20f4bb45c3e5adab388564135ae8a5c0a9ea Signed-off-by: SiCong Li <sicong.li@arm.com> Signed-off-by: Gunes Bayir <gunes.bayir@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10680 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Jakub Sujak <jakub.sujak@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2023-11-08 | Optimize CpuGemmConv2d start-up time | SiCong Li
When weight has no holes, we can replace CpuWeightsReshapeKernel with:
- Collapse by reinterpreting weight's 3 spatial dimensions
- Perform CpuTranspose
For more details see the documentation in src/cpu/operators/CpuGemmConv2d.cpp
This is one optimization since the CpuTranspose is better performing than CpuWeightsReshapeKernel.
A second optimization is to fuse this transpose with other weight transformations (e.g. pretranspose_B_array in CpuGemmAssemblyDispatch).
However, this second optimization depends on how the underlying gemm methods (the fall back path: CpuGemmMatrixMultiplyKernel or the assembly path: CpuGemmAssemblyDispatch) choose to fuse the transpose. Therefore, this patch moves the transpose down from CpuGemmConv2d to the individual gemm operators where the fusion decision needs to be made, by passing an extra "transpose_b" flag to CpuGemm.
New transpose_b flag in different scopes (they are all the same, but with different names because pretranspose_b has a different meaning in GemmAssemblyDispatch): GEMMInfo::pretranspose_B -> AsmGemmInfo::transpose_b
New auxiliary tensors holding the transposed b result:
- CpuGemm optimized path: CpuGemmAssemblyDispatch::PrePretransposedB
- CpuGemm fallback path: CpuGemm::PreTransposedRHS
Note that this patch does not yet have the second optimization (COMPMID-6595), but it prepares for it.
Relates to COMPMID-6595 Resolves COMPMID-6499 Change-Id: I999a2da9da4b2b15369a3cc06d7872c86e0190ea Signed-off-by: SiCong Li <sicong.li@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10526 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Anitha Raj <Anitha.Raj@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
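The collapse-then-transpose idea can be pictured with a small pure-Python sketch. It is not the library's implementation: the function name is hypothetical, and it assumes a dense weight buffer (no holes) in which the kw * kh * ifm values of each output feature map are contiguous. The orientation the real GEMM paths expect is not stated in the commit message.

```python
def collapse_and_transpose(weights_flat, kw, kh, ifm, ofm):
    """Illustrative only: view a dense 4D weight buffer as a 2D matrix,
    then transpose it.

    Assumed layout: `ofm` consecutive groups of `kw * kh * ifm` values.
    """
    k = kw * kh * ifm
    if len(weights_flat) != k * ofm:
        raise ValueError("buffer size does not match the weight shape")

    # Step 1: collapse. With no holes this is only a reinterpretation of
    # the same memory as `ofm` rows of `k` elements; no data is reordered.
    rows = [weights_flat[o * k:(o + 1) * k] for o in range(ofm)]

    # Step 2: transpose the 2D view into `k` rows of `ofm` elements.
    return [[rows[o][i] for o in range(ofm)] for i in range(k)]
```

For example, `collapse_and_transpose(list(range(12)), 1, 2, 3, 2)` yields six rows of two elements, the first being `[0, 6]`.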
2023-11-06 | Fix Elementwise Division Dynamic Shape tests | Anitha Raj
- Enable use_dynamic_shape ArithmeticDivisionDynamicShapeValidationFixture Signed-off-by: Anitha Raj <anitha.raj@arm.com> Change-Id: I42ddf5b604d728eda91fa45b239abf8caf2cda0f Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10586 Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2023-11-03 | Add Dynamic Quantization tests to Fully Connected Layer | Mohammed Suhail Munshi
This patch calculates the output quantization info based on the inputs' quantization information. The previous approach was using the same quantization information for input, weights and output. This implementation does not cover the cases where we have a fused activation function. Resolves: [COMPMID-6484] Signed-off-by: Mohammed Suhail Munshi <MohammedSuhail.Munshi@arm.com> Change-Id: Ib58143165191e82ae8547e661ac7c8d077bda200 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10539 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2023-10-31 | [GPU] Update Reverse layer to allow negative axis and reversed axis order | Adnan AlSinan
- Adds option to use negative axis and inverted axis. - Adds validation tests for the above. Resolves COMPMID-6459 Change-Id: I88afd845d078f92c82ec8529ce7241fccd4c417e Signed-off-by: Adnan AlSinan <adnan.alsinan@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10523 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2023-10-31 | Fix clang-tidy errors | Jakub Sujak
Resolve the following clang-tidy errors: * use of undeclared identifier 'ARM_COMPUTE_ASSERT' * no template named 'AbsoluteTolerance' * no template named 'RelativeTolerance' These errors are a result of missing include headers in test fixtures. Resolves: COMPMID-6604 Change-Id: I8058c5848bb52a44925b2f99c9e8edf84dc79acc Signed-off-by: Jakub Sujak <jakub.sujak@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10561 Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2023-10-31 | Use dynamic quantization in Convolution and Dilated Convolution tests | Gunes Bayir
This patch calculates the output quantization info based on the inputs' quantization information. The previous approach was using the same quantization information for input, weights and output. This implementation does not cover the cases where we have a fused activation function. Resolves: COMPMID-6482 Change-Id: I4a9d87cfef8ad18ef241d457d23f44c8519a1389 Signed-off-by: SiCong Li <sicong.li@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10541 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2023-10-31 | Extend CKW MatMul with nt_t | Adnan AlSinan
- Add the kernel variant: (nt_t) to GpuCKWMatMul. - Extend CKW MatMul validation test with nt_t. - Fixes a bug in CKW where z-dim = 1. Resolves: COMPMID-6435 Signed-off-by: Adnan AlSinan <adnan.alsinan@arm.com> Change-Id: I4c5e8791e55f21ffff3c11eca7802c51a4259977 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10525 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2023-10-30 | Use dynamic quantization in OpenCL™ Direct Convolution tests | Gunes Bayir
This patch calculates the output quantization info based on the inputs' quantization information. The previous approach was using the same quantization information for input, weights and output. This implementation does not cover the cases where we have a fused activation function. Note that no Neon™ tests are changed since there were not any quantized Neon Direct Convolution tests. Resolves: COMPMID-6485 Change-Id: Id32241320acae0b58552546d6d6f665cd5c63044 Signed-off-by: Gunes Bayir <gunes.bayir@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10470 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2023-10-17 | Fix memory Error in Reverse Fixture. | Adnan AlSinan
Resolves: COMPMID-6581 Signed-off-by: Adnan AlSinan <adnan.alsinan@arm.com> Change-Id: I0a634e064377e54b9190241c01fc75c212522ba7 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10481 Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2023-10-10 | Optimize NEStackLayer | Gunes Bayir
Optimize the stack operation in Cpu by leveraging block memcpy. Resolves: COMPMID-6498 Change-Id: I49d79d179f0375a73d654edd59fb33072112569b Signed-off-by: Gunes Bayir <gunes.bayir@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10451 Reviewed-by: SiCong Li <sicong.li@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2023-10-05 | Optimize CLTranspose operator | Jakub Sujak
* Transpose higher dimensional tensors (>2D) by collapsing higher dimensions into the third dimension thus avoiding multiple dispatches of the CL kernel * Maximize tile size without register spilling Resolves: COMPMID-6448 Change-Id: Iac094b8c428bdf319d9c28a8334cb55d58e2d14b Signed-off-by: Jakub Sujak <jakub.sujak@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10443 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
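The dimension collapsing mentioned in this commit can be sketched as a shape computation. This is a hypothetical helper, not the CLTranspose code; it assumes shapes are listed with the fastest-varying dimension first and that every dimension above the second acts as an independent batch of 2D planes.

```python
from math import prod


def collapse_for_transpose(shape):
    """Return (collapsed_src_shape, collapsed_dst_shape) for a transpose
    that swaps the first two dimensions.

    All higher dimensions are folded into one third dimension so that a
    single 3D dispatch can cover the whole tensor.
    """
    if len(shape) < 2:
        raise ValueError("transpose needs at least two dimensions")
    width, height, *higher = shape
    batches = prod(higher)  # the product of an empty list is 1
    return (width, height, batches), (height, width, batches)
```

A 4D shape such as `(4, 3, 2, 5)` collapses to `(4, 3, 10)`, and the transposed result has shape `(3, 4, 10)`.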
2023-10-03 | Fix nightly NEON Reverse reference failure | Adnan AlSinan
- Fix the reference axis vector to be the right size. - Update typos in the error messages. Resolves COMPMID-6574 Signed-off-by: Adnan AlSinan <adnan.alsinan@arm.com> Change-Id: I9572365b8173b92d0fffd557e4db261b2969109c Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10423 Tested-by: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com>
2023-10-02 | Fix Nightly failing validation tests in NEON Reverse | Adnan AlSinan
Resolves COMPMID-6574 Signed-off-by: Adnan AlSinan <adnan.alsinan@arm.com> Change-Id: I6b23e2a2f7b2839f038dad538dfc5ebda62891a6 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10412 Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Anitha Raj <Anitha.Raj@arm.com>
2023-09-27 | Implement tflite compliant reverse for CPU | Adnan AlSinan
- Add support for negative axis values. - Add option to use opposite ACL convention for dimension addressing. - Add validation tests for the mentioned additions. Resolves COMPMID-6497 Change-Id: I9174b201c3adc070766cc6cffcbe4ec1fe5ec1c3 Signed-off-by: Adnan AlSinan <adnan.alsinan@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10335 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
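One way to picture the two additions is as an axis-normalization step that runs before the reverse itself. The helper below is a sketch, not the library's code: its name and the `use_inverted_order` parameter are hypothetical, and it assumes the "opposite convention" maps index i to rank - 1 - i.

```python
def normalize_axis(axis, rank, use_inverted_order=False):
    """Map a possibly negative axis to a non-negative dimension index.

    Negative values count from the end (-1 is the last axis). When
    `use_inverted_order` is set, the index is flipped so that axis 0 in
    one convention addresses dimension rank - 1 in the other.
    """
    if not -rank <= axis < rank:
        raise ValueError(f"axis {axis} is out of range for rank {rank}")
    if axis < 0:
        axis += rank
    if use_inverted_order:
        axis = rank - 1 - axis
    return axis
```

With a rank-4 tensor, `normalize_axis(-1, 4)` gives 3, while `normalize_axis(-1, 4, use_inverted_order=True)` gives 0.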
2023-09-20 | Fix the validation issue in AddMulAdd fused kernel | Gunes Bayir
Resolves: COMPMID-6558 Change-Id: I015d504aaa9b8a1a232b01e49ab373d415ea1de9 Signed-off-by: Gunes Bayir <gunes.bayir@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10340 Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com> Reviewed-by: TeresaARM <teresa.charlinreyes@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2023-09-18 | Separate the output quantization calculation logic from matmul | Gunes Bayir
This patch generalizes the suggested output quantization calculation to any operation that employs a dot product between two vectors, i.e. c = sum_k(a_k * b_k) + d. It also considers and suggests min/max boundaries for random S32 bias generation, depending on the accumulation result. MatMulKernelFixture is modified to use this interface. Signed-off-by: Gunes Bayir <gunes.bayir@arm.com> Change-Id: Ibb528261bb0310015967e11bd7ccd9ed9cff8479 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10312 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
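To make the idea of deriving output quantization from the inputs concrete, here is a simplified sketch for c = sum_k(a_k * b_k) + d. It is not the test helper from this patch: it uses worst-case bounds of the accumulation, which is an assumption (the actual helper may estimate the range differently), and all names are hypothetical. Quantization info is a `(scale, offset)` pair for an 8-bit asymmetric type.

```python
def dequant_range(scale, offset, qmin, qmax):
    """Real-valued interval representable by a (scale, offset) pair."""
    return scale * (qmin - offset), scale * (qmax - offset)


def suggest_dst_qinfo(a_q, b_q, k, bias_range=(0.0, 0.0), qmin=0, qmax=255):
    """Pick an output (scale, offset) that covers the worst-case range of
    a length-k dot product plus a bias drawn from bias_range."""
    a_lo, a_hi = dequant_range(*a_q, qmin, qmax)
    b_lo, b_hi = dequant_range(*b_q, qmin, qmax)
    products = [a_lo * b_lo, a_lo * b_hi, a_hi * b_lo, a_hi * b_hi]
    c_lo = k * min(products) + bias_range[0]
    c_hi = k * max(products) + bias_range[1]
    # Keep zero inside the range so that it stays exactly representable.
    c_lo, c_hi = min(c_lo, 0.0), max(c_hi, 0.0)
    if c_hi == c_lo:
        return 1.0, qmin  # degenerate case: the output is always zero
    scale = (c_hi - c_lo) / (qmax - qmin)
    offset = round(qmin - c_lo / scale)
    return scale, max(qmin, min(qmax, offset))
```

For instance, with `a_q = (0.5, 0)`, `b_q = (0.25, 128)` and `k = 4`, the worst-case accumulation spans [-16320, 16192.5], which maps onto the 8-bit range with scale 127.5 and offset 128. A worst-case bound grows linearly with k, so it tends to waste resolution for long accumulations; that is one reason a real helper might prefer a tighter estimate.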
2023-09-04 | Extend Neon ReshapeLayer validation tests | Anitha Raj
- Add a test case with src and dst having same row size - Remove inline from has_holes() util function Related to COMPMID-6504 Change-Id: Iead1f17692dc57b66c5d9f01eed30169efaee0a5 Signed-off-by: Anitha Raj <anitha.raj@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10190 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2023-09-04 | Remove legacy PostOps code | Jakub Sujak
PostOps was the experimental interface for Dynamic Fusion. It is now replaced by the new Dynamic Fusion interface with code generation using the Compute Kernel Writer. Resolves: COMPMID-6190 Change-Id: I813b48facef2fd6f3aee332588886b4f9b3d33d8 Signed-off-by: Jakub Sujak <jakub.sujak@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10219 Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2023-08-31 | Port ClTemplatePool2d to ckw | Adnan AlSinan
- Fixes a bug when using FP16 constant in some cases.
- Adds op_write_raw_code to handle some special cases.
- Ports MxN pooling 2d layer into ckw.
- Adds unary function 'negate' to ckw.
- Updates pool2d validation tests to include store op.
Resolves COMPMID-6263 Signed-off-by: Adnan AlSinan <adnan.alsinan@arm.com> Change-Id: If8c683761fead79bd519aef28cc65de78d3ec629 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10172 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2023-08-22 | Optimize CpuReshapeKernel | Anitha Raj
Resolves COMPMID-5279 Change-Id: Id9b007eed62c200702bbfcc83b94dab7b5de1714 Signed-off-by: Anitha Raj <anitha.raj@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9962 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2023-08-17 | Remove functionality to add padding in Y dimension in validation tests | Anitha Raj
Signed-off-by: Anitha Raj <anitha.raj@arm.com> Change-Id: I5b9e04f9057777bb080c40fa1f55dfee4bd866dc Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10138 Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2023-08-08 | Fix failure in MeanReduce layer | Viet-Hoa Do
Resolves: COMPMID-6423 Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com> Change-Id: I9cec051a7d1a2956218f8a6d8263bd5424f6d389 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10072 Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2023-08-01 | Improved testing for ArgMinMax | Pablo Marquez Tello
* ArgMinMax output was fixed to S32; this patch makes the changes required to allow other output types like U64/S64
* Made changes to the ArgMinMax fixture and tests to allow specifying output data type.
* Made changes to the reference reduction_operation to allow specifying the output type
* Added test cases to output S64 for the CL backend.
* Added missing test cases in the neon backend.
* Partially resolves MLCE-1089
Change-Id: I6f1cbc7093669d12c2a3aff6974cf19d83b2ecda Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10003 Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2023-07-14 | Fix dynamic fusion compilation error | Viet-Hoa Do
Resolves: COMPMID-6393 Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com> Change-Id: Idc0880a964f2827bf5bf267b72fe7db9ce116f15 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9919 Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2023-07-13 | Added S64/U64 support for the input in CLCast | Pablo Marquez Tello
* Partially resolves MLCE-1089 Change-Id: Ie3d2fc2f755ae99cdb17b57cc90bb3f99a1843e0 Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9909 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2023-07-12 | Make test fixture setup methods not be templated | Matthew Bentham
This simplifies code slightly as nothing needs those functions to be function templates. Signed-off-by: Matthew Bentham <Matthew.Bentham@arm.com> Change-Id: If48694bf5677bb83426aeba952eb87174a42dff0 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/c/VisualCompute/ComputeLibrary/+/536135 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Jakub Sujak <jakub.sujak@arm.com> Comments-Addressed: bsgcomp <bsgcomp@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9907 Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2023-07-11 | Add Bias to MatMul Kernels and add support for use in Fully Connected Layer | Mohammed Suhail Munshi
Resolves: [COMPMID-6316] Signed-off-by: Mohammed Suhail Munshi <MohammedSuhail.Munshi@arm.com> Change-Id: I08e6bac9e6b46b76978da0dc6a48ccfe3dde5086 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9833 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2023-07-07 | Enable transpose convolution with non-square kernels | Viet-Hoa Do
Resolves: COMPMID-6319 Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com> Change-Id: I49a17ff973efc88b7ce0334c47ecf076c03f4cc3 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9829 Reviewed-by: Jakub Sujak <jakub.sujak@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2023-06-26 | Use MatMul in fully connected layer with dynamic weights when supported | Mohammed Suhail Munshi
- Use MatMul kernels in FC layer when using dynamic weights without broadcasting or bias. - Fix minor typo in IClMatMulNativeKernelConfig.h Partially Resolves : [COMPMID-6193] Signed-off-by: Mohammed Suhail Munshi <MohammedSuhail.Munshi@arm.com> Change-Id: Id494062b5b4f4e75ff9714c202dde941955afa52 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9797 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2023-06-19 | Implement FP32/FP16 MatMul NT/NT kernel using the MMUL extension | SiCong Li
Resolves COMPMID-6194 Signed-off-by: SiCong Li <sicong.li@arm.com> Change-Id: Ie45e2aa9533948b2e5235563cef1d3834494eccf Signed-off-by: SiCong Li <sicong.li@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9739 Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2023-06-16 | Add Fused Activation to OpenCL MatMul | Mohammed Suhail Munshi
- Added fused activation to MatMul function interface - Added fused activation to CL backend - Includes tests for supported Activation Functions in MatMul Resolves: [COMPMID-6192] Signed-off-by: Mohammed Suhail Munshi <MohammedSuhail.Munshi@arm.com> Change-Id: Ie103212b600b60699eaf6a6394d609e6e1f5aba6 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/c/VisualCompute/ComputeLibrary/+/522465 Comments-Addressed: bsgcomp <bsgcomp@arm.com> Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com> Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9714 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Jakub Sujak <jakub.sujak@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2023-06-12 | Add multi-sketch support for dynamic fusion | Viet-Hoa Do
* Tensors are owned by workload context instead of workload sketch so that they can be used by multiple sketches. * Add an integration test for multi-sketch case. Resolves: COMPMID-6148 Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com> Change-Id: I37d0de5ac103fb2a85020aa1c26e49eb304f47b7 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9706 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2023-05-05 | Connect CLMatMul function to quantized kernels and resolve NE BatchMatMul int_8 failures | Jakub Sujak
* Adapt the CLMatMul function and ClMatMul operator to use quantized kernels.
* Add function-level tests.
Resolves: COMPMID-5929 and COMPMID-5811 Change-Id: I5348cdcf07b8074c138e04dfef0a73399377accd Signed-off-by: Jakub Sujak <jakub.sujak@arm.com> Signed-off-by: Omar Al Khatib <omar.alkhatib@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9575 Reviewed-by: Mohmun02 <MohammedSuhail.Munshi@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2023-05-03 | Support multi-dimensional indices in the CL Gather Layer up to four-dimensional output tensors | Omar Al Khatib
Resolves [COMPMID-5775] Signed-off-by: Omar Al Khatib <omar.alkhatib@arm.com> Change-Id: I6f6c12ac08f0b0ad070ca5d715c531c2c3762c30 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9498 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2023-05-02 | Fix fully connected and matmul mismatches | Viet-Hoa Do
* There is an issue with quantized fully connected and matmul when the lower bound of bounded ReLU is negative. * Use int32_t for the calculation of min/max quantized value rather than PixelValue to avoid this issue. Partially resolves: COMPMID-5996 Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com> Change-Id: I7b22e9d56a2441fc6a4c5c4e627f57d6e00d6ff1 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9502 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Jakub Sujak <jakub.sujak@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
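The reason for doing the bound calculation in a wide signed integer can be illustrated as follows. The commit message does not say how the narrower type misbehaved, so the 8-bit wrap-around shown in the usage note is only one possible failure mode and is an assumption; the function names are hypothetical.

```python
def quantize_unclamped(value, scale, offset):
    """Quantize to a plain (wide) integer without clamping."""
    return round(value / scale) + offset


def quantized_relu_bounds(lower, upper, scale, offset, qmin, qmax):
    """Quantize the bounded-ReLU limits in wide integer arithmetic first,
    then clamp both to the representable range of the quantized type."""
    lo = quantize_unclamped(lower, scale, offset)
    hi = quantize_unclamped(upper, scale, offset)
    return max(qmin, min(qmax, lo)), max(qmin, min(qmax, hi))
```

With `lower = -1.0`, `upper = 6.0`, `scale = 0.25` and `offset = 2` on an unsigned 8-bit type, the unclamped lower bound is -2 and the clamped bounds are `(0, 26)`. If -2 were instead stored directly in an unsigned 8-bit value it would wrap to 254, which lies above the upper bound.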
2023-04-28 | Reorder added | David Svantesson
Adds Reorder kernel exposing blocking reorders from arm_gemm Resolves ONCPUML-1232 Change-Id: I42bf4166311fe1771565134d3ed7039fc8e30230 Signed-off-by: David Svantesson <david.svantesson@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9500 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2023-04-28 | Fix the gather layer indices check | Viet-Hoa Do
* If the index is out-of-bound, both CPU and GPU implementations of the gather layer will output 0. Resolves: COMPMID-5964 Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com> Change-Id: Ib029b3acfb31452f2097c8c75448fb2697cfa332 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9487 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
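The behaviour stated above can be expressed as a tiny one-dimensional reference. This is a sketch, not the library's reference implementation; treating negative indices as out of bounds is an assumption that the commit message does not confirm.

```python
def gather_1d(values, indices):
    """Gather along a single axis; any index outside [0, len(values))
    produces 0 instead of reading out of bounds."""
    n = len(values)
    return [values[i] if 0 <= i < n else 0 for i in indices]
```

For example, `gather_1d([10, 20, 30], [2, 0, 5, -1])` returns `[30, 10, 0, 0]`.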
2023-04-19 | Add quantized support for CPU MatMul | Viet-Hoa Do
Resolves: COMPMID-5899 Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com> Change-Id: I89d96e292c3492ba9b1900a3e5683f9dcd11dfc6 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9440 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2023-04-17 | Add quantized CL MatMul kernels for Lhs NT/T, Rhs NT | Gunes Bayir
Implement OpenCL kernels for batched Matrix Multiplication for the quantized data types QASYMM8 and QASYMM8_SIGNED. Quantized MatMul is supported with the following MatMul attributes: * adj_x = false, adj_y = false * adj_x = true, adj_y = false We consider native format kernels only. In other words, no reshaping of the operand matrices is done. Resolves: COMPMID-5921, COMPMID-5922 Change-Id: I99e0f68054a2bd635c60ec2641acc2e7ff398473 Signed-off-by: Omar Al Khatib <omar.alkhatib@arm.com> Signed-off-by: Gunes Bayir <gunes.bayir@arm.com> Signed-off-by: Jakub Sujak <jakub.sujak@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9435 Reviewed-by: SiCong Li <sicong.li@arm.com> Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
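The two supported attribute combinations can be read against a small reference. The sketch below assumes that adj_x = true means the Lhs matrix is transposed before the multiplication (and adj_y likewise for the Rhs), which matches the "Lhs NT/T, Rhs NT" wording of the title but is not spelled out in the message. It works on plain integer lists, ignores quantization and batching, and is not the kernel code.

```python
def transpose(matrix):
    """Transpose a 2D list of lists."""
    return [list(row) for row in zip(*matrix)]


def matmul(lhs, rhs, adj_x=False, adj_y=False):
    """Reference 2D matrix multiplication with optional operand transposes."""
    if adj_x:
        lhs = transpose(lhs)
    if adj_y:
        rhs = transpose(rhs)
    if len(lhs[0]) != len(rhs):
        raise ValueError("inner dimensions do not match")
    return [[sum(l * r for l, r in zip(row, col)) for col in zip(*rhs)]
            for row in lhs]
```

With `lhs = [[1, 2], [3, 4], [5, 6]]` stored as 3x2 and `adj_x=True`, the multiplication uses the 2x3 transpose, so a 3x2 `rhs` is a valid right-hand operand.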
2023-04-14 | Align naming convention of ClMatMul | Jakub Sujak
Ensure naming of MatMul on GPU conforms to the naming convention <backend><operator><config> i.e. ClMatMul operator with the backend ClMatMulNativeKernel. Resolves: COMPMID-6015 Change-Id: I021d235b023ad17fe97bd6913e6a50d0ba4b194e Signed-off-by: Jakub Sujak <jakub.sujak@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9443 Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2023-04-13 | Implement MatMul Function and Operator with Floating Point support for CPU | Mohammed Suhail Munshi
- Implements MatMul function and operator for floating point datatype FP16/FP32 - Includes support for transposing dynamic tensors prior to matrix multiplication. - Adds tests for 2D/3D/4D+ tensors in MatMul with F32/F16 datatype (with all combinations of transposed/not-transposed tensors) - Updates fixture to allow for testing fused activation in MatMul - Adds tests for matmul with and without fused activation Resolved: [COMPMID-5898] Signed-off-by: Mohammed Suhail Munshi <MohammedSuhail.Munshi@arm.com> Change-Id: Iefa84b26dd723c9a51e6c3f91023152c6c31ace2 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9411 Reviewed-by: SiCong Li <sicong.li@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2023-04-04 | Support dynamic weights for Fully Connected layers on GPU | Jakub Sujak
The fully connected function and operator running on GPU have been adapted to support dynamic weights. Dynamic weights require the reshape and data layout conversion of weight tensors at runtime in the prepare stage of the operator. The implementation for GPU is identical to the CPU implementation. This patch also deprecates the `are_weights_reshaped` option in Fully Connected. Resolves: COMPMID-5870 Change-Id: I28f967695879d82cc91a928d95308a4e0e52a597 Signed-off-by: Jakub Sujak <jakub.sujak@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9403 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2023-04-03 | Implement MatMul Function | Ramy Elgammal
Resolves: COMPMID-5949 Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com> Change-Id: Idd8cfe6ea94a14f0b23178f6781251b5f0955563 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9390 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2023-03-29 | Add quantized support for unary elementwise in CPU | Viet-Hoa Do
* Add quantized unary elementwise in CPU using LUT. * Widen the input data range of the test suite. - Fix CPU exponential function overflow/underflow range. - Fix saturation issue of CL round operator. Resolves: COMPMID-5763 Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com> Change-Id: I41445de2b4a33ec6b01e0ab701516c240c852d0b Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9367 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Jakub Sujak <jakub.sujak@arm.com> Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
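A lookup table works for 8-bit quantized unary operations because there are only 256 possible inputs, so the function can be evaluated once per input value. The sketch below shows the general technique in plain Python; it is not the CPU kernel, and the function names and the `(scale, offset)` parameters are illustrative assumptions.

```python
import math


def build_lut(fn, in_scale, in_offset, out_scale, out_offset, qmin=0, qmax=255):
    """Precompute fn for every quantized input value.

    Each entry dequantizes the input, applies fn, requantizes the result
    and saturates it to [qmin, qmax].
    """
    lut = []
    for q in range(qmin, qmax + 1):
        real = in_scale * (q - in_offset)
        out = round(fn(real) / out_scale) + out_offset
        lut.append(max(qmin, min(qmax, out)))
    return lut


def apply_lut(lut, data, qmin=0):
    """Apply the table to a sequence of quantized values."""
    return [lut[q - qmin] for q in data]
```

Building a table for `math.exp` with `in_scale = 0.1` shows the saturation step at work: large inputs map to values far above 255 and are clamped to the top of the range.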