path: root/tests/validation/dynamic_fusion/gpu/Integration.cpp
Age  Commit message  (Author)
2023-07-28  Port ElementwiseBinary to CKW part 2  (SiCong Li)
* Add fp16 support
* Implement broadcasting to elementwise binary
* Implement kernel name and kernel config id
* Always use explicit cast in ckw unary, binary and ternary elementwise functions. This is to address the accidental use of double literals, with other benefits.
* Refactor TypeConverter for smaller includes

Resolves COMPMID-6260
Change-Id: I26b726746f8c0dd7b5942ad379d56f4d7642d15f
Signed-off-by: SiCong Li <sicong.li@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9999
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
2023-06-12  Add multi-sketch support for dynamic fusion  (Viet-Hoa Do)
* Tensors are owned by workload context instead of workload sketch so that they can be used by multiple sketches.
* Add an integration test for multi-sketch case.

Resolves: COMPMID-6148
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Change-Id: I37d0de5ac103fb2a85020aa1c26e49eb304f47b7
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9706
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
2023-01-25  Implement dynamic fusion softmax operator  (Ramy Elgammal)
- Return aux tensorInfo by get_aux_tensors() at runtime to init the aux tensor with the right size.
- Keep softmax unfusable for this commit
- Hence, added Tensor3D to the template writer arguments declaration, for the sake of keeping the dynamic fusion softmax components' kernels matching their cl counterparts.

Resolves: COMPMID-5523
Change-Id: I667f39545db925f667036ef448302c79a0330373
Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com>
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/c/VisualCompute/ComputeLibrary/+/483924
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: bsgcomp <bsgcomp@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8986
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
2023-01-24  Change dynamic fusion API to return destination tensor info  (Gunes Bayir)
The new dynamic fusion API is introduced in the following patch: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8906

For each operator (except Conv2D, which is migrated in the above patch), we
- remove destination tensor from is_supported, validate and create calls
- make create_op return ITensorInfo* to the intermediate destination object

Affected operators:
- DepthwiseConv2D
- Cast
- Elementwise Ops
- Clamp
- Reshape
- Resize

Resolves: COMPMID-5777
Change-Id: Ib60ec8a5f081752808455d7a7d790f2ed0627059
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8991
Reviewed-by: Ramy Elgammal <ramy.elgammal@arm.com>
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Dynamic-Fusion: Ramy Elgammal <ramy.elgammal@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
2023-01-20  Add missing direct conv2d tests to dynamic fusion  (SiCong Li)
* Add direct conv2d tests as a separate fixture so that we can enable future direct conv2d specific tests
* Move Conv2dAttributes to its own file

Partially resolves COMPMID-5736
Change-Id: I530649488faf3bbed1a4fc7d16a74063bfdf33db
Signed-off-by: SiCong Li <sicong.li@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8928
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
2023-01-06  Handle Intermediate tensors within the sketch  (Gunes Bayir)
- Intermediate tensor info objects are not created by the user anymore. They're returned from create_op and reused. This will prevent allocation of the intermediate tensors in case of possible interface misuse.
- Sketch object handles intermediate tensor info pointers inside its implementation class via a unique pointer vector
- Conv2d operator is migrated into the new interface

Resolves: COMPMID-5776
Change-Id: I9422e3681eef4f2d2922f6d0a5d7786380837c6d
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8906
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
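
The ownership pattern described in this commit message (the sketch keeps intermediate tensor info objects in a vector of unique pointers, and create_op hands back a pointer that the caller reuses) might look roughly like the sketch below. All names here (InfoStub, SketchStub, UnaryOpStub, new_intermediate) are hypothetical stand-ins for illustration, not the library's classes or signatures.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a tensor info object (not the library's ITensorInfo).
struct InfoStub
{
    std::vector<int> shape;
};

// Hypothetical sketch class that owns every intermediate info it hands out.
class SketchStub
{
public:
    // Creates an info object, keeps ownership, and returns a non-owning pointer.
    InfoStub *new_intermediate(std::vector<int> shape)
    {
        _intermediates.push_back(std::make_unique<InfoStub>());
        _intermediates.back()->shape = std::move(shape);
        return _intermediates.back().get();
    }

    std::size_t num_intermediates() const
    {
        return _intermediates.size();
    }

private:
    // Each info lives on the heap, so returned pointers stay valid when the vector grows.
    std::vector<std::unique_ptr<InfoStub>> _intermediates;
};

// Hypothetical element-wise operator: the destination info is derived from the
// source and owned by the sketch, so the user never constructs it.
struct UnaryOpStub
{
    static InfoStub *create_op(SketchStub &sketch, const InfoStub *src)
    {
        assert(src != nullptr);
        return sketch.new_intermediate(src->shape);
    }
};
```

Because the user only ever holds pointers returned by create_op, there is no user-side object that could be allocated by mistake, which is the misuse the commit message refers to.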
2022-12-30  Add temporary tile support for dynamic fusion  (Viet-Hoa Do)
* Multiple intermediate tensors can share the same tile.
  - A simple operator can reuse the input tensor for the result if the input tensor has the same shape, data type and it is only consumed by that operator.
  - The special case is a simple operator and an output operator consume the same tensor. However as the output operator doesn't change the content of the input tensor, it doesn't count as "consuming" the input tensor.
* These temporary tiles are declared automatically by the template writer. Individual operator doesn't need to generate output tile declaration.
* Cast is now simple operator.

Resolves: COMPMID-5778
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Change-Id: I232647ac976645e2d266a62e055b9eb48c356a8e
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8877
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
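
The tile-sharing condition stated in this commit message can be written as a small predicate. This is only an illustration of the rule as worded above; TileDesc, ConsumerKind and can_share_tile are hypothetical names, and the real check in the library may consider more than shape, data type and consumers.

```cpp
#include <string>
#include <vector>

// Hypothetical description of a tensor as seen by the tile-sharing check.
struct TileDesc
{
    std::vector<int> shape;
    std::string      data_type;
};

// Hypothetical classification of the operators that read a tensor.
enum class ConsumerKind
{
    Operator, // any operator that may go on to overwrite the tile
    Output    // output operator: reads the tensor but does not change it
};

// Returns true if a simple operator reading `src` may write its result into the
// same tile. `src_consumers` lists every reader of `src`, including the simple
// operator under consideration.
bool can_share_tile(const TileDesc &src, const TileDesc &dst, const std::vector<ConsumerKind> &src_consumers)
{
    if (src.shape != dst.shape || src.data_type != dst.data_type)
    {
        return false;
    }
    int consuming_readers = 0;
    for (ConsumerKind kind : src_consumers)
    {
        // Output operators do not count as "consuming" the tensor.
        if (kind != ConsumerKind::Output)
        {
            ++consuming_readers;
        }
    }
    // Only the simple operator itself may consume the source.
    return consuming_readers == 1;
}
```

For example, a source read by one simple operator and one output operator still qualifies, while a source read by two ordinary operators does not.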
2022-12-23  Add multiple output support for dynamic fusion  (Viet-Hoa Do)
* The dependency graph now can schedule any acyclic graph into a sequential list of operators. This is needed as the output operators now form branches in the graph.
* Fix the definition of input, output and intermediate tensors in GpuKernelComponentGroup to support non-linear but sequential list of operators.
* Add constraint on GpuOperatorGroup to enforce strictly linear fusion style, but allow output operator as the only form of branch.

Resolves: COMPMID-5771
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Change-Id: I68de3a31a2456145081f0a397e4e61dd66327682
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8823
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
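
Scheduling an acyclic operator graph into a sequential list, as the first point of this commit message describes, is a topological sort. The commit does not say which algorithm the dependency graph uses; the sketch below uses Kahn's algorithm as one plausible choice, and schedule_ops with its integer operator ids is a hypothetical interface.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical sketch: operators are numbered 0..num_ops-1 and each edge
// (src, dst) means operator dst consumes a tensor produced by operator src.
// Returns the operators in an order where producers precede consumers, or an
// empty list if the graph contains a cycle.
std::vector<int> schedule_ops(int num_ops, const std::vector<std::pair<int, int>> &edges)
{
    std::vector<std::vector<int>> consumers(num_ops);
    std::vector<int>              pending_inputs(num_ops, 0);
    for (const auto &edge : edges)
    {
        assert(edge.first >= 0 && edge.first < num_ops);
        assert(edge.second >= 0 && edge.second < num_ops);
        consumers[edge.first].push_back(edge.second);
        ++pending_inputs[edge.second];
    }

    // Operators with no unscheduled producers are ready to run.
    std::queue<int> ready;
    for (int op = 0; op < num_ops; ++op)
    {
        if (pending_inputs[op] == 0)
        {
            ready.push(op);
        }
    }

    std::vector<int> order;
    while (!ready.empty())
    {
        const int op = ready.front();
        ready.pop();
        order.push_back(op);
        for (int next : consumers[op])
        {
            if (--pending_inputs[next] == 0)
            {
                ready.push(next);
            }
        }
    }

    // If some operator was never released, the graph had a cycle.
    if (order.size() != static_cast<std::size_t>(num_ops))
    {
        order.clear();
    }
    return order;
}
```

A branch such as operator 1 feeding both operator 2 and an output operator 3 is handled naturally: both become ready once operator 1 has been scheduled.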
2022-12-16  Add output operator for dynamic fusion  (Viet-Hoa Do)
* The output of the fused operator must be explicitly specified using GpuOutput operator.
* Any temporary tensors used to connect the output of an operator to the input of another operator will be marked as no-alloc and won't be allocated as a tensor in the memory.

Resolves: COMPMID-5771
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Change-Id: I5ae8e800f8f737db23a055a92b01c4f1d78c3bb8
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8794
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
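
One way to picture the no-alloc rule in this commit message: a tensor that is written by one operator and read by another only connects operators and needs no backing memory, while a tensor written by the final output operator is never read inside the workload and stays allocatable. The sketch below assumes integer tensor ids and a hypothetical OpStub record; it is not how the library represents operators.

```cpp
#include <set>
#include <vector>

// Hypothetical operator record: ids of the tensors it reads and the one it writes.
struct OpStub
{
    std::vector<int> srcs;
    int              dst;
};

// Returns the ids of tensors that connect the output of one operator to the
// input of another. Under the rule above these would be marked no-alloc.
std::set<int> find_no_alloc_tensors(const std::vector<OpStub> &ops)
{
    std::set<int> produced;
    std::set<int> consumed;
    for (const OpStub &op : ops)
    {
        produced.insert(op.dst);
        consumed.insert(op.srcs.begin(), op.srcs.end());
    }

    std::set<int> no_alloc;
    for (int tensor : produced)
    {
        // Produced and consumed inside the workload: a pure connection.
        if (consumed.count(tensor) != 0)
        {
            no_alloc.insert(tensor);
        }
    }
    return no_alloc;
}
```

With three operators wired as (0, 1) -> 2, (2, 3) -> 4 and an output operator 4 -> 5, tensors 2 and 4 are connections, whereas 0, 1, 3 and 5 are user-visible tensors.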
2022-11-29  Adding GpuAdd to dynamic fusion operators  (Ramy Elgammal)
- Provide support for Add operator
- Auto initialize the destination tensor before testing fusion in conv2d and elementwise binary ops.

Resolves: COMPMID-5518
Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com>
Change-Id: Ibd815020f02b57f88eea7c2921bdcf98605d99c5
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8617
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
2022-11-22  Remove dynamic fusion prototype with tests and examples  (SiCong Li)
Public headers of the new experimental dynamic fusion can be found in arm_compute/dynamic_fusion/
New examples on how to use the interface can be found in tests/validation/dynamic_fusion/gpu/Integration.cpp

Resolves COMPMID-5683
Change-Id: I7ccb902a227fb487562df15fc3c30118d1d95bbd
Signed-off-by: SiCong Li <sicong.li@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8671
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
2022-11-04  Fix compiler warnings in dynamic fusion  (SiCong Li)
Resolves: COMPMID-5686
Change-Id: I608c359583c44f2f04f29faddd1c6b38a381de60
Signed-off-by: SiCong Li <sicong.li@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8562
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
2022-11-01  Rewrite dynamic fusion  (SiCong Li)
The new version introduces the following major changes:
* Change public interface to simplify and standardize the user experience
  - Use the term "Workload" uniformly
  - Simplify operator interface to be a set of static methods: validate_op(), create_op()
* Separate the kernel writing into its own component (template_writer). This is to allow the co-development of GpuKernelWriter, and to allow easy replacement once GpuKernelWriter is mature.
* Optimize the core fusion algorithm used by the component graph. The details can be found in GpuKernelComponentGraph::fuse()
* Use Gpu instead of Cl prefixes for most of the Workload interfaces (except for runtime and kernel components, which have to be language specific). This allows the potential extension to other Gpu languages in the future.
* Refactor runtime memory interface so that auxiliary tensor handling is separate from the user tensor passing. This is because the former is less stable and may require extension in the future.
* Hide source code object from the user as it is not required at the moment
* Deprecate the old prototype entirely by disabling it in SCons build

Resolves COMPMID-5510, COMPMID-5512, COMPMID-5513
Change-Id: If69d2362856f2de4503546b7b6cf48a525cf3079
Signed-off-by: SiCong Li <sicong.li@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8406
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>