Age | Commit message (Collapse) | Author |
|
Makes a small difference to compile times and opens up other opportunities
to simplify code.
Change-Id: I232876910bbe4fa9719f4a0ce4a54c090faeb5ef
Signed-off-by: Matthew Bentham <Matthew.Bentham@arm.com>
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/c/VisualCompute/ComputeLibrary/+/532429
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9856
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Partially-Resolves: COMPMID-5518
Change-Id: I8358784815bcac461d50e384fa7bc96f476d3983
Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com>
Signed-off-by: Jakub Sujak <jakub.sujak@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9045
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Dynamic-Fusion: SiCong Li <sicong.li@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Note: we use a separate test fixture for Multiplication op instead of reusing ElementwiseBinaryFixture to avoid exposing the internal enum ElementwiseOp to the public utils/TypePrinters.h as required by the data test case macros to print the test data. We also do not consider modifying the enum ArithmeticOp in the standard interface to include MUL without an implementation. Future work should consider refactoring this test fixture into the ElementwiseBinaryFixture to reduce the total number of fixtures/code duplication.
Resolves: COMPMID-5779
Change-Id: I84207658ce0407095b028fca0ab7bfa2950255ec
Signed-off-by: Jakub Sujak <jakub.sujak@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9013
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
* Binary elementwise operator now can have broadcasting in either
X dimension, Y+Z dimension, or both, in either LHS or RHS
operand.
* Fix bug in CL code to support batching.
Resolves: COMPMID-5704
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Change-Id: I51b04986d30861f255ca9f754adffa0e6c85a26b
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8898
Reviewed-by: SiCong Li <sicong.li@arm.com>
Reviewed-by: Ramy Elgammal <ramy.elgammal@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Dynamic-Fusion: Ramy Elgammal <ramy.elgammal@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
* Multiple intermediate tensors can share the same tile.
- A simple operator can reuse the input tensor for the result
if the input tensor has the same shape, data type and it is
only consumed by that operator.
- The special case is a simple operator and an output operator
consume the same tensor. However as the output operator
doesn't change the content of the input tensor, it doesn't
count as "consuming" the input tensor.
* These temporary tiles are declared automatically by the template
writer. Individual operator doesn't need to generate output tile
declaration.
* Cast is now simple operator.
Resolves: COMPMID-5778
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Change-Id: I232647ac976645e2d266a62e055b9eb48c356a8e
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8877
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
* The dependency graph now can schedule any acyclic graph into
a sequential list of operators. This is needed as the output
operators now form branches in the graph.
* Fix the definition of input, output and intermediate tensors
in GpuKernelComponentGroup to support non-linear but sequential
list of operators.
* Add constraint on GpuOperatorGroup to enforce strictly linear
fusion style, but allow output operator as the only form of
branch.
Resolves: COMPMID-5771
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Change-Id: I68de3a31a2456145081f0a397e4e61dd66327682
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8823
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
* The output of the fused operator must be explicitly specified
using GpuOutput operator.
* Any temporary tensors used to connect the output of an operator
to the input of another operator will be marked as no-alloc
and won't be allocated as a tensor in the memory.
Resolves: COMPMID-5771
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Change-Id: I5ae8e800f8f737db23a055a92b01c4f1d78c3bb8
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8794
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
- Provide support for Add operator
- Auto initialize the destination tensor before testing fusion in conv2d
and elementwise binary ops.
Resolves: COMPMID-5518
Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com>
Change-Id: Ibd815020f02b57f88eea7c2921bdcf98605d99c5
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8617
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|