Age | Commit message (Collapse) | Author |
|
* Binary elementwise operator now can have broadcasting in either
X dimension, Y+Z dimension, or both, in either LHS or RHS
operand.
* Fix bug in CL code to support batching.
Resolves: COMPMID-5704
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Change-Id: I51b04986d30861f255ca9f754adffa0e6c85a26b
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8898
Reviewed-by: SiCong Li <sicong.li@arm.com>
Reviewed-by: Ramy Elgammal <ramy.elgammal@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Dynamic-Fusion: Ramy Elgammal <ramy.elgammal@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
* Multiple intermediate tensors can share the same tile.
- A simple operator can reuse the input tensor for the result
if the input tensor has the same shape, data type and it is
only consumed by that operator.
- The special case is a simple operator and an output operator
consume the same tensor. However as the output operator
doesn't change the content of the input tensor, it doesn't
count as "consuming" the input tensor.
* These temporary tiles are declared automatically by the template
writer. Individual operator doesn't need to generate output tile
declaration.
* Cast is now simple operator.
Resolves: COMPMID-5778
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Change-Id: I232647ac976645e2d266a62e055b9eb48c356a8e
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8877
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
* The dependency graph now can schedule any acyclic graph into
a sequential list of operators. This is needed as the output
operators now form branches in the graph.
* Fix the definition of input, output and intermediate tensors
in GpuKernelComponentGroup to support non-linear but sequential
list of operators.
* Add constraint on GpuOperatorGroup to enforce strictly linear
fusion style, but allow output operator as the only form of
branch.
Resolves: COMPMID-5771
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Change-Id: I68de3a31a2456145081f0a397e4e61dd66327682
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8823
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
* The output of the fused operator must be explicitly specified
using GpuOutput operator.
* Any temporary tensors used to connect the output of an operator
to the input of another operator will be marked as no-alloc
and won't be allocated as a tensor in the memory.
Resolves: COMPMID-5771
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Change-Id: I5ae8e800f8f737db23a055a92b01c4f1d78c3bb8
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8794
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
- Provide support for Add operator
- Auto initialize the destination tensor before testing fusion in conv2d
and elementwise binary ops.
Resolves: COMPMID-5518
Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com>
Change-Id: Ibd815020f02b57f88eea7c2921bdcf98605d99c5
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8617
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|