Age | Commit message | Author |
|
- Return aux tensorInfo by get_aux_tensors() at runtime to init the aux
tensor with the right size.
- Keep softmax unfusable for this commit
- Hence, add Tensor3D to the template writer argument declarations, to keep
the dynamic fusion softmax components' kernels matching their CL
counterparts.
Resolves: COMPMID-5523
Change-Id: I667f39545db925f667036ef448302c79a0330373
Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com>
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/c/VisualCompute/ComputeLibrary/+/483924
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: bsgcomp <bsgcomp@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8986
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
- Add support for texture image to input and output of direct
convolution
- Extend the T_LOAD2D_INDIRECT macro to read values from CL image storage
Resolves COMPMID-5715
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Change-Id: Idb0410f53f6d0763cd9e39895a7cbf9bc826d33a
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8904
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
The new version introduces the following major changes:
* Change public interface to simplify and standardize the user experience
- Use the term "Workload" uniformly
- Simplify operator interface to be a set of static methods:
validate_op(), create_op()
* Separate the kernel writing into its own component (template_writer).
This is to allow the co-development of GpuKernelWriter, and to allow
easy replacement once GpuKernelWriter is mature.
* Optimize the core fusion algorithm used by the component graph. The
details can be found in GpuKernelComponentGraph::fuse()
* Use Gpu instead of Cl prefixes for most of the Workload interfaces
(except for runtime and kernel components, which have to be language specific)
This allows potential extension to other Gpu languages in the
future.
* Refactor runtime memory interface so that auxiliary tensor handling
is separate from the user tensor passing. This is because the former
is less stable and may require extension in the future.
* Hide source code object from the user as it is not required at the
moment
* Deprecate the old prototype entirely by disabling it in SCons build
Resolves COMPMID-5510, COMPMID-5512, COMPMID-5513
Change-Id: If69d2362856f2de4503546b7b6cf48a525cf3079
Signed-off-by: SiCong Li <sicong.li@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8406
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|