aboutsummaryrefslogtreecommitdiff
path: root/src/core/NEON/kernels/arm_gemm/transforms/list.hpp
AgeCommit message (Collapse)Author
2024-01-12[ONCPUML-1387] Add ACL based reorder for f32 to bf16 data type conversion.Renato Arantes
The reorders supported at the moment are: ab->BA4b4a ab->BA8b4a Co-Authored-By: David Mansell <David.Mansell@arm.com> Change-Id: Ic466465629ce3bcdcee0089e251485b79b60e1f3 Signed-off-by: Renato Arantes <renato.arantes@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10775 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Jakub Sujak <jakub.sujak@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Benchmark: Arm Jenkins <bsgcomp@arm.com>
2021-07-22Update GEMM assembly kernelsGeorgios Pinitas
- Introduce Fp32 kernels with internal calculations in Bfloat16 when fast_mode is enabled - Improve kernel selection heuristics Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I68a9e7e862b6fd2721b46e0d7cc791091c4ab279 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5965 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-11-12COMPMID-3776: Indirect GEMMGeorgios Pinitas
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I51a1b0f098bc3a8c408c50c92221e4df3061e12c Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4343 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Sang-Hoon Park <sang-hoon.park@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-07-23COMPMID-3578: Update FP32/int8 kernel selection.David Mansell
Upgrade the current 'is_preferred()' mechanism with a new framework, where kernels instead provide an estimated cycle count figure. Compatibility with old mechanism is achieved via a wrapper which replaces a "true" result with an estimate of 0, and a "false" result with UINT64_MAX. This mechanism is then used to select between 'interleaved' and 'hybrid' FP32 NEON kernels. This uses a simple system based on counting MACs performed and bytes of data transferred (for rearrange/merge operations) and dividing by fixed performance figures, which are provided for A53, A55, A73 and 'default' figures (based on A76). Separately, a new route for performing int8 GEMMs by using the int16 kernel is provided. This performs significantly (for uint8) or slightly (for int8) better on A53 than the existing int8 route. Optimized 8-to-16 bit transforms are also included. Change-Id: I53b2e59eb9368793c78c2081e17d2445361bcc47 Signed-off-by: David Mansell <David.Mansell@arm.com> Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/c/VisualCompute/ComputeLibrary/+/250120 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: bsgcomp <bsgcomp@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3609 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2020-07-06COMPID-3324: Clean GEMM kernelsGeorgios Pinitas
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I170de1671e061a78740caee31fb4a1b8642c1369 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3505 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2020-01-31COMPMID-3003: Integrate assembly kernels utilizing MMLA instruction.Georgios Pinitas
MMLA is a matrix-multiply instruction introduced on armv8.6-A Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I572a54981d48f5a1e0e9e51102cb7ae28ad87806 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/2663 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2019-09-30COMPMID-2452: Dot product optimizations on merge/transformsGeorgios Pinitas
-Adds optimized gemm transforms for AArch64. -Optimized gemm merger to only support alpha==1 and (beta==0 || beta=1) cases Change-Id: I55793b12a0381f4fd53f521d0e57416809904d96 Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-on: https://review.mlplatform.org/c/2003 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
2019-07-26COMPMID-2178: Update GEMM assembly code.Georgios Pinitas
Perform offset reduction and requantization within the assembly wrapper. Change-Id: I5d5b3e1f6f9ef4c71805362c57f88ff199c027a3 Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-on: https://review.mlplatform.org/c/1541 Comments-Addressed: Pablo Marquez <pablo.tello@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2019-01-18COMPMID-1867: Add NEON/SVE GEMM Hybrid kernels.Georgios Pinitas
Change-Id: Ib40a9921e7f9a6a8be6c38872d6b3a0f24ed0cd3 Reviewed-on: https://review.mlplatform.org/515 Reviewed-by: Anthony Barbier <Anthony.barbier@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2018-11-09COMPMID-1675: Remove unused SVE filesGeorgios Pinitas
Removes: -sve_interleave_8way_block2_16bit -sve_interleave_8way_block4_16bit -sve_sgemm_3VLx8 Change-Id: I0aa35fe974d8e122937dfe8923ecf63ff5a52001
2018-11-08COMPMID-1675: Add SVE supportGeorgios Pinitas
Change-Id: I86679adff556b6ffc9929b35cbf1b59b3958bdb1
2018-11-02COMPMID-881: RSH new arm_gemm interface.Pablo Tello
Change-Id: I1e2a1a77097d8017c274af3f97eba6964f80f5fa Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/122592 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>