Age | Commit message | Author |
|
Change-Id: Id602cfe2d00deed6d994ba4c90cdc5914a8e6016
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-on: https://review.mlplatform.org/c/1987
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Marquez <pablo.tello@arm.com>
|
|
Block transforms could access elements out of bounds when the input
was smaller than the transform block. Fix these edge cases by
short-circuiting the accesses.
Change-Id: I11d172ecd80b4dde46496e9d4b446de7fb9d5dc7
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-on: https://review.mlplatform.org/c/1976
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
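
To illustrate the kind of guard the message above describes, here is a minimal sketch, not the code from the patch. The helper name load_block, the row-major layout and the choice to zero-fill skipped positions are all assumptions made for the example; the patch itself only states that out-of-range accesses are short-circuited.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the patch): copy a block_rows x block_cols
// tile that starts at (row0, col0) out of a row-major rows x cols input.
// Positions that fall outside the input are never read; they stay at zero.
inline std::vector<float> load_block(const std::vector<float> &in,
                                     std::size_t rows, std::size_t cols,
                                     std::size_t row0, std::size_t col0,
                                     std::size_t block_rows, std::size_t block_cols)
{
    assert(in.size() == rows * cols);
    std::vector<float> block(block_rows * block_cols, 0.0f);
    for (std::size_t r = 0; r < block_rows; ++r)
    {
        const std::size_t src_r = row0 + r;
        if (src_r >= rows)
        {
            break; // short-circuit: no more valid input rows
        }
        for (std::size_t c = 0; c < block_cols; ++c)
        {
            const std::size_t src_c = col0 + c;
            if (src_c >= cols)
            {
                break; // short-circuit: no more valid input columns
            }
            block[r * block_cols + c] = in[src_r * cols + src_c];
        }
    }
    return block;
}
```

With a 2x2 input and a 4x4 block, only the top-left 2x2 corner of the block is populated and no read goes past the end of the input.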
|
|
Fix incorrect stride
Change-Id: I5bb9660bd73c40d587fa869d2488b7ab5f5b1ea7
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com>
Reviewed-on: https://review.mlplatform.org/c/1807
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Update GEMM assembly code.
Change-Id: Id315c51a11aa89915727c4d388e9335982216a2d
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com>
Reviewed-on: https://review.mlplatform.org/c/1774
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
gemm_quint8 is only supported on 64-bit, so guard it to avoid build
issues on other targets.
Change-Id: Id8784dbacc467780318bd340f895a5abbd383182
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-on: https://review.mlplatform.org/c/1638
Reviewed-by: Pablo Marquez <pablo.tello@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Perform offset reduction and requantization within the assembly wrapper.
Change-Id: I5d5b3e1f6f9ef4c71805362c57f88ff199c027a3
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-on: https://review.mlplatform.org/c/1541
Comments-Addressed: Pablo Marquez <pablo.tello@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
This has a couple of benefits. First, a disassembler that actually
understands dot product will start showing the dot product
instruction for what it is, rather than just a random .word.
For parties interested in how compilers and toolchains
manage to disassemble this, please go and look up mapping symbols
from toolchains.
Secondly, .word is a data directive, and if you ever have a customer
run Arm Compute Library on big-endian AArch64, this will not work.
This is because data on big endian is, well, big endian, but the code
section is not big endian, just little endian. Admittedly there
will be many other things that need to be fixed for big endian
to work reliably.
Eyeballed satisfactorily with a simple case. If someone
could run this through a test run with the CI, that would be
great.
Thanks,
Ramana
Change-Id: I0b9573ecbed298afc967d675b0542a6fe72b4c52
Signed-off-by: Ramana Radhakrishnan <ramana.radhakrishnan@arm.com>
Reviewed-on: https://review.mlplatform.org/c/1588
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
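
The endianness point in the message above can be made concrete with a small portable sketch. It is illustrative only: the function names are made up for the example, and the A64 NOP encoding (0xD503201F) stands in for a dot product instruction so that no encoding has to be guessed.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Bytes = std::array<std::uint8_t, 4>;

// Encoding of the A64 NOP instruction, used here as a stand-in opcode.
constexpr std::uint32_t kNopEncoding = 0xD503201F;

// A64 instructions are stored least-significant byte first,
// regardless of the data endianness of the target.
inline Bytes instruction_bytes(std::uint32_t encoding)
{
    return Bytes{{static_cast<std::uint8_t>(encoding & 0xFF),
                  static_cast<std::uint8_t>((encoding >> 8) & 0xFF),
                  static_cast<std::uint8_t>((encoding >> 16) & 0xFF),
                  static_cast<std::uint8_t>((encoding >> 24) & 0xFF)}};
}

// A value emitted with a data directive such as .word follows the
// data endianness instead, so on big endian the bytes come out reversed.
inline Bytes data_word_bytes(std::uint32_t value, bool big_endian)
{
    const Bytes le = instruction_bytes(value);
    if (!big_endian)
    {
        return le;
    }
    return Bytes{{le[3], le[2], le[1], le[0]}};
}
```

On a little-endian target both functions return the same bytes, which is why the .word trick appears to work there. On a big-endian target the data word no longer matches the instruction stream.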
|
|
Change-Id: Ib3fbd8cdc42f708e16be9ac1f63d4e693dce5aeb
Signed-off-by: Ramana Radhakrishnan <ramana.radhakrishnan@arm.com>
Reviewed-on: https://review.mlplatform.org/c/1589
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: I7859b82b2059e14685f8792424648ac5eacd67f1
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-on: https://review.mlplatform.org/c/1418
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I3e369295a7caece8142376b75796567242c1ee8d
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-on: https://review.mlplatform.org/c/1211
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Marquez <pablo.tello@arm.com>
|
|
-Updates u8/s8 hybrid dot product kernels to work for any N and any K >= 16.
-Adds hybrid FP32 kernels with generic and A55 variants.
-Adds SVE native kernels for fp16/u8/s8.
Change-Id: Ifc0eaba9e3c8ea5bb19d334e870e1b39e4e7e728
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-on: https://review.mlplatform.org/c/863
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
|
|
Change-Id: If50fc6a96e97eb96c1e3850864b13d134dcbf88a
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-on: https://review.mlplatform.org/607
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Marquez <pablo.tello@arm.com>
|
|
Change-Id: Ifeb005f9d18d19feff11949474cce84d9e03749c
Reviewed-on: https://review.mlplatform.org/565
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: Ib40a9921e7f9a6a8be6c38872d6b3a0f24ed0cd3
Reviewed-on: https://review.mlplatform.org/515
Reviewed-by: Anthony Barbier <Anthony.barbier@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Removes:
-sve_interleave_8way_block2_16bit
-sve_interleave_8way_block4_16bit
-sve_sgemm_3VLx8
Change-Id: I0aa35fe974d8e122937dfe8923ecf63ff5a52001
|
|
Change-Id: I86679adff556b6ffc9929b35cbf1b59b3958bdb1
|
|
Change-Id: If8fbd04d0817b9e654ffa9715879a2521de66963
|
|
Change-Id: I05d3447336ee0bf330e2a0c58fc6904be1db8f83
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/152626
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
|
|
GCC (>= 8) emits a warning with -Wignored-qualifiers (enabled by -Wextra) on
such usage.
Change-Id: Ib3284b60cec0ec4faf8c6e6c1e2980cbf5731973
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/145384
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: I80764d09bf5fb87b3a98bc0e1803d25c6c682c1f
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/139859
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
This patch implements a system for separating the "validity" from
"preferred" aspect of the current heuristics in gemm_*.cpp.
Now, each gemm_*.cpp defines a list of candidate implementations,
each of which supplies an is_valid() function (to check for
validity), an is_preferred() function (the "heuristic" part), and an
instantiate() function which actually produces the GemmCommon object
pointer.
The actual gemm() function is now templated and uses this list to
select an implementation. This patch also implements a mechanism to
identify the preferred implementation, and override it via the
GemmConfig structure.
Change-Id: Id49ab7af8bf2e3e9fd951a9698883ade234d40e1
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/139120
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
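
The selection scheme described in the message above could look roughly like the sketch below. Only is_valid(), is_preferred(), instantiate(), GemmCommon and the idea of an override come from the message. The struct layouts, the select_implementation() helper, the forced_name parameter (standing in for the GemmConfig override) and the two example candidates with their thresholds are assumptions made for illustration.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct GemmArgs { unsigned M, N, K; }; // hypothetical problem description

struct GemmCommon
{
    virtual ~GemmCommon() = default;
    virtual std::string name() const = 0;
};

struct NamedGemm : GemmCommon // trivial stand-in for a real kernel wrapper
{
    explicit NamedGemm(std::string n) : _name(std::move(n)) {}
    std::string name() const override { return _name; }
    std::string _name;
};

struct GemmImplementation
{
    std::string name;
    std::function<bool(const GemmArgs &)> is_valid;     // can it run at all?
    std::function<bool(const GemmArgs &)> is_preferred; // the heuristic part
    std::function<std::unique_ptr<GemmCommon>(const GemmArgs &)> instantiate;
};

// Return the forced implementation if one is named (and valid), otherwise
// the first valid and preferred one, otherwise the first valid one.
inline const GemmImplementation *select_implementation(
    const std::vector<GemmImplementation> &candidates,
    const GemmArgs &args, const std::string &forced_name = "")
{
    const GemmImplementation *fallback = nullptr;
    for (const auto &impl : candidates)
    {
        if (!impl.is_valid(args)) continue;
        if (!forced_name.empty())
        {
            if (impl.name == forced_name) return &impl;
            continue;
        }
        if (impl.is_preferred(args)) return &impl;
        if (fallback == nullptr) fallback = &impl;
    }
    return fallback;
}

// Two made-up candidates; the M <= 8 threshold is purely illustrative.
inline std::vector<GemmImplementation> example_candidates()
{
    auto make = [](const std::string &name) {
        return [name](const GemmArgs &) -> std::unique_ptr<GemmCommon> {
            return std::make_unique<NamedGemm>(name);
        };
    };
    return {
        {"native", [](const GemmArgs &) { return true; },
         [](const GemmArgs &a) { return a.M <= 8; }, make("native")},
        {"interleaved", [](const GemmArgs &) { return true; },
         [](const GemmArgs &) { return true; }, make("interleaved")},
    };
}
```

Keeping validity and preference as separate callbacks means an override only has to pass the validity check, so a caller can force a non-preferred kernel without bypassing correctness constraints.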
|
|
This patch makes the needed infrastructure changes to allow SVE
kernels to be added later on.
Change-Id: Ide5bccac2f47278e93fff3d648231aee2d5f8c2e
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/139070
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
The "cc" constraint was missing on the a53/a55r1 versions of this kernel.
Added "memory" to these (and the generic kernel) as well for safety.
Change-Id: I4df1b2fde43c20550ba7a51436b326f5e9e9871f
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/138812
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
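
For readers unfamiliar with the two clobbers mentioned above, here is a small sketch in GCC/Clang extended inline assembly. It is not one of the kernels the patch touches: sum_words is a made-up function, and a plain C++ loop stands in on non-AArch64 targets so the example still builds there. The "cc" clobber is needed because SUBS writes the condition flags, and "memory" tells the compiler the assembly reads memory it cannot see from the operand list.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical example: sum `count` 32-bit words starting at `data`.
inline std::uint32_t sum_words(const std::uint32_t *data, std::uint64_t count)
{
    std::uint32_t total = 0;
    if (count == 0)
    {
        return total;
    }
#if defined(__aarch64__)
    std::uint32_t tmp;
    __asm__ volatile(
        "1:\n"
        "ldr  %w[tmp], [%[ptr]], #4\n"         // load a word, post-increment pointer
        "add  %w[total], %w[total], %w[tmp]\n" // accumulate
        "subs %[count], %[count], #1\n"        // decrement; SUBS sets the flags
        "b.ne 1b\n"                            // loop while count != 0
        : [tmp] "=&r"(tmp), [total] "+r"(total), [ptr] "+r"(data), [count] "+r"(count)
        :
        : "cc", "memory"); // flags are modified; memory is read behind the compiler's back
#else
    // Portable stand-in so the sketch compiles on other architectures.
    for (std::uint64_t i = 0; i < count; ++i)
    {
        total += data[i];
    }
#endif
    return total;
}
```

Omitting "cc" lets the compiler assume flags set before the asm statement are still valid after it, and omitting "memory" lets it reorder or cache loads and stores around the statement.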
|
|
Pulled latest fixes from David's repo:
commit f43ebe932c84083332b0b1a0348241b69dda63a7
Author: David Mansell <David.Mansell@arm.com>
Date: Tue Jul 3 18:09:01 2018 +0100
Whitespace tidying, fixed comment in gemv_batched imported from ACL.
Change-Id: Ie37a623f44e90d88072236cb853ac55ac82d5f51
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/138530
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-by: David Mansell <david.mansell@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I762a3c9add2e26b850f388a78a16861abb2bf0f9
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/134553
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Improve the native GEMM so it can cope with any value for M. Also
change the selection code so that the native GEMM is selected if M is
small and nmulti is large - Winograd needs GEMMs like this and they
don't thread properly with the blocked GEMM.
(also rename gemm_batched.hpp back to gemv_batched.hpp)
Change-Id: I736c33373ada562cbc0c00540520a58103faa9d5
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/131739
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
The default templated merge, and the specialised S8 12x8 merge, were
using alpha and beta the wrong way round. Fixed.
Change-Id: Ie559b665edf1eb012e8cb54ea0bca31612bcc072
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/131309
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
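
As a reminder of the convention the fix above restores, the usual BLAS-style GEMM update is C = alpha * (A * B) + beta * C. The sketch below applies that update element-wise. merge_results is a hypothetical name, not the templated merge from the library, and it assumes the product A * B has already been computed into a flat buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical merge step: alpha scales the freshly computed product,
// beta scales the values already present in the output.
inline void merge_results(std::vector<float> &c, const std::vector<float> &ab,
                          float alpha, float beta)
{
    assert(c.size() == ab.size());
    for (std::size_t i = 0; i < c.size(); ++i)
    {
        c[i] = alpha * ab[i] + beta * c[i];
    }
}
```

With c = {1, 2}, ab = {10, 20}, alpha = 2 and beta = 0.5, the result is {20.5, 41}. Swapping the two factors would give {7, 14} instead, which is the kind of error that goes unnoticed when tests only use alpha = 1 and beta = 0 or 1.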
|
|
Change-Id: Ibab955dbade4805c39cab003362be2bb3c74b166
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/130605
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
|
|
threads
Change-Id: Id5ba16a7e3382070fda936c63d174df53596da04
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/129964
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: Ib9d91b77f1d51976da4449fa1e6eeeffae307353
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/127876
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Removed CPUTarget in favor of the CPUModel type.
CPUInfo now holds a vector of N CPUs.
CPUInfo auto-initialises upon construction with 1 GENERIC CPU.
CPPScheduler fills CPUInfo's vector upon construction (runtime).
IScheduler has a single CPUInfo object and ThreadInfo always gets a pointer to it (avoids copying the vector).
Change-Id: I30f293258c959c87f6bac5eac8b963beb6a4d365
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/124626
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
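
The ownership arrangement described above can be sketched as follows. The type names CPUModel, CPUInfo and ThreadInfo come from the message; the member names, the extra enum values and the simplified Scheduler class are assumptions made for illustration and do not reproduce the library's actual interfaces.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class CPUModel { GENERIC, A53, A55r1 }; // illustrative subset

class CPUInfo
{
public:
    // Starts out describing a single GENERIC CPU.
    CPUInfo() : _models(1, CPUModel::GENERIC) {}

    void set_cpu_count(std::size_t n) { _models.assign(n, CPUModel::GENERIC); }
    void set_cpu_model(std::size_t cpu, CPUModel model)
    {
        assert(cpu < _models.size());
        _models[cpu] = model;
    }
    CPUModel cpu_model(std::size_t cpu) const
    {
        assert(cpu < _models.size());
        return _models[cpu];
    }
    std::size_t cpu_count() const { return _models.size(); }

private:
    std::vector<CPUModel> _models; // one entry per CPU
};

struct ThreadInfo
{
    int            thread_id;
    int            num_threads;
    const CPUInfo *cpu_info; // non-owning: points at the scheduler's CPUInfo
};

// Simplified stand-in for a scheduler that owns the single CPUInfo.
class Scheduler
{
public:
    explicit Scheduler(std::size_t detected_cpus) { _cpu_info.set_cpu_count(detected_cpus); }
    CPUInfo   &cpu_info() { return _cpu_info; }
    ThreadInfo thread_info(int id, int count) const { return ThreadInfo{id, count, &_cpu_info}; }

private:
    CPUInfo _cpu_info;
};
```

Every ThreadInfo handed out by the same scheduler points at the same CPUInfo, so the per-CPU vector is never copied per thread.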
|
|
Change-Id: Iba2664f33320e79bd15ca9c1399e65e4cc165be6
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/125265
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: I1e2a1a77097d8017c274af3f97eba6964f80f5fa
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/122592
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|