Age | Commit message (Collapse) | Author |
|
Full trademarks available in README.md
Resolves: COMPMID-4257
Signed-off-by: Sheri Zhang <sheri.zhang@arm.com>
Change-Id: Ibfba2adf2eef3449433f467464ebd87d7198474d
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5116
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: COMPMID-3990
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: If840c79209940535450f4ea1cbf6b0ec646a168e
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4866
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: I51a1b0f098bc3a8c408c50c92221e4df3061e12c
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4343
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Sang-Hoon Park <sang-hoon.park@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Upgrade the current 'is_preferred()' mechanism with a new framework,
where kernels instead provide an estimated cycle count figure.
Compatibility with old mechanism is achieved via a wrapper which
replaces a "true" result with an estimate of 0, and a "false" result
with UINT64_MAX.
This mechanism is then used to select between 'interleaved' and
'hybrid' FP32 NEON kernels. This uses a simple system based on
counting MACs performed and bytes of data transferred (for
rearrange/merge operations) and dividing by fixed performance figures,
which are provided for A53, A55, A73 and 'default' figures (based on
A76).
Separately, a new route for performing int8 GEMMs by using the int16
kernel is provided. This performs significantly (for uint8) or
slightly (for int8) better on A53 than the existing int8 route.
Optimized 8-to-16 bit transforms are also included.
Change-Id: I53b2e59eb9368793c78c2081e17d2445361bcc47
Signed-off-by: David Mansell <David.Mansell@arm.com>
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/c/VisualCompute/ComputeLibrary/+/250120
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: bsgcomp <bsgcomp@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3609
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: I394c6c539969940e0119cbc14174909d47e65de6
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3519
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: I170de1671e061a78740caee31fb4a1b8642c1369
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3505
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
|
|
GEMM_INTERLEAVE_2D was wrongly selected by the heuristic also in case of
maxthreads < 8
Change-Id: If531d44c6f00ae6f8e3a4bf22428829b252bc3d6
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3225
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Currently 1D ranges of work are specified by the scheduler
via two integers, start and end. This limit opportunities
for advance parallelism and scheduling
This patch expands the interfaces to allow for ND parallism.
`GemmCommon::get_window_size` now returns an `NDRange` specifying the work
in N-dimensions rather than with the single integer it used prior (1D)
Execute now takes an `NDCoordinate` which specifies an `NDRange` with a start
position for that work along with an `NDCoordinate` to specify the thread location
In addition to expanding the interface to enable this functionality,
we have added the capability to SGEMM when the number of threads is high
this has the effective of allowing a much greater degree of parallelism
where te problem dimension would previously have limited the number of threads.
Change-Id: I3e1a8b7276216627bec4ff6f24ac2147552ea9fb
Signed-off-by: Joseph Dobson <joseph.dobson@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/2962
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: I0e449306c138a562ffc1455e76ec44b2fd059d85
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/2860
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
MMLA is a matrix-multiply instruction introduced on armv8.6-A
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: I572a54981d48f5a1e0e9e51102cb7ae28ad87806
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/2663
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com>
Change-Id: I8667e75843fdd6ac75bd8272a86a348b830da28d
Reviewed-on: https://review.mlplatform.org/c/2548
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I7f52112d2d05b1ea3d3f3d4b19b8eafab05d6c44
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-on: https://review.mlplatform.org/c/2141
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Marquez <pablo.tello@arm.com>
|
|
Perform offset reduction and requantization within the assembly wrapper.
Change-Id: I5d5b3e1f6f9ef4c71805362c57f88ff199c027a3
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-on: https://review.mlplatform.org/c/1541
Comments-Addressed: Pablo Marquez <pablo.tello@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
-Updates u8/s8 hybrid dot product kernels to work for any N and any K >=16.
-Adds hybrid FP32 kernels with generic and A55 variants.
-Adds SVE native kernels for fp16/u8/s8.
Change-Id: Ifc0eaba9e3c8ea5bb19d334e870e1b39e4e7e728
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-on: https://review.mlplatform.org/c/863
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
|
|
Change-Id: Ib40a9921e7f9a6a8be6c38872d6b3a0f24ed0cd3
Reviewed-on: https://review.mlplatform.org/515
Reviewed-by: Anthony Barbier <Anthony.barbier@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I86679adff556b6ffc9929b35cbf1b59b3958bdb1
|
|
Change-Id: I80764d09bf5fb87b3a98bc0e1803d25c6c682c1f
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/139859
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
This patch implements a system for separating the "validity" from
"preferred" aspect of the current heuristics in gemm_*.cpp.
Now, each gemm_*.cpp defines a list of candidate implementations,
each of which supplies an is_valid() function (to check for
validity), an is_preferred() function (the "heuristic" part), and an
instantiate() function which actually produces the GemmCommon object
pointer.
The actual gemm() function is now templated and uses this list to
select an implementation. This patch also implements a mechanism to
identify the preferred implementation, and override it via the
GemmConfig structure.
Change-Id: Id49ab7af8bf2e3e9fd951a9698883ade234d40e1
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/139120
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
This patch makes the needed infrastructure changes to allow SVE
kernels to be added later on.
Change-Id: Ide5bccac2f47278e93fff3d648231aee2d5f8c2e
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/139070
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Pulled latest fixes from David's repo:
commit f43ebe932c84083332b0b1a0348241b69dda63a7
Author: David Mansell <David.Mansell@arm.com>
Date: Tue Jul 3 18:09:01 2018 +0100
Whitespace tidying, fixed comment in gemv_batched imported from ACL.
Change-Id: Ie37a623f44e90d88072236cb853ac55ac82d5f51
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/138530
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-by: David Mansell <david.mansell@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Improve the native GEMM so it can cope with any value for M. Also
change the selection code so that the native GEMM is selected if M is
small and nmulti is large - Winograd needs GEMMs like this and they
don't thread properly with the blocked GEMM.
(also rename gemm_batched.hpp back to gemv_batched.hpp)
Change-Id: I736c33373ada562cbc0c00540520a58103faa9d5
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/131739
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: Ib9d91b77f1d51976da4449fa1e6eeeffae307353
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/127876
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: Iba2664f33320e79bd15ca9c1399e65e4cc165be6
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/125265
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: I1e2a1a77097d8017c274af3f97eba6964f80f5fa
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/122592
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|