Age | Commit message (Collapse) | Author |
|
Signed-off-by: David Mansell <David.Mansell@arm.com>
Change-Id: If02f7809f9b6e84979121698c5e7a62cbb41e2c3
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10487
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
This reverts commit aeced744b854758768243833bcdf999c0c3c1a5b.
Reason for revert: Incorrect SME architecture checks.
Change-Id: I23fe78178041a544a8791a4655bf6fe4aa375e38
Signed-off-by: David Mansell <David.Mansell@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10501
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I1f73819c25c66e4d13198e9c79755808d92b343d
Signed-off-by: David Mansell <David.Mansell@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10466
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
* Add support for processing left-over vector to comparison kernel.
* Combine native and quantized versions of CL comparison.
Resolves: COMPMID-6424
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Change-Id: I31d43bdf0eab999cee6fa8144b5d8e921a1093e8
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10467
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
* Batch dimension is added to reduction operation.
- All the dimensions higher than the batch dimension are collapsed
so that the input and output tensors are always 3-4D.
- CL kernel is called once instead of being repeatedly called
to process each sliding window.
Resolves: COMPMID-6443
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Change-Id: Icd99939d52d3bb648f08537e5f52ef27e894061b
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10456
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
"2D" threading mode was not setting the result pointer correctly for
SME2 kernels with K blocking - for non-final blocks the result
pointer should be NULL so that the intermediate results get written
in the accumulator buffer by the kernel.
Signed-off-by: David Mansell <David.Mansell@arm.com>
Change-Id: Idefa538e190a086e1e44a91998ab7e949e3989e4
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10342
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Partially resolves: ONCPUML-1251
Reorders of block size 8 are only available when SVE is enabled. This
fixes an issue with these reorders being checked even when SVE is not enabled.
Signed-off-by: David Svantesson <david.svantesson@arm.com>
Change-Id: I287763a547d2ab303c3f8ce10164178b680e551e
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10464
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Optimize the stack operation in Cpu by leveraging block memcpy.
Resolves: COMPMID-6498
Change-Id: I49d79d179f0375a73d654edd59fb33072112569b
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10451
Reviewed-by: SiCong Li <sicong.li@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: COMPMID-6583
Change-Id: Icb760b57df1e4573e5009e41ee13ada869acc687
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10457
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
* Transpose higher dimensional tensors (>2D) by collapsing higher
dimensions into the third dimension thus avoiding multiple dispatches
of the CL kernel
* Maximize tile size without register spilling
Resolves: COMPMID-6448
Change-Id: Iac094b8c428bdf319d9c28a8334cb55d58e2d14b
Signed-off-by: Jakub Sujak <jakub.sujak@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10443
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
- Fix the reference axis vector to be the right size.
- Update typos in the error messages.
Resolves COMPMID-6574
Signed-off-by: Adnan AlSinan <adnan.alsinan@arm.com>
Change-Id: I9572365b8173b92d0fffd557e4db261b2969109c
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10423
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
|
|
Resolves: COMPMID-6476, COMPMID-6477
Change-Id: Ied37c269d5a108ff72f70e3ad932cf372bda5562
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10346
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Clang-format options now match those in clang-format version 14.
Remove Astyle checks as the same code style checks are provided by clang-format.
Resolves: COMPMID-6576
Change-Id: Iefa9bb719826242a3276e9ca058d0c84624f7302
Signed-off-by: Felix Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
Signed-off-by: Jakub Sujak <jakub.sujak@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10399
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: COMPMID-6474
Change-Id: Iaff5b512cf77975f2df02dcdf848711b13bf97a6
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10341
Reviewed-by: Mohmun02 <MohammedSuhail.Munshi@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
* The current implementation has signfinicant inaccuracy
and the issue cascades to GELU.
* Use the implementation from Arm® Optimized Routines.
The maximum error is 1.93 ULP.
Resolves: COMPMID-6554
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Change-Id: If80131e164b7a078e34dd8e05b1506698f31d17a
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10395
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: TeresaARM <teresa.charlinreyes@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Code is formatted as per a revised clang format configuration
file(not part of this delivery). Version 14.0.6 is used.
Exclusion List:
- files with .cl extension
- files that are not strictly C/C++ (e.g. Android.bp, Sconscript ...)
And the following directories
- compute_kernel_writer/validation/
- tests/
- include/
- src/core/NEON/kernels/convolution/
- src/core/NEON/kernels/arm_gemm/
- src/core/NEON/kernels/arm_conv/
- data/
There will be a follow up for formatting of .cl files and the
files under tests/ and compute_kernel_writer/validation/.
Signed-off-by: Felix Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
Change-Id: Ib7eb1fcf4e7537b9feaefcfc15098a804a3fde0a
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10391
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
|
|
- Add support for negative axis values.
- Add option to use opposite ACL convention for dimension addressing.
- Add validation tests for the mentioned additions.
Resolves COMPMID-6497
Change-Id: I9174b201c3adc070766cc6cffcbe4ec1fe5ec1c3
Signed-off-by: Adnan AlSinan <adnan.alsinan@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10335
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Inclusion order of header is changed as preparatory step
for applying clang-format
Change-Id: I0c529f896ba802dfc6f30a573cdc9d9a24f3081c
Signed-off-by: Felix Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10379
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Switch the loop unrolling order and reuse the pre-computed vectors
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Change-Id: I636c0530d6b21dae4dbb371c57d18b1f7c7246a8
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10355
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
* Two implementations of the command buffer are added:
- CLMutableCommandBuffer uses mutable dispatch command buffer
extension.
- CLCompatCommandBuffer is the compatibility class for platform
without the CL extension.
Resolves: COMPMID-6454
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Change-Id: I15b370a50168ca940bd8fb2b5fae26230da3f472
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10298
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: COMPMID-6475
Change-Id: Ic867cdfff5d4391cb749a04bf7cc35cda63d3b71
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10311
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
This patch fixes some include dependencies in certain files that caused build failures in https://review.mlplatform.org/c/ml/ComputeLibrary/+/10287.
It also circumvents some clang-format glitches.
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Change-Id: I8e9d3307edd2d1afd17c685c9bc9429624130e5a
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10313
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: <felixjohnny.thomasmathibalan@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
The skeleton code consists of modifications
- to build the library with the quantized matmul kernel
- refactoring of some common utilities
- empty OpenCL Kernels for four configurations ([Lhs, Rhs] X [Nt, t])
- some validation tests and skeleton for functional tests
Resolves: COMPMID-6473
Change-Id: Id8401f789d34277dceb1f91afd68c9c88275618a
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10273
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- Add a test case with src and dst having same row size
- Remove inline from has_holes() util function
Related to COMPMID-6504
Change-Id: Iead1f17692dc57b66c5d9f01eed30169efaee0a5
Signed-off-by: Anitha Raj <anitha.raj@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10190
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
PostOps was the experimental interface for Dynamic Fusion. It is now
replaced by the new Dynamic Fusion interface with code generation using
the Compute Kernel Writer.
Resolves: COMPMID-6190
Change-Id: I813b48facef2fd6f3aee332588886b4f9b3d33d8
Signed-off-by: Jakub Sujak <jakub.sujak@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10219
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Inline assembler blocks attempting to bind 8 integer
registers don't compile in certain configurations (notably GCC 13.2 debug
builds with -O0 -g). Fix this by splitting the offending block into two
separate parts (straightforward as there is no flow control in the block).
Fixes: COMPMID-6532
Signed-off-by: David Mansell <David.Mansell@arm.com>
Change-Id: I80e9a10e6a91574176d50e63c45fab055aefa659
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10197
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Emanuele Rocca <ema@linux.it>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- Following CpuReshapeKernel Optimizations, update the CpuGemmConv2D and CpuFlatten
to use CpuReshape operator instead of CpuReshapeKernel
- Minor changes to comment in NEReorgLayerKernel.h
Resolves COMPMID-6504
Signed-off-by: Anitha Raj <anitha.raj@arm.com>
Change-Id: Ib6ee1fdc313d91249f9fe41c81e73324031c1ff4
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10186
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Signed-off-by: David Mansell <David.Mansell@arm.com>
Change-Id: I359ed0703f4036e017b34b622f76b630cefac973
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10183
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves COMPMID-5279
Change-Id: Id9b007eed62c200702bbfcc83b94dab7b5de1714
Signed-off-by: Anitha Raj <anitha.raj@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9962
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: COMPMID-6495
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Change-Id: I916829222a6211fa096a833a2afc5fab5eb34ea4
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10143
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
* Add helper functions to check whether command buffer extensions
exist in CL device.
Resolves: COMPMID-6453
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Change-Id: Ibc287e4526e54be4702241ab8ca0cea0b8661b3a
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10130
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Anitha Raj <Anitha.Raj@arm.com>
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: COMPMID-6400
Change-Id: Id9935f9727f77a824afc75c35f044e3f5c173e0d
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10120
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: COMPMID-6404
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Change-Id: I75aebe620567ed50817747589bbe8cfb63715a7b
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10036
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: TeresaARM <teresa.charlinreyes@arm.com>
Reviewed-by: Anitha Raj <Anitha.Raj@arm.com>
Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
* Add fp16 support
* Implement broadcasting to elementwise binary
* Implement kernel name and kernel config id
* Always use explicit cast in ckw unary, binary and ternary elementwise
functions. This is to address the accidental use of double literals,
with other benefits.
* Refactor TypeConverter for smaller includes
Resolves COMPMID-6260
Change-Id: I26b726746f8c0dd7b5942ad379d56f4d7642d15f
Signed-off-by: SiCong Li <sicong.li@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9999
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
* Some symbols have been moved from core/Types.h. This patch retains
back compatibility so that the user can still include this header
for those symbols
* A new header core/CoreTypes.h is created to avoid circular dependency.
This header includes essential small types that are used across
functions
* Move all function info types into function_info folder for easier
tracking
Resolves COMPMID-6330
Related to https://review.mlplatform.org/c/ml/ComputeLibrary/+/9757
Signed-off-by: SiCong Li <sicong.li@arm.com>
Change-Id: I4739175c2d4d184a9bc8e28b881b497fab03ca60
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9979
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves MLCE-1089
Change-Id: I8b385ef8a00ec5de60299bc7a359766ba5417e68
Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9918
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
* The kernel now supports the following conversions:
S64 -> F32
U64 -> F32
* Resolves MLCE-1089
Change-Id: I277cf58b78d919fde25947520d2056e1412c7f82
Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9935
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Makes a small difference to compile times and opens up other opportunities
to simplify code.
Change-Id: I232876910bbe4fa9719f4a0ce4a54c090faeb5ef
Signed-off-by: Matthew Bentham <Matthew.Bentham@arm.com>
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/c/VisualCompute/ComputeLibrary/+/532429
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9856
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
with fp16 and quantized types
Resolves: COMPMID-6337
Change-Id: I81542e51c9c0329f202ac8452f173b138e51a0f6
Signed-off-by: Michael Tyler <michael.tyler@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9883
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: [COMPMID-6316]
Signed-off-by: Mohammed Suhail Munshi <MohammedSuhail.Munshi@arm.com>
Change-Id: I08e6bac9e6b46b76978da0dc6a48ccfe3dde5086
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9833
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
* Only kernel sizes supported are 2, 4, 8 and 16.
* Resolves COMPMID-6349
Change-Id: I30c85dcb3505d47fe56a2d2a08e9221ff426ee93
Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9890
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
functions
- Added checks for supported activation functions in MatMulLowpKernel validate
- Replaced incorrect float activation macro with quantized implementation in mat_mul_quantized
Resolves: [COMPMID-6339]
Signed-off-by: Mohammed Suhail Munshi <MohammedSuhail.Munshi@arm.com>
Change-Id: I15661f14877f1d3305644e6473feb5482a67e773
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/c/VisualCompute/ComputeLibrary/+/532858
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Comments-Addressed: bsgcomp <bsgcomp@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9855
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
* Simpler implementation without stages for axis 0
* Removed considerable amount of code.
Resolves COMPMID-6271
Change-Id: Ie8bcb2f0b55f87472f44b38872a23a922619a211
Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9849
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: COMPMID-6337
Signed-off-by: Michael Tyler <michael.tyler@arm.com>
Change-Id: Id8e9b39e55ab3e13beda720e24ba9ea7e6f97762
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9868
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: COMPMID-6337
Change-Id: Ie9097b3f56e8071426c621386a5988bd7f7e8ef2
Signed-off-by: Michael Tyler <michael.tyler@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9852
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
* Partially port ElementwiseBinary component to ckw (broadcast not
supported yet)
* Port Store component to ckw
* Move KernelArgumentsHelpers to ckw_driver/ as it's only used by the
driver
ckw_driver is a middle layer between dynamic fusion and Compute Kernel
Writer (CKW). It consumes the fused kernel component stream produced by
Dynamic Fusion and uses CKW to write the kernel code complete with all
meta info needed by the runtime to enqueue the kernel.
It consists of two parts:
* Kernel writing: This resides in dynamic_fusion/sketch
* Runtime utilities: This resides in dynamic_fusion/runtime
The integration (separation between DF and CKW) occurs in two places:
* Inside GpuCKWDriver
global driver that coordinates how the
final fused kernel code is assembled together alongwith other meta
info needed by runtime.
* Inside each instantiated IGpuCKWComponentDriver
component driver that drives CKW to write component-specific code
or do component-specific configurations
Partially resolves: COMPMID-5792 COMPMID-6282 COMPMID-6260 COMPMID-6266
Signed-off-by: SiCong Li <sicong.li@arm.com>
Change-Id: Ib57a080a65fe8cfee1a8df1529fe572005a6d2f2
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9847
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: COMPMID-6196, COMPMID-6197
Change-Id: I22a1c32686eb70e7676c8b4d64a76dbaeb638cb3
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9798
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
* Define ckw::TensorStorage. The tensor storage represents the type of tensor memory object.
* Add helper functions for setting the CKW TensorComponent and TensorStorage as OpenCL kernel arguments.
* Refactor CL Image2D method for simpler image object creation.
Resolves: COMPMID-5784
Change-Id: I2d37d06783c1dc55f3b5692b44eb49b151f2401c
Signed-off-by: Jakub Sujak <jakub.sujak@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9807
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
- Use MatMul kernels in FC layer when using dynamic weights without broadcasting or bias.
- Fix minor typo in IClMatMulNativeKernelConfig.h
Partially Resolves : [COMPMID-6193]
Signed-off-by: Mohammed Suhail Munshi <MohammedSuhail.Munshi@arm.com>
Change-Id: Id494062b5b4f4e75ff9714c202dde941955afa52
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9797
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves COMPMID-6195
Signed-off-by: ramy.elgammal@arm.com <ramy.elgammal@arm.com>
Change-Id: I8e85fe73308ed84ebb142d6d6d1562b62dddfaa5
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9819
Reviewed-by: SiCong Li <sicong.li@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|