Age | Commit message | Author |
|
* FP16 kernels must be moved from src/cpu/kernels/pool2d/neon/nchw/all.cpp
to src/cpu/kernels/pool2d/neon/fp16.cpp.
* In src/cpu/kernels/pool2d/neon/list.h when we declare the kernels
we need to remove defined(__ARM_FEATURE_FP16_VECTOR_ARITHMETIC) so that
in std::vector<CpuPool2dKernel::PoolingKernel> available_kernels
* Partially resolves MLCE-1102
Change-Id: I000380f8eccca17e6219c4f3453980d67a2c9dd8
Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10444
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
* Transpose higher dimensional tensors (>2D) by collapsing higher
dimensions into the third dimension thus avoiding multiple dispatches
of the CL kernel
* Maximize tile size without register spilling
Resolves: COMPMID-6448
Change-Id: Iac094b8c428bdf319d9c28a8334cb55d58e2d14b
Signed-off-by: Jakub Sujak <jakub.sujak@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10443
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
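
The dimension collapse described in this commit can be illustrated with a small shape helper. This is a minimal sketch, not the library's code: the function name `collapse_to_3d` and the innermost-first ordering of the shape vector are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: fold every dimension from index 2 upward into a
// single third dimension, so one 3D kernel dispatch can cover the tensor.
// Assumption: the shape is listed innermost dimension first.
std::vector<std::size_t> collapse_to_3d(const std::vector<std::size_t> &shape)
{
    std::vector<std::size_t> out(3, 1); // missing dimensions default to 1
    for (std::size_t i = 0; i < shape.size(); ++i)
    {
        if (i < 2)
        {
            out[i] = shape[i]; // the two transposed dimensions are kept
        }
        else
        {
            out[2] *= shape[i]; // everything else becomes "batch"
        }
    }
    return out;
}
```

With a 4D shape of 4 x 3 x 2 x 5, the helper returns 4 x 3 x 10, so the ten 2D slices can be handled in one dispatch instead of one dispatch per slice.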
|
|
- Only support 1x1 blocks, i.e. n0=1, m0=1.
- Dilation not supported yet.
Resolves: COMPMID-6258
Signed-off-by: ramy.elgammal@arm.com <ramy.elgammal@arm.com>
Change-Id: I1dcfd7640fb40e112736dedc81847f7b1b50dba2
Signed-off-by: Adnan AlSinan <adnan.alsinan@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10411
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
* Added a new test to make sure we support the following configuration:
NCHW
InputInfo=Shape=2,2
WeightsInfo=Shape=3,3
OutputInfo=Shape=4,4
PadStrideInfo=1,1;0,0,0,0
* Fixed the validate() method to allow this configuration
* Resolves MLCE-1120
Change-Id: I6874ad57bb81384185984741b983bf5e19ba150c
Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10417
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
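
The commit message does not name the operator, but the shapes in the new test are consistent with a transposed convolution (deconvolution): a 2x2 input with 3x3 weights, stride 1 and no padding yields a 4x4 output. The sketch below checks that arithmetic under this assumption; `transposed_conv_output_dim` is a hypothetical name, and dilation and output padding are assumed to be absent.

```cpp
#include <cassert>
#include <cstddef>

// Output extent of a transposed convolution along one axis, assuming
// no dilation and no output padding:
//   out = (in - 1) * stride + kernel - pad_before - pad_after
// Assumption: in >= 1 and the padding does not exceed the unpadded extent.
std::size_t transposed_conv_output_dim(std::size_t in,
                                       std::size_t kernel,
                                       std::size_t stride,
                                       std::size_t pad_before,
                                       std::size_t pad_after)
{
    return (in - 1) * stride + kernel - pad_before - pad_after;
}
```

For the configuration listed above, `transposed_conv_output_dim(2, 3, 1, 0, 0)` evaluates to (2 - 1) * 1 + 3 = 4 along each axis, which matches OutputInfo=Shape=4,4.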
|
|
- Fix the reference axis vector to be the right size.
- Fix typos in the error messages.
Resolves COMPMID-6574
Signed-off-by: Adnan AlSinan <adnan.alsinan@arm.com>
Change-Id: I9572365b8173b92d0fffd557e4db261b2969109c
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10423
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
|
|
Several test optimizations have been introduced into Winograd tests for Gpu and Cpu backends. The testing strategy has been detailed as a comment header in the test design files.
In summary:
- Very large shapes in the nightly tests are made smaller
- If the underlying kernel is the same for different data types, we only need to stress some key aspects of the kernels (e.g. read/write lengths in case of fp32/fp16).
- In case the underlying kernel is the same (OpenCL), Fp16 is tested on a subset of the shapes
- In Cpu, there is no need to test every combination for both NCHW and NHWC as we just permute the inputs and use NHWC kernels anyway
- Not every activation needs to be tested for every shape
Resolves: COMPMID-6464
Change-Id: Ie25fded85c65b9c7386dc21b23f9b695b1e77b07
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10393
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: COMPMID-6476, COMPMID-6477
Change-Id: Ied37c269d5a108ff72f70e3ad932cf372bda5562
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10346
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Clang-format options now match those in clang-format version 14.
Remove Astyle checks as the same code style checks are provided by clang-format.
Resolves: COMPMID-6576
Change-Id: Iefa9bb719826242a3276e9ca058d0c84624f7302
Signed-off-by: Felix Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
Signed-off-by: Jakub Sujak <jakub.sujak@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10399
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: COMPMID-6474
Change-Id: Iaff5b512cf77975f2df02dcdf848711b13bf97a6
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10341
Reviewed-by: Mohmun02 <MohammedSuhail.Munshi@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
* The current implementation has significant inaccuracy
and the issue cascades to GELU.
* Use the implementation from Arm® Optimized Routines.
The maximum error is 1.93 ULP.
Resolves: COMPMID-6554
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Change-Id: If80131e164b7a078e34dd8e05b1506698f31d17a
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10395
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: TeresaARM <teresa.charlinreyes@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
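
An error bound such as the 1.93 ULP quoted above is stated in units in the last place. The sketch below shows one simplified way to count how many representable floats separate two values. It is an illustration only: `ulp_distance` is a hypothetical helper, it assumes 32-bit IEEE 754 floats and finite inputs, and it returns whole steps, whereas a fractional figure like 1.93 comes from comparing against a higher-precision reference.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Map a float's bit pattern to an integer that increases monotonically
// with the float's value (assumes IEEE 754 binary32, finite input).
std::int64_t ordered_bits(float x)
{
    std::int32_t bits;
    std::memcpy(&bits, &x, sizeof(bits));
    // Negative floats are sign-magnitude: turn them into negative offsets.
    return bits < 0 ? static_cast<std::int64_t>(INT32_MIN) - bits
                    : static_cast<std::int64_t>(bits);
}

// Hypothetical helper: number of representable floats between a and b.
std::int64_t ulp_distance(float a, float b)
{
    const std::int64_t d = ordered_bits(a) - ordered_bits(b);
    return d < 0 ? -d : d;
}
```

Adjacent floats are one step apart, and +0.0f and -0.0f are zero steps apart, which is the behaviour an accuracy test usually wants.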
|
|
Code is formatted as per a revised clang-format configuration
file (not part of this delivery). Version 14.0.6 is used.
Exclusion List:
- files with .cl extension
- files that are not strictly C/C++ (e.g. Android.bp, SConscript ...)
And the following directories:
- compute_kernel_writer/validation/
- tests/
- include/
- src/core/NEON/kernels/convolution/
- src/core/NEON/kernels/arm_gemm/
- src/core/NEON/kernels/arm_conv/
- data/
There will be a follow up for formatting of .cl files and the
files under tests/ and compute_kernel_writer/validation/.
Signed-off-by: Felix Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
Change-Id: Ib7eb1fcf4e7537b9feaefcfc15098a804a3fde0a
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10391
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
|
|
- Add support for negative axis values.
- Add option to use opposite ACL convention for dimension addressing.
- Add validation tests for the mentioned additions.
Resolves COMPMID-6497
Change-Id: I9174b201c3adc070766cc6cffcbe4ec1fe5ec1c3
Signed-off-by: Adnan AlSinan <adnan.alsinan@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10335
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
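
The two axis options in this commit can be sketched as a single normalization step. The commit message does not give the implementation, so the following is an assumption about the intended semantics: a negative axis wraps around from the end, and the "opposite convention" flag counts dimensions from the other end of the shape. `normalize_axis` is a hypothetical name.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical helper. `rank` is the number of tensor dimensions.
// Step 1: wrap a negative axis, so -1 refers to the last axis.
// Step 2: if `inverted` is set, count from the opposite end of the shape.
int normalize_axis(int axis, int rank, bool inverted)
{
    if (axis < -rank || axis >= rank)
    {
        throw std::out_of_range("axis is outside [-rank, rank)");
    }
    const int wrapped = axis < 0 ? axis + rank : axis;
    return inverted ? rank - 1 - wrapped : wrapped;
}
```

For a rank-4 tensor, axis -1 maps to 3 in the default convention and to 0 when the inverted convention is requested.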
|
|
Resolves COMPMID-6458
Change-Id: I1068da3dee6b6f58e4179f5a92521a6d6457e6c4
Signed-off-by: Anitha Raj <anitha.raj@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10380
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
The inclusion order of headers is changed as a preparatory step
for applying clang-format.
Change-Id: I0c529f896ba802dfc6f30a573cdc9d9a24f3081c
Signed-off-by: Felix Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10379
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
* Code guarded with __ARM_FEATURE_FP16_VECTOR_ARITHMETIC needs
to be moved to an fp16.cpp file to allow compilation with
-march=armv8.2-a+fp16
* fp16.cpp needs to use the template select_op() which had to be moved from impl.cpp to fp16.cpp
* Partially resolves MLCE-1102
Change-Id: Ic9e73e121482fcc5e4fcbe8ae1ecd23649cbd3d1
Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10359
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
|
|
* Code guarded with __ARM_FEATURE_FP16_VECTOR_ARITHMETIC needs
to be moved to an fp16.cpp file to allow compilation with
-march=armv8.2-a+fp16
* fp16.cpp needs to use the template max_unpooling() which had to be moved from impl.cpp to impl.h
* Partially resolves MLCE-1102
Change-Id: Iabf9a9ba9d2441032f931f33aad97acc3e332575
Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10362
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
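
The reason a template has to move from impl.cpp to impl.h is that fp16.cpp can only instantiate it for a half-precision type if the definition is visible there. The single-file sketch below imitates that layout with comments marking the hypothetical file boundaries. The function names are invented for illustration, the loop is a simplified stand-in for max unpooling, and `__fp16` is assumed to be available whenever the feature macro is defined.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// ---- impl.h (hypothetical): the template lives in a header so that
// every translation unit, including fp16.cpp, can instantiate it. ----
template <typename T>
void max_unpool_sketch(const T *pooled, const std::uint32_t *indices,
                       T *out, std::size_t count)
{
    // Write each pooled value back to the position it was taken from.
    for (std::size_t i = 0; i < count; ++i)
    {
        out[indices[i]] = pooled[i];
    }
}

// ---- fp32.cpp (hypothetical): built with the baseline -march flags. ----
void max_unpool_fp32(const float *pooled, const std::uint32_t *indices,
                     float *out, std::size_t count)
{
    max_unpool_sketch<float>(pooled, indices, out, count);
}

// ---- fp16.cpp (hypothetical): the only file built with
// -march=armv8.2-a+fp16, so the guard is satisfied only here. ----
#if defined(__ARM_FEATURE_FP16_VECTOR_ARITHMETIC)
void max_unpool_fp16(const __fp16 *pooled, const std::uint32_t *indices,
                     __fp16 *out, std::size_t count)
{
    max_unpool_sketch<__fp16>(pooled, indices, out, count);
}
#endif
```

Keeping all guarded code in one file means only that file needs the FP16 compiler flag, while the rest of the library is built for the baseline architecture.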
|
|
Signed-off-by: Paolo Tricerri <paolo.tricerri@arm.com>
Change-Id: If4e2944e25e48c8b7a1a6713e57838d449a987ea
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10366
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Switch the loop unrolling order and reuse the pre-computed vectors
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Change-Id: I636c0530d6b21dae4dbb371c57d18b1f7c7246a8
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10355
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
* Code guarded with __ARM_FEATURE_FP16_VECTOR_ARITHMETIC needs
to be moved to an fp16.cpp file to allow compilation with
-march=armv8.2-a+fp16
* fp16.cpp needs to use the templates l2_normalize_x() and
l2_normalize_yz(), which had to be moved from impl.cpp to impl.h
* Removed impl.cpp
* Partially resolves MLCE-1102
Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
Change-Id: Id00a823730108293fc712295a178dad80588af30
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10344
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
* Code guarded with __ARM_FEATURE_FP16_VECTOR_ARITHMETIC needs
to be moved to an fp16.cpp file to allow compilation with
-march=armv8.2-a+fp16
* fp16.cpp needs to use the templates vector_matrix_multiply_f16() and
matrix_matrix_multiply_f16(), which had to be moved from impl.cpp to fp16.cpp
* Partially resolves MLCE-1102
Change-Id: Ic87440797d6f1653c815ab6565972206f5afd0ad
Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10345
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: COMPMID-6558
Change-Id: I015d504aaa9b8a1a232b01e49ab373d415ea1de9
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10340
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Reviewed-by: TeresaARM <teresa.charlinreyes@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
* Two implementations of the command buffer are added:
- CLMutableCommandBuffer uses mutable dispatch command buffer
extension.
- CLCompatCommandBuffer is the compatibility class for platform
without the CL extension.
Resolves: COMPMID-6454
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Change-Id: I15b370a50168ca940bd8fb2b5fae26230da3f472
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10298
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
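
Choosing between the two implementations named in this commit can be pictured as a small factory. The sketch below is an assumption about how such a selection might look, not the library's interface: `CommandBufferSketch`, `MutableSketch`, `CompatSketch` and `make_command_buffer` are all hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical common interface for both command buffer variants.
struct CommandBufferSketch
{
    virtual ~CommandBufferSketch() = default;
    virtual std::string kind() const = 0;
};

// Stand-in for the variant that relies on the mutable dispatch extension.
struct MutableSketch : CommandBufferSketch
{
    std::string kind() const override { return "mutable"; }
};

// Stand-in for the fallback used on platforms without the extension.
struct CompatSketch : CommandBufferSketch
{
    std::string kind() const override { return "compat"; }
};

// Pick the variant once, based on what the device reports.
std::unique_ptr<CommandBufferSketch> make_command_buffer(bool has_mutable_dispatch)
{
    if (has_mutable_dispatch)
    {
        return std::unique_ptr<CommandBufferSketch>(new MutableSketch());
    }
    return std::unique_ptr<CommandBufferSketch>(new CompatSketch());
}
```

Callers then use the common interface and do not need to know which variant is active.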
|
|
Resolves: COMPMID-6475
Change-Id: Ic867cdfff5d4391cb749a04bf7cc35cda63d3b71
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10311
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: COMPMID-6212
Signed-off-by: Omar Al Khatib <omar.alkhatib@arm.com>
Signed-off-by: Adnan AlSinan <adnan.alsinan@arm.com>
Change-Id: I29bbd9a3d96af462faf7f0ee13b9849f75e05356
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10319
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
|
|
* Code guarded with __ARM_FEATURE_FP16_VECTOR_ARITHMETIC needs
to be moved to an fp16.cpp file to allow compilation with
-march=armv8.2-a+fp16
* fp16.cpp needs to use the template compute_all_anchors() that
had to be moved from impl.cpp to impl.h
* Partially resolves MLCE-1102
Change-Id: Iaff6da32d0b9789ef87ba3f95bef99343612bd01
Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10309
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
* Code guarded with __ARM_FEATURE_FP16_VECTOR_ARITHMETIC needs
to be moved to an fp16.cpp file to allow compilation with
-march=armv8.2-a+fp16
* fp16.cpp needs to use the template fused_batch_normalization_dwc_nhwc() that
had to be moved from impl.cpp to impl.h
* Removed impl.cpp
* Partially resolves MLCE-1102
Change-Id: Idaaa113c71729e32e565acf5fb5694c76c36d76d
Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10308
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
This patch fixes some include dependencies in certain files that caused build failures in https://review.mlplatform.org/c/ml/ComputeLibrary/+/10287.
It also circumvents some clang-format glitches.
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Change-Id: I8e9d3307edd2d1afd17c685c9bc9429624130e5a
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10313
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: <felixjohnny.thomasmathibalan@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
The skeleton code consists of modifications
- to build the library with the quantized matmul kernel
- refactoring of some common utilities
- empty OpenCL Kernels for four configurations ([Lhs, Rhs] X [Nt, t])
- some validation tests and skeleton for functional tests
Resolves: COMPMID-6473
Change-Id: Id8401f789d34277dceb1f91afd68c9c88275618a
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10273
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
* Code guarded with __ARM_FEATURE_FP16_VECTOR_ARITHMETIC needs
to be moved to an fp16.cpp file to allow compilation with
-march=armv8.2-a+fp16
* fp16.cpp needs to use various templates that had to be moved from
impl.cpp to impl.h
* Partially resolves MLCE-1102
Change-Id: I2e5e68fbcf5279de1ffc1be4def4f96ed05593e9
Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10224
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
* Code guarded with __ARM_FEATURE_FP16_VECTOR_ARITHMETIC needs
to be moved to an fp16.cpp file to allow compilation with
-march=armv8.2-a+fp16
* Partially resolves MLCE-1102
Change-Id: If53ff1927948b3ad7c9e3c9347bc2af38764e342
Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10243
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
* Code guarded with __ARM_FEATURE_FP16_VECTOR_ARITHMETIC needs
to be moved to an fp16.cpp file to allow compilation with
-march=armv8.2-a+fp16
* fp16.cpp needs to use the template in_bounds_crop_window so it had to be moved from
impl.cpp to impl.h
* Removed the file src/cpu/kernels/crop/generic/neon/impl.cpp
* Partially resolves MLCE-1102
Change-Id: I1953849153e672ff7938f54c877c7498117dcca4
Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10282
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
* Code guarded with __ARM_FEATURE_FP16_VECTOR_ARITHMETIC needs
to be moved to an fp16.cpp file to allow compilation with
-march=armv8.2-a+fp16
* Partially resolves MLCE-1102
Change-Id: I7e6d998e427982d4a037dbce6d17ca378665e07f
Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10241
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
* Code guarded with __ARM_FEATURE_FP16_VECTOR_ARITHMETIC needs
to be moved to an fp16.cpp file to allow compilation with
-march=armv8.2-a+fp16
* Partially resolves MLCE-1102
Change-Id: I04822b043d9f87bc666750a8d95a8be8a6cc194d
Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10239
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
* Code guarded with __ARM_FEATURE_FP16_VECTOR_ARITHMETIC needs
to be moved to an fp16.cpp file to allow compilation with
-march=armv8.2-a+fp16
* Partially resolves MLCE-1102
Change-Id: I5ecfc8f6c0d84f92d80bec2cde6e7338794b9788
Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10240
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
- Add a test case with src and dst having same row size
- Remove inline from has_holes() util function
Related to COMPMID-6504
Change-Id: Iead1f17692dc57b66c5d9f01eed30169efaee0a5
Signed-off-by: Anitha Raj <anitha.raj@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10190
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
PostOps was the experimental interface for Dynamic Fusion. It is now
replaced by the new Dynamic Fusion interface with code generation using
the Compute Kernel Writer.
Resolves: COMPMID-6190
Change-Id: I813b48facef2fd6f3aee332588886b4f9b3d33d8
Signed-off-by: Jakub Sujak <jakub.sujak@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10219
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
* Code guarded with __ARM_FEATURE_FP16_VECTOR_ARITHMETIC needs
to be moved to an fp16.cpp file to allow compilation with
-march=armv8.2-a+fp16
* fp16.cpp needs to use the template run_depthwise_float() so it had to be moved from
impl.cpp to impl.h
* Partially resolves MLCE-1102
Change-Id: I428a79c4ab3a990331f20f5bd6b9fea88b0836b9
Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10218
Reviewed-by: SiCong Li <sicong.li@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
* Code guarded with __ARM_FEATURE_FP16_VECTOR_ARITHMETIC needs
to be moved to an fp16.cpp file to allow compilation with
-march=armv8.2-a+fp16
* fp16.cpp needs to use various templates that had to be moved from
impl.cpp to impl.h
* Removed src/cpu/kernels/pool3d/neon/impl.cpp
* Partially resolves MLCE-1102
Change-Id: I71e6a54a27fd8f04ae2a67231709aad723b09fa3
Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10220
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- Fixes a bug when using FP16 constant in some cases.
- Adds op_write_raw_code to handle some special cases.
- Ports MxN pooling 2d layer into ckw.
- Adds unary function 'negate' to ckw.
- Updates pool2d validation tests to include store op.
Resolves COMPMID-6263
Signed-off-by: Adnan AlSinan <adnan.alsinan@arm.com>
Change-Id: If8c683761fead79bd519aef28cc65de78d3ec629
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10172
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Use Compute Kernel Writer (CKW) to generate code for Resize operator in
the Dynamic Fusion interface.
Supports Nearest Neighbor and Bilinear interpolation methods.
Resolves: COMPMID-6265
Change-Id: Ib0a5158bd4208123c84f6a1dc54f29d82fd55dcd
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Signed-off-by: Jakub Sujak <jakub.sujak@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10174
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
* Code guarded with __ARM_FEATURE_FP16_VECTOR_ARITHMETIC needs
to be moved to an fp16.cpp file to allow compilation with
-march=armv8.2-a+fp16
* fp16.cpp needs to use the template roi_align() so it had to be moved from
impl.cpp to impl.h
* Removed the file src/cpu/kernels/roialign/generic/neon/impl.cpp
* Partially resolves MLCE-1102
Change-Id: If78371479042725723cea6f6c65aac76d68a1c1d
Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10213
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Inline assembler blocks attempting to bind 8 integer
registers don't compile in certain configurations (notably GCC 13.2 debug
builds with -O0 -g). Fix this by splitting the offending block into two
separate parts (straightforward as there is no flow control in the block).
Fixes: COMPMID-6532
Signed-off-by: David Mansell <David.Mansell@arm.com>
Change-Id: I80e9a10e6a91574176d50e63c45fab055aefa659
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10197
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Emanuele Rocca <ema@linux.it>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
* Enable fp16 in armv8a multi_isa builds
* Code guarded with __ARM_FEATURE_FP16_VECTOR_ARITHMETIC needs
to be moved to an fp16.cpp file to allow compilation with
-march=armv8.2-a+fp16
* fp16.cpp needs to use the template add_same_neon() so it had to be moved from
impl.cpp to impl.h
* Partially resolves MLCE-1102
Change-Id: Ia51007f5e663b708071958bb94bfab4535e4b2f8
Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10191
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
* Code guarded with __ARM_FEATURE_FP16_VECTOR_ARITHMETIC needs
to be moved to an fp16.cpp file to allow compilation with
-march=armv8.2-a+fp16
* fp16.cpp needs to use the template add_same_neon() so it had to be moved from
impl.cpp to impl.h
Change-Id: I9e64a3101958fcb9c3d5c8e9b148b498b2bee05f
Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10154
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
- Following CpuReshapeKernel Optimizations, update the CpuGemmConv2D and CpuFlatten
to use CpuReshape operator instead of CpuReshapeKernel
- Minor changes to comment in NEReorgLayerKernel.h
Resolves COMPMID-6504
Signed-off-by: Anitha Raj <anitha.raj@arm.com>
Change-Id: Ib6ee1fdc313d91249f9fe41c81e73324031c1ff4
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10186
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Signed-off-by: David Mansell <David.Mansell@arm.com>
Change-Id: I359ed0703f4036e017b34b622f76b630cefac973
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10183
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves COMPMID-5279
Change-Id: Id9b007eed62c200702bbfcc83b94dab7b5de1714
Signed-off-by: Anitha Raj <anitha.raj@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9962
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
* Take dilation into account when checking padding.
Resolves: COMPMID-6348
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Change-Id: I897a13ba7f37382733c35c1701d1ec310ed55331
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10147
Reviewed-by: SiCong Li <sicong.li@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
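
Dilation widens the region a kernel covers, which is why a padding check needs to account for it. The sketch below shows the standard extent formula together with one possible check. The commit message does not state the actual rule, so `padding_fits` and its condition are purely illustrative assumptions.

```cpp
#include <cassert>
#include <cstddef>

// Extent covered by a kernel with `kernel` taps spaced `dilation` apart:
//   extent = (kernel - 1) * dilation + 1
// Assumption: kernel >= 1 and dilation >= 1.
std::size_t dilated_kernel_extent(std::size_t kernel, std::size_t dilation)
{
    return (kernel - 1) * dilation + 1;
}

// Hypothetical rule for illustration: padding must be smaller than the
// dilated extent, not merely smaller than the raw kernel size.
bool padding_fits(std::size_t pad, std::size_t kernel, std::size_t dilation)
{
    return pad < dilated_kernel_extent(kernel, dilation);
}
```

A 3-tap kernel with dilation 2 spans 5 elements, so a padding of 3 that would be rejected without dilation is accepted once dilation is considered.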
|
|
Resolves: COMPMID-6495
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Change-Id: I916829222a6211fa096a833a2afc5fab5eb34ea4
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10143
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
* Add helper functions to check whether command buffer extensions
exist in CL device.
Resolves: COMPMID-6453
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Change-Id: Ibc287e4526e54be4702241ab8ca0cea0b8661b3a
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10130
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Anitha Raj <Anitha.Raj@arm.com>
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
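
An extension check of this kind usually comes down to searching the space-separated string that an OpenCL device reports for `CL_DEVICE_EXTENSIONS`. The sketch below shows whole-token matching on such a string. `has_extension` is a hypothetical name, and the Khronos extension names used in the usage note are an assumption about which extensions the commit refers to.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: true if `name` appears as a whole token in the
// space-separated extension list. Whole-token matching matters because
// one extension name can be a prefix of another.
bool has_extension(const std::string &extension_list, const std::string &name)
{
    std::istringstream tokens(extension_list);
    std::string token;
    while (tokens >> token)
    {
        if (token == name)
        {
            return true;
        }
    }
    return false;
}
```

For example, a device that lists only `cl_khr_command_buffer_mutable_dispatch` does not match a query for `cl_khr_command_buffer`, which a plain substring search would get wrong.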
|