Age | Commit message | Author |
|
The SVE implementation of ElementwiseDiv does not require an s32
specialization and can use the generic implementation.
Resolves: COMPMID-7159
Change-Id: I4a36831dc714f2d26b83f58b3e56d0d4038e0113
Signed-off-by: Yevgen Pronenko <yevgen.pronenko@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11776
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves COMPMID-7172
Change-Id: I0acac5e4cb24056a88b4356d9239b33721d65d13
Signed-off-by: Michael Tyler <michael.tyler@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11762
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Suhail M <MohammedSuhail.Munshi@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Partially Resolves: COMPMID-6926
Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com>
Change-Id: I9d13c4319042f639a8c5be385b63857d77fefff2
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11768
Reviewed-by: Michael Tyler <michael.tyler@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
This wrapper allows us to utilize the functionality of CpuGemm
without directly exposing the source code.
Change-Id: I408630f52acd610c912e5c5fa02bfee5f884471e
Signed-off-by: Ryo Suzuki <ryo.suzuki@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11607
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
- Add support for mixed sign quantized convolution.
- Add support for mixed sign dequantized GEMM.
- Add SME FP16 GEMV kernel.
- Change SME vector length function to use RDSVL instead of static variable.
- Add GEMM dilation support internally (not exposed yet).
- Remove unused "get_default_activation_values" functions.
- Add SVE fixed format interleaved BF16 DOT kernel.
- Updates and optimizations to assembly kernels.
Resolves COMPMID-6926
Change-Id: I227f502502611d4cc4111c89e30c53ce94079544
Signed-off-by: Michael Tyler <michael.tyler@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11570
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
The TEMP file setup is currently unavailable on the Windows(R)
operating system because the RANLIBCOM variable is missing.
For now, restrict the fix to POSIX(TM) operating systems.
Signed-off-by: Quoc Khanh Le <QuocKhanh.Le@arm.com>
Change-Id: Ia347a488efea5eceba9a11bde88fda2dcf88c1d5
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11743
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Ramy Elgammal <ramy.elgammal@arm.com>
|
|
Resolves: COMPMID-6947
Signed-off-by: Sangwon Ha <sangwon.ha@arm.com>
Change-Id: I7fcf4f41d2961edf1fdf05e8f0b538a94f75295a
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11710
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Ramy Elgammal <ramy.elgammal@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
SCons may fail during the building or linking process if the path
exceeds the maximum character limit. To address this, support for
using TEMPFILE has been added to handle excessively long command
line strings.
Signed-off-by: Quoc Khanh Le <QuocKhanh.Le@arm.com>
Change-Id: Ic94e7f087f6d044602bdc1fe3af0d0836cb22a3e
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11590
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
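
The general idea behind passing a long command line through a temporary file can be sketched as follows. This is a minimal illustration and not the SCons implementation: the function name build_command, the MAX_CMDLINE_CHARS limit, and the ".rsp" suffix are all assumptions made for the example, and it only helps with tools that accept an "@file" response-file argument.

```python
import os
import tempfile

# Hypothetical limit for illustration; real limits depend on the OS and shell.
MAX_CMDLINE_CHARS = 8000


def build_command(tool, args, max_chars=MAX_CMDLINE_CHARS):
    """Return (argv, response_file_path) for running `tool` with `args`.

    If the joined command line fits within max_chars, argv is the plain
    command and response_file_path is None. Otherwise the arguments are
    written to a temporary file, one per line, and argv becomes
    [tool, "@<path>"]. The caller is responsible for deleting the file.
    """
    args = list(args)
    full = " ".join([tool] + args)
    if len(full) <= max_chars:
        return [tool] + args, None

    fd, path = tempfile.mkstemp(suffix=".rsp", text=True)
    with os.fdopen(fd, "w") as handle:
        handle.write("\n".join(args))
    return [tool, "@" + path], path
```

A caller would run the returned argv and remove the response file afterwards if one was created.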
|
|
- Report the fix for the out-of-bound memory write in the non-optimized
FP16 GeMM kernel.
Resolves: COMPMID-6904
Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com>
Change-Id: Ib06a5e6e70c9d86e422ab3b82a137ba46449f392
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11713
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
* The non-optimized FP16 GeMM kernel has an out-of-bound memory write.
- This doesn't affect optimized assembly kernels.
- This bug writes 1 extra FP16 value to the destination tensor.
Resolves: COMPMID-6904
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Change-Id: I26b8ebcd15680b25c97c4b7e331996f397692447
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11706
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com>
Change-Id: I46f936f3c503d4801c4dba85900cee00bc372683
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11690
Reviewed-by: Suhail M <MohammedSuhail.Munshi@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
* Enable FP16 kernels in
NEROIAlignLayerKernel
NEComputeAllAnchorsKernel
NEBoundingBoxTransformKernel
NEInstanceNormalizationLayerKernel
NEBatchNormalizationLayerKernel
* The FP16 kernels were disabled due to the use of __ARM_FEATURE_FP16_VECTOR_ARITHMETIC
* Resolves MLCE-1305
Change-Id: Ib8dd3cad631667018b25db4ba76007dbfb4bf5a5
Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11677
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
* The softmax kernel was using SME2 instructions on non-SME2 devices
* Resolves MLCE-1304
Change-Id: I9d7d94443e7c9df4e7c1a05eeef6838f530b357b
Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11676
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
|
|
Resolves ONCPUML-1648 and ONCPUML-1539
Signed-off-by: Hamza Butt <hamza.butt@arm.com>
Change-Id: Ib70a4f8cef61c2979dfd265c0755c541930ee563
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11575
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Signed-off-by: Michael Kozlov <michael.kozlov@arm.com>
Change-Id: I43d59bfbf932a37e7bda7dcf4f447f12237e0fa8
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11612
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: <felixjohnny.thomasmathibalan@arm.com>
Comments-Addressed: <felixjohnny.thomasmathibalan@arm.com>
|
|
Resolves: COMPMID-6901
Change-Id: Idcd3f5f5d90f4073aaf116c0586e46013fbd64f7
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11605
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
The placeholders will be replaced only in the release branches instead of main. This will also help with commit automation.
Partially Resolves: COMPMID-7020
Change-Id: I6d68dcef2f2d07181ce5d61892b10adbfd4cd575
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11538
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
|
|
1. Remove the unnecessary restriction that applied the exclusion only on systems with LITTLE, MID, and BIG cores.
2. Allow the suggested number of threads to be overridden when the user sets the number of threads to a lower value.
Resolves [COMPMID-7014]
Signed-off-by: Omar Al Khatib <omar.alkhatib@arm.com>
Change-Id: Ifb76ef4454f38dd2e3e5781b5dfea07c044aeb74
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11604
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com>
|
|
On systems with BIG/MID/LITTLE cores, we need to exclude the LITTLE cores.
This patch changes CPUInfo to detect the number of LITTLE cores and sets num_threads to TOTAL_CORES - NUM_LITTLE.
Resolves [COMPMID-7014]
Signed-off-by: Omar Al Khatib <omar.alkhatib@arm.com>
Change-Id: I3e1772e5b64d1c45304860be43233b7e5dd8dba1
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11565
Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
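
The thread-count rule described above (TOTAL_CORES - NUM_LITTLE) can be sketched as follows. This is only an illustration: the function name and the string labels for core kinds are hypothetical, how CPUInfo actually classifies cores is not shown in the commit message, and the fallback for systems without a mix of core kinds is an assumption added so that the result is never zero.

```python
def suggested_num_threads(core_kinds):
    """Suggest a thread count that leaves out the LITTLE cores.

    core_kinds holds one label per core, for example "LITTLE", "MID"
    or "BIG". These labels are hypothetical and used for illustration only.
    """
    kinds = list(core_kinds)
    total = len(kinds)
    little = sum(1 for kind in kinds if kind == "LITTLE")

    # Assumption: only exclude LITTLE cores when larger cores also exist,
    # otherwise a system made entirely of LITTLE cores would get 0 threads.
    if 0 < little < total:
        return total - little
    return total
```

For a system described as four LITTLE, three MID and one BIG core, this sketch suggests four threads.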
|
|
Resolves: COMPMID-7063
Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com>
Change-Id: Ife4d9f0b2644a649da45544b8789c51c15c9aebf
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11574
Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I03fd3821d3636418f529f3395eceeaa00d02664b
Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11562
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
|
|
Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com>
COMPMID-7058
Change-Id: I9c6d18a8fddaf335bcd1e8dd562fa3838c1ca4b2
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11561
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com>
|
|
* Resolves COMPMID-7059
Change-Id: If77e579199720b7234298d2dc844d88c05989bf9
Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11556
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: COMPMID-7054
Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com>
Change-Id: I68d125b81ad7f74b2594ccda8d6ec08beef1ebd7
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11555
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
* Resolves MLCE-1285
Change-Id: I22a37972aefe1c0f04accbc798baa18358ed8959
Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11552
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
- Enable FP16 code when building multi_isa for armv8a architecture in
order to run on higher architectures e.g. 8.2, 8.6.
- When this build runs on v8, validation stops it and flags that the
architecture does not support FP16.
Resolves: COMPMID-7013
Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com>
Change-Id: I0d445e2fade31c1156d7a6e142edf2a7f84d3622
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11544
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: COMPMID-7021
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Change-Id: I809bc6ecd2845dfe6ee5de20a902aea4d07f15a5
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11540
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Ramy Elgammal <ramy.elgammal@arm.com>
|
|
- Padding with batched scalar cases is unsupported; add checks for it.
- Add tests for scalar cases without padding.
Resolves: [COMPMID-7015]
Change-Id: Ib9cf5db990420ff4b442d003ef9424e365bee86d
Signed-off-by: Mohammed Suhail Munshi <MohammedSuhail.Munshi@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11536
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
In NEQuantizeLayer for QASYMM8_SIGNED, the rounding was inconsistent
between the unrolled loop and the leftover loop, which meant identical
values (e.g. 0.5) at different indices of a Tensor could round to
different values (0 or 1 in this case). We have changed vcvtaq to
vcvtnq to round to the nearest, with ties to even. This matches the
default fegetround setting, so it is a sensible default.
Relates-to: COMPMID-6994
Signed-off-by: Jonathan Deakin <jonathan.deakin@arm.com>
Change-Id: I8e7ecb1b8dbdd3e887697a92046af99ed33fc78f
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11532
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
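
The difference between the two rounding modes mentioned above can be illustrated in plain Python. This is a scalar sketch, not the NEON code: round_ties_away mimics the "ties away from zero" behaviour of vcvtaq, and round_ties_even mimics the "ties to even" behaviour of vcvtnq. Both function names are made up for the example.

```python
import math


def round_ties_away(x):
    """Round to nearest, ties away from zero (the vcvtaq behaviour)."""
    magnitude = abs(x)
    whole = math.floor(magnitude)
    # The subtraction is exact, so an exact .5 fraction is detected reliably.
    if magnitude - whole >= 0.5:
        whole += 1
    return -whole if x < 0 else whole


def round_ties_even(x):
    """Round to nearest, ties to even (the vcvtnq behaviour).

    Python 3's built-in round() already uses this rule for floats.
    """
    return round(x)


values = [0.5, 1.5, 2.5, -0.5, -2.5]
away = [round_ties_away(v) for v in values]
even = [round_ties_even(v) for v in values]
```

With the value 0.5 from the commit message, the first function gives 1 and the second gives 0, which is the mismatch that appeared when the two loops used different modes.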
|
|
Resolves: [COMPMID-6917]
Change-Id: Id8b96efd29f6c61dd43a371341c6e1fe087953e9
Signed-off-by: Omar Al Khatib <omar.alkhatib@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11509
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: [COMPMID-6897]
Signed-off-by: Mohammed Suhail Munshi <MohammedSuhail.Munshi@arm.com>
Change-Id: I70b1c3c5f0de8484fcb6c3b0cc0d0d8c059b0f58
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11525
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
SVE BF16 kernels need to check for svebf16(), not just bf16().
Change-Id: I89494aac40166eba59719bed9822194a48ac282d
Signed-off-by: David Mansell <David.Mansell@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11520
Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
As the reorder kernel is called with WeightFormat OHWIo8
for hardware that does not support it (e.g. vector length 128),
adapt the test case and add a kernel implementation for this edge case.
This fixes the mismatching values that appeared when the OHWIo8 fixture
was run with a vector length of 128.
Resolves: ONCPUML-1523, COMPMID-6281
Signed-off-by: Radu Salavat <radu.salavat@arm.com>
Change-Id: Iaa1a3b486d1725a2d6031051aa544082c1bbe913
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11421
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I69aa973e61df950060807a31230a1edd91add498
Signed-off-by: David Mansell <David.Mansell@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11514
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: COMPMID-6899
Change-Id: I3743f2c9e5c21e1ec9f4c81d08c148666afad33a
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11505
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Reviewed-by: Sang Won Ha <sangwon.ha@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
accumulated
Similar to https://review.mlplatform.org/c/ml/ComputeLibrary/+/11500, s8f32 kernels do not support accumulate mode. This patch modifies the kernel selection and also adds more tests to stress these cases better.
Partially Resolves: COMPMID-6995
Change-Id: I40e19446c012eb7334e4511e254cce0d635aa234
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11503
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Radu Salavat <radu.salavat@arm.com>
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
SME2 kernels use a different accumulation buffer, and the destination tensor is not copied to this buffer as the initial value, which causes mismatches. This patch modifies the kernel selection algorithm so that it does not select SME2 kernels if accumulation is required.
Resolves: COMPMID-6995
Change-Id: I82da3cba41729f938a046f26b41b63ff5716c02d
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11500
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
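
The selection rule described above can be sketched as a filter over an ordered candidate list. Everything here is hypothetical and for illustration only: the KernelCandidate record, its fields, and the kernel names do not come from the library, and the real selection logic considers many more properties.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KernelCandidate:
    # Hypothetical description of a kernel, used for illustration only.
    name: str
    requires_sme2: bool
    supports_accumulate: bool


def select_kernel(candidates, cpu_has_sme2, accumulate):
    """Return the first usable candidate, or None if nothing matches.

    Candidates are assumed to be ordered from most to least preferred.
    """
    for kernel in candidates:
        if kernel.requires_sme2 and not cpu_has_sme2:
            continue  # the CPU cannot run this kernel
        if accumulate and not kernel.supports_accumulate:
            continue  # the kernel would ignore the existing destination values
        return kernel
    return None


CANDIDATES = [
    KernelCandidate("sme2_example", requires_sme2=True, supports_accumulate=False),
    KernelCandidate("generic_example", requires_sme2=False, supports_accumulate=True),
]
```

On an SME2-capable CPU this sketch picks the first candidate for a plain multiplication and falls back to the second one once accumulation is requested.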
|
|
Resolves: COMPMID-6894, COMPMID-6896
Change-Id: I9d29fd3701a7e0f28d83f81a6c42a7234c2587c3
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11477
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Ramy Elgammal <ramy.elgammal@arm.com>
Dynamic-Fusion: Ramy Elgammal <ramy.elgammal@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
dequantization
Signed-off-by: Radu Salavat <radu.salavat@arm.com>
Change-Id: Ib17946b526d35deeca94b5d2f163b92101e313c4
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11420
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Partially Resolves: MLCE-1255
Change-Id: Ibadcfedd43530232c65f05e571bc8b4568a63e67
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11499
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
* All per-channel requantizing hybrid assembly kernels require
these buffers to be padded.
* Resolves MLCE-1255
Change-Id: I892b8ee9b31e079189ec72f3fc6da4ce5efda974
Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11491
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
* Building with openmp=1 cppthreads=0 caused a linker error in the
validation suite
Change-Id: I16d8a49e9190cd1288237d82583a0034e20a9f38
Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11483
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: [COMPMID-6893, COMPMID-6895, COMPMID-6898]
Change-Id: I355f46aeba2213cd8d067cac7643d8d96e713c93
Signed-off-by: Mohammed Suhail Munshi <MohammedSuhail.Munshi@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11430
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: [COMPMID-6891, COMPMID-6892]
Change-Id: I5b094fff1bff4c4c59cc44f7d6beab0e40133d8e
Signed-off-by: Mohammed Suhail Munshi <MohammedSuhail.Munshi@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11394
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: Ifec7015ad5712d8b84d65203a5fa21cbefcb04ad
Signed-off-by: Michael Kozlov <michael.kozlov@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11438
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: <felixjohnny.thomasmathibalan@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Partially Resolves: ONCPUML-1444, MLINFSW-439
Change-Id: Ic7498d6944df2848f3e82eaf4e11cc5cb6ef5754
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11424
Reviewed-by: Anitha Raj <Anitha.Raj@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Signed-off-by: Sunita Nadampalli <nadampal@amazon.com>
Change-Id: I21eca31d97d6e2ca8279adb9db65f11540e72689
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11396
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com>
|
|
- Add support for QASYMM8_SIGNED*QASYMM8_SIGNED->F32 in
CpuGemmLowpMatrixMultiplyCore
- Add s8f32 kernel using existing s8->s32 kernels with a new
DequantizeFloat OutputStage. The structure is similar to Requantize32,
but works in the opposite direction.
- Add SME s8f32 kernels with integrated support for DequantizeFloat.
- Add scale to CpuGemmLowpOffsetContributionKernel.
- Add virtual dequantize scale to gemm_common, only implemented for
gemm_interleaved.
- Update year to 2024 in generate_build_files.
- Add dynamic flag to QuantizationInfo which signals to operators that
it can change after configuration
- Add support for dynamic quantization in NEGEMMLowpMatrixMultiplyCore
- Add dynamic quantization fixture by extending
GEMMLowpGenericMatrixMultiplyCoreValidationFixture
- Add GEMMLowpDequantizedMatrixMultiplyValidationFixture
- Store k (number of cols of A) rather than k_offset in the offset
contribution kernels so that we can recompute it when the other
offsets change
relates to: ONCPUML-1444 MLINFSW-439
Co-authored-by: Milos Puzovic <Milos.Puzovic@arm.com>
Co-authored-by: David Mansell <David.Mansell@arm.com>
Change-Id: I58a3acf2c09289a303e52eea6b336a696a5bc8da
Signed-off-by: Jonathan Deakin <jonathan.deakin@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11022
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
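
The arithmetic behind an integer GEMM followed by a float dequantization step can be sketched with the standard asymmetric quantization model, real = scale * (q - offset). This is a plain-Python illustration under that assumption, not the library's kernels: the function names are made up, and the library's own sign conventions for offsets may differ. It also shows why keeping k (the number of columns of A) is enough to recompute the k * a_offset * b_offset term whenever an offset changes.

```python
def dequantized_gemm(qa, qb, a_scale, a_offset, b_scale, b_offset):
    """Integer matrix product with offset corrections, then scaled to float.

    qa is an m x k list of lists of ints, qb is a k x n list of lists of ints.
    """
    m, k, n = len(qa), len(qb), len(qb[0])
    row_sums = [sum(row) for row in qa]
    col_sums = [sum(qb[p][j] for p in range(k)) for j in range(n)]
    # Depends only on k and the two offsets, so it can be rebuilt
    # cheaply if the offsets change after configuration.
    k_term = k * a_offset * b_offset
    scale = a_scale * b_scale

    out = []
    for i in range(m):
        out_row = []
        for j in range(n):
            acc = sum(qa[i][p] * qb[p][j] for p in range(k))  # s8 * s8 -> s32
            acc -= b_offset * row_sums[i]
            acc -= a_offset * col_sums[j]
            acc += k_term
            out_row.append(scale * acc)  # dequantize to float
        out.append(out_row)
    return out


def reference_gemm(qa, qb, a_scale, a_offset, b_scale, b_offset):
    """Dequantize every element first, then multiply in floating point."""
    a = [[a_scale * (v - a_offset) for v in row] for row in qa]
    b = [[b_scale * (v - b_offset) for v in row] for row in qb]
    k, n = len(b), len(b[0])
    return [[sum(row[p] * b[p][j] for p in range(k)) for j in range(n)] for row in a]
```

Both functions compute the same product; the first does all the heavy work in integers and applies the scale once per output element.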
|
|
Partially Resolves: ONCPUML-1442
Signed-off-by: Radu Salavat <radu.salavat@arm.com>
Change-Id: I681df5e9c399996fbc7dc362b906af151588ca44
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11416
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
|
|
Add checks for bf16 support to the bf16 fixed format tests.
This ensures tests pass in a multi_isa setting where the library was compiled
with bf16 support, even on systems that do not support bf16.
Also add a runtime check to GEMMConvolutionLayer/Float/BFLOAT16/RunSmall.
Resolves: COMPMID-6922
Signed-off-by: David Svantesson-Yeung <david.svantesson-yeung@arm.com>
Change-Id: Ic0f09ba34b5a2c64be8bfc848a4457a6b1c4d1c3
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11408
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|