Age | Commit message (Collapse) | Author |
|
Resolves: COMPMID-5719
Change-Id: I2f0911ffccce2b42a9a63fe6826eaa5d2cad06ba
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8831
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: COMPMID-5521
Change-Id: Id38a4ce18f9ea8805a151acb064e72795535d1a0
Signed-off-by: Jakub Sujak <jakub.sujak@arm.com>
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8859
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com>
Partially-Resolves: COMPMID-5522
Change-Id: I1d90003079c3f24d081cc49f7b110eda753f6995
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8838
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
* The dependency graph now can schedule any acyclic graph into
a sequential list of operators. This is needed as the output
operators now form branches in the graph.
* Fix the definition of input, output and intermediate tensors
in GpuKernelComponentGroup to support non-linear but sequential
list of operators.
* Add constraint on GpuOperatorGroup to enforce strictly linear
fusion style, but allow output operator as the only form of
branch.
Resolves: COMPMID-5771
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Change-Id: I68de3a31a2456145081f0a397e4e61dd66327682
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8823
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
* This retains the behaviors of the current library's validate() method.
I.e. is_supported_op() is used to check the support level and validity
of an operator configuration without any consideration of fusion.
- validate_op() interface still checks both the op validity and the
fusion validity
- This arrangement ensures that any users of the original validate()
interface can expect to use is_supported_op() in exactly the same way
with no performance or behavioral difference.
* Force adding const tensors to ArgumentPack when adding to
OperatorGroup. This is because OperatorGroup is only for validating
fusion, and does not mutate TensorInfos at all
Partially resolves COMPMID-5736
Change-Id: I4157677f55848d66a08ec00e6a76d13a24b722b7
Signed-off-by: SiCong Li <sicong.li@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8687
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
* bfloat16 cause build errors when building arch=armv8a multi_isa=1
* Resolves COMPMID-5793
Change-Id: I92114f1134c8d7bebeec556c42cfde778f9ba5bd
Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8871
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: [COMPMID-5466]
Signed-off-by: Omar Al Khatib <omar.alkhatib@arm.com>
Change-Id: I68af0bb54580bebd2ace1fba30aa73f7f68a4dbb
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8804
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves COMPMID-5780
Change-Id: I34c764cd1df652f8a938772924dc49baf6ac16db
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8825
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: COMPMID-5664
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Change-Id: Ica2fd82645d95bd64226a1950a013d8a9b9035eb
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8833
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
* Fixes various mismatches when converting FP32 to BF16 and
BF16 to FP32
* Fixed segfault when trying logging=1 and trying to log BF16
* Resolves MLCE-979
Change-Id: Ie517d0b7411b4e3a7fecdee588f0e073d290625a
Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8830
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
* The output of the fused operator must be explicitly specified
using GpuOutput operator.
* Any temporary tensors used to connect the output of an operator
to the input of another operator will be marked as no-alloc
and won't be allocated as a tensor in the memory.
Resolves: COMPMID-5771
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Change-Id: I5ae8e800f8f737db23a055a92b01c4f1d78c3bb8
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8794
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
This patch optimizes transposed convolution for QASYMM and QASYMM8_SIGNED types, by extending the transposed convolution kernel written for FP32/16.
Resolves: COMPMID-5723
Change-Id: Iab8f09231938adb949c506fd915ed45b885e5c7c
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8792
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Add the CLAMP activation function for GPU backend with generic activation Component and TemplateWriter modules.
CLAMP is internally implemented as LU_BOUNDED_RELU activation function with the alpha and beta variables swapped.
We do NOT consider in-place computation cases in this patch.
* CLAMP operator for GPU backend.
* Activation Component and TemplateWriter for CL backend.
* TemplateWriter generates tiled kernel code.
* Supported data types: F16, F32.
* Validation tests for CLAMP operation.
Resolves: COMPMID-5519
Change-Id: Ieb097d6b1e6a7ed2b882518e88314454efb402f6
Signed-off-by: Jakub Sujak <jakub.sujak@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8762
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
This patch significantly reduces the pressure of CL Deconvolution tests for QSYMM8_PER_CHANNEL on the precommit and migrates some of them to nightly, while adding smaller tests for precommit.
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Change-Id: I3ba16cb3ebc11b5f6015f97423b0496ee2449cc7
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8782
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com>
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: COMPMID-5735
Signed-off-by: Jakub Sujak <jakub.sujak@arm.com>
Change-Id: I3a3a3b103993be8c3413b55998b36df429be8260
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8780
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Reviewed-by: Mohmun02 <MohammedSuhail.Munshi@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
The operator is migrated into dynamic fusion for all data types supported
Resolves: COMPMID-5693
Change-Id: I3c550d3d1cd04570f453beae678c3f60d4cb1a73
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8755
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: COMPMID-5735
Change-Id: I9958413b69c5052cfa205dd0e9457cc4953aaf35
Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com>
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/c/VisualCompute/ComputeLibrary/+/474818
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: bsgcomp <bsgcomp@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8724
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: COMPMID-5664
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Change-Id: I4182752e213aade19005ee984a488c2490453f8f
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8747
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
- Implement indirect convolution kernel
- Add operator support
- Add test
Resolves COMPMID-5709
Change-Id: I9272304163471a5a40da7fdec204599f3c1d8e32
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8701
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
The fp32 mws selection variables are guarded because they are flagged as unused when data type support does not include fp32.
Resolves: COMPMID-5761
Change-Id: I7ac783e3d5ca51868b562ab879d03e02140b51a1
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8707
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
- Provide support for Add operator
- Auto initialize the destination tensor before testing fusion in conv2d
and elementwise binary ops.
Resolves: COMPMID-5518
Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com>
Change-Id: Ibd815020f02b57f88eea7c2921bdcf98605d99c5
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8617
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
* Add SME/SME2 detection.
* Integrate SME2 implementation for:
- Normal convolution
- Winograd
- Depthwise convolution
- Pooling
Resolves: COMPMID-5700
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Change-Id: I2f1ca1d05f8cfeee9309ed1c0a36096a4a6aad5c
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8692
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
This patch adds Depthwise Conv2d operator into dynamic fusion interface and adds the associated tests.
Resolves: COMPMID-5517
Change-Id: I385c94dff7fd40c72b8337ef797e508df4499a82
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8678
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
- Implement kernel (ClIndirectConv2dAddressPrecalculationKernel)
- Implement OpenCL kernel (indirect_convolution.cl)
- Add test
Resolves COMPMID-5708
Change-Id: If7408e37cbc6f9ad8506ff3334bc574e5d6763fb
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8661
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Signed-off-by: fadara01 <fadi.arafeh@arm.com>
Change-Id: Ieaa2fa4a6a69e7e0a48633967dabe91c786b42b7
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8682
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Public headers of the new experimental dynamic fusion can be found in arm_compute/dynamic_fusion/
New examples on how to use the interface can be found in tests/validation/dynamic_fusion/gpu/Integration.cpp
Resolves COMPMID-5683
Change-Id: I7ccb902a227fb487562df15fc3c30118d1d95bbd
Signed-off-by: SiCong Li <sicong.li@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8671
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Added approximate values for MWS for the following binary operators:
Add, Sub, Mul, Min, Max, Div
Change-Id: I5c4c75511129982a3f44c038ee272f09598469de
Signed-off-by: Fadi Arafeh <fadi.arafeh@arm.com>
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/c/VisualCompute/ComputeLibrary/+/459609
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Comments-Addressed: bsgcomp <bsgcomp@arm.com>
Signed-off-by: fadara01 <fadi.arafeh@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8392
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: ARMCL-592
Change-Id: If036a2b6d2842db2ed750e9d350bfaa3f7a76c9e
Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8668
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Fixes benchdnn test failures in ONCPUML-1104 when num_threads
is greater than workload size.
Signed-off-by: Crefeda Rodrigues <crefeda.rodrigues@arm.com>
Change-Id: Ic351a3ab5b548aa1843042a053130b02d0f1d40e
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8655
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
* Fix the heading and the code block.
Resolves: COMPMID-5546
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Change-Id: I60162b0e0aaf2a71a70e517aaeb8c75dd82d8dd9
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8652
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Partially resolves: COMPMID-5548
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Change-Id: I0f59bb1eaed45aef565d8cc435b59c45dc07f240
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8639
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Suhail Munshi <MohammedSuhail.Munshi@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- Regression is caused by the small default mws in ActivationLayer
- Syncronization of threads takes longer than the workload on small sized tensors.
- Size 1536 is chosen arbitrarily based on the size of tensors in benchmarked networks
Resolves: [COMPMID-5655]
Signed-off-by: Mohammed Suhail Munshi <MohammedSuhail.Munshi@arm.com>
Change-Id: I02e865b578399e75484f471e67806dd4cf7502c0
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/c/VisualCompute/ComputeLibrary/+/468454
Comments-Addressed: bsgcomp <bsgcomp@arm.com>
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8615
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- Fix includes int8/uint8 quantized inputs
- Bias S32 value is limited to better allow detection of mismatches in gemmlowp kernel
Resolves: [COMPMID-5659]
Signed-off-by: Mohammed Suhail Munshi <MohammedSuhail.Munshi@arm.com>
Change-Id: Ie9cca430c6ab66253fe1d5252bd2c5396c7f38cf
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8514
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves : [COMPMID-5698]
Signed-off-by: Omar Al Khatib <omar.alkhatib@arm.com>
Change-Id: I21ce976473e0e8807c14989e98e68aca69c7f1f3
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8603
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Partially resolves: COMPMID-5548
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Change-Id: I8d26dc871dfa83af2cf06e4e3997f5331235ff36
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8607
Reviewed-by: SiCong Li <sicong.li@arm.com>
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
This patch optimizes transposed convolution for CL backend by rewriting it in a single kernel instead of three (flip_kernel + upsample + conv). The new kernel skips the upsampling step which reduces the input space of convolution by stride_x * stride_y, resulting in significant performance improvement. It also skips the kernel flipping by traversing the weights accordingly, thus reduces the memory footprint.
Resolves: COMPMID-5676
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Change-Id: I8a333212dc7c5f7f0597aa58b0d56d44814baa14
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8588
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
* Use 22.11 for all links.
* Make markdown table code more aligned.
Partially resolves: COMPMID-5548
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Change-Id: I1fd41fc4df3e97573cbfe8c3bb033c0a6dcbfb5c
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8608
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
* Resolves MLCE-842
Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
Change-Id: Iae0521b25a5e6c9cc8046830f397d523dfbcc66e
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8542
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Remove conflicting Padding2D from the unused comparison operator in the
prototpye
Resolve unused variables in release mode
Resolves COMPMID-5683
Signed-off-by: SiCong Li <sicong.li@arm.com>
Change-Id: I19d74c57e51e6cf64003ddcbc74227608bb866d2
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8590
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
* When the tensors are reinterpreted as 1D, any thing smaller than
10KB won't be splitted into different thread.
Resolves: COMPMID-5630
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Change-Id: Icff7089e37c85c8b325f099008a080a5805d36a2
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8581
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
* Datatype supported qasymm8 and qasymm8_signed.
* Resolves COMPMID-5211
Change-Id: Ib161af3af5ad11ec6b46de0bf4fbac172d2525fb
Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7729
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves : [COMPMID-5461]
Signed-off-by: Omar Al Khatib <omar.alkhatib@arm.com>
Change-Id: I89b99d267c32b00ef44f9bb6e7c714dfe4a0d29d
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8420
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: COMPMID-5686
Change-Id: I608c359583c44f2f04f29faddd1c6b38a381de60
Signed-off-by: SiCong Li <sicong.li@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8562
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Partially resolves: COMPMID-5548
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Change-Id: If12e7f02b8ff485cf856a6cee6773be54165ccae
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8557
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
- Replace VEC_SIZE with N0. VEC_SIZE was used in the old gemm kernel and
not used anymore in the existing ones
Resolves COMPMID-5678
Change-Id: Ia770200b9d6e24c51c57347e4634fb8eadd10385
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8556
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: COMPMID-5511
Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com>
Change-Id: I0ac0acbf1de7da09f18f7b457307ec3cc99deb3b
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8546
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Revert the range removal in tests for soft relu and bring the former implementation in CL backend back.
Resolves: COMPMID-5677
Change-Id: I35d5ac03a134299041ce97aabc9fff2d4380d09f
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8551
Reviewed-by: Milos Puzovic <milos.puzovic@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
* The range check condition is incorrect. The maximum value
of 8-bit quantized data is 255, not 1023.
* Use 256 to make the check slightly stricter than it should.
Resolves: COMPMID-5458
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Change-Id: I5c071680531574b409a7ce71a732f6480caa10d8
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8537
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Check whether weights are defined as constant, if they are not constant then do not pack them if they are already packed so that they can be updated.
Signed-off-by: Milos Puzovic <Milos.Puzovic@arm.com>
Change-Id: I73447e31e3660b05f8f40e04ea4ea2003eb9b802
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8539
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Added missing threshold for calculating SOFT_RELU when SVE and CL implementations are used. As a result removed from the testing bounds for input values that were set to be in the interval [-40, 40].
Resolves: COMPMID-5658
Signed-off-by: Milos Puzovic <Milos.Puzovic@arm.com>
Change-Id: I3d14df60125e36e4eb85aeb222f4fb0cc5741521
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8536
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|