Age | Commit message | Author |
|
- Implements MatMul function and operator for floating point datatype FP16/FP32
- Includes support for transposing dynamic tensors prior to matrix multiplication.
- Adds tests for 2D/3D/4D+ tensors in MatMul with F32/F16 datatype (with all combinations of transposed/not-transposed tensors)
- Updates fixture to allow for testing fused activation in MatMul
- Adds tests for matmul with and without fused activation
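The transposed/not-transposed combinations exercised by these tests can be illustrated with a small reference routine. This is only a sketch in plain Python for the 2D case; `matmul_ref`, `transpose_lhs` and `transpose_rhs` are hypothetical names chosen for illustration and are not the library's API.

```python
def transpose(m):
    """Return the transpose of a 2D list-of-lists matrix."""
    return [list(row) for row in zip(*m)]


def matmul_ref(lhs, rhs, transpose_lhs=False, transpose_rhs=False):
    """Reference 2D matrix multiply with optional operand transposition."""
    a = transpose(lhs) if transpose_lhs else lhs
    b = transpose(rhs) if transpose_rhs else rhs
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions do not match")
    # Pair every row of a with every column of b.
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


lhs = [[1.0, 2.0], [3.0, 4.0]]
rhs = [[5.0, 6.0], [7.0, 8.0]]
plain = matmul_ref(lhs, rhs)
both_t = matmul_ref(lhs, rhs, transpose_lhs=True, transpose_rhs=True)
```

Transposing both operands is equivalent to transposing the product taken in the opposite order, which gives a cheap cross-check for a reference implementation.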
Resolved: [COMPMID-5898]
Signed-off-by: Mohammed Suhail Munshi <MohammedSuhail.Munshi@arm.com>
Change-Id: Iefa84b26dd723c9a51e6c3f91023152c6c31ace2
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9411
Reviewed-by: SiCong Li <sicong.li@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
- Add fallback functions neon_qasymm8_signed_elementwise_unary() and
neon_qasymm8_elementwise_unary()
- They are called when the target is not aarch64
Resolves: COMPMID-5994
Change-Id: Id0db1e7cb0fe92f1eaef0b3a9ed2bea01b3f2a15
Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9416
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: COMPMID-6002
Change-Id: Ifc2b7c889679b21d7e58f533be9c865854e132ef
Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9408
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
The fully connected function and operator running on GPU have been adapted to support dynamic weights.
Dynamic weights require the reshape and data layout conversion of weight tensors at runtime in the prepare stage of the operator. The implementation for GPU is identical to the CPU implementation.
This patch also deprecates the `are_weights_reshaped` option in Fully Connected.
Resolves: COMPMID-5870
Change-Id: I28f967695879d82cc91a928d95308a4e0e52a597
Signed-off-by: Jakub Sujak <jakub.sujak@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9403
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: COMPMID-5949
Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com>
Change-Id: Idd8cfe6ea94a14f0b23178f6781251b5f0955563
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9390
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
- Deprecate dynamic block shape interface
- Iterate over output window instead of input window for simpler implementation and better performance.
- Add cropping support and cropping tests
Resolves [COMPMID-5865]
Signed-off-by: Omar Al Khatib <omar.alkhatib@arm.com>
Change-Id: Ic67d44a6a39299ecdafc507f12e3dc5d517dfb62
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9385
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
- Deprecate dynamic block shape interface
- Iterate over output window instead of input window for simpler
implementation and better performance
- Add cropping support and cropping tests
Resolves COMPMID-5918
Signed-off-by: SiCong Li <sicong.li@arm.com>
Change-Id: Ifea0f5f7760ffd0f4d5d4f3a5ae8d14d0b98b790
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9378
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
* Removed namespace arm_compute::utils::requires to fix the build error:
‘requires’ is a keyword in C++20 [-Wc++20-compat]
* Added missing includes for cstdint.h
* Resolves MLCE-1040
Change-Id: I08842a273a4422f8e9b10daded680f521efe26e0
Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9388
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Adds additional ARM_COMPUTE_ENABLE_FP16 guards to Convolution layer
testing to ensure that validation suite passes on armv8a hardware when
built with arch=armv8a, and multi_isa=0.
Partially resolves ONCPUML-1209
Change-Id: Ib485502e534df1fa91c5c2d7b222ea08a354cc54
Signed-off-by: Nathan John Sircombe <nathan.sircombe@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9383
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
* Add quantized unary elementwise in CPU using LUT.
* Widen the input data range of the test suite.
- Fix CPU exponential function overflow/underflow range.
- Fix saturation issue of CL round operator.
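The lookup-table idea for 8-bit quantized inputs can be sketched as follows: since an unsigned 8-bit input has only 256 possible values, the result of the unary function can be precomputed once and then applied by indexing. This is an illustrative sketch only; the function names and the quantization parameters are assumptions, not values or APIs taken from the library.

```python
import math


def build_lut(fn, in_scale, in_offset, out_scale, out_offset):
    """Precompute fn for every unsigned 8-bit quantized input value."""
    lut = []
    for q in range(256):
        real = (q - in_offset) * in_scale                # dequantize
        result = fn(real)                                # apply the unary function
        q_out = round(result / out_scale) + out_offset   # requantize (nearest)
        lut.append(max(0, min(255, q_out)))              # saturate to [0, 255]
    return lut


def apply_lut(lut, data):
    """Apply the precomputed table element by element."""
    return [lut[q] for q in data]


# Hypothetical quantization parameters, chosen only for the example.
lut = build_lut(math.exp, in_scale=0.05, in_offset=128, out_scale=0.1, out_offset=0)
out = apply_lut(lut, [0, 128, 255])
```

The per-element cost becomes a single table lookup, regardless of how expensive the underlying function is.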
Resolves: COMPMID-5763
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Change-Id: I41445de2b4a33ec6b01e0ab701516c240c852d0b
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9367
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
* Use a vector to represent the (static) block shape instead of an N-D
Tensor. The previous use of ND Tensor as block shape was wrong, not
adhering to the specification, and non-functional (only first dim was
used anyway).
* The fixture now accepts a static block shape, because the dynamic
case is not properly implemented and will be deprecated for now.
* Fix an assertion error in reference implementation.
Partially resolves COMPMID-5918
Change-Id: I5221e52ccc05e7c1249dec3a42426f954a73729a
Signed-off-by: SiCong Li <sicong.li@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9357
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com>
Reviewed-by: Omar Al Khatib <omar.alkhatib@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: COMPMID-5952, COMPMID-5956
Change-Id: Idbd14538e7660792254072fa9631a6f03966f89b
Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9371
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: COMPMID-5945, COMPMID-5954
Change-Id: I7b27021d21f8e08c4896f6b1f595a75125064f9e
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9356
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: COMPMID-5917
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Change-Id: I073067b490f2a1b96b81a037ea431c9a2e5c7503
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9322
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
- Implement OpenCL kernel for LHS transposed and RHS non-transposed
- Implement OpenCL kernel for LHS transposed and RHS transposed
- Add validation tests
Resolves: COMPMID-5953, COMPMID-5955
Change-Id: I55589acbffe86c44e29807574975978a1ec09bad
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9345
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: COMPMID-5863
Change-Id: I9ff67face62826c1d335a6b941e8516be39bdac8
Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com>
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/c/VisualCompute/ComputeLibrary/+/488768
Tested-by: bsgcomp <bsgcomp@arm.com>
Comments-Addressed: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9225
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
- Implement ClNativeMatMulKernel class
- Implement OpenCL kernel for LHS non-transposed and RHS non-transposed
- Implement OpenCL kernel for LHS non-transposed and RHS transposed
- Add test fixture and dataset for matmul
- Implement transpose_tensor() for reference implementation to transpose high dimensional tensors
Resolves: COMPMID-5944, COMPMID-5951
Co-authored-by: Gunes Bayir <gunes.bayir@arm.com>
Co-authored-by: Ramy Elgammal <ramy.elgammal@arm.com>
Change-Id: I1d5b8978f41be27baddb3153ade880472141573f
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9333
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Partially resolves COMPMID-5918, COMPMID-5865
Signed-off-by: SiCong Li <sicong.li@arm.com>
Change-Id: Ib3b01e7dc1c944184a4c038045bf0469fbb9ff45
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9321
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
* The shape of input and indices tensors, and the gather axis
can be any number, as long as these are valid and the output
tensor doesn't have more dimensions than the library supports.
* Update the reference code to be more generic and straightforward.
* Add necessary test cases.
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Resolves: COMPMID-5919
Change-Id: Ic7e2032777aa97ecc147f61d5388528697508ab1
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9199
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
* Add sigmoid and tanh activation functions for dynamic fusion.
* Add corresponding tests, but both activation functions share
the same fixture implementation.
Resolves: COMPMID-5939
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Change-Id: I0aae0eaa18b746ce89680d2773c66e09b0f854ce
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9257
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Fusion source files
Resolves: [COMPMID-5960]
Signed-off-by: Omar Al Khatib <omar.alkhatib@arm.com>
Change-Id: I1b11f01c51a029082ed05823717b4c4ae4897798
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9270
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
- Add a max pooling implementation that returns kernel indices.
- Add a parameter in pooling info object to pick kernel indices impl.
- Add validation tests.
Resolves: [ONCPUML-1187]
Signed-off-by: Adnan AlSinan <adnan.alsinan@arm.com>
Change-Id: I485ef1604f676ee14d5f7f62d33699e49c38e4d3
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9192
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
- Add a parameter in PoolingLayerInfo class to pick which value to use as min for max-pooling.
Resolves: [ONCPUML-1166]
Signed-off-by: Adnan AlSinan <adnan.alsinan@arm.com>
Change-Id: I34e1cccc15176bbf31523c61e99f3188ddca23e1
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8989
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
- Fix 4 failing tests for multi_isa builds when experimental_fixed_format_kernels=1
- Fixes for CMake and Bazel builds to pass validation tests
- Update documentation, remove “-DCPPTHREADS=1” flag from CMake build example
Partially resolves: ONCPUML-1181
Signed-off-by: David Svantesson <david.svantesson@arm.com>
Change-Id: I7101676260a0adcb7b6ff6f4342ae36f921e7120
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9189
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
- Dividing the scale by the number of elements causes accuracy loss due to the limited precision of the float datatype and truncation to int
- Adds rounding after division on aarch64 to counteract this.
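The effect described above can be reproduced with a small numeric sketch. It uses Python doubles rather than the library's float type, and the accumulator, scale and element count are made-up values chosen so that the scaled result lands just below an integer; the function names are hypothetical.

```python
def requantize_trunc(acc, scale, count):
    """Scale the accumulator and truncate towards zero."""
    return int(acc * (scale / count))


def requantize_round(acc, scale, count):
    """Scale the accumulator and round to the nearest integer."""
    return int(round(acc * (scale / count)))


# The exact answer is 100 * 0.58 / 2 = 29, but the floating-point
# product is slightly below 29, so truncation loses a whole step.
truncated = requantize_trunc(100, 0.58, 2)
rounded = requantize_round(100, 0.58, 2)
```

Truncation turns a representation error of a few units in the last place into an error of one full quantization step, which is what rounding after the division avoids.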
Resolves: [COMPMID-5839]
Signed-off-by: Mohammed Suhail Munshi <MohammedSuhail.Munshi@arm.com>
Change-Id: I54ef0f7e56f39da1fa5f30378f551b5ca419a61d
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/c/VisualCompute/ComputeLibrary/+/492456
Tested-by: bsgcomp <bsgcomp@arm.com>
Comments-Addressed: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9110
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
This fixes faulty mismatch issues.
In addition, this aligns with the methodology used by f32, as well as
that of cpu f16 tests
Resolves COMPMID-5897
Change-Id: Id4e2088a9fc5444265c69444cfa90961dd84047e
Signed-off-by: SiCong Li <sicong.li@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9146
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
* Fast math is enabled unexpectedly by convolution layer tests.
Resolves: COMPMID-5843
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Change-Id: Ib3bc36d3f9070dbfb2c76146eecbb1ce0ee90626
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9137
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
This patch increases the tolerance value used for FP16 tests in the Neon(TM) backend. A tolerance number of 0.01f means it is acceptable for 1% of the resulting tensor to mismatch between the reference and the target. The value adopts a slightly stricter threshold compared to ConvolutionLayer (which is currently at 7%). This increase makes sense because the Deconvolution layer uses convolution under the hood.
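How a tolerance number interacts with a per-element tolerance can be sketched as below. This is a simplified illustration, not the validation code used by the test suite; `validate`, `rel_tol` and the sample data are assumptions made for the example.

```python
def validate(target, reference, rel_tol, tolerance_num):
    """Pass if the fraction of mismatching elements is at most tolerance_num."""
    mismatches = 0
    for t, r in zip(target, reference):
        # An element mismatches when its relative error exceeds rel_tol.
        denom = abs(r) if r != 0 else 1.0
        if abs(t - r) / denom > rel_tol:
            mismatches += 1
    return mismatches / len(reference) <= tolerance_num


reference = [1.0] * 200
target = [1.0] * 198 + [1.5, 2.0]   # 2 of 200 elements (1%) are off
ok = validate(target, reference, rel_tol=0.2, tolerance_num=0.01)
too_strict = validate(target, reference, rel_tol=0.2, tolerance_num=0.005)
```

With a tolerance number of 0.01 the two outliers are accepted, while halving it to 0.005 makes the same comparison fail.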
Resolves: COMPMID-5841
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Change-Id: Ie0ebf5cce1e9753dc641a947d84128dd6da402d4
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9120
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com>
Reviewed-by: Sang Won Ha
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
The AddMulAdd assembly kernels only support aarch64. This patch disables the associated tests when the build is not for aarch64.
Resolves: COMPMID-5850
Change-Id: Ib2768fd6bf2497420ff224daa243027d0a69c76b
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9076
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
- Fixes the Column Offset matrix not being iterated through in the y dimension
Resolves : COMPMID-5795
Signed-off-by: Mohammed Suhail Munshi <MohammedSuhail.Munshi@arm.com>
Change-Id: I0190474be404b4f0e171855739cfd0a48cbed5bc
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9020
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
This is a fused operator that merges Add + Mul + Add [+ Relu-based-Activation] layers and has an intermediate output after the first Add. It is supported for the FP16/FP32/QASYMM8/QASYMM8_SIGNED data types.
The subsequent Mul and Add are intended for scaling, and their coefficients have only one dimension (per channel).
The inputs are
- input1 : nD tensor [X, Y, Z, W, ..]
- input2 : nD tensor [X, Y, Z, W, ..]
- add_coef : 1D tensor [X]
- mul_coef : 1D tensor [X]
The outputs are
- out1 : nD tensor (intermediate output) [X, Y, Z, W, ..]
- out2 : nD tensor (final output) [X, Y, Z, W, ..]
The operation can be summarized as follows:
out1 <- input1 + input2
out2 <- Act(out1 * mul_coef + add_coef)
The activation function can be Identity, Relu, Bounded Relu or Lower/Upper Bounded Relu. The intermediate output can be skipped by providing a nullptr.
The reason for providing this operator is to enable fusion for residual network patterns and to save computation by reducing memory traffic back and forth.
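The two summarized steps can be written out as a small reference routine. This is a plain Python sketch for 2D inputs laid out as rows of X channels, using Relu as the activation; the function name `add_mul_add` and the `keep_out1` flag are hypothetical and only mimic the option of skipping the intermediate output.

```python
def relu(v):
    return max(0.0, v)


def add_mul_add(input1, input2, mul_coef, add_coef, act=relu, keep_out1=True):
    """Return (out1, out2) for 2D inputs laid out as rows of X channels."""
    out1, out2 = [], []
    for row1, row2 in zip(input1, input2):
        # out1 <- input1 + input2
        summed = [a + b for a, b in zip(row1, row2)]
        out1.append(summed)
        # out2 <- Act(out1 * mul_coef + add_coef), coefficients are per channel
        out2.append([act(s * m + c) for s, m, c in zip(summed, mul_coef, add_coef)])
    # The intermediate output is optional, mirroring the nullptr case.
    return (out1 if keep_out1 else None), out2


out1, out2 = add_mul_add(
    input1=[[1.0, -2.0], [3.0, 4.0]],
    input2=[[0.5, -1.0], [-3.0, 1.0]],
    mul_coef=[2.0, 1.0],
    add_coef=[-1.0, 0.5],
)
```

Computing both outputs in one pass over the data is what avoids writing the sum to memory and reading it back for the scaling step.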
Resolves: COMPMID-5463
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Change-Id: I8ef577aa623b036e9a9f655cc088493fd19a6109
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9055
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Partially-Resolves: COMPMID-5518
Change-Id: I8358784815bcac461d50e384fa7bc96f476d3983
Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com>
Signed-off-by: Jakub Sujak <jakub.sujak@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9045
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Dynamic-Fusion: SiCong Li <sicong.li@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
- Remove hack in CpuGemmAssemblyDispatch.cpp which tried to guess
strides for fixed format kernels. Instead, expect that strides will
have been correctly set on weights externally
- Update fixed format test fixtures to set the strides
- If the fixed format uses fast math mode, then weights should be of
type BFLOAT16. Change the validation logic to accept this.
Resolves: [ONCPUML-1131]
Co-authored-by: Milos Puzovic <Milos.Puzovic@arm.com>
Change-Id: I0f18d8b86b0f639be25fd122fa06a591e90645f2
Signed-off-by: Jonathan Deakin <jonathan.deakin@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8985
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Note: we use a separate test fixture for Multiplication op instead of reusing ElementwiseBinaryFixture to avoid exposing the internal enum ElementwiseOp to the public utils/TypePrinters.h as required by the data test case macros to print the test data. We also do not consider modifying the enum ArithmeticOp in the standard interface to include MUL without an implementation. Future work should consider refactoring this test fixture into the ElementwiseBinaryFixture to reduce the total number of fixtures/code duplication.
Resolves: COMPMID-5779
Change-Id: I84207658ce0407095b028fca0ab7bfa2950255ec
Signed-off-by: Jakub Sujak <jakub.sujak@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9013
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: ONCPUML-1110, ONCPUML-1109
Co-authored-by: Georgios Pinitas <georgios.pinitas@arm.com>
Co-authored-by: Joe Ramsay <joe.ramsay@arm.com>
Signed-off-by: David Svantesson <david.svantesson@arm.com>
Change-Id: Iea693dbe53bf0af87867d6a9e0d1fd9fbe59ef3a
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8981
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Add descriptions and pointers in the tests to document the differences
in test coverage between dynamic fusion and the current library, and
most importantly, why the differences exist. This will come in handy when we
want to quickly check if all old tests have been migrated so that we can
safely deprecate / remove them.
Resolves COMPMID-5840
Signed-off-by: SiCong Li <sicong.li@arm.com>
Change-Id: Ie6227098979e51d7921810288f594beac19bce6f
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9043
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
- Return aux tensorInfo by get_aux_tensors() at runtime to init the aux
tensor with the right size.
- Keep softmax unfusable for this commit
- Hence, added Tensor3D to the template writer arguments declaration, for the sake of
keeping the dynamic fusion softmax components' kernels matching their CL
counterparts.
Resolves: COMPMID-5523
Change-Id: I667f39545db925f667036ef448302c79a0330373
Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com>
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/c/VisualCompute/ComputeLibrary/+/483924
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: bsgcomp <bsgcomp@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8986
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
The new dynamic fusion API is introduced in the following patch:
https://review.mlplatform.org/c/ml/ComputeLibrary/+/8906
For each operator (except Conv2D, which is migrated in the above patch), we
- remove destination tensor from is_supported, validate and create calls
- make create_op return ITensorInfo* to the intermediate destination object
Affected operators:
- DepthwiseConv2D
- Cast
- Elementwise Ops
- Clamp
- Reshape
- Resize
Resolves: COMPMID-5777
Change-Id: Ib60ec8a5f081752808455d7a7d790f2ed0627059
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8991
Reviewed-by: Ramy Elgammal <ramy.elgammal@arm.com>
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Dynamic-Fusion: Ramy Elgammal <ramy.elgammal@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
* Add direct conv2d tests as a separate fixture so that we can enable
future direct conv2d specific tests
* Move Conv2dAttributes to its own file
Partially resolves COMPMID-5736
Change-Id: I530649488faf3bbed1a4fc7d16a74063bfdf33db
Signed-off-by: SiCong Li <sicong.li@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8928
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
- Adds Dynamic fusion PoolingLayer2D as Unfusable Operator
- Indices are not supported
- Adds tests for F32/F16 Datatypes
Resolves : [COMPMID-5520]
Signed-off-by: Mohammed Suhail Munshi <MohammedSuhail.Munshi@arm.com>
Change-Id: I0d112545eb9209c836bf9ea153069f8627531e0a
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8893
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves COMPMID-5814
Change-Id: I09b206374cf3844c09aebd3c664daec9c2335e6d
Signed-off-by: SiCong Li <sicong.li@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8953
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
* Removed BF16 validation tests for DepthConvert
* Revert back to using inline assembly to convert to/from BF16
* Resolves COMPMID-5800
Change-Id: I803b2ad19ead297417f780c97c5b724cca6b394c
Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8929
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
* Add missing activation infos
* Remove faulty test "Shrink window"
* Split the tests based on data layout
* Fix ClDirectConv2dKernel::validate logic
Fused activation in NCHW is not supported at all
Resolves: COMPMID-5801
Change-Id: I64dfbd24b77bb02fb4a88b73d5ef84676d85b4fd
Signed-off-by: SiCong Li <sicong.li@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8899
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
- ITensorInfo's padding cannot be extended if its lock_paddings flag is set to True.
Resolves: COMPMID-5714
Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com>
Change-Id: I6bca9bbf7172822af60562310578c438b9e15f46
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8875
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: COMPMID-5522
Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com>
Change-Id: If4e5736a2f7ff42e70276d7f4e0f3ebcb38414e6
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8881
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
- Intermediate tensor info objects are not created by the user anymore. They're returned from create_op and reused. This will prevent allocation of the intermediate tensors in case of possible interface misuse.
- Sketch object handles intermediate tensor info pointers inside its implementation class via a unique pointer vector
- Conv2d operator is migrated into the new interface
Resolves: COMPMID-5776
Change-Id: I9422e3681eef4f2d2922f6d0a5d7786380837c6d
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8906
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
* Binary elementwise operator now can have broadcasting in either
X dimension, Y+Z dimension, or both, in either LHS or RHS
operand.
* Fix bug in CL code to support batching.
Resolves: COMPMID-5704
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Change-Id: I51b04986d30861f255ca9f754adffa0e6c85a26b
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8898
Reviewed-by: SiCong Li <sicong.li@arm.com>
Reviewed-by: Ramy Elgammal <ramy.elgammal@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Dynamic-Fusion: Ramy Elgammal <ramy.elgammal@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
* Multiple intermediate tensors can share the same tile.
- A simple operator can reuse the input tensor for the result
if the input tensor has the same shape, data type and it is
only consumed by that operator.
- The special case is when a simple operator and an output operator
consume the same tensor. However, as the output operator
doesn't change the content of the input tensor, it doesn't
count as "consuming" the input tensor.
* These temporary tiles are declared automatically by the template
writer. Individual operators don't need to generate the output tile
declaration.
* Cast is now a simple operator.
Resolves: COMPMID-5778
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Change-Id: I232647ac976645e2d266a62e055b9eb48c356a8e
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8877
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Partially resolves: COMPMID-5794
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Change-Id: I275d0401be978e86507990bdb7dc5b1538a108d8
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8884
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: COMPMID-5521
Change-Id: Id38a4ce18f9ea8805a151acb064e72795535d1a0
Signed-off-by: Jakub Sujak <jakub.sujak@arm.com>
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8859
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|