Age | Commit message (Collapse) | Author |
|
ElementwiseMax, ElementwiseMin, ElementwiseSquaredDiff
Change-Id: I3833de3be6c6d573c68d3fee0cf0f42bad260817
Signed-off-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4072
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I72fed68cebe4073fe436b9f6372e762507aed89e
Signed-off-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4064
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Signed-off-by: SiCong Li <sicong.li@arm.com>
Change-Id: Ifae31c74eb46c561225394a387fc15332423bfa9
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4030
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
|
|
Remove configuration tests that use the default data shapes.
There is no need to run them since configure will run as part
of the actual validation run.
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com>
Change-Id: If6d88a6ba5e9463fa8c615fcf76a5c07d3049d53
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3638
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
The core algorithm for calculating the ROIAlign reference is implemented in
single precision floats, so there is no reason to specialize it for half.
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: I75f4edaf47b70ea0cdc7262cb1509fe69a6aa5b7
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4010
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
|
|
CLGEMMMatrixMultiplyReshapedKernel
Resolves: COMPMID-3671, COMPMID-3672
- Extend cl image support to f16 in CLGEMMMatrixMultiplyReshapedKernel
- Extend cl image support to f16 in CLGEMMMatrixMultiplyReshapedOnlyRHSKernel
- Change the interface of create_image2d_from_buffer
- Extend test
Change-Id: I27363be71fa515fbf71aa4be5ed0d6c730f38f34
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3992
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Solves also:
- COMPMID-3766: CTS Failures in Transpose Neon + FP16
Change-Id: I9d323f45f49cc0bce9e6329790bcf2f0eeec8572
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3949
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-by: Pablo Marquez <pablo.tello@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
NEGEMMLowpQuantizeDownInt32ToUint8ScaleKernel
Change-Id: I8c8b499be0a09886b701a4f678b40e57f2c48dd8
Signed-off-by: morgolock <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3990
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Manuel Bottini <manuel.bottini@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Alter the default lower bound used for the norm from 1e-12 to 1e-6 to be
representable by the half precision dynamic range.
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: I8d3103b8345eb4c464a76b4f4ba5ef596d81da93
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3960
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
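The reasoning behind the new lower bound can be checked with a small sketch. It relies on one known fact: the smallest positive FP16 (IEEE 754 binary16) value is the subnormal 2^-24, so under round-to-nearest-even anything at or below 2^-25 becomes zero. The helper name `survives_fp16_rounding` is hypothetical and not part of the library.

```cpp
#include <cassert>

// Smallest positive subnormal value of IEEE 754 binary16 (FP16): 2^-24.
constexpr double kFp16MinSubnormal = 5.9604644775390625e-8;

// Hypothetical helper: returns true if a positive value x rounds to a
// non-zero FP16 value under round-to-nearest-even, i.e. x is strictly
// greater than 2^-25 (half of the smallest subnormal).
bool survives_fp16_rounding(double x)
{
    return x > kFp16MinSubnormal / 2.0;
}
```

Under these assumptions 1e-12 flushes to zero in half precision, while 1e-6 is kept as a (subnormal) non-zero value.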
|
|
Division follows the flooring division approach where for example 5/2=2 while
-5/2=-3
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: I65756e0b31fe8d97f743a4c13dc5f96304722f75
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3929
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
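A minimal sketch of flooring integer division as described above. C++'s built-in `/` truncates towards zero, so the quotient has to be adjusted when the operands have opposite signs and the division is inexact. The function name `floor_div` is an assumption made for illustration.

```cpp
#include <cassert>

// Flooring integer division: rounds the quotient towards negative infinity,
// unlike the built-in '/' operator, which truncates towards zero.
// Assumes b != 0 and that a / b does not overflow.
int floor_div(int a, int b)
{
    int q = a / b;
    const int r = a % b;
    // A non-zero remainder whose sign differs from the divisor's sign means
    // the truncated quotient is one above the floored quotient.
    if (r != 0 && ((r < 0) != (b < 0)))
    {
        --q;
    }
    return q;
}
```

With this definition `floor_div(5, 2)` is 2 and `floor_div(-5, 2)` is -3, matching the example in the message.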
|
|
Tolerance issue due to requantization. The NEON implementation does all
computations in float when input and output quantization info are
different and reduction on multiple axes is required. On the other hand,
the reference performs the first reduction in float, then requantizes
and then performs the remaining reductions in the quantized domain using
the output from the first reduction. This causes small discrepancies in
a few cases, hence the increased tolerance.
Change-Id: Ib862f599ce3924cbad651bab77227d52e15eff88
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3937
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
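The discrepancy described above can be reproduced on a 2x2 tensor. This is only an illustrative sketch: the helpers `quantize` and `dequantize`, the zero offset, the round-half-away-from-zero policy and the missing clamping are all assumptions, not the library's actual code.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical quantization helpers (zero offset, no clamping,
// round-half-away-from-zero via std::lround).
uint8_t quantize(float v, float scale) { return static_cast<uint8_t>(std::lround(v / scale)); }
float dequantize(uint8_t q, float scale) { return static_cast<float>(q) * scale; }

// Path A: every reduction is done in float, with a single final quantization.
uint8_t mean_all_float(const uint8_t (&in)[2][2], float in_scale, float out_scale)
{
    float sum = 0.f;
    for (const auto &row : in)
    {
        for (uint8_t v : row)
        {
            sum += dequantize(v, in_scale);
        }
    }
    return quantize(sum / 4.f, out_scale);
}

// Path B: the first reduction is done in float and requantized; the second
// reduction then works on the already quantized values.
uint8_t mean_requantized(const uint8_t (&in)[2][2], float in_scale, float out_scale)
{
    uint8_t col[2];
    for (int x = 0; x < 2; ++x)
    {
        const float m = (dequantize(in[0][x], in_scale) + dequantize(in[1][x], in_scale)) / 2.f;
        col[x] = quantize(m, out_scale);
    }
    return static_cast<uint8_t>(std::lround((col[0] + col[1]) / 2.f));
}
```

For the input {{1, 3}, {2, 4}} with an input scale of 1 and an output scale of 2, path A yields 1 and path B yields 2, a difference of one quantized step.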
|
|
* Add S32 support to NEPixelWiseMultiplication and NEPixelWiseMultiplicationKernel
* Scale == 1/255 is not supported for S32, as on non-aarch64 the
precision requirement is not met, and scale is a non-standard
parameter anyway.
* Fix the data type validation logic to also test for all invalid data
type combinations.
* Add validation tests for S32 NEON PixelWiseMultiplication
* The wrap tolerance for ScaleOther (scale == 1/2^n) cases is set to
1 instead of 0 because the reference uses floating point division
followed by rounding, which isn't bit accurate.
Change-Id: I28839afda7a4f98c985d1763620e08d98f740142
Signed-off-by: SiCong Li <sicong.li@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3923
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
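To illustrate why a floating point path is not bit accurate for S32 inputs, the sketch below compares an exact integer shift with a single-precision computation for scale == 1/2^n. Both function names are hypothetical, and the use of `float` (rather than `double`) and of `std::lround` in the floating point path is an assumption about how a reference might be written.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Exact integer path: multiply in 64 bits, then shift right by n.
// Assumes a non-negative product; no saturation is applied.
int32_t mul_scale_shift(int32_t a, int32_t b, int n)
{
    const int64_t product = static_cast<int64_t>(a) * b;
    return static_cast<int32_t>(product >> n);
}

// Floating point path: the product is converted to float (24-bit
// significand), scaled by 2^-n and rounded to the nearest integer.
int32_t mul_scale_float(int32_t a, int32_t b, int n)
{
    const float product = static_cast<float>(static_cast<int64_t>(a) * b);
    return static_cast<int32_t>(std::lround(product * std::ldexp(1.0f, -n)));
}
```

For a = 2^25 + 3, b = 1 and n = 1 the integer path gives 16777217, while the float path gives 16777218 because the product is rounded to the nearest representable float before scaling.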
|
|
* Fix the convert policy validation logic and add the missing validation test
* Add S32 support to NEArithmeticSubtraction and NEArithmeticSubtractionKernel
* Add S32 validation tests
Change-Id: I1b6cb15b024613c202fe9f17747a83da43a5ddcf
Signed-off-by: SiCong Li <sicong.li@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3908
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
|
|
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: Ia7516fadcf3df072abf9b83aef4d9939212ce082
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3918
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- Remove padding from NEGEMMTranspose1xWKernel
- Extend test for validating zero padding requirement
Change-Id: I9ce4ca95a500229b045dc140cfff21fdf7373700
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3920
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- Remove padding from NEGEMMInterleave4x4Kernel
- Extend test for validating zero padding requirement
Change-Id: I94abc271e005f9dd6e1721b185631f55f598dbfd
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3915
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I81b0c2482bc20b1ab5124ed6179bb94cbced7875
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3869
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: Ic0569fe9ed99e61084b601ce84ddc7ef288d1899
Signed-off-by: SiCong Li <sicong.li@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3852
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
* Properly support "axis" in CL and NEON (and GC) SoftmaxLayer and
LogSoftmaxLayer in accord with mainstream frameworks. Axis now defines
the dimension on which softmax is performed, and supports the range
[-rank, rank)
* Extend validation tests to include valid and invalid axes
* Remove unnecessary LogSoftmaxLayer fixture, as it is only a
specialisation of the SoftmaxLayer fixture
* Change the validation fill value range from [-1000, 1000] to [-10,
10], as the former often results in sparse outputs with a single one and
zeros elsewhere
Change-Id: I8a0040453182b04ed88260de3ba434e98258d863
Signed-off-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3830
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
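Two points from this message can be sketched in a few lines: mapping an axis from [-rank, rank) to a non-negative dimension index, and why a fill range of [-1000, 1000] tends to produce a one-hot softmax output. The names `normalize_axis` and `softmax` are hypothetical, and the softmax is a plain 1D double-precision version rather than the library's kernels.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: map an axis in [-rank, rank) to [0, rank).
int normalize_axis(int axis, int rank)
{
    if (axis < -rank || axis >= rank)
    {
        throw std::out_of_range("axis must be in [-rank, rank)");
    }
    return axis < 0 ? axis + rank : axis;
}

// Numerically stable 1D softmax: subtract the maximum before exponentiating.
// Assumes a non-empty input.
std::vector<double> softmax(const std::vector<double> &x)
{
    const double max_val = *std::max_element(x.begin(), x.end());
    std::vector<double> out(x.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i)
    {
        out[i] = std::exp(x[i] - max_val);
        sum += out[i];
    }
    for (double &v : out)
    {
        v /= sum;
    }
    return out;
}
```

With inputs spanning [-1000, 1000], every exponent other than the maximum's underflows to zero, so the output is a single one and zeros elsewhere; inputs in [-10, 10] keep all outputs strictly positive.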
|
|
To prevent unexpected failures in some cases,
a larger tolerance value is used, matching
CL's relative tolerance value.
Change-Id: If6e3bc2f30651c54769dcd8dd647a3233a88c488
Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3826
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Marquez <pablo.tello@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: If9a5c6ee3902a7381f4117e473adbddf006f3347
Signed-off-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3731
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Sang-Hoon Park <sang-hoon.park@arm.com>
|
|
CLGEMMMatrixMultiplyReshapedKernel
- Change the interface of STORE_BLOCK_BOUNDARY_AWARE to pass the
conditions on Y and X rather than the X/Y coordinates. This allows the
macro to be used with both GEMM reshaped and GEMM reshaped rhs only
- Remove padding from the output tensor of
CLGEMMMatrixMultiplyReshapedKernel
- Add tests for validating the zero padding requirement
Change-Id: I13263cc71ce065c5be34ed198def320dd5823495
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3712
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Tolerance issue
Change-Id: I0246b70b03520b13a6a1bc5a92fb4787d7c0e734
Signed-off-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3711
Reviewed-by: SiCong Li <sicong.li@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- Remove padding requirement for the input tensor of
CLGEMMReshapeLHSMatrixKernel
- Add utility function to load a boundary aware 2d tensor from buffer
- Extend validation for validating the zero padding requirement
Change-Id: I0ac6b1b517d75fd56998f406e0cce97b40918ce1
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3701
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
* Remove channel paddings from all nhwc kernels (im2col_3x3_nhwc,
im2col_9x9_nhwc, im2col_generic_nhwc)
* Validate that input total spatial dimensions (with x and y paddings)
are bigger than or equal to the kernel spatial dimension.
- Otherwise it would result in invalid memory reads.
- This problem likely existed before, but was made obvious with the
removal of implicit paddings
* Add zero padding validation tests
* Fix Im2ColValidationFixture by not permuting the input shape in case of
NHWC
Change-Id: I1f895e8938af0e9130cb516106f0b4b665531709
Signed-off-by: SiCong Li <sicong.li@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3696
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
For the elements that shouldn't contribute to the sum, zero
is used to compute the correct sum.
Change-Id: I5360534b5b0f81ee3d3aaaf5a046b99ecd943894
Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3703
Reviewed-by: Pablo Marquez <pablo.tello@arm.com>
Reviewed-by: TeresaARM <teresa.charlinreyes@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Our implementation of reduce_axis is only compliant for default_axis.
Validate will throw an error when trying to use a different axis.
Change-Id: I4c02aa055bb4474593a3114ec9c83884d3c9120f
Signed-off-by: morgolock <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3658
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
|
|
* Fix out-of-bound mem reads in cases where M < M0 in
CLGEMMMatrixMultiplyNativeKernel and
CLGEMMMatrixMultiplyReshapedOnlyRHSKernel, as a result of the new
boundary-aware reading logic.
* Add fixture tests (alongside the padding configuration tests) in
these 2 kernels to catch all 5 possible scenarios with block dimension
configurations, which includes this particular bug as the
"...BoundaryHandlingFullInXSinglePartialInY" test case
Change-Id: I8a10ab67594171e3ea4fb6e35c84ddc4ab964fba
Signed-off-by: SiCong Li <sicong.li@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3650
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- Add OpenCL kernel for Max unpooling layer
- Add tests for validating the result
Change-Id: If7ca79566a1198e3141f880abf46738980a62c81
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3606
Reviewed-by: Pablo Marquez <pablo.tello@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Fix PoolingLayer max pooling reference bug to extract indices.
Extend CLPoolingLayer max pooling to extract indices; all the paddings need to be subtracted.
Signed-off-by: Sheri Zhang <sheri.zhang@arm.com>
Change-Id: If8e82e7f7e03172ad05f5a7cd5f13cf682fd1ffc
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3649
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Marquez <pablo.tello@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
With this patch: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3178
we can support any range of values since we handle overflows by clamping.
This means that for large negative values we'll get 0 and for large positive values inf,
which aligns with the math.h implementation.
Change-Id: I01e92010bb0c514c12b19da97e369a75d782cac7
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3639
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
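The saturating behaviour described above can be sketched as a clamp around the exponential. The function name `clamped_exp` and the two thresholds are assumptions chosen for single precision (inputs below about -104 produce results that round to zero, inputs above about 88.72 exceed the largest finite float); `std::exp` stands in for whatever approximation the library actually uses.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical clamped exponential for single precision.
float clamped_exp(float x)
{
    constexpr float min_input = -104.0f;    // below this, exp(x) rounds to 0
    constexpr float max_input = 88.72284f;  // above this, exp(x) overflows
    if (x < min_input)
    {
        return 0.0f;
    }
    if (x > max_input)
    {
        return std::numeric_limits<float>::infinity();
    }
    return std::exp(x); // stand-in for an approximation of exp
}
```

This mirrors what `std::exp` itself returns for very large negative and positive arguments.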
|
|
* Fix invalid use of vstore_partial_1
* Add configuration tests to catch this error case
Signed-off-by: SiCong Li <sicong.li@arm.com>
Change-Id: I25a2b16a530992acc869a4335c48a8fffa420850
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3628
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Allow computations with aligned corners when the tensors have
width/height equal to 1.
Change-Id: Ia01733f6c02e0740835b26a794b9a79fa35319b4
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3634
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Sadik Armagan <sadik.armagan@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
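One way to allow aligned corners with a dimension of size 1 is to skip the usual "minus one" correction when the output size is 1, which avoids a division by zero. The sketch below shows that idea only; the function name `resize_ratio` and the exact fallback are assumptions, not necessarily what the patch implements.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical input/output ratio along one dimension.
// With aligned corners the ratio is (in - 1) / (out - 1); when the output
// has a single element that correction is dropped to avoid dividing by zero.
// Assumes input_size >= 1 and output_size >= 1.
float resize_ratio(std::size_t input_size, std::size_t output_size, bool align_corners)
{
    const std::size_t offset = (align_corners && output_size > 1) ? 1 : 0;
    return static_cast<float>(input_size - offset) / static_cast<float>(output_size - offset);
}
```

For example, `resize_ratio(4, 3, true)` is 1.5, while `resize_ratio(4, 1, true)` falls back to 4.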
|
|
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com>
Change-Id: Idc5ac2dd2ba5295c00c88b44a783645327a27e15
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3617
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
To fix the issue caused by the limited precision of FP16,
mixed precision (float accumulator) is introduced to
NEInstanceNormalizationLayerKernel. Since the reference kernel
uses mixed precision, mixed precision computation is currently
the default when the kernel is called from NEInstanceNormalizationLayer.
- Make NEInstanceNormalizationLayerKernel use kernel descriptor
to enable mixed precision computation
- NEInstanceNormalizationLayer is modified to use the descriptor
Change-Id: I7766622d715df054e303f9b441380b15b51f02b2
Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3589
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
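Standard C++ (before C++23) has no half type, so the sketch below shows the same effect one precision level up, as an analogy: `float` plays the role of the narrow storage type and `double` the role of the wider accumulator. The function name `sum_ones` is hypothetical, and the expected values assume IEEE 754 arithmetic without fast-math style reassociation.

```cpp
#include <cassert>
#include <cstddef>

// Accumulate 'count' copies of 1.0f in an accumulator of type Acc.
template <typename Acc>
Acc sum_ones(std::size_t count)
{
    Acc acc = 0;
    for (std::size_t i = 0; i < count; ++i)
    {
        acc += 1.0f;
    }
    return acc;
}
```

A `float` accumulator stops growing at 2^24 = 16777216 because the next integer is not representable, whereas a `double` accumulator reaches the exact count. FP16 hits the same wall much earlier, at 2^11 = 2048, which is why a float accumulator helps when summing many half-precision values.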
|
|
* Supported strides 1 and 2
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: I4b9f087c0c328234159b2d1eacc2e465b3bb3c54
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3603
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: I611adf4f506d406540e920b0bd6befb4b5108918
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3601
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Add broadcast support for NEPixelWiseMultiplicationKernel with QASYMM8/QASYMM8_SIGNED
Add test case for QASYMM8 broadcast
Fix QASYMM8 saturation issue
Signed-off-by: Sheri Zhang <sheri.zhang@arm.com>
Change-Id: Ie67cfa8b94ab542133b031efbff8379cc57cfc2d
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3586
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
|
|
CLGEMMMatrixMultiplyReshapedOnlyRHSKernel and CLGEMMMatrixMultiplyNativeKernel
Resolves: COMPMID-3333, COMPMID-3334
* Implement an "overlap load, but don't overlap store" strategy:
- Change STORE_BLOCK_BOUNDARY_AWARE so that the partial block in y
dimension is placed at the beginning instead of at the end.
- Implement 3 auxiliary functions to calculate the lhs, bias and dst
addresses, taking into account the potential partial block in y dimension.
* Remove y load padding from Lhs and Bias tensors in
CLGEMMMatrixMultiplyReshapedOnlyRHSKernel and CLGEMMMatrixMultiplyNativeKernel
* Modify config tests to assert zero-padding in new dimensions
Change-Id: I8f8585c7c0f543d720c2c91b885417c7dad35af4
Signed-off-by: SiCong Li <sicong.li@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3576
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
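The "overlap load, but don't overlap store" idea can be sketched with plain index arithmetic. Assume M rows are processed in blocks of `m0` rows and `partial_m0 = M % m0` rows are left over; placing the partial block first means every later block is shifted back so that it still covers `m0` valid rows. The function names below are hypothetical and only model the row bookkeeping, not the OpenCL macros.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical: first row loaded by block y when the partial block
// (partial_m0 rows, 0 if M is a multiple of m0) is placed at the beginning.
int block_start_row(int y, int m0, int partial_m0)
{
    return std::max(0, y * m0 - (m0 - partial_m0) % m0);
}

// Hypothetical: number of rows block y stores. Only block 0 can be partial.
int rows_to_store(int y, int m0, int partial_m0)
{
    return (y == 0 && partial_m0 != 0) ? partial_m0 : m0;
}
```

With M = 10 and m0 = 4 (so partial_m0 = 2), block 0 starts at row 0 and stores 2 rows, block 1 starts at row 2 and block 2 at row 6, each storing 4 rows. Block 0 still loads a full block of rows 0 to 3, which overlaps block 1's rows, but the stored ranges 0-1, 2-5 and 6-9 do not overlap and never go past row 9.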
|
|
Change-Id: I8cfe78ad914150092ef752cdd687dce5cfeb6c5a
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3590
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I32588332080adfaa79227dadd0f152c1bd67ff62
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3577
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Extend NEPoolingLayer max pooling to extract indices for FP16
Signed-off-by: Sheri Zhang <sheri.zhang@arm.com>
Change-Id: I5a7c754be353e4c2c5d0ab3794e9427408d0c4fa
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3580
Reviewed-by: Sang-Hoon Park <sang-hoon.park@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
* Allow the following kernels to accept backing memory at run-time:
* NEBatchConcatenateLayerKernel
* NEDepthConcatenateLayerKernel
* NEHeightConcatenateLayerKernel
* NEWidthConcatenateLayerKernel
* Allow the following functions to accept backing memory at run-time:
* NEConcatenateLayer
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: Ib0b6714cff7f06a52dc74d294bc3e0d72a1c2419
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3569
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Pick the correct scales and offsets in case of broadcast.
Added tests for quantized QUANT8_ASYMM.
Change-Id: I04e90b8ae1f624b12bbdcf6ed9187e58b9135c85
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3562
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
COMPMID-3338 Remove store padding in CLGEMMMatrixMultiplyReshapedOnlyRHSKernel
COMPMID-3336 Remove store padding in CLGEMMMatrixMultiplyNativeKernel
COMPMID-3584 Fix VSTORE to correctly deal with scalar case
* Implement STORE_BLOCK_BOUNDARY_AWARE, as part of the COMPMID-3332
investigation, with the following substantial changes:
- Separate STORE_BLOCK_PARTIAL, STORE_ROW_PARTIAL and VSTORE_PARTIAL
so that this change does not affect kernels not using STORE_BLOCK_BOUNDARY_AWARE.
- Revamp vstore_ext_n to vstore_partial_n, and enhance
VSTORE_PARTIAL to correctly handle both vector and scalar cases
* Remove the store padding (dst tensor) in CLGEMMMatrixMultiplyReshapedOnlyRHSKernel
and CLGEMMMatrixMultiplyNativeKernel
* Add configuration tests to check no padding is added by the
configuration.
Signed-off-by: SiCong Li <sicong.li@arm.com>
Change-Id: I4f0907867979d8dacedd03b4bcbd2fb19e4f1602
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3522
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
When a large input and kernel are used, the computation of the "max_offset"
variable can overflow. Adjust the type of the variable, as well as that of
the variable it is compared with, for consistency.
The test that spotted the overflow is added to the nightly suite.
Change-Id: I2f114e4b49167889a6d3729c71823c089d6f42e3
Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3527
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: Manuel Bottini <manuel.bottini@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
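A minimal sketch of this kind of overflow, assuming the offset is a product of tensor dimensions and an element size. The function names and the choice of 32-bit versus 64-bit unsigned types are illustrative assumptions; the message does not say which types the patch uses.

```cpp
#include <cassert>
#include <cstdint>

// 32-bit computation: the product wraps around modulo 2^32.
uint32_t max_offset_32(uint32_t width, uint32_t height, uint32_t element_size)
{
    return width * height * element_size;
}

// 64-bit computation: widening the first operand keeps the full product.
uint64_t max_offset_64(uint32_t width, uint32_t height, uint32_t element_size)
{
    return static_cast<uint64_t>(width) * height * element_size;
}
```

For a 65536 x 65536 tensor of 4-byte elements the 32-bit result wraps to 0, while the 64-bit result is 17179869184.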
|
|
preferred presentation
Change-Id: Ib7dcfcbb24b408999dfae366b9da396485aacf78
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3525
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
The missing logarithm of the summation is added to the NEON, CL and
reference backends. To avoid complex changes, the log softmax layer on the
CL backend doesn't support quantized data types. Tests and doxygen comments
are modified accordingly.
Change-Id: Iafd29291be8b81345cb4999b2668dbc3ae0c3345
Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3517
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
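For reference, log softmax subtracts the logarithm of the summed exponentials, log_softmax(x_i) = (x_i - max) - log(sum_j exp(x_j - max)). The sketch below is a plain 1D double-precision version with a hypothetical function name, not the library's kernels.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Numerically stable 1D log softmax. Assumes a non-empty input.
std::vector<double> log_softmax(const std::vector<double> &x)
{
    const double max_val = *std::max_element(x.begin(), x.end());
    double sum = 0.0;
    for (double v : x)
    {
        sum += std::exp(v - max_val);
    }
    // The logarithm of the summation is the term the message refers to.
    const double log_sum = std::log(sum);
    std::vector<double> out(x.size());
    for (std::size_t i = 0; i < x.size(); ++i)
    {
        out[i] = (x[i] - max_val) - log_sum;
    }
    return out;
}
```

Exponentiating the outputs gives values that sum to one; for two equal inputs each output is -log(2).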
|
|
Change-Id: Ibd54d9d8324fe906077c181ecf227e44f3035744
Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3511
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
For nearest neighbor interpolation policy with aligned corners
all of NEON, CL and reference use round() rather than float to
find the nearest integer.
Change-Id: If0360da870e983303bf0424ca1100084084c1efc
Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3495
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
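A small sketch of nearest neighbour index selection with aligned corners, using `std::round` on the scaled coordinate. The function name `nearest_src_index` is hypothetical, and the comparison with truncation is an assumption about the alternative the message has in mind.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical: source index for destination index 'dst' along one dimension
// with aligned corners. Assumes out_size > 1 and 0 <= dst < out_size.
int nearest_src_index(int dst, int in_size, int out_size)
{
    const float ratio = static_cast<float>(in_size - 1) / static_cast<float>(out_size - 1);
    // std::round picks the nearest integer (halves round away from zero).
    return static_cast<int>(std::round(static_cast<float>(dst) * ratio));
}
```

Resizing 4 elements to 3 gives a ratio of 1.5, so destination index 1 maps to source index 2 with rounding, where truncation would have picked index 1; the last destination element maps to the last source element, as aligned corners requires.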
|