Age | Commit message (Collapse) | Author |
|
- Update heuristic for CLGEMMReshapedKernel - FP16
- Update heuristic for CLGEMMReshapedOnlyRHSKernel - FP16
Change-Id: I35aa73e59d8c2d1bc6b2dd318fd8eeb3e42c27a4
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4026
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
The "output_state_in" (previous output state) tensor
is used for accumulation of projection.
The argument for the tensor given to configure() has
to be changed to non-const since CLTensor needs
to be non-const for map() function call for data copying.
Even though NEON-side doesn't need the same change,
it has been done for consistency.
Change-Id: Ifba0ab6dc8260c468e9f087bf51824daefbab7a3
Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4018
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: I6ee7a05235d886875c55b0fc45446607981cdd2a
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4008
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Signed-off-by: Sheri Zhang <sheri.zhang@arm.com>
Change-Id: Iaf7ba1092aeda78b0f2bff3134f4699afc783385
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4013
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
very few rows
Also added 2D version of the 16-bit route, and altered the selection
heuristic so that 2D mode will be used in cases where 1D mode won't
thread well.
Change-Id: I0057fde08456771dc0090ac51f50d82f8bb86044
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3903
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I9b548966201c00df8290fea7acf55c2173b0e0aa
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4011
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
- The change affects Mali-G71 GPUs and should improve the performance of
GEMM in case of m = 1
Change-Id: I6b0e217e93fe468ec1325a5da74684811519c42f
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4002
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
CLGEMMMatrixMultiplyReshapedKernel
Resolves: COMPMID-3671, COMPMID-3672
- Extend cl image support to f16 in CLGEMMMatrixMultiplyReshapedKernel
- Extend cl image support to f16 in CLGEMMMatrixMultiplyReshapedOnlyRHSKernel
- Change the interface of create_image2d_from_buffer
- Extend test
Change-Id: I27363be71fa515fbf71aa4be5ed0d6c730f38f34
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3992
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Solves also:
- COMPMID-3766: CTS Failures in Transpose Neon + FP16
Change-Id: I9d323f45f49cc0bce9e6329790bcf2f0eeec8572
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3949
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-by: Pablo Marquez <pablo.tello@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com>
Change-Id: Ib5b252e1b65794a8f360276d03ff94922e1991f8
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3946
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Division follows the flooring division approach where for example 5/2=2 while
-5/2=-3
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: I65756e0b31fe8d97f743a4c13dc5f96304722f75
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3929
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: Iaf1465f3144371e153ce123ac00da5cc092f77df
Signed-off-by: morgolock <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3939
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
* Add S32 support to NEPixelWiseMultiplication and NEPixelWiseMultiplicationKernel
* Scale == 1/255 is not supported for S32, as on non-aarch64 the
precision requirement is not met, and scale is a non-standard
parameter anyway.
* Fix the data types validation logics to also test for all invalid data
type combinations.
* Add validation tests for S32 NEON PixelWiseMultiplication
* The wrap tolerance for ScaleOther (scale == 1/2^n) cases is set to
1 instead of 0 because the reference uses floating point division
followed by rounding, which is isn't bit accurate.
Change-Id: I28839afda7a4f98c985d1763620e08d98f740142
Signed-off-by: SiCong Li <sicong.li@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3923
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Signed-off-by: thecha01 <theo.charalambous@arm.com>
Change-Id: Ida819fb8c33790cc9da6d69eeb51e0599269197a
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3931
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Signed-off-by: thecha01 <theo.charalambous@arm.com>
Change-Id: I5cd26a8829060563d63d8c53e5148631ee053eca
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3912
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Prefer NEDepthwiseConvolutionLayerNativeKernel as it has a native format
of NHWC avoiding extra transformation to the NCHW domain.
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: If5d8de11691b8ef7f4c3816941f87417d0c8646b
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3930
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I2ccb2c65edd2932b76e905af3d747324b65c2f7f
Signed-off-by: thecha01 <theo.charalambous@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3910
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I93c3b795cf6fe0b27008543b6671a3be0a965603
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3916
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Treat bf16 memory on memset as raw memory by casting to void*. This
hides the class-memaccess warning and is safe for the current class
layout of arm_compute::bfloat16
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: I5e242827d3737b4491d29abe7570eefee5b6edc1
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3928
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
* Fix convert policy validate logics and add missing validate test
* Add S32 support to NEArithmeticSubtraction and NEArithmeticSubtractionKernel
* Add S32 validation tests
Change-Id: I1b6cb15b024613c202fe9f17747a83da43a5ddcf
Signed-off-by: SiCong Li <sicong.li@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3908
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
|
|
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: Ia7516fadcf3df072abf9b83aef4d9939212ce082
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3918
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Base the dimensions of the valid region generated by the reshape
kernel on the output shape dimensions.
This allows correct scaling on inputs that are in NHWC format and have
width and height equal to 1 e.g. 1x1x32.
Underlying problem causing this issue is the fact that Compute Library
removes trailing 1's of a given shape.
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: Icfdafc469214840998e7c198b33f7358d566d2e7
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3924
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Sang-Hoon Park <sang-hoon.park@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Signed-off-by: thecha01 <theo.charalambous@arm.com>
Change-Id: I6d6fb2b053c74e35a86841621486bc0cd34b12b3
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3911
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I5db780fb9c94160130e9986bbfc739124bfa8041
Signed-off-by: thecha01 <theo.charalambous@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3914
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- Remove padding from NEGEMMTranspose1xWKernel
- Extend test for validating zero padding requirement
Change-Id: I9ce4ca95a500229b045dc140cfff21fdf7373700
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3920
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- Remove padding from NEGEMMInterleave4x4Kernel
- Extend test for validating zero padding requirement
Change-Id: I94abc271e005f9dd6e1721b185631f55f598dbfd
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3915
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I65a738221a6c6fc3527ececda42f7a7e547755c1
Signed-off-by: morgolock <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3896
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: Iaaf3a72ec98a923ef2a4a39aeeb02f95795c2f6f
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3895
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: Ia3030ea701e9ceb2ef567e0258e8f478e18b8b55
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3871
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I81b0c2482bc20b1ab5124ed6179bb94cbced7875
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3869
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: Ic0569fe9ed99e61084b601ce84ddc7ef288d1899
Signed-off-by: SiCong Li <sicong.li@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3852
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
* Properly support "axis" in CL and NEON (and GC) SoftmaxLayer and
LogSoftmaxLayer in accord with mainstream frameworks. Axis now defines
the dimension on which softmax is performed, and supports the range
[-rank, rank)
* Extend validation tests to include valid and invalid axes
* Remove unnecessary LogSoftmaxLayer fixture, as it is only a
specialisation of the SoftmaxLayer fixture
* Change the validation fill value range from [-1000, 1000] to [-10,
10], as the former often results in sparse outputs with a single one and
zeros elsewhere
Change-Id: I8a0040453182b04ed88260de3ba434e98258d863
Signed-off-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3830
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Change-Id: Iedacf7094896f08d7c2847c8fb99bd7153deba2c
Signed-off-by: morgolock <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3809
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Sang-Hoon Park <sang-hoon.park@arm.com>
|
|
- Fixed issue where EltwiseLayerNode would base output shape
off of first input tensor only
- Allow QuantizationLayerNode to use any quantized data type
if specified in constructor
Signed-off-by: thecha01 <theo.charalambous@arm.com>
Change-Id: Ib93470316799028cd573592a3d79943493eaa093
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3737
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: Manuel Bottini <manuel.bottini@arm.com>
|
|
resnet_v2_50 when running as qasymm8 on mate20 GPU
Signed-off-by: Sheri Zhang <sheri.zhang@arm.com>
Change-Id: I0407fd1cdfb5d1d1d0f333e875ea45abdd2c5916
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3825
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Sang-Hoon Park <sang-hoon.park@arm.com>
|
|
Signed-off-by: thecha01 <theo.charalambous@arm.com>
Change-Id: I764f1eabb6412350eb719cc755b8777efc7d70a1
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3736
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
|
|
Signed-off-by: Sheri Zhang <sheri.zhang@arm.com>
Change-Id: I8045166d2d3202612fec3f6bf9651f3e6bfcb20f
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3783
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-by: Aleksandr Nikolaev <aleksandr.nikolaev@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I2c5e98aae7698963f106d7423df0e65cd00ee2a9
Signed-off-by: morgolock <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3710
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Sheri Zhang <sheri.zhang@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: If9a5c6ee3902a7381f4117e473adbddf006f3347
Signed-off-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3731
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Sang-Hoon Park <sang-hoon.park@arm.com>
|
|
Change-Id: I33436dab77a47868fbd9872e0b4cf54b3a173e65
Signed-off-by: SiCong Li <sicong.li@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3694
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Manuel Bottini <manuel.bottini@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I90e09e460b5d5d4f9ead8e3905833c5da3b9fbd6
Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3762
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Sheri Zhang <sheri.zhang@arm.com>
|
|
The issue concerned gemmlowp_mm_reshaped_only_rhs_t_fused_output_stage_fixedpoint.
In particular the issue was with the z index to access the elements from
the lhs reduced tensor used to calculate the offset contribution.
Change-Id: I74f6398fc08894fc323ccd04fda9220752652d31
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3726
Reviewed-by: TeresaARM <teresa.charlinreyes@arm.com>
Reviewed-by: Sang-Hoon Park <sang-hoon.park@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
CLGEMMMatrixMultiplyReshapedKernel
- Change the interface of STORE_BLOCK_BOUNDARY_AWARE passing the
conditions on Y and X rather than the X/ coordinates. This allows to
use the macro with both GEMM reshaped and GEMM reshaped rhs only
- Remove padding from the output tensor of
CLGEMMMatrixMultiplyReshapedKernel
- Add tests for validating the zero padding requirement
Change-Id: I13263cc71ce065c5be34ed198def320dd5823495
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3712
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I3f85ece08c9fc4bdbbfd72b5a872d4f2c4b76357
Signed-off-by: SiCong Li <sicong.li@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3709
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- Remove padding requirement for the input tensor of
CLGEMMReshapeLHSMatrixKernel
- Add utility function to load a boundary aware 2d tensor from buffer
- Extend validation for validating the zero padding requirement
Change-Id: I0ac6b1b517d75fd56998f406e0cce97b40918ce1
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3701
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
* Remove channel paddings from all nhwc kernels (im2col_3x3_nhwc,
im2col_9x9_nhwc, im2col_generic_nhwc)
* Validate that input total spatial dimensions (with x and y paddings)
are bigger than or equal to the kernel spatial dimension.
- Otherwise it would result in invalid memory reads.
- This problem likely existed before, but was made obvious with the
removal of implicit paddings
* Add zero padding validation tests
* Fix Im2ColValidationFixture by not permuting the input shape in case of
NHWC
Change-Id: I1f895e8938af0e9130cb516106f0b4b665531709
Signed-off-by: SiCong Li <sicong.li@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3696
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
For the elements that shouldn't contribute to the sum, zero
is used to compute the correct sum.
Change-Id: I5360534b5b0f81ee3d3aaaf5a046b99ecd943894
Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3703
Reviewed-by: Pablo Marquez <pablo.tello@arm.com>
Reviewed-by: TeresaARM <teresa.charlinreyes@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
* Put an additional cast for correctly handling scalar cases
According to opencl specs, logical operators, when performed on
scalar types, always return int regardless of the type of the scalar.
Thus if we were to use the results of a scalar logical op on the
method select, it would be incorrect for any types of width different
than 4 (the width of int)
A concrete example would be that if the VECTOR_SIZE is 1 (scalar case),
and DATA_TYPE is half/f16 (width < 4), then the result type of the ||
op would be int instead of short, which it's supposed to be, and this
would result in the ambiguous function call error for select.
Signed-off-by: SiCong Li <sicong.li@arm.com>
Change-Id: Ibc4985f707f667116668c43b9f9bf39dda789528
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3698
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Fix ArmNN compiling issue through removing defulat template value for offset_no_padding() and pooling2_nchw_maxpool_indices().
Signed-off-by: Sheri Zhang <sheri.zhang@arm.com>
Change-Id: Ie7114d102b8533e757b8e841afcac1adfbcb3b54
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3697
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Marquez <pablo.tello@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
COMPMID-3682: Fix performance regression on Mali-G76 (Convolution)
Updated the heuristic for GEMMReshapedOnlYRHS for Mali-G76 in order to
take into account small workload cases
Change-Id: I99fccbd0e94e4e21c0d1b88e23f02af06ef16ee9
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3689
Reviewed-by: SiCong Li <sicong.li@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|