Age | Commit message | Author |
|
* Remove padding only when user-supplied padding is empty
* Vectorize the case where output_window is not null and the output
window is narrow in x (smaller than vec_size_x)
Change-Id: I313089fe309e87e8529ecfd00542fcfa4dc44862
Signed-off-by: SiCong Li <sicong.li@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4193
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
The kernel was not using the preprocessor arguments needed to avoid the
use of padding.
Change-Id: I6b5fdf4f3f14edbef60b9d5b60179d619700bc00
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4232
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
In the gemmlowp_matrix_b_reduction kernel, the accumulator data type might be
set to uint if the input data type is unsigned quantized. However, the
output of this kernel is always a signed integer, hence we need to
convert the result before storing in memory.
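A minimal scalar sketch of the conversion described above, written in C++ rather than OpenCL C. The function name `reduce_columns_u8` and the row-major layout are assumptions made for illustration and are not taken from the kernel source.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: sums each column of a row-major uint8 matrix.
// The accumulator is unsigned, matching an unsigned quantized input,
// but the stored result is explicitly converted to a signed 32-bit integer.
std::vector<int32_t> reduce_columns_u8(const std::vector<uint8_t> &b, int rows, int cols)
{
    std::vector<int32_t> out(cols);
    for(int x = 0; x < cols; ++x)
    {
        uint32_t acc = 0; // unsigned accumulator
        for(int y = 0; y < rows; ++y)
        {
            acc += b[y * cols + x];
        }
        out[x] = static_cast<int32_t>(acc); // convert before storing
    }
    return out;
}
```

The explicit cast makes the stored type independent of the accumulator type, which is the point of the fix.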
Change-Id: I9b936fbbcb8cd64319c42872648f5058f686b228
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4233
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
COMPMID-3723: Remove OpenCL padding: CLGEMMLowpOffsetContributionOutputStageKernel
Change-Id: Iac265c2ac4c5749352daa311279a3b8c60ac3b3d
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4228
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Added utility functions developed by Giorgio for checking that padding
remains unchanged after configure.
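A rough sketch of how such a check can work: record the padding of each tensor before configure, then compare afterwards. The types `Padding` and `TensorInfo` and the functions `snapshot_padding` and `padding_changed` are hypothetical stand-ins, not the utilities added by this commit.

```cpp
#include <map>
#include <vector>

// Hypothetical padding description (one value per border).
struct Padding
{
    unsigned int top    = 0;
    unsigned int right  = 0;
    unsigned int bottom = 0;
    unsigned int left   = 0;
};

bool operator==(const Padding &a, const Padding &b)
{
    return a.top == b.top && a.right == b.right && a.bottom == b.bottom && a.left == b.left;
}

// Hypothetical tensor metadata holding only the padding.
struct TensorInfo
{
    Padding padding;
};

using PaddingSnapshot = std::map<const TensorInfo *, Padding>;

// Record the current padding of every tensor of interest.
PaddingSnapshot snapshot_padding(const std::vector<const TensorInfo *> &infos)
{
    PaddingSnapshot snapshot;
    for(const TensorInfo *info : infos)
    {
        snapshot[info] = info->padding;
    }
    return snapshot;
}

// Return true if any tensor's padding differs from the recorded value.
bool padding_changed(const PaddingSnapshot &snapshot)
{
    for(const auto &entry : snapshot)
    {
        if(!(entry.first->padding == entry.second))
        {
            return true;
        }
    }
    return false;
}
```

In this sketch a caller would take the snapshot at the start of configure and assert that `padding_changed` returns false at the end.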
Change-Id: I6862e74baf9b8792991e3f25e176c672c0a46836
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4208
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Change-Id: I5f77356bff6c6ab513ed3555466c8c5bf5f4c4e3
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4227
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
FP16/32
Removed unused N from partial block loading macro
Created utility to assert change in padding
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Change-Id: Ifdd30c66dbf5f2842c6b2d939000613d5011708e
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4192
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Guard the kernel with all required compile-time arguments, otherwise
the kernel might be wrongly included when compiling for other kernels
which don't have the required compile-time arguments, resulting in
mysterious kernel build errors.
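A small sketch of the guard pattern, shown with the C++ preprocessor, which behaves the same way as the OpenCL C one for this purpose. `DATA_TYPE` and `VEC_SIZE` are placeholder names for the required compile-time arguments; the actual argument names are not given in this message.

```cpp
// Hypothetical source file shared by several kernels.
// In a real build these would be passed as -D options by the host code.
#define DATA_TYPE float
#define VEC_SIZE 4

#if defined(DATA_TYPE) && defined(VEC_SIZE)
// Compiled only when every required argument is present. Without the guard,
// a build that omits one of them would fail on the undefined identifier.
DATA_TYPE sum_block(const DATA_TYPE *src)
{
    DATA_TYPE acc = 0;
    for(int i = 0; i < VEC_SIZE; ++i)
    {
        acc += src[i];
    }
    return acc;
}
#endif // defined(DATA_TYPE) && defined(VEC_SIZE)
```

Listing every required macro in the condition means the function simply disappears from builds that target a different kernel in the same file.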
Change-Id: Ib45b46a5ab14e6dc6a415c0466cf9a5963452364
Signed-off-by: SiCong Li <sicong.li@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4224
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I8a685d4ac5de747a0f775bd10be9c411cf394953
Signed-off-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4140
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-by: Sang-Hoon Park <sang-hoon.park@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: If9d6fa8c900b68c4b6fd373f2fc1f9abb83ea917
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4145
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Sang-Hoon Park <sang-hoon.park@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- Remove padding requirement from the OpenCL kernels
- Extend test to validate zero padding requirement
Change-Id: I1ddf04eba783721858792efb08a2c97f11f7297e
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4206
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- Refactor pooling layer kernels on OpenCL (F32/F16/QASYMM8) to avoid
padding and improve performance
- Add test for checking zero padding requirement
- Fix issue with extracting the index. The issue was caused by the
padding passed at compile time
- auto_init indices tensor in CLPoolingLayerKernel
Change-Id: I1ae5a2ef8c4ce787c80dcd73e35c17bb34623cb5
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4188
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: Ibbd6bee5c6a4ce4f212b207d17a65b9c33bcfa78
Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4106
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
CLGEMMLowpQuantizeDownInt32ScaleByFloatKernel
COMPMID-3725: Remove OpenCL padding: CLGEMMLowpQuantizeDownInt32ScaleKernel
Change-Id: Idea5974a56861efae3bc255f1224c7f1e88f3650
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4182
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- For AArch64, NEActivationLayerKernel uses vsqrt rather than
vinvsqrt.
- For non-AArch64, it masks values to ensure zero input
results in zero output without producing NaN.
- Test cases for FP16 and FP32's positive boundary values
are added.
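A scalar sketch of why a zero input needs masking when the square root is derived from a reciprocal square root. It assumes `vinvsqrt` denotes a reciprocal square root and uses plain C++ instead of NEON intrinsics; the function names are illustrative only.

```cpp
#include <cmath>

// Naive form: sqrt(x) computed as x * (1 / sqrt(x)).
// For x == 0 this evaluates 0 * inf, which is NaN under IEEE 754.
float sqrt_via_rsqrt_naive(float x)
{
    return x * (1.0f / std::sqrt(x));
}

// Masked form: replace a zero input with a harmless value before taking
// the reciprocal square root, then force the result back to zero.
float sqrt_via_rsqrt_masked(float x)
{
    const bool  is_zero = (x == 0.0f);
    const float safe    = is_zero ? 1.0f : x;
    const float result  = safe * (1.0f / std::sqrt(safe));
    return is_zero ? 0.0f : result;
}
```

A vectorized version would typically build the mask with a compare instruction and apply it with a bitwise select, but that detail is not stated in this message.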
Change-Id: Ic0104ee5d7045059c2e9bd052616a4a3b43a315d
Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4150
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: Iee28abcbba1e7b9e2f3aaa55685936dce815d5a3
Signed-off-by: morgolock <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4141
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
remove padding from related OpenCL kernels
Change-Id: I0b0be8fcccf511c7214e83ba6aa8d0e901bc4f3c
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4146
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Template parameter has been removed, which reduces the binary size by:
- ~4 kB for armv8.2a
- ~12 kB for armv8a
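A generic illustration of why dropping a template parameter can shrink the binary. The functions below are hypothetical and unrelated to the kernel touched by this commit; the message does not say which parameter was removed.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// Before (hypothetical): the flag is a template parameter, so the compiler
// emits a separate copy of the function body for each value that is used.
template <bool saturate>
void add_u8_templated(const uint8_t *a, const uint8_t *b, uint8_t *out, std::size_t n)
{
    for(std::size_t i = 0; i < n; ++i)
    {
        const int sum = a[i] + b[i];
        out[i]        = static_cast<uint8_t>(saturate ? std::min(sum, 255) : sum);
    }
}

// After (hypothetical): the flag is a runtime argument, so only one copy of
// the body is emitted, at the cost of a branch or select inside the loop.
void add_u8(const uint8_t *a, const uint8_t *b, uint8_t *out, std::size_t n, bool saturate)
{
    for(std::size_t i = 0; i < n; ++i)
    {
        const int sum = a[i] + b[i];
        out[i]        = static_cast<uint8_t>(saturate ? std::min(sum, 255) : sum);
    }
}
```

Both forms compute the same values; the trade-off is code size against a decision made at run time.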
Change-Id: Ib499a18a4980a3ee7b201507b943f900adf20a73
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4122
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I22b907eebfbe037e6e1c7bf604172f4709a9cbed
Signed-off-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4082
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: I9d3122b4858137d422548d1d417eb04a27ae9c7b
Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4143
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: TeresaARM <teresa.charlinreyes@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Signed-off-by: Sheri Zhang <sheri.zhang@arm.com>
Change-Id: If077a245156be69f34834cbfbd0a36e570ee4149
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4107
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
|
|
Change-Id: I09f557b5cecafc669e12764e8592457212168d62
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4131
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
COMPMID-3709 Remove OpenCL padding: CLDepthConcatenateLayerKernel
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Change-Id: Iaea4fafd5d0f081fd5b45b0f6945302dc3365bd9
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4105
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com>
Change-Id: Id4d95c6ce5fed91bb079b8bfe1abceedefd20c97
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4117
Reviewed-by: Sheri Zhang <sheri.zhang@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: Ie95442c6c6a145c1a45937b03cbd433bf08e36ab
Signed-off-by: morgolock <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4094
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Decouples data types for NEFloorKernel
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: I6756300540bc5ef32a9990246eed8619a76855f2
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4084
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com>
Change-Id: I8cfdd24c4e71a6a4be610ba67a75ad2943a43801
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4097
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
macro
Change-Id: I73edadc7299247e7bc51ae37c00d3709023da44a
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4073
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- Update the heuristic (m==1) for CLGEMMReshapedOnlyRHS
Change-Id: I216c158f2802d3d331e23e0d9eb0127107ec8af0
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4092
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: I524b0c4b49c7a7035b7d078b9585d77b0d438e10
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4083
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
COMPMID-3803: Remove padding from NEComplexPixelWiseMultiplicationKernel
Signed-off-by: Sheri Zhang <sheri.zhang@arm.com>
Change-Id: I309fc4ab62bacbca9203d2680a9d6d52f76f70e6
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4078
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-by: Pablo Marquez <pablo.tello@arm.com>
|
|
Change-Id: I530b12c6270d7dbeb3ef7af62484842ebcb65925
Signed-off-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4000
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
|
|
Change-Id: Ibf7741ffdefcceb9683c919e79302fc35c36ea65
Signed-off-by: morgolock <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4031
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Removing the bool template parameter reduces the binary size by 20 kB.
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com>
Change-Id: I652cea7d320a00b6c6e44cdacb61e77f3c10e56a
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4053
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Signed-off-by: SiCong Li <sicong.li@arm.com>
Change-Id: Ifae31c74eb46c561225394a387fc15332423bfa9
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4030
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
|
|
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com>
Change-Id: I3c5cfe50e9cee30b66f4094da105d383c077aaf9
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4044
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
- Update heuristic for CLGEMMReshapedKernel - FP16
- Update heuristic for CLGEMMReshapedOnlyRHSKernel - FP16
Change-Id: I35aa73e59d8c2d1bc6b2dd318fd8eeb3e42c27a4
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4026
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
very few rows
Also added 2D version of the 16-bit route, and altered the selection
heuristic so that 2D mode will be used in cases where 1D mode won't
thread well.
Change-Id: I0057fde08456771dc0090ac51f50d82f8bb86044
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3903
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- The change affects Mali-G71 GPUs and should improve the performance of
GEMM in case of m = 1
Change-Id: I6b0e217e93fe468ec1325a5da74684811519c42f
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4002
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
CLGEMMMatrixMultiplyReshapedKernel
Resolves: COMPMID-3671, COMPMID-3672
- Extend cl image support to f16 in CLGEMMMatrixMultiplyReshapedKernel
- Extend cl image support to f16 in CLGEMMMatrixMultiplyReshapedOnlyRHSKernel
- Change the interface of create_image2d_from_buffer
- Extend test
Change-Id: I27363be71fa515fbf71aa4be5ed0d6c730f38f34
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3992
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Also solves:
- COMPMID-3766: CTS Failures in Transpose Neon + FP16
Change-Id: I9d323f45f49cc0bce9e6329790bcf2f0eeec8572
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3949
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-by: Pablo Marquez <pablo.tello@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com>
Change-Id: Ib5b252e1b65794a8f360276d03ff94922e1991f8
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3946
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Division follows the flooring division approach where for example 5/2=2 while
-5/2=-3
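A short sketch of flooring division for scalar integers, matching the examples above. The helper name `floor_div` is an assumption for illustration and a non-zero divisor is assumed.

```cpp
// Integer division rounding towards negative infinity.
// The built-in operator/ truncates towards zero, so the quotient is lowered
// by one when the remainder is non-zero and its sign differs from the divisor's.
int floor_div(int a, int b)
{
    const int q = a / b;
    const int r = a % b;
    return (r != 0 && ((r < 0) != (b < 0))) ? q - 1 : q;
}
```

With this helper, `floor_div(5, 2)` gives 2 and `floor_div(-5, 2)` gives -3, whereas the built-in `-5 / 2` gives -2.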
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: I65756e0b31fe8d97f743a4c13dc5f96304722f75
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3929
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: Iaf1465f3144371e153ce123ac00da5cc092f77df
Signed-off-by: morgolock <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3939
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
* Add S32 support to NEPixelWiseMultiplication and NEPixelWiseMultiplicationKernel
* Scale == 1/255 is not supported for S32, as on non-aarch64 the
precision requirement is not met, and scale is a non-standard
parameter anyway.
* Fix the data type validation logic to also test for all invalid data
type combinations.
* Add validation tests for S32 NEON PixelWiseMultiplication
* The wrap tolerance for ScaleOther (scale == 1/2^n) cases is set to
1 instead of 0 because the reference uses floating point division
followed by rounding, which isn't bit accurate.
Change-Id: I28839afda7a4f98c985d1763620e08d98f740142
Signed-off-by: SiCong Li <sicong.li@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3923
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Prefer NEDepthwiseConvolutionLayerNativeKernel as it has a native format
of NHWC avoiding extra transformation to the NCHW domain.
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: If5d8de11691b8ef7f4c3816941f87417d0c8646b
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3930
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I93c3b795cf6fe0b27008543b6671a3be0a965603
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3916
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Treat bf16 memory on memset as raw memory by casting to void*. This
hides the class-memaccess warning and is safe for the current class
layout of arm_compute::bfloat16
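A minimal sketch of the cast, using a hypothetical `Bf16` class as a stand-in for arm_compute::bfloat16. It assumes the class holds nothing but a 16-bit payload, which is the layout condition the message refers to.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a bfloat16 wrapper: a single 16-bit payload plus
// user-provided constructors, which makes the type non-trivial and causes
// GCC to emit -Wclass-memaccess when memset is called on it directly.
class Bf16
{
public:
    Bf16() : _bits(0) {}
    explicit Bf16(uint16_t bits) : _bits(bits) {}
    uint16_t bits() const { return _bits; }

private:
    uint16_t _bits;
};

// Zero-fill by treating the storage as raw bytes. The cast to void* marks
// the byte-wise write as intentional. This is only safe while the class
// contains just the 16-bit payload and all-zero bytes are a valid value.
void zero_fill(Bf16 *data, std::size_t count)
{
    std::memset(static_cast<void *>(data), 0, count * sizeof(Bf16));
}
```

If the class later gained extra members or virtual functions, the byte-wise fill would need to be revisited.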
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: I5e242827d3737b4491d29abe7570eefee5b6edc1
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3928
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
* Fix convert policy validation logic and add missing validation test
* Add S32 support to NEArithmeticSubtraction and NEArithmeticSubtractionKernel
* Add S32 validation tests
Change-Id: I1b6cb15b024613c202fe9f17747a83da43a5ddcf
Signed-off-by: SiCong Li <sicong.li@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3908
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
|
|
Base the dimensions of the valid region generated by the reshape
kernel on the output shape dimensions.
This allows correct scaling on inputs that are in NHWC format and have
width and height equal to 1 e.g. 1x1x32.
The underlying problem is that Compute Library removes trailing 1's of
a given shape.
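A toy model of the trailing-1 removal, to show how a 1x1x32 input loses its spatial dimensions. The function `strip_trailing_ones` is hypothetical and is not the library's shape class; the innermost-first dimension order (channels, width, height) is also an assumption.

```cpp
#include <cstddef>
#include <vector>

// Toy model: drop trailing dimensions of size 1, keeping at least one.
std::vector<std::size_t> strip_trailing_ones(std::vector<std::size_t> dims)
{
    while(dims.size() > 1 && dims.back() == 1)
    {
        dims.pop_back();
    }
    return dims;
}
```

Under these assumptions an NHWC tensor with 32 channels and width and height of 1 is stored as {32, 1, 1} and collapses to {32}, so code that derives the valid region from the number of input dimensions no longer sees width and height. Basing the valid region on the output shape avoids depending on those dropped dimensions.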
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: Icfdafc469214840998e7c198b33f7358d566d2e7
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3924
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Sang-Hoon Park <sang-hoon.park@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|