Age | Commit message | Author |
|
Port the following functions:
- NECopy
- NEFill
- NEPermute
- NEReshapeLayer
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: I75f3f837012abab79c7dde9a20a34f64f75571d8
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4800
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- Refactor direct convolution for NHWC
- Remove old kernels for NHWC
- Change the heuristic in CLConvolutionLayer.cpp. The new direct
convolution implementation is faster than FFT
Resolves COMPMID-3908
Change-Id: Iee15ce7b04e21847b6eaae5c6d3c1b18180e7efc
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4876
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
- Rename NEArithmeticAdditionKernel to CpuAddKernel and move files appropriately
- Add CpuAdd under src/runtime/cpu/operators
Partially resolves: COMPMID-4005
Signed-off-by: Sheri Zhang <sheri.zhang@arm.com>
Change-Id: I1d8d406df9773fea198899f50327407d7125c38d
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4867
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Explicitly cast scalar to vector for LOGICAL_NOT
Related to COMPUTE-12536 and IVGCVSW-5617
Change-Id: I03accce000f8889fc4fb88c42c3c87845acb4f42
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4874
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- Rename all concatenate kernels to use the Cpu prefix and move
appropriately
Change-Id: If647173e84969936ebd211d4d5ae6d1e73150bdc
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4799
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Sang-Hoon Park <sang-hoon.park@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
|
|
Implements COMPMID-3875
Change-Id: I38991eed3f4966db125862af066bfedff5994a25
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4854
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Partially implements: COMPMID-4003
Change-Id: Ie51e43e24fb9a6b5b96d13cdc3d72fbda027a68b
Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4873
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: COMPMID-3990
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: If840c79209940535450f4ea1cbf6b0ec646a168e
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4866
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
With v8.6 arch flags, gcc10 fails to build because the type of
an argument does not match its template argument. This is
fixed by adding an explicit cast.
Resolves: COMPMID-4096
Change-Id: Ifc86c4b9afeb43594ea3b758de417dbdc1394880
Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4872
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
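
The commit above does not show the failing call, so the following is only a generic C++ illustration of the kind of error it describes: a function template whose parameters must all deduce to the same type, and an explicit cast that makes the arguments agree. The names `saturating_add_sketch` and `add_with_cast` are hypothetical and do not come from the library.

```cpp
#include <cassert>
#include <cstdint>

// All three parameters must deduce to the same T.
template <typename T>
inline T saturating_add_sketch(T a, T b, T max_value)
{
    // Integer promotion makes the comparison and the sum happen in int.
    return (a > max_value - b) ? max_value : static_cast<T>(a + b);
}

inline std::int16_t add_with_cast(std::int16_t a, int b)
{
    // saturating_add_sketch(a, b, 32767) would not compile: T deduces to
    // std::int16_t from `a` but to int from `b`. The casts make them agree.
    assert(b >= -32768 && b <= 32767);
    return saturating_add_sketch(a, static_cast<std::int16_t>(b), static_cast<std::int16_t>(32767));
}
```

Spelling the template argument explicitly at the call site would be another way to resolve the same mismatch; the commit message only states that a cast was used.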
|
|
The missing std headers (limits, algorithm, cstddef) are added
where they are needed.
Partially implements: COMPMID-3808
Change-Id: Ia31f75370f8440dcb753e5ac6eb2eac18e9c63f3
Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4861
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Large inputs produce incorrect Soft ReLU activation results: the
output saturates at around 88.72283. This is due to the
approximation algorithm used for the logarithm.
For this reason, we introduce a threshold such that for x > threshold
the Soft ReLU activation returns x itself.
SVE does not seem to suffer from the same issue, hence only the NEON
kernels are modified.
Resolves COMPMID-4091
Change-Id: I357883deed4e4aba571a1d3163267772096c0412
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4865
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
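
The thresholding described in the Soft ReLU commit above can be sketched in scalar C++ as follows. This is a minimal sketch, not the NEON kernel: the function name `soft_relu` and the default threshold of 20 are assumptions chosen for illustration, and the commit message does not state the threshold actually used.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: soft ReLU is log(1 + exp(x)). For large x the result
// is numerically indistinguishable from x, so x is returned directly once it
// exceeds the (assumed) threshold.
inline float soft_relu(float x, float threshold = 20.0f)
{
    assert(threshold > 0.0f);
    if(x > threshold)
    {
        return x; // log(1 + exp(x)) ~= x here; also keeps exp() far from overflow
    }
    return std::log1p(std::exp(x));
}
```

With this shape, `soft_relu(100.0f)` returns the input unchanged instead of a saturated value, while small inputs still go through the logarithm.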
|
|
GEMM function used within NEWinogradLayer re-transforms the weights
after the original winograd transformation leading to double allocation
of the weights. Release appropriately and retain only one copy of the
weights, the last transformed one.
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: I60459bfe370bff453150dfe9536cd9e7a5b56def
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4862
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- Add case for VEC_SIZE == 3 in the TRANSPOSED_U macro
Resolves: COMPMID-4094
Change-Id: I31870e589e66d895f9bf65c87aa04f32038365c0
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4864
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
By handling more general NxM blocks (where M and N can be 1, 2, 4, 8 or 16)
instead of only 4x4, 8x8 and 16x16, and by handling leftover corner values
with partial stores
Resolves: COMPMID-3923
Change-Id: I49b1a560c8325e00e061bd04edcf55034d04dcd8
Signed-off-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4780
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
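
The title of the commit above is not shown, so the kernel it changes is not named here. As a rough scalar analogy, assuming the change concerns a blocked transpose, the idea of general NxM blocks with clipped edge tiles might look like the sketch below. The function `blocked_transpose` is hypothetical; clipping each tile with `std::min` stands in for the partial stores mentioned in the message.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Transpose a rows x cols row-major matrix in block_h x block_w tiles.
// Tiles on the right or bottom edge are clipped, so leftover elements are
// written without touching memory outside the matrix.
inline std::vector<int> blocked_transpose(const std::vector<int> &src, std::size_t rows, std::size_t cols,
                                          std::size_t block_h, std::size_t block_w)
{
    assert(src.size() == rows * cols && block_h > 0 && block_w > 0);
    std::vector<int> dst(src.size());
    for(std::size_t r0 = 0; r0 < rows; r0 += block_h)
    {
        for(std::size_t c0 = 0; c0 < cols; c0 += block_w)
        {
            const std::size_t r_end = std::min(r0 + block_h, rows);
            const std::size_t c_end = std::min(c0 + block_w, cols);
            for(std::size_t r = r0; r < r_end; ++r)
            {
                for(std::size_t c = c0; c < c_end; ++c)
                {
                    dst[c * rows + r] = src[r * cols + c];
                }
            }
        }
    }
    return dst;
}
```

For example, a 2x3 input processed with 2x2 blocks has one full tile and one clipped 2x1 tile.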
|
|
* Add 'macos' as an additional OS build option
* Guard unsupported paths like thread scheduling control and hwcaps
checking with the __APPLE__ macro
* Map linker options to respective Mach-O linker options
Change-Id: I67bd9fa3c20831427b218ca7d3b4b9d454ab4fec
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4788
Reviewed-by: Sang-Hoon Park <sang-hoon.park@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
- Cast the destination pointer to (__global DATA_TYPE*) when VEC_SIZE == 1 in range.cl
Resolves: COMPMID-3906, COMPMID-4093
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Change-Id: Ic0a334d98785ea434ed81f89dbe34e7674991f82
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4792
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
|
|
Partially implements: COMPMID-3872
Change-Id: I76d81f2b8aa343f9d830298bc931e410c7c901bc
Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4842
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
|
|
- Add checks for pad top/bottom bigger than (kernel size / 2)
Resolves: COMPMID-4088
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Change-Id: Ifc5ea2154847d447bc5643d7607e7256aeddfcbf
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4840
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Use of proper vector size with boundary checking loads and stores
Resolves: COMPMID-3922
Change-Id: Ib631d499603b860fcfdbe3da903b866a125359a8
Signed-off-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4789
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
SVE kernels are added to all previously supported arithmetic
and comparison operations with exception of S16 arithmetic
operations due to complexity of widening and narrowing of
integer vectors.
Partially implements: COMPMID-3872
Change-Id: Ic433eb7227dfcfd0d1429f18acebec2d934ca8bd
Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4778
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- Decouple data types for the NEON NHWC implementation. Supported data types are: fp32, fp16, u8, s16, qasymm8, qasymm8_signed.
- Add SVE support for NHWC and all six data types shown above.
Resolves: COMPMID-3873
Change-Id: I097de119f4667b28b025a78cadf7185afa5f15f0
Signed-off-by: Sheri Zhang <sheri.zhang@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4766
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Add `get_tensor_shape_state` and `set_tensor_shape_state` to inject
shape dynamism.
The state is represented by an array of integers whose indices map to the
respective shape dimension indices.
If -1 is passed as a dimension state then the corresponding dimension
is dynamic.
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: I3a8a5ad109b90d4df8545b460a9f8dfcc13dfa0f
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4784
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
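
The shape-state convention described in the commit above can be illustrated with a small stand-in class. Only the accessor names `get_tensor_shape_state` and `set_tensor_shape_state` come from the commit message; the class `ShapeStateSketch`, the signatures, the `is_dynamic` helper and the use of 0 for a static dimension are assumptions made for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a tensor descriptor; not the library's class.
class ShapeStateSketch
{
public:
    explicit ShapeStateSketch(std::size_t num_dims) : _state(num_dims, 0)
    {
    }

    // state[i] describes shape dimension i; -1 marks that dimension as dynamic.
    void set_tensor_shape_state(const std::vector<int> &state)
    {
        assert(state.size() == _state.size());
        _state = state;
    }

    const std::vector<int> &get_tensor_shape_state() const
    {
        return _state;
    }

    // Assumed helper: any value other than -1 is treated as static here.
    bool is_dynamic(std::size_t dim) const
    {
        assert(dim < _state.size());
        return _state[dim] == -1;
    }

private:
    std::vector<int> _state;
};
```

Setting the state to `{0, -1, 0}` on a three-dimensional descriptor would mark only the middle dimension as dynamic.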
|
|
- Rename NEActivationLayer to CpuActivation
- Add member function to generate execution window
Partially Resolves: COMPMID-3992
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: I4e1ae15cf456b860d3080b2fedc4dbcce7d1bb79
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4791
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
|
|
Add padding checks in configure
Resolves: COMPMID-3914
Change-Id: Ia5be67283402d8811ceb3007be3a666ab502d775
Signed-off-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4787
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- Rename NEFloorKernel to CpuFloorKernel to accommodate new ISA
implementations
- Remove state and instead pass tensors to operate during run
- Add member function to generate an execution window given an input and
output tensor description
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: I9240b8ec534589c0f15c354f771f1ac5d7010c3b
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4773
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
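
The stateless design described in the commit above (no tensors stored in the kernel, an execution window derived from tensor descriptions) could be sketched as follows. Everything here is hypothetical: `WindowSketch`, `FloorKernelSketch`, `infer_window` and the use of element counts as a stand-in for tensor descriptions are illustrative and are not the library's interfaces.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical execution window: a half-open range of element indices.
struct WindowSketch
{
    std::size_t start;
    std::size_t end;
};

// Stateless kernel sketch: nothing is captured at configure time, the
// tensors are supplied on every run call.
struct FloorKernelSketch
{
    // Derive the execution window from the input and output descriptions,
    // reduced here to element counts.
    static WindowSketch infer_window(std::size_t src_elems, std::size_t dst_elems)
    {
        assert(src_elems == dst_elems);
        return WindowSketch{ 0, dst_elems };
    }

    static void run(const std::vector<float> &src, std::vector<float> &dst, const WindowSketch &window)
    {
        assert(window.start <= window.end);
        assert(window.end <= src.size() && window.end <= dst.size());
        for(std::size_t i = window.start; i < window.end; ++i)
        {
            dst[i] = std::floor(src[i]);
        }
    }
};
```

Because the kernel holds no tensor pointers, the same object can be reused across different input and output buffers, which is one plausible motivation for the change.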
|
|
- Change raw pointers in OpenCL kernel to __global uchar*
Resolves: COMPMID-4079
Change-Id: Ieeb99ced565bef59583216fd274958b29c7b2758
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4774
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Rename lws to tuning parameters in functions used externally
Add new generalized objects for the OpenCL Tuner to accommodate
further possible tuning parameters
Resolves: COMPMID-3935
Change-Id: I0f2a0f89bca5dae4a4e4adce2f7c7cae32ecb84a
Signed-off-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4584
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
and signed qasymm8 data.
Change-Id: I9249e7d4871d473cb5083d2225950faad6056eb4
Signed-off-by: Arnaud Grasset <arnaud.grasset@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4763
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: Sang-Hoon Park <sang-hoon.park@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
It also includes decoupling of kernels using different
data types.
Partially implements: COMPMID-3872
Change-Id: I226cb9e55a5d9f8a0c63e37631f087af45f2d640
Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4711
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
|
|
- Expose loose macros by prefixing "ARM_COMPUTE_"
Resolves: COMPMID-3701
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Change-Id: I4334b01c1a5cd8585f4a1ba2d870be956c61a83d
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4769
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I4ec7561a7f6a42a22b8187968ae302dbe75023bc
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4753
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Sang-Hoon Park <sang-hoon.park@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
FuseReLUIntoBatchNormFloat32CpuAccTest
1. Fix fusable and non-fusable configuration issue
2. Fix FP16 issue
Signed-off-by: Sheri Zhang <sheri.zhang@arm.com>
Change-Id: I6d0eacca7ac437f236ad403ddb283c10c8f419a6
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4761
Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Ensure that Im2Col transformation is valid for the given input
meta-data. In more detail, validate that the combination of input shape,
padding and kernel width leads to a valid execution window and output
shape.
Resolves: COMPMID-4040
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: Id813373b2efdfdfbe71dc0d0acc1d7bf8ecd5e84
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4757
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
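
The Im2Col validation described in the commit above can be approximated by a per-dimension check. This is a sketch under assumptions, not the library's validate routine: the helper names are hypothetical, dilation is ignored, and floor rounding is assumed for the output size.

```cpp
#include <cassert>

// Hypothetical check: the padded input must be at least as large as the
// kernel along this dimension, otherwise no output element can be produced.
inline bool is_valid_im2col_dim(int input, int kernel, int pad_before, int pad_after, int stride)
{
    if(input <= 0 || kernel <= 0 || stride <= 0 || pad_before < 0 || pad_after < 0)
    {
        return false;
    }
    return input + pad_before + pad_after >= kernel;
}

// Output size along one dimension (floor rounding, no dilation assumed).
inline int im2col_output_dim(int input, int kernel, int pad_before, int pad_after, int stride)
{
    assert(is_valid_im2col_dim(input, kernel, pad_before, pad_after, stride));
    return (input + pad_before + pad_after - kernel) / stride + 1;
}
```

For instance, an input width of 2 with padding 1 on each side cannot host a kernel of width 5, so such a configuration would be rejected before any window is computed.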
|
|
Resolves: COMPMID-3912
Change-Id: I1f8bd3bfec263ebfd70bc96f9183ccdc3089db13
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4754
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
- A few bit-width dependent intrinsics are added.
- A few math functions are added.
Partially implements: COMPMID-3872
Change-Id: Ia6ab46bd170fec9c7c8d4410b7ef4d84710b68ed
Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4718
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
- Fix unsupported native_cos and native_sin for half data types. Change to regular cos and sin functions.
Resolves: COMPMID-4064
Change-Id: Id07fa0fd811e00a93f5b848636ad4f4481e9a409
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4730
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
|
|
1. Decouple data type for NHWC
2. Add NHWC SVE support for BatchNormalization
Signed-off-by: Sheri Zhang <sheri.zhang@arm.com>
Change-Id: I0383b969b555b429d9acebb4efa17ecba9429ea7
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4755
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
|
|
* Add -C flag to instruct preprocessor not to strip comments. This
prevents marker comments like '// fall through', which suppress certain
warnings, from being removed.
* Fix unused variable warnings.
* Add M_PI definition that's missing from certain toolchain standard
libraries.
Resolves COMPMID-4054
Change-Id: I1d641db668685d4b678f3d0efed84bfe9e630b4b
Signed-off-by: SiCongLi <sicong.li@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4692
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
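
Two of the points in the commit above can be shown in a few lines of C++. The function `remaining_stages` is a hypothetical example, not library code. The marker comments are the kind that GCC's -Wimplicit-fallthrough recognises, which is why they need to survive preprocessing, and the `#ifndef` guard is one common way to supply M_PI when a standard library does not define it.

```cpp
#include <cassert>
#include <cmath>

// Some standard libraries only define M_PI when extensions are enabled.
#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

// Counts how many stages run when starting from `stage`; the cases fall
// through on purpose, and each fall-through is marked with a comment.
inline int remaining_stages(int stage)
{
    assert(stage >= 0);
    int count = 0;
    switch(stage)
    {
        case 0:
            ++count;
            // fall through
        case 1:
            ++count;
            // fall through
        case 2:
            ++count;
            break;
        default:
            break;
    }
    return count;
}
```

If the comments were stripped before the compiler proper saw the code, the same switch would be reported as an unannotated fall-through.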
|
|
- Fix erroneously typed pointers. Raw OpenCL pointers should be defined as pointing to 8bit values and then used with a cast to their true pointer types, due to offset calculation with strides
Resolves: COMPMID-4065
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Change-Id: I7e792bc22fbbc2ab6b65a8f5c4dc599f63e657a6
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4731
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
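
The pointer convention described in the commit above applies to OpenCL kernels, but the reasoning carries over to plain C++: when strides are expressed in bytes, the address arithmetic has to be done on a byte pointer, and only the final address is interpreted as the element type. The helper `load_element` below is a hypothetical sketch of that pattern.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Strides are in bytes, so offsets are added to a byte pointer. Adding the
// same offsets to a float pointer would scale them by sizeof(float) and
// land on the wrong element.
inline float load_element(const std::uint8_t *base, std::size_t x, std::size_t y,
                          std::size_t stride_x_bytes, std::size_t stride_y_bytes)
{
    assert(base != nullptr);
    const std::uint8_t *addr = base + x * stride_x_bytes + y * stride_y_bytes;
    float value;
    std::memcpy(&value, addr, sizeof(value)); // alignment-safe read of a float at addr
    return value;
}
```

In a kernel, the equivalent step would be casting the computed byte address to the true element pointer type at the point of the load or store.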
|
|
Resolves COMPMID-3918
Change-Id: I970b1eaf2ae6f2f5a8cfc318cd1a3dfd3ba36fdb
Signed-off-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4668
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
|
|
NCHW data layout
Fix border size for CLWinogradInputTransformKernel with NCHW data layout by setting it to the input's paddings. Add the new validation shapes to the WinogradInputTransform dataset
Resolves COMPMID-4042
Change-Id: Id93ac86e75c94ea3f2f35edcedebafada928f34a
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4694
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Resolves COMPMID-3905
Updates the following kernels:
- CLDeconvolutionLayerUpsampleKernel
- CLDeconvolutionReshapeOutputKernel
- CLInstanceNormalizationLayerKernel
- CLMaxUnpoolingLayerKernel
- CLPermuteKernel
- CLQLSTMLayerNormalizationKernel
- CLReorgLayerKernel
- CLReverseKernel
- CLSpaceToBatchLayerKernel
- CLSpaceToDepthLayerKernel
- CLGenerateProposalsLayerKernel
- CLFFTDigitReverseKernel
- CLFFTRadixStageKernel
- CLFFTScaleKernel
- CLFillBorderKernel
- CLGatherKernel
- CLStridedSliceKernel
- CLBoundingBoxTransformKernel
Change-Id: I067ec670ff9cceadb1dfbf60dabef311a567d99a
Signed-off-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4713
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Explicitly cast scalar to vector for LOGICAL_AND and LOGICAL_OR
Resolves COMPUTE-12536
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com>
Change-Id: Iabdf7feaef9cb9b41a2fc78e73473ebcfcc3e091
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4706
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Manuel Bottini <manuel.bottini@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I07222a9eb03c785bb63414f581152267b133e9fc
Signed-off-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4699
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
|
|
Suppresses pessimizing-move during clang compilation, as for some gcc
toolchains RVO is not ensured until C++17, so an explicit call to
std::move might be required to avoid a compilation error for non-copyable
objects (e.g. std::unique_ptr)
Resolves: COMPMID-3599
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: Ie3fa44fb0cf631655aecbeb6c82021a68f500a33
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4230
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
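
The situation described in the commit above can be illustrated with a move-only return value. This is a sketch under assumptions: the types `Base` and `Derived` and the function `make_object` are hypothetical, and whether a given compiler needs the explicit `std::move`, or warns about it, depends on its version and flags.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Base
{
    virtual ~Base() = default;
    virtual int id() const
    {
        return 0;
    }
};

struct Derived : Base
{
    int id() const override
    {
        return 1;
    }
};

// The local has a different type from the return type. Some older
// toolchains do not treat `return ptr;` as a move in this case and then
// reject it because std::unique_ptr is not copyable; the explicit std::move
// builds everywhere, but newer compilers may warn about it.
inline std::unique_ptr<Base> make_object()
{
    std::unique_ptr<Derived> ptr(new Derived());
    assert(ptr != nullptr);
    return std::move(ptr);
}
```

Suppressing the warning, as the commit does, keeps one spelling of the return statement valid across both kinds of toolchain.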
|
|
Adds support for ActivationLayer for SVE and SVE2.
Datatypes supported:
* FP32
* FP16
* QASYMM8
* QASYMM8_SIGNED
* QSYMM16
Change-Id: Ia3583891795cda4ca2f9fa27c440731a5c27710d
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4566
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Change-Id: Ib1ecd7aa10fec0b7e2b3d929e212c1af34c0f58d
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4533
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolves: COMPMID-4051
Change-Id: I0c0bf97212dd281c19d5081e6247e7dc0c23cd6b
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4687
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Upsample functions and kernels can be replaced with Scale, as they
provide the same functionality
Partially resolves: COMPMID-3996
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: Ic2f9ba352c183aa87d69d551d5c172d0f22119e8
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4679
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
- Fixed a bug that corrected the number of dimensions of a TensorShape for added trailing 1s
- Avoided adding offset_first_element for the Depthwise 3x3 NCHW OpenCL kernels, since it wouldn't align with the window which is based on the output
- Adjusted padding requirements along the x for Depthwise 3x3 NCHW. The kernel should always add 2 * dilation_(x/y) to the num_elems_read_x/y
- Adjusted the kernel's border_size given to the border handler at function level
- Added the dataset that previously made the tests fail
Resolves: COMPMID-4041
Change-Id: Ifab7d38b263f12173fcc96a5f0bd3375756c3c53
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4673
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|