path: root/src
Age | Commit message | Author
2020-07-28 | COMPMID-3385: Async support to CLArithmetic* kernels/functions Pt.2 | Michalis Spyrou
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Change-Id: Idc5ac2dd2ba5295c00c88b44a783645327a27e15 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3617 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-07-28 | COMPMID-3575: Mixed precision in NEInstanceNormalizationLayerKernel | Sang-Hoon Park
To fix the issue caused by the limited precision of FP16, mixed precision (float accumulator) is introduced to NEInstanceNormalizationLayerKernel. Since the reference kernel already uses mixed precision, mixed precision computation is currently the default when the kernel is called from NEInstanceNormalizationLayer. - Make NEInstanceNormalizationLayerKernel use a kernel descriptor to enable mixed precision computation - NEInstanceNormalizationLayer is modified to use the descriptor Change-Id: I7766622d715df054e303f9b441380b15b51f02b2 Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3589 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
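As an illustration of why a float accumulator helps here, the following is a minimal sketch and not the kernel's code. It emulates FP16 precision by rounding to 11 significant bits; the helper names are hypothetical, and exponent range, subnormals and overflow are ignored.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical helper: round x to 11 significant bits (the precision of an FP16
// significand). Exponent range, subnormals and overflow are deliberately ignored.
inline float round_to_fp16_precision(float x)
{
    assert(std::isfinite(x));
    if(x == 0.0f)
    {
        return x;
    }
    int         exp  = 0;
    const float mant = std::frexp(x, &exp); // x = mant * 2^exp with 0.5 <= |mant| < 1
    // nearbyint uses the current rounding mode (round-to-nearest-even by default)
    return std::ldexp(std::nearbyint(mant * 2048.0f) / 2048.0f, exp);
}

// Sum n copies of value, rounding the accumulator after every addition (FP16-like).
inline float sum_with_fp16_accumulator(float value, std::size_t n)
{
    float acc = 0.0f;
    for(std::size_t i = 0; i < n; ++i)
    {
        acc = round_to_fp16_precision(acc + value);
    }
    return acc;
}

// Same sum with a float accumulator (the mixed precision path).
inline float sum_with_fp32_accumulator(float value, std::size_t n)
{
    float acc = 0.0f;
    for(std::size_t i = 0; i < n; ++i)
    {
        acc += value;
    }
    return acc;
}
```

Summing 4096 ones with the FP16-like accumulator stalls at 2048, because 2049 is not representable with 11 significant bits, so a mean computed from that sum would be off by a factor of two. The float accumulator reaches 4096.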
2020-07-24 | COMPMID-3385: Async support to CLArithmetic* kernels/functions Pt.1 | Michalis Spyrou
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Change-Id: I94007565e688f8a0aead4f14c9fc30bfd9f9f7eb Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3613 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2020-07-24 | [ONCPUML-120]: Tweak of the launch heuristics for hybrid_u8u32_dot_16x4 kernel | Aleksandr Nikolaev
The hybrid kernel turns out to be faster for qasymm8 than quantized_wrapper with interleaved. Signed-off-by: Aleksandr Nikolaev <aleksandr.nikolaev@arm.com> Change-Id: I200646aee6cdcabfe125b746c7d87bfa7d06e0fc Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3585 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-07-24 | COMPMID-3393: Minor tweaks on memory injection interface | Georgios Pinitas
* Avoid the need to overload workspace() every time * Remove the Layer suffix from the operators * Clean interface by removing default arguments when unsupported Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I7710ecd485cae13e9c2d45216debbd8103bc5a0f Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3610 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-07-23 | COMPMID-3578: Update FP32/int8 kernel selection. | David Mansell
Upgrade the current 'is_preferred()' mechanism with a new framework, where kernels instead provide an estimated cycle count figure. Compatibility with old mechanism is achieved via a wrapper which replaces a "true" result with an estimate of 0, and a "false" result with UINT64_MAX. This mechanism is then used to select between 'interleaved' and 'hybrid' FP32 NEON kernels. This uses a simple system based on counting MACs performed and bytes of data transferred (for rearrange/merge operations) and dividing by fixed performance figures, which are provided for A53, A55, A73 and 'default' figures (based on A76). Separately, a new route for performing int8 GEMMs by using the int16 kernel is provided. This performs significantly (for uint8) or slightly (for int8) better on A53 than the existing int8 route. Optimized 8-to-16 bit transforms are also included. Change-Id: I53b2e59eb9368793c78c2081e17d2445361bcc47 Signed-off-by: David Mansell <David.Mansell@arm.com> Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/c/VisualCompute/ComputeLibrary/+/250120 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: bsgcomp <bsgcomp@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3609 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
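The selection scheme described in this message can be sketched roughly as follows. This is an illustrative sketch only: the type and function names (PerfFigures, estimate_cycles, estimate_from_is_preferred, select_kernel) are hypothetical and the performance figures are placeholders, not the values used for A53, A55, A73 or A76.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <string>
#include <vector>

// Hypothetical per-CPU performance figures (placeholder values are supplied by the caller).
struct PerfFigures
{
    double macs_per_cycle;
    double bytes_per_cycle;
};

// Estimate = MACs / MAC throughput + bytes moved by rearrange/merge / bandwidth.
inline uint64_t estimate_cycles(uint64_t macs, uint64_t bytes_moved, const PerfFigures &perf)
{
    assert(perf.macs_per_cycle > 0.0 && perf.bytes_per_cycle > 0.0);
    return static_cast<uint64_t>(macs / perf.macs_per_cycle + bytes_moved / perf.bytes_per_cycle);
}

// Compatibility wrapper for kernels that only answer "preferred or not":
// true maps to an estimate of 0, false to UINT64_MAX.
inline uint64_t estimate_from_is_preferred(bool preferred)
{
    return preferred ? 0 : std::numeric_limits<uint64_t>::max();
}

struct Candidate
{
    std::string name;
    uint64_t    estimated_cycles;
};

// Pick the candidate with the lowest estimate; the first candidate wins ties.
inline const Candidate *select_kernel(const std::vector<Candidate> &candidates)
{
    const Candidate *best = nullptr;
    for(const auto &c : candidates)
    {
        if(best == nullptr || c.estimated_cycles < best->estimated_cycles)
        {
            best = &c;
        }
    }
    return best;
}
```

With this mapping, a legacy kernel that reports "preferred" always beats any kernel with a non-zero estimate, and one that reports "not preferred" is never chosen over a kernel with a finite estimate.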
2020-07-22 | COMPMID-3600: Fix requantization in NEPixelWiseMultiplicationKernel | Michele Di Giorgio
Quantization wasn't done correctly and since we have helpers for that, the code has been modified to use them. Change-Id: Ia16577cea57dcb1864d91a06ab6aebf8ead67de5 Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3608 Reviewed-by: TeresaARM <teresa.charlinreyes@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-07-22 | COMPMID-3535: 9x9 Direct convolution support for CL and NHWC | Georgios Pinitas
* Supported strides 1 and 2 Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I4b9f087c0c328234159b2d1eacc2e465b3bb3c54 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3603 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-07-22 | COMPMID-3386: Support memory injection in CLConcatenate functions/kernels | Michele Di Giorgio
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I611adf4f506d406540e920b0bd6befb4b5108918 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3601 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-07-21 | COMPMID-3390: Async support to CLStridedSliceLayerKernel kernels/functions | Michalis Spyrou
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Change-Id: I9ff7e8d2fb4d36c4b7c44e885abf34ff6d4c577c Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3587 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-07-21 | COMPMID-3326: Update heuristic for GEMMReshaped and GEMMReshapedOnlyRHS | Gian Marco Iodice
- Update the heuristic for Arm Mali-G77 (F32) in order to use the OpenCL image2d object on GEMM Change-Id: Ife6736a22ec2a114368bb338908f0c5f239dfad6 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3593 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2020-07-21 | COMPMID-3600: MUL unit test failing with data type QUANT8_ASYMM | Sheri Zhang
Add broadcast support for NEPixelWiseMultiplicationKernel with QASYMM8/QASYMM8_SIGNED Add test case for QASYMM8 broadcast Fix QASYMM8 saturation issue Signed-off-by: Sheri Zhang <sheri.zhang@arm.com> Change-Id: Ie67cfa8b94ab542133b031efbff8379cc57cfc2d Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3586 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
2020-07-21 | COMPMID-3331 Remove y load padding from CLGEMMMatrixMultiplyReshapedOnlyRHSKernel and CLGEMMMatrixMultiplyNativeKernel | SiCong Li
Resolves: COMPMID-3333, COMPMID-3334 * Implement an "overlap load, but don't overlap store" strategy: - Change STORE_BLOCK_BOUNDARY_AWARE so that the partial block in y dimension is placed at the beginning instead of at the end. - Implement 3 auxiliary functions to calculate the lhs, bias and dst addresses, taking into account the potential partial block in y dimension. * Remove y load padding from Lhs and Bias tensors in CLGEMMMatrixMultiplyReshapedOnlyRHSKernel and CLGEMMMatrixMultiplyNativeKernel * Modify config tests to assert zero-padding in new dimensions Change-Id: I8f8585c7c0f543d720c2c91b885417c7dad35af4 Signed-off-by: SiCong Li <sicong.li@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3576 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
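One way to picture placing the partial block in the y dimension at the beginning is the host-side sketch below. It is an assumption-laden illustration, not the OpenCL helper code: BlockRows and block_rows are hypothetical names, m is the number of rows, m0 the block height, and m >= m0 is assumed so that a full-height load from the start row stays in bounds.

```cpp
#include <algorithm>
#include <cassert>

// Rows handled by one block in the y dimension (hypothetical type).
struct BlockRows
{
    int start_row;     // first row loaded and stored by this block
    int rows_to_store; // rows actually written; loads may still cover m0 rows
};

// Work-item y handles m0 rows. If m is not a multiple of m0, the partial block is
// block 0 and every later block is shifted up so that stores never overlap.
inline BlockRows block_rows(int y, int m0, int m)
{
    assert(y >= 0 && m0 > 0 && m >= m0);
    const int partial = m % m0;             // size of the partial block, 0 if none
    const int shift   = (m0 - partial) % m0; // rows by which blocks y > 0 move up
    const int start   = std::max(0, y * m0 - shift);
    const int rows    = (y == 0 && partial != 0) ? partial : m0;
    return BlockRows{start, rows};
}
```

For m = 10 and m0 = 4 the blocks store rows [0, 2), [2, 6) and [6, 10). Block 0 can still load four rows starting at row 0, which overlaps block 1's rows on load but not on store, so no load padding is needed past row 9.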
2020-07-20 | COMPMID-3532: Align data type support between doxygen and implementation - CL | Michele Di Giorgio
Also removes some unused code. Change-Id: I85687c40999c3cdf9e6fccfcd020b0901a9515fe Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3581 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-07-20 | COMPMID-3604: Graph failures during tuning | Georgios Pinitas
Update ICLTuner interface to account for the new memory injection interface. Redirect to appropriate kernel execution interface depending on if the kernel supports memory injection or not. Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I8ce29f5c22f1865c9e688d12b65e68ee4486f99c Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3588 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-07-17 | COMPMID-3577: 9x9 CLDirectConvolution failures | Michele Di Giorgio
Change-Id: I32588332080adfaa79227dadd0f152c1bd67ff62 Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3577 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2020-07-17 | COMPMID-3576: Nightly failure: NEON/PoolingLayer/Float/FP16/MaxUnpooling S10 | Sheri Zhang
Extend NEPoolingLayer max pooling to extract indices for FP16 Signed-off-by: Sheri Zhang <sheri.zhang@arm.com> Change-Id: I5a7c754be353e4c2c5d0ab3794e9427408d0c4fa Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3580 Reviewed-by: Sang-Hoon Park <sang-hoon.park@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-07-16 | [ONCPUML-97]: Implement "int8" support for 2D decomposition at high core counts | Aleksandr Nikolaev
Interleaved2d functionality was extended to uint8 and int8 kernels. Change-Id: If78facbce56e9ec7b2f4c23436af0bd5db7f7b69 Signed-off-by: Aleksandr Nikolaev <aleksandr.nikolaev@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3467 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-07-16 | COMPMID-3389: Async support to CLElementwiseUnaryLayerKernel kernels/functions | Michalis Spyrou
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Change-Id: I2ce75a4705cfd75e30aefa0a2ea31e751b975469 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3579 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-07-16 | COMPMID-3324: ADD CTS test failing with data type QUANT8 | Michalis Spyrou
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Change-Id: I744b1916801c6d299be24e48da2e82548c3bf514 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3582 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-07-15 | [ONCPUML-96] FP16 support for 2D decomposition at high core counts. | cfRod
Added changes to gemm_fp16 to pick gemm interleaved pretransposed 2D for hgemm_24x8 and sgemm_12x8. Also added the change for scheduling hints based on datatype F16. Signed-off-by: cfRod <crefeda.rodrigues@arm.com> Change-Id: Idd754cf14b47d00a70eab79bbb5ee3ecaf77450f Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3477 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-07-15 | COMPMID-3326: Update heuristic for GEMMReshaped and GEMMReshapedOnlyRHS | Gian Marco Iodice
- Update the heuristic for Arm Mali-G76 (F32) in order to use the OpenCL image2d object on GEMM - Create utility function to validate the support for image2d Change-Id: I0913ac5f27fd07992b0ac188af753a2abeb034ca Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3559 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-07-14 | COMPMID-3374: Remove memory state from NEConcatenateLayer kernels | Georgios Pinitas
* Allow the following kernels to accept backing memory at run-time: * NEBatchConcatenateLayerKernel * NEDepthConcatenateLayerKernel * NEHeightConcatenateLayerKernel * NEWidthConcatenateLayerKernel * Allow the following functions to accept backing memory at run-time: * NEConcatenateLayer Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: Ib0b6714cff7f06a52dc74d294bc3e0d72a1c2419 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3569 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-07-14 | COMPMID-3589: ADD CTS test failing with data type QUANT8_ASYMM | Michalis Spyrou
Pick the correct scales and offsets in case of broadcast. Added tests for quantized QUANT8_ASYMM. Change-Id: I04e90b8ae1f624b12bbdcf6ed9187e58b9135c85 Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3562 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-07-13 | COMPMID-3338 COMPMID-3336 COMPMID-3584 | SiCong Li
COMPMID-3338 Remove store padding in CLGEMMMatrixMultiplyReshapedOnlyRHSKernel COMPMID-3336 Remove store padding in CLGEMMMatrixMultiplyNativeKernel COMPMID-3584 Fix VSTORE to correctly deal with scalar case * Implement STORE_BLOCK_BOUNDARY_AWARE, as part of the COMPMID-3332 investigation, with the following substantial changes: - Separate STORE_BLOCK_PARTIAL, STORE_ROW_PARTIAL and VSTORE_PARTIAL so that this change does not affect kernels not using STORE_BLOCK_BOUNDARY_AWARE. - Revamp vstore_ext_n to vstore_partial_n, and enhance VSTORE_PARTIAL to correctly handle both vector and scalar cases * Remove the store padding (dst tensor) in CLGEMMMatrixMultiplyReshapedOnlyRHSKernel and CLGEMMMatrixMultiplyNativeKernel * Add configuration tests to check no padding is added by the configuration. Signed-off-by: SiCong Li <sicong.li@arm.com> Change-Id: I4f0907867979d8dacedd03b4bcbd2fb19e4f1602 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3522 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-07-13 | COMPMID-3531: fix index offset overflows in NEDirectConvolutionLayerKernel | Sang-Hoon Park
When a large input and kernel are used, the computation of the "max_offset" variable can overflow. Adjust the type of the variable, as well as the type of the variable it is compared with, for consistency. The test that spotted the overflow is added to the nightly suite. Change-Id: I2f114e4b49167889a6d3729c71823c089d6f42e3 Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3527 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Manuel Bottini <manuel.bottini@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
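The kind of overflow described here can be reproduced with a small sketch. The formula below is a hypothetical stand-in (a plain product of tensor dimensions and element size), not the kernel's actual "max_offset" expression; unsigned 32-bit arithmetic is used so that the wrap-around is well defined.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical offset computed entirely in 32 bits: wraps modulo 2^32 for large tensors.
inline uint32_t offset_32bit(uint32_t width, uint32_t height, uint32_t channels, uint32_t element_size)
{
    return width * height * channels * element_size;
}

// Same computation widened to 64 bits before the first multiplication.
inline uint64_t offset_64bit(uint32_t width, uint32_t height, uint32_t channels, uint32_t element_size)
{
    return static_cast<uint64_t>(width) * height * channels * element_size;
}
```

For a 4096 x 4096 x 64 tensor of 4-byte elements the true value is 2^32, which the 32-bit version reports as 0. Any bounds check that compares against that value would then be wrong, which is why the compared variable needs the wider type as well.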
2020-07-10 | COMPMID-3565: Exposes interface to enable thread binding | Georgios Pinitas
Expose `set_num_threads_with_affinity` as an interface to the `IScheduler` to allow binding of threads to given logical cores. Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I062db7caafb0101972ba45d31ee9e61b26800127 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3481 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
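A thread-to-core mapping of the sort such a binding interface needs might look like the sketch below. The commit message does not state the callback signature that `set_num_threads_with_affinity` expects, so this is a standalone, hypothetical helper (pick_core) that only shows the mapping logic and does not call the library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical mapping: assign thread_id to one of the allowed logical cores,
// wrapping around when there are more threads than cores.
inline int pick_core(int thread_id, const std::vector<int> &allowed_cores)
{
    assert(thread_id >= 0 && !allowed_cores.empty());
    return allowed_cores[static_cast<std::size_t>(thread_id) % allowed_cores.size()];
}
```

For example, restricting work to logical cores 4 to 7 maps threads 0, 1, 2, 3 to cores 4, 5, 6, 7, and thread 4 wraps back to core 4.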
2020-07-09 | COMPMID-3325: Add support in gemm_tuner for cl_image | Manuel Bottini
Change-Id: I78f815005516ca0e83366bab017884530c1d2e86 Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3518 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-07-09 | COMPMID-3324: Adjusting capitalization of Arm copyright claim to reflect Arm preferred presentation | Michele Di Giorgio
Change-Id: Ib7dcfcbb24b408999dfae366b9da396485aacf78 Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3525 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-07-08 | COMPMID-3574: add logarithm to LogSoftmaxLayer | Sang-Hoon Park
The missing logarithm of the summation is added to the NEON, CL and reference backends. To avoid complex changes, the log softmax layer on the CL backend doesn't support quantized data types. Tests and doxygen comments are modified accordingly. Change-Id: Iafd29291be8b81345cb4999b2668dbc3ae0c3345 Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3517 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
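For reference, the standard log softmax definition is log_softmax(x_i) = x_i - max(x) - log(sum_j exp(x_j - max(x))), where the last term is the logarithm of the summation. A minimal float reference sketch, assuming a single 1D vector and not taken from any of the backends:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Reference-style log softmax over one vector (illustrative sketch only).
inline std::vector<float> log_softmax(const std::vector<float> &x)
{
    assert(!x.empty());
    const float max_val = *std::max_element(x.begin(), x.end());

    float sum = 0.0f;
    for(float v : x)
    {
        sum += std::exp(v - max_val); // subtract the max for numerical stability
    }
    const float log_sum = std::log(sum); // the logarithm of the summation

    std::vector<float> out;
    out.reserve(x.size());
    for(float v : x)
    {
        out.push_back(v - max_val - log_sum);
    }
    return out;
}
```

A quick sanity check is that exponentiating the outputs gives values that sum to 1. Without the log of the summation, the outputs would only be the shifted inputs x_i - max(x).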
2020-07-07 | COMPMID-3532: Align data type support between doxygen and implementation - NEON | Michele Di Giorgio
Change-Id: I70662cfb43890873b706b3f22b348f5d8cdd63ca Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3506 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-by: Sheri Zhang <sheri.zhang@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-07-07 | COMPMID-3324: Remove pretransposed support from NEON backend | Georgios Pinitas
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I394c6c539969940e0119cbc14174909d47e65de6 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3519 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-07-07 | COMPMID-3387: Support memory injection in CLActivationLayer | Georgios Pinitas
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I31f9620607b372fc3340c71e748a5ea177d9da62 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3520 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-07-07 | COMPMID-3373: Async support to NEArithmetic* kernels/functions (Pt. 2) | Michalis Spyrou
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Change-Id: Iec06adb535aaf7efb1838d921e8d6bb978b7b215 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3498 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-07-06 | COMPMID-3532: Align data type support between doxygen and implementation - CPP | Michele Di Giorgio
The patch also removes some unused NEON kernels. Change-Id: I4a7622f31c88ee038b21874614a981764a03122a Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3509 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Sheri Zhang <sheri.zhang@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-07-06 | COMPID-3324: Clean GEMM kernels | Georgios Pinitas
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I170de1671e061a78740caee31fb4a1b8642c1369 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3505 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2020-07-06 | COMPMID-3573: Nightly failure: CL/GEMMConvolutionLayer/Quantized/QSYMM8_PER_CHANNEL | Sheri Zhang
Change-Id: I0248470f6119cfc8001a940684f7d3b22269b83f Signed-off-by: Sheri Zhang <sheri.zhang@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3512 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Sang-Hoon Park <sang-hoon.park@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-07-03 | COMPMID-3539: Change indexing for nearest neighbor with aligned corners | Sang-Hoon Park
For the nearest neighbor interpolation policy with aligned corners, NEON, CL and reference all use round() rather than floor to find the nearest integer. Change-Id: If0360da870e983303bf0424ca1100084084c1efc Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3495 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
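The effect of rounding on the source index can be seen in a small sketch. The function names are hypothetical, the scale is assumed to be (in_size - 1) / (out_size - 1) as is usual for aligned corners, and the truncating variant is shown only for comparison.

```cpp
#include <cassert>
#include <cmath>

// Source index for a destination index when the nearest integer is found with round().
inline int nearest_index_rounded(int dst_index, float scale)
{
    assert(dst_index >= 0 && scale > 0.0f);
    return static_cast<int>(std::round(dst_index * scale));
}

// Comparison only: always taking the lower index (floor) instead of the nearest one.
inline int nearest_index_floored(int dst_index, float scale)
{
    assert(dst_index >= 0 && scale > 0.0f);
    return static_cast<int>(std::floor(dst_index * scale));
}
```

Resizing 4 samples to 3 with aligned corners gives a scale of 1.5. Destination index 1 maps to position 1.5, which round() sends to source index 2, whereas taking the lower index would give 1. The end points (0 and 3) agree in both variants.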
2020-07-03 | COMPMID-3388: Async support to CLReshapeLayerKernel kernels/functions | Michalis Spyrou
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Change-Id: I141a943dfd691069317860e852ecdd0ba7391604 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3501 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-07-03 | COMPMID-3534: CLGEMMConvolutionLayer doesn't support QASYMM8_SIGNED properly | morgolock
- QASYMM8_SIGNED input and QSYMM8_PER_CHANNEL is permitted. - Validation tests are added. Change-Id: I9f699c323fa7e87afdc132c9b7888a56aebded6b Signed-off-by: morgolock <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3452 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2020-07-02 | COMPMID-3324: Fix per-channel quantization on N blocking | Georgios Pinitas
Direct the column to start from in the quantized code Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I8231e0b541c6b1b76becf349a1d6ddf973ade9e2 Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3488 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-07-02 | COMPMID-3477: Remove padding from NEPixelWiseMultiplicationKernel | Sheri Zhang
Remove padding from all NEPixelWiseMultiplicationKernel functions. Add test case for U8_U8_S16(input1,input2,output). Add reference code for U8_U8_S16(input1,input2,output). Remove window shrink test from NormalizationLayer. Signed-off-by: Sheri Zhang <sheri.zhang@arm.com> Change-Id: I28d89790c5527a42f918814a0ee3d6ec4c273532 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3468 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2020-07-02 | COMPMID-3501 Modify heuristics for f16+fastmath NEON Winograd Conv | SiCong Li
* Disable winograd on certain layers of squeezenet v1.1 * Fix winograd validate_kernel_3x3 Signed-off-by: SiCong Li <sicong.li@arm.com> Change-Id: I380c6e4a0f8338056839df3c8810f726227f210f Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3348 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-07-01 | COMPMID-3153: Remove padding from NENormalizationLayerKernel | Manuel Bottini
Change-Id: Ib84308ea18bfa31ffbc3269a1f005d7d302139f7 Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3350 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2020-06-30 | COMPMID-3324: Handle unused variable in SVE based GEMM kernels. | Georgios Pinitas
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: Ic201433d6c2191c1498390d97dd371e578a081fe Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3480 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-06-30 | COMPMID-3539: Ignore align_corners for scaled size of 1 | Sang-Hoon Park
Scale kernels failed to validate when align_corners is true for a scaled output size of 1. Change this behavior to ignore the align_corners value, in line with the expected behavior of higher-level frameworks. Also, the minimum output size generated by the fixture for Scale kernels is changed to 1. Change-Id: Ib8e479af8bc43de3780005545f0c53fe195dc22e Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3478 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
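The reason an output size of 1 is a special case is that the aligned-corners ratio divides by (out_size - 1). A minimal sketch of a guarded ratio, assuming the usual definitions and using a hypothetical function name rather than the library's helper:

```cpp
#include <cassert>

// Ratio between input and output sizes for resizing (illustrative sketch).
// align_corners is ignored when out_size is 1, since (out_size - 1) would be zero.
inline float resize_ratio(int in_size, int out_size, bool align_corners)
{
    assert(in_size > 0 && out_size > 0);
    const bool use_aligned = align_corners && out_size > 1;
    return use_aligned ? static_cast<float>(in_size - 1) / static_cast<float>(out_size - 1)
                       : static_cast<float>(in_size) / static_cast<float>(out_size);
}
```

With this guard, resizing 8 samples to 1 yields a ratio of 8 whether or not align_corners is set, instead of a division by zero.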
2020-06-29 | COMPMID-3322: Add cl_image support for GEMMReshapedOnlyRHS NT | Gian Marco Iodice
COMPMID-3323: Add cl_image support for GEMMReshapedOnlyRHS T - Added support for cl_image in CLGEMMMatrixMultiplyReshapedOnlyRHSKernel (both NT and T kernels) - Extended the tests to validate rhs_info.export_to_cl_image = true - Updated doxygen documentation in CLGEMMMatrixMultiplyReshapedOnlyRHSKernel.h Change-Id: If253794323aac072d84a4d8680b9a2339ab7ad92 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3437 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-06-29 | COMPMID-3562: Support QASYMM8_SIGNED in CLArgMinMaxLayerKernel | Sheri Zhang
Signed-off-by: Sheri Zhang <sheri.zhang@arm.com> Change-Id: I6c6efde06f000834b0b770889e3eb5ee0d14b027 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3476 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-06-26 | COMPMID-3560: Fix F16 performance regression (OpenCL) | Gian Marco Iodice
The performance regression was caused by a change in the interface of the OpenCL kernels gemm_mm_reshaped_lhs_* Change-Id: I030df4975dc040886c17e71710a27137b50edd9b Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3465 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-06-25 | COMPMID-3150: Remove padding from NEL2NormalizationLayerKernel | Georgios Pinitas
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I7ae0d56f1c1f55c7049509b1f80cc07bdc54c8ec Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3457 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>