aboutsummaryrefslogtreecommitdiff
path: root/src
AgeCommit message (Collapse)Author
2020-09-04COMPMID-3157: Remove padding from NEGEMMTranspose1xWKernelGian Marco Iodice
- Remove padding from NEGEMMTranspose1xWKernel - Extend test for validating zero padding requirement Change-Id: I9ce4ca95a500229b045dc140cfff21fdf7373700 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3920 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-09-03COMPMID-3143: Remove padding from NEGEMMInterleave4x4KernelGian Marco Iodice
- Remove padding from NEGEMMInterleave4x4Kernel - Extend test for validating zero padding requirement Change-Id: I94abc271e005f9dd6e1721b185631f55f598dbfd Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3915 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-09-03COMPMID-3750: Disable asm kernels when shifts are negative.morgolock
Change-Id: I65a738221a6c6fc3527ececda42f7a7e547755c1 Signed-off-by: morgolock <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3896 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-09-03COMPMID-3772: Update GEMM selection heuristic for Mali-G76 (F32)Gian Marco Iodice
Change-Id: Iaaf3a72ec98a923ef2a4a39aeeb02f95795c2f6f Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3895 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-08-28COMPMID-3770: Add batch size in the OpenCL GEMM kernel selectionGian Marco Iodice
Change-Id: Ia3030ea701e9ceb2ef567e0258e8f478e18b8b55 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3871 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-08-28COMPMID-3504: Add support for BOOL in NEON comparison operatorsMichele Di Giorgio
Change-Id: I81b0c2482bc20b1ab5124ed6179bb94cbced7875 Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3869 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-08-28COMPMID-3670 Extend cl image support to f16 in CLGEMMReshapeRHSMatrixKernelSiCong Li
Change-Id: Ic0569fe9ed99e61084b601ce84ddc7ef288d1899 Signed-off-by: SiCong Li <sicong.li@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3852 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-08-25COMPMID-3694 COMPMID-3695 COMPMID-3458: Softmax AxisSiCong Li
* Properly support "axis" in CL and NEON (and GC) SoftmaxLayer and LogSoftmaxLayer in accord with mainstream frameworks. Axis now defines the dimension on which softmax is performed, and supports the range [-rank, rank) * Extend validation tests to include valid and invalid axes * Remove unnecessary LogSoftmaxLayer fixture, as it is only a specialisation of the SoftmaxLayer fixture * Change the validation fill value range from [-1000, 1000] to [-10, 10], as the former often results in sparse outputs with a single one and zeros elsewhere Change-Id: I8a0040453182b04ed88260de3ba434e98258d863 Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3830 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2020-08-25COMPMID-3661: Added multidimension support to OMP scheduler.morgolock
Change-Id: Iedacf7094896f08d7c2847c8fb99bd7153deba2c Signed-off-by: morgolock <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3809 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Sang-Hoon Park <sang-hoon.park@arm.com>
2020-08-25Fix EltwiseLayerNode and QuantizationLayerNodethecha01
- Fixed issue where EltwiseLayerNode would base output shape off of first input tensor only - Allow QuantizationLayerNode to use any quantized data type if specified in constructor Signed-off-by: thecha01 <theo.charalambous@arm.com> Change-Id: Ib93470316799028cd573592a3d79943493eaa093 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3737 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Manuel Bottini <manuel.bottini@arm.com>
2020-08-24COMPMID-3698: Fix segfault running inception_v3, inception_v4, resnet50, ↵Sheri Zhang
resnet_v2_50 when running as qasymm8 on mate20 GPU Signed-off-by: Sheri Zhang <sheri.zhang@arm.com> Change-Id: I0407fd1cdfb5d1d1d0f333e875ea45abdd2c5916 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3825 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Sang-Hoon Park <sang-hoon.park@arm.com>
2020-08-24Add support ElementwiseMax operator in graph APIthecha01
Signed-off-by: thecha01 <theo.charalambous@arm.com> Change-Id: I764f1eabb6412350eb719cc755b8777efc7d70a1 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3736 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2020-08-20COMPMID-3696: Padded dilated Conv2D segmentationSheri Zhang
Signed-off-by: Sheri Zhang <sheri.zhang@arm.com> Change-Id: I8045166d2d3202612fec3f6bf9651f3e6bfcb20f Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3783 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-by: Aleksandr Nikolaev <aleksandr.nikolaev@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-08-19MLCE-229: Support for negative shifts in asm kernelsmorgolock
Change-Id: I2c5e98aae7698963f106d7423df0e65cd00ee2a9 Signed-off-by: morgolock <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3710 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Sheri Zhang <sheri.zhang@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-08-19COMPMID-3502: Add support of different quantization input/output for ReduceMeanManuel Bottini
Change-Id: If9a5c6ee3902a7381f4117e473adbddf006f3347 Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3731 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Sang-Hoon Park <sang-hoon.park@arm.com>
2020-08-18COMPMID-3454 Patch1: Relocate data_type_from_name to core/UtilsSiCong Li
Change-Id: I33436dab77a47868fbd9872e0b4cf54b3a173e65 Signed-off-by: SiCong Li <sicong.li@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3694 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Manuel Bottini <manuel.bottini@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-08-18COMPMID-3687: Remove deprecated functions in 20.05 releaseSang-Hoon Park
Change-Id: I90e09e460b5d5d4f9ead8e3905833c5da3b9fbd6 Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3762 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Sheri Zhang <sheri.zhang@arm.com>
2020-08-12COMPMID-3608: Fix z index in gemmlowp_mm_reshaped_only kernelGian Marco Iodice
The issue concerned gemmlowp_mm_reshaped_only_rhs_t_fused_output_stage_fixedpoint. In particular the issue was with the z index to access the elements from the lhs reduced tensor used to calculate the offset contribution. Change-Id: I74f6398fc08894fc323ccd04fda9220752652d31 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3726 Reviewed-by: TeresaARM <teresa.charlinreyes@arm.com> Reviewed-by: Sang-Hoon Park <sang-hoon.park@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-08-12COMPMID-3337: Remove write paddings in both axes from ↵Gian Marco Iodice
CLGEMMMatrixMultiplyReshapedKernel - Change the interface of STORE_BLOCK_BOUNDARY_AWARE passing the conditions on Y and X rather than the X/ coordinates. This allows to use the macro with both GEMM reshaped and GEMM reshaped rhs only - Remove padding from the output tensor of CLGEMMMatrixMultiplyReshapedKernel - Add tests for validating the zero padding requirement Change-Id: I13263cc71ce065c5be34ed198def320dd5823495 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3712 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-08-11COMPMID-3339: Fix doxygen comments about VECTOR_SIZE and BOUNDARY_VECTOR_SIZESiCong Li
Change-Id: I3f85ece08c9fc4bdbbfd72b5a872d4f2c4b76357 Signed-off-by: SiCong Li <sicong.li@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3709 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-08-11COMPMID-3335: Remove x/y-axis padding from CLGEMMReshapeLHSMatrixKernelGian Marco Iodice
- Remove padding requirement for the input tensor of CLGEMMReshapeLHSMatrixKernel - Add utility function to load a boundary aware 2d tensor from buffer - Extend validation for validating the zero padding requirement Change-Id: I0ac6b1b517d75fd56998f406e0cce97b40918ce1 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3701 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-08-11COMPMID-3339: Patch2: Remove paddings from im2col*_nhwc cl kernelSiCong Li
* Remove channel paddings from all nhwc kernels (im2col_3x3_nhwc, im2col_9x9_nhwc, im2col_generic_nhwc) * Validate that input total spatial dimensions (with x and y paddings) are bigger than or equal to the kernel spatial dimension. - Otherwise it would result in invalid memory reads. - This problem likely existed before, but was made obvious with the removal of implicit paddings * Add zero padding validation tests * Fix Im2ColValidationFixture by not permuting the input shape in case of NHWC Change-Id: I1f895e8938af0e9130cb516106f0b4b665531709 Signed-off-by: SiCong Li <sicong.li@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3696 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-08-11COMPMID-3607: Fix softmax summation logic for QASYMM8_SIGNEDSang-Hoon Park
For the elements that shouldn't contribute to the sum, zero is used to compute the correct sum. Change-Id: I5360534b5b0f81ee3d3aaaf5a046b99ecd943894 Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3703 Reviewed-by: Pablo Marquez <pablo.tello@arm.com> Reviewed-by: TeresaARM <teresa.charlinreyes@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-08-11COMPMID-3339: Patch1: Fix incorrect select casting in im2col nhwc kernelsSiCong Li
* Put an additional cast for correctly handling scalar cases According to opencl specs, logical operators, when performed on scalar types, always return int regardless of the type of the scalar. Thus if we were to use the results of a scalar logical op on the method select, it would be incorrect for any types of width different than 4 (the width of int) A concrete example would be that if the VECTOR_SIZE is 1 (scalar case), and DATA_TYPE is half/f16 (width < 4), then the result type of the || op would be int instead of short, which it's supposed to be, and this would result in the ambiguous function call error for select. Signed-off-by: SiCong Li <sicong.li@arm.com> Change-Id: Ibc4985f707f667116668c43b9f9bf39dda789528 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3698 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-08-10COMPMID-3700: ArmNN OOB test failureSheri Zhang
Fix ArmNN compiling issue through removing defulat template value for offset_no_padding() and pooling2_nchw_maxpool_indices(). Signed-off-by: Sheri Zhang <sheri.zhang@arm.com> Change-Id: Ie7114d102b8533e757b8e841afcac1adfbcb3b54 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3697 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Marquez <pablo.tello@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-08-07COMPMID-3683: Fix performance regression on Mali-G76 (Fully connected)Gian Marco Iodice
COMPMID-3682: Fix performance regression on Mali-G76 (Convolution) Updated the heuristic for GEMMReshapedOnlYRHS for Mali-G76 in order to take into account small workload cases Change-Id: I99fccbd0e94e4e21c0d1b88e23f02af06ef16ee9 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3689 Reviewed-by: SiCong Li <sicong.li@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-08-07COMPMID-3656: Disabled reduce_axis in LOG_SOFTMAX and SOFTMAXmorgolock
Our implementation of reduce_axis is only compliant for default_axis. Validate will throw an error when trying to use a different axis. Change-Id: I4c02aa055bb4474593a3114ec9c83884d3c9120f Signed-off-by: morgolock <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3658 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com>
2020-08-06COMPMID-3652 Fix CLFullyConnectedLayer failure on S10SiCong Li
* Fix out-of-bound mem reads in cases where M < M0 in CLGEMMMatrixMultiplyNativeKernel and CLGEMMMatrixMultiplyReshapedOnlyRHSKernel, as a result of the new boundary-aware reading logics. * Add fixture tests (alongside the padding configuration tests) in these 2 kernels to catch all 5 possible scenarios with block dimension configurations, which includes this particular bug as the "...BoundaryHandlingFullInXSinglePartialInY" test case Change-Id: I8a10ab67594171e3ea4fb6e35c84ddc4ab964fba Signed-off-by: SiCong Li <sicong.li@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3650 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-08-06Added missing parameter num_groups to the validate call of NEConvolutionLayer.Alexander Jung
Change-Id: I5f14b9175b0d72133536578c1d019cab98b2f746 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3679 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Marquez <pablo.tello@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-08-05COMPMID-2450: Implement CLMaxUnpoolingLayerGian Marco Iodice
- Add OpenCL kernel for Max unpooling layer - Add tests for validating the result Change-Id: If7ca79566a1198e3141f880abf46738980a62c81 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3606 Reviewed-by: Pablo Marquez <pablo.tello@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-08-05COMPMID-2479: Extend CLPoolingLayer max pooling to extract indicesSheri Zhang
Fix PoolingLayer max pooling reference bug to extract indices. Extend CLPoolingLayer max pooling to extract indices, all the paddings need to be substracted. Signed-off-by: Sheri Zhang <sheri.zhang@arm.com> Change-Id: If8e82e7f7e03172ad05f5a7cd5f13cf682fd1ffc Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3649 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Marquez <pablo.tello@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-08-05COMPMID-3392: Collapse TensorMaps into a single TensorPackGeorgios Pinitas
Collapse InputTensorMap and OutputTensorMap to a single TensorPack mechanism. Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: Ie2fdfc6b07d84ad589169ec99ca64fcf45a00bec Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/c/VisualCompute/ComputeLibrary/+/253783 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3641 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Reviewed-by: Sheri Zhang <sheri.zhang@arm.com>
2020-07-31COMPMID-3653 CL GEMM kernel creation error on certain combinations of N and N0SiCong Li
* Fix invalid use of vstore_partial_1 * Add configuration tests to catch this error case Signed-off-by: SiCong Li <sicong.li@arm.com> Change-Id: I25a2b16a530992acc869a4335c48a8fffa420850 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3628 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-07-31COMPMID-3624: CTS failure on Resize quantized in Neon and CLMichele Di Giorgio
Allow computations with aligned corners when the tensors have width/height equal to 1. Change-Id: Ia01733f6c02e0740835b26a794b9a79fa35319b4 Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3634 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Sadik Armagan <sadik.armagan@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-07-31COMPMID-3324: Fix oclgrind warningsMichalis Spyrou
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Change-Id: Ib14d158b9c5568981835312dcd9d5b9ca116649a Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3637 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-07-29COMPMID-3585: Android R while_fib_n_5_quant8 failure on CpuAccMichele Di Giorgio
Assembly kernels do not support quantized GEMM when the multiplier is greater than 1 (which leads to negative result shift used for requantization). Change-Id: I7a766cd0d13f549d217613ca67bc952923f309de Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3538 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-07-29COMPMID-2078: Remove legacy TODOsGeorgios Pinitas
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: Ic05ef206f76477cc2fbb9e7ad56ec1974fa013ea Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3626 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2020-07-28COMPMID-3385: Async support to CLArithmetic* kernels/functions Pt.2Michalis Spyrou
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Change-Id: Idc5ac2dd2ba5295c00c88b44a783645327a27e15 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3617 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-07-28COMPMID-3575: Mixed preicision in NEInstanceNormalizationLayerKernelSang-Hoon Park
In order to fix the issue caused by the limited precision of FP16. mixed precision (float accumulator) is introduced to NEInstanceNormalizationLayerKernel. Since the reference kernel is doing the mixed precision, currently mixed preicision computation is default when it is called from NEInstanceNormalizationLayer. - Make NEInstanceNormalizationLayerKernel use kernel descriptor to enable mixed precision computation - NEInstanceNormalizationLayer is modified to use the descriptor Change-Id: I7766622d715df054e303f9b441380b15b51f02b2 Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3589 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-07-24COMPMID-3385: Async support to CLArithmetic* kernels/functions Pt.1Michalis Spyrou
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Change-Id: I94007565e688f8a0aead4f14c9fc30bfd9f9f7eb Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3613 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2020-07-24[ONCPUML-120]: Tweak of the launch heuristics for hybrid_u8u32_dot_16x4 kernelAleksandr Nikolaev
Hybrid kernel turns to be faster for qasymm8 than quantized_wrapper with interleaved. Signed-off-by: Aleksandr Nikolaev <aleksandr.nikolaev@arm.com> Change-Id: I200646aee6cdcabfe125b746c7d87bfa7d06e0fc Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3585 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-07-24COMPMID-3393: Minor tweaks on memory injection interfaceGeorgios Pinitas
* Avoid the need to overload workspace() everytime * Remove the Layer suffix from the operators * Clean interface by removing default arguments when unsupported Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I7710ecd485cae13e9c2d45216debbd8103bc5a0f Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3610 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-07-23COMPMID-3578: Update FP32/int8 kernel selection.David Mansell
Upgrade the current 'is_preferred()' mechanism with a new framework, where kernels instead provide an estimated cycle count figure. Compatibility with old mechanism is achieved via a wrapper which replaces a "true" result with an estimate of 0, and a "false" result with UINT64_MAX. This mechanism is then used to select between 'interleaved' and 'hybrid' FP32 NEON kernels. This uses a simple system based on counting MACs performed and bytes of data transferred (for rearrange/merge operations) and dividing by fixed performance figures, which are provided for A53, A55, A73 and 'default' figures (based on A76). Separately, a new route for performing int8 GEMMs by using the int16 kernel is provided. This performs significantly (for uint8) or slightly (for int8) better on A53 than the existing int8 route. Optimized 8-to-16 bit transforms are also included. Change-Id: I53b2e59eb9368793c78c2081e17d2445361bcc47 Signed-off-by: David Mansell <David.Mansell@arm.com> Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/c/VisualCompute/ComputeLibrary/+/250120 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: bsgcomp <bsgcomp@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3609 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2020-07-22COMPMID-3600: Fix requantization in NEPixelWiseMultiplicationKernelMichele Di Giorgio
Quantization wasn't done correctly and since we have helpers for that, the code has been modified to use them. Change-Id: Ia16577cea57dcb1864d91a06ab6aebf8ead67de5 Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3608 Reviewed-by: TeresaARM <teresa.charlinreyes@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-07-22COMPMID-3535: 9x9 Direct convolution support for CL and NHWCGeorgios Pinitas
* Supported strides 1 and 2 Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I4b9f087c0c328234159b2d1eacc2e465b3bb3c54 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3603 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-07-22COMPMID-3386: Support memory injection in CLConcatenate functions/kernelsMichele Di Giorgio
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I611adf4f506d406540e920b0bd6befb4b5108918 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3601 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-07-21COMPMID-3390: Async support to CLStridedSliceLayerKernel kernels/functionsMichalis Spyrou
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Change-Id: I9ff7e8d2fb4d36c4b7c44e885abf34ff6d4c577c Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3587 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-07-21COMPMID-3326; Update heuristic for GEMMReshaped and GEMMReshapedOnlyRHSGian Marco Iodice
- Update the heuristic for Arm Mali-G77 (F32) in order to use the OpenCL image2d object on GEMM Change-Id: Ife6736a22ec2a114368bb338908f0c5f239dfad6 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3593 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2020-07-21COMPMID-3600: MUL unit test failing with data type QUANT8_ASYMMSheri Zhang
Add broadcast support for NEPixelWiseMultiplicationKernel with QASYMM8/QASYMM8_SIGNED Add test case for QASYMM8 broadcast Fix QASYMM8 saturation issue Signed-off-by: Sheri Zhang <sheri.zhang@arm.com> Change-Id: Ie67cfa8b94ab542133b031efbff8379cc57cfc2d Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3586 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
2020-07-21COMPMID-3331 Remove y load padding from ↵SiCong Li
CLGEMMMatrixMultiplyReshapedOnlyRHSKernel and CLGEMMMatrixMultiplyNativeKernel Resolves: COMPMID-3333, COMPMID-3334 * Implement an "overlap load, but don't overlap store" strategy: - Change STORE_BLOCK_BOUNDARY_AWARE so that the partial block in y dimension is placed at the beginning instead of at the end. - Implement 3 auxiliary functions to calculate the lhs, bias and dst addresses, taking into account the potential partial block in y dimension. * Remove y load padding from Lhs and Bias tensors in CLGEMMMatrixMultiplyReshapedOnlyRHSKernel and CLGEMMMatrixMultiplyNativeKernel * Modify config tests to assert zero-padding in new dimensions Change-Id: I8f8585c7c0f543d720c2c91b885417c7dad35af4 Signed-off-by: SiCong Li <sicong.li@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3576 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>