Age | Commit message | Author
2020-08-25 | COMPMID-3694 COMPMID-3695 COMPMID-3458: Softmax Axis | SiCong Li
* Properly support "axis" in CL and NEON (and GC) SoftmaxLayer and LogSoftmaxLayer in accord with mainstream frameworks. Axis now defines the dimension on which softmax is performed, and supports the range [-rank, rank) * Extend validation tests to include valid and invalid axes * Remove unnecessary LogSoftmaxLayer fixture, as it is only a specialisation of the SoftmaxLayer fixture * Change the validation fill value range from [-1000, 1000] to [-10, 10], as the former often results in sparse outputs with a single one and zeros elsewhere Change-Id: I8a0040453182b04ed88260de3ba434e98258d863 Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3830 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
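The axis range and the fill-range change described above can be illustrated with a small Python sketch. It is not the library's implementation: `wrap_axis` and `softmax` are hypothetical helper names, and the sketch only assumes that an axis in [-rank, rank) maps to its non-negative equivalent in the usual way.

```python
import math


def wrap_axis(axis, rank):
    """Map an axis in [-rank, rank) to its non-negative equivalent."""
    if not -rank <= axis < rank:
        raise ValueError(f"axis {axis} is outside [{-rank}, {rank})")
    return axis % rank


def softmax(values):
    """Numerically stable softmax over a flat list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Widely spread inputs collapse to a single one and zeros elsewhere,
# which is why a narrower fill range exercises the kernel better.
sparse = softmax([-1000.0, 0.0, 1000.0])
dense = softmax([-10.0, 0.0, 10.0])
```

With inputs drawn from [-1000, 1000] the smaller exponentials underflow to zero, so the output carries almost no information for validation; with [-10, 10] every element stays strictly positive.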
2020-08-25 | COMPMID-3661: Added multidimension support to OMP scheduler. | morgolock
Change-Id: Iedacf7094896f08d7c2847c8fb99bd7153deba2c Signed-off-by: morgolock <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3809 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Sang-Hoon Park <sang-hoon.park@arm.com>
2020-08-25 | Fix EltwiseLayerNode and QuantizationLayerNode | thecha01
- Fixed issue where EltwiseLayerNode would base output shape off of first input tensor only - Allow QuantizationLayerNode to use any quantized data type if specified in constructor Signed-off-by: thecha01 <theo.charalambous@arm.com> Change-Id: Ib93470316799028cd573592a3d79943493eaa093 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3737 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Manuel Bottini <manuel.bottini@arm.com>
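One plausible way to derive an element-wise output shape from both inputs, rather than from the first input alone, is a broadcast rule. The sketch below is an assumption about the general idea and not the code in this commit; `broadcast_shape` is a hypothetical helper, and it assumes dimensions are aligned starting from index 0 with missing dimensions treated as 1.

```python
def broadcast_shape(shape_a, shape_b):
    """Combine two shapes; a dimension of 1 stretches to match the other."""
    rank = max(len(shape_a), len(shape_b))
    # Assumption: shapes are aligned from dimension 0; pad the shorter with 1s.
    a = list(shape_a) + [1] * (rank - len(shape_a))
    b = list(shape_b) + [1] * (rank - len(shape_b))
    out = []
    for dim_a, dim_b in zip(a, b):
        if dim_a != dim_b and 1 not in (dim_a, dim_b):
            raise ValueError(f"cannot broadcast {shape_a} with {shape_b}")
        out.append(max(dim_a, dim_b))
    return tuple(out)


# The first input alone would give (1, 8, 4); both inputs give (16, 8, 4).
result = broadcast_shape((1, 8, 4), (16, 8, 4))
```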
2020-08-25 | COMPMID-3749: Adjust FP32 tolerance for NEScale validation | Sang-Hoon Park
To prevent unexpected failures in some cases, a larger tolerance value is used, matching CL's relative tolerance value. Change-Id: If6e3bc2f30651c54769dcd8dd647a3233a88c488 Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3826 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Marquez <pablo.tello@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-08-24 | COMPMID-3698: Fix segfault running inception_v3, inception_v4, resnet50, resnet_v2_50 when running as qasymm8 on mate20 GPU | Sheri Zhang
Signed-off-by: Sheri Zhang <sheri.zhang@arm.com> Change-Id: I0407fd1cdfb5d1d1d0f333e875ea45abdd2c5916 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3825 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Sang-Hoon Park <sang-hoon.park@arm.com>
2020-08-24 | Update SONAME_VERSION in SConscript | Sang-Hoon Park
Change-Id: I8e5695dc7f9e9dd4b3b81487b1ad991920a12292 Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3779 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-08-24 | COMPMID-3655: PixelWiseMultiplication incorrectly validated in Graph | Manuel Bottini
Change-Id: Iaae7c3fb8812c9a0c9547dfa28dda7810a81de82 Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3727 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Sang-Hoon Park <sang-hoon.park@arm.com>
2020-08-24 | Add support for ElementwiseMax operator in graph API | thecha01
Signed-off-by: thecha01 <theo.charalambous@arm.com> Change-Id: I764f1eabb6412350eb719cc755b8777efc7d70a1 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3736 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2020-08-24 | COMPMID-3747 Remove unnecessary and problematic constness from non-static private members | SiCong Li
* These classes could not be moved or assigned to prior to the change Signed-off-by: SiCong Li <sicong.li@arm.com> Change-Id: I4c9d97726749cb6ba69ddd4f419fb0f63db2261f Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3784 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-08-20 | COMPMID-3696: Padded dilated Conv2D segmentation | Sheri Zhang
Signed-off-by: Sheri Zhang <sheri.zhang@arm.com> Change-Id: I8045166d2d3202612fec3f6bf9651f3e6bfcb20f Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3783 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-by: Aleksandr Nikolaev <aleksandr.nikolaev@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-08-19 | MLCE-229: Support for negative shifts in asm kernels | morgolock
Change-Id: I2c5e98aae7698963f106d7423df0e65cd00ee2a9 Signed-off-by: morgolock <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3710 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Sheri Zhang <sheri.zhang@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-08-19 | COMPMID-3502: Add support of different quantization input/output for ReduceMean | Manuel Bottini
Change-Id: If9a5c6ee3902a7381f4117e473adbddf006f3347 Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3731 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Sang-Hoon Park <sang-hoon.park@arm.com>
2020-08-18 | COMPMID-3689: Update function list in doxygen | Sang-Hoon Park
- NE/CLMaxUnpoolingLayer have been added to the function list. - Missing documentation for deprecated interface has been added to the release note. Change-Id: I1bd5f37a15545ffc714a9fa7a0d56738d4744fd3 Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3785 Reviewed-by: Sheri Zhang <sheri.zhang@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-08-18 | COMPMID-3690: Update release note | Sang-Hoon Park
- Missing documentation for padding-removed NEON kernels has been added. - Missing documentation for removed NEON kernel has been added. - Minor format clean-up. Change-Id: Id3ca2c9998d220c7e63b2343306caff13fcc3a34 Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3777 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Sheri Zhang <sheri.zhang@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-08-18 | COMPMID-3454 Patch1: Relocate data_type_from_name to core/Utils | SiCong Li
Change-Id: I33436dab77a47868fbd9872e0b4cf54b3a173e65 Signed-off-by: SiCong Li <sicong.li@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3694 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Manuel Bottini <manuel.bottini@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-08-18 | COMPMID-3687: Remove deprecated functions in 20.05 release | Sang-Hoon Park
Change-Id: I90e09e460b5d5d4f9ead8e3905833c5da3b9fbd6 Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3762 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Sheri Zhang <sheri.zhang@arm.com>
2020-08-13 | COMPMID-3702: Update documentation | Gian Marco Iodice
- Update documentation about remove padding in GEMM - OpenCL - Update documentation about the OpenCL image object support in GEMM Change-Id: I015193ee5c5b946cf053968eeeacc042b33b6f6e Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3728 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Sheri Zhang <sheri.zhang@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-08-13 | COMPMID-3743: Fix for benchmark_gemm_examples.sh | Gian Marco Iodice
The regular expression used for the gemm shapes was incorrect and caused a few tests to be skipped - Fix the regular expression to retrieve the gemm shapes from the csv file Change-Id: Ib99d9dede728d3aba4beadc460b2a30050fba8f9 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3732 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-08-12 | COMPMID-3608: Fix z index in gemmlowp_mm_reshaped_only kernel | Gian Marco Iodice
The issue concerned gemmlowp_mm_reshaped_only_rhs_t_fused_output_stage_fixedpoint. In particular the issue was with the z index to access the elements from the lhs reduced tensor used to calculate the offset contribution. Change-Id: I74f6398fc08894fc323ccd04fda9220752652d31 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3726 Reviewed-by: TeresaARM <teresa.charlinreyes@arm.com> Reviewed-by: Sang-Hoon Park <sang-hoon.park@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-08-12 | COMPMID-3337: Remove write paddings in both axes from CLGEMMMatrixMultiplyReshapedKernel | Gian Marco Iodice
- Change the interface of STORE_BLOCK_BOUNDARY_AWARE to pass the conditions on Y and X rather than the X/Y coordinates. This allows the macro to be used with both GEMM reshaped and GEMM reshaped rhs only - Remove padding from the output tensor of CLGEMMMatrixMultiplyReshapedKernel - Add tests for validating the zero padding requirement Change-Id: I13263cc71ce065c5be34ed198def320dd5823495 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3712 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-08-12 | COMPMID-3456 Update gemm tuner documentation | SiCong Li
* Update README with the improvements * Add a new step-by-step example section Change-Id: I4d76821fb6c2f3b5edd54edfeff053e1c92fbb6e Signed-off-by: SiCong Li <sicong.li@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3713 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Sheri Zhang <sheri.zhang@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-08-12 | COMPMID-3699: Nightly failure CL DirectConvolution | Manuel Bottini
Tolerance issue Change-Id: I0246b70b03520b13a6a1bc5a92fb4787d7c0e734 Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3711 Reviewed-by: SiCong Li <sicong.li@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-08-11 | COMPMID-3339: Fix doxygen comments about VECTOR_SIZE and BOUNDARY_VECTOR_SIZE | SiCong Li
Change-Id: I3f85ece08c9fc4bdbbfd72b5a872d4f2c4b76357 Signed-off-by: SiCong Li <sicong.li@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3709 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-08-11 | COMPMID-3335: Remove x/y-axis padding from CLGEMMReshapeLHSMatrixKernel | Gian Marco Iodice
- Remove padding requirement for the input tensor of CLGEMMReshapeLHSMatrixKernel - Add utility function to load a boundary aware 2d tensor from buffer - Extend validation for validating the zero padding requirement Change-Id: I0ac6b1b517d75fd56998f406e0cce97b40918ce1 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3701 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-08-11 | COMPMID-3339: Patch2: Remove paddings from im2col*_nhwc cl kernel | SiCong Li
* Remove channel paddings from all nhwc kernels (im2col_3x3_nhwc, im2col_9x9_nhwc, im2col_generic_nhwc) * Validate that input total spatial dimensions (with x and y paddings) are bigger than or equal to the kernel spatial dimension. - Otherwise it would result in invalid memory reads. - This problem likely existed before, but was made obvious with the removal of implicit paddings * Add zero padding validation tests * Fix Im2ColValidationFixture by not permuting the input shape in case of NHWC Change-Id: I1f895e8938af0e9130cb516106f0b4b665531709 Signed-off-by: SiCong Li <sicong.li@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3696 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
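The spatial-dimension validation described above can be sketched as a simple predicate. This is a hedged illustration only: `im2col_input_is_valid` and its parameter names are hypothetical, and the sketch assumes that the "total" dimension is the input size plus the padding on both sides of that axis.

```python
def im2col_input_is_valid(width, height, pad_left, pad_right,
                          pad_top, pad_bottom, kernel_w, kernel_h):
    """Return True when the padded input is at least as large as the kernel."""
    total_w = width + pad_left + pad_right
    total_h = height + pad_top + pad_bottom
    # A kernel larger than the padded input would read outside the tensor.
    return total_w >= kernel_w and total_h >= kernel_h


too_small = im2col_input_is_valid(2, 2, 0, 0, 0, 0, 3, 3)
padded_ok = im2col_input_is_valid(2, 2, 1, 0, 1, 0, 3, 3)
```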
2020-08-11 | COMPMID-3607: Fix softmax summation logic for QASYMM8_SIGNED | Sang-Hoon Park
For the elements that shouldn't contribute to the sum, zero is used to compute the correct sum. Change-Id: I5360534b5b0f81ee3d3aaaf5a046b99ecd943894 Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3703 Reviewed-by: Pablo Marquez <pablo.tello@arm.com> Reviewed-by: TeresaARM <teresa.charlinreyes@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-08-11 | COMPMID-3339: Patch1: Fix incorrect select casting in im2col nhwc kernels | SiCong Li
* Add an additional cast to handle scalar cases correctly. According to the OpenCL spec, logical operators, when performed on scalar types, always return int regardless of the type of the scalar. Thus, if the result of a scalar logical op were passed to select, it would be incorrect for any type whose width differs from 4 (the width of int). As a concrete example, if VECTOR_SIZE is 1 (scalar case) and DATA_TYPE is half/f16 (width < 4), the result type of the || op would be int instead of the expected short, which would result in an ambiguous function call error for select. Signed-off-by: SiCong Li <sicong.li@arm.com> Change-Id: Ibc4985f707f667116668c43b9f9bf39dda789528 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3698 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-08-10 | COMPMID-3700: ArmNN OOB test failure | Sheri Zhang
Fix ArmNN compilation issue by removing the default template value for offset_no_padding() and pooling2_nchw_maxpool_indices(). Signed-off-by: Sheri Zhang <sheri.zhang@arm.com> Change-Id: Ie7114d102b8533e757b8e841afcac1adfbcb3b54 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3697 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Marquez <pablo.tello@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-08-07 | COMPMID-3683: Fix performance regression on Mali-G76 (Fully connected) | Gian Marco Iodice
COMPMID-3682: Fix performance regression on Mali-G76 (Convolution) Updated the heuristic for GEMMReshapedOnlyRHS for Mali-G76 in order to take into account small workload cases Change-Id: I99fccbd0e94e4e21c0d1b88e23f02af06ef16ee9 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3689 Reviewed-by: SiCong Li <sicong.li@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-08-07 | COMPMID-3656: Disabled reduce_axis in LOG_SOFTMAX and SOFTMAX | morgolock
Our implementation of reduce_axis is only compliant for default_axis. Validate will throw an error when trying to use a different axis. Change-Id: I4c02aa055bb4474593a3114ec9c83884d3c9120f Signed-off-by: morgolock <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3658 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com>
2020-08-06 | COMPMID-3652 Fix CLFullyConnectedLayer failure on S10 | SiCong Li
* Fix out-of-bound mem reads in cases where M < M0 in CLGEMMMatrixMultiplyNativeKernel and CLGEMMMatrixMultiplyReshapedOnlyRHSKernel, as a result of the new boundary-aware reading logic. * Add fixture tests (alongside the padding configuration tests) in these 2 kernels to catch all 5 possible scenarios with block dimension configurations, which include this particular bug as the "...BoundaryHandlingFullInXSinglePartialInY" test case Change-Id: I8a10ab67594171e3ea4fb6e35c84ddc4ab964fba Signed-off-by: SiCong Li <sicong.li@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3650 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-08-06 | Added missing parameter num_groups to the validate call of NEConvolutionLayer. | Alexander Jung
Change-Id: I5f14b9175b0d72133536578c1d019cab98b2f746 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3679 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Marquez <pablo.tello@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-08-05 | COMPMID-3516: Update documentation for new operators in 20.08 | Sheri Zhang
Signed-off-by: Sheri Zhang <sheri.zhang@arm.com> Change-Id: I419db7a5711e4727176c2960444ae32e07d8a9a6 Signed-off-by: Sheri Zhang <sheri.zhang@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3536 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Marquez <pablo.tello@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-08-05 | COMPMID-2450: Implement CLMaxUnpoolingLayer | Gian Marco Iodice
- Add OpenCL kernel for Max unpooling layer - Add tests for validating the result Change-Id: If7ca79566a1198e3141f880abf46738980a62c81 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3606 Reviewed-by: Pablo Marquez <pablo.tello@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-08-05 | COMPMID-2479: Extend CLPoolingLayer max pooling to extract indices | Sheri Zhang
Fix PoolingLayer max pooling reference bug to extract indices. Extend CLPoolingLayer max pooling to extract indices; all the paddings need to be subtracted. Signed-off-by: Sheri Zhang <sheri.zhang@arm.com> Change-Id: If8e82e7f7e03172ad05f5a7cd5f13cf682fd1ffc Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3649 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Marquez <pablo.tello@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-08-05 | COMPMID-3392: Collapse TensorMaps into a single TensorPack | Georgios Pinitas
Collapse InputTensorMap and OutputTensorMap to a single TensorPack mechanism. Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: Ie2fdfc6b07d84ad589169ec99ca64fcf45a00bec Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/c/VisualCompute/ComputeLibrary/+/253783 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3641 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Reviewed-by: Sheri Zhang <sheri.zhang@arm.com>
2020-08-04 | COMPMID-3618 Add support for export_to_cl_image_rhs in GEMMTuner.py | SiCong Li
* Add export_to_cl_image_rhs flag to reshaped and reshaped_only_rhs configs definitions * Exit with error when output directory is not correctly created * Add start and end timestamps to each output directory Signed-off-by: SiCong Li <sicong.li@arm.com> Change-Id: Ief5e8f454e7c6d97b18bbdace877db4e9ffc124c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/c/VisualCompute/ComputeLibrary/+/252934 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3651 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-08-03 | COMPMID-3526: LOGISTIC support for values bigger than [-40.f,40.f] | Michalis Spyrou
With this patch: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3178 we can support any range of values since we handle overflows by clamping. This means that for large negative values we'll get 0 and for positive inf, which aligns with the math.h implementation. Change-Id: I01e92010bb0c514c12b19da97e369a75d782cac7 Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3639 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
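The effect of clamping on a logistic function can be sketched as follows. This is an illustration rather than the kernel's code: the bound `CLAMP` and the choice to clamp the input before exponentiation are assumptions made for the example.

```python
import math

# Hypothetical bound, chosen so that exp() stays finite in single precision.
CLAMP = 88.0


def logistic(x):
    """Logistic function whose input is clamped so exp() cannot overflow."""
    clamped = max(-CLAMP, min(CLAMP, x))
    return 1.0 / (1.0 + math.exp(-clamped))


very_negative = logistic(-1000.0)  # effectively 0
very_positive = logistic(1000.0)   # saturates at 1
```

Without the clamp, `math.exp(1000.0)` would raise `OverflowError` in Python; with it, inputs far outside [-40, 40] simply saturate.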
2020-07-31 | COMPMID-3653 CL GEMM kernel creation error on certain combinations of N and N0 | SiCong Li
* Fix invalid use of vstore_partial_1 * Add configuration tests to catch this error case Signed-off-by: SiCong Li <sicong.li@arm.com> Change-Id: I25a2b16a530992acc869a4335c48a8fffa420850 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3628 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-07-31 | COMPMID-3624: CTS failure on Resize quantized in Neon and CL | Michele Di Giorgio
Allow computations with aligned corners when the tensors have width/height equal to 1. Change-Id: Ia01733f6c02e0740835b26a794b9a79fa35319b4 Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3634 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Sadik Armagan <sadik.armagan@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-07-31 | COMPMID-3324: Fix oclgrind warnings | Michalis Spyrou
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Change-Id: Ib14d158b9c5568981835312dcd9d5b9ca116649a Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3637 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-07-29 | COMPMID-3585: Android R while_fib_n_5_quant8 failure on CpuAcc | Michele Di Giorgio
Assembly kernels do not support quantized GEMM when the multiplier is greater than 1 (which leads to negative result shift used for requantization). Change-Id: I7a766cd0d13f549d217613ca67bc952923f309de Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3538 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
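Why a large multiplier produces a negative shift can be seen from a common fixed-point decomposition, sketched below. The helper `quantize_multiplier` is hypothetical and the Q31 mantissa convention is an assumption; the library's own helper may differ in details.

```python
import math


def quantize_multiplier(multiplier):
    """Split a positive multiplier into (Q31 mantissa, right shift)."""
    if multiplier <= 0.0:
        raise ValueError("multiplier must be positive")
    # multiplier == mantissa * 2**exponent with 0.5 <= mantissa < 1
    mantissa, exponent = math.frexp(multiplier)
    q31 = round(mantissa * (1 << 31))
    if q31 == (1 << 31):  # rounding pushed the mantissa up to 1.0
        q31 //= 2
        exponent += 1
    # A positive exponent (multipliers of 1 or more under this convention)
    # turns into a negative right shift, i.e. a left shift.
    return q31, -exponent


small = quantize_multiplier(0.25)  # needs a right shift
large = quantize_multiplier(1.5)   # needs a negative right shift
```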
2020-07-29 | COMPMID-2078: Remove legacy TODOs | Georgios Pinitas
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: Ic05ef206f76477cc2fbb9e7ad56ec1974fa013ea Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3626 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2020-07-28 | COMPMID-3385: Async support to CLArithmetic* kernels/functions Pt.2 | Michalis Spyrou
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Change-Id: Idc5ac2dd2ba5295c00c88b44a783645327a27e15 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3617 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-07-28 | COMPMID-3575: Mixed precision in NEInstanceNormalizationLayerKernel | Sang-Hoon Park
To fix the issue caused by the limited precision of FP16, mixed precision (float accumulator) is introduced to NEInstanceNormalizationLayerKernel. Since the reference kernel uses mixed precision, mixed precision computation is currently the default when the kernel is called from NEInstanceNormalizationLayer. - Make NEInstanceNormalizationLayerKernel use kernel descriptor to enable mixed precision computation - NEInstanceNormalizationLayer is modified to use the descriptor Change-Id: I7766622d715df054e303f9b441380b15b51f02b2 Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3589 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
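The precision problem that a wider accumulator avoids can be reproduced with a small experiment. The sketch below emulates FP16 rounding with Python's `struct` module; the helper names are hypothetical, and the "float" accumulator here is a Python double rather than a 32-bit float, which is enough to show the effect.

```python
import struct


def to_half(x):
    """Round a Python float to the nearest IEEE 754 binary16 value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def sum_half_accumulator(values):
    """Accumulate in FP16: every partial sum is rounded back to half."""
    acc = 0.0
    for v in values:
        acc = to_half(acc + to_half(v))
    return acc


def sum_wide_accumulator(values):
    """Accumulate FP16 inputs in a wider type (mixed precision)."""
    acc = 0.0
    for v in values:
        acc += to_half(v)
    return acc


ones = [1.0] * 4096
half_sum = sum_half_accumulator(ones)  # stalls once the spacing exceeds 1
wide_sum = sum_wide_accumulator(ones)
```

Once the FP16 accumulator reaches 2048, adjacent representable values are 2 apart, so adding 1 no longer changes it. Sums of this kind appear when computing a mean or variance over a large plane.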
2020-07-24 | COMPMID-3385: Async support to CLArithmetic* kernels/functions Pt.1 | Michalis Spyrou
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Change-Id: I94007565e688f8a0aead4f14c9fc30bfd9f9f7eb Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3613 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2020-07-24 | [ONCPUML-120]: Tweak of the launch heuristics for hybrid_u8u32_dot_16x4 kernel | Aleksandr Nikolaev
The hybrid kernel turns out to be faster for qasymm8 than quantized_wrapper with interleaved. Signed-off-by: Aleksandr Nikolaev <aleksandr.nikolaev@arm.com> Change-Id: I200646aee6cdcabfe125b746c7d87bfa7d06e0fc Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3585 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-07-24 | COMPMID-3393: Minor tweaks on memory injection interface | Georgios Pinitas
* Avoid the need to overload workspace() every time * Remove the Layer suffix from the operators * Clean interface by removing default arguments when unsupported Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I7710ecd485cae13e9c2d45216debbd8103bc5a0f Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3610 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-07-23 | COMPMID-3578: Update FP32/int8 kernel selection. | David Mansell
Upgrade the current 'is_preferred()' mechanism with a new framework, where kernels instead provide an estimated cycle count figure. Compatibility with old mechanism is achieved via a wrapper which replaces a "true" result with an estimate of 0, and a "false" result with UINT64_MAX. This mechanism is then used to select between 'interleaved' and 'hybrid' FP32 NEON kernels. This uses a simple system based on counting MACs performed and bytes of data transferred (for rearrange/merge operations) and dividing by fixed performance figures, which are provided for A53, A55, A73 and 'default' figures (based on A76). Separately, a new route for performing int8 GEMMs by using the int16 kernel is provided. This performs significantly (for uint8) or slightly (for int8) better on A53 than the existing int8 route. Optimized 8-to-16 bit transforms are also included. Change-Id: I53b2e59eb9368793c78c2081e17d2445361bcc47 Signed-off-by: David Mansell <David.Mansell@arm.com> Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/c/VisualCompute/ComputeLibrary/+/250120 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: bsgcomp <bsgcomp@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3609 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
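The selection scheme described above (pick the kernel with the lowest estimated cycle count, with legacy boolean answers mapped to 0 or UINT64_MAX) can be sketched in a few lines. All names and performance figures below are hypothetical and chosen only for illustration; they are not the values used in the library.

```python
UINT64_MAX = 2**64 - 1


def estimate_from_is_preferred(is_preferred):
    """Compatibility wrapper: "true" becomes 0 cycles, "false" UINT64_MAX."""
    return 0 if is_preferred else UINT64_MAX


def estimate_cycles(macs, bytes_moved, macs_per_cycle, bytes_per_cycle):
    """Divide the work by fixed per-CPU throughput figures (made-up here)."""
    return int(macs / macs_per_cycle + bytes_moved / bytes_per_cycle)


def select_kernel(candidates):
    """Return the name with the lowest estimate; ties keep the first entry."""
    return min(candidates, key=lambda candidate: candidate[1])[0]


chosen = select_kernel([
    ("interleaved", estimate_cycles(1000, 800, 4.0, 2.0)),  # 650 cycles
    ("hybrid", estimate_cycles(1000, 0, 2.0, 2.0)),         # 500 cycles
])
legacy = select_kernel([
    ("legacy_a", estimate_from_is_preferred(False)),
    ("legacy_b", estimate_from_is_preferred(True)),
])
```

In this toy example the hybrid kernel wins because it avoids the rearrange/merge traffic even though its assumed MAC throughput is lower.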
2020-07-22 | COMPMID-3600: Fix requantization in NEPixelWiseMultiplicationKernel | Michele Di Giorgio
Quantization wasn't done correctly and since we have helpers for that, the code has been modified to use them. Change-Id: Ia16577cea57dcb1864d91a06ab6aebf8ead67de5 Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3608 Reviewed-by: TeresaARM <teresa.charlinreyes@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>