aboutsummaryrefslogtreecommitdiff
path: root/src
AgeCommit message (Collapse)Author
2021-06-29Port the ClGemmLowp kernels to the new APIGeorgios Pinitas
Ported kernels: - CLGEMMLowpMatrixMultiplyNativeKernel - CLGEMMLowpMatrixMultiplyReshapedKernel - CLGEMMLowpMatrixMultiplyReshapedOnlyRHSKernel - CLGEMMLowpOffsetContributionKernel - CLGEMMLowpOffsetContributionOutputStageKernel - CLGEMMLowpQuantizeDownInt32ScaleByFixedPointKernel - CLGEMMLowpQuantizeDownInt32ScaleByFloatKernel - CLGEMMLowpQuantizeDownInt32ScaleKernel Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I9d5a744d6a2dd2f2726fdfb291bad000b6970de2 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5870 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-06-29Port NEGEMM to memory injecting interface (Part 2)Michele Di Giorgio
- Port NEGEMMMatrixMultiplyKernel to the new API Partially resolves: COMPMID-4402 Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Change-Id: I52b67055dc24bb3a417d6ec5aeeee86e21b74320 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5873 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-06-29Enable global pooling optimization on OpenCLGian Marco Iodice
- Add loop unrolling on X and use POOL_X and POOL_Y defines for the for loop Resolves COMPMID-4573 Change-Id: I33cb825cfb55912ccb0ab9d03bd33a3dab4c8b44 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5872 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-06-29Port NEGEMM to memory injecting interface (Part 1)Michele Di Giorgio
- Start porting NEGEMM to the new API - Port NEGEMMInterleave4x4Kernel to the new API - Port NEGEMMMatrixAdditionKernel to the new API - Port NEGEMMTranspose1xWKernel to the new API - Remove padding from NEGEMMMatrixAdditionKernel - Remove unused INESimpleKernel and ICPPSimpleKernel Partially resolves: COMPMID-4402 Change-Id: I63edadddfe00a54586e5384d6a0211db25ae9042 Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5857 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-06-29Improve selection speed of CPU implementationsGeorgios Pinitas
CPU micro-kernel to be used was picked during kernel execution. Move selection during configuration to reduce runtime overhead. Standardize kernel names as follows: <simd_tech>_<data_type>_<data_layout>_<kernel_name> e.g. sve_fp32_nhwc_scale Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I544f1c08c8fef0f130a3bde61882ccb9a1f47f21 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5855 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-06-29Set up the framework to choose the default LWSGiorgio Arena
Resolve COMPMID-4486 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Change-Id: Ib38b7943bd776a6d75d1da163908724c49eae73d Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5864 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-06-28Simplify CpuInfo logicGeorgios Pinitas
Refactors the CpuInfo extraction code and cleans the usage of it in the CpuContext class Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: Iea47dfdaad431fb49285da92778d6b42cf318f60 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5848 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
2021-06-28Add code to detect Mali(TM) G31Pablo Marquez Tello
* Fixed a problem which made the library choose the Valhall path instead of Bifrost * Set block sizes to 4 for u8 on G31 * Resolves MLCE-463 Change-Id: I577f57ec96f740db15d6640e34172318a82c5778 Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5858 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Georgios Pinitas <georgios.pinitas@arm.com>
2021-06-25Port NEGEMMConv2d to memory injecting interfaceMichele Di Giorgio
Resolves: COMPMID-4506, COMPMID-4570 Change-Id: I6d37a06da141f1fcfcaa8525322a319cb0234791 Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5824 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-06-25Rename pooling implementation folder to pool2dGeorgios Pinitas
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I82a298e059a92d82095186d120e4934a8eeb4e3e Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5854 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-06-24Rework gemmlowp reshaped_only_rhs using the new macrosGiorgio Arena
Resolve COMPMID-4416 Change-Id: I83cdf0de7adaf4d465ffebd494ab913182072485 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5788 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-06-24Rework OpenCL Depthwise ConvolutionGian Marco Iodice
- Remove dedicated kernels for NCHW. Now we only use NHWC with permute - Remove specialized kernels for 3x3 NHWC - Simplify CLDepthwiseConvolutionLayer.cpp to call just the native implementation for both floating-point and quantized data types - Develop two parametric opencl kernels for depthwise convolution layer NHWC (floating-point and quantized) - Add support to export the weights to cl_image - Extend test for depthwise convolution on opencl Resolves COMPMID-4417 Change-Id: I253dd5d959a70783c82e62b1771a5e9f91621cb0 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5806 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
2021-06-23Create core library using high priority operatorsMichalis Spyrou
A smaller core library is created using a subset of the operators. Changed the structure of filelist.json in order to include more information about the kernels and make the selection easier. Resolves: COMPMID-4514 Change-Id: I079ca7d8e64346174eebdd13b834e1dd4dc36ca2 Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5786 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-06-23Add in-place computation for elementwise operationsSheri Zhang
- Add in-place computation for elementwise operations at graph level - Modify support case to test in-place computation for elementwise operations Resolves: COMPMID-4414 Signed-off-by: Sheri Zhang <sheri.zhang@arm.com> Change-Id: I5a4de1235dd29a31160e770a16d62f4b98c84ae6 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5803 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-06-22Port NEGEMMLowp Part 1Manuel Bottini
Details: Port NEGEMMLowpQuantizeDownInt32ScaleKernel to CpuGemmLowpQuantizeDownInt32ScaleKernel Port NEGEMMLowpQuantizeDownInt32ToInt16ScaleByFixedPointKernel to CpuGemmLowpQuantizeDownInt32ToInt16ScaleByFixedPointKernel Port NEGEMMLowpQuantizeDownInt32ToInt8ScaleByFixedPointKernel to CpuGemmLowpQuantizeDownInt32ToInt8ScaleByFixedPointKernel Port NEGEMMLowpQuantizeDownInt32ToUint8ScaleByFixedPointKernel to CpuGemmLowpQuantizeDownInt32ToUint8ScaleByFixedPointKernel Port NEGEMMLowpOutputStage functions to CpuGemmLowpOutputStage operators Partially Resolves: COMPMID-4403 Change-Id: I6d5f45e43f35d731d564ed3b5c0e804d2a318fb1 Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5833 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-06-22Fix Winograd heuristic in CLConvolutionLayerGian Marco Iodice
- Dispath CLWinogradConvolutionLayer only for IFM >= 16 Resolves COMPMID-4591 Change-Id: Id314cd0003a0d938c48b49a1936ff7c6b7f8391f Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5841 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-06-22Add FP16 support to CLRemapFreddie Liardet
Add FP16 support to CLRemap when data layout is NHWC. Add relevant tests for FP16 and validation. Update NERemap function level to be consistent with CLRemap. Add depreciation notice for uint_8 only function level methods. Resolves: COMPMID-4335 Signed-off-by: Freddie Liardet <frederick.liardet@arm.com> Change-Id: If05f06801aef7a169b73ff1ebe760a42f11ca05c Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5816 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-06-21Rework the CLConvolutionLayer heuristicGian Marco Iodice
- Call winograd for large kernel size (> 5x5) Resolves COMPMID-4582 Change-Id: I78a187d91bc8099bec7b5c6fa941223cfc1423ba Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5814 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2021-06-18Integrate improved CPU depthwise convolution kernelsMichele Di Giorgio
* Replace assembly kernels for depthwise convolution with more optimized ones. * Add int8 assembly kernels. * Fix implicit padding on optimized kernels Resolves: COMPMID-3867, COMPMID-4361 Change-Id: I0b0867e05f61be4f368f62190d55e14d0ab3ebf2 Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5622 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2021-06-18Remove implementation headers from NESoftmaxLayer public headerGeorgios Pinitas
Resolves: COMPMID-4587 Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: Ib216abcb0b9cd7f545d7c97e9d3447cb1b28f180 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5828 Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-06-15Add CPU discovery capabilities.Georgios Pinitas
Resolves: COMPMID-4500 Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I008c51934ef813fb1f489b531288c4419e701955 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5799 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-06-15Port CLWinogradConvolutionLayer with ClWinogradConv2dManuel Bottini
Port CLWinogradInputTransformKernel Port CLWinogradFilterTransformKernel Port CLWinogradOutputTransformKernel Resolves: COMPMID-4504 Change-Id: I3177dda0b9c2f56b36cb317027e94abe8d47229e Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5680 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-06-15Fix incorrect memory handling in ported functionsManuel Bottini
Details of the functions: - ClSoftmax - CpuSoftmax - CpuPool2d Change-Id: Icd2c14d5df010c3b2301e2693ce6f414d7c61916 Resolves: COMPMID-4404 Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5797 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-06-15Add NHWC support to CLRemapFrederick Liardet
Add NHWC support to CLRemap, also add relevant tests. Partially resolves COMPMID-4335. Change-Id: I119bea99be497fb85d5cd83a10f8d4e8e1f97f17 Signed-off-by: Freddie Liardet <frederick.liardet@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5773 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-06-11Fix errata in documentationJakub Sujak
This patch addresses the following errata found in the project documentation: * Common typos. * Missing use of trademarks. * Incomplete operator descriptions. * Examples of code that have since been removed from the library. * Plus clarification over the usage of `All` category for data types and layouts. In addition, the Operator list was not generated properly due to: * Non-matching cases in the filenames (i.e. `Elementwise` and `ElementWise`). For consistency, all usages of the latter have been renamed to the former. * Extra data layout tables in the headers for the `NESlice` and `NEStridedSlice` functions (note: not present in CL counterpart) meant documentation for those functions was generated twice. Resolves: COMPMID-4561, COMPMID-4562, COMPMID-4563 Change-Id: I1eb24559545397749e636ffbf927727fb1bc6201 Signed-off-by: Jakub Sujak <jakub.sujak@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5769 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Sheri Zhang <sheri.zhang@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com>
2021-06-09Fixed segfault in NEGEMMConv2dPablo Tello
* Check if biases are not null before dereference * Resolves COMPMID-4547 Signed-off-by: Pablo Tello <pablo.tello@arm.com> Change-Id: I9b266deb849f40bd6a1510c444e2c522a03d934e Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5784 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Georgios Pinitas <georgios.pinitas@arm.com>
2021-06-08Add guards on SVE kernelsMichalis Spyrou
Some compiling issues are reported when building through ArmNN. Resolves: COMPMID-4569 Change-Id: If464fda9157fbdba678e54f07b235e3ef00ee51a Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5777 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-06-07Revert "Implement memory injection in CpuDirectGemmConv2d"Michele Di Giorgio
This reverts commit b3be45759bdd0749ae3a16fe470820f0d9830ea9. Resolves: COMPMID-4548 Change-Id: I46e0d8c67ddf988af3ce38f83177cda412db916c Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5775 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Sheri Zhang <sheri.zhang@arm.com>
2021-06-07Revert "Add optimization for global pooling in pooling_layer.cl"Pablo Tello
* Resolves COMPMID-4544 This reverts commit 50929ef951880469b9d579323d2f9c9f5025327d. Signed-off-by: Pablo Tello <pablo.tello@arm.com> Change-Id: Ibc40054cecc2fe99bff48154f11fe5b602b4a64a Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/c/VisualCompute/ComputeLibrary/+/330380 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Sheri Zhang <sheri.zhang@arm.com> Comments-Addressed: bsgcomp <bsgcomp@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5774 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-06-07Enable fat binary supportGeorgios Pinitas
Changes our build system to allow building both Neon(TM) and SVE kernels and package them in the same binary. This will allow runtime selection of the underlying architecture. Adds new build option, fat_binary, for enabling this feature. Change-Id: I8e8386149773ce28e071a2fb7ddd8c8ae0f28a4a Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5704 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-06-07Fix WeightRetention tests on multiple calls of FullyConnectedGeorgios Pinitas
Alters CLGEMM to update only the mutable elements of the execution pack when reconfiguring the operator with weight retention. Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I993e3a486e1b025bf6172a7935a10b3a86db106c Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5771 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-06-02Fix bug in PReluLayer when input is 1xN sizeFreddie Liardet
Fix issue where gpu elementwise operation kernel would not compile when input is 1xN size and prelu is the chosen operator. Add relevant tests for 1xN input. Signed-off-by: Freddie Liardet <frederick.liardet@arm.com> Change-Id: If0651cfa399ca1d9c65f2632b75536c7931f27d4 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5760 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-06-02Fixed the compiler warning -Werror=type-limitsPablo Marquez Tello
* Comparison is always false due to limited range of data type. * rescale_value is truncated to int32_t and then is compared agains (1ll <<31) which will be always false * Resolves MLCE-508 Change-Id: I8dfd04b16beb5c2c3994eb817ed7a992fb63ed31 Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5741 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Manuel Bottini <manuel.bottini@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-06-01Fuse activation in ClDirectConv2dKernel for float typesGeorgios Pinitas
Resolves: COMPMID-4430 Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I9a40033e09223d601460a7e52cc297c58c9a2737 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5757 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-06-01Rename ported functionsManuel Bottini
Rename CpuPooling to CpuPool2d Rename CpuPoolingKernel to CpuPool2dKernel Rename CpuPoolingAssemblyWrapperKernel to CpuPool2dAssemblyWrapperKernel Move CpuPool2dAssemblyWrapperKernel in internal subfolder Rename CpuDepthwiseConvolutionNativeKernel to CpuDepthwiseConv2dNativeKernel Rename CpuDepthwiseConvolutionAssemblyDispatch to CpuDepthwiseConv2dAssemblyDispatch Rename CpuDepthwiseConvolution to CpuDepthwiseConv2d Rename CpuDirectConvolutionKernel to CpuDirectConv2dKernel Rename CpuDirectConvolutionOutputStageKernel to CpuDirectConv2dOutputStageKernel Rename CpuDirectConvolution to CpuDirectConv2d Rename ClPoolingKernel to ClPool2dKernel Rename ClPooling to ClPool2d Rename ClDirectConvolutionKernel to ClDirectConv2dKernel Resolves: COMPMID-4405 Change-Id: I8e48f015e4e492a76a7512f5679cb3eb0cd028f6 Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5708 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-06-01Optimize int8 arithmetic addition on CPUGiorgio Arena
Avoid accessing quantization info from TensorInfo in leftover loop. Use the already available UniformQuantizationInfo instead Create another version of the quantize utility function which assumes RoundingPolicy::TO_NEAREST_UP. This allows us to call std::lround() and avoid some overhead Resolve COMPMID-4546 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Change-Id: Ib481a586f879b7e937e3d54ba11100d0a37ef277 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5722 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-05-27Implement memory injection in CpuDirectGemmConv2dSang-Hoon Park
The following operators are now stateless by implementing memory injection. - CpuDirectGemmConv2d - CpuGemmAssemblyDispatch A test case is added to test if CpuDirectGemmConv2d can run on different group of tensors with a single configure. Resolves: COMPMID-4506 Change-Id: I48f44ed41236ca7e18da2de07bdbacc9007a3c5e Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5718 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com>
2021-05-26Fix node fusion mutatorSiCongLi
Node fusion mutator adds new nodes to the node list, invalidating the list iterator during iteration. The fix uses index instead. Resolves COMPMID-4543 Change-Id: Ibb38e2394c616ac6e458568e73f09fb5243e94fe Signed-off-by: SiCongLi <sicong.li@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5717 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-05-26DirectConvolutionLayer create image failureManuel Bottini
Add implicit padding test on weights before the configure Fix problem of considering left padding of weights when using cl image Resolves: COMPMID-4493 Change-Id: I141d2de68e8bdfcbd6f18209db4f29fcc05305a1 Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5689 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-05-26Create CpuGemmDirectConv2dSang-Hoon Park
As the first phase of making NEGEMMConv2d stateless, CpuGemmDirectConv2d operator is created. Kernels and operators used by the operator use TensorInfo pointers instead of Tensor pointers. The CpuGemmDirectConv2d isn't completely stateless because it manages one intermediate tensor internally. This will be resolved by implementing memory injection mechanism with the following patches. Also, weight manager of CpuGemmAssemblyDispatch is disabled to enable this work. Implements: COMPMID-4506 Change-Id: Iec3ca6de29d98bef7ea95e8f4473d6dc0024a140 Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5672 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-05-25Remove used auxilary tensors from ClGemm's tensor packSang-Hoon Park
When the used auxilary tensors are kept in the tensor pack, the pointers in the tensor pack can point memory space that has been already freed. In some cases, this leads crashes or unpredictable behaviour. Resolves: COMPMID-4536 Change-Id: I4cc83310dfe084d477a5f9bece216228c3edb172 Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5710 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-05-25Fix retrieving device architecture twice in CLCompileContextGiorgio Arena
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Change-Id: I12608eb4d681ea07d9d508226600a11365238a00 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5682 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Manuel Bottini <manuel.bottini@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-05-20Add support for dynamic weights in CL FullyConnected layerGiorgio Arena
Make GEMM use its native version if weights are dynamic. This ensures no reshape gets performed on the weights tensor Enable dynamic weights tests for the OpenCL backend Resolve COMPMID-4223 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Change-Id: Iccc4806701772cede23e24df09c786914d00034c Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5652 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2021-05-20Enable unroll through pragma based on DDK versionGiorgio Arena
Change-Id: Id98a107d512369d3799961011a84e9cc4d99e775 Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5679 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-05-19clCreateKernel failure of CL/ChannelShuffle/U8/Manuel Bottini
Create right vector for size of 16 elements Resolves: COMPMID-4532 Change-Id: I74dc89d158a6aa3f50db710cbc002a92156e2263 Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5673 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Sheri Zhang <sheri.zhang@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-05-19Port DepthConvert to new ApiGeorgios Pinitas
- Renames DepthConvert to Cast - Ports both NEDepthConverLayer and CLDepthConvert variants - Removes legacy shift capability from DepthConvert, allowing only shifts of 0 Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I806a0f8eb23d23502b632c529fda7edde19c8176 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5565 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-05-18Port CLGEMM to memory injecting interfaceGeorgios Pinitas
Moves the following kernels: - CLGEMMMatrixMultiplyKernel - CLGEMMMatrixMultiplyNativeKernel - CLGEMMMatrixMultipluReshapedKernel - CLGEMMMatrixMultiplyReshapedOnlyRHSKernel Moves the following functions - CLGEMM Introduces facilities to easy handling of auxiliary temporary buffers under then new run interface. Such are: - CLAuxTensorHandler: That allows wrapping of workspace buffers memory to CLBuffer objects - Ability to inject TensorInfo to allocator without transferring ownership. This reduce the copy overhead if needed. Resolves: COMPMID-4188 Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I7055435d831b05b749b26302082e4ac45f26dfb0 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5498 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-05-18Remove padding from NERemapKernelManuel Bottini
Use of out_of_tensor function to check if parallel instructons can be used safely Reverting to serial computation otherwise Resolves: COMPMID-4449 Change-Id: I23a986612e3c5d0367e23e56f1aeedbb1330cffc Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5651 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2021-05-18Port CLFlattenLayer to a memory injecting interfaceGeorgios Pinitas
Resolves: COMPMID-4502 Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I00621a772135f65a92aa70e6a19dd5f743915378 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5649 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-05-18Port NEFlatten layer to a memory injecting interfaceGeorgios Pinitas
Resolves: COMPMID-4503 Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: Ic536f62a9561d709c16d5f9cca28784cb7f281b6 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5650 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>