aboutsummaryrefslogtreecommitdiff
path: root/src/runtime/CL/functions/CLGEMMConvolutionLayer.cpp
AgeCommit message (Collapse)Author
2023-09-28Apply clang-format on repositoryFelix Thomasmathibalan
Code is formatted as per a revised clang format configuration file(not part of this delivery). Version 14.0.6 is used. Exclusion List: - files with .cl extension - files that are not strictly C/C++ (e.g. Android.bp, Sconscript ...) And the following directories - compute_kernel_writer/validation/ - tests/ - include/ - src/core/NEON/kernels/convolution/ - src/core/NEON/kernels/arm_gemm/ - src/core/NEON/kernels/arm_conv/ - data/ There will be a follow up for formatting of .cl files and the files under tests/ and compute_kernel_writer/validation/. Signed-off-by: Felix Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com> Change-Id: Ib7eb1fcf4e7537b9feaefcfc15098a804a3fde0a Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10391 Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
2023-09-04Remove legacy PostOps codeJakub Sujak
PostOps was the experimental interface for Dynamic Fusion. It is now replaced by the new Dynamic Fusion interface with code generation using the Compute Kernel Writer. Resolves: COMPMID-6190 Change-Id: I813b48facef2fd6f3aee332588886b4f9b3d33d8 Signed-off-by: Jakub Sujak <jakub.sujak@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10219 Benchmark: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: SiCong Li <sicong.li@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-11-12Fix PostOp dependencySiCongLi
In general src headers should not be included in any public header of other modules. Since there are modules (graph, tests) that rely on specific PostOp definitions in the previous src/core/experimental/PostOp.h, export it to the public arm_compute header Resolves COMPMID-4974 Signed-off-by: SiCongLi <sicong.li@arm.com> Change-Id: I0fa4da5108a34fe6bfff1e9d57839da4e51dc314 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6673 Reviewed-by: Gunes Bayir <gunes.bayir@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-11-01Add PostOp support to GEMM and CLGEMM operators and functions Part 2SiCongLi
* Implement PostOp interface changes * Remove spaces around "=" in TypePrinter Partially resolves COMPMID-4435 Signed-off-by: SiCongLi <sicong.li@arm.com> Change-Id: If1e2280554030a0f635e73339a2e86987f6dc41b Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6484 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Sheri Zhang <sheri.zhang@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-08-25Move CPU/GPU files from Core/Runtime to the respective backend foldersGeorgios Pinitas
Legacy structure contained two libraries core/runtime with two backends in each. We reduce the core/runtime libraries to a single library thus merging the backend files Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I69545765fe7a730368105cdbd067d3135ec7a174 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6155 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-08-20Rename [Cl|Cpu]GemmConvolution to [Cl|Gpu]GemmConv2dGeorgios Pinitas
Renaming the gemm-based convolution operators to accomodate for new operators with higher convolution dimensonality Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: Id2f2cf11404221f0e87baa0e5d08ad5d63eaf78e Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6113 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-07-27Port CLGEMMConvolutionLayerManuel Bottini
Details: port CLWeightsReshapeKernel to ClWeightsReshapeKernel port CLGEMMConvolutionLayer to ClGemmConvolution Resolves: COMPMID-4515 Change-Id: I7d5b4ec72db2742f6eb9f3ffc88f717c35b4f2a3 Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5983 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-07-22Update GEMM assembly kernelsGeorgios Pinitas
- Introduce Fp32 kernels with internal calculations in Bfloat16 when fast_mode is enabled - Improve kernel selection heuristics Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I68a9e7e862b6fd2721b46e0d7cc791091c4ab279 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5965 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-07-16Port CLIm2ColKernel to ClIm2ColKernelManuel Bottini
Resolves: COMPMID-4516 Change-Id: I6a6db66797fa801dfe1238fceca413277241d2ec Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5946 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-07-15Port CLCol2ImKernel to ClCol2ImKernelManuel Bottini
Resolves: COMPMID-4517 Change-Id: I50cb02116a1ab86fc29200371944c4774e830746 Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5949 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2021-06-29Port the ClGemmLowp kernels to the new APIGeorgios Pinitas
Ported kernels: - CLGEMMLowpMatrixMultiplyNativeKernel - CLGEMMLowpMatrixMultiplyReshapedKernel - CLGEMMLowpMatrixMultiplyReshapedOnlyRHSKernel - CLGEMMLowpOffsetContributionKernel - CLGEMMLowpOffsetContributionOutputStageKernel - CLGEMMLowpQuantizeDownInt32ScaleByFixedPointKernel - CLGEMMLowpQuantizeDownInt32ScaleByFloatKernel - CLGEMMLowpQuantizeDownInt32ScaleKernel Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I9d5a744d6a2dd2f2726fdfb291bad000b6970de2 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5870 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-06-18Integrate improved CPU depthwise convolution kernelsMichele Di Giorgio
* Replace assembly kernels for depthwise convolution with more optimized ones. * Add int8 assembly kernels. * Fix implicit padding on optimized kernels Resolves: COMPMID-3867, COMPMID-4361 Change-Id: I0b0867e05f61be4f368f62190d55e14d0ab3ebf2 Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5622 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2021-05-19Port DepthConvert to new ApiGeorgios Pinitas
- Renames DepthConvert to Cast - Ports both NEDepthConverLayer and CLDepthConvert variants - Removes legacy shift capability from DepthConvert, allowing only shifts of 0 Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I806a0f8eb23d23502b632c529fda7edde19c8176 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5565 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2021-05-18Port CLGEMM to memory injecting interfaceGeorgios Pinitas
Moves the following kernels: - CLGEMMMatrixMultiplyKernel - CLGEMMMatrixMultiplyNativeKernel - CLGEMMMatrixMultipluReshapedKernel - CLGEMMMatrixMultiplyReshapedOnlyRHSKernel Moves the following functions - CLGEMM Introduces facilities to easy handling of auxiliary temporary buffers under then new run interface. Such are: - CLAuxTensorHandler: That allows wrapping of workspace buffers memory to CLBuffer objects - Ability to inject TensorInfo to allocator without transferring ownership. This reduce the copy overhead if needed. Resolves: COMPMID-4188 Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I7055435d831b05b749b26302082e4ac45f26dfb0 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5498 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-12-01Update default C++ standard to C++14Georgios Pinitas
(3RDPARTY_UPDATE) Resolves: COMPMID-3849 Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: I6369f112337310140e2d6c8e79630cd11138dfa0 Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4544 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-11-07COMPMID-3639: (3RDPARTY_UPDATE) Move CL kernels to srcSang-Hoon Park
Change-Id: I10d27db788e5086adae1841e3e2441cd9b76ef84 Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4310 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-10-20COMPMID-3637: Move utility headers from arm_compute to srcSang-Hoon Park
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: If9d6fa8c900b68c4b6fd373f2fc1f9abb83ea917 Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4145 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Sang-Hoon Park <sang-hoon.park@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-07-29COMPMID-2078: Remove legacy TODOsGeorgios Pinitas
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Change-Id: Ic05ef206f76477cc2fbb9e7ad56ec1974fa013ea Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3626 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2020-07-09COMPMID-3324: Adjusting capitalization of Arm copyright claim to reflect Arm ↵Michele Di Giorgio
preferred presentation Change-Id: Ib7dcfcbb24b408999dfae366b9da396485aacf78 Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3525 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-07-03COMPMID-3534: CLGEMMConvolutionLayer doesn't support QASYMM8_SIGNED properlymorgolock
- QASYMM8_SIGNED input and QSYMM8_PER_CHANNEL is permmited. - Validation tests are added. Change-Id: I9f699c323fa7e87afdc132c9b7888a56aebded6b Signed-off-by: morgolock <pablo.tello@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3452 Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2020-04-23COMPMID-3280: Make all ML primitives for CL use the new interface - Part 2Manuel Bottini
- CLFunctions have been updated Change-Id: Ie3256a6c775bc12f3126482bd8e8a46da54b267c Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3053 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2020-03-02COMPMID-3097 Fuse activation with fully connected layer CLGiorgio Arena
Change-Id: I447030e69b9e565f2f81529a41af8c5e7ece7ecf Signed-off-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/2702 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-01-14COMPMID-2760: add support for QASYMM8_SIGNED to CLGEMMConvolutionLayerSang-Hoon Park
Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com> Change-Id: I55ab81d0f96c78af0396652cacf6640fc98ef3c2 Reviewed-on: https://review.mlplatform.org/c/2584 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2020-01-09COMPMID-2753: Add support for QASYMM8_SIGNED in CL kernels/functionsManuel Bottini
Change-Id: I7ed2d43f33458ba0571323f6fa9dc2e45fcd672a Signed-off-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-on: https://review.mlplatform.org/c/2516 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2019-12-03COMPMID-2793: Add support for QASYMM8_SIGNED in ↵Michele Di Giorgio
CLGEMMLowpMatrixMultiplyReshapedOnlyRHSKernel Change-Id: I8abfdd3372cc394b98ec038b9fcb4abfe9216894 Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/2401 Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2019-11-14COMPMID-2309 : CLConvolutionLayer: support QUANT8_SYMM_PER_CHANNEL filtersVidhya Sudhan Loganathan
Change-Id: I16f6758b768ede404a064db057302ded706e1e8a Signed-off-by: Vidhya Sudhan Loganathan <vidhyasudhan.loganathan@arm.com> Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/c/2215 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2019-10-03COMPMID-2685: [CL] Use Weights managerMichalis Spyrou
Change-Id: Ia1818e6ecd9386e96378e64f14d02592fe3cdf0f Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-on: https://review.mlplatform.org/c/1997 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2019-10-02COMPMID-2661: Implement complex function dynamic tensor support.Georgios Pinitas
Change-Id: I80772cb25514009b030e5ade28cbb71ed352da67 Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-on: https://review.mlplatform.org/c/2019 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2019-08-01COMPMID-1979: Fuse Activation Function in CLGEMM - part 4Gian Marco Iodice
Fused activation function in CLGEMM Change-Id: I644fdf09349325c0b3a2cd5fef2a3ea2c974149d Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/1640 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2019-06-20COMPMID-2053: Fuse bias addition with CLGEMMMatrixMultiplyReshapedKernelGian Marco Iodice
Change-Id: I5bfd38c94a6fd18a1cba2104f7e1b04e7bef6ec2 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/1359 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2019-06-03COMPMID-2225: Add interface support for new quantized data types.Georgios Pinitas
Add support for: -QSYMM8, 8-bit quantized symmetric -QSYMM8_PER_CHANNEL, 8-bit quantized symmetric with per channel quantization Change-Id: I00c4ff98e44af37419470af61419ee95d0de2463 Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-on: https://review.mlplatform.org/c/1236 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
2019-05-10COMPMID-2177 Fix clang warningsMichalis Spyrou
Change-Id: I78039db8c58d7b14a042c41e54c25fb9cb509bf7 Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-on: https://review.mlplatform.org/c/1092 Reviewed-by: VidhyaSudhan Loganathan <vidhyasudhan.loganathan@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2019-04-08COMPMID-2098: Scope handling of memory group resources.Georgios Pinitas
Change-Id: Ie945526bd7845301458039edf3129253c1808505 Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-on: https://review.mlplatform.org/c/938 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2019-03-27COMPMID-1995: Minor code fixes.Georgios Pinitas
-Remove FIXMEs and link to tickets. -Pass large object by const reference. -Implement copy assignment operator for Window. Change-Id: I975223ac42ec424f153569a8c963f29e6b86ad29 Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-on: https://review.mlplatform.org/c/899 Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2019-02-07COMPMID-1706: Fuse the bias addition within CLGEMMMichele Di Giorgio
Change-Id: I378f2023f4fa010f195f76716ac07aa86279bfae Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-on: https://review.mlplatform.org/280 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-30COMPMID-1717: CL: Implement Maximum, Minimum, SquaredDifferencegiuros01
Change-Id: Ice653e48211053bd3cd20a693bd76de6b4efc370 Reviewed-on: https://review.mlplatform.org/270 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2018-11-08COMPMID-1451: Fix fused activation in GEMMConvolutionLayerGeorgios Pinitas
-Uses output quantization information for the activation layer. -Updates checks for BoundedRelu at CL side. Change-Id: I0447860e90f1c89b67b9ace3c8daad713f6c64e0
2018-11-08COMPMID-1736: Fixed out-of-bound write in CLIm2ColGian Marco Iodice
The issue was related to CLIm2Col when the number of input channels was less than the number of elements processed by each thread. The bug has been fixed in the validate_and_configure_window() function setting the correct number of elements accessed in the output tensor. Also fixed an issue GEMM3D when we have a single output channel Change-Id: I094292d0c7662599c4a4c3916ec5f5821df5faef
2018-11-02COMPMID-1413 - Improve the performance of GEMMLowp with 8 bit dot product on ↵Gian Marco Iodice
OpenCL COMPMID-1424 - Add dot product support for CLDepthwise QASYMM8 3x3 NHWC non-unit stride With this patch we are able to improve the performance of MobileNet v1-qasymm8 by 37 % Tried to use the dot product instruction in CLDepthwise QASYMM8 3x3 NHWC non-unit stride but I have not seen any benefit (maybe because we have few arithemtic operation and we do not have more load instructions). However Depthwise convolution has been improved by 30% Change-Id: Id768a99c2e53a04276707e427af5d0ec93419ada Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155082 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-1451: Perform CLOutputStage using floats.Georgios Pinitas
Change-Id: Ic8312a5b6790aa7cd4468d42f08d557ad40e9441 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/154570 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02COMPMID-1519: Add support for 3D input/output in CLGEMMLowpOutputStageGeorgios Pinitas
Change-Id: I637add70310d2da4d82b236a6352af9d33be17a1 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/149706 Reviewed-by: Isabella Gottardi <isabella.gottardi@arm.com> Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Tested-by: bsgcomp <bsgcomp@arm.com>
2018-11-02COMPMID-1518: Add support for GEMM3D in CLGEMMLowpMatrixMultiplyCoreGeorgios Pinitas
Change-Id: Ib14ac821ee5d4aff80bd602cd3e76e7018abb5e6 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/150268 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Isabella Gottardi <isabella.gottardi@arm.com> Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
2018-11-02COMPMID-1470 Add auto-init of the output in NECol2imGiorgio Arena
The output of NECol2Im is already auto-initialized. This patch is about calling ShapeCalculator instead of computing the shape inside the kernel, adding validate_and_configure_window, and standardize the way convolved dims are passed (now NEON uses Size2D, while CL passes a pair of uint values: using Size2D for both implementations) Change-Id: I795696e1b6532f57847c3186c1b532c09f5a25da Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/145345 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-1498 - Enable grouping in CLGEMMConvolutionLayerGian Marco Iodice
Change-Id: I15c7df21773145b03f42b6f78bd7ad2e5b8a5219 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/144126 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-1485 - Add support for NHWC when running NEGEMMConvolutionLayer with ↵Gian Marco Iodice
FP16/QASYMM8 When the GEMM3D check fails, now we fallback to the classic implementation with im2col and col2im. In this manner the function can work with QASYMM8 and FP16 Change-Id: I359e9da3a63956f33b5acbc9bca4383b14af10e2 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143372 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1342 Add grouping support to CLIm2ColKernelGiorgio Arena
Change-Id: I4afb19751520a90fee27fb49b775cd10e92a94f5 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/140476 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1488 - Add support for NHWC when running CLGEMMConvolutionLayer with ↵Gian Marco Iodice
QASYMM8 Fixed also a bug in the graph API related to the bias shape in DepthWiseConvolution for NHWC Change-Id: I275141a42e51f6747b77db1c31d1bc69e8685af5 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/143454 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1277 - Optimizing CLIm2ColKernel for NHWC.Gian Marco Iodice
This patch includes: - Im2Col optimizations for NHWC using a new data layout - Refactoring of CLIm2ColKernel adding validation method and auto-init - Removed im2col_reduced from CLIm2ColKernel and created a new kernel CLFlattenLayerKernel Change-Id: I1620640b6796baa268324b33ae92cdd8de53e27c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/141241 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
2018-11-02COMPMID-872 - Rework NEGEMMConvolutionLayer to use NEGEMMGian Marco Iodice
Change-Id: I55f0018ac7214775ebbca63f58a3bf5c93732fec Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142632 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-1475: (OCLGrind) FP exception in NEGEMMConvolutionGeorgios Pinitas
Change-Id: I986099c269498cc7971b10ee634dba721954546e Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/142647 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>