aboutsummaryrefslogtreecommitdiff
path: root/src/core/CL/kernels
AgeCommit message (Collapse)Author
2019-01-11COMPMID-1677: Change ROIPooling layer interface to accept ROIs as tensorsManuel Bottini
Change-Id: If16b572a4d906187b77f32133a72a44316fa74e4 Reviewed-on: https://review.mlplatform.org/490 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2019-01-11COMPMID-1761: NEON: Implement PackIsabella Gottardi
Change-Id: Icc3392494b1e3361e8fd925da200827c494351b3 Reviewed-on: https://review.mlplatform.org/430 Reviewed-by: Manuel Bottini <manuel.bottini@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giuseppe Rossini <giuseppe.rossini@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2019-01-09COMPMID-1710: Collapse window in CLDepthConvertKernelGeorgios Pinitas
Change-Id: I16589a2b3beb18e20b56059fdabccc61e26e3944 Reviewed-on: https://review.mlplatform.org/481 Reviewed-by: Isabella Gottardi <isabella.gottardi@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2019-01-08Implementation of Permute CL kernel to handle all permutationsshubham
This patch will add a generic permute cl-kernel to handle all permutations available for tensors having rank upto 4. Change-Id: I50eb555d9d45d5ad5f7fa9b0a3862dd17551d458 Signed-off-by: shubham <shub98.gupta@samsung.com> Reviewed-on: https://review.mlplatform.org/449 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
2019-01-07COMPMID-1727 - CL: Implement GatherManuel Bottini
Change-Id: I3d859da09a4de1019bb8c2046725eab942247927 Reviewed-on: https://review.mlplatform.org/386 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-12-28COMPMID-1860: Invalid arguments in CLDepthwiseConvolution3x3 for NHWCGeorgios Pinitas
-Alters the kernel/function selection process to use validate for selection. -Fixes border kernel input in case of permutation. Change-Id: Ia61df3a0ed661349114dc125f33ad53ee40d9c76 Reviewed-on: https://review.mlplatform.org/443 Reviewed-by: Anthony Barbier <Anthony.barbier@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2018-12-21COMPMID-1836: Remove CLGEMMTranspose1xWKernel and replace with ↵giuros01
CLGEMMReshapeRHSMatrixKernel Change-Id: Ic5a4f32657a155380684dcd4b44fbb608ef40cb4 Reviewed-on: https://review.mlplatform.org/418 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2018-12-19COMPMID-1834: Add transpose support to CLGEMMReshapeLHSMatrixKernelGian Marco Iodice
Change-Id: I913a7297a0c34a05b1d37eab1489b430423700e8 Reviewed-on: https://review.mlplatform.org/417 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2018-12-18COMPMID-1722 : CL: Implement RangeVidhya Sudhan Loganathan
Change-Id: I88da6eb5289c303b1dc91606c1560ce629746058 Reviewed-on: https://review.mlplatform.org/381 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-12-17COMPMID-1812: CLSpaceToBatch paddings not calculated correctlyIsabella Gottardi
Change-Id: I63fed6799c4ed2848ff80cd7458124692a52bb98 Reviewed-on: https://review.mlplatform.org/400 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
2018-12-14COMPMID-1710: Fixes in StrideSlice calculations.Georgios Pinitas
Change-Id: I66eb922f1ff15142de278bf4439a61c979f98ba7 Reviewed-on: https://review.mlplatform.org/382 Reviewed-by: Matthew Bentham <matthew.bentham@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Marquez <pablo.tello@arm.com>
2018-12-14COMPMID-1687: Optimize CLGEMMMatrixMultiplyKernel for Mali-G76 - Part1Gian Marco Iodice
The current implementation is limited just to FP32 Change-Id: I185ab57e483e879d7c301e9cc3033efc8b41e244 Reviewed-on: https://review.mlplatform.org/389 Reviewed-by: Anthony Barbier <Anthony.barbier@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2018-12-12COMPMID-1786 Dispatch a single OpenCL when running CLScaleKernel with NHWC ↵Michalis Spyrou
with batch_size!=1 Change-Id: Ib5ea76c1ba7a7add1f050ca9168091bd30749725 Reviewed-on: https://review.mlplatform.org/377 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-12-11COMPMID-1775: Implement CLGEMMReshapeRHSMatrixKernel to reshape the RHS ↵Gian Marco Iodice
matrix of GEMM/GEMMLowp Change-Id: I77f2bfcc5d170bcc2428a2f27104942c1ec877d7 Reviewed-on: https://review.mlplatform.org/375 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2018-12-10COMPMID-1774: Implement CLGEMMReshapeLHSMatrixKernel to reshape the LHS ↵Gian Marco Iodice
matrix of GEMM/GEMMLowp Change-Id: I8c5fd4c8bcdffda1522c83158981ed92baa045f4 Reviewed-on: https://review.mlplatform.org/364 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2018-12-07COMPMID-1821 (Nighlty) CL/ArgMinMax/Float/ clEnqueueMapBuffer failuresMichalis Spyrou
Change-Id: I5a9d698f459fd7efdbe85d70d092e79e961cb4dd Reviewed-on: https://review.mlplatform.org/368 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2018-12-07COMPMID-1451: Fix shape check in CLStridedSliceKernelGeorgios Pinitas
Change-Id: I10979793e22296f9ab091405509d809fd692ab89 Reviewed-on: https://review.mlplatform.org/365 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Matthew Bentham <matthew.bentham@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2018-12-05COMPMID-1822: (Nightly) : 'CL/ArithmeticDivision mismatchesgiuros01
Change-Id: I14cea30ffa9ca735941b559bb272b8c476814a34 Reviewed-on: https://review.mlplatform.org/338 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <Anthony.barbier@arm.com> Reviewed-by: Isabella Gottardi <isabella.gottardi@arm.com>
2018-12-05COMPMID-1719 CL: Implement RSqrt, ExpMichalis Spyrou
Change-Id: I827b26239043a9e90d26c2583122648d2a45303a Reviewed-on: https://review.mlplatform.org/317 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2018-12-05COMPMID-1723: CL: Implement ReverseMichele Di Giorgio
Change-Id: Id0d4a07af24e2331161996083b0c1bab072bd405 Reviewed-on: https://review.mlplatform.org/322 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2018-12-05COMPMID-1298: Fuse ReLu activation in CLWinogradOutputTransformManuel Bottini
Change-Id: I9e6e43a5839d04c2e4b4552c05446efb0a5074cf Reviewed-on: https://review.mlplatform.org/232 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-12-05COMPMID-1725: Implement PackGian Marco Iodice
Change-Id: I13f6e4c600f39355f69e015409bf30dafdc5e3aa Reviewed-on: https://review.mlplatform.org/332 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2018-11-30COMPMID-1717: CL: Implement Maximum, Minimum, SquaredDifferencegiuros01
Change-Id: Ice653e48211053bd3cd20a693bd76de6b4efc370 Reviewed-on: https://review.mlplatform.org/270 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2018-11-30COMPMID-1728 CL: Implement ArgMax/ArgMinMichalis Spyrou
Change-Id: I7eae2e55cc0b0b7bbebb7617299daaca6f75f40c Reviewed-on: https://review.mlplatform.org/292 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-28COMPMID-1716: CL Comparison operationsGeorgios Pinitas
Adds support for Equal,NotEqual,Less,LessEqual,Greater,GreaterEqual Change-Id: If0cdf4aae7f95c94709b195eee485f6663f45909
2018-11-27COMPMID-1720: CL: Implement Tilegiuros01
Change-Id: I2a18f0acea382960a8bc71a8f56928a5998f0dd6
2018-11-23COMPMID-1734: Implement CLSelectGeorgios Pinitas
Change-Id: I49b2e8b4200c9ed654736d9451e4ab9c073b4b10
2018-11-22COMPMID-1645 NEL2Normalization for FP32/FP16 & NHWCMichalis Spyrou
Change-Id: I29e35024e29781a6b943b568abec9c73649215e6
2018-11-22COMPMID-1718: Extend DepthConvert to support CastGeorgios Pinitas
Change-Id: I6ee2c0b670727fc808fa636c53ddfaec3a0036c9
2018-11-22COMPMID-1648: CLNormalizationLayer IN_MAP_2D support for NHWC for FP32/FP16Michele Di Giorgio
Change-Id: I49f1d865f5e7562f1d80db849353a89ef77e6a9e
2018-11-21COMPMID-1451 Change PriorBox output to NCHwMichalis Spyrou
Output of Priorbox should be independent of the input data layout and should always be in NCHW format Change-Id: Ie80cd4e51c78945b158c0db1af1923bdf8d7ea7b
2018-11-21COMPMID-1451 Fix macros in SpaceToBatchMichalis Spyrou
Change-Id: I95fdf5bd85becfe081f6ae587284f3b294681308
2018-11-20COMPMID-1801 : (Nightly) CLWinogradConvolutionLayer FP16 mismatchesVidhya Sudhan Loganathan
FP mixed precision support added to GEMM kernel used for fp16 winograd conv on Midgard GPUs Change-Id: I1619beb025fc484a1ac9d3e528d785edabbc7ee6
2018-11-20COMPMID-1451: Fix CLBatchToSpace static validation methodMichalis Spyrou
Change-Id: I770b044b67d93510ef65e556905135b34be7ea0a
2018-11-19COMPMID-1065 : Create documentation explaining how to add new functions / ↵Vidhya Sudhan Loganathan
kernels Change-Id: I98183f95814442b6f3dbb67a1bdae99df05b9b01
2018-11-16COMPMID-1451: Fixes for BoundingBoxTransformgiuros01
- Fixing a bug for which we did not scale the boxes before transforming them - Adding the correct_transform_coords option to BoundingBoxTransformInfo Change-Id: I40281254bcf87e7c8583c119e99562414fe59822
2018-11-16COMPMID-1451: (3RDPARTY_UPDATE) Fixes for GenerateProposals graph node and ↵Michele Di Giorgio
BoxWithNMSLimitKernel COMPMID-1792: Accuracy issue in CLGenerateProposals This patch does the following: - Some fixes for GenerateProposals function and tests - Adapting BoxWithNMSLimitKernel to only accept U32 tensors as keeps_size - Update 3rdparty - Adds a small tolerance for a GenerateProposals test Change-Id: Ia8ec1cdfe941fe05003645e86deb9ea6a6044d74
2018-11-16COMPMID-1266 : Add support for FP16 in CLWinogradConvolutionLayer: 5x5 kernelsVidhya Sudhan Loganathan
Introduced F32 accumulation for F16 winograd gemm and output transform WinogradConvolution will be available for F16 only if fast math flag is enabled Change-Id: I215593c205236a0f9669218437bb40b184ec6a4f
2018-11-16COMPMID-1785: Support for 4D tensor in CLFlattenLayerKernelGian Marco Iodice
With this patch we are able to dispatch a single GPU job also in case of batched-flatten Change-Id: I755e7af29d44b24f67fa04bad3c9b7646e8deefc
2018-11-15COMPMID-1676: Change CLROIAlign interface to accept ROIs as tensorsManuel Bottini
Change-Id: I69e995973597ba3927d29e4f6ed5438560e53d77
2018-11-15COMPMID-1329: Add support for GenerateProposals operator in CLgiuros01
Change-Id: Ib0798cc17496b7817f5b5769b25d98913a33a69d
2018-11-14COMPMID-1462 SSD support: Create CL PriorBoxMichalis Spyrou
Change-Id: I5bf5d751ec7c02d96c26a769f49d03ea23a248b7
2018-11-14COMPMID-1781 Add channel support in CLL2NormalizationMichalis Spyrou
Change-Id: Ibab049f09413258c99335b7da6b151530a1bd136
2018-11-13COMPMID-1707: Create 3 special CLWidthConcatenate kernel to concatenate 2/4 ↵Michele Di Giorgio
and 8 tensors (Part 1) Creating special cases for concatening 2 and 4 tensors. Change-Id: I6a739a494ae45011acb65369e353f9ef96970b90
2018-11-08COMPMID-1451: Removed output_depth3d from ↵Gian Marco Iodice
CLGEMMLowpQuantizeDownInt32ToUint8ScaleByFloat Since we perform an element-wise operation, it is not necessary to pass the output_depth3d. Change-Id: Ibfa07a0706e902acf59b444aa61e18a348162ea9
2018-11-08COMPMID-1736: Fixed out-of-bound write in CLIm2ColGian Marco Iodice
The issue was related to CLIm2Col when the number of input channels was less than the number of elements processed by each thread. The bug has been fixed in the validate_and_configure_window() function setting the correct number of elements accessed in the output tensor. Also fixed an issue GEMM3D when we have a single output channel Change-Id: I094292d0c7662599c4a4c3916ec5f5821df5faef
2018-11-08COMPMID-1776: Revert QuantizeDownStage to use fixed-pointGeorgios Pinitas
Change-Id: I807ef84dbf893bd401dcac5c0fa3a4ee49aabc66
2018-11-06COMPMID-1703: Collapse the 4th dimensions in ↵Georgios Pinitas
CLDepthWiseConvolutionLayer3x3Kernel Change-Id: Ie274da79b15c03f86dfedc85bb721b3de34a0bb4
2018-11-02COMPMID-1699: Disable arithmetic operations in CLWinogradLayer when no ↵Georgios Pinitas
batches available. Change-Id: Iad83df2a9116a7f350de83ec59b28cd8893c8d3a Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155716 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02COMPMID-1704: Collapse the 4th dimension in CLPoolingLayerKernelGeorgios Pinitas
Change-Id: I76e57af6608b55b6f59a5d06aecc30063ee4c3cc Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155733 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>