aboutsummaryrefslogtreecommitdiff
path: root/src/core/CL/kernels
AgeCommit message (Collapse)Author
2019-01-15COMPMID-1886 : Enable CLTuner for CLScaleVidhya Sudhan Loganathan
Change-Id: If92dc7fc46accdef5e67f96a91a8428c18db2028 Reviewed-on: https://review.mlplatform.org/517 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2019-01-15COMPMID-1687: Optimize CLGEMMMatrixMultiplyKernel (part 1)Gian Marco Iodice
Extended CLGEMMMatrixMultiplyReshapedKernel to support more parameters Change-Id: I4a27f986e3fe2dd071a4ccba5cfa0565f3db39ad Reviewed-on: https://review.mlplatform.org/495 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2019-01-14COMPMID-1758: NEON: Implement RangeManuel Bottini
Change-Id: I56dff9462b85760fbed6db43224cadb90d283810 Reviewed-on: https://review.mlplatform.org/472 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2019-01-14COMPMID-1724: CL Implement ProdManuel Bottini
Change-Id: I17e51f25064b53a8f7e13d6fcbecc14a192de103 Reviewed-on: https://review.mlplatform.org/387 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2019-01-14Issue COMPMID-1835: Remove CLGEMMInterleave4x4Kernel and replace with ↵giuros01
CLGEMMReshapeLHSMatrixKernel Change-Id: Id6a1bd78f9b1698b64a004e4adebc41002b15745 Reviewed-on: https://review.mlplatform.org/496 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2019-01-11COMPMID-1677: Change ROIPooling layer interface to accept ROIs as tensorsManuel Bottini
Change-Id: If16b572a4d906187b77f32133a72a44316fa74e4 Reviewed-on: https://review.mlplatform.org/490 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2019-01-11COMPMID-1761: NEON: Implement PackIsabella Gottardi
Change-Id: Icc3392494b1e3361e8fd925da200827c494351b3 Reviewed-on: https://review.mlplatform.org/430 Reviewed-by: Manuel Bottini <manuel.bottini@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Giuseppe Rossini <giuseppe.rossini@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2019-01-09COMPMID-1710: Collapse window in CLDepthConvertKernelGeorgios Pinitas
Change-Id: I16589a2b3beb18e20b56059fdabccc61e26e3944 Reviewed-on: https://review.mlplatform.org/481 Reviewed-by: Isabella Gottardi <isabella.gottardi@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2019-01-08Implementation of Permute CL kernel to handle all permutationsshubham
This patch will add a generic permute cl-kernel to handle all permutations available for tensors having rank upto 4. Change-Id: I50eb555d9d45d5ad5f7fa9b0a3862dd17551d458 Signed-off-by: shubham <shub98.gupta@samsung.com> Reviewed-on: https://review.mlplatform.org/449 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
2019-01-07COMPMID-1727 - CL: Implement GatherManuel Bottini
Change-Id: I3d859da09a4de1019bb8c2046725eab942247927 Reviewed-on: https://review.mlplatform.org/386 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-12-28COMPMID-1860: Invalid arguments in CLDepthwiseConvolution3x3 for NHWCGeorgios Pinitas
-Alters the kernel/function selection process to use validate for selection. -Fixes border kernel input in case of permutation. Change-Id: Ia61df3a0ed661349114dc125f33ad53ee40d9c76 Reviewed-on: https://review.mlplatform.org/443 Reviewed-by: Anthony Barbier <Anthony.barbier@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2018-12-21COMPMID-1836: Remove CLGEMMTranspose1xWKernel and replace with ↵giuros01
CLGEMMReshapeRHSMatrixKernel Change-Id: Ic5a4f32657a155380684dcd4b44fbb608ef40cb4 Reviewed-on: https://review.mlplatform.org/418 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2018-12-19COMPMID-1834: Add transpose support to CLGEMMReshapeLHSMatrixKernelGian Marco Iodice
Change-Id: I913a7297a0c34a05b1d37eab1489b430423700e8 Reviewed-on: https://review.mlplatform.org/417 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2018-12-18COMPMID-1722 : CL: Implement RangeVidhya Sudhan Loganathan
Change-Id: I88da6eb5289c303b1dc91606c1560ce629746058 Reviewed-on: https://review.mlplatform.org/381 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-12-17COMPMID-1812: CLSpaceToBatch paddings not calculated correctlyIsabella Gottardi
Change-Id: I63fed6799c4ed2848ff80cd7458124692a52bb98 Reviewed-on: https://review.mlplatform.org/400 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
2018-12-14COMPMID-1710: Fixes in StrideSlice calculations.Georgios Pinitas
Change-Id: I66eb922f1ff15142de278bf4439a61c979f98ba7 Reviewed-on: https://review.mlplatform.org/382 Reviewed-by: Matthew Bentham <matthew.bentham@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Marquez <pablo.tello@arm.com>
2018-12-14COMPMID-1687: Optimize CLGEMMMatrixMultiplyKernel for Mali-G76 - Part1Gian Marco Iodice
The current implementation is limited just to FP32 Change-Id: I185ab57e483e879d7c301e9cc3033efc8b41e244 Reviewed-on: https://review.mlplatform.org/389 Reviewed-by: Anthony Barbier <Anthony.barbier@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2018-12-12COMPMID-1786 Dispatch a single OpenCL when running CLScaleKernel with NHWC ↵Michalis Spyrou
with batch_size!=1 Change-Id: Ib5ea76c1ba7a7add1f050ca9168091bd30749725 Reviewed-on: https://review.mlplatform.org/377 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-12-11COMPMID-1775: Implement CLGEMMReshapeRHSMatrixKernel to reshape the RHS ↵Gian Marco Iodice
matrix of GEMM/GEMMLowp Change-Id: I77f2bfcc5d170bcc2428a2f27104942c1ec877d7 Reviewed-on: https://review.mlplatform.org/375 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2018-12-10COMPMID-1774: Implement CLGEMMReshapeLHSMatrixKernel to reshape the LHS ↵Gian Marco Iodice
matrix of GEMM/GEMMLowp Change-Id: I8c5fd4c8bcdffda1522c83158981ed92baa045f4 Reviewed-on: https://review.mlplatform.org/364 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2018-12-07COMPMID-1821 (Nighlty) CL/ArgMinMax/Float/ clEnqueueMapBuffer failuresMichalis Spyrou
Change-Id: I5a9d698f459fd7efdbe85d70d092e79e961cb4dd Reviewed-on: https://review.mlplatform.org/368 Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2018-12-07COMPMID-1451: Fix shape check in CLStridedSliceKernelGeorgios Pinitas
Change-Id: I10979793e22296f9ab091405509d809fd692ab89 Reviewed-on: https://review.mlplatform.org/365 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Matthew Bentham <matthew.bentham@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2018-12-05COMPMID-1822: (Nightly) : 'CL/ArithmeticDivision mismatchesgiuros01
Change-Id: I14cea30ffa9ca735941b559bb272b8c476814a34 Reviewed-on: https://review.mlplatform.org/338 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <Anthony.barbier@arm.com> Reviewed-by: Isabella Gottardi <isabella.gottardi@arm.com>
2018-12-05COMPMID-1719 CL: Implement RSqrt, ExpMichalis Spyrou
Change-Id: I827b26239043a9e90d26c2583122648d2a45303a Reviewed-on: https://review.mlplatform.org/317 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2018-12-05COMPMID-1723: CL: Implement ReverseMichele Di Giorgio
Change-Id: Id0d4a07af24e2331161996083b0c1bab072bd405 Reviewed-on: https://review.mlplatform.org/322 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2018-12-05COMPMID-1298: Fuse ReLu activation in CLWinogradOutputTransformManuel Bottini
Change-Id: I9e6e43a5839d04c2e4b4552c05446efb0a5074cf Reviewed-on: https://review.mlplatform.org/232 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-12-05COMPMID-1725: Implement PackGian Marco Iodice
Change-Id: I13f6e4c600f39355f69e015409bf30dafdc5e3aa Reviewed-on: https://review.mlplatform.org/332 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2018-11-30COMPMID-1717: CL: Implement Maximum, Minimum, SquaredDifferencegiuros01
Change-Id: Ice653e48211053bd3cd20a693bd76de6b4efc370 Reviewed-on: https://review.mlplatform.org/270 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2018-11-30COMPMID-1728 CL: Implement ArgMax/ArgMinMichalis Spyrou
Change-Id: I7eae2e55cc0b0b7bbebb7617299daaca6f75f40c Reviewed-on: https://review.mlplatform.org/292 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-28COMPMID-1716: CL Comparison operationsGeorgios Pinitas
Adds support for Equal,NotEqual,Less,LessEqual,Greater,GreaterEqual Change-Id: If0cdf4aae7f95c94709b195eee485f6663f45909
2018-11-27COMPMID-1720: CL: Implement Tilegiuros01
Change-Id: I2a18f0acea382960a8bc71a8f56928a5998f0dd6
2018-11-23COMPMID-1734: Implement CLSelectGeorgios Pinitas
Change-Id: I49b2e8b4200c9ed654736d9451e4ab9c073b4b10
2018-11-22COMPMID-1645 NEL2Normalization for FP32/FP16 & NHWCMichalis Spyrou
Change-Id: I29e35024e29781a6b943b568abec9c73649215e6
2018-11-22COMPMID-1718: Extend DepthConvert to support CastGeorgios Pinitas
Change-Id: I6ee2c0b670727fc808fa636c53ddfaec3a0036c9
2018-11-22COMPMID-1648: CLNormalizationLayer IN_MAP_2D support for NHWC for FP32/FP16Michele Di Giorgio
Change-Id: I49f1d865f5e7562f1d80db849353a89ef77e6a9e
2018-11-21COMPMID-1451 Change PriorBox output to NCHwMichalis Spyrou
Output of Priorbox should be independent of the input data layout and should always be in NCHW format Change-Id: Ie80cd4e51c78945b158c0db1af1923bdf8d7ea7b
2018-11-21COMPMID-1451 Fix macros in SpaceToBatchMichalis Spyrou
Change-Id: I95fdf5bd85becfe081f6ae587284f3b294681308
2018-11-20COMPMID-1801 : (Nightly) CLWinogradConvolutionLayer FP16 mismatchesVidhya Sudhan Loganathan
FP mixed precision support added to GEMM kernel used for fp16 winograd conv on Midgard GPUs Change-Id: I1619beb025fc484a1ac9d3e528d785edabbc7ee6
2018-11-20COMPMID-1451: Fix CLBatchToSpace static validation methodMichalis Spyrou
Change-Id: I770b044b67d93510ef65e556905135b34be7ea0a
2018-11-19COMPMID-1065 : Create documentation explaining how to add new functions / ↵Vidhya Sudhan Loganathan
kernels Change-Id: I98183f95814442b6f3dbb67a1bdae99df05b9b01
2018-11-16COMPMID-1451: Fixes for BoundingBoxTransformgiuros01
- Fixing a bug for which we did not scale the boxes before transforming them - Adding the correct_transform_coords option to BoundingBoxTransformInfo Change-Id: I40281254bcf87e7c8583c119e99562414fe59822
2018-11-16COMPMID-1451: (3RDPARTY_UPDATE) Fixes for GenerateProposals graph node and ↵Michele Di Giorgio
BoxWithNMSLimitKernel COMPMID-1792: Accuracy issue in CLGenerateProposals This patch does the following: - Some fixes for GenerateProposals function and tests - Adapting BoxWithNMSLimitKernel to only accept U32 tensors as keeps_size - Update 3rdparty - Adds a small tolerance for a GenerateProposals test Change-Id: Ia8ec1cdfe941fe05003645e86deb9ea6a6044d74
2018-11-16COMPMID-1266 : Add support for FP16 in CLWinogradConvolutionLayer: 5x5 kernelsVidhya Sudhan Loganathan
Introduced F32 accumulation for F16 winograd gemm and output transform WinogradConvolution will be available for F16 only if fast math flag is enabled Change-Id: I215593c205236a0f9669218437bb40b184ec6a4f
2018-11-16COMPMID-1785: Support for 4D tensor in CLFlattenLayerKernelGian Marco Iodice
With this patch we are able to dispatch a single GPU job also in case of batched-flatten Change-Id: I755e7af29d44b24f67fa04bad3c9b7646e8deefc
2018-11-15COMPMID-1676: Change CLROIAlign interface to accept ROIs as tensorsManuel Bottini
Change-Id: I69e995973597ba3927d29e4f6ed5438560e53d77
2018-11-15COMPMID-1329: Add support for GenerateProposals operator in CLgiuros01
Change-Id: Ib0798cc17496b7817f5b5769b25d98913a33a69d
2018-11-14COMPMID-1462 SSD support: Create CL PriorBoxMichalis Spyrou
Change-Id: I5bf5d751ec7c02d96c26a769f49d03ea23a248b7
2018-11-14COMPMID-1781 Add channel support in CLL2NormalizationMichalis Spyrou
Change-Id: Ibab049f09413258c99335b7da6b151530a1bd136
2018-11-13COMPMID-1707: Create 3 special CLWidthConcatenate kernel to concatenate 2/4 ↵Michele Di Giorgio
and 8 tensors (Part 1) Creating special cases for concatening 2 and 4 tensors. Change-Id: I6a739a494ae45011acb65369e353f9ef96970b90
2018-11-08COMPMID-1451: Removed output_depth3d from ↵Gian Marco Iodice
CLGEMMLowpQuantizeDownInt32ToUint8ScaleByFloat Since we perform an element-wise operation, it is not necessary to pass the output_depth3d. Change-Id: Ibfa07a0706e902acf59b444aa61e18a348162ea9