path: root/src/core
Age | Commit message | Author
2018-11-02 | COMPMID-1712 CLPoolingLayer wrong results in QASYMM8 | Michalis Spyrou
Also added the test case reported by ArmNN. Change-Id: I9fe9a1b4f74267a3346529f3a597b37486593c4a Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155914 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02 | COMPMID-1699: Disable arithmetic operations in CLWinogradLayer when no batches available. | Georgios Pinitas
Change-Id: Iad83df2a9116a7f350de83ec59b28cd8893c8d3a Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155716 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02 | COMPMID-1704: Collapse the 4th dimension in CLPoolingLayerKernel | Georgios Pinitas
Change-Id: I76e57af6608b55b6f59a5d06aecc30063ee4c3cc Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155733 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-1413 - Improve the performance of GEMMLowp with 8 bit dot product on OpenCL | Gian Marco Iodice
COMPMID-1424 - Add dot product support for CLDepthwise QASYMM8 3x3 NHWC non-unit stride. With this patch we are able to improve the performance of MobileNet v1-qasymm8 by 37%. Tried to use the dot product instruction in CLDepthwise QASYMM8 3x3 NHWC non-unit stride but I have not seen any benefit (maybe because we have few arithmetic operations and we do not have more load instructions). However, depthwise convolution has been improved by 30%. Change-Id: Id768a99c2e53a04276707e427af5d0ec93419ada Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155082 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02 | COMPMID-1029: Collapse CLWinogradInputTransform/CLWinogradOutputTransform | Georgios Pinitas
Change-Id: I051748502ca24b9952e7313524bbfd708162efb4 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155166 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02 | COMPMID-1451: Fix CL/NEPermuteKernel PermutationVector check | Isabella Gottardi
COMPMID-1690: Add tests for NEPermute with PermutationVector dimension > 3 Change-Id: I4bfc6ff88cd46863c2e39975b5663c624db1a63d Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155316 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: bsgcomp <bsgcomp@arm.com>
2018-11-02 | COMPMID-1451: Fix inlines in cl helpers | Georgios Pinitas
Change-Id: I9cb725a8052091469904ecc7cfffa4add9914ffb Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155261 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Manuel Bottini <manuel.bottini@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
2018-11-02 | COMPMID-1530 error: dereferencing type-punned pointer will break strict-aliasing rules | Michalis Spyrou
Change-Id: I9e54d07cf1d77c14f124056d3724b49981bf3f97 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155292 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: bsgcomp <bsgcomp@arm.com>
2018-11-02 | COMPMID-1681: (Nightly) NEWidthConcatenateLayer fails | Michele Di Giorgio
NEWidthConcatenateLayerKernel works with 4D tensors too, hence the check has been removed and tests have been added. Change-Id: I73814cabe5fae975a44cc1a03b092c552497e57d Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155070 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Isabella Gottardi <isabella.gottardi@arm.com>
2018-11-02 | COMPMID-1673: Collapse window in CLArithmeticAddition when one operand is a vector | Michele Di Giorgio
When one of the operands is a vector, the kernel does a broadcast addition and the window is not collapsed. This is an issue because it leads to a lot of enqueues, which increase the time taken by the OpenCL driver. This patch allows the window to be collapsed when one of the two operands is a vector. Furthermore, it adds the LWS tuner to the kernel. It also changes the number of elements processed per iteration to 8 to make better use of the cache. Change-Id: I5f09ab0ddcffb3b7f9326a987c79a997b2d7fa8c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155003 Reviewed-by: Giuseppe Rossini <giuseppe.rossini@arm.com> Tested-by: bsgcomp <bsgcomp@arm.com>
2018-11-02 | COMPMID-1451: Perform CLOutputStage using floats. | Georgios Pinitas
Change-Id: Ic8312a5b6790aa7cd4468d42f08d557ad40e9441 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/154570 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02 | COMPMID-1451: Fuse activation in DepthwiseConvolution. | Georgios Pinitas
Change-Id: Id964d9068e18aaa13ab8adcbf7a9375b034ea6c3 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/154651 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-11-02 | COMPMID-1327: Add support for BBoxTransform operator in CL | giuros01
Change-Id: I91865506166951b3bf7f06a0b2d4cde925cfefb6 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/153447 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
2018-11-02 | COMPMID-1632 Add CLL2NormalizationLayer for NHWC and FP32 | Michalis Spyrou
Change-Id: Iae22554d5fe893fd22a000eab5bfd8275ea06eb3 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/154102 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: bsgcomp <bsgcomp@arm.com>
2018-11-02 | COMPMID-1523: Fuse BN node with convolution. | Georgios Pinitas
Change-Id: I146936c9e98b343496a4b61cdbadf0eaa38e885a Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/154008 Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Reviewed-by: Giuseppe Rossini <giuseppe.rossini@arm.com> Tested-by: bsgcomp <bsgcomp@arm.com>
2018-11-02 | COMPMID-1667: Add 4D tensors support to CLWidthConcatenateLayerKernel | Michele Di Giorgio
Change-Id: Ibc0b1242804c2fdb183825406e3c78bd0d1d3564 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/154368 Reviewed-by: Pablo Tello <pablo.tello@arm.com> Tested-by: bsgcomp <bsgcomp@arm.com>
2018-11-02 | COMPMID-1580 Implement ReduceMean in NEON | Michalis Spyrou
Change-Id: Id974efad304c2513b8824a6561ad45ee60b9e7fb Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/153763 Reviewed-by: Giuseppe Rossini <giuseppe.rossini@arm.com> Reviewed-by: Isabella Gottardi <isabella.gottardi@arm.com> Tested-by: bsgcomp <bsgcomp@arm.com>
2018-11-02 | COMPMID-1586: Add support for NHWC CLDeconvolutionLayer | Michele Di Giorgio
COMPMID-1651: Fix QASYMM8 CLDeconvolutionLayer This patch also extends the range of values used for testing Convolution and Deconvolution to cover quantized [-1.0f, 1.0f]. Change-Id: I8b280669db67bb3ec25bf5d411c8f5954f5b0dab Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/149869 Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Tested-by: bsgcomp <bsgcomp@arm.com>
2018-11-02 | COMPMID-1574 Implement ReduceMean in OpenCL | Michalis Spyrou
Change-Id: Id331199f569f52a37280a9ada5bf84694580b93c Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/152843 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
2018-11-02 | COMPMID-1451: Reverting changes for CLGEMM and CLGEMMLowp previously done (384496) | Isabella Gottardi
Mirroring CLGEMM behaviour to CLGEMMLowp Change-Id: I308b54e2c0de131a5322b77e83e7454db498d692 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/153175 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: bsgcomp <bsgcomp@arm.com>
2018-11-02 | COMPMID-1451: Fix NormalizationLayer across width normalization. | Georgios Pinitas
NEON and CL normalization layers were generating invalid results for radius > 4. Change-Id: I15d846405e6b3492fe44920bbf8cadceb4e5258f Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/153161 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Matteo Martincigh <matteo.martincigh@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
2018-11-02 | COMPMID-1621 Deconvolution wrong output calculation | Michalis Spyrou
Change-Id: Ida71312bcf6dbd854f2ab1efc65f74910c79e152 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/151510 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
2018-11-02 | COMPMID-1451: Fix compilation issues under gcc 8 | Georgios Pinitas
Change-Id: I05d3447336ee0bf330e2a0c58fc6904be1db8f83 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/152626 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
2018-11-02 | COMPMID-1623: NEWinograd reduce the number of output tiles. | Pablo Tello
Change-Id: I4d9240924fe483d2dd127ad6a4ae6f8066f61bd1 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/151893 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Andrew Mundy <andrew.mundy@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02 | COMPMID-1451: Enable dot kernels in NEGEMMAssembly functions | Georgios Pinitas
Change-Id: I9dd26b80025ea3a4c66f5f0bf41b7a98dd0d3aa4 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/152549 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
2018-11-02 | COMPMID-1607 - (Nightly) CLGEMMLowpMatrixMultiplyCore errors and mismatches | Isabella Gottardi
Change-Id: I5f2e6843526cb154176a5b113627d4f36c3a8edd Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/150967 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: bsgcomp <bsgcomp@arm.com>
2018-11-02 | COMPMID-1546 Optimize PoolingLayer NHWC on NEON for all data types | Michalis Spyrou
Change-Id: I4920e43059a713126f15493f38fe50f07d0a8c7f Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/151087 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
2018-11-02 | COMPMID-1610: Fixed CLDirectConvolution mismatches | Pablo Tello
Kernel size 5x5 layout NHWC. Change-Id: Ia82ff211d1c954df228962b5c2c5ad8df7112449 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/151740 Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Tested-by: bsgcomp <bsgcomp@arm.com>
2018-11-02 | [COMPMID-1331] Add support for RoIAlign operator in CL | giuros01
Change-Id: Ie215daacd10477309dbf8af1bb2b05b7a0a8f203 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/150773 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Isabella Gottardi <isabella.gottardi@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
2018-11-02 | COMPMID-1607 - (Nightly) CLGEMMLowpMatrixMultiplyCore errors and mismatches | Isabella Gottardi
COMPMID-1608 - (Nightly) CLGEMMConvolutionLayer QASYMM8 errors and mismatches COMPMID-1609 - (Nightly) CLFullyConnectedLayer QASYMM8 mismatches Change-Id: I84c0d4f468be892f437f9f38b964dc7dfb66663a Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/150869 Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Tested-by: bsgcomp <bsgcomp@arm.com>
2018-11-02 | COMPMID-286: CL colour convert to U8 | Manuel Bottini
Change-Id: I62bbf510cc106a90ed2884be3c9c0c127da25898 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/150681 Reviewed-by: Giuseppe Rossini <giuseppe.rossini@arm.com> Tested-by: bsgcomp <bsgcomp@arm.com>
2018-11-02 | COMPMID-1519: Add support for 3D input/output in CLGEMMLowpOutputStage | Georgios Pinitas
Change-Id: I637add70310d2da4d82b236a6352af9d33be17a1 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/149706 Reviewed-by: Isabella Gottardi <isabella.gottardi@arm.com> Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com> Tested-by: bsgcomp <bsgcomp@arm.com>
2018-11-02 | COMPMID-1600: Reduce number of tile specialisations. | Pablo Tello
Change-Id: I4d06eca9404ea6d3df9d0ca52f5d6f5421ab7116 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/150117 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02 | COMPMID-287: NEON colour convert to U8 | Manuel Bottini
Change-Id: I47033fa70881fd32b13266adb6ccbf10c202aabc Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/150344 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com>
2018-11-02 | COMPMID-1596 Create UpsampleLayer for NEON | Michalis Spyrou
Change-Id: I82d95c4f1c5fed13b213a2591cc2b4e0d0e02a54 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/149676 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com> Tested-by: bsgcomp <bsgcomp@arm.com>
2018-11-02 | COMPMID-1518: Add support for GEMM3D in CLGEMMLowpMatrixMultiplyCore | Georgios Pinitas
Change-Id: Ib14ac821ee5d4aff80bd602cd3e76e7018abb5e6 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/150268 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Isabella Gottardi <isabella.gottardi@arm.com> Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
2018-11-02 | COMPMID-1598 : Fix compilation error in CLDepthwiseConvolutionQS8 kernel | Georgios Pinitas
Change-Id: I65eeb0cba2af462c6ef64a536ad263c407d62811 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/149609 Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Tested-by: bsgcomp <bsgcomp@arm.com>
2018-11-02 | COMPMID-1540 Implement YOLOLayer on NEON | Michalis Spyrou
Change-Id: Ice28996959dc666fff5e8ae486c1ff8093db083f Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/148367 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: bsgcomp <bsgcomp@arm.com>
2018-11-02 | COMPMID-1446 : Add support for 3D output in NEGEMMLowpOutputStage | Georgios Pinitas
Change-Id: I61e7d39d09a9936b1128ec04038fa2d8dfe6a2c8 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/149211 Reviewed-by: Isabella Gottardi <isabella.gottardi@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: bsgcomp <bsgcomp@arm.com>
2018-11-02 | COMPMID-1588 Create UpsampleKernel for YOLOLayer | Michalis Spyrou
Change-Id: Ic1f9e85306a0a0b1459c9f9aa35bd629deea1710 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/148797 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02 | COMPMID-1581: Collapse windows | Georgios Pinitas
Change-Id: Iec56c9a96d9736a63f13b65efa33311950f20661 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/148572 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: bsgcomp <bsgcomp@arm.com>
2018-11-02 | COMPMID-1591: Fix NEPoolingLayer for NHWC | Georgios Pinitas
Restore window step across width to 4 for FP32 instead of the whole row as the kernel code was inconsistent with this decision. Change-Id: I7c4dcdf960b8cbc970a36fa1df39df2c6f000c86 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/148908 Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: bsgcomp <bsgcomp@arm.com>
2018-11-02 | COMPMID-1564: Add QASYMM8 on CLPixelwiseMultiplication | Georgios Pinitas
Change-Id: I5f719f5b2915c18cd0ca6271db401152112863a6 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/148982 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Isabella Gottardi <isabella.gottardi@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Reviewed-by: Giuseppe Rossini <giuseppe.rossini@arm.com>
2018-11-02 | COMPMID-1554 Implementing Space to Batch on OpenCL - NHWC | Michalis Spyrou
Change-Id: Ifa37a6758f79d0a6ca771dcfb4c55a5d96b452d0 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/148892 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: bsgcomp <bsgcomp@arm.com>
2018-11-02 | COMPMID-1564: Add NEDepthwiseConvolution3x3 for QASYMM8 | Georgios Pinitas
Change-Id: I1f55508af6f220e5f41df7b56daffb4761ed0591 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/148253 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Isabella Gottardi <isabella.gottardi@arm.com>
2018-11-02 | COMPMID-1568: Add support for QASYMM8 to CLNormalizePlanarYUV | Michele Di Giorgio
Change-Id: Id7ea6e7f57179478e5ba0e9231274e98fa089590 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/148028 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02 | COMPMID-1532: Add DepthwiseConvolution3x3 FP16 on NEON | Georgios Pinitas
Change-Id: I780970f317b979b3230e2b471ac01df7fda9ee14 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/148168 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | COMPMID-1563: Fix name of NEGEMMInterleavedWrapper | Anthony Barbier
Change-Id: I5f868091cae7bd86eeeb7216d44f32c190c5a604 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/147804 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02 | COMPMID-1566: Add broadcast to CLArithmeticSubtraction | Georgios Pinitas
Change-Id: I05d21f9a92013ecfd1128d12cf1561cfd6e5c5e9 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/147983 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02 | [COMPMID-1229] Implementing Pad on OpenCL -FP32/FP16 | Giuseppe Rossini
Change-Id: Ideead99410e5e0bda1035030af1bbcd0a65ea15e Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/144792 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>