aboutsummaryrefslogtreecommitdiff
path: root/arm_compute/runtime/CL/functions/CLGEMMLowpMatrixMultiplyCore.h
AgeCommit message (Collapse)Author
2019-04-16COMPMID-2110: Enable CLGEMMLowpMatrixMultiplyReshapeOnlyRHSKernel in CLGEMMLowpGian Marco Iodice
Change-Id: Ic32c803c3e2a067de10a7e46c85c962a970957b6 Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-on: https://review.mlplatform.org/c/969 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Comments-Addressed: Arm Jenkins <bsgcomp@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2019-03-15COMPMID-1694: Fuse offset contribution with the output stage when we use ↵George Wort
NEGEMMLowpMatrixMultiplyCore Change-Id: Ic1a681e4cc03e1eba3bf8485d9cdb17b3e926047 Signed-off-by: giuros01 <giuseppe.rossini@arm.com> Reviewed-on: https://review.mlplatform.org/c/561 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2019-01-25COMPMID-1698: Implementing CLGEMMLowpMatrixMultiplyReshapedKernelGian Marco Iodice
Change-Id: Ia4db21b394a0b9235393202ce3c00b11cceb94ea Reviewed-on: https://review.mlplatform.org/568 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
2019-01-14Issue COMPMID-1835: Remove CLGEMMInterleave4x4Kernel and replace with ↵giuros01
CLGEMMReshapeLHSMatrixKernel Change-Id: Id6a1bd78f9b1698b64a004e4adebc41002b15745 Reviewed-on: https://review.mlplatform.org/496 Tested-by: Arm Jenkins <bsgcomp@arm.com> Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
2018-12-21COMPMID-1836: Remove CLGEMMTranspose1xWKernel and replace with ↵giuros01
CLGEMMReshapeRHSMatrixKernel Change-Id: Ic5a4f32657a155380684dcd4b44fbb608ef40cb4 Reviewed-on: https://review.mlplatform.org/418 Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com> Tested-by: Arm Jenkins <bsgcomp@arm.com>
2018-11-02COMPMID-1413 - Improve the performance of GEMMLowp with 8 bit dot product on ↵Gian Marco Iodice
OpenCL COMPMID-1424 - Add dot product support for CLDepthwise QASYMM8 3x3 NHWC non-unit stride With this patch we are able to improve the performance of MobileNet v1-qasymm8 by 37 % Tried to use the dot product instruction in CLDepthwise QASYMM8 3x3 NHWC non-unit stride but I have not seen any benefit (maybe because we have few arithemtic operation and we do not have more load instructions). However Depthwise convolution has been improved by 30% Change-Id: Id768a99c2e53a04276707e427af5d0ec93419ada Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155082 Tested-by: bsgcomp <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
2018-11-02COMPMID-1145: (API) Introduce prepare() stage (NEON/CL/GLES)Georgios Pinitas
Change-Id: I5b46764f9c3154ec3e3b9c951cc9e6dfbcb81dfb Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/134255 Reviewed-by: Anthony Barbier <anthony.barbier@arm.com> Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Pablo Tello <pablo.tello@arm.com> Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
2018-11-02COMPMID-617: Adds CLFullyConnectionLayer validation supportGeorgios Pinitas
Change-Id: I4d2eb9872a3165fdcaa7784596e441cbe563dbc2 Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/112577 Tested-by: Jenkins <bsgcomp@arm.com> Reviewed-by: Ioan-Cristian Szabo <ioan-cristian.szabo@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-661: Convolution quantized (#32)Chunosov
Change-Id: Id69df4ce98d1d89bdf9c9aa5c4d909659909b30f Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/110456 Tested-by: BSG Visual Compute Jenkins server to access repositories on http://mpd-gerrit.cambridge.arm.com <bsgcomp@arm.com> Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
2018-11-02COMPMID-697 - Rework GEMMLowp interface on OpenCLGian Marco
Reworked the interface of GemmLowp in order to make easy the integration in Android NN - Added support for different output stage - Added validation for both matrix multiplication and output stage - Added bounded relu support in the output stage - Added in32_t bias support - Added optimized path for vector by matrix case This rework is required for: - Convolution quantized - Fully connected quantized Change-Id: I512283d406099cf8c614dd89d0a97ed411143afc Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/110625 Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com> Tested-by: BSG Visual Compute Jenkins server to access repositories on http://mpd-gerrit.cambridge.arm.com <bsgcomp@arm.com>