Age | Commit message (Collapse) | Author |
|
Instead of dispatching the sum postop for GEMM kernels to a
separate kernel + add, that requires an extra destination sized
allocation, plus 3 extra load/stores per element,
just do it in the GEMM kernel.
Resolves: ONCPUML-1442
Signed-off-by: Radu Salavat <radu.salavat@arm.com>
Co-authored-by: Milos Puzovic <milos.puzovic@arm.com>
Change-Id: I7a1f2da3300875fa1ac88b705a34390969518077
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11298
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
This patch calculates the output quantization info based on the inputs'
quantization information.
The previous approach was using the same quantization information for
input, weights and output.
Remove QSYMM8_PER_CHANNEL path from the fixture as there are no
related tests
Remove repeated shapes from the dataset now that we get rid of the
quantization info from the dataset.
Combine signed and unsigned SmallGEMMLowpFusedBatchedMatMulDataset
into one as they become identical
Resolves COMPMID-6481, COMPMID-6634
Change-Id: I9f5a20f4bb45c3e5adab388564135ae8a5c0a9ea
Signed-off-by: SiCong Li <sicong.li@arm.com>
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10680
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
- Fix includes int8/uint8 quantized inputs
- Bias S32 value is limited to better allow detection of mismatches in gemmlowp kernel
Resolves: [COMPMID-5659]
Signed-off-by: Mohammed Suhail Munshi <MohammedSuhail.Munshi@arm.com>
Change-Id: Ie9cca430c6ab66253fe1d5252bd2c5396c7f38cf
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8514
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
multiplication with variable input tensors
Resolves: COMPMID-5506
Signed-off-by: Ramy Elgammal <ramy.elgammal@arm.com>
Change-Id: I8345a3b7a83ef46f9ec7a77197cc65c933ec9ac6
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8239
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
|
|
Ported kernels:
- CLGEMMLowpMatrixMultiplyNativeKernel
- CLGEMMLowpMatrixMultiplyReshapedKernel
- CLGEMMLowpMatrixMultiplyReshapedOnlyRHSKernel
- CLGEMMLowpOffsetContributionKernel
- CLGEMMLowpOffsetContributionOutputStageKernel
- CLGEMMLowpQuantizeDownInt32ScaleByFixedPointKernel
- CLGEMMLowpQuantizeDownInt32ScaleByFloatKernel
- CLGEMMLowpQuantizeDownInt32ScaleKernel
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: I9d5a744d6a2dd2f2726fdfb291bad000b6970de2
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5870
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Resolve COMPMID-4416
Change-Id: I83cdf0de7adaf4d465ffebd494ab913182072485
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5788
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
* Add missing padding immutability asserts in all relevant CL kernels
* Remove unnecessary zero padding validation tests.
Change-Id: If93f9ccbc988e0286f5e7b135f812141476d5da0
Signed-off-by: SiCong Li <sicong.li@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4446
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
CLGEMMLowpQuantizeDownInt32ScaleByFloatKernel
COMPMID-3725: Remove OpenCL padding: CLGEMMLowpQuantizeDownInt32ScaleKernel
Change-Id: Idea5974a56861efae3bc255f1224c7f1e88f3650
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4182
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
remove padding from related OpenCL kernels
Change-Id: I0b0be8fcccf511c7214e83ba6aa8d0e901bc4f3c
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4146
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Remove configuation tests that use the default data shapes.
There is no need to run them since configure will run as part
of the actual validation run.
Signed-off-by: Michalis Spyrou <michalis.spyrou@arm.com>
Change-Id: If6d88a6ba5e9463fa8c615fcf76a5c07d3049d53
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3638
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
preferred presentation
Change-Id: Ib7dcfcbb24b408999dfae366b9da396485aacf78
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3525
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
NEGEMMLowpQuantizeDownInt32ToUint8ScaleKernel
Signed-off-by: Luca Foschiani <luca.foschiani@arm.com>
Change-Id: Ia8692f8fda16fa3b73f343e4b5b1b55e14403225
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/2750
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
CLGEMMLowpQuantizeDownInt32ToUint8ScaleByFloatKernel
Signed-off-by: Sheri Zhang <sheri.zhang@arm.com>
Change-Id: I37e6e76dbd5546c0eaedfacd01ea905c37148e8a
Signed-off-by: Sheri Zhang <sheri.zhang@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/2861
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
CLGEMMLowpQuantizeDownInt32ToUint8ScaleKernel
Signed-off-by: Luca Foschiani <luca.foschiani@arm.com>
Change-Id: I4f7918630ea95fc28597b3d7b189f3d8fd35aef8
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/2890
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I447030e69b9e565f2f81529a41af8c5e7ece7ecf
Signed-off-by: Giorgio Arena <giorgio.arena@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/2702
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: Ifdaeb53c512ba697f174649c026075010f54f628
Signed-off-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-on: https://review.mlplatform.org/c/2472
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Sang-Hoon Park <sang-hoon.park@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Giuseppe Rossini <giuseppe.rossini@arm.com>
|
|
Change-Id: I93ad3e5b9531ce1699214ff6e657a76ffdaacedd
Signed-off-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-on: https://review.mlplatform.org/c/2396
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I16f6758b768ede404a064db057302ded706e1e8a
Signed-off-by: Vidhya Sudhan Loganathan <vidhyasudhan.loganathan@arm.com>
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-on: https://review.mlplatform.org/c/2215
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
Quantized LSTM
Change-Id: I7eddbdf77881f313b707b9e59428245f1330a2cf
Signed-off-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-on: https://review.mlplatform.org/c/2119
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Marquez <pablo.tello@arm.com>
|
|
Change-Id: Iab74b72f7adf712a1baf16aab916ea7c8d2bf92f
Signed-off-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-on: https://review.mlplatform.org/c/1497
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Marquez <pablo.tello@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
|
|
NEGEMMLowpMatrixMultiplyCore
Change-Id: Ic1a681e4cc03e1eba3bf8485d9cdb17b3e926047
Signed-off-by: giuros01 <giuseppe.rossini@arm.com>
Reviewed-on: https://review.mlplatform.org/c/561
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I359cc0fd0c3fa42ab10a770e59d58704403889b2
Reviewed-on: https://review.mlplatform.org/498
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: Isabella Gottardi <isabella.gottardi@arm.com>
|
|
OpenCL
COMPMID-1424 - Add dot product support for CLDepthwise QASYMM8 3x3 NHWC non-unit stride
With this patch we are able to improve the performance of MobileNet v1-qasymm8 by 37 %
Tried to use the dot product instruction in CLDepthwise QASYMM8 3x3 NHWC non-unit stride
but I have not seen any benefit (maybe because we have few arithemtic operation and we
do not have more load instructions). However Depthwise convolution has been improved by
30%
Change-Id: Id768a99c2e53a04276707e427af5d0ec93419ada
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155082
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: Ib14ac821ee5d4aff80bd602cd3e76e7018abb5e6
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/150268
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Isabella Gottardi <isabella.gottardi@arm.com>
Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
|
|
DoD:
- Implement NEON kernel for quantizing down the gemmlowp result. The
result should be scaled by a fixedpoint number
- Implement OpenCL kernel for quantizing down the gemmlowp result. The
result should be scaled by a fixedpoint number
- Add test for validating the result
Required for:
- Integration of GEMMLowp in Android NN
- Convolution quantized
- Fully connected quantized
Change-Id: Ia963d25d695471e963961fb49a5600e78374ac4f
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/110981
Tested-by: BSG Visual Compute Jenkins server to access repositories on http://mpd-gerrit.cambridge.arm.com <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I70e04d3a175ba366432ada98e9ca893c9f81b260
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/111094
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Tested-by: BSG Visual Compute Jenkins server to access repositories on http://mpd-gerrit.cambridge.arm.com <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Reworked the interface of GemmLowp in order to make easy the integration
in Android NN
- Added support for different output stage
- Added validation for both matrix multiplication and output stage
- Added bounded relu support in the output stage
- Added in32_t bias support
- Added optimized path for vector by matrix case
This rework is required for:
- Convolution quantized
- Fully connected quantized
Change-Id: I512283d406099cf8c614dd89d0a97ed411143afc
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/110625
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: BSG Visual Compute Jenkins server to access repositories on http://mpd-gerrit.cambridge.arm.com <bsgcomp@arm.com>
|