Age | Commit message (Collapse) | Author |
|
Change-Id: Ia4be053b9f5399fe7e241cebb4292890e957ae54
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/121141
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I68a98eff57c8db719a501b68541666e8bc5f2081
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/121180
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
This reverts commit 9a0875951d43dda035f32d2e0728cf59d80cb4d3.
Change-Id: I6af0bc64c656f91cf1e0357f8760defa08f2e78d
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/121190
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
activation at graph level
Change-Id: I84d4a212629b21794451ab5fb5c5b187b5e28f98
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120127
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: Ifcc406d2d0a99c911d6b6c875657b0e0028255d5
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/119148
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
padding
Currently an assert gets fired in debug mode, and we just ignore the asymmetric padding in release mode.
Change-Id: Ia6278b5722f7e93f356a975ab3243e6bb07e44a8
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120840
Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
a) Added support for kernel size 5.
b) Templatised data type for transforms and batched gemms kernels.
Change-Id: Idb83dda7a5eec19e015888ab31902bd791913297
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120540
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I6413a05f6870a0d04f12d7348269b15297ae8493
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/114696
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
This new optimization allows to achieve 36.3 % of MAC utilisation on Mate 9 @ 1GHz.
The performance have been reported here
https://confluence.arm.com/display/MLENG/GEMMLowp+performance%3A+ACL+18.02
Change-Id: I71b6a217068763dfdc11bbf3574ee0eb94f93679
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118531
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I3512d67b8a72b17db1381842ca42780e39cc511c
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120605
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Change-Id: Ic6097e7cf160e8b829fb521b7b99d9a57d9799d3
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118774
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I5a6413548b2c9b8972c91ddba57395509dffd87e
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120656
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I4ea57579d997dd6a2e248634e3b7cb58bb3e2838
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120693
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
-Swithes the 1x1 DeconvolutionLayer to use the ConvolutionLayer instead
of the DirectConvolutionLayer.
Change-Id: I3ffe152c42c3b1c7ea572f264cd3215df01aedc2
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120292
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I4083e8d16bb23933634f229a1408dfd0e8f2922a
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120069
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
-Enforces the use of the ConvolutionLayer function in the
DeconvolutionLayer.
-Adds tests for 4x4 Deconvolution.
-Alters the ConvolutionLayer validation to support even kernels.
Change-Id: Id27e285f078e690b8dd58490dd8ea6d875b3cec6
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118632
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Added new benchmarks GEMM in order to evaluate the performance when the
input matrix B has to be reshaped only once
Change-Id: I1c4790213704ce57ea7b28f6f362c56edccd1eb9
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118910
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I9a607fe620f795cdea1a99fdd3f5f8c2fc76f980
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/119234
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: Ida1e9a836bc518bfe5563e16bf7f92bde5fc13f7
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118472
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
|
|
Change-Id: Ifa74e2bf05546de9a49aa185e22fba50438d8ad6
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/113946
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
|
|
Change-Id: Ib8100e7c659c49694c746fa3f36ce20f44f6929f
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117804
Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
This patch brings the MACs utilisation up to 25 % when both stride_x and stride_y are equal to 1
Performance reported in the following confluence page:
https://confluence.arm.com/display/MLENG/Depthwise+convolution+3x3+FP32+performance%3A+ACL+18.02
Change-Id: Ida1b64be9a88805902a3d90194559b58eb1224a3
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/119068
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Removed double managment of the same tensor object
Change-Id: Ibc74cd8c7bd199cd473ff68f692840cbf01b27b3
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/119119
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Isabella Gottardi <isabella.gottardi@arm.com>
|
|
Change-Id: I5a420da6a8041f9ff6d0811815f2fc74c85c56a8
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/119014
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I84a914c13b162c4f74321c9cafc30a18ad4ebbdb
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118797
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: Ia30ec2afce0aafcd39f41440efb972b18bbda9f8
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118657
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
|
|
Change-Id: I5296815cf04e5f805d6523196567b6c01715c8b5
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118711
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I71f67789648ef05ccdedce77c7427bc0127b3a69
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116741
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I83d0f2bc8e0ebfdc0b60931f2c5acf0469caf886
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118696
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Arm Cortex-A55 FPGA with 8 CPUs with lots of flags).
Change-Id: I493fb1013c6c25d9b9c809705b1ee24abac1d8d1
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118456
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
1) Removed the example files winograd_layer.hpp/cpp
2) Teplatized winograd transform kernels
Change-Id: I7045fa0b801b9d30a11275914aaa2dafd254aed2
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118332
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
issue on OpenGL ES
Change-Id: I7a8489bb0fddc72899ea165e414ee87bdbfb45b3
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118106
Reviewed-by: Joel Liang <joel.liang@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Also, added instrumentation to support generic tensor broadcasting for
NEON and CL backends.
Change-Id: I1bc5747a286e1a4b464c209067581e103d473b9a
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/114201
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: Icbb43de7642e2b433d7471d70b9dbbde850989d3
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118197
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
|
|
Change-Id: I2d3cc9668852a1ba414fc3148866df408f770dc8
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118308
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I0a7ea4cde1dbf8edd28908dfff80928ef7e996c4
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117647
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: Iff50adf2993bd69c2696a47559d6b2e0011fed87
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/110177
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I80437f7ba6e4b8ec1fb145300a017b3688f3f2b6
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118086
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I4f2cca52caf210fdb7d6bb7e9436ac51cb5088b4
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/112398
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
1) Updated to the latest code from the RSH repo.
2) Moved winograd transforms into kernels.
3) Added support for biases
Change-Id: I7f39f34a599b49d7d9b549cc10a4f4d4a8007ab8
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117474
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I33cf54e68f6c097ac58b6f16c3f9a720978f09cd
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117289
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: Iec82a91ad351cfe8d07d0976a24bd42f4703177a
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116833
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Change-Id: If8c1e0103ae2e3dfde3d0b9f23575c0e904c7f30
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117961
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
This patch introduces a new GEMM capable to improve the mac utilisation
of 10% compared to the GEMM without reshape. However this implementation
is not faster in all cases as we need to take into account the time for
reshaping the matrices. For this reason an heuristic solution to select
the optimal GEMM to use has been added to the function. More information
about the heuristic implementation can be found at COMPMID-852.
With this new patch, GoogleNet, MobileNet, VGG16 and SqueezeNet can
improved the performance of 1.5x.
More information about the performance uplift can be found here:
https://confluence.arm.com/display/MLENG/GEMM+FP32+performance%3A+ACL+18.02
Change-Id: I024563c06b9aed02a211a974e452bae5c233b04c
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117140
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I9dbb090cac731d68bd98a7d1c8ab0e1cb0a5c911
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116746
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Tested-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
The performance improvements have been reported at the following
confluence page:
https://confluence.arm.com/display/MLENG/GEMMLowp+performance%3A+ACL+18.02
Config3 of McVail looks improved by 29x
Change-Id: I8b203c0b75fc368f85cea863b7eed398fab3e79a
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115783
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I801b5e393a16a9f92c062826e6fcfd5982ca7bb3
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116584
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I86d7f53b5f5d1dbc22078aea5c32b08a25d1f49e
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116634
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I6d97b649f1ebc289c9e6f8949e67740a6b3cbcb2
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116636
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I389e0d4104b7dde60b7cdd612a83f3328517e44c
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115804
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|