Age | Commit message | Author |
|
Was only failing for armv8.2-a for some reason
Change-Id: I3ee706aee22b7f1fb8223d0f6cc2e09bec7672ea
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/131443
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: I42f0e7dab38e45b5eecfe6858eaecee8939c8585
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/129291
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I7ebc944ef84fb2649123954ac5bd55f9d23bbf09
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/131147
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Now that the input transform can be multi-threaded, I re-enabled Winograd in all graph examples
Change-Id: I39ef78243bb47fdae135e18dcae2102af0675b3b
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/131048
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I83db135fa94c6884e080f0229a9b6430d908c029
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/129823
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: Ide7c6124eb19f13f15f517e62d705646a0cd1ecd
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/130184
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I524abd28188995ae9c7a43b189b1eb2d7546be93
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/130576
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: I0ca02e42807c1ad9afeffb7202a3556feb11442f
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/129701
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: I789065bfa0d4ef133388e1904c5caf31e450f80f
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/129495
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: If8a28fc6a3a58473df51c8e7399e6d06d0db10f9
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/127384
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I08ddb7f6e061178e7566518b48e4e18f8f078596
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/129825
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I5831241f3fc503717cc51136453c2bf96d4b420b
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/128484
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: I4df63ec2f4eb27a8a6eec2bea27741bf8dec6910
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/126966
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Avoid unspecified behavior in graph construction.
This is fixed in C++17.
Change-Id: I4ef45cb139bbd838103a9922441e32d2d16c33d2
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/127975
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
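The commit above does not say which expression was affected. One common source of unspecified behaviour before C++17 is the evaluation order of operands in a chained `operator<<` expression, and the sketch below assumes that was the issue. `ToyGraph`, `next_layer` and `build_in_order` are hypothetical names for illustration only, not part of the library.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a stream-style graph builder.
struct ToyGraph
{
    std::vector<std::string> nodes;

    ToyGraph &operator<<(const std::string &name)
    {
        nodes.push_back(name);
        return *this;
    }
};

// Hypothetical helper with a side effect: each call yields the next layer name.
inline std::string next_layer(int &counter)
{
    return "layer" + std::to_string(counter++);
}

// Before C++17, the two next_layer() calls in
//     g << next_layer(c) << next_layer(c);
// may be evaluated in either order, so the names could end up swapped.
// Giving each call its own statement is well-defined in every standard.
inline ToyGraph build_in_order()
{
    ToyGraph g;
    int c = 0;
    const std::string first  = next_layer(c); // evaluated first
    const std::string second = next_layer(c); // evaluated second
    g << first << second;
    return g;
}
```

Since C++17 the left operand of `<<` is sequenced before the right operand, which matches the remark that the problem is fixed in that standard.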
|
|
- Cleaned up build system
Change-Id: If2faa27ee5b31fa8b972836960ab3ef671059c8d
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/126435
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
|
|
Change-Id: I4d2f67206ca56e6468a6e1491ca93bdde31c32ff
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/126278
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I9c164a817c0cc5f264a5c71a59256dacc6314cb0
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/125456
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I279e29ce20b3dde57445264dc11491f127b44d70
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/124429
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I55eae35f35a3c7891e8d535907c861f022e43bea
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/125470
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Change-Id: I4a2deee9e4b2c54ea79d2895cfeca44190133b24
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/125453
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I1e2a1a77097d8017c274af3f97eba6964f80f5fa
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/122592
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: Ic1685de4e19e0ac79669ef2da64e1dc96c7ea0bf
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115248
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
This patch enables GEMM to execute multiple batches in parallel
https://confluence.arm.com/display/MLENG/Winograd%3A+batched+GEMM
Change-Id: I66222db041dd35e82af11fbb262fd1ebd3ca4b2f
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120866
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: Ie2fe8eac176a80a1a53b6f349dad6287218b82d5
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/122304
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
|
|
- To enable the OpenCL tuner, graph_init() must be called only
once all nodes have been instantiated
Change-Id: I28a51ccada8f81c12e4f4484b892f14a530f6f4d
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/121707
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
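The ordering constraint described in this commit can be illustrated with a minimal sketch in which initialisation is deferred until the first run, so the init step sees the complete node list. `ToyDeferredGraph` and its init hook are hypothetical and only stand in for the real graph and `graph_init()`, whose actual interface is not shown in this log.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical toy graph: the init hook runs once, on the first run(),
// after every node has been added.
class ToyDeferredGraph
{
public:
    explicit ToyDeferredGraph(std::function<void(std::size_t)> init_hook)
        : _init_hook(std::move(init_hook))
    {
    }

    void add_node(std::function<void()> node)
    {
        assert(!_initialised && "nodes must be added before the first run()");
        _nodes.push_back(std::move(node));
    }

    void run()
    {
        if(!_initialised)
        {
            _init_hook(_nodes.size()); // sees the complete node list
            _initialised = true;
        }
        for(auto &node : _nodes)
        {
            node();
        }
    }

private:
    std::function<void(std::size_t)>   _init_hook;
    std::vector<std::function<void()>> _nodes;
    bool                               _initialised = false;
};
```

A tuner attached through the hook would then be configured exactly once, with knowledge of every node it may need to tune.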
|
|
A ResidualLayer node (COMPMID-916) was also created, as required for the
ResNet architecture.
Change-Id: I4fb4d2e08a8d3ce206f96f7946f5afc3e244676a
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/121185
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
This reverts commit 2e8c7ee2ecebd9783c97bbd602a61989e1247d6b.
Change-Id: Id90691f427a68d01480889f8d5fff190fd72c5a3
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/121176
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: If0fbb6bbe5384038124d3dc189274b8266f796ca
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120771
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
A ResidualLayer node (COMPMID-916) was also created, as required for the
ResNet architecture.
Change-Id: I3aef0b6d6fd5bfcd4916fed4d8d4466b8a92b70d
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120562
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: I4fe501281f527e20e8fdd0253d59ea2c4629056b
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/120354
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
To use GEMM-based convolution in VGG16, a function has been created which
allocates 1.8 GB. If the function fails, DIRECT convolution is used
instead.
Change-Id: Ibec8928ee6fe6684d6dc24b7df380beeb671bf27
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/119490
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
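The fallback described in this commit might look roughly like the sketch below: probe whether the required buffer can be allocated, and pick the convolution method from the result. `ConvMethod`, `probe_with_new` and `select_conv_method` are hypothetical names; the actual function in the example is not shown in this log.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <new>

enum class ConvMethod
{
    GEMM,
    DIRECT
};

// A probe returns true when a buffer of the given size can be allocated.
using ProbeFn = std::function<bool(std::size_t)>;

// Default probe (assumption): try a non-throwing allocation and release it.
inline bool probe_with_new(std::size_t bytes)
{
    std::unique_ptr<unsigned char[]> buffer(new(std::nothrow) unsigned char[bytes]);
    return buffer != nullptr;
}

// Pick GEMM when the probe succeeds, otherwise fall back to DIRECT.
inline ConvMethod select_conv_method(std::size_t required_bytes,
                                     const ProbeFn &probe = probe_with_new)
{
    return probe(required_bytes) ? ConvMethod::GEMM : ConvMethod::DIRECT;
}
```

The probe is injectable so the decision can be exercised without really allocating gigabytes. Note that on systems with memory overcommit a successful allocation does not guarantee the pages are usable, so a real check may need to touch the memory as well.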
|
|
Change-Id: I9a607fe620f795cdea1a99fdd3f5f8c2fc76f980
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/119234
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: Ida1e9a836bc518bfe5563e16bf7f92bde5fc13f7
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118472
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
|
|
This patch brings MAC utilisation up to 25% when both stride_x and stride_y are equal to 1
Performance is reported on the following Confluence page:
https://confluence.arm.com/display/MLENG/Depthwise+convolution+3x3+FP32+performance%3A+ACL+18.02
Change-Id: Ida1b64be9a88805902a3d90194559b58eb1224a3
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/119068
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I84a914c13b162c4f74321c9cafc30a18ad4ebbdb
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118797
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
example
Change-Id: Ic639d51fb5dd4f78912a9b11abc7df79d205a22b
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118843
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: I12d4af007c123b19925ceb5e3c84285e096bc13b
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118718
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
This patch introduces a new GEMM that improves MAC utilisation by 10%
compared to the GEMM without reshape. However, this implementation is
not faster in all cases, as the time spent reshaping the matrices must
be taken into account. For this reason, a heuristic that selects the
optimal GEMM has been added to the function. More information about the
heuristic implementation can be found in COMPMID-852.
With this patch, the performance of GoogleNet, MobileNet, VGG16 and
SqueezeNet can improve by 1.5x.
More information about the performance uplift can be found here:
https://confluence.arm.com/display/MLENG/GEMM+FP32+performance%3A+ACL+18.02
Change-Id: I024563c06b9aed02a211a974e452bae5c233b04c
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117140
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
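The actual heuristic is documented in COMPMID-852 and is not reproduced in this log. Purely as an illustration of the trade-off described above, the sketch below uses a hypothetical cost model: the reshaped GEMM does the same multiply-accumulates faster, but pays a per-element cost to reshape both input matrices. `GemmCostModel`, `use_reshaped_gemm` and both constants are assumptions, not measured values.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical tuning constants; real values would come from measurement.
struct GemmCostModel
{
    double reshaped_speedup         = 1.10; // assumed gain from better MAC utilisation
    double reshape_cost_per_element = 4.0;  // assumed, in units of one MAC
};

// Returns true when the reshaped GEMM is estimated to be cheaper for an
// (m x k) * (k x n) product, including the time to reshape both inputs.
inline bool use_reshaped_gemm(std::size_t m, std::size_t n, std::size_t k,
                              const GemmCostModel &model = GemmCostModel{})
{
    const double macs          = static_cast<double>(m) * n * k;
    const double reshape_elems = static_cast<double>(m) * k + static_cast<double>(k) * n;
    const double cost_plain    = macs;
    const double cost_reshaped = macs / model.reshaped_speedup
                                 + model.reshape_cost_per_element * reshape_elems;
    return cost_reshaped < cost_plain;
}
```

Under these assumed constants, a large square product amortises the reshape, while a vector-by-matrix product (m equal to 1) does not, which is the kind of case where the GEMM without reshape would be preferred.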
|
|
when not running in a terminal
Change-Id: I4ec90803c5dc41b0cee05c36113ae3f189564d58
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117831
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: Ib21de61fe39d2768638af11c067dfc7bcf63aae2
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117112
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
|
|
Change-Id: I389e0d4104b7dde60b7cdd612a83f3328517e44c
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115804
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: Ic76b3b6adaff8c84ba4d2ca5283d9291c69344f0
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/114466
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: Ib178a97c080ff650094d02ee49e2a0aa22376dd0
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115717
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I180281e796e1670b9ad391d82d66ecde0119ef78
Note: this is for internal use only, which is why I think the hackiness of RunExample.cpp is acceptable.
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115154
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
- Propagates hints to subgraph.
- Fixes dispatching of the appropriate optimized DepthwiseConvolution
kernel for the OpenCL backend. The NEON backend is altered to default to
the generic case until COMPMID-769 is addressed.
Change-Id: I544f05cd99a9ac253f1b19aa4e4bb222b8fdd087
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/114781
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
to memory_barrier
Also fix the synchronisation issues between different kernels.
Change-Id: Ib59d83ae8d5cc8b0bdf13e6f4958edccdab91ca4
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/114594
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: Ic59b2d852d59abb3d149e29760a1e16978d41bdc
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/114593
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Joel Liang <joel.liang@arm.com>
Reviewed-by: Ioan-Cristian Szabo <ioan-cristian.szabo@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I56333ed23d30c5ec3094f64b78a023589064fe06
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/113375
Tested-by: BSG Visual Compute Jenkins server to access repositories on http://mpd-gerrit.cambridge.arm.com <bsgcomp@arm.com>
Reviewed-by: Jim He <jim.he@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Symbols from translation units of arm_compute_graph were stripped during
static linkage.
Forces all symbols of arm_compute_graph to be included.
Change-Id: Ib66f513792c8796fca10f8deaca887db474f2bed
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/113187
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
- Fixed data type issue in cl_sgemm
- Added support for NEON and OpenCL targets in graph examples. Previously,
only the OpenCL target could be run
- Added auto_init() in NEDepthwiseVectorToTensorKernel
Change-Id: I4410ce6f4992b2375b980634fe55f1083cf3c471
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/112850
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: BSG Visual Compute Jenkins server to access repositories on http://mpd-gerrit.cambridge.arm.com <bsgcomp@arm.com>
|