Age | Commit message (Collapse) | Author |
|
Change-Id: I84a914c13b162c4f74321c9cafc30a18ad4ebbdb
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118797
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Introduced optimizations for 1x1, 3x3, 5x5 and 11x11
Change-Id: Ibb7f7a9fbec01a7684746ed8513634078126e452
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118107
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
|
|
Change-Id: If6f3888a035b557a6c369efa22b56d6c8d3efbd3
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118789
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
|
|
example
Change-Id: Ic639d51fb5dd4f78912a9b11abc7df79d205a22b
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118843
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: I12d4af007c123b19925ceb5e3c84285e096bc13b
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118718
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: Ia30ec2afce0aafcd39f41440efb972b18bbda9f8
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118657
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
|
|
Change-Id: I5296815cf04e5f805d6523196567b6c01715c8b5
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118711
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: Iebd2a8fece1af87c93d6795e176d8c37ca64bbf6
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118187
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I71f67789648ef05ccdedce77c7427bc0127b3a69
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116741
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: Ie4ac7f61675c1fb9b1748d6784fccb26f058832a
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118635
Reviewed-by: Robert Hughes <robert.hughes@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I83d0f2bc8e0ebfdc0b60931f2c5acf0469caf886
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118696
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Arm Cortex-A55 FPGA with 8 CPUs with lots of flags).
Change-Id: I493fb1013c6c25d9b9c809705b1ee24abac1d8d1
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118456
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
1) Removed the example files winograd_layer.hpp/cpp
2) Teplatized winograd transform kernels
Change-Id: I7045fa0b801b9d30a11275914aaa2dafd254aed2
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118332
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I14ff5e2964328d22c0bba5a77683e07f0c7920e9
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118389
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
issue on OpenGL ES
Change-Id: I7a8489bb0fddc72899ea165e414ee87bdbfb45b3
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118106
Reviewed-by: Joel Liang <joel.liang@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I5366d11aefdb8f3ba7326ed7527eb216c4de0668
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118372
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: Ie480332e6e302edd406627e90be0d7df3e61dde5
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118303
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
investigated
Change-Id: I5a69198bfd60d9cdd061f2db9838d9f0df9ecc23
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118454
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Also, added instrumentation to support generic tensor broadcasting for
NEON and CL backends.
Change-Id: I1bc5747a286e1a4b464c209067581e103d473b9a
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/114201
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: Icbb43de7642e2b433d7471d70b9dbbde850989d3
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118197
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
|
|
Change-Id: I2d3cc9668852a1ba414fc3148866df408f770dc8
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118308
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
specific dataset
Change-Id: I227e90445715c3bd394e49930b010c0a5f5ca177
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118108
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Joel Liang <joel.liang@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: Ic1f215c1ae85ad5c516cc3600447a50bba77ebc1
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117668
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I38ae204632ae27c5fe7a0131462343397899868c
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118120
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Fully Connected test names are not unique
Change-Id: Ie4654cc1cb4720c51a3114162043562d5cbc6d28
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118126
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I880ac3a1c3f5ea09ccefe27d9ee40bd60afcea2b
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118056
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I0a7ea4cde1dbf8edd28908dfff80928ef7e996c4
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117647
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: Iff50adf2993bd69c2696a47559d6b2e0011fed87
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/110177
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I80437f7ba6e4b8ec1fb145300a017b3688f3f2b6
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118086
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Some minor improvements in the test fixture, for example making sure
the values in the mapx and mapy tensors are in the range of [-5, in_width+5]
and [-5,in_height].
Tolerance was changed to 0, no mismatches expected.
Change-Id: I2fad06defb293bf9fdd1988799b19547c102dee5
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118044
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: Ic460695b8a203c1080ea177b5463b48b07b70c4b
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118075
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Joel Liang <joel.liang@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I4f2cca52caf210fdb7d6bb7e9436ac51cb5088b4
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/112398
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: Ia2874d30780cb597a6e5039120815f2368911e0c
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/118024
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
1) Updated to the latest code from the RSH repo.
2) Moved winograd transforms into kernels.
3) Added support for biases
Change-Id: I7f39f34a599b49d7d9b549cc10a4f4d4a8007ab8
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117474
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I33cf54e68f6c097ac58b6f16c3f9a720978f09cd
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117289
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Currently we output an array of timestamps: queued, submitted, start, end
This patch instead only output end-start (i.e the time it took to execute the kernel on the GPU)
Change-Id: Ic3c2b68128f6acd6bb018b7b3ead0b69dd5aca59
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117865
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Kevin Petit <kevin.petit@arm.com>
|
|
Change-Id: Iec82a91ad351cfe8d07d0976a24bd42f4703177a
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116833
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Change-Id: If8c1e0103ae2e3dfde3d0b9f23575c0e904c7f30
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117961
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Refactored the console printer too (So that we can re-use the code if needed)
Change-Id: I16a0f70104f82f07cd59900b383038fa5a76e1bc
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117858
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
|
|
Changed CLReductionOperationKernel: Now each kernel computes
a 2D slice instead of 1D. This reduces the memory footprint
from around 1.6Gb for a 4k input image to a few Mb, which was
caused by the __local memory and was probably the cause for this bug.
Change-Id: I71ac71ff09b041c945a134177600f0f3475e48cf
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117835
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
it supports asymmetric padding
Add asymmetric padding support for NEPoolingLayer
Change-Id: Ia5cc660aeca636c3c45df4916a28974cc2b7f2f4
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117275
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
This patch introduces a new GEMM capable to improve the mac utilisation
of 10% compared to the GEMM without reshape. However this implementation
is not faster in all cases as we need to take into account the time for
reshaping the matrices. For this reason an heuristic solution to select
the optimal GEMM to use has been added to the function. More information
about the heuristic implementation can be found at COMPMID-852.
With this new patch, GoogleNet, MobileNet, VGG16 and SqueezeNet can
improved the performance of 1.5x.
More information about the performance uplift can be found here:
https://confluence.arm.com/display/MLENG/GEMM+FP32+performance%3A+ACL+18.02
Change-Id: I024563c06b9aed02a211a974e452bae5c233b04c
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117140
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
when not running in a terminal
Change-Id: I4ec90803c5dc41b0cee05c36113ae3f189564d58
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117831
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
CustomConvolution (output S16)
Change-Id: Ic099336f558e994210a59e14ec0171fae68ccb80
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116663
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I25424481ddbbeb43f940cf51cef791e4fd83ea92
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117676
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Cortex-A55r1.
Change-Id: I640ae54dcc4591915c7a539b27728f05b70cf0eb
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117616
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I9dbb090cac731d68bd98a7d1c8ab0e1cb0a5c911
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/116746
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Tested-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Explicitly add -march=armv8.2-a+fp16 for target arm64-v8.2-a, otherwise __ARM_FEATURE_FP16_VECTOR_ARITHMETIC is undefined
and all the FP16 neon code is not compiled.
Change-Id: I698819d842de996c1b4c88ebd0cf8664c5f70d58
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117601
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
The performance improvements have been reported at the following
confluence page:
https://confluence.arm.com/display/MLENG/GEMMLowp+performance%3A+ACL+18.02
Config3 of McVail looks improved by 29x
Change-Id: I8b203c0b75fc368f85cea863b7eed398fab3e79a
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/115783
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|
|
Change-Id: I7197d2ad7ac08112eba1570a257ad011b1ce0b75
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/117404
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
|