Age | Commit message (Collapse) | Author |
|
duration
Change-Id: Iafc1d6cd8003de64a3439ad807f4002036c73a73
|
|
The issue was related to CLIm2Col when the number of input channels was less than
the number of elements processed by each thread.
The bug has been fixed in the validate_and_configure_window() function setting the correct number of elements accessed
in the output tensor.
Also fixed an issue GEMM3D when we have a single output channel
Change-Id: I094292d0c7662599c4a4c3916ec5f5821df5faef
|
|
Change-Id: I86679adff556b6ffc9929b35cbf1b59b3958bdb1
|
|
Change-Id: I6d5f91579850906e1eb973ff6c5612195255e631
|
|
Change-Id: I807ef84dbf893bd401dcac5c0fa3a4ee49aabc66
|
|
ArmNN reported an issue with padding in CLLSTMLayer. This was due to the fact
that some tensors were allocated before they were passed to some configure
functions which attempted to change the padding requirement on already allocated
memory.
Also, increase tolerance on number of mismatches for CLBBoxTransform FP16.
Change-Id: Iad75b012be895693d0e553f3ab85f1ca7144e882
|
|
The tab characters were corrupting the output JSON file of arm_compute_validation
Change-Id: I8792fd0e02393aef60341552b428111e969a3927
|
|
Reduce the amount of precommit tests run in DirectConvolution,
Deconvolution and Pooling. Proper investigation scheduled for later.
Change-Id: Idc2510cf6877e7a605cead84f384852b609e3216
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/156466
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Vidhya Sudhan Loganathan <vidhyasudhan.loganathan@arm.com>
|
|
Note: Only ComputeLibrary files get copied over (Stub CL / GLES drivers don't, nor are the 3rdparty includes)
utils/ files are not copied either (They're not part of the core library)
Change-Id: I55e01c0ba4a5f7e649877fcdd11fdb0a51071b18
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/156339
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: bsgcomp <bsgcomp@arm.com>
|
|
Also added the test case reported by ArmNN.
Change-Id: I9fe9a1b4f74267a3346529f3a597b37486593c4a
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155914
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Change-Id: Iac6a95ba7f388e65b7f1c8865c3e9bf289b233ea
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155490
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: bsgcomp <bsgcomp@arm.com>
|
|
Change-Id: I3c3e96a743614af4c2c2391780d5de2db6191b0f
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155318
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
|
|
OpenCL
COMPMID-1424 - Add dot product support for CLDepthwise QASYMM8 3x3 NHWC non-unit stride
With this patch we are able to improve the performance of MobileNet v1-qasymm8 by 37 %
Tried to use the dot product instruction in CLDepthwise QASYMM8 3x3 NHWC non-unit stride
but I have not seen any benefit (maybe because we have few arithemtic operation and we
do not have more load instructions). However Depthwise convolution has been improved by
30%
Change-Id: Id768a99c2e53a04276707e427af5d0ec93419ada
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155082
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Clear CLContext in a more regular basis to make the driver release
memory back to the system.
Change-Id: I0df847766f57719433bbaeada45fe630e38c9541
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155435
Reviewed-by: Giuseppe Rossini <giuseppe.rossini@arm.com>
Tested-by: bsgcomp <bsgcomp@arm.com>
|
|
Increases relative tolerance slightly as error was quite small.
Change-Id: I4789c5e3eeb4f2d3aaf2b4c76966474f045af4c1
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155418
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Tested-by: bsgcomp <bsgcomp@arm.com>
|
|
COMPMID-1690: Add tests for NEPermute with PermutationVector dimension > 3
Change-Id: I4bfc6ff88cd46863c2e39975b5663c624db1a63d
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155316
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: bsgcomp <bsgcomp@arm.com>
|
|
Change-Id: I875ffe0ccec3aa4f53bfb68d82e2a7292ab83358
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155348
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: bsgcomp <bsgcomp@arm.com>
|
|
Set input range to [-1, 1] in order to avoid inf values
when calculating sqrt.
Change-Id: I18f1e427baa7830fdc587bedf27a92d78c72f49b
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155397
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: bsgcomp <bsgcomp@arm.com>
|
|
Change-Id: I7a32becd78fc231d11d50c6ff58892f4acb0ccda
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155224
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Instead of changing the tolerances I increased the sizes of the input. In this way, for a single mismatch, as it was the case, we are below the 1% tolerance set.
Change-Id: I787261a1d1adb559c1687b7bd1e0317a72594130
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155168
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: bsgcomp <bsgcomp@arm.com>
|
|
Increase tolerance for fp32 and added absolute tolerance
Change-Id: Iff828457b514d6301ed2c8e04b66ce86867f72b6
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155086
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Isabella Gottardi <isabella.gottardi@arm.com>
|
|
NEWidthConcatenateLayerKernel works with 4D tensors too, hence the check has
been removed and tests have been added.
Change-Id: I73814cabe5fae975a44cc1a03b092c552497e57d
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155070
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Isabella Gottardi <isabella.gottardi@arm.com>
|
|
Change-Id: I7bd4a8ce81483ba56686b765ca3caabebe42882d
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155000
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
|
|
vector
When one of the operands is a vector, the kernel does a broadcast addition and
the window is not collapsed. This represent an issue because it leads to a lot
of enqueues that increases the time taken by the OpenCL driver. This patch
allows to collapse the window when one of the two operands is a vector.
Furthermore, it adds LWS tuner to the kernel.
It also changes the number of elements processed per iteration to 8 to make
better usage of the cache.
Change-Id: I5f09ab0ddcffb3b7f9326a987c79a997b2d7fa8c
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155003
Reviewed-by: Giuseppe Rossini <giuseppe.rossini@arm.com>
Tested-by: bsgcomp <bsgcomp@arm.com>
|
|
Change-Id: Ic8312a5b6790aa7cd4468d42f08d557ad40e9441
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/154570
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Change-Id: I91865506166951b3bf7f06a0b2d4cde925cfefb6
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/153447
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
|
|
Change-Id: Iae22554d5fe893fd22a000eab5bfd8275ea06eb3
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/154102
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: bsgcomp <bsgcomp@arm.com>
|
|
Change-Id: I146936c9e98b343496a4b61cdbadf0eaa38e885a
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/154008
Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
Reviewed-by: Giuseppe Rossini <giuseppe.rossini@arm.com>
Tested-by: bsgcomp <bsgcomp@arm.com>
|
|
Change-Id: Ibc0b1242804c2fdb183825406e3c78bd0d1d3564
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/154368
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Tested-by: bsgcomp <bsgcomp@arm.com>
|
|
Change-Id: Id974efad304c2513b8824a6561ad45ee60b9e7fb
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/153763
Reviewed-by: Giuseppe Rossini <giuseppe.rossini@arm.com>
Reviewed-by: Isabella Gottardi <isabella.gottardi@arm.com>
Tested-by: bsgcomp <bsgcomp@arm.com>
|
|
Change-Id: If496709958bf29589601eac62a268819736a4fd2
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/154173
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Tested-by: bsgcomp <bsgcomp@arm.com>
|
|
COMPMID-1651: Fix QASYMM8 CLDeconvolutionLayer
This patch also extends the range of values used for testing Convolution and
Deconvolution to cover quantized [-1.0f, 1.0f].
Change-Id: I8b280669db67bb3ec25bf5d411c8f5954f5b0dab
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/149869
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Tested-by: bsgcomp <bsgcomp@arm.com>
|
|
Change-Id: Id331199f569f52a37280a9ada5bf84694580b93c
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/152843
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
|
|
Change-Id: I05a1b871746a32ccc1c3ecec97b8266767c9d0a7
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/153715
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: bsgcomp <bsgcomp@arm.com>
|
|
Change-Id: I2c4dcedcd3b56e41174eebbbacd47be4e968d34d
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/152767
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Vidhya Sudhan Loganathan <vidhyasudhan.loganathan@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
NEON and CL normalization layer was generating invalida results for
radius > 4.
Change-Id: I15d846405e6b3492fe44920bbf8cadceb4e5258f
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/153161
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Matteo Martincigh <matteo.martincigh@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
|
|
Change-Id: Ida71312bcf6dbd854f2ab1efc65f74910c79e152
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/151510
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
|
|
Since this is floating point arithmetic the Winograd results will not be exactly
the same as direct convolution.
Changed to use relative tolerance for the nightly tests.
Change-Id: I45c6d60a097c2d4fb53650a2a33eb29a3e51d7ec
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/152324
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
|
|
call to print_cpu_info moved to main
Change-Id: I6d82649964542df4e944bc79e4c16f0813976295
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/152695
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
|
|
conditionally compile the std::cout that was causing the fault
Change-Id: I7f50151ab88f19ed6eec1be11ca975614653e359
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/151762
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Tested-by: bsgcomp <bsgcomp@arm.com>
|
|
Change-Id: I062e7673f26d5267ed113eae7edd361d05d6de73
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/151968
Reviewed-by: Isabella Gottardi <isabella.gottardi@arm.com>
Tested-by: bsgcomp <bsgcomp@arm.com>
|
|
Kernel size 5x5 layout NHWC.
Change-Id: Ia82ff211d1c954df228962b5c2c5ad8df7112449
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/151740
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Tested-by: bsgcomp <bsgcomp@arm.com>
|
|
Current implementation of winograd fp16 is not accurate enough for large runs.
disabling its use and reopening ticket(COMPMID-1266) to fix it.
The sigbus error that was originally reported against COMPMID-1559 is being tracked as COMPMID-1606
Change-Id: I45129aa366d5710402bc54b623c5fbfb865b3cd5
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/151543
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Isabella Gottardi <isabella.gottardi@arm.com>
|
|
Change-Id: Ie215daacd10477309dbf8af1bb2b05b7a0a8f203
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/150773
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Isabella Gottardi <isabella.gottardi@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
|
|
Change-Id: I6f71f2da851454e8fbbdfc9223592dea9ad03bac
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/151014
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Giuseppe Rossini <giuseppe.rossini@arm.com>
Reviewed-by: Isabella Gottardi <isabella.gottardi@arm.com>
|
|
Change-Id: I62bbf510cc106a90ed2884be3c9c0c127da25898
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/150681
Reviewed-by: Giuseppe Rossini <giuseppe.rossini@arm.com>
Tested-by: bsgcomp <bsgcomp@arm.com>
|
|
Fixing bounds of random values for Normalize Planar YUV tests when using
QASYMM8.
Furthermore, since 70d252d8b4 a QASYMM8 implementation of Batch Normalization
would have been tested with tensors filled with all 1s. This patch removes that
as QASYMM8 Batch Normalization is not supported.
Change-Id: Ieab83ed36b2d7af760ceb19a07d1eedcc991957f
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/150492
Reviewed-by: Isabella Gottardi <isabella.gottardi@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: bsgcomp <bsgcomp@arm.com>
|
|
Change-Id: I47033fa70881fd32b13266adb6ccbf10c202aabc
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/150344
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
|
|
Change-Id: I82d95c4f1c5fed13b213a2591cc2b4e0d0e02a54
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/149676
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Tested-by: bsgcomp <bsgcomp@arm.com>
|
|
Change-Id: Ib14ac821ee5d4aff80bd602cd3e76e7018abb5e6
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/150268
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Isabella Gottardi <isabella.gottardi@arm.com>
Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
|