Age | Commit message (Collapse) | Author |
|
The issue was related to CLIm2Col when the number of input channels was less than
the number of elements processed by each thread.
The bug has been fixed in the validate_and_configure_window() function setting the correct number of elements accessed
in the output tensor.
Also fixed an issue GEMM3D when we have a single output channel
Change-Id: I094292d0c7662599c4a4c3916ec5f5821df5faef
|
|
Change-Id: I86679adff556b6ffc9929b35cbf1b59b3958bdb1
|
|
Change-Id: I6d5f91579850906e1eb973ff6c5612195255e631
|
|
Change-Id: I807ef84dbf893bd401dcac5c0fa3a4ee49aabc66
|
|
Change-Id: I68c648a5246fcdc67a496602089f93d65eb1d601
|
|
Change-Id: I031488247673de305f63b2a2e636f4cb17bd57f2
|
|
Change-Id: If8fbd04d0817b9e654ffa9715879a2521de66963
|
|
ArmNN reported an issue with padding in CLLSTMLayer. This was due to the fact
that some tensors were allocated before they were passed to some configure
functions which attempted to change the padding requirement on already allocated
memory.
Also, increase tolerance on number of mismatches for CLBBoxTransform FP16.
Change-Id: Iad75b012be895693d0e553f3ab85f1ca7144e882
|
|
Change-Id: I5aae537372bf797fbb2a2bae81038f8963b041a9
|
|
CLDepthWiseConvolutionLayer3x3Kernel
Change-Id: Ie274da79b15c03f86dfedc85bb721b3de34a0bb4
|
|
The tab characters were corrupting the output JSON file of arm_compute_validation
Change-Id: I8792fd0e02393aef60341552b428111e969a3927
|
|
Change-Id: I87f193fce28d2de12514da675931813162fa292d
|
|
Reduce the amount of precommit tests run in DirectConvolution,
Deconvolution and Pooling. Proper investigation scheduled for later.
Change-Id: Idc2510cf6877e7a605cead84f384852b609e3216
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/156466
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Vidhya Sudhan Loganathan <vidhyasudhan.loganathan@arm.com>
|
|
Fixed a typo that caused compilation issues for ArmNN.
Change-Id: Iab22adaf163eb3d2978d264f0ecf1238de98a67e
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/156483
Reviewed-by: Francis Murtagh <francis.murtagh@arm.com>
Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
Tested-by: bsgcomp <bsgcomp@arm.com>
|
|
Commit 16121924 `COMPMID-1673: Collapse window in CLArithmeticAddition when one
operand is a vector` changed the number of elements processed per iteration to
8, but didn't update the quantized kernel to reflect that.
Change-Id: I49a2fbcee81f5bbc1b210b4a5c6d63b94eafdcec
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/156355
Reviewed-by: Isabella Gottardi <isabella.gottardi@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: bsgcomp <bsgcomp@arm.com>
|
|
Note: Only ComputeLibrary files get copied over (Stub CL / GLES drivers don't, nor are the 3rdparty includes)
utils/ files are not copied either (They're not part of the core library)
Change-Id: I55e01c0ba4a5f7e649877fcdd11fdb0a51071b18
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/156339
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: bsgcomp <bsgcomp@arm.com>
|
|
Also added the test case reported by ArmNN.
Change-Id: I9fe9a1b4f74267a3346529f3a597b37486593c4a
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155914
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Change-Id: Iac6a95ba7f388e65b7f1c8865c3e9bf289b233ea
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155490
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: bsgcomp <bsgcomp@arm.com>
|
|
batches available.
Change-Id: Iad83df2a9116a7f350de83ec59b28cd8893c8d3a
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155716
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Change-Id: I3c3e96a743614af4c2c2391780d5de2db6191b0f
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155318
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
|
|
Change-Id: I76e57af6608b55b6f59a5d06aecc30063ee4c3cc
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155733
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
Change-Id: I0cf221c706c3d957423941d3aa9a9262dcb00c00
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155593
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: Anthony Barbier <anthony.barbier@arm.com>
|
|
OpenCL
COMPMID-1424 - Add dot product support for CLDepthwise QASYMM8 3x3 NHWC non-unit stride
With this patch we are able to improve the performance of MobileNet v1-qasymm8 by 37 %
Tried to use the dot product instruction in CLDepthwise QASYMM8 3x3 NHWC non-unit stride
but I have not seen any benefit (maybe because we have few arithemtic operation and we
do not have more load instructions). However Depthwise convolution has been improved by
30%
Change-Id: Id768a99c2e53a04276707e427af5d0ec93419ada
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155082
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: I051748502ca24b9952e7313524bbfd708162efb4
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155166
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Clear CLContext in a more regular basis to make the driver release
memory back to the system.
Change-Id: I0df847766f57719433bbaeada45fe630e38c9541
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155435
Reviewed-by: Giuseppe Rossini <giuseppe.rossini@arm.com>
Tested-by: bsgcomp <bsgcomp@arm.com>
|
|
Increases relative tolerance slightly as error was quite small.
Change-Id: I4789c5e3eeb4f2d3aaf2b4c76966474f045af4c1
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155418
Reviewed-by: Michalis Spyrou <michalis.spyrou@arm.com>
Tested-by: bsgcomp <bsgcomp@arm.com>
|
|
COMPMID-1690: Add tests for NEPermute with PermutationVector dimension > 3
Change-Id: I4bfc6ff88cd46863c2e39975b5663c624db1a63d
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155316
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: bsgcomp <bsgcomp@arm.com>
|
|
Change-Id: I875ffe0ccec3aa4f53bfb68d82e2a7292ab83358
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155348
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: bsgcomp <bsgcomp@arm.com>
|
|
Change-Id: Ie1bcdd9dca2dc3b26003790a19cc80bb953385b2
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155373
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: bsgcomp <bsgcomp@arm.com>
|
|
Set input range to [-1, 1] in order to avoid inf values
when calculating sqrt.
Change-Id: I18f1e427baa7830fdc587bedf27a92d78c72f49b
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155397
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: bsgcomp <bsgcomp@arm.com>
|
|
Change-Id: I7a32becd78fc231d11d50c6ff58892f4acb0ccda
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155224
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: I9cb725a8052091469904ecc7cfffa4add9914ffb
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155261
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Manuel Bottini <manuel.bottini@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
|
|
strict-aliasing rules
Change-Id: I9e54d07cf1d77c14f124056d3724b49981bf3f97
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155292
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: bsgcomp <bsgcomp@arm.com>
|
|
Change-Id: Ibb6b0ceed19111c01fcc96eb50e461f9811b61b9
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155260
Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
Tested-by: bsgcomp <bsgcomp@arm.com>
|
|
Change-Id: I186b8875089ff77768db051f58f2968e3e9505e3
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155202
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: bsgcomp <bsgcomp@arm.com>
|
|
Instead of changing the tolerances I increased the sizes of the input. In this way, for a single mismatch, as it was the case, we are below the 1% tolerance set.
Change-Id: I787261a1d1adb559c1687b7bd1e0317a72594130
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155168
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: bsgcomp <bsgcomp@arm.com>
|
|
Increase tolerance for fp32 and added absolute tolerance
Change-Id: Iff828457b514d6301ed2c8e04b66ce86867f72b6
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155086
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Isabella Gottardi <isabella.gottardi@arm.com>
|
|
NEWidthConcatenateLayerKernel works with 4D tensors too, hence the check has
been removed and tests have been added.
Change-Id: I73814cabe5fae975a44cc1a03b092c552497e57d
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155070
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Isabella Gottardi <isabella.gottardi@arm.com>
|
|
Change-Id: I7bd4a8ce81483ba56686b765ca3caabebe42882d
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155000
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
|
|
vector
When one of the operands is a vector, the kernel does a broadcast addition and
the window is not collapsed. This represent an issue because it leads to a lot
of enqueues that increases the time taken by the OpenCL driver. This patch
allows to collapse the window when one of the two operands is a vector.
Furthermore, it adds LWS tuner to the kernel.
It also changes the number of elements processed per iteration to 8 to make
better usage of the cache.
Change-Id: I5f09ab0ddcffb3b7f9326a987c79a997b2d7fa8c
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155003
Reviewed-by: Giuseppe Rossini <giuseppe.rossini@arm.com>
Tested-by: bsgcomp <bsgcomp@arm.com>
|
|
Change-Id: Id106d53b9477298a117a5195f3fc5b0f36003c35
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/131903
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
|
|
Change-Id: Ic8312a5b6790aa7cd4468d42f08d557ad40e9441
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/154570
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Change-Id: Id964d9068e18aaa13ab8adcbf7a9375b034ea6c3
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/154651
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Change-Id: I91865506166951b3bf7f06a0b2d4cde925cfefb6
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/153447
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
|
|
NEGEMMConvolutionLayer.
Change-Id: Iea5f2c5bcac8051c4c7655a6eabb2c43772eb31f
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/154104
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Change-Id: Iae22554d5fe893fd22a000eab5bfd8275ea06eb3
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/154102
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: bsgcomp <bsgcomp@arm.com>
|
|
Change-Id: I146936c9e98b343496a4b61cdbadf0eaa38e885a
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/154008
Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
Reviewed-by: Giuseppe Rossini <giuseppe.rossini@arm.com>
Tested-by: bsgcomp <bsgcomp@arm.com>
|
|
Change-Id: Ibc0b1242804c2fdb183825406e3c78bd0d1d3564
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/154368
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Tested-by: bsgcomp <bsgcomp@arm.com>
|
|
Change-Id: Id974efad304c2513b8824a6561ad45ee60b9e7fb
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/153763
Reviewed-by: Giuseppe Rossini <giuseppe.rossini@arm.com>
Reviewed-by: Isabella Gottardi <isabella.gottardi@arm.com>
Tested-by: bsgcomp <bsgcomp@arm.com>
|
|
Change-Id: Ib4ca28b82bd82f0ed4d2c906185d3f4010246616
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/153986
Reviewed-by: Giuseppe Rossini <giuseppe.rossini@arm.com>
Tested-by: bsgcomp <bsgcomp@arm.com>
|