Age | Commit message | Author |
|
Change-Id: Ia8d4e46ce5d9bb366af15726bc208dc14583c6ae
|
|
Change-Id: Icf813a0a87d4a07e180eafdb5fa916b2ea4028d2
|
|
num_elems_processed was passed as a scale instead of a step
Change-Id: I8c6d58fe4432f9f6beb31c0a1e02204c96775d98
|
|
AccessWindowRectangle::update_window_if_needed()
Change-Id: I56426cc9c9688a0aa0acdd439d5887c7ef208cd2
Note: The code to shrink the window hasn't been fixed yet.
|
|
in the install_dir
Change-Id: I5ba348d36325bcffb33b1e68435d5fe27cec8402
|
|
Change-Id: I69e995973597ba3927d29e4f6ed5438560e53d77
|
|
With the CIFG optimisation the scratch buffer should have a size of
[batch_size, num_units * 3], otherwise [batch_size, num_units * 4].
Change-Id: I43e46f7b52e791472f1196f36e9142240ba76c5c
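The sizing rule above can be sketched as a small helper. This is only an illustration: `lstm_scratch_shape` and its parameters are hypothetical names, not functions from the library.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper (not a library function): returns the 2D scratch
// buffer shape as {batch_size, columns}.
std::array<std::size_t, 2> lstm_scratch_shape(std::size_t batch_size, std::size_t num_units, bool use_cifg)
{
    // CIFG couples the input gate to the forget gate, so three gate-sized
    // slices are needed instead of four.
    const std::size_t num_gates = use_cifg ? 3 : 4;
    return { batch_size, num_units * num_gates };
}
```

For example, with batch_size = 2 and num_units = 8 the second dimension is 24 with CIFG and 32 without.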
|
|
Added test cases to exercise the code path where the reshaping of B is performed on the fly.
Change-Id: Ifa4348e1054dc0019be3927f482adf64b18fd554
|
|
Change-Id: Ib0798cc17496b7817f5b5769b25d98913a33a69d
|
|
Change-Id: Id94fb9c88a498d7b938f4f707e2e7b9b6df94880
|
|
Change-Id: I5bf5d751ec7c02d96c26a769f49d03ea23a248b7
|
|
Change-Id: Ie13a9eb6d417388b5de533bffa895796d9d2cf62
|
|
Change-Id: Ibab049f09413258c99335b7da6b151530a1bd136
|
|
and 8 tensors (Part 1)
Creating special cases for concatenating 2 and 4 tensors.
Change-Id: I6a739a494ae45011acb65369e353f9ef96970b90
|
|
NEGEMMLowpQuantizeDownInt32ToUint8ScaleByFixedPoint
Change-Id: I1d5bc4d24059917f9ddef0873dd3043b1f2320a8
|
|
inside the namespace
Change-Id: I477f52a9adf06ba3730f94d411399977fce0f98a
|
|
-Use raw string literals in regexp in CPUUtils.cpp
-Avoid implicit cast bool->int
Change-Id: I45a403ab8d0be02bb8dec267fe59545ad1074292
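As an illustration of the two clean-ups listed above, here is a minimal sketch. The pattern and both function names are hypothetical and are not taken from CPUUtils.cpp.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: extracts N from a "processor : N" line.
// Returns -1 when the line does not match.
int parse_processor_index(const std::string &line)
{
    // Raw string literal: backslashes reach the regex engine as written,
    // so this is equivalent to "^processor\\s*:\\s*(\\d+)$".
    static const std::regex pattern(R"(^processor\s*:\s*(\d+)$)");
    std::smatch match;
    if(!std::regex_match(line, match, pattern))
    {
        return -1;
    }
    return std::stoi(match[1].str());
}

// Explicit bool -> int conversion instead of relying on the implicit one.
int count_set(bool a, bool b)
{
    return static_cast<int>(a) + static_cast<int>(b);
}
```

The raw literal keeps the pattern readable because each regex escape is written once rather than doubled.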
|
|
Change-Id: I93b14106cda8a1f640cf5acf120d31e2ebdaf495
|
|
the test.
This is needed to calculate the offset between OpenCL timestamps and wall-clock timestamps, as they use different clocks.
Change-Id: I874b2a475bf98fd664a1e3e15045c80f0181af47
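A minimal sketch of the offset calculation, assuming one timestamp from each clock is sampled at approximately the same instant and both are expressed in nanoseconds. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: offset between the two clocks, computed from a pair
// of timestamps taken at (approximately) the same instant.
std::int64_t clock_offset_ns(std::int64_t wall_clock_ns, std::int64_t device_ns)
{
    return wall_clock_ns - device_ns;
}

// Maps a later device timestamp onto the wall-clock timeline.
std::int64_t device_to_wall_clock_ns(std::int64_t device_ns, std::int64_t offset_ns)
{
    return device_ns + offset_ns;
}
```

Once the offset is known, any device timestamp can be placed on the wall-clock timeline by adding it.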
|
|
Some systems don't have enough memory to run the VGG networks. For example,
on systems with only 2GB of memory the VGG example fails with a bad_alloc exception.
This patch introduces the concept of a global memory policy in ACL. The policy
is a mechanism that the library's functions can use to try to reduce
memory consumption on systems with limited memory.
In this specific case the VGG examples set the policy to MINIMIZE. The GEMM
function checks whether the policy is MINIMIZE and, if so, does not use the
pretransposed weights path, as this requires considerably more memory.
Change-Id: I53abc3c9c64d045d8306793ffc9d24b28e228b7b
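The mechanism described above can be sketched as follows. The types and function names here are hypothetical stand-ins used for illustration only. They are not the library's actual interface.

```cpp
#include <cassert>

// Hypothetical stand-ins for the global policy described above.
enum class MemoryPolicy
{
    NORMAL,
    MINIMIZE
};

class MemoryPolicyRegistry
{
public:
    static void set(MemoryPolicy policy)
    {
        current() = policy;
    }
    static MemoryPolicy get()
    {
        return current();
    }

private:
    // Function-local static keeps a single process-wide value.
    static MemoryPolicy &current()
    {
        static MemoryPolicy policy = MemoryPolicy::NORMAL;
        return policy;
    }
};

// A GEMM-like function would consult the policy before choosing the
// memory-hungry pretransposed weights path.
bool use_pretransposed_weights()
{
    return MemoryPolicyRegistry::get() != MemoryPolicy::MINIMIZE;
}
```

An example would call `MemoryPolicyRegistry::set(MemoryPolicy::MINIMIZE)` once at start-up, before configuring any function.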
|
|
Adds 0.5f after scaling in AVG pooling so that the result rounds to nearest,
since vcvtq_u32_f32 rounds towards zero.
Change-Id: I22ce78f9e628cf4184a317edabce47211ab09456
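A scalar sketch of the rounding fix, assuming non-negative inputs. The plain `static_cast` stands in for the vector conversion, since both round towards zero. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Float -> unsigned conversion truncates towards zero.
std::uint32_t to_u32_truncate(float v)
{
    return static_cast<std::uint32_t>(v);
}

// Adding 0.5f before truncating gives round-to-nearest for v >= 0.
std::uint32_t to_u32_nearest(float v)
{
    return static_cast<std::uint32_t>(v + 0.5f);
}
```

For an average of 2.75, truncation yields 2 while the adjusted conversion yields 3.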
|
|
Removed gemmlowp_mm_bifrost_transposed_dot8 kernel as it is not used
Change-Id: I43cf463a3a4c0cdb2808621c534ffd5c9fd47ca1
|
|
Increases the number of steps used to calculate invsqrt in L2 pooling by 1 to improve accuracy.
Change-Id: Ib938a963809b07c30d47ec0675abae75bc086986
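To illustrate why one more step improves accuracy, here is a portable sketch of Newton-Raphson refinement for 1/sqrt(x). The bit-level initial estimate is an assumption chosen so the example runs anywhere. It is not the estimate the library uses, and `inv_sqrt` is a hypothetical name.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical helper: approximates 1/sqrt(x) for x > 0 using a rough
// initial estimate followed by `steps` Newton-Raphson refinements.
float inv_sqrt(float x, int steps)
{
    assert(x > 0.0f);

    // Rough initial estimate from the float's bit pattern (assumed stand-in).
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof(bits));
    bits = 0x5f3759dfu - (bits >> 1);
    float y;
    std::memcpy(&y, &bits, sizeof(y));

    // Each step roughly squares the relative error of the estimate.
    for(int i = 0; i < steps; ++i)
    {
        y = y * (1.5f - 0.5f * x * y * y);
    }
    return y;
}
```

Because the error shrinks quadratically, a single extra step gives a large accuracy gain for a small cost.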
|
|
Change-Id: I57bbdbef85d1f6e8cf1d13324f9cc38a3e3f0cc3
|
|
Change-Id: If5be77602e37b14aea63d7ec6d8adab324628f04
|
|
Removes:
-sve_interleave_8way_block2_16bit
-sve_interleave_8way_block4_16bit
-sve_sgemm_3VLx8
Change-Id: I0aa35fe974d8e122937dfe8923ecf63ff5a52001
|
|
-Uses output quantization information for the activation layer.
-Updates checks for BoundedRelu at CL side.
Change-Id: I0447860e90f1c89b67b9ace3c8daad713f6c64e0
|
|
Change-Id: I953f3b63aa4910650a1a3f6faea31beb4f6f376a
|
|
duration
Change-Id: Iafc1d6cd8003de64a3439ad807f4002036c73a73
|
|
Change-Id: I67cbbce59d61d907fc4dc4c3997e96b347dfe895
|
|
CLGEMMLowpQuantizeDownInt32ToUint8ScaleByFloat
Since we perform an element-wise operation, it is not necessary to pass the output_depth3d.
Change-Id: Ibfa07a0706e902acf59b444aa61e18a348162ea9
|
|
The issue was related to CLIm2Col when the number of input channels was less than
the number of elements processed by each thread.
The bug has been fixed in the validate_and_configure_window() function by setting the correct number of elements accessed
in the output tensor.
Also fixed an issue in GEMM3D when there is a single output channel.
Change-Id: I094292d0c7662599c4a4c3916ec5f5821df5faef
|
|
Change-Id: I86679adff556b6ffc9929b35cbf1b59b3958bdb1
|
|
Change-Id: I6d5f91579850906e1eb973ff6c5612195255e631
|
|
Change-Id: I807ef84dbf893bd401dcac5c0fa3a4ee49aabc66
|
|
Change-Id: I68c648a5246fcdc67a496602089f93d65eb1d601
|
|
Change-Id: I031488247673de305f63b2a2e636f4cb17bd57f2
|
|
Change-Id: If8fbd04d0817b9e654ffa9715879a2521de66963
|
|
ArmNN reported an issue with padding in CLLSTMLayer. This was due to the fact
that some tensors were allocated before they were passed to some configure
functions which attempted to change the padding requirement on already allocated
memory.
Also, increase tolerance on number of mismatches for CLBBoxTransform FP16.
Change-Id: Iad75b012be895693d0e553f3ab85f1ca7144e882
|
|
Change-Id: I5aae537372bf797fbb2a2bae81038f8963b041a9
|
|
CLDepthWiseConvolutionLayer3x3Kernel
Change-Id: Ie274da79b15c03f86dfedc85bb721b3de34a0bb4
|
|
The tab characters were corrupting the output JSON file of arm_compute_validation
Change-Id: I8792fd0e02393aef60341552b428111e969a3927
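Raw tab characters are not allowed inside JSON strings, which is why they corrupt the output. The message does not say whether the tabs were removed or escaped. The sketch below shows the escaping option, using a hypothetical helper name.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: replaces each raw tab with the two-character JSON
// escape sequence \t so the string can be embedded in a JSON document.
std::string escape_tabs_for_json(const std::string &in)
{
    std::string out;
    out.reserve(in.size());
    for(char c : in)
    {
        if(c == '\t')
        {
            out += "\\t";
        }
        else
        {
            out += c;
        }
    }
    return out;
}
```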
|
|
Change-Id: I87f193fce28d2de12514da675931813162fa292d
|
|
Reduce the amount of precommit tests run in DirectConvolution,
Deconvolution and Pooling. Proper investigation scheduled for later.
Change-Id: Idc2510cf6877e7a605cead84f384852b609e3216
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/156466
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Vidhya Sudhan Loganathan <vidhyasudhan.loganathan@arm.com>
|
|
Fixed a typo that caused compilation issues for ArmNN.
Change-Id: Iab22adaf163eb3d2978d264f0ecf1238de98a67e
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/156483
Reviewed-by: Francis Murtagh <francis.murtagh@arm.com>
Reviewed-by: Michele DiGiorgio <michele.digiorgio@arm.com>
Tested-by: bsgcomp <bsgcomp@arm.com>
|
|
Commit 16121924 `COMPMID-1673: Collapse window in CLArithmeticAddition when one
operand is a vector` changed the number of elements processed per iteration to
8, but didn't update the quantized kernel to reflect that.
Change-Id: I49a2fbcee81f5bbc1b210b4a5c6d63b94eafdcec
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/156355
Reviewed-by: Isabella Gottardi <isabella.gottardi@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: bsgcomp <bsgcomp@arm.com>
|
|
Note: Only ComputeLibrary files get copied over (the stub CL / GLES drivers are not, nor are the 3rdparty includes).
utils/ files are not copied either (they're not part of the core library).
Change-Id: I55e01c0ba4a5f7e649877fcdd11fdb0a51071b18
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/156339
Reviewed-by: Pablo Tello <pablo.tello@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Tested-by: bsgcomp <bsgcomp@arm.com>
|
|
Also added the test case reported by ArmNN.
Change-Id: I9fe9a1b4f74267a3346529f3a597b37486593c4a
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155914
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|
|
Change-Id: Iac6a95ba7f388e65b7f1c8865c3e9bf289b233ea
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155490
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: bsgcomp <bsgcomp@arm.com>
|
|
batches available.
Change-Id: Iad83df2a9116a7f350de83ec59b28cd8893c8d3a
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/155716
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
|