Age | Commit message | Author |
|
* Added ability to reduce dimension sizes when calling BuildArmComputeTensorInfo or
BuildArmComputeTensorShapes, this will attempt to remove leading 1s in order to
squeeze the number of dimensions but retain the size.
* Changed ClBatchMatMulWorkload to attempt to squeeze the number of dimensions to 3
as the CL Gemm Kernel can only support up to 3 dimensions.
Signed-off-by: Mike Kelly <mike.kelly@arm.com>
Change-Id: I6b3d0886c5b97fdb686838fc3dc292833ddc4643
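The squeeze described in this commit can be sketched as below. This is a minimal illustration, not the Arm NN implementation: `SqueezeLeadingOnes` and its `targetRank` parameter are hypothetical names, and the real change lives inside BuildArmComputeTensorInfo / BuildArmComputeTensorShapes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (assumption, not Arm NN code): drop leading dimensions
// of size 1 until at most targetRank dimensions remain. Only 1s are removed,
// so the total number of elements is unchanged.
std::vector<unsigned int> SqueezeLeadingOnes(const std::vector<unsigned int>& shape,
                                             std::size_t targetRank)
{
    std::size_t first = 0;
    while (shape.size() - first > targetRank && shape[first] == 1)
    {
        ++first;
    }
    return std::vector<unsigned int>(shape.begin() + static_cast<std::ptrdiff_t>(first),
                                     shape.end());
}
```

With a target rank of 3, a shape of {1, 1, 4, 5} would become {1, 4, 5}, while {2, 1, 4, 5} is returned unchanged because its leading dimension is not 1; a caller would then have to treat the tensor as unsupported for a 3D-only kernel.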
|
|
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com>
Change-Id: I60e9284b90467f58e0acd74d3f1493546b6f1b9b
|
|
* Added ElementwiseUnary support with a mapping for Rsqrt
* Added unittests
* Added Rsqrt EndtoEnd tests for all backends
* Changed TosaRefLayerSupport to default to false on unsupported layers
Signed-off-by: David Monahan <david.monahan@arm.com>
Change-Id: I3eaa9c684647ead61520a563815581aa68bee51b
|
|
* Call Reshape EndToEnd test from 3 backends
* Tidy up some naming of tests.
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com>
Change-Id: I5546af35e89d352d3f1529368518aecc0a4a534b
|
|
* Required to enable easier future merging and rebase into experimental/GpuFsa
as part of IVGCVSW-7380.
Signed-off-by: Cathal Corbett <cathal.corbett@arm.com>
Change-Id: I066dcf00523ff430a0908666e452548ab848bd86
|
|
* GpuAcc only supports up to 3D, so no 4D tests have been added
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com>
Change-Id: Ie926cd45c350be624cbdc6cb27c89d2d3f60884b
|
|
This reverts commit 21cf67af47a9cebbc10a98184c204fffa3722abd.
Reason for revert: IVGCVSW-7397 Segmentation fault/Bus error in Backends CI job nightly
Change-Id: I563e79700a857f8cf0fce0923a7040aeda29629b
|
|
Signed-off-by: Matthew Bentham <matthew.bentham@arm.com>
Change-Id: Ia7d714eb227a96ad9eeb1441afbc83e6ad2bb197
|
|
* Removed weights and bias from Convolution, DepthwiseConv & FullyConnected
layers
* Removed the weight and bias ConstTensorHandles from the QueueDescriptors
* Updated Workloads to take tensors from WorkloadInfo rather than the
QueueDescriptors
* Removed unused RedirectMembersToConstantInputs optimization and tests.
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com>
Signed-off-by: Mike Kelly <mike.kelly@arm.com>
Change-Id: I9ffcdc4a1c0dff725539dd69fc435b700bd98a56
|
|
* Some permutation vectors were not converted correctly.
* Add Transpose end to end test.
* Comments added with an example to clarify the differences between
Transpose and Permute
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com>
Change-Id: I6c0954ca6ce00ef5f2a6f3625abe6f4fd27b5cdf
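One way to picture the difference between the two operations is as two conventions for reading the same mapping vector. The sketch below assumes that Permute treats `mapping[i]` as the destination of source dimension `i`, and that Transpose treats `mapping[i]` as the source of destination dimension `i`. The function names are illustrative only and are not the Arm NN utilities.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Shape = std::vector<unsigned int>;
using Mapping = std::vector<unsigned int>;

// Assumed "Permute" convention: mapping[i] is where source dimension i ends up.
Shape PermuteShape(const Shape& src, const Mapping& mapping)
{
    Shape dst(src.size());
    for (std::size_t i = 0; i < src.size(); ++i)
    {
        dst[mapping[i]] = src[i];
    }
    return dst;
}

// Assumed "Transpose" convention: mapping[i] is which source dimension feeds
// destination dimension i.
Shape TransposeShape(const Shape& src, const Mapping& mapping)
{
    Shape dst(src.size());
    for (std::size_t i = 0; i < src.size(); ++i)
    {
        dst[i] = src[mapping[i]];
    }
    return dst;
}

// Under these assumptions, converting a vector from one convention to the
// other means inverting the permutation.
Mapping InvertMapping(const Mapping& mapping)
{
    Mapping inverse(mapping.size());
    for (std::size_t i = 0; i < mapping.size(); ++i)
    {
        inverse[mapping[i]] = static_cast<unsigned int>(i);
    }
    return inverse;
}
```

A self-inverse mapping such as {0, 2, 1, 3} gives the same result under both conventions, so only mappings like {0, 2, 3, 1} expose a missing conversion.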
|
|
one works fine
* Each CLBackend created its own ClContextControlWrapper, which invalidated
the OpenCL contexts of all CLBackends that were created before that one.
* Now CLBackends will keep a shared_ptr to a ClContextControlWrapper which
more closely matches the functionality within ACL.
Signed-off-by: Mike Kelly <mike.kelly@arm.com>
Change-Id: I0744c2cb6a2f0d6b0c5fa54d786f88cf97775559
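The ownership change can be illustrated with a small stand-alone sketch. `ContextWrapper` and `Backend` are stand-ins for ClContextControlWrapper and the CL backend. The `weak_ptr` cache used to hand out the shared instance is an assumption about how such sharing could be done; the commit message does not describe it.

```cpp
#include <cassert>
#include <memory>
#include <mutex>

// Stand-in for ClContextControlWrapper (assumption: the real class manages
// an OpenCL context; here it only counts live instances).
struct ContextWrapper
{
    ContextWrapper()  { ++liveCount; }
    ~ContextWrapper() { --liveCount; }
    static int liveCount;
};
int ContextWrapper::liveCount = 0;

// Hypothetical backend: every instance holds a shared_ptr to the same
// wrapper, so creating a second backend does not replace the first one's
// context.
class Backend
{
public:
    Backend() : m_Context(Acquire()) {}
    const ContextWrapper* Context() const { return m_Context.get(); }

private:
    static std::shared_ptr<ContextWrapper> Acquire()
    {
        static std::mutex mutex;
        static std::weak_ptr<ContextWrapper> cached;
        std::lock_guard<std::mutex> lock(mutex);
        std::shared_ptr<ContextWrapper> context = cached.lock();
        if (!context)
        {
            // No backend currently alive: create a fresh wrapper.
            context = std::make_shared<ContextWrapper>();
            cached = context;
        }
        return context;
    }

    std::shared_ptr<ContextWrapper> m_Context;
};
```

In this sketch the wrapper is destroyed only when the last backend holding it goes away, and a backend created later gets a new one.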
|
|
- Remove Bf16ToFp32 Conversion Layer
- Remove Fp32ToBf16 Conversion Layer
- Remove B16 Conversion tests
* Throw exception if m_ReduceFp32ToBf16 optimizer option is set to true
* Provide comments to enable fast math in order to use bf16
* Update docs to inform users to enable fast math for bf16
Execute Network Changes
* Require bf16_turbo_mode to also have fast_math_enabled set to true
- Remove setting m_ReduceFp32ToBf16 optimizer option
Signed-off-by: Ryan OShea <ryan.oshea3@arm.com>
Change-Id: Ibaa6da9d29c96a1ce32ff5196b0847fde9f04a1c
|
|
Signed-off-by: Colm Donelan <colm.donelan@arm.com>
Change-Id: I17823fb8b6bbabc4da327187167ce9582ee29b32
|
|
* Create Simple Addition EndtoEnd test
* Create EndToEndTest file in TosaRef/test directory
* Add AdditionEndToEnd test to CpuRef,CpuAcc,GpuAcc,TosaRef
Signed-off-by: Ryan OShea <ryan.oshea3@arm.com>
Change-Id: Ic44e2b457c25dcb41bb3b17c05cce0e74bf17a80
|
|
* Some CL kernels are not run after the first inference and this breaks
the profiler which is expecting a measurement for every kernel each run
* Add a function HasKernelMeasurements() to ascertain if the Event is
returning kernel measurements and if so insert 0.0 values for any missing
kernel measurements.
* Fix ExecuteNetwork to only print a json object after all inferences
have completed
Signed-off-by: Kevin May <kevin.may@arm.com>
Change-Id: I99f2bb0db847f5a52ab4c5705b072155c6b6f333
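The zero-filling idea can be sketched as follows. This is not the profiler code: `FillMissingKernelTimes`, the `Measurements` alias and the kernel names in the usage note are hypothetical, and the sketch assumes the set of expected kernels is known from the first inference.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

using Measurements = std::map<std::string, double>;

// Hypothetical sketch: return one value per expected kernel, using 0.0 for
// any kernel that produced no measurement on this inference.
std::vector<double> FillMissingKernelTimes(const std::vector<std::string>& expectedKernels,
                                           const Measurements& recorded)
{
    std::vector<double> times;
    times.reserve(expectedKernels.size());
    for (const std::string& name : expectedKernels)
    {
        const auto it = recorded.find(name);
        times.push_back(it != recorded.end() ? it->second : 0.0);
    }
    return times;
}
```

If the first run saw kernels "gemm", "reshape" and "softmax" and a later run only recorded "gemm" and "softmax", the result still has three entries, with 0.0 in the "reshape" slot, so every inference reports the same number of columns.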
|
|
Signed-off-by: Jim Flynn <jim.flynn@arm.com>
Change-Id: I3a3aab7b5042349cb2df8517678306665e037610
|
|
* Added case for Bf16 to switch and changed Assertion to Exception
so it shows up in Release build.
Signed-off-by: Francis Murtagh <francis.murtagh@arm.com>
Change-Id: I817260dc7b7667386c4aa734bea649383866a785
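The reason for preferring an exception is that `assert` compiles to nothing when `NDEBUG` is defined, which is the usual Release configuration. The sketch below shows the pattern on an invented switch. The enum, the function and the returned sizes are illustrative assumptions, since the commit message does not say which switch was changed.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical enum for illustration only.
enum class DataType { Float32, Float16, BFloat16, Unknown };

// Returns the element size in bytes. An unhandled type throws instead of
// asserting, so the failure is also visible in builds with NDEBUG defined.
std::size_t ElementSize(DataType type)
{
    switch (type)
    {
        case DataType::Float32:  return 4;
        case DataType::Float16:  return 2;
        case DataType::BFloat16: return 2;
        default:
            throw std::invalid_argument("ElementSize: unsupported data type");
    }
}
```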
|
|
* Originated from a GitHub issue: https://github.com/ARM-software/armnn/issues/667
* Before optimization, Arm NN supports the pool 2D operation because there is
no padding on the pool2d. The Neon failure occurs when padding is followed by
average pool 2D, because of the folding optimization.
* Here we prevent the folding optimization from happening for the above special case
and add it in as a backend specific optimization.
Signed-off-by: Cathal Corbett <cathal.corbett@arm.com>
Change-Id: Ia0fd90c3a6b4b9d29c81106f154617d2e893e26b
|
|
* Refactor backend capability checks in LoadedNetwork.
* ImportInputs should check the number of tensors does not exceed the
number of inputs.
* In EnqueueWorkload the check for the count of input tensors
was ignoring pre-imported inputs.
* Added checks to verify ImportInputs/ImportOutputs worked as expected
in EndToEndTestImpl.
* Improve documentation on ImportInputs/ImportOutputs in IRuntime.hpp.
* Disabled import tests in CL and Neon EndToEndTests that cannot work.
Signed-off-by: Colm Donelan <colm.donelan@arm.com>
Change-Id: Iae4b2644a1c9f01ee72bce1afb211661cc9ae2e3
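The two count checks mentioned above might look roughly like this. Both functions are hypothetical and are not the LoadedNetwork code. In particular, the rule that passed inputs plus pre-imported inputs must equal the number of network inputs is an assumption made for the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical check: an import call must not be given more tensors than
// the network has inputs.
void ValidateImportCount(std::size_t tensorsToImport, std::size_t networkInputs)
{
    if (tensorsToImport > networkInputs)
    {
        throw std::invalid_argument("ImportInputs: more tensors than network inputs");
    }
}

// Hypothetical check: at enqueue time, pre-imported inputs count towards
// the total (assumed rule: the two together must cover every input).
void ValidateEnqueueCount(std::size_t passedInputs,
                          std::size_t preImportedInputs,
                          std::size_t networkInputs)
{
    if (passedInputs + preImportedInputs != networkInputs)
    {
        throw std::invalid_argument("EnqueueWorkload: input count mismatch");
    }
}
```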
|
|
* Enabled import host memory in SL as default
* Updated import host memory functionality in GpuAcc
Signed-off-by: Sadik Armagan <sadik.armagan@arm.com>
Change-Id: I22132b1e1008159b0e7247219762e3e9ae5eba10
|
|
This reverts commit a0f8b15d4ddb5075f380003ff31b271d389d3b66.
Reason for revert: <Test ClDmaBufInternalTests review >
Change-Id: Ibc4a77fa008643849da7330391942e4c87b941e2
|
|
This reverts commit 03bf98a8bc51ad20eef4b9ca5fbf6ce15e063721.
Reason for revert: Caused failures in tests located in internal repo.
Change-Id: If35cb0ede349b270e4e7827324382e09455d8cfa
|
|
Only one bool is used to indicate whether inputs should be imported.
However, it's possible for the user to want to import inputs but not
export outputs. In addition, it's possible for a user to enable import
during optimize but then pass a memory source that does not require
import.
* Add m_ExportEnabled to INetwork.hpp.
* Modify Network::dNetwork to consider both m_ImportEnabled
and m_ExportEnabled.
* Add ValidateSourcesMatchOptimizedNetwork to LoadedNetwork to validate
import options between optimize and network load.
* Update the TfLite delegate to consider the exportEnabled flag in the
optimizer.
!armnn-internal-tests:425350
Signed-off-by: Colm Donelan <colm.donelan@arm.com>
Change-Id: I776eab81595898e43f91ab40306962eae61329f4
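A check between the optimize-time flags and the load-time memory sources could be sketched as below. The types and `SourcesMatch` are hypothetical simplifications and do not reproduce ValidateSourcesMatchOptimizedNetwork. The sketch assumes that an `Undefined` memory source means "no import/export requested".

```cpp
#include <cassert>

// Hypothetical, simplified types for illustration only.
enum class MemorySource { Undefined, Malloc };

struct OptimizeFlags
{
    bool importEnabled;
    bool exportEnabled;
};

struct LoadSources
{
    MemorySource input;
    MemorySource output;
};

// Import and export are checked independently: each optimize-time flag must
// agree with whether the corresponding load-time source requests it.
bool SourcesMatch(const OptimizeFlags& flags, const LoadSources& sources)
{
    const bool importRequested = sources.input != MemorySource::Undefined;
    const bool exportRequested = sources.output != MemorySource::Undefined;
    return flags.importEnabled == importRequested &&
           flags.exportEnabled == exportRequested;
}
```

Keeping two flags lets a caller import inputs without exporting outputs, which a single bool could not express.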
|
|
* Fix made to experimental/armnn_shim_sl branch also required for armnn master branch.
* TestGenerated/GeneratedTests.Sync/argmax_1 fix.
Signed-off-by: Cathal Corbett <cathal.corbett@arm.com>
Change-Id: Idb0324ff59e1ed13caf5f4bf899d1d3220d823d4
|
|
* BackendHelper.cpp IsXXXLayerSupported doesn't get as far as Neon/Cl
Validate functions where arm_compute::Status is returned.
* Conv2d, Depthwise, DilatedDepthwise and FullyConnected
* Tidy up if() -> if ()
* Clean up logic in FullyConnected so that isLayerSupported gets called
Signed-off-by: Francis Murtagh <francis.murtagh@arm.com>
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com>
Change-Id: I5da1a882f4a2f55e90aa984b2b9548a847cb3a2d
|
|
* Use new INetwork::AddConvolution2dLayer
instead of deprecated version
* Remove duplicated test in SerializerTests
* Fix some cosmetics
Signed-off-by: Keith Davis <keith.davis@arm.com>
Change-Id: I3407815bfdc1cdc01ca0a667b8e4d80d8621783f
|
|
dimensions
* Added allow-expanded-dims to TFLite parser and ArmNN delegate
* If true ArmNN will disregard dimensions with a size of 1 when
validating tensor shapes. Tensor sizes must still match.
* This allows us to support models where tensors have expanded
dimensions (i.e. extra dimensions with a size of 1).
* Fixed bug in Network where it assumed that only the first option
could be ShapeInferenceMethod.
* Fixed bug where m_ShapeInferenceMethod was lost when copying or
moving Graphs.
* Changed Delegate to pass "infer-output-shape", "allow-expanded-dims"
and other BackendOptions through to the Network during construction.
Signed-off-by: Mike Kelly <mike.kelly@arm.com>
Change-Id: Ibe7c5ae6597796fc9164cb07bd372bd7f8f8cacf
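The effect of allow-expanded-dims on shape validation can be sketched as a comparison that skips dimensions of size 1. `ShapesMatchIgnoringOnes` is a hypothetical helper and not the Arm NN validation code. It assumes that two shapes are compatible when their remaining dimensions are identical and in the same order, which also guarantees equal tensor sizes.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical helper: compare two shapes while disregarding every
// dimension whose size is 1.
bool ShapesMatchIgnoringOnes(const std::vector<unsigned int>& a,
                             const std::vector<unsigned int>& b)
{
    auto strip = [](const std::vector<unsigned int>& shape)
    {
        std::vector<unsigned int> out;
        std::copy_if(shape.begin(), shape.end(), std::back_inserter(out),
                     [](unsigned int dim) { return dim != 1; });
        return out;
    };
    return strip(a) == strip(b);
}
```

Under this rule {1, 2, 3} and {2, 1, 3, 1} are accepted as the same tensor, while {1, 2, 3} and {3, 2} are not.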
|
|
!android-nn-driver:7477
Signed-off-by: Cathal Corbett <cathal.corbett@arm.com>
Change-Id: Ibf633ccccc385bd980934ff829407d21981323ef
|
|
* Update Front-end and Tools.
* Updated Serializer, Deserializer and unit tests to reflect this.
* Updated TfLiteDelegate, TfLiteParser and OnnxParser.
* Updated Ref.
* Fixed resulting Neon / CL tests
* Unified optimizers for conv2d ops
* Optimizer Fix - Fp32ToBf16
* Partial implementation for ACL backends to fix VTS failures
!android-nn-driver:7477
Signed-off-by: Keith Davis <keith.davis@arm.com>
Change-Id: I5fb18877f7ee32643e15a9818945356274bb401b
|
|
* Add IsSupported for Pooling3d
* Add CreateWorkload case for Pooling3d
* Create new NeonPooling3dWorkload header and source files
* Add Pooling3d workload to NeonWorkloads.hpp
* Add float32 tests for Pooling3d workload
* Add Uint8 tests for Cl and NE pooling3d
Signed-off-by: Ryan OShea <ryan.oshea3@arm.com>
Change-Id: Ic992e1233d1eb8db52df2c8446183df1c907bc4d
|
|
* IVGCVSW-6940 ConstTensorsAsInput: DepthwiseConvolution2d - Complete Neon and Cl Bug Fix
* Bug fix to enable Cl and Neon Backend Compatibility ConstantTensorsAsInputs
* Updated Cl and Neon FullyConnected workloads to handle constant
weights and bias as inputs rather than reading from member variables.
* Prevent non const weights and biases passing CL and NEON validate
for Depthwise Convolution.
Signed-off-by: Cathal Corbett <cathal.corbett@arm.com>
Change-Id: I0f505ff5998a183152f843d0f6cc74327ba920e7
|
|
* Added backend specific optimization & test for CpuAcc and GpuAcc: PermuteDepthwiseConv2dWeights
Signed-off-by: Cathal Corbett <cathal.corbett@arm.com>
Change-Id: I600476b2e9c557a39818a574c1091c9d650b21b1
|
|
* Add Unit Tests
* Bug Fix: add Sqrt to Neon and Cl workload factories
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com>
Change-Id: I0db1d813a4e7d15431e87e825e6d14e61f5ffb7d
|
|
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com>
Change-Id: I8ba7e56062c285c672dcaa9d13be319eb4f1fca6
|
|
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com>
Change-Id: Ib90bade63cd0437329c690b09cf719a2e2bd06a4
|
|
!android-nn-driver:7418
* Update Front-end and Tools.
* Updated Serializer, Deserializer and unit tests to reflect this.
* Updated TfLiteDelegate, TfLiteParser and OnnxParser.
* Change NNDriver to new API.
* Updated Ref.
* Neon and Cl backend partially completed (Backend.cpp files).
* Added dynamic or constant input EndToEnd tests.
* Added ConstantTensorAsInputMemeberVariableRedirect Optimization.
Signed-off-by: Cathal Corbett <cathal.corbett@arm.com>
Change-Id: Ib18b6c10a093042e165e25237dc04a4c67ba82da
|
|
* Corrected TensorInfo order for IsUnidirectionalSequenceLstmSupported
* outputStateOut TensorInfo is not optional.
* cellStateOut TensorInfo is not optional.
* TensorInfo Order matches other QLSTM/LSTM layers.
* Added missing parameters to UnidirectionalSequenceLstmOperator for
delegate.
* Added quantized UnidirectionalSequenceLstm support to Neon
!android-nn-driver:7457
Signed-off-by: Mike Kelly <mike.kelly@arm.com>
Change-Id: I26dde1bb96793dd25eb9081ca5ae5f63752288c4
|
|
* Add IsSupported for Pooling3d
* Add CreateWorkload case for Pooling3d
* Create new ClPooling3dWorkload header and source files
* Add Pooling3d workload to ClWorkloads.hpp
* Add tests for Pooling3d workload
* Add Pooling3d build function to ArmComputeTensorUtils
Change-Id: Ia270b0fe809a171ed73af14376de8708b346d500
Signed-off-by: Ryan OShea <ryan.oshea3@arm.com>
|
|
android-nn-driver do not execute.
* Change to src/backends/cl/workloads/ClLstmFloatWorkload.cpp fixes LstmTests_GpuAcc tests.
* Change to src/backends/cl/workloads/ClConvertFp16ToFp32Workload.hpp & ClConvertFp32ToFp16Workload.hpp
fixes MeanTests_GpuAcc and Convolution2DTests_1.1 tests.
* Added UnitTests to src/backends/cl/test/ClImportTensorHandleTests.cpp to test import on Convert Layers.
!android-nn-driver:7264
Signed-off-by: Cathal Corbett <cathal.corbett@arm.com>
Change-Id: I0c46dc4b9c54eca8771ab12ed0302b6224606957
|
|
!android-nn-driver:7337
Change-Id: Ide401623829cc99fb9b51e9bbce3482ce706a8dd
Signed-off-by: Jim Flynn <jim.flynn@arm.com>
|
|
* Improves performance in ExecuteNetwork when using --cached-network-filepath by using
a combination of mmap and memcpy instead of std::ifstream and reading individual bytes
* Partially solves MLCE-668
Change-Id: Ic772316b399484753f80593c02252bb1a5619157
Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
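The mmap-plus-memcpy approach can be sketched as below. This is a POSIX-only illustration and not the ExecuteNetwork code. `ReadFileMapped` is a hypothetical name, and returning an empty vector on failure is a simplification that does not distinguish an error from an empty file.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical sketch: map the whole file read-only and copy it into a
// buffer with a single memcpy, instead of reading byte by byte from a stream.
std::vector<char> ReadFileMapped(const std::string& path)
{
    std::vector<char> data;
    const int fd = open(path.c_str(), O_RDONLY);
    if (fd < 0)
    {
        return data;
    }

    struct stat info;
    if (fstat(fd, &info) == 0 && info.st_size > 0)
    {
        const std::size_t size = static_cast<std::size_t>(info.st_size);
        void* mapped = mmap(nullptr, size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (mapped != MAP_FAILED)
        {
            data.resize(size);
            std::memcpy(data.data(), mapped, size);
            munmap(mapped, size);
        }
    }
    close(fd);
    return data;
}
```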
|
|
Allow reading of an existing params file even when tuning.
Signed-off-by: Stuart Taylor <stuart.taylor@arm.com>
Change-Id: I6c6d9ec60908d644afbb5ff1c55f4a6cacf650d2
|
|
Signed-off-by: Cathal Corbett <cathal.corbett@arm.com>
Change-Id: Ie99fe9786eb5e30585f437d0c6362c73688148db
|
|
* Added End-To-End tests which check that allocated buffers for Cl can be re-used when going from importing to copy and vice-versa
* Change from the first patch: no longer try to align the buffers which are being copied
Signed-off-by: David Monahan <David.Monahan@arm.com>
Change-Id: I2c2153a475ca16e4eb1aaa5a95af3423877651aa
|
|
data"
This reverts commit ae91a5e058da31e912c0768f516b2ef013c3b39e.
Reason for revert: There is an intermittent failure which may affect the review jobs. If it becomes a problem merge this and ping Dave :)
Change-Id: Ie0d56e4184a525e55cd2ae59042d060bd5609017
|
|
* Added End-To-End tests which check that allocated buffers for Cl can be re-used when going from importing to copy and vice-versa
Signed-off-by: David Monahan <David.Monahan@arm.com>
Change-Id: Id7e4a4bb68ca9ec1b5e978be6286c5f110436df2
|
|
fp32/fp16 to Cl""
This reverts commit 79cef69b1ec58f9ce010461eaaad04c896a4fe15.
Reason for revert: 22.05 release.
Change-Id: Id2ecbf563e8808694fb8605604e8c3c39c29cec2
|
|
fp32/fp16 to Neon""
This reverts commit f87b90e4dbb906436cf205a2a19e199bfe9224ed.
Reason for revert: 22.02 release.
Change-Id: I1ca5a79a8957908f655a6c4e79eefa24c5aec645
|
|
Signed-off-by: David Monahan <David.Monahan@arm.com>
Change-Id: Ia916219a33535f4c288fa44fdc23961a3e54e788
|
|
to Neon"
This reverts commit b0baff73b1574a198e57d46fcd704cedc43cea16.
Reason for revert: cannot update ACL pin until 22.02 release.
Change-Id: I049a125ba3b6a9b1cd6514ef9dd14d807773ed00
|