path: root/src/armnn
2023-01-12  IVGCVSW-7380 Update the GpuFsa Skeleton to build and load ACL  (Cathal Corbett)
* Reuse cl backend to be able to create ClRuntime, ClContexts etc. for the new GpuFsa backend.
* Can access code defined in the experimental interface dynamic_fusion.
* No BackendModelContext, as model/backend options are not required for now.
* Serializer and deserializer are omitted, as context caching is not required.
* No ImportTensorHandle and ImportTensorHandleFactory for now.
* Moved tuning and IClTensorHandle code to aclCommon as it is accessed by both cl and gpuFsa.
* Small code refactor of cl backend.
* Added DefaultAllocatorTests to GpuFsa backend.
Signed-off-by: Cathal Corbett <cathal.corbett@arm.com>
Change-Id: I6ae591360e9d2a783aafd06e2d7bf8e0b3e623ee
2023-01-12  Merge 'main' onto 'experimental/GpuFsa'  (Cathal Corbett)
* I6c71be11e9b73694747b27fe9febab8d9669b4d4
Signed-off-by: Cathal Corbett <cathal.corbett@arm.com>
Change-Id: Iccaf50e2484559979d801ee9d0e130e848554733
2022-12-12  IVGCVSW-7209 Remove deprecated code due to be removed in 23.02  (Mike Kelly)
* Removed weights and bias from Convolution, DepthwiseConv & FullyConnected layers.
* Removed the weight and bias ConstTensorHandles from the QueueDescriptors.
* Updated Workloads to take tensors from WorkloadInfo rather than the QueueDescriptors.
* Removed unused RedirectMembersToConstantInputs optimization and tests.
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com>
Signed-off-by: Mike Kelly <mike.kelly@arm.com>
Change-Id: I9ffcdc4a1c0dff725539dd69fc435b700bd98a56
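A minimal sketch of the pattern this change moves to, assuming weights and bias become ordinary inputs (types and index positions invented for illustration):

    #include <vector>

    struct TensorInfo { /* shape, data type, quantization info... */ };

    // WorkloadInfo carries the tensor infos for all inputs and outputs.
    struct WorkloadInfo
    {
        std::vector<TensorInfo> m_InputTensorInfos;   // e.g. [0]=input, [1]=weights, [2]=bias
        std::vector<TensorInfo> m_OutputTensorInfos;
    };

    // After the change, a convolution workload reads weights as a normal input
    // instead of a dedicated ConstTensorHandle on the QueueDescriptor.
    const TensorInfo& GetWeightsInfo(const WorkloadInfo& info)
    {
        return info.m_InputTensorInfos[1];
    }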
2022-12-12  Optimize the calling of IsLayerSupported()  (Cathal Corbett)
* Done as part of 22.11/23.02 innovation days.
* IsLayerSupported() is called in model prepare (delegate, android-nn-driver and shim/support_library) and again in Arm NN once model optimization is performed.
* From calling IsLayerSupported() the first time, we should know that the layers are supported and which backend they are supported on.
* Solution is to set the BackendId of the IConnectableLayer when IsLayerSupported() is called the first time.
* In the Optimize() function we then check if the backend is set. If so, we do not call IsLayerSupported() again.
* In the case a supported layer gets optimized, the BackendId of the new optimized layer is set to "Unknown", and IsLayerSupported() will get called on the newly optimized layer.
* Includes bug fix IVGCVSW-7213 for Android Mean FP16 CpuAcc tests. Also related to bug IVGCVSW-7211.
Signed-off-by: Cathal Corbett <cathal.corbett@arm.com>
Change-Id: I7a7820d0cdb079ffb5a3a2e0c44e252f652df53b
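A self-contained sketch of the caching idea described above (names invented; the real IConnectableLayer/Optimize() interfaces differ):

    #include <string>

    struct Layer
    {
        std::string backendId = "Unknown";  // "Unknown" = not yet validated
    };

    // Stands in for the expensive support query; always succeeds here.
    bool IsLayerSupported(const Layer&, const std::string&) { return true; }

    // Model prepare: validate once and cache the decision on the layer.
    void Prepare(Layer& layer, const std::string& backend)
    {
        if (IsLayerSupported(layer, backend)) { layer.backendId = backend; }
    }

    // Optimize: only layers created by an optimization are reset to "Unknown"
    // and therefore re-queried; everything else skips the second call.
    void Optimize(Layer& layer, const std::string& backend)
    {
        if (layer.backendId == "Unknown") { Prepare(layer, backend); }
    }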
2022-12-12  Updates following execution of include-what-you-use on armnn/include  (Colm Donelan)
This tool forces explicit includes of all dependencies and highlights unused dependencies.
Signed-off-by: Colm Donelan <colm.donelan@arm.com>
Change-Id: I92e449245246452a0227cbd13f9c082e2088bf8c
2022-12-07  IVGCVSW-6853 Rewrite BuildArmComputePermutationVector()  (Teresa Charlin)
* Some permutation vectors were not converted correctly.
* Add Transpose end to end test.
* Comments added with an example to clarify the differences between Transpose and Permute.
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com>
Change-Id: I6c0954ca6ce00ef5f2a6f3625abe6f4fd27b5cdf
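A worked example of the difference, assuming the usual convention: Transpose's vector says where each output dimension comes from, while Permute's says where each input dimension goes (the two vectors are inverse permutations):

    #include <array>
    #include <cstdio>

    int main()
    {
        std::array<int, 3> shape = {10, 20, 30};  // input dims d0, d1, d2
        std::array<int, 3> perm  = {2, 0, 1};

        std::array<int, 3> transposed{}, permuted{};
        for (int i = 0; i < 3; ++i)
        {
            transposed[i]     = shape[perm[i]];  // Transpose: out[i] = in[perm[i]]
            permuted[perm[i]] = shape[i];        // Permute:   out[perm[i]] = in[i]
        }
        std::printf("transpose: %d %d %d\n", transposed[0], transposed[1], transposed[2]); // 30 10 20
        std::printf("permute:   %d %d %d\n", permuted[0], permuted[1], permuted[2]);       // 20 30 10
    }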
2022-12-07  Fix some memory overruns / undefined behaviour in ShapeInferenceTests  (Matthew Bentham)
In several cases the address of a single float value on the stack was passed as a pointer to the constructor of ScopedTensor (which needs a backing-store of size equal to GetNumBytes()). Replace by using a std::vector to explicitly create and initialize the right number of elements.
Signed-off-by: Matthew Bentham <matthew.bentham@arm.com>
Change-Id: I8a1f4bf169bd89983f2d68047173ec901a21e1fb
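A before/after illustration of the overrun pattern (illustrative types only, not the actual test code):

    #include <cstddef>
    #include <vector>

    struct ScopedTensor
    {
        // Expects 'data' to point at a backing store of numElements floats.
        ScopedTensor(const float* data, std::size_t numElements)
        { /* copies numElements floats from data */ (void)data; (void)numElements; }
    };

    void Before()
    {
        float value = 1.0f;
        ScopedTensor t(&value, 4);  // BUG: reads 3 floats past the single value
    }

    void After()
    {
        std::vector<float> values(4, 1.0f);           // right size, initialized
        ScopedTensor t(values.data(), values.size()); // backing store matches
    }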
2022-12-06  Print BatchMatMul and Gather Descriptors on dot graph  (Teresa Charlin)
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com>
Change-Id: I01de14cd46fe614dfcb11b2b4f9323f32e01ee9d
2022-11-22  IVGCVSW-7080 Remove deprecated code due to be removed in 23.02  (Keith Davis)
* Extended deprecation time of SubgraphView interface to 23.08.
Signed-off-by: Keith Davis <keith.davis@arm.com>
Change-Id: Ic0a729ea31402f0b39724da47212ae5cc04465c4
2022-11-16  IVGCVSW-7214 Disable BF16-Turbo-Mode and remove conversion layers  (Ryan OShea)
* Remove Bf16ToFp32 Conversion Layer.
* Remove Fp32ToBf16 Conversion Layer.
* Remove BF16 Conversion tests.
* Throw exception if the m_ReduceFp32ToBf16 optimizer option is set to true.
* Provide comments explaining that fast math must be enabled in order to use BF16.
* Update docs to inform users to enable fast math for BF16.
ExecuteNetwork changes:
* Require bf16_turbo_mode to also have fast_math_enabled set to true.
* Remove setting of the m_ReduceFp32ToBf16 optimizer option.
Signed-off-by: Ryan OShea <ryan.oshea3@arm.com>
Change-Id: Ibaa6da9d29c96a1ce32ff5196b0847fde9f04a1c
2022-10-28  IVGCVSW-7296 REDUCE_PROD tests fail when using Tf 2.10  (Teresa Charlin)
* In TF, the data types that Arm NN treats as quantized can be non-quantized as well.
* This patch creates 2 models:
  * ArmNN: model where int8 and uint8 will always be quantized, but scale can be 1 and offset 0.
  * TFLite: model where int8 and uint8 can be quantized or non-quantized.
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com>
Change-Id: Id960f2f30988f2bbec88cb4e0c52c189ac957bae
2022-10-19  MLCE-545 INT8 TFLite model execution abnormal  (Keith Davis)
* Bug fix where files were being overwritten at each debug layer.
Signed-off-by: Keith Davis <keith.davis@arm.com>
Change-Id: I609fdc82afcee925824efb02183c7dbc942fced0
2022-10-19  MLCE-545 INT8 TFLite model execution abnormal  (Keith Davis)
* Add functionality to print output tensors to file in tempdir.
* UnitTests.
Signed-off-by: Keith Davis <keith.davis@arm.com>
Change-Id: Idfb4c186544187db1fecdfca11c662540f645439
2022-10-14  IVGCVSW-7267 Make the AllowExpandedDims option work  (Jim Flynn)
Signed-off-by: Jim Flynn <jim.flynn@arm.com>
Change-Id: I3573078206272c3a72a2b3acf8781ab458ea6c90
2022-10-11  IVGCVSW-7222 Fix incorrect kernel measurements in profiling output  (Kevin May)
* Some CL kernels are not run after the first inference, and this breaks the profiler, which expects a measurement for every kernel on each run.
* Add a function HasKernelMeasurements() to ascertain whether the Event is returning kernel measurements and, if so, insert 0.0 values for any missing kernel measurements.
* Fix ExecuteNetwork to only print a JSON object after all inferences have completed.
Signed-off-by: Kevin May <kevin.may@arm.com>
Change-Id: I99f2bb0db847f5a52ab4c5705b072155c6b6f333
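A sketch of the padding step under stated assumptions (invented container and names; the real Event/profiler types differ):

    #include <map>
    #include <string>
    #include <vector>

    using Measurements = std::map<std::string, double>;  // kernel name -> time (ms)

    bool HasKernelMeasurements(const Measurements& m) { return !m.empty(); }

    // Insert 0.0 for any kernel that did not run this inference, so every run
    // reports a value for every kernel.
    void PadMissingKernels(Measurements& run, const std::vector<std::string>& allKernels)
    {
        for (const std::string& kernel : allKernels)
        {
            run.emplace(kernel, 0.0);  // no-op if the kernel already has a value
        }
    }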
2022-10-04  MLCE-545 INT8 TFLite model execution abnormal  (Keith Davis)
* Fix: Debug mode in ExecuteNetwork did not work with ConstTensorsAsInputs.
* Remove unnecessary assertion with an ambiguous message in LoadedNetwork.
Signed-off-by: Keith Davis <keith.davis@arm.com>
Change-Id: I9cd5d1f811dbbc89072d1190c510bf1b22e3069c
2022-09-28  IVGCVSW-7209 Delay the removal of weights and bias by one release  (Teresa Charlin)
* This affects only the layers (not the workloads): Conv, DWConv and FC.
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com>
Change-Id: I66a91ed1a78bc0464e00423c7fc7c28c91d199ce
2022-09-15  Make SubgraphViewSelector give deterministic results  (Rob Hughes)
The subgraphs produced by SubgraphViewSelector were not produced in a deterministic order, as the order was determined by the pointer values of some objects, which are not guaranteed to be the same for each execution. This patch adds a post-processing sorting step based on the GUIDs of the layers and the slot indices, so that the results will be the same for each execution. This makes debugging the optimised graph much easier, as subsequent stages can also be deterministic. It also simplifies some unit tests.
Change-Id: I64f552706b7fb1bf82c19d85a448e054277917bc
Signed-off-by: Rob Hughes <robert.hughes@arm.com>
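One way to implement such a sort, sketched with invented types: order subgraphs by the smallest layer GUID they contain, so the result no longer depends on pointer values.

    #include <algorithm>
    #include <cstdint>
    #include <limits>
    #include <vector>

    struct Layer    { std::uint64_t guid; };
    struct Subgraph { std::vector<const Layer*> layers; };

    std::uint64_t MinGuid(const Subgraph& sg)
    {
        std::uint64_t m = std::numeric_limits<std::uint64_t>::max();
        for (const Layer* l : sg.layers) { m = std::min(m, l->guid); }
        return m;
    }

    // GUIDs are stable across runs, unlike heap addresses, so this ordering
    // is reproducible from one execution to the next.
    void SortDeterministically(std::vector<Subgraph>& subgraphs)
    {
        std::sort(subgraphs.begin(), subgraphs.end(),
                  [](const Subgraph& a, const Subgraph& b)
                  { return MinGuid(a) < MinGuid(b); });
    }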
2022-09-07  IVGCVSW-7209 Remove deprecated code due to be removed in 22.11  (Teresa Charlin)
* Files deleted when stabilizing the API.
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com>
Change-Id: I0ae73ee36968fa880761c10358bfa827be5fe054
2022-09-06  IVGCVSW-7155 SubgraphView::SubstituteSubgraph IOutputSlots incorrectly overridden  (Cathal Corbett)
Signed-off-by: Cathal Corbett <cathal.corbett@arm.com>
Change-Id: If594e291951a5f9ed1957a19a971c498f6e7843f
2022-09-06  IVGCVSW-7006 Remove deprecated code due to be removed in 22.08  (Teresa Charlin)
* AddConv and AddDWConv with weights and bias.
* ResizeBilinearDescriptor.
* b,blacklist option in accuracy tool.
!android-nn-driver:8172
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com>
Change-Id: Ibbc04fd18be7f938b11590bf67cd7af103cb4d99
2022-09-05  IVGCVSW-6497: BatchMatMul TfLite Parser  (Samuel Yap)
* Added armnnTfLiteParser for BatchMatMul.
* Added unit testing for parser.
* Updated CMakeLists.
Signed-off-by: Samuel Yap <samuel.yap@arm.com>
Change-Id: If6842aaf7cf08f688093b714e2ecea6e8cd87161
2022-08-30  IVGCVSW-7105: BatchMatMul Optional Parameter Support  (Samuel Yap)
* Added transpose parameters to pre-transpose each input tensor's slices.
* Added adjoint parameters to pre-adjoint each input tensor's slices.
* Small refactoring (BatchMatMulDescriptor static helpers and BatchMatMulImpl constructor).
* Updated input validation and output shape inference for parameters.
* Additional layer unit tests for parameters added.
* Versionings incremented.
Signed-off-by: Samuel Yap <samuel.yap@arm.com>
Change-Id: Ibe5242a8a5bf604c13de0dc65844fd6c421cc667
2022-08-29  IVGCVSW-7106 Additional fix for models with multiple input and output tensors  (Colm Donelan)
* The previous fix for IVGCVSW-7106 introduced a problem around operators with multiple inputs and outputs: addSeparator was being applied to all tensors in the list, not just the last one.
Signed-off-by: Colm Donelan <colm.donelan@arm.com>
Change-Id: I0325d9abcb7fb512f834c61686c698bbfc29a3be
2022-08-29  IVGCVSW-6954 Arm NN SL Improvements  (Sadik Armagan)
* Move the Conv2D and DepthwiseConv2D validation to Optimization level when the weights and tensors are given as constant inputs.
* Take the offset and scale values into account when doing INT8 to FP32 dequantization.
Signed-off-by: Sadik Armagan <sadik.armagan@arm.com>
Change-Id: I1f81f15640395ac041923b10dbe9151159715117
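The dequantization formula in question is real = scale * (quantized - offset); a small worked example:

    #include <cstdint>
    #include <cstdio>

    float Dequantize(std::int8_t q, float scale, std::int32_t offset)
    {
        // real = scale * (quantized - offset)
        return scale * static_cast<float>(static_cast<std::int32_t>(q) - offset);
    }

    int main()
    {
        // scale 0.5, offset -3: q = -3 maps to 0.0, q = 5 maps to 4.0
        std::printf("%.1f %.1f\n", Dequantize(-3, 0.5f, -3), Dequantize(5, 0.5f, -3));
    }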
2022-08-29  IVGCVSW-7106 Incorrect Json format for some networks  (Colm Donelan)
* ProfilingDetails assumed that every workload description included both tensors and parameters. This is not always the case.
* Modify ProfilingDetails::AddDetailsToString to check the next element to be printed before deciding to add a separator and new line.
Signed-off-by: Colm Donelan <colm.donelan@arm.com>
Change-Id: I2577b0e8a149d0a172ee12975e18b78238d8256e
2022-08-29  Bug fix for refactor of the ExecuteNetwork for strategy in Precompiled layer  (Teresa Charlin)
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com>
Change-Id: Ib91b734d4add47e23ad00f76e53f1873ff617831
2022-08-05  IVGCVSW-7149 FoldPadIntoQuantizedAvgPoolCpuRefTest test failing while running Arm NN Unittest  (Cathal Corbett)
Signed-off-by: Cathal Corbett <cathal.corbett@arm.com>
Change-Id: I567452000287babad345e61ea85ea84f362f48e0
2022-08-05  IVGCVSW-7147 Bug fix for refactor of the ExecuteNetwork for strategy in ConvertLayers  (Teresa Charlin)
* ConvertBf16ToFp32Layer
* ConvertFp16ToFp32Layer
* ConvertFp32ToBf16Layer
* ConvertFp32ToFp16Layer
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com>
Change-Id: I5e763519a12f017dc14b09ea191fdb3b7398c0d7
2022-08-05  GitHub #667: Neon fold padding into average pool 2D quantization bug fix  (Cathal Corbett)
* Originated from a GitHub issue: https://github.com/ARM-software/armnn/issues/667
* Arm NN initially accepts the pool 2D operation because there is no padding on the pool2d itself. The Neon failure occurs when a padding layer is followed by an average pool 2D and the folding optimization merges them.
* Here we prevent the folding optimization from happening for this special case and instead add it as a backend specific optimization.
Signed-off-by: Cathal Corbett <cathal.corbett@arm.com>
Change-Id: Ia0fd90c3a6b4b9d29c81106f154617d2e893e26b
2022-08-05  Bug fix for refactor of the ExecuteNetwork for Strategy in MemCopyLayer  (Teresa Charlin)
* Correcting some typos.
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com>
Change-Id: Icb21dc4828e51afa38816bd454926fc41e9e82cb
2022-07-28  Revert "Revert "IVGCVSW-6650 Refactor ExecuteNetwork""  (Teresa Charlin)
This reverts commit 1a7f033768acb27da11503bd29abb468d2e77f9e. List of fixes to be able to add this code again:
* "emplacing_back" the vector inputTensors into the vector m_InputTensorsVec outside the for loop.
* GetIOInfo() uses IOptimizedNetwork instead of INetwork, where the inferred shapes are not saved.
* Add missing data type Signed32 to SetupInputsAndOutputs().
* PrintOutputTensors() prints the actual output without dequantizing.
* Add profilingDetailsMethod as input in networkProperties in ArmNNExecutor constructor.
* Fix typos.
Change-Id: I91de166f87228282db3efa27431fe91458834442
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com>
Change-Id: Ic6634d48892d11e5f146cdf285e1e333e93e9937
Signed-off-by: Francis Murtagh <francis.murtagh@arm.com>
2022-07-27  IVGCVSW-7109: Add Batch MatMul front end support - Reference  (Samuel Yap)
* Descriptors added for BatchMatMul.
* Layer definition added.
* Input validation added (will likely change when opt. param support comes in).
* Ref workload implementation for BatchMatMul added (will also change with opt. param support).
* Ref layer tests made for BatchMatMul.
* CMake and other build files updated.
Signed-off-by: Samuel Yap <samuel.yap@arm.com>
Change-Id: Ic885301da543ee0fbe7922b85e7f9658c4efc617
2022-07-27  IVGCVSW-6978: RedirectMembersToConstantInputs does not work with Fp32NetworkToBf16Converter  (Francis Murtagh)
* Fuse FP32ToBF16Layers with Constant Layer so Conv2d/FullyConnected can have their weights redirected.
* If BF16 is unsupported in Conv2d or FullyConnected, revert the fused Constant Layer to FP32.
Change-Id: If523c708a822659d64597d9ae39cca1c2f84b76f
Signed-off-by: Francis Murtagh <francis.murtagh@arm.com>
2022-07-27  IVGCVSW-6896 Fix pre-import when using sync execute  (Colm Donelan)
* Refactor backend capability checks in LoadedNetwork.
* ImportInputs should check that the number of tensors does not exceed the number of inputs.
* In EnqueueWorkload the check for the count of input tensors was ignoring pre-imported inputs.
* Added checks to verify ImportInputs/ImportOutputs worked as expected in EndToEndTestImpl.
* Improve documentation on ImportInputs/ImportOutputs in IRuntime.hpp.
* Disabled import tests in CL and Neon EndToEndTests that cannot work.
Signed-off-by: Colm Donelan <colm.donelan@arm.com>
Change-Id: Iae4b2644a1c9f01ee72bce1afb211661cc9ae2e3
2022-07-27  IVGCVSW-6620 Update the async api to use ExecutionData  (Matthew Sloyan)
* ExecutionData holds a void* which can be assigned to data required for execution in a backend. WorkingMemDescriptors are used in the Ref backend, which hold TensorHandles for inputs and outputs.
* Updated ExecuteAsync functions to take ExecutionData.
* Added CreateExecutionData and UpdateExecutionData to IBackendInternal.
* Streamlined the experimental IWorkingMemHandle API by removing the map-related function and unused m_workingMemDescriptorMap from WorkingMemHandle.
Signed-off-by: Matthew Sloyan <matthew.sloyan@arm.com>
Change-Id: I54b0aab12872011743a141eb42dae200227769af
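A minimal sketch of the shape described above (helper name invented; only ExecutionData's void* member and the Ref backend's use of WorkingMemDescriptor are stated in the commit):

    #include <vector>

    struct ITensorHandle;  // opaque handle to a tensor's memory

    // Ref backend: per-execution tensor handles for inputs and outputs.
    struct WorkingMemDescriptor
    {
        std::vector<ITensorHandle*> m_Inputs;
        std::vector<ITensorHandle*> m_Outputs;
    };

    // Backend-agnostic wrapper passed to ExecuteAsync: each backend parks
    // whatever per-execution state it needs behind the void*.
    struct ExecutionData
    {
        void* m_Data = nullptr;
    };

    // How a Ref workload might recover its descriptor:
    inline WorkingMemDescriptor& AsWorkingMemDescriptor(ExecutionData& ed)
    {
        return *static_cast<WorkingMemDescriptor*>(ed.m_Data);
    }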
2022-07-08  IVGCVSW-6957 Import Host Memory in SL  (Sadik Armagan)
* Enabled import host memory in SL as default.
* Updated import host memory functionality in GpuAcc.
Signed-off-by: Sadik Armagan <sadik.armagan@arm.com>
Change-Id: I22132b1e1008159b0e7247219762e3e9ae5eba10
2022-07-08  IVGCVSW-7034 Modified SubgraphView returned by GetWorkingCopy()  (Francis Murtagh)
* Add virtual GetSlotIndex to IInputSlot.
* Fix logic in GetWorkingCopy to use the index of slots, so as not to add slots to the cloned subgraphView if they are not in the original subgraphView.
* Add test to cover cases when not all inputSlots to a subgraphView layer are part of the original subgraphView.
* Mark SubgraphView::GetWorkingCopy() as const.
Change-Id: I1d540f84c57f97f6c834ec06ca13393ffa55d379
2022-06-29  IVGCVSW-6962 Adding Const layer in the graph immediately after Input instead of immediately before Output  (Teresa Charlin)
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com>
Change-Id: I2d89a1efdabfdb4be24a8998a03fe1f502d26183
2022-06-27  IVGCVSW-6981 Remove deprecated code 22.05 [Post Release]  (Nikhil Raj)
Signed-off-by: Nikhil Raj <nikhil.raj@arm.com>
Change-Id: I9ccaefbe28ea572e9e2b4a2168574804667f7460
2022-06-23  NNXSW-3858: Get non-const IConnectableLayer from I/O slots  (Nabeel Ahmad)
* Added non-const variants of existing const member functions in IInputSlot and IOutputSlot to retrieve a non-const IConnectableLayer.
Signed-off-by: Nabeel Ahmad <nabeel.ahmad@arm.com>
Change-Id: Ic3388b578324edb4d2cca36acce6560ad1ce83c5
2022-06-22  Revert "Revert "IVGCVSW-6873 Import inputs but don't export outputs fails.""  (Francis Murtagh)
This reverts commit a0f8b15d4ddb5075f380003ff31b271d389d3b66.
Reason for revert: Test ClDmaBufInternalTests review.
Change-Id: Ibc4a77fa008643849da7330391942e4c87b941e2
2022-06-21  Revert "IVGCVSW-6873 Import inputs but don't export outputs fails."  (James Conroy)
This reverts commit 03bf98a8bc51ad20eef4b9ca5fbf6ce15e063721.
Reason for revert: Caused failures in tests located in internal repo.
Change-Id: If35cb0ede349b270e4e7827324382e09455d8cfa
2022-06-20  IVGCVSW-6873 Import inputs but don't export outputs fails  (Colm Donelan)
Only one bool is used to indicate whether inputs should be imported. However, it's possible for the user to want to import inputs but not export outputs. In addition, it's possible for a user to enable import during optimize but then pass a memory source that does not require import.
* Add m_ExportEnabled to INetwork.hpp.
* Modify Network::dNetwork to consider both m_ImportEnabled and m_ExportEnabled.
* Add ValidateSourcesMatchOptimizedNetwork to LoadedNetwork to validate import options between optimize and network load.
* Update the TfLite delegate to consider the exportEnabled flag in the optimizer.
!armnn-internal-tests:425350
Signed-off-by: Colm Donelan <colm.donelan@arm.com>
Change-Id: I776eab81595898e43f91ab40306962eae61329f4
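A sketch of the validation idea, with invented names (the commit only names ValidateSourcesMatchOptimizedNetwork): the import/export flags chosen at Optimize() must agree with the memory sources supplied at load time.

    enum class MemorySource { Undefined, Malloc, DmaBuf };

    struct OptimizerFlags
    {
        bool importEnabled = false;  // inputs to be imported, not copied
        bool exportEnabled = false;  // outputs to be exported, not copied
    };

    // Import/export requested at Optimize() time must match a real memory
    // source at network load time, and vice versa.
    bool SourcesMatchOptimizedNetwork(const OptimizerFlags& flags,
                                      MemorySource inputSource,
                                      MemorySource outputSource)
    {
        const bool importRequested = (inputSource  != MemorySource::Undefined);
        const bool exportRequested = (outputSource != MemorySource::Undefined);
        return flags.importEnabled == importRequested
            && flags.exportEnabled == exportRequested;
    }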
2022-05-24  IVGCVSW-6967 Add Optimizer Test for FullyConnected in Fp32ToBf16  [experimental/serializationIssue]  (Keith Davis)
* Test already existed, but bias was not enabled, so it yielded a false positive.
* Updated Conv2d and FC to have const layers as inputs.
Signed-off-by: Keith Davis <keith.davis@arm.com>
Change-Id: Id4193adef2ac67b3a4681345e4dc01414cbbbad7
2022-05-23  MLCE-825: Give reason when workload unsupported for Non Constant Weights/Bias  (Francis Murtagh)
* BackendHelper.cpp IsXXXLayerSupported doesn't get as far as the Neon/Cl Validate functions where arm_compute::Status is returned.
* Conv2d, Depthwise, DilatedDepthwise and FullyConnected.
* Tidy up if() -> if ().
* Clean up logic in FullyConnected so that isLayerSupported gets called.
Signed-off-by: Francis Murtagh <francis.murtagh@arm.com>
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com>
Change-Id: I5da1a882f4a2f55e90aa984b2b9548a847cb3a2d
2022-05-23  IVGCVSW-6123 ConstTensorsAsInputs: Conv2d  (Keith Davis)
* Use new INetwork::AddConvolution2dLayer instead of deprecated version.
* Remove duplicated test in SerializerTests.
* Fix some cosmetics.
Signed-off-by: Keith Davis <keith.davis@arm.com>
Change-Id: I3407815bfdc1cdc01ca0a667b8e4d80d8621783f
2022-05-19  IVGCVSW-6145 ConstTensorsAsInput: Optimizer Fix - GetConstantTensorsByRef  (Francis Murtagh)
* Add functionality to check for ConstantTensorsAsInputs to GetConstantTensorsByRef.
* Reorder optimizations so RedirectMembersToConstantInputs occurs after Conversion of Constants.
* Ensure graph is in topological order after loading in OptimizedNet.
* Fixed test to check release of m_LayerOutputs.
Signed-off-by: Francis Murtagh <francis.murtagh@arm.com>
Change-Id: I7cff50798d7217e8ea0d2f9b153eabd10174a566
2022-05-18  IVGCVSW-6147 ConstTensorsAsInput: Optimizer - FusePermuteIntoConstLayer  (Cathal Corbett)
* No trailing permute layer after a constant layer.
* Unit test for optimization.
Signed-off-by: Cathal Corbett <cathal.corbett@arm.com>
Change-Id: I0d098f5af41d2c55df7cef1ccfb848093320ddc1
2022-05-18  IVGCVSW-6455 Support Const + Dequantize layer and optimize it  (Teresa Charlin)
* Support Float16 as input to Dequantize layer.
* Add Optimization to substitute Const+Dequantize layers with Const layer.
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com>
Change-Id: I58bb7e3871ca480c7b6fca93c4efb2de84e09e64
Signed-off-by: David <david.monahan@arm.com>