aboutsummaryrefslogtreecommitdiff
path: root/src/backends/neon
AgeCommit message (Collapse)Author
2023-06-07IVGCVSW-7789 Enable dynamic bias in Depthwise Convolution in CpuAccTeresa Charlin
* Dynamic bias are supported by CpuAcc for this layer * Indentation and const modifiers minor changes Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com> Signed-off-by: Kevin May <kevin.may@arm.com> Change-Id: I3b25f14feea55f746c254a832d97e21a1551ca36
2023-06-07Fix incorrect validation of Unidirectional Sequence LSTM on Cl and NeonNarumol Prangnawarat
Signed-off-by: Narumol Prangnawarat <narumol.prangnawarat@arm.com> Change-Id: I54c60fb98b9c560c300572f46d42b13aec7e402e
2023-05-23MLCE-1022 Fix failure on UnidirectionalSequenceLstm OperatorNarumol Prangnawarat
* Fix failure to parse UnidirectionalSequenceLstm Operator on CpuAcc * Fix failure to parse UnidirectionalSequenceLstm Operator on GpuAcc * Fix IsLayerSupported tests when there are multiple otutputs Signed-off-by: Narumol Prangnawarat <narumol.prangnawarat@arm.com> Change-Id: Ia690f34d3c7fae87bd36c97056a3ff71baa865f6
2023-05-23IVGCVSW-7732 Enable dynamic bias in FullyConnected in CpuAcc and GpuAccTeresa Charlin
* Dynamic bias are supported by ACL for this layer. Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com> Change-Id: I428bd42a97e0c26c72f9925e3cb209c2fc9a650d
2023-05-18IVGCVSW-7400 POW IVGCVSW-7278 SQUARED_DIFFERENCE to CpuAcc and GpuAccJohn Mcloughlin
* Add POW SQUARED_DIFFERENCE and Unit tests for CpuAcc and GpuAcc Signed-off-by: John Mcloughlin <john.mcloughlin@arm.com> Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com> Change-Id: Ifa78af2a2fda2074586d8e4d9a506b1b13fa5755
2023-05-11Revert "IVGCVSW-7454 Enable dynamic bias in CpuAcc and GpuAcc in Conv2d ↵TeresaARM
DWConv and FC" This reverts commit fecd9ed396705a17805ffc49839bd82ae24c892b. Reason for revert: IVGCVSW-7727 Dynamic bias CTS failing Change-Id: I53f67d60fca0e60a81298f90450ceef26b97c321 Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com>
2023-05-09IVGCVSW-7454 Enable dynamic bias in CpuAcc and GpuAcc in Conv2d DWConv and FCTeresa Charlin
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com> Change-Id: Ib6914a9a208475b68e969eba6f70fae4061efa9b
2023-05-08IVGCVSW-7454 Fix CpuAcc FC dynamic weightsTeresa Charlin
* Pass to ACL the flag for constant weights and bias in FC, conv and DWconv workloads Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com> Change-Id: Iae2810c8d1a402d4afc1e757846665315a80d3ea
2023-05-08IVGCVSW-7307 Add CpuAcc Batch MatMul WorkloadTeresa Charlin
* Call dedicated MatMul kernel in ACL * Add int8 tests * Add int8 to documentation * Force tensors to be dynamic (nonConst) as per request of ACL Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com> Change-Id: I992ae9aae1174214607bf29305f21cdeaf3fdc1b
2023-04-28Make Convert workloads use arm_compute::NECast in CpuAcc backendMatthew Bentham
NECast can use conversion instructions where they are available so this should in general be faster. Signed-off-by: Matthew Bentham <Matthew.Bentham@arm.com> Change-Id: I3f259e17b280a4f4c36f363965ffbc8ee8c4c29f
2023-04-27Add unit test for Neon Convert workloadsMatthew Bentham
Signed-off-by: Matthew Bentham <Matthew.Bentham@arm.com> Change-Id: I501a3e01932d44eca796e93a9383378dafc758c5
2023-04-18GitHub #719 Set quantization parameter scale to 1.0, instead of 0.0.Teresa Charlin
* Arm NN does not account for int8 or uint8 not quantized types, Tensorflow does. Not quantized int8 and uint8 is the same as quantized int8 and uint8 with scale = 1.0 and offset= 0 Default offset/zero_point was already 0, this review sets the default scale to 1.0. Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com> Change-Id: Ibc3eecc281de516c2cc706e17bde01c64ff9556e
2023-04-12IVGCVSW-7197 Implement Pimpl Idiom for OptimizerOptionsJohn Mcloughlin
Signed-off-by: John Mcloughlin <john.mcloughlin@arm.com> Change-Id: Id4bdc31e3e6f18ccaef232c29a2d2825c915b21c
2023-04-11IVGCVSW-7507 Pass m_Crops in BatchToSpaceND CpuAcc and GpuAcc workloadsTeresa Charlin
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com> Change-Id: I902c9187eefe7595271312fdc16273f7aa3d41cd
2023-04-04Remove GetGraph and include of Graph.hpp header from public headerMatthew Bentham
Remove deprecated GetGraph() from OptimizationViews. This method has been deprecated for a long time and no backends still need it. Remove include of Graph.hpp from the public headers. Add includes elsewhere to deal with the header fallout. Signed-off-by: Matthew Bentham <matthew.bentham@arm.com> Change-Id: I8dae275a8a446d9d0e19be62684e9b3cd2fa493d
2023-03-31Revert "IVGCVSW-3808 Deprecation notices for old ElementwiseBinary layers"Mike Kelly
This reverts commit 52e90bf59ecbe90d33368d8fc1fd120f07658aaf. Change-Id: I5a0d244593d8e760ee7ba0c9d38c02377e1bdc24 Signed-off-by: Mike Kelly <mike.kelly@arm.com>
2023-03-30IVGCVSW-7454 Enable NonConstWeights in CpuAccTeresa Charlin
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com> Signed-off-by: Mike Kelly <mike.kelly@arm.com> Change-Id: I74a93b1e2e5c9f12ce4523df3f21e5c0967fddfb
2023-03-30IVGCVSW-3808 Deprecation notices for old ElementwiseBinary layersMike Kelly
* Added Deprecation notices for old ElementwiseBinary layers. Signed-off-by: Mike Kelly <mike.kelly@arm.com> Change-Id: Iebbbaff38cc9c347b25eb2f9054c914a4f931c68
2023-03-14IVGCVSW-3808 Add ElementwiseBinaryLayerMike Kelly
!android-nn-driver:9329 * Added ElementwiseBinaryLayer that can represent all ElementwiseBinary operations including Add, Div, Sub, Maximum, Mul and Minimum. * Updated Delegate to use ElementwiseBinaryLayer instead of the Add, Div, Sub, Maximum, Mul and Minimum layers. * Updated Deserializer to use ElementwiseBinaryLayer instead of the Add, Div, Sub, Maximum, Mul and Minimum layers. * Updated OnnxParser to use ElementwiseBinaryLayer instead of the Add layer. * Updated TfLiteParser to use ElementwiseBinaryLayer instead of the Add, Div, Sub, Maximum, Mul and Minimum layers. * Updated CL and Neon tests to use ElementwiseBinaryLayer. * Updated CL and Neon Backend Specific Optimizations to accept ElementBinaryLayers as well as Add, Div, Mul, Sub, Maximum and Minimum layers. Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com> Signed-off-by: Mike Kelly <mike.kelly@arm.com> Change-Id: I7cbb96b60eb01f0e2b57b0541016d48a08b86c75
2023-01-13IVGCVSW-7173 Add Rsqrt to Tosa Ref BackendDavid Monahan
* Added ElementwiseUnary support with a mapping for Rsqrt * Added unittests * Added Rsqrt EndtoEnd tests for all backends * Changed TosaRefLayerSupport to default to false on unsupported layers Signed-off-by: David Monahan <david.monahan@arm.com> Change-Id: I3eaa9c684647ead61520a563815581aa68bee51b
2023-01-12IVGCVSW-5128 Add EndToEnd test for REDUCE_SUMTeresa Charlin
* Call Reshape EndToEnd test from 3 backends * Tidy up some naming of tests. Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com> Change-Id: I5546af35e89d352d3f1529368518aecc0a4a534b
2023-01-04IVGCVSW-7211 Fix float16 operators being wrongly unsupported with ↵Cathal Corbett
android-nn-driver. * Not a concern with the delegate/parser as tflite builtin operators have little float16 support * Will also fix float16 workloads running on the support_library. Signed-off-by: Cathal Corbett <cathal.corbett@arm.com> Change-Id: Iec2033dbc8ece2140b188de1f193c344a68b9c36
2022-12-12IVGCVSW-7209 Remove deprecated code due to be removed in 23.02Mike Kelly
* Removed weights and bias from Convolution, DepthwiseConv & FullyConnected layers * Removed the weight and bias ConstTensorHandles from the QueueDescriptors * Updated Workloads to take tensors from WorkloadInfo rather than the QueueDescriptors * Removed unused RedirectMembersToConstantInputs optimization and tests. Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com> Signed-off-by: Mike Kelly <mike.kelly@arm.com> Change-Id: I9ffcdc4a1c0dff725539dd69fc435b700bd98a56
2022-12-07IVGCVSW-6853 Rewrite BuildArmComputePermutationVector()Teresa Charlin
* Some pemutation vectors were not converted correctly. * Add Transpose end to end test. * Comments added with an example to clarify the differences betweeen Transpose and Permute Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com> Change-Id: I6c0954ca6ce00ef5f2a6f3625abe6f4fd27b5cdf
2022-11-16IVGCVSW-7214 Disable BF16-Turbo-Mode and remove conversion layersRyan OShea
- Remove Bf16ToFp32 Conversion Layer - Remove Fp32ToBf16 Conversion Layer - Remove B16 Conversion tests * Throw exception if m_ReduceFp32ToBf16 optimzer option is set to true * Provide comments to enable fast math in order to use bf16 * Update docs to inform users to enable fast math for bf16 Execute Network Changes * Require bf16_turbo_mode to also have fast_math_enabled set to true - Remove setting m_ReduceFp32ToBf16 optimizer option Signed-off-by: Ryan OShea <ryan.oshea3@arm.com> Change-Id: Ibaa6da9d29c96a1ce32ff5196b0847fde9f04a1c
2022-11-09IVGCVSW-7318 Support basic addition model in the TOSA Reference BackendRyan OShea
* Create Simple Addition EndtoEnd test * Create EndToEndTest file in TosaRef/test directory * Add AdditionEndToEnd test to CpuRef,CpuAcc,GpuAcc,TosaRef Signed-off-by: Ryan OShea <ryan.oshea3@arm.com> Change-Id: Ic44e2b457c25dcb41bb3b17c05cce0e74bf17a80
2022-11-01IVGCVSW-6496 Add EndToEnd Layer test for Batch MatMul WorkloadTeresa Charlin
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com> Change-Id: I6a541db9a602609282cc6f33af930ca141b83c41
2022-10-28IVGCVSW-6494 Add CpuAcc Batch MatMul Workload Fp32Teresa Charlin
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com> Change-Id: I2def6995f81d33e68f1ea45d8d19a1e6294049b1
2022-10-11IVGCVSW-7222 Fix incorrect kernel measurements in profiling outputKevin May
* Some CL kernels are not run after the first inference and this breaks the profiler which is expecting a measurement for every kernel each run * Add a function HasKernelMeasurements() to ascertain if the Event is returning kernel measurements and if so insert 0.0 values for any missing kernel measurements. * Fix ExecuteNetwork to only print a json object after all inferences have completed Signed-off-by: Kevin May <kevin.may@arm.com> Change-Id: I99f2bb0db847f5a52ab4c5705b072155c6b6f333
2022-08-05IVGCVSW-7111 change backend deprecation from 22.11 to 23.08Jim Flynn
Signed-off-by: Jim Flynn <jim.flynn@arm.com> Change-Id: I3a3aab7b5042349cb2df8517678306665e037610
2022-08-05IVGCVSW-6889 Seg fault running ExeNet with --bf16-turbo-mode on fpgaFrancis Murtagh
* Added case for Bf16 to switch and changed Assertion to Exception so it shows up in Release build. Signed-off-by: Francis Murtagh <francis.murtagh@arm.com> Change-Id: I817260dc7b7667386c4aa734bea649383866a785
2022-07-27IVGCVSW-6896 Fix pre-import when using sync execute.Colm Donelan
* Refactor backend capability checks in LoadedNetwork. * ImportInputs should check the number of tensors does not exceed the number of inputs. * In EnqueueWorkload the check for for the count of input tensors was ignoring pre-imported inputs. * Added checks to verify ImportInputs/ImportOutputs worked as expected in EndToEndTestImpl. * Improve documentation on ImportInputs/ImportOutputs in IRuntime.hpp. * Disabled import tests in CL and Neon EndToEndTests that cannot work. Signed-off-by: Colm Donelan <colm.donelan@arm.com> Change-Id: Iae4b2644a1c9f01ee72bce1afb211661cc9ae2e3
2022-06-22Revert "Revert "IVGCVSW-6873 Import inputs but don't export outputs fails.""Francis Murtagh
This reverts commit a0f8b15d4ddb5075f380003ff31b271d389d3b66. Reason for revert: <Test ClDmaBufInternalTests review > Change-Id: Ibc4a77fa008643849da7330391942e4c87b941e2
2022-06-21Revert "IVGCVSW-6873 Import inputs but don't export outputs fails."James Conroy
This reverts commit 03bf98a8bc51ad20eef4b9ca5fbf6ce15e063721. Reason for revert: Caused failures in tests located in internal repo. Change-Id: If35cb0ede349b270e4e7827324382e09455d8cfa
2022-06-20IVGCVSW-6873 Import inputs but don't export outputs fails.Colm Donelan
Only one bool is used to indicate whether inputs should be imported. However, its possible for the user to want to import inputs but not export outputs. In addition it's possible for a user to enabled import during optimize but then pass a memory source that does not require import. * Add m_ExportEnabled to INetwork.hpp. * Modify Network::dNetwork to consider both m_ImportEnabled and m_ExportEnabled. * Add ValidateSourcesMatchOptimizedNetwork to LoadedNetwork to validate import options between optimize and network load. * Update the TfLite delegate consider exportEnabled flag in the optimizer. !armnn-internal-tests:425350 Signed-off-by: Colm Donelan <colm.donelan@arm.com> Change-Id: I776eab81595898e43f91ab40306962eae61329f4
2022-05-23MLCE-825: Give reason when workload unsupported for Non Constant Weights/BiasFrancis Murtagh
* BackendHelper.cpp IsXXXLayerSupported doesn't get as far as Neon/Cl Validate functions where arm_compute::Status is returned. * Conv2d, Depthwise, DilatedDepthwise and FullyConnected * Tidy up if() -> if () * Clean up logic in FullyConnected so that isLayerSupported gets called Signed-off-by: Francis Murtagh <francis.murtagh@arm.com> Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com> Change-Id: I5da1a882f4a2f55e90aa984b2b9548a847cb3a2d
2022-05-18IVGCVSW-6929 Support for models with implicit expandedMike Kelly
dimensions * Added allow-expanded-dims to TFLite parser and ArmNN delegate * If true ArmNN will disregard dimensions with a size of 1 when validating tensor shapes. Tensor sizes must still match. * This allows us to support models where tensors have expanded dimensions (i.e. extra dimensions with a size of 1). * Fixed bug in Network where it assumed that only the first option could be ShapeInferenceMethod. * Fixed bug where m_ShapeInferenceMethod was lost when copying or moving Graphs. * Changed Delegate to pass "infer-output-shape", "allow-expanded-dims" and other BackendOptions through to the Network during construction. Signed-off-by: Mike Kelly <mike.kelly@arm.com> Change-Id: Ibe7c5ae6597796fc9164cb07bd372bd7f8f8cacf
2022-05-17IVGCVSW-6126 ConstTensorsAsInput: Conv2d - BackendsCathal Corbett
!android-nn-driver:7477 Signed-off-by: Cathal Corbett <cathal.corbett@arm.com> Change-Id: Ibf633ccccc385bd980934ff829407d21981323ef
2022-05-16IVGCVSW-6124 ConstTensorsAsInput: Conv2d - FrontEndKeith Davis
* Update Front-end and Tools. * Updated Serializer, Deserializer and unit tests to reflect this. * Updated TfLiteDelegate, TfLiteParser and OnnxParser. * Updated Ref. * Fixed resulting Neon / CL tests * Unified optimizers for conv2d ops * Optimizer Fix - Fp32ToBf16 * Partial implementation for ACL backends to fix VTS failures !android-nn-driver:7477 Signed-off-by: Keith Davis <keith.davis@arm.com> Change-Id: I5fb18877f7ee32643e15a9818945356274bb401b
2022-05-13IVGCVSW-6175 Add Pooling3d to NeonRyan OShea
* Add IsSupported for Pooling3d * Add CreateWorkload case for Pooling3d * Create new NeonPooling3dWorkload header and source files * Add Pooling3d workload to NeonWorkloads.hpp * Add float32 tests for Pooling3d workload * Add Uint8 tests for Cl and NE pooling3d Signed-off-by: Ryan OShea <ryan.oshea3@arm.com> Change-Id: Ic992e1233d1eb8db52df2c8446183df1c907bc4d
2022-05-13IVGCVSW-6260 ConstTensorsAsInput: Fully Connected Cl and Neon support.Cathal Corbett
* IVGCVSW-6940 ConstTensorsAsInput: DepthwiseConvolution2d - Complete Neon and Cl Bug Fix * Bug fix to enable Cl and Neon Backend Compatibility ConstantTensorsAsInputs * Updated Cl and Neon FullyConnected workloads to handle constant weights and bias as inputs rather than reading from member variables. * Prevent non const weights and biases passing CL and NEON validate for Depthwise Convolution. Signed-off-by: Cathal Corbett <cathal.corbett@arm.com> Change-Id: I0f505ff5998a183152f843d0f6cc74327ba920e7
2022-05-12IVGCVSW-6940 ConstTensorsAsInput: DepthwiseConvolution2d - Complete ACLCathal Corbett
* Added backend specific optimization & test for CpuAcc and GpuAcc: PermuteDepthwiseConv2dWeights Signed-off-by: Cathal Corbett <cathal.corbett@arm.com> Change-Id: I600476b2e9c557a39818a574c1091c9d650b21b1
2022-05-10IVGCVSW-6936 Sqrt for CpuRef, CpuAcc and GpuAccTeresa Charlin
* Add Unit Tests * Bug Fix: add Sqrt to Neon and Cl workload factories Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com> Change-Id: I0db1d813a4e7d15431e87e825e6d14e61f5ffb7d
2022-05-09IVGCVSW-6862 Use same datatype for all containers of indices in NeonGatherNdTeresa Charlin
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com> Change-Id: I6b1c7c1c499dc93aa58fa9f58b64fb664e8bcc56
2022-05-06IVGCVSW-6862 Modify GATHERNd Neon workloadTeresa Charlin
* Add validate for all layers for GatherNd * Fix convert policy for Mul Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com> Change-Id: I0f2bae5107607ba3c02b5546f60dd9623cd95853
2022-05-06IVGCVSW-6936 Add SQRT support to NeonTeresa Charlin
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com> Change-Id: I195957541069cb52cdd2c8aead0e4a34498a6f38
2022-05-05IVGCVSW-6127 ConstTensorsAsInput: DepthwiseConvolution2dCathal Corbett
!android-nn-driver:7418 * Update Front-end and Tools. * Updated Serializer, Deserializer and unit tests to reflect this. * Updated TfLiteDelegate, TfLiteParser and OnnxParser. * Change NNDriver to new API. * Updated Ref. * Neon and Cl backend partially completed (Backend.cpp files). * Added dynamic or constant input EndToEnd tests. * Added ConstantTensorAsInputMemeberVariableRedirect Optimization. Signed-off-by: Cathal Corbett <cathal.corbett@arm.com> Change-Id: Ib18b6c10a093042e165e25237dc04a4c67ba82da
2022-05-05IVGCVSW-6862 Add GATHERNd Neon workloadTeresa Charlin
* Changing the test in the delegate to match one of the unit tests Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com> Change-Id: I553ca266116ba8ee173fc951ab1ffd2b6eed1428
2022-05-05IVGCVSW-6806 Add Unidirectional Sequence Lstm support to NeonMike Kelly
* Corrected TensorInfo order for IsUnidirectionalSequenceLstmSupported * outputStateOut TensorInfo is not optional. * cellStateOut TensorInfo is not optional. * TensorInfo Order matches other QLSTM/LSTM layers. * Added missing parameters to UnidirectionalSequenceLstmOperator for delegate. * Added quantized UnidirectionalSequenceLstm support to Neon !android-nn-driver:7457 Signed-off-by: Mike Kelly <mike.kelly@arm.com> Change-Id: I26dde1bb96793dd25eb9081ca5ae5f63752288c4
2022-03-23IVGCVSW-6706 Move headers to profiling/client/includeJim Flynn
!android-nn-driver:7337 Change-Id: Ide401623829cc99fb9b51e9bbce3482ce706a8dd Signed-off-by: Jim Flynn <jim.flynn@arm.com>