path: root/src/backends/cl
Age    Commit message    Author

2023-07-21  IVGCVSW-7830 Clean up  (Mike Kelly)
* Follow-up to the review comments about whitespace and copyright errors raised in https://review.mlplatform.org/c/ml/armnn/+/9885
* Added BinaryElementwiseOperation to .dot files
* Refactored the ConnectedToSplitterWithMoreThan4Dims function into the more generally useful ConnectedToLayerType function (see the sketch below)
Signed-off-by: Mike Kelly <mike.kelly@arm.com>
Change-Id: I0e3d0895888f3a3f0a9758ce30bc031aba50812b

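The ConnectedToLayerType refactor mentioned above can be illustrated with a rough sketch. The Layer and LayerType types below are simplified stand-ins for the real Arm NN graph classes, and the signature is an assumption rather than the one in the code base:

```cpp
#include <vector>

// Simplified stand-in types; the real Arm NN graph classes are richer.
enum class LayerType { Splitter, Reshape, Other };

struct Layer
{
    LayerType type;
    std::vector<const Layer*> consumers; // layers fed by this layer's outputs
};

// Generalised check: is this layer connected to any consumer of the given type?
// Replaces a helper that was hard-wired to "Splitter with more than 4 dims".
bool ConnectedToLayerType(const Layer& layer, LayerType wanted)
{
    for (const Layer* consumer : layer.consumers)
    {
        if (consumer->type == wanted)
        {
            return true;
        }
    }
    return false;
}
```
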
2023-07-21  IVGCVSW-7825 block non const bias on CL CONV2D.  (Teresa Charlin)
* There is currently a problem with using a non-const bias value in CLConvolution2d, so it is blocked for the moment.
Change-Id: Iedccea44931a8826e2c1b295bbc46592d8ac3ef8
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com>

2023-07-14  IVGCVSW-7830 Add backend optimizations to remove Reshapes where possible  (Mike Kelly)
* Added optimization to remove reshapes for Neon and Ref Backends by using overridden TensorInfos
* Added ability to delete Subgraphs during Optimization
* Fixed naming error in NeonEndToEndTests and CLEndToEndTests
* Added LayerNameAndTypeCheck for testing.
* Fixed error where layers were not marked as altered when removed in CLBackend
Signed-off-by: Mike Kelly <mike.kelly@arm.com>
Change-Id: I1ac25cd4ec9821470d961831ae2c8d24882276cc

2023-07-10  IVGCVSW-7785 3D tensors in BATCH_TO_SPACE and SPACE_TO_BATCH in CpuAcc & GpuAcc  (Teresa Charlin)
* Add Reshape layers before and after to extend support for 3D tensors, as ACL only supports 4D tensors for those layers
* Add Unit Tests
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com>
Change-Id: I4431185ce3a3b2f595d2a79bdda7095212d1c52d

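The reshape-wrapping approach can be sketched as a pair of shape helpers. This is a minimal illustration only; the helper names and the assumption that a singleton dimension is inserted after the batch dimension are mine, not the actual Arm NN implementation:

```cpp
#include <cassert>
#include <vector>

// Expand a 3D shape to 4D by inserting a singleton dimension so the
// 4D-only kernel can run; the total element count is unchanged.
std::vector<unsigned int> ExpandTo4d(const std::vector<unsigned int>& shape3d)
{
    assert(shape3d.size() == 3);
    return { shape3d[0], 1u, shape3d[1], shape3d[2] };
}

// Collapse the singleton dimension again after the 4D kernel has run.
std::vector<unsigned int> CollapseTo3d(const std::vector<unsigned int>& shape4d)
{
    assert(shape4d.size() == 4 && shape4d[1] == 1u);
    return { shape4d[0], shape4d[2], shape4d[3] };
}
```
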
2023-06-26  Update ACL pin to c952596e70f2fe0073029f053e329a4e930ced8c  (Nikhil Raj)
* activationInfo is now passed directly to configure() rather than as part of matMulInfo
Signed-off-by: Nikhil Raj <nikhil.raj@arm.com>
Change-Id: I546def1c1e1cabaf50629f7d78ae0ba459766ed4

2023-06-19  Update ACL pin to 043613fbb199e2c4fdd12c2c9a1785db9b0c45fa  (Nikhil Raj)
* Break up Utils.h a bit to reduce unused code being included everywhere
* Add FullyConnectedLayerInfo.h to ArmComputeUtils.hpp and remove Types.h
* Add MatMulInfo.h to the Neon and CL BatchMatMulWorkloads
Signed-off-by: Nikhil Raj <nikhil.raj@arm.com>
Change-Id: I2fbe90cb40dc59add90735dafe9fef9aab3fbf06

2023-06-14  IVGCVSW-7791 Enable dynamic bias in Conv in CpuAcc and GpuAcc  (Kevin May)
Signed-off-by: Kevin May <kevin.may@arm.com>
Change-Id: I722a9e4f3dba2500c624c6326f74085277e0d631

2023-06-13  IVGCVSW-7790 - Enable dynamic bias in DWConv in GpuAcc  (Kevin May)
* Remove checks for bias being constant
* Convert ARMNN_ASSERTs to throws
Signed-off-by: Kevin May <kevin.may@arm.com>
Change-Id: I009f4008393502bd9e30269151ad935ef67f0bc1

2023-06-07  Fix incorrect validation of Unidirectional Sequence LSTM on Cl and Neon  (Narumol Prangnawarat)
Signed-off-by: Narumol Prangnawarat <narumol.prangnawarat@arm.com>
Change-Id: I54c60fb98b9c560c300572f46d42b13aec7e402e

2023-05-23  MLCE-1022 Fix failure on UnidirectionalSequenceLstm Operator  (Narumol Prangnawarat)
* Fix failure to parse the UnidirectionalSequenceLstm operator on CpuAcc
* Fix failure to parse the UnidirectionalSequenceLstm operator on GpuAcc
* Fix IsLayerSupported tests when there are multiple outputs
Signed-off-by: Narumol Prangnawarat <narumol.prangnawarat@arm.com>
Change-Id: Ia690f34d3c7fae87bd36c97056a3ff71baa865f6

2023-05-23  IVGCVSW-7732 Enable dynamic bias in FullyConnected in CpuAcc and GpuAcc  (Teresa Charlin)
* Dynamic bias is supported by ACL for this layer.
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com>
Change-Id: I428bd42a97e0c26c72f9925e3cb209c2fc9a650d

2023-05-18  IVGCVSW-7400 POW IVGCVSW-7278 SQUARED_DIFFERENCE to CpuAcc and GpuAcc  (John Mcloughlin)
* Add POW and SQUARED_DIFFERENCE, with unit tests, for CpuAcc and GpuAcc
Signed-off-by: John Mcloughlin <john.mcloughlin@arm.com>
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com>
Change-Id: Ifa78af2a2fda2074586d8e4d9a506b1b13fa5755

2023-05-11  Revert "IVGCVSW-7454 Enable dynamic bias in CpuAcc and GpuAcc in Conv2d DWConv and FC"  (TeresaARM)
This reverts commit fecd9ed396705a17805ffc49839bd82ae24c892b.
Reason for revert: IVGCVSW-7727 Dynamic bias CTS failing
Change-Id: I53f67d60fca0e60a81298f90450ceef26b97c321
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com>

2023-05-09  IVGCVSW-7454 Enable dynamic bias in CpuAcc and GpuAcc in Conv2d DWConv and FC  (Teresa Charlin)
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com>
Change-Id: Ib6914a9a208475b68e969eba6f70fae4061efa9b

2023-05-09  IVGCVSW-5846 Remove TODO statements from Armnn Code  (David Monahan)
* Removed all instances of TODO statements from comments
* The removed statements are noted as part of IVGCVSW-5846
* Removed ProtoxtFixture.cpp from the Onnx Parser tests as it is not used
Signed-off-by: David Monahan <david.monahan@arm.com>
Change-Id: Ia0a15f8a0d4123c8831638634eaa0d1018c40e2c

2023-05-08  IVGCVSW-7454 Enable NonConstWeights in GpuAcc  (Teresa Charlin)
* Set the flag for constant weights and bias in the ACL TensorInfo in ACL workloads
* Set the flag for constant weights and bias in unit tests
* Add the constantWeights flag to the dot file output for the FullyConnected layer
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com>
Change-Id: I87e1fef516ce4a8a59245dfdf7d92c153418e1d6

2023-05-08  IVGCVSW-7308 Add GpuAcc Batch MatMul workload  (Teresa Charlin)
* Call the dedicated MatMul kernel in ACL
* Add int8 tests
* Add int8 to documentation
* Force tensors to be dynamic (nonConst), as requested by ACL
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com>
Change-Id: I7b7ac20deec8637dc46ca990d339d92c4587cbe4

2023-04-12  IVGCVSW-7197 Implement Pimpl Idiom for OptimizerOptions  (John Mcloughlin)
Signed-off-by: John Mcloughlin <john.mcloughlin@arm.com>
Change-Id: Id4bdc31e3e6f18ccaef232c29a2d2825c915b21c

2023-04-11  IVGCVSW-7507 Pass m_Crops in BatchToSpaceND CpuAcc and GpuAcc workloads  (Teresa Charlin)
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com>
Change-Id: I902c9187eefe7595271312fdc16273f7aa3d41cd

2023-04-04  Remove GetGraph and include of Graph.hpp header from public header  (Matthew Bentham)
Remove the deprecated GetGraph() from OptimizationViews. This method has been deprecated for a long time and no backend still needs it.
Remove the include of Graph.hpp from the public headers, adding includes elsewhere to deal with the header fallout.
Signed-off-by: Matthew Bentham <matthew.bentham@arm.com>
Change-Id: I8dae275a8a446d9d0e19be62684e9b3cd2fa493d

2023-04-03  IVGCVSW-3808 Deprecation notices for old ElementwiseBinary layers  (Mike Kelly)
* Added Deprecation notices for old ElementwiseBinary layers.
Signed-off-by: Mike Kelly <mike.kelly@arm.com>
Change-Id: I5bd0f186aaed675885d667f47e1e210ee9ec84f8

2023-03-31  Revert "IVGCVSW-3808 Deprecation notices for old ElementwiseBinary layers"  (Mike Kelly)
This reverts commit 52e90bf59ecbe90d33368d8fc1fd120f07658aaf.
Change-Id: I5a0d244593d8e760ee7ba0c9d38c02377e1bdc24
Signed-off-by: Mike Kelly <mike.kelly@arm.com>

2023-03-30  IVGCVSW-3808 Deprecation notices for old ElementwiseBinary layers  (Mike Kelly)
* Added Deprecation notices for old ElementwiseBinary layers.
Signed-off-by: Mike Kelly <mike.kelly@arm.com>
Change-Id: Iebbbaff38cc9c347b25eb2f9054c914a4f931c68

2023-03-14  IVGCVSW-3808 Add ElementwiseBinaryLayer  (Mike Kelly)
!android-nn-driver:9329
* Added ElementwiseBinaryLayer that can represent all ElementwiseBinary operations including Add, Div, Sub, Maximum, Mul and Minimum.
* Updated Delegate to use ElementwiseBinaryLayer instead of the Add, Div, Sub, Maximum, Mul and Minimum layers.
* Updated Deserializer to use ElementwiseBinaryLayer instead of the Add, Div, Sub, Maximum, Mul and Minimum layers.
* Updated OnnxParser to use ElementwiseBinaryLayer instead of the Add layer.
* Updated TfLiteParser to use ElementwiseBinaryLayer instead of the Add, Div, Sub, Maximum, Mul and Minimum layers.
* Updated CL and Neon tests to use ElementwiseBinaryLayer.
* Updated CL and Neon Backend Specific Optimizations to accept ElementwiseBinaryLayers as well as Add, Div, Mul, Sub, Maximum and Minimum layers.
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com>
Signed-off-by: Mike Kelly <mike.kelly@arm.com>
Change-Id: I7cbb96b60eb01f0e2b57b0541016d48a08b86c75

2023-01-24  IVGCVSW-7455 Workaround to allow CLBatchMatMul to parse some 4D models  (Mike Kelly)
* Added the ability to reduce dimension sizes when calling BuildArmComputeTensorInfo or BuildArmComputeTensorShapes; this attempts to remove leading 1s in order to squeeze the number of dimensions while retaining the overall size.
* Changed ClBatchMatMulWorkload to attempt to squeeze the number of dimensions down to 3, as the CL Gemm kernel can only support up to 3 dimensions.
Signed-off-by: Mike Kelly <mike.kelly@arm.com>
Change-Id: I6b3d0886c5b97fdb686838fc3dc292833ddc4643

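A minimal sketch of the leading-1 squeeze, assuming a plain std::vector shape; the function name is hypothetical and this is not the real BuildArmComputeTensorInfo code:

```cpp
#include <cstddef>
#include <vector>

// Drop leading dimensions of size 1 until the shape fits the target rank,
// e.g. [1, 1, 5, 7] with maxDimensions == 3 becomes [1, 5, 7].
std::vector<unsigned int> SqueezeLeadingOnes(std::vector<unsigned int> shape,
                                             std::size_t maxDimensions)
{
    while (shape.size() > maxDimensions && shape.front() == 1u)
    {
        shape.erase(shape.begin());
    }
    return shape; // unchanged rank if it cannot be squeezed far enough
}
```
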
2023-01-13  IVGCVSW-6493 Bug Fix on RHS permute GpuAcc Batch MatMul workload Fp32  (Teresa Charlin)
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com>
Change-Id: I60e9284b90467f58e0acd74d3f1493546b6f1b9b

2023-01-13  IVGCVSW-7173 Add Rsqrt to Tosa Ref Backend  (David Monahan)
* Added ElementwiseUnary support with a mapping for Rsqrt
* Added unit tests
* Added Rsqrt EndToEnd tests for all backends
* Changed TosaRefLayerSupport to default to false on unsupported layers
Signed-off-by: David Monahan <david.monahan@arm.com>
Change-Id: I3eaa9c684647ead61520a563815581aa68bee51b

2023-01-12  IVGCVSW-5128 Add EndToEnd test for REDUCE_SUM  (Teresa Charlin)
* Call Reshape EndToEnd test from 3 backends
* Tidy up some naming of tests.
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com>
Change-Id: I5546af35e89d352d3f1529368518aecc0a4a534b

2023-01-11  Move tuning and IClTensorHandle code from cl to aclCommon backend.  (Cathal Corbett)
* Required to enable easier future merging and rebase into experimental/GpuFsa as part of IVGCVSW-7380.
Signed-off-by: Cathal Corbett <cathal.corbett@arm.com>
Change-Id: I066dcf00523ff430a0908666e452548ab848bd86

2023-01-09  IVGCVSW-6493 Add GpuAcc Batch MatMul workload Fp32  (Teresa Charlin)
* GpuAcc only supports up to 3D, so no 4D tests have been added
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com>
Change-Id: Ie926cd45c350be624cbdc6cb27c89d2d3f60884b

2023-01-04  Revert "IVGCVSW-7297 When creating multiple Executors only the last"  (TeresaARM)
This reverts commit 21cf67af47a9cebbc10a98184c204fffa3722abd.
Reason for revert: IVGCVSW-7397 Segmentation fault/Bus error in Backends CI job nightly
Change-Id: I563e79700a857f8cf0fce0923a7040aeda29629b

2022-12-21  Refactor, remove m_ImportFlags from ClTensorHandleFactory  (Matthew Bentham)
Signed-off-by: Matthew Bentham <matthew.bentham@arm.com>
Change-Id: Ia7d714eb227a96ad9eeb1441afbc83e6ad2bb197

2022-12-12  IVGCVSW-7209 Remove deprecated code due to be removed in 23.02  (Mike Kelly)
* Removed weights and bias from Convolution, DepthwiseConv & FullyConnected layers
* Removed the weight and bias ConstTensorHandles from the QueueDescriptors
* Updated Workloads to take tensors from WorkloadInfo rather than the QueueDescriptors
* Removed unused RedirectMembersToConstantInputs optimization and tests.
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com>
Signed-off-by: Mike Kelly <mike.kelly@arm.com>
Change-Id: I9ffcdc4a1c0dff725539dd69fc435b700bd98a56

2022-12-07  IVGCVSW-6853 Rewrite BuildArmComputePermutationVector()  (Teresa Charlin)
* Some permutation vectors were not converted correctly.
* Add Transpose end-to-end test.
* Comments added with an example to clarify the differences between Transpose and Permute
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com>
Change-Id: I6c0954ca6ce00ef5f2a6f3625abe6f4fd27b5cdf

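One way to see the relationship between the two conventions is that a Transpose-style vector and a Permute-style vector describing the same data movement are inverse permutations of each other. The helper below is illustrative only; the exact Arm NN conventions are documented in the layer headers:

```cpp
#include <cstddef>
#include <vector>

// Given a permutation where dimension i maps to perm[i], return the inverse
// mapping (perm[i] maps back to i). Applying it converts between the
// "each output names its source" and "each input names its destination" views.
std::vector<unsigned int> InvertPermutation(const std::vector<unsigned int>& perm)
{
    std::vector<unsigned int> inverse(perm.size());
    for (std::size_t i = 0; i < perm.size(); ++i)
    {
        inverse[perm[i]] = static_cast<unsigned int>(i);
    }
    return inverse;
}

// Example: the Transpose-style vector {0, 2, 3, 1} (each output dimension
// names its source dimension) inverts to the Permute-style vector {0, 3, 1, 2}
// (each input dimension names its destination), and vice versa.
```
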
2022-11-23  IVGCVSW-7297 When creating multiple Executors only the last one works fine  (Mike Kelly)
* Each CLBackend created its own ClContextControlWrapper, which invalidated the OpenCL contexts from all CLBackends that were created before it.
* Now CLBackends keep a shared_ptr to a ClContextControlWrapper, which more closely matches the functionality within ACL.
Signed-off-by: Mike Kelly <mike.kelly@arm.com>
Change-Id: I0744c2cb6a2f0d6b0c5fa54d786f88cf97775559

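The sharing described above follows the familiar pattern of a static factory handing out a shared_ptr to a single instance. The sketch below shows the idea with hypothetical names; it is not the actual ClContextControlWrapper API:

```cpp
#include <memory>
#include <mutex>

class ContextControlWrapper
{
public:
    // All backends receive the same instance; the underlying context is only
    // torn down when the last backend releases its shared_ptr.
    static std::shared_ptr<ContextControlWrapper> Get()
    {
        static std::mutex mutex;
        static std::weak_ptr<ContextControlWrapper> instance;

        std::lock_guard<std::mutex> lock(mutex);
        auto existing = instance.lock();
        if (!existing)
        {
            existing = std::shared_ptr<ContextControlWrapper>(new ContextControlWrapper());
            instance = existing;
        }
        return existing;
    }

private:
    ContextControlWrapper() = default; // would initialise the OpenCL context here
};
```
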
2022-11-16  IVGCVSW-7214 Disable BF16-Turbo-Mode and remove conversion layers  (Ryan OShea)
* Remove the Bf16ToFp32 conversion layer
* Remove the Fp32ToBf16 conversion layer
* Remove the Bf16 conversion tests
* Throw an exception if the m_ReduceFp32ToBf16 optimizer option is set to true
* Provide comments explaining that fast math must be enabled in order to use bf16
* Update docs to inform users to enable fast math for bf16
ExecuteNetwork changes:
* Require bf16_turbo_mode to also have fast_math_enabled set to true
* Remove setting of the m_ReduceFp32ToBf16 optimizer option
Signed-off-by: Ryan OShea <ryan.oshea3@arm.com>
Change-Id: Ibaa6da9d29c96a1ce32ff5196b0847fde9f04a1c

2022-11-15  Minor error formatting fixes.  (Colm Donelan)
Signed-off-by: Colm Donelan <colm.donelan@arm.com>
Change-Id: I17823fb8b6bbabc4da327187167ce9582ee29b32

2022-11-09  IVGCVSW-7318 Support basic addition model in the TOSA Reference Backend  (Ryan OShea)
* Create a simple Addition EndToEnd test
* Create an EndToEndTest file in the TosaRef/test directory
* Add the AdditionEndToEnd test to CpuRef, CpuAcc, GpuAcc and TosaRef
Signed-off-by: Ryan OShea <ryan.oshea3@arm.com>
Change-Id: Ic44e2b457c25dcb41bb3b17c05cce0e74bf17a80

2022-10-11  IVGCVSW-7222 Fix incorrect kernel measurements in profiling output  (Kevin May)
* Some CL kernels are not run after the first inference, which breaks the profiler because it expects a measurement for every kernel on each run
* Add a function HasKernelMeasurements() to ascertain whether the Event is returning kernel measurements, and if so insert 0.0 values for any missing kernel measurements
* Fix ExecuteNetwork to only print a JSON object after all inferences have completed
Signed-off-by: Kevin May <kevin.may@arm.com>
Change-Id: I99f2bb0db847f5a52ab4c5705b072155c6b6f333

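The padding of missing measurements can be sketched as a small helper that records an explicit 0.0 for any kernel that produced no timing on this run, so every run reports the same set of kernels. Names and containers are illustrative, not the real profiler code:

```cpp
#include <map>
#include <string>
#include <vector>

void FillMissingKernelMeasurements(const std::vector<std::string>& expectedKernels,
                                   std::map<std::string, double>& measurementsMs)
{
    for (const auto& kernel : expectedKernels)
    {
        // insert() is a no-op for kernels that already have a real timing
        measurementsMs.insert({ kernel, 0.0 });
    }
}
```
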
2022-08-05  IVGCVSW-7111 change backend deprecation from 22.11 to 23.08  (Jim Flynn)
Signed-off-by: Jim Flynn <jim.flynn@arm.com>
Change-Id: I3a3aab7b5042349cb2df8517678306665e037610

2022-08-05  IVGCVSW-6889 Seg fault running ExeNet with --bf16-turbo-mode on fpga  (Francis Murtagh)
* Added a case for Bf16 to the switch and changed the assertion to an exception so that it shows up in Release builds.
Signed-off-by: Francis Murtagh <francis.murtagh@arm.com>
Change-Id: I817260dc7b7667386c4aa734bea649383866a785

2022-08-05  GitHub #667: Neon fold padding into average pool 2D quantization bug fix.  (Cathal Corbett)
* Originated from a GitHub issue: https://github.com/ARM-software/armnn/issues/667
* Initially, Arm NN supports the pool 2D operation because there is no padding on the pool2d itself; the Neon failure occurs when padding is followed by an average pool 2D, due to the folding optimization.
* Here we prevent the folding optimization from happening for this special case and add it in as a backend specific optimization.
Signed-off-by: Cathal Corbett <cathal.corbett@arm.com>
Change-Id: Ia0fd90c3a6b4b9d29c81106f154617d2e893e26b

2022-07-27  IVGCVSW-6896 Fix pre-import when using sync execute.  (Colm Donelan)
* Refactor backend capability checks in LoadedNetwork.
* ImportInputs should check that the number of tensors does not exceed the number of inputs.
* In EnqueueWorkload the check for the count of input tensors was ignoring pre-imported inputs.
* Added checks to verify ImportInputs/ImportOutputs worked as expected in EndToEndTestImpl.
* Improve documentation on ImportInputs/ImportOutputs in IRuntime.hpp.
* Disabled import tests in CL and Neon EndToEndTests that cannot work.
Signed-off-by: Colm Donelan <colm.donelan@arm.com>
Change-Id: Iae4b2644a1c9f01ee72bce1afb211661cc9ae2e3

2022-07-08  IVGCVSW-6957 'Import Host Memory in SL'  (Sadik Armagan)
* Enabled import host memory in SL as default
* Updated import host memory functionality in GpuAcc
Signed-off-by: Sadik Armagan <sadik.armagan@arm.com>
Change-Id: I22132b1e1008159b0e7247219762e3e9ae5eba10

2022-06-22  Revert "Revert "IVGCVSW-6873 Import inputs but don't export outputs fails.""  (Francis Murtagh)
This reverts commit a0f8b15d4ddb5075f380003ff31b271d389d3b66.
Reason for revert: <Test ClDmaBufInternalTests review >
Change-Id: Ibc4a77fa008643849da7330391942e4c87b941e2

2022-06-21  Revert "IVGCVSW-6873 Import inputs but don't export outputs fails."  (James Conroy)
This reverts commit 03bf98a8bc51ad20eef4b9ca5fbf6ce15e063721.
Reason for revert: Caused failures in tests located in internal repo.
Change-Id: If35cb0ede349b270e4e7827324382e09455d8cfa

2022-06-20  IVGCVSW-6873 Import inputs but don't export outputs fails.  (Colm Donelan)
Only one bool is used to indicate whether inputs should be imported. However, it's possible for the user to want to import inputs but not export outputs. In addition, it's possible for a user to enable import during optimize but then pass a memory source that does not require import.
* Add m_ExportEnabled to INetwork.hpp.
* Modify Network::dNetwork to consider both m_ImportEnabled and m_ExportEnabled.
* Add ValidateSourcesMatchOptimizedNetwork to LoadedNetwork to validate import options between optimize and network load.
* Update the TfLite delegate to consider the exportEnabled flag in the optimizer.
!armnn-internal-tests:425350
Signed-off-by: Colm Donelan <colm.donelan@arm.com>
Change-Id: I776eab81595898e43f91ab40306962eae61329f4

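A minimal sketch of keeping import and export as independent decisions and cross-checking them against the memory sources supplied at load time. The struct, enum and function names here are hypothetical and do not mirror the real INetwork/OptimizerOptions API:

```cpp
// Import and export are tracked separately rather than with a single bool.
struct MemoryAccessOptions
{
    bool importEnabled = false; // caller-owned input buffers may be imported
    bool exportEnabled = false; // outputs may be written directly to caller memory
};

enum class MemorySource { Undefined, Malloc, DmaBuf };

// A network optimised with import enabled must later be loaded with a memory
// source that can actually be imported, and likewise for export.
bool SourcesMatchOptions(const MemoryAccessOptions& options,
                         MemorySource inputSource,
                         MemorySource outputSource)
{
    bool importOk = !options.importEnabled || inputSource  != MemorySource::Undefined;
    bool exportOk = !options.exportEnabled || outputSource != MemorySource::Undefined;
    return importOk && exportOk;
}
```
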
2022-06-10  IVGCVSW-6986 SLTS Failures due to Caching commits  (Cathal Corbett)
* Fix made to experimental/armnn_shim_sl branch also required for armnn master branch.
* TestGenerated/GeneratedTests.Sync/argmax_1 fix.
Signed-off-by: Cathal Corbett <cathal.corbett@arm.com>
Change-Id: Idb0324ff59e1ed13caf5f4bf899d1d3220d823d4

2022-05-23  MLCE-825: Give reason when workload unsupported for Non Constant Weights/Bias  (Francis Murtagh)
* BackendHelper.cpp IsXXXLayerSupported doesn't get as far as Neon/Cl Validate functions where arm_compute::Status is returned.
* Conv2d, Depthwise, DilatedDepthwise and FullyConnected
* Tidy up if() -> if ()
* Clean up logic in FullyConnected so that isLayerSupported gets called
Signed-off-by: Francis Murtagh <francis.murtagh@arm.com>
Signed-off-by: Teresa Charlin <teresa.charlinreyes@arm.com>
Change-Id: I5da1a882f4a2f55e90aa984b2b9548a847cb3a2d

2022-05-23  IVGCVSW-6123 ConstTensorsAsInputs: Conv2d  (Keith Davis)
* Use the new INetwork::AddConvolution2dLayer instead of the deprecated version
* Remove duplicated test in SerializerTests
* Fix some cosmetics
Signed-off-by: Keith Davis <keith.davis@arm.com>
Change-Id: I3407815bfdc1cdc01ca0a667b8e4d80d8621783f