aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-11-20Revert "MLMBED-3450: Do not convert batched fully connected to conv"2.0.0.rc2Patrik Gustavsson
This reverts commit 15a8e803844b286fe9533e1cf703c76a77b090a8. Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: I64169443f473c9ba42551281ad6ac4b45856f420
2020-11-20vela: Add SUPPORTED_OPS.md generated fileMichael McGeagh
This file is generated from the vela option --supported-ops-report Each release, a snapshot will be taken and uploaded with the release. This is for the 2.0.0 release Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com> Change-Id: I6b618889758a1a078e21244f1f98a56800a528a3
2020-11-20MLBEDSW-3157: Add test for broadcast shapesAndreas Nevalainen
Change-Id: Ifbd6c053ac618bedce0f56fe5c4c647a71d9cc46 Signed-off-by: Andreas Nevalainen <andreas.nevalainen@arm.com>
2020-11-20vela: Relax lxml version constraintTim Hall
- Tested and changed lxml to an older version Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: Ie097820008a4d60cb4549c659a90553a0ab9efa9
2020-11-20vela: Update tool description textTim Hall
- Updated and aligned the --help and setup.py descriptions Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I78c11b1b3dd51284b34d57a6caca45cd222b4678
2020-11-20vela: Tanh and Sigmoid broken in fixup_act_reorderTim Hall
- Fixed bug due to typo in Op.type refactor Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I55916d90bf792648f496a45c358b7e897c6730ba
2020-11-20vela: Remove and change CLI optionsTim Hall
- Removed unused --show-minimum-possible-allocation - Change --allocation-alignment to --cpu-tensor-alignment Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I00e367c3190aeea08a3f136332711e9accc85ba3
2020-11-20MLBEDSW-3249: Vela config file examplesTim Hall
- Added sample vela.ini config file - Changed vela config format, split into system config and memory mode - Removed unused CPU cycle performance estimation - Added new CLI options for --memory-mode and --verbose-config - Changed CLI option --config to take multiple files - Removed CLI option --global-memory-clock-scales - Changed error helper functions to raise a VelaError exception - Refactored to create a new is_spilling_enabled function Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I27c41577e37a3859edb9524cd99784be10ef0a0d
2020-11-20vela: Rename Yoda to Ethos-U65Tim Hall
- Also changed to use Ethos-U where appropriate Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: Ie45ba2bb3935b305abe897b78b498681296cb7c1
2020-11-20MLBEDSW-3302: Reject per-channel scaling for unsupported opsDwight Lidman
Vela only supports per-channel scaling for convolution ops. This commit adds a check that puts ops with per-channel scaling on the CPU. A caveat worth mentioning is that neither TensorFlow Lite or TensorFlow Lite Micro support per-channel scaling for the CPU placed op, however the problem is moved away from Vela. This commit also changes a small utility function in supported_operators.py used for docstring formatting. Signed-off-by: Dwight Lidman <dwight.lidman@arm.com> Change-Id: I9ed090592f1d05dd4566d3e54dba1ef405299383
2020-11-20vela: Improve the scaling is equal checkTim Hall
- Improved tensor and scaling query functions - Fixed bug in convert_batched_fc_to_conv Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: Ibc3d14036540f27cf5e993beb2163d3e0f5e5933
2020-11-19MLBEDSW-3346: Add index check during paddingAndreas Nevalainen
Change-Id: If63acbc3bcb986db6b81afa4078d5abed05d8afa Signed-off-by: Andreas Nevalainen <andreas.nevalainen@arm.com>
2020-11-19MLBEDSW-3476: Fix performance regressionDiqing Zhong
- Improve the conv estimation when the block size is very small - Estimate cycles on bias/scale channel Signed-off-by: Diqing Zhong <diqing.zhong@arm.com> Change-Id: I275770b7f013b0812fc1ffe91f42ad07727c9dc7
2020-11-19MLBEDSW-3251 Add version to external APIPatrik Gustavsson
Added version to the external API -Added CLI-option --api_version -Added API function to get the API version Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: I0143b50adf884a2b05145912a1c7bef8cecc5f02
2020-11-19[MLBEDSW-3300] Fix DepthwiseConv2D fails when bias tensor quant_values are NoneFredrik Svedberg
Fixed DepthwiseConv2D fails when bias tensor quant_values are None. Also fixed DepthwiseConv2D fails with implicit depth multiplier. Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com> Change-Id: I799a565eefa498ccf7ac626fcd472b8cbd908931
2020-11-19[MLBEDSW-3348] Fix Reshape operator fails with TypeError during deserializationFredrik Svedberg
Fixed Reshape operator fails with TypeError during deserialization in some cases. Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com> Change-Id: Ib34142f64295de4524e52a7a28eb36e503047bc0
2020-11-18vela: Remove ExpandDims from supported ops listMichael McGeagh
EXPAND_DIMS is not yet supported by vela, and so should not be in the list of supported ops. Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com> Change-Id: I5eca13eb52eb9b40ecc6592cda978614c71db99d
2020-11-18MLMBED-3468: Update scale tensors SRAM size calculationAndreas Nevalainen
Updated SRAM size calculation for scale tensors. Change-Id: Idaecc3bf0c83d58ea70163bfd194c594295b66db Signed-off-by: Andreas Nevalainen <andreas.nevalainen@arm.com>
2020-11-18MLBEDSW-3494 Fix rounding of fused QuantizedPatrik Gustavsson
Fix for setting rounding to TFL for fused Quantized Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: Ic203f95f8916e330bcbf5792b52661b6f3e99bfc
2020-11-17MLBEDSW-3403 Generate supported op reportMichael McGeagh
A new CLI has been added that allows the generation of a report containing a summary table of all TFLite ops that can be placed on the NPU, and what the constraints are for that operator to be successfully scheduled on the NPU. This option will generate a new file, SUPPORTED_OPS.md containing this information, in the current working directory. Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com> Change-Id: I6a7e2a49f251b76b2ea1168fff78e00da1910b25
2020-11-17MLBEDSW-3491: Fix index out of range in code genLouis Verhaard
Usage of shape[-2] could cause index out of range. Signed-off-by: Louis Verhaard <louis.verhaard@arm.com> Change-Id: I1b64b117f8236ce9ba321ca03bdb25e5a03a6589
2020-11-17MLBEDSW-3493: bug fixes in mark_tensorsLouis Verhaard
None inputs and unsupported tensor shapes caused asserts when marking tensor purpose/format. Change-Id: I4498b61576f529c1a594341cfbb6ba278c6e7ec5 Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
2020-11-17MLMBED-3450: Do not convert batched fully connected to convAndreas Nevalainen
Do not convert batched fully connected operators to avoid moving weights from flash to SRAM. Change-Id: I873c9ce05377de3f16e4cee9a0863f29d9ec3ad4 Signed-off-by: Andreas Nevalainen <andreas.nevalainen@arm.com>
2020-11-16MLBEDSW-3483, KeyError "fused_activation_function"Louis Verhaard
Bug fix for a regression: Vela could crash for operators placed on CPU. Signed-off-by: Louis Verhaard <louis.verhaard@arm.com> Change-Id: I99dcfdb4d3029ad86ffd2c8b3fd2547554794b79
2020-11-16MLBEDSW-3350 Put softmax on CPU if beta < 0Patrik Gustavsson
Put softmax on CPU if beta < 0 Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: I4ec866dd44d14e2737c4cd96474e54bb770bfb3e
2020-11-16MLBEDSW-3301: Vela fails ungracefully when reading string buffersDwight Lidman
When encountering a sparse string buffer, Vela fails both due to missing a mapping for a Numpy string type and also for not being able to read sparse buffers. The failing line is attempting to reshape a [100] buffer into a [3, 5] tensor which does not work due to Vela treating the buffer as non-sparse. The solution here is to simply not do the reshape for string buffers (which all appear to be sparse) since it is not something that will be supported in the future anyway. The related operator can then be pushed to the CPU as expected. Signed-off-by: Dwight Lidman <dwight.lidman@arm.com> Change-Id: Iea0af6cd60a691f975209014b6aa098dde8d6a4b
2020-11-13MLBEDSW-839: Code generation using external API2.0.0.rc1Louis Verhaard
Added external API to generate register command streams. Existing code generation has been refactored to make use of this API. Change-Id: Ibb4c2b167809869f16470b14da24f08a65c82b7b Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
2020-11-11MLBEDSW-3463: StridedSlice fixup function causes infinite recursionDwight Lidman
This commit reverts a control flow path where already modified StridedSlice operators are left untouched. If not, Vela would recurse infinitely and crash. Signed-off-by: Dwight Lidman <dwight.lidman@arm.com> Change-Id: Iaf3ae916325bedd3dd1edd3395fb4a9ecf832590
2020-11-11MLBEDSW-3380 Update readme with build flagsMichael McGeagh
mlw_codec is part of the codebase and has build flags. README has been updated to include these. Also, added -Werror to the list, as we must build without any warnings, so treat warnings as errors. Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com> Change-Id: I10114bb013fad1ec1685fafc2e41c18ff12d9f9d
2020-11-11MLBEDSW-3019: Add profiling debug databaseTim Hall
- Added mechanism to track input to output graph transforms for debugging the resultant command stream. - Provides base implementation for MLBEDSW-2661 Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I2dfe8a409fbde7ad0282bfab5acb11ba1c8b82d8
2020-11-11Vela: estimate memory transfer efficiencyDiqing Zhong
Change-Id: I9e00afe0eef0e13fe990e021bcbe3dd0eda4c471 Signed-off-by: Diqing Zhong <diqing.zhong@arm.com>
2020-11-11Vela: Fix perf estimation for conv 1D reshapeDiqing Zhong
Change-Id: I8f139381d0e01e8ac70d89c4a312ee3000fb5fa1 Signed-off-by: Diqing Zhong <diqing.zhong@arm.com>
2020-11-11MLBEDSW-3146: memory transfers cycle estimationDiqing Zhong
- DMA ops cycle estimation for the first pass - fix a bug in ifm_blk_depth calculation - fix a bug in sram bandwidth calculation - merge dpu and elementwise cycles into npu cycles - use str.format() in performance print Change-Id: I78895416f47fc3c652743c5da13fc45630322371 Signed-off-by: Diqing Zhong <diqing.zhong@arm.com> (cherry picked from commit 5245e97a62c2fe54250f99b06e778f3e0c6dc376) (cherry picked from commit 16e415677403fc04a90b1a7ec554761d38315640)
2020-11-11MLBEDSW-3146: Cycle estimation for conv/pooling opsDiqing Zhong
Signed-off-by: Diqing Zhong <diqing.zhong@arm.com> Change-Id: Ic6ae795a1626d1cdf63a69d2ff86f7cd898f3134
2020-11-11MLBEDSW-3222: Bias tensors in fast storageAndreas Nevalainen
For IFM streamed cascades bias tensors are read several times. Moves these tensors to fast storage and add DMA commands. Change-Id: I630f6275986c1b5e3f126c925b11e22500fb1128 Signed-off-by: Andreas Nevalainen <andreas.nevalainen@arm.com>
2020-11-10MLBEDSW-3377: fixup_stridedslice_output may silently change CPU opsDwight Lidman
This commit removes the constraint on all tensor shapes matching the OFM shape. The motivation is that this constraint essentially only checks that the fixup function has run. This means that it removes the possibility for the fixup function to run after the supported operator check and this effectively means that any StridedSlice operator that would be placed on the CPU is still modified by the fixup function. Because the fixup function is moved to after the supported operators check, some unreachable cases are removed from the fixup function. Signed-off-by: Dwight Lidman <dwight.lidman@arm.com> Change-Id: I7a82126b7de73bd67873b4e6daf53a6767e33d16
2020-11-10MLBEDSW-2868 Refactor separation of scale + bias tensorsPatrik Gustavsson
Changed so that there is an option to set if Tensor clone should be seen as unique or not. Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: Ie51c1a5e84b535380d498b105aa18ccba1c8b27c
2020-11-10[MLBEDSW-3227] Improve u65 softmax performanceFredrik Svedberg
Improve u65 softmax performance by selecting more feature map tensors as SRAM candidates. Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com> Change-Id: I239c9dbebbf2a929004eb01bb0f3efe77f5b97aa
2020-11-09MLBEDSW-3402 SupportedOp now returns external nameMichael McGeagh
Previously the internal operator type was printed when checking the supported operator checks. This now converts that back to the external type name. Additionally removed dead code and changed the message for cpu-only ops Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com> Change-Id: Ib2b0cbcb49fdf63edb835828e266b079e63bae37
2020-11-06MLBEDSW-3212 Remove CLI opt ifm-ofm-overlapPatrik Gustavsson
Removed the CLI opt ifm-ofm-overlap Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: I23faa0d10c3e71972c543e22e8155086fce73556
2020-11-04MLBEDSW-2412 All constraints have been refactoredMichael McGeagh
All existing constraints have now been refactored using the new framework. Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com> Change-Id: Ic9ba0d7040cb9f114b959a949bfdf777f86752c7
2020-11-04MLBEDSW-3275: Added infinity check for Relu scaling valuesJacob Bohlin
Added a supported_operators check for Relu activation functions. If the scaling value overflows to infinity, it will be placed on the CPU. Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com> Change-Id: I66b7bec062599609aadcbb7531caebbc45a7451f Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
2020-11-04MLBEDSW-1974: Set Scratch buffers sizeJacob Bohlin
Set the actual size of the Scratch and Fast Scratch buffer and remove both Scratch buffers from the subgraph inputs. Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com> Change-Id: I9e4213f48289d9136cdd4cd43c668d37c6af8530
2020-11-03MLBEDSW-2868 Separate scale+bias tensorsPatrik Gustavsson
Separate scale+bias tensors by different equivilence_id. Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: I674341950bc001ac6e4015206995f048a0dfee75
2020-10-30Vela: Fix wrong bandwidthDiqing Zhong
- copy bandwidth compression rate when weight tensor is cloned Signed-off-by: Diqing Zhong <diqing.zhong@arm.com> Change-Id: I41c4c1f7001e8dc12af35695f5f5d02815e28351
2020-10-28MLBEDSW-3212 Enable overlap of elementwise input/outputPatrik Gustavsson
Enable overlap of elementwise input/output Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: I6e6f11953319c843c8203bf038f96778df194332
2020-10-26MLBEDSW-3283: Bug fix: StridedSlice Op is placed on CPUDiqing Zhong
Signed-off-by: Diqing Zhong <diqing.zhong@arm.com> Change-Id: I91a3b277cda91dca3bad38908d4ed11a4f5d7d5f
2020-10-22MLBEDSW-3285: AttributeError Tensor has no attributeTim Hall
- Fixed typo in Tensor.is_quantized() Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I36156a6aa5aaff01c4f271a6a8325636173225f3
2020-10-21vela: Refactor operators to use Kernel objectsTim Hall
- Normalise kernel availability by requiring all operators offer a kernel describing how much data they consume from the source, per OFM element, regardless of whether kernels are relevant to the operation. Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: Idbcff64879fc2eccf292b6208a7d2038eb388017
2020-10-21vela: Improve the scaling is equal checkTim Hall
- Fixed and documented both tensor and quant params scaling checks - Added quant params validity check and tensor quantisation check - Added valid tensor checks to some graph optimisation functions Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I8d6e8f03a603d28886dde511672c8399c85b794c