path: root/ethosu
Age | Commit message | Author
2020-11-19 | MLBEDSW-3346: Add index check during padding | Andreas Nevalainen
Change-Id: If63acbc3bcb986db6b81afa4078d5abed05d8afa Signed-off-by: Andreas Nevalainen <andreas.nevalainen@arm.com>
2020-11-19 | MLBEDSW-3476: Fix performance regression | Diqing Zhong
- Improve the conv estimation when the block size is very small
- Estimate cycles on bias/scale channel
Signed-off-by: Diqing Zhong <diqing.zhong@arm.com> Change-Id: I275770b7f013b0812fc1ffe91f42ad07727c9dc7
2020-11-19 | MLBEDSW-3251 Add version to external API | Patrik Gustavsson
Added version to the external API
- Added CLI-option --api_version
- Added API function to get the API version
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: I0143b50adf884a2b05145912a1c7bef8cecc5f02
2020-11-19 | [MLBEDSW-3300] Fix DepthwiseConv2D fails when bias tensor quant_values are None | Fredrik Svedberg
Fixed DepthwiseConv2D fails when bias tensor quant_values are None. Also fixed DepthwiseConv2D fails with implicit depth multiplier. Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com> Change-Id: I799a565eefa498ccf7ac626fcd472b8cbd908931
2020-11-19 | [MLBEDSW-3348] Fix Reshape operator fails with TypeError during deserialization | Fredrik Svedberg
Fixed Reshape operator fails with TypeError during deserialization in some cases. Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com> Change-Id: Ib34142f64295de4524e52a7a28eb36e503047bc0
2020-11-18 | vela: Remove ExpandDims from supported ops list | Michael McGeagh
EXPAND_DIMS is not yet supported by vela, and so should not be in the list of supported ops. Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com> Change-Id: I5eca13eb52eb9b40ecc6592cda978614c71db99d
2020-11-18 | MLMBED-3468: Update scale tensors SRAM size calculation | Andreas Nevalainen
Updated SRAM size calculation for scale tensors. Change-Id: Idaecc3bf0c83d58ea70163bfd194c594295b66db Signed-off-by: Andreas Nevalainen <andreas.nevalainen@arm.com>
2020-11-18 | MLBEDSW-3494 Fix rounding of fused Quantized | Patrik Gustavsson
Fix for setting rounding to TFL for fused Quantized Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: Ic203f95f8916e330bcbf5792b52661b6f3e99bfc
2020-11-17 | MLBEDSW-3403 Generate supported op report | Michael McGeagh
A new CLI has been added that allows the generation of a report containing a summary table of all TFLite ops that can be placed on the NPU, and what the constraints are for that operator to be successfully scheduled on the NPU. This option will generate a new file, SUPPORTED_OPS.md containing this information, in the current working directory. Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com> Change-Id: I6a7e2a49f251b76b2ea1168fff78e00da1910b25
2020-11-17 | MLBEDSW-3491: Fix index out of range in code gen | Louis Verhaard
Usage of shape[-2] could cause index out of range. Signed-off-by: Louis Verhaard <louis.verhaard@arm.com> Change-Id: I1b64b117f8236ce9ba321ca03bdb25e5a03a6589
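The failure mode can be illustrated with a small sketch: negative indexing raises IndexError when a shape has fewer dimensions than the index assumes. This is not Vela's code; `dim_from_end` and its default of 1 are hypothetical names chosen for the example.

```python
def dim_from_end(shape, n, default=1):
    """Return shape[-n], or `default` when the shape has fewer than n dimensions.

    Hypothetical helper: guards the negative index with a length check
    instead of assuming the shape is at least n-dimensional.
    """
    return shape[-n] if len(shape) >= n else default


# A 4D shape has a second-to-last dimension.
width = dim_from_end([1, 8, 8, 16], 2)

# A 1D shape does not; plain shape[-2] would raise IndexError here.
fallback = dim_from_end([16], 2)

print(width, fallback)
```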
2020-11-17 | MLBEDSW-3493: bug fixes in mark_tensors | Louis Verhaard
None inputs and unsupported tensor shapes caused asserts when marking tensor purpose/format. Change-Id: I4498b61576f529c1a594341cfbb6ba278c6e7ec5 Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
2020-11-17 | MLMBED-3450: Do not convert batched fully connected to conv | Andreas Nevalainen
Do not convert batched fully connected operators to avoid moving weights from flash to SRAM. Change-Id: I873c9ce05377de3f16e4cee9a0863f29d9ec3ad4 Signed-off-by: Andreas Nevalainen <andreas.nevalainen@arm.com>
2020-11-16 | MLBEDSW-3483, KeyError "fused_activation_function" | Louis Verhaard
Bug fix for a regression: Vela could crash for operators placed on CPU. Signed-off-by: Louis Verhaard <louis.verhaard@arm.com> Change-Id: I99dcfdb4d3029ad86ffd2c8b3fd2547554794b79
2020-11-16 | MLBEDSW-3350 Put softmax on CPU if beta < 0 | Patrik Gustavsson
Put softmax on CPU if beta < 0 Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: I4ec866dd44d14e2737c4cd96474e54bb770bfb3e
2020-11-16 | MLBEDSW-3301: Vela fails ungracefully when reading string buffers | Dwight Lidman
When encountering a sparse string buffer, Vela fails both due to missing a mapping for a Numpy string type and also for not being able to read sparse buffers. The failing line is attempting to reshape a [100] buffer into a [3, 5] tensor which does not work due to Vela treating the buffer as non-sparse. The solution here is to simply not do the reshape for string buffers (which all appear to be sparse) since it is not something that will be supported in the future anyway. The related operator can then be pushed to the CPU as expected. Signed-off-by: Dwight Lidman <dwight.lidman@arm.com> Change-Id: Iea0af6cd60a691f975209014b6aa098dde8d6a4b
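A dense reshape requires the buffer's element count to equal the product of the target shape, which is why a [100] buffer cannot become a [3, 5] tensor. The sketch below shows that check and the "skip the reshape for string buffers" idea in plain Python. It is only an illustration under assumptions: `can_reshape`, `shape_for_buffer` and the `is_string` flag are hypothetical and do not reflect Vela's reader code.

```python
from math import prod


def can_reshape(num_elements, shape):
    """A dense reshape only works when the element counts match."""
    return num_elements == prod(shape)


def shape_for_buffer(num_elements, shape, is_string):
    """Return the shape to apply, or None to leave the buffer untouched.

    Hypothetical policy: string buffers are never reshaped, so the
    operator that consumes them can simply be left to the CPU.
    """
    if is_string:
        return None
    if not can_reshape(num_elements, shape):
        raise ValueError(f"cannot reshape {num_elements} elements into {shape}")
    return list(shape)


print(can_reshape(100, [3, 5]))                    # 100 != 15
print(shape_for_buffer(100, [3, 5], is_string=True))
```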
2020-11-13 | MLBEDSW-839: Code generation using external API (tag: 2.0.0.rc1) | Louis Verhaard
Added external API to generate register command streams. Existing code generation has been refactored to make use of this API. Change-Id: Ibb4c2b167809869f16470b14da24f08a65c82b7b Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
2020-11-11 | MLBEDSW-3463: StridedSlice fixup function causes infinite recursion | Dwight Lidman
This commit reverts a control flow path where already modified StridedSlice operators are left untouched. If not, Vela would recurse infinitely and crash. Signed-off-by: Dwight Lidman <dwight.lidman@arm.com> Change-Id: Iaf3ae916325bedd3dd1edd3395fb4a9ecf832590
2020-11-11 | MLBEDSW-3380 Update readme with build flags | Michael McGeagh
mlw_codec is part of the codebase and has build flags. README has been updated to include these. Also, added -Werror to the list, as we must build without any warnings, so treat warnings as errors. Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com> Change-Id: I10114bb013fad1ec1685fafc2e41c18ff12d9f9d
2020-11-11 | MLBEDSW-3019: Add profiling debug database | Tim Hall
- Added mechanism to track input to output graph transforms for debugging the resultant command stream.
- Provides base implementation for MLBEDSW-2661
Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I2dfe8a409fbde7ad0282bfab5acb11ba1c8b82d8
2020-11-11 | Vela: estimate memory transfer efficiency | Diqing Zhong
Change-Id: I9e00afe0eef0e13fe990e021bcbe3dd0eda4c471 Signed-off-by: Diqing Zhong <diqing.zhong@arm.com>
2020-11-11 | Vela: Fix perf estimation for conv 1D reshape | Diqing Zhong
Change-Id: I8f139381d0e01e8ac70d89c4a312ee3000fb5fa1 Signed-off-by: Diqing Zhong <diqing.zhong@arm.com>
2020-11-11 | MLBEDSW-3146: memory transfers cycle estimation | Diqing Zhong
- DMA ops cycle estimation for the first pass
- fix a bug in ifm_blk_depth calculation
- fix a bug in sram bandwidth calculation
- merge dpu and elementwise cycles into npu cycles
- use str.format() in performance print
Change-Id: I78895416f47fc3c652743c5da13fc45630322371 Signed-off-by: Diqing Zhong <diqing.zhong@arm.com>
(cherry picked from commit 5245e97a62c2fe54250f99b06e778f3e0c6dc376)
(cherry picked from commit 16e415677403fc04a90b1a7ec554761d38315640)
2020-11-11 | MLBEDSW-3146: Cycle estimation for conv/pooling ops | Diqing Zhong
Signed-off-by: Diqing Zhong <diqing.zhong@arm.com> Change-Id: Ic6ae795a1626d1cdf63a69d2ff86f7cd898f3134
2020-11-11 | MLBEDSW-3222: Bias tensors in fast storage | Andreas Nevalainen
For IFM streamed cascades, bias tensors are read several times. Move these tensors to fast storage and add DMA commands. Change-Id: I630f6275986c1b5e3f126c925b11e22500fb1128 Signed-off-by: Andreas Nevalainen <andreas.nevalainen@arm.com>
2020-11-10 | MLBEDSW-3377: fixup_stridedslice_output may silently change CPU ops | Dwight Lidman
This commit removes the constraint on all tensor shapes matching the OFM shape. The motivation is that this constraint essentially only checks that the fixup function has run. This means that it removes the possibility for the fixup function to run after the supported operator check and this effectively means that any StridedSlice operator that would be placed on the CPU is still modified by the fixup function. Because the fixup function is moved to after the supported operators check, some unreachable cases are removed from the fixup function. Signed-off-by: Dwight Lidman <dwight.lidman@arm.com> Change-Id: I7a82126b7de73bd67873b4e6daf53a6767e33d16
2020-11-10 | MLBEDSW-2868 Refactor separation of scale + bias tensors | Patrik Gustavsson
Changed so that there is an option to set if Tensor clone should be seen as unique or not. Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: Ie51c1a5e84b535380d498b105aa18ccba1c8b27c
2020-11-10 | [MLBEDSW-3227] Improve u65 softmax performance | Fredrik Svedberg
Improve u65 softmax performance by selecting more feature map tensors as SRAM candidates. Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com> Change-Id: I239c9dbebbf2a929004eb01bb0f3efe77f5b97aa
2020-11-09 | MLBEDSW-3402 SupportedOp now returns external name | Michael McGeagh
Previously the internal operator type was printed when checking the supported operator checks. This now converts that back to the external type name. Additionally removed dead code and changed the message for cpu-only ops Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com> Change-Id: Ib2b0cbcb49fdf63edb835828e266b079e63bae37
2020-11-06 | MLBEDSW-3212 Remove CLI opt ifm-ofm-overlap | Patrik Gustavsson
Removed the CLI opt ifm-ofm-overlap Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: I23faa0d10c3e71972c543e22e8155086fce73556
2020-11-04 | MLBEDSW-2412 All constraints have been refactored | Michael McGeagh
All existing constraints have now been refactored using the new framework. Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com> Change-Id: Ic9ba0d7040cb9f114b959a949bfdf777f86752c7
2020-11-04 | MLBEDSW-3275: Added infinity check for Relu scaling values | Jacob Bohlin
Added a supported_operators check for Relu activation functions. If the scaling value overflows to infinity, the operator will be placed on the CPU. Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com> Change-Id: I66b7bec062599609aadcbb7531caebbc45a7451f
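A check of this kind can be sketched as follows. The commit does not say how the scaling value is computed, so the sketch assumes it is the ratio of the input scale to the output scale; `relu_scaling_is_finite` and `place_relu` are hypothetical names, not Vela functions.

```python
import math


def relu_scaling_is_finite(ifm_scale, ofm_scale):
    """Return True when the (assumed) rescale factor is a finite number.

    Assumption: the scaling value is ifm_scale / ofm_scale. A zero
    output scale is treated as unsupported rather than dividing by it.
    """
    if ofm_scale == 0:
        return False
    return math.isfinite(ifm_scale / ofm_scale)


def place_relu(ifm_scale, ofm_scale):
    """Hypothetical placement decision: fall back to the CPU on overflow."""
    return "NPU" if relu_scaling_is_finite(ifm_scale, ofm_scale) else "CPU"


print(place_relu(0.5, 0.25))
print(place_relu(1.0, 5e-324))  # dividing by a tiny scale overflows to infinity
```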
2020-11-04 | MLBEDSW-1974: Set Scratch buffers size | Jacob Bohlin
Set the actual size of the Scratch and Fast Scratch buffer and remove both Scratch buffers from the subgraph inputs. Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com> Change-Id: I9e4213f48289d9136cdd4cd43c668d37c6af8530
2020-11-03 | MLBEDSW-2868 Separate scale+bias tensors | Patrik Gustavsson
Separate scale+bias tensors by different equivalence_id. Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: I674341950bc001ac6e4015206995f048a0dfee75
2020-10-30 | Vela: Fix wrong bandwidth | Diqing Zhong
- copy bandwidth compression rate when weight tensor is cloned Signed-off-by: Diqing Zhong <diqing.zhong@arm.com> Change-Id: I41c4c1f7001e8dc12af35695f5f5d02815e28351
2020-10-28 | MLBEDSW-3212 Enable overlap of elementwise input/output | Patrik Gustavsson
Enable overlap of elementwise input/output Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: I6e6f11953319c843c8203bf038f96778df194332
2020-10-26 | MLBEDSW-3283: Bug fix: StridedSlice Op is placed on CPU | Diqing Zhong
Signed-off-by: Diqing Zhong <diqing.zhong@arm.com> Change-Id: I91a3b277cda91dca3bad38908d4ed11a4f5d7d5f
2020-10-22 | MLBEDSW-3285: AttributeError Tensor has no attribute | Tim Hall
- Fixed typo in Tensor.is_quantized() Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I36156a6aa5aaff01c4f271a6a8325636173225f3
2020-10-21 | vela: Refactor operators to use Kernel objects | Tim Hall
- Normalise kernel availability by requiring all operators offer a kernel describing how much data they consume from the source, per OFM element, regardless of whether kernels are relevant to the operation. Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: Idbcff64879fc2eccf292b6208a7d2038eb388017
2020-10-21 | vela: Improve the scaling is equal check | Tim Hall
- Fixed and documented both tensor and quant params scaling checks
- Added quant params validity check and tensor quantisation check
- Added valid tensor checks to some graph optimisation functions
Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I8d6e8f03a603d28886dde511672c8399c85b794c
2020-10-21 | MLBEDSW-603: Improve cycle estimation in elementwise ops | Diqing Zhong
Signed-off-by: Diqing Zhong <diqing.zhong@arm.com> Change-Id: I9f3671041c2b1497519cf42b5f52e3cd01d9c10a (cherry picked from commit e8c989f5236cce12d07a6644329935dbbf0ee8e6)
2020-10-20 | MLBEDSW-3268: Refactor mark_tensors | Louis Verhaard
- Refactored mark_tensor_purpose
- Initial weight compression is now always done in insert_dma
- Removed mark_tensor_format
Change-Id: Ic719b9bcd1d27e1390d7b9ce8cd21795139ec814 Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
2020-10-19 | MLBEDSW-3194: Updated elementwise IFM banks count | Andreas Nevalainen
Signed-off-by: Andreas Nevalainen <andreas.nevalainen@arm.com> Change-Id: Ie404a0c13e7c7de0eff649f77e0147a0f3d73acd
2020-10-19 | MLBEDSW-2412 Refactor constraints for conv ops | Michael McGeagh
Using a new system to report constraints, replaced existing functionality for checking conv-like ops. This new system will allow reporting of all constraints regardless of any input network. Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com> Change-Id: If81177deca2a3b57c9dd9a3a08868cbc9cef0c23
2020-10-16 | MLBEDSW-3004: UnpackReshaped can't be serialised | Dwight Lidman
This commit fixes a bug where a rewritten Unpack operator is placed on the CPU and crashes Vela during serialisation due to the type having changed and there not being a mapping for the modified op type. The solution is to move the fixup_unpack_output function to the graph optimisation pass B, allowing the supported op check to run before it. Signed-off-by: Dwight Lidman <dwight.lidman@arm.com> Change-Id: Ic6bd4c70a478fd61adf377cb487f5b9253130314
2020-10-15 | MLBEDSW-3219: Suppress CPU info Const/Placeholder | Louis Verhaard
Suppress info print that Const/Placeholder/SubgraphInput are not supported on the NPU. Change-Id: I6f323b64185b01b619b584c1473ae61d010ab3a4 Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
2020-10-14 | Revert "MLBEDSW-3219: Suppress CPU info for Const/Placeholder" | patrik.gustavsson
This reverts commit 04986c0016e59993563490fe67052371fc0e1ad2. Reason for revert: Merged by mistake Change-Id: I150ad9ba7074ad1e80f21180aeba56a454d9f748
2020-10-14 | MLBEDSW-3219: Suppress CPU info for Const/Placeholder | Louis Verhaard
Suppress info print that Const/Placeholder/SubgraphInput are not supported on the NPU. Change-Id: I689d25481df0cd10487484c9f639e4253df081ee Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
2020-10-13 | vela: Improve extra info in constraint checks | Michael McGeagh
- Keeping the constraint functions consistent with each other
- Added specific tensor names in the extra info
- Added operator name to the warning generated
This should help easily identify specific problematic nodes in a graph and give a good enough explanation as to why they are placed on the CPU
Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com> Change-Id: Ie5bbdd31e5e75fe37e3d8bb8fee1d260080bce83
2020-10-13 | MLBEDSW-3219 Added info print for unsupported operator | Patrik Gustavsson
Added info print for unsupported operator Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: I1002d1c2249661bff17ef86d9500d1aeb2a1e38e
2020-10-12 | MLBEDSW-3230 Remove restriction of batching 16 for FC | Patrik Gustavsson
Vela supports batching of FC, restriction removed. Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: Ica56738f1b2676628644fc44f2039a24807f5ccb