aboutsummaryrefslogtreecommitdiff
path: root/ethosu/vela/scheduler.py
AgeCommit message (Collapse)Author
2021-02-04MLBEDSW-3937 Fix moving FM to fast storagePatrik Gustavsson
Featuremaps were never moved to fast storage when tensor is set to not use NHCWB16. This patch enables the evaluation of feature maps to be moved fast storage, also when tensor use NHWC. Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: I6367c975e7af8739c774cb7c34b43fb9a6776c8c
2020-12-18MLBEDSW-3654 Add/use op ifm/ofm shapesPatrik Gustavsson
Add ifm/ofm shapes to op Changed to rely on these shapes Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: I571535a1dcadc2bdb04a3c727a8e1c49703b174d
2020-12-09Vela: bandwidth calculation improvementsDiqing Zhong
- Combine conv and vector_product calculation - Remove internal bandwidth - Remove blocks and hw_macs from report - Use scaled_bws for cycle estimation Related to: MLBEDSW-3598 Change-Id: I1927a8311ec563f68115e0f2ed077806b86fd717 Signed-off-by: Diqing Zhong <diqing.zhong@arm.com>
2020-11-27vela: Rename --keep-scale-placement CLITim Hall
- Changed to --cache-bias-scale-tensor Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I285fe253f03ba98eff36dbe996ad3a57e2ee3d99
2020-11-25MLBEDSW-3352 Fix ifm end_coord for upsamplingPatrik Gustavsson
-Fix for end_coord for upsampling -Remove restriction for ifm streaming -Added restriction for cascading on ResizeBilinear Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: I384abf12cfe8ac9ce7b76066b709600ea901b248
2020-11-25MLBEDSW-3352 Avoid ifm streaming for some casesPatrik Gustavsson
Changed so it is not allowed to do ifm-streaming for TransposeConv and ResizeBilinear Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: I85da279fae6202830c46e4a5500fb1b0dd6ef542
2020-11-23MLBEDSW-3468: Move of scale tensors to SRAM after weight compressorAndreas Nevalainen
After weight compressor weights has correct sizes. Placing move of scale tensors after weight compressor gives more accurate estimate of available SRAM for scale tensors. Change-Id: I4571780180778ef43e943c4e98048e17d6f33580 Signed-off-by: Andreas Nevalainen <andreas.nevalainen@arm.com>
2020-11-20MLBEDSW-3249: Vela config file examplesTim Hall
- Added sample vela.ini config file - Changed vela config format, split into system config and memory mode - Removed unused CPU cycle performance estimation - Added new CLI options for --memory-mode and --verbose-config - Changed CLI option --config to take multiple files - Removed CLI option --global-memory-clock-scales - Changed error helper functions to raise a VelaError exception - Refactored to create a new is_spilling_enabled function Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I27c41577e37a3859edb9524cd99784be10ef0a0d
2020-11-18MLMBED-3468: Update scale tensors SRAM size calculationAndreas Nevalainen
Updated SRAM size calculation for scale tensors. Change-Id: Idaecc3bf0c83d58ea70163bfd194c594295b66db Signed-off-by: Andreas Nevalainen <andreas.nevalainen@arm.com>
2020-11-11MLBEDSW-3222: Bias tensors in fast storageAndreas Nevalainen
For IFM streamed cascades bias tensors are read several times. Moves these tensors to fast storage and add DMA commands. Change-Id: I630f6275986c1b5e3f126c925b11e22500fb1128 Signed-off-by: Andreas Nevalainen <andreas.nevalainen@arm.com>
2020-11-10[MLBEDSW-3227] Improve u65 softmax performanceFredrik Svedberg
Improve u65 softmax performance by selecting more feature map tensors as SRAM candidates. Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com> Change-Id: I239c9dbebbf2a929004eb01bb0f3efe77f5b97aa
2020-11-06MLBEDSW-3212 Remove CLI opt ifm-ofm-overlapPatrik Gustavsson
Removed the CLI opt ifm-ofm-overlap Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: I23faa0d10c3e71972c543e22e8155086fce73556
2020-10-28MLBEDSW-3212 Enable overlap of elementwise input/outputPatrik Gustavsson
Enable overlap of elementwise input/output Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: I6e6f11953319c843c8203bf038f96778df194332
2020-10-08MLBEDSW-3148: Refactor OperationLouis Verhaard
- op.type is now an enum instead of a string - Removed unused operator codes - Refactored some attributes like npu_block_type, fused_activation_function - Refactored operator index calculation - Refactored a number of operator sets Change-Id: I641f65ee375794b7aec42abc0664251ae37d78e8 Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
2020-09-25MLBEDSW-2337: Intermediate feature maps in fast storageLouis Verhaard
Attempts to use fast storage for feature maps used in between cascaded passes. This is only relevant for system configurations where feature maps are by default not placed in SRAM, but there is SRAM for fast storage. Change-Id: I207b7cf32cfcb5bea3e6b93c2da1161c4af5221d Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
2020-09-21MLBEDSW-2816: Fix assert in schedulerDiqing Zhong
- Use non local memory as the base sram usage for a subgraph - Make avoid_for_spilling more generic for all mem configs Change-Id: I99cd30fe6a8ba075d5a70dc2138aa0635afaadb3 Signed-off-by: Diqing Zhong <diqing.zhong@arm.com>
2020-09-17MLBEDSW-2809: Redo the Tensor addressingJacob Bohlin
Added a static class TensorAddressMap that stores all Tensor addresses based on their equivalence_id. Made the "address" field into a property which getter and setter looks up/sets the tensor's address in TensorAddressMap. This makes the references to cpu_tensor/npu_tensor obsolete and they have been removed. Addition to scheduler: avoid SRAM spilling if an op has consumers in other subgraphs. Minor rework in LUTState; it will now assign a unique equivalence_id to the SHRAM lut tensor to avoid issues with addressing. The equivalent checks in LUTState now compares the values of the LUT instead of the the equivalence_id. Updated LUT unit tests accordingly. Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com> Change-Id: I41de5a8a4e5f07b77d6544d8d4034b754993e503
2020-08-28MLBEDSW-2889: NHCWB16 format issue at end of subgraphTim Hall
- Processing reshapes at the end of NPU subgraphs selected NHCWB16 tensor format before handing over to the CPU. This commit detects end-of-subgraph during the reshape-consumers compatibility check and chooses NHWC format instead. Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: Ieefdbecdba1a6183d79d3ac4d2505503dbf321cb
2020-08-27[MLBEDSW-2846] Do not use NHCWB16 for reduce_sum int32Fredrik Svedberg
Added checks for not using NHCWB16 for reduce_sum int32 which makes int8/uint8 softmax work. Also enabled softmax graph rewrite by default and fixed a saturation problem. Change-Id: Ic01bd9ece7e5c3edb2900b7915cc747efe9e5760 Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
2020-08-26MLBEDSW-2686: Use NPU tensor format for noop reshapes.1.2.0.rc2Tim Hall
- Reshapes that merely add/remove dimensions, rather than re-layout the data need not fall back to NHWC. This commit allows reshapes betweeen NPU operators to use NHCWB16. Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: Ieb7745e586bf324e92e741a04b74caf7285f4b8b
2020-08-26MLBED-2822 Added CLI-opt for weight size est.Patrik Gustavsson
Added --weight-estimation-scaling, which enables additional scaling of weight compression scale estimate. Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: Idcda41257f44901d3a3f345341e07fb1ae8585a9
2020-08-14MLBEDSW-2570 Avoid usage of NHCWB16 for some casesPatrik Gustavsson
Avoid usage of NHCWB16 when Stack/Pack/Concat is performed in axis 3, and the "concat start" of each slice to be combined is not a multiple of 16. Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: If3f7b4a3424be3c86fc2dc48e8649ce4c4f49485
2020-08-12MLBEDSW-2696 Fix Sram exceeded for Sram spillingPatrik Gustavsson
Avoid concat op as predecessor in ifm streaming, when Sram spilling is to be applied. Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: I2ba6283a7561a12d54a06552a15e122bb082b7a1
2020-07-07MLBEDSW-2551 Add support for more mem-cfgsPatrik Gustavsson
Added support for one more memory configuration- Change-Id: Iac19992386e3e9b80bd519acb1b0a399c47d736f Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
2020-06-25MLBEDSW-2306 Added more supported mem-cfgsPatrik Gustavsson
Additional supported memory configurations: -Permanent_storage = DRAM -Tensor arena either in DRAM or SRAM Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: I20beb7151e306bfdba540e7c0b2a7b478b4d94e1
2020-06-18Code clean-up using black and flake8Tim Hall
- No functional change Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I5ab1198b9d092cd041fa9b85b2dee9900d299bfc
2020-06-18MLBEDSW-2432: Retain pass order for CPU subgraphCharles Xu
Signed-off-by: Charles Xu <charles.xu@arm.com> Change-Id: I92b18262608415e84266d2903e17fc5112793a38
2020-06-18MLBEDSW-2370: Add CLI option for NHCWB16Charles Xu
Make it configurable for using NHCWB16 between cascaded passes. Signed-off-by: Charles Xu <charles.xu@arm.com> Change-Id: I259cdaa424d11ea38f17e671490ad1e630bbae44
2020-06-18Add reorder-python-import pre-commit hookDiego Russo
Also updated README.md Change-Id: I118309c61f4d00e8508d6b888c606995490fba39 Signed-off-by: Diego Russo <diego.russo@arm.com>
2020-06-18Add pre-commit support for sanity checksDiego Russo
Use pre-commit framework [1] to run black and flake8 before the commit. black and flake8 are managed by the pre-commit framework and they can be run manually by the user using `pre-commit run` command. Fix the code base with the help of black and flake8. Fix import statements according to PEP8 guidelines [1] Both tools have the following settings (specified in the pre-commit configuration file): * line length: 120 characters * directory to exclude: ethosu/vela/tflite/ and ethosu/vela/ethos_u55_regs Updated README.md on how to install pre-commit and how to run sanity checks. Pipenv files have been updated including new dependencies for pre-commit. [1]: https://www.python.org/dev/peps/pep-0008/#imports [2]: https://github.com/pre-commit/pre-commit Change-Id: I304d9fffdf019d390ffa396a529c8a7c2437f63d Signed-off-by: Diego Russo <diego.russo@arm.com>
2020-06-18MLBEDSW-1370: Use NHCWB16 between NPU opsPatrik Gustavsson
Added support for using NHCWB16 between cascaded passes. (For Reshape format is kept to NHWC) Change-Id: I0ef1631984fec89fe09999b64ae69563e2aefc9b Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
2020-04-29Add Vela codebase0.1.0Tim Hall
- Added modules ethosu.vela and ethosu.mlw_codec. - Added README and various configuration files. Change-Id: I3690f8c8f5966306ecddaeb2793c30ca9c6e2eee