aboutsummaryrefslogtreecommitdiff
path: root/ethosu/vela/high_level_command_stream_generator.py
AgeCommit message (Collapse)Author
2023-06-13MLBEDS-7714: Fix assert for cascaded Resize opJohan Alfven
- Cascading was recently enabled for Resize ops. A Resize op is transformed into several ops. In this case the last op is a DepthwiseConv2DBias using NEAREST resampling mode. This resampling/ upscaling is not taken into account when calculating the ifm box size, causing the coordinates to get out of bounds. - When generating the high level command stream there is a check to see if an op is a resize op. If this is the case an upscaling factor is calculated. The fix is to change this check to instead see if the operator is using NEAREST resampling mode. If that is true, the scaling factor should be used. Change-Id: I5308a383cc3310c53004ccfe2d6fabf256478a26 Signed-off-by: Johan Alfven <johan.alfven@arm.com>
2023-05-03MLBEDSW-7234: Vela assert on int16 Min+Reluwilisa01
fixed by using sched op instead of last pass op Change-Id: I2e03d39462ca07372d85c71e78189bd8c58a1b9c Signed-off-by: wilisa01 <william.isaksson@arm.com>
2023-03-14MLBEDSW-6260: Add support for using DMA to copy feature mapsJohan Alfven
- Reshape ops can be bypassed and there is no need to process them by the NPU. There are use cases when the IFM must be preserved so a memcpy is needed. This is implemented by an AvgPool. - In order to reduce the cost of the AvgPool the IFM can be copied by DMA. This is faster and also it can be turned into a real NOP in cases where the IFM and the OFM can use the same memory space. - Added new memcpy op. Only NHWC format supported since DMA can not change the format on the fly. - Allow ofm to reuse ifm for memcpy op - Make sure the DMA copy size is 16 byte aligned Change-Id: I3605a48d47646ff60d2bb3644dd3a23f872235a7 Signed-off-by: Johan Alfven <johan.alfven@arm.com>
2022-12-21MLBEDSW-7206: Fixed weight buffering problem in cascadingJohan Alfvén
- Fixed a problem where buffered weights were only used in the first stripe that was produced. The following stripes read the weights from permanent storage. Change-Id: I176909fa0e2edbecf80e8ec8ac136f42d5d3bcd4 Signed-off-by: Johan Alfven <johan.alfven@arm.com>
2022-11-16MLBEDSW-6620: Update copyright notice and yearsRickard Bolin
- Update copyright notices to use SPDX format and add OSS mail as contact. - Update years on files where it had been missed. Signed-off-by: Rickard Bolin <rickard.bolin@arm.com> Change-Id: I7e9715ea4e17b76252728c708e46df12ad67ab1f
2022-10-12MLBEDSW-6971 Fix output diff when cascading elementwise operatorsFredrik Svedberg
Fixed output diff when cascading elementwise operators with reversed operand order. Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com> Change-Id: Iac2e28cfb53037b929459af213f4fa7715b3e6de
2022-07-23MLBEDSW-4157: Add RESIZE_NEAREST_NEIGHBOR supportTim Hall
- Changed ResizeBilinear to support ResizeNearestNeighbor as well for 1x1 IFM, IFM equal OFM, and non-align corners - Added support for ResizeNearestNeighbor with align corners by converting to a DepthwiseConv - Updated supported operator unit tests - Added is_resize() helper function and some associated refactoring Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: Id5bdf2a25e8aa6a4f28b7236250abf768141ce37
2022-07-13MLBEDSW-6496 mlperf_deeplabv3_mnv2_ade20k_int8 fails at verify_output for u65Fredrik Svedberg
Added check to see if additional stripe data is needed from producer op when cascading to make sure the stripes are not overwriting data still being used. Also changed scheduler to make sure ResizeBilinear always runs with even stripe height. Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com> Change-Id: If7d723e6be29575c2b55c400eebbe8275a1aa328
2022-07-11MLBEDSW-6261: Elementwise cascadingerik.andersson@arm.com
Enabled elementwise cascading for binary/single variable IFM operators. Signed-off-by: erik.andersson@arm.com <erik.andersson@arm.com> Change-Id: I1c0867875fdc5c4980224fb570185c11e719d5cd
2022-05-16MLBEDSW-6263: Use separate tensors for double bufferingRickard Bolin
Uses separate tensors for the individual weight buffers in case of weight double buffering. Each weight buffer tensor gets its own individual live range. This patch is a clone of a previously reverted patch, but with some additional bug fixes applied. Signed-off-by: Rickard Bolin <rickard.bolin@arm.com> Change-Id: I868c70d15821eb9f1399186f2da6e7345f6ee343
2022-05-04Revert "MLBEDSW-6263: Use separate tensors for double buffering"Tim Hall
This reverts commit cc5f4de1c35ba44fca7ff6295c6ae846f8242344. Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I0fa5babfe9ad9ec668720d04fe1c16d9a9092131
2022-03-30Update version of Black to 22.3.0Jonas Ohlsson
Update version of Black to 22.3.0 due to updated dependencies. Updates to fix reported issues due to new version. Signed-off-by: Jonas Ohlsson <jonas.ohlsson@arm.com> Change-Id: I60056aae452093ce8dcea1f499ecced22b25eef1
2022-03-30MLBEDSW-6263: Use separate tensors for double bufferingLouis Verhaard
Uses separate tensors for the individual weight buffers in case of weight double buffering. Each weight buffer tensor gets its own individual live range. Change-Id: I724a8c61a7045615fbd2ed9535663076ac8edd13 Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
2022-01-12MLBEDSW-5534: Enet_640_640_int8 output diffRickard Bolin
The output diff is caused by not including the kernel dilation when calculating the bottom padding to be used on the last h_stripe. This only shows up when using dedicated_sram since shared_sram does not split into multiple h_stripes and thus uses the padding specified by the skirt instead. Signed-off-by: Rickard Bolin <rickard.bolin@arm.com> Change-Id: I7f643748b153004d65be2124c0ac6c9d21cd803f
2021-08-16MLBEDSW-4803: Output diff fix for MobileNetV33.1.0.rc1Dwight Lidman
This commit moves a piece of code back into a loop but with a flag to make sure that the code is only executed once per loop rather than potentially every iteration. This solves the issue of an output diff because of LUT DMAs occurring before weight DMAs. Signed-off-by: Dwight Lidman <dwight.lidman@arm.com> Change-Id: I3e597f0a955154af3d87febacea1b3920d53b7c2
2021-07-08MLBEDSW-4838 Added basic TOSA support.Patrik Gustavsson
Added basic TOSA support, enabling Vela to read and compile a .tosa file corresponding to CONV2D + Rescale + Clamp, and writing it to an optimized .tflite file. The optimized .tflite file, will in this case, hold a commandstream where the Rescale and Clamp has been fused into the CONV2D. The optimized tflite file is not output from Vela. -Added support to read .tosa file into Vela internal structure. - Added tosa_reader.py, tosa_mapper.py and helper files stored under tosa/ - Support for this limited to ~10 ops -Added reader_util.py for functions common for TOSA and TFLite -Added tosa_graph_optimiser.py -Added support to fuse Rescale into convolution -Modified handling for padding -Added support to fuse Clamp to previous op -Added graph_optimiser_util.py -Moved functions common for TOSA/TFLite graph optimization to this file. -Renamed graph_optimiser.py to tflite_graph_optmiser.py -Added separate tosa_supported_operators.py -Added supported_operator_util.py -For functions in common for TOSA/TFLite Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: Ic3c540504ec8c5eb4771397fdc6882050ecf33ab
2021-06-16MLBEDSW-4644 Removed unnecessary LUT DMA commandsJacob Bohlin
Fixed a bug where a DMA command for the activation LUT would be issued for every depth-slice of an operator. This caused multiple unnecessary DMA commands. Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com> Change-Id: I9c291692d8002f05656bb88214836ab389a56cdb
2021-06-08MLBEDSW-4602: Fix Deepspeech scale & bias reuse issue.Tim Hall
- Deepspeech reuses identical weights and biases throughout the network. Since biases are now interleaved with weights there is a scaling issue when the ifm scales differ between operations using the same weight and scale tensor. - This commit uses interleaved weights/scales on their first use but separates scales to source memory on subsequent use (if the ifm scale is different). Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I7aae163438160a919cae04e235966e75355a6148
2021-05-27MLBEDSW-4034: New Scheduler Size or Performance OptimisationTim Hall
- Merged dev/scheduler at 83639f90e8c828f70de6e29142355a940224959b Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I0050529d4b42da93768c7264296434dd877fb5b4
2021-04-09MLBEDSW-4073 Handle elementwise ops with same tensor for both inputsHenrik G Olsson
Signed-off-by: Henrik G Olsson <henrik.olsson@arm.com> Change-Id: I0e6bb46b7b91ed10f5bda34fba66d8b714560f47
2021-03-31MLBEDSW-4286: Bug fix Concat using IFM streamingLouis Verhaard
IFM box calculation was wrong because 2 variables were referencing/updating the same list. Signed-off-by: Louis Verhaard <louis.verhaard@arm.com> Change-Id: Ibed4e94c474682e14a6dd898029f14af11c9479a
2021-03-16MLBEDSW-4223: Full support for PAD operatorLouis Verhaard
- Added full support for PAD operator - Hardware padding is still used whenever possible - Bug fix Pad followed by max pool if IFM contains negative values Change-Id: Ifc64d1943737d94466f5e2821009dab12a49a965 Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
2021-02-11MLBEDSW-3774 Fix avoid cascading for spillingPatrik Gustavsson
Fix avoid cascading for spilling. Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: If86189bd1566eaa14387dfc2c02e3324ea6c184e
2021-02-11MLBEDSW-3774 Remove SplitSliceReadPatrik Gustavsson
Removed SplitSliceRead from subgraph during graph optimisation. Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: I9315d4c2a6767828dd2b4e66823d73b10ebee99c
2021-02-09MLBEDSW-3774 Removed ConcatSliceWritePatrik Gustavsson
-Removed ConcatSliceWrite from the optimised graph. Always executed as avgpool, which is equivalent with before the patch. -Added copy op to enable more removal of reshapes. Sg input/outputs need to remain. When Reshape input and outut, are sg input/outputs a copy op is needed to be inserted, in order to remove the reshape. Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: Id7be9966673ae34499e8518a5544104493fe326b
2021-02-08MLBEDSW-3953: Output diff in mobilenet_v3Diqing Zhong
Fixed two issues: - Cmd stream can be out of order in Ifmstreaming - In H32, LUT could be corrupted if blockdep is not 0 Change-Id: I2edd84429b93d83b2794f14937ce3fd279fd4a24 Signed-off-by: Diqing Zhong <diqing.zhong@arm.com>
2021-01-28MLBEDSW-3772 Reshape removalPatrik Gustavsson
-Removed reshapes in the original graph -Removed the addition of reshapes to the optimized graph -Reshapes with different ifm/ofm quantisation will remain Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: I94862be53dac0d7434815e2aee5ca678228495f8
2020-12-21Revert "Revert "MLBEDSW-3645 4D class for op ifm/ofm shapes""patrik.gustavsson
This reverts commit df0a5905177f3a1b836076bc3f9f39b2e86f1794. Reason for revert: <INSERT REASONING HERE> Change-Id: I891c66fb29db9d25e942947e8d1c29a10610de51
2020-12-21Revert "MLBEDSW-3645 4D class for op ifm/ofm shapes"patrik.gustavsson
This reverts commit bf31d647dc5df47410ee577b12427ddf076d816b. Reason for revert: <INSERT REASONING HERE> Change-Id: I7b6c585b7658f94dbaa916c2b6bfe9fb463b8d37
2020-12-21MLBEDSW-3645 4D class for op ifm/ofm shapesPatrik Gustavsson
Add 4D shape class for op Ifm/ofm shapes Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: Ic0a98da9d2f9d085605e39a9ab5a26bad6e702a3
2020-12-18MLBEDSW-3654 Add/use op ifm/ofm shapesPatrik Gustavsson
Add ifm/ofm shapes to op Changed to rely on these shapes Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: I571535a1dcadc2bdb04a3c727a8e1c49703b174d
2020-11-24MLBEDSW-3352 Fix incorrectly set default valueMichael McGeagh
Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com> Change-Id: I2e8384a044ee5458bc8c92562153b6383de5f17a
2020-11-13MLBEDSW-839: Code generation using external API2.0.0.rc1Louis Verhaard
Added external API to generate register command streams. Existing code generation has been refactored to make use of this API. Change-Id: Ibb4c2b167809869f16470b14da24f08a65c82b7b Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
2020-11-11MLBEDSW-3222: Bias tensors in fast storageAndreas Nevalainen
For IFM streamed cascades bias tensors are read several times. Moves these tensors to fast storage and add DMA commands. Change-Id: I630f6275986c1b5e3f126c925b11e22500fb1128 Signed-off-by: Andreas Nevalainen <andreas.nevalainen@arm.com>
2020-11-06MLBEDSW-3212 Remove CLI opt ifm-ofm-overlapPatrik Gustavsson
Removed the CLI opt ifm-ofm-overlap Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: I23faa0d10c3e71972c543e22e8155086fce73556
2020-10-12MLBEDSW-3154 Bug fix for LUT ops with IFM from SplitSliceReadLouis Verhaard
- Incorrect length check in high level command stream generator - Improved tensor names related to LUT based operations Change-Id: Ib8844a35a986e2dbef095df23f143f4633b255f9 Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
2020-10-08MLBEDSW-3148: Refactor OperationLouis Verhaard
- op.type is now an enum instead of a string - Removed unused operator codes - Refactored some attributes like npu_block_type, fused_activation_function - Refactored operator index calculation - Refactored a number of operator sets Change-Id: I641f65ee375794b7aec42abc0664251ae37d78e8 Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
2020-09-08optim: Fix issue with IFM streaming of LUTMichael McGeagh
Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com> Change-Id: I3c3ed73a6db39615ddf5987dc5696b6b09682be0
2020-09-01MLBEDSW-2903 Split mapping to tensor1.2.0.rc31.2.0Patrik Gustavsson
Split mapping to tensor Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: Ic143f3b4d37f6904edd8f119eff1d108f70b5026
2020-08-17MLBEDSW-2688: Improved LUT supportLouis Verhaard
- Support for more than one 256-byte LUT in SHRAM - No DMA is performed for a LUT that is already located in SHRAM - Added MemArea.Shram, used for LUT, to avoid false address collision asserts during SRAM tensor allocation - Added read access to LUT in memory access calculation Change-Id: If4d1eded5ed029d253f4f5efb2d80495fc3eac99 Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
2020-08-12MLBEDSW-2681: Ceiling the upscale for OFM/IFMCharles Xu
Signed-off-by: Charles Xu <charles.xu@arm.com> Change-Id: I566abd5a1ffc367c6b9b8f37d5a26b61d27e840b
2020-08-05[MLBEDSW-2335] SoftMax int16Fredrik Svedberg
Added graph rewrite of Softmax for int16. Change-Id: Id7885af6056a23e8b8362fb61ae94283251eb398 Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
2020-07-13MLBEDSW-2584: Support cascading of Transpose ConvolutionJacob Bohlin
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com> Change-Id: I39cff126dda89d71426ab731427ca1d64d02590d
2020-07-02MLBEDSW-2340: Make the tensor address default NoneCharles Xu
Signed-off-by: Charles Xu <charles.xu@arm.com> Change-Id: I53d9d56acee57cff208dccb4822c1f1a461c416d
2020-06-18Code clean-up using black and flake8Tim Hall
- No functional change Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I5ab1198b9d092cd041fa9b85b2dee9900d299bfc
2020-06-18MLBEDSW-2435: Fix for cascading upscaling operatorsJacob Bohlin
Fixed a coordinate issue which caused the compiler to crash when cascading upscaling operators such as ResizeBilinear. Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com> Change-Id: I982863573b0e5829e6d0c255dbbc308cb332a37a
2020-06-18MLBEDSW-1828: Ifm/ifm2 order is reversed in some cases of splitCharles Xu
Signed-off-by: Charles Xu <charles.xu@arm.com> Change-Id: Ib8d66f8b3c0467966165c1b53aeb7da7c8764c89
2020-06-18MLBEDSW-1941: Bug fix shared weightsLouis Verhaard
If same weight tensor was used with different block configs, errors would occur. Fixed by always cloning weight tensors, using a global weight compression cache and modifying the linear allocator to detect multiple usage of same weight compression. Change-Id: I91ca59176e1c59c66e0ac7a4227f2b5f0b47053f Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
2020-06-18Add elementwise vector scalars supportCharles Xu
Write the constant scalars into flash. In case it's Dram or OffChipFlash, DMA the scalars from flash to sram. Signed-off-by: Charles Xu <charles.xu@arm.com> Change-Id: I42300a05dfe968d623b8aec8549644549e0f54b5
2020-06-18Add reorder-python-import pre-commit hookDiego Russo
Also updated README.md Change-Id: I118309c61f4d00e8508d6b888c606995490fba39 Signed-off-by: Diego Russo <diego.russo@arm.com>