Age  Commit message  Author
2022-08-18  MLBEDSW-6844: Exclude resize ops from cascades  [tag: 3.5.0.rc2]  (Johan Alfvén)
Remove resize ops completely from being cascaded since there are corner cases which are not currently handled. Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: I9923f8e119af7bdc0e93b0e69b521b399e0629af
2022-08-17  MLBEDSW-6769: Fix odd stripe heights for upscaling  (erik.andersson@arm.com)
Output diffs were found to be caused by odd input stripe heights, even though the input was produced by an upscaling operator. Signed-off-by: erik.andersson@arm.com <erik.andersson@arm.com> Change-Id: Ia3791d815250364cfe7a38c3ed0e30768d64ca08
2022-08-17  MLBEDSW-6645: MLCE: Optimize SRAM usage  (Johan Alfvén)
- When compiling for shared SRAM, the old scheduler has an option that lets it produce a smaller SRAM footprint than the new scheduler manages to produce, because it was able to create more/longer cascades. To improve the new scheduler, the following has been implemented:
- Take persistent IFMs into account when creating the min schedule.
- Choose longer cascades when doing so reduces the total SRAM usage compared to shorter cascades.
- Updated the calculation of estimated SRAM usage for elementwise ops.
Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: I209bbf2d94425e4f6aacb1d151b3b2aa65c0870b
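The cascade-length choice described above can be pictured roughly as below. This is only a sketch of the idea, not the Vela scheduler code; choose_cascade, persistent_ifm_size and the example numbers are hypothetical.

```python
def choose_cascade(options, persistent_ifm_size):
    # options: (cascade_length, peak_feature_map_sram) candidates for one chain of ops.
    # Persistent IFMs stay resident for the whole schedule, so add them to every
    # estimate; prefer the lowest total SRAM and, on ties, the longer cascade.
    def key(option):
        length, peak_sram = option
        return (peak_sram + persistent_ifm_size, -length)
    return min(options, key=key)

# e.g. choose_cascade([(2, 96_000), (5, 64_000)], persistent_ifm_size=16_000) -> (5, 64000)
```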
2022-08-17  MLBEDSW-6830: MLCE: Fix assert on concat op  (Johan Alfvén)
- The compiler will assert when compiling a faulty concat op. In the reported use case, there were 3 inputs with shape 1x1x2 but the output shape was 1x1x2 (expected to be 1x1x6).
- The solution is to add constraints to the concat operator.
Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: I94a505c51a9fd54d1aa92531a0415031db52378a
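A minimal sketch of the kind of shape constraint described here, assuming a generic check rather than Vela's actual supported-operators code (the function name and signature are made up):

```python
def concat_shapes_valid(ifm_shapes, ofm_shape, axis):
    # All inputs must match the output on every dimension except the concat
    # axis, and the concat-axis sizes must sum to the output size
    # (e.g. three 1x1x2 inputs -> 1x1x6 output, not 1x1x2).
    for shape in ifm_shapes:
        if len(shape) != len(ofm_shape):
            return False
        if any(s != o for i, (s, o) in enumerate(zip(shape, ofm_shape)) if i != axis):
            return False
    return sum(shape[axis] for shape in ifm_shapes) == ofm_shape[axis]

assert concat_shapes_valid([[1, 1, 2]] * 3, [1, 1, 6], axis=2)
assert not concat_shapes_valid([[1, 1, 2]] * 3, [1, 1, 2], axis=2)
```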
2022-08-16  MLBEDSW-6825: Restrict NumPy version limit to 1.21.3  (Rickard Bolin)
There is an issue with using NumPy 1.21.4 or above in setup.py with Python 3.7. The restriction can most likely be removed when upgrading to Python 3.8. Signed-off-by: Rickard Bolin <rickard.bolin@arm.com> Change-Id: I9f826201d68bb5ab61f5bf76c7796442d34447b9
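For illustration only, a pin of this kind could look as follows in a setup.py. This is a hypothetical fragment, not the actual Vela setup.py; the package name and the split into two environment markers are assumptions.

```python
from setuptools import setup

setup(
    name="example-package",  # hypothetical package name
    install_requires=[
        # NumPy 1.21.4+ is problematic on Python 3.7, so cap it there; newer
        # Pythons are free to take newer NumPy once the 3.8 upgrade happens.
        'numpy<=1.21.3; python_version < "3.8"',
        'numpy; python_version >= "3.8"',
    ],
)
```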
2022-08-16  MLBEDSW-6640: Modify elementwise block size selection  (Rickard Bolin)
Limit the relative cost to 1 for elementwise operations, since increasing the block size when the full OFM already fits gives no additional benefit. Signed-off-by: Rickard Bolin <rickard.bolin@arm.com> Change-Id: Ib6128f6346834fd916efa59adbe07a069dbda0ae
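As a rough sketch of the idea only (not the actual block-config code; the function and parameter names are made up):

```python
def elementwise_relative_cost(cost, ofm_fits_in_block):
    # Once the whole OFM fits in the chosen block, a bigger block cannot help,
    # so cap the relative cost at 1 to stop larger blocks from looking better.
    return min(cost, 1.0) if ofm_fits_in_block else cost
```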
2022-08-10  Revert reversion of TensorFlow 2.9 update  [tag: 3.5.0.rc1]  (erik.andersson@arm.com)
With the errors caused by the previous TensorFlow 2.9 update being fixed, we can proceed with the upgrade. Signed-off-by: erik.andersson@arm.com <erik.andersson@arm.com> Change-Id: Ie1f025e8d984efaebc68b8d051126d49bee6b2b8
2022-07-23  MLBEDSW-4157: Add RESIZE_NEAREST_NEIGHBOR support  (Tim Hall)
- Changed ResizeBilinear to also support ResizeNearestNeighbor for 1x1 IFM, IFM equal to OFM, and non-align corners
- Added support for ResizeNearestNeighbor with align corners by converting to a DepthwiseConv
- Updated supported operator unit tests
- Added an is_resize() helper function and some associated refactoring
Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: Id5bdf2a25e8aa6a4f28b7236250abf768141ce37
2022-07-23  MLBEDSW-6616: ResizeBilinear align corners is incorrect  (Tim Hall)
- Fixed align corners support when converting into upscale and average pool. The problem was due to using the wrong IFM-to-OFM size ratio, causing a scaling factor that was not 2x/4x/8x. Works for uint8, int8 and int16.
- Fixed the checking of align corners in the supported operators check
- Added an additional supported operators check for the size tensor
- Updated and added more supported operators unit tests
Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: Idb78fa9e76ede2c37e8ac6cb1c322154bd156898
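The ratio in question can be illustrated like this; a sketch of the align-corners arithmetic only, with a hypothetical helper name, not Vela's actual check:

```python
def upscale_factor(ifm_size, ofm_size, align_corners):
    # With align_corners the corner pixels of IFM and OFM coincide, so the
    # ratio is taken over (size - 1); otherwise it is the plain size ratio.
    if align_corners:
        return (ofm_size - 1) / (ifm_size - 1)
    return ofm_size / ifm_size

# e.g. a 5x5 -> 9x9 resize with align_corners is a clean 2x upscale,
# whereas using 9 / 5 would wrongly reject it:
assert upscale_factor(5, 9, align_corners=True) == 2.0
```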
2022-07-23  vela: OFM_SCALE refactor  (Tim Hall)
- Minor rework at the register command stream level Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I58495e40efa3a95bdf6febde530f9f73fa8be30b
2022-07-19  MLBEDSW-6700: Fix compiler assert when fusing tensors  (Johan Alfvén)
If an elementwise op is part of a cascade, the IFM cannot be overwritten by the OFM. Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: I1e5f7ee501be17e76684b33c6e86ab8af0f3e61f
2022-07-19  MLBEDSW-6710: Revert Tensorflow 2.9  (Johan Alfvén)
TensorFlow 2.9 contains a bug for int16x8 without biases. Revert "MLBEDSW-6635: Update to TensorFlow 2.9". This reverts commit 93f492bae9c4dd16a1f64b851b237263695ee03e. Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: I366d201ce4134a877d333be2aade546dfcb5d6d7
2022-07-15  MLBEDSW-6703 Add SHAPE operator to supported operators  (Fredrik Svedberg)
Added the SHAPE operator to the supported operators report. Updated the constraints for the QUANTIZE and SHAPE operators. Also fixed RESHAPE consuming a statically optimised shape. Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com> Change-Id: I1d964d602d3f361a0f16dae8133197280dd84c48
2022-07-14  MLBEDSW-6635: Update to TensorFlow 2.9  (erik.andersson@arm.com)
Update the flatbuffers generated code to comply with TensorFlow 2.9 Signed-off-by: erik.andersson@arm.com <erik.andersson@arm.com> Change-Id: I6bf506ffb85da2d4a57a32198b471513deeaca73
2022-07-13  MLBEDSW-6496 mlperf_deeplabv3_mnv2_ade20k_int8 fails at verify_output for u65  (Fredrik Svedberg)
Added a check to see whether additional stripe data is needed from the producer op when cascading, to make sure the stripes do not overwrite data that is still being used. Also changed the scheduler to make sure ResizeBilinear always runs with an even stripe height. Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com> Change-Id: If7d723e6be29575c2b55c400eebbe8275a1aa328
2022-07-13  MLBEDSW-6687 Vela crashes in npu_serialisation.py and tflite_graph_optimiser.py  (Fredrik Svedberg)
Fixed the static optimisation of the Quantize operator by running unsupported formats on the CPU. Also added support for int16 and corrected the calculation. Change-Id: I861c712aa6258dba53fcf4d5dae45d1d416e6141 Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
2022-07-12  MLBEDSW-4856: Removed dead code  (oliper01)
The Hardswish activation function gets converted to a LUT in the graph optimizer, so the case handling it was never called and has been removed. Signed-off-by: oliper01 <oliver.perssonbogdanovski@arm.com> Change-Id: I376e8d7b81489c06b66d4e49f59b207600c0ccce
2022-07-11  MLBEDSW-6261: Elementwise cascading  (erik.andersson@arm.com)
Enabled elementwise cascading for binary/single variable IFM operators. Signed-off-by: erik.andersson@arm.com <erik.andersson@arm.com> Change-Id: I1c0867875fdc5c4980224fb570185c11e719d5cd
2022-06-29  MLBEDSW-6314 Static optimisation for quantise OP  (Ayaan Masood)
* The Quantise op becomes constant if its input is known at compile time
* Quantised values are calculated if the op's input is const and float
* Const inputs to the quantise op that are int are requantised
Change-Id: Ic94a72a392af709fe6a640d7dacbb5dc2334f16f Signed-off-by: Ayaan Masood <Ayaan.Masood@arm.com>
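The constant-folding arithmetic being described is the usual affine quantisation; a small sketch under that assumption, with a made-up helper name rather than Vela's actual implementation:

```python
import numpy as np

def quantize_const(values, scale, zero_point, dtype=np.int8):
    # Fold a Quantise op whose float input is known at compile time into a
    # constant: q = round(x / scale) + zero_point, clamped to the dtype range.
    info = np.iinfo(dtype)
    q = np.round(np.asarray(values, dtype=np.float64) / scale) + zero_point
    return np.clip(q, info.min, info.max).astype(dtype)

# e.g. quantize_const([0.0, 0.5, 1.0], scale=1 / 128, zero_point=0) -> [0, 64, 127]
```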
2022-06-29  MLBEDSW-6313 Static optimisation for Shape OP  (Ayaan Masood)
* The Shape OP value is available at compile time, hence it can be optimised
* Disconnected the Shape OP from its parent tensor at compile time
* Transformed the Shape OP tensor into a constant
Change-Id: I0a024269e2b592c6146dd72e62d7a41951fb727a Signed-off-by: Ayaan Masood <Ayaan.Masood@arm.com>
2022-06-27  MLBEDSW-6639: Bug fix for evicted FMS in the fast storage allocator  (Johan Alfvén)
- The fast storage allocator is supposed to add all feature maps that do not fit in SRAM to an evicted list. However, when conflicting tensors were handled, the list was not updated.
- This patch makes sure the list is updated correctly.
Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: Ibeb3b4e4927f22a8206784a478f1ac38bd7f5a87
2022-06-20  MLBEDSW-6347: Improved fast storage allocator  (Johan Alfvén)
- The fast storage allocator only looked at tensor size, giving priority to larger tensors. The problem with this method is that it does not consider the actual read/write access of the tensor, so a smaller tensor can cause more memory transactions than a bigger one.
- The solution is to calculate the read/write access of each tensor and add that score to the decision when deciding where to place the tensors.
Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: I59eb9bd3a44a0238b576cfd8f09ff27012b99070
2022-06-17  MLBEDSW-6614 Improve elementwise block size selection  (Fredrik Svedberg)
Improved block size selection by favouring larger block sizes for elementwise operations. Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com> Change-Id: I5b30b358d84fcd672935b863c2154bd8f4ccd928
2022-06-08  MLBEDSW-4783: Make config handling more user friendly  (Rickard Bolin)
Vela was not able to parse config file paths entered with forward slashes. This patch makes it possible to use both forward slashes and backslashes when specifying paths. Signed-off-by: Rickard Bolin <rickard.bolin@arm.com> Change-Id: I0f4cfc16bde5738c73059af6216d2bdc3821c68b
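One way to accept both separators is sketched below; this is an assumption about the approach, not the actual Vela patch, and normalise_config_path is a hypothetical helper:

```python
from pathlib import PurePosixPath, PureWindowsPath

def normalise_config_path(user_path):
    # PureWindowsPath splits on both "\" and "/", so rebuilding the parts as a
    # POSIX path accepts either separator from the command line.
    return str(PurePosixPath(*PureWindowsPath(user_path).parts))

assert normalise_config_path("config_files\\Arm\\vela.ini") == "config_files/Arm/vela.ini"
assert normalise_config_path("config_files/Arm/vela.ini") == "config_files/Arm/vela.ini"
```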
2022-05-24  MLBEDSW-6422: Update release notes  [tags: 3.4.0.rc3, 3.4.0]  (Tim Hall)
- Updated release notes and setup.py tag for 3.4
- Regenerated supported ops information
Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I4ec88544b84cab168cb3e5cbc6bc392b6b3d8a39
2022-05-24  MLBEDSW-4783: Fix issue with relative paths to config files  (Rickard Bolin)
One-level-deep relative paths (i.e. ./vela.ini) were treated as if "." were the name of a folder in config_files. They are now treated as relative paths. The warning message shown when using an absolute path has also been moved to the error message instead, for a better user experience. Signed-off-by: Rickard Bolin <rickard.bolin@arm.com> Change-Id: I7f7d4f904b9fbba97593e42203566057a2d36925
2022-05-24  MLBEDSW-6593: Issue with finding some config files  (Rickard Bolin)
The argument to the lstrip function is a set of characters that are stripped from the beginning of the string, in any order. To remove the actual prefix, check whether the string starts with the prefix and then remove that number of characters. The function "removeprefix", added in Python 3.9, does exactly this, but it is not yet available to Vela since Vela still supports Python 3.7. Signed-off-by: Rickard Bolin <rickard.bolin@arm.com> Change-Id: Ibc5a173c6d422cb5f55feb80caef6c5c30cf7d39
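To illustrate the pitfall (the file name "cperf.ini" is just an example, not from the original commit):

```python
prefix = "config_files/"
name = "config_files/cperf.ini"

# lstrip() treats its argument as a set of characters, so it keeps stripping
# any of "c", "o", "n", "f", "i", "g", "_", "l", "e", "s", "/" and eats the
# leading "c" of the file name:
assert name.lstrip(prefix) == "perf.ini"

# A Python 3.7-compatible equivalent of str.removeprefix():
def remove_prefix(text, prefix):
    return text[len(prefix):] if text.startswith(prefix) else text

assert remove_prefix(name, prefix) == "cperf.ini"
```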
2022-05-23  MLBEDSW-6406: Restrict numpy version limit  (Tim Hall)
- The latest numpy versions require Python 3.8
- This can cause issues if Python 3.7 is installed, which is the version that Vela is tested against
- The fix is to limit the numpy version to those that support Python 3.7
Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I3a388976d5aa76395ca93202e496640c8de9f6f4
2022-05-19  MLBEDSW-6563: networks failing with memory area exceeded in vela  [tag: 3.4.0.rc2]  (Tim Hall)
- For allocations that have a hard memory limit, the Hill Climb allocator should be given more attempts to find a solution that fits
- The fix is to use a memory limit when there is a hard constraint, and a minimum iteration count, reset on every improvement, when there is a soft constraint
- Added a maximum-number-of-iterations CLI option
Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I19ff53a0b68412de280263626778a3102cbe52fa
2022-05-19  MLBEDSW-6296: improvement_dram can become NaN  (Tim Hall)
- The problem is due to a divide by zero
- The fix is simply to detect this and assign zero. The same could also affect improvement_sram
Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I29a67710a17ef22656fb5ecfe9476953ffa5533d
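The guard amounts to something like the following; a sketch only, with hypothetical names, not the actual npu_performance code:

```python
def relative_improvement(baseline_cycles, new_cycles):
    # A zero baseline would turn the division into NaN/inf, so detect it and
    # report zero improvement instead.
    if baseline_cycles == 0:
        return 0.0
    return (baseline_cycles - new_cycles) / baseline_cycles
```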
2022-05-19  MLBEDSW-6271: Key error when using --verbose-performance option  (Rickard Bolin)
- The print_performance function that is called when using the --verbose-performance option crashed with KeyError when no SRAM was used. Signed-off-by: Rickard Bolin <rickard.bolin@arm.com> Change-Id: Ib6af3193e8f4f368cb28d51e65afa0751773628a
2022-05-19  MLBEDSW-6384: Updated weight buffering cycle calculation  (Johan Alfvén)
- The NPU cycles are not correctly calculated when only one weight buffer is used, since weights cannot be fetched in parallel.
- Added a new calculation for the single-buffer case.
Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: I8568912d11d137a298225ab77b8b3272613c76f6
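Roughly, the difference between the two cases is the one sketched below; this is a simplified model with made-up names, not the actual cycle estimator:

```python
def estimated_npu_cycles(compute_cycles, weight_fetch_cycles, double_buffered):
    # With double buffering the weight fetch overlaps the compute of the
    # previous stripe; with a single weight buffer it cannot, so the fetch
    # time adds to the compute time instead of hiding behind it.
    if double_buffered:
        return max(compute_cycles, weight_fetch_cycles)
    return compute_cycles + weight_fetch_cycles
```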
2022-05-19  MLBEDSW-6430: MLCE: Update to graph has sequential ethos-u ops  (Johan Alfvén)
Update to the "Vela splitting network into two ethos operators" patch allowing the CPU pass to be moved last in the pass_list. Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: I2e8a299101e5d65e963327bed7c8d891fff6523e
2022-05-18  MLBEDSW-6430: MLCE: Vela splitting network into two ethos operators  (Johan Alfvén)
- Due to how the graph is traversed, the final pass list contained unnecessary multiple Ethos-U operators. Functionality-wise this is not a problem, but it adds extra context switching between CPU and NPU.
- By applying sorting rules to the pass list, it is possible to create a more optimal pass list that reduces the number of Ethos-U operators.
Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: Ib556f902e1f321b5c50238fada7aa92b9810b27a
2022-05-18  MLBEDSW-4783: Add config file directory structure  (Rickard Bolin)
Add a directory structure to support third-party config files. Config files should now be placed in an appropriately named directory under the config_files directory, but can also be accessed by providing their absolute path to vela --config. Signed-off-by: Rickard Bolin <rickard.bolin@arm.com> Change-Id: I2fcf52e7b2ddd2c4491dc370c85c0b3937d18062
2022-05-17  MLBEDSW-6271: MLCE: Layer wise Utilization info from Vela  (Tim Hall)
- Added support to print per-operator SRAM usage and performance information
- Added a new CLI option --verbose-performance to control this feature
Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I368599b410e5d441d9804871fc51b7a1049d85b3
2022-05-17  MLBEDSW-6296: Updated condition for the opt size weight buffering schedule  (Johan Alfvén)
Allow the schedule to be used when the calculation shows zero total improvement but does show a DRAM improvement. When testing on a real target, total performance is improved. Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: Ib4f2a37710dc7954b72b48c38fce4817ccd7187b
2022-05-16  MLBEDSW-6263: Use separate tensors for double buffering  (Rickard Bolin)
Uses separate tensors for the individual weight buffers in case of weight double buffering. Each weight buffer tensor gets its own individual live range. This patch is a clone of a previously reverted patch, but with some additional bug fixes applied. Signed-off-by: Rickard Bolin <rickard.bolin@arm.com> Change-Id: I868c70d15821eb9f1399186f2da6e7345f6ee343
2022-05-12  MLBEDSW-6296: Regression caused by bigger weight buffering size  [tag: 3.4.0.rc1]  (Johan Alfvén)
- Because bigger weight buffer sizes are being used, there are use cases where feature maps are evicted from SRAM, causing the total performance to drop.
- A way to improve this is to limit the memory for those weight buffer ops, to get the feature maps back into SRAM, and see if total performance is improved.
Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: Ibfaff330677185186af9f6362dfbe04824a329f6
2022-05-11  MLBEDSW-6454: Enable ReLu with negative alpha value  (Johan Alfvén)
Removing constraint for negative alpha value in ReLu for int8 and uint8. Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: Id7a3a30bf5d1f0a591f990bd04cd0dbbad5819c6
2022-05-11  MLBEDSW-6518: Change to Python 3.7  (Dwight Lidman)
This commit downgrades the required Python version to 3.7 from 3.8. Signed-off-by: Dwight Lidman <dwight.lidman@arm.com> Change-Id: I07057908b97bcd94663f001474d877ba41411ae1
2022-05-11  MLBEDSW-6452: Add byte offset in command stream  (Tim Hall)
- Added the offset address to the command stream disassembly Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I55c6ef59878c90c21d41051c076da6c1f0fa4201
2022-05-11  Revert "MLBEDSW-6312: Find block config improvement"  (Tim Hall)
This reverts commit d2b5510697e7789f5a416f9d80d3cb640eecc092. Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: Ia3043bc9c27fe2f72f3ab2f6f7341b3a9adb4231
2022-05-09  MLBEDSW-6500: Address offset out of range  (Johan Alfvén)
- Cascading a slice operator with read offsets is not supported by the rolling buffer mechanism, causing the address to go out of range.
- The fix is to prevent ops from being cascaded if they have read offsets.
Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: Iea7f054ac4b5a7dadf905bbe947033247284c27e
2022-05-04  Revert "MLBEDSW-6263: Use separate tensors for double buffering"  (Tim Hall)
This reverts commit cc5f4de1c35ba44fca7ff6295c6ae846f8242344. Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I0fa5babfe9ad9ec668720d04fe1c16d9a9092131
2022-04-27  MLBEDSW-6425: Update to TensorFlow 2.8 (bugfix)  (Rickard Bolin)
Generate flatbuffer files with relative imports. Signed-off-by: Rickard Bolin <rickard.bolin@arm.com> Change-Id: Idd59bb2ebb829bc42677920577c1f8a04e23ca68
2022-04-27  MLBEDSW-6425: Update to TensorFlow 2.8  (Rickard Bolin)
Update the flatbuffers generated code to comply with TensorFlow 2.8 Signed-off-by: Rickard Bolin <rickard.bolin@arm.com> Change-Id: Ia65325b88745e49dbafa803a38c0ea0e7d0478ba
2022-04-21  MLBEDSW-5384 FC layers run on NPU if underlying shape is 2D  (Ayaan Masood)
* Added a generic function which checks whether the underlying shape of a FullyConnected operation is 2D and performs the shape reduction
* Fully connected operations with more than 2 dimensions now run on the NPU if the above case is satisfied
* constraint_fc_output_2d and rewrite_fully_connected_input refactored
* Added a unit test to confirm this functionality
Signed-off-by: Ayaan Masood <Ayaan.Masood@arm.com> Change-Id: I0e29c767e5b84841eb53bbc44464b36a454f7b38
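The "underlying 2D shape" check can be sketched as follows; the helper name and signature are assumptions for illustration, not the actual Vela function:

```python
from functools import reduce

def underlying_2d_shape(ifm_shape, num_input_features):
    # A >2D FullyConnected input is really a 2D matmul if its last dimension
    # matches the weights' input-feature count; everything else folds into
    # the batch dimension.
    if ifm_shape[-1] != num_input_features:
        return None
    batch = reduce(lambda a, b: a * b, ifm_shape[:-1], 1)
    return [batch, ifm_shape[-1]]

assert underlying_2d_shape([1, 4, 8, 16], 16) == [32, 16]
```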
2022-04-20  MLBEDSW-6407: Vela fails with TypeError in npu_performance  (Tim Hall)
- This is due to calling range() on a non-integer value, which in turn is due to a change in the behaviour of round() on numpy.float64 values
- The fix is to always force the output of round() to be an integer and thereby stop whole-number floating point values propagating into the kernel dimensions, which later feed into range().
Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: Ic75cb6ba85a90c81c1d762067d89a10caaa13b92
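The failure mode and fix look roughly like this; the variable names are hypothetical and the NumPy behaviour is version-dependent, so treat it as an illustration rather than the actual npu_performance code:

```python
import numpy as np

kernel_h = np.float64(8.0) / 2   # numpy.float64 with the whole-number value 4.0

# Depending on the NumPy version, round() on a numpy.float64 can return a
# float rather than a Python int, and range() then raises
# "TypeError: 'numpy.float64' object cannot be interpreted as an integer":
#     range(round(kernel_h))

safe_h = int(round(kernel_h))    # force an integer before it reaches range()
for y in range(safe_h):
    pass
```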
2022-04-20  MLBEDSW-6371: Output diff caused by operator clone bug  (Rickard Bolin)
- Modify the operator clone function to also clone the resampling mode attribute. A previous patch changed the IFM resampling mode to be an attribute of an operator rather than of a tensor, but did not modify the operator clone function to clone the new attribute. Signed-off-by: Rickard Bolin <rickard.bolin@arm.com> Change-Id: I7a2f6103666a0997f657de20ad962e849976b904