path: root/ethosu
Age | Commit message | Author
2022-05-24 | MLBEDSW-4783: Fix issue with relative paths to config files | Rickard Bolin
One-level-deep relative paths (e.g. ./vela.ini) were handled as if "." were the name of a folder in config_files. They are now treated as relative paths. The warning message shown when using an absolute path has also been moved into the error message for a better user experience. Signed-off-by: Rickard Bolin <rickard.bolin@arm.com> Change-Id: I7f7d4f904b9fbba97593e42203566057a2d36925
2022-05-24 | MLBEDSW-6593: Issue with finding some config files | Rickard Bolin
The argument to the lstrip function is a set of characters that should be stripped from the beginning of the string, in any order. To remove an actual prefix, check whether the string starts with that prefix and then remove that number of characters. The function "removeprefix", added in Python 3.9, does exactly this, but it is not yet available to vela since vela supports Python 3.7. Signed-off-by: Rickard Bolin <rickard.bolin@arm.com> Change-Id: Ibc5a173c6d422cb5f55feb80caef6c5c30cf7d39
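The difference between lstrip and real prefix removal can be sketched as follows. This is a minimal illustration, not the code from the patch; the helper name remove_prefix and the example path are assumptions made for the sketch.

```python
def remove_prefix(text: str, prefix: str) -> str:
    """Drop `prefix` from the start of `text` if present.

    Hypothetical helper that mirrors str.removeprefix (Python 3.9+)
    so that it also works on Python 3.7.
    """
    if prefix and text.startswith(prefix):
        return text[len(prefix):]
    return text


# Illustrative path only, not taken from the vela sources.
path = "config_files/cfg/file.ini"

# lstrip treats its argument as a set of characters, so it keeps stripping
# while the leading character is any of "config_files/".
print(path.lstrip("config_files/"))          # .ini

# Prefix removal strips exactly one leading occurrence of the string.
print(remove_prefix(path, "config_files/"))  # cfg/file.ini
```

Every character of "cfg/file" also appears in "config_files/", which is why lstrip consumes far more than the intended prefix here.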
2022-05-19 | MLBEDSW-6563: networks failing with memory area exceeded in vela [3.4.0.rc2] | Tim Hall
- For allocations that have a hard memory limit, the Hill Climb allocator should be given more attempts to find a solution that fits - The fix is to use a memory limit when there is a hard constraint, and a minimum iteration count, reset on every improvement, when there is a soft constraint - Added a CLI option for the maximum number of iterations Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I19ff53a0b68412de280263626778a3102cbe52fa
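One possible reading of this stopping policy is sketched below. It is not the Hill Climb allocator itself: the function run_search, its parameters and the toy improvement step are all assumptions made for illustration, and only the stopping rules follow the commit message.

```python
from typing import Callable, Optional


def run_search(
    initial_size: int,
    try_improve: Callable[[int], int],
    memory_limit: Optional[int] = None,
    min_iterations: int = 10,
    max_iterations: int = 1000,
) -> int:
    """Hypothetical search loop showing the two stopping rules.

    Hard constraint (memory_limit given): keep trying until the result fits.
    Soft constraint (no limit): stop after min_iterations attempts without
    an improvement; the counter is reset on every improvement.
    max_iterations bounds the search in both cases.
    """
    best = initial_size
    since_improvement = 0
    for _ in range(max_iterations):
        if memory_limit is not None:
            if best <= memory_limit:
                break  # hard constraint satisfied
        elif since_improvement >= min_iterations:
            break  # soft constraint: no recent progress
        candidate = try_improve(best)
        if candidate < best:
            best = candidate
            since_improvement = 0  # reset on every improvement
        else:
            since_improvement += 1
    return best


def shrink(size: int) -> int:
    """Toy improvement step: shrinks by one until it bottoms out at 50."""
    return size - 1 if size > 50 else size


print(run_search(100, shrink, memory_limit=60))   # 60
print(run_search(100, shrink, min_iterations=3))  # 50
```

With a hard limit the loop is allowed to run up to max_iterations even when it stalls, which matches the idea of giving the allocator more attempts when the result must fit.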
2022-05-19 | MLBEDSW-6296: improvement_dram can become NaN | Tim Hall
- Problem is due to a divide by zero - Fix is simply to detect and assign zero. This could also affect improvement_sram Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I29a67710a17ef22656fb5ecfe9476953ffa5533d
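The "detect and assign zero" fix can be illustrated with a small guard. The function name and the formula for the relative improvement are assumptions for this sketch; the commit message does not state how improvement_dram is actually computed.

```python
def relative_improvement(baseline: float, candidate: float) -> float:
    """Hypothetical relative improvement of `candidate` over `baseline`.

    When the baseline is zero there is nothing to improve on, so the result
    is defined as zero instead of dividing by zero.
    """
    if baseline == 0:
        return 0.0
    return (baseline - candidate) / baseline


print(relative_improvement(200.0, 150.0))  # 0.25
print(relative_improvement(0.0, 0.0))      # 0.0
```

With built-in floats the unguarded division would raise ZeroDivisionError; with NumPy floating point scalars 0/0 evaluates to NaN, which is the symptom named in the commit subject.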
2022-05-19 | MLBEDSW-6271: Key error when using --verbose-performance option | Rickard Bolin
- The print_performance function that is called when using the --verbose-performance option crashed with KeyError when no SRAM was used. Signed-off-by: Rickard Bolin <rickard.bolin@arm.com> Change-Id: Ib6af3193e8f4f368cb28d51e65afa0751773628a
2022-05-19 | MLBEDSW-6384: Updated weight buffering cycle calculation | Johan Alfvén
- The NPU cycles are not calculated correctly when only one weight buffer is used, since weights cannot be fetched in parallel. - Added a new calculation for the single buffer case. Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: I8568912d11d137a298225ab77b8b3272613c76f6
2022-05-19 | MLBEDSW-6430: MLCE: Update to graph has sequential ethos-u ops | Johan Alfvén
Update to the "Vela splitting network into two ethos operators" patch allowing the CPU pass to be moved last in the pass_list. Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: I2e8a299101e5d65e963327bed7c8d891fff6523e
2022-05-18 | MLBEDSW-6430: MLCE: Vela splitting network into two ethos operators | Johan Alfvén
- Due to how the graph is traversed, the final pass list contained unnecessary multiple Ethos-U operators. Functionally this is not a problem, but it adds extra context switching between CPU and NPU. - By applying sorting rules to the pass list, it is possible to create a more optimal pass list that reduces the number of Ethos-U operators. Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: Ib556f902e1f321b5c50238fada7aa92b9810b27a
2022-05-18 | MLBEDSW-4783: Add config file directory structure | Rickard Bolin
Add directory structure to support third party config files. Config files should now be placed in an appropriately named directory under the config_files directory, but can also be accessed by providing its absolute path to vela --config. Signed-off-by: Rickard Bolin <rickard.bolin@arm.com> Change-Id: I2fcf52e7b2ddd2c4491dc370c85c0b3937d18062
2022-05-17 | MLBEDSW-6271: MLCE: Layer wise Utilization info from Vela | Tim Hall
- Added support to print per operator sram usage and performance information - Added new CLI option --verbose-performance to control this feature Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I368599b410e5d441d9804871fc51b7a1049d85b3
2022-05-17 | MLBEDSW-6296: Updated condition for the opt size weight buffering schedule | Johan Alfvén
Allow the schedule to be used when the calculations show zero total improvement but do show a DRAM improvement. When testing on a real target, total performance is improved. Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: Ib4f2a37710dc7954b72b48c38fce4817ccd7187b
2022-05-16 | MLBEDSW-6263: Use separate tensors for double buffering | Rickard Bolin
Uses separate tensors for the individual weight buffers in case of weight double buffering. Each weight buffer tensor gets its own individual live range. This patch is a clone of a previously reverted patch, but with some additional bug fixes applied. Signed-off-by: Rickard Bolin <rickard.bolin@arm.com> Change-Id: I868c70d15821eb9f1399186f2da6e7345f6ee343
2022-05-12 | MLBEDSW-6296: Regression caused by bigger weight buffering size [3.4.0.rc1] | Johan Alfvén
- Because bigger weight buffer sizes are being used, there are use cases where feature maps are evicted from SRAM, causing the total performance to drop. - A way to improve this is to limit the memory for those weight buffer ops, to get the feature maps back into SRAM, and see if total performance is improved. Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: Ibfaff330677185186af9f6362dfbe04824a329f6
2022-05-11 | MLBEDSW-6454: Enable ReLu with negative alpha value | Johan Alfvén
Removing constraint for negative alpha value in ReLu for int8 and uint8. Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: Id7a3a30bf5d1f0a591f990bd04cd0dbbad5819c6
2022-05-11 | MLBEDSW-6452: Add byte offset in command stream | Tim Hall
- Added the offset address to the command stream disassembly Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I55c6ef59878c90c21d41051c076da6c1f0fa4201
2022-05-11 | Revert "MLBEDSW-6312: Find block config improvement" | Tim Hall
This reverts commit d2b5510697e7789f5a416f9d80d3cb640eecc092. Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: Ia3043bc9c27fe2f72f3ab2f6f7341b3a9adb4231
2022-05-09 | MLBEDSW-6500: Address offset out of range | Johan Alfvén
- Cascading a slice operator with read offsets is not supported by the rolling buffer mechanism, causing the address to go out of range. - The fix is to prevent ops from being cascaded if they have read offsets. Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: Iea7f054ac4b5a7dadf905bbe947033247284c27e
2022-05-04 | Revert "MLBEDSW-6263: Use separate tensors for double buffering" | Tim Hall
This reverts commit cc5f4de1c35ba44fca7ff6295c6ae846f8242344. Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I0fa5babfe9ad9ec668720d04fe1c16d9a9092131
2022-04-27 | MLBEDSW-6425: Update to TensorFlow 2.8 (bugfix) | Rickard Bolin
Generate flatbuffer files with relative imports. Signed-off-by: Rickard Bolin <rickard.bolin@arm.com> Change-Id: Idd59bb2ebb829bc42677920577c1f8a04e23ca68
2022-04-27 | MLBEDSW-6425: Update to TensorFlow 2.8 | Rickard Bolin
Update the flatbuffers generated code to comply with TensorFlow 2.8 Signed-off-by: Rickard Bolin <rickard.bolin@arm.com> Change-Id: Ia65325b88745e49dbafa803a38c0ea0e7d0478ba
2022-04-21 | MLBEDSW-5384 FC layers run on NPU if underlying shape is 2D | Ayaan Masood
* Added a generic function which checks whether the underlying shape of a FullyConnected operation is 2D and performs shape reduction * FullyConnected operations with more than 2 dimensions now run on the NPU if the above case is satisfied * Refactored constraint_fc_output_2d and rewrite_fully_connected_input * Added a unit test to confirm this functionality Signed-off-by: Ayaan Masood <Ayaan.Masood@arm.com> Change-Id: I0e29c767e5b84841eb53bbc44464b36a454f7b38
2022-04-20 | MLBEDSW-6407: Vela fails with TypeError in npu_performance | Tim Hall
- This is due to calling range() on a non-integer value which in turn is due to a change in the behaviour of round() on numpy.float64 values - The fix is to always force the output of the round() to be an integer and thereby stop whole number floating point values propagating into the kernel dimensions which later feed into the range(). Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: Ic75cb6ba85a90c81c1d762067d89a10caaa13b92
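The failure mode and the fix can be shown without NumPy. In this sketch a plain whole-number float stands in for the numpy.float64 value described above, and the helper names are hypothetical rather than taken from npu_performance.

```python
def to_kernel_dim(value: float) -> int:
    """Round to the nearest whole number and force a built-in int.

    Hypothetical helper: the explicit int() is the point, because it stops
    whole-number floats from propagating into later range() calls.
    """
    return int(round(value))


def range_accepts(value) -> bool:
    """Return True if range() accepts the value, False on TypeError."""
    try:
        range(value)
    except TypeError:
        return False
    return True


# Stand-in for a whole-number float that leaked into a kernel dimension.
whole_float = 4.0

print(range_accepts(whole_float))                 # False
print(range_accepts(to_kernel_dim(whole_float)))  # True
print(list(range(to_kernel_dim(whole_float))))    # [0, 1, 2, 3]
```

range() only accepts integer arguments, so a value such as 4.0 fails even though it is numerically a whole number.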
2022-04-20 | MLBEDSW-6371: Output diff caused by operator clone bug | Rickard Bolin
- Modify the operator clone function to also clone resampling mode attribute. A previous patch changed the ifm resampling mode to be an attribute of an operator rather than a tensor but did not modify the operator clone function to clone the new attribute. Signed-off-by: Rickard Bolin <rickard.bolin@arm.com> Change-Id: I7a2f6103666a0997f657de20ad962e849976b904
2022-04-08 | MLBEDSW-6339 Performance drop on wav2letter | Johan Alfvén
Corrected the calculation of the buffering depth used. Before this change there were scenarios where it was set to a smaller size than needed. Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: I162859ade78487e848510c6a605685e4568c7068
2022-04-04 | vela: Minor refactor [dev/mlbedsw-6271] | Tim Hall
- Changed comments to docstring on QuantizationParams - Simplified op type to op name conversion Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I2fdf5922cc17944c9bd37917a85fdfe50a1e651d
2022-03-31 | vela: Added debug info to external API | Tim Hall
- Added optional name attributes to operators and tensors Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I3b5d881a7b1043a6ba4b58fff5d7532b271ba536
2022-03-30 | Update version of Black to 22.3.0 | Jonas Ohlsson
Update version of Black to 22.3.0 due to updated dependencies. Updates to fix reported issues due to new version. Signed-off-by: Jonas Ohlsson <jonas.ohlsson@arm.com> Change-Id: I60056aae452093ce8dcea1f499ecced22b25eef1
2022-03-30 | MLBEDSW-6263: Use separate tensors for double buffering | Louis Verhaard
Uses separate tensors for the individual weight buffers in case of weight double buffering. Each weight buffer tensor gets its own individual live range. Change-Id: I724a8c61a7045615fbd2ed9535663076ac8edd13 Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
2022-03-28 | MLBEDSW-6249: HillClimb improved stuck avoidance | Louis Verhaard
Added a mechanism that reduces the risk for getting stuck if the current best allocation cannot be improved by only swapping 2 indices. Change-Id: Ife379757752f0c1ed54af7bd826e0a9390d54267 Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
2022-03-28 | MLBEDSW-6098: Order check in cascade builder | Louis Verhaard
Added checks in the cascade builder to ensure that scheduled operations are in the correct order. Change-Id: Ic1765a6a1cb8335ff222bfe3b2d2e642980967d7 Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
2022-03-21 | MLBEDSW-6298: MLCE: Unable to find a valid block config | Tim Hall
- Fixed a bug due to ResizeBilinear modifying the attributes of a shared IFM - The ifm_resampling_mode is now an attribute of an operator rather than a tensor - Changed all calls to try_block_config() to use the attribute rather than recalculating it in multiple places Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I4641e9cd6b049bd4186776d98e3e751c5e5bcc06
2022-03-21 | MLBEDSW-3367 Add mypy to pre-commit | Jonas Ohlsson
Add mypy to pre-commit and clean up all reported errors. Signed-off-by: Jonas Ohlsson <jonas.ohlsson@arm.com> Change-Id: If7dc869f5fecdb0e2db40f14e7d9db21aa33df71
2022-03-21 | MLBEDSW-6312: Find block config improvement | Louis Verhaard
- The number of accumulators is doubled in an Ethos-U configuration with 2 cores - Likewise, for elementwise, depthwise and pooling operations the IFM buffer depth capacity is doubled - FindBlock: step the search space depth in multiples of ublock * ncores Change-Id: I923cc347a2f252876d405ed93095d39181103f81 Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
2022-03-17 | MLBEDSW-5332: Bug fix optimise_strided_conv | Louis Verhaard
Added check that horizontal padding is unaffected when applying graph optimization "optimise_strided_conv". Signed-off-by: Louis Verhaard <louis.verhaard@arm.com> Change-Id: I7032a44163e300cdf62cf615b4b10a1417e38eaa
2022-03-14 | MLBEDSW-6245: Bug fix fast storage allocator | Louis Verhaard
Fast storage allocator did not always return an optimal allocation. Signed-off-by: Louis Verhaard <louis.verhaard@arm.com> Change-Id: Ic758b6c4a82dc2633c4752b0c204a27ed36f651b
2022-03-14 | Fix bug storing encoded NPU weight UUIDs | Jonas Ohlsson
Fix bug when storing the encoded NPU weight UUID in the NPU performance estimation. Signed-off-by: Jonas Ohlsson <jonas.ohlsson@arm.com> Change-Id: I92127b0020f12352d923c0c9aa2b6f47e6110764
2022-03-11 | Vela: Fix diff in mean op | Diqing Zhong
- Extend ifm/ofm dimensions explicitly in mean op. This fixes a bug when the ifm/ofm shapes have a different number of dimensions, e.g. for IFM=1x19x18x25, axis=2, OFM=1x19x25, the ofm_shape should be 1x19x1x25, not 1x1x19x25 - Fix wrong weight shape Change-Id: I269eb71ea56c09deee2aa6c6433d9b2baa98a113 Signed-off-by: Diqing Zhong <diqing.zhong@arm.com>
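The shape example in this message can be reproduced with two small helpers. Both function names are hypothetical and only illustrate the two ways of extending a 3D OFM shape to 4D; they are not the functions changed by the patch.

```python
from typing import List, Sequence


def left_pad_to_4d(shape: Sequence[int]) -> List[int]:
    """Naive extension: prepend ones until the shape has four dimensions."""
    return [1] * (4 - len(shape)) + list(shape)


def ofm_shape_for_mean(ifm_shape: Sequence[int], axis: int) -> List[int]:
    """Explicit extension: keep the reduced axis in place with size 1."""
    shape = list(ifm_shape)
    shape[axis] = 1
    return shape


ifm_shape = [1, 19, 18, 25]
ofm_shape = [1, 19, 25]  # result of the mean over axis 2 with the axis dropped

print(left_pad_to_4d(ofm_shape))         # [1, 1, 19, 25]
print(ofm_shape_for_mean(ifm_shape, 2))  # [1, 19, 1, 25]
```

Left-padding puts the size-1 dimension in the wrong position, so height and width no longer line up with the IFM.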
2022-03-08 | Updated elementwise cycle calculation | Johan Alfvén
- Corrected rounding error - Number of elements depends on ofm format Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: I568d660b7571b6e0ffb131211b3a89c8be4b9295
2022-03-04 | MLBEDSW-3367 Update pre-commit flake8 version | Jonas Ohlsson
Update the version of flake8 used in pre-commit to facilitate adding mypy to pre-commit. Signed-off-by: Jonas Ohlsson <jonas.ohlsson@arm.com> Change-Id: I457dec87b77487ca6f14ff4a679c4cc927b272b0
2022-02-24 | MLBEDSW-6247: MLCE: Issue when running a model with Padding | Tim Hall
- The bug is that TransposeConv does not support explicit padding, which is needed in order to combine it with a preceding Pad op - The fix is to exclude such combinations Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: Ide03d034dc32b5fc9bcaaf291ab713482223a042
2022-02-22 | MLBEDSW-5873 Fixed divide by zero warning in memory transfer efficiency | Ayaan Masood
*Corrected calculation where use of the _estimate_memory_transfer_efficiency function when calculating the scaled bandwidth for LUT transfers resulted in a divide by zero error. Change-Id: I2356e924d9ca2f315ca1988f465f58b13a8fa4c9 Signed-off-by: Ayaan Masood <Ayaan.Masood@arm.com>
2022-02-22 | MLBEDSW-5880 Fixed Vela verbose weight flag | Ayaan Masood
*Original weights and encoded NPU weight now report correct size instead of zero when running vela with --verbose-weights flag (Code to update the aforementioned attributes was missing) *Removed print references to unencoded NPU weight size Change-Id: I6d3e41c04cc46d24eeb54cab89818a35e5df27be Signed-off-by: Ayaan Masood <Ayaan.Masood@arm.com>
2022-02-21 | MLBEDSW-6148: Reduce SRAM usage for elementwise op | Johan Alfvén
Reduce memory footprint when using optimization strategy Size for elementwise operations. Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: I30380aed587c31adbf7615f74179b4c5da686773
2022-02-15 | MLBEDSW-5554: Constraints for single-axis mean operations on NPU | James Peet
- Combine two MEAN operator checks for single axis averages into one - Only apply that check if the single axis is the height dimension (previously checks were also applied to width averages) - Rephrase some MEAN operator constraint descriptions Signed-off-by: James Peet <james.peet@arm.com> Change-Id: Ie0577f2b99aba1f3d6a4c39f8934eafe3813b736
2022-02-09 | MLBEDSW-6180: Protect overwrite of subgraph output [3.3.0.rc1] | Johan Alfvén
Make sure output from subgraph is write protected and not overwritten by an element wise op. Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: Ie26979913843c62794c5346a315b7089206850e0
2022-02-08 | MLBEDSW-5582: MLCE: memory corruption with zero concat | Johan Alfvén
Fixed problem when ofm is produced by different NPU nodes by making sure that output is always in NHWC format. Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: I00e55c989d5860499fbaf4f4318661b17b4bda7e
2022-02-08 | MLBEDSW-5839: Port of improved spilling behaviour | erik.andersson@arm.com
Ported the improved spilling behaviour from Regor into Vela. This replaces use_fast_storage_for_feature_maps with allocate_feature_maps and introduces the class called FastStorageComponentAllocator. Signed-off-by: erik.andersson@arm.com <erik.andersson@arm.com> Change-Id: I34785840c905a79750a62863773015b00fb43387
2022-02-07 | MLBEDSW-6148: Allow overwrite of subgraph input | Johan Alfvén
This change will allow the subgraph's input tensor to be reused/overwritten by the output from an elementwise op if there is only one consumer attached to the input tensor. Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: I317188af11a5470614770e18dc8973462fd5f21c
2022-02-02 | MLBEDSW-3623: Diff on semantic_segmentation | Rickard Bolin
The root cause of this diff is precision errors caused by rounding several times when performing a resize bilinear upscaling to more than twice the initial size. This is solved by rewriting the algorithm to perform nearest neighbour upscaling to the correct size and then applying one larger average pool instead of several 2x2 pools. Avgpool with padding is limited to kernel size 8x8, which constrains the largest possible bilinear upscaling to 8 times the input size. Signed-off-by: Rickard Bolin <rickard.bolin@arm.com> Change-Id: I846232f309ba26aab6c385e593cbe25b646c6668
2022-01-27 | MLBEDSW-6060: Revert patch for MLBEDSW-5582 | Johan Alfvén
- Issue was due to a previous patch to fix MLBEDSW-5582 - Revert fix for MLBEDSW-5582, commit 849ff81f82c10a68898e5101930b92372bec5565 - Made new fix for MLBEDSW-5582 that enforces output tensors from NPU graphs to be in NHWC format. This information is otherwise lost in the case when parts of a concatenation are placed in different custom operators, resulting in a mismatch between NHWC and NHCWB16. Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: Iab3ba29d348353c854f357836e6aa7c338ae1572