Age | Commit message | Author |
|
Added LeakyRelu to supported activation ops.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: Icca27730946d02ec16159f988782567be716b594
|
|
Setting bias tensor dtype to DataType.int32 solves rounding issues for
RB HPC int16.
Removing the input data type check also solves the issue of resize
nearest neighbor int16 ops incorrectly getting placed on the CPU.
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
Change-Id: Iee352bcb78e581c0cde3c203dfbe866f1f6fae18
|
|
- Added support for Resize Bilinear with half pixel centers for int8 and
uint8.
- Utilizes the new "TILE" padding mode.
- Utilizes ofm stride multipliers and modified tile base offsets to
write OFMs interleaved.
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
Change-Id: I37fa77c022a368f05fda0ead75d8696c9205f833
|
|
The issue was that the AveragePool in these test cases was
translated to DepthwiseConv2DBias, and int16 convolutions
always run with reduced scale. Fixed so that reduced scale
is not used in this case.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: Ice956eabbb37c8aa1991464870006971c6ecec43
|
|
Fixed PReLU optimisation to LeakyReLU with negative alpha.
Added optimisation of LeakyReLU to ReLU when alpha is zero.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: I5e66f79b29908fffd95b6115799021138ebb401a
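The equivalences these optimisations rely on can be sketched in NumPy (an illustration of the maths, not Vela code):

```python
import numpy as np

def leaky_relu(x, alpha):
    # LeakyReLU: pass positives through, scale negatives by alpha.
    return np.where(x > 0, x, alpha * x)

def prelu(x, alpha):
    # PReLU with a scalar (per-tensor) alpha is the same function as
    # LeakyReLU, which is what makes the rewrite possible.
    return np.where(x > 0, x, alpha * x)

x = np.array([-2.0, -1.0, 0.0, 3.0])
# With alpha == 0, LeakyReLU collapses to plain ReLU.
assert np.array_equal(leaky_relu(x, 0.0), np.maximum(x, 0.0))
```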
|
|
Allow sparse writing of OFM by multiplying H/W/C of the OFM with the
values of ofm_stride_multiplier
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
Change-Id: I65d742ad36ad3154e9914cdd22e2da928ad1f095
|
|
Fixed LeakyReLU regressions for int16 due to scaling introduced
for handling negative alpha.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: I84a494fedf54bd4b47c4632645ded7d6cda445f8
|
|
Removed duplicate code and moved constraint to
the correct file.
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: I2da3c5b88e1af351751c481217b8183b5948f0f8
|
|
Remove Pipfile support due to lack of testing and maintenance.
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
Change-Id: I93786cdbf22bfa2130601291d23cead177bd8f81
|
|
Added support for int16 LeakyRelu for negative alpha and alpha
greater than one.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: I7f522ebfe014786d0a1d96172e75c7d9bdd76921
|
|
Implement new padding mode which pads two edges of the IFM with the
current values of those edges
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
Change-Id: I8523e0cabdac80b48710703859003e33050cc150
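The behaviour of such an edge-replicating pad can be illustrated with NumPy's "edge" mode (a sketch of the concept, not the Vela implementation):

```python
import numpy as np

ifm = np.array([[1, 2],
                [3, 4]])
# Pad one row at the bottom and one column on the right by replicating
# the current edge values of the IFM.
padded = np.pad(ifm, pad_width=((0, 1), (0, 1)), mode="edge")
# padded is:
# [[1 2 2]
#  [3 4 4]
#  [3 4 4]]
```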
|
|
Changed acc type from int16 to int32. This will solve
saturation problems and the constraint added in
commit "MLBEDSW-5029: Output diff for Mean op"
can be removed.
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: I05ec8835b43313b1a264d61a2b147fa62da123fe
|
|
- Ethos-U65-512 requires the input to REDUCE_SUM to use NHWC format
- Updated the graph optimiser format check to cover this condition
- Added an exception check to the backend of the compiler to verify that
this condition has not been violated by the external API or Vela internals
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I2f1fabcbd264daf77d5822349d855a3a32b12c64
|
|
Added optimisations for PReLU when the alpha values allow it.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: Iff9124e691663ee495379f89900e7c35dbc5f948
|
|
Fixed three test cases causing output diff compared to
the reference kernel for the Mean operator.
- If there is a possibility that the accumulator could saturate,
the Mean op must run on the CPU
- Use correct rounding for the bias term
- If a Reshape op is followed by a Mean op, push the Reshape op
to the CPU since this cannot be handled by the NPU
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: I734465730372105821a5e2f73a6a125b9eb7d7f4
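The kind of saturation check described can be sketched as follows (an illustrative rule with assumed uint8 inputs and a signed 32-bit accumulator, not Vela's actual condition):

```python
def mean_could_saturate(h: int, w: int) -> bool:
    # Illustrative check: summing h*w uint8 values (max 255 each) can
    # exceed the signed 32-bit accumulator range, in which case the
    # Mean op has to fall back to the CPU.
    max_sum = h * w * 255
    return max_sum >= 2 ** 31
```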
|
|
Dump the current per-layer performance estimation information
that appears on the terminal to a CSV file.
Change-Id: I00e94168704be8c3c674c8779fb807ed28607ccd
Signed-off-by: wilisa01 <william.isaksson@arm.com>
|
|
Added PReLU support in graph optimiser.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: I3a188675e3edcdf0b4a4bfcdd134fda0bf8a560f
|
|
- The optimisation of the SHAPE operator resulted in a divide by zero
when printing the percentage of NPU/CPU operators in the final output
summary
- The fix is to detect when there are no operators in the output TFLite
file and then avoid the division
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I5bd2342335e9468a8b7028e6e2291a03960e2e55
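The guard described can be as simple as the following (hypothetical names, not the actual Vela code):

```python
def npu_percentage(npu_ops: int, cpu_ops: int) -> float:
    # Hypothetical helper: avoid the divide by zero when the output
    # network contains no operators at all.
    total = npu_ops + cpu_ops
    if total == 0:
        return 0.0
    return 100.0 * npu_ops / total
```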
|
|
- Updated SUPPORT_OPERATORS.md with Resize operators
- Updated release notes with the main changes and bug fixes
- Updated version numbers
Signed-off-by: oliper01 <oliver.perssonbogdanovski@arm.com>
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: If25b5fab708098bc3e7eb243924b55a50f148c3a
|
|
Mypy and pylint were previously not included in TESTING.md.
Also, installation of pre-commit, pytest and pytest-cov outside
of a virtual environment was not detailed.
CONTRIBUTIONS.md had an old Python version listed in the coding standard section.
Signed-off-by: erik.andersson@arm.com <erik.andersson@arm.com>
Change-Id: Idff9454083e41d719e6d75e90cb2be2861500eb9
|
|
Remove resize ops completely from being cascaded since there
are corner cases which are not currently handled.
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: I9923f8e119af7bdc0e93b0e69b521b399e0629af
|
|
Output diffs were found to be caused by odd input stripe heights
when the input was produced by an upscaling operator.
Signed-off-by: erik.andersson@arm.com <erik.andersson@arm.com>
Change-Id: Ia3791d815250364cfe7a38c3ed0e30768d64ca08
|
|
- When compiling for shared SRAM, the old scheduler has an option that
lets it use less SRAM than what the new scheduler manages to
produce. The old scheduler was able to create more/longer cascades.
In order to improve the new scheduler, the following has been
implemented:
- Take persistent IFMs into account when creating the min schedule.
- Choose longer cascades when it is possible to reduce the total
SRAM usage compared to using shorter cascades.
- Updated calculation for estimated SRAM usage for elementwise ops.
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: I209bbf2d94425e4f6aacb1d151b3b2aa65c0870b
|
|
- The compiler will assert when compiling a faulty concat op.
In the reported use case, there were 3 inputs with shape 1x1x2
but the output shape was 1x1x2 (expected to be 1x1x6)
- The solution is to add constraints to the concat operator.
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: I94a505c51a9fd54d1aa92531a0415031db52378a
|
|
There is an issue with using NumPy 1.21.4 or above in setup.py with
Python 3.7. The restriction can most likely be removed when upgrading to
Python 3.8.
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
Change-Id: I9f826201d68bb5ab61f5bf76c7796442d34447b9
|
|
Limit relative cost to 1 for elementwise operations since increasing
block size when the full ofm already fits gives no additional benefits.
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
Change-Id: Ib6128f6346834fd916efa59adbe07a069dbda0ae
|
|
With the errors caused by the previous TensorFlow 2.9 update
being fixed, we can proceed with the upgrade.
Signed-off-by: erik.andersson@arm.com <erik.andersson@arm.com>
Change-Id: Ie1f025e8d984efaebc68b8d051126d49bee6b2b8
|
|
- Changed ResizeBilinear to support ResizeNearestNeighbor as well for
1x1 IFM, IFM equal to OFM, and non-align corners
- Added support for ResizeNearestNeighbor with align corners by
converting to a DepthwiseConv
- Updated supported operator unit tests
- Added is_resize() helper function and some associated refactoring
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: Id5bdf2a25e8aa6a4f28b7236250abf768141ce37
|
|
- Fixed align corners support when converting into upscale and average
pool. The problem was due to the wrong IFM to OFM size ratio, causing a
scaling factor that was not 2x/4x/8x. Works for uint8, int8 and int16.
- Fixed checking of align corners in supported operators check
- Added additional supported operators check for the size tensor
- Updated and added more supported operators unit tests
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: Idb78fa9e76ede2c37e8ac6cb1c322154bd156898
|
|
- Minor rework at the register command stream level
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I58495e40efa3a95bdf6febde530f9f73fa8be30b
|
|
If an elementwise op is part of a cascade, the ifm can not
be overwritten by the ofm.
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: I1e5f7ee501be17e76684b33c6e86ab8af0f3e61f
|
|
TensorFlow 2.9 contains a bug for int16x8 without biases.
Revert "MLBEDSW-6635: Update to TensorFlow 2.9"
This reverts commit 93f492bae9c4dd16a1f64b851b237263695ee03e.
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: I366d201ce4134a877d333be2aade546dfcb5d6d7
|
|
Added SHAPE operator to the supported operators report.
Updated the constraints for QUANTIZE and SHAPE operator.
Also fixed RESHAPE consuming statically optimised shape.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: I1d964d602d3f361a0f16dae8133197280dd84c48
|
|
Update the flatbuffers generated code to comply with TensorFlow 2.9
Signed-off-by: erik.andersson@arm.com <erik.andersson@arm.com>
Change-Id: I6bf506ffb85da2d4a57a32198b471513deeaca73
|
|
Added check to see if additional stripe data is needed from producer op
when cascading to make sure the stripes are not overwriting data still
being used. Also changed scheduler to make sure ResizeBilinear always
runs with even stripe height.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: If7d723e6be29575c2b55c400eebbe8275a1aa328
|
|
Fixed static optimisation of Quantize operator by running unsupported
formats on CPU. Also added support for int16 and corrected the
calculation.
Change-Id: I861c712aa6258dba53fcf4d5dae45d1d416e6141
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
|
|
Hardswish activation function gets converted to LUT in graph optimizer. The case for it was removed, as it was never called.
Signed-off-by: oliper01 <oliver.perssonbogdanovski@arm.com>
Change-Id: I376e8d7b81489c06b66d4e49f59b207600c0ccce
|
|
Enabled elementwise cascading for binary/single variable IFM operators.
Signed-off-by: erik.andersson@arm.com <erik.andersson@arm.com>
Change-Id: I1c0867875fdc5c4980224fb570185c11e719d5cd
|
|
*Quantise op becomes constant if input is known at compile time
*Quantised values calculated if input of op is const and float
*Const inputs to quant op that are int are requantized
Change-Id: Ic94a72a392af709fe6a640d7dacbb5dc2334f16f
Signed-off-by: Ayaan Masood <Ayaan.Masood@arm.com>
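The affine quantisation arithmetic implied here follows the standard TFLite scheme, q = round(x / scale) + zero_point; a sketch of both cases (not the actual Vela code):

```python
import numpy as np

def quantise(values, scale, zero_point, dtype=np.int8):
    # Quantise float values: q = round(x / scale) + zero_point,
    # clamped to the range of the target integer type.
    info = np.iinfo(dtype)
    q = np.round(np.asarray(values) / scale) + zero_point
    return np.clip(q, info.min, info.max).astype(dtype)

def requantise(q, in_scale, in_zp, out_scale, out_zp, dtype=np.int8):
    # Requantise already-integer values into a new scale/zero point,
    # as done for constant integer inputs to a Quantize op.
    x = (np.asarray(q, dtype=np.int64) - in_zp) * in_scale
    return quantise(x, out_scale, out_zp, dtype)
```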
|
|
*Shape OP value is available at compile time hence
it can be optimised
*Disconnected shape OP at compile time from parent
tensor
*Transformed shape OP tensor into constant
Change-Id: I0a024269e2b592c6146dd72e62d7a41951fb727a
Signed-off-by: Ayaan Masood <Ayaan.Masood@arm.com>
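Conceptually, folding a SHAPE op amounts to replacing it with a constant built from the statically known input shape (an illustrative sketch, not Vela code):

```python
import numpy as np

ifm = np.zeros((1, 8, 8, 3), dtype=np.int8)
# The input shape is known at compile time, so the SHAPE op's output
# can be materialised as a constant int32 tensor and the op removed
# from the graph.
shape_const = np.array(ifm.shape, dtype=np.int32)
```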
|
|
- The fast storage allocator is supposed to add all feature maps
that do not fit in SRAM to an evicted list. However, in the
case when conflicting tensors were handled, the list was not updated.
- This patch makes sure to update the list correctly.
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: Ibeb3b4e4927f22a8206784a478f1ac38bd7f5a87
|
|
- The fast storage allocator only looked at tensor size, giving priority
to larger tensors. The problem with this method is that it does not
consider the actual read/write access of the tensor. So, a smaller
tensor can cause more memory transactions than a bigger one.
- The solution is to calculate the read/write access of the tensor and
add that score to the decision when deciding where to place the tensors.
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: I59eb9bd3a44a0238b576cfd8f09ff27012b99070
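A hypothetical sketch of the kind of scoring change described (the name and weighting are illustrative, not Vela's actual heuristic):

```python
def placement_score(tensor_size: int, reads: int, writes: int) -> int:
    # Score a tensor for fast-storage placement by its total memory
    # traffic rather than by size alone: every read and write moves
    # the whole tensor, so a small but frequently accessed tensor can
    # outrank a larger, rarely accessed one.
    return tensor_size * (reads + writes)
```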
|
|
Improved block size selection by favouring larger
block sizes for elementwise operations.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: I5b30b358d84fcd672935b863c2154bd8f4ccd928
|
|
Vela was not able to parse config file paths entered with forward
slashes. This patch will make it possible to use both forward and
backslashes when specifying paths.
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
Change-Id: I0f4cfc16bde5738c73059af6216d2bdc3821c68b
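One way this kind of handling can be sketched is to map backslashes to forward slashes before any further path processing (the helper name is hypothetical, not Vela's actual code):

```python
import posixpath

def normalise_config_path(path: str) -> str:
    # Hypothetical helper: accept both separators by mapping
    # backslashes to forward slashes, then normalising the result.
    return posixpath.normpath(path.replace("\\", "/"))
```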
|
|
- Updated release notes and setup.py tag for 3.4
- Regenerated supported ops information
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I4ec88544b84cab168cb3e5cbc6bc392b6b3d8a39
|
|
One level deep relative paths (i.e. ./vela.ini) were treated as if the
name of a folder in config_files was ".". They are now treated as
relative paths. The warning message shown when using an absolute path
has also been changed to an error message for a better user experience.
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
Change-Id: I7f7d4f904b9fbba97593e42203566057a2d36925
|
|
The argument to the lstrip function is a set of all characters that
should be stripped from the beginning of the string, in any order,
not a prefix. To remove the actual prefix, check whether the string
starts with it and then remove that number of characters. The
function "removeprefix", added in Python 3.9, does exactly this, but
it is not yet available to Vela since Vela still supports Python 3.7.
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
Change-Id: Ibc5a173c6d422cb5f55feb80caef6c5c30cf7d39
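The difference is easy to demonstrate (a minimal sketch; the backport helper name is hypothetical):

```python
def remove_prefix(text: str, prefix: str) -> str:
    # Backport of str.removeprefix (Python 3.9+) for Python 3.7:
    # only strip the prefix if the string actually starts with it.
    if text.startswith(prefix):
        return text[len(prefix):]
    return text

# lstrip treats its argument as a set of characters, not a prefix,
# so it keeps stripping matching characters past the intended prefix:
assert "config_files/cfg.ini".lstrip("config_files/") == ".ini"
assert remove_prefix("config_files/cfg.ini", "config_files/") == "cfg.ini"
```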
|
|
- The latest numpy versions require Python 3.8
- This can cause issues if Python 3.7 is installed, which is the version
that Vela is tested against
- The fix is to limit the numpy version to those that support Python 3.7
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I3a388976d5aa76395ca93202e496640c8de9f6f4
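Such a pin is typically expressed with an environment marker in setup.py; a sketch (the version bounds here are illustrative, not the ones Vela actually uses):

```python
# setup.py fragment (illustrative version bounds): keep NumPy at a
# release line that still supports Python 3.7, while letting newer
# Python versions take the latest release.
install_requires = [
    "numpy<1.22 ; python_version<'3.8'",
    "numpy ; python_version>='3.8'",
]
```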
|
|
- For allocations that have a hard memory limit the Hill Climb allocator
should be given more attempts to find a solution that would fit
- The fix is to use a memory limit when there is a hard constraint, and
a minimum iteration count, reset on every improvement, when there is a soft
constraint
- Added a maximum number of iterations CLI option
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I19ff53a0b68412de280263626778a3102cbe52fa
|
|
- The problem is due to a divide by zero
- The fix is simply to detect it and assign zero. This could also affect
improvement_sram
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I29a67710a17ef22656fb5ecfe9476953ffa5533d
|