path: root/ethosu
Age | Commit message | Author
2023-07-10 | MLBEDSW-7752: setting query shapes to Shape4D(0) | William Isaksson
Changes query initialization shapes to Shape4D(0,0,0,0) = [0,0,0,0] instead of Shape4D(0) = [0,1,1,1]. The [0,1,1,1] tensors would affect performance estimates and are not real.
Change-Id: Ic83b6f6a70c0c904b500f62756e1e125c99856c6
Signed-off-by: William Isaksson <william.isaksson@arm.com>
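A minimal sketch of why a single-value constructor yields [0,1,1,1]: if the unspecified dimensions default to 1, only the first axis carries the given value. The class below is a hypothetical stand-in for illustration, not Vela's actual Shape4D.

```python
class Shape4DSketch:
    """Hypothetical 4D shape where unspecified dims default to 1."""

    def __init__(self, n, h=1, w=1, c=1):
        self.dims = [n, h, w, c]


print(Shape4DSketch(0).dims)           # [0, 1, 1, 1] - looks like a (degenerate) real shape
print(Shape4DSketch(0, 0, 0, 0).dims)  # [0, 0, 0, 0] - unambiguously "no tensor"
```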
2023-07-06 | MLBEDSW-7832: test_tflite_model_semantic converting array to scalar | Tim Hall
- The problem is that the axis value can be either a scalar or an array containing a single element
- The solution is to check the length of the shape because the size attribute returns the same value for both cases
- This did not show up before because pytest warnings were not being treated as errors
- Removed pre-commit pytest option that caused tests to be searched for from the root directory
- Updated pyproject.toml pytest options to explicitly specify the test directories, and to treat warnings as errors
Change-Id: I037054768e5c34f253b6062eadba1c3419ff65e4
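The scalar-versus-array distinction above can be shown with plain NumPy (this is illustrative only, not the test code in question): both forms have size 1, so only the length of the shape tells them apart.

```python
import numpy as np

axis_scalar = np.array(1)    # 0-d scalar, shape ()
axis_array = np.array([1])   # single-element array, shape (1,)

# size is 1 in both cases, so it cannot distinguish the two forms
assert axis_scalar.size == axis_array.size == 1

# the length of the shape does distinguish them
assert len(axis_scalar.shape) == 0
assert len(axis_array.shape) == 1
```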
2023-06-28 | MLBEDSW-7716: Improve register level unit tests | Alexander Hansson
* Improve check_cmd functions to return position of the checked commands.
* Update existing unit-tests to validate ordering of commands.
Signed-off-by: Alexander Hansson <Alexander.Hansson@arm.com>
Change-Id: I492487d768e1e80f6ea366e29f2f99441e4f9797
2023-06-20 | MLBEDSW-7449: Add function description and type annotations | Raul Farkas
Add function description and type annotations to the optimization functions missing them. Fix type annotation issue when re-assigning variable value to a different type.
Change-Id: I1ee442ff7a29cc07708fdd013430131eff599dd5
Signed-off-by: Raul Farkas <raul.farkas@arm.com>
2023-06-19 | MLBEDSW-7654: Extend support for Mean where HxW > 4096 | Alexander Hansson
* Convert Means with large IFMs to several DepthwiseConv2DBias and Add operations.
* Update tflite supported operator check with new height and width constraints.
* Update unit-tests to verify supported operator changes.
* Fix output-diff for 2D IFMs (MLBEDSW-7772)
Signed-off-by: Alexander Hansson <Alexander.Hansson@arm.com>
Change-Id: Ifae6fb1cdac475ae7dac5116c5f13631ff82108a
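The arithmetic behind splitting a large Mean is partial sums followed by an add and a single final divide. The NumPy sketch below shows that equivalence for an IFM with HxW > 4096; it illustrates the idea only and is not Vela's graph rewrite.

```python
import numpy as np

rng = np.random.default_rng(0)
ifm = rng.random((1, 96, 96, 8))       # H*W = 9216 > 4096
h, w = ifm.shape[1], ifm.shape[2]

# Split the spatial reduction into two partial sums (e.g. top and bottom
# halves), add them, then divide once at the end.
top = ifm[:, : h // 2].sum(axis=(1, 2))
bottom = ifm[:, h // 2 :].sum(axis=(1, 2))
mean = (top + bottom) / (h * w)

assert np.allclose(mean, ifm.mean(axis=(1, 2)))
```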
2023-06-16 | MLBEDSW-7709: MLCE: Crash when rewriting split op | Johan Alfven
- A crash occurred due to a NoneType subscriptable error when rewriting a Slice op. The reason was that the Size tensor did not contain any data.
- Added constraint pushing the Slice operator to the CPU if the begin or size tensor is empty.
- Added test to supported operators
- Updated SUPPORTED_OPS.md
Change-Id: Ide204cae24e5871f0e6ae1fdc98ac68d0ce4d3ae
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
2023-06-16 | MLBEDSW-7315: Add support for AvgPool with stride_width > 3 | Raul Farkas
* Convert AvgPool with stride_width > 3 and VALID padding to Conv2D so that it can be optimized to run on the NPU.
Change-Id: I06ab412357f0b09b1498f9019a9d1963a324ad34
Signed-off-by: Raul Farkas <raul.farkas@arm.com>
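The conversion relies on the fact that an average pool is a convolution with uniform weights 1/(kh*kw). The NumPy check below demonstrates that equivalence for a strided, VALID-padded window; it is a sketch of the underlying identity, not Vela's rewrite.

```python
import numpy as np

rng = np.random.default_rng(0)
ifm = rng.random((10, 10))
kh = kw = 2
stride = 4                                   # stride_width > 3, VALID padding

def windows(img, kh, kw, stride):
    for y in range(0, img.shape[0] - kh + 1, stride):
        for x in range(0, img.shape[1] - kw + 1, stride):
            yield img[y : y + kh, x : x + kw]

# Average pooling over each window...
pooled = np.array([w.mean() for w in windows(ifm, kh, kw, stride)])

# ...equals convolving with a uniform kernel of value 1 / (kh * kw).
kernel = np.full((kh, kw), 1.0 / (kh * kw))
conv = np.array([(w * kernel).sum() for w in windows(ifm, kh, kw, stride)])

assert np.allclose(pooled, conv)
```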
2023-06-16 | MLBEDSW-7648: Fix bug with filter padding in conv2d | Raul Farkas
* Fix bug that caused filter padding to not be added proportionally compared to the hardware padding added to IFM.
* Update needed_total_padding function that calculates hardware padding to also account for the cases in which IFM width is not divisible by the stride width.
* Update supported ops constraint on strides for conv2d to mark ops with stride width > 3 and IFM width that is not divisible by the optimization resize factor as not supported.
* Update unit tests that verify correct functionality when checking whether ops are supported or not.
Change-Id: I62f14cca890b779ca787a9603fa37c873ad522f8
Signed-off-by: Raul Farkas <raul.farkas@arm.com>
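For reference, the textbook SAME-style total padding for one dimension is max((ceil(ifm/stride) - 1) * stride + kernel - ifm, 0), and the non-divisible IFM width is exactly where the remainder term matters. The helper below is that standard formula written out for illustration; it is not a copy of Vela's needed_total_padding.

```python
def total_padding(ifm_size: int, stride: int, kernel_size: int) -> int:
    """Standard SAME-style total padding for one dimension (illustrative)."""
    remainder = ifm_size % stride
    if remainder == 0:
        # Evenly divisible: padding depends only on kernel vs stride.
        return max(kernel_size - stride, 0)
    # Not evenly divisible: the remainder changes the required padding.
    return max(kernel_size - remainder, 0)


assert total_padding(9, 3, 3) == 0   # width divisible by stride
assert total_padding(7, 3, 3) == 2   # width not divisible: 2 extra columns
```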
2023-06-15 | MLBEDSW-7531: Remove npu_block_type on unsupported ops | Raul Farkas
Change-Id: I4f466a7bac77d8bb6fa7243ea2e7c9f3be6d0585
Signed-off-by: Raul Farkas <raul.farkas@arm.com>
2023-06-14 | MLBEDSW-7734: Update Sized import from collections | Rickard Bolin
Update import of Sized from collections to collections.abc to work with Python 3.10
Change-Id: Iae281db9402331972ad13660d04523608b23614d
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
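In generic form, the change looks like the snippet below (Python 3.10 removed the old aliases for the abstract base classes from collections):

```python
# Before (breaks on Python 3.10+, where the alias was removed):
#   from collections import Sized
# After:
from collections.abc import Sized

assert isinstance([1, 2, 3], Sized)
```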
2023-06-14 | MLBEDSW-7748: Add RSQRT support | Johan Alfven
- Added RSQRT int8 support, implemented as LUT.
- Added test to supported operators
- Updated SUPPORTED_OPS.md
Change-Id: I34904772e044be8d22a6dfe426edf85358a205b7
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
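The LUT approach maps each of the 256 possible int8 input values through the reference function once, at compile time. The sketch below shows the general technique with assumed quantization parameters and a simplified handling of non-positive inputs; it does not mirror Vela's LUT code.

```python
import numpy as np

def rsqrt_int8_lut(ifm_scale, ifm_zp, ofm_scale, ofm_zp):
    """Schematic int8 LUT for RSQRT: dequantize, apply 1/sqrt, requantize."""
    lut = np.zeros(256, dtype=np.int8)
    for i, q in enumerate(range(-128, 128)):
        x = ifm_scale * (q - ifm_zp)              # dequantize input value
        y = 1.0 / np.sqrt(x) if x > 0 else 0.0    # reference op (simplified for x <= 0)
        q_out = int(round(y / ofm_scale)) + ofm_zp
        lut[i] = np.clip(q_out, -128, 127)        # requantize and saturate
    return lut

# Assumed example quantization parameters, for illustration only.
table = rsqrt_int8_lut(ifm_scale=0.05, ifm_zp=-128, ofm_scale=0.05, ofm_zp=-128)
print(table[:8])
```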
2023-06-14 | MLBEDSW-7147: Enable weight buffering when opt for Size | Johan Alfven
- When optimizing for Size the scheduler does not try to add weight buffering to the schedule, since this would add extra SRAM usage to the peak usage. However, for all other ops that use less SRAM than the peak there is memory available that could be used for weight buffering and hence improve performance.
- Removed the limitation to only run optimize schedule when optimizing for Performance. Regardless of optimizing for Performance or Size, the scheduler flow is the same except that the limit for max SRAM usage is different.
Change-Id: I6880b35655e37b4916a9c15150f0b8e5126a1cd8
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
2023-06-13 | MLBEDS-7714: Fix assert for cascaded Resize op | Johan Alfven
- Cascading was recently enabled for Resize ops. A Resize op is transformed into several ops; in this case the last op is a DepthwiseConv2DBias using NEAREST resampling mode. This resampling/upscaling is not taken into account when calculating the ifm box size, causing the coordinates to get out of bounds.
- When generating the high level command stream there is a check to see if an op is a resize op. If this is the case, an upscaling factor is calculated. The fix is to change this check to instead see if the operator is using NEAREST resampling mode. If that is true, the scaling factor should be used.
Change-Id: I5308a383cc3310c53004ccfe2d6fabf256478a26
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
2023-05-31 | MLBEDSW-7600: MLCE: Enable cascading for resize ops | Johan Alfven
- Added a fix when building the minimum schedule, forcing the stripe to be even for is_nearest ops. This is required to allow cascading for resize ops.
- Removed the limitation in the cascade builder that prevented resize ops from being cascaded.
Change-Id: I05150102b91531ecba786936494f1817a4472f42
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
2023-05-17 | MLBEDSW-7494: Update release notes (tag: 3.8.0.rc2) | Tim Hall
- Added release information
- Minor changes to SUPPORTED_OPS.md including version info
Change-Id: I91fae4c40c6c1f25b874268b18d077a9babd4875
Signed-off-by: Tim Hall <tim.hall@arm.com>
2023-05-17 | MLBEDSW-7230: Increase support for 1x1 ResizeBilinear with half_pixel_center=True | Alexander Hansson
Signed-off-by: Alexander Hansson <Alexander.Hansson@arm.com>
Change-Id: I0e9db22c97a9e2fbfee618262ffc43532cfcee2c
2023-05-17 | MLBEDSW-7651: Include license in generated SUPPORTED_OPS.md | Alexander Hansson
Signed-off-by: Alexander Hansson <Alexander.Hansson@arm.com>
Change-Id: I35fd042d572f62122ac681c231798c9f2163fc00
2023-05-17 | MLBEDSW-7223: Fusing Pad and AvgPool causes diff | Tim Hall
- Fixed an issue with the fusing of PAD and AVERAGE_POOL_2D whereby the rounding away from zero did not work, because it requires the zero point to be at zero but the input padding required it to be set to the desired zero point. This affected both int8 and int16. The solution was to remove it by using the bias prior to the scaling.
- Refactored the rounding away from zero mode
Change-Id: I8f2df69df06d2a9722315c346646e5a901cb2c3b
Signed-off-by: Tim Hall <tim.hall@arm.com>
2023-05-15 | MLBEDSW-7613: Crash when compiling model with resource variables | Johan Alfven
Fixed serializing of the attribute container and shared_name that accidentally got lost when fixing the crash for a faulty LSTM model.
Change-Id: Ibd11da65735112bed4b1c8bcc4ef048bc093ebc4
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
2023-05-15 | MLBEDSW-7579: Fix test_build.py test issues | Raul Farkas
* Fix import order in test_build.py
* Fix setup_tools_scm dependency version. Previously the version was restricted to < 6, creating a version restriction on the Setuptools library too. Because an older version of Setuptools was used, running test_build.py::test_build_correct_readme_links would generate an UNKNOWN.egg-info directory in the src directory instead of an ethos_u_vela.egg-info directory.
Change-Id: I113ca25b23b39d43fa288e6eda16377f4f5b4143
Signed-off-by: Raul Farkas <raul.farkas@arm.com>
2023-05-15 | MLBEDSW-7390: Add verbose progress option | Raul Farkas
Add --verbose-progress CLI option used to enable printing progress information in the compiler driver and scheduler.
Change-Id: I99ac8c6a654e60391d5c11e28b89250405daa53a
Signed-off-by: Raul Farkas <raul.farkas@arm.com>
2023-05-15 | MLBEDSW-7428: Remove unused rescale_for_faf | Rickard Bolin
Remove the unused parameter rescale_for_faf.
Change-Id: Id388d307f3eb0d27bce813ab58e3c9a5f4ba89ae
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
2023-05-10 | MLBEDSW-7283: Add opt cases for strided CONV2D | Raul Farkas
* Implement a general optimization solution for strided CONV2D that supports a stride_w with no upper bound.
* Implement filter zero padding to allow for optimization in those cases in which the filter width is not divisible by the stride width. E.g.: Filter width = 8, stride width = 3 -> Filter width = 8 + 1 (0 padding) = 9, stride width = 3
* Implement partial optimization to reduce the stride to hw supported strides (i.e. 2 and 3) when optimizing to reach a stride = 1 is not possible due to the IFM width not being divisible by the stride width.
* Implement optimization for when SAME padding is used. If the pre-opt and post-opt padding do not match, add zero padding to the filter so that the post-opt IFM padding matches.
Change-Id: Ia66b0d107281fa9993f6bf4d0c26627ee743253b
Signed-off-by: Raul Farkas <raul.farkas@arm.com>
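The filter zero-padding step from the width-8, stride-3 example above can be written out in NumPy as follows. The weight layout (filter width on axis 2) is an assumption made for illustration; this is a sketch of the idea, not Vela's rewrite.

```python
import numpy as np

stride_w = 3
weights = np.ones((16, 1, 8, 3))             # assumed layout: axis 2 is the filter width

# Pad the filter width with zero columns until it is divisible by the
# stride, so the filter can be split into stride-sized column groups.
w = weights.shape[2]
pad_w = (-w) % stride_w                      # 8 -> 1 extra zero column
weights_padded = np.pad(weights, ((0, 0), (0, 0), (0, pad_w), (0, 0)))

assert weights_padded.shape[2] == 9
assert weights_padded.shape[2] % stride_w == 0
```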
2023-05-10 | Revert "MLBEDSW-6343: Remove op_index constraint" | Raul Farkas
This reverts commit 72c6a2414205e033279f80b622cdf479c05a4f5b.
Reason for revert: Fix performance regression caused by breaking cascades in certain models
Change-Id: I5aba6e3c59ab27c5129f4a3f0c320ed18df78943
Signed-off-by: Raul Farkas <raul.farkas@arm.com>
2023-05-10 | MLBEDSW-7578: Fix output diff caused by wrong rounding in Conv2d | Johan Alfven
- The reference calculates the rounding differently between int8 and int16 for Conv2d. However, internally a Conv2d can be changed to a FullyConnect, but then the rounding must still be calculated following the Conv2d reference.
- The fix is to check the original type to decide if NATURAL rounding should be used or not. int16 Conv2d uses NATURAL rounding in the reference.
Change-Id: I80d48b54372ef7b978ee2e9384a01934dd454e24
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
2023-05-10 | MLBEDSW-7572 Update LSTM with new constant precision | Fredrik Svedberg
Updated the Q0_15_SCALE constant to match the updated value in the reference.
Change-Id: Id680748c532d41fea9760ec76c0b65c0c3e73a13
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
2023-05-05 | MLBEDSW-7385: Unbound local var bug fix | Raul Farkas
Treat dynamic weights as FeatureMap to avoid issues during scheduling caused by having non-constant ops that produce tensors used as weights.
Change-Id: I2b9ee7fb62a150c5052c6c3b1a6d34f22e9426a9
Signed-off-by: Raul Farkas <raul.farkas@arm.com>
2023-05-04 | MLBEDSW-7542: Fix output diff caused by wrong scaling in Conv2d | Johan Alfven
- The reference calculates the scale slightly differently between Conv2d and FullyConnect. Recently a fix was submitted to address this issue. However, internally a Conv2d can be changed to a FullyConnect, but then the scale must still be calculated following the Conv2d reference.
- The fix is to check the original type to decide if the FullyConnect scale should be used or not.
Change-Id: I5a9fb49126f0df63712b73fb5520fdc604cee378
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
2023-05-04 | MLBEDSW-7504: Vela does not keep op version number | wilisa01
We now read the operator code version, store it in the operator, and write it out to the optimized file.
Signed-off-by: wilisa01 <william.isaksson@arm.com>
Change-Id: Idba672531d2e2a0203a85d3ffca9cf65ace85b47
2023-05-03 | MLBEDSW-7450: Fix NumPy out-of-bound conversion warning | Raul Farkas
Change-Id: I50b85953bff13bd6ec0648dec5d86b8ac749137a
Signed-off-by: Raul Farkas <raul.farkas@arm.com>
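For context, the kind of warning referred to looks like the snippet below; the exact behaviour depends on the NumPy version, and this is a generic illustration rather than the code Vela changed.

```python
import numpy as np

value = 300  # does not fit in int8

# Depending on the NumPy version, the following emits a DeprecationWarning
# about out-of-bound Python integer conversion (newer releases raise an error):
#   np.array(value, dtype=np.int8)

# One warning-free way to make the wrap-around explicit:
wrapped = np.array(value, dtype=np.int64).astype(np.int8)
print(wrapped)  # 44, i.e. 300 wrapped into the int8 range
```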
2023-05-03 | MLBEDSW-4178: Add automatic tag handling tests | Raul Farkas
* Add test to verify that the metadata produced in the PKG-INFO file of the sdist contains the correctly formatted links extracted from README.md
Change-Id: I300094470fd115b1143aa8c663837e8a77428f24
Signed-off-by: Raul Farkas <raul.farkas@arm.com>
2023-05-03 | MLBEDSW-7416: per_layer_csv has extra row | wilisa01
Removed the redundant row.
Change-Id: I8b90df3b45ed863c93572b33f695b06094103015
Signed-off-by: wilisa01 <william.isaksson@arm.com>
2023-05-03 | MLBEDSW-7234: Vela assert on int16 Min+Relu | wilisa01
Fixed by using the sched op instead of the last pass op.
Change-Id: I2e03d39462ca07372d85c71e78189bd8c58a1b9c
Signed-off-by: wilisa01 <william.isaksson@arm.com>
2023-05-02 | MLBEDSW-7545: Fix assert when serializing a tensor | Johan Alfven
- The assert triggers when a constant tensor is being assigned to buffer 0, which is a violation.
- The test case that triggered this problem revealed an error in the reader code. If the input tensor has constant data it should be using a Const op. Before this fix it was assigned a Placeholder op and the tensor ended up in the scratch area instead of the permanent area.
Change-Id: I4f92fb5ec1f0dc594defbaca0335eabe68fd5137
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
2023-05-02 | MLBEDSW-2082: Add Exp support | Johan Alfven
- Added int8 and int16 Exp support, implemented as LUT.
- Added generic 8-bit and 16-bit LUT table functions following the implementation in the latest reference. If new ops are added by the reference, they can easily be implemented in Vela using the generic functions.
- Moved convert_to_lut to lut.py to have all LUT related code in one file.
- Updated SUPPORTED_OPS.md
Change-Id: I388e76ea4b39162313599a5341cfb9bad71a782c
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
2023-05-02 | MLBEDSW-7443: Temporal mem usage is erroneously calculated | Johan Alfven
- The array allocated in get_temporal_memory_usage is too small, so the first error is that not all LiveRange elements are added to the temporal mem usage. The second error happens because use_fast_storage_for_feature_maps is correctly trying to update the temporal mem usage array but an assert happens due to out of bounds. The array is too small since the LiveRangeClass is reporting the wrong end time, because of some inconsistencies in how the mark usage is done for subgraph tensors.
- The fix is to mark the tensors with the current_time value. Also changed so that tensors are marked consistently in both extract functions. This means that the end time value to use in get_temporal_memory_usage is current_time + 1.
- Also made a small update to avoid updating current_time twice when handling subgraphs.
Change-Id: Ib7e3681e370e097e433acb235740dfd69fa3ce8b
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
2023-04-28 | MLBEDSW-7503: Avoid changing buffer index for models with only CPU ops | Johan Alfven
- When compiling a model that only contains CPU ops, Vela unnecessarily adds an empty buffer.
- This extra buffer is added because the fast scratch tensor always occupies index 1.
- Since scratch and fast_scratch do not have any constant data, they can use buffer 0.
Change-Id: I25e1fb124deed7069641bde1f571b522c5bf763a
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
2023-04-27 | MLBEDSW-7530: Enable int16 input precision for mean operator | Rickard Bolin
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
Change-Id: Iaeb8f2cea0d3b576a6b138e64a882c701ac88ccb
2023-04-27 | MLBEDSW-7527: Mean operator output diff | Rickard Bolin
Mean operators with height larger than 64 are reshaped, but the IFM shape was then reset to the original value, causing an output diff.
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
Change-Id: I3a89d4efac53173cbd6fe0a5c0542e028bed42ad
2023-04-25 | MLBEDSW-6954: Update to TensorFlow 2.11 | Rickard Bolin
Updated FlatBuffers autogenerated files to TensorFlow 2.11
Change-Id: Ia39d30b06e9a37c9ab119d501ebf442f32167afe
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
2023-04-24 | MLBEDSW-7501: Vela unnecessarily adds reshaped weights tensors | Johan Alfven
- Weights are internally cloned and reshaped/transposed when running on the NPU. This already happens in the reader. If the op is passed through to the CPU, there is code that writes back these clones, but with another round of reshape/transpose. This adds extra tensors in the optimized file compared to the original file if the original tensors are subgraph inputs.
- If the op is passed through to the CPU, the clones should not be written to the file. Solved this by setting the src_tensor when making the clone.
Change-Id: I9f55d542c099882882920bffe8e15b43b2ca2c8d
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
2023-04-24 | MLBEDSW-7458: Fused activation not passed through correctly | Johan Alfven
- Fixed a problem where the fused activation got lost when the op was passed through to the CPU
- The fix is to always make sure the attribute is not removed
Change-Id: I612cfa8f6f0a0465459080762094fe61e7ddc1c3
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
2023-04-21 | MLBEDSW-7373: Vela sometimes writes empty buffers in incorrect format | Tim Hall
- Fixed an issue whereby a zero-length buffer was written out instead of an empty buffer
- Added a warning message to highlight when this type of semantically incorrect empty buffer is read from an input network
Change-Id: Iac3bc71a2dbfda53737bbeb6e7f895552f0f13d0
Signed-off-by: Tim Hall <tim.hall@arm.com>
2023-04-21 | MLBEDSW-7408: MLCE: Crash when serialising model LSTM | Tim Hall
- Added checking and reporting of missing operator attributes when reading and writing TFLite file
- Added a TFLite semantic check to ensure that all required attribute fields of builtin operators are read
- Added some sanity checks for RESHAPE operators that run on the Ethos-U
- Stopped CPU operators from having their attributes modified
Change-Id: I05700681acdb09554f5945819717c08a9457295c
Signed-off-by: Tim Hall <tim.hall@arm.com>
2023-04-19 | MLBEDSW-7487: Updated implementation for the Mean op | Johan Alfven
- The latest reference has changed the implementation for the Mean op and now only contains one variant.
- Updated the Vela implementation to match the reference. The full sum is first calculated and then divided by the number of elements.
- Removed the avg pool variant and test case.
- Updated SUPPORTED_OPS.md
Change-Id: I4275e36e3697fa837f119f2cefd7c0ff94231605
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
2023-04-17 | MLBEDSW-7196 Add LSTM support | Fredrik Svedberg
Added int8 and int16 UNIDIRECTIONAL_SEQUENCE_LSTM support. The implementation does not include support for:
* CIFG
* Peephole
* Projection
* Normalisation
This change also:
* Removed the unused Op.BlockLSTM operation type.
* Removed the only-one-consumer limitation on putting the SplitSliceRead on the tensor consumer(s), if all consumers fulfil the requirements
* Added Op.VariableTensorWrite as an Operation.memory_function to make sure writes to variable tensors:
  * Always use linear mode
  * Are not moved to fast scratch
  * Are not fused with other elementwise operation tensor ranges
Change-Id: Ief831738924ac3d1f2ba6d41f10bd6dc969911f3
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
2023-04-12 | MLBEDSW-7437: Add 64-bit output support for ArgMax | Johan Alfven
- Added 64-bit support for ArgMax
- Updated constraints for ArgMax and regenerated SUPPORTED_OPS.md
Change-Id: I4ef7d2e6fccab0088b87757f6afe40a006c77bbd
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
2023-04-04 | MLBEDSW-7442: Removed ofm quantization for ArgMax | Johan Alfven
- Quantization for the OFM was added for the ArgMax operator as a workaround to avoid a crash in the weight compressor. This quantization is now removed.
- The weight compressor expects that all tensors have a quantization. Updated the code to use scale = 1.0 and zero point = 0 for tensors without quantization.
Change-Id: I6816dce2db55f7d795d19f88d7fbe7ee419347fc
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
2023-03-31 | MLBEDSW-7439: Add support for input dims < 4 for ArgMax | Johan Alfven
- Updated ARG_MAX to support IFM rank less than 4
- Regenerated SUPPORTED_OPS.md
Change-Id: Icd8e72733279413cbea49021325e1ab06fdc6011
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
2023-03-27 | MLBEDSW-6343: Remove op_index constraint | Raul Farkas
Remove op_index constraint and force linear format for all Conv2D that have strides that can be optimised.
Change-Id: Idef3508ab074ea9abeacac030eaaa15a00ad1211
Signed-off-by: Raul Farkas <raul.farkas@arm.com>