ethos-u/ethos-u-vela.git

Age	Commit message	Author
2023-09-18	MLBEDSW-8042: MLCE: Add SQUARED_DIFFERENCE support	Johan Alfven
	- Added SQUARED_DIFFERENCE support - Updated SUPPORTED_OPS.md Change-Id: Id83d9d92129e645390c7979759dfdeff7a14c2ee Signed-off-by: Johan Alfven <johan.alfven@arm.com>
2023-09-14	MLBEDSW-8010: Refine fixup_pool_strides to also check stride	Johan Gunnarsson
	Only set stride to (1, 1) if kernel, stride and IFM shape all are equal. And also set padding to VALID to handle ops with SAME padding. Signed-off-by: Johan Gunnarsson <johan.gunnarsson@arm.com> Change-Id: Id3cc34686f09667ea21541fac432351555344e3d
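The rule in this commit message can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Vela's implementation: `PoolOp`, its fields, and `fixup_pool_strides_sketch` are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class PoolOp:
    """Hypothetical stand-in for a pooling operator (not Vela's Operation class)."""
    kernel: Tuple[int, int]  # (W, H)
    stride: Tuple[int, int]  # (W, H)
    ifm_wh: Tuple[int, int]  # (W, H) of the input feature map
    padding: str             # "SAME" or "VALID"


def fixup_pool_strides_sketch(op: PoolOp) -> PoolOp:
    # Only rewrite the op when kernel, stride and IFM shape are all equal.
    # In that case the window covers the whole IFM and never slides, so a
    # stride of (1, 1) with VALID padding describes the same computation.
    if op.kernel == op.stride == op.ifm_wh:
        op.stride = (1, 1)
        op.padding = "VALID"
    return op
```

An op with kernel, stride and IFM all equal to (8, 8) would be rewritten, while an op with a (2, 2) kernel on an (8, 8) IFM would be left untouched.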
2023-09-14	MLBEDSW-8003: Limit fixup_pool_strides to AvgPool and MaxPool	Johan Gunnarsson
	This fixup is not relevant for Resize ops. Signed-off-by: Johan Gunnarsson <johan.gunnarsson@arm.com> Change-Id: I81b9d3c8a6dd820b1e5d747d754100282b93c641
2023-09-13	MLBEDSW-8035: Update to TensorFlow 2.13	William Isaksson
	- Adds 3 ops: Bitcast, BitcastXor, RightShift Change-Id: Ia9721c69d4f3da0deba7526addb95a9a54e63adf Signed-off-by: William Isaksson <william.isaksson@arm.com>
2023-09-12	MLBEDSW-7997: [MLCE] Extended stride support for TRANSPOSE CONV	Johan Alfven
	- Support for stride WxH 1x1 - Support for stride WxH 2x1 when IFM and KERNEL are 1D shapes with height 1 - Added test to supported operators - Updated SUPPORTED_OPS.md Change-Id: Ic1abead8399a5e14a78d962f8aded0d3b3dbfcc4 Signed-off-by: Johan Alfven <johan.alfven@arm.com>
2023-09-06	MLBEDSW-7541: Extend error message when reaching maximum recursion depth	Rickard Bolin
	Extend the error message of RecursionError when reaching default recursion depth with instructions to use the "--recursion-limit" option in Vela. Change-Id: I5c92d49b99203268c4b988f421afe7013ac3511a Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
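The idea of extending a RecursionError with a hint could look something like the sketch below. It is illustrative only: `run_with_recursion_hint` and `endless` are hypothetical names, and the wording of the hint is an assumption rather than Vela's actual message.

```python
import sys


def run_with_recursion_hint(func, *args):
    """Call func and extend any RecursionError with a hint (hypothetical helper)."""
    try:
        return func(*args)
    except RecursionError as err:
        hint = (
            f"{err}. The current limit is {sys.getrecursionlimit()}; "
            'try raising it with the "--recursion-limit" option.'
        )
        # Chain the original error so the full traceback stays available.
        raise RecursionError(hint) from err


def endless(n):
    # Deliberately unbounded recursion, used only to trigger the error.
    return endless(n + 1)
```

Calling `run_with_recursion_hint(endless, 0)` raises a RecursionError whose message contains the hint about the option.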
2023-09-05	MLBEDSW-7968: Add fixup for strides when kernel size equals IFM shape	Johan Gunnarsson
	There are networks out there with Pool ops where filter (W, H) equals IFM (W, H) equals stride (W, H). The stride is technically too large for the NPU, but we can actually run these ops on the NPU since the filter is large enough that the window doesn't slide. To support these ops we need to fix the stride so later checks don't put this op on CPU. Change-Id: I8f0a46b26fb94ee76c33748589536cc5ba07ea59 Signed-off-by: Johan Gunnarsson <johan.gunnarsson@arm.com>
2023-08-29	MLBEDSW-7881: Convert Quantize op to Avgpool op in graph optimiser	Johan Gunnarsson
	This conversion is already done in the pass packing stage, but doing it in the graph optimiser stage is better. Change-Id: Ib9baa98d115cf88491ce39936972a93467a378ce Signed-off-by: Johan Gunnarsson <johan.gunnarsson@arm.com>
2023-08-22	MLBEDSW-7949: [MLCE] Remove duplicate cpu tensors	Johan Alfven
	- If an npu op is followed by a convolution op that runs on the cpu, the optimized file ends up containing a duplicated tensor called _cpu. Functionality wise not a problem but the graph will look strange in a graph viewer. - This error was introduced when removing duplicate weight tensors, but the above use case was not considered in that patch. - The fix is to make sure that only the weight and bias tensors are modified. Change-Id: I576f13650f1f9d3d50a421ab7100fc8b5ab62657 Signed-off-by: Johan Alfven <johan.alfven@arm.com>
2023-08-21	Moving Vela to use TOSA v0.80.0 specification	Rob Elliott
	* Using serialization_lib main branch to update statically copied files sha 5f920211ac23393a7b98a0d358bfbfc3232d5c8f (v0.80.0) * All files within the ethosu/vela/tosa are copied from that revision * Note: hope to move to serialization_lib as a pip module in future * Modified the ethosu/vela/{tosa_mapping,tosa_reader}.py to use v0.80.0 TOSA FlatBuffers implementation * These are the additional changes made to support this new version, with changes in the format of the FlatBuffers file and where various values are stored. Either changing from input to attribute, or moving to different attributes. Signed-off-by: Rob Elliott <robert.elliott@arm.com> Change-Id: I5e1fcc2a9964148619be3477adf1e88e84cbae2d
2023-08-21	MLBEDSW-7702: Update release notes [tag: 3.9.0]	Rickard Bolin
	- Added release information - Modified SUPPORTED_OPS.md version info - Update README.md and classifiers in pyproject.toml to specify Python 3.10 as recommended and tested version Change-Id: I78e5752846f261d4713b89c8efe447bcb9c095dd Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
2023-08-16	MLBEDSW-7884: Fix crash for RSQRT [tag: 3.9.0.rc2]	Johan Alfven
	- RSQRT is only defined for positive numbers and therefore the zeropoint and actual input value will have an impact - Clamp the range to avoid crashing. As long as the actual input is within valid range everything works. If the input is not valid the reference will crash and not generate any output Change-Id: I1082b508d9cd85ad4b017e7b786cfff730585172 Signed-off-by: Johan Alfven <johan.alfven@arm.com>
2023-08-10	MLBEDSW-7832: test_tflite_model_semantic converting array to scalar	William Isaksson
	- now only converts array directly if ndim==0 Signed-off-by: William Isaksson <william.isaksson@arm.com> Change-Id: Id23e419bc7dd717f9694013180d4609819fd2f56
2023-08-09	MLBEDSW-7754: Performance estimator is not using write/read shapes [tag: 3.9.0.rc1]	William Isaksson
	- npu_performance now uses write/read shapes instead of using ifm/ofms for memory cycle estimations. - also fixes a would-be bug in the tflite_graph_optimiser, where one read shape is not Shape4D. Change-Id: I2067069a713d2cf9e65a5cc227e803de79940fff Signed-off-by: William Isaksson <william.isaksson@arm.com>
2023-08-09	MLBEDSW-7626: Add constraint for PAD op paddings	Johan Gunnarsson
	PAD input tensor shape plus paddings must equal output tensor shape. Change-Id: Icc5dea9bf6a8f6e1c8402f4d9af4d9796e8ef1aa Signed-off-by: Johan Gunnarsson <johan.gunnarsson@arm.com>
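The constraint stated in this commit message can be expressed as a small check. The sketch below is illustrative only: `pad_shapes_consistent` is a hypothetical helper, and it assumes the paddings are given as one (before, after) pair per dimension.

```python
from typing import Sequence, Tuple


def pad_shapes_consistent(
    ifm_shape: Sequence[int],
    paddings: Sequence[Tuple[int, int]],
    ofm_shape: Sequence[int],
) -> bool:
    """Return True if input shape plus paddings equals output shape (hypothetical helper)."""
    # All three sequences must describe the same number of dimensions.
    if not (len(ifm_shape) == len(paddings) == len(ofm_shape)):
        return False
    # Per dimension: input size + padding before + padding after == output size.
    return all(
        dim + before + after == out
        for dim, (before, after), out in zip(ifm_shape, paddings, ofm_shape)
    )
```

For example, an input of shape [1, 4, 4, 3] padded by 1 on each side of the second dimension and 2 on each side of the third is consistent with an output of shape [1, 6, 8, 3], but not with [1, 6, 6, 3].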
2023-08-08	MLBEDSW-7689: Document verbose command stream options	Tim Hall
	- Documented High-Level and register-Level command stream options - Changed High-Level command stream display to show the name of the command - Fixed an issue with some operators not being displayed by the CLI option --verbose-operators - Changed an unneeded print in pass packing to a more useful assertion Change-Id: I9d53f19f4e32d0478209bc964724c27c935f66d6 Signed-off-by: Tim Hall <tim.hall@arm.com>
2023-08-08	MLBEDSW-7656: Update Python versions in README	Tim Hall
	- Added Python support information - Clarified TensorFlow support information - Updated Requires-Python version to 3.8 Change-Id: Iab38a2f4480e58a1bd36d5055342c4bf7379dd09 Signed-off-by: Tim Hall <tim.hall@arm.com>
2023-08-07	MLBEDSW-7865: Vela duplicates outputs	William Isaksson
	We now don't rewrite tensors if the tensor is already an output tensor of the current subgraph Signed-off-by: William Isaksson <william.isaksson@arm.com> Change-Id: I9cb36d830616a69d35180326437ff53bcaa62d71
2023-08-04	MLBEDSW-7681: Add Vela version to output file	William Isaksson
	Adds Vela version to description and metadata Change-Id: I75fccd1a05a396612a249b8ec1662d8cae940ee6 Signed-off-by: William Isaksson <william.isaksson@arm.com>
2023-07-31	MLBEDSW-7846: Number of CPU Ops reported is wrong	William Isaksson
	- Added support for multiple npu subgraphs to have the same cpu output tensor Change-Id: I2e787306dd64af9b03cdf2bacb4c9ff7119f6c49 Signed-off-by: William Isaksson <william.isaksson@arm.com>
2023-07-31	MLBEDSW-7397: Wrong mem_area used in scheduler	wilisa01
	Performance estimation now uses the parent_tensor mem_area instead of the scheduler_op mem_area, because the mem_area is only set on the parent_tensor by the scheduler. Signed-off-by: wilisa01 <william.isaksson@arm.com> Change-Id: I11f73686bfbd6958a8920c5e264a5f95cc3f23d1
2023-07-31	MLBEDSW-7718: Add cmd1 payload legality checks	William Isaksson
	- checks that cmd1 payloads are legal in register_command_stream_generator, - adds unit tests Change-Id: I2bc23147f60fe090c71703f08d9cbaa279fac86e Signed-off-by: William Isaksson <william.isaksson@arm.com>
2023-07-24	MLBEDSW-7165: Update to TensorFlow 2.12	William Isaksson
	- Updated FlatBuffers files using TensorFlow 2.12.0 schema - Added restriction for UnidirectionalSequenceLSTM to have 2D recurrent weights to handle that diagonal_recurrent_tensors attr is not currently supported. Change-Id: I104fd1f52485b9b83d644772dbcdeea2d17585f0 Signed-off-by: William Isaksson <william.isaksson@arm.com>
2023-07-12	MLBEDSW-7756: MLCE: Grouped convolutions runtime problem	Tim Hall
	- Added graph optimiser function to convert convolution groups into a split followed by separate convolutions and then a concat - Added semantic check for convolution groups - Added unit tests for convolution groups semantic checks - Fixed a minor typing issue with test_constraint_stride_range Change-Id: I78ade408aa23469a79c9f517c4751da8619b77a9 Signed-off-by: Tim Hall <tim.hall@arm.com>
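The split, convolve, concat decomposition described in this commit message can be pictured in terms of channel ranges. The sketch below only computes which input and output channels each per-group convolution would cover; `group_channel_ranges` is a hypothetical helper, and the assumption that channels are divided evenly and contiguously between groups is made here for illustration.

```python
from typing import List, Tuple


def group_channel_ranges(
    ifm_depth: int, ofm_depth: int, groups: int
) -> List[Tuple[range, range]]:
    """Return (input channels, output channels) for each group (hypothetical helper)."""
    if groups < 1 or ifm_depth % groups or ofm_depth % groups:
        raise ValueError("depths must be divisible by the number of groups")
    ifm_step = ifm_depth // groups
    ofm_step = ofm_depth // groups
    # Group g reads one contiguous slice of the IFM (the "split") and writes
    # one contiguous slice of the OFM (the pieces that are later concatenated).
    return [
        (
            range(g * ifm_step, (g + 1) * ifm_step),
            range(g * ofm_step, (g + 1) * ofm_step),
        )
        for g in range(groups)
    ]
```

With 8 input channels, 4 output channels and 2 groups, the first convolution would read channels 0 to 3 and produce channels 0 to 1, and the second would read channels 4 to 7 and produce channels 2 to 3.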
2023-07-11	MLBEDSW-7653: Extend Mean support for depth axis	Alexander Hansson
	If any of H,W axes have shape 1, the IFM can be reshaped to support reduction over the depth axis. Signed-off-by: Alexander Hansson <Alexander.Hansson@arm.com> Change-Id: I432ff1c399b7cee4ca5f0a8f4461e9c0a936d804
2023-07-11	MLBEDSW-7652: Add mean support for batch and channel when shape is 1	Alexander Hansson
	- Add support for batch and depth channels when shape is 1 - Refactor reshaping in convert_mean_to_depthwise_conv Signed-off-by: Alexander Hansson <Alexander.Hansson@arm.com> Change-Id: If663395934ab58c76ba92b6ebaaf484a389ae699
2023-07-11	MLBEDSW-7728: Fix DMA_WAITs in register_command_stream_generator	Alexander Hansson
	* Fix bug in register_command_stream_generator where certain high-level command streams resulted in missing DMA_WAIT commands * Add unit-tests for DMA_WAIT and KERNEL_WAIT commands Signed-off-by: Alexander Hansson <Alexander.Hansson@arm.com> Change-Id: Iabb3ea3e95fa1ef933c50356d047b6b3f5aeafe3
2023-07-10	MLBEDSW-7833: MLCE: Fixed output diff for reshape op	Johan Alfven
	- In order to reduce memory usage, the live range mechanism has logic to check if the ifm tensor can be reused for the ofm tensor for certain operators - In this failing test case, the input to the reshape/memcpy operator has more than one consumer and this results in a faulty memory overwrite since logic that should check the ifm consumers for the memcpy operator is missing - The fix is to add the missing logic that ifm can only have one consumer Change-Id: I2184c0f905b554f648c9732734098509e23b537c Signed-off-by: Johan Alfven <johan.alfven@arm.com>
2023-07-10	MLBEDSW-7752: setting query shapes to Shape4D(0)	William Isaksson
	Changes query initialization shapes to Shape4D(0,0,0,0) = [0,0,0,0] instead of Shape4D(0) = [0,1,1,1]. The [0,1,1,1] tensors would affect performance estimates and are not real. Change-Id: Ic83b6f6a70c0c904b500f62756e1e125c99856c6 Signed-off-by: William Isaksson <william.isaksson@arm.com>
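The difference between the two initializations can be seen with a small stand-in class. This is a sketch only: `Shape4DSketch` is hypothetical and not Vela's Shape4D, and the default of 1 for omitted dimensions is an assumption inferred from the `Shape4D(0) = [0,1,1,1]` statement in this commit message.

```python
from typing import List, NamedTuple


class Shape4DSketch(NamedTuple):
    """Hypothetical 4D shape whose omitted dimensions default to 1."""
    batch: int = 1
    height: int = 1
    width: int = 1
    depth: int = 1

    def as_list(self) -> List[int]:
        return list(self)


# Passing a single 0 only zeroes the first dimension.
partial = Shape4DSketch(0).as_list()
# Passing four zeros gives a shape that is empty in every dimension.
empty = Shape4DSketch(0, 0, 0, 0).as_list()
```

Under these assumptions `partial` still carries height, width and depth of 1, which is the kind of non-empty placeholder the commit avoids.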
2023-07-06	MLBEDSW-7832: test_tflite_model_semantic converting array to scalar	Tim Hall
	- The problem is that the axis value can be either a scalar or an array containing a single element - The solution is to check the length of the shape because the size attribute returns the same value for both cases - This did not show up before because pytest warnings were not being treated as errors - Removed pre-commit pytest option that caused tests to be searched for from the root directory - Updated pyproject.toml pytest options to explicitly specify the test directories, and to treat warnings as errors Change-Id: I037054768e5c34f253b6062eadba1c3419ff65e4
2023-06-28	MLBEDSW-7716: Improve register level unit tests	Alexander Hansson
	* Improve check_cmd functions to return position of the checked commands. * Update existing unit-tests to validate ordering of commands. Signed-off-by: Alexander Hansson <Alexander.Hansson@arm.com> Change-Id: I492487d768e1e80f6ea366e29f2f99441e4f9797
2023-06-20	MLBEDSW-7449: Add function description and type annotations	Raul Farkas
	Add function description and type annotations to the optimization functions missing them. Fix type annotation issue when re-assigning variable value to a different type. Change-Id: I1ee442ff7a29cc07708fdd013430131eff599dd5 Signed-off-by: Raul Farkas <raul.farkas@arm.com>
2023-06-19	MLBEDSW-7654: Extend support for Mean where HxW > 4096	Alexander Hansson
	* Convert Means with large IFMs to several DepthwiseConv2DBias and Add operations. * Update tflite supported operator check with new height and width constraints. * Update unit-tests to verify supported operator changes. * Fix output-diff for 2D IFMs (MLBEDSW-7772) Signed-off-by: Alexander Hansson <Alexander.Hansson@arm.com> Change-Id: Ifae6fb1cdac475ae7dac5116c5f13631ff82108a
2023-06-16	MLBEDSW-7709: MLCE: Crash when rewriting split op	Johan Alfven
	- A crash occurred due to NoneType subscriptable error when rewriting a Slice op. The reason was that the Size tensor did not contain any data. - Added constraint pushing the Slice operator to the CPU if begin or size tensor are empty. - Added test to supported operators - Updated SUPPORTED_OPS.md Change-Id: Ide204cae24e5871f0e6ae1fdc98ac68d0ce4d3ae Signed-off-by: Johan Alfven <johan.alfven@arm.com>
2023-06-16	MLBEDSW-7315: Add support for AvgPool with stride_width > 3	Raul Farkas
	* Convert AvgPool with stride_width > 3 and Valid padding to Conv2D to optimize it to run on NPU. Change-Id: I06ab412357f0b09b1498f9019a9d1963a324ad34 Signed-off-by: Raul Farkas <raul.farkas@arm.com>
2023-06-16	MLBEDSW-7648: Fix bug with filter padding in conv2d	Raul Farkas
	* Fix bug that caused filter padding to not be added proportionally compared to the hardware padding added to IFM. * Update needed_total_padding function that calculates hardware padding to also account for the cases in which IFM width is not divisible by the stride width. * Update supported ops constraint on strides for conv2d to mark ops with stride width > 3 and IFM width that is not divisible by the optimization resize factor as not supported. * Update unit tests that verify correct functionality when checking whether ops are supported or not. Change-Id: I62f14cca890b779ca787a9603fa37c873ad522f8 Signed-off-by: Raul Farkas <raul.farkas@arm.com>
2023-06-15	MLBEDSW-7531: Remove npu_block_type on unsupported ops	Raul Farkas
	Change-Id: I4f466a7bac77d8bb6fa7243ea2e7c9f3be6d0585 Signed-off-by: Raul Farkas <raul.farkas@arm.com>
2023-06-14	MLBEDSW-7734: Update Sized import from collections	Rickard Bolin
	Update import of Sized from collections to collections.abc to work with Python 3.10 Change-Id: Iae281db9402331972ad13660d04523608b23614d Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
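The import change in this commit message amounts to the pattern below. The abstract base class aliases in `collections` were removed in Python 3.10, so `Sized` has to come from `collections.abc`. The helper `length_or_none` is a hypothetical example of a caller, not code from Vela.

```python
# Works on Python 3.3 and later; "from collections import Sized" fails on 3.10+.
from collections.abc import Sized


def length_or_none(obj):
    """Return len(obj) for sized objects, otherwise None (hypothetical helper)."""
    return len(obj) if isinstance(obj, Sized) else None
```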
2023-06-14	MLBEDSW-7748: Add RSQRT support	Johan Alfven
	- Added RSQRT int8 support, implemented as LUT. - Added test to supported operators - Updated SUPPORTED_OPS.md Change-Id: I34904772e044be8d22a6dfe426edf85358a205b7 Signed-off-by: Johan Alfven <johan.alfven@arm.com>
2023-06-14	MLBEDSW-7147: Enable weight buffering when opt for Size	Johan Alfven
	- When optimizing for Size the scheduler does not try to add weight buffering to the schedule since this would add extra SRAM usage to the peak usage. However, for all other ops that use less SRAM than the peak there is memory available that could be used for weight buffering and hence improve the performance. - Removed limitation to only run optimize schedule when optimizing for Performance. Regardless of optimizing for Performance or Size the scheduler flow is the same except that the limit for max SRAM usage is different. Change-Id: I6880b35655e37b4916a9c15150f0b8e5126a1cd8 Signed-off-by: Johan Alfven <johan.alfven@arm.com>
2023-06-13	MLBEDS-7714: Fix assert for cascaded Resize op	Johan Alfven
	- Cascading was recently enabled for Resize ops. A Resize op is transformed into several ops. In this case the last op is a DepthwiseConv2DBias using NEAREST resampling mode. This resampling/upscaling is not taken into account when calculating the ifm box size, causing the coordinates to get out of bounds. - When generating the high level command stream there is a check to see if an op is a resize op. If this is the case an upscaling factor is calculated. The fix is to change this check to instead see if the operator is using NEAREST resampling mode. If that is true, the scaling factor should be used. Change-Id: I5308a383cc3310c53004ccfe2d6fabf256478a26 Signed-off-by: Johan Alfven <johan.alfven@arm.com>
2023-05-31	MLBEDSW-7600: MLCE: Enable cascading for resize ops	Johan Alfven
	- Added fix when building the minimum schedule forcing the stripe to be even for is_nearest ops. This is required in order to be able to allow cascading for resize ops. - Remove limitation in cascade builder that prevents resize ops to be cascaded. Change-Id: I05150102b91531ecba786936494f1817a4472f42 Signed-off-by: Johan Alfven <johan.alfven@arm.com>
2023-05-24	MLBEDSW-7528: Update documentation on verbose options [tags: 3.8.0.rc3, 3.8.0]	Rickard Bolin
	Add more detailed explanations to verbose options Change-Id: Ia001e62d4c26ea6ae07949c1c434cbfc1cc7e08a Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
2023-05-17	MLBEDSW-7494: Update release notes [tag: 3.8.0.rc2]	Tim Hall
	- Added release information - Minor changes to SUPPORTED_OPS.md including version info Change-Id: I91fae4c40c6c1f25b874268b18d077a9babd4875 Signed-off-by: Tim Hall <tim.hall@arm.com>
2023-05-17	MLBEDSW-7230: Increase support for 1x1 ResizeBilinear with half_pixel_center=True	Alexander Hansson
	Signed-off-by: Alexander Hansson <Alexander.Hansson@arm.com> Change-Id: I0e9db22c97a9e2fbfee618262ffc43532cfcee2c
2023-05-17	MLBEDSW-7651: Include license in generated SUPPORTED_OPS.md	Alexander Hansson
	Signed-off-by: Alexander Hansson <Alexander.Hansson@arm.com> Change-Id: I35fd042d572f62122ac681c231798c9f2163fc00
2023-05-17	MLBEDSW-7223: Fusing Pad and AvgPool causes diff	Tim Hall
	- Fixed an issue with the fusing of PAD and AVERAGE_POOL_2D whereby the rounding away from zero didn't work because it requires the zero point to be at zero but the input padding required it to be set to the desired zero point. This affected both int8 and int16. The solution was to remove it by using the bias prior to the scaling - Refactored the rounding away from zero mode Change-Id: I8f2df69df06d2a9722315c346646e5a901cb2c3b Signed-off-by: Tim Hall <tim.hall@arm.com>
2023-05-15	MLBEDSW-7613: Crash when compiling model with resource variables	Johan Alfven
	Fixed serializing of attribute container and shared_name that accidentally got lost when fixing the crash for a faulty LSTM model. Change-Id: Ibd11da65735112bed4b1c8bcc4ef048bc093ebc4 Signed-off-by: Johan Alfven <johan.alfven@arm.com>
2023-05-15	MLBEDSW-7579: Fix test_build.py test issues	Raul Farkas
	* Fix import order in test_build.py * Fix setuptools_scm dependency version. Previously the version was restricted to < 6, creating a version restriction on the Setuptools library too. Because an older version of Setuptools was used, running test_build.py::test_build_correct_readme_links would generate an UNKNOWN.egg-info directory in the src directory instead of an ethos_u_vela.egg-info directory. Change-Id: I113ca25b23b39d43fa288e6eda16377f4f5b4143 Signed-off-by: Raul Farkas <raul.farkas@arm.com>
2023-05-15	MLBEDSW-7390: Add verbose progress option	Raul Farkas
	Add --verbose-progress CLI option used to enable printing progress information in the compiler driver and scheduler. Change-Id: I99ac8c6a654e60391d5c11e28b89250405daa53a Signed-off-by: Raul Farkas <raul.farkas@arm.com>