Age | Commit message | Author |
|
* Add test to verify that the metadata produced in the PKG-INFO file of
the sdist contains the correctly formatted links extracted from
README.md
Change-Id: I300094470fd115b1143aa8c663837e8a77428f24
Signed-off-by: Raul Farkas <raul.farkas@arm.com>
|
|
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
Change-Id: Iaeb8f2cea0d3b576a6b138e64a882c701ac88ccb
|
|
- Latest reference has changed the implementation of the Mean op
and now contains only one variant.
- Updated the Vela implementation to match the reference. The full sum
is calculated first and then divided by the number of elements.
- Removed the avg pool variant and test case.
- Updated SUPPORTED_OPS.md
Change-Id: I4275e36e3697fa837f119f2cefd7c0ff94231605
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
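A minimal NumPy sketch of the single remaining variant (illustrative,
not Vela's internal code): accumulate the full sum at higher precision,
then divide once by the element count.

```python
import numpy as np

# Illustrative only: full sum first, a single division at the end,
# rather than chaining partial averages.
def mean_full_sum(ifm: np.ndarray, axis: int) -> np.ndarray:
    n = ifm.shape[axis]
    acc = ifm.astype(np.int64).sum(axis=axis)   # full sum, no intermediate rounding
    return np.round(acc / n).astype(ifm.dtype)  # divide by the number of elements
```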
|
|
Added int8 and int16 UNIDIRECTIONAL_SEQUENCE_LSTM support.
The implementation does not include support for:
* CIFG
* Peephole
* Projection
* Normalisation
This change also:
* Removed unused Op.BlockLSTM operation type.
* Removed the only-one-consumer limitation on putting the SplitSliceRead
on the tensor consumer(s), if all consumers fulfill the requirements
* Added Op.VariableTensorWrite as an Operation.memory_function to make
sure writes to variable tensors:
* Always use linear mode
* Are not moved to fast scratch
* Are not fused with other elementwise operation tensor ranges
Change-Id: Ief831738924ac3d1f2ba6d41f10bd6dc969911f3
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
|
|
Refactored move_constant_data in the scheduler. The use case currently
only works for LUT tensors, so the logic has been simplified. In order
to make it work for other tensors, one would also have to take memory
usage into consideration when building cascades, and
use_fast_storage_for_feature_maps would be affected.
Change-Id: Ic8de53b65a2c17d34515002d7f184d0ab1830222
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
|
|
Since the test works by creating an overflow, NumPy is set to ignore
overflow for this test case.
Change-Id: I74d03e8d73455295168352542dcb844283d54d33
Signed-off-by: wilisa01 <william.isaksson@arm.com>
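The mechanism is NumPy's errstate context manager; a hedged sketch of
the pattern (the actual test lives in Vela's test suite):

```python
import numpy as np

# The overflow is intentional here, so NumPy is told to ignore it
# locally instead of emitting a warning for the whole test run.
with np.errstate(over="ignore"):
    wrapped = np.int32(2**31 - 1) + np.int32(1)  # wraps to -2**31, no warning
```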
|
|
Reinstate constraint for stride height to (1,3) instead of (1,4) for
Conv2D and update unit tests.
Change-Id: I17389ee040eeff0cea08279cab1c038e951569ea
Signed-off-by: Raul Farkas <raul.farkas@arm.com>
|
|
- Additional overflow checks are performed when running under
Microsoft Windows compared to Linux. These checks happen when
converting from Python int to NumPy int/uint
- The problem is that the LUT activation values are of type int32,
however they are defined as Python ints. If these are converted to
numpy.int32 it could result in an overflow error
- The fix is to convert these values to uint32 but keep the
operator's IFM tensor type the same (as this will allow them to be
interpreted correctly)
- Fixing this highlighted another problem where convert_to_lut
always calls create_lut_tensor() with an int8 datatype, whereas it
should be using the IFM datatype
Change-Id: I781a9d850f654267aa4a67754438607c4bb95685
Signed-off-by: Tim Hall <tim.hall@arm.com>
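A sketch of the conversion described above (the helper name is
illustrative, not Vela's API): mask the Python int to 32 bits and store
it as uint32 so the conversion cannot overflow, while the IFM tensor
type is left unchanged.

```python
import numpy as np

# Illustrative: reinterpret a signed 32-bit value as uint32 so that
# converting from a Python int cannot raise an overflow error.
def to_lut_word(value: int) -> np.uint32:
    return np.uint32(value & 0xFFFFFFFF)  # mask to 32 bits first

assert to_lut_word(-1) == np.uint32(0xFFFFFFFF)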
|
|
* Extend stride range from (1,3) to (1,4)
* Add stride 4 support when optimising CONV_2D
* Add some tests for various strides
Change-Id: Iddaeb42c4a6e02695ecdd3740bc8b9dd59a7eb3c
Signed-off-by: Raul Farkas <raul.farkas@arm.com>
|
|
- An assert in Vela is triggered when the number of splits does
not evenly divide the input.shape[axis] value, in which case the split
offsets are calculated incorrectly.
- The fix is to add the same constraints as in the reference kernel
and only run the Split op on the NPU when the criteria are fulfilled.
- Modified test to reflect the new constraints
- Updated SUPPORTED_OPS.md
Change-Id: I4103ff4a3fdf9a813f5fcb7f51081b859e611100
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
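The added constraint amounts to an even-divisibility check, sketched
here (the function name is illustrative):

```python
# Illustrative form of the constraint, mirroring the reference kernel:
# the split count must evenly divide the dimension being split.
def split_supported_on_npu(input_shape, axis, num_splits) -> bool:
    return input_shape[axis] % num_splits == 0

assert split_supported_on_npu([1, 6, 4], axis=1, num_splits=3)      # 6 / 3 is exact
assert not split_supported_on_npu([1, 7, 4], axis=1, num_splits=3)  # 7 / 3 is not
```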
|
|
- The issue is due to undefined behaviour when casting a NumPy float
to a NumPy unsigned integer which occurs in create_const_tensor()
- The fix is to make sure that the values are first cast to a Python
float
- In addition, the values datatype argument has been removed from
create_const_tensor() to stop the tensor and values datatypes getting
out of sync
Change-Id: I134b9be8c941b361929a5ae7db8cb35f2e9728f2
Signed-off-by: Tim Hall <tim.hall@arm.com>
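A sketch of the safe conversion path (values are illustrative): route
each value through a Python float before the unsigned conversion,
instead of casting a NumPy float to a NumPy uint directly.

```python
import numpy as np

# Illustrative: casting a NumPy float straight to an unsigned NumPy
# integer is undefined behaviour for some values, so the fix goes via
# a Python float (and int) first, where narrowing is well-defined.
raw = np.float32(200.0)
value = np.uint8(int(float(raw)))
```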
|
|
- Only 1D bias shape is supported
- Modified test to reflect the constraint
- Update SUPPORTED_OPS.md
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: I00ae4b229d5f89512cb94f87f276af61cc66a6fd
|
|
- Update copyright notices to use SPDX format and add OSS mail as contact.
- Update years on files where they had been missed.
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
Change-Id: I7e9715ea4e17b76252728c708e46df12ad67ab1f
|
|
- Added graph optimisation pass to support dilations greater than 2
in either dimension
- Removed supported operators restrictions
- Removed erroneous dilation on TRANSPOSE_CONV
- Updated unit tests and documentation
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: Ide302374b0d5eff25c20501383a63f6aa7625c52
|
|
The reference kernel for the MEAN operator has changed.
As a result, the mean implementation can be simplified
and the constraint for mean int8 can be removed.
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: I318e9b495eefea99e7ac4aea4b8c436c83753405
|
|
- Removed half pixel centers constraint for resize nearest neighbor.
- Added support for 2x, 4x and 8x scaling.
- Removed test_constraint_resize_half_pixel_centers
- Regenerated SUPPORTED_OPS.md
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: Ic3e02e9c2b2034d537c9a9841b8fb4ee433c96dc
|
|
Added unit tests for scaling including saturated multiplier test.
Change-Id: I87bb3a4bed8f62f5ef5cf3851b97f09ce42bf2b6
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
|
|
The test failed since the tanh had batch size > 1.
Added checks for batch size for all supported operators.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: I3570352740c40eb96bd9db965dfa3c91c81ff2ad
|
|
- Added support for Resize Bilinear with half pixel centers for int8 and
uint8.
- Utilizes the new "TILE" padding mode.
- Utilizes ofm stride multipliers and modified tile base offsets to
write OFMs interleaved.
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
Change-Id: I37fa77c022a368f05fda0ead75d8696c9205f833
|
|
Added support for int16 LeakyRelu for negative alpha and alpha
greater than one.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: I7f522ebfe014786d0a1d96172e75c7d9bdd76921
|
|
Changed acc type from int16 to int32. This will solve
saturation problems and the constraint added in
commit "MLBEDSW-5029: Output diff for Mean op"
can be removed.
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: I05ec8835b43313b1a264d61a2b147fa62da123fe
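The effect of the accumulator width is easy to demonstrate with
illustrative numbers:

```python
import numpy as np

# Summing 200 values of 300 gives 60000, which does not fit in int16
# (max 32767) but fits comfortably in int32.
values = np.full(200, 300, dtype=np.int16)
acc16 = values.sum(dtype=np.int16)  # wraps around to -5536
acc32 = values.sum(dtype=np.int32)  # correct: 60000
```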
|
|
Fixed three test cases causing output diff compared to
the reference kernel for the Mean operator.
- If there is a possibility that the accumulator could saturate,
the Mean op must run on the CPU
- Use correct rounding for the bias term
- If a Reshape op is followed by a Mean op, push the Reshape op
to the CPU since this cannot be handled by the NPU
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: I734465730372105821a5e2f73a6a125b9eb7d7f4
|
|
- Changed ResizeBilinear to also support ResizeNearestNeighbor for
1x1 IFM, IFM equal to OFM, and non-align corners
- Added support for ResizeNearestNeighbor with align corners by
converting to a DepthwiseConv
- Updated supported operator unit tests
- Added is_resize() helper function and some associated refactoring
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: Id5bdf2a25e8aa6a4f28b7236250abf768141ce37
|
|
- Fixed align corners support when converting into upscale and average
pool. The problem was due to the wrong ratio of IFM to OFM size, causing
a scaling factor that was not 2x/4x/8x. Works for uint8, int8 and int16.
- Fixed checking of align corners in supported operators check
- Added additional supported operators check for the size tensor
- Updated and added more supported operators unit tests
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: Idb78fa9e76ede2c37e8ac6cb1c322154bd156898
|
|
* Quantise op becomes constant if input is known at compile time
* Quantised values calculated if input of op is const and float
* Const inputs to quant op that are int are requantized
Change-Id: Ic94a72a392af709fe6a640d7dacbb5dc2334f16f
Signed-off-by: Ayaan Masood <Ayaan.Masood@arm.com>
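A minimal sketch of the constant fold (names and the int8 range are
illustrative):

```python
import numpy as np

# Fold a Quantize op whose float input is known at compile time:
# q = clamp(round(v / scale) + zero_point)
def fold_quantize(values, scale, zero_point, qmin=-128, qmax=127):
    q = np.round(np.asarray(values, dtype=np.float64) / scale) + zero_point
    return np.clip(q, qmin, qmax).astype(np.int8)

print(fold_quantize([0.0, 0.5, 1.0], scale=1 / 255, zero_point=-128))
```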
|
|
- For allocations that have a hard memory limit, the Hill Climb
allocator should be given more attempts to find a solution that fits
- The fix is to use a memory limit when there is a hard constraint, and
a minimum iteration count, reset on every improvement, when there is a
soft constraint
- Added a maximum number of iterations CLI option
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I19ff53a0b68412de280263626778a3102cbe52fa
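A hedged sketch of the retry policy (not the real allocator): under a
hard limit, keep iterating until the best solution fits or a maximum is
reached; under a soft constraint, use a minimum iteration budget that
resets whenever the best solution improves.

```python
# Illustrative only; evaluate/mutate stand in for the real allocator moves.
def hill_climb(evaluate, mutate, initial, hard_limit=None,
               min_iterations=500, max_iterations=None):
    best, best_cost = initial, evaluate(initial)
    budget, i = min_iterations, 0
    while max_iterations is None or i < max_iterations:
        if hard_limit is not None:
            if best_cost <= hard_limit:
                break                # hard constraint met, stop early
        elif budget <= 0:
            break                    # soft constraint: budget exhausted
        candidate = mutate(best)
        cost = evaluate(candidate)
        if cost < best_cost:
            best, best_cost = candidate, cost
            budget = min_iterations  # reset the budget on every improvement
        else:
            budget -= 1
        i += 1
    return best
```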
|
|
Removed the constraint on negative alpha values in ReLU
for int8 and uint8.
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: Id7a3a30bf5d1f0a591f990bd04cd0dbbad5819c6
|
|
* Added a generic function which checks if the underlying shape of a
FullyConnected operation is 2D and performs shape reduction
* FullyConnected operations with >2 dimensions now run on the NPU if the
above case is satisfied
* Refactored constraint_fc_output_2d and rewrite_fully_connected_input
* Added a unit test to confirm this functionality
Signed-off-by: Ayaan Masood <Ayaan.Masood@arm.com>
Change-Id: I0e29c767e5b84841eb53bbc44464b36a454f7b38
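The shape reduction described above amounts to folding all leading
dimensions into the batch dimension; a sketch:

```python
from math import prod

# Illustrative: an N-D FullyConnected input [d0, ..., dk-1, dk] behaves
# like the 2-D shape [d0 * ... * dk-1, dk], so it can run on the NPU.
def reduce_fc_shape(shape):
    return list(shape) if len(shape) <= 2 else [prod(shape[:-1]), shape[-1]]

assert reduce_fc_shape([2, 3, 16]) == [6, 16]
```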
|
|
Update version of Black to 22.3.0 due to updated dependencies.
Updates to fix reported issues due to new version.
Signed-off-by: Jonas Ohlsson <jonas.ohlsson@arm.com>
Change-Id: I60056aae452093ce8dcea1f499ecced22b25eef1
|
|
* 1D optimised block_config was incorrectly being set to the
ArchitectureBlockConfig in try_block_config()
* Write external API test for the reduced block height case (on H256)
Signed-off-by: James Ward <james.ward@arm.com>
Change-Id: I9ced7eb31b23730e4423aabbaf769bc72fac8fc9
|
|
Previously we did not check if half_pixel_centers
was set. Since we do not support it, these cases
should not run on the NPU.
Signed-off-by: erik.andersson@arm.com <erik.andersson@arm.com>
Change-Id: I9d2675f760424d5cfb67e5d581dd1861ad165b85
|
|
* fix indices for tflite mapping of EXP operator
* fix indices for tflite mapping of Transpose operator
* ensure read offset after slice is aligned to 16 bytes for NHCWB16 or force linear format
* add unit test to ensure mapping of indices is consistent across TFLite, TOSA and NNG
Signed-off-by: James Ward <james.ward@arm.com>
Change-Id: I17b6e44bc06853325d5eea62a558418ee1ebefe8
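The alignment rule in the third point can be sketched as a simple
predicate (the name is illustrative):

```python
# Illustrative: NHCWB16 stores bricks of 16 channels, so a read offset
# produced by a slice must land on a 16-byte boundary; otherwise the
# tensor is forced to linear format.
def nhcwb16_read_offset_ok(read_offset_bytes: int) -> bool:
    return read_offset_bytes % 16 == 0
```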
|
|
Memory only operators such as Reshape, Squeeze and ExpandDims are
removed in the graph optimiser step.
- Added semantic check that memory only operators have same
quantisation parameters on ifm/ofm.
- Added support for the ExpandDims operator.
- Addition and cleanup of related unit tests.
- Removed TOSA from the generated SUPPORTED_OPS.md documentation.
Signed-off-by: Jonas Ohlsson <jonas.ohlsson@arm.com>
Change-Id: If848d8afc58c18806e10997ed94e4dae83f30879
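The new semantic check boils down to comparing ifm/ofm quantisation; a
sketch (attribute names are illustrative):

```python
# Illustrative: a memory-only op (Reshape, Squeeze, ExpandDims) may only
# be removed if input and output quantisation match; otherwise removing
# it would silently change the represented values.
def can_remove_memory_only_op(ifm_quant, ofm_quant) -> bool:
    return (ifm_quant.scale_f32 == ofm_quant.scale_f32
            and ifm_quant.zero_point == ofm_quant.zero_point)
```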
|
|
Update to handle the case when the Squeeze Op ifm/ofm are the
subgraph ifm/ofm, to facilitate the removal of the Squeeze Op.
Added a NOP to maintain the original tensors.
Updated pytests for squeeze operator.
Signed-off-by: Jonas Ohlsson <jonas.ohlsson@arm.com>
Change-Id: I623cae05e696fb16ccf29dedc42fd822601e9fd9
|
|
- Changed Ethos-U65 AXI port address width from 48 to 40 bits
- Fixed the use of arena_cache_size in mem_type_size() to cover the
arena as well as the cache memory area
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I826462a0cbd0c061cccbc7c83dde446778a2b1ca
|
|
Refactor supported operators by breaking out model semantics
into its own class. Model semantics checked right after model
read.
Signed-off-by: Jonas Ohlsson <jonas.ohlsson@arm.com>
Change-Id: If442b189efcd91dda01af60b2b3adedfacdf2fad
|
|
Remove quant_values attribute from Tensor class.
It only needs a single values attribute, holding either
quantized or unquantized values as appropriate.
Change-Id: Ie96f80ac58061b6077e0f7048dc60209fdfbcafa
Signed-off-by: James Peet <james.peet@arm.com>
|
|
Mapping to internal input indexing has been added to
tflite_reader.py and tosa_reader.py, and the reverse
mapping to tflite_writer.py.
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: I4d8596e747cfa7c4203884c4e785eb1977e2bcc1
|
|
Added basic TOSA support, enabling Vela to read and compile a .tosa
file corresponding to CONV2D + Rescale + Clamp, and to write it to an
optimized .tflite file. The optimized .tflite file will, in this case,
hold a commandstream where the Rescale and Clamp have been fused into
the CONV2D. The optimized .tflite file is not output from Vela.
- Added support to read a .tosa file into Vela's internal structure
- Added tosa_reader.py, tosa_mapper.py and helper files stored under
tosa/
- Support for this is limited to ~10 ops
- Added reader_util.py for functions common to TOSA and TFLite
- Added tosa_graph_optimiser.py
- Added support to fuse Rescale into convolution
- Modified handling of padding
- Added support to fuse Clamp into the previous op
- Added graph_optimiser_util.py
- Moved functions common to TOSA/TFLite graph optimization to this file
- Renamed graph_optimiser.py to tflite_graph_optimiser.py
- Added separate tosa_supported_operators.py
- Added supported_operator_util.py for functions common to TOSA/TFLite
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: Ic3c540504ec8c5eb4771397fdc6882050ecf33ab
|
|
- 256 and 512 configuration variants execute 1D convolutions
in an optimised manner compared to their 2x2 microblock
dimensions. This commit takes this into account to improve
Conv1D throughput on these configurations.
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I6ecdf6e4a219e356327b22f8393f50ee8817af23
|
|
- Merged dev/scheduler at 83639f90e8c828f70de6e29142355a940224959b
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I0050529d4b42da93768c7264296434dd877fb5b4
|
|
When a MEAN operator with a single reduction axis
specifies the axis index attribute as an array with
a single element rather than a scalar index, the
operator is placed on the CPU even though it is
technically supported.
This commit fixes this issue and also adds some new
tests for the axis constraints.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: Ia287f3b9cc80a805e972cd4b2962e52526a8dc16
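The fix can be pictured as normalising the attribute before the support
check (a sketch, not Vela's code):

```python
import numpy as np

# Treat a single-element axis array as the equivalent scalar axis.
def normalise_axis(axis):
    if isinstance(axis, (list, tuple, np.ndarray)) and len(axis) == 1:
        return int(axis[0])
    return axis

assert normalise_axis([1]) == normalise_axis(1) == 1
```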
|
|
The check for whether the non-linear tensor format can be used has
been refactored.
- Flag avoid_NHCWB16 replaced with needs_linear_format
- Restriction checks moved to a single function in the graph optimiser.
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: Iec5c7996a1a6039cad052197f1ae56f7c0290440
|
|
This is a small commit which changes one of
the four MEAN implementations to a simpler
one, using an AvgPool instead of a
DepthwiseConv.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: I9e8af071e8b820796577ee4792b4812a1212602b
|
|
Bug fix in the generation of register command offsets that do not fit
in 32 bits.
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
Change-Id: Iabb99cf6536c0f77b934691f8744df61f1eab3ed
|
|
- Tensor allocation verification was O(N^2), is now closer to O(N)
- Removed a sort in HillClimb allocator
Change-Id: I286a269881490c485cc2b0eeab3b1ecffa8f3df0
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
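The near-linear verification can be sketched as a sort plus a single
pass (ignoring live ranges for brevity):

```python
# Illustrative: instead of comparing every pair of allocations (O(N^2)),
# sort by address and sweep once, tracking the highest end seen so far.
def verify_no_overlap(allocations):
    highest_end = 0
    for addr, size in sorted(allocations):  # (address, size) pairs
        assert addr >= highest_end, "overlapping allocations detected"
        highest_end = addr + size
```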
|
|
Added checks during command stream generation to make sure
that address boundaries are respected.
Change-Id: I4dbc693b42d54e35c8fcc785e8be88059e409eec
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
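A sketch of the kind of check added (names are illustrative):

```python
# Illustrative: every generated access must lie inside its memory region.
def check_address_range(base_addr: int, size: int, region_size: int):
    assert 0 <= base_addr and base_addr + size <= region_size, (
        f"access [{base_addr}, {base_addr + size}) exceeds region of "
        f"{region_size} bytes"
    )
```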
|
|
This commit adds support for emulating the behavior
of the QuantizedMeanOrSum implementation of MEAN in
TensorFlow Lite.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: Ifd24e0e678e2f85cd66ab82deeaaf010d5351b1e
|
|
- Added full support for the PAD operator
- Hardware padding is still used whenever possible
- Bug fix for Pad followed by max pool when the IFM contains negative
values
Change-Id: Ifc64d1943737d94466f5e2821009dab12a49a965
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
|
|
This commit adds support for the MEAN operator,
with some caveats.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: I165cb26cb5aefd68e70d2cfc68291ccf7b778921
|