- A reshape operator was bypassed, which changed the ofm shape
for the mean operator.
- This caused a faulty decomposition for the mean operator with
an output diff as a result.
- The fix is to insert a memcpy operator to preserve the correct
shape. This memcpy will in most cases end up as a nop, so performance
is unaffected.
Change-Id: Ibf6781d090e9eadbb4c874181a7a8c63bb557351
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
|
|
- Concat is implemented by several avgpool ops, all of them
writing to the same ofm but with a slice offset. If a compiled
network contains CPU fallbacks, the avgpool ops might end up
running in different custom ops. This works fine as long as the
runtime provides the same scratch area. If not, the output from
the concat might be corrupted.
- This fix adds an extra step to the pass packing so that all
avgpool ops for a concat are grouped together and run within the
same custom op, in order to prevent possible corruption.
Change-Id: I343e08d7b4046f969b3d9ec3479db6490cbe4170
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
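A sketch of the grouping idea behind the extra pass-packing step;
the list-of-tuples representation and function name are illustrative,
not Vela's actual pass classes:

    # Group all avgpool passes that write into the same concat ofm so
    # that they end up in the same custom op.
    from collections import defaultdict

    def group_concat_passes(passes):
        # passes: (name, ofm_id) tuples; ofm_id identifies the shared
        # concat output an avgpool writes into (None for other passes).
        groups = defaultdict(list)
        ordered = []
        for name, ofm_id in passes:
            if ofm_id is None:
                ordered.append([name])              # unrelated pass
            else:
                if not groups[ofm_id]:
                    ordered.append(groups[ofm_id])  # first writer reserves a slot
                groups[ofm_id].append(name)         # later writers join it
        return ordered  # each inner list becomes one custom op

    # Two concat writers separated by a CPU fallback end up together:
    print(group_concat_passes([("avgpool0", "c"), ("cpu_op", None), ("avgpool1", "c")]))
    # [['avgpool0', 'avgpool1'], ['cpu_op']]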
|
|
- Fix regression caused by too strict constraints on
SplitSliceRead, causing an output diff for LSTM.
- As long as the SplitSliceRead shape fits within the
consumer ifm shape, it is OK to move the read.
Change-Id: Ia6f508f99638c3aedccc7fd9f31405527bb64f87
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
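The relaxed constraint boils down to a bounds check; a minimal
sketch with hypothetical names:

    # The read can be moved to the consumer as long as the sliced
    # region fits inside the consumer's ifm shape.
    def read_fits_in_ifm(read_offset, read_shape, ifm_shape):
        return all(o + s <= d for o, s, d in zip(read_offset, read_shape, ifm_shape))

    assert read_fits_in_ifm((0, 0, 0, 0), (1, 1, 1, 64), (1, 1, 1, 128))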
|
|
- When possible, a read slice from a split or stride is moved to
the following op. The problem in this case was that the following
op was a Maxpool op (from Softmax). The Maxpool op uses a
different input shape than the original Softmax op, and this
input shape was changed when the slice read was applied
to the Maxpool op.
- The result is a faulty Maxpool op with an output diff.
- The fix is to prevent moving the slice read when the consumer
input shape differs from the Split/Stride ofm shape.
Change-Id: I649d89c38645fa51c20c3602954e2b8af9372076
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
|
|
Add function descriptions and type annotations to the optimization
functions that were missing them.
Fix a type annotation issue caused by re-assigning a variable
to a value of a different type.
Change-Id: I1ee442ff7a29cc07708fdd013430131eff599dd5
Signed-off-by: Raul Farkas <raul.farkas@arm.com>
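A tiny illustration of the re-assignment issue (hypothetical
example, not the actual Vela code): strict type checking rejects
re-binding a name to a different type, so the converted value gets
its own annotated name:

    # Before: 'stride' is inferred as int, then re-assigned to a
    # tuple, which fails strict type checking.
    stride = 2
    # stride = (2, 2)  # error: incompatible types in assignment

    # After: a distinct, annotated name for the converted value.
    stride_hw: tuple[int, int] = (2, 2)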
|
|
* Fix a bug that caused filter padding not to be added proportionally
to the hardware padding added to the IFM.
* Update the needed_total_padding function that calculates hardware
padding to also account for cases in which the IFM width is not
divisible by the stride width.
* Update supported ops constraint on strides for conv2d to mark ops with
stride width > 3 and IFM width that is not divisible by the
optimization resize factor as not supported.
* Update unit tests that verify correct functionality when checking
whether ops are supported or not.
Change-Id: I62f14cca890b779ca787a9603fa37c873ad522f8
Signed-off-by: Raul Farkas <raul.farkas@arm.com>
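For reference, this sketch shows SAME-padding arithmetic of the kind
needed_total_padding performs (not the exact Vela code) and why an
IFM width that is not divisible by the stride changes the result:

    def needed_total_padding(input_size: int, stride: int, filter_size: int) -> int:
        # Output size under SAME padding, then the input the kernel needs.
        out_size = (input_size + stride - 1) // stride
        needed_input = (out_size - 1) * stride + filter_size
        return max(0, needed_input - input_size)

    print(needed_total_padding(224, 2, 3))  # 1
    print(needed_total_padding(225, 2, 3))  # 2 (width not divisible by stride)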
|
|
- Added int8 and int16 Exp support, implemented as LUT.
- Added generic 8-bit and 16-bit LUT table functions following
the implementation in the latest reference. If new ops are added
by the reference, they can easily be implemented in Vela using
the generic functions.
- Moved convert_to_lut to lut.py to keep all LUT-related code in
one file.
- Updated SUPPORTED_OPS.md
Change-Id: I388e76ea4b39162313599a5341cfb9bad71a782c
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
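The generic 8-bit LUT approach amounts to evaluating the op for
every possible quantised input value; a minimal sketch, with
assumed quantisation parameters:

    import math
    import numpy as np

    def create_lut_int8(fn, ifm_scale, ifm_zp, ofm_scale, ofm_zp):
        # Dequantise each of the 256 int8 inputs, apply the op,
        # requantise and clamp to the int8 range.
        lut = []
        for q_in in range(-128, 128):
            real = ifm_scale * (q_in - ifm_zp)
            q_out = round(fn(real) / ofm_scale) + ofm_zp
            lut.append(int(np.clip(q_out, -128, 127)))
        return np.array(lut, dtype=np.int8)

    # Example parameters (illustrative only):
    exp_lut = create_lut_int8(math.exp, 0.05, 0, 0.02, -128)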
|
|
Added int8 and int16 UNIDIRECTIONAL_SEQUENCE_LSTM support.
The implementation does not include support for:
* CIFG
* Peephole
* Projection
* Normalisation
This change also:
* Removed unused Op.BlockLSTM operation type.
* Removed the only-one-consumer limitation on putting the SplitSliceRead
on the tensor consumer(s), if all consumers fulfil the requirements
* Added Op.VariableTensorWrite as an Operation.memory_function to make
sure writes to variable tensors:
* Always use linear mode
* Are not moved to fast scratch
* Are not fused with other elementwise operation tensor ranges
Change-Id: Ief831738924ac3d1f2ba6d41f10bd6dc969911f3
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
|
|
Remove the op_index constraint and force linear format for all Conv2D
ops whose strides can be optimised.
Change-Id: Idef3508ab074ea9abeacac030eaaa15a00ad1211
Signed-off-by: Raul Farkas <raul.farkas@arm.com>
|
|
- The logic for bypassing memory-only ops is
complicated, and it still does not fix all corner cases.
- This patch simplifies the logic by always bypassing
the op by replacing its IFM with the OFM. If that is not
possible, the memory-only op is changed to a memcpy op.
- The bypassing was previously done in two steps but
is now reduced to one.
Change-Id: I545dd65e0ec77c70be479a5ada2d277cac3a027c
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
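Schematically, the one-step rule looks like this (toy dict-based
graph and a hypothetical safety predicate; Vela's real check is
more involved):

    def bypass_memory_only_op(op, ops):
        # Assumed predicate: safe when ifm and ofm hold the same number
        # of elements and the ifm has no other consumers.
        safe = op["ifm"]["elems"] == op["ofm"]["elems"] and op["ifm"]["uses"] == 1
        if safe:
            # Replace ifm with ofm: the producer writes straight into
            # op's ofm, and the memory-only op disappears.
            for other in ops:
                if other.get("ofm") is op["ifm"]:
                    other["ofm"] = op["ofm"]
            ops.remove(op)
        else:
            # Otherwise keep the op but turn it into an explicit copy.
            op["type"] = "Memcpy"

    producer = {"type": "Conv2D", "ofm": None}
    t_in = {"elems": 64, "uses": 1}
    t_out = {"elems": 64}
    producer["ofm"] = t_in
    reshape = {"type": "Reshape", "ifm": t_in, "ofm": t_out}
    ops = [producer, reshape]
    bypass_memory_only_op(reshape, ops)
    assert producer["ofm"] is t_out and reshape not in ops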
|
|
- Reshape ops can be bypassed and there is no need for the NPU to
process them. There are use cases where the IFM must be preserved,
so a memcpy is needed. This is implemented by an AvgPool.
- In order to reduce the cost of the AvgPool, the IFM can be copied
by DMA. This is faster, and it can also be turned into a real NOP in
cases where the IFM and the OFM can use the same memory space.
- Added a new memcpy op. Only NHWC format is supported, since DMA
cannot change the format on the fly.
- Allow ofm to reuse ifm for memcpy op
- Make sure the DMA copy size is 16 byte aligned
Change-Id: I3605a48d47646ff60d2bb3644dd3a23f872235a7
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
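Rounding the copy size up to the 16-byte requirement is a
one-liner (illustrative helper, not Vela's actual function):

    def align_to_16(size_bytes: int) -> int:
        # Round up to the next multiple of 16 for the DMA copy.
        return (size_bytes + 15) & ~15

    assert align_to_16(33) == 48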
|
|
- Additional overflow checks are performed when running under
Microsoft Windows compared to Linux. These checks happen when
converting from Python int to NumPy int/uint
- The problem is that the lut activation values are int32 type,
however they are defined as Python ints. If these are converted to
numpy.int32 it could result in an overflow error
- The fix is to convert these values to uint32 but keep the
operator's IFM tensor type the same (as this will allow them to be
interpreted correctly)
- Fixing this highlighted another problem where convert_to_lut
always calls create_lut_tensor() with an int8 datatype, whereas it
should be using the IFM datatype
Change-Id: I781a9d850f654267aa4a67754438607c4bb95685
Signed-off-by: Tim Hall <tim.hall@arm.com>
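A small reproduction of the conversion hazard (values illustrative;
the exact behaviour depends on platform and NumPy version):

    import numpy as np

    lut_value = 0xFFFFFF00        # 32-bit pattern held as a Python int
    # np.int32(lut_value)        # may raise OverflowError (e.g. on Windows)
    safe = np.uint32(lut_value)   # fix: store as uint32; the op's IFM
                                  # tensor keeps its original type so the
                                  # bit pattern is interpreted correctly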
|
|
- The issue is due to undefined behaviour when casting a NumPy float
to a NumPy unsigned integer which occurs in create_const_tensor()
- The fix is to make sure that the values are first cast to a Python
float
- In addition, the values datatype argument has been removed from
create_const_tensor() to stop the tensor and values datatypes getting
out of sync
Change-Id: I134b9be8c941b361929a5ae7db8cb35f2e9728f2
Signed-off-by: Tim Hall <tim.hall@arm.com>
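A sketch of the hazard; the deterministic alternative shown is
illustrative, not necessarily the exact patch:

    import numpy as np

    v = np.float32(-1.0)
    # Undefined: casting a negative NumPy float straight to an unsigned
    # type. x86 builds commonly yield 255 here, Arm builds commonly 0.
    u = np.array([v]).astype(np.uint8)

    # Deterministic: go via Python scalars before creating the array.
    w = np.array([int(float(v)) % 256], dtype=np.uint8)  # always 255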
|
|
Fixed an assert that was caused by a model that has a reshape operator
followed by another reshape operator. This structure had not been
anticipated. However, since there is no need for the first reshape,
it is simply removed from the path while traversing the graph.
Change-Id: I2a939df37502028ffc07115ac87e85375484efee
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
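A sketch of collapsing the redundant first reshape while walking the
graph (toy dict representation, not Vela's classes):

    def skip_double_reshape(op, producer_of):
        # If a reshape's input is itself produced by a reshape, the first
        # reshape carries no information: re-point the input past it.
        prev = producer_of.get(op["ifm"])
        if op["type"] == "Reshape" and prev and prev["type"] == "Reshape":
            op["ifm"] = prev["ifm"]
        return op

    first = {"type": "Reshape", "ifm": "t0"}
    second = {"type": "Reshape", "ifm": "t1"}
    skip_double_reshape(second, {"t1": first})
    assert second["ifm"] == "t0"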
|
|
- Adds missing operators and type conversion recording to DebugDB
Change-Id: If76b0b430bbe73ae1469024c3160ecf0eea26abe
Signed-off-by: wilisa01 <william.isaksson@arm.com>
|
|
- Update copyright notices to use SPDX format and add OSS mail as contact.
- Update years on files where this had been missed.
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
Change-Id: I7e9715ea4e17b76252728c708e46df12ad67ab1f
|
|
- The previous patch, which always replaced ifm with ofm,
introduced unnecessary avg pool ops in some cases.
That patch has been reverted and this is a new solution.
- Replace ifm with ofm in the following cases:
a) Ops that depend on the original ifm tensor shape not
being changed by the bypass memory op function.
b) When the memory op has different IFM and OFM rank.
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: I16a023e169ae64c5db46f6f88516a5e1ca7ed7ef
|
|
This reverts commit 5060ff53f5ac2382e04a68d7772bd71a36f63845.
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: I8dd7e9ed8325fd2e8c17509fd9757292706f5ee7
|
|
- In order to solve output diffs, the Reshape op was pushed
to the CPU. The problem was that the Mean op ifm shape
was replaced by the Reshape op ifm shape.
- This limitation is now removed. Changed the implementation of
how memory-only ops are bypassed: always replace the memory-only
op's ifm tensor with its ofm tensor. By doing this, the ifm tensor
of the operator that follows the memory-only op is never changed.
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: Ibcdebf33fd9b7a37f90984a129500b5dac52e5ea
|
|
- Added support for Resize Bilinear with half pixel centers for int8 and
uint8.
- Utilizes the new "TILE" padding mode.
- Utilizes ofm stride multipliers and modified tile base offsets to
write OFMs interleaved.
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
Change-Id: I37fa77c022a368f05fda0ead75d8696c9205f833
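Numerically, half-pixel centers follow the standard TFLite resize
sampling formula; the TILE padding mode and ofm stride multipliers
realise it on the NPU. A reference sketch (hypothetical helper name):

    def src_coord(dst_index: int, in_size: int, out_size: int) -> float:
        # Half-pixel-centers mapping from an output index to input space.
        scale = in_size / out_size
        return (dst_index + 0.5) * scale - 0.5

    # 2x upscale: output pixel 0 samples the input at -0.25.
    print(src_coord(0, 8, 16))  # -0.25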
|
|
- Ethos-U65-512 requires the input to REDUCE_SUM to use NHWC format
- Updated the graph optimiser format check to cover this condition
- Added an exception check to the backend of the compiler to verify
that this condition has not been violated by the external API or
Vela internals
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I2f1fabcbd264daf77d5822349d855a3a32b12c64
|
|
- The issue was due to a previous patch to fix MLBEDSW-4350
- Manually reverted that fix (5fabfcaa2b636b02899b4d6e0ccf95d853986475)
- Made a new fix for MLBEDSW-4350 that calculates the padding and
skirt by taking into account the split read offsets and shapes
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I96010c1b977011aecbc411a3c91ab3e61af22db4
|
|
* fix indices for the TFLite mapping of the EXP operator
* fix indices for the TFLite mapping of the TRANSPOSE operator
* ensure read offset after slice is aligned to 16 bytes for NHCWB16 or force linear format
* add unit test to ensure mapping of indices is consistent across TFLite, TOSA and NNG
Signed-off-by: James Ward <james.ward@arm.com>
Change-Id: I17b6e44bc06853325d5eea62a558418ee1ebefe8
|
|
Added support for Identity operation.
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: If00b30528932f7531807ce3914d6c1875ab72fa4
|
|
Added support to map the TABLE operator to LUT.
Limitations:
- Only supported for int8
- TABLE input must be constant
This also adds support for TFLite legalisation of
Tanh/Sigmoid (int8/uint8).
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: I1a95f61fb02fdd42c4a690494418cc0765c8b275
|
|
Memory only operators such as Reshape, Squeeze and ExpandDims are
removed in the graph optimiser step.
- Added semantic check that memory only operators have same
quantisation parameters on ifm/ofm.
- Added support for the ExpandDims operator.
- Addition and cleanup of related unit tests.
- Removed TOSA from the generated SUPPORTED_OPS.md documentation.
Signed-off-by: Jonas Ohlsson <jonas.ohlsson@arm.com>
Change-Id: If848d8afc58c18806e10997ed94e4dae83f30879
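The semantic check amounts to comparing ifm and ofm quantisation;
schematically (toy types, not Vela's):

    from dataclasses import dataclass

    @dataclass
    class Quant:
        scale: float
        zero_point: int

    def memory_only_quant_ok(ifm_q: Quant, ofm_q: Quant) -> bool:
        # A memory-only op must not change the value mapping, so the
        # quantisation parameters have to match exactly.
        return ifm_q == ofm_q

    assert memory_only_quant_ok(Quant(0.1, 0), Quant(0.1, 0))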
|
|
Added support for data layout ops
RESHAPE, SLICE and CONCAT.
- No support for bool_t
- Support limited to rank <= 4 and N = 1
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: I487ac494b6506a2a6ba947ee758aa193194dd796
|
|
This is mainly to add support for depthwise conv2d
with depth_multiplier = 1.
(But there are no suitable testcases; all I have sourced
have depth_multiplier set to 2, which is not supported.)
- Added support for depthwise conv2d.
- Added support for removing Transpose of constant data
- Added support for removing reshape
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: I143e6246becfa78fd9f7510af0bf0d6b3fbbf2c7
|
|
Added support for:
- AVGPOOL and CONV2D with TFLite correspondence
- MAXPOOL
- additional support for replacing RESCALE ops with avgpool.
No support for breaking down tensors over the
size supported by the NPU.
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: I1d2aa50ac30a26283b3e6f1fe88cba1544b7c189
|
|
Fix inception_v1/v3 output diffs by removing the Squeeze operator
in the graph optimisation step.
The squeeze operator removes dimensions of size 1 from tensor shape.
The memory layout is preserved.
Signed-off-by: Jonas Ohlsson <jonas.ohlsson@arm.com>
Change-Id: I4ceffcbb141af5ed50b0d1a9d1d67622e638c2a1
|
|
Added basic TOSA support, enabling Vela to
read and compile a .tosa file corresponding to
CONV2D + Rescale + Clamp, and to write it to an
optimized .tflite file.
The optimized .tflite file will, in this case, hold
a command stream where the Rescale and Clamp have been
fused into the CONV2D.
The optimized .tflite file is not output from Vela.
- Added support to read a .tosa file into Vela's
internal structure.
- Added tosa_reader.py, tosa_mapper.py and
helper files stored under tosa/
- Support for this is limited to ~10 ops
- Added reader_util.py for functions common
to TOSA and TFLite
- Added tosa_graph_optimiser.py
- Added support to fuse Rescale into convolution
- Modified handling of padding
- Added support to fuse Clamp into the previous op
- Added graph_optimiser_util.py
- Moved functions common to TOSA/TFLite graph
optimization to this file
- Renamed graph_optimiser.py to tflite_graph_optimiser.py
- Added separate tosa_supported_operators.py
- Added supported_operator_util.py
for functions common to TOSA/TFLite
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: Ic3c540504ec8c5eb4771397fdc6882050ecf33ab
|