ethos-u/ethos-u-vela.git

Age	Commit message (Collapse)	Author
2023-05-10	MLBEDSW-7283: Add opt cases for strided CONV2D	Raul Farkas
	* Implement a general optimization solution for strided CONV2D that supports a stride_w with no upper bound. * Implement filter zero padding to allow for optimization in those cases in which the filter width is not divisible by the stride width. E.g.: Filter width = 8, stride width = 3 -> Filter width = 8 + 1 (0 padding) = 9, stride width = 3 * Implement partial optimization to reduce the stride to hw supported strides (i.e. 2 and 3) when optimizing to reach a stride = 1 is not possible due to the IFM width not being divisible by the stride width. * Implement optimization for when SAME padding is used. If the pre-opt and post-opt padding do not match, add zero padding to the filter so that the post-opt IFM padding matches. Change-Id: Ia66b0d107281fa9993f6bf4d0c26627ee743253b Signed-off-by: Raul Farkas <raul.farkas@arm.com>
2023-05-10	Revert "MLBEDSW-6343: Remove op_index constraint"	Raul Farkas
	This reverts commit 72c6a2414205e033279f80b622cdf479c05a4f5b. Reason for revert: Fix performance regression caused by breaking cascades in certain models Change-Id: I5aba6e3c59ab27c5129f4a3f0c320ed18df78943 Signed-off-by: Raul Farkas <raul.farkas@arm.com>
2023-05-02	MLBEDSW-2082: Add Exp support	Johan Alfven
	- Added int8 and int16 Exp support, implemented as LUT. - Added generic 8bit and 16bit LUT table functions following the implementation in the latest reference. If new ops are added by the reference, they can easily be implemented in Vela using the generic functions. - Moved convert_to_lut to lut.py to have all LUT related code in one file. - Updated SUPPORTED_OPS.md Change-Id: I388e76ea4b39162313599a5341cfb9bad71a782c Signed-off-by: Johan Alfven <johan.alfven@arm.com>
2023-04-27	MLBEDSW-7527: Mean operator output diff	Rickard Bolin
	Mean operators with height larger than 64 are reshaped but the IFM shape was then reset to the original value, causing an output diff. Signed-off-by: Rickard Bolin <rickard.bolin@arm.com> Change-Id: I3a89d4efac53173cbd6fe0a5c0542e028bed42ad
2023-04-21	MLBEDSW-7408: MLCE: Crash when serialising model LSTM	Tim Hall
	- Added checking and reporting of missing operator attributes when reading and writing TFLite file - Added a TFLite semantic check to ensure that all required attribute fields of builtin operators are read - Added some sanity checks for RESHAPE operators that run on the Ethos-U - Stopped CPU operators from having their attributes modified Change-Id: I05700681acdb09554f5945819717c08a9457295c Signed-off-by: Tim Hall <tim.hall@arm.com>
2023-04-19	MLBEDSW-7487: Updated implementation for the Mean op	Johan Alfven
	- Latest reference has changed implementation for the Mean op and now only contain one variant. - Updated Vela implementation to match reference. The full sum is first calculated and then divided by the numbers of elements. - Removed the avg pool variant and test case. - Updated SUPPORTED_OPS.md Change-Id: I4275e36e3697fa837f119f2cefd7c0ff94231605 Signed-off-by: Johan Alfven <johan.alfven@arm.com>
2023-04-17	MLBEDSW-7196 Add LSTM support	Fredrik Svedberg
	Added int8 and int16 UNIDIRECTIONAL_SEQUENCE_LSTM support. The implementation does not include support for: * CIFG * Peephole * Projection * Normalisation This change also: * Removed unused Op.BlockLSTM operation type. * Removed the only one consumer limitation on putting the SplitSliceRead on the tensor consumer(s), if all consumers fullfills the requirements * Added Op.VariableTensorWrite as a Operation.memory_function to make sure writes to variable tensors: * Always use linear mode * Are not moved to fast scratch * Are not fused with other elementwise operation tensor ranges Change-Id: Ief831738924ac3d1f2ba6d41f10bd6dc969911f3 Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
2023-04-12	MLBEDSW-7437: Add 64-bit output support for ArgMax	Johan Alfven
	- Added 64-bit support for ArgMax - Updated constraints for ArgMax and regenerated SUPPORTED_OPS.md Change-Id: I4ef7d2e6fccab0088b87757f6afe40a006c77bbd Signed-off-by: Johan Alfven <johan.alfven@arm.com>
2023-04-04	MLBEDSW-7442: Removed ofm quantization for ArgMax	Johan Alfven
	- Quantization for the OFM was added for the ArgMax operator as a workaround in order to avoid a crash in the weight compressor. This quantization is now removed. - The weight compressor expects that all tensors have a quantization. Updated code to use scale = 1.0 and zero point = 0 for tensor without quantization. Change-Id: I6816dce2db55f7d795d19f88d7fbe7ee419347fc Signed-off-by: Johan Alfven <johan.alfven@arm.com>
2023-03-31	MLBEDSW-7439: Add support for input dims < 4 for ArgMax	Johan Alfven
	- Updated ARG_MAX to support IFM rank less than 4 - Regenerated SUPPORTED_OPS.md Change-Id: Icd8e72733279413cbea49021325e1ab06fdc6011 Signed-off-by: Johan Alfven <johan.alfven@arm.com>
2023-03-27	MLBEDSW-6343: Remove op_index constraint	Raul Farkas
	Remove op_index constraint and force linear format for all Conv2D that have strides that can be optimised. Change-Id: Idef3508ab074ea9abeacac030eaaa15a00ad1211 Signed-off-by: Raul Farkas <raul.farkas@arm.com>
2023-03-22	MLBEDSW-6435: Implement support for ArgMax along depth dimension	Rickard Bolin
	- Add support for ArgMax along depth dimension with a depth limit of 127. - Only supports 8-bit input and 32-bit output Signed-off-by: Rickard Bolin <rickard.bolin@arm.com> Change-Id: I5f6f0503135bebabbb1ca637f9729587b7c60740
2023-03-16	MLBEDSW-7312: Refactoring bypass_memory_only_ops	Johan Alfven
	- The logic when bypassing memory only ops is complicated and it still does not fix all corner cases. - This patch simplifies the logic by always bypassing the op by replacing the IFM with the OFM. If that is not possible the memory only op is changed to an memcpy op. - The bypassing was previously done in two steps but is now reduced to one. Change-Id: I545dd65e0ec77c70be479a5ada2d277cac3a027c Signed-off-by: Johan Alfven <johan.alfven@arm.com>
2023-02-15	MLBEDSW-7211: Convert fixup_asymmetric_weights to supported ops check	wilisa01
	Changed default behaviour to place int8 ops with asymmetric quantization on cpu, and added an option to force symmetric quantization Change-Id: Ib9b717aaf61eae78833254ca3dfa745f4f253dc6 Signed-off-by: wilisa01 <william.isaksson@arm.com>
2023-02-15	MLBEDSW-7342: Regression: Output diff for mlperf_deeplabv3_mnv2_ade20k_int8	wilisa01
	Swapped order of ifms to add in tflite graph optimiser. The output diff was caused by the second input tensor being placed on sram, despite there being no dma request to move it there. Change-Id: I2e83b669ba226c7e96a0bb0d46ba811434cf7bb6 Signed-off-by: wilisa01 <william.isaksson@arm.com>
2023-02-10	MLBEDSW-4960: convert_resizebilinear_1x1_to_add creates constant output tensor	wilisa01
	Sets second input tensor of resize op to be constant and refactored function. Signed-off-by: wilisa01 <william.isaksson@arm.com> Change-Id: I496764f18b4c1ae0fa1a828dd7a90e937a42d41b
2023-02-07	MLBEDSW-7237: CONV_2D stride 4 optimisation	Raul Farkas
	* Extend stride range from (1,3) to (1,4) * Add stride 4 support when optimising CONV_2D * Add some tests for various strides Change-Id: Iddaeb42c4a6e02695ecdd3740bc8b9dd59a7eb3c Signed-off-by: Raul Farkas <raul.farkas@arm.com>
2023-01-20	MLBEDSW-7151: MLCE: Difference in model output between x86 & aarch64	Tim Hall
	- The issue is due to undefined behaviour when casting a NumPy float to a NumPy unsigned integer which occurs in create_const_tensor() - The fix is to make sure that the values are first cast to a Python float - In addition, the values datatype argument has been removed from create_const_tensor() to stop the tensor and values datatypes getting out of sync Change-Id: I134b9be8c941b361929a5ae7db8cb35f2e9728f2 Signed-off-by: Tim Hall <tim.hall@arm.com>
2022-12-22	MLBEDSW-7203: Data type alias deprecations	Rickard Bolin
	Deprecation of some data type aliases in NumPy version 1.24.0 caused Vela to crash when using Python version 3.8 or above. Replaced the deprecated aliases. Signed-off-by: Rickard Bolin <rickard.bolin@arm.com> Change-Id: Ide167ee864a340194ec5e69537c8718192c78ace
2022-11-17	MLBEDSW-6915: MLCE - Missing operators in Debug DB3.6.0.rc2	wilisa01
	- Adds missing operators and type conversion recording to DebugDB Change-Id: If76b0b430bbe73ae1469024c3160ecf0eea26abe Signed-off-by: wilisa01 <william.isaksson@arm.com>
2022-11-16	MLBEDSW-6620: Update copyright notice and years	Rickard Bolin
	- Update copyright notices to use SPDX format and add OSS mail as contact. - Update years on files where it had been missed. Signed-off-by: Rickard Bolin <rickard.bolin@arm.com> Change-Id: I7e9715ea4e17b76252728c708e46df12ad67ab1f
2022-11-15	MLBEDSW-6905: Add dilation greater than 2 support	Tim Hall
	- Added graph optimisation pass to support dilations greater than 2 in either dimension - Removed supported operators restrictions - Removed erroneous dilation on TRANSPOSE_CONV - Updated unit tests and documentation Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: Ide302374b0d5eff25c20501383a63f6aa7625c52
2022-11-09	MLBEDSW-6881 SHAPE single op network is optimised to nothing3.6.0.rc1	Fredrik Svedberg
	Fixed by adding an operation to copy the statically optimised data to the subgraph output. Change-Id: Ica757e37d5460237973444ffd39c7d2850f319e3 Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
2022-11-03	MLBEDSW-7074: Updated reference kernel for the MEAN op	Johan Alfvén
	The reference kernel for the MEAN operator has changed. As a result, the mean implementation can be simplified and the constraint for mean int8 can be removed. Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: I318e9b495eefea99e7ac4aea4b8c436c83753405
2022-10-18	MLBEDSW-6941: Set correct OFM shape for fc op	Johan Alfvén
	If IFM operator shape is rewritten so that batching is greater than one for fully connect, the OFM batch must also be calculated. This change will fix output diffs for networks that have fully connect OFM with rank greater than 2. Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: I5009edc647a1449a02c8116b45808c1c68beffe6
2022-10-18	MLBEDSW-6794: ResizeNearestNeighbor with HPC	Johan Alfvén
	- Removed half pixel centers constraint for resize nearest neightbor. - Supported scale 2x, 4x and 8x. - Removed test_constraint_resize_half_pixel_centers - Regenerated SUPPORTED_OPS.md Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: Ic3e02e9c2b2034d537c9a9841b8fb4ee433c96dc
2022-10-04	MLBEDSW-6969 Remove RescaleAdd and RescaleMul operators	Fredrik Svedberg
	Removed RescaleAdd and RescaleMul operators in favour of Operation.explicit_scale and removed Operation.rescale. Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com> Change-Id: Idccd8851731d4bb8d4e84970e0fd6b409d7d4e45
2022-09-27	MLBEDSW-6708 Check the bias tensor in graph optimiser mean op	Fredrik Svedberg
	Cleaned up bias tensor use in graph optimiser for Mean operator. Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com> Change-Id: Ibcbfa010a4de67d97181df664b420168d6883d1e
2022-09-27	MLBEDSW-6962: MEAN height is greater than max kernel height	Johan Alfvén
	Fixed bug when height is greater than max kernel height. The shape of the weight must match the ifm shape. Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: I901a8af2edd5858bb15d53d85ef8e2389049ada7
2022-09-23	MLBEDSW-6928: Add int16 support for Resize Bilinear HPC	Rickard Bolin
	Setting bias tensor dtype to DataType.int32 solves rounding issues for RB HPC int16. Removing the input data type check also solves the issue of resize nearest neighbor int16 ops incorrectly getting placed on the CPU. Signed-off-by: Rickard Bolin <rickard.bolin@arm.com> Change-Id: Iee352bcb78e581c0cde3c203dfbe866f1f6fae18
2022-09-23	MLBEDSW-6686: Resize bilinear HPC with tile padding	Rickard Bolin
	- Added support for Resize Bilinear with half pixel centers for int8 and uint8. - Utilizes the new "TILE" padding mode. - Utilizes ofm stride multipliers and modified tile base offsets to write OFMs interleaved. Signed-off-by: Rickard Bolin <rickard.bolin@arm.com> Change-Id: I37fa77c022a368f05fda0ead75d8696c9205f833
2022-09-21	MLBEDSW-4338 Randomized int16 PAD output diff	Fredrik Svedberg
	The issue was that the AveragePool in these test cases were translated to DepthwiseConv2DBias and int16 convolutions always runs with reduced scale. Fixed so that reduced scale is not used in this case. Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com> Change-Id: Ice956eabbb37c8aa1991464870006971c6ecec43
2022-09-16	MLBEDSW-6938 Fix PReLU optimisation	Fredrik Svedberg
	Fixed PReLU optimisation to LeakyReLU with negative alpha. Added optimisation of LeakyReLU to ReLU when alpha is zero. Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com> Change-Id: I5e66f79b29908fffd95b6115799021138ebb401a
2022-09-13	MLBEDSW-6929 Fix LeakyReLU int16 regressions	Fredrik Svedberg
	Fixed LeakyReLU regressions for int16 due to scaling introduced for handling negative alpha. Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com> Change-Id: I84a494fedf54bd4b47c4632645ded7d6cda445f8
2022-09-12	MLBEDSW-6869 Improve LeakyRelu support	Fredrik Svedberg
	Added support for int16 LeakyRelu for negative alpha and alpha greater than one. Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com> Change-Id: I7f522ebfe014786d0a1d96172e75c7d9bdd76921
2022-09-12	MLBEDSW-6613: Implement tile padding	Rickard Bolin
	Implement new padding mode which pads two edges of the IFM with the current values of those edges Signed-off-by: Rickard Bolin <rickard.bolin@arm.com> Change-Id: I8523e0cabdac80b48710703859003e33050cc150
2022-09-12	MLBEDSW-6909: Use int32 acc for the Mean op	Johan Alfvén
	Changed acc type from int16 to int32. This will solve saturation problems and the constraint added in commit "MLBEDSW-5029: Output diff for Mean op" can be removed. Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: I05ec8835b43313b1a264d61a2b147fa62da123fe
2022-09-06	MLBEDSW-6870 Optimisations for PReLU	Fredrik Svedberg
	Added optimisations for PReLU when the alpha values allows it. Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com> Change-Id: Iff9124e691663ee495379f89900e7c35dbc5f948
2022-09-01	MLBEDSW-5029: Output diff for Mean op	Johan Alfvén
	Fixed three test cases causing output diff compared to the reference kernel for the Mean operator. - If there is a possibility that the accumulator could saturate the Mean op must run CPU - Use correct rounding for the bias term - If a Reshape op is followed by a Mean op, push the Reshape op to the CPU since this cannot be handled by the NPU Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: I734465730372105821a5e2f73a6a125b9eb7d7f4
2022-08-31	MLBEDSW-6832 PReLU support in Vela	Fredrik Svedberg
	Added PReLU support in graph optimiser. Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com> Change-Id: I3a188675e3edcdf0b4a4bfcdd134fda0bf8a560f
2022-07-23	MLBEDSW-4157: Add RESIZE_NEAREST_NEIGHBOR support	Tim Hall
	- Changed ResizeBilinear to support ResizeNearestNeighbor as well for 1x1 IFM, IFM equal OFM, and non-align corners - Added support for ResizeNearestNeighbor with align corners by converting to a DepthwiseConv - Updated supported operator unit tests - Added is_resize() helper function and some associated refactoring Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: Id5bdf2a25e8aa6a4f28b7236250abf768141ce37
2022-07-23	MLBEDSW-6616: ResizeBilinear align corners is incorrect	Tim Hall
	- Fixed align corners support when converting in to upscale and average pool. The problem was due to the wrong ratio ifm to ofm size, causing an scaling factor that was not 2x/4x/8x. Works for uint8, int8 and int16. - Fixed checking of align corners in supported operators check - Added additional supported operators check for the size tensor - Updated and added more supported operators unit tests Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: Idb78fa9e76ede2c37e8ac6cb1c322154bd156898
2022-07-15	MLBEDSW-6703 Add SHAPE operator to supported operators	Fredrik Svedberg
	Added SHAPE operator to the supported operators report. Updated the constraints for QUANTIZE and SHAPE operator. Also fixed RESHAPE consuming statically optimised shape. Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com> Change-Id: I1d964d602d3f361a0f16dae8133197280dd84c48
2022-07-13	MLBEDSW-6687 Vela crashes in npu_serialisation.py and tflite_graph_optimiser.py	Fredrik Svedberg
	Fixed static optimisation of Quantize operator by running unsupported formats on CPU. Also added support for int16 and corrected the calculation. Change-Id: I861c712aa6258dba53fcf4d5dae45d1d416e6141 Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
2022-06-29	MLBEDSW-6314 Static optimisation for quantise OP	Ayaan Masood
	Quantise op becomes constant if input is known at compile time Quantised values calculated if input of op is const and float *Const inputs to quant op that are int are requantized Change-Id: Ic94a72a392af709fe6a640d7dacbb5dc2334f16f Signed-off-by: Ayaan Masood <Ayaan.Masood@arm.com>
2022-06-29	MLBEDSW-6313 Static optimisation for Shape OP	Ayaan Masood
	Shape OP value is available at compile time hence it can be optimised Disconnected shape OP at compile time from parent tensor *Transformed shape OP tensor into constant Change-Id: I0a024269e2b592c6146dd72e62d7a41951fb727a Signed-off-by: Ayaan Masood <Ayaan.Masood@arm.com>
2022-04-21	MLBEDSW-5384 FC layers run on NPU if underlying shape is 2D	Ayaan Masood
	Added generic function which checks if underlying shape of FullyConnected operation is 2D and performs shape reduction Fully connected operation >2 dimensions now run on NPU if the above case is satisfied constraint_fc_output_2d and rewrite_fully_connected_input refactored Added unit test to confirm this functionality Signed-off-by: Ayaan Masood <Ayaan.Masood@arm.com> Change-Id: I0e29c767e5b84841eb53bbc44464b36a454f7b38
2022-04-20	MLBEDSW-6407: Vela fails with TypeError in npu_performance	Tim Hall
	- This is due to calling range() on a non-integer value which in turn is due to a change in the behaviour of round() on numpy.float64 values - The fix is to always force the output of the round() to be an integer and thereby stop whole number floating point values propagating into the kernel dimensions which later feed into the range(). Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: Ic75cb6ba85a90c81c1d762067d89a10caaa13b92
2022-03-30	Update version of Black to 22.3.0	Jonas Ohlsson
	Update version of Black to 22.3.0 due to updated dependencies. Updates to fix reported issues due to new version. Signed-off-by: Jonas Ohlsson <jonas.ohlsson@arm.com> Change-Id: I60056aae452093ce8dcea1f499ecced22b25eef1
2022-03-21	MLBEDSW-6298: MLCE: Unable to find a valid block config	Tim Hall
	- Fixed a bug due to ResizeBilinear modifying the attributes of a shared IFM - The ifm_resampling_mode is now an attribute of an operator rather than a tensor - Changed all calls to try_block_config() to use the attribute rather than recalculating it in multiple places Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I4641e9cd6b049bd4186776d98e3e751c5e5bcc06