ethos-u/ethos-u-vela.git

Age	Commit message	Author
2021-07-05	MLBEDSW-3890 handling scratch tensor	Samuel Panijel
	vela: Possible issue with handling scratch tensor on non-ethosu custom op Fixing a case where a tensor input name ends with "scratch". 4 test cases passing this change: 1) non-optimized tflite - input tensor name is _split_1_scratch 2) optimized tflite - input tensor name is _split_1_scratch 3) optimized tflite - input tensor name is _split_1_scratch and custom operation name is non_ethus_u 4) non-optimized tflite - input tensor name is _split_1_scratch_fast Change-Id: Ia515805825b7f9a646607c5075b7ea3a0cf6aad8 Signed-off-by: Samuel Panijel <samuel.panijel@arm.com>
2021-06-25	MLBEDSW-4819: MLCE: weight_compressor int has no attribute astype	Tim Hall
	- Added type checking so that the correct type conversion can be used Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: Ia83f46029fac7bad63844c090b87d23c2072b105
2021-06-22	MLBEDSW-4807 Elementwise IFM/OFM overlap	Jacob Bohlin
	Reinstated allowing the IFM and OFM tensor to overlap for Elementwise operations. Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com> Change-Id: Ide6db7781f3ca7a36c8ff9e3efdc7943a7bf6d7f
2021-06-17	Block config optimisation for 256/512 configurations	Tim Hall
	- 256 and 512 configuration variants execute 1D convolutions in an optimised manner compared to their 2x2 microblock dimensions. This commit takes this into account to improve Conv1D throughput on these configurations. Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I6ecdf6e4a219e356327b22f8393f50ee8817af23
2021-06-17	vela: Improve block configuration and weight buffering algorithm	Tim Hall
	- Update block config selection to take into account partial IFM fetches at edge of non-whole OFM block data. - Change to scheduler depth slicing for networks in MLBEDSW-4637 for improved buffering. This helps general performance by buffering larger depth slices. - Bug fix for opt_max_schedule always being fitted to SRAM which prevented the optimisation step running in some cases. Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I97642c5adec3bb684b1daabf2b81574c27d4eef2
2021-06-16	MLBEDSW-4635: yolo_v3 output diff	Jacob Bohlin
	Fixed an issue where the scheduler would set the incorrect tensor layout. Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com> Change-Id: I28abdf3f3c523d7da0cf8840316ece37dad364ab
2021-06-16	MLBEDSW-4644 Removed unnecessary LUT DMA commands	Jacob Bohlin
	Fixed a bug where a DMA command for the activation LUT would be issued for every depth-slice of an operator. This caused multiple unnecessary DMA commands. Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com> Change-Id: I9c291692d8002f05656bb88214836ab389a56cdb
2021-06-09	mlw_codec: Fixed alignment warning	Mauricio Briceno
	- Restructured pointer API to prevent alignment warnings - Changed weight tensor data type to np.int16 Change-Id: I310c1ca733bf98724c84e8b2194becb4be3e7eea
2021-06-08	MLBEDSW-4602: Fix Deepspeech scale & bias reuse issue.	Tim Hall
	- Deepspeech reuses identical weights and biases throughout the network. Since biases are now interleaved with weights there is a scaling issue when the ifm scales differ between operations using the same weight and scale tensor. - This commit uses interleaved weights/scales on their first use but separates scales to source memory on subsequent use (if the ifm scale is different). Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I7aae163438160a919cae04e235966e75355a6148
2021-06-03	MLBEDSW-4688: Fix performance estimates	Patrik Gustavsson
	Putting back the estimates related to unbuffered weight transfer. Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: I2072066bc1e01814fe3b0b87a912f69646da861c
2021-05-27	MLBEDSW-4034: New Scheduler Size or Performance Optimisation	Tim Hall
	- Merged dev/scheduler at 83639f90e8c828f70de6e29142355a940224959b Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I0050529d4b42da93768c7264296434dd877fb5b4
2021-05-21	MLBEDSW-4219: Add tensor allocation info to summary	Tim Hall
	- Moved new tensor allocation info under --verbose-allocation flag - Tidied up and added histogram to --verbose-allocation print Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I76fb5187319aedf86f599f57b766220cafc17326
2021-05-20	[MLBEDSW-4623] Fix sub-module imports	Fredrik Svedberg
	Fixed sub-module imports. Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com> Change-Id: I6ab5c04ba5f3411f8cf8ac95606fe036fae11442
2021-05-20	Fix mlw_codec build warnings	Fredrik Svedberg
	Fixed mlw_codec build warnings. Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com> Change-Id: I8ec8fb3b092cce0629c690677984549febf01adc
2021-05-13	Fix mlw_module	Fredrik Svedberg
	Fixed size calculation in mlw_reorder_encode. Fixed build warnings. Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com> Change-Id: Iac9408b9972a29b5a3403ba11f80dc4eaaa35453
2021-05-07	weight_compressor: added mlw_reorder_encode [tag: 3.0.0.rc1]	Mauricio Briceno
	- Moves reordering to C - Runtime for encoding weights is greatly reduced Change-Id: Ifff01e7b1ea6d5cec68310a155c3b80aa1a38545 Signed-off-by: Mauricio Briceno <mauricio.briceno@arm.com>
2021-05-07	[MLBEDSW-4530] Improve --verbose-graph output	Fredrik Svedberg
	Improved --verbose-graph output by adding labels to each print. Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com> Change-Id: I49039ff6af1c06f49208591f02effa4ff73f982a
2021-05-07	MLBEDSW-4534 Limit ifm box depth	Patrik Gustavsson
	Limit the ifm box depth to ifm shape depth Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: I889aed9ef7e338faa1fca074fb2843fa2cedecc8
2021-05-07	vela: Remove unused serialisation params	Tim Hall
	- Removed unused nng parameter Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I0bb2eb101a84ea8022c8eb7bcbd86d617e933510
2021-05-06	[MLBEDSW-4254] Improve weight information in summary	Fredrik Svedberg
	Improved weight information shown in summary if --verbose-weights option is used. Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com> Change-Id: Iac142f2a813bf1c05aa9da3f8a384466e2914d06
2021-05-04	MLBEDSW-4429: elementwise_broadcast output diff	Dwight Lidman
	This commit fixes a regression caused by a recent commit where io_ranges and elementwise_broadcast were failing with off-by-one errors. The culprit was the incorrect usage of NATURAL rounding in cases of elementwise ADD and SUB where the input and output scales were equal and advanced scaling was not used. Signed-off-by: Dwight Lidman <dwight.lidman@arm.com> Change-Id: I35d56298e911a4d1bbca7d201bcde6044c8a5490
2021-05-03	MLBEDSW-4539: MEAN axis check exception fix	Dwight Lidman
	A recent fix to another MEAN bug introduced a new bug. The bug was due to some incorrect logic for checking the axis attribute. Signed-off-by: Dwight Lidman <dwight.lidman@arm.com> Change-Id: I65d3486a12e029f7c4450074f03fcd1974f65d8a
2021-04-30	MLBEDSW-4350 Use padding instead of skirt for merged SplitSlice	Henrik G Olsson
	When the operations are merged some later passes are confused by start and end coordinates for the convolution not being along the edges of the IFM, and omitting padding. But we need the zero padding to keep the output the same as before the transformation. Also fixes bug where Vela could crash if convolution had explicit start coordinate. Signed-off-by: Henrik G Olsson <henrik.olsson@arm.com> Change-Id: I8449d237350d528f83738b2f09124f1ed79c07ca
2021-04-29	MLBEDSW-4501: Support MEAN single axis variation	Dwight Lidman
	When a MEAN operator with a single reduction axis specifies the axis index attribute as an array with a single element rather than a scalar index, the operator is placed on the CPU even though it is technically supported. This commit fixes this issue and also adds some new tests for the axis constraints. Signed-off-by: Dwight Lidman <dwight.lidman@arm.com> Change-Id: Ia287f3b9cc80a805e972cd4b2962e52526a8dc16
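The scalar-versus-array ambiguity described above can be sketched as a small normalisation step. This is only an illustration, not Vela's code: the helper names `normalise_axes` and `is_single_axis` are hypothetical.

```python
def normalise_axes(axis):
    """Return the reduction axes as a list.

    Accepts either a scalar index (e.g. 2) or a sequence of indices
    (e.g. [2]), so that both spellings of "one axis" look the same.
    Hypothetical helper, not part of Vela.
    """
    if isinstance(axis, int):
        return [axis]
    return [int(a) for a in axis]


def is_single_axis(axis):
    """True when exactly one reduction axis is specified, in either form."""
    return len(normalise_axes(axis)) == 1
```

With this shape of check, `is_single_axis(2)` and `is_single_axis([2])` give the same answer, which is the property the commit message describes.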
2021-04-21	MLBEDSW-4413: MobileNet V3 regression	Dwight Lidman
	This commit resolves a recent regression in multiple networks (including MobileNet V3). The regression was caused by a recent change to IFM block size calculation where a term was mistakenly left out (due to it missing from the spec). The IFM microblock size has been amended for the Ethos-U55 128 config and the block size calculations now use these sizes instead (although equivalent with OFM microblock sizes). Signed-off-by: Dwight Lidman <dwight.lidman@arm.com> Change-Id: Ic504b4becd6c3a26334a7275189d78ff0fe2cf69
2021-04-19	[MLBEDSW-4421] Fix verbose-all crash	Fredrik Svedberg
	Fixed exception when using the CLI option --verbose-all. Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com> Change-Id: I203fe31ad6914936730343958009e2370045c67c
2021-04-19	[MLBEDSW-4414] Fix verbose-operators for multiple custom ops	Fredrik Svedberg
	Fixed exception for --verbose-operators option when there are multiple custom operators in the network. Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com> Change-Id: I5ab743d96a4e0367818fbe46cc47896c691d888c
2021-04-16	MLBEDSW-4132 Fix off-by-one error for negative packing axis	Henrik G Olsson
	Also applies to unpack. Signed-off-by: Henrik G Olsson <henrik.olsson@arm.com> Change-Id: I07e7083aeb6aefd6e26f9d134b858080f28f1719
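A negative axis is easy to get off by one for pack, because the output has one more dimension than each input, while for unpack the axis refers to the input itself. The sketch below shows the general pack/unpack axis semantics (the same rule NumPy's `stack` follows). It is not the actual Vela fix, and the function names are hypothetical.

```python
def normalise_pack_axis(axis, input_rank):
    """Map a possibly negative pack axis to a non-negative index.

    Packing adds a dimension, so the axis is counted against the
    OUTPUT rank (input_rank + 1), not the input rank.
    """
    output_rank = input_rank + 1
    if not -output_rank <= axis < output_rank:
        raise ValueError(f"axis {axis} out of range for output rank {output_rank}")
    return axis + output_rank if axis < 0 else axis


def normalise_unpack_axis(axis, input_rank):
    """Map a possibly negative unpack axis to a non-negative index.

    Unpacking removes a dimension, so the axis is counted against the
    INPUT rank.
    """
    if not -input_rank <= axis < input_rank:
        raise ValueError(f"axis {axis} out of range for input rank {input_rank}")
    return axis + input_rank if axis < 0 else axis
```

For example, packing rank-3 inputs along axis -1 places the new dimension at index 3, not index 2.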
2021-04-16	MLBEDSW Vela: Fix check for format restrictions	Patrik Gustavsson
	Fixed the check related to if there are any CPU producers/consumers. Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: I0ed08c650d1ca34e8e148aee68a5ed09c25fdd87
2021-04-16	MLBEDSW-3550 Only use simple scaling when bitexact with TFLite	Henrik G Olsson
	For 8 bit arithmetic we cannot guarantee reproducibility in the general case since precision differs, affecting rounding near half integers. It should be safe when the ratio between output and input scales has its 12 LSBs all set to 0, however. For 16 bit arithmetic it should be sufficient to adjust the input and output scalings with a factor of 2 to get the same rounding. Signed-off-by: Henrik G Olsson <henrik.olsson@arm.com> Change-Id: I809c0042615d16c5488d61f0c7d88e1a1315e6eb
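The 8-bit condition above can be illustrated with a bit test. The commit message does not say how the scale ratio is represented, so this sketch assumes the ratio is held as an IEEE 754 single-precision float and that "12 LSBs" refers to the low 12 bits of that encoding. The function name is hypothetical.

```python
import struct


def low_12_bits_clear(output_scale, input_scale):
    """Check whether the float32 encoding of output_scale / input_scale
    has its 12 least significant bits all set to 0.

    ASSUMPTION: the ratio is interpreted as a float32 bit pattern; the
    commit message does not state the representation.
    """
    ratio = output_scale / input_scale
    # Round the ratio to float32 and reinterpret its bytes as an unsigned int.
    (bits,) = struct.unpack("<I", struct.pack("<f", ratio))
    return (bits & 0xFFF) == 0
```

Under this assumption a ratio such as 0.5 passes the test, whereas 0.1 does not, since its float32 mantissa uses the low bits.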
2021-04-15	MLBEDSW-4397 Fix Reshape ifm/ofm prod/cons by cpu op	Patrik Gustavsson
	Subgraph (sg) inputs and outputs are not the only tensors that need to be considered before removing a Reshape. Added a check for whether the Reshape ifm is produced, or the ofm consumed, by a CPU op. The handling is the same as when the tensor is an sg input/output. Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: If509e1d23e3f22ed4c963d8dabd8c00c6b9c07e3
2021-04-14	MLBEDSW-4103: Block config calc update	erik.andersson@arm.com
	The previous calculation of the IFM block height and width yielded incorrect block configs when running transpose_conv networks with certain hardware constraints. Signed-off-by: erik.andersson@arm.com <erik.andersson@arm.com> Change-Id: I8b6936a3e8c37da640bdeac84ecfea8363f910f9
2021-04-09	MLBEDSW-4073 Handle elementwise ops with same tensor for both inputs	Henrik G Olsson
	Signed-off-by: Henrik G Olsson <henrik.olsson@arm.com> Change-Id: I0e6bb46b7b91ed10f5bda34fba66d8b714560f47
2021-04-08	Fix stats_writer.py exception	Fredrik Svedberg
	Fixed exception in stats_writer.py. Change-Id: I625390aec185345cadd0d8fa5edb66907b9be242 Signed-off-by: Fredrik Svedberg <Fredrik.Svedberg@arm.com>
2021-04-08	MLBEDSW-4334 Non-linear format decision in graph opt.	Patrik Gustavsson
	Refactored the check for whether a non-linear tensor format can be used. - Flag avoid_NHCWB16 replaced with needs_linear_format - Restriction checks moved into one function in the graph optimiser. Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: Iec5c7996a1a6039cad052197f1ae56f7c0290440
2021-04-07	MEAN implementation changed to Average Pool	Dwight Lidman
	This is a small commit which changes one of the four MEAN implementations to a simpler one, using an AvgPool instead of a DepthwiseConv. Signed-off-by: Dwight Lidman <dwight.lidman@arm.com> Change-Id: I9e8af071e8b820796577ee4792b4812a1212602b
2021-04-06	MLBEDSW-4249 Hide stack traces in error messages	Henrik G Olsson
	When faced with an invalid tflite file we now catch the exception to make it clear to the user that the issue is with the input and not with Vela, instead of just crashing. Same also applies to our own Vela error messages. Signed-off-by: Henrik G Olsson <henrik.olsson@arm.com> Change-Id: I56a81c5be9e1f46f3b98a88c6d24ee42fa0e450d
2021-03-31	MLBEDSW-4286: Bug fix Concat using IFM streaming	Louis Verhaard
	IFM box calculation was wrong because 2 variables were referencing/updating the same list. Signed-off-by: Louis Verhaard <louis.verhaard@arm.com> Change-Id: Ibed4e94c474682e14a6dd898029f14af11c9479a
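The class of bug described above, two names bound to the same Python list, can be reproduced in a few lines. This is a generic illustration with hypothetical function names, not the Vela box calculation itself.

```python
def box_end_buggy(start, extent):
    """Buggy: 'end' is just another name for 'start', so updating
    end in place silently changes start as well."""
    end = start  # alias, NOT a copy
    for i, size in enumerate(extent):
        end[i] += size
    return start, end


def box_end_fixed(start, extent):
    """Fixed: copy the list first so start is left untouched."""
    end = list(start)  # independent copy
    for i, size in enumerate(extent):
        end[i] += size
    return start, end
```

Calling the buggy version with `start = [0, 0, 0]` returns a `start` that has been overwritten with the end coordinates, whereas the fixed version leaves it at `[0, 0, 0]`.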
2021-03-31	MLBEDSW-3461: Check configuration SRAM size	Louis Verhaard
	Added check that configured SRAM size is within bounds. Change-Id: I5dce3df0788f2b00402e9a541bad11612fa19463 Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
2021-03-31	Handle absent weights_compression_ration when printing	Henrik G Olsson
	Change-Id: Iafb31af73d80adcc901b241c34dda78be360bc14 Signed-off-by: Henrik G Olsson <henrik.olsson@arm.com>
2021-03-31	MLBEDSW-3502: Bug fix addresses >= 32 bit	Louis Verhaard
	Bug fix in generation of register command offsets that do not fit in 32 bit. Signed-off-by: Louis Verhaard <louis.verhaard@arm.com> Change-Id: Iabb99cf6536c0f77b934691f8744df61f1eab3ed
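Handling an offset that does not fit in 32 bits generally means splitting it into a low 32-bit word and the remaining high bits. The sketch below shows only that generic split. The actual Ethos-U register layout and field widths are not described here, so the 64-bit limit and the function names are assumptions.

```python
MASK_32 = 0xFFFFFFFF


def split_address(address):
    """Split an address into (high, low), where low is the bottom 32 bits.

    ASSUMPTION: addresses are treated as unsigned values below 2**64;
    the real command encoding may allow fewer high bits.
    """
    if not 0 <= address < (1 << 64):
        raise ValueError(f"address {address:#x} out of range")
    return address >> 32, address & MASK_32


def join_address(high, low):
    """Inverse of split_address."""
    return (high << 32) | (low & MASK_32)
```

A generator that emits only the low word would truncate any address of 2**32 or above, which is the kind of failure a split like this avoids.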
2021-03-30	Performance improvement in tensor allocation	Louis Verhaard
	- Tensor allocation verification was O(N^2), is now closer to O(N) - Removed a sort in HillClimb allocator Change-Id: I286a269881490c485cc2b0eeab3b1ecffa8f3df0 Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
2021-03-30	MLBEDSW-4219: Add tensor allocation info to summary	erik.andersson@arm.com
	Added the theoretical minimum of the maximum memory usage and the allocator overhead to the Vela summary. Signed-off-by: erik.andersson@arm.com <erik.andersson@arm.com> Change-Id: If373dfeaac50d6f8b56554d435bf22af2c3acda3
2021-03-26	MLBEDSW-4163: OFM zero point outside valid range	Dwight Lidman
	This commit fixes a bug where the OFM zero point would assume values outside of [0, 255] due to its usage as a stand-in for a bias when emulating the TensorFlow Lite implementation of MEAN. The solution is to adjust for the bias using an ADD operator with the bias value as an int16 const tensor. The 16-bit integer is needed as the bias is 32 bits in the original implementation but can effectively assume values in the range [-255, 255]. Signed-off-by: Dwight Lidman <dwight.lidman@arm.com> Change-Id: I84df48ea89bb559954f1b2c289b65e08a6418274
2021-03-25	MLBEDSW-4071: Power of two handling 16-bit tanh/sigmoid	Louis Verhaard
	Added special handling of power-of-two input scales for 16-bit tanh/sigmoid to align with the reference. Change-Id: I87831bcd587623d7db7100e768905355c2c98e9d Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
2021-03-22	MLBEDSW-3502: Add address checks	Louis Verhaard
	Added checks during command stream generation to make sure that address boundaries are respected. Change-Id: I4dbc693b42d54e35c8fcc785e8be88059e409eec Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
2021-03-19	MLBEDSW-3458: Added command stream size check.	erik.andersson@arm.com
	If the command stream size exceeds a certain threshold, a VelaError will now be raised. Signed-off-by: erik.andersson@arm.com <erik.andersson@arm.com> Change-Id: I9b9383f4c298a778b160cd527374e9244e4cae26
2021-03-19	Address generation fix	Mauricio Briceno
	- The architecture supports address extensions wider than 32b via the cmd1.param Change-Id: I7a01b4596f7a54f6be05b8e2c454494e6751757b Signed-off-by: Mauricio Briceno <mauricio.briceno@arm.com>
2021-03-16	MLBEDSW-4215: Add support for MEAN to match QuantizedMeanOrSum implementation	Dwight Lidman
	This commit adds support for emulating the behavior of the QuantizedMeanOrSum implementation of MEAN in TensorFlow Lite. Signed-off-by: Dwight Lidman <dwight.lidman@arm.com> Change-Id: Ifd24e0e678e2f85cd66ab82deeaaf010d5351b1e
2021-03-16	MLBEDSW-4223: Full support for PAD operator	Louis Verhaard
	- Added full support for PAD operator - Hardware padding is still used whenever possible - Bug fix Pad followed by max pool if IFM contains negative values Change-Id: Ifc64d1943737d94466f5e2821009dab12a49a965 Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>