ethos-u/ethos-u-vela.git

Age	Commit message	Author
2022-04-27	MLBEDSW-6425: Update to TensorFlow 2.8 (bugfix)	Rickard Bolin
	Generate flatbuffer files with relative imports. Signed-off-by: Rickard Bolin <rickard.bolin@arm.com> Change-Id: Idd59bb2ebb829bc42677920577c1f8a04e23ca68
2022-04-27	MLBEDSW-6425: Update to TensorFlow 2.8	Rickard Bolin
	Update the flatbuffers generated code to comply with TensorFlow 2.8 Signed-off-by: Rickard Bolin <rickard.bolin@arm.com> Change-Id: Ia65325b88745e49dbafa803a38c0ea0e7d0478ba
2022-04-21	MLBEDSW-5384 FC layers run on NPU if underlying shape is 2D	Ayaan Masood
	Added a generic function which checks if the underlying shape of a FullyConnected operation is 2D and performs shape reduction. Fully connected operations with more than 2 dimensions now run on the NPU if the above case is satisfied. constraint_fc_output_2d and rewrite_fully_connected_input refactored. Added unit test to confirm this functionality. Signed-off-by: Ayaan Masood <Ayaan.Masood@arm.com> Change-Id: I0e29c767e5b84841eb53bbc44464b36a454f7b38
2022-04-20	MLBEDSW-6407: Vela fails with TypeError in npu_performance	Tim Hall
	- This is due to calling range() on a non-integer value which in turn is due to a change in the behaviour of round() on numpy.float64 values - The fix is to always force the output of the round() to be an integer and thereby stop whole number floating point values propagating into the kernel dimensions which later feed into the range(). Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: Ic75cb6ba85a90c81c1d762067d89a10caaa13b92
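The failure mode described above can be reproduced in miniature. This is a minimal sketch, not Vela's code: it uses a built-in `float` (via `round()` with `ndigits`) to stand in for the whole-number floating point value, and `kernel_extent` is a hypothetical helper name.

```python
def kernel_extent(scale: float, size: float) -> int:
    """Hypothetical helper: always return an int so the result can feed range()."""
    return int(round(scale * size))


# round() with an ndigits argument returns a float, even for whole numbers.
as_float = round(1.5 * 2.0, 0)

error = None
try:
    range(as_float)  # a whole-number float is still rejected by range()
except TypeError as exc:
    error = str(exc)

# Forcing the rounded value to int keeps floats out of the kernel dimensions.
steps = list(range(kernel_extent(1.5, 2.0)))
```

Wrapping the call in `int()` is harmless when `round()` already returns an integer, so the same expression works regardless of which numeric type comes back.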
2022-04-20	MLBEDSW-6371: Output diff caused by operator clone bug	Rickard Bolin
	- Modify the operator clone function to also clone resampling mode attribute. A previous patch changed the ifm resampling mode to be an attribute of an operator rather than a tensor but did not modify the operator clone function to clone the new attribute. Signed-off-by: Rickard Bolin <rickard.bolin@arm.com> Change-Id: I7a2f6103666a0997f657de20ad962e849976b904
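The class of bug described above (a clone function that predates a newly added attribute) can be sketched as follows. The `Operation` class and its fields are simplified assumptions for illustration and do not mirror Vela's actual operator class.

```python
class Operation:
    """Simplified, hypothetical stand-in for an operator object."""

    def __init__(self, name):
        self.name = name
        self.attrs = {}
        # Attribute that moved from the tensor to the operator.
        self.ifm_resampling_mode = "NONE"

    def clone(self, suffix="_clone"):
        res = Operation(self.name + suffix)
        res.attrs = dict(self.attrs)
        # Without this line the clone silently falls back to the default mode.
        res.ifm_resampling_mode = self.ifm_resampling_mode
        return res


op = Operation("conv")
op.ifm_resampling_mode = "NEAREST"
copy_op = op.clone()
```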
2022-04-08	MLBEDSW-6339 Performance drop on wav2letter	Johan Alfvén
	Corrected calculation for used buffering depth. Before change there were scenarios when it was set to smaller sizes than needed. Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: I162859ade78487e848510c6a605685e4568c7068
2022-04-04	vela: Minor refactor [dev/mlbedsw-6271]	Tim Hall
	- Changed comments to docstring on QuantizationParams - Simplified op type to op name conversion Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I2fdf5922cc17944c9bd37917a85fdfe50a1e651d
2022-03-31	vela: Added debug info to external API	Tim Hall
	- Added optional name attributes to operators and tensors Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I3b5d881a7b1043a6ba4b58fff5d7532b271ba536
2022-03-30	Update version of Black to 22.3.0	Jonas Ohlsson
	Update version of Black to 22.3.0 due to updated dependencies. Updates to fix reported issues due to new version. Signed-off-by: Jonas Ohlsson <jonas.ohlsson@arm.com> Change-Id: I60056aae452093ce8dcea1f499ecced22b25eef1
2022-03-30	MLBEDSW-6263: Use separate tensors for double buffering	Louis Verhaard
	Uses separate tensors for the individual weight buffers in case of weight double buffering. Each weight buffer tensor gets its own individual live range. Change-Id: I724a8c61a7045615fbd2ed9535663076ac8edd13 Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
2022-03-28	MLBEDSW-6249: HillClimb improved stuck avoidance	Louis Verhaard
	Added a mechanism that reduces the risk for getting stuck if the current best allocation cannot be improved by only swapping 2 indices. Change-Id: Ife379757752f0c1ed54af7bd826e0a9390d54267 Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
2022-03-28	MLBEDSW-6098: Order check in cascade builder	Louis Verhaard
	Added checks in the cascade builder to ensure that scheduled operations are in the correct order. Change-Id: Ic1765a6a1cb8335ff222bfe3b2d2e642980967d7 Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
2022-03-21	MLBEDSW-6298: MLCE: Unable to find a valid block config	Tim Hall
	- Fixed a bug due to ResizeBilinear modifying the attributes of a shared IFM - The ifm_resampling_mode is now an attribute of an operator rather than a tensor - Changed all calls to try_block_config() to use the attribute rather than recalculating it in multiple places Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I4641e9cd6b049bd4186776d98e3e751c5e5bcc06
2022-03-21	MLBEDSW-3367 Add mypy to pre-commit	Jonas Ohlsson
	Add mypy to pre-commit and clean up all reported errors. Signed-off-by: Jonas Ohlsson <jonas.ohlsson@arm.com> Change-Id: If7dc869f5fecdb0e2db40f14e7d9db21aa33df71
2022-03-21	MLBEDSW-6312: Find block config improvement	Louis Verhaard
	- The number of accumulators is doubled in an Ethos-U configuration with 2 cores - Likewise, for elementwise, depthwise and pooling operations the IFM buffer depth capacity is doubled - FindBlock: step the search space depth in multiples of ublock * ncores Change-Id: I923cc347a2f252876d405ed93095d39181103f81 Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
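Stepping the depth search in multiples of `ublock * ncores` can be illustrated with a short sketch. The function name and the example numbers are assumptions chosen for illustration, not values taken from Vela.

```python
def depth_candidates(max_depth: int, ublock_depth: int, ncores: int) -> list:
    """Candidate block depths, stepped so each core gets whole micro-blocks."""
    step = ublock_depth * ncores
    return list(range(step, max_depth + 1, step))


# Example with assumed values: micro-block depth 8 on a 2-core configuration.
candidates = depth_candidates(64, 8, 2)
```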
2022-03-17	MLBEDSW-5332: Bug fix optimise_strided_conv	Louis Verhaard
	Added check that horizontal padding is unaffected when applying graph optimization "optimise_strided_conv". Signed-off-by: Louis Verhaard <louis.verhaard@arm.com> Change-Id: I7032a44163e300cdf62cf615b4b10a1417e38eaa
2022-03-14	MLBEDSW-6245: Bug fix fast storage allocator	Louis Verhaard
	Fast storage allocator did not always return an optimal allocation. Signed-off-by: Louis Verhaard <louis.verhaard@arm.com> Change-Id: Ic758b6c4a82dc2633c4752b0c204a27ed36f651b
2022-03-14	Fix bug storing encoded NPU weight UUIDs	Jonas Ohlsson
	Fix bug when storing the encoded NPU weight UUID in the NPU performance estimation. Signed-off-by: Jonas Ohlsson <jonas.ohlsson@arm.com> Change-Id: I92127b0020f12352d923c0c9aa2b6f47e6110764
2022-03-11	Vela: Fix diff in mean op	Diqing Zhong
	- Extend ifm/ofm dimensions explicitly in mean op. This fixes a bug when the ifm/ofm shapes have different dimensions, e.g. IFM=1x19x18x25 axis=2 OFM=1x19x25: the ofm_shape should be 1x19x1x25, not 1x1x19x25 - Fix wrong weight shape Change-Id: I269eb71ea56c09deee2aa6c6433d9b2baa98a113 Signed-off-by: Diqing Zhong <diqing.zhong@arm.com>
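The shape example in this message can be checked with a small sketch. `keepdims_shape` is a hypothetical helper, not a Vela function; it keeps each reduced axis as size 1 rather than padding the reduced shape on the left.

```python
def keepdims_shape(ifm_shape, axes):
    """Return the OFM shape with every reduced axis kept as a dimension of 1."""
    normalised = {a % len(ifm_shape) for a in axes}
    return [1 if i in normalised else d for i, d in enumerate(ifm_shape)]


ifm_shape = [1, 19, 18, 25]
ofm_shape = keepdims_shape(ifm_shape, [2])

# Naive left-padding of the reduced shape 1x19x25 puts the 1 in the wrong place.
naive_shape = [1] + [1, 19, 25]
```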
2022-03-08	Updated elementwise cycle calculation	Johan Alfvén
	- Corrected rounding error - Number of elements depends on ofm format Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: I568d660b7571b6e0ffb131211b3a89c8be4b9295
2022-03-04	MLBEDSW-3367 Update pre-commit flake8 version	Jonas Ohlsson
	Update the version of flake8 used in pre-commit to facilitate adding mypy to pre-commit. Signed-off-by: Jonas Ohlsson <jonas.ohlsson@arm.com> Change-Id: I457dec87b77487ca6f14ff4a679c4cc927b272b0
2022-02-24	MLBEDSW-6247: MLCE: Issue when running a model with Padding	Tim Hall
	- The bug is that TransposeConv does not support explicit padding, which is needed in order to combine it with a preceding Pad op - The fix is to exclude such combinations Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: Ide03d034dc32b5fc9bcaaf291ab713482223a042
2022-02-22	MLBEDSW-5873 Fixed divide by zero warning in memory transfer efficiency	Ayaan Masood
	*Corrected calculation where use of the _estimate_memory_transfer_efficiency function when calculating the scaled bandwidth for LUT transfers resulted in a divide by zero error. Change-Id: I2356e924d9ca2f315ca1988f465f58b13a8fa4c9 Signed-off-by: Ayaan Masood <Ayaan.Masood@arm.com>
2022-02-22	MLBEDSW-5880 Fixed Vela verbose weight flag	Ayaan Masood
	Original weights and encoded NPU weights now report the correct size instead of zero when running vela with the --verbose-weights flag (code to update the aforementioned attributes was missing). Removed print references to unencoded NPU weight size. Change-Id: I6d3e41c04cc46d24eeb54cab89818a35e5df27be Signed-off-by: Ayaan Masood <Ayaan.Masood@arm.com>
2022-02-21	MLBEDSW-6148: Reduce SRAM usage for elementwise op	Johan Alfvén
	Reduce memory footprint when using optimization strategy Size for elementwise operations. Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: I30380aed587c31adbf7615f74179b4c5da686773
2022-02-15	MLBEDSW-5554: Constraints for single-axis mean operations on NPU	James Peet
	- Combine two MEAN operator checks for single axis averages into one - Only apply that check if the single axis is the height dimension (previously checks were also applied to width averages) - Rephrase some MEAN operator constraint descriptions Signed-off-by: James Peet <james.peet@arm.com> Change-Id: Ie0577f2b99aba1f3d6a4c39f8934eafe3813b736
2022-02-09	MLBEDSW-6180: Protect overwrite of subgraph output [3.3.0.rc1]	Johan Alfvén
	Make sure output from subgraph is write protected and not overwritten by an element wise op. Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: Ie26979913843c62794c5346a315b7089206850e0
2022-02-08	MLBEDSW-5582: MLCE: memory corruption with zero concat	Johan Alfvén
	Fixed problem when ofm is produced by different NPU nodes by making sure that output is always in NHWC format. Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: I00e55c989d5860499fbaf4f4318661b17b4bda7e
2022-02-08	MLBEDSW-5839: Port of improved spilling behaviour	erik.andersson@arm.com
	Ported the improved spilling behaviour from Regor into Vela. This replaces use_fast_storage_for_feature_maps with allocate_feature_maps and introduces the class called FastStorageComponentAllocator. Signed-off-by: erik.andersson@arm.com <erik.andersson@arm.com> Change-Id: I34785840c905a79750a62863773015b00fb43387
2022-02-07	MLBEDSW-6148: Allow overwrite of subgraph input	Johan Alfvén
	This change will allow the subgraph's input tensor to be reused/overwritten by the output from an elementwise op if there is only one consumer attached to the input tensor. Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: I317188af11a5470614770e18dc8973462fd5f21c
2022-02-02	MLBEDSW-3623: Diff on semantic_segmentation	Rickard Bolin
	The root cause of this diff is precision errors caused by rounding several times when performing a resize bilinear upscaling to more than twice the initial size. This is solved by rewriting the algorithm to perform nearest neighbour upscaling to the correct size and then applying one larger average pool instead of several 2x2 pools. Avgpool with padding is limited to kernel size 8x8, which constrains the largest possible bilinear upscaling to 8 times the input size. Signed-off-by: Rickard Bolin <rickard.bolin@arm.com> Change-Id: I846232f309ba26aab6c385e593cbe25b646c6668
2022-01-27	MLBEDSW-6060: Revert patch for MLBEDSW-5582	Johan Alfvén
	- Issue was due to a previous patch to fix MLBEDSW-5582 - Revert fix for MLBEDSW-5582 commit 849ff81f82c10a68898e5101930b92372bec5565 - Made new fix for MLBEDSW-5582 that enforces output tensors from NPU graphs to be in NHWC format. This information is otherwise lost in the case when parts of a concatenation are placed in different custom operators, resulting in a mismatch between NHWC and NHCWB16. Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: Iab3ba29d348353c854f357836e6aa7c338ae1572
2022-01-25	MLBEDSW-6018: Fix double buffering on dual core	Louis Verhaard
	Only the first half of weight double buffers was used on dual core configurations, which causes degraded performance. Change-Id: I49972c00343bbffbae28ed11c645e993ed61d43f Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
2022-01-24	MLBEDSW-5582: MLCE: memory corruption with zero concat	Tim Hall
	- This bug was due to an interaction between multiple Ethos-U custom operators and concatenation of constant tensors - It resulted in different parts of the concatenation being placed in different custom operators - The fix involves placing all parts of the concatenation into the same custom operator by switching to a breadth first search in pass packing Signed-off-by: Johan Alfven <johan.alfven@arm.com> Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: Ic47613cfd7bf675b4674dc91d6f9765849ba3130
2022-01-21	MLBEDSW-4870: Update to TensorFlow 2.7	Rickard Bolin
	Update the flatbuffers generated code to comply with TensorFlow 2.7 Signed-off-by: Rickard Bolin <rickard.bolin@arm.com> Change-Id: Iff29b05a6e145245861329b4ff9fc9fbd968da53
2022-01-18	Optimize tensor allocation verification	Johan Alfvén
	By not comparing items that have already been compared with each other, the number of iterations for the loop is reduced. For large networks with long live ranges, this improves compile time significantly. Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: I298cd6f109527fc32f6db77ffffca9e765a84ce0
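The idea of visiting each pair only once can be sketched with `itertools.combinations`, which yields n(n-1)/2 unordered pairs instead of the n*n iterations of a naive double loop. The `Alloc` record and the overlap rule below are assumptions made for illustration and are not Vela's data structures.

```python
from itertools import combinations
from typing import NamedTuple


class Alloc(NamedTuple):
    """Hypothetical allocation record: live range in time plus an address range."""
    name: str
    start_time: int
    end_time: int  # inclusive
    address: int
    size: int


def overlaps(a: Alloc, b: Alloc) -> bool:
    in_time = a.start_time <= b.end_time and b.start_time <= a.end_time
    in_memory = a.address < b.address + b.size and b.address < a.address + a.size
    return in_time and in_memory


def find_conflicts(allocs):
    # combinations() never yields (b, a) after (a, b), nor (a, a).
    return [(a.name, b.name) for a, b in combinations(allocs, 2) if overlaps(a, b)]


allocs = [
    Alloc("t0", 0, 2, 0, 16),
    Alloc("t1", 1, 3, 8, 16),  # alive together with t0 and shares bytes 8..15
    Alloc("t2", 4, 5, 0, 16),  # reuses t0's address after t0 and t1 are dead
]
conflicts = find_conflicts(allocs)
```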
2022-01-12	MLBEDSW-5534: Enet_640_640_int8 output diff	Rickard Bolin
	The output diff is caused by not including the kernel dilation when calculating the bottom padding to be used on the last h_stripe. This only shows up when using dedicated_sram since shared_sram does not split into multiple h_stripes and thus uses the padding specified by the skirt instead. Signed-off-by: Rickard Bolin <rickard.bolin@arm.com> Change-Id: I7f643748b153004d65be2124c0ac6c9d21cd803f
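The effect of leaving dilation out of the bottom padding can be seen with the generic convolution padding formula. This is a sketch under stated assumptions: `bottom_padding` and its parameters are hypothetical names, and the formula is the standard one for dilated convolutions rather than code taken from Vela.

```python
def bottom_padding(in_h, out_h, kernel_h, stride_h, dilation_h, top_pad):
    """Rows of padding needed below the input for the last output row."""
    # A dilated kernel spans more input rows than its nominal height.
    dilated_kernel_h = (kernel_h - 1) * dilation_h + 1
    needed = (out_h - 1) * stride_h + dilated_kernel_h - top_pad - in_h
    return max(0, needed)


# 3x3 kernel, stride 1, dilation 2, "same"-style output height, top padding 2.
with_dilation = bottom_padding(10, 10, 3, 1, 2, 2)
# Ignoring dilation (treating it as 1) underestimates the padding.
ignoring_dilation = bottom_padding(10, 10, 3, 1, 1, 2)
```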
2022-01-04	Set product in driver action config struct	Jonny Svärd
	Signed-off-by: Jonny Svärd <jonny.svaerd@arm.com> Change-Id: Ib398024c2f41beb4f93f7976c678a9fd54af94a5
2021-12-23	MLBEDSW-4704: Crash when loading empty constant tensors	erik.andersson@arm.com
	Fixed a crash caused by loading a network containing operators with empty constant tensors. This could occur when a branched network is split before said branches have converged. We now put the affected operator on the CPU. Signed-off-by: erik.andersson@arm.com <erik.andersson@arm.com> Change-Id: I63e9cd13cecf86d976c5750c727e218c334c32b5
2021-12-20	MLBEDSW-5740: Fix assert when setting address on identical LUT tensors	Johan Alfvén
	When an LUT tensor address is updated with another existing LUT tensor address, also make sure to update the equivalence id. Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: I5ce8c608d9ff6d31e16212b1a725b4147dd3f6f1
2021-12-20	MLBEDSW-5844: Inconsistent calculation of read shapes	Tim Hall
	- This bug causes a regression in the use of unpack and split operators - The bug is due to the read_shapes attribute being an absolute calculation for slice and strided_slice, but a relative one for unpack and split - The fix is to consistently treat the attribute as a shape relative to the read_offset Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I4504b161be507ea22ca6ee40fbe7808bfe049405
2021-12-17	MLBEDSW-5834: split shape is None when split offset is not	Tim Hall
	- This bug causes an exception to occur when trying to index split shape in Box.transform_with_strides_and_skirt() - The bug was due to the read shapes not being initialised when creating a primary op in pass packing Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I3ebd7fc4c7ef5c06488a36d8340a17ae6afd4609
2021-12-16	MLBEDSW-5629: MLCE: Model failing when creating explicit_padding	Tim Hall
	- Issue was due to a previous patch to fix MLBEDSW-4350 - Manually reverted that fix 5fabfcaa2b636b02899b4d6e0ccf95d853986475 - Made a new fix for MLBEDSW-4350 that calculates the padding and skirt by taking into account the split read offsets and shapes Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I96010c1b977011aecbc411a3c91ab3e61af22db4
2021-12-16	MLBEDSW-5554: Place MEAN op exceeding max height with axis==1 on CPU	Rickard Bolin
	Signed-off-by: Rickard Bolin <rickard.bolin@arm.com> Change-Id: I87dc5963972a7ef91db467b2ff8e0261e9899372
2021-12-02	MLBEDSW-5717 Fix for sigmoid int16	Patrik Gustavsson
	Fixed issue with sigmoid int16 with 1/2048 scaling. Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: I32718757e3776e6be89fe94a9b38368c78f0006b
2021-11-26	MLBEDSW-5417: Update release notes & supported ops [3.2.0.rc3] [3.2.0]	Dwight Lidman
	This commit updates the release notes for Vela version 3.2.0. It also updates the SUPPORTED_OPS.md file with new constraints. Updated the API version as a result of the bug fix commit 399c4a2d77df791e5d988c51d7fb1824ac4f266f. Updated Vela version in setup.py. Signed-off-by: Dwight Lidman <dwight.lidman@arm.com> Change-Id: I181e89f639a1da6013e8511ebe2d7e4f81242916
2021-11-25	MLBEDSW-5507: Fix vela summary for passes	Tim Hall
	- Removed the passes information as this was no longer correct or useful - Fixed the reporting of the number of CPU operators Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I80bf3f023de7d470af9aa5c6fe7bcb58c60ccd0b
2021-11-25	MLBEDSW-3602: Output mismatch on some mobilenet_v1 int8 and int16	Tim Hall
	- The failing tests contain operations with dynamic tensors which are not supported and therefore they should be placed on the CPU. However, a bug in the removal of RESHAPEs which contain a dynamic shape prevented this happening. - This change adds a check to make sure that RESHAPE ops with a dynamic shape tensor are not removed and instead are placed on the CPU. Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I2d7481f7f80f99a0f01df100d956933777e6875a
2021-11-15	TOSA: Add ifm ofm elem size into raw output [3.2.0.rc2]	Diqing Zhong
	Change-Id: I645496536a6bddf2bd289a87be9d7cef11693954 Signed-off-by: Diqing Zhong <diqing.zhong@arm.com>
2021-11-12	MLBEDSW-5383 npu_find_block_configs() differs between v2.1.1 and v3.1.0 [3.2.0.rc1]	James Ward
	* 1D optimised block_config was incorrectly being set to the ArchitectureBlockConfig in try_block_config() * Write external API test for the reduced block height case (on H256) Signed-off-by: James Ward <james.ward@arm.com> Change-Id: I9ced7eb31b23730e4423aabbaf769bc72fac8fc9