path: root/ethosu
Age  Commit message  Author
2021-07-05  MLBEDSW-3890 handling scratch tensor  Samuel Panijel
vela: Possible issue with handling scratch tensor on non-ethosu custom op
Fixes a case where a tensor input name ends with "scratch". Four test cases pass with this change:
1) non-optimized tflite - input tensor name is _split_1_scratch
2) optimized tflite - input tensor name is _split_1_scratch
3) optimized tflite - input tensor name is _split_1_scratch and custom operation name is non_ethus_u
4) non-optimized tflite - input tensor name is _split_1_scratch_fast
Change-Id: Ia515805825b7f9a646607c5075b7ea3a0cf6aad8 Signed-off-by: Samuel Panijel <samuel.panijel@arm.com>
2021-06-25  MLBEDSW-4819: MLCE: weight_compressor int has no attribute astype  Tim Hall
- Added type checking so that the correct type conversion can be used Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: Ia83f46029fac7bad63844c090b87d23c2072b105
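The kind of type check described here can be sketched as follows. This is a minimal illustration, not the actual weight_compressor code: the helper name `convert_to` is hypothetical, and it assumes the value may be either an array-like object with an `astype()` method or a plain Python `int`, which has none.

```python
def convert_to(value, dtype):
    """Convert value to dtype whether or not value provides astype().

    Hypothetical helper: array-like objects are converted with astype(),
    plain Python numbers by calling the target type directly.
    """
    if hasattr(value, "astype"):
        # Array-like object: use its own conversion method
        return value.astype(dtype)
    # Plain Python number: it has no astype(), so call the type constructor
    return dtype(value)
```

With NumPy, `dtype` would typically be something like `np.int64`; with a plain `int` input, the second branch avoids the "int has no attribute astype" failure named in the commit title.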
2021-06-22  MLBEDSW-4807 Elementwise IFM/OFM overlap  Jacob Bohlin
Reinstated support for overlapping IFM and OFM tensors in Elementwise operations. Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com> Change-Id: Ide6db7781f3ca7a36c8ff9e3efdc7943a7bf6d7f
2021-06-17  Block config optimisation for 256/512 configurations  Tim Hall
- 256 and 512 configuration variants execute 1D convolutions in an optimised manner compared to their 2x2 microblock dimensions. This commit takes this into account to improve Conv1D throughput on these configurations. Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I6ecdf6e4a219e356327b22f8393f50ee8817af23
2021-06-17  vela: Improve block configuration and weight buffering algorithm  Tim Hall
- Update block config selection to take into account partial IFM fetches at edge of non-whole OFM block data.
- Change to scheduler depth slicing for networks in MLBEDSW-4637 for improved buffering. This helps general performance by buffering larger depth slices.
- Bug fix for opt_max_schedule always being fitted to SRAM which prevented the optimisation step running in some cases.
Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I97642c5adec3bb684b1daabf2b81574c27d4eef2
2021-06-16  MLBEDSW-4635: yolo_v3 output diff  Jacob Bohlin
Fixed an issue where the scheduler would set the incorrect tensor layout. Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com> Change-Id: I28abdf3f3c523d7da0cf8840316ece37dad364ab
2021-06-16  MLBEDSW-4644 Removed unnecessary LUT DMA commands  Jacob Bohlin
Fixed a bug where a DMA command for the activation LUT would be issued for every depth-slice of an operator. This caused multiple unnecessary DMA commands. Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com> Change-Id: I9c291692d8002f05656bb88214836ab389a56cdb
2021-06-09  mlw_codec: Fixed alignment warning  Mauricio Briceno
- Restructured pointer API to prevent alignment warnings
- Changed weight tensor data type to np.int16
Change-Id: I310c1ca733bf98724c84e8b2194becb4be3e7eea
2021-06-08  MLBEDSW-4602: Fix Deepspeech scale & bias reuse issue.  Tim Hall
- Deepspeech reuses identical weights and biases throughout the network. Since biases are now interleaved with weights there is a scaling issue when the ifm scales differ between operations using the same weight and scale tensor.
- This commit uses interleaved weights/scales on their first use but separates scales to source memory on subsequent use (if the ifm scale is different).
Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I7aae163438160a919cae04e235966e75355a6148
2021-06-03  MLBEDSW-4688: Fix performance estimates  Patrik Gustavsson
Putting back the estimates related to unbuffered weight transfer. Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: I2072066bc1e01814fe3b0b87a912f69646da861c
2021-05-27  MLBEDSW-4034: New Scheduler Size or Performance Optimisation  Tim Hall
- Merged dev/scheduler at 83639f90e8c828f70de6e29142355a940224959b Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I0050529d4b42da93768c7264296434dd877fb5b4
2021-05-21  MLBEDSW-4219: Add tensor allocation info to summary  Tim Hall
- Moved new tensor allocation info under --verbose-allocation flag
- Tidied up and added histogram to --verbose-allocation print
Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I76fb5187319aedf86f599f57b766220cafc17326
2021-05-20  [MLBEDSW-4623] Fix sub-module imports  Fredrik Svedberg
Fixed sub-module imports. Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com> Change-Id: I6ab5c04ba5f3411f8cf8ac95606fe036fae11442
2021-05-20  Fix mlw_codec build warnings  Fredrik Svedberg
Fixed mlw_codec build warnings. Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com> Change-Id: I8ec8fb3b092cce0629c690677984549febf01adc
2021-05-13  Fix mlw_module  Fredrik Svedberg
Fixed size calculation in mlw_reorder_encode. Fixed build warnings. Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com> Change-Id: Iac9408b9972a29b5a3403ba11f80dc4eaaa35453
2021-05-07  weight_compressor: added mlw_reorder_encode  [3.0.0.rc1]  Mauricio Briceno
- Moves reordering to C
- Greatly reduces the runtime of weight encoding
Change-Id: Ifff01e7b1ea6d5cec68310a155c3b80aa1a38545 Signed-off-by: Mauricio Briceno <mauricio.briceno@arm.com>
2021-05-07  [MLBEDSW-4530] Improve --verbose-graph output  Fredrik Svedberg
Improved --verbose-graph output by adding labels to each print. Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com> Change-Id: I49039ff6af1c06f49208591f02effa4ff73f982a
2021-05-07  MLBEDSW-4534 Limit ifm box depth  Patrik Gustavsson
Limit the ifm box depth to ifm shape depth Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: I889aed9ef7e338faa1fca074fb2843fa2cedecc8
2021-05-07  vela: Remove unused serialisation params  Tim Hall
- Removed unused nng parameter Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I0bb2eb101a84ea8022c8eb7bcbd86d617e933510
2021-05-06  [MLBEDSW-4254] Improve weight information in summary  Fredrik Svedberg
Improved the weight information shown in the summary when the --verbose-weights option is used. Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com> Change-Id: Iac142f2a813bf1c05aa9da3f8a384466e2914d06
2021-05-04  MLBEDSW-4429: elementwise_broadcast output diff  Dwight Lidman
This commit fixes a regression caused by a recent commit where io_ranges and elementwise_broadcast were failing with off-by-one errors. The culprit was the incorrect usage of NATURAL rounding in cases of elementwise ADD and SUB where the input and output scales were equal and advanced scaling was not used. Signed-off-by: Dwight Lidman <dwight.lidman@arm.com> Change-Id: I35d56298e911a4d1bbca7d201bcde6044c8a5490
2021-05-03  MLBEDSW-4539: MEAN axis check exception fix  Dwight Lidman
A recent fix to another MEAN bug introduced a new bug. The bug was due to some incorrect logic for checking the axis attribute. Signed-off-by: Dwight Lidman <dwight.lidman@arm.com> Change-Id: I65d3486a12e029f7c4450074f03fcd1974f65d8a
2021-04-30  MLBEDSW-4350 Use padding instead of skirt for merged SplitSlice  Henrik G Olsson
When the operations are merged, some later passes are confused by the start and end coordinates of the convolution not lying along the edges of the IFM, and omit padding. However, the zero padding is needed to keep the output the same as before the transformation. Also fixes a bug where Vela could crash if a convolution had an explicit start coordinate. Signed-off-by: Henrik G Olsson <henrik.olsson@arm.com> Change-Id: I8449d237350d528f83738b2f09124f1ed79c07ca
2021-04-29  MLBEDSW-4501: Support MEAN single axis variation  Dwight Lidman
When a MEAN operator with a single reduction axis specifies the axis index attribute as an array with a single element rather than a scalar index, the operator is placed on the CPU even though it is technically supported. This commit fixes this issue and also adds some new tests for the axis constraints. Signed-off-by: Dwight Lidman <dwight.lidman@arm.com> Change-Id: Ia287f3b9cc80a805e972cd4b2962e52526a8dc16
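The two equivalent forms of the axis attribute can be handled by normalising them before any constraint check. The sketch below is illustrative only: the function names are hypothetical and it is not the constraint code used in Vela.

```python
def normalise_axis(axis):
    """Return the reduction axes as a list.

    Accepts either a scalar index (e.g. 2) or a sequence of indices
    (e.g. [2]), so both forms are treated identically afterwards.
    """
    if isinstance(axis, int):
        return [axis]
    return [int(a) for a in axis]


def is_single_axis(axis):
    """True when exactly one reduction axis is specified, in either form."""
    return len(normalise_axis(axis)) == 1
```

With this normalisation, `axis=2` and `axis=[2]` both count as a single-axis reduction.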
2021-04-21  MLBEDSW-4413: MobileNet V3 regression  Dwight Lidman
This commit resolves a recent regression in multiple networks (including MobileNet V3). The regression was caused by a recent change to the IFM block size calculation where a term was mistakenly left out (because it was missing from the spec). The IFM microblock size has been amended for the Ethos-U55 128 config and the block size calculations now use these sizes instead (although they are equivalent to the OFM microblock sizes). Signed-off-by: Dwight Lidman <dwight.lidman@arm.com> Change-Id: Ic504b4becd6c3a26334a7275189d78ff0fe2cf69
2021-04-19  [MLBEDSW-4421] Fix verbose-all crash  Fredrik Svedberg
Fixed exception when using the CLI option --verbose-all. Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com> Change-Id: I203fe31ad6914936730343958009e2370045c67c
2021-04-19  [MLBEDSW-4414] Fix verbose-operators for multiple custom ops  Fredrik Svedberg
Fixed exception for --verbose-operators option when there are multiple custom operators in the network. Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com> Change-Id: I5ab743d96a4e0367818fbe46cc47896c691d888c
2021-04-16  MLBEDSW-4132 Fix off-by-one error for negative packing axis  Henrik G Olsson
Also applies to unpack. Signed-off-by: Henrik G Olsson <henrik.olsson@arm.com> Change-Id: I07e7083aeb6aefd6e26f9d134b858080f28f1719
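The commit does not say where the off-by-one occurred. One common source is that packing adds a dimension, so a negative axis has to be resolved against the output rank (input rank + 1) rather than the input rank. A minimal sketch under that assumption, with a hypothetical function name:

```python
def normalise_pack_axis(axis, input_rank):
    """Resolve a possibly negative pack axis to a non-negative index.

    Packing tensors of rank R gives a tensor of rank R + 1, so valid
    axes lie in [-(R + 1), R] and negative values wrap around R + 1.
    """
    output_rank = input_rank + 1
    if not -output_rank <= axis < output_rank:
        raise ValueError(f"axis {axis} out of range for output rank {output_rank}")
    return axis + output_rank if axis < 0 else axis
```

For unpack the dimension is removed rather than added, so the same wrap-around uses the input rank instead.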
2021-04-16  MLBEDSW Vela: Fix check for format restrictions  Patrik Gustavsson
Fixed the check for whether there are any CPU producers/consumers. Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: I0ed08c650d1ca34e8e148aee68a5ed09c25fdd87
2021-04-16  MLBEDSW-3550 Only use simple scaling when bitexact with TFLite  Henrik G Olsson
For 8 bit arithmetic we cannot guarantee reproducibility in the general case since precision differs, affecting rounding near half integers. It should be safe when the ratio between output and input scales has its 12 LSBs all set to 0, however. For 16 bit arithmetic it should be sufficient to adjust the input and output scalings with a factor of 2 to get the same rounding. Signed-off-by: Henrik G Olsson <henrik.olsson@arm.com> Change-Id: I809c0042615d16c5488d61f0c7d88e1a1315e6eb
2021-04-15  MLBEDSW-4397 Fix Reshape ifm/ofm prod/cons by cpu op  Patrik Gustavsson
The sg inputs/outputs are not the only tensors that need to be considered before removing a Reshape. Added a check for whether the Reshape ifm/ofm is produced or consumed, respectively, by a CPU operator. The handling is the same as when the tensor is an sg input/output. Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: If509e1d23e3f22ed4c963d8dabd8c00c6b9c07e3
2021-04-14  MLBEDSW-4103: Block config calc update  erik.andersson@arm.com
The previous calculation of the IFM block height and width yielded incorrect block configs when running transpose_conv networks with certain hardware constraints. Signed-off-by: erik.andersson@arm.com <erik.andersson@arm.com> Change-Id: I8b6936a3e8c37da640bdeac84ecfea8363f910f9
2021-04-09  MLBEDSW-4073 Handle elementwise ops with same tensor for both inputs  Henrik G Olsson
Signed-off-by: Henrik G Olsson <henrik.olsson@arm.com> Change-Id: I0e6bb46b7b91ed10f5bda34fba66d8b714560f47
2021-04-08  Fix stats_writer.py exception  Fredrik Svedberg
Fixed exception in stats_writer.py. Change-Id: I625390aec185345cadd0d8fa5edb66907b9be242 Signed-off-by: Fredrik Svedberg <Fredrik.Svedberg@arm.com>
2021-04-08  MLBEDSW-4334 Non-linear format decision in graph opt.  Patrik Gustavsson
Refactored the check for whether a non-linear tensor format can be used.
- Flag avoid_NHCWB16 replaced with needs_linear_format
- Restriction checks moved into one function in the graph optimiser
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: Iec5c7996a1a6039cad052197f1ae56f7c0290440
2021-04-07  MEAN implementation changed to Average Pool  Dwight Lidman
This is a small commit which changes one of the four MEAN implementations to a simpler one, using an AvgPool instead of a DepthwiseConv. Signed-off-by: Dwight Lidman <dwight.lidman@arm.com> Change-Id: I9e8af071e8b820796577ee4792b4812a1212602b
2021-04-06  MLBEDSW-4249 Hide stack traces in error messages  Henrik G Olsson
When faced with an invalid tflite file we now catch the exception to make it clear to the user that the issue is with the input and not with Vela, instead of just crashing. Same also applies to our own Vela error messages. Signed-off-by: Henrik G Olsson <henrik.olsson@arm.com> Change-Id: I56a81c5be9e1f46f3b98a88c6d24ee42fa0e450d
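The general pattern of reporting an input problem without a traceback can be sketched as below. All names here (`InputFileError`, `read_model`, `main`) are hypothetical stand-ins for illustration and are not taken from the Vela sources.

```python
import sys


class InputFileError(Exception):
    """Hypothetical error type for problems with the user's input file."""


def read_model(path):
    """Stand-in for a real parser: reject anything not named *.tflite."""
    if not path.endswith(".tflite"):
        raise InputFileError(f"{path}: not a valid tflite file")
    return path


def main(argv):
    """Return a process exit code, printing a short message on bad input."""
    try:
        read_model(argv[0])
    except InputFileError as err:
        # Report the problem as a one-line message instead of a stack trace
        print(f"Error: {err}", file=sys.stderr)
        return 1
    return 0
```

Catching only the dedicated error type keeps genuine internal bugs visible as ordinary tracebacks.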
2021-03-31  MLBEDSW-4286: Bug fix Concat using IFM streaming  Louis Verhaard
IFM box calculation was wrong because 2 variables were referencing/updating the same list. Signed-off-by: Louis Verhaard <louis.verhaard@arm.com> Change-Id: Ibed4e94c474682e14a6dd898029f14af11c9479a
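This class of bug is easy to reproduce in plain Python. The sketch below uses made-up variable names and values rather than the actual IFM box code. It shows how assigning a list to a second name shares one object, and how copying it avoids the problem.

```python
def aliasing_demo():
    """Show the difference between aliasing a list and copying it."""
    ifm_start = [0, 0, 0, 0]

    aliased = ifm_start          # second name for the SAME list object
    aliased[3] = 8               # also changes ifm_start

    independent = list(ifm_start)  # shallow copy: a separate list object
    independent[3] = 16            # ifm_start is unaffected

    return ifm_start, aliased, independent
```

After the call, the first two returned lists are identical because they are the same object, while the copy has diverged.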
2021-03-31  MLBEDSW-3461: Check configuration SRAM size  Louis Verhaard
Added check that configured SRAM size is within bounds. Change-Id: I5dce3df0788f2b00402e9a541bad11612fa19463 Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
2021-03-31  Handle absent weights_compression_ration when printing  Henrik G Olsson
Change-Id: Iafb31af73d80adcc901b241c34dda78be360bc14 Signed-off-by: Henrik G Olsson <henrik.olsson@arm.com>
2021-03-31  MLBEDSW-3502: Bug fix addresses >= 32 bit  Louis Verhaard
Bug fix in generation of register command offsets that do not fit in 32 bit. Signed-off-by: Louis Verhaard <louis.verhaard@arm.com> Change-Id: Iabb99cf6536c0f77b934691f8744df61f1eab3ed
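As a generic illustration of handling an offset that does not fit in 32 bits, the value can be split into a low 32-bit word and the remaining high bits. This is only a sketch with a hypothetical function name; how the register command stream actually encodes the two parts is not shown here.

```python
def split_address(addr):
    """Split a non-negative address into (low 32 bits, remaining high bits)."""
    if addr < 0:
        raise ValueError("address must be non-negative")
    low = addr & 0xFFFFFFFF   # bits 0..31
    high = addr >> 32         # bits 32 and above
    return low, high
```

An address below 2**32 yields a high part of 0, so the split is safe to apply unconditionally.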
2021-03-30  Performance improvement in tensor allocation  Louis Verhaard
- Tensor allocation verification was O(N^2), is now closer to O(N)
- Removed a sort in HillClimb allocator
Change-Id: I286a269881490c485cc2b0eeab3b1ecffa8f3df0 Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
2021-03-30  MLBEDSW-4219: Add tensor allocation info to summary  erik.andersson@arm.com
Added the theoretical minimum of the maximum memory usage and the allocator overhead to the Vela summary. Signed-off-by: erik.andersson@arm.com <erik.andersson@arm.com> Change-Id: If373dfeaac50d6f8b56554d435bf22af2c3acda3
2021-03-26  MLBEDSW-4163: OFM zero point outside valid range  Dwight Lidman
This commit fixes a bug where the OFM zero point would assume values outside of [0, 255] due to its usage as a stand-in for a bias when emulating the TensorFlow Lite implementation of MEAN. The solution is to adjust for the bias using an ADD operator with the bias value as an int16 const tensor. The 16-bit integer is needed as the bias is 32 bits in the original implementation but can effectively assume values in the range [-255, 255]. Signed-off-by: Dwight Lidman <dwight.lidman@arm.com> Change-Id: I84df48ea89bb559954f1b2c289b65e08a6418274
2021-03-25  MLBEDSW-4071: Power of two handling 16-bit tanh/sigmoid  Louis Verhaard
Added special handling of power-of-two input scales for 16-bit tanh/sigmoid to align with the reference. Change-Id: I87831bcd587623d7db7100e768905355c2c98e9d Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
2021-03-22  MLBEDSW-3502: Add address checks  Louis Verhaard
Added checks during command stream generation to make sure that address boundaries are respected. Change-Id: I4dbc693b42d54e35c8fcc785e8be88059e409eec Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
2021-03-19  MLBEDSW-3458: Added command stream size check.  erik.andersson@arm.com
If the command stream size exceeds a certain threshold, a VelaError will now be raised. Signed-off-by: erik.andersson@arm.com <erik.andersson@arm.com> Change-Id: I9b9383f4c298a778b160cd527374e9244e4cae26
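A size check of this kind can be sketched as follows. The `VelaError` class below is a local stand-in, and the limit value and function name are hypothetical; the actual threshold is not stated in the commit message.

```python
class VelaError(Exception):
    """Local stand-in for Vela's error type (the real class is not shown here)."""


# Hypothetical limit, chosen only for illustration
MAX_COMMAND_STREAM_BYTES = 16 * 1024 * 1024


def check_command_stream_size(stream, limit=MAX_COMMAND_STREAM_BYTES):
    """Return the stream size in bytes, raising VelaError if it exceeds limit."""
    size = len(stream)
    if size > limit:
        raise VelaError(f"command stream size {size} exceeds limit {limit}")
    return size
```

Raising a dedicated error type lets the caller report the problem as a user-facing message rather than an internal failure.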
2021-03-19  Address generation fix  Mauricio Briceno
- The architecture supports address extensions wider than 32b via the cmd1.param Change-Id: I7a01b4596f7a54f6be05b8e2c454494e6751757b Signed-off-by: Mauricio Briceno <mauricio.briceno@arm.com>
2021-03-16  MLBEDSW-4215: Add support for MEAN to match QuantizedMeanOrSum implementation  Dwight Lidman
This commit adds support for emulating the behavior of the QuantizedMeanOrSum implementation of MEAN in TensorFlow Lite. Signed-off-by: Dwight Lidman <dwight.lidman@arm.com> Change-Id: Ifd24e0e678e2f85cd66ab82deeaaf010d5351b1e
2021-03-16  MLBEDSW-4223: Full support for PAD operator  Louis Verhaard
- Added full support for PAD operator
- Hardware padding is still used whenever possible
- Bug fix for Pad followed by max pool when the IFM contains negative values
Change-Id: Ifc64d1943737d94466f5e2821009dab12a49a965 Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>