Age | Commit message | Author |
|
A number of bring-up tests were failing after the update
to TensorFlow 2.3. After updating to TensorFlow 2.5
the problems persisted, and more failures were
introduced where they were expected to be resolved.
However, with this small patch that changes the
rounding mode for ResizeBilinear, all tests now pass.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: I5f2f3859b9008187ca318d5270da7b850b170b18
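The effect of the rounding mode on ResizeBilinear can be illustrated with a small sketch. This is a hypothetical helper, not Vela's implementation, and it assumes an align-corners style coordinate mapping:

```python
import math

def bilinear_1d(values, out_len, rounding="natural"):
    """1-D integer bilinear resize illustrating how the rounding mode
    changes the quantized result (hypothetical helper, not Vela code)."""
    in_len = len(values)
    out = []
    for i in range(out_len):
        # Source coordinate, align-corners style mapping.
        pos = i * (in_len - 1) / (out_len - 1) if out_len > 1 else 0.0
        lo = math.floor(pos)
        hi = min(lo + 1, in_len - 1)
        frac = pos - lo
        interp = values[lo] * (1 - frac) + values[hi] * frac
        if rounding == "natural":          # round half up
            out.append(math.floor(interp + 0.5))
        else:                              # truncate
            out.append(int(interp))
    return out
```

Resizing [0, 1] to length 3 puts the middle sample exactly on a half boundary, so round-half-up and truncation disagree on that element, which is the kind of off-by-one the rounding-mode change addresses.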
|
|
This commit updates the flatbuffers generated code
to comply with TensorFlow 2.5, as well as stripping
away some legacy code.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: I01fe47ec2bde6e78fdde21ee1bc0a71f560c53ae
|
|
- Fix bug with MEAN ops calling create_const_tensor using the
quant_value_dtype keyword argument.
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I8cff542ae840fb110ea97c0cc86bb761d5a884d3
|
|
Refactor supported operators by breaking out model semantics
into its own class. Model semantics are now checked right
after the model is read.
Signed-off-by: Jonas Ohlsson <jonas.ohlsson@arm.com>
Change-Id: If442b189efcd91dda01af60b2b3adedfacdf2fad
|
|
Signed-off-by: James Peet <james.peet@arm.com>
Change-Id: I5bf39aa4f1fb48bcb0423edc4cd1d01f59aac1db
|
|
Remove quant_values attribute from Tensor class.
It only needs a single values attribute, holding either
quantized or unquantized values as appropriate.
Change-Id: Ie96f80ac58061b6077e0f7048dc60209fdfbcafa
Signed-off-by: James Peet <james.peet@arm.com>
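The simplified shape of the class described above can be sketched as follows; the structure and names are a hypothetical reconstruction, not the actual Vela class:

```python
class Tensor:
    """Sketch of the simplified Tensor: a single `values` attribute holds
    either quantized or unquantized data, chosen at construction time.
    Hypothetical reconstruction, not the actual Vela class."""

    def __init__(self, values, quantization=None):
        self.values = values
        self.quantization = quantization  # None => values are unquantized

    @property
    def is_quantized(self):
        return self.quantization is not None
```

The point of the change is that callers no longer pick between `values` and `quant_values`; they read one attribute and consult the quantization parameters to interpret it.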
|
|
DeepSpeech was exhibiting poor performance in its first three
layers due to poor SHRAM utilisation.
- Given a choice between multiple identical-cost block configs,
the allocator was choosing the first one it encountered. This
commit biases the choice towards blocks with a larger IFM
fetch area to improve SHRAM utilisation.
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I2ff18a13444b8812cb451a606ff692bf290e7d20
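The tie-breaking change can be sketched like this; the field names and dict-based config representation are hypothetical:

```python
def choose_block_config(configs):
    """Among equal-cost block configs, prefer the one with the larger
    IFM fetch area instead of the first one encountered.
    Sketch only; field names are hypothetical."""
    best_cost = min(c["cost"] for c in configs)
    candidates = [c for c in configs if c["cost"] == best_cost]
    # Bias the tie-break toward larger IFM fetch area for SHRAM utilisation.
    return max(candidates, key=lambda c: c["ifm_fetch_area"])
```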
|
|
- Fixed typo with not using ifm.mem_type
- Fixed bug with using ifm1 properties when only ifm2 is a potential match
- Removed restriction on not considering SHL and SHR for overlap
- Removed some dead reshape code
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: Id9bcc3c2b3ee9ac7b6276187d3e2f513b4acd4b5
|
|
Mapping to internal input indexing has been added to
tflite_reader.py and tosa_reader.py, and the reverse
mapping has been added to tflite_writer.py.
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: I4d8596e747cfa7c4203884c4e785eb1977e2bcc1
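The two-way mapping can be sketched as a pair of helpers; the example index table is hypothetical, standing in for an op whose external input order differs from Vela's internal one:

```python
# Hypothetical external->internal index table for one op type.
TFLITE_TO_INTERNAL = {0: 0, 1: 2, 2: 1}

def to_internal(ext_inputs, mapping=TFLITE_TO_INTERNAL):
    """Reorder externally-indexed inputs into internal indexing (reader side)."""
    internal = [None] * len(ext_inputs)
    for ext_idx, tens in enumerate(ext_inputs):
        internal[mapping[ext_idx]] = tens
    return internal

def to_external(int_inputs, mapping=TFLITE_TO_INTERNAL):
    """Invert the mapping to restore external indexing (writer side)."""
    inverse = {v: k for k, v in mapping.items()}
    return to_internal(int_inputs, inverse)
```

Reading then writing should round-trip the input order unchanged, which is what the reader/writer pair above guarantees by construction.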
|
|
Added basic TOSA support, enabling Vela to
read and compile a .tosa file corresponding to
CONV2D + Rescale + Clamp, and to write it to an
optimized .tflite file.
The optimized .tflite file will, in this case, hold
a command stream where the Rescale and Clamp have been
fused into the CONV2D.
The optimized tflite file is not output from Vela.
- Added support to read .tosa files into Vela's
internal structure
- Added tosa_reader.py, tosa_mapper.py and
helper files stored under tosa/
- Support for this is limited to ~10 ops
- Added reader_util.py for functions common
to TOSA and TFLite
- Added tosa_graph_optimiser.py
- Added support to fuse Rescale into convolution
- Modified handling for padding
- Added support to fuse Clamp to the previous op
- Added graph_optimiser_util.py
- Moved functions common to TOSA/TFLite graph
optimization to this file
- Renamed graph_optimiser.py to tflite_graph_optimiser.py
- Added separate tosa_supported_operators.py
- Added supported_operator_util.py for functions
common to TOSA and TFLite
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: Ic3c540504ec8c5eb4771397fdc6882050ecf33ab
|
|
vela: Possible issue with handling scratch tensor on non-ethosu custom op
Fixing a case where a tensor input name ends with "scratch".
Four test cases pass with this change:
1) non-optimized tflite - input tensor name is _split_1_scratch
2) optimized tflite - input tensor name is _split_1_scratch
3) optimized tflite - input tensor name is _split_1_scratch and custom
operation name is non_ethus_u
4) non-optimized tflite - input tensor name is _split_1_scratch_fast
Change-Id: Ia515805825b7f9a646607c5075b7ea3a0cf6aad8
Signed-off-by: Samuel Panijel <samuel.panijel@arm.com>
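The stricter detection implied by these test cases can be sketched as below; the custom-code string and the exact suffixes checked are assumptions for illustration, not taken from the commit:

```python
def is_scratch_input(op_custom_code, tensor_name):
    """Treat a tensor as the Ethos-U scratch tensor only when the op
    itself is the Ethos-U custom op, so an unrelated custom op whose
    input name happens to end with "scratch" is not misclassified.
    Sketch only; the custom-code string and suffixes are assumptions."""
    if op_custom_code != "ethos-u":
        return False
    return tensor_name.endswith("_scratch") or tensor_name.endswith("_scratch_fast")
```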
|
|
- Added type checking so that the correct type conversion can be used
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: Ia83f46029fac7bad63844c090b87d23c2072b105
|
|
Reinstated allowing the IFM and OFM tensors to overlap for Elementwise
operations.
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: Ide6db7781f3ca7a36c8ff9e3efdc7943a7bf6d7f
|
|
- 256 and 512 configuration variants execute 1D convolutions
in an optimised manner compared to their 2x2 microblock
dimensions. This commit takes this into account to improve
Conv1D throughput on these configurations.
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I6ecdf6e4a219e356327b22f8393f50ee8817af23
|
|
- Update block config selection to take into account partial
IFM fetches at edge of non-whole OFM block data.
- Change to scheduler depth slicing for networks in MLBEDSW-4637
for improved buffering. This helps general performance by buffering
larger depth slices.
- Bug fix for opt_max_schedule always being fitted to SRAM which
prevented the optimisation step running in some cases.
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I97642c5adec3bb684b1daabf2b81574c27d4eef2
|
|
Fixed an issue where the scheduler would set the incorrect tensor
layout.
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: I28abdf3f3c523d7da0cf8840316ece37dad364ab
|
|
Fixed a bug where a DMA command for the activation LUT would be issued
for every depth-slice of an operator. This caused multiple
unnecessary DMA commands.
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: I9c291692d8002f05656bb88214836ab389a56cdb
|
|
- Restructured pointer API to prevent alignment warnings
- Changed weight tensor data type to np.int16
Change-Id: I310c1ca733bf98724c84e8b2194becb4be3e7eea
|
|
- DeepSpeech reuses identical weights and biases throughout
the network. Since biases are now interleaved with weights
there is a scaling issue when the ifm scales differ between
operations using the same weight and scale tensor.
- This commit uses interleaved weights/scales on their first use
but separates scales to source memory on subsequent use (if
the ifm scale is different).
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I7aae163438160a919cae04e235966e75355a6148
|
|
Putting back the estimates related to unbuffered
weight transfer.
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: I2072066bc1e01814fe3b0b87a912f69646da861c
|
|
- Merged dev/scheduler at 83639f90e8c828f70de6e29142355a940224959b
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I0050529d4b42da93768c7264296434dd877fb5b4
|
|
- Moved new tensor allocation info under --verbose-allocation flag
- Tidied up and added histogram to --verbose-allocation print
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I76fb5187319aedf86f599f57b766220cafc17326
|
|
Fixed sub-module imports.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: I6ab5c04ba5f3411f8cf8ac95606fe036fae11442
|
|
Fixed mlw_codec build warnings.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: I8ec8fb3b092cce0629c690677984549febf01adc
|
|
Fixed size calculation in mlw_reorder_encode.
Fixed build warnings.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: Iac9408b9972a29b5a3403ba11f80dc4eaaa35453
|
|
- Moved reordering to C
- Greatly reduced the runtime of weight encoding
Change-Id: Ifff01e7b1ea6d5cec68310a155c3b80aa1a38545
Signed-off-by: Mauricio Briceno <mauricio.briceno@arm.com>
|
|
Improved --verbose-graph output by adding labels to each print.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: I49039ff6af1c06f49208591f02effa4ff73f982a
|
|
Limit the ifm box depth to the ifm shape depth.
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: I889aed9ef7e338faa1fca074fb2843fa2cedecc8
|
|
- Removed unused nng parameter
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I0bb2eb101a84ea8022c8eb7bcbd86d617e933510
|
|
Improved the weight information shown in the summary when the
--verbose-weights option is used.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: Iac142f2a813bf1c05aa9da3f8a384466e2914d06
|
|
This commit fixes a regression caused by a recent
commit where io_ranges and elementwise_broadcast
were failing with off-by-one errors.
The culprit was the incorrect usage of NATURAL
rounding in cases of elementwise ADD and SUB
where the input and output scales were equal and
advanced scaling was not used.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: I35d56298e911a4d1bbca7d201bcde6044c8a5490
|
|
A recent fix to another MEAN bug introduced a new
bug. The bug was due to some incorrect logic for
checking the axis attribute.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: I65d3486a12e029f7c4450074f03fcd1974f65d8a
|
|
When the operations are merged, some later passes are confused by the
convolution's start and end coordinates not lying along the edges of
the IFM, and so omit the padding. But the zero padding is needed to keep
the output the same as before the transformation.
Also fixes a bug where Vela could crash if the convolution had an
explicit start coordinate.
Signed-off-by: Henrik G Olsson <henrik.olsson@arm.com>
Change-Id: I8449d237350d528f83738b2f09124f1ed79c07ca
|
|
When a MEAN operator with a single reduction axis
specifies the axis index attribute as an array with
a single element rather than a scalar index, the
operator is placed on the CPU even though it is
technically supported.
This commit fixes this issue and also adds some new
tests for the axis constraints.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: Ia287f3b9cc80a805e972cd4b2962e52526a8dc16
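The axis-attribute handling described above can be sketched as a small normalisation step; the helper name is hypothetical:

```python
def normalise_axis(axis):
    """Accept the MEAN axis attribute both as a scalar index and as a
    one-element list/tuple, returning the scalar form.
    Hypothetical helper illustrating the fix, not Vela's code."""
    if isinstance(axis, (list, tuple)):
        if len(axis) == 1:
            return axis[0]
        raise ValueError("multiple reduction axes are handled elsewhere")
    return axis
```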
|
|
This commit resolves a recent regression
in multiple networks (including MobileNet V3).
The regression was caused by a recent change to
the IFM block size calculation where a term was
mistakenly left out (due to it missing from the spec).
The IFM microblock size has been amended for the
Ethos-U55 128 configuration and the block size calculations
now use these sizes instead (although they are equivalent
to the OFM microblock sizes).
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: Ic504b4becd6c3a26334a7275189d78ff0fe2cf69
|
|
Fixed exception when using the CLI option --verbose-all.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: I203fe31ad6914936730343958009e2370045c67c
|
|
Fixed exception for --verbose-operators option when there are
multiple custom operators in the network.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: I5ab743d96a4e0367818fbe46cc47896c691d888c
|
|
Also applies to unpack.
Signed-off-by: Henrik G Olsson <henrik.olsson@arm.com>
Change-Id: I07e7083aeb6aefd6e26f9d134b858080f28f1719
|
|
Fixed the check for whether there are any CPU
producers/consumers.
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: I0ed08c650d1ca34e8e148aee68a5ed09c25fdd87
|
|
For 8 bit arithmetic we cannot guarantee reproducibility in the general
case since precision differs, affecting rounding near half integers.
It should be safe when the ratio between output and input scales has
its 12 LSBs all set to 0, however.
For 16 bit arithmetic it should be sufficient to adjust the input and
output scalings with a factor of 2 to get the same rounding.
Signed-off-by: Henrik G Olsson <henrik.olsson@arm.com>
Change-Id: I809c0042615d16c5488d61f0c7d88e1a1315e6eb
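The 8-bit condition above can be sketched as a bit test, assuming the ratio between output and input scales is held as a fixed-point multiplier; that representation is an assumption for illustration, not taken from the commit:

```python
def rescale_is_bit_exact(multiplier):
    """True when the fixed-point scale-ratio multiplier has its 12 LSBs
    all zero, the condition under which 8-bit rescaling is described as
    reproducible. The fixed-point representation is an assumption."""
    return (multiplier & 0xFFF) == 0
```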
|
|
Not only the subgraph inputs/outputs need to be considered
before removing a Reshape.
Added a check for whether the Reshape ifm/ofm is produced or
consumed by the CPU; the handling is the same as if the tensor
is a subgraph input/output.
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: If509e1d23e3f22ed4c963d8dabd8c00c6b9c07e3
|
|
The previous calculation of the IFM block height and width
yielded incorrect block configs when running transpose_conv
networks with certain hardware constraints.
Signed-off-by: erik.andersson@arm.com <erik.andersson@arm.com>
Change-Id: I8b6936a3e8c37da640bdeac84ecfea8363f910f9
|
|
Signed-off-by: Henrik G Olsson <henrik.olsson@arm.com>
Change-Id: I0e6bb46b7b91ed10f5bda34fba66d8b714560f47
|
|
Fixed exception in stats_writer.py.
Change-Id: I625390aec185345cadd0d8fa5edb66907b9be242
Signed-off-by: Fredrik Svedberg <Fredrik.Svedberg@arm.com>
|
|
Refactored the check for whether the non-linear tensor format
can be used.
- Replaced the avoid_NHCWB16 flag with needs_linear_format
- Moved the restriction checks into one function in the graph optimiser
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: Iec5c7996a1a6039cad052197f1ae56f7c0290440
|
|
This is a small commit which changes one of
the four MEAN implementations to a simpler
one, using an AvgPool instead of a
DepthwiseConv.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: I9e8af071e8b820796577ee4792b4812a1212602b
|
|
When faced with an invalid tflite file, we now catch the exception to
make it clear to the user that the issue is with the input and not with
Vela, instead of just crashing.
The same also applies to our own Vela error messages.
Signed-off-by: Henrik G Olsson <henrik.olsson@arm.com>
Change-Id: I56a81c5be9e1f46f3b98a88c6d24ee42fa0e450d
|
|
IFM box calculation was wrong because 2 variables were
referencing/updating the same list.
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
Change-Id: Ibed4e94c474682e14a6dd898029f14af11c9479a
|
|
Added check that configured SRAM size is within bounds.
Change-Id: I5dce3df0788f2b00402e9a541bad11612fa19463
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
|
|
Change-Id: Iafb31af73d80adcc901b241c34dda78be360bc14
Signed-off-by: Henrik G Olsson <henrik.olsson@arm.com>
|