Age | Commit message | Author |
|
-Only support AVGPOOL when there is
no padding. For this case, global scaling can be used.
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: I026b83b05f02c57c79f49935f5ec501a6d28bb91
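The restriction above has a simple numerical motivation: with no padding, every output element averages exactly the same number of input elements, so the division can be folded into one global output scale; with padding, edge outputs average fewer elements and would need per-element scales. A minimal NumPy sketch (illustrative only, not Vela's implementation):

```python
import numpy as np

def avgpool_valid(ifm, k):
    """Average pool with no padding: every output element averages
    exactly k*k inputs, so a single global scale 1/(k*k) suffices."""
    h, w = ifm.shape
    acc = np.zeros((h - k + 1, w - k + 1))
    for dy in range(k):
        for dx in range(k):
            acc += ifm[dy:dy + h - k + 1, dx:dx + w - k + 1]
    return acc * (1.0 / (k * k))  # one global scale for all elements

ifm = np.arange(16, dtype=np.float64).reshape(4, 4)
out = avgpool_valid(ifm, 2)
```

With padding, the window at a corner would cover fewer than k*k valid inputs, and this single-scale trick no longer applies.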
|
|
Added support for Data layout ops
RESHAPE, SLICE and CONCAT.
-No support for bool_t
-Support limited to Rank <= 4 and N = 1
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: I487ac494b6506a2a6ba947ee758aa193194dd796
|
|
Additional check added for when constant data can be moved
to fast storage.
Do not move constant data for concat.
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: Ib8b5fd1483ee9fabe48e9874a5723af9b7c5231a
|
|
This commit fixes one assert regarding rolling buffers for 3D tensors.
It also addresses another issue where the incorrect weight buffering was
proposed for cascaded operators.
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: I2501f35e5668b3085d917751cfc8002d250973d8
|
|
Added support for ADD, SUB and MUL
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: I52acdc126b16e2cf4096bcf7a77023ea7d204998
|
|
This is mainly to add support for depthwise conv2d
with depth_multiplier = 1.
(But there are no suitable test cases; all I have
sourced have depth_multiplier set to 2, which is not supported.)
-Added support for depthwise conv2d.
-Added support for removing Transpose of constant data
-Added support for removing reshape
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: I143e6246becfa78fd9f7510af0bf0d6b3fbbf2c7
|
|
Fixed output diff for wav2letter int16 by correcting the scaling
used for LeakyRelu.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: I8be1e14c25d223dc6e42c4ec498ff4d3d9de65d7
|
|
Added support for:
-AVGPOOL and CONV2D with TFLite correspondence
-MAXPOOL
-Additional support for replacing RESCALE ops with AVGPOOL
No support for breaking down tensors over the
size supported by the NPU.
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: I1d2aa50ac30a26283b3e6f1fe88cba1544b7c189
|
|
- Add TOSA output generation in npz format
Change-Id: I97822e3a93a8fef1a95a990f23ef2c4ca5a8f73a
Signed-off-by: Diqing Zhong <diqing.zhong@arm.com>
|
|
This commit contains the release notes
for Vela 3.1.0. It also increases the
PyPI documentation tag.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: Iffd9fac7d4a7ccb34c3558990ef4bb97e548bf4c
|
|
Update to handle the case when the Squeeze Op ifm/ofm are the
subgraph ifm/ofm, to facilitate the removal of the Squeeze Op.
Added a NOP to maintain the original tensors.
Updated pytests for the squeeze operator.
Signed-off-by: Jonas Ohlsson <jonas.ohlsson@arm.com>
Change-Id: I623cae05e696fb16ccf29dedc42fd822601e9fd9
|
|
Updated the README.md to include some examples of
new scheduler modes.
Signed-off-by: erik.andersson@arm.com <erik.andersson@arm.com>
Change-Id: Ifa1a9a69b94ab37efa3aac7e82bb89e0e3a25b85
|
|
To avoid using Python 3.6 incompatible versions of NumPy (> 1.19.5),
an upper bound on version is added for NumPy in setup.py.
Signed-off-by: Jonas Ohlsson <jonas.ohlsson@arm.com>
Change-Id: I3929bd7dbea6866905665186af1c4b3ba43ccbd0
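The pin described above can be expressed as a version specifier in `setup.py`. A hypothetical fragment (the package name and exact marker are illustrative, not copied from the project); the PEP 508 environment marker confines the cap to interpreters where it matters:

```python
# Hypothetical setup.py fragment: NumPy releases after 1.19.5 dropped
# Python 3.6 support, so cap the version only on older interpreters.
install_requires = [
    'numpy<=1.19.5; python_version < "3.7"',
    'numpy; python_version >= "3.7"',
]
```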
|
|
A commit pertaining to MLBEDSW-4738 where the
functionality of find_block_configs() in the
external API was reinstated had previously been
merged, but was done without increasing the API
version. This commit amends that mistake.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: I32f559d626e0f4e93c522813b6f4e12beaa50e57
|
|
This commit adds a CLI option for setting
the recursion limit. This option was originally
removed because it was considered unnecessary,
but for some very large networks a RecursionError
is encountered during graph traversal.
A simple solution for such issues is to manually
increase the recursion limit.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: Id0dbf68edf59b151abfa91783b5f8f021c1bb40f
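The mechanism is Python's interpreter-level recursion cap. A hypothetical sketch of such an option (flag name, default, and helper are illustrative, not Vela's actual CLI):

```python
import argparse
import sys

# Sketch: very deep graphs can overflow Python's default recursion
# limit (typically 1000) during recursive traversal, so expose it.
parser = argparse.ArgumentParser()
parser.add_argument(
    "--recursion-limit",
    type=int,
    default=1000,
    help="Set the Python internal maximum recursion depth",
)

def apply_recursion_limit(argv):
    args = parser.parse_args(argv)
    sys.setrecursionlimit(args.recursion_limit)
    return args.recursion_limit
```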
|
|
Bug fix in cascade builder: tensors produced by operators requiring the full
OFM, or consumed by operators requiring the full IFM, could be added as
intermediate buffers to a cascade.
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
Change-Id: Id84e9e1940bf85ab4cbc42a03e65f64da16a094c
|
|
- Deleted file as it was no longer needed
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I03df2fc98964b96f4c7eabcf98dd5baa19de78ca
|
|
Fix inception_v1/v3 output diffs.
Removing the Squeeze operator in the graph optimisation step.
The squeeze operator removes dimensions of size 1 from tensor shape.
The memory layout is preserved.
Signed-off-by: Jonas Ohlsson <jonas.ohlsson@arm.com>
Change-Id: I4ceffcbb141af5ed50b0d1a9d1d67622e638c2a1
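Why the Squeeze op can be removed outright: dropping size-1 dimensions changes only the shape metadata, not the element order in memory. A small NumPy illustration:

```python
import numpy as np

# Squeeze removes size-1 dimensions but leaves the underlying element
# order untouched, which is why the op can be dropped without any data
# movement.
t = np.arange(6, dtype=np.int8).reshape(1, 2, 1, 3)  # shape with 1-dims
s = np.squeeze(t)                                    # shape (2, 3)
same_layout = np.array_equal(t.ravel(), s.ravel())   # identical order
```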
|
|
- Fixed index error in memory_snapshot
- When removing a cascade, also references are removed
Change-Id: I2b35dc52671d8ce115eb32bfdd93584391d1fc6d
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
|
|
option""
This reverts commit 257a31e93cb2c7a8c06a102211ebb05b3ba78cd8.
Reason for revert: <INSERT REASONING HERE>
Change-Id: If4f565d8c692e2b32903819561591d9e4af619fa
|
|
Relationship to other patches
This reverts commit b6dd9c2e5fcf2885fb42dab567378c8aec22215c.
Reason for revert: <INSERT REASONING HERE>
Change-Id: I50afb5ac4e33e5b8cd4f2aac1f5b94700ab8eeb1
|
|
- Changed mem_type_size() to only return a hard limit
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: Ia9271c54a592965f88f52fe25a52b3efaca88500
|
|
Fixed a bug that caused the constant and buffered weights to expect
different encoding.
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: I77acee29d104bc7c8e132907e61a72b581ace0e5
|
|
Reinstated the v2.1.0 functionality for find_block_configs(). This is
used exclusively by the external API.
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: I6977f13866957edb083769658cc8c57c2b3556fb
|
|
This commit moves a piece of code back into a loop
but with a flag to make sure that the code is only
executed once per loop rather than potentially every
iteration. This solves the issue of an output diff
because of LUT DMAs occurring before weight DMAs.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: I3e597f0a955154af3d87febacea1b3920d53b7c2
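The pattern described, moving code back inside the loop but guarding it with a flag, can be sketched as follows. This is an illustrative mock of a command emitter, not Vela's actual code; the point is that the guarded statement executes once per loop, after the first weight DMA rather than before all of them:

```python
# Hypothetical sketch of the "run once inside the loop" pattern.
def emit_commands(stripes):
    commands = []
    lut_dma_done = False
    for stripe in stripes:
        commands.append(f"weight_dma[{stripe}]")
        if not lut_dma_done:
            # Executed exactly once, ordered after the first weight DMA.
            commands.append("lut_dma")
            lut_dma_done = True
        commands.append(f"compute[{stripe}]")
    return commands
```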
|
|
Prior to this commit, some networks were failing because
one or more options in the TFLite mapping were
incorrect after the update to match TF 2.5.
This commit reverts those changes.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: Ia0b577ca44d76486fc3e0ea9780e0dc1d2baf65e
|
|
- Changed Ethos-U65 AXI port address width from 48 to 40 bits
- Fixed the use of arena_cache_size in mem_type_size() to cover the
arena as well as the cache memory area
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I826462a0cbd0c061cccbc7c83dde446778a2b1ca
|
|
Adaptations related to changes for constant data
in TOSA.
Constant data is no longer stored in .npy files, but
within the .tosa file.
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: Ia1148c2f8b783b3926a1ee0b9ad0a3aeff9d22f5
|
|
- Minor rewording to CONTRIBUTIONS.md to aid clarity
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I57ee455dbaf17e6b087d905cbef9ae8b1a652817
|
|
A number of bring-up tests were failing after the update
to TensorFlow 2.3. After updating to TensorFlow 2.5
the problems persisted, and more failures were
introduced when they were expected to be solved.
However, with this small patch, which changes the
rounding mode for ResizeBilinear, all tests now pass.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: I5f2f3859b9008187ca318d5270da7b850b170b18
|
|
This commit updates the flatbuffers generated code
to comply with TensorFlow 2.5, as well as stripping
away some legacy code.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: I01fe47ec2bde6e78fdde21ee1bc0a71f560c53ae
|
|
- Fix bug with MEAN ops calling create_const_tensor using the
quant_value_dtype keyword argument.
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I8cff542ae840fb110ea97c0cc86bb761d5a884d3
|
|
Refactor supported operators by breaking out model semantics
into its own class. Model semantics are checked right after the
model is read.
Signed-off-by: Jonas Ohlsson <jonas.ohlsson@arm.com>
Change-Id: If442b189efcd91dda01af60b2b3adedfacdf2fad
|
|
Signed-off-by: James Peet <james.peet@arm.com>
Change-Id: I5bf39aa4f1fb48bcb0423edc4cd1d01f59aac1db
|
|
Remove quant_values attribute from Tensor class.
It only needs a single values attribute, holding either
quantized or unquantized values as appropriate.
Change-Id: Ie96f80ac58061b6077e0f7048dc60209fdfbcafa
Signed-off-by: James Peet <james.peet@arm.com>
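The shape of this refactor can be sketched with a toy class. This is a hypothetical mock, not Vela's `Tensor`: a single `values` array replaces the separate `values`/`quant_values` pair, holding whichever representation applies:

```python
import numpy as np

# Hypothetical sketch: one `values` attribute instead of two parallel
# arrays; a flag (illustrative bookkeeping) records which kind it holds.
class Tensor:
    def __init__(self, values, quantized=False):
        self.values = np.asarray(values)
        self.quantized = quantized

float_t = Tensor([0.5, 1.0])
quant_t = Tensor(np.array([64, 127], dtype=np.int8), quantized=True)
```

Collapsing the two attributes removes the class of bugs where one array is updated and the other read.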
|
|
Deep speech was exhibiting poor performance in its first three
layers due to poor SHRAM utilisation.
- Given a choice between multiple identical-cost block configs,
the allocator was choosing the first one it encountered. This
commit biases the choice towards blocks with a larger IFM
fetch area to improve SHRAM utilisation.
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I2ff18a13444b8812cb451a606ff692bf290e7d20
|
|
Adds tosa package to setup.py
Change-Id: I931301313b6d79402e3caf53df7718588a5f538d
Signed-off-by: James Peet <james.peet@arm.com>
|
|
- Fixed typo with not using ifm.mem_type
- Fixed bug with using ifm1 properties when only ifm2 is a potential match
- Removed restriction on not considering SHL and SHR for overlap
- Removed some dead reshape code
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: Id9bcc3c2b3ee9ac7b6276187d3e2f513b4acd4b5
|
|
Mapping to internal input indexing has been added to
tflite_reader.py and tosa_reader.py.
And the other way around in tflite_writer.py.
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: I4d8596e747cfa7c4203884c4e785eb1977e2bcc1
|
|
Added basic TOSA support, enabling Vela to
read and compile a .tosa file corresponding to
CONV2D + Rescale + Clamp, and writing it to an
optimized .tflite file.
The optimized .tflite file, will in this case, hold
a commandstream where the Rescale and Clamp has been
fused into the CONV2D.
The optimized tflite file is not output from Vela.
-Added support to read .tosa file into Vela
internal structure.
- Added tosa_reader.py, tosa_mapper.py and
helper files stored under tosa/
- Support for this limited to ~10 ops
-Added reader_util.py for functions common
for TOSA and TFLite
-Added tosa_graph_optimiser.py
-Added support to fuse Rescale into convolution
-Modified handling for padding
-Added support to fuse Clamp to previous op
-Added graph_optimiser_util.py
-Moved functions common for TOSA/TFLite graph
optimization to this file.
-Renamed graph_optimiser.py to tflite_graph_optmiser.py
-Added separate tosa_supported_operators.py
-Added supported_operator_util.py
-For functions in common for TOSA/TFLite
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: Ic3c540504ec8c5eb4771397fdc6882050ecf33ab
|
|
vela: Possible issue with handling scratch tensor on non-ethosu custom op
Fixing a case where a tensor input name ends with "scratch".
4 test cases passing this change:
1) non-optimized tflite - input tensor name is _split_1_scratch
2) optimized tflite - input tensor name is _split_1_scratch
3) optimized tflite - input tensor name is _split_1_scratch and custom
operation name is non_ethus_u
4) non-optimized tflite - input tensor name is _split_1_scratch_fast
Change-Id: Ia515805825b7f9a646607c5075b7ea3a0cf6aad8
Signed-off-by: Samuel Panijel <samuel.panijel@arm.com>
|
|
- Added type checking so that the correct type conversion can be used
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: Ia83f46029fac7bad63844c090b87d23c2072b105
|
|
Reinstated allowing the IFM and OFM tensor to overlap for Elementwise
operations.
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: Ide6db7781f3ca7a36c8ff9e3efdc7943a7bf6d7f
|
|
- 256 and 512 configuration variants execute 1D convolutions
in an optimised manner compared to their 2x2 microblock
dimensions. This commit takes this into account to improve
Conv1D throughput on these configurations.
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I6ecdf6e4a219e356327b22f8393f50ee8817af23
|
|
- Update block config selection to take into account partial
IFM fetches at edge of non-whole OFM block data.
- Change to scheduler depth slicing for networks in MLBEDSW-4637
for improved buffering. This helps general performance by buffering
larger depth slices.
- Bug fix for opt_max_schedule always being fitted to SRAM which
prevented the optimisation step running in some cases.
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I97642c5adec3bb684b1daabf2b81574c27d4eef2
|
|
Fixed an issue where the scheduler would set the incorrect tensor
layout.
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: I28abdf3f3c523d7da0cf8840316ece37dad364ab
|
|
Fixed a bug where a DMA command for the activation LUT would be issued
for every depth-slice of an operator. This caused multiple
unnecessary DMA commands.
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: I9c291692d8002f05656bb88214836ab389a56cdb
|
|
- Restructured pointer API to prevent alignment warnings
- Changed weight tensor data type to np.int16
Change-Id: I310c1ca733bf98724c84e8b2194becb4be3e7eea
|
|
- Deepspeech reuses identical weights and biases throughout
the network. Since biases are now interleaved with weights
there is a scaling issue when the ifm scales differ between
operations using the same weight and scale tensor.
- This commit uses interleaved weights/scales on their first use
but separates scales to source memory on subsequent use (if
the ifm scale is different).
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I7aae163438160a919cae04e235966e75355a6148
|
|
Putting back the estimates related to unbuffered
weight transfer.
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: I2072066bc1e01814fe3b0b87a912f69646da861c
|