Replaces LeakyRelu operations with a LUT-based activation function when
possible, otherwise with a combination of multiplication and maximization.
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
Change-Id: I3d2eb2dba7145997c3cc711d0ef18ab355fbb416
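A minimal sketch of the mul/max fallback, shown in numpy for illustration; the actual rewrite builds Mul and Maximum operations in Vela's internal graph:

```python
import numpy as np

def leaky_relu_via_mul_max(ifm: np.ndarray, alpha: float) -> np.ndarray:
    # LeakyRelu(x) = max(x, alpha * x) holds for 0 <= alpha <= 1:
    # for x >= 0, x >= alpha * x; for x < 0, alpha * x >= x.
    scaled = ifm * alpha            # becomes a Mul operation in the graph
    return np.maximum(ifm, scaled)  # becomes a Maximum operation
```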
- Minor cleanup of the register command stream generator
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I0514622402ee9b0557769dd7c7decfddecc87ffa
- Fixed a bug where the supported-operator check rejected operators
  based on an incorrect comparison of the tensor quantisations
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: Ibd0eb50077465d2c515c6ee10394d9b43cdf730c
Includes a number of changes:
* Handle non-existing optional inputs
* Handle disabled optional inputs (indexed as -1)
* Add unit tests for parsing operators
* Add a bias tensor to the various Convolution and FullyConnected
  operators if it is missing.
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: Ib88d2b610314b1c886fc0aef4f9da87430ce6ae5
Implemented LUT generation for softmax uint8/int8 to match the
reference.
Change-Id: Ib9acaa295ee1066591e800023d75f364520b44c1
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Very small quantization scales (below approximately 2^-31) would
produce negative shift values.
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: I4ca368284c097820f83e5ae53412a08c34516c7f
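A minimal sketch of turning a float scale into a Q31 multiplier and right-shift, with the out-of-range case clamped. The function name and the exact hardware shift limit are assumptions modelled on the commit description, not Vela's actual implementation:

```python
import math

def scale_to_multiplier_shift(scale):
    # frexp: scale == significand * 2**exponent, with significand in [0.5, 1)
    significand, exponent = math.frexp(scale)
    multiplier = int(round(significand * (1 << 31)))  # Q31 fixed-point multiplier
    shift = 31 - exponent                             # right-shift to undo the Q31 scaling
    if not 0 <= shift <= 63:   # assumed hardware shift range; clamp rather
        return 0, 0            # than emit an invalid (e.g. negative) shift
    return multiplier, shift
```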
- Make it clear that the --permanent-storage option is only valid
  for Ethos-U55.
- Removed Shram from the allowed values
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: Ice6cacd509713e33bcb380c16dcd3c3b34a82a33
NHCWB16 is now accounted for in the scheduler's SRAM estimates for
intermediate buffers in IFM streaming.
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: Icda5e05dd3663935f528f1a06d36d9e1de123cc8
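For context, a sketch of why the format affects the estimate: NHCWB16 packs channels in bricks of 16, so the storage shape must round C up (helper name hypothetical):

```python
def nhcwb16_storage_shape(nhwc_shape):
    # NHCWB16 packs channels in bricks of 16, so the SRAM estimate must
    # round the C dimension up to the next multiple of 16.
    n, h, w, c = nhwc_shape
    return [n, h, w, (c + 15) // 16 * 16]
```

For example, an intermediate [1, 32, 32, 20] buffer occupies storage as if it were [1, 32, 32, 32].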
Signed-off-by: Charles Xu <charles.xu@arm.com>
Change-Id: Ia83ab5ba28d193215e3f8fbc52552b0356111723
There may be cases where, after optimisation, a subgraph contains no
operators. Serialising and writing out the Vela-optimised tflite file
would crash in this corner case. This fix allows the empty tflite file
to be written out without crashing.
Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com>
Change-Id: Ia879d1ffdbab21706b15e99aa107fb2d8d4dd3de
This commit adds an entry to tflite_mapping.py
for the ROUND operator, which was previously missing.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: I22d6c60969eea6a785366c6741893718ba3cb8ae
- Removed some of the clutter
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I9a12f681247befd44dbbc9d7fbd135f0603d2fbd
- Fixed; the issue only affected operators with a stride greater than 1x1
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I129e46586aa16079ddbce3898569676ba9891372
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: I04f299e2d3319113fedf2fa401b88bae64fea66d
This commit adds missing entries and options to the
tflite_mapping, which should in theory allow every
existing TensorFlow Lite operator to be passed through Vela
without crashing.
Previously, some entries were missing and Vela crashed
with a custom error whenever they were encountered.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: Ia69b7a84164bb57c52ceaf7380160794b7f0d9ee
Vela often fails when encountering operators that have
inputs or outputs with shape == []. This is only supported
for elementwise ops where the shape is broadcast from IFM2
to IFM1.
This commit adds a restriction that places ops with
shape [] tensors on the CPU, except in the special case
of broadcasting for elementwise ops.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: I5b0855233e3b83870209f4da00fb2dbd0184fee0
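A minimal sketch of the placement rule; the helper name and the elementwise-op subset are assumptions for illustration, not Vela's supported-operator code:

```python
def can_stay_on_npu(op_type, input_shapes, output_shapes):
    elemwise_ops = {"Add", "Sub", "Mul", "Minimum", "Maximum"}  # assumed subset
    if all(shape != [] for shape in input_shapes + output_shapes):
        return True  # no scalar-shaped tensors involved
    # Only the IFM2-to-IFM1 broadcast case for elementwise ops is allowed
    return (
        op_type in elemwise_ops
        and len(input_shapes) == 2
        and input_shapes[1] == []
        and input_shapes[0] != []
        and output_shapes[0] != []
    )
```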
DMA transfer of weights is prevented when the weight
double buffer is assumed not to fit in SRAM.
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: I9809dca1d4b335436e1a0b81093640361ada255e
NHCWB16 is avoided for the input tensor of SplitSliceRead
when any of the consumers has a start offset in the C dimension
that is not a multiple of 16.
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: I333e2acfbeb02b9c34ee5ea28074baff12ea7b24
Added graph rewrite of Softmax for uint8/int8.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: Iecdd5d2cd3156a601b3313debba4a3562e6be5d7
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: If22fd21f9953a62305620a4e804e5caacb342c89
This commit fixes a bug where CPU ops were getting
passed on as NPU ops in weight_compressor.py due to
Operation.find_npu_op() incorrectly returning any
op with an 'npu_block_type' attribute (which every
op has) as an NPU op.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: I7a758f8d1b1237907816bc1be7b77aff765ae688
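A sketch of the corrected test, with a stub of the NpuBlockType enum; the real check lives on Vela's Operation class, and is_npu_op is a hypothetical helper modelled on the commit description:

```python
from enum import Enum

class NpuBlockType(Enum):  # stub of Vela's enum, for this sketch only
    Default = 0
    ConvolutionMxN = 1
    Pooling = 2

def is_npu_op(op):
    # Every Operation carries an npu_block_type attribute, so a plain
    # hasattr() check is always True; the fix is to require the attribute
    # to be set to something other than the Default value.
    return op.npu_block_type != NpuBlockType.Default
```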
Four dimensions were assumed in the check for whether NHCWB16 should
be avoided. Changed the check so that NHCWB16 is avoided if the axis
corresponds to the C dimension.
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: I7784a7a813a3c3438d6142523bf0a3ba81742aca
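A sketch of the rank-agnostic check (helper name hypothetical):

```python
def axis_is_c_dimension(axis, shape):
    # Normalise a negative axis, then test whether it addresses the
    # innermost (channel) dimension, whatever the tensor's rank,
    # instead of assuming a 4D NHWC shape.
    rank = len(shape)
    if axis < 0:
        axis += rank
    return axis == rank - 1
```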
- This commit removes unnecessary dependency checks and implements
on-demand calculation of the NPU/DMA dependencies.
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I85e681d1ab133bd88f64296dc00500f3c188e777
Added the complex64 datatype to allow pass-through without crashing.
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: I8beeceafb32182d4877a9880d21d51ba21033030
- Support for more than one 256-byte LUT in SHRAM
- No DMA is performed for a LUT that is already located in SHRAM
- Added MemArea.Shram, used for LUT, to avoid false address collision
asserts during SRAM tensor allocation
- Added read access to LUT in memory access calculation
Change-Id: If4d1eded5ed029d253f4f5efb2d80495fc3eac99
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
Avoid usage of NHCWB16 when Stack/Pack/Concat is performed along axis 3
and the "concat start" of each slice to be combined is not a multiple
of 16.
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: If3f7b4a3424be3c86fc2dc48e8649ce4c4f49485
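A sketch of the condition (helper name hypothetical):

```python
def concat_slices_fit_nhcwb16(axis, slice_channel_offsets):
    # NHCWB16 packs channels in bricks of 16, so concatenation along
    # the channel axis (axis 3 in NHWC) only lines up with the brick
    # boundaries when every slice starts at a multiple of 16 channels.
    if axis != 3:
        return True
    return all(offset % 16 == 0 for offset in slice_channel_offsets)
```

For example, concatenating slices of 24 and 8 channels gives start offsets [0, 24]; 24 is not a multiple of 16, so NHCWB16 is avoided.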
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: Id762ee2c03cd8f162cd0c450511ee5b2e0624586
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: I5b8db6430e79ec7a5836d8dd00a03413647de8ba
* The decorator causes the verification tests to fail when using TF
2.1, but not with TF 2.2, hence it is removed for now.
Change-Id: I07357c0fef383d9a65278fe99ad8e4d3f7dc6d9b
Signed-off-by: Manupa Karunaratne <manupa.karunaratne@arm.com>
This commit adds a missing entry for TensorPurpose.Unknown,
mapping to MemType.Unknown in the tensor_storage_mem_type
dictionary in the ArchitectureFeatures class in
architecture_features.py
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: I6c3d942e8c6f1c71c6496bdd621ca8d46ea76147
This commit fixes a mistake where the resample_mode
attribute of a tensor could be accessed without first
checking that the tensor in question was actually present.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: Id2ceb1d6e38133611fcecfc2ac97150c927ceee2
Avoid a concat op as predecessor in IFM streaming
when SRAM spilling is to be applied.
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: I2ba6283a7561a12d54a06552a15e122bb082b7a1
Signed-off-by: Charles Xu <charles.xu@arm.com>
Change-Id: I566abd5a1ffc367c6b9b8f37d5a26b61d27e840b
Fixed an issue where the Fully Connected weights' shape used for
compression scale calculations caused incorrect performance estimates.
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: Id3a5c187ad3e942b8e3d4c690b3dbba3c6fda922
We already import numeric_util, so there is no need to import it
again for one function.
Also replace hand-coded full-shape code with the existing function in
numeric_util.
Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com>
Change-Id: Ib569409fbfd457a7b4b99006d51d9c43f25a1c2c
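The existing numeric_util helper behaves roughly like this sketch (signature assumed from the commit description):

```python
def full_shape(ndim, shape, elem):
    # Pad `shape` on the left with `elem` up to `ndim` dimensions,
    # e.g. full_shape(4, [10, 20], 1) -> [1, 1, 10, 20]
    return [elem] * (ndim - len(shape)) + list(shape)
```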
add_input_tensor, set_output_tensor, create_const_tensor and
create_reshape_tensor have recently been added.
This replaces all existing instances found with these new helper
functions.
Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com>
Change-Id: If33be8dbf237b2087b562b03cdeb51da1f99a786
There were a number of "TensorUtil" functions defined in softmax.py.
These have been moved to the Tensor and Operator classes respectively.
Two of the functions were not simple tensor/op functions; these
helpers have been moved to tensor.py for the simple fact that they
return Tensors.
Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com>
Change-Id: I17d39c4e11f0837b7867b4a54da2e4a56383e095
The input tflite file potentially has metadata attached to it, which was
lost when writing the vela optimised tflite file out.
This patch preserves any metadata found.
Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com>
Change-Id: I7b4e941696d21b81802fd4398cd405323778bedf
For binary elementwise ops with broadcasting in the first IFM.
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: I25af67be8d3a852247989bc3ddc8e08e946f6bfa
A valid strided slice should have (positive) non-zero elements
when computing "end - begin".
When encountering an invalid strided slice, Vela asserted.
This now checks that the slice is valid and won't claim support if it isn't.
Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com>
Change-Id: I33ef118bd6a31ac78c680acb5229ff31b0809d6a
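A minimal sketch of the added check (helper name hypothetical; strides and masks are ignored here for brevity):

```python
def strided_slice_is_supported(begin, end):
    # Every element of end - begin must be positive and non-zero;
    # otherwise the op is left on the CPU instead of tripping an assert.
    return all(e - b > 0 for b, e in zip(begin, end))
```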
Signed-off-by: Charles Xu <charles.xu@arm.com>
Change-Id: Ibd0cd152fbc46dea0c92fd1bf7da1ffc9803fdba
* Renamed pack_bias_and_scale to encode_bias so it can be consumed
  externally
* Added a unit test for the API
Change-Id: I71829f3fcb390c475795848f0be3d132d3e158ee
Signed-off-by: Manupa Karunaratne <manupa.karunaratne@arm.com>
Added graph rewrite of Softmax for int16.
Change-Id: Id7885af6056a23e8b8362fb61ae94283251eb398
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Signed-off-by: Charles Xu <charles.xu@arm.com>
Change-Id: I44428d77b2e8e44a477e5c4dfe28ab8dd1792838
- In networks that share the scale & bias tensor between operators,
  differences in operator quantization cause conflicting HW packed
  scale & bias values for the tensor. This commit replicates the
  scale and bias tensors per operator, similar to the weights
  handling, to avoid this conflict.
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: Idee1fdf222ec849b6659adb0891b331d162524b7
A newer version of numpy gives a deprecation warning. This patch
resolves the deprecation warning so the user should never see it clutter
their output.
Tested on numpy version 1.19.0
Change-Id: I0c468818de4a2e5e2fcb109c45f51b2f1801b7b5
Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com>
If the total cycle count is zero (for whatever reason), then a divide by
zero can occur when calculating the midpoint_fps.
This change protects against that by detecting when this is the case and
instead setting the midpoint_fps to nan.
Further calculations using that variable are safe and result in nan
throughout.
Change-Id: I2d29545d331a6eb5b27b6d9c931587c15f877e74
Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com>
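A sketch of the guard (function and parameter names are hypothetical):

```python
import math

def calc_midpoint_fps(midpoint_inference_time):
    # A zero total cycle count makes the inference time zero; return
    # nan instead of dividing by zero. nan then propagates safely
    # through any further arithmetic.
    if midpoint_inference_time == 0:
        return math.nan
    return 1.0 / midpoint_inference_time
```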
When using the various verbose options to print extra info, there is no
break in the output produced by Vela.
Added the name of the function as part of the printout, and the name of
the subgraph to distinguish between subgraphs.
Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com>
Change-Id: Ib489cf5043bd9d49b22c976afc545ee600965737
Reshape ops should contain a "new_shape" attribute; an invalid tflite
file without this attribute caused Vela to crash.
The new_shape, however, is the same as the output shape, so if it is
missing we can easily add it.
Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com>
Change-Id: I28ebf028c68bf34bcf03746f57fce53abfcf09e1
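A minimal sketch of the fix-up (attribute access shown on a plain dict for illustration):

```python
def fixup_reshape_attrs(attrs, output_shape):
    # A Reshape op's "new_shape" equals its output shape, so a missing
    # attribute can be reconstructed rather than crashing on the
    # invalid tflite file.
    if "new_shape" not in attrs:
        attrs["new_shape"] = list(output_shape)
    return attrs
```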
By converting certain Conv2Ds (where the kernel size is 1x1 and the
IFM H and W are both 1) to FullyConnecteds, Vela can better determine
whether the weights need to be cached/double buffered or not.
This change decreases the number of NPU_OP_DMA_START commands found in
the resulting command stream.
Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com>
Change-Id: I928150d9f360578dde75a83986bea1560d83cbdd
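A sketch of the rewrite condition (helper name hypothetical; shapes are NHWC):

```python
def conv2d_is_really_fully_connected(kernel_h, kernel_w, ifm_shape):
    # A 1x1 Conv2D over a 1x1 IFM ([N, 1, 1, C] in NHWC) is
    # mathematically a FullyConnected; treating it as one lets the
    # scheduler make better weight caching/double-buffering decisions.
    return (kernel_h, kernel_w) == (1, 1) and ifm_shape[1:3] == [1, 1]
```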