Age | Commit message | Author |
|
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: I2cb3f6639e4bb8a984fa3647ee7b4678ed6f5890
|
|
LUT-related updates specific to 16K SHRAM:
- prevent LUT DMA transfer from overwriting accumulator SHRAM of an ongoing operation
- do not use the last 2K of SHRAM as accumulator during LUT operations
Change-Id: I17066e0410c6f07b125ed245002d7b19269a7a8a
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
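By way of illustration, a minimal sketch of the reservation this implies (the sizes come from the commit message; the names and structure are assumptions, not Vela's actual internals):
```python
SHRAM_SIZE = 16 * 1024  # total SHRAM in this configuration
LUT_SIZE = 2 * 1024     # top-of-SHRAM region a LUT occupies

def accumulator_limit(lut_in_use: bool) -> int:
    # While a LUT is resident, the accumulator must stay out of the last
    # 2K of SHRAM; otherwise the full 16K is available to it.
    return SHRAM_SIZE - LUT_SIZE if lut_in_use else SHRAM_SIZE

assert accumulator_limit(True) == 14 * 1024
```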
|
|
This commit fixes a bug wherein Split operators
were being erroneously placed on the CPU due to
a 0-dimensional input that disqualified them from
NPU placement; a restriction introduced in a
recent commit.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: I83c047ddf071d662343087c69bdb2a014dd209c3
|
|
Signed-off-by: Charles Xu <charles.xu@arm.com>
Change-Id: Ida307afc33cd7963bdeb505df400732a3efcc846
|
|
Replaces LeakyRelu operations with a LUT activation function when
possible, otherwise with a combination of multiplication and
maximisation.
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
Change-Id: I3d2eb2dba7145997c3cc711d0ef18ab355fbb416
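For reference, LeakyRelu(x) equals max(x, alpha * x) for 0 <= alpha <= 1, which is what the mul/max fallback exploits. A minimal numpy sketch of both paths (quantization handling is simplified and the names are illustrative, not Vela's):
```python
import numpy as np

def leaky_relu_reference(x, alpha=0.1):
    # The mul/max fallback: elementwise maximum of x and alpha * x.
    return np.maximum(x, alpha * x)

def leaky_relu_lut(scale, zero_point, alpha=0.1):
    # For int8 data the same function can be baked into a 256-entry LUT
    # indexed by the raw int8 value.
    values = (np.arange(-128, 128) - zero_point) * scale  # dequantize
    result = np.maximum(values, alpha * values)           # apply LeakyRelu
    return np.clip(np.round(result / scale + zero_point), -128, 127).astype(np.int8)
```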
|
|
- Minor cleanup of the register command stream generator too
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I0514622402ee9b0557769dd7c7decfddecc87ffa
|
|
- Fixed a bug where the supported-operator check rejected operators
based on an incorrect comparison of the tensor quantisations
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: Ibd0eb50077465d2c515c6ee10394d9b43cdf730c
|
|
Includes a number of changes:
* Handle non-existing optional inputs
* Handle disabled optional inputs (-1 indexed)
* Add unit tests for parsing operators
* Add a bias tensor to the different Convolutions + FullyConnected if
it is missing (see the sketch below).
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: Ib88d2b610314b1c886fc0aef4f9da87430ce6ae5
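A minimal sketch of the bias-padding step (the index-2 bias position and the Op stand-in are assumptions for illustration, not Vela's actual reader code):
```python
import numpy as np

class Op:  # minimal stand-in for Vela's Operation class
    def __init__(self, inputs):
        self.inputs = inputs

def ensure_bias(op, out_channels):
    # Convolutions and FullyConnected expect a bias input (index 2 assumed
    # here); if it is absent or disabled, attach an all-zero int32 bias.
    if len(op.inputs) < 3 or op.inputs[2] is None:
        bias = np.zeros([out_channels], dtype=np.int32)
        if len(op.inputs) < 3:
            op.inputs.append(bias)
        else:
            op.inputs[2] = bias
    return op

conv = ensure_bias(Op([np.ones([1, 8, 8, 4]), np.ones([4, 3, 3, 4])]), out_channels=4)
assert conv.inputs[2].shape == (4,)
```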
|
|
Implemented LUT generation for softmax uint8/int8 to match the
reference.
Change-Id: Ib9acaa295ee1066591e800023d75f364520b44c1
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
|
|
Very small quantization scales, below around 2^-31, would produce
negative shift values.
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: I4ca368284c097820f83e5ae53412a08c34516c7f
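A generic sketch of the mantissa/shift decomposition involved (the sign conventions, field widths and the exact guard added by this fix are assumptions; Vela's real code differs in detail):
```python
import math

def quantise_scale(scale: float):
    # Decompose scale so that scale ~= mantissa * 2**(-shift), mantissa in Q31.
    significand, exponent = math.frexp(scale)       # scale = significand * 2**exponent
    mantissa = int(round(significand * (1 << 31)))  # Q31 mantissa
    shift = 31 - exponent
    # Guard the extremes (e.g. scales around 2^-31 and below) so the shift
    # stays inside the range the hardware field can encode.
    shift = max(0, min(shift, 63))
    return mantissa, shift

m, s = quantise_scale(0.25)
assert abs(m * 2.0 ** -s - 0.25) < 1e-9
```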
|
|
- Make it clear that the --permanent-storage option is only valid
for Ethos-U55.
- Remove Shram from the allowed values
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: Ice6cacd509713e33bcb380c16dcd3c3b34a82a33
|
|
NHCWB16 is now accounted for in the scheduler's SRAM estimates
for intermediate buffers in IFM streaming.
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: Icda5e05dd3663935f528f1a06d36d9e1de123cc8
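The estimate difference comes from NHCWB16 rounding the channel dimension up to whole 16-wide bricks. A small sketch of the arithmetic (the function name is illustrative):
```python
def nhcwb16_storage_elements(n, h, w, c):
    # NHCWB16 splits C into bricks of 16, so storage is padded up to a
    # multiple of 16 channels.
    bricks = (c + 15) // 16
    return n * h * bricks * w * 16

# A 1x8x8x20 feature map needs 2048 stored elements rather than NHWC's 1280,
# which is why the scheduler's SRAM estimates must account for the format.
assert nhcwb16_storage_elements(1, 8, 8, 20) == 2048
```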
|
|
Signed-off-by: Charles Xu <charles.xu@arm.com>
Change-Id: Ia83ab5ba28d193215e3f8fbc52552b0356111723
|
|
There may be cases where, after optimisations, there are no operators
left in the subgraph. Serialising and writing out the Vela-optimised
tflite file would crash in this corner case. This fix writes out the
empty tflite file instead of crashing.
Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com>
Change-Id: Ia879d1ffdbab21706b15e99aa107fb2d8d4dd3de
|
|
This commit adds an entry to tflite_mapping.py
for the ROUND operator, which was previously missing.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: I22d6c60969eea6a785366c6741893718ba3cb8ae
|
|
- Removed some of the clutter
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I9a12f681247befd44dbbc9d7fbd135f0603d2fbd
|
|
- Fixed: it only affected operators with striding greater than 1x1
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I129e46586aa16079ddbce3898569676ba9891372
|
|
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: I04f299e2d3319113fedf2fa401b88bae64fea66d
|
|
This commit adds missing entries and options to the
tflite_mapping, which should in theory allow every
existing TensorFlow Lite operator to be passed through Vela
without crashing.
Previously some entries were missing, and Vela would crash
with a custom error whenever they were encountered.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: Ia69b7a84164bb57c52ceaf7380160794b7f0d9ee
|
|
Vela often fails when encountering operators that have
inputs or outputs with shape == []. This is supported only
for elementwise ops where the shape is broadcast from IFM2
to IFM1.
This commit adds a restriction which places ops with
shape [] tensors on the CPU, except in the special case
of broadcasting for elementwise ops.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: I5b0855233e3b83870209f4da00fb2dbd0184fee0
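A sketch of the added restriction (the attribute names are assumptions for illustration, not Vela's actual supported-operator check):
```python
def supported_on_npu(op):
    # Ops touching 0-dimensional ([] shape) tensors run on the CPU, except
    # elementwise ops that broadcast a scalar IFM2 into IFM1.
    has_scalar = any(t.shape == [] for t in op.inputs + op.outputs)
    if not has_scalar:
        return True
    if op.is_elementwise and len(op.inputs) == 2 and op.inputs[1].shape == []:
        return True  # IFM2 -> IFM1 scalar broadcast is the supported case
    return False
```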
|
|
DMA transfer of weights is prevented when the weight
double buffer is assumed not to fit in SRAM.
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: I9809dca1d4b335436e1a0b81093640361ada255e
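In essence the check gates DMA on the double buffer fitting; a one-function sketch (names are illustrative, not the scheduler's actual code):
```python
def allow_weight_dma(weight_buffer_size, sram_available):
    # Weights are double-buffered when streamed, so DMA is only worthwhile
    # if twice the buffer size fits in the available SRAM.
    return 2 * weight_buffer_size <= sram_available
```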
|
|
NHCWB16 is avoided for the input tensor of SplitSliceRead
when any of the consumers has a start offset in the C dimension
that is not a multiple of 16.
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: I333e2acfbeb02b9c34ee5ea28074baff12ea7b24
|
|
Added graph rewrite of Softmax for uint8/int8.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: Iecdd5d2cd3156a601b3313debba4a3562e6be5d7
|
|
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: If22fd21f9953a62305620a4e804e5caacb342c89
|
|
This commit fixes a bug where CPU ops were being
passed off as NPU ops in weight_compressor.py because
Operation.find_npu_op() incorrectly returned any
op with an 'npu_block_type' attribute (which every
op has) as an NPU op.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: I7a758f8d1b1237907816bc1be7b77aff765ae688
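The pitfall, sketched (the string block type and the simplified logic are assumptions; Vela's real find_npu_op walks the graph):
```python
def find_npu_op_buggy(op):
    # Every Operation carries an npu_block_type attribute, so hasattr()
    # classifies every op, including CPU ones, as an NPU op.
    return op if hasattr(op, "npu_block_type") else None

def find_npu_op_fixed(op, cpu_block_type="Default"):
    # Test the attribute's value instead of its mere existence.
    return op if op.npu_block_type != cpu_block_type else None
```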
|
|
Four dimensions were assumed in the check of whether NHCWB16
should be avoided. Changed the check so that NHCWB16 is avoided
whenever the axis corresponds to the C dimension.
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: I7784a7a813a3c3438d6142523bf0a3ba81742aca
|
|
- This commit removes unnecessary dependency checks and implements
on-demand calculation of the NPU/DMA dependencies.
Signed-off-by: <tim.hall@arm.com>
Change-Id: I85e681d1ab133bd88f64296dc00500f3c188e777
|
|
Added the complex64 datatype to allow pass-through without crashing.
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: I8beeceafb32182d4877a9880d21d51ba21033030
|
|
- Support for more than one 256-byte LUT in SHRAM
- No DMA is performed for a LUT that is already located in SHRAM
- Added MemArea.Shram, used for LUT, to avoid false address collision
asserts during SRAM tensor allocation
- Added read access to LUT in memory access calculation
Change-Id: If4d1eded5ed029d253f4f5efb2d80495fc3eac99
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
|
|
Avoid usage of NHCWB16 when Stack/Pack/Concat is performed on axis 3
and the "concat start" of each slice to be combined is not a multiple
of 16.
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: If3f7b4a3424be3c86fc2dc48e8649ce4c4f49485
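A sketch of the resulting condition (names are illustrative):
```python
def can_use_nhcwb16_for_concat(axis, slice_start_offsets):
    # NHCWB16 packs channels in bricks of 16, so a concat along the channel
    # axis (axis 3 in NHWC) only lines up if every slice's "concat start"
    # falls on a brick boundary.
    if axis != 3:
        return True
    return all(offset % 16 == 0 for offset in slice_start_offsets)

assert can_use_nhcwb16_for_concat(3, [0, 16, 48]) is True
assert can_use_nhcwb16_for_concat(3, [0, 8]) is False
```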
|
|
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: Id762ee2c03cd8f162cd0c450511ee5b2e0624586
|
|
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: I5b8db6430e79ec7a5836d8dd00a03413647de8ba
|
|
*The decorator causes the verification tests to fail when using TF
2.1, but not with TF 2.2, hence it is removed for now.
Change-Id: I07357c0fef383d9a65278fe99ad8e4d3f7dc6d9b
Signed-off-by: Manupa Karunaratne <manupa.karunaratne@arm.com>
|
|
This commit adds a missing entry for TensorPurpose.Unknown,
mapping to MemType.Unknown, in the tensor_storage_mem_type
dictionary of the ArchitectureFeatures class in
architecture_features.py.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: I6c3d942e8c6f1c71c6496bdd621ca8d46ea76147
|
|
This commit fixes a mistake where the resample_mode
attribute of a tensor could be accessed without first
checking that the tensor in question was actually present.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: Id2ceb1d6e38133611fcecfc2ac97150c927ceee2
|
|
Avoid a concat op as predecessor in IFM streaming
when SRAM spilling is to be applied.
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: I2ba6283a7561a12d54a06552a15e122bb082b7a1
|
|
Signed-off-by: Charles Xu <charles.xu@arm.com>
Change-Id: I566abd5a1ffc367c6b9b8f37d5a26b61d27e840b
|
|
Fixed an issue where the Fully Connected weights' shape used for
compression scale calculations caused incorrect performance estimates.
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: Id3a5c187ad3e942b8e3d4c690b3dbba3c6fda922
|
|
We already import numeric_util, so there is no need to import it again
for one function.
Also replaced the handcoded full-shape code with the equivalent that
already exists in numeric_util.
Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com>
Change-Id: Ib569409fbfd457a7b4b99006d51d9c43f25a1c2c
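The kind of helper being reused, sketched after numeric_util's full_shape (the exact signature is an assumption):
```python
def full_shape(dim, shape, fill):
    # Pad shape on the left with `fill` until it has `dim` dimensions.
    return [fill] * (dim - len(shape)) + list(shape)

assert full_shape(4, [8, 8, 16], 1) == [1, 8, 8, 16]
```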
|
|
add_input_tensor, set_output_tensor, create_const_tensor and
create_reshape_tensor have recently been added.
This replaces all existing instances found with these new helper
functions.
Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com>
Change-Id: If33be8dbf237b2087b562b03cdeb51da1f99a786
|
|
There were a number of "TensorUtil" functions defined in softmax.py.
These have been moved to the Tensor and Operator classes
respectively.
Two of the functions were not simple tensor/op functions; these
helpers have been moved to tensor.py for the simple fact that they
return Tensors.
Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com>
Change-Id: I17d39c4e11f0837b7867b4a54da2e4a56383e095
|
|
The input tflite file potentially has metadata attached to it, which
was lost when writing out the Vela-optimised tflite file.
This patch preserves any metadata found.
Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com>
Change-Id: I7b4e941696d21b81802fd4398cd405323778bedf
|
|
For binary elementwise ops with broadcasting in the first IFM.
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: I25af67be8d3a852247989bc3ddc8e08e946f6bfa
|
|
A valid strided slice should have (positive) non-zero elements
when computing "end - begin".
When encountering an invalid strided slice, Vela asserted.
This now checks that the slice is valid and won't claim support if
it isn't.
Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com>
Change-Id: I33ef118bd6a31ac78c680acb5229ff31b0809d6a
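The check, sketched:
```python
import numpy as np

def valid_strided_slice(begin, end):
    # Support is only claimed when every element of end - begin is positive.
    return bool(np.all(np.array(end) - np.array(begin) > 0))

assert valid_strided_slice([0, 0], [4, 4]) is True
assert valid_strided_slice([2, 0], [2, 4]) is False  # zero-sized dimension
```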
|
|
Signed-off-by: Charles Xu <charles.xu@arm.com>
Change-Id: Ibd0cd152fbc46dea0c92fd1bf7da1ffc9803fdba
|
|
*Renamed pack_bias_and_scale to encode_bias so it can be consumed
externally
*Added a unit test for the API
Change-Id: I71829f3fcb390c475795848f0be3d132d3e158ee
Signed-off-by: Manupa Karunaratne <manupa.karunaratne@arm.com>
|
|
Added graph rewrite of Softmax for int16.
Change-Id: Id7885af6056a23e8b8362fb61ae94283251eb398
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
|
|
Signed-off-by: Charles Xu <charles.xu@arm.com>
Change-Id: I44428d77b2e8e44a477e5c4dfe28ab8dd1792838
|
|
- In networks that share the scale & bias tensor between operators,
differences in operator quantization cause conflicting HW-packed
scale & bias values for the tensor. This commit replicates the
scale and bias tensors per operator, similar to weights handling,
to avoid this conflict.
Signed-off-by: <tim.hall@arm.com>
Change-Id: Idee1fdf222ec849b6659adb0891b331d162524b7
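A sketch of the replication idea (the bias attribute and clone method are assumptions for illustration, not Vela's actual API):
```python
def replicate_shared_biases(ops):
    # Give each operator after the first its own copy of a shared scale/bias
    # tensor, so every copy can be packed with that op's own quantization.
    seen = set()
    for op in ops:
        if id(op.bias) in seen:
            op.bias = op.bias.clone()
        else:
            seen.add(id(op.bias))
```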
|
|
A newer version of numpy gives a deprecation warning. This patch
resolves the deprecation warning so the user should never see it clutter
their output.
Tested on numpy version 1.19.0
Change-Id: I0c468818de4a2e5e2fcb109c45f51b2f1801b7b5
Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com>
|