Age | Commit message | Author |
|
Added graph rewrite of Softmax for uint8/int8.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: Iecdd5d2cd3156a601b3313debba4a3562e6be5d7
|
|
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: If22fd21f9953a62305620a4e804e5caacb342c89
|
|
This commit fixes a bug where CPU ops were getting
passed on as NPU ops in weight_compressor.py due to
Operation.find_npu_op() incorrectly returning any
op with an 'npu_block_type' attribute (which every
op has) as an NPU op.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: I7a758f8d1b1237907816bc1be7b77aff765ae688
|
|
Four dimensions were assumed in the check for whether NHCWB16 should
be avoided. Changed the check so that if the axis corresponds to the
C-dimension, NHCWB16 is avoided.
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: I7784a7a813a3c3438d6142523bf0a3ba81742aca
|
|
- This commit removes unnecessary dependency checks and implements
on-demand calculation of the NPU/DMA dependencies.
Signed-off-by: <tim.hall@arm.com>
Change-Id: I85e681d1ab133bd88f64296dc00500f3c188e777
|
|
Added complex64 datatype to allow pass through without crashing.
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: I8beeceafb32182d4877a9880d21d51ba21033030
|
|
- Support for more than one 256-byte LUT in SHRAM
- No DMA is performed for a LUT that is already located in SHRAM
- Added MemArea.Shram, used for LUT, to avoid false address collision
asserts during SRAM tensor allocation
- Added read access to LUT in memory access calculation
Change-Id: If4d1eded5ed029d253f4f5efb2d80495fc3eac99
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
|
|
Avoid usage of NHCWB16 when Stack/Pack/Concat is performed in axis 3,
and the "concat start" of each slice to be combined is not a multiple
of 16.
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: If3f7b4a3424be3c86fc2dc48e8649ce4c4f49485
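As a rough illustration, the condition described above could be sketched as follows. This is a hypothetical helper, not the actual vela code; the function name and parameters are assumptions. NHCWB16 packs the channel dimension in blocks of 16, which is why an unaligned concat start along axis 3 is a problem.

```python
def avoid_nhcwb16_for_concat(axis, concat_starts):
    """Return True if NHCWB16 should be avoided for a concat-like op.

    Hypothetical sketch: avoid NHCWB16 when concatenating along axis 3
    (the C-dimension) and any slice's "concat start" offset is not a
    multiple of 16.
    """
    return axis == 3 and any(start % 16 != 0 for start in concat_starts)
```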
|
|
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: Id762ee2c03cd8f162cd0c450511ee5b2e0624586
|
|
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: I5b8db6430e79ec7a5836d8dd00a03413647de8ba
|
|
*The decorator causes the verification tests to fail when using TF
2.1, but not with TF 2.2, hence it is removed for now.
Change-Id: I07357c0fef383d9a65278fe99ad8e4d3f7dc6d9b
Signed-off-by: Manupa Karunaratne <manupa.karunaratne@arm.com>
|
|
This commit adds a missing entry for TensorPurpose.Unknown,
mapping to MemType.Unknown in the tensor_storage_mem_type
dictionary in the ArchitectureFeatures class in
architecture_features.py
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: I6c3d942e8c6f1c71c6496bdd621ca8d46ea76147
|
|
This commit fixes a mistake where the resample_mode
attribute of a tensor would be accessed without first
checking that the tensor in question actually exists.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: Id2ceb1d6e38133611fcecfc2ac97150c927ceee2
|
|
Avoid a concat op as predecessor in IFM streaming
when SRAM spilling is to be applied.
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: I2ba6283a7561a12d54a06552a15e122bb082b7a1
|
|
Signed-off-by: Charles Xu <charles.xu@arm.com>
Change-Id: I566abd5a1ffc367c6b9b8f37d5a26b61d27e840b
|
|
Fixed an issue with Fully Connected weights' shape used for compression
scale calculations causing incorrect performance estimates.
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: Id3a5c187ad3e942b8e3d4c690b3dbba3c6fda922
|
|
We already import numeric_util, so there is no need to import it again
for one function.
Also replaced hand-coded full-shape code with a function already
existing in numeric_util.
Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com>
Change-Id: Ib569409fbfd457a7b4b99006d51d9c43f25a1c2c
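The kind of full-shape helper referred to above can be sketched as follows. The exact name and signature in numeric_util are an assumption here; this only illustrates the idea of padding a shape to a fixed rank instead of hand-coding it at each call site.

```python
def full_shape(dim, shape, fill):
    """Pad a shape to the given number of dimensions by prepending a
    fill value (sketch only; the real numeric_util helper may differ)."""
    return [fill] * (dim - len(shape)) + list(shape)
```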
|
|
add_input_tensor, set_output_tensor, create_const_tensor and
create_reshape_tensor have recently been added.
This replaces all existing instances found with these new helper
functions.
Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com>
Change-Id: If33be8dbf237b2087b562b03cdeb51da1f99a786
|
|
There were a number of "TensorUtil" functions defined in softmax.py.
These have been moved to their respective classes, Tensor and
Operator.
Two of the functions were not simple tensor/op functions. These helper
functions have been moved to tensor.py for the simple fact that they
return Tensors.
Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com>
Change-Id: I17d39c4e11f0837b7867b4a54da2e4a56383e095
|
|
The input tflite file potentially has metadata attached to it, which was
lost when writing the vela optimised tflite file out.
This patch preserves any metadata found.
Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com>
Change-Id: I7b4e941696d21b81802fd4398cd405323778bedf
|
|
For binary elementwise ops with broadcasting in first IFM.
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: I25af67be8d3a852247989bc3ddc8e08e946f6bfa
|
|
A valid strided slice should have (positive) non-zero elements
when computing "end - begin".
When encountering an invalid strided slice, vela asserted.
This now checks that the slice is valid and won't claim support if it
isn't.
Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com>
Change-Id: I33ef118bd6a31ac78c680acb5229ff31b0809d6a
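The validity check described above can be sketched as follows. This is a hypothetical helper, not the actual vela supported-operator check; the name and parameters are assumptions.

```python
def is_valid_strided_slice(begin, end):
    """Return True if every element of "end - begin" is positive,
    i.e. the strided slice selects a non-empty range on every axis
    (sketch of the check described above)."""
    return all(e - b > 0 for b, e in zip(begin, end))
```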
|
|
Signed-off-by: Charles Xu <charles.xu@arm.com>
Change-Id: Ibd0cd152fbc46dea0c92fd1bf7da1ffc9803fdba
|
|
*Renamed pack_bias_and_scale to encode_bias so it can be consumed
externally
*Added a unit test for the API
Change-Id: I71829f3fcb390c475795848f0be3d132d3e158ee
Signed-off-by: Manupa Karunaratne <manupa.karunaratne@arm.com>
|
|
Added graph rewrite of Softmax for int16.
Change-Id: Id7885af6056a23e8b8362fb61ae94283251eb398
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
|
|
Signed-off-by: Charles Xu <charles.xu@arm.com>
Change-Id: I44428d77b2e8e44a477e5c4dfe28ab8dd1792838
|
|
- In networks that share the scale & bias tensor between operators,
  differences in operator quantization cause conflicting HW-packed
  scale & bias values for the tensor. This commit replicates the
  scale and bias tensors per operator, similar to weights handling,
  to avoid this conflict.
Signed-off-by: <tim.hall@arm.com>
Change-Id: Idee1fdf222ec849b6659adb0891b331d162524b7
|
|
A newer version of numpy gives a deprecation warning. This patch
resolves the deprecation warning so the user should never see it clutter
their output.
Tested on numpy version 1.19.0
Change-Id: I0c468818de4a2e5e2fcb109c45f51b2f1801b7b5
Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com>
|
|
If the total cycle count is zero (for whatever reason), then a divide by
zero can occur when calculating the midpoint_fps.
This change protects against that by detecting when that is the case and
instead setting the midpoint_fps to nan.
Further calculations using that variable are safe and result in nan
throughout.
Change-Id: I2d29545d331a6eb5b27b6d9c931587c15f877e74
Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com>
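The guard described above can be sketched as follows. The function name and parameters are hypothetical; this only illustrates returning NaN instead of dividing by a zero cycle count, so downstream arithmetic propagates NaN safely.

```python
import math

def midpoint_fps(clock_hz, total_cycles):
    """Sketch of the divide-by-zero guard described above: if the total
    cycle count is zero (for whatever reason), return NaN rather than
    raising ZeroDivisionError. NaN then propagates through any further
    calculations."""
    if total_cycles == 0:
        return math.nan
    return clock_hz / total_cycles
```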
|
|
When using the various verbose options to print extra info, there is no
break in the output produced by vela.
Added the name of the function as part of the printing.
Added the name of the subgraph to distinguish between them.
Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com>
Change-Id: Ib489cf5043bd9d49b22c976afc545ee600965737
|
|
No need for git to track those.
Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com>
Change-Id: I8fd5b8099961133a3abec111c78a4ca7a42bdf77
|
|
Reshape ops should contain a "new_shape" attribute. An invalid tflite
file without this attribute caused vela to crash.
The new_shape, however, is the same as the output shape, so if it is
missing, we can easily add the missing attribute.
Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com>
Change-Id: I28ebf028c68bf34bcf03746f57fce53abfcf09e1
|
|
By converting certain Conv2D ops (where the kernel size is 1x1 and the
IFM H and W are both 1) to Fully Connected ops, vela can better know
whether the weights need to be cached/double-buffered or not.
This change decreases the number of NPU_OP_DMA_START commands found in
the resulting command stream.
Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com>
Change-Id: I928150d9f360578dde75a83986bea1560d83cbdd
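The conversion condition described above can be sketched as follows. The function name and parameters are hypothetical; it only expresses when a Conv2D degenerates into a Fully Connected layer.

```python
def can_convert_conv2d_to_fc(kernel_h, kernel_w, ifm_h, ifm_w):
    """A 1x1 kernel applied to an IFM whose H and W are both 1 computes
    the same result as a Fully Connected layer (sketch of the condition
    described above)."""
    return kernel_h == 1 and kernel_w == 1 and ifm_h == 1 and ifm_w == 1
```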
|
|
There is a repeating pattern of setting the 3 different shapes in a
tensor to a single shape value.
This adds a new function in the tensor class that does this for you.
Changed existing instances of manually setting shape to use this new
function.
Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com>
Change-Id: Ibc74e741ea47cec473e6be42cc102f721ec63b11
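The helper described above can be sketched as follows, using a minimal stand-in class; the real vela Tensor class has many more fields, and the attribute names here are assumptions.

```python
class Tensor:
    """Minimal stand-in illustrating a set-all-shapes helper that
    assigns the three shape variants in one call."""

    def __init__(self):
        self.shape = None
        self.storage_shape = None
        self.bandwidth_shape = None

    def set_all_shapes(self, shape):
        # Replace the repeating pattern of setting the three different
        # shapes to a single shape value with one call.
        self.shape = shape
        self.storage_shape = shape
        self.bandwidth_shape = shape
```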
|
|
*lint
*added unit tests
*added typecheck
*added docstring for the api
Change-Id: Ibd4bc40d4381ac40ad2ea3d500b26c4ec565ab07
Signed-off-by: Manupa Karunaratne <manupa.karunaratne@arm.com>
|
|
This commit adds a quantization restriction check
for supported operators, so that operators that do
not support different quantization between their
IFM (1/2) and OFM tensors are correctly placed on
the CPU.
The quantization between two tensors is compared
using a new equality function implemented for
the QuantizationParameters class.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: I70ff36b4ab4955f328d6e6e699f00dbc43c0404a
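The equality function mentioned above can be sketched as follows, with a simplified stand-in class; the field names are assumptions and the real QuantizationParameters class carries more state.

```python
class QuantizationParameters:
    """Simplified stand-in illustrating an equality check between two
    tensors' quantization parameters."""

    def __init__(self, scale_f32, zero_point):
        self.scale_f32 = scale_f32
        self.zero_point = zero_point

    def __eq__(self, other):
        # Two quantizations compare equal when scale and zero point match.
        if not isinstance(other, QuantizationParameters):
            return NotImplemented
        return (self.scale_f32 == other.scale_f32
                and self.zero_point == other.zero_point)
```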
|
|
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: Ia9c70d62c6abc827cbdf73a8bb37afd595796741
|
|
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: I39cff126dda89d71426ab731427ca1d64d02590d
|
|
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: Iaf4d7ab9c32b0d783072c5f131a61bfebe77cc16
|
|
Automatically generated, no functional changes.
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: Ia6a791f7dbadc352bc8a7b528afa070e8540b4d0
|
|
Avoid encoding empty weights stream.
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: I120ede14f19705e169c5f03ed344036a58b5f84f
|
|
Fixed alignment of the tensor for bias and scale.
Change-Id: I303a225a536f169909cec9ba4d5cee088110bb94
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
|
|
- Fixed bug with the size of the scale and bias tensor
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I4d267d4c918a5c834ebdff82de4f021717e95203
|
|
Added support for one more memory configuration.
Change-Id: Iac19992386e3e9b80bd519acb1b0a399c47d736f
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
|
|
This will give a worst case estimate of the Double Buffer size in the
Scheduler and it will no longer be able to choose strategies that end
up with a buffer that doesn't fit in SRAM.
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: I763731f63c7672679f3b8cd6db65dad03b946ae5
|
|
- Parallelism mode register was being written for non-Yoda targets.
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I31b50031dab4d615733c4c3790dec8934117f275
|
|
Signed-off-by: Charles Xu <charles.xu@arm.com>
Change-Id: I53d9d56acee57cff208dccb4822c1f1a461c416d
|
|
Restrict settings of permanent storage.
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: Iaa81ee05e8e567b2737825be634baa9085192f0e
|
|
- Added release information
- Added PyPi documentation
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: Iaae64cfe10a2fa65f0559d13940b19d6f57edfdc
|
|
Signed-off-by: Charles Xu <charles.xu@arm.com>
Change-Id: Ief50c934b9e9b0bd3024d3ed0bbaa7b655971952
|