Age | Commit message | Author |
|
Allows fusing of LUT with a preceding operator regardless of
input/output scale.
Change-Id: Ia378adbb3fe61d71299feb085f7313377e0efa39
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
|
|
If a tflite file with no ops but just the input/output tensor is given,
Vela wrote an empty optimised tflite file that contained no tensors.
This is fixed by also serialising all placeholder tensors on write.
Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com>
Change-Id: If79817100869e712a75264889f401e38de0b1e7a
|
|
Added batching to softmax by reshaping the input.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: I0b516f9bf2410fb86372b229beba4a7280c498cc
|
|
Removed CLI-option permanent-storage
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: I03e03205a183bd538292a73a07b095546fa3d95a
|
|
For SplitV, sizesplit can contain one -1, indicating that the size of
that split is to be inferred.
Added support to handle this.
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: Ib9fc8dd2ee1749e81a978d85f2d4a016698bb441
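The inference described in the SplitV commit above can be sketched as follows. This is a minimal illustration, not Vela's implementation: the function name `resolve_size_splits` and its parameters are hypothetical, and it assumes the sizes must sum to the length of the split axis.

```python
def resolve_size_splits(size_splits, axis_dim):
    """Return size_splits with a single -1 entry replaced by the inferred size.

    Hypothetical helper: axis_dim is the length of the axis being split.
    """
    unknown = [i for i, s in enumerate(size_splits) if s == -1]
    if len(unknown) > 1:
        raise ValueError("at most one entry may be -1")

    known_total = sum(s for s in size_splits if s != -1)
    resolved = list(size_splits)

    if unknown:
        # The unknown split takes whatever is left of the axis.
        inferred = axis_dim - known_total
        if inferred < 0:
            raise ValueError("known sizes exceed the split dimension")
        resolved[unknown[0]] = inferred
    elif known_total != axis_dim:
        raise ValueError("sizes do not sum to the split dimension")

    return resolved
```

For example, splitting an axis of length 10 with sizes `[2, -1, 3]` resolves the middle entry to 5.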
|
|
Added the CLI option. It only applies to CPU tensors. Added an
AllocationError which is raised when allocation fails.
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: I89164dea3ac7b7add7bc40aec2ce8fe50600105d
|
|
Fix int16 multiplier saturation to match the reference.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: I4a9c859482f7deb3899f90c7e9eb40c255ee4c45
|
|
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: I75aad9bf59ad76ee6a0c0feb4d7299b50d787fe8
|
|
Split mapping to tensor
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: Ic143f3b4d37f6904edd8f119eff1d108f70b5026
|
|
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: I857aeb7aeb34f4b8ea47e6ac954cead268335e32
|
|
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: I04618fd0d29075e7d3f8f27a320129603f045163
|
|
- Set ACC_FORMAT to 32-bit for pooling operations.
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I69ebd08c2db4c5ec966ca13c872c9b0c8330bb6f
|
|
- Fixed bias check to use quantised values.
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I6d87439938b9b5aeec87814e5a30d59fd06d5748
|
|
- Corrected the rounding mode for softmax
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: If136491c7668e85fba1e2c56c8cff11aa32db328
|
|
Fixed a zero point issue for int32 ifm.
Change-Id: I9149cb24d5b030ea5216a028a113518e458a8d15
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
|
|
Enables LUT for LeakyRelu with int8/uint8 even if input scale
is different from the output scale.
Fusing LUT with a previous operator for this situation
requires further work.
Change-Id: I9eddfe36f457e763d44eb3e05fbe240eac7cfec9
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
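To illustrate why a LUT can absorb a difference between input and output scale, the sketch below builds a 256-entry int8 table for LeakyRelu. It is an illustration under stated assumptions, not Vela's LUT generator: the function name, the parameter names, and the rounding (Python's built-in `round`) are all assumptions, and the reference rounding mode may differ.

```python
def leaky_relu_lut(alpha, in_scale, in_zp, out_scale, out_zp, qmin=-128, qmax=127):
    """Build a lookup table mapping each quantised input to a quantised output.

    Hypothetical helper. Entry i corresponds to the quantised input qmin + i.
    """
    lut = []
    for q in range(qmin, qmax + 1):
        x = (q - in_zp) * in_scale            # dequantise with the input scale
        y = x if x >= 0 else alpha * x        # LeakyRelu in real arithmetic
        q_out = int(round(y / out_scale)) + out_zp  # requantise with the output scale
        lut.append(min(qmax, max(qmin, q_out)))     # saturate to the int8 range
    return lut
```

Because both scales are folded into the table entries, the lookup itself is the same whether or not the scales match.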
|
|
- Processing reshapes at the end of NPU subgraphs selected NHCWB16
tensor format before handing over to the CPU. This commit detects
end-of-subgraph during the reshape-consumers compatibility check
and chooses NHWC format instead.
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: Ieefdbecdba1a6183d79d3ac4d2505503dbf321cb
|
|
Allows the int64 data type to be used as long as all values can be packed
into an int40 value.
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: I0e25ec482e3ea765a5fd00bcf7e212a9e65a1461
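A range check of the kind described in the int64 commit above might look like the sketch below. It assumes a signed two's-complement 40-bit range; the names `INT40_MIN`, `INT40_MAX` and `fits_in_int40` are hypothetical and not taken from Vela.

```python
# Assumed signed 40-bit range: [-2^39, 2^39 - 1].
INT40_MIN = -(1 << 39)
INT40_MAX = (1 << 39) - 1


def fits_in_int40(values):
    """Return True if every value can be represented as a signed 40-bit integer."""
    return all(INT40_MIN <= int(v) <= INT40_MAX for v in values)
```

A tensor whose values all pass this check can keep its int64 type in the file while still being packable into 40 bits.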
|
|
Fixed serialisation of scalar ifm tensors with values that do not fit
in a single byte.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: I2714398db91b83f24e5271c1d5de1c0e8211f9ab
|
|
Added checks to avoid using NHCWB16 for int32 reduce_sum, which makes
int8/uint8 softmax work.
Also enabled the softmax graph rewrite by default and fixed a saturation
problem.
Change-Id: Ic01bd9ece7e5c3edb2900b7915cc747efe9e5760
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
|
|
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: I287c24725126c169afec779b921e43c3ab26f739
|
|
- Setup ifm/ifm2 based on primary op's inputs
Change-Id: I727eab473165d7cc876b70fa8873fbc0c1480fb5
Signed-off-by: Diqing Zhong <diqing.zhong@arm.com>
|
|
Updated the kernel size check, in which width and height were swapped,
and added a weight sum check.
Signed-off-by: Andreas Nevalainen <andreas.nevalainen@arm.com>
Change-Id: Idb18cf258ac19b3a0d71134dab5a117bcd778b59
|
|
- Reshapes that merely add/remove dimensions, rather than re-layout the
data, need not fall back to NHWC. This commit allows reshapes between
NPU operators to use NHCWB16.
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: Ieb7745e586bf324e92e741a04b74caf7285f4b8b
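One way to recognise a reshape that only adds or removes dimensions is to compare the two shapes with their size-1 dimensions stripped. The sketch below assumes that "add/remove dimensions" refers to size-1 dimensions; that reading, and the function name, are assumptions rather than Vela's actual check.

```python
def only_adds_or_removes_unit_dims(old_shape, new_shape):
    """Return True if the shapes differ only by dimensions of size 1.

    Hypothetical helper: such a reshape leaves the element order, and so
    the data layout, unchanged.
    """
    def strip_ones(shape):
        return [d for d in shape if d != 1]

    return strip_ones(old_shape) == strip_ones(new_shape)
```

For instance, `[1, 8, 8, 16]` to `[8, 8, 16]` passes, while `[1, 8, 8, 16]` to `[1, 64, 16]` does not, because the latter merges two dimensions.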
|
|
Signed-off-by: Stefan Nannesson <stefan.nannesson@arm.com>
Change-Id: I7ad0b8e5b2431b46b53f51d809ca2642039a0012
|
|
For int16, using LeakyRelu (with bug fix) gives exactly
the same results as Mul+Max if input/output scales are the same.
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
Change-Id: I4f4db464d77b0aaf0d25ddfca534f91d08db548d
|
|
Added --weight-estimation-scaling, which enables
additional scaling of the weight compression scale estimate.
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: Idcda41257f44901d3a3f345341e07fb1ae8585a9
|
|
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: I2cb3f6639e4bb8a984fa3647ee7b4678ed6f5890
|
|
LUT related updates specific for 16K SHRAM:
- prevent LUT DMA transfer from overwriting accumulator SHRAM of an ongoing operation
- do not use the last 2K of SHRAM as accumulator during LUT operations
Change-Id: I17066e0410c6f07b125ed245002d7b19269a7a8a
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
|
|
This commit fixes a bug wherein Split operators
are being erroneously placed on the CPU due to
a 0-dimensional input that disqualifies it from
NPU placement; a restriction introduced in a
recent commit.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: I83c047ddf071d662343087c69bdb2a014dd209c3
|
|
Signed-off-by: Charles Xu <charles.xu@arm.com>
Change-Id: Ida307afc33cd7963bdeb505df400732a3efcc846
|
|
Replaces LeakyRelu operations with a LUT activation function when possible,
otherwise with a combination of multiplication/maximization.
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
Change-Id: I3d2eb2dba7145997c3cc711d0ef18ab355fbb416
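The multiplication/maximization fallback mentioned above rests on a simple identity, shown here in floating point. The sketch assumes the slope `alpha` lies in [0, 1], which the commit message does not state; outside that range `max(x, alpha * x)` no longer equals LeakyRelu. Both function names are illustrative.

```python
def leaky_relu(x, alpha):
    """Reference LeakyRelu: identity for x >= 0, slope alpha for x < 0."""
    return x if x >= 0 else alpha * x


def leaky_relu_mul_max(x, alpha):
    """LeakyRelu expressed as a multiplication followed by a maximum.

    Assumes 0 <= alpha <= 1, so that alpha * x <= x when x >= 0
    and alpha * x >= x when x < 0.
    """
    return max(x, alpha * x)
```

With `alpha = 0.25`, both functions map -8.0 to -2.0 and leave 2.5 unchanged.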
|
|
- Minor cleanup of register command stream generator too
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I0514622402ee9b0557769dd7c7decfddecc87ffa
|
|
- Fixed bug with the supported operator check rejecting operators based
upon an incorrect comparison of the tensor quantisations
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: Ibd0eb50077465d2c515c6ee10394d9b43cdf730c
|
|
Includes a number of changes:
* Handle non-existing optional inputs
* Handle disabled optional inputs (-1 indexed)
* Added unit tests for parsing operators
* Add bias tensor to the different Convolutions + FullyConnected if
it's missing.
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: Ib88d2b610314b1c886fc0aef4f9da87430ce6ae5
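The handling of disabled optional inputs (index -1) listed above can be sketched as below. The function name and the use of `None` as the placeholder are assumptions made for illustration; the commit does not say how Vela represents a disabled input.

```python
def resolve_inputs(input_indices, tensors):
    """Map an operator's tensor indices to tensors.

    Hypothetical helper: an index of -1 marks a disabled optional input
    and is mapped to None rather than looked up.
    """
    return [None if i == -1 else tensors[i] for i in input_indices]
```

The explicit test matters in Python, because indexing a list with -1 would otherwise silently return its last element instead of signalling a missing input.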
|
|
Implemented LUT generation for softmax uint8/int8 to match the
reference.
Change-Id: Ib9acaa295ee1066591e800023d75f364520b44c1
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
|
|
Very small quantization scales, below around 2^-31, would return
negative shift values.
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: I4ca368284c097820f83e5ae53412a08c34516c7f
|
|
- Make it clear that the --permanent-storage option is only valid
for Ethos-U55.
- Removed Shram from allowed values
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: Ice6cacd509713e33bcb380c16dcd3c3b34a82a33
|
|
NHCWB16 is now accounted for in the scheduler's SRAM estimates
for intermediate buffers in ifm streaming.
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: Icda5e05dd3663935f528f1a06d36d9e1de123cc8
|
|
Signed-off-by: Charles Xu <charles.xu@arm.com>
Change-Id: Ia83ab5ba28d193215e3f8fbc52552b0356111723
|
|
There may be cases where, after optimisations, there are no operators
contained within the subgraph. Serialising and writing out the Vela
optimised tflite file would crash in this corner case. This fix makes
Vela write out the empty tflite file instead of crashing.
Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com>
Change-Id: Ia879d1ffdbab21706b15e99aa107fb2d8d4dd3de
|
|
This commit adds an entry in the tflite_mapping.py
for the ROUND operator, which was previously missing.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: I22d6c60969eea6a785366c6741893718ba3cb8ae
|
|
- Removed some of the clutter
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I9a12f681247befd44dbbc9d7fbd135f0603d2fbd
|
|
- Fixed. It only affected operators with striding greater than 1x1
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I129e46586aa16079ddbce3898569676ba9891372
|
|
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: I04f299e2d3319113fedf2fa401b88bae64fea66d
|
|
This commit adds missing entries and options in the
tflite_mapping which should in theory allow every
existing TensorFlow Lite operator to be passed through Vela
without crashing.
Previously, some entries were missing and Vela crashed
with a custom error whenever such an operator was encountered.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: Ia69b7a84164bb57c52ceaf7380160794b7f0d9ee
|
|
Vela often fails when encountering operators that have
inputs or outputs with shape == []. This is only supported
for elementwise ops where the shape is broadcast from IFM2
to IFM1.
This commit adds a restriction which places ops with
shape [] tensors on the CPU except in the special case
of broadcasting for elementwise ops.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: I5b0855233e3b83870209f4da00fb2dbd0184fee0
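The placement rule described above can be expressed as a small predicate. This is a sketch of the rule as worded in the commit message, not Vela's supported-operator check: the function name, its parameters, and the assumption that IFM2 is the second of exactly two inputs are all hypothetical.

```python
def shape_check_allows_npu(input_shapes, output_shapes, is_elementwise=False):
    """Return True if the shape == [] restriction permits NPU placement.

    Hypothetical helper modelling only this one restriction.
    """
    shapes = list(input_shapes) + list(output_shapes)
    if all(len(s) > 0 for s in shapes):
        return True  # no shape [] tensors at all

    # Special case: elementwise op whose scalar second input (IFM2)
    # is broadcast to the first input (IFM1).
    if is_elementwise and len(input_shapes) == 2:
        ifm, ifm2 = input_shapes
        outputs_ok = all(len(s) > 0 for s in output_shapes)
        return len(ifm) > 0 and len(ifm2) == 0 and outputs_ok

    return False
```

Any operator for which this returns False would be left on the CPU.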
|
|
DMA transfer of weights is prevented when the weight
double buffer is not expected to fit in SRAM.
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: I9809dca1d4b335436e1a0b81093640361ada255e
|
|
NHCWB16 is avoided for the input tensor of SplitSliceRead
when any of the consumers has a start offset in the C-dimension
that is not a multiple of 16.
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: I333e2acfbeb02b9c34ee5ea28074baff12ea7b24
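The alignment condition in the SplitSliceRead commit above reduces to a modulo test on each consumer's channel offset. In the sketch below the function name and the constant `CHANNEL_ALIGNMENT` are illustrative; the value 16 comes from the commit message.

```python
CHANNEL_ALIGNMENT = 16  # multiple required by the commit message


def input_can_use_nhcwb16(consumer_c_offsets):
    """Return True if every consumer reads from a 16-aligned channel offset.

    Hypothetical helper: consumer_c_offsets holds the C-dimension start
    offset of each consumer of the tensor.
    """
    return all(offset % CHANNEL_ALIGNMENT == 0 for offset in consumer_c_offsets)
```

A single misaligned consumer, for example one starting at channel 24, is enough to rule the format out for the shared input tensor.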
|
|
Added graph rewrite of Softmax for uint8/int8.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: Iecdd5d2cd3156a601b3313debba4a3562e6be5d7
|