Age | Commit message | Author |
|
- Update to TensorFlow 2.14 and minimum required Python version to 3.9.
- Update version pins on NumPy and FlatBuffers.
- Add constraint to Offset attribute of StridedSlice operator
Change-Id: I8c7122def963202e5f47e92b62be607935ed05cf
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
|
|
The operator mapping for the RANDOM_UNIFORM operator was missing the
seed and seed2 options, which resulted in those options being removed
when the operator was passed through Vela.
Change-Id: I8469c239ec1d20d775c31a52e4954baf159643f2
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
|
|
Markdown's git repository has moved to a different location.
Change-Id: Iae401c1d283d937347cbce546836470647333201
Signed-off-by: Johan Gunnarsson <johan.gunnarsson@arm.com>
|
|
- Fixed a regression where DepthWiseConv used in argmax int64
had the wrong shape.
- The error was introduced when adding support for a new operator
that changed the weight shape for the cast utility function. That
change only worked because reorder_depthwise_weights was called
later. Since argmax is converted after reorder_depthwise_weights
the cast operator in argmax got the wrong shape.
- The fix is to set the correct weight shape in the cast operator
and then mark that the weights have already been transposed correctly.
Change-Id: I61f5694f078cfcaf0d46d43faead6eb7e0a23ade
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
|
|
Update to 23.1.21
Change-Id: I2a9aaa7cbb725c2f417b87577a1f8d6ad4697d76
Signed-off-by: William Isaksson <william.isaksson@arm.com>
|
|
- Added SQUARED_DIFFERENCE support
- Updated SUPPORTED_OPS.md
Change-Id: Id83d9d92129e645390c7979759dfdeff7a14c2ee
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
|
|
Only set the stride to (1, 1) if the kernel, stride and IFM shape are all
equal. Also set the padding to VALID to handle ops with SAME padding.
Signed-off-by: Johan Gunnarsson <johan.gunnarsson@arm.com>
Change-Id: Id3cc34686f09667ea21541fac432351555344e3d
|
|
This fixup is not relevant for Resize ops.
Signed-off-by: Johan Gunnarsson <johan.gunnarsson@arm.com>
Change-Id: I81b9d3c8a6dd820b1e5d747d754100282b93c641
|
|
- Adds 3 ops: Bitcast, BitcastXor, RightShift
Change-Id: Ia9721c69d4f3da0deba7526addb95a9a54e63adf
Signed-off-by: William Isaksson <william.isaksson@arm.com>
|
|
- Support for stride WxH 1x1
- Support for stride WxH 2x1 when the IFM and kernel
are 1D shapes with height 1
- Added test to supported operators
- Updated SUPPORTED_OPS.md
Change-Id: Ic1abead8399a5e14a78d962f8aded0d3b3dbfcc4
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
|
|
Extend the error message of RecursionError when reaching default
recursion depth with instructions to use the "--recursion-limit"
option in Vela.
Change-Id: I5c92d49b99203268c4b988f421afe7013ac3511a
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
|
|
There are networks out there with Pool ops where the filter (W, H) equals
the IFM (W, H) equals the stride (W, H). The stride is technically too large
for the NPU, but we can actually run these ops on the NPU since the
filter is large enough that the window doesn't slide. To support these ops
we need to fix up the stride so that later checks don't put the op on the CPU.
Change-Id: I8f0a46b26fb94ee76c33748589536cc5ba07ea59
Signed-off-by: Johan Gunnarsson <johan.gunnarsson@arm.com>
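The fixup described in this commit can be sketched as follows; the function name and signature are hypothetical, not Vela's actual API:

```python
def fixup_pool_strides(kernel_wh, stride_wh, ifm_wh, padding):
    """Sketch of the stride fixup described above (names are hypothetical).

    If the filter covers the whole IFM and the stride equals both, the
    window never slides, so the op is equivalent to a single window with
    stride (1, 1) and VALID padding.
    """
    if kernel_wh == stride_wh == ifm_wh:
        return (1, 1), "VALID"
    # Otherwise leave the op untouched for the later supported-ops checks.
    return stride_wh, padding
```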
|
|
This conversion is already done in the pass packing stage, but doing it
in the graph optimiser stage is better.
Change-Id: Ib9baa98d115cf88491ce39936972a93467a378ce
Signed-off-by: Johan Gunnarsson <johan.gunnarsson@arm.com>
|
|
- If an NPU op is followed by a convolution op that runs on the CPU,
the optimized file ends up containing a duplicated tensor called _cpu.
Functionally this is not a problem, but the graph will look strange in a
graph viewer.
- This error was introduced when removing duplicate weight
tensors; the above use case was not considered in that patch.
- The fix is to make sure that only the weight and bias tensors are
modified.
Change-Id: I576f13650f1f9d3d50a421ab7100fc8b5ab62657
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
|
|
* Using the serialization_lib main branch to update the statically copied
files, sha 5f920211ac23393a7b98a0d358bfbfc3232d5c8f (v0.80.0)
* All files within ethosu/vela/tosa are copied from that revision
* Note: we hope to move to serialization_lib as a pip module in the future
* Modified the ethosu/vela/{tosa_mapping,tosa_reader}.py to use
v0.80.0 TOSA FlatBuffers implementation
* These are the additional changes made to support this new version,
with changes in the format of the FlatBuffers file and where various
values are stored. Either changing from input to attribute, or
moving to different attributes.
Signed-off-by: Rob Elliott <robert.elliott@arm.com>
Change-Id: I5e1fcc2a9964148619be3477adf1e88e84cbae2d
|
|
- Added release information
- Modified SUPPORTED_OPS.md version info
- Update README.md and classifiers in pyproject.toml to specify Python
3.10 as recommended and tested version
Change-Id: I78e5752846f261d4713b89c8efe447bcb9c095dd
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
|
|
- RSQRT is only defined for positive numbers and
therefore the zeropoint and actual input value
will have an impact
- Clamp the range to avoid crashing. As long as the actual
input is within the valid range everything works. If the input
is not valid, the reference will crash and not generate
any output
Change-Id: I1082b508d9cd85ad4b017e7b786cfff730585172
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
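The clamping idea can be sketched as an int8 LUT builder; this is a hypothetical illustration, not Vela's actual LUT code, and the scale/zero-point parameter names are assumptions:

```python
import numpy as np

def rsqrt_int8_lut(ifm_scale, ifm_zp, ofm_scale, ofm_zp):
    # Hypothetical sketch: RSQRT is only defined for positive values, so each
    # quantized input is clamped to at least one quantization step above the
    # input zero point before computing 1/sqrt(x), avoiding a crash on
    # zero or negative dequantized values.
    lut = np.zeros(256, dtype=np.int8)
    for i, q in enumerate(range(-128, 128)):
        x = max(q - ifm_zp, 1) * ifm_scale      # clamp to smallest positive value
        y = 1.0 / np.sqrt(x)
        lut[i] = int(np.clip(np.round(y / ofm_scale) + ofm_zp, -128, 127))
    return lut
```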
|
|
- Now only converts the array directly if ndim == 0
Signed-off-by: William Isaksson <william.isaksson@arm.com>
Change-Id: Id23e419bc7dd717f9694013180d4609819fd2f56
|
|
- npu_performance now uses write/read shapes instead of the IFMs/OFMs
for memory cycle estimations.
- Also fixes a would-be bug in the tflite_graph_optimiser, where one
read shape was not a Shape4D.
Change-Id: I2067069a713d2cf9e65a5cc227e803de79940fff
Signed-off-by: William Isaksson <william.isaksson@arm.com>
|
|
PAD input tensor shape plus paddings must equal output tensor shape.
Change-Id: Icc5dea9bf6a8f6e1c8402f4d9af4d9796e8ef1aa
Signed-off-by: Johan Gunnarsson <johan.gunnarsson@arm.com>
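The PAD constraint above can be sketched as a shape check; the helper name is hypothetical:

```python
def pad_shapes_consistent(ifm_shape, paddings, ofm_shape):
    # Sketch of the PAD constraint: for each axis, the input size plus the
    # before/after padding amounts must equal the output size.
    return all(
        i + before + after == o
        for i, (before, after), o in zip(ifm_shape, paddings, ofm_shape)
    )
```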
|
|
- Documented High-Level and Register-Level command stream options
- Changed High-Level command stream display to show the name of the
command
- Fixed an issue with some operators not being displayed by the
CLI option --verbose-operators
- Changed an unneeded print in pass packing to a more useful assertion
Change-Id: I9d53f19f4e32d0478209bc964724c27c935f66d6
Signed-off-by: Tim Hall <tim.hall@arm.com>
|
|
- Added Python support information
- Clarified TensorFlow support information
- Updated Requires-Python version to 3.8
Change-Id: Iab38a2f4480e58a1bd36d5055342c4bf7379dd09
Signed-off-by: Tim Hall <tim.hall@arm.com>
|
|
We no longer rewrite a tensor if it is already an output tensor of the current subgraph
Signed-off-by: William Isaksson <william.isaksson@arm.com>
Change-Id: I9cb36d830616a69d35180326437ff53bcaa62d71
|
|
Adds Vela version to description and metadata
Change-Id: I75fccd1a05a396612a249b8ec1662d8cae940ee6
Signed-off-by: William Isaksson <william.isaksson@arm.com>
|
|
- Added support for multiple NPU subgraphs having the same CPU output tensor
Change-Id: I2e787306dd64af9b03cdf2bacb4c9ff7119f6c49
Signed-off-by: William Isaksson <william.isaksson@arm.com>
|
|
Performance estimation now uses the parent_tensor mem_area instead of
the scheduler_op mem_area, because the mem_area is only set on the
parent_tensor by the scheduler.
Signed-off-by: wilisa01 <william.isaksson@arm.com>
Change-Id: I11f73686bfbd6958a8920c5e264a5f95cc3f23d1
|
|
- checks that cmd1 payloads are legal in
register_command_stream_generator,
- adds unit tests
Change-Id: I2bc23147f60fe090c71703f08d9cbaa279fac86e
Signed-off-by: William Isaksson <william.isaksson@arm.com>
|
|
- Updated FlatBuffers files using TensorFlow 2.12.0 schema
- Added a restriction for UnidirectionalSequenceLSTM to have 2D recurrent
weights because the diagonal_recurrent_tensors attribute is not
currently supported.
Change-Id: I104fd1f52485b9b83d644772dbcdeea2d17585f0
Signed-off-by: William Isaksson <william.isaksson@arm.com>
|
|
- Added graph optimiser function to convert convolution groups into
a split followed by separate convolutions and then a concat
- Added semantic check for convolution groups
- Added unit tests for convolution groups semantic checks
- Fixed a minor typing issue with test_constraint_stride_range
Change-Id: I78ade408aa23469a79c9f517c4751da8619b77a9
Signed-off-by: Tim Hall <tim.hall@arm.com>
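The split/convolve/concat rewrite described above can be sketched with NumPy; this is an illustrative 1x1-convolution model of the transformation, not Vela's graph optimiser code, and the helper name is hypothetical:

```python
import numpy as np

def grouped_conv_as_split_concat(ifm, weights, groups):
    # Sketch of the rewrite: the IFM is split along the depth axis, each
    # slice is convolved with its group's weights (modelled here as a 1x1
    # convolution, i.e. a matmul over the channel axis), and the per-group
    # results are concatenated back into a single OFM.
    # ifm: (H, W, C_in), weights: (groups, C_in // groups, C_out // groups)
    ifm_slices = np.split(ifm, groups, axis=-1)             # Split
    outputs = [s @ w for s, w in zip(ifm_slices, weights)]  # per-group Conv2D
    return np.concatenate(outputs, axis=-1)                 # Concat
```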
|
|
If either of the H or W axes has shape 1, the IFM can be reshaped to support
reduction over the depth axis.
Signed-off-by: Alexander Hansson <Alexander.Hansson@arm.com>
Change-Id: I432ff1c399b7cee4ca5f0a8f4461e9c0a936d804
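The reshape trick can be sketched with NumPy for the H == 1 case; the helper is hypothetical and only illustrates why the rewrite is numerically equivalent:

```python
import numpy as np

def mean_over_depth_via_reshape(ifm):
    # Sketch: when H == 1, an HWC tensor (1, W, C) can be viewed as
    # (W, C, 1), which turns a reduction over the depth axis into a
    # reduction over a spatial axis that the mean lowering supports.
    h, w, c = ifm.shape
    assert h == 1, "sketch handles the H == 1 case only"
    return ifm.reshape(w, c, 1).mean(axis=1).ravel()
```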
|
|
- Add support for batch and depth channels when shape is 1
- Refactor reshaping in convert_mean_to_depthwise_conv
Signed-off-by: Alexander Hansson <Alexander.Hansson@arm.com>
Change-Id: If663395934ab58c76ba92b6ebaaf484a389ae699
|
|
* Fix bug in register_command_stream_generator where certain
high-level command streams resulted in missing DMA_WAIT commands
* Add unit-tests for DMA_WAIT and KERNEL_WAIT commands
Signed-off-by: Alexander Hansson <Alexander.Hansson@arm.com>
Change-Id: Iabb3ea3e95fa1ef933c50356d047b6b3f5aeafe3
|
|
- In order to reduce memory usage, the live range mechanism has logic
to check if the ifm tensor can be reused for the ofm tensor for certain
operators
- In this failing test case, the input to the reshape/memcpy operator
has more than one consumer, and this results in a faulty memory overwrite
since the logic that should check the ifm consumers for
the memcpy operator is missing
- The fix is to add the missing check that the ifm can only have one consumer
Change-Id: I2184c0f905b554f648c9732734098509e23b537c
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
|
|
Changes query initialization shapes to Shape4D(0,0,0,0) = [0,0,0,0]
instead of Shape4D(0) = [0,1,1,1]. The [0,1,1,1] tensors would affect
performance estimates and are not real.
Change-Id: Ic83b6f6a70c0c904b500f62756e1e125c99856c6
Signed-off-by: William Isaksson <william.isaksson@arm.com>
|
|
- The problem is that the axis value can be either a scalar or an
array containing a single element
- The solution is to check the length of the shape because the size
attribute returns the same value for both cases
- This did not show up before because pytest warnings were not being
treated as errors
- Removed pre-commit pytest option that caused tests to be searched for
from the root directory
- Updated pyproject.toml pytest options to explicitly specify the test
directories, and to treat warnings as errors
Change-Id: I037054768e5c34f253b6062eadba1c3419ff65e4
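The scalar-vs-array check can be sketched as follows; the helper name is hypothetical:

```python
import numpy as np

def read_axis(axis_tensor):
    # Sketch: a scalar and a one-element array both have size 1, so the
    # length of the shape (ndim) is used to tell them apart instead of
    # indexing unconditionally.
    values = np.asarray(axis_tensor)
    return int(values) if len(values.shape) == 0 else int(values[0])
```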
|
|
* Improve check_cmd functions to return position of the checked commands.
* Update existing unit-tests to validate ordering of commands.
Signed-off-by: Alexander Hansson <Alexander.Hansson@arm.com>
Change-Id: I492487d768e1e80f6ea366e29f2f99441e4f9797
|
|
Add function description and type annotations to the optimization
functions missing them.
Fix type annotation issue when re-assigning variable
value to a different type.
Change-Id: I1ee442ff7a29cc07708fdd013430131eff599dd5
Signed-off-by: Raul Farkas <raul.farkas@arm.com>
|
|
* Convert Means with large IFMs to several DepthwiseConv2DBias and Add
operations.
* Update tflite supported operator check with new height and width
constraints.
* Update unit-tests to verify supported operator changes.
* Fix output-diff for 2D IFMs (MLBEDSW-7772)
Signed-off-by: Alexander Hansson <Alexander.Hansson@arm.com>
Change-Id: Ifae6fb1cdac475ae7dac5116c5f13631ff82108a
|
|
- A crash occurred due to a 'NoneType' object is not subscriptable error when
rewriting a Slice op. The reason was that the Size tensor did
not contain any data.
- Added a constraint pushing the Slice operator to the CPU if
the begin or size tensor is empty.
- Added test to supported operators
- Updated SUPPORTED_OPS.md
Change-Id: Ide204cae24e5871f0e6ae1fdc98ac68d0ce4d3ae
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
|
|
* Convert AvgPool with stride_width > 3 and Valid padding to Conv2D to
optimize it to run on the NPU.
Change-Id: I06ab412357f0b09b1498f9019a9d1963a324ad34
Signed-off-by: Raul Farkas <raul.farkas@arm.com>
|
|
* Fix a bug that caused filter padding to not be added proportionally
to the hardware padding added to the IFM.
* Update needed_total_padding function that calculates hardware padding
to also account for the cases in which IFM width is not divisible by
the stride width.
* Update supported ops constraint on strides for conv2d to mark ops with
stride width > 3 and IFM width that is not divisible by the
optimization resize factor as not supported.
* Update unit tests that verify correct functionality when checking
whether ops are supported or not.
Change-Id: I62f14cca890b779ca787a9603fa37c873ad522f8
Signed-off-by: Raul Farkas <raul.farkas@arm.com>
|
|
Change-Id: I4f466a7bac77d8bb6fa7243ea2e7c9f3be6d0585
Signed-off-by: Raul Farkas <raul.farkas@arm.com>
|
|
Update import of Sized from collections to collections.abc to work with
Python 3.10
Change-Id: Iae281db9402331972ad13660d04523608b23614d
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
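The import fix is a one-liner: Python 3.10 removed the ABC aliases that `collections` had re-exported, so `Sized` must come from `collections.abc`:

```python
# Works on Python 3.3+ and is required on Python 3.10+, where the
# deprecated aliases were removed from the `collections` namespace.
from collections.abc import Sized

# Sized covers anything implementing __len__.
assert isinstance([1, 2, 3], Sized)
```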
|
|
- Added RSQRT int8 support, implemented as LUT.
- Added test to supported operators
- Updated SUPPORTED_OPS.md
Change-Id: I34904772e044be8d22a6dfe426edf85358a205b7
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
|
|
- When optimizing for Size, the scheduler does not try to add weight
buffering to the schedule since this would add extra SRAM usage to
the peak usage. However, for all other ops that use less SRAM than
the peak, there is memory available that could be used for weight
buffering and hence improve performance.
- Removed the limitation to only run schedule optimization when optimizing
for Performance. Regardless of whether optimizing for Performance or Size, the
scheduler flow is the same except that the limit for max SRAM usage is
different.
Change-Id: I6880b35655e37b4916a9c15150f0b8e5126a1cd8
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
|
|
- Cascading was recently enabled for Resize ops. A Resize op is
transformed into several ops. In this case the last op is a
DepthwiseConv2DBias using NEAREST resampling mode. This resampling/
upscaling is not taken into account when calculating the ifm box
size, causing the coordinates to get out of bounds.
- When generating the high level command stream there is a check to
see if an op is a resize op. If this is the case an upscaling factor
is calculated. The fix is to change this check to instead see if the
operator is using NEAREST resampling mode. If that is true, the
scaling factor should be used.
Change-Id: I5308a383cc3310c53004ccfe2d6fabf256478a26
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
|
|
- Added a fix when building the minimum schedule, forcing the stripe
to be even for is_nearest ops. This is required in order to
allow cascading for resize ops.
- Removed the limitation in the cascade builder that prevents resize ops
from being cascaded.
Change-Id: I05150102b91531ecba786936494f1817a4472f42
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
|
|
Add more detailed explanations to verbose options
Change-Id: Ia001e62d4c26ea6ae07949c1c434cbfc1cc7e08a
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
|
|
- Added release information
- Minor changes to SUPPORTED_OPS.md including version info
Change-Id: I91fae4c40c6c1f25b874268b18d077a9babd4875
Signed-off-by: Tim Hall <tim.hall@arm.com>
|
|
half_pixel_center=True
Signed-off-by: Alexander Hansson <Alexander.Hansson@arm.com>
Change-Id: I0e9db22c97a9e2fbfee618262ffc43532cfcee2c
|