Age | Commit message | Author |
|
* Add test to verify that the metadata produced in the PKG-INFO file of
the sdist contains the correctly formatted links extracted from
README.md
Change-Id: I300094470fd115b1143aa8c663837e8a77428f24
Signed-off-by: Raul Farkas <raul.farkas@arm.com>
|
|
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
Change-Id: Iaeb8f2cea0d3b576a6b138e64a882c701ac88ccb
|
|
- Latest reference has changed the implementation of the Mean op
and now contains only one variant.
- Updated the Vela implementation to match the reference. The full sum
is calculated first and then divided by the number of elements.
- Removed the avg pool variant and test case.
- Updated SUPPORTED_OPS.md
Change-Id: I4275e36e3697fa837f119f2cefd7c0ff94231605
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
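A minimal NumPy sketch of the single remaining variant (illustrative,
not Vela's internal code): accumulate the full sum at higher precision,
then divide once by the element count.

```python
import numpy as np

# Illustrative only: full sum first, a single division at the end,
# rather than chaining partial averages.
def mean_full_sum(ifm: np.ndarray, axis: int) -> np.ndarray:
    n = ifm.shape[axis]
    acc = ifm.astype(np.int64).sum(axis=axis)   # full sum, no intermediate rounding
    return np.round(acc / n).astype(ifm.dtype)  # divide by the number of elements
```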
|
|
Added int8 and int16 UNIDIRECTIONAL_SEQUENCE_LSTM support.
The implementation does not include support for:
* CIFG
* Peephole
* Projection
* Normalisation
This change also:
* Removed unused Op.BlockLSTM operation type.
* Removed the only-one-consumer limitation on putting the SplitSliceRead
on the tensor consumer(s), if all consumers fulfill the requirements
* Added Op.VariableTensorWrite as an Operation.memory_function to make
sure writes to variable tensors:
* Always use linear mode
* Are not moved to fast scratch
* Are not fused with other elementwise operation tensor ranges
Change-Id: Ief831738924ac3d1f2ba6d41f10bd6dc969911f3
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
|
|
Refactored move_constant_data in the scheduler. The use case currently
only works for LUT tensors, so the logic has been simplified. In order
to make it work for other tensors, one would also have to take memory
usage into consideration when building cascades, and
use_fast_storage_for_feature_maps would be affected.
Change-Id: Ic8de53b65a2c17d34515002d7f184d0ab1830222
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
|
|
Since the test works by creating an overflow, NumPy is set to ignore
overflow for this test case.
Change-Id: I74d03e8d73455295168352542dcb844283d54d33
Signed-off-by: wilisa01 <william.isaksson@arm.com>
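The mechanism is NumPy's errstate context manager; a hedged sketch of
the pattern (the actual test lives in Vela's test suite):

```python
import numpy as np

# The overflow is intentional here, so NumPy is told to ignore it
# locally instead of emitting a warning for the whole test run.
with np.errstate(over="ignore"):
    wrapped = np.int32(2**31 - 1) + np.int32(1)  # wraps to -2**31, no warning
```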
|
|
Reinstate constraint for stride height to (1,3) instead of (1,4) for
Conv2D and update unit tests.
Change-Id: I17389ee040eeff0cea08279cab1c038e951569ea
Signed-off-by: Raul Farkas <raul.farkas@arm.com>
|
|
- Additional overflow checks are performed when running under
Microsoft Windows compared to Linux. These checks happen when
converting from Python int to NumPy int/uint
- The problem is that the LUT activation values are of type int32,
however they are defined as Python ints. If these are converted to
numpy.int32 it could result in an overflow error
- The fix is to convert these values to uint32 but keep the
operator's IFM tensor type the same (as this will allow them to be
interpreted correctly)
- Fixing this highlighted another problem where convert_to_lut
always calls create_lut_tensor() with an int8 datatype, whereas it
should be using the IFM datatype
Change-Id: I781a9d850f654267aa4a67754438607c4bb95685
Signed-off-by: Tim Hall <tim.hall@arm.com>
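A sketch of the conversion described above (the helper name is
illustrative, not Vela's API): mask the Python int to 32 bits and store
it as uint32 so the conversion cannot overflow, while the IFM tensor
type is left unchanged.

```python
import numpy as np

# Illustrative: reinterpret a signed 32-bit value as uint32 so that
# converting from a Python int cannot raise an overflow error.
def to_lut_word(value: int) -> np.uint32:
    return np.uint32(value & 0xFFFFFFFF)  # mask to 32 bits first

assert to_lut_word(-1) == np.uint32(0xFFFFFFFF)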
|
|
* Extend stride range from (1,3) to (1,4)
* Add stride 4 support when optimising CONV_2D
* Add some tests for various strides
Change-Id: Iddaeb42c4a6e02695ecdd3740bc8b9dd59a7eb3c
Signed-off-by: Raul Farkas <raul.farkas@arm.com>
|
|
- An assert in Vela is triggered when the number of splits does
not evenly divide the input.shape[axis] value, in which case the split
offsets are calculated incorrectly.
- The fix is to add the same constraints as in the reference kernel
and only run the Split op on the NPU when the criteria are fulfilled.
- Modified test to reflect the new constraints
- Updated SUPPORTED_OPS.md
Change-Id: I4103ff4a3fdf9a813f5fcb7f51081b859e611100
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
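The added constraint amounts to an even-divisibility check, sketched
here (the function name is illustrative):

```python
# Illustrative form of the constraint, mirroring the reference kernel:
# the split count must evenly divide the dimension being split.
def split_supported_on_npu(input_shape, axis, num_splits) -> bool:
    return input_shape[axis] % num_splits == 0

assert split_supported_on_npu([1, 6, 4], axis=1, num_splits=3)      # 6 / 3 is exact
assert not split_supported_on_npu([1, 7, 4], axis=1, num_splits=3)  # 7 / 3 is not
```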
|
|
- The issue is due to undefined behaviour when casting a NumPy float
to a NumPy unsigned integer which occurs in create_const_tensor()
- The fix is to make sure that the values are first cast to a Python
float
- In addition, the values datatype argument has been removed from
create_const_tensor() to stop the tensor and values datatypes getting
out of sync
Change-Id: I134b9be8c941b361929a5ae7db8cb35f2e9728f2
Signed-off-by: Tim Hall <tim.hall@arm.com>
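A sketch of the safe conversion path (values are illustrative): route
each value through a Python float before the unsigned conversion,
instead of casting a NumPy float to a NumPy uint directly.

```python
import numpy as np

# Illustrative: casting a NumPy float straight to an unsigned NumPy
# integer is undefined behaviour for some values, so the fix goes via
# a Python float (and int) first, where narrowing is well-defined.
raw = np.float32(200.0)
value = np.uint8(int(float(raw)))
```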
|
|
- Only 1D bias shape is supported
- Modified test to reflect the constraint
- Update SUPPORTED_OPS.md
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: I00ae4b229d5f89512cb94f87f276af61cc66a6fd
|
|
- Update copyright notices to use SPDX format and add OSS mail as contact.
- Update years on files where they had been missed.
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
Change-Id: I7e9715ea4e17b76252728c708e46df12ad67ab1f
|
|
- Added graph optimisation pass to support dilations greater than 2
in either dimension
- Removed supported operators restrictions
- Removed erroneous dilation on TRANSPOSE_CONV
- Updated unit tests and documentation
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: Ide302374b0d5eff25c20501383a63f6aa7625c52
|
|
The reference kernel for the MEAN operator has changed.
As a result, the mean implementation can be simplified
and the constraint for mean int8 can be removed.
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: I318e9b495eefea99e7ac4aea4b8c436c83753405
|
|
- Removed half pixel centers constraint for resize nearest neighbor.
- Added support for 2x, 4x and 8x scaling.
- Removed test_constraint_resize_half_pixel_centers
- Regenerated SUPPORTED_OPS.md
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: Ic3e02e9c2b2034d537c9a9841b8fb4ee433c96dc
|
|
Added unit tests for scaling including saturated multiplier test.
Change-Id: I87bb3a4bed8f62f5ef5cf3851b97f09ce42bf2b6
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
|
|
The test failed since the tanh had batch size > 1.
Added checks for batch size for all supported operators.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: I3570352740c40eb96bd9db965dfa3c91c81ff2ad
|
|
- Added support for Resize Bilinear with half pixel centers for int8 and
uint8.
- Utilizes the new "TILE" padding mode.
- Utilizes ofm stride multipliers and modified tile base offsets to
write OFMs interleaved.
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
Change-Id: I37fa77c022a368f05fda0ead75d8696c9205f833
|
|
Added support for int16 LeakyRelu for negative alpha and alpha
greater than one.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: I7f522ebfe014786d0a1d96172e75c7d9bdd76921
|
|
Changed acc type from int16 to int32. This will solve
saturation problems and the constraint added in
commit "MLBEDSW-5029: Output diff for Mean op"
can be removed.
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: I05ec8835b43313b1a264d61a2b147fa62da123fe
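The effect of the accumulator width is easy to demonstrate with
illustrative numbers:

```python
import numpy as np

# Summing 200 values of 300 gives 60000, which does not fit in int16
# (max 32767) but fits comfortably in int32.
values = np.full(200, 300, dtype=np.int16)
acc16 = values.sum(dtype=np.int16)  # wraps around to -5536
acc32 = values.sum(dtype=np.int32)  # correct: 60000
```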
|
|
Fixed three test cases causing output diff compared to
the reference kernel for the Mean operator.
- If there is a possibility that the accumulator could saturate,
the Mean op must run on the CPU
- Use correct rounding for the bias term
- If a Reshape op is followed by a Mean op, push the Reshape op
to the CPU since this cannot be handled by the NPU
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: I734465730372105821a5e2f73a6a125b9eb7d7f4
|
|
- Changed ResizeBilinear to also support ResizeNearestNeighbor for
1x1 IFM, IFM equal to OFM, and non-align corners
- Added support for ResizeNearestNeighbor with align corners by
converting to a DepthwiseConv
- Updated supported operator unit tests
- Added is_resize() helper function and some associated refactoring
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: Id5bdf2a25e8aa6a4f28b7236250abf768141ce37
|
|
- Fixed align corners support when converting into upscale and average
pool. The problem was due to the wrong ratio of IFM to OFM size, causing
a scaling factor that was not 2x/4x/8x. Works for uint8, int8 and int16.
- Fixed checking of align corners in supported operators check
- Added additional supported operators check for the size tensor
- Updated and added more supported operators unit tests
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: Idb78fa9e76ede2c37e8ac6cb1c322154bd156898
|
|
* Quantise op becomes constant if input is known at compile time
* Quantised values calculated if input of op is const and float
* Const inputs to quant op that are int are requantized
Change-Id: Ic94a72a392af709fe6a640d7dacbb5dc2334f16f
Signed-off-by: Ayaan Masood <Ayaan.Masood@arm.com>
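A minimal sketch of the constant fold (names and the int8 range are
illustrative):

```python
import numpy as np

# Fold a Quantize op whose float input is known at compile time:
# q = clamp(round(v / scale) + zero_point)
def fold_quantize(values, scale, zero_point, qmin=-128, qmax=127):
    q = np.round(np.asarray(values, dtype=np.float64) / scale) + zero_point
    return np.clip(q, qmin, qmax).astype(np.int8)

print(fold_quantize([0.0, 0.5, 1.0], scale=1 / 255, zero_point=-128))
```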
|
|
- For allocations that have a hard memory limit, the Hill Climb
allocator should be given more attempts to find a solution that fits
- The fix is to use a memory limit when there is a hard constraint, and
a minimum iteration count, reset on every improvement, when there is a
soft constraint
- Added a maximum number of iterations CLI option
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I19ff53a0b68412de280263626778a3102cbe52fa
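A hedged sketch of the retry policy (not the real allocator): under a
hard limit, keep iterating until the best solution fits or a maximum is
reached; under a soft constraint, use a minimum iteration budget that
resets whenever the best solution improves.

```python
# Illustrative only; evaluate/mutate stand in for the real allocator moves.
def hill_climb(evaluate, mutate, initial, hard_limit=None,
               min_iterations=500, max_iterations=None):
    best, best_cost = initial, evaluate(initial)
    budget, i = min_iterations, 0
    while max_iterations is None or i < max_iterations:
        if hard_limit is not None:
            if best_cost <= hard_limit:
                break                # hard constraint met, stop early
        elif budget <= 0:
            break                    # soft constraint: budget exhausted
        candidate = mutate(best)
        cost = evaluate(candidate)
        if cost < best_cost:
            best, best_cost = candidate, cost
            budget = min_iterations  # reset the budget on every improvement
        else:
            budget -= 1
        i += 1
    return best
```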
|
|
Removed the constraint on negative alpha values in ReLU
for int8 and uint8.
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: Id7a3a30bf5d1f0a591f990bd04cd0dbbad5819c6
|
|
* Added a generic function which checks if the underlying shape of a
FullyConnected operation is 2D and performs shape reduction
* FullyConnected operations with >2 dimensions now run on the NPU if the
above case is satisfied
* Refactored constraint_fc_output_2d and rewrite_fully_connected_input
* Added a unit test to confirm this functionality
Signed-off-by: Ayaan Masood <Ayaan.Masood@arm.com>
Change-Id: I0e29c767e5b84841eb53bbc44464b36a454f7b38
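The shape reduction described above amounts to folding all leading
dimensions into the batch dimension; a sketch:

```python
from math import prod

# Illustrative: an N-D FullyConnected input [d0, ..., dk-1, dk] behaves
# like the 2-D shape [d0 * ... * dk-1, dk], so it can run on the NPU.
def reduce_fc_shape(shape):
    return list(shape) if len(shape) <= 2 else [prod(shape[:-1]), shape[-1]]

assert reduce_fc_shape([2, 3, 16]) == [6, 16]
```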
|
|
Update version of Black to 22.3.0 due to updated dependencies.
Updates to fix reported issues due to new version.
Signed-off-by: Jonas Ohlsson <jonas.ohlsson@arm.com>
Change-Id: I60056aae452093ce8dcea1f499ecced22b25eef1
|
|
* 1D optimised block_config was incorrectly being set to the
ArchitectureBlockConfig in try_block_config()
* Write external API test for the reduced block height case (on H256)
Signed-off-by: James Ward <james.ward@arm.com>
Change-Id: I9ced7eb31b23730e4423aabbaf769bc72fac8fc9
|
|
Previously we did not check if half_pixel_centers
was set. Since we do not support it, these cases
should not run on the NPU.
Signed-off-by: erik.andersson@arm.com <erik.andersson@arm.com>
Change-Id: I9d2675f760424d5cfb67e5d581dd1861ad165b85
|
|
* fix indices for tflite mapping of EXP operator
* fix indices for tflite mapping of Transpose operator
* ensure read offset after slice is aligned to 16 bytes for NHCWB16 or force linear format
* add unit test to ensure mapping of indices is consistent across TFLite, TOSA and NNG
Signed-off-by: James Ward <james.ward@arm.com>
Change-Id: I17b6e44bc06853325d5eea62a558418ee1ebefe8
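The alignment rule in the third point can be sketched as a simple
predicate (the name is illustrative):

```python
# Illustrative: NHCWB16 stores bricks of 16 channels, so a read offset
# produced by a slice must land on a 16-byte boundary; otherwise the
# tensor is forced to linear format.
def nhcwb16_read_offset_ok(read_offset_bytes: int) -> bool:
    return read_offset_bytes % 16 == 0
```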
|
|
Memory only operators such as Reshape, Squeeze and ExpandDims are
removed in the graph optimiser step.
- Added semantic check that memory only operators have same
quantisation parameters on ifm/ofm.
- Added support for the ExpandDims operator.
- Addition and cleanup of related unit tests.
- Removed TOSA from the generated SUPPORTED_OPS.md documentation.
Signed-off-by: Jonas Ohlsson <jonas.ohlsson@arm.com>
Change-Id: If848d8afc58c18806e10997ed94e4dae83f30879
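The new semantic check boils down to comparing ifm/ofm quantisation; a
sketch (attribute names are illustrative):

```python
# Illustrative: a memory-only op (Reshape, Squeeze, ExpandDims) may only
# be removed if input and output quantisation match; otherwise removing
# it would silently change the represented values.
def can_remove_memory_only_op(ifm_quant, ofm_quant) -> bool:
    return (ifm_quant.scale_f32 == ofm_quant.scale_f32
            and ifm_quant.zero_point == ofm_quant.zero_point)
```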
|
|
Update to handle the case when the Squeeze Op ifm/ofm are the
subgraph ifm/ofm, to facilitate the removal of the Squeeze Op.
Added a NOP to maintain the original tensors.
Updated pytests for squeeze operator.
Signed-off-by: Jonas Ohlsson <jonas.ohlsson@arm.com>
Change-Id: I623cae05e696fb16ccf29dedc42fd822601e9fd9
|
|
- Changed Ethos-U65 AXI port address width from 48 to 40 bits
- Fixed the use of arena_cache_size in mem_type_size() to cover the
arena as well as the cache memory area
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I826462a0cbd0c061cccbc7c83dde446778a2b1ca
|
|
Refactor supported operators by breaking out model semantics
into its own class. Model semantics checked right after model
read.
Signed-off-by: Jonas Ohlsson <jonas.ohlsson@arm.com>
Change-Id: If442b189efcd91dda01af60b2b3adedfacdf2fad
|
|
Remove quant_values attribute from Tensor class.
It only needs a single values attribute, holding either
quantized or unquantized values as appropriate.
Change-Id: Ie96f80ac58061b6077e0f7048dc60209fdfbcafa
Signed-off-by: James Peet <james.peet@arm.com>
|
|
Mapping to internal input indexing has been added to
tflite_reader.py and tosa_reader.py, and the reverse
mapping to tflite_writer.py.
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: I4d8596e747cfa7c4203884c4e785eb1977e2bcc1
|
|
Added basic TOSA support, enabling Vela to read and compile a .tosa
file corresponding to CONV2D + Rescale + Clamp, and to write it to an
optimized .tflite file. The optimized .tflite file will, in this case,
hold a commandstream where the Rescale and Clamp have been fused into
the CONV2D. The optimized .tflite file is not output from Vela.
- Added support to read a .tosa file into Vela's internal structure
- Added tosa_reader.py, tosa_mapper.py and helper files stored under
tosa/
- Support for this is limited to ~10 ops
- Added reader_util.py for functions common to TOSA and TFLite
- Added tosa_graph_optimiser.py
- Added support to fuse Rescale into convolution
- Modified handling of padding
- Added support to fuse Clamp into the previous op
- Added graph_optimiser_util.py
- Moved functions common to TOSA/TFLite graph optimization to this file
- Renamed graph_optimiser.py to tflite_graph_optimiser.py
- Added separate tosa_supported_operators.py
- Added supported_operator_util.py for functions common to TOSA/TFLite
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: Ic3c540504ec8c5eb4771397fdc6882050ecf33ab
|
|
- 256 and 512 configuration variants execute 1D convolutions
in an optimised manner compared to their 2x2 microblock
dimensions. This commit takes this into account to improve
Conv1D throughput on these configurations.
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I6ecdf6e4a219e356327b22f8393f50ee8817af23
|
|
- Merged dev/scheduler at 83639f90e8c828f70de6e29142355a940224959b
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I0050529d4b42da93768c7264296434dd877fb5b4
|
|
When a MEAN operator with a single reduction axis
specifies the axis index attribute as an array with
a single element rather than a scalar index, the
operator is placed on the CPU even though it is
technically supported.
This commit fixes this issue and also adds some new
tests for the axis constraints.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: Ia287f3b9cc80a805e972cd4b2962e52526a8dc16
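The fix can be pictured as normalising the attribute before the support
check (a sketch, not Vela's code):

```python
import numpy as np

# Treat a single-element axis array as the equivalent scalar axis.
def normalise_axis(axis):
    if isinstance(axis, (list, tuple, np.ndarray)) and len(axis) == 1:
        return int(axis[0])
    return axis

assert normalise_axis([1]) == normalise_axis(1) == 1
```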
|
|
The check for whether the non-linear tensor format can be used has
been refactored.
- Flag avoid_NHCWB16 replaced with needs_linear_format
- Restriction checks moved to a single function in the graph optimiser.
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: Iec5c7996a1a6039cad052197f1ae56f7c0290440
|
|
This is a small commit which changes one of
the four MEAN implementations to a simpler
one, using an AvgPool instead of a
DepthwiseConv.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: I9e8af071e8b820796577ee4792b4812a1212602b
|
|
Bug fix in the generation of register command offsets that do not fit
in 32 bits.
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
Change-Id: Iabb99cf6536c0f77b934691f8744df61f1eab3ed
|
|
- Tensor allocation verification was O(N^2), is now closer to O(N)
- Removed a sort in HillClimb allocator
Change-Id: I286a269881490c485cc2b0eeab3b1ecffa8f3df0
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
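The near-linear verification can be sketched as a sort plus a single
pass (ignoring live ranges for brevity):

```python
# Illustrative: instead of comparing every pair of allocations (O(N^2)),
# sort by address and sweep once, tracking the highest end seen so far.
def verify_no_overlap(allocations):
    highest_end = 0
    for addr, size in sorted(allocations):  # (address, size) pairs
        assert addr >= highest_end, "overlapping allocations detected"
        highest_end = addr + size
```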
|
|
Added checks during command stream generation to make sure
that address boundaries are respected.
Change-Id: I4dbc693b42d54e35c8fcc785e8be88059e409eec
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
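A sketch of the kind of check added (names are illustrative):

```python
# Illustrative: every generated access must lie inside its memory region.
def check_address_range(base_addr: int, size: int, region_size: int):
    assert 0 <= base_addr and base_addr + size <= region_size, (
        f"access [{base_addr}, {base_addr + size}) exceeds region of "
        f"{region_size} bytes"
    )
```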
|
|
This commit adds support for emulating the behavior
of the QuantizedMeanOrSum implementation of MEAN in
TensorFlow Lite.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: Ifd24e0e678e2f85cd66ab82deeaaf010d5351b1e
|
|
- Added full support for the PAD operator
- Hardware padding is still used whenever possible
- Bug fix for Pad followed by max pool when the IFM contains negative
values
Change-Id: Ifc64d1943737d94466f5e2821009dab12a49a965
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
|
|
This commit adds support for the MEAN operator,
with some caveats.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: I165cb26cb5aefd68e70d2cfc68291ccf7b778921
|