Age | Commit message (Collapse) | Author |
|
This reverts commit bf31d647dc5df47410ee577b12427ddf076d816b.
Reason for revert: <INSERT REASONING HERE>
Change-Id: I7b6c585b7658f94dbaa916c2b6bfe9fb463b8d37
|
|
Add 4D shape class for op Ifm/ofm shapes
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: Ic0a98da9d2f9d085605e39a9ab5a26bad6e702a3
|
|
Add ifm/ofm shapes to op
Changed to rely on these shapes
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: I571535a1dcadc2bdb04a3c727a8e1c49703b174d
|
|
Use an Enum instead of a bytestring to specify VALID or SAME padding
Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com>
Change-Id: I4e87f8c32b3bfac176d822a68de061e85a558fce
|
|
- We have combined estimates for conv and fc, add the fix back
Change-Id: I49a29c716189b37b387df4b46efab5f4e6125994
Signed-off-by: Diqing Zhong <diqing.zhong@arm.com>
|
|
Pylint W0102:
When a mutable value as list or dictionary is detected in a
default value for an argument.
Replace detected instances with None, and upon checking for None, sets
the default accordingly
Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com>
Change-Id: I4eb73d07d01d4cdefa586eb71b9c76746eee3b11
|
|
Moved blockdep calculation and other helper functions for
code generation to a separate file.
Change-Id: I2f8ccea478654272ebf42217fc5c1800e9ad177a
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
|
|
Blockdep calculation can now handle different sized IFM/OFM.
Change-Id: I898a3c1c3a6778916802f3dbfa658328e5093096
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
|
|
This commit adds a constraint to FullyConnected
ops in supported_operators.py that puts any
such op on the CPU if tensor dimensions of the
output(s) are not 2D.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: I8c898a780b40fc4a1383c09213f0696ea6699b7d
|
|
Added public API function npu_find_block_configs.
Change-Id: Ib0925a62d7c5d19a9b9fbd8d808943c2ea2df02f
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
|
|
- Added API.md that describes the external APIs.
- Renamed npu_get_api_version
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
Change-Id: I6e6e6103a889da656b4e00c3cce3eee60dfa844a
|
|
Added external API to add driver actions to a command stream.
Change-Id: Ie4779c1c745defc5769fa694358470cd6aea191c
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
|
|
All external APIs are now exposed by api.py.
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
Change-Id: I33f480e424692ac30e9c7d791f583199f31164a7
|
|
This reverts commit 15a8e803844b286fe9533e1cf703c76a77b090a8.
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: I64169443f473c9ba42551281ad6ac4b45856f420
|
|
Change-Id: Ifbd6c053ac618bedce0f56fe5c4c647a71d9cc46
Signed-off-by: Andreas Nevalainen <andreas.nevalainen@arm.com>
|
|
- Added sample vela.ini config file
- Changed vela config format, split into system config and memory mode
- Removed unused CPU cycle performance estimation
- Added new CLI options for --memory-mode and --verbose-config
- Changed CLI option --config to take multiple files
- Removed CLI option --global-memory-clock-scales
- Changed error helper functions to raise a VelaError exception
- Refactored to create a new is_spilling_enabled function
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I27c41577e37a3859edb9524cd99784be10ef0a0d
|
|
- Also changed to use Ethos-U where appropriate
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: Ie45ba2bb3935b305abe897b78b498681296cb7c1
|
|
Vela only supports per-channel scaling for
convolution ops. This commit adds a check that
puts ops with per-channel scaling on the CPU.
A caveat worth mentioning is that neither
TensorFlow Lite or TensorFlow Lite Micro support
per-channel scaling for the CPU placed op,
however the problem is moved away from Vela.
This commit also changes a small utility function
in supported_operators.py used for docstring
formatting.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: I9ed090592f1d05dd4566d3e54dba1ef405299383
|
|
Added version to the external API
-Added CLI-option --api_version
-Added API function to get the API version
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: I0143b50adf884a2b05145912a1c7bef8cecc5f02
|
|
Do not convert batched fully connected operators to avoid moving
weights from flash to SRAM.
Change-Id: I873c9ce05377de3f16e4cee9a0863f29d9ec3ad4
Signed-off-by: Andreas Nevalainen <andreas.nevalainen@arm.com>
|
|
Put softmax on CPU if beta < 0
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: I4ec866dd44d14e2737c4cd96474e54bb770bfb3e
|
|
When encountering a sparse string buffer, Vela fails
both due to missing a mapping for a Numpy string type
and also for not being able to read sparse buffers.
The failing line is attempting to reshape a [100]
buffer into a [3, 5] tensor which does not work due
to Vela treating the buffer as non-sparse.
The solution here is to simply not do the reshape
for string buffers (which all appear to be sparse)
since it is not something that will be supported in
the future anyway.
The related operator can then be pushed to the CPU
as expected.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: Iea0af6cd60a691f975209014b6aa098dde8d6a4b
|
|
Added external API to generate register command streams.
Existing code generation has been refactored to make
use of this API.
Change-Id: Ibb4c2b167809869f16470b14da24f08a65c82b7b
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
|
|
This commit removes the constraint on all tensor
shapes matching the OFM shape.
The motivation is that this constraint essentially
only checks that the fixup function has run.
This means that it removes the possibility for the
fixup function to run after the supported operator
check and this effectively means that any
StridedSlice operator that would be placed on the
CPU is still modified by the fixup function.
Because the fixup function is moved to after the
supported operators check, some unreachable cases
are removed from the fixup function.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: I7a82126b7de73bd67873b4e6daf53a6767e33d16
|
|
All existing constraints have now been refactored using the new
framework.
Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com>
Change-Id: Ic9ba0d7040cb9f114b959a949bfdf777f86752c7
|
|
Using a new system to report constraints, replaced existing
functionality for checking conv-like ops.
This new system will allow reporting of all constraints regardless of
any input network.
Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com>
Change-Id: If81177deca2a3b57c9dd9a3a08868cbc9cef0c23
|
|
Keeping the constraint functions consistent with each other
Added specific tensor names in the extra info
Added operator name to the warning generated
This should help easily identify specific problematic nodes in a graph
and give a good enough explanation as to why they are placed on the CPU
Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com>
Change-Id: Ie5bbdd31e5e75fe37e3d8bb8fee1d260080bce83
|
|
This commit changes and amends some parts of the
restriction functions in order to make sure
operators are correctly placed.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: I336cf33a874c9078a5bbf81ce129ff917dbc5e9a
|
|
- op.type is now an enum instead of a string
- Removed unused operator codes
- Refactored some attributes like npu_block_type, fused_activation_function
- Refactored operator index calculation
- Refactored a number of operator sets
Change-Id: I641f65ee375794b7aec42abc0664251ae37d78e8
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
|
|
A new mechanism to report generic restrictions/constraints for
operators has been implemented.
Each check is its own defined function, and has a general reason for
the constraint defined as its docstring.
This allows us to query all reasons up front and report this without
having to run through real data to trigger the checks.
This is part of a larger refactoring and the specific restrictions will
be replaced by a similar mechanism.
Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com>
Change-Id: Id3fb2639f91cfac5fc5b8c14f7620de1a85972b2
|
|
Updated supported operator checks for StridedSlice:
- allow negative indices in begin/end values
- added more checks on shapes
Change-Id: I3ac76bfa6b313f0e2250f0749f152fb0e3aa033c
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
|
|
Attempts to use fast storage for feature maps used in between
cascaded passes.
This is only relevant for system configurations where feature maps
are by default not placed in SRAM, but there is SRAM for fast storage.
Change-Id: I207b7cf32cfcb5bea3e6b93c2da1161c4af5221d
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
|
|
Added a static class TensorAddressMap that stores all Tensor addresses
based on their equivalence_id. Made the "address" field into a property
which getter and setter looks up/sets the tensor's address in
TensorAddressMap.
This makes the references to cpu_tensor/npu_tensor obsolete and they
have been removed.
Addition to scheduler: avoid SRAM spilling if an op has consumers in
other subgraphs.
Minor rework in LUTState; it will now assign a unique equivalence_id to
the SHRAM lut tensor to avoid issues with addressing. The equivalent
checks in LUTState now compares the values of the LUT instead of the the
equivalence_id.
Updated LUT unit tests accordingly.
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: I41de5a8a4e5f07b77d6544d8d4034b754993e503
|
|
Improved unit test coverage of fp_math.py
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: I883fd984a1bfa67102826a400380e41a363fc59d
|
|
Removed CLI-option permanent-storage
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: I03e03205a183bd538292a73a07b095546fa3d95a
|
|
Added the CLI option. Only applies to CPU tensors. Added an
AllocationError which is raised when Allocation fails.
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: I89164dea3ac7b7add7bc40aec2ce8fe50600105d
|
|
Enables LUT for LeakyRelu with int8/uint8 even if input scale
is different from the output scale.
Fusing LUT with a previous operator for this situation
requires further work.
Change-Id: I9eddfe36f457e763d44eb3e05fbe240eac7cfec9
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
|
|
Added checks for not using NHCWB16 for reduce_sum int32 which makes
int8/uint8 softmax work.
Also enabled softmax graph rewrite by default and fixed a saturation
problem.
Change-Id: Ic01bd9ece7e5c3edb2900b7915cc747efe9e5760
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
|
|
Added --weight-estimation-scaling, which enables
additional scaling of weight compression scale estimate.
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: Idcda41257f44901d3a3f345341e07fb1ae8585a9
|
|
Includes a number of changes:
* Handle non-existing optional inputs
* Handle disabled optional inputs (-1 indexed)
* Added unit tests for parsing operators
* Add bias tensor to the different Convolutions + FullyConnected if
it's missing.
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: Ib88d2b610314b1c886fc0aef4f9da87430ce6ae5
|
|
Implemented LUT generation for softmax uint8/int8 to match the
reference.
Change-Id: Ib9acaa295ee1066591e800023d75f364520b44c1
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
|
|
- Support for more than one 256-byte LUT in SHRAM
- No DMA is performed for a LUT that is already located in SHRAM
- Added MemArea.Shram, used for LUT, to avoid false address collision
asserts during SRAM tensor allocation
- Added read access to LUT in memory access calculation
Change-Id: If4d1eded5ed029d253f4f5efb2d80495fc3eac99
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
|
|
*Renamed pack_bias_and_scale to encode_bias to be consumed externally
*added unit test for the API
Change-Id: I71829f3fcb390c475795848f0be3d132d3e158ee
Signed-off-by: Manupa Karunaratne <manupa.karunaratne@arm.com>
|
|
*lint
*added unit tests
*added typecheck
*added docstring for the api
Change-Id: Ibd4bc40d4381ac40ad2ea3d500b26c4ec565ab07
Signed-off-by: Manupa Karunaratne <manupa.karunaratne@arm.com>
|
|
- Fixed custom operator pass through
- Added error printing functions for operators and tensor
- Minor cleanup of custom exception handling
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: Idf295df1e4c544381dc480244d880c32fb285e38
|
|
Added custom exceptions to handle different types of input errors.
Also performed minor formatting changes using flake8/black.
Change-Id: Ie5b05361507d5e569aff045757aec0a4a755ae98
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
|
|
Moved len1_array_to_scalar from a nested function to a staticmethod
of TFLiteSubgraph.
Change-Id: I182f0b70f03070855c1a4478d26644892c1ebb15
Signed-off-by: Diego Russo <diego.russo@arm.com>
|
|
Added unit tests for LiveRange.
Change-Id: I4d4a16e7ec215fa39fa1be3dda3be22b4632689c
Signed-off-by: Diego Russo <diego.russo@arm.com>
|