Age | Commit message | Author |
|
PAD followed by max/average pool is run on NPU if NPU
padding can be used. Average pool is converted to depthwise.
Change-Id: Icc3652e6d9ecff5ac3dc7d92080313d90c245404
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
|
|
- Straight port of the C++ implementation to Python.
- Renamed the allocator from "Search" to "HillClimb"
Change-Id: I50797d541f326d0264daf79bf7866aef32350a60
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
|
|
-Removed ConcatSliceWrite from the optimised graph.
It is always executed as avgpool, which is equivalent to
the behaviour before the patch.
-Added copy op to enable removal of more reshapes.
Subgraph inputs/outputs need to remain. When both the input and
output of a Reshape are subgraph inputs/outputs, a copy op must
be inserted in order to remove the reshape.
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: Id7be9966673ae34499e8518a5544104493fe326b
|
|
Added supported operator check that 32-bit fused activation functions
are not supported.
Change-Id: I01fdafeff8fdb13c71eae4f63be7e6f81b9223df
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
|
|
- Added checks for unsupported pad sizes in PAD operator
- Bug fix for right pad/bottom pad calculation when replacing the PAD
operator by hardware padding
Change-Id: Ib84be711277d987052f14352ab386e0e0b774987
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
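The right pad/bottom pad handling can be illustrated with a small sketch. It assumes the usual TensorFlow Lite PAD layout for an NHWC tensor, where the paddings input holds one [before, after] row per dimension. The function name and the (top, left, bottom, right) return order are hypothetical and are not taken from the Vela sources.

```python
from typing import Optional, Sequence, Tuple


def explicit_padding_from_pad_op(
    paddings: Sequence[Sequence[int]],
) -> Optional[Tuple[int, int, int, int]]:
    """Return (top, left, bottom, right) for an NHWC PAD, or None if unsupported.

    Hypothetical helper: only padding along H and W is treated as replaceable.
    """
    if len(paddings) != 4:
        return None
    batch, height, width, channel = (tuple(row) for row in paddings)
    # Padding the batch or channel dimension cannot be expressed as spatial padding.
    if batch != (0, 0) or channel != (0, 0):
        return None
    top, bottom = height  # row 1: [before, after] along H
    left, right = width   # row 2: [before, after] along W
    return (top, left, bottom, right)
```

Note that bottom and right come from the second column of the H and W rows respectively; mixing up the two columns is an easy mistake to make in this kind of conversion.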
|
|
Change-Id: If49abc31f093f1bd3393bee86f821fd35972086f
Signed-off-by: Diqing Zhong <diqing.zhong@arm.com>
|
|
When FC input is fixed by changing ifm_shape,
avoid_NHCWB16 must be set to ifm.
-Fixed issue with ResizeBilinear
-Changed to post order for concat ops in graph optimisation
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: Ie0c6a86637c210c0833ae9b2f8e7c494c5d4f66e
|
|
-Removed reshapes in the original graph
-Removed the addition of reshapes to the
optimized graph
-Reshapes with different ifm/ofm quantisation will remain
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: I94862be53dac0d7434815e2aee5ca678228495f8
|
|
Fixed a bug where PAD having no consumers would result in a crash.
Now the constraint doesn't crash and thus the intended error message is shown, resulting in easier debugging.
Change-Id: I1e4403d47a6152e7adbf7bc065db86d4217d39cc
Signed-off-by: erik.andersson@arm.com <erik.andersson@arm.com>
|
|
- Added operator check that OFM scale > smallest float32 number
- Generalized the restriction that IFM/OFM scale must not be infinite
Change-Id: I918f5ea3d8fdec6e8f6bd6780ed13a19d1234ed6
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
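A minimal sketch of such a scale check follows. It assumes that "smallest float32 number" means the smallest positive normal float32 value (2^-126); the commit message does not say whether normal or subnormal is meant, and the function name is hypothetical.

```python
import math

# Smallest positive *normal* float32 value (assumption: this is the intended bound).
FLOAT32_TINY = 2.0 ** -126


def scale_is_usable(scale: float) -> bool:
    """Return True if a quantisation scale is finite and above the float32 bound."""
    # math.isfinite rejects both infinities and NaN.
    return math.isfinite(scale) and scale > FLOAT32_TINY
```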
|
|
Added handling of input tensors with constant string data.
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
Change-Id: Ieb5164a9d56d580ad08ea834bf2cbb7288cd9539
|
|
Constraints and unit tests were added to check the new pad operator.
Change-Id: Id6d4cf2c4da486928c8f46ba1fa124eec66895a6
Signed-off-by: erik.andersson@arm.com <erik.andersson@arm.com>
|
|
Replaces the PAD operator by hardware padding when possible.
Change-Id: I9dce0885e51a4a73715824d7368637222e39b2b3
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
|
|
This reverts commit df0a5905177f3a1b836076bc3f9f39b2e86f1794.
Reason for revert: <INSERT REASONING HERE>
Change-Id: I891c66fb29db9d25e942947e8d1c29a10610de51
|
|
This reverts commit bf31d647dc5df47410ee577b12427ddf076d816b.
Reason for revert: <INSERT REASONING HERE>
Change-Id: I7b6c585b7658f94dbaa916c2b6bfe9fb463b8d37
|
|
Add 4D shape class for op Ifm/ofm shapes
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: Ic0a98da9d2f9d085605e39a9ab5a26bad6e702a3
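As an illustration of what a 4D shape class can look like, here is a minimal sketch. The class name, field names, and the choice to left-pad lower-rank shapes with 1s are assumptions made for the example, not a description of the class added by this commit.

```python
from typing import NamedTuple, Sequence


class Shape4D(NamedTuple):
    """Hypothetical NHWC shape with named dimensions."""

    batch: int
    height: int
    width: int
    depth: int

    @classmethod
    def from_list(cls, shape: Sequence[int]) -> "Shape4D":
        # Left-pad lower-rank shapes with 1s so every shape has four dimensions.
        if not 1 <= len(shape) <= 4:
            raise ValueError(f"expected 1 to 4 dimensions, got {len(shape)}")
        padded = [1] * (4 - len(shape)) + list(shape)
        return cls(*padded)
```

Named fields replace positional indexing such as `shape[-1]`, which makes code that relies on IFM/OFM shapes easier to read.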
|
|
Add ifm/ofm shapes to op
Changed to rely on these shapes
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: I571535a1dcadc2bdb04a3c727a8e1c49703b174d
|
|
Use an Enum instead of a bytestring to specify VALID or SAME padding
Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com>
Change-Id: I4e87f8c32b3bfac176d822a68de061e85a558fce
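The idea of replacing a bytestring with an Enum can be sketched as follows. The names `Padding` and `parse_padding` are hypothetical and only illustrate the pattern.

```python
from enum import Enum


class Padding(Enum):
    """Hypothetical padding enum; member names mirror the legacy bytestrings."""

    SAME = 0
    VALID = 1


def parse_padding(raw: bytes) -> Padding:
    # Convert a legacy bytestring such as b"SAME" into the enum member.
    # An unknown value raises KeyError instead of silently mismatching later.
    return Padding[raw.decode("ascii")]
```

Comparisons then become `padding == Padding.SAME` rather than `padding == b"SAME"`, so a misspelt value fails at lookup time instead of quietly evaluating to False.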
|
|
- We have combined estimates for conv and fc, add the fix back
Change-Id: I49a29c716189b37b387df4b46efab5f4e6125994
Signed-off-by: Diqing Zhong <diqing.zhong@arm.com>
|
|
Pylint W0102:
Raised when a mutable value such as a list or dictionary is detected as
the default value for an argument.
Replaced detected instances with None; the function then checks for
None and sets the default accordingly
Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com>
Change-Id: I4eb73d07d01d4cdefa586eb71b9c76746eee3b11
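The pattern behind this fix can be shown with a short, generic sketch. The function names are illustrative only and do not come from the changed code.

```python
from typing import List, Optional


def append_buggy(item: int, bucket: List[int] = []) -> List[int]:
    # The default list is created once, when the function is defined,
    # so every call that omits `bucket` mutates the same shared object.
    bucket.append(item)
    return bucket


def append_fixed(item: int, bucket: Optional[List[int]] = None) -> List[int]:
    # None acts as a sentinel; a fresh list is created on each call.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket
```

Calling `append_buggy` twice without a second argument accumulates items across calls, whereas `append_fixed` starts from an empty list each time.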
|
|
Moved blockdep calculation and other helper functions for
code generation to a separate file.
Change-Id: I2f8ccea478654272ebf42217fc5c1800e9ad177a
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
|
|
Blockdep calculation can now handle different sized IFM/OFM.
Change-Id: I898a3c1c3a6778916802f3dbfa658328e5093096
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
|
|
This commit adds a constraint to FullyConnected
ops in supported_operators.py that puts any
such op on the CPU if tensor dimensions of the
output(s) are not 2D.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: I8c898a780b40fc4a1383c09213f0696ea6699b7d
|
|
Added public API function npu_find_block_configs.
Change-Id: Ib0925a62d7c5d19a9b9fbd8d808943c2ea2df02f
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
|
|
- Added API.md that describes the external APIs.
- Renamed npu_get_api_version
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
Change-Id: I6e6e6103a889da656b4e00c3cce3eee60dfa844a
|
|
Added external API to add driver actions to a command stream.
Change-Id: Ie4779c1c745defc5769fa694358470cd6aea191c
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
|
|
All external APIs are now exposed by api.py.
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
Change-Id: I33f480e424692ac30e9c7d791f583199f31164a7
|
|
This reverts commit 15a8e803844b286fe9533e1cf703c76a77b090a8.
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: I64169443f473c9ba42551281ad6ac4b45856f420
|
|
Change-Id: Ifbd6c053ac618bedce0f56fe5c4c647a71d9cc46
Signed-off-by: Andreas Nevalainen <andreas.nevalainen@arm.com>
|
|
- Added sample vela.ini config file
- Changed vela config format, split into system config and memory mode
- Removed unused CPU cycle performance estimation
- Added new CLI options for --memory-mode and --verbose-config
- Changed CLI option --config to take multiple files
- Removed CLI option --global-memory-clock-scales
- Changed error helper functions to raise a VelaError exception
- Refactored to create a new is_spilling_enabled function
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I27c41577e37a3859edb9524cd99784be10ef0a0d
|
|
- Also changed to use Ethos-U where appropriate
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: Ie45ba2bb3935b305abe897b78b498681296cb7c1
|
|
Vela only supports per-channel scaling for
convolution ops. This commit adds a check that
puts other ops with per-channel scaling on the CPU.
A caveat worth mentioning is that neither
TensorFlow Lite nor TensorFlow Lite Micro supports
per-channel scaling for the CPU-placed op;
however, the problem is moved away from Vela.
This commit also changes a small utility function
in supported_operators.py used for docstring
formatting.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: I9ed090592f1d05dd4566d3e54dba1ef405299383
|
|
Added version to the external API
-Added CLI-option --api_version
-Added API function to get the API version
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: I0143b50adf884a2b05145912a1c7bef8cecc5f02
|
|
Do not convert batched fully connected operators to avoid moving
weights from flash to SRAM.
Change-Id: I873c9ce05377de3f16e4cee9a0863f29d9ec3ad4
Signed-off-by: Andreas Nevalainen <andreas.nevalainen@arm.com>
|
|
Put softmax on CPU if beta < 0
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: I4ec866dd44d14e2737c4cd96474e54bb770bfb3e
|
|
When encountering a sparse string buffer, Vela fails
both because it lacks a mapping for a NumPy string type
and because it cannot read sparse buffers.
The failing line is attempting to reshape a [100]
buffer into a [3, 5] tensor which does not work due
to Vela treating the buffer as non-sparse.
The solution here is to simply not do the reshape
for string buffers (which all appear to be sparse)
since it is not something that will be supported in
the future anyway.
The related operator can then be pushed to the CPU
as expected.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: Iea0af6cd60a691f975209014b6aa098dde8d6a4b
|
|
Added external API to generate register command streams.
Existing code generation has been refactored to make
use of this API.
Change-Id: Ibb4c2b167809869f16470b14da24f08a65c82b7b
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
|
|
This commit removes the constraint on all tensor
shapes matching the OFM shape.
The motivation is that this constraint essentially
only checks that the fixup function has run.
As a result, the constraint prevents the fixup
function from running after the supported operator
check, which effectively means that any
StridedSlice operator that would be placed on the
CPU is still modified by the fixup function.
Because the fixup function is moved to after the
supported operators check, some unreachable cases
are removed from the fixup function.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: I7a82126b7de73bd67873b4e6daf53a6767e33d16
|
|
All existing constraints have now been refactored using the new
framework.
Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com>
Change-Id: Ic9ba0d7040cb9f114b959a949bfdf777f86752c7
|
|
Using a new system to report constraints, replaced existing
functionality for checking conv-like ops.
This new system will allow reporting of all constraints regardless of
any input network.
Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com>
Change-Id: If81177deca2a3b57c9dd9a3a08868cbc9cef0c23
|
|
Keeping the constraint functions consistent with each other.
Added specific tensor names in the extra info.
Added operator name to the warning generated.
This should help easily identify specific problematic nodes in a graph
and give a good enough explanation as to why they are placed on the CPU.
Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com>
Change-Id: Ie5bbdd31e5e75fe37e3d8bb8fee1d260080bce83
|
|
This commit changes and amends some parts of the
restriction functions in order to make sure
operators are correctly placed.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: I336cf33a874c9078a5bbf81ce129ff917dbc5e9a
|
|
- op.type is now an enum instead of a string
- Removed unused operator codes
- Refactored some attributes like npu_block_type, fused_activation_function
- Refactored operator index calculation
- Refactored a number of operator sets
Change-Id: I641f65ee375794b7aec42abc0664251ae37d78e8
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
|
|
A new mechanism to report generic restrictions/constraints for
operators has been implemented.
Each check is its own defined function, and has a general reason for
the constraint defined as its docstring.
This allows us to query all reasons up front and report this without
having to run through real data to trigger the checks.
This is part of a larger refactoring and the specific restrictions will
be replaced by a similar mechanism.
Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com>
Change-Id: Id3fb2639f91cfac5fc5b8c14f7620de1a85972b2
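The mechanism described here, one function per check with the general reason as its docstring, can be sketched as below. All names (`Op`, `constraint_rank`, `CONSTRAINTS`, and so on) and the two example rules are hypothetical and only illustrate the approach.

```python
from typing import Callable, List, NamedTuple, Tuple


class Op(NamedTuple):
    """Hypothetical stand-in for an operator."""

    name: str
    ifm_shape: Tuple[int, ...]


def constraint_rank(op: Op) -> Tuple[bool, str]:
    "IFM must have at most 4 dimensions"
    return len(op.ifm_shape) <= 4, f"Op '{op.name}' has IFM shape {op.ifm_shape}"


def constraint_positive_dims(op: Op) -> Tuple[bool, str]:
    "All IFM dimensions must be greater than zero"
    return all(d > 0 for d in op.ifm_shape), f"Op '{op.name}' has IFM shape {op.ifm_shape}"


CONSTRAINTS: List[Callable[[Op], Tuple[bool, str]]] = [
    constraint_rank,
    constraint_positive_dims,
]


def all_reasons() -> List[str]:
    # The reasons come from the docstrings, so no operator data is needed.
    return [c.__doc__ for c in CONSTRAINTS]


def failed_constraints(op: Op) -> List[Tuple[str, str]]:
    # Run every check and collect (general reason, op-specific detail) pairs.
    failures = []
    for constraint in CONSTRAINTS:
        valid, extra = constraint(op)
        if not valid:
            failures.append((constraint.__doc__, extra))
    return failures
```

Because the general reason lives in the docstring and the op-specific detail in the returned string, the full list of rules can be printed up front, while a real graph only supplies the extra detail.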
|
|
Updated supported operator checks for StridedSlice:
- allow negative indices in begin/end values
- added more checks on shapes
Change-Id: I3ac76bfa6b313f0e2250f0749f152fb0e3aa033c
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
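Allowing negative begin/end values typically means normalising them against the dimension size before any further checks. The sketch below assumes Python-style semantics (a negative index counts from the end) and a positive stride; the function name is hypothetical.

```python
def normalise_index(index: int, dim: int) -> int:
    """Map a possibly negative begin/end value into the range [0, dim]."""
    if index < 0:
        index += dim  # negative values count from the end of the dimension
    # Clamp so that out-of-range values still describe a valid slice bound.
    return max(0, min(index, dim))
```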
|
|
Attempts to use fast storage for feature maps used in between
cascaded passes.
This is only relevant for system configurations where feature maps
are by default not placed in SRAM, but there is SRAM for fast storage.
Change-Id: I207b7cf32cfcb5bea3e6b93c2da1161c4af5221d
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
|
|
Added a static class TensorAddressMap that stores all Tensor addresses
based on their equivalence_id. Made the "address" field into a property
whose getter and setter look up/set the tensor's address in
TensorAddressMap.
This makes the references to cpu_tensor/npu_tensor obsolete and they
have been removed.
Addition to scheduler: avoid SRAM spilling if an op has consumers in
other subgraphs.
Minor rework in LUTState; it will now assign a unique equivalence_id to
the SHRAM LUT tensor to avoid issues with addressing. The equivalence
checks in LUTState now compare the values of the LUT instead of the
equivalence_id.
Updated LUT unit tests accordingly.
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: I41de5a8a4e5f07b77d6544d8d4034b754993e503
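The address-as-property idea can be sketched as follows. The names TensorAddressMap, equivalence_id, and address come from the commit message; the method names, the simplified Tensor class, and the dictionary layout are assumptions made for the example.

```python
from typing import Dict, Optional


class TensorAddressMap:
    """Static map from equivalence_id to address (simplified sketch)."""

    _addresses: Dict[int, int] = {}

    @classmethod
    def get_address_for_tens(cls, equivalence_id: int) -> Optional[int]:
        return cls._addresses.get(equivalence_id)

    @classmethod
    def set_address_for_tens(cls, equivalence_id: int, address: int) -> None:
        cls._addresses[equivalence_id] = address


class Tensor:
    """Hypothetical, stripped-down tensor with only the fields needed here."""

    def __init__(self, name: str, equivalence_id: int) -> None:
        self.name = name
        self.equivalence_id = equivalence_id

    @property
    def address(self) -> Optional[int]:
        # Look the address up in the shared map instead of storing it per tensor.
        return TensorAddressMap.get_address_for_tens(self.equivalence_id)

    @address.setter
    def address(self, value: int) -> None:
        TensorAddressMap.set_address_for_tens(self.equivalence_id, value)
```

Two tensors that share an equivalence_id then always report the same address, which is why separate cpu_tensor/npu_tensor cross-references become unnecessary.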
|
|
Improved unit test coverage of fp_math.py
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: I883fd984a1bfa67102826a400380e41a363fc59d
|
|
Removed CLI-option permanent-storage
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: I03e03205a183bd538292a73a07b095546fa3d95a
|
|
Added the CLI option. Only applies to CPU tensors. Added an
AllocationError which is raised when Allocation fails.
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: I89164dea3ac7b7add7bc40aec2ce8fe50600105d
|