Age | Commit message | Author |
|
- Vela failed to compile networks with multiple subgraphs because
only cascaded passes in the root subgraph were used when
extracting the live ranges. The fix is to also extract the subgraph
live ranges on Ops that have connected subgraphs.
- The tf_writer did not handle multiple subgraphs correctly,
resulting in corrupt buffer data in the optimized tflite file. The buffer
index must be unique for every tensor.
- Added support for multiple subgraphs in the OfflineMemoryAllocation
metadata. This does not change behavior for single graphs.
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: I2328dfc1f07e2e4faf43a75423ea95423096ffa3
|
|
If an elementwise op is part of a cascade, the ifm cannot
be overwritten by the ofm.
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: I1e5f7ee501be17e76684b33c6e86ab8af0f3e61f
|
|
Uses separate tensors for the individual weight buffers
in case of weight double buffering.
Each weight buffer tensor gets its own individual live range.
This patch is a clone of a previously reverted patch, but with some
additional bug fixes applied.
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
Change-Id: I868c70d15821eb9f1399186f2da6e7345f6ee343
|
|
This reverts commit cc5f4de1c35ba44fca7ff6295c6ae846f8242344.
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I0fa5babfe9ad9ec668720d04fe1c16d9a9092131
|
|
Updated Black to version 22.3.0 due to updated dependencies.
Fixed the issues reported by the new version.
Signed-off-by: Jonas Ohlsson <jonas.ohlsson@arm.com>
Change-Id: I60056aae452093ce8dcea1f499ecced22b25eef1
|
|
Uses separate tensors for the individual weight buffers
in case of weight double buffering.
Each weight buffer tensor gets its own individual live range.
Change-Id: I724a8c61a7045615fbd2ed9535663076ac8edd13
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
|
|
Ported the improved spilling behaviour from Regor
into Vela. This replaces use_fast_storage_for_feature_maps
with allocate_feature_maps and introduces the class called
FastStorageComponentAllocator.
Signed-off-by: erik.andersson@arm.com <erik.andersson@arm.com>
Change-Id: I34785840c905a79750a62863773015b00fb43387
|
|
This change will allow the subgraph's input tensor
to be reused/overwritten by the output from an elementwise op
if there is only one consumer attached to the input tensor.
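
The condition described above can be sketched as a small predicate. This is an illustration only, not Vela's implementation: `SketchTensor`, `consumers` and `can_reuse_input_for_output` are hypothetical names introduced here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SketchTensor:
    # Hypothetical stand-in for a tensor; only the fields needed here.
    name: str
    consumers: List[str] = field(default_factory=list)  # names of consuming ops


def can_reuse_input_for_output(ifm: SketchTensor, is_elementwise: bool) -> bool:
    # The input may be overwritten only by an elementwise op, and only
    # when that op is the single consumer of the input tensor.
    return is_elementwise and len(ifm.consumers) == 1
```

With one consumer the input buffer is a candidate for reuse; as soon as a second op reads the same tensor, overwriting it would corrupt that op's input.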
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: I317188af11a5470614770e18dc8973462fd5f21c
|
|
Added checks to avoid merging elementwise op live ranges for subgraph
inputs and outputs, which sometimes caused problems when parts of the
network run on CPU.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: Id07ab277a205b8550d19a276559f8904b9a4b4be
|
|
- Fixed index error in memory_snapshot
- When removing a cascade, its references are now also removed
Change-Id: I2b35dc52671d8ce115eb32bfdd93584391d1fc6d
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
|
|
- Fixed typo with not using ifm.mem_type
- Fixed bug with using ifm1 properties when only ifm2 is a potential match
- Removed restriction on not considering SHL and SHR for overlap
- Removed some dead reshape code
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: Id9bcc3c2b3ee9ac7b6276187d3e2f513b4acd4b5
|
|
Reinstated allowing the IFM and OFM tensor to overlap for Elementwise
operations.
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: Ide6db7781f3ca7a36c8ff9e3efdc7943a7bf6d7f
|
|
- Deepspeech reuses identical weights and biases throughout
the network. Since biases are now interleaved with weights
there is a scaling issue when the ifm scales differ between
operations using the same weight and scale tensor.
- This commit uses interleaved weights/scales on their first use
but separates scales to source memory on subsequent use (if
the ifm scale is different).
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I7aae163438160a919cae04e235966e75355a6148
|
|
- Merged dev/scheduler at 83639f90e8c828f70de6e29142355a940224959b
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I0050529d4b42da93768c7264296434dd877fb5b4
|
|
- Tensor allocation verification was O(N^2), is now closer to O(N)
- Removed a sort in HillClimb allocator
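
As a rough illustration of why such a check need not compare every pair, the sketch below contrasts a pairwise check with a sweep over tensors sorted by start time. It is an assumption about the general technique, not the code in this commit; `Allocation`, `verify_pairwise` and `verify_sweep` are hypothetical names, and the sweep still pays for one sort.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Allocation:
    name: str
    start_time: int  # first step at which the tensor is live
    end_time: int    # last step at which the tensor is live (inclusive)
    address: int
    size: int


def _mem_overlap(a: Allocation, b: Allocation) -> bool:
    return a.address < b.address + b.size and b.address < a.address + a.size


def verify_pairwise(allocs: List[Allocation]) -> bool:
    # O(N^2) reference: every pair is compared.
    for i, a in enumerate(allocs):
        for b in allocs[i + 1:]:
            time_overlap = a.start_time <= b.end_time and b.start_time <= a.end_time
            if time_overlap and _mem_overlap(a, b):
                return False
    return True


def verify_sweep(allocs: List[Allocation]) -> bool:
    # Visit tensors in order of start time and compare each one only
    # against the tensors that are still live at that point.
    live: List[Allocation] = []
    for cur in sorted(allocs, key=lambda a: a.start_time):
        live = [a for a in live if a.end_time >= cur.start_time]
        if any(_mem_overlap(a, cur) for a in live):
            return False
        live.append(cur)
    return True
```

When only a few tensors are live at any one time, the inner comparison in `verify_sweep` stays small, so the total work grows roughly with the number of tensors rather than with its square.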
Change-Id: I286a269881490c485cc2b0eeab3b1ecffa8f3df0
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
|
|
Add ifm/ofm shapes to op
Changed to rely on these shapes
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: I571535a1dcadc2bdb04a3c727a8e1c49703b174d
|
|
Pylint W0102:
Raised when a mutable value such as a list or dictionary is used as
the default value for an argument.
Replaced the detected instances with None; the function then checks for
None and sets the default accordingly.
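
A minimal sketch of the pattern, using hypothetical function names rather than code from this change:

```python
def append_bad(item, bucket=[]):
    # The default list is created once, when the function is defined,
    # so every call that omits `bucket` shares the same list (W0102).
    bucket.append(item)
    return bucket


def append_good(item, bucket=None):
    # None is an immutable sentinel; a fresh list is created per call.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket
```

Calling `append_bad` twice without a `bucket` returns a list that still holds the item from the first call, whereas `append_good` starts from an empty list each time.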
Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com>
Change-Id: I4eb73d07d01d4cdefa586eb71b9c76746eee3b11
|
|
- Removed unused --show-minimum-possible-allocation
- Renamed --allocation-alignment to --cpu-tensor-alignment
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I00e367c3190aeea08a3f136332711e9accc85ba3
|
|
Removed the CLI opt ifm-ofm-overlap
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: I23faa0d10c3e71972c543e22e8155086fce73556
|
|
Enable overlap of elementwise input/output
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: I6e6f11953319c843c8203bf038f96778df194332
|
|
- op.type is now an enum instead of a string
- Removed unused operator codes
- Refactored some attributes like npu_block_type, fused_activation_function
- Refactored operator index calculation
- Refactored a number of operator sets
Change-Id: I641f65ee375794b7aec42abc0664251ae37d78e8
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
|
|
Added a static class TensorAddressMap that stores all Tensor addresses
based on their equivalence_id. Made the "address" field into a property
whose getter and setter look up and set the tensor's address in
TensorAddressMap.
This makes the references to cpu_tensor/npu_tensor obsolete and they
have been removed.
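
The idea can be sketched as follows. Only the names TensorAddressMap, equivalence_id and address come from the message above; the method names, the dictionary layout and the `SketchTensor` class are assumptions made for illustration.

```python
import uuid


class TensorAddressMap:
    # Class-level store shared by all tensors, keyed by equivalence_id.
    address_map = {}

    @classmethod
    def get_address(cls, equivalence_id):
        # Returns None if no address has been assigned yet.
        return cls.address_map.get(equivalence_id)

    @classmethod
    def set_address(cls, equivalence_id, address):
        cls.address_map[equivalence_id] = address


class SketchTensor:
    def __init__(self, name, equivalence_id=None):
        self.name = name
        # Tensors created without an id get a unique one of their own.
        self.equivalence_id = equivalence_id if equivalence_id is not None else uuid.uuid4()

    @property
    def address(self):
        return TensorAddressMap.get_address(self.equivalence_id)

    @address.setter
    def address(self, value):
        TensorAddressMap.set_address(self.equivalence_id, value)
```

Because the address lives in the shared map rather than on the object, two tensor objects with the same equivalence_id always report the same address, which is what removes the need for explicit cross-references between copies.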
Addition to scheduler: avoid SRAM spilling if an op has consumers in
other subgraphs.
Minor rework in LUTState; it will now assign a unique equivalence_id to
the SHRAM lut tensor to avoid issues with addressing. The equivalence
checks in LUTState now compare the values of the LUT instead of the
equivalence_id.
Updated LUT unit tests accordingly.
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: I41de5a8a4e5f07b77d6544d8d4034b754993e503
|
|
Added the CLI option. Only applies to CPU tensors. Added an
AllocationError which is raised when allocation fails.
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: I89164dea3ac7b7add7bc40aec2ce8fe50600105d
|
|
Signed-off-by: Charles Xu <charles.xu@arm.com>
Change-Id: I53d9d56acee57cff208dccb4822c1f1a461c416d
|
|
Additional supported memory configurations:
- Permanent_storage = DRAM
- Tensor arena either in DRAM or SRAM
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: I20beb7151e306bfdba540e7c0b2a7b478b4d94e1
|
|
Signed-off-by: Charles Xu <charles.xu@arm.com>
Change-Id: Ia7127148d00280bf9c3759dd6dcbe500a4cfcc78
|
|
Also updated README.md
Change-Id: I118309c61f4d00e8508d6b888c606995490fba39
Signed-off-by: Diego Russo <diego.russo@arm.com>
|
|
Use the pre-commit framework [2] to run black and flake8 before the commit.
black and flake8 are managed by the pre-commit framework and they can be
run manually by the user using the `pre-commit run` command.
Fix the code base with the help of black and flake8.
Fix import statements according to PEP8 guidelines [1].
Both tools have the following settings (specified in the pre-commit
configuration file):
* line length: 120 characters
* directories to exclude: ethosu/vela/tflite/ and ethosu/vela/ethos_u55_regs
Updated README.md on how to install pre-commit and how to run sanity checks.
Pipenv files have been updated including new dependencies for pre-commit.
[1]: https://www.python.org/dev/peps/pep-0008/#imports
[2]: https://github.com/pre-commit/pre-commit
Change-Id: I304d9fffdf019d390ffa396a529c8a7c2437f63d
Signed-off-by: Diego Russo <diego.russo@arm.com>
|
|
- Added modules ethosu.vela and ethosu.mlw_codec.
- Added README and various configuration files.
Change-Id: I3690f8c8f5966306ecddaeb2793c30ca9c6e2eee
|