Age | Commit message (Collapse) | Author |
|
- Added sample vela.ini config file
- Changed vela config format, split into system config and memory mode
- Removed unused CPU cycle performance estimation
- Added new CLI options for --memory-mode and --verbose-config
- Changed CLI option --config to take multiple files
- Removed CLI option --global-memory-clock-scales
- Changed error helper functions to raise a VelaError exception
- Refactored to create a new is_spilling_enabled function
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I27c41577e37a3859edb9524cd99784be10ef0a0d
|
|
- Also changed to use Ethos-U where appropriate
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: Ie45ba2bb3935b305abe897b78b498681296cb7c1
|
|
Added external API to generate register command streams.
Existing code generation has been refactored to make
use of this API.
Change-Id: Ibb4c2b167809869f16470b14da24f08a65c82b7b
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
|
|
- Added mechanism to track input to output graph transforms for
debugging the resultant command stream.
- Provides base implementation for MLBEDSW-2661
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I2dfe8a409fbde7ad0282bfab5acb11ba1c8b82d8
|
|
For IFM streamed cascades bias tensors are read several times.
Moves these tensors to fast storage and add DMA commands.
Change-Id: I630f6275986c1b5e3f126c925b11e22500fb1128
Signed-off-by: Andreas Nevalainen <andreas.nevalainen@arm.com>
|
|
- Normalise kernel availability by requiring all operators offer a kernel
describing how much data they consume from the source, per OFM element,
regardless of whether kernels are relevant to the operation.
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: Idbcff64879fc2eccf292b6208a7d2038eb388017
|
|
Signed-off-by: Andreas Nevalainen <andreas.nevalainen@arm.com>
Change-Id: Ie404a0c13e7c7de0eff649f77e0147a0f3d73acd
|
|
- op.type is now an enum instead of a string
- Removed unused operator codes
- Refactored some attributes like npu_block_type, fused_activation_function
- Refactored operator index calculation
- Refactored a number of operator sets
Change-Id: I641f65ee375794b7aec42abc0664251ae37d78e8
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
|
|
Fixed crash in networks with 5D tensors.
Fixed crash for (int32) tensors without quantization.
Added validity checks for concatenation.
Moved unfusing of activation function from tflite_reader to graph_optimiser.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: Ib9ba8891dc95ef5491e15d0feedef44331a26393
|
|
Added support to convert batched FC to conv.
This enables choosing a suitable block-size.
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: Idc49e4fb6d29c554f10a38ece7996a7b7795ffad
|
|
Allows fusing of LUT with a preceding operator regardless of
input/output scale.
Change-Id: Ia378adbb3fe61d71299feb085f7313377e0efa39
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
|
|
- Corrected the rounding mode for softmax
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: If136491c7668e85fba1e2c56c8cff11aa32db328
|
|
Fixed a zero point issue for int32 ifm.
Change-Id: I9149cb24d5b030ea5216a028a113518e458a8d15
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
|
|
Enables LUT for LeakyRelu with int8/uint8 even if input scale
is different from the output scale.
Fusing LUT with a previous operator for this situation
requires further work.
Change-Id: I9eddfe36f457e763d44eb3e05fbe240eac7cfec9
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
|
|
For int16, using LeakyRelu (with bug fix) gives exactly
the same results as Mul+Max if input/output scales are the same.
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
Change-Id: I4f4db464d77b0aaf0d25ddfca534f91d08db548d
|
|
Signed-off-by: Charles Xu <charles.xu@arm.com>
Change-Id: Ida307afc33cd7963bdeb505df400732a3efcc846
|
|
Implemented LUT generation for softmax uint8/int8 to match the
reference.
Change-Id: Ib9acaa295ee1066591e800023d75f364520b44c1
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
|
|
Signed-off-by: Charles Xu <charles.xu@arm.com>
Change-Id: Ia83ab5ba28d193215e3f8fbc52552b0356111723
|
|
Added graph rewrite of Softmax for uint8/int8.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: Iecdd5d2cd3156a601b3313debba4a3562e6be5d7
|
|
- This commit removes unnecessary dependency checks and implements
on-demand calculation of the NPU/DMA dependencies.
Signed-off-by: <tim.hall@arm.com>
Change-Id: I85e681d1ab133bd88f64296dc00500f3c188e777
|
|
- Support for more than one 256-byte LUT in SHRAM
- No DMA is performed for a LUT that is already located in SHRAM
- Added MemArea.Shram, used for LUT, to avoid false address collision
asserts during SRAM tensor allocation
- Added read access to LUT in memory access calculation
Change-Id: If4d1eded5ed029d253f4f5efb2d80495fc3eac99
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
|
|
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: Id762ee2c03cd8f162cd0c450511ee5b2e0624586
|
|
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: I5b8db6430e79ec7a5836d8dd00a03413647de8ba
|
|
For binary elementwise ops with broadcasting in first IFM.
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: I25af67be8d3a852247989bc3ddc8e08e946f6bfa
|
|
Added graph rewrite of Softmax for int16.
Change-Id: Id7885af6056a23e8b8362fb61ae94283251eb398
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
|
|
Signed-off-by: Charles Xu <charles.xu@arm.com>
Change-Id: I44428d77b2e8e44a477e5c4dfe28ab8dd1792838
|
|
A newer version of numpy gives a deprecation warning. This patch
resolves the deprecation warning so the user should never see it clutter
their output.
Tested on numpy version 1.19.0
Change-Id: I0c468818de4a2e5e2fcb109c45f51b2f1801b7b5
Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com>
|
|
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: Iaf4d7ab9c32b0d783072c5f131a61bfebe77cc16
|
|
Automatically generated, no functional changes.
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: Ia6a791f7dbadc352bc8a7b528afa070e8540b4d0
|
|
- Parallelism mode register was being written for non Yoda targets.
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I31b50031dab4d615733c4c3790dec8934117f275
|
|
- If blockdepth or core count resulted in empty or non-existent substreams, the
command generator generated an error. This commit changes the command stream
generator to only program cores that have streams and are enabled for the
configuration.
Change-Id: I4e724b19de14d3a12e886ec6b17d0038593dfb59
Signed-off-by: Tim Hall <tim.hall@arm.com>
|
|
- Multicore weight and scale stream interleaving for
multicore hardware architecture.
Change-Id: Ic82850463391c629d90d08c26cf0c48dd438286d
Signed-off-by: Tim Hall <tim.hall@arm.com>
|
|
Additional supported memory configurations:
-Permanent_storage = DRAM
-Tensor arena either in DRAM or SRAM
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: I20beb7151e306bfdba540e7c0b2a7b478b4d94e1
|
|
Signed-off-by: Charles Xu <charles.xu@arm.com>
Change-Id: I78f475f9837a7c11f01b2693b17efe1a7c6481cc
|
|
Bug fix in the generation of the NPU_SET_IFM2_SCALAR parameter.
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
Change-Id: Ie261a90dcfa61ed269d27a100eb48c58af8a325d
|
|
Change-Id: Ie6d8d6de9f3447f19ba06aafa9fa480fc96a973b
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
|
|
- Dilation added to SET_KERNEL_STRIDE instruction
- Kernel height/width adjusted for dilation
- Updated padding calculation
- Updated weight compression
Change-Id: I0c8190223e223b039a305aba0f37896ae1de2b80
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
|
|
This commit fixes a bug where there would be an off-by-one error
in some cases for ResizeBilinear.
It is resolved by treating it the same way as an AvgPool in
regards to setting the zero point.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: I2835d5dcf360f65e19265c339e5ffd02de16c823
|
|
Fixed scaling for int16 tanh/sigmoid to match the reference.
Change-Id: I3110298b7e8638a82cc05bedc03de389dec27898
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
|
|
- 5 step rnnoise was failing due to secondary tensors
not being checked for operator dependency. This commit
adds ifm2 comparisons to the dependency check.
Change-Id: I629c8a70997481efb7f596d8b77512d3419eaab4
Signed-off-by: Tim Hall <tim.hall@arm.com>
|
|
There was output diff when both IFMs are referring to the same tensor in
binary elementwise operations. IFM2 dimension-instructions were not written
by vela.
Change-Id: I40a0dcbc9557f7308222b7230e5586d8f2a04c6a
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
|
|
Extend IFM to full dimension for the performance
metrics calculation.
Change-Id: Iae923e37280ab0f22b7a272f28970973a5142534
Signed-off-by: Charles Xu <charles.xu@arm.com>
|
|
Also updated README.md
Change-Id: I118309c61f4d00e8508d6b888c606995490fba39
Signed-off-by: Diego Russo <diego.russo@arm.com>
|
|
This patch adds support for the ResizeBilinear operator.
It is implemented using a 2x2 Nearest Neighbor upscale
followed by a 2x2 Average Pool.
Depending on the argument align_corners
the output is either of shape:
- (2 * M, 2 * N) when align_corners == True, or
- (2 * M - 1, 2 * N - 1) when align_corners == False
where (M, N) is the input shape.
The padding mode is SAME when align_corners == True
and VALID when align_corners == False.
The argument half_pixel_centers is out of scope and is
as of now ignored.
Note that only upscaling by a factor of 2 is supported.
Change-Id: Ia6d6d010c4f1bb13f5f839bc8d16872a626d9a3b
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
|
|
Use pre-commit framework [1] to run black and flake8 before the commit.
black and flake8 are managed by the pre-commit framework and they can be
run manually by the user using `pre-commit run` command.
Fix the code base with the help of black and flake8.
Fix import statements according to PEP8 guidelines [1]
Both tools have the following settings (specified in the pre-commit
configuration file):
* line length: 120 characters
* directory to exclude: ethosu/vela/tflite/ and ethosu/vela/ethos_u55_regs
Updated README.md on how to install pre-commit and how to run sanity checks.
Pipenv files have been updated including new dependencies for pre-commit.
[1]: https://www.python.org/dev/peps/pep-0008/#imports
[2]: https://github.com/pre-commit/pre-commit
Change-Id: I304d9fffdf019d390ffa396a529c8a7c2437f63d
Signed-off-by: Diego Russo <diego.russo@arm.com>
|
|
This patch adds support for strides of size 3.
It removes some obsolete code for a corner case that
no longer exists.
It also changes the setting of the bitfield in
NPU_SET_KERNEL_STRIDE so that it matches the specification.
Change-Id: I7dabcf72b7826ca0b3c98e9d23209027204079a8
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
|
|
In order to support constant IFM and IFM2, i.e. predefined inputs placed
in Flash, the REGION commands had to be updated to be emitted for every
op. They are emitted based on the 'mem_area' field of the Tensor.
Change-Id: I434e8efc915af4119fa2ce37a05240a151593141
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
|
|
Enabled int16 support quantization to match the reference.
Change-Id: Ib369640241a9a491f2b0bc52d7f6cb025e30344b
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
|
|
Change-Id: I897bea10ae744162fd285838ee2b2c018695a278
(cherry picked from commit d5ac9b55faa899ac686433e79900cadd321b71bf)
Signed-off-by: Diqing Zhong <diqing.zhong@arm.com>
|
|
- Added modules ethosu.vela and ethosu.mlw_codec.
- Added README and various configuration files.
Change-Id: I3690f8c8f5966306ecddaeb2793c30ca9c6e2eee
|