- Changed comments to docstring on QuantizationParams
- Simplified op type to op name conversion
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I2fdf5922cc17944c9bd37917a85fdfe50a1e651d
|
|
- Added optional name attributes to operators and tensors
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I3b5d881a7b1043a6ba4b58fff5d7532b271ba536
|
|
Update version of Black to 22.3.0 due to updated dependencies.
Updated the code to fix issues reported by the new version.
Signed-off-by: Jonas Ohlsson <jonas.ohlsson@arm.com>
Change-Id: I60056aae452093ce8dcea1f499ecced22b25eef1
|
|
Uses separate tensors for the individual weight buffers
in case of weight double buffering.
Each weight buffer tensor gets its own individual live range.
Change-Id: I724a8c61a7045615fbd2ed9535663076ac8edd13
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
|
|
Added a mechanism that reduces the risk of getting stuck
if the current best allocation cannot be improved by only
swapping 2 indices.
Change-Id: Ife379757752f0c1ed54af7bd826e0a9390d54267
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
|
|
Added checks in the cascade builder to ensure that scheduled operations
are in the correct order.
Change-Id: Ic1765a6a1cb8335ff222bfe3b2d2e642980967d7
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
|
|
- Fixed a bug due to ResizeBilinear modifying the attributes of a
shared IFM
- The ifm_resampling_mode is now an attribute of an operator rather
than a tensor
- Changed all calls to try_block_config() to use the attribute rather
than recalculating it in multiple places
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I4641e9cd6b049bd4186776d98e3e751c5e5bcc06
|
|
Add mypy to pre-commit and clean up all reported errors.
Signed-off-by: Jonas Ohlsson <jonas.ohlsson@arm.com>
Change-Id: If7dc869f5fecdb0e2db40f14e7d9db21aa33df71
|
|
- The number of accumulators is doubled in an Ethos-U configuration with
2 cores
- Likewise, for elementwise, depthwise and pooling operations
the IFM buffer depth capacity is doubled
- FindBlock: step the search space depth in multiples of ublock * ncores
Change-Id: I923cc347a2f252876d405ed93095d39181103f81
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
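The FindBlock change can be illustrated with a small sketch; the helper name and signature are hypothetical, not Vela's actual API. The idea is that candidate block depths are stepped in multiples of the micro-block depth times the number of cores:

```python
def depth_candidates(max_depth: int, ublock_depth: int, ncores: int) -> list:
    # Step the search-space depth in multiples of ublock * ncores so that
    # each core is given whole micro-blocks to work on (illustrative only)
    step = ublock_depth * ncores
    return list(range(step, max_depth + 1, step))
```

For example, with an 8-deep micro-block on a 2-core configuration, only depths 16, 32, ... are tried.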
|
|
Added check that horizontal padding is unaffected when applying
graph optimization "optimise_strided_conv".
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
Change-Id: I7032a44163e300cdf62cf615b4b10a1417e38eaa
|
|
Fast storage allocator did not always return an optimal
allocation.
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
Change-Id: Ic758b6c4a82dc2633c4752b0c204a27ed36f651b
|
|
Fix bug when storing the encoded NPU weight UUID in the
NPU performance estimation.
Signed-off-by: Jonas Ohlsson <jonas.ohlsson@arm.com>
Change-Id: I92127b0020f12352d923c0c9aa2b6f47e6110764
|
|
- Extend ifm/ofm dimensions explicitly in mean op
This fixes a bug when the ifm/ofm shapes have different numbers of
dimensions, e.g. IFM=1x19x18x25 axis=2 OFM=1x19x25:
the ofm_shape should be 1x19x1x25, not 1x1x19x25
- Fix wrong weight shape
Change-Id: I269eb71ea56c09deee2aa6c6433d9b2baa98a113
Signed-off-by: Diqing Zhong <diqing.zhong@arm.com>
|
|
- Corrected rounding error
- Number of elements depends on ofm format
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: I568d660b7571b6e0ffb131211b3a89c8be4b9295
|
|
Update the version of flake8 used in pre-commit to facilitate
adding mypy to pre-commit.
Signed-off-by: Jonas Ohlsson <jonas.ohlsson@arm.com>
Change-Id: I457dec87b77487ca6f14ff4a679c4cc927b272b0
|
|
- Bump minor release version and add release notes
- Update README and SUPPORTED_OPS versions
Change-Id: Ic14d028483c12d281e69515b25f66346d9a3afeb
Signed-off-by: James Peet <james.peet@arm.com>
Signed-off-by: Tim Hall <tim.hall@arm.com>
|
|
- Updated the Memory Modes section in OPTIONS.md
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: Ibfd3d2d6e1bf4a070d2af705878a5cc49381ce29
|
|
- The bug is that TransposeConv does not support explicit padding,
which is needed in order to combine it with a preceding Pad op
- The fix is to exclude such combinations
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: Ide03d034dc32b5fc9bcaaf291ab713482223a042
|
|
*Corrected a calculation in which use of the
_estimate_memory_transfer_efficiency function when calculating the
scaled bandwidth for LUT transfers resulted in a divide-by-zero error.
Change-Id: I2356e924d9ca2f315ca1988f465f58b13a8fa4c9
Signed-off-by: Ayaan Masood <Ayaan.Masood@arm.com>
|
|
*Original weights and encoded NPU weights now report the correct size
instead of zero when running vela with the --verbose-weights flag
(the code to update the aforementioned attributes was missing)
*Removed print references to unencoded NPU weight size
Change-Id: I6d3e41c04cc46d24eeb54cab89818a35e5df27be
Signed-off-by: Ayaan Masood <Ayaan.Masood@arm.com>
|
|
Reduce memory footprint when using optimization strategy Size
for elementwise operations.
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: I30380aed587c31adbf7615f74179b4c5da686773
|
|
Signed-off-by: James Peet <james.peet@arm.com>
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I4c9acb04a9df2181829e3a98aab840f32ae6458e
|
|
Updated constraints affect:
- Constant tensors
- MEAN operations
- RESIZE_BILINEAR operations
Signed-off-by: James Peet <james.peet@arm.com>
Change-Id: I2a041fa2300a9ba6da048cc61e164f34897b2f50
|
|
- Combine two MEAN operator checks for single axis averages into one
- Only apply that check if the single axis is the height dimension
(previously checks were also applied to width averages)
- Rephrase some MEAN operator constraint descriptions
Signed-off-by: James Peet <james.peet@arm.com>
Change-Id: Ie0577f2b99aba1f3d6a4c39f8934eafe3813b736
|
|
Make sure the output from a subgraph is write protected and
not overwritten by an elementwise op.
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: Ie26979913843c62794c5346a315b7089206850e0
|
|
Change required python version from 3.6 to 3.8 in setup.py and allow
any python3 version for black pre-commit linting.
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
Change-Id: I0d8936d92efd5137561834c0de1a3449f9e5f25c
|
|
Fixed a problem when the ofm is produced by different NPU nodes by
making sure that the output is always in NHWC format.
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: I00e55c989d5860499fbaf4f4318661b17b4bda7e
|
|
Ported the improved spilling behaviour from Regor
into Vela. This replaces use_fast_storage_for_feature_maps
with allocate_feature_maps and introduces the class called
FastStorageComponentAllocator.
Signed-off-by: erik.andersson@arm.com <erik.andersson@arm.com>
Change-Id: I34785840c905a79750a62863773015b00fb43387
|
|
This change will allow the subgraph's input tensor
to be reused/overwritten by the output from an elementwise op
if there is only one consumer attached to the input tensor.
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: I317188af11a5470614770e18dc8973462fd5f21c
|
|
The root cause of this diff is precision errors caused by rounding
several times when performing a resize bilinear upscaling to more than
twice the initial size. This is solved by rewriting the algorithm to
perform nearest neighbour upscaling to the correct size and then
applying one larger average pool instead of several 2x2 pools. Avgpool
with padding is limited to kernel size 8x8, which constrains the
largest possible bilinear upscaling to 8 times the input size.
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
Change-Id: I846232f309ba26aab6c385e593cbe25b646c6668
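The rewritten algorithm can be sketched in NumPy. This is a simplified illustration of the idea, not Vela's implementation (the real pass emits NPU ops, and the padding and rounding details differ): upscale once with nearest neighbour to the final size, then apply a single factor x factor average pool, so rounding happens only once rather than at every chained 2x2 pool.

```python
import numpy as np

def nn_upscale(ifm: np.ndarray, factor: int) -> np.ndarray:
    # Nearest-neighbour upscale: repeat every pixel `factor` times per axis
    return ifm.repeat(factor, axis=0).repeat(factor, axis=1)

def avg_pool_stride1(x: np.ndarray, k: int) -> np.ndarray:
    # k x k average pool with stride 1; edge padding keeps the output size
    # (assumed here for simplicity; the real padding scheme differs)
    pad = k // 2
    xp = np.pad(x, pad, mode="edge")
    h, w = x.shape
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = xp[i : i + k, j : j + k].mean()
    return out

def upscale_nn_then_avgpool(ifm: np.ndarray, factor: int) -> np.ndarray:
    # One NN upscale to the final size followed by a single average pool,
    # instead of chaining several 2x2 passes that each introduce rounding
    return avg_pool_stride1(nn_upscale(ifm, factor), factor)
```

The 8x8 kernel limit on padded average pool then directly bounds `factor` to 8.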
|
|
- Issue was due to a previous patch to fix MLBEDSW-5582
- Reverted the fix for MLBEDSW-5582
(commit 849ff81f82c10a68898e5101930b92372bec5565)
- Made a new fix for MLBEDSW-5582 that enforces that
output tensors from NPU graphs are in NHWC format.
This information is otherwise lost when
parts of a concatenation are placed in different custom operators,
resulting in a mismatch between NHWC and NHCWB16.
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: Iab3ba29d348353c854f357836e6aa7c338ae1572
|
|
This reverts commit 3c25ff658cf847ccfcb2d4d5796ffbe13a511894.
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I5403bf4c53dced1075160313876fa8681eaa617f
|
|
Only the first half of the weight double buffers was used
on dual core configurations, causing degraded performance.
Change-Id: I49972c00343bbffbae28ed11c645e993ed61d43f
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
|
|
- This bug was due to an interaction between multiple Ethos-U custom
operators and concatenation of constant tensors
- It resulted in different parts of the concatenation being placed in
different custom operators
- The fix involves placing all parts of the concatenation into the
same custom operator by switching to a breadth-first search in pass
packing
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: Ic47613cfd7bf675b4674dc91d6f9765849ba3130
|
|
Update the flatbuffers generated code to comply with TensorFlow 2.7
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
Change-Id: Iff29b05a6e145245861329b4ff9fc9fbd968da53
|
|
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
Change-Id: I288decbc0affa7f9475cabbd16cd20005e15e2a2
|
|
By not comparing items that have already been compared with
each other, the number of iterations for the loop is reduced.
For large networks with long live ranges, this improves compile
time significantly.
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: I298cd6f109527fc32f6db77ffffca9e765a84ce0
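The optimisation can be sketched as follows; the class and the overlap test are illustrative assumptions, not Vela's actual live-range API. The point is to visit each unordered pair exactly once (n*(n-1)/2 comparisons) instead of checking every ordered pair (n*n):

```python
from itertools import combinations

class LiveRange:
    # Minimal stand-in for a live range: [start, end] inclusive
    def __init__(self, name: str, start: int, end: int):
        self.name, self.start, self.end = name, start, end

def overlapping_pairs(live_ranges):
    # combinations() yields each unordered pair exactly once, so items
    # that have already been compared with each other are never revisited
    pairs = []
    for a, b in combinations(live_ranges, 2):
        if a.start <= b.end and b.start <= a.end:
            pairs.append((a.name, b.name))
    return pairs
```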
|
|
The output diff is caused by not including the kernel dilation when
calculating the bottom padding to be used on the last h_stripe. This
only shows up when using dedicated_sram since shared_sram does not split
into multiple h_stripes and thus uses the padding specified by the skirt
instead.
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
Change-Id: I7f643748b153004d65be2124c0ac6c9d21cd803f
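The fix comes down to using the kernel's effective (dilated) size when computing the padding. A minimal sketch, assuming TFLite-style SAME padding for one dimension (illustrative, not Vela's exact code):

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    # Dilation spreads the kernel taps apart; this is the span they cover
    return (kernel_size - 1) * dilation + 1

def needed_total_padding(input_size: int, stride: int, filter_size: int) -> int:
    # TFLite-style SAME padding for one dimension; pass the *effective*
    # kernel size here, otherwise the bottom padding of the last h_stripe
    # is underestimated whenever dilation > 1
    out_size = (input_size + stride - 1) // stride
    return max((out_size - 1) * stride + filter_size - input_size, 0)
```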
|
|
Signed-off-by: Jonny Svärd <jonny.svaerd@arm.com>
Change-Id: Ib398024c2f41beb4f93f7976c678a9fd54af94a5
|
|
Fixed a crash caused by loading a network containing
operators with empty constant tensors.
This could occur when a branched network is split
before said branches have converged.
We now put the affected operator on the CPU.
Signed-off-by: erik.andersson@arm.com <erik.andersson@arm.com>
Change-Id: I63e9cd13cecf86d976c5750c727e218c334c32b5
|
|
When an LUT tensor address is updated with another existing LUT tensor
address, also make sure to update the equivalence id.
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: I5ce8c608d9ff6d31e16212b1a725b4147dd3f6f1
|
|
- This bug causes a regression in the use of unpack and split operators
- The bug is due to the read_shapes attribute being an absolute calculation
for slice and strided_slice, but a relative one for unpack and split
- The fix is to consistently treat the attribute as a shape relative to the
read_offset
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I4504b161be507ea22ca6ee40fbe7808bfe049405
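The consistent interpretation can be sketched with a hypothetical helper (Vela's real code operates on its own shape types): read_shape is treated as a shape relative to read_offset for every op type, so the end coordinate of a read is always offset plus shape.

```python
def read_end(read_offset, read_shape):
    # read_shape is relative to read_offset for every op type, so the
    # end coordinate of the read region is simply offset + shape
    return [o + s for o, s in zip(read_offset, read_shape)]
```

For example, a split reading 2 rows starting at row 2 of a 4-row dimension ends at row 4.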
|
|
- This bug causes an exception to occur when trying to index split
shape in Box.transform_with_strides_and_skirt()
- The bug was due to the read shapes not being initialised when creating
a primary op in pass packing
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I3ebd7fc4c7ef5c06488a36d8340a17ae6afd4609
|
|
Added the section "Bug Resolution", which links the user
to BUGS.md.
Signed-off-by: erik.andersson@arm.com <erik.andersson@arm.com>
Change-Id: I36cd5f1317bf050ee91abf33df70dae627e83175
|
|
- Issue was due to a previous patch to fix MLBEDSW-4350
- Manually reverted that fix
(commit 5fabfcaa2b636b02899b4d6e0ccf95d853986475)
- Made a new fix for MLBEDSW-4350 that calculates the padding and
skirt by taking into account the split read offsets and shapes
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I96010c1b977011aecbc411a3c91ab3e61af22db4
|
|
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
Change-Id: I87dc5963972a7ef91db467b2ff8e0261e9899372
|
|
Fixed issue with sigmoid int16 with 1/2048 scaling.
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: I32718757e3776e6be89fe94a9b38368c78f0006b
|
|
This commit updates the release notes for Vela
version 3.2.0.
It also updates the SUPPORTED_OPS.md file with new
constraints.
Updated the API version as a result of the bug fix
commit 399c4a2d77df791e5d988c51d7fb1824ac4f266f.
Updated Vela version in setup.py.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: I181e89f639a1da6013e8511ebe2d7e4f81242916
|
|
This commit corrects some errors and clarifies
the section on cycle counts.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: If1198cb797ffdb2bd23b4a9624cf480a30aacaf6
|
|
Created "BUGS.md", which explains to the Vela community
how to file bug reports using the Maniphest Bug Tracker.
Also added a reference to it in "README.md".
Signed-off-by: erik.andersson@arm.com <erik.andersson@arm.com>
Change-Id: I0120a890c8447907e32de6b10a24eceade09df7d
|