- Bump minor release version and add release notes
- Update README and SUPPORTED_OPS versions
Change-Id: Ic14d028483c12d281e69515b25f66346d9a3afeb
Signed-off-by: James Peet <james.peet@arm.com>
Signed-off-by: Tim Hall <tim.hall@arm.com>
- Updated the Memory Modes section in OPTIONS.md
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: Ibfd3d2d6e1bf4a070d2af705878a5cc49381ce29
- The bug is that TransposeConv does not support explicit padding,
which is needed in order to combine it with a preceding Pad op
- The fix is to exclude such a combination
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: Ide03d034dc32b5fc9bcaaf291ab713482223a042
*Corrected a calculation in which use of the
_estimate_memory_transfer_efficiency function, when calculating the
scaled bandwidth for LUT transfers, resulted in a divide-by-zero error.
Change-Id: I2356e924d9ca2f315ca1988f465f58b13a8fa4c9
Signed-off-by: Ayaan Masood <Ayaan.Masood@arm.com>
*Original weights and encoded NPU weights now report the correct size
instead of zero when running Vela with the --verbose-weights flag
(the code to update the aforementioned attributes was missing)
*Removed print references to the unencoded NPU weight size
Change-Id: I6d3e41c04cc46d24eeb54cab89818a35e5df27be
Signed-off-by: Ayaan Masood <Ayaan.Masood@arm.com>
Reduce memory footprint when using optimization strategy Size
for elementwise operations.
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: I30380aed587c31adbf7615f74179b4c5da686773
Signed-off-by: James Peet <james.peet@arm.com>
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I4c9acb04a9df2181829e3a98aab840f32ae6458e
Updated constraints affect:
- Constant tensors
- MEAN operations
- RESIZE_BILINEAR operations
Signed-off-by: James Peet <james.peet@arm.com>
Change-Id: I2a041fa2300a9ba6da048cc61e164f34897b2f50
- Combine two MEAN operator checks for single axis averages into one
- Only apply that check if the single axis is the height dimension
(previously checks were also applied to width averages)
- Rephrase some MEAN operator constraint descriptions
Signed-off-by: James Peet <james.peet@arm.com>
Change-Id: Ie0577f2b99aba1f3d6a4c39f8934eafe3813b736
Make sure the output from a subgraph is write protected and
not overwritten by an elementwise op.
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: Ie26979913843c62794c5346a315b7089206850e0
Change the required Python version from 3.6 to 3.8 in setup.py and allow
any Python 3 version for black pre-commit linting.
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
Change-Id: I0d8936d92efd5137561834c0de1a3449f9e5f25c
Fixed a problem when the OFM is produced by different NPU nodes by
making sure that the output is always in NHWC format.
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: I00e55c989d5860499fbaf4f4318661b17b4bda7e
Ported the improved spilling behaviour from Regor
into Vela. This replaces use_fast_storage_for_feature_maps
with allocate_feature_maps and introduces the class called
FastStorageComponentAllocator.
Signed-off-by: erik.andersson@arm.com <erik.andersson@arm.com>
Change-Id: I34785840c905a79750a62863773015b00fb43387
This change will allow the subgraph's input tensor
to be reused/overwritten by the output from an elementwise op
if there is only one consumer attached to the input tensor.
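The single-consumer rule above can be sketched as follows; the names here are hypothetical, not Vela's actual API:

```python
# A minimal sketch of the reuse rule described above, with illustrative
# names (not Vela's actual classes): an elementwise op may overwrite the
# subgraph's input tensor only when that input has exactly one consumer,
# so no other op still needs the original data.
class Tensor:
    def __init__(self, name, consumers):
        self.name = name
        self.consumers = consumers  # ops that read this tensor

def can_reuse_input_buffer(ifm):
    # Safe only if the elementwise op is the sole reader of the input.
    return len(ifm.consumers) == 1

print(can_reuse_input_buffer(Tensor("subgraph_input", ["add_op"])))  # True
```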
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: I317188af11a5470614770e18dc8973462fd5f21c
The root cause of this diff is precision errors caused by rounding
several times when performing a resize bilinear upscaling to more than
twice the initial size. This is solved by rewriting the algorithm to
perform nearest neighbour upscaling to the correct size and then
applying one larger average pool instead of several 2x2 pools. Avgpool
with padding is limited to a kernel size of 8x8, which constrains the
largest possible bilinear upscaling to 8 times the input size.
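The shape of the rewrite above can be sketched like this; it is illustrative only (plain Python, no claim of numerical equivalence to the TFLite reference):

```python
# Illustrative sketch (not Vela's implementation) of the rewrite described
# above: nearest-neighbour upscale to the target size, then one larger
# average pool rather than a chain of 2x2 pools.
def nn_upscale(img, factor):
    # Repeat each element `factor` times along both dimensions.
    return [[v for v in row for _ in range(factor)]
            for row in img for _ in range(factor)]

def avg_pool(img, k):
    # k x k kernel, stride 1, no padding (VALID), for simplicity.
    h, w = len(img), len(img[0])
    return [[sum(img[i + di][j + dj] for di in range(k) for dj in range(k)) / (k * k)
             for j in range(w - k + 1)]
            for i in range(h - k + 1)]

up = nn_upscale([[0, 4], [8, 12]], 4)   # 2x2 input -> 8x8 nearest neighbour
out = avg_pool(up, 4)                   # one 4x4 pool instead of two 2x2 passes
print(len(out), len(out[0]))            # 5 5
```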
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
Change-Id: I846232f309ba26aab6c385e593cbe25b646c6668
- Issue was due to a previous patch to fix MLBEDSW-5582
- Reverted the fix for MLBEDSW-5582,
commit 849ff81f82c10a68898e5101930b92372bec5565
- Made a new fix for MLBEDSW-5582 that enforces that
output tensors from NPU graphs are in NHWC format.
This information is otherwise lost when
parts of a concatenation are placed in different custom operators,
resulting in a mismatch between NHWC and NHCWB16.
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: Iab3ba29d348353c854f357836e6aa7c338ae1572
This reverts commit 3c25ff658cf847ccfcb2d4d5796ffbe13a511894.
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I5403bf4c53dced1075160313876fa8681eaa617f
Only the first half of the weight double buffers was used
on dual-core configurations, which caused degraded performance.
Change-Id: I49972c00343bbffbae28ed11c645e993ed61d43f
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
- This bug was due to an interaction between multiple Ethos-U custom
operators and concatenation of constant tensors
- It resulted in different parts of the concatenation being placed in
different custom operators
- The fix involves placing all parts of the concatenation into the
same custom operator by switching to a breadth-first search in pass
packing
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: Ic47613cfd7bf675b4674dc91d6f9765849ba3130
Update the flatbuffers generated code to comply with TensorFlow 2.7
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
Change-Id: Iff29b05a6e145245861329b4ff9fc9fbd968da53
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
Change-Id: I288decbc0affa7f9475cabbd16cd20005e15e2a2
By not comparing items that have already been compared with
each other, the number of iterations for the loop is reduced.
For large network with long live ranges, this improves compile
time significantly.
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: I298cd6f109527fc32f6db77ffffca9e765a84ce0
The output diff is caused by not including the kernel dilation when
calculating the bottom padding to be used on the last h_stripe. This
only shows up when using dedicated_sram since shared_sram does not split
into multiple h_stripes and thus uses the padding specified by the skirt
instead.
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
Change-Id: I7f643748b153004d65be2124c0ac6c9d21cd803f
Signed-off-by: Jonny Svärd <jonny.svaerd@arm.com>
Change-Id: Ib398024c2f41beb4f93f7976c678a9fd54af94a5
Fixed a crash caused by loading a network containing
operators with empty constant tensors.
This could occur when a branched network is split
before said branches have converged.
We now put the affected operator on the CPU.
Signed-off-by: erik.andersson@arm.com <erik.andersson@arm.com>
Change-Id: I63e9cd13cecf86d976c5750c727e218c334c32b5
When an LUT tensor address is updated with another existing LUT tensor
address, also make sure to update the equivalence id.
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: I5ce8c608d9ff6d31e16212b1a725b4147dd3f6f1
- This bug causes a regression in the use of unpack and split operators
- The bug is due to the read_shapes attribute being an absolute calculation
for slice and strided_slice, but a relative one for unpack and split
- The fix is to consistently treat the attribute as a shape relative to the
read_offset
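The absolute-versus-relative distinction above can be illustrated with a small conversion helper (hypothetical names, not Vela's code):

```python
# Hedged sketch of the consistency fix described above: treat read_shapes
# as a shape relative to read_offset everywhere, converting an absolute
# end coordinate (as slice/strided_slice computed it) into a relative one.
def to_relative_shape(read_offset, absolute_end):
    # absolute end coordinate -> shape relative to the read offset
    return [end - off for off, end in zip(read_offset, absolute_end)]

# e.g. reading rows 2..6 of a [8, 4] tensor: offset [2, 0], absolute end [6, 4]
print(to_relative_shape([2, 0], [6, 4]))  # [4, 4]
```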
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I4504b161be507ea22ca6ee40fbe7808bfe049405
- This bug causes an exception to occur when trying to index split
shape in Box.transform_with_strides_and_skirt()
- The bug was due to the read shapes not being initialised when creating
a primary op in pass packing
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I3ebd7fc4c7ef5c06488a36d8340a17ae6afd4609
Added the section "Bug Resolution", which links the user
to BUGS.md.
Signed-off-by: erik.andersson@arm.com <erik.andersson@arm.com>
Change-Id: I36cd5f1317bf050ee91abf33df70dae627e83175
- Issue was due to a previous patch to fix MLBEDSW-4350
- Manually reverted that fix, commit 5fabfcaa2b636b02899b4d6e0ccf95d853986475
- Made a new fix for MLBEDSW-4350 that calculates the padding and
skirt by taking into account the split read offsets and shapes
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I96010c1b977011aecbc411a3c91ab3e61af22db4
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
Change-Id: I87dc5963972a7ef91db467b2ff8e0261e9899372
Fixed issue with sigmoid int16 with 1/2048 scaling.
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: I32718757e3776e6be89fe94a9b38368c78f0006b
This commit updates the release notes for Vela
version 3.2.0.
It also updates the SUPPORTED_OPS.md file with new
constraints.
Updated the API version as a result of the bug fix
commit 399c4a2d77df791e5d988c51d7fb1824ac4f266f.
Updated Vela version in setup.py.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: I181e89f639a1da6013e8511ebe2d7e4f81242916
This commit corrects some errors and clarifies
the section on cycle counts.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: If1198cb797ffdb2bd23b4a9624cf480a30aacaf6
Created "BUGS.md", which explains to the Vela community
how to file a bug report using the Maniphest bug tracker.
Also added a reference to it in "README.md".
Signed-off-by: erik.andersson@arm.com <erik.andersson@arm.com>
Change-Id: I0120a890c8447907e32de6b10a24eceade09df7d
- Removed the passes information as this was no longer correct
or useful
- Fixed the reporting of the number of CPU operators
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I80bf3f023de7d470af9aa5c6fe7bcb58c60ccd0b
- The failing tests contain operations with dynamic tensors which
are not supported and therefore they should be placed on the CPU.
However, a bug in the removal of RESHAPEs which contain a dynamic
shape prevented this from happening.
- This change adds a check to make sure that RESHAPE ops with a
dynamic shape tensor are not removed and instead are placed on the
CPU.
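The added check amounts to asking whether the RESHAPE's shape input is constant; a minimal sketch with illustrative names (not Vela's actual API):

```python
# Hypothetical sketch of the check described above: a RESHAPE may only be
# optimised away when its shape input is a constant tensor; a dynamic
# shape keeps the op, so it falls back to the CPU.
def can_remove_reshape(shape_tensor_values):
    # shape_tensor_values is None when the shape is computed at runtime
    return shape_tensor_values is not None

print(can_remove_reshape([1, 64]))  # True  (constant shape)
print(can_remove_reshape(None))     # False (dynamic shape, keep on CPU)
```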
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I2d7481f7f80f99a0f01df100d956933777e6875a
This commit adds the author_email field with email
address <mlg-vela@arm.com> to the
setuptools.setup() function in setup.py.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: If3b2605ea9b05a8a4c6f899d8af77cbaec9ce9b5
Change-Id: I645496536a6bddf2bd289a87be9d7cef11693954
Signed-off-by: Diqing Zhong <diqing.zhong@arm.com>
* 1D optimised block_config was incorrectly being set to the ArchitectureBlockConfig in try_block_config()
* Write an external API test for the reduced block height case (on H256)
Signed-off-by: James Ward <james.ward@arm.com>
Change-Id: I9ced7eb31b23730e4423aabbaf769bc72fac8fc9
This reverts commit 0af0d383925968626a7c37,
which caused a regression by rejecting
previously passing tests as faulty.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: If11737713b6873a67162387e407eadf174b434ec
* Add small aesthetic changes to summary
* Move "_cpu" suffix from cloned tensor to original tensor such that suffix is no longer externally visible
Signed-off-by: James Ward <james.ward@arm.com>
Change-Id: I97427561bd9acb04765ae9de6278760511278118
Fixed by adjusting zero points for ops with int8 IFM and asymmetric weights
since the reference does not support asymmetric weights for int8 IFM and
ignores the zero points.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: I2a206a01a471a53aa864a6a3616aa23d2a5a23c8
- Back-to-back 16-bit activation ops were packed into the same pass
because there was no check to disallow it
- The solution is to set the appropriate incompatible-flags
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: Idb3c741a7b52e0d81c1f687f6ecf78352b7872dd
Previously we did not check if half_pixel_centers
was set. Since we do not support it, these cases
should not run on the NPU.
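A minimal sketch of such a supported-operator check (names illustrative, not Vela's constraint framework):

```python
# Illustrative sketch of the constraint described above: reject the op
# for the NPU whenever half_pixel_centers is set, since that resize mode
# is unsupported and must run on the CPU.
def supported_on_npu(op_attrs):
    return not op_attrs.get("half_pixel_centers", False)

print(supported_on_npu({"align_corners": False}))      # True
print(supported_on_npu({"half_pixel_centers": True}))  # False
```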
Signed-off-by: erik.andersson@arm.com <erik.andersson@arm.com>
Change-Id: I9d2675f760424d5cfb67e5d581dd1861ad165b85
* Add a check for a tensor with no operations, raising an error if its constant-data buffer is empty
Signed-off-by: Alex Matthews <alex.matthews@arm.com>
Change-Id: Ib210dcc9733e4ecedbada0f430e8b3c4a8384999
Signed-off-by: James Ward <james.ward@arm.com>
Change convert_pad optimiser to use op.ifm_shapes attribute in place of
the fickle op.ifm.shape (which in this case had changed due to the
optimised-out reshape)
Signed-off-by: James Ward <james.ward@arm.com>
Change-Id: I13fbd846ac8d3342afd7844d1041cfa15aaae124
Added checks to avoid merging elementwise op live ranges for subgraph
inputs and outputs, which sometimes caused problems when parts of the
network run on CPU.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: Id07ab277a205b8550d19a276559f8904b9a4b4be
Make sure unsupported memory only operations are issued
to the CPU.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: Ifdf7c3056ab45d707db5b67113549a73133b69c8
Fixed crash in nn_graph.print_graph_with_tensors() and
nn_graph.print_graph_with_tensor_quantization() for optional
input tensors.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: I7a2d23892558006485c5c84842d65aa221dba44b