*Added a generic function that checks whether the underlying shape of
a FullyConnected operation is 2D, and performs shape reduction if so
*FullyConnected operations with more than 2 dimensions now run on the
NPU if the above case is satisfied
*constraint_fc_output_2d and rewrite_fully_connected_input refactored
*Added a unit test to confirm this functionality
Signed-off-by: Ayaan Masood <Ayaan.Masood@arm.com>
Change-Id: I0e29c767e5b84841eb53bbc44464b36a454f7b38
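A minimal sketch of the shape check described above (hypothetical helper name, not Vela's actual code):

```python
def reduce_to_2d(shape):
    """Return the 2D-equivalent shape if every dimension above the two
    innermost ones is 1, e.g. [1, 1, 4, 8] -> [4, 8]; otherwise None."""
    if len(shape) < 2 or any(dim != 1 for dim in shape[:-2]):
        return None
    return list(shape[-2:])
```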
- Changed comments to a docstring on QuantizationParams
- Simplified op type to op name conversion
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I2fdf5922cc17944c9bd37917a85fdfe50a1e651d
- Fixed a bug due to ResizeBilinear modifying the attributes of a
shared IFM
- The ifm_resampling_mode is now an attribute of an operator rather
than a tensor
- Changed all calls to try_block_config() to use the attribute rather
than recalculating it in multiple places
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I4641e9cd6b049bd4186776d98e3e751c5e5bcc06
Add mypy to pre-commit and clean up all reported errors.
Signed-off-by: Jonas Ohlsson <jonas.ohlsson@arm.com>
Change-Id: If7dc869f5fecdb0e2db40f14e7d9db21aa33df71
This change will allow the subgraph's input tensor
to be reused/overwritten by the output from an elementwise op
if there is only one consumer attached to the input tensor.
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: I317188af11a5470614770e18dc8973462fd5f21c
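Roughly, the condition reduces to a single-consumer check; the attribute names in this sketch are illustrative only:

```python
def ifm_can_be_overwritten(ifm_tensor, subgraph):
    # Letting the elementwise op write into its input buffer is only
    # safe when no other op will read that input afterwards.
    return ifm_tensor in subgraph.input_tensors and len(ifm_tensor.consumers) == 1
```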
Change-Id: I645496536a6bddf2bd289a87be9d7cef11693954
Signed-off-by: Diqing Zhong <diqing.zhong@arm.com>
Fixed by adjusting zero points for ops with an int8 IFM and asymmetric
weights, since the reference does not support asymmetric weights for an
int8 IFM and ignores the zero points.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: I2a206a01a471a53aa864a6a3616aa23d2a5a23c8
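The gist of the workaround, sketched with illustrative attribute names (an assumption, not the actual patch): since the reference ignores the weight zero points in this case, Vela zeroes them as well so both sides compute the same result:

```python
def align_zero_points_with_reference(op):
    w_quant = op.weights.quantization
    if op.ifm.dtype == "int8" and w_quant.zero_point != 0:
        # The reference treats these weights as if zero_point were 0,
        # so drop it to keep bit-exact agreement.
        w_quant.zero_point = 0
```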
This commit fixes a number of bugs where per-axis
quantization would make Vela crash and would not
be properly recognized.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: I50a461d200274b43ec76f3a7357bf66db6d49964
Added support for elementwise operations:
-Support for up to Rank == 6
-Support for Batch > 1 for Rank == 4
-For binary elementwise ops this includes handling
of broadcasting in dimensions above the H-dimension
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: I73850bbfb288077a99bd2ceecbf989172016da24
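For the binary ops, "broadcasting in dimensions above the H-dimension" follows the usual rule that each pair of leading dimensions must match or be 1; a sketch assuming ...HWC shape order:

```python
def broadcastable_above_h(shape_a, shape_b):
    """True if the dimensions above H (everything but the last three)
    are pairwise equal or 1, e.g. [2, 1, 8, 8, 4] vs [2, 3, 8, 8, 4]."""
    for a, b in zip(reversed(shape_a[:-3]), reversed(shape_b[:-3])):
        if a != b and 1 not in (a, b):
            return False
    return True
```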
This commit fixes one assert regarding rolling buffers for 3D tensors.
It also addresses another issue where the incorrect weight buffering was
proposed for cascaded operators.
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: I2501f35e5668b3085d917751cfc8002d250973d8
Bug fix in the cascade builder: tensors produced by operators requiring the
full OFM, or consumed by operators requiring the full IFM, could be added
as intermediate buffers to a cascade.
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
Change-Id: Id84e9e1940bf85ab4cbc42a03e65f64da16a094c
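The corrected rule, paraphrased as a predicate (names are illustrative, not the cascade builder's actual API):

```python
def can_buffer_in_cascade(tensor):
    # A tensor must not become an intermediate (rolling) buffer if its
    # producer writes the full OFM or any consumer reads the full IFM.
    if tensor.producer.requires_full_ofm:
        return False
    return not any(op.requires_full_ifm for op in tensor.consumers)
```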
Remove quant_values attribute from Tensor class.
It only needs a single values attribute, holding either
quantized or unquantized values as appropriate.
Change-Id: Ie96f80ac58061b6077e0f7048dc60209fdfbcafa
Signed-off-by: James Peet <james.peet@arm.com>
- Merged dev/scheduler at 83639f90e8c828f70de6e29142355a940224959b
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I0050529d4b42da93768c7264296434dd877fb5b4
Refactored the check of whether a non-linear tensor format can be
used.
-Flag avoid_NHCWB16 replaced with needs_linear_format
-Checking of the restrictions moved to one function in the graph optimiser.
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: Iec5c7996a1a6039cad052197f1ae56f7c0290440
This commit adds support for emulating the behavior
of the QuantizedMeanOrSum implementation of MEAN in
TensorFlow Lite.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: Ifd24e0e678e2f85cd66ab82deeaaf010d5351b1e
Fixed pass-through of the LSTM operator.
Change-Id: I23140c69ab6cdc83f6bb8129256b4cc6a7c5ffac
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
When processing certain networks containing LeakyReLU operators, Vela would crash when cloning an OFM of a LeakyReLU operator.
During this procedure a deepcopy would try to copy an OperatorInfo object, which caused an error.
This was fixed by replacing the deepcopy with a shallow copy and then manually creating new instances of the sensitive members.
Signed-off-by: erik.andersson@arm.com <erik.andersson@arm.com>
Change-Id: I46917858896fbdf52245dac6c6d9c18bc7ecdd0d
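The fix pattern, sketched with illustrative member names: take a shallow copy, then replace by hand only the members that must not be shared between the original and the clone:

```python
import copy

def clone_tensor(tens):
    # deepcopy choked on the OperatorInfo reachable from the tensor; a
    # shallow copy avoids that, and the mutable members that must be
    # private to the clone are then re-created manually.
    clone = copy.copy(tens)
    clone.ops = list(tens.ops)
    clone.consumer_list = list(tens.consumer_list)
    return clone
```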
-Removed reshapes in the original graph
-Removed the addition of reshapes to the
optimized graph
-Reshapes with different ifm/ofm quantisation will remain
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: I94862be53dac0d7434815e2aee5ca678228495f8
Fixed assertion when reading back in an ethos-u custom op.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: I275ec9187ffead1e96f2522ecbd658328fa4ef69
This reverts commit df0a5905177f3a1b836076bc3f9f39b2e86f1794.
Reason for revert: <INSERT REASONING HERE>
Change-Id: I891c66fb29db9d25e942947e8d1c29a10610de51
This reverts commit bf31d647dc5df47410ee577b12427ddf076d816b.
Reason for revert: <INSERT REASONING HERE>
Change-Id: I7b6c585b7658f94dbaa916c2b6bfe9fb463b8d37
Add a 4D shape class for op IFM/OFM shapes
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: Ic0a98da9d2f9d085605e39a9ab5a26bad6e702a3
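The idea in miniature (the real class carries more helpers than this sketch):

```python
from collections import namedtuple

class Shape4D(namedtuple("Shape4D", "batch height width depth")):
    """A fixed-rank NHWC shape, padded with leading 1s when built from
    a lower-rank shape, so every op sees uniform 4D IFM/OFM shapes."""

    @classmethod
    def from_list(cls, shape):
        assert 1 <= len(shape) <= 4
        return cls(*([1] * (4 - len(shape)) + list(shape)))

# Shape4D.from_list([8, 8, 16]) -> Shape4D(batch=1, height=8, width=8, depth=16)
```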
Add IFM/OFM shapes to ops
Changed the code to rely on these shapes
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: I571535a1dcadc2bdb04a3c727a8e1c49703b174d
Due to potential cyclical imports, especially when running individual
parts of Vela standalone, for example with pytest, the specialised
error functions have been moved out of errors.py to their respective
locations.
Using getattr instead of isinstance removes the need to import the
tensor/operator classes that caused the cyclical imports.
Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com>
Change-Id: If8cee4b1a2562660c6a47e1c7aeb5d7fd4dd1fca
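The getattr pattern in a nutshell: the error helpers duck-type on an attribute instead of importing Tensor/Operation for an isinstance check (the attributes below are illustrative):

```python
def describe(obj):
    # No `from .operation import Operation` needed, so errors.py no
    # longer participates in the import cycle.
    if getattr(obj, "type", None) is not None:
        return f"operator '{obj.type}'"
    if getattr(obj, "name", None) is not None:
        return f"tensor '{obj.name}'"
    return repr(obj)
```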
Added __lt__ for Tensor to avoid errors when sorting tensors.
Change-Id: I19bb591ef17aa0d4a3389da411bd8863c2218d55
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
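The failure this avoids: sorting tuples that contain tensors falls back to comparing the tensors themselves when the earlier tuple elements tie. Any stable ordering works, for example by name:

```python
class Tensor:
    def __init__(self, name):
        self.name = name

    def __lt__(self, other):
        # The order is arbitrary; it only has to exist so that sorted()
        # never raises TypeError on tied sort keys.
        return self.name < other.name

# sorted([(4, Tensor("b")), (4, Tensor("a"))]) now works: the equal first
# elements force a Tensor comparison, which previously raised TypeError.
```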
Signed-off-by: Diqing Zhong <diqing.zhong@arm.com>
Change-Id: I4a5c53d0c5957595fc639b174b2b227ea043d409
Minor refactoring to use fstrings.
Improve Error classes to correctly inherit the base class.
Use existing exception classes instead of plain exceptions where it
makes sense.
Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com>
Change-Id: I0941c04e91010da1db77299517a8e2d896371e77
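"Correctly inherit the base class" here means routing the message through super().__init__ so the exception behaves like any other; a sketch along these lines (class names follow errors.py, bodies are illustrative):

```python
class VelaError(Exception):
    """Base class for Vela errors."""

    def __init__(self, data):
        super().__init__(data)  # let Exception carry the message
        self.data = data


class InputFileError(VelaError):
    def __init__(self, file_name, msg):
        super().__init__(f"Error in input file {file_name}: {msg}")
```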
Change-Id: I1b35e039f43471cc0f61cb46ed4d5aff5469d11d
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
Replace conditional checks against sets with tuples.
When uniqueness or complex set operations are not required, it is
quicker to use tuples instead.
Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com>
Change-Id: Ie8732c8d46067244963936c53f0ec81adda50372
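The change in a nutshell: for a plain membership test over a handful of literals, a tuple does the job without any set machinery (op names illustrative):

```python
def is_convolution(op_type):
    # A tuple literal is a cheap constant; uniqueness and set algebra
    # are not needed for a simple `in` check.
    return op_type in ("Conv2D", "DepthwiseConv2D", "TransposeConv")
```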
Vela only supports per-channel scaling for
convolution ops. This commit adds a check that
puts ops with per-channel scaling on the CPU.
A caveat worth mentioning is that neither
TensorFlow Lite nor TensorFlow Lite Micro supports
per-channel scaling for the CPU-placed op;
however, the problem is moved away from Vela.
This commit also changes a small utility function
in supported_operators.py used for docstring
formatting.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: I9ed090592f1d05dd4566d3e54dba1ef405299383
- Improved tensor and scaling query functions
- Fixed bug in convert_batched_fc_to_conv
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: Ibc3d14036540f27cf5e993beb2163d3e0f5e5933
None inputs and unsupported tensor shapes caused asserts when
marking tensor purpose/format.
Change-Id: I4498b61576f529c1a594341cfbb6ba278c6e7ec5
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
For IFM-streamed cascades, bias tensors are read several times.
Moves these tensors to fast storage and adds DMA commands.
Change-Id: I630f6275986c1b5e3f126c925b11e22500fb1128
Signed-off-by: Andreas Nevalainen <andreas.nevalainen@arm.com>
Added an option to set whether a Tensor clone should be
seen as unique or not.
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: Ie51c1a5e84b535380d498b105aa18ccba1c8b27c
Separate scale+bias tensors by different equivalence_ids.
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: I674341950bc001ac6e4015206995f048a0dfee75
- Copy the bandwidth compression rate when a weight tensor is cloned
Signed-off-by: Diqing Zhong <diqing.zhong@arm.com>
Change-Id: I41c4c1f7001e8dc12af35695f5f5d02815e28351
- Fixed typo in Tensor.is_quantized()
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I36156a6aa5aaff01c4f271a6a8325636173225f3
- Fixed and documented both tensor and quant params scaling checks
- Added quant params validity check and tensor quantisation check
- Added valid tensor checks to some graph optimisation functions
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I8d6e8f03a603d28886dde511672c8399c85b794c
- op.type is now an enum instead of a string
- Removed unused operator codes
- Refactored some attributes like npu_block_type, fused_activation_function
- Refactored operator index calculation
- Refactored a number of operator sets
Change-Id: I641f65ee375794b7aec42abc0664251ae37d78e8
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
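A reduced sketch of what the enum change gives (the real Op enum also carries per-operator metadata):

```python
import enum

class Op(enum.Enum):
    Conv2D = enum.auto()
    FullyConnected = enum.auto()
    Relu = enum.auto()

def is_conv(op_type):
    # Comparison against an enum member: a typo such as Op.Conv2d now
    # fails loudly with AttributeError instead of silently never matching.
    return op_type == Op.Conv2D
```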
Attempts to use fast storage for feature maps used between
cascaded passes.
This is only relevant for system configurations where feature maps
are by default not placed in SRAM, but there is SRAM for fast storage.
Change-Id: I207b7cf32cfcb5bea3e6b93c2da1161c4af5221d
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
Assign different equivalence ids to weights with the same values but
different compression, to ensure correct addressing.
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
Change-Id: I13aabad71520e4f4a78fb2d6a81740bdd4d1256c
Added a static class TensorAddressMap that stores all Tensor addresses
based on their equivalence_id. Made the "address" field into a property
whose getter and setter look up/set the tensor's address in
TensorAddressMap.
This makes the references to cpu_tensor/npu_tensor obsolete, and they
have been removed.
Addition to the scheduler: avoid SRAM spilling if an op has consumers in
other subgraphs.
Minor rework in LUTState; it will now assign a unique equivalence_id to
the SHRAM lut tensor to avoid issues with addressing. The equivalence
checks in LUTState now compare the values of the LUT instead of the
equivalence_id.
Updated the LUT unit tests accordingly.
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: I41de5a8a4e5f07b77d6544d8d4034b754993e503
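A reduced sketch of the scheme described above; details beyond the message (such as per-memory-type addresses) are omitted:

```python
class TensorAddressMap:
    # All addresses live here, keyed by equivalence_id, rather than on
    # the Tensor instances, so equivalent tensors share one address.
    address_map = {}

    @classmethod
    def get_address_for_tens(cls, eq_id):
        return cls.address_map.get(eq_id)

    @classmethod
    def set_address_for_tens(cls, eq_id, address):
        cls.address_map[eq_id] = address


class Tensor:
    def __init__(self, equivalence_id):
        self.equivalence_id = equivalence_id

    @property
    def address(self):
        return TensorAddressMap.get_address_for_tens(self.equivalence_id)

    @address.setter
    def address(self, address):
        TensorAddressMap.set_address_for_tens(self.equivalence_id, address)
```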
Added batching to softmax by reshaping the input.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: I0b516f9bf2410fb86372b229beba4a7280c498cc
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: I287c24725126c169afec779b921e43c3ab26f739
Added --weight-estimation-scaling, which enables
additional scaling of the weight compression scale estimate.
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: Idcda41257f44901d3a3f345341e07fb1ae8585a9
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com>
Change-Id: I2cb3f6639e4bb8a984fa3647ee7b4678ed6f5890
Replaces LeakyRelu operations with a LUT activation function when possible,
otherwise with a combination of multiplication/maximization.
Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
Change-Id: I3d2eb2dba7145997c3cc711d0ef18ab355fbb416
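The multiplication/maximization fallback relies on the identity LeakyRelu(x) = max(x, alpha * x), valid for 0 <= alpha <= 1:

```python
import numpy as np

def leaky_relu_reference(x, alpha):
    # For x >= 0, max picks x (alpha <= 1 keeps alpha * x below x);
    # for x < 0 it picks alpha * x, matching the LeakyRelu definition.
    return np.maximum(x, alpha * x)
```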
- Fixed bug with the supported operator check rejecting operators based
upon an incorrect comparison of the tensor quantisations
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: Ibd0eb50077465d2c515c6ee10394d9b43cdf730c
NHCWB16 is now accounted for in the SRAM estimates in the scheduler,
for intermediate buffers in IFM streaming.
Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com>
Change-Id: Icda5e05dd3663935f528f1a06d36d9e1de123cc8
This commit fixes a bug where CPU ops were getting
passed on as NPU ops in weight_compressor.py due to
Operation.find_npu_op() incorrectly returning any
op with an 'npu_block_type' attribute (which every
op has) as an NPU op.
Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
Change-Id: I7a758f8d1b1237907816bc1be7b77aff765ae688
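The bug pattern, reduced to a sketch: every Operation carries an npu_block_type attribute, so testing for the attribute's existence classifies everything as an NPU op; testing its value is what distinguishes them.

```python
import enum

class NpuBlockType(enum.Enum):
    Default = enum.auto()  # not mapped to any NPU block
    ConvolutionMxN = enum.auto()

def find_npu_op(op):
    # `hasattr(op, "npu_block_type")` is true for every op, which was
    # the bug; checking the value only matches real NPU ops.
    return op if op.npu_block_type != NpuBlockType.Default else None
```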