Age | Commit message | Author
2020-08-26 | MLBEDSW-2686: Use NPU tensor format for noop reshapes. [1.2.0.rc2] | Tim Hall
- Reshapes that merely add/remove dimensions, rather than re-layout the data, need not fall back to NHWC. This commit allows reshapes between NPU operators to use NHCWB16. Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: Ieb7745e586bf324e92e741a04b74caf7285f4b8b
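One way to read "merely add/remove dimensions" is that both shapes are identical once padded to four dimensions with leading 1s. The sketch below illustrates that reading only; the helper names and the front-padding rule are assumptions for illustration, not code taken from Vela.

```python
from typing import List, Sequence


def pad_to_4d(shape: Sequence[int]) -> List[int]:
    """Prepend 1s until the shape has four dimensions (assumed convention)."""
    return [1] * (4 - len(shape)) + list(shape)


def is_noop_reshape(ifm_shape: Sequence[int], ofm_shape: Sequence[int]) -> bool:
    """True when the reshape only adds or removes leading unit dimensions."""
    return pad_to_4d(ifm_shape) == pad_to_4d(ofm_shape)
```

Under this reading, `[8, 8, 16] -> [1, 8, 8, 16]` keeps the data where it is, while `[1, 8, 8, 16] -> [1, 64, 16]` re-lays it out.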
2020-08-26 | Update to HI 1.0.6 | Stefan Nannesson
Signed-off-by: Stefan Nannesson <stefan.nannesson@arm.com> Change-Id: I7ad0b8e5b2431b46b53f51d809ca2642039a0012
2020-08-26 | MLBEDSW-2688: use LeakyRelu for int16 | Louis Verhaard
For int16, using LeakyRelu (with bug fix) gives exactly the same results as Mul+Max if input/output scales are the same. Signed-off-by: Louis Verhaard <louis.verhaard@arm.com> Change-Id: I4f4db464d77b0aaf0d25ddfca534f91d08db548d
2020-08-26 | MLBED-2822 Added CLI-opt for weight size est. | Patrik Gustavsson
Added --weight-estimation-scaling, which enables additional scaling of weight compression scale estimate. Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: Idcda41257f44901d3a3f345341e07fb1ae8585a9
2020-08-26 | MLBEDSW-2847: Fix for TransposeConv crash and u8 output diff | Jacob Bohlin
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com> Change-Id: I2cb3f6639e4bb8a984fa3647ee7b4678ed6f5890
2020-08-26 | MLBEDSW-2688: LUT DMA may require kernel wait | Louis Verhaard
LUT related updates specific for 16K SHRAM: - prevent LUT DMA transfer from overwriting accumulator SHRAM of an ongoing operation - do not use the last 2K of SHRAM as accumulator during LUT operations Change-Id: I17066e0410c6f07b125ed245002d7b19269a7a8a Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
2020-08-25 | MLBEDSW-2867: Split operators get placed on CPU | Dwight Lidman
This commit fixes a bug wherein Split operators are being erroneously placed on the CPU due to a 0-dimensional input that disqualifies it from NPU placement; a restriction introduced in a recent commit. Signed-off-by: Dwight Lidman <dwight.lidman@arm.com> Change-Id: I83c047ddf071d662343087c69bdb2a014dd209c3
2020-08-24 | MLBEDSW-2654: Convert Resizebilinear to a number of 2x2 pools | Charles Xu
Signed-off-by: Charles Xu <charles.xu@arm.com> Change-Id: Ida307afc33cd7963bdeb505df400732a3efcc846
2020-08-24 | MLBEDSW-2688: LeakyRelu rewrite to LUT or MUL/MAX | Louis Verhaard
Replaces LeakyRelu operations with LUT activation function when possible, else to a combination of multiplication/maximization. Signed-off-by: Louis Verhaard <louis.verhaard@arm.com> Change-Id: I3d2eb2dba7145997c3cc711d0ef18ab355fbb416
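The multiplication/maximization fallback relies on the identity LeakyRelu(x) = max(x, alpha * x), which holds when 0 <= alpha <= 1. The floating-point sketch below only demonstrates that identity; it does not model the quantised arithmetic or the LUT path, and the function names are illustrative.

```python
def leaky_relu(x: float, alpha: float) -> float:
    """Reference LeakyRelu: identity for x >= 0, alpha * x otherwise."""
    return x if x >= 0 else alpha * x


def leaky_relu_mul_max(x: float, alpha: float) -> float:
    """Mul + Max form; matches leaky_relu only when 0 <= alpha <= 1."""
    return max(x, alpha * x)
```

For alpha outside [0, 1] the two forms diverge, so a rewrite of this kind would need to guard on the value of alpha.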
2020-08-21 | MLBEDSW-2646: Refactor unknown operator serialisation [1.2.0.rc1] | Tim Hall
- Minor cleanup of register command stream generator too Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I0514622402ee9b0557769dd7c7decfddecc87ffa
2020-08-21 | MLBEDSW-2679: Tensor quant comparison is incorrect | Tim Hall
- Fixed bug with the supported operator check rejecting operators based upon an incorrect comparison of the tensor quantisations Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: Ibd0eb50077465d2c515c6ee10394d9b43cdf730c
2020-08-21 | MLBEDSW-2663: Handle optional tensors | Jacob Bohlin
Includes a number of changes: * Handle non-existing optional inputs * Handle disabled optional inputs (-1 indexed) * Added unit tests for parsing operators * Add bias tensor to the different Convolutions + FullyConnected if it's missing. Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com> Change-Id: Ib88d2b610314b1c886fc0aef4f9da87430ce6ae5
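The two optional-input cases listed above can be sketched as follows: an index of -1 marks a disabled optional input, and an input list shorter than expected means the optional input does not exist at all. The function name, its parameters, and the use of strings as stand-in tensors are assumptions for illustration and are not Vela's reader code.

```python
from typing import List, Optional, Sequence


def resolve_inputs(
    indices: Sequence[int], tensors: Sequence[str], expected: int
) -> List[Optional[str]]:
    """Map tensor indices to tensors, using None for missing optional inputs."""
    # A -1 index is a disabled optional input.
    resolved: List[Optional[str]] = [tensors[i] if i >= 0 else None for i in indices]
    # Inputs that were never listed are treated the same way.
    resolved += [None] * (expected - len(resolved))
    return resolved
```

A caller could then create a default bias tensor wherever the bias slot comes back as `None`.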
2020-08-21 | [MLBEDSW-2730] Implement LUT generation for softmax uint8/int8 | Fredrik Svedberg
Implemented LUT generation for softmax uint8/int8 to match the reference. Change-Id: Ib9acaa295ee1066591e800023d75f364520b44c1 Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
2020-08-21 | Added a lower bound for the valid range of shift | Jacob Bohlin
Very small quantization scales, below around 2^-31, would return negative shift values. Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com> Change-Id: I4ca368284c097820f83e5ae53412a08c34516c7f
2020-08-21 | MLBEDSW-2664 Clarify help for CLI-opt permanent-storage | Patrik Gustavsson
- Make it clear that the --permanent-storage option is only valid for Ethos-U55. - Removed Shram from allowed values Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: Ice6cacd509713e33bcb380c16dcd3c3b34a82a33
2020-08-21 | MLBEDSW-2822 Account for NHCWB16 in scheduler est. | Patrik Gustavsson
NHCWB16 is now accounted for in the scheduler's SRAM estimates for intermediate buffers in IFM streaming. Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: Icda5e05dd3663935f528f1a06d36d9e1de123cc8
2020-08-21 | MLBEDSW-2611: Update global scale for 16 bit to tanh and sigmoid | Charles Xu
Signed-off-by: Charles Xu <charles.xu@arm.com> Change-Id: Ia83ab5ba28d193215e3f8fbc52552b0356111723
2020-08-20 | MLBEDSW-2783 Vela crashed on empty tflite file | Michael McGeagh
There may be cases where, after optimisations, there are no operators contained within the subgraph. Upon serialising and writing out the Vela-optimised tflite file, it would crash for such a corner case. This fix writes out the empty tflite file instead of crashing. Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com> Change-Id: Ia879d1ffdbab21706b15e99aa107fb2d8d4dd3de
2020-08-20 | MLBEDSW-2824: Add mapping for ROUND operator | Dwight Lidman
This commit adds an entry in the tflite_mapping.py for the ROUND operator, which was previously missing. Signed-off-by: Dwight Lidman <dwight.lidman@arm.com> Change-Id: I22d6c60969eea6a785366c6741893718ba3cb8ae
2020-08-19 | vela: Minor refactor of operation class | Tim Hall
- Removed some of the clutter Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I9a12f681247befd44dbbc9d7fbd135f0603d2fbd
2020-08-19 | MLBEDSW-2683: Neural Network MACs is wrong | Tim Hall
- Fixed. It only affected operators with striding greater than 1x1 Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I129e46586aa16079ddbce3898569676ba9891372
2020-08-19 | MLBEDSW-2728: Only insert primary op for NPU ops | Jacob Bohlin
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com> Change-Id: I04f299e2d3319113fedf2fa401b88bae64fea66d
2020-08-19 | MLBEDSW-2731: Allow all TensorFlow Lite operators to pass through | Dwight Lidman
This commit adds missing entries and options in the tflite_mapping which should in theory allow every existing TensorFlow Lite operator to be passed through Vela without crashing. Previously some entries were missing, and Vela crashed with a custom error whenever one was encountered. Signed-off-by: Dwight Lidman <dwight.lidman@arm.com> Change-Id: Ia69b7a84164bb57c52ceaf7380160794b7f0d9ee
2020-08-19 | MLBEDSW-2729: Add restrictions for shapeless tensors | Dwight Lidman
Vela often fails when encountering operators that have inputs or outputs with shape == []. This is supported only for elementwise ops where the shape is broadcast from IFM2 to IFM1. This commit adds a restriction which places ops with shape [] tensors on the CPU, except in the special case of broadcasting for elementwise ops. Signed-off-by: Dwight Lidman <dwight.lidman@arm.com> Change-Id: I5b0855233e3b83870209f4da00fb2dbd0184fee0
2020-08-19 | MLBEDSW-2636 Prevent DMA of weight to Sram in some cases | Patrik Gustavsson
DMA transfer of weights is prevented when the weight double buffer is assumed to not fit Sram. Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: I9809dca1d4b335436e1a0b81093640361ada255e
2020-08-19 | MLBEDSW-2779 Avoid NHCWB16 in some SplitSliceRead cases | Patrik Gustavsson
NHCWB16 is avoided for the input tensor of SplitSliceRead when any of the consumers has a start offset in the C-dimension that is not a multiple of 16. Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: I333e2acfbeb02b9c34ee5ea28074baff12ea7b24
2020-08-19 | [MLBEDSW-2657] Softmax uint8/int8 | Fredrik Svedberg
Added graph rewrite of Softmax for uint8/int8. Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com> Change-Id: Iecdd5d2cd3156a601b3313debba4a3562e6be5d7
2020-08-18 | MLBEDSW-2718: Fix a bug that would reshape TransposeConv input | Jacob Bohlin
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com> Change-Id: If22fd21f9953a62305620a4e804e5caacb342c89
2020-08-18 | MLBEDSW-2589: Skip weight compression for CPU ops | Dwight Lidman
This commit fixes a bug where CPU ops were getting passed on as NPU ops in weight_compressor.py due to Operation.find_npu_op() incorrectly returning any op with an 'npu_block_type' attribute (which every op has) as an NPU op. Signed-off-by: Dwight Lidman <dwight.lidman@arm.com> Change-Id: I7a758f8d1b1237907816bc1be7b77aff765ae688
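The class of bug described here, an attribute-existence check that is always true, can be reproduced in a few lines. Everything below is a hypothetical stand-in: the `Op` class, the `run_on_npu` flag, and both predicate functions are invented for illustration, and the actual fix in Vela may look different.

```python
class Op:
    """Minimal stand-in for an operation; every instance has npu_block_type."""

    def __init__(self, name: str, run_on_npu: bool) -> None:
        self.name = name
        self.npu_block_type = "Default"  # present on CPU and NPU ops alike
        self.run_on_npu = run_on_npu     # hypothetical placement flag


def is_npu_op_buggy(op: Op) -> bool:
    # Always True, because the attribute exists on every op.
    return hasattr(op, "npu_block_type")


def is_npu_op_checked(op: Op) -> bool:
    # Tests a value that actually records the placement decision.
    return op.run_on_npu
```

With the first predicate a CPU-placed op is indistinguishable from an NPU op, which is how it could end up in the weight compressor.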
2020-08-18 | MLBEDSW-2779 Consider num dimensions, in check for NHCWB16 | Patrik Gustavsson
4 dimensions were assumed in the check for whether NHCWB16 should be avoided. Changed the check so that NHCWB16 is avoided if the axis corresponds to the C-dimension. Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: I7784a7a813a3c3438d6142523bf0a3ba81742aca
2020-08-18 | Vela: Rework NPU/DMA dependency insertion (for MLBEDSW-2620) | Tim Hall
- This commit removes unnecessary dependency checks and implements on-demand calculation of the NPU/DMA dependencies. Signed-off-by: <tim.hall@arm.com> Change-Id: I85e681d1ab133bd88f64296dc00500f3c188e777
2020-08-18 | MLBEDSW-2732: Added complex64 to datatypes | Jacob Bohlin
Added complex64 datatype to allow pass through without crashing. Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com> Change-Id: I8beeceafb32182d4877a9880d21d51ba21033030
2020-08-17 | MLBEDSW-2688: Improved LUT support | Louis Verhaard
- Support for more than one 256-byte LUT in SHRAM - No DMA is performed for a LUT that is already located in SHRAM - Added MemArea.Shram, used for LUT, to avoid false address collision asserts during SRAM tensor allocation - Added read access to LUT in memory access calculation Change-Id: If4d1eded5ed029d253f4f5efb2d80495fc3eac99 Signed-off-by: Louis Verhaard <louis.verhaard@arm.com>
2020-08-14 | MLBEDSW-2570 Avoid usage of NHCWB16 for some cases | Patrik Gustavsson
Avoid usage of NHCWB16 when Stack/Pack/Concat is performed in axis 3, and the "concat start" of each slice to be combined is not a multiple of 16. Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: If3f7b4a3424be3c86fc2dc48e8649ce4c4f49485
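The rule above can be sketched as a check on the channel offset at which each input slice starts within the concatenated output. The function names and the idea of passing per-input channel counts are assumptions for illustration; the sketch covers only this one rule and not any other NHCWB16 restriction.

```python
from itertools import accumulate
from typing import List, Sequence


def concat_start_offsets(channel_sizes: Sequence[int]) -> List[int]:
    """Channel offset at which each input begins in the concatenated output."""
    return [0] + list(accumulate(channel_sizes))[:-1]


def concat_allows_nhcwb16(channel_sizes: Sequence[int], axis: int) -> bool:
    """False when concatenating on axis 3 with a start offset not divisible by 16."""
    if axis != 3:
        return True  # this particular rule applies only to the C axis
    return all(offset % 16 == 0 for offset in concat_start_offsets(channel_sizes))
```

For example, inputs with 16, 32 and 8 channels start at offsets 0, 16 and 48, so the format is allowed, whereas inputs with 8 and 16 channels place the second slice at offset 8 and the format is avoided.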
2020-08-13 | MLBEDSW-2639: Remove reverse_op_order attribute | Jacob Bohlin
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com> Change-Id: Id762ee2c03cd8f162cd0c450511ee5b2e0624586
2020-08-13 | MLBEDSW-2755: Added check that ifm2_tensor is set | Jacob Bohlin
Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com> Change-Id: I5b8db6430e79ec7a5836d8dd00a03413647de8ba
2020-08-12 | [MLBEDSW-2749] removed the decorator for typecheck | Manupa Karunaratne
* The decorator is causing the verification tests to fail when using TF 2.1, but not with TF 2.2, hence removing it for now. Change-Id: I07357c0fef383d9a65278fe99ad8e4d3f7dc6d9b Signed-off-by: Manupa Karunaratne <manupa.karunaratne@arm.com>
2020-08-12 | MLBEDSW-2726: Vela crashes when marking tensor with TensorPurpose.Unknown | Dwight Lidman
This commit adds a missing entry for TensorPurpose.Unknown, mapping to MemType.Unknown in the tensor_storage_mem_type dictionary in the ArchitectureFeatures class in architecture_features.py Signed-off-by: Dwight Lidman <dwight.lidman@arm.com> Change-Id: I6c3d942e8c6f1c71c6496bdd621ca8d46ea76147
2020-08-12 | MLBEDSW-2586: Null check before accessing tensor resampling mode | Dwight Lidman
This commit amends a mistake where the resample_mode attribute of a tensor would be accessed without checking if the tensor in question was actually there first. Signed-off-by: Dwight Lidman <dwight.lidman@arm.com> Change-Id: Id2ceb1d6e38133611fcecfc2ac97150c927ceee2
2020-08-12 | MLBEDSW-2696 Fix Sram exceeded for Sram spilling | Patrik Gustavsson
Avoid concat op as predecessor in ifm streaming, when Sram spilling is to be applied. Signed-off-by: Patrik Gustavsson <patrik.gustavsson@arm.com> Change-Id: I2ba6283a7561a12d54a06552a15e122bb082b7a1
2020-08-12 | MLBEDSW-2681: Ceiling the upscale for OFM/IFM | Charles Xu
Signed-off-by: Charles Xu <charles.xu@arm.com> Change-Id: I566abd5a1ffc367c6b9b8f37d5a26b61d27e840b
2020-08-12 | MLBEDSW-2684: Fix weight compression scale calculations for FC | Jacob Bohlin
Fixed an issue with Fully Connected weights' shape used for compression scale calculations causing incorrect performance estimates. Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com> Change-Id: Id3a5c187ad3e942b8e3d4c690b3dbba3c6fda922
2020-08-12 | vela: Remove redundant import, reuse existing func | Michael McGeagh
We already import numeric_util, so there is no need to import it again for one func. Also replace handcoded full shape code with one already existing in numeric_util. Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com> Change-Id: Ib569409fbfd457a7b4b99006d51d9c43f25a1c2c
2020-08-12 | MLBEDSW-2637 Utilise new tensor and operator funcs | Michael McGeagh
add_input_tensor, set_output_tensor, create_const_tensor and create_reshape_tensor have recently been added. This replaces all existing instances that were found with these new helper functions. Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com> Change-Id: If33be8dbf237b2087b562b03cdeb51da1f99a786
2020-08-12 | MLBEDSW-2637 Refactor util funcs out of softmax.py | Michael McGeagh
There were a number of "TensorUtil" functions defined in softmax.py. These have been moved to their respective classes for Tensor and Operator. Two of the functions were not simple tensor/op functions. These helper functions have been moved to tensor.py for the simple fact that they return Tensors. Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com> Change-Id: I17d39c4e11f0837b7867b4a54da2e4a56383e095
2020-08-12 | MLBEDSW-2383 Preserve previous metadata | Michael McGeagh
The input tflite file potentially has metadata attached to it, which was lost when writing the vela optimised tflite file out. This patch preserves any metadata found. Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com> Change-Id: I7b4e941696d21b81802fd4398cd405323778bedf
2020-08-10 | MLBEDSW-2639: Moved the IFM/IFM2 order switch to register cmd stream generator | Jacob Bohlin
For binary elementwise ops with broadcasting in first IFM. Signed-off-by: Jacob Bohlin <jacob.bohlin@arm.com> Change-Id: I25af67be8d3a852247989bc3ddc8e08e946f6bfa
2020-08-06 | MLBEDSW-2549 Crash with incorrect strided slice op | Michael McGeagh
A valid strided slice should have (positive) non-zero elements when you do "end - begin". When encountering an invalid strided slice, Vela asserted. It now checks that the slice is valid and won't claim support if it isn't. Signed-off-by: Michael McGeagh <michael.mcgeagh@arm.com> Change-Id: I33ef118bd6a31ac78c680acb5229ff31b0809d6a
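The validity condition above, that every element of "end - begin" is strictly positive, can be expressed as a small predicate. This is a minimal sketch with a hypothetical function name; it deliberately ignores begin/end masks, negative indices and strides, which a real supported-operator check would also have to consider.

```python
from typing import Sequence


def is_valid_strided_slice(begin: Sequence[int], end: Sequence[int]) -> bool:
    """True when every dimension selects at least one element (end - begin > 0)."""
    return len(begin) == len(end) and all(e - b > 0 for b, e in zip(begin, end))
```

A supported-operator check could return early with "not supported" when this predicate is false, instead of asserting later in the pipeline.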
2020-08-06 | Skip the NOP resizebilinear op | Charles Xu
Signed-off-by: Charles Xu <charles.xu@arm.com> Change-Id: Ibd0cd152fbc46dea0c92fd1bf7da1ffc9803fdba
2020-08-06 | [EXTAPI] exposing encode of biases to be consumed by an external API | Manupa Karunaratne
* Renamed pack_bias_and_scale to encode_bias to be consumed externally * Added unit test for the API Change-Id: I71829f3fcb390c475795848f0be3d132d3e158ee Signed-off-by: Manupa Karunaratne <manupa.karunaratne@arm.com>