path: root/ethosu
Age | Commit message | Author
2022-10-19 | MLBEDSW-6880: Add support for multiple subgraphs | Johan Alfvén
- Vela failed to compile networks with multiple subgraphs because only cascaded passes in the root subgraph were used when extracting the live ranges. The fix is to extract the subgraph live range on Ops that have connected subgraphs.
- The tf_writer did not handle multiple subgraphs correctly, resulting in corrupt buffer data in the optimized tflite file. The buffer index must be unique for every tensor.
- Added support for multiple subgraphs in the OfflineMemoryAllocation metadata.
This change does not alter behavior for single graphs.
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: I2328dfc1f07e2e4faf43a75423ea95423096ffa3
2022-10-19 | MLBEDSW-7020: TRANSPOSE_CONV stride documentation is confusing | Tim Hall
- The op contained supported operator checks for both the stride being in the range 1 to 3, and being equal to 2. Whilst both are correct, only the latter is needed
- Removed the stride in the range 1 to 3 check for TRANSPOSE_CONV
- Regenerated the documentation
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I9789cdbd3ed65ce310f1529036abbac62296d2ca
2022-10-18 | MLBEDSW-6941: Set correct OFM shape for fc op | Johan Alfvén
If the IFM operator shape is rewritten so that batching is greater than one for a fully connected op, the OFM batch must also be calculated. This change fixes output diffs for networks that have a fully connected OFM with rank greater than 2.
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: I5009edc647a1449a02c8116b45808c1c68beffe6
2022-10-18 | MLBEDSW-6794: ResizeNearestNeighbor with HPC | Johan Alfvén
- Removed half pixel centers constraint for resize nearest neighbor.
- Supported scale 2x, 4x and 8x.
- Removed test_constraint_resize_half_pixel_centers
- Regenerated SUPPORTED_OPS.md
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: Ic3e02e9c2b2034d537c9a9841b8fb4ee433c96dc
2022-10-12 | MLBEDSW-6971 Fix output diff when cascading elementwise operators | Fredrik Svedberg
Fixed output diff when cascading elementwise operators with reversed operand order.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: Iac2e28cfb53037b929459af213f4fa7715b3e6de
2022-10-12 | MLBEDSW-6987 Regressions after removing RescaleAdd/RescaleMul | Fredrik Svedberg
The problem was that the updated conditions for elementwise cascading were too permissive after the RescaleAdd removal. The conditions for elementwise cascading were updated, and transpose convolution was removed from cascading since it has issues.
Change-Id: I0151256c4e3905fad39152941eec44bc76035d30
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
2022-10-11 | MLBEDSW-6626: Initialize lut_val in mlw_codec | Johan Alfvén
The palette variable located on the stack was not properly initialized and could potentially overwrite the stack memory when palette size was increased to 2. Make sure lut value is initialized.
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: I9fecfe218dc39c0157d1af015e725d1e4becf2f0
2022-10-04 | MLBEDSW-6969 Remove RescaleAdd and RescaleMul operators | Fredrik Svedberg
Removed RescaleAdd and RescaleMul operators in favour of Operation.explicit_scale and removed Operation.rescale.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: Idccd8851731d4bb8d4e84970e0fd6b409d7d4e45
2022-10-03 | MLBEDSW-6955: Update to TensorFlow 2.10 | erik.andersson@arm.com
- Updated to TensorFlow 2.10 and FlatBuffers 2.0.7
- Changed absolute to relative imports in the auto-generated code
- Updated Vela's TFLite writer to support FlatBuffer builder's internal number of elements count
- Removed use of deprecated numElems argument to FlatBuffer builder's EndVector()
Signed-off-by: erik.andersson@arm.com <erik.andersson@arm.com>
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: If447778134db81ae0ac374c7397e1140082372fd
2022-10-03 | MLBEDSW-2723 Handle int16 multiplier overflow test case | Fredrik Svedberg
Added unit tests for scaling including saturated multiplier test.
Change-Id: I87bb3a4bed8f62f5ef5cf3851b97f09ce42bf2b6
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
2022-09-27 | MLBEDSW-6708 Check the bias tensor in graph optimiser mean op | Fredrik Svedberg
Cleaned up bias tensor use in graph optimiser for Mean operator.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: Ibcbfa010a4de67d97181df664b420168d6883d1e
2022-09-27 | MLBEDSW-6961: Bypass functionality for memory ops | Johan Alfvén
- In order to solve output diffs, the Reshape op was pushed to the CPU. The problem was that the Mean op ifm shape was replaced by the Reshape op ifm shape.
- This limitation is now removed. Changed how memory-only ops are bypassed: always replace the memory-only op ifm tensor with its ofm tensor. By doing this, the ifm tensor of the operator that follows the memory-only op is never changed.
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: Ibcdebf33fd9b7a37f90984a129500b5dac52e5ea
2022-09-27 | MLBEDSW-6962: MEAN height is greater than max kernel height | Johan Alfvén
Fixed bug when height is greater than max kernel height. The shape of the weight must match the ifm shape.
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: I901a8af2edd5858bb15d53d85ef8e2389049ada7
2022-09-27 | MLBEDSW-6933: Clean up address_for_coordinate function | Rickard Bolin
Make the address_for_coordinate function a bit easier to read
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
Change-Id: I854e1643a39108edc8b1de95198d30a1891fdfd1
2022-09-26 | MLBEDSW-4075 PACK axis 0 + tanh fails with output diff | Fredrik Svedberg
The test failed since the tanh had batch size > 1. Added checks for batch size for all supported operators.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: I3570352740c40eb96bd9db965dfa3c91c81ff2ad
2022-09-26 | MLBEDSW-6932 LeakyRelu missing from supported ops activations | Fredrik Svedberg
Added LeakyRelu to supported activation ops.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: Icca27730946d02ec16159f988782567be716b594
2022-09-23 | MLBEDSW-6928: Add int16 support for Resize Bilinear HPC | Rickard Bolin
Setting bias tensor dtype to DataType.int32 solves rounding issues for RB HPC int16. Removing the input data type check also solves the issue of resize nearest neighbor int16 ops incorrectly getting placed on the CPU.
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
Change-Id: Iee352bcb78e581c0cde3c203dfbe866f1f6fae18
2022-09-23 | MLBEDSW-6686: Resize bilinear HPC with tile padding | Rickard Bolin
- Added support for Resize Bilinear with half pixel centers for int8 and uint8.
- Utilizes the new "TILE" padding mode.
- Utilizes ofm stride multipliers and modified tile base offsets to write OFMs interleaved.
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
Change-Id: I37fa77c022a368f05fda0ead75d8696c9205f833
2022-09-21 | MLBEDSW-4338 Randomized int16 PAD output diff | Fredrik Svedberg
The issue was that the AveragePool in these test cases was translated to DepthwiseConv2DBias, and int16 convolutions always run with reduced scale. Fixed so that reduced scale is not used in this case.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: Ice956eabbb37c8aa1991464870006971c6ecec43
2022-09-16 | MLBEDSW-6938 Fix PReLU optimisation | Fredrik Svedberg
Fixed PReLU optimisation to LeakyReLU with negative alpha. Added optimisation of LeakyReLU to ReLU when alpha is zero.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: I5e66f79b29908fffd95b6115799021138ebb401a
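The rewrite described in this commit can be pictured with a small sketch. It is not Vela's implementation: the function name `simplify_prelu`, the returned operator labels, and the rule that PReLU collapses to LeakyReLU only when every alpha value is identical are all assumptions made for illustration.

```python
from typing import Optional, Sequence, Tuple


def simplify_prelu(alphas: Sequence[float]) -> Tuple[str, Optional[float]]:
    """Pick a cheaper activation for a PReLU given its per-channel alphas.

    Hypothetical helper: returns (operator label, alpha or None).
    """
    if not alphas:
        raise ValueError("PReLU needs at least one alpha value")
    first = alphas[0]
    # Assumption: only a uniform alpha allows replacing PReLU.
    if any(a != first for a in alphas):
        return ("PRELU", None)
    # LeakyReLU with alpha == 0 is exactly ReLU.
    if first == 0:
        return ("RELU", None)
    # Negative alpha is kept as-is; it is still a valid LeakyReLU slope.
    return ("LEAKY_RELU", first)
```

For example, `simplify_prelu([-0.25, -0.25])` keeps the negative slope and yields a LeakyReLU, while `simplify_prelu([0.0, 0.0])` yields a plain ReLU.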
2022-09-15 | MLBEDSW-6927: Add ofm_stride_multiplier attribute to operation | Rickard Bolin
Allow sparse writing of OFM by multiplying H/W/C of the OFM with the values of ofm_stride_multiplier
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
Change-Id: I65d742ad36ad3154e9914cdd22e2da928ad1f095
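As a rough illustration of what sparse writing with a stride multiplier means, the sketch below scatters a small source block into a larger destination grid. It is a simplified 2-D model, not the Vela code path: the function `sparse_write`, its base offset parameters, and the restriction to height and width (no channel axis) are assumptions.

```python
from typing import List


def sparse_write(dest: List[List[int]], src: List[List[int]],
                 base_y: int, base_x: int, mult_h: int, mult_w: int) -> None:
    """Write src into dest, stepping by the multipliers between elements."""
    for y, row in enumerate(src):
        for x, value in enumerate(row):
            # Each source coordinate is scaled by its stride multiplier.
            dest[base_y + y * mult_h][base_x + x * mult_w] = value


# Two writes with a width multiplier of 2 and different base offsets
# end up interleaved column by column in the same destination row.
grid = [[0, 0, 0, 0]]
sparse_write(grid, [[1, 2]], base_y=0, base_x=0, mult_h=1, mult_w=2)
sparse_write(grid, [[3, 4]], base_y=0, base_x=1, mult_h=1, mult_w=2)
```

After both calls `grid` holds the two sources interleaved, which is the kind of layout the Resize Bilinear half-pixel-centers commit describes when it mentions writing OFMs interleaved.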
2022-09-13 | MLBEDSW-6929 Fix LeakyReLU int16 regressions | Fredrik Svedberg
Fixed LeakyReLU regressions for int16 due to scaling introduced for handling negative alpha.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: I84a494fedf54bd4b47c4632645ded7d6cda445f8
2022-09-12 | MLBEDSW-6863: Cleanup the constraint for concat | Johan Alfvén
Removed duplicate code and moved constraint to the correct file.
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: I2da3c5b88e1af351751c481217b8183b5948f0f8
2022-09-12 | MLBEDSW-6869 Improve LeakyRelu support | Fredrik Svedberg
Added support for int16 LeakyRelu for negative alpha and alpha greater than one.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: I7f522ebfe014786d0a1d96172e75c7d9bdd76921
2022-09-12 | MLBEDSW-6613: Implement tile padding | Rickard Bolin
Implement new padding mode which pads two edges of the IFM with the current values of those edges
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
Change-Id: I8523e0cabdac80b48710703859003e33050cc150
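A minimal sketch of edge-replicating padding on two edges follows. The commit does not say which two edges are padded or by how much, so the choice of bottom and right, the default pad width of 1, and the name `tile_pad` are assumptions for illustration only.

```python
from typing import List


def tile_pad(ifm: List[List[int]], pad_bottom: int = 1,
             pad_right: int = 1) -> List[List[int]]:
    """Pad a 2-D feature map by repeating its last column and last row."""
    if not ifm or not ifm[0]:
        raise ValueError("IFM must be non-empty")
    # Extend every row to the right with copies of its last value.
    rows = [row + [row[-1]] * pad_right for row in ifm]
    # Extend downwards with copies of the (already widened) last row.
    rows += [list(rows[-1]) for _ in range(pad_bottom)]
    return rows
```

With the defaults, a 2x2 input `[[1, 2], [3, 4]]` becomes the 3x3 map `[[1, 2, 2], [3, 4, 4], [3, 4, 4]]`.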
2022-09-12 | MLBEDSW-6909: Use int32 acc for the Mean op | Johan Alfvén
Changed acc type from int16 to int32. This will solve saturation problems and the constraint added in commit "MLBEDSW-5029: Output diff for Mean op" can be removed.
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: I05ec8835b43313b1a264d61a2b147fa62da123fe
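Why the accumulator width matters for a mean can be shown with a simplified model. The sketch assumes a saturating signed accumulator and a plain integer division at the end; the helper `saturating_sum` and the input size of 1024 elements are illustrative choices, not details taken from the commit.

```python
from typing import Iterable


def saturating_sum(values: Iterable[int], bits: int) -> int:
    """Sum values in a signed accumulator that clamps instead of wrapping."""
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    acc = 0
    for v in values:
        acc = max(lo, min(hi, acc + v))
    return acc


# 1024 int8 inputs at the maximum value 127: the true sum is 130048.
inputs = [127] * 1024
mean_int16_acc = saturating_sum(inputs, 16) // len(inputs)  # clamps at 32767
mean_int32_acc = saturating_sum(inputs, 32) // len(inputs)  # no clamping
```

Under these assumptions the 16-bit accumulator gives a mean of 31, while the 32-bit accumulator gives the expected 127.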
2022-09-08 | MLEMBED-1918: Issue with REDUCE_SUM on Ethos-U65-512 (tag: 3.6.0.rc0) | Tim Hall
- Ethos-U65-512 requires the input to REDUCE_SUM to use NHWC format
- Updated the graph optimiser format check to cover this condition
- Added an exception check to the backend of the compiler to verify that this condition has not been violated by the external api or Vela internals
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I2f1fabcbd264daf77d5822349d855a3a32b12c64
2022-09-06 | MLBEDSW-6870 Optimisations for PReLU | Fredrik Svedberg
Added optimisations for PReLU when the alpha values allow it.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: Iff9124e691663ee495379f89900e7c35dbc5f948
2022-09-01 | MLBEDSW-5029: Output diff for Mean op | Johan Alfvén
Fixed three test cases causing output diff compared to the reference kernel for the Mean operator.
- If there is a possibility that the accumulator could saturate, the Mean op must run on the CPU
- Use correct rounding for the bias term
- If a Reshape op is followed by a Mean op, push the Reshape op to the CPU since this cannot be handled by the NPU
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: I734465730372105821a5e2f73a6a125b9eb7d7f4
2022-09-01 | MLBEDSW-6755: Add per-layer performance to CSV file | wilisa01
Dump the current per-layer performance estimation information that appears on the terminal to a CSV file.
Change-Id: I00e94168704be8c3c674c8779fb807ed28607ccd
Signed-off-by: wilisa01 <william.isaksson@arm.com>
2022-08-31 | MLBEDSW-6832 PReLU support in Vela | Fredrik Svedberg
Added PReLU support in graph optimiser.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: I3a188675e3edcdf0b4a4bfcdd134fda0bf8a560f
2022-08-25 | MLBEDSW-6879: TFLG pass-through test crash (tag: 3.5.0.rc4, tag: 3.5.0) | Tim Hall
- The optimisation of the SHAPE operator resulted in a divide by zero when printing the percentage of npu/cpu operators in the final output summary
- The fix is to detect when there are no operators in the output tflite and then avoid the division
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I5bd2342335e9468a8b7028e6e2291a03960e2e55
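The guard described in this commit can be sketched as follows. The function name `operator_split` and the exact message strings are hypothetical; only the idea of skipping the division when the output network has no operators comes from the commit message.

```python
def operator_split(npu_ops: int, cpu_ops: int) -> str:
    """Format the NPU/CPU operator split, tolerating an empty network."""
    total = npu_ops + cpu_ops
    if total == 0:
        # Every operator was optimised away, so there is nothing to divide by.
        return "No operators in the optimised network"
    npu_pct = 100.0 * npu_ops / total
    cpu_pct = 100.0 * cpu_ops / total
    return f"NPU: {npu_ops} ({npu_pct:.1f}%), CPU: {cpu_ops} ({cpu_pct:.1f}%)"
```

Calling `operator_split(0, 0)` returns the fallback message instead of raising `ZeroDivisionError`.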
2022-08-18 | MLBEDSW-6844: Exclude resize ops from cascades (tag: 3.5.0.rc2) | Johan Alfvén
Remove resize ops completely from being cascaded since there are corner cases which are not currently handled.
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: I9923f8e119af7bdc0e93b0e69b521b399e0629af
2022-08-17 | MLBEDSW-6769: Fix odd stripe heights for upscaling | erik.andersson@arm.com
Output diffs were found to be caused by odd input stripe heights, despite the input being an upscaling operator.
Signed-off-by: erik.andersson@arm.com <erik.andersson@arm.com>
Change-Id: Ia3791d815250364cfe7a38c3ed0e30768d64ca08
2022-08-17 | MLBEDSW-6645: MLCE: Optimize SRAM usage | Johan Alfvén
When compiling for shared SRAM, the old scheduler has an option that lets it use less SRAM than the new scheduler manages to. The old scheduler was able to create more/longer cascades. In order to improve the new scheduler, the following has been implemented:
- Take persistent IFMs into account when creating the min schedule.
- Choose longer cascades when it is possible to reduce the total SRAM usage compared to using shorter cascades.
- Updated calculation for estimated SRAM usage for elementwise ops.
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: I209bbf2d94425e4f6aacb1d151b3b2aa65c0870b
2022-08-17 | MLBEDSW-6830: MLCE: Fix assert on concat op | Johan Alfvén
- The compiler will assert when compiling a faulty concat op. In the reported use case, there were 3 inputs with shape 1x1x2 but the output shape was 1x1x2 (expected to be 1x1x6)
- The solution is to add constraints to the concat operator.
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: I94a505c51a9fd54d1aa92531a0415031db52378a
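One way to express the kind of constraint mentioned here is a shape check that runs before compilation. The sketch below is an assumption about what such a check could look like, not the constraint Vela actually adds; `concat_shape_is_valid` and its parameters are hypothetical names.

```python
from typing import Sequence


def concat_shape_is_valid(ifm_shapes: Sequence[Sequence[int]],
                          ofm_shape: Sequence[int], axis: int) -> bool:
    """Check that concatenating the inputs along axis gives ofm_shape."""
    rank = len(ofm_shape)
    if not ifm_shapes or not -rank <= axis < rank:
        return False
    axis %= rank  # normalise negative axes
    for shape in ifm_shapes:
        if len(shape) != rank:
            return False
        # All dimensions except the concat axis must match the output.
        if any(shape[d] != ofm_shape[d] for d in range(rank) if d != axis):
            return False
    # The concat axis of the output is the sum of the input sizes.
    return sum(shape[axis] for shape in ifm_shapes) == ofm_shape[axis]
```

For the reported case, three `1x1x2` inputs with a `1x1x2` output fail the check along the last axis, while a `1x1x6` output passes.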
2022-08-16 | MLBEDSW-6640: Modify elementwise block size selection | Rickard Bolin
Limit relative cost to 1 for elementwise operations since increasing block size when the full ofm already fits gives no additional benefits.
Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
Change-Id: Ib6128f6346834fd916efa59adbe07a069dbda0ae
2022-08-10 | Revert reversion of TensorFlow 2.9 update (tag: 3.5.0.rc1) | erik.andersson@arm.com
With the errors caused by the previous TensorFlow 2.9 update being fixed, we can proceed with the upgrade.
Signed-off-by: erik.andersson@arm.com <erik.andersson@arm.com>
Change-Id: Ie1f025e8d984efaebc68b8d051126d49bee6b2b8
2022-07-23 | MLBEDSW-4157: Add RESIZE_NEAREST_NEIGHBOR support | Tim Hall
- Changed ResizeBilinear to support ResizeNearestNeighbor as well for 1x1 IFM, IFM equal OFM, and non-align corners
- Added support for ResizeNearestNeighbor with align corners by converting to a DepthwiseConv
- Updated supported operator unit tests
- Added is_resize() helper function and some associated refactoring
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: Id5bdf2a25e8aa6a4f28b7236250abf768141ce37
2022-07-23 | MLBEDSW-6616: ResizeBilinear align corners is incorrect | Tim Hall
- Fixed align corners support when converting into upscale and average pool. The problem was due to the wrong ratio of IFM to OFM size, causing a scaling factor that was not 2x/4x/8x. Works for uint8, int8 and int16.
- Fixed checking of align corners in supported operators check
- Added additional supported operators check for the size tensor
- Updated and added more supported operators unit tests
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: Idb78fa9e76ede2c37e8ac6cb1c322154bd156898
2022-07-23 | vela: OFM_SCALE refactor | Tim Hall
- Minor rework at the register command stream level
Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I58495e40efa3a95bdf6febde530f9f73fa8be30b
2022-07-19 | MLBEDSW-6700: Fix compiler assert when fusing tensors | Johan Alfvén
If an elementwise op is part of a cascade, the ifm cannot be overwritten by the ofm.
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: I1e5f7ee501be17e76684b33c6e86ab8af0f3e61f
2022-07-19 | MLBEDSW-6710: Revert Tensorflow 2.9 | Johan Alfvén
Tensorflow 2.9 contains a bug for int16x8 without biases.
Revert "MLBEDSW-6635: Update to TensorFlow 2.9"
This reverts commit 93f492bae9c4dd16a1f64b851b237263695ee03e.
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
Change-Id: I366d201ce4134a877d333be2aade546dfcb5d6d7
2022-07-15 | MLBEDSW-6703 Add SHAPE operator to supported operators | Fredrik Svedberg
Added SHAPE operator to the supported operators report. Updated the constraints for QUANTIZE and SHAPE operator. Also fixed RESHAPE consuming statically optimised shape.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: I1d964d602d3f361a0f16dae8133197280dd84c48
2022-07-14 | MLBEDSW-6635: Update to TensorFlow 2.9 | erik.andersson@arm.com
Update the flatbuffers generated code to comply with TensorFlow 2.9
Signed-off-by: erik.andersson@arm.com <erik.andersson@arm.com>
Change-Id: I6bf506ffb85da2d4a57a32198b471513deeaca73
2022-07-13 | MLBEDSW-6496 mlperf_deeplabv3_mnv2_ade20k_int8 fails at verify_output for u65 | Fredrik Svedberg
Added check to see if additional stripe data is needed from producer op when cascading to make sure the stripes are not overwriting data still being used. Also changed scheduler to make sure ResizeBilinear always runs with even stripe height.
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
Change-Id: If7d723e6be29575c2b55c400eebbe8275a1aa328
2022-07-13 | MLBEDSW-6687 Vela crashes in npu_serialisation.py and tflite_graph_optimiser.py | Fredrik Svedberg
Fixed static optimisation of Quantize operator by running unsupported formats on CPU. Also added support for int16 and corrected the calculation.
Change-Id: I861c712aa6258dba53fcf4d5dae45d1d416e6141
Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
2022-07-12 | MLBEDSW-4856: Removed dead code | oliper01
Hardswish activation function gets converted to LUT in graph optimizer. The case for it was removed, as it was never called.
Signed-off-by: oliper01 <oliver.perssonbogdanovski@arm.com>
Change-Id: I376e8d7b81489c06b66d4e49f59b207600c0ccce
2022-07-11 | MLBEDSW-6261: Elementwise cascading | erik.andersson@arm.com
Enabled elementwise cascading for binary/single variable IFM operators.
Signed-off-by: erik.andersson@arm.com <erik.andersson@arm.com>
Change-Id: I1c0867875fdc5c4980224fb570185c11e719d5cd
2022-06-29 | MLBEDSW-6314 Static optimisation for quantise OP | Ayaan Masood
* Quantise op becomes constant if input is known at compile time
* Quantised values calculated if input of op is const and float
* Const inputs to quant op that are int are requantized
Change-Id: Ic94a72a392af709fe6a640d7dacbb5dc2334f16f
Signed-off-by: Ayaan Masood <Ayaan.Masood@arm.com>
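Folding a quantise op with a constant float input amounts to evaluating the affine quantisation formula at compile time. The sketch below uses the standard form q = round(x / scale) + zero_point, clamped to the target range; the function name `quantise`, the int8 default range, and rounding half away from zero are assumptions rather than details from the commit.

```python
import math
from typing import List, Sequence


def quantise(values: Sequence[float], scale: float, zero_point: int,
             qmin: int = -128, qmax: int = 127) -> List[int]:
    """Quantise constant float values ahead of time (int8 range by default)."""
    out = []
    for v in values:
        scaled = v / scale
        # Round half away from zero (assumed rounding mode).
        rounded = int(math.copysign(math.floor(abs(scaled) + 0.5), scaled))
        # Shift by the zero point and clamp to the representable range.
        out.append(max(qmin, min(qmax, rounded + zero_point)))
    return out
```

With `scale=0.5` and `zero_point=0`, the inputs `[0.0, 0.5, -0.5, 100.0]` become `[0, 1, -1, 127]`, the last value being clamped to the int8 maximum.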