Age | Commit message | Author
2022-10-28 | Revert "MLBEDSW-6961: Bypass functionality for memory ops" | Johan Alfvén
This reverts commit 5060ff53f5ac2382e04a68d7772bd71a36f63845. Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: I8dd7e9ed8325fd2e8c17509fd9757292706f5ee7
2022-10-27 | MLBEDSW-7060: Bias tensor should be in 1D shape | Johan Alfvén
Always make sure the bias is a 1D tensor. Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: Ic0cb85d4fb9d2e07b4d1b7ac6059bffa432e28a3
2022-10-26 | MLBEDSW-7063: Fix output diff for networks with split ops | Johan Alfvén
- Due to a SPLIT op, the following ADD op got an IFM shape that is bigger than its original shape, but that is handled by read_offset and read_shapes. The problem was that the IFM was considered not to be primary and an erroneous swap was done. - Make it even more clear when the swap is allowed. Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: I0aefa04234f66c935f269267ae8ed1d77da64c81
2022-10-26 | MLBEDSW-7059: Updated offset calculation for Slice | Johan Alfvén
Corrected offset calculation for operator Slice. All values in tensor begin and tensor size must be used to calculate the offset range in order to read the correct data. Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: Ic463d8f72a2167f8129109b8dcf005f034cce6ed
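As a rough illustration of the idea in this commit message, the sketch below derives a per-dimension read range from every entry of `begin` and `size`. The function name `slice_read_range` and its signature are hypothetical and not taken from Vela; the only assumed semantics are those of the TFLite Slice operator, where a size of -1 means "to the end of the dimension".

```python
def slice_read_range(ifm_shape, begin, size):
    """Return per-dimension (start, end) offsets for a Slice; end is exclusive.

    Hypothetical helper for illustration only, not Vela's implementation.
    """
    if not (len(ifm_shape) == len(begin) == len(size)):
        raise ValueError("ifm_shape, begin and size must have the same rank")
    start, end = [], []
    for dim, b, s in zip(ifm_shape, begin, size):
        # A size of -1 selects everything from `begin` to the end of the dimension.
        e = dim if s == -1 else b + s
        if not 0 <= b <= e <= dim:
            raise ValueError(f"slice [{b}, {e}) is outside dimension of size {dim}")
        start.append(b)
        end.append(e)
    return start, end
```

Every dimension contributes to the result, so a Slice that cuts in height, width and depth at once yields a range that differs from the full shape in all three positions.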
2022-10-26 | MLBEDSW-6984: Optimize fast storage for feature maps | Johan Alfvén
- Remove very long live ranges that stand out compared to their neighbors. This can be seen on large networks with complex structure. If they are chosen instead of shorter live ranges, it will be difficult for the HillClimb Allocator to find a perfect fit in the final allocation. Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: I6cf23adfdc06c1e93e12e9cf816453d940ff31f7
2022-10-25 | MLBEDSW-7028: Fix compiler assert for elementwise op | Johan Alfvén
- Refactored an erroneous if statement that allowed illegal swapping between ifm1 and ifm2 for elementwise operators. Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: Iec571f710824432edac9104d960f199f33a1b241
2022-10-21 | MLBEDSW-6840: New stripe algo for optimize sub schedule | Johan Alfvén
- The algorithm for trying out different stripes in order to optimize a sub schedule/cascade has a problem: it can split the initial cascade into several smaller cascades. This increases IFM/OFM DRAM bandwidth, so performance drops. - Changed the stripe algorithm to prefer long cascades. Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: I4f38b381597b7094819e9dd463aa1876e4e6bc62
2022-10-20 | MLBEDSW-7019: Update to elementwise cascading | Johan Alfvén
- The cascade builder uses the ifm_ifm2_correct_order function to decide whether the operator is cascadable. The problem is that this function expects a full shape or no shape, and the cascade builder did not provide that, so the operator was reported to be non-cascadable. - The fix is to provide a full 4D shape, and to refactor ifm_ifm2_correct_order to use a 4D shape to avoid confusion in the future. - Refactored the code so that the scheduler can perform a correct ifm and ifm2 swap. Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: I9a86c4690612f332afa428456a07e67698852495
2022-10-20 | MLBEDSW-6931: Refactoring merge elementwise ops | Johan Alfvén
Change code in cascade builder to instead use common functionality in live range. Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: I7bbd7ea3d1e7e085813e9d93256a54e6bab2267b
2022-10-19 | MLBEDSW-6880: Add support for multiple subgraphs | Johan Alfvén
- Vela failed to compile networks with multiple subgraphs because only cascaded passes in the root subgraph were used when extracting the live ranges. The fix is to extract the subgraph live ranges on Ops that have connected subgraphs. - The tf_writer did not handle multiple subgraphs correctly, resulting in corrupt buffer data in the optimized tflite file. The buffer index must be unique for every tensor. - Added support for multiple subgraphs in the OfflineMemoryAllocation metadata. The change will not change behavior for single graphs. Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: I2328dfc1f07e2e4faf43a75423ea95423096ffa3
2022-10-19 | MLBEDSW-7020: TRANSPOSE_CONV stride documentation is confusing | Tim Hall
- The op contained supported operator checks for both the stride being in the range 1 to 3, and being equal to 2. Whilst both are correct, only the latter is needed - Removed the stride in the range 1 to 3 check for TRANSPOSE_CONV - Regenerated the documentation Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I9789cdbd3ed65ce310f1529036abbac62296d2ca
2022-10-18 | MLBEDSW-7018: Update CONTRIBUTIONS doc | Johan Alfvén
In order to be able to add your SSH key there must exist a valid email address in your account. Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: I60c70e63ea6ad015d5a10d8e9efec6d61d56cbad
2022-10-18 | MLBEDSW-6941: Set correct OFM shape for fc op | Johan Alfvén
If the IFM operator shape is rewritten so that batching is greater than one for a fully connected op, the OFM batch must also be calculated. This change fixes output diffs for networks that have a fully connected OFM with rank greater than 2. Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: I5009edc647a1449a02c8116b45808c1c68beffe6
2022-10-18 | MLBEDSW-6794: ResizeNearestNeighbor with HPC | Johan Alfvén
- Removed half pixel centers constraint for resize nearest neighbor. - Supported scale 2x, 4x and 8x. - Removed test_constraint_resize_half_pixel_centers - Regenerated SUPPORTED_OPS.md Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: Ic3e02e9c2b2034d537c9a9841b8fb4ee433c96dc
2022-10-12 | MLBEDSW-6971 Fix output diff when cascading elementwise operators | Fredrik Svedberg
Fixed output diff when cascading elementwise operators with reversed operand order. Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com> Change-Id: Iac2e28cfb53037b929459af213f4fa7715b3e6de
2022-10-12 | MLBEDSW-6987 Regressions after removing RescaleAdd/RescaleMul | Fredrik Svedberg
The problem was that the updated conditions for elementwise cascading were too permissive after the RescaleAdd removal. Conditions for elementwise updated and transpose convolution removed from cascading since it has issues. Change-Id: I0151256c4e3905fad39152941eec44bc76035d30 Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
2022-10-11 | MLBEDSW-6626: Initialize lut_val in mlw_codec | Johan Alfvén
The palette variable located on the stack was not properly initialized and could potentially overwrite the stack memory when palette size was increased to 2. Make sure lut value is initialized. Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: I9fecfe218dc39c0157d1af015e725d1e4becf2f0
2022-10-04 | MLBEDSW-6969 Remove RescaleAdd and RescaleMul operators | Fredrik Svedberg
Removed RescaleAdd and RescaleMul operators in favour of Operation.explicit_scale and removed Operation.rescale. Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com> Change-Id: Idccd8851731d4bb8d4e84970e0fd6b409d7d4e45
2022-10-03 | MLBEDSW-6979: Installing on aarch64 with Python 3.8 fails | Tim Hall
- The issue is due to the numpy version needed when installing on aarch64 with Python 3.8 and TensorFlow - The fix is to use the python_version variable when specifying the numpy version Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I6134b6dbccefc3be0b87feb17e3176b7f42641b3
2022-10-03 | MLBEDSW-6955: Update to TensorFlow 2.10 | erik.andersson@arm.com
- Updated to TensorFlow 2.10 and FlatBuffers 2.0.7 - Changed absolute to relative imports in the auto-generated code - Updated Vela's TFLite writer to support FlatBuffer builder's internal number of elements count - Removed use of deprecated numElems argument to FlatBuffer builder's EndVector() Signed-off-by: erik.andersson@arm.com <erik.andersson@arm.com> Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: If447778134db81ae0ac374c7397e1140082372fd
2022-10-03 | MLBEDSW-2723 Handle int16 multiplier overflow test case | Fredrik Svedberg
Added unit tests for scaling including saturated multiplier test. Change-Id: I87bb3a4bed8f62f5ef5cf3851b97f09ce42bf2b6 Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com>
2022-09-27 | MLBEDSW-6708 Check the bias tensor in graph optimiser mean op | Fredrik Svedberg
Cleaned up bias tensor use in graph optimiser for Mean operator. Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com> Change-Id: Ibcbfa010a4de67d97181df664b420168d6883d1e
2022-09-27 | MLBEDSW-6961: Bypass functionality for memory ops | Johan Alfvén
- In order to solve output diffs, the Reshape op was pushed to the CPU. The problem was that the Mean op ifm shape was replaced by the Reshape op ifm shape. - This limitation is now removed. Changed how memory-only ops are bypassed: always replace the memory-only op's ifm tensor with its ofm tensor. By doing this, the ifm tensor of the operator that follows the memory-only op is never changed. Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: Ibcdebf33fd9b7a37f90984a129500b5dac52e5ea
2022-09-27 | MLBEDSW-6962: MEAN height is greater than max kernel height | Johan Alfvén
Fixed bug when height is greater than max kernel height. The shape of the weight must match the ifm shape. Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: I901a8af2edd5858bb15d53d85ef8e2389049ada7
2022-09-27 | MLBEDSW-6933: Clean up address_for_coordinate function | Rickard Bolin
Make the address_for_coordinate function a bit easier to read Signed-off-by: Rickard Bolin <rickard.bolin@arm.com> Change-Id: I854e1643a39108edc8b1de95198d30a1891fdfd1
2022-09-26 | MLBEDSW-4075 PACK axis 0 + tanh fails with output diff | Fredrik Svedberg
The test failed since the tanh had batch size > 1. Added checks for batch size for all supported operators. Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com> Change-Id: I3570352740c40eb96bd9db965dfa3c91c81ff2ad
2022-09-26 | MLBEDSW-6932 LeakyRelu missing from supported ops activations | Fredrik Svedberg
Added LeakyRelu to supported activation ops. Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com> Change-Id: Icca27730946d02ec16159f988782567be716b594
2022-09-23 | MLBEDSW-6928: Add int16 support for Resize Bilinear HPC | Rickard Bolin
Setting bias tensor dtype to DataType.int32 solves rounding issues for RB HPC int16. Removing the input data type check also solves the issue of resize nearest neighbor int16 ops incorrectly getting placed on the CPU. Signed-off-by: Rickard Bolin <rickard.bolin@arm.com> Change-Id: Iee352bcb78e581c0cde3c203dfbe866f1f6fae18
2022-09-23 | MLBEDSW-6686: Resize bilinear HPC with tile padding | Rickard Bolin
- Added support for Resize Bilinear with half pixel centers for int8 and uint8. - Utilizes the new "TILE" padding mode. - Utilizes ofm stride multipliers and modified tile base offsets to write OFMs interleaved. Signed-off-by: Rickard Bolin <rickard.bolin@arm.com> Change-Id: I37fa77c022a368f05fda0ead75d8696c9205f833
2022-09-21 | MLBEDSW-4338 Randomized int16 PAD output diff | Fredrik Svedberg
The issue was that the AveragePool in these test cases was translated to DepthwiseConv2DBias and int16 convolutions always run with reduced scale. Fixed so that reduced scale is not used in this case. Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com> Change-Id: Ice956eabbb37c8aa1991464870006971c6ecec43
2022-09-16 | MLBEDSW-6938 Fix PReLU optimisation | Fredrik Svedberg
Fixed PReLU optimisation to LeakyReLU with negative alpha. Added optimisation of LeakyReLU to ReLU when alpha is zero. Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com> Change-Id: I5e66f79b29908fffd95b6115799021138ebb401a
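The two rewrites named in this commit message can be sketched as a small decision function. This is only an illustration under assumptions: the names `simplify_prelu` and `leaky_relu`, the string op labels, and the rule that all alpha values must be equal are hypothetical and do not come from Vela's graph optimiser.

```python
def leaky_relu(x, alpha):
    """Reference LeakyReLU: identity for x >= 0, alpha * x otherwise."""
    return x if x >= 0 else alpha * x


def simplify_prelu(alphas):
    """Pick a simpler op for a PReLU with the given per-channel alpha values.

    Returns (op_name, alpha). Hypothetical sketch, not Vela's implementation.
    """
    unique = set(alphas)
    if len(unique) != 1:
        # Alphas differ per channel, so a single-alpha op cannot represent it.
        return ("PRELU", None)
    alpha = unique.pop()
    if alpha == 0:
        # LeakyReLU with alpha == 0 clamps negatives to zero, which is ReLU.
        return ("RELU", None)
    # Holds for negative alpha too, as long as LeakyReLU is evaluated as the
    # conditional above: max(x, alpha * x) is only equivalent for 0 <= alpha <= 1.
    return ("LEAKY_RELU", alpha)
```

For example, `leaky_relu(-2.0, -0.5)` evaluates to `1.0`, whereas `max(-2.0, -0.5 * -2.0)` would give the same value only by coincidence of sign; for positive inputs and negative alpha the `max` form returns the wrong branch.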
2022-09-15 | MLBEDSW-6927: Add ofm_stride_multiplier attribute to operation | Rickard Bolin
Allow sparse writing of OFM by multiplying H/W/C of the OFM with the values of ofm_stride_multiplier Signed-off-by: Rickard Bolin <rickard.bolin@arm.com> Change-Id: I65d742ad36ad3154e9914cdd22e2da928ad1f095
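To make "sparse writing" concrete, the sketch below computes the element offsets an operation would write when each dense H/W/C stride is scaled by a multiplier. The function `sparse_offsets`, its parameter names, and the assumption of a dense row-major HWC layout measured in elements are all illustrative; the actual attribute layout and addressing in Vela may differ.

```python
def sparse_offsets(height, width, depth, mult_h=1, mult_w=1, mult_c=1, base=0):
    """List the element offsets written for an H x W x C OFM.

    Dense row-major HWC strides are scaled by the given multipliers.
    Hypothetical sketch, not Vela's addressing code.
    """
    stride_c = 1 * mult_c
    stride_w = depth * mult_w
    stride_h = width * depth * mult_h
    return [
        base + y * stride_h + x * stride_w + c * stride_c
        for y in range(height)
        for x in range(width)
        for c in range(depth)
    ]
```

With `mult_w=2`, a 1x2x1 OFM lands on offsets `[0, 2]`, and a second write with `base=1` lands on `[1, 3]`, so two operations can fill alternating columns of the same buffer.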
2022-09-13 | MLBEDSW-6929 Fix LeakyReLU int16 regressions | Fredrik Svedberg
Fixed LeakyReLU regressions for int16 due to scaling introduced for handling negative alpha. Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com> Change-Id: I84a494fedf54bd4b47c4632645ded7d6cda445f8
2022-09-12 | MLBEDSW-6863: Cleanup the constraint for concat | Johan Alfvén
Removed duplicate code and moved constraint to the correct file. Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: I2da3c5b88e1af351751c481217b8183b5948f0f8
2022-09-12 | MLBEDSW-6424: Remove Pipfile support | Rickard Bolin
Remove Pipfile support due to lack of testing and maintenance. Signed-off-by: Rickard Bolin <rickard.bolin@arm.com> Change-Id: I93786cdbf22bfa2130601291d23cead177bd8f81
2022-09-12 | MLBEDSW-6869 Improve LeakyRelu support | Fredrik Svedberg
Added support for int16 LeakyRelu for negative alpha and alpha greater than one. Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com> Change-Id: I7f522ebfe014786d0a1d96172e75c7d9bdd76921
2022-09-12 | MLBEDSW-6613: Implement tile padding | Rickard Bolin
Implement new padding mode which pads two edges of the IFM with the current values of those edges Signed-off-by: Rickard Bolin <rickard.bolin@arm.com> Change-Id: I8523e0cabdac80b48710703859003e33050cc150
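A minimal sketch of padding two edges with the values already on those edges, using a 2D feature map stored as nested lists. The commit message does not say which two edges are padded; choosing the bottom and right edges here is an assumption, as is the function name `tile_pad`.

```python
def tile_pad(ifm, bottom=1, right=1):
    """Pad a 2D feature map by repeating its last column and last row.

    Which two edges the real "TILE" mode pads is not stated in the commit
    message; bottom/right is an assumption made for this illustration.
    """
    # Repeat the last value of every row `right` times.
    rows = [row + [row[-1]] * right for row in ifm]
    # Repeat the (already widened) last row `bottom` times.
    rows += [list(rows[-1]) for _ in range(bottom)]
    return rows
```

Applied to `[[1, 2], [3, 4]]` with the defaults, this gives `[[1, 2, 2], [3, 4, 4], [3, 4, 4]]`.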
2022-09-12 | MLBEDSW-6909: Use int32 acc for the Mean op | Johan Alfvén
Changed acc type from int16 to int32. This will solve saturation problems and the constraint added in commit "MLBEDSW-5029: Output diff for Mean op" can be removed. Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: I05ec8835b43313b1a264d61a2b147fa62da123fe
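The saturation problem can be reproduced with a toy model of a fixed-width accumulator. This is a simplified sketch: `saturating_sum` and `mean_with_acc` are hypothetical names, and clamping after every addition is an assumed accumulator behaviour rather than a description of the hardware.

```python
def saturating_sum(values, bits):
    """Sum integers in a signed accumulator of `bits` width, clamping on overflow."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    acc = 0
    for v in values:
        acc = max(lo, min(hi, acc + v))
    return acc


def mean_with_acc(values, bits):
    """Mean computed from a saturating accumulator of the given width."""
    return saturating_sum(values, bits) / len(values)
```

Averaging 1024 elements that all equal 100 needs a sum of 102400, which exceeds the int16 maximum of 32767. `mean_with_acc([100] * 1024, 16)` therefore comes out far below 100, while `mean_with_acc([100] * 1024, 32)` returns the expected `100.0`.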
2022-09-08 | MLEMBED-1918: Issue with REDUCE_SUM on Ethos-U65-512 (tag: 3.6.0.rc0) | Tim Hall
- Ethos-U65-512 requires the input to REDUCE_SUM to use NHWC format - Updated the graph optimiser format check to cover this condition - Added an exception check to the backend of the compiler to verify that this condition has not been violated by the external api or Vela internals Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I2f1fabcbd264daf77d5822349d855a3a32b12c64
2022-09-06 | MLBEDSW-6870 Optimisations for PReLU | Fredrik Svedberg
Added optimisations for PReLU when the alpha values allow it. Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com> Change-Id: Iff9124e691663ee495379f89900e7c35dbc5f948
2022-09-01 | MLBEDSW-5029: Output diff for Mean op | Johan Alfvén
Fixed three test cases causing output diff compared to the reference kernel for the Mean operator. - If there is a possibility that the accumulator could saturate, the Mean op must run on the CPU - Use correct rounding for the bias term - If a Reshape op is followed by a Mean op, push the Reshape op to the CPU since this cannot be handled by the NPU Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: I734465730372105821a5e2f73a6a125b9eb7d7f4
2022-09-01 | MLBEDSW-6755: Add per-layer performance to CSV file | wilisa01
Dump the current per-layer performance estimation information that appears on the terminal to a CSV file. Change-Id: I00e94168704be8c3c674c8779fb807ed28607ccd Signed-off-by: wilisa01 <william.isaksson@arm.com>
2022-08-31 | MLBEDSW-6832 PReLU support in Vela | Fredrik Svedberg
Added PReLU support in graph optimiser. Signed-off-by: Fredrik Svedberg <fredrik.svedberg@arm.com> Change-Id: I3a188675e3edcdf0b4a4bfcdd134fda0bf8a560f
2022-08-25 | MLBEDSW-6879: TFLG pass-through test crash (tag: 3.5.0.rc4, tag: 3.5.0) | Tim Hall
- The optimisation of the SHAPE operator resulted in a divide by zero when printing the percentage of npu/cpu operators in the final output summary - The fix is to detect when there are no operators in the output tflite and then avoid the division Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: I5bd2342335e9468a8b7028e6e2291a03960e2e55
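The guard described in this commit message amounts to checking the operator count before dividing. A minimal sketch follows; the function names and the summary wording are hypothetical and not Vela's actual output.

```python
def npu_cpu_percentages(npu_ops, cpu_ops):
    """Return (npu %, cpu %) of operators, or None when there are no operators."""
    total = npu_ops + cpu_ops
    if total == 0:
        # Nothing left in the output network, so there is no ratio to report.
        return None
    return 100.0 * npu_ops / total, 100.0 * cpu_ops / total


def summary_line(npu_ops, cpu_ops):
    """Format a one-line summary without ever dividing by zero."""
    pct = npu_cpu_percentages(npu_ops, cpu_ops)
    if pct is None:
        return "No operators in the output network"
    return f"NPU: {pct[0]:.1f}%, CPU: {pct[1]:.1f}%"
```

Returning `None` instead of a fake `0.0` keeps the "no operators" case distinguishable from a network that runs entirely on the CPU.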
2022-08-23 | MLBEDSW-6748: Update SUPPORTED_OPERATORS.md and release notes (tag: 3.5.0.rc3) | oliper01
- Updated SUPPORT_OPERATORS.md with Resize operators - Updated release notes with the main changes and bug fixes - Updated version numbers Signed-off-by: oliper01 <oliver.perssonbogdanovski@arm.com> Signed-off-by: Tim Hall <tim.hall@arm.com> Change-Id: If25b5fab708098bc3e7eb243924b55a50f148c3a
2022-08-23 | MLBEDSW-6423: Updated documentation to detail new dependencies | erik.andersson@arm.com
Mypy and pylint were previously not included in TESTING.md. Also, installation of pre-commit, pytest and pytest-cov outside of a virtual environment was not detailed. CONTRIBUTIONS.md had an old Python version listed in the coding standard section. Signed-off-by: erik.andersson@arm.com <erik.andersson@arm.com> Change-Id: Idff9454083e41d719e6d75e90cb2be2861500eb9
2022-08-18 | MLBEDSW-6844: Exclude resize ops from cascades (tag: 3.5.0.rc2) | Johan Alfvén
Remove resize ops completely from being cascaded since there are corner cases which are not currently handled. Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: I9923f8e119af7bdc0e93b0e69b521b399e0629af
2022-08-17 | MLBEDSW-6769: Fix odd stripe heights for upscaling | erik.andersson@arm.com
Output diffs were found to be caused by odd input stripe heights, despite the input being an upscaling operator. Signed-off-by: erik.andersson@arm.com <erik.andersson@arm.com> Change-Id: Ia3791d815250364cfe7a38c3ed0e30768d64ca08
2022-08-17 | MLBEDSW-6645: MLCE: Optimize SRAM usage | Johan Alfvén
- When compiling for shared SRAM, the old scheduler has an option that results in lower SRAM usage than the new scheduler manages to achieve. The old scheduler was able to create more/longer cascades. In order to improve the new scheduler, the following has been implemented: - Take persistent IFMs into account when creating the min schedule. - Choose longer cascades when it is possible to reduce the total SRAM usage compared to using shorter cascades. - Updated calculation for estimated SRAM usage for elementwise ops. Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: I209bbf2d94425e4f6aacb1d151b3b2aa65c0870b
2022-08-17 | MLBEDSW-6830: MLCE: Fix assert on concat op | Johan Alfvén
- The compiler will assert when compiling a faulty concat op. In the reported use case, there were 3 inputs with shape 1x1x2 but the output shape was 1x1x2 (expected to be 1x1x6) - The solution is to add constraints to the concat operator. Signed-off-by: Johan Alfven <johan.alfven@arm.com> Change-Id: I94a505c51a9fd54d1aa92531a0415031db52378a
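One plausible form of such a constraint is a shape check: the output must equal the sum of the inputs along the concat axis and match every input elsewhere. The sketch below is an assumption about what the check looks like, not Vela's constraint code; `concat_shapes_valid` is a hypothetical name, and the reported case is assumed to concatenate along the last axis.

```python
def concat_shapes_valid(input_shapes, output_shape, axis):
    """Check that output_shape is a valid concatenation of input_shapes along axis."""
    rank = len(output_shape)
    if not input_shapes or any(len(s) != rank for s in input_shapes):
        return False
    if not -rank <= axis < rank:
        return False
    axis %= rank  # normalise negative axes
    for d in range(rank):
        if d == axis:
            # Sizes along the concat axis must add up to the output size.
            if sum(s[d] for s in input_shapes) != output_shape[d]:
                return False
        elif any(s[d] != output_shape[d] for s in input_shapes):
            # All other dimensions must match the output exactly.
            return False
    return True
```

For the case in the commit message, three inputs of shape 1x1x2 with an output of 1x1x2 are rejected, while an output of 1x1x6 is accepted.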