author: Tim Hall <tim.hall@arm.com> 2020-11-01 20:59:36 +0000
committer: Tim Hall <tim.hall@arm.com> 2020-11-20 12:55:47 +0000
commit: 1bd531dec0b4eb745fb8856d14c1aba2b8a73026
tree: a0265a0accd2395277fe88be27164d09541abc7f /OPTIONS.md
parent: c8a73868d40cf63380f634baeb51aa7aa993fc0c
MLBEDSW-3249: Vela config file examples
- Added sample vela.ini config file
- Changed vela config format, split into system config and memory mode
- Removed unused CPU cycle performance estimation
- Added new CLI options for --memory-mode and --verbose-config
- Changed CLI option --config to take multiple files
- Removed CLI option --global-memory-clock-scales
- Changed error helper functions to raise a VelaError exception
- Refactored to create a new is_spilling_enabled function

Signed-off-by: Tim Hall <tim.hall@arm.com>
Change-Id: I27c41577e37a3859edb9524cd99784be10ef0a0d
Diffstat (limited to 'OPTIONS.md')
-rw-r--r-- OPTIONS.md 225
1 files changed, 128 insertions, 97 deletions
diff --git a/OPTIONS.md b/OPTIONS.md
index f02b91e9..baf6c5a3 100644
--- a/OPTIONS.md
+++ b/OPTIONS.md
@@ -2,13 +2,13 @@
This file contains a more verbose and detailed description of the Vela
Compiler's CLI options than the built-in help strings. It also defines and
-describes the Vela system configuration file format.
+describes Vela's configuration file format.
## Command Line Interface
### Network (required)
-Filename of the network model to compile. The file has to be a `.tflite` file.
+Filename of the network model to compile. The file has to be a `.tflite` file.
**Type: POSIX path**
**Default: N/A**
@@ -18,7 +18,7 @@ vela path/to/network.tflite
### Help
-Displays the help strings of all CLI options. Can be used without the required
+Displays the help strings of all CLI options. Can be used without the required
Network argument.
**Type: N/A**
**Default: N/A**
@@ -29,7 +29,7 @@ vela --help
### Version
-Displays the version of the installed Vela Compiler. Can be used without the
+Displays the version of the installed Vela Compiler. Can be used without the
required Network argument.
**Type: N/A**
**Default: N/A**
@@ -75,19 +75,21 @@ vela network.tflite --output-dir ./custom_directory
### Config
-Specifies the path to the config file. The file has to be a `.ini` file. The
-format is described further in a the Config section below.
+Specifies the path to the Vela configuration file. The format of the file is a
+Python ConfigParser `.ini` file. This option can be specified multiple times to
+allow multiple files to be searched for the required system config and memory
+mode. More details can be found in the Configuration File section below.
**Type: POSIX path**
**Default: use default configuration**
```bash
-vela network.tflite --config custom_config.ini
+vela network.tflite --config my_vela_cfg1.ini --config my_vela_cfg2.ini --system-config My_Sys_Cfg --memory-mode My_Mem_Mode
```
### Cascading
-Controls the packing of multiple passes into cascades. This allows for lower
-memory usage. If the network's intermediate feature maps are too large for the
+Controls the packing of multiple passes into cascades. This allows for lower
+memory usage. If the network's intermediate feature maps are too large for the
system's SRAM this optimisation is required.
**Type: Boolean**
**Default: True**
@@ -109,7 +111,7 @@ vela network.tflite --keep-scale-placement
Force a specific block configuration in the format HxWxC, where H, W, and C are
positive integers specifying height, width, and channels (depth), respectively.
-The default behaviour is Vela searching for an optimal block configuration. An
+The default behaviour is Vela searching for an optimal block configuration. An
exception will be raised if the chosen block configuration is incompatible.
**Type: String**
**Default: N/A**
@@ -121,7 +123,7 @@ vela network.tflite --force-block-config 2x2x8
### Timing
Measure time taken for different compiler steps, e.g. model reading and
-scheduling. Prints the results to standard out.
+scheduling. Prints the results to standard out.
**Type: Set True**
**Default: False**
@@ -131,9 +133,9 @@ vela network.tflite --timing
### Accelerator Configuration
-Choose which hardware accelerator configuration to compile for. Format is
+Choose which hardware accelerator configuration to compile for. Format is
accelerator name followed by a hyphen, followed by the number of MACs in the
-configuration.
+configuration.
**Type: String**
**Default: ethos-u55-256**
**Choices: [ethos-u55-32, ethos-u55-64, ethos-u55-128, ethos-u55-256]**
@@ -144,13 +146,24 @@ vela network.tflite --accelerator-config ethos-u55-64
### System Config
-Selects the system configuration to use as specified in the System Configuration
-File (see section below).
+Selects the system configuration to use as specified in the Vela configuration
+file (see section below).
**Type: String**
**Default: Use internal default config**
```bash
-vela network.tflite --system-config MySysConfig
+vela network.tflite --config my_vela_cfg.ini --system-config My_Sys_Cfg
+```
+
+### Memory Mode
+
+Selects the memory mode to use as specified in the Vela configuration file (see
+section below).
+**Type: String**
+**Default: Use internal default config**
+
+```bash
+vela network.tflite --config my_vela_cfg.ini --memory-mode My_Mem_Mode
```
### Tensor Allocator
@@ -167,9 +180,9 @@ vela network.tflite --tensor-allocator=LinearAlloc
### Ifm Streaming
-Controls scheduler IFM streaming search. Vela's scheduler will choose between
-IFM Streaming and Weight Streaming for optimal memory usage. Disabling this will
-cause Vela to always choose Weight Streaming.
+Controls scheduler IFM streaming search. Vela's scheduler will choose between
+IFM Streaming and Weight Streaming for optimal memory usage. Disabling this
+will cause Vela to always choose Weight Streaming.
**Type: Boolean**
**Default: True**
@@ -179,8 +192,8 @@ vela network.tflite --ifm-streaming False
### Block Config Limit
-Limit the block config search space. This will result in faster compilation
-times but may impact the performance of the output network. Use 0 for unlimited
+Limit the block config search space. This will result in faster compilation
+times but may impact the performance of the output network. Use 0 for unlimited
search.
**Type: Integer**
**Default: 16**
@@ -190,22 +203,10 @@ search.
vela network.tflite --block-config-limit 0
```
-### Global Memory Clock Scale
-
-Performs an additional scaling of the individual memory clock scales specified
-by the system configuration. Used to globally adjust the bandwidth of the
-various memories
-**Type: Float**
-**Default: 1.0**
-
-```bash
-vela network.tflite --global-memory-clock-scale 1.5
-```
-
### Pareto Metric
-Controls the calculation of the pareto metric. Use 'BwCycMemBlkH' to consider
-Block Height in addition to Bandwidth, Cycle count and Memory. This can reduce
+Controls the calculation of the pareto metric. Use 'BwCycMemBlkH' to consider
+Block Height in addition to Bandwidth, Cycle count and Memory. This can reduce
SRAM usage in some circumstances.
**Type: String**
**Default: BwCycMem**
@@ -218,9 +219,9 @@ vela network.tflite --pareto-metric BwCycMemBlkH
### Recursion Limit
Some of Vela's algorithms use recursion and the required depth can be network
-dependant. This option allows the limit to be increased if needed. The maximum
-limit is platform dependent. If limit is set too low then compilation will raise
-a RecursionError exception.
+dependant. This option allows the limit to be increased if needed. The maximum
+limit is platform dependent. If limit is set too low then compilation will
+raise a RecursionError exception.
**Type: Integer**
**Default: 10000**
@@ -244,7 +245,7 @@ vela network.tflite --enable-debug-db
### Max Block Dependency
Set the maximum value that can be used for the block dependency delay between
-NPU kernel operations. A lower value may result in longer execution time.
+NPU kernel operations. A lower value may result in longer execution time.
**Type: Integer**
**Default: 3**
**Choices: [0, 1, 2, 3]**
@@ -255,8 +256,9 @@ vela network.tflite --max-block-dependency 0
### Tensor Format Between Cascaded Passes
-Controls if NHCWB16 or NHWC Tensor format should be used in between cascaded passes. NHWCB16 means FeatureMaps are laid
-out in 1x1x16B bricks in row-major order. This enables more efficient FeatureMap reading from external memory.
+Controls if NHCWB16 or NHWC Tensor format should be used in between cascaded
+passes. NHCWB16 means FeatureMaps are laid out in 1x1x16B bricks in row-major
+order. This enables more efficient FeatureMap reading from external memory.
**Type: Boolean**
**Default: True**
**Choices: [True, False]**
@@ -267,9 +269,10 @@ vela network.tflite --nhcwb16-between-cascaded-passes
### Scaling of weight estimates
-Performs an additional scaling of weight compression estimate used by Vela to estimate SRAM usage.
-Increasing this scaling factor will make the estimates more conservative (lower) and this can result
-in optimisations that use less SRAM, albeit at the cost of performance (inference speed).
+Performs an additional scaling of weight compression estimate used by Vela to
+estimate SRAM usage. Increasing this scaling factor will make the estimates
+more conservative (lower) and this can result in optimisations that use less
+SRAM, albeit at the cost of performance (inference speed).
**Type: Float**
**Default: 1.0**
@@ -279,8 +282,9 @@ vela network.tflite --weight-estimation-scaling=1.2
### Allocation alignment
-Controls the allocation byte alignment. Only affects CPU tensors, NPU tensors will remain 16-byte
-aligned independent of this option. Alignment has to be a power of two and greater or equal to 16.
+Controls the allocation byte alignment. Only affects CPU tensors, NPU tensors
+will remain 16-byte aligned independent of this option. Alignment has to be a
+power of two and greater or equal to 16.
**Type: Integer**
**Default: 16**
@@ -317,6 +321,16 @@ Show the operations that fall back to the CPU.
vela network.tflite --show-cpu-operations
```
+### Verbose Config
+
+Verbose system configuration and memory mode. If no `--system-config` or
+`--memory-mode` CLI options are specified then the `internal-default` values
+will be displayed.
+
+```bash
+vela network.tflite --verbose-config
+```
+
### Verbose Graph
Verbose graph rewriter.
@@ -405,62 +419,79 @@ Verbose operator list.
vela network.tflite --verbose-operators
```
-## System Configuration File
+## Configuration File
+
+This is used to describe various properties of the Ethos-U embedded system. The
+configuration file is selected using the `--config` CLI option along with a file
+that describes the properties. The file uses the Python ConfigParser `.ini`
+format, which consists of sections used to identify a configuration, and
+key/value pair options used to specify the properties. All sections and
+key/value pairs are case-sensitive.
+
+There are two types of section, system configuration `[System_Config.*]`
+sections and memory mode `[Memory_Mode.*]` sections. A complete Ethos-U
+embedded system should define at least one entry in each section, where an entry
+is identified using the format `[Part.Name]` (Part = {System_Config or
+Memory_Mode}, Name = {a string with no spaces}). A configuration file may
+contain multiple entries per section, with each entry's `.Name` being used to
+select it using the `--system-config` and `--memory-mode` CLI options. If the
+CLI options are not specified then the sections named `internal-default` are
+used. These are special sections which are defined internally and contain
+default values.
+
+Each section contains a number of options which are described in more detail
+below. All options are optional. If they are not specified, then they will be
+assigned a value of 1 (or the equivalent). They will not be assigned the value
+of `internal-default`.
+
+One special option is the `inherit` option. This can be used in any section and
+its value is the name of another section to inherit options from. The only
+restriction on this option is that recursion is not allowed and so it cannot
+reference its own section.
+
+To see the configuration values being used by Vela use the `--verbose-config`
+CLI option. This can also be used to display the internal-default values and to
+see a full list of all the available options.
-This is used to describe various properties of the embedded system that the
-network will run in. The configuration file is selected with the `--config` CLI
-option. The system config is selected by Name (defined in the
-`[SysConfig.Name]` field) with the CLI option `--system-config`. The `cpu=X`
-attribute in the `[SysConfig.Name]` is used to cross-reference and select CPU
-operator attributes in the `[CpuPerformance.OpName]` section.
-Example usage based on the file below:
+An example Vela configuration file, called `vela_cfg.ini`, is included in the
+directory containing this file. Example usage based on this file is:
```bash
-vela network.tflite --config sys_cfg_vela.ini --system-config MySysConfig
+vela network.tflite --accelerator-config ethos-u55-256 --config vela_cfg.ini --system-config Ethos_U55_High_End_Embedded --memory-mode Shared_Sram
```
-Example of a Vela system configuration file.
+The following is an in-line explanation of the Vela configuration file format:
```ini
-; File: sys_cfg_vela.ini
-; The file contains two parts; a system config part and a CPU operator
-; performance part.
-
-; System config
-; Specifies properties such as the core clock speed, the size and speed of the
-; four potential memory areas, and for various types of data which memory area
-; is used to store them. The cpu property is used to link with the CPU operator
-; performance.
-; The four potential memory areas are: Sram, Dram, OnChipFlash, OffChipFlash.
-
-[SysConfig.MySysConfig]
-npu_freq=500e6
-cpu=MyCpu
-Sram_clock_scale=1
-Sram_port_width=64
-Dram_clock_scale=1
-Dram_port_width=64
-OnChipFlash_clock_scale=1
-OnChipFlash_port_width=64
-OffChipFlash_clock_scale=0.25
-OffChipFlash_port_width=32
-permanent_storage_mem_area=OffChipFlash
-feature_map_storage_mem_area=Sram
-fast_storage_mem_area=Sram
-
-; CPU operator performance
-; Specifies properties that are used by a linear model to estimate the
-; performance for any operations that will be run on the CPU (such as those not
-; supported by the NPU). Setting the intercept and slope to 0 will result in
-; the operator being excluded from the performance estimation. This is the same
-; as not specifying the operator. If an explicit cpu is specified rather than
-; using the default then the cpu name must match the cpu specified in the
-; SysConfig.<system config name> section.
-
-[CpuPerformance.MyCpuOperator]
-default.intercept=0.0
-default.slope=1.0
-
-MyCpu.intercept=0.0
-MyCpu.slope=1.0
+; file: my_vela_cfg.ini
+; -----------------------------------------------------------------------------
+; Vela configuration file
+
+; -----------------------------------------------------------------------------
+; System Configuration
+
+; My_Sys_Cfg
+[System_Config.My_Sys_Cfg]
+core_clock=??? ---> Clock frequency of the Ethos-U. ??? = {float in Hz}
+axi0_port=??? ---> Memory type connected to AXI0. ??? = {Sram, Dram, OnChipFlash or OffChipFlash}
+axi1_port=??? ---> Memory type connected to AXI1. ??? = {Sram, Dram, OnChipFlash or OffChipFlash}
+Sram_clock_scale=??? ---> Scaling of core_clock to specify the Sram bandwidth. Only required if selected by an AXI port. ??? = {float 0.0 to 1.0}
+Dram_clock_scale=??? ---> Scaling of core_clock to specify the Dram bandwidth. Only required if selected by an AXI port. ??? = {float 0.0 to 1.0}
+OnChipFlash_clock_scale=??? ---> Scaling of core_clock to specify the OnChipFlash bandwidth. Only required if selected by an AXI port. ??? = {float 0.0 to 1.0}
+OffChipFlash_clock_scale=??? ---> Scaling of core_clock to specify the OffChipFlash bandwidth. Only required if selected by an AXI port. ??? = {float 0.0 to 1.0}
+
+; -----------------------------------------------------------------------------
+; Memory Mode
+
+; My_Mem_Mode_Parent
+[Memory_Mode.My_Mem_Mode_Parent]
+const_mem_area=??? ---> AXI port used by the read-only data (e.g. weight tensors, scale & bias tensors). ??? = {Axi0, Axi1}
+arena_mem_area=??? ---> AXI port used by the read-write data (e.g. feature map tensors, internal buffers). ??? = {Axi0, Axi1}
+cache_mem_area=??? ---> AXI port used by the dedicated SRAM read-write (e.g. feature map part-tensors, internal buffers). ??? = {Axi0, Axi1}
+cache_sram_size=??? ---> Size of the dedicated cache SRAM. Only required when cache_mem_area != arena_mem_area. ??? = {int in Bytes}
+
+; My_Mem_Mode_Child
+[Memory_Mode.My_Mem_Mode_Child]
+inherit=??? ---> Parent section to inherit from. An option in the child overwrites an identical option in the parent. ??? = {[Part.Name]}
+cache_sram_size=??? ---> Size of the dedicated cache SRAM. Only required when cache_mem_area != arena_mem_area. ??? = {int in Bytes}
```
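
To make the new format concrete, the following is a minimal sketch of reading such a file with Python's standard `configparser` module and resolving the `inherit` option. It is not Vela's own implementation. The section names follow the template above, but every value is a hypothetical placeholder rather than one taken from the shipped `vela_cfg.ini`. The helper `resolve_section` is invented for illustration, and writing the `inherit` value as `Part.Name` without square brackets is an assumption.

```python
import configparser

# Hypothetical values for illustration only; not copied from vela_cfg.ini.
EXAMPLE_CFG = """
[System_Config.My_Sys_Cfg]
core_clock=500e6
axi0_port=Sram
axi1_port=Dram
Sram_clock_scale=1.0
Dram_clock_scale=0.75

[Memory_Mode.My_Mem_Mode_Parent]
const_mem_area=Axi1
arena_mem_area=Axi0
cache_mem_area=Axi0

[Memory_Mode.My_Mem_Mode_Child]
; Assumption: the parent is referenced as Part.Name without brackets.
inherit=Memory_Mode.My_Mem_Mode_Parent
arena_mem_area=Axi1
cache_sram_size=393216
"""


def resolve_section(parser, name, seen=()):
    """Return the options of section `name`, merged over any inherited parent.

    Hypothetical helper: it mirrors the documented rule that an option in the
    child overwrites an identical option in the parent, and it rejects
    recursive inheritance.
    """
    if name in seen:
        raise ValueError(f"recursive inherit involving [{name}]")
    options = dict(parser[name])
    parent = options.pop("inherit", None)
    if parent is None:
        return options
    merged = resolve_section(parser, parent, seen + (name,))
    merged.update(options)  # child options win over parent options
    return merged


parser = configparser.ConfigParser()
parser.optionxform = str  # keep option names case-sensitive, as the format requires
parser.read_string(EXAMPLE_CFG)

child = resolve_section(parser, "Memory_Mode.My_Mem_Mode_Child")
print(child)
```

In this sketch the child section resolves to the parent's three `*_mem_area` options with `arena_mem_area` overridden to `Axi1`, plus its own `cache_sram_size`. All values are returned as strings, because `configparser` performs no type conversion unless asked to.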