author | Kristofer Jonsson <kristofer.jonsson@arm.com> | 2021-09-09 09:47:21 +0200 |
---|---|---|
committer | Kristofer Jonsson <kristofer.jonsson@arm.com> | 2021-09-09 11:01:48 +0200 |
commit | ff2084bc16c91ec71820785ed4f5018886375549 (patch) | |
tree | 5f3fcc77075500405d42165e4e9099cbec4ad90b /README.md | |
parent | ce05c41cec3ec68460f377dd63b567b60f070527 (diff) | |
download | ethos-u-core-platform-ff2084bc16c91ec71820785ed4f5018886375549.tar.gz |
Document memory configurations
Change-Id: I165651921106acb6893750dfeabec7537188c223
Diffstat (limited to 'README.md')
-rw-r--r-- | README.md | 49 |
1 files changed, 48 insertions, 1 deletions
@@ -7,7 +7,7 @@ inference on an Arm Ethos-U compatible platform.
 
 This repository contains target specific files, like linker scripts. Target
 agnostic software components are provided in the
-[core_software](https://review.mlplatform.org/plugins/gitiles/ml/ethos-u/ethos-u-core-software)
+[core_software](https://git.mlplatform.org/ml/ethos-u/ethos-u-core-software.git)
 repository.
 
 # Targets
@@ -117,6 +117,53 @@ be written to 0x02000000.
 Power up the board with the PBON and the application output will be seen on the
 serial console.
 
+# Memory configurations
+
+Embedded systems come in very different configurations, but typically they have
+a limited amount of high bandwidth low latency memory like SRAM, and some more
+low bandwidth high latency memory like flash or DRAM.
+
+The Tensorflow Lite for Microcontrollers (TFLu) framework needs two buffers to
+run an inference, the *model* and the *arena*. The model contains static data
+like weights and biases. The arena contains read write data like activations,
+IFM, OFM, temporary data etc. Please note that the IFM and OFM are located
+*inside* of the arena.
+
+The placement of the model and arena has a big impact on the performance. There
+are three configurations that make sense for most systems.
+
+| Model      | Arena      | Spilling | Note           |
+|------------|------------|----------|----------------|
+| SRAM       | SRAM       | No       |                |
+| Flash/DRAM | SRAM       | No       |                |
+| Flash/DRAM | Flash/DRAM | Yes      | Ethos-U65 only |
+
+## Model and arena in SRAM
+
+For optimal performance both model and arena should be placed in SRAM.
+
+## Model flash/DRAM, Arena SRAM
+
+If both model and arena do not fit in SRAM, then it makes most sense to move the
+model to flash/DRAM. The performance penalty depends on the network and will
+need to be measured. For example weight bound networks will experience a larger
+performance drop than MAC bound networks.
+
+## Model and arena in flash/DRAM (Ethos-U65 only)
+
+Moving both model and arena to flash/DRAM comes with quite a hefty performance
+penalty. To mitigate some of this *spilling* can be used.
+
+Spilling means that a small buffer is reserved in SRAM that acts like a cache
+for frequently accessed data. When spilling is enabled
+[Vela](https://git.mlplatform.org/ml/ethos-u/ethos-u-vela.git/about/) will
+prepend and append extra instructions to the command stream to DMA copy data
+between the arena and the spilling buffer.
+
+Some of the data stored in the spilling buffer must be copied back to the arena,
+which is done as DMA transfer over AXI 1. This is only supported by Ethos-U65,
+because Ethos-U55 is equipped with a readonly AXI 1 interface.
+
 # Multi NPU
 
 The Tensorflow Lite for Microcontrollers (TFLu) framework supports running
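The three memory configurations documented by this patch can be read as a simple decision procedure. The sketch below is only an illustration of that reading, not code from the repository: the names `Npu`, `MemoryConfig` and `chooseMemoryConfig` are hypothetical, and it assumes that "fits in SRAM" can be judged from byte sizes alone, ignoring the small SRAM buffer that spilling reserves and any other SRAM users.

```cpp
#include <cstddef>

// Hypothetical types for illustration only; not part of ethos-u-core-platform.
enum class Npu { EthosU55, EthosU65 };

enum class MemoryConfig {
    ModelSramArenaSram,          // row 1 of the table: fastest
    ModelSlowArenaSram,          // row 2: model moved to flash/DRAM
    ModelSlowArenaSlowSpilling,  // row 3: both in flash/DRAM, Ethos-U65 only
    Unsupported                  // none of the three rows applies
};

// Picks the first (fastest) configuration from the table that fits.
// Assumption: sizes in bytes are the only constraint considered.
MemoryConfig chooseMemoryConfig(std::size_t modelBytes,
                                std::size_t arenaBytes,
                                std::size_t sramBytes,
                                Npu npu) {
    // Both buffers fit in SRAM together (written to avoid size_t overflow).
    if (modelBytes <= sramBytes && arenaBytes <= sramBytes - modelBytes) {
        return MemoryConfig::ModelSramArenaSram;
    }
    // Only the arena fits: keep the read-write data in fast memory.
    if (arenaBytes <= sramBytes) {
        return MemoryConfig::ModelSlowArenaSram;
    }
    // Neither fits: spilling writes back to the arena over AXI 1,
    // which the README states is read-only on Ethos-U55.
    if (npu == Npu::EthosU65) {
        return MemoryConfig::ModelSlowArenaSlowSpilling;
    }
    return MemoryConfig::Unsupported;
}
```

For instance, under these assumptions a 400-byte model with a 200-byte arena and 512 bytes of SRAM would select the second row of the table on either NPU. The actual performance of each choice still has to be measured, as the README notes.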