Diffstat (limited to 'docs/sections/memory_considerations.md')
-rw-r--r--  docs/sections/memory_considerations.md  151
1 file changed, 71 insertions(+), 80 deletions(-)
diff --git a/docs/sections/memory_considerations.md b/docs/sections/memory_considerations.md
index fc81f8f..89baf41 100644
--- a/docs/sections/memory_considerations.md
+++ b/docs/sections/memory_considerations.md
@@ -7,7 +7,7 @@
   - [Understanding memory usage from Vela output](#understanding-memory-usage-from-vela-output)
     - [Total SRAM used](#total-sram-used)
     - [Total Off-chip Flash used](#total-off_chip-flash-used)
-  - [Non-default configurations](#non-default-configurations)
+  - [Memory mode configurations](#memory-mode-configurations)
   - [Tensor arena and neural network model memory placement](#tensor-arena-and-neural-network-model-memory-placement)
   - [Memory usage for ML use-cases](#memory-usage-for-ml-use_cases)
   - [Memory constraints](#memory-constraints)
@@ -94,52 +94,88 @@ buffers is. These are:
 
 ### Total SRAM used
 
-When the neural network model is compiled with Vela, a summary report that includes memory usage is generated. For
-example, compiling the keyword spotting model
+When the neural network model is compiled with Vela, a summary report that includes memory usage is generated.
+For example, compiling the keyword spotting model
 [ds_cnn_clustered_int8](https://github.com/ARM-software/ML-zoo/blob/master/models/keyword_spotting/ds_cnn_large/tflite_clustered_int8/ds_cnn_clustered_int8.tflite)
-with Vela produces, among others, the following output:
+with the Vela command:
+
+```commandline
+vela \
+    --accelerator-config=ethos-u55-128 \
+    --optimise Performance \
+    --config scripts/vela/default_vela.ini \
+    --memory-mode=Shared_Sram \
+    --system-config=Ethos_U55_High_End_Embedded \
+    ds_cnn_clustered_int8.tflite
+```
+
+produces, among others, the following output:
 
 ```log
-Total SRAM used                                 70.77 KiB
-Total Off-chip Flash used                      430.78 KiB
+Total SRAM used                                146.31 KiB
+Total Off-chip Flash used                      452.42 KiB
 ```
 
 The `Total SRAM used` here shows the required memory to store the `tensor arena` for the TensorFlow Lite Micro
 framework. This is the amount of memory required to store the input, output, and intermediate buffers. In the preceding
-example, the tensor arena requires 70.77 KiB of available SRAM.
+example, the tensor arena requires 146.31 KiB of available SRAM.
 
 > **Note:** Vela can only estimate the SRAM required for graph execution. It has no way of estimating the memory used by
 > internal structures from TensorFlow Lite Micro framework.
 
-Therefore, we recommend that you top this memory size by at least 2KiB. We also recoomend that you also carve out the
+Therefore, we recommend that you top up this memory size by at least 2 KiB. We also recommend that you carve out the
 `tensor arena` of this size, and then place it on the SRAM of the target system.
 
 ### Total Off-chip Flash used
 
 The `Total Off-chip Flash` parameter indicates the minimum amount of flash required to store the neural network model.
-In the preceding example, the system must have a minimum of 430.78 KiB of available flash memory to store the `.tflite`
+In the preceding example, the system must have a minimum of 452.42 KiB of available flash memory to store the `.tflite`
 file contents.
 
 > **Note:** The Arm® *Corstone™-300* system uses the DDR region as a flash memory. The timing adapter sets up the AXI
 > bus that is wired to the DDR to mimic both bandwidth and latency characteristics of a flash memory device.
 
-## Non-default configurations
+## Memory mode configurations
+
+The preceding example outlines a typical configuration for the *Ethos-U55* NPU, and this corresponds to the default
+Vela memory mode setting.
+
+The evaluation kit supports all the *Ethos-U* NPU memory modes:
+
+| *Ethos™-U* NPU | Default memory mode | Other memory modes supported |
+|----------------|---------------------|------------------------------|
+| *Ethos™-U55*   | `Shared_Sram`       | `Sram_Only`                  |
+| *Ethos™-U65*   | `Dedicated_Sram`    | `Shared_Sram`                |
 
-The preceding example outlines a typical configuration, and this corresponds to the default Vela setting. However, the
-system SRAM can also be used to store the neural network model along with the `tensor arena`. Vela supports optimizing
-the model for this configuration with its `Sram_Only` memory mode.
+For further information on the default settings, please refer to:
+[default_vela.ini](../../scripts/vela/default_vela.ini).
 
-For further information, please refer to: [vela.ini](../../scripts/vela/vela.ini).
+For the *Ethos-U55* NPU, the system SRAM can also be used to store the neural network model along with the
+`tensor arena`. Vela supports optimizing the model for this configuration with its `Sram_Only` memory mode.
+Although the Vela settings for this configuration suggest that only the AXI0 bus is used, a warning is generated
+when the model is compiled, for example:
+
+```log
+vela \
+    --accelerator-config=ethos-u55-128 \
+    --optimise Performance \
+    --config scripts/vela/default_vela.ini \
+    --memory-mode=Sram_Only \
+    --system-config=Ethos_U55_High_End_Embedded \
+    ds_cnn_clustered_int8.tflite
+
+Info: Changing const_mem_area from Sram to OnChipFlash. This will use the same characteristics as Sram.
+```
 
-To make use of a neural network model that is optimized for this configuration, the linker script for the target
-platform must be changed. By default, the linker scripts are set up to support the default configuration only.
+This means that the neural network model is always placed in the flash region. In this case, the timing adapters
+for the AXI buses are set to the same values to mimic both the bandwidth and latency characteristics of an SRAM
+memory device.
+See [Ethos-U55 NPU timing adapter default configuration](../../scripts/cmake/timing_adapter/ta_config_u55_high_end.cmake).
 
 For script snippets, please refer to: [Memory constraints](./memory_considerations.md#memory-constraints).
 
 > **Note:**
 >
-> 1. The the `Shared_Sram` memory mode represents the default configuration.
-> 2. The `Dedicated_Sram` mode is only applicable for the Arm® *Ethos™-U65*.
+> 1. The `Shared_Sram` memory mode represents the default configuration.
+> 2. The `Dedicated_Sram` memory mode is only applicable for the Arm® *Ethos™-U65*.
+> 3. The `Sram_Only` memory mode is only applicable for the Arm® *Ethos™-U55*.
 
 ## Tensor arena and neural network model memory placement
 
@@ -147,18 +183,15 @@ The evaluation kit uses the name `activation buffer` for the `tensor arena` in the TensorFlow Lite Micro framework.
 Every use-case application has a corresponding `<use_case_name>_ACTIVATION_BUF_SZ` parameter that governs the maximum
 available size of the `activation buffer` for that particular use-case.
 
-The linker script is set up to place this memory region in SRAM. However, if the memory required is more than what the
-target platform supports, this buffer needs to be placed on flash instead. Every target platform has a profile
-definition in the form of a `CMake` file.
+The linker script is set up to place this memory region in SRAM for the *Ethos-U55* and in flash for the *Ethos-U65*.
+Every target platform has a profile definition in the form of a `CMake` file.
 
 For further information and an example, please refer to:
 [Corstone-300 profile](../../scripts/cmake/subsystem-profiles/corstone-sse-300.cmake).
 
 The parameter `ACTIVATION_BUF_SRAM_SZ` defines the maximum SRAM size available for the platform. This is propagated
-through the build system. If the `<use_case_name>_ACTIVATION_BUF_SZ` for a given use-case is *more* than the
-`ACTIVATION_BUF_SRAM_SZ` for the target build platform, then the `activation buffer` is placed on the flash memory
-instead.
+through the build system.
 
-The neural network model is always placed in the flash region. However, this can be changed in the linker script.
+The neural network model is always placed in the flash region (even in the case of the `Sram_Only` memory mode, as
+mentioned earlier).
 
 ## Memory usage for ML use-cases
 
@@ -168,12 +201,12 @@ memory requirements for the different use-cases of the evaluation kit.
 
 > **Note:** The SRAM usage does not include memory used by TensorFlow Lite Micro and must be topped up as explained
 > under [Total SRAM used](#total-sram-used).
 
-- [Keyword spotting model](https://github.com/ARM-software/ML-zoo/tree/master/models/keyword_spotting/ds_cnn_large/tflite_clustered_int8)
+- [Keyword spotting model](https://github.com/ARM-software/ML-zoo/tree/68b5fbc77ed28e67b2efc915997ea4477c1d9d5b/models/keyword_spotting/ds_cnn_large/tflite_clustered_int8)
   requires
   - 70.7 KiB of SRAM
   - 430.7 KiB of flash memory.
 
-- [Image classification model](https://github.com/ARM-software/ML-zoo/tree/master/models/image_classification/mobilenet_v2_1.0_224/tflite_uint8)
+- [Image classification model](https://github.com/ARM-software/ML-zoo/tree/e0aa361b03c738047b9147d1a50e3f2dcb13dbcb/models/image_classification/mobilenet_v2_1.0_224/tflite_uint8)
   requires
   - 638.6 KiB of SRAM
   - 3.1 MB of flash memory.
@@ -199,38 +232,8 @@ scatter file is as follows:
 ;---------------------------------------------------------
 LOAD_REGION_0       0x00000000                  0x00080000
 {
-    ;-----------------------------------------------------
-    ; First part of code mem - 512kiB
-    ;-----------------------------------------------------
-    itcm.bin        0x00000000                  0x00080000
-    {
-        *.o (RESET, +First)
-        * (InRoot$$Sections)
-
-        ; Essentially only RO-CODE, RO-DATA is in a
-        ; different region.
-        .ANY (+RO)
-    }
-
-    ;-----------------------------------------------------
-    ; 128kiB of 512kiB DTCM is used for any other RW or ZI
-    ; data. Note: this region is internal to the Cortex-M
-    ; CPU.
-    ;-----------------------------------------------------
-    dtcm.bin        0x20000000                  0x00020000
-    {
-        ; Any R/W and/or zero initialised data
-        .ANY(+RW +ZI)
-    }
-
-    ;-----------------------------------------------------
-    ; 384kiB of stack space within the DTCM region. See
-    ; `dtcm.bin` for the first section. Note: by virtue of
-    ; being part of DTCM, this region is only accessible
-    ; from Cortex-M55.
-    ;-----------------------------------------------------
-    ARM_LIB_STACK   0x20020000 EMPTY ALIGN 8    0x00060000
-    {}
+...
 
     ;-----------------------------------------------------
     ; SSE-300's internal SRAM of 4MiB - reserved for
@@ -240,8 +243,11 @@ LOAD_REGION_0       0x00000000                  0x00080000
     ;-----------------------------------------------------
     isram.bin       0x31000000 UNINIT ALIGN 16  0x00400000
     {
-        ; activation buffers a.k.a tensor arena
-        *.o (.bss.NoInit.activation_buf)
+        ; Cache area (if used)
+        *.o (.bss.NoInit.ethos_u_cache)
+
+        ; activation buffers a.k.a tensor arena when memory mode sram only
+        *.o (.bss.NoInit.activation_buf_sram)
     }
 }
 
@@ -251,7 +257,7 @@ LOAD_REGION_0       0x00000000                  0x00080000
 LOAD_REGION_1       0x70000000                  0x02000000
 {
     ;-----------------------------------------------------
-    ; 32 MiB of DRAM space for neural network model,
+    ; 32 MiB of DDR space for neural network model,
     ; input vectors and labels. If the activation buffer
     ; size required by the network is bigger than the
     ; SRAM size available, it is accommodated here.
@@ -261,33 +267,18 @@ LOAD_REGION_1       0x70000000                  0x02000000
         ; nn model's baked in input matrices
         *.o (ifm)
 
-        ; nn model
+        ; nn model's default space
        *.o (nn_model)
 
         ; labels
         *.o (labels)
 
-        ; if the activation buffer (tensor arena) doesn't
-        ; fit in the SRAM region, we accommodate it here
-        *.o (activation_buf)
+        ; activation buffers a.k.a tensor arena when memory mode dedicated sram
+        *.o (activation_buf_dram)
     }
 
-    ;-----------------------------------------------------
-    ; First 256kiB of BRAM (FPGA SRAM) used for RO data.
-    ; Note: Total BRAM size available is 2MiB.
-    ;-----------------------------------------------------
-    bram.bin        0x11000000 ALIGN 8          0x00040000
-    {
-        ; RO data (incl. unwinding tables for debugging)
-        .ANY (+RO-DATA)
-    }
+...
 
-    ;-----------------------------------------------------
-    ; Remaining part of the 2MiB BRAM used as heap space.
-    ; 0x00200000 - 0x00040000 = 0x001C0000 (1.75 MiB)
-    ;-----------------------------------------------------
-    ARM_LIB_HEAP    0x11040000 EMPTY ALIGN 8    0x001C0000
-    {}
 }
 ```