Diffstat (limited to 'docs/sections')
-rw-r--r-- | docs/sections/appendix.md | 20
-rw-r--r-- | docs/sections/building.md | 1023
-rw-r--r-- | docs/sections/coding_guidelines.md | 323
-rw-r--r-- | docs/sections/customizing.md | 731
-rw-r--r-- | docs/sections/deployment.md | 281
-rw-r--r-- | docs/sections/run.md | 42
-rw-r--r-- | docs/sections/testing_benchmarking.md | 87
-rw-r--r-- | docs/sections/troubleshooting.md | 27
8 files changed, 2534 insertions, 0 deletions
diff --git a/docs/sections/appendix.md b/docs/sections/appendix.md new file mode 100644 index 0000000..7b56faa --- /dev/null +++ b/docs/sections/appendix.md @@ -0,0 +1,20 @@ +# Appendix + +## Arm® Cortex®-M55 Memory map overview for Corstone™-300 reference design + +The table below is the memory mapping information specific to the Arm® Cortex®-M55. + +| Name | Base address | Limit address | Size | IDAU | Remarks | +|-------|--------------|---------------|-----------|------|-----------------------------------------------------------| +| ITCM | 0x0000_0000 | 0x0007_FFFF | 512 kiB | NS | ITCM code region | +| BRAM | 0x0100_0000 | 0x011F_FFFF | 2 MiB | NS | FPGA data SRAM region | +| DTCM | 0x2000_0000 | 0x2007_FFFF | 512 kiB | NS | 4 banks of 128 kiB each | +| SRAM | 0x2100_0000 | 0x213F_FFFF | 4 MiB | NS | 2 banks of 2 MiB each as SSE-300 internal SRAM region | +| DDR | 0x6000_0000 | 0x6FFF_FFFF | 256 MiB | NS | DDR memory region | +| ITCM | 0x1000_0000 | 0x1007_FFFF | 512 kiB | S | ITCM code region | +| BRAM | 0x1100_0000 | 0x111F_FFFF | 2 MiB | S | FPGA data SRAM region | +| DTCM | 0x3000_0000 | 0x3007_FFFF | 512 kiB | S | 4 banks of 128 kiB each | +| SRAM | 0x3100_0000 | 0x313F_FFFF | 4 MiB | S | 2 banks of 2 MiB each as SSE-300 internal SRAM region | +| DDR | 0x7000_0000 | 0x7FFF_FFFF | 256 MiB | S | DDR memory region | + +The default memory map can be found here: https://developer.arm.com/documentation/101051/0002/Memory-model/Memory-map
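The sizes in the table follow from the base and limit addresses, with the limit address being inclusive. A small, purely illustrative Python check of that arithmetic for a few of the rows (not part of the repository):

```python
# Illustrative check that a region's size matches its inclusive
# [base, limit] address pair, as listed in the memory map table above.
def region_size_kib(base: int, limit: int) -> int:
    """Size in kiB of an inclusive [base, limit] address range."""
    return (limit - base + 1) // 1024

# ITCM (non-secure): 512 kiB
assert region_size_kib(0x0000_0000, 0x0007_FFFF) == 512
# DTCM (non-secure): 512 kiB, i.e. 4 banks of 128 kiB
assert region_size_kib(0x2000_0000, 0x2007_FFFF) == 4 * 128
# DDR (non-secure): 256 MiB
assert region_size_kib(0x6000_0000, 0x6FFF_FFFF) == 256 * 1024
print("memory map sizes consistent")
```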
\ No newline at end of file diff --git a/docs/sections/building.md b/docs/sections/building.md new file mode 100644 index 0000000..56771b8 --- /dev/null +++ b/docs/sections/building.md @@ -0,0 +1,1023 @@ +# Building the Code Samples application from sources + +## Contents + +- [Building the Code Samples application from sources](#building-the-code-samples-application-from-sources) + - [Contents](#contents) + - [Build prerequisites](#build-prerequisites) + - [Build options](#build-options) + - [Build process](#build-process) + - [Preparing build environment](#preparing-build-environment) + - [Create a build directory](#create-a-build-directory) + - [Configuring the build for `MPS3: SSE-300`](#configuring-the-build-for-mps3-sse-300) + - [Configuring the build for `MPS3: SSE-200`](#configuring-the-build-for-mps3-sse-200) + - [Configuring the build native unit-test](#configuring-the-build-native-unit-test) + - [Configuring the build for `simple_platform`](#configuring-the-build-for-simple_platform) + - [Building the configured project](#building-the-configured-project) + - [Building timing adapter with custom options](#building-timing-adapter-with-custom-options) + - [Add custom inputs](#add-custom-inputs) + - [Add custom model](#add-custom-model) + - [Optimize custom model with Vela compiler](#optimize-custom-model-with-vela-compiler) + - [Memory constraints](#memory-constraints) + - [Automatic file generation](#automatic-file-generation) + +This section assumes the use of an **x86 Linux** build machine. + +## Build prerequisites + +Before proceeding, please make sure that the following prerequisites +are fulfilled: + +- Arm Compiler version 6.14 or above is installed and available on the + path. 
+ + Test the compiler by running: + + ```commandline + armclang -v + ``` + + ```log + Product: ARM Compiler 6.14 Professional + Component: ARM Compiler 6.14 + ``` + + > **Note:** Add compiler to the path, if needed: + > + > `export PATH=/path/to/armclang/bin:$PATH` + +- Compiler license is configured correctly + +- CMake version 3.15 or above is installed and available on the path. + Test CMake by running: + + ```commandline + cmake --version + ``` + + ```log + cmake version 3.16.2 + ``` + + > **Note:** Add cmake to the path, if needed: + > + > `export PATH=/path/to/cmake/bin:$PATH` + +- Python 3.6 or above is installed. Test the Python version by running: + + ```commandline + python3 --version + ``` + + ```log + Python 3.6.8 + ``` + +- The build system will create a Python virtual environment during the build + process. Please make sure that the Python virtual environment module is + installed: + + ```commandline + python3 -m venv + ``` + +- Make (or MinGW make for Windows) is installed. Test it by running: + + ```commandline + make --version + ``` + + ```log + GNU Make 4.1 + + ... + ``` + + > **Note:** Add it to the path environment variable, if needed. + +- Access to the Internet to download the third party dependencies, specifically: TensorFlow Lite Micro, Arm Ethos-U55 +driver and CMSIS. Instructions for downloading these are listed under [preparing build environment](#preparing-build-environment). + +## Build options + +The project build system allows the user to specify a custom NN +model (in `.tflite` format) or images and compile the application binary from +sources. + +The build system uses the pre-built TensorFlow Lite for Microcontrollers +library and Arm® Ethos™-U55 driver libraries from the delivery package. + +The build script is parameterized to support different options. Default +values for build parameters will build the executable compatible with +the Ethos-U55 Fast Model. 
+ +The build parameters are: + +- `TARGET_PLATFORM`: Target platform to execute the application on: + - `mps3` + - `native` + - `simple_platform` + +- `TARGET_SUBSYSTEM`: Platform target subsystem; this specifies the + design implementation for the deployment target. For both the MPS3 + FVP and the MPS3 FPGA, this should be left at the default value of + SSE-300: + - `sse-300` (default - [Arm® Corstone™-300](https://developer.arm.com/ip-products/subsystem/corstone/corstone-300)) + - `sse-200` + +- `TENSORFLOW_SRC_PATH`: Path to the root of the TensorFlow directory. + The default value points to the TensorFlow submodule in the + [ethos-u](https://git.mlplatform.org/ml/ethos-u/ethos-u.git/about/) `dependencies` folder. + +- `ETHOS_U55_DRIVER_SRC_PATH`: Path to the Ethos-U55 core driver sources. + The default value points to the core_driver submodule in the + [ethos-u](https://git.mlplatform.org/ml/ethos-u/ethos-u.git/about/) `dependencies` folder. + +- `CMSIS_SRC_PATH`: Path to the CMSIS sources to be used to build the TensorFlow + Lite Micro library. This parameter is optional and valid only for + Arm® Cortex®-M CPU targeted configurations. The default value points to the CMSIS submodule in the + [ethos-u](https://git.mlplatform.org/ml/ethos-u/ethos-u.git/about/) `dependencies` folder. + +- `ETHOS_U55_ENABLED`: Sets whether the use of Ethos-U55 is available for + the deployment target. By default, this is set and therefore the + application is built with Ethos-U55 support. + +- `CPU_PROFILE_ENABLED`: Sets whether profiling information for the CPU + core should be displayed. By default, this is set to false, but can + be turned on for FPGA targets. For the FVP, the CPU core's cycle + counts are not meaningful and should not be used. + +- `LOG_LEVEL`: Sets the verbosity level for the application's output + over UART/stdout. Valid values are `LOG_LEVEL_TRACE`, `LOG_LEVEL_DEBUG`, + `LOG_LEVEL_INFO`, `LOG_LEVEL_WARN` and `LOG_LEVEL_ERROR`. 
By default, it + is set to `LOG_LEVEL_INFO`. + +- `<use_case>_MODEL_TFLITE_PATH`: Path to the model file that will be + processed and included in the application axf file. The default + value points to one of the delivered set of models. Make sure the + model chosen is aligned with the `ETHOS_U55_ENABLED` setting. + + - When using the Ethos-U55 backend, the NN model is assumed to be + optimized by the Vela compiler. + However, even if not, it will fall back on the CPU and execute, + if supported by TensorFlow Lite Micro. + + - When the use of Ethos-U55 is disabled, and if a Vela optimized model + is provided, the application will report a failure at runtime. + +- `USE_CASE_BUILD`: specifies the list of applications to build. By + default, the build system scans sources to identify available ML + applications and produces executables for all detected use-cases. + This parameter can accept a single value, for example, + `USE_CASE_BUILD=img_class` or multiple values, for example, + `USE_CASE_BUILD="img_class;kws"`. + +- `ETHOS_U55_TIMING_ADAPTER_SRC_PATH`: Path to timing adapter sources. + The default value points to the `timing_adapter` dependencies folder. + +- `TA_CONFIG_FILE`: Path to the CMake configuration file containing the + timing adapter parameters. Used only if the timing adapter build is + enabled. + +- `TENSORFLOW_LITE_MICRO_CLEAN_BUILD`: Optional parameter to enable/disable + "cleaning" prior to building for the TensorFlow Lite Micro library. + It is enabled by default. + +- `TENSORFLOW_LITE_MICRO_CLEAN_DOWNLOADS`: Optional parameter to enable wiping + out TPIP downloads from the TensorFlow source tree prior to each build. + It is disabled by default. + +- `ARMCLANG_DEBUG_DWARF_LEVEL`: When the CMake build type is specified as `Debug` + and when the armclang toolchain is used to build for a Cortex-M CPU target, + this optional argument can be set to specify the DWARF format. + By default, this is set to 4 and is synonymous with passing the `-g` + flag to the compiler. 
This is compatible with Arm-DS and other tools + which can interpret the latest DWARF format. To allow debugging using + the Model Debugger from Arm FastModel Tools Suite, this argument can be used + to pass DWARF format version as "3". Note: this option is only available + when the CMake project is configured with the `-DCMAKE_BUILD_TYPE=Debug` argument. + Also, the same DWARF format is used for building the TensorFlow Lite Micro library. + +> **Note:** For details on the specific use case build options, follow the +> instructions in the use-case specific documentation. +> Also, when setting any of the CMake configuration parameters that expect a directory/file path, it is advised +>to **use absolute paths instead of relative paths**. + +## Build process + +The build process can be summarized in three major steps: + +- Prepare the build environment by downloading the required third party sources, see +[Preparing build environment](#preparing-build-environment). + +- Configure the build for the platform chosen. +This stage includes: + - CMake options configuration + - When `<use_case>_MODEL_TFLITE_PATH` build options aren't provided, default neural network models are downloaded +from [Arm ML-Zoo](https://github.com/ARM-software/ML-zoo/). In the case of a native build, the network's input and output data +for tests are downloaded. + - Some files such as neural network models, network's inputs and output labels are automatically converted + into C/C++ arrays, see [Automatic file generation](#automatic-file-generation). + +- Build the application.\ +During this stage, the application and third party libraries are built; see [Building the configured project](#building-the-configured-project). + +### Preparing build environment + +Certain third party sources are required to be present on the development machine for the example sources in this +repository to link against. + +1. [TensorFlow Lite Micro repository](https://github.com/tensorflow/tensorflow)
[Ethos-U55 core driver repository](https://review.mlplatform.org/admin/repos/ml/ethos-u/ethos-u-core-driver) +3. [CMSIS-5](https://github.com/ARM-software/CMSIS_5.git) + +These are part of the [ethos-u repository](https://git.mlplatform.org/ml/ethos-u/ethos-u.git/about/) and set as +submodules of this project. + +To pull the submodules: + +```sh +git submodule update --init +``` + +This will download all the required components and place them in a tree like: + +```tree +dependencies + └── ethos-u + ├── cmsis + ├── core_driver + ├── tensorflow + └── ... +``` + +> **NOTE**: The default source paths for the TPIP sources assume the above directory structure, but all of the relevant +>paths can be overridden by CMake configuration arguments `TENSORFLOW_SRC_PATH`, `ETHOS_U55_DRIVER_SRC_PATH`, +>and `CMSIS_SRC_PATH`. + +### Create a build directory + +Create a build directory in the root of the project and navigate inside: + +```commandline +mkdir build && cd build +``` + +### Configuring the build for `MPS3: SSE-300` + +On Linux, execute the following command to build the application to run +on the Ethos-U55 when providing only the mandatory arguments for CMake configuration: + +```commandline +cmake \ + -DTARGET_PLATFORM=mps3 \ + -DTARGET_SUBSYSTEM=sse-300 \ + -DCMAKE_TOOLCHAIN_FILE=scripts/cmake/bare-metal-toolchain.cmake .. +``` + +For Windows, add `-G "MinGW Makefiles"`: + +```commandline +cmake \ + -G "MinGW Makefiles" \ + -DTARGET_PLATFORM=mps3 \ + -DTARGET_SUBSYSTEM=sse-300 \ + -DCMAKE_TOOLCHAIN_FILE=scripts/cmake/bare-metal-toolchain.cmake .. +``` + +Toolchain option `CMAKE_TOOLCHAIN_FILE` points to the toolchain specific +file to set the compiler and platform specific parameters. 
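The configuration stage throughout this section is just a list of `-D<option>=<value>` cache arguments passed to CMake. As a mental model, a small illustrative Python helper (hypothetical, not part of the repository's scripts) showing how such a command line composes:

```python
# Illustrative only: compose a CMake configure command-line (argv list)
# from a dict of cache options, mirroring the invocations in this section.
# The helper itself is hypothetical and not part of the build system.
def cmake_configure_cmd(options: dict, source_dir: str = "..") -> list:
    """Build the argv list for a CMake configuration invocation."""
    cmd = ["cmake"]
    cmd += [f"-D{name}={value}" for name, value in options.items()]
    cmd.append(source_dir)
    return cmd

argv = cmake_configure_cmd({
    "TARGET_PLATFORM": "mps3",
    "TARGET_SUBSYSTEM": "sse-300",
    "CMAKE_TOOLCHAIN_FILE": "scripts/cmake/bare-metal-toolchain.cmake",
})
assert argv[0] == "cmake" and argv[-1] == ".."
assert "-DTARGET_PLATFORM=mps3" in argv
```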
+ +To configure a build that can be debugged using Arm-DS, we can just specify +the build type as `Debug`: + +```commandline +cmake \ + -DTARGET_PLATFORM=mps3 \ + -DTARGET_SUBSYSTEM=sse-300 \ + -DCMAKE_TOOLCHAIN_FILE=scripts/cmake/bare-metal-toolchain.cmake \ + -DCMAKE_BUILD_TYPE=Debug .. +``` + +To configure a build that can be debugged using a tool that only supports +DWARF format 3 (Model Debugger, for example), we can use: + +```commandline +cmake \ + -DTARGET_PLATFORM=mps3 \ + -DTARGET_SUBSYSTEM=sse-300 \ + -DCMAKE_TOOLCHAIN_FILE=scripts/cmake/bare-metal-toolchain.cmake \ + -DCMAKE_BUILD_TYPE=Debug \ + -DARMCLANG_DEBUG_DWARF_LEVEL=3 .. +``` + +If the TensorFlow source tree is not in its default expected location, +set the path using `TENSORFLOW_SRC_PATH`. +Similarly, if the Ethos-U55 driver and CMSIS are not in the default location, +`ETHOS_U55_DRIVER_SRC_PATH` and `CMSIS_SRC_PATH` can be used to configure their location. For example: + +```commandline +cmake \ + -DTARGET_PLATFORM=mps3 \ + -DTARGET_SUBSYSTEM=sse-300 \ + -DCMAKE_TOOLCHAIN_FILE=scripts/cmake/bare-metal-toolchain.cmake \ + -DTENSORFLOW_SRC_PATH=/my/custom/location/tensorflow \ + -DETHOS_U55_DRIVER_SRC_PATH=/my/custom/location/core_driver \ + -DCMSIS_SRC_PATH=/my/custom/location/cmsis .. +``` + +> **Note:** If re-building with changed parameter values, it is +highly advised to clean the build directory and re-run the CMake command. + +### Configuring the build for `MPS3: SSE-200` + +```commandline +cmake \ + -DTARGET_PLATFORM=mps3 \ + -DTARGET_SUBSYSTEM=sse-200 \ + -DCMAKE_TOOLCHAIN_FILE=scripts/cmake/bare-metal-toolchain.cmake .. +``` + +For Windows, add `-G "MinGW Makefiles"`: + +```commandline +cmake \ + -DTARGET_PLATFORM=mps3 \ + -DTARGET_SUBSYSTEM=sse-200 \ + -DCMAKE_TOOLCHAIN_FILE=scripts/cmake/bare-metal-toolchain.cmake \ + -G "MinGW Makefiles" .. 
+``` + +### Configuring the build native unit-test + +```commandline +cmake \ + -DTARGET_PLATFORM=native \ + -DCMAKE_TOOLCHAIN_FILE=public/scripts/cmake/native-toolchain.cmake .. +``` + +For Windows, add `-G "MinGW Makefiles"`: + +```commandline +cmake \ + -DTARGET_PLATFORM=native \ + -DCMAKE_TOOLCHAIN_FILE=public/scripts/cmake/native-toolchain.cmake \ + -G "MinGW Makefiles" .. +``` + +Results of the build will be placed in the `build/bin/` folder: + +```tree + bin + |- dev_ethosu_eval-tests + |_ ethos-u +``` + +### Configuring the build for `simple_platform` + +```commandline +cmake \ + -DTARGET_PLATFORM=simple_platform \ + -DCMAKE_TOOLCHAIN_FILE=public/scripts/cmake/bare-metal-toolchain.cmake .. +``` + +For Windows, add `-G "MinGW Makefiles"`: + +```commandline +cmake \ + -DTARGET_PLATFORM=simple_platform \ + -DCMAKE_TOOLCHAIN_FILE=public/scripts/cmake/bare-metal-toolchain.cmake \ + -G "MinGW Makefiles" .. +``` + +### Building the configured project + +If the CMake command succeeds, build the application as follows: + +```commandline +make -j4 +``` + +or for Windows: + +```commandline +mingw32-make -j4 +``` + +Add `VERBOSE=1` to see compilation and link details. + +Results of the build will be placed in the `build/bin` folder, for +example: + +```tree +bin + ├── ethos-u-<use_case_name>.axf + ├── ethos-u-<use_case_name>.htm + ├── ethos-u-<use_case_name>.map + ├── images-<use_case_name>.txt + └── sectors + └── <use_case> + ├── dram.bin + └── itcm.bin +``` + +Where for each implemented use-case under the `source/use-case` directory, +the following build artefacts will be created: + +- `ethos-u-<use case name>.axf`: The built application binary for a ML + use case. + +- `ethos-u-<use case name>.map`: Information from building the + application (e.g. libraries used, what was optimized, location of + objects). + +- `ethos-u-<use case name>.htm`: Human readable file containing the + call graph of application functions. 
+ +- `sectors/`: Folder containing the built application, split into files + for loading into different FPGA memory regions. + +- `images-<use case name>.txt`: Tells the FPGA which memory regions to + use for loading the binaries in the `sectors/` folder. + +> **Note:** For the specific use case commands, see the relevant section +in the use case documentation. + +## Building timing adapter with custom options + +The sources also contain the configuration for a timing adapter utility +for the Ethos-U55 driver. The timing adapter allows the platform to simulate user +provided memory bandwidth and latency constraints. + +The timing adapter driver aims to control the behavior of two AXI buses +used by Ethos-U55. One is for the SRAM memory region and the other is for +flash or DRAM. The SRAM is where intermediate buffers are expected to be +allocated and therefore, this region can serve frequent R/W traffic +generated by computation operations while executing a neural network +inference. The flash or DDR is where we expect to store the model +weights and therefore, this bus would typically be used only for R/O +traffic. + +It is used for the MPS3 FPGA as well as for the Fast Model environment. + +The CMake build framework allows control over the behavior +of each bus with the following parameters: + +- `MAXR`: Maximum number of pending read operations allowed. 0 is + inferred as infinite, and the default value is 4. + +- `MAXW`: Maximum number of pending write operations allowed. 0 is + inferred as infinite, and the default value is 4. + +- `MAXRW`: Maximum number of pending read+write operations allowed. 0 is + inferred as infinite, and the default value is 8. + +- `RLATENCY`: Minimum latency, in cycle counts, for a read operation. + This is the duration between ARVALID and RVALID signals. The default + value is 50. + +- `WLATENCY`: Minimum latency, in cycle counts, for a write operation. + This is the duration between WVALID + WLAST and BVALID being + de-asserted. 
The default value is 50. + +- `PULSE_ON`: Number of cycles during which addresses are let through. + The default value is 5100. + +- `PULSE_OFF`: Number of cycles during which addresses are blocked. The + default value is 5100. + +- `BWCAP`: Maximum number of 64-bit words transferred per pulse cycle. A + pulse cycle is PULSE_ON + PULSE_OFF. 0 is inferred as infinite, and + the default value is 625. + +- `MODE`: Timing adapter operation mode. The default value is 0. + + - Bit 0: 0=simple; 1=latency-deadline QoS throttling of read vs. + write + + - Bit 1: 1=enable random AR reordering (0=default) + + - Bit 2: 1=enable random R reordering (0=default) + + - Bit 3: 1=enable random B reordering (0=default) + +For the timing adapter's CMake build configuration, the SRAM AXI is assigned +index 0 and the flash/DRAM AXI bus has index 1. To change a bus +parameter for the build, a `TA<index>_` prefix should be added +to the above. For example, `TA0_MAXR=10` will set the SRAM AXI bus's +maximum pending reads to 10. + +As an example, if we have the following parameters for the flash/DRAM +region: + +- `TA1_MAXR` = "2" + +- `TA1_MAXW` = "0" + +- `TA1_MAXRW` = "0" + +- `TA1_RLATENCY` = "64" + +- `TA1_WLATENCY` = "32" + +- `TA1_PULSE_ON` = "320" + +- `TA1_PULSE_OFF` = "80" + +- `TA1_BWCAP` = "50" + +For a clock rate of 500 MHz, this would translate to: + +- The maximum duty cycle for any operation is:\ +![Maximum duty cycle formula](../media/F1.png) + +- Maximum bit rate for this bus (64-bit wide) is:\ +![Maximum bit rate formula](../media/F2.png) + +- With a read latency of 64 cycles, and maximum pending reads as 2, + each read could be a maximum of 64 or 128 bytes, as defined for + Ethos-U55's AXI bus's attribute. 
+ + The bandwidth is calculated solely from the read parameters ![Bandwidth formula]( + ../media/F3.png) + + This is higher than the overall bandwidth dictated by the bus parameters + of \ + ![Overall bandwidth formula](../media/F4.png) + +This suggests that the read operation is limited only by the overall bus +bandwidth. + +The timing adapter requires recompilation to change parameters. The default timing +adapter configuration file, pointed to by the `TA_CONFIG_FILE` build parameter, is +located in the `scripts/cmake` folder and contains all options for AXI0 and +AXI1 described above. + +An example of `scripts/cmake/ta_config.cmake`: + +```cmake +# Timing adapter options +set(TA_INTERACTIVE OFF) + +# Timing adapter settings for AXI0 +set(TA0_MAXR "8") +set(TA0_MAXW "8") +set(TA0_MAXRW "0") +set(TA0_RLATENCY "32") +set(TA0_WLATENCY "32") +set(TA0_PULSE_ON "3999") +set(TA0_PULSE_OFF "1") +set(TA0_BWCAP "4000") +... +``` + +An example of the build with custom timing adapter configuration: + +```commandline +cmake \ + -DTARGET_PLATFORM=mps3 \ + -DTARGET_SUBSYSTEM=sse-300 \ + -DCMAKE_TOOLCHAIN_FILE=scripts/cmake/bare-metal-toolchain.cmake \ + -DTA_CONFIG_FILE=scripts/cmake/my_ta_config.cmake .. +``` + +## Add custom inputs + +The application performs inference on input data found in the folder set +by the CMake parameters; for more information see section 3.3 in the +specific use case documentation. + +## Add custom model + +The application performs inference using the model pointed to by the +CMake parameter `MODEL_TFLITE_PATH`. + +> **Note:** If you want to run the model using Ethos-U55, ensure your custom +model has been run through the Vela compiler successfully before continuing. + +To run the application with a custom model, you will need to provide a +`labels_<model_name>.txt` file of labels associated with the model. +Each line of the file should correspond to one of the outputs in your +model. 
See the provided `labels_mobilenet_v2_1.0_224.txt` file in the +img_class use case for an example. + +Then, you must set `<use_case>_MODEL_TFLITE_PATH` to the location of +the Vela processed model file and `<use_case>_LABELS_TXT_FILE` to the +location of the associated labels file: + +```commandline +cmake \ + -D<use_case>_MODEL_TFLITE_PATH=<path/to/custom_model_after_vela.tflite> \ + -D<use_case>_LABELS_TXT_FILE=<path/to/labels_custom_model.txt> \ + -DTARGET_PLATFORM=mps3 \ + -DTARGET_SUBSYSTEM=sse-300 \ + -DCMAKE_TOOLCHAIN_FILE=scripts/cmake/bare-metal-toolchain.cmake .. +``` + +> **Note:** For the specific use case command, see the relevant section in the use case documentation. + +For Windows, add `-G "MinGW Makefiles"` to the CMake command. + +> **Note:** Clean the build directory before re-running the CMake command. + +The TensorFlow Lite for Microcontrollers model pointed to by `<use_case>_MODEL_TFLITE_PATH` and +the labels text file pointed to by `<use_case>_LABELS_TXT_FILE` will be +converted to C++ files during the CMake configuration stage and then +compiled into the application to perform inference with. + +The log from the configuration stage should tell you what model path and +labels file have been used: + +```log +-- User option TARGET_PLATFORM is set to mps3 +-- User option <use_case>_MODEL_TFLITE_PATH is set to +<path/to/custom_model_after_vela.tflite> +... +-- User option <use_case>_LABELS_TXT_FILE is set to +<path/to/labels_custom_model.txt> +... +-- Using <path/to/custom_model_after_vela.tflite> +++ Converting custom_model_after_vela.tflite to custom_model_after_vela.tflite.cc +-- Generating labels file from <path/to/labels_custom_model.txt> +-- writing to <path/to/build>/generated/include/Labels.hpp and <path/to/build>/generated/src/Labels.cc +... +``` + +After compiling, your custom model will have replaced the default +one in the application. 
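The labels conversion step described above can be pictured with a short sketch: one label per line of the input file becomes one entry of a C++ string array. The function name and output format below are illustrative only; the real generator lives in the project's Python scripts and uses templates:

```python
# Illustrative sketch of the labels-to-C++ conversion performed at CMake
# configure time. Names and output layout are made up for illustration;
# they do not match the repository's actual generator or templates.
def labels_to_cpp_array(labels_txt: str, array_name: str = "labelsVec") -> str:
    """Turn a labels text file's content into a C++ string-array definition."""
    labels = [line.strip() for line in labels_txt.splitlines() if line.strip()]
    entries = ",\n".join(f'    "{label}"' for label in labels)
    return f"static const char* {array_name}[] = {{\n{entries}\n}};\n"

cpp = labels_to_cpp_array("tabby cat\ntiger\nkimono\n")
assert '"tiger"' in cpp and cpp.count('"') == 6  # 3 labels, 2 quotes each
```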
+ +## Optimize custom model with Vela compiler + +> **Note:** This tool is not available within this project. +It is a Python tool available from <https://pypi.org/project/ethos-u-vela/>. +The source code is hosted on <https://git.mlplatform.org/ml/ethos-u/ethos-u-vela.git/>. + +The Vela compiler is a tool that can optimize a neural network model +into a version that can run on an embedded system containing Ethos-U55. + +The optimized model will contain custom operators for sub-graphs of the +model that can be accelerated by Ethos-U55; the remaining layers that +cannot be accelerated are left unchanged and will run on the CPU using +optimized (CMSIS-NN) or reference kernels provided by the inference +engine. + +After the compilation, the optimized model can only be executed on a +system with Ethos-U55. + +> **Note:** The NN model provided during the build and compiled into the application +executable binary defines whether CPU or NPU is used to execute workloads. +If an unoptimized model is used, then inference will run on the Cortex-M CPU. + +The Vela compiler accepts parameters to influence the model optimization. The +model provided within this project has been optimized with +the following parameters: + +```commandline +vela \ + --accelerator-config=ethos-u55-128 \ + --block-config-limit=0 \ + --config my_vela_cfg.ini \ + --memory-mode Shared_Sram \ + --system-config Ethos_U55_High_End_Embedded \ + <model>.tflite +``` + +Where: + +- `--accelerator-config`: Specifies the accelerator configuration to use + between ethos-u55-256, ethos-u55-128, ethos-u55-64 and ethos-u55-32. +- `--block-config-limit`: Limits the block config search space; use zero for + unlimited. +- `--config`: Specifies the path to the Vela configuration file. The format of the file is a Python ConfigParser .ini file. + An example can be found in the `dependencies` folder [vela.ini](../../scripts/vela/vela.ini). +- `--memory-mode`: Selects the memory mode to use as specified in the Vela configuration file. 
+- `--system-config`: Selects the system configuration to use as specified in the Vela configuration file. + +The Vela compiler accepts a `.tflite` file as input and saves the optimized network +model as a `.tflite` file. + +Using `--show-cpu-operations` and `--show-subgraph-io-summary` will show +all the operations that fall back to the CPU and a summary of all the +subgraphs and their inputs and outputs. + +To see the Vela help for all parameters, use: `vela --help`. + +Please get in touch with your Arm representative to request access to +Vela Compiler documentation for more details. + +> **Note:** By default, use of the Ethos-U55 is enabled in the CMake configuration. +This can be changed by passing `-DETHOS_U55_ENABLED=0`. + +## Memory constraints + +Both the MPS3 Fixed Virtual Platform and the MPS3 FPGA platform share +the linker script (scatter file) for the SSE-300 design. The design is set +by the CMake configuration parameter `TARGET_SUBSYSTEM` as described in +the previous section. + +The memory map exposed by this design is presented in Appendix 1. This +can be used as a reference when editing the scatter file, especially to +make sure that region boundaries are respected. The snippet from MPS3's +scatter file is presented below: + +``` +;--------------------------------------------------------- +; First load region +;--------------------------------------------------------- +LOAD_REGION_0 0x00000000 0x00080000 +{ + ;----------------------------------------------------- + ; First part of code mem -- 512kiB + ;----------------------------------------------------- + itcm.bin 0x00000000 0x00080000 + { + *.o (RESET, +First) + * (InRoot$$Sections) + .ANY (+RO) + } + + ;----------------------------------------------------- + ; 128kiB of 512kiB bank is used for any other RW or ZI + ; data. 
Note: this region is internal to the Cortex-M CPU + ;----------------------------------------------------- + dtcm.bin 0x20000000 0x00020000 + { + .ANY(+RW +ZI) + } + + ;----------------------------------------------------- + ; 128kiB of stack space within the DTCM region + ;----------------------------------------------------- + ARM_LIB_STACK 0x20020000 EMPTY ALIGN 8 0x00020000 + {} + + ;----------------------------------------------------- + ; 256kiB of heap space within the DTCM region + ;----------------------------------------------------- + + ARM_LIB_HEAP 0x20040000 EMPTY ALIGN 8 0x00040000 + {} + + ;----------------------------------------------------- + ; SSE-300's internal SRAM + ;----------------------------------------------------- + isram.bin 0x21000000 UNINIT ALIGN 16 0x00080000 + { + ; activation buffers a.k.a tensor arena + *.o (.bss.NoInit.activation_buf) + } +} + +;--------------------------------------------------------- +; Second load region +;--------------------------------------------------------- +LOAD_REGION_1 0x60000000 0x02000000 +{ + ;----------------------------------------------------- + ; 32 MiB of DRAM space for nn model and input vectors + ;----------------------------------------------------- + dram.bin 0x60000000 ALIGN 16 0x02000000 + { + ; nn model's baked in input matrices + *.o (ifm) + + ; nn model + *.o (nn_model) + + ; if the activation buffer (tensor arena) doesn't + ; fit in the SRAM region, we accommodate it here + *.o (activation_buf) + } +} +``` + +It is worth noting that in the bitfile implementation, only the BRAM, +internal SRAM and DDR memory regions are accessible to the Ethos-U55 +block. In the above snippet, the internal SRAM region memory can be seen +to be utilized by activation buffers with a limit of 512kiB. If used, +this region will be written to by the Ethos-U55 block frequently. A bigger +region of memory for storing the model is placed in the DDR region, +under LOAD_REGION_1. 
The two load regions are necessary as the MPS3's +motherboard configuration controller limits the load size at address +0x00000000 to 512 kiB. This has implications for how the application **is +deployed** on MPS3, as explained in section 3.8.3. + +## Automatic file generation + +As mentioned in the previous sections, some files such as neural network +models, network's inputs, and output labels are automatically converted +into C/C++ arrays during the CMake project configuration stage. +Additionally, some code is generated to allow access to these arrays. + +An example: + +```log +-- Building use-cases: img_class. +-- Found sources for use-case img_class +-- User option img_class_FILE_PATH is set to /tmp/samples +-- User option img_class_IMAGE_SIZE is set to 224 +-- User option img_class_LABELS_TXT_FILE is set to /tmp/labels/labels_model.txt +-- Generating image files from /tmp/samples +++ Converting cat.bmp to cat.cc +++ Converting dog.bmp to dog.cc +-- Skipping file /tmp/samples/files.md due to unsupported image format. +++ Converting kimono.bmp to kimono.cc +++ Converting tiger.bmp to tiger.cc +++ Generating /tmp/build/generated/img_class/include/InputFiles.hpp +-- Generating labels file from /tmp/labels/labels_model.txt +-- writing to /tmp/build/generated/img_class/include/Labels.hpp and /tmp/build/generated/img_class/src/Labels.cc +-- User option img_class_ACTIVATION_BUF_SZ is set to 0x00200000 +-- User option img_class_MODEL_TFLITE_PATH is set to /tmp/models/model.tflite +-- Using /tmp/models/model.tflite +++ Converting model.tflite to model.tflite.cc +... +``` + +In particular, the build options pointing to the input files `<use_case>_FILE_PATH`, +the model `<use_case>_MODEL_TFLITE_PATH` and the labels text file `<use_case>_LABELS_TXT_FILE` +are used by Python scripts in order to generate not only the converted array files, +but also some headers with utility functions. 
+ +For example, the generated utility functions for image classification are: + +- `build/generated/include/InputFiles.hpp` + +```c++ +#ifndef GENERATED_IMAGES_H +#define GENERATED_IMAGES_H + +#include <cstdint> + +#define NUMBER_OF_FILES (2U) +#define IMAGE_DATA_SIZE (150528U) + +extern const uint8_t im0[IMAGE_DATA_SIZE]; +extern const uint8_t im1[IMAGE_DATA_SIZE]; + +const char* get_filename(const uint32_t idx); +const uint8_t* get_img_array(const uint32_t idx); + +#endif /* GENERATED_IMAGES_H */ +``` + +- `build/generated/src/InputFiles.cc` + +```c++ +#include "InputFiles.hpp" + +static const char *img_filenames[] = { + "img1.bmp", + "img2.bmp", +}; + +static const uint8_t *img_arrays[] = { + im0, + im1 +}; + +const char* get_filename(const uint32_t idx) +{ + if (idx < NUMBER_OF_FILES) { + return img_filenames[idx]; + } + return nullptr; +} + +const uint8_t* get_img_array(const uint32_t idx) +{ + if (idx < NUMBER_OF_FILES) { + return img_arrays[idx]; + } + return nullptr; +} +``` + +These headers are generated using python templates, that are in `scripts/py/templates/*.template`. + +```tree +scripts/ +├── cmake +│ ├── ... +│ ├── subsystem-profiles +│ │ ├── corstone-sse-200.cmake +│ │ └── corstone-sse-300.cmake +│ ├── templates +│ │ ├── mem_regions.h.template +│ │ ├── peripheral_irqs.h.template +│ │ └── peripheral_memmap.h.template +│ └── ... +└── py + ├── <generation scripts> + ├── requirements.txt + └── templates + ├── audio.cc.template + ├── AudioClips.cc.template + ├── AudioClips.hpp.template + ├── default.hpp.template + ├── header_template.txt + ├── image.cc.template + ├── Images.cc.template + ├── Images.hpp.template + ├── Labels.cc.template + ├── Labels.hpp.template + ├── testdata.cc.template + ├── TestData.cc.template + ├── TestData.hpp.template + └── tflite.cc.template +``` + +Based on the type of use case the correct conversion is called in the use case cmake file +(audio or image respectively for voice or vision use cases). 
+
+For example, these are the generation calls for image classification (`source/use_case/img_class/usecase.cmake`):
+
+```cmake
+# Generate input files
+generate_images_code("${${use_case}_FILE_PATH}"
+    ${SRC_GEN_DIR}
+    ${INC_GEN_DIR}
+    "${${use_case}_IMAGE_SIZE}")
+
+# Generate labels file
+set(${use_case}_LABELS_CPP_FILE Labels)
+generate_labels_code(
+    INPUT           "${${use_case}_LABELS_TXT_FILE}"
+    DESTINATION_SRC ${SRC_GEN_DIR}
+    DESTINATION_HDR ${INC_GEN_DIR}
+    OUTPUT_FILENAME "${${use_case}_LABELS_CPP_FILE}"
+)
+
+...
+
+# Generate model file
+generate_tflite_code(
+    MODEL_PATH ${${use_case}_MODEL_TFLITE_PATH}
+    DESTINATION ${SRC_GEN_DIR}
+)
+```
+
+> **Note:** When required, the model and labels conversions can take extra parameters, such as
+> extra code to put in the `<model>.cc` file, or namespaces:
+>
+> ```cmake
+> set(${use_case}_LABELS_CPP_FILE Labels)
+> generate_labels_code(
+>     INPUT           "${${use_case}_LABELS_TXT_FILE}"
+>     DESTINATION_SRC ${SRC_GEN_DIR}
+>     DESTINATION_HDR ${INC_GEN_DIR}
+>     OUTPUT_FILENAME "${${use_case}_LABELS_CPP_FILE}"
+>     NAMESPACE       "namespace1" "namespace2"
+> )
+>
+> ...
+>
+> set(EXTRA_MODEL_CODE
+>     "/* Model parameters for ${use_case} */"
+>     "extern const int g_myvariable1 = value1"
+>     "extern const int g_myvariable2 = value2"
+>     )
+>
+> generate_tflite_code(
+>     MODEL_PATH  ${${use_case}_MODEL_TFLITE_PATH}
+>     DESTINATION ${SRC_GEN_DIR}
+>     EXPRESSIONS ${EXTRA_MODEL_CODE}
+>     NAMESPACE   "namespace1" "namespace2"
+> )
+> ```
+
+In addition to the input file conversions, the correct platform/system profile is selected
+(in `scripts/cmake/subsystem-profiles/*.cmake`) based on the `TARGET_SUBSYSTEM` build option,
+and the variables set there are used to generate memory region sizes, base addresses and IRQ
+numbers, which are in turn used to generate the mem_regions.h, peripheral_irqs.h and
+peripheral_memmap.h headers.
+Templates from `scripts/cmake/templates/*.template` are used to generate these header files.
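For illustration, a source file produced by `generate_tflite_code` is broadly of the following shape. This is a simplified sketch only: the real generated file embeds the complete `.tflite` flatbuffer byte array, and the exact array name, section placement, alignment and namespaces depend on the build options passed to the generation call.

```cpp
#include <cstddef>
#include <cstdint>

namespace arm {
namespace app {

/* Illustrative stand-in for the generated data: the real array holds the
 * whole .tflite file, usually placed in a dedicated linker section. */
static const uint8_t nn_model[] __attribute__((aligned(16))) = {
    0x1c, 0x00, 0x00, 0x00,   /* flatbuffer root offset */
    0x54, 0x46, 0x4c, 0x33,   /* "TFL3" file identifier */
    /* ... remaining model bytes ... */
};

/* Accessors emitted by the template so application code never references
 * the array directly. */
const uint8_t* GetModelPointer()
{
    return nn_model;
}

size_t GetModelLen()
{
    return sizeof(nn_model);
}

} /* namespace app */
} /* namespace arm */
```

Keeping the array behind `GetModelPointer()`/`GetModelLen()` accessors means the use-case code stays independent of which particular model file was baked into the build.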
+ +After the build, the files generated in the build folder are: + +```tree +build/generated/ +├── bsp +│ ├── mem_regions.h +│ ├── peripheral_irqs.h +│ └── peripheral_memmap.h +├── <use_case_name1> +│ ├── include +│ │ ├── InputFiles.hpp +│ │ └── Labels.hpp +│ └── src +│ ├── <uc1_input_file1>.cc +│ ├── <uc1_input_file2>.cc +│ ├── InputFiles.cc +│ ├── Labels.cc +│ └── <uc1_model_name>.tflite.cc +└── <use_case_name2> + ├── include + │ ├── InputFiles.hpp + │ └── Labels.hpp + └── src + ├── <uc2_input_file1>.cc + ├── <uc2_input_file2>.cc + ├── InputFiles.cc + ├── Labels.cc + └── <uc2_model_name>.tflite.cc +``` + +Next section of the documentation: [Deployment](../documentation.md#Deployment). diff --git a/docs/sections/coding_guidelines.md b/docs/sections/coding_guidelines.md new file mode 100644 index 0000000..f1813d3 --- /dev/null +++ b/docs/sections/coding_guidelines.md @@ -0,0 +1,323 @@ +# Coding standards and guidelines + +## Contents + +- [Introduction](#introduction) +- [Language version](#language-version) +- [File naming](#file-naming) +- [File layout](#file-layout) +- [Block Management](#block-management) +- [Naming Conventions](#naming-conventions) + - [C++ language naming conventions](#c_language-naming-conventions) + - [C language naming conventions](#c-language-naming-conventions) +- [Layout and formatting conventions](#layout-and-formatting-conventions) +- [Language usage](#language-usage) + +## Introduction + +This document presents some standard coding guidelines to be followed for contributions to this repository. Most of the +code is written in C++, but there is some written in C as well. There is a clear C/C++ boundary at the Hardware +Abstraction Layer (HAL). Both these languages follow different naming conventions within this repository, by design, to: + +- have clearly distinguishable C and C++ sources. +- make cross language function calls stand out. Mostly these will be C++ function calls to the HAL functions written in C. 
+However, because we also issue function calls to third party API's (and they may not follow these conventions), the +intended outcome may not be fully realised in all of the cases. + +## Language version + +For this project, code written in C++ shall use a subset of the C++11 feature set and software +may be written using the C++11 language standard. Code written in C should be compatible +with the C99 standard. + +Software components written in C/C++ may use the language features allowed and encouraged by this documentation. + +## File naming + +- C files should have `.c` extension +- C++ files should have `.cc` or `.cpp` extension. +- Header files for functions implemented in C should have `.h` extension. +- Header files for functions implemented in C++ should have `.hpp` extension. + +## File layout + +- Standard copyright notice must be included in all files: + + ```copyright + /* + * Copyright (c) <years additions were made to project> <your name>, Arm Limited. All rights reserved. + * SPDX-License-Identifier: Apache-2.0 + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + ``` + +- Source lines must be no longer than 120 characters. 
Prefer to spread code out vertically rather than horizontally,
+  wherever it makes sense:
+
+  ```C++
+  // This is significantly easier to read
+  enum class SomeEnum1
+  {
+      ENUM_VALUE_1,
+      ENUM_VALUE_2,
+      ENUM_VALUE_3
+  };
+
+  // than this
+  enum class SomeEnum2 { ENUM_VALUE_1, ENUM_VALUE_2, ENUM_VALUE_3 };
+  ```
+
+- Block indentation should use 4 spaces, no tabs.
+
+- Each statement must be on a separate line.
+
+  ```C++
+  int a, b; // Error prone
+  int c, *d;
+
+  int e = 0; // GOOD
+  int *p = nullptr; // GOOD
+  ```
+
+- Source must not contain commented-out code or unreachable code.
+
+## Block Management
+
+- Blocks must use braces, and brace placement must be consistent.
+  - Each function has its opening brace on the next line, at the same indentation level as its header; the code within
+    the braces is indented, and the closing brace at the end is at the same level as the opening one.
+    For compactness, if the class/function body is empty, braces on the same line are accepted.
+
+  - Conditional statements and loops, even those with just a single-statement body, need to be surrounded by braces; the
+opening brace is on the same line, and the closing brace is on the next line, at the same indentation level as its header;
+the same rule applies to classes.
+
+    ```C++
+    class Class1 {
+    public:
+        Class1();
+    private:
+        int element;
+    };
+
+    void NotEmptyFunction()
+    {
+        if (condition) {
+            // [...]
+        } else {
+            // [...]
+        }
+        // [...]
+        for (start_cond; end_cond; step_cond) {
+            // [...]
+        }
+    }
+
+    void EmptyFunction() {}
+    ```
+
+  - Cases within a switch statement are indented and enclosed in braces:
+
+    ```C++
+    switch (option)
+    {
+        case 1:
+        {
+            // handle option 1
+            break;
+        }
+        case 2:
+        {
+            // handle option 2
+            break;
+        }
+        default:
+        {
+            break;
+        }
+    }
+    ```
+
+## Naming Conventions
+
+### C++ language naming conventions
+
+- Type (class, struct, enum) and function names must be `PascalCase`:
+
+  ```C++
+  class SomeClass
+  {
+      // [...]
+  };
+  void SomeFunction()
+  {
+      // [...]
+ } + ``` + +- Variables and parameter names must be `camelCase`: + + ```C++ + int someVariable; + + void SomeFunction(int someParameter) {} + ``` + +- Macros, pre-processor definitions, and enumeration values should use upper case names: + + ```C++ + #define SOME_DEFINE + + enum class SomeEnum + { + ENUM_VALUE_1, + ENUM_VALUE_2 + }; + ``` + +- Namespace names must be lower case + + ```C++ + namespace nspace + { + void FunctionInNamespace(); + }; + ``` + +- Source code should use Hungarian notation to annotate the name of a variable with information about its meaning. + + | Prefix | Class | Description | + | ------ | ----- | ----------- | + | p | Type | Pointer to any other type | + | k | Qualifier | Constant | + | v | Qualifier | Volatile | + | m | Scope | Member of a class or struct | + | s | Scope | Static | + | g | Scope | Used to indicate variable has scope beyond the current function: file-scope or externally visible scope| + +The following examples of Hungarian notation are one possible set of uses: + + ```C++ + int g_GlobalInt=123; + char* m_pNameOfMemberPointer=nullptr; + const float g_kSomeGlobalConstant = 1.234f; + static float ms_MyStaticMember = 4.321f; + bool myLocalVariable=true; + ``` + +### C language naming conventions + +For C sources, we follow the Linux variant of the K&R style wherever possible. + +- For function and variable names we use `snake_case` convention: + + ```C + int some_variable; + + void some_function(int some_parameter) {} + ``` + +- Macros, pre-processor definitions, and enumeration values should use upper case names: + + ```C + #define SOME_DEFINE + + enum some_enum + { + ENUM_VALUE_1, + ENUM_VALUE_2 + }; + ``` + +## Layout and formatting conventions + +- C++ class code layout + Public function definitions should be at the top of a class definition, since they are things most likely to be used +by other people. + Private functions and member variables should be last. 
+  Class functions and member variables should be laid out logically in blocks of related functionality.
+
+- Class access specifiers (`public`, `protected`, `private`) are not indented.
+
+  ```C++
+  class MyClass
+  {
+  public:
+      int m_PublicMember;
+  protected:
+      int m_ProtectedMember;
+  private:
+      int m_PrivateMember;
+  };
+  ```
+
+- Don't leave trailing spaces at the end of lines.
+
+- Empty lines should have no trailing spaces.
+
+- For pointers and references, the symbols `*` and `&` should be adjacent to the name of the type, not the name
+  of the variable.
+
+  ```C++
+  const char* someText = "abc";
+
+  void SomeFunction(const SomeObject& someObject) {}
+  ```
+
+## Language usage
+
+- Header `#include` statements should be minimized.
+  Inclusion of unnecessary headers slows down compilation, and can hide errors where a function calls a
+  subroutine which it should not be using, if the unnecessary header defining this subroutine is included.
+
+  Header statements should be included in the following order:
+
+  - Header file corresponding to the current source file (if applicable)
+  - Headers from the same component
+  - Headers from other components
+  - Third-party headers
+  - System headers
+
+  > **Note:** Leave one blank line between each of these groups for readability.
+  > Use quotes for headers from within the same project and angle brackets for third-party and system headers.
+  > Do not use paths relative to the current source file, such as `../Header.hpp`. Instead, configure your include paths
+  > in the project makefiles.
+
+  ```C++
+  #include "ExampleClass.hpp"     // Own header
+
+  #include "Header1.hpp"          // Header from same component
+  #include "Header2.hpp"          // Header from same component
+
+  #include "other/Header3.hpp"    // Header from other component
+
+  #include <ThirdParty.hpp>       // Third-party headers
+
+  #include <vector>               // System header
+
+  // [...]
+  ```
+
+- C++ casts should use the template-style cast syntax:
+
+  ```C++
+  int a = 100;
+  float b = (float)a;               // Not OK
+  float c = static_cast<float>(a);  // OK
+  ```
+
+- Use the `const` keyword to declare constants instead of `#define`.
+
+- Use `nullptr` instead of `NULL`.
+  C++11 introduced the `nullptr` type to distinguish null pointer constants from the integer 0.
diff --git a/docs/sections/customizing.md b/docs/sections/customizing.md
new file mode 100644
index 0000000..e92c327
--- /dev/null
+++ b/docs/sections/customizing.md
@@ -0,0 +1,731 @@
+# Implementing custom ML application
+
+- [Software project description](#software-project-description)
+- [HAL API](#hal-api)
+- [Main loop function](#main-loop-function)
+- [Application context](#application-context)
+- [Profiler](#profiler)
+- [NN Model API](#nn-model-api)
+- [Adding custom ML use case](#adding-custom-ml-use-case)
+- [Implementing main loop](#implementing-main-loop)
+- [Implementing custom NN model](#implementing-custom-nn-model)
+- [Executing inference](#executing-inference)
+- [Printing to console](#printing-to-console)
+- [Reading user input from console](#reading-user-input-from-console)
+- [Output to MPS3 LCD](#output-to-mps3-lcd)
+- [Building custom use case](#building-custom-use-case)
+
+This section describes how to implement a custom Machine Learning
+application running on the Fast Model FVP or on the Arm MPS3 FPGA prototyping board.
+
+The Arm® Ethos™-U55 code sample software project offers a simple way to incorporate
+additional use-case code into the existing infrastructure, and provides a build
+system that automatically picks up added functionality and produces a corresponding
+executable for each use-case. This is achieved by following certain configuration
+and code implementation conventions.
+
+The following notation indicates the important conventions to apply:
+
+> **Convention:** The code is developed using the C++11 and C99 standards.
+This is governed by the TensorFlow Lite for Microcontrollers framework.
+
+## Software project description
+
+As mentioned in the [Repository structure](../documentation.md#repository-structure) section, project sources are:
+
+```tree
+├── docs
+│   ├── ...
+│   └── Documentation.md
+├── resources
+│   └── img_class
+│       └── ...
+├── scripts
+│   └── ...
+├── source
+│   ├── application
+│   │   ├── hal
+│   │   ├── main
+│   │   └── tensorflow-lite-micro
+│   └── use_case
+│       └── img_class
+├── CMakeLists.txt
+└── Readme.md
+```
+
+The `source` directory contains C/C++ sources for the platform and ML applications.
+Common code related to the Ethos-U55 code samples software
+framework resides in the *application* sub-folder, and ML application-specific logic
+(use-cases) sources are in the *use_case* sub-folder.
+
+> **Convention**: Separate use-cases must be organized in sub-folders under the use_case folder.
+The name of the directory is used as the name for this use-case and can be provided
+as a `USE_CASE_BUILD` parameter value.
+The build system expects the sources for the use-case to be structured as follows:
+headers in an `include` directory, C/C++ sources in a `src` directory.
+For example:
+>
+>```tree
+>use_case
+>  └── img_class
+>        ├── include
+>        │   └── *.hpp
+>        └── src
+>            └── *.cc
+>```
+
+## HAL API
+
+The hardware abstraction layer is represented by the following interfaces.
+To access them, include the hal.h header.
+
+- *hal_platform* structure:\
+  Structure that defines a platform context to be used by the application.
+
+  | Attribute name   | Description |
+  |------------------|-------------|
+  | inited           | Initialization flag. Set after the platform_init() function is called. |
+  | plat_name        | Platform name. It is set to "mps3-bare" for MPS3 builds and "FVP" for Fast Model builds. 
| data_acq         | Pointer to the data acquisition module, responsible for user interaction and other data collection for the application logic. |
+  | data_psn         | Pointer to the data presentation module, responsible for data output through the components available on the selected platform: LCD for MPS3, console for Fast Model. |
+  | timer            | Pointer to the platform timer implementation (see platform_timer). |
+  | platform_init    | Pointer to the platform initialization function. |
+  | platform_release | Pointer to the platform release function. |
+
+- *hal_init* function:\
+  Initializes the HAL structure based on the compile-time config. This
+  should be called before any other function in this API.
+
+  | Parameter name | Description |
+  |----------------|-------------|
+  | platform       | Pointer to a pre-allocated *hal_platform* struct. |
+  | data_acq       | Pointer to a pre-allocated data acquisition module. |
+  | data_psn       | Pointer to a pre-allocated data presentation module. |
+  | timer          | Pointer to a pre-allocated timer module. |
+  | return         | Zero if successful, error code otherwise. |
+
+- *hal_platform_init* function:\
+  Initializes the HAL platform and all the modules on the platform that the
+  application requires to run.
+
+  | Parameter name | Description |
+  |----------------|-------------|
+  | platform       | Pointer to a pre-allocated and initialized *hal_platform* struct. |
+  | return         | Zero if successful, error code otherwise. |
+
+- *hal_platform_release* function:\
+  Releases the HAL platform. This should release the resources acquired.
+
+  | Parameter name | Description |
+  |----------------|-------------|
+  | platform       | Pointer to a pre-allocated and initialized *hal_platform* struct. |
+
+- *data_acq_module* structure:\
+  Structure to encompass the data acquisition module and its
+  methods. 
+
+  | Attribute name | Description |
+  |----------------|-------------|
+  | inited         | Initialization flag. Set after the system_init() function is called. |
+  | system_name    | Channel name. It is set to "UART" for both MPS3 and Fast Model builds. |
+  | system_init    | Pointer to the data acquisition module initialization function. The pointer is set according to the platform selected during the build. This function is called by the platform initialization routines. |
+  | get_input      | Pointer to a function reading user input. The pointer is set according to the platform selected during the build. For MPS3 and Fast Model environments, the function reads data from the UART. |
+
+- *data_psn_module* structure:\
+  Structure to encompass the data presentation module and its methods.
+
+  | Attribute name     | Description |
+  |--------------------|-------------|
+  | inited             | Initialization flag. It is set after the system_init() function is called. |
+  | system_name        | System component name used to present data. It is set to "lcd" for the MPS3 build and to "log_psn" for the Fast Model build. In the case of Fast Model, all pixel drawing functions are replaced by console output of the data summary. |
+  | system_init        | Pointer to the data presentation module initialization function. The pointer is set according to the platform selected during the build. This function is called by the platform initialization routines. |
+  | present_data_image | Pointer to a function to draw an image. The pointer is set according to the platform selected during the build. For MPS3, the image will be drawn on the LCD; for Fast Model, an image summary will be printed over the UART (coordinates, channel info, downsample factor). |
+  | present_data_text  | Pointer to a function to print text. The pointer is set according to the platform selected during the build. For MPS3, the text will be drawn on the LCD; for Fast Model, the text will be printed over the UART. 
| + | present_box | Pointer to a function to draw a rectangle. The pointer is set according to the selected platform during the build. For MPS3, the image will be drawn on the LCD; for fastmodel image summary will be printed in the UART. | + | clear | Pointer to a function to clear the output. The pointer is set according to the selected platform during the build. For MPS3, the function will clear the LCD; for fastmodel will do nothing. | + | set_text_color | Pointer to a function to set text color for the next call of present_data_text() function. The pointer is set according to the selected platform during the build. For MPS3, the function will set the color for the text printed on the LCD; for fastmodel -- will do nothing. | + | set_led | Pointer to a function controlling an LED (led_num) with on/off | + +- *platform_timer* structure:\ + Structure to hold a platform specific timer implementation. + + | Attribute name | Description | + |--------------------|------------------------------------------------| + | inited | Initialization flag. It is set after the timer is initialized by the *hal_platform_init* function. | + | reset | Pointer to a function to reset a timer. | + | get_time_counter | Pointer to a function to get current time counter. | + | get_duration_ms | Pointer to a function to calculate duration between two time-counters in milliseconds. | + | get_duration_us | Pointer to a function to calculate duration between two time-counters in microseconds | + | get_npu_cycle_diff | Pointer to a function to calculate duration between two time-counters in Ethos-U55 cycles. Available only when project is configured with ETHOS_U55_ENABLED set. 
|
+
+Example of the API initialization in the main function:
+
+```c++
+#include "hal.h"
+
+int main()
+{
+    hal_platform platform;
+    data_acq_module dataAcq;
+    data_psn_module dataPsn;
+    platform_timer timer;
+
+    /* Initialise the HAL and platform */
+    hal_init(&platform, &dataAcq, &dataPsn, &timer);
+    hal_platform_init(&platform);
+
+    ...
+
+    hal_platform_release(&platform);
+
+    return 0;
+}
+```
+
+## Main loop function
+
+The code sample application's main function delegates the use-case
+logic execution to the main loop function, which must be implemented for
+each custom ML scenario.
+
+The main loop function takes the initialized *hal_platform* structure
+pointer as an argument.
+
+The main loop function has external linkage, and the main executable for the
+use-case will reference the function defined in the use-case
+code.
+
+```c++
+void main_loop(hal_platform& platform) {
+
+...
+
+}
+```
+
+## Application context
+
+The application context can be used as a holder for state between main
+loop iterations. Include AppContext.hpp to use the ApplicationContext class.
+
+| Method name | Description |
+|-------------|-------------|
+| Set         | Saves the given value as a named attribute in the context. |
+| Get         | Gets the saved attribute from the context by the given name. |
+| Has         | Checks if an attribute with a given name exists in the context. |
+
+For example:
+
+```c++
+#include "hal.h"
+#include "AppContext.hpp"
+
+void main_loop(hal_platform& platform) {
+
+    /* Instantiate application context */
+    arm::app::ApplicationContext caseContext;
+    caseContext.Set<hal_platform&>("platform", platform);
+    caseContext.Set<uint32_t>("counter", 0);
+
+    /* loop */
+    while (true) {
+        // do something, pass application context down the call stack
+    }
+}
+```
+
+## Profiler
+
+The Profiler is a helper class that assists in the collection of timings and
+Ethos-U55 cycle counts for operations. 
It uses the platform timer to get
+system timing information.
+
+| Method name        | Description |
+|--------------------|-------------|
+| StartProfiling     | Starts profiling and records the starting timing data. |
+| StopProfiling      | Stops profiling and records the ending timing data. |
+| Reset              | Resets the profiler and clears all collected data. |
+| GetResultsAndReset | Gets the results as a string and resets the profiler. |
+
+Usage example:
+
+```c++
+Profiler profiler{&platform, "Inference"};
+
+profiler.StartProfiling();
+// Code running inference to profile
+profiler.StopProfiling();
+
+info("%s\n", profiler.GetResultsAndReset().c_str());
+```
+
+## NN Model API
+
+Model (which refers to a neural network model) is an abstract class wrapping the
+underlying TensorFlow Lite Micro API. It provides methods to perform
+common operations such as TensorFlow Lite Micro framework
+initialization, inference execution, and accessing input and output tensor
+objects.
+
+To use this abstraction, import the TensorFlowLiteMicro.hpp header.
+
+| Method name             | Description |
+|-------------------------|-------------|
+| GetInputTensor          | Returns the pointer to the model's input tensor. |
+| GetOutputTensor         | Returns the pointer to the model's output tensor. |
+| GetType                 | Returns the model's data type. |
+| GetInputShape           | Returns the pointer to the model's input shape. |
+| GetOutputShape          | Returns the pointer to the model's output shape. |
+| LogTensorInfo           | Logs the tensor information to stdout for the given tensor pointer: tensor name, tensor address, tensor type, tensor memory size and quantization params. |
+| LogInterpreterInfo      | Logs the interpreter information to stdout. |
+| Init                    | Initializes the TensorFlow Lite Micro framework and allocates the required memory for the model. |
+| IsInited                | Checks if this model object has been initialized. 
|
+| IsDataSigned            | Checks if the model uses a signed data type. |
+| RunInference            | Runs the inference (invokes the interpreter). |
+| GetOpResolver           | Returns the reference to the TensorFlow Lite Micro operator resolver. |
+| EnlistOperations        | Registers the required operators with the TensorFlow Lite Micro operator resolver. |
+| GetTensorArena          | Returns a pointer to the memory region to be used for tensor allocations. |
+| GetActivationBufferSize | Returns the size of the tensor arena memory region. |
+
+> **Convention**: Each ML use-case must extend this class and implement the protected virtual methods:
+>
+>```c++
+>virtual const tflite::MicroOpResolver& GetOpResolver() = 0;
+>virtual bool EnlistOperations() = 0;
+>virtual uint8_t* GetTensorArena() = 0;
+>virtual size_t GetActivationBufferSize() = 0;
+>```
+>
+>Network models have different sets of operators that must be registered with
+the tflite::MicroMutableOpResolver object in the EnlistOperations method.
+Network models may also require different activation buffer sizes; the buffer is returned as
+the tensor arena memory for the TensorFlow Lite Micro framework by the GetTensorArena
+and GetActivationBufferSize methods.
+
+Please see the MobileNetModel.hpp and MobileNetModel.cc files from the image
+classification ML application use-case as an example of extending the model base
+class.
+
+## Adding custom ML use case
+
+This section describes how to implement an additional use-case and compile
+it into a binary executable to run with Fast Model or the MPS3 FPGA board.
+It covers the major common steps: application main loop creation,
+description of the NN model, and inference execution.
+
+In addition, a few useful examples are provided: reading user input,
+printing to the console, and drawing images on the MPS3 LCD. 
+
+Start by creating a sub-directory under the *use_case* directory, with two
+further directories *src* and *include* inside it, as described in the
+[Software project description](#software-project-description) section:
+
+```tree
+use_case
+  └── hello_world
+        ├── include
+        └── src
+```
+
+## Implementing main loop
+
+The use-case main loop is the place to put the use-case main logic. Essentially,
+it is an infinite loop that reacts to user input, triggers use-case
+conditional logic based on the input, and presents results back to the
+user. However, it could also be simple logic that runs a single inference
+and then exits.
+
+The main loop has knowledge about the platform, and has access to the
+platform components through the hardware abstraction layer (referred to as HAL).
+
+Create a *MainLoop.cc* file in the *src* directory (the one created under
+[Adding custom ML use case](#adding-custom-ml-use-case)); the file name is not
+important. Define the *main_loop* function with the signature described in
+[Main loop function](#main-loop-function):
+
+```c++
+#include "hal.h"
+
+void main_loop(hal_platform& platform) {
+    printf("Hello world!");
+}
+```
+
+The above is already a working use-case. If you compile and run it (see
+[Building custom use case](#building-custom-use-case)), the application will start, print a
+message to the console and exit straight away.
+
+Now, you can start filling this function with logic.
+
+## Implementing custom NN model
+
+Before inference can be run with a custom NN model, the TensorFlow Lite
+Micro framework must learn about the operators/layers included in the
+model. The developer must register the operators using the *MicroMutableOpResolver*
+API.
+
+The Ethos-U55 code samples project has an abstraction around the TensorFlow
+Lite Micro API (see [NN model API](#nn-model-api)). Create *HelloWorldModel.hpp* in
+the use-case include sub-directory, extend the Model abstract class and
+declare the required methods. 
+
+For example:
+
+```c++
+#include "Model.hpp"
+
+namespace arm {
+namespace app {
+
+class HelloWorldModel: public Model {
+  protected:
+    /** @brief Gets the reference to op resolver interface class. */
+    const tflite::MicroOpResolver& GetOpResolver() override;
+
+    /** @brief Adds operations to the op resolver instance. */
+    bool EnlistOperations() override;
+
+    const uint8_t* ModelPointer() override;
+
+    size_t ModelSize() override;
+
+  private:
+    /* Maximum number of individual operations that can be enlisted. */
+    static constexpr int _m_maxOpCnt = 5;
+
+    /* A mutable op resolver instance. */
+    tflite::MicroMutableOpResolver<_m_maxOpCnt> _m_opResolver;
+};
+} /* namespace app */
+} /* namespace arm */
+```
+
+Create a `HelloWorld.cc` file in the `src` sub-directory and define the methods
+there. Include the `HelloWorldModel.hpp` created earlier. Note that `Model.hpp`,
+included in the header, provides access to TensorFlow Lite Micro's operation
+resolver API.
+
+Please see `use_case/img_class/src/MobileNetModel.cc` for
+code examples.\
+If you are using a TensorFlow Lite model compiled with Vela, it is important to add
+the custom Ethos-U55 operator to the operators list.
+
+The following example shows how to add the custom Ethos-U55 operator with
+the TensorFlow Lite Micro framework. We use the ARM_NPU define to exclude
+the code if the application was built without NPU support.
+
+```c++
+#include "HelloWorldModel.hpp"
+
+bool arm::app::HelloWorldModel::EnlistOperations() {
+
+#if defined(ARM_NPU)
+    if (kTfLiteOk == this->_m_opResolver.AddEthosU()) {
+        info("Added %s support to op resolver\n",
+            tflite::GetString_ETHOSU());
+    } else {
+        printf_err("Failed to add Arm NPU support to op resolver.");
+        return false;
+    }
+#endif /* ARM_NPU */
+
+    return true;
+}
+```
+
+To minimize the application memory footprint, it is advised to register only
+the operators used by the NN model.
+
+Define the `ModelPointer` and `ModelSize` methods. 
These functions are wrappers around the
functions generated in the C++ file containing the neural network model as an array.
The logic that generates this C++ array from the `.tflite` file needs to be defined in
the `usecase.cmake` file for this `HelloWorld` example.

For more details on `usecase.cmake`, see [Building custom use case](#building-custom-use-case).
For details on the code generation flow in general, see [Automatic file generation](./building.md#Automatic-file-generation).

The TensorFlow Lite model data is read during `Model::Init()` method execution; see
*application/tensorflow-lite-micro/Model.cc* for more details. The model invokes the
`ModelPointer()` function, which calls the `GetModelPointer()` function to get the
neural network model data's memory address. The `GetModelPointer()` function
is generated during the build and can be found in the
file `build/generated/hello_world/src/<model_file_name>.cc`. The generated
file is added to the compilation automatically.

Use the `${use_case}_MODEL_TFLITE_PATH` build parameter to include a custom
model in the generation/compilation process (see [Build options](./building.md#build-options)).

## Executing inference

To run an inference successfully, you must have:

- a TensorFlow Lite model file,
- an extended Model class,
- a place to add the code that invokes inference,
- a main loop function,
- and some input data.

For the hello_world example below, the input array is not populated.
However, for real-world scenarios, this data should either be read from
an on-board device or be prepared in the form of C++ sources before
compilation and baked into the application.

For example, the image classification application has extra build steps
to generate C++ sources from the provided images with the
*generate_images_code* CMake function.

> **Note:**
Check that the input data type of your NN model and the input array data type are the same.
For example, the generated C++ sources for images store image data as a uint8 array. For models that were
quantized to the int8 data type, it is important to convert the image data to int8 correctly before inference execution.
Converting an asymmetric data type to a symmetric one involves re-positioning the zero value, i.e. subtracting an
offset from the uint8 values. Please check the image classification application source for a code example
(the ConvertImgToInt8 function).

The following code adds inference invocation to the main loop function:

```c++
#include "hal.h"
#include "HelloWorldModel.hpp"

void main_loop(hal_platform& platform) {

    /* model wrapper object */
    arm::app::HelloWorldModel model;

    /* Load the model */
    if (!model.Init()) {
        printf_err("failed to initialise model\n");
        return;
    }

    TfLiteTensor* outputTensor = model.GetOutputTensor();
    TfLiteTensor* inputTensor = model.GetInputTensor();

    /* dummy input data */
    uint8_t inputData[1000];

    memcpy(inputTensor->data.data, inputData, 1000);

    /* run inference */
    model.RunInference();

    const uint32_t tensorSz = outputTensor->bytes;
    const uint8_t* outputData = tflite::GetTensorData<uint8_t>(outputTensor);
}
```

The code snippet has several important blocks:

- Creating the HelloWorldModel object and initializing it.

  ```c++
  arm::app::HelloWorldModel model;

  /* Load the model */
  if (!model.Init()) {
      printf_err("failed to initialise model\n");
      return;
  }
  ```

- Getting pointers to the allocated input and output tensors.

  ```c++
  TfLiteTensor* outputTensor = model.GetOutputTensor();
  TfLiteTensor* inputTensor = model.GetInputTensor();
  ```

- Copying input data to the input tensor. We assume the input tensor size
  to be 1000 uint8 elements.

  ```c++
  memcpy(inputTensor->data.data, inputData, 1000);
  ```

- Running inference.

  ```c++
  model.RunInference();
  ```

- Reading inference results: the data and data size from the output
  tensor.
We assume that the output layer has a uint8 data type.

  ```c++
  const uint32_t tensorSz = outputTensor->bytes;

  const uint8_t* outputData = tflite::GetTensorData<uint8_t>(outputTensor);
  ```

Adding profiling for the Ethos-U55 is easy. Include the `Profiler.hpp` header and
invoke `StartProfiling` and `StopProfiling` around the inference
execution.

```c++
Profiler profiler{&platform, "Inference"};

profiler.StartProfiling();
model.RunInference();
profiler.StopProfiling();
std::string profileResults = profiler.GetResultsAndReset();

info("%s\n", profileResults.c_str());
```

## Printing to console

The examples above already use functions that print messages to the
console. The full list of available functions:

- `printf`
- `trace` - printf wrapper for tracing messages
- `debug` - printf wrapper for debug messages
- `info` - printf wrapper for informational messages
- `warn` - printf wrapper for warning messages
- `printf_err` - printf wrapper for error messages

The `printf` wrappers can be switched off with the `LOG_LEVEL` define:

trace (0) < debug (1) < info (2) < warn (3) < error (4).

The default output level is info = level 2.

## Reading user input from console

The platform data acquisition module has a `get_input` function to read keyboard
input from the UART. It can be used as follows:

```c++
char ch_input[128];
platform.data_acq->get_input(ch_input, sizeof(ch_input));
```

The function will block until the user provides an input.

## Output to MPS3 LCD

The platform presentation module has functions to print text or an image to
the board's LCD:

- `present_data_text`
- `present_data_image`

The text presentation function has the following signature:

- `const char* str`: string to print.
- `const uint32_t str_sz`: string size.
- `const uint32_t pos_x`: x coordinate of the first letter in pixels.
- `const uint32_t pos_y`: y coordinate of the first letter in pixels.
- `const uint32_t allow_multiple_lines`: signals whether the text is
  allowed to span multiple lines on the screen, or should be truncated
  to the current line.

This function does not wrap text; if the given string cannot fit on the
screen, it will go outside the screen boundary.

Example that prints "Hello world" on the LCD:

```c++
std::string hello("Hello world");
platform.data_psn->present_data_text(hello.c_str(), hello.size(), 10, 35, 0);
```

The image presentation function has the following signature:

- `uint8_t* data`: image data pointer.
- `const uint32_t width`: image width.
- `const uint32_t height`: image height.
- `const uint32_t channels`: number of channels. Only 1 and 3 channels are currently supported.
- `const uint32_t pos_x`: x coordinate of the first pixel.
- `const uint32_t pos_y`: y coordinate of the first pixel.
- `const uint32_t downsample_factor`: the factor by which the image is to be down sampled.

For example, the following code snippet visualizes the input tensor data
for MobileNet v2 224 (down sampling it twice):

```c++
platform.data_psn->present_data_image((uint8_t *) inputTensor->data.data, 224, 224, 3, 10, 35, 2);
```

Please see the [HAL API](#hal-api) section for other data presentation
functions.

## Building custom use case

There is one last thing to do before building and running a use-case
application: create a `usecase.cmake` file in the root of your use-case.
The name of the file is not important.

> **Convention:** The build system searches for a CMake file in each use-case directory and includes it into the build
> flow. This file can be used to specify additional application-specific build options, add custom build steps or
> override standard compilation and linking flags.
> Use the `USER_OPTION` function to add an additional build option. Prefix the variable name with `${use_case}` (the
> use-case name) to avoid name collisions with other CMake variables.
> Some useful variables visible in the use-case CMake file:
>
> - `DEFAULT_MODEL_PATH` – default model path to use if the use-case specific `${use_case}_MODEL_TFLITE_PATH` is not
>   set in the build arguments.
> - `TARGET_NAME` – name of the executable.
> - `use_case` – name of the current use-case.
> - `UC_SRC` – list of use-case sources.
> - `UC_INCLUDE` – path to the use-case headers.
> - `ETHOS_U55_ENABLED` – flag indicating if the current build supports Ethos-U55.
> - `TARGET_PLATFORM` – target platform being built for.
> - `TARGET_SUBSYSTEM` – if the target platform supports multiple subsystems, the name of the subsystem.
> - All standard build options.
> - `CMAKE_CXX_FLAGS` and `CMAKE_C_FLAGS` – compilation flags.
> - `CMAKE_EXE_LINKER_FLAGS` – linker flags.

For the hello world use-case it is enough to create a
`helloworld.cmake` file and set `DEFAULT_MODEL_PATH`:

```cmake
if (ETHOS_U55_ENABLED EQUAL 1)
    set(DEFAULT_MODEL_PATH ${DEFAULT_MODEL_DIR}/helloworldmodel_uint8_vela.tflite)
else()
    set(DEFAULT_MODEL_PATH ${DEFAULT_MODEL_DIR}/helloworldmodel_uint8.tflite)
endif()
```

This can then be used when declaring the model build option, for example:

```cmake
USER_OPTION(${use_case}_MODEL_TFLITE_PATH "Neural network model in tflite format."
    ${DEFAULT_MODEL_PATH}
    FILEPATH
    )

# Generate model file
generate_tflite_code(
    MODEL_PATH ${${use_case}_MODEL_TFLITE_PATH}
    DESTINATION ${SRC_GEN_DIR}
    )
```

This ensures that the model pointed to by `${use_case}_MODEL_TFLITE_PATH` is converted to a C++ array and is picked
up by the build system. More information on auto-generation is available under the section
[Automatic file generation](./building.md#Automatic-file-generation).
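For illustration, the generated `<model_file_name>.cc` file is essentially the `.tflite` file's contents dumped as a
byte array together with accessor functions. The sketch below shows the general shape; the array contents here are
placeholders and the exact generated signatures may differ from your build:

```cpp
#include <cstddef>
#include <cstdint>

/* Placeholder bytes standing in for the real .tflite contents. A real
 * TensorFlow Lite flatbuffer carries the "TFL3" identifier at offset 4. */
static const uint8_t nn_model[] = {
    0x1c, 0x00, 0x00, 0x00, /* root table offset */
    0x54, 0x46, 0x4c, 0x33  /* "TFL3" file identifier */
};

/* Accessors used by the model wrapper class. */
const uint8_t* GetModelPointer() {
    return nn_model;
}

size_t GetModelSize() {
    return sizeof(nn_model);
}
```

The `ModelPointer()` and `ModelSize()` methods of the use-case model class then simply forward to these generated
functions.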
To build your application, follow the general instructions from
[Add custom inputs](#add-custom-inputs) and specify the name of the use-case in the
build command:

```commandline
cmake \
    -DTARGET_PLATFORM=mps3 \
    -DTARGET_SUBSYSTEM=sse-300 \
    -DUSE_CASE_BUILD=hello_world \
    -DCMAKE_TOOLCHAIN_FILE=scripts/cmake/bare-metal-toolchain.cmake ..
```

For Windows, add `-G "MinGW Makefiles"` to the CMake command.

As a result, `ethos-u-hello_world.axf` should be created; the MPS3 build
will also produce a `sectors/hello_world` directory with binaries and an
`images-hello_world.txt` file to be copied to the board's MicroSD card.

Next section of the documentation: [Testing and benchmarking](../documentation.md#Testing-and-benchmarking).
diff --git a/docs/sections/deployment.md b/docs/sections/deployment.md new file mode 100644 index 0000000..354d30b --- /dev/null +++ b/docs/sections/deployment.md @@ -0,0 +1,281 @@

# Deployment

- [Fixed Virtual Platform](#fixed-virtual-platform)
  - [Setting up the MPS3 Arm Corstone-300 FVP](#setting-up-the-mps3-arm-corstone-300-fvp)
  - [Deploying on an FVP emulating MPS3](#deploying-on-an-fvp-emulating-mps3)
- [MPS3 board](#mps3-board)
  - [Deployment on MPS3 board](#deployment-on-mps3-board)

The sample application for Arm® Ethos™-U55 can be deployed on two
target platforms, both of which implement the Arm® Corstone™-300 design (see
<https://www.arm.com/products/iot/soc/corstone-300>):

- A physical Arm MPS3 FPGA prototyping board

- An MPS3 FVP

## Fixed Virtual Platform

The FVP is available publicly from [Arm Ecosystem FVP downloads
](https://developer.arm.com/tools-and-software/open-source-software/arm-platforms-software/arm-ecosystem-fvps).
Download the correct archive from the list under `Arm Corstone-300`. We need the one that:

- Emulates the MPS3 board (not the MPS2 FPGA board)
- Contains support for Arm® Ethos™-U55

> **Note:** Currently, the FVP only has a Linux OS version.
Also, there are no FVPs available for `SSE-200`
> that satisfy the above conditions.

For the FVP, the elf or axf file can be run using the Fast Model
executable as outlined under [Starting Fast Model simulation](./setup.md/#starting-fast-model-simulation),
except that the binary pointed at here is the one just built using the steps in the previous section.

### Setting up the MPS3 Arm Corstone-300 FVP

For the Ethos-U55 sample application, please download the MPS3 version of the
Arm® Corstone™-300 model that contains Ethos-U55 and Arm® Cortex®-M55. The model is
currently only supported on Linux-based machines. To install the FVP:

- Unpack the archive.

- Run the install script in the extracted package:

  `./FVP_Corstone_SSE-300_Ethos-U55.sh`

- Follow the instructions to install the FVP to your desired location.

### Deploying on an FVP emulating MPS3

This section assumes that the FVP has been installed (see [Setting up the MPS3 Arm Corstone-300 FVP](#Setting-up-the-MPS3-Arm-Corstone-300-FVP)) to the user's home directory `~/FVP_Corstone_SSE-300_Ethos-U55`.

The installation will typically place the executable under the `~/FVP_Corstone_SSE-300_Ethos-U55/models/<OS>_<compiler-version>/`
directory. For the example below, we assume it to be `~/FVP_Corstone_SSE-300_Ethos-U55/models/Linux64_GCC-6.4`.
To run a use case on the FVP, from the [Build directory](../sections/building.md#Create-a-build-directory):

```commandline
~/FVP_Corstone_SSE-300_Ethos-U55/models/Linux64_GCC-6.4/FVP_Corstone_SSE-300_Ethos-U55 -a ./bin/ethos-u-<use_case>.axf
telnetterminal0: Listening for serial connection on port 5000
telnetterminal1: Listening for serial connection on port 5001
telnetterminal2: Listening for serial connection on port 5002
telnetterminal5: Listening for serial connection on port 5003

    Ethos-U rev 0 --- Oct 13 2020 11:27:45
    (C) COPYRIGHT 2019-2020 Arm Limited
    ALL RIGHTS RESERVED
```

This will also launch a telnet window with the sample application's standard output and error log entries containing
information about the pre-built application version, the TensorFlow Lite Micro library version used, the data type, as
well as the input and output tensor sizes of the model compiled into the executable binary.

After the application has started, it outputs a menu and waits for user input from the telnet terminal.

For example, the image classification use case can be started by:

```commandline
~/FVP_Corstone_SSE-300_Ethos-U55/models/Linux64_GCC-6.4/FVP_Corstone_SSE-300_Ethos-U55 -a ./bin/ethos-u-img_class.axf
```

The FVP supports many command line parameters:

- passed by using `-C <param>=<value>`. The most important ones are:
  - `ethosu.num_macs`: Sets the Ethos-U55 configuration for the model. Valid parameters are `32`, `64`, `256`,
    and the default one, `128`. The number signifies the 8x8 MACs performed per cycle count available on the hardware.
  - `cpu0.CFGITCMSZ`: ITCM size for the Cortex-M CPU. The size of ITCM is pow(2, CFGITCMSZ - 1) KB.
  - `cpu0.CFGDTCMSZ`: DTCM size for the Cortex-M CPU. The size of DTCM is pow(2, CFGDTCMSZ - 1) KB.
  - `mps3_board.telnetterminal0.start_telnet`: Starts the telnet session if nothing is connected.
  - `mps3_board.uart0.out_file`: Sets the output file to hold data written by the UART
    (use '-' to send all output to stdout; empty by default).
  - `mps3_board.uart0.shutdown_on_eot`: Set to shut down the simulation when an EOT (ASCII 4) character is transmitted.
  - `mps3_board.visualisation.disable-visualisation`: Enables or disables visualisation (disabled by default).

  To start the model in `128` MACs mode for Ethos-U55:

  ```commandline
  ~/FVP_Corstone_SSE-300_Ethos-U55/models/Linux64_GCC-6.4/FVP_Corstone_SSE-300_Ethos-U55 -a ./bin/ethos-u-img_class.axf -C ethosu.num_macs=128
  ```

- `-l`: shows the full list of supported parameters

  ```commandline
  ~/FVP_Corstone_SSE-300_Ethos-U55/models/Linux64_GCC-6.4/FVP_Corstone_SSE-300_Ethos-U55 -l
  ```

- `--stat`: prints some run statistics on simulation exit

  ```commandline
  ~/FVP_Corstone_SSE-300_Ethos-U55/models/Linux64_GCC-6.4/FVP_Corstone_SSE-300_Ethos-U55 --stat
  ```

- `--timelimit`: sets the number of wall clock seconds for the simulator to run, excluding startup and shutdown.

## MPS3 board

> **Note:** Before proceeding, make sure you have the MPS3 board powered on,
and a USB A to B cable connected between your machine and the MPS3.
The connector on the MPS3 is marked as "Debug USB".

![MPS3](../media/mps3.png)

1. MPS3 board top view.

Once the board has booted, the micro SD card will enumerate as a mass
storage device. On most systems this will be mounted automatically, but
you might need to mount it manually.

Also, there should be four serial-over-USB ports available for use via
this connection. On Linux-based machines, these would typically be
*/dev/ttyUSB\<n\>* to */dev/ttyUSB\<n+3\>*.

The default configuration for all of them is 115200, 8/N/1 (115200 baud,
8 bits, no parity and 1 stop bit) with no flow control.

> **Note:** For Windows machines, additional FTDI drivers might need to be installed
for these serial ports to be available.
For more information on getting started with an MPS3 board, please refer to
<https://developer.arm.com/-/media/Arm%20Developer%20Community/PDF/MPS3GettingStarted.pdf>

### Deployment on MPS3 board

> **NOTE**: These instructions are valid only if the evaluation is being
  done using the MPS3 FPGA platform using either `SSE-200` or `SSE-300`.

To run the application on the MPS3 platform, first make sure
that the platform has been set up using the correct configuration.
For details on platform set up, please see the relevant documentation. For `Arm Corstone-300`, this is available
[here](https://developer.arm.com/-/media/Arm%20Developer%20Community/PDF/DAI0547B_SSE300_PLUS_U55_FPGA_for_mps3.pdf?revision=d088d931-03c7-40e4-9045-31ed8c54a26f&la=en&hash=F0C7837C8ACEBC3A0CF02D871B3A6FF93E09C6B8).

For the MPS3 board, instead of loading the axf file directly, the executable blobs
generated under the *sectors/\<use_case\>* subdirectory need to be
copied over to the MPS3 board's micro SD card. Also, every use case build
generates a corresponding images.txt file which is used by the MPS3 to
understand which memory regions the blobs are to be loaded into.

Once the USB A <--> B cable between the MPS3 and the development machine
is connected and the MPS3 board is powered on, the board should enumerate
as a mass storage device over this USB connection.
There might also be two devices, depending on the version of the board
you are using. The device named `V2M-MPS3` or `V2MMPS3` is the `SD card`.

If the axf/elf file is within 1 MiB, it can be flashed into the FPGA
memory directly without having to be broken down into separate load
region specific blobs. However, with neural network models exceeding
this size, it becomes necessary to follow this approach.

1.
Copy the executable blobs to the micro SD card's `SOFTWARE` directory.
For example, the image classification use case will produce:

    ```tree
    ./bin/sectors/
        └── img_class
            ├── dram.bin
            └── itcm.bin
    ```

    For example, if the micro SD card is mounted at
    /media/user/V2M-MPS3/:

    ```commandline
    cp -av ./bin/sectors/img_class/* /media/user/V2M-MPS3/SOFTWARE/
    ```

2. The generated `images-<use_case>.txt` file needs to be copied
    over to the MPS3. The exact location for the destination will depend
    on the MPS3 board's version and the application note for the bit
    file in use. For example, for MPS3 board hardware revision C, using an
    application note directory named "ETHOSU", to replace the images.txt
    file:

    ```commandline
    cp ./bin/images-img_class.txt /media/user/V2M-MPS3/MB/HBI0309C/ETHOSU/images.txt
    ```

3. Open the first serial port available from the MPS3, for example
    "/dev/ttyUSB0". This can typically be done using the minicom, screen or
    PuTTY applications. Make sure the flow control setting is switched
    off.

    ```commandline
    minicom -D /dev/ttyUSB0
    ```

    ```log
    Welcome to minicom 2.7.1
    OPTIONS: I18n
    Compiled on Aug 13 2017, 15:25:34.
    Port /dev/ttyUSB0, 16:05:34
    Press CTRL-A Z for help on special keys
    Cmd>
    ```

4. In another terminal, open the second serial port, for example
    "/dev/ttyUSB1":

    ```commandline
    minicom -D /dev/ttyUSB1
    ```

5. On the first serial port, issue a "reboot" command and press the
    return key:

    ```commandline
    $ Cmd> reboot
    ```

    ```log
    Rebooting...Disabling debug USB..Board rebooting...

    ARM V2M-MPS3 Firmware v1.3.2
    Build Date: Apr 20 2018

    Powering up system...
    Switching on main power...
    Configuring motherboard (rev C, var A)...
    ```

    This will go on to reboot the board and prime the application to run by
    flashing the binaries into their respective FPGA memory locations. For example:

    ```log
    Reading images file \MB\HBI0309C\ETHOSU\images.txt
    Writing File \SOFTWARE\itcm.bin to Address 0x00000000

    ............

    File \SOFTWARE\itcm.bin written to memory address 0x00000000
    Image loaded from \SOFTWARE\itcm.bin
    Writing File \SOFTWARE\dram.bin to Address 0x08000000

    ..........................................................................


    File \SOFTWARE\dram.bin written to memory address 0x08000000
    Image loaded from \SOFTWARE\dram.bin
    ```

6. When the reboot from the previous step has completed, issue a reset
    command on the command prompt:

    ```commandline
    $ Cmd> reset
    ```

    This will trigger the application to start, and the output should be visible on the second serial connection.

7. On the second serial port, output similar to section 2.2 should be visible:

    ```log
    [INFO] Setting up system tick IRQ (for NPU)
    [INFO] V2M-MPS3 revision C
    [INFO] Application Note AN540, Revision B
    [INFO] FPGA build 1
    [INFO] Core clock has been set to: 32000000 Hz
    [INFO] CPU ID: 0x410fd220
    [INFO] CPU: Cortex-M55 r0p0
    ...
    ```

Next section of the main documentation, [Running code samples applications](../documentation.md#Running-code-samples-applications).
diff --git a/docs/sections/run.md b/docs/sections/run.md new file mode 100644 index 0000000..90ee7c8 --- /dev/null +++ b/docs/sections/run.md @@ -0,0 +1,42 @@

# Running Ethos-U55 Code Samples

- [Starting Fast Model simulation](#starting-fast-model-simulation)

This section covers the process for getting started with pre-built binaries for the Code Samples.

## Starting Fast Model simulation

Once the application binaries have been built, and assuming the install location of the FVP
was set to `~/FVP_install_location`, the simulation can be started by:

```commandline
~/FVP_install_location/models/Linux64_GCC-6.4/FVP_Corstone_SSE-300_Ethos-U55 \
    ./bin/mps3-sse-300/ethos-u-<use_case>.axf
```

This will start the Fast Model simulation for the chosen use-case.

A log output should appear on the terminal:

```log
telnetterminal0: Listening for serial connection on port 5000
telnetterminal1: Listening for serial connection on port 5001
telnetterminal2: Listening for serial connection on port 5002
telnetterminal5: Listening for serial connection on port 5003
```

This will also launch a telnet window with the sample application's
standard output and error log entries containing information about the
pre-built application version, the TensorFlow Lite Micro library version
used, the data type, as well as the input and output tensor sizes of the
model compiled into the executable binary.

![FVP](../media/fvp.png)

![FVP Terminal](../media/fvpterminal.png)

> **Note:**
For details on a specific use-case, follow the instructions in the corresponding documentation.

Next section of the documentation: [Implementing custom ML application](../documentation.md#Implementing-custom-ML-application).
diff --git a/docs/sections/testing_benchmarking.md b/docs/sections/testing_benchmarking.md new file mode 100644 index 0000000..43bb7f4 --- /dev/null +++ b/docs/sections/testing_benchmarking.md @@ -0,0 +1,87 @@

# Testing and benchmarking

- [Testing](#testing)
- [Benchmarking](#benchmarking)

## Testing

The `tests` folder has the following structure:

```tree
.
├── common
│   └── ...
├── use_case
│   ├── <usecase1>
│   │   └── ...
│   ├── <usecase2>
│   │   └── ...
└── utils
    └── ...
```

Where:

- `common`: contains tests for generic and common application functions.
- `use_case`: contains all the use case specific tests in the respective folders.
- `utils`: contains utilities sources used only within the tests.

When [configuring](./building.md#configuring-the-build-native-unit-test) and
[building](./building.md#Building-the-configured-project) for the `native` target platform, the results of the build
will be placed under the `build/bin/` folder, for example:

```tree
.
├── dev_ethosu_eval-<usecase1>-tests
├── dev_ethosu_eval-<usecase2>-tests
├── ethos-u-<usecase1>
└── ethos-u-<usecase2>
```

To execute the unit-tests for a specific use-case, in addition to the common tests:

```commandline
dev_ethosu_eval-<use_case>-tests
```

```log
[INFO] native platform initialised
[INFO] ARM Ethos-U55 Evaluation application for MPS3 FPGA Prototyping Board and FastModel

...
===============================================================================
   All tests passed (37 assertions in 7 test cases)
```

The test output may contain `[ERROR]` messages. That is alright: they come from tests for negative scenarios.

## Benchmarking

Profiling is enabled by default when configuring the project. This will enable displaying:

- the active and idle NPU cycle counts when Arm® Ethos™-U55 is enabled (see `-DETHOS_U55_ENABLED` in
  [Build options](./building.md#build-options)).
- CPU cycle counts and/or time elapsed, in milliseconds, for inferences performed if CPU profiling is enabled
  (see `-DCPU_PROFILE_ENABLED` in [Build options](./building.md#build-options)). This should be done only
  when running on a physical FPGA board, as the FVP does not contain a cycle-approximate or cycle-accurate Cortex-M model.

For example:

- On the FVP:

```log
    Active NPU cycles: 5475412
    Idle NPU cycles:   702
```

- For the MPS3 platform, the time duration in milliseconds is also reported when `-DCPU_PROFILE_ENABLED=1` is added to
  the CMake configuration command:

```log
    Active NPU cycles: 5629033
    Idle NPU cycles:   1005276
    Active CPU cycles: 993553 (approx)
    Time in ms:        210
```

Next section of the main documentation: [Troubleshooting](../documentation.md#Troubleshooting).
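As a quick sanity check when comparing benchmark runs, the two NPU counters above can be combined into a single
utilisation figure (active cycles as a fraction of total NPU cycles). A minimal sketch; the helper name is
illustrative and not part of the code samples:

```cpp
#include <cstdint>

/* NPU utilisation: fraction of total NPU cycles spent active. */
double NpuUtilisation(uint64_t activeCycles, uint64_t idleCycles) {
    return static_cast<double>(activeCycles) /
           static_cast<double>(activeCycles + idleCycles);
}
```

Applied to the example logs above, the FVP numbers give roughly 99.99% utilisation, while the MPS3 numbers give
roughly 84.8%.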
diff --git a/docs/sections/troubleshooting.md b/docs/sections/troubleshooting.md new file mode 100644 index 0000000..40b975a --- /dev/null +++ b/docs/sections/troubleshooting.md @@ -0,0 +1,27 @@

# Troubleshooting

- [Inference results are incorrect for my custom files](#inference-results-are-incorrect-for-my-custom-files)
- [The application does not work with my custom model](#the-application-does-not-work-with-my-custom-model)

## Inference results are incorrect for my custom files

Ensure that the files you are using match the requirements of the model
you are using and that the CMake parameters are set accordingly. More
information on these CMake parameters is detailed in their separate
sections. Note that preprocessing of the files can also affect the
inference result, such as the rescaling and padding operations performed
for image classification.

## The application does not work with my custom model

Ensure that your model is in a fully quantized `.tflite` file format,
either uint8 or int8, and has successfully been run through the Vela
compiler.

Also check that the CMake parameters match your new model's input requirements.

> **Note:** The Vela tool is not available within this software project.
It is a Python tool available from <https://pypi.org/project/ethos-u-vela/>.
The source code is hosted on <https://git.mlplatform.org/ml/ethos-u/ethos-u-vela.git/>.

Next section of the documentation: [Contribution guidelines](../documentation.md#Contribution-guidelines).
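Related to the input requirements above: if your model is quantized to int8 but your input data is uint8, remember to
re-centre the values by subtracting the zero-point offset before running inference, as done by the image
classification use case's ConvertImgToInt8 helper. A minimal sketch, assuming the common offset of 128 (check your
model's quantization parameters; the function name here is illustrative):

```cpp
#include <cstddef>
#include <cstdint>

/* Convert asymmetric uint8 data to int8 by subtracting an assumed
 * zero-point offset of 128, so that uint8 128 maps to int8 0. */
void ConvertUint8ToInt8(const uint8_t* src, int8_t* dst, size_t size) {
    for (size_t i = 0; i < size; ++i) {
        dst[i] = static_cast<int8_t>(static_cast<int32_t>(src[i]) - 128);
    }
}
```

With this mapping, uint8 values 0, 128 and 255 become int8 values -128, 0 and 127 respectively.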