From 005534664e192cf909a11435c4bc4696b1f4c51f Mon Sep 17 00:00:00 2001
From: Richard Burton
Date: Wed, 10 Nov 2021 16:27:14 +0000
Subject: MLECO-2354 MLECO-2355 MLECO-2356: Moving noise reduction to public repository

* Use RNNoise model from PMZ
* Add Noise reduction use-case

Signed-off-by: Richard Burton
Change-Id: Ia8cc7ef102e22a5ff8bfbd3833594a4905a66057
---
 docs/documentation.md                 |  11 +-
 docs/quick_start.md                   |  52 ++++
 docs/sections/arm_virtual_hardware.md |   2 +-
 docs/use_cases/noise_reduction.md     | 529 ++++++++++++++++++++++++++++++++++
 4 files changed, 588 insertions(+), 6 deletions(-)
 create mode 100644 docs/use_cases/noise_reduction.md

diff --git a/docs/documentation.md b/docs/documentation.md
index a186fbb..0642075 100644
--- a/docs/documentation.md
+++ b/docs/documentation.md
@@ -206,11 +206,12 @@ What these folders contain:
 The models used in the use-cases implemented in this project can be downloaded from: [Arm ML-Zoo](https://github.com/ARM-software/ML-zoo).
 
-- [Mobilenet V2](https://github.com/ARM-software/ML-zoo/tree/e0aa361b03c738047b9147d1a50e3f2dcb13dbcb/models/image_classification/mobilenet_v2_1.0_224/tflite_uint8).
-- [DS-CNN](https://github.com/ARM-software/ML-zoo/tree/68b5fbc77ed28e67b2efc915997ea4477c1d9d5b//models/keyword_spotting/ds_cnn_large/tflite_clustered_int8).
-- [Wav2Letter](https://github.com/ARM-software/ML-zoo/tree/1a92aa08c0de49a7304e0a7f3f59df6f4fd33ac8/models/speech_recognition/wav2letter/tflite_pruned_int8).
-- [Anomaly Detection](https://github.com/ARM-software/ML-zoo/tree/7c32b097f7d94aae2cd0b98a8ed5a3ba81e66b18/models/anomaly_detection/micronet_medium/tflite_int8).
-- [Visual Wake Word](https://github.com/ARM-software/ML-zoo/raw/7dd3b16bb84007daf88be8648983c07f3eb21140/models/visual_wake_words/micronet_vww4/tflite_int8/vww4_128_128_INT8.tflite).
+- [Mobilenet V2](https://github.com/ARM-software/ML-zoo/tree/e0aa361b03c738047b9147d1a50e3f2dcb13dbcb/models/image_classification/mobilenet_v2_1.0_224/tflite_int8)
+- [DS-CNN](https://github.com/ARM-software/ML-zoo/tree/68b5fbc77ed28e67b2efc915997ea4477c1d9d5b//models/keyword_spotting/ds_cnn_large/tflite_clustered_int8)
+- [Wav2Letter](https://github.com/ARM-software/ML-zoo/tree/1a92aa08c0de49a7304e0a7f3f59df6f4fd33ac8/models/speech_recognition/wav2letter/tflite_pruned_int8)
+- [MicroNet for Anomaly Detection](https://github.com/ARM-software/ML-zoo/tree/7c32b097f7d94aae2cd0b98a8ed5a3ba81e66b18/models/anomaly_detection/micronet_medium/tflite_int8)
+- [MicroNet for Visual Wake Word](https://github.com/ARM-software/ML-zoo/raw/7dd3b16bb84007daf88be8648983c07f3eb21140/models/visual_wake_words/micronet_vww4/tflite_int8/vww4_128_128_INT8.tflite)
+- [RNNoise](https://github.com/ARM-software/ML-zoo/raw/a061600058097a2785d6f1f7785e5a2d2a142955/models/noise_suppression/RNNoise/tflite_int8/rnnoise_INT8.tflite)
 
 When using the *Ethos-U* NPU backend, the Vela compiler optimizes the NN model. However, if the model is not
 optimized and it is supported by TensorFlow Lite Micro, then it falls back on the CPU for execution.
diff --git a/docs/quick_start.md b/docs/quick_start.md
index 3488447..7613912 100644
--- a/docs/quick_start.md
+++ b/docs/quick_start.md
@@ -102,6 +102,26 @@ curl -L https://github.com/ARM-software/ML-zoo/raw/68b5fbc77ed28e67b2efc915997ea
     --output ./resources_downloaded/kws_asr/kws/ifm0.npy
 curl -L https://github.com/ARM-software/ML-zoo/raw/68b5fbc77ed28e67b2efc915997ea4477c1d9d5b/models/keyword_spotting/ds_cnn_large/tflite_clustered_int8/testing_output/Identity/0.npy \
     --output ./resources_downloaded/kws_asr/kws/ofm0.npy
+curl -L https://github.com/ARM-software/ML-zoo/raw/a061600058097a2785d6f1f7785e5a2d2a142955/models/noise_suppression/RNNoise/tflite_int8/rnnoise_INT8.tflite \
+    --output ./resources_downloaded/noise_reduction/rnnoise_INT8.tflite
+curl -L https://github.com/ARM-software/ML-zoo/raw/a061600058097a2785d6f1f7785e5a2d2a142955/models/noise_suppression/RNNoise/tflite_int8/testing_input/main_input_int8/0.npy \
+    --output ./resources_downloaded/noise_reduction/ifm0.npy
+curl -L https://github.com/ARM-software/ML-zoo/raw/a061600058097a2785d6f1f7785e5a2d2a142955/models/noise_suppression/RNNoise/tflite_int8/testing_input/vad_gru_prev_state_int8/0.npy \
+    --output ./resources_downloaded/noise_reduction/ifm1.npy
+curl -L https://github.com/ARM-software/ML-zoo/raw/a061600058097a2785d6f1f7785e5a2d2a142955/models/noise_suppression/RNNoise/tflite_int8/testing_input/noise_gru_prev_state_int8/0.npy \
+    --output ./resources_downloaded/noise_reduction/ifm2.npy
+curl -L https://github.com/ARM-software/ML-zoo/raw/a061600058097a2785d6f1f7785e5a2d2a142955/models/noise_suppression/RNNoise/tflite_int8/testing_input/denoise_gru_prev_state_int8/0.npy \
+    --output ./resources_downloaded/noise_reduction/ifm3.npy
+curl -L https://github.com/ARM-software/ML-zoo/raw/a061600058097a2785d6f1f7785e5a2d2a142955/models/noise_suppression/RNNoise/tflite_int8/testing_output/Identity_int8/0.npy \
+    --output ./resources_downloaded/noise_reduction/ofm0.npy
+curl -L https://github.com/ARM-software/ML-zoo/raw/a061600058097a2785d6f1f7785e5a2d2a142955/models/noise_suppression/RNNoise/tflite_int8/testing_output/Identity_1_int8/0.npy \
+    --output ./resources_downloaded/noise_reduction/ofm1.npy
+curl -L https://github.com/ARM-software/ML-zoo/raw/a061600058097a2785d6f1f7785e5a2d2a142955/models/noise_suppression/RNNoise/tflite_int8/testing_output/Identity_2_int8/0.npy \
+    --output ./resources_downloaded/noise_reduction/ofm2.npy
+curl -L https://github.com/ARM-software/ML-zoo/raw/a061600058097a2785d6f1f7785e5a2d2a142955/models/noise_suppression/RNNoise/tflite_int8/testing_output/Identity_3_int8/0.npy \
+    --output ./resources_downloaded/noise_reduction/ofm3.npy
+curl -L https://github.com/ARM-software/ML-zoo/raw/a061600058097a2785d6f1f7785e5a2d2a142955/models/noise_suppression/RNNoise/tflite_int8/testing_output/Identity_4_int8/0.npy \
+    --output ./resources_downloaded/noise_reduction/ofm4.npy
 curl -L https://github.com/ARM-software/ML-zoo/raw/68b5fbc77ed28e67b2efc915997ea4477c1d9d5b/models/keyword_spotting/dnn_small/tflite_int8/dnn_s_quantized.tflite \
     --output ./resources_downloaded/inference_runner/dnn_s_quantized.tflite
@@ -217,6 +237,38 @@ mv resources_downloaded/ad/ad_medium_int8_vela.tflite resources_downloaded/ad/ad
     --output-dir=resources_downloaded/ad
 mv resources_downloaded/ad/ad_medium_int8_vela.tflite resources_downloaded/ad/ad_medium_int8_vela_Y256.tflite
+. resources_downloaded/env/bin/activate && vela resources_downloaded/vww/vww4_128_128_INT8.tflite \
+    --accelerator-config=ethos-u55-128 \
+    --optimise Performance --config scripts/vela/default_vela.ini \
+    --memory-mode=Shared_Sram \
+    --system-config=Ethos_U55_High_End_Embedded \
+    --output-dir=resources_downloaded/vww
+mv resources_downloaded/vww/vww4_128_128_INT8_vela.tflite resources_downloaded/vww/vww4_128_128_INT8_vela_H128.tflite
+
+. resources_downloaded/env/bin/activate && vela resources_downloaded/vww/vww4_128_128_INT8.tflite \
+    --accelerator-config=ethos-u65-256 \
+    --optimise Performance --config scripts/vela/default_vela.ini \
+    --memory-mode=Dedicated_Sram \
+    --system-config=Ethos_U65_High_End \
+    --output-dir=resources_downloaded/vww
+mv resources_downloaded/vww/vww4_128_128_INT8_vela.tflite resources_downloaded/vww/vww4_128_128_INT8_vela_Y256.tflite
+
+. resources_downloaded/env/bin/activate && vela resources_downloaded/noise_reduction/rnnoise_INT8.tflite \
+    --accelerator-config=ethos-u55-128 \
+    --optimise Performance --config scripts/vela/default_vela.ini \
+    --memory-mode=Shared_Sram \
+    --system-config=Ethos_U55_High_End_Embedded \
+    --output-dir=resources_downloaded/noise_reduction
+mv resources_downloaded/noise_reduction/rnnoise_INT8_vela.tflite resources_downloaded/noise_reduction/rnnoise_INT8_vela_H128.tflite
+
+. resources_downloaded/env/bin/activate && vela resources_downloaded/noise_reduction/rnnoise_INT8.tflite \
+    --accelerator-config=ethos-u65-256 \
+    --optimise Performance --config scripts/vela/default_vela.ini \
+    --memory-mode=Dedicated_Sram \
+    --system-config=Ethos_U65_High_End \
+    --output-dir=resources_downloaded/noise_reduction
+mv resources_downloaded/noise_reduction/rnnoise_INT8_vela.tflite resources_downloaded/noise_reduction/rnnoise_INT8_vela_Y256.tflite
+
 mkdir cmake-build-mps3-sse-300-gnu-release && cd cmake-build-mps3-sse-300-gnu-release
 
 cmake .. \
diff --git a/docs/sections/arm_virtual_hardware.md b/docs/sections/arm_virtual_hardware.md
index 2f05525..ca60a28 100644
--- a/docs/sections/arm_virtual_hardware.md
+++ b/docs/sections/arm_virtual_hardware.md
@@ -23,5 +23,5 @@ Note that you can register to receive free AWS credits to use Arm Virtual Hardwa
 You can find more information about Arm Virtual Hardware [here](https://arm-software.github.io/VHT/main/overview/html/index.html).
 
-Once you have access to the AWS instance, we recommend starting from the [quick start guide](../quick_start.md) in order to get familiar
+Once you have access to the AWS instance, we recommend starting from the [quick start guide](../quick_start.md#Quick-start-example-ML-application) in order to get familiar
 with the ml-embedded-evaluation-kit. Note that on the AWS instance, the FVP is available under `/opt/FVP_Corstone_SSE-300`.
diff --git a/docs/use_cases/noise_reduction.md b/docs/use_cases/noise_reduction.md
new file mode 100644
index 0000000..e6df89c
--- /dev/null
+++ b/docs/use_cases/noise_reduction.md
@@ -0,0 +1,529 @@
+# Noise Reduction Code Sample
+
+- [Noise Reduction Code Sample](#noise-reduction-code-sample)
+  - [Introduction](#introduction)
+  - [How the default neural network model works](#how-the-default-neural-network-model-works)
+  - [Post-processing](#post_processing)
+    - [Dumping of memory contents from the Fixed Virtual Platform](#dumping-of-memory-contents-from-the-fixed-virtual-platform)
+    - [Dumping post-processed results for all inferences](#dumping-post_processed-results-for-all-inferences)
+  - [Prerequisites](#prerequisites)
+  - [Building the code sample application from sources](#building-the-code-sample-application-from-sources)
+    - [Build options](#build-options)
+    - [Build process](#build-process)
+    - [Add custom input](#add-custom-input)
+    - [Add custom model](#add-custom-model)
+  - [Setting up and running Ethos-U NPU code sample](#setting-up-and-running-ethos_u-npu-code-sample)
+    - [Setting up the Ethos-U NPU Fast Model](#setting-up-the-ethos_u-npu-fast-model)
+    - [Starting Fast Model simulation](#starting-fast-model-simulation)
+    - [Running Noise Reduction](#running-noise-reduction)
+
+## Introduction
+
+This document describes the process of setting up and running the Arm® Ethos™-U NPU Noise Reduction
+example.
+
+Use case code is stored in the following directory: [source/use_case/noise_reduction](../../source/use_case/noise_reduction).
+
+## How the default neural network model works
+
+Instead of tackling the problem directly as "noisy audio in, clean audio out", a simpler
+formulation is used. The audio is split into frequency bands (22 in the original paper
+[RNNoise: Learning Noise Suppression](https://jmvalin.ca/demo/rnnoise/)). The band layout is based
+on a scale like the "Mel scale" or "Bark scale", and the energy of each band is calculated. With
+this type of scale, the bands are divided up according to what is important to the human ear.
+
+When we have a noisy audio clip, the model takes the energy levels of these different bands as
+input. The model then tries to predict a value (called a gain) to apply to each frequency band. It
+is expected that applying this gain to each band brings the audio back to what a "clean" audio
+sample would have been like. It is like a 22-band equalizer, where we quickly adjust the level of
+each band so that the noise is removed while the signal, or speech, still passes through.
+
+In addition to the 22 band values calculated, the input features also include:
+
+- First and second derivatives of the first 6 coefficients,
+- The pitch period (1/frequency),
+- The pitch gain for six bands,
+- A value used to detect if speech is occurring.
+
+This provides 42 feature inputs, `22 + 6 + 6 + 1 + 6 + 1 = 42`, and the model produces `22` (gain
+values) outputs.
+
+> **Note:** The model also has a second output that predicts if speech is occurring in the given
+> sample.
+
+The pre-processing works in a windowed fashion, on 20ms of the audio clip at a time, with a stride
+of 10ms. So, for example, one second of audio gives us `1000ms/10ms = 100` windows of features and,
+therefore, an input shape of `100x42` to the model. The output shape of the model is then `100x22`,
+representing the gain values to apply to each of the 100 windows.
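+
+To make these shapes concrete, the following short Python sketch (an illustration only, not part of
+the use case code) derives the model input and output shapes from the windowing scheme described
+above. The 48 kHz sample rate is an assumption taken from the default `noise_reduction_AUDIO_RATE`
+build option described later in this document.
+
+```python
+# Illustrative only: derive the RNNoise-style model I/O shapes from the
+# windowing parameters described above (20 ms windows with a 10 ms stride).
+SAMPLE_RATE_HZ = 48_000                  # assumption: default noise_reduction_AUDIO_RATE
+WINDOW_MS, STRIDE_MS = 20, 10            # windowing scheme described above
+NUM_FEATURES = 22 + 6 + 6 + 1 + 6 + 1    # band energies + derivatives + pitch info + VAD = 42
+NUM_GAINS = 22                           # one predicted gain per frequency band
+
+samples_per_window = SAMPLE_RATE_HZ * WINDOW_MS // 1000   # 960 samples per window
+samples_per_stride = SAMPLE_RATE_HZ * STRIDE_MS // 1000   # 480 samples per stride
+
+def model_io_shapes(clip_ms: int) -> tuple:
+    """Return (input_shape, output_shape) for a clip of clip_ms milliseconds."""
+    num_windows = clip_ms // STRIDE_MS   # e.g. 1000 ms / 10 ms = 100 windows
+    return (num_windows, NUM_FEATURES), (num_windows, NUM_GAINS)
+
+in_shape, out_shape = model_io_shapes(1000)
+print(in_shape, out_shape)               # (100, 42) (100, 22): 22 gain values per window
+```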
+
+These output gain values can then be applied to each corresponding window of the noisy audio clip,
+producing a cleaner output.
+
+For more information, please refer to the original paper:
+[A Hybrid DSP/Deep Learning Approach to Real-Time Full-Band Speech Enhancement](https://arxiv.org/pdf/1709.08243.pdf)
+
+## Post-processing
+
+After each inference, the output of the model is passed to post-processing code, which uses the
+gain values the model produced to generate audio with the noise removed from it.
+
+To verify the outputs of the model after post-processing, you will have to manually use an
+[offline script](../../scripts/py/rnnoise_dump_extractor.py) to convert the post-processed outputs
+into a WAV file. This offline script takes a dump file as the input and saves the denoised WAV
+file to disk. The following is an example of how to call the script from the command line after
+running the use case and
+[selecting to dump memory contents](#dumping-post_processed-results-for-all-inferences):
+
+```commandline
+python scripts/py/rnnoise_dump_extractor.py --dump_file <dump_file_path> --output_dir <output_directory>
+```
+
+The application for this use case has been written to dump the post-processed output to the
+address pointed to by the CMake parameter `noise_reduction_MEM_DUMP_BASE_ADDR`. The default value
+is set to `0x80000000`.
+
+### Dumping of memory contents from the Fixed Virtual Platform
+
+The Fixed Virtual Platform supports dumping of memory contents to a file. This can be done by
+specifying command-line arguments when starting the FVP executable. For example, the following
+command:
+
+```commandline
+FVP_Corstone_SSE-300_Ethos-U55 -a ./bin/ethos-u-noise_reduction.axf \
+    --dump cpu0=output.bin@Memory:0x80000000,0x100000
+```
+
+dumps 1 MiB worth of data from address `0x80000000` to the file `output.bin`.
+
+### Dumping post-processed results for all inferences
+
+The Noise Reduction application uses the memory address specified by
+`noise_reduction_MEM_DUMP_BASE_ADDR` as a buffer to store post-processed results from all
+inferences. The maximum size of this buffer is set by the parameter
+`noise_reduction_MEM_DUMP_LEN`, which defaults to 1 MiB.
+
+Logging information is generated for every inference run performed. Each line corresponds to the
+post-processed result of that inference being written to a certain location in memory.
+
+For example:
+
+```log
+INFO - Audio Clip dump header info (20 bytes) written to 0x80000000
+INFO - Inference 1/136
+INFO - Copied 960 bytes to 0x80000014
+...
+INFO - Inference 136/136
+INFO - Copied 960 bytes to 0x8001fa54
+```
+
+In the preceding output, the dump starts at the default base address of `0x80000000`, where some
+header information is written. Then, after the first inference, 960 bytes (480 INT16 values) are
+written to the first address after the dumped header, `0x80000014`. Each subsequent inference then
+writes another 960 bytes to the next address, and so on, until all inferences are complete.
+
+When consolidating all inference outputs for an entire audio clip, the application output should
+report:
+
+```log
+INFO - Output memory dump of 130580 bytes written at address 0x80000000
+```
+
+The application output log states that there are 130580 bytes worth of valid data ready to be read
+from `0x80000000`. If the FVP was started with the `--dump` option, then the output file is
+created when the FVP instance exits.
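+
+If you prefer to inspect a dump manually, the layout described above (a 20-byte header followed by
+960 bytes of INT16 audio per inference) is straightforward to parse. The following Python sketch
+is an illustration only, not a replacement for `rnnoise_dump_extractor.py`: the valid data length
+and the 48 kHz sample rate are assumptions taken from the example log above and from the default
+`noise_reduction_AUDIO_RATE` build option.
+
+```python
+# Illustrative sketch: convert an FVP memory dump into a mono 16-bit WAV file.
+# The supported tool for this is scripts/py/rnnoise_dump_extractor.py.
+import wave
+
+DUMP_HEADER_BYTES = 20     # "Audio Clip dump header info (20 bytes)" in the log
+VALID_BYTES = 130580       # "Output memory dump of 130580 bytes" in the log
+SAMPLE_RATE_HZ = 48_000    # assumption: default noise_reduction_AUDIO_RATE
+
+with open("output.bin", "rb") as f:   # file produced by the FVP --dump option
+    data = f.read(VALID_BYTES)        # ignore any padding beyond the valid length
+
+pcm = data[DUMP_HEADER_BYTES:]        # skip the header; the rest is raw INT16 samples
+
+with wave.open("denoised.wav", "wb") as wav:
+    wav.setnchannels(1)               # audio is converted to mono at build time
+    wav.setsampwidth(2)               # 16-bit samples
+    wav.setframerate(SAMPLE_RATE_HZ)
+    wav.writeframes(pcm)
+```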
+
+## Prerequisites
+
+See [Prerequisites](../documentation.md#prerequisites).
+
+## Building the code sample application from sources
+
+### Build options
+
+In addition to the build options already specified in the main documentation, the noise reduction
+use case adds the following:
+
+- `noise_reduction_MODEL_TFLITE_PATH`: The path to the NN model file in *TFLite* format. The model
+  is processed and included in the application axf file. The default value points to one of the
+  delivered set of models. Note that the parameter `ETHOS_U_NPU_ENABLED` must be aligned with the
+  chosen model. Therefore:
+  - if `ETHOS_U_NPU_ENABLED` is set to `On` or `1`, we assume that the NN model is optimized. The
+    model naturally falls back to the Arm® Cortex®-M CPU if an unoptimized model is supplied.
+  - if `ETHOS_U_NPU_ENABLED` is set to `Off` or `0`, then we assume that the NN model is
+    unoptimized. In this case, supplying an optimized model results in a runtime error.
+
+- `noise_reduction_FILE_PATH`: The path to the directory containing WAV files, or a path to a
+  single WAV file, to be used in the application. The default value points to the
+  `resources/noise_reduction/samples` folder, which contains the delivered set of audio clips.
+
+- `noise_reduction_AUDIO_RATE`: The input data sampling rate. Each audio file from
+  `noise_reduction_FILE_PATH` is preprocessed during the build to match the NN model input
+  requirements. The default value is `48000`.
+
+- `noise_reduction_AUDIO_MONO`: If set to `ON`, then the audio data is converted to mono. The
+  default value is `ON`.
+
+- `noise_reduction_AUDIO_OFFSET`: The offset, in seconds, from which to start loading the audio
+  data. The default value is `0`.
+
+- `noise_reduction_AUDIO_DURATION`: The length, in seconds, of the audio data to be used in the
+  application. The default is `0`, meaning that the whole audio file is used.
+
+- `noise_reduction_AUDIO_MIN_SAMPLES`: The minimum number of samples required by the network
+  model. If the audio clip is shorter than this number, then it is padded with zeros. The default
+  value is `480`.
+
+- `noise_reduction_ACTIVATION_BUF_SZ`: The intermediate, or activation, buffer size reserved for
+  the neural network model. By default, it is set to 2 MiB.
+
+To build **only** the `noise_reduction` example application, add `-DUSE_CASE_BUILD=noise_reduction`
+(as specified in [Building](../documentation.md#Building)) to the `cmake` command line.
+
+### Build process
+
+> **Note:** This section describes the process for configuring the build for `MPS3: SSE-300`. To
+> configure a different target platform, please see the [Building](../documentation.md#Building)
+> section.
+
+To build **only** the `noise_reduction` example, create a build directory, and then navigate
+inside. For example:
+
+```commandline
+mkdir build_noise_reduction && cd build_noise_reduction
+```
+
+On Linux, when providing only the mandatory arguments for CMake configuration, use the following
+command to build the Noise Reduction application to run on the *Ethos-U55* Fast Model:
+
+```commandline
+cmake ../ -DUSE_CASE_BUILD=noise_reduction
+```
+
+To configure a build that can be debugged using Arm DS, specify the build type as `Debug` and use
+the `Arm Compiler` toolchain file:
+
+```commandline
+cmake .. \
+    -DCMAKE_TOOLCHAIN_FILE=scripts/cmake/toolchains/bare-metal-armclang.cmake \
+    -DCMAKE_BUILD_TYPE=Debug \
+    -DUSE_CASE_BUILD=noise_reduction
+```
+
+For more notes, please refer to:
+
+- [Configuring with custom TPIP dependencies](../sections/building.md#configuring-with-custom-tpip-dependencies)
+- [Using Arm Compiler](../sections/building.md#using-arm-compiler)
+- [Configuring the build for simple-platform](../sections/building.md#configuring-the-build-for-simple_platform)
+- [Working with model debugger from Arm Fast Model Tools](../sections/building.md#working-with-model-debugger-from-arm-fast-model-tools)
+- [Building for different Ethos-U variants](../sections/building.md#building-for-different-ethos_u-npu-variants)
+
+> **Note:** If you are rebuilding with changed parameter values, it is highly advised that you
+> clean the build directory and rerun the CMake command.
+
+If the CMake command is successful, then build the application as follows:
+
+```commandline
+make -j4
+```
+
+> **Note:** To see compilation and link details, add `VERBOSE=1`.
+
+The build results are placed under the `build/bin` folder. For example:
+
+```tree
+bin
+ ├── ethos-u-noise_reduction.axf
+ ├── ethos-u-noise_reduction.htm
+ ├── ethos-u-noise_reduction.map
+ ├── images-noise_reduction.txt
+ └── sectors
+      └── noise_reduction
+           ├── dram.bin
+           └── itcm.bin
+```
+
+Based on the preceding output, the files contain the following information:
+
+- `ethos-u-noise_reduction.axf`: The built application binary for the noise reduction use case.
+
+- `ethos-u-noise_reduction.map`: Information from building the application (for example, the
+  libraries used, what was optimized, and the location of objects).
+
+- `ethos-u-noise_reduction.htm`: A human readable file containing the call graph of application
+  functions.
+
+- `sectors/`: This folder contains the built application, which is split into files for loading
+  into different FPGA memory regions.
+
+- `images-noise_reduction.txt`: Tells the FPGA which memory regions to use for loading the
+  binaries in the `sectors/...` folder.
+
+### Add custom input
+
+To run with inputs different to the ones supplied, the parameter `noise_reduction_FILE_PATH` can
+be pointed to a WAV file, or a directory containing WAV files. Once you have a directory with WAV
+files, run the following command:
+
+```commandline
+cmake .. \
+    -DUSE_CASE_BUILD=noise_reduction \
+    -Dnoise_reduction_FILE_PATH=/path/to/custom/wav_files
+```
+
+### Add custom model
+
+The application performs inference using the model pointed to by the CMake parameter
+`noise_reduction_MODEL_TFLITE_PATH`.
+
+> **Note:** If you want to run the model using *Ethos-U*, ensure that your custom model has been
+> run through the Vela compiler successfully before continuing.
+
+For further information: [Optimize model with Vela compiler](../sections/building.md#optimize-custom-model-with-vela-compiler).
+
+An example:
+
+```commandline
+cmake .. \
+    -Dnoise_reduction_MODEL_TFLITE_PATH=<path/to/custom_model_after_vela.tflite> \
+    -DUSE_CASE_BUILD=noise_reduction
+```
+
+> **Note:** Changing the neural network model often also requires the pre-processing
+> implementation to be changed. Please refer to:
+> [How the default neural network model works](#how-the-default-neural-network-model-works).
+
+> **Note:** Before re-running the CMake command, clean the build directory.
+
+The `.tflite` model file, which is pointed to by `noise_reduction_MODEL_TFLITE_PATH`, is converted
+to C++ files during the CMake configuration stage and is then compiled into the application to
+perform inference.
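+
+Before pointing the build at a custom model, you can optionally sanity-check its input and output
+tensors offline with the TensorFlow Lite Python interpreter. This is an illustrative step only and
+is not part of the build; run it on the model *before* Vela optimization, because the standard
+interpreter cannot execute the custom *Ethos-U* operator that Vela inserts.
+
+```python
+# Optional, illustrative check of a candidate model's tensor layout.
+import tensorflow as tf
+
+interpreter = tf.lite.Interpreter(model_path="custom_model.tflite")  # your pre-Vela model
+
+for detail in interpreter.get_input_details():
+    print("input :", detail["shape"], detail["dtype"])
+for detail in interpreter.get_output_details():
+    print("output:", detail["shape"], detail["dtype"])
+
+# For the default RNNoise model, expect an INT8 main input of shape [1, 1, 42]
+# plus three GRU state inputs, mirroring the "Show NN model info" output below.
+```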
+
+To see which model path was used, inspect the configuration stage log:
+
+```log
+-- User option noise_reduction_MODEL_TFLITE_PATH is set to <path/to/custom_model_after_vela.tflite>
+...
+-- Using <path/to/custom_model_after_vela.tflite>
+++ Converting custom_model_after_vela.tflite to custom_model_after_vela.tflite.cc
+-- Generating labels file from <path/to/labels_file>
+-- writing to <path/to/generated_labels_file.cc>
+...
+```
+
+After compiling, your custom model replaces the default one in the application.
+
+## Setting up and running Ethos-U NPU code sample
+
+### Setting up the Ethos-U NPU Fast Model
+
+The FVP is available publicly from
+[Arm Ecosystem FVP downloads](https://developer.arm.com/tools-and-software/open-source-software/arm-platforms-software/arm-ecosystem-fvps).
+
+For the *Ethos-U* evaluation, please download the MPS3-based version of the Arm® *Corstone™-300*
+model that contains *Cortex-M55* and offers a choice of the *Ethos-U55* and *Ethos-U65*
+processors.
+
+To install the FVP:
+
+- Unpack the archive.
+
+- Run the install script in the extracted package:
+
+```commandline
+./FVP_Corstone_SSE-300.sh
+```
+
+- Follow the instructions to install the FVP to your required location.
+
+### Starting Fast Model simulation
+
+Once the building step has completed, the application binary `ethos-u-noise_reduction.axf` can be
+found in the `build/bin` folder. Assuming the install location of the FVP was set to
+`~/FVP_install_location`, start the simulation with the following command:
+
+```commandline
+~/FVP_install_location/models/Linux64_GCC-6.4/FVP_Corstone_SSE-300_Ethos-U55 ./bin/mps3-sse-300/ethos-u-noise_reduction.axf
+```
+
+A log output then appears on the terminal:
+
+```log
+telnetterminal0: Listening for serial connection on port 5000
+telnetterminal1: Listening for serial connection on port 5001
+telnetterminal2: Listening for serial connection on port 5002
+telnetterminal5: Listening for serial connection on port 5003
+```
+
+This also launches a telnet window with the standard output of the sample application. It also
+includes error log entries containing information about the pre-built application version, the
+TensorFlow Lite Micro library version used, and the data type, as well as the input and output
+tensor sizes of the model that was compiled into the executable binary.
+
+After the application has started, if `noise_reduction_FILE_PATH` points to a single file (or a
+folder containing a single input file), then the inference starts immediately. If multiple inputs
+are chosen, then a menu is displayed and the application waits for input from the telnet terminal.
+
+For example:
+
+```log
+User input required
+Enter option number from:
+
+  1. Run noise reduction on the next WAV
+  2. Run noise reduction on a WAV at chosen index
+  3. Run noise reduction on all WAVs
+  4. Show NN model info
+  5. List audio clips
+
+Choice:
+```
+
+1. “Run noise reduction on the next WAV”: Runs processing and inference on the next in-line WAV
+   file.
+
+   > **Note:** Depending on the size of the input WAV file, multiple inferences can be invoked.
+
+2. “Run noise reduction on a WAV at chosen index”: Runs processing and inference on the WAV file
+   corresponding to the chosen index.
+
+   > **Note:** The chosen index must be within the range of WAV files supplied at application
+   > build time. By default, the pre-built application has three files, with indexes from 0 to 2.
+
+3. “Run noise reduction on all WAVs”: Triggers sequential processing and inference executions on
+   all baked-in WAV files.
+4. “Show NN model info”: Prints information about the model data type, including the input and
+   output tensor sizes. For example:
+
+    ```log
+    INFO - Model info:
+    INFO - Model INPUT tensors:
+    INFO -  tensor type is INT8
+    INFO -  tensor occupies 42 bytes with dimensions
+    INFO -    0:   1
+    INFO -    1:   1
+    INFO -    2:  42
+    INFO - Quant dimension: 0
+    INFO - Scale[0] = 0.221501
+    INFO - ZeroPoint[0] = 14
+    INFO -  tensor type is INT8
+    INFO -  tensor occupies 24 bytes with dimensions
+    INFO -    0:   1
+    INFO -    1:  24
+    INFO - Quant dimension: 0
+    INFO - Scale[0] = 0.007843
+    INFO - ZeroPoint[0] = -1
+    INFO -  tensor type is INT8
+    INFO -  tensor occupies 48 bytes with dimensions
+    INFO -    0:   1
+    INFO -    1:  48
+    INFO - Quant dimension: 0
+    INFO - Scale[0] = 0.047942
+    INFO - ZeroPoint[0] = -128
+    INFO -  tensor type is INT8
+    INFO -  tensor occupies 96 bytes with dimensions
+    INFO -    0:   1
+    INFO -    1:  96
+    INFO - Quant dimension: 0
+    INFO - Scale[0] = 0.007843
+    INFO - ZeroPoint[0] = -1
+    INFO - Model OUTPUT tensors:
+    INFO -  tensor type is INT8
+    INFO -  tensor occupies 96 bytes with dimensions
+    INFO -    0:   1
+    INFO -    1:   1
+    INFO -    2:  96
+    INFO - Quant dimension: 0
+    INFO - Scale[0] = 0.007843
+    INFO - ZeroPoint[0] = -1
+    INFO -  tensor type is INT8
+    INFO -  tensor occupies 22 bytes with dimensions
+    INFO -    0:   1
+    INFO -    1:   1
+    INFO -    2:  22
+    INFO - Quant dimension: 0
+    INFO - Scale[0] = 0.003906
+    INFO - ZeroPoint[0] = -128
+    INFO -  tensor type is INT8
+    INFO -  tensor occupies 48 bytes with dimensions
+    INFO -    0:   1
+    INFO -    1:   1
+    INFO -    2:  48
+    INFO - Quant dimension: 0
+    INFO - Scale[0] = 0.047942
+    INFO - ZeroPoint[0] = -128
+    INFO -  tensor type is INT8
+    INFO -  tensor occupies 24 bytes with dimensions
+    INFO -    0:   1
+    INFO -    1:   1
+    INFO -    2:  24
+    INFO - Quant dimension: 0
+    INFO - Scale[0] = 0.007843
+    INFO - ZeroPoint[0] = -1
+    INFO -  tensor type is INT8
+    INFO -  tensor occupies 1 bytes with dimensions
+    INFO -    0:   1
+    INFO -    1:   1
+    INFO -    2:   1
+    INFO - Quant dimension: 0
+    INFO - Scale[0] = 0.003906
+    INFO - ZeroPoint[0] = -128
+    INFO - Activation buffer (a.k.a tensor arena) size used: 1940
+    INFO - Number of operators: 1
+    INFO - Operator 0: ethos-u
+    INFO - Use of Arm uNPU is enabled
+    ```
+
+5. “List audio clips”: Prints the list of audio clip indexes, paired with the original filenames
+   embedded in the application. For example:
+
+    ```log
+    INFO - List of Files:
+    INFO - 0 => p232_113.wav
+    INFO - 1 => p232_208.wav
+    INFO - 2 => p257_031.wav
+    ```
+
+### Running Noise Reduction
+
+Selecting the first option runs inference on the first file.
+
+The following example illustrates an application output:
+
+```log
+INFO - Audio Clip dump header info (20 bytes) written to 0x80000000
+INFO - Inference 1/136
+INFO - Copied 960 bytes to 0x80000014
+INFO - Inference 2/136
+INFO - Copied 960 bytes to 0x800003d4
+...
+INFO - Inference 136/136
+INFO - Copied 960 bytes to 0x8001fa54
+INFO - Output memory dump of 130580 bytes written at address 0x80000000
+INFO - Final results:
+INFO - Profile for Inference:
+INFO - NPU AXI0_RD_DATA_BEAT_RECEIVED beats: 530
+INFO - NPU AXI0_WR_DATA_BEAT_WRITTEN beats: 376
+INFO - NPU AXI1_RD_DATA_BEAT_RECEIVED beats: 13911
+INFO - NPU ACTIVE cycles: 103870
+INFO - NPU IDLE cycles: 643
+INFO - NPU TOTAL cycles: 104514
+```
+
+> **Note:** When running Fast Model, each inference can take several seconds on most systems.
+
+Each inference dumps the post-processed output to memory.
+For further information, please refer to:
+[Dumping post-processed results for all inferences](#dumping-post_processed-results-for-all-inferences).
+
+The profiling section of the log shows the *Ethos-U* NPU PMU report for this inference (a quick
+way to interpret these counters is sketched after the list):
+
+- 104514: The total number of NPU cycles.
+
+- 103870: How many NPU cycles were used for computation.
+
+- 643: How many cycles the NPU was idle for.
+
+- 530: The number of AXI beats with read transactions from the AXI0 bus.
+
+  > **Note:** The AXI0 is the bus where the *Ethos-U* NPU reads from and writes to the computation
+  > buffers (the activation buffer or tensor arena).
+
+- 376: The number of AXI beats with write transactions to the AXI0 bus.
+
+- 13911: The number of AXI beats with read transactions from the AXI1 bus.
+
+  > **Note:** The AXI1 is the bus where the *Ethos-U* NPU reads the model, which is read-only.
+
+- For FPGA platforms, the CPU cycle count can also be enabled. However, for FVP, do not use the
+  CPU cycle counters as the CPU model is not cycle-approximate or cycle-accurate.
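+
+As a quick, illustrative way of interpreting the counters above (the numbers are taken from the
+example log and will differ per model, memory mode, and NPU configuration; the NPU clock rate is
+an assumption for illustration only):
+
+```python
+# Illustrative only: interpret the Ethos-U PMU counters from the example log above.
+active_cycles = 103_870   # NPU ACTIVE cycles
+idle_cycles = 643         # NPU IDLE cycles
+total_cycles = 104_514    # NPU TOTAL cycles (~ active + idle)
+
+print(f"NPU utilisation: {active_cycles / total_cycles:.1%}")  # ~99.4%
+
+# With an assumed NPU clock rate, total cycles translate to wall-clock time.
+NPU_CLOCK_HZ = 32_000_000  # assumption for illustration only
+print(f"Approx. time per inference: {total_cycles / NPU_CLOCK_HZ * 1e3:.2f} ms")
+```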