From c43480ac8b1ad89cee944d10d61aca127a146050 Mon Sep 17 00:00:00 2001
From: Jan Eilers
Date: Wed, 25 Nov 2020 17:17:10 +0000
Subject: IVGCVSW-5457 Add How-To documentation for the TfLite Delegate

* Add guide.md to the delegate directory
* Mentioned the guide in README.md

Signed-off-by: Jan Eilers
Change-Id: Iaa373ecf28f57f4d8383489fecf43670ae8ca190
---
 README.md                    |   2 +
 delegate/BuildGuideNative.md | 237 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 239 insertions(+)
 create mode 100644 delegate/BuildGuideNative.md

diff --git a/README.md b/README.md
index f76085c13c..e6959eb66f 100644
--- a/README.md
+++ b/README.md
@@ -14,6 +14,8 @@ There is a guide for backend development: [Backend development guide](src/backen
 There is a guide for installation of ArmNN, Tensorflow Lite Parser and PyArmnn via our Apt Repository:
 [Installation via Apt Repository](InstallationViaAptRepository.md)
 
+There is a getting started guide for our ArmNN TfLite Delegate: [Build the TfLite Delegate natively](delegate/BuildGuideNative.md)
+
 API Documentation is available at https://github.com/ARM-software/armnn/wiki/Documentation.
 
 Dox files to generate Arm NN doxygen files can be found at armnn/docs/. Following generation the xhtml files can be
 found at armnn/documentation/

diff --git a/delegate/BuildGuideNative.md b/delegate/BuildGuideNative.md
new file mode 100644
index 0000000000..0f591d1d98
--- /dev/null
+++ b/delegate/BuildGuideNative.md
@@ -0,0 +1,237 @@
# Introduction

The ArmNN Delegate can be found within the ArmNN repository, but it is a standalone piece of software that
makes use of the ArmNN library. For this reason we have added two options to build the delegate: the first
option builds the delegate together with the ArmNN library, the second option is a standalone build
of the delegate.

This tutorial uses an AArch64 machine with Ubuntu 18.04 installed that can build all components
natively (no cross-compilation required), which keeps this guide simple.

1. [Dependencies](#dependencies)
   * [Build Tensorflow for C++](#build-tensorflow-for-c)
   * [Build Flatbuffers](#build-flatbuffers)
   * [Build the Arm Compute Library](#build-the-arm-compute-library)
   * [Build the ArmNN Library](#build-the-armnn-library)
2. [Build the TfLite Delegate (Stand-Alone)](#build-the-tflite-delegate-stand-alone)
3. [Build the Delegate together with ArmNN](#build-the-delegate-together-with-armnn)
4. [Integrate the ArmNN TfLite Delegate into your project](#integrate-the-armnn-tflite-delegate-into-your-project)

# Dependencies

Build Dependencies:
 * Tensorflow and Tensorflow Lite version 2.3.1
 * Flatbuffers 1.12.0
 * ArmNN 20.11 or higher

Required Tools:
 * Git
 * pip
 * wget
 * zip
 * unzip
 * cmake 3.7.0 or higher
 * scons
 * bazel 3.1.0

Our first step is to build all of the build dependencies mentioned above. We will have to create quite a few
directories, so to make navigation a bit easier, define a base directory for the project. At this stage we can
also install all the tools that are required during the build.
```bash
export BASEDIR=/home
cd $BASEDIR
apt-get update && apt-get install git wget unzip zip python python3-pip cmake scons
```
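The Tensorflow and ACL builds further down take a long time, so it can be worth a quick sanity check that the
required tools are actually on your PATH and recent enough before starting. This is an optional check, not part
of the original build steps; the exact version strings will differ on your machine:
```bash
# Confirm the required build tools are installed and report acceptable versions
cmake --version   # needs to report 3.7.0 or higher
scons --version
git --version
wget --version | head -n1
pip3 --version
```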
## Build Tensorflow for C++

Tensorflow has a few dependencies of its own. It requires the python packages pip3, numpy, wheel and
keras_preprocessing, and also bazel, which is used to compile Tensorflow. A description of how to build bazel can be
found [here](https://docs.bazel.build/versions/master/install-compile-source.html). There are multiple ways;
I decided to compile from source because that should work for any platform and therefore adds the most value
to this guide. Depending on your operating system and architecture there might be an easier way.
```bash
# Install the python packages
pip3 install -U pip numpy wheel
pip3 install -U keras_preprocessing --no-deps

# Bazel has a dependency on JDK
apt-get install openjdk-11-jdk
# Build Bazel
wget -O bazel-3.1.0-dist.zip https://github.com/bazelbuild/bazel/releases/download/3.1.0/bazel-3.1.0-dist.zip
unzip -d bazel bazel-3.1.0-dist.zip
cd bazel
env EXTRA_BAZEL_ARGS="--host_javabase=@local_jdk//:jdk" bash ./compile.sh
# This creates an "output" directory where the bazel binary can be found

# Download Tensorflow
cd $BASEDIR
git clone https://github.com/tensorflow/tensorflow.git
cd tensorflow/
git checkout tags/v2.3.1 # Minimum version required for the delegate
```
Before tensorflow can be built, targets need to be defined in the `BUILD` file that can be
found in the root directory of Tensorflow. Append the following two targets to the file:
```
cc_binary(
    name = "libtensorflow_all.so",
    linkshared = 1,
    deps = [
        "//tensorflow/core:framework",
        "//tensorflow/core:tensorflow",
        "//tensorflow/cc:cc_ops",
        "//tensorflow/cc:client_session",
        "//tensorflow/cc:scope",
        "//tensorflow/c:c_api",
    ],
)
cc_binary(
    name = "libtensorflow_lite_all.so",
    linkshared = 1,
    deps = [
        "//tensorflow/lite:framework",
        "//tensorflow/lite/kernels:builtin_ops",
    ],
)
```
Now the build process can be started. When calling "configure", as below, a dialog shows up that asks the
user to specify additional options. If you don't have any particular needs for your build, decline all
additional options and choose the default values. Building `libtensorflow_all.so` takes quite some time;
this might be a good time to get yourself another drink and take a break.
```bash
PATH="$BASEDIR/bazel/output:$PATH" ./configure
$BASEDIR/bazel/output/bazel build --define=grpc_no_ares=true --config=opt --config=monolithic --strip=always --config=noaws libtensorflow_all.so
$BASEDIR/bazel/output/bazel build --config=opt --config=monolithic --strip=always libtensorflow_lite_all.so
```

## Build Flatbuffers

Flatbuffers is a memory-efficient cross-platform serialization library, as
described [here](https://google.github.io/flatbuffers/). It is used by tflite to store models and is also a dependency
of the delegate. After downloading the right version it can be built and installed using cmake.
```bash
cd $BASEDIR
wget -O flatbuffers-1.12.0.zip https://github.com/google/flatbuffers/archive/v1.12.0.zip
unzip -d . flatbuffers-1.12.0.zip
cd flatbuffers-1.12.0
mkdir install && mkdir build && cd build
# I'm using a separate install directory but that is not required
cmake .. -DCMAKE_INSTALL_PREFIX:PATH=$BASEDIR/flatbuffers-1.12.0/install
make install
```
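Before moving on, it can be worth confirming that the artifacts the delegate build will later rely on are in
place. This check is not part of the original guide; the paths simply follow the directory layout used above,
so adjust them if you chose different locations:
```bash
# The two tensorflow shared libraries built with bazel
ls $BASEDIR/tensorflow/bazel-bin/libtensorflow_all.so
ls $BASEDIR/tensorflow/bazel-bin/libtensorflow_lite_all.so
# The flatbuffers headers and static library installed by cmake
ls $BASEDIR/flatbuffers-1.12.0/install/include/flatbuffers/flatbuffers.h
ls $BASEDIR/flatbuffers-1.12.0/install/lib/libflatbuffers.a
```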
## Build the Arm Compute Library

The ArmNN library depends on the Arm Compute Library (ACL), which provides a set of functions that are optimized
for both Arm CPUs and GPUs. ArmNN uses the Arm Compute Library directly to run machine learning workloads on
Arm CPUs and GPUs.

It is important to use matching versions of ACL and ArmNN. Luckily, ArmNN and ACL are developed
very closely and released together. If you would like to use ArmNN version "20.11", you can use the same "20.11"
version for ACL too.

To build the Arm Compute Library on your platform, download the Arm Compute Library, check out the branch
that contains the version you want to use, and build it using `scons`.
```bash
cd $BASEDIR
git clone https://review.mlplatform.org/ml/ComputeLibrary
cd ComputeLibrary/
git checkout <branch_name> # e.g. branches/arm_compute_20_11
# The machine used for this guide only has a Neon CPU, which is why I only use "neon=1". If
# your machine has an Arm GPU you can enable it by adding `opencl=1 embed_kernels=1` to the command below.
scons arch=arm64-v8a neon=1 extra_cxx_flags="-fPIC" benchmark_tests=0 validation_tests=0
```

## Build the ArmNN Library

After building ACL we can continue with building ArmNN. To do so, download the repository and check out the
same version as you did for ACL. Then create a build directory and use cmake to build it.
```bash
cd $BASEDIR
git clone "https://review.mlplatform.org/ml/armnn"
cd armnn
git checkout <branch_name> # e.g. branches/armnn_20_11
mkdir build && cd build
# if you've got an Arm GPU add `-DARMCOMPUTECL=1` to the command below
cmake .. -DARMCOMPUTE_ROOT=$BASEDIR/ComputeLibrary -DARMCOMPUTENEON=1 -DBUILD_UNIT_TESTS=0
make
```

# Build the TfLite Delegate (Stand-Alone)

The delegate, just like ArmNN itself, is built using cmake. Create a build directory as usual and build the
delegate with the additional cmake arguments shown below.
```bash
cd $BASEDIR/armnn/delegate && mkdir build && cd build
# TENSORFLOW_LIB_DIR: directory in which the tensorflow libraries can be found
# TENSORFLOW_ROOT:    the root directory of the tensorflow repository
# TFLITE_LIB_ROOT:    in our case the same as TENSORFLOW_LIB_DIR
# FLATBUFFERS_ROOT:   the flatbuffers install directory
# Armnn_DIR:          directory in which the ArmNN library can be found
# ARMNN_SOURCE_DIR:   the root directory of the ArmNN repository; required for the ArmNN includes
cmake .. -DTENSORFLOW_LIB_DIR=$BASEDIR/tensorflow/bazel-bin \
         -DTENSORFLOW_ROOT=$BASEDIR/tensorflow \
         -DTFLITE_LIB_ROOT=$BASEDIR/tensorflow/bazel-bin \
         -DFLATBUFFERS_ROOT=$BASEDIR/flatbuffers-1.12.0/install \
         -DArmnn_DIR=$BASEDIR/armnn/build \
         -DARMNN_SOURCE_DIR=$BASEDIR/armnn
make
```

To ensure that the build was successful you can run the unit tests for the delegate, which can be found in
the delegate's build directory. [Doctest](https://github.com/onqtam/doctest) was used to create those tests. Using test filters you can
exclude tests that your build is not configured for. In this case, because ArmNN was only built for Cpu
acceleration (CpuAcc), we filter for all test suites that have `CpuAcc` in their name.
```bash
cd $BASEDIR/armnn/delegate/build
./DelegateUnitTests --test-suite=*CpuAcc*
```
If you have built for Gpu acceleration as well, you might want to change your test-suite filter:
```bash
./DelegateUnitTests --test-suite=*CpuAcc*,*GpuAcc*
```
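If you are unsure which test suites exist in your build, doctest can list them for you; this relies on a
standard doctest command line option rather than anything delegate-specific:
```bash
# Print the names of all test suites compiled into the binary
./DelegateUnitTests --list-test-suites
```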
# Build the Delegate together with ArmNN

In the introduction it was mentioned that there is a way to integrate the delegate build into ArmNN. This is
pretty straightforward: the cmake arguments that were previously used for the delegate have to be added
to the ArmNN cmake arguments, and one additional argument, `BUILD_ARMNN_TFLITE_DELEGATE`, needs to be added to
instruct ArmNN to build the delegate as well. The new commands to build ArmNN are as follows:
```bash
cd $BASEDIR
git clone "https://review.mlplatform.org/ml/armnn"
cd armnn
git checkout <branch_name> # e.g. branches/armnn_20_11
mkdir build && cd build
# if you've got an Arm GPU add `-DARMCOMPUTECL=1` to the command below
cmake .. -DARMCOMPUTE_ROOT=$BASEDIR/ComputeLibrary \
         -DARMCOMPUTENEON=1 \
         -DBUILD_UNIT_TESTS=0 \
         -DBUILD_ARMNN_TFLITE_DELEGATE=1 \
         -DTENSORFLOW_LIB_DIR=$BASEDIR/tensorflow/bazel-bin \
         -DTENSORFLOW_ROOT=$BASEDIR/tensorflow \
         -DTFLITE_LIB_ROOT=$BASEDIR/tensorflow/bazel-bin \
         -DFLATBUFFERS_ROOT=$BASEDIR/flatbuffers-1.12.0/install
make
```
The delegate library can then be found in `build/armnn/delegate`.


# Integrate the ArmNN TfLite Delegate into your project

The delegate can be integrated into your C++ project by creating a TfLite Interpreter and
instructing it to use the ArmNN delegate for the graph execution. This should look similar
to the following code snippet.
```cpp
// Create the TfLite Interpreter
std::unique_ptr<tflite::Interpreter> armnnDelegateInterpreter;
tflite::InterpreterBuilder(tfLiteModel, ::tflite::ops::builtin::BuiltinOpResolver())
                          (&armnnDelegateInterpreter);

// Create the ArmNN Delegate
armnnDelegate::DelegateOptions delegateOptions(backends);
std::unique_ptr<TfLiteDelegate, decltype(&armnnDelegate::TfLiteArmnnDelegateDelete)>
    theArmnnDelegate(armnnDelegate::TfLiteArmnnDelegateCreate(delegateOptions),
                     armnnDelegate::TfLiteArmnnDelegateDelete);

// Instruct the Interpreter to use the armnnDelegate
armnnDelegateInterpreter->ModifyGraphWithDelegate(theArmnnDelegate.get());
```
For further information on using TfLite Delegates
please visit the [tensorflow website](https://www.tensorflow.org/lite/guide).
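To make the snippet above concrete, here is a minimal, self-contained sketch of a program that loads a model
file and runs it through the ArmNN delegate. It is illustrative only: the model path `model.tflite` is a
placeholder, the backend list is just an example, and error handling is reduced to early returns.
```cpp
#include <tensorflow/lite/interpreter.h>
#include <tensorflow/lite/kernels/register.h>
#include <tensorflow/lite/model.h>

#include <armnn_delegate.hpp>

#include <memory>
#include <vector>

int main()
{
    // Load a model from disk ("model.tflite" is a placeholder path)
    auto tfLiteModel = tflite::FlatBufferModel::BuildFromFile("model.tflite");
    if (!tfLiteModel) { return 1; }

    // Build the interpreter with the builtin op resolver
    std::unique_ptr<tflite::Interpreter> interpreter;
    if (tflite::InterpreterBuilder(*tfLiteModel,
                                   tflite::ops::builtin::BuiltinOpResolver())(&interpreter) != kTfLiteOk)
    {
        return 1;
    }

    // Ask the delegate to run on the Arm CPU backend; add armnn::Compute::GpuAcc
    // here if you built with CL support
    std::vector<armnn::BackendId> backends = { armnn::Compute::CpuAcc };
    armnnDelegate::DelegateOptions delegateOptions(backends);
    std::unique_ptr<TfLiteDelegate, decltype(&armnnDelegate::TfLiteArmnnDelegateDelete)>
        theArmnnDelegate(armnnDelegate::TfLiteArmnnDelegateCreate(delegateOptions),
                         armnnDelegate::TfLiteArmnnDelegateDelete);

    // Hand the graph over to the delegate, allocate tensors and run
    if (interpreter->ModifyGraphWithDelegate(theArmnnDelegate.get()) != kTfLiteOk) { return 1; }
    if (interpreter->AllocateTensors() != kTfLiteOk) { return 1; }
    // ... fill the input tensors here ...
    return interpreter->Invoke() == kTfLiteOk ? 0 : 1;
}
```
When compiling this, you will need the delegate and tensorflow include directories on your include path and
will need to link against the delegate, ArmNN and tensorflow lite libraries produced by the builds above.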