# Object Detection Sample Application

## Introduction

This sample application shows how to perform object detection using PyArmNN or the Arm NN TensorFlow Lite Delegate API. We assume the user has already built PyArmNN by following the instructions of the README in the main PyArmNN directory.

##### Running with Arm NN TensorFlow Lite Delegate

There is an option to use the Arm NN TensorFlow Lite Delegate instead of the Arm NN TensorFlow Lite Parser for the object detection inference. The Arm NN TensorFlow Lite Delegate is part of the Arm NN library and its purpose is to accelerate certain TensorFlow Lite (TfLite) operators on Arm hardware. The main advantage of using the Arm NN TensorFlow Lite Delegate over the Arm NN TensorFlow Lite Parser is that the number of supported operations is far greater, which means the Arm NN TfLite Delegate can execute all TfLite models and accelerate any operations that Arm NN supports.

In addition, the delegate options apply some optimizations by default in order to improve inference performance at the expense of a slight accuracy reduction. In this example we enable the fast math and reduce-float32-to-float16 optimizations.

Using the **fast_math** flag can lead to performance improvements in fp32 and fp16 layers, but may produce results with reduced or different precision. The fast_math flag will not have any effect on int8 performance.

The **reduce-fp32-to-fp16** feature works best if all operators of the model are in Fp32. Arm NN will add conversion layers around operators that were not in Fp32 in the first place or that are not supported in Fp16. The overhead of these conversions can lead to slower overall performance if too many conversions are required.

One can turn off these optimizations in the `create_network` function found in `network_executor_tflite.py` by changing the `optimization_enable` flag to false.

We provide example scripts for performing object detection from a video file and from a video stream: `run_video_file.py` and `run_video_stream.py`.

The application takes a model and a video file or camera feed as input, runs inference on each frame, and draws bounding boxes around detected objects, with the corresponding labels and confidence scores overlaid.

A similar implementation of this object detection application is also provided in C++ in the examples for Arm NN.

##### Performing Object Detection with Style Transfer and TensorFlow Lite Delegate

In addition to running object detection using the TensorFlow Lite Delegate, instead of drawing bounding boxes on each frame, there is an option to run style transfer to create stylized detections. Style transfer is the ability to create a new image, known as a pastiche, based on two input images: one representing an artistic style and one representing the content frame containing class detections.

The style transfer consists of two submodels:

* Style Prediction Model: A MobileNetV2-based neural network that takes an input style image and creates a style bottleneck vector.
* Style Transform Model: A neural network that applies a style bottleneck vector to a content image and creates a stylized image.

An image containing an art style is preprocessed to the correct size and dimensions. The preprocessed style image is passed to a style predict network, which calculates and returns a style bottleneck tensor. The style transfer network then receives the style bottleneck together with a content frame that contains detections, transforms the detections of the requested class, and returns a stylized frame.
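To make the two-stage flow above concrete, the sketch below chains a style prediction model and a style transform model with `tflite_runtime`. It is a minimal illustration only: the model file names, input sizes and input ordering are assumptions, and the sample application wraps this logic in its own helper classes rather than using this exact code.

```python
# Rough sketch of the two-stage style transfer flow described above.
# File names, input sizes and input ordering are assumptions for illustration;
# the input order passed to run_model must match each model's input details.
import cv2
import numpy as np
from tflite_runtime.interpreter import Interpreter

def run_model(model_path, inputs):
    """Run a TfLite model on a list of input tensors and return its outputs."""
    interpreter = Interpreter(model_path=model_path)
    interpreter.allocate_tensors()
    for detail, tensor in zip(interpreter.get_input_details(), inputs):
        interpreter.set_tensor(detail['index'], tensor)
    interpreter.invoke()
    return [interpreter.get_tensor(d['index']) for d in interpreter.get_output_details()]

def preprocess(image_bgr, size):
    """Resize to the model's input size and scale to float32 [0, 1], NHWC."""
    resized = cv2.resize(image_bgr, (size, size))
    rgb = cv2.cvtColor(resized, cv2.COLOR_BGR2RGB)
    return np.expand_dims(rgb.astype(np.float32) / 255.0, axis=0)

style_image = cv2.imread('style.jpg')        # image providing the artistic style
content_frame = cv2.imread('frame.jpg')      # content frame containing detections

# 1. Style prediction: style image -> style bottleneck tensor.
style_bottleneck = run_model('style_predict.tflite',
                             [preprocess(style_image, 256)])[0]

# 2. Style transform: content frame + style bottleneck -> stylized frame.
stylized_frame = run_model('style_transfer.tflite',
                           [preprocess(content_frame, 384), style_bottleneck])[0]
```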
## Prerequisites

##### PyArmNN

Before proceeding to the next steps, make sure that you have successfully installed the newest version of PyArmNN on your system by following the instructions in the README of the PyArmNN root directory.

You can verify that the PyArmNN library is installed and check its version using:

```bash
$ pip show pyarmnn
```

You can also verify it by running the following and getting output similar to below:

```bash
$ python -c "import pyarmnn as ann;print(ann.GetVersion())"
'29.0.0'
```

##### Dependencies

Install the following libraries on your system:

```bash
$ sudo apt-get install python3-opencv
```

The following step is needed only if you want to run with the Arm NN TensorFlow Lite Delegate. If there is no libarmnnDelegate.so file in your ARMNN_LIB path, download the Arm NN artifacts that include the Arm NN delegate according to your platform and the latest Arm NN version (for this example aarch64 and v21.11 respectively):

```bash
$ export WORKSPACE=`pwd`
$ mkdir ./armnn_artifacts ; cd armnn_artifacts
$ wget https://github.com/ARM-software/armnn/releases/download/v21.11/ArmNN-linux-aarch64.tar.gz
$ tar -xvzf ArmNN*.tar.gz
$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:`pwd`
```

Create a virtual environment:

```bash
$ python3.7 -m venv devenv --system-site-packages
$ source devenv/bin/activate
```

Install the dependencies from the object_detection example folder:
* If the Python version is 3.8 or lower, tflite_runtime version 2.5.0 (without the post1 suffix) should be installed, and the requirements.txt file should be amended accordingly.

```bash
$ cd $WORKSPACE/armnn/python/pyarmnn/examples/object_detection
$ pip install -r requirements.txt
```

---

# Performing Object Detection

## Object Detection from Video File

The `run_video_file.py` example takes a video file as input, runs inference on each frame, and produces frames with bounding boxes drawn around detected objects. The processed frames are written to a video file.

The user can specify these arguments at command line:

* `--video_file_path` - Required: Path to the video file to run object detection on
* `--model_file_path` - Required: Path to the .tflite, .pb or .onnx object detection model
* `--model_name` - Required: The name of the model being used. Assembles the workflow for the input model. The examples support the model names:
  * `ssd_mobilenet_v1`
  * `yolo_v3_tiny`
* `--label_path` - Required: Path to the labels file for the specified model file
* `--output_video_file_path` - Path to the output video file with detections added in
* `--preferred_backends` - You can specify one or more backends in order of preference. Accepted backends include `CpuAcc, GpuAcc, CpuRef`. Arm NN will decide which layers of the network are supported by the backend, falling back to the next if a layer is unsupported. Defaults to `['CpuAcc', 'CpuRef']`
* `--tflite_delegate_path` - Optional. Path to the Arm NN TensorFlow Lite Delegate library (libarmnnDelegate.so). If provided, the Arm NN TensorFlow Lite Delegate will be used instead of PyArmNN (see the sketch after this list)
* `--profiling_enabled` - Optional. Enabling this option will print important ML-related milestone timings in microseconds. Disabled by default. Accepted options are `true/false`
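For reference, when `--tflite_delegate_path` is supplied the scripts hand inference over to Arm NN through the standard TfLite external delegate mechanism, with the backend preference passed as a delegate option. The sketch below is illustrative rather than the sample's exact code: the model file name is a placeholder, and the option keys shown follow the Arm NN external delegate documentation and may differ between Arm NN versions, so check the documentation of your release.

```python
# Illustrative sketch only: loading the Arm NN TfLite delegate explicitly.
# Option key names are assumptions based on the Arm NN external delegate
# documentation and may vary between Arm NN versions.
from tflite_runtime.interpreter import Interpreter, load_delegate

armnn_delegate = load_delegate(
    library='libarmnnDelegate.so',
    options={
        'backends': 'CpuAcc,CpuRef',   # preference order, like --preferred_backends
        'logging-severity': 'info',
    })

# 'ssd_mobilenet_v1.tflite' is a placeholder model path.
interpreter = Interpreter(model_path='ssd_mobilenet_v1.tflite',
                          experimental_delegates=[armnn_delegate])
interpreter.allocate_tensors()
```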
The `run_video_file.py` example can also perform style transfer on a selected class of detected objects and stylize the detections based on a given style image.

To run style transfer, the user additionally needs to specify these arguments at command line:

* `--style_predict_model_file_path` - Path to the style predict model that will be used to create a style bottleneck tensor
* `--style_transfer_model_file_path` - Path to the style transfer model that will perform the style transfer
* `--style_image_path` - Path to a .jpg/jpeg/png style image used to create stylized frames
* `--style_transfer_class` - The name of a detected class whose detections should be stylized

Run the sample script:

```bash
$ python run_video_file.py --video_file_path <video_file_path> --model_file_path <model_file_path> --model_name <model_name> --tflite_delegate_path <delegate_path> --style_predict_model_file_path <style_predict_model_path> --style_transfer_model_file_path <style_transfer_model_path> --style_image_path <style_image_path> --style_transfer_class <class_name>
```

## Object Detection from Video Stream

The `run_video_stream.py` example captures frames from the video stream of a device, runs inference on each frame, and produces frames with bounding boxes drawn around detected objects. A window is displayed and refreshed with the latest processed frame.

The user can specify these arguments at command line:

* `--video_source` - Device index to access the video stream. Defaults to the primary device camera at index 0
* `--model_file_path` - Required: Path to the .tflite, .pb or .onnx object detection model
* `--model_name` - Required: The name of the model being used. Assembles the workflow for the input model. The examples support the model names:
  * `ssd_mobilenet_v1`
  * `yolo_v3_tiny`
* `--label_path` - Required: Path to the labels file for the specified model file
* `--preferred_backends` - You can specify one or more backends in order of preference. Accepted backends include `CpuAcc, GpuAcc, CpuRef`. Arm NN will decide which layers of the network are supported by the backend, falling back to the next if a layer is unsupported. Defaults to `['CpuAcc', 'CpuRef']`
* `--tflite_delegate_path` - Optional. Path to the Arm NN TensorFlow Lite Delegate library (libarmnnDelegate.so). If provided, the Arm NN TensorFlow Lite Delegate will be used instead of PyArmNN.
* `--profiling_enabled` - Optional. Enabling this option will print important ML-related milestone timings in microseconds. Disabled by default. Accepted options are `true/false`

Run the sample script:

```bash
$ python run_video_stream.py --model_file_path <model_file_path> --model_name <model_name> --tflite_delegate_path <delegate_path> --label_path <label_path> --video_source <video_source>
```
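As a rough picture of what the stream example does per frame (capture, infer, draw, display), a simplified loop might look like the following. This is a sketch only: model-specific pre/post-processing, labels, confidence scores and profiling are omitted, and `detect` is a hypothetical placeholder for the network call.

```python
# Simplified per-frame loop, for illustration only; the real script adds
# model-specific pre/post-processing, label and score overlays, and profiling.
import cv2

def detect(frame):
    """Placeholder: run inference and return [(x_min, y_min, x_max, y_max), ...]."""
    return []

capture = cv2.VideoCapture(0)          # device index, like --video_source
while True:
    read_ok, frame = capture.read()
    if not read_ok:
        break
    for (x_min, y_min, x_max, y_max) in detect(frame):
        cv2.rectangle(frame, (x_min, y_min), (x_max, y_max), (0, 255, 0), 2)
    cv2.imshow('Object detection', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
capture.release()
cv2.destroyAllWindows()
```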