TOSA Reference Model
=============

# Introduction

The *Tensor Operator Set Architecture (TOSA) Specification
<https://git.mlplatform.org/tosa/specification.git/>* is a set of operators
with defined accuracy and compatibility constraints that Arm expects
to be implemented on its Neural Processing Units (NPUs).  Most
operators from the common ML frameworks (TensorFlow, PyTorch, etc.)
should be expressible in TOSA.  TOSA is focused on inference, leaving
training to the original frameworks.

The *TOSA Reference Model* package provides a reference implementation
and testing infrastructure for TOSA.  The reference model consumes a
FlatBuffers serialization of the network subgraph generated by the
TOSA Serialization Library, along with input tensors for placeholder
nodes in NumPy format.  By default, the model validates and evaluates
the network subgraph, and writes out the resulting output tensors in
NumPy format.

# Installation Requirements

The *TOSA Reference Model* and testing suite require the following
tools:

* CMake version 3.4 or later
* GNU Make 4.1 or later
* GCC (tested with 7.5.0) or Clang C++ compiler (tested with clang-9)
  with C++17 support

The model includes the TOSA Serialization Library, Eigen 3.3.7, and
FlatBuffers 1.11.0 as git submodules.  The model is written using
C++17 and has been primarily tested on Ubuntu x86_64 18.04 LTS Linux
systems.

The testing infrastructure requires:

* Python 3.6 or later
* TensorFlow 2.3 or later
* NumPy 1.15 or later

Check out the required git submodules with:

``` bash
$ git submodule init
$ git submodule update
```

# Compilation

The *TOSA Reference Model* build can be prepared by creating makefiles using CMake:

``` bash
$ mkdir -p build
$ cd build
$ cmake ..
```

Optionally, `-DCMAKE_BUILD_MODE=Debug` can be passed to the `cmake`
command to create a debug build.  Next, compile using `make`:

``` bash
$ make
```

The resulting executable will be named
`reference_model/tosa_reference_model`.  CMake only needs to be re-run
if the build environment changes (e.g., new dependencies or source
files).  Code changes that do not affect these build rules can be
rebuilt simply using `make`.

# Usage

The inputs to the *TOSA Reference Model* consist of a FlatBuffers file
containing the serialized subgraph, a sequence of placeholder node
name/input tensor NumPy file pairs (produced by an external tool), and
a prefix for output tensor NumPy files (produced by the reference model).

An example command is shown below:

``` bash
$ mkdir -p examples_out/test_add_1x4x4x4_f32
$ ./build/reference_model/tosa_reference_model \
  -Csubgraph_dir=examples/test_add_1x4x4x4_f32/flatbuffer-tflite \
  -Csubgraph_file=test_add_1x4x4x4_f32.tosa \
  -Cinput_dir=examples/test_add_1x4x4x4_f32/ \
  -Coutput_dir=examples_out/test_add_1x4x4x4_f32/ \
  -Coutput_tensor_prefix=ref_model_tflite_ \
  -Cinput_tensor=InputTensor-tflite0:InputTensor-tflite0.npy,InputTensor-tflite1:InputTensor-tflite1.npy
```

On successful execution, the output tensors are written in NumPy
format to the directory given by `-Coutput_dir`, with each filename
prefixed by the value of `-Coutput_tensor_prefix`.
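
The `-Cinput_tensor` value in the example command is a comma-separated
list of `name:file` pairs.  As a minimal sketch, assuming the reference
model is driven from a Python script, the argument list could be
assembled as shown below.  The helper `build_ref_model_args` and its
parameter names are hypothetical and are not part of the reference
model; only the `-C` options from the example command are used.

```python
from typing import Dict, List


def build_ref_model_args(
    binary: str,
    subgraph_dir: str,
    subgraph_file: str,
    input_dir: str,
    output_dir: str,
    output_prefix: str,
    inputs: Dict[str, str],
) -> List[str]:
    """Assemble an argument list for the reference model (hypothetical helper)."""
    # Each placeholder is passed as "<node name>:<NumPy file>", joined by commas.
    input_pairs = ",".join(f"{name}:{path}" for name, path in inputs.items())
    return [
        binary,
        f"-Csubgraph_dir={subgraph_dir}",
        f"-Csubgraph_file={subgraph_file}",
        f"-Cinput_dir={input_dir}",
        f"-Coutput_dir={output_dir}",
        f"-Coutput_tensor_prefix={output_prefix}",
        f"-Cinput_tensor={input_pairs}",
    ]


# Values copied from the example command in this section.
args = build_ref_model_args(
    binary="./build/reference_model/tosa_reference_model",
    subgraph_dir="examples/test_add_1x4x4x4_f32/flatbuffer-tflite",
    subgraph_file="test_add_1x4x4x4_f32.tosa",
    input_dir="examples/test_add_1x4x4x4_f32/",
    output_dir="examples_out/test_add_1x4x4x4_f32/",
    output_prefix="ref_model_tflite_",
    inputs={
        "InputTensor-tflite0": "InputTensor-tflite0.npy",
        "InputTensor-tflite1": "InputTensor-tflite1.npy",
    },
)
print(" ".join(args))
```

The resulting list can be handed to `subprocess.run(args, check=True)`.
Passing a list rather than a single shell string avoids quoting
problems with tensor names.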

When using JSON-formatted FlatBuffers input (.json extension), the
FlatBuffers schema file from the TOSA Serialization library must be
specified using -Coperator_fbs=.  When using the binary FlatBuffers
format (.tosa), the schema is not necessary.

## Examples

The TOSA Reference Model distribution contains several example
networks with inputs and reference outputs generated by
TensorFlow or TensorFlow Lite in the examples directory.

These examples can be run through the TOSA Reference Model and should
produce equivalent TOSA-compliant reference output tensors.
Note that differences in floating-point ordering and rounding
may cause small differences in output for floating-point tests, and
differences in quantized scaling between TensorFlow Lite and the TOSA
Specification may cause differences in quantized integer tests.
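
Because of these small differences, a bit-exact comparison is usually
too strict for floating-point outputs.  The sketch below shows one
possible way to compare an output tensor against a reference using
NumPy.  The function name, file names, and tolerance values are
illustrative assumptions; they are not defined by the TOSA
Specification or by the reference model.

```python
import os
import tempfile

import numpy as np


def tensors_match(ref_path, test_path, rtol=1e-5, atol=1e-6):
    """Return True if two .npy tensors agree (tolerances are illustrative)."""
    ref = np.load(ref_path)
    test = np.load(test_path)
    if ref.shape != test.shape:
        return False
    if np.issubdtype(ref.dtype, np.floating):
        # Floating-point results may differ slightly in ordering and rounding.
        return bool(np.allclose(ref, test, rtol=rtol, atol=atol))
    # Integer results are compared exactly in this sketch.
    return bool(np.array_equal(ref, test))


# Self-contained demonstration using temporary files instead of real outputs.
with tempfile.TemporaryDirectory() as tmp:
    ref_file = os.path.join(tmp, "reference.npy")
    out_file = os.path.join(tmp, "ref_model_output.npy")
    ref = np.arange(64, dtype=np.float32).reshape(1, 4, 4, 4)
    np.save(ref_file, ref)
    # A tiny perturbation is accepted ...
    np.save(out_file, ref + np.float32(1e-7))
    float_ok = tensors_match(ref_file, out_file)
    # ... while a large one is rejected.
    np.save(out_file, ref + np.float32(1.0))
    float_bad = tensors_match(ref_file, out_file)

print(float_ok, float_bad)
```

The exact integer comparison is only a starting point.  For quantized
tests, a small integer tolerance may be more appropriate, for the
scaling reasons described above.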

# Debugging

The debugging facility can be enabled by setting a debug scope and
debug level on the command line.  For most purposes, the following
flags will work: `-dALL -lHIGH`.  Debug output can be directed to a
file using the `-o` switch.

# TOSA Unit Test Infrastructure

The TOSA Unit Test infrastructure builds and runs self-contained tests
for implementations of the *Tensor Operator Set Architecture (TOSA)
Specification*.  These tools directly generate TOSA operators for
verification of the TOSA reference model against existing frameworks
or other operator implementations.

The test builder tool generates tests with random arguments and
reference inputs for each TOSA operator.  Currently, the test builder
focuses on generating a wide range of legal arguments to each
operator, but it also has limited support for generating tests with
illegal arguments in order to make sure such usages are properly
detected.

The unit tests are typically structured as a combination of input
placeholder nodes, const nodes, and attributes feeding into a single
TOSA operator.  The unit tests use a Python copy of the FlatBuffers
schema written by ``flatc`` to ``verif/tosa``.

Each test has a JSON file which provides machine-readable metadata for
the test, including the .tosa FlatBuffers file and the names, shapes,
and NumPy filenames for each input and output tensor.  A boolean value
indicates whether the test is expected to fail because it
deliberately uses an invalid set of operands or attributes.
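
As a rough illustration of how such metadata might be consumed, the
sketch below parses a made-up test description.  Every key name in it
(`tosa_file`, `input_names`, `input_files`, `output_names`,
`output_files`, `expected_failure`) is a hypothetical placeholder and
may not match the files written by the test builder; check a generated
JSON file for the real layout.

```python
import json

# Hypothetical metadata; the key names below are assumptions for illustration
# and may not match the files written by the test builder.
example_desc = """
{
  "tosa_file": "test.tosa",
  "input_names": ["input0", "input1"],
  "input_files": ["input0.npy", "input1.npy"],
  "output_names": ["result"],
  "output_files": ["result.npy"],
  "expected_failure": false
}
"""


def summarize_test(desc_text):
    """Extract the fields a runner would need from one test description."""
    desc = json.loads(desc_text)
    # Pair each placeholder node name with its NumPy input file.
    inputs = dict(zip(desc["input_names"], desc["input_files"]))
    return {
        "subgraph": desc["tosa_file"],
        "inputs": inputs,
        "outputs": desc["output_files"],
        "expect_failure": bool(desc["expected_failure"]),
    }


summary = summarize_test(example_desc)
print(summary)
```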

The test runner tool executes the unit tests on the TOSA Reference
Model to generate reference output tensor values (for legal tests).
The test runner is a modular tool which can be extended to run the same
tests on additional tools or frameworks.  The reference output NumPy
files are generated by this step and can be programmatically compared
with the output of other tools to validate those tools.

## Usage

### Unit Test Builder

The test builder is in ``verif/tosa_verif_build_tests.py``.  The
builder generates test outputs in ``./vtest/<operator_name>/`` by
default.  To restrict test generation to tests matching a particular
regular expression, use the ``--filter`` argument.  The tool can be run
with no arguments to generate all tests.

Inputs and certain attributes are created using a random number
generator, while others are enumerated exhaustively (within reasonable
bounds) where the combinatorics allow.  Test generation is
deterministic for a given random seed, and additional tests can be
generated by passing a different value to ``--seed``.  Because many
corner-case errors are uncovered by unusual tensor shapes, varying the
random seed helps cover additional shapes.
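
The relationship between seed and generated tests can be illustrated
with a small sketch.  This is not the test builder's actual generator;
the function `random_shapes` and its parameters are hypothetical and
only show why a fixed seed reproduces the same shapes while a new seed
yields different ones.

```python
import random


def random_shapes(seed, count=3, rank=4, max_dim=16):
    """Generate `count` random tensor shapes from a dedicated, seeded RNG."""
    # A private generator keeps results independent of global random state.
    rng = random.Random(seed)
    return [
        tuple(rng.randint(1, max_dim) for _ in range(rank))
        for _ in range(count)
    ]


same_a = random_shapes(seed=42)
same_b = random_shapes(seed=42)  # same seed: identical shapes
other = random_shapes(seed=43)   # different seed: a new set of shapes

print(same_a == same_b)
print(same_a)
print(other)
```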

Additional parameters on some operators can be found in the command
line help.

### Unit Test Runner

The unit test running script takes self-contained unit tests from the
builder and runs them on the reference model.  Shell wildcards can be
used to run more than one test at a time and tests can be run in
parallel using the ``-j`` switch.  For example, to run all of the
add operator tests:

``` bash
$ ./verif/tosa_verif_run_ref.py -t vtest/add/add* -j 8
```

The test runner is quiet by default, so running a large number of
tests without any obvious errors will show no output while the tests
are running.  The ``-v`` switch will show the command being run in the
background.

To enable debugging on the reference model, two shortcut options are
provided: ``--ref-debug=high`` turns on debug output and
``--ref-intermediates`` dumps intermediate tensor values.

Additional Systems Under Test (SUTs), such as reference
implementations of operators, full frameworks, etc., can be defined by
extending the ``TosaTestRunner`` class.  The SUTs can then be enabled by
using the ``--sut-module`` flag.

# License

The *TOSA Reference Model* and TOSA Unit Tests are licensed under Apache-2.0.