author | Eric Kunze <eric.kunze@arm.com> | 2022-05-13 14:54:06 -0700 |
---|---|---|
committer | Eric Kunze <eric.kunze@arm.com> | 2022-05-16 11:44:15 -0700 |
commit | eef012e19898ca86a8b9f0e6c1b2f30692bc6860 (patch) | |
tree | 4112426ff04a0e299d7fb541388a96a105558aaa /chapters/introduction.adoc | |
parent | 6de978203f071082afcc9090a6ca4c39e0273051 (diff) | |
Add the uint16_t data type
An unsigned 16-bit integer data type for use with image networks.
It is limited to use with the RESCALE operator, for conversion
to signed int16.
The zero point in the RESCALE can be 0 or 32768. A zero point of 32768
converts without loss of precision (by subtracting 32768); a zero
point of 0 keeps all values positive, with scaling/clipping as
defined by the other RESCALE arguments.
Change-Id: Id1aebab68fa207f8f8cc235fc3fa5d050307198e
Signed-off-by: Eric Kunze <eric.kunze@arm.com>
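
The effect of the two permitted zero points can be illustrated with a minimal Python sketch. This is not the RESCALE operator as the specification defines it: the helper name `uint16_to_int16` is hypothetical, and a scale of exactly 1 is assumed so that only the zero-point subtraction and the final clip to the int16 range are modelled.

```python
INT16_MIN, INT16_MAX = -32768, 32767
UINT16_MAX = 65535


def uint16_to_int16(value: int, input_zp: int) -> int:
    """Hypothetical helper: unit-scale uint16 -> int16 conversion.

    Assumption: the scale is 1, so the multiplier/shift arguments of the
    real RESCALE operator are not modelled here.
    """
    if not 0 <= value <= UINT16_MAX:
        raise ValueError("value is outside the uint16_t range")
    if input_zp not in (0, 32768):
        raise ValueError("zero point must be 0 or 32768 for uint16_t")
    shifted = value - input_zp
    # Clip to the signed 16-bit output range.
    return max(INT16_MIN, min(INT16_MAX, shifted))


# Zero point 32768: the full uint16 range maps one-to-one onto int16.
print(uint16_to_int16(0, 32768), uint16_to_int16(65535, 32768))
# Zero point 0: values stay non-negative; at unit scale the top half clips.
print(uint16_to_int16(1000, 0), uint16_to_int16(65535, 0))
```

With a zero point of 32768 every input has a distinct output, which is the "no loss of precision" case. With a zero point of 0 and the unit scale assumed here, inputs above 32767 saturate; in practice the other RESCALE arguments would supply a scale that avoids or controls that clipping.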
Diffstat (limited to 'chapters/introduction.adoc')
-rw-r--r-- | chapters/introduction.adoc | 11 |
1 files changed, 8 insertions, 3 deletions
diff --git a/chapters/introduction.adoc b/chapters/introduction.adoc
index 4263135..eafaaca 100644
--- a/chapters/introduction.adoc
+++ b/chapters/introduction.adoc
@@ -199,12 +199,12 @@ For details of interpreting the quantized data, see the <<Quantization Scaling>>
 |int4_t
 | -7
 | +7
-|Signed 4-bit two's-complement values. Excludes -8 to maintain a symmetric about zero range for weights.
+|Signed 4-bit two's-complement value. Excludes -8 to maintain a symmetric about zero range for weights.
 
 |int8_t
 | -128
 | +127
-|Signed 8-bit two's-complement values.
+|Signed 8-bit two's-complement value.
 
 |uint8_t
 | 0
@@ -214,7 +214,12 @@ For details of interpreting the quantized data, see the <<Quantization Scaling>>
 |int16_t
 | -32768
 | +32767
-|Signed 16-bit two's-complement values.
+|Signed 16-bit two's-complement value.
+
+|uint16_t
+| 0
+| 65535
+|Unsigned 16-bit value.
 
 |int32_t
 | -(1<<31)