author | Eric Kunze <eric.kunze@arm.com> | 2022-04-07 16:54:46 -0700 |
---|---|---|
committer | Eric Kunze <eric.kunze@arm.com> | 2022-06-17 20:38:16 +0000 |
commit | 42229d03fe55c45f0ad2ba68f190f3d68a78ae79 (patch) | |
tree | fde2487db3fe2c4e8257beec9b54044fac9da931 /chapters/introduction.adoc | |
parent | f9e5ba94f12a71f088c790f532cd62d33b8d25d0 (diff) | |
Initial work on floating-point type definition

Define operations in terms of common floating-point data
types. Definitions for the data types are in the introduction.
Added a section to describe status of the different profiles.

Signed-off-by: Eric Kunze <eric.kunze@arm.com>
Change-Id: Iac57026806acfb7913f40af61176322fb02b7cc1
Diffstat (limited to 'chapters/introduction.adoc')
-rw-r--r-- | chapters/introduction.adoc | 24 |
1 files changed, 22 insertions, 2 deletions
diff --git a/chapters/introduction.adoc b/chapters/introduction.adoc
index 9b2e0c0..93206ca 100644
--- a/chapters/introduction.adoc
+++ b/chapters/introduction.adoc
@@ -106,6 +106,16 @@ The following table summarizes the three profiles:
 |Main Training|TOSA-MT|Yes|Yes|Yes
 |===
 
+=== Status
+
+The TOSA specification is a work in progress.
+
+* The Base Inference profile should be considered to be near release quality, with conformance tests available.
+* The Main Inference profile has most of the expected operators in place, but is still subject to change.
+* The reference model and conformance tests do not yet support all of the floating point types that have been defined.
+* There is not currently a conformance test suite available for Main Inference.
+* Main Training profile is pre-alpha, significant work still needs to be done for the profile, and no conformance tests are available.
+
 === Compliance
 
 This section defines when a TOSA implementation is compliant to a given TOSA specification profile.
@@ -267,10 +277,20 @@ The number formats supported by a given operator are listed in its table of supp
 | (1<<47)-1
 |Signed 48-bit two's-complement value.
 
-|float_t
+|fp16_t
+| -infinity
+| +infinity
+| 16-bit floating-point value.
+
+|bf16_t
+| -infinity
+| +infinity
+| 16-bit brain float value.
+
+|fp32_t
 | -infinity
 | +infinity
-|floating-point number. Must have features defined in the section <<Floating-point>>.
+| 32-bit floating-point value.
 |===
 
 Note: In this specification minimum<type> and maximum<type> will denote the minimum and maximum values of the data as stored in memory (ignoring the zero point).
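
The patch replaces the single `float_t` with three concrete types, each listed with a range of -infinity to +infinity. As a rough illustration of how the three differ in storage, the Python sketch below prints their bit patterns. It assumes that `fp16_t` and `fp32_t` correspond to IEEE 754 binary16 and binary32, and that `bf16_t` is the upper 16 bits of a binary32 value; the diff itself does not state these encodings. The helper names are hypothetical and are not part of the specification or the reference model.

```python
import struct


def fp32_bits(x: float) -> int:
    """Bit pattern of x after rounding to IEEE 754 binary32 (assumed fp32_t)."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def fp16_bits(x: float) -> int:
    """Bit pattern of x after rounding to IEEE 754 binary16 (assumed fp16_t)."""
    return struct.unpack("<H", struct.pack("<e", x))[0]


def bf16_bits_truncate(x: float) -> int:
    """Assumed bf16_t pattern: top 16 bits of binary32.

    This truncates the low mantissa bits rather than rounding, which is
    a simplification chosen for this sketch.
    """
    return fp32_bits(x) >> 16


def bf16_to_float(bits: int) -> float:
    """Expand a 16-bit bfloat16 pattern back to a Python float."""
    return struct.unpack("<f", struct.pack("<I", (bits & 0xFFFF) << 16))[0]


if __name__ == "__main__":
    for value in (1.0, float("-inf"), float("inf")):
        print(
            f"{value!r:>5}: "
            f"fp16=0x{fp16_bits(value):04X} "
            f"bf16=0x{bf16_bits_truncate(value):04X} "
            f"fp32=0x{fp32_bits(value):08X}"
        )
```

Under these assumptions, both infinities have a valid encoding in all three formats, which is consistent with the minimum and maximum columns in the updated table.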