From c577f2c6a3b4ddb6ba87a882723c53a248afbeba Mon Sep 17 00:00:00 2001 From: telsoa01 Date: Fri, 31 Aug 2018 09:22:23 +0100 Subject: Release 18.08 --- third-party/half/README.txt | 288 ++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 288 insertions(+) create mode 100644 third-party/half/README.txt (limited to 'third-party/half/README.txt') diff --git a/third-party/half/README.txt b/third-party/half/README.txt new file mode 100644 index 0000000000..3a0960c125 --- /dev/null +++ b/third-party/half/README.txt @@ -0,0 +1,288 @@ +HALF-PRECISION FLOATING POINT LIBRARY (Version 1.12.0) +------------------------------------------------------ + +This is a C++ header-only library to provide an IEEE 754 conformant 16-bit +half-precision floating point type along with corresponding arithmetic +operators, type conversions and common mathematical functions. It aims for both +efficiency and ease of use, trying to accurately mimic the behaviour of the +builtin floating point types at the best performance possible. + + +INSTALLATION AND REQUIREMENTS +----------------------------- + +Comfortably enough, the library consists of just a single header file +containing all the functionality, which can be directly included by your +projects, without the neccessity to build anything or link to anything. + +Whereas this library is fully C++98-compatible, it can profit from certain +C++11 features. Support for those features is checked automatically at compile +(or rather preprocessing) time, but can be explicitly enabled or disabled by +defining the corresponding preprocessor symbols to either 1 or 0 yourself. This +is useful when the automatic detection fails (for more exotic implementations) +or when a feature should be explicitly disabled: + + - 'long long' integer type for mathematical functions returning 'long long' + results (enabled for VC++ 2003 and newer, gcc and clang, overridable with + 'HALF_ENABLE_CPP11_LONG_LONG'). + + - Static assertions for extended compile-time checks (enabled for VC++ 2010, + gcc 4.3, clang 2.9 and newer, overridable with 'HALF_ENABLE_CPP11_STATIC_ASSERT'). + + - Generalized constant expressions (enabled for VC++ 2015, gcc 4.6, clang 3.1 + and newer, overridable with 'HALF_ENABLE_CPP11_CONSTEXPR'). + + - noexcept exception specifications (enabled for VC++ 2015, gcc 4.6, clang 3.0 + and newer, overridable with 'HALF_ENABLE_CPP11_NOEXCEPT'). + + - User-defined literals for half-precision literals to work (enabled for + VC++ 2015, gcc 4.7, clang 3.1 and newer, overridable with + 'HALF_ENABLE_CPP11_USER_LITERALS'). + + - Type traits and template meta-programming features from + (enabled for VC++ 2010, libstdc++ 4.3, libc++ and newer, overridable with + 'HALF_ENABLE_CPP11_TYPE_TRAITS'). + + - Special integer types from (enabled for VC++ 2010, libstdc++ 4.3, + libc++ and newer, overridable with 'HALF_ENABLE_CPP11_CSTDINT'). + + - Certain C++11 single-precision mathematical functions from for + an improved implementation of their half-precision counterparts to work + (enabled for VC++ 2013, libstdc++ 4.3, libc++ and newer, overridable with + 'HALF_ENABLE_CPP11_CMATH'). + + - Hash functor 'std::hash' from (enabled for VC++ 2010, + libstdc++ 4.3, libc++ and newer, overridable with 'HALF_ENABLE_CPP11_HASH'). + +The library has been tested successfully with Visual C++ 2005-2015, gcc 4.4-4.8 +and clang 3.1. Please contact me if you have any problems, suggestions or even +just success testing it on other platforms. + + +DOCUMENTATION +------------- + +Here follow some general words about the usage of the library and its +implementation. For a complete documentation of its iterface look at the +corresponding website http://half.sourceforge.net. You may also generate the +complete developer documentation from the library's only include file's doxygen +comments, but this is more relevant to developers rather than mere users (for +reasons described below). + +BASIC USAGE + +To make use of the library just include its only header file half.hpp, which +defines all half-precision functionality inside the 'half_float' namespace. The +actual 16-bit half-precision data type is represented by the 'half' type. This +type behaves like the builtin floating point types as much as possible, +supporting the usual arithmetic, comparison and streaming operators, which +makes its use pretty straight-forward: + + using half_float::half; + half a(3.4), b(5); + half c = a * b; + c += 3; + if(c > a) + std::cout << c << std::endl; + +Additionally the 'half_float' namespace also defines half-precision versions +for all mathematical functions of the C++ standard library, which can be used +directly through ADL: + + half a(-3.14159); + half s = sin(abs(a)); + long l = lround(s); + +You may also specify explicit half-precision literals, since the library +provides a user-defined literal inside the 'half_float::literal' namespace, +which you just need to import (assuming support for C++11 user-defined literals): + + using namespace half_float::literal; + half x = 1.0_h; + +Furthermore the library provides proper specializations for +'std::numeric_limits', defining various implementation properties, and +'std::hash' for hashing half-precision numbers (assuming support for C++11 +'std::hash'). Similar to the corresponding preprocessor symbols from +the library also defines the 'HUGE_VALH' constant and maybe the 'FP_FAST_FMAH' +symbol. + +CONVERSIONS AND ROUNDING + +The half is explicitly constructible/convertible from a single-precision float +argument. Thus it is also explicitly constructible/convertible from any type +implicitly convertible to float, but constructing it from types like double or +int will involve the usual warnings arising when implicitly converting those to +float because of the lost precision. On the one hand those warnings are +intentional, because converting those types to half neccessarily also reduces +precision. But on the other hand they are raised for explicit conversions from +those types, when the user knows what he is doing. So if those warnings keep +bugging you, then you won't get around first explicitly converting to float +before converting to half, or use the 'half_cast' described below. In addition +you can also directly assign float values to halfs. + +In contrast to the float-to-half conversion, which reduces precision, the +conversion from half to float (and thus to any other type implicitly +convertible from float) is implicit, because all values represetable with +half-precision are also representable with single-precision. This way the +half-to-float conversion behaves similar to the builtin float-to-double +conversion and all arithmetic expressions involving both half-precision and +single-precision arguments will be of single-precision type. This way you can +also directly use the mathematical functions of the C++ standard library, +though in this case you will invoke the single-precision versions which will +also return single-precision values, which is (even if maybe performing the +exact same computation, see below) not as conceptually clean when working in a +half-precision environment. + +The default rounding mode for conversions from float to half uses truncation +(round toward zero, but mapping overflows to infinity) for rounding values not +representable exactly in half-precision. This is the fastest rounding possible +and is usually sufficient. But by redefining the 'HALF_ROUND_STYLE' +preprocessor symbol (before including half.hpp) this default can be overridden +with one of the other standard rounding modes using their respective constants +or the equivalent values of 'std::float_round_style' (it can even be +synchronized with the underlying single-precision implementation by defining it +to 'std::numeric_limits::round_style'): + + - 'std::round_indeterminate' or -1 for the fastest rounding (default). + + - 'std::round_toward_zero' or 0 for rounding toward zero. + + - std::round_to_nearest' or 1 for rounding to the nearest value. + + - std::round_toward_infinity' or 2 for rounding toward positive infinity. + + - std::round_toward_neg_infinity' or 3 for rounding toward negative infinity. + +In addition to changing the overall default rounding mode one can also use the +'half_cast'. This converts between half and any built-in arithmetic type using +a configurable rounding mode (or the default rounding mode if none is +specified). In addition to a configurable rounding mode, 'half_cast' has +another big difference to a mere 'static_cast': Any conversions are performed +directly using the given rounding mode, without any intermediate conversion +to/from 'float'. This is especially relevant for conversions to integer types, +which don't necessarily truncate anymore. But also for conversions from +'double' or 'long double' this may produce more precise results than a +pre-conversion to 'float' using the single-precision implementation's current +rounding mode would. + + half a = half_cast(4.2); + half b = half_cast::round_style>(4.2f); + assert( half_cast( 0.7_h ) == 1 ); + assert( half_cast( 4097 ) == 4096.0_h ); + assert( half_cast( 4097 ) == 4100.0_h ); + assert( half_cast( std::numeric_limits::min() ) > 0.0_h ); + +When using round to nearest (either as default or through 'half_cast') ties are +by default resolved by rounding them away from zero (and thus equal to the +behaviour of the 'round' function). But by redefining the +'HALF_ROUND_TIES_TO_EVEN' preprocessor symbol to 1 (before including half.hpp) +this default can be changed to the slightly slower but less biased and more +IEEE-conformant behaviour of rounding half-way cases to the nearest even value. + + #define HALF_ROUND_TIES_TO_EVEN 1 + #include + ... + assert( half_cast(3.5_h) + == half_cast(4.5_h) ); + +IMPLEMENTATION + +For performance reasons (and ease of implementation) many of the mathematical +functions provided by the library as well as all arithmetic operations are +actually carried out in single-precision under the hood, calling to the C++ +standard library implementations of those functions whenever appropriate, +meaning the arguments are converted to floats and the result back to half. But +to reduce the conversion overhead as much as possible any temporary values +inside of lengthy expressions are kept in single-precision as long as possible, +while still maintaining a strong half-precision type to the outside world. Only +when finally assigning the value to a half or calling a function that works +directly on halfs is the actual conversion done (or never, when further +converting the result to float. + +This approach has two implications. First of all you have to treat the +library's documentation at http://half.sourceforge.net as a simplified version, +describing the behaviour of the library as if implemented this way. The actual +argument and return types of functions and operators may involve other internal +types (feel free to generate the exact developer documentation from the Doxygen +comments in the library's header file if you really need to). But nevertheless +the behaviour is exactly like specified in the documentation. The other +implication is, that in the presence of rounding errors or over-/underflows +arithmetic expressions may produce different results when compared to +converting to half-precision after each individual operation: + + half a = std::numeric_limits::max() * 2.0_h / 2.0_h; // a = MAX + half b = half(std::numeric_limits::max() * 2.0_h) / 2.0_h; // b = INF + assert( a != b ); + +But this should only be a problem in very few cases. One last word has to be +said when talking about performance. Even with its efforts in reducing +conversion overhead as much as possible, the software half-precision +implementation can most probably not beat the direct use of single-precision +computations. Usually using actual float values for all computations and +temproraries and using halfs only for storage is the recommended way. On the +one hand this somehow makes the provided mathematical functions obsolete +(especially in light of the implicit conversion from half to float), but +nevertheless the goal of this library was to provide a complete and +conceptually clean half-precision implementation, to which the standard +mathematical functions belong, even if usually not needed. + +IEEE CONFORMANCE + +The half type uses the standard IEEE representation with 1 sign bit, 5 exponent +bits and 10 mantissa bits (11 when counting the hidden bit). It supports all +types of special values, like subnormal values, infinity and NaNs. But there +are some limitations to the complete conformance to the IEEE 754 standard: + + - The implementation does not differentiate between signalling and quiet + NaNs, this means operations on halfs are not specified to trap on + signalling NaNs (though they may, see last point). + + - Though arithmetic operations are internally rounded to single-precision + using the underlying single-precision implementation's current rounding + mode, those values are then converted to half-precision using the default + half-precision rounding mode (changed by defining 'HALF_ROUND_STYLE' + accordingly). This mixture of rounding modes is also the reason why + 'std::numeric_limits::round_style' may actually return + 'std::round_indeterminate' when half- and single-precision rounding modes + don't match. + + - Because of internal truncation it may also be that certain single-precision + NaNs will be wrongly converted to half-precision infinity, though this is + very unlikely to happen, since most single-precision implementations don't + tend to only set the lowest bits of a NaN mantissa. + + - The implementation does not provide any floating point exceptions, thus + arithmetic operations or mathematical functions are not specified to invoke + proper floating point exceptions. But due to many functions implemented in + single-precision, those may still invoke floating point exceptions of the + underlying single-precision implementation. + +Some of those points could have been circumvented by controlling the floating +point environment using or implementing a similar exception mechanism. +But this would have required excessive runtime checks giving two high an impact +on performance for something that is rarely ever needed. If you really need to +rely on proper floating point exceptions, it is recommended to explicitly +perform computations using the built-in floating point types to be on the safe +side. In the same way, if you really need to rely on a particular rounding +behaviour, it is recommended to either use single-precision computations and +explicitly convert the result to half-precision using 'half_cast' and +specifying the desired rounding mode, or synchronize the default half-precision +rounding mode to the rounding mode of the single-precision implementation (most +likely 'HALF_ROUND_STYLE=1', 'HALF_ROUND_TIES_TO_EVEN=1'). But this is really +considered an expert-scenario that should be used only when necessary, since +actually working with half-precision usually comes with a certain +tolerance/ignorance of exactness considerations and proper rounding comes with +a certain performance cost. + + +CREDITS AND CONTACT +------------------- + +This library is developed by CHRISTIAN RAU and released under the MIT License +(see LICENSE.txt). If you have any questions or problems with it, feel free to +contact me at rauy@users.sourceforge.net. + +Additional credit goes to JEROEN VAN DER ZIJP for his paper on "Fast Half Float +Conversions", whose algorithms have been used in the library for converting +between half-precision and single-precision values. -- cgit v1.2.1