6 Basics [basic]

6.8 Types [basic.types]

6.8.3 Optional extended floating-point types [basic.extended.fp]

If the implementation supports an extended floating-point type ([basic.fundamental]) whose properties are specified by the ISO/IEC/IEEE 60559 floating-point interchange format binary16, then the typedef-name std::float16_t is defined in the header <stdfloat> and names such a type, the macro __STDCPP_FLOAT16_T__ is defined ([cpp.predefined]), and the floating-point literal suffixes f16 and F16 are supported ([lex.fcon]).
If the implementation supports an extended floating-point type whose properties are specified by the ISO/IEC/IEEE 60559 floating-point interchange format binary32, then the typedef-name std::float32_t is defined in the header <stdfloat> and names such a type, the macro __STDCPP_FLOAT32_T__ is defined, and the floating-point literal suffixes f32 and F32 are supported.
If the implementation supports an extended floating-point type whose properties are specified by the ISO/IEC/IEEE 60559 floating-point interchange format binary64, then the typedef-name std::float64_t is defined in the header <stdfloat> and names such a type, the macro __STDCPP_FLOAT64_T__ is defined, and the floating-point literal suffixes f64 and F64 are supported.
If the implementation supports an extended floating-point type whose properties are specified by the ISO/IEC/IEEE 60559 floating-point interchange format binary128, then the typedef-name std::float128_t is defined in the header <stdfloat> and names such a type, the macro __STDCPP_FLOAT128_T__ is defined, and the floating-point literal suffixes f128 and F128 are supported.
If the implementation supports an extended floating-point type with the properties, as specified by ISO/IEC/IEEE 60559, of radix (b) of 2, storage width in bits (k) of 16, precision in bits (p) of 8, maximum exponent (emax) of 127, and exponent field width in bits (w) of 8, then the typedef-name std::bfloat16_t is defined in the header <stdfloat> and names such a type, the macro __STDCPP_BFLOAT16_T__ is defined, and the floating-point literal suffixes bf16 and BF16 are supported.
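The following is an illustrative sketch, not part of the normative text, of how a program might use the predefined macros above to detect which of the optional types are available. The function names supported_extended_fp_types and half_f32 are hypothetical, and the sketch assumes a compiler that provides __has_include.

```cpp
#include <string>
#include <vector>

#if __has_include(<stdfloat>)
#  include <stdfloat>  // defines the typedef-names for the supported types
#endif

// Hypothetical helper: lists the typedef-names that this implementation
// advertises through the corresponding predefined macros.
std::vector<std::string> supported_extended_fp_types() {
    std::vector<std::string> names;
#ifdef __STDCPP_FLOAT16_T__
    names.push_back("std::float16_t");
#endif
#ifdef __STDCPP_FLOAT32_T__
    names.push_back("std::float32_t");
#endif
#ifdef __STDCPP_FLOAT64_T__
    names.push_back("std::float64_t");
#endif
#ifdef __STDCPP_FLOAT128_T__
    names.push_back("std::float128_t");
#endif
#ifdef __STDCPP_BFLOAT16_T__
    names.push_back("std::bfloat16_t");
#endif
    return names;
}

#ifdef __STDCPP_FLOAT32_T__
// Compiled only when binary32 support is advertised; in that case the
// typedef-name and the f32 literal suffix are both available.
inline std::float32_t half_f32() { return 0.5f32; }
#endif
```

On an implementation that supports none of the optional types, the returned list is empty and the guarded function is not compiled.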
[Note 1:
A summary of the parameters for each type is given in Table 16.
The precision p includes the implicit 1 bit at the beginning of the mantissa, so the storage used for the mantissa is p − 1 bits.
ISO/IEC/IEEE 60559 does not assign a name for a type having the parameters specified for std::bfloat16_t.
— end note]
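As an illustration of the note, the parameters can be cross-checked at compile time. This is a sketch only: the names FpParams, stored_mantissa_bits, fills_storage, and emax_matches_width are hypothetical. It relies on the ISO/IEC/IEEE 60559 binary layout of one sign bit, w exponent bits, and p − 1 stored mantissa bits, and on the relation emax = 2^(w−1) − 1. The parameter values are those given in Table 16.

```cpp
// Hypothetical aggregate holding the parameters listed in Table 16.
struct FpParams {
    int k;     // storage width in bits
    int p;     // precision in bits, including the implicit leading bit
    int emax;  // maximum exponent
    int w;     // exponent field width in bits
};

// The leading mantissa bit is implicit, so only p - 1 bits are stored.
constexpr int stored_mantissa_bits(FpParams f) { return f.p - 1; }

// Sign bit + exponent field + stored mantissa bits occupy the full width.
constexpr bool fills_storage(FpParams f) {
    return 1 + f.w + stored_mantissa_bits(f) == f.k;
}

// The maximum exponent follows from the exponent field width.
constexpr bool emax_matches_width(FpParams f) {
    return f.emax == (1 << (f.w - 1)) - 1;
}

constexpr FpParams float16_params  {16,  11,  15,    5};
constexpr FpParams float32_params  {32,  24,  127,   8};
constexpr FpParams float64_params  {64,  53,  1023,  11};
constexpr FpParams float128_params {128, 113, 16383, 15};
constexpr FpParams bfloat16_params {16,  8,   127,   8};

static_assert(fills_storage(float16_params)  && emax_matches_width(float16_params),  "");
static_assert(fills_storage(float32_params)  && emax_matches_width(float32_params),  "");
static_assert(fills_storage(float64_params)  && emax_matches_width(float64_params),  "");
static_assert(fills_storage(float128_params) && emax_matches_width(float128_params), "");
static_assert(fills_storage(bfloat16_params) && emax_matches_width(bfloat16_params), "");
```

For example, std::bfloat16_t has p = 8 and therefore stores 7 mantissa bits, which together with the sign bit and the 8-bit exponent field gives the storage width of 16.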
Table 16: Properties of named extended floating-point types [tab:basic.extended.fp]
Parameter                        float16_t  float32_t  float64_t  float128_t  bfloat16_t
ISO/IEC/IEEE 60559 name          binary16   binary32   binary64   binary128
k, storage width in bits         16         32         64         128         16
p, precision in bits             11         24         53         113         8
emax, maximum exponent           15         127        1023       16383       127
w, exponent field width in bits  5          8          11         15          8
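A program can compare the table values against std::numeric_limits for whichever types the implementation provides. The sketch below is illustrative only: matches_table and checked_type_count are hypothetical names, and it assumes that the <stdfloat> header is present whenever one of the macros is defined. Note that std::numeric_limits<T>::max_exponent is one greater than emax as used in the table.

```cpp
#include <limits>

#if __has_include(<stdfloat>)
#  include <stdfloat>
#endif

// Hypothetical helper: true if T has radix 2, precision p, and maximum
// exponent emax as listed in the table (max_exponent is emax + 1).
template <class T>
constexpr bool matches_table(int p, int emax) {
    return std::numeric_limits<T>::radix == 2
        && std::numeric_limits<T>::digits == p
        && std::numeric_limits<T>::max_exponent == emax + 1;
}

// Checks every type the implementation advertises and returns how many
// types were checked (0 if none of the optional types is supported).
int checked_type_count() {
    int n = 0;
#ifdef __STDCPP_FLOAT16_T__
    static_assert(matches_table<std::float16_t>(11, 15));
    ++n;
#endif
#ifdef __STDCPP_FLOAT32_T__
    static_assert(matches_table<std::float32_t>(24, 127));
    ++n;
#endif
#ifdef __STDCPP_FLOAT64_T__
    static_assert(matches_table<std::float64_t>(53, 1023));
    ++n;
#endif
#ifdef __STDCPP_FLOAT128_T__
    static_assert(matches_table<std::float128_t>(113, 16383));
    ++n;
#endif
#ifdef __STDCPP_BFLOAT16_T__
    static_assert(matches_table<std::bfloat16_t>(8, 127));
    ++n;
#endif
    return n;
}
```

Each static_assert is compiled only when the corresponding macro is defined, so the sketch also builds on implementations that support none of the optional types.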
Recommended practice: Any names that the implementation provides for the extended floating-point types described in this subsection that are in addition to the names defined in the <stdfloat> header should be chosen to increase compatibility and interoperability with the interchange types _Float16, _Float32, _Float64, and _Float128 defined in ISO/IEC TS 18661-3 and with future versions of the C standard.