# IEEE 754 Floating Point Type in C++

Let’s say we want to use IEEE 754 32/64-bit floating point types in C++. Then
there are `float` and `double`, right? Unfortunately, the C++ standard
guarantees almost nothing about the built-in floating point types.

> § 6.7.1.8 There are three floating-point types: float, double, and long double. The type double provides at least as much precision as float, and the type long double provides at least as much precision as double. The set of values of the type float is a subset of the set of values of the type double; the set of values of the type double is a subset of the set of values of the type long double.
>
> The value representation of floating-point types is implementation-defined. …

So are we just doomed? No! There is `std::numeric_limits`, which provides various floating point type traits, and there are neat C++ compile time tricks we can use to craft a clean type API. So let’s try. The goal is to construct the following IEEE 754 floating point types.

```
#include <iostream>

#include "ieee754_types.hpp"

int main() {
  IEEE_754_2008_Binary<32> x = 1.0;
  IEEE_754_2008_Binary<64> y = 2.0;
  std::cout << x + y << std::endl;

  // Compile time error if the requested type doesn't exist in the system.
  // IEEE_754_2008_Binary<16> z;
}
```

`IEEE_754_2008_Binary<n>` is the n-bit IEEE 754 floating point type. Of course,
on most systems, `IEEE_754_2008_Binary<32>` is `float` and
`IEEE_754_2008_Binary<64>` is `double`. If the requested type is not available,
like `IEEE_754_2008_Binary<16>`, it should cause a compile error with a clear
error message.

Well, I guess a natural question at this point is: “Do we really need this?
Can’t we just assume `float` and `double` are IEEE 754, because they actually
are on 99.9% of the systems out there?” I thought so too, but later I found that
people have submitted related proposals, N1703 and N3626, to the C/C++ standards
committees to fix this issue with additional standard types: `float16_t`,
`float32_t`, `float64_t`, and `float128_t`. So maybe it’s not entirely pointless
after all. Anyway, let’s get started.

First, let’s begin by checking whether a given type, `T`, fulfills IEEE 754 and
the other desired conditions.

```
template <int storage_bits, int exponent_bits, int mantissa_bits>
struct Is_Ieee754_2008_Binary_Interchange_Format {
  template <typename T>
  static constexpr bool value =
      ::std::is_floating_point<T>() &&
      ::std::numeric_limits<T>::is_iec559 &&
      ::std::numeric_limits<T>::radix == 2 &&
      get_storage_bits<T>() == storage_bits &&
      get_exponent_bits<T>() == exponent_bits &&
      get_mantissa_bits<T>() == mantissa_bits;
};
```

We used a variable template for the type-dependent boolean computation, and wrapped it in a template struct so that we can easily pass it around as a type template parameter later.

First, we check whether `T` complies with IEEE 754 (equivalently, IEC 60559)
using `std::numeric_limits<T>::is_iec559`. We should also check `radix == 2`,
since IEEE 754 defines two kinds of floating point formats, binary and decimal.
Finally, we check whether `T` has the requested number of storage (width),
exponent, and mantissa bits. IEEE 754 defines the standard number of exponent
and mantissa bits for certain sizes, 16, 32, 64, 128, 160, …, but it also allows
implementations to have arbitrary sizes and bits (e.g., the x86 extended
precision format), so we need to check that `T` has the exact format we want.
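To make the "exact format" point concrete, here is a minimal sketch that assumes only `<limits>`. The helper name `has_binary128_layout` is hypothetical and not part of the header built in this post. It compares a type's significand width and exponent range against the binary128 values, which a wider-than-`double` type does not necessarily match.

```cpp
#include <limits>

// Hypothetical helper for illustration (not from ieee754_types.hpp): true only
// if T has the significand width and exponent range of IEEE 754 binary128.
template <typename T>
constexpr bool has_binary128_layout() {
  return std::numeric_limits<T>::radix == 2 &&
         std::numeric_limits<T>::digits == 113 &&  // 112 stored + 1 implicit
         std::numeric_limits<T>::max_exponent == 16384;
}

// The x86 extended precision format has the same exponent range as binary128
// but only 64 significand digits, so a `long double` using that format fails
// the `digits` comparison even though it is wider than `double`.
```

On a platform where `long double` uses the x86 extended precision format, `has_binary128_layout<long double>()` would therefore be `false`, which is why checking the size alone is not enough.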

We can calculate the number of bits of `T` with the following simple compile
time functions.

```
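#include <climits>  // CHAR_BIT
#include <limits>   // std::numeric_limits

// Total width of T in bits.
template <typename T>
constexpr int get_storage_bits() {
  return sizeof(T) * CHAR_BIT;
}

// Minimum number of bits needed to represent T's exponent range.
template <typename T>
constexpr int get_exponent_bits() {
  int exponent_range = ::std::numeric_limits<T>::max_exponent -
                       ::std::numeric_limits<T>::min_exponent;
  int bits = 0;
  while ((exponent_range >> bits) > 0) ++bits;
  return bits;
}

// Number of explicitly stored mantissa bits (the leading bit is implicit).
template <typename T>
constexpr int get_mantissa_bits() {
  return ::std::numeric_limits<T>::digits - 1;
}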
```

For the mantissa bits, the leading bit is implicit, so we need to subtract 1
from `digits`. For the exponent bits, there is no direct property available in
`std::numeric_limits`, so instead we calculate the minimum number of bits
required to represent its exponent range.
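As a quick sanity check of the exponent arithmetic, the sketch below copies the same loop into a standalone function that takes the range as plain arguments. The name `bits_for_exponent_range` is hypothetical, and the `max_exponent` / `min_exponent` pairs are the values `std::numeric_limits` reports for the standard IEEE 754 binary formats.

```cpp
// Standalone copy of the bit-counting loop, taking the exponent limits as
// plain arguments so the arithmetic can be checked without a concrete type.
// `bits_for_exponent_range` is a hypothetical name used only for illustration.
constexpr int bits_for_exponent_range(int max_exponent, int min_exponent) {
  int exponent_range = max_exponent - min_exponent;
  int bits = 0;
  while ((exponent_range >> bits) > 0) ++bits;
  return bits;
}

// binary16:  16 - (-13) = 29
static_assert(bits_for_exponent_range(16, -13) == 5, "binary16");
// binary32:  128 - (-125) = 253
static_assert(bits_for_exponent_range(128, -125) == 8, "binary32");
// binary64:  1024 - (-1021) = 2045
static_assert(bits_for_exponent_range(1024, -1021) == 11, "binary64");
// binary128: 16384 - (-16381) = 32765
static_assert(bits_for_exponent_range(16384, -16381) == 15, "binary128");
```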

Now we have everything needed to figure out whether a given `T` is the type
we’re looking for. The next step is to automatically select such a type among
the built-in floating point types, `float`, `double`, and `long double`, given
the size in bits, e.g., 32 or 64. This is where it gets interesting.

The following recursive function, `find_type()`, selects a type among `T` and
`Ts` that satisfies the condition `C`. In our case, `T` and `Ts` are
`float, double, long double`, and `C` is the struct we defined previously,
`Is_Ieee754_2008_Binary_Interchange_Format<storage_bits, exponent_bits, mantissa_bits>`.

```
template <typename C, typename T, typename... Ts>
constexpr auto find_type() {
  throw;

  if constexpr (C::template value<T>) {
    return T();
  } else if constexpr (sizeof...(Ts) >= 1) {
    return find_type<C, Ts...>();
  } else {
    return void();
  }
}
```

`typename... Ts` is a parameter pack that can match any number of types. So on
the first call, `T` will be `float` and `Ts` will be `double, long double`. The
first if condition, `C::template value<T>`, checks whether `T` satisfies the
condition given by `C`; if so, it returns a default instance of `T`. The second
if condition, `sizeof...(Ts) >= 1`, checks whether there are more types in `Ts`
to examine; if so, it recursively calls itself, `find_type()`, with `Ts` to
continue the search. Finally, if there is nothing left in `Ts`, it returns a
void instance.

Note that since the return type of `find_type()` is `auto`, the return type is
deduced from what `find_type()` returns at compile time. In addition,
`if constexpr` discards the unused conditional paths at compile time, so even
though `find_type()` has multiple return statements with different types, it
compiles successfully.

Since `find_type()`’s return type is what we need, we can use
`decltype(find_type<...>())` to get it. The `throw;` statement on the first line
of `find_type()` is not necessary, but it’s there to indicate that `find_type()`
is not supposed to be called directly.
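To see the selection in isolation, here is a minimal sketch that runs the same search with a simpler condition. `IsFloatingPoint` is a hypothetical condition struct made up for this example, and the `throw;` marker is left out for brevity since the function is only ever named inside `decltype`.

```cpp
#include <type_traits>

// Same shape as find_type() above, minus the `throw;` marker. It is only used
// inside decltype here, so it is never actually called.
template <typename C, typename T, typename... Ts>
constexpr auto find_type() {
  if constexpr (C::template value<T>) {
    return T();
  } else if constexpr (sizeof...(Ts) >= 1) {
    return find_type<C, Ts...>();
  } else {
    return void();
  }
}

// Hypothetical condition for illustration: accepts any floating point type.
struct IsFloatingPoint {
  template <typename T>
  static constexpr bool value = std::is_floating_point_v<T>;
};

// `int` and `char` are rejected, so the search stops at `double`.
using Found = decltype(find_type<IsFloatingPoint, int, char, double>());
static_assert(std::is_same_v<Found, double>, "first match wins");

// No candidate matches, so the deduced return type is `void`.
using NotFound = decltype(find_type<IsFloatingPoint, int, char>());
static_assert(std::is_same_v<NotFound, void>, "no match yields void");
```

The condition is passed as a type with a member variable template, which is the same shape as `Is_Ieee754_2008_Binary_Interchange_Format`.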

The following code defines the `BinaryFloatOrVoid` type using
`decltype(find_type<...>())`. The newly defined type, `BinaryFloatOrVoid`, will
be an IEEE 754 floating point type that matches the given storage, exponent, and
mantissa bits, or `void` if the search fails.

```
template <int storage_bits,
          int exponent_bits =
              standard_binary_interchange_format_exponent_bits<storage_bits>(),
          int mantissa_bits =
              standard_binary_interchange_format_mantissa_bits<storage_bits>()>
using BinaryFloatOrVoid =
    decltype(find_type<                                                //
             Is_Ieee754_2008_Binary_Interchange_Format<storage_bits,   //
                                                       exponent_bits,  //
                                                       mantissa_bits>,
             float, double, long double>());
```

The `standard_binary_interchange_format_exponent_bits()` and
`standard_binary_interchange_format_mantissa_bits()` functions just return the
standard number of exponent and mantissa bits, respectively, and we set them as
the default values for `exponent_bits` and `mantissa_bits` for convenience. I
will omit their actual implementations, as they are pretty straightforward and
uninteresting.
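For readers who want something to compile against, here is one possible sketch of those two helpers. It is an assumption made for illustration, not the repository's implementation: it covers only the 16, 32, 64, and 128-bit formats and returns 0 exponent bits for any other width, an arbitrary placeholder that no real type can match.

```cpp
// A minimal sketch, NOT the repository's implementation. It covers only the
// widths 16, 32, 64, and 128; the `default` value of 0 is a placeholder chosen
// for this example so that unsupported widths never match a real type.
template <int storage_bits>
constexpr int standard_binary_interchange_format_exponent_bits() {
  switch (storage_bits) {
    case 16:  return 5;
    case 32:  return 8;
    case 64:  return 11;
    case 128: return 15;
    default:  return 0;
  }
}

// Whatever is left after the sign bit and the exponent field is the stored
// (explicit) part of the mantissa.
template <int storage_bits>
constexpr int standard_binary_interchange_format_mantissa_bits() {
  return storage_bits - 1 -
         standard_binary_interchange_format_exponent_bits<storage_bits>();
}

static_assert(standard_binary_interchange_format_exponent_bits<32>() == 8, "");
static_assert(standard_binary_interchange_format_mantissa_bits<32>() == 23, "");
static_assert(standard_binary_interchange_format_mantissa_bits<64>() == 52, "");
```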

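Traditionally, before `if constexpr` became available in C++17, this kind of
compile time type manipulation was implemented with template metaprogramming
techniques such as SFINAE and partial specialization. The following code shows
how it can be done that way, using a partially specialized struct and
`std::conditional_t`.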

```
// Recursion termination. Type not found.
template <typename C, typename... Ts>
struct FindType {
  using type = void;
};

// Recursion
template <typename C, typename T, typename... Ts>
struct FindType<C, T, Ts...> {
  // Set `type = T` if T satisfies the condition, C. Otherwise, keep
  // searching in the remaining types, Ts... .
  using type = ::std::conditional_t<  //
      C::template value<T>, T, typename FindType<C, Ts...>::type>;
};

template <int storage_bits,
          int exponent_bits =
              standard_binary_interchange_format_exponent_bits<storage_bits>(),
          int mantissa_bits =
              standard_binary_interchange_format_mantissa_bits<storage_bits>()>
using BinaryFloatOrVoid = typename FindType<                  //
    Is_Ieee754_2008_Binary_Interchange_Format<storage_bits,   //
                                              exponent_bits,  //
                                              mantissa_bits>,
    float, double, long double>::type;
```

Clearly, the `if constexpr` version is simpler and a lot more readable, and I
expect to see less of the SFINAE mess thanks to `if constexpr` (and hopefully
concepts) in the future.

Lastly, we introduce another type layer to cause a compile error with a nice
error message in case the requested type is not available, i.e.,
`BinaryFloatOrVoid` is `void`.

```
template <typename T>
struct AssertTypeFound {
  static_assert(
      !::std::is_same_v<T, void>,
      "No corresponding IEEE 754-2008 binary interchange format found.");
  using type = T;
};

template <int storage_bits>
using IEEE_754_2008_Binary = typename AssertTypeFound<
    BinaryFloatOrVoid<storage_bits>>::type;
```

OK, finally, we have constructed the type `IEEE_754_2008_Binary<n>`, which
guarantees the IEEE 754 standard binary interchange format. Yay!
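The assertion layer can be tried on its own. The following is a minimal sketch that repeats `AssertTypeFound` so it compiles standalone; the aliases `Ok` and `Broken` are hypothetical names used only here.

```cpp
#include <type_traits>

// Repeated from above so that this snippet compiles on its own.
template <typename T>
struct AssertTypeFound {
  static_assert(
      !std::is_same_v<T, void>,
      "No corresponding IEEE 754-2008 binary interchange format found.");
  using type = T;
};

// A successful lookup passes through unchanged.
using Ok = typename AssertTypeFound<double>::type;
static_assert(std::is_same_v<Ok, double>, "type is forwarded as-is");

// Asking for the nested `type` of the `void` case instantiates the struct and
// fires the static_assert. Uncomment to see the error message:
// using Broken = typename AssertTypeFound<void>::type;
```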

So are we done now? Not quite, there is one last step that every programmer loves: writing tests. :)

```
template <int storage_bits, int exponent_bits, int mantissa_bits>
void test_if_type_exists() {
  throw;

  if constexpr (!::std::is_same_v<BinaryFloatOrVoid<storage_bits>, void>) {
    using T = IEEE_754_2008_Binary<storage_bits>;
    static_assert(::std::is_floating_point<T>(), "");
    static_assert(::std::numeric_limits<T>::is_iec559, "");
    static_assert(::std::numeric_limits<T>::radix == 2, "");
    static_assert(get_storage_bits<T>() == storage_bits, "");
    static_assert(get_exponent_bits<T>() == exponent_bits, "");
    static_assert(get_mantissa_bits<T>() == mantissa_bits, "");
  }
}

void tests() {
  throw;

  test_if_type_exists<16, 5, 10>();
  test_if_type_exists<32, 8, 23>();
  test_if_type_exists<64, 11, 52>();
  test_if_type_exists<128, 15, 112>();
}
```

Again, all the checks are done at compile time with `static_assert`, so we don’t
need to call `tests()`; we just have to ensure that the `test_if_type_exists`
functions are instantiated. If a type doesn’t exist (e.g., the 16- and 128-bit
types on most systems), then `if constexpr` will simply discard the checks.

I hope you had fun, like I did. The full implementation is available in this repository https://github.com/kkimdev/ieee754-types .