Short explanation of unum type 3 (posit) arithmetic

Posits are a tapered floating point number format. They are ordered the same way that signed integers of the same width are ordered, meaning that signed integer comparison can also be used to compare posits.
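As a small illustration of the ordering property, here is a sketch in Python that reinterprets 16-bit patterns as two's complement integers and sorts them. The helper name as_signed is made up for this example, and the four patterns are the ones decoded later in this post.

```python
def as_signed(bits: str) -> int:
    """Interpret a bit string as a two's complement signed integer."""
    n = int(bits, 2)
    if bits[0] == "1":
        n -= 1 << len(bits)
    return n

patterns = [
    "0111011100001110",
    "1010101000010111",
    "0011101001000100",
    "1010100101110101",
]

# Sorting by the signed integer view also sorts the posits by value:
# the two negative numbers come first, then the two positive ones.
ordered = sorted(patterns, key=as_signed)
for bits in ordered:
    print(bits, as_signed(bits))
```

No posit decoding is needed for the comparison, which is the point of the property.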

Posits are relatively easy to read from left to right once you know how many bits there are in the base exponent. 16-bit posits reserve 1 bit for the exponent, 32-bit posits get 3 bits of exponent, and 64-bit posits get 4 bits of exponent.

For example, let's consider these four 16-bit numbers, together with the two special bit patterns listed after them.

0111011100001110
1010101000010111
0011101001000100
1010100101110101
0000000000000000
1000000000000000

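First we retrieve the sign bit. If it is set, take the two's complement of the whole 16-bit word and read the remaining 15 bits from the result. The last two patterns are special: all zeros is zero, and a lone sign bit is ±infinity.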

0 111011100001110 = + 111011100001110
1 010101000010111 = - 101010111101001
0 011101001000100 = + 011101001000100 
1 010100101110101 = - 101011010001011
0 000000000000000 = 0
1 000000000000000 = ±infinity
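The sign step can be sketched in Python as follows. This is only an illustration under the conventions used above. The function name split_sign and its return shape are assumptions made for this example, not part of any posit library.

```python
def split_sign(bits: str):
    """Return (sign, magnitude) for a posit given as a bit string.

    sign is '+', '-', '0' or 'inf'. magnitude holds the bits that
    follow the sign bit, after two's complement negation of negative
    patterns. It is None for the two special patterns.
    """
    width = len(bits)
    n = int(bits, 2)
    if n == 0:
        return "0", None
    if n == 1 << (width - 1):
        return "inf", None
    if bits[0] == "1":
        n = (1 << width) - n  # two's complement of the whole word
        sign = "-"
    else:
        sign = "+"
    return sign, format(n, f"0{width}b")[1:]

print(split_sign("1010101000010111"))  # ('-', '101010111101001')
```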

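Next we can decode the number into regime, exponent and fraction. On the right-hand side the regime is written as a signed value and the fraction is shown with its implicit leading 1.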

+ 1110 1 1100001110 = + +2 1 1.1100001110
- 10 1 010111101001 = - +0 1 1.010111101001
+ 01 1 101001000100 = + -1 1 1.101001000100
- 10 1 011010001011 = - +0 1 1.011010001011

The regime is a variable-sized field that can push the fraction and the exponent entirely out of the number. It is a run of identical bits that ends at the first bit that differs from the others, and its numeric value depends on how many bits wide the run is. Here's the beginning of the translation table to illustrate how it's read out as a value.

0001 = -3
001 = -2
01 = -1
10 = +0
110 = +1
1110 = +2

The regime field gives numbers near 1 in magnitude, which are common in calculations, more accuracy than very large or very small numbers.
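Here is a rough Python sketch of the field split, assuming the sign has already been removed and the remaining bits are given as a string. The name split_fields and the es parameter (number of base exponent bits) are chosen for this example only.

```python
def split_fields(magnitude: str, es: int = 1):
    """Split sign-less posit bits into (regime, exponent_bits, fraction_bits)."""
    first = magnitude[0]
    # Length of the leading run of identical bits.
    run = len(magnitude) - len(magnitude.lstrip(first))
    # A run of n ones means regime n-1, a run of n zeros means regime -n.
    regime = run - 1 if first == "1" else -run
    rest = magnitude[run + 1:]  # skip the bit that terminates the run
    # A long regime can leave fewer than es exponent bits, or none at all.
    return regime, rest[:es], rest[es:]

print(split_fields("101010111101001"))  # (0, '1', '010111101001')
print(split_fields("111011100001110"))  # (2, '1', '1100001110')
```

The second call shows the tapering: a regime that is two bits longer leaves two fewer bits for the fraction.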

The regime extends the exponent: it acts as an additional, more significant digit of the exponent. This means that the regime value is multiplied by the range of the exponent field (2 raised to the number of exponent bits) and added to the exponent.

Since the 16-bit numbers have a one-bit base exponent, multiply the regime by 2 and add it to the exponent. The combined exponent is written in binary. Now if we decode these numbers further, we are going to get:

+ 101 1.110000111000 = +111000.0111000 (56.437..)
- 001 1.010111101001 =     -10.10111101001 (-2.738..)
+ -01 1.101001000100 =      +0.1101001000100 (0.820..)
-  01 1.011010001011 =     -10.11010001011 (-2.817..)

The binary point in the fraction is shifted right by the value of the combined exponent. If the exponent is negative, the point shifts left.
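Putting the steps together, a minimal decoder could look like the sketch below. It uses Python's exact Fraction type, so no rounding happens along the way. The function name decode_posit and the es parameter are assumptions for this example, with es=1 matching the 16-bit layout described in this post.

```python
from fractions import Fraction

def decode_posit(bits: str, es: int = 1):
    """Decode a posit bit string to an exact Fraction (None for ±infinity)."""
    width = len(bits)
    n = int(bits, 2)
    if n == 0:
        return Fraction(0)
    if n == 1 << (width - 1):
        return None  # the single ±infinity pattern
    sign = 1
    if bits[0] == "1":
        sign = -1
        n = (1 << width) - n  # two's complement of the whole word
    body = format(n, f"0{width}b")[1:]

    # Regime: a run of identical bits, ended by the first differing bit.
    first = body[0]
    run = len(body) - len(body.lstrip(first))
    regime = run - 1 if first == "1" else -run
    rest = body[run + 1:]

    # Exponent bits pushed out by a long regime are read as zeros.
    exp_bits = rest[:es].ljust(es, "0")
    frac_bits = rest[es:]
    exponent = regime * (1 << es) + int("0" + exp_bits, 2)

    # Fraction with its implicit leading 1, as an exact ratio.
    significand = Fraction(int("1" + frac_bits, 2), 1 << len(frac_bits))
    return sign * significand * Fraction(2) ** exponent

print(float(decode_posit("0111011100001110")))  # 56.4375
print(float(decode_posit("0011101001000100")))  # 0.82080078125
```

For the second call the regime is -1 and the exponent bit is 1, so the combined exponent is -2 + 1 = -1 and the binary point moves one place to the left.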

Posits may be a better format than IEEE 754 floating point because they are conceptually simpler and easier to understand. Posits have 4 different kinds of bit patterns: {0/inf, ±num}. With floats you get 6 kinds: {±0/denorm, ±norm, ±inf/nan}.

The main difference to IEEE floating point here is the tapered accuracy. That may have a significant impact on the quality of the results, especially with the higher precision numbers.

2019-07-15 updated: switched "probably" to "may be" in second last paragraph. I was excited about this at the time of writing and don't want to express as much certainty as I did. Technically it doesn't change the content, as the argument was based on the differences in the binary format.

If you need a similar description about floating point to compare, there's one in Wikipedia: Single-precision floating-point format

Accurate use of any floating point format requires that you can reliably predict the amount of numerical error in the result. It is something you may like to pay attention to if you have to work with floating point formats regularly.

There's been recent indication that Posits aren't performing well on this aspect at all:

2019-12-15 updated: Explained regime&exponent field encoding better, assuming less about what the reader knows of floating point formats.
