Short explanation of unum type 3 (posit) arithmetic

Posits are a tapered floating point number format. They are ordered the same way as signed integers of the same width, meaning that signed integer comparison can also be used to compare posits.
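As a minimal sketch of that ordering property, the snippet below reinterprets 16-bit patterns as two's complement integers and compares them. The helper names as_signed and posit_less_than are made up for this illustration and do not come from any posit library.

```python
def as_signed(bits: int, width: int = 16) -> int:
    """Reinterpret an unsigned bit pattern as a two's complement signed integer."""
    mask = (1 << width) - 1
    bits &= mask
    # If the top bit is set, the pattern represents bits - 2**width.
    return bits - (1 << width) if bits >> (width - 1) else bits


def posit_less_than(a: int, b: int, width: int = 16) -> bool:
    """Compare two posit bit patterns using plain signed integer comparison."""
    return as_signed(a, width) < as_signed(b, width)


# Two negative patterns: the one with the smaller signed integer is the smaller posit.
print(posit_less_than(0b1010100101110101, 0b1010101000010111))  # True
```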

Posits are relatively easy to read from left to right, once you know how many bits there are in the base exponent. 16-bit posits reserve 1 bit for the exponent, 32-bit posits get 3 bits of exponent, and 64-bit posits get 4 bits of exponent.

For example, let's consider these six 16-bit numbers: four ordinary values and the two special patterns.

0111011100001110
1010101000010111
0011101001000100
1010100101110101
0000000000000000
1000000000000000

First we retrieve the sign, which is the leftmost bit. If the sign is set, take the two's complement of the whole number before reading the remaining 15 bits.

0 111011100001110 = + 111011100001110
1 010101000010111 = - 101010111101001
0 011101001000100 = + 011101001000100 
1 010100101110101 = - 101011010001011
0 000000000000000 = 0
1 000000000000000 = ±infinity
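The sign step can be sketched in a few lines. This is only an illustration under the assumptions of this text (16-bit words, with all-zero meaning 0 and a lone sign bit meaning ±infinity). The function name split_sign and its return convention are hypothetical.

```python
def split_sign(bits: int, width: int = 16):
    """Return (sign, body), where body holds the remaining width-1 bits.

    sign is "+", "-", "0" or "inf". The two special patterns have no body.
    """
    mask = (1 << width) - 1
    bits &= mask
    if bits == 0:
        return ("0", 0)
    if bits == 1 << (width - 1):
        return ("inf", 0)
    if bits >> (width - 1):
        # Sign bit set: take the two's complement of the whole word.
        return ("-", (-bits) & mask)
    return ("+", bits)


sign, body = split_sign(0b1010101000010111)
print(sign, format(body, "015b"))  # - 101010111101001
```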

Next we can decode the number into regime, exponent and fraction.

+ 1110 1 1100001110 = + +2 1 1.1100001110
- 10 1 010111101001 = - +0 1 1.010111101001
+ 01 1 101001000100 = + -1 1 1.101001000100
- 10 1 011010001011 = - +0 1 1.011010001011

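The regime is a variable-sized field that can push the fraction and the exponent entirely out of the number. It is a run of identical bits and ends at the first bit that differs from the others. A run of n ones encodes the regime value n-1, and a run of n zeros encodes -n. Numbers near 1 in magnitude, which are common in calculations, have short regimes and therefore more fraction bits, so they get more accuracy than very large or very small numbers.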
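Splitting the 15 bits that follow the sign into the three fields could look roughly like this. The function name split_fields and the string-based approach are choices made for this sketch. It assumes a one-bit base exponent by default, and that exponent bits pushed out of the word read as zero.

```python
def split_fields(body: int, es: int = 1, body_width: int = 15):
    """Split the bits after the sign into (regime, exponent, fraction_bits)."""
    bits = format(body, f"0{body_width}b")
    first = bits[0]
    # Length of the leading run of identical bits.
    run = len(bits) - len(bits.lstrip(first))
    # n ones encode n-1, n zeros encode -n.
    regime = run - 1 if first == "1" else -run
    # Skip the run and the bit that terminates it.
    rest = bits[run + 1:]
    # Exponent bits that were pushed out of the word count as zeros.
    exp_bits = rest[:es].ljust(es, "0")
    exponent = int(exp_bits, 2) if es else 0
    fraction_bits = rest[es:]
    return regime, exponent, fraction_bits


print(split_fields(0b111011100001110))  # (2, 1, '1100001110')
```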

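The regime extends the exponent. Since the 16-bit numbers have a one-bit base exponent, we multiply the regime by 2 and add the exponent bit to get the total power of two. For example, a regime of +2 with exponent bit 1 gives 2*2 + 1 = 5, or 101 in binary. Now if we decode these numbers further, we are going to get: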

+ 101 1.110000111000 = +111000.0111000 (56.437..)
- 001 1.010111101001 =     -10.10111101001 (-2.738..)
+ -01 1.101001000100 =      +0.1101001000100 (0.820..)
-  01 1.011010001011 =     -10.11010001011 (-2.817..)
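The final step, combining regime, exponent and fraction into a value, can be sketched as follows. The function name posit_value and its argument layout are hypothetical. The sketch assumes the fields have already been separated, and it uses ordinary Python floats, which hold these 16-bit examples exactly.

```python
def posit_value(sign: int, regime: int, exponent: int,
                fraction_bits: str, es: int = 1) -> float:
    """Combine decoded posit fields into a float.

    sign is +1 or -1, fraction_bits is the fraction as a string of 0s and 1s.
    """
    # The regime counts in steps of 2**es on top of the base exponent.
    scale = regime * (1 << es) + exponent
    # Hidden leading 1, followed by the fraction bits.
    significand = 1.0
    if fraction_bits:
        significand += int(fraction_bits, 2) / (1 << len(fraction_bits))
    return sign * significand * 2.0 ** scale


print(posit_value(+1, 2, 1, "1100001110"))  # 56.4375
```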

Posits are probably a better format than IEEE 754 floating point because they are conceptually simpler and easier to understand. Posits have 4 different kinds of bit pattern: {0/inf, ±num}. With floats you get 6 kinds: {±0/denorm, ±norm, ±inf/nan}.

The main difference from IEEE floating point here is the tapered accuracy. That may have a significant impact on the quality of the results, especially with the higher precision numbers.