Floating-point representation in programming is inherently error prone: we cannot perfectly represent real numbers without an infinite number of bits. As a result we have a standard (https://en.wikipedia.org/wiki/IEEE_754) with some quirky behaviours. One of these is that the error in representing real numbers gets larger as the number you are trying to represent moves away from 0. The scale of this error, and how fast it grows, depends on the number of bits available for the representation, and it is not always obvious when this error occurs.
If we perform multiple arithmetic operations, the rounding errors from those calculations can accumulate. To calculate this error we need to understand the maximum error for a given number. As a small guide, I have created a graph (below) as an easy reference.
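To get a concrete feel for the scale (a minimal sketch of my own, not taken from the graph's data), a few lines of C++ will print the gap to the next representable float at different magnitudes:

```cpp
#include <cmath>
#include <iostream>

int main()
{
    // The gap to the next representable float (one "unit in the last
    // place") doubles every time the value crosses a power of two.
    for (float x : {1.0f, 1000.0f, 1000000.0f})
        std::cout << "gap after " << x << " is "
                  << std::nextafter(x, 2.0f * x) - x << '\n';
}
```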
In a later article I will go into how this can affect even simple geometric or purely numerical calculations.
For now I am going to show some code for calculating this error, and then use it to test the accuracy of the standard library random number generator and understand the actual range of results it can produce.
In this example we want to choose a range for the random number generator, look at the floating-point representation of that range to see how many possible outcomes we could have, and then test whether we actually get those outcomes and how frequently each one is hit, to see how often we get duplicates within the range.
First step - getting the number of float values available in our range (between low and high).
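A sketch of that calculation is below; the helper name count_float_values and the uint64_t return type are my choices, not necessarily what the original code used:

```cpp
#include <cmath>
#include <cstdint>

// Estimate how many float values lie in [low, high] by measuring the
// step to the next representable float at the lower bound, then
// dividing the width of the range by that step in double precision.
// Only meaningful while the step size stays constant across the range.
std::uint64_t count_float_values(float low, float high)
{
    const double step = static_cast<double>(std::nextafter(low, high)) -
                        static_cast<double>(low);
    const double width = static_cast<double>(high) - static_cast<double>(low);
    return static_cast<std::uint64_t>(width / step) + 1; // +1 to include both bounds
}
```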
In this example we use std::nextafter to find the next possible floating-point value after the lower bound, cast to a larger bit representation (to reduce error in this calculation), and divide the range by that step size. This will only give correct results for small ranges, as the step size changes as the numbers get bigger. (Calculating this perfectly for large ranges requires other, less concise approaches, which will be shown in a later post.)
Next, we want to create our random device and initialise the range for it to work on:
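A minimal version of that setup might look like this ('_low' and '_high' are the bounds named below; the example values and the engine choice are my own):

```cpp
#include <random>

const float _low  = 1000.0f;  // example bounds; any range works
const float _high = 1001.0f;

std::random_device rd;        // non-deterministic seed source
std::mt19937 engine{rd()};    // Mersenne Twister engine, seeded once
std::uniform_real_distribution<float> dist{_low, _high}; // nominally [_low, _high)
```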
This code creates a device which produces random numbers in the range between '_low' and '_high'. Pretty standard stuff.
Now we want to create a std::map which maps each floating-point result to a counter. This will let us record every distinct outcome from our random number generator along with the number of times that value was output.
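For example (the variable name is my own):

```cpp
#include <map>

// Key: a distinct float returned by the generator.
// Value: how many times that exact value has been produced.
std::map<float, int> occurrences;
```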
The final step is simply putting this all together in a function that populates the std::map and outputs the results.
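A sketch of such a function, reusing the pieces above (the name test_random_range and the exact output format are assumptions on my part):

```cpp
#include <iostream>
#include <map>
#include <random>

// Draw 'samples' values from a uniform distribution over [low, high),
// count each distinct result, then print the number of unique values
// followed by every value and its hit count.
void test_random_range(float low, float high, int samples)
{
    std::random_device rd;
    std::mt19937 engine{rd()};
    std::uniform_real_distribution<float> dist{low, high};

    std::map<float, int> occurrences;
    for (int i = 0; i < samples; ++i)
        ++occurrences[dist(engine)];

    std::cout << "range [" << low << ", " << high << ") -> "
              << occurrences.size() << " unique values\n";
    for (const auto& [value, count] : occurrences)
        std::cout << "  " << value << " : " << count << '\n';
}
```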
For 10,000 samples, this function outputs one line per distinct value, together with the number of times it was hit.
Then, if we write a loop, we can look at this output for different magnitudes of input values:
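A possible driver loop, assuming the functions sketched above; the exact magnitudes tested in the original output are not shown here, so these offsets are illustrative (they do include the 1000-1001 and 10 million cases discussed below):

```cpp
int main()
{
    // One-unit-wide ranges at increasing magnitudes, ending at the
    // 10 million case where only the two bound values remain.
    for (float low : {1.0f, 10.0f, 1000.0f, 100000.0f, 10000000.0f})
        test_random_range(low, low + 1.0f, 10000);
}
```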
If we look at the results, we can see how the number of possible values matches the graph at the start of this post. The maximum number of outcomes from our random number generator slowly shrinks until, at 10 million, we can only get the left or right bound value. At such a large number that is quite understandable, and probably something you have run into when working with float values in similar situations. However, you will notice that even at small numbers, between 1000 and 1001 we are quantised to only 16385 possible results. For a lot of applications that is a huge margin of error compared to the accuracy you get between zero and one.
To resolve this we can use bigger data types, but that isn't always ideal. Quite a lot of papers instead recommend working at offsets and then rounding back up to reduce error, but this doesn't totally solve the problem. I will be doing more posts in this area as I finish my research on a related topic, where I will go over some of the coping strategies frequently used by applications that don't need 100% accuracy in their results.
Full source code for generating the above output:
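The original listing is not reproduced here, so below is a self-contained sketch assembling the pieces from this post; names and output formatting are my own:

```cpp
#include <cmath>
#include <cstdint>
#include <iostream>
#include <map>
#include <random>

// Estimate how many float values lie in [low, high]; only meaningful
// while the step size stays constant across the range.
std::uint64_t count_float_values(float low, float high)
{
    const double step = static_cast<double>(std::nextafter(low, high)) -
                        static_cast<double>(low);
    const double width = static_cast<double>(high) - static_cast<double>(low);
    return static_cast<std::uint64_t>(width / step) + 1;
}

// Sample the generator and report how many distinct floats appear.
// Per-value hit counts are omitted here to keep the output short; see
// the function sketch earlier in the post for the full listing.
void test_random_range(float low, float high, int samples)
{
    std::random_device rd;
    std::mt19937 engine{rd()};
    std::uniform_real_distribution<float> dist{low, high};

    std::map<float, int> occurrences;
    for (int i = 0; i < samples; ++i)
        ++occurrences[dist(engine)];

    std::cout << "range [" << low << ", " << high << ") : "
              << count_float_values(low, high) << " representable values, "
              << occurrences.size() << " unique values in "
              << samples << " samples\n";
}

int main()
{
    for (float low : {1.0f, 10.0f, 1000.0f, 100000.0f, 10000000.0f})
        test_random_range(low, low + 1.0f, 10000);
}
```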