How to build x264 on Windows

I recently had a bit of an ordeal getting x264 to build on Windows, so I am writing the steps out here, as the other online guides are either out of date or missing information:

Step One: You need the x264 source. This is mirrored in a few locations:

Use git to pull down a copy of the source from a bash command prompt:
git clone {address of the source from one of those sites here...}

Step Two: You need a supported compiler. For ease of use I went with MinGW.

The installer for MinGW can be found here: https://sourceforge.net/projects/mingw/files/Installer/

When running that installer you will be given a list of packages to install from the installation manager.

From this manager ensure you have msys-base, g++ and mingw32-base selected. Then go into MSYS -> MSYS Base System and tick msys-bash as well. Finally, go to 'Installation -> Apply Changes' to download those packages.

Step Three: The documentation doesn't tell you this, but on Windows you need to manually download NASM, otherwise the configure step will complain a lot.

Download here: http://www.nasm.us/pub/nasm/releasebuilds/2.13.02/win64/

The default install directory is in your user AppData rather than Program Files (unless you launch the installer with admin privileges). Find the folder you installed to - it will probably look something like 'C:\Users\{YOURNAME}\AppData\Local\bin\NASM' - and add it to your PATH.

Step Four: You need to configure the build directory.

Open the MinGW shell you installed in step two, change to the directory you pulled the source code to, and run './configure'.


Step Five: Run make

Once that finishes you should have the compiled x264.exe in the source directory.

I hope this saves you time!


Addition!

Step Six: Did you want MP4 support? That is enabled automatically in the configure step if you have GPAC installed.
Download here and add to PATH: https://gpac.wp.imt.fr/downloads/gpac-nightly-builds/

When can we approximate results?

We have put a few posts up now showing different techniques and thoughts on how to implement some kind of approximation to improve performance in your applications. What we haven't covered is identifying which properties a program needs so that it can be approximated, and how we can approach that process from a high level.

What is required of the program?
To approach the 'what' question we can start by classifying functions and algorithms into different types to understand how we can approach them:

  • No Golden Answers
    This is probably the most obvious condition that has to be met before a program can be approached with approximation methods. The program must be allowed to return something other than one specific, perfect answer.
     
  • Resilient Results
    When we call a program resilient we mean that the returned result can be off by some margin and still be gracefully accepted. An example of this is the sensors on a mobile device: the sensor data is often noisy or slightly unreliable, so the device filters out anomalous readings and still lands on, or near enough to, the correct conclusion. The same can be said for image encoding, which takes correct information and returns an image that is not exactly correct but is close enough to be accepted by the user, as the difference sits within the acceptable bounds of their vision.

JPEG compression is a good example of this. Wikipedia has a good illustration in which the quality of the image decreases across the frame, and for a significant portion of that quality loss the image remains acceptably clear.

  • Stochastic Results
    If you are familiar with "real-time" raytracing you will have encountered these types of approximation frequently, in approaches such as Monte-Carlo algorithms and similar stochastic techniques. These programs return a random sampling of the answer that is statistically sound but cannot be predicted exactly. This is useful when we want to sample parts of a final answer where calculating the whole thing would be too costly.

One project I was involved in used stochastic methods to approximate lightmaps which converge on the correct result over time.

If a program is able to correct over time like this, then it is possible to approximate a final answer before the work is fully complete, ideally while staying within the bounds of acceptable error.

  • Self-correcting Program
    A program that can self-correct from an erroneous input is definitely able to be exploited with approximation, as long as the cost of that correction is low. In the post detailing approximate memoisation we gave the example of a path-finding algorithm that gave near-enough results. That was acceptable because the user of those near-enough results was able to correct for the slight offset. Other examples of self-correcting programs include online multi-player games, which use prediction to reduce the amount of network traffic required for each player. This type of prediction works because the small difference between where the game thinks a player is and where the player ends up is compensated for, hopefully subtly and without impacting gameplay. This isn't always the case; some games allow for too much prediction and you see visible rubber-banding - that is where the approximation cannot correct accurately for the lack of information.
     
  • Fail Rare/Fail Small
    This type of scenario was covered in the article about approximation with multi-threading. When part of an application is allowed to fail but the program is able to compensate and continue anyway, then it is open to being exploited for performance gains. This can take many forms as long as the error is handled correctly. Mobile phones in particular exploit this by sleeping or slowing some features when they think the user isn't interested in the results. It is also applied in some network architectures.

How can we approach approximation?
For the 'how' question we need to define the different types of approximation at a high level:

  • Operation Level Approximation
    We are able to change how our data interacts at a low level; for example we can change "a+a+a" to "3*a" - in floating point these two expressions can give different results despite being mathematically identical. This translation is therefore technically an approximation, because we have made the result different from the programmer's intent (to our knowledge). Normally a compiler won't make this change for that reason, but if we relax that rule we can do operation-level transformations like this to hunt for optimisations. This can be taken further by being more loose, such as saying we only care about accuracy to +/- 0.1, so any number passed to an addition that is smaller than that we throw away and skip the addition entirely (see the sketch after this list).
  • Data-type Level Approximation - This is the most commonly found approximation in regular programming. We can change how we represent data in the code with more or less precision. This is normally done through the standard types (float, double, long double... etc) but could also be expressed through custom software types with their own interesting behaviours.
  • Algorithm Level Approximation - This is the most common approach for the types of programs that are open to approximation listed above. This type of optimisation can be almost anything: loop perforation, step skipping, probabilistic algorithms, limited search, aggressive early-out optimisation - really whatever you can think of that will cause the results to be returned in an approximated manner.
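Here is the sketch mentioned above for the operation-level idea - a minimal, hand-written example (the names are mine, not from any library) where an addition is skipped when its operand falls below the accuracy we have agreed to accept. Note the skipped error still accumulates per operation, so the threshold has to be chosen with the call count in mind.

constexpr float kAccuracyThreshold = 0.1f; // we said we only care about +/- 0.1

float ApproxAccumulate(float accumulator, float value)
{
	// Skip the work entirely when the contribution is smaller than the
	// accuracy we care about.
	if (value > -kAccuracyThreshold && value < kAccuracyThreshold)
	{
		return accumulator;
	}
	return accumulator + value;
}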
     

That pretty much covers the basics of when and how to approach approximation in a modern code project - but don't forget some of the old techniques are still around and doing great too. It could be argued that many search and learning algorithms that cannot be backed by a robust and provable mathematical model are simply approximating the result as best they can.

Storing Nearby Answers in a Function Level Cache

We are all familiar with compilers being helpful by taking duplicate calls to side-effect-free functions and changing them to only be called once to save on wasted computation - effectively creating a local cache of answers within a scope. For example:
 

void someFunction(int a, int b)
{	
	int c = Add(a,b);
	int d = Add(a,b);
	int e = Add(a,b);
	
}

Can Become....

void someFunction(int a, int b)
{
	int f = Add(a,b);
	
	int c = f;
	int d = f;
	int e = f;	
}

This rearranging effectively saves two calls to 'Add', and in long functions where there are multiple uses, the new 'f' variable is a little cache to reference back to.

In a real-time scenario we might manually code a wrapper around some of our functions to store the results of many calls to a complex function for given inputs. This would allow us to keep a cache of function results at a global scope, so that a call to 'Add(1,1)' which had already been computed earlier could be returned from memory. This is very useful in the real world for things like journey planning, where the directions from one place in a country to another may take a long time to compute, so they might be stored if they are frequently requested to save on expensively recalculating a result which shouldn't change.

Taking this example of journey planning, we may have a long journey from 'Edinburgh Castle' to '21B Baker Street, London' which took a long time on our system to accurately calculate and give to our user. Now, if another user wants the directions from 'Edinburgh Castle' to '22B Baker Street, London', which is just next door to the end point of the last journey we calculated, we are given two choices: a) we can spend our time and resources calculating a new path, or b) we can give them the directions to 21B Baker Street and trust that they will read the house numbers and find the right place. In this example we are treating the end-user as an element in our system that is resilient to errors of up to one house. If we wanted to be robust we would take the directions for 21B Baker Street and then append the much smaller and more easily computed directions for finding the house next door, but we will consider this unnecessary for the current use case.

This kind of optimisation, using a function call cache with some approximation, can be applied to many types of functions and algorithms for a limited overhead, and for expensive functions it will produce huge wins - but only when the next stage of the process is resilient to the level of error it is given. This type of optimisation wouldn't work if the range of error allowed in the inputs is too great.
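Before going fully generic, a hand-rolled version of this idea for a single expensive function might look like the sketch below ('PlanJourney' and the cache layout are illustrative assumptions, not code from a real routing system):

#include <cmath>
#include <vector>

// Illustrative stand-in for an expensive routing call (not a real API).
double PlanJourney(double startId, double endId)
{
	return endId - startId; // pretend this took seconds of work
}

// Hand-rolled approximate cache: if we already planned a journey from the same
// start whose end point is within 'tolerance' of the requested one, reuse it.
double PlanJourneyCached(double startId, double endId, double tolerance)
{
	struct Entry { double start, end, result; };
	static std::vector<Entry> cache;

	for (const Entry& e : cache)
	{
		if (e.start == startId && std::fabs(e.end - endId) <= tolerance)
		{
			return e.result; // near enough - trust the next stage to cope
		}
	}

	const double result = PlanJourney(startId, endId);
	cache.push_back({ startId, endId, result });
	return result;
}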

 So, what if I wanted to apply this generically to any function in my program? Well, that's where we can get a bit silly with templates and create a function which looks a bit like this:

template<typename T>
bool CheckValid(const T& inputValue, const T& referenceValue, const double& percentage)
{
    const double min = (double)(referenceValue)-(double)(referenceValue*(1.0 - percentage));
    const double max = (double)(referenceValue)+(double)(referenceValue*(1.0 - percentage));
    if (inputValue < min || inputValue > max)
    {
        return false;
    }
    return true;
}

template<typename T, typename ret, typename argV>
const argV MemberVariablePointer(const char* voidFi, const int& _index, const T& _offsets)
{
    const char* dataPointer = voidFi + _offsets[_index];// +sizeof(ret);
    const argV* d = (const argV*)dataPointer;
    return *d;
}

//Memoise Function
#define CACHE_SIZE 10
template<typename Fn, Fn func, int percentageErrorAllowed, typename... Args>
typename std::result_of<Fn(Args...)>::type MemoiseWrapperFunction(Args&&... args)
{
    ////Approximate Information
    static std::array< FunctionInformation<typename std::result_of<Fn(Args...)>::type, Args...>, CACHE_SIZE > globalFunctionResultCache;
    static      int             currentIndex        = 0;
    constexpr	double          percentageError	    = double(percentageErrorAllowed) / (100.f);
    constexpr	unsigned int    inputVariableCount  = sizeof...(Args);
    constexpr	auto            argOffsets          = GetTypesOffsets<Args...>();

    //Check if there is a stored close enough value
    for (int ii = 0; ii < CACHE_SIZE; ii++)
    {
        const char*	voidFi			= (const char*)&globalFunctionResultCache[ii];
        bool		inRange			= true;
        int			variableIndex	= 0;

        const bool results[] = 
                                { (CheckValid( args, MemberVariablePointer  <   decltype(argOffsets), 
                                                                                typename std::result_of<Fn(Args...)>::type,
                                                                                typename std::remove_reference<decltype(args)>::type 
                                                                            >	(voidFi, variableIndex++, argOffsets) , percentageError))... };
        for (const bool res : results)
        {
            if (!res)
            {
                inRange = false; break;
            }
        }

        if (inRange)
        {
            //We have a match in the cache.
            return globalFunctionResultCache[ii]._returnValue;
        }
    }
    
    //No match found.
    //Do work and store cache result
    const int writeindex = currentIndex;
    globalFunctionResultCache[writeindex]._returnValue = func(std::forward<Args>(args)...);
    globalFunctionResultCache[writeindex].Set(std::forward<Args>(args)...);

    //Setup next cache index
    currentIndex++;
    currentIndex = currentIndex % CACHE_SIZE;

    //Return results
    return globalFunctionResultCache[writeindex]._returnValue;
}
#define MemoWrapper(FUNC, ERRORPERC) MemoiseWrapperFunction<decltype(&FUNC), &FUNC, ERRORPERC>

WAIT! Don't run! It isn't as terrible as it looks!

The static and constexpr values near the top of the wrapper are needed to store the results of the function and to calculate whether new inputs are within range of any entry in the cache.

The cache is implemented as a fixed size array. (Note: in a more robust implementation you can remove a lot of overhead and value checking with a hashing function that stores results nearby based on input values - but that is a little complex for this example).

The loop that follows simply checks the cache for stored input variables that are near enough to allow us to return a pre-computed answer. Most of the work is in the initialisation of the 'results' array, which gets an entry for each input value saying whether it is valid or not. This is done through the expansion of the variadic input, making it a little difficult to read, as that feature of C++ is a little troublesome to write nicely. It could be tidied into a simpler form if we changed the cache system, but it is robust and clear enough for this demo.

After the loop we simply run the wrapped function and store the result in the next cache slot. (Note: an ideal implementation wouldn't evict whichever slot happens to be next, but would replace the "coolest" slot so that the most frequently referenced entries stay in the cache.)
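As a side sketch of that note (not part of the wrapper above), the round-robin index could be replaced by stamping each slot when it is hit and evicting the least recently used one:

// Sketch of the "replace the coolest slot" idea: each cache slot carries a
// timestamp that is refreshed on a hit, and eviction picks the oldest one.
struct SlotInfo { unsigned long long lastUsed = 0; };

int PickSlotToEvict(const SlotInfo* slots, int slotCount)
{
	int coolest = 0;
	for (int i = 1; i < slotCount; i++)
	{
		if (slots[i].lastUsed < slots[coolest].lastUsed)
		{
			coolest = i;
		}
	}
	return coolest;
}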

The templates for the MemoiseWrapperFunction allow us to pass in a target function and the percentage accuracy we want. This means we call the wrapped function like this:

int add(int _a, int _b)
{
	return _a + _b;
}

void ApproxCacheTest()
{
	int ff0 = MemoWrapper(add, 80)(5, 5);   // returns 10
	int ff1 = MemoWrapper(add, 80)(2, 1);   // returns 3
	int ff5 = MemoWrapper(add, 80)(5, 33);  // returns 38
	int ff6 = MemoWrapper(add, 80)(5, 6);   // returns 10 because ff0 is already
                                            // in the cache and within the approximate range (80%).
}

(Note: This calling convention could be changed to be like our overrides in our generated approximate examples from last month, but this is fine for a simple working prototype)

As you can see in the example code above, the result of ff6 is given the result from ff0 because each input in (5,6) is within 80% of the range of (5,5).

This example uses a simple addition function, which is obviously cheaper than the overhead of checking the cache and comparing values. In a real-world system you would want to replace it with a very complex and expensive function, like the journey-planning example.

This is a fairly complex implementation of quite a simple idea, because it has been made into a generic solution. For a lot of problems, a hand-crafted solution might have a lower overhead.

When using this system, the size and behaviour of the cache will play a large part in whether it is effective at reducing work in very large scale systems, because the additional overhead needs to be justified by a high enough cache hit rate.

I will follow this post up with a more detailed and tidy example implementing the ideas I have dotted around this implementation.

Thanks for reading!

 

 

Measuring Error in Float Functions

As I have written about before, we all know that our floating point calculations contain error from accumulated rounding, and that this error can be distributed unevenly across the range of inputs for a function, depending on the distribution of results in each calculation.

Checking this error at run-time would involve a lot of overhead but having a method to analyse the function before the program is released is useful for detecting any edge cases and getting an idea of the minimum and maximum error to help you calculate a correct epsilon for handling errors. (Ask me about epsilons in physics libraries to get my full rant on this...)

Luckily, this kind of function checking is very simple. We just need to create a replacement 'float' type which mimics all floating point calculations at a higher precision and stores the resulting error in the type. This type can then combine its own error with others when they are used together to get an understanding of the cumulative offset.

A class which replaces a standard type like this will need to have the correct constructors and operator overrides to cover all the use cases so that we don't get compile errors or type mismatches or even worse - blindly building a new instance and losing our cumulative error count.

The main elements of this class are the 'float' value it is wrapping and a high precision error counter. We can't use float for this error counter as it will be vulnerable to the same level of rounding error as the float that it is wrapping.

(Optionally: I added a string to keep the name of the float to make debugging and checking that we are counting the error correctly easier)

With that in mind, you should end up with a class that looks something like this:

#include <cmath>
#include <string>

class FloatError
{
public:

	FloatError(float _x)
	{
		static int floatIDCounter = 0;

		m_name = "FloatID_";
		m_name.append(std::to_string(floatIDCounter));

		m_value = _x;
		m_error = 0.0;
	}

        /* other constructors go here... */

	//Operators overload
	FloatError operator+(const FloatError& _in)
	{
		float val = m_value + _in.m_value;
		long double highResResult = ((long double)m_value + (long double)_in.m_value);

		long double error = std::abs(highResResult - (long double)val);
		m_error += error + _in.m_error;
		return FloatError(val, m_error);
	}

        /* Others go here... */

	std::string m_name;
	float m_value;
	long double m_error;
};

This implementation of the operator+ override is the important part here. Each operation is repeated while cast to the higher precision type. The difference of these two results is then computed to give us a high level approximation of the rounding error that has been incurred. This result is then combined with the errors of the two numbers being summed to give us the total error in the final result.

It would be tempting in this implementation not to take the absolute value, so that positive and negative rounding errors cancel out and we track the signed offset from the correct answer rather than an upper bound on the accumulated error. However, that would only be correct when values are added or subtracted. To handle all operations we would have to model how the offset carried by one operand of a multiply or divide scales the error of the other - a more complex problem than we are addressing here. In this instance we are only analysing the total rounding error - for now.

With this all in place we want to be able to use this class on a function taking floats. The example we will use is a function which takes a fixed array of floating point numbers, adds them all together and returns the sum.

#define float FloatError
float Sum(std::array<float, 200>& _in)
{
	float err(0.0);
	for (auto i : _in)
	{
		err = err + i;
	}
	return err;
}
#undef float

In the code example you will see that we have wrapped the function in a define to override the float type and replace it with our FloatError type. We can then call this function from our test function, which generates arrays of random values in a fixed range and then tests the Sum function with those values to try and find the maximum and minimum error in that range. (Note: for any array of random values in this example there should exist values that produce zero error.)

void TestFloatError(const float _min = 0.f, const float _max = 1000.f, const int _sampleCount = 50)
{
	std::array<FloatError, 200> numbers;
	float lo = _min;
	float hi = _max;
	long double minerr = 1000000.f;
	long double maxerr = 0.f;

	for (int j = 0; j < _sampleCount; j++)
	{
		for (int i = 0; i < 200; i++)
		{
			numbers[i] = getRandom(lo, hi);
		}

		FloatError errRes = Sum(numbers);

		std::cout << "ERROR: " << errRes.m_error << ".\n";

		minerr = std::min(minerr, errRes.m_error);
		maxerr = std::max(maxerr, errRes.m_error);
	}
	std::cout << "MIN ERROR: " << minerr << ".\n";
	std::cout << "MAX ERROR: " << maxerr << ".\n";
}
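The 'getRandom' helper isn't shown in the listing; assuming it just returns a uniformly distributed float in [lo, hi], a minimal version using <random> could look like this:

#include <random>

// Assumed behaviour of the helper used above: a uniform random float in [_lo, _hi].
float getRandom(float _lo, float _hi)
{
	static std::mt19937 generator{ std::random_device{}() };
	std::uniform_real_distribution<float> distribution(_lo, _hi);
	return distribution(generator);
}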

Once it has tested the function multiple times it will output the min and max error, as well as the error for each sample as it runs. In the end it looks like this for the range (0, 1000.f):

ERROR: 0.219635.
ERROR: 0.202362.
ERROR: 0.211487.
ERROR: 0.231384.
ERROR: 0.206512.

... Many results ...

ERROR: 0.218292.
ERROR: 0.212372.
ERROR: 0.235107.
ERROR: 0.181488.
ERROR: 0.217163.
ERROR: 0.231659.
ERROR: 0.227386.
ERROR: 0.22171.
ERROR: 0.2005.
ERROR: 0.221191.
ERROR: 0.23172.
ERROR: 0.243927.
ERROR: 0.213196.
ERROR: 0.214783.
ERROR: 0.219055.
ERROR: 0.229004.
MIN ERROR: 0.181488.
MAX ERROR: 0.243927.

So we can see that the total error possible in even this small sample is not insignificant - and that is only with additions of float values which already sit exactly on representable floating point steps.

I hope this was useful to you and that you can apply it in your own code to help visualise the error that is hidden in even your simplest algorithms!

 

 

Defining Approximate Type Systems To Allow for Controlled Precision

Many programmers writing software today are unaware of the precision with which they are writing their applications. Convenience, the absence of visible errors, and no obvious performance impact dictate many of the choices of types and algorithms used to get the desired results.

Quite often a programmer will use an 'int' array to store numbers only ranging from zero to four. In this situation the programmer has given far too much precision to the type and is losing memory performance by having to load all the extra bits representing numbers above four. Similarly, a programmer may use a math function such as 'std::sinf' to calculate sin for an input when they are only interested in the result to three decimal places - in which case 'std::sinf' is doing far more work than is necessary, working at a much higher precision to conform to the C math specification.

The need to run at a lower precision can come from all sorts of requirements, from fitting the calculation within a set number of cycles to simply reducing the power consumption of the device. Many of the techniques employed to get lower precision or estimated results fall under the category of Approximate Computing.

In the case of types, most modern programming languages do not provide a type system which allows a fine-grained selection of precision for the basic types. The options are usually quantised to 8/16/32/64/128-bit representations to make an easier match to the common hardware available - a form of software-hardware co-design.

However, in the case of functions we are given full control to calculate to the precision we need and then store the result in one of the predefined types. There are all sorts of numerical techniques for calculating the result of different mathematical functions to different precisions, and some of these can be seen in the differences between math libraries which do not conform to the C math specification.

We can conclude that we are, for practical reasons, stuck with the standard types but can allow any function to work at a given precision. How would we go about it if we wanted to implement our own approximate 'std::sinf' function for a small section of a large codebase already using the standard library implementation?
We could do that quite simply by adding our new function like any other - but what if we wanted to pass a 'float' to that function? We would need to create our own namespace, or rename the function, and that would mean significant rewrites across the code base.
We would also want to make the use of this approximate variation safe - the result of a lower precision sin mixed with the full precision sin could cause trouble, as any trigonometric identities would also only hold approximately. Basically, we want to ensure we don't mix an approximate result with a precise one unless the programmer explicitly states to do so.

To meet the requirements above we can use a wrapper type around the basic types with a templated precision flag. This type can then be used to overload the call to 'sinf' - so that no function calls need to change - and through the type system a 'float' cannot be mixed with an 'ApproxType<float>' without the programmer explicitly requesting the base 'float' type from the wrapper. This means the programmer has to change the types of the objects in the code base that are going to be less accurate, but none of the functional logic. Additionally, it allows the user to specify a maximum level of error that this type accepts; with that we can statically switch calls made with the approximate type to the function which provides the requested level of precision.

For example, a float which says it accepts 10% error in its result that is passed to 'sinf' will be forwarded to an implementation of 'sinf' that at minimum meets that requirement.

So, with those rules and a plan - this is how to implement such a design in C++!

We want to start with the Approximate Wrapper Type. This is our class which will essentially be the type of the object that is normally passed to the function but is marked up statically with the level of accuracy desired. It is also given a 'Get' function to allow the data to be removed from the object. This is the method by which the programmer is expressing that the accuracy is final and safe to use outside of the type-safety. Ideally, we would want to also have the type specify minimum and maximum values and include all the copy rules to allow that to work - but for this basic implementation this will do to show the concept.

template< typename T, int accNumerator >
class ApproxType
{
public:
	ApproxType(T _in)
	{
		m_data = _in;
	}

	T& Get()
	{
		return m_data;
	}

	T	m_data;
};

In this example I am using a Taylor Approximation of Sin generated through my build system. It looks a bit scary but is just a simple polynomial.

constexpr float ApproxSin_Order7(const float _x) noexcept
{
	return ((-0.00001633129816711214847969706187580f*(1.0f)) + (1.00037993800406788125201273942366242f*(_x)) 
			+ (-0.00211754585380569430516639606310036f*(_x* _x)) + (-0.16175605700248782414796266948542325f*(_x* _x * _x)) 
			+ (-0.00577124859440479066885476555626155f*(_x* _x * _x * _x)) + (0.01203154857279084034848981588083916f*(_x* _x * _x * _x * _x)) 
			+ (-0.00127402703149161102696984571025496f*(_x* _x * _x * _x * _x * _x)) + (-0.00000043655425207306325578100769137f*(_x* _x * _x * _x * _x * _x * _x)));
}

Now we want to overload 'sinf' with a version which takes our wrapped approximate type. In this function we specify how accurate our approximate implementation is (accuracy = 8) and use 'std::conditional' to select which function should be called. If the approximate type that is passed in allows this level of accuracy then it will call the Taylor approximation of sin, otherwise it will simply call 'std::sinf'.

This logic is all performed with constexpr and static types where possible so that this code can be evaluated at compile time to reduce this down to as close as possible to a single function call.

template< typename T, int accNumerator >
float sinf(ApproxType<T, accNumerator> _x)
{
	static constexpr int accuracy = 8;
	typedef typename std::conditional < (accuracy >= accNumerator), FuncWrapper<ApproxSin_Order7>, FuncWrapper<std::sinf> >::type A;

	return A::Call(_x.Get());
}

You may have noticed in the above example that the 'std::conditional' doesn't actually contain function pointers, but function pointers wrapped in a class called FuncWrapper. This is because std::conditional only takes types, yet we want to do a compile-time selection between functions. So we wrap the function pointer in a class which has a static function that calls the wrapped function; 'std::conditional' then gives us the correct type to call through. It is a little bit of a hack, but it is what has to be done until we can rely on a proper static if (see the 'if constexpr' sketch after the FuncWrapper listing).

template< float func(float) >
class FuncWrapper
{
public:

	static float Call(float _in)
	{
		return func(_in);
	}
};
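As an aside, with C++17's 'if constexpr' the same compile-time selection can be written without the FuncWrapper class at all. This is just a sketch of that alternative, not the version used in the rest of this post:

// C++17 alternative: select the implementation with 'if constexpr' instead of
// std::conditional plus a function-pointer wrapper class.
template< typename T, int accNumerator >
float sinf(ApproxType<T, accNumerator> _x)
{
	constexpr int accuracy = 8;
	if constexpr (accuracy >= accNumerator)
	{
		return ApproxSin_Order7(_x.Get());
	}
	else
	{
		return std::sinf(_x.Get());
	}
}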

If we put this all together we can see how the code is used. We haven't had to make any special calls to the function we replaced, as it is all handled by overloading. The compiler then generates the correct function call for each use based on the level of accuracy we allow for each ApproxType.

int main()
{
	float var = 0.5f;
	ApproxType<float, 6> approx(var);
	ApproxType<float, 9> approxhigh(var);
	
	float approxreslow  = sinf(approx);		// Calls the approximate function
	float approxreshigh = sinf(approxhigh); // Calls std::sinf from the Approximate Wrapper
	float accurateres	= sinf(var);		// Calls std::sinf

	return 0;
}

These functions can now be dropped into a codebase, and then by specifying where you want the types to be approximate we can access the possibilities of Approximate Computing with very minimal changes.

 

Thanks for reading. All the code in one place:

#include <cmath>

template< typename T, int accNumerator >
class ApproxType
{
public:
	ApproxType(T _in)
	{
		m_data = _in;
	}

	T& Get()
	{
		return m_data;
	}

	T	m_data;
};

template< float func(float) >
class FuncWrapper
{
public:

	static float Call(float _in)
	{
		return func(_in);
	}
};

constexpr float ApproxSin_Order7(const float _x) noexcept
{
	return ((-0.00001633129816711214847969706187580f*(1.0f)) + (1.00037993800406788125201273942366242f*(_x)) 
			+ (-0.00211754585380569430516639606310036f*(_x* _x)) + (-0.16175605700248782414796266948542325f*(_x* _x * _x)) 
			+ (-0.00577124859440479066885476555626155f*(_x* _x * _x * _x)) + (0.01203154857279084034848981588083916f*(_x* _x * _x * _x * _x)) 
			+ (-0.00127402703149161102696984571025496f*(_x* _x * _x * _x * _x * _x)) + (-0.00000043655425207306325578100769137f*(_x* _x * _x * _x * _x * _x * _x)));
}

template< typename T, int accNumerator >
float sinf(ApproxType<T, accNumerator> _x)
{
	static constexpr int accuracy = 8;
	typedef typename std::conditional < (accuracy >= accNumerator), FuncWrapper<ApproxSin_Order7>, FuncWrapper<std::sinf> >::type A;

	return A::Call(_x.Get());
}

int main()
{
	float var = 0.5f;
	ApproxType<float, 6> approx(var);
	ApproxType<float, 9> approxhigh(var);
	
	float approxreslow  = sinf(approx);		// Calls the approximate function
	float approxreshigh = sinf(approxhigh); // Calls std::sinf from the Approximate Wrapper
	float accurateres	= sinf(var);		// Calls std::sinf

	return 0;
}

Approximate methods for handling error in multi-threaded solutions

In many modern applications we rely on multi-threaded algorithms, some of which are run across CPUs and some of which are run as GPGPU tasks.

For many types of programs this usually involves taking some task and splitting it into separate parts and recombining or processing the results of each separate part into a singular answer.

[Figure: a task split into sub-tasks, all of which complete successfully and are combined into the final result.]

When each of the separate threads executes correctly we are given the absolutely correct answer in, hopefully, less time than if the task hadn't been separated into sub-components.

But what happens if one of the sub-components of the task fails? In most applications this would be a catastrophic error and we would end the program with no result. For a program which is doing a lot of heavy computation, that could mean the loss of a whole lot of processed data.

To avoid that, it is possible to employ methods which detect a fault in a single sub-component and resubmit it for processing, exclude it from the results, or otherwise handle the error and either recover or drop out of the program in a way that isn't too lossy.

However, what if the task we were running was made up of hundreds of thousands of components, where the result of executing each component could be predicted, modelled or in some way approximated from the input data and from the results being produced by the other, non-failing sub-components? In that situation we might be able to continue running the application by simply replacing the failed execution with an approximate answer, reporting that an error occurred, and giving some statistical bound on its impact - such as up to x% based on the number of sub-components and how many reported errors.
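As a toy sketch of that recovery strategy (the chunked sum and the "average of the successful chunks" rule are illustrative assumptions, not a production scheme), each chunk of a large sum runs as its own task, and any chunk that throws is replaced by the mean of the chunks that succeeded:

#include <future>
#include <numeric>
#include <vector>

// Sum a large buffer in chunks, approximating any failed chunk from the mean
// of the chunks that completed successfully.
double SumWithApproximatedFailures(const std::vector<double>& data, size_t chunkCount)
{
	const size_t chunkSize = data.size() / chunkCount;
	std::vector<std::future<double>> tasks;

	for (size_t c = 0; c < chunkCount; c++)
	{
		tasks.push_back(std::async(std::launch::async, [&, c]()
		{
			auto begin = data.begin() + c * chunkSize;
			auto end = (c + 1 == chunkCount) ? data.end() : begin + chunkSize;
			return std::accumulate(begin, end, 0.0); // may throw on a real fault
		}));
	}

	std::vector<double> results;
	size_t failed = 0;
	for (auto& t : tasks)
	{
		try { results.push_back(t.get()); }
		catch (...) { failed++; } // a broken sub-task is simply counted
	}

	// Replace each failed chunk with the average of the successful ones.
	const double mean = results.empty() ? 0.0 :
		std::accumulate(results.begin(), results.end(), 0.0) / results.size();
	return std::accumulate(results.begin(), results.end(), 0.0) + mean * failed;
}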

In that case you could run some work and have a work graph that looks like below.

[Figure: the same task graph, but with one failed sub-task whose result is approximated from its input and neighbours.]

In this new model, which allows a failed/slow/broken task to be approximated from its input and neighbours, we can add some stability to programs that don't require 100% accuracy and precision. If a program is fault-tolerant in its algorithms and results, then it gives us greater freedom to handle errors or hardware problems, which are guaranteed to occur on a long enough time scale.

Programs which allow for this condition are interesting in other ways too. If the program can handle up to 10% error and still function, and we have a task that is split into 1000 sub-tasks where each sub-task is independent and side-effect free, then we can deliberately add error into the program to increase performance - provided our approximation technique for 'failed' sub-tasks is cheaper than the work they were performing. This gives us an error that is at worst within the threshold the program allows, and at best close to the correct answer.

In the case of artificially inducing error in the application, we could reveal potential to reduce the power required to compute the task as well as improve the program's overall performance. We would want to induce this state when the trade-off between accuracy and performance is non-linear, i.e. a small drop in accuracy gives a much more valuable benefit in another criterion.

There are many opportunities to use this technique in mobile computing, sensors and distributed computing as well as more general simulation technology to allow us to build better performing and more stable applications.

C++11/14/17 Library Usage on Github

I recently collected a list of all the top level classes and free functions that are added for each of the modern C++ revisions (11/14/17) so that I could scan the Modern C++ example repository to find which library classes and functions don't yet have implementations. (Turns out... a lot!)

However, I am quite interested in how much of these modern standards is actually being adopted in real-world applications. It was my experience when working in industry that the majority of C++ programmers, if not faced with wide usage of the modern concepts, did not independently do the research to keep up to date with the latest additions. This leads to most code I see in the private sector looking like C++98 with the odd 'auto' and range-based 'for' loop thrown in for convenience.

Alternatively, you would see many classes and functions from Boost, which have now been ported over to the standard library - but I will come to this later.

To be able to get some solid figures on what is actually being used, I looked to the only large public set of searchable repositories we have - Github.
I used the Github HTTP API to query for the most popular (starred) repositories that are listed as C++. I then cloned all of these repositories to my machine (286 projects altogether, approx 120GB+ of downloads...) and did some text analysis, looking project-wide for the relevant includes and testing whether these functions/classes were called or used. This is not the most accurate solution to the problem and will produce some false positives with identically named functions, or false negatives with indirectly included headers - but for a quick pass it should give us the broad-stroke usage. Ideally, we would want to build each project and see what is actually linked during compilation, or do some more knowledgeable static analysis. But until then... results!

C++11

[Charts: counts and share of C++11 library classes/functions found across the sampled repositories.]

I was quite shocked by the lack of C++11 classes and functions being used. However, C++11 did introduce an awful lot of quite obscure parts and many variations - so this might not be as bad as it looks.

C++14

C++14 fared a lot better than C++11, which makes sense as C++14 is a revision which corrects a lot of the little things that C++11 missed and generally adds the small useful things that became obvious only after C++11 was finalised (such as the std::rbegin/std::rend standalone functions).

C++17

I was honestly surprised to see as much usage of the C++17 standard as we did. Some of the functions saw even higher usage than some C++11 functions. Unsurprisingly, the parts that were used are the ones whose names make them pretty simple and understandable - some of the more complex additions haven't shown up in any searches yet!

MORE WORK

As I mentioned above, some of the standard additions in C++11 are based on popular parts of the Boost library. I am going to do a search for Boost usage next and see if there is significant crossover, which would explain the delay in some projects making the switch - it would be an awfully big risk for very little gain!

C++ noexcept and move constructors effect on performance in STL Containers

A little known feature of some of the C++ standard library containers is that in the latest C++ Standard they will use the C++11 move operation to more cheaply resize the container when possible.

A move operation is cheaper than a straight copy because it performs fewer operations, at the cost of trashing the source object: the underlying memory is not copied and then reassigned, it is simply reassigned. (Full details here if you are unfamiliar.)

The actual mechanism which tells you whether the container is using the move constructor or the copy constructor is quite opaque to the user, and the condition for it being switched is not well documented.

class MyClass
{
    ...

    //This is a correctly formed move constructor
    MyClass ( MyClass&& _input) noexcept
    {
     ... do something ...
    }

    ...

};

An STL container can only use the move constructor in its resizing operation if that constructor does not break the container's strong exception safety guarantee. In plainer language, it won't use the move constructor of an object if that constructor can throw an exception. This is because if an exception is thrown mid-move then the data being processed could be lost, whereas in a copy the original is not changed.

So, how do we tell the STL that our move constructor won't throw any exceptions? Simple, we use the "noexcept" specifier!

Tagging our move constructor with "noexcept" tells the compiler that it will not throw any exceptions. This condition is checked in C++ using the type trait "std::is_nothrow_move_constructible", which will tell you whether the specifier is correctly set on your move constructor. If you haven't written a move constructor yourself, the compiler-generated one should pass this test; this is really only a concern if you have implemented your own. If your type passes the test, then when the STL container calls "std::move_if_noexcept" it will correctly move the object instead of falling back to a copy.

template<typename Y>
MoveProperties GetProps()
{
	MoveProperties mp;
	mp.moveassignable               = std::is_move_assignable<Y>::value;
	mp.moveconstructible            = std::is_move_constructible<Y>::value;
	mp.moveTriviallyAssignable      = std::is_trivially_move_assignable<Y>::value;
	mp.moveTriviallyConstructible   = std::is_trivially_move_constructible<Y>::value;
	mp.moveNoThrowAssignable        = std::is_nothrow_move_assignable<Y>::value;
	mp.moveNoThrowConstructible     = std::is_nothrow_move_constructible<Y>::value;
	return mp;
}

We can use these tests to build two classes, one which has a noexcept move constructor and one without, then place many of them into an STL container and time it to see what difference this makes to the performance of the system.

So firstly, two classes:

#define ArraySize 100

unsigned int staticCounter = 0;
unsigned int copyCount = 0;
unsigned int moveCount = 0;

class NoThrowMoveConstructible
{
public:
	NoThrowMoveConstructible() 
	{ 
		m_memberVariable = new std::array<int, ArraySize>();
		m_memberVariable->fill(staticCounter++); 
	}
	~NoThrowMoveConstructible()
	{
		if (m_memberVariable != nullptr)
		{
			delete m_memberVariable;
		}
	}
	NoThrowMoveConstructible(NoThrowMoveConstructible&&	_in) noexcept 
	{
		moveCount++;
		this->m_memberVariable = _in.m_memberVariable;
		_in.m_memberVariable = nullptr;
	}
	NoThrowMoveConstructible(const NoThrowMoveConstructible& _in) noexcept
	{ 
		copyCount++;
		this->m_memberVariable = new std::array<int, ArraySize>(*(_in.m_memberVariable));
	}

	std::array<int, ArraySize>* m_memberVariable;
};

class CantBeNoThrowMoveConstructible
{
public:
	CantBeNoThrowMoveConstructible() 
	{ 
		m_memberVariable = new std::array<int, ArraySize>(); 
		m_memberVariable->fill(staticCounter++);
	}

	~CantBeNoThrowMoveConstructible()
	{
		if (m_memberVariable != nullptr)
		{
			delete m_memberVariable;
		}
	}

	CantBeNoThrowMoveConstructible(CantBeNoThrowMoveConstructible&& _in) 
	{ 
		moveCount++;
		this->m_memberVariable = _in.m_memberVariable;
		_in.m_memberVariable = nullptr;
	}

	CantBeNoThrowMoveConstructible(const CantBeNoThrowMoveConstructible& _in) 
	{ 
		copyCount++;
		this->m_memberVariable = new std::array<int, ArraySize>(*(_in.m_memberVariable));
	}

	std::array<int, ArraySize>* m_memberVariable;
};

These two classes are identical except for the "noexcept" applied to the copy and move constructors of NoThrowMoveConstructible. Each is simply a wrapper around a dynamically allocated array of integers. We are using an "std::array" here so that, if we were to do thorough testing, it would be trivial to test copy/move on different sized objects.

Using our "Props" function we can see that the classes have the correct properties. No throw move construction is false on our "CantBeNoThrowMoveConstructible" class and true on our "NoThrowMoveConstructible" class. They are otherwise identical when it comes to move properties.

    NoThrowMoveConstructible:
	moveassignable              false
	moveconstructible           true
	moveTriviallyAssignable     false
	moveTriviallyConstructible  false
	moveNoThrowAssignable	    false
	moveNoThrowConstructible    true

    CantBeNoThrowMoveConstructible:
	moveassignable	            false	
	moveconstructible           true	
	moveTriviallyAssignable     false	
	moveTriviallyConstructible  false	
	moveNoThrowAssignable       false	
	moveNoThrowConstructible    false	

If we were to run this analysis on any basic class, or a class where we haven't overridden any of the move constructors or assignment operators, we would see all these properties return true. This is because the default implementations, which are automatically added to your class if you don't replace them (much like the default destructor), are created with the correct noexcept properties to allow moving the objects.

Now that we know the classes are behaving as we would expect we can run a simple timing function to judge the performance and using the counters verify that behavior at run-time.

template< typename Y >
std::vector<Y> OutputTimeForFillingVector(unsigned int vectorSize)
{
	copyCount = 0;
	moveCount = 0;
	std::vector<Y> vectorToFill;

	Timer<std::chrono::milliseconds> t;
	t.Start();

	for (int i = 0; i < vectorSize; i++)
	{
		Y val;
		vectorToFill.emplace_back(val);
	}

	std::vector<Y> copyVector = vectorToFill;

	std::cout << "Move Operations: " << moveCount << ". Copy Operations: " << copyCount << "\n";
	std::cout << "Time taken for " << typeid(Y).name() << ": " << t.GetElapsedTime().count() << "ms.\n";

	return copyVector;
}

int main()
{
	MoveProperties NoThrowMoveable		= GetProps<NoThrowMoveConstructible>();
	MoveProperties NoneNoThrowMoveable = GetProps<CantBeNoThrowMoveConstructible>();
	
	unsigned int vecSize = 1000000;
	
	std::cout << "Can do move operations: \n";
	OutputTimeForFillingVector<NoThrowMoveConstructible>(vecSize);
	std::cout << "Can't do move operations: \n";
	OutputTimeForFillingVector<CantBeNoThrowMoveConstructible>(vecSize);

	system("pause");
	return 0;
}

This code creates a vector of size "vecSize" by repeatedly adding a single element to it. The time this takes, and the number of copies and moves performed on the underlying objects, is then output to the console for us to analyse.

Can do move operations:
Move Operations: 2099753. Copy Operations: 2000000
Time taken for class NoThrowMoveConstructible: 469ms.
Can't do move operations:
Move Operations: 0. Copy Operations: 4099753
Time taken for class CantBeNoThrowMoveConstructible: 926ms.

This shows the class which cannot be moved taking nearly twice as long as the one that can to be placed into a simple vector. You can also see the effect of this in the Visual Studio Diagnostics panel's process memory watcher.

[Figure: Visual Studio process memory graph - a smooth rise for the movable class versus a saw-toothed, longer rise for the copy-only class.]

We can see that the move-enabled class has a smooth increase in memory as it is allocated and moved into the vector, leading to a steady rise. Whereas the copy-constructed class has to frequently make copies of its memory and deallocate the memory in the dead objects, leading to a bouncing rise in memory over a longer time due to the extra work being done.

The lesson in all of this is that if you touch the copy or move constructors or assignment operators, make sure to give the functions the correct exception specification so the optimal behaviour can be enabled. If you never touch them at all there is a good chance they will default to the correct behaviour, but it is always good to check!
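A cheap way to guard against accidentally losing this behaviour is to default the special members where you can and assert the trait somewhere in your tests. A small sketch (the 'Widget' class is just an illustration, not one of the classes measured above):

#include <type_traits>
#include <vector>

class Widget
{
public:
	Widget() = default;
	Widget(Widget&&) noexcept = default;            // keep the move path cheap
	Widget& operator=(Widget&&) noexcept = default;
	Widget(const Widget&) = default;
	Widget& operator=(const Widget&) = default;

private:
	std::vector<int> m_data;
};

// Fails to compile if someone later breaks the no-throw move guarantee.
static_assert(std::is_nothrow_move_constructible<Widget>::value,
              "Widget must stay cheaply movable inside STL containers");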

Any size math vector using Modern C++

In the normal case of working with any graphics or games programming code you inevitably end up with a lot of classes for handling 2D/3D points in the world - usually a direct map to HLSL's float1, float2, float3 and float4 (or GLSL's vec equivalents) and the int and double variants.

This can quickly get messy to maintain when you have tens of classes to update for any small change - not to mention tedious. One resolution to this problem is to take advantage of templates, std::array and some nice new features of the language to make an (almost) readable single class that can be instantiated with the correct type and number of elements. For maximum performance we would ideally keep the dozens of hand-written classes, as that makes performance analysis easier and therefore easier to make fast, but for more general or less close-to-the-metal applications I am presenting the class 'vecN'.

template<typename T, unsigned int elemCount>
struct vecN
{
public:
	vecN() {}

	template<class... Args>
	vecN(Args... args)
	{
		const int n = sizeof...(Args);
		static_assert(n == elemCount, "Invalid number of arguments for vector type");

		data = { { args... } };
	}

	std::array<T, elemCount> data;
};

This is the basic backbone of the class. It is a simple vector class that can be initialised like 'vecN<float,4> someFloat4;' or 'vecN<float,4> someFloat4(1.f, 2.f, 3.f, 4.f);'.

It takes the two template parameters ('float' and '4') and uses them to declare an std::array. The constructors offered are a blank constructor, which leaves the array with default uninitialised values, and a variadic template constructor. The variadic template constructor takes one argument per element of the array and initialises the std::array with those values. It uses a static_assert to check at compile time that the correct number of arguments is passed, preventing us from passing too many or too few values.

With the class just as it is, we are able to use the vector - but a vector should provide more functionality than a simple array!

	template<class... Args>
	void Set(Args... args)
	{
		const int n = sizeof...(Args);
		static_assert(n == elemCount, "Invalid number of arguments for vector type");

		data = { { args... } };
	}

We want to be able to set the values in the vector, so we can use the same code as the constructor to write a function to allow the values in the vector to be set after construction.

A good vector class should also provide basic mathematical functions to make the vector easy to use with simple types and other vectors.

	void Multiply(T _mulVal)
	{
		std::for_each(data.begin(), data.end(), [&_mulVal](T& elem) { elem*= _mulVal; });
	}

	void Multiply(vecN<T,elemCount> _mulVal)
	{
		int counter = 0;
		std::for_each(data.begin(), data.end(), [&counter, &_mulVal](T& elem) { elem *= _mulVal.data[counter++]; });
	}

	void Add(T _addVal)
	{
		std::for_each(data.begin(), data.end(), [&_addVal](T& elem) { elem += _addVal; });
	}

	void Add(vecN<T, elemCount> _addVal)
	{
		int counter = 0;
		std::for_each(data.begin(), data.end(), [&counter, &_addVal](T& elem) { elem += _addVal.data[counter++]; });
	}

	float Length()
	{
		std::array<T, elemCount> dataSqr = data;
		std::for_each(dataSqr.begin(), dataSqr.end(), [](T& elem) { elem = elem*elem; });
		T sum = std::accumulate(dataSqr.begin(), dataSqr.end(), (T)0);
		return sqrt(sum);
	}

	void Normalise()
	{
		T len = 1.0/Length();
		std::for_each(data.begin(), data.end(), [&len](T& elem) { elem *= len; });
	}

This code makes the class actually useful for some simple linear algebra. Here we are using 'std::for_each' from <algorithm> and 'std::accumulate' from <numeric> to work with our arbitrarily sized array generically, with relative ease and confidence in the code that will be generated. C++17 allows us to apply different execution policies to these functions, but that has not been enabled in my compiler just yet, so for now we are sticking with the default behaviour. This is an area that could be improved soon - particularly if we wanted to use a very large vector such as 'vecN<float, 2048>'!
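For reference, once the parallel algorithms from <execution> are available in your compiler the change is small; a sketch of what the Multiply member function might become:

	// Same Multiply as above, but asking the library to parallelise the loop.
	// Requires #include <execution> and C++17 parallel algorithm support.
	void Multiply(T _mulVal)
	{
		std::for_each(std::execution::par_unseq, data.begin(), data.end(),
			[&_mulVal](T& elem) { elem *= _mulVal; });
	}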

Next we want to think of the common use cases for these classes - such as having a 'float3' and needing to pass a 'float4', which is quite common when writing to some textures or in shader pipelines.

	vecN<T, elemCount - 1> PopElem()
	{
		vecN<T, elemCount - 1> output;
		std::copy_n(data.begin(), elemCount-1, output.data.begin());
		return output;
	}

	vecN<T, elemCount + 1> PushElem(T _value = 0.0)
	{
		vecN<T, elemCount + 1> output;
		std::copy_n(data.begin(), elemCount, output.data.begin());
		output.data[elemCount] = _value;
		return output;
	}

	template<int _size>
	constexpr vecN<T, _size> GetResizedVector()
	{
		vecN<T, _size> output;
		std::copy_n(data.begin(), std::min((unsigned int)(_size), elemCount), output.data.begin());
		return output;
	}

Using the push and pop functions provided here we can return a copy of the vector with one more or one fewer element. Or, we can use 'GetResizedVector' to request an arbitrarily sized copy when we need to make larger adjustments. Nothing too fancy is going on in these functions. We are using 'std::copy_n' to copy a sub-range of the array when shrinking; in the past this might have been done with memcpy, but this is a lot safer.

Next we want to look at accessing data from the vector in a nice way.

	constexpr T& GetElement(int _index)
	{
		return data[_index];
	}

	template<int _index>
	constexpr T& GetElement()
	{
		return data[_index];
	}

These two little functions give the user two ways to access the data. The second is a template in the hope that the compiler will recognise the constant offset and avoid evaluating '[_index]' at run-time where possible - just like a normal 'someFloat3.x' would behave. They are also constexpr where possible to allow aggressive compilers to resolve values wherever possible.

This isn't sufficient for all use cases though; we also want to provide the standard x/y/z/w/xyz style of access.

	constexpr T& x()
	{
		static_assert(1 <= elemCount, "Invalid number of arguments for vector type");
		return data[0];
	}

	constexpr T& y()
	{
		static_assert(2 <= elemCount, "Invalid number of arguments for vector type");
		return data[1];
	}

	constexpr T& z()
	{
		static_assert(3 <= elemCount, "Invalid number of arguments for vector type");
		return data[2];
	}

	constexpr T& w()
	{
		static_assert(4 <= elemCount, "Invalid number of arguments for vector type");
		return data[3];
	}

	constexpr std::array<T, 3> xxx()
	{
		static_assert(1 <= elemCount, "Invalid number of arguments for vector type");
		std::array<T, 3> xyz = { data[0], data[0], data[0] };
		return xyz;
	}

... and many more ...

These functions, and many more not shown, provide all the element access methods that a user will be used to when dealing with 'float1-4' types. Where possible we are statically asserting to prevent out-of-range data access, so 'someFloat3.z()' will only compile on vectors that have at least 3 elements.

But what if our user has a 6 element array and wants the first and last elements only...?

	template<class... Indexs>
	constexpr vecN<T, sizeof...(Indexs)> GetOrderedArrayOfIndices(Indexs... indxs)
	{
		vecN<T, sizeof...(Indexs)> output;
		output.data = { { data[indxs]... } };
		return output;
	}

In that case, we have to use variadics again. Similarly to how we initialised the array with the constructor and 'Set' function, we need to expand the variadic input, but this time expanding with the pattern 'data[indxs]...', so that instead of getting:
{ { value1, value2, value3 and so on } }
we get:
{ { data[value1], data[value2], data[value3] and so on } }

This allows us to call the function like 'someVector4.GetOrderedArrayOfIndicies(0,3);' to get a vector with two elements which hold the first and last element values of 'someVector4'. With this method we can arbitrarily construct arrays from other array or perform reordering.
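Answering the earlier six-element question, a call might look like this (a minimal sketch):

vecN<float, 6> wide(0.0f, 1.0f, 2.0f, 3.0f, 4.0f, 5.0f);

// Two-element vector holding the first and last values of 'wide'.
vecN<float, 2> ends = wide.GetOrderedArrayOfIndices(0, 5);   // { 0, 5 }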

Finally, we want to be able to work with these types without excessive typing of hard-to-read code, so we can hide most of the templating behind a few simple typedefs...

typedef vecN<float, 4> float4;
typedef vecN<float, 3> float3;
typedef vecN<float, 2> float2;
typedef vecN<float, 1> float1;

So, now the user can simply create a 'float4' and use it like any other class they are used to, but easily convert sizes and access elements however they want.
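Putting it all together, everyday use ends up looking something like this (a sketch using only the functions shown above):

float4 colour(0.2f, 0.4f, 0.6f, 1.0f);
colour.Multiply(0.5f);            // scale every channel

float3 rgb = colour.PopElem();    // drop the alpha channel
rgb.Normalise();

float red = rgb.x();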

Thanks for reading!

 

Full Code:

#include <algorithm>
#include <array>
#include <cmath>
#include <numeric>

template<typename T, unsigned int elemCount>
struct vecN
{
public:
	vecN() {}

	template<class... Args>
	vecN(Args... args)
	{
		Set(args...);
	}

	template<class... Args>
	void Set(Args... args)
	{
		const int n = sizeof...(Args);
		static_assert(n == elemCount, "Invalid number of arguments for vector type");

		data = { { args... } };
	}

	void Multiply(T _mulVal)
	{
		std::for_each(data.begin(), data.end(), [&_mulVal](T& elem) { elem*= _mulVal; });
	}

	void Multiply(vecN<T,elemCount> _mulVal)
	{
		int counter = 0;
		std::for_each(data.begin(), data.end(), [&counter, &_mulVal](T& elem) { elem *= _mulVal.data[counter++]; });
	}

	void Add(T _addVal)
	{
		std::for_each(data.begin(), data.end(), [&_addVal](T& elem) { elem += _addVal; });
	}

	void Add(vecN<T, elemCount> _addVal)
	{
		int counter = 0;
		std::for_each(data.begin(), data.end(), [&counter, &_addVal](T& elem) { elem += _addVal.data[counter++]; });
	}

	float Length()
	{
		std::array<T, elemCount> dataSqr = data;
		std::for_each(dataSqr.begin(), dataSqr.end(), [](T& elem) { elem = elem*elem; });
		T sum = std::accumulate(dataSqr.begin(), dataSqr.end(), (T)0);
		return std::sqrt(sum);
	}

	void Normalise()
	{
		T len = 1.0/Length();
		std::for_each(data.begin(), data.end(), [&len](T& elem) { elem *= len; });
	}

	vecN<T, elemCount - 1> PopElem()
	{
		vecN<T, elemCount - 1> output;
		std::copy_n(data.begin(), elemCount-1, output.data.begin());
		return output;
	}

	vecN<T, elemCount + 1> PushElem(T _value = 0.0)
	{
		vecN<T, elemCount + 1> output;
		std::copy_n(data.begin(), elemCount, output.data.begin());
		output.data[elemCount] = _value;
		return output;
	}

	template<int _size>
	constexpr vecN<T, _size> GetResizedVector()
	{
		vecN<T, _size> output;
		std::copy_n(data.begin(), std::min((unsigned int)(_size), elemCount), output.data.begin());
		return output;
	}

	constexpr T& GetElement(int _index)
	{
		return data[_index];
	}

	template<int _index>
	constexpr T& GetElement()
	{
		return data[_index];
	}

	template<class... Indexs>
	constexpr vecN<T, sizeof...(Indexs)> GetOrderedArrayOfIndices(Indexs... indxs)
	{
		vecN<T, sizeof...(Indexs)> output;
		output.data = { { data[indxs]... } };
		return output;
	}

	constexpr T& x()
	{
		static_assert(1 <= elemCount, "Invalid number of arguments for vector type");
		return data[0];
	}

	constexpr T& y()
	{
		static_assert(2 <= elemCount, "Invalid number of arguments for vector type");
		return data[1];
	}

	constexpr T& z()
	{
		static_assert(3 <= elemCount, "Invalid number of arguments for vector type");
		return data[2];
	}

	constexpr T& w()
	{
		static_assert(4 <= elemCount, "Invalid number of arguments for vector type");
		return data[3];
	}

	constexpr std::array<T, 3> xxx()
	{
		static_assert(1 <= elemCount, "Invalid number of arguments for vector type");
		std::array<T, 3> xyz = { data[0], data[0], data[0] };
		return xyz;
	}

	constexpr std::array<T, 3> yyy()
	{
		static_assert(2 <= elemCount, "Invalid number of arguments for vector type");
		std::array<T, 3> xyz = { data[1], data[1], data[1] };
		return xyz;
	}

	constexpr std::array<T, 3> zzz()
	{
		static_assert(3 <= elemCount, "Invalid number of arguments for vector type");
		std::array<T, 3> xyz = { data[2], data[2], data[2] };
		return xyz;
	}

	constexpr std::array<T, 3> xyz()
	{
		static_assert(3 <= elemCount, "Invalid number of arguments for vector type");
		std::array<T, 3> xyz = { data[0], data[1], data[2] };
		return xyz;
	}
	constexpr std::array<T, 3> xzy()
	{
		static_assert(3 <= elemCount, "Invalid number of arguments for vector type");
		std::array<T, 3> xyz = { data[0], data[2], data[1] };
		return xyz;
	}

	constexpr std::array<T, 3> yzx()
	{
		static_assert(3 <= elemCount, "Invalid number of arguments for vector type");
		std::array<T, 3> xyz = { data[1], data[2], data[0] };
		return xyz;
	}

	constexpr std::array<T, 3> yxz()
	{
		static_assert(3 <= elemCount, "Invalid number of arguments for vector type");
		std::array<T, 3> xyz = { data[1], data[0], data[2] };
		return xyz;
	}

	constexpr std::array<T, 3> zyx()
	{
		static_assert(3 <= elemCount, "Invalid number of arguments for vector type");
		std::array<T, 3> xyz = { data[2], data[1], data[0] };
		return xyz;
	}

	constexpr std::array<T, 3> zxy()
	{
		static_assert(3 <= elemCount, "Invalid number of arguments for vector type");
		std::array<T, 3> xyz = { data[2], data[0], data[1] };
		return xyz;
	}

	constexpr std::array<T, 2> xx()
	{
		static_assert(1 <= elemCount, "Invalid number of arguments for vector type");
		std::array<T, 2> xyz = { data[0], data[0] };
		return xyz;
	}

	constexpr std::array<T, 2> xy()
	{
		static_assert(2 <= elemCount, "Invalid number of arguments for vector type");

		std::array<T, 2> xyz = { data[0], data[1] };
		return xyz;
	}

	constexpr std::array<T, 2> xz()
	{
		static_assert(3 <= elemCount, "Invalid number of arguments for vector type");
		std::array<T, 2> xyz = { data[0], data[2] };
		return xyz;
	}

	constexpr std::array<T, 2> yx()
	{
		static_assert(2 <= elemCount, "Invalid number of arguments for vector type");
		std::array<T, 2> xyz = { data[1], data[0] };
		return xyz;
	}

	constexpr std::array<T, 2> yy()
	{
		static_assert(2 <= elemCount, "Invalid number of arguments for vector type");
		std::array<T, 2> xyz = { data[1], data[1] };
		return xyz;
	}

	constexpr std::array<T, 2> yz()
	{
		static_assert(3 <= elemCount, "Invalid number of arguments for vector type");
		std::array<T, 2> xyz = { data[1], data[2] };
		return xyz;
	}

	constexpr std::array<T, 2> zx()
	{
		static_assert(3 <= elemCount, "Invalid number of arguments for vector type");
		std::array<T, 2> xyz = { data[2], data[0] };
		return xyz;
	}
	constexpr std::array<T, 2> zy()
	{
		static_assert(3 <= elemCount, "Invalid number of arguments for vector type");
		std::array<T, 2> xyz = { data[2], data[1] };
		return xyz;
	}
	constexpr std::array<T, 2> zz()
	{
		static_assert(3 <= elemCount, "Invalid number of arguments for vector type");
		std::array<T, 2> xyz = { data[2], data[2] };
		return xyz;
	}

	std::array<T, elemCount> data;
};

typedef vecN<float, 4> float4;
typedef vecN<float, 3> float3;
typedef vecN<float, 2> float2;
typedef vecN<float, 1> float1;

Calculating Floating-Point Error in std::random_device

Floating-point representation in programming is inherently error-prone, as we cannot perfectly represent real numbers without an infinite number of bits. As a result we have a standard (https://en.wikipedia.org/wiki/IEEE_754) which has some quirky behaviours. One of them is that the error in representing real numbers gets larger as the number you are trying to represent moves away from 0. The scale of this error, and how fast it grows, depends on the number of bits available to represent the value, and it is not always obvious when this error occurs.

If we perform multiple arithmetic operations then the rounding error from those calculations can accumulate. To calculate what this error is we need to understand the maximum error for a given number. As a small guide, I have created a graph (below) as an easy reference.

[Graph: how the maximum floating-point representation error grows with the magnitude of the value.]
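If the graph is hard to read, the same behaviour can be reproduced directly with std::nextafterf, which returns the next representable float after a given value; printing the gap at a few magnitudes shows how quickly the spacing grows (a small self-contained sketch, not part of the original code):

#include <cmath>
#include <iostream>

int main()
{
	const float values[] = { 1.0f, 1000.0f, 1000000.0f };
	for (float v : values)
	{
		// Gap between 'v' and the next representable float above it.
		float gap = std::nextafterf(v, 2.0f * v) - v;
		std::cout << "Spacing at " << v << " is " << gap << "\n";
		// Roughly 1.2e-07 at 1, 6.1e-05 at 1000 and 0.0625 at 1000000
		// for a typical 32-bit IEEE-754 float.
	}
	return 0;
}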

In a later article I will go into how this can affect even simple geometric or purely numerical calculations.
For now I am going to show some code for calculating this error, and then use it to test the accuracy of, and understand the actual range of results from, the standard library random number generator.

In this example we want to choose a range for the random number generator, look at the floating-point representation of that range to see how many possible outcomes there are, and then test whether we actually get those outcomes and how frequently each one is hit, to see how often we get duplicates within the range.

First step - getting the number of values representable as a float in our range (between '_low' and '_high').

	float nextAfter = std::nextafterf(_low, _high);
	double minStep = (double)nextAfter - (double)_low;
	double numberOfSteps = ((double)_high - (double)_low) / minStep;
	numberOfSteps += 1.0;

In this example we use std::nextafterf to find the next possible floating-point value after '_low', cast to a larger (double) representation to reduce error in the calculation itself, and then divide the range by this step size. This only gives correct results for small ranges, because the step size changes as the number gets bigger. (To calculate this exactly for large ranges there are other, less concise approaches, which will be shown in a later post.)

Next, we want to create our random device and initialise the range for it to work on:

	std::random_device	rd;
	std::mt19937		gen(rd());
	std::uniform_real_distribution<float> uniformRealDistribution(_low, _high);
	float randomResult = uniformRealDistribution(gen);

This code seeds a Mersenne Twister generator from std::random_device and wraps it in a distribution which produces random numbers in the range between '_low' and '_high'. Pretty standard stuff.

Now we want to create a std::map keyed by the floating-point result, with a counter as the value. This lets us record each distinct outcome from our random number generator and the number of times that outcome was produced.

	std::map<float, int> occuranceCounter;

The final step is simply putting this all together in a function that populates the std::map and outputs the results:

#include <cmath>
#include <map>
#include <random>
#include <iostream>

void TestRandomDistribution(float _low, float _high, int sampleSize)
{
	std::random_device	rd;
	std::mt19937		gen(rd());
	std::uniform_real_distribution<float> uniformRealDistribution(_low, _high);
	std::map<float, int> occuranceCounter;

	float nextAfter			= std::nextafterf(_low, _high);
	double minStep			= (double)nextAfter - (double)_low;
	double numberOfSteps	= ((double)_high - (double)_low) / minStep;

	std::cout << "Max Possible outcomes: " << numberOfSteps + 1 << ". ";

	for (int i = 0; i < sampleSize; i++)
	{
		float randomResult = uniformRealDistribution(gen);

		if (occuranceCounter.find(randomResult) == occuranceCounter.end())
			occuranceCounter[randomResult] = 1;
		else
			occuranceCounter[randomResult] += 1;
	}


	int size = occuranceCounter.size();
	int counter = 0;
	for (auto& kv : occuranceCounter)
	{
		if (kv.second != 1)
		{
			counter++;
		}
	}

	float percent = (float)counter / (float)size;
	std::cout << "repeat: " << percent*100.f << "%. Total numbers is " << size << ".\n";
}

This function will then output text like this for 10000 samples:

Max Possible outcomes: 1.04858e+06. repeat: 0.401647%. Total numbers is 9959.

And then if we write a loop we can look at the output of this for different magnitudes of input values:

Testing for values from 1 to 2
==================
Max Possible outcomes: 8388609. repeat: 0.080064051%. Total numbers is 9992.
==================
Testing for values from 10 to 11
==================
Max Possible outcomes: 1048577. repeat: 0.41168794%. Total numbers is 9959.
==================
Testing for values from 100 to 101
==================
Max Possible outcomes: 131073. repeat: 3.5566156%. Total numbers is 9644.
==================
Testing for values from 1000 to 1001
==================
Max Possible outcomes: 16385. repeat: 27.622238%. Total numbers is 7465.
==================
Testing for values from 10000 to 10001
==================
Max Possible outcomes: 1025. repeat: 100%. Total numbers is 1025.
==================
Testing for values from 100000 to 100001
==================
Max Possible outcomes: 129. repeat: 100%. Total numbers is 129.
==================
Testing for values from 1000000 to 1000001
==================
Max Possible outcomes: 17. repeat: 100%. Total numbers is 17.
==================
Testing for values from 10000000 to 10000001
==================
Max Possible outcomes: 2. repeat: 100%. Total numbers is 2.
==================

If we look at the results we can see how the number of possible values matches the graph at the start of this post. The maximum number of outcomes from our random number generator shrinks steadily until, at 10 million, we can only get the left or right bound value. At such a large magnitude that is quite understandable, and probably something you have run into when working with floats in similar situations. However, you will notice that even at modest numbers, such as between 1000 and 1001, we are quantised to only 16385 possible results: floats in that range are spaced 2^-14 (about 0.000061) apart, giving 2^14 + 1 = 16385 representable values in the interval. For a lot of applications that is a huge margin of error compared to the accuracy you get between zero and one.

To resolve this we can use bigger data types, but that isn't always ideal. Quite a lot of papers instead recommend working at offsets and then shifting back afterwards, which reduces the error but doesn't totally solve the problem. I will be doing more posts in this area as I finish my research on a related topic, where I will go over some of the coping strategies used by applications that don't need 100% accurate results.
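To make the offset idea concrete, here is a tiny sketch of the general shape it can take (the names and numbers are illustrative, not from a specific paper): values are stored relative to a local origin, where float spacing is fine, and only shifted back to absolute coordinates when needed.

// Floats near 10000 are spaced roughly 0.001 apart, so differences between
// absolute positions in that region are heavily quantised. Keeping values
// relative to a local origin keeps the interesting numbers near zero, where
// the spacing is much finer.
const float origin = 10000.0f;            // local origin chosen near the data

float offsetA = 0.1234567f;               // stored relative to 'origin'
float offsetB = 0.1237890f;

float difference = offsetB - offsetA;     // computed in the dense region near zero
float absoluteB  = origin + offsetB;      // shifted back only when actually needed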

Full source code for generating the above output:

#include <cmath>
#include <map>
#include <random>
#include <iostream>
#include <iomanip>

void TestRandomDistribution(float _low, float _high, int sampleSize)
{
	std::random_device	rd;
	std::mt19937		gen(rd());
	std::uniform_real_distribution<float> uniformRealDistribution(_low, _high);
	std::map<float, int> occuranceCounter;

	float nextAfter = std::nextafterf(_low, _high);
	double minStep = (double)nextAfter - (double)_low;
	double numberOfSteps = ((double)_high - (double)_low) / minStep;

	std::cout << "Max Possible outcomes: " << numberOfSteps + 1 << ". ";

	for (int i = 0; i < sampleSize; i++)
	{
		float randomResult = uniformRealDistribution(gen);

		if (occuranceCounter.find(randomResult) == occuranceCounter.end())
			occuranceCounter[randomResult] = 1;
		else
			occuranceCounter[randomResult] += 1;
	}


	int size = occuranceCounter.size();
	int counter = 0;
	for (auto& kv : occuranceCounter)
	{
		if (kv.second != 1)
		{
			counter++;
		}
	}

	float percent = (float)counter / (float)size;
	std::cout << "repeat: " << percent*100.f << "%. Total numbers is " << size << ".\n";
}

int main()
{
	float minTestValues[] = { 1.f, 10.f, 100.f, 1000.f, 10000.f, 100000.f, 1000000.f, 10000000.f };
	int arraySize = 8;
	int sampleCount = 10000;

	for (int i = 0; i < arraySize; i++)
	{
		std::cout << std::setprecision(10) << "Testing for values from " << minTestValues[i] << " to " << minTestValues[i] + 1.f << "\n==================\n" << std::setprecision(8);
		TestRandomDistribution(minTestValues[i], minTestValues[i] + 1.f, sampleCount);
		std::cout << "==================\n";
	}


	return 0;
}