Build time - Run time distribution in Clang

I recently researched how the choice of optimisation passes affects build time and run time performance for C/C++ applications compiled with Clang.

Results for the raytracer test application, showing strong clustering caused by the inclusion of the inliner at different values and of the GVN pass.

The research produced some very interesting results. It showed that the '-O3' flag does not always use the most effective selection of optimisation passes for a given piece of code. In fact, it may be better (albeit very time consuming) to produce a custom optimisation pass list per compilation unit, because a given ordering of optimisation passes produces good results for some patterns of code but bad results for others. Ideally, a series of 'good' optimisation lists would be generated and each matched against the code patterns that it handles best.
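The per-compilation-unit idea can be sketched in a few lines of Python. This is only an illustration, not the tooling used for the research: the function name best_pass_list_per_unit and the run_time callback are hypothetical, and the callback is assumed to build one unit with one candidate pass list, run the benchmark, and return the measured run time in seconds.

```python
from typing import Callable, Dict, Sequence, Tuple

PassList = Tuple[str, ...]


def best_pass_list_per_unit(
    units: Sequence[str],
    candidates: Sequence[PassList],
    run_time: Callable[[str, PassList], float],
) -> Dict[str, PassList]:
    """For each compilation unit, pick the candidate pass list with the lowest run time.

    run_time is a hypothetical callback supplied by the caller; this sketch
    does not invoke the compiler itself.
    """
    if not candidates:
        raise ValueError("at least one candidate pass list is required")
    return {
        unit: min(candidates, key=lambda passes: run_time(unit, passes))
        for unit in units
    }
```

Every unit is measured against every candidate, so the cost grows with the product of the two. That is the 'very time consuming' part, and it is why matching a small set of known good lists to code patterns is the more attractive long-term goal.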

Here you can see the distribution of the experimental builds compared to the preset optimisation levels. The bottom right cluster is very important: it shows that the same performance as -O3 can be achieved with a saving of roughly 15% on build time.

The most interesting results showed the amount of compile time that could be saved for the same performance. This is because some optimisation passes are not applicable to the code they are being run on, essentially causing longer build times for no gain. It is possible to configure a benchmarking tool to find and remove these passes in a relatively short period of time. As with the increase in performance, this saving is highly dependent on the code that is being compiled. For a very large, complex application, these kinds of gains would have to be targeted at specific files, or groups of files, that are being compiled.
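A benchmarking tool of this kind could work roughly as follows. This is a minimal sketch under stated assumptions, not the tool used for these results: prune_passes and the measure callback are hypothetical names, and measure is assumed to build the code with the given pass list, run the benchmark, and return the build time and run time in seconds.

```python
from typing import Callable, List, Sequence, Tuple

Measurement = Tuple[float, float]  # (build_seconds, run_seconds)


def prune_passes(
    passes: Sequence[str],
    measure: Callable[[Sequence[str]], Measurement],
    run_tolerance: float = 0.01,
) -> List[str]:
    """Greedily drop passes whose removal shortens the build without slowing the program.

    measure is a hypothetical callback supplied by the caller. run_tolerance is
    the relative run time slowdown, against the original pass list, that is
    still treated as 'the same performance'.
    """
    kept = list(passes)
    base_build, base_run = measure(kept)
    i = 0
    while i < len(kept):
        trial = kept[:i] + kept[i + 1:]  # pass list with the i-th pass removed
        build, run = measure(trial)
        if build < base_build and run <= base_run * (1.0 + run_tolerance):
            kept = trial
            # Only the build baseline moves; run time is always compared with
            # the original list so that small slowdowns cannot accumulate.
            base_build = build
        else:
            i += 1  # this pass earns its build time, keep it
    return kept
```

The sketch makes a single sweep and needs one build and benchmark run per pass, which keeps the search short. Because passes interact, a pass that looks useless on its own may matter in combination with another, so a real tool would probably repeat the sweep or re-validate the final list.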

A proposal to present this technique and its results at EuroLLVM 2016 will be submitted soon.