Hoard Compiler on GU - “veni, vidi, vici”
Golem Unlimited and Hoard - new distributed compiler for Visual Studio - first results.
Over three months ago we announced our cooperation with Hoard on creating a new distributed compiler for Visual Studio (VS). It would work on Golem Unlimited (GU), unburden developers of the typical distributed computing hassles - provisioning nodes, distributing work, transferring data - and, above all, speed things up.
We started working on this right after the announcement. Now we are ready to share the first results: everything we have managed to achieve to date and the challenges that are still ahead of us.
Bear in mind that we are learning as we resolve these challenges, so the project may change along the way, as everything in the field of nascent technologies does.
Initial assumptions
From the very beginning Hoard's assumptions were kept simple. Since a compiler can use all local cores to compile many source files at once, why not also use the cores of other machines on the local network to scale the process even further?
Golem Unlimited is the perfect toolset for this job. It is designed for creating a local computing network in which one machine acts as a Hub that lets you manage network resources, see the total capacity as well as each node's detailed info, create new tasks, and check work status.
The Hoard team has kept the setup for a Golem Unlimited local network very simple:
- Run the Golem Unlimited Hub process on any computer (it can be a local build server).
- Run the Golem Unlimited Provider on each machine within the network; it connects to the Hub.
- Install our plugin for Visual Studio and run the build process using the “Hoard distributed compiler”.
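To give a feel for how the pieces talk to each other once they are running, here is a minimal Python sketch that lists the Providers connected to a Hub over its REST API; the port and the /peers path are assumptions for illustration, not the documented API:

```python
import requests

HUB_URL = "http://127.0.0.1:61622"  # assumed Hub address; adjust to your setup

def list_providers():
    """Ask the Hub which Provider nodes are currently connected.
    Both the port above and the /peers path are assumptions."""
    response = requests.get(f"{HUB_URL}/peers", timeout=5)
    response.raise_for_status()
    return response.json()

if __name__ == "__main__":
    for peer in list_providers():
        print(peer)  # e.g. node id, name, address reported by the Provider
```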
Although the setup and assumptions were kept simple, we faced several non-trivial challenges and issues.
For instance, on the Golem Unlimited side:
- gu-hub was deployable only on Linux systems,
- gu-provider was crashing on some Windows systems because of missing folders or regional settings,
- the REST API client generated from Swagger was not communicating properly with the Hub.
Other issues were strictly related to Visual Studio:
- Lack of well-crafted VSIX API documentation,
- Determining which toolset and compiler a project uses is cumbersome,
- Problems with finding all the includes a CPP file uses - there is no standardization of where include files are taken from, and compiler directives complicate things further (like include paths generated from macros in the Boost library),
- A VSIX built for one version of VS won't work with other VS versions.
Phase I: veni
The first meetings with Hoard developers were focused on describing GU and introducing its architecture and the way it works.
We also demoed the Hub API within the Swagger editor, explaining how to generate bindings for a wide range of programming languages. Two of these, the Rust and Python bindings, were built in-house. The generated code requires some manual tweaking; however, it is usable and mostly valid.
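For a feel of what the generated bindings look like in use, here is a sketch following the usual Swagger Codegen layout for Python; the package name, host, and operation name are assumptions that depend on the generator configuration and the spec:

```python
# Typical usage pattern of a Swagger-generated Python client;
# swagger_client is the generator's default package name.
import swagger_client
from swagger_client.rest import ApiException

configuration = swagger_client.Configuration()
configuration.host = "http://127.0.0.1:61622"  # assumed Hub address

api = swagger_client.DefaultApi(swagger_client.ApiClient(configuration))
try:
    peers = api.list_peers()  # hypothetical operation generated from the spec
    print(peers)
except ApiException as e:
    print(f"Hub call failed: {e}")
```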
Phase II: vidi
At the Hoard team's request, Przemek “Reqc” Rekucki from the GU team generated the code for the C# bindings within minutes and committed it to a new repo. The Hoard team then forked the repo and made some naming changes and minor fixes as issues showed up throughout the development process.
The Golem team also released a Provider build especially for this use case; it had to be fixed due to some inconsistencies revealed during Hoard's development and was seamlessly re-released.
What else has been done on Golem’s side?
- We have prepared an experimental Windows build of the Hub, knowing that Windows is the native environment for the game dev industry (for other use cases, we only support Debian). We have yet to see how well it works; so far, no bugs have been reported.
- We were also asked to extend the Hub API to enable querying the hardware of selected Providers. The Hub already collected hardware information for all connected Providers, so we just needed to expose it at a dedicated endpoint. The Hoard team then extended their bindings accordingly, enabling the requested functionality.
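From the client side, such a hardware query might look roughly like the Python sketch below; the endpoint path is illustrative, as the actual route is defined by the Hub API:

```python
import requests

HUB_URL = "http://127.0.0.1:61622"  # assumed Hub address

def provider_hardware(node_id: str) -> dict:
    """Fetch the hardware info the Hub has already collected for a single
    Provider. The path below is illustrative, not the documented route."""
    response = requests.get(f"{HUB_URL}/peers/{node_id}/hardware", timeout=5)
    response.raise_for_status()
    return response.json()  # e.g. CPU cores, RAM and OS reported by the node
```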
"Even though we have not completely tackled VS issues, we have managed to produce a working version of the Hoard Code Compiler powered by Golem Unlimited. However, not without some limitations." - says Cyryl Matuszewski, Hoard. "It is in a PoC stage. The whole concept works, but the results are still far from satisfying."
Is it worth using commercially?
"Not yet, but it’s definitely worth testing and running various scenarios to see what can be done to improve it and provide more efficient compilation process. For now, a simple hello world CPP file that has just one include <stdlib.h> uses more than 900 header files producing a 20MB package for remote compilation. What if we first preprocess it on a local machine? Then it will eat up as much as 10MB, which is better, but still not perfect, as we also need to send the compiler along with some dlls which adds another 50MB on top of that.
Finding all the dependencies on the hard drive, packing them, sending them to the peer, compiling, and sending the output files back to the local machine is what slows down the whole process, but at the same time it is necessary for the process to take place at all.”
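As an aside, here is a minimal sketch of the local preprocessing step Cyryl describes, assuming MSVC's cl.exe is on the PATH (e.g. in a VS developer prompt) and a hypothetical hello.cpp; /P is the MSVC switch that writes preprocessed output to a .i file:

```python
import subprocess
from pathlib import Path

def preprocess_locally(source: Path) -> Path:
    """Run the MSVC preprocessor so that a single self-contained .i file
    can be shipped to the peer instead of the source plus hundreds of
    headers."""
    # /P preprocesses to <name>.i in the current directory; /nologo quiets the banner
    subprocess.run(["cl", "/nologo", "/P", str(source)], check=True)
    return Path(source.name).with_suffix(".i")

if __name__ == "__main__":
    expanded = preprocess_locally(Path("hello.cpp"))  # hypothetical input file
    print(f"{expanded}: {expanded.stat().st_size / 1e6:.1f} MB to send")
```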
The Hoard team is still working on improving the efficiency of the Compiler by:
- Building a cache, so that the same files are distributed only once (a sketch follows after this list);
- Adding a multithreaded compilation switch, so that a peer can compile as many files at once as it has CPU cores (or hardware threads);
- Adding multithreaded preprocessing of CPP files on the local machine;
- Designing a task system, so that while you are waiting for a peer to finish, a new set of files can be preprocessed;
- And finally - distributing the task to many peers.
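A hedged sketch of the caching idea from the first bullet, assuming a simple content-addressed scheme; the names here are illustrative, not Hoard's actual code:

```python
import hashlib
from pathlib import Path

class DistributionCache:
    """Remembers which file contents a peer has already received, so that
    identical headers are shipped over the network only once."""

    def __init__(self) -> None:
        self._seen: set[str] = set()  # content hashes the peer already holds

    def files_to_send(self, files: list[Path]) -> list[Path]:
        """Return only the files whose contents the peer has not seen yet."""
        missing = []
        for f in files:
            digest = hashlib.sha256(f.read_bytes()).hexdigest()
            if digest not in self._seen:
                self._seen.add(digest)
                missing.append(f)
        return missing
```

With such a scheme, recompiling the same project would only transfer the files whose contents actually changed.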
The Hoard Code Compiler powered by Golem Unlimited has improved significantly in speed; however, it is still far from solutions such as Incredibuild, FASTBuild or SNDBS.
We know this might sound a bit bleak, but it's not. We are very happy that for big enough projects the compiler is already able to scale. The scaling is not linear (since we must send far more data through the network), but the assumptions and the concept have been proven, and the work will continue.
Phase III: vici
We are far from saying that we have conquered this case, but we keep fighting, and as long as you fight, you are a winner. We are working hard to optimize the entire process and to release a Compiler that meets the initial assumptions.
Next steps for the Hoard Code Compiler powered by Golem Unlimited:
- Processing several projects in parallel, if they are not dependent on each other;
- Dealing with precompiled header files - a difficult task, as they tend to be huge (hundreds of megabytes) and need to be passed to each peer; even more challenging, they are also incompatible with the preprocessing step;
- Adding compression on top of the tar packages, to send less data through the network at the cost of compression/decompression time, and starting remote compilation as soon as the first file arrives, without waiting for the whole archive to download (if it turns out to be worth it; see the sketch after this list);
- Providing info on how many threads/cores are available on a particular peer (or enabled for GU Provider to use);
- Extending GU’s API or the underlying mechanisms (after analyzing what changes might significantly speed up the Hoard Compiler);
- Handling cases where the command line is so long that the process cannot parse it all (this happens with really long paths);
- And the final test - compiling the UE4 engine or one of the AAA games with thousands of source files.
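As a rough illustration of the compression and streaming idea from the list above, a sketch using Python's standard tarfile module; the function names and the per-file callback are illustrative, not Hoard's actual design:

```python
import tarfile

def pack_compressed(archive_path, files):
    """Pack the compilation inputs into a gzip-compressed tar, trading
    CPU time for less data on the wire."""
    with tarfile.open(archive_path, "w:gz") as tar:
        for f in files:
            tar.add(f)

def unpack_streaming(fileobj, dest, on_file):
    """Read a tar stream as it arrives ("r|gz" never seeks backwards), so a
    peer could start compiling each file as soon as it is fully extracted."""
    with tarfile.open(fileobj=fileobj, mode="r|gz") as tar:
        for member in tar:
            tar.extract(member, path=dest)
            on_file(member.name)  # e.g. kick off compilation of this file
```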
We are confident in the potential and viability of this project, and both Hoard and Golem will continue to support each other throughout the building process. In the case of distributed CPP compilation, we need more time to investigate and to try different solutions in order to find the best setup.
For now, we are going to stay focused on one particular big engine and make it compile faster. In the future, we would like to try distributing other data, such as shaders, which are far simpler in terms of dependencies - even the compiler consists of just a few files, not tens of them.
Stay tuned for new progress updates on this front - the road is long, but we are optimistic - the final results are worth waiting for!