3. Is it possible?
Input
● 1 .cpp/.c source file
● Many .h/.hpp header files
→ Compilation Unit (“CU”)
Output
● 1 .obj/.o binary file
● (Misc. helper files)
4. Research previous solutions
IncrediBuild
● The de-facto standard
● Easy setup, works well
● Pretty, but pricey
● Limited platforms
● Coordinated load balancing
distcc
● Free, but narrow focus
● Needs a homogeneous setup
● Two methods:
  ● Preprocess & distribute
  ● Analyse & distribute
5. The Plan
Send input files
● Start with .cpp/.c
● Find required .h/.inl/...
● Precompiled Header (PCH)
● Compiler executable(s)
● Calculate a hash for every file
Receive output files
● Main .obj binary file
● Misc. .ti/.sbr/.d/... files
● Log output
● Cache inputs using their hashes
● Cache outputs by combining input hashes
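The caching plan on this slide can be sketched in a few lines of Python. The hash algorithm, key layout, and inclusion of the command line are my assumptions, not something the talk specifies:

```python
import hashlib

def file_hash(path):
    """Hash one input file (source, header, PCH, or compiler binary)."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            h.update(chunk)
    return h.hexdigest()

def output_cache_key(input_hashes, command_line):
    """Combine all input hashes (plus the command line, assumed here)
    into one key identifying the cached .obj and helper files.
    Sorting makes the key independent of discovery order."""
    h = hashlib.sha1()
    for ih in sorted(input_hashes):
        h.update(ih.encode())
    h.update(command_line.encode())
    return h.hexdigest()
```

Because every input contributes to the key, any changed header, compiler binary, or flag produces a different key and forces a fresh compile.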
6. Finding all input files
Preprocess Method
● Run the preprocessor locally
● Distribute the result
● Easy to do
● Less parallelisation
● Input file cache not possible
● No PCH support possible
Analyse Method
● Analyse the .cpp file
● Find all dependencies
● Sounds simple, but is tricky
● Slightly more data to send
● Good cache behaviour
● PCH support possible
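The preprocess method boils down to running the compiler's preprocessor locally and shipping the flat result. A minimal sketch, assuming GCC-style flags (MSVC would use `/E`, `/I` and `/D` instead); the function names are mine:

```python
import subprocess

def preprocess_command(compiler, source, include_dirs=(), defines=()):
    """Build the local preprocessing command. -E stops the compiler
    after preprocessing, yielding one flat, self-contained translation
    unit that any server can compile without the original headers."""
    cmd = [compiler, "-E", source]
    for d in include_dirs:
        cmd += ["-I", d]
    for m in defines:
        cmd += ["-D" + m]
    return cmd

def preprocess_locally(compiler, source, **kw):
    """Run the preprocessor and return its stdout (the expanded CU)."""
    return subprocess.run(preprocess_command(compiler, source, **kw),
                          capture_output=True, text=True, check=True).stdout
```

The downside noted on the slide follows directly: preprocessing itself runs on the local machine, and the expanded output is one opaque blob, so individual headers can no longer be cached and PCH files never come into play.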
7. Precompiled Header Files
● Speeds up compilation
● Contains “global” includes
● Is included in every CU
● Proprietary format
● Often very big
● Contains extra dependency information
● May not be deterministic
Source: http://www.ogre3d.org
8. Fun with preprocessor directives
● Directory search order
● <> vs. “” includes
● Multi-line includes
● Case mismatches
● Conditional includes
● PP constants in includes
● PP macros in includes
● Conservative approach
● Find all possible dependencies
● Reasonable overhead
● Cache dependencies locally
● Still a world of pain
● Trial & error
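A conservative scanner for the analyse method might look like the sketch below. The regex and recursion strategy are my assumptions; note that it deliberately over-approximates by following conditional includes unconditionally, and that several slide pitfalls (multi-line includes, macro-expanded `#include FOO`, case mismatches) are exactly the cases it still misses:

```python
import re
from pathlib import Path

# Matches both #include <...> and #include "..." on a single line.
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*[<"]([^>"]+)[>"]', re.MULTILINE)

def find_dependencies(source, include_dirs, seen=None):
    """Recursively collect every header a CU might pull in.
    Conservative: includes inside #ifdef blocks are followed
    unconditionally, so the result may contain false positives."""
    seen = seen if seen is not None else set()
    text = Path(source).read_text(errors="replace")
    for name in INCLUDE_RE.findall(text):
        for d in include_dirs:
            candidate = Path(d) / name
            if candidate.is_file():
                if candidate not in seen:
                    seen.add(candidate)
                    find_dependencies(candidate, include_dirs, seen)
                break  # first matching directory wins (search order)
    return seen
```

Shipping a few headers too many only costs bandwidth; missing one breaks the remote build, which is why the conservative direction is the right trade-off.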
9. Putting it all together
● Collect all input files
● Send them if needed (zipped)
● Build the directory structure in Temp
● Cache & compile
● Collect all output files
● Cache & send back (zipped)
● ???
● Profit!
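The zip-and-ship round trip can be sketched with Python's `zipfile`; the function names and relative-path layout are assumptions for illustration:

```python
import zipfile
from pathlib import Path

def pack_inputs(files, archive_path, root):
    """Client side: zip the input set for one CU, storing paths
    relative to a common root so the server can rebuild the original
    directory structure inside its Temp folder."""
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for f in files:
            zf.write(f, Path(f).relative_to(root).as_posix())

def unpack_inputs(archive_path, temp_dir):
    """Server side: recreate the tree, then run the compiler in it."""
    with zipfile.ZipFile(archive_path) as zf:
        zf.extractall(temp_dir)
```

The same pair of helpers works in the other direction for the .obj and helper files coming back.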
10. It kinda works...
Little problems:
● PCH files don't work
● Long & deeply nested file names
● Absolute paths
● Some compilers need the registry
● Issues with parallel jobs
● And some more...
Big problem:
● Debug info stores absolute paths to the source files!
11. Sandboxie to the rescue
● Virtual file system
● Recreate original paths
● No concurrency issues
● Simple clean-up
● Virtual registry
● Not free (~€10–25 per user)
● Does not solve all problems
● But it's good enough!
12. Miscellaneous titbits
● “Screen saver” mode
● Automatic server updates
● Output file cache (~ccache)
● Data compression woes
● 100 Mbit/s vs. 1 Gbit/s
● Local compilation server
● Parallel local compilation
● Parallel linking experiment
13. So, is it worth the hassle?
● Measuring this is tricky
● Real projects
● In a live environment
● 34 servers, ~17 available
● Maximum speed-up: ~17×
● Uncached: 0.6× – 6.68×
● Cached: 1.06× – 13.13×
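The figures above are presumably the usual wall-clock ratio (my assumption; the slide does not define the metric), which makes the sub-1.0 uncached worst case easy to read:

```python
def speed_up(local_seconds, distributed_seconds):
    """Classic speed-up ratio: time of a purely local build divided by
    time of the distributed build. A value below 1.0 means distribution
    made the build slower (cf. the uncached worst case of 0.6)."""
    return local_seconds / distributed_seconds
```

With ~17 servers actually available, 17× is the theoretical ceiling, which the cached best case of 13.13× comes reasonably close to.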
17. Conclusions
● It's possible to distribute compilation with any compiler
● Speed-up is highly dependent on the environment and use case
● Speed-up is almost always positive, and often greatly so
18. What's next?
● Get other developers involved?
● Leverage an external cloud?
● Distribute other processes? (Asset conversion, ...)
● Find a better solution for PCH?
● Improve or unify the front end with LLVM & Clang?
● Distributed linking?
19. Thank you for your attention!
dietmar.hauser@sproing.com
@Rattenhirn
http://www.sproing.com
http://fb.me/sproing
Questions?