SparkBuild – build optimisation

This is a guest post from Scott Castle of Electric Cloud. I’ve been wanting to get these guys on the blog for a while.

Scott has 5 USB drives full of Electric Cloud software, videos, docs and more to give away. All you need to do is tweet the #sparkbuild hashtag, and 5 lucky people will be chosen at random to get one sent out.

Take it away, Scott!

I hate doing laundry (which is ironic because I love clean clothes). My problem isn’t with the bleach smell, or the crap-quality washing machines at the laundromat, or even having to fold everything after; my problem is that it’s not a mindless task. I’ve got to:

  • sort all the clothes into washer-size batches – bright colors, darks, whites, hot and cold water fabrics, delicates
  • count the quarters on hand,
  • decide which batches will be dried all the way and which only need a half-cycle,
  • make a plan about how many loads and in which order to maximize throughput and minimize quarter use…
  • you get the picture.

This is not an automatic task. It takes a lot of brain power to plan the logistics of it all. But, hello? It’s laundry. As a programmer, this is not how I want to spend my time! I’d like to be able to dump all the clothes into the washer, and get clean ones out a little later, and not have to think about this ever again.

I also hate manually compiling code, and I take as many shortcuts as I can get away with. Nobody does full builds unless they’re in the release group, but I don’t do full incrementals either (going to the top level of the code base and typing ‘make all’); in the time it takes to parse every makefile and build everyone’s changes, I could have gotten in a load of towels, at least. So I (and I’m betting you too, if you’re a programmer) go to every directory where I know I have a prerequisite, run ‘make all’ there, and then build my own changes. This is much faster than waiting for a full incremental, but it makes me think of laundry, all that sorting and planning and fluffing and folding…
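For anyone who hasn’t lived in one of these code bases, the setup looks roughly like this (the directory and target names are invented for illustration; the point is that only the top-level makefile knows the build order between components):

```make
# Top-level Makefile: recurses into each component directory in order.
SUBDIRS = libcore libnet app

all:
	for d in $(SUBDIRS); do $(MAKE) -C $$d all; done

# libnet/Makefile: libnet depends on headers and archives produced by
# libcore, but nothing in THIS makefile says so -- the ordering lives
# only in the top-level SUBDIRS list.
libnet.a: net.o
	$(AR) rcs $@ $^
```

So the manual shortcut is `cd libcore && make all`, then `cd ../libnet && make all`, and only then building your own component: you are acting as the dependency resolver yourself.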

A colleague of mine has the same frustration and, being a better coder than me, wrote a solution. It turns out that if you collect a little data when a full build is run, you can use that data as a map to calculate something he calls a ‘subbuild’ – the critical path of prerequisites needed for the target I want to build, and where to find the rule to make each one of them.

I know what you’re thinking: that’s just an incremental! If I had a single make instance, that would be true (and I’d have to parse and evaluate the whole makefile every time I ran make), but I’m working on a code base which uses recursive make, so I can’t just go to the top and say ‘make mycomponent.exe’. The subbuild technique makes a recursive make structure operate as if it were a single make instance, and that is great because now I don’t have to decide, each and every time, which components to build before I compile my own code.
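The core idea is simple enough to sketch. Suppose a full build recorded, for each target, the directory whose makefile builds it and its prerequisites (the graph below is invented for illustration; SparkBuild’s actual data format will differ). Computing a subbuild is then a depth-first walk that emits the prerequisites of your target in dependency order:

```python
# Hypothetical dependency data recorded during one full build:
# target -> (directory whose makefile builds it, list of prerequisites).
GRAPH = {
    "app/app.exe":       ("app",     ["libnet/libnet.a", "libcore/libcore.a"]),
    "libnet/libnet.a":   ("libnet",  ["libcore/libcore.a"]),
    "libcore/libcore.a": ("libcore", []),
}

def subbuild_plan(target, graph):
    """Return (target, directory) pairs in build order, dependencies first."""
    order, seen = [], set()

    def visit(t):
        if t in seen or t not in graph:
            return
        seen.add(t)
        directory, prereqs = graph[t]
        for p in prereqs:          # build everything I depend on first
            visit(p)
        order.append((t, directory))

    visit(target)
    return order

for target, directory in subbuild_plan("app/app.exe", GRAPH):
    print(f"make -C {directory} {target}")
```

Each emitted step is just a `make -C <dir> <target>` invocation, so the recursive structure is preserved; the tool only decides which directories to visit and in what order, instead of you doing it by hand.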

My colleagues have coded this technique up into a tool that works with GNU Make (3.80 and 3.81) and NMAKE (7 and 8), and we’ve released it as a free tool, SparkBuild, which you can try yourself. And if you’re interested in more technical information about make, subbuilds, and dependency trees, check out this post.

Now, if only we could write something to do my laundry…

Image thanks to AlexJReid. Disclaimer: I’m getting no kickbacks for this.


3 thoughts on “SparkBuild – build optimisation”

  1. duality72 says:

    So my company has been using the buildlist feature of Ivy to get a list of dependencies for the current project and building only those projects. Anything qualitatively different here? Since we’re trying to move to Maven, if you know of something similar for Maven, I’d love to hear about it.

  2. admin says:

    @duality72 – Maybe the ElectricCloud guys can give the final answer, but my understanding of SparkBuild is that it’s going to help you build a huge C or C++ project using Make. This is incredibly convenient if you’re trying to maintain such a beast. I’m guessing you work at a Java shop, so here’s my take on it:

    If you absolutely need to have as many distinct projects as you do (I think it’s an anti-pattern of our industry that we make far too many separate sub-projects), then I’d look at using Ivy or Maven to fetch binary dependencies. And you can use Continuous Integration to build those dependencies.

  3. Scott Castle says:

@duality72 – the word ‘dependency’ gets overloaded when talking about build tools. In Ivy’s case, the dependencies of interest are project-level and binary, specifying external projects your app will need to get or build. For Make, the dependencies specify the relationships between individual sources within a single build; it’s adapted to languages where the order of compilation is left to the user (read: C/C++).

Basically, it’s the same problem, but Ivy is solving it at a macro level, and make is solving it at a micro level.

