CruiseControl Enterprise Best Practices, Part 6: Scaling up

You just started out with Continuous Integration. You’re building your project in CruiseControl. Great. Now, it’s time you started to plan ahead. This post is about scaling up CruiseControl. The tool can scale up to many projects, but you have to know a few things.

The very first thing to do is make sure that you can build your projects simultaneously. CruiseControl uses Java threads to manage builds and projects. Each project has its own thread. There is at least one Builder thread that actually runs a build, and there is a Build Queue, which mediates the requests from the project threads to the builder threads. You need to give CruiseControl enough builder threads to work with: by default it gives you one thread. The impact of this is that if you have 5 projects configured, you’ll only ever build one at a time. This could be good if you have a tiny build server, or projects that don’t play nicely together.

On the other hand, if you have plenty of capacity to use, you will get a boost from the extra builder threads. The right setting for the number of threads will change over time; don’t hesitate to do some performance management and find the right value for your server.

<cruisecontrol>
  <system>
    <configuration>
      <!-- number of builder threads; the default is one -->
      <threads count="2" />
    </configuration>
  </system>
  <!-- rest of config file suppressed -->
</cruisecontrol>

Jeffrey Frederick wisely points out that it is useless to set the number of builder threads to be higher than the number of projects that you have.

There are plenty of other factors to consider when scaling up your Continuous Integration service: the speed of your Version Control System, where you store artifacts, logs, etc. One factor that I will mention today is disk bandwidth. Building software is a disk-intensive process. Even when a build is done compiling Java code and making archives like jars, CruiseControl has to write logfiles. It’s very easy to overload a single disk with all this activity. Ideally you want to make sure that your machine can build several projects at once.

On one of my very first projects at ThoughtWorks, the team was running two CruiseControl servers, apparently because the first Solaris server was too slow. That didn’t seem right to me, so I dug deeper. With my systems administration background I was able to see that a single disk was 100% busy running the operating system, CruiseControl and the build. I spread the workload across the four disks in the system and the machine was able to manage many more projects.

This pattern has repeated itself on many of my subsequent projects. In my experience not many CI servers are constrained by processor overhead. Unfortunately, it’s often painful to rectify disk issues once a system is up and running. Make yourself some luck and order plenty of fast disk drives before you scale up and people start to complain about the slow build.

The way to implement this is to use the configuration file. The log element is used to tell CruiseControl where to write logfiles. By default it’s a subdirectory of the CruiseControl installation. But if you change the value of the dir attribute of the log element, you can make sure that the logs are being written to a disk that isn’t already running CruiseControl. If you’re using Ant to build your code, you can use the antWorkingDir attribute on the ant element in the config file to make sure that your projects are built on another disk.

I can’t really give a good example for this one as each CruiseControl instance is so different. Buildix is an attempt to make many installs more homogeneous. If you look at the way things are laid out on the disk, you’ll see that the CruiseControl install is in /usr/share/cruisecontrol, but the projects and logs are installed in /var/spool/cruisecontrol: the reason we did this was so that you could mount the /var directory on another disk if things got busy. Drop me a comment if you want to know more.
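
As a rough sketch only, here is how a Buildix-style layout might look in the config file. The project name, the subdirectories under /var/spool/cruisecontrol, the build file, the target and the interval are all made up for illustration; only the dir attribute of the log element and the antWorkingDir attribute of the ant element come from the discussion above.

```xml
<cruisecontrol>
  <!-- "myproject" and the paths below are hypothetical; adjust for your install -->
  <project name="myproject">
    <!-- write logfiles to a disk other than the one CruiseControl runs from -->
    <log dir="/var/spool/cruisecontrol/logs/myproject" />
    <schedule interval="300">
      <!-- run the Ant build from a working directory on the other disk -->
      <ant antWorkingDir="/var/spool/cruisecontrol/projects/myproject"
           buildfile="build.xml"
           target="build" />
    </schedule>
    <!-- rest of project config suppressed -->
  </project>
</cruisecontrol>
```

With /var mounted on a separate disk, both the logs and the build’s working files would land away from the disk that holds the operating system and the CruiseControl install.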


4 thoughts on “CruiseControl Enterprise Best Practices, Part 6: Scaling up”

  1. Banos says:

    Publishing is great. It doesn’t fail the build if there’s a problem, so no false negatives! Great! So what happens when the testers try to grab the build from 2 good builds ago to assess a defect – oops, the publisher failed and no one knew about it .. till now 🙂

    Great site btw.

    B.

  2. Srinath Balaraman says:

    Is there a way to use one CruiseControl installation as a server and distribute the jobs it receives to several agent hosts to parallelize several functions? I know of several commercial solutions in this space, but I was wondering if it is possible to achieve this through open source solutions.

    Thanks,
    Srinath

  3. Srinath Balaraman says:

    I found this btw: http://cruisecontrol.sourceforge.net/distributed/index.html

    I haven’t tried this. I wonder if it would work.

    Thanks,
    Srinath

  4. simpsonjulian says:

    Srinath,

    By all means give it a try. I wish someone had the time to do the integration work to get that into the main CruiseControl project. Other open source CI servers (most famously Hudson) will allow you to run agents and run different builds.

    I’ve got a feeling though, that you’re trying to throw many servers at one huge monolithic build. I did once try clustering to fix that (it failed because the Linux JVM didn’t have proper threads at the time) but I think that introduces additional complexities. There’s a point at which someone has to pay back some tech debt.

    If you can break a monolith into chunks, then perhaps you can farm the chunks out to different nodes. It might be worth paying for a commercial product. I hope this has been of some help.
