Category Archives: Patterns and Practices

Supporting Multiple Environments – Part 3

(part one and two)
In this installment, I’m going to cover the configuration storage mechanism for this separate configuration jar approach.

Configuration Storage

With the approach described in step two, it’s easiest to manage the actual values via property files, stacked up in layers. Here’s what we came up with:

Deployment environment overlay procedure

In our deploy scripts (Ant-based, remember), the first property file loaded wins, because Ant properties are immutable (there’s a sketch of this after the list). From most specific to most general:

deploy environment application on a specific machine – Is there a unique application ID needed for clustering?
deploy environment application specific – What are <appId>’s memory requirements for this deployment environment?
deploy environment machine specific – Are there any specifications that need to be made for a given machine in a given environment?
deploy environment specific – What is the database server associated with this environment?
application – How many connections to your db server should this application default to?
defaults-deploy-env (non-dev) – Where is bash located?
general – What is the port you use to connect to a database server?
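Here’s a minimal sketch of that layering on the Ant side. The file names are illustrative, not our actual layout; the point is that Ant’s <property> task silently skips files that don’t exist (which is what makes most layers optional) and never overwrites a property that has already been set:

<target name="load-config">
  <!-- Most specific first: Ant properties are immutable, so the
       first file to define a property wins. Missing files are
       silently skipped, which keeps most layers optional. -->
  <property file="${deploy.env}/${machine}/${appId}.properties"/>
  <property file="${deploy.env}/${appId}.properties"/>
  <property file="${deploy.env}/${machine}.properties"/>
  <property file="${deploy.env}/environment.properties"/>
  <property file="${appId}.properties"/>
  <property file="defaults-deploy-env.properties"/>
  <property file="general.properties"/>
</target>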

In development, this process is reversed. In our custom-coded (but very simple) Maven plugin, we have the following order of precedence (there’s an example one-liner after the list):

command line – Want to quickly tweak a setting for a single build?
settings.xml – Are you testing a new version of the app server software?
pom properties  – What compiler configuration are you using?
defaults – dev – Where is your app server software installed?
general – (see above)
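For that first layer, the standard Maven mechanism does the job: a -D flag defines a property for a single build, and in our plugin’s lookup order it beats everything below it. A sketch, with an invented property name:

# one-off tweak for a single build: the command line wins
mvn install -Dappserver.home=/opt/tomcat-experimental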

There are slight variations on which configuration layers exist and which should take precedence over one another (the reader’s situation may call for more or fewer layers). All layers aside from the general and defaults levels are optional on both fronts. By no means is this an exhaustive list or the definitive ordering of configuration; it’s just what worked best for us.

In the deployment environment example above, each deployment environment has its own subdirectory, which allowed us to assign unique privileges per stack to prevent accidental changes to the various QA, staging and production environments. The content of this project is regenerated with each build of the configuration project. There are some inherent compromises with this approach (a change to a QA environment re-bundles an unchanged production configuration, for example), but build cycles are extremely short and painless, and each one is given a fully fledged build number in the mainline (project branches are assigned a non-unique snapshot identifier).

Each deployable unit may be shipped to production with different configuration project versions, with a complete history of the changes included in each revision (made available via Hudson).  For security reasons, none of the sensitive staging and production passwords are maintained by development or release engineering.  The configuration values for those things are stubbed out in the respective property files and an operations managed process properly injects the correct passwords into the proper configuration files.
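To illustrate the stubbing (the token format here is invented; use whatever your operations tooling expects), a checked-in production property file carries placeholders rather than real credentials:

# production/db.properties – checked in with a placeholder only.
# An operations-managed process injects the real value at deploy time.
db.user=appuser
db.password=@PROD_DB_PASSWORD@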

Continued…

In the final installment, I’m going to cover how to share this configuration between developers and your deployment environments.


Build Pattern: Facade

Let’s face it: build scripts can age as badly as Steven Seagal. What do you do when yours needs more than botox and a mud pack?

What you shouldn’t do is attempt to rewrite the whole thing from scratch. There’s no guarantee that it’ll look any better by the time you’ve done that, and you’ll just annoy your colleagues.

If a spot of build refactoring isn’t enough, then you could try wrapping the whole build in another tool and swapping out chunks as and when you can. Here’s reader Jason’s comment on using Gradle:

When Gradle consumes an Ant build it treats the tasks as actual Gradle tasks, so you could override the Ant tasks as needed and simplify things until you’re completely ready to replace the old Ant build with a Gradle build.

I’ve recently had some discussions with my brilliant co-workers about similar approaches with Maven: Embedding Ant code using the Maven Ant plugin seems to be for the win.
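As a sketch of that idea (the phase, target name and wiring are illustrative), the maven-antrun-plugin lets a POM delegate chunks of work to the legacy Ant build while you port it piecemeal:

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-antrun-plugin</artifactId>
  <executions>
    <execution>
      <!-- run the legacy Ant compile step during Maven's compile phase -->
      <phase>compile</phase>
      <goals>
        <goal>run</goal>
      </goals>
      <configuration>
        <tasks>
          <!-- delegate to the existing build.xml until this chunk is ported -->
          <ant antfile="${basedir}/build.xml" target="compile"/>
        </tasks>
      </configuration>
    </execution>
  </executions>
</plugin>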

(photo via luc legay)


Build Pattern: Green-lit build

Continuous Integration should be a highway, not a parking lot. But that’s what happens sometimes when developers end up competing for limited Continuous Integration capacity. Developers working on critical and time-sensitive work like production bugfixes can struggle to get their builds serviced promptly; they can be fighting a tide of checkins from their colleagues working on new functionality. Also, functional tests can jam up the available build agents while short builds queue up. This delays the feedback to the developers that they have integrated their code properly.

What to do? Dedicate some capacity to the builds that should take priority. How you implement this depends on your continuous integration system. On CruiseControl (the original one), I made a separate server for this. I’m implementing the same thing on TeamCity at the moment: I’ve added an environment variable to the build agents that I want to reserve for fast builds. Any build that takes longer than 15 minutes is told not to use an agent where the variable exists.
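A rough sketch of the TeamCity side, with an invented variable name: set a property in the reserved agent’s conf/buildAgent.properties, then give each long-running build an agent requirement that the property does not exist:

# conf/buildAgent.properties on the agent reserved for fast builds
env.FAST_BUILDS_ONLY=true

The long builds then carry an agent requirement of “env.FAST_BUILDS_ONLY does not exist”, so they steer clear of the reserved agent.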

One might argue that I’m not making best use of the CI server by doing this. That doesn’t matter. People are more expensive than Continuous Integration servers; let’s optimise the system for them.

(image from Ted Percival)


The hidden cost of building

Thanks to EJ Ciramella for this thought-provoking post. There’ll be a Build Doctor T-shirt on its way to him soon.

In this down economy, irrespective of size of company involved, people want to save money, limit costs and increase throughput of their systems. One area of savings is the build and continuous integration environment.

Of the smattering of continuous integration servers available to a company’s release engineering staff, many offer build buttons with no ACL control. Without this control point, anyone in any department attached to the corporate network can kick off a build, regardless of the readiness of the code within source control.

Where I work, we’ve recently dropped CruiseControl for Hudson. A few colleagues have come by since the rollout asking where the build button is because they don’t have access. When pressed about why they’d like to manually spin a build, the resounding answer has been “to see what it looks like in Hudson”. This is the exact situation we’re trying to avoid and the exact subject I’ll try and illuminate within this article.

Before diving any deeper, there are a few things I think release engineering for any company must understand. Each company has a unique workflow, from project concept, to design, to implementation, to release, and off into support mode. What works great at one place may or may not work at another company. There are industry best practices and white papers aplenty, but if you find it difficult to follow any of these at your company, the best approach in most cases is to take what you can from these documents and carefully plan an evolutionary (not revolutionary) process to reach a tailored solution.

With all of this in mind, let’s cover the various steps and the unseen costs to performing builds.

Here is a high-level list of things that happen when we spin a release-type build.

  • SCM label

Initially, we used to label first and ask questions later. The mindset was: if the build fails, a developer could sync (we use Perforce) to a label and get exactly what the build server had used to generate the failure. Since the Perforce plugin for Hudson doesn’t operate in this manner (and differs in a few other ways that led us to alter the plugin to suit), we made the switch to labeling only if the build passes. Since very few developers ever did take the time to sync by the labels, the failed-build labels were just a waste of space. Either way, depending on the size of the codebase being labelled, attaching a label consumes disk space and memory on the Perforce server.
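As an aside, the label-on-success approach is easy to sketch in plain Ant using its optional Perforce tasks (the depot path and build number property here are illustrative); the labeling target simply depends on the build and test targets, so it never runs for a failed build:

<!-- Only reached once compile and test have already succeeded. -->
<target name="label" depends="compile, test">
  <p4label name="build-${build.number}"
           view="//depot/myproject/..."
           desc="Labelled by the build server after a passing build"/>
</target>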

  • Build node (CPU/RAM/HD)

I’m sure that the visitors of this site are savvy enough to understand the beauty and flexibility that come with a distributed build system. I use the term “distributed” here loosely, as any given build is not farming out various parts of compilation, just individual jobs (in Hudson parlance) or, in some cases, individual Maven modules. By the time a build reaches the queue stage, the job has consumed an executor within the cluster. In our current cluster, we have three slaves and a master. The master and two slaves are only allowed one executor each; the third slave has six. If a build is forced to run on one of the “singleton” nodes, that node becomes unavailable until the build is completed. Thankfully, our build times are short (the longest is 20 mins), but the sheer speed of firing a build, I think, mentally cheapens the process. Don’t misinterpret this as “let’s artificially inflate build times”, but keep it in mind when a large refactoring of the build process and associated scripts yields a massive time savings (heck, why not push back to get more testing done in that same time frame?).

  • Client spec updates (Perforce Server)

One of the great features of Perforce is the server is where the “what you have” list is maintained. I’ve seen arguments for working offline and if a refactoring happens, yadda-yadda-yadda. But in large-scale corporate environments, many institutional services will be unavailable in this mode.

When a person syncs a project (in Perforce terms; essentially a directory of directories and files), the server updates its “have” list to reflect what the user ends up with. The same is true of the build process. When setting up Hudson, our SCM configuration choices left us with Hudson managing the client specs. This means, in some cases for us, there is one client spec per node per job. Even if only one is getting updated, that is still data being written to the server (again).

  • Artifact storage (both live and backup)

Now the build is finished and we’re going to retrieve the artifacts for storage. Where I work, the build artifacts are stored in a few ways (all on various NetApp slices). They are as follows:

  • Deployable units – These are the actual applications that we push to our various deployment environments. Because of a facilitating Maven project that allows cross-application dependency validation, some of these artifacts are stored on a generalized NetApp slice (to allow us to keep our artifact storage Continuous Integration agnostic – who knows when we’ll switch from Hudson to something else) and replicated to Archiva to allow people to reference certain bits as dependencies from pom files.
  • Libraries – These are effectively the building blocks of the deployable units. The final destination of libraries is Archiva (or your repository manager of choice).

Essentially, there are two places things are stored, Archiva and our “buildartifacs” mount, to reiterate, both of which are NetApp mounts. There is a backup mechanism that keeps around hourly, daily, monthly and yearly restore points. All of this takes up space but if we ever had a complete system failure, we could very quickly return to business as usual.

  • Potential deployment of said artifact

Now that the build is done and someone within the organization has chosen to deploy and test it, they may deploy it (or put a dependency on it and trigger an application build – all with no changes) to a given stack for testing. One of our typical artifacts is close to 280 MB zipped. To deploy and test successfully, this artifact is extracted on at least two servers, and typically a 139 MB web content artifact is deployed at the same time (as we have to keep these things in sync). Deployments back up the previous deployments (just a few) prior to extracting the new item.

  • Testing requirement of artifact (with no changes)

Once deployed, now comes the tax on the squishy bits (humans) if you have no automated integration/smoke/load testing. And if you do have all those tiers of testing, you’ll be consuming each one along the way. Couldn’t everyone be doing something more productive than re-testing a deployment that contains no changes?

  • Other

There are other fiddly bits that happen as part of each build, like sending emails, various cleanup steps, etc., not to mention the can of worms that is dependency generation (if Build-A happens, Build-B needs to be re-spun to pick up the change in Build-A, ad infinitum, ad nauseam). What if the build artifact has to be transferred to another country for deployment? Transferring 280 MB of data while people are trying to sync an as-yet un-proxied Perforce project or retrieve dependencies from Archiva is not a good way to spend your workday.

Everything above consumes hardware resources, time and space that could be reserved for a legitimate build or better testing: from CPU time, to memory allocation, to the various stopping points and distribution mechanisms. And if people go ahead and deploy and try manual testing, now we’re talking about consuming one or more human resources. I cover a single large application above with the quoted sizes, but in fact this Maven module generates another 120 MB artifact that is stored in Archiva, which is consumed by another deployable unit that is 338 MB zipped.

This is why it’s best to either limit or prevent people from firing builds manually. I’ve taken the tack that if we find people spinning unnecessary builds, we’ll revoke their privileges within the Hudson matrix ACL settings. I’m not opposed to taking away further permissions, forcing more people to rely on the polling aspect of Hudson.

(image care of swimparallel)

Build Pattern: The Captive Build Tool

Check your build tool into your version control system. Ideally you’d do this in a location relative to your project(s). That way you can have a go.bat or go.sh: a one-line wrapper script to call the correct build tool from your project. Don’t get clever. This should be the simplest script you can manage.
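A minimal go.sh, assuming the build tool lives in a tools directory next to the script (the Ant version and path here are illustrative):

#!/bin/sh
# Delegate everything to the Ant that is checked into version
# control, located relative to this script.
exec "$(dirname "$0")/tools/apache-ant-1.6.5/bin/ant" "$@"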

Once you have this set up, a new developer can be cutting (and building) code on their first day. This is a huge boost for your newbie developers. This pattern brings more love to your team because a new library doesn’t mean all of your guys downing tools to add it.

I was going to call this pattern Ant Farm, but that would have been a little Java specific. We’d also need NAnt farm and Phing Pharm. So why do this? The build tool should be vanilla enough to deal with any project. If that’s the case then it becomes a commodity on the developer’s computers. However in this messy world it never seems to work that way. Here’s why:

  • Someone will have to use a new feature, which calls for everyone to upgrade the build tool
  • You’ll end up using a task which insists on being on the boot classpath
  • One day you’ll look at the build tool and find half a dozen key dependencies are residing in it.

So go with the flow. Make one canonical build tool and make it a damn good one. Put all the useful tasks in it. Yes, all of them. But with one exception: make sure that they aren’t project specific. Do you want to have to have the right version of the build tool to build and test your application? No. Didn’t think so.

Drive Mappings: argh!

(image taken from William Hook’s Photostream)

There are several things that are the root of all evil: money, love of money, and mapped drives on Windows operating systems. This is especially true in a deployment context. Your deployment system should accept UNC paths for the servers it wants to know about.
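In other words (server and share names invented for illustration), prefer the path that means the same thing from any account on any machine:

rem Fragile: assumes someone has already mapped X: on this box
copy release.zip X:\deploys\

rem Robust: the UNC path needs no per-machine drive mapping
copy release.zip \\appserver01\deploys\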

Using Windows is one thing. Using Windows badly: inexcusable. Matt Lacey has a word or two on the subject.


CruiseControl Best Practices: Keep your dependencies to yourself

This is the second of ten practices for CruiseControl

The average Java project has many dependencies – open source tools and frameworks, third party libraries, libraries that come from your project or organization – the list is endless. When I wrote this article, my current project had 84 jar files that it depended on (or could have depended on!). Much pain can come from the way you manage these dependencies. Sometimes it’s harder than you think to make software build or execute if it’s not clear what the dependencies are. A lot of the pain of joining a new project and taking a day to make the code build comes from unclear code or environmental dependencies.

It’s a good idea to keep your CruiseControl installation and its dependencies clearly separated from your projects and their dependencies. So for the most basic example, lay out your CruiseControl install something like this:

|-- cruise
|   |-- 2.7.0
|   `-- 2.7.1
|-- logs
|   |-- blowfish
|   `-- scorpion
`-- projects
    |-- blowfish
    `-- scorpion

Upgrading CruiseControl should be the work of minutes. If you need to add libraries to CruiseControl itself, this should be a warning sign that your project has dependencies it can’t itself satisfy. The only exception to this rule that I can think of is custom CruiseControl bootstrappers or publishers and the like.

There are two things in particular to take note of here: the logs and the ‘.ser’ files. The logfiles represent the history of your project: this is why I always try to keep them in a different directory hierarchy from the CruiseControl installs. CruiseControl will also by default persist the state of its projects in files with the extension ‘.ser’. Make sure to keep those when you upgrade. If you do this, and your project has no dependencies on CruiseControl, it should be simple to upgrade.

Next, think about the dependencies you have baked into your build tools. There’s a common pattern of putting dependencies into Apache Ant’s ‘lib’ directory. For things that are hard to wire in, like JUnit, fine. But if your build depends on something that is actually resident in your build tool, then you have a problem. It will work today, but not necessarily tomorrow. If you upgrade libraries that your code depends on in the build tool, you can’t build yesterday’s code – which makes things interesting when you’re trying to resolve a production bug!

The good news is, it’s easy to fix: make the project contain its dependencies. For example, if your Ant build depends on Oracle (say, to support rebuilding the database with the ‘sql’ task), and your standard project Ant build contains the Oracle drivers, your build may look like this:

<target name="drop_all_tables">
  <sql driver="oracle.jdbc.driver.OracleDriver"
       userid="user" password="donttell"
       url="jdbc:oracle:thin:@localhost:1521:orcl"
       delimiter=";">
    <transaction src="${sql.dir}/drop_all_tables.sql"/>
  </sql>
</target>
			

However in this case, the dependency on the ‘in the default classpath’ Oracle driver isn’t stated; Ant will just look for the ‘oracle.jdbc.driver.OracleDriver’ class in the default classpath. The first thing to do is put the driver jar file into the project and set a classpath:

lib/
`-- ojdbc14.jar

<path id="buildtime">
  <pathelement location="${lib.dir}/ojdbc14.jar"/>
</path>

<target name="drop_all_tables">
  <sql driver="oracle.jdbc.driver.OracleDriver" userid="user" password="donttell"
       url="jdbc:oracle:thin:@localhost:1521:orcl"
       delimiter=";" classpathref="buildtime">
    <transaction src="${sql.dir}/drop_all_tables.sql"/>
  </sql>
</target>
			

Note the addition of the ‘classpathref’ attribute in the ‘sql’ element. This useful little attribute allows you to refer to paths elsewhere within the build, reducing duplication. When you have moved a dependency into a project, take to ceremonially deleting it from the old location the moment that you can. On a project where I recently undertook this process, I was fortunate to have a regular release working in my favour: I fixed the dependencies in the trunk and revisited a few weeks later, once the release branches that depended on things outside the project were no longer in use.

So in summary, think about your project’s dependencies, including the ones that are satisfied only because they happen to be satisfied. If you make those dependencies explicit, I promise that you’ll be asked fewer questions by your colleagues about why they can’t compile! If you have many projects with the same dependencies, that’s something that I hope to address in a future post.


CruiseControl Best Practices: Configuration the CruiseControl way

This is the third article in the CruiseControl practices series

You just started using CruiseControl. You use a Version Control System to manage your code. You installed CruiseControl on a spare computer in the office; now it is giving you immediate feedback on the changes that occur in that codebase. Life is good. Then the disk on that spare computer fails, and your build server resumes its previous role as a doorstop.

“No problem,” you think: “All the code changes are in the VCS. We can regenerate any of the artifacts that we need to. In fact, all we need is the config file …”. Yes. That config file. The config file on the hard disk that doesn’t work anymore. This post will outline how to manage your configuration for CruiseControl without fear of losing it. Like many tools, CruiseControl becomes quite useless without configuration.

In projects that I did for ThoughtWorks, we always needed to allow someone to have access to the build server to make configuration changes for CruiseControl. Once projects grow past a few developers, it becomes hard to have everyone be familiar with the installation. Typically we end up with one person (sometimes your humble narrator) becoming the dedicated CruiseControl administrator for the project. This change creates a bottleneck in the team because all the changes to CruiseControl then become funnelled through that one person.

The first step in mending this situation is getting CruiseControl to apply its own configuration. Let’s get started. In addition to the projects that build and test your code, you will need a new project to apply the configuration to the server. We have been doing this to put the configuration file in the right place, using CruiseControl’s <bootstrapper> plug-in to update the configuration files when they change:


<?xml version="1.0"?>
<cruisecontrol>
  <project name="config">
    <labelincrementer defaultLabel="${project.name}-1" separator="-"/>
    <listeners>
      <currentbuildstatuslistener file="/var/spool/cruisecontrol/logs/${project.name}/currentbuildstatus.txt"/>
    </listeners>
    <bootstrappers>
      <svnbootstrapper localWorkingCopy="/etc/cruisecontrol"/>
    </bootstrappers>
    <modificationset quietperiod="30">
      <svn localWorkingCopy="/etc/cruisecontrol"/>
    </modificationset>
    <schedule interval="60">
      <ant antWorkingDir="/etc/cruisecontrol" antscript="/var/spool/cruisecontrol/tools/apache-ant-1.6.5/bin/ant" uselogger="true"/>
    </schedule>
    <publishers>
      <artifactspublisher file="${project.name}/build.log" dest="logs/${project.name}"/>
    </publishers>
  </project>
</cruisecontrol>

This will robotically update the configuration until the end of time. It’s simple but surprisingly effective. There’s no longer a dependency on the person who can make changes to CruiseControl; suddenly they don’t need to make the trivial changes on behalf of the rest of the team, because anybody can safely change the configuration. If someone does check in a broken configuration, it’s all under version control: once you revert the change you can find the person who changed it and try to understand what they wanted to do.

This is a big step forward. But a broken configuration will still be checked in and shipped to the server. Fortunately CruiseControl has the good sense not to apply a broken configuration; but you’re missing a vital piece of feedback, and for that you need to write a simple validator like this one:

package org.juliansimpson;

import java.io.File;

import net.sourceforge.cruisecontrol.CruiseControlException;
import net.sourceforge.cruisecontrol.config.XMLConfigManager;

import org.apache.tools.ant.BuildException;
import org.apache.tools.ant.Task;

public class ConfigValidator extends Task {
    public String configFile;

    public void execute() throws BuildException {
        try {
            // Parsing the file with CruiseControl's own config manager
            // is the validation: it throws if the configuration is not
            // legal for this version of CruiseControl.
            File file = new File(configFile);
            new XMLConfigManager(file);
        } catch (CruiseControlException e) {
            // Fail the Ant build so the team gets the feedback.
            throw new BuildException("Invalid CruiseControl Config");
        }
    }

    public void setConfigFile(String config) {
        configFile = config;
    }
}

The validator uses internal classes of CruiseControl itself to validate the configuration. Ideally we would have an external interface to do this – perhaps a command line option or an “official” Ant task. This approach does mean that you need to set the classpath so that the validator can find your CruiseControl install, but this way you find out with certainty that the configuration is valid for your version of CruiseControl. I like to run these as an Ant task. It’s very simple and easy for everyone to see what it does. Here’s how I included it in a simple Ant build:


<project name="cruisevalidator" default="publish">
  <import file="build-library.xml"/>

  <target name="validated-config" depends="cruise-validator.jar">
    <taskdef name="validate" classname="org.juliansimpson.ConfigValidator" classpathref="main"/>
    <echo message="validating ${config}"/>
    <validate configFile="${config}"/>
  </target>

  <target name="publish" depends="validated-config">
    <echo level="info" message="copying CruiseControl config to server"/>
    <copy file="${config}" todir="${cruisecontrol.dir}" failonerror="true" description="Copy configuration to CruiseControl server"/>
    <echo level="info" message="forcing a reload of config on server"/>
    <get src="http://localhost:8000/invoke?operation=reloadConfigFile&amp;objectname=CruiseControl+Manager%3Aid%3Dunique" dest="${build.dir}/reload.html"/>
  </target>
</project>

It all works together like this: the CruiseControl bootstrapper fetches the latest CruiseControl configuration, but in isolation from the CruiseControl install – you still don’t know if it is a valid configuration file yet. The “validated-config” target calls the ConfigValidator Ant task. This invokes enough of CruiseControl to make sure that the configuration is legal, and that some of the directories referred to in the configuration exist. If that passes, the “publish” target copies the configuration to the CruiseControl server itself. Finally the same target forces a reload of the CruiseControl configuration using a simple HTTP request to the JMX interface. This ensures that the configuration is reloaded immediately, so that the team knows the configuration is valid. Thanks to my erstwhile colleague Tim Brown for this great idea.

Summary: I have to admit being careless sometimes with XML configuration files. This approach works particularly well for me because I have the safety net of the validation. I do a similar thing with my email and web server installation as well, which I hope to write about soon. The validator code and build files are available here.

Update 2008-12-04: Fixed the link. Thanks to Tom Howard for pointing this out!


Ant Best Practices: Use Ant as the Least Common Denominator

We’re back to the best practices this weekend with 12 of 15: Use Ant as the Least Common Denominator. What are we talking about? The answer is here.

What does “least common denominator” mean here? There’s generally a conflict around this on software projects. I’ll explain:

The developer wants to write code. Fair enough. That’s kind of their job. IDEs give them an immense productivity boost in doing that job; if they need to switch to another application, they lose focus. So there’s a booming IDE plugin market, and developers will tend to avoid vendor tools in favour of IDE plugins: version control plugins, database plugins, tracking system plugins, etc. By doing so they can become very, very effective.

The release manager and his superiors have a different angle on this: they want to know that the code is deployable. To be deployable, you have to be able to build it. So they have Ant, or some other build tool. That way they know that the code will consistently build and pass unit tests.

Developers (like most people) don’t tend to like impediments to their productivity. Release managers aren’t fond of code that they can’t metamorphose into a working system whenever they feel like it. So developers would rather build all the code using an IDE than have to switch to a command line and build the code. Most release managers don’t want a dependency on a tool that doesn’t really address their needs. It can also be difficult to automate IDE builds.

So the conflict is that each camp has valid reasons for not liking the tools of the other. Which way do you tip the scales? Projects that aren’t constrained by developer productivity (e.g. maybe you have a shortage of testers) should be setting clear guidelines about the build tool. On the other hand, I’d get the kid gloves out if the developers were under pressure to deliver. Eric M. Burke suggests that you at least make sure that:

  • There’s an Ant build that the developers can use
  • They run it before checking in
  • They can use whatever tool they like until checkin time.

That way, you know that you can reproduce the software later; that Bob didn’t check in code that builds against the development version of a library. The regular checkpoint of the pre-checkin Ant build will allow some flexibility for the developers.
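A sketch of the kind of minimal pre-checkin Ant build this implies (the directory layout and target names are invented for illustration):

<project name="precheckin" default="test">
  <!-- The build every developer runs before checking in,
       whatever IDE they used to write the code. -->
  <target name="compile">
    <mkdir dir="classes"/>
    <javac srcdir="src" destdir="classes" classpath="lib/junit.jar"/>
  </target>

  <target name="test" depends="compile">
    <junit haltonfailure="true">
      <classpath>
        <pathelement location="classes"/>
        <pathelement location="lib/junit.jar"/>
      </classpath>
      <formatter type="brief" usefile="false"/>
      <batchtest>
        <fileset dir="classes" includes="**/*Test.class"/>
      </batchtest>
    </junit>
  </target>
</project>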

It works for me.


Ant Best Practices: Use version control

Amazingly, it’s article 11 of 15 in my series on Ant Best Practices. Today’s practice is ‘Use version control’ and I can’t help but wonder if this one hasn’t dated a little. When the original article was written, Subversion didn’t exist, Perforce wasn’t free, and most people used CVS. I’m not going to get nostalgic about CVS. It was better than the alternatives (ever used SCCS?). I wouldn’t go out of my way to use it now.

So would anybody actually challenge this anymore? The original article by Eric M Burke suggests that people would version code but not the build files. I’d be gobsmacked if anybody did that nowadays. Drop me a comment or note at ‘medic@build-doctor.com’ if you seriously disagree with version control for your build files. I’d really like to hear why.

I’m chapter and verse with Eric here:

  • Version control your code
  • Version control your build files
  • Version control your binary dependencies (unless you have an external dependency manager)
  • Don’t version control build output
  • And try not to mess around with checking things in and out of version control from your build. It generally gets ugly.

The sin of not using version control has been done to death anyhow. Ask Joel.
