Category Archives: Deployment

What rPath do

I tweeted:


YouTube – Cut the C.R.A.P. http://bit.ly/a9E9b5 [hint: it’s from rPath]

Jordan tweeted:


@builddoctor I still have yet to see anything from rpath actually showing anything rpath does or sells. They drown everyone in marketing.

The PR lady emailed:

rPath is a next-generation data center automation solution that provides a unique approach to automating the construction, deployment and maintenance of software systems across physical, virtual and cloud-based environments. For enterprise IT organizations, rPath can:

  • Accelerate IT — reducing deployment cycles from weeks to minutes
  • Reduce cost — improving system-to-admin ratio by 6-10X
  • Mitigate risk — for auditable and compliant software systems

rPath offers a model-driven and version controlled approach to automating software system deployment and maintenance. This provides deep transparency and control for managing the lifecycle of deployed software systems.
rPath is based on a version-controlled repository that acts as a definitive software library for controlled reuse of application, OS, middleware and other system artifacts across development, test and production organizations.
From this repository, rPath automates four key functions:

  • Automated generation of images for rapid deployment to physical, virtual or cloud targets
  • Automated updates and rollbacks
  • Compliance reporting and remediation
  • Controlled lifecycle promotion

So there you go. Seems very comprehensive.

Supporting Multiple Environments – Part 2

(See part one of this series here.)
In this installment, I’m going to cover the four cheap-fast ways I’ve seen discussed to generate configuration for your application.

Managing Configuration

There are a few ways I’ve found or seen discussed over the years for managing the configuration portion of a deployment (local or otherwise). I don’t think any one way is best for everyone; each company is unique. The decision should be based on what suits your company best, not on how many plugins you can configure or how far you can change the way the business operates (within reason). There are other, more complex approaches, like Spring-based configuration, but this discussion is limited to the cheapest (simplest) ways to manage configuration. Below are four very common approaches.

– Spin a version of your deployable unit (preconfigured) for each deployment option (one artifact)

This is the most expensive (with regard to time and space) option of the four. It assumes that you have very few deployable units and very few stacks to deploy to. This option can be configured via profiles in the POM, leveraging values stored either in profiles or in a settings.xml (or a host of other inconvenient ways). Given what I’ve seen during my recent stint interviewing and what comes across the Maven mailing lists, I think this is no longer an option for most. The cost becomes painfully obvious when the artifact starts getting beyond 50 MB, and the concept fails to scale with the addition of machines or deployment environments.
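
As a minimal sketch of that approach (the profile names, property, and paths here are made up for illustration; the elements are standard POM), the per-environment profiles might look like this:

<profiles>
  <profile>
    <id>qa</id>
    <properties>
      <db.host>qa-db.example.com</db.host>
    </properties>
  </profile>
  <profile>
    <id>production</id>
    <properties>
      <db.host>prod-db.example.com</db.host>
    </properties>
  </profile>
</profiles>
<build>
  <resources>
    <resource>
      <!-- filter ${db.host} and friends into the packaged resources -->
      <directory>src/main/resources</directory>
      <filtering>true</filtering>
    </resource>
  </resources>
</build>

Running “mvn package -Pqa” bakes the QA values into the artifact – one build (and one artifact) per environment, which is exactly where the time and space costs come from.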

– Spin a single version of your deployable unit and build multiple configuration artifacts (two artifacts)

At a previous employer, this was the initial choice for a first foray into configuration for our deployable units. This regularly took about three hours to finish, and each additional machine was painful to manage (all stored in the build user’s settings.xml, checked into Perforce), let alone adding additional stacks. To add insult to injury, having to sit through another three-hour build because ONE machine in ONE stack had ONE property tweaked was both painful and embarrassing. This option also fails to scale with the size or complexity of the environments you’re trying to support: each additional stack could add upwards of 10 minutes to the build.

– Spin a single version of your deployable unit and have it include templates and configuration for those templates (one artifact)

This is a step in the right direction.  But the configuration is still bound to the artifact you’re attempting to deploy.  It seems like a waste to generate a new deployable unit if the configuration for a particular application in a given deployment environment has changed.  To a degree, this approach also fails to scale.  You’ll end up including more and more configuration.

– Spin a single version of your deployable unit and have it include templates with separate configuration (two independent cycling artifacts)

This is the final evolution I chose. Breaking the configuration out of the deployable unit and making it an artifact in its own right (i.e. pushed to our repository manager) allowed the same version of an application to be redeployed instantly with a configuration change that took mere seconds to build (with the option of just reapplying configuration to an already-deployed application).
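
As a sketch of the packaging side (the elements are the standard Maven assembly format; the directory name is hypothetical), the configuration can be a tiny assembly of its own, built and pushed to the repository manager independently of the application:

<assembly>
  <id>config</id>
  <formats>
    <format>jar</format>
  </formats>
  <includeBaseDirectory>false</includeBaseDirectory>
  <fileSets>
    <fileSet>
      <!-- nothing but templates and properties files lives in here -->
      <directory>src/main/config</directory>
      <outputDirectory>/</outputDirectory>
    </fileSet>
  </fileSets>
</assembly>

Because this artifact contains nothing but configuration, it builds in seconds, and the deploy script can pull the application and its configuration at independent versions.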

Continued…

In the next installment, I’m going to cover the configuration storage mechanism for this separate configuration jar approach.

Supporting Multiple Environments

One of the longest-running debates I’ve watched in the Maven community over the years is how best to support different configurations for different deployment environments (where “deployment environment” means a QA environment or staging environment – anything other than a local developer build). Some people argue simplicity over the complexity of some of the other options mentioned below. Others don’t have spare disk or time to burn building complete and unique deployment artifacts.

In this first installment, I’m going to cover my favorite deployment tool – Ant.

Packaging and Deployment

I tend to prefer using Ant as a deployment mechanism, forgoing Perl or various shells, because of the simplicity of the language. Anyone can read and understand a well-written Ant script (Ant is inherently self-documenting). One of the features I love is the “expandproperties” filterchain used in combination with a copy command. If your resources are in directories that mirror their final locations, then it’s really easy to manage this with assembly descriptors (where the filtering can happen) and in the deploy script (if you’ll play along and use Ant). There are all kinds of little perks to using Ant, like being able to add “-Dname=value” on the command line to override a known configuration setting without having to rebuild your configuration. Here’s a very simple example of how I’ve configured applications in the past in a deploy script:

<copy todir="somelocation">
  <fileset dir="templates"/>
  <filterchain>
    <expandproperties/>
  </filterchain>
</copy>

It is (or can be) as simple as that.
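
To flesh that out slightly (a sketch; the target, file, and property names are invented), the copy usually sits in a target that first loads per-environment properties. Because Ant properties are immutable once set, anything passed with -D on the command line wins over the file:

<target name="configure">
  <!-- anything set with -Dname=value beats the values in this file -->
  <property file="${env}.properties"/>
  <copy todir="somelocation">
    <fileset dir="templates"/>
    <filterchain>
      <expandproperties/>
    </filterchain>
  </copy>
</target>

So “ant -Denv=qa -Ddb.host=tempbox configure” renders the templates with the QA values, except for db.host.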

Assembly descriptors can be a bit trickier. Initially, we used the <fileSets> stanzas to grab lists of wild-carded files, but this (obviously) won’t fail if the descriptor can’t find a file. So we migrated (as much as we reasonably could) to individual file listings via the <files> stanzas. We also have three types of assembly descriptors: “dev” for local builds, which has filtering set to true; “prod” for deployable artifacts, which plucks files directly from the source tree and has filtering turned off; and a “common” component descriptor for grabbing everything that is uniform between dev and prod and requires no processing.

This approach has some pros and cons – obviously one major con is having to list out each file individually – but the giant upside is that if a file is missing because of relocation, the build will fail and you’ll instantly be alerted. Another downside is that the assembly descriptors are essentially the same, with “outputDirectory” and filtering the only real differences. One of the benefits is that you don’t have to memorize where a template file lives in its final resting place. A cursory glance at the resources directory gives you the exact target location.
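
For illustration (the paths are hypothetical; the elements are the standard assembly descriptor format), an entry in the “dev” descriptor looks something like this:

<assembly>
  <id>dev</id>
  <formats>
    <format>dir</format>
  </formats>
  <files>
    <file>
      <!-- the source path mirrors the target location -->
      <source>src/main/resources/etc/myapp/app.properties</source>
      <outputDirectory>etc/myapp</outputDirectory>
      <filtered>true</filtered>
    </file>
  </files>
</assembly>

The “prod” descriptor lists the same files with filtering off; move app.properties and the assembly fails loudly instead of silently shipping without it.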

Another implied concept here is deploying an exploded WAR/EAR (we don’t need to digitally sign the artifacts, as they are consumed internally). There may be some benefits to startup or load times with an archived file, but being able to slightly alter configuration and redeploy without repackaging is a greater benefit – why build an archive if you don’t have to?

In the next installment, I’m going to cover the four cheap-fast ways I’ve seen discussed to generate configuration for your application.

(photo thanks to tacomabibleot)


dbdeploy.net

Agile database deployment for Java and .NET

(This post was originally hosted at http://www.dbdeploy.net)

DbDeploy is an implementation of the ActiveRecord Migrations pattern. DbDeploy.NET is the .NET port of DbDeploy. Both DbDeploys are projects initiated by ThoughtWorks. ActiveRecord comes to us via DHH.

Why would I use it?

When you’re developing software that hasn’t been released, the database is easy: you can tear it down and rebuild it at will. Once you have production data that people are using, what do you do? How do you manage the change? The Migrations pattern allows you to make bite-sized changes to your database and test them. It works very well with Continuous Integration.
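
As a rough sketch of how it slots into a build (dbdeploy ships an Ant task; the taskdef class name and attribute names here are from memory of the docs of the era and may differ between versions, so treat them as assumptions):

<taskdef name="dbdeploy" classname="net.sf.dbdeploy.AntTarget" classpath="dbdeploy.jar"/>
<dbdeploy driver="com.mysql.jdbc.Driver"
          url="jdbc:mysql://localhost/myapp"
          userid="myapp"
          password="secret"
          dir="deltas"/>

The deltas directory holds numbered SQL change scripts (e.g. 001_create_users.sql), each optionally carrying an undo section. DbDeploy records which scripts have been applied in a changelog table, so each environment only runs the changes it hasn’t seen yet.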


When should I use this pattern?

It’s ideal for greenfield agile projects where you are using Continuous Integration and want to make sure that changes to the database schema will be applied to integration tests. You can use other approaches if you have an ORM and you haven’t released to production yet.

When shouldn’t I use this pattern?

  • When you have a huge legacy database
  • When you’re trying to put data into a database rather than make schema changes
  • When you don’t use source control

The Migrations pattern is a really helpful way to manage database change; it’s not a silver bullet, though. You need discipline and a good test regime. It works well with Continuous Integration.

Update: Gregg Jensen got in touch with a new URL for DbDeploy.Net

A way to cool dependency Hell?

How to break a deploy: Take one codebase. Sieve in a new class. Mix in the dry ingredients and a new runtime dependency. Place another dependency on a pre-warmed Hudson, bake for 10 minutes (on a medium heat) and then deploy. Oh dear. It didn’t deploy.

We’re a bit crap about managing the external dependencies of our code. I’m not talking about libraries, but more basic dependencies that your application might have, like native code libraries or commands. There are two ways you can do this:

  • You can make people responsible for the care and feeding of your testing and production environments. This is easy to implement, but stupid. I think it would only work in an environment with exceptional communication.
  • Or you can insist that any application must declare what it depends on.

Keeping environments up to date keeps lots of people in a job. It’s a really dumb job. At my day job we’ve taken the latter route, using Puppet.

Puppet is a tool for systems administration. But you can use it even if you don’t know fsck from fmt. We’re using it as an executable specification of the dependencies that your application needs. For example, a test just failed on a new server, with this output:

Validation failed: Avatar /tmp/stream20091208-22414-y3anvf-0 is not recognized by the 'identify' command.

I realised that it needed the libmagic1 package and possibly libmagic-dev. I could have installed them then and there onto the machine, but I’d have forgotten about it in the excitement. So I added them to a file on that project called dependencies.rb. This file is run by Puppet before we deploy. It gives our developers enough control over the target operating systems to make small changes to the deployment environments. We’ve been running this via a Capistrano task on our project; we typically run it as the deployment user, so that we can easily make changes to crontabs. Puppet won’t exit cleanly if it can’t install all the dependencies, so it’s a good way to test.

Here’s an abridged version of our dependencies file:

class dependencies { 

  include $operatingsystem

  class ubuntu {
    package {
       'libcurl3':             ensure => present;
       'libcurl3-gnutls':      ensure => present;
       'libcurl4-openssl-dev': ensure => present;
       'g++':                  ensure => present;
       'build-essential':      ensure => present;
       'libmysqlclient15-dev': ensure => present;
       'libxml2-dev':          ensure => present;
       'libxslt1-dev':         ensure => present;
       'libmagic1':            ensure => present;
       'libmagic-dev':         ensure => present;
    }
  }

  class gentoo {
    cron {
      'some cron job':
        command   => '/engineyard/bin/command blah blah',
        user      => 'admin',
        hour      => ['*/2'],
        minute    => ['12'],
        ensure    => 'present';
    }
  }

  class darwin {
    notice("Nothing to do in this OS.")
  }
}
node default {
  include dependencies
}

In this file we define a class (dependencies), which doesn’t do much but look for a corresponding inner class to match your operating system. Right now we have a very simple arrangement: The dependencies::gentoo class contains crontabs and the like for EngineYard. The dependencies::ubuntu class names all the native dependencies of our rubygems. We have an empty class for Darwin to stop the Mac OS X machines from complaining. That’s it. Here’s the Capistrano task:

  desc "Run Puppet to make sure that dependencies are met"
  task :dependencies, :roles => :app, :except => {:no_release => true} do
    run "cd #{release_path} && rake dependencies"
  end

Image courtesy of eflon


Deployment is the goal

We get things so ass-backwards. How do we get code from the development team to the end user? I’ve written an article on this subject at InfoQ. I hope you like it.

Getting a wedgie on the last mile – at noop.nl

The last mile is where software becomes production code, hair turns grey, and a lot of pizza gets eaten. It’s the last place you want a wedge.

Read more of my guest post at noop.nl – thanks to Jurgen Appelo for the opportunity to spread the message.

(image from dullhunk)

Continuous Integration and Release wisdom

Can’t recommend these enough.

Item one: this book. Release It! by Michael Nygard. Michael has written a fantastic book of patterns and antipatterns of writing, deploying and running software. Developers and Administrators alike should read this book. There are some profound lessons in there.

Item two: The SE Radio Podcast with Michael. He goes into a lot of detail about the subjects in the book. There’s some Floyd Rose abuse in the theme tune, too.

Item three: The next podcast in that series is with Chris Read of ThoughtWorks. Chris gets grilled about Continuous Integration and comes up with some pretty compelling arguments. Same cheesy guitar music. I saw Chris give a talk at ThoughtWorks London last week.

We caught up for some food afterwards; we talked about how Michael’s book captures so many of the lessons we learned at the coalface.

Bonus item: Paul Nasrat talking about Agile Systems Administration. This was filmed in London a few weeks ago. I was gutted that I couldn’t make it. Glad that SkillsMatter published it so soon.


Access controls in test environments

Good article from the Agile Web Operations crew about securing access to environments. Made me reflect on my brief career in the City. We were asked to be the “gatekeepers of quality” for all the changes that flowed through our test systems.

That was hard. One thing I did do was take a good look at the test environments and see who had administrative access. That was everyone. In response, I hacked some Ruby scripts together, and ran them from a CI system every 24 hours.

The first issue was simple. The administrators would make an Active Directory group for each host, and add privileged users to it. Unfortunately, what would happen is that people would ask for access to a host and simply be put in that group. So the first thing to do was feed the script every host I could identify and have it output all the folks who had access to each one.

After the first couple of mass purgings of inappropriate user access, the helpdesk would call me up and ask if it were okay for someone to have admin access to a machine.

The second was harder for me – identifying every user that had admin access to the database. I needed to run a DB script against each instance to find out:

  • if a user had sysadmin privileges for the entire instance, and
  • if the user had sysadmin or db_owner for a particular database

After some frantic Googling, pleas to the DBAs and DB developers, and blind experiments, I worked out how to do this for SQL Server 2005, at least. SQL Server 2000 kinda worked. That again flushed out a large number of people who had access but shouldn’t have.

My workload went up some more, as people who had previously had far too much access to systems then had to come and talk to us. We were able to ask people to script things like rights grants into their changes, which helped prevent a class of deployment failure where users would have no rights to execute stored procedures. Not sure I made many friends, either.

Anyway, here’s Dan’s post.

Limiting Access to Test and Production Systems — Agile Web Operations


Six Tips for Automated Releases


(image taken from the JetBrains Team City Photostream)

Today’s guest post is from Paulo Schneider at YouDevise. YouDevise is a financial markets information company based in the City of London, and they are hiring! You can also check out their developer blog.

A common goal of agile methodologies is releasing often. With shorter feedback loops you can be more in tune with your users’ needs, and adapt the development tasks accordingly. However, the release process can be a hurdle if a lot of manual work is required. Here are some tips for automating your release process:

1- Releasing to test environments should be the same as releasing to production: during the development cycle we release quite often to our internal test environments, so it makes sense for that release process to be the same one we use for production; it also keeps us from having to deal with special cases.

2- Deploy successful builds from continuous integration: it is easy to automate our release process to grab and build the code from source control. However, how do we guarantee the committed code is not broken? If we use a continuous integration tool (such as CruiseControl or Hudson), it should provide us with completed, deployable build artifacts. Why not use these, instead of source control, to get the code for deployment? At least we know for sure that they have passed the unit and functional tests.
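
For instance (a sketch: the job name and artifact path are hypothetical, but the URL layout is the standard Hudson one), if your deploy script is Ant-based, it can pull the last good artifact straight from the CI server:

<get src="http://hudson/job/myapp/lastSuccessfulBuild/artifact/target/myapp.war"
     dest="staging/myapp.war"/>

That way the bits we deploy are exactly the bits that passed the tests.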

3- Keep per-server configuration separate from the application: application configuration is hard. To avoid complicating the automated release process, separate the configuration of the application from the application itself. This way we can release the same application to multiple servers and leave the hard work of configuration to the humans. For instance, if your application reads a Java properties file on startup, put that file in a known location on each server, say $MYAPP_CONFIG/config.props, rather than deploying it as part of a release.

4- Use shell scripting instead of more general scripting languages: the process of deployment involves a lot of low-level commands and monitoring of system processes, which are well suited to shell scripting. If we used a more general scripting language such as Perl or Python, we would be forever calling “exec” or “system” and parsing result codes.

5- Use rsync for file transfers: nothing brings down the excitement of a release like a slow file transfer to the production machines. But in this modern day and age there is a magic pill called rsync to bring the excitement back. Rsync provides fast file transfers by only transmitting the bits of the new application that differ from the old version on the production machine.
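
If your deploy is Ant-driven, like the ones described elsewhere on this blog (the host and paths here are invented for the example), the same idea might look like:

<exec executable="rsync" failonerror="true">
  <!-- -a preserves permissions and timestamps, -z compresses on the wire,
       --delete removes files that are no longer part of the release -->
  <arg line="-az --delete build/webapp/ deploy@prod01:/opt/myapp/current/"/>
</exec>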

6- Use the SSH “authentication agent” to forward credentials: running commands remotely is a very common task when releasing, and if we run a server farm, we’ll need to access several machines during the process. For security, we should use SSH to connect to the servers – but it would hardly be an automated process if we had to enter a password for every connection. Using the SSH agent-forwarding option, you can forward your security credentials and sidestep the whole login process for all but the first login. Just type “ssh -A myserver” and you’ll be on your way.