Jupiter Moonbeam & the Geeks from Cyberspace: January 2008

The project I am working on has a build which used to take under a minute and has slowly crept up to around one minute forty five seconds. Recently I have begun to feel the pain of this longer build; it is surprising just how much a paltry forty-five seconds increases those periods of sitting and waiting.

The build is an essential part of the code base: it is what gives you, the developer, and the rest of the team feedback on the state of your code. Continuous Integration tells us that we should make changes as small as possible, verify them using the build, then check in (then verify the build on the build server). Under these rules we should probably aim for an optimum integration rate - the frequency of check-ins (per pair). I believe an optimum integration rate should be around once per hour.

A slow build can be a blocker to good CI practice: the longer a build takes to run, the less often people will run it, the less they will check in and the bigger the change list they create. Martin Fowler's original article on CI states that "the XP guideline of a ten minute build is perfectly within reason"; however, I want to contest that guideline and bring it down to a minute. Why? Because the difference between a build that takes under a minute and one that takes over a minute is quite considerable when you are trying to keep your change list down.

On average a developer will run three builds per change: the first on the change itself - which is usually run in the IDE and will often have some failing tests (though not always) - the second to verify the change and the third to verify the update. So the general rhythm goes: make a change, run the tests, fix the broken tests, verify the build on the command line, update, verify the build again, commit. Of course it is possible to make a change and have no tests failing (meaning two builds per change) but even then most devs will run the test suite in the IDE and then verify the full build on the command line.

In the case of a merge conflict (either in code or failing tests on update) we add an extra two builds per conflict: the first to verify the fix, the second to verify the fresh update before check-in.

Three builds per change means that the total amount of time spent building is the build time multiplied by three: therefore any build which takes one minute will consume three minutes of developer productivity. If we check in once every hour that is 3/60 or 5% of your dev's time spent building. Take the build up to just two minutes and we have 6/60 or 10%. Take a build of ten minutes and we have 30/60 or 50% of your dev's time spent building! If there is a merge conflict on a ten minute build (five builds: the usual three plus two for the conflict) it will consume 50/60 or over 80% of that hour.
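The sums above can be sketched in a few lines of Python. This is only an illustration of the figures in this post; the function name `build_overhead` and its parameters are invented for the example, and it assumes three builds per change plus two extra builds per merge conflict, as described earlier.

```python
def build_overhead(build_minutes, checkin_interval_minutes=60,
                   builds_per_change=3, conflicts=0):
    """Fraction of the check-in interval spent waiting on builds.

    Hypothetical helper: assumes three builds per change and two
    extra builds for every merge conflict, as in the post.
    """
    total_builds = builds_per_change + 2 * conflicts
    return build_minutes * total_builds / checkin_interval_minutes


if __name__ == "__main__":
    # One check-in per hour, no conflicts.
    for minutes in (1, 2, 10):
        print(f"{minutes} minute build: {build_overhead(minutes):.0%} of the hour")
    # Ten minute build with a single merge conflict (five builds in total).
    print(f"10 minute build, one conflict: {build_overhead(10, conflicts=1):.0%}")
```

Changing `checkin_interval_minutes` shows how quickly the overhead shrinks or grows as the integration rate moves away from once per hour.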

On a ten minute build, losing half of every hour to building (a ratio of 1:2) is not acceptable, so to compensate we adjust the optimum integration rate. This is achieved, essentially, by trying to bring the ratio back to an acceptable level, so if a 10% loss (equivalent to a two minute build at one check-in per hour) is acceptable, the check-in rate becomes once every five hours. By comparison, a 10% loss on a one minute build means a check-in every 30 minutes, versus once every five hours on a ten minute build!
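Turning the sum around gives the check-in interval needed to stay within a given loss. A minimal sketch, again assuming three builds per change; `checkin_interval_minutes` is a made-up name for this example, not part of any tool.

```python
def checkin_interval_minutes(build_minutes, acceptable_loss, builds_per_change=3):
    """Minutes between check-ins needed to keep build time within
    `acceptable_loss` (a fraction such as 0.10 for 10%).

    Hypothetical helper: assumes three builds per change, as in the post.
    """
    if not 0 < acceptable_loss <= 1:
        raise ValueError("acceptable_loss must be a fraction between 0 and 1")
    return build_minutes * builds_per_change / acceptable_loss


if __name__ == "__main__":
    for minutes in (1, 2, 10):
        interval = checkin_interval_minutes(minutes, acceptable_loss=0.10)
        print(f"{minutes} minute build: check in every {interval:.0f} minutes "
              f"({interval / 60:.1f} hours)")
```

With a 10% budget the interval grows in direct proportion to the build time, which is why the ten minute build pushes check-ins out to hours rather than minutes.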

Decreasing check-ins is a dangerous game; the longer changed code is out of the repository the more out-of-date it is. The difference between your repository and your code becomes greater (assuming, of course, that you are making changes over that period) and the difference between your code and your fellow developers' code also becomes greater. By checking in frequently the codebases are constantly re-aligned, minimizing the impact of future changes. The most obvious side effect of diverging code bases is that the chance of conflict increases. Small, frequent check-ins reduce this to the extent that it can become a non-issue. Big check-ins mean big conflicts, and big conflicts mean loss of productivity, stress, more time your code is out of sync and more times you have to run the build.

To demonstrate this let's create a concept of a "window of opportunity": this window is the space between doing the update and checking in, and the opportunity is for another pair to check in before you do! The size of this window is equivalent to the length of the build; we all know that if we are going to check in we try to grab this window (how often do we say "hang on a minute I've just got to run the tests and check in this build"?). The reason for this is we don't want to find that someone else has checked in between the last update and us running the build. The greater the length of the build, the greater the chance that someone else has committed (which increases based on the size of the team) and the more times you have to run your build.
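To put rough numbers on the window, here is a small sketch. It rests on an assumption that is mine for the sake of illustration and not something measured on this project: that the other pairs check in independently at a steady average rate (a Poisson process). The function name and parameters are hypothetical.

```python
import math


def chance_window_is_taken(build_minutes, other_pairs,
                           checkins_per_pair_per_hour=1.0):
    """Chance that at least one other pair checks in during your build.

    Assumption (not from any real data): check-ins by the other pairs
    arrive independently at a steady average rate, i.e. a Poisson process.
    """
    checkins_per_minute = other_pairs * checkins_per_pair_per_hour / 60
    return 1 - math.exp(-checkins_per_minute * build_minutes)


if __name__ == "__main__":
    # Compare a one minute and a ten minute window for a few team sizes.
    for pairs in (2, 4, 8):
        short = chance_window_is_taken(1, pairs)
        long_ = chance_window_is_taken(10, pairs)
        print(f"{pairs} other pairs: 1 min build {short:.0%}, "
              f"10 min build {long_:.0%}")
```

Under this simple model the chance rises with both the length of the build and the number of other pairs, which matches the intuition above; the exact figures should not be read as more than that.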

There are other side-effects to long builds: you can't keep the change list small, and people will find ways around running the build, from only running it once out of the three times (increasing the chance of broken builds) to avoiding full runs by only running the tests they believe to be impacted (again increasing the chance of broken builds). None of these is a good practice. Big builds also mean loss of attention: running the build, doing an update, running the build again, then checking in takes just two minutes on a one minute build but twenty on a ten. Two minutes won't disrupt the flow of development and probably works as a nice gap to discuss things with your pair or just give the brain a rest. The ten minute build equals a wait of twenty minutes: enough time to eat lunch! At the end of twenty minutes any focus you had has gone.

A one minute build may sound unrealistic but I believe it is possible. In some ways it is an indication of the health of your code base, and by that I mean both production and test - poorly written and badly performing test code is just as bad as poorly written, badly performing production code. It is very easy to be slightly relaxed about build times, treat them as an inevitability and then watch them slowly go from one minute builds to ten.

It's all very easy to preach about one minute builds when I am in the privileged position of working on a code base that only takes one minute forty five seconds. The speed of the build is due to the drive, early on in the project, not to tolerate long builds: several times it's been looked at when it started to get a bit overgrown and it's been hacked back down to size. If we had been more relaxed I'm sure it would have crept up a lot higher.

I found out yesterday that I was being rolled off the project I am currently on. I was very ill-prepared and I had been guilty of kidding myself the project was going on forever. I had made the classic mistake of inaction, believing that denial resolves into desire. It has been quite a strange experience seeing a project draw to a close. I have learnt there are two pains to consultancy on a project of this kind: people roll off and you roll off. It was strange to come back after Christmas to see the team halved. People you'd worked very closely with - whole days pairing together side by side - good friends who are suddenly not there. Though it isn't literally here today gone tomorrow, the period between finding out someone is going and them being gone is very short - there isn't the one month notice period and the usual "good luck with your next job" as essentially nobody has left ThoughtWorks, they've just left the team, but TW being such a vast place it feels as if they get sucked up back into the unseen world of clients.

The biggest shock is when you yourself get your papers. It was a moment I was inwardly dreading: I knew I rolled off mid-January but I was hoping that my term would be increased. I was hoping the project would go on, but the cogs that do turn leave my assignment end date landing in a void where there is currently no commitment to extend the project, meaning I've popped up as a resource to be used somewhere else. In some ways the great Resource Management machine can feel quite heartless and random, as I guess it should be, but the first time you experience it - especially after ten years of in-house employment - you realise how dangerous it is to ignore it.

It's been a great project: the best I've ever worked on. It's renewed my faith in development; I've never enjoyed my work as much as I have over the last 5 months. Although not everything has been perfect - the pace too aggressive, the project too short, a feeling of uncertainty over its lifespan - it has proven to me how much better agile is, especially when you apply all the best practices, from stand-ups, story walls, retrospectives, continuous integration and testing to, of course, pairing (I have developed a terror of having to solo again: back to the days of insular, procrastinating, under-productive depression). From its earliest days I wanted to drag old colleagues in and show them how this project has lived and breathed in a way so unlike all the bureaucratic, process-driven drivel that I have had the displeasure to work on, or the other political-chaos, zero-management departments out there. The people have renewed my faith in development too: working with people who seem to get *it*; a whole team (QAs, BAs, developers, project managers etc.) who care about what they do, bother to learn and understand and discover, has been mightily refreshing.

So a farewell to a great project and the great team I have had the pleasure to work with. I hope I come across your paths and get to work with you again. A special mention to Sukita and Seema who are not only great role models for that all too rare breed of women in IT but have character of Dickensian proportions and, in the highly unlikely event of media type people reading this, deserve their own TV show (Sukita already has broadcasting experience). Also special mentions to the other half of the fire team: Greg, and to Alex for all our discussions on Russian literature and the church.


Saturday, 19 January 2008

The One Minute Build

Thursday, 10 January 2008

Rawhide
