Monday 29 October 2007

Your Feet's Too Big

Every piece of code written has a refactoring footprint, which is essentially its proliferation through the code base combined with how difficult it is to refactor. Some code can have a high proliferation yet be simple to refactor: renaming a variable in a modern IDE, for example, is trivial regardless of the number of usages. On the other hand, changing a variable's type or the signature of a method creates a large refactoring effort, especially if it requires knock-on refactorings of other classes; the greater the consumption of the code, the greater the footprint. Other things can also have an impact on the ease of refactoring: using libraries that generate code which is difficult to refactor (for example having to place variable names in strings), using external libraries directly without encapsulation, duplicating similar routines across the code rather than encapsulating them, and the number of entry points to a piece of code (the number of different ways of achieving the same thing). All of these can make refactoring difficult and risk the introduction of bugs. Another area, much neglected when considering ease of refactoring, is test code, which very often has a high consumption of production code (sometimes higher than the production code itself) and can make the simplest refactorings painful because large chunks of test code are dependent on your class.

TDD states that code that is easy to test is good code and the same is true for refactoring: writing code which is easier to refactor leads to cleaner code. Refactoring footprints have, in very real terms, an impact on productivity and therefore managing them is quite important. There are various techniques - most of them basic OOP principles - that can help keep the refactoring footprint down. However, due to the natural evolution of code, this can be quite hard to keep track of and unless you have an in-depth knowledge of the codebase - which is increasingly unlikely the bigger and older the codebase is - it can be difficult to 'see' a footprint.

I think it would be interesting to explore the use of metrics to measure refactoring footprints. These metrics would give a very interesting and valuable insight into the codebase and its quality. The interesting thing is that, unlike many other code metrics, refactoring footprints need to take test code into account as well. At a very basic level, counting the number of usages across both production and test code can give a rudimentary indication of a footprint. More sophisticated metrics would measure the impact of a refactoring (are there knock-on refactorings?). Different types of refactoring would also require different metrics, and some refactorings may be uncommon in certain codebases or may have a degree of acceptable pain (package refactorings, for example). As with most metrics they will be open to a degree of interpretation based on codebase and class, but nonetheless the metrics should be useful - especially in recommending refactorings!
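As a very rough illustration of the usage-counting idea, a sketch along these lines (entirely hypothetical, and far cruder than a proper parser-based tool; the "src" and "test" directory names are my assumptions) would at least show the production-versus-test split of a symbol's footprint:

using System;
using System.IO;
using System.Text.RegularExpressions;

class FootprintCounter
{
    // Counts whole-word occurrences of a symbol in all C# files under a directory.
    static int CountUsages(string sourceDirectory, string symbol)
    {
        Regex wholeWord = new Regex(@"\b" + Regex.Escape(symbol) + @"\b");
        int count = 0;
        foreach (string file in Directory.GetFiles(sourceDirectory, "*.cs", SearchOption.AllDirectories))
        {
            count += wholeWord.Matches(File.ReadAllText(file)).Count;
        }
        return count;
    }

    static void Main(string[] args)
    {
        string symbol = args[0];
        Console.WriteLine("Production usages: " + CountUsages("src", symbol));
        Console.WriteLine("Test usages:       " + CountUsages("test", symbol));
    }
}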

I haven't done much research on this so I do not know if any tools already exist to calculate such metrics or even what metrics they provide, though I am sure there will be. I shall try and track some down and have a play to see what sort of data they provide and establish what sort of insight they can give into a codebase.

Thursday 2 August 2007

Separating the Men from the Morts

I'm a bit slow on the uptake here (maybe I'm too busy to be a true alpha geek), but if you're one of those .NET developers who has always felt a little uncomfortable sitting in the MS Fanboy Hummer on the way to the latest .NET launch (secretly wishing you were at RailsConf), and who got funny looks every time you mentioned test first, agile or a new OS project, regretted that a Java library wasn't available in .NET - or, god forbid, actually deployed open source non-MS-certified libraries on a production machine - then maybe you are an ALT.NETer and never knew it.

The opposite is MORT.NET: those old VB6ers or Visual C++ers who just can't shake the habit of databound controls backed by a business layer of static methods shuttling anemic data objects between the database and the UI, and who test by creating a test app of datagrids and command buttons. The MORT.NETer is a drag-and-drop all-star who wouldn't have known what AJAX was until MS released Atlas (though they're slightly cautious of it because it's FREE) and believes NHibernate is pure evil because it doesn't use stored procedures.

Join the movement, comrades: if you're the kind of developer who's prepared to try something because it's cool rather than asking whether it's MS certified, you're already there!

Wednesday 18 July 2007

Blogger: stop stealing my space

Does anyone know why Blogger is so adamant about removing spaces when you post? Whenever I post code snippets it decides to strip out my spaces, leaving my code looking a bit short on indentation.

What's more annoying is that every time I edit a post it takes a few more spaces away, leaving the code without indentation, and I have to re-indent every time I do an edit!

I've searched the site for options but can find nothing that says "stop stealing my spaces". Any clues?

Tuesday 17 July 2007

Loopy loops

Encapsulation and information hiding tell us that we should hide an object's internal data structure and expose methods: a client can see all of the methods but none of the data. Tell, Don't Ask extends this to say we should always tell an object to do something, not ask it for some information and then do the work for it ourselves. The Law of Demeter helps to clarify this by telling us that we should avoid invoking methods of a member object returned by another method. By employing Demeter we avoid "train wrecks" such as a.B.C.DoSomething().

I have seen - and written - a lot of code that does this:

foreach(Object obj in myObject.Objects)
{
    obj.DoSomething();
}
But doesn't this break all three rules? First, we are exposing the object's internal structure (the fact that myObject has a collection of Objects inside it). Secondly, we break Demeter in three ways by:
  1. invoking the GetEnumerator method (in C#) of the IEnumerable which was returned by myObject (via Objects)
  2. invoking the MoveNext method (which incidentally breaks Command-Query Separation) and the Current property of the IEnumerator (in C#) which was returned by the IEnumerable which was returned by myObject
  3. invoking the DoSomething() method of obj, which was returned by the Current property of the IEnumerator which was returned by the IEnumerable which was returned by myObject (see why it's called a train wreck?)
Lastly, we break "Tell, Don't Ask" by asking all of the above objects (three in total: myObject, the IEnumerable and the IEnumerator) for something and then doing something with obj ourselves.

Looping over arrays and collections is very common, and in most programs exposing the collection is the norm. But what happens when you need to change the way the collection works? What if you decide to replace your object array with a Hashtable or a Set? Thankfully most OOP languages give you some sort of iterator so, in C# for example, you can just expose an IEnumerable and you might be alright. But this still breaks the three rules, and what do you do if your underlying data structure doesn't implement IEnumerable? What if you're dealing with some legacy COM+ object and you have to revert to the old for or while? Or, even worse, what if the data structure doesn't even have a simple structure (not so uncommon)? Just exposing an enumerable isn't so easy, but we still work around it by doing something like this:

IEnumerable Objects
{
    get
    {
        IList list = new ArrayList();

        for(int i = 0; i < comObject.Count; i++)
        {
            list.Add(comObject[i]);
        }

        return list;
    }
}
And then in the client we have:

foreach(Object obj in myObject.Objects)
{
    // here we go again!
}
Hang on a second: we loop so we can loop? Doesn't that sound a bit strange? To make matters worse we seem to do this as well:

void SaveEverything()
{
    foreach(Object obj in myObject.Objects)
    {
        database.Save(obj);
    }
}

... Somewhere else in the program ...

void DisplayEverything()
{
    foreach(Object obj in myObject.Objects)
    {
        ui.Display(obj);
    }
}

... Again somewhere else ...

void SendEmail()
{
    foreach(Object obj in myObject.Objects)
    {
        email.Send(obj);
    }
}
How many times do you want to write that damn loop? So what's the alternative? To implement the three rules: hide the collection, stop exposing it, and tell myObject to do the looping for you. Quite simply:

public class MyObject
{
    public void Loop(IReceiver receiver)
    {
        foreach(Object obj in objects)
        {
            receiver.Receive(obj);
        }
    }
}
All you need to do is pass your receiver to MyObject and it will do all the work for you:

myObject.Loop(databaseSaver);

public class DatabaseSaver : IReceiver
{
    public void Receive(Object obj)
    {
        database.Save(obj);
    }
}
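Incidentally, the IReceiver interface itself never appears in the snippets above; a minimal version (my assumption of its shape) is all that's needed:

public interface IReceiver
{
    void Receive(Object obj);
}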
Or, if you prefer, you can use delegates and anonymous methods:

myObject.Loop(delegate(Object obj) { database.Save(obj); });

public delegate void LoopDelegate(Object obj);

public class MyObject
{
    public void Loop(LoopDelegate loopDelegate)
    {
        foreach(Object obj in objects)
        {
            loopDelegate(obj);
        }
    }
}


Now if you need to change the object's internal data structure or the way you iterate over it (for performance reasons etc.) you can do so with ease and without breaking any of your clients.
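For example (a purely hypothetical change), the internal array could be swapped for a Hashtable and the Loop signature - and therefore every client - stays exactly the same:

public class MyObject
{
    private Hashtable objects = new Hashtable();   // was an array or list; Hashtable lives in System.Collections

    public void Loop(IReceiver receiver)
    {
        foreach(Object obj in objects.Values)      // clients never see this change
        {
            receiver.Receive(obj);
        }
    }
}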

Although with simple loops it all looks straightforward enough, it is with more complex structures that you gain a real advantage. In a parent child structure for example you may find a loop such as this:

foreach(Parent parent in foo.Parents)
{
    foreach(Child child in parent.Children)
    {
        // do something
    }
}
Writing this loop all over the place can be fairly cumbersome, and what if you don't want to expose the structure in this way? By using the pattern above you have more control over the semantics of the object. For example you could present the list flat and merely pass each object across as above (thus losing the nested loops scattered through your code), or you could expose some structure:

public interface ParentReceiver
{
    ChildReceiver CreateChildReceiver();
    void ReceiveParent(Parent parent, ChildReceiver childReceiver);
}

public interface ChildReceiver
{
    void Receive(Child child);
}

public class Parents
{
    public void Loop(ParentReceiver receiver)
    {
        foreach(Parent parent in parents)
        {
            ChildReceiver childReceiver = receiver.CreateChildReceiver();
            parent.Loop(childReceiver);
            receiver.ReceiveParent(parent, childReceiver);
        }
    }
}

public class Parent
{
    public void Loop(ChildReceiver receiver)
    {
        for(int i = 0; i < children.Count; i++)
        {
            // don't pass any illegitimate children!
            if(children[i].IsNotIllegitimate)
            {
                receiver.Receive(children[i]);
            }
        }
    }
}

As you can see, by following the three rules and telling the object what to do - and only the immediate object what to do - we not only express the structure of the objects better and force clients to work with that structure, we also loosen the coupling between objects.

Tip:
For those pre-.NET 2.0 people out there, or those using languages that don't have generics: by using the above pattern you can strongly type your objects even if you can't strongly type the internal collections, as the sketch below shows.
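A rough sketch of that idea, assuming a hypothetical Order class and a plain ArrayList inside, so the cast to the strong type lives in exactly one place:

public interface IOrderReceiver
{
    void Receive(Order order);                    // clients only ever see the strong type
}

public class OrderCollection
{
    private ArrayList orders = new ArrayList();   // weakly typed internally (pre-generics)

    public void Add(Order order)
    {
        orders.Add(order);
    }

    public void Loop(IOrderReceiver receiver)
    {
        foreach(object item in orders)
        {
            receiver.Receive((Order)item);        // the only cast in the system
        }
    }
}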

Thursday 28 June 2007

If only you want to dance to your tune, you'll leave an empty dance floor

Last year my wife and I were married (hence the fact she is now my wife) and, being the independent-thinking, creative couple we are, we planned and organised the whole wedding ourselves, right down to the music. Both of us being passionate and snobbish about our music - and having the luxury of a very large collection - we didn't want some spotty DJ ruining our day by playing dreadfully cheesy tracks and mumbling his way through the evening. So we managed what many said was impossible: a completely cheese-free wedding playlist which got everyone boogieing the night away (not to blow our own trumpets, but we had to hold back drunken protesters when we turned the music off!).

This weekend was my best man's wedding and, so impressed was he with our music, he wanted me to emulate it for him; I accepted and invited him over to peruse my record collection. Finally, a few days before the wedding, he gave me his list and I looked at it with horror: it was clear he had just given me a list of his personal favourite tracks. His wedding being a typical big family-and-friends affair (over 150 guests compared to our 50), I could tell instantly the list wouldn't work, but no matter how much I tried to explain this he was adamant that at his wedding he wanted his favourite tracks. So I conceded, took his laptop, gave him copies of the tracks I had, installed the latest WinAmp with the excellent SqrSoft Advanced Crossfading plugin to give it that ultimate disco feel, and used WinAmp's built-in ReplayGain to level out the volumes of the tracks (to avoid a quiet track being followed by a sudden loud one).

The wedding reception went as I had expected: the groom loved every track that came on and a few close friends got up and jumped around the dance floor with him. But after half an hour or so the groom was off mingling at the bar and the dance floor was empty. His new bride was soon begging us groomsmen to get people up and dancing to fill the depressingly empty dance floor, and although we did drag a few people off the sides, the floor was quickly vacant again.

About an hour later the groom came running up to me, waving his arms and shouting in distress that his new wife had only gone and changed the playlist: she had loaded her emergency backup playlist instead! Before I could even react she had rushed over, the train of her dress picked up in her hands, shouting at him that she'd changed it because no-one was dancing, while he shouted back something to the effect that her girl-band music was rubbish (I can't print the exact words). As I turned to leave them to their first domestic as husband and wife, both their eyes looked pleadingly into mine. So there I stood with both looking to me to take their side: the groom playing to my taste, as his wife's playlist was full of offensively bad music, and his wife looking to my reasonable side and the fact that people were now dancing. It was an easy decision for me to make. I asked them, "Do you want people to dance?" They both replied instantly, "YES". "Were people dancing before you changed the playlist?" "No," the wife replied, and a sharp look at the groom squeezed out a squeaky "no". "OK then," I firmly stated, "if we go into the dance room and people are dancing then we leave her playlist on, and if they're not then we change it." So we marched into the dance room to find a mass of badly dancing relatives, and the decision was made.

There is a moral to this humorous story of marital conflict: my friend had placed his personal preferences before his objective, which was to ensure people danced and had a good time at his wedding. When my wife and I did our playlist we didn't just dump our favourite tracks down; we made lots of compromises - including a strict no-Smiths rule, or she'd annul our wedding - to ensure we played songs people would want to dance to, but we used our principles of good taste to guide us. Our principles meant that there would be no cheese, but the objective meant that although The Rolling Stones' Beast of Burden may be on our personal playlist, we knew Brown Sugar is the one that gets everyone dancing, and that people would bounce around to The Undertones' Teenage Kicks way before Radiohead's Exit Music (For a Film). When we put together our playlist we carefully considered whether each track was appropriate (which is why we didn't have Super Furry Animals' Man Don't Give a Fuck) and listened to each track trying to imagine our guests dancing to it, ultimately remembering that although not every song would please everyone, everyone should be pleased by most of the songs.

Development is like choosing a playlist: a good developer will always remember the objective. We've all been there: developing our own little features, the ones we think are really cool, and then getting upset when the customer says they don't like them and want them changed. Developers are very guilty of doing what they think the application should do, forgetting what they've been asked to deliver; instead of doing the best thing to meet the customer's requirements they end up writing code to solve problems that don't exist or aren't important. It's a hard thing to avoid and I am as guilty as anyone: I remember all too clearly arguing that a feature shouldn't be developed the way it was requested because I felt it was functionally incomplete, forgetting that what the customer was trying to achieve was to solve a problem quickly but not perfectly (what I call developing Word when all they want is Notepad).

As developers we find it difficult to compromise the 'perfect solution' for what the customer wants, and we get stuck thinking that not writing a technically or functionally complete solution means writing a technically or functionally bad one (just as my friend believed that not playing his favourite music meant playing bad music). I had a conversation with a developer the other day who, like my friend, didn't want to compromise his design to meet the customer's objective. He wanted to build the feature in a way that was more generic and technically clever than was required, and saw any sacrifice of this principle as bad coding. So entrenched was he in this viewpoint that he was prepared to sacrifice the customer's requirements and the deadline, as he couldn't conceive that writing good, flexible code which was specific to the customer's requirements was a good thing. I struggled to find a way around this impasse and in the end I tried to explain that the emphasis should be on usability before reusability, not on over-engineering.

The difference between my friend's playlist and ours was that although both of us stuck to our principles of what we believe is "good music" (in the same way that, as a developer, I would stick to mine: Agile, TDD etc.), my wife and I always checked ourselves against the objective and adapted to the situation, whereas my friend just hung on to his principles for principles' sake, forgetting what he was trying to achieve, and ultimately it was this that separated our full dance floor from his empty one.

Monday 25 June 2007

Intelligent design

I stumbled across Jay Flowers' blog today and found his mix of eastern philosophy and Agile development very interesting. Eastern philosophy appears to be seeping into a lot of western science (in particular psychology and, to some extent, quantum physics) and it's nice to see these ideas used to argue the benefits of Agile.

However, I don't want to write so much about the intricacies of eastern philosophy (Jay's blog already does a good job of that); instead I'd like to expand on a particular post of Jay's, specifically the quote he provided from Alan Watts.

Religion is embedded deep in our cultures: whether you are an atheist, an agnostic or a fundamentalist, conservative or liberal believer, our society and culture have been influenced more by religion than by science; legal systems, for example, are based on religion: the Ten Commandments for Christians, Sharia law for Muslims, and so on. After reading Jay's blog it occurred to me how much our approach to development is also influenced by religious culture, and to demonstrate this I am going to look at the creationism vs evolution debate.

I don't want to actually get knee-deep into the creationism vs evolution debate (I am not Richard Dawkins); instead I wish to concentrate on the effect of creationism on our western culture. Though many of us take the theory of evolution for granted, less than 150 years ago the main position of the scientific community was a creationist one. So ingrained was the creationist viewpoint that Darwin took twenty years to publish his theory, thanks to the Galileo-like treatment handed out to scientists who ran counter to the biblical view, who were ostracized and attacked as heretics.

So what does all this have to do with development? Most of our theories on good development and management methodologies come from those established at the time of the Industrial Revolution and the rise of Victorian engineering as a discipline in the nineteenth century. If we look at the rise of engineering alongside that of evolution, we can see clearly that engineering practices were not informed by evolutionary theory but by a creationist viewpoint. To illustrate this, consider that by the time the first copies of Darwin's groundbreaking publication The Origin of Species went on sale, the great engineer Isambard Kingdom Brunel had been cold in his grave for a good two months.

Creationists (being theists rather than deists) take issue with the evolutionary principle that complex biological forms evolve from simple primitives over many adaptations (dare I say it: iteratively); instead creationists argue that something (mainly the Abrahamic God) created all living creatures instantaneously and perfectly. Culturally we have been indoctrinated with this viewpoint, resulting in a populist, common-sense feeling that complex systems are only successful if they are designed up front, created perfect and built to last indefinitely (Big Design Up Front). This view is compounded by the fact that evolution is often misinterpreted (often deliberately by creationists) as being based on pure chance.

As engineering and management theories were established under the creationist view of the biological world, so they became polluted by it. After thousands of years of belief in creationism it is not surprising that, 150 years later, even modern practices such as software development still have creationist ideas embedded in them. Though evolutionary design is becoming more of a mainstream approach to software development, many people still defend the old creationist cause and often argue against Agile methodologies with the cultural preconditioning that unless something is created perfect, its overall success is somehow left to chance, with everyone free to do what they want. For many developers moving away from the creationist view is a big step, but in the same way that the concept of adaptation in evolution provides answers for a seemingly perfect, complex world, so it does with Agile; and where evolution emphasises the discipline of natural selection over random chance, so Agile has its own disciplines. Fortunately software development is aided by the fact that something intelligent is helping it along (or at least most of the time)!

Monday 11 June 2007

Letting Schrödinger's cat out of the bag

On Friday I journeyed up to the greatest capital city in the world - London (try searching on Google if you don't agree) - and had the final round of interviews with ThoughtWorks. The end result was I got the job (I'm a very happy JMB) - which has already cost me a huge wedge of cash taking my wife out for a celebratory meal in a posh restaurant - and now I am just awaiting the official offer before I start the ball rolling (handing in my notice etc.).

I'd like to say what a great day I had: I have never experienced an interviewing process so thorough yet at the same time so inclusive. There were several points in the day when I had to stop myself from wandering round the office, going up to people or sitting at a laptop and pretending to be a real ThoughtWorker (which I soon will be!). I have never felt so accommodated and welcome at an interview before, from the moment I walked in the door, and despite the normal sweaty palms I felt fairly relaxed (or at least only slightly anxious); even for the written tests I was made to feel as comfortable as possible.

What was really evident was the amount of probing ThoughtWorks do to ensure that you are going to be suitable from an environmental/cultural perspective. Though this sounds obvious (and I did half-dismiss its relevance when being told about it), there is a real effort from ThoughtWorks to make sure you understand what life as a ThoughtWorker is like - good and bad. They try to build up the picture of ThoughtWorks as much as possible, warts and all. I really appreciated this, and not just for its openness: working for ThoughtWorks can sound like the development equivalent of getting to play premiership football or landing a Hollywood part, but you still need to understand that it isn't all red carpets and adoring fans - there is work to do and sometimes that work may not be too glamorous.

I'd also add that I was surprised at the level of feedback ThoughtWorks give you; with virtually every other company that has offered me a job you get the normal stony-faced interview - with a few smiles and laughs at your anecdotes and Dad jokes - and then the short "we'd like to offer you the position" phone call. ThoughtWorks gave me loads of direct feedback - which I must say was really flattering - and when I walked out of the office - and aimlessly wandered the streets of London for the 30 minutes it took me to come to my senses and realise I was hungry and needed to get home - I honestly can't say whether my head was more full of the excitement of the job offer or of the "they said I was...".

The only downside of the whole thing was, when I finally did pull myself together and get myself home, I had to attempt to excitedly reiterate nearly five hours of information to my friends and family: but hey, if you can't bore your closest and dearest half to death with a detailed breakdown of the logic test - which retrospectively I perversely enjoyed - then who can you?

Tuesday 5 June 2007

Schrödinger's cat

A couple of weeks ago I closed my eyes and clicked the submit button on the ThoughtWorks website, and down went my CV, too late to change, into the ThoughtWorks database. Imagine my excitement when the next day I got an email asking me to ring back to organise a phone interview.

Seemingly the interview went well as I was asked to complete a coding test. I had to choose one of three tests and I spent hours agonizing over what angle I should tackle this at. Everyone knows the ThoughtWorks mantra on simplicity, but just how simple is simple? I believe I've kept my code simple but what if it's too complicated? And if I make it simpler what if it's then too simple? I can't describe the hours of agonising until I finally came up with a solution I was happy with which I zipped up, closed my eyes a second time, and pressed my shaky finger down onto the mouse button. I hadn't felt this nervous since jumping off the top diving board or, in my teens, ringing a girl I met in the club!

As soon as I went to bed that evening the code went round and round in my head and I cursed myself for the hundreds of dumb decisions I'd made, regretting that I hadn't just slept on it before sending it. "But hey," I said to myself, "this is like the real world: we can always go back over our code improving it - a good artist knows when to stop", and to my reading ThoughtWorks is as much about knowing when to stop as it is about keeping it simple.

Because of the bank holiday I have been waiting a week or so for the feedback on my code and to hear whether I've made it to the next stage. When I first applied for the job I genuinely wanted to work for ThoughtWorks but was open to the possibility that I might not be successful; as the process has gone on, though, I have felt a stronger and stronger desire to get this job, and the potential disappointment has grown with it.

As I face the twin prospects of elation and despair I realise that going for jobs is like Schrödinger's cat: your life becomes the cat in the box, the job the nucleus. Until I open the box (i.e. get the final result of the application) there exist two realities: the one in which I get the job and the one in which I don't. So from now until that time I live with the turmoil of flipping between the belief that my code was the most amazing they've ever seen, that I am going to get this job and life's going to be groovy, and the desperate self-doubt which tells me they are so disgusted at the poor quality of my work it'll take at least five years for them to scrub the horror from their memories!

As the French say "C'est la vie" and the First Noble Truth is "suffering exists". This is a period in my life with an outcome which, to a certain extent, is outside of my control: there is a reality on whether I have, at this stage in my career, what it takes to be a ThoughtWorker and no matter how much I may believe I do (which I really, really do) that cat in the box is either alive or dead: it is not both.

Building Cathedrals

Within companies large and small, it is very likely that there is an in-house development team working on a system in continuous development. In every company I've worked for I have spent the majority of my time developing a highly strategic system, whether a website or a CMS/CRM (or one of the many other overly general three-letter acronyms out there). These systems are the Cathedrals of software: they are by far the largest projects in the company, highly important and grandiose, with great architectural vision.

These systems inevitably end up the bane of senior managers', project managers' and developers' lives. Inevitably a battle develops between the business, demanding bags of "quick wins" (that phrase which makes many a developer quake in their boots), and the developers, who want to "do things properly". Pour on way too much vision from all parties and you end up with the poor old project manager curling into the foetal position and jabbering on about "delivery" as everyone stands around kicking them for whatever failure the business or development team wish to cite from the last release.

In these circumstances I have found Agile to be a godsend: the business gets quick returns, the project managers get to wrap themselves up snugly in their Gantt charts and click away happily at MS Project, and the developers - although they still sit there pushing back deadlines and moaning they haven't got enough resource - actually deliver something. All in all it keeps the whole project from becoming the undeliverable White Elephant which the business and development team have managed to concoct between them.

After a couple of years of doing this with success I was shocked to find that a few of the problems of the old waterfall methodology had begun to raise their ugly heads, namely: degradation, fracturing and velocity. After some careful thought and backtracking through my career I started to notice some startling similarities between all the developments I had worked on.

Degradation

Degradation is a natural occurrence in any system. Software systems are fortunate in that - unlike many systems in the natural world - they rarely degrade through use. Think of the rail network: the more the rails are used, the worse their condition becomes and the sooner they need replacing. Although external factors can cause a software system to degrade through use (database size, hardware age etc.), the actual lines of code are in exactly the same condition as when they were written.

The biggest cause of degradation in software is change: as new features are written or bugs are fixed existing code can become messy, out of date or even completely redundant without anyone actually realising it. Also existing parts of the system which work perfectly can get used in a manner not originally intended and cause bugs and performance issues.

There are a number of techniques out there in the Agile world to help minimize degradation, including Refactoring, Collective Ownership and Test Driven Development. However, heed the warning: despite the best unit tests in the world and pair programming every hour God sends, the prevention of degradation relies on two things: time and human intervention. Time is required to fix degraded code, which increases the length each feature takes to implement and thus affects velocity. Human intervention requires someone recognising the degradation and, furthermore, being bothered to do anything about it (collective ownership and pair programming do help here but are in no way a guarantee).

The danger of degradation is that a system can degrade so much that it has a severely negative impact on the progress of a project - sometimes even bringing it to a complete halt - resulting in the development team battling the business for a "complete rewrite". Degradation is not good for developers' morale, mainly because admitting to it feels like an admission of writing bad code. This results in the all-too-common backlash that it's the business's fault for putting on too much pressure to deliver "quick wins" and not accepting the need to preserve the developers' architectural vision; and here we are again in the same spot we were in with the waterfall method.

Fracturing

Fracturing can look very similar to degradation but is actually quite different. Fracturing occurs all the time in any system which works to standards - which is all systems regardless of whether those standards are documented or are for a team or individual - as they shift and move to account for changes. One example of this is the naming convention on networks: many companies opt for some kind of convention for naming switches, printers, computers etc. which seems to suit until something comes along which no longer fits. For example when a hypothetical company starts out they have three offices in the UK so the naming style becomes [OFFICE]-[EQUIPMENT]-[000]. But then the company expands to Mexico and someone decides that the naming convention needs to include the country as well: the convention is now fractured as new machines now have [COUNTRY]-[OFFICE]-[EQUIPMENT]-[000]. Then an auditor comes along and says that you should obfuscate the name of your servers to make it harder for a hacker (as UK-LON-ACCOUNTSSQL is a nice signpost to what to hack) so the convention has to change causing even more fracturing.

This happens in code all the time as your standards shift like sand: yesterday you were using constructors for everything and then today you read Gang Of Four and decide that the Abstract Factory pattern is the way to go. The next day you read Domain Driven Design and you decide that you should separate the factory from the class that it creates. Fracturing, fracturing, fracturing. And then you read about NHibernate and decide that ADO is old news and then in a few years LINQ comes out and you swap to that. Fracturing, fracturing, fracturing. Then you discover TDD and start writing the code in tests but the rest of the team doesn't get it yet. Fracturing, fracturing, fracturing.

Of course you could stand still and code to strict guidelines which never change, the team leader walking around the floor with a big stick which comes down across your knuckles every time your code strays slightly in the wrong direction. But who wants to work in an environment like that? Fracturing is a result of many things but mostly it is a result of a desire to do better and as such an improvement in quality. What coders believe is out of sync is rarely their new sparkling code but their old "I'm so embarrassed I wrote that" code. As a result their reaction is a desire to knock down the old and rewrite everything from the ground up using their new knowledge and techniques (though most recognise this is not "a good thing").

Fracturing can also result from negative factors as well: e.g. someone leaves the company taking years of experience with them and you get a new bod in but they've got to get coding quickly so we throw the standards out and we have more fracturing. Or there's a lot of pressure to get that code out so drop those new advances we proved in the last project and go back to the old safe and sound method.

Velocity

The obvious thing to say is that project velocity is negatively impacted by degradation and fracturing, but that would only be half the picture. The reality is also the reverse: velocity has an effect on degradation and fracturing. Agile methodologies such as XP place a great deal of stress on maintaining a realistic velocity, and the advice is wise indeed: too much velocity and more code is written than can be maintained, creating too much degradation. On the other hand, with too little velocity so little code is written between the natural shifts that the amount of fracturing per line of code ends up at a much higher ratio than it would have been had the velocity been higher.

Another consideration is that if velocity is not carefully controlled, development can end up in high and low periods. High periods become stressful, causing mistakes, less time to fix degraded code, negative fracturing and ultimately burnout. Low periods create boredom and frustration, which cause mistakes too, or they encourage over-engineering, again causing degradation and fracturing.

Getting the velocity correct is a real art form and requires more considerations than I could possibly list. However, one thing that is required to get velocity correct is waste. Waste is the bane of all businesses, and they often spend more money trying to eliminate waste than its original cost. Developers are expensive, and to have a developer potentially sitting around doing nothing is a no-no for many companies; they want every developer outputting 100%, 100% of the time. However, the inability to run to such tight budgets is a reality that virtually every industry has come to terms with. Take a builder for example: if he's going to build a house he'll approximate how much sand, bricks, cement, wood and everything else he'll need, then he'll add a load on top. Ninety-nine times out of a hundred the builder will end up with a huge load of excess materials which he'll probably end up throwing away (or taking to the next job). Of course this isn't environmentally friendly, but the principle the builder is working off is that if he ends up just two bricks or half a bag of cement short he's going to have to order more. That means a delay of a day, which puts the whole project off track: basically it's going to cost him a lot more than the acquisition and disposal of the excess had he ordered an extra pallet of bricks or bag of cement. Thus the builder makes the decision that in order to maintain velocity there will be waste.

What to do

There is an old joke:
"What's the difference between a psychotic and a neurotic? Well, a psychotic thinks 2+2=5. Whereas a neurotic knows that 2+2=4, but it makes him mad. "
Developers and businesses risk becoming psychotic about degradation and fracturing, believing that by coming up with some amazing architecture or definitive strategy all these problems will go away. Neurotics are only slightly less dangerous: as they become over-anxious and fearful of degradation and fracturing they introduce processes, architectures and strategies until they become indistinguishable from the psychotics.

Eric Evans has many ideas in his book Domain Driven Design and is the very example of a pragmatist. Evans accepts that nasty things happen to systems and that bad code is written or evolves. To Evans it is not important that all of your code is of shiny platinum quality, but that the code which is most critical to the business is clean and pure. This is why Evans doesn't slate the Smart UI pattern: to him there are many situations where it is acceptable; it just isn't realistic to build a TVR when all you need is a 2CV.

Part IV (Strategic Design) of Domain Driven Design is dedicated to concepts for protecting your system and is unfortunately the part which gets the least attention. Concepts such as Bounded Context, Core Domain and Large-Scale Structure are very effective in dealing with degradation and fracturing. Although the best advice I can give is to read the book, one of the ideas which interests me the most is the use of Bounded Contexts with Modules (a.k.a. packages - Chapter 5). Evans explains that although the most common method of packaging tends to be around the horizontal layers (infrastructure, domain, application, UI), the best approach may be to have vertical layers based around responsibility (see Chapter 16: Large-Scale Structure). This way loosely coupled packages can be worked on in isolation and brought together within the application: there is almost a little hint of SOA going on. If an unimportant package starts to degrade or becomes fractured (for example it uses NHibernate rather than ADO.NET) it doesn't require the rewriting of an entire horizontal layer to keep it clean: the package can stay inconsistent within its own boundary without having any effect on the other parts of the system. This leaves the development team to concentrate their efforts on maintaining the core domain and bringing business value.

This isn't the only solution though: the use of DDD on big systems will only bring benefit once it hits a certain point. If all your development team is able to do is continuously build fractured systems which degrade (whether due to skill or resource shortages) then it may be best just to bite the bullet and accept it. A monolithic system takes a lot to keep going and, although DDD can be applied to prevent many of the issues discussed, if you cannot get the velocity then your domain will not develop rapidly enough to deliver the benefits of a big enterprise system. If that is the case it may be more realistic to abandon the vision and instead opt for a strategy that accurately reflects the business environment. This is where I believe Ruby on Rails is making headway; it isn't going to be apt for all enterprise systems, but it does claim to take most of the pain out of those database-driven web-based applications that are proliferating through companies. The point is, even when a big-boy enterprise system sounds like the right thing to do, trying to develop one in a business which cannot or will not dedicate time, money and people is going to be a bigger disaster than if you ended up with a half-dozen Ruby on Rails projects delivering real business value. And you never know: once you've delivered a critical mass you may find you can move things up to the next level naturally.

Conclusion

If you ever visit an old building, especially a cathedral such as Canterbury, you'll know that they have often gone through hundreds of iterations: bits have been damaged by fire, weather or pillaging, or deliberately destroyed; they have been extended in ways sympathetic to the original and in ways originally considered abominations; some bits have been built by masters, others are falling down because of their poor quality. The fact is these buildings are incredible not because they are perfect but because of the way they have adapted to use through the ages. Old buildings face huge challenges to adapt through all periods of history, where architecture must be compromised and the original people, tools and materials change or may no longer be available. The challenge for these buildings across the centuries was not to maintain an unrealistic level of architectural perfection but to ensure that they remained useful - the moment that stops, the building will disappear.

Monday 21 May 2007

Did God use XP or Scrum?

I was wandering around my house in a sleepy haze this morning with thoughts gently pulsating into my peripheral consciousness. I switched on the light and said to myself "Let there be light, and there was light", and then two peripheral thoughts blurred into focus at once, prompting me to rummage through my wife's academic books (she's a teacher of Religion) and read through the first chapter of the Judeo-Christian Old Testament.

Jokes about developers' egos aside, it is quite clear that God used iterative development (or creation). Give the first chapter of Genesis a read for the compelling evidence.

In fact I think there is a clear case for claiming God was a YAGNI type of mono-deity as he creates light on day one but doesn't introduce the Sun until day four (verse 16: "God made two great lights--the greater light to govern the day and the lesser light to govern the night.") and though I've personally never worked on a project of that scale I can safely guess that there was some serious refactoring going on there! This also shows that the Creator has a real talent for deciding what the next most important thing should be.

Some of the more observant out there may have noticed that God was also using a bit of Behaviour Driven Development but being God he uses "Let there be" rather than "Should" before each one of his tests.

You may be wary of reading quite so much into the text, but I think it's safe to say there was no BUFD going on - and can you imagine it if there was: on the first day he drew up the requirements, on the second day he wrote the design document, on the third day he finally started, on the fourth day he realised that he'd forgotten the light and had to submit a change request...

Tuesday 15 May 2007

YAGNI battles the Black Swans

There's been a lot of fuss in the economic world about a book called The Black Swan: The Impact of the Highly Improbable by Nassim Nicholas Taleb (published by Allen Lane). Taleb uses the metaphor of the discovery of the black swan to explain his theory: before the discovery of Australia people believed there were only white swans in the world and built a whole set of theories around this 'fact'. Then black swans were discovered in Australia, junking the theories.

Using this as a template, Taleb defines a Black Swan as a highly improbable event with three principal characteristics: it is unpredictable; it carries a massive impact; and, after the fact, we concoct an explanation that makes it appear less random and more predictable than it was.

Taleb goes on to criticise businesses, markets and politicians for their over-confidence in prediction. He even goes so far as to suggest that it is a cold, hard fact that we cannot predict Black Swans, but that by pre-empting the future based on our belief that only White Swans exist we set ourselves up for disaster. So when a Black Swan does come along - and Taleb argues that they come along a lot more often than we predict - we've made things a whole lot worse by planning around White Swans.

Any developer who's ever used the phrase "You Ain't Gonna Need It" would identify with Taleb's theory. YAGNI tells us that we should only implement what we need, not what we foresee we'll need - no matter how sure you are that you're gonna need it. By writing code which pre-empts the future design we are building systems for White Swans, and when that Black Swan comes along (and it will) we're going to be running around trying to refactor a load of over-engineered code. The difference between a Black Swan developer and a White Swan developer is that the Black Swan developer accepts this and uses YAGNI to battle it, whereas the White Swan developer just re-engineers his predictions by turning the latest Black Swan into another White Swan - that is, until the next Black Swan rears its ugly head.

I think YAGNI developers - either naturally or through experience - believe that life is full of Black Swans. I think this is why some developers don't get YAGNI: they only believe in white ones. Maybe a different approach is not to say to them "You ain't gonna need it" but to say "Black Swan".

Sunday 13 May 2007

Encapsulating Top Trump objects

Many of the objects we design are like Top Trumps: at first inspection it looks like we should be exposing all their information so we can see it, but when you look closer you realise it's 100% about behaviour.

What do I mean? A game of Top Trumps consists of you looking at your card, choosing what you believe to be the best statistic on it and challenging your opponent. This leads us to expose the data of our Top Trump object using properties so we can see it clearly. In reality, however, the Top Trump card should be keeping its data secret: it exposes its data to you, not to your opponent; you merely ask your opponent whether your card beats his, but you never know what data his card contains.

If we were to model this in an OO world we would probably have a TopTrump class with various comparison methods. We wouldn't want to do the comparisons ourselves. For example, we wouldn't do:

public void ChallengeOnSpeed()
{
    if(me.CurrentTrump.Speed > opponent.CurrentTrump.Speed)
    {
        me.TakeTrumpFrom(opponent);
    }
    else if(me.CurrentTrump.Speed < opponent.CurrentTrump.Speed)
    {
        me.GiveUpTrumpTo(opponent);
    }
    else
    {
        // it's a draw
    }
}
Instead we'd have the Trumps make their own decisions:

public void Challenge(IStatistic statistic)
{
    ChallengeResult challengeResult = myTrump.ChallengeWith(statistic);

    if(challengeResult == ChallengeResult.Won)
    {
        TakeTrumpFrom(opponent);
    }
    else if(challengeResult == ChallengeResult.Lost)
    {
        GiveUpTrumpTo(opponent);
    }
    else
    {
        // it's a draw
    }
}
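The ChallengeResult type is never shown in the original post; a simple enum (an assumption on my part) is enough for the comparison above:

public enum ChallengeResult
{
    Won,
    Lost,
    Draw
}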
So what has this gained us? It is no longer down to the player to compare the information the trumps expose; instead the responsibility for comparison lies with the trumps themselves. Now the trumps are autonomous in deciding how they should behave: they are encapsulated. "And the point of that was?" To answer that question let's look at the evolution of our model. The first version of our software only supported one type of Trump, and that was the Supercars pack. Our players could compare directly by choosing ChallengeOnSpeed etc., but now we've designed a new Marvel Comic Heroes pack which has different categories. We'd have to either add new methods or create new Player classes to handle the new Trump classes, and new Game classes to handle the new Player classes. What a load of work! Then what happens when the Dinosaurs pack or the Horror pack comes out? Or what if we go global and the speed is worked out in km/h as well as mph, or prices in dollars as well as sterling? By exposing the data we have overburdened the Player and Game classes with responsibility, causing a maintenance nightmare: now every time our domain evolves our code breaks.

This is why properties break encapsulation when they expose data directly. When we put the responsibility for comparing data on a class which doesn't own it, any change to the class that owns the data has a ripple effect through your whole codebase, breaking the Single Responsibility Principle. If, on the other hand, the object itself is responsible for carrying out the operations on its data, you only make one change, and in turn you strengthen the concepts of your domain.

"OK" I hear you say "but you have to expose properties so you can render the data on the user interface". Ah but do you? What if the Trump object told the UI what to display? "Surely that would mean putting UI code into the Trump object and that's just wrong and anyway I need that data to save to the database as well does that mean we also end up with data code in the class that's even worse?" Not if we use the Mediator pattern (hey imagine a Design Patterns Top Trumps!).

Mediator to the rescue
On a side note I'd like to discuss why I chose Mediator over Builder. Some people use the Builder pattern (Dave Astels did in his One Expectation Per Test example), which is a construction pattern used to separate the construction of an object from its representation. The purpose of the Builder is that at the end you get an object. However, when working with a UI you may not have an object as an end product; instead you may just be sticking your values into existing objects. The Mediator, on the other hand, merely acts as an interface which defines the interaction between objects to maintain loose coupling. You could argue that some implementations of Builder are really a Mediator with a Factory method (or a combination of both, as we shall soon see).

Well, let's start by defining our mediator for our Top Trump. Those who have read my blog before will know I'm a big fan of Behaviour/Test Driven Development, but for the sake of conciseness I'll deal only with the implementations in these examples.

public class TopTrump
{
    int speed;

    public void Mediate(ITopTrumpMediator mediator)
    {
        mediator.SetSpeed(speed);
    }
}

public interface ITopTrumpMediator
{
    void SetSpeed(int speed);
}
Of course you could use a property (what?) for setting the data if you so wished (I think that's perfectly legitimate, as changing the value is a behaviour), however there are good arguments for using a method, one of them being overloading (suppose we wanted SetSpeed(string)).

Now when we implement our concrete Mediator we tie it to the view:

public class UiTopTrumpMediator : ITopTrumpMediator
{
    private readonly ITopTrumpView view;

    public UiTopTrumpMediator(ITopTrumpView view)
    {
        this.view = view;
    }

    public void SetSpeed(int speed)
    {
        view.SpeedTextBox.Text = speed.ToString();
    }
}
That works nicely, and of course you could implement a database Mediator as well, reusing the same Mediate method regardless of the destination (I love OO, don't you?). The only thing is our Mediator is a bit too close to the exact representation of the object. If we were to introduce our new packs we'd have to rewrite our mediator interface and all the classes that consume it. What we need to do is get back to our domain concepts and start dealing in those again:

public interface ITopTrump
{
    void Mediate(ITopTrumpMediator mediator);
}

public interface ITopTrumpMediator
{
    void AddStatistic(IStatistic statistic);
}

public class SupercarTrump : ITopTrump
{
    int speed;
    SpeedUnit speedUnit;

    public void Mediate(ITopTrumpMediator mediator)
    {
        mediator.AddStatistic(new SpeedStatistic(speed, speedUnit));
    }
}

public class DinosaursTrump : ITopTrump
{
    StrengthStatistic strength;

    public void Mediate(ITopTrumpMediator mediator)
    {
        mediator.AddStatistic(strength);
    }
}
Then on IStatistic we'd add a ToString method like so:

public interface IStatistic
{
    string ToString();
}

public class SpeedStatistic : IStatistic
{
    private readonly int speed;
    private readonly SpeedUnit speedUnit;

    public SpeedStatistic(int speed, SpeedUnit speedUnit)
    {
        this.speed = speed;
        this.speedUnit = speedUnit;
    }

    public int Speed { get { return speed; } }

    public override string ToString()
    {
        return String.Format("{0}{1}", speed, speedUnit);
    }
}
Of course we could go on and on refining and refactoring - adding multi-lingual support to our mediator etc. - but hopefully you get the picture: by encapsulating the data of the object and placing the stress on its behaviour we protect its internal representation from being exposed, thus decreasing its coupling and making it more flexible and stable during change.

Now, if I'm honest with you, after having all the above revelations there was one scenario I struggled with and hadn't found any good examples for. Basically everyone talks about displaying data on the UI but never about changing the object from the UI. Changing the data implies breaking encapsulation in some way, and if the UI isn't allowed to know about the internal representation, how is it supposed to change it? Basically, how would our Trump Designer package create new Trump cards and edit existing ones?

Well, creation is easy: we'd use the Builder pattern and have a concrete implementation of ITopTrumpBuilder for each type of Top Trump card. The UI would then simply engage the ITopTrumpBuilder and pass its data across in much the same fashion as with the mediator, just in reverse. The builder could even tell us whether the resulting Trump is valid before we try and get the product.
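The post doesn't show ITopTrumpBuilder, but a minimal sketch of the idea might look like this (the interface shape and the SupercarTrump constructor are my assumptions, not part of the original):

public interface ITopTrumpBuilder
{
    void AddStatistic(IStatistic statistic);   // the UI pushes values in, mirroring the mediator in reverse
    bool IsValid();                            // lets the UI check before asking for the product
    ITopTrump GetTrump();
}

public class SupercarTrumpBuilder : ITopTrumpBuilder
{
    private SpeedStatistic speed;

    public void AddStatistic(IStatistic statistic)
    {
        SpeedStatistic speedStatistic = statistic as SpeedStatistic;
        if(speedStatistic != null)
        {
            speed = speedStatistic;
        }
    }

    public bool IsValid()
    {
        return speed != null;
    }

    public ITopTrump GetTrump()
    {
        return new SupercarTrump(speed);       // assumes SupercarTrump gains a constructor taking its statistics
    }
}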

Remember Memento? (not the film, the pattern)
But still, what about editing an object? There's a pattern called Memento which, because of the film, probably has the catchiest of all the pattern names, but it is still quite a rarity to see it used. That's because Memento's core purpose is undo behaviour, which is very rare in enterprise systems, but it is handy for general editing scenarios. Basically a Memento is either a nested private class which holds, or is allowed to manipulate, the state of its container (the originator), or an internal class which the originator loads and extracts its values from. Therefore Mementos offer a very nice way of encapsulating edit behaviour if we combine them with Mediator to create a public interface which objects external to the domain can use.

public interface ITopTrumpMemento
{
    void UpdateStatistic(IStatistic statistic);
}

public class SupercarTrump : ITopTrump
{
    // (the Mediate method from earlier is omitted here for brevity)
    private State state;

    private class State
    {
        internal int Speed;
        internal SpeedUnit speedUnit;
    }

    private class SupercarTrumpMemento : ITopTrumpMemento
    {
        private State state;

        public SupercarTrumpMemento(State state)
        {
            this.state = state;
        }

        public State GetState()
        {
            return state;
        }

        public void UpdateStatistic(IStatistic statistic)
        {
            SpeedStatistic speedStatistic = statistic as SpeedStatistic;
            if(speedStatistic != null)
            {
                state.Speed = speedStatistic.Speed;
            }
        }
    }

    public ITopTrumpMemento CreateMemento()
    {
        return new SupercarTrumpMemento(state);
    }

    public void Update(ITopTrumpMemento memento)
    {
        this.state = ((SupercarTrumpMemento)memento).GetState();
    }
}
So there you go: now your UI (or database, or webservice) can work with the ITopTrumpMemento interface for editing ITopTrump objects, and you can add new TopTrump classes which store their internal data in varying ways to your heart's content without ever breaking any code!

There are advantages to this too numerous to mention: loose coupling is promoted as the UI never gets near the domain, testing is made far easier as you can use mocks of the IMediator, IBuilder and IMemento instead of working with the domain objects directly, and reusability is increased as the mediators take responsibility away from your presenters.

Tip:
The trick to maintaining your encapsulation as neatly as possible is to ensure that your IMediators, IBuilders and IMementos all deal with the concepts of their domain (for example IStatistic) and not the structure of the data (e.g. int speed).
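For contrast, this is the kind of mediator signature to avoid (IDataCentricMediator is a made-up name purely for illustration):

// A hypothetical data-centric mediator: unlike ITopTrumpMediator's AddStatistic(IStatistic),
// it exposes each card's internal structure, so changing how SupercarTrump stores its
// speed would break this contract and everything consuming it.
public interface IDataCentricMediator
{
    void AddSpeed(int speed, SpeedUnit speedUnit);
}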

Friday 11 May 2007

Eric Evans: Hear his voice

DotNetRocks have published a podcast of an Eric Evans interview discussing Domain Driven Design (Show #236).

There's about 11 minutes of irrelevant chatter at the beginning so just skip forward.

Have a listen and enjoy.

Thursday 10 May 2007

Don't Expect too much

A long while ago (well over two years) there was a lot of fuss made on the testdrivendevelopment Yahoo group about having only one assertion per test. Dave Astels wrote a great little article and a little more fuss was made. It was one of those things you knew made sense but sounded like a lot of work. I played with one assertion per test anyway and suddenly felt my code developing more fluidly as I focused on only one thing at a time, my tests looked clearer (more behaviour oriented) and refactoring became a lot simpler too.

Then Dave Astels came along and pushed the envelope further with One Expectation per Example. This really grabbed my attention as I had been enjoying the benefits of keeping my test code clean with one assertion per test, but anything with mocks in it just turned into an out-of-control beast (especially some of the more complex logic such as MVP). Even a simple four-liner like the one below would end up with four expectations:
public void AddOrder(OrderDto orderDto)
{
Customer customer = session.GetCurrentCustomer();
customer.AddOrder(OrderFactory.CreateOrder(orderDto));

workspace.Save(customer);
}
Then every time I needed to add new functionality I had to add more expectations, and anyone who read the test (including myself after 24 hours) would struggle to make head or tail of the monster. And if tests failed it would take a day to find which expectation went wrong. If you're not careful you end up with the TDD anti-pattern The Mockery.

I had read Dave Astels' article several times but couldn't fathom how it worked, especially as it was written in Ruby with the behaviour-driven RSpec. In the end I had to write it out in .NET myself before I got it.

So here is a break down of how I got Dave's One expectation per example to work for me:

One Expectation Per Example (C#)
One of the first things to note is that Dave uses the builder pattern in his example. The idea is that the Address object interacts with a builder and passes its data to it, rather than allowing other objects to see its state directly and break encapsulation. I would like to go into this technique in more detail in another article, but to deliver the point quickly: imagine you create an HTML builder to display the address on the web.
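To make that concrete, here's the sort of thing such a builder could end up looking like once IBuilder has grown the two setters used later in this walkthrough (HtmlAddressBuilder is a hypothetical name, not something from Dave's article):

public class HtmlAddressBuilder : IBuilder
{
    private string address1;
    private string csp;

    public string Address1
    {
        set { address1 = value; }
    }

    public string Csp
    {
        set { csp = value; }
    }

    // Renders whatever the Address pushed in, without the UI ever having seen
    // the Address's internal state.
    public string ToHtml()
    {
        return string.Format("<p>{0}<br/>{1}</p>", address1, csp);
    }
}

The Address stays in charge of its own data; the builder only ever sees what the Address chooses to hand it.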

Well let's start with Dave's first test:
[TestFixture]
public class OneExpectationPerExample
{
[Test]
public void ShouldCaptureStreetInformation()
{
Address addr = Address.Parse("ADDR1$CITY IL 60563");

Mockery mocks = new Mockery();
IBuilder builder = mocks.NewMock<IBuilder>();

Expect.Once.On(builder).SetProperty("Address1").To("ADDR1");

addr.Use(builder);

mocks.VerifyAllExpectationsHaveBeenMet();
}
}
You may have noticed that I've changed a few things, mainly to make it look consistent with .NET design practices. Basically I've introduced a Parse method rather than the from_string method Dave uses.

Now we need to get this baby to compile. First we need to create the Address class like so:
public class Address
{
public static Address Parse(string address)
{
throw new NotImplementedException();
}

public void Use(IBuilder builder)
{
throw new NotImplementedException();
}
}
And the IBuilder interface:
public interface IBuilder {}
Now it compiles but when we run it we get the following:
   mock object builder does not have a setter for property Address1
So we need to add an Address1 setter to IBuilder.
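A minimal version of the interface at this stage only needs the property the test expects (it grows a Csp setter later in the walkthrough):

public interface IBuilder
{
    string Address1 { set; }
}

Then we run the test again and we get: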
    TestCase 'OneExpectationPerExample.ShouldCaptureStreetInformation'
failed: NMock2.Internal.ExpectationException : not all expected invocations were performed
Expected:
1 time: builder.Address1 = (equal to "ADDR1") [called 0 times]
Let's implement some working code then:
public class Address
{
private readonly string address1;

private Address(string address1)
{
this.address1 = address1;
}

public static Address Parse(string address)
{
string[] splitAddress = address.Split('$');

return new Address(splitAddress[0]);
}

public void Use(IBuilder builder)
{
builder.Address1 = address1;
}
}
Run the tests again and they work! So let's move on to the second part: implementing the Csp. Here's the new test:

[Test]
public void ShouldCaptureCspInformation()
{
Address addr = Address.Parse("ADDR1$CITY IL 60563");

Mockery mocks = new Mockery();
IBuilder builder = mocks.NewMock<IBuilder>();

Expect.Once.On(builder).SetProperty("Csp").To("CITY IL 60563");

addr.Use(builder);

mocks.VerifyAllExpectationsHaveBeenMet();
}
Now, with a little refactoring to get rid of our repeated code, we turn it into this:
[TestFixture]
public class OneExpectationPerExample
{
private IBuilder builder;
private Address addr;
private Mockery mocks;

[SetUp]
public void SetUp()
{
mocks = new Mockery();

builder = mocks.NewMock<IBuilder>();

addr = Address.Parse("ADDR1$CITY IL 60563");
}

[TearDown]
public void TearDown()
{
mocks.VerifyAllExpectationsHaveBeenMet();
}

[Test]
public void ShouldCaptureStreetInformation()
{
Expect.Once.On(builder).SetProperty("Address1").To("ADDR1");

addr.Use(builder);
}

[Test]
public void ShouldCaptureCspInformation()
{
Expect.Once.On(builder).SetProperty("Csp").To("CITY IL 60563");

addr.Use(builder);
}
}
Looking good! We run the new test and we get the usual error for having no Csp property on the IBuilder so we add that:
public interface IBuilder
{
string Address1 { set; }
string Csp { set; }
}
Then we run the test again and we get:
   TestCase 'OneExpectationPerExample.ShouldCaptureCspInformation'
failed: NMock2.Internal.ExpectationException : unexpected invocation of builder.Address1 = "ADDR1"
Expected:
1 time: builder.Csp = (equal to "CITY IL 60563") [called 0 times]
Oh no. This is where Dave's article falls apart for .NET. Basically RSpec has an option to create Quiet Mocks which quietly ignore any unexpected calls. Unfortunately I know of no .NET mock libraries that have such behaviour (though I have since been reliably informed by John Donaldson on the tdd Yahoo group that it is possible with the NUnit mock library). Still, there is a way out: stub the whole thing out by using Method(Is.Anything):
[Test]
public void ShouldCaptureCspInformation()
{
Expect.Once.On(builder).SetProperty("Csp").To("CITY IL 60563");

// stub it as we're not interested in any other calls.
Stub.On(builder).Method(Is.Anything);

addr.Use(builder);
}
Just be careful to put the Stub AFTER the Expect and not before, as NMock will use the Stub rather than the Expect and your test will keep failing.

So now we run the tests and we get:
   TestCase 'OneExpectationPerExample.ShouldCaptureCspInformation'
failed:
TearDown : System.Reflection.TargetInvocationException : Exception has been thrown by the target of an invocation.
----> NMock2.Internal.ExpectationException : not all expected invocations were performed
Expected:
1 time: builder.Csp = (equal to "CITY IL 60563") [called 0 times]
Excellent, NMock is now behaving correctly, so we can finish implementing the code:
public class Address
{
private readonly string address1;
private readonly string csp;

private Address(string address1, string csp)
{
this.address1 = address1;
this.csp = csp;
}

public static Address Parse(string address)
{
string[] splitAddress = address.Split('$');

return new Address(splitAddress[0], splitAddress[1]);
}

public void Use(IBuilder builder)
{
builder.Address1 = address1;
builder.Csp = csp;
}
}
Run the test and it works! Now if we run the whole fixture we get:
   TestCase 'OneExpectationPerExample.ShouldCaptureStreetInformation'
failed: NMock2.Internal.ExpectationException : unexpected invocation of builder.Csp = "CITY IL 60563"
All we need to do is go back and add the Stub code to the street test. That's a bit of a bummer, but we could refactor our tests to make the call in the tear down like so:
[TearDown]
public void UseTheBuilder()
{
Stub.On(builder).Method(Is.Anything);

addr.Use(builder);

mocks.VerifyAllExpectationsHaveBeenMet();
}

[Test]
public void ShouldCaptureStreetInformation()
{
Expect.Once.On(builder).SetProperty("Address1").To("ADDR1");
}

[Test]
public void ShouldCaptureCspInformation()
{
Expect.Once.On(builder).SetProperty("Csp").To("CITY IL 60563");
}
This approach comes across as slightly odd because the expectations are set in the test but the call under test happens in the tear down. I actually think it is neater in some ways, as it ensures you have one test class for each set of behaviours; the only off-putting thing is the naming convention of the attributes.

I won't bother continuing with the rest of Dave's article as it's just more of the same from here. The only thing I'd add is that he uses one class per behaviour set (or context), so when he tests the behaviour of a string with a ZIP code he uses a whole new test fixture. This can feel a little extreme in some cases as you get a bit of test class explosion, but all in all it does make your life a lot easier.

I hope the translation helps all you .NET B/TDDers out there to free you from The Mockery.

Tip:
In more complex behaviours you may need to pass a value from one mock to another. In those instances you can do:
   Stub.On(x).Method(Is.Anything).Will(Return.Value(y));
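As a sketch of how that plays out with NMock2, here's a self-contained example in the spirit of the earlier AddOrder method; ISession, ICustomer, OrderService and the test names are all made up purely for illustration:

using NMock2;
using NUnit.Framework;

// Hypothetical collaborators, named only for this example.
public interface ICustomer { void AddOrder(object order); }
public interface ISession { ICustomer GetCurrentCustomer(); }

public class OrderService
{
    private readonly ISession session;

    public OrderService(ISession session)
    {
        this.session = session;
    }

    public void AddOrder(object order)
    {
        session.GetCurrentCustomer().AddOrder(order);
    }
}

[TestFixture]
public class PassingValuesBetweenMocks
{
    [Test]
    public void ShouldAddOrderToCurrentCustomer()
    {
        Mockery mocks = new Mockery();
        ISession session = mocks.NewMock<ISession>();
        ICustomer customer = mocks.NewMock<ICustomer>();

        // The session isn't what this example is about, so stub everything on it
        // and have it hand the mock customer through to the code under test.
        Stub.On(session).Method(Is.Anything).Will(Return.Value(customer));

        // The one expectation for this example.
        Expect.Once.On(customer).Method("AddOrder").With(Is.Anything);

        new OrderService(session).AddOrder(new object());

        mocks.VerifyAllExpectationsHaveBeenMet();
    }
}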

Friday 27 April 2007

Being more Fluent with Equals

A lot of .NET developers don't realise that there is a difference between the == operator and the Equals method. Even fewer developers realise there is a difference between the == operator and the == operator. Confused? That's because == will behave differently depending on whether you apply it to a reference type or a value type. More confusingly, some .NET classes override == and will behave differently again. To explain, MS made this attempt:
For predefined value types, the equality operator (==) returns true if the values of its operands are equal, false otherwise. For reference types other than string, == returns true if its two operands refer to the same object. For the string type, == compares the values of the strings.
Except that's not entirely true: == on strings only does a value comparison when both operands are statically typed as string. Confused even more? Then read this post from the C# FAQ.

So are you clear now? If not then MS sums it up quite nicely in their Guidelines for Overloading:
To check for reference equality, use ReferenceEquals. To check for value equality, use Equals or Equals.
So why don't developers have it drummed into them to follow the above advice and just dump ==? Because they believe that == is easier to read. Let's think about that for a moment. How is it easier to read? Using == risks creating buggy code and goes against every rule about intention-revealing interfaces and maintaining encapsulation. How the hell do you know what the developer intended when she wrote x == y? Was it a check for value equality or reference equality? Or had == been overloaded to always do value equality (as string does)? Basically you don't know (breaking intent), you'd have to open up the class to see (breaking encapsulation), and still you wouldn't know for sure. Then of course there is just plain =: did they mean to do that or did they just miss the second =? So == is definitely not easier to read from an intent or encapsulation point of view.

So they must mean that == is better style. This I believe is flawed, because sometimes you end up doing a bit of == here and a bit of Equals there; what's more, all objects have Equals but your own structs don't get == unless you overload it. So having a style which prefers == except when == doesn't do the same thing as Equals (or == doesn't even exist) throws all consistency and style out of the window, and you end up with a style guide that says "use == except when... or when... or when..." rather than just placing a total ban on ==.

Then it must just be that == looks better. I think this is just habit. Let's take the following lines:
if(x == y)
{
// do something
}
if(x = y)
{
// do something
}
if(x.Equals(y))
{
// do something
}
If you took a group of people who knew little about development (or C# for that matter) and asked them what each block means, I can guarantee every one of them would get the last one right (they'd probably think that the double equals meant equals twice, and that would confuse them on the single one). The Equals method is the most explicit, clear and readable of them all (it actually reads as x equals y). To further prove my point, grab the same person and ask them what this means:
if(x)
{
// do something
}
if(!x)
{
// do something
}
Then ask them what this means:
if(x.Equals(true))
{
// do something
}
if(x.Equals(false))
{
// do something
}
Those second examples look far clearer and you'd be an idiot not to know what the intent was. What's more, they make their own mini fluent interfaces. You also eliminate all those "oh, there's a bang at the beginning" bugs. However, I think it is fair to say that x.Equals(true) is a bit overkill, though I do find x.Equals(false) somewhat clearer than using the logical negation operator.

So, after knowing that technically it's the right thing to do, that it's better for showing intent, that it is more consistent, that it reduces the risk of bugs and everything else, if you still need convincing because you still think that == looks better, then justify it by saying you're using a fluent interface.

TDD Anti-Patterns

It's a bit old now but I picked up a piece of code from a supplier that had the Generous Leftovers anti-pattern and it reminded me that every developer should have this list printed off and study it regularly.

Check out James Carr's TDD Anti-Patterns.

The Free Ride is probably the most common one and developers need to refer to "one assertion per test" to try and avoid this.

I think The Mockery is quite possibly the hardest anti-pattern to avoid, mainly because we need a leap forward in the way mock libraries work (what I call Quiet Mocks). This would allow "one expectation per example". RSpec (used in the tutorial) has its own mock library that supports Quiet Mocks by passing a null_object argument. Unfortunately for us .NET developers such support doesn't exist. I have requested it on the NMock feature requests board, but in the mean time you can create a helper class which uses reflection to stub out all the methods/properties you are not testing.

I am planning to 'translate' Dave Astels' article to C# 'cos Ruby can be a bit tough on the eyes if you aren't used to it, and as I mentioned already you need Quiet Mocks (which don't exist) to really get it to work.

In the mean time I made my own contribution to the TDD Anti-Patterns with:

The Mad Hatter’s Tea Party

This is one of those test cases that seems to test a whole party of objects without testing any specific one. This is often found in poorly designed systems that cannot use mocks or stubs and as a result end up testing the state and behaviour of every peripheral object in order to ensure the object under test is working correctly.

Friday 20 April 2007

Sponsor Me

I'm doing the London Marathon this Sunday. If anyone comes across this post and wishes to sponsor me and give some money to the breaks4kids charity I'm running for, then please do.


Sorry: no more non-geeky activity on this blog (though it's for a good cause).

Tuesday 17 April 2007

Don't override the back button

Have Google's developers forgotten number 1 of The Top Ten Web Design Mistakes: namely, don't break the back button?

I was just working on a great post, clicked on preview, found an error, clicked back to return to the form, accidentally (honest) clicked OK, and now the post I just spent 45 minutes working on is gone because Google have messed with the way the back button works.

I know why this happened: we're in a DHTML/AJAX world now and I didn't really submit anything (despite the appearance the form gives); I just flipped a little JavaScript and switched some styles and layers about. But for the sake of the Web, are we really back to the 1999 world of DHTML abuse, or could this be the first sign of Google's Microsoftization, with them now making the standards?

Wednesday 11 April 2007

Whole Values

I read a post the other day that said one of the commonest mistakes made by newbie OOP programmers was to do this:
PostCode postCode = "SW1 1EQ";
Then a spurt of experienced programmers rained down their fury on this common misconception amongst newbies and on how they had to bash it out of them that they shouldn't waste their lives abstracting strings.

I also heard an MS-qualified trainer tell all who knew and respected him that you should avoid the implicit operator overload at all costs because "if your type were meant to be a string MS would have made it one".

Unfortunately this is one of those places where the MS fan boys have been brought up badly by nasty VB. The concept being so frowned upon by leagues of MVPs is not only a cornerstone of OOP (mixing data and behaviour) but also a wonderful idea called the Whole Value, and implicit operator overloading is C#'s gift to you to make it happen.

Thom Lawrence has a lovely short post on how to do whole values with implicit operators here, so I won't repeat his good work (though I will add that Martin Fowler recommends using structs). What I will do is try and explain why whole values are a wonderful thing.

Because a PostCode isn't a Surname
One of the first and most basic things a Whole Value will give you is a level of type safety that you may never have realised existed. Have you ever had that annoying bug pop up in an application because someone accidentally did this:
FindPerson(form.PostCode, form.Surname)
// somewhere else far, far away:
public ReadOnlyCollection FindPerson(string surname, string postCode);
In this simple example it's pretty obvious you've got it the wrong way round, but when you've got a few extra variables to play with it's really easy to get wrong. Well, what if I said there's a way to prevent this ever happening? Use a Whole Value like so:

FindPerson(form.PostCode, form.Surname)

interface Form
{
Surname Surname { get; }
PostCode PostCode { get; }
}
// somewhere else far, far away:
public ReadOnlyCollection FindPerson(Surname surname, PostCode postCode);
Now when you go to compile you will get an error because type PostCode cannot be assigned to type Surname. You'll also find it helps when you do overloading. You can turn nasty code like this:
FindByPostCodeAndSurname(string postCode, string surname);
FindByPostCode(string postCode);
FindBySurname(string surname);
Into this:
Find(PostCode postCode, Surname surname);
Find(PostCode postCode);
Find(Surname surname);
How much cleaner is that? Of course there are other ways to skin that cat, but you will still find that those ugly method names disappear (especially in factories etc.).

Because a PostCode was born of string
The other thing is that PostCode will start his life out as a string of some form, either from a web form or a database, but somewhere he was made out of a string. This is where the implicit overloading comes in: we can allow PostCode to easily start out as a string and behave like a string when he needs to (because he's gonna need to):
PostCode postCode = Form["PostCode"];
Then somewhere far away:

Parameter["PostCode"] = Address.PostCode;
Because a PostCode isn't a string
The other thing is that a PostCode isn't a string. Sure, somewhere he starts life as a string and somewhere you've got to have a string with the real post code in it, but somewhere even further down it ain't a string at all, it's a char array, and somewhere further down... The point of OOP is to abstract real world things and encapsulate them, and if you let PostCode wander around your system as a string he's never gonna reach his full potential (and he might just wander where he shouldn't). All the other bigger, grown-up objects are going to have to do everything for him: deciding whether he's valid, chopping him up to find out what his area code is, comparing him to other postcodes to see if they're in the same area. The poor old postcode will never reach his potential and instead will be pushed and shoved around by all the bigger boys.

You are a cruel, cruel programmer to let this happen: you are just as bad as those parents who never give their children any responsibility and then moan at them for being incapable of doing anything for themselves. But there is still hope: give your PostCode some responsibility and start by making him a Whole Value:
struct PostCode
{
Area Area { get; }
string District { get; }
string Sector { get; }
string InwardCode { get; }
string OutwardCode { get; }
}
Doesn't that look better? Now instead of this:
string postCode = "SQ1 1EQ";
if(LondonPostCodes.Contains(postCode.Substring... yuk I can't go on!
You can do something beautiful like this:
postCode = "SQ1 1EQ";
if(LondonPostCodes.Contains(postCode.Area)) ...
This of course goes even further because Area is a whole value too and you may decide that it should know what city it is. So the code now becomes:
if(postCode.Area.City.Equals(City.London))
Now PostCode can take all of that nasty horrible code that all the bigger boys had (and probably duplicated) and deal with it himself.

Because a PostCode should be a legitimate PostCode
Validation is also a good responsibility for a whole value, so you can make the thing blow up if you try and put something bad in it (just the same as a DateTime will). For extra safety you can add Parse and TryParse methods to your Whole Value (I have them as standard).
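A rough sketch of Parse and TryParse on a string-backed PostCode; the crude regex check is purely for illustration and a real PostCode would encode the proper rules:

using System;
using System.Text.RegularExpressions;

public struct PostCode
{
    private readonly string value;

    private PostCode(string value)
    {
        this.value = value;
    }

    public static PostCode Parse(string value)
    {
        PostCode postCode;
        if (!TryParse(value, out postCode))
        {
            throw new FormatException("'" + value + "' is not a valid postcode.");
        }
        return postCode;
    }

    public static bool TryParse(string value, out PostCode postCode)
    {
        // A deliberately rough validity check, just to show the shape of the method.
        if (value != null && Regex.IsMatch(value, @"^[A-Z]{1,2}\d[A-Z\d]? ?\d[A-Z]{2}$", RegexOptions.IgnoreCase))
        {
            postCode = new PostCode(value.ToUpperInvariant());
            return true;
        }

        postCode = default(PostCode);
        return false;
    }

    public override string ToString()
    {
        return value;
    }
}

Anything invalid blows up at the boundary, so the rest of the system never has to re-validate the postcode.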

So not only does your code become more type safe, more powerful and more flexible, it also becomes more stable. No longer does every other object have to keep checking whether the postcode is in good shape or reference some nasty function library to find out the area; our little PostCode string has grown up into a real object at last and can now go out into the big wide world knowing he can shoulder the responsibility of keeping himself valid and answering all the questions people want to ask of him.

So now, whenever you see a simple type like a string or an int sitting on the outside of one of your classes, take a close look at it and ask yourself what could have been if you'd only let it become a whole value.

About Me

West Malling, Kent, United Kingdom
I am a ThoughtWorker and general Memeologist living in the UK. I have worked in IT since 2000 on many projects from public facing websites in media and e-commerce to rich-client banking applications and corporate intranets. I am passionate and committed to making IT a better world.