Monday 29 October 2007

Your Feet's Too Big

Every piece of code written has a refactoring footprint which is essentially its proliferation through the code base and its level of difficulty to refactor. Some code can have a high proliferation but the refactoring may be simple, for example renaming a variable in modern IDE's is a simple matter regardless of the number of usages. On the other hand changing a variables return type or changing the signature of a method both create a large refactoring effort, especially if that requires knock-on refactorings of other classes and the greater the consumption of the code the greater the footprint. There are other things that can have an impact on the ease of refactoring: using libraries that create code that is difficult to refactor (for example having to place variable names etc. in strings), using external libraries directly without encapsulation, having similar routines across code which isn't encapsulated, the number of entry points to a piece of code (the number of different ways of achieving the same thing) all of these can make refactoring difficult and risk the introduction of bugs. Another area, much neglected when considering ease of refactoring, is test code which very often has a high consumption of production code (sometimes higher than the production code itself) and can make the simplest refactorings painful because large chunks of test code are dependent on your class.

TDD states that code that is easy to test is good code and the same is true for refactoring: writing code which is easier to refactor leads to cleaner code. Refactoring footprints have, in very real terms, an impact on productivity and therefore managing them is quite important. There are various techniques - most of them basic OOP principles - that can help keep the refactoring footprint down. However, due to the natural evolution of code, this can be quite hard to keep track of and unless you have an in-depth knowledge of the codebase - which is increasingly unlikely the bigger and older the codebase is - it can be difficult to 'see' a footprint.

I think it would be interesting to explore the use of metrics to measure refactoring footprints. These metrics would give a very interesting and valuable insight into the codebase and its quality. The interesting thing is that unlike many other code metrics refactoring footprints need to take into account test code as well. From a very basic level counting the number of usages across both production and test code can give a rudimentary indication to a footprint. More sophisticated metrics would measure the impact of a refactoring (are there knock-on refactorings). Different types of refactoring would also require different metrics and some refactorings may be uncommon in certain codebases or may have a degree of acceptable pain (package refactorings for example). As with most metrics they will be open to a degree of interpretation based on codebase and class but none the less the metrics should be useful - especially in recommending refactorings!

I haven't done much research on this so I do not know if any tools already exist to calculate such metrics or even what metrics they provide, though I am sure there will be. I shall try and track some down and have a play to see what sort of data they provide and establish what sort of insight they can give into a codebase.

No comments:

About Me

My photo
West Malling, Kent, United Kingdom
I am a ThoughtWorker and general Memeologist living in the UK. I have worked in IT since 2000 on many projects from public facing websites in media and e-commerce to rich-client banking applications and corporate intranets. I am passionate and committed to making IT a better world.