Wednesday 12 October 2011

Mistaken Myths: Number 1. Your code should be self documenting

This is a fairly classic misunderstanding, yet one that is spouted a lot in the industry.

It seems pervasive that developers shoud produce self describing/documenting code. To a degree, I agree with this, but my viewpoint is that comments in code should lead to WHY something is being done and not reference how. More a specification of the method than how it does what it does without understanding what the start and end goals are (success criteria if you want to put it in TDD terms). This is a view shared by a few member of the industry such as those following in the footsteps of "The Pragmatic Programmer" and not always those only blindly following what I keep getting told "Code Complete" recommends (Note, I have not read the book myself, so can't comment on if it is the book that actually recomemnds this, or the person reading the book that does :-)

Decision traceability is something that is almost impossible to get from code alone. So some adjunct mechanism is often used, like referencing JAD/JRP sessions, original work requests/bug reports/version control logs in larger organisations, or simply a log book/notepad in smaller ones. Some developers choose to comment their code, but the comments have to be good quality. A comment that is not updated has the same productivity loss as the loss of 'tribal memory' (as Grady Booch often puts it) regarding an application (Which I will get to in a later post), so the comments should be treated with the same respect as actual code, especially if you are generating documentation from it.

In other engineering disciplines, most decisions are given a reference code and slapped on design diagrams and documents. This allows the tracing of the decision all the way back to the original discussion that led to those decisions being made. In software development, this has started to finally get through to some groups.

Additionally, I prefer to place the functional specification (I write them in OCL, given I have a bit of VDM in my history :) in the form of pre and post conditions at the top of the method/function. It would look something like:


/*
* --- ref: RQ/SH01/01 ---
* context AccountingServices::TryConnect( Host : Uri,
* Port : Integer ): boolean
* ------
* pre: registeredAddresses->contains( Host )
* post: ( result and AccountingServices.Host.Connected ) or
* not( result or AccountingServices.Host.Connected )
*/
/// <summary>
/// This method attempts to contact the host server and establishes a

/// connection if an address is one of the registered addresses.
/// <example>
/// ...
/// if ( AccountingServices.TryConnect( hostAddress, portNumber ) )
/// ... Do Something ...
/// else
/// ... Do Something else ...
/// ...
/// </example>
/// <param name="HostAddress">The host location to attempt the
/// connection to</param>
/// <param name="port">The port number to connect to</param>
/// </summary>

public bool TryConnect( Uri HostAddress, int port ){...}



Or you could just apply the reference and hope that a developer will read the documentation...

...ooh look over there! A rainbow, I need to catch it!! :o)

But before I go colour hunting, good quality comments are a good thing. Donald Knuth's Literate programming, despite his best efforts between the 1970's and 1990's, in its entirity, has been consigned to faded memory in modern day imperative paradigms. Though the principles it pushed live on in the new guise of documentation comments (such as those for JavaDoc, DelphiDoc, DOxygen and SandCastle).

The problem is using these tools under time pressure. Documenting code is often relagated to third class status, way behind getting code out of the door and unit testing. It is performed with the mentality that 'I will do it tomorrow'.

Sometimes it is up the QA members of the team to demand that this be done, especially when no formal design documents have been created and the whiteboard has been wiped clean...

...Oooooh!! It's getting away...

..Otherwise it is lost when that 'tribal memory' fades or joins another 'tribe' :o) As the best efforts of the developers in writing unit tests, using good method and variable names will never explain what decision was taken and why.

Some developers and companies regard good documenting comments as 'Gold Plating' and in doing so, will end up paying for the time of a comparatively highly paid contractor/consultant to repeatedly chase up the source of the decisions when the decision makers may have done an Elvis and left the building. If in the end the 'tribal leader' who mde the decision isn't based in the organisation any more, you are pretty much stuffed. So this highly paid consultant will trawl through tens or hundreds of thousands of lines of tests and code, not knowing if either accurately reflect the businesss process and if they are new, not even knowing what the business process is in any meaningful detail, for days or weeks at a time making zero progress on development or bug fixes before (s)he finds the source of the problem and a ten minute job later, it's fixed!! Well done! Waste Maker Corp, with your misinterpretation of lean principles, you have just wasted thousands (or tens thereof) of your own company's money because you didn't let a cheaper developer spend a couple of hours putting documenting comments in.

...crap, the rainbow's gone!! :-(

Not to worry, there will be other rainy days.

3 comments:

  1. Great article, but it looks like your comments are gonna be longer than your methods at this rate!!

    What about using those auto-documentational tools such as vsdocman, sandcastle etc - wouldn't that make life simpler?

    ReplyDelete
  2. @mollypepperpot
    Well, yes. That is exactly what they are for, hence my inclusion of them above. As far as generating documentation is concerned, that will help. However, the key is the inclusion of the OCL/semi-formal specification, as it gives you the trace back to the original decision as far as the business analyst.
    Imagine that a software developer and business analyst (forevermore referred to as a dev and BA respectively) are sat at a stakeholder meeting talking about the requirements the client has for the calculation of tax on an item that their customer wishes to purchase. The client says:
    “An item’s price, if it is a taxable, is the original item cost at 20%”
    The BA will see this as meaning the following:
    Pre: The item is a taxable item
    Post: The total cost is the item cost plus 20% of its value

    Or, if they are good, in OCL:

    context Item::CalculateValueAfterTax( item : Item ): Real
    pre: TaxableItems->includes( item )
    post: result = 1.2 * item.Cost

    The C# developer codes:

    ///
    /// … Usual comments ….
    ///
    private decimal CalculateValueAfterTax( Item item )
    {
    return( item.Cost * 1.2 );
    }

    From these two separate snippets in two different places, it is not immediately obvious what the issues is. We can run unit test on it, but if they were coded along the lines the developer was thinking (i.e. by the developer) then they are very likely to have introduced the same error and we then have to trawl through documents to see if it was originally correctly specified.

    By copying and pasting the comments from the BA’s document into the code, so you have:

    /* context Item::CalculateValueAfterTax( item : Item ): Real
    *pre: TaxableItems->includes( item )
    *post: result = 1.2 * item.Cost
    */
    ///
    /// … Usual comments ….
    ///
    private decimal CalculateValueAfterTax( Item item )
    {
    return( item.Cost * 1.2 );
    }

    Now you can immediately see the reason that you should quickly introduce the developer’s face to a wet fish, since they have missed the precondition check.

    Also, there is no reason why the XML comments can’t include the OCL/Structured text specifications in them if the development team all understand it.

    OCL and such languages were initially created to aid the specification of meta-models, typically depicted in UML. Their use would have been much more prominent had MDA/MDD properly caught on. That way, you could verify and validate software by simply generating said software from the formal model and specifications (and I mean that in the truest sense, you don’t necessarily code much).

    It could also have been useful in some BDD scenarios, as BDD takes TDD one step further and gets those test written directly by the end user client who can write it in natural language. As a result, OCL’s use has been seen as ancillary, despite it being a brilliant way of helping find out if the code you have written has side effects.

    ReplyDelete
  3. Pretty good post. I just stumbled upon your blog and wanted to say that I have really enjoyed reading your blog posts. Any way I'll be subscribing to your feed and I hope you post again soon. Big thanks for the useful info. Buy Facebook Reviews

    ReplyDelete

Whadda ya say?