Sunday, September 28, 2008

Banishing Return Status Codes

The most enduringly popular post on this blog is Perfecting OO's Small Classes and Short Methods, which presents a short series of stringent guidelines to help an imperative-trained developer master OO.

If I were to add one item to the list, it would be: Don't use return codes to indicate the status of an action. Developers trained in languages such as C have the habit of using return codes to indicate the success or the nature of failure of the work done by a function. This approach is used because of the lack of a structured exception mechanism. But when exceptions are part of the language, the use of status codes isa poor choice. Among the key reasons are: many status codes are easily ignored; developers will expect problems to be reported via the exception mechanism; exceptions are much more descriptive. And finally, exceptions enable return codes to be used for something useful--namely returning a data item.

Astute readers will note that in Java, null is frequently used as a return value to indicate a problem (as in Collections). This practice subverts the previous points, and it too should be avoided. Returning a null presents code with many problems it should not have to face. The first is the risk of a null-pointer blow-up because the return value was accessed without being checked. This leads to the code bloat of endless null value checks. A much better solution, which avoids this problem, is to return an empty item (empty string, empty collection, etc.). This too communicates that no data item fulfilled the function's mandate, but it does not risk the null-pointer problem, and it frequently requires no special code to handle the error condition.

Hence, if your OO code is characterized by heavy reliance on return codes (many of which I am certain are not checked), consider rewriting it in favor of exceptions and use return statements solely for returning non-null data items.

Monday, September 01, 2008

A Parameter-Validation Smell and a Solution

Last week, Jeff Fredrick and I did a day-long code review of Platypus. We used a pair-programming approach, with Jeff driving and I helping with the navigation. Eventually, we got into the input parser, which parses input lines into a series of tokens: text, commmands, macros, and comments. Macros can require a second parsing pass, and commands often require additional parsing of parameters.

Once you get a parser working well (that is, it passes unit and functional tests, and it handles errors robustly), you generally don't want to mess with refactoring it. Experience tells you that parsers have hideous code in them and wisdom tells you to leave it alone. However, we launched in.

A frequent cause of otiose code was my extensive parameter checking. Parameters were validated at every step as tokens passed through multiple levels of parsing logic. Likewise, the movement of the parse point was updated multiple tiems as the logic resolved itself back up the processing stack. This too had to be validated repeatedly.

Jeff came up with an elegant refactoring that I could not find in the usual sources. He created an inner class consisting of the passed variables, a few methods for validating them, and a few more methods for manipulating them.

This class was then passed to the methods in lieu of the individual parameters--thereby reducing the number of parameters to one or two. And because the class constructor verified the initialization of the fields, I need only to check whether the passed class was null, rather than validate each of the internal fields.

The effect was to reduce complexity of already complex code, enforce DRY, and place the validation of the variables inside a class that contained them--a set of small, but important improvements. And like many of the best refactorings, it seems obvious in retrospect.

So, if you find your class's methods are repeatedly validating the same parameters, try bundling them in an inner class along with their validation logic. You'll like the results.