Monday, May 28, 2007
Comments--How many should you have?
While there is considerable conversation about how many unit tests to write (I have two recent posts on the topic--here and here--based on conversations with vendors), few people have much to say about how many comments there should be. Once the usual themes (more comments, keep comments up-to-date, avoid pointless comments) have been stated, the conversation ends. Everyone understands what relevant and up-to-date comments mean, but few will hazard a guess as to how many of them are necessary.
Interestingly, the comment ratio is a key factor in one of the most interesting metrics, the maintainability index (MI). It is also a ratio rewarded by Ohloh.net, the emerging site for tracking details of open-source projects. Ohloh gives projects a kudo for a high percentage of comments. The question is how high is high enough to earn the kudo. According to Ohloh, the average open-source project runs around 35% comments. Projects in the top third overall get the kudo. I don't know the cut-off for this top third, but I do know that Apache's FOP, with a 46% comment ratio, definitely qualifies.
Comment count is a metric that is particularly easy to spoof. You could do what some OSS projects do and list all the license terms at the top of each file. I've always disliked scrolling through this pointless legalese. In my file headers, I simply point to the URL of the license, which is sufficient. But all that license boilerplate inflates comment counts amazingly. So does commenting out code and leaving it in the codebase (gag!) or writing Javadoc for simple getters and setters (also gag).
But where legitimate comments are concerned, 35% is probably a good working number to shoot for. I find that many OSS projects at this ratio have quite readable code. I'll be working to bring my own codebase up to this level.
Labels: metrics, OSS, programming
Friday, May 18, 2007
Groovy Gaining Traction
Java developers suddenly have a wealth of choices when it comes to dynamic languages that run on the JVM. There's JavaFX, which Sun announced at JavaOne this year, and JRuby, which Sun expects to complete sometime this year, and then, of course, there's my favorite: Groovy. Groovy makes writing Java programs far easier. It essentially takes Java and removes the syntactical cruft, leaving a neat language that makes you terrifically productive.
Because Groovy took a long time getting out of the gate, it's taken some licks in the press. However, it's clear that Java developers are catching on to its benefits. The JavaOne bookstore published its daily top-10 sales during the show. The picture on this post shows the Day 2 list with two Groovy titles in the top 10 (at places 5 and 8). Overall, the Groovy bible, Groovy in Action, came in at number 5 for the show. Interest is definitely growing.
If you haven't tried Groovy yourself, it's definitely worth a look. Here are a couple of good overviews:
- The Groovy Getting Started Guide
- Andrew Glover's Overview and Tutorial on developerWorks
Labels: Groovy, Java, programming
Wednesday, May 16, 2007
Unit Testing Private Variables and Functions
How do you write unit tests to exercise private functions and check on private variables? For my projects, I have relied on a technique of adding special testing-only methods to my classes. These methods all have names that begin with FTO_ (for testing only). My regular code may not call these methods. Eventually, I'll write a rule that code checkers can enforce to make sure these violations of data hiding don't accidentally appear in non-test code.
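A minimal sketch of this convention, using a hypothetical class (only the FTO_ prefix itself comes from my actual practice):

```java
// Sketch of the FTO_ (For Testing Only) naming convention.
// The Account class and its field are hypothetical examples.
class Account {
    private long balanceInCents;

    public void deposit(long cents) {
        balanceInCents += cents;
    }

    // FTO_ method: exposes private state for unit tests only.
    // Non-test code must never call methods prefixed with FTO_ --
    // a rule a code checker could enforce mechanically.
    public long FTO_getBalanceInCents() {
        return balanceInCents;
    }
}
```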
However, for a long time I've wanted to know if there is a better way to do this. So, I did what most good programmers do--I asked someone who knows testing better than I do. Which meant talking to the ever-kind Jeff Frederick, who is the main committer of the popular CI server Cruise Control (and the head of product development at Agitar).
Jeff contended that the problem is really one of code design. If all methods are short and specific, then it should be possible to test a private variable by probing the method that uses it. Or said another way: if you can't get at the variable to test it, chances are it's buried in too much code. (Extract Method, I have long believed, is the most important refactoring.)
Likewise private methods. Make 'em small, have them do only one thing, and call them from accessible methods.
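A rough sketch of what Jeff's advice looks like in practice (the class and method names are mine, not his):

```java
// Sketch of the advice above: keep private methods small and
// single-purpose, then test them through the accessible methods
// that call them. Names are hypothetical.
class PriceCalculator {
    // The private helper does exactly one thing...
    private double discountRate(int quantity) {
        return quantity >= 100 ? 0.10 : 0.0;
    }

    // ...and is fully exercised through this public method, so a
    // test never needs to reach the private member directly.
    public double totalPrice(int quantity, double unitPrice) {
        return quantity * unitPrice * (1.0 - discountRate(quantity));
    }
}
```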
I've spent a week noodling around with this sound advice. It appeals to me because almost invariably when I refactor code to make it more testable, I find that I've improved it. So far, Jeff is mostly right. I can eliminate most situations by cleaning up the code. However, there are a few routines that look intractable. While I work at finding a better way to refactor them (a constant quest of mine, actually), I am curious to know how you solve this problem.
Labels: Java, programming, testing, unit testing
Thursday, May 10, 2007
Reusing IDE and SATA Drives: A Solution
Because I review lots and lots of tools, I find myself going through PCs pretty quickly. It's not fair to gauge the performance of a product on old hardware, so each year I buy new PCs. Over the years, I've accumulated lots of IDE drives from the PCs I've discarded. I rarely ever use them, but every once in a while I would like to know what's on them and whether I can reuse one. Unfortunately, this is a time-consuming task, especially hooking up the drive to a PC that will access it.
I recently came across an elegant solution to this problem: the USB 2.0 Universal Drive Adapter from newertech. This device comes with a power supply for the HDD and a separate cable that plugs into an ATA-IDE drive, a notebook IDE drive, or a SATA drive. The other end of the cable is a USB plug. So, you attach the cord, power up the drive, plug the USB end of the cable into your PC--and the IDE drive magically pops up as a USB drive on your system, with full read and write capabilities.
I have cleaned up a bunch of IDE drives during the last week using this adapter. In the process, I've discovered it has some limitations. It did not work well on older drives. Some would not power up (but they did start up when I swapped them into a PC), and others did not handle read/writes well (Windows generated write errors), although it's hard to know if the errors come from the drive or the adapter. But for most drives from the last few years, the product worked without a hitch. Neat solution and, at $24.95 retail, a no-brainer purchase.
Further note: I increasingly use virtualization for my testbed infrastructure. When I'm done with a review, I archive the VMs. This keeps my disks mostly pristine and reusable, so I am not discarding disks with old PCs nearly as much as before. The fact that drives today are > 500GB also helps. ;-)
Labels: hardware
Monday, April 30, 2007
How many unit tests per method? A rule of thumb
The other day, I met with Burke Cox who heads up Stelligent, a company that specializes in helping sites set up their code-quality infrastructure (build systems, test frameworks, code coverage analysis, continuous integration--the whole works, all on a fixed-price basis). One thing Stelligent does before leaving is to impart some of the best practices they've developed over the years.
A best practice Stelligent uses for determining the number of unit tests to write for a given method struck me as completely original. The number of tests is based on the method's cyclomatic complexity (aka McCabe complexity). This complexity metric starts at 1 and adds 1 for every path the code can take. Many tools today generate cyclomatic complexity counts for methods. Stelligent's rule of thumb is:
- complexity 1-3: no test (likely the method is a getter/setter)
- complexity 3-10: 1 test
- complexity 11+: number of tests = complexity / 2
Note: you have to be careful with cyclomatic complexity. I recently wrote a switch statement that had 40 cases. Technically, that's a complexity measure of 40. Obviously, it's pointless to write lots of unit tests for 40 case statements that differ trivially. But, when the complexity number derives directly from the logic of the method, I think Stelligent's rule of thumb (with my modification) is an excellent place to start.
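Expressed as code, the rule of thumb might look like the sketch below. Note that the published bands overlap at complexity 3, so this sketch assumes the lower band ends there:

```java
// Stelligent's rule of thumb as a function. The band boundaries
// are an assumption where the published ranges overlap at 3.
class TestCountRule {
    static int suggestedTests(int cyclomaticComplexity) {
        if (cyclomaticComplexity <= 3)  return 0;  // likely a getter/setter
        if (cyclomaticComplexity <= 10) return 1;
        return cyclomaticComplexity / 2;           // complexity 11+
    }
}
```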
Labels: testing, unit testing
Monday, April 23, 2007
Effectiveness of Pair-wise Tests
The more I use unit testing, the more I wander into areas that are more typically the province of true testers (rather than of developers). One area I frequently visit is the problem of combinatorial testing, which is how to test code that must handle a large number of possible input values. Let's say I have a function with four parameters that are all boolean. There are, therefore, 16 possible combinations. My temptation is to write 16 unit tests. But the concept of pairwise testing argues against this test-every-permutation approach. It is based on the observation that most bugs occur in the interaction between pairs of values (rather than in a specific configuration of three or four of them). So, pairwise experts look at the 16 possible combinations my switches can have and choose the minimum number of tests in which every pair of values has been exercised. It turns out that 5 tests will exercise every pair of switch combinations.
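To make the claim concrete, here is one known pairwise covering set for four boolean parameters, along with a check that every pair of parameters sees all four value combinations. The specific rows are just one valid choice; a tool may emit a different set:

```java
// Demonstration of the pairwise claim: five test rows suffice to
// cover every value pair for every pair of four boolean parameters.
class PairwiseDemo {
    static final boolean[][] TESTS = {
        {false, false, false, false},
        {false, true,  true,  true },
        {true,  false, true,  true },
        {true,  true,  false, true },
        {true,  true,  true,  false},
    };

    // Verify that every (parameter i, parameter j) pair sees all four
    // value combinations (FF, FT, TF, TT) somewhere in the rows.
    static boolean coversAllPairs(boolean[][] rows) {
        int n = rows[0].length;
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                boolean[] seen = new boolean[4];  // FF, FT, TF, TT
                for (boolean[] row : rows)
                    seen[(row[i] ? 2 : 0) + (row[j] ? 1 : 0)] = true;
                for (boolean s : seen)
                    if (!s) return false;
            }
        }
        return true;
    }
}
```

Running `coversAllPairs` against the five rows confirms that all 24 parameter/value pairs are exercised, even though only 5 of the 16 possible combinations are used.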
The question I've wondered about is if I write those five unit tests, rather than the more ambitious 16 tests, what have I given up? The answer is: not much. At the recent Software Test & Performance Conference, I attended a session by BJ Rollison who heads up TestingMentor.com when he's not drilling Microsoft's testers in the latest techniques. He provided some interesting figures from Microsoft's analysis of pair-wise testing.
For attrib.exe (which takes a path + 6 optional args), he did minimal testing, pairwise, and comprehensive testing with the following results:
- Minimal: 9 tests, 74% code coverage, 358 code blocks covered
- Pairwise: 13 tests, 77% code coverage, 370 code blocks covered
- Maximal: 972 tests, 77% code coverage, 370 code blocks covered
A similar test exploration with findstr.exe (which takes a string + 19 optional args) found that pairwise testing via 136 tests covered 74% of the app, while maximal coverage consisting of 3,533 tests covered 76% of the app.
These numbers make sense. If you test a key subset of pair possibilities, testing additional combinations is not likely to exercise completely different routines, so code coverage should not increase much for tests beyond the pairwise recommendations. What surprised me was that pairwise got such high numbers to begin with: 70+% is pretty decent coverage.
From now on, pair-wise testing will be part of my unit-testing design toolbox. For a list of tools that can find the pairs to test, see here. Rollison highly recommended Microsoft's free PICT tool (see previous link), which also provides a means to specify special relationships between the various factors in the combinations.
Labels: testing, unit testing
Monday, April 16, 2007
Update to my review of Java IDEs
My review of three leading enterprise Java IDEs appeared in late March in InfoWorld. I've received some nice comments on the piece and a few corrections. Most of the corrections come from Sun and concern my coverage of NetBeans 5.5. Here are the principal items:
- I complained that NetBeans does not use anti-aliased fonts. I overlooked a switch that can turn on these fonts in Java 5. On Java 6, they're on by default, if your system is set with font smoothing on. (It's hard to figure why NetBeans does not default to these fonts on Java 5, as do Eclipse, IntelliJ, and most of the other IDEs.)
- I recommended that potential users look at NetBeans 6.0 beta, because it adds many of the features I complained were missing. Sun gently points out that the current version of 6.0 is not quite in beta yet, but should be within the next few months. For the latest version and roadmap, go to netbeans.org.
- After extensive discussions, Sun convinced me that they have greater support for Java 6 than I gave them credit for. I originally wrote that they provided 'minimal' coverage. In retrospect, I would call their Java 6 support 'good.'
- Finally, Sun was kind enough to point out that I give them too much credit in saying that they have deployment support for JavaDB. In fact, they provide only start/stop control from within NetBeans.
Labels: Java
Tuesday, March 27, 2007
InfoWorld Moves to Online-Only
InfoWorld magazine announced today that it would be abandoning the print publication and going to an entirely online format. This move continues a trend that has been in place for many years in technical publications. I understand the economics of the move and think that, on that basis, it's the right one. However, I confess some sadness at this transition. I like print media. I am currently overseas, and it has been a great pleasure to pack magazines in my carry-on luggage and read through pages of material on the plane. SD Times and Dr. Dobb's are the only developer mags I read regularly that still come in printed format. And I generally read their printed version before I read the online material.
For the same reasons, I bemoan the lack of printed documentation today. Reading printed docs for tools I use regularly is a rewarding activity. Inevitably, I find features I didn't know about. This is more difficult to do in the search-and-locate model that digital documents deliver. Seapine and Perforce are among the last of the development-tool vendors to provide printed docs. And I do love them for it.
Anyway, starting on April 3, you'll have to read my InfoWorld reviews strictly on line (at www.infoworld.com). See you there!
Thursday, March 22, 2007
Characterization Tests
In my current column in SD Times, I discuss characterization tests, an as-yet little-discussed form of testing. They are unit tests whose purpose is not to validate the correctness of code but to capture in tests the behavior of existing code. The idea, first popularized in Michael Feathers' excellent volume Working Effectively with Legacy Code, is that the tests can reveal the scope of any changes you make to a codebase.
You write comprehensive tests of the codebase before you touch a line. Then, you make your changes to the code, and rerun the tests. The failing tests reveal the dependencies on the code you modified. You clean up the failing tests and now they reflect the modified code base. Then, rinse, lather, repeat.
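A tiny illustration of the idea (the legacy method and its quirks are invented for the example): a characterization test asserts what the code currently does, warts and all, not what it should do:

```java
// Sketch of a characterization test. It pins down current behavior
// of legacy code; the method and its quirks are hypothetical.
class LegacyFormatter {
    // Imagine this is untouched legacy code with surprising behavior.
    static String format(String name) {
        if (name == null) return "NULL";  // quirk we must preserve
        return name.trim().toUpperCase();
    }
}

class LegacyFormatterCharacterizationTest {
    static void run() {
        // Assert current behavior, quirks included. If a later change
        // breaks these checks, we have found code that depended on
        // the old behavior.
        if (!LegacyFormatter.format(null).equals("NULL"))
            throw new AssertionError("null-handling quirk changed");
        if (!LegacyFormatter.format(" ada ").equals("ADA"))
            throw new AssertionError("trim/uppercase behavior changed");
    }
}
```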
The problem historically with this approach is that writing the tests is a colossal pain, especially on large codebases. Moreover, large changes in the code break so many tests that no one wants to go back and redo 150 tests, say, just to capture the scope of changes. And so, frequently, the good idea falls by the wayside--with unfortunate effects when it comes time to functionally test the revised code. To the rescue comes Agitar with a free service called JUnitFactory. Get a free license, send them your Java code, and they will send you back characterization tests for it. Cool idea. Change your code, run the tests, verify that nothing unexpected broke. And then have JUnitFactory re-create a complete set of characterization tests. How cool is that?
Actually, I use Agitar's service not only for characterization but also for my own programs as I develop them. The tests always show me unit tests I never thought of writing. Try it out. You'll like it.
Labels: Java, testing, unit testing
Sunday, March 18, 2007
MIPS per Watt: The Progression...
Last week I purchased a Kill-a-Watt Electricity Usage Monitor, which measures the wattage used by any plugged-in device. It's already proving its value.
I began measuring the electrical consumption of the three workstations I use the most. The numbers strongly suggest that the power savings from multicore are real. The question that remains is whether they're substantial enough to matter to many folks, especially small sites with only a few systems. Here we go. (Notes: All workstations are Dell systems with 2GB or 3GB RAM and 2 SATA HDDs. The performance measurements are the processor arithmetic tests from the highly regarded Sandra benchmark suite. Energy consumption measured when the systems were at rest. Systems are shown in chronological order of manufacture, oldest ones first.)
The upshot is that the dual-core systems are definitely the performance equivalents of earlier dual-processor Xeon beasts (look at the performance parity between the two Intel processors), but energy consumption of the multicore systems is almost 40% less.
However, the difference in energy consumption between the Pentium D and the AMD system is not that great. Moreover, the difference in CPU performance, while appearing to be a lot, feels the same when I'm at the controls.
So, I think multiprocessor boxes are easy candidates for replacement by multicore systems, but upgrading multicores does not look compelling currently. (The Pentium D system is about a year older than the AMD system.)
Wednesday, March 07, 2007
Mylar, Tasktop, and the Intriguing Logo
Yesterday, I was down at EclipseCon, a yearly gathering of the Eclipse faithful. I had lunch with Mik Kersten, the driving force behind the suddenly very popular Mylar plug-in to Eclipse that helps you organize tasks and the data that goes with them. He's working on a similar idea for desktops at his new company, Tasktop. From what I saw this will be very useful in managing the mass of data we all deal with daily. Betas are expected in late Q2.
Before you go over to the website, try to guess the meaning of Tasktop's logo: <=>
When Mik first asked me, I thought it could be a double-headed arrow, an emoticon for a very happy, surprised person, or a vague reference to XML. But those are all wrong. The correct answer: less is more. Cute!
Labels: tools
Thursday, March 01, 2007
How Many Unit Tests Are Enough?
Recently, I was down visiting the folks at Agitar, who make great tools for doing unit testing. Going there always results in interesting conversations because they really live and breathe unit testing and are always finding new wrinkles in how to apply the technique. During one conversation, they casually threw out a metric for unit testing that I'd never heard before. It answers the question: how many unit tests are enough? You'll note that pretty much all the books and essays on unit testing go to great pains to avoid answering this question--for fear, I presume, that presenting a low number would discourage developers from writing more, while a high number would strike developers as overwhelming and unreachable.
The engineers at Agitar, however, did offer a number (as a side comment to an unrelated point). They said (I'm paraphrasing) that if the amount of test code equals the amount of program code, you're generally in pretty good shape. Their experience shows that parity of the two codebases translates into code coverage of around 70%, which means a well-tested app. Needless to say, they'd probably want to qualify this statement, but as a rule of thumb, I think it's a very practical data point--even if a bit ambitious.
I will see how it works out on my open-source projects. Currently, my ratio of test-code to app-code is (cough, cough) just over 42%. I guess I know what I'll be doing this weekend.
Labels: Java, testing, unit testing
Friday, February 23, 2007
A Web Host Bill of Rights
Like many small businesses, my company (Pacific Data Works) contracts web-hosting to a third party. Actually third parties. We have two websites, one hosted at LunarPages the other at Web.com (formerly called Interland). Our main site and mail server is at Interland.
Two days ago, Web.com suffered a massive "facilities" problem that for 10 hours shut down not only hosted accounts like ours, but Web.com itself. The company's own website was off the air.
Because of this problem, all e-mails sent to us and to other companies hosted at Web.com were bounced back as undeliverable.
Everyone understands that grave things can happen that compromise web-hosting services, but I don't understand what happened next: nothing. Web.com sent out no notice to customers that they might have lost e-mails, nor any apology for the inconvenience. I don't care much about the apology, although it would be nice; I do care about not finding out about bounced e-mails until I started receiving word from correspondents who were surprised that their e-mails to us had been rejected.
LunarPages suffered a day-long black-out last year due to a power problem in their host building. They didn't notify anyone either, but they did place a long mea-culpa on their official blog, explaining the problem. However, their website still advertises that they are down less than 9 hours a year.
I think it is about time for a Bill of Rights for customers of Web hosts. At minimum:
- Web hosts should notify customers when the Web site has been unavailable for more than 4 hours.
- Web hosts should notify customers when e-mail service has been down for any period in which incoming emails were bounced back.
- Web hosts must post accurate information about outages on their website. An outage is defined as any period of time in which more than 20% of hosted accounts are not available.
- All outages should be fully explained as to the nature of the problem and what is being done to make sure it will not recur.
- Web hosts should refund the pro-rated share of hosting fees for outages automatically.
I think that's a good start.
Thursday, February 15, 2007
Languages and compilation speed
Yesterday, I went to visit Electric Cloud, a company that provides a suite of tools for building large applications, including a build manager and a distributed make system. While I was there I met the CEO, John Ousterhout, the designer of Tcl/Tk. He made an interesting observation about large compilations: in visits to customers with large codebases to compile, they had found that C++ code compiled the slowest, next was C, while Java was the fastest mainstream language to compile. The position of C++ does not surprise me, but Java being faster than C did--although, to be honest, I'd never given the matter much thought.
Because this discussion was a branch off the main topic, I didn't get to pursue it further, but Ousterhout did attribute it in passing to the efficiency of Java compilers. Without more info, though, I'm not sure I'm convinced that it's a question of compiler efficiency. I suspect that not needing to generate native code is a big factor, as is the simplified linking step.
Monday, February 12, 2007
Virtualization Slides
I gave a pair of talks today at InfoWorld's Virtualization Executive Forum. One was pure lecture, the other a panel. The lecture (on the variety of uses of virtualization in IT) in particular was well attended. It seems managers correctly sense that there's a lot more you can do with virtualization than desktop code verification and server consolidation. Here are the slides from the lecture.
Tuesday, February 06, 2007
What Today's DNS Attack Looked Like
As you might have read in the press today, a group of hackers tried to take down the domain name system (DNS), which is a key component of the Internet. They did this by flooding two of the top level DNS servers with requests. This picture, provided by Ripe.Net, shows the attack in full swing.
On the left are numbers 1-13, which refer to the 13 top-level DNS servers. They handle DNS requests from lower-level servers that cannot resolve a particular DNS address. As can be seen, two of the servers were targeted simultaneously in an attack that lasted several hours.
For readers who aren't familiar with DNS, it's the service that translates domain names into the actual numerical IP addresses on which the Internet runs. So cnn.com, for example, is translated by a DNS server into 64.236.29.120. Knock out the servers that do this translation and you can only get to sites via their numerical IP addresses.
Fortunately, these 13 top-level DNS servers are redundant, so all this attack did was to slow some Internet/Web queries. However, if all 13 had been attacked, things would have become quite serious.
Monday, January 29, 2007
Learning Java via TDD: An impressive approach
I have never been a fan of test-driven development. I think the concept is a curiosity that makes you write lots of throw-away code, while keeping lots of code that really doesn't test anything useful. But my most fundamental disagreement with TDD is that it is counter to the way people think. And part of the great joy of programming is building solutions, rather than building problems for the solutions to solve. For this reason, I think that many people who like TDD in the abstract eventually move back to a model of writing the code and immediately writing unit tests to exercise it. (This happens to be my model, but I didn't come to it via TDD. Rather, I adopted it because I became profoundly and deeply convinced of the value of lots of unit tests written immediately after coding.)
I've expressed this view before and would have held on to it unmodified had I not come across a terrific and unique Java tutorial: Agile Java by Jeff Langr. Forget the "agile" attribute. This is a book that teaches Java via TDD. And, it turns out, this is a very interesting way of doing things. How many times have we seen Java tomes start with "Hello World"? This one starts by teaching JUnit. And through JUnit, it teaches TDD, and then Java.
This approach, which is brilliantly original, has very distinct benefits: Firstly, readers are much more in tune with Java's verbose but unrevealing error messages. Rather than looking at a stack dump in disbelief, they're used to debugging and jumping in. Secondly, of course, they're used to testing and to writing code that is testable. The third and for me most important benefit is that this book cannot present Java in the usual way. (Such as: here's an inner class. Here's why you need it. Here's a code snippet. OK, now off to statics....) Rather, each chapter requires writing a mini-app that exercises the topics the author wants to present. So, you get to think about OO and Java, rather than merely learning language syntax.
The book also forces the reader to think about challenging testing issues. And I do mean challenging. It is the only text I've ever read anywhere that shows how to run a unit test on a log record written to the console. How would you solve that problem?
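Here is one way to attack that problem, though I make no claim it is the book's solution: temporarily swap System.out for a capture buffer, run the code under test, then assert on what was written.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

// A sketch of testing console output: redirect System.out into a
// buffer for the duration of the code under test, then restore it.
class ConsoleCaptureDemo {
    static String captureOutput(Runnable task) {
        PrintStream original = System.out;
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        System.setOut(new PrintStream(buffer));
        try {
            task.run();                // code that logs to the console
        } finally {
            System.setOut(original);   // always restore stdout
        }
        return buffer.toString();
    }
}
```

A test can then assert, for example, that `captureOutput(() -> System.out.println("WARN: low disk"))` contains the expected log text.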
The only drawback is that the same topics are discussed in many different places throughout the book (each discussion amplifying earlier points), so you really can't use it as a reference after the fact. This is a small gripe, as there are plenty of Java references on line and in print.
I recently wrote a review of Java tutorials (before I knew about this book). Despite the many great titles I discuss in that article, if I were teaching a Java class today, this is definitely the book I would use. Bar none. And I don't even like TDD.
Saturday, January 27, 2007
One activity that is inherently productive: unit testing
In a post today in his blog, Larry O'Brien states: "...let's be clear that of all the things we know about software development, there are only two things that we know to be inherently highly productive: Well-treated talented programmers and iterative development incorporating client feedback"
I find it hard not to add unit testing to this list. Of all the things that have changed in how I program during the last six or seven years, nothing comes close to unit testing in terms of making me more productive. In addition, it has made me a better programmer, because as I write code I am thinking about how to test it. And the changes that result are almost always improvements.
Labels: programming, testing, unit testing
Wednesday, January 24, 2007
Multicores not as productive as you expected?
For a while, I have been intrigued by how small the performance pop is from multicore processor designs. I've written about this and, finally, I think I can begin to quantify it. I'll use AMD's processors for the sole reason that the company has for years posted a performance measure for each of its processors. (Originally, this data was a move to counter Intel's now-abandoned fascination with high clock speeds.)
This chart shows the performance as published by AMD and the corresponding clock speeds for most of its recent processors. I have broken the figures out for the three branches in the Athlon processor family (which is AMD's desktop chip).
There are several interesting aspects to this chart, but the one I want to focus on is the performance of the rightmost entry. The dual-core Athlon 64 X2 with a rating of 5200 has a clock speed of 2600MHz. Now, notice the Athlon XP with a rating of 2600 (10th entry from the left): it has a clock speed of 2133MHz.
In theory, since the AMD ratings are linear, a dual-core processor should give you nearly (but not quite) the performance of two single-core chips. So two 2600-rated chips should give you roughly the performance of a 5200-rated dual-core chip. Using the chart, we would expect two 2.133GHz cores to deliver the 5200 performance figure. In reality, though, it takes two 2.6GHz cores to do this--far more than we would expect. The gap is actually even wider than that, because the dual-core chip has a faster bus and larger caches than the Athlon XP we're comparing it to, so it can make far better use of each clock cycle.
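The arithmetic behind that claim, spelled out:

```java
// The math behind the paragraph above: AMD's ratings are roughly
// linear, so two 2600-rated cores "should" match one 5200-rated part
// at the same 2133MHz clock. The actual X2 needs 2600MHz per core.
class DualCoreMath {
    public static void main(String[] args) {
        double expectedClockMHz = 2133; // clock of the 2600-rated Athlon XP
        double actualClockMHz   = 2600; // clock of each core in the 5200-rated X2
        double penalty = actualClockMHz / expectedClockMHz - 1.0;
        // About 22% more clock per core than linear scaling predicts.
        System.out.printf("clock penalty per core: %.0f%%%n", penalty * 100);
    }
}
```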
So, why does it take so much more than twice the clock speed to deliver dual-core performance? The original 2600-rated Athlon XP had a memory manager built into the chip. On the X2 chip, however, the two cores do not have dedicated memory managers--instead, they share a single on-chip memory controller, which adds overhead. The cores also share interfaces to the rest of the system, and so again must contend for resources to get the attention they need.
Don't be fooled into thinking this is an AMD-specific issue. (As I said earlier, I used AMD only because they are kind enough to publish clock-speed and performance data for their chips.) Intel is in exactly the same boat: what is shared between cores is plenty expensive. Expect, as time passes, to see chip vendors trying to limit these shared resources.
Tuesday, January 23, 2007
Enter The Komodo Dragon
Komodo 4.0 from ActiveState came out of beta and was released today. Komodo has long been viewed as the premier IDE for scripting languages (Python, Perl, and Tcl especially). It continues this tradition with this release and adds Ruby and RoR support. It's the first high-end IDE with intelligent editing and debugging for Ruby and RoR that I'm aware of.
Komodo also provides numerous tools for Ajax development, including JavaScript debugging and dedicated editors for XML, HTML, and CSS, plus an HTTP viewer and a DOM editor.
If you work in any of the languages Komodo supports, you owe it to yourself to examine it (for free). If you work in any two of them, you probably should just buy it. At $245, it's a steal.