Tuesday, March 27, 2007

InfoWorld Moves to Online-Only

InfoWorld magazine announced today that it would be abandoning its print publication and going to an entirely online format. This move continues a trend that has been under way for many years in technical publications. I understand the economics of the move and think that, on that basis, it's the right one. However, I confess some sadness at this transition. I like print media. I am currently overseas, and it was a great pleasure to pack magazines in my carry-on luggage and read through pages of material on the plane. SD Times and Doctor Dobb's are the only developer mags I read regularly that still come in printed form. And I generally read their printed versions before I read the online material.

For the same reasons, I bemoan the lack of printed documentation today. Reading printed docs for tools I use regularly is a rewarding activity. Inevitably, I find features I didn't know about. This is more difficult to do in the search-and-locate model that digital documents deliver. Seapine and Perforce are among the last of the development-tool vendors to provide printed docs. And I do love them for it.

Anyway, starting on April 3, you'll have to read my InfoWorld reviews strictly online (at www.infoworld.com). See you there!

Thursday, March 22, 2007

Characterization Tests

In my current column in SD Times, I discuss characterization tests, a still little-discussed form of testing. They are unit tests whose purpose is not to validate the correctness of code but to capture in tests the behavior of existing code. The idea, first popularized in Michael Feathers' excellent volume Working Effectively with Legacy Code, is that the tests can reveal the scope of any changes you make to a codebase.

You write comprehensive tests of the codebase before you touch a line. Then you make your changes to the code and rerun the tests. The failing tests reveal the dependencies on the code you modified. You clean up the failing tests so that they reflect the modified codebase. Then lather, rinse, repeat.
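If you haven't seen one before, here is a minimal sketch of what such a test might look like in JUnit. The InvoiceFormatter class and its expected output are hypothetical stand-ins for real legacy code; the essential point is that the assertion records whatever the code currently does, not what a spec says it should do.

    import java.util.Locale;
    import junit.framework.TestCase;

    // Hypothetical legacy code whose current behavior we want to pin down before changing it.
    class InvoiceFormatter {
        String formatTotal(double amount, double taxRate) {
            double total = amount * (1 + taxRate);
            return String.format(Locale.US, "Total: $%,.2f", total);
        }
    }

    // A characterization test asserts whatever the code does today,
    // not what a spec says it should do.
    public class InvoiceFormatterCharacterizationTest extends TestCase {
        public void testFormatTotalCapturesCurrentBehavior() {
            // The expected string was captured by running the existing code once
            // and pasting in its output--even if that output looks questionable.
            InvoiceFormatter formatter = new InvoiceFormatter();
            assertEquals("Total: $1,080.00", formatter.formatTotal(1000.00, 0.08));
        }
    }

When the formatting logic later changes on purpose, this test fails--which is exactly the signal you want. You then update the expected string so the test once again describes the code as it now stands.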

The problem historically with this approach is that writing the tests is a colossal pain, especially on large codebases. Moreover, large changes to the code break so many tests that no one wants to go back and redo 150 tests, say, just to capture the scope of the changes. And so, frequently, the good idea falls by the wayside--with unfortunate effects when it comes time to functionally test the revised code.

To the rescue comes Agitar with a free service called JUnit Factory. Get a free license, send them your Java code, and they'll send you back characterization tests for it. Cool idea. Change your code, run the tests, and verify that nothing unexpected broke. Then have JUnit Factory re-create a complete set of characterization tests. How cool is that?

Actually, I use Agitar's service not only for characterization but also for my own programs as I'm developing them. The generated tests always include unit tests I never thought of writing. Try it out. You'll like it.

Sunday, March 18, 2007

MIPS per Watt: The Progression...

Last week I purchased a Kill-a-Watt Electricity Usage Monitor, which measures the wattage used by any plugged-in device. It's already proving its value.

I began measuring the electrical consumption of the three workstations I use the most. The numbers strongly suggest that the power savings from multicore are real. The question that remains is whether they're substantial enough to matter to many folks, especially small sites with only a few systems. Here we go. (Notes: All workstations are Dell systems with 2GB or 3GB of RAM and two SATA HDDs. The performance measurements are the processor arithmetic tests from the highly regarded Sandra benchmark suite. Energy consumption was measured while the systems were at rest. Systems are listed in chronological order of manufacture, oldest first.)


The upshot is that the dual-core systems are definitely the performance equivalents of the earlier dual-processor Xeon beasts (look at the performance parity between the two Intel processors), but energy consumption of the multicore systems is almost 40% less.

However, the difference in energy consumption between the Pentium D and the AMD system is not that great. Moreover, the difference in CPU performance, while it looks substantial on paper, isn't noticeable when I'm at the controls.

So, I think multiprocessor boxes are easy candidates for replacement by multicore systems, but upgrading from one multicore system to another does not look compelling at the moment. (The Pentium D system is about a year older than the AMD system.)

Wednesday, March 07, 2007

Mylar, Tasktop, and the Intriguing Logo

Yesterday, I was down at EclipseCon, a yearly gathering of the Eclipse faithful. I had lunch with Mik Kersten, the driving force behind the suddenly very popular Mylar plug-in for Eclipse, which helps you organize tasks and the data that goes with them. He's working on a similar idea for desktops at his new company, Tasktop. From what I saw, this will be very useful in managing the mass of data we all deal with daily. Betas are expected in late Q2.

Before you go over to the website, try to guess the meaning of Tasktop's logo: <=>

When Mik first asked me, I thought it could be a double-headed arrow, an emoticon for a very happy, surprised person, or a vague reference to XML. But those are all wrong. The correct answer: less is more. Cute!

Thursday, March 01, 2007

How Many Unit Tests Are Enough?

Recently, I was down visiting the folks at Agitar, who make great tools for unit testing. Going there always results in interesting conversations, because they really live and breathe unit testing and are always finding new wrinkles in how to apply the technique. During one conversation, they casually threw out a metric for unit testing that I'd never heard before. It answers the question: how many unit tests are enough? You'll note that pretty much all the books and essays on unit testing go to great pains to avoid answering this question--for fear, I presume, that by presenting any number they will discourage developers from writing more. Likewise, if the suggested number is high, they risk discouraging developers who will see the goal as overwhelming and unreachable.

The engineers at Agitar, however, did offer a number (as a side comment to an unrelated point). They said (I'm paraphrasing) that if the amount of test code equals the amount of program code, you're generally in pretty good shape. Their experience shows that parity of the two codebases translates into code coverage of around 70%, which means a well-tested app. Needless to say, they'd probably want to qualify this statement, but as a rule of thumb, I think it's a very practical data point--even if a bit ambitious.
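If you're curious where your own codebase stands, here's a rough sketch of how you might check. It assumes a Maven-style layout (src/main/java and src/test/java--my assumption, not something Agitar prescribes) and simply counts non-blank lines in .java files, which is crude but good enough for a rule of thumb.

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.util.stream.Stream;

    // Rough test-code-to-app-code ratio: counts non-blank lines in .java files
    // under the assumed src/main/java and src/test/java directories.
    public class TestToCodeRatio {

        static long countLines(Path root) throws IOException {
            try (Stream<Path> files = Files.walk(root)) {
                return files.filter(p -> p.toString().endsWith(".java"))
                            .mapToLong(TestToCodeRatio::nonBlankLines)
                            .sum();
            }
        }

        static long nonBlankLines(Path file) {
            try (Stream<String> lines = Files.lines(file)) {
                return lines.filter(l -> !l.trim().isEmpty()).count();
            } catch (IOException e) {
                return 0;
            }
        }

        public static void main(String[] args) throws IOException {
            long app = countLines(Paths.get("src/main/java"));
            long test = countLines(Paths.get("src/test/java"));
            System.out.printf("app: %d lines, test: %d lines, ratio: %.0f%%%n",
                    app, test, 100.0 * test / app);
        }
    }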

I will see how it works out on my open-source projects. Currently, my ratio of test-code to app-code is (cough, cough) just over 42%. I guess I know what I'll be doing this weekend.