[[!tag rant]]

Each software tool exists to solve some problem. For each problem, there are many possible solutions. Even when different programs basically do the same thing, they can have quite different shapes.

As an example, this morning I was wondering if it would be possible for me to use notmuch to index my entire mail archive. For that, I needed to convert a number of mbox folders to Maildir format. That's a reasonably easy problem, given access to suitable programming libraries, but there's an existing tool for that, called mb2md. Unfortunately, it has the wrong shape for my needs.

mb2md doesn't just convert one mbox to one maildir. It's designed for a mail admin converting all server-side mbox folders for a user into a corresponding structure of Maildir folders. This seems to be necessary when switching IMAP servers. That's a fairly specialised problem, and the program has been written to make it easy for a mail admin to do that.

What I need is part of the problem solved by mb2md and indeed it can do just that part. However, the overall shape of mb2md is such that my part is hard to do. The incantation is quite unintuitive and requires careful reading of the documentation.

The shape of a solution matters. mb2md could easily have been written in a way that provides a simple tool for the single folder conversion, and then a more complex tool for the mail admin's more complicated problem. This would have resulted in a much more general tool, and that would make it easier for more people to use it without much effort.
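To give a feel for how small the single-folder tool could be, here is a minimal sketch using the `mailbox` module from Python's standard library. This is not how mb2md works; the function name `mbox_to_maildir` is made up for illustration, and the sketch assumes the mbox file is not being modified by another program while it is read (there is no locking).

```python
import mailbox


def mbox_to_maildir(mbox_path, maildir_path):
    """Copy every message from one mbox file into one Maildir.

    Returns the number of messages copied. The Maildir is created
    if it does not exist yet; the mbox must already exist.
    """
    source = mailbox.mbox(mbox_path, create=False)
    dest = mailbox.Maildir(maildir_path, create=True)
    count = 0
    try:
        for message in source:
            # Converting to MaildirMessage carries over mbox status
            # flags (read, replied, ...) where Maildir has an equivalent.
            dest.add(mailbox.MaildirMessage(message))
            count += 1
        dest.flush()
    finally:
        source.close()
        dest.close()
    return count
```

A call such as `mbox_to_maildir("archive.mbox", "archive-maildir")` would then convert exactly one folder, and the mail admin's whole-account conversion could be built on top by calling it once per folder.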

Mail folder format conversions are a fairly esoteric thing to do. However, the lack of generality is a frequent issue with how programs are designed. It is easy to fall into the trap of writing a highly specialised tool, instead of taking a step back and making a more general purpose tool. The specialised tool will help a small number of people. The general tool will help many people.

Examples of this are fairly common. Debian has a set of tools for making Debian live CDs; they are not quite able to make a bootable hard disk image as well (thus, vmdebootstrap). There are programs for computing cyclomatic complexity, which produce HTML reports, rather than something that can be processed by other programs without too much effort. There are tools for managing address books that are limited to specific cultures, e.g., by hardcoding assumptions of what a person's name looks like (thus, clab).

One of my favourite examples is xargs, which by default does the wrong thing by assuming its input is whitespace delimited. Any whitespace, not just newlines. Any sensible use requires adding the -0 option, which makes xargs that much more tedious to use.
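The difference is easy to see in a small Python sketch. It is a simplified model, not xargs itself: real xargs also interprets quotes and backslashes in its default mode, which this ignores, and both function names are invented for the example.

```python
def split_on_whitespace(data):
    """Simplified model of default xargs: any run of whitespace separates items."""
    return data.split()


def split_on_nul(data):
    """Model of xargs -0: only NUL bytes separate items."""
    return [item for item in data.split("\0") if item]


names = ["my notes.txt", "todo.txt"]

# Newline-separated input, split on any whitespace: the first name breaks in two.
print(split_on_whitespace("\n".join(names)))
# → ['my', 'notes.txt', 'todo.txt']

# NUL-terminated input, split on NUL only: both names survive intact.
print(split_on_nul("\0".join(names) + "\0"))
# → ['my notes.txt', 'todo.txt']
```

Splitting on NUL works for file names because NUL is one of the few bytes that cannot appear in one.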

Furthermore, I've often found that the more general tool is simpler. Its functional specification is simpler; its implementation is simpler, and has fewer special cases; its user experience is simpler. That's not always true, but often it is.

Sometimes the general solution shape is not worth it. But it's always worth considering whether it might be.

One of the parts of the Unix culture I really like is the preference for general tools that are easy to combine.