[[!tag debian tddd systest]]

Continuing my earlier musings about test driven distro development, and the tools it would require.

I imagine something like this:

  • create a virtual machine with a particular configuration of stable
  • run a test suite verifying the VM works
  • upgrade the VM to testing and reboot
  • re-run test suite, verifying things work in testing

We might also want this scenario:

  • install testing directly in a VM (instead of upgrading from stable)
  • run test suite
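
To make that a bit more concrete, here's a rough sketch of what driving these two scenarios might look like in code. None of the helper functions exist; they're placeholders that just name the pieces a scenario needs:

    # Placeholder helpers: these do not exist, they only name the pieces
    # each scenario needs (building an image, running tests, upgrading).
    def create_vm(release, config):
        """Build and boot a VM image for the given release; return its address."""
        raise NotImplementedError

    def run_tests(target, tests):
        """Run the named tests against the target system."""
        raise NotImplementedError

    def upgrade_and_reboot(target, to_release):
        """Switch apt sources on the target, dist-upgrade, and reboot."""
        raise NotImplementedError

    def upgrade_scenario(config, tests):
        target = create_vm("stable", config)
        run_tests(target, tests)               # verify the stable VM works
        upgrade_and_reboot(target, "testing")
        run_tests(target, tests)               # verify things still work in testing

    def fresh_install_scenario(config, tests):
        target = create_vm("testing", config)
        run_tests(target, tests)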

There might be other scenarios that would be useful to test as well. Even from these two, it's clear there's a need for at least three separate tools:

  • create a machine image with a particular configuration
    • might be based on stable or testing or unstable
    • might also be using some other package sources
    • might want particular packages installed
    • might want particular configuration settings
  • run a suite of tests against the running image
    • which tests to run will depend on the scenario
  • run various scenarios using the above two tools
    • needs reasonably easy ways to specify scenarios
    • needs to support tests specific to scenarios
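
To illustrate the last point, a scenario specification might be nothing more than plain data: what image to build, and what to do with it. All the field names below are made up; this is a sketch of a possible shape, not a design:

    # A scenario as plain data: easy to write, easy to run mechanically.
    # Every field name here is invented for this example.
    UPGRADE_SCENARIO = {
        "name": "upgrade-stable-to-testing",
        "image": {
            "base": "stable",
            "extra_sources": [],                 # e.g. backports or a local repo
            "packages": ["openssh-server"],
            "settings": {"hostname": "testvm"},
        },
        "steps": [
            {"run-tests": ["ssh-login", "only-ssh-port"]},
            {"upgrade-to": "testing"},
            {"reboot": True},
            {"run-tests": ["ssh-login", "only-ssh-port"]},
        ],
    }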

I've written a rudimentary version of the first tool: vmdebootstrap. I've since learned there's a bunch of others that might also work. There's room for more: someone should write (or find) a tool to make a snapshot of a real system and create a VM image that mimics it, for example. Anyway, for now, I'll assume that one of the existing tools is good enough to get started.

For the second tool, I wrote a quick-and-dirty proof of concept; see systest.py. Here's a sample of how it might be used:

liw@havelock$ ./systest -v --target 192.168.122.139 --user tomjon
test 1/6: cat
test 2/6: only-ssh-port
ERROR: Assertion failed: ['22/tcp', '139/tcp', '445/tcp'] != ['22/tcp']
[status 1]
liw@havelock$ ./systest -v --target 192.168.122.139 --user tomjon \
    cat ping6-localhost ping-localhost simple-dns-lookup ssh-login
test 1/5: cat
test 2/5: ping6-localhost
test 3/5: ping-localhost
test 4/5: simple-dns-lookup
test 5/5: ssh-login
liw@havelock$ 

The first run failed because the VM I'm testing against has some extra ports open. Some of the tests require logging into the machine via ssh, and for that one needs to specify the user to use.
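
The only-ssh-port check is conceptually simple. Here's roughly what such a test could look like; this is not the code in systest.py, just an illustration, and a real version would do a proper port scan (with nmap, say) rather than probe a short hardcoded list of ports:

    import socket

    # An arbitrary sample of well-known ports, for illustration only.
    PORTS_TO_PROBE = [21, 22, 25, 80, 139, 143, 443, 445, 3306]

    def open_ports(host, ports, timeout=1.0):
        """Return the probed ports that accept a TCP connection, as '22/tcp' strings."""
        result = []
        for port in ports:
            s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            s.settimeout(timeout)
            try:
                if s.connect_ex((host, port)) == 0:
                    result.append("%d/tcp" % port)
            finally:
                s.close()
        return result

    def test_only_ssh_port(host):
        found = open_ports(host, PORTS_TO_PROBE)
        assert found == ["22/tcp"], "Assertion failed: %r != %r" % (found, ["22/tcp"])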

systest may overlap heavily with system monitoring tools, and possibly the production implementation should be based on those.

I think it's best to design such a tool for the more general purpose of testing whether a system currently works, rather than as an integrated part of a more specific, larger tool. This lets the tool be useful for more than just testing specific things about Debian. (The production implementation would then need to not have all the tests hardcoded, of course. SMOP.)
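
Not hardcoding the tests could be as simple as a small registry that test functions add themselves to. This, too, is just a sketch of one possible shape, not what systest.py does today:

    TESTS = {}

    def test(name):
        """Decorator: register a test function under the given name."""
        def register(func):
            TESTS[name] = func
            return func
        return register

    @test("ping-localhost")
    def ping_localhost(target):
        # A real version would run "ping -c1 localhost" on the target over
        # ssh and raise on failure; the body is left out on purpose.
        raise NotImplementedError

    def run_named_tests(target, names):
        """Run the named tests in order, in the style of the transcript above."""
        for i, name in enumerate(names, 1):
            print("test %d/%d: %s" % (i, len(names), name))
            TESTS[name](target)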

The third tool I have not spent a lot of time on yet. One thing at a time.

Given these tools, one would then need to decide how to use them. The easiest way would be to use them like lintian and piuparts: run them frequently on whatever packages happen to be in testing or unstable or experimental, put links to the test reports in the PTS, and hope that people fix things.

That is the easiest way to start things.

Once there's a nice set of test cases and scenarios, it may be interesting to think about more aggressive ways: for example, preventing packages from migrating to testing unless the test suite passes with them. If the tests do not pass, one of four things is broken:

  • the package or packages in question
  • other packages already in testing
  • the tests themselves
  • the test environment

If things are set up properly, the last one should be rare. The other three always require manual inspection: it is not possible to automatically know whether the test itself, or the code it tests, is at fault. It is, however, enough to know that something is wrong. If the tests are written well, they should be robust enough to not be the culprits very often.

(Someone wanting to make a rolling distribution, or, even better, a monthly mini-release, might benefit from this sort of automated testing.)