puzzling.org · mary.gardiner.id.au · Macquarie University

puzzling dot org

 · 

logs

 · 

thoughts

 · 

2008

 · 

July

 · 

19

Saturday 19th July 2008

Starting out with unittests

It's a big shame because your very first experience of unittesting is unittest.main and your next experience is this brick wall of suck and you think... I'm just not going to use it.
Andrew

Andrew and I have had Jonathan Lange to visit this week while he awaits home Internet, which means there's been a lot of talk about unit testing.

Coincidently, today I am at the very beginning of a small-ish Python project, one just large enough that I'd like to make sure it's fairly correct from the beginning, meaning I'd like to make sure it has automated tests, a sensible module layout and that kind of thing. It's small enough that I bet I could get it working quickly enough without automated tests... and therein lies the trap. In order to do the right thing when starting a scratch project, doing the right thing needs to be either really really easy, or really easy to correct if it was skipped at the junkcode stage of the project.

Consider various aspects of the project. I haven't worried too much about correct module naming, because at least with Bazaar renaming directories will be easy to do later. But I am trying to do unit testing from the very beginning because adding good testing to an existing codebase of much over 200 lines converges on impossible pretty quickly. And since I create new files in Python the way some other people create functions and some other other people create new lines, I separate my test modules early which results in this brick wall of suck:

import unittest

tests = [
	# list of strings containing test module names
]

def my_import(name):
    # No love to http://docs.python.org/lib/built-in-funcs.html
    mod = __import__(name)
    components = name.split('.')
    for comp in components[1:]:
        mod = getattr(mod, comp)
    return mod

def runAll():
    runner = unittest.TextTestRunner()
    suite = unittest.TestSuite()
    for testmodulename in tests:
        testmodule = my_import(testmodulename)
        print testmodule
        loader = unittest.TestLoader()
        suite.addTests(loader.loadTestsFromModule(testmodule))
    result = runner.run(suite)

That is, unittest does not do test discovery, you have to tell it where to find all the test modules, and you run slap-bang into Python's tedious programmatic interface to its own import mechanisms.

So, unit testing in Python: hard to get right later, but not easy enough to get right at the beginning. (Consider: the code above is currently about three quarters of the codebase in question.) I will watch Jonathan's pyunit3k with interest. (For my PhD I use Twisted's trial test runner, but I am not willing to introduce a dependency on Twisted for this project purely for a better test runner.)

In general, it would be excellent if some firm and right person was to write a guide to best practices for Python projects, with regard to starting a project so that it is easy to test, easy to collaborate on, easy to install and (and I think this is somewhat missing too) easy for the distributions to package. And that all these steps be so trivial that they can be carried out at the beginning of almost every project.

Version control idea: modified times

I tend to commit almost everything that doesn't move to version control, but there's one major exception: the source data for my website puzzling.org, which is mostly text files. So as to avoid various nuisances to do with calculating data I already have and storing it somewhere else and needing to keep that store up to date I keep track of the time that a file was modified by... asking the filesystem when this file was last modified.

So far so good, and tools like Blosxom do likewise, except that file timestamps tend not to be version controlled, which means if I store my files on more than one machine and rely purely on version control to maintain the date the dates don't end up the same.

So instead I use Unison, which keeps the trees and dates in sync at the expensive of losing all my history (I also have incremental backups for a couple of months, but that's a recent addition to my data management). As Martin Pool apparently did at some point, although that was some months before he started writing a version control tool.

Thus, plugin idea for version control system: optional version of additional filesystem metadata, especially times.

Last modified: 19 July 2008