Tuesday 22 June 2004 – puzzling.org

Now that I’m hosting this in the same place as my main log, it wants to become my main log’s brother. It wants me to tell it all about whatever spiritual hiccups have smeared my professional programming today. It wants to be my tech confessional. But I’m refraining. Perhaps it might get lucky and be used as a non-programming tech log one day.

Currently, I’m struggling with Backwards design issues, specifically storage issues. This stuff, while not quite as dull as rule-based information extraction or user input validation, is still pretty dull. Every time I think about it I think about downloading Zope 3 and using that, and then I come to my senses and realise that I’d have to rewrite the whole thing.

It’s a backend problem. puzzling.org has always been pretty much a simple tree in structure, so the filesystem makes sense as a storage mechanism. Except when it doesn’t, which is precisely when puzzling.org stops looking like a tree. For example, consider the logs. Their tree structure is: root, year, month, day. However, I want the leaves of the tree (the entries) to be a doubly linked list, i.e. to have previous and next links.

I can search the tree in this case (if I’m looking for the entry before the 1st of January, I crawl up to the root and down into the previous year to find the 31st of December), but if I decide not to, I need to calculate some data and store it somewhere. Where? Well, I have only made 566 diary entries over the past three years, so I could just about store the list in memory. But if I don’t, I need to figure out where to put it.
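
A minimal sketch of the keep-it-in-memory option, assuming the log tree holds nothing but numerically named directories (root/2004/06/22). The names load_entry_dates and neighbours are made up for illustration and aren't part of Backwards. The idea is to walk the tree once, keep a sorted list of dates, and let a binary search stand in for the previous and next links:

```python
import bisect
import datetime
import os


def load_entry_dates(root):
    """Walk a root/year/month/day tree once and return the sorted entry dates.

    Assumption: every name under root is a number (e.g. 2004/06/22).
    """
    dates = []
    for year in os.listdir(root):
        for month in os.listdir(os.path.join(root, year)):
            for day in os.listdir(os.path.join(root, year, month)):
                dates.append(datetime.date(int(year), int(month), int(day)))
    dates.sort()
    return dates


def neighbours(dates, entry):
    """Return the (previous, next) entry dates around `entry`.

    Either side is None when `entry` sits at that end of the list.
    """
    i = bisect.bisect_left(dates, entry)
    previous = dates[i - 1] if i > 0 else None
    j = bisect.bisect_right(dates, entry)
    following = dates[j] if j < len(dates) else None
    return previous, following
```

With a few hundred entries the list is tiny, and the year boundary stops being a special case: the entry before the 1st of January is just the element one place earlier in the list.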

The case for a links blog (which doesn’t exist yet) is harder. If it is to look like my del.icio.us page, each url needs to be associated with a title, a description and a list of categories. But the sane web tree configuration is root, category, url (as opposed to root, url, category), which means being able to make the "what urls are in this category?" query easily. When urls are in multiple categories, how do you represent that in the filesystem so that the query takes better than O(n) time (n being the total number of urls)? Symlinks?
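
For comparison, here is roughly what the query needs from its storage, sketched in memory. LinkStore and its methods are hypothetical names, not anything that exists in Backwards or del.icio.us. Each url is stored once, and a second mapping from category to urls answers the category query without scanning every url (a directory of symlinks per category would be the filesystem version of the same index):

```python
from collections import defaultdict


class LinkStore:
    """Links stored once by url, plus a category -> urls index."""

    def __init__(self):
        self.links = {}                      # url -> (title, description, categories)
        self.by_category = defaultdict(set)  # category -> set of urls

    def add(self, url, title, description, categories):
        categories = tuple(categories)
        # If the url is already known, drop its old index entries first.
        if url in self.links:
            for old in self.links[url][2]:
                self.by_category[old].discard(url)
        self.links[url] = (title, description, categories)
        for category in categories:
            self.by_category[category].add(url)

    def urls_in(self, category):
        """Answer "what urls are in this category?" from the index alone."""
        return sorted(self.by_category.get(category, ()))


store = LinkStore()
store.add("http://example.org/a", "A", "first link", ["python", "web"])
store.add("http://example.org/b", "B", "second link", ["web"])
print(store.urls_in("web"))
```

The cost of the lookup depends on how many urls are in the category, not on the total number of urls, which is the property the filesystem tree doesn't give for free.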

Well, it’s a trick question as far as I can see. The filesystem’s fairly strict tree structure and limited query mechanisms mean that it isn’t a good backend for this. Which means databases of course. Which means researching databases and choosing between them and learning to use my choice and dependencies (because Nevow isn’t a major dependency, no!) and ew. Hmph. I like the filesystem.
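
To make the database option concrete, here is a small sketch that assumes SQLite through Python's sqlite3 module. That choice is purely for illustration, since no database has been picked, and the table and function names are invented. It covers the usual many-to-many layout: one table of links, one table of (url, category) pairs, and an index on category so the category query doesn't have to scan every url.

```python
import sqlite3

# In-memory database for the sketch; a real site would use a file path.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE link (
        url TEXT PRIMARY KEY,
        title TEXT,
        description TEXT
    );
    CREATE TABLE link_category (
        url TEXT REFERENCES link(url),
        category TEXT,
        PRIMARY KEY (url, category)
    );
    CREATE INDEX link_category_by_category ON link_category (category);
""")


def add_link(url, title, description, categories):
    """Store a link once, plus one row per category it belongs to."""
    conn.execute("INSERT INTO link VALUES (?, ?, ?)", (url, title, description))
    conn.executemany(
        "INSERT INTO link_category VALUES (?, ?)",
        [(url, category) for category in categories],
    )


def urls_in(category):
    """Return the urls filed under `category`, sorted by url."""
    rows = conn.execute(
        "SELECT url FROM link_category WHERE category = ? ORDER BY url",
        (category,),
    )
    return [row[0] for row in rows]


add_link("http://example.org/a", "A", "first link", ["python", "web"])
add_link("http://example.org/b", "B", "second link", ["web"])
print(urls_in("web"))
```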