Monday 19 July 2004

I’ve been working on Backwards again: I’m trying to pull all the stuff out of it that’s specific to my site so that other people can use it. I’m getting closer: Andrew is trying to deploy it now, which is helpful. I’ve finally provided site creators with the ability to drop in their own DocFactory easily, which is a nice touch because you can choose to do the DocFactories with Stan, which is ever so much nicer than typing HTML tags, or you can do it with HTML/XML if you like. It means that it’s still a programmer’s web content tool, but I don’t intend to change that.

I’ve also finally stuck in some code that adds the cache validation Last-Modified and ETag HTTP headers to every page request. I added it to the RSS feeds a while back after I noticed that Jeff’s aggregator and Planet SLUG both poll for updates every ten minutes. I didn’t think it would be so useful for the rest of puzzling.org, because most browser visitors come in from Google to look at my summaries of my high school texts.

I’d forgotten about the Googlebot itself though. It’s a pretty regular visitor to my site now — it seems to go through every few days. Other search robots are less frequent visitors. The cache validation headers are preventing the full transmission of an awful lot of content to robots now. Something for a lot of dynamic blog tools to consider doing — perhaps many do it already.
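For anyone considering the same thing, here is a minimal sketch of the idea in plain Python. It is not the Backwards or Nevow code: `conditional_response`, the plain dict of request headers and the `render` callable are all hypothetical stand-ins, and the weak ETag is simply derived from a modification timestamp.

```python
import calendar
from email.utils import formatdate, parsedate


def _opaque(tag):
    """Strip whitespace and the weak marker so tags can be compared weakly."""
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag


def conditional_response(request_headers, mtime, render):
    """Return (status, headers, body), skipping the render on a cache hit.

    request_headers: dict of request header names to values (an assumption,
    not a real request object). mtime: seconds since the epoch.
    render: zero-argument callable producing the page body.
    """
    stamp = int(mtime)
    headers = {
        "Last-Modified": formatdate(stamp, usegmt=True),
        "ETag": 'W/"%d"' % stamp,  # weak ETag based on the timestamp
    }

    # If-None-Match takes precedence over If-Modified-Since when present.
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        tags = [_opaque(t) for t in if_none_match.split(",")]
        if "*" in tags or _opaque(headers["ETag"]) in tags:
            return 304, headers, b""
        return 200, headers, render()

    # HTTP dates are always GMT, so timegm is the right inverse here.
    parsed = parsedate(request_headers.get("If-Modified-Since", ""))
    if parsed is not None and stamp <= calendar.timegm(parsed):
        return 304, headers, b""
    return 200, headers, render()
```

A robot that sends back the validators it was given gets a 304 with an empty body, and the page is never rendered for it.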

The only problem I’m having is working out what to do when the templates change because really, the validation headers should change then too. The content of the page won’t have changed, but the layout will have. There’s a couple of possibilities:

  • drop the Last-Modified header and base the ETag header on some kind of hash of the page content (I’m currently setting weak ETags based on the timestamp, actually), which means turning off Nevow’s incremental render; or
  • extend the existing “change detection” mechanisms to detect changes in the template as well as changes in the content.
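The first option can be sketched roughly as follows. This is an illustration only: `content_etag` is a hypothetical helper, and the choice of SHA-256 is an assumption, since any stable hash of the rendered bytes would do.

```python
import hashlib


def content_etag(rendered_page):
    """Build a strong ETag from the fully rendered page (bytes).

    The whole page has to exist before the header can be sent, which is
    why this approach rules out incremental rendering.
    """
    return '"%s"' % hashlib.sha256(rendered_page).hexdigest()
```

Because the hash covers the rendered output, a layout change alters the ETag even when the underlying content is untouched.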

Changes in the content are currently detected in a variety of ways, but they’re all based on file timestamps. I haven’t come up with a way to detect a template change yet that doesn’t place a burden on the site maintainer to record the fact or date of the change manually. I could insist that the templates be files so that I could check the datestamp, but having them as nevow.loaders.DocFactory Python objects is desirable for other reasons. I guess I could also stick the template in some kind of database and timestamp it there. (Actually, doing the latter might be one way to avoid the “need to restart the process when the templates change” problem too. Maybe I have a winner here.)
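If the templates were files, the second option could be as small as this sketch. It assumes both content and templates live on disk; `effective_mtime` is a hypothetical name, not an existing Backwards function.

```python
import os


def effective_mtime(content_paths, template_paths):
    """Return the newest modification time among content and template files.

    Feeding this into the validation headers means a template edit
    invalidates cached pages just as a content edit does.
    """
    paths = list(content_paths) + list(template_paths)
    return max(os.path.getmtime(path) for path in paths)
```

A database-backed template store would work the same way, with the stored timestamp taking the place of `os.path.getmtime`.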

There’s a few other things I want to sort out before resting for a while (by which I mean “making a numbered release which probably no one will use anyway”):

  1. Documentation. I actually loathe documenting my own projects as much as anyone, it’s only documenting other people’s that I don’t mind.
  2. URL generation. Unfortunately, the fact that I’m using old twisted.web with my personal Backwards site means that the URL generation code is an immense mess. Because old web only talks to it through a proxy, and the proxy code doesn’t set the forwarding headers, I can’t use Nevow’s URL generation mechanisms without ending up with a bunch of http://localhost:8080/ URLs. So I have my own clunky hard-coded base URLs because there’s no way to get them from the request object when it’s behind a twisted.web.proxy. I keep wanting to have a weekend-long hackfest on new twisted.web to get it deployable, but I’m not a “twisted developer” in that sense, and the way they use sandboxes has always said “my rewrite, no you touchie!” to me.
  3. Persistence. The amount of data stored in memory is really too large and should be much lower. So I need to stick it somewhere. Which means choosing between persistence systems on what I currently feel is way too little information. I currently have a bastardised mixture of shelves which really should be one file.
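On the URL generation point, this is roughly what I would want if the proxy did set forwarding headers. It is a sketch under assumptions: the `X-Forwarded-Host` and `X-Forwarded-Proto` header names are the common de facto ones, `base_url` is a hypothetical helper, the headers are a plain dict rather than a real request object, and the fallback URL is a placeholder for the hard-coded base.

```python
def base_url(headers, fallback="http://example.org/"):
    """Work out the public base URL for a request that arrived via a proxy.

    Falls back to a hard-coded base when the proxy sets no forwarding
    headers, which is the situation described above.
    """
    host = headers.get("X-Forwarded-Host")
    if not host:
        return fallback
    scheme = headers.get("X-Forwarded-Proto", "http")
    # With chained proxies the header may hold a list; the first is the client-facing host.
    return "%s://%s/" % (scheme, host.split(",")[0].strip())
```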

Trip planning: finale

You know, this really ought to be called my "misc log" not my "tech log". But I’m sure you all understand that that would mean I might start posting quizzes. Hence, I can only continue to apologise for being off-topic on my own damn website.

Lookit! Look! We have a final itinerary, at least as far as Asia. I’m likely to move about a bit using this as a base, Andrew will be tied down by his need for an umbilical cor… Internet access.

8th August
Andrew in London
20th August, 21st August
I arrive in San Francisco from Sydney on the 20th, Andrew arrives from London on the 21st. (Something to note here is that the 20th is our fifth anniversary. Observe that not only will I have a lonely anniversary, I will also have a rather long one, thanks to crossing the date line that day.) At this stage we’re intending to spend a few nights in San Francisco and then a few more nights at Stanford, but… this depends on people at Stanford.
26th August
Boston
30th August
New York
2nd September
Washington
7th September
London. Well, London for Andrew, I’m currently planning on visiting Scotland and France as well as whatever bits of England sound like a good idea at the time. Don’t expect a firm itinerary for that, I only do these things for other people’s benefit. For the good of my own soul, I intend to travel at a whim, or at least the whim of the Pound.
1st October
Palma de Mallorca. Yes, this is a somewhat odd choice of location in Spain, but we found somewhere to stay with wireless ‘net access. What can you do?
23rd October
Prague. The thing in the Austrian mountains fell through. I maintain my sanity by reassuring myself that there are plenty of other fish in the sea and mountains to be seen. But opinions of Prague are uniformly good, if you ignore the travel advice about crime — hey it’s not their job to sell the place. And Andrew likes the cold…
13th November
Bangkok. Yes, it’s terrible isn’t it? Hopefully we can get out of Bangkok as soon as possible and spend a few weeks elsewhere in Thailand and possibly neighbouring countries.
28th November
Sydney — although Andrew will, I guess, have another overseas meeting in December, so goodness knows how this will work for him.

While the time has now passed to say things like "hey, you know, Russia is an excellent destination" we’d still appreciate hearing from anyone in or near these areas who’d like to meet up, and also about interesting things to see (natural beauty is great, I like historical sites and Andrew wouldn’t say no to art galleries…) in areas near those listed.

This is, I hope, the last time I’ll post this publicly. At some point in the future you can look forward to me having something meaningful to say about these places.

Trip planning update: now with fresh new plans

In the last season, our intrepid heroine’s plans in Europe involved doing "stuff" in Europe, where "stuff" involved quintessential holiday-in-Europe things like thinking ‘hmmm, Prague sounds nice, I wonder when the train leaves’, and ‘goodness, when I went to sleep everyone was speaking French, but now they’re all speaking German!’

Unfortunately, various constraints have subsequently emerged, the most important of which is that Andrew would like to spend a lot of quiet time with an Internet connection during our stay. I originally considered a longer working holiday, but eventually decided to stick with our touristy plans with longer stays at a few places with Internet connections rather than a nomadic existence. Alas, such is the nature of compromise. Unfortunately it turns out that the combination of "Internet" and "major city in Europe" pushes your accommodation right into the 100 euro a night range, which is well outside my budget. Or at least so it appears from my attempts to find accommodation online. At the moment I have a phone call to make to Spain regarding a bed and breakfast, and another to Austria about a little studio in the mountains. I’m a little attached to this mountain idea, I hope they didn’t leave a zero off the price.

Now, for a rough itinerary, pending some interesting travel agent manipulations which I’ll return to soon:

San Francisco/Stanford
Arriving Fri 20th August, leaving Friday 27th August

Boston/New York/Washington
Arriving Boston Friday 27th August (I may try to move this back a few days), leaving Washington Monday 6th September. The exact split of this trip down the coast I haven’t worked out yet.
United Kingdom and France
Arriving London Tuesday 7th September. Andrew will spend the rest of September based in London, I will likely roam more widely. I will visit France, the north of England and possibly Scotland or Wales during this time.
Spain
Assuming that accommodation works out (as best I can tell from looking, cheap accommodation in or near Barcelona and Madrid exists only in myths dating from 1970), we will be in Spain in the first three weeks of October.
Austria or the Czech Republic
Again, depending on accommodation and proximity to our old friend the ‘net, we’ll be spending three weeks in Austria from about the 23rd of October; alternatively, I’ll look for something nearer Prague.
Hanoi or… err… Bangkok
This is where real travel agent magic begins, but we’ll be somewhere in south east Asia from about the 13th November on.
Sydney
My present bookings have me in Sydney on the 30th November.

The magic needed revolves around trying to visit Hanoi rather than Bangkok. Our round the world ticket includes up to 26 000 miles of travel, including, unfortunately, overland trips. We were hoping to fly from Frankfurt to Hanoi, but the best route the agent has been able to find involves flying via Bangkok. That and the fact that the distance is counted from our last landing (London) bring the trip total to twenty six thousand and sixty miles. That’s right: sixty miles over the limit, pushing the total price up by five hundred dollars.

The agent is going to try and get their product manager, which is apparently a title equivalent to "high ranking wizard," to reroute the trip. At present the cleanest solution involves subtracting either Boston or Hanoi from the stops but that would leave miles to spare, not to mention making me unhappy. I want to see them bring it in at twenty five thousand, nine hundred and ninety nine.

My apologies to everyone in Washington hanging on until I pull some precise dates out of the air. The next round should get it.

Documentation bugs

I tried to find useful third party guidelines for filing bugs against documentation, and wasn’t encouraged to see something I wrote on the second page of Google’s results for “filing documentation bugs”. So I wrote something better.

I’ve been notionally working on Twisted’s documentation for about six months and I’ve been disappointed to find it’s more something to be avoided rather than something to avoid with. It is at least true that I’ve found bug reports useful as a way in: “need to fix bug” is a much more useful starting point for me than “need to improve documentation.” I’m not absolutely sure about this, but I seem to be one of the only Twisted developers (hah! I develop not!) who files bugs against themself. It’s my only consistent adoption of todo lists in my entire life to date.

However, I’ve written an entire CMS and now a series of articles (which I’ve decided are so detailed that no one who’s read them would ever think they were remotely competent enough to buy and use a domain name) rather than write or edit Twisted docs. This is very disappointing, but I’m not contemplating giving up at the moment, despite the occasional nagging fear that my presence as docs editor is holding back someone vastly more committed waiting in the wings. (I suspect there is no such person, but if there is, do get in touch.)

It’s been interesting when I do work on docs though to discover exactly what parts of it I like doing: it wasn’t what I suspected when I began. Writing text for them is very hard. My output is about a paragraph an hour except in rare cases where I understand the code sufficiently well that I don’t have to read or write 100 lines of code to convince myself I understand what’s going on. I’m ambivalent about editing other people’s text for clarity or style — in many ways I love this, but it feels too easy to slip into the “make it as if I wrote it” rather than “improve it” trap. I quite enjoy producing example code, possibly more than I enjoy producing real code. This is probably the right way around — unless I’m a very atypical documentation user, example code is worth far more than text except in the case of design discussions.

I think the main problem I really have is that I’m terrible at ongoing tasks. My coding is the same: I like to have something working at the end of every coding “run” (which for me is about three hours of solid concentration, of which I can currently do two in a day if I have a good day). I like to have a passable document, or at least section, at the end of every docs run (same length as a code run, three hours must be my natural concentration cycle). I’m discouraged from starting anything that can’t be broken up into three hour chunks. In the case of documentation, in which it’s very hard to tell when I’ve finished, this is occasionally paralysing.

Tuesday 22 June 2004

Now that I’m hosting this in the same place as my main log, it wants to become my main log’s brother. It wants me to tell it all about whatever spiritual hiccups have smeared my professional programming today. It wants to be my tech confessional. But I’m refraining. Perhaps it might get lucky and be used as a non programming tech log one day.

Currently, I’m struggling with Backwards design issues, specifically storage issues. This stuff, while not quite as dull as rule based information extraction or user input validation, is still pretty dull, and every time I think about it I think about downloading Zope 3 and using that and then I come to my senses and realise that I’d have to rewrite the whole thing.

It’s a backend problem. puzzling.org has always been pretty much a simple tree in structure so the filesystem makes sense as a storage mechanism. Except when it doesn’t, in precisely those cases when puzzling.org stops looking like a tree. For example, consider the logs. Their tree structure is: root, year, month, day. However, I want the leaves of the tree (the entries) to be a doubly linked list, i.e. to have previous and next links.

I can search the tree in this case (if I’m looking for the entry before the 1st of January, I crawl up to the root and down into the previous year to find the 31st of December), but if I decide not to, I need to calculate some data and store it somewhere. Where? Well, I have only made 566 diary entries over the past three years, I could just about store the list in memory. But if I don’t, I need to figure out where to put it.
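The in-memory version might look something like this sketch, which assumes one entry per date and keeps the dates in a sorted list. `EntryIndex` is a hypothetical name, and how the dates get read off the year/month/day tree is left out.

```python
import bisect
from datetime import date


class EntryIndex:
    """Previous/next lookups over entry dates held in a sorted list."""

    def __init__(self, dates):
        self._dates = sorted(dates)

    def previous(self, day):
        """Return the latest entry date before `day`, or None."""
        i = bisect.bisect_left(self._dates, day)
        return self._dates[i - 1] if i > 0 else None

    def next(self, day):
        """Return the earliest entry date after `day`, or None."""
        i = bisect.bisect_right(self._dates, day)
        return self._dates[i] if i < len(self._dates) else None
```

For example, with entries on 31 December 2003 and 1 January 2004, `previous(date(2004, 1, 1))` crosses the year boundary without any crawling up and down the tree. A few hundred dates is a trivial amount of memory, and each lookup is a binary search.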

The case for a links blog (which doesn’t exist yet) is harder. If it is to look like my del.icio.us page, each url needs to be associated with a title, a description and a list of categories. But the sane web tree configuration is root, category, url, (as opposed to root, url, category) which means being able to make the "what urls are in this category?" query easily. When urls are in multiple categories, how do you represent that in the filesystem with a < O(n) complexity query time (n being the total number of urls)? Symlinks?

Well, it’s a trick question as far as I can see. The filesystem’s fairly strict tree structure and limited query mechanisms mean that it isn’t a good backend for this. Which means databases of course. Which means researching databases and choosing between them and learning to use my choice and dependencies (because Nevow isn’t a major dependency, no!) and ew. Hmph. I like the filesystem.
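To make the database option concrete, here is a sketch using SQLite through Python’s `sqlite3` module. Picking SQLite is purely an assumption for illustration, not a decision, and the table and function names are hypothetical.

```python
import sqlite3


def open_links_db(path=":memory:"):
    """Open (and if needed create) a small links database."""
    db = sqlite3.connect(path)
    db.executescript("""
        CREATE TABLE IF NOT EXISTS links (
            url TEXT PRIMARY KEY,
            title TEXT,
            description TEXT
        );
        -- One row per (category, url) pair; the primary key doubles as an
        -- index with category first, so category lookups avoid a full scan.
        CREATE TABLE IF NOT EXISTS link_categories (
            category TEXT NOT NULL,
            url TEXT NOT NULL,
            PRIMARY KEY (category, url)
        );
    """)
    return db


def add_link(db, url, title, description, categories):
    """Store a link and file it under each of its categories."""
    with db:  # commit on success, roll back on error
        db.execute(
            "INSERT OR REPLACE INTO links VALUES (?, ?, ?)",
            (url, title, description),
        )
        db.executemany(
            "INSERT OR IGNORE INTO link_categories VALUES (?, ?)",
            [(category, url) for category in categories],
        )


def urls_in_category(db, category):
    """Answer the "what urls are in this category?" query."""
    rows = db.execute(
        "SELECT url FROM link_categories WHERE category = ? ORDER BY url",
        (category,),
    )
    return [row[0] for row in rows]
```

A url in several categories is just several rows in `link_categories`, which is exactly the shape the filesystem tree struggles to express.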

Wednesday 16 June 2004

So Dave Winer has pulled all (most?) of the free weblogs.com content. Authors can, at some point, take advantage of a one time offer to get a copy of their content. Nice Dave, good boy. In the meantime, criticism is supposedly muted because people’s content is being ‘held hostage’ for at least the rest of the month. (I don’t actually know about that, all I’ve read is the criticism. I haven’t bothered with the nicey-nice stuff.) [Update in the interests of completeness: there is now a transition plan to 90-day free hosting on buzzword.com. Content has been restored.]

A couple of things concern me about this. One is this persistent notion that’s probably been around since the beginning of time and will probably be around until the end that having the right to do something is a justification for doing that thing. (Hint: "but I’m allowed to do that" is a non-defence against criticisms of your failures of courtesy, generosity or general personability. The whole point of that stuff is that it requires you to do more for others than the bare minimum that you’re compelled to do.)

The other is this notion that if you’re getting something for free, you deserve what you get when it all turns sour. As others have noticed, this is the same stuff that was levelled at people who were shocked about Movable Type’s new licensing schemes. Mark Pilgrim re-wrote that debate in his terms in his Freedom 0 essay. What the people who wanted something for free did wrong wasn’t trying to get something for free ("something free", if you don’t like people playing fast-and-loose with the multiple senses of "free"), it was not getting a guarantee of that freedom. Hopefully Shelley Powers can do something similar with her thoughts on The Value of Free:

There’s nothing wrong with not doing the free thing. However, there’s also nothing wrong with the people who accepted the free thing, freely given… Each person who accepted these free things also gave something back in return: whether it was bodies when webloggers were few, or grateful acknowledgement when webloggers were many. Though those who have benefited from these free services in the past should be grateful, they don’t deserve to be called "cheap" or cut loose without warning. Free does not equate to no value.

Shelley Powers

The point of money is to abstract over some notion of value in a way that allows values to be compared. It’s efficient to be able to compare price tags. But the consistent confusion of money and the value it represents in some cases is concerning for all kinds of reasons. Limiting concern to Free Software alone, it would mean that there is no quality without money; that there are no ethical obligations without money; and that nothing of value is exchanged without money.

My personal instincts about this favour social changes that move from a rights based discussion ("I’m allowed to do this, I’m not compelled to do that") to a courtesy and generosity based discussion. What were the nice things Dave Winer could have done if he couldn’t provide free hosting anymore? What’s an ethical way to write software? What’s an appropriately thankful way to use it? I know, oh, I know that people have been talking in these terms for thousands of years too. I still wish they’d do it more often.

Sunday 13 June 2004

This thing has been around for a day and already people have asked me about comments. I don’t get asked about comments for the other log very much. I turned them on on Livejournal (this is cross-posted) with some trepidation and it’s worked out surprisingly well.

There are two main reasons I’ve avoided comments or web editing in Backwards. The first is that I’m not a big user of web editing: I’ve had too many crashes that cost me work, accidental cuts I can’t undelete, and old cached copies of the page in the form (thank you Zope 2, or was that Squid?). Plus it’s always someone else’s UI and they’ve always set something to be too small. The other reason is that input validation is currently competing with rule based information extraction (just don’t even ask) for my "least exciting programming chore" award. The beauty of writing the entire site myself is that I don’t have to check for malicious markup, logins, cookies and other horrible things. I have all the power, no one else has any. Easiest authentication problem ever.

But it all comes down to the fact that I instinctively dislike the idea of comments on puzzling.org because it’s all mine, precious. Maybe I’ve spent too much time in the wrong comments threads, but I just don’t see the appeal of spending however many millions of hours I’ve spent this year in order to give people a forum to attack me and a guaranteed audience for their troll-fest. I want to put a click between me and my critics. Given the Livejournal experiment, this is a bit silly: people use my comments to say things like "let’s go crazy Spanish style" rather than "I will eat your young, ignorant evil-doer." Even so. Precious. One day someone’s spam robot would leave a comment and I’d feel personally violated.

There’s a pot and kettle problem though, because I prefer it when other people leave comments on (or in the case of my fellows who write their own CMS, write a comments system and then leave it on). There’s a certain social niche comments fill. Writing an entry to say "happy birthday" or "wasn’t it a nice day?" in response to other people’s entries is a noise problem more than a social activity. Sending an email works a bit better, but people are protective of their inboxes. Plus you miss out on interaction between commenters.

Wow, it really is possible to talk yourself into things isn’t it? Good thing I didn’t try and balance out the pros and cons of writing input validation, or I’d be spending today adding a comment facility to this thing. As it is I need to add some features for Andrew so that I can acquire my first user.

Saturday 12 June 2004

The extended absence of advogato.org has finally goaded me into doing what I’ve considered doing for ages on and off: moving my tech log to a server I control. I’m sure I’m far from alone too, especially since advogato.org posters showed up frequently on the Planets. I may work out some way of cross-posting, but I’m not sure anyone read the advogato version of this. It appears on various aggregators, hopefully they will all point at the new version soon.

Who would have thought that puzzling.org would be at all close to advogato.org in reliability?

Moving my tech log, or at least the bits of it that I had archived on my desktop, helped me iron out a bunch of kinks in Backwards too. It’s coming along nicely and maybe I’ll even do a tarball release and suggest it to the unwary on #twisted.web in a few months. Still, I had to change a large amount of code just to get it to let me use two logs with one install, so perhaps it isn’t quite that sound yet.

Incidentally, a side-effect of this is that entries I make on eyes will no longer appear on LinuxChix Live. You can subscribe to it directly if you’re interested though.

Python papers at the Australian Open Source Developers’ Conference

Heads up: A call for Python related papers for the first Australian OSDC (Open Source Developers’ Conference) went to the python-au list this afternoon.

The OSDC is between the 1st and 3rd December 2004 in Melbourne. It sounds like there will be a whole 12 hours between that and ALTA’s summer school and workshop… in Sydney! Pfft, there’s just about time to drive between them with that kind of timing.

I’m tempted to work up a paper for OSDC, because I sure won’t have one for ALTW. It’s a pity my Python expertise is a proper subset of spiv’s. And I’ve been doing web development again anyway, and it lacks awesomeness. Perhaps I need to develop newer and cooler Python expertise in a hurry.

Cool code of the week

Hats off to pppoeconf. It’s all so simple, post wailing and gnashing of teeth. (You think I speak figuratively? Not so! There’s a reason I’m not employed as a sysadmin.) Put ADSL modem in bridge mode, time expenditure until Eureka: some hours. pppoeconf, time expenditure until pon: 1 minute. Done!

And now that I have returned to the land of broadband, I hope never to have to look at the Whirlpool forums ever again.