Trip planning update: now with fresh new plans

In the last season, our intrepid heroine’s plans in Europe involved doing "stuff" in Europe, where "stuff" involved quintessential holiday-in-Europe things like thinking ‘hmmm, Prague sounds nice, I wonder when the train leaves’, and ‘goodness, when I went to sleep everyone was speaking French, but now they’re all speaking German!’

Unfortunately, various constraints have subsequently emerged, the most important of which is that Andrew would like to spend a lot of quiet time with an Internet connection during our stay. I originally considered a longer working holiday, but eventually decided to stick with our touristy plans with longer stays at a few places with Internet connections rather than a nomadic existence. Alas, such is the nature of compromise. Unfortunately it turns out that the combination of "Internet" and "major city in Europe" pushes your accommodation right into the 100 euro a night range, which is well outside my budget. Or at least so it appears from my attempts to find accommodation online. At the moment I have a phone call to make to Spain regarding a bed and breakfast, and another to Austria about a little studio in the mountains. I’m a little attached to this mountain idea, I hope they didn’t leave a zero off the price.

Now, for a rough itinerary, pending some interesting travel agent manipulations which I’ll return to soon:

San Francisco/Stanford
Arriving Fri 20th August, leaving Friday 27th August

Boston/New York/Washington
Arriving Boston Friday 27th August (I may try to move this back a few days), leaving Washington Monday 6th September. The exact split of this trip down the coast I haven’t worked out yet.
United Kingdom and France
Arriving London Tuesday 7th September. Andrew will spend the rest of September based in London, I will likely roam more widely. I will visit France, the north of England and possibly Scotland or Wales during this time.
Spain
Assuming that accommodation works out (as best I can tell from looking, cheap accommodation in or near Barcelona and Madrid exists only in myths dating from 1970), we will be in Spain for the first three weeks of October.
Austria or the Czech Republic
Again, depending on accommodation and proximity to our old friend the ‘net, we’ll be spending three weeks in Austria from about the 23rd of October; alternatively, I’ll look for something nearer Prague.
Hanoi or… err… Bangkok
This is where real travel agent magic begins, but we’ll be somewhere in south east Asia from about the 13th November on.
Sydney
My present bookings have me in Sydney on the 30th November.

The magic needed revolves around trying to visit Hanoi rather than Bangkok. Our round the world ticket includes up to 26 000 miles of travel, including, unfortunately, overland trips. We were hoping to fly from Frankfurt to Hanoi, but the best route the agent has been able to find involves flying via Bangkok. That, and the fact that the distance is counted from our last landing (London), brings the trip total to twenty six thousand and sixty miles. That’s right: sixty miles over the limit, pushing the total price up by five hundred dollars.

The agent is going to try and get their product manager, which is apparently a title equivalent to "high ranking wizard," to reroute the trip. At present the cleanest solution involves subtracting either Boston or Hanoi from the stops but that would leave miles to spare, not to mention making me unhappy. I want to see them bring it in at twenty five thousand, nine hundred and ninety nine.

My apologies to everyone in Washington hanging on until I pull some precise dates out of the air. The next round should get it.

Documentation bugs

I tried to find useful third party guidelines for filing bugs against documentation, and wasn’t encouraged to see something I wrote on the second page of Google’s results for “filing documentation bugs”. So I wrote something better.

I’ve been notionally working on Twisted’s documentation for about six months and I’ve been disappointed to find it’s more something to be avoided than something to avoid with. It is at least true that I’ve found bug reports useful as a way in: “need to fix bug” is a much more useful starting point for me than “need to improve documentation.” I’m not absolutely sure about this, but I seem to be one of the only Twisted developers (hah! I develop not!) who files bugs against themself. It’s my only consistent adoption of todo lists in my entire life to date.

However, I’ve written an entire CMS and now a series of articles (which I’ve decided are so detailed that no one who’s read them would ever think they were remotely competent enough to buy and use a domain name) rather than write or edit Twisted docs. This is very disappointing, but I’m not contemplating giving up at the moment, despite the occasional nagging fear that my presence as docs editor is holding back someone vastly more committed waiting in the wings. (I suspect there is no such person, but if there is, do get in touch.)

It’s been interesting when I do work on docs though to discover exactly what parts of it I like doing: it wasn’t what I suspected when I began. Writing text for them is very hard. My output is about a paragraph an hour except in rare cases where I understand the code sufficiently well that I don’t have to read or write 100 lines of code to convince myself I understand what’s going on. I’m ambivalent about editing other people’s text for clarity or style — in many ways I love this, but it feels too easy to slip into the “make it as if I wrote it” rather than “improve it” trap. I quite enjoy producing example code, possibly more than I enjoy producing real code. This is probably the right way around — unless I’m a very atypical documentation user, example code is worth far more than text except in the case of design discussions.

I think the main problem I really have is that I’m terrible at ongoing tasks. My coding is the same: I like to have something working at the end of every coding “run” (which for me is about three hours of solid concentration, of which I can currently do two in a day if I have a good day). I like to have a passable document, or at least section, at the end of every docs run (same length as a code run, three hours must be my natural concentration cycle). I’m discouraged from starting anything that can’t be broken up into three hour chunks. In the case of documentation, in which it’s very hard to tell when I’ve finished, this is occasionally paralysing.

Trip planning update — fresh new info!

I’ve finally got enough money so that this is absolutely, unquestionably going ahead. I also have a schedule of sorts… well, I want to be in France by mid-September before Jean-Phillipe comes back to Australia. So the plan is something like this (imagine me waving my hands in complicated patterns in the air and saying "this is precisely your solution space!"):

  • Mid August: US west coast, mainly Palo Alto (Stanford) and San Francisco, and I’ll probably spend a few days in the Los Angeles area since a friend will be living there by then. I hope to divert to the mountains and national parks which are kinda-sorta in the vicinity. Ish.
  • Late August, early September: US east coast, including DC, Boston and New York. I’ve more or less abandoned my plan of driving across the US — that would be more likely to happen if I was travelling with someone else who could drive, but I don’t fancy doing all the driving — and I don’t fit in buses so at this stage I’m planning to fly. I don’t fit in planes either (oh trans-Pacific leg, how I dread you) but they go faster.
  • Early/mid September: UK and France
  • After that: eh, we’ll follow our nose. I don’t have anyone to visit between western Europe and Beijing.

The earlier set of dates will hopefully be firmer within two weeks because we need to give a number of people in the US and western Europe advance warning of our plans, but after France I really am hoping to travel without a tight schedule.

I’ll be back in Australia late October at the latest unless my estimates of how much money I’ll be spending are wrong, and to be honest, they’re more likely to be under and see me back in Australia by the end of August, a whole two weeks after I left, than they are to see me still overseas in November.

I’m really hoping that I won’t completely run out of money by, say, Boston, but I’m not sanguine. I thought last week "surely you can live cheaper than AU$3000 a month in these places?" and my mother made soothing noises, but then I remembered how high rents are in Boston and realised that people there are actually paying AU$1000 – AU$1500 a month (about half my after-tax pay) just for accommodation. Yikes.

I’ll definitely trade off cheap accommodation against having some money to spend on food and sights, but even so, accommodation is going to really hurt.

Everyone keeps saying "spend more time in fewer places" but you know, none of them recommend the same places. My grandmother, for example, says I can’t come back to Australia without seeing Florence. Mos and Jean-Phillipe say that Paris is a necessity. A number of people have recommended Athens (some of them even think it would be worth being there during the Olympics). Everyone’s in favour of Prague, although whether it’s for history and sights or cheap beer varies depending on who I ask. It’s fortunate, after all, that people aren’t quite that keen to sell the sights of the US, or I’d never get even halfway around the world.

Pay no attention to the man behind the curtain

Working in natural language processing is so disappointing sometimes — probably a lot of AI work is similar. You imagine somehow that all these clever clogs are verbing nouns and nouning verbs and otherwise initiating computers into the mysteries of human language, but really, you spend an awful lot of time writing regular expressions and using other tools at a similarly shallow level.

Case in point: gazetteers. A very common approach to the problem of locating place names or person names in a piece of text is to have an enormous list of place names and person names and simply see if your candidate is in the list (yeah, you can make it more intelligent by checking for common surnames or the like too). It’s kind of understandable when you try and articulate rules for telling person names from place names (c’mon, can’t you imagine a person named “Ayers Rock” and some kind of landmark named “Apple Martin”?) but it still feels … un-fun.
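
A minimal sketch of how shallow the list lookup really is, in Python. Everything named here is an assumption for illustration: the two tiny lists are made up (a real gazetteer runs to many thousands of entries) and `classify` is a hypothetical helper, not part of any particular toolkit.

```python
# Toy gazetteers: the entries are illustrative, not from any real resource.
PLACES = {"ayers rock", "prague", "sydney"}
PEOPLE = {"apple martin", "regina barzilay"}


def classify(candidate):
    """Label a candidate name by plain list membership, nothing cleverer."""
    key = " ".join(candidate.lower().split())  # normalise case and spacing
    if key in PLACES:
        return "place"
    if key in PEOPLE:
        return "person"
    return "unknown"


if __name__ == "__main__":
    for name in ["Ayers Rock", "Apple Martin", "Mount Doom"]:
        print(name, "->", classify(name))
```

Note that the lookup has no way to notice a person who happens to be called Ayers Rock: whichever list the string sits in wins, which is exactly the un-fun part.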

I don’t work with them much, but I can see why the field is so fascinated with statistical approaches at the moment. Generalising over a set of data is so exciting. Stephen gave a very interesting talk on Monday about what Regina Barzilay and Mirella Lapata are doing in sentence ordering, and it didn’t seem to involve them having to sit down and manually write out any rules at all about how sentences are ordered. Very refreshing. Rigorous logical approaches have the same kind of appeal, but they’re less commonly used.

A lot of the time I don’t think I’m made to be an experimenter. But then I remember that I don’t think I’m made to be a programmer either. What I’m made to do is sit on a bean bag and have lackeys bring me interesting things to read. I can’t see that working as a job description.

Tuesday 22 June 2004

Now that I’m hosting this in the same place as my main log, it wants to become my main log’s brother. It wants me to tell it all about whatever spiritual hiccups have smeared my professional programming today. It wants to be my tech confessional. But I’m refraining. Perhaps it might get lucky and be used as a non-programming tech log one day.

Currently, I’m struggling with Backwards design issues, specifically storage issues. This stuff, while not quite as dull as rule based information extraction or user input validation, is still pretty dull, and every time I think about it I think about downloading Zope 3 and using that and then I come to my senses and realise that I’d have to rewrite the whole thing.

It’s a backend problem. puzzling.org has always been pretty much a simple tree in structure so the filesystem makes sense as a storage mechanism. Except when it doesn’t, in precisely those cases when puzzling.org stops looking like a tree. For example, consider the logs. Their tree structure is: root, year, month, day. However, I want the leaves of the tree (the entries) to be a doubly linked list, i.e. to have previous and next links.

I can search the tree in this case (if I’m looking for the entry before the 1st of January, I crawl up to the root and down into the previous year to find the 31st of December), but if I decide not to, I need to calculate some data and store it somewhere. Where? Well, I have only made 566 diary entries over the past three years, so I could just about store the list in memory. But if I don’t, I need to figure out where to put it.
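
The keep-the-list-in-memory option can be sketched as below. This is an illustration under stated assumptions rather than what Backwards actually does: the dates are invented, and `neighbours` is a hypothetical helper. With the entry dates held in a sorted list, `bisect` finds the previous and next links in O(log n) without crawling up and down the tree.

```python
import bisect
from datetime import date

# Hypothetical entry dates; in practice they would come from walking
# the year/month/day directories once at start-up.
ENTRY_DATES = sorted([
    date(2003, 12, 31),
    date(2004, 1, 1),
    date(2004, 6, 12),
    date(2004, 6, 13),
])


def neighbours(day, dates=ENTRY_DATES):
    """Return (previous, next) entry dates around `day`, or None at an end."""
    i = bisect.bisect_left(dates, day)      # first index with date >= day
    previous = dates[i - 1] if i > 0 else None
    j = bisect.bisect_right(dates, day)     # first index with date > day
    following = dates[j] if j < len(dates) else None
    return previous, following
```

Asking for the neighbours of the 1st of January 2004 gives the 31st of December 2003 as the previous entry, so the year boundary stops being a special case.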

The case for a links blog (which doesn’t exist yet) is harder. If it is to look like my del.icio.us page, each url needs to be associated with a title, a description and a list of categories. But the sane web tree configuration is root, category, url, (as opposed to root, url, category) which means being able to make the "what urls are in this category?" query easily. When urls are in multiple categories, how do you represent that in the filesystem with a < O(n) complexity query time (n being the total number of urls)? Symlinks?

Well, it’s a trick question as far as I can see. The filesystem’s fairly strict tree structure and limited query mechanisms mean that it isn’t a good backend for this. Which means databases of course. Which means researching databases and choosing between them and learning to use my choice and dependencies (because Nevow isn’t a major dependency, no!) and ew. Hmph. I like the filesystem.
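
For a sense of what the database route might look like, here is a minimal sketch using SQLite through Python's standard `sqlite3` module. SQLite is a stand-in chosen for illustration, not a choice made in this entry, and the table names, column names and helper functions are all hypothetical. The composite primary key leads with the category, so the "what urls are in this category?" query can use an index instead of scanning every url.

```python
import sqlite3

# In-memory database for illustration; table and column names are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE links (
        url TEXT PRIMARY KEY,
        title TEXT,
        description TEXT
    );
    CREATE TABLE link_categories (
        url TEXT REFERENCES links(url),
        category TEXT,
        PRIMARY KEY (category, url)
    );
""")


def add_link(url, title, description, categories):
    """Store one link and one row per category it belongs to."""
    conn.execute("INSERT INTO links VALUES (?, ?, ?)", (url, title, description))
    conn.executemany(
        "INSERT INTO link_categories VALUES (?, ?)",
        [(url, c) for c in categories],
    )


def urls_in(category):
    """Answer 'what urls are in this category?' with an indexed lookup."""
    rows = conn.execute(
        "SELECT url FROM link_categories WHERE category = ? ORDER BY url",
        (category,),
    )
    return [row[0] for row in rows]


add_link("http://example.org/a", "A", "first example", ["python", "web"])
add_link("http://example.org/b", "B", "second example", ["web"])
```

A url in several categories is just several rows in `link_categories`, which is the part the filesystem tree has no natural place for.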

Wednesday 16 June 2004

So Dave Winer has pulled all (most?) of the free weblogs.com content. Authors can, at some point, take advantage of a one time offer to get a copy of their content. Nice Dave, good boy. In the meantime, criticism is supposedly muted because people’s content is being ‘held hostage’ for at least the rest of the month. (I don’t actually know about that, all I’ve read is the criticism. I haven’t bothered with the nicey-nice stuff.) [Update in the interests of completeness: there is now a transition plan to 90-day free hosting on buzzword.com. Content has been restored.]

A couple of things concern me about this. One is this persistent notion that’s probably been around since the beginning of time and will probably be around until the end that having the right to do something is a justification for doing that thing. (Hint: "but I’m allowed to do that" is a non-defence against criticisms of your failures of courtesy, generosity or general personability. The whole point of that stuff is that it requires you to do more for others than the bare minimum that you’re compelled to do.)

The other is this notion that if you’re getting something for free, you deserve what you get when it all turns sour. As others have noticed, this is the same stuff that was levelled at people who were shocked about Movable Type’s new licensing schemes. Mark Pilgrim re-wrote that debate in his terms in his Freedom 0 essay. What the people who wanted something for free did wrong wasn’t trying to get something for free ("something free", if you don’t like people playing fast-and-loose with the multiple senses of "free"), it was not getting a guarantee of that freedom. Hopefully Shelley Powers can do something similar with her thoughts on The Value of Free:

There’s nothing wrong with not doing the free thing. However, there’s also nothing wrong with the people who accepted the free thing, freely given… Each person who accepted these free things also gave something back in return: whether it was bodies when webloggers were few, or grateful acknowledgement when webloggers were many. Though those who have benefited from these free services in the past should be grateful, they don’t deserve to be called "cheap" or cut loose without warning. Free does not equate to no value.

Shelley Powers

The point of money is to abstract over some notion of value in a way that allows values to be compared. It’s efficient to be able to compare price tags. But the consistent confusion of money and the value it represents in some cases is concerning for all kinds of reasons. Limiting concern to Free Software alone, it would mean that there is no quality without money; that there are no ethical obligations without money; and that nothing of value is exchanged without money.

My personal instincts about this favour social changes that move from a rights based discussion ("I’m allowed to do this, I’m not compelled to do that") to a courtesy and generosity based discussion. What were the nice things Dave Winer could have done if he couldn’t provide free hosting anymore? What’s an ethical way to write software? What’s an appropriately thankful way to use it? I know, oh, I know that people have been talking in these terms for thousands of years too. I still wish they’d do it more often.

Sunday 13 June 2004

This thing has been around for a day and already people have asked me about comments. I don’t get asked about comments for the other log very much. I turned them on on Livejournal (this is cross-posted) with some trepidation and it’s worked out surprisingly well.

There are two main reasons I’ve avoided comments or web editing in Backwards. The first is that I’m not a big user of web editing: I’ve had too many crashes that cost me work, accidental cuts I can’t undelete, and old cached copies of the page in the form (thank you Zope 2, or was that Squid?). Plus it’s always someone else’s UI and they’ve always set something to be too small. The other reason is that input validation is currently competing with rule based information extraction (just don’t even ask) for my "least exciting programming chore" award. The beauty of writing the entire site myself is that I don’t have to check for malicious markup, logins, cookies and other horrible things. I have all the power, no one else has any. Easiest authentication problem ever.

But it all comes down to the fact that I instinctively dislike the idea of comments on puzzling.org because it’s all mine, precious. Maybe I’ve spent too much time in the wrong comments threads, but I just don’t see the appeal of spending however many millions of hours I’ve spent this year in order to give people a forum to attack me and a guaranteed audience for their troll-fest. I want to put a click between me and my critics. Given the Livejournal experiment, this is a bit silly: people use my comments to say things like "let’s go crazy Spanish style" rather than "I will eat your young, ignorant evil-doer." Even so. Precious. One day someone’s spam robot would leave a comment and I’d feel personally violated.

There’s a pot and kettle problem though, because I prefer it when other people leave comments on (or in the case of my fellows who write their own CMS, write a comments system and then leave it on). There’s a certain social niche comments fill. Writing an entry to say "happy birthday" or "wasn’t it a nice day?" in response to other people’s entries is a noise problem more than a social activity. Sending an email works a bit better, but people are protective of their inboxes. Plus you miss out on interaction between commenters.

Wow, it really is possible to talk yourself into things, isn’t it? Good thing I didn’t try and balance out the pros and cons of writing input validation, or I’d be spending today adding a comment facility to this thing. As it is I need to add some features for Andrew so that I can acquire my first user.

Saturday 12 June 2004

The extended absence of advogato.org has finally goaded me into doing what I’ve considered doing for ages on and off: moving my tech log to a server I control. I’m sure I’m far from alone too, especially since advogato.org posters showed up frequently on the Planets. I may work out some way of cross-posting, but I’m not sure anyone read the advogato version of this. It appears on various aggregators; hopefully they will all point at the new version soon.

Who would have thought that puzzling.org would be at all close to advogato.org in reliability?

Moving my tech log, or at least the bits of it that I had archived on my desktop, helped me iron out a bunch of kinks in Backwards too. It’s coming along nicely and maybe I’ll even do a tarball release and suggest it to the unwary on #twisted.web in a few months. Still, I had to change a large amount of code just to get it to let me use two logs with one install, so perhaps it isn’t quite that sound yet.

Incidentally, a side-effect of this is that entries I make on eyes will no longer appear on LinuxChix Live. You can subscribe to it directly if you’re interested though.

Python papers at the Australian Open Source Developers’ Conference

Heads up: A call for Python related papers for the first Australian OSDC (Open Source Developers’ Conference) went to the python-au list this afternoon.

The OSDC is between the 1st and 3rd December 2004 in Melbourne. It sounds like there will be a whole 12 hours between that and ALTA’s summer school and workshop… in Sydney! Pfft, there’s just about time to drive between them with that kind of timing.

I’m tempted to work up a paper for OSDC, because I sure won’t have one for ALTW. It’s a pity my Python expertise is a proper subset of spiv’s. And I’ve been doing web development again anyway, and it lacks awesomeness. Perhaps I need to develop newer and cooler Python expertise in a hurry.