Open Source Web Design

So it’s not like I’m in the most beautiful city in the world or anything, which is why some of the exciting things I’ve been doing include pulling a bunch of people’s weblogs into WordPress. (OK, OK, and I did take the night mode on Andrew’s digicam for a spin.)

Then this evening I was looking at my website and thinking “time for a change” and decided to head over to Open Source Web Design with the idea that if I ran out of inspiration I could pinch a design and remodel it.

It turns out OSWD’s designs are… really unspeakably awful.

They have a ‘hot or not’ rating system. Here’s my rating criteria:

  1. I would close the browser window immediately if presented with this design.
  2. I would wince and enlarge the font several times, as well as upping the contrast on my monitor, when presented with this design.
  3. This design looks kind of like the rest of the web, except all the really bad bits, which it only resembles in part.
  4. This design is quite good.
  5. Upon seeing this design I would immediately plot to steal it for my own site, held back only by the fear of looking like a big thief, and also someone who can’t do their own web design. (Not that I can do my own web design, it’s just that at the moment I do it anyway.)

I gave nothing a score above three. Andrew was sitting next to me, and he gave nothing above a three either, despite several designs making extensive use of techniques designed to appeal to his demographic. (A demographic consisting of people who love the colour purple.)

If there really are people out there doing free high quality web designs for the fun of it, I’d appreciate some Google terms.

Thursday 21 October 2004

It’s well past time that I stopped using Movable Type 2.6.x on my servers. Various options present themselves.

The first is upgrading to Movable Type 3.x. This is undesirable for two reasons. First, I’d be hitching my wagon to proprietary software that may have arbitrary changes in price and conditions whenever a new version is released. Second, the 3.x version would cost US$99, and while I can afford this, I’m hosting other people’s weblogs on this installation (all but one of them, anyway), and none of them are paying me. So I don’t feel generous enough to do this.

The next is using a Free Software solution. This is appealing; it’s my default choice in all other kinds of software. However, in order to do this, the software would need to offer all the features of Movable Type that the users, or I, their administrator (captor), need:

Multiple weblog support.
I have ten weblogs. I don’t want to make ten copies of the same piece of (PHP, because everything is in PHP) code in ten different directories; make either ten different databases or fifty different database tables; and give the same users varying permissions over ten different weblogs. This one is a surprisingly rare feature.
User-based security.
This is not so much because I think the users are EVIL as because they will make mistakes. If the software gives them edit access to all the blogs, they will regularly make posts in the wrong place.
Web interface.
Most of these people don’t have their own computer, they blog from labs or the library. Uploading text files doesn’t cut it.
User editable templates.
Julia wants this, at least, and probably Mos does too.

Now, the first requirement alone knocks out a huge number of the candidates. There are a few that remain. Pivot seems buggy and… odd (probably unusable by users weaned on MT). b2evolution really seems like the only other candidate.

The last alternative is writing my own, because there does rather seem to be a hole in the (quite considerable) market. However, I’ve already done this twice, once for puzzling.org and once ages ago for eyes, and it’s getting dull. And in neither of those cases have I gotten anywhere near things like comments, trackbacks, or other basically necessary features. And while I have no doubt that the basic features of multiple weblogs, multiple users, and web editing would be developed fairly rapidly, the list of stuff I’d need to do started looking a little nasty:

  • I have to decide on a backend. Ew. And every time I change the code I have to have upgrade scripts. More ew.
  • In order to have more than one other user and any co-developers at all, there is but one choice of language. (Unless it turned out to be such a killer app that web hosts around the world started installing twisted.web — unlikely.)
  • Input validation and web based authentication are which of: horrible, easy to get wrong and boring? Answer: all of the above.
  • Then there’s that old horror: prevention of evil. Even if I trust my users to avoid malicious markup the question of fending off comment spammers, referrer spammers, trackback spammers and other jerks will arise eventually.
  • Letting users edit the templates is a huge input validation problem all on its own: how do you dig them out when they write an invalid template? (Getting them to write Nevow style templates? Well, anything’s possible.)
  • In order to handle both my own and other people’s bizarro weblog setups, it would need to work with arbitrary file systems, arbitrary vhosts, and quite likely generate URLs specified by the user.
  • In order to work with both my own and other people’s bizarro weblog setups, it would need to parse about twenty different types of export format.
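Of that list, the template problem at least is easy to sketch: before publishing anything, try to parse the user’s template and hand back the error rather than a broken page. A minimal illustration in Python (this is my own sketch with made-up names, not how MT or any of the candidates actually do it, and it uses well-formed XML as a stand-in for “valid template”):

```python
from xml.etree import ElementTree

def check_template(source):
    """Return None if the template parses as well-formed XML,
    or a human-readable error message if it does not."""
    try:
        ElementTree.fromstring(source)
        return None
    except ElementTree.ParseError as exc:
        return "Template error: %s" % exc

# A well-formed template passes...
assert check_template("<div><p>hello</p></div>") is None
# ...while an unclosed tag is reported instead of being published.
assert check_template("<div><p>hello</div>") is not None
```

Digging a user out then just means showing them the error message and refusing to replace their last working template.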

Irritants; Shiny things

Irritants

  1. Our host’s wireless router, whose default DHCP lease is 30 seconds. That’s sure to fill one’s logs, asking for a new IP address every 30 seconds. I discovered recently though that if you specify a lease time, any lease time, it gives you about 40% of that time. There’s some amusement to be had from that if you aren’t nicely sending DHCPRELEASE, but at the moment I am settling for a lease of 852 million seconds. And counting.
  2. Nautilus, which I’m trying to use regularly, not just for reasons of sympathetic magic, but because it is indeed useful to manipulate pictures by dragging and dropping thumbnails. (A future project is a fairly hard-core script for resizing pictures en masse because the g-scripts ones don’t talk gnome-vfs. Wait on this one, I need to learn pygtk.) However, the usefulness is being offset by the incredible number of bugs, mainly related to frozen redraws or uninformative and occasionally wrong error messages, that manifest themselves every time I try and use it for non-local file access.
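The DHCP workaround in the first irritant, for anyone with the same router, amounts to one line of dhclient configuration (the file path varies by distribution; the figure is the lease I mentioned):

```conf
# dhclient.conf (e.g. /etc/dhcp3/dhclient.conf on Debian-ish systems)
# Ask for an absurdly long lease; this router grants roughly 40% of
# whatever you request, so ask big.
send dhcp-lease-time 852000000;
```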

Shiny things

  1. Ubuntu’s "Human" icon theme (you may have to explicitly choose this). So many happy talking faces in one X-Chat icon, I can hardly believe it.

Hosting

About this time last year I was unhappy with my website host, and happened to be Googling for new hosts (Google Ads can be pretty useful as long as you were intending to part with money anyway) when I came across the idea of virtual servers: that is, paying someone to run a process for you that behaves just like a little Linux machine.

The concept was just great for me, because I had this enormous list of specialised hosting requirements that started accruing way back when I was hosting for free on a server tucked away at Andrew’s work place. These requirements include something that no shared hosting provider gives out (multiple shell accounts) and stupid requirements like the ability to construct the infinite number of addresses with “-” signs in them that Andrew and I use for various purposes, mainly for sorting mail from online companies into different folders.
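The sorting side of that scheme is trivial once the addresses exist; a hypothetical Python sketch (the function and folder names are mine, not any real delivery agent’s):

```python
def folder_for(address, separator='-'):
    """Pick a mail folder from a hyphen-extended address, e.g.
    mary-somecompany@example.org files into 'somecompany'."""
    local, _, domain = address.partition('@')
    user, _, extension = local.partition(separator)
    # No extension means ordinary mail: file it in the default folder.
    return extension or 'inbox'

assert folder_for('mary-somecompany@example.org') == 'somecompany'
assert folder_for('mary@example.org') == 'inbox'
```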

Anyway, the virtual servers appeared to have all the advantages of dedicated servers that I needed (root access with the usual powers minus kernel upgrades and driver fiddling) without the hassles of dedicated servers, which basically come down to price and hardware maintenance.

Since then though, I’ve embarked on a round of server hopping the likes of which would make any Australian ADSL junkie proud.

A year in review:

Bytemark. These guys are pretty good. They have a simple little shell app where you can log in to the host server, poke at your parasite server, reboot it, access the consoles and so on. You can also overwrite the whole thing with a new clean server. These applications are really useful and many virtual server hosts don’t have them. When they don’t, if your server disappears from the ‘net, you’re at the mercy of tech support, even if the host server is up. I switched away from them because they were in the UK and the delay when typing from Australia was annoying me. In retrospect: dumb.

JVDS. Random web pundit consensus seems to be that these guys have a pretty good deal on RAM, disk space, and whatnot. They seem to have the most recommendations. They had reasonably prompt user support. Unfortunately, we really needed it, because they messed up our server’s setup. Every time the host machine rebooted, someone else’s server would come up in place of ours. And this server, being where ours should be, would start receiving all our mail and, not recognising the addresses, would reject it all. Delayed mail not good, bounced mail bad. Further, we had no access to the host server and couldn’t check our machine’s status. So after the fourth time they promised and failed to fix the phantom server problem, we moved hosts again.

Redwood Virtual. These guys have an amazing deal on RAM which was why we went to them. Unfortunately they’ve had two major problems: consistent ongoing performance problems probably related to disk, and massive downtime. Like JVDS they don’t give you any access to the host server that’s useful when your parasite goes down, and unlike JVDS, they don’t have 24 hour support. It turns out they grew out of a bunch of friends who got a dedicated machine and some IP addresses and started playing around with UML.

Linode. I’m testing these people at the moment. While not a strikingly good deal on RAM or disk space, these guys have the most sophisticated host server management facilities I’ve seen. They’re the only host so far with an easy way to find out your bandwidth use. You can reboot and stop your parasite server. You can subdivide your disk space, reformat it, and reinstall at will. You can maintain different installations and switch between them. You can purchase extra RAM and disk space and have it added automatically. You can access your parasite host’s consoles. You can configure reverse DNS. And then, only if there’s something wrong with all of that, you can hassle tech support. Finally, although it’s possible my parasite server just happens to be on a new machine, they seem to have good performance.

[Side note to Twisted people eager to promote one of their helpers: thanks, I’ve heard of tummy.com. However they’re relatively expensive and offer less disk space than I need.]

Linode isn’t all roses though.

First of all, they’re draconian about spam. I’m fine with “thou shalt not spam.” I’m less happy with “if you ever get us blacklisted we will charge you $250 an hour for the time it takes us to get un-blacklisted.” (Background story: I used to run a secondary mail server for twistedmatrix.com. Spammers, for various reasons, like to send spam via the secondary mail server. Hence, I was handling all of twistedmatrix.com’s spam and forwarding it to them, as secondary servers are meant to do. One of their users noticed my server’s name in all his spams, and promptly got me in trouble with my provider, who was JVDS at the time. Moral of the story: it isn’t hard to falsely look like an open relay, and never secondary for someone who may have users who can read email headers but don’t know DNS.)

Second, as mentioned, they’re not the best deal on RAM and disk space. In particular, I probably am really pushing it trying to run a server under my current demand with 64MB RAM, especially as either Nevow or my Nevow app is a real memory hog. And, goddamn, memory usage needs to be a priority for virus and spam checkers. Amavis doesn’t even do any actual matching for me, it just hands off to clamav, and it still eats 6-10MB of memory.

Finally, nitpicking: their default Debian images have some weird problems, most noticeably not having 127.0.0.1 localhost in /etc/hosts. I hope I’ve come across the majority of these by now.
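For anyone bitten by the same image, the fix is just the standard loopback entry:

```conf
# /etc/hosts -- the loopback entry the stock image left out
127.0.0.1       localhost
```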

However, I am hoping that a week or two of testing (they’re already handling incoming mail for Andrew and myself) will show them to be sufficiently stable and agile to look at settling there for a while.

Nifty; Job

Nifty

I was trying to deal with LiveJournal’s XML-RPC interface to transmit some UTF-8 encoded text. It wasn’t working so well, so Andrew introduced me to the following Python snippet (which will test your installed fonts nicely):

print u'abcdefg€ñçﺥઘᚨ'.encode('ascii', 'xmlcharrefreplace')

The output is:

abcdefg&#8364;&#241;&#231;&#65189;&#2712;&#5800;

Note: you should actually run this as a file rather than just whacking the script into your Python command line interpreter, because my console or interpreter didn’t like Unicode input, and mine was set up by those whacky Python nuts at Canonical.

Get that? (Pfft, don’t look at the HTML source, I had to change all of the & signs to &amp;.) It takes the nasty Unicode string “abcdefg€ñçﺥઘᚨ” and re-encodes it in ASCII, replacing all the non-ASCII characters (everything except ‘abcdefg’) with XML character references to their Unicode values. The upshot is an XML snippet that you can transmit in ASCII if you’re ever dealing with an interface that doesn’t seem to like your UTF-8 encoded strings.
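For the record, in later Pythons where every string is already Unicode, the same trick works without the u'' prefix (encode() hands back bytes, which suits the “transmit in ASCII” goal nicely):

```python
# Every non-ASCII character becomes a decimal XML character reference;
# plain ASCII passes through untouched.
encoded = 'abcdefg€ñç'.encode('ascii', 'xmlcharrefreplace')
print(encoded.decode('ascii'))
assert encoded == b'abcdefg&#8364;&#241;&#231;'
```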

Job

Not that I have a good resume online, but with half my holiday over, I’m looking for six months or so of work in Sydney when I get back. I’m available from early December. Python, Perl or Java programming, or possibly tech support, tech writing or office admin, but I’ve got a better resume for the junior programming positions. Leads appreciated!

Pretty pictures

Pretty pictures

Ubuntu changed their default theme to include a harmonious humanity image featuring three pretty young things, which is causing considerable controversy mainly because the models used in the pictures are in various states of (well and truly legal in Australia) partial nudity. Screenshots linked here unless the poster takes them down. (PNGs I ask you?)

A lot of people are making the argument that those images may be inappropriate if displayed in a corporate environment or alternatively to conservative friends or family members. I don’t think anyone’s admitted to being too conservative themselves to like the image, so I’ll start.

I like portraiture and good photographs, as it happens, and it can get as naked as can be. Fetish shots are fine as long as I know roughly what to expect. These shots are good photographs and reasonable portraiture, although they’re a bit more glossy/pretty-pretty than I like to see in galleries.

But for some reason, which must be unpopular judging from every theme site I’ve ever seen, I really dislike having people prettier than me on my computer’s desktop. I don’t think I’ve ever had portraits on it at all in fact, but if I did, I would never start with models. Something in the idea leaves me very cold: I’d much rather teh-boring than teh-pretty-people. (In actual fact though, I have a pretty castle shot: not the most amazing shot ever, but a favourite amongst my own.)

(I wonder what is psychologically at the root of this? Perhaps people roughly divide into two: people who’d love to strip off a bit and be happy and playful for a camera, and the other half of people — or maybe that’s just me — whose instinctive reaction to the idea has a little bit of ew in it. It certainly messes with the intended vibe.)

Update: Andrew showed me the proposed CD cover which has similar artwork, and for some reason I have considerably less squick. Maybe I’m acclimatised to teh-pretty when shopping. On the other hand, since partially naked people are usually selling things I don’t want, I think I’d pass it by on the sales rack without a second glance.

Plausible facts

Chris Yeoh made an attack on Wikipedia, inserting false facts into a new article and seeing if they’d be removed. See also Rusty Russell who considers this vandalism, and Martin Pool who thinks that this isn’t telling us anything we didn’t already know.

It made me think a little about defences already in place against this kind of thing. The biggest concern, I think, is not a misplaced fact somewhere, but the run that kind of fact can get through followup literature. For example, in early editions of The Beauty Myth Naomi Wolf claimed, with a citation, that 150 000 American women die of anorexia every year, which is in fact false. (150 000 was approximately the number of sufferers not deaths.) This is useful to bibliographers of a forensic inclination because tracking your mistakes is an excellent way to narrow down the number of sources you could have got your figures from, but not so useful to, say, anyone else. However, at this time, it isn’t of spectacular concern to Wikipedia: anyone who gets a fact from Wikipedia that would be useful in a publication will double-check it as they would any fact from so general a source. (Surely?)

Imagine the following attack on Wikipedia: I go to an article, say, the Canberra article, and find somewhere to insert this fact:

The number of women cigarette smokers between 15–25 in Canberra is 20%, considerably lower than the Australian national average of 35%.

Of course, I’d have to be a little bit cunning to insert this fact into the existing article, because it doesn’t contain any facts about the population, let alone such precise ones. But assuming I could do that, it would make a plausible if boring addition. It would also be, as far as I know, false. While I think I’ve got the number of female smokers in Australia in that age group right to within about 5% (it got a lot of press coverage about five years back, because other demographics aren’t smoking in anything like those numbers), I’ve got no reason to believe that Canberra deviates from the norm in any way.

What harm would this do? Well, it’s possible a bunch of kids would copy it into their school assignments on Canberra (you poor little things, Canberra?) and get caught cheating, less because the fact is wrong and more because no kid puts that kind of fact in an assignment that they aren’t copying wholesale. University students doing assignments on health statistics might get done in by Google, although who knows, if they actually cite it they might be in the clear.

So that’s fairly minor harm. The potential major harm is in reducing the reliability of Wikipedia as a source to the extent that all that work is wasted or that people write off collaborative non-fiction of this kind as impossible. I contribute to that harm by a very small amount in this particular case, but I would contribute quite a large amount if I had an axe to grind against Wikipedia and decided to be smart about hiding my identity and insert 1000 of the things into different articles. With 20 friends to help I could do a lot of damage.

Internet software has a particularly bad case of a similar problem: there is a large and powerful group of people who are very interested in abusing your software for a number of reasons, ranging from being able to commit fraud using your computer to attacking some IRC server whose admins kicked them off.

Wikipedia has less of a problem because false information in it has less marshallable power: you have to wait until nebulous social factors pick up the information and start wafting it around rather than being able to tell your virus bots to go out and memeify. Hence attacks on Wikipedia tend to be the insertion of spam links taking advantage of its Google juice (well, I presume they get them, Wikitravel sure does) and presumably edit wars between authors rather than determined attempts to undermine it.

The only real reason to insert subtly false information into Wikipedia is that you like being nasty, or, to put it a different way, that you honestly believe “insecurities” in social systems are just like insecurities in software systems, and you’re on a crusade to ‘re-write’ society so that the kiddies can’t hack it. Or to be generous, you want to give Wikipedia a chance to show that it can defend itself, although applying the “what if everyone did that?” test doesn’t make that look so good either. (Societal systems will break down once a critical point of disorder is reached, and since the fix for this is hardly trivial, the “doing them a favour by demonstrating flaws” argument doesn’t hold nearly as much water as it does for attacks on software.)

Anyway, given that, I thought I would consider it in the light of other heuristics for asserting facts: print heuristics, to the limited extent that I know them.

Take for example my first year university modern history course. As a rule of thumb, you don’t assert facts without justification in a history essay. Almost every declarative sentence you write will be accompanied with footnotes showing where you got your facts and arguments from. There is the occasional judgement call to make, because a sufficiently well-known fact doesn’t need citation. (To give examples on either side of the line: the fact that the assassination of Archduke Franz Ferdinand occurred on the 28th June 1914 would not require citation, but casualty figures for the battle of the Somme would, and an argument that the alliance system in Europe made WWI inevitable would require a small army of superscript numbers.) Given that though, you exhibit your sources, you check your sources where your argument relies on them (ye, unto the tenth generation) if you don’t want to get caught out, and the worth of your argument rests on the authority of your sources.

That did actually matter in first year by the way: the most common mistake made in WWII essays was sourcing Holocaust information from the web, which apparently — no, this isn’t a story from personal experience — means you run a high risk of relying on Holocaust-denying websites. (The alternative is that those essays were all by young Holocaust deniers, but given the number of people whinging that the course was insufficiently Marxist I think my classmates’ ideologies lay elsewhere.)

Now authority gets murky and geeks want the numbers here. But secretly, as Martin Pool points out, “humans are cunningly designed to do trust calculations in firmware” (yes, even people who trust their conscious mind more than their firmware). You can also see Dorothea Salo on heuristics for evaluating source reliability.

Of course, encyclopedias have different standards, because otherwise you’d get a bibliography that stood twice as high as the encyclopedia stack. (Less of a problem on DVD or the web, mind you!) I believe the system is that they rely more directly on authority: rather than sourcing the article’s facts from authority, you get an authority to write the article. Wikipedia can’t go this way, so they are left with two choices for establishing authority: have a really good reputation and a bunch of caring people, or more citations.

Citations are my secret shame, by the way, once you get used to following them and discovering interesting background information you get addicted. I wouldn’t say no to more citations on Wikipedia. (Take Franz Ferdinand for example — since we all knew I had to look up that date in a couple of places — “[n]o evidence has been found to support suggestions that his low-security visit to Sarajevo was arranged […] with the intention of exposing him to the risk of assassination” huh? Well, it would be interesting to read about who made the allegations and about the hunt for evidence, yes?)

Compare the issue of fact checking in software.

Although it has been argued (not, I think, by Eric S. Raymond, but by people extending his “many eyes” argument) that security problems in Free Software ought to be discovered very quickly because of sharp eyed people reading the code, bug reports tend to be inspired by the code not doing something you want it to do. (Except in the case of Metacity, which I think would need to display an obscene and anatomically implausible insult to me whilst running around in the background deleting all my files after emailing them one-by-one to my worst enemy before I’d believe I’d found a bug in it.)

This is a similar problem to that of finding dull smoking statistics inserted into Wikipedia by attackers or simply by authors who got their facts wrong: the less your bug displays itself to non-hostile people in the course of their usage, the less likely it is to be reported. It’s even worse, in fact, because while I can’t see any real reason for hostile people to search Wikipedia for false facts aside from saying “hah, take that insecure Internet, back to the World Book!”, there are a lot of reasons for hostile people to search for holes in your software.

I think the argument for relying on authority is less good too. Being an authority on history involves having mastery of an enormous number of facts and a considerable number of arguments, together with a nose for intellectual trends, some good mates, and a lot of cut-and-thrust. But while code authorities write a lot of code, and understand a larger amount of it, they don’t by and large earn their status in a competitive arena where their code is successful only if other people’s code is wrong, incomplete or doubtful.

This has a considerable advantage in production time, as anyone who’s familiar with existing tools intended to introduce proofs of correctness knows. However, I think it has a minor cost in that there’s nowhere near the same incentive to engage with and criticise other people’s code, unless you’re a criminal. Of course I know that there are code auditors who aren’t searching for criminal opportunities; however, the professional incentive for everyone to do it is lacking.

And in some ways, Wikipedia also lacks this professional incentive. In a volunteer project rather than a cut-throat professional world where everyone is fighting for tenure, the incentives for fact checking are less. It’s essentially reducible to the old problem about getting people to write documentation: you can pay them, you can hurt them (this is how you get people to report bugs) or you can make it sexy. Ouch.

Idle notes

Idle notes

An interesting thing about time zones is that they’re stuffing up my web surfing habits. Because I read a lot of stuff written in Europe and the US (I was going to say ‘disproportionate amounts of stuff’, but given where the English writers of the world live, not so), I’m used to it all being written overnight so that I can do all my reading during the morning.

Now that I’m in Spain, I’m actually 1-7 hours ahead of most of the writers, so I have to wait all day for their content to dribble in. I prefer the Australian setup.

What I’ve been working on

Precious little, much of it Wikitravel, which is sort of silly, but sort of not, since it’s still pretty hot in the middle of the day in Palma.

What I should be working on

  1. Updates to the Twisted Labs website pending the 2.0 release.
  2. A report for Diego.

Rants; Curiosity

Rants

  1. Wiki markup is great if you only have to learn one version of it. However, there isn’t only one version of it; many wikis have their own subtly different form. I don’t know why this is upsetting me, I already know about five text markup formats, so what do another few matter?
  2. Switching hosting providers always lands me with a worse hosting provider. Now that I’m down to 70 or 80% uptime, I’m considering settling.
  3. Every serious Free Software Thinker in the world can tell you why they hate every Free(-ish) creative work licence out there. The only two licences that anyone seems happy for creative people to use are the GPL and the BSD licence. Everyone understands code licences and they are good, what hey?
  4. Advice sucks. I’m considering adopting a personal philosophy involving only giving people advice in times of an immediate life threatening emergency. I am failing badly.
  5. I should never have promised that I’d work on my holiday.

Curiosity

Why are people so keen to pass on GMail invites? I don’t recall constantly being asked if I wanted a Livejournal account back in the days when those were invite-only. I keep looking at the invite spooler though; there’s something about the graphs. Andrew was really disappointed that they weren’t selling them, because he had a Buy-Sell graph nostalgia attack on seeing it.

Ubuntu Linux

Ubuntu Linux

Since Ubuntu Linux has just had a public preview released (ISOs here if the download page still has broken links when you read this) I thought I’d comment on, well, whether or not you’d want to use it.

Point of view: I’m a professional software developer (computer science oriented) with enough sysadmin capabilities to run a home or small office non-critical network. So I’m not your mother or father or whoever it is in your life you use to gauge “is Linux ready for the masses yet?” by. On the other hand I’m notoriously unlucky with hardware and I hate low-level system configuration (like PPP config files) and every time Linux makes me learn a new gotcha these days I get cranky. I have my little tools (mutt, vim, Firefox) that I configure endlessly, but everything else just needs to work.

So with that in mind, here’s why you might like Ubuntu if you’re someone like me (obviously, if none of this applies to you you’ll need to rationalise it yourself):

It’s Debian-like. It has apt, aptitude, dpkg. Its universe archive even has Debian main in it, pretty much. (Technically you aren’t meant to use Debian sources too, but I’ve been sneaking contrib and non-free in.)
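For the curious, the “sneaking” amounts to an extra Debian line alongside Ubuntu’s in /etc/apt/sources.list; something like this (mirror hostnames and release names are illustrative, not a recommendation):

```conf
# /etc/apt/sources.list
deb http://archive.ubuntu.com/ubuntu warty main restricted universe
# Unsupported: Debian contrib and non-free, snuck in alongside
deb http://ftp.debian.org/debian sarge contrib non-free
```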

Except it’s going to be released every six months. You will have noticed by now that that is not how Debian works.

It has GNOME 2.8. It is apparently going to track GNOME releases fairly closely in following releases too.

The installer is not yet pretty, but it is simpler. Mind you, Debian’s will be soon too. But this installer is pretty step-by-step.

X will be configured for you. Well, most likely it will be. It was for me. And wasn’t that lovely after endless years of being asked for my horizontal sync ranges or whatever they are, and never once, during that time, having anyone ever give me a monitor manual.

Some other nice things for me personally were: having the ipw2200 firmware in the kernel distribution; having the first user given sudo access and being put in all the right groups automatically; and… actually I think the rest of it is GNOME 2.8 stuff, like having sftp:// URLs work.

In summary, if you’re a desktop Debian user, particularly a GNOME user, Ubuntu is worth looking at.