My X pain

Since it’s very unlikely that I will see an improved X.org for me in Ubuntu Hardy (only showstopper bugs will be fixed now that the CD images are building, and in any case, 4 days seems like an unreasonable timeframe) and Bryce Harrington wrote about filing good X bugs the other day, I thought I’d make some notes on why mine is a pain. Note that Harrington’s timing is good, but he’s not talking directly to me. And none of this will be surprising to people who regularly work with bugs. There’s often one that can only be triggered on a rainy night after moonrise by a user with malaria, and you only have this one guy saying it happens all the time for him, who completely and uselessly fails to mention that he lives near a swamp in a tropical country, and besides, he can’t see moonrise through the clouds. Plus, the fevers confuse him.

It’s not very good form to blog about a bug without reporting it, but this one does not meet my ‘reportable’ threshold, which is that if someone asks ‘does blah fix it?’ I need to be able to answer in a fashion other than ‘I have no idea’.

The form that my bug takes is similar to bug 194214, except that it does not prevent me from using Ctrl+Alt+Backspace to restart X. My system will get into a state where various things happen on some desktops and different things on others (sometimes, or perhaps always, actually in different windows; since I run them maximised it’s difficult to tell). On one, the Shift key will often be permanently pressed and the mouse unusable. On another, clicks on the window that appears to be in front will actually go to the window behind it. On one, Alt-Tab will switch windows, and on another it won’t. All of these may happen simultaneously. Normality is restored only by restarting X.

The major problem is that I have no real way of deliberately triggering it. I have the impression that it is something to do with using the mouse and the keyboard together, but I do that all the time, and my brain does not appear to be set up to recreate my muscle-memory hand movements, not even immediately afterwards. For example, it took me ages to realise that the occasional sudden and annoying appearance of GNOME Help Centres is due to accidentally pressing F1. Unusual keyboard-triggered events on my system are almost always accompanied by ‘help, what did I press, what did I press?!’ and frantic peering at my keyboard. (Being a touch typist except for my right little finger sucks here.) I don’t have a chance in hell of intentionally recreating whatever scenario is going on here unless I get Andrew to stand beside me while I type and do nothing but watch. He does have that kind of patience, but he chooses when.

Using git-bisect or similar from Harrington’s entry is thus going to be extraordinarily tiresome. ‘I used version such-and-such for 5 days, and there weren’t problems, so on the balance of probabilities, this version doesn’t have the bug in question’ seems unscientific. Well, unless I work out the frequency a little better and where my 95% confidence margin is. Which means I might also want the distribution of the occurrences. A slightly better method probably consists of sitting down with the X.org codebase and looking at the diff for each revision for likely-looking changes. (Comments like ‘and this is the bit that definitely for sure makes clicks go to the window with focus. FOR SURE!’ would be helpful there.)

PS Standard ‘not really looking for advice’ disclaimer applies. Unless you happen to have an Ubuntu or X.org bug number to hand. Or well, if you happen to know a lot about reproducing input problems in X.

Working with code

Riffing off Jonathan Lange on writing good (and reviewable) code (by the way, the filenames are a bit obscure, as in I didn’t know review comments tended to be passed around in .diff files, but here’s the 121kB review in question), a point I’ve been meaning to make somewhere, anywhere, for a while now: making large codebases grep-friendly is important to me. Specifically, when I enter a large codebase without any intention of becoming familiar with it (ie I’m fixing a bug) I really really really like to be able to grep for function names I’m coming across. (Really, I should probably use slightly more sophisticated tools, but they work on more or less the same principle.)
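The whole technique, such as it is, assuming the thing is defined under the same name it’s used by (the find_def helper and the Python-flavoured ‘class’ pattern are just illustrative):

```shell
# One grep from call site to implementation; this only works when the
# codebase defines things under the names they are used by.
find_def() {
    name=$1; tree=$2
    grep -rn "class $name" "$tree"
}
```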

I was digging around in the Nevow codebase a couple of years ago for goodness-knows-what (actually, I think I know, and I really should look for it again, but that’s not relevant here) and there were factories, factory-factories and factory-factory-factories, or something like that. It was very difficult to find the implementation of any of the many, many objects that were slipping around my hands, because they were implemented with a totally different class name, or several of them, and then sort of assembled from pieces. Damn, it’s hard to fix bugs in code like that.

Ubuntu 8.04, status of

I felt a little bad writing this, because, well, Andrew tells me Ubuntu 8.04 (Hardy Heron, release candidate due very shortly and public release on April 24th) is very nice for him and in general I’m not much in the mood to stomp on people’s work. However.

Ubuntu Hardy is really not looking like it will be a good release for me. It contains several serious regressions over the previous release on my hardware. Some of them are likely to make this the least usable Ubuntu release for me ever, on my Dell Latitude 630M. (On my university desktop it’s fine, but I don’t place as many demands on that.)

Major problems:

  • As with the beta, every few days when I am using X, particularly when using my trackpad mouse at the same time as my keyboard, the Ctrl key gets virtually ‘stuck down’ and I have to restart X with Ctrl+Alt+Backspace to be able to use my machine again. (There’s not a lot of useful things you can do in X when it thinks that you’re holding Ctrl down.) I thought this was bug 194214 but it probably isn’t, since 194214 is (a) fixed and (b) much more hardcore than my weeny bug. I look for it on Launchpad occasionally, but it’s discouraging because I have no idea what the correct search terms are, and also all bugs about keys getting stuck get marked duplicates of 194214 anyway.
  • The new wireless driver, iwl3945, for Intel 3945 is Free Software, but that seems to be the single good thing that can be said for it. Most of the regressions over the ipw3945 driver have slowly disappeared (although you need to install linux-modules-backports): the LED works again and I can finally use the kill switch. But it’s not very stable. It causes some woes shutting the machine down and it just caused the first ever kernel panic I’ve had on my day-to-day machine. (I didn’t file the bug on this as Andrew thought that probably the details would end up in syslog. They didn’t, and so now I have no details.)
  • Resuming from suspend is broken for me about one time in thirty (thus forcing a full reboot more than once a week, as I suspend like it’s going out of fashion), which is about ten times worse than Gutsy on this hardware. I reported this as bug 217461 and now I’m trying to find out what the cause is by saving pm_state. (Wow, it’s an annoying debugging process on a user machine: my system really does not like having its time reset to random things a lot. I am not sure I have the constitution to be a kernel hacker, but I guess if I was I wouldn’t usually be debugging kernels on my work machine. Much.)

Between all of those, well, gar. Using Hardy will equate to sudden and necessary reboots more than once a week for me. And I need to muck around with hdparm again to try and sort out my system’s version of bug 59695.

Stuff that’s been, or is going to be, fixed for release (at least enough for me): the wireless kill switch, the ignoring of many fsck errors, the insane timezone selection by the GNOME clock, the incompatibility of Java 1.4 and Gecko (although this was solved by just removing Java, which bothers someone), the misnaming of some wireless interfaces (this stopped wireless working after suspend for a lot of people) and the wireless LED.

One thing that possibly won’t be: sound problems with programs that do/don’t use PulseAudio, but that is more of a nuisance than a real problem for me, since I mostly use my Squeezebox for sound.

Finding out about locked LiveJournal (InsaneJournal etc) posts in your normal feed reader

This feature is rather old but I just found out about it. If you have a LiveJournal or SimilarJournal (InsaneJournal at least) and you have access to ‘friends-locked’ posts but you’d like to be able to pick them up in your normal feed reader, you can very likely use a URL like this:

http://YOURUSERNAME:PASSWORD@THEIRUSERNAME.livejournal.com/data/atom?auth=digest

For communities use the community URL followed by /data/atom?auth=digest. Similarly for LJ-like sites with their own URLs. You can also replace atom with rss if you are that way inclined.

This is described in FAQ 149 on LiveJournal.

If using a reader that is implemented in Python (eg Planet and Venus, rss2email) you will probably need to make sure you are using Python 2.5 underneath rather than any earlier version of Python, as Python’s urllib2 did not work with LiveJournal’s Digest-Auth implementation until just after Python 2.4 was released.

If this doesn’t work in your reader, the feature you want them to add and test is called ‘HTTP digest authentication’. There may also be some alternative way they want you to put in ‘YOURUSERNAME’ and ‘PASSWORD’.
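If your reader can’t do it but you can script, curl speaks digest natively, so something like the following sketch (all three names are placeholders, as in the URL above) fetches the locked feed without putting the password in the URL itself:

```shell
# Fetch a friends-locked feed over HTTP digest auth. curl answers the
# server's 401 challenge itself; the three arguments are placeholders
# for your username, your password and the journal you are reading.
fetch_locked_feed() {
    you=$1; pass=$2; them=$3
    curl --silent --digest --user "$you:$pass" \
        "http://$them.livejournal.com/data/atom"
}
```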

Ubuntu 8.04 (Hardy) Beta bugs wrap-up

This is in my usual tradition of going over the bugs before an Ubuntu release. It’s a good time for Ubuntu: if I’d done this even last week I would have been more annoyed.

The most annoying bug of all has been around for quite some time, and may not be an Ubuntu bug at all, but rather a BIOS power management or disk firmware bug: bug 59695. This would be what cost me my last hard drive, which failed at a reported 1.7 million load cycles. The new drive is already at 2257 load cycles, which is a bit high as well, although I’ve tried various workarounds. The trouble with this one is that, firstly, the power-saving settings of the drive appear to be totally opaque. Just try some numbers! Watch for high temperatures, although we don’t know what ‘high’ is for your particular drive, nor can we tell you how to work it out! My drive will probably last longer now, but the laptop is painful to touch after a while. Great.
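For what it’s worth, the workaround most people land on for this class of bug is raising the drive’s APM level so the heads stop parking constantly; a sketch (the device name is an example, and whether the setting survives suspend varies by drive):

```shell
# Set the drive's power management to its least aggressive level (254)
# so the heads stop parking every few seconds, then read SMART
# attribute 193 (Load_Cycle_Count), the number that was climbing.
# /dev/sda is an example device; run as root.
calm_drive() {
    dev=$1
    hdparm -B 254 "$dev"
    smartctl -A "$dev" | grep -i load_cycle
}
```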

Bug 194214 is quite seriously annoying too. For me (and I don’t use Compiz) it manifests as my Ctrl key being virtually ‘stuck’ down, so that if I press ‘q’ applications terminate, ‘d’ closes my shell, ‘Page Up’ switches tabs, ‘c’ sends SIGINT etc. An X restart is required to restore normality. The community is on this though, the report is quite impressive and git bisect has been used.

Bug 193970 is a regression from a user’s point of view, if not a programmer’s (for the latter, the important distinction is that it’s not actually the same software). The problem is that Ubuntu is now using the Free driver for the Intel 3945 wireless chipset (iwl3945), rather than the other one (ipw3945). iwl3945 doesn’t really support the ‘wireless off’ switch, well, at all, unless you consider rebooting after using it an acceptable solution. This isn’t bugging me much at the moment because I’ve learned to leave wireless on all the time, but it is annoying when I take my laptop on trains and similar and the battery drains needlessly fast. It looks like this will probably survive to the release, I will likely switch back to ipw3945 for the duration of Hardy’s lifetime.

Bug 204097 (which may be a duplicate of something else) probably cost me some data in the hard drive failure. Essentially they’ve decided to put a nice wrapper around fsck (the filesystem checks) that does not handle a check failure at all. It just reboots and tries again. And again. And… you get the idea. Nor does it inform you that this is even due to a failure. You have to guess, and boot into recovery mode yourself. Of course, this is a hard one to solve correctly, because a typical desktop user is eventually going to be told ‘and now you have to do something very weird and difficult’. But the ‘just hope it doesn’t fail the next boot’ thing is weirder.

Bug 185190 is just mystifying. Essentially the GNOME world clock programmer has decided that it is really hard to work out programmatically what timezone a city is actually in (and it is, you try it) and so they’ll just guess based on the longitude. Fortunately this only fails for very minor, unheard-of cities like Beijing and St Petersburg. Oh, and a bunch of major North American cities, which I genuinely am surprised is considered acceptable.

The major fix I’ve noticed is that bug 153119 (microphone was more or less useless on my laptop due to very soft volume) seems to be gone. This one surprised me since there was no response to the bug itself. update-manager is marginally better too I think. NetworkManager is fine, but I’m not using the Hardy default packages… Suspend and hibernate both work, that’s a shock this far out (a month) from a final release.

Building an online shopping site for which I will not kill you

I’ve done a fair bit of shopping online recently, and I’m just about ready to kill everyone. Here’s how to not be on the hit list:

  1. Let me compute shipping costs without having to give you my name, full address, email, phone number and chosen password. If you’re an Australian site, you can work it out from my cart and my postcode. If you’re international, my cart and my country. I know full well that you make a ludicrous amount of money from ‘shipping’ and I’m factoring it into my price comparisons. I’m getting to the stage of assuming the worst if you make me sign up before revealing shipping costs and I’m bypassing your site. No really, I am, I’m not buying from you any more, because 5 minutes signing up is 5 minutes too long.
  2. If you sell electronic equipment, be upfront about whether it’s grey market and if so, where the warranty holds (ie, in the event of failure, do I have to have it couriered to some other country to be fixed, or is it an Australian warranty?) If it’s in any way unclear, I am also assuming the worst.
  3. Ideally, debit my credit card at the time of shipping, rather than at the time of placing the order. (I’m usually not big on waaa waaa doing business is hard, that’s why it’s hard to buy from us, but you should anyway because capitalism means you have to buy things but I’m kinda sympathetic to avoiding credit card fraud, so often you get a pass here.)
  4. If you only have a web form for customer contacts, make damn sure it works. And by works, I mean the mission-critical kind of works too, because it sure is annoying if the last POST-PAYMENT phase of the order fails and I can’t contact you about it.
  5. When I do file a support request through your web page, how about automatically emailing me a copy of it, ideally with some kind of tracking number? That way I have some reasonable assurance that it at least made it as far as your server. If you put up a generic ‘Thank you for your request, we will eventually respond’ page, I don’t know if my actual request got through. Especially if a payment just failed…
  6. I know you get a lot of dumb support requests. But please, please, don’t put up that page, you know, the one that goes man, you guys sure are dumb. This website is infallible and yet all the time waaa waaa waaa customers can’t order because customers can’t read our info even though they somehow were literate enough to apply for a credit card. Don’t blame us if we’re cranky about your dumb complaints. You’re lucky we even have this non-functioning web support form up at all. Because one day your third party gateway will ACCEPT A CREDIT CARD and your clever clever system will fall over before inserting the associated order into your database and then your contact form will also fail and then you will look rather stupid for talking to the customer about how dumb they are, won’t you?

I suppose I should mention some good systems here, shouldn’t I? OK. If you buy contact lenses through Net Optical Australia you’ll get so much feedback from them about your order status that your mail server might keel over. Glasses Online is fairly good in that respect too (they’ll even phone you to double check your weirdo prescription, if you have one). The good people behind the Nine Inch Nails album were very fast with their help despite however many million calls for help they were dealing with.

I should mention the bad system here too, but my understanding of Australian libel laws make it kind of dangerous. How about you guys just take the chargeback on the chin and we’ll call it even? (I will say though that unfortunately having clear-cut statements about shipping costs and warranties apparently does not totally correlate with a functional ordering or support system.)

King of Copyright

I saw The King of Kong today, which I recommend, especially if you haven’t heard anything about it aside from this, as I suspect it is unduly affected by spoilers. Afterwards, go read about the disputed facts on the Wikipedia page.

At the beginning of the movie there was an ad about respecting copyright. Not the you wouldn’t STEAL a car one with the heavy handed soundtrack, but one in which burning discs is portrayed as equivalent to burning the local film industry, in the form of a Happy Feet poster. In one way this is quite effective: do you support setting fire to neotenous penguins? No, you don’t. In other ways it’s less effective, in that the Australian public is notoriously indifferent to the local film industry. It’s not clear that they can blame downloading for the death of the industry any more than blacksmiths could. Or sellers of cockroach-flavoured ice cream.

In any case, this was shortly followed by a trailer for Be Kind Rewind in which what is essentially fanvidding gets a sympathetic portrayal and The Man is evil (he always is). I figure there’s got to be more scope for hypocrisy here. You could do a sweet little romcom about a guy being busted for downloading the awesome music of a little-known band, regardless of the hundreds of dollars he has spent attending their every gig. He falls in love with a woman who is supposed to be his smart, hard-nosed IP lawyer and yet somehow on screen is always a little flighty, clumsy and generally in need of the help of a good man. The band plays at their wedding, and the bootlegs are so popular that they break BitTorrent. The couple live in some well lit loft at the centre of a major city without any evidence of having their lifestyle funded by a serious inheritance, and the band become billionaires through T-shirt sales. And you put the penguin arsonist ad before it, of course.

I survived the libc breakage of 2008

I didn’t get a lousy t-shirt, but I suppose I could make one. And put lice in it.

There have probably been at least a few people who haven’t heard of this, so: if people were unfortunate enough to be running Ubuntu Hardy (the development version that will become Ubuntu 8.04 in late April) yesterday and upgraded libc6 to 2.7-9ubuntu1, their system broke. It broke immediately, if they (like me) were running the upgrade process under sudo, because sudo started refusing to launch dpkg, and segfaulting.

There are a bunch of solutions to this, none of them particularly obvious to anyone who hasn’t had a broken libc6 in the last little while (the whole thing took me right back to when Debian unstable lived up to its name and used to do things like break lilo a lot; for that matter, it took me back to when I used lilo). Hrm, this is going to be one of those things, isn’t it, where ‘none of them particularly obvious’ is going to be read as ‘help, I haven’t fixed it, please send advice’? Don’t you worry, I have fixed it on my machine. If you’re actually affected, go try the solutions described at the top of the relevant (now fixed) bug, bug 201673 (note that I had to also reinstall libc6-i686, which not a lot of people seem to mention). If you don’t want to be affected, either don’t be using Ubuntu Hardy right now, or, if you are, don’t upgrade libc6 or anything libc-like to Ubuntu version 2.7-9ubuntu1 (2.7-9ubuntu2 is fixed though). It should be more or less past now, but maybe some mirrors won’t update for a little while still, I guess. The bad version was marked unreadable in the main package archive, which was probably not exactly useless, but not as helpful as +1 claimed either, because the mirrors, including the official mirrors, were still distributing the broken package merrily enough.

Anyway, there were a couple of interesting things about this, from an observer’s point of view. The first is that every second person in the forums and half of +1 had their own personal solution to the problem, most of them correct but some of them harder than others (dpkg -x makes needing to pull the different .tar.gz files out of a .deb unnecessary, and chroot /mnt dpkg -i is even better as it means that your package database will be up to date with your downgrade). I suppose there’s some bias here: the people who followed already posted solutions (like me) just didn’t contribute. It left a thread with the impression that there were about twenty potential fixes, all of which you’d have to try because the earlier ones hadn’t worked for some people, whereas in actual fact the earlier ones hadn’t been tried by some people.
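The chroot variant, roughly, for the curious (this is a sketch run from a live CD with the broken root mounted; the device and package filenames are examples only, see the bug for the real instructions):

```shell
# Downgrade libc from outside the broken system. Going via chroot +
# dpkg -i, rather than unpacking tarballs by hand with dpkg -x, keeps
# dpkg's database consistent with the files actually on disk.
downgrade_libc() {
    root_dev=$1; deb=$2              # e.g. /dev/sda1 and a known-good .deb
    mount "$root_dev" /mnt
    cp "$deb" /mnt/tmp/
    chroot /mnt dpkg -i "/tmp/$(basename "$deb")"
}
```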

The other is the in-practice failure of claims along the lines of ‘[as] is prominently mentioned in a number of places, you should not be using pre-release versions of Ubuntu unless you are comfortable dealing with occasional (even severe) breakage’. It’s only a partial failure: there are plenty of people who heed this and don’t whine (publicly) when development releases break. But there are certainly a lot of people who don’t, and while you can chide them, you also still have to handhold them through fixing their system afterwards. In fact, it might even be a slight problem: the only people who heed the warning and don’t install development releases are the kind of people who read warnings and heed them, people who by inclination aren’t noisy and whiny. So you get a pretty whiny testing base (on top of the people who are seriously involved testers).

I tend to test pre-release Ubuntu and file bugs because if I don’t, I can be stuck with broken hardware or regressed software for an entire six month official release cycle (breaking hardware is more of a laptop thing). After my experiences with Hardy though I think I’ll go back to my old policy of upgrading at the beta release, rather than before it. Beta should be March 27, then you can find out about all the remaining bugs that are still seriously annoying me.

BitTorrent

I made noises yesterday that I might learn about BitTorrent. So I tried. (It’s an interesting protocol from the point of view of needing clients to enforce penalties against refusing to upload.) Here’s the paradigm I wanted it to fit into: at, the command scheduler. 95% of my readership will know (intimately) how residential broadband works in Australia, but for those who don’t it is typical to not have unlimited downloads. On a good, just above entry level, plan you might have a limit of about 4GB to 10GB a month. (The entry level plans tend to have a limit of about 512MB or 1GB for only about $10 to $15 less. This is like selling small amounts of an addictive substance. Someone will eventually tell you about Google Earth.) However, it is also very typical to have an off-peak period with a higher, perhaps even unstated, bandwidth cap. (Mine is 48GB in a month, in the midnight to noon period only.)

Since I run a headless server anyway (for mail and music), I vastly prefer to schedule my downloads for when I’m sleeping and bandwidth is cheaper. When I’m going to download something available on the web, I run ‘at midnight’ and then give it the command wget --limit-rate=something -q -c [url] (-c because I’ve usually tested about the first 100K of the download already). In the morning, assuming I’m not downloading an album from a very well known band, my file is there. I therefore wanted to do the same with BitTorrent.
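Spelled out, the scheduling trick is just the following (the URL and the rate are placeholders; pick a rate that leaves the link usable):

```shell
# Queue an off-peak download: at(1) runs the command at the next
# midnight, and wget -c resumes the already-tested partial file.
schedule_download() {
    url=$1
    printf 'wget --limit-rate=200k -q -c "%s"\n' "$url" | at midnight
}
```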

I’ll cut to the chase here: the solution is Transmission’s command-line tools, particularly transmission-daemon and transmission-remote. The daemon controls all the torrents, and, usefully for my limited download window, they can be stopped and started from the remote command and therefore from cron. The only catch is that these tools seem to have only very recently matured, as in, they don’t exist in Ubuntu 7.10/Gutsy. (It has a transmission-cli package, but the daemon isn’t in it.) The transmission-cli package from Hardy will not install on Gutsy either, but you can backport it without any hassles, assuming you know how to get hold of and build Ubuntu (Debian-style) source packages, which, granted, probably doesn’t apply to 95% of my readership (although it might apply to a majority once we count Planet Linux Australia).
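In crontab terms the window looks something like this (flag spellings as per recent transmission-remote versions; check --help on yours, and note this assumes the daemon is already running):

```crontab
# m h  dom mon dow  command — start everything at midnight, stop at noon
0 0    *   *   *    transmission-remote --torrent all --start
0 12   *   *   *    transmission-remote --torrent all --stop
```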

I thought this was worth sharing though, after I spent hours mucking around with BitTorrent (the Python client, not the rebranded µTorrent) and BitTornado, neither of which, even in the headless versions, has much support for such selfish notions as not wanting to run it until such time as it’s good and done (you can send SIGTERM and they do resume cleanly on the next invocation, not exactly champion of the world on design though), and neither of which puts up with this old-fashioned nonsense of wanting to run a process without a controlling tty. BitTorrent doesn’t even have an inkling that you might want to limit download speeds to anything less than maximum (perhaps Bram Cohen has never lived in a house with anyone else who wanted to use his net connection). I ended up looking at Transmission because the GUI version (also apparently very nice) is now Ubuntu’s default BitTorrent client. (Clutch is allegedly a nice web interface for Transmission too, but I haven’t looked at it at all.)

Edit: this post is not a bleg. The last paragraph describes a problem, yes, but this post is about how I’ve solved that problem by discovering Transmission. Please, there’s no need to email me on getting BitTorrent, BitTornado or rTorrent to work in screen or similar. I’ve discovered Transmission, which doesn’t need screen and which is very cron-friendly. This post is intended as a positive review of Transmission, not a request for help. End edit.

The brave new world

I liked the samples of the Nine Inch Nails Ghosts I-IV albums well enough, and I like FLAC well enough, that I figured I’d just buy the whole thing (USD5) rather than go with the free tracks. It’s what we’ve all been waiting for, right, FLAC without needing the physical CD?

Cut to three days later. I can only download the thing between midnight and noon, thanks to the bandwidth needed (it’s a 600MB download). I have now downloaded it something like ten times, at least in part (about 4GB of downloads and counting now). For a while they didn’t have the bandwidth resources. Now they do, but the ZIP file is corrupt or something (and not just on picky Linux systems, I can’t unpack it using Windows Explorer or WinZip either). The download almost always cuts out at about 300 MB anyway and for some reason their webservers do not do 206 Partial Content, so down it all comes again every time I request it.

It turns out the brave new world of CD quality downloadable albums must be a while away yet. One thing that would have been helpful would be allowing me to download each track individually. Even leaving aside the corrupt ZIP file issue (their tech support has not replied yet) at least I’d have most of them by now, and not be slamming their servers with several requests for a 600MB file per night as wget keeps retrying.

(Edit: apparently the whole album is CC BY-NC-SA anyway, so it can be obtained via torrents already, and one day I’m sure I’ll even learn how to use them. It’s taking my firewall settings up and down that sounds like a nuisance.)