PhD bubble

I have that thing all PhD students hate and fear to speak of, that is, a submission goal. I aim to submit my thesis no later than October 2009.

This means that until then I’m doing the same thing I did in my honours year in 2003: cutting back on random accumulated cruft in my life. That includes but is not limited to volunteering for committees, talks and organising events. It will definitely mean less time shooting the breeze on IRC or IM: I intend to try and be on them only when I have something to accomplish. I’ll cut blog subscriptions and twitter/identi.ca subscriptions back shortly too. I’d cut mailing lists, but my mailing list subscriptions never actually made it back from 2003/2004.

I’ll probably also be trying to cut down on social and semi-social commitments: as I’ve said elsewhere the number of them this year has been staggering. (You’re lovely people, all of you.) And exhausting: I can’t keep getting home after midnight three or four nights a week. I don’t intend to crawl into a hole, far from it, but I need to rediscover the joys of introversion, and not having my evening scheduled military style. So if you see me saying no to your things, that’s what’s going on.

If I’ve already volunteered for something with a firm scope, I am still doing it unless you hear otherwise. If it doesn’t have a firm scope, I’ll be in touch to firm it up. If I haven’t volunteered, I’m not hugely likely to. Not this year.

Incidentally, I’m not sharing this for accountability’s sake. If I need someone to sit on me and make me finish my PhD, I already have a mother. And you can be sure that she’ll be sufficiently displeased on your behalf if she does have to do that: she never spent much time making me do my homework before.

PSA: linux.conf.au domain

Apparently the linux.conf.au domain is dead and might be for a little while. Steve Walsh writes:

Subject: Re: [Linux-aus] linux.conf.au dead?

> whois linux.conf.au
> No Data Found

The admin team noticed this about 11am this morning and notified the
registrar of the domain, who appears to have expired the domain out of
their system. We're working on getting it back into the system ASAP.

Meanwhile, http://marchsouth.org and https://conf.linux.org.au are still
up and serving conf-y goodness.

Irritating news coverage: autism

I am getting fed up with science journalism, and so here I am, bound and determined to make you annoyed too. First, a general introduction to things that annoy me:

  1. reporting the results of a self-selected survey as a population-wide finding (if you survey readers of Australian Top Gear about how they feel about their female partners’ driving skills, and furthermore describe it as a manliness survey, unless that was the journalist, it’s going to be news if it doesn’t come up as ‘women drivers suck’, not if it does)
  2. when reporting on medical results, not reporting on exactly which population was studied

There are subtler examples that don’t annoy me so much because you have to have a bit of knowledge of the science in question to get underneath them. For example, consider studies where a bunch of sick people and a bunch of healthy people are asked questions like ‘do you eat eggs?’ and the result is reported as either ‘being sick is correlated with eating eggs’ or (more usually) ‘study links eating eggs to cancer!’. These are difficult to interpret because people who know they are sick tend to over-report (or alternatively healthy people under-report, or possibly both) anything that they think of as a risk factor, because they too are looking for an explanation of why they’re sick.

Anyway, today’s example is from door number two:

In about 70 per cent of the kids, we’re seeing that if they’re not responding to their name at 12 months, they’ve gone on to receive a diagnosis of autism, [Associate Professor Robin Young of Flinders University] said.

New test to detect autism in babies

A very important question follows. Was that:

  • 70% of a sample of children who have been diagnosed as autistic didn’t respond to their name at twelve months; or
  • 70% of a random sample of children who didn’t respond to their name at twelve months went on to receive a diagnosis of autism?

The difference is pretty important: the second is much more concerning to parents of children who aren’t responding to their name than the first one is. (To illustrate the difference, consider the statements ‘99% of women aren’t blonde’ compared to ‘99% of people who aren’t blond(e) are women’ in terms of the kinds of predictions you’re making when you’re told someone has dark hair.)
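To put rough numbers on why the two readings differ, here is a minimal sketch using Bayes' rule. Every figure in it is invented for illustration: the 70% is read as the first interpretation, and the prevalence and the rate of non-response among children who are never diagnosed are assumptions of mine, not results from the study or the article. The function name `posterior` is likewise just a label for this sketch.

```python
def posterior(p_sign_given_cond, prevalence, p_sign_given_no_cond):
    """P(condition | sign) via Bayes' rule.

    p_sign_given_cond:    P(sign | condition), the first reading of the quote
    prevalence:           P(condition) in the whole population
    p_sign_given_no_cond: P(sign | no condition)
    """
    true_pos = p_sign_given_cond * prevalence
    false_pos = p_sign_given_no_cond * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Hypothetical inputs, chosen only to show the arithmetic:
# 70% of diagnosed children did not respond to their name,
# 1% of all children are diagnosed (assumed),
# 5% of never-diagnosed children also did not respond (assumed).
p = posterior(0.70, 0.01, 0.05)
print(f"P(diagnosis | no response at 12 months) = {p:.1%}")
```

With these made-up inputs the answer comes out at roughly 12%, a long way from 70%. The real figure depends entirely on the two numbers the article does not report.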

I don’t know whether Associate Professor Young gave a sloppy quote here, was misquoted, or had a sloppy quote selected from a more informative interview/statement/press release.

Internet filtering proposal

Via Stewart Smith, a Computerworld article stating that:

Australians will be unable to opt-out of the government’s pending Internet content filtering scheme, and will instead be placed on a watered-down blacklist, experts say.

Under the government’s $125.8 million Plan for Cyber-Safety, users can switch between two blacklists which block content inappropriate for children, and a separate list which blocks illegal material.

Pundits say consumers have been lulled into believing the opt-out proviso would remove content filtering altogether.

Possible objections to this include but are not limited to:

  • general advisability of universal access to information;
  • optional filtering software already being commercially available to those who want it;
  • likelihood of universal filtering restricting activities of researchers and professional workers who need knowledge of illegal activities;
  • likelihood of universal filtering blocking access to legal information about (among other things) sexual issues and drugs;
  • likelihood of blocked material being expanded to include other things commonly not approved of by governments (criticism, activism, protest) and by people who most want a filtered Internet (pre-marital and extra-marital sex, religious and anti-religious material);
  • effect on Internet access speeds;
  • the need to identify oneself to some authority as someone willing to view adult-only material in order to be exempted from the larger blacklist;
  • low precision and recall of automatic blacklists (that is, they both block things they shouldn’t and don’t block things they should) and low recall of manual blacklists (they don’t block things they should).
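For anyone unfamiliar with the terms in the last objection, here is a minimal sketch of how precision and recall would be computed for a blacklist. The site names and both sets are invented placeholders, and `precision_recall` is a hypothetical helper for this illustration, not part of any real filtering product.

```python
def precision_recall(blocked, should_block):
    """Return (precision, recall) for a blacklist.

    precision: fraction of blocked sites that deserved blocking
    recall:    fraction of sites that deserved blocking that were blocked
    """
    blocked, should_block = set(blocked), set(should_block)
    correctly_blocked = blocked & should_block
    precision = len(correctly_blocked) / len(blocked) if blocked else 1.0
    recall = len(correctly_blocked) / len(should_block) if should_block else 1.0
    return precision, recall


# Invented example: the filter blocks four sites, but only two of them
# are among the five sites it was supposed to block.
blocked = {"a.example", "b.example", "c.example", "d.example"}
should_block = {"a.example", "b.example", "e.example", "f.example", "g.example"}

precision, recall = precision_recall(blocked, should_block)
print(f"precision = {precision:.2f}, recall = {recall:.2f}")
```

In this toy case half of what is blocked should not have been (precision 0.5), and three of the five sites that should have been blocked got through (recall 0.4).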

Assuming you deeply dislike this proposal, you might want to discuss why you dislike it in the letters you are right now sitting down to write to Senator the Hon Stephen Conroy, Minister for Broadband, Communications and the Digital Economy and Senator the Hon Nick Minchin, Shadow Minister for Broadband, Communications and the Digital Economy.

Then you can move to New Zealand, where copyright takedowns are required regardless of proof of copyright violation.

linux.conf.au 2009 programme

The programme for linux.conf.au 2009 has been announced. I was chair of the committee that selected the talks, and you can see how seriously I took my duties (evidently I did nothing but look slightly grouchy and earnest all day).

Some talks I am especially looking forward to are:

Writing helpful reviews

I outlined the style of good academic reviews to Jonathan in light of our impending OSDC review responsibilities, and it’s worth noting here too.

For information’s sake, my authority, such as it is, on reviewing comes from being the editorial assistant of Computational Linguistics, which is a journal with a hardworking editor and conscientious reviewers. Not all academic reviews are of the quality I discuss below. They should be.

Begin with stating the title of the paper you are reviewing. Then spend one to three paragraphs summarising its content, particularly what you perceive as its major findings and conclusions.

This has a couple of purposes. The first is that if the reviews have got mixed up in the system the author finds out as soon as possible and doesn’t have to slog through a review that (perhaps) is a partial match for their paper and (especially in academic circles) a privacy problem to boot. The second is so that they know in what light to read the rest of the review. If they see that you have understood its fundamentals they will be inclined to take the entire review seriously. If they see you have misunderstood it, they can do one of two things. One is to realise that their paper is confusing, and to make its focus clearer. The other is to discount your review. The decision here may be affected by the following section.

The main body of the review is a discussion of how to improve the paper. Both the tone and discussion will vary considerably depending on certain factors:

  1. is the paper already accepted?
  2. is this the only reviewing round or will you or another reviewer be checking the changes?

For OSDC, both factors hold. For almost all conferences, there is only (at most) one reviewing round for full papers. This makes reviews more limited in scope than journal reviews, where substantial changes are often recommended even (or perhaps especially) to articles the reviewer fundamentally likes. Journal reviewers can have a role which is not far from being anonymous co-authors. (If a colleague did as much re-reading and suggestions of additional work and additional reading as Computational Linguistics reviewers do, many people would consider adding them to the authors list.)

In the event that the article has been accepted, or that this is the single reviewing round, you should limit the scope of your suggestions to much more cosmetic things. Someone who has had an article accepted is just going to be annoyed that you want it to have a whole new body of work incorporated, and they will ignore you. (And if it’s rejected after a single reviewing round, they are probably ill-placed to revise much!) In the OSDC scenario, reviewers are going to be mostly limited to suggestions as to how to structure the argument and the paper better, and not really able to productively suggest changes to the argument or the work described in the paper.

As you write your review and this section in particular, keep in mind the key factor of providing useful critiques: how could this work be better on its own terms? That is, don’t provide a review that is, fundamentally, about how the paper would have been better if you’d written it… about your pet topic. This is a subtle, tempting and common mistake, and if you have never caught yourself in it, you are likely to be the worst affected. Remember: What is the paper trying to do? How can it do it better? Avoid the temptation to suggest that it would be a better paper if it was doing something different from its current aim. (There is a little more leeway for this in journal reviews, but even in that case, generally what happens if a reviewer thinks this is that they review the article in its current form and recommend a fate suited to its current aims, and additionally comment that they would be interested in seeing further work in the additional direction should the authors choose.)

As a recipient of reviews, I do have a couple of things to add. One is to respect page limits. If you are reviewing for a work with a page limit, especially a conference, and you do really want to see a longer discussion of foo, please suggest which bar could be shortened or cut. Otherwise it is close to impossible for an author to consider your suggestion. Also, if you are making suggestions for future work that you think the authors should consider but which you do not actually want to see in the article, make this clear in the text of your review. I would probably recommend a whole separate section for this if you’re going to do it.

A review may conclude with a list of typos, spelling mistakes, suggested rephrasings, etc. Mistakes that affect the reading of the paper (eg mislabeled figures and sections) go right at the start of this list. A sufficiently ill-proofread paper may go back with a suggestion that the authors find the mistakes themselves.

Back in 2001

It used to be that I couldn’t look back seven years. Then I could, but at least it was a different person I saw down there at the wrong end of the telescope. Now, I recall the me of 2001 and in most ways, she was me.

Google has their search index of 2001 up for playing. Crooked Timber is already collecting some fun stuff including searches for housing bubble and subprime mortgage lending. Instead, I would just like to observe that apparently 2008 wasn’t looking so exciting back then.

Using forums

I suspect that Ten easy ways to attract women to your free software project will do the rounds pretty quickly, but I at least hadn’t seen it until this morning. It’s a list of project management decisions you can make that would arguably make it more likely that women will be involved in some kind of development. They’re largely only ‘easy’ for a new project which is when many of these decisions are still open, but for what it’s worth, some of them are:

  • Use forums instead of mailing lists
  • As much as possible, use wikis instead of version controlled archives
  • Don’t discount what women do [‘what women do’ here used as ‘community management, documentation and similar activities’, via Geek chicks: second thoughts]

The justifications are reasonably lengthy and are in the linked article. I’m not going to comment extensively on what I think of the article, except to wonder about whether these are women friendly measures or people friendly measures. (A loose analogy: writing prose in such a way as to be accessible to non-native or non-fluent readers of your language is actually very helpful to native speakers as a side effect. I am of course hardly the first person to make the point that changing environments to suit women’s needs may suit men as well, it’s commonly made for workplaces.)

Instead, I want to talk about forum software. I can’t say whether women in general might prefer it, perhaps they do, but gosh, what a pain in the neck. I would never be casually involved in a project that ran over a forum. (If I liked the project enough, I might be deeply involved.) Here’s what’s involved in using a forum:

  • Thinking of a user name (very few have a tradition of mostly using full legal names, like email does, which means finding one that is unused etc etc)
  • Thinking of a password, storing it somewhere for later use etc (I haven’t seen forum software supporting OpenID yet)
  • Picking some kind of avatar for myself.
  • Learning how to use this new piece of software, how do I search for things, how do I post new things, how do I reply to things, how do I find replies to my posts?
  • Having to use my web browser’s text input tool as an editor. Argh, oh, my hands, my brain!
  • Dealing with the inevitably poor accessibility decisions of web forum software. (I don’t know when people over the age of 50 will be targeted as the next under-represented demographic in Free Software development, but the time of the hyperopic will come.)
  • Not being able to deal with anything to do with the project when not connected to the Internet. (I am something of a last woman standing here, but I do a lot of offline email work, it’s quite productive to sit on a train and plow through it.)

In my email client, I either don’t have to deal with any of these things or I’ve overcome them already and I deal with them the same way for every bit of email I deal with. I can casually join 30 mailing lists. I can’t casually lurk on 30 forums.

At the same time, the argument in the article is that forum software destroys the perception that everyone in the project is equally important, which both lessens the problem of one or two loud voices being perceived as in control, and motivates socially-oriented people by giving them some visible measure of reputation, etc. I hope at some point in the future forums, Usenet and email breed some kind of hideous yet effective love-child: protocols and software that allow more subtly moderated communities that nevertheless do not require that I use a different piece of software for every community I am in.

Organisation, and lack thereof

Note that I am not writing this entry seeking advice on how to organise things, with one exception, which is if people have systems for keeping track of academic literature I’d be interested to hear them. Otherwise, I’m just toying around with self-recognition. If you want to talk about your own bulletproof self-organisation strategies, please do so in your own space and I’d be happy to receive a link.

Jonathan Lange admits to some serious Getting Things Done violations which, although I am not a GTD user — in fact I have only skimmed the book — sparked some thought in me about my own organisation practices, or lack thereof.

First off, I mentioned this to my mother this afternoon and she said ‘face it, you’re not an organised person’. This is only partially true: she’s thinking of shoes, keys, wallets, pieces of paper and getting out the door on time, all of which I was hopeless with when living with her, and I’ve only improved dramatically at the timing thing. I am bad with physical objects.

I am, on the other hand, good with data. Part of the reason I can maintain certain amounts of physical chaos is that I have a good memory for names, dates, times and commitments. I am good with organising things inside my computer, occasionally a little too good in the ‘I know I put this somewhere sensible but where?’ way but usually good.

What’s working for me:

  • Using an online calendar for anything I can attach a time to. It took about six months of occasional flesh wounds until Andrew and I were both fully converted to our new insanely scheduled way, but since around the end of 2005 we’ve been going strong. I can look at my calendar and 99% of the time it really does reflect every firm time commitment I’ve made to anything. And when I haven’t spent a weekend entirely at home since July, that’s important. We used Web Calendar for a long time until its faulty repeating-events logic drove me into the arms of Google Calendar, which works like a charm although in principle I would prefer to own that data entirely.
  • Email. It is highly procmailed and I am trying to add new rules all the time as an empty-ish inbox makes me happy. I’m not really trying for inbox zero, but inbox close-enough is OK. In most cases, my inbox contains only stuff I need to act on.
  • Scanning my snail mail to PDF. This probably doesn’t sound ideal because it’s not searchable, but neither is paper. And as previously established, I am better at electronic stuff than paper.

Obvious improvements I could make:

  • Carrying my calendar around with me. This would mean synchronising it to an electronic device and/or updating it on the road via mobile Internet of some variety. The reason I haven’t is that I am a cheapskate on the subject of both small electronic devices and paying the current (I believe outrageous) mobile data costs. I’m sure this will happen eventually, once it slides under my cheapskate threshold or I get a job which bundles it.
  • Better searchability for my email. Mairix is the most obvious solution (I have, for reasons not worth discussing, approximately zero interest in moving to Gmail), but I haven’t got around to hacking it up for my many copies of my mail on various computers and, also, I archive my old email to gzipped mboxes, which need a different solution.

Things I’m staying on top of, usually:

  • Photos, barring the photos of our November 2007 scuba trip to Thailand that I promised would be online in December. (Luckily, I promised it to people who do not have my email address.) No solution there: scuba photos need post-processing. We’re still using a kludged-up joint f-spot database over sshfs deal. I’d look into web things if it wasn’t for the pain and expense of dealing with the 29 gigabytes of photos I already have.
  • Music. We are oh-so-slowly re-ripping CDs to FLAC, but only because we are nuts. In the meantime, Squeezecenter finds us what we need.

Things I am totally at sea with:

  • Returning library books. No, I’m really really bad. I think I need to be strict with myself here: only one book at a time, and if I haven’t started reading it within a week, or I’ve lost steam, back it goes.
  • Long term lists of things I want to view or read. I have book and movie recommendations coming out my ears. I want to read more and watch more. I never keep notes.
  • Paper notes. I never never never got on top of this after high school. I don’t think that I ever once reviewed my hand-written university notes for any course at all. I am also a sufficiently fluent writer that taking notes does not greatly enhance my listening: it just all flows out through the pen without much effort in understanding it in the meantime. Plus I lose them or leave them at home. Moleskines, lab notebooks, meeting notes, I’m hopeless at them all unless I immediately transcribe them and email them to myself. This was a more serious impediment when I was doing maths courses, which are difficult to transcribe, but it’s still a major problem for PhD meetings.
  • PhD readings. This one is important and I need to solve it. I do not have a good system of filing (virtually, remember how I suck at paper?) papers that I will want to refer to, replicate or improve on. I am thinking of moving to some kind of wiki setup with a whole lot of folksonomy-ish tagging and notes aimed at enhancing searchability. (Our field’s major conference and major journal are both moving to electronic only, but they are still PDF and their style guides are still aimed at printability rather than indexability and will be so for ever and ever.) I need to get quicker with my summarisation of papers, probably mainly focusing on the area, major technique employed, test corpus, and indicative percentage accuracies, rather than full summaries.

This is what they call insanity folks

I have just given in to creeping adulthood and actually decided to do something that I’ve been putting off for a year or so on the basis that it is totally insane and obnoxious: I have started scheduling free time. Specifically, I am marking weekends in the calendar on which neither Andrew nor I will commit to doing anything or going anywhere with anyone else. I’m aiming to do this once a month or so, because barring our trip to New Zealand (world’s smallest violin, I know) we haven’t had such a thing since June, and then before that April and only because I was in my post-recompression exercise and travel ban in April. And this is for me, a borderline introvert, and Andrew, who is so introverted that Myers-Briggs cannot distinguish him from a lifelong hermit.

Even sadder: the first weekend I could find to do such a thing is more than a month from now.

Also sad, although again more in a world’s smallest violin kind of way: even on those weekends, we’re talking about squeezing some scuba in, which may not sound like such a big deal, but it’s 90 minutes commute each way to the beach from here, and also I have to re-plan all my diving around DCS first.

Clearly all this is very indulgent, but seriously, I feel like I haven’t really had much of a rest in my life since about the time we started planning our wedding, which was from about February last year. (We married in May last year, but that didn’t help because we just moved on to new time-consuming challenges.) Partly it’s because I want to vanish into a PhD thesis pit like I want a hole in the head, but partly it’s because I have problems with saying no. But once a month I intend to practise. ‘NO, NO, NO, NO, NO’. Seems easy, right? I’m on the case.