Why Women’s Summer Outreach

Dom Lachowicz asks why an enormous number of women are applying to the GNOME Women’s Summer Outreach Program when they didn’t apply to Summer of Code.

Well, I’m still each-way about applying (because I’m spending two successive weeks in July at the ACL/HCSNet Advanced Program in Natural Language Processing and ACL/COLING 2006), but, for what it’s worth:

Potential major disadvantage: WSOP’s payoff is less ($3000 vs. $4500)

The time expenditure is commensurately less, two months rather than three, although Lachowicz doesn’t think that there’s a scope difference to compensate. This is a good thing for me. I’m a southern hemisphere student and also a PhD candidate, both of which mean that July and August are not some kind of idyllic vacation waiting to be filled up with code. I want my September back.

Potential major disadvantage: WSOP’s number of advertized positions is less (3) than the general SoC allotment (20)

I expected that the applicant pool would be much smaller too; it turns out I was wrong about that.

Potential major disadvantage: WSOP was advertized when a lot of North American schools have their summer recess, where SoC was advertized while students were still in class

Not relevant to me. What is relevant to me is that I was very very early in my studies (my school year starts in March) when SoC opened up, and had no idea if I had the time for it. Especially since (as noted above) SoC is longer.

March to May is probably the worst time to persuade Australian students to be part of this kind of thing: they’re still trying to wrangle their school year into shape.

As for North American students, for those who remain on campus (i.e. postgrads) it’s likely that summer is a good time to do word of mouth advertising. Fewer people have time to get enthused about a summer of coding when they’re still finishing their spring of marking or taking exams. That’s speculation though.

Potential major disadvantage: On top of that, I have to imagine that GSoC was advertized more broadly than WSOP

Not in groups where I hang out. WSOP has been advertised to LinuxChix of course, but also to SLUG!

Here I have to comment on the nature of the advertising. I was personally approached about applying to WSOP and generally it seemed to be a little bit more about ‘here, let us help with your application’ (although perhaps not now that applications have gone crazy) whereas the SoC stuff came across more as ‘think you’re ready to play with the big boys? C’mon, prove it!’ I wouldn’t be surprised if this impression is false and based solely on the ‘WSOP is for women’ factor, but nevertheless it is a feeling I have.

Potential major disadvantage: GSoC was open to both men and women, and WSOP is open only to women

I suppose this does make SoC appeal slightly more to me, but it just doesn’t scream ‘major disadvantage’. I’m not entirely sure why Lachowicz thinks this is a major disadvantage. One possible reason is that it makes SoC more prestigious, but prestige isn’t the reason I’m interested.

In summary

The major appealing factors of WSOP are:

  • The program is shorter
  • It was advertised more extensively as unambiguously cool in places I’m exposed to
  • It had more personal touch
  • It happens to be at a slightly better (as in, only mostly awful) time of year

My new baby

Actually, it’s more like a baby and an amniotic sac. The baby is a Canon Digital IXUS 65/SD630. The sac is the WP-DC3 underwater housing, rated to 40 metres depth.

But Mary, alert readers cry? Haven’t you and Andrew been dying to get a DSLR for years now? How can you introduce an ultra-compact into your family? Is it worried you will love it less than its eventual brother?

It probably ought to be worried, but it has one unbelievable advantage, which is that the cost of the housing was actually less than the cost of the camera itself. As far as I can tell, that isn’t actually true for DSLRs, or at least the prosumer ones. Expect to pay more for the housing than you did for the camera, and expect also to be compelled to buy an external strobe rather than being able to use the pop-up flash. (Yeah, I know that an external strobe is a good idea anyway. Damn you all to hell.) So, it’s my wet camera at the very least.

How did the family react?

I’m thrilled of course (although not convinced by Andrew that this is what you’d call a good shot of me):

Mary is thrilled

Andrew is mildly pleased:

Andrew is mildly pleased

Liga is really quite calm indeed:

Liga is calm

Macbeth is excited by something else entirely:

Macbeth is always excited

And I have captured something of the pretty:

Bottle brush flower in a puddle; Sapling; Butterfly

Things I’m up to

Backwards

I still run my website on this thing. I must be one of the last people in the world using a hand-rolled CMS, but I spent four years maintaining my website by hand and for whatever reason it’s turned out that coding to my own needs has worked for me. I recently had a flurry of activity involving feature branches and unit tests and such.

Planet

I keep beating my head against the Planet codebase. The major problem I want to solve is that it runs like this: load all entries ever into memory, then write the top X of them out to a page. This is delightfully insane when run on my Linode 100: it can use up 30+MB of memory and push the whole thing into a nasty cycle of swapping.

Unfortunately, my reading of the codebase is that the single most flexible thing about it is that you can get any of the entries at any time. You currently might not want to, but everything is designed to leave that possibility open. Whether I kill that flexibility dead or leave it, this is a major reworking of the guts of the thing. Considering how many undocumented, untested hacks are in there (or at least, I have reason to believe are in there, it’s hard to find them) to work around the five million possible bugs people can introduce by mis-dating within their RSS feed, I’m sure this will break something.
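
For illustration only, here is a minimal sketch of the alternative to loading everything: stream the entries past a fixed-size selection so that only the top X are ever held in memory. This is not Planet's actual code. The `Entry` type, the `newest_entries` function and the `fake_feed` generator are all hypothetical names invented for the example, and it assumes every entry already carries a trustworthy timestamp, which is exactly the assumption that mis-dated feeds break.

```python
import heapq
from typing import Iterable, Iterator, List, NamedTuple


class Entry(NamedTuple):
    """Hypothetical stand-in for a feed entry; not Planet's real class."""
    timestamp: float  # seconds since the epoch, assumed to be reliable
    title: str


def newest_entries(entries: Iterable[Entry], limit: int) -> List[Entry]:
    """Return the `limit` newest entries, newest first.

    heapq.nlargest consumes the iterable one item at a time and keeps
    at most `limit` candidates, so memory use is bounded by `limit`
    rather than by the total number of entries ever seen.
    """
    return heapq.nlargest(limit, entries, key=lambda e: e.timestamp)


def fake_feed(count: int) -> Iterator[Entry]:
    """Yield entries lazily, standing in for reading a cache from disk."""
    for i in range(count):
        yield Entry(timestamp=float(i), title=f"post {i}")


top = newest_entries(fake_feed(100_000), limit=3)
print([e.title for e in top])
```

The cost of this approach is the one described above: once entries are streamed and discarded, "get any entry at any time" is no longer free, so anything relying on that flexibility would need a separate lookup path.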

linux.conf.au website

I volunteered to write content for this, but I’m not being nagged sufficiently hard and have also proven unwilling so far to go to their weekend-after-weekend conference organisation stints to be nagged, so I haven’t.

Women’s Summer Outreach Program

In theory I’m applying to the GNOME WSOP. In practice though it couldn’t be at a worse time. It runs July and August; here’s what I’m already doing in July and August:

  • Second week of July: Full time residential winter school in computational linguistics in Melbourne.
  • Third week of July: ACL/COLING 2006, which will be full time and, I believe, highly social. At best I can code on the train to and from the event each day.
  • First week of August: Visit to Townsville, including three full days on a liveaboard SCUBA trip. I get seasick, I’m not coding on a boat!

I’m still sort of madly tempted to apply and just find the damn time. It can probably be found. But I’d have to run close to crisis mode for a couple of months to get whatever it is done.

PhD

Yes, that too. No really. But at the moment I’m trying to do stuff in Perl (not at random, because of the existence of this) and for whatever reason Perl and my brain are not a good fit.

Twisted sprint

On the off-chance that someone who might be interested is reading this but not the twisted-python list (in which case, you might have your priorities mixed up…) there’s a Twisted sprint in Sydney June 3–4. I haven’t been involved in organising this one, but will be there. RSVP as per the link.

FLOSSPOLS report on gender and Free/Libre and Open Source Software

There are two reports in fact: D16 – Gender: Integrated Report of Findings; and D17 – Gender: Policy Recommendations (a sub-report from D16). I’m in the middle of reading the first one. I don’t know how the general public is going to go with them: even I have trouble not wincing at the use of ‘discourse’ and ‘narratives’, and I both have a rogue semiotics major and know perfectly well that it’s unmarked jargon.

I’ve never felt really able to contribute to the ‘why [so few women in this IRC channel]?’ discussion other than tending to dismiss the ‘women’s brains don’t work like that’ argument for totally incorrect reasons. That is, I dismiss it largely because they seem unable to put ‘most’ in front of the words, and what kind of evidence of mathematical superiority involves making an absolute statement about an entire population in a conversation with one of the exceptions? That’s right, I’m completely petty about that one.

More seriously, I’ve always wanted to solve the problem of why I personally don’t contribute F/LOSS code before going at the bigger question. I mean, I’m in all the right places. I report the bugs. I have myself been a professional programmer, and now I’m a postgraduate computer science student which puts me even closer to the ‘likely to spend entire life on free code’ demographic. On the rare occasions when I even look at the code of stuff I use I can generally curse its inadequacy and also find the bug I’m looking for. (I have trouble fixing them even then though, because I tend to subscribe to a ‘master plan’ theory of the code in which I worry about breaking other things if I have to do any rearranging to fix the bug. I guess I’m the person test driven development was invented for.)

Some of it is this problem. Some of it is the social problem: that hurling out random patches is actually so seldom successful when fixing bugs, especially non-trivial ones, compared to spending oodles of time in Yet Another IRC Channel and finally winning commit access. (It’s possible a different choice of projects to fix bugs in would help, but that tends to be the problem with the under-resourced small ones.) Much of it, however, involves just not opening up the code in the first place.

A couple of things in D16 speak to me though: the first is the review of how important the idea of complete individual volition is in F/LOSS culture(s); that one’s choices to code or document or play with dolls are made in a social vacuum. Another interesting note, probably a tangent, from D16 (page 34):

Often it is almost as if software projects are not about software production but about code production, where members imagine that within code lies exclusive access to worthy knowledge. In this sense, F/LOSS resembles academic computer science more than engineering. It is perhaps not a coincidence that proportions of women in F/LOSS resemble academic computer science numbers.

Note that I cheat in academic computer science too: computational linguistics has a relatively large proportion of women. Still, I’m sure, a minority, but it’s difficult for me to notice that when it’s a sizable minority. But I wonder if I should toddle off and have a closer look at the ‘women in computer science’ literature now. It’s still an open question as to whether it will answer any questions I have about myself though.

Google demands, we deliver

Unlike my more organised compatriots (Pia, Scott) I didn’t go to the Google Open House party last night. But since they employ in my research field, I did head over to their Software Engineering ads in order to scope things out. After all, I like Sydney. I might want to live here after my PhD. (To be fair, it’s a bit of a long shot in that they don’t have computational linguists in Sydney.)

It’s interesting to see that they want a Masters or PhD for the position. Even modulo the usual about degrees not proving anything about anyone, I wonder what sort of Masters they’re looking for. A Masters degree is not usually a PhD feeder in Australia (four year Bachelors feed straight into 3 year PhDs with some provisos). While there are research Masters, which are just like doing a PhD except only about two thirds of the work is expected, most Masters programs are terminal coursework programs. In IT, they’re often career entry programs too: they don’t assume prior knowledge of the field. So hiring a Masters is just like hiring a Bachelors except your job candidate is older and is guaranteed to have a Bachelors in something else too. It says nothing in particular about some personal investment in or aptitude for research-like software development like it may in the US.

This is more a human resources problem for Google than anything else. It is, I assume, tough to get a decent picture of how tertiary and professional qualifications work globally. It’s a little concerning in a local sense: Australia at present does not usually demand Masters degrees for professional work except in the rare fields where there’s no undergraduate degree that qualifies you for the profession (and even medicine, dentistry and law are undergraduate, although in many cases they cheat and require that it be your second Bachelors). I’d be just as happy if it stayed that way. There’s nothing wrong with twenty-somethings being allowed to start actual work, surely.

PhD management

I’ve started a thesis. These are my toys:

Bazaar (yeah, the new one, I’m not that crazy…) for version control. As you might expect, the distributed element is a wee bit of overkill, but not as much as you might think. It makes backups easy.

LaTeX and BibTeX. I’m not a complete LaTeX purist; I find it’s more trouble than it’s worth for shorter documents, or at least, it has been since office suites on Linux started being easily installable and usable for me. (I recall, for example, in 2001, when Abiword exported to PDF but did not export the paper orientation with it. Since I’ve never owned a printer, export to a portable format for printing has always been my feature of choice.) But for a 200+ page thesis, it’s better than the alternatives, because it doesn’t involve typing XML tags and it is text based and therefore can be version controlled without custom proprietary tools and recovered by hand. And it’s designed for academic papers.

I’m not so enamored with BibTeX because I find it’s an easy format to get wrong. I suppose I should at least try EndNote someday so that I speak the same language as my librarians, but even if it is as good as everyone claims, it takes a lot to get me to switch operating system. (I have, in fact, only done it twice: DOS 2 to Windows 3.1 to Linux.) I probably should play around with BibTeX frontends again. Last time I tried pybliographer it was missing some useful features, although now I can’t recall what.

Tomboy for note-taking. This is my real discovery of the past month or so. I like wikis for note-taking in principle (hyperlinkiness is next to godliness) but in practice I find that the ‘press edit, wait, edit, press preview, wait, edit, press preview, wait, edit, press save, wait’ cycle is too slow for comfort on a web connection. They’re also not really well designed for the task of taking notes into several pages simultaneously, or bouncing between them. So, I ran apt-cache search wiki specifically in search of the wiki idea implemented as a desktop program and Tomboy is what I found. I’ll be curious to see how their ability to interface with other programs goes. If it looks attractive I might try and hook it up to one of the BibTeX frontends I’m yet to find.

The bounty puzzle

My mind may be a little warped at the moment from spending time as a programmer for client driven projects and the associated perils, but every time I think about open source bounties something in the back of my brain starts squealing painfully.

I mean, these things just seem like a potential minefield to me. And I don’t mean legally, in the sense of people suing each other over bountified things that did or did not happen or bounties that did or did not get paid. I just mean in the sense of an enormous amount of sweat and blood spilled over the details of when the task is complete.

Consider the kind of features that are being bountied, for example parental control for Ubuntu. There are so many ways that one might not like an implemented solution to this problem, both reasonable and petty:

  • it might not completely control the users’ access in any one of a number of ways (doesn’t block IM, is missing some porn websites, doesn’t block use of alternate proxies…);
  • it might not be written in a programming language you favour;
  • it might manipulate packages or user settings in a way that is contrary to the way the rest of the distribution works;
  • you may disagree with its author about what parents might want to control;
  • it might have an unusable GUI, or one that you claim is unusable; or
  • it might have security flaws you can drive a truck through.

Moreover, upstream might have their own set of reasonable or petty objections, ranging from "I wanted to do that myself, it sounds fun, so I won’t use your solution" to "cleaning this up so that it works for us is going to be months of work" or "it has security flaws you can drive a truck through." And any or all of these might be the end of the project in a normal situation. But when some small amount of money is added, there’s a whole new fight on about what constitutes a completed bounty for the purposes of payment.

Even with companies, which are fairly motivated to be profitable, and even with specifications longer than the Bible, these fights can end up costing more person hours than the total value of the project. With bounties maxing out at USD500 or so, I’m willing to bet that the review time alone will cost more than the value of the bounty in all but cases so trivial that writing the code is faster than writing the specification anyway.

But the main problem for me is that doing client driven work now without a very clearly defined relationship between myself and the client (I’d like the armies of the undead to be involved, but failing that, serious review of the specification before coding begins) just gives me the willies. I’d rather take my chances with scratching my itch and hoping that the community agrees that it’s worth applying the patch than have the community try and assume a client relationship with me, let alone someone peripherally involved in the community who wants me to do the work of getting the feature that they personally want added.

There’s a problem here to be solved. It’s the old nasty one: what does a good specification (one where the specified task can be judged complete with as little ambiguity as possible) look like, and how does one write one?

Ubuntu and Malone

Apparently I haven’t gone mad, and the security fixes announced in this and this announcement do not, for all intents and purposes, actually exist for end users at the moment (bug 31584 and bug 31585).

Malone (the Launchpad bugtracker now used by Ubuntu) must have improved since I last used it. I remember way back when every time I filed a bug against something else I would have to file four bugs against Malone itself. Now I’m down to a one-to-one ratio (bug 31581 and bug 31583) plus a couple more nitpicks (bug 31586 and bug 31621). This is still much more frustration than I’m really happy with: I spent literally an hour trying to report that those security updates were missing, and it was only as small as that because I’m one of the twenty people in the world who live with a Launchpad developer. Alas, since I use Ubuntu and need to give them feedback on my problems if I want my computers to keep working, they have me all locked up on this one. And I normally find bug scutwork soothing in small amounts…