Not the Sydney Project: Questacon

This entry is part 3 of 11 in the series The Sydney Project

This year is my son’s last year before he begins full time schooling in 2015. Welcome to our year of child-focussed activities in Sydney… only without the Sydney bit this once.

We rudely interrupt the Sydney Project to bring you a Canberra attraction: Questacon. In short, Questacon works nicely for V in a way that the Powerhouse did not, probably because it’s pretty shameless about catering entirely to children, complete with buttons, lights and hard hats.

We were there on a very busy day: the Saturday of the Easter weekend, the middle weekend of NSW school holidays. It was merely obnoxiously busy; I guess being used to Sydney crowds was helpful. That said, we did get there at 9:15, just after it opened. And as it was, the admission tickets to Mini-Q, the under 6 area, which runs in limited-number sessions on busy days, were only available from 11:30 onwards. I think they’d completely gone by about 10:30 in the morning. Go early, go often.

We’ve been once before, about a year ago, and Questacon was a hit to the point where for some time afterwards he asked to “see the science again!”. It took him longer to warm up to it this time. Much like last time, he shot through Measure Island without engaging. It took him a while to settle into Wonderworks, eventually getting interested in the Energy Machine and Frozen Shadow. Much to my disappointment, he’s never given a toss for my precious Harmonograph. (Much of Wonderworks has been there since I was a kid. Questacon’s exhibits are surprisingly timeless in their appeal.)

Best exhibit

Andrew and I and his father were very taken with the Cloud Chamber, which is in its own little-visited room from the steps between Wonderworks and Awesome Earth (closed for renovations), in which subatomic particles leave continuous trails through a cloud of vaporised alcohol. Andrew is keen to bring a banana next time. V was not willing to stand still for a story about how all the time, everything is being hit with tiny tiny particles moving at high speeds. Perhaps not one for the littlies.

V’s favourite exhibit is pretty unique to him. He can roll ping pong balls down a ball rollercoaster for about an hour at a time. Other children come, roll five or ten balls and go. He stays. We only extracted him with a promise to return after.

Blue tunnel

Next up was one for the watching adults, Excite@Q. V was most naturally drawn to the blue tunnel, and he was one of several smaller children jostling under Whoosh to grab a scarf and stuff it back in the wind tunnels. But we were there for one thing: to see our four year old agree to do Free Fall. I wrote about this elsewhere:

It’s a horizontal bar suspended over a very steep slide. You hold the bar. You let go. You drop freely for three metres or so before hitting the slide and sliding to the floor of the room.

The ride is, as you’d hope, very into consent. You go to the top. You get a briefing about how it works. You are told, repeatedly, that it’s OK to say no. And the day we were there, about three quarters of children did say no. (It’s a bit of a study in gender performance actually. Adult men by and large grab the bar, drop themselves down to dangle, let go and are done. Everyone else takes far far longer.)

V loves slides and heights, and so we asked him if he wanted a go. He said yes. He was dressed in the safe suit for it (I guess no risk of catches or tears), he waited in the queue and watched child after child look at the drop and shake their head and walk back down the stairs with an adult for a hug. Andrew took him to the top. He got the chat about whether he wanted to say no. He gave them a puzzled look. He got his instructions. He took them very seriously.

He held the bar:

Preparing for free fall

He dropped his weight from it:

Dangling

He looked down:

Looking down

And he let go:

Fall

He seemed to have fun, if a little mystified about why this was such a very big deal.

Vincent the builder

After Free Fall, his ticketed time for Mini-Q came up. I didn’t go in, but apparently it was all construction all the time in there.

Finally, for bonus points, I put my camera down somewhere in Wonderworks, and someone found it and handed it into staff. “People who come to Questacon are generally very honest,” the information desk staffer told me, although somewhat spoiling the effect by saying she’d been tempted to keep the camera herself.

Cost: $23 adults, $17.50 children 4 and over, younger children free.

Recommended: yes, has something for the jaded adult radioactivity fans and the child who wants to drop from extraordinary heights, wear a hard hat in a playground, and roll ping pong balls down a slide for an hour alike. Try not to go on holiday weekends, and try not to leave your camera lying around.

More information: Questacon website.

Citation delusions: "The most influential paper Gerard Salton never wrote"

In trying to finalise my PhD revisions, I am giving some background on text categorisation.

Extremely briefly, the problem of text categorisation is this: you have a document and some (usually pre-defined, unless you’re clustering) categories. For example, the categories might be news and editorial. Or academic article, newspaper article and blog entry. The choice of categories is application dependent.

Then you have a document you wish to assign to a category. Is it news, or editorial? The typical way of doing this is to assemble a set of training examples: pre-assigned news and editorial pieces. Then you measure the similarity of your new document to the pre-assigned collections, and whichever category it is most like is your document’s category. You might notice that I have not here defined “measure the similarity” and “most like”: that’s often the research question. How can you represent the collections efficiently so that they can be compared against new documents? What are good measures of similarity?
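To make “measure the similarity” concrete, here is a minimal sketch of one possible answer, not the answer. It assumes whitespace tokenisation, pools each category’s training examples into a single word-count profile, and uses cosine similarity as the measure; the function names and the toy training texts are made up for illustration.

```python
import math
from collections import Counter


def bag(text):
    """Lower-case, whitespace-split word counts (an assumed, very crude tokeniser)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count mappings."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def categorise(document, training):
    """Return the category whose pooled training examples are most like the document."""
    profiles = {}
    for category, texts in training.items():
        profile = Counter()
        for text in texts:
            profile.update(bag(text))
        profiles[category] = profile
    doc = bag(document)
    return max(profiles, key=lambda category: cosine(doc, profiles[category]))


# Invented toy data: two pre-assigned pieces per category.
training = {
    "news": ["council approves new bridge", "storm closes coastal road"],
    "editorial": ["we believe the council is wrong",
                  "in our view the road should stay open"],
}

print(categorise("we believe the bridge is wrong", training))
```

Swapping in a different `cosine` or a different way of building the profiles is exactly the kind of choice the research questions above are about.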

A fairly common way to picture this is (for historical reasons, as we’ll see), a vector. For each word in the vocabulary (the vocabulary being the set of terms used in every document in the training examples, typically, sometimes you might try and smooth the morphology out or similar), you construct a numerical representation. Say the vocabulary is no-good, bad, rotten, and a document reads “no-good no-good bad”, you might describe it as a vector (2, 1, 0), showing two uses of the first vocabulary item, one of the second and none of the third. (Again, whether you count vocabulary items, or weight them in various ways, is a research question. You may also notice that this counting-of-occurrences model is a “bag of words” approach, that is, it does not distinguish between “bad rotten” and “rotten bad” even though in language word order and syntactic structure is meaningful. It’s possible to transform the vectors so that this orthogonality of individual words does not hold.)
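The counting described above can be sketched in a few lines. This is only an illustration, assuming plain whitespace splitting and raw counts with no weighting; `count_vector` is a name invented here.

```python
from collections import Counter


def count_vector(text, vocabulary):
    """Count occurrences of each vocabulary term in text, in vocabulary order."""
    counts = Counter(text.split())
    return [counts[term] for term in vocabulary]


vocabulary = ["no-good", "bad", "rotten"]

# Two uses of the first term, one of the second, none of the third.
print(count_vector("no-good no-good bad", vocabulary))

# Bag of words: word order is lost, so these two documents look identical.
print(count_vector("bad rotten", vocabulary) == count_vector("rotten bad", vocabulary))
```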

For reasons that I won’t go into here, I am trying to discuss this model briefly in my PhD thesis — actually, more briefly than I did above — and therefore looking to cite the originator of the idea. I started coming across citations in other papers that looked something like: “Gerard Salton [and others] (1975). A vector space model for information retrieval.” Sounds good. It’s got the key words in it, and quite a few citations!

I like to sight before citing though, which means I found this interesting paper:

David Dubin (2004). The Most Influential Paper Gerard Salton Never Wrote, Library Trends 52(4):748–764.

Gerard Salton is often credited with developing the vector space model (VSM) for information retrieval (IR). Citations to Salton give the impression that the VSM must have been articulated as an IR model sometime between 1970 and 1975. However, the VSM as it is understood today evolved over a longer time period than is usually acknowledged, and an articulation of the model and its assumptions did not appear in print until several years after those assumptions had been criticized and alternative models proposed. An often cited overview paper titled “A Vector Space Model for Information Retrieval” (alleged to have been published in 1975) does not exist, and citations to it represent a confusion of two 1975 articles, neither of which were overviews of the VSM as a model of information retrieval. Until the late 1970s, Salton did not present vector spaces as models of IR generally but rather as models of specific computations. Citations to the phantom paper reflect an apparently widely held misconception that the operational features and explanatory devices now associated with the VSM must have been introduced at the same time it was first proposed as an IR model.

Naturally such a subtle treatment of the history of the model is not great for my immediate purposes: I need That One Citation! (As best I can tell from Dubin, if I have to pick one it should be G. Salton (1979). Mathematics and information retrieval. Journal of Documentation, 35(1), 1–29.) But it’s fun to come across the analysis of an idea in this form.

Update: if you want a reasonable overview of text classification/topic classification/topic assignment, the survey of choice seems to be Fabrizio Sebastiani (2002). Machine learning in automated text categorization, ACM Computing Surveys, 34(1):1–47. You know, modulo 11 years now.

Mary’s helpful guide to soliciting research participation on the ‘net

This article originally appeared on Hoyden About Town.

In my years on the ‘net, I’ve seen any number of people want to interview others or get them to take surveys for everything from a short high school or undergraduate paper through to graduate research projects and books. And they so seldom manage to meet basic ethical guidelines for making sure they aren’t wasting their participants’ time at best or endangering them at worst. Hence this article.

In addition, this article may help research participants better assess requests: are researchers telling you what you need to know? Have they considered your interests as well as their desire to Find Something Out At All Costs?

Full disclosure: I am not a research ethics expert, I am simply a researcher helping you get the basics right. Please seek expert advice if you have any doubt about the safety or integrity of your research.

Why do I need to do this stuff?

Because you’re so often asking people sensitive stuff, that’s why!

Look, I have some sympathy for the “it’s just questions about something-seemingly-small!” myself. I ask people questions about their linguistic intuitions. “Which sentence reads better to you, A or B?” There’s nothing less fun than completing a 31 page ethics application to get approval to ask people about which sentences read better.

But look, all research, at best, takes up people’s time. You owe people something for that. In addition, quite a lot of the research people are recruiting for on the ‘net wants to get into harassment of women, political affiliations, sexual experiences, why people write slash. That kind of stuff? That kind of stuff in the wrong hands loses people jobs and relationships. You owe people serious, well thought out harm mitigation for that.

So, ethical research recruitment lets people know what they’re getting into, whether it is a boring half hour sharing linguistic intuitions, or sharing potentially damaging information with a researcher.

The bare minimum

All researchers asking for participation should share this information:

  • Who are you?
  • Who do you work for or who commissioned this work, if not yourself?
  • How can I get in contact with you, and how can I get in contact with who you are working for?
  • What is the purpose of the research?
  • What is the status of the research? Is this sheer curiosity that made you whip up a survey in five minutes, or a pilot study, or the main game?
  • What kind of effort do you want from me? (Interviews versus surveys. Five minutes versus many hours. You get the idea. Tell me upfront what my time investment is.)
  • When you’re done, where can I see the results?
  • Will the results be made public and in what form? (A peer-reviewed article? A PhD thesis? A pop science book? On your blog?)

Some of this might be the sort of thing you want to put on a webpage you can link to, so you can leave short advertisements like “Hi, I’m looking for help with X, and thought readers here might want to help because of Y, if you need to know more, please see LINK.”

You’d be amazed how many people miss the “When you’re done, where can I see the results?” step. Even if they’re asking people for 20 hours of interviews or something like that. For anything but the most trivial investment of time, letting people read your results is the minimum reward required.

Also, results being made public can often be good: the subject’s work is contributing to the sum of human knowledge! So don’t consider this necessarily a bad thing in and of itself.

Institutional research

If you are doing research at the postgraduate, postdoctoral or faculty level, research using human subjects (and other animal subjects for that matter, but you aren’t likely to be recruiting them on blogs) requires ethics approval by an institution-level ethics committee in most institutions.

So, when soliciting participants for research that has ethics approval, provide the following info:

  • All the bare minimums plus
  • A statement citing your ethics approval in whatever manner is usual. Your committee probably has boilerplate. Typically this will name the institution, give a reference number for your experiment and provide contact details for the ethics committee.
  • If your ethics committee approved a recruitment advertisement, use it! If it’s long put it at the other end of a link if that’s OK with them.
  • If your ethics approval requires that you disclose a bunch of things, also state them or place them at your info link if allowed.

If your institutional research didn’t require ethics approval (some institutions might, for example, have a blanket policy covering low-risk things like linguistic intuition questionnaires), find whatever boilerplate they let you use instead, if there is any, or say something sensible along the lines of “This questionnaire comes under the XYZ University Low Risk Experimentation Policy [link].”

Basically, if you are doing research on behalf of an employer, state either that you have ethics approval or, if not, why not (eg, your institution has no committee).

No committee but doing something sensitive?

If you’re doing sensitive work outside the oversight of ethics committees, here’s the start of your checklist!

  • All the bare minimums plus
  • Are respondents going to be anonymised in your personal/researcher copy of the data? Are you stripping any associated names, IP addresses, email addresses and similar? If not, what are you keeping and why?
  • How are you storing the researcher copy of the data?
  • Who has access to the researcher copy of the data? (Yourself? Your boss? All of your boss’s present and future employees? The Internet?)
  • When do you plan to delete the researcher copy of the data, if ever?
  • Are respondents going to be anonymised in the published results? If not, what identifying information will you publish and why?
  • Can a respondent withdraw their participation and be deleted from your data or transcripts? How do they do it? How long do they have to do so?

There are all kinds of other factors that ethics committees would get you to look at. Basically: what capacity for harm does your research have? How are you mitigating that harm? What risk to your participants is left?

Risks include: physical health risks; mental health risks (more common with online data gathering, eg, triggering questions); exposing people to relationship disruption or breakdown, or abuse (by, eg, asking them to discuss infidelity); exposing people to criminal prosecution (eg, by asking them to discuss illegal drug use); exposing people to civil liability (eg, by getting them to discuss breach of contract); exposing them to job loss; denying them the best treatment or resources (by, eg, giving preferential treatment to patients or students or employees who agree to take part in the research, thus harming others); and coercing participation in general. And there’s one question that frankly stands out to me as a member of the apparently rare species Lady on the ‘Net, which is “are you studying an over-studied population and if so, what benefit does this extra research have for them, as opposed to for you?”

One of the most obvious mitigation strategies is anonymity of your subjects in reports, and eventual data destruction of any private identifying data. But as you can see from the examples related to coerced participation, it isn’t the only strategy you might need. List your possible harms, list your mitigations, let the potential subjects decide if the research is worth it to them.

Related

I wrote a similar post focussed on software development a few years back, in that case mainly focussed on “prove to your subjects that their participation is not a waste of their time.”

Mary’s helpful guide to soliciting research participation on the ‘net by Mary Gardiner is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Computational linguists

xkcd suddenly exploded in my circles in 2006, thanks to the comic Randall Munroe calls Computational Linguists and most people refer to as “Fuck Computational Linguistics” getting around at the annual conference of the Association for Computational Linguistics.

There’s been requests for the xkcd store to sell it before, but it’s never been done.

I just ordered a batch through Sticker Mule, both of the full comic and of a smaller badge version I did. (They will do proofs of them, I’ll be interested to see if the “Fuck” bugs them.) In order to do so I did a vector version of the comic (via Inkscape’s “trace bitmap”), and because the original comic, and these variants, are under Creative Commons Attribution NonCommercial, I can share them with you here. If you want them, order copies from the sticker vendor of your choice!

Full comic:
Indicative PNG | Compressed Inkscape SVG | PDF (fonts as paths)

Smaller badge-like variant:

Fuck Computational Linguistics
Compressed Inkscape SVG | PDF (fonts as paths)

The vector versions aren’t very clean, but neither is the original comic, so I’m hoping these look like the spirit of the original, rather than a nasty hack.

Reminder: these are licensed for free noncommercial use (the precise condition is noncommercial use with attribution to the original author, modifications OK). So don’t sell them!

Ada Lovelace Day: Mahananda Dasgupta, nuclear fusion researcher

7th October is Ada Lovelace Day, a day to blog about your heroines in science, technology, engineering and math.

Mahananda Dasgupta is a professor in the Department of Nuclear Physics at the Australian National University. Dasgupta’s research takes place at the heavy-ion accelerator facility and investigates quantum tunnelling when heavy nuclei collide. Her Pawsey Medal award in 2006 cites cutting-edge contributions includ[ing] precision measurements of unprecedented accuracy.

Dasgupta moved to Australia from India for a postdoctoral position in the 1990s, and eventually was appointed to a tenured position in 2003. She became the first woman to hold a tenured position in the Research School of Physical Sciences and Engineering at the ANU in its entire 50+ years of existence! (I was very surprised to find this, the School must be enormous in terms of academic staff, it comprises nine research departments.)

How do we retain that female workforce [in science]?

By strong and meaningful mentoring, which doesn’t just mean a quick meeting once a month or web-based mentoring, but real mentors who encourage women or younger people to devise strategies about how best to use their time, and what roles to apply for to advance their career.

Every person at that early stage needs support. We need to champion women scientifically – not “she’s a good person”, but “she’s an excellent physicist who’s done this great work”… Equally, the employers’ responsibility to provide childcare is very important… If we are expanding and building infrastructure – why are we not building childcare facilities?

I was educated in India where, if a student is sharp, they’re encouraged to show it through participating in discussions or taking on extra-educational activities… It does strike me that in Australia we give a lot of kudos to those who excel in sports, but if you excel in studies you are a dork, particularly among other students… Sometimes, following talks I give in schools, students come to the carpark to ask me science questions, rather than asking them in front of the class… How do we get away from that? I believe that to make real long-term progress we must respect and encourage intellectual achievements.

Mahananda Dasgupta, The Conversation: So seriously, why aren’t there more women in science?

Dasgupta is active both in advocating careers in science in general, volunteering herself as a science careers lecturer at schools, and in speaking on behalf of women in science. In 2004 she was the Woman in Physics Lecturer for the year, and in 2011 she represented the Group of Eight universities (the eight universities that consider themselves Australia’s best research universities) at a Women in Science and Engineering summit at Parliament House. Her 2011 Georgina Sweet Australian Laureate Fellowship from the Australian Research Council calls upon her to increase the profile of Women in Science through outreach activities, and work towards advancing early career researchers as well as facilitate leadership pathways for senior women researchers.

Recognition Dasgupta has received for her work includes:

  • the Australian Academy of Science’s Pawsey Medal in 2006, for outstanding work in physics by a scientist under 40
  • her election as a Fellow of the Australian Academy of Science in 2011
  • an Australian Laureate Fellowship in 2011

I can’t embed them in the post for licencing reasons, but David Hine has a couple of photos of Dasgupta with her experimental equipment: Dr Mahananda Dasgupta and Dr Mahananda Dasgupta and Dr David Hinde.

Ada Lovelace Day: Fan Chung, leading mathematician

7th October is Ada Lovelace Day, a day to blog about your heroines in science, technology, engineering and math.
This is an expanded version of a post at Geek Feminism last year.

“Don’t be intimidated!… I have seen many people get discouraged because they see mathematics as full of deep incomprehensible theories. There is no reason to feel that way. In mathematics whatever you learn is yours and you build it up—one step at a time. It’s not like a real time game of winning and losing. You win if you are benefited from the power, rigor and beauty of mathematics. It is a big win if you discover a new principle or solve a tough problem.”

Fan Chung

Fan Chung is a leading mathematician, specialising in combinatorics and later graph theory. She is Distinguished Professor of Mathematics and Computer Science at UC San Diego.

I first heard of Chung in Paul Hoffman’s The Man Who Loved Only Numbers: The Story of Paul Erdős and the Search for Mathematical Truth; Chung and her husband Ron Graham were two of Erdős’s closest collaborators. Hoffman tells a great story about how when Chung had finished, and come first in, her PhD qualifying exams at the University of Pennsylvania, her eventual PhD advisor Herbert Wilf gave her a textbook on Ramsey theory to browse and she came back and explained that she’d improved one of the proofs. That was a core part of her PhD dissertation, completed in a week. Those kinds of stories are told about the best mathematicians.

Chung has worked both in academia and in industry, having spent twenty years at Bell Labs and Bellcore in both information technology and mathematics before returning to the University of Pennsylvania, where she did her doctorate. After her time in industry she is deeply concerned with mathematical breadth, and is known for her “nose” for problems that cross several subfields.

Many mathematicians would hate to marry someone in the profession. They fear their relationship would be too competitive. In our case, not only are we both mathematicians, we both do work in the same areas. So we can understand and appreciate what the other is working on, and we can work on things together-and sometimes make good progress.

Fan Chung, describing her relationship with husband Ron Graham

If my count is right, Chung’s publication list shows 79 papers co-authored with Ron Graham. I’ve always admired stories of professionally companionate marriages: even Joan Didion and John Gregory Dunne can’t compete on those numbers.

Chung’s website has a copy of a chapter about her in Claudia Henrion’s Women in mathematics: the addition of difference. Among other things it talks about her move to the United States from Taiwan for her graduate work, and her thoughts on having a child while at graduate school.

[Graduate school] is a wonderful time to have a child. You don’t have to attend classes; you only have to write your thesis.

Fan Chung

Hrm, yes, well. Perhaps I will give that advice in 20 years time. Perhaps not…

Book review: The Wisdom of Whores

Elizabeth Pisani, The Wisdom of Whores: Bureaucrats, Brothels, and the Business of AIDS

I picked this up when it briefly was a free ebook giveaway in 2010. Was that less than a year ago? Seems like a long time. I had not got through Jonathan Engel’s The Epidemic: A Global History of AIDS, finding it not-global and spending too much time emphasising that the AIDS activists from the gay community really should have understood that they were viewed as sinners. Or that’s how I remember it now. I’m still interested in the story of AIDS in the US, but I want it billed as such.

Anyway, Pisani’s book is an epidemiologist’s view of working in HIV research and prevention in (mostly) Indonesia. It’s partly a story and partly an argument that HIV/AIDS funding and approaches need some revision. In particular: prevention is cheaper than treatment, so while treatment is essential she thinks prevention is very underfunded. The approaches used successfully in the high-infection-risk communities in the US don’t all translate well to other high-risk groups. Emphasis on “everyone’s at risk” is nice for funding but is essentially bogus in most cultures: in most cultures sex workers, drug users, and people who have anal sex with multiple partners are at risk. (She argues the African epidemic is due to multiple long-lived concurrent heterosexual relationships being very common in some African cultures. This means that when someone has a primary HIV infection, one of the most contagious times, they will often have more than one partner to potentially transmit to.)

I simply don’t know how valid her arguments are, because I know next to nothing about epidemiology, public health or HIV/AIDS, really. One of many books (almost anything outside my expertise) where I wish I could see expert reviews to read alongside it.

Read it if: you are interested in HIV/AIDS, the UN, charity and NGO stuff, Indonesia, trans issues, sex worker issues.

Caution for: every so often she likes to add in a teaspoon of “I’m not PC!” She actually is, somewhat, anyway, but she likes to revel a touch in how her hip UN “AIDS mafia” crew were just such good buddies they could throw the lingo (about trans people, drug users, sex workers) in the bin. Also you may not actually agree with her on where HIV/AIDS funding should go, but it’s a book, you run that risk.

Wednesday Geek Woman special edition: Sandra Magnus, STS-135, and the end of the shuttle program

This article originally appeared on Geek Feminism.

Back-to-back American astronauts, yes. Special occasion! This is by request, from deborah on July 7:

Sandra Magnus is flying on the last NASA space shuttle launch tomorrow– how about a quick hit about her? And about being sad about the space shuttle. 🙁

Space Shuttle Atlantis en route to launchpad. Image by NASA, public domain.

We’re a little late to the party, so I’m scheduling this entry for about twelve hours prior to the end of the mission: landing is scheduled at 21 July 2011 9:56 UTC.

Sandra Magnus has a PhD in materials science and engineering and has worked on stealth aircraft design. This is Magnus’s 4th Shuttle mission, but third trip into space: she spent 134 days in orbit between November 2008 and March 2009, travelling to the International Space Station on STS-126 and returning on STS-119.

Sandra Magnus exercises in the Destiny Module on the ISS, in zero gravity
Sandra Magnus exercises aboard the ISS, March 2009. Image by NASA, public domain.
STS-135 is the 33rd mission for Space Shuttle Atlantis, and the final mission of the Shuttle program. See NASA’s video of the launch. NASA TV will be showing coverage of STS-135 throughout the planned landing.

The latest essentialism go-round: do we dare to discuss?

This article originally appeared on Geek Feminism.

Showing up on our Linkspam radar this week is John Tierney’s article for The New York Times, Daring to Discuss Women in Science, which is another round of “I am going to challenge the groupthink and be the one person who dares to make gender essentialist arguments about women in technical fields [well, me and that army over there]”. It argues from a finding among very high performing high school students, the top 0.01 percent of the population as sorted by the SAT and ACT standardised tests in the US, from which the researchers concluded that there’s distinct gender differences among students with that sort of performance.

Tierney writes:

The boy-girl ratio has also remained fairly constant, at about three to one, at the right tail of the ACT tests of both math and science reasoning. Among the 19 students who got a perfect score on the ACT science test in the past two decades, 18 were boys.

Meanwhile, the seventh-grade girls outnumbered the boys at the right tail of tests measuring verbal reasoning and writing ability. The Duke researchers report in Intelligence, “Our data clearly show that there are sex differences in cognitive abilities in the extreme right tail, with some favoring males and some favoring females.”

Here’s a roundup of feminist/women-in-science-o-sphere responses, many via SKM at Shakesville:

  • Anna N., 3 Problems With The “Women In Science” Debate: If boys and girls, men and women had truly equal opportunities, we might be able to conclude something about their “innate abilities” – or at least stop worrying about gender inequality in various fields. But we’re still very far from that point. Tierney finds fault with programs to eliminate bias at the university level, and says, female scientists fare as well as, if not better than, their male counterparts in receiving academic promotions and research grants. But girls may be implicitly or explicitly discouraged from pursuing science long before they actually become scientists…
  • Caroline Simard, “Daring to Discuss Women in Science:” A Response to John Tierney: The problem with the biology argument that “boys are just more likely to be born good at math and science” isn’t that it’s not “politically correct” — it’s that it assumes that we can take away the power of societal influences, which have much more solid evidence than the biology hypothesis. Tierney makes the point himself in his article…
  • Christina Agapakis, Adventures of Women in Science: The irony here being that this article is a very clear example of some of the social biases women in science face every day, just one of the countless attacks and indignities that make it that much harder for women to get up and go to lab every day, to achieve great things in math and science.
  • Janet D. Stemwedel, John Tierney thinks he’s being daring: On the general subject of claims for which there does not exist relevant empirical evidence, are there any published studies (or any research projects currently underway) to explore the connection Tierney, Summers, et al. seem to assume between being in the extreme right tail of laboratory measures of mathematical and scientific aptitude (like the math section of the Scholastic Aptitude Test) and having the chops to get a doctorate in science or to win tenure at a top university?
  • SKM, Daring to Discuss Women and… *Yawn*, which also includes several links from the Larry Summers debate and earlier: For that matter, I think the “daring” idea that women are innately inferior to men at various Important Things–and indeed the preposterous notion that the idea is “daring” to begin with–has been answered quite competently in the past
  • Gretchen Keller, Women in Science: 2+2=?: …one of my least favorite, yet thought-provoking questions is “How does it feel to be a woman in science?”. Usually I reply that it feels the same as it does for a man: frustrating, time-consuming, invigorating and mostly like a bird flying repeatedly into a window desperately hoping that one of these times that pane of glass will turn into thin air.
  • Amy E. Slaton, Erring on the Side of…Exclusion: I know, I know: sarcasm is petty and unattractive. So before I lose any remaining credibility, let me defer to Troy Duster’s brilliant historical discussion of biological understandings of intellectual capacity. For almost 20 years, editions of his book, Backdoor to Eugenics, have laid out the very worrisome political and cultural implications of our pursuit of biological bases for intellectual and behavioral differences.
  • Clara Raubertas, “daring” to draw unscientific conclusions from statistics: Of course, his conclusions aren’t very scientific. Here are a few of the unfounded assumptions he has to make to draw the conclusions he draws… The assumption that science is so hard that it’s really only suited for people with extremely high scores (in the top fraction of a percent among a group of students who are already in the top fraction of a percent among their peers)
  • Melissa, The never-ending discussion: biology or bias?: What I find most frustrating is that there are myriads of studies, and everyone can cite their favorite study to support their viewpoint, be it that bias is the dominant factor keeping women out of sciences or that biology accounts for the paucity of women… I found what may currently be the best, though still imperfect, antidote to the never-ending, go-nowhere discussion of this topic, namely Stephen Ceci and Wendy Williams’ book, The Mathematics of Sex: How biology and society conspire to limit talented women and girls.
  • FemaleScienceProfessor, But I Don’t Want to Write about John Tierney Again: Thanks for all the e-mails and comments with links to the New York Times commentary by John Tierney, but what he wrote is just more of the same of what he’s written before: i.e., many women don’t want to be scientists or engineers, others can’t because they aren’t as good at math as the guys. Oh yeah, and Larry Summers made some reasonable statements in a speech that was misunderstood by hysterical females.
  • Hannah (in reply to FemaleScienceProfessor), Daring to Discuss: While I do understand this fear, how else are we going to convince the scientific establishment, many of whom likely share Tierney’s views, that gender bias is real and actually does keep women from succeeding in science careers? Clearly, just waiting for the old guard to pass on isn’t working, because I’ve met plenty of young male scientists who are just as biased as the old ones: they just hide it better.

Ethics of Free Software community research

Most of this entry is exactly a year old today and it’s just sat around in draft form all that time. Since I posted something similar on Geek Feminism about research into women in tech and similar topics, I thought I’d get it out there.

In January 2009 a researcher named Anne Chin of Monash University Law emailed the chat list for the linux.conf.au 2009 conference asking for research subjects to be interviewed about licensing and Open Source software. There were several responses criticising her use of HTML email and Microsoft Word attachments. I’ll leave the specifics of this alone, except to say that people should be (and probably are) aware that this is almost always an unknowing violation of community norms.

I did, though, think about making some notes on research ethics and Free Software research. A bit about my background: I am not a specialist in ethics. I’m somewhat familiar with ethics applications to work with human subjects, but not from the perspective of evaluating them. I’ve made them, and I’ve been a subject in a study that had made them.

For people who haven’t seen this process, the ethical questions arising from using human subjects in your research in general cover the question of whether the good likely to arise from the outcomes of the study outweighs the harm done to the subjects, together with issues of consent to that harm. (There are many philosophical assumptions underlying this ethical framework; I don’t intend to treat them here.) Researchers in universities, hospitals, schools and research institutes usually have to present their experimental designs to an ethics committee, which will determine this question for them and approve their experiment. Researchers who work across several of these (eg, a PhD student who wants to interview schoolchildren) will need to do several ethics applications, a notable chore when the forms and guidelines aren’t standardised and occasionally directly conflict. Researchers working for private commercial entities may or may not have a similar requirement. Researchers who use animals also have to have ethical reviews; these are done by animal ethics committees, which are usually separate.

At my university, essentially any part of your research that involves measuring or recording another person’s response to a research question and using it to help answer that question needs a human ethics application.

The good/harm balance may include very serious dilemmas. Is there a health risk to subjects? How will the researcher manage the conflict between maintaining subject confidentiality and research integrity on the one hand, and the good of her subjects or the requirements of the law on the other, if she uncovers, say, episodes of abuse or violence? But it also involves less immediately obvious and serious ethical questions. “Is this study a giant waste of subjects’ time?” is considered a question of ethics by ethics committees, and is in fact the most serious problem for linguistics research, since there’s very seldom an outcome of particular interest to the subjects themselves.

The study in which I took part a few years back was towards the serious end actually: it was a study into the psychological profiles of people who have an immediate family member who had cancer as a child and involved both questionnaires and a phone interview with a psychologist. Both because the study explored memories of the illness and because the profiling included evaluating depressive episodes, suicidal ideation and so on, it came with a detailed consent form and with information about a counselling service that had been informed of the study and was prepared to work with its subjects.

In the case of the Free Software community the ethical questions are often more towards the “waste of time?” end of the spectrum than the more immediately serious end. It’s important to understand that this isn’t necessarily the case though. Here are some more cutting ethical problems:

  • getting findings that expose your subjects and/or their employers to intellectual property claims; or
  • revealing that your subjects are breaching employment contracts in some way (generally also related to IP) and thus exposing them to job loss and possible civil action.

Getting ethics approval to carry out workplace studies can be fairly hard precisely due to problems like these. But in the rest of this post I will treat the “waste of time?” problem.

Firstly, the basics: are your subjects going to be identifiable in your final reports or to the general public? If not, who will know who they are? Can a subject opt to have their responses removed from the study? When and how? All this should be explained at the start. (Usually if an ethics committee has been involved, there’s a consent form.) If you are doing a survey, look into survey design so that you can construct non-leading questions and the like.

Now, for specifics. Most of them arise from this principle: there are a lot of researchers working, in various ways, on the Free Software community, possibly making it a slightly over-studied group if anything. This places the onus on the individual researcher to demonstrate to the community that their project is worthwhile and that they’re going to do what they say. Thus:

  1. demonstrate some familiarity with the background. Depending on your research level this could mean anything from demonstrating a knowledge of existing anthropological work on Free Software (say, if the research project is for your anthropology PhD) down to at least understanding the essential concepts and core history (say, a project at high school level). This can be demonstrated by research design, eg asking sensible well-informed questions, but actually mostly requires a bigger time investment: making appearances in the community, either virtually or physically, ideally for a little time before asking the community to help you get your PhD/A-grade/pass.
  2. don’t get the community to design your experiment for you. Have a specific goal, more specific than “get people to write me lengthy essays about Free Software, and get ideas from that and write about them”. In the general case, the “ask people incredibly vague stuff and hope they say something interesting” technique fails the waste-of-time test.
  3. give your results back to the community. The most common problem with the various surveys, interviews and questionnaires sent to the Free Software community is that responding to them is like shouting into a black hole. It is not unheard of, of course, to see the thesis or essay or roundup that comes out of these, but it is unusual, relative to the number of requests. Most of the time the researcher promptly disappears. Researchers should come to the Free Software community with an explanation of when and where they will make the results of the study available. They should explain the aims in advance unless this would compromise the results. (On that note: Anne Chin is giving a linux.conf.au talk this year.)

Creative Commons License
Ethics of Free Software community research by Mary Gardiner is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.