Useful LaTeX packages: tables and figures

This entry is part 2 of 4 in the series LaTeX packages

This is part of a short series of entries on LaTeX packages I found useful while preparing the examination copy of my PhD thesis.

Today’s entry is packages relevant to preparing tables or figures. Again, some are pretty widely known and some aren’t.

rotating

If you have a big table or figure that should be rotated sideways onto its own page:

\usepackage{rotating}

And then you can replace the table and figure environments with:

\begin{sidewaystable}
%Giant table goes here
\end{sidewaystable}
\begin{sidewaysfigure}
%Giant figure goes here
\end{sidewaysfigure}

dcolumn

The dcolumn package produces tabular columns that are perfectly aligned on a decimal point (ie all the decimal points in that column are exactly underneath each other), which is usually how you want to display decimal numbers.

\usepackage{dcolumn}

% create a new column type, d, which takes the . out of numbers, replacing the .
% with a \cdot and aligning on it.
\newcolumntype{d}[1]{D{.}{\cdot}{#1}}

Now that you have defined the column type, you can use d in the tabular environment, where the numeric argument is the number of figures to expect after the decimal point. You don’t have to use exactly that number of figures in every entry; the argument just sets how much room the column leaves for them.

% a tabular environment with a 1 and 3 figures after the decimal point column
\begin{tabular}{d{1}d{3}}
1.6 & 1.657 \\
2.0 & 6.563 \\
7 & 6.26 \\
\end{tabular}

One annoying aspect of this package is that for the headers of that column, which probably aren’t numbers, you will need to use \multicolumn to get them to display nicely.

% a tabular environment with a 1 and 3 figures after the decimal point column
\begin{tabular}{d{1}d{3}}
\multicolumn{1}{c}{Heading 1} & \multicolumn{1}{c}{Heading 2} \\
1.6 & 1.657 \\
2.0 & 6.563 \\
7 & 6.26 \\
\end{tabular}

You can mix the d column type with the usual l, r and p column types.

threeparttable

You can’t use \footnote in a floating table. This is one of several packages that allow table footnotes in various ways.

\usepackage{threeparttable}

threeparttable doesn’t cause tables to float on its own, so you usually want to wrap it in a table environment:

\begin{table}

\begin{threeparttable}

% Normal bits of your table go here, and use \tnote{a} and
% \tnote{b} and so on to generate a note mark

\begin{tablenotes}
\item General note
\item General note 2
\item[a] Note for mark a
\item[b] Note for mark b
\end{tablenotes}

\end{threeparttable}

\caption{Caption goes here}
\end{table}

Unfortunately you need to generate the a, b, c (or whatever) numbering manually.

The general (unmarked) note entries are useful for things like “Bold entries are highest in the column”, so that they don’t need to go in the caption.

Useful LaTeX packages: bibliography

This entry is part 1 of 4 in the series LaTeX packages

I’m going to post a short series of entries on LaTeX packages I found useful while preparing the examination copy of my PhD thesis. Largely this is just so that there’s a reference if my wiki page goes away, but also because I think many people use LaTeX the way I use it, that is, I got wedded to a bunch of packages 10 years ago and never really looked around for more recent stuff.

Today’s entry is a pretty slow start: the bibliography packages I used are pretty standard.

natbib

This is one of the most sophisticated and widely used packages for Harvard-style references (ie, “(Surname, Year)” rather than “[1]” style references).

\usepackage[round]{natbib}
\bibliographystyle{plainnat}

Inside your text use \citep for a reference in parentheses “(Surname, Year)”, and \citet for an in-text reference “Surname (Year)”. It’s important to note that the plain \cite command is equivalent to \citet, which you may not expect.

You can use \citeauthor to get just “Surname” and \citeyear to get just “Year”.

bibentry

This is a useful add-on to natbib, which allows you to insert full bibliography entries into the body of your text. This is useful in the declaration portion of a thesis (where you say something like “this thesis incorporates revised versions of the following published articles”).

\usepackage{bibentry}
\nobibliography*

Then later on when you want to insert a full bibliography entry into the middle of your text:

\bibentry{citationkey}

Product review: Shoeboxed

Update February 2017: this service is now known as Squirrel Street, and their smallest monthly pricing is significantly higher than it was in 2012. However much of the review still applies.

Original review:

I’ve been using Shoeboxed now for long enough to review it, I think.

Problem: as with every adult household, we have lots of incoming documents like bills and super statements and similar, and the high initial overhead on deciding whether and where to store them, plus re-sorting them later and so on has never been something we’ve been on top of. Come tax time, in particular, we were usually opening piles of envelopes and hoping for the best.

In 2007 or 2008 we started scanning and shredding a lot of things, but that still left going through and labelling the scans as a problem, plus when I went on maternity leave in 2010 we didn’t have access to a sheet-feed scanner anymore and got behind and never caught up. Back to the “giant unsorted pile of paper” solution.

There are a few services that accept mail on behalf of people and send scans (Pass the Post, Keeping You Posted) but these tend to be quite expensive if you want them to handle all your mail, and also there’s still a time-critical decision step (scan it or send it to me). They tend to be aimed at travellers or businesses. The paper pile was annoying enough though that every few months I hit the search engines, and eventually I lit on Shoeboxed.

What Shoeboxed does:

  1. accepts documents either sent by mail (not one at a time, many in a big envelope) to a US or AU postal address, or uploaded
  2. scans the physical document if any
  3. does data entry for the major data within (for bills, say, the sender and the total)
  4. makes them available after logging in on their website
  5. makes them available over an API to other services like bookkeeping websites

What Shoeboxed doesn’t do:

  1. directly accept individual physical mail on your behalf (they do have a service where you can get online receipts sent to them, I haven’t used it)
  2. full OCR of the scanned documents

There’s a very very limited Free plan involving uploading (not mailing) up to 5 documents a month for OCR plus unlimited uploads if you do your own data entry. The next plan up in Australia, which we’re on, is $20 a month, and includes all the features I listed.

Impressions:

  1. overall, it pretty much does what we want: gets paper out of our house and into an easily searchable online form with scans available
  2. because it isn’t fully OCRed I still have to go through non-bills in order to note what they are, eg, a mail from childcare could be a fee change or a newsletter or a note about illness and if I need to find it in a year I’d have to search on the name and look through them all
  3. the processing speed on the Lite plan (contents of envelopes appear on the website in 3–5 days) has been a bit annoying on occasion, I’ve found myself scanning really time-critical documents and uploading them
  4. the processing speed on uploaded scans is great, the data entry is usually done within the hour
  5. the usage reporting doesn’t incorporate the bonus scans one gets by doing things like signing up for an annual plan, or answering demographic surveys. Very annoying!

For our needs, it’s definitely an improvement over our home-rolled solution. We’re scrambling to get 250 documents to them before our annual purchase bonus expires.

Copyright hell: larrikins and astrologers

This article originally appeared on Hoyden About Town.

People who support a reasonable balance between encouraging creation of artistic works by allowing creators to profit from them, and the interests of wider society in benefiting from the free availability of creative works (or even of facts) aren’t having a good day.

Larrikin vs Australian Music

Skud has covered this over at Save Aussie Music:

Today EMI Australia lost their High Court appeal against Larrikin Music in the Kookaburra/Land Down Under case…

Leaving aside the problems with the copyright system, let’s just take a moment to look at Larrikin, the folk music label that holds the rights to “Kookaburra”. Larrikin was founded in 1974 by Warren Fahey, and sold to Festival Records in 1995. Festival, owned by Murdoch, was shut down and its assets sold to Warner Music Australia in 2005, for a mere $12 million.

Larrikin was home to a number of Australian artists, among them Kev Carmody, Eric Bogle, and Redgum

Kev Carmody, one of Australia’s foremost indigenous musicians, released four albums on Larrikin and Festival between 1988 and 1995, none of which are available on iTunes nor readily available as CDs (based on a search of online retailers). …

Warner bought Larrikin Records’ assets — two decades of Australian music — not because they want to share the music with the public, but to bolster their intellectual property portfolio, in the hope that one day they’ll be able to sue someone for using a riff or a line of lyrics that sounds somewhat like something Redgum or Kev Carmody once wrote. They do this at the expense of Australian music, history, and culture.

Lauredhel covered the case earlier at Hoyden too, focussing on whether the claim of infringement stands up to a legal layperson’s listen test and musical analysis: You better run, you better take cover.

Astrologers versus software creators and users

Have you ever selected your timezone from a list which lists them like this: “Australia/Sydney”, “Europe/London”? Then you’ve used the zoneinfo database.

Timezones are complicated. You can’t work out what timezone someone is in based purely on their longitude; have a look at this map to see why. Timezones are highly dependent on political boundaries. On top of that, daylight savings transitions are all over the map (as it were). Some countries transition in an unpredictable fashion set by their legislature each year. Sometimes a sufficiently large event (such as the Sydney Olympics in 2000) causes a local daylight savings transition to happen earlier or later than that government’s usually predictable algorithm.

Therefore computer programs rely heavily on having a giant lookup table of timezones and daylight saving transitions. Data is needed both for the present, so that your clock can be updated, and for the past, so that the time of events ranging from blog entries to bank transactions can be correctly reported.

A great deal of software, including almost all open source software, relies on the freely available database variously called the tz database, the zoneinfo database or the Olson database.
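To make the lookup-table idea concrete, here is a minimal sketch in Python. It assumes Python 3.9 or later and a copy of the tz database that the standard zoneinfo module can find (the system one, or the tzdata package). The helper name utc_offset_hours is mine, purely for illustration. It asks the database what offset from UTC applied in a zone at a given local date, including historical dates such as September 2000, when Sydney’s daylight saving started early for the Olympics.

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo  # standard library from Python 3.9


def utc_offset_hours(zone_name, year, month, day, hour=12):
    """Return the UTC offset, in hours, in force in a zone at a local time.

    Hypothetical helper for illustration: the answer comes entirely from
    the tz database's table of zones and daylight saving transitions.
    """
    local = datetime(year, month, day, hour, tzinfo=ZoneInfo(zone_name))
    return local.utcoffset() / timedelta(hours=1)


# Mid-September in an ordinary year versus the Olympic year.
sydney_1999 = utc_offset_hours("Australia/Sydney", 1999, 9, 15)
sydney_2000 = utc_offset_hours("Australia/Sydney", 2000, 9, 15)
london_2011 = utc_offset_hours("Europe/London", 2011, 1, 15)

print("Sydney, 15 Sep 1999:", sydney_1999)
print("Sydney, 15 Sep 2000:", sydney_2000)
print("London, 15 Jan 2011:", london_2011)
```

With a reasonably current database, the two Sydney offsets should differ by an hour. No formula based on longitude or on the usual rule would tell you that, which is why the historical data matters as much as the current data.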

Arthur David Olson (the “Olson” in “Olson database”) announced yesterday:

A civil suit was filed on September 30 in federal court in Boston; I’m a defendant; the case involves the time zone database.

The ftp server at elsie.nci.nih.gov has been shut down.

The mailing list will be shut down after this message.

The basis of the suit is that the zoneinfo database credits The American Atlas as a source of data, and The American Atlas has been purchased by astrology company Astrolabe Inc, who assert that the use of the data is an infringement of their copyright. Whether this is true is apparently highly arguable (in the US it seems to hinge on whether it’s a list of facts, which aren’t copyrightable) but in the meantime the central distribution point of the data is gone. And it could be a long meantime.

Now, people still have copies of the database (if you run Linux you probably do yourself). However, the source of updates has been removed, which means it will be out of date within a few weeks, and the community that created the updates has been fractured. Various people are doing various things, including a defence fund, a fork of the mailing list, and discussions about re-creating or resurrecting the data in other places. All a great waste of many creative people’s time and money, gain to society from Astrolabe’s action yet to be shown.

Update (Oct 17): ICANN takes over zoneinfo database

On 14th October the Internet Corporation for Assigned Names and Numbers (ICANN), which manages key Internet resources (notably, the global pool of IPv4 and IPv6 addresses) on behalf of the US government, put out a press release (PDF) announcing that they were taking over the zoneinfo database:

The Internet Corporation for Assigned Names and Numbers (ICANN) today took over operation of an Internet Time Zone Database that is used by a number of major computer systems.

ICANN agreed to manage the database after receiving a request from the Internet Engineering Task Force (IETF).

The database contains time zone code and data that computer programs and operating systems such as Unix, Linux, Java, and Oracle rely on to determine the correct time for a given location. Modifications to the database occur frequently throughout the year…

“The Time Zone Database provides an essential service on the Internet and keeping it operational falls within ICANN’s mission of maintaining a stable and dependable Internet,” said Akram Atallah, ICANN’s Chief Operating Officer.

I wonder if ICANN’s not-for-profit status is useful here. Just as Project Gutenberg can make United States public domain texts available globally, even though texts published prior to 1923 are not public domain world-wide, ICANN may present a less tempting target for lawsuits than other possible homes for the zoneinfo database.

Book review: In the Plex

Steven Levy, In the Plex: How Google Thinks, Works, and Shapes Our Lives

This book started off annoying me by being a little too worshipful of Larry Page and Sergey Brin, in my opinion. So clever! So Montessori! These cheeky little geniuses will rock your world! They’re going to take over your brain and you’re going to like it! But it improved early on, compared with other histories I’ve read of Google (lest this sound like an unfortunately dull hobby of mine, I mean shorter essays over a period of ten years or so), which tend to focus on a couple of things heavily: the Google Doodles and their approach to raising venture capital. I’ve heard about all I ever want to hear about doodles and Google’s fundraising. Levy doesn’t quite stay away from the latter but it’s mercifully short at least. Instead he gets into things that are more interesting to me, namely the engineering.

He spends a fair portion of the book getting to grips with the basic design of and use-cases of the two key Google products, search and ads, in a way that’s useful to me as someone with a software engineering background, so that was a win. I’m not sure how that would read to people without said background although it didn’t strike me as very technical. Later it deals with some of Google’s key expansions: the creation of its massive set of data centres, the Youtube acquisition, the attempt to become a major search player in China, book scanning and search, and finally, social.

I’ll certainly give Levy credit for finally explaining to me the wisdom that Google “doesn’t get social”, which I hear everywhere and which no one has ever given me a bite-sized cogent explanation for. (This is a terrible admission from someone who is meant to have some idea about the tech industry, yes? But I’m not really your go to person for social either. I use it, but I don’t make sweeping claims about it.) Levy’s bite-sized explanation: Google is philosophically committed to the best answers arising from processing huge amounts of data, and is resistant to cases where the best answers arise from polling one’s friends. Whether it’s true I have no idea but at least it’s truthy.

Levy has created a good history of Google for people especially interested in Google I think, but he largely hasn’t jumped over the bar of making Google into an interesting story for people who don’t have an existing interest in it, in the way that people have done with Enron, for example. There are parts of it that start to get close, particularly the treatment of Google’s expansion into China and its sometime Beijing office. But it’s not quite there. Possibly Levy didn’t have access to enough critical sources, or, if he did, he didn’t use them to their full extent for fear of jeopardising his access to Page, Brin and Eric Schmidt and to the Google campus. (Also, it sounds like Google makes it very hard for any current employee to be an anonymous source.)

Read it if: you are interested in the history of Google, and find them impressive. You don’t need to be a complete fanboy.

Caution for: as noted, not really a book for people seeking a rollicking good story of corporate ups and downs in general; and not really for people looking for really sharp criticism of Google either, although his critical distance certainly increases as the book goes on.

Anti-pseudonym bingo

This article originally appeared on Geek Feminism.

People testing the Google+ social network are discussing increasing evidence that, terms of service requirement or not, Google+ wants people to use their legal names much as Facebook does. Skud shares a heads-up from a user banned for using his initials. Then, for example, see discussion around it on Mark Cuban’s stream, Skud’s stream and Sarah Stokely’s blog.

Let’s recap really quickly: wanting to and being able to use your legal name everywhere is associated with privilege. Non-exhaustive list of reasons you might not want to use it on social networks: everyone knows you by a nickname; you want everyone to know you by a nickname; you’re experimenting with changing some aspect of your identity online before you do it elsewhere; online circles are the only place it’s safe to express some aspect of your identity, ever; your legal name marks you as a member of a group disproportionately targeted for harassment; you want to say things or make connections that you don’t want to share with colleagues, family or bosses; you hate your legal name because it is shared with an abusive family member; your legal name doesn’t match your gender identity; you want to participate in a social network as a fictional character; the mere thought of your stalker seeing even your locked down profile makes you sick; you want to create a special-purpose account; you’re an activist wanting to share information but will be in danger if identified; your legal name is imposed by a legal system that doesn’t match your culture… you know, stuff that only affects a really teeny minority numerically, and only a little bit, you know? (For more on the issue in general, see On refusing to tell you my name and previous posts on this site.)

Anyway, in honour of round one million of forgetting about all of this totally, I bring you anti-pseudonymity bingo!
5x5 bingo card with anti-pseudonymity arguments
Text version at bottom of post.

What squares would you add?

Ask Auntie Hoyden: get your dog outlines here, and other search engine queries

This article originally appeared on Hoyden About Town.

an on-set photo of Katharine Hepburn, with overlaid text reading "Ask A Hoyden?"

Why, I enjoyed those posts (1, 2, 3) in which Lauredhel tried to answer search queries as questions too! So much so that I show up in this site’s logs looking for them. So today, I too become Auntie Hoyden.

Frankly, it appears to me that most people stop here on their way to I Can Haz Cheezburger (funny cat pictures, captioned cat pictures, supernatural macros funny, funny pics), but they also appear looking for soylent green simpsons, any medicine for truth speak and, in considerable numbers, anal sex diagram. (Which is a bit odd, Google finds plenty of considerably more helpful sites for me on that term.)

But let’s see what I can do for you all today, although my specialities are more in the computer line than the sexual health and breastfeeding line that is traditional for this.

pluralising names

Lauredhel observed in 2008 that there’s a construction in Australian English (among others) that allows you to use things like “the Marys of the world” to mean “people like one particular Mary” rather than necessarily literally multiple people named Mary.

But if you’re simply interested in how to add a suffix to a proper noun in order to indicate multiple things with that name, here’s a style guide’s answer.

dog outline png

[Update 2019: freesvg.org or publicdomainvectors.org are currently more searchable than openclipart.org.]

I like openclipart.org for this sort of thing: it’s public domain clipart, take it, use it and modify it without credit. (Not that I don’t also love various Creative Commons licences that do require credit, but dropping that requirement makes using many pieces a lot easier. I’ve seen people who use CC images from Flickr need to put a credits roll at the end of slide presentations.)

Plus! openclipart.org provides SVG as well as PNG. SVG (Scalable Vector Graphics) is a free image format which allows pictures to be scaled up in size without loss of quality, as well as down in size. This is accomplished by describing an image in terms of lines or curves (hence, vectors) rather than in terms of individual coloured dots. It’s not very useful for photos, but it’s great for clipart (and fonts, which are generally described in vectors and thus can be scaled up).

openclipart.org has hundreds of drawings of dogs. I’m not sure if this person was looking for a silhouette of a dog, which I couldn’t find on a very brief look, or simply a line drawing of a dog, of which there are many. Here’s a cute one.

snuggle otter

Don’t. We love ’em but that doesn’t make them domesticated pets.

mother sprays milk

Does she ever. I breastfeed a ten month old baby. When he was little I had a supply suitable for twins, or perhaps sextuplets, and milk was everywhere. Then things balanced out and we had a nice interlude of not spraying. Then he got a bit more distractible, which resulted on the weekend in him getting a letdown, pulling off and slipping so that he headbutted me in the breast and milk shot out for the best part of a metre.

If anyone else wants to grace the Internet with a milk spray story, feel free.

australia’s prime minister 2010 smiling

On the 25th June 2010, one day after becoming leader of the government and being sworn in as Prime Minister, Julia Gillard smiled in the presence of the US ambassador to Australia, Jeff Bleich. This is important, because it was photographed by embassy staff, and as a work of the US Federal Government, it is thus in the public domain and you can get it from Wikimedia Commons.

anti filter

That’s us!

Can you help out with these?
squirrel give thanks
the worst shoe eveeeer
placenta accreta deathrate
crivens!

larch: migrating mail between IMAP accounts

I recently had to move several gigabytes of email (not my own, work-related) into Google Apps (Gmail). As best I can tell, the way most people do this is that they grit their teeth and they open up a graphical email client and drag folders one-by-one. It’s a one-off job for most people.

There were a couple of reasons I didn’t want to do that. One is that I was on my parents’ DSL connection at the time and pushing gigabytes of data through someone’s DSL is a violation of good guest principles, at least in Australia. The other is that we have over 500 folders in the account I’m talking about: that’s a lot of mouse pain.

Anyway, here’s your answer, if you are in the same position as me. After substantial searching, at least for this kind of tool, I came across larch, which is a Ruby command-line tool for IMAP-to-IMAP moves, most tested on Gmail and essentially designed for the “move my mail archives into Gmail” use-case. It’s much more mature than most of the one-off scripts people have thrown up on the ‘net. It certainly seemed robust over this volume of mail, although I did have to run it a couple of times to get past a few errors (it does not re-copy already copied mail, so re-runs are fast). It deserves more search juice.

If you wanted to keep two accounts permanently in sync, offlineimap would be the tool of choice, although the manual still seems to regard IMAP-to-IMAP syncing as not as robustly tested as its core mode of operation, which is IMAP-to-Maildir.

Creative Commons License
larch: migrating mail between IMAP accounts by Mary Gardiner is licensed under a Creative Commons Attribution 4.0 International License.

Viewing attachments when using mutt remotely

Yes, that’s right, I’m still in the dark ages and do not yet use Gmail for my email. Even though it has IMAP and everything. I still use Mutt.

I almost always use Mutt locally, using offlineimap to sync IMAP folders to local maildirs. This means I don’t usually have the problem of being unable to view non-text attachments. However, for the next little while I’ll be using Mutt on a remote connection.

Don Marti has one solution to this, which assumes that you are accessing the server with Mutt on it via SSH (probably true) and are easily able to create a tunnel to your local machine, which is trivial if you are using a commandline ssh client, but while you can do it with PuTTY I figured it was just annoying enough that I might not bother. (And I doubt you can do it at all with those web-based SSH clients.)

My alternative assumes instead that you have a webserver on the remote machine that has mutt on it. It then just copies the attachment to a web-accessible directory, and tells you the URL where you’ll be able to find the attachment. It’s thus a very trivial script (and I doubt very much it’s the only one out there), but perhaps using mine might save you fifteen minutes over coming up with your own, so here it is:

copy-to-dir.sh (in a git repo)

Sample output is along these lines when you try to view an attachment in Mutt:

View attachment SOMEPDF.pdf at http://example.com/~user/SOMEPDF.pdf
Press any key to continue, and delete viewable attachment

In order to use it, you need to:

  1. copy the script to the remote machine where you use mutt;
  2. make it executable;
  3. edit it to set the OUTPUTDIR and VIEWINGDIR variables to something that works for your setup;
  4. set up a custom mailcap file much like the one Don Marti gives, that is, put something like this in your ~/.mutt-mailcap:
     text/*; PATH-TO-SCRIPT/copy-to-dir.sh %s
     application/*; PATH-TO-SCRIPT/copy-to-dir.sh %s
     image/*; PATH-TO-SCRIPT/copy-to-dir.sh %s
     audio/*; PATH-TO-SCRIPT/copy-to-dir.sh %s
  5. set mailcap_path = ~/.mutt-mailcap in your ~/.muttrc file.

Something like this probably could work for Pine and other text-based email clients used remotely too, but I’m not sure how because I don’t use them. And if someone wants to document this in a way that assumes less pre-existing knowledge, go ahead.

Also, making your attachments web-accessible means that they are, well, web-accessible. I’ve set up an HTTP Auth-protected directory using https for this; you should think about your own setup too.
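For anyone who would rather read the idea than download the shell script, here is a rough sketch of the same approach in Python. It is not the script linked above, only my reconstruction of the behaviour described: copy the attachment into a web-served directory, print the URL, wait for a keypress, then delete the copy. The OUTPUTDIR and VIEWINGDIR values and the publish function name are placeholders I have made up; substitute whatever suits your server.

```python
import shutil
import sys
from pathlib import Path
from urllib.parse import quote

# Assumed values for illustration only: a directory your webserver serves,
# and the URL prefix under which that directory is visible.
OUTPUTDIR = Path.home() / "public_html" / "attachments"
VIEWINGDIR = "https://example.com/~user/attachments"


def publish(attachment, output_dir, base_url):
    """Copy an attachment into output_dir and return (copied path, URL)."""
    source = Path(attachment)
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    target = output_dir / source.name
    shutil.copyfile(source, target)
    target.chmod(0o644)  # the webserver needs to be able to read the copy
    url = f"{base_url.rstrip('/')}/{quote(source.name)}"
    return target, url


def main(argv):
    """Entry point when called from a mailcap line with the file as %s."""
    target, url = publish(argv[1], OUTPUTDIR, VIEWINGDIR)
    print(f"View attachment {target.name} at {url}")
    try:
        input("Press Enter to continue, and delete viewable attachment ")
    finally:
        target.unlink()  # remove the web-accessible copy again


if __name__ == "__main__" and len(sys.argv) == 2:
    main(sys.argv)
```

Deleting the copy in a finally block means it is removed even if you interrupt the prompt, which shortens the window in which the attachment is sitting on the web.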

Creative Commons License
Viewing attachments when using mutt remotely by Mary Gardiner is licensed under a Creative Commons Attribution 4.0 International License.

Clean up IMAP folders

As Matt Palmer notes in his blog entry OfflineIMAP and Deleting Folders, users of any mail sorting recipe that creates new mail folders a lot tend to find that, over time, they accumulate many mail folders for, eg, email lists they are no longer subscribed to. Most IMAP clients will then waste time checking those folders for new mail all the time.

Matt wrote:

Now, of course, someone’s going to point me to a small script that finds all of your local empty folders and deletes them locally then issues an IMAP “delete folder” command on the server. But I had fun working all this out, so it’s not a complete waste.

I haven’t quite done this, I’ve just written a script that detects and deletes empty remote folders. (For me, offlineimap does not have the behaviour of creating new remote folders, so I haven’t bothered cleaning up local folders.)

It’s good: it’s speeding up my mail syncs a whole lot, deleting these old folders I haven’t received mail in for about five years. I’ve got full details and the script available for download (as you’d expect, it’s short): Python script to delete empty IMAP folders.
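If you just want the shape of the approach, here is a minimal sketch. It is not the script linked above, and the function names are my own. It assumes a connection object that behaves like the standard library’s imaplib.IMAP4 (list, select and delete methods), treats a folder as empty when a read-only SELECT reports zero messages, and defaults to a dry run because deleting folders is not something you want to get wrong.

```python
import re

# One line of an IMAP LIST response looks like: (flags) "delimiter" name
LIST_LINE = re.compile(
    rb'\((?P<flags>[^)]*)\) (?P<delim>"[^"]*"|NIL) (?P<name>.+)'
)


def find_empty_folders(conn):
    """Return the names of selectable remote folders holding no messages.

    `conn` is assumed to offer imaplib.IMAP4-style list() and select().
    Names are returned exactly as the server listed them, quotes included,
    so they can be passed straight back to the server.
    """
    typ, lines = conn.list()
    if typ != "OK":
        raise RuntimeError("LIST command failed")
    empty = []
    for line in lines:
        if not isinstance(line, bytes):
            continue  # skip None and literal-style responses in this sketch
        match = LIST_LINE.match(line)
        if match is None:
            continue
        if b"\\noselect" in match.group("flags").lower():
            continue  # container-only folder, it cannot hold mail
        name = match.group("name").decode()
        if name.strip('"').upper() == "INBOX":
            continue  # never offer the inbox for deletion
        typ, data = conn.select(name, readonly=True)
        if typ == "OK" and int(data[0]) == 0:
            empty.append(name)
    return empty


def delete_folders(conn, names, dry_run=True):
    """Delete the named folders, or only report them when dry_run is set."""
    for name in names:
        if dry_run:
            print(f"would delete {name}")
        else:
            conn.delete(name)
```

To use it for real you would create the connection with something like imaplib.IMAP4_SSL("imap.example.com") (a hypothetical hostname), log in, and pass it to find_empty_folders. Some servers may refuse to delete the currently selected folder or a folder that still has subfolders, so check the dry-run output before passing dry_run=False.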

Creative Commons License
Clean up IMAP folders by Mary Gardiner is licensed under a Creative Commons Attribution 4.0 International License.