Importing a large blog to WordPress.com: WXR splitting tools

I am about to import a very large WordPress blog (not this one) to WordPress.com.

There are two issues:

1. The WXR (WordPress eXtended RSS) export from the site is 105MB uncompressed and 22MB compressed (with gzip -9). This is too large to upload to WordPress.com, which only accepts uploads of 15MB at most.

2. This site has 4000 media file uploads (and 6000 posts). The original host is going away: those 4000 media files (mostly images) must also be imported into WordPress.com.

The obvious solution to #1 is to split the upload into multiple files. However, I have just tested on WordPress.com, and it will only change the post contents to refer to the imported copy of a media file (rather than the original externally hosted copy, which is about to go away) if the media file and the post are uploaded in the same XML file. The scripts that I’ve found that will split WXR files into multiple XML files do not attempt to put media files and the posts that refer to them in the same XML file (eg mainSplit.py doesn’t do this); they just split the contents of the export file up in the order they appear.

Anyone got leads on this one?
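
For anyone who wants to have a go, here is a rough, untested-on-real-data sketch in Python of the grouping step such a splitter would need. It assumes that each attachment item in the WXR file has a wp:post_type of "attachment" and a wp:post_parent holding the ID of the post it belongs with. That is only a guess at "the post that refers to it": a media file embedded in some other post would not be caught. The function names (local, child_text, group_items) are made up for this example, and the size check is only an estimate based on re-serialising each item.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def local(tag):
    """Strip any '{namespace}' prefix, so the WXR version does not matter."""
    return tag.rsplit("}", 1)[-1]


def child_text(item, name, default=""):
    """Text of the first direct child of `item` whose local name is `name`."""
    for child in item:
        if local(child.tag) == name:
            return child.text or default
    return default


def group_items(wxr, max_bytes):
    """Split the <item> elements of a WXR document into lists.

    Each list is meant to become one output file. An attachment is kept in
    the same list as the post named by its wp:post_parent (an assumption,
    see above). A family bigger than max_bytes still gets a list to itself.
    """
    channel = ET.fromstring(wxr).find("channel")
    items = channel.findall("item")
    known_ids = {child_text(item, "post_id") for item in items}

    families = defaultdict(list)  # parent post ID -> post plus attachments
    order = []                    # family keys in order of first appearance
    for index, item in enumerate(items):
        key = child_text(item, "post_id") or f"no-id-{index}"
        parent = child_text(item, "post_parent", "0")
        if child_text(item, "post_type") == "attachment" and parent in known_ids:
            key = parent
        if key not in families:
            order.append(key)
        families[key].append(item)

    # Greedy packing: start a new list whenever the next family would push
    # the current one over the (estimated) size budget.
    chunks, current, used = [], [], 0
    for key in order:
        family = families[key]
        size = sum(len(ET.tostring(item, encoding="utf-8")) for item in family)
        if current and used + size > max_bytes:
            chunks.append(current)
            current, used = [], 0
        current.extend(family)
        used += size
    if current:
        chunks.append(current)
    return chunks
```

This only decides which items go together. Writing each list back out as a valid WXR file (with the original channel header, authors, categories and so on) is not shown, and ElementTree does not keep CDATA sections or, unless told to, the original namespace prefixes when it re-serialises, so a real tool would need to take care there.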

Opt-in Creative Commons licencing plugin for WordPress?

Does anyone have a recommendation for an opt-in Creative Commons licencing plugin for WordPress? That is, one where the default state is not to CC licence something, but when some action is taken, an individual post or page can be so licenced.

As background: I have no desire to write, maintain, or even debug a WordPress plugin. I want to know if there is something for this use case that Just Works.

I want opt-in because it is too hard to remember (or to train others) to find an opt-out box when posting, and so one ends up CC licensing things that weren’t intended to be, or can’t be, released under such a licence.

Some options I’ve already looked into:

WP License reloaded: was pretty much exactly what I wanted, but doesn’t seem to be actively maintained and is now failing (possibly because the site in question is now served over SSL; I’m not sure, see above about not being interested in debugging).

Creative Commons Configurator: seems to be the most actively maintained CC plugin, but seems to be opt-out, and even that was only introduced recently.

Creative Commons Generator: opt-out.

Easy CC License: perhaps what I want, although I’d rather do this with an options dialogue of some kind than a shortcode.

Gain world-wide fame and adoration*

WordPress has an annoying feature of its spam handling, namely that it shows you the entire spam content in the spam comment interface (where one must venture in order to rescue legitimate comments). This is how it works:

  1. look at first line of spam, agree that it is for sure spam
  2. scroll down
  3. scroll down
  4. … scroll down
  5. oh good, here’s definitely-spam message #2
  6. scroll down…

Two things that would help:

  1. a WordPress plugin that reduced spam to short expandable excerpts
  2. an update to the Akismet Auntie Spam Greasemonkey script to make it work with current versions of WordPress

Consider your path to world-wide fame and adoration laid out before you. (I am technically capable of doing both those options, but I don’t have time for a new path to glory right now.)

* Offer may not be valid to residents of South Australia.