On Audio Blogging

I’ve become interested in the idea of audio blogging, or more generally, multimedia blogging. Ideally, I want a way to compose and post new weblog entries from a mobile location, entries which might include sound, pictures, video and just plain old text.

Ideally, this could all be done with a single gadget. I’ve been researching PDAs for the last couple of weeks, trying to find a reasonably inexpensive but useful gadget which can serve this purpose without costing as much as a full laptop. Unfortunately, the $300-$400 price limit I’ve placed upon this project yields only devices which are compromises. The two most promising devices appear to be:

The Dell Axim X30
This PDA currently costs $314, includes a 624MHz processor and has both 802.11b and Bluetooth wireless. If it had a camera, it would be terrific; as it is, it’s tempting.
Palm Zire 72
This one sells for $260 or so from amazon.com, includes a 1.3 megapixel camera, can shoot both still pictures and video, but sadly includes only Bluetooth. I’d really like to be able to take advantage of open wireless connections to send email and the like. Supposedly a WiFi card is coming, but that will consume the expansion slot, and it’s unclear if the WiFi card will include some extra flash memory.

If I expand the allowable price range to $600 (ouch), then some other possibilities arise, with some other unique capabilities. Most notably, I rather like:

Asus MyPal A730
As yet unreleased, this PDA appears to be nearly ideal. Not only does it include Bluetooth, 802.11b and IR connections, it has a VGA resolution screen, a one megapixel camera and both SD and CompactFlash memory slots. Very nice.

Okay, so let’s imagine I’ve got one of these devices. I can use it to compose audio files, snap pictures, or compose text entries. I can then send these using ordinary email to my server, and they will automatically be converted into weblog postings. That seems pretty cool, as these devices are small and convenient, and will lend themselves well to increased impromptu blogging.
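The email-to-weblog step could be sketched roughly as follows. This is only a minimal illustration using Python's standard `email` package, not a description of how my server actually does it: the function name `email_to_entry` and the shape of the returned dictionary are made up for the example, and a real version would also need to check who sent the message before publishing anything.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def email_to_entry(raw_bytes):
    """Turn one raw email into a dict describing a weblog entry.

    Hypothetical helper: the keys "title", "text" and "media" are
    assumptions made for this sketch, not an existing weblog format.
    """
    msg = message_from_bytes(raw_bytes, policy=policy.default)

    # The plain-text body (if any) becomes the text of the entry.
    body_part = msg.get_body(preferencelist=("plain",))
    text = body_part.get_content().strip() if body_part is not None else ""

    # Every attachment (audio clip, photo, video) is carried along as media.
    media = []
    for part in msg.iter_attachments():
        media.append({
            "filename": part.get_filename(),
            "content_type": part.get_content_type(),
            "data": part.get_payload(decode=True),
        })

    return {
        "title": str(msg["Subject"] or "(untitled)"),
        "text": text,
        "media": media,
    }


if __name__ == "__main__":
    # Build a sample message the way a PDA mail client might send it.
    sample = EmailMessage()
    sample["Subject"] = "Audio post from the park"
    sample.set_content("Recorded this on the walk home.\n")
    sample.add_attachment(b"\x00\x01\x02\x03", maintype="audio",
                          subtype="wav", filename="note.wav")

    entry = email_to_entry(sample.as_bytes())
    print(entry["title"], [m["filename"] for m in entry["media"]])
```

The appeal of this design is that the PDA needs nothing more exotic than a mail client; all of the weblog-specific logic lives on the server.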

But what else can we do? Adam Curry has made a plea for some software that would let him act as a DJ: talk, insert bumper music and songs, and record the result as a high-quality mp3 file for later streaming. This immediately reminded me of a similar program that Tom Duff wrote several years ago. Tom was nominated to be the sound engineer for a theater production, and spent three weeks coding up a nice little application that allowed you to preload a bunch of sound effects and “perform” them by hitting keys on a control console. The application handled all the scheduling and mixing internally. It shouldn’t be too difficult to do that on a modest PDA these days. Curry suggests that the Studio365 interface is nearly ideal, and staring at it, it does seem like a good starting point.

There are lots of remaining questions, though. While modern PDAs are capable of decoding mp3 files in real time without great difficulty, it appears that they aren’t really up to encoding in real time (I’d love to be wrong about this; if anyone knows more, let me know). That means we would have to store the uncompressed sound files and then compress them as a post-process. Assuming that we want to keep such posts short (limited to, say, 15 minutes) and that the highest quality isn’t required (say 22050 Hz audio, 16-bit, mono), that works out to about 40MB per post. That’s doable, but not ideal, particularly if you have to do the final encoding on a desktop, since sending 40MB of audio over WiFi or Bluetooth isn’t all that fast. Some more headscratching and research is clearly needed here.
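The storage estimate above is easy to check with a few lines of arithmetic. The sketch below computes the size of uncompressed PCM audio and a best-case transfer time. The function names are mine, and the link speeds are assumptions chosen only for illustration (roughly 700 kbit/s for Bluetooth and the nominal 11 Mbit/s for 802.11b); real throughput will be lower than either.

```python
def raw_pcm_bytes(sample_rate_hz, bits_per_sample, channels, seconds):
    """Size in bytes of uncompressed PCM audio."""
    bytes_per_sample = bits_per_sample // 8
    return sample_rate_hz * bytes_per_sample * channels * seconds


def transfer_seconds(size_bytes, link_bits_per_second):
    """Best-case transfer time, ignoring protocol overhead."""
    return size_bytes * 8 / link_bits_per_second


if __name__ == "__main__":
    # 15 minutes of 22050 Hz, 16-bit, mono audio.
    size = raw_pcm_bytes(22050, 16, 1, 15 * 60)
    print(size)  # 39690000 bytes, i.e. about 40MB

    # Assumed link speeds, for illustration only.
    for name, rate in (("Bluetooth", 700_000), ("802.11b", 11_000_000)):
        minutes = transfer_seconds(size, rate) / 60
        print(f"{name}: about {minutes:.1f} minutes at best")
```

Even with these optimistic numbers, the Bluetooth transfer takes several minutes per post, which is why encoding on the device itself would be so attractive.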

Perhaps the greatest problem of audio blogs is one of indexing: it’s very difficult to provide searchable content when given only an audio stream. Currently my idea is simply to tag audio posts with some searchable keywords, and also to limit posts to a modest 5 to 15 minutes. I’ve thought about trying to do automated speech recognition to provide a searchable transcript, but given that I want to post from remote locations with the possibility of considerable ambient noise, I doubt that would be entirely successful. It won’t help to have a huge number of audio blog entries with no ability to find one that you found particularly compelling.
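The keyword-tagging idea amounts to a small inverted index: a map from each keyword to the posts tagged with it. Here is a minimal sketch, assuming each audio post has some unique identifier. The class name `KeywordIndex`, the post ids and the keywords are all invented for the example.

```python
from collections import defaultdict


class KeywordIndex:
    """Maps lowercase keywords to the ids of the posts tagged with them."""

    def __init__(self):
        self._posts_by_keyword = defaultdict(set)

    def add(self, post_id, keywords):
        """Record that post_id is tagged with each of the given keywords."""
        for keyword in keywords:
            self._posts_by_keyword[keyword.strip().lower()].add(post_id)

    def search(self, *keywords):
        """Return the sorted ids of posts tagged with ALL given keywords."""
        if not keywords:
            return []
        matches = [
            self._posts_by_keyword.get(keyword.strip().lower(), set())
            for keyword in keywords
        ]
        return sorted(set.intersection(*matches))


if __name__ == "__main__":
    index = KeywordIndex()
    # Hypothetical post ids and tags.
    index.add("2004-08-30-a", ["PDA", "audio", "Bluetooth"])
    index.add("2004-09-02-a", ["audio", "mp3"])
    print(index.search("audio"))
    print(index.search("audio", "pda"))
```

This only finds what I remembered to tag, of course, but it costs almost nothing and doesn’t care how noisy the recording was.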

This is as far as my thinking takes me this morning. Feel free to comment below.