Victim of your own podcasting success?

I passed another minor milestone: according to my script which peruses my http logfiles for downloads of my podcasts, over 100 unique IP addresses have accessed at least one of my audio podcasts. Kudos to me! Long live the king!
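For the curious, the kind of counting that script does can be sketched in a few lines of Python. This is a minimal sketch and not the actual script: it assumes the logs are in Apache's common or combined log format and that podcasts live under /audio/ and end in .mp3. The function name unique_podcast_ips is made up for illustration.

```python
import re

# Start of an Apache common/combined log line (assumed format):
# client IP, identity, user, [timestamp], "METHOD path protocol", status
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+)[^"]*" (?P<status>\d{3})'
)

def unique_podcast_ips(lines, prefix="/audio/", suffix=".mp3"):
    """Return the set of client IPs with at least one successful mp3 GET."""
    ips = set()
    for line in lines:
        m = LOG_LINE.match(line)
        if not m:
            continue  # skip lines that are not in the assumed format
        path = m.group("path").split("?", 1)[0]  # ignore any query string
        if (m.group("method") == "GET"
                and m.group("status") in ("200", "206")  # full or partial content
                and path.startswith(prefix)
                and path.endswith(suffix)):
            ips.add(m.group("ip"))
    return ips
```

Feeding it an open log file, as in len(unique_podcast_ips(open(path))), gives the unique-address count. One IP address is not necessarily one listener, so treat the number as a rough estimate.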

But success comes at a price: I host my webpages on my home machine (shh, don’t tell anyone) and occasionally in the last week, I’ve noticed that I can no longer get any work done on my home machines because the uplink of my cable modem becomes saturated. Even my son has noticed this: I suspect that this was the cause of at least one mysterious “power outage” on my webserver.

The problem is throttling: uplink for my home network is nominally about 300Kbits/second, or roughly 28Kbytes/second. This means that the average 10Mbyte podcast (mine have run a bit longer than this, but this is a back-of-the-envelope calculation) takes about six minutes to download, running my uplink flat out. That works out to about 10 downloads per hour, or about 250 per day. Once the number of downloads reaches half that, my uplink will probably be saturated over half the time, and it will be impossible for me to use my network for any other task. So I’ll be watching my logs to see how close I am to getting to this point.
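The back-of-the-envelope numbers can be checked with a few lines of Python. This is only a sketch: it takes the 28Kbytes/second and 10Mbyte figures above as given, assumes 1 Mbyte = 1000 Kbytes, and ignores protocol overhead. The function names are made up for illustration.

```python
UPLINK_KBYTES_PER_SEC = 28   # uplink figure quoted above
PODCAST_MBYTES = 10          # "average" podcast size quoted above

def download_seconds(size_mbytes, rate_kbytes_per_sec):
    """Seconds to send one file at the given rate (assumes 1 Mbyte = 1000 Kbytes)."""
    return size_mbytes * 1000 / rate_kbytes_per_sec

def max_downloads(period_seconds, size_mbytes, rate_kbytes_per_sec):
    """How many back-to-back downloads fit into a period at the given rate."""
    return period_seconds / download_seconds(size_mbytes, rate_kbytes_per_sec)

if __name__ == "__main__":
    secs = download_seconds(PODCAST_MBYTES, UPLINK_KBYTES_PER_SEC)
    print(f"{secs / 60:.1f} minutes per download")
    print(f"{max_downloads(3600, PODCAST_MBYTES, UPLINK_KBYTES_PER_SEC):.1f} downloads per hour")
    print(f"{max_downloads(86400, PODCAST_MBYTES, UPLINK_KBYTES_PER_SEC):.0f} downloads per day")
```

With these inputs it comes out to just under six minutes per download, about ten per hour, and roughly 240 per day, which matches the estimate above.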

But let’s imagine for a second that you don’t reach this point, and stabilize at some smallish number of subscribers, well within the capacity of your network pipes. How can you help reserve some of your total bandwidth to make sure you can get access to your machines without h-a-v-i-n-g t-o w-a-t-c-h c-h-a-r-a-c-t-e-r-s c-o-m-e o-u-t 1 p-e-r s-e-c-o-n-d?

Well, I figured out a way.

On yesterday’s blog, I mentioned that I was trying to figure out a way to throttle the bandwidth used by Apache to send mp3 files so that some guaranteed headroom would be available for things like ssh. Years ago, I ran thttpd, the throttling http server written by Jef Poskanzer. It was cool for many reasons (small size, fast, portable, easy to configure and secure) but had one unique feature: you could specify a file of wildcard patterns, each with a maximum and minimum bandwidth that you wanted the server to dedicate to URLs which matched that pattern. While thinking aloud on my blog, I wished that there was similar functionality in Apache. Switching back to thttpd outright isn’t an option: I run WordPress as my weblog, which requires PHP, and that’s something thttpd can’t handle, at least without experimental patches.

But then I realized there really was a simpler solution: use Apache to serve my weblog, and thttpd to serve mp3 files.

To make the idea more concrete, I store all my brainwagon.org audio files in a particular directory, which can be referenced as https://brainwagon.org/audio, so my enclosure URLs all look like https://brainwagon.org/audio/2004-10-04.mp3. Let’s imagine that I change the URL to https://brainwagon.org:8080/audio/2004-10-04.mp3, and run thttpd on port 8080 to serve the same files. Now, I can create a very simple throttle file:

# Slow down the download of mp3 files to keep from 
# choking our uplink.
#
**.mp3		20000		# 20K per second, even locally.

From the thttpd documentation:

Throttling is implemented by checking each incoming URL filename against all of the patterns in the throttle file. The server accumulates statistics on how much bandwidth each pattern has accounted for recently (via a rolling average). If a URL matches a pattern that has been exceeding its specified limit, then the data returned is actually slowed down, with pauses between each block. If that’s not possible (e.g. for CGI programs) or if the bandwidth has gotten way larger than the limit, then the server returns a special code saying ‘try again later’.

The minimum rates are implemented similarly. If too many people are trying to fetch something at the same time, throttling may slow down each connection so much that it’s not really useable. Furthermore, all those slow connections clog up the server, using up file handles and connection slots. Setting a minimum rate says that past a certain point you should not even bother – the server returns the ‘try again later’ code and the connection isn’t even started.

Cool!
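To get a feel for the "pauses between each block" idea, here is a toy sketch in Python. It is not how thttpd does it (thttpd is written in C and tracks a rolling average per pattern); it just paces a single transfer so that its average rate stays at or below a limit. The name throttled_chunks and the 4096-byte block size are my own choices for illustration.

```python
import time

def throttled_chunks(data, limit_bytes_per_sec, block_size=4096,
                     sleep=time.sleep, clock=time.monotonic):
    """Yield data in blocks, pausing so the average rate stays under the limit."""
    start = clock()
    sent = 0
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        # Earliest moment at which the bytes already sent fit under the limit.
        due = start + sent / limit_bytes_per_sec
        delay = due - clock()
        if delay > 0:
            sleep(delay)  # the "pause between each block"
        sent += len(block)
        yield block
```

A caller would write each yielded block to the socket. The sleep and clock parameters exist only so the pacing can be exercised with a fake clock instead of really waiting.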

Now, the maximum amount of bandwidth used by serving mp3 files will be 20Kbytes/sec, or roughly 70% of my total bandwidth. This should leave enough headroom for my interactive sessions to remain snappy. The headroom means that the download will take about eight minutes instead of only six, but we never max out the channel.
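The trade-off is easy to check. This is a small sketch using the figures from this post (28Kbytes/second uplink, 20Kbytes/second cap, 10Mbyte file), assuming 1 Mbyte = 1000 Kbytes; the function name cap_summary is made up.

```python
def cap_summary(uplink_kbytes_per_sec, cap_kbytes_per_sec, size_mbytes):
    """Return (share of uplink used, headroom in Kbytes/sec, minutes per download)."""
    share = cap_kbytes_per_sec / uplink_kbytes_per_sec
    headroom = uplink_kbytes_per_sec - cap_kbytes_per_sec
    minutes = size_mbytes * 1000 / cap_kbytes_per_sec / 60
    return share, headroom, minutes

if __name__ == "__main__":
    share, headroom, minutes = cap_summary(28, 20, 10)
    print(f"mp3 traffic may use {share:.0%} of the uplink")
    print(f"{headroom} Kbytes/sec left over for everything else")
    print(f"{minutes:.1f} minutes per download at the cap")
```

Keep in mind that the cap only covers what thttpd serves. Anything Apache sends still competes for the remaining 8Kbytes/second or so.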

I’ll probably start serving my files this way beginning with my next audio post.