I’ve been pretty interested in codes of all sorts, both cryptographic codes and the codes used to provide error detection and correction. While I’ve played around quite a bit with convolutional codes, I’ve never really bothered to develop more than a cursory understanding of Reed-Solomon error-correcting codes, which have been widely applied in applications such as data storage (most notably compact discs) and satellite data transmission. It’s very cool technology.
Luckily, super smart guy Russ Cox has a very nice writeup on the theory and practice of Reed-Solomon codes. Previously, Russ posted about fast regular expression matching, which you might have read (or should have). I really like his posts: he has a clear, pragmatic, but not dumbed-down approach to both the theory and practice of programming. The series promises to continue in future posts; I’ll be eagerly awaiting future installments.
I’m interested in these as well. wua.la does storage tricks with them, and I’ve used them for par2 files. It’s a bummer that they’re computationally expensive, both to create and to regenerate parity. I’m not schooled enough to follow the math.
http://www.youtube.com/watch?v=3xKZ4KGkQY8
As useful as Reed-Solomon codes are, modern practice is shifting to Gallager codes, also known as LDPC codes (low-density parity check).
As the name suggests, they’re based on parity checks, which makes them conceptually very simple. It also means that encoding is extremely efficient, especially in hardware. Designing a good code is complex, but it only has to be done once. Decoding (in the general, maximum-likelihood sense) is NP-hard in theory, but the iterative decoders used in practice run in time that depends on how damaged the block is.
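To make the parity-check idea concrete, here’s a minimal sketch in Python. It uses a tiny (7,4) Hamming code as a stand-in, since a real LDPC code would use a much larger and sparser parity-check matrix; the matrix `H`, the `encode` helper, and the bit values are all just illustrative.

```python
# Minimal parity-check sketch: a tiny (7,4) code standing in for a
# real LDPC code (which would have a much larger, sparser H).
import numpy as np

# Each row of H is one parity check over a few bit positions.
H = np.array([[1, 1, 0, 1, 1, 0, 0],
              [1, 0, 1, 1, 0, 1, 0],
              [0, 1, 1, 1, 0, 0, 1]])

def encode(data):
    """Append parity bits so every row of H sums to 0 (mod 2)."""
    d1, d2, d3, d4 = data
    parity = [(d1 + d2 + d4) % 2,
              (d1 + d3 + d4) % 2,
              (d2 + d3 + d4) % 2]
    return np.array(list(data) + parity)

def syndrome(codeword):
    """A zero syndrome means every parity check is satisfied."""
    return H.dot(codeword) % 2

c = encode([1, 0, 1, 1])
print(syndrome(c))   # [0 0 0] -- all checks pass

c[2] ^= 1            # flip one bit to simulate channel noise
print(syndrome(c))   # [0 1 1] -- the failing checks locate the error
```

The point is just that encoding is a handful of XORs per parity bit, which is why the hardware is so cheap, and that a nonzero syndrome tells the decoder which checks failed and hence where to look for the damage.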
Like Reed-Solomon codes, LDPC codes can achieve performance arbitrarily close to the Shannon limit for a binary symmetric channel as the block size increases. Unlike Reed-Solomon codes, the practical penalty for increasing the block size is much smaller.
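For reference, the Shannon limit being referred to here is the capacity of the binary symmetric channel with crossover probability $p$ (the $p = 0.1$ below is just an example value):

$$C = 1 - H_2(p), \qquad H_2(p) = -p \log_2 p - (1-p)\log_2(1-p)$$

At $p = 0.1$, $H_2(0.1) \approx 0.469$, so $C \approx 0.531$ bits per channel use; “arbitrarily close to the Shannon limit” means the code rate can approach this $C$ while the probability of a decoding error still tends to zero.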