Category Archives: Computer Science

Design of a simple ALU…

A couple of weeks ago, I noticed a bunch of links to a 16 bit ALU designed to operate using blocks which are defined in the game Minecraft. It got me thinking, and ordered the book that inspired that work. It contains the specification for an ALU which is very simple, and yet surprisingly powerful ALU (at least, considering the number of gates that it would take to implement it).

Here’s a C function that implements the ALU. I went ahead and just used the integer type, rather than the 16 bits specified in the Hack computer, but it doesn’t really matter what the word size is.

[sourcecode lang=”C”]
#define ZERO_X (1<<0)
#define COMP_X (1<<1)
#define ZERO_Y (1<<2)
#define COMP_Y (1<<3)
#define OP_ADD (1<<4)
#define COMP_O (1<<5)

unsigned int
alu(unsigned int x, unsigned int y, int flags)
{
unsigned int o ;

if (flags & ZERO_X) x = 0 ;
if (flags & ZERO_Y) y = 0 ;
if (flags & COMP_X) x = ~x ;
if (flags & COMP_Y) y = ~y ;
if (flags & OP_ADD)
o = x + y ;
else
o = x & y ;
if (flags & COMP_O) o = ~o ;
return o ;
}
[/sourcecode]

The function takes in two operands (x and y) and a series of six bit flags. The ZERO_X and ZERO_Y say that the X input and Z input should be set to zero. The COMP_X, COMP_Y, and COMP_O flags say that the X, Y, and output values should be bitwise negated (all the zeros become ones, and the ones zeros). The OP_ADD flag chooses one of two functions: when set, it means that the two operands should be added, otherwise, the two operands are combined with bitwise AND.

There are a surprisingly large number of interesting functions that can be calculated from this simple ALU. Well, perhaps not surprising to the hardware engineers out there, but I’ve never really thought about it. It’s obvious that you can compute functions like AND and ADD. Let’s write some defines:

[sourcecode lang=”C”]
#define AND(x,y) alu(x, y, 0)
#define ADD(x,y) alu(x, y, OP_ADD)
[/sourcecode]

You can also compute functions like OR using DeMorgan’s Law: OR(x,y) = ~(~X AND ~Y).

[sourcecode lang=”C”]
#define OR(x,y) alu(x, y, COMP_X|COMP_Y|COMP_O)
[/sourcecode]

You can make some constants such as zero and one. Zero is particularly easy, but one requires an interesting trick using twos-complement arithmetic.

[sourcecode lang=”C”]
#define ZERO(x,y) alu(x, y, ZERO_X|ZERO_Y|OP_ADD)
#define ONE(x,y) alu(x, y, ZERO_X|COMP_X|ZERO_Y|COMP_Y|OP_ADD|COMP_O)
[/sourcecode]

To make sense of the definition of ONE, you need to know that in twos-complement arithmetic, -x = ~x + 1, or in other words, -x – 1 = ~x. Look at the definition of ONE, it computes ~(~0 + ~0). After the addition, the register contains all 1s, except for the low order bit. Negating that builds a one. You can also build a -1, or even a -2.

You can also obviously get either operand, or their logical complements. There are even multiple implementations:

[sourcecode lang=”C”]
#define NOTX(x,y) alu(x, y, COMP_X|ZERO_Y|COMP_Y)
#define NOTX(x,y) alu(x, y, COMP_X|ZERO_Y|OP_ADD)
#define NOTX(x,y) alu(x, y, ZERO_Y|COMP_Y|COMP_O)
#define NOTX(x,y) alu(x, y, ZERO_Y|OP_ADD|COMP_O)
[/sourcecode]

We saw how we could add, we could also construct a subtractor:

[sourcecode lang=”C”]
#define SUB(x,y) alu(x, y, COMP_X|OP_ADD|COMP_O)
[/sourcecode]

We saw how we could compute +1 and -1, but we can also add these constants to X or Y..

[sourcecode lang=”C”]
#define DECY(x,y) alu(x, y, ZERO_X|COMP_X|OP_ADD)
#define DECX(x,y) alu(x, y, ZERO_Y|COMP_Y|OP_ADD)
#define INCY(x,y) alu(x, y, ZERO_X|COMP_X|COMP_Y|OP_ADD|COMP_O)
#define INCX(x,y) alu(x, y, COMP_X|ZERO_Y|COMP_Y|OP_ADD|COMP_O)
[/sourcecode]

I thought it was pretty neat or such a small implementation. The hardest part to implement is the adder, all the rest is a pretty trivial number of gates. If you use a simple ripple adder, even the adder is pretty small.

Addendum: Here’s a more (but not entirely) exhaustive list of functions that are implemented.

[sourcecode lang=”C”]
#define ADD(x,y) alu(x, y, OP_ADD)
#define AND(x,y) alu(x, y, 0)
#define DECX(x,y) alu(x, y, ZERO_Y|COMP_Y|OP_ADD)
#define DECY(x,y) alu(x, y, ZERO_X|COMP_X|OP_ADD)
#define INCX(x,y) alu(x, y, COMP_X|ZERO_Y|COMP_Y|OP_ADD|COMP_O)
#define INCY(x,y) alu(x, y, ZERO_X|COMP_X|COMP_Y|OP_ADD|COMP_O)
#define MINUS1(x,y) alu(x, y, COMP_X|ZERO_Y|COMP_O)
#define MINUS1(x,y) alu(x, y, ZERO_X|COMP_O)
#define MINUS1(x,y) alu(x, y, ZERO_X|COMP_X|ZERO_Y|COMP_O)
#define MINUS1(x,y) alu(x, y, ZERO_X|COMP_X|ZERO_Y|COMP_Y)
#define MINUS1(x,y) alu(x, y, ZERO_X|COMP_X|ZERO_Y|OP_ADD)
#define MINUS1(x,y) alu(x, y, ZERO_X|COMP_Y|COMP_O)
#define MINUS1(x,y) alu(x, y, ZERO_X|ZERO_Y|COMP_O)
#define MINUS1(x,y) alu(x, y, ZERO_X|ZERO_Y|COMP_Y|COMP_O)
#define MINUS1(x,y) alu(x, y, ZERO_X|ZERO_Y|COMP_Y|OP_ADD)
#define MINUS1(x,y) alu(x, y, ZERO_X|ZERO_Y|OP_ADD|COMP_O)
#define MINUS1(x,y) alu(x, y, ZERO_Y|COMP_O)
#define MINUS2(x,y) alu(x, y, ZERO_X|COMP_X|ZERO_Y|COMP_Y|OP_ADD)
#define NAND(x,y) alu(x, y, COMP_O)
#define NOR(x,y) alu(x, y, COMP_X|COMP_Y)
#define NOTX(x,y) alu(x, y, COMP_X|ZERO_Y|COMP_Y)
#define NOTX(x,y) alu(x, y, COMP_X|ZERO_Y|OP_ADD)
#define NOTX(x,y) alu(x, y, ZERO_Y|COMP_Y|COMP_O)
#define NOTX(x,y) alu(x, y, ZERO_Y|OP_ADD|COMP_O)
#define NOTY(x,y) alu(x, y, ZERO_X|COMP_X|COMP_O)
#define NOTY(x,y) alu(x, y, ZERO_X|COMP_X|COMP_Y)
#define NOTY(x,y) alu(x, y, ZERO_X|COMP_Y|OP_ADD)
#define NOTY(x,y) alu(x, y, ZERO_X|OP_ADD|COMP_O)
#define ONE(x,y) alu(x, y, ZERO_X|COMP_X|ZERO_Y|COMP_Y|OP_ADD|COMP_O)
#define OR(x,y) alu(x, y, COMP_X|COMP_Y|COMP_O)
#define RSUB(x,y) alu(x, y, COMP_Y|OP_ADD|COMP_O)
#define SUB(x,y) alu(x, y, COMP_X|OP_ADD|COMP_O)
#define X(x,y) alu(x, y, COMP_X|ZERO_Y|COMP_Y|COMP_O)
#define X(x,y) alu(x, y, COMP_X|ZERO_Y|OP_ADD|COMP_O)
#define X(x,y) alu(x, y, ZERO_Y|COMP_Y)
#define X(x,y) alu(x, y, ZERO_Y|OP_ADD)
#define XANDNOTY(x,y) alu(x, y, COMP_Y)
#define XORNOTY(x,y) alu(x, y, COMP_X|COMP_O)
#define Y(x,y) alu(x, y, ZERO_X|COMP_X)
#define Y(x,y) alu(x, y, ZERO_X|COMP_X|COMP_Y|COMP_O)
#define Y(x,y) alu(x, y, ZERO_X|COMP_Y|OP_ADD|COMP_O)
#define Y(x,y) alu(x, y, ZERO_X|OP_ADD)
#define YANDNOTX(x,y) alu(x, y, COMP_X)
#define YORNOTX(x,y) alu(x, y, COMP_Y|COMP_O)
#define ZERO(x,y) alu(x, y, COMP_X|ZERO_Y)
#define ZERO(x,y) alu(x, y, ZERO_X)
#define ZERO(x,y) alu(x, y, ZERO_X|COMP_X|ZERO_Y)
#define ZERO(x,y) alu(x, y, ZERO_X|COMP_X|ZERO_Y|COMP_Y|COMP_O)
#define ZERO(x,y) alu(x, y, ZERO_X|COMP_X|ZERO_Y|OP_ADD|COMP_O)
#define ZERO(x,y) alu(x, y, ZERO_X|COMP_Y)
#define ZERO(x,y) alu(x, y, ZERO_X|ZERO_Y)
#define ZERO(x,y) alu(x, y, ZERO_X|ZERO_Y|COMP_Y)
#define ZERO(x,y) alu(x, y, ZERO_X|ZERO_Y|COMP_Y|OP_ADD|COMP_O)
#define ZERO(x,y) alu(x, y, ZERO_X|ZERO_Y|OP_ADD)
#define ZERO(x,y) alu(x, y, ZERO_Y)
[/sourcecode]

On random numbers…

While hacking a small program today, I encountered something that I hadn’t seen in a while, so I thought I’d blog it:

My random number generator failed me!

I was implementing a little test program to generate some random terrain. The idea was pretty simple: initialize a square array to be all zero height. Set your position to be the middle of the array, then iterate by dropping a block in the current square, then moving to a randomly chosen neighbor. Keep doing this until you place as many blocks asu like (if you wander off the edge, I wrapped around to the other side), and you are mostly done (well, you can do some post processing/erosion/filtering).

When I ran my program, it generated some nice looking islands, but as I kept iterating more and more, it kept making the existing peaks higher and higher, but never wandered away from where it began, leaving many gaps with no blocks at all. This isn’t supposed to happen in random walks: in the limit, the random walk should visit each square on the infinite grid (!) infinitely often (at least for grids of dimension two or less).

A moment’s clear thought suggeseted what the problem was. I was picking one of the four directions to go in the least complicated way imaginable:

[sourcecode lang=”C”]
dir = lrand48() & 3 ;
[/sourcecode]

In other words, I extracted just the lowest bits of the lrand48() call, and used them as an index into the four directions. But it dawned on me that the low order bits of the lrand48() all aren’t really all that random. It’s not really hard to see why in retrospect: lrand48() and the like use a linear congruential generator, and they have notoriously bad performance in their low bits. Had I used the higher bits, I probably would never have noticed, but instead I just shifted to using the Mersenne Twister code that I had lying around. And, it works much better, the blocks nicely stack up over the entire 5122 array, into pleasing piles.

Here’s one of my new test scenes:

Much better.

From Nand to Tetris in 12 steps

This came across my desk earlier today. I’ve actually been interested in this kind of “from the ground” up view: basically compiling simulators for very simple machines, but then bootstrapping all of a simulated virtual machine from the ground up.

From Nand to Tetris in 12 steps.

Addendum: This was the story that set me on the road to finding the video above. Someone implemented the 16-bit ALU from the above in terms of the LEGO-like construction materials available in the videogame “Minecraft”. Very wacky.

Translating HAKMEM 175 into C…

A couple of years back, I made note of HAKMEM 175, a nifty hack by Bill Gosper that finds the next higher value that has the same number of ‘1’ bits as the input.

brainwagon » Blog Archive » HAKMEM 175.

If I bothered to convert it to C, I didn’t scribble it down, so I thought I’d do it here.

[sourcecode lang=”C”]
#include <stdio.h>

/* _ _ _ _ ____ __ ___ __ __ _ ____ ___
* | || | /_\ | |/ / \/ | __| \/ / |__ | __|
* | __ |/ _ \| ‘ <| |\/| | _|| |\/| | | / /|__ \
* |_||_/_/ \_\_|\_\_| |_|___|_| |_|_|/_/ |___/
*
* A straightforward implementation of Gosper’s algorithm for
* determining the next higher value which has the same number of 1
* bits. This is useful for enumerating subsets.
*
* Translated by Mark VandeWettering.
*/

unsigned int hakmem175(unsigned int A) { unsigned int B, C, D ;

B = A ; /* MOVE B, A */
C = -B ; /* MOVN C, B */
C &= B ; /* AND C, B */
A += C ; /* ADD A, C */
D = A ; /* MOVE D, A */
D ^= B ; /* XOR D, B */
D >>= 2 ; /* LSH D, -2 */
D /= C ; /* IDIVM D, C */
A |= D ; /* IOR A, C typo? */
return A ;
}

main()
{
unsigned int x ;
int i ;

x = 3 ;

do {
printf("0x%x\n", x) ;
x = hakmem175(x) ;
} while (x != 0) ;
}
[/sourcecode]

I was surprised to find that there is actually a bug in the published memo. The last instruction should obviously be an OR of A and D, not A and C as listed in the published memo. After discovering the error, I seeked to find mention of it onlne somewhere. This page lists the corrected PDP-10 assembly code without comment. It also suggests a couple of optimizations: you can save one register and one instruction by variable renaming, and with a bit of work, you can avoid the integer divide, which is probably a good thing on pretty much any architecture you are likely to use.

Addendum: I thought it might be fun to show an illustration of how this could be used. At one point, while writing my checkers program, I thought about enumerating all possible positions with given numbers of checkers on each side. Counting them is actually rather easy, but enumerating them is a bit trickier, but using the trick of HAKMEM175, it becomes somewhat easier.

[sourcecode lang=”C”]
#include <stdio.h>
#include <stdlib.h>

unsigned int
hakmem175 (unsigned int A)
{
unsigned int B, C, D;

B = A; /* MOVE B, A */
C = -B; /* MOVN C, B */
C &= B; /* AND C, B */
A += C; /* ADD A, C */
D = A; /* MOVE D, A */
D ^= B; /* XOR D, B */
D >>= 2; /* LSH D, -2 */
D /= C; /* IDIVM D, C */
A |= D; /* IOR A, C typo? */
return A;
}

/*
* Here’s an interesting use (okay, semi-interesting use) of the
* above function. Let’s use it to enumerate all the potential
* positions in checkers which have two checkers (not kings) for
* each side.
*/

#define WHITEILLEGAL (0x0000000F)
#define BLACKILLEGAL (0xF0000000)

int
main (int argc, char *argv[])
{
unsigned int B, W;

int nb = atoi (argv[1]);

int nw = atoi (argv[2]);

if (nb < 1 || nb > 12 || nw < 1 || nw > 12) {
fprintf (stderr, "usage: checkers nb nw\n");
fprintf (stderr, " 1 <= nb, nw <= 12\n");
exit (-1);
}

B = (1 << nb) – 1;

for (;;) {
W = (1 << nw) – 1;
for (;;) {
while (W != 0 && (W & (B | WHITEILLEGAL)))
W = hakmem175 (W);
if (W == 0)
break;
/* output B, W, we could be prettier */
printf ("%x %x\n", B, W);
W = hakmem175 (W);
}
B = hakmem175 (B);
while (B != 0 && (B & BLACKILLEGAL))
B = hakmem175 (B);
if (B == 0)
break;
}

return 0;
}
[/sourcecode]

Using this, you can easily enumerate the 125,664 positions with 2 checkers on each side, the 8,127,272 positions with 3 checkers on each side, and the 253,782,115 positions with 4 checkers on each side. By then, the code gets a little slow: it has to step over all the positions where there is a conflict in B and W placement, so it gets pretty slow. Still, it works reasonably well, and as such, might serve as part of the inner loop of a retrograde endgame analysis project.

Somewhere… over the (simulated) rainbow…

A few days back, I simulated how light propagated in a single drop of water, but with a number of problems. First of all, it didn’t simulate the Fresnel equations, which describe how light is reflected and refracted at the interface between two media. This meant that in my simple model, no light is actually subject to total internal reflection, and no light was scattered back in the direction that the light originated. This is an important effect: without it, you cannot see rainbows. It also assumed a single index of refraction, so we couldn’t see how the reflection and refraction change as a function of wavelength.

So, I decided to fix both problems, and run a simulation to see if I could capture some interesting features of rainbows by direct simulation.

To keep you interested (if possible), I’ll merely give the picture, then explain it. (Click on it to view it full size).

Simulated rainbow reflection, light enters from left, goes to right

It’s hard to see the detail in the area of interest, so here’s a better version, zoomed in (click to view bigger):

This describes the distribution of light in each outgoing direction, assuming light coming in from the left. There are four different plots, ranging from the shortest wavelength (4047 Angstroms, or deep violet) to the longest (7065 Angstroms, or deep red). (Sadly, there isn’t any correlation between the plot color and the wavelength, you have to look at the legend to interpret things properly.) I got these plots by determining the index of water at each wavelength, and then simulating via Monte Carlo techniques the resulting outgoing distribution by tossing 10 million simulated photons at the drop. At each surface interaction, it determines what the ratio of reflected to refracted light could be, and then take one path or the other based upon a weighted coin flip. The final outgoing directions are binned, and we take the log of the count in each bin to produce the distance from the center in the polar plot.

This plot isn’t the most pretty thing you’ll ever see, but you’ll actually find a number of interesting things. First of all, you’ll see that there are two rainbows represented. The inner one is the brighter, and the outer one is significantly dimmer (you might estimate that it contains 1/10 as much energy, since it appears about 1 unit lower and I used log based 10). The inner bow has violet at the inner side, and red at the outer side while the outer bow is reversed (red on the inner radius and violet on the outer radius). There is also a distinct lack of light between the two bows. The amount of light reflected in between the two bows is less than that scattered outside them. This is a phenomenon called “Alexander’s band”.

I thought this was a fun experiment. I could work a bit harder and actually produce the appropriate colors and intensities by some more computer graphics magic. Perhaps that will wait for some other day.

Making a soft-circuit input device for your computer

I’m intrigued by various uses for embedded processors, and so are my readers. I hadn’t seen this particular microcontroller board before, the “Teensy”, which is very similar to the Arduino, except that it is uses an ATMEL AVR chip with a direct support for USB. The link also points at a nifty interface to “soft circuit” elements, which probably has some nifty controller applications.

Make: Online : Making a soft-circuit input device for your computer.

Zounds! Sounds!

Tom and I have been discussing some early hacking efforts, probably spawned in part by my re-reading of Levy’s Hackers. A couple of days ago, this resulted in me pondering the mysteries of Minsky’s circle algorithm (still ongoing), but today it drove me to an early interesting sound algorithm documented in the legendary HAKMEM, ITEM 168, which shows a simple six instruction program that generated sounds on the PDP-1. The basic idea was pretty simple: we have two variables, A and B. Read the front panel switches from the PDP-1, and add it to A. Then take the contents of A, apply a bit-wise AND from a mask, and then add that to B. Mask off the high bit of B, and output that as the “music”. (Note: I am a bit confused by ITEM 168. The PDP-1 code there doesn’t seem to be exactly what I described, but the mnemonics they list are somewhat unfamiliar to me. In any case, the rough English description above will suffice for now…)

In any case, I didn’t have a PDP-1 lying around, so I decided to implement the thing in C, using portaudio and the GNU readline library. Here’s the code:

[sourcecode lang=”C”]
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <readline/readline.h>
#include <sndfile.h>

/* __
* ___ ___ ____/ /__ ___ ___
* / _ \/ _ `/ _ / _ \/ _ \/ _ \
* / .__/\_,_/\_,_/\___/\___/ .__/
* /_/ /_/
*
* A simple musical toy inspired by some discussions on early computer
* music with Tom Duff, written by Mark VandeWettering.
*
* THE BASIC IDEA
* Tom informed me that someone generated odd musical tones with a
* very short program on the PDP 1 that basically looked like this:
*
* for (;;) {
* a += sw ; some front panel switches
* b += a & mask ; mask just selects some part of a…
* output(a>>30) ; output a single bit square wave…
* }
*
* So that’s what this toy emulates.
*
* I’ve also added a tiny bit of interface to allow you to change the
* switch settings and the mask on the fly, to "play" the instrument.
*
* I’ve called this program "padoop" just because I needed some vowels
* added to "PDP".
*
* Tom points out that it wouldn’t be hard to implement this in
* hardware, completely without a processor.
*/

#include <stdint.h>
#include <portaudio.h>

#define SAMPLE_RATE 44100

typedef struct {
uint32_t a, b ;
SNDFILE *sf ;
} paMusicData ;

uint32_t SW ;
uint32_t MASK ;

/*
* Here is a very simple output filter designed by Fisher’s mkfilter code.
*/

#define NZEROS 2
#define NPOLES 2
#define GAIN 3.414213562e+00

static float xv[NZEROS+1], yv[NPOLES+1];

float
filter(float inp)
{
xv[0] = xv[1]; xv[1] = xv[2];
xv[2] = inp / GAIN;
yv[0] = yv[1]; yv[1] = yv[2];
yv[2] = (xv[0] + xv[2]) + 2 * xv[1]
+ ( -0.1715728753 * yv[0]) + ( -0.0000000000 * yv[1]);
return yv[2];
}

static int
paMusicCallback(const void * inputBuffer,
void * outputBuffer,
unsigned long framesPerBuffer,
const PaStreamCallbackTimeInfo * timeInfo,
PaStreamCallbackFlags statusFlags,
void * userData)
{
paMusicData * data = (paMusicData *) userData ;
float *out = (float *) outputBuffer, *op ;
float f ;
int i ;

op = out ;
for (i=0; i<framesPerBuffer; i++) {
data->a += SW ;
data->b += data->a & MASK ;
*op++ = filter((data->b&(1L<<31))?1.0:-1.0) ;
}

if (data->sf != NULL) sf_write_float(data->sf, out, framesPerBuffer) ;

return 0 ;
}

void
process_command(char *cmd)
{
uint32_t tmp ;

if (strlen(cmd) > 0)
add_history(cmd) ;
if (strcmp(cmd, "p") == 0 || strcmp(cmd, "print") == 0) {
fprintf(stderr, "sw 0x%08X\n", SW) ;
fprintf(stderr, "mask 0x%08X\n", MASK) ;
} else if (sscanf(cmd, "sw %i\n", &tmp) == 1) {
SW = tmp ;
fprintf(stderr, "sw 0x%08X\n", SW) ;
} else if (sscanf(cmd, "mask %i\n", &tmp) == 1) {
MASK = tmp ;
fprintf(stderr, "mask 0x%08X\n", MASK) ;
}
}

main()
{
PaStream *stream ;
paMusicData data ;
PaError err = Pa_Initialize() ;
uint32_t m1, m2 ;
if (err != paNoError) goto error ;
SF_INFO sfinfo ;

data.a = data.b = 0 ;
/* log output data to a file… */

sfinfo.samplerate = SAMPLE_RATE ;
sfinfo.channels = 1 ;
sfinfo.format = SF_FORMAT_WAV | SF_FORMAT_PCM_16 ;
data.sf = sf_open("padoop.wav", SFM_WRITE, &sfinfo) ;

SW = 012345 ;
MASK = 0x0f0f0f0f ;

/* open the music output device… */
err = Pa_OpenDefaultStream(&stream,
0, /* no input */
1, /* just one output */
paFloat32, /* output float data */
SAMPLE_RATE, /* at some sample rate */
8192, /* samples per frame */
paMusicCallback,
&data) ;

if (err != paNoError) goto error ;

/* start */
err = Pa_StartStream(stream) ;
if (err != paNoError) goto error ;

for (;;) {
char * line = readline("padoop> ") ;
if (line == NULL) break ;
process_command(line) ;
free((void *) line) ;
}

err = Pa_StopStream(stream) ;
if (err != paNoError) goto error ;

sf_close(data.sf) ;

err = Pa_Terminate() ;
if (err != paNoError) goto error ;
exit(0) ;

error:
fprintf(stderr, "PortAudio error: %s\n", Pa_GetErrorText(err)) ;
Pa_Terminate() ;
exit(-1) ;
}
[/sourcecode]

You’ll need to have the portaudio and readline libraries installed to me it work. When you run it, it should start making noises, and you’ll see a command line prompt. It accepts 3 commands: “print” which dumps the contents of the mask and switches, “sw” followed by a number, which sets the value of the switches, and “mask” followed by a number, which does the same for the mask. Give it a try.

Addendum: Try “sw 4096” and “mask 0xFF0F0000”. Let it run for a while. If you find some interesting settings, post them in comments.
Addendum2: I included a fairly soft output filter to round the square waves a tiny bit. I’m not sure it makes any difference at all.
Addendum3: I made a version of padoop that also logged the output audio to a WAV file, so even if you can’t compile it, you can hear some of the sounds it might make.
Output from Padoop, my PDP-1 inspired music program…
Another bit of output…
Addendum4: While editing, I got some of the escaped characters screwed up, so I reinserted with the latest version. It automatically outputs a padoop.wav file of all the output that the program creates.

Drawing “circles” ala Marvin Minsky…

In my re-reading of Levy’s book Hackers, I was reminded of an interesting bit of programming lore regarding an early display hack that Marvin Minsky did for circle drawing. It’s an interesting hack because the lore was that it was originally coded by mistake, and yet the result proved to be both interesting and even useful. I first learned of this years ago when Blinn did an article on different ways to draw a circle, but also learned that it was part of the MIT AI memo called “HAKMEM”. Here’s the basic idea:

One way that you can draw a circle is to take some point x, y on the circle, and then generate new points by rotating them around the center (let’s say that’s the origin for ease) and connecting them with lines. If you have a basic understanding of matrix math, it looks like this:

/ x' \   /  cos theta  sin theta \ / x \
|    | = |                       | |   |
\ y' /   \ - sin theta cos theta / \ y /

(I should learn how to use MathML, but you hopefully get the idea). The matrix with the cosine and sine terms in it is a rotation matrix which rotates a point around by the angle theta. Apparently Minsky tried to simplify this by noting that cos(theta) is very nearly one for small theta, and that sin(theta) can be approximated by theta (we can get this by truncating the Taylor series for both). Therefore, he thought that (I’m guessing here, but it seems logical) that we could simplify this as:

/ x' \   /   1  eps \ / x \
|    | = |          | |   |
\ y' /   \ -eps  1  / \ y /

Okay, it’s pretty obvious that this seems like a bad idea. For one thing, the “rotation” matrix isn’t a pure rotation. It’s determinant is 1 – eps^2 1 + eps^2, which means that points which are mapped move slowly toward away from the origin. Nevertheless, Minsky apparently thought that it might be interesting, so he went ahead an implemented the program. I’ll express it here is pseudo-code:

newx = oldx - eps * oldy
newy = oldy + eps * newx

Note: this program is “buggy”. The second line should (for some definition of should) read “oldy + eps * oldx“, but Minsky got it wrong. Interestingly though, this “bug” has an interesting side effect: it draws circles! If eps is a power of 2, you can even implement the entire thing in integer arithmetic, and you don’t need square roots or floating point or anything. Here’s an example of it drawing a circle of radius 1024:

Well, there is a catch: it doesn’t really draw circles. It draws ellipses. But it’s still a very interesting bit of code. The first thing to note is that if you actually write down the transformation matrix that the “wrong” equations implement, the determinant is one. That helps explain why the point doesn’t spiral in. But there is another odd thing: if you implement this in integer based arithmetic, it’s eventually periodic (it doesn’t go out of control, all the round off errors eventually cancel out, and you return to the same point again and again and again). In HAKMEM, Schroeppel proclaims that the reason for the algorithm’s stability was unclear. He notes that there are a finite number of distinct radii, which indicates that perhaps the iterative process will always eventually fill in the “band” of valid radii, but the details of that seem unclear to me as well. I’ll have to ponder it some more.

Addendum: The real reason I was looking at this was because the chapter in Hackers which talks about Spacewar! also talks about the Minskytron whose official name was TRI-POS. It apparently uses some of this idea to draw curves in an interesting way. HAKMEM item #153 also suggests an interesting display hack:

ITEM 153 (Minsky):
To portray a 3-dimensional solid on a 2-dimensional display, we can use a single circle algorithm to compute orbits for the corners to follow. The (positive or negative) radius of each orbit is determined by the distance (forward or backward) from some origin to that corner. The solid will appear to wobble rigidly about the origin, instead of simply rotating.

Might be interesting to implement.

Addendum2: I goofed up the determinant above, it should be 1 + eps^2, not 1-eps^2. Corrected.

Writing emulators for obscure old computers…

(Hmmm. I’m sort of on a retrocomputing kick today…)

A few years ago, I wandered into Tom’s office one day to find him typing away. I asked “whatcha doing?” and he replied “I’m writing an emulator for the PDP-1 so I can play the original Spacewar!” It tells you a lot about Tom to hear stories like this. It tells you something about me that I didn’t respond with “why?” or “that’s stupid” and instead just went back to my desk and wrote my own. I had never seen a PDP-1 (although the Computer History Museum has since completed restoration of theirs, where you can see Spacewar! running on real hardware). It took me a couple of days to tinker the emulator sufficient to run Spacewar! into shape, and I learned a lot about how to write emulators.

A couple of days ago, Eric sent out a link to his RetroChallenge project, an FPGA-ELF:



The ELF was a microcomputer based around the RCA-1802 processor, and was published back in 1976 in the pages of Popular Electronics. I remember reading this as a precocious 12 year old, but it would be four years until I got my first computer, an Atari 400. Still, these older machines seem like interesting things, so I thought I’d investigate what it would take to make a high quality emulator (or even a crappy one) for the 1802.

Here’s the first little nugget that I found. The 1802 has sixteen 16-bit registers, labelled R0-R15. One of the more interesting features that it has is that the PC isn’t a dedicated register: there is a 4 bit P register, which tells you which of the sixteen R registers serves as the PC. In each cycle, it fetches the byte pointed at by R[P], increments R[P], and then executes the instruction. I found that there is an interesting case that occurs with opcode 0x20. The high nibble of 2 indicates that it is a DEC instruction, and the low nibble specifies the register. This says “DEC R0”. But in my case, R0 was serving as the PC (P was 0 as well), so this instruction essentially undoes the normal increment, restoring its previous contents, and we have a single byte infinite loop.

At least, that’s what I think should happen. I don’t have an actual computer, so debugging this kind of stuff is what writing emulators like this is all about.

Getting the basic instruction set working shouldn’t be too hard, but there are other things like handling DMA that might be a bit trickier. Still, I’m destined to learn a lot, about a subject that is of no importance whatsoever. But then, you should expect that from me and this blog.

Addendum: Eric wrote up some nodes about his FPGA-ELF on his blog. Factoid: it runs about 100x faster than the original ELF.

10,000 Monkeys Typing…with a Unix/sh challenge…

I was testing some code that I wrote for analyzing cryptograms, and decided that the easiest way to do so would be to get some random text, drawn from the letters A-Z. A moments thought yielded this method, without even programming anything:

tr -d -c A-Z < /dev/urandom | dd ibs=10000 count=1

The tr generates the required data, and the dd truncates it to the desired number of characters. But for tidiness, I'd like to have the output broken up so that each line consists of 50 characters, with spaces inserted between every 5 characters (I won't begrudge you if you leave a dangling space at the end of each line). I couldn't figure out a simple way to get that to happen all in one command line and using standard utilities. I can of course write a little Python utility, or even perl, but can anyone think of a clever short way to do this?

Addendum: Tom pointed out something interesting about the command that I listed above: it doesn’t work the way I think it does. Apparently the ibs is the input block size, which dd dutifully allocates, and the count is the number of read system calls that the system issues. For reasons which escape me, it does not try to make sure that it actually received a full input block: it will nicely return short blocks if it finds them, and doesn’t bother retrying to get more. Hence, it works rather erratically when using a pipe as input, particularly when the writes from the upstream process may flush in odd intervals.

Primality testing with Perl regexs

Okay, here’s a little pet peeve of mine. Somebody brought up that it’s possible to do primality testing using Perl regular expressions. This has been kicking around for a while, I’m not sure who originated the idea, but you can find links to it with a Google search. As an example, consider the Perl one liner here:

Primality testing with Perl regexs@Everything2.com

It’s a very cute hack, so where is my peeve?

The regular expression they list isn’t a regular expression.

It seems a bit of a quibble. I’ve had people argue that whatever the meaning of regular expressions might be, the “regular expressions” matched by popular “regular expression” libraries like Perl and Python should supercede, because they are in fact so ubiquitous. I might argue, except for one thing:

Real regular expressions can be matched really quickly, but are seldom matched very quickly at all by Perl, Python, or Ruby. Russ Cox wrote a nice article about this. Yes, perl regular expressions can match all sorts of complex things that real regular expressions can’t, but if all you’ve got is a real regular expression, perl’s performance is typically an order of magnitude slower than necessary, and in fact, perl can be arbitrarily slow. This should come as no surprise: since the perl implementation of primality testing does work, it obviously must be slow: it is, after all, capable of any arbitrary computation.

It’s not like doing it correctly is even difficult: we implemented all this stuff back in Ginnie Lo’s compiler class back in the early 80s, and it was old back then. We made lexical analyzers that began by compiling NFAs for all the tokens in the language, and then converting them into DFAs, which marched along and processed files in linear time.

Okay, it’s my peeve, but there is some good, beautiful computer science underlying it all.

Reminder: HOWTO tunnel http using ssh…

I have a couple of devices at home that provide http servers on my local network. I have them tucked nicely behind my firewall so that they are not accessible from the outside, but occasionally, I would like to login to them to perform some reconfiguration or the like. This is where ssh comes to the rescue: you can use it to create a secure tunnel, a port that I can connect to on my local machine that gets routed to my remote machine through my firewall.

The problem is, I can never remember the command, and it takes me a few minutes of thought to reconstruct it. So, I thought I’d write it down here so I’d remember.

The basic command is:

ssh -f mylogin@myserver.com -L 20000:192.168.1.132:80 -N

When I enter http://localhost:20000 on my machine, it’s as if I was accessing the web page on 192.168.1.132 on my local network. Very useful.

Crazy Optimization of Chaocipher…

Okay, this is a minor hack, but I thought it was fun, so I thought I’d write it up here.

My original code for simulating the Chaocipher machine proceeded as follows: it found the character in the plaintext wheel (by linear search), then rotated each wheel to get the plain and cipher text entry to the zenith minus one and zenith positions respectively, then did the short cycle rotates. It used a temporary array to perform the first rotate, and then used memcpy to copy the results back. The short rotates were done in position. Hey, it was easy to write, but kind of ugly. In particular, it did twice as many read/write operations as necessary, and I hypothesized that it was a bottleneck.

So, I wrote a little code generator that instead generated “compiled” code for each of the 26 possible paths through the encoding code. Each entry in the cipher and plaintext wheels is read and written only once, and I use only an additional char of storage to hold temp values. Here’s the fragment of the code that executes if the cipher text is found in position number zero:

	case 0:
		tmp = m->cw[1]; 
		m->cw[1] = m->cw[2]; 
		m->cw[2] = m->cw[3]; 
		m->cw[3] = m->cw[4]; 
		m->cw[4] = m->cw[5]; 
		m->cw[5] = m->cw[6]; 
		m->cw[6] = m->cw[7]; 
		m->cw[7] = m->cw[8]; 
		m->cw[8] = m->cw[9]; 
		m->cw[9] = m->cw[10]; 
		m->cw[10] = m->cw[11]; 
		m->cw[11] = m->cw[12]; 
		m->cw[12] = m->cw[13]; 
		m->cw[13] = tmp; 
		tmp = m->pw[0]; 
		m->pw[0] = m->pw[1]; 
		m->pw[1] = m->pw[2]; 
		m->pw[2] = m->pw[4]; 
		m->pw[4] = m->pw[6]; 
		m->pw[6] = m->pw[8]; 
		m->pw[8] = m->pw[10]; 
		m->pw[10] = m->pw[12]; 
		m->pw[12] = m->pw[14]; 
		m->pw[14] = m->pw[15]; 
		m->pw[15] = m->pw[16]; 
		m->pw[16] = m->pw[17]; 
		m->pw[17] = m->pw[18]; 
		m->pw[18] = m->pw[19]; 
		m->pw[19] = m->pw[20]; 
		m->pw[20] = m->pw[21]; 
		m->pw[21] = m->pw[22]; 
		m->pw[22] = m->pw[23]; 
		m->pw[23] = m->pw[24]; 
		m->pw[24] = m->pw[25]; 
		m->pw[25] = tmp; 
		tmp = m->pw[3]; 
		m->pw[3] = m->pw[5]; 
		m->pw[5] = m->pw[7]; 
		m->pw[7] = m->pw[9]; 
		m->pw[9] = m->pw[11]; 
		m->pw[11] = m->pw[13]; 
		m->pw[13] = tmp; 
		break;

You should notice that the cipher wheel doesn’t bother to move element zero at all (it doesn’t move if the plaintext letter is found at index 0). It also doesn’t touch the upper half of the cipher wheel, since those letters are left in place by this particular path. It then uses each element as the source and destination precisely once. Since there are two cycles, uses the temporary variable twice.

The resulting code is about 1450 lines long, but the resulting program runs almost 2.5x faster overall, searching about 10.8M nodes per second in my backtracking search. Worth the 20 minutes or so it took me to write the code generator.

Addendum: In theory at least, you could safely issue reads and writes which occur in different cycles in parallel, since you know they don’t target the same addresses. But I don’t know how to hint to the C compiler to perform that optimization. I also suspect that SSE instructions might allow you to load and swizzle faster.

Typos in Exhibit 1?

WARNING: if you are working on this code, this article contains spoilers which may blunt your own intellectual satisfaction in working on it yourself, including some recovered keys.

Okay, I’m home, and feeling pretty jet-lagged, so this might be wrong in some way that additional sleep will reveal, but I wanted to get this out of my head before work today.

First of all, I downloaded Exhibit 1’s plaintext from this page, specifically this link for the plaintext and this link for the ciphertext. These files contain no spaces or carriage returns, so are relatively unfriendly for human consumption, so the first thing I did was to reformat them into 11 groups of 5 characters per line to match the format that Exhibit 1 is presented in the scanned chapter of Silent Years.

Then, I set my constraint solver working. It loads both the cipher and plaintext, and does an exhaustive search to find the consistent wheel settings that consumes the most input. It then dumps the settings as a command line that should decode the ciphertext, or at least as much of it as it has discovered.

The first 100 lines of Exhibit 1 consist of 100 copies of the 55 character sentence:

ALLGO ODQQU ICKBR OWNFO XESJU MPOVE RLAZY DOGTO SAVET HEIRP ARTYW

So, the first 100 lines of the decryption of the cipher text should be just 99 copies. But my cipher program pointed at a problem after 603 characters. Here are

./chaocipher -d -c 'CPEDQRSTIXYLMOZABFVGUHWJKN' -p 'AYZNBQDSEFGHLWIKCMOPRTUVJX'

When I run this on the original text, I get the following (numbered for convenience):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
ALLGO ODQQU ICKBR OWNFO XESJU MPOVE RLAZY DOGTO SAVET HEIRP ARTYW
ALLGO ODQQU ICKBR OWNFO XESJU MPOVE RLAZY DOGTO SAVET HEIRP ARTYW
ALLGO ODQQU ICKBR OWNFO XESJU MPOVE RLAZY DOGTO SAVET HEIRP ARTYW
ALLGO ODQQU ICKBR OWNFO XESJU MPOVE RLAZY DOGTO SAVET HEIRP ARTYW
ALLGO ODQQU ICKBR OWNFO XESJU MPOVE RLAZY DOGTO SAVET HEIRP ARTYW
ALLGO ODQQU ICKBR OWNFO XESJU MPOVE RLAZY DOGTO SAVET HEIRP ARTYW
ALLGO ODQQU ICKBR OWNFO XESJU MPOVE RLAZY DOGTO SAVET HEIRP ARTYW
ALLGO ODQQU ICKBR OWNFO XESJU MPOVE RLAZY DOGTO SAVET HEIRP ARTYW
ALLGO ODQQU ICKBR OWNFO XESJU MPOVE RLAZY DOGTO SAVET HEIRP ARTYW
ALLGO ODQQU ICKBR OWNFO XESJU MPOVE RLAZY DOGTO SAVET HEIRP ARTYW
ALLGO ODQQU ICKBR OWNFO XESJU MPOVE RLAZY DOGTO SAVET HEIRP ARTGW
ADLGO OSLLI VFCRB YMHAW QGNMS NDNCK GZYPM LAYNZ LFNDO WGBJU IXZLD
TNOJM TUSZG IKMLC WNCKK OUICA YKSJJ SSKFR GLLPU XAOXS BXAPQ ECJSU
SFLFT ZLNQS SWNZN TKSDG JUUQC GQCDD WPDIN FZODL MJEXV WIFFW TEFPS
ZKUAT XVTKL FZLHM VICVD PMEZP YCULX VAXNE AYRSL BBMIT UJGII TSPIQ
UMUMC FWFPO SYYAW EYCKY ZQUVJ BOIIU FBHXT QFHKS EMWDB TSNJY GNIKJ
QHJRR DCGCX MINUQ ZGITG HTINW UZWFX HJEWM QDCXN BXKGY QDDNG IAXNY
RCRAM KCOAI VLKLF SNVGW SQYOA OYUUW YOQAC SUDFF KYXPJ GTBPS ZTONZ
CZPDM MOORN EVZZV AHBUP VOCUJ AHCGZ SLVUY ILIDG YAOOX TYDDA LJDLO
ULRNQ ASEZW BXPSV LEMYZ THMVA BHCII YVHIU JQBAU RGZEI ARUIL NZEDY
BJIOG TRFMH HDJFR ROBKH SMUHY PBYUQ MNNRW YBPRZ KCUQK LGNAI EXRAX
KDJVP JHMRN IFINI ZYGSZ ITDWZ NKGQV QUXVJ YZUIJ FQECB RWSDK QZUHM
XZBFK XGICH IMHHR NKSMF VXTCG IQOJK TCCJO BWQJC ATUBZ ILHRE ULVHQ
WUMDQ SERWK AYNOW PGKZF CWMOG SOLYO YVUSW AIEIO FPJDS SCMEC LTWBA
KNGQE KZFCC RKKXL ZXNTH UQXQD DGJPU EZEVE IPVZC XNRZV OGGVV JBCYP
WMJQG BYMKX NNGAA YDGRA XKVAV LVLFK WIQDE DBZHF YSVEN WOGJK USZPZ
QPEKM UHTDT HTOJU FQIMZ KHPWT AJGDE CEUKS RSLNU GIUAM XUWLN CSJVC
NXOBT AVQTY HLBUL VYYBX CLFEV LKVVE JGYAP SDSIS RZPJQ ISDKS JNEDW
HMSMF DXVZY LVVJF USVXN ZGKFQ NYMZO QQGQJ SCUVS LJRAL NOYIY HRWVD
JMEFF ZRNCU PRUFT NFETI WESRY NOVKQ UVCKY QNWQU OQMHA MATEE QBRWE
YBSQR HLMRH OCLOP PWIOE AGTHF KTMLE TYILX UZSNX JWHIG LJXRB KUCNW
RZMFC OBKBL FGEVF THGHE MYHMA LLWBV SMUDF XGHPM IWDVC ODXYR LGLTU
SGPKX WQBIR IPVJV LNNWR LYSQK OQPOD BBZQL WSSJB IUHVU IXTVW RAUJJ
BLGMT CIVOQ KQXXZ FCXQN GWCWI SQWMY OCZBY PHCEH ULDTT FETNI ESUVQ
CUWCL LSQVL BRYZW BRGIC QFHDU MUJQQ SXCVK SPYRM BYFFI NQZPZ PURFS
RWNWH WOKUZ GSNPM LJQDB XOZEG YTYHH PAGFR NQILO QCTNP RNFMY PHSEH
MXUDW AUDMB MLZBW SIGDP EOAVG BXWWH EREWH DBPSC WFAFS AHATC HAYAL
UVYMP KGPMD NBIWC NQTXT FYVYW KNKQU VJZIH WZCLF HTTBY SRRWW APVAQ
PCZYV PTMLJ BYQRQ IHIHZ ZXIWX CEOVG FYZUN PQFTW NNHYX CQKUN GWVNF
NYGDI ZKOUR VTQYI MNFVX JHJYN TWOKP DMOYF RJOED GLGKL JZYVE EKLZU
JEMXF VWVYZ ROISI CQKRZ XUFWX RFWVS VPPLH MUEOF YUBPW BBOUG BAUMD
GIWET YNTKV DSESI ZLWSL VKHIZ NZKCO JTRGH SWVVZ RUNDD LHEFP XTLHD
LFEFB UHDDH ZORCZ NLKQZ QJZQB ZBQFT CDMOS UGHCE YYUSY EPCID PMZYY
ZHWWL TZNLX HEEMQ JCEYS EAOWO JXBGK BHCAT QTHAB CPXXT BMSLW BBLQR
CSHCH MHURA WOQRG LWMLT WGFDG AOEJF GSFXR DMRHR FCBPY JLFGP UXQHW
NPUQP BKMIC EQZDM YIMXU FWASM CXJAR PSVEK ZMIZV PQNRP YGBLK IXJFF
ASHHB BWRDJ ZXFSG STINW TGJHY TQSRQ ENFIX JMSZZ FYKKC IEOIR ERJYK
JLPCU NXTAP XTCBL HIGZT SEPTW SBMFE WLKZX WIZDG RBAEJ LIIRW UFRKI
NLJTR UTFWF KFSVJ KZQEC VCYNI RIBQM BJHEG FZYAT UQOAS UCLBG GCYBD
PSTDV TKEBN IGUFH XTKTL INLVG KDPGG IEPJG BHASL CBWNS BOVYA GKCRZ
WUQXV JZMNG NANAD BMGFN QUZCM QAHQH VAXWK SRFCY OJGKC RIGHL HGTAH
WCFRZ MHXWK TWJPM YQWJH FNCPU BLBXD YXMSK BDDIY ZQABQ HZMEO WPNOE
RHFVE QHIBY FPDCW EXRZG GJNZO MLEYY WZJHH YVRRM LUHXI YJDDQ TJOOM
EVWYG IQUFD LPPMD QADBK CTBIJ HELST EOXZZ SUUWI XZHIS OXPVY FMNOP
WKOIF JHBOZ CITZQ VTGEM NOJGB KQMVG URNWU GFEBB IDWXB KQEOD FOPIV
XLNJA LPUKH NXQTI YBIEV YIJYT ZHSQW LISZT YPZHQ VOYTF CMRFM QAPSP
YZGZT SHGIW VFZDZ YMVEF YZYJI KHQMG WVQOG YACHG MELVP MNMOE PASDH
WIWLH KAEQX QBEWW IMPVC PZISZ SWZBR SETHW LXHQO DADLT UVUXC JSNKB
EDYMV KNXGM DLOOB KUUFA DRPHO GLBNA WXVCG YZKVY BQMKF NFWGR JPAHA
DGDIO OQCTC SQBKT JVHDE KFKQF KOKYW RPEBF HBKCK ZNQYH MZBWD VAHFF
QVIAZ QUWKA IEVXP KLDWJ ATRTB LYPWI AWPAO TYDOM YFQZR ZTMAS DMCOS
ALHRW QRSPC KIHJV JXSJB UQVOG EWQON BKNLO VPIGF TNGUP BVBTU WHGTJ
XZJAV NCACR UNPZW DQWOT SYOAS YPBQZ TVIYF VZKIW SFETG XUCYI QSBPY
IHYRV YSIII IUGHL FNUUH ARMJL BIAUT WNLOP ZXECP SEBYG SSUGS LNNJR
GMPDT ZQIIE RHKSO UEZBO XEAFB DYKRK ARYYX XQUJG DTWLB WJZCB LHVIQ
RKQAF VVKEW MSVDR ZVQAN JNCBA MJJRR EJKZK ZLILN KKXVR UTXIQ ZRQDG
OKHRZ KWVHH MOHNP JVXLB XVCVV YVYYO CKKZH VRXKQ PWAJY ZIKAS QVFET
CPJHD OOZBT JIULN OSVTG DJBPP ZMALD ZCUJW HNWTL AZNIV VFVQC NMAMC
DXUGM ZUHOX YDEYR NYNIA NRWQY UIRPD UTUZF JLUJL GYIOQ GYGZG DWDIM
GMGJT TPOOK DTBFJ YUAAN OPEGH TSMPT VDUOV ZRANZ MUFDB ECZQV NWCBV
NKLVO ELKJI LKOAI PPRHR AOXIA YNYEY DFSAN JCCTR GDLWE MLOLP SQRXF
NMWLV KXALI PHSHL CHQKI CQZNK VGHBZ YXPBS CQSUB JEUYH RHAXU TJIGJ
EEMUT RSLMQ OJVLO XHXZQ GGNZM CBCZS OAWTR TTZDK YJPIQ FUVMV FCHIC
KLIRL XNRHK QKLKD JZKGJ OTCUQ ISBBH QFQOL VHAWG HWTPB BYDBB PVJKE
HQDFQ OBHXD YLZVV UXHSJ RYNXU MQENC ZHKUN MIZZB NNHRS ZVETO PCQOR
THGUI LBPFB CDJHW ARUBH FEIYL KRTXP FQEKX TFGND WQPVO NTESV WYQFU
OZMVC BMSAF ARKMX LQCWD JEAOZ WVFKM SSQND FVZNH EWCRP TKBBY ZCYCB
OWGTB IIBQX EUSAH VYJVM YHFYQ JFZIS AIKBX YIROC DEAJY PZKXG NQCZI
CCVFP PEDOC DUIES JDHAC THZCD VVSAG AWLIG XATOG ODAPS HWPWL MAUPP
AYXHP QGPKT LBERJ UKYXL JQSJG UKVMM IVWFS CMXER OKYIS QPQUS LOVHM
QIVLW GVEDN SORUC AHVTJ UNEYH VOYEE TSTSW KNGVO PZVCK BNMNC ITPNN
BVAZI JHIKG ULJOY RYYTB MLJZR PUOUZ NZQZI GTOIL CMMXY EUUSQ MJUBT
SFAGH JOKBU ODASI ILJHS QYVVD ZRZOC KZEZW XFVFS OMYZA YRXGC ONOQO
DRUGY MDFVL PRAOH SKVRX YKHRB DJVLA HNDKQ JJWMO VAOND VXZHC WEEXK
ZQQLH AOWLO TNKXI LHEJX ZHTSQ DFFJJ DPAUH TFXDI ZOWUK BOLMY ASRGP
XXNDA DIOCT BJJLK PUOUM LBHFE JBUBS FFYTK PZWNA FUJMA NVKNF JNXRH
HMROR KPBRB AJPUE IBUAI CWEMA SMTED JOAMB KFLPH BPYFT EZQZX JKHIF
VQRTJ ECWYC BQYTO CREGC TPFRT LGSHD YSULB HUVCW YFNYB BRRGH FGWAR
SBZXO LSUYP FJCPC TEZSM ICAZJ ACVIA MXMHC QHHUH HWBFF YSDII FUPYN
NHEWK YWBWH BJBTY GWPYR HSVKH DYHLZ JWCXH SYCFS SYEMJ DVIAA XNQNH
SAVVH ZCHBS LXYEO PIWHC OLQKW CSQXH JLNGE IYGDN XYCZK LTGPR YXMDQ
BXWVC XAJLI SKXIH TJHRB BONEZ ZJDPU YETBL EOVZZ LYMSG XTAEX HAGHS
CVTNS LZSMY ZWFHL ARUVN JMGLB PNZGJ ZFNWT JDYDH HBUAI GKQMG FNXON
FXQRH NNYMB ILCXR SPNZJ KCIVK OMXZN XWVGA KVHSE RVGTM UBCSC PUABB
IOBXN DATDL LTJUZ LMEXE EAHNR PIIHZ WHHEM XMFSM JDRIQ JHPRS CBGRW
CIGED BTPIU HLCTQ KOMDD KBHCV ICRWI CJBSJ IWIVB WYOHK ECODH NBSZH
HUPNL FXJEX SZVMM TLYPQ WYQDJ IIRTN YNIHG UZJFS VQPST RSKNP WMHYT
SEEYI YCKNF NYGEI XLJGF HSFGM NDXEI UUGJF RVBWV CDLDE SFCNW LZXUB
AKWBI KATXO OALDE SHAHB RTXTI FMSEW WKDEB GBWMM RPBXO VRQIR MTQFH
VMPVA BMMTM MPBAI ZTKTS XAZJU NRHUK RCKMI ILHBK VVMSS VWLAE GWXAV

You can see that in line 11, goes awry, and becomes desynchronized. It could have been a bug in my program, but I also suspected the possibility of a typo. So, I created my own version of the cipher text by taking 100 copies of the original line, using the key settings that I recovered, and creating my own version of the cipher text.

They match, except for three places. Here are the context diffs:

*** original.1	2010-07-11 17:37:09.000000000 -0700
--- fixed.1	2010-07-12 07:02:11.000000000 -0700
***************
*** 8,14 ****
  BUZLA GDBCU AMFQL ACRWW TUGSM PPZBR FASRO YIRCA GVEYN SRTOQ TDLFJ
  RUTKF KASGV LVYYF VRAIY NIVJK IUWPF ZBVRU EOTEJ GLCGY SSNHH QTIQW
  UKQAS XKGSP WHRYM TQSOQ BAMAP FQRLI IUGTI VBEBY XFBIU SEYHM LKGOE
! CSWUH TBIZZ HLBND IWTQA MAZBM YMBEK CYKCA BLYQY MELPJ OWNRV FZVBR
  EBVUJ EQIAE MOHTG FHFFI DIQQJ UAWDH LUYRE UGSKT IMDWR RNONJ KDPTC
  JDCJN BVEOU TWXOF GRXND KITNL OXSLZ WQRDE RERHL XWAMY LRVPR JFHRA
  SDJWW OIWEV AVMRR NLRJM IFDHH ADDQC BZWYK DVPAY NPIAX BYUKI JGVUC
--- 8,14 ----
  BUZLA GDBCU AMFQL ACRWW TUGSM PPZBR FASRO YIRCA GVEYN SRTOQ TDLFJ
  RUTKF KASGV LVYYF VRAIY NIVJK IUWPF ZBVRU EOTEJ GLCGY SSNHH QTIQW
  UKQAS XKGSP WHRYM TQSOQ BAMAP FQRLI IUGTI VBEBY XFBIU SEYHM LKGOE
! CSWUH TBIZZ HLBND IWTQA MAZBM YMBEK CYKCA BLYQY MELPJ OWNRV FZVKR
  EBVUJ EQIAE MOHTG FHFFI DIQQJ UAWDH LUYRE UGSKT IMDWR RNONJ KDPTC
  JDCJN BVEOU TWXOF GRXND KITNL OXSLZ WQRDE RERHL XWAMY LRVPR JFHRA
  SDJWW OIWEV AVMRR NLRJM IFDHH ADDQC BZWYK DVPAY NPIAX BYUKI JGVUC
***************
*** 47,53 ****
  AEXFA UPOKM SBMNZ YUHEM ZLBQM ROIUH KECCE IXDAR VFAEV WDHPS GYTTA
  ZRNTO WRSTD YOKCW NQUIS WEFIF LFFZQ BSDCS CBNRQ SZLXB BRICQ CLSCB
  INRYO RGNZE GCYAW PMCQL CGBMX BBUBO NQOZZ OFNQR YMZWA CDMGX NAIRA
! ABKCI OWTGT OTOOK MFRPG XADLN AAJSU BMTIQ VHOUZ TBCZA LEOPO YVEWO
  USDUN TZTJT YXUIG OZQFS VDDSR JWUFH FGIZS ORJTB IVSKB BHMPQ NXMWK
  AGSNT KJWOX HALOV WEXTS VKIYF ADOMO NPZCZ FZROC BIRWP UNTAX WXSER
  PGPPU RINGD CGFDG ZALDT NXPUQ EPQSU ZVKDO TXTBN MUQAS ZKIGH WQRQI
--- 47,53 ----
  AEXFA UPOKM SBMNZ YUHEM ZLBQM ROIUH KECCE IXDAR VFAEV WDHPS GYTTA
  ZRNTO WRSTD YOKCW NQUIS WEFIF LFFZQ BSDCS CBNRQ SZLXB BRICQ CLSCB
  INRYO RGNZE GCYAW PMCQL CGBMX BBUBO NQOZZ OFNQR YMZWA CDMGX NAIRA
! ABKCI OWTGT CTOOK MFRPG XADLN AAJSU BMTIQ VHOUZ TBCZA LEOPO YVEWO
  USDUN TZTJT YXUIG OZQFS VDDSR JWUFH FGIZS ORJTB IVSKB BHMPQ NXMWK
  AGSNT KJWOX HALOV WEXTS VKIYF ADOMO NPZCZ FZROC BIRWP UNTAX WXSER
  PGPPU RINGD CGFDG ZALDT NXPUQ EPQSU ZVKDO TXTBN MUQAS ZKIGH WQRQI
***************
*** 84,90 ****
  FCNLQ NWAOG NTOIV JZRVS IACOO KEYIN OZBNP KEGHF JFASY SDIFB NXNXF
  JPSAM RVBQG XNIZB MVGVU VNFMU FJXEL BZLTP IFIWB LBXPB QDXAW FRHBF
  QPDCM OXOSU MMERK QNMYF YKDOC BOXIY SPLGV PBLNG NKTAK YNGBX MIPOM
! RIDCL TCIBZ HFLDV RXBKP LRKMU CQHEY RAAVH XAYDH NNNUN JCINA RAEXP
  UAQRP RUDMO OHOOM EMGUP IEEIX AQTLU PETXI BQEPN IWBRE BNSEQ RDUGG
  TGWUR QRJRL XGRDP MJPDX TSDBG YYQDR DQYSZ GLXDR IDLYX FIVSQ WZVQG
  QRXLN LBLGT EGHVN ZXRFN HFQOW XIXBE ULILO MRXQO GJXRC JOUZH OTJAK
--- 84,90 ----
  FCNLQ NWAOG NTOIV JZRVS IACOO KEYIN OZBNP KEGHF JFASY SDIFB NXNXF
  JPSAM RVBQG XNIZB MVGVU VNFMU FJXEL BZLTP IFIWB LBXPB QDXAW FRHBF
  QPDCM OXOSU MMERK QNMYF YKDOC BOXIY SPLGV PBLNG NKTAK YNGBX MIPOM
! RIDCL TCIBZ HFLDV RXBKF LRKMU CQHEY RAAVH XAYDH NNNUN JCINA RAEXP
  UAQRP RUDMO OHOOM EMGUP IEEIX AQTLU PETXI BQEPN IWBRE BNSEQ RDUGG
  TGWUR QRJRL XGRDP MJPDX TSDBG YYQDR DQYSZ GLXDR IDLYX FIVSQ WZVQG
  QRXLN LBLGT EGHVN ZXRFN HFQOW XIXBE ULILO MRXQO GJXRC JOUZH OTJAK

The first is clearly a transcription error: if you look at the original, the last five characters should be FZVKR. The next is seemingly a printers error, it does appear to say OTOOK instead of CTOOK. And frankly, I don’t remember what I found out about the last one.

If you fix these three typos, the first 117 lines of Exhibit 1 decode properly, then go astray again. I haven’t worked on figuring out what the next typo is.

Eventually we should make a verified copy of the Exhibit 1 plain and cipher texts available, so that others won’t need to toil through this.