Archive for the ‘Computer Science’ Category

Somewhere… over the (simulated) rainbow…

Tuesday, August 31st, 2010

A few days back, I simulated how light propagated in a single drop of water, but with a number of problems. First of all, it didn’t simulate the Fresnel equations, which describe how light is reflected and refracted at the interface between two media. This meant that in my simple model, no light is actually subject to total internal reflection, and no light was scattered back in the direction that the light originated. This is an important effect: without it, you cannot see rainbows. It also assumed a single index of refraction, so we couldn’t see how the reflection and refraction change as a function of wavelength.

So, I decided to fix both problems, and run a simulation to see if I could capture some interesting features of rainbows by direct simulation.

To keep you interested (if possible), I’ll merely give the picture, then explain it. (Click on it to view it full size).

Simulated rainbow reflection, light enters from left, goes to right

It’s hard to see the detail in the area of interest, so here’s a better version, zoomed in (click to view bigger):

This describes the distribution of light in each outgoing direction, assuming light coming in from the left. There are four different plots, ranging from the shortest wavelength (4047 Angstroms, or deep violet) to the longest (7065 Angstroms, or deep red). (Sadly, there isn’t any correlation between the plot color and the wavelength, you have to look at the legend to interpret things properly.) I got these plots by determining the index of water at each wavelength, and then simulating via Monte Carlo techniques the resulting outgoing distribution by tossing 10 million simulated photons at the drop. At each surface interaction, it determines what the ratio of reflected to refracted light could be, and then take one path or the other based upon a weighted coin flip. The final outgoing directions are binned, and we take the log of the count in each bin to produce the distance from the center in the polar plot.

This plot isn’t the most pretty thing you’ll ever see, but you’ll actually find a number of interesting things. First of all, you’ll see that there are two rainbows represented. The inner one is the brighter, and the outer one is significantly dimmer (you might estimate that it contains 1/10 as much energy, since it appears about 1 unit lower and I used log based 10). The inner bow has violet at the inner side, and red at the outer side while the outer bow is reversed (red on the inner radius and violet on the outer radius). There is also a distinct lack of light between the two bows. The amount of light reflected in between the two bows is less than that scattered outside them. This is a phenomenon called “Alexander’s band”.

I thought this was a fun experiment. I could work a bit harder and actually produce the appropriate colors and intensities by some more computer graphics magic. Perhaps that will wait for some other day.

Making a soft-circuit input device for your computer

Friday, August 13th, 2010

I’m intrigued by various uses for embedded processors, and so are my readers. I hadn’t seen this particular microcontroller board before, the “Teensy”, which is very similar to the Arduino, except that it is uses an ATMEL AVR chip with a direct support for USB. The link also points at a nifty interface to “soft circuit” elements, which probably has some nifty controller applications.

Make: Online : Making a soft-circuit input device for your computer.

Zounds! Sounds!

Wednesday, August 11th, 2010

Tom and I have been discussing some early hacking efforts, probably spawned in part by my re-reading of Levy’s Hackers. A couple of days ago, this resulted in me pondering the mysteries of Minsky’s circle algorithm (still ongoing), but today it drove me to an early interesting sound algorithm documented in the legendary HAKMEM, ITEM 168, which shows a simple six instruction program that generated sounds on the PDP-1. The basic idea was pretty simple: we have two variables, A and B. Read the front panel switches from the PDP-1, and add it to A. Then take the contents of A, apply a bit-wise AND from a mask, and then add that to B. Mask off the high bit of B, and output that as the “music”. (Note: I am a bit confused by ITEM 168. The PDP-1 code there doesn’t seem to be exactly what I described, but the mnemonics they list are somewhat unfamiliar to me. In any case, the rough English description above will suffice for now…)

In any case, I didn’t have a PDP-1 lying around, so I decided to implement the thing in C, using portaudio and the GNU readline library. Here’s the code:

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <readline/readline.h>
#include <sndfile.h>

/*                  __
 *    ___  ___ ____/ /__  ___  ___
 *   / _ \/ _ `/ _  / _ \/ _ \/ _ \
 *  / .__/\_,_/\_,_/\___/\___/ .__/
 * /_/                      /_/
 *
 * A simple musical toy inspired by some discussions on early computer
 * music with Tom Duff, written by Mark VandeWettering.
 *
 * THE BASIC IDEA
 * Tom informed me that someone generated odd musical tones with a
 * very short program on the PDP 1 that basically looked like this:
 *
 * for (;;) {
 *    a += sw ;         some front panel switches
 *    b += a & mask ;   mask just selects some part of a...
 *    output(a>>30) ;   output a single bit square wave...
 * }
 *
 * So that's what this toy emulates.
 *
 * I've also added a tiny bit of interface to allow you to change the
 * switch settings and the mask on the fly, to "play" the instrument.
 *
 * I've called this program "padoop" just because I needed some vowels
 * added to "PDP".
 *
 * Tom points out that it wouldn't be hard to implement this in
 * hardware, completely without a processor.
 */

#include <stdint.h>
#include <portaudio.h>

#define SAMPLE_RATE     44100

typedef struct {
        uint32_t        a, b ;
        SNDFILE         *sf ;
} paMusicData ;

uint32_t        SW ;
uint32_t        MASK ;

/*
 * Here is a very simple output filter designed by Fisher's mkfilter code.
 */

#define NZEROS 2
#define NPOLES 2
#define GAIN   3.414213562e+00

static float xv[NZEROS+1], yv[NPOLES+1];

float
filter(float inp)
{
    xv[0] = xv[1]; xv[1] = xv[2];
    xv[2] = inp / GAIN;
    yv[0] = yv[1]; yv[1] = yv[2];
    yv[2] = (xv[0] + xv[2]) + 2 * xv[1]
        + ( -0.1715728753 * yv[0]) + ( -0.0000000000 * yv[1]);
    return yv[2];
}

static int
paMusicCallback(const void * inputBuffer,
                void * outputBuffer,
                unsigned long framesPerBuffer,
                const PaStreamCallbackTimeInfo * timeInfo,
                PaStreamCallbackFlags statusFlags,
                void * userData)
{
    paMusicData * data = (paMusicData *) userData ;
    float *out = (float *) outputBuffer, *op ;
    float f ;
    int i ;

    op = out ;
    for (i=0; i<framesPerBuffer; i++) {
        data->a += SW ;
        data->b += data->a & MASK ;
        *op++ = filter((data->b&(1L<<31))?1.0:-1.0) ;
    }

    if (data->sf != NULL) sf_write_float(data->sf, out, framesPerBuffer) ;

    return 0 ;
}

void
process_command(char *cmd)
{
    uint32_t tmp ;

    if (strlen(cmd) > 0)
        add_history(cmd) ;
    if (strcmp(cmd, "p") == 0 || strcmp(cmd, "print") == 0) {
        fprintf(stderr, "sw 0x%08X\n", SW) ;
        fprintf(stderr, "mask 0x%08X\n", MASK) ;
    } else if (sscanf(cmd, "sw %i\n", &tmp) == 1) {
        SW = tmp ;
        fprintf(stderr, "sw 0x%08X\n", SW) ;
    } else if (sscanf(cmd, "mask %i\n", &tmp) == 1) {
        MASK = tmp ;
        fprintf(stderr, "mask 0x%08X\n", MASK) ;
    }
}

main()
{
    PaStream *stream ;
    paMusicData data ;
    PaError err = Pa_Initialize() ;
    uint32_t m1, m2 ;
    if (err != paNoError) goto error ;
    SF_INFO sfinfo ;

    data.a = data.b = 0 ;
    /* log output data to a file... */

    sfinfo.samplerate = SAMPLE_RATE ;
    sfinfo.channels = 1 ;
    sfinfo.format = SF_FORMAT_WAV | SF_FORMAT_PCM_16 ;
    data.sf = sf_open("padoop.wav", SFM_WRITE, &sfinfo) ;

    SW = 012345 ;
    MASK = 0x0f0f0f0f ;

    /* open the music output device... */
    err = Pa_OpenDefaultStream(&stream,
        0,              /* no input */
        1,              /* just one output */
        paFloat32,      /* output float data */
        SAMPLE_RATE,    /* at some sample rate */
        8192,           /* samples per frame */
        paMusicCallback,
        &data) ;

    if (err != paNoError) goto error ;

    /* start */
    err = Pa_StartStream(stream) ;
    if (err != paNoError) goto error ;

    for (;;) {
        char * line = readline("padoop> ") ;
        if (line == NULL) break ;
        process_command(line) ;
        free((void *) line) ;
    }

    err = Pa_StopStream(stream) ;
    if (err != paNoError) goto error ;

    sf_close(data.sf) ;

    err = Pa_Terminate() ;
    if (err != paNoError) goto error ;
    exit(0) ;

error:
    fprintf(stderr, "PortAudio error: %s\n", Pa_GetErrorText(err)) ;
    Pa_Terminate() ;
    exit(-1) ;
}

You’ll need to have the portaudio and readline libraries installed to me it work. When you run it, it should start making noises, and you’ll see a command line prompt. It accepts 3 commands: “print” which dumps the contents of the mask and switches, “sw” followed by a number, which sets the value of the switches, and “mask” followed by a number, which does the same for the mask. Give it a try.

Addendum: Try “sw 4096″ and “mask 0xFF0F0000″. Let it run for a while. If you find some interesting settings, post them in comments.
Addendum2: I included a fairly soft output filter to round the square waves a tiny bit. I’m not sure it makes any difference at all.
Addendum3: I made a version of padoop that also logged the output audio to a WAV file, so even if you can’t compile it, you can hear some of the sounds it might make.
Output from Padoop, my PDP-1 inspired music program…
Another bit of output…
Addendum4: While editing, I got some of the escaped characters screwed up, so I reinserted with the latest version. It automatically outputs a padoop.wav file of all the output that the program creates.

Drawing “circles” ala Marvin Minsky…

Monday, August 9th, 2010

In my re-reading of Levy’s book Hackers, I was reminded of an interesting bit of programming lore regarding an early display hack that Marvin Minsky did for circle drawing. It’s an interesting hack because the lore was that it was originally coded by mistake, and yet the result proved to be both interesting and even useful. I first learned of this years ago when Blinn did an article on different ways to draw a circle, but also learned that it was part of the MIT AI memo called “HAKMEM”. Here’s the basic idea:

One way that you can draw a circle is to take some point x, y on the circle, and then generate new points by rotating them around the center (let’s say that’s the origin for ease) and connecting them with lines. If you have a basic understanding of matrix math, it looks like this:

/ x' \   /  cos theta  sin theta \ / x \
|    | = |                       | |   |
\ y' /   \ - sin theta cos theta / \ y /

(I should learn how to use MathML, but you hopefully get the idea). The matrix with the cosine and sine terms in it is a rotation matrix which rotates a point around by the angle theta. Apparently Minsky tried to simplify this by noting that cos(theta) is very nearly one for small theta, and that sin(theta) can be approximated by theta (we can get this by truncating the Taylor series for both). Therefore, he thought that (I’m guessing here, but it seems logical) that we could simplify this as:

/ x' \   /   1  eps \ / x \
|    | = |          | |   |
\ y' /   \ -eps  1  / \ y /

Okay, it’s pretty obvious that this seems like a bad idea. For one thing, the “rotation” matrix isn’t a pure rotation. It’s determinant is 1 – eps^2 1 + eps^2, which means that points which are mapped move slowly toward away from the origin. Nevertheless, Minsky apparently thought that it might be interesting, so he went ahead an implemented the program. I’ll express it here is pseudo-code:

newx = oldx - eps * oldy
newy = oldy + eps * newx

Note: this program is “buggy”. The second line should (for some definition of should) read “oldy + eps * oldx“, but Minsky got it wrong. Interestingly though, this “bug” has an interesting side effect: it draws circles! If eps is a power of 2, you can even implement the entire thing in integer arithmetic, and you don’t need square roots or floating point or anything. Here’s an example of it drawing a circle of radius 1024:

Well, there is a catch: it doesn’t really draw circles. It draws ellipses. But it’s still a very interesting bit of code. The first thing to note is that if you actually write down the transformation matrix that the “wrong” equations implement, the determinant is one. That helps explain why the point doesn’t spiral in. But there is another odd thing: if you implement this in integer based arithmetic, it’s eventually periodic (it doesn’t go out of control, all the round off errors eventually cancel out, and you return to the same point again and again and again). In HAKMEM, Schroeppel proclaims that the reason for the algorithm’s stability was unclear. He notes that there are a finite number of distinct radii, which indicates that perhaps the iterative process will always eventually fill in the “band” of valid radii, but the details of that seem unclear to me as well. I’ll have to ponder it some more.

Addendum: The real reason I was looking at this was because the chapter in Hackers which talks about Spacewar! also talks about the Minskytron whose official name was TRI-POS. It apparently uses some of this idea to draw curves in an interesting way. HAKMEM item #153 also suggests an interesting display hack:

ITEM 153 (Minsky):
To portray a 3-dimensional solid on a 2-dimensional display, we can use a single circle algorithm to compute orbits for the corners to follow. The (positive or negative) radius of each orbit is determined by the distance (forward or backward) from some origin to that corner. The solid will appear to wobble rigidly about the origin, instead of simply rotating.

Might be interesting to implement.

Addendum2: I goofed up the determinant above, it should be 1 + eps^2, not 1-eps^2. Corrected.

One Bit Ferrite Core Memory

Thursday, August 5th, 2010

Anyone younger than me has probably never seen core memory, or even knows how it works. A very cool writeup, illuminating the actual workings of this technology from the past.

One Bit Ferrite Core Memory – Wayne’s Tinkering Page.

Writing emulators for obscure old computers…

Thursday, August 5th, 2010

(Hmmm. I’m sort of on a retrocomputing kick today…)

A few years ago, I wandered into Tom’s office one day to find him typing away. I asked “whatcha doing?” and he replied “I’m writing an emulator for the PDP-1 so I can play the original Spacewar!” It tells you a lot about Tom to hear stories like this. It tells you something about me that I didn’t respond with “why?” or “that’s stupid” and instead just went back to my desk and wrote my own. I had never seen a PDP-1 (although the Computer History Museum has since completed restoration of theirs, where you can see Spacewar! running on real hardware). It took me a couple of days to tinker the emulator sufficient to run Spacewar! into shape, and I learned a lot about how to write emulators.

A couple of days ago, Eric sent out a link to his RetroChallenge project, an FPGA-ELF:

The ELF was a microcomputer based around the RCA-1802 processor, and was published back in 1976 in the pages of Popular Electronics. I remember reading this as a precocious 12 year old, but it would be four years until I got my first computer, an Atari 400. Still, these older machines seem like interesting things, so I thought I’d investigate what it would take to make a high quality emulator (or even a crappy one) for the 1802.

Here’s the first little nugget that I found. The 1802 has sixteen 16-bit registers, labelled R0-R15. One of the more interesting features that it has is that the PC isn’t a dedicated register: there is a 4 bit P register, which tells you which of the sixteen R registers serves as the PC. In each cycle, it fetches the byte pointed at by R[P], increments R[P], and then executes the instruction. I found that there is an interesting case that occurs with opcode 0×20. The high nibble of 2 indicates that it is a DEC instruction, and the low nibble specifies the register. This says “DEC R0″. But in my case, R0 was serving as the PC (P was 0 as well), so this instruction essentially undoes the normal increment, restoring its previous contents, and we have a single byte infinite loop.

At least, that’s what I think should happen. I don’t have an actual computer, so debugging this kind of stuff is what writing emulators like this is all about.

Getting the basic instruction set working shouldn’t be too hard, but there are other things like handling DMA that might be a bit trickier. Still, I’m destined to learn a lot, about a subject that is of no importance whatsoever. But then, you should expect that from me and this blog.

Addendum: Eric wrote up some nodes about his FPGA-ELF on his blog. Factoid: it runs about 100x faster than the original ELF.

10,000 Monkeys Typing…with a Unix/sh challenge…

Tuesday, July 27th, 2010

I was testing some code that I wrote for analyzing cryptograms, and decided that the easiest way to do so would be to get some random text, drawn from the letters A-Z. A moments thought yielded this method, without even programming anything:

tr -d -c A-Z < /dev/urandom | dd ibs=10000 count=1

The tr generates the required data, and the dd truncates it to the desired number of characters. But for tidiness, I'd like to have the output broken up so that each line consists of 50 characters, with spaces inserted between every 5 characters (I won't begrudge you if you leave a dangling space at the end of each line). I couldn't figure out a simple way to get that to happen all in one command line and using standard utilities. I can of course write a little Python utility, or even perl, but can anyone think of a clever short way to do this?

Addendum: Tom pointed out something interesting about the command that I listed above: it doesn’t work the way I think it does. Apparently the ibs is the input block size, which dd dutifully allocates, and the count is the number of read system calls that the system issues. For reasons which escape me, it does not try to make sure that it actually received a full input block: it will nicely return short blocks if it finds them, and doesn’t bother retrying to get more. Hence, it works rather erratically when using a pipe as input, particularly when the writes from the upstream process may flush in odd intervals.

Primality testing with Perl regexs

Friday, July 23rd, 2010

Okay, here’s a little pet peeve of mine. Somebody brought up that it’s possible to do primality testing using Perl regular expressions. This has been kicking around for a while, I’m not sure who originated the idea, but you can find links to it with a Google search. As an example, consider the Perl one liner here:

Primality testing with Perl regexs@Everything2.com

It’s a very cute hack, so where is my peeve?

The regular expression they list isn’t a regular expression.

It seems a bit of a quibble. I’ve had people argue that whatever the meaning of regular expressions might be, the “regular expressions” matched by popular “regular expression” libraries like Perl and Python should supercede, because they are in fact so ubiquitous. I might argue, except for one thing:

Real regular expressions can be matched really quickly, but are seldom matched very quickly at all by Perl, Python, or Ruby. Russ Cox wrote a nice article about this. Yes, perl regular expressions can match all sorts of complex things that real regular expressions can’t, but if all you’ve got is a real regular expression, perl’s performance is typically an order of magnitude slower than necessary, and in fact, perl can be arbitrarily slow. This should come as no surprise: since the perl implementation of primality testing does work, it obviously must be slow: it is, after all, capable of any arbitrary computation.

It’s not like doing it correctly is even difficult: we implemented all this stuff back in Ginnie Lo’s compiler class back in the early 80s, and it was old back then. We made lexical analyzers that began by compiling NFAs for all the tokens in the language, and then converting them into DFAs, which marched along and processed files in linear time.

Okay, it’s my peeve, but there is some good, beautiful computer science underlying it all.

Reminder: HOWTO tunnel http using ssh…

Friday, July 16th, 2010

I have a couple of devices at home that provide http servers on my local network. I have them tucked nicely behind my firewall so that they are not accessible from the outside, but occasionally, I would like to login to them to perform some reconfiguration or the like. This is where ssh comes to the rescue: you can use it to create a secure tunnel, a port that I can connect to on my local machine that gets routed to my remote machine through my firewall.

The problem is, I can never remember the command, and it takes me a few minutes of thought to reconstruct it. So, I thought I’d write it down here so I’d remember.

The basic command is:

ssh -f mylogin@myserver.com -L 20000:192.168.1.132:80 -N

When I enter http://localhost:20000 on my machine, it’s as if I was accessing the web page on 192.168.1.132 on my local network. Very useful.

Crazy Optimization of Chaocipher…

Thursday, July 15th, 2010

Okay, this is a minor hack, but I thought it was fun, so I thought I’d write it up here.

My original code for simulating the Chaocipher machine proceeded as follows: it found the character in the plaintext wheel (by linear search), then rotated each wheel to get the plain and cipher text entry to the zenith minus one and zenith positions respectively, then did the short cycle rotates. It used a temporary array to perform the first rotate, and then used memcpy to copy the results back. The short rotates were done in position. Hey, it was easy to write, but kind of ugly. In particular, it did twice as many read/write operations as necessary, and I hypothesized that it was a bottleneck.

So, I wrote a little code generator that instead generated “compiled” code for each of the 26 possible paths through the encoding code. Each entry in the cipher and plaintext wheels is read and written only once, and I use only an additional char of storage to hold temp values. Here’s the fragment of the code that executes if the cipher text is found in position number zero:

	case 0:
		tmp = m->cw[1];
		m->cw[1] = m->cw[2];
		m->cw[2] = m->cw[3];
		m->cw[3] = m->cw[4];
		m->cw[4] = m->cw[5];
		m->cw[5] = m->cw[6];
		m->cw[6] = m->cw[7];
		m->cw[7] = m->cw[8];
		m->cw[8] = m->cw[9];
		m->cw[9] = m->cw[10];
		m->cw[10] = m->cw[11];
		m->cw[11] = m->cw[12];
		m->cw[12] = m->cw[13];
		m->cw[13] = tmp;
		tmp = m->pw[0];
		m->pw[0] = m->pw[1];
		m->pw[1] = m->pw[2];
		m->pw[2] = m->pw[4];
		m->pw[4] = m->pw[6];
		m->pw[6] = m->pw[8];
		m->pw[8] = m->pw[10];
		m->pw[10] = m->pw[12];
		m->pw[12] = m->pw[14];
		m->pw[14] = m->pw[15];
		m->pw[15] = m->pw[16];
		m->pw[16] = m->pw[17];
		m->pw[17] = m->pw[18];
		m->pw[18] = m->pw[19];
		m->pw[19] = m->pw[20];
		m->pw[20] = m->pw[21];
		m->pw[21] = m->pw[22];
		m->pw[22] = m->pw[23];
		m->pw[23] = m->pw[24];
		m->pw[24] = m->pw[25];
		m->pw[25] = tmp;
		tmp = m->pw[3];
		m->pw[3] = m->pw[5];
		m->pw[5] = m->pw[7];
		m->pw[7] = m->pw[9];
		m->pw[9] = m->pw[11];
		m->pw[11] = m->pw[13];
		m->pw[13] = tmp;
		break;

You should notice that the cipher wheel doesn’t bother to move element zero at all (it doesn’t move if the plaintext letter is found at index 0). It also doesn’t touch the upper half of the cipher wheel, since those letters are left in place by this particular path. It then uses each element as the source and destination precisely once. Since there are two cycles, uses the temporary variable twice.

The resulting code is about 1450 lines long, but the resulting program runs almost 2.5x faster overall, searching about 10.8M nodes per second in my backtracking search. Worth the 20 minutes or so it took me to write the code generator.

Addendum: In theory at least, you could safely issue reads and writes which occur in different cycles in parallel, since you know they don’t target the same addresses. But I don’t know how to hint to the C compiler to perform that optimization. I also suspect that SSE instructions might allow you to load and swizzle faster.

Typos in Exhibit 1?

Monday, July 12th, 2010

WARNING: if you are working on this code, this article contains spoilers which may blunt your own intellectual satisfaction in working on it yourself, including some recovered keys.

Okay, I’m home, and feeling pretty jet-lagged, so this might be wrong in some way that additional sleep will reveal, but I wanted to get this out of my head before work today.

First of all, I downloaded Exhibit 1′s plaintext from this page, specifically this link for the plaintext and this link for the ciphertext. These files contain no spaces or carriage returns, so are relatively unfriendly for human consumption, so the first thing I did was to reformat them into 11 groups of 5 characters per line to match the format that Exhibit 1 is presented in the scanned chapter of Silent Years.

Then, I set my constraint solver working. It loads both the cipher and plaintext, and does an exhaustive search to find the consistent wheel settings that consumes the most input. It then dumps the settings as a command line that should decode the ciphertext, or at least as much of it as it has discovered.

The first 100 lines of Exhibit 1 consist of 100 copies of the 55 character sentence:

ALLGO ODQQU ICKBR OWNFO XESJU MPOVE RLAZY DOGTO SAVET HEIRP ARTYW

So, the first 100 lines of the decryption of the cipher text should be just 99 copies. But my cipher program pointed at a problem after 603 characters. Here are

./chaocipher -d -c 'CPEDQRSTIXYLMOZABFVGUHWJKN' -p 'AYZNBQDSEFGHLWIKCMOPRTUVJX'

When I run this on the original text, I get the following (numbered for convenience):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
ALLGO ODQQU ICKBR OWNFO XESJU MPOVE RLAZY DOGTO SAVET HEIRP ARTYW
ALLGO ODQQU ICKBR OWNFO XESJU MPOVE RLAZY DOGTO SAVET HEIRP ARTYW
ALLGO ODQQU ICKBR OWNFO XESJU MPOVE RLAZY DOGTO SAVET HEIRP ARTYW
ALLGO ODQQU ICKBR OWNFO XESJU MPOVE RLAZY DOGTO SAVET HEIRP ARTYW
ALLGO ODQQU ICKBR OWNFO XESJU MPOVE RLAZY DOGTO SAVET HEIRP ARTYW
ALLGO ODQQU ICKBR OWNFO XESJU MPOVE RLAZY DOGTO SAVET HEIRP ARTYW
ALLGO ODQQU ICKBR OWNFO XESJU MPOVE RLAZY DOGTO SAVET HEIRP ARTYW
ALLGO ODQQU ICKBR OWNFO XESJU MPOVE RLAZY DOGTO SAVET HEIRP ARTYW
ALLGO ODQQU ICKBR OWNFO XESJU MPOVE RLAZY DOGTO SAVET HEIRP ARTYW
ALLGO ODQQU ICKBR OWNFO XESJU MPOVE RLAZY DOGTO SAVET HEIRP ARTYW
ALLGO ODQQU ICKBR OWNFO XESJU MPOVE RLAZY DOGTO SAVET HEIRP ARTGW
ADLGO OSLLI VFCRB YMHAW QGNMS NDNCK GZYPM LAYNZ LFNDO WGBJU IXZLD
TNOJM TUSZG IKMLC WNCKK OUICA YKSJJ SSKFR GLLPU XAOXS BXAPQ ECJSU
SFLFT ZLNQS SWNZN TKSDG JUUQC GQCDD WPDIN FZODL MJEXV WIFFW TEFPS
ZKUAT XVTKL FZLHM VICVD PMEZP YCULX VAXNE AYRSL BBMIT UJGII TSPIQ
UMUMC FWFPO SYYAW EYCKY ZQUVJ BOIIU FBHXT QFHKS EMWDB TSNJY GNIKJ
QHJRR DCGCX MINUQ ZGITG HTINW UZWFX HJEWM QDCXN BXKGY QDDNG IAXNY
RCRAM KCOAI VLKLF SNVGW SQYOA OYUUW YOQAC SUDFF KYXPJ GTBPS ZTONZ
CZPDM MOORN EVZZV AHBUP VOCUJ AHCGZ SLVUY ILIDG YAOOX TYDDA LJDLO
ULRNQ ASEZW BXPSV LEMYZ THMVA BHCII YVHIU JQBAU RGZEI ARUIL NZEDY
BJIOG TRFMH HDJFR ROBKH SMUHY PBYUQ MNNRW YBPRZ KCUQK LGNAI EXRAX
KDJVP JHMRN IFINI ZYGSZ ITDWZ NKGQV QUXVJ YZUIJ FQECB RWSDK QZUHM
XZBFK XGICH IMHHR NKSMF VXTCG IQOJK TCCJO BWQJC ATUBZ ILHRE ULVHQ
WUMDQ SERWK AYNOW PGKZF CWMOG SOLYO YVUSW AIEIO FPJDS SCMEC LTWBA
KNGQE KZFCC RKKXL ZXNTH UQXQD DGJPU EZEVE IPVZC XNRZV OGGVV JBCYP
WMJQG BYMKX NNGAA YDGRA XKVAV LVLFK WIQDE DBZHF YSVEN WOGJK USZPZ
QPEKM UHTDT HTOJU FQIMZ KHPWT AJGDE CEUKS RSLNU GIUAM XUWLN CSJVC
NXOBT AVQTY HLBUL VYYBX CLFEV LKVVE JGYAP SDSIS RZPJQ ISDKS JNEDW
HMSMF DXVZY LVVJF USVXN ZGKFQ NYMZO QQGQJ SCUVS LJRAL NOYIY HRWVD
JMEFF ZRNCU PRUFT NFETI WESRY NOVKQ UVCKY QNWQU OQMHA MATEE QBRWE
YBSQR HLMRH OCLOP PWIOE AGTHF KTMLE TYILX UZSNX JWHIG LJXRB KUCNW
RZMFC OBKBL FGEVF THGHE MYHMA LLWBV SMUDF XGHPM IWDVC ODXYR LGLTU
SGPKX WQBIR IPVJV LNNWR LYSQK OQPOD BBZQL WSSJB IUHVU IXTVW RAUJJ
BLGMT CIVOQ KQXXZ FCXQN GWCWI SQWMY OCZBY PHCEH ULDTT FETNI ESUVQ
CUWCL LSQVL BRYZW BRGIC QFHDU MUJQQ SXCVK SPYRM BYFFI NQZPZ PURFS
RWNWH WOKUZ GSNPM LJQDB XOZEG YTYHH PAGFR NQILO QCTNP RNFMY PHSEH
MXUDW AUDMB MLZBW SIGDP EOAVG BXWWH EREWH DBPSC WFAFS AHATC HAYAL
UVYMP KGPMD NBIWC NQTXT FYVYW KNKQU VJZIH WZCLF HTTBY SRRWW APVAQ
PCZYV PTMLJ BYQRQ IHIHZ ZXIWX CEOVG FYZUN PQFTW NNHYX CQKUN GWVNF
NYGDI ZKOUR VTQYI MNFVX JHJYN TWOKP DMOYF RJOED GLGKL JZYVE EKLZU
JEMXF VWVYZ ROISI CQKRZ XUFWX RFWVS VPPLH MUEOF YUBPW BBOUG BAUMD
GIWET YNTKV DSESI ZLWSL VKHIZ NZKCO JTRGH SWVVZ RUNDD LHEFP XTLHD
LFEFB UHDDH ZORCZ NLKQZ QJZQB ZBQFT CDMOS UGHCE YYUSY EPCID PMZYY
ZHWWL TZNLX HEEMQ JCEYS EAOWO JXBGK BHCAT QTHAB CPXXT BMSLW BBLQR
CSHCH MHURA WOQRG LWMLT WGFDG AOEJF GSFXR DMRHR FCBPY JLFGP UXQHW
NPUQP BKMIC EQZDM YIMXU FWASM CXJAR PSVEK ZMIZV PQNRP YGBLK IXJFF
ASHHB BWRDJ ZXFSG STINW TGJHY TQSRQ ENFIX JMSZZ FYKKC IEOIR ERJYK
JLPCU NXTAP XTCBL HIGZT SEPTW SBMFE WLKZX WIZDG RBAEJ LIIRW UFRKI
NLJTR UTFWF KFSVJ KZQEC VCYNI RIBQM BJHEG FZYAT UQOAS UCLBG GCYBD
PSTDV TKEBN IGUFH XTKTL INLVG KDPGG IEPJG BHASL CBWNS BOVYA GKCRZ
WUQXV JZMNG NANAD BMGFN QUZCM QAHQH VAXWK SRFCY OJGKC RIGHL HGTAH
WCFRZ MHXWK TWJPM YQWJH FNCPU BLBXD YXMSK BDDIY ZQABQ HZMEO WPNOE
RHFVE QHIBY FPDCW EXRZG GJNZO MLEYY WZJHH YVRRM LUHXI YJDDQ TJOOM
EVWYG IQUFD LPPMD QADBK CTBIJ HELST EOXZZ SUUWI XZHIS OXPVY FMNOP
WKOIF JHBOZ CITZQ VTGEM NOJGB KQMVG URNWU GFEBB IDWXB KQEOD FOPIV
XLNJA LPUKH NXQTI YBIEV YIJYT ZHSQW LISZT YPZHQ VOYTF CMRFM QAPSP
YZGZT SHGIW VFZDZ YMVEF YZYJI KHQMG WVQOG YACHG MELVP MNMOE PASDH
WIWLH KAEQX QBEWW IMPVC PZISZ SWZBR SETHW LXHQO DADLT UVUXC JSNKB
EDYMV KNXGM DLOOB KUUFA DRPHO GLBNA WXVCG YZKVY BQMKF NFWGR JPAHA
DGDIO OQCTC SQBKT JVHDE KFKQF KOKYW RPEBF HBKCK ZNQYH MZBWD VAHFF
QVIAZ QUWKA IEVXP KLDWJ ATRTB LYPWI AWPAO TYDOM YFQZR ZTMAS DMCOS
ALHRW QRSPC KIHJV JXSJB UQVOG EWQON BKNLO VPIGF TNGUP BVBTU WHGTJ
XZJAV NCACR UNPZW DQWOT SYOAS YPBQZ TVIYF VZKIW SFETG XUCYI QSBPY
IHYRV YSIII IUGHL FNUUH ARMJL BIAUT WNLOP ZXECP SEBYG SSUGS LNNJR
GMPDT ZQIIE RHKSO UEZBO XEAFB DYKRK ARYYX XQUJG DTWLB WJZCB LHVIQ
RKQAF VVKEW MSVDR ZVQAN JNCBA MJJRR EJKZK ZLILN KKXVR UTXIQ ZRQDG
OKHRZ KWVHH MOHNP JVXLB XVCVV YVYYO CKKZH VRXKQ PWAJY ZIKAS QVFET
CPJHD OOZBT JIULN OSVTG DJBPP ZMALD ZCUJW HNWTL AZNIV VFVQC NMAMC
DXUGM ZUHOX YDEYR NYNIA NRWQY UIRPD UTUZF JLUJL GYIOQ GYGZG DWDIM
GMGJT TPOOK DTBFJ YUAAN OPEGH TSMPT VDUOV ZRANZ MUFDB ECZQV NWCBV
NKLVO ELKJI LKOAI PPRHR AOXIA YNYEY DFSAN JCCTR GDLWE MLOLP SQRXF
NMWLV KXALI PHSHL CHQKI CQZNK VGHBZ YXPBS CQSUB JEUYH RHAXU TJIGJ
EEMUT RSLMQ OJVLO XHXZQ GGNZM CBCZS OAWTR TTZDK YJPIQ FUVMV FCHIC
KLIRL XNRHK QKLKD JZKGJ OTCUQ ISBBH QFQOL VHAWG HWTPB BYDBB PVJKE
HQDFQ OBHXD YLZVV UXHSJ RYNXU MQENC ZHKUN MIZZB NNHRS ZVETO PCQOR
THGUI LBPFB CDJHW ARUBH FEIYL KRTXP FQEKX TFGND WQPVO NTESV WYQFU
OZMVC BMSAF ARKMX LQCWD JEAOZ WVFKM SSQND FVZNH EWCRP TKBBY ZCYCB
OWGTB IIBQX EUSAH VYJVM YHFYQ JFZIS AIKBX YIROC DEAJY PZKXG NQCZI
CCVFP PEDOC DUIES JDHAC THZCD VVSAG AWLIG XATOG ODAPS HWPWL MAUPP
AYXHP QGPKT LBERJ UKYXL JQSJG UKVMM IVWFS CMXER OKYIS QPQUS LOVHM
QIVLW GVEDN SORUC AHVTJ UNEYH VOYEE TSTSW KNGVO PZVCK BNMNC ITPNN
BVAZI JHIKG ULJOY RYYTB MLJZR PUOUZ NZQZI GTOIL CMMXY EUUSQ MJUBT
SFAGH JOKBU ODASI ILJHS QYVVD ZRZOC KZEZW XFVFS OMYZA YRXGC ONOQO
DRUGY MDFVL PRAOH SKVRX YKHRB DJVLA HNDKQ JJWMO VAOND VXZHC WEEXK
ZQQLH AOWLO TNKXI LHEJX ZHTSQ DFFJJ DPAUH TFXDI ZOWUK BOLMY ASRGP
XXNDA DIOCT BJJLK PUOUM LBHFE JBUBS FFYTK PZWNA FUJMA NVKNF JNXRH
HMROR KPBRB AJPUE IBUAI CWEMA SMTED JOAMB KFLPH BPYFT EZQZX JKHIF
VQRTJ ECWYC BQYTO CREGC TPFRT LGSHD YSULB HUVCW YFNYB BRRGH FGWAR
SBZXO LSUYP FJCPC TEZSM ICAZJ ACVIA MXMHC QHHUH HWBFF YSDII FUPYN
NHEWK YWBWH BJBTY GWPYR HSVKH DYHLZ JWCXH SYCFS SYEMJ DVIAA XNQNH
SAVVH ZCHBS LXYEO PIWHC OLQKW CSQXH JLNGE IYGDN XYCZK LTGPR YXMDQ
BXWVC XAJLI SKXIH TJHRB BONEZ ZJDPU YETBL EOVZZ LYMSG XTAEX HAGHS
CVTNS LZSMY ZWFHL ARUVN JMGLB PNZGJ ZFNWT JDYDH HBUAI GKQMG FNXON
FXQRH NNYMB ILCXR SPNZJ KCIVK OMXZN XWVGA KVHSE RVGTM UBCSC PUABB
IOBXN DATDL LTJUZ LMEXE EAHNR PIIHZ WHHEM XMFSM JDRIQ JHPRS CBGRW
CIGED BTPIU HLCTQ KOMDD KBHCV ICRWI CJBSJ IWIVB WYOHK ECODH NBSZH
HUPNL FXJEX SZVMM TLYPQ WYQDJ IIRTN YNIHG UZJFS VQPST RSKNP WMHYT
SEEYI YCKNF NYGEI XLJGF HSFGM NDXEI UUGJF RVBWV CDLDE SFCNW LZXUB
AKWBI KATXO OALDE SHAHB RTXTI FMSEW WKDEB GBWMM RPBXO VRQIR MTQFH
VMPVA BMMTM MPBAI ZTKTS XAZJU NRHUK RCKMI ILHBK VVMSS VWLAE GWXAV

You can see that in line 11, goes awry, and becomes desynchronized. It could have been a bug in my program, but I also suspected the possibility of a typo. So, I created my own version of the cipher text by taking 100 copies of the original line, using the key settings that I recovered, and creating my own version of the cipher text.

They match, except for three places. Here are the context diffs:

*** original.1	2010-07-11 17:37:09.000000000 -0700
--- fixed.1	2010-07-12 07:02:11.000000000 -0700
***************
*** 8,14 ****
  BUZLA GDBCU AMFQL ACRWW TUGSM PPZBR FASRO YIRCA GVEYN SRTOQ TDLFJ
  RUTKF KASGV LVYYF VRAIY NIVJK IUWPF ZBVRU EOTEJ GLCGY SSNHH QTIQW
  UKQAS XKGSP WHRYM TQSOQ BAMAP FQRLI IUGTI VBEBY XFBIU SEYHM LKGOE
! CSWUH TBIZZ HLBND IWTQA MAZBM YMBEK CYKCA BLYQY MELPJ OWNRV FZVBR
  EBVUJ EQIAE MOHTG FHFFI DIQQJ UAWDH LUYRE UGSKT IMDWR RNONJ KDPTC
  JDCJN BVEOU TWXOF GRXND KITNL OXSLZ WQRDE RERHL XWAMY LRVPR JFHRA
  SDJWW OIWEV AVMRR NLRJM IFDHH ADDQC BZWYK DVPAY NPIAX BYUKI JGVUC
--- 8,14 ----
  BUZLA GDBCU AMFQL ACRWW TUGSM PPZBR FASRO YIRCA GVEYN SRTOQ TDLFJ
  RUTKF KASGV LVYYF VRAIY NIVJK IUWPF ZBVRU EOTEJ GLCGY SSNHH QTIQW
  UKQAS XKGSP WHRYM TQSOQ BAMAP FQRLI IUGTI VBEBY XFBIU SEYHM LKGOE
! CSWUH TBIZZ HLBND IWTQA MAZBM YMBEK CYKCA BLYQY MELPJ OWNRV FZVKR
  EBVUJ EQIAE MOHTG FHFFI DIQQJ UAWDH LUYRE UGSKT IMDWR RNONJ KDPTC
  JDCJN BVEOU TWXOF GRXND KITNL OXSLZ WQRDE RERHL XWAMY LRVPR JFHRA
  SDJWW OIWEV AVMRR NLRJM IFDHH ADDQC BZWYK DVPAY NPIAX BYUKI JGVUC
***************
*** 47,53 ****
  AEXFA UPOKM SBMNZ YUHEM ZLBQM ROIUH KECCE IXDAR VFAEV WDHPS GYTTA
  ZRNTO WRSTD YOKCW NQUIS WEFIF LFFZQ BSDCS CBNRQ SZLXB BRICQ CLSCB
  INRYO RGNZE GCYAW PMCQL CGBMX BBUBO NQOZZ OFNQR YMZWA CDMGX NAIRA
! ABKCI OWTGT OTOOK MFRPG XADLN AAJSU BMTIQ VHOUZ TBCZA LEOPO YVEWO
  USDUN TZTJT YXUIG OZQFS VDDSR JWUFH FGIZS ORJTB IVSKB BHMPQ NXMWK
  AGSNT KJWOX HALOV WEXTS VKIYF ADOMO NPZCZ FZROC BIRWP UNTAX WXSER
  PGPPU RINGD CGFDG ZALDT NXPUQ EPQSU ZVKDO TXTBN MUQAS ZKIGH WQRQI
--- 47,53 ----
  AEXFA UPOKM SBMNZ YUHEM ZLBQM ROIUH KECCE IXDAR VFAEV WDHPS GYTTA
  ZRNTO WRSTD YOKCW NQUIS WEFIF LFFZQ BSDCS CBNRQ SZLXB BRICQ CLSCB
  INRYO RGNZE GCYAW PMCQL CGBMX BBUBO NQOZZ OFNQR YMZWA CDMGX NAIRA
! ABKCI OWTGT CTOOK MFRPG XADLN AAJSU BMTIQ VHOUZ TBCZA LEOPO YVEWO
  USDUN TZTJT YXUIG OZQFS VDDSR JWUFH FGIZS ORJTB IVSKB BHMPQ NXMWK
  AGSNT KJWOX HALOV WEXTS VKIYF ADOMO NPZCZ FZROC BIRWP UNTAX WXSER
  PGPPU RINGD CGFDG ZALDT NXPUQ EPQSU ZVKDO TXTBN MUQAS ZKIGH WQRQI
***************
*** 84,90 ****
  FCNLQ NWAOG NTOIV JZRVS IACOO KEYIN OZBNP KEGHF JFASY SDIFB NXNXF
  JPSAM RVBQG XNIZB MVGVU VNFMU FJXEL BZLTP IFIWB LBXPB QDXAW FRHBF
  QPDCM OXOSU MMERK QNMYF YKDOC BOXIY SPLGV PBLNG NKTAK YNGBX MIPOM
! RIDCL TCIBZ HFLDV RXBKP LRKMU CQHEY RAAVH XAYDH NNNUN JCINA RAEXP
  UAQRP RUDMO OHOOM EMGUP IEEIX AQTLU PETXI BQEPN IWBRE BNSEQ RDUGG
  TGWUR QRJRL XGRDP MJPDX TSDBG YYQDR DQYSZ GLXDR IDLYX FIVSQ WZVQG
  QRXLN LBLGT EGHVN ZXRFN HFQOW XIXBE ULILO MRXQO GJXRC JOUZH OTJAK
--- 84,90 ----
  FCNLQ NWAOG NTOIV JZRVS IACOO KEYIN OZBNP KEGHF JFASY SDIFB NXNXF
  JPSAM RVBQG XNIZB MVGVU VNFMU FJXEL BZLTP IFIWB LBXPB QDXAW FRHBF
  QPDCM OXOSU MMERK QNMYF YKDOC BOXIY SPLGV PBLNG NKTAK YNGBX MIPOM
! RIDCL TCIBZ HFLDV RXBKF LRKMU CQHEY RAAVH XAYDH NNNUN JCINA RAEXP
  UAQRP RUDMO OHOOM EMGUP IEEIX AQTLU PETXI BQEPN IWBRE BNSEQ RDUGG
  TGWUR QRJRL XGRDP MJPDX TSDBG YYQDR DQYSZ GLXDR IDLYX FIVSQ WZVQG
  QRXLN LBLGT EGHVN ZXRFN HFQOW XIXBE ULILO MRXQO GJXRC JOUZH OTJAK

The first is clearly a transcription error: if you look at the original, the last five characters should be FZVKR. The next is seemingly a printers error, it does appear to say OTOOK instead of CTOOK. And frankly, I don’t remember what I found out about the last one.

If you fix these three typos, the first 117 lines of Exhibit 1 decode properly, then go astray again. I haven’t worked on figuring out what the next typo is.

Eventually we should make a verified copy of the Exhibit 1 plain and cipher texts available, so that others won’t need to toil through this.

Progress on Exhibit 1

Saturday, July 10th, 2010

Okay, I think I’ve figured out the problem with my code that back propagates cipher wheels to the beginning of the code, and ran it on Exhibit 1 again. It actually matches 603 characters of input, then stalls, after searching 1.8 billion cipher encoding possibilities. I am now trying to figure out if it is a typo in the original, or some slim bug that causes it to go awry.

Because others might be pursuing their own solution of this issue, I’ll hold off on publishing the key that I’ve recovered for a bit longer. If anyone needs it, let me know, and I’ll send you the spoiler.

Addendum: There is a typo in the version of the cipher text that I downloaded from the Chaoscipher clearing house. I’ll write up more when I’ve learned more.

Addendum2: More than one transcription error. More later.

More on Chaocipher…

Friday, July 9th, 2010

Well, I didn’t have much time left to work on Chaocipher last night, so I left it running on Exhibit 1. It claimed to recover a key that allowed it to match 601 characters of input, but found no better match. But it also uncovered an error in my code. It doesn’t appear that the code that I wrote to rewind the machine to find the initial settings are quite right. For instance, if I use the cipher wheel and plain wheel both set to ABCDEFGHIJKLMNOPQRSTUVWXYZ, and then encode a test message (like the first paragraph of Moshe’s announcement of the publication of the algorithm of the Chaocipher) then I get the following cipher text:

NHLBP ULNFR XRNDL IZRDO IWZSY HQCUQ XFSDE QGMPG RFUIR FQETN
ZTTXI YQJJN FHFYN WWQTO SXEKX DKCXP OKOAH FPCKI XQFYF UQVMA
JFKSN LGOTY KTNQJ ZNFXV ULYNO YRZQY ORVJV RUWUH HMAZA LVSIS
IGLJB QAZPK NJZXO YZHPK EMBCS XGKGY YTUET VSJWK WPXKC OCYRH
CNGVU RUPYZ AAFQE

which decodes to:

NINET YTWOY EARSA FTERI TSINV ENTIO NFIFT YSEVE NYEAR SAFTE
RCHAL LENGE MESSA GESWE REPUB LISHE DANDA FTERM ANYCR YPTAN
ALYTI CRESE ARCHE RSUNS UCCES SFULL YTRIE DTOSO LVETH ESECH
ALLEN GEMES SAGES JOHNF BYRNE SCHAO CIPHE RALGO RITHM CANFI
NALLY BEREV EALED

When I run my solver though, it recovers:

found a solution that exhausted the input.
NPQRSTUVWXYZAOBCDEFGHIJKLM OPRSTUVW?Y?AB?CDEFGHIJ?LMN

Which is close, but no cigar. I’m not awake enough to spot my mistake yet, and it seems unlikely that I’ll get a chance to fix my error for the next few days, but I’ll try to get an hour to track down my mistake. I do think I’m 90% of the way there to recovering the key for Exhibit 1. I’m pretty confident that the solver gets the right answer, but that it is rewinding very slightly incorrectly.

Addendum: Ah, it appears I’m not quite rewinding the very last character properly. The key that I recovered is actually good for decoding, as long as you start from the second character. I’ve written this in an overly clever way. If I use my recovered key from Exhibit 1, but start at the second letter, I get something pretty reasonable:

LLGOO DQQUI CKBRO WNFOX ESJUM POVER LAZYD OGTOS AVETH EIRPA
RTYWA LLGOO DQQUI CKBRO WNFOX ESJUM POVER LAZYD OGTOS AVETH
EIRPA RTYWA LLGOO DQQUI CKBRO WNFOX ESJUM POVER LAZYD OGTOS
AVETH EIRPA RTYWA LLGOO DQQUI CKBRO WNFOX ESJUM POVER LAZYD
OGTOS AVETH EIRPA RTYWA LLGOO DQQUI CKBRO WNFOX ESJUM POVER
LAZYD OGTOS AVETH EIRPA RTYWA LLGOO DQQUI CKBRO WNFOX ESJUM
POVER LAZYD OGTOS AVETH EIRPA RTYWA LLGOO DQQUI CKBRO WNFOX
ESJUM POVER LAZYD OGTOS AVETH EIRPA RTYWA LLGOO DQQUI CKBRO
WNFOX ESJUM POVER LAZYD OGTOS AVETH EIRPA RTYWA LLGOO DQQUI
CKBRO WNFOX ESJUM POVER LAZYD OGTOS AVETH EIRPA RTYWA LLGOO
DQQUI CKBRO WNFOX ESJUM POVER LAZYD OGTOS AVETH EIRPA RTYWA
LLGOO DQQUI CKBRO WNFOX ESJUM POVER LAZYD OGTOS AVETH EIRPA
RTGWA DLGOO SLLIV FCRBY MHAWQ GNMSN DNCKG ZYPML AYNZL FNDOW
GBJUI XZLDT NOJMT USZGI KMLCW NCKKO UICAY KSJJS SKFRG LLPUX
AOXSB XAPQE CJSUS FLFTZ LNQSS WNZNT KSDGJ UUQCG QCDDW PDINF
ZODLM JEXVW IFFWT EFPSZ KUATX VTKLF ZLHMV ICVDP MEZPY CULXV
AXNEA YRSLB BMITU JGIIT SPIQU MUMCF WFPOS YYAWE YCKYZ QUVJB
OIIUF BHXTQ FHKSE MWDBT SNJYG NIKJQ HJRRD CGCXM INUQZ GITGH
TINWU ZWFXH JEWMQ DCXNB XKGYQ DDNGI AXNYR CRAMK COAIV LKLFS
NVGWS QYOAO YUUWY OQACS UDFFK YXPJG TBPSZ TONZC ZPDMM OORNE
VZZVA HBUPV OCUJA HCGZS LVUYI LIDGY AOOXT YDDAL JDLOU LRNQA
SEZWB XPSVL EMYZT HMVAB HCIIY VHIUJ QBAUR GZEIA RUILN ZEDYB

But that predictably goes astray after (presumably) 601 characters. Not sure if the remaining issue is mine or not.

Oh well, enough for today.

Progress on the Chaocipher…

Thursday, July 8th, 2010

My brain has got a bug now. It’s called Chaocipher.

Despite the fact that I’m spending my days off with my family, I find that in my odd moments my brain keeps leaping back to Byrne’s cipher. The other night I implemented the basics of key recovery using a chosen plaintext attack (if you have both plain and cipher text, recover the key). It proceeds according to the following idea: initialize a Chaocipher machine with all ‘?’ in every entry. Then, for every cipher/plaintext pair, check to see if the cipher or the plain text letter has already been placed in the wheel. If it has, then they better be in the same slot in each of the respective wheels. If they aren’t, then the placement is bad, and you need to backtrack. If only one has been placed opposite a ‘?’, then you fill in the ‘?’ and proceed to the next pair. If neither has been placed, then you have to try all places where two ‘?’ are opposite one another. If you manage to exhaust the input, then you probably have the right wheel, so I then run the machine in reverse back to the beginning to recover the initial key settings.

I’ve had some success on this with some of my own test cases, but have not managed to crack Exhibit 1 or Exhibit 2 on the Chaocipher Clearing House website. I’m not sure if this is due to problems in my own implementation or a more general problem. I haven’t worked on trying to generalize this into a cipher-text only cracking program, but I suspect that it’s possible. The techniques that I remember from cracking the Enigma don’t really apply here, since the permutations are dependent on the plaintext. I have a couple of ideas that I’ll be pondering during the three plus hours of driving that I am going to do today.

I left the solver working on Exhibit 2. It’s searched about a billion keys in the five minutes or so I’ve had it running: we’ll see if it can solve the cipher by the time I get back tonight.

Addendum: it searched about 20 billion nodes, before returning a partial key that properly recovers the first 51 characters of the plaintext. Not sure if there is a problem in my implementation or in the transcript of Exhibit 2. I’ll think about it more over the next few days.

Visual Inspection of Chaocipher Output Implies Weakness

Tuesday, July 6th, 2010

So, first thing this morning, before I had even had coffee or blinked the sleep from my eyes, I decided to try a chosen plaintext attack against Chaocipher. I created a file consisting entirely of 2000 A’s, and passed it through Chaocipher.
Here is my output:

PKLSD MAVZC UXHEP KLSDM AVZCU XHEPK LSDMA VZCUX HEPKL SDMAV
ZCUXH EPKLS DMAVZ CUXHE PKLSD MAVZC UXHEP KLSDM AVZCU XHEPK
LSDMA VZCUX HEPKL SDMAV ZCUXH EPKLS DMAVZ CUXHE PKLSD MAVZC
UXHEP KLSDM AVZCU XHEPK LSDMA VZCUX HEPKL SDMAV ZCUXH EPKLS
DMAVZ CUXHE PKLSD MAVZC UXHEP KLSDM AVZCU XHEPK LSDMA VZCUX
HEPKL SDMAV ZCUXH EPKLS DMAVZ CUXHE PKLSD MAVZC UXHEP KLSDM
AVZCU XHEPK LSDMA VZCUX HEPKL SDMAV ZCUXH EPKLS DMAVZ CUXHE
PKLSD MAVZC UXHEP KLSDM AVZCU XHEPK LSDMA VZCUX HEPKL SDMAV
ZCUXH EPKLS DMAVZ CUXHE PKLSD MAVZC UXHEP KLSDM AVZCU XHEPK
LSDMA VZCUX HEPKL SDMAV ZCUXH EPKLS DMAVZ CUXHE PKLSD MAVZC
UXHEP KLSDM AVZCU XHEPK LSDMA VZCUX HEPKL SDMAV ZCUXH EPKLS
DMAVZ CUXHE PKLSD MAVZC UXHEP KLSDM AVZCU XHEPK LSDMA VZCUX
HEPKL SDMAV ZCUXH EPKLS DMAVZ CUXHE PKLSD MAVZC UXHEP KLSDM
AVZCU XHEPK LSDMA VZCUX HEPKL SDMAV ZCUXH EPKLS DMAVZ CUXHE
PKLSD MAVZC UXHEP KLSDM AVZCU XHEPK LSDMA VZCUX HEPKL SDMAV
ZCUXH EPKLS DMAVZ CUXHE PKLSD MAVZC UXHEP KLSDM AVZCU XHEPK
LSDMA VZCUX HEPKL SDMAV ZCUXH EPKLS DMAVZ CUXHE PKLSD MAVZC
UXHEP KLSDM AVZCU XHEPK LSDMA VZCUX HEPKL SDMAV ZCUXH EPKLS
DMAVZ CUXHE PKLSD MAVZC UXHEP KLSDM AVZCU XHEPK LSDMA VZCUX
HEPKL SDMAV ZCUXH EPKLS DMAVZ CUXHE PKLSD MAVZC UXHEP KLSDM
AVZCU XHEPK LSDMA VZCUX HEPKL SDMAV ZCUXH EPKLS DMAVZ CUXHE
PKLSD MAVZC UXHEP KLSDM AVZCU XHEPK LSDMA VZCUX HEPKL SDMAV
ZCUXH EPKLS DMAVZ CUXHE PKLSD MAVZC UXHEP KLSDM AVZCU XHEPK
LSDMA VZCUX HEPKL SDMAV ZCUXH EPKLS DMAVZ CUXHE PKLSD MAVZC
UXHEP KLSDM AVZCU XHEPK LSDMA VZCUX HEPKL SDMAV ZCUXH EPKLS
DMAVZ CUXHE PKLSD MAVZC UXHEP KLSDM AVZCU XHEPK LSDMA VZCUX
HEPKL SDMAV ZCUXH EPKLS DMAVZ CUXHE PKLSD MAVZC UXHEP KLSDM
AVZCU XHEPK LSDMA VZCUX HEPKL SDMAV ZCUXH EPKLS DMAVZ CUXHE
PKLSD MAVZC UXHEP KLSDM AVZCU XHEPK LSDMA VZCUX HEPKL SDMAV
ZCUXH EPKLS DMAVZ CUXHE PKLSD MAVZC UXHEP KLSDM AVZCU XHEPK
LSDMA VZCUX HEPKL SDMAV ZCUXH EPKLS DMAVZ CUXHE PKLSD MAVZC
UXHEP KLSDM AVZCU XHEPK LSDMA VZCUX HEPKL SDMAV ZCUXH EPKLS
DMAVZ CUXHE PKLSD MAVZC UXHEP KLSDM AVZCU XHEPK LSDMA VZCUX
HEPKL SDMAV ZCUXH EPKLS DMAVZ CUXHE PKLSD MAVZC UXHEP KLSDM
AVZCU XHEPK LSDMA VZCUX HEPKL SDMAV ZCUXH EPKLS DMAVZ CUXHE
PKLSD MAVZC UXHEP KLSDM AVZCU XHEPK LSDMA VZCUX HEPKL SDMAV
ZCUXH EPKLS DMAVZ CUXHE PKLSD MAVZC UXHEP KLSDM AVZCU XHEPK
LSDMA VZCUX HEPKL SDMAV ZCUXH EPKLS DMAVZ CUXHE PKLSD MAVZC
UXHEP KLSDM AVZCU XHEPK LSDMA VZCUX HEPKL SDMAV ZCUXH EPKLS
DMAVZ CUXHE PKLSD MAVZC UXHEP KLSDM AVZCU XHEPK LSDMA VZCUX

If you stare at it a second, you see that it’s periodic with a very short period (13 14 (thanks Moshe, for pointing this out, I was doing this before coffee) characters). That can’t be good, and is in fact much weaker than Enigma, whose periodicity is at least around 26^3 (a bit less, because of the odd way in which the rotors increment), and even longer with the Stecker in place that isn’t modified by the Stecker, which is entirely a static table (don’t know what I was thinking when I wrote that). Enigma is sensitive (and was in fact routinely cracked using the Bombe) using cribs, which are merely short versions of the chosen plaintext attack. I’ll ponder this some more.