Category Archives: Computer Science

Progress on Exhibit 1

Okay, I think I’ve figured out the problem with my code that back propagates cipher wheels to the beginning of the code, and ran it on Exhibit 1 again. It actually matches 603 characters of input, then stalls, after searching 1.8 billion cipher encoding possibilities. I am now trying to figure out if it is a typo in the original, or some slim bug that causes it to go awry.

Because others might be pursuing their own solution of this issue, I’ll hold off on publishing the key that I’ve recovered for a bit longer. If anyone needs it, let me know, and I’ll send you the spoiler.

Addendum: There is a typo in the version of the cipher text that I downloaded from the Chaoscipher clearing house. I’ll write up more when I’ve learned more.

Addendum2: More than one transcription error. More later.

More on Chaocipher…

Well, I didn’t have much time left to work on Chaocipher last night, so I left it running on Exhibit 1. It claimed to recover a key that allowed it to match 601 characters of input, but found no better match. But it also uncovered an error in my code. It doesn’t appear that the code that I wrote to rewind the machine to find the initial settings are quite right. For instance, if I use the cipher wheel and plain wheel both set to ABCDEFGHIJKLMNOPQRSTUVWXYZ, and then encode a test message (like the first paragraph of Moshe’s announcement of the publication of the algorithm of the Chaocipher) then I get the following cipher text:

NHLBP ULNFR XRNDL IZRDO IWZSY HQCUQ XFSDE QGMPG RFUIR FQETN
ZTTXI YQJJN FHFYN WWQTO SXEKX DKCXP OKOAH FPCKI XQFYF UQVMA
JFKSN LGOTY KTNQJ ZNFXV ULYNO YRZQY ORVJV RUWUH HMAZA LVSIS
IGLJB QAZPK NJZXO YZHPK EMBCS XGKGY YTUET VSJWK WPXKC OCYRH
CNGVU RUPYZ AAFQE 

which decodes to:

NINET YTWOY EARSA FTERI TSINV ENTIO NFIFT YSEVE NYEAR SAFTE
RCHAL LENGE MESSA GESWE REPUB LISHE DANDA FTERM ANYCR YPTAN
ALYTI CRESE ARCHE RSUNS UCCES SFULL YTRIE DTOSO LVETH ESECH
ALLEN GEMES SAGES JOHNF BYRNE SCHAO CIPHE RALGO RITHM CANFI
NALLY BEREV EALED 

When I run my solver though, it recovers:

found a solution that exhausted the input.
NPQRSTUVWXYZAOBCDEFGHIJKLM OPRSTUVW?Y?AB?CDEFGHIJ?LMN

Which is close, but no cigar. I’m not awake enough to spot my mistake yet, and it seems unlikely that I’ll get a chance to fix my error for the next few days, but I’ll try to get an hour to track down my mistake. I do think I’m 90% of the way there to recovering the key for Exhibit 1. I’m pretty confident that the solver gets the right answer, but that it is rewinding very slightly incorrectly.

Addendum: Ah, it appears I’m not quite rewinding the very last character properly. The key that I recovered is actually good for decoding, as long as you start from the second character. I’ve written this in an overly clever way. If I use my recovered key from Exhibit 1, but start at the second letter, I get something pretty reasonable:

LLGOO DQQUI CKBRO WNFOX ESJUM POVER LAZYD OGTOS AVETH EIRPA
RTYWA LLGOO DQQUI CKBRO WNFOX ESJUM POVER LAZYD OGTOS AVETH
EIRPA RTYWA LLGOO DQQUI CKBRO WNFOX ESJUM POVER LAZYD OGTOS
AVETH EIRPA RTYWA LLGOO DQQUI CKBRO WNFOX ESJUM POVER LAZYD
OGTOS AVETH EIRPA RTYWA LLGOO DQQUI CKBRO WNFOX ESJUM POVER
LAZYD OGTOS AVETH EIRPA RTYWA LLGOO DQQUI CKBRO WNFOX ESJUM
POVER LAZYD OGTOS AVETH EIRPA RTYWA LLGOO DQQUI CKBRO WNFOX
ESJUM POVER LAZYD OGTOS AVETH EIRPA RTYWA LLGOO DQQUI CKBRO
WNFOX ESJUM POVER LAZYD OGTOS AVETH EIRPA RTYWA LLGOO DQQUI
CKBRO WNFOX ESJUM POVER LAZYD OGTOS AVETH EIRPA RTYWA LLGOO
DQQUI CKBRO WNFOX ESJUM POVER LAZYD OGTOS AVETH EIRPA RTYWA
LLGOO DQQUI CKBRO WNFOX ESJUM POVER LAZYD OGTOS AVETH EIRPA
RTGWA DLGOO SLLIV FCRBY MHAWQ GNMSN DNCKG ZYPML AYNZL FNDOW
GBJUI XZLDT NOJMT USZGI KMLCW NCKKO UICAY KSJJS SKFRG LLPUX
AOXSB XAPQE CJSUS FLFTZ LNQSS WNZNT KSDGJ UUQCG QCDDW PDINF
ZODLM JEXVW IFFWT EFPSZ KUATX VTKLF ZLHMV ICVDP MEZPY CULXV
AXNEA YRSLB BMITU JGIIT SPIQU MUMCF WFPOS YYAWE YCKYZ QUVJB
OIIUF BHXTQ FHKSE MWDBT SNJYG NIKJQ HJRRD CGCXM INUQZ GITGH
TINWU ZWFXH JEWMQ DCXNB XKGYQ DDNGI AXNYR CRAMK COAIV LKLFS
NVGWS QYOAO YUUWY OQACS UDFFK YXPJG TBPSZ TONZC ZPDMM OORNE
VZZVA HBUPV OCUJA HCGZS LVUYI LIDGY AOOXT YDDAL JDLOU LRNQA
SEZWB XPSVL EMYZT HMVAB HCIIY VHIUJ QBAUR GZEIA RUILN ZEDYB

But that predictably goes astray after (presumably) 601 characters. Not sure if the remaining issue is mine or not.

Oh well, enough for today.

Progress on the Chaocipher…

My brain has got a bug now. It’s called Chaocipher.

Despite the fact that I’m spending my days off with my family, I find that in my odd moments my brain keeps leaping back to Byrne’s cipher. The other night I implemented the basics of key recovery using a chosen plaintext attack (if you have both plain and cipher text, recover the key). It proceeds according to the following idea: initialize a Chaocipher machine with all ‘?’ in every entry. Then, for every cipher/plaintext pair, check to see if the cipher or the plain text letter has already been placed in the wheel. If it has, then they better be in the same slot in each of the respective wheels. If they aren’t, then the placement is bad, and you need to backtrack. If only one has been placed opposite a ‘?’, then you fill in the ‘?’ and proceed to the next pair. If neither has been placed, then you have to try all places where two ‘?’ are opposite one another. If you manage to exhaust the input, then you probably have the right wheel, so I then run the machine in reverse back to the beginning to recover the initial key settings.

I’ve had some success on this with some of my own test cases, but have not managed to crack Exhibit 1 or Exhibit 2 on the Chaocipher Clearing House website. I’m not sure if this is due to problems in my own implementation or a more general problem. I haven’t worked on trying to generalize this into a cipher-text only cracking program, but I suspect that it’s possible. The techniques that I remember from cracking the Enigma don’t really apply here, since the permutations are dependent on the plaintext. I have a couple of ideas that I’ll be pondering during the three plus hours of driving that I am going to do today.

I left the solver working on Exhibit 2. It’s searched about a billion keys in the five minutes or so I’ve had it running: we’ll see if it can solve the cipher by the time I get back tonight.

Addendum: it searched about 20 billion nodes, before returning a partial key that properly recovers the first 51 characters of the plaintext. Not sure if there is a problem in my implementation or in the transcript of Exhibit 2. I’ll think about it more over the next few days.

Visual Inspection of Chaocipher Output Implies Weakness

So, first thing this morning, before I had even had coffee or blinked the sleep from my eyes, I decided to try a chosen plaintext attack against Chaocipher. I created a file consisting entirely of 2000 A’s, and passed it through Chaocipher.
Here is my output:

PKLSD MAVZC UXHEP KLSDM AVZCU XHEPK LSDMA VZCUX HEPKL SDMAV
ZCUXH EPKLS DMAVZ CUXHE PKLSD MAVZC UXHEP KLSDM AVZCU XHEPK
LSDMA VZCUX HEPKL SDMAV ZCUXH EPKLS DMAVZ CUXHE PKLSD MAVZC
UXHEP KLSDM AVZCU XHEPK LSDMA VZCUX HEPKL SDMAV ZCUXH EPKLS
DMAVZ CUXHE PKLSD MAVZC UXHEP KLSDM AVZCU XHEPK LSDMA VZCUX
HEPKL SDMAV ZCUXH EPKLS DMAVZ CUXHE PKLSD MAVZC UXHEP KLSDM
AVZCU XHEPK LSDMA VZCUX HEPKL SDMAV ZCUXH EPKLS DMAVZ CUXHE
PKLSD MAVZC UXHEP KLSDM AVZCU XHEPK LSDMA VZCUX HEPKL SDMAV
ZCUXH EPKLS DMAVZ CUXHE PKLSD MAVZC UXHEP KLSDM AVZCU XHEPK
LSDMA VZCUX HEPKL SDMAV ZCUXH EPKLS DMAVZ CUXHE PKLSD MAVZC
UXHEP KLSDM AVZCU XHEPK LSDMA VZCUX HEPKL SDMAV ZCUXH EPKLS
DMAVZ CUXHE PKLSD MAVZC UXHEP KLSDM AVZCU XHEPK LSDMA VZCUX
HEPKL SDMAV ZCUXH EPKLS DMAVZ CUXHE PKLSD MAVZC UXHEP KLSDM
AVZCU XHEPK LSDMA VZCUX HEPKL SDMAV ZCUXH EPKLS DMAVZ CUXHE
PKLSD MAVZC UXHEP KLSDM AVZCU XHEPK LSDMA VZCUX HEPKL SDMAV
ZCUXH EPKLS DMAVZ CUXHE PKLSD MAVZC UXHEP KLSDM AVZCU XHEPK
LSDMA VZCUX HEPKL SDMAV ZCUXH EPKLS DMAVZ CUXHE PKLSD MAVZC
UXHEP KLSDM AVZCU XHEPK LSDMA VZCUX HEPKL SDMAV ZCUXH EPKLS
DMAVZ CUXHE PKLSD MAVZC UXHEP KLSDM AVZCU XHEPK LSDMA VZCUX
HEPKL SDMAV ZCUXH EPKLS DMAVZ CUXHE PKLSD MAVZC UXHEP KLSDM
AVZCU XHEPK LSDMA VZCUX HEPKL SDMAV ZCUXH EPKLS DMAVZ CUXHE
PKLSD MAVZC UXHEP KLSDM AVZCU XHEPK LSDMA VZCUX HEPKL SDMAV
ZCUXH EPKLS DMAVZ CUXHE PKLSD MAVZC UXHEP KLSDM AVZCU XHEPK
LSDMA VZCUX HEPKL SDMAV ZCUXH EPKLS DMAVZ CUXHE PKLSD MAVZC
UXHEP KLSDM AVZCU XHEPK LSDMA VZCUX HEPKL SDMAV ZCUXH EPKLS
DMAVZ CUXHE PKLSD MAVZC UXHEP KLSDM AVZCU XHEPK LSDMA VZCUX
HEPKL SDMAV ZCUXH EPKLS DMAVZ CUXHE PKLSD MAVZC UXHEP KLSDM
AVZCU XHEPK LSDMA VZCUX HEPKL SDMAV ZCUXH EPKLS DMAVZ CUXHE
PKLSD MAVZC UXHEP KLSDM AVZCU XHEPK LSDMA VZCUX HEPKL SDMAV
ZCUXH EPKLS DMAVZ CUXHE PKLSD MAVZC UXHEP KLSDM AVZCU XHEPK
LSDMA VZCUX HEPKL SDMAV ZCUXH EPKLS DMAVZ CUXHE PKLSD MAVZC
UXHEP KLSDM AVZCU XHEPK LSDMA VZCUX HEPKL SDMAV ZCUXH EPKLS
DMAVZ CUXHE PKLSD MAVZC UXHEP KLSDM AVZCU XHEPK LSDMA VZCUX
HEPKL SDMAV ZCUXH EPKLS DMAVZ CUXHE PKLSD MAVZC UXHEP KLSDM
AVZCU XHEPK LSDMA VZCUX HEPKL SDMAV ZCUXH EPKLS DMAVZ CUXHE
PKLSD MAVZC UXHEP KLSDM AVZCU XHEPK LSDMA VZCUX HEPKL SDMAV
ZCUXH EPKLS DMAVZ CUXHE PKLSD MAVZC UXHEP KLSDM AVZCU XHEPK
LSDMA VZCUX HEPKL SDMAV ZCUXH EPKLS DMAVZ CUXHE PKLSD MAVZC
UXHEP KLSDM AVZCU XHEPK LSDMA VZCUX HEPKL SDMAV ZCUXH EPKLS
DMAVZ CUXHE PKLSD MAVZC UXHEP KLSDM AVZCU XHEPK LSDMA VZCUX

If you stare at it a second, you see that it’s periodic with a very short period (13 14 (thanks Moshe, for pointing this out, I was doing this before coffee) characters). That can’t be good, and is in fact much weaker than Enigma, whose periodicity is at least around 26^3 (a bit less, because of the odd way in which the rotors increment), and even longer with the Stecker in place that isn’t modified by the Stecker, which is entirely a static table (don’t know what I was thinking when I wrote that). Enigma is sensitive (and was in fact routinely cracked using the Bombe) using cribs, which are merely short versions of the chosen plaintext attack. I’ll ponder this some more.

An Implementation of Byrnes’ Chaocipher

Okay, insomnia got me, so I went ahead and implemented it in Python. It appears to work reasonably well, at least it successfully deciphers their test message. You can specify the key by specifying the -c and -p options, which are the settings for the cipher and plain wheels. You should pass a permutation of the uppercase letters as arguments to those if you want to use other than the default key, which matches Rubin’s example.

#!/usr/bin/env python
#
#       _                      _       _               
#   ___| |__   __ _  ___   ___(_)_ __ | |__   ___ _ __ 
#  / __| '_ \ / _` |/ _ \ / __| | '_ \| '_ \ / _ \ '__|
# | (__| | | | (_| | (_) | (__| | |_) | | | |  __/ |   
#  \___|_| |_|\__,_|\___/ \___|_| .__/|_| |_|\___|_|   
#                               |_|                    
# An implementation of John Byrnes' Chaocipher as described in papers
# by Moshe Rubin.
# 
# Written by Mark VandeWettering <mvandewettering@gmail.com>
# No rights are reserved.  No warranties are implied.
#

import sys
import random
import string
import optparse

p = optparse.OptionParser()
p.add_option("-d", "--decrypt", action="store_true", dest="decrypt",
	default=False, help="decrypt instead of encrypt")
p.add_option("-p", "--pwheel", dest="pw",
	default="PTLNBQDEOYSFAVZKGJRIHWXUMC",
	help="plaintext wheel setting")
p.add_option("-c", "--cwheel", dest="cw",
	default="HXUCZVAMDSLKPEFJRIGTWOBNYQ",
	help="cipher wheel setting")

opts, args = p.parse_args()

# initialize the code machine...

cnt = 0 

def output(c):
    global cnt
    sys.stdout.write(c)
    cnt = cnt + 1 
    if cnt % 50 == 0:
	sys.stdout.write('\n')
	cnt = 0
    elif cnt % 5 == 0:
	sys.stdout.write(' ')

	
class Machine:
    def __init__(self, cw, pw):
	self.cw = cw 		# cipher wheel
 	self.pw = pw 		# plaintext wheel
	pass
    def twizzle(self, idx):
	self.cw = self.cw[idx:] + self.cw[0:idx]
	self.cw = list(self.cw[0]) + \
		  self.cw[2:14] + \
		  list(self.cw[1]) + \
		  self.cw[14:]
	# and the plaintext wheel
	self.pw = self.pw[idx:] + self.pw[0:idx]
	self.pw = self.pw[1:] + list(self.pw[0])
	self.pw = self.pw[0:2] + \
		  self.pw[3:14] + \
		  list(self.pw[2]) + \
		  self.pw[14:]
    def encrypt(self, d):
	# find where it is in the plain text wheel...
        idx = self.pw.index(d)
	r = self.cw[idx]
	self.twizzle(idx)
	return r
    def decrypt(self, d):
	idx = self.cw.index(d)
	r = self.pw[idx]
	self.twizzle(idx)
	return r 

random.seed(0)
machine = Machine(list(opts.cw), list(opts.pw))

for arg in args:
    try:
	data = open(arg).read()
    except IOError, msg:
	print >> sys.stderr, "%s: %s" % (arg, msg)
	print >> sys.stderr, "continuing..."
    data = data.upper()
    # filter out all the non alpha characters...
    data = filter(lambda x : x in string.uppercase, data)
    if opts.decrypt:
	for d in data:
	    output(machine.decrypt(d))
    else:
	for d in data:
	    output(machine.encrypt(d))
    print

Addendum: Here is some cipher text that you can decode with the above program (or your own implementation):

TLMAG OONSK JBJYB QVGDQ CDUNW NMZPL OYCWP CWKWQ RBOYA DSLQB
KYCDG XJOLO NKTTL RUZZJ QGJBQ NRQHQ RREUI YIDHZ OMVWZ MVYUF
QOGSN NUVYT JGQPS QTBRW FHLTC LVVBP MYYQV 

The Chaocipher revealed! from Cipher Mysteries

Stumbling back through articles in Slashdot, I found a pretty nifty article on one of my favorite subjects: historical cryptography. The story goes that back in 1918, a cipher system/machine was invented by John F. Byrne. Rumor says that it was very strong, and yet could be implemented using a mechanism that would fit in a cigar box. The details of this invention were never publicly released. However, recently the widow of Bryne’s son, John Byrnes Jr., has decided to donate his notes to the National Cryptological Museum, and the first publications are beginning to trickle out. Moshe Rubin has a new paper that details the working of the algorithm in sufficient detail that it should be possible to write an implementation in whatever language you like for experimentation. It’s too late for me to start today, but expect a Python reference implementation in the next few days:

The Chaocipher revealed! | Cipher Mysteries.

A cursory glance over the implementation suggests that the key space is basically 26! * 26! or about:

162,644,002,617,632,464,507,038,883,409,628,607,021,056,000,000,000,000

By comparison, the German Army Enigma (three rotors) had a keyspace of only 562,064,881,159,999,426,560, and the Navy Enigma a keyspace which was only 1000x larger. So if all things were equal, we might expect that the Chaocipher was a lot harder to crack. But all things are probably not equal. I’ll be pondering this over the next few days.

Revisiting TinyP2P

As I was driving in this morning, I entertained a train of thought that led me back to thinking about peer to peer networks. I recalled that I had seen a posting by Ed Felton some years ago about implementing a peer-to-peer networking scheme in a very few lines of Python. A few minutes of searching once I reached my desk reminded me that the name of this was TinyPTP, and that it was 15 lines of dense Python. A search of Ed’s Freedom To Tinker Website revealed nothing but dead links, but a few minutes with the Internet Wayback Machine resurrected it for our consumption:

TinyP2P

I recall that I tried to run this before, and had some difficulty, but haven’t tried again. Still, the basic idea is pretty interesting: you create a network of XMLRPC servers, whose basic functionality is to ensure that all network nodes have copies of all files. It uses the SimpleXMLRPCServer classes and some overly terse idioms to accomplish its task.

Here’s the thinking that leads me to here: I’ve been listening to a lot of stuff about the scalability of Facebook lately. Ironically, the scalability doesn’t contribute to user experience directly: no user uses very much of any of the Facebook resources. Facebook’s 60,000+ servers serve something like 200M+ users every day, so obviously each user is getting a very tiny fraction of any single server machine, a fraction so small that it could be easily subsumed by even the most tiny computing appliance. It is only the centralization which causes the need for scalability.

And, of course, a distributed architecture has the possibility of fixing a few of Facebook’s other ills, such as allowing more direct control over privacy.

So, here’s the idea: use ideas similar to the TinyPTP client to implement a distributed social network like Facebook. Implement it in a simple language like Python, and in a straightforward way (don’t be overly terse like TinyPTP). Pay attention to security, but also make it simple enough so that people can adopt it by downloading a single Python file and running it in a directory of their choosing on their machine at home.

It’s just a thought experiment at the moment. We’ll see if it gains traction in my head.

Multitasking With iOS 4 is Horrible: Apple Blew It – PCWorld

Jared Newman of PC World thinks that Apple blew it with respect to the multitasking in the iPhone:

Multitasking With iOS 4 is Horrible: Apple Blew It – PCWorld

Here’s the funny thing though: all these industry pundits keep ranting about how features of the iPhone or iPad are really terrible, and yet when polled, people who buy iPhones are overwhelmingly satisfied with their phone choice. Perhaps pundits should reconsider that they don’t understand the phone market as well as Apple does.

Newman states (correctly) that multitasking is something that has to be explicitly coded into applications. And here’s the odd thing: it would be better for the phone as a whole if most developers simply chose not to.

It’s all about battery life. If your program really doesn’t have anything to do when you aren’t staring at the screen, the iPhone will nicely not let it do anything when you shift to another process. It basically shuts down your application, perhaps storing some information so that when you re-run the program, you can go right back to where you left. Many apps are kind of lazy, and just restart the program (often displaying time wasting banners) each time, which is one of the reasons that people claim they want multitasking. For 90% of these cases, just writing the applications better would be sufficient.

But I’ll be the first to admit that there are processes which don’t fit this model, and actually do require some CPU time while the user isn’t staring at them. For instance, Google Latitude isn’t really useful on the iPhone, because it was only a web application, and therefore was really only running when you were running Safari. This kept you from informing anyone of your position if you pretty much were doing anything else. You’d like to keep your music playing. Perhaps you’d like some long downloads to complete. All good stuff.

With IOS4, Apple has given you the capability to make that happen. And yes, as a programmer you have to be explicit about that. It’s a teensy bit tedious, but frankly if you are going to make an effective app, you should really be on top of the resources that your application needs. In particular, you shouldn’t be viewing your application as necessarily so important that cutting your clients battery life unnecessarily is a virtue.

Just a quick aside about the iPhone Task Manager: in most cases, you don’t need to access it at all. Hitting the button in the task manager is just like hitting its icon on the main screen. If it is more convenient to find the application on any of your home screens, just do it that way. You will need to access it only to delete processes, and properly coded apps don’t really need to be explicitly deleted anyway: if the iPhone needs to delete an application because it runs out of memory, it will do so in a well defined way that allows you to restart the application in a consistent state.

Is iPhone multitasking amazing? No, not really. But neither is it horrible. It’s a reasonable set of design choices to ensure good user experience.

Simple, reliable 2.5D photography

I’ve been interested in techniques where amateurs can digitize images and models for quite a bit. This website percolated to the top during today’s relaxing web browsing: it’s pretty spiffy, and is interesting on a couple of fronts, not the least of which is that the author designed the gearbox for tracking a laser using a CAD program which drove a CNC mill so that the parts could be cast. Very slick.

Simple, reliable 2.5D photography.

Q: Are there any good, open user interface options?

Yesterday I was in our Atrium, and Craig had his iPad with him. I got into a discussion with him and Loren about why I thought the device was very cool. (I also told them why I had been actively discouraged from becoming an iPhone developer earlier, but that’s a story for a different time.) But I brought up a point which I’ve been kind of mulling over ever since: the iPad (and iPhone) may be the first not-completely-user-hostile-interface ever delivered in a consumer electronic device.

Consider your TV set and its remote as a counter example. On virtually every television ever made, there are two ways to select what channel you wish to watch: by going up and down, or by entering a channel number. There is no more reason to select channels this way than there would be for you to access your online photos by number, or by paging through them one at a time. And yet, for decades, no TV manufacturer has sought to upgrade this very simple, basic interaction that you have with a television. Is there really any reason at all for you to use channel numbers for anything? They add ten buttons to your television remote. Wouldn’t it be nice to remove that space, and use it to make the buttons you actually do use larger, and differentiated so you could actually use them in the dark? And how about those buttons to scroll up and down lists? Wouldn’t you just like to point at what you want? The Nintendo Wii uses a nice IR sensor system to select items: imagine of that technology were merged with your remote, along with accelerometers and the like. If Nintendo can profitably make these as options in a $250 game system, you’d think that that kind of technology could be used in other consumer electronics.

And don’t even get me started about devices like Blu Ray players. Why do I need a different remote? And a completely different system of conventions and menu selections. Bleh.

Okay, I’m getting a bit astray. The iPad has already demoed some really, really nifty applications with interesting, intuitive interfaces. But while I have defended the iPhone/iPad as a consumer device, I am not really all the enthusiastic about doing my own programming and experimentation with such a device. So, I thought I might throw out this question: imagine that you were going to prototype improved interfaces for media devices. Are there any open source options that are worth considering, or are they all terrible? I’m considering a platform like the Acer Revo, hooked to a large screen TV using HDMI, and possibly some wireless bluetooth devices (or maybe just the Wii remote). Anyone have any experience/success with this kind of interactive UI programming?

A small experiment with a C compiler

I’m still somewhat baffled by the performance of my checkers program. I keep pondering that perhaps it would be a good idea to take all the lessons I’ve learned, burn it down, and try again with Milhouse2. After all, in the words of Tom Duff (who may not have said this, but should have) “any program worth writing is worth rewriting”. Toward that end, I was reconsidering the data representation that I used, and revisited this nifty page on Bitboards:

Checker Bitboard Tutorial.

I basically use bitboards already, but at the very end of the page, you’ll see mention of a novel if somewhat unobvious representation credited to Thotsaphon Thanatipanonda, which simplifies move and jump generation. My own code for this was laboriously constructed using a python program, so it would be nice if we could shrink that code a bit and make it more straightforward. One key that Ed mentions is that it requires machine language rotate instructions, since C does not have a rotate instruction. I thought I’d try to see what gcc does with the obvious implementation. So, I quickly wrote the following function:

inline uint32_t
rol7(uint32_t val)
{
    return (val << 7) | (val >> 25) ;
}

(This function uses uint32_t to make sure we have 32 bit values. You can find the defines in stdint.h.) As perhaps we should expect, gcc is smart enough to turn this into a single machine language rorl. This means that we can write a move generator without any difference in handling the even or odd rows, which currently doubles at least the size of my own move generator. Pretty nifty.

Classic Board Games for Computer Implementation?

So, my experiments with my checkers program Milhouse have been fun and interesting. There is still work to be done: I don’t think a machine that can’t properly play first position can be reasonably said to be a good checkers player (even though I still make errors while practicing it myself against Cake), but I’ve learned a lot. I’ve been toying with the possibility of doing a rewrite of Milhouse, but maybe that isn’t the best use of my precious after-hours time. For the last couple of days, I’ve been musing about the possibility of finding other strategic board games that might be fun to try. Here is a rough list of my internal criteria:

  1. It should be a game that people play. It would be great if there was some existing scholarship on human play, including test positions.
  2. It should be a game which is uncopyrighted and free from legal restrictions for use. Hence my use of the term “classic”.
  3. It should be reasonably hard. Trivial games like Tic Tac Toe need not apply.

There are some additional possibilities..

  1. I’m willing to consider games with a chance component (such as dice in backgammon) or a gambling component (such as the doubling cube in the same).
  2. I prefer games which are more abstract, whose difficulties arise from position and interaction more than complicated rules.

So, what games are out there?

Well, obviously there is chess and its variants, including Shoji, Xiangqi and a slew of modern variants. Originally I had a goal of implementing chess, but I must admit that it doesn’t appeal to me quite as much now largely because I view it as less abstract than games like checkers. The difficulty of First Position arises not because of complex interactions between different piece types, but because of the interplay of very simple elements. Still, I have considered the possibility of taking one of the variants of chess that are played on small boards perhaps a 5×6 board and developing opening and endgame databases for these. I haven’t done the math on what it would take to resolve the game theoretic value of one of these small games, but it might be tractable with the computing power that I have at my disposal.

There are obviously variants to checkers, including international draughts and the misere variants, also known as suicide checkers. The idea is that the person who loses all their pieces first wins. Martin Fierz has a version of Cake called Suicidal Cake which can play suicide checkers, which is an interesting variant with many of its own subtle problems. International draughts is a much more complicated game, with a larger board and some really interesting rules for jumping. There are other variants as well (Polish, Turkish, Italian) which could be considered as strong candidates.

Othello or Reversi are interesting games of position, and which can use many of the techniques we see in checkers and chess (namely opening books and endgame databases). There is also a fair amount of human knowledge about the game, which is good.

Backgammon introduces chance into the mix, which generally means that you need to use a different set of techniques than we’ve seen so far. I also like to play backgammon (although am only a middling level player). Backgammon has been the subject of quite a bit of machine learning research, and there exists many implementations to spar against.

Hex is a very nearly perfect abstract game: players alternate turns to connect their side of the board to the opposite side. There are no draws possible. The entire games can be displayed as a single graphic, which shows the order in which stones were placed on the board. The high branching factor makes traditional alpha-beta search less attractive.

Go has many of the same characteristics, with the added bonus of having lots of good human players. There has been a lot of interesting research recently about playing Go.

Arimaa was designed to be hard for computers to play, but frankly, I think that it does so only by making a rule system which is complicated. It seems between the push, pull, freeze and trap rules, there are at least two and maybe three that could have been eliminated. I also suspect that while it is (according to the authors) “easy to learn”, I suspect that really only means “easy to learn the rules”. I would make the (unsupported) claim that most human play is quite weak as well.

Games like Awari or other games of the mancala family have been studied quite extensively. Connect 4 has been solved, but is still middling interesting.

Any other board games I’m missing? I await your suggestions and/or advocacy.

A Shorter Quine in Python

Quines are programs that when run produce themselves as output. Previously I had written a fairly longish one in Python. I was reading Russ Cox’s latest blog article (more skimming than reading actually, it’s neatly subtle) but it said something obvious that I hadn’t clicked on before: the repr function takes a value, and converts it to a string value that is its representation. Most of the difficulty in making a short quine is in picking a representation which is compact. Since repr is a builtin function, that makes it very compact. A few minutes of tinkering resulted in this two line quine:

#!/usr/bin/env python
s = '#!/usr/bin/env python\ns = %s ; print s %% repr(s)' ; print s % repr(s)

It could actually be made shorter and fit on one line, but I think most good programs should specify the interpreter they use. That necessitates a second line.

Russ goes on to do something truly remarkable: a zip or tar.gz file which when unzipped produces a new copy of itself. I am gonna ponder that a bit more and maybe work on my own version.

Rob Pike on Quantum Computing

Rob Pike is a Google researcher, and has written a bunch of stuff in the past that I’ve enjoyed reading.  He’s also a telescope maker, which scores big points with me.   I recently discovered that he also write an introduction to quantum computing that I had not previously read.   This is an area of computer research that I quite franly know othing about, so I’m bookmarking it for my later consmption.

Gruenberger’s prime path

Here’s an interesting little mathematical morsel from the pages of the bit-player blog having to do with two topics I’ve found interesting in the past: prime numbers and random walks.

Let’s consider the sequence of prime numbers > 3. All such primes are congruent to -1 or to 1 modulo 6. So, let’s use that as the “random” variable controlling a random walk. If you consider all odd integers, you can generate a walk as follows. Take a step in the current direction. If the number is composite, you are finished. Otherwise it is a prime. If it is congruent to -1 modulo six, then turn left, otherwise turn right. You end up with a “random” walk, with several interesting questions about whether it shares properties with real “random” walks. Check the blog entry for more discussion:

bit-player » Gruenberger’s prime path.

I can visualize an interesting program written to draw these which is a modification to the segmented sieve algorithm that I’ve coded up previously. Each “segment” generates a segment of the overall path, and as long as you know the coordinates of the starting position, you can overlay and merge these points with reasonable efficiency. I might have to give that a go some time.

Addendum: In searching for more material by Gruenberger, I discovered that he was an early proponent of the educational uses of computing, and that some of his papers are available for download from the RAND corporation.