Milhouse still doesn’t know the first thing about FIrst Position…

White to move and win, position analyzed by Payne in 1756 in the first English book on checkers...
White to move and win, position analyzed by Payne in 1756 in the first English book on checkers…

Until Milhouse can play this position out, it really can’t be considered a real checkers program. Right now, even with an endgame database, it’s still clueless about how to proceed.

Addendum: Reinfeld’s How to Win at Checkers gives some details on winning positions like this. I recall reading it (my puzzle database includes puzzles from this book) but I’m unsure as to how to effectively merge the knowledge into my evaluation function. Still, worth looking at.

Are improvements in computer chess due mostly to hardware or software?

file000183685005My recent revival in interest in computer chess/checkers/gameplaying was in part spawned by the impression (not particularly support by evidence at the time) that the dramatic increase in computer chess strength must have come from more than just basic hardware improvements. It seemed obvious to me that some fraction of the increase in the play was due to new cleverness and discoveries by chess programmers. But I had no real datapoints to figure out what that split might have been.

Until I found this 2010 post by Bob Hyatt. Bob has been an expert programmer in computer chess for decades, first for Cray Blitz and later for Crafty, an awesome open source Chess program. It’s source code is amazingly interesting, and has tons of features which are pretty cool.

In the forum post, Bob compared Crafty 23.4 (which was at the time the highest ranked version he had produced) with Crafty 10.18, the version which was available 15 years earlier from 1995. Running on the same hardware, you find that Crafty 23.4 was 360 ELO points higher than older version.

But how much would we expect to gain from the factor of 1000x increase in speed between 1995 and 2006? I remember reading that each doubling of CPU speed would be worth about 100 ELO points. Bob did some experiments that suggests for Crafty, that result might be something more like 80 ELO points. That means that from hardware improvements alone, you might expect to see an increase of 800 ELO points.

This would seem to imply that only 1/3 of the improvement of Crafty was due to software, with the remaining 2/3 of the improvement due to increases in hardware speed. Bob doesn’t believe the 360/800 numbers are accurate (they are likely too broad, 1995 Crafty was probably not 1100 points weaker than 2010 Crafty) but that the ratio of the two is likely to stand up.

Bob did this post to respond to this rather long thread which I find much harder to draw any real conclusions from. But it’s still good reading.

But it would seem that my intuition is likely wrong: that at least for Crafty, about 2/3 of its gains are likely to increases in hardware speed, with only 1/3 coming from software improvements. Interesting.

Translating HAKMEM 175 into C…

A couple of years back, I made note of HAKMEM 175, a nifty hack by Bill Gosper that finds the next higher value that has the same number of ‘1’ bits as the input.

brainwagon » Blog Archive » HAKMEM 175.

If I bothered to convert it to C, I didn’t scribble it down, so I thought I’d do it here.

#include <stdio.h>

/*  _  _   _   _  ____  __ ___ __  __ _ ____ ___ 
 * | || | /_\ | |/ /  \/  | __|  \/  / |__  | __|
 * | __ |/ _ \| ' <| |\/| | _|| |\/| | | / /|__ \
 * |_||_/_/ \_\_|\_\_|  |_|___|_|  |_|_|/_/ |___/
 *                                               
 * A straightforward implementation of Gosper's algorithm for
 * determining the next higher value which has the same number of 1
 * bits. This is useful for enumerating subsets.
 *
 * Translated by Mark VandeWettering.
 */

unsigned int hakmem175(unsigned int A) { unsigned int B, C, D ;

    B = A ;                     /* MOVE B, A */
    C = -B ;                    /* MOVN C, B */
    C &= B ;                    /* AND C, B */
    A += C ;                    /* ADD A, C */
    D = A ;                     /* MOVE D, A */
    D ^= B ;                    /* XOR D, B */
    D >>= 2 ;                   /* LSH D, -2 */
    D /= C ;                    /* IDIVM D, C */
    A |= D ;                    /* IOR A, C typo? */
    return A ;
}


main()
{
    unsigned int x ;
    int i ;

    x = 3 ;

    do {
        printf("0x%x\n", x) ;
        x = hakmem175(x) ;
    } while (x != 0) ;
}

I was surprised to find that there is actually a bug in the published memo. The last instruction should obviously be an OR of A and D, not A and C as listed in the published memo. After discovering the error, I seeked to find mention of it onlne somewhere. This page lists the corrected PDP-10 assembly code without comment. It also suggests a couple of optimizations: you can save one register and one instruction by variable renaming, and with a bit of work, you can avoid the integer divide, which is probably a good thing on pretty much any architecture you are likely to use.

Addendum: I thought it might be fun to show an illustration of how this could be used. At one point, while writing my checkers program, I thought about enumerating all possible positions with given numbers of checkers on each side. Counting them is actually rather easy, but enumerating them is a bit trickier, but using the trick of HAKMEM175, it becomes somewhat easier.

#include <stdio.h>
#include <stdlib.h>

unsigned int
hakmem175 (unsigned int A)
{
    unsigned int B, C, D;

    B = A;                      /* MOVE B, A */
    C = -B;                     /* MOVN C, B */
    C &= B;                     /* AND C, B */
    A += C;                     /* ADD A, C */
    D = A;                      /* MOVE D, A */
    D ^= B;                     /* XOR D, B */
    D >>= 2;                    /* LSH D, -2 */
    D /= C;                     /* IDIVM D, C */
    A |= D;                     /* IOR A, C typo? */
    return A;
}

/* 
 * Here's an interesting use (okay, semi-interesting use) of the
 * above function.  Let's use it to enumerate all the potential
 * positions in checkers which have two checkers (not kings) for
 * each side.   
 */

#define WHITEILLEGAL    (0x0000000F)
#define BLACKILLEGAL    (0xF0000000)

int
main (int argc, char *argv[])
{
    unsigned int B, W;

    int nb = atoi (argv[1]);

    int nw = atoi (argv[2]);

    if (nb < 1 || nb > 12 || nw < 1 || nw > 12) {
        fprintf (stderr, "usage: checkers nb nw\n");
        fprintf (stderr, "       1 <= nb, nw <= 12\n");
        exit (-1);
    }

    B = (1 << nb) - 1;

    for (;;) {
        W = (1 << nw) - 1;
        for (;;) {
            while (W != 0 && (W & (B | WHITEILLEGAL)))
                W = hakmem175 (W);
            if (W == 0)
                break;
            /* output B, W, we could be prettier */
            printf ("%x %x\n", B, W);
            W = hakmem175 (W);
        }
        B = hakmem175 (B);
        while (B != 0 && (B & BLACKILLEGAL))
            B = hakmem175 (B);
        if (B == 0)
            break;
    }

    return 0;
}

Using this, you can easily enumerate the 125,664 positions with 2 checkers on each side, the 8,127,272 positions with 3 checkers on each side, and the 253,782,115 positions with 4 checkers on each side. By then, the code gets a little slow: it has to step over all the positions where there is a conflict in B and W placement, so it gets pretty slow. Still, it works reasonably well, and as such, might serve as part of the inner loop of a retrograde endgame analysis project.

Why I can’t play checkers very well…

White is to move. It isn’t that hard to find the move that draws, but in two minutes, I couldn’t work out the move that wins for White. All other moves are dead losses for White. Solving this puzzle requires basic visualization skills which seem beyond my current capabilities.

Milhouse of course immediately answers 27-23.

Addendum: I should have been able to sort this out. 22-18 obviously loses a checker. 22-17 is the obvious drawing move (although red has to respond with 7-11, responding with the slightly plausible looking 15-18 is a loss, but is fairly tricky to see). 24-19 loses a checker to no good effect. 32-28 bottles up white hopelessly. We are left with 27-23. That forces Red’s jump, but he’s not going to get the chance to king since white moves to threaten a double jump. To respond to that, Red gives up the checker, and we end up in a symmetric position, but White has the move, and will win.

Milhouse muses from the past…

A couple of years ago, I mused about an “easy” checkers problem which my checkers program Milhouse found to be pretty difficult. Here’s the position again, with White to move and win:

An easy hard problem, or a hard easy problem?

(I didn’t mention the source of the puzzle before, I got it out of one of Rob Pike’s books, not sure which one. It’s puzzle 104 in my collection, but listed as Rob Pike Puzzle #8). As I mentioned before, it takes a fairly deep search to find the solution (28 plies). I had mentioned before that using the MTD(f) algorithm proposed by Plaat would find the solution very quickly, but that my implementation of alpha beta struggled significantly. I thought I’d try this position again to see what headway has been made in two years. Using the normal settings… here’s the output of iterative deepening….

      +1 : [-1038] 0.02s
         : 15-10
      +3 : [-1044] 0.02s
         : 18-14 30-26 14-10
... researching after failing low (-1070)
      +5 : [-1070] 0.02s
         : 15-10 32-27 19-23 28-32 23-19
      +7 : [-1086] 0.02s
         : 18-14 32-27 15-10 27-23 19x26 30x23 10-15
      +9 : [-1106] 0.04s
         : 18-14 30-26 15-10 32-27 19-15 25-30 14-9  28-32  9-14
     +11 : [-1118] 0.13s
         : 15-10 32-27 19-15 12-16 18-14  8-11 15x8   3x12 14-9 
         : 28-32  9-5 
     +13 : [-1133] 0.53s
         : 18-14 32-27 15-10 27-23 19x26 30x23 14-9  25-30  9-5 
         : 31-27 10-6  28-32  6-1 
     +15 : [-1143] 2.18s
         : 18-14 32-27 15-10 27-23 19x26 30x23 14-9  25-30  9-5 
         : 30-26 10-14 31-27 14-10 28-32 10-15
     +17 : [-1143] 8.53s
         : 18-14 32-27 15-10 27-23 19x26 30x23 14-9  25-30  9-5 
         : 30-26 10-14 28-32
... researching after failing low (-1168)
     +19 : [-1168] 46.67s
         : 15-10 32-27 18-14 28-32 14-9  31-26  9-5   8-11  5-9 
         : 17-22 19-24  3-7  10x3  27-23  9-5  20x27
     +21 : [-1176] 143.75s
         : 15-10 30-26 18-22 32-27 22x13  8-11 10-14 27-23 13-9 
         : 23x16  9-5  31-27  5-1 

In other words, after two and a half minutes of searching, the computer still thinks white is screwed, with a score of -1176.

I wanted to go back and re-enable the MTD(f) algorithm in milhouse, but somehow through the 100 or so modifications that I’ve done since then, I’ve removed that code entirely. But I thought that I might make some small modifications to the program that would significantly enhance its ability to solve this puzzle. Milhouse normally uses a windowed search to find the best move using iterative deepening. It searches at increasing levels. Each time, it sets alpha and beta (the search window) to be centered around the current value with a search window witdth of about one quarter of a man. If the search returns inside that window, we know its precise value. If it returns less, we “fail low”, and if we really wanted to know what the value was, we’d have to re-search to find it by using different bounds (I normally use -INF to the new failed bound). Similarly, if it fails high, we might need to research.

But in our case, we don’t really care if we fail low. We know if we do that no winning solution can be found at that depth. There is no reason to try to find the exact value, we may as well just deepen the search and try again. So, I made some minor modifications to the iterative deepening procedure. I basically set the alpha cutoff to a high value (say 1000), and if we fail low, then we just continue without researching at the next depth. This is very similar to what mtd(f) does, except that it uses null-width searches. Here’s the trace:

... researching after failing low (-1038)
... researching after failing low (-935)
... researching after failing low (-926)
... researching after failing low (-818)
... researching after failing low (-722)
... researching after failing low (-715)
... researching after failing low (-616)
... researching after failing low (-602)
... researching after failing low (0)
... researching after failing low (0)
... researching after failing low (0)
... researching after failing low (0)
... researching after failing low (0)
... researching after failing low (222)
     +29 : [9973] 1.76s
         : 19-24 20x27 18-22 17x26 15-10  8-11 10-6   3-7   6-9 
         : 12-16  9-6  16-19  6-2  19-24  2-6  11-16  6-9  16-19 
         : 9-13 19-23 13-9   7-10  9-13 10-15 13-17 15-18 17-22
final score = 9973
1.757 seconds elapsed.

In other words, in less than two seconds, it finds the winning line. Very cool.

Milhouse, what are you thinking?

Okay, here’s the problem which is driving me crazy. Milhouse gets itself into this position:

Black to move, but Milhouse doesn't get the right answer...

Milhouse should play 6-10, and should respond by winning a checker on the next move by forcing the checker that doesn’t move into a trade. But for reasons which somewhat escape me, milhouse judges that 6-2, 6-9 and 6-10 are all equivalent at high search levels. At 5 ply deep, it gets the right answer: it judges the 6-10 move as valued at 268, instead of the 223 and 224 that the other two positions generate. But 10 ply, it judges 6-2 and 6-9 as 268, and 6-10 at 270. And at 15 ply, it judges them all as 269. I suspect that in the first moves, it is playing a short series of moves which gets us back to the original position, where it then thinks the forced exchange will play out. But milhouse doesn’t gain grown, and just repeats this situation when it searches again.

I do have some code in place which is supposed to deal with repeated positions, and score them as a draw, but perhaps that isn’t working properly.

Three Kings vs. Two Kings still confounds Milhouse?

A couple of years ago, I realized that Milhouse didn’t play the 3K vs. 2K engame properly. Today, in doing some testing, I realized that it still doesn’t. At one point, it misses the obvious trade in material, and therefore wanders into a draw by repetition. I suspected it is because something is awry in the evaluation function. I adjusted the balance of the evaluation function slightly: a one man lead when you are down to only three checkers is much greater than a one man lead at 5. That seems to have helped a bit. But there is still something curious about the position….

It is good to see another checker program work hard…

I was sparring a bit with both Martin Fierz’s Cake, his jCheckers and also with John Kruezer’s guiCheckers. These games illustrated a number of things. Milhouse has bugs. And it seriously lacks proper checker knowledge. Because it lacks checker knowledge, it doesn’t search deep enough. And it lacks an opening book.

Sigh. It’s hard to know where to start to improve things. So, I pulled my copy of One Jump Ahead off the shelf, and started flipping through at random. I was trying to remember some of the details of a Chinook-Tinsley match that was described in it, The game I was looking for is described on page 193, and occurred on Dec 13, 1990. Chinook played a move on turn 10, to which the immortal Tinsley replied “You are going to regret this.” 23 full moves (46 plies) later, and Schaeffer was forced to admit that indeed, he did regret and Tinsley had the advantage.

Two things strike me about this:

In this section Schaeffer is talking a lot about his “nemesis”, a bug in the scoring routines. It’s a little bit enheartening to read this. I’ve been working on Milhouse in little bursts of activity, and it seems like it is progressing nicely, but it occasionally behaves in a mysterious fashion. At this point in Chinook development, Schaeffer had spent 560 days on his project, and he still had mysterious bugs. It appears that I am in good company, even if I think my computer should be holding its own.

Last, that the game is still fiendishly subtle: on page 190, we are confronted with this position from an earlier Chinook-Tinsley match.

White can win, but it's difficult...

Tinsley played 27-23, which results in an exchange, and by virtue of the endgame database, it is a draw. But later analysis by Tinsley showed that he missed a win (gasp!) I handed it to Milhouse, but it makes no good headway on this position, so I tossed it to Cake to see what it could see. Cake takes a bit of time, but eventually comes up with the right idea, and records that it is up a man afer churning for roughly 10 minutes. It is tough going, even with Cake’s 8 piece endgame database, because no exchanges occur until fairly deep in the search. For fun, I loaded the position into Chinooks 10 piece endgame querying applet, and it verified that 17-13 is indeed a winning move as later analyzed by Tinsley. Chinook also reveals that 14-9 and 14-10 are winning moves.

It’s all over my head, that’s for sure.

A small experiment with a C compiler

I’m still somewhat baffled by the performance of my checkers program. I keep pondering that perhaps it would be a good idea to take all the lessons I’ve learned, burn it down, and try again with Milhouse2. After all, in the words of Tom Duff (who may not have said this, but should have) “any program worth writing is worth rewriting”. Toward that end, I was reconsidering the data representation that I used, and revisited this nifty page on Bitboards:

Checker Bitboard Tutorial.

I basically use bitboards already, but at the very end of the page, you’ll see mention of a novel if somewhat unobvious representation credited to Thotsaphon Thanatipanonda, which simplifies move and jump generation. My own code for this was laboriously constructed using a python program, so it would be nice if we could shrink that code a bit and make it more straightforward. One key that Ed mentions is that it requires machine language rotate instructions, since C does not have a rotate instruction. I thought I’d try to see what gcc does with the obvious implementation. So, I quickly wrote the following function:

inline uint32_t
rol7(uint32_t val)
{
    return (val << 7) | (val >> 25) ;
}

(This function uses uint32_t to make sure we have 32 bit values. You can find the defines in stdint.h.) As perhaps we should expect, gcc is smart enough to turn this into a single machine language rorl. This means that we can write a move generator without any difference in handling the even or odd rows, which currently doubles at least the size of my own move generator. Pretty nifty.

jCheckers Beats Milhouse

After watching a couple of games where Milhouse appeared to get behind, but then pulled out a draw, here’s one where milhouse got behind, stumbled, and lost. I haven’t had the opportunity yet to study it in any detail, but I’ll archive the game here for future analysis:

[Black "jcheckers"]
[White " milhouse"]
[Event "sparring 15 seconds"]
[Result "1-0"]
[Date "unknown"]
1. 11-15 24-19 2. 15x24 28x19 3. 9-14 27-24 4. 8-11 24-20 5. 11-15 32-28 6.
15x24 28x19 7. 4-8 22-18 8. 8-11 18x9 9. 5x14 19-16 10. 12x19 23x16 11. 10-15
25-22 12. 6-10 26-23 13. 14-18 23x14 14. 10x26 30x23 15. 15-19 21-17 16. 19x26
31x22 17. 11-15 16-12 18. 7-11 17-14 19. 15-19 22-17 20. 19-23 14-10 21. 23-26
10-7 22. 3x10 12-8 23. 26-30 8-3 24. 2-7 17-14 25. 10x17 3x10 26. 17-22 10-7
27. 11-15 7-10 28. 15-18 10-14 29. 18-23 14-10 30. 23-27 10-15 31. 27-32
1-0

Addendum: Hmm. This is another game where Milhouse played white, and stumbled into the Double Corner opening. I could probably benefit from an opening book, or at least, I should figure out why it prefers this move in this case, and maybe alter it to not play into this opening.

Addendum2: By the time white plays 7. … 22-18, he’s probably already behind, but 22-18 doesn’t lead to anything good.

jCheckers released

Martin Fierz, author of the truly excellent checkers program Cake, has released a checkers program in Java. I run Cake on my PC, and also at times under Wine on Linux, but it is nice to have a version which can run on Mac/Linux without any hassle.

jCheckers.

For fun, I fired up Milhouse and played a game against jCheckers. I gave each engine five seconds to think (which is pretty low, but it makes the game go fast). Milhouse managed to eek out a draw, but it appears to me that it was lucky: Milhouse appeared to be in a pretty cramped position for much of the game. It seems to me that Milhouse narrowly broke through the log jam at the end, and each side ends up with 4 Kings, which is a drawn position.

[Black "Milhouse"]
[White " jCheckers"]
[Event "First Game Against jCheckers"]
[Result "1/2-1/2"]
[Date "2010-03-28"]
1. 9-14 22-18 2. 10-15 18x9 3. 5x14 25-22 4. 15-19 24x15 5. 11x25 29x22 6. 8-11
28-24 7. 6-10 22-18 8. 1-5 18x9 9. 5x14 26-22 10. 11-15 31-26 11. 3-8 23-18
12. 14x23 26x19 13. 7-11 21-17 14. 11-16 27-23 15. 8-11 17-13 16. 16-20 32-27
17. 11-16 13-9 18. 4-8 30-25 19. 8-11 9-5 20. 2-6 5-1 21. 6-9 25-21 22. 9-14
1-6 23. 14-18 23x7 24. 16x32 6-10 25. 20x27 10x19 26. 27-31 19-23 27. 31-27
23-19 28. 11-16 19-15 29. 16-20 15-19 30. 32-28 22-18 31. 20-24 7-3 32. 27-32
3-8 33. 24-27 8-11 34. 27-31 21-17 
1/2-1/2

Here’s the graphics from jCheckers, showing the final position.

Final Position

Addendum: Here’s one of the interesting position from the game. Cake finds a move which wins a checker, but both jCheckers and Milhouse seem to struggle to search deep enough to find it. Jcheckers played 14. … 27-23 which appears to be a draw, but 14. … 17-13 is a better move, and should win at least a checker.

White to play and win (jCheckers missed it, and so does Milhouse)

Classic Board Games for Computer Implementation?

So, my experiments with my checkers program Milhouse have been fun and interesting. There is still work to be done: I don’t think a machine that can’t properly play first position can be reasonably said to be a good checkers player (even though I still make errors while practicing it myself against Cake), but I’ve learned a lot. I’ve been toying with the possibility of doing a rewrite of Milhouse, but maybe that isn’t the best use of my precious after-hours time. For the last couple of days, I’ve been musing about the possibility of finding other strategic board games that might be fun to try. Here is a rough list of my internal criteria:

  1. It should be a game that people play. It would be great if there was some existing scholarship on human play, including test positions.
  2. It should be a game which is uncopyrighted and free from legal restrictions for use. Hence my use of the term “classic”.
  3. It should be reasonably hard. Trivial games like Tic Tac Toe need not apply.

There are some additional possibilities..

  1. I’m willing to consider games with a chance component (such as dice in backgammon) or a gambling component (such as the doubling cube in the same).
  2. I prefer games which are more abstract, whose difficulties arise from position and interaction more than complicated rules.

So, what games are out there?

Well, obviously there is chess and its variants, including Shoji, Xiangqi and a slew of modern variants. Originally I had a goal of implementing chess, but I must admit that it doesn’t appeal to me quite as much now largely because I view it as less abstract than games like checkers. The difficulty of First Position arises not because of complex interactions between different piece types, but because of the interplay of very simple elements. Still, I have considered the possibility of taking one of the variants of chess that are played on small boards perhaps a 5×6 board and developing opening and endgame databases for these. I haven’t done the math on what it would take to resolve the game theoretic value of one of these small games, but it might be tractable with the computing power that I have at my disposal.

There are obviously variants to checkers, including international draughts and the misere variants, also known as suicide checkers. The idea is that the person who loses all their pieces first wins. Martin Fierz has a version of Cake called Suicidal Cake which can play suicide checkers, which is an interesting variant with many of its own subtle problems. International draughts is a much more complicated game, with a larger board and some really interesting rules for jumping. There are other variants as well (Polish, Turkish, Italian) which could be considered as strong candidates.

Othello or Reversi are interesting games of position, and which can use many of the techniques we see in checkers and chess (namely opening books and endgame databases). There is also a fair amount of human knowledge about the game, which is good.

Backgammon introduces chance into the mix, which generally means that you need to use a different set of techniques than we’ve seen so far. I also like to play backgammon (although am only a middling level player). Backgammon has been the subject of quite a bit of machine learning research, and there exists many implementations to spar against.

Hex is a very nearly perfect abstract game: players alternate turns to connect their side of the board to the opposite side. There are no draws possible. The entire games can be displayed as a single graphic, which shows the order in which stones were placed on the board. The high branching factor makes traditional alpha-beta search less attractive.

Go has many of the same characteristics, with the added bonus of having lots of good human players. There has been a lot of interesting research recently about playing Go.

Arimaa was designed to be hard for computers to play, but frankly, I think that it does so only by making a rule system which is complicated. It seems between the push, pull, freeze and trap rules, there are at least two and maybe three that could have been eliminated. I also suspect that while it is (according to the authors) “easy to learn”, I suspect that really only means “easy to learn the rules”. I would make the (unsupported) claim that most human play is quite weak as well.

Games like Awari or other games of the mancala family have been studied quite extensively. Connect 4 has been solved, but is still middling interesting.

Any other board games I’m missing? I await your suggestions and/or advocacy.

Can I derive an adequate evaluation function for Milhouse through random play?

As I have mentioned from time to time, I have an implementation of a middling checkers player program that I’ve called Milhouse. Every few months I take time out to play with it and decide what I can do to make it better. Recently I’ve been pondering the possibility of implementing drop-out expansion to make a good opening book, and I’ve also been pondering how I might make further additions to its evaluation function. Sadly, improving its evaluation function requires that I actually understand more about the subtleties of checkers, and as yet, my own study of this fascinating game have been sporadic.

There have been a number of papers suggesting schemes for automatically generating heuristics for games. Most of the ones I’ve read have been based upon machine learning, and are sufficiently complicated that I haven’t had the time or energy to take a look at them. But this evening I stumbled upon the following paper by Abramson and Korf which suggested a different scheme which seems like it could be interesting to experiment with:

A Model of Two Player Evaluation Functions

The basic idea is that the evaluation of a node can be defined as the expected value of games played randomly from the terminal position. In other words, you can determine the value of a position by playing a bunch of random games from that position, and returning the estimated expected value.

This seems like a bad idea on the face of it: for example consider the classic First Position from checkers:

White to move, and win, but Milhouse flounders...

This position requires really deep search: in fact, even with an endgame WLD database, my program has difficulty actually making headway. It seems all moves as winning, and therefore all moves as having virtually the same value, which means that it searches a lot of nodes.

Martin Fierz’s Cake program plays this position quite well. It takes a while to resolve, the first capture occurs some 56 ply deep in the search.

[Event ""]
[Date ""]
[Black ""]
[White ""]
[Result "*"]
[Setup "1"]
[FEN "W:WK19,K18:BK32,3."]
1. 19-23 32-28 2. 23-27 28-32 3. 18-23 3-8 4. 27-24 32-28 5. 24-19 28-32 6. 23-26 32-28 7. 
26-31 28-32 8. 19-16 8-12 9. 16-19 32-28 10. 31-27 28-32 11. 19-23 32-28 12. 27-32 28-24 
13. 23-18 24-28 14. 18-22 28-24 15. 32-28 24-19 16. 22-18 12-16 17. 28-32 19-24 18. 18-15 
16-20 19. 15-11 24-19 20. 32-27 19-24 21. 27-23 24-28 22. 11-15 28-24 23. 15-18 24-28 24. 
23-19 28-32 25. 18-23 32-28 26. 23-27 28-32 27. 19-23 20-24 28. 27x20 32-28 29. 23-19 
28-32 30. 20-24 32-28 31. 24-27 28-32 32. 19-23 32-28 33. 27-32 28-24 34. 32-28 24-20 35. 
23-18 20-16 36. 18-15 16-20 37. 15-11 20-24 38. 28x19 *

But this paper points out something interesting: to be useful, the program only needs to differentiate better moves from worst moves at the search horizon. And, the fact of the matter is that random play might very well be a good estimate of the relative value of the two positions.

Abramson and Korf did some experiments using the game Othello which compared using the expected outcome of random play as a heuristic value, and compared it with a standard static evaluation function, the worst possible choice, and a complete minimax search from the node (which was the true value of the game). It was found that using expected outcome was reasonable, and made fewer errors than their common static evaluator.

So here’s my question: is this applicable to endgames in checkers? At the time the paper was written, there weren’t large endgame databases for checkers, but now I have a copy of the WLD database for all positions with < 8 checkers. I can quite easily run this experiment, and compare my own static evaluation function to the evaluation function achieved by looking at the expected outcome from random play, and compare them to the actual game theoretic value of the position.

I estimate that it’ll take a few hours of coding to set this up. I’ll report back on the results sometime in the next couple of weeks.

Addendum: I arrived here actually by looking at research on Go, a game which I understand even less than Checkers, but which has seen some interesting research in the last few years, particularly in the idea of “Monte Carlo” search: using random play to evaluate game positions. I’m not so interested in Go, but am somewhat interested in Hex, which has similar characteristics: namely, very high branching factor which makes deep searching hard, and very large configurations during the endgame which prevents straightforward endgame databases. I’d like to write a Hex program someday.

Fifth Position, a test for milhouse

My trip to Powell’s also netted me Erroll A. Smith’s The American Checker Player’s Handbook, a nice little tome published in 1944. It mostly is an introduction to the famous two-move openings, systematically organizing the forty-seven two-move openings into 7 so-called “Master” openings, and then the Major Variations. There are two principle areas that I’d like to see Milhouse, one is the openings, so I think it might be useful once I get going on that project. But in the mean time, it also has some nice positions that are good tests for either its general play, or its play with the endgame database. Witness the so-called Fifth Position:

Fifth Position: White to Play and Draw
Fifth Position: White to Play and Draw

Without an endgame database, it takes milhouse a 21 ply search to find the right variation that avoids a loss for white.

      +1 : [-5] 0.00s
         : 27-24
      +3 : [3] 0.00s
         : 27-24 11-15 20-16
... researching after failing low (-122)
      +5 : [-124] 0.00s
         : 27-24 11-15 20-16 14-18 21-17
... researching after failing high (-9)
      +7 : [-9] 0.00s
         : 20-16 11x20 27-23 12-16 19x12 20-24 12-8
... researching after failing low (-34)
      +9 : [-37] 0.00s
         : 20-16 11x20 27-23 20-24 22-18 24-27 18x9  27-32 23-18
     +11 : [-35] 0.01s
         : 20-16 11x20 27-23 20-24 22-18 24-27 18x9  27-32 23-18
         : 10-14 18-15
     +13 : [-40] 0.02s
         : 20-16 11x20 27-23 20-24 22-18 24-27 18x9  10-14 19-15
         : 27-32 23-19 32-27 15-10
... researching after failing low (-78)
     +15 : [-116] 0.08s
         : 20-16 11x20 27-23 20-24 22-18 24-27 18x9  10-14 19-15
         : 12-16  9-6  27-32  6-1  32-27 23-18
     +17 : [-119] 0.23s
         : 20-16 11x20 27-23 20-24 22-18 24-27 18x9  10-14  9-6
         : 27-32 19-15 12-16 23-18 14x23  6-1  32-27  1-5
     +19 : [-114] 0.72s
         : 27-23 11-15 20-16 15x24 16-11 10-15 23-19 14-18 19x10
         : 18x25 11-8  24-27 10-6  27-32  6-1  25-30
... researching after failing high (-89)
     +21 : [-11] 2.65s
         : 20-16 11x20 27-23 20-24 22-18 13-17 18x9  10-14  9-6
         : 24-27  6-1  27-31  1-5  31-27  5-9  27x18 19-15 18x11
         : 9x18
     +23 : [-16] 5.54s
         : 20-16 11x20 27-23 20-24 22-18 24-27 18x9  27-32 23-18
         : 10-14 18-15 13-17  9-6  17-22  6-1  22-26 15-10
     +25 : [-17] 13.83s
         : 20-16 11x20 27-23 20-24 22-18 24-27 18x9  27-32 23-18
         : 10-14 18-15 13-17  9-6  32-27  6-1  17-22 15-10

The play listed in this line differs a bit after the capture 18×9, but the Cake endgame database lists the move as drawn after that capture, and it appears that milhouse is able to hold a drawing line. Smith lists 27-31 as a drawing variation, but suggests 10-14 as the drawing move. Using the Cake database, it also appears that 12-16 and 27-32 can draw as well.

Using the endgame database, milhouse asserts that the game is drawn with a 7 ply search, after searching less that 1000 nodes, about a 450 fold increase in speed.

Some new additions to my checkers library…

A visit to Powell’s books today netted me three new (well, new to me, but used, and in two cases, quite old) books on checkers. It’s been a while since I mentioned my checkers program milhouse, but it’s still in the back of my mind, and these old books provide excellent insight into the game, and are rich in test cases that can be used to evaluate checkers play. For instance, in Spayth’s American Draught Player, he lists a number of nice positions, like this position, labelled Payne’s #1:

Either side to play, white to win...
Either side to play, white to win...

Milhouse has no difficulty finding the winning line on simple positions like this, but there are definitely some more subtle positions, and Spayth’s book provides some insight into many common openings. I’ll probably type in some of these problems, and eventually produce a downloadable version of milhouse for all to enjoy.

Addendum: Here’s the most interesting book: a copy of Spayth’s The American Draughts Player, which was written circa 1860. I’m not sure when this edition was printed, but it’s obviously quite old, and yet in pretty good condition.

Spayth's <em>American Draught Player</em>
Spayth's American Draught Player

For those of you who weren’t lucky enough to find a copy of this book, you can nevertheless find it on Google Books:

Addendum2: My copy is a sixth edition, which means that it probably dates to the mid 1890s.