Category Archives: Math

Parrondo’s Paradox

I was reading Nahin’s Digital Dice, which I bought a while ago but hadn’t really dived into deeply, and he has a nice exposition of Parrondo’s Paradox, which is very neat. Here’s the idea:

  1. We are going to play two different coin tossing games. In each case, if we win, we gain one unit. If we lose, we lose one unit.
  2. In Game A, we will toss a coin, and if it comes up heads, we win; otherwise we lose. Unfortunately for us, the coin is not fair: it is more likely to come up tails than heads. Let’s call the probability of winning this game 0.5 - ε. Clearly, this game is a losing one for ε > 0.
  3. In Game B, we have two coins. The first is clearly biased against us: the chance of winning is just 0.1 - ε. The second is biased for us, with a probability of winning of 0.75 - ε. Here’s the wrinkle: if our bankroll is an exact multiple of three, we toss the first coin; otherwise we toss the second.

Game B is a bit harder to analyze, but let’s give it a try (this analysis is wrong, but not too far wrong): 1/3 of the time, we gain one unit with probability 0.1 - ε, and 2/3 of the time, we gain one unit with probability 0.75 - ε. That suggests our overall probability of winning is 8/15 - ε, so if we set ε to 0.005, we maintain a narrow advantage.

Or do we? Let’s go ahead and simulate this. I wrote some Python code that played both games for one million rounds. The results from a typical round:

GAME A: played 1000000 rounds, started with 0, ended with -8642
GAME B: played 1000000 rounds, started with 0, ended with -8532

So, here’s the first issue: Game B appears to be a loser as well. Why would that be? (Think about it, or peek at the sketch below.)
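Here’s the answer, for the impatient: treat the bankroll mod 3 as a three-state Markov chain and find its stationary distribution. A minimal sketch (my addition, separate from the simulation code in the addendum below):

eps = 0.005
p_bad  = 0.1  - eps      # coin used when the bankroll is a multiple of 3
p_good = 0.75 - eps      # coin used otherwise

# From state s (bankroll mod 3) we move to (s+1) % 3 on a win and
# (s-1) % 3 on a loss. Iterate the chain until it settles down.
pi = [1.0 / 3, 1.0 / 3, 1.0 / 3]
for _ in range(1000):
    pi = [pi[1] * (1 - p_good) + pi[2] * p_good,
          pi[0] * p_bad        + pi[2] * (1 - p_good),
          pi[0] * (1 - p_bad)  + pi[1] * p_good]

print "fraction of rounds on the bad coin: %.4f" % pi[0]
print "per-round chance of winning: %.4f" % (pi[0] * p_bad + (pi[1] + pi[2]) * p_good)

The bankroll turns out to be a multiple of three about 38% of the time, noticeably more than the naive 1/3, so the bad coin gets tossed more often than the back-of-the-envelope analysis assumed, and the true win probability is just under 0.5.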

But if we accept that Game A and Game B are losing propositions, let’s play a new game called the combination game. It’s very simple: we randomly pick (perhaps by tossing a fair coin) which of the two games to play in each round. If we simulate this game, we get the following (again, from a run of 1000000 rounds):

COMBINATION: played 1000000 rounds, started with 0, ended with 16620

What’s this about? We’re alternating between two losing games, and yet, we emerge as winners? What’s going on here?

Enjoy!

Addendum: Here’s some python code that I wrote to simulate the game.

#!/usr/bin/env python
#
# $Id$
# parrondo.py
#
# Simulate Parrondo's paradox: two individually losing coin games
# that turn into a winner when played in random alternation.

import random

def play_game_a(capital, eps):
    # Game A: a single coin, slightly biased against us.
    return random.uniform(0, 1) < 0.5 - eps

def play_game_b(capital, eps):
    # Game B: a badly biased coin when our bankroll is a multiple
    # of three, a favorable coin otherwise.
    if capital % 3 == 0:
        return random.uniform(0, 1) < 0.1 - eps
    else:
        return random.uniform(0, 1) < 0.75 - eps

def play_game_combo(capital, eps):
    # The combination game: toss a fair coin to pick which game
    # to play this round.
    if random.randint(0, 1):
        return play_game_a(capital, eps)
    else:
        return play_game_b(capital, eps)

def play(name, f, n, icap, eps):
    cap = icap
    for i in range(n):
        if f(cap, eps):
            cap = cap + 1
        else:
            cap = cap - 1
    print "%s: played %d rounds, started with %d, ended with %d" % (name, n, icap, cap)

play("GAME A", play_game_a, 1000000, 0, 0.005)
play("GAME B", play_game_b, 1000000, 0, 0.005)
play("COMBINATION", play_game_combo, 1000000, 0, 0.005)

Random bits…

Well, pseudo-random, but reportedly cryptographically strong.

01110 10011 01100 01000 11101 11001 11010 10011 00000 10011 11111 11001
10101 11100 10011 10110 01010 01100 01000 00010 11110 01100 11010 11000
00011 11111 10001 00111 01011 00101 00011 00111 01000 11111 10011 01001
10000 10101 10001 01011 11111 00100 11010 10000 00110 11111 10111 11011
00011 11001 11111 00011 11010 01001 01011 00010 10101 10101 10010 10100
11010 00010 01010 11110 01011 01001 11101 11100 00100 11011 00010 01110
01010 00100 10001 10001 00000 00100 10010 10000 10001 10110 10010 10101
10100 11000 10001 10000 00001 01001 00100 00100 11001 11010 01110 10000
01111 00100 11111 01010 01001 01100 01000 10010 10110 01000 00000 11001
01001 10011 11100 10010 00111 11001 10100 01011 01010 00110 11001 10001
10011 01110 00010 01111 10101 01111 10110 10101 00010 10000 11010 11001
00001 01101 00100 10010 00000 10000 01011 00011 00110 00110 01011 10111
01100 01010 11011 11010 10001 01000 00010 10010 00100 01011 11001 01011
01111 11111 01110 11100 10101 01110 01010 10100 00110 11111 11101 00100
01111 01011 10001 00000 10010 01111 11110 00100 11100 11001 10001 10011
01001 11101 11100 00110 01100 01011 00100 10000 10110 00000 10100 00001
01100 11111 00010 11000 00111 11101 10110 10101 11110 00110 10101 10100
10001 01101 11011 11000 10001 10111 00100 11000 01010 10111 01010 01010
01001 00110 11111 01110 01100 11110 10100 01011 01000 00100 10010 11000
01101 10001 11100 00100 00100 01001 10100 01100 11000 11100 01101 10111
00011 11101 11001 10010 10111 10000 11000 10100 01000 10101 11010 01001
10101 11011 11101 00000 11100 01011 11100 11110 01101 00110 10111 11100
01001 11000 11101 11110 01100 10101 11011 01110 00111 10101 01000 11111
11010 11000 00100 00111 01001 11011 01100 01101 11000 11101 00010 00011
10101 00011 10000 10000 11000 00111 01010 01000 01111 10101 01011 01110
01111 01001 11011 11000 01011 10110 10100 00001 11110 00011 00111 10011
11111 11101 01000 11100 01010 01000 00010 01111 10101 11101 11010 01100
11100 11101 01100 11101 11101 11010 01100 01101 00110 11100 11011 01001
01001 01110 10100 01011 00000 11110 00101 00000 10111 00101 10101 00111
10011 11111 11100 11011 00101 10100 11001 10110 01011 10010 00000 10100
10001 01110 11000 11010 11110 01010 10011 11011 00010 01010 11011 00011
10111 00010 10010 00001 11110 11000 00011 01001 00110 00010 10000 01000
00100 11000 10000 10110 11010 11011 00010 00001 10100 00101 01011 00000
00001 10000 10000 10111 10110 10111 01110 00010 00101 01001 10110 11000
00001 10001 01101 00111 01010 00001 11100 11110 11111 10001 10110 01111
01001 00010 10011 10100 00011 10101 01100 00001 00000 01111 10111 11010
00110 01110 10001 00010 11111 11000 01100 11010 10011 11000 01000 00101
10000 00011 00001 10110 10001 00000 10001 00101 10010 00110 01101 00011
10110 00110 00101 00110 11100 11101 00011 10010 01011 01011 01000 11100
10100 11001 00110 00101 10110 10011 11001 01110 00111 00110 00000 11010
10011 00010 10000 10110 00000 01101 01111 10111 10010 11010 10001 01101
11000 01010 01011 00011 01010 01110 00011 01011 11010 00110 00101 10011
00110 10001 10001 10101 00110 11011 01111 11100 00111 00011 11001 11010
11000 11101 00000 10101 00110 01110 00110 01110 01100 00001 01001 01011
10100 01110 10000 10101 00001 10101 00101 11010 00100 10101 00111 00111
00110 11001 10110 00101 11111 01011 11110 10101 00001 11000 00011 01011
11111 01110 00011 01110 00110 10111 01111 01111 00011 01000 01101 11010
00110 11110 10000 00101 11111 11001 10111 10100 01010 01111 10010 11101
11110 01100 10101 01100 00000 10101 10110 11110 10110 11001 11001 00000
01111 10101 01000 10001 10000 11101 01001 01100 11100 11001 11010 00011

Interesting number theory underlies this: check out the Wikipedia entry for the Blum Blum Shub random number generator.
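The generator itself is tiny: pick two large secret primes p and q, both congruent to 3 mod 4, and repeatedly square modulo M = p*q, emitting the low-order bit at each step. A minimal sketch, with deliberately small (and therefore completely insecure) toy parameters of my own choosing:

p, q = 11, 23            # toy Blum primes, both congruent to 3 mod 4
M = p * q
x = 3                    # seed, coprime to M

def bbs_bit():
    global x
    x = (x * x) % M      # the Blum Blum Shub step: square mod M
    return x & 1         # emit the least significant bit

print "".join(str(bbs_bit()) for _ in range(30))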

Passing the Amateur Extra test by guessing…

The Amateur Extra test is 50 questions, multiple choice, with 4 answers per question. A passing grade is 35 or more correct. A few minutes of programming this morning, even before I had any coffee, yielded the exact probability of passing by pure guessing:

      4677523340461106447
------------------------------
158456325028528675187087900672

or about 1 in 33.9 billion.

This wasn’t that interesting a question, but to solve it, I hacked up a quick but limited implementation of rational arithmetic in Python. I wondered if there was a better way to implement this in Python so that operator overloading would “just work”, but I didn’t know how, and the problem was simple enough that I didn’t try. Here’s my solution (with one answer to the overloading question in a note after the code).

#!/usr/bin/env python

def gcd(a, b):
    # Euclid's algorithm.
    while b != 0:
        a, b = b, a % b
    return a

class Rational:
    # A minimal rational number: numerator a over denominator b,
    # reduced to lowest terms after each operation.
    def __init__(self, a, b):
        self.a = a
        self.b = b
    def __str__(self):
        return "[%d / %d]" % (self.a, self.b)
    def pow(self, p):
        return Rational(pow(self.a, p), pow(self.b, p))
    def mult(self, r):
        tmpa = self.a * r.a
        tmpb = self.b * r.b
        d = gcd(tmpa, tmpb)
        return Rational(tmpa // d, tmpb // d)
    def imult(self, i):
        # Multiply by a plain integer.
        tmpa = self.a * i
        tmpb = self.b
        d = gcd(tmpa, tmpb)
        return Rational(tmpa // d, tmpb // d)
    def add(self, r):
        tmpa = self.a * r.b + r.a * self.b
        tmpb = self.b * r.b
        d = gcd(tmpa, tmpb)
        return Rational(tmpa // d, tmpb // d)

p = Rational(1, 4)      # chance of guessing one question right
q = Rational(3, 4)      # chance of guessing wrong

def fact(n):
    c = 1
    while n > 1:
        c = c * n
        n = n - 1
    return c

def comb(a, b):
    return fact(a) // (fact(b) * fact(a - b))

# Sum the binomial probabilities of getting 35 through 50 answers right.
total = Rational(0, 1)

for t in range(35, 51):
    x = p.pow(t).mult(q.pow(50 - t)).imult(comb(50, t))
    total = total.add(x)

print "Exact probability is",  total
print "Only about 1 in", total.b // total.a, "will pass"

On Multiplication…

As I was “StumbleUpon”-ing tonight, I was reminded of something that I was thinking about a couple of days ago, and thought I’d write it down here. Let’s say we want to multiply, oh… 47 by 69. We begin by writing the two numbers:

47 69

and then proceed by repeatedly halving the first number (discarding any remainder) and doubling the second.

47 69
23 138
11 276
5 552
2 1104
1 2208

Now, we cross out all the rows where the first column is even (here, just the “2 1104” row), and add all the remaining entries in column two:

47 69
23 138
11 276
5 552
1 2208

If we add up 69+138+276+552+2208, we get 3243, which is indeed 47 * 69. This is the so-called Russian peasant multiplication. I’ve probably known about it since I was eleven or twelve, but never really gave it much thought. Later, it seemed like the obvious way to implement multiplication on computers that have shift instructions. It’s not hard to see why this works if you are familiar with binary arithmetic, but I must admit that if you weren’t, it seems unobvious how you would figure this stuff out. Yet the ancient Egyptians knew a closely allied technique that allowed them to do multiplication and division, so it would appear that the principles of binary arithmetic are exceedingly old. Wikipedia has some nice information.
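The whole procedure is a few lines of code; here’s a quick sketch (my addition, just the halve/double/accumulate loop written out):

def peasant_mult(a, b):
    # Halve a and double b; accumulate b on the rows where a is odd,
    # i.e. the rows we did not cross out above.
    total = 0
    while a > 0:
        if a % 2 == 1:
            total += b
        a //= 2              # halve, discarding any remainder
        b *= 2               # double
    return total

print peasant_mult(47, 69)   # prints 3243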

Factoring Machines

I’ve blogged about D. H. Lehmer’s factoring machines before. It’s fairly hard to convince those who aren’t pathologically interested in rather quirky bits of mathematics and computing that this collection of bicycle chains and sprockets is worthy of study, but I didn’t have much difficulty convincing my typical lunchtime companion Tom that they were interesting, so we were discussing it a bit yesterday.

In the current era of ubiquitous computing, it’s hard to imagine a time when computation wasn’t automated or mechanized, but back in the 1920s and 1930s, this was truly revolutionary stuff. An actual theory of computation was still on the horizon, awaiting the work of Church and Turing. The Bombe and Colossus code breaking machines of WWII were still in the future. Tom and I were musing that the work of the Lehmers was all the more remarkable because it was without precedent. We also wondered how their work might have influenced code breaking machinery in WWII. Were the gents at Bletchley Park aware of the work that the father/son duo were doing, and did it serve as some inspiration?

Well, I still don’t know. I’m going to go back and dig through some of my references later, but I did find a couple of interesting things.

According to Knuth (Seminumerical Algorithms, pg. 390), the Lehmers were not the first to propose a mechanical device for automating the sieving task. In 1896, Frederick Lawrence wrote a paper entitled “Factorisation of Numbers” which inspired three different prototypes. The first successful factoring machine was built by Carissan in 1919, which you can read more about here. Very neat device.

While trying to dig up online references, I found Eran Tromer’s annotated taxonomy of special purpose cryptographic machines. Lots of very good references and insights that might help me understand the historical context of these computing devices.

YAFU factors a 100 digit number for me…

On Friday, I left YAFU factoring a 100 digit number. I returned to find that it had discovered:

57790 93002 97410 10009 81807 59480 96292 62385 64081 79034 01512 68667
60949 55202 41611 02063 00992 86747 29621 36069 = 
85404 41503 10330 70274 85067 33621 20528 36393 71759 33321 * 
67667 37996 94567 83271 38746 19751 00109 79749 75402 10589

It took about 7.6 hours of compute time.

Addendum: I was reading “Factoring by electronic mail”, a 1990 paper by Lenstra and Manasse, and they listed some times for factorization of some large numbers. In Table 4, they list a bunch of challenging numbers, including the 93 digit composite factor of 7^139 + 1. Their system took 13 days. YAFU was able to factor it in about 70 minutes on my current desktop machine:

Total factoring time = 4283.6700 seconds


***factors found***

P1 = 2
P1 = 2
P1 = 2
P3 = 557
PRP22 = 5312442648966139012589
PRP57 = 651666519182971782190402264651775450396158503528225547931
PRP36 = 190705279598948669274660288301207361

It found the small primes via trial division, the 22 digit prime factor via ECM, and then churned the rest out using the quadratic sieve.

On Factoring, by Jevons’ The Principles of Science

While reading the section on the factorization of large numbers in Knuth’s Seminumerical Algorithms, I encountered a reference to an interesting claim by William Stanley Jevons: that nobody would ever know the two numbers which, multiplied together, produce 8,616,460,799. I did a google search, and found the passage in google books.

Knuth was quick to point out that the number he gave could be factored in short order (even by hand, even in Jevons’ day) using Fermat’s method. I of course just handed it to the little program I wrote, and it instantly returned

8,616,460,799 =
	89,681 *
	96,079

Addendum: How does Fermat’s method work? You basically try to express the number as the difference of two squares. Say the number N = u * v. If N = x^2 - y^2, then N = (x + y)(x - y), so the two factors are u = x + y and v = x - y. You start by setting x just above sqrt(N), and compute x*x - N. If the remainder is a perfect square, you have your two factors. Knuth gives several hints on how you can avoid testing many candidates, but even proceeding naively, you only have to try 55 values of x before finding that…

8616460799 = 92880**2 - 3199**2
8616460799 = 96079 * 89681
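Here’s a minimal sketch of that search (my illustration, not Knuth’s refined version), which produces the two lines above:

import math

def fermat(N):
    # Express N as x^2 - y^2 = (x + y)(x - y).
    x = int(math.sqrt(N))
    if x * x < N:
        x += 1               # start just above sqrt(N)
    while True:
        r = x * x - N
        y = int(math.sqrt(r))
        if y * y == r:       # remainder is a perfect square: done
            return x, y
        x += 1

x, y = fermat(8616460799)
print "%d = %d**2 - %d**2" % (8616460799, x, y)
print "%d = %d * %d" % (8616460799, x + y, x - y)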

Addendum: Lehmer factored this number in 1903. Golomb apparently had a Cryptologia article showing how it could be easily factored with a hand calculator, but I don’t want to pay $37 to get the 3 page reprint.

A Simple Python Program for Factoring…

Sometimes, your interests converge. Over on Programming Praxis, there was a coding challenge to implement Monte Carlo factorization. Since I had been thinking a bit more about factorization over the last couple of days, it seemed like a good time to give it a whack. I cribbed a version of the Miller-Rabin primality tester off the web, and then coded the rest up in Python.

#!/usr/bin/env python

#
# a basic implementation of the Pollard rho factorization
# Written by Mark VandeWettering <markv@pixar.com>
#

import sys
import locale
import random

class FactorError(Exception):
    def __init__(self, value):
        self.value = value
    def __str__(self):
        return repr(self.value)

def miller_rabin_pass(a, n):
    # One round of Miller-Rabin: write n-1 = d * 2^s with d odd, then
    # check whether base a witnesses the compositeness of n.
    d = n - 1
    s = 0
    while d % 2 == 0:
        d >>= 1
        s += 1

    a_to_power = pow(a, d, n)
    if a_to_power == 1:
        return True
    for i in xrange(s-1):
        if a_to_power == n - 1:
            return True
        a_to_power = (a_to_power * a_to_power) % n
    return a_to_power == n - 1

def isprime(n):
    # Twenty rounds with random bases: a composite slips through with
    # probability at most 4**-20.
    for repeat in xrange(20):
        a = 0
        while a == 0:
            a = random.randrange(n)
        if not miller_rabin_pass(a, n):
            return False
    return True

def gcd(a, b):
    while b != 0:
        a, b = b, a % b
    return a

def findfactor(n):
    # Pollard rho with Floyd cycle detection: iterate x -> x^2 + c mod n,
    # advancing y twice as fast, and watch for a nontrivial gcd.
    for c in range(1, 50):
        x = y = random.randint(1, n-1)
        x = (x * x + c) % n
        y = (y * y + c) % n
        y = (y * y + c) % n
        while True:
            t = gcd(n, abs(x-y))
            if t == 1:
                x = (x * x + c) % n
                y = (y * y + c) % n
                y = (y * y + c) % n
            elif t == n:
                break        # cycle closed without a factor; try a new c
            else:
                return t
    raise FactorError("couldn't find a factor.")

def factor(n):
    # Peel off factors until what remains is (probably) prime.
    r = []
    while True:
        if isprime(n):
            r.append(n)
            break
        try:
            f = findfactor(n)
            # note: rho may return a composite factor; for the example
            # below the factors it finds happen to be prime
            r.append(f)
            n = n // f
        except FactorError, msg:
            r.append(n)
            break
    r.sort()
    return r

def doit(n):
    flist = factor(n)
    print locale.format("%d", n, True), "="
    for f in flist:
        print "\t%s" % locale.format("%d", f, True)

locale.setlocale(locale.LC_ALL, "")

doit(2**98-1)

This is a fairly challenging factorization, and after a few seconds, it outputs:

316,912,650,057,057,350,374,175,801,343 =
	3
	43
	127
	4,363,953,127,297
	4,432,676,798,593

Feature Column from the AMS

I’ve been using StumbleUpon to find new webpages on a variety of subjects when I am bored, and as you might have seen from other blog postings (although not so many recently), mathematics is one of those things that I actually find kind of interesting to think about. While scanning around this morning, I found a link to the Feature Column from the AMS, which has many interesting articles on a wide variety of mathematical topics. I found the December 2008 article on the mathematics of clock making to be particularly interesting.

aamath

Suppose you wanted to format a nifty math equation for insertion into plain old ordinary email, or maybe as a comment in your C program. For instance, something reasonably complex like this rendition of the series used in the BBP algorithm to compute hexadecimal digits of pi:


      oo
     =====
__   \     /   4         2         1         1   \   -n
|| =  >    |------- - ------- - ------- - -------| 16
     /     \8 n + 1   8 n + 4   8 n + 5   8 n + 6/
     =====
     n = 0

It would be pretty tedious, right? Well, perhaps you could use this link to aamath, a nifty command line program for rendering ASCII art versions of math formulas.

Even bigger primes…

Last year, I wrote a post about a program I wrote to compute a really large prime number. At the time, 2^32582657 - 1 was “king” of the primes, having over 9.8 million digits. Today, one year later, it turns out that the Great Internet Mersenne Prime Search has discovered two even larger primes. I dusted off my code, and was pleased to find that it still worked, generating the nearly 13 million digits of the new king of primes in roughly two minutes…

Here are the digits of 2^43112609 - 1:

    316,470,269,330,255,923,143,453,723,
949,337,516,054,106,188,475,264,644,140,
304,176,732,811,247,493,069,368,692,043,
185,121,611,837,856,726,816,539,985,465,
097,356,123,432,645,179,673,853,590,577,
432597 lines deleted...
915,084,069,991,159,279,791,908,398,130,
223,304,824,083,119,093,195,998,014,562,
456,347,941,202,195,900,928,079,670,729,
447,921,616,491,887,478,265,780,022,181,
166,697,152,511

Addendum: Okay, it dawns on me that I never explained why this is difficult. Here’s the thing: Mersenne primes are really easy to write in binary: they are just a long string of ones. But to write them out in base 10 like we all know and love is more difficult. To do it, I actually do all my arithmetic in base 1000 (a convenient choice, because we want to print out the number with commas in place). Computing the prime isn’t very hard, really: a^b is 1 if b equals 0, is a * a^(b-1) if b is odd, and is (a^(b/2))^2 if b is even. So you only need about log2(b) multiplies to get your (in this case very big) number. The real trick comes in doing the multiply: the obvious algorithm takes O(n^2) time to multiply two n-digit numbers. A faster way is to use the FFT. The trick is realizing that multiplying numbers the schoolbook way is basically convolution, and convolution can be changed into simple pointwise multiplication by using the Fast Fourier Transform. So, here’s the basic idea. Suppose you have a number represented in base 1000. Pad it with an equal number of zeros. Then take the FFT. Square each resulting bin. Then inverse FFT it. You’ll be left with some bins with values > 1000, but you can normalize those by propagating the carries. And you are done! The FFT only takes O(n log n) time and the normalization is linear, so in the limit you are much, much faster doing multiplies this way.

There are a few issues with precision but overall, it works quite well.
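To make the exponentiation step concrete, here’s a tiny sketch of the recursion (with Python’s built-in bignums standing in for the FFT-based multiply described above):

def power(a, b):
    # a^b via square-and-multiply: only about log2(b) multiplies.
    if b == 0:
        return 1
    if b % 2 == 1:
        return a * power(a, b - 1)
    half = power(a, b // 2)
    return half * half

print power(2, 89) - 1   # a much smaller Mersenne prime, the easy way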

Math identities…

Hmmm. While searching for some unrelated program, I uncovered a program that I wrote which found the following identities:

1^3 + 12^3 = 9^3 + 10^3
2^3 + 89^3 = 41^3 + 86^3
8^3 + 53^3 = 29^3 + 50^3
9^3 + 34^3 = 16^3 + 33^3
9^3 + 58^3 = 22^3 + 57^3
10^3 + 27^3 = 19^3 + 24^3
17^3 + 76^3 = 38^3 + 73^3
30^3 + 67^3 = 51^3 + 58^3
51^3 + 82^3 = 64^3 + 75^3

Why is this interesting? It’s really not. I just kind of like math.
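For the curious, a brute-force search for such identities takes only a few lines. Here’s a sketch (a reconstruction, not the original program, so its output won’t match the list above exactly): it buckets every sum of two cubes and reports any sum reachable two different ways.

sums = {}
for a in range(1, 90):
    for b in range(a, 90):
        sums.setdefault(a**3 + b**3, []).append((a, b))

for s in sorted(sums):
    if len(sums[s]) > 1:
        (a, b), (c, d) = sums[s][0], sums[s][1]
        print "%d^3 + %d^3 = %d^3 + %d^3" % (a, b, c, d)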