Monthly Archives: April 2006

Mathematical Musings on Baseball

My recent purchase of the book Baseball Hacks has made me dust off some of my (I must admit impossibly rudimentary) knowledge of statistics and probability and think about baseball in that context.

I think it was several years ago while reading Lewis’ Moneyball that I first became aware of Bill James’ Pythagorean Theorem of Baseball. It states that the expected number of wins for a baseball team is proportional to the ratio of the square of the runs the team scores to the sum of the squares of the runs scored and the runs scored against. At the time I heard of this, I really didn’t have much insight as to why that should be (I’m told that James takes five pages to derive it in his 1981 Baseball Abstract, but I don’t have a copy of that) and that kind of bothered me.

So, I set out to try to derive my own formula that would give the expected number of wins. I downloaded the game logs for 2004 and extracted the 162 games that the Oakland Athletics played. They scored 793 runs and had 742 runs scored against them. It turns out that they actually won 91 games. How does that compare to what Bill James predicted? A bit of math and you’ll find that by James’ Pythagorean Formula, it might be expected that they win 86.37 games. Not too bad, but it seems that the A’s might have overperformed a bit.
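For anyone who wants to check the arithmetic, here is a minimal Python sketch. It takes the Pythagorean formula to be runs scored squared, divided by the sum of the squares of runs scored and runs allowed, times the number of games. The function name and the `exponent` parameter are my own, not anything from James.

```python
def pythagorean_wins(runs_scored, runs_allowed, games=162, exponent=2.0):
    """Expected wins under the Pythagorean formula (hypothetical helper)."""
    scored = runs_scored ** exponent
    allowed = runs_allowed ** exponent
    return games * scored / (scored + allowed)


# 2004 Oakland Athletics totals quoted in the post.
athletics_expected = pythagorean_wins(793, 742)
print(f"Pythagorean expectation: {athletics_expected:.2f} wins")
```

The result should land within a hundredth or so of the 86.37 quoted above; the last digit depends on whether you round or truncate.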

I tried to derive a similar number by a different tactic. I assumed that baseball scoring is a Poisson process (this is one of many simplifying assumptions that isn’t true, but it simplifies the math). I then wrote a simple little simulator that played random seasons of baseball and totalled the runs scored in each, using a scoring rate that would be expected to yield 793 runs over the season. (Basically, the time to the next score can be generated by getting a uniform random variable u in the range zero to one, and then computing the time to the next score as -log(u)/r, where r is the average rate, in this case 793 / 162.) You’ll find that you don’t get 793 runs very often, and the distribution of potential results forms a nice looking bell curve.
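The simulator described above might look something like the following Python sketch. It is a reconstruction under my own assumptions, not the original code: time is measured in games, a season lasts 162 time units, and the function name, the seed, and the count of 2000 simulated seasons are arbitrary choices.

```python
import math
import random


def simulate_season_runs(total_runs=793, games=162, rng=random):
    """Count the scoring events of a Poisson process over one season."""
    rate = total_runs / games        # r: average runs per game
    elapsed = 0.0                    # time so far, measured in games
    runs = 0
    while True:
        u = 1.0 - rng.random()       # uniform in (0, 1], so log(u) is defined
        elapsed += -math.log(u) / rate   # waiting time until the next run
        if elapsed > games:
            return runs
        runs += 1


rng = random.Random(2004)            # fixed seed so reruns give the same totals
totals = [simulate_season_runs(rng=rng) for _ in range(2000)]
mean = sum(totals) / len(totals)
spread = (sum((t - mean) ** 2 for t in totals) / len(totals)) ** 0.5
print(f"mean {mean:.1f} runs, standard deviation {spread:.1f}")
```

The season totals should cluster around 793 with a standard deviation near the square root of 793 (about 28), which is the bell curve mentioned above.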

Fun, but not what we were originally trying to do.

It turns out that you can pretty easily determine for any potential number k what the probability is for a given Poisson random variable to have precisely k occurrences in the unit interval. You can look it up for yourself on Wikipedia, and it’s just a few lines of code to implement. Then, you determine for each possible pair of scores, say, home and visitor, what the probability of that particular combination is (I truncate at 20 runs per team), and simply sum up the probabilities of the combinations in which the team wins.

Well, with one complication: it doesn’t tell us how to score ties. I summed up the probability that tie games would occur, and found that according to the theory, ties should (for the 2004 Athletics) have happened about 21.18 times (they actually happened 19 times in 2004, not bad!). I decided to do the simplest possible thing, and just assume that each team will win 50% of the games which are tied after regulation. So, when I add half the tie percentage to the previously accounted for win percentage, and multiply by 162 games…

I get a prediction that the A’s should have won 87.48 games. And I understand most of the assumptions and math that lead to this conclusion. Neat!
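Here is a minimal Python sketch of that calculation, written from the description above rather than taken from the original program. It assumes each team's runs per game are independent Poisson variables with rates 793/162 and 742/162; the function names are hypothetical.

```python
import math


def poisson_pmf(k, rate):
    """Probability of exactly k events for a Poisson variable with this rate."""
    return math.exp(-rate) * rate ** k / math.factorial(k)


def expected_wins_and_ties(runs_scored, runs_allowed, games=162, max_runs=20):
    """Return (expected wins, expected ties) over a season."""
    rate_for = runs_scored / games
    rate_against = runs_allowed / games
    win = tie = 0.0
    for ours in range(max_runs + 1):          # scores truncated at 20 runs
        p_ours = poisson_pmf(ours, rate_for)
        for theirs in range(max_runs + 1):
            p = p_ours * poisson_pmf(theirs, rate_against)
            if ours > theirs:
                win += p
            elif ours == theirs:
                tie += p
    # Assumption from the post: tied games are split 50/50.
    return games * (win + 0.5 * tie), games * tie


wins, ties = expected_wins_and_ties(793, 742)
print(f"expected wins {wins:.2f}, expected ties {ties:.2f}")
```

If the assumptions match, the two numbers should come out close to the 87.48 wins and 21.18 ties quoted above. The 20-run cutoff discards only a negligible amount of probability at these scoring rates.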

Oh, on the less theoretical front, Chavez, Thomas and Bradley hit three homers yesterday on three consecutive pitches and the A’s won. I tuned in at the top of the 9th in today’s game with the A’s leading 3-1, just in time to see Huston Street leave ball after ball up in the strike zone and get hammered for 4 runs. The A’s would load the bases in the bottom of the ninth, but Swisher flied out to end the game.

It’s best not to lose sight of the game for the mathematics.

[tags]Mathematics,Statistics,Baseball,My Projects[/tags]

Le voyage dans la lune

Via archive.org, you can download a DivX copy of Georges Méliès’ Le voyage dans la lune. This classic 1902 silent film may qualify as the earliest science fiction movie. The image to the right is probably the most famous from the film. Cool stuff!

[tags]Le Voyage Dans La Lune,Silent Movie,Archive.org[/tags]

1986 World Series Game Six Re-enacted in RBI Baseball

One of the most famous games in all of baseball history must be Game 6 of the 1986 World Series, which pitted the Mets against the Red Sox. The Red Sox entered the game leading three games to two, and after nine innings, the game was tied 3-3. The Red Sox scored twice in the top of the 10th to lead by the score of 5-3. You’d think it would be time to pack up the equipment and go home.

But baseball can be a cruel game, and the Curse of the Bambino still lived.

From retrosheet.org’s play by play, the bottom of the tenth went:

METS 10TH: Backman made an out to left; Hernandez flied to center; (Only one out remaining from the Series win) Carter singled to left; MITCHELL BATTED FOR AGUILERA; Mitchell singled to center [Carter to second]; Knight singled to center [Carter scored, Mitchell to third]; STANLEY REPLACED SCHIRALDI (PITCHING); Stanley threw a wild pitch [Mitchell scored, Knight to second]; Wilson reached on an error by Buckner [Knight scored (unearned)]; 3 R, 3 H, 1 E, 1 LOB. Red Sox 5, Mets 6.

Ouch!

Of course, the Mets went on to win game seven and the Series. Buckner was unfairly tagged for the loss (the Sox left no fewer than fourteen runners on base), and it would overshadow the entirety of his career ever after.

All this is a rather laborious setup for the following Google Video:

1986 World Series Game Six Re-enacted in RBI Baseball – Google Video

[tags]Baseball,World Series,Red Sox[/tags]

Jumpy eggs caught on camera

Good thing this isn’t my tax dollars at work:

After two years of work, with a purpose-built steel machine wired up to high-speed cameras, microphones and electronic sensors, a team of Japanese researchers has finally proved that a hard-boiled egg can jump. All it takes, according to Yutaka Shimomura and colleagues of Keio University, is a good spin.

A perfect manifestation of the brainwagon philosophy: “there is much pleasure in useless knowledge”.

Baseball Hacks

Sometimes you find a book that seems uniquely written for your interests: such is Baseball Hacks, the latest O’Reilly book in their illustrious “Hacks” series. It is basically a manual on how to use computers to fuel your obsession for baseball statistics, and includes a wide variety of cool things you can do with a computer, access to the internet, and open source tools like MySQL and Perl. I’ll say more when I’ve had a chance to work through some of the examples.

[tags]Baseball Hacks,Baseball[/tags]

Intelligent Design the Future: Heddle on Sagan: Billions and Billions of Errors

Today’s rant on the subject of Intelligent Design is going to be a little difficult to follow, so try to stick with me. Today, on the blog, Intelligent Design the Future, Jonathan Witt reports on “physicist” David Heddle’s critique of the late astronomer Carl Sagan. Actually, you can’t really call it a critique: it’s basically the assertion that Carl Sagan was never right about anything. Heddle comments:

Recent astronomical data have again demonstrated that few scientists have simultaneously achieved such widespread acclaim while consistently being wrong as the late Carl Sagan.

In the area of popular science, I don’t know much that he wrote or said that was correct.

The problem (aside from its obvious vagueness) is the part that we don’t know: how much of Sagan’s writings on popular science has Heddle actually read?

Heddle’s vita (updated at Heddle’s request) shows him to be a collaborator on a large number of papers on particle physics. You can check out some of Sagan’s achievements via his Wikipedia entry, and decide for yourself who might be best qualified to speak for the world of exobiology in astronomy.

The one criticism that Heddle makes against Sagan is:

Sagan was wrong, wrong, wrong. The earth is in a privileged location (not just for life as we know it, but for any kind of complex life imaginable), as discussed quite convincingly by Gonzalez and Richards in the Privileged Planet.

But there is the problem: The Privileged Planet isn’t convincing in the least. Their reasoning is basically that the earth occupies a fairly uncommon part of the galaxy, a part which happens to be reasonably quiet astronomically. No recent supernovas. Relatively few life-extinguishing collisions. Not too close to the radiation sources at the galactic core. Isn’t it amazing that we find ourselves in such a friendly place?

Does everyone spot the problem with this argument?

If you win the lottery, people might interview you and ask what it is about you that enabled you to win. You might think that you deserved it, or that “God told you to pick the numbers”, or that your lucky rabbit’s foot enabled you to win. But if you didn’t win, nobody would be around to ask the question, because nobody cares what all the losers did. Similarly, it is not at all surprising that we find ourselves orbiting a particularly boring star in the most boring part of the galaxy. If we didn’t, we would have been exterminated a long time ago, and wouldn’t have the metabolism necessary to ask the question. We didn’t evolve around a random star: we evolved around a star where life could evolve. Therefore, it really isn’t evidence of anything when we find that these places are uncommon.

You’d think this would be relatively easy to figure out.

Oh, and incidentally, if you’d like to claim that intelligent design and religion are completely different, surf on over to Heddle’s personal blog and read what things he feels strongly enough to post about, and then see if you can maintain a straight face.
[tags]Intelligent Design,Sagan,Jonathan Witt,David Heddle[/tags]

MAKE: Build a Baird Televisor

A while back, I did some research on the early days of television, and provided some links that might help you create your own replica. The MAKE blog linked to a nice scan that shows some plans from 1928 for building your own “Baird Televisor”, which I thought were really cool.
MAKE: Blog: HOW TO – Build a Baird Televisor

The linked page makes each individual page available as a PDF file, so I merged them into a single PDF for easier downloading and printing. Enjoy!

CD quality field recording rig

It’s been some time since I posted anything of interest to the musicians in my target demographic, or those who are interested in field recording. Check out this link on Instructables for a CD quality field recording rig which is entirely battery powered and costs less than $1000. The most interesting choice is to use the optical digital inputs from a Creative Nomad Jukebox 3 to record.

[tags]Field Recording,Music,Gadgets[/tags]

Zito back in the groove against the Mariners…

After a shaky opening day performance, Barry Zito came back and pitched six innings, giving up only one hit against the Mariners. Relievers Calero, Kennedy and Street shut the Mariners down the rest of the way, giving up no further hits and giving the Athletics their first shutout of the season. This follows a two-hit performance yesterday, marking the lowest total for hits in two consecutive games in Mariners history.

They will finish up the four game series today. First pitch today at 1:05 PST.

[tags]Oakland Athletics,Barry Zito,Baseball[/tags]

How to pull an all-nighter

This blog article reminded me of a part of my life that is long past: the times of the all-nighter. Back in my undergrad days, I would fairly regularly pull all-nighters, usually working on some kind of computer programming assignment. I think this culminated in a massive 72 hour awake-athon, where I consumed over 100 cups of coffee in three days. I was working the breakfast shift at the cafeteria, doing an easy job, just checking IDs of students who came in and reading the newspaper. I remember trying to read, and seeing words suddenly jump off the page. I decided that hallucinations were a bad sign, and got someone to take my shift, went home and slept for twelve hours.

Shortly after this, I began to notice something: the code that I wrote past midnight was usually really ugly, buggy, and didn’t make much sense the next day.   I also found that programs were just easier to write when I wasn’t tired.   So I created a rule: I simply don’t program past midnight.

As I’ve gotten older, I have begun to realize just how dependent your brain is on being properly fed and rested. Caffeine just makes you jittery. Loud music and bright lights just give you a headache. You need sleep, so sleep. Plan ahead enough to keep all-nighters from becoming a necessity.

[tags]Advice[/tags]

The Art of Living by John Stuart Mill, 1848

Big thoughts that resonate with me today…

The Art of Living by John Stuart Mill, 1848
Hitherto it is questionable if all the mechanical inventions yet made have lightened the day’s toil of any human being. They have enabled a greater population to live the same life of drudgery and imprisonment, and an increased number of manufacturers and others to make fortunes. They have increased the comforts of the middle classes. But they have not yet begun to effect those great changes in human destiny, which it is in their nature and in their futurity to accomplish. Only when, in addition to just institutions, the increase of mankind shall be under the deliberate guidance of judicious foresight, can the conquests made from the powers of nature by the intellect and energy of scientific discoverers, become the common property of the species, and the means of improving and elevating the universal lot.

[tags]Quote of the Day[/tags]