I have a buddy named Andy who is your typical obnoxious Yankees fan. He’s also an astronomy postdoc and, as such, has a pretty solid background in statistics. Yesterday, Andy read an article by Nate Silver that put the Red Sox collapse into mathematical terms. As far as Andy was concerned, Silver’s numbers didn’t add up.
Andy, who is prone to hyperbole, emailed me to tell me he thought Silver was playing fast and loose with his math. “The part where he says ‘not mathematically rigorous’ should read ‘completely arbitrary,’” Andy wrote.
Here’s the section from the Silver article that Andy found most objectionable:
The following is not mathematically rigorous, since the events of yesterday evening were contingent upon one another in various ways. But just for fun, let’s put all of them together in sequence:
• The Red Sox had just a 0.3 percent chance of failing to make the playoffs on Sept. 3.
• The Rays had just a 0.3 percent chance of coming back after trailing 7-0 with two innings to play.
• The Red Sox had only about a 2 percent chance of losing their game against Baltimore, when the Orioles were down to their last strike.
• The Rays had about a 2 percent chance of winning in the bottom of the 9th, with Johnson also down to his last strike.
Multiply those four probabilities together, and you get a combined probability of about one chance in 278 million of all these events coming together in quite this way.
When confronted with numbers like these, you have to start to ask a few questions, statistical and existential.
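For anyone who wants to check the arithmetic, here is a quick Python sketch that reproduces Silver's one-in-278-million figure from his four bullets. The variable names are mine, and the multiplication treats the four events as independent, which is exactly the assumption Silver flags as not rigorous.

```python
import math

# The four probabilities from Silver's bullets, written as fractions.
# Treating them as independent is Silver's "just for fun" assumption.
probabilities = [
    0.003,  # Red Sox miss the playoffs, as of Sept. 3
    0.003,  # Rays come back from 7-0 with two innings to play
    0.02,   # Red Sox lose with the Orioles down to their last strike
    0.02,   # Rays win with Johnson down to his last strike
]

combined = math.prod(probabilities)
one_in = 1 / combined

print(f"combined probability: {combined:.2e}")
print(f"about 1 in {one_in / 1e6:.0f} million")
```

So the multiplication itself checks out; the question is whether multiplying is the right thing to do.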
Here’s Andy’s response:
Check out his 4 bulleted probabilities, which he multiplies together to say there’s a 1 in 278 million chance.
His 2nd and 4th bullet points come from the same game. Multiplying them together gives (0.3% x 2%) = 6*10^-5 = 0.00006.
Analogously, let’s say I did the same thing after every out in a 0-0 game. After every out, the probability that one team is going to win is about 50% (not quite, because the home team will have a higher win probability). Then the probability that Team A wins the game is 0.5^27, which is about 7*10^-9 = 0.000000007. And of course it’s the same for the other team, i.e. it’s near “statistically” impossible for either team to win. So obviously that doesn’t make any sense: you can’t just multiply the probabilities together.
Nate’s mistake, which he hints at in the preceding paragraph, but then ignores, is that the 0.3% in the 7th inning includes the 2% chance with 2 outs in the 9th. The same is true when multiplying the probabilities from the season standings. And of course, the analysis ignores:
1) the fact that the Yankees put in a bunch of shitty pitchers instead of a league average closer, let alone Mariano Rivera
2) the fact that the Red Sox are chokers
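Andy's 27-out analogy is easy to check for yourself. Here is a rough Python sketch, assuming (as he does) a flat 50% win probability at every checkpoint of a tied game. The helper name `naive_product` is mine, purely for illustration.

```python
def naive_product(snapshots):
    """Multiply win-probability snapshots as if they were independent events."""
    result = 1.0
    for p in snapshots:
        result *= p
    return result


# Andy's analogy: check Team A's win probability after each of 27 outs
# in a 0-0 game, and assume it is 0.5 every time.
tied_game_snapshots = [0.5] * 27
naive_team_a = naive_product(tied_game_snapshots)

# Team B gets the same number, so the two possible outcomes of the game
# would "add up" to far less than 1, which cannot be right.
naive_total = 2 * naive_team_a

print(f"naive P(Team A wins): {naive_team_a:.2e}")
print(f"naive P(someone wins): {naive_total:.2e}")
```

Somebody has to win the game, so the real total is 1. The snapshots are repeated looks at the same outcome from different points in the game, not separate events, which is why multiplying them goes so badly wrong.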