The Summer of Jeff

Calculating the Probability of a Streak Within a Season

Posted in baseball analysis, programming by Jeff on September 18, 2017

The 2017 baseball season has had its share of team streaks. I wrote about the Indians’ record-breaking 22-game winning streak for the Economist, and needed to calculate the odds that a team of Cleveland’s quality would, over the course of a 162-game season, win so many consecutive games.

It turns out that this is a rather difficult problem. Baseball fans and pundits have published a number of probabilities over the course of the streak, but many fall into two categories: (a) the odds of winning all 22 games out of a specific set of 22, or (b) the odds of winning 22 out of 22, multiplied by 162/22, or the number of 22-game stretches that occur in the baseball regular season. The first solution generates some extremely long odds–it’s much harder to win 22 specific games than to assemble a streak over a longer time frame. The second solution is a bit better, but still gets it wrong.
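To make the two shortcuts concrete, here is a minimal Python sketch. The single-game win probability of 0.6577 is the Pythagorean figure quoted later in this post; the variable names are illustrative and not taken from any published calculation.

```python
# Two common shortcut estimates for a 22-game winning streak.
p_win = 0.6577      # assumed single-game win probability
streak_len = 22
season_len = 162

# (a) Probability of winning one specific set of 22 games.
specific_22 = p_win ** streak_len

# (b) The same figure scaled by 162/22, the number of
#     back-to-back 22-game stretches in a season.
scaled = specific_22 * season_len / streak_len

print(f"(a) about 1 in {1 / specific_22:,.0f}")
print(f"(b) about 1 in {1 / scaled:,.0f}")
```

Shortcut (b) only counts stretches laid end to end, so it ignores every streak that straddles two of those stretches.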

My solution isn’t 100% correct (and I’ll discuss its limitations in a bit), but it gets much closer, especially for rare events such as the Indians’ streak. We must calculate the probability of an exactly 22-game streak, then the probability of an exactly 23-game streak, all the way up to a full-season, 162-game streak. An exactly 22-game streak really occupies a 24-game stretch: to exclude 23-gamers (or longer), the 22 wins must be preceded and followed by losses. There are also the edge cases of streaks that begin or end the season, which cannot be both preceded and followed by losses, so we must handle those separately.

So, for the probability of an exactly 22-game streak:

  1. find the probability of a 24-game stretch consisting of a loss, 22 wins, and then a loss;
  2. count the number of 24-game stretches in the course of a season (the length of the season minus 24, plus 1, or 139, for the MLB regular season);
  3. multiply (1) by (2);
  4. to handle the edge cases, find the probability of a 23-game stretch starting (or ending) with a loss, followed (or preceded) by 22 wins;
  5. multiply (4) by 2;
  6. add (3) and (5).

Then repeat the process for every streak length from 23 up to 161, and then calculate the probability of 162 wins in 162 games.
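The steps above can be sketched in a few lines of Python. This is an illustration of the procedure as described, not the original script; the function and parameter names are hypothetical, and the number of interior stretches is counted as the season length minus the stretch length, plus one.

```python
def approx_streak_probability(streak_len, season_len=162, p_win=0.6577):
    """Approximate chance of a winning streak of at least streak_len games.

    Sums the chance of every exact streak length, so seasons with more
    than one qualifying streak are counted more than once.
    """
    q = 1.0 - p_win  # single-game loss probability
    total = 0.0
    for k in range(streak_len, season_len):
        run = p_win ** k
        # Interior streaks: loss, k wins, loss (a stretch of k + 2 games).
        interior_windows = max(season_len - (k + 2) + 1, 0)
        total += interior_windows * q * run * q
        # Edge streaks: k wins opening or closing the season, with a
        # single loss on the open side.
        total += 2 * q * run
    # A perfect season has no surrounding losses at all.
    total += p_win ** season_len
    return total


estimate = approx_streak_probability(22, 162, 0.6577)
print(f"about {1 / estimate:,.0f} to 1")
```

The default win probability is the 65.77% Pythagorean figure quoted later in this post; any other value can be passed in.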

Clearly you’re not going to do this by hand. To manage it, I wrote up a Python script, which can be customized for three variables: streak length, season length, and the odds that the team will win a single game.

The problem with this method is that it double-counts any seasons with multiple qualifying streaks. The longer or rarer the streak, the less likely this is a problem: just imagine the odds against Cleveland winning 22 in a row twice in the same season, even if it were still possible with fewer than 22 games remaining. But for the sake of completeness, it’s important to realize the answer given by this algorithm is not precise. And if you use it to calculate the probability of, say, a 3-game winning streak at some point in the season, the error is going to be so great as to make the whole exercise worthless.

(For the more math-conversant, here’s a discussion of the problem along with a fully correct answer. It would be possible to expand my solution in Python to render it complete, but it would increase the complexity quite a bit.)
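One standard way to get an exact figure without double-counting is a small dynamic program that tracks the length of the current winning run game by game. The sketch below is an assumption-laden illustration: it is not necessarily the approach taken in the discussion mentioned above, it treats every game as an independent trial with the same win probability, and the function name is hypothetical.

```python
def exact_streak_probability(streak_len, season_len=162, p_win=0.6577):
    """Exact chance of at least one winning streak of streak_len games,
    assuming independent games with a constant win probability."""
    # state[j]: probability that the current run of wins is j games long
    # and no run of streak_len has been completed so far.
    state = [0.0] * streak_len
    state[0] = 1.0
    hit = 0.0  # probability that a qualifying streak has been completed
    for _ in range(season_len):
        nxt = [0.0] * streak_len
        # A loss resets the current run to zero.
        nxt[0] = (1.0 - p_win) * sum(state)
        # A win extends the current run by one game.
        for j in range(streak_len - 1):
            nxt[j + 1] = p_win * state[j]
        # A win from a run of streak_len - 1 completes the streak.
        hit += p_win * state[streak_len - 1]
        state = nxt
    return hit


print(exact_streak_probability(22))  # rare streak
print(exact_streak_probability(3))   # common streak
```

Because each season is counted at most once, the result always stays between 0 and 1, even for short streaks such as the 3-game case.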

Indians, Dodgers, and long odds

As I noted in the Economist piece, the odds that a team of the Indians’ quality (a Pythagorean winning percentage of 65.77% through the final game of the streak) would win 22 in a row at some point during a season are about 200 to 1. I was surprised it was that likely, but on the other hand, there are very few teams of that quality. Also, if we consider the range in team quality from game to game due to the differences in starting pitching, the likelihood of such a streak goes down.

Playing around with the algorithm led me to a surprising finding: If we assume the Dodgers are also a very good team, as their record suggests, the likelihood of their own 11-game losing streak was lower than the probability of the Indians reeling off 22 wins in a row. At the same pythag of 65.77%, the odds against an 11-game losing streak are over 1,000 to 1, and even at a modest 60%, the odds against dropping 11 in a row are over 250 to 1.
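A losing streak is the same problem with wins and losses swapped, so the per-game probability that matters is one minus the winning percentage. The sketch below uses a closed form that the streak-by-streak sum collapses to (the expected number of qualifying streaks): a streak can open the season, or it can begin at any later game provided the game before it went the other way. The function name is hypothetical, and the shortcut carries the same double-counting caveat as the approximation described earlier in this post.

```python
def approx_run_probability(run_len, season_len, p_event):
    """Approximate chance of run_len consecutive events in a season,
    where each game is an event with probability p_event."""
    # One start position opens the season; the other
    # (season_len - run_len) positions need a non-event just before.
    later_starts = season_len - run_len
    return p_event ** run_len * (1 + later_starts * (1 - p_event))


for p_win in (0.6577, 0.60):
    p_lose_11 = approx_run_probability(11, 162, 1 - p_win)
    print(f"win pct {p_win:.4f}: about {1 / p_lose_11:,.0f} to 1 "
          "against an 11-game losing streak")

p_win_22 = approx_run_probability(22, 162, 0.6577)
print(f"22-game winning streak at .6577: about {1 / p_win_22:,.0f} to 1")
```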

That doesn’t even consider their almost-adjacent five-game losing streak, which meant that one of the best teams in baseball lost 16 out of 17. The probability of that requires solving a slightly different problem, one that I’ll leave as an exercise for the reader.

Wisconsin Baseball History Papers

Posted in baseball history by Jeff on September 18, 2011

In 2004, I presented two papers at SABR conferences on aspects of Wisconsin baseball history.

The first, which I gave at the Seymour Conference, is titled, “Challenging Everyone: The Capital City Base Baseball Club of Madison, Wisconsin, 1865-1870.”  It discusses the first wave of baseball enthusiasm in Madison and its quick fade.  Click for the PDF.

The second, which I gave at the SABR National in Cincinnati, is titled, “‘But Few Can Touch Him:’ George Wilson and Integrated Wisconsin Baseball, 1905-1907.”  Wilson was a very talented African-American pitcher who once played for the Page Fence Giants.  He went on to make a career for himself in the low minors in Wisconsin and Minnesota, usually the only black player on his team, and often the star.  Sadly, some of that paper has been lost, but click for a PDF of the first several pages and the accompanying handout.

Oy Christina Big Band Scores, Part 2

Posted in music by Jeff on August 24, 2011

From 2002 to 2006, I led a 15-piece jazz orchestra.  I wrote charts for the top 40 pop hits of the day, which ranged from swing to funk to outright satire.  You can listen to the full studio album here.  I shared the charts for those songs in an earlier blog post.

I also created an album of “bootlegs” — live versions of songs that didn’t make the cut for the studio album.  You can listen to that album here.  The charts for most of those tunes are below.

I created the individual parts in Lime, a very friendly music notation program available from the Cerl Sound Group.  (You can download it for a free trial.)  I worked from handwritten score sketches; unfortunately I no longer have those to make available.  (And if I did, I’m not sure they’d be usable by anyone but me.)

Most of the charts are scored for 15 pieces: 4 trumpets, 3 trombones, 4 saxes (2 altos, tenor, bari), and rhythm (usually 2 guitars, bass, drums).  Sometimes I only created a single rhythm section part.  Some of the charts were scored multiple times for different size groups; in those cases, the filenames should make that clear.

I can’t speak for the original composers and lyricists of these songs, but you have my permission to use these scores for whatever purposes you want.  If you perform any of them, I want to hear about it–and I want to hear a recording!

Click the links to download a .zip file with the full parts for each chart:

Oy Christina! Big Band Scores

Posted in music by Jeff on August 22, 2011

From 2002 to 2006, I led a 15-piece jazz orchestra.  I wrote charts for the top 40 pop hits of the day, which ranged from swing to funk to outright satire.  You can listen to the full studio album here.

I created the individual parts in Lime, a very friendly music notation program available from the Cerl Sound Group.  (You can download it for a free trial.)  I worked from handwritten score sketches; unfortunately I no longer have those to make available.  (And if I did, I’m not sure they’d be usable by anyone but me.)

Most of the charts are scored for 15 pieces: 4 trumpets, 3 trombones, 4 saxes (2 altos, tenor, bari), and rhythm (usually 2 guitars, bass, drums).  Sometimes I only created a single rhythm section part.  Some of the charts were scored multiple times for different size groups; in those cases, the filenames should make that clear.

I can’t speak for the original composers and lyricists of these songs, but you have my permission to use these scores for whatever purposes you want.  If you perform any of them, I want to hear about it–and I want to hear a recording!

Click the links to download a .zip file with the full parts for each chart:

1879 Chicago White Stockings Notes, from Chicago Tribune and Times

Posted in baseball history by Jeff on August 18, 2011

Several years ago, I did a fair amount of research on the 1879-87 Chicago White Stockings, focusing on Cap Anson. That included a block of time I spent going through the Chicago Tribune from late 1878 and all of 1879.

After the jump, find my notes from that research.  (At the end, there are some notes from the 1879 Chicago Times, as well.)  Passages are almost always verbatim; my personal commentary is indicated by square brackets.  These are exactly as I typed them, which means there are plenty of abbreviations (I hope you can make sense of them; most of them refer to names), and there are even more typos.  Sorry about that.

Since I’ve abandoned my 19th-century baseball research, I hope this can be useful to someone.  See also my file of 1879 White Stockings box scores and my 1879-87 White Stockings notes from the Hall of Fame Library.

(more…)

1879-87 Chicago White Stockings: Hall of Fame Library Notes

Posted in baseball history by Jeff on August 16, 2011

Several years ago, I did a fair amount of research on the 1879-87 Chicago White Stockings, focusing on Cap Anson.  I lucked into a few days at the Hall of Fame Library in Cooperstown, so I dug through several player files.

After the jump, find my notes from those files.  Passages are almost always verbatim; my personal commentary is indicated by square brackets.  These are exactly as I typed them, which means there are plenty of abbreviations (I hope you can make sense of them; most of them refer to names), and there are even more typos.  Sorry about that.

Since I’ve abandoned my 19th-century baseball research, I hope this can be useful to someone.  See also my file of 1879 White Stockings box scores.

(more…)

1879 Chicago White Stockings Box Scores

Posted in baseball history by Jeff on August 10, 2011

As part of a long-dead, unfinished project, I collected all of the box scores of the 1879 Chicago White Stockings.  (I think it’s all of them, anyway.)  I copied them from the Chicago Tribune reports of (usually) the following day.

I’ve zipped all the individual box scores, and now they are available for anyone who may want them.  (I know–it’s difficult to imagine how you lived your life all this time without them.)

Click to download.

Ted Sullivan, Humorous Stories of the Ball Field

Posted in baseball history by Jeff on August 8, 2011

Several years ago, I had some free time and easy access to a well-stocked library.  For some reason, I decided to spend that time transcribing the text of Ted Sullivan’s book, Humorous Stories of the Ball Field.

Sullivan is an important figure in early baseball, especially minor league baseball.  His Wikipedia page barely scratches the surface.  His books aren’t exactly groundbreaking, but as they are first-person narratives of the 19th-century game and its characters, they have some value.

In any event, I never finished the project, though I did transcribe roughly 75,000 words.  There are plenty of typos, and quite a few missing words, as the microfilm I was working from derived from a very worn copy.  This is not a project I’ll ever return to, and since Google Books doesn’t seem to have found Sullivan’s work yet, I’ve posted the full text of what I transcribed after the jump.

Sullivan’s book should now be in the public domain, and I disclaim any rights to any value I’ve added (if, indeed, there is any).  So use it however you want.

(more…)

Head Over to HeavyTopspin.com

Posted in Meta, tennis by Jeff on February 28, 2011

It was only a matter of time before I started a dedicated tennis blog.

My archived tennis studies will remain here, but from now on, I’ll be publishing tennis commentary (and additional research) at my new site, HeavyTopspin.com.

Click on over, tell your friends, and learn more than you ever wanted to know about men’s tennis.

Lefties in Tennis: Doubles and Prize Money

Posted in tennis by Jeff on February 16, 2011

A few days ago, I offered some numbers on the prevalence of lefties in men’s tennis.  It turned out that, in the top 300 of the ATP singles rankings, lefties don’t show up much more than you would expect them to.

A reasonable follow-up question would be: What about doubles?

Being left-handed may not make one a better doubles player, but being left-handed does have the potential to make one part of a better doubles team.  Case in point: Five of the eight doubles teams that earned a spot in the ATP Tour Finals last year were righty/lefty duos, including the top two teams in the year-end rankings.

And indeed, it turns out that left-handers are more prevalent in the top ranks of men’s doubles.  As we’ve seen, in November 2010, five of the sixteen players (31 percent) included in the ATP Tour Finals were left-handed.

The most current ATP doubles rankings tell a similar, if less extreme, story.  Of the top 100 ranked doubles players, 18 are left-handed.  That’s considerably higher than the 12 of 100 at the top of the singles rankings.  (Both top 100s include Rafael Nadal, who plays left-handed but was born right-hand dominant.  These calculations consider him left-handed.)
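As a rough check on how unusual those counts are, the sketch below computes a binomial tail probability. It rests on two assumptions that are not from this post: that roughly 10 percent of the general population is left-handed (a commonly cited ballpark figure), and that each ranked player can be treated as an independent draw from that population. The function name is hypothetical.

```python
from math import comb


def binomial_tail(n, k, p):
    """Probability of k or more successes in n independent trials,
    each succeeding with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


ASSUMED_LEFTY_RATE = 0.10  # assumption, not a figure from this post

doubles = binomial_tail(100, 18, ASSUMED_LEFTY_RATE)
singles = binomial_tail(100, 12, ASSUMED_LEFTY_RATE)
print(f"18 or more lefties in a top 100: {doubles:.4f}")
print(f"12 or more lefties in a top 100: {singles:.4f}")
```

Under those assumptions, the doubles count sits much further out in the tail than the singles count, which matches the contrast drawn above.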

Prize money

The majority of players participate in both singles and doubles, at least on occasion.  To determine some general level of “success” for ATP players, we could look at total prize money.  This weights singles much more heavily.  An advantage is that it is a reasonable measure of a sustainable career in professional tennis.

So, do left-handers have a better chance at making money in tennis than we would expect, given their prevalence in the general population?

It doesn’t look like there is any substantial advantage.  Of the top 100 money-winners, 13 are left-handed, including Nadal.  Those 13 lefties do include four doubles specialists, out of only 13 total doubles specialists in the top 100.

If we go further, we find an additional five lefties from 101 to 150, and six more from 151 to 200.

Left-handers do seem to have a better chance than right-handers of reaching a certain level of success in men’s doubles.  Beyond that, there is little in the way of a handedness advantage.  Whatever the advantages of playing tennis left-handed and the challenges of facing a lefty, they don’t translate into an overwhelming number of left-handers at the top of the professional game, or a disproportionate level of success for left-handed professionals.
