Predicting the outcome of men’s tennis matches
It’s easy to come up with a rough prediction of the outcome of an MLB game. Take a strength rating for each team (usually winning percentage or some variant thereof) and apply Bill James’s invention, the Log5 method.
There’s no reason something similar can’t be applied to men’s tennis matches. (Women’s tennis matches, too, but the necessary data is more readily available for men’s tennis, and I’m more interested in the men’s game.) We need two pieces of data for such a calculation:
1. A numerical representation of the strength of each competitor. (In baseball, as mentioned, that’s generally a winning percentage.)
2. An algorithm (such as Log5) to translate those two numbers into an expected winning percentage.
I’ve been mulling this over for some time, most of which I’ve spent stuck on #1. I’m convinced that ATP rankings are a flawed representation of each player’s talent. They are probably about right for the top 10 or so, but the deeper you go, the more problems I have with them. Despite some experimentation, though, I haven’t come up with a system that comes anywhere close to replacing them.
So ATP rankings it is. Specifically: ATP ranking points. At the moment, then, Federer’s rating is 12,040, Murray’s 9,610, and so on.
For #2, I saw no reason to reinvent the wheel. While Log5 is designed to work with percentages, a little algebra can shift it to something workable with large integers. The result:
x^e / (x^e + y^e)
x and y are the two players’ ratings, and e is a constant exponent. As it turns out, e = 1 comes close to the best results this formula can produce. That gives us a much simpler formula; a ratio, really:
x / (x+y)
It’s easy to use, and it’s reasonably accurate.
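Here’s a minimal Python sketch of that formula. The function names are mine, chosen for illustration. The `log5` helper is the standard Log5 expression for two winning percentages, and the check at the bottom shows one way to read the “little algebra”: if each rating is treated as odds, a / (1 - a), the simple ratio and Log5 agree. That reading is my assumption, not something spelled out above.

```python
def win_probability(x, y, e=1.0):
    """Chance that the player rated x beats the player rated y."""
    if x <= 0 or y <= 0:
        raise ValueError("ratings must be positive")
    return x**e / (x**e + y**e)


def log5(a, b):
    """Log5 for two winning percentages a and b, each strictly between 0 and 1."""
    return (a - a * b) / (a + b - 2 * a * b)


# Ranking points quoted earlier in the post.
federer, murray = 12040, 9610
print(f"Federer over Murray: {win_probability(federer, murray):.1%}")

# Assumed link to Log5: turn winning percentages into odds, then take the ratio.
a, b = 0.60, 0.45
odds_a, odds_b = a / (1 - a), b / (1 - b)
print(abs(win_probability(odds_a, odds_b) - log5(a, b)) < 1e-12)
```

With the point totals quoted above and e = 1, the first line comes out at roughly 55.6%. The matchup figures listed further down are based on the August 24 rankings, so they need not match this exactly.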
To develop and test this approach, I built a database of every ATP and ITF challenger-level match from 2009 through last week’s results in Cincinnati. I took only those matches involving two players ranked in the top 200. That gave me a group of 2,766 contests. Side note: 1,106 (37%) of those matches were won by the lower-ranked player.
To test a variety of formulae and exponents for the formula above, I sorted those 2,766 matches by expected winning percentage (i.e., the theoretical percent chance that the higher-ranked player would win), then split them into “buckets” of about 100 matches. For each bucket, I calculated the average expected winning percentage and compared it to the actual results of the matches within that bucket.
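The bucketing procedure can be sketched roughly like this. The record layout (favorite’s points, underdog’s points, whether the favorite won) and the function names are assumptions for illustration, and the handful of matches at the bottom are made-up toy data, not the real database.

```python
def win_probability(x, y, e=1.0):
    return x**e / (x**e + y**e)


def calibration_buckets(matches, e=1.0, bucket_size=100):
    """Return (matches in bucket, average expected win %, actual win %) per bucket.

    matches: iterable of (favorite_points, underdog_points, favorite_won).
    """
    rows = sorted((win_probability(f, u, e), bool(won)) for f, u, won in matches)
    buckets = []
    for start in range(0, len(rows), bucket_size):
        chunk = rows[start:start + bucket_size]  # the last bucket may be smaller
        expected = sum(p for p, _ in chunk) / len(chunk)
        actual = sum(won for _, won in chunk) / len(chunk)
        buckets.append((len(chunk), expected, actual))
    return buckets


def mean_gap(matches, e, bucket_size=100):
    """Match-weighted average of |expected - actual| across buckets."""
    buckets = calibration_buckets(matches, e, bucket_size)
    total = sum(n for n, _, _ in buckets)
    return sum(n * abs(exp - act) for n, exp, act in buckets) / total


# Made-up toy data: four coin-flip matches, four at 3000 vs. 1000 points.
toy_matches = (
    [(1000, 1000, True), (1000, 1000, False)] * 2
    + [(3000, 1000, True)] * 3
    + [(3000, 1000, False)]
)
for exponent in (0.5, 1.0, 2.0):
    print(exponent, round(mean_gap(toy_matches, exponent, bucket_size=4), 3))
```

Comparing `mean_gap` across exponents is one simple way to rank candidate values of e. It is a stand-in for eyeballing a graph, not necessarily the comparison I used.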
Here’s a graph of the results:
The buckets are plotted from left to right in order of expected winning percentage. For instance, the 100 matches closest on paper gave the favorites a 50.3 percent chance of winning (on average). As it turned out, 55.4 percent of those matches were upsets. With 100-match buckets, there will be some noise, especially when a difference that looks substantial on the graph only represents 5 or 6 matches.
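To put a rough number on that noise: if every match in a bucket were an independent trial with the same win probability (a simplifying assumption, since real buckets mix slightly different probabilities), the observed win rate would have the usual binomial standard error. The function name below is just for illustration.

```python
import math


def bucket_standard_error(p, n=100):
    """Standard error of an observed win rate over n independent matches."""
    return math.sqrt(p * (1 - p) / n)


se = bucket_standard_error(0.5, n=100)
print(f"{se:.3f}")  # share of matches
print(f"{se * 100:.1f} matches out of 100")
```

For a near coin-flip bucket of 100 matches, that works out to about five matches in either direction, which is the scale of the wobble described above.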
Let’s look at some results, based on the rankings of August 24. We get a “smell test” for the model, plus the fun of some hypothetical matchups.
- Federer vs. Murray: 56.2%
- Federer vs. Tsonga: 77.7%
- Federer vs. Wawrinka: 88.9%
- Federer vs. Kevin Kim (#100 right now): 96.3%
- Nadal vs. Soderling: 79.5%
- Roddick vs. Querrey: 81.3%
- Tsonga vs. Simon: 53.9%
- Verdasco vs. Feliciano Lopez: 73.5%
Here are a few interesting first-round matchups from the U.S. Open draw:
- Mathieu vs. Youzhny: 62.7%
- Djokovic vs. Ljubicic: 90.8%
- Monfils vs. Chardy: 67.8%
- Melzer vs. Safin: 57.6%
After the qualifiers are placed in the draw, I’ll run the numbers on the whole thing. Predictions of this sort aren’t all that interesting when they are based on commonly used rankings (it would take an incredibly stacked draw for anyone other than Federer to be the favorite), but it will be interesting to see who has theoretically tough draws, and what chance each of the top players has of taking the title.
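For the curious, here is one way such a full-draw calculation could work, sketched in Python. It assumes a single-elimination bracket whose size is a power of two, players listed in draw order, and the e = 1 ratio from earlier. The function names and the four-player draw at the bottom are hypothetical.

```python
def win_probability(x, y, e=1.0):
    return x**e / (x**e + y**e)


def title_chances(draw_points, e=1.0):
    """Chance of winning the title for each player, given points in draw order."""
    n = len(draw_points)
    if n == 0 or n & (n - 1):
        raise ValueError("draw size must be a power of two")
    # reach[i] is the chance that player i is still alive entering this round.
    reach = [1.0] * n
    block = 1
    while block < n:
        advance = [0.0] * n
        for i in range(n):
            # Possible opponents this round sit in the neighboring block.
            start = ((i // block) ^ 1) * block
            for j in range(start, start + block):
                p = win_probability(draw_points[i], draw_points[j], e)
                advance[i] += reach[i] * reach[j] * p
        reach = advance
        block *= 2
    return reach


# Hypothetical four-player draw: one strong player and three equal ones.
for points, chance in zip([3000, 1000, 1000, 1000],
                          title_chances([3000, 1000, 1000, 1000])):
    print(points, f"{chance:.1%}")
```

Each round, a player’s chance of advancing is the chance of having reached that round times the chance of beating each possible opponent, weighted by how likely that opponent is to be there. The chances across the whole draw add up to 1.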
Of course, I could write another thousand words about the limitations of a system like this. Here are a few:
- It doesn’t take surface into account.
- It doesn’t consider the difference between best-of-three and best-of-five matches.
- It doesn’t know about injuries.
- Related: it doesn’t consider the quality of recent performance. Sam Querrey’s success of late shows up in his ranking points, but it isn’t weighted any more heavily than if he had enjoyed an equally successful February and March instead. While streakiness has been devalued in the analysis of team sports, it sure seems like an important component in tennis performance.
- It doesn’t consider matchups. Based purely on anecdotal evidence, I’d guess a disproportionate number of outliers come from guys like Karlovic and Isner. Maybe the same is true with unknown wildcards.
- I limited the study to matchups within the top 200. This is a comically extreme example, but what about next week’s matchup between Federer and Devin Britton? Britton hasn’t played much professional tennis, so he has 4 ranking points. That gives Roger a 99.9986% chance of winning. I’d put my money on Roger, but you have to believe that in 1,000 such matches, even Federer would roll an ankle and have to retire at least once.
- Similar to streakiness like Querrey’s current run, what about the momentum that qualifiers carry into a tournament? Alejandro Falla is ranked #159, but after winning three matches in qualifying this week (including two quite handily), it’s tempting to think he is a bigger threat than a similarly-ranked player who was wildcarded into the event.
At the very least, I’d like to extend my study to more matches. The data mining is a bit arduous, but another several thousand matches wouldn’t hurt.
Given the complexity of the ATP ranking system, it’s not an easy thing to tweak. Part of the reason I seek an alternative method of ranking players is that it might be more flexible, admitting adjustments for surface and recent results.