First Server Advantage? – Now With Data!
Last week I wrongly concluded that the first server in a set had a structural advantage. In fact, both the server and returner of the first game begin each set with an equal chance of winning–at least, if the players themselves are of equal skill levels.
But what about in practice?
If the conventional wisdom is to be believed, the first server does have a psychological advantage. If the set reaches 4-4, the first server can hold for 5-4, and then the second server must hold simply to stay alive. If he does, the process repeats itself at 5-5 and 6-5. In other words, in a tight set, the pressure falls disproportionately on the second server.
What the data says
While we have final scores for every ATP match for the last several years, we don’t always know who served first. However, we do have some match-by-match statistics, including the number of service games for each player. When the number of service games is equal, it’s anybody’s guess who served first. But if one player served more games than the other, he must have served first. As it turns out, about half of all matches have an odd total number of games.
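The inference rule can be sketched in a few lines of Python. The function name and the two count arguments are hypothetical stand-ins for whatever the match-stats source provides, and the sketch takes the rule above at face value.

```python
from typing import Optional


def infer_first_server(service_games_a: int, service_games_b: int) -> Optional[str]:
    """Guess who served first in a match from service-game counts.

    Hypothetical helper: the argument names are placeholders, not fields
    from any particular stats feed. It applies the rule in the text as-is.
    Assumption: tiebreaks do not distort the counts. A tiebreak set flips
    the serve order for the next set, so real data may need extra care.
    """
    if service_games_a > service_games_b:
        return "A"
    if service_games_b > service_games_a:
        return "B"
    return None  # equal counts: the first server cannot be determined


# Made-up counts for illustration only.
matches = [(11, 10), (12, 12), (9, 10)]
usable = [m for m in matches if infer_first_server(*m) is not None]
```

Only the matches left in `usable` would feed the analysis; the rest are dropped because the first server is unknown.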
(Is this a biased sample? It could be, but it’s not clear to me that it is. If we were only looking at individual sets, it would be biased: 6-3s, for instance, are more likely to favor the first server. But at the match level, many contests have one set with an odd number of games and one or two with an even number of games, resulting in a score like 7-6(4), 3-6, 6-4. That said, the possibility of a bias can’t be ignored, as we’ll see.)
Let’s dive in. I have stats for 2674 ATP-level matches from 2010. 1316 of those had an odd number of service games, so we can analyze those. That subset of matches gives us a total of 3464 sets, of which 53.2% were won by the player who served first in that set. That’s a substantial edge.
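Knowing who served first in the match is not quite the same as knowing who served first in each set. A minimal sketch of that step follows, assuming serve simply alternates every game across the whole match and that a tiebreak counts as one game in the rotation. The function names and the `(games_a, games_b)` score format are illustrative choices, not the format of the actual dataset.

```python
def first_server_by_set(match_first_server, set_scores):
    """List who served first in each set.

    match_first_server: "A" or "B".
    set_scores: list of (games_a, games_b); a tiebreak set is written
    7-6 or 6-7, so the tiebreak counts as one game in the rotation.
    """
    other = {"A": "B", "B": "A"}
    servers = []
    games_so_far = 0
    for games_a, games_b in set_scores:
        # Serve alternates every game, so the parity of the games already
        # played tells us whose turn it is at the start of this set.
        if games_so_far % 2 == 0:
            servers.append(match_first_server)
        else:
            servers.append(other[match_first_server])
        games_so_far += games_a + games_b
    return servers


def first_server_set_share(matches):
    """Share of sets won by the player who served first in that set.

    matches: iterable of (match_first_server, set_scores).
    """
    won = total = 0
    for match_first_server, set_scores in matches:
        servers = first_server_by_set(match_first_server, set_scores)
        for server, (games_a, games_b) in zip(servers, set_scores):
            set_winner = "A" if games_a > games_b else "B"
            won += server == set_winner
            total += 1
    return won / total


# The sample score from earlier, 7-6(4), 3-6, 6-4, with A serving first.
example = [("A", [(7, 6), (3, 6), (6, 4)])]
```

In that sample match, the 13-game first set hands the first serve of set two to B, and the 9-game second set hands it back to A for the decider.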
I decided to run the same test on the 2010 Challenger tour. There, we have 4695 matches, 2255 of which are usable for this purpose, giving us 5274 sets. You couldn’t ask for anything much more consistent: At this level, the first server won 53.0% of sets.
What about deciders?
If we’re talking about pressure, let’s turn to where there’s some real pressure. Sure, the player who returns first is playing catch-up in every set, but it only really matters in a deciding set.
At the ATP level, the available sample shrinks to 493 matches, of which the first server in the deciding set won 51.7%. Surprisingly, among Challengers, the first server in the final set won only 48.8% of the 797 relevant matches.
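A rough way to judge how much weight these percentages can bear is to ask how many standard errors each one sits from an even split. The sketch below treats every set as an independent coin flip, which is an assumption (sets within a match are not truly independent), and uses only the shares and sample sizes quoted in this post.

```python
import math


def z_score(share, n, null=0.5):
    """Standard errors between an observed share and a 50/50 split.

    Treats the n sets as independent coin flips, which is an assumption.
    """
    standard_error = math.sqrt(null * (1 - null) / n)
    return (share - null) / standard_error


# (first server's share of sets, number of sets), as quoted in the post.
samples = {
    "ATP, all sets": (0.532, 3464),
    "Challenger, all sets": (0.530, 5274),
    "ATP, deciding sets": (0.517, 493),
    "Challenger, deciding sets": (0.488, 797),
}

z_scores = {label: z_score(share, n) for label, (share, n) in samples.items()}
for label, z in z_scores.items():
    print(f"{label}: z = {z:+.2f}")
```

On these figures the two all-sets edges come out several standard errors above 50%, while both deciding-set figures sit within one standard error of an even split, so the Challenger dip below 50% may be nothing more than noise.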
These results, especially the Challenger numbers, are far from intuitive. It turns out that the first server has the most dramatic edge in the first set. In the Challenger dataset, the first server won the first set of a match 61.6% of the time! For the ATP-level matches, the edge goes up to 64.6%. At this point, I have to suspect that the data is biased. A 6-3 first set makes a match more likely to show up in this dataset, and the first server is much more likely to win 6-3.
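The 6-3 claim can be checked with a short calculation under a simplifying assumption that belongs to this sketch, not to the data: both players hold serve with the same probability $p$, independently from game to game. A 6-3 set lasts nine games, so the first server serves the ninth. After eight games each player has served four times, so by symmetry either player is equally likely to lead 5-3; call that probability $X$ (a symbol introduced here purely for illustration). The first server then closes out the set with a hold, the second server with a break:

```latex
\begin{align*}
P(\text{first server wins 6-3})  &= X \cdot p \\
P(\text{second server wins 6-3}) &= X \cdot (1 - p) \\
\frac{P(\text{first server wins 6-3})}{P(\text{any 6-3 set})}
  &= \frac{X p}{X p + X (1 - p)} = p
\end{align*}
```

In this toy model the first server takes a share of 6-3 sets equal to the hold probability itself, which is well above one half at any level where holding serve is the norm.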
What about “tight” sets?
Here’s another possibility: The first server’s supposed edge really doesn’t kick in until the set gets to about 4-4. There may be a bit of a psychological advantage to serving first, but if someone is winning 6-1 or 6-2, it’s for reasons other than the subtle gain conferred by a coin toss.
If we limit ourselves to sets with 10 games or more, we get another wacky result. In ATP matches, the player who serves first wins a “tight” set only 44.0% of the time. In the Challenger dataset, it’s 44.1%. Again consistent, but far from intuitive! I would have expected the opposite.
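One possible reading of this result is structural rather than psychological. The sketch below enumerates every way a set can unfold between two equal players, assuming each service game is held independently with the same fixed probability, and then asks what share of sets with 10 or more games goes to the first server. The model, the function names, and the 0.8 hold rate are all illustrative assumptions, not anything measured from the data.

```python
def set_score_probs(p_hold):
    """Exact score distribution for one set between equal players.

    Keys are (games for first server, games for second server); the key
    (6, 6) means the set went to a tiebreak. p_hold is an assumed,
    constant chance of holding serve, the same for both players.
    """
    probs = {}

    def walk(first, second, prob):
        finished = max(first, second) >= 6 and abs(first - second) >= 2
        if finished or (first, second) == (6, 6):
            probs[(first, second)] = probs.get((first, second), 0.0) + prob
            return
        # The first server serves games 1, 3, 5, ... of the set.
        first_is_serving = (first + second) % 2 == 0
        p_first_wins = p_hold if first_is_serving else 1 - p_hold
        walk(first + 1, second, prob * p_first_wins)
        walk(first, second + 1, prob * (1 - p_first_wins))

    walk(0, 0, 1.0)
    return probs


def tight_set_share(p_hold, min_games=10):
    """First server's share of sets with at least min_games games,
    treating a tiebreak between equal players as a coin flip."""
    won = total = 0.0
    for (first, second), prob in set_score_probs(p_hold).items():
        if first + second < min_games:
            continue
        total += prob
        if (first, second) == (6, 6):
            won += 0.5 * prob
        elif first > second:
            won += prob
    return won / total


share = tight_set_share(0.8)  # 0.8 is an illustrative hold rate, not from the data
print(f"First server's share of sets with 10+ games: {share:.3f}")
```

With a hold rate of 0.8 the sketch puts the first server's share of such sets below one half. The reason is that a 6-4 set always ends on the second server's serve, so it is usually closed out with a hold rather than a break. At least part of the 44% may therefore be a scoring artifact rather than a psychological effect.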
Aside from the plausible aggregate numbers (53% of sets going to the initial server), it’s tough to explain a lot of this. And I’m not sure it’s worth the effort, given the possibility that the dataset is biased. Without better data, it may not be possible to draw any solid conclusions on the advantage gained by serving first.