MTG Elo Project - Blog - 2018 March 08

I promised a while ago to talk about the reconstruction effort I underwent to recover Grand Prix Philadelphia 2000. Fair warning: things may get a little technical ahead.

In a round of a typical tournament, three pages of information are generated by the event reporter: a list of pairings at the beginning of the round, a list of the results of each match after all the match results are put into the system, and the standings as of the conclusion of the round in question. For our purposes, it’s the middle one of those three that’s the most useful, because we need two pieces of information for the dataset: (a) who played whom and (b) what the match result was. The results page just tells us that straightaway.

Strictly speaking the results page is a convenience, since the information in it can be reverse-engineered from the rest of the coverage. If you know everyone’s match point total as of round N-1, and you know the pairings for round N, and you can see the standings after round N, then we can figure out the results from round N. A player won her match if her match point total after round N is three more than that after round N-1, lost it if that difference is zero, and drew if the difference is one point. Sometimes the results pages are corrupted in some way (the most typical error is the round N results page being the same document as the round N-1 results page) and I use this method to recover the data for the site.

You can imagine I wasn’t impressed with the coverage page for GP Philadelphia 2000: none of the rounds have a results page at all, and the first time we even see standings is round 6. This means for round 7 onward we can recover the results by using the method outlined above. (Round 6 doesn’t work because I don’t know the starting number of match points — those would be in the round 5 standings page.) Then I crossed my fingers, because sometimes the pairings pages include the MP totals. These don’t. All I knew about the first six rounds are the pairings. Would that be enough to recover the results?

On the face of it that may sound crazy, but there’s reason to believe that there may be enough data here to figure everything out. The results for some people will be immediate from their match point total: if they have 18 match points, they won every match they played and if they have 0 match points they lost them all. This will distribute some losses to people who played the 18MP players and some wins to the people who played the 0MP players. Maybe after that sweep is done we’ll have assigned a loss to some people with 15MP (= 5-1 record), so we’ll know they won all the rest of their matches, or maybe we’ll have uncovered a win to someone with 3MP, so they’ll have had to lose all the rest of their matches. (Note that 3MP could have been a record of 0-x-3 or 1-x, but since we’ve found a win for that person, their quantity of points left to assign is zero.) Then we get to go back and take a second pass, looking for byelines that can be completely filled in. In a perfect world, this initial cascade might fill in all six rounds.

There were 582 people in the tournament, and the successive passes filled in 86, 52, 34, 28, 16, 13, 7, 6, and 2 people, for 244 total. That’s something, but not everything. Most of the other players had some matches filled in, just not all of them. As an example, after my first sweep my Python structure had an entry of the form

         Lowery, Brett  12  ·W··L·  [12, 0, 3, 12, 12, 9]

meaning Brett had 12 MP after round 6, with a win round 2 and a loss round 5 already accounted for. The list at the end stored the match points of all six of Brett’s opponents. The possible results a player could have were W, L, D, B(bye) and X(drop). The pairings pages told me who had a bye in each round, so I at least had that going for me. A player dropped when he stopped appearing in the pairings. Thankfully nobody left and came back somehow.

The goal now was to find ways to get myself “unstuck”. If I could puzzle out an individual player’s results somehow, then we could resume the cascade; even filling in one match might lead to settling a substantial number of players. The big cache of information that I’ve left untouched so far is the fact that the pairings are done by the Swiss system, meaning the identity of your oppoents encoded some information about your record at the time of each match. I’ll try to illustrate with examples some of the techniques I used to tap into that data. I believe the list below is exhaustive in the sense that applying the observations below, together with cascading, was enough to recover all the results.

Look again at Brett’s line above. The win already credited to him round 2 turned out to be against someone who ended the tournament with zero match points. Brett’s round 2 opponent was definitely 0-1 after round one, so if they played each other Brett (almost certainly) was 0-1 himself. Therefore I credited Brett with a loss round 1.

I should address here that there’s of course the possibility of a pairdown. I made the simplifying assumption that there were no three-point pairdowns, since there were always people with draws intervening. For instance, in round two if a pairdown was necessary then there should have been a 3MP-1MP match and a 1MP-0MP match instead of a 3MP-0MP pairing. If this assumption is violated and people can be paired across brackets, I’m afraid what we’re trying to do becomes more augury than science.
Here’s another player’s line:
```
         Magby, Mike  12  B···L·  [xx, 7, 12, 12, 15, 9]
```
Mike’s round two opponent had a draw somewhere in the tournament. But it wasn’t against Mike, because the only way to reach 12MP with a draw is by going 3-0-3. Using this exclusion principle, I checked each person with a draw to see if exactly one of their undetermined matches could have been a draw. Notice that “eligible to have drawn” is something that depends on how many match points are left to be assigned; a hypothetical person with 10MP after round 6 and an uncovered history of ·DL·W· won’t have any more draws, so they definitely didn’t draw with their round 4 or round 6 opponents.

This logic eventually uncovered that two people with 9MP had to have had a draw; they both were 2-1-3 after round six. Thanks for spicing up the project, guys.
Suppose after round six you have P match points and your round six opponent has P-3 match points. Then you won round six. (Again, assuming no three-point pairdowns.) Similarly if you and your round six opponent wind up with exactly the same number of match points as of round 6, then you drew round 6. If the discrepancy between your round 6 total and your opponent’s total is 1 or 2, then there was some sort of a pairdown; I didn’t try to assign a result to a match like that at this stage. This logic applies to Mike Magby (above), who won against his round 6 opponent.
The logic of the previous item can be extended. Suppose you dropped after round 5 with P match points, and your round 5 opponent showed up in the round 6 standings with P+6 match points. Then you played that opponent in a P-P match, the opponent won and went to P+3, and then they won again round 6 and went to P+6. In short, they finished WW, and you ended LX. You can go yet another step here and consider people who dropped after round 4; if their opponents ended round six with nine more points than the dropped player, that opponent finished WWW.

Maybe there was one other item that I’ve forgotten about, but I believe these were the only methods that I used to fill in every result from the first six rounds. I was a little astonished at the end that everything was not only filled in, but also internally consistent; I think that illustrates how much information is already contained in the standings. My goal was to use the lightest touch necessary to recover all the results; I’m sure there’s other ways to draw the same conclusions, but I wanted a set of axioms that would let the rounds fill themselves in as much as possible. This way if something went wrong there would be a more limited place to look for inconsistent hypotheses; this is especially valuable since future deductions depend on previous work. Unfortunately for the other big reconstruction project (GP Kansas City 1999) things need to be done more by hand. More on that job another time.

I should address the question about whether the results I reconstructed are unique, or if there’s some other way to fill in the grid that would assign everyone the appropriate number of match points. This mainly depends on whether there was a three-point pairdown really early in the process, since future deductions are based on previous results. I’d be somewhat surprised if what I came up with wasn’t an exact match to historical fact, or at least was really close to it, so I ultimately decided to include the reconstruction on the site. It would be nice to try to reconstruct the data in a different order to check for discrepancies, but I admit I’m not optimistic that I’m going to have the time or motivation in the near future. If anyone else wants to torture themselves and go through this, though, I’d be happy to compare our results!