2023 November 17 (Adam)

It’s been a while since we started our Patreon, so we’re more than overdue for a status report. (A massive heartfelt thank you to everyone who has contributed!) In my mind I’ve split the work that needs to be done into four major sections:

  1. Data acquisition and reconnaissance. The main tasks here are: determining what events exist, finding stable links to old data, scraping and verifying results for accuracy, and getting everything into a standard format for curating.
  2. Curating. The main task here is to integrate new data with what already exists. Many of the routines need to be updated: the process I had in place for adding new GPs in 2018 more or less works for contemporary events on Melee, but it leaves some manual work that could be streamlined. We’re also adding a LOT of new events, so even an incidental improvement here would lead to immense gains.
  3. Frontend improvements. By this I mean anything that a visitor sees on the site. The main task here is refactoring a lot of our code from PHP into React. Right now the backend and frontend are pretty “rigid”: they were designed for one purpose and that’s all they can do, so it’s a real challenge to add features or rearrange how data is displayed. I want to be able to see a list of Jon Finkel’s top 8s on his match history page, or check a box that filters out GPs or SCGs, or filter the leaderboard to only show players who were active in the last five years, or link to a YouTube playlist of PT coverage, etc. I’m also lumping improvements to the ratings algorithm into this item, but that’s another post unto itself.
  4. Backend improvements. This includes anything that happens with the data behind the scenes, what exactly our stack looks like, etc. I’ll ask Rebecca to talk a bit about this later, since she’s more experienced here than I am.
  5. Public-facing API. I said there were four tasks, right? I’m not counting this as part of the site redesign; the main task here is dialogue with other stakeholders to determine what other people want to build and what endpoints we should expose. But we’re going to make sure that we get things internal to the site to work before we have these conversations. Still, it’s worth pointing out that we’re restructuring the backend with an eye toward making these applications possible.

Of course these are all interconnected in various ways, and work has gone into each of these four items already. I want to spend a bit of time discussing each of them in turn over the next couple of posts. Unimaginatively we’ll start with #1, data acquisition and reconnaissance.


I mention reconnaissance separately from acquisition because, as many of you are no doubt aware, during the pandemic each of Wizards, ChannelFireball, and Star City Games redesigned their sites and unceremoniously dismantled access to coverage of old tournaments. When we started working on the site we assumed that we could generate links to coverage by storing only the event code: for example, Grand Prix Atlantic City 2015 corresponds to “gpatc15”, which was then used to generate the URL http://magic.wizards.com/en/events/coverage/gpatc15. This is no longer adequate to link to source data; all these links broke. Wizards has said that event coverage would eventually migrate to their new architecture, but it’s been nearly a year and I’m not holding my breath. ChannelFireball’s coverage archive is apparently lost forever; I’ve heard that there aren’t backups (though I’d love to be proven wrong here). Star City Games’s archive still exists at old.starcitygames.com and/or static.starcitygames.com if you know where to look, but who knows when the plug will get pulled on that. (I also haven’t been able to find everything there; if you think you’re a good URL hacker and want to poke around in a black hole, talk to me on Discord sometime because I’m looking for some results that I haven’t found yet!)
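The old event-code scheme can be sketched in a few lines of Python. This is only an illustration, not the site's actual code: the function and constant names are made up for this example, and the only thing taken from the post is the URL pattern itself, which no longer resolves.

```python
# Base of the pre-redesign Wizards coverage URLs quoted above (these links are now broken).
OLD_WIZARDS_COVERAGE_BASE = "http://magic.wizards.com/en/events/coverage/"


def old_coverage_url(event_code: str) -> str:
    """Build the old-style coverage link from a stored event code (hypothetical helper)."""
    # Normalize the code so " GPATC15 " and "gpatc15" yield the same link.
    return OLD_WIZARDS_COVERAGE_BASE + event_code.strip().lower()


print(old_coverage_url("gpatc15"))
```

The weakness of this design is that the link is derived rather than stored: once the host changes its URL layout, every generated link breaks at once, which is why an explicit per-event link is needed instead.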

Because we believe strongly that the point of the site is only to mirror data that exists publicly in other places, I’ve put a lot of work into probing what’s been preserved on archive.org’s Wayback Machine. My progress is in a Google spreadsheet at the moment, but at some point this will become the backbone of the events database, and all the links I’ve found there will propagate to the site. Our plan is to make a section of the site for coverage pages for what has been lost, so that we can be that public source for some of the lost data. For example, if you want to see round 2 pairings of GP Austin 2020 (one of the lost ChannelFireball events), that should be on the internet somewhere, and we’ll make sure it is. It also doesn’t sit right with me that a new starry-eyed oaf can’t come along in five years and start their own Elo Project because the data no longer exists. (If you think you might be that oaf, please don’t do it; it’s way too much work.)
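For anyone curious what this kind of probing can look like, here is a minimal Python sketch of checking whether a page has a Wayback Machine snapshot. It assumes the Internet Archive's public availability endpoint (https://archive.org/wayback/available) and the JSON shape it documents, with an `archived_snapshots` object containing a `closest` entry. The helper names are hypothetical, and this is not necessarily the process used to build the spreadsheet.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen

# Internet Archive availability endpoint (assumed here; see the lead-in).
WAYBACK_AVAILABLE = "https://archive.org/wayback/available"


def availability_query(page_url: str, timestamp: Optional[str] = None) -> str:
    """Build the availability query for a page, optionally near a YYYYMMDD-style timestamp."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return WAYBACK_AVAILABLE + "?" + urlencode(params)


def closest_snapshot(payload: dict) -> Optional[str]:
    """Return the closest snapshot URL from a decoded response, or None if there is none."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def find_snapshot(page_url: str, timestamp: Optional[str] = None) -> Optional[str]:
    """Ask the Wayback Machine for the closest snapshot of page_url (makes a network request)."""
    with urlopen(availability_query(page_url, timestamp), timeout=30) as resp:
        return closest_snapshot(json.load(resp))
```

Calling `find_snapshot("http://magic.wizards.com/en/events/coverage/gpatc15", "2015")` would return the URL of the archived copy closest to 2015 if one exists, or `None` if the page was never captured.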

As a part of making the Google sheet, I personally added every missing SCG Open round to the Wayback Machine; if you’re ever looking at data on the Wayback Machine and see a date from 2023, that data was preserved because of me.

Basically every SCG Open is now preserved; I wish I could say the same about GPs, but many were gone before I started this process. Original data from the later part of 2019 and all of 2020 has been lost. (Plus of course the first 40-ish GPs never were on the internet to begin with.) On the bright side, I did unexpectedly find useful data from a 1997 Pro Tour that I haven’t worked with yet, but I will soon.

I’ve received requests to add a bunch of different tournaments to the site, and my answer has typically been “yeah, probably, if the data exists.” I’ve been focused on older events because the web scraping tasks are more challenging, but there are events from all eras that eventually will be included. Here’s my progress on some of these:

  • Individual SCG Opens and Invitationals, both the “one day” (2009-2014) and “two day” (2015-2020) varieties. I have 469 of them all done and ready to be curated; these altogether represent about 739,000 new matches of Magic. (The current database is around 2.7 million matches.) Four weekends (eight events) are missing altogether, but maybe they’re out there somewhere; five other events have data that’s too corrupted to use.
  • National championships. The status of these will probably make you sad; there’s surviving data for a few countries but not many, and not every year. I did find a few of these events (Germany 01-05 and Netherlands 04) from country-specific sources; for example, some Germany nationals results are still on the internet at PlanetMTG.de. Are others out there somewhere? Does anyone familiar with the Magic internet circa 2004 know where to look? As far as I’m aware no other 2017 or 2018 results were ever on the internet, but I’d be happy to be proven wrong. The good news is that all 109 events in the grid here are usable and ready to be curated. These represent another 98,200 matches.

There are plenty of other events I’d like to include, but I only sort of know what’s out there, so let me end with some questions.

  • Is there a list of post-pandemic SCGCon events? Even just a list like “in 2022 there were SCGCons in New Jersey, Indianapolis, and Cincinnati” would be a huge help in knowing that I’ve found everything. My intention is to take just “main events” where I can find them—preferably ones that are big RCQs, since then I can say that contemporary events are being included or not based on whether they fit into the organized play hierarchy.
  • Same question for Legacy EU Grand Open Qualifiers. I believe these all have results on Melee; how many have there been? Where were they?
  • I want to add THE FINALS, the yearly end-of-season invitational tournament in Japan that has run since 1995, but I haven’t had any success finding pairings and results. What appears to be the official page for the 2018 edition, for example, at https://mtg-jp.com/coverage/finals18/ doesn’t have results or pairings. Are they out there on some other site?
  • Was there a Japanese equivalent of the SCG Tour in pre-COVID times?
  • What about in Latin America? Were/are there convention hall-level events that people travelled to besides GPs?

I’d be happy to receive tips on the Patreon Discord; help me figure out what other data belongs on the site! When I write next I’ll be talking about phase two: curating.