2017 June 13 (Adam)

2010 and 2011 were integrated into the system a couple of days ago. This update added 46 tournaments and just short of 170,000 matches, which pushes us over the 300-tournament mark. There are now 1.49 million matches of Magic catalogued here. Based on the data from a couple of posts ago, I think we’re at about 48.6% of all tournaments and maybe 55% of what we can possibly get. I’m hoping to stay on the pace of two years every two months like I’ve done so far, but we do have some behind-the-scenes work to do as well, so we shall see. The next two years total 48 tournaments, but the tournaments are also getting smaller, so my fingers are crossed that the total amount of stuff to swim through will decrease.

Unsurprisingly I’m getting faster and faster at the curating process, but there’s a lot of name-reformatting that has to happen between scraping the raw data and adding it to the site. For 2010-11, the raw data had 17,784 unfamiliar names and the curating process pared that down to 11,521 new people. Naturally I missed some, and some entries also have data from multiple people in them. Still, that reduction is what’s gained from working on the data some.

At the moment K is still 36. I thought K = 30 or K = 32 would give more predictive power, but there’s a baseline level of noise that no value of K seems to mitigate. I thought this came from the fact that people near 1500 are often grossly misrated, but the effect seems to persist even when we only look at people who have already played 25 matches (admittedly a much smaller data set). I’m still messing around with variable-K schemes and other slightly more complicated rating methods, but I’m not going to change anything until I get something that I can confidently say fits the data better than what we have now. Otherwise there’s no reason not to prefer continuity. I put some data together for this update, but I think I can present it better than I am right now, so I’m going to take a few extra days to improve it before I share.
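For anyone who wants to poke at the effect of K themselves, here is a minimal sketch of a standard Elo replay that also scores its own predictions. It is not the site's actual code: the 1500 starting rating, the 400-point scale, the (winner, loser) input format, and the use of a mean squared error (Brier score) as the measure of "predictive power" are all assumptions, and draws are ignored for simplicity.

```python
from collections import defaultdict

START = 1500.0  # assumed starting rating for a new player
SCALE = 400.0   # conventional Elo scale; an assumption, not confirmed by the site


def expected_score(rating_a, rating_b):
    """Probability that A beats B under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / SCALE))


def replay(matches, k=36.0):
    """Replay (winner, loser) pairs in chronological order with a fixed K.

    Returns the final ratings and the mean squared prediction error,
    where each prediction is made before the match result is applied.
    """
    ratings = defaultdict(lambda: START)
    squared_error = 0.0
    count = 0
    for winner, loser in matches:
        p_win = expected_score(ratings[winner], ratings[loser])
        squared_error += (1.0 - p_win) ** 2  # the winner's actual score is 1
        count += 1
        delta = k * (1.0 - p_win)
        ratings[winner] += delta
        ratings[loser] -= delta
    brier = squared_error / count if count else float("nan")
    return dict(ratings), brier


if __name__ == "__main__":
    # Hypothetical three-match history, just to show the call shape.
    history = [("a", "b"), ("a", "c"), ("b", "c")]
    for k in (30.0, 32.0, 36.0):
        _, brier = replay(history, k=k)
        print(f"K = {k:.0f}: mean squared error = {brier:.4f}")
```

Running `replay` over the same match history with different values of K and comparing the error is one simple way to ask which K "fits the data better"; a lower error means the pre-match predictions were closer to the results.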

We still don’t have a big problem with inactive players: it’s true that 22,642 of the 146,738 people in the database haven’t played since before 1/1/13, but only 42(!) of those 22,642 people have a rating of 1700+. (The highest rated such player is Ian Duke, at 1902.) I haven’t figured out exactly what should happen if people play in an event after a long layoff, but that isn’t a big problem either: only about 1,350 people have had a gap of more than 200 tournaments (4-5 years) in their histories, and only 26 of those have a rating over 1700 currently. Probably there should be some sort of regression to the mean or something, but I bet doing nothing is fine too. As an extreme example, Corey Baumeister made the finals of GP Miami 2015 after a 3+ year layoff. We’ll see what happens as we keep going back in time. I just don’t want to discover that a spot in the top 30 is frozen on someone who stopped playing in 2005. At the least we’ll set it up so that the rankings calculation only takes into account people who have played within the last couple of years. There are people like Xu Su out there who don’t play much but are very good when they play. (Finals of consecutive events, two years apart!) (But I assume real life made him decline the PT invites.) I don’t want to cut those sorts of people out due to a hyper-focus on the week-to-week grind of the tour.
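The "only rank people who have played recently" idea can be sketched roughly as below. This is an illustration under stated assumptions, not how the site does it: the `(name, rating, last_played)` tuple layout, the function name `active_rankings`, and the 730-day cutoff standing in for "the last couple of years" are all made up for the example. Ratings themselves are left untouched; inactive players are only hidden from the ranked list.

```python
from datetime import date


def active_rankings(players, as_of, max_idle_days=730, top_n=30):
    """Rank players by rating, skipping anyone idle longer than the cutoff.

    players: iterable of (name, rating, last_played) with last_played a date.
    max_idle_days: assumed cutoff; 730 days approximates "a couple of years".
    """
    eligible = [
        (name, rating)
        for name, rating, last_played in players
        if (as_of - last_played).days <= max_idle_days
    ]
    # Highest rating first; sorted() is stable, so ties keep input order.
    eligible = sorted(eligible, key=lambda entry: entry[1], reverse=True)
    return eligible[:top_n]


if __name__ == "__main__":
    # Hypothetical players, not real data from the database.
    sample = [
        ("Player A", 1900.0, date(2012, 6, 1)),   # long inactive, filtered out
        ("Player B", 1750.0, date(2017, 5, 1)),
        ("Player C", 1800.0, date(2016, 1, 1)),   # plays rarely but recently enough
    ]
    for name, rating in active_rankings(sample, as_of=date(2017, 6, 13)):
        print(name, rating)
```

A cutoff of roughly two years keeps someone who plays a single event every year or so on the list, while a player who stopped in 2005 drops off without their rating being altered.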