As a fun little side project this winter, we put together possibly the geekiest thing we’ve done, and that’s including our fantasy sumo league. Using a variant of professional chess’s Elo ratings, we’ve created a power ranking system that we can update day-to-day to help quantify how well wrestlers are doing beyond just their record. Here’s a quick overview of how it works:
This first picture is a partial view of our data entry system (for Kyushu 2017, to be precise). Both axes list the rikishi in makuuchi, and each cell records a bout between the wrestler in that row and the wrestler in that column. The number is the day they fought, and the sign (+/-) and color show whether the wrestler in that row won. In Hakuho’s row, you see a ton of green numbers and 1 red, representing a bunch of wins and 1 loss. With a little bit more programming, this will be the only data entry we have to do and the rest will take care of itself.
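For anyone who wants to play with the same idea, here is a minimal sketch of how a grid like this could be turned into a list of bouts. The dictionary layout, the function name `bouts_from_grid`, and the wrestler names "A", "B", and "C" are all placeholders we made up for illustration; they are not taken from our actual spreadsheet.

```python
def bouts_from_grid(grid):
    """Convert {(wrestler, opponent): signed_day} into sorted (day, winner, loser) tuples.

    A positive day means the row wrestler won; a negative day means he lost.
    Mirrored entries (A vs. B and B vs. A) collapse into a single bout.
    """
    bouts = set()
    for (wrestler, opponent), signed_day in grid.items():
        day = abs(signed_day)
        if signed_day > 0:
            bouts.add((day, wrestler, opponent))
        else:
            bouts.add((day, opponent, wrestler))
    return sorted(bouts)


# Placeholder names and results, not real tournament data.
example_grid = {
    ("A", "B"): 1,    # A beat B on day 1
    ("B", "A"): -1,   # mirrored entry: B lost to A on day 1
    ("A", "C"): -2,   # A lost to C on day 2
}

for day, winner, loser in bouts_from_grid(example_grid):
    print(f"Day {day}: {winner} beat {loser}")
```

Storing the bouts in a set means the mirrored cells in the grid only count once, so each bout gets rated a single time.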
This second picture shows the first few days of our daily output. Within the column for each day you can see the opponent, the opponent’s ranking, the expected win chance, the change in points, and the wrestler’s new rating.
For the nerdiest among you, here is our methodology. We started by assigning 1000 as the baseline rating for an average wrestler. We then gave every makuuchi rikishi a starting rating based on his banzuke rank in January 2017, from 1175 for yokozuna down to 825 for maegashira 16; these starting values were chosen arbitrarily. Here are the formulas we used for the calculations:
This is how we calculated the expected win chance. We take the difference between our wrestler’s rating and his opponent’s, then use a constant, S, to turn that difference into an expected win chance between 0 and 1. A lower value of S means the rating difference has more effect on the expectation. For the first edition of our ratings, we used S = 200.
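As a rough sketch of this step, the snippet below uses the base-10 logistic curve that standard chess Elo uses, with S as the scale constant. That exact curve is an assumption made for illustration (our variant may differ in the details), and the function name `expected_win` is just a placeholder.

```python
def expected_win(rating, opponent_rating, s=200.0):
    """Expected win chance between 0 and 1, assuming a base-10 logistic curve.

    A smaller s makes the same rating gap produce a more lopsided expectation.
    """
    diff = rating - opponent_rating
    return 1.0 / (1.0 + 10.0 ** (-diff / s))


# Evenly matched wrestlers get a 50% expectation.
print(expected_win(1000, 1000))

# The two wrestlers' expectations in the same bout always add up to 1.
print(expected_win(1100, 1000) + expected_win(1000, 1100))
```

Under this assumed curve, a wrestler rated S points above his opponent would be expected to win about 91% of the time.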
This formula takes our wrestler’s previous rating and adds a constant, K, times the difference between reality and expectation. The actual outcome is 1 for a win or 0 for a loss, and we subtract from it the win chance calculated above. K is effectively the maximum point swing for a single bout. We are using K = 20 for now.
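Here is a small sketch of that update as described: new rating = old rating + K × (actual − expected). The function name `update_rating` and the example numbers are placeholders for illustration; the expected win chance is passed in directly so the snippet stands on its own.

```python
def update_rating(old_rating, expected, actual, k=20.0):
    """Return the new rating after one bout.

    expected: win chance between 0 and 1 computed before the bout
    actual:   1 for a win, 0 for a loss
    """
    return old_rating + k * (actual - expected)


# A coin-flip bout (expected = 0.5) moves the rating by half of K either way.
print(update_rating(1000, 0.5, 1))  # win
print(update_rating(1000, 0.5, 0))  # loss
```

Because the expectation is always between 0 and 1, a single bout can never move a rating by more than K points, and a heavy favorite gains very little for winning a bout he was supposed to win.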
This last formula is how we regress to the mean between tournaments. We decided that everyone should be corrected 20% closer to our 1000 baseline, which corresponds to C = 0.8. By periodically bringing everyone closer to the average value, we mitigate the effect of flukes and help nudge each rikishi’s rating closer to a value that accurately represents his talent.
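One way to write this step, assuming the regression is a straight linear pull toward the baseline (baseline + C × distance from baseline), is sketched below. The function name `regress_to_mean` is a placeholder.

```python
def regress_to_mean(rating, c=0.8, baseline=1000.0):
    """Pull a rating toward the baseline, keeping a fraction c of its distance."""
    return baseline + c * (rating - baseline)


# Wrestlers above and below the baseline both move 20% of the way back toward it.
for rating in (1175, 1000, 825):
    print(rating, "->", round(regress_to_mean(rating), 1))
```

With C = 0.8, a wrestler at 1175 would start the next tournament at about 1140 and one at 825 would start at about 860, while a wrestler sitting exactly on the baseline does not move.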
Here is a link to the full spreadsheet if you are interested. We plan to keep it updated as often as possible during each basho. Our next step is a more accessible way for everyone to follow along live. Let us know what you think or if you have any ideas for improvement!