Model 284 NFL Elo Ratings Methodology

This article dives into the concept of Elo ratings in general and then covers the details of our own methodology for creating Elo ratings for NFL teams. Be sure to follow our Elo ratings and rankings for the 2017 NFL season, which we will be publishing weekly. Our numbers going into Week 1 can be found here.

Introduction

As many of you know, FiveThirtyEight is famous for its use of Elo ratings, and is among the first to develop and explore NFL-tailored Elo ratings. We at Model 284 have taken our own crack at an Elo rating system. But before I go any further into how we calculated our system, I would be remiss not to give a brief background on the history of Elo ratings. An Elo rating system is a method for rating competitors in head-to-head games. First applied to two-player games like chess by Arpad Elo, a Hungarian-born American physics professor, Elo ratings eventually found their way into team sports. In more recent memory, Elo rating systems have been used to rank teams and predict results in the FIFA World Cup, NFL, MLB, NBA, and more. So far, different sports have found differing levels of predictive power in Elo ratings, with the NFL being one sport where Elo ratings have found success.

Calculating Elo Ratings

Elo ratings are theoretically simple. The so-called standard method of calculating and implementing an Elo rating system starts with setting every team or competitor at a starting value: 1500, 100, or any other value, as long as every team starts with the same value. As teams play each other, the expected score of team A ( $E_{A}$ ) and team B ( $E_{B}$ ) can be calculated using the logistic curve with the following formula:

$\displaystyle E_{A} = \frac{1}{1+10^{(ELO_{B}-ELO_{A})/400}}$

$\displaystyle E_{B} = \frac{1}{1+10^{(ELO_{A}-ELO_{B})/400}}$
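As a minimal sketch of the expected-score calculation, here it is in Python. It uses the standard convention in which the exponent is $(ELO_{B}-ELO_{A})/400$ for team A, so the higher-rated team gets the higher expected score. The function name `expected_score` and the sample ratings are ours, chosen purely for illustration.

```python
def expected_score(elo_a: float, elo_b: float) -> float:
    """Expected score of team A against team B on the logistic curve."""
    return 1.0 / (1.0 + 10 ** ((elo_b - elo_a) / 400))


# Illustrative ratings only: a 1600 team hosting a 1500 team.
e_a = expected_score(1600, 1500)
e_b = expected_score(1500, 1600)

# The two expected scores always sum to 1.
print(round(e_a, 3), round(e_b, 3), round(e_a + e_b, 3))
```

With a 100-point edge, team A is expected to score roughly 0.64 and team B roughly 0.36, and two equally rated teams each get exactly 0.5.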

After the game is played, each team’s rating is adjusted in proportion to the difference between its actual score ( $S_{A}$ , which is 1 for a win, 0.5 for a tie, and 0 for a loss) and its expected score, meaning underdogs receive a larger bump to their rating for a win than favorites would. The last important component is the K-factor, a constant term that can be adjusted to determine how drastically each new game changes a team or competitor’s Elo rating. The higher the K, the higher the impact of each individual game or match on a team’s rating. The calculation for Team A’s new Elo rating would be:

$\displaystyle ELO'_{A} = ELO_{A}+K(S_{A} - E_{A})$

and Team B’s rating would be calculated as

$\displaystyle ELO'_{B} = ELO_{B}+K(S_{B} - E_{B})$
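The update rule can be sketched in a few lines of Python. This is an illustration of the standard method only; the function names are ours, and `k=20.0` is just a placeholder value for the K-factor.

```python
def expected_score(elo_a, elo_b):
    """Expected score of team A (higher-rated team gets the higher value)."""
    return 1.0 / (1.0 + 10 ** ((elo_b - elo_a) / 400))


def update_ratings(elo_a, elo_b, score_a, k=20.0):
    """Return the new (elo_a, elo_b) after one game.

    score_a is 1 for a team A win, 0.5 for a tie, 0 for a loss.
    k is a placeholder K-factor chosen for illustration.
    """
    e_a = expected_score(elo_a, elo_b)
    e_b = 1.0 - e_a
    score_b = 1.0 - score_a
    new_a = elo_a + k * (score_a - e_a)
    new_b = elo_b + k * (score_b - e_b)
    return new_a, new_b


# An underdog win moves the ratings more than a favorite win.
upset = update_ratings(1400, 1600, 1.0)  # 1400-rated team beats 1600-rated team
chalk = update_ratings(1600, 1400, 1.0)  # 1600-rated team beats 1400-rated team
print(upset[0] - 1400, chalk[0] - 1600)
```

In this example the underdog gains about 15 points for its win while the favorite gains only about 5 for its win, and in both cases the loser gives up exactly what the winner gains.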

As teams play each other throughout the season, ratings will adjust, and teams will begin to separate themselves based on their performances. Thus, Elo ratings provide a good ranking that tracks performance over time.

FiveThirtyEight NFL Elo Ratings

To apply this system to the NFL, FiveThirtyEight has made some enhancements to the standard Elo ratings. You can read in depth about FiveThirtyEight’s Elo Ratings here, but I will give a quick overview of a few of the key changes below.

1) Theoretically speaking, Elo ratings are continuous over time and do not distinguish between seasons. HOWEVA, Nate Silver and the boys and girls at FiveThirtyEight do incorporate season-to-season differences, as they regress each NFL team’s Elo rating to the mean by 33% after each season. Since we all love math (or at least, I do), here is the formula for calculating an NFL team’s current year (cy) Elo Rating from their previous year (py) Elo Rating:

$\displaystyle ELO_{cy} = ELO_{py}*\frac{2}{3}+1500*\frac{1}{3}$

2) Next, FiveThirtyEight implemented a margin-of-victory multiplier that discounts margin of victory in blowouts (e.g., a 40-point win won’t count much differently than a 30-point win). Effectively, they apply diminishing marginal returns to point differential, as calculated below:

$\displaystyle \text{Margin of Victory Multiplier} = \ln{(|\text{Point Diff}|+1)} * \frac{2.2}{(|ELO_{A}-ELO_{B}|*.001+2.2)}$

3) Lastly, FiveThirtyEight identified a K-Factor of 20 as an accurate adjustment for NFL games.
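The three changes above can be sketched together in Python. This is a rough illustration rather than FiveThirtyEight's actual code: the function names are ours, and `update_winner` assumes the multiplier simply scales K in the standard update, which this article does not spell out.

```python
import math

K = 20.0  # K-factor quoted above


def regress_to_mean(elo_prev, mean=1500.0, fraction=1 / 3):
    """Offseason adjustment: pull last season's rating toward the mean."""
    return elo_prev * (1 - fraction) + mean * fraction


def mov_multiplier(point_diff, elo_a, elo_b):
    """Log of the margin, damped when the two ratings were far apart."""
    margin_term = math.log(abs(point_diff) + 1)
    gap_term = 2.2 / (abs(elo_a - elo_b) * 0.001 + 2.2)
    return margin_term * gap_term


def update_winner(elo_w, elo_l, point_diff, k=K):
    """Assumption: the multiplier scales K in the standard Elo update."""
    expected_w = 1.0 / (1.0 + 10 ** ((elo_l - elo_w) / 400))
    shift = k * mov_multiplier(point_diff, elo_w, elo_l) * (1.0 - expected_w)
    return elo_w + shift, elo_l - shift


print(regress_to_mean(1650))  # a 1650 team is pulled a third of the way to 1500
print(mov_multiplier(30, 1500, 1500), mov_multiplier(40, 1500, 1500))
print(update_winner(1500, 1500, 7))
```

Because of the logarithm, going from a 30-point win to a 40-point win adds far less to the multiplier than going from a 3-point win to a 10-point win.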

Model 284 NFL Elo Ratings

So finally, what you have all been waiting for: Model 284’s Elo ratings. While our Elo ratings remain fundamentally similar to FiveThirtyEight’s, below are a few significant adjustments—or dare I say, enhancements—to FiveThirtyEight’s Elo Rating calculations:

1) Our Elo Ratings are not continuous over time. Instead, they move on a four-year rolling basis. We make this adjustment due to the nature of the NFL, where careers are short and most franchises have had significant changes every 3-5 years. Heck, some have significant changes every year. As an example of how this works: in order to calculate 2017 preseason Elo ratings, we start the ratings four years ago (2013) and build them over the next 4 years, rather than going all the way back to 9999 BC. In that starting year (2013 in this case), all teams are set to 1500 and adjusted weekly all the way up to the current point in time (2017 Week 1).

2) We have adjusted FiveThirtyEight’s Margin-of-Victory Multiplier to include turnover differential and yard differential in the multiplier. By accounting for turnover and yard differential, we can reward teams for more convincing or efficient wins, as we all know that the “better” team doesn’t always win in the NFL. Here is our Margin of Victory, Turnover, and Yard Differential Multiplier, denoted as $M$ :

$\displaystyle M = e^{\frac{1}{5}(TOVDIFF)} * \frac{Yards_{A}}{(Yards_{A}+Yards_{B})} * \ln{(|\text{Point Diff}|+1)} * \frac{2.2}{(|ELO_{A}-ELO_{B}|*.001+2.2)}$

3) The NFL has significant turnover in roster every year, some teams more than others. Also, changes at certain positions have a bigger impact than others. Head coaching and, most importantly, quarterback changes have the most significant impact on year-over-year performance. Therefore, we regress teams that have had both quarterback and head coach changes 50% to the mean (rather than the standard 33%). The fact of the matter is that new head coaches often implement significant personnel changes, so how much weight should we really give the team’s previous season? If the head coach remains the same, but the quarterback changes, we make an adjustment of 40% to the mean (rather than 33%), because the quarterback position has proven to be of the utmost importance. Lastly, if the head coach and quarterback remain the same, we only regress a team to the mean by 20% (rather than 33%). You will notice that the teams that remain successful year after year have continuity at the quarterback and head coach positions, e.g., New England Patriots (Tom Brady and Bill Belichick), Green Bay Packers (Aaron Rodgers and Mike McCarthy), Pittsburgh Steelers (Ben Roethlisberger and Mike Tomlin), and Jacksonville Jaguars (Blake Bortles/Chad Henne/TBD and Doug Marrone)…. just kidding.
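For the curious, the three adjustments above can be sketched in Python. Treat this as an illustration under stated assumptions, not our production code. All function and parameter names are hypothetical. We assume `tov_diff` is signed from team A's point of view. The case where the head coach changes but the quarterback stays is not covered above, so the sketch falls back to the standard 33% there as a placeholder.

```python
import math


def seasons_in_window(target_season, window=4):
    """Seasons replayed from a 1500 start to build preseason ratings."""
    return list(range(target_season - window, target_season))


def m284_multiplier(tov_diff, yards_a, yards_b, point_diff, elo_a, elo_b):
    """M = turnover term * yard share * log margin * rating-gap damping."""
    turnover_term = math.exp(tov_diff / 5)  # assumed signed for team A
    yard_share = yards_a / (yards_a + yards_b)
    margin_term = math.log(abs(point_diff) + 1)
    gap_term = 2.2 / (abs(elo_a - elo_b) * 0.001 + 2.2)
    return turnover_term * yard_share * margin_term * gap_term


def regression_fraction(same_coach, same_qb):
    """Share of the gap to the mean that is removed in the offseason."""
    if same_coach and same_qb:
        return 0.20
    if same_coach:
        return 0.40  # coach stayed, quarterback changed
    if not same_qb:
        return 0.50  # both changed
    return 1 / 3  # coach changed, QB stayed: placeholder, not specified above


def preseason_elo(elo_prev, same_coach, same_qb, mean=1500.0):
    fraction = regression_fraction(same_coach, same_qb)
    return elo_prev * (1 - fraction) + mean * fraction


print(seasons_in_window(2017))
print(m284_multiplier(2, 400, 300, 10, 1550, 1450))
print(preseason_elo(1650, True, True), preseason_elo(1650, False, False))
```

For a team that finished at 1650, full continuity keeps it near 1620 going into the next season, while a new coach and new quarterback drop it to about 1575.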

Be sure to follow our Elo Ratings this NFL Season, and always remember: as the father of modern graph theory, Frank Harary, once said, “If we know the future, it will be here now.”

Model 284

