The NHL Playoff Model uses team-level and individual player statistics to predict the probability that a given team will win a series (rather than predicting each game individually). In theory, one would expect predicting the winner of a series to be easier than predicting the probability that a team wins a single game, but many things can happen over the course of a series, most significantly injuries, that make the series winner difficult to predict.
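To see why a series might look easier to call than a single game, here is a minimal sketch under a deliberately simplified assumption: every game is independent and the team has the same win probability in each one. This is not how the NHL Playoff Model works (it predicts the series directly), and the function name `series_win_probability` is ours, used only for illustration.

```python
from math import comb


def series_win_probability(p_game: float, wins_needed: int = 4) -> float:
    """Chance of winning a best-of-(2 * wins_needed - 1) series.

    Assumes independent games with a constant per-game win probability,
    which ignores injuries, home ice, and everything else that changes
    during a real series.
    """
    q = 1.0 - p_game
    # The team clinches after losing exactly k games (k = 0 .. wins_needed - 1).
    # comb(wins_needed - 1 + k, k) counts the orderings of the games played
    # before the clinching win.
    return sum(
        comb(wins_needed - 1 + k, k) * p_game**wins_needed * q**k
        for k in range(wins_needed)
    )


if __name__ == "__main__":
    for p in (0.50, 0.55, 0.60):
        print(f"per-game {p:.2f} -> series {series_win_probability(p):.3f}")
```

Under these assumptions, a small per-game edge grows over seven games, which is the intuition behind thinking a series should be the easier prediction. An injury partway through changes the per-game probability, and that is exactly what this toy calculation cannot capture.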
“Buddy Hield is the next Stephen Curry”
“Brandon Ingram is a poor man’s Kevin Durant”
“Andrew Wiggins’ upside is Carmelo Anthony, but his floor is James Posey”
So often when we talk about NBA players, we do it through comparisons to other players, and with good reason: comparisons are a good way to quickly convey lots of information about a player. For example, if I tell you that a player had a Box Plus/Minus of 7.8 last season, you might get a vague idea of how good he is. If I then tell you that player had 19.5 points per game, you might have a slightly better idea, but it’s still far from the full picture. But if I claim that this player is the next Chris Paul, it immediately brings to mind an idea of not only how good he is, but also his strengths, weaknesses, and overall playing style. Maybe your mind also cues up a mental highlight reel of Chris Paul-like plays, for good measure. Comparisons quickly give a complete picture of a player that would otherwise require taking the time to slowly digest each number in his stat line.
Also, they’re lots of fun!
We’ve created our own set of similarity scores, which use math to make these comparisons for prospects coming out of college. Our goal is to produce a useful complement to our PNSP model. Where our PNSP model answers the question, “How valuable will this player be?”, our similarity scores aim to answer the question, “Who will this player be like?”
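As a rough picture of what a similarity score can look like, here is a minimal sketch that standardizes each statistic across a pool of players and ranks the pool by Euclidean distance from a prospect. This is an assumption about one common approach, not a description of our actual scores; the function names and the stat keys are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev


def standardize(players):
    """Convert each stat to a z-score so no single stat dominates the distance."""
    stats = list(next(iter(players.values())).keys())
    scaled = {name: {} for name in players}
    for stat in stats:
        values = [line[stat] for line in players.values()]
        mu, sigma = mean(values), pstdev(values)
        for name, line in players.items():
            # A stat with no spread carries no information, so score it as 0.
            scaled[name][stat] = (line[stat] - mu) / sigma if sigma > 0 else 0.0
    return scaled


def most_similar(prospect, players):
    """Return (name, distance) pairs sorted from most to least similar."""
    scaled = standardize(players)
    target = scaled[prospect]
    distances = {
        name: sqrt(sum((line[s] - target[s]) ** 2 for s in target))
        for name, line in scaled.items()
        if name != prospect
    }
    return sorted(distances.items(), key=lambda pair: pair[1])


if __name__ == "__main__":
    # Made-up stat lines, purely for illustration.
    pool = {
        "Prospect": {"pts": 20.0, "ast": 5.0, "reb": 4.0},
        "Player A": {"pts": 19.0, "ast": 5.5, "reb": 4.0},
        "Player B": {"pts": 10.0, "ast": 1.0, "reb": 11.0},
    }
    for name, distance in most_similar("Prospect", pool):
        print(f"{name}: {distance:.2f}")
```

A smaller distance means a closer comparison. Standardizing first matters because points, assists, and rebounds live on different scales, and an unscaled distance would mostly reflect whichever stat has the largest raw numbers.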
On Sunday, the NFL ended its season with the Super Bowl. As soon as tomorrow, pitchers and catchers will report to Florida or Arizona to begin spring training. And in Minneapolis as I type this, it’s a balmy 42 degrees. I have no doubt that we haven’t seen the last of this Minnesota winter, but nonetheless, I’ll take it as a sign that it’s time to turn our attention to baseball. So let’s talk about pitchers.
This article provides some background on the components of each NFL Playoff Model. By using multiple models built on different modeling techniques and variables, we can better assess each game. For example, if all the models predict that the Pittsburgh Steelers will beat the Miami Dolphins, we can feel very confident in picking the Steelers to win. If the models are split, we can examine each model’s predictors individually to identify where that model may be relying on less-than-perfect information. For instance, if one model emphasizes defensive performance and Seattle is currently missing Earl Thomas, that model may not represent Seattle as accurately as the others. Ultimately, by looking at multiple models we are able to reduce the noise inherent in all predictive models.
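The "do the models agree?" check described above can be sketched in a few lines. This is only an illustration: the function `summarize_models`, the model names, and the idea of averaging the probabilities are our assumptions here, not a description of how the NFL Playoff Models are actually combined.

```python
def summarize_models(probabilities):
    """Summarize several models' win probabilities for the same team.

    `probabilities` maps a (hypothetical) model name to that model's
    probability that the team wins the game.
    """
    picks = {name: p > 0.5 for name, p in probabilities.items()}
    favor_count = sum(picks.values())
    # Ties count as a pick for the team; a real workflow might flag them instead.
    majority_pick = favor_count * 2 >= len(picks)
    return {
        "unanimous": len(set(picks.values())) == 1,
        "average": sum(probabilities.values()) / len(probabilities),
        # Models that disagree with the majority are the ones worth inspecting.
        "dissenters": [n for n, pick in picks.items() if pick != majority_pick],
    }


if __name__ == "__main__":
    # Made-up probabilities, purely for illustration.
    example = {"offense_model": 0.70, "defense_model": 0.40, "overall_model": 0.60}
    print(summarize_models(example))
```

When the result is unanimous, the pick is easy. When it is not, the `dissenters` list points to the models whose predictors deserve a closer look, such as a defense-heavy model during a week when a key defender is out.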
The following is a walk-through of our NBA prospecting model, Peak NBA Statline Projection (PNSP). PNSP is a prospecting tool that synthesizes numerous variables for college basketball players to predict their NBA success. PNSP seeks to project the peak potential success of a college basketball player in the NBA by returning a single rating value (ranging from 0 to 100) that is derived from all available information on a given player.
Over the past three years, I have put together a series of statistical models that predict NCAA Tournament games and, building on each game, the tournament as a whole. The models started as an independent research project I did at St. Olaf College with Dr. Matt Richey. For a given game, the models use each team’s statistics and information from regular season games to predict (1) each team’s win probability for that game or (2) a point spread for that game. I use a handful of different models, so each game will have multiple probabilities or point spreads to consider, and these will not always agree with each other.