The following is a walk-through of our NBA Prospecting model called Peak NBA Statline Projection (PNSP). PNSP is a prospecting tool that synthesizes numerous variables for college basketball players to predict their NBA success. PNSP seeks to project peak potential success of a college basketball player in the NBA by returning a single rating value (ranging from 0 to 100) that is derived from all available information on a given player.
PNSP first uses season-long college basketball box score statistics, team-level college statistics, physical measurements, high school scouting rankings, position, and years played in college to predict a player’s fourth-year NBA statline.1 We focus on a player’s fourth year in the NBA because a player has generally reached, or come close to, his maximum production by that season. The same holds for later seasons, but the later the year we choose, the less data we have to build the model. PNSP predicts each basic box score statistic (per 36 minutes) using an ensemble of regression and machine-learning techniques.
We then transform this statline into one comprehensive score. To explain, let’s show an example: predicting points per 36 minutes. PNSP predicts every single player’s fourth-year points per 36 minutes in the NBA using an ensemble of regression and machine-learning techniques, then scales every single player’s prediction relative to their position. This returns a z-score for each player based upon their predicted points (relative to players of the same position).
The rest of the predicted box score statistics are calculated similarly. Next, we sum these standardized values, returning a value metric which represents overall contribution relative to position. Finally, we transform this cumulative standardized value metric to a percentile by comparing it to metrics of historical draftees. If you would like to know more about the mathematics and statistics behind the calculation, feel free to contact us with any questions.
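The scale-and-aggregate steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not PNSP's actual implementation: the function names, the dictionary layout, the use of the population standard deviation, the plain unweighted sum, and the sample numbers are all hypothetical. The predicted per-36 values are taken as given, since the ensemble that produces them is not detailed here.

```python
from bisect import bisect_right
from collections import defaultdict
from statistics import mean, pstdev


def zscores_by_position(players, stat):
    """Return {name: z-score of `stat` relative to players at the same position}."""
    groups = defaultdict(list)
    for p in players:
        groups[p["position"]].append(p[stat])
    scores = {}
    for p in players:
        values = groups[p["position"]]
        mu, sigma = mean(values), pstdev(values)  # population std: an assumption
        scores[p["name"]] = 0.0 if sigma == 0 else (p[stat] - mu) / sigma
    return scores


def value_metric(players, stats):
    """Sum each player's position-relative z-scores across all listed stats."""
    totals = {p["name"]: 0.0 for p in players}
    for stat in stats:
        for name, z in zscores_by_position(players, stat).items():
            totals[name] += z  # unweighted sum: an assumption
    return totals


def percentile_rating(value, historical_values):
    """Percent of historical value metrics at or below `value` (0 to 100)."""
    ordered = sorted(historical_values)
    return 100.0 * bisect_right(ordered, value) / len(ordered)


# Hypothetical predicted per-36 statlines (two stats only, for brevity).
players = [
    {"name": "A", "position": "G", "pts": 20.0, "ast": 6.0},
    {"name": "B", "position": "G", "pts": 10.0, "ast": 4.0},
    {"name": "C", "position": "C", "pts": 18.0, "ast": 2.0},
    {"name": "D", "position": "C", "pts": 12.0, "ast": 1.0},
]
totals = value_metric(players, ["pts", "ast"])

# Hypothetical value metrics for past draftees.
historical = [-2.0, -1.0, 0.0, 1.0, 3.0]
rating = percentile_rating(totals["A"], historical)
print(totals, rating)
```

In this toy data, players A and C receive the same value metric even though A is predicted to score more, because each is measured only against his own position group. That is the same property that makes cross-positional comparisons weaker than within-position ones.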
Is this a good measure of NBA success? Yes. We applied a similar scale-and-aggregate technique to 2016 NBA statlines to obtain values based on actual stats, rather than predictions. Some of the top players for the 2016 NBA season were Kevin Durant, DeMarcus Cousins, James Harden, Stephen Curry, LeBron James and Russell Westbrook, all with values of 99. Scores from this procedure positively correlate with other all-encompassing statistics such as Box Plus Minus and Value Over Replacement Player.2
Why not just predict Value Over Replacement Player, Box Plus Minus, etc.? In our investigation, predicting these all-encompassing metrics directly produced extra noise and was less accurate at projecting NBA success for college players. Predicting individual box score statistics instead reduces some of the noise seen when predicting a single all-encompassing metric. Additionally, our method has the potential to offer insight into which statistics players project to be best at in the NBA.
In just using predictions of box-score statistics, PNSP remains theoretically simple while also providing useful predictive power. It incorporates all facets of a player’s game, capturing all types of skill, while still providing a single aggregate value for interpretation.
Our data set consists of players who entered the league between 1997 and 2016. International players, high school players, and players with incomplete college statistics are excluded from the data. Since record keeping has become more reliable in recent years, our data contains more players from recent draft classes. This is only an issue in that it limits the size of our training data; skewing toward recent seasons may actually help to better capture current trends in the NBA.
Additionally, we still do not have one of the most significant predictors of success: competitiveness and mental makeup. This is extremely difficult to measure; maybe if we had a record of every single game a player has played in one-on-one, two-on-two, or a Thanksgiving five-on-five pickup game, we could get a measure of competitiveness, but so far we haven’t found a reliable source for this data.
PNSP also emphasizes skills based on how readily they show up in box score stats. Shooting is weighted heavily, for example, because there are many ways to account for it in the box score. Defense, on the other hand, is less emphasized, and so may be undervalued in our model.
Finally, since PNSP scores are scaled relative to position, it is weaker at cross-positional comparisons than comparisons of players with the same position.
Historically, PNSP has hit on a number of prospects and missed on a few as well. PNSP has correctly identified superb talents such as Kyrie Irving (99.7), John Wall (97.3), DeMarcus Cousins (96.2), Kevin Durant (93.0), Karl-Anthony Towns (93.4) and Kevin Love (99.1). PNSP has also identified certain successful second/late first-rounders as top players in their respective draft classes, such as Draymond Green (76.1), Jae Crowder (81.6), Hassan Whiteside (82.1), and Kawhi Leonard (75.3). Also, PNSP has made the correct choice between some of the top picks in the draft. For example, PNSP preferred Stephen Curry (76.3) over Jonny Flynn (39.5), and projected Markieff Morris (78.3) as a better prospect than his twin brother Marcus Morris (40.2).
PNSP has also had a few misses. Understanding these misses can help us to identify the strengths and shortcomings of the model. First and foremost, PNSP had Michael Beasley (99.4) as one of the top prospects of all time. It is easy to look back and see that clearly Michael Beasley did not have the mental makeup to be a star in the NBA, and again, PNSP does not capture that. Another glaring miss was Anthony Davis*** (67.6). While it still had Anthony Davis as a good prospect, PNSP did not predict Anthony Davis to be a star. This is due to PNSP seeing limited shooting potential with Anthony Davis and not fully capturing potential defensive contribution.
How to use PNSP
PNSP offers an objective, comprehensive view of NBA prospects, and though it has biases and limitations, it can be a useful piece of the player projection puzzle. When using PNSP to make draft decisions, one should consider each player’s PNSP value together with the factors that are not accounted for in the PNSP rating.
As an example, let’s say we were advising the Timberwolves heading into the 2015 draft. The Wolves have the first overall pick, and are debating between Jahlil Okafor and Karl-Anthony Towns (and Flip Saunders has been a big Okafor fan for a while). PNSP scores Towns at 93.4, one of the best marks in our database, while giving Okafor a score of 78.7, which is solid but unspectacular. PNSP sees Towns as the superior player, but that rating should be interpreted along with an examination of mental makeup, potential fit within the Timberwolves’ team, a subjective assessment of defensive potential, and the fact that Okafor lacks a pulse. PNSP’s preference for Towns over Okafor should be a supporting piece of the Towns-over-Okafor narrative.
1. An NBA statline as constituted by PNSP includes points, assists, steals, blocks, total rebounds, two-pointers made, two-pointers attempted, three-pointers made, three-pointers attempted, free-throws made, and free-throws attempted—in short, classic box score statistics.
2. We see a correlation value of .69 with Box Plus Minus, .68 with Value Over Replacement Player, and .87 with Player Efficiency Rating.
***EDITED: After recalibrating our model, Anthony Davis rates at 85.3.