In this essay, reprinted from Pro Basketball Prospectus 2010-11, we explain the method behind the SCHOENE projection system and take a look at how SCHOENE performed last season. For this year's projections, check out the book, which is available now.
When Nate Silver unveiled his PECOTA method for projecting player performance at our sister site Baseball Prospectus in 2004, it set the template for future projection systems on two counts--the use of similarity scores to identify development paths and the use of a former fringe player's name as an acronym. On both counts, Basketball Prospectus' SCHOENE projection system follows in PECOTA's footsteps.
SCHOENE is named for former NBA forward Russ Schoene, who spent four seasons in the NBA in the 1980s, most prominently playing for the Seattle SuperSonics. Like PECOTA, SCHOENE is technically an acronym, standing for Standardized Comparable Heuristic Optimizing Empirical NBA Evolution.
We first introduced SCHOENE to project the results of the 2008-09 NBA season. While the player projection aspect is not entirely unique--ESPN Insider's John Hollinger independently developed a similar projection system--SCHOENE goes a step further by beginning to consider team context. For each team, player usage rates are adjusted (along with efficiency) to replicate the interactions between players in divvying up offensive possessions. Another adjustment handles defensive rebounding because of the tendency for good rebounders to cannibalize defensive boards from their teammates and vice versa.
While SCHOENE's default output is per-possession or per-shot rate stats, it also incorporates team pace to produce complete, realistic stat lines for each player. This is especially useful for creating fantasy projections, since a player's per-game averages will depend in part upon the pace at which his team plays.
Finally, SCHOENE brings it all together to create team stat lines, unprecedented for an NBA projection system. This gives us not only a bottom-line projection of each team's win-loss record but also a sense of how the team will get there and where its projected strengths and weaknesses lie.
At the heart of the SCHOENE system are similarity scores for each player based on 13 statistical categories, standardized for league norms: height, weight, a "shooting" rating (based on 3P%, 3PM/Min and FT%), two-point percentage, "inside" rating (FTA-3PA)/possessions, usage rate, rebound percentage, assist percentage, steal percentage, block percentage, turnover percentage and player winning percentage, the per-minute component of the WARP system.
Like many similarity scores, SCHOENE's are calculated out of 100, with 100 indicating an identical match. A score of 95 means two highly similar players, while 90 is reasonable similarity and anything below that starts to get dicey. The closest match in this year's projections is between Houston Rockets forward Jordan Hill and Kris Humphries, at 99.2. A handful of players, most notably Shaquille O'Neal, did not have a single match of 90 or better.
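To make the idea concrete, here is a minimal sketch of a similarity score on a 100-point scale. The exact SCHOENE formula is not given in this essay, so the Euclidean distance, the `penalty` scale and the function names below are illustrative assumptions, not the system's actual calculation.

```python
import math

def standardize(value, league_mean, league_sd):
    """Express a raw stat as a z-score relative to league norms."""
    return (value - league_mean) / league_sd

def similarity_score(profile_a, profile_b, penalty=1.0):
    """Score two standardized stat profiles out of 100 (100 = identical).

    Assumption: distance is Euclidean and each unit of distance costs
    `penalty` points. SCHOENE's real weighting is not published here.
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same categories")
    distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(profile_a, profile_b)))
    return max(0.0, 100.0 - penalty * distance)

# Hypothetical two-category profiles, already standardized.
print(similarity_score([0.5, -1.0], [0.5, -1.0]))  # identical profiles
print(similarity_score([0.0, 0.0], [3.0, 4.0]))    # distance of 5
```

Under these assumptions, identical profiles score 100 and the score falls as the standardized stats diverge, which matches the behavior described above.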
In general, at least the 50 most similar players of the same age (within six months of the player's age during the season, as with PECOTA) were used to generate each player's 2010-11 forecast. Because the pool of NBA players is smaller (the similarity database dates back only through 1979-80, the first year of the three-point line), very young and very old players have a smaller group of comparables. For eight players whose comparable pools were far too small, an average age adjustment has been applied to their statistics instead.
In addition to using this group of comparable players to project the improvement or decline in each of 14 statistical categories, we also follow PECOTA's lead in generating summary statistics that reflect the variation in each player's projection. Recreated with each player's projection are the familiar Improve/Breakout/Decline percentages, a breakout or coppage (our term for a steep decline) being defined as at least 20 percent improvement or drop-off.
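As a rough illustration of how such percentages can be read off a pool of comparable players, the sketch below counts how many comparables improved, broke out or declined steeply in their following season. The input format and the function name `outcome_rates` are assumptions for the example; only the 20 percent threshold comes from the definition above.

```python
def outcome_rates(changes, threshold=0.20):
    """Summarize how a pool of comparable players fared the next season.

    `changes` holds each comparable's fractional change in performance
    (0.05 means 5 percent better, -0.25 means 25 percent worse).
    """
    if not changes:
        raise ValueError("need at least one comparable player")
    n = len(changes)
    return {
        "improve": sum(c > 0 for c in changes) / n,
        "breakout": sum(c >= threshold for c in changes) / n,
        "steep_decline": sum(c <= -threshold for c in changes) / n,
    }

# Hypothetical pool of four comparables.
print(outcome_rates([0.25, 0.10, -0.05, -0.30]))
```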
The only noticeable change to the player projections this year is the use of D-League translations, as established by a study published on the website last season, to create projections for players who have not seen at least 250 minutes of action at the NBA level but have seen more action in the D-League. In addition to translations for rookies based on NCAA stats and for European players who played in either Euroleague or the EuroCup, this further increases the size of the pool of player projections. This year's book features more than 500 individual projections.
After projections were generated for each player, these were incorporated into a team context. Games played are projected for each player using a baseline estimate of 76 games played. From there, players are penalized one game for each six missed last season (up to a maximum of 10 for players who missed the entire season) and one for each 20 missed two years ago, based upon research done by Houston Rockets analyst Ed Küpfer on projecting games played. We also account for preexisting injuries and suspensions. Playing-time projections are strictly subjective based on each team's projected depth chart.
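The games-played rule described above can be sketched in a few lines. Whether penalties are rounded to whole games is not stated, so the integer division here is an assumption, and the adjustments for preexisting injuries and suspensions are left out.

```python
def projected_games(missed_last_season, missed_two_years_ago, baseline=76):
    """Project games played from recent games missed.

    One game is subtracted per six missed last season (capped at 10)
    and one per 20 missed two years ago. Whole-game penalties via
    integer division are an assumption of this sketch.
    """
    penalty_last = min(missed_last_season // 6, 10)
    penalty_prior = missed_two_years_ago // 20
    return baseline - penalty_last - penalty_prior

print(projected_games(0, 0))    # 76: fully healthy in both seasons
print(projected_games(12, 40))  # 72: two games off for each season
print(projected_games(82, 0))   # 66: missed all of last season, cap of 10
```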
On offense, the only step between individual projections and team totals is the aforementioned usage adjustment. Each player's usage rate is adjusted so that the team as a whole uses only the number of possessions projected based on team pace. There is also a corresponding adjustment to the player's shooting percentages and turnover rates to reflect the inverse relationship statistical analysts have found between usage and efficiency. One percentage point of usage is approximately equal to a point of Offensive Rating.
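A simplified sketch of this kind of adjustment follows. It assumes a single five-man lineup whose usage rates must sum to 100 percent, scales every player proportionally, and applies the trade-off directly to Offensive Rating rather than to shooting percentages and turnover rates. All of those choices, and the name `fit_lineup_usage`, are assumptions made for illustration.

```python
def fit_lineup_usage(lineup, ortg_per_usage_point=1.0):
    """Rescale a lineup's usage rates to sum to 100 and adjust efficiency.

    `lineup` is a list of dicts with 'name', 'usage' (percent) and 'ortg'.
    Assumption: each point of usage a player gives up raises his
    Offensive Rating by `ortg_per_usage_point`, and vice versa.
    """
    total = sum(p["usage"] for p in lineup)
    scale = 100.0 / total
    adjusted = []
    for p in lineup:
        new_usage = p["usage"] * scale
        usage_change = new_usage - p["usage"]
        adjusted.append({
            "name": p["name"],
            "usage": new_usage,
            "ortg": p["ortg"] - ortg_per_usage_point * usage_change,
        })
    return adjusted

# Hypothetical lineup whose raw projections add up to 120 percent usage.
lineup = [
    {"name": "A", "usage": 30.0, "ortg": 110.0},
    {"name": "B", "usage": 25.0, "ortg": 108.0},
    {"name": "C", "usage": 25.0, "ortg": 105.0},
    {"name": "D", "usage": 20.0, "ortg": 112.0},
    {"name": "E", "usage": 20.0, "ortg": 104.0},
]
for player in fit_lineup_usage(lineup):
    print(player)
```

In this example every player gives up some possessions, so each one's projected efficiency rises slightly; a lineup short on usage would show the opposite effect.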
For now, assists are essentially ignored in calculating a team's offensive efficiency, which is based on the sum of player shooting percentages, free throw rates, offensive rebounding and turnover rates.
The projection is more complicated at the defensive end because of the paucity of tracked individual defensive stats. Defensive rebounding, blocks, steals and personal fouls are projected from individual statistics. Defensive rebounding is regressed significantly to league average, a notion supported by studies done by another Rockets analyst, Eli Witus.
Two-point percentage on unblocked shots and non-steal turnovers (as well as other descriptive factors like ratio of three-point attempts to twos) are based on past team performance regressed to league average (by a factor of 25 percent for two-point shots and about 45 percent for non-steal turnovers, which also factor in projected steal rate). This can be problematic in the case of teams that change a high percentage of their personnel or defensive schemes.
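Regression to league average can be sketched as a simple weighted pull toward the mean. The reading of "a factor of 25 percent" as the share of the gap to league average that gets closed is an assumption, as are the sample numbers.

```python
def regress_to_mean(team_value, league_average, factor):
    """Move a team's past value toward league average.

    `factor` is the fraction of the gap that is closed (assumed
    interpretation): 0.0 keeps the team value, 1.0 returns league average.
    """
    return team_value + factor * (league_average - team_value)

# Hypothetical opponent two-point percentage, regressed 25 percent.
print(regress_to_mean(0.50, 0.48, 0.25))
# Hypothetical non-steal turnover rate, regressed 45 percent.
print(regress_to_mean(0.080, 0.070, 0.45))
```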
After making significant changes to the individual projections last season, this year's focus was on revisiting the team projections. The APBRmetrics message board tracked the accuracy of several statistical projections during the 2009-10 season, with SCHOENE placing second in terms of root mean squared error--the square root of the average squared error of the projections, which heavily penalizes wildly inaccurate projections. (Average error, labeled ME, is also included in the table to give a sense for how close most projections came.)
| Projection System | ME | RMSE |
|---|---|---|
| Component Score (Jon Nichols) | 7.21 | 8.73 |
| SCHOENE | 7.84 | 9.44 |
| Simple Rating System (B-R.com) | 7.72 | 9.55 |
| Statistical Plus-Minus (B-R.com) | 7.87 | 9.73 |
| Win Shares (B-Reference.com) | 8.33 | 10.08 |
| eWins (Mike Goodman) | 8.48 | 10.48 |
| NBAPET (Bradford Doolittle) | 8.70 | 11.30 |
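For readers who want to see how the two error measures differ, here is a short sketch. It assumes that ME is the mean absolute error in wins, which the table does not state explicitly, and the win totals are made up for the example.

```python
import math

def mean_abs_error(projected, actual):
    """Average size of the miss, ignoring direction (assumed meaning of ME)."""
    return sum(abs(p - a) for p, a in zip(projected, actual)) / len(projected)

def rmse(projected, actual):
    """Square root of the average squared miss; big misses count for more."""
    squared = [(p - a) ** 2 for p, a in zip(projected, actual)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical projected and actual win totals for three teams.
projected = [50, 40, 30]
actual = [48, 41, 42]
print(mean_abs_error(projected, actual))
print(rmse(projected, actual))
```

Because the errors are squared before averaging, the single 12-win miss pushes RMSE well above the average error, which is why RMSE punishes wildly inaccurate projections.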
Unquestionably, there is room for improvement on last year's projections. While SCHOENE's stunningly optimistic assessment of the Memphis Grizzlies proved fairly accurate and the Eastern Conference race shaped up largely as projected (Cleveland and Orlando at the top, with Boston in the rear-view mirror--at least during the regular season), SCHOENE was too bullish on New Orleans and failed to foresee the rises of the Atlanta Hawks and Oklahoma City Thunder.
Breaking down SCHOENE's accuracy at projecting each of the Four Factors in terms of the correlation between the projection and actual performance provided some interesting results.
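| Stat | ORtg | DRtg | Off. eFG% | Off. OR% | Off. FTM/FGA | Off. TO% | Def. eFG% | Def. DR% | Def. FTM/FGA | Def. TO% |
|---|---|---|---|---|---|---|---|---|---|---|
| Correlation to actual | 0.74 | 0.59 | 0.82 | 0.33 | 0.58 | 0.58 | 0.49 | 0.49 | 0.63 | 0.55 |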
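The correlations above compare each team's projected value with what actually happened. A minimal sketch of that calculation follows, assuming the standard Pearson correlation (the essay does not name the specific measure) and using made-up numbers.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)

# Hypothetical projected and actual Offensive Ratings for four teams.
projected_ortg = [105.0, 108.0, 110.0, 112.0]
actual_ortg = [104.0, 109.5, 108.0, 113.0]
print(round(pearson(projected_ortg, actual_ortg), 2))
```

A value near 1 means teams projected to rank high in a category generally did; a value near 0 means the projection carried little information about the outcome.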
As you might expect, SCHOENE does a better job of predicting offense than defense, where the lack of individual defensive statistics is a major limitation. What is surprising is how good SCHOENE is at projecting team shooting (as measured by eFG%), given that it does not include adjustments for passing or teams' ability to stretch the floor with three-point shooting. While we know these traits are valuable at the team level, adding them to the equation did not significantly improve SCHOENE's accuracy.
On the other hand, SCHOENE was unexpectedly poor at projecting team offensive rebounding in 2009-10. There was no clear pattern along the lines of defensive rebounding, where each team must be brought toward average to account for diminishing returns (that is, good defensive rebounders take some boards away from their teammates in addition to the opposition). Re-running the 2009-10 projection with actual minutes played shed some light on the issue.
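| Stat | Off. eFG% | Off. OR% | Off. FTM/FGA | Off. TO% | Def. eFG% | Def. DR% | Def. FTM/FGA | Def. TO% |
|---|---|---|---|---|---|---|---|---|
| Correlation to actual | 0.82 | 0.33 | 0.58 | 0.58 | 0.49 | 0.49 | 0.63 | 0.55 |
| Correlation w/minutes | 0.79 | 0.51 | 0.69 | 0.57 | 0.50 | 0.60 | 0.53 | 0.69 |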
On both sides of the ball, knowing minutes played made a substantial difference in terms of projecting rebounding. Elsewhere, it provided no improvement and actually hurt projections of team shooting. This might be explained by the way smaller or bigger lineups can affect rebounding. For example, Golden State's projected defensive rebounding was strong because it was assumed that Brandan Wright and Anthony Randolph would be sharing minutes at power forward. Instead, both players were hurt and Don Nelson played smallball much of the year, using Corey Maggette at the four. The Warriors' rebounding tanked as a result.
Ultimately, the only noticeable adjustment made to SCHOENE at the team level was increasing the strength of the regression to the mean on defensive rebounding. Nothing else helped improve SCHOENE's accuracy in predicting what actually transpired in 2009-10. That may change next season. SCHOENE has served up some more surprising results, as you'll see in the book, and whether they hit or miss will help determine future alterations to SCHOENE.
Follow Kevin on Twitter at @kpelton.
Kevin Pelton is an author of Basketball Prospectus.