Fundamentally, there are two questions that we can answer with statistics: what and why. Generally speaking, it's more fun to focus on the former question. "Who's better?" is a what question, as is "Should we make this trade?" Player rating systems are designed to answer "what" questions, which make up the bulk of the statistical analysis that has reached the mainstream.
Why is important too, however, for a variety of reasons. Explaining not just which player is better, but the reasons behind the distinction, allows us to generalize the results to future situations. It also helps us make more convincing arguments to decision-makers who are understandably skeptical of any analysis that functions as a black box whose method cannot be understood.
One of the most memorable stories written about the early incarnation of adjusted plus-minus, as developed by ratings guru Jeff Sagarin and Indiana University professor Wayne Winston, focused on the impressive rating of then-Washington Wizards reserve Mitchell Butler. Here's Winston's explanation, as quoted by the Washington Times:
"This guy, Butler, I have no idea what he does. I’m sure he doesn’t have flashy stats. But when he’s in, the Wizards play great. What’s his first name? Mitchell?"
It's easy to see why NBA teams weren't necessarily convinced by the results Sagarin and Winston reported. If they couldn't explain them, how could teams do so? As it turned out, their concept of adjusted plus-minus did hold value, but with the benefit of hindsight Butler's rating looks largely like one of the flukes that often occur as part of a noisy process.
At this point in the development of basketball analytics, we have a solid handle on "what." Between the box-score statistics that produce individual player ratings and the various incarnations of plus-minus numbers that measure team impact, player value is assessed more accurately now than ever before. What remains tricky is trying to understand those ratings. Witness last week's column on Ekpe Udoh's unusual combination of sub-replacement individual statistics and elite plus-minus performance. I was able to speculate on explanations, but with limited certainty.
That's where the research presentations at this year's MIT Sloan Sports Analytics Conference, held over the weekend at Boston's Hynes Convention Center, come in. The most interesting of the presentations focused not as much on "what" but "why," using various techniques.
In particular, the optical tracking data provided by STATS Inc.'s SportVU technology was the talk of the conference for the second consecutive year. The winner of the research paper competition, Rajiv Maheswaran (with co-authors Yu-Han Chang, Aaron Henehan and Samantha Danesis), used the STATS data to study how shot location and the positioning of offensive and defensive players affect rebounding.
At this point, we're only beginning to see the fruits of the seemingly endless possibilities offered by optical tracking. Just 10 teams have signed up for SportVU and installed the six cameras that record all player and ball movement within their arenas. What those teams have mined from the data remains secret, and the researchers who have been granted access to the numbers have been limited in terms of what they have been able to study.
Still, Maheswaran's presentation offered a glimpse of how we might someday be able to answer one of the questions posed in my Udoh column: Why does a player who grabs relatively few rebounds have a positive impact on his team's rebound percentage? One of the other presentations at Sloan, in which Allan Maymin, Philip Maymin and Eugene Shen sought to quantify the concept of fit, utilized the emerging data on adjusted four factors. The technique applies the same method as adjusted plus-minus to Dean Oliver's Four Factors, highlighting the specific areas in which no-stats stars like Udoh help their teams.
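The "adjusted" technique the paper borrows can be illustrated in miniature. The sketch below regresses a stint-level outcome on indicators for which players were on the floor, with a ridge penalty to stabilize collinear lineups. Everything here is invented for illustration--the players, the stint data, the penalty, and the gradient-descent fit--so treat it as a toy version of the general adjusted plus-minus idea, not the authors' actual model:

```python
# Toy "adjusted" regression: stint-level outcomes regressed on on/off
# player indicators, fit by gradient descent on a ridge-penalized
# least-squares objective. All names and numbers are invented.

players = ["A", "B", "C", "D"]
# Each row is one stint: +1 if the player was on the floor for the home
# team, -1 for the away team, 0 if off the floor.
X = [
    [ 1,  1, -1, -1],
    [ 1, -1,  1, -1],
    [-1,  1, -1,  1],
    [ 1,  1, -1,  1],
]
# Outcome: home-minus-away defensive rebound % during each stint (made up).
y = [4.0, 1.0, -2.0, 5.0]

lam = 1.0            # ridge penalty: keeps coefficients sane when
                     # lineups overlap heavily
lr = 0.01            # gradient-descent step size
coef = [0.0] * len(players)
for _ in range(5000):
    # Gradient of 0.5*||Xc - y||^2 + 0.5*lam*||c||^2
    grad = [lam * c for c in coef]
    for row, target in zip(X, y):
        resid = sum(r * c for r, c in zip(row, coef)) - target
        for j, r in enumerate(row):
            grad[j] += resid * r
    coef = [c - lr * g for c, g in zip(coef, grad)]

for name, c in zip(players, coef):
    print(f"{name}: {c:+.2f}")
```

Swapping the outcome column from point margin to one of the Four Factors (say, defensive rebound percentage) is what turns plain adjusted plus-minus into an "adjusted four factors" estimate.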
The patron saint of adjusted rebounding is the player to whom I compared Udoh, veteran center Jason Collins. The Maymin-Maymin-Shen team found that Collins increased his team's defensive rebound percentage by more than any other player from 2006-07 through 2009-10 despite never grabbing more than 14.4 percent of available defensive boards (league average for a center is around 22 percent). Subjectively, the explanation for this discrepancy has been that Collins is so good at boxing out he allows his teammates to grab additional rebounds. By looking at player positioning, optical tracking might eventually allow us to quantify how much less likely players boxed out by Collins are to secure offensive rebounds.
I can vividly recall citing Collins to a pair of NBA executives as an example of a player whose impact was larger than indicated by box-score statistics. One of them, despite an interest in statistical analysis, was dubious of his value. (In fairness, by this point Collins was near the end of his run as a productive starter.) Had I been able to offer more specific data, I think I would have improved my chances of making my case. Add in supporting video showing Collins boxing out his man and suddenly the argument becomes far stronger than simply that he must be doing something right. In this case and many others, answering the "why" only strengthens the "what."
More Notes from Sloan
As has been noted elsewhere, a fascinating juxtaposition has emerged at Sloan between the thought-provoking research papers and the better-attended panels that, while entertaining, break little ground in terms of statistical analysis. Having bigger names on panels has proven a mixed blessing. While successful insiders have enlarged the profile of the conference, which continues to break records for attendance, they are often unwilling or unable to provide detailed insight into how they use statistics.
There are exceptions to this rule, certainly. Seattle Sounders FC minority owner Drew Carey was a revelation to a national audience; as a former season-ticket holder, I was already familiar with his concept of a fan council, imported from Europe. Carey's candor and wit made him a breakout star at the conference. Ex-coaches, like ESPN analysts Eric Mangini and Jeff Van Gundy, tend to be forthcoming about what they have learned about using numbers on the sidelines--and when the numbers can't help.
Still, this year I spent more time than ever listening to the research papers and Evolution of Sport presentations. The challenge is for us in the media to perform the role of translating the academic research papers into conclusions and language that can be understood by fans at large and the decision-makers who will be putting them into play. In that sense, I like to think of us as bridging the gap between the smaller rooms with the research presentations and the main ballroom with the broader panels.
In general, I would say I found this year's research papers about 75 percent useful. The two presentations focused on the concept of "fit" were good examples. Both had solid theoretical underpinnings and helped me think about an issue I know is important to evaluating transactions. Still, neither reached the kind of solid conclusions that can actually make a difference in how teams do business.
The Maymin-Maymin-Shen paper came up with interesting findings about which skills tend to amplify each other (like multiple players capable of forcing turnovers) and which have diminishing returns (like multiple players who are good at creating shots). However, since these conclusions were based solely on the paper's model of the game, they lacked the empirical evidence I believe is necessary, especially in the context of my analogy of statistical analysts as historians from last week.
The other paper, by Robert Ayers, was one of several to use clustering to group similar players. Ayers then considered how different combinations of player types tended to perform relative to expectations. Besides the issues created by using the flawed NBA efficiency stat as a measure of a team's inherent talent, the bigger problem is the number of combinations within each team's "big two" or "big three." Ayers found 14 player clusters, which means 196 possible combinations of big twos and 2,744 possible big threes. Since the ABA-NBA merger, there have been 908 NBA teams, so if the combinations are distributed with any degree of randomness, Ayers' results rest on just a handful of teams in each category. That makes it difficult for the findings to reach statistical significance.
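The back-of-the-envelope arithmetic behind that objection is easy to check. The 196 and 2,744 figures correspond to ordered pairings of the 14 cluster types (14^2 and 14^3); counting unordered combinations instead (my assumption, not the paper's stated method) shrinks the totals, but the sample-size problem survives either way:

```python
# Sanity check on the combination counts. 196 and 2,744 are the ordered
# tuples of 14 cluster types; the unordered counts (allowing two players
# of the same type) come from the stars-and-bars formula.
from math import comb

CLUSTERS = 14
TEAMS_SINCE_MERGER = 908

big_twos_ordered = CLUSTERS ** 2               # 196
big_threes_ordered = CLUSTERS ** 3             # 2744

big_twos_unordered = comb(CLUSTERS + 1, 2)     # 105
big_threes_unordered = comb(CLUSTERS + 2, 3)   # 560

# Even under the smaller unordered count, the average big-three
# category holds fewer than two teams.
print(TEAMS_SINCE_MERGER / big_threes_unordered)  # ~1.6 teams per category
```

With fewer than two teams per category on average, most cells in the analysis are empty or near-empty, which is exactly why the significance concern bites.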
A terrific suggestion I heard is that the Sloan Conference ought to form a panel to review the papers when they are initially submitted and offer feedback, much the way the refereeing process works for papers published in academic journals. In practice, the Q&A session tends to offer something similar, but this way the researchers could incorporate the suggestions into their presentations, ideally making them stronger.
Another way to improve the research papers would be stronger subject knowledge. Many of the NBA presentations featured misspelled or mispronounced names. In front of NBA insiders, these kinds of details ought to be correct. Some papers would have benefited from the perspective of NBA experts to help balance out researchers who have the math down cold but are less certain about the basketball aspect.
I thought the rebounding paper was a deserving winner of the competition, but my runner-up would have been Matthew Goldman and Justin Rao on the effects of pressure situations on home and away players in the NBA. They found that as the leverage of the situation increased, home teams tended to do better on the offensive glass--an example of an effort play that can be motivated by the support of the home crowd--but worse at the free throw line, where players have an opportunity to think about their shot and are affected by self-focus and the desire not to let the home crowd down. The practical implications of this study are limited, at least at the team level, but it offered numbers to back up a compelling hypothesis.
I saw two presentations on college hoops power rankings, one explaining ESPN's new Basketball Power Index and the other a research paper by Mark Bashuk utilizing cumulative winning probabilities over the course of a game. I love the latter idea because it both gives credit for close losses and limits the credit earned by blowing out hapless opponents. In practice, however, it offered less predictive power than either the Pomeroy rankings or Sagarin's predictor. For now, nobody has been able to top their ability to assess team performance going forward. But both newcomers are surely much better than the RPI.
Kevin Pelton is an author of Basketball Prospectus.