(Reprinted from the College Basketball Prospectus 2012-13, now available as a PDF for your iPad or as a paperback on Amazon.)
Tom Hagen: His medical condition is reported as terminal. He's only gonna live another six months anyway.
Michael Corleone: He's been dying of the same heart attack for 20 years.
A couple years ago I started asking people in basketball if they could tell me who invented the Ratings Percentage Index. Invariably the answer I received was "the NCAA," but the NCAA's not a person. I wanted a name.
No one knew that name, and, for one of the few times in my adult life, Google and Wikipedia were no help either. So I looked into the question myself, and learned that an NCAA staffer named Jim Van Valkenburg formulated the RPI in the fall of 1980. Last February, after I attended the NCAA's annual mock selection exercise in Indianapolis, I wrote about Van Valkenburg's creation in a piece I titled, "The RPI's Birth, Triumph, and Encirclement."
By "triumph" I meant not merely that the RPI has survived far longer than one might have expected. I meant additionally that the original impetus behind the RPI -- giving the committee the knowledge it needs, even if that means inventing something entirely new -- required vision, a truly open institutional mind, and an extraordinary amount of effort.
At a time when very few games were televised and college basketball fans had little to go by other than final scores, the NCAA unquestionably advanced the committee's knowledge through the creation of the RPI. That time, however, has long since passed. The problem with the NCAA isn't that it invented the RPI. The problem with the NCAA is that it stopped doing things like inventing the RPI.
Van Valkenburg passed away in 1995, but I was able to meet with his son:
"If your dad were sitting here right now," I say, "and I told him his RPI is still being used in 2012 even though a goodly number of outside observers think there are better metrics, what would he say?"
Van Valkenburg's son doesn't hesitate. "Dad was a realist. I think he'd say, 'If there's something better, use it.'"
There's something better, but it would be a mistake to classify this as merely a question of competing rating systems. We do the game no great favors, surely, if we replace an untrustworthy rating system that plays too large a role in college basketball with a trustworthy rating system that plays too large a role in college basketball. It would be better to unseat the hegemonic and flawed rating system we have, shrink the job description, and create an index composed of several reliable and mutually correcting rating systems. Out with the erratic and mercurial Czar, in with a newly constituted cabinet populated by steady, boring, straight-arrow types.
That a rating system should measure performance with reasonable accuracy, that we should be most suspicious precisely when that system's outputs veer wildly from those of other systems, that programs should be evaluated on basketball performance and not on some intrinsically unstable compound of performance plus schedule-based happenstance -- such assertions are surely innocuous to the point of banality. And while it's true the NCAA has on occasion viewed innocuous banalities as radical notions, I am cautiously optimistic that better and more accurate days lie ahead for college hoops. I have my reasons.
Strange bedfellows and encouraging signs
For many years and indeed continuing well into the present century, the most common criticism of the RPI was not that it was some pitiably crude antique but rather the opposite, that it was just some weird number cooked up in a lab somewhere by a bunch of NCAA brainiacs who probably didn't know anything about "real" basketball. One or two prominent commentators still speak of the RPI in this manner, and if sports were as important as politics these commentators would doubtless be labeled paleos. So it is that today the NCAA's besieged metric is encircled by an unusual alliance, one forged between paleos who think evaluating teams should be more or less stat-free and those of us who think the process should simply be informed by better stats.
Thus far the NCAA has pleased neither camp, of course, but there are at long last some hopeful straws in the wind coming from Indy. This summer the NCAA had talks with at least one purveyor of a superior metric, discussions that included analysis of said metric's predictive capability in NCAA tournament games. Nothing has come of those talks yet, and maybe in the end nothing will issue directly from this particular event. Nevertheless, this is precisely the kind of discussion the NCAA should be open to.
In addition, the NCAA is now at least saying some of the right things. "The committee does reference other computer rankings," NCAA associate director of men's basketball David Worlock remarked last February, "and it is noted when there are significant discrepancies between the RPI and other rankings." The part about noting significant discrepancies, if true, would be fantastic, and indeed would be the next best thing to retiring the RPI altogether.
Call it lip service if you wish, but saying the right things is important. It's often the penultimate step before doing the right things, and in the NCAA's case saying the right things constitutes a new development. For years the things the NCAA said about their RPI were arguably worse than the metric's actual impact on the selection and seeding of the tournament field. The official version long promulgated from HQ held that the NCAA was truly "open" to other metrics but, lo and behold, the RPI had managed to go 32-0 in a free and unbiased annual competition where all possible analytic measures were considered.
Instead the NCAA could have spent those years saying, "Look, we have the Internet in Indianapolis too. We understand the RPI's inferior to other rating systems. But what you have to understand is that performance isn't our only criterion. The RPI is ours. We're proud of it, we're comfortable with it, and over the course of three decades we've learned how to work around it and still craft brackets that even the RPI's critics admit are by and large pretty well constructed."
If the NCAA had said words to this effect, their stance would be no less questionable in terms of basketball analysis, but their credibility as an organization would most certainly have improved. Even today it's commonly assumed by the hoops fan in the street that if the NCAA's still using the RPI it must be because they don't understand there's a problem.
Nothing could be further from the truth. NCAA staffers are well aware of the RPI's deficiencies, and on occasion they can take pugnacious delight in deploying their best forensic weapons on behalf of their beleaguered metric. Call me a dreamer, but I'm chalking that up as still another good sign. Such a stance is far more pliant than either simple incomprehension or blind faith would be. It suggests maybe we'll all be chuckling about this in the past tense over a beer sometime soon.
Then again last March the committee gave at-large bids to Southern Miss and Colorado State, teams that, as we'll see, occasioned two of the largest rating-system "discrepancies" that the NCAA says it's now noting. So, no, don't pour that beer just yet.
Rating sports teams and rating basketball teams
Last spring Nate Silver wrote about the NCAA's use of the RPI, and somewhere at the New York Times there's an editor who deserves a big pat on the back for coming up with this headline: "NCAA Builds Solid Brackets from a Shaky Foundation."
That about sums it up. The committee that selects and seeds the field will always be second-guessed, because the essentials of the situation are impossible. Staff the committee any way you choose, with philosopher-kings/queens, Bill Raftery alone, or the first 10 names in the Indy phone book, and those essentials won't change. The difference between the verdicts handed out by the committee -- teams are either in the field or they're not -- will always be orders of magnitude greater than the difference in actual performance between the last team in and the best team left out.
To its credit the NCAA has met this impossible situation with unfailing diligence and a proper regard for the impact their decisions will have. So what's the problem?
Though Van Valkenburg was tasked with creating a rating system expressly for use by the men's basketball committee, the RPI is not a basketball metric. It is instead an opponent- and venue-adjusted winning percentage applicable to any team sport. As such it can be quite handy, of course, and today the NCAA uses it in no fewer than 11 Division I sports.
But, irony of ironies, the sport that Van Valkenburg had in mind at the RPI's birth is one that can be captured with much greater detail. The RPI's limitation is that it can do nothing more than treat basketball as just another sport where games produce outcomes. Conversely metrics that treat basketball as basketball yield information that is, not surprisingly, superior.
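To make "opponent- and venue-adjusted winning percentage" concrete, here is a minimal sketch of the RPI as it is commonly published: 25 percent a team's own winning percentage, 50 percent its opponents' winning percentage, and 25 percent its opponents' opponents' winning percentage, with home and road results weighted 0.6 and 1.4 in the team's own record. The game-tuple layout and the function names are assumptions made for illustration, and this is not the NCAA's own code. Note that nothing in it looks at points, possessions, or margin.

```python
# Sketch of the commonly published RPI formula (not the NCAA's own code).
# Each game is (winner, loser, site), where site is the WINNER's venue:
# "H" = home, "A" = away, "N" = neutral. This layout is an assumption.

WIN_WEIGHT = {"H": 0.6, "A": 1.4, "N": 1.0}    # value of a win, by the winner's venue
LOSS_WEIGHT = {"H": 1.4, "A": 0.6, "N": 1.0}   # cost of a loss, by the loser's venue
OTHER_SIDE = {"H": "A", "A": "H", "N": "N"}    # converts winner's venue to loser's venue


def weighted_wp(team, games):
    """Venue-weighted winning percentage for the team's own record."""
    wins = losses = 0.0
    for winner, loser, site in games:
        if winner == team:
            wins += WIN_WEIGHT[site]
        elif loser == team:
            losses += LOSS_WEIGHT[OTHER_SIDE[site]]
    total = wins + losses
    return wins / total if total else 0.0


def plain_wp(team, games, exclude=None):
    """Unweighted winning percentage, skipping any game involving `exclude`."""
    wins = losses = 0
    for winner, loser, _ in games:
        if exclude in (winner, loser):
            continue
        if winner == team:
            wins += 1
        elif loser == team:
            losses += 1
    total = wins + losses
    return wins / total if total else 0.0


def opponents(team, games):
    """One entry per game played, so repeat opponents count repeatedly."""
    result = []
    for winner, loser, _ in games:
        if winner == team:
            result.append(loser)
        elif loser == team:
            result.append(winner)
    return result


def owp(team, games):
    """Average opponent winning percentage, ignoring games against `team`."""
    opps = opponents(team, games)
    if not opps:
        return 0.0
    return sum(plain_wp(o, games, exclude=team) for o in opps) / len(opps)


def oowp(team, games):
    """Average of each opponent's own opponents' winning percentage."""
    opps = opponents(team, games)
    if not opps:
        return 0.0
    return sum(owp(o, games) for o in opps) / len(opps)


def rpi(team, games):
    return 0.25 * weighted_wp(team, games) + 0.50 * owp(team, games) + 0.25 * oowp(team, games)
```

Three-quarters of the number depends on whom a team played rather than how it played, which is why scheduling carries so much weight in the output.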
How do we know the information produced by these basketball-based metrics is so very superior? The intuitive answer is that the resulting information is way better than the RPI at predicting outcomes. (As Nate has put it, "Over the long run, the RPI has predicted the outcome of NCAA games more poorly than almost any other system.") That answer, however, tends to upset a few observers. Such people will tell you that the NCAA isn't trying to predict anything with their selection and seeding of the NCAA tournament field, and that the committee is or should be in the business of simply and explicitly rewarding past performance.
Which is why the intuitive answer has always required clarification. Why are basketball-centered metrics so good at predicting future outcomes? Because they bring accuracy to bear on past performance.
Wherever the RPI is defended, you may hear this issue rendered as prediction (boo) versus past performance (yay). Actually there's no need to traffic in verb tenses. The choice lies instead between a reliable measure of past basketball performance and an unreliable measure of past generic sports team performance.
If we observe hundreds of Division I coaches using basketball-specific metrics, if we grant that none of them are trying to win their office pool, and if we remark that no coach uses the RPI as a basketball performance measure, we can state the case plainly. To reward past performance you will want to measure it, and no one but the NCAA tries to measure past basketball performance with the RPI.
Naturally "past performance" is never a perfectly clear construct, no matter how accurately it's measured. In basketball, as in all sports except perhaps bowling, "best performance" and "most wins" will never be totally synonymous.
It's this synapse which has led to claims that these fancy-pants new metrics will subvert the importance of good old-fashioned wins. It's said that, by adding reliable performance measures to the discussion, basketball-based metrics will move the focus of interest and decision from the hardwood to the hard drive.
This vein of worry echoes the dire forecasts issued nearly three decades ago, when all manner of terrible things were said to be imminent because the college game was about to adopt the shot clock. The fears may well prove equally groundless in this instance. For one thing, the NCAA and its committee have been subverting good old-fashioned wins for years, very often correctly and with the RPI's blessing, and I haven't noticed any Occupy Indy outrage in response. To take just one example close at hand, Washington won the Pac-12 regular season championship outright in 2012, yet the Huskies did not receive an invitation to play in the NCAA tournament.
Second, if the RPI were reliably favoring teams that "just go out and win their games," it would actually be a far more lovable metric. Instead it's been known to develop fevered and reckless analytic crushes on the likes of Colorado State. Last year the Rams went 8-6 in the weakest Mountain West we've seen since 2006. Meanwhile Drexel won 19 consecutive games between January and March. CSU was given a No. 11 seed, and the Dragons were sent to the NIT. The RPI values teams that just go out and win their games? If only.
Lastly, this professed fear of accuracy fails to acknowledge how all of us evaluate actual basketball teams. If someday the NCAA does make the jump to basketball-specific performance measures, I seriously doubt the committee's members will become unquestioning puppets at the ends of new statistical strings. The point is simply to help those committee members ground their discussions on premises that are finally in accord with observed basketball occurrences.
As it stands currently, those discussions can in a small number of instances be preempted by an RPI so erroneously high that no one dares to stand in the way. On Selection Sunday last year, Southern Miss had an RPI of 21, while Colorado State clocked in at No. 24. By contrast the consensus of multiple independently designed basketball-specific metrics pegged those teams in the low 70s and high 90s, respectively. Maybe there were legitimate cases to be made for both teams, but, with the RPI inflating both beyond all recognition, there was no occasion or wish to find out.
The RPI as mid-major power-up
In its past two end-of-year top 100s, the RPI has included 25 teams that Ken Pomeroy didn't have in his top 100s, and 24 of those 25 programs are mid-majors. And in the same population of teams over the same time period, if you look at the 20 largest discrepancies between the two rating systems in cases where the RPI liked a team more than Ken's system did, you'll find that all 20 are mid-majors.
This does not mean the RPI "overrates mid-majors." Belmont in particular will be forgiven if they laugh out loud at that sequence of words. In fact major-conference powers dominate the top of the RPI just like they dominate the top of the polls, the basketball-specific metrics, and pretty much everything else. What it does mean is that the teams in the vicinity of the bubble who have been most egregiously overrated by the RPI the past two years have been drawn exclusively from the ranks of the mid-majors. It need hardly be added that the teams who are bumped out of the field of 68 by these RPI-fueled competitors may very well be fellow mid-majors, as we saw this past season with a near miss like Drexel.
Of course in cases where an overrated team wins its league's automatic bid, the harm done here is negligible. (See Long Island in both 2011 and 2012.) But where the team in question doesn't have an auto-bid in its pocket, the potential for selection malpractice is high. And the capacity of the RPI to act as a mid-major power-up was displayed unmistakably by Southern Miss and Colorado State in 2012.
My point isn't that these two teams should not have received bids. Merely that, particularly in the case of Southern Miss, I suspect there was little opportunity for or inclination on the part of committee members to look more closely at the question. And, as things stand now, I don't blame the committee one bit.
The truth is it would have required something bordering on real courage for the committee not to give a bid to Southern Miss. Basketball-specific metrics may have pegged the Golden Eagles as merely the 70-somethingth best team in the country, but a substantial and lovingly catalogued body of custom and tradition going back decades states that a team with an RPI this high will almost certainly get a bid. For weeks prior to Selection Sunday, the mock brackets crafted by reputable and widely read analysts showed Southern Miss safely in the field of 68.
Suffice it to say if for some strange reason the committee had revealed on Selection Sunday that they'd excluded Southern Miss, they would have been instantly criticized from all directions. The first question put to the committee chair that night by Jim Nantz would have been about Southern Miss. Larry Eustachy would have popped up on a live feed from Hattiesburg looking shocked and dejected. After all, his team had gathered to watch the bracket show thinking they'd learn where they'd be playing. Analysts had been unanimous in saying that Southern Miss was "in"; not one had them on the bubble or anywhere close to it. So what happened? What in the world was the committee thinking?
Perhaps 2013 will be quiet on this front (2011 was, by chance), but if not we already know the specific contours of the challenge that will arise. The committee can't jettison all that custom and tradition without giving prior warning to the outside world. They'd be savaged if they did and, anyway, the last project they have time to undertake in February and March is a fresh look at something as exhaustively discussed as the Ratings Percentage Index.
Our fatigue with regard to the RPI as a topic is cumulative, and we've long since reached the point where everyone is not just tired but tired of being tired of this discussion. The NCAA most certainly is tired of being asked about the RPI, and I'm tired of seeing it calculated four places to the right of the decimal as if it actually signifies anything remotely so precise. By the same token with each passing year the NCAA becomes a little more adept at addressing this topic that both they and their interlocutors have grown so very weary of. Think of those two quantities -- topical fatigue and the NCAA's discursive chops on this topic -- as steadily increasing over time.
The actual mischief visited upon the selection process by the RPI, conversely, fluctuates year to year, and it does so for reasons substantially outside the committee's control. Any committee that inherits the RPI and is presented with at-large candidates enhanced by mid-major power-ups is virtually guaranteed to wreak evaluative havoc on a line or two. It's an accident waiting to happen, so please stop yelling at that year's committee chair. You're yelling at a person who can cite chapter and verse from Bart Simpson: "It was like that when I got here."
Doctors bury mistakes, lawyers hang them, and the RPI gives them bids
The NCAA has long displayed laudable consistency in specifying they want to select the "best" teams for their tournament. In basketball the best teams score points and prevent opponents from scoring, and the best way of measuring that is to track a large body of possessions. Drill down into those possessions, pull up a core sample of basketball performance, make due allowance for strength of schedule, and you'll have a good departure point for discussion.
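As a rough sketch of what "tracking possessions" means in practice, the snippet below estimates possessions from box-score totals and turns them into an efficiency margin per 100 possessions. The possession estimate is a widely used approximation, and the 0.475 free-throw coefficient is an assumption (other values are in circulation). The `BoxScore` class and function names are hypothetical, and the sketch makes no strength-of-schedule adjustment, which a real basketball-specific metric would add.

```python
from dataclasses import dataclass


@dataclass
class BoxScore:
    """One team's totals for one game (hypothetical container)."""
    points: int
    fga: int        # field goal attempts
    oreb: int       # offensive rebounds
    turnovers: int
    fta: int        # free throw attempts


def estimate_possessions(box, ft_weight=0.475):
    """Approximate possessions; ft_weight is an assumed coefficient."""
    return box.fga - box.oreb + box.turnovers + ft_weight * box.fta


def points_per_100(boxes):
    """Points scored per 100 estimated possessions, pooled across games."""
    possessions = sum(estimate_possessions(b) for b in boxes)
    points = sum(b.points for b in boxes)
    return 100.0 * points / possessions if possessions else 0.0


def efficiency_margin(team_boxes, opponent_boxes):
    """Raw (unadjusted) offense minus defense, per 100 possessions."""
    return points_per_100(team_boxes) - points_per_100(opponent_boxes)
```

Pooling possessions across a full season is what gives the measure its large sample: a 30-game schedule yields roughly 2,000 possessions on each side of the ball instead of 30 win-or-loss outcomes.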
At least that's how the process has played out for me as I've analyzed teams under the RPI dynasty's sun. If on the other hand the NCAA announces tomorrow that they're going to start using basketball-specific metrics more extensively, I suppose there's a chance the ensuing full moon could awaken the werewolf inside every head coach and teams will promptly start running up the score at every opportunity.
I'm just not sure that chance is a large one. Happily this is not September-variety college football, so the following reassurances apply: 1) Running up the score is rare in college basketball; 2) If a team does run up the score it doesn't help their numbers as much as commonly thought (last year in conference play Kentucky built perhaps the best basketball-specific numbers I've ever seen less by blowing out opponents than by never losing to them); and 3) If a team does run up the score -- or, what's more likely, if they're simply on the scene when the opponent happens to implode completely -- it's simple enough to adjust the resulting numbers to something more descriptive of their actual ability.
If you want to make sure teams aren't rewarded for running up the score, don't reward teams for running up the score. Look at a trusty measure of team performance and filter out any funky scoring distributions.
A similar filter for the RPI, one that would make its raw outputs reliably congruent with team performance, would make life far easier for the men's basketball committee. But there is no such filter, and it's likely there can't be one, for the simple reason that the RPI is erratic. My comparison of the ratings produced by different systems for 100-plus teams pulled from the top of D-I in each of the last two seasons turned up multiple instances where the RPI likely underrated or overrated teams by 50, 60, or even 70 spots in a 345-team population.
These are extreme cases, to be sure. In the sample I pulled together the RPI sported a degree of evaluative uniqueness greater than 50 spots seven percent of the time. But it's precisely these exceptionally divergent days for the RPI that can lift a team into tournament consideration and indeed a bid. As it stands now, being the team that the RPI really screws up on can be the best thing that happens to a coach. Of the 225 ratings I looked at where the RPI assessed teams in the tournament discussion over the past two seasons, the rating given to Colorado State last season ranked No. 221 in its congruence with basketball-specific metrics. That is, there were only four ratings over the past two years in this sample that were even more aberrant.
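The kind of comparison described above can be sketched in a few lines: take each team's RPI rank, take the median of its ranks in several independent systems as a consensus, and flag any team where the two differ by more than 50 spots. The team names and ranks in the usage example are made up for illustration and are not the data behind the figures quoted here. The median is one assumed choice of consensus among several reasonable ones.

```python
from statistics import median


def rank_gap(rpi_rank, other_ranks):
    """Consensus rank minus RPI rank; positive means the RPI likes the team more."""
    return median(other_ranks) - rpi_rank


def flag_discrepancies(ratings, threshold=50):
    """Return (team, gap) pairs whose gap exceeds the threshold, largest first.

    `ratings` maps a team name to (rpi_rank, [ranks from other systems]).
    """
    flagged = []
    for team, (rpi_rank, other_ranks) in ratings.items():
        gap = rank_gap(rpi_rank, other_ranks)
        if abs(gap) > threshold:
            flagged.append((team, gap))
    return sorted(flagged, key=lambda item: abs(item[1]), reverse=True)


# Hypothetical ranks, for illustration only.
example = {
    "Team A": (20, [70, 75, 72]),   # RPI far more generous than the consensus
    "Team B": (10, [8, 12, 11]),    # systems agree
    "Team C": (90, [30, 35, 28]),   # RPI far harsher than the consensus
}
print(flag_discrepancies(example))
```

A committee that wanted to "note significant discrepancies" systematically could run exactly this sort of check on every team in the tournament discussion before deliberations begin.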
In September Luke Winn detailed how then-Colorado State coach Tim Miles, with the help of then-athletic director Paul Kowalczyk, set out to bend the RPI's eccentricities in the Rams' favor by scheduling what the two men hoped would be high-RPI but relatively low-basketball-performance opponents. Hopes can be dashed, of course. Those opponents don't always turn out the way you think they will, but in this case CSU caught all the breaks. Give Miles credit. The way the NCAA sets up its game, a coach would be negligent if he didn't indulge in this kind of reconnaissance. But is such recon really germane to the question of which teams should play in the tournament?
Granted, there are instances where the long belittled RPI eagerly joins the chorus being sung by those snooty basketball-specific metrics. Last year teams like Kentucky (obviously), Michigan State, Indiana, Georgetown, Vanderbilt, NC State, Washington, and VCU all had RPIs that corresponded quite well with their actual levels of performance. It's possible for the RPI to be correct with respect to a given team, but "possible" isn't good enough to earn our trust, for a basketball metric or anything else.
The paleos have a point
If the RPI were a window air conditioner, for example, the stereotypical view of the NCAA's metric as merely an antiquated relic would hold that the stupid thing would just sit there in your window, rusting and inoperable. But the reality's a bit more schizophrenic. Some of the time your RPI-brand air conditioner will in fact work perfectly. Then again some of the time it will work pretty well but not as well as you'd like. Some of the time it will function, quixotically enough, as a heater. And on some admittedly rare but nevertheless memorable occasions, it will turn into a blowtorch.
That actually puts the best face on the RPI, for the NCAA's metric is at its most benign when it's attempting to rate just one team. A single rating may turn out to be more or less correct, and even when it's not there's a chance the error won't have tangible consequences, whether because the team has an automatic bid or because they're not even in the tournament picture.
But as we pull back from that close-up and consider the RPI in relation to teams, plural, and indeed to college basketball as a whole, the picture changes. Even a team that the RPI happens to rate correctly will arrive at Selection Sunday having played, say, 25 different opponents, and the chances are good that the RPI will have badly mischaracterized a significant minority of those teams. For example, any team that was lucky enough to have Southern Miss or Colorado State on their schedule last year had an excellent chance to earn credit for a "top-50 win." Indeed part of the Golden Eagles' hypnotic power over the RPI was due to the fact that they went to Fort Collins on November 19 and beat putatively mighty CSU.
By definition the men's basketball committee traffics in close calls. Should this team be given a No. 1 or No. 2 seed? Should we give the last at-large bid to Team X or Team Y? In such cases having one or two additional top-50 wins can be huge. With the RPI's erratic assessments functioning as the mortar for all this intricate brickwork, it's fairly amazing the committee has done as well as they have.
The men's basketball committee may manage to wriggle free of the RPI on occasion, but the metric has succeeded in tying up the entirety of D-I when it comes to scheduling. The careful research that coaching staffs devote to cobbling together an RPI-kosher slate of opponents, the consultants advising mid-major league offices on the metric's intricacies, the willingness of major-conference programs to record their early-season wins against off-the-RPI-radar Division II opponents -- maybe it's all an invitation for us to see what is in front of one's nose.
Somewhere along the line an idea was accepted and institutionalized, and it still frankly surprises me. It's the idea that the NCAA should be in the business of explicitly favoring some schedules at the expense of others, as opposed to simply measuring strength of schedule as one discrete variable incidental to rating teams. Last March, for example, Jeff Hathaway, the chair of the men's basketball committee, said teams "have to be aggressive in their [non-conference] scheduling." Hathaway was just voicing the conventional wisdom, of course, but how did this come to be regarded as either conventional or wise?
Naturally as a fan I love aggressive non-conference scheduling. It's more fun to watch two good teams play each other in November or December than it is to watch a game with just one good team or none. I also recognize, however, that a team's ability to achieve the highest level of performance in basketball is independent of its coach's ability to achieve the highest level of performance in scheduling to the NCAA's liking.
In a normal evaluative world that still had conference play, we would surely tell every coach, "Go Izzo or go cupcake, we don't care. Play the non-conference schedule you want to play and don't worry about it. Our rating systems will do their thing, and the committee will watch your games. We'll know how well you've played regardless."
But the famously schedule-myopic RPI can't say that. In effect the NCAA, whether unintentionally or ingeniously, has recast a mathematical shortcoming as a settled matter of policy on the plane of The Good of the Game. If you want to improve your standing in our rating, the NCAA says with a straight face, you must excel not only at basketball but also at scheduling. It would have required less effort and been far more just to simply address the mathematical shortcoming.
No wonder the paleos are suspicious of statistics. The first stat they saw roll down the pike was the RPI, and it reordered a good deal of the sport by its own capricious and ambitious lights. I'd be suspicious too.
The RPI was created to give the men's basketball committee information it could not otherwise have, and in its early years that's precisely what Jim Van Valkenburg's metric did. But the value of the RPI has always been contextual and not intrinsic. In those early years it was worth consulting because there was little or nothing else available under the category of comprehensive rating systems for all of D-I. Now, three decades later, the world that has grown up around the RPI has effectively transformed Van Valkenburg's creation into the exact opposite of what it was meant to be. It is now a hindrance to good information, something the committee must work around.
For all the derision directed the RPI's way for predating the Commodore 64, the issue here is only partly one of age. Actually, basketball-specific metrics are based on insights formulated by Dean Smith a good 20 years before the RPI was even a gleam in the NCAA's eye. The core issue is one of performance, and on that criterion the RPI would be found lacking even if it had been invented late yesterday afternoon.
Perhaps a better day is coming. Maybe someday soon the NCAA will be guided by an index composed of several independently designed basketball-specific metrics. Maybe said index will be respectfully silent on the issue of how a coach should go about creating his team's schedule. Maybe someday the subjective preferences shown by the committee will be sanctioned by accuracy and thus acquire a clarity and weight they've never had before.
Maybe someday, but not just yet. Tomorrow the RPI will still be here, and you and I will still be fans of college basketball. This is the business we've chosen.
Follow John on Twitter: @JohnGasaway.
John Gasaway is an author of Basketball Prospectus.