October 24, 2007
Similarity Scores
Getting Past What You See

by Ken Pomeroy


There's nothing that bugs me more about college basketball analysis than player comparisons. Often an up-and-coming player must be compared to somebody to give the audience a frame of reference. Never heard of Adam Morrison? Hey, he's the next Larry Bird! Now you have to watch and see what Adam Morrison does, because he's going to do things that Larry Bird did.

I'm convinced that the oft-cited Morrison/Bird comparisons had just as much to do with how the duo looked as how they played. That isn't to say there weren't obvious similarities in the two players' games, because there were. However, they also both had the floppy hair, they couldn't master the use of a razor, and they weren't very articulate. Had Morrison been black, well-manicured and the spokesman for a couple of important social causes, but otherwise been the exact same player, would he have drawn as many Bird comparisons? Don't kid yourself.

Because college basketball quantitative analysis is still in its infancy, there are a lot of ideas that can be borrowed from other sports and applied to ours. One such idea is similarity scores. Similarity scores were created by Bill James on the baseball side about 25 years ago. They've since been applied to football and basketball at the pro level with some success. The idea is to determine which players are most similar to a player in question based solely on certain statistical factors. I'm still experimenting with a similarity method for college hoops, and I'll share what I've worked on so far.

I took 14 different tempo-free statistical factors for every player, and also added team strength, based on each team's Adjusted Pythagorean Winning Percentage. I normalized each of these factors--in other words, put them all on the same scale. I can plug any player into this system and compare him to every other player in the college basketball universe over the past three seasons. I sum the differences in each of the 14 categories, and the players with the lowest totals are the most similar to the player I'm testing. Instead of arbitrarily determining the weight that each category gets, I am going to let the player's unique characteristics do that. If a player is unusually good or bad in a particular statistic, then more importance is placed on that statistic in determining which players are most similar.
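The procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the article does not give the exact normalization or weighting formula, so the sketch uses z-scores for the normalization and a weight of 1 + |z| (based on the target player's own normalized value) for each category. The stat names, the function names and the four players in the toy pool are invented for the example and are not real data.

```python
from statistics import mean, pstdev


def normalize(players):
    """Put every stat on the same scale by converting it to a z-score."""
    stat_names = list(next(iter(players.values())).keys())
    z = {name: {} for name in players}
    for stat in stat_names:
        values = [row[stat] for row in players.values()]
        mu, sigma = mean(values), pstdev(values)
        for name, row in players.items():
            z[name][stat] = (row[stat] - mu) / sigma if sigma > 0 else 0.0
    return z


def similarity_scores(players, target):
    """Rank every other player by weighted distance from the target.

    Lower scores mean more similar; 0.0 means statistically identical.
    """
    z = normalize(players)
    t = z[target]
    # ASSUMPTION: the weight grows with how unusual the target is in a stat.
    # The article states the idea but not the formula.
    weights = {stat: 1.0 + abs(value) for stat, value in t.items()}
    scores = {}
    for name, row in z.items():
        if name == target:
            continue
        scores[name] = sum(weights[s] * abs(row[s] - t[s]) for s in t)
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical pool: two rebounding shot-blockers and two three-point shooters.
toy_pool = {
    "big_a": {"reb_pct": 20.0, "blk_pct": 8.0, "three_rate": 0.0},
    "big_b": {"reb_pct": 19.0, "blk_pct": 7.0, "three_rate": 2.0},
    "wing_c": {"reb_pct": 8.0, "blk_pct": 1.0, "three_rate": 60.0},
    "wing_d": {"reb_pct": 7.0, "blk_pct": 1.5, "three_rate": 70.0},
}

ranked = similarity_scores(toy_pool, "big_a")
for name, score in ranked:
    print(f"{name}: {score:.1f}")
```

In a fuller version each row would be a player-season rather than a player, which is what would allow a player to show up as a comparable to himself in a different year.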

For my first example, I'm going to use Memphis center Joey Dorsey. Dorsey gained prominence last March by calling Ohio State's Greg Oden "a lot overrated" before the two teams met in an Elite Eight game. Even though Dorsey was essentially schooled in that contest (he finished with zero points, three rebounds and four fouls in 19 minutes), he's actually a guy that about 330 other college teams would love to have. Dorsey grabs rebounds like almost nobody else and blocks shots at a pretty good clip as well. Unfortunately, he has few offensive moves and is often foul-prone. So this system is going to look for players like that, especially homing in on the rebounding, because that is where Dorsey has few peers. Here are the most similar players to Dorsey over the last three seasons:

Joey Dorsey, So., 2006, Memphis (similarity score = 12.9)
Joey Dorsey, Fr., 2005, Memphis (21.6)
Al Horford, Fr., 2005, Florida (22.2)
Tamarr Maclin, Sr., 2005, SW Missouri St. (24.4)
Shelden Williams, Jr., 2005, Duke (24.9)

Lower scores are better, with a score of zero meaning two players are statistically identical. So Joey Dorsey is essentially most similar to himself. Then there's a very raw Al Horford, followed by a guy who got more tryouts with NFL teams than NBA teams in Maclin, and Shelden Williams, whom the system locks in on because of his ability to rebound and block shots extremely well. The fact that Dorsey is most similar to Dorsey is telling. He came into the college game as a voracious rebounder and shot blocker and has improved little in the areas where he is weak. He has more or less been the same player for three years.

Let's try a polar opposite to Dorsey. When I plug Florida's Lee Humphrey into the system, I get this quintet:

Lee Humphrey, So., 2006, Florida (8.7)
John Sharper, Sr., 2006, San Diego State (16.0)
Clayton Hanson, Sr., 2005, Wisconsin (18.5)
Rich McBride, Jr., 2006, Illinois (18.9)
Matt Lawrence, So., 2007, Missouri (20.0)

As we would expect if the system were any good, you get a bunch of guys who didn't shoot much, who when they did almost always shot a three, and who made those threes at a high clip. Otherwise, this group didn't have a statistical impact when they played. Notice that the scores for Humphrey's comparables are lower than for Dorsey's. Role-playing three-point shooters are more common than freakishly good rebounding shot-blockers.
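One rough way to put a number on that observation is to average the five scores in each list. This is a hedged illustration and not part of the system described in the article: the idea of using the average as a gauge of how common a player type is, and the function name, are assumptions made here. The scores themselves are copied from the Dorsey and Humphrey lists above.

```python
def mean_top_score(scores):
    """Average similarity score of a player's listed comparables.

    A lower average suggests the player type is more common in the pool.
    """
    return sum(scores) / len(scores)


# Scores as printed in the two lists above.
dorsey_scores = [12.9, 21.6, 22.2, 24.4, 24.9]
humphrey_scores = [8.7, 16.0, 18.5, 18.9, 20.0]

dorsey_mean = mean_top_score(dorsey_scores)
humphrey_mean = mean_top_score(humphrey_scores)

print(f"Dorsey comparables average:   {dorsey_mean:.2f}")
print(f"Humphrey comparables average: {humphrey_mean:.2f}")
```

By this measure Humphrey's list averages roughly 16.4 against roughly 21.2 for Dorsey's, which matches the point that shooters of Humphrey's type are easier to find.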

Like any tool, this one has to be used with a bit of intelligence. For instance, we could plug Kevin Durant into the system, but why would we want to? We're not going to learn anything more than we already knew about him. Statistically, he was the best player in the game last season. I'll plug him in anyway just to illustrate my point.

Nick Fazekas, Jr., 2007, Nevada (17.4)
Nick Fazekas, So., 2006, Nevada (19.1)
Juan Mendez, Sr., 2005, Niagara (24.0)
Al Thornton, Sr., 2007, Florida St. (24.0)
Glen Davis, So., 2006, LSU (24.7)

What you get are guys who weren't as good as Durant, but who had a similar statistical profile: high-usage players who were active in many aspects on both ends of the court. Kevin Durant is not the type of player you want to compare to another college player. Other players get compared to him (and almost always this will be a poor comparison), but people weren't just impressed with Durant because he was so good at doing so many things. They were also impressed because he was dominating the game as a freshman, which is reinforced by the fact that none of his top comparables were freshmen.

This raises another question: who were the freshmen most similar to Durant last season? We won't see Durant in the college game this season, nor will we see any of his top comparables, so maybe the freshman list could shed some light on which sophomores we should keep an eye on.

Ryan Anderson, California (35.1)
Stephen Curry, Davidson (36.8)
Brandon Costner, NC St. (38.3)
Luke Harangody, Notre Dame (39.1)
Kevin Coble, Northwestern (40.1)

All of the guys here are significantly different from Durant in some way. The fact that Stephen Curry can sneak into the list as a three-point shooting, playmaking point guard from a smaller school illustrates just how difficult it is to find freshmen similar to Durant. Anderson was a great surprise for Cal, but he was only the 24th-most similar player to Durant overall last season.
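Restricting the comparison to one class while still reporting each player's overall rank, as with Anderson above, amounts to a filter over the full ranked list. The sketch below assumes the full ranking is already available as a list of tuples sorted by ascending score; the function name, the tuple layout and the placeholder players are hypothetical and are not the article's data.

```python
def top_in_class(ranked, class_year, n=5):
    """Return the n most similar players from one class.

    ranked: list of (name, class_year, score) sorted by ascending score.
    Each result keeps the player's rank in the overall list.
    """
    results = []
    for overall_rank, (name, cls, score) in enumerate(ranked, start=1):
        if cls == class_year:
            results.append((overall_rank, name, score))
            if len(results) == n:
                break
    return results


# Placeholder ranking for some target player (invented values).
full_ranking = [
    ("player_a", "Jr.", 10.0),
    ("player_b", "Fr.", 12.0),
    ("player_c", "Sr.", 15.0),
    ("player_d", "Fr.", 20.0),
    ("player_e", "So.", 21.0),
]

freshmen = top_in_class(full_ranking, "Fr.", n=2)
for overall_rank, name, score in freshmen:
    print(f"{name}: score {score}, overall rank {overall_rank}")
```

Keeping the overall rank alongside the class-only list makes it clear when the best match within a class is still a distant match overall.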

This list also challenges the notion that similar players must look alike. Specifically, I'm thinking about Luke Harangody. Harangody looks more like Dick Butkus in his prime, and there just isn't anyone else that looks like that in the college game. I'm confident no person on the planet compared him to Durant last season, not that a comparison like that would have been apropos. But certainly his ability to excel in so many categories and use a lot of possessions on the offensive end makes you wonder if he should be taken more seriously as a player. Harangody may be loosely comparable to Durant, but his list of most similar players provides more information on what type of player he is.

Glen Davis, So., 2006, LSU (12.3)
Wendell White, Sr., 2007, UNLV (12.8)
Craig Smith, Jr., 2005, Boston College (14.8)
Curtis Sumpter, Sr., 2007, Villanova (16.2)
Jamaal Williams, Sr., 2006, Washington (16.9)

Harangody's comparables all post low scores, which means these players are quite similar to him. When it comes to body type, you could do a lot worse than comparing him to Glen Davis. If Harangody emerges as a prolific scorer and rebounder this season--and it's not a stretch that he will, assuming he recovers from a recent hand injury--analysts might need a comparison that the casual fan can identify with. Davis is a great comparison. He has a similar body type and game, and even though he's moved on to the NBA, he was one of the most recognizable players of the last couple of seasons.

The main thing preventing such a comparison would seem to be that each player is of a different race. Is that a good reason to avoid a comparison when their contributions on the floor are about as similar as two people can get? Analysis should be about more than comparing mug shots; it should be about comparing roles and production. With that in mind, Luke Harangody's production in 2007 plus more minutes in 2008 will produce stats very much like those Glen Davis had in 2006. That might be the best example of the insight that similarity scores can give us.

Ken Pomeroy is an author of Basketball Prospectus.

Copyright © 1996-2017 Prospectus Entertainment Ventures, LLC.