Errntknght
Registered User
Recently someone mentioned the hoopdata.com site, and checking it out I found a wealth of data just ripe for analysis. How could a mathematician resist? The data consists of yearly summary statistics for each team plus some calculated formulas like Hollinger's PER (for teams). Correlating each stat with the teams' winning percentage gives some information about how important each of them is to winning games.
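For anyone who wants to try this at home, here is a minimal sketch of the correlation calculation in plain Python. The function name `pearson` and the five-team sample are my own inventions for illustration; the numbers are made up and are not hoopdata.com data.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares about the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)

# Made-up numbers for five imaginary teams (NOT real hoopdata.com figures).
win_pct = [0.700, 0.610, 0.500, 0.390, 0.280]
fg_pct = [0.478, 0.465, 0.459, 0.451, 0.440]

cc = pearson(fg_pct, win_pct)
print(f"FG% vs winning pct: {cc:+.3f}")
```

With real data you would have 30 values per list, one per team, and repeat the call for each stat and each season.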
Some stats, like point differential, are known to correlate well with winning percentage, and at the entry PTdiff you'll see it shows up with a 4-year average correlation coefficient (cc) of .974 (out of a maximum possible of 1.0). Hollinger's PER, at the bottom of the list, has an average cc of .845, so it's not all that bad. Of course, it's a terribly complicated formula, and you can see that simple field goal percentage difference (own minus opponents', named FG%diff) correlates just as well, at .854. PTdiff, FG%diff and 3P%diff are not given explicitly at hoopdata.com, so I calculated them as PTS - OPTS (opponents' points), FG% - OFG%, and 3P% - O3P%. Except for those three 'stats', the labels match those at hoopdata.com in the two statistical categories "Basic" & "Advanced". (You can get the definitions on hoopdata.com, should you be interested in the details.) I will mention that CHG means charges taken and DEF means CHG+BLK+STL.
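The three derived 'diff' stats are simple subtractions. A small sketch, assuming each team's season is held as a dict keyed by the stat labels used in this post (the helper name `add_diffs` and the sample row are hypothetical, and the numbers are invented):

```python
def add_diffs(row):
    """Return a copy of a team's stat row with the three 'diff' stats added."""
    out = dict(row)
    out["PTdiff"] = row["PTS"] - row["OPTS"]     # own points minus opponents' points
    out["FG%diff"] = row["FG%"] - row["OFG%"]    # own FG% minus opponents' FG%
    out["3P%diff"] = row["3P%"] - row["O3P%"]    # own 3P% minus opponents' 3P%
    return out

# Invented numbers for one imaginary team (not real data).
team = {"PTS": 101.5, "OPTS": 98.0, "FG%": 46.2, "OFG%": 44.9, "3P%": 36.0, "O3P%": 35.1}

with_diffs = add_diffs(team)
for name in ("PTdiff", "FG%diff", "3P%diff"):
    print(f"{name}: {with_diffs[name]:+.1f}")
```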
Some things I found of interest.
How can FT% be of so little significance? Note that it had a negative correlation coefficient in 08-09. There are only 30 data points per year, one for each team, which means that an estimated correlation has a standard deviation of about .18 (when the true correlation is near zero), and the average of 4 years has a standard deviation of half that, or about .09. If you look across the four years you can see that the numbers do jump around a good bit, except that the high numbers don't vary as much (that's expected, just on statistical grounds). My explanation is that FT% doesn't correlate highly with winning because the variation among teams' yearly averages isn't that great to begin with, and perhaps teams with a number of bulky, strong guys, who contribute in other ways, don't shoot FTs so well. You may have a better explanation.
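The .18 and .09 figures can be reproduced with the usual large-sample approximation for the standard error of a sample correlation, (1 - r^2) / sqrt(n - 1). This is a rough sketch, not an exact result: it assumes roughly bivariate-normal data, and the halving for a 4-year average assumes the seasons are independent, which they are not quite since the same teams appear every year. The function name `approx_se_of_r` is mine.

```python
from math import sqrt

def approx_se_of_r(r, n):
    """Large-sample approximation to the standard error of a sample correlation."""
    return (1 - r * r) / sqrt(n - 1)

n_teams = 30

# Spread of a single-season estimate when the true correlation is about zero.
se_one_year = approx_se_of_r(0.0, n_teams)

# Averaging 4 seasons divides the spread by sqrt(4) = 2 (independence assumed).
se_four_year_avg = se_one_year / sqrt(4)

# Near the top of the scale the spread shrinks sharply, e.g. at PTdiff's .974.
se_near_top = approx_se_of_r(0.974, n_teams)

print(f"one season, r near 0:   {se_one_year:.3f}")
print(f"4-year avg, r near 0:   {se_four_year_avg:.3f}")
print(f"one season, r = 0.974:  {se_near_top:.4f}")
```

The last line is the reason the high numbers in the table barely move from year to year while the low ones bounce around.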
STLs and CHGs also had negative years and little significance on average. One thought I had about CHGs was that teams that get lots of charges try for them lots of times, and fairly often they fail to get the call, which is a significant negative. If you look at the progression over 4 years, the declining value (from +.388 in 06-07 to -.106 in 09-10) might well mean that the refs are siding more and more with the offensive player - or it could mean the players are pushing the envelope too far in trying to get a charge call. Of course, it's well within the expected random variation, and could mean zip. One thing that is clear is that taking charges is not greatly beneficial in itself. Teams probably should discourage players who don't get the call most of the time from trying.
STLs having essentially zero value is right in line with what many coaches say - going for steals costs you significantly when you fail. Clearly, league-wide, players are pushing the envelope.
Pace has a consistently negative value. I don't take that to mean that it is beneficial to slow the game down. For one thing, two of the teams that pushed the pace the most over the last two years are simply two of the worst teams at any pace - GSW and NYK. Players with poor shot selection increase the pace of the game a small amount by not working for a better shot. Put a bunch of poor-shot-selection guys together and you have a losing team and a somewhat higher pace.
ORR, offensive rebound rate, has an almost zero cc, which seems to fly in the face of reason, but it is true that you have to miss a shot to get an offensive rebound, so that might be the reason. I'm a little surprised that the effect shows up in ORR, though - I've known for a while that it shows up in the raw count of offensive rebounds. ORR is not offensive rebounds per game; it's expressed as a percentage of missed shots, and you'd think that had to be beneficial. It doesn't appear in these two sets of stats, but opponents' offensive rebounds have a significant negative correlation with winning percentage, which would make you think own offensive rebounds are worth a fair bit when expressed as a percentage of missed shots.
For the proponents of offensive and defensive efficiency, you'll see that they show up with significantly higher cc's than raw PTS & OPTS, so you might expect their difference to be a better metric than PTdiff, but it is a dead heat between the two. In fact, not only do they average the same, they're almost identical every year.
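To put a number on "almost identical", here is a quick check using the yearly cc's copied from the PTdiff and Diff rows of the table in this post (the variable names are just for illustration):

```python
# Yearly correlation coefficients copied from the table in this post.
seasons = ["06-07", "07-08", "08-09", "09-10"]
pt_diff = [0.953, 0.978, 0.991, 0.974]    # PTdiff row
eff_diff = [0.953, 0.977, 0.990, 0.972]   # Diff row (Advanced section)

# Absolute gap between the two metrics, season by season.
gaps = [abs(p - e) for p, e in zip(pt_diff, eff_diff)]
for season, gap in zip(seasons, gaps):
    print(f"{season}: |PTdiff - Diff| = {gap:.3f}")

largest_gap = max(gaps)
print(f"largest gap: {largest_gap:.3f}")
```

The two never differ by more than a couple of thousandths, far below the sampling noise of any single-season correlation.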
stat.......... 06-07 .... 07-08 .... 08-09 .... 09-10
PTS ..... : . +0.320 .. +0.509 .. +0.297 .. +0.448 ... Avg: +0.394
OPTS .... : . -0.525 .. -0.542 .. -0.654 .. -0.606 ... Avg: -0.582
FG% ..... : . +0.455 .. +0.643 .. +0.649 .. +0.601 ... Avg: +0.587
OFG% .... : . -0.704 .. -0.764 .. -0.873 .. -0.749 ... Avg: -0.772
3P% ..... : . +0.421 .. +0.536 .. +0.483 .. +0.532 ... Avg: +0.493
O3P% .... : . -0.561 .. -0.565 .. -0.564 .. -0.649 ... Avg: -0.585
FT% ..... : . +0.107 .. +0.284 .. -0.160 .. +0.012 ... Avg: +0.061
AST ..... : . +0.405 .. +0.487 .. +0.335 .. +0.396 ... Avg: +0.406
TO ...... : . -0.548 .. -0.473 .. -0.374 .. -0.429 ... Avg: -0.456
STL ..... : . -0.051 .. +0.304 .. +0.208 .. -0.034 ... Avg: +0.107
BLK ..... : . +0.267 .. +0.313 .. +0.163 .. +0.403 ... Avg: +0.287
CHG ..... : . +0.388 .. +0.044 .. +0.066 .. -0.106 ... Avg: +0.098
DEF ..... : . +0.311 .. +0.384 .. +0.234 .. +0.198 ... Avg: +0.282
PF ...... : . -0.373 .. -0.104 .. -0.229 .. -0.172 ... Avg: -0.220
PTdiff .. : . +0.953 .. +0.978 .. +0.991 .. +0.974 ... Avg: +0.974
FG%diff . : . +0.800 .. +0.876 .. +0.904 .. +0.836 ... Avg: +0.854
3P%diff . : . +0.635 .. +0.687 .. +0.673 .. +0.774 ... Avg: +0.692
The stats below are from the "Advanced" statistical category (Diff is the efficiency differential).
Pace .... : . -0.108 .. -0.117 .. -0.309 .. -0.282 ... Avg: -0.204
OffEff .. : . +0.703 .. +0.853 .. +0.823 .. +0.774 ... Avg: +0.788
DefEff .. : . -0.711 .. -0.826 .. -0.870 .. -0.740 ... Avg: -0.787
Diff .... : . +0.953 .. +0.977 .. +0.990 .. +0.972 ... Avg: +0.973
TS% ..... : . +0.493 .. +0.712 .. +0.724 .. +0.682 ... Avg: +0.653
AR ...... : . +0.496 .. +0.553 .. +0.522 .. +0.485 ... Avg: +0.514
TOR ..... : . -0.535 .. -0.454 .. -0.300 .. -0.345 ... Avg: -0.409
ORR ..... : . -0.080 .. +0.104 .. +0.172 .. +0.056 ... Avg: +0.063
DRR ..... : . +0.479 .. +0.323 .. +0.470 .. +0.501 ... Avg: +0.443
TRR ..... : . +0.397 .. +0.537 .. +0.691 .. +0.553 ... Avg: +0.544
EFF ..... : . +0.658 .. +0.735 .. +0.642 .. +0.736 ... Avg: +0.693
WS ...... : . +0.764 .. +0.842 .. +0.833 .. +0.822 ... Avg: +0.815
AWS ..... : . +0.718 .. +0.805 .. +0.767 .. +0.810 ... Avg: +0.775
PER ..... : . +0.794 .. +0.883 .. +0.860 .. +0.844 ... Avg: +0.845
The stats AR, TOR, ORR, DRR and TRR are not per-game rates: AR (assists) and TOR (turnovers) are per-possession rates, and the rebound stats are per-opportunity rates. TS%, EFF, WS, AWS and PER are formulas of varying complexity - see hoopdata.com.
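If you want to pull the table into a program, the rows are regular enough to parse with a short regular expression. A rough sketch (the helper name `parse_row` is mine), which also rechecks that the Avg column is the plain mean of the four seasons:

```python
import re

def parse_row(line):
    """Split one table row into (label, four yearly cc's, stated average)."""
    # The label is everything before the first colon, minus the dot leaders.
    label = line.split(":", 1)[0].strip(" .")
    # Every signed decimal on the line: four seasons, then the average.
    values = [float(v) for v in re.findall(r"[+-]\d+\.\d+", line)]
    *yearly, avg = values
    return label, yearly, avg

row = "CHG ..... : . +0.388 .. +0.044 .. +0.066 .. -0.106 ... Avg: +0.098"
label, yearly, avg = parse_row(row)

recomputed = sum(yearly) / len(yearly)
print(f"{label}: stated avg {avg:+.3f}, recomputed {recomputed:+.3f}")
```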