The Perils of On/Off Numbers

As Dre and I have talked about on the air, on/off statistics are poor indicators of an individual's impact on the game. The primary reason is that there are a large number of confounding variables, like the four other players who share the court with any given player, and the many other players who might replace that player when they are off the court.

I'll try to illustrate this with an extreme example. Imagine that an NBA team held a contest, and the winner of that contest got to start at shooting guard for the team. Now imagine that I won that contest.

A team playing me at shooting guard would be a disaster. You might think that it would be like playing 4-on-5, but you'd be wrong -- it would be much worse than playing 4-on-5. Think of all the turnovers I would commit, the personal fouls I would pick up, and the shots I would get blocked, none of which would happen in a 4-on-5 situation. Playing 4-on-5, my teammates would at least have a defensive scheme for rotating and could make choices about what shots they would live with. But as the fifth man, I would constantly be getting in the way.

The point, of course, is that any time I was on the court, my team would get outscored badly. Unless my four teammates were hall-of-famers ... in which case they might just be good enough to make up for my extreme liability while never passing me the ball and hiding me in a zone on defense.

Now ask yourself some questions:

1) If my four teammate starters were average NBA players, would their on/off numbers be zero (as one would expect from an average player) or negative?

2) If my four teammate starters were hall-of-famers, would my on/off number be zero? Positive? Negative? Would it reflect my skills or be higher than you would expect for a player of my skill playing in the NBA?

3) Does my substitute deserve amazing on/off numbers? If your answer is "yes", is that because what you are really measuring is how much I suck?

4) How would you control for my incredibly negative effects if you were creating an "Advanced on/off" metric? Note that because of the interaction effects in #2 above, using MY on/off numbers as a control would be a very bad idea.

5) Would any of this explain how much I suck more than the simple box score does?
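
To put rough numbers on questions 1 and 3, here is a minimal sketch in Python. Everything in it is made up for illustration: the player labels, the "true" impacts (me at -40 points per 100 possessions, everyone else exactly average at 0), the possession counts, and above all the assumption that a lineup's scoring margin is just the sum of its players' impacts. On/off is computed the usual way: the team's margin per 100 possessions with a player on the court, minus its margin with him off.

```python
# Hypothetical "true" impacts in points per 100 possessions (made-up numbers).
IMPACT = {"ME": -40.0, "SUB": 0.0}
STARTERS = ("S1", "S2", "S3", "S4")   # four average starters
BENCH = ("B1", "B2", "B3", "B4")      # four average bench players
IMPACT.update({name: 0.0 for name in STARTERS + BENCH})

# Each stint is (lineup, possessions played together). Also made up.
STINTS = [
    (STARTERS + ("ME",), 60),   # I start and play big minutes
    (STARTERS + ("SUB",), 20),  # my substitute replaces me
    (BENCH + ("SUB",), 20),     # bench unit, still without me
]


def margin(lineup, possessions):
    """Scoring margin for one stint, ASSUMING player impacts simply add up."""
    return sum(IMPACT[p] for p in lineup) * possessions / 100


def on_off(player):
    """Team margin per 100 possessions with the player on, minus with him off.

    Assumes the player has both on-court and off-court possessions.
    """
    totals = {True: [0.0, 0.0], False: [0.0, 0.0]}  # [margin, possessions]
    for lineup, possessions in STINTS:
        bucket = totals[player in lineup]
        bucket[0] += margin(lineup, possessions)
        bucket[1] += possessions
    on, off = totals[True], totals[False]
    return 100 * on[0] / on[1] - 100 * off[0] / off[1]


for name in ("ME", "S1", "SUB", "B1"):
    print(name, on_off(name))
```

In this toy world every player except me is exactly average, yet an average starter comes out at -30, my average substitute at +40, and an average bench player at +30. The only thing those numbers measure is how many possessions each of them shared with me.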

Now let's reverse the whole scenario and say that my rec league team won a big contest, and we got to have LeBron James play with us for a season. Assume we are an average team at this rec level. It's fair to assume that we would start winning games by 30+ points.

1) Do the starters that play with LeBron deserve their big on/off numbers?

2) Does LeBron's substitute deserve his horrible on/off numbers? If your answer is "yes", is this because what you are really measuring is "how much it hurts to take LeBron out of a game"?

3) How would you control for LeBron when calculating each teammate's on/off numbers? Remember that because LeBron plays with four players much worse than him (including the ones you are trying to calculate), using HIS on/off numbers to control for this is a bad idea.

4) How would you control for LeBron's horrible (by comparison) teammates when calculating his on/off numbers? Remember that, since each of them plays with LeBron (and tends to sub out and share bench time with LeBron), using THEIR on/off numbers to do this is a bad idea.

5) Would any of this explain more than LeBron's simple box score numbers do, in describing how awesome LeBron is?

....see where this is going?
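
The control problem in LeBron questions 3 and 4 can be sketched the same way. This is a toy illustration with invented numbers, not anyone's actual metric: "BUDDY" is a hypothetical teammate who only ever plays alongside LeBron, the other names are placeholder rec-leaguers, and lineup margin is again assumed to be a simple sum of player impacts. Two very different stories about who deserves the credit produce exactly the same results on the scoreboard.

```python
# Each stint is (lineup, possessions). BUDDY never plays without LEBRON.
STINTS = [
    (("LEBRON", "BUDDY", "R1", "R2", "R3"), 70),
    (("R1", "R2", "R3", "R4", "R5"), 30),
]

REC_PLAYERS = {name: 0.0 for name in ("R1", "R2", "R3", "R4", "R5")}

# Two hypothetical "worlds" of true impact, in points per 100 possessions.
WORLD_A = {"LEBRON": 30.0, "BUDDY": 0.0, **REC_PLAYERS}   # LeBron does it all
WORLD_B = {"LEBRON": 15.0, "BUDDY": 15.0, **REC_PLAYERS}  # credit split evenly


def stint_margins(impact):
    """Scoring margin of every stint, ASSUMING player impacts simply add up."""
    return [sum(impact[p] for p in lineup) * possessions / 100
            for lineup, possessions in STINTS]


margins_a = stint_margins(WORLD_A)
margins_b = stint_margins(WORLD_B)
print(margins_a, margins_b, margins_a == margins_b)
```

Both worlds give the same margin in every stint, so nothing computed from on-court results alone, however "advanced", can tell them apart. The more a teammate's minutes overlap with LeBron's, the closer the real data gets to this situation.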

Today, I was motivated to strike the word "advanced" from many of the descriptions and titles on this website. I've come to realise that we're doing a poor job of our mission, which is to make it clear that explaining basketball performance is not that complicated. Believe it or not, the central tenet of the analytics we use here is that (almost) everything you need to measure performance is right there in the modern-day box score! And it's true, there are things not explicitly quantified by the box score -- like many individual contributions to team defense. But that doesn't mean a black box that ignores the box score is the answer.

And I think this is the kind of thing that leads Charles Barkley to throw up his hands in frustration and say that analytics proponents are "a bunch of guys that never got the girls in high school."

Our advice? Don't abandon the box score, Chuck.
