Adventures in Analytics: Descriptive vs. Predictive Stats


Over at Nylon Calculus they've unveiled a new metric called Dredge, which uses play-by-play data with the goal of predicting RAPM data. Now, as a warning, this is already going to be a ranty post, so I'm going to avoid going down the rabbit hole of issues with RAPM. That said, a footnote that referenced Wins Produced had me scratching my head.

[2. Win Shares performed horribly, so much so that I worry it’s an error I did not catch. But others have found its weak predictive power when looking at data with players in new situations, although it’s not as bad as Wins Produced.]

The method was: use various stats, including Dredge, to explain/predict RAPM. Win Shares and Wins Produced were deemed weak because they can't predict RAPM. It's also worth noting RAPM came about as an "improvement" to Adjusted Plus-Minus (APM) because APM was not predictable enough year to year.

I've thrown a ton of alphabet soup at you, and I don't want to wade into the land of metrics I consider built on odd logic at best. That said, I do want to talk about a "criticism" of Win Shares and Wins Produced: namely, their inability to predict. Predicting RAPM is an odd test, as both metrics say their goal is to explain wins [Editor's Note: the test noted in the footnote is trying to predict future team's wins. The goal of Dredge is to "use a bevy of metrics to predict 15 year RAPM" -- paraphrased from the Nylon Calculus post.]. Also, trying to see how well they explain RAPM rather than wins is just silly, as it's an odd relic of a bad paper that was never published. Moving on.

Ok, both Wins Produced and Win Shares are "descriptive stats." All that means is that they tell you what happened. Both use a linear set of weights on box score statistics to explain how players contributed to their team's wins. We can quibble about how the linear weights were derived, but regardless, both metrics say: here is what happened.
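To make "a linear set of weights on box score statistics" concrete, here is a minimal sketch. The weights, the stat names, and the function name are all made up for illustration; they are not the published Wins Produced or Win Shares coefficients. The only point is that the calculation takes a box score that already exists and summarizes it.

```python
from typing import Mapping

# Hypothetical weights for illustration only. These are NOT the
# Wins Produced or Win Shares coefficients.
HYPOTHETICAL_WEIGHTS = {
    "points": 0.03,
    "rebounds": 0.03,
    "assists": 0.02,
    "steals": 0.03,
    "turnovers": -0.03,
    "missed_shots": -0.03,
}


def descriptive_score(box_score: Mapping[str, float]) -> float:
    """Weighted sum of box score totals: a summary of what already happened."""
    return sum(
        weight * box_score.get(stat, 0.0)
        for stat, weight in HYPOTHETICAL_WEIGHTS.items()
    )


# A made-up season line for a made-up player.
season = {
    "points": 1500,
    "rebounds": 600,
    "assists": 300,
    "steals": 100,
    "turnovers": 200,
    "missed_shots": 700,
}

print(descriptive_score(season))
```

Nothing in that function looks forward. Feed it last season's box score and it describes last season.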

Now, that is different than "predictive stats." These try to say: "here is what will happen." Simple example:

  • "It rained today." - descriptive stat
  • "It will rain tomorrow." - predictive stat

Now, there is some overlap between the two, and I'll get to it in a second, but first I want to take a slight tangent to talk about Star Wars and what my issue is.

Star Wars and "Fast" Metrics.

Above is an iconic scene from Star Wars: A New Hope. In it Ben Kenobi and Luke Skywalker barter passage with Han Solo and Chewbacca on the Millennium Falcon. Now, what makes this scene special and the focus of many nerd battles is the following. Ben Kenobi asks Han if he has a fast ship. Han Solo acts incredulous and then replies that the ship made the Kessel Run in less than twelve parsecs. Now, we as viewers would be left to conclude that parsecs are a unit of time, because that's the metric we expect when we ask if things are fast. However, it turns out parsecs are a unit of distance! So Han's response makes no sense. (Of course, nerds have taken this fight to a whole other level. Sadly the idea that George Lucas is not great at dialogue is pretty easy to believe.) He used a stat in the incorrect context, and that's bad.

Now back to the "Wins Produced and Win Shares aren't good at predicting" point. Why are you using them that way? No, seriously. A buzzword to explain the analytics "movement" (I often feel we're moving backward, but I guess that counts as moving) is the "advanced stats movement." And yet, many articles, blog posts, and forum posts start by fundamentally misusing stats and then criticizing them! That's not advanced! That's just silly, sorry. One more example: imagine I ask you, "How did the Warriors do against the Lakers last night?" and your reply is, "I think they'll win 73 games, make the finals, and lose in game 7 to the Cavs." Congrats, your prediction was awesome. Did you answer the question? No!

Now there is a point about "predictive power" that we should discuss.


Kat's got skills.

Something that does come up with regard to Wins Produced and other stats is year-to-year consistency. Indeed, a player's Adjusted Production per 48, which doesn't factor in the position adjustment, has a correlation of about 0.7 from year to year. The reason this is important is that we can use stats to explain what happened, as Wins Produced and Win Shares do (if you get nothing else from this article, get that they are descriptive stats, please!), but we don't know for sure if what they record is skill. In other sports, like football and baseball, we can also use stats to explain exactly what happened. However, those stats are not as consistent season to season, which means they are less likely to be a result of skill and more likely to reflect other factors. So the consistency is not about using the metrics as a predictive stat. Rather, it's about asking if we think the stat reflects a skill or is just a recording of what happened.
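For anyone who wants to see what a year-to-year consistency check looks like, here is a rough sketch. It pairs each player's value in one season with the same player's value in the next season and computes a plain Pearson correlation. The player labels, the numbers, and the function names are invented for illustration; this is not the data or code behind the 0.7 figure.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def year_to_year_correlation(year_one, year_two):
    """Correlate a stat across two seasons, using only players in both."""
    players = sorted(set(year_one) & set(year_two))
    return pearson(
        [year_one[p] for p in players],
        [year_two[p] for p in players],
    )


# Made-up per-48 production values for made-up players.
season_one = {"A": 0.30, "B": 0.45, "C": 0.25, "D": 0.60, "E": 0.38}
season_two = {"A": 0.33, "B": 0.40, "C": 0.31, "D": 0.52, "E": 0.35}

print(year_to_year_correlation(season_one, season_two))
```

A high value says players tend to keep their relative standing from one season to the next, which is what you'd expect if the stat is capturing skill. It is a check on the stat, not a forecast of any one player's next season.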

Dave Berri et al. published their work on Wins Produced in two books: "The Wages of Wins" and "Stumbling on Wins." And in an RTFM moment, I'll note that in these books the authors noted that age, coaching, and injury all impact player performance. Indeed, as noted, the position a player plays also matters, something people in baseball have understood for years with the WAR metric. There's been some fun preliminary research on the impact of where players play (Utah and Denver have a huge home court advantage, which may boost performance), and defense may vary widely based on team schemes. If you're trying to predict a player's future performance, the same people that gave you Wins Produced gave you a multitude of other factors that you should include! In short, taking Wins Produced and using it -- on its own -- as a predictive metric doesn't make sense, especially given that the research that produced it doesn't support using it that way!

Final Rant

I read an interesting article today that had a fantastic point - "The internet is not a classroom." There are a ton of articles, blog posts, etc. out there with people doing analysis. However, this analysis isn't always peer-reviewed or necessarily done by experts. And the predictive vs. descriptive stats issue is an example of a flat-out incorrect piece of analysis that has been around in the "advanced stats movement" as long as I've been blogging. The "use other metrics to explain my metric" test is from a paper that was so bad it was never published in any of the economics journals it was submitted to. Yet, there it was being used in the middle of the write-up on Dredge. I hope this blog post cleared up a common issue. I'm sure it didn't for many. But the bigger point I'd like to make is that stats and analysis take training and work. I understand the joy of being able to dig into the numbers. I understand the elation of posting something online. But all I'd like to caution is this: if you're doing "advanced stats", maybe ask if you can do basic stats, and also ask where you got your training. Because I can't agree more: the internet is not a classroom.

Wins Produced explanation here -

Win Shares explanation here -