
The Boxscore Geeks Show: Big Trouble with Little Data


Video Show


Andres Alvarez (@nerdnumbers)

Produced by Brian Foster (@boxscorebrian)

This week's poll

Show Notes

The Olympics/Talent Evaluation

Patrick's piece on Mason Plumlee vs. DeMarcus Cousins and Zach Lowe's piece on evaluating bigs lead to a discussion about valuable skills and player evaluation.

Evaluating things like Chris Bosh's midrange game as "good" is iffy. Yes, he's above average at it, but it's not necessarily better than average threes or layups.

My general problem with modern advanced stats is that we can zoom in on players, but don't understand the magnitude of their skill/deficiency.

Forty-three players attempted over four threes a game and made more than 35% of them last season!

The Pacers and 538

Paul George is a rare talent in that he's a good scorer but doesn't quite get the respect I'd expect.

Case in point, this 538 post on it!

Nate Silver's recent piece on Paul George and injuries is interesting. That said, the stats and methods Nate uses don't seem cohesive or vetted. In short, I expect more.

Richard Feynman sums up a lot of the issues found in bad studies.

My issues with RPM.


Patrick asks if Philadelphia is the first team to "tank correctly" in the modern draft.

Devin Dignam on how tanking used to work in the NBA.

I love the logic that the Wizards have been tanking for three decades, and the solution is to tank some more!

The reason the Wizards are "bad at tanking" is that they can't give up on top draft picks when those picks don't turn into stars.

Becky Hammon and Good Management

I'm a little skeptical about how the mainstream media will scrutinize Becky Hammon. I've written about this before. It's pessimistic and depressing, but it's also possible this is tokenism, not the start of a change.

Kevin Draper has written about how people "in charge" in the NBA may have less control than we think.

Dan Pink has a great TED talk on what motivates people.

Tangent - Drinking Game

A few things happen regularly on the show. Feel free to tweet us more suggestions, but here are a few to start a Boxscore Geeks drinking game:

  • Every time Dre's dogs shake their collars or bark, take a drink.
  • Every time we mention Kevin Love, take a drink.

Shout Outs

Aaron Montgomery for this great tweet:

Taye Diggs (@tayediggs) for making my day by following me on Twitter.


Intro Music - Test Drive by Zapac.

Outro Music - Math by Supernova.

NBAstuffer has some stats on referees.
Racial overtones are also an aspect of the old boys' club.
Looks like George Hill produced fewer wins than Paul George last year, but actually had a better WP48. He had a nice rebounding year for a guard and posted 56.3% TS. Sorry for selling you short on the show, Mr. Hill!
James' other weakness: decision-making, i.e., turnovers.
From:

" ... Many of us at Boxscore Geeks and Wages of Wins have more academic backgrounds. That means we're pretty big fans of the scientific method. ..."

You mean the one where you take a theory that makes predictions and then compare it to the outcome of an experiment? It seems like Silver's made some predictions about player minutes on the floor and wins. So, isn't the scientific thing to see whether those predictions come true? (And, since the minute allocations are liable to be inaccurate, we could also look at a minute-corrected retrodiction at the end of the season.)

Similarly, if you want to prove WP against RPM - why not do the scientific thing: Make WP and RPM based predictions about the next season, and see how the methods compare at the end of the season. Heck, you can 'shotgun' in the other player ratings that people have published.
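For what it's worth, the head-to-head test proposed above is simple to set up. A minimal sketch scoring two sets of preseason win forecasts by mean absolute error; all team counts, forecasts, and win totals below are invented placeholders, not real projections:

```python
# Hypothetical comparison of two metrics' preseason win forecasts,
# scored at season's end by mean absolute error (lower is better).
# All numbers are invented for illustration.
def mean_abs_error(predicted, actual):
    """Average absolute forecast miss across teams."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

actual_wins = [60, 48, 33]    # placeholder season results
wp_forecast = [57, 50, 30]    # placeholder WP-based forecasts
rpm_forecast = [62, 41, 38]   # placeholder RPM-based forecasts

print(mean_abs_error(wp_forecast, actual_wins))
print(mean_abs_error(rpm_forecast, actual_wins))
```

The same scaffolding extends to "shotgunning" in any other published rating: one forecast list per metric, one error score per metric at season's end.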

I mean, RPM foundationally cannot be trusted because it's a black box metric. I would be entirely unconvinced by someone saying that team RPM has such-and-such an r^2 with pace-adjusted point differential or whatever; how am I supposed to be sure that you're not retroactively cooking the books if I'm not allowed to see the math?
Neil Paine has published more than a few models where he has simply added the error term back in. This will, of course, make it a pretty good predictor of the future.

In other words, winning is correlated with winning. Yay.

What's interesting is that I think 65 wins might be very close. I've yet to do the math myself. This is just because the amount by which he undervalues Love happens to be pretty close to the amount by which he overvalues Waiters.

I'm excited to see this play-by-play data. Waiters must be AMAZING at defense to be a par player given how horrible he is at...well, everything else.
"Neil Paine has published more than a few models where he has simply added the error term back in." Exactly! It's those sorts of shenanigans that are wholly outside the institutional norms of science, but which are unfortunately quite prevalent in the online basketball community, that make me skeptical of black-box metrics.

Look, if you've done some really cool homebrew research and you'd like to impress actual NBA decision-makers with it, there's no law that says you have to post it on the internet. If you don't care to demonstrate how you performed your analysis to the basketball-scientific community, then just don't bother us with numbers for which we have no context. Hang onto it for an interview with a team or something.

Also, I'm pretty bullish on David Blatt; everything I hear about this guy suggests to me that he's not going to tolerate Waiters' grossly inefficient style of play. And it does seem that the Cavs have pulled off the Love trade! We'll know for sure in two weeks or so.
>I mean, RPM foundationally cannot be trusted because it's a
> black box metric.

2013-2014 RPM numbers are publicly posted.

We can use the numbers that are posted as of today, and see how well they predict the coming season.

They could cook the books on numbers they publish in the future, though, or switch to a different methodology.
I'm not talking about the outputs of the stat, I'm talking about the calculation of the stat itself. We don't know how the stat is calculated, so we don't know if we can trust the numbers presented as reflecting RPM.
Having a look at last season's numbers, it looks like the defensive adjustment for guards (DRPM) massively changes their relative ratings between RPM and WP/PoP. Eyeballing it, I am guessing that for guards ORPM and WP/PoP are pretty close and for bigs RPM and WP/PoP are pretty close (maybe defense shows up better in the box score stats for bigs?) Be interested to see an analysis.
I'm pretty confident that RPM will give attractive r-squared values, but I'd bet that even if you could open up the black box you wouldn't be able to learn anything from it. If a model isn't constructed scientifically it can't tell you why it predicts what it does, only what its predictions are.

If you want to know why RPM thinks a given player is good, the answer is very likely 'because the model says so', end of story.

I'd be surprised if RPM actually explains wins well at all, given that its correlation with WP is about .51 (RPM stats are listed FIRSTNAME LASTNAME while WP48 numbers on this site are given LASTNAME, FIRSTNAME, which made it a major pain to organize the spreadsheet; as a consequence I only looked at 20 players, but the correlation is significant [p=.01]). Something so poorly correlated with WP48 is unlikely to do a satisfactory job explaining wins.
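As an aside, the name-format mismatch is easy to script around. A minimal sketch, with made-up placeholder ratings rather than real RPM/WP48 values:

```python
# Hypothetical sketch: reconcile "LASTNAME, FIRSTNAME" vs. "FIRSTNAME LASTNAME"
# keys so two rating lists can be joined, then compute their correlation.
# The ratings below are invented placeholders, not real RPM/WP48 numbers.
from math import sqrt

def normalize(name):
    """Turn 'Lastname, Firstname' into 'Firstname Lastname'."""
    if "," in name:
        last, first = [part.strip() for part in name.split(",", 1)]
        return f"{first} {last}"
    return name.strip()

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

rpm = {"Kevin Love": 6.9, "Dirk Nowitzki": 4.8, "Chris Paul": 7.3}
wp48 = {"Love, Kevin": 0.31, "Nowitzki, Dirk": 0.21, "Paul, Chris": 0.33}

joined = [(rpm[normalize(k)], v) for k, v in wp48.items()]
r = pearson([a for a, _ in joined], [b for _, b in joined])
print(r)
```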
The problem with Nate's article is that he says it uses play-by-play data to "better account" for defense.

But how does it account for this?

Is it Neil Paine deciding that he thinks certain events recorded in play-by-play data are important?

Because if the events recorded aren't correlated with wins, why do we think they are helpful in explaining performance?

So, in essence, they are going to have to show me WHY these play-by-play data MATTER, and given Neil's track record, I am skeptical. Especially because it is taking them so long to simply publish a methodology they have been using for weeks or months.
"So, in essence, they are going to have to show me WHY these play-by-play data MATTER"

Craig Wright, a baseball statistician, talked about the distinction between statistical analysis and science as such; he remarked that "statistical analysis is too often taken for being science itself rather than a tool of science." This distinction is often lost on people without much formal background in statistics or science.

In the online basketball community, people start throwing numbers around and forget to examine the actual significance of the statistics they come up with in relation to actual wins in the NBA (or anything else); without examining the effectiveness of the statistical measures one employs in interpreting the NBA, one cannot develop a scientific understanding of basketball.

Say it with me:
Oh yes, and these other metrics that are better at predicting future wins also predicted that the Hawks would give the Pacers a hard time in last year's first-round matchup, right? They also predicted that Wade would commit a lot of turnovers against San Antonio, and that in the 2011 series against the Mavs, LeBron would drive to the basket less frequently and instead shoot those midrange jumpers.

Wins Produced was not designed to be a predictive metric. It was designed to be an explanatory metric. Its value as a predictive metric comes from that explanatory power; it can explain its predictions.

If all you care about is prediction you would construct a very different model. The best possible predictive model is, in fact, a weighted average of all models - and clearly it would have little explanatory power.

This is not a subtle difference.
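The "weighted average of all models" point can be sketched in a few lines; the forecasts and weights here are invented for illustration, and a real ensemble would fit its weights on held-out seasons:

```python
# Toy illustration of model averaging: each "model" forecasts a team's
# win total, and a weighted average of forecasts is itself a forecast.
# Forecasts and weights are invented placeholders; a real ensemble
# would tune the weights on past seasons' accuracy.
def ensemble(predictions, weights):
    """Weighted average of several models' forecasts."""
    total = sum(weights)
    return sum(p * w for p, w in zip(predictions, weights)) / total

forecasts = [58.0, 61.0, 54.0]   # e.g. WP-based, RPM-based, Vegas line
weights = [1.0, 1.0, 2.0]        # hypothetical: trust the market most

print(ensemble(forecasts, weights))  # blends toward the heavier weight
```

Notice the result answers "how many wins?" but carries no story about *why*: the blend cannot tell you which player actions drove the number, which is exactly the explanatory-vs-predictive distinction above.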
"Neil Paine has published more than a few models where he has simply added the error term back in."

I apologize if this is a stupid question; I'm not a statistics expert, though I'm also not a slouch. I don't entirely understand what you mean when you say that he added the error term back in. Do you mean that he included the residuals as a covariate in the model? Error is just the deviation between observed and predicted values (or rather, that's the residual, which is how we estimate the error), so I don't see how error gets added back into the model itself.

Regardless of what exactly you mean by it, on a separate level how could one ever justify doing that? Wouldn't the end result simply be a saturated model, like trying to fit a 5th-order polynomial to 6 data points?
RyNye - basically you take the residual from a player's performance last year and add it back in to your prediction for next year.

If your model is autocorrelated (due to certain flavors of omitted variables), then adding the t-1 residual to your prediction at t will improve the accuracy of your prediction. (Strictly speaking, you want to add some fraction of the error term back in, with the exact fraction depending on the ratio of the autocorrelated 'signal' to the uncorrelated noise.)

You justify it by it improving predictive power. There are some 'intangibles' that affect each player's performance, and this adds those back in; as long as you're getting more improvement from adding those intangibles than you're losing from the noise, the model will predict better.
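A minimal sketch of that add-back, assuming a hypothetical signal fraction (here 0.5); all player numbers are invented:

```python
# Sketch of the residual "add-back" described above: if a model's errors
# persist year to year (autocorrelation), adding a fraction of last
# season's residual to this season's forecast can improve accuracy.
# lambda_ is an assumed signal fraction; all numbers are invented.
def adjusted_forecast(model_prediction, last_year_residual, lambda_=0.5):
    """Model forecast plus a fraction of last year's unexplained residual."""
    return model_prediction + lambda_ * last_year_residual

# A player the model pegged at 8.0 wins last year, who actually produced 10.0:
residual = 10.0 - 8.0
print(adjusted_forecast(9.0, residual))  # 10.0 with lambda_=0.5
```

The catch, as the thread notes, is that the added fraction is pure "the model missed by this much last time": it boosts prediction without saying anything about what the player actually did.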

""Why don't people link to their claims?"
Because I assume Nate can use Google. (And I was lazy.)"

- In college my friends used to make fun of a device I once used in random discussions about an article I'd read. Since my memory was terrible and none of us had laptops, the discussion would normally go: 1) me: "this article I read argues my point"; 2) them: "who wrote it and what was it called?"; 3) me: "I don't remember, but my point stands regardless of my quote from 'authority'"; 4) them: "ha ha ha ha... well I read an article too!" More mocking.

If you have a good point to make just make it on its own merits. If you quote from an external source provide the evidence. Otherwise you are making the same mistake that I did. Claiming that I have a study that proves something means nothing if you don't provide the study. In fact that is why this site and others don't like black box predictive measures.

""If all you care about is prediction you would construct a very different model."
Not true. WP purports to measure production, and Berri has also said 1,000X that NBA players are consistent. If both are true, then WP should make excellent predictions. But it makes poor predictions."

-What Wins Produced is trying to measure is what individual contributions add to... wins. What plus/minus in general measures is how good a team is; adding in most of the "adjustments" doesn't really change that. I think many of the Wages of Wins guys would agree that plus/minus is a very good predictor of wins, but they would probably also say that it doesn't explain WHY the wins are happening. Now, I have some problems with Wins Produced, but I respect what they are doing and they are very open about how it works. Please don't compare apples to oranges, since any plus/minus-based system tells you NOTHING about what an individual player should work on (except get better at plus/minus?).

"The only possible way to know if a metric is truly isolating individuals' productivity from their teammates is to predict future wins. This is ultimately the test that tells us if a metric works, even if you want to use it to explain current wins. "

-Wins Produced is meant to measure production that can lead to wins. Once again, it is truly an attempt to measure a single player in a team-game setting and tell you what that player is doing that contributes to wins. Plus/minus in any form would probably always have greater predictive power, but it isn't adding any understanding of WHY.

If you just want predictive power, I would probably go with plus/minus in any form. If you want the ability to explain why someone is better on the court, though, none of the plus/minus derivatives help a lick. So go with what you want. I like to know what is actually happening on the court to produce wins. Is WP perfect? No, but at least it gives me something concrete. I also like Coach Nick, and I loved Sebastian Pruiti from the get-go until the Thunder hired him away, because they could break down individual plays. You may prefer someone like Mark Jackson talking about heart, but I'd prefer a detailed breakdown every day. Jackson is right in a certain regard: some players are just capable of working harder, and that might be a better "predictive" detail, but I also want to know the specifics about the why on an individual basis. If that isn't for you, that's fine, but know what you are arguing (prediction vs. description).
I agree with you, but you are comparing descriptive metrics with predictive ones. That is my only problem. If I were betting, I would completely go with you. If I were picking an individual player, I *might* go with you. But if I were trying to understand basketball better for the long run and wanted to know what my individual players should work on, plus/minus is useless. None of that is to say I think WP is the best predictive method, but it is searching for something I don't think you care about, whereas predictive methodologies don't really try to advance descriptive methods. It seems you are saying that if descriptive methodologies don't work as well as predictive methodologies, then the descriptive methodologies are bad. I disagree: they are focused on two different things, both of which are important. I just think comparing the two is a completely misguided endeavor since they seek different results. In the ideal world the descriptive methods will overtake the current predictive methods, but we are not there yet, and the descriptive methods ARE important in the long run to advancing the goal of better predictive methods.
Wins produced doesn't predict well compared to what? LVH? There is a vast literature on the efficacy of betting markets at predicting outcomes - they are not easy to beat with a predictive model, let alone a descriptive one.

What is wins produced predicting anyway? That next year's performance (on a player level) will look like last year's, that is, that it's consistent? That seems to hold up pretty well to me.
"I just think comparing the two is a completely misguided endeavor since they seek different results."

Well, compare and contrast the two. They are for different purposes, after all, but it's still useful to know just how much predictive power a descriptive model has, as that 'stuff we can explain with what we can quantify' is more or less the bar any other, less rigorous model needs to meet to be worth anything.

"In the ideal world the descriptive methods will overtake the current predictive methods..."

I guess. You'd be able to build even better predictive models upon those that far outclassed the improved descriptive models though.

Outside of the trivial case where descriptives explain 100% of the variance I cannot imagine a situation where a rigorous descriptive model can predict as well as a properly tuned predictive model; for any given incomplete descriptive model it is trivial to create a better predictive model simply by averaging over the local neighborhood of related models.

...but perhaps that's too fine a point on an argument that isn't being grokked in the first place.
This discussion is unproductive. Guymolyneux is a garden-variety innumerate; no good can come from our sincere efforts to explain the science of basketball to a man more interested in invoking terms he lacks the (innate) ability to understand than in the falsifiable interpretation of NBA basketball. We can't be everybody's stats professor. This whole thread has devolved into a wage of time.

God I hate posting from my iPhone; so much goes unsaid. In this case though, it's probably for the better.
Waste of time*

Edit button right fucking now
New rule: comments that make claims about what WP48 does or does not do are going to just get deleted.

The only response I have to someone that thinks the team adjustment is the same as adding the error term back in is to RTFM, because that isn't remotely close to true.

At some point, arguing with people that do not understand statistical analysis, and keep parroting back the same basic mistakes just isn't fun.

If you are asking what use a model is if it isn't predictive, then I'll counter with "What use is a model that is predictive but not explanatory?"

You can predict most teams' 2014-15 results pretty damn close just by applying the 2013/14 point differential (i.e. only the teams that have made major roster overhauls will differ much). Yay. What have you learned?
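For reference, that point-differential shortcut is a one-liner. The roughly-2.5-wins-per-point coefficient below is a common rule of thumb, not a fitted value, and the sample differential is invented:

```python
# Back-of-the-envelope version of "predict next season from point
# differential": a common rule of thumb maps each point of average
# per-game differential to roughly 2.5 wins over a .500 baseline.
# The coefficient and the sample number are illustrative, not fitted.
def expected_wins(avg_point_diff, games=82, wins_per_point=2.5):
    """Expected wins from average per-game point differential."""
    return games / 2 + wins_per_point * avg_point_diff

print(expected_wins(4.0))   # a +4.0 team projects to about 51 wins
```

Which is exactly the point above: the arithmetic forecasts well, but it teaches you nothing about which players produced the differential.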
It should be obvious that point differential correlates with wins. That is like saying a coach's win percentage correlates with how many times he wins and loses. In almost every post here talking about teams and not players, point differential, offensive efficiency, and defensive efficiency are used.

It should also be intuitive that a player moving rosters will produce wins at a different rate. Why should this be news to anyone?

That being said, I am going to continue to say it'd be nice if there were a placeholder for deleted posts with the reason written, just for continuity.
Shilelea, that is a good suggestion. I've added it to our Trello to-do board.
> If you are asking what use a model is if it isn't predictive, then I'll
> counter with "What use is a model that is predictive but not
> explanatory?"

The predictive model is the right choice to estimate the future impact of decisions. (And, as long as standards of repeatability are met, it doesn't matter how crazy the methodology is.)

Dre brought up the scientific method, and specifically Feynman. For Feynman, the scientific method is all about prediction. In Feynman's words: "If it disagrees with experiment, it's wrong."

So, if we have two models, and one involves some strange ideas, but consistently makes more accurate predictions than the other, which one do we say is more correct? Moreover, we can often extract explanatory value from predictive models using mathematical techniques.
Trying to predict what?

If the goal is to predict wins, then a point differential method will always work best, because that is what wins are: a positive point differential at the end of the game.

If the goal is to measure a player's portion of the win via the values we place on various in-game actions, then a system based on what a player does in the game is more valuable.

The problem I have with the naysayers concerning WP's predictive ability is that the variance in player productivity they note is often related to usage patterns from different organizations. WP seems to do an amazing job repeating itself when a player plays a constant role. When the player steps outside his role, WP fluctuates. This is not a condemning thing for WP or any other metric. It simply means that different systems use players differently, and those different usage options create varying degrees of productivity. And since teams are so poor at playing the proper players properly (think JJ Barea), there generally is no observable invisible hand that moves production toward a more efficient state within an organization.
In Zach Lowe's article, he called Al-Farouq Aminu one of the best wing rebounders in the league. That was his first mention of rebounding, at the 1,432nd word. Lowe uses the word "rebounding" only once more, when he mentions that Kenneth Faried kills the glass. Those are the only mentions of rebounding in his mammoth article on the value of big men.

Yeah, rebounding is often undervalued because it's not as exciting or noticeable as scoring, steals, or blocks. However, everyone from the analytically minded to the analytics haters agrees that rebounding is a valuable and consistent skill. And almost everyone agrees on how to measure it: rebound rate refines the stat a little, but it tells pretty much the same story as rebounds per 48. The SportVU guys suggested evaluating contested rebounding, but that made no sense.

I just don't understand why the contributions of great rebounders like Faried are so undervalued. I get (though don't agree with) why guys like Zach Lowe don't think highly of efficient shooters like Plumlee, Birdman, or, again, Faried, but I just don't understand the bias against rebounding!
I applaud Patrick's heroic decision to agree with me. Now it's been a week since the last post, let's see some more content. Attica! Attica! Et cetera.
