Arturo avatar

The Geeks at Sloan, Part 3: Lies, Damn Lies, and Statistics

For the second year in a row, the Box Score Geeks went to Sloan! Make sure you check out part 1 and part 2 too!

Before we finish off the rest of day one, let's talk a bit about the papers. I will say there were some interesting ideas being presented, but there were some serious issues as well.

The Three Dimensions of Rebounding

As written, I thought all the papers had some flaws in terms of how they presented their information (I found out later this was due to the limitations imposed by their non-disclosure agreements). However, the Second Spectrum team put in one of the best live presentations I've seen in a long time. They owned the room. I've seen the kind of toys they're playing with and they truly are at the forefront of analytics. My quote at the time was: "This rebounding paper is basketball geek nerd porn". I stand by it.

Pointwise

This paper was a very good idea with some serious flaws. The math is innovative and I look forward to reading the full academic paper that is forthcoming. The transactions are not actually properly valued, though. Value splits are not being done by position. No adjustment is done for a player's team or for further transactions down the line. To give a nice clean example, Kevin Love is severely penalized every single time he has to reset the offense by passing back to Rubio because of the immediate reduction in expected value, even though it's the correct decision in the long term. The key point is I feel they presented a flawed paper that wasn't quite ready in an attempt to build some publicity.
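
To make the Kevin Love example concrete, here is a toy sketch in Python. Every number in it is invented for illustration, and the function name `immediate_credit` is hypothetical; none of it comes from the Pointwise paper. It only shows how crediting a player with the instantaneous change in expected points can punish a pass that leaves the possession better off a few seconds later.

```python
# Toy illustration only: all values are made up, not taken from the paper.

def immediate_credit(ev_before, ev_after):
    """Credit a player with the instantaneous change in expected points."""
    return ev_after - ev_before

# Hypothetical possession: Love is stuck in a stalled set and passes
# back to Rubio to reset the offense.
ev_stalled_with_love = 0.95    # expected points just before the pass
ev_right_after_reset = 0.90    # expected points the instant Rubio catches it
ev_after_new_set = 1.10        # expected points once the reset play develops
ev_if_love_forces_shot = 0.80  # expected points if Love shoots instead

# What an immediate-value scheme would charge Love for the pass.
pass_credit = immediate_credit(ev_stalled_with_love, ev_right_after_reset)

# What the pass is worth compared with the alternative, a few seconds later.
downstream_gain = ev_after_new_set - ev_if_love_forces_shot

print(f"immediate credit for the reset pass: {pass_credit:+.2f}")
print(f"gain versus forcing the shot:        {downstream_gain:+.2f}")
```

With these made-up values the pass is scored as a negative play even though it beats the alternative, which is the gap a downstream adjustment would have to close.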

Hot Hand

Everyone is claiming that the authors proved the existence of the Hot Hand. They didn't. They proved the existence of a hot hand bias. If you read carefully (or ask the authors – like I did), you'll see that the data reveals that both the offensive player and the defense act as if they believe in the hot hand and adjust their behavior accordingly. The offensive player takes harder shots (the "heat check") and the defense tightens up on the "hot" player. This causes the offensive player's efficiency to correct back to the mean (in fact, to slightly below it). So pursuing the hot hand is a terrible idea based on the results (which lines up with every other study done on the subject). I'm annoyed that we're going to have to hear talking heads talk about the hot hand again.

Recognizing On-Ball Screens

This was an interesting bit of work. The author was able to build a decently successful bit of code to identify on-ball screens in the SportVU data using machine learning. The best review I can give him is that the Spurs' rep in the room made sure to talk to him immediately after the presentation.

Data Driven Method for In-game Decision Making in MLB

Baseball's version of the fourth down bot. The authors built a model that is less biased than managers and sees better results at identifying when to pull pitchers and when to leave them in. Somebody should buy this up fairly quickly.

Ball/Strike Bias

This was the paper that has the most potential value for any sport. By identifying a significant, predictable bias in the calling of balls and strikes by umpires, the authors have the beginnings of a tool that teams could use to exploit those biases and significantly impact a team's win-loss percentage. Hell, it's also an argument for using PITCHf/x to automate balls and strikes.

Home/Away Formation Differences in Soccer

The authors identified a significant behavioral difference in formations and pitch area locations between home and away games, but they were too quick to dismiss the role that referees play in this. How a game is called can significantly impact how aggressive or passive teams are on the pitch. I would want to see more research here.

Can't Buy Much Love

This paper made me angry. It made me very, very angry. We are eight years removed from the publication of Wages of Wins by Dave Berri, Martin Schmidt, and Stacey Brook. To present this paper as novel and groundbreaking is disingenuous at best and unethical at worst. The conference needs to have some academic vetting, because I can't imagine that any competent economics panel would have accepted such a clearly unoriginal paper.

In the future, I would like the conference to really look hard at making improvements to the research paper competition. Academic review is a must. I would also consider separating the proprietary data track – for analytics companies like Second Spectrum – from the publicly available data track. Alternatively, I would set up a competition where research proposals are submitted to the paper committee and access is granted to the proprietary data (under non-disclosure agreements) for the purposes of writing the papers for the competition. This would make the competition much stronger than it is now and improve the results.

As for the rest of day one, the most striking thing for me was Phil Jackson revealing how thorough his data collection and processing operation was when he was with the Bulls (he was very quick to credit Tex Winter with being a massive stat head). During halftime, the Bulls showed plays from the first half on their computers using video they had captured and processed from the live broadcast. This was back when everyone had modems, so this is damn impressive. Less impressive was learning that Dennis Rodman was only their seventh choice to replace Horace Grant at power forward.

Some additional thoughts from that first day:

Bryan "ColangeLOL", who signed Andrea "L'ancora" Bargnani's current deal, was actually being asked about analytics. HA! Colangelo also said that he's still waiting for a team to break through and win an NBA title with the three. I guess I imagined the last three Finals....

Stan Van Gundy trusts his own data but is worried about the quality of the SportVU data. Of course, he said this more humorously, but it's a valid concern. Stan threw Earl Weaver some love for coaching moneyball before moneyball was cool. He promoted abolishing the draft. He also argued that playing your best players more often helps you win. To quote him directly: "I think a lot of these minute restrictions are bullshit". I like the cut of his jib. SVG for NBA commissioner!

There was a lot of talk around getting players to wear monitoring equipment on the court. That's all well and good, except that wearing monitoring equipment is very likely to impact a player's free agent value. As a player, I would have a hard time sharing private medical data that could hurt my financial interests.

Teams want people who are excellent at analytics, have good basketball experience, and who can work on the cheap. That's like saying you want to sign LeBron with the MLE. Good luck with that!

Coming tomorrow: Part 4: The Singularity.

Good review, my takes on the ones I saw:
I agree the three dimensions was the best of the bunch, wish they would have gone into more detail on the value of the position aspect. But I suppose that's a 'further research' item.

Pointwise, I completely agree. Seemed a bit more like a publicity event for their forthcoming LLC launch than a ready paper. There could be some great applications of their techniques in the future though.

Hot Hand, the behavioral responses to the perception of the hot hand were far more interesting than the whole hot hand aspect.

Ball/Strike Bias: A baseball guy at the conference claimed this was pretty unoriginal and has been done before without the SportVU patina. It was new to me, however, at least as a research paper, though the concept is pretty intuitive. I suspect refs in the NBA are less likely to call a sixth foul on a star or a foul in a close game in the final minute.


Thanks - I'd been looking forward to your review of the papers.

I wonder if one way for researchers to access the SportVU data would be to directly approach the teams who have bought it. I think they should have some interest in their team's data being studied by a bunch of smart people. (A gold mining company got famous for putting all their geological samples - traditionally tightly guarded - online so that interested amateurs could help them find gold - they did.) You could probably all become contractors to the team, to fall under their existing license. The terms of the contract and publishing rights probably become interesting to negotiate.
I would minute allocate because it reduces the likelihood of injury but if I'm a bad team, I guess my only choice is to run my players into the ground.
In seriousness, your best players should play the most but to what extent?
Example: LeBron should be playing no more than 34-36 minutes.
> I would minute allocate because it reduces the likelihood of injury ...

A pretty obvious application of the SportVU data is to infer the aerobic limit and anaerobic reserves of the various players. That should suggest a baseline scheduling strategy for optimizing player performance. Real-time tracking should allow for even better optimization against work load. (Minutes on the floor are not created equal, just as players are not created equal.) Optimal scheduling and time-out calling is one of the first things that I wanted to look at with the data.
Nate,
That would be interesting to see but also with all this information, the randomness of injury is still there.
