Why Everyone Secretly Hates Data

At Sloan, one of the most interesting guests was Prasad Setty, who is responsible for building models that evaluate software engineering talent at Google. He is a Billy Beane of the software world! You'd assume that, unlike sports clubs, software companies would be more open to data. And that's where you'd be wrong! Here is Prasad's recap of a story from Google:

I lead a group called people analytics and the intent behind our formation was: we make thousands of people decisions as an organization -- every large organization does. We are now at 40,000 people. And we wanted to make sure that we had the same level of rigor in making people decisions as we do in making our business decisions.
And so, when I started I said ‘All these software engineers out there, they are working on these very complex algorithms!’ -- that return all the results you folks get when you type something into the search engine. So I said ‘These folks must be used to algorithmic models to making people decisions as well.’ So I said ‘All people decisions should be made by data and analytics.’
One of the processes that we take very very seriously at Google is how we promote our software engineers -- how do you get from one level to the next level, right? And it’s a very intense process. What we do right now is we fly in hundreds of our senior-most engineers from around the world. Twice a year we form these committees so that it’s not just the engineers’ managers who are making these decisions but these independent committees, so that we remove bias. But it’s very intensive -- we fly hundreds of engineers in, and they form these committees, and they’re reviewing these promotion packets, uh, twice a year!
And we said ‘Can we make this a simpler format, can we reduce the load?’ So my team built this really complex algorithm based on empirical data. And we said ‘For 30% of these promotion cases, with 95% accuracy, we can come up with the same decision that the committees would have made.’ So we went to the engineers very excited. We told them ‘We can reduce your load by 30% and over time our models will keep getting better and better, right? And we’ll reduce your load and now we can make promotion decisions without human intervention!’ We thought they’d be excited. They hated it. They hated it. They said ‘We don’t want to hide behind a black box with something so serious as people decisions’ right? So that changed the nature of what I thought my team was all about. I don’t think anymore it’s about replacing humans to make decisions. Instead it is to make sure that the human decision makers that we have are free of bias. And so now all the analytics that we do at Google is now shaped towards ‘How do we reduce bias from the decision makers’ eyes?’

When it comes to using data, we find that software engineers can be as stubborn as the "old boy networks" that run professional sports teams! And the reason why is very simple. People just want data to augment their expertise. In sports I find many people love to use the numbers, but only as a supporting argument for a foregone conclusion. Let me give you a classic problem, phrased two ways:

  1. What makes Kobe such a great player?
  2. Is Kobe a great player?

In the first question, I take the role of expert. I know the answer! The data just exists to help my point. And that's why people will find lists of "advanced stats" like PER that show Kobe is great. They already knew Kobe was great; they just needed a number that lined up with their assessment. And this often means they don't actually understand the numbers. The second question is much different. I cease to be the expert. I now let the data and the process answer the question. And it seems universal, among sports teams and mathematically inclined software engineers alike, to distrust data that might subvert our own expertise!

I was a bit sad to hear Prasad's final thought about how he changed his approach. You see, the biggest bias we have is actually our belief in our own abilities! One of the hardest questions to ask ourselves is "Am I wrong about this?" And that's where the data helps tremendously! Sure, we should make sure to understand what the data is doing and communicate it properly. But to simply hate the data because we think a problem is important and that we're the experts? Well, that should be unforgivable in all fields. As Sloan shows, though, it's much more common than we'd like.

-Dre
