I, For One, Welcome Our New Data Overlords

By Josh Hemann

There has been plenty of lionizing of data lately (especially when it is big), but also an increase in demonizing it. A few weeks back Kenneth Cukier (a keynote speaker at this year's FICO World) and Viktor Mayer-Schönberger wrote an interesting article in MIT Technology Review titled The Dictatorship of Data, and it has this reproachful ending:

Big data will be a foundation for improving the drugs we take, the way we learn, and the actions of individuals. However, the risk is that its extraordinary powers may lure us to commit the sin of McNamara: to become so fixated on the data, and so obsessed with the power and promise it offers, that we fail to appreciate its inherent ability to mislead.

I like the warning about fixation and obsession, but what irked me was the projection onto data of an "inherent ability" to deceive.

People have a desperate need to boil complex situations down to single numbers that they can base decisions on. Such heuristics are highly useful in many settings, but we are seeing pushback against the increasingly broad application of this "measure everything" ethos, and I think rightly so.

But let's be clear: data does not have anything inherent in it except for one thing, which is us. It comes with our value system attached (what we choose to measure and analyze), and to the extent that it misleads, it is because we find it convenient to be misled, because we find it too difficult to make decisions for which we must bear responsibility. Instead, we often wrap pre-determined beliefs about the world in a patina of analytics, as if the data told us to do what we already wanted to do. Data gives us the freedom to excuse ourselves from empathy, from needing to connect with the life around us, and instead focus on counts, sums, averages, log-odds...

I appreciate Cukier and Mayer-Schönberger's point of view, but there is a real opportunity for introspection in these Big Data discussions and we'll miss it if we ascribe to data qualities that are really our own. 

And on a lighter note:



