I have been heartened in recent weeks as I follow the Big Data press and blogs. There is a noticeable shift from talking about technology to discussing the understanding and value the technology can bring. Of course, technology is still important – it provides the automation and the scale that allow data scientists to do their work of understanding and extracting value. But it’s the data scientists and their expertise that bring the real alchemy in this equation.
Case in point: Last week Steve Lohr wrote an article for the New York Times titled “Sure, Big Data Is Great. But So Is Intuition.” He points out that models are just simplifications, and that you must balance the degree of simplification against the limitations it imposes on how the model can be used. His worry is that models are often too simple and thus lead to simplistic conclusions. I share that worry.
Managing simplification is the basis of sound statistical methodology. (I use the term statistical loosely; I don’t mean to imply that only what purists call statistical method matters.) Technologies don’t teach sound methods. I was reminded of that just today as we were reviewing results based on some fairly sophisticated methods that allow us to apply ensemble models with various boosting parameters in the operationally important part of the score range. Something didn’t make sense, and it then became clear that some of the included variables were really surrogates for the performance variable – they encoded the very outcome the model was supposed to predict. Ignorance of the issue would lead you to believe you have a good model, when in fact it will perform poorly in practice and lead to bad decisions.
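The surrogate-variable trap is easy to reproduce. Here is a toy sketch (my own illustration, not the models from the review above): a synthetic `late_fee` field is recorded only after a default has occurred, so it is a surrogate for the outcome. In historical data it looks like a perfect predictor, while the legitimate signal looks merely decent.

```python
import random

random.seed(0)

def make_rows(n):
    """Synthetic loan records. 'late_fee' is only ever set after a
    default happens, so it leaks the outcome it seems to predict."""
    rows = []
    for _ in range(n):
        risk = random.random()                      # true underlying risk
        default = 1 if random.random() < risk else 0
        rows.append({"risk_score": risk,
                     "late_fee": default,           # surrogate of the outcome
                     "default": default})
    return rows

def accuracy(rows, feature, threshold):
    """Accuracy of a one-feature threshold rule on historical data."""
    correct = sum((r[feature] >= threshold) == bool(r["default"])
                  for r in rows)
    return correct / len(rows)

history = make_rows(2000)
leaky_acc = accuracy(history, "late_fee", 0.5)    # looks perfect: 1.0
honest_acc = accuracy(history, "risk_score", 0.5) # realistic: ~0.75
print(leaky_acc, honest_acc)
```

At scoring time the surrogate is unavailable – no fee has been charged before the default is observed – so the “perfect” model collapses in production, exactly the failure mode described above. The validation numbers alone won’t warn you; recognizing that `late_fee` cannot be known at decision time takes domain knowledge.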
Call it intuition or domain experience or good old-fashioned training. While it doesn’t carry a buzzy name like Big Data, the human factor continues to be the most important component in turning data into value.