A recent article in Information Age named 10 attributes of a data leader. Each of the experts quoted discussed just one attribute. In my case, it was separating hype from potential, which is critical when you're dealing with advanced analytics. Here’s an excerpt of what I wrote for that article:
"In the Big Data era, businesses have been told to save every scrap of data because it might turn out to be a valuable clue to customer understanding or business performance. While the cost of storing data has fallen dramatically, it doesn’t represent the true cost of data, which includes ensuring quality, maintaining freshness and mitigating the risk of a breach.
"Data leaders are equally discerning about the hype and potential of the analytics used on their data. They focus on the problems the business needs to solve, and investigate new analytic techniques that can make their data “talk” in increasingly valuable ways. They will be exploring artificial intelligence and machine learning, but won’t buy into the hype that machine learning plus Big Data will magically solve all their problems. Anyone can find patterns in data — data leaders do the work to ensure they’re finding meaningful, actionable patterns, not arbitrary correlations."
I agree with the other attributes listed in the article, but of course I have my own list for what a true data leader needs to do in an advanced analytics environment:
1. Understand the basics. The first rule of data is “garbage in = garbage out.” Ensure regular and formal data governance, knowing that there is one source of the data, and that the data must be regularly updated and refreshed. Data quality is #1.
2. Focus on problem definition. Find the essential data to solve the business problem; don’t fall into the Big Data hype cycle. Data leaders know that they are solving business problems and need to prescribe the data required to solve the problem, rather than hope that some Big Data exercise will help find nuggets of greatness. (As I noted in the Information Age article, this will be even more important under Europe’s GDPR regulation.)
3. Ensure that data is properly usable in terms of business contracts and client consent for use. Don’t misuse data or lose trust; use data only for the purposes that contracts and client consent allow.
4. Understand coverage. This means knowing where data is well represented and models can be developed, and where data anomalies can crop up or where assumptions based on data may fall down.
5. Maintain data as a valuable asset. Ensure clean, well-controlled, and permissible use. Data leaders understand that the best algorithm is constrained by the quality of the data; they keep their heads down and they ensure they treat data as a tremendous asset to the business.
6. Don’t be fooled by simple correlations. Instead, look for causation and demonstrate relationships around what you’re trying to understand. The pursuit of the highest standards of data quality and governance differentiates companies that will build industry-leading analytics.
7. Monitor the data for anomalies and statistical variations. Data leaders use technology (such as the auto-encoders I have discussed before) to look for data that might not be well represented in previous learnings. Monitor statistics and monitor data as a living entity to determine when models need to be respun or data quality revisited.
8. Don’t fall for Big Data hype. Understand that applications should have prescriptive data requirements, search for the best quality data, and set up proper rights-of-use terms, governance and monitoring. Don’t take a “Big Data will tell us” view. Most Big Data is a liability rather than an asset, and it’s easier to focus on extreme data quality for key data feeds and elements than for the totality of data stores.
9. Treat machine learning with caution. Data leaders look to be able to justify, explain and place value on each piece of data. Machine learning is touted as the cure for the malaise of Big Data. When machine learning finds patterns in data, data leaders make sure those patterns are not arbitrary correlations, and they look for causation.
10. Investigate data anomalies. Does a new pattern represent a new behavior, manipulation of data feeds for crime, or new insights to understand?
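To make the monitoring point in item 7 a little more concrete, here is a minimal sketch of one common way to check whether a data feed has drifted away from the data a model was built on. It uses the Population Stability Index (PSI), not the auto-encoder approach mentioned above, and everything in it is an illustrative assumption: the function names, the bin edges, the `floor` value and the 0.25 alert threshold (a frequently quoted rule of thumb, not a standard) are all hypothetical choices you would tune for your own data.

```python
import math
from typing import List, Sequence


def bin_fractions(values: Sequence[float], edges: Sequence[float]) -> List[float]:
    """Return the fraction of values falling in each bin.

    Bins are (-inf, e0], (e0, e1], ..., (e_last, +inf) for sorted edges.
    """
    counts = [0] * (len(edges) + 1)
    for v in values:
        # The bin index is the number of edges strictly below the value.
        idx = sum(1 for e in edges if v > e)
        counts[idx] += 1
    total = len(values)
    return [c / total for c in counts]


def population_stability_index(
    baseline: Sequence[float],
    current: Sequence[float],
    edges: Sequence[float],
    floor: float = 1e-4,  # assumed floor to avoid log(0) on empty bins
) -> float:
    """PSI = sum over bins of (current - baseline) * ln(current / baseline)."""
    base = bin_fractions(baseline, edges)
    curr = bin_fractions(current, edges)
    psi = 0.0
    for b, c in zip(base, curr):
        b = max(b, floor)
        c = max(c, floor)
        psi += (c - b) * math.log(c / b)
    return psi


# Hypothetical example: a feature observed at model-build time vs. today.
baseline = list(range(100))            # values the model was developed on
shifted = [v + 40 for v in baseline]   # same feature after the feed has drifted
edges = [25, 50, 75]                   # illustrative bin edges

ALERT_THRESHOLD = 0.25                 # assumed rule-of-thumb threshold

psi = population_stability_index(baseline, shifted, edges)
if psi > ALERT_THRESHOLD:
    print(f"PSI {psi:.2f}: investigate the feed and consider respinning the model")
```

A check like this only says that the distribution has moved. It does not say why, which is where item 10 comes in: someone still has to investigate whether the shift is new behavior, a manipulated feed, or a data quality problem.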
In the era of advanced analytics and Big Data, data leadership is not just a techie thing; it needs to take place at the C level. As one of the first generation of Chief Analytics Officers, I take my responsibilities very seriously, and have frequent discussions of this changing role with other data leaders around the world.
So, what am I leaving out? Feel free to post your comments or drop me a note on Twitter @ScottZoldi.