Play More Games to Get Your Ph.D.: Fun with Spurious Correlations

Correlation doesn’t equal causation. This point sometimes gets lost in the Big Data discussion.
The argument goes that if you have enough data, correlation is good enough. Even our friend Kenneth Cukier wrote that the focus of analyzing Big Data will shift from causation to correlation: “This represents a move away from always trying to understand the deeper reasons behind how the world works to simply learning about an association among phenomena and using that to get things done.” And Chris Anderson, then editor-in-chief of WIRED, proclaimed in his 2008 essay that the data deluge will make the scientific method obsolete: “Petabytes allow us to say: ‘Correlation is enough.’”
Even our Chief Analytics Officer will admit that for some things correlation may be OK. However, for many things it is important to demonstrate the impact of an action in a cause/effect fashion. Take loyalty programs: members may spend more than non-members. But is that just a correlation? Would they have spent more anyway because they are already the best customers, or did the program drive the higher spending? It is important to determine causation, or you may be leaving money on the table or throwing good money after bad.
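To make the loyalty program question concrete, here is a minimal simulation sketch. Every number in it is a made-up assumption for illustration (baseline spend normally distributed around 100, customers who already spend more than 120 choose to join), and the function names are hypothetical. The program is given zero true effect, yet members still appear to spend far more when they self-select. Assigning membership at random, as a controlled test would, makes the gap disappear.

```python
import random
from statistics import mean


def simulate(assign, n=20_000, program_lift=0.0, seed=0):
    """Return (average member spend, average non-member spend).

    `assign` decides who joins the program. `program_lift` is the true
    causal effect of membership on spend (zero by default).
    All figures are illustrative assumptions, not real customer data.
    """
    rng = random.Random(seed)
    members, others = [], []
    for _ in range(n):
        baseline = rng.gauss(100, 30)  # what the customer would spend anyway
        joined = assign(baseline, rng)
        spend = baseline + (program_lift if joined else 0.0)
        (members if joined else others).append(spend)
    return mean(members), mean(others)


def self_selected(baseline, rng):
    # Assumption: the best customers are the ones who sign up.
    return baseline > 120


def randomized(baseline, rng):
    # A coin flip decides membership, independent of baseline spend.
    return rng.random() < 0.5


sel_members, sel_others = simulate(self_selected)
rand_members, rand_others = simulate(randomized)

print(f"Self-selected gap: {sel_members - sel_others:.1f}")
print(f"Randomized gap:    {rand_members - rand_others:.1f}")
```

In the self-selected run the gap is large even though the program does nothing, which is exactly the "would they have spent more anyway" trap. Setting `program_lift` to a nonzero value and rerunning the randomized case shows how a controlled test recovers the true effect.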
We also don’t believe that you can use Big Data to solve problems without understanding the problems. Big Data can tell you nothing if you don’t know what questions to ask it.
This is where the fun with spurious correlations comes in. Tyler Vigen has put together a great collection of spurious correlations. Here’s one that we appreciated:
So where do you stand on the correlation/causation debate? Let us know.