Experiments that Enable Causal Prediction

By Andrew Jennings

Last week on the Banking Analytics Blog, we discussed a critical imperative of next-generation learning: learning from customer behavior on the fly. The final imperative in our series is experiments that enable causal prediction.

While the easiest relationships to find in Big Data are correlative (when A occurs, B also occurs), the most valuable for customer centricity are causal (A affects B). By finding and testing causal relationships (A affects B in this specific way), companies can better understand customer sensitivities to offer features such as price, brand status or flexible terms, and predict how individual customers are likely to respond to a specific offer or treatment.

We capture causal relationships in the action-effect models at the heart of our approach to improving decisions through decision modeling and optimization. Case in point: A major retailer wanted to understand how to target its discount coupons more profitably. The central question was: Which customers will buy only with the coupon and which will buy whether or not they receive the discount? To answer that question, we needed to find causal relationships between coupon treatment characteristics and buying behavior.

While the available historical data was big in volume, it was poor in causal insights. The retailer had done no deliberate experiments on coupon effects. Historically, its coupon distribution was determined by specific customer characteristics (e.g., past purchases, frequency, demographics). Similar customers, therefore, almost always received the same coupon. As a result, the data had very little of what analytic scientists call “common support,” which we can simply think of as overlap. When similar customers receive different treatments, we can compare their outcomes and see how the treatments affected them, which is essential for finding causal relationships.
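To make the idea of overlap concrete, here is a minimal Python sketch that checks which customer segments received more than one treatment. The segment names, treatment labels and records are hypothetical and purely illustrative; they are not the retailer's data.

```python
from collections import defaultdict

def treatments_by_segment(records):
    """Map each customer segment to the set of treatments observed in it."""
    observed = defaultdict(set)
    for segment, treatment in records:
        observed[segment].add(treatment)
    return dict(observed)

def segments_with_overlap(records):
    """Return segments in which at least two different treatments were observed."""
    observed = treatments_by_segment(records)
    return sorted(seg for seg, seen in observed.items() if len(seen) >= 2)

# Hypothetical history: (customer segment, coupon treatment received)
records = [
    ("frequent", "coupon"), ("frequent", "coupon"), ("frequent", "coupon"),
    ("occasional", "coupon"), ("occasional", "no_coupon"),
    ("lapsed", "no_coupon"), ("lapsed", "no_coupon"),
]

print(segments_with_overlap(records))  # ['occasional']
```

Only the "occasional" segment contains both treated and untreated customers, so it is the only place in this toy data where outcomes can be compared across treatments.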

If the company had randomly distributed coupons, the result would have been worse in terms of costs, but better for learning. Random assignment creates lots of overlap in data. (In fact, data you’d ordinarily regard as problematic, because treatments were assigned haphazardly or were poorly directed, can sometimes turn out to be a “natural experiment” you can leverage for analytic learning.)
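One way to picture the trade-off between cost and learning is to randomize only a small share of assignments and leave the rest to the usual targeting rule. The Python sketch below is an assumption for illustration, not the approach used with the retailer: the targeting rule, the 10% exploration rate and the segment names are all hypothetical.

```python
import random
from collections import defaultdict

TREATMENTS = ("coupon", "no_coupon")

def targeted_rule(segment):
    """Hypothetical business rule: only frequent shoppers get the coupon."""
    return "coupon" if segment == "frequent" else "no_coupon"

def assign(segment, explore_rate, rng):
    """Return (treatment, was_randomized) for one customer."""
    if rng.random() < explore_rate:
        # Small randomized slice: every treatment is possible for every segment.
        return rng.choice(TREATMENTS), True
    # Everyone else follows the existing targeting rule.
    return targeted_rule(segment), False

rng = random.Random(42)  # seeded so the simulation is repeatable
seen = defaultdict(set)  # segment -> treatments actually observed
randomized = 0
n = 10_000

for i in range(n):
    segment = "frequent" if i % 2 == 0 else "lapsed"
    treatment, was_randomized = assign(segment, 0.10, rng)
    seen[segment].add(treatment)
    randomized += was_randomized

print(dict(seen))      # both treatments now appear in both segments
print(randomized / n)  # share of customers assigned at random, near 0.10
```

Recording the `was_randomized` flag alongside each assignment keeps the randomized slice identifiable later, when the data is used for analysis.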

This situation highlights a critical dynamic between customer centricity and Big Data learning. As our businesses become more customer-centric, we’re able to target treatments with finer granularity, right down to the individual customer level. But the more we target, the less overlap our data has and, therefore, the less opportunity it affords for learning about causal relationships.

To learn more, we need to introduce a certain amount of randomness into treatment assignment, while keeping testing costs in check. In the case of this retailer, however, we found the answers it needed using data it already had. One technique we employed was propensity scoring to locate usable overlap in the existing data. Unlike the more familiar application of such scores to predict the likelihood that a customer will buy, in this case we were predicting the likelihood of a customer being assigned to a particular treatment. We found overlap among customers with similar propensity scores who had received different treatments, and within that overlap area we were able to create matched samples suitable for identifying causal relationships.
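The Python sketch below illustrates the general idea of propensity scoring and matching on a toy data set. It rests on stated assumptions and is not the model used for the retailer: the propensity score is simply the share of customers in each covariate cell who received the coupon (a stand-in for a fitted model), matching is greedy one-to-one within a caliper, and the cells, outcomes and caliper value are hypothetical.

```python
from collections import defaultdict

def estimate_propensity(customers):
    """Propensity per cell = share of that cell's customers who got the coupon."""
    counts = defaultdict(lambda: [0, 0])  # cell -> [treated count, total count]
    for c in customers:
        counts[c["cell"]][0] += c["treated"]
        counts[c["cell"]][1] += 1
    return {cell: treated / total for cell, (treated, total) in counts.items()}

def match_pairs(customers, propensity, caliper=0.05):
    """Greedy 1:1 matching without replacement on propensity, within a caliper."""
    treated = [c for c in customers if c["treated"]]
    controls = [c for c in customers if not c["treated"]]
    pairs = []
    for t in treated:
        score = propensity[t["cell"]]
        best = min(controls, key=lambda c: abs(propensity[c["cell"]] - score),
                   default=None)
        if best is not None and abs(propensity[best["cell"]] - score) <= caliper:
            pairs.append((t, best))
            controls.remove(best)  # each control is used at most once
    return pairs

def effect_on_matched(pairs):
    """Mean purchase difference (treated minus control) over matched pairs."""
    if not pairs:
        return None
    return sum(t["bought"] - c["bought"] for t, c in pairs) / len(pairs)

def make(cell, treated, outcomes):
    """Build hypothetical customer records for one cell and treatment."""
    return [{"cell": cell, "treated": treated, "bought": b} for b in outcomes]

# Hypothetical data: cell A always got the coupon, cell C never did,
# and only cell B contains both treated and untreated customers.
customers = (
    make("A", 1, [1, 1, 1, 1])
    + make("B", 1, [1, 1, 1, 0])
    + make("B", 0, [1, 0, 0, 0])
    + make("C", 0, [0, 0, 0, 0])
)

propensity = estimate_propensity(customers)
pairs = match_pairs(customers, propensity)

print(propensity)                # {'A': 1.0, 'B': 0.5, 'C': 0.0}
print(len(pairs))                # 4 matched pairs, all from cell B
print(effect_on_matched(pairs))  # 0.5
```

In this toy data, a naive comparison of all coupon recipients against all non-recipients gives a purchase-rate difference of 0.75 (7 of 8 versus 1 of 8), because it mixes customers who were never comparable. Restricting the comparison to matched customers in the overlap region gives 0.5.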

One takeaway from the case study is that there may be value hiding in your existing data – in other words, more opportunity than you realize for learning about your customers. Another is that companies wanting to compete by understanding customers better need to deliberately enrich their data for learning. This means going beyond traditional champion-challenger to smart experiments that increase overlap for causal analysis.

For more information on this topic, check out my recent Insights white paper: "When Is Big Data the Way to Customer Centricity?" (registration required).
