As analytic advancements reach ever deeper into people’s lives — as every aspect of individuals is analyzed to drive decisions by businesses and other institutions — the need for people to trust analytics and the way organizations use them grows.
This issue arose in a recent announcement from the industry analyst firm Gartner. Gartner “believes the trust factors influencing the ethical use of analytics are identifiable — transparent, accountable, understandable, mindful, palatable and mutually beneficial” and that “leading data-driven organizations will increasingly recognize the causal relationships between data, analytics, trust and business outcomes.”
This made me reflect on some lessons I’ve learned as a data scientist as they relate to the interplay between data, models, trust and results. I will omit here some of the more obvious topics such as model validation and tracking model performance.
I find it beneficial to think about analytics as an element of a virtuous feedback loop. Data are turned into predictions and decisions affecting consumers and outcomes, which in turn lead to new data points and new information that can improve future models and future decisions. The feedback loop perspective offers various interception points and modes of operation to foster transparency, learn about causes and their effects, and enshrine trust.
Here are four levers we can use to build people’s trust in analytics:
1. When developing models, balance data fitting with understanding and domain expertise.
For credit risk scoring, models must conform to regulations, and adverse credit decisions must be explainable to affected consumers based on the score formula used. In any domain, a business that cannot explain important analytically derived decisions may one day come under pressure for failing to offer explanations, and may lose customers' trust if a decision is regarded as nonsensical, unfair or unethical (think of a 1,000% price hike on a life-saving medication driven by an unconstrained pricing model).
Decision support coming from black-box models isn’t just harder to “sell” to customers. It’s more likely to be ignored by managers who don’t understand it.
The ability to inject domain expertise into a model is often critical for trust and success. It’s not uncommon to find that various alternative models fit the data similarly well, while some of them are much easier to explain than others. Deploying the most palatable model is a no-brainer.
On the other hand, domain expertise is no panacea. Where data contradict human expectations, we have a chance to learn something new or to identify a data problem.
So how can we balance human judgment with letting the data speak for itself? I’m fond of the transparency, flexibility and palatability constraints afforded by segmented scorecards. I prefer to benchmark these models, and to guide their design (binning, variable selection and segmentation schemes), with modern ensemble learning algorithms that can discover unexpected and complex relationships. This methodology balances accurate data fitting with understanding and domain expertise.
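To make the benchmarking idea concrete, here is a minimal sketch in Python (using scikit-learn). The data, feature roles and parameter choices are entirely invented for illustration — this is not any production scoring methodology. A transparent scorecard-style model (coarse bins plus logistic regression, yielding additive point weights) is compared against a gradient-boosting ensemble, whose accuracy sets a bar and can flag relationships the simple model misses:

```python
# Sketch: benchmark a transparent binned-scorecard-style model against a
# gradient-boosting ensemble on synthetic data. Features, coefficients and
# sample sizes are made up for illustration.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import KBinsDiscretizer

rng = np.random.default_rng(0)
n = 5000
X = rng.normal(size=(n, 3))  # three hypothetical risk factors
# True log-odds include a nonlinear (absolute-value) effect
logit = 1.5 * X[:, 0] - 1.0 * np.abs(X[:, 1]) + 0.5 * X[:, 2]
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Transparent model: quantile bins + logistic regression -> point weights per bin
binner = KBinsDiscretizer(n_bins=5, encode="onehot-dense", strategy="quantile")
scorecard = LogisticRegression(max_iter=1000).fit(binner.fit_transform(X_tr), y_tr)
auc_card = roc_auc_score(y_te, scorecard.predict_proba(binner.transform(X_te))[:, 1])

# Black-box benchmark: an upper bar that can reveal missed interactions
gbm = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
auc_gbm = roc_auc_score(y_te, gbm.predict_proba(X_te)[:, 1])

print(f"scorecard AUC: {auc_card:.3f}  ensemble AUC: {auc_gbm:.3f}")
```

If the ensemble's AUC is materially higher, that gap is a prompt to revisit the scorecard's binning or segmentation — not necessarily a reason to deploy the black box.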
2. Understand the process that generated the data and be mindful of biases.
While it is standard advice for data scientists to take context into account, a particularly deep understanding of the data-generation process is necessary for success with prescriptive analytics. Prescriptive analytics not only predicts future outcomes under business-as-usual conditions, but determines which treatments to assign to customers to achieve optimal results.
Underlying these decisions is often a causal model that predicts what will happen if a certain customer receives a certain treatment. But learning about causality from a data set is a more demanding scientific task than finding mere correlations, and it can sometimes be impossible.
Two key challenges pertaining to data conditions are omitted variable bias and historic treatment selection bias. As an example of the first problem, a model of the price elasticity of demand for ice cream could be severely biased if historic prices had been managed based on weather conditions and we omitted weather-related control variables from our model.
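A short simulation makes the ice-cream example tangible. All numbers below are invented: hot weather drives both prices (management raises them on hot days) and demand, and omitting the weather control badly distorts the estimated elasticity:

```python
# Sketch: omitted variable bias in the ice-cream example. Weather (hot days)
# confounds price and demand; all coefficients are made up for illustration.
import numpy as np

rng = np.random.default_rng(1)
n = 10000
hot = (rng.random(n) < 0.5).astype(float)               # confounder: hot day?
price = 2.0 + 0.5 * hot + 0.1 * rng.normal(size=n)      # prices raised on hot days
demand = 10 - 2.0 * price + 3.0 * hot + rng.normal(size=n)  # true elasticity: -2

# Least-squares price slope with and without the weather control
X_full = np.column_stack([np.ones(n), price, hot])
X_omit = np.column_stack([np.ones(n), price])
beta_full = np.linalg.lstsq(X_full, demand, rcond=None)[0]
beta_omit = np.linalg.lstsq(X_omit, demand, rcond=None)[0]

print(f"price effect with control: {beta_full[1]:.2f}, without: {beta_omit[1]:.2f}")
```

With the control, the estimate recovers the true effect of about −2; without it, the price coefficient even flips sign, because high prices coincide with high (weather-driven) demand.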
As an example of the second problem, historic treatment selection bias, imagine that customers of type I always received treatment A and customers of type II always received treatment B in the past. With such a strong treatment selection bias, the notion of a data-driven prediction of what would happen if we applied treatment B to type I customers becomes an illusion, as predictions become extremely uncertain.
Off-the-shelf analytic tools for regression and classification don’t warn the unsuspecting data analyst when they encounter these problems. Predicting effects of treatments without ascertaining good data conditions should be highly suspect.
For a transparent causal modeling methodology that fosters trust (and for alerting the analyst when treatment selection bias is too strong to infer data-driven causal relations) I recommend treatment propensity score developments and matched sampling prior to “mining” non-experimental data for causal effects.
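As a toy illustration of why matching on propensity scores helps — not any particular production methodology — here is a minimal 1:1 nearest-neighbor propensity match on simulated data, where treatment assignment depends on a customer covariate that also drives the outcome:

```python
# Sketch: propensity-score matching before estimating a treatment effect from
# observational data. Simulated data; a minimal 1:1 nearest-neighbor match.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
n = 4000
x = rng.normal(size=n)                        # customer covariate
p_treat = 1 / (1 + np.exp(-1.5 * x))          # treatment more likely for high x
t = (rng.random(n) < p_treat).astype(int)     # historic selection bias
y = 2.0 * t + 1.0 * x + rng.normal(size=n)    # true treatment effect: 2.0

naive = y[t == 1].mean() - y[t == 0].mean()   # inflated by selection on x

# Estimate propensity scores, then match each treated unit to its
# nearest control on the propensity scale (with replacement)
ps = LogisticRegression().fit(x.reshape(-1, 1), t).predict_proba(x.reshape(-1, 1))[:, 1]
treated, control = np.where(t == 1)[0], np.where(t == 0)[0]
matches = control[np.abs(ps[treated][:, None] - ps[control][None, :]).argmin(axis=1)]
matched = (y[treated] - y[matches]).mean()

print(f"naive estimate: {naive:.2f}  matched estimate: {matched:.2f}  (truth: 2.00)")
```

The naive comparison substantially overstates the effect; the matched estimate lands much closer to the truth. When no close matches exist — the type I/type II scenario above — the matching step itself alerts the analyst that a data-driven causal estimate is out of reach.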
3. Think strategically and design safe experimentation right into your feedback loop.
Every business treatment applied to a customer today not only impacts that customer’s future journey and metrics; it also yields new data samples that could, and should, inform future decisions. Whenever we plot a long-term analytic roadmap, let’s think about optimizing today’s decisions not with a myopic view (exploiting our current best estimates of today’s optimal action), but with a strategic view (balancing the exploration-exploitation trade-off by continually testing ethical and safe decision alternatives).
Besides conventional champion-challenger and A/B testing, “boundary-hugging” test designs are useful for controlling this trade-off, increasing the rate of learning at minimum cost of testing. This approach greatly mitigates the aforementioned selection bias issues and helps pave the road toward transparent, data-driven prescriptive modeling.
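The exploration-exploitation dial can be illustrated with the simplest possible mechanism, epsilon-greedy testing (far cruder than the boundary-hugging designs mentioned above, and with invented conversion rates): a small, fixed share of traffic is always spent testing alternatives, while the rest exploits the current best estimate:

```python
# Sketch: epsilon-greedy allocation across three hypothetical offers.
# Conversion rates and traffic volumes are invented for illustration.
import numpy as np

rng = np.random.default_rng(3)
true_rates = [0.05, 0.07, 0.04]   # unknown conversion rate per offer
counts = np.zeros(3)
successes = np.zeros(3)
epsilon = 0.1                     # share of traffic reserved for exploration

for _ in range(20000):
    if rng.random() < epsilon:
        arm = int(rng.integers(3))                    # explore: random offer
    else:
        est = successes / np.maximum(counts, 1)       # exploit: current best
        arm = int(est.argmax())
    counts[arm] += 1
    successes[arm] += rng.random() < true_rates[arm]

print("traffic share per offer:", counts / counts.sum())
```

Because every offer keeps receiving some traffic, the data never degenerate into the "type I always gets A" pattern — which is exactly how designed experimentation mitigates future selection bias.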
4. Use sequential decision analytics to realize the benefit of information-gathering actions.
Conventional prescriptive analytics exploit the available data to arrive at a decision. But for some decisions, or some customers, the available data may simply be too sparse, too stale or too biased to make data-driven decisions with high confidence. In those cases one could fall back on a default treatment (which may be sub-optimal), or one could decide to gather additional information.
It is interesting to envision an analytic component that recognizes when the best next action is to buy more data or to ask the customer a question in order to learn more about her needs and preferences, before explicitly taking advantage of the new information when deciding on a relevant offer or a valuable treatment. I call this “goal-directed algorithmic curiosity.”
FICO is researching novel applications of sequential decision analytics with the option to engage customers in dialogues, for example by taking advantage of the mobile channel. Such analytics trade dialogue costs (incentives to promote responses, as well as costs of potential customer disruption) against the value of imperfect information (imperfect because not all customers may respond truthfully).
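The trade-off between dialogue cost and the value of imperfect information can be written down as a small expected-value calculation. The sketch below is illustrative only — the payoffs, the prior, the response accuracy and the cost are all invented, and this is not a description of any actual FICO analytics:

```python
# Sketch: is asking the customer a question worth it? A two-offer decision
# with uncertainty about the customer's type; the answer is only partly
# reliable and asking has a cost. All numbers are invented.

p_type_a = 0.5                 # prior belief the customer is type A
payoff = {("A", "offer1"): 10, ("A", "offer2"): 2,
          ("B", "offer1"): 1, ("B", "offer2"): 8}
accuracy = 0.8                 # P(answer matches true type) — imperfect info
cost = 0.5                     # incentive plus disruption cost of asking

def best_ev(p_a):
    """Expected value of the best offer given belief P(type A) = p_a."""
    ev1 = p_a * payoff[("A", "offer1")] + (1 - p_a) * payoff[("B", "offer1")]
    ev2 = p_a * payoff[("A", "offer2")] + (1 - p_a) * payoff[("B", "offer2")]
    return max(ev1, ev2)

# Option 1: act now on the prior belief
act_now = best_ev(p_type_a)

# Option 2: ask first, Bayes-update the belief on each possible answer
p_say_a = p_type_a * accuracy + (1 - p_type_a) * (1 - accuracy)
post_if_say_a = p_type_a * accuracy / p_say_a
post_if_say_b = p_type_a * (1 - accuracy) / (1 - p_say_a)
ask_first = (p_say_a * best_ev(post_if_say_a)
             + (1 - p_say_a) * best_ev(post_if_say_b)) - cost

print(f"act now: {act_now:.2f}  ask first: {ask_first:.2f}")
```

Under these made-up numbers, asking first is worth it (7.00 vs. 5.50) despite the cost and the 20% chance of a misleading answer — a tiny instance of recognizing when the best next action is to gather information.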
I believe that dialogue analytics can pay great dividends. But for this to work, the customer must believe that an honest response is in her self-interest, and the business must act on the received information in a way that creates a win for both sides.
The music streaming service I’m using is a simple example of trust earned that pays dividends. I’m happy to volunteer my ratings information in exchange for improved future recommendations. Seeing these recommendations actually improve over time makes me a loyal customer.
Stay tuned for a future blog post on sequential decisions and expect to hear more at FICO World 2016!