In 1992, Falcon Fraud Manager was introduced: a neural network-based fraud detection system that detects fraudulent payment card transactions in real time. Since then, despite the proliferation of fraud types, there has been a dramatic reduction in payment card fraud losses across the globe.
One of the reasons we developed scoring technology was to help analysts take action on the transactions most likely to be fraudulent. We introduced the Reason Reporter, which, as the name suggests, provides reasons associated with the neural network scores Falcon produces. This capability is, in fact, Explainable Artificial Intelligence (XAI), a topic that has recently become a hot one in light of the European Union’s General Data Protection Regulation (GDPR) and society’s increasing reliance on AI systems.
In 1996 we filed a patent for Reason Reporter—indicative of how long FICO has been working with Explainable AI. Today, it’s still performing very well, and in fact outperforming some newer techniques in the Explainable AI realm.
How XAI works in Reason Reporter
In the context of Falcon, Reason Reporter is designed to provide explanations for high-scoring cases, where a high score indicates suspected fraudulent activity on the payment card.
During model training, the Reason Reporter algorithm “bins” (groups) each input variable and then computes moments of the historical scores for each bin. In production, when a transaction is scored, the values of its variables are used to assign it to its bins. The smallest deviations of the score from the moments in each bin identify the input variables most likely responsible for driving the score. Reason codes are then associated with a variable or group of variables. The schematic below of Falcon Case Manager, based on synthetic data used for demo purposes, shows a list of reasons (shown in the yellow box) generated by Reason Reporter for the highlighted transaction.
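To make the binning idea concrete, here is a minimal sketch of the approach described above. It is not FICO’s actual implementation: the function and variable names are hypothetical, only the first moment (the per-bin mean of historical scores) is used, and quantile bin edges are an assumption for illustration.

```python
import numpy as np

def fit_bins(X, scores, n_bins=10):
    """Training step: bin each input variable and record the mean
    historical score per bin (the first moment; a real system could
    track higher moments as well)."""
    model = []
    for j in range(X.shape[1]):
        edges = np.quantile(X[:, j], np.linspace(0, 1, n_bins + 1))
        idx = np.clip(np.searchsorted(edges, X[:, j], side="right") - 1,
                      0, n_bins - 1)
        means = np.array([scores[idx == b].mean() if np.any(idx == b)
                          else scores.mean() for b in range(n_bins)])
        model.append((edges, means))
    return model

def top_reasons(model, x, score, k=3):
    """Scoring step: assign each variable of transaction x to its bin,
    then rank variables by how small the deviation is between the
    observed score and that bin's historical mean score."""
    devs = []
    for j, (edges, means) in enumerate(model):
        b = int(np.clip(np.searchsorted(edges, x[j], side="right") - 1,
                        0, len(means) - 1))
        devs.append((abs(score - means[b]), j))
    return [j for _, j in sorted(devs)[:k]]
```

On synthetic data where the score tracks one variable, that variable’s bins carry historical score support close to the observed score, so it surfaces as the top reason.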
A Squeeze of LIME
As I wrote in my first XAI blog, perturbation-based approaches have recently gained attention due to a technique called Local Interpretable Model-agnostic Explanations (LIME). This technique perturbs the data variables in small ways to see which variables change the score by the largest amount. LIME then fits a sparse model on the locally dispersed, noise-induced dataset. The variables with the largest coefficients in that model are reported as the drivers of the score.
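The LIME recipe above can be sketched in a few lines. This is a simplified illustration, not the LIME library itself: a proximity-weighted least-squares fit stands in for LIME’s sparse surrogate model, and the function names and kernel choice are assumptions made here for clarity.

```python
import numpy as np

def lime_style_reasons(predict, x, n_samples=2000, sigma=0.1, k=3, seed=0):
    """LIME-style attribution sketch: perturb x with Gaussian noise,
    score the perturbations with the black-box `predict` function,
    fit a proximity-weighted linear surrogate, and return the indices
    of the variables with the largest absolute coefficients."""
    rng = np.random.default_rng(seed)
    Z = x + sigma * rng.standard_normal((n_samples, x.size))
    y = predict(Z)
    # proximity kernel: perturbations closer to x get more weight
    w = np.exp(-np.sum((Z - x) ** 2, axis=1) / (2 * sigma ** 2))
    # centred design matrix plus an intercept column
    A = np.hstack([Z - x, np.ones((n_samples, 1))])
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(sw[:, None] * A, sw * y, rcond=None)
    return np.argsort(-np.abs(coef[:-1]))[:k]
```

Note that the explanation depends entirely on how the score reacts to local noise around the single transaction, with no reference to historical data support — the property the comparison below turns on.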
Working with LIME can be a challenge in that it derives reasons from local sensitivity rather than from historical data support. At FICO, as we always do, we compared the newer algorithm to our existing methodologies.
For example, we studied a sample of high-scoring fraud cases, which we processed through both Falcon Reason Reporter and LIME. The Falcon neural network was used to generate the scores for the noise-induced perturbation data points required by the LIME methodology. The output of both algorithms was then qualitatively analyzed through expert evaluation. Among the high-scoring cases studied, 11% produced the exact same top three reason codes as Falcon Reason Reporter, and 53% matched on at least two of the three reason codes.
In cases where the two systems differed, the story was interesting. For example, in a high-scoring case where both algorithms agreed on only one of the three reasons, Falcon Reason Reporter pointed out that the high amount activity and authorization velocity were suspicious. LIME, on the other hand, concluded that the primary driver of the score was the rate of ATM withdrawals in the last two days, despite there being no evidence of such activity.
Similarly, in another case with just a few transactions in the history, LIME attributed the score to the PIN decline rate, though there were no PIN declines. We saw this behavior recur again and again. These cases indicate that LIME over-emphasizes the local feature sensitivity induced through its local noise perturbation technique. By focusing instead on global historical score patterns and global support, Falcon Reason Reporter is robust to this local noise.
Averaging Out Noise
The tried-and-true Reason Reporter’s approach of looking at global score support in the historical data is a superior method for averaging out noise arising from complex local variable phase spaces and the gradients of the solution space. In contrast, the newer XAI technique (LIME) has a tendency to pick up local noise, leading to contrived explanations made more complicated by increased non-linearity.
Two decades after it was first introduced, Falcon Reason Reporter continues to provide robust and accurate explanations that set the investigation teams relying on machine learning on the right track.
A Final Note
Most Explainable AI systems, including Reason Reporter and LIME, provide an assessment of which model input features are driving the scores. The ability to understand causality is the natural next step for explanation systems. This requires the explanation system to understand and parse the latent features that actually drive the score. As I wrote in my previous blog on this topic, my recent patent application work has explored an architecture called LENNS (Latent Explanations Neural Network Scoring) that exposes more of what’s driving the score. More on that later.