For decades, anti-money laundering (AML) detection software has been rules-based, creating a problematic two-fold legacy. First, much true criminal activity goes undetected because criminals can learn the rules and then evade them. Second, rules-based AML systems generate an inordinate number of false positive alerts, diverting investigative resources away from the cases that warrant genuine Suspicious Activity Reports (SARs). FICO has developed and deployed machine learning and artificial intelligence (ML/AI) systems that address both of these shortcomings, helping one client, a global bank headquartered in Asia, to achieve a 57% reduction in alert false positives.
FICO’s AML Advanced Analytics creates two scores, an AML Threat Score to improve efficiency and reduce false positives, and an AML Soft-Clustering Misalignment Score to find outliers that otherwise wouldn’t be investigated. These scores reduce inefficiency by generating very targeted alerts, and reduce ineffectiveness by increasing the overall detection rate of suspicious activity. Let’s explore the data science magic that drives such a significant improvement in AML alert accuracy.
AML Threat Score: Radically Eradicating Inefficiencies
Supervised machine learning AI models are trained to distinguish between suspicious and normal behaviors, using historical SARs as training examples. FICO’s AML Threat Score uses supervised learning, along with a combination of patented AI algorithms for feature generation and profiling, to construct a score, which is then used to detect transactions that resemble the historical SARs (see Figure 1). These SARs comprise only 1-5% of all the money laundering activity, so the main focus of the AML Threat Score is to help banks be radically more efficient in detecting SAR-worthy alerts.
Figure 1. The AML Threat Score distribution for various populations (green: ‘goods’, or customers without alerts; blue: customers with alerts who hit an AML rule; red: customers with a SAR filing) allows alerts to be prioritized. The distribution shows that the model assigns higher scores to customers with an eventual SAR filing, separating them from the ‘good’ and alerted populations.
The AML Threat Score is used to rank-order and prioritize rule-based alerts. By working cases with the highest scores first, financial institutions reduce the false positive rate without missing SARs. In production usage, this AI model dramatically improves operational efficiency. As noted above, one of FICO’s clients has seen a 57% reduction in alert false positives, while still flagging all reported SARs on historical data.
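The prioritization idea above can be illustrated with a minimal sketch. It is not FICO’s implementation: the function name `triage_by_score`, the score values, and the choice of cutoff (the lowest score among historical SARs) are all assumptions made for illustration. The sketch ranks historical alerts by score, keeps only those at or above a cutoff that still retains every known SAR, and reports how many false positives drop out.

```python
from typing import Dict, List, Tuple


def triage_by_score(alerts: List[Tuple[float, bool]]) -> Dict[str, float]:
    """Work only alerts scoring at or above the lowest-scoring historical SAR.

    `alerts` holds (score, is_sar) pairs for historical rule-based alerts.
    The cutoff rule is a hypothetical choice for this illustration.
    """
    sar_scores = [score for score, is_sar in alerts if is_sar]
    if not sar_scores:
        raise ValueError("need at least one historical SAR to set a cutoff")

    # Lowest score that still keeps every historical SAR in the work queue.
    threshold = min(sar_scores)
    worked = [(score, is_sar) for score, is_sar in alerts if score >= threshold]

    false_pos_before = sum(1 for _, is_sar in alerts if not is_sar)
    false_pos_after = sum(1 for _, is_sar in worked if not is_sar)
    reduction = 1 - false_pos_after / false_pos_before if false_pos_before else 0.0

    return {
        "threshold": threshold,
        "alerts_worked": len(worked),
        "sars_retained": sum(1 for _, is_sar in worked if is_sar),
        "false_positive_reduction": reduction,
    }


# Made-up historical alerts: (threat score, whether a SAR was eventually filed).
alerts = [
    (910, True), (850, False), (780, True), (640, False),
    (600, False), (420, False), (300, False), (150, False),
]
result = triage_by_score(alerts)
```

In practice a bank would set the cutoff on held-out data and weigh it against investigator capacity; the toy cutoff here only shows why a well-separated score distribution lets low-scoring alerts be deprioritized without losing known SARs.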
Banks and regulators are also concerned with how quickly they can find suspected money laundering. Sophisticated criminals painstakingly pattern their transactions to “fly under the radar” of rules-based systems, avoiding detection for long periods. FICO’s AI model is able to identify suspicious patterns early, before they would otherwise hit rule thresholds for alert generation, in some cases many months earlier.
While FICO’s AI model makes detection of money laundering resembling historical SARs much faster and more accurate, that’s just the tip of the iceberg: about 95-99% of money laundering cases still need to be found (see Figure 2). With resources freed from working the false positives of scenario-based rules systems, banks are better equipped to investigate the most unusual behavior among the rest of the customer population. By using models to examine the whole population, including unalerted customers, machine learning approaches give us a fundamentally new way of finding previously unknown suspicious activity.
Figure 2: Current AML rules only find a small fraction of money laundering, as only customers who exceed rule thresholds (alerted customers) are investigated.
AML Soft-Clustering Misalignment Score: Unfurling the Net Wider
To find more money laundering, we need AI models that do not rely on historical SARs. Unsupervised learning AI does not rely on the small existing set of historical SARs, or the previously known AML rules and typologies. Since the vast majority of customers do not commit money laundering, their activity patterns allow the AI model to learn what constitutes normal activity. Any unusual pattern can then be flagged by the model as potentially new money laundering behavior.
FICO’s data science organization trains our AI model using a novel unsupervised approach that learns archetypes of customer behavior, where each archetype captures a common type of behavior. Each customer’s activity is a mixture of these archetypes, and over time, the model adapts the mixture based on customers’ changing behaviors. Building on the learned archetypes, the AI model generates a score, called the AML Soft-Clustering Misalignment (SCM) Score, that rank-orders customers based on how abnormal their patterns are compared to those of similar customers. The scoring algorithm uses an unsupervised neural network based on Classifier Adjusted Density Estimation (CADE), which uses a process similar to Generative Adversarial Networks (GANs) but is more computationally efficient, to estimate the likelihood of a customer’s behavior.
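To make the CADE idea concrete, here is a minimal, hedged sketch. It is not the SCM model: it works on a single made-up feature instead of the archetype space, and it swaps the neural network for a small logistic regression written in plain Python. The general recipe is to draw artificial points from a known reference distribution (uniform here), train a classifier to tell real points from artificial ones, and then estimate the density of real data as the reference density multiplied by the classifier’s odds p/(1-p). All names (`train_logistic`, `make_cade_density`) and parameter values are assumptions for illustration.

```python
import math
import random


def train_logistic(features, labels, steps=300, lr=1.0):
    """Fit logistic regression by batch gradient ascent on the log-likelihood."""
    weights = [0.0] * len(features[0])
    for _ in range(steps):
        grad = [0.0] * len(weights)
        for f, y in zip(features, labels):
            logit = sum(w * v for w, v in zip(weights, f))
            p = 1.0 / (1.0 + math.exp(-logit))
            for j, v in enumerate(f):
                grad[j] += (y - p) * v
        weights = [w + lr * g / len(features) for w, g in zip(weights, grad)]
    return weights


def make_cade_density(real, lo, hi, seed=0):
    """Return a density estimator for `real` using a uniform reference on [lo, hi]."""
    rng = random.Random(seed)
    # Equal numbers of real and artificial points, so class priors cancel.
    reference = [rng.uniform(lo, hi) for _ in real]
    scale = max(abs(lo), abs(hi))

    def feats(x):
        z = x / scale  # rescale so gradient steps stay well-behaved
        return [1.0, z, z * z]

    X = [feats(x) for x in real] + [feats(x) for x in reference]
    y = [1] * len(real) + [0] * len(reference)
    weights = train_logistic(X, y)
    uniform_density = 1.0 / (hi - lo)

    def density(x):
        logit = sum(w * v for w, v in zip(weights, feats(x)))
        # For a logistic model, the odds p / (1 - p) equal exp(logit).
        return uniform_density * math.exp(logit)

    return density


# Synthetic "normal" behavior on one hypothetical feature.
rng = random.Random(42)
normal_activity = [rng.gauss(0.0, 1.0) for _ in range(300)]
density = make_cade_density(normal_activity, lo=-6.0, hi=6.0)

typical = density(0.0)   # a point in the bulk of normal behavior
unusual = density(5.0)   # a point far from normal behavior
```

A low estimated density marks a point as unusual relative to the learned notion of normal, which is the property an outlier score needs. How the SCM score combines such likelihoods with the archetype mixtures is not described in this post, so the sketch stops at the density estimate.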
Perhaps not surprisingly, when previously known SAR customers are scored using the SCM model, they score highly, indicating the degree to which they are outliers compared to their customer peers. Customers who haven’t hit any of the existing AML rules but score high on the SCM model need further investigation (see Figure 3 and Figure 4). Some of these are customers whose behavior would eventually trigger rules many months later or, worse, simply evade detection. Other high-scoring customers are those whom the police and regulators have proactively asked to be investigated, showing the correlation between high SCM model scores and possible criminal activity (e.g., activity that falls below banks’ current AML rules thresholds).
Figure 3: The unsupervised SCM score can be used to visualize the test set of customers without alerts (in green) and customers where a SAR was filed (in red). Low-scoring customers are shown in green at the top of the ‘iceberg’, while higher-scoring customers are shown in lighter green. Customers where a SAR was filed clearly show as outliers, which confirms that the model is able to detect outlier behavior that catches known SARs. The high-scoring customers without alerts (light green at the bottom) are un-investigated customers, some of whom cluster near the known SARs and are very likely missed money laundering cases.
Figure 4: To visualize customers with the SCM score, the high-dimensional archetype space can be embedded into two dimensions. The points represent high-scoring customers with SCM score > 500 and show a number of distinct clusters, with most historical SARs as clear outliers from the clusters. Some clusters don’t show any historical SARs; however, outliers from those clusters may include criminal activity that is currently undetected by rules.
Finding the Iceberg
Suppose a bank investigated 1,000 customers per day before using FICO’s AML Threat Score model, and Threat Score prioritization reduces that case load to 500 per day (a 50% reduction). This efficiency gain frees capacity for hundreds of worthy new cases to be investigated. The new suspicious activity found through the SCM score results in a more effective AML program overall, one that reports more highly relevant SARs that would not have been caught by the current rules-based system. Once enough examples of a new suspect behavior are found and understood, the data can be used to train the supervised Threat Score, and even new typologies or rules can be created.
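The case-load arithmetic in this example can be written out as a short sketch. The function name `freed_capacity` is hypothetical, and the 50% figure is the assumption stated in the example, not a measured result.

```python
def freed_capacity(cases_per_day: int, reduction: float) -> int:
    """Daily investigations freed when prioritization cuts the rule-alert load.

    `reduction` is the assumed fractional drop in case load (0.5 means 50%).
    """
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    return round(cases_per_day * reduction)


# Figures from the example: 1,000 cases per day and an assumed 50% reduction.
freed = freed_capacity(1000, 0.5)
remaining = 1000 - freed
```

Under these assumptions, the freed slots are the capacity that could be redirected to high-scoring, unalerted customers surfaced by the SCM score.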
Under traditional rules-based systems, the industry is failing to detect the vast majority of money movement tied to drugs, human trafficking and terrorism. FICO’s AI models provide a two-pronged benefit: they make operations much more efficient, and they detect more of the money laundering that is currently routinely missed. With the resources freed by these efficiency improvements, investigators can dive under the water line to discover the larger iceberg of hidden money laundering. Given continued regulatory encouragement to use AI to improve compliance outcomes, we expect rapid adoption of AI-based AML technology.