Fighting Fraud with Responsible AI: My Talk with Dr. Scott Zoldi

FICO fraud expert TJ Horan talks fraud management and responsible AI with FICO's Chief Analytics Officer, Dr. Scott Zoldi

My colleague Dr. Scott Zoldi, Chief Analytics Officer of FICO, is one of the driving forces of the artificial intelligence (AI) and decisioning technology that powers FICO® Falcon and an entire family of fraud-fighting solutions. Scott’s a busy guy; when not spearheading the development of innovations like the FICO® Scam Detection Score or FICO’s anti-money laundering (AML) scores, he literally wrote the playbook on how organizations can achieve Responsible AI. Scott’s also authored more than 120 software patents, with a raft of them related to AI.

As for me, I’m a software guy and payments professional doing my best to roll with the changes in an ever-shifting fraud landscape. There can sometimes be a big disconnect between the leading edge of analytic innovation, where Scott lives, and the everyday trenches of the war on financial crimes, where I am. I recently sat down with Scott to talk about how fraud operations groups can benefit from Responsible AI technology and why they should care. Here’s all the best stuff from our conversation.   

TJ Horan and Scott Zoldi of FICO


How Does Responsible AI Fit into Fraud Operations?

TJ Horan: As a company, FICO talks a lot about Responsible AI and associated practices. At times that can feel aspirational, like something I should be doing and a place I hope to get to someday. When thinking about day-to-day operations, how does Responsible AI actually apply to the fraud world right now?

Scott Zoldi: People in operational roles need to understand what drives their fraud analytics models; it’s fundamentally important to their efficacy. That’s a huge part of what using AI ethically is all about, to really understand the core driving behaviors behind the models.

So what does that mean? When we build models we’re not simply throwing a bunch of data into a fully connected neural network and pressing “play.” We’re looking at what the emphasis of that model is [such as card not present (CNP), high-value transactions, etc.], how it performs on different types of fraud, and whether it’s changing the way it behaves from one model to the next. We study the latent features or behaviors the model is more or less sensitive to, and what that impacts – is it sensitive to dollar amount and frequency? Is it too sensitive to dollar amount and frequency?
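One simple way to picture that kind of sensitivity check is a perturbation probe: nudge one input feature and measure how much the score moves. The sketch below is purely illustrative – the model, feature names, and synthetic data are assumptions, not FICO's actual methodology.

```python
# Illustrative sensitivity probe: nudge one feature and measure score shift.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic transactions: columns are [amount, txn_frequency, is_cnp]
X = rng.normal(size=(5000, 3))
# In this toy data, fraud is driven mostly by amount and CNP status
y = (0.9 * X[:, 0] + 0.6 * X[:, 2] + rng.normal(size=5000) > 1.5).astype(int)

model = LogisticRegression().fit(X, y)

def sensitivity(model, X, col, delta=0.1):
    """Mean absolute score change when one feature is nudged by `delta`."""
    base = model.predict_proba(X)[:, 1]
    X_shifted = X.copy()
    X_shifted[:, col] += delta
    return np.mean(np.abs(model.predict_proba(X_shifted)[:, 1] - base))

for name, col in [("amount", 0), ("frequency", 1), ("is_cnp", 2)]:
    print(f"{name}: {sensitivity(model, X, col):.4f}")
```

Running a probe like this before and after a model refresh is one way a fraud team could codify which behaviors drive the score, and spot when a new model has become more (or less) sensitive to, say, dollar amount.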

As we look at the Responsible AI implications for fraud, we obviously don’t want to bring in data that just produces noise, since it is important to understand the real data behaviors driving the model. What types of core behaviors score high? We want to codify that so we know, in a changing environment such as what we saw during COVID, how that model is anticipated to continue to work, or if it may need to be adjusted in the production environment, and how.

How Much In-House Artificial Intelligence Expertise Is Needed?

TJ: If I’m running fraud operations, do I need my own team of data scientists to perform this kind of inquiry into how the fraud analytics model is developed, and to make sure all the proper techniques are there? Or are there high-level things I can request and ask about? For example, if I bought a car, I might ask a broad question like “What kind of new safety technology does this car have?” Even though I might not know exactly how all the mechanics of the car work, I understand enough to be assured – or not.

Scott: As a manager on a fraud team, it’s important to understand what the model is focused on, and what it’s not. In some instances, it’s better to ask for more information about the model components of certain sub-segments. For example, some of FICO’s custom model clients have 26 different sub-categories of performance in their fraud models. They’ll ask model behavior questions because it’s linked to the strategies and rules that drive their decision-making processes.

Going back to your car analogy, as a customer waiting – still! – for my 2022 Ford Bronco, I want to know that it has a heavy-duty winch system or how I could DIY it by putting a winch on the factory bumper. If I were a regular car guy I might not care or need to know that the Bronco Everglades edition has a “standard Bronco-first, factory-installed Ford Performance by WARN® winch kit.” But since I’m a Bronco fanatic, I want to know.

But I digress. You’re absolutely correct that clients should be cognizant of the high-level capabilities the fraud model is focused on, in terms of what we’re trying to improve. We publish model reports, and subsets of the model performance, to show what’s driving that and what’s changing in the data. User groups and customer feedback are important; some of the custom clients may want to understand performance deeply so our model can work in conjunction with their own ‘DIY’ internally developed rules and strategies.

Revisit Old Assumptions to Understand Today’s Artificial Intelligence

TJ: Scores and advanced scoring technology have been used in fraud for a long time now. If I’m head of fraud operations, chances are that there are things that my predecessors put in place and implemented, and over the years we’ve done some things to grow our detection sophistication and adapt to the environment.

So, Scott, it sounds like you’re saying it’s not a bad idea to say, “Hey, let’s just step back, go back to the beginning for a minute and ask ourselves, ‘What is our model actually trying to predict, and how does it perform across different groups of transactions?’” It sounds like understanding those assumptions would be a good, healthy use of the team’s time and energy, challenging some of the myths that have become baked into overall fraud operations, which in fact might just be “ancient knowledge” passed down over time.

Scott: That actually has to occur. At FICO, given the maturity of our business, we understand how data analytics models can be built in a huge number of ways, and with different behaviors, but with similar levels of model performance. We build models in a responsible way to not dramatically change their behaviors, because we want to maintain consistency with decisioning processes.

For example, in the past some of our customers have said, “We’re seeing more challenges with low-value CNP and want to have a better emphasis on that.” We would approach this issue as a team, informing the customers of the model sub-segments they are focusing on, which we want to perform more effectively. In this way we are empowering clients to anticipate, monitor and make sure they’re prepared for changes in how the model will behave when ingesting those events. On the client side, fraud teams can then anticipate the right rules and strategies. You can see how the overall decisioning system comes into play in addressing specific challenge areas.

It’s important to continue to look at those areas where we want to see improvement. As a vendor of models that customers are using within their AI systems, our releases can’t rapidly change model behavior, even while maintaining the same level of fraud detection. That might sound counterintuitive, but a rapid shift results in a great model ultimately producing bad decisions, because the decisioning system isn’t adjusting quickly enough to the model shifts. If that happened, the decisioning systems, rules and strategies would be misaligned.

So, this consistent dialog between FICO and clients around what they want to emphasize, and what needs to improve, is the very best way to be engaged and also not have the machine learning be such a black box. It also helps ensure the machine learning we produce is not an “academic project” but instead is firmly viewed as living in a decisioning system.

Because our clients have rules and strategies to detect CNP, cross-border credit card, cross-border ATM, frequency and velocity, and more, they want to know whether the model over time will improve performance or sensitivities to certain types of behaviors. We have tons of new Responsible AI technology that puts more emphasis on certain types of fraudulent behaviors that we want the model to do a better job on. Customers who are using models for retail banking have benefited from that, for example; we’ve made a conscious effort to build in latent data analytics features that maintain fraud performance but improve scam performance at the same time.

We use student-teacher models to ensure that the new model isn’t too dramatically different from the previous model, to avoid operational impact in the decisioning system. Most customers want to see the model tweaked over time and in ways that don’t radically impact decisioning. That’s part of what it means to be in the driver’s seat with Responsible AI. It’s also using machine learning models in an ethical way.
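The student-teacher idea can be sketched in a few lines: train the new "student" model against a blend of fresh labels and the prior "teacher" model's scores, so its behavior can't drift too far from the model already wired into decisioning. This is a generic distillation-style sketch under assumed data and a made-up blend weight, not FICO's actual training procedure.

```python
# Illustrative student-teacher sketch: constrain a new model toward the old one.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(1)

# Old data the "teacher" (production) model was trained on
X_old = rng.normal(size=(4000, 4))
y_old = (X_old[:, 0] + 0.5 * X_old[:, 1] > 1).astype(float)
teacher = Ridge().fit(X_old, y_old)

# New data where fraud patterns have shifted slightly
X_new = rng.normal(size=(4000, 4))
y_new = (X_new[:, 0] + 0.8 * X_new[:, 2] > 1).astype(float)

alpha = 0.5  # how strongly the student must agree with the teacher
soft_targets = alpha * teacher.predict(X_new) + (1 - alpha) * y_new
student = Ridge().fit(X_new, soft_targets)

# Compare behavioral drift vs. an unconstrained model fit on new labels only
unconstrained = Ridge().fit(X_new, y_new)
drift_student = np.mean(np.abs(student.predict(X_new) - teacher.predict(X_new)))
drift_free = np.mean(np.abs(unconstrained.predict(X_new) - teacher.predict(X_new)))
print(drift_student, drift_free)
```

The student still learns the new fraud pattern, but its scores stay measurably closer to the teacher's than an unconstrained refit would, which is the operational point: downstream rules and strategies don't break.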

Why Fraud Analytics Models Need to Be Explainable

TJ: Whether it’s an internally developed fraud model or a model technology that you’re evaluating, it sounds like both should be approached with the old adage, “Buyer beware.” If the seller can’t explain some of these basics to you, about how the fraud analytics model was built, how the development was tracked, what methodology was used to build it, and what precautions were taken, those are big red flags.

Scott: Yes, those are big red flags. FICO has spent 30 years developing processes to ensure model continuity. If we were to simply take the latest data, throw it into the cloud and generate a model, we could do that. But each time, the way the model behaves could be dramatically different, causing tremendous operational hardships and probably breakage in fraud systems.

This concept is at the heart of ensuring that model updates work in harmony with the rest of the AI decisioning the customer employs. It’s hard and it takes time. Because of the way that fraud moves, models will need to change over time and will continue to need to be refined, but they can’t change so much that customers have to throw away their rules and strategy every time a new model shows up.

TJ: Coming back to my original question: How much of ethical AI exploration is something I want to do? Should do? Need to do? How is that responsibility shaping up in the US and in other parts of the world?

Scott: Model explainability is increasingly a must-have. The EU has designated certain types of models as high-risk, such as around granting new credit, and this is upping requirements around model explainability. To meet requirements such as GDPR, everyone needs to have a strategy in which models are interpretable/explainable, behaviors are clear, and the explanations are sound.

Speaking of GDPR, it’s already been in force for several years now. GDPR says that if you, TJ, produce a decision on me, you must explain to me why, for example, you blocked my transactions – and that explanation had better make sense to me. Otherwise you’re not meeting GDPR tenets for responsible use of AI.

This level of compliance is not necessarily being enforced, full force, but I think the tide is turning. There will need to be much more control around decisioning systems, to meet the higher bar set for explainability as a means to reduce bias. It can be a bit daunting, as many explainable AI methods are not up to the task of meeting these fairness requirements. We have effective methods in our fraud models today and are looking to deploy even newer technology, intimately interlocked with the algorithms that will produce scores.

In other parts of the world I’m seeing more discussions of using interpretable models to achieve that level of clarity. IEEE 7000 is a standard that essentially says, “Don’t focus on developing the very best model. Focus on building the model that meets the objective, can be explained and therefore is as simple as possible.” I think that’s a good practice. Those who try to sell more complicated models are generally selling a noise-producing machine, which means you lose a lot of signal. Building a data analytics model well, in a way that’s transparent to the builder and more transparent to the user, is beneficial to business and auditors.

That’s why the last 25 of my patents have been focused on frameworks around Responsible AI. Now we need to ensure that we implement techniques like Auditable AI, so we can isolate behaviors and understand what they are, and regularly monitor those drivers.

Coming back to COVID, throughout the pandemic we had confidence in our fraud models. We simply told our customers how to adjust certain thresholds to minimize the volume of cases. Even though customer transaction behaviors changed around COVID, the core behaviors the model is based on didn’t change. We saw a 400% increase in CNP fraud but the model’s fraud detection capabilities were literally unchanged.
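The threshold adjustment Scott describes can be pictured as recomputing the case-creation cutoff so analysts keep seeing roughly the same volume of cases even as the score distribution shifts. The review rate and score distributions below are made up purely for illustration.

```python
# Illustrative threshold recalibration when transaction mix shifts.
import numpy as np

rng = np.random.default_rng(3)
target_review_rate = 0.01  # work the riskiest 1% of transactions

pre_covid_scores = rng.beta(2, 20, size=100_000) * 999
# Pandemic-era mix: scores run hotter as CNP volume spikes
covid_scores = np.clip(pre_covid_scores * 1.3, 0, 999)

old_threshold = np.quantile(pre_covid_scores, 1 - target_review_rate)
new_threshold = np.quantile(covid_scores, 1 - target_review_rate)

# Keeping the old threshold would flood the case queue;
# the recomputed threshold restores roughly the target volume.
flood_rate = np.mean(covid_scores >= old_threshold)
restored_rate = np.mean(covid_scores >= new_threshold)
print(old_threshold, new_threshold, flood_rate, restored_rate)
```

The model itself is untouched; only the operating point moves, which is why a robust model can ride out a shift like COVID with minor adjustments.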

TJ: Yes, that was amazing. The fact that the model performed well amid the most profound turmoil and behavior change in my lifetime, with minor adjustments, is a testament to having all the right things in place, in a model so robust it will evolve correctly.

Scott: Absolutely!

Follow my latest thoughts on fraud, financial crime and FICO’s entire family of software solutions on Twitter @FraudBird. Scott blogs regularly about Responsible AI, data science and his Bronco obsession, and is on Twitter @ScottZoldi and LinkedIn, too.
