Posted by Carole-Ann Matignon
A topic we are passionate about at Fair Isaac is analytics deployment. We have spent 52 years figuring out how to build best-of-breed predictive models and how to make them available. I fall into the same category: it keeps me awake at night. Not that I am worried about it, but I find it exciting to keep pushing the challenge a step further.
You may have read many great posts from my colleagues on why it is a good idea, if not a necessity, to deploy predictive models and rules together. I will quote only one of those arguments here: time to deploy. We have heard many financial institutions complain about how long it takes to deploy their current analytics. Almost half of the respondents to our survey said it takes 3-9 months or longer, and over a year for many of them. Assuming those models address competitive pressure or a change in consumer habits or business conditions, that is a long time to wait once you have defined your new strategy.
With this context in mind, we have been working on many different ways to deploy those models in combination with business rules. The obvious reason we have explored this path is that business rules can be deployed in a very short time frame (technically in seconds, practically in hours given formal quality control processes). In one word: agility.
We started many years ago with black box deployments that are made available to IT. For example, a Java deployment could then be integrated into a Java application. Many vendors are following this lead nowadays. We have since adopted a complementary white box integration via PMML (Predictive Model Markup Language, from the Data Mining Group). I would like to share here a handful of lessons learned in that process.
- Don't tie your predictive model life cycle to a given technology: Relying on standards such as PMML leaves you free to choose where you ultimately deploy your models. If you use a Java deployment out of your modeling environment, it is unlikely you will be able to deploy in COBOL or .NET natively. I recommend that you consider where you need to deploy today and anticipate as much as possible the additional environments you will need to support in the future.
- Models should be used by business rules, not the other way around: Often you have a choice of where to put your business logic: eligibility rules may end up in the model itself, or they could reference the model (if your FICO score is too low, you will not qualify for a Jumbo loan). The key reason you should opt for the latter as much as possible is that business rules change and should be maintained by business users. Your scores may not change, or not at the same pace. Forcing your logic into the modeling environment forces the business to coordinate with modelers and then IT before those changes get deployed into Production. You lose in agility what you gained in precision.
- Consider your model's life cycle management as a process: There are still a fair number of companies that exchange model definitions as a paper document with little or no traceability to the source. In today's world, where governance is becoming paramount, it is important to start linking those artifacts with all necessary documentation. Besides future regulations, there is already value today in doing so. When models need to be refreshed, it helps to know where they came from and how they were developed (training data, exploratory process, etc.).
- Don't assume that Modeling and Operational data models are the same: Modeling data is often prepared in a different way. The data model may have been flattened, data attributes may have been populated or filtered to avoid missing values that exist in real life, characteristics may have been pre-calculated, etc. Code developed to access the Modeling Data Model may therefore not be optimal, or may not even run reliably, against operational data.
- Do not underestimate the need for IT to "debug" the model code: A black box deployment prevents IT from accessing the model definition for debugging or runtime performance profiling sessions. Authorization mechanisms and IP protection tools can effectively protect the model definition if this is a concern. There are times when IT needs to get involved, typically during an emergency, so do not architect your solution to make it impossible.
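To make the second lesson above more concrete, here is a minimal Python sketch of a business rule that references a model instead of being baked into it. Everything in it is an assumption made for illustration: the names `toy_score_model` and `jumbo_loan_rule`, the scoring formula, and the threshold of 700 are hypothetical and do not represent any real scoring model or lending policy.

```python
from dataclasses import dataclass
from typing import Callable, Mapping

# Hypothetical threshold, for illustration only; not a real lending policy.
MIN_SCORE_FOR_JUMBO = 700.0

Applicant = Mapping[str, float]
ScoreModel = Callable[[Applicant], float]


def toy_score_model(applicant: Applicant) -> float:
    """Stand-in for a deployed predictive model (for example, one loaded from PMML).

    The formula is made up; a real model would come from the modeling team.
    """
    income = applicant.get("income", 0.0)
    debt = applicant.get("debt", 0.0)
    raw = 600 + 0.001 * income - 0.002 * debt
    # Clamp the result to an assumed score range of 300 to 850.
    return max(300.0, min(850.0, raw))


@dataclass
class Decision:
    eligible: bool
    reason: str


def jumbo_loan_rule(
    applicant: Applicant,
    model: ScoreModel,
    min_score: float = MIN_SCORE_FOR_JUMBO,
) -> Decision:
    """Business rule that calls the model, rather than living inside it."""
    score = model(applicant)
    if score < min_score:
        return Decision(False, f"score {score:.0f} is below {min_score:.0f}")
    return Decision(True, f"score {score:.0f} meets {min_score:.0f}")


if __name__ == "__main__":
    applicant = {"income": 150_000, "debt": 10_000}
    # Same model, two different rule settings: only the rule changes.
    print(jumbo_loan_rule(applicant, toy_score_model))
    print(jumbo_loan_rule(applicant, toy_score_model, min_score=750))
```

Because the threshold lives in the rule and the model is passed in as a plain callable, a business user could tighten or loosen eligibility without the model being touched or redeployed, and the model could be swapped for another implementation without rewriting the rule.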
With those tips in hand, you can now question the technologies that you have selected and your current design for decision management. How effectively can your system deploy predictive analytics and business rules together? Have you architected the process with enough flexibility?