I attended a TDWI webinar today by Wayne Eckerson on Predictive Analytics. It presented information from a recent survey TDWI conducted of its members (here's the report's executive summary and you can get the report here).
Now TDWI, and Wayne, are approaching this from the point of view of Business Intelligence/Data Warehousing and that gives them a certain perspective. Like me, Wayne and TDWI want practitioners to move beyond capturing the data to analyzing and using it. Indeed Wayne describes predictive analytics as a form of BI. I am not sure I buy that (after all there are serious differences between BI and EDM), as I consider predictive analytics to be more than "uncovering relationships": it is about creating executable models (I tried to explain this here, and I like this definition of analytics from Gartner). Indeed the presentation often seemed to focus more on what I would call "predictive reporting" than on real, executable predictive analytics. That said, I do agree that prediction builds on reporting, analysis and monitoring, though analytics and EDM can also add value to monitoring, not just build on it, as I noted in this article on BAM and in this one on shifting performance management into action.
Some other thoughts:
- Wayne referred to Blink, a great book I reviewed here and that I, like Wayne, highly recommend
- It was interesting that the survey showed lots of exploration and development of predictive analytics (64%) but only 21% had implemented it partly or fully.
Remember the survey was of BI/DW people and it has to be said that at many of our customers the BI/DW and predictive analytics groups are completely separate! Nevertheless there is a clear and growing interest in predictive analytics amongst BI/DW practitioners.
- Of those adopting it only about a third are really measuring and iterating models
This worries me as adaptive control, the constant monitoring and improving of analytic models, has historically been critical to success in this area. We say Enterprise Decision Management specifically to highlight this aspect of decision automation.
- Wayne gave a nice list of target applications with marketing, forecasting, churn and fraud topping the list.
It was interesting that budgeting and forecasting scored so highly, and campaign management for that matter. These do not lend themselves to embedded analytic models in operational systems so much as to predictive reporting. Indeed, Wayne gave some examples and a number were of predictive reporting while others were more EDM-like, embedding the predictive analytics into operational systems.
- Wayne noted that the credit card business is a massive adopter with big teams of analysts.
This is one of the industries that seems to have separate analytic and BI/DW teams.
- Wayne noted that the ROI of predictive analytics is much higher than for other BI applications.
This matches my experience, but then I can be cynical about the ROI for BI. While the ROI for EDM/predictive analytics is higher, the investment is also higher.
- I found it interesting that TDWI respondents moved their analysts into their information management team as this is not what long-time users do.
Clearly there need to be different paths to adoption for those who are adding predictive analytics to BI from those who adopted analytic modeling some time ago (like banks, airlines).
- Data interaction is an important component of the time analysts spend, and that increases the value of analysts being close to the data.
No argument from me
- Scoring and deployment seemed not to cause many problems for the respondents
This sounds too easy to me and again I suspect that comes from the amount of predictive reporting involved here.
- Wayne is right that data mining/predictive analytic tools are better and that they reduce the time needed to build models
I disagree that they make it easy for someone without deep skills, however, as they still need real skill to use properly. They do not make it possible for non-modelers to build good models.
- Data warehouses can be really useful for modeling but there are issues in terms of how they are organized and set up
For instance data with a time component is often stored by absolute time/date in a warehouse where an analytic modeler wants data in terms of "days since acquisition" or "months before cancellation". Extending a data warehouse to support analytic modeling is possible but takes non-zero effort. As Wayne suggests you will likely need to create an analytic data mart.
- Wayne mentioned PMML and, while it is a good standard (and one Fair Isaac supports), it does not cover characteristics well (the precursor calculations for a model) and this can be a real problem
Wayne's examples were about predictive reporting - embedding models into reports - where this may be less of an issue
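To make the earlier point about absolute versus relative time concrete, here is a minimal sketch of re-expressing date-stamped warehouse rows as "days since acquisition" and "months before cancellation". Everything in it is an assumption for illustration: the customer dates, the transactions, and the helper names `days_since` and `months_before` are hypothetical, and a real analytic data mart would do this in its load process rather than in a script.

```python
from datetime import date


def days_since(anchor: date, event: date) -> int:
    """Whole days from the anchor date (e.g. acquisition) to the event."""
    return (event - anchor).days


def months_before(anchor: date, event: date) -> int:
    """Whole calendar months from the event up to the anchor (e.g. cancellation)."""
    months = (anchor.year - event.year) * 12 + (anchor.month - event.month)
    # If the event falls later in its month than the anchor does in its month,
    # the final month is incomplete and should not be counted.
    if event.day > anchor.day:
        months -= 1
    return months


# Hypothetical customer record; all values are made up for illustration.
acquired = date(2006, 1, 15)
cancelled = date(2006, 11, 20)
transactions = [
    (date(2006, 2, 1), 120.00),
    (date(2006, 6, 30), 45.50),
    (date(2006, 11, 5), 10.00),
]

# Re-key each transaction relative to the customer's own lifecycle events
# instead of the calendar date stored in the warehouse.
relative_rows = [
    {
        "days_since_acquisition": days_since(acquired, when),
        "months_before_cancellation": months_before(cancelled, when),
        "amount": amount,
    }
    for when, amount in transactions
]

for row in relative_rows:
    print(row)
```

The arithmetic is trivial for one customer; the effort comes from doing it consistently for every customer and every event type, which is why a separate analytic data mart tends to be the practical answer.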
Overall I was glad to see that Wayne is promoting predictive analytics as a next step for BI/DW adoption. BI/DW professionals need to look beyond predictive reporting, however, and understand the power of rules and analytics in combination to automate 95-99% of operational decisions - EDM, in other words. EDM is about analytically-driven processes and, while these include predictive analytics, predictive analytics can also be used in decision support systems and reports (I discussed the differences between decision support systems and EDM systems with Dan Power).
If you want to know more about some of the techniques I wrote a post on the basics of predictive analytics in EDM while Kathy Lange of SAS wrote a nice introduction on DM Review and I think the Berry and Linoff book on Data Mining Techniques is great. If you do go beyond predictive reporting to embedding predictive analytics, don't underestimate the challenges of letting machines take decisions. There is more in the section on predictive analytics.
As a footnote, I was reading my Computerwire newsletter at the weekend and saw an interesting piece about an Accenture study. Yesterday I got Rob Preston's InformationWeek "Between the Lines" email about BI still being in its infancy. There were some nice little factoids in these two pieces:
- Most of the business information that middle managers eventually get their hands on is useless
- Middle managers spend more than a quarter of their time sifting and searching through information
- 50% of that information is of no value to them
- 60% of middle managers said the problem stems from poor information distribution
Now Rob goes on to say that you may well have problems "even if all your data is clean, up to date, and easily accessible in a central repository" and that there is "plenty of analytical overkill" going on. I think, in contrast, that there is not enough focus on the use of information to make processes and transactions flow more quickly by automating decisions. Middle managers want to get their job done. Do they really need data or do they need systems that automate more of the day-to-day?
Finally a shameless plug for a recent piece by Bloor Research that discusses Fair Isaac's range of EDM technologies and how they "define the top-end of the data mining market".