We’re back, with a new edition of our series on choosing the right analytics for the job.
Today’s situation: data is in ample supply, but too much of it feels beyond the reach of traditional analytic algorithms. By volume, 80 percent of the data out there is untapped and much of it is unstructured – too messy and just too complex to crack open. Regression models thrive on numbers and numbers only, so data in the form of customer call logs, online support chats and detailed case records is just irrelevant noise to them.
And when the source information doesn’t “fit” neatly into a lot of our tools, we often write it off as unusable or meaningless.
But the right tools can bring those large, untapped troves of textual information into reach. Text analytics encompasses a large body of techniques that can help us improve on what we know, and even offer entirely new insights into our customers’ behaviors and needs.
In particular, supervised text mining techniques can help us learn which terms, phrases and content connect directly to our modeling target. In a new FICO eBook (login required), we introduce the Semantic Scorecard, a predictive modeling algorithm that transparently incorporates text analytics into your scoring, to help sharpen decisions. And in our latest FICO Insights white paper (login required), we elaborate on text mining at large and the Semantic Scorecard itself, and even describe some forms of unsupervised text mining, applied to the very real-world problem of customer segmentation.
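To make the idea of supervised text mining a little more concrete, here is a minimal Python sketch that scores how strongly each term is associated with a binary modeling target. It is not the Semantic Scorecard; it is a simple smoothed log-odds calculation, and the function name `term_target_association`, the sample call notes and the labels are all hypothetical, made up purely for illustration.

```python
import math
import re
from collections import Counter


def term_target_association(docs, labels, smoothing=1.0):
    """Score each term by the smoothed log-odds of appearing in
    target=1 documents versus target=0 documents.

    Hypothetical helper for illustration only.
    """
    pos_counts, neg_counts = Counter(), Counter()
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos

    for text, y in zip(docs, labels):
        # Count each term at most once per document (document frequency).
        terms = set(re.findall(r"[a-z']+", text.lower()))
        (pos_counts if y == 1 else neg_counts).update(terms)

    scores = {}
    for term in set(pos_counts) | set(neg_counts):
        # Additive smoothing keeps unseen terms from producing log(0).
        p_pos = (pos_counts[term] + smoothing) / (n_pos + 2 * smoothing)
        p_neg = (neg_counts[term] + smoothing) / (n_neg + 2 * smoothing)
        scores[term] = math.log(p_pos / p_neg)
    return scores


# Made-up call notes; label 1 is an assumed target such as
# "customer later closed the account".
docs = [
    "customer asked to cancel the account after a billing dispute",
    "billing dispute unresolved, customer wants to cancel",
    "customer asked about a new card and thanked the agent",
    "customer updated address and thanked the agent",
]
labels = [1, 1, 0, 0]

scores = term_target_association(docs, labels)

# List terms from most to least associated with the target.
for term, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{term:12s} {score:+.2f}")
```

Positive scores mark terms that show up more often in documents where the target is 1, negative scores mark the opposite, and terms that appear equally on both sides score zero. On real data, terms ranked this way could serve as candidate inputs to a predictive model alongside the usual numeric variables.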
With these advances, the promise of better decisions through data gets even stronger: better decisions through all kinds of data.