Why “Alternative Data” Isn’t an Alternative

In the last few weeks I have been asked on several occasions to comment on the use of “alternative data” in credit scoring and risk assessment. Sometimes it seems I am being asked to defend using “traditional data” rather than the new, cool stuff.

I’m going to share a few observations that I hope can put this in context. Here’s the first: There is value in what people call “alternative data.” It just isn’t alternative.

People often talk about alternative data as if it were somehow magical and, to paraphrase a famous UK lager ad, able to reach places traditional data can’t. It reminds me of how fans talked about “alternative rock” bands like The Sex Pistols or Nirvana: they were just cooler than “classic rock” bands like The Rolling Stones.

How often have you seen words like limited, narrow and restricted used to describe traditional data, and terms like broader, voluminous and diverse to describe alternative data? Very often, the people using these words fail to mention the value of the data. And that’s what counts.

Who cares if I “only” have your credit card payments and deposit history, if the value of this data far outweighs that of any other data I can get? When you have a person’s credit history, the law of diminishing returns for other data sources sets in pretty early on.

That’s why I say alternative data isn’t an alternative. It’s additional. (“Additional data” doesn’t sound so cool, does it?) To imply otherwise is to present a false choice, because no lender is going to throw away analysis of a consumer’s payment history.

For example, in a recent project building a scoring model outside the US, we found that the “alternative” data added 8%, in relative terms, to the model’s ROC, a measure of a predictive model’s power. Now, 8% is non-trivial and certainly worth having, but it’s not even double digits. Would you abandon the 92% of predictive power the traditional data provides? Of course not.
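To make that arithmetic concrete, here is a minimal sketch in Python. It assumes “ROC” means the area under the ROC curve (AUC), which the post does not state, and the two AUC values are hypothetical numbers chosen only to produce an 8% relative lift; they are not figures from the project. The function names are also just illustrative.

```python
def relative_lift(baseline: float, combined: float) -> float:
    """Relative improvement of the combined model over the baseline model."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (combined - baseline) / baseline


def baseline_share(baseline: float, combined: float) -> float:
    """Fraction of the combined model's score that the baseline reaches alone."""
    if combined <= 0:
        raise ValueError("combined must be positive")
    return baseline / combined


# Hypothetical AUC values (assumptions for illustration, not project results).
auc_traditional = 0.75  # model built on traditional credit data only
auc_combined = 0.81     # same model with the "alternative" data added

print(f"Relative lift from additional data: {relative_lift(auc_traditional, auc_combined):.1%}")
print(f"Share reached by traditional data:  {baseline_share(auc_traditional, auc_combined):.1%}")
# Relative lift from additional data: 8.0%
# Share reached by traditional data:  92.6%
```

Under these assumed numbers, the traditional data on its own gets you to roughly 93% of the combined model’s score, which is in line with the 92% quoted above. Other ways of measuring predictive power would shift the exact split, but not the point: the additional data sits on top of the traditional data rather than replacing it.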

Sometimes we need additional data because the traditional stuff is missing completely, too short or too old. Done right, the use of such data is in all our interests. I’ll write more about what “done right” means in a future post. In the meantime, remember – “alternative data” for lending isn’t.

