Like Sherlock, We Need Details To Fight Cyber Crime

“Data! Data! Data! I can't make bricks without clay.”

That’s what Sherlock Holmes says in “The Adventure of the Copper Beeches,” and the same goes for today’s data sleuths trying to stop cyber crime.

It’s a positive sign that lawmakers recognize this need. Information sharing has become one of the fundamental tenets of cybersecurity being discussed by regulators on Capitol Hill. But the data we need goes beyond what is being proposed in some regulatory plans.

On March 3, the House Commerce Subcommittee on Oversight and Investigations held a hearing on “Understanding the Cyber Threat and Implications for the 21st Century Economy.” Greg Shannon, Chief Scientist of the CERT Program at Carnegie Mellon University, spoke to this need:

“Richer data needs to be shared with the research and development community — meaning not only incident data but also datasets that enable understanding of what ‘normal’ resembles (in terms of network activity, user behavior, etc.). If situational awareness is to develop beyond simple indicators, researchers and developers need access to everyday data so that they can begin to recognize what datasets are important. If the research community were able to successfully determine which features in datasets were essential to combating the cyber threat, then in effect, over time less data would need to be shared to productively handle cyber risks.”

We have said the same thing in our discussions with regulators and policy-makers. In cyber, we need enough detail, including times, frequencies, and at least some level of metadata about traffic flows, to recognize patterns and establish baseline behaviors that can be incorporated into responsive analytics. These data and behavioral variables are then analyzed and incorporated into updates of the mathematical models used to score suspicious activity.
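To make the idea of baseline behaviors concrete, here is a minimal sketch of that kind of scoring: learn what normal traffic volume looks like per host from flow metadata, then flag new observations by how far they fall from that norm. The z-score approach, the field names, and the sample data are all illustrative assumptions, not the actual analytics behind our products.

```python
# Minimal illustrative sketch: build a per-host baseline from flow
# metadata, then score new activity against it. All names and data
# are hypothetical, not an actual production scoring model.
from collections import defaultdict
from statistics import mean, stdev

def build_baseline(flows):
    """flows: iterable of (host, byte_count) pairs from a known-normal
    period. Returns per-host (mean, stdev) of bytes per flow."""
    per_host = defaultdict(list)
    for host, byte_count in flows:
        per_host[host].append(byte_count)
    return {
        host: (mean(counts), stdev(counts) if len(counts) > 1 else 1.0)
        for host, counts in per_host.items()
    }

def score(baseline, host, byte_count):
    """Distance of an observation from the host's baseline, in standard
    deviations; higher scores flag more suspicious activity."""
    mu, sigma = baseline.get(host, (0.0, 1.0))
    return abs(byte_count - mu) / (sigma or 1.0)

# Example: learn from normal traffic, then score new observations.
normal = [("10.0.0.5", 1200), ("10.0.0.5", 1350), ("10.0.0.5", 1100),
          ("10.0.0.9", 400), ("10.0.0.9", 450), ("10.0.0.9", 380)]
baseline = build_baseline(normal)
print(score(baseline, "10.0.0.5", 1250))   # near the norm -> low score
print(score(baseline, "10.0.0.5", 90000))  # far outside   -> high score
```

Even this toy version shows why times, frequencies, and flow metadata matter: strip them out, and there is no baseline left to score against.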

We don’t need all of the raw data to do our work, but if the data is too heavily processed, the models built on it will suffer.

FICO has tremendous experience in this area; I would argue more experience than anyone. For the past 23 years, we have been building the fraud detection models in FICO Falcon Fraud Manager based on an ever-growing worldwide consortium of data contributed by around 9,000 banks. I should point out that this data is fully depersonalized, removing concerns about individuals’ data being shared.

We will take the same consortium approach with cybersecurity. Contributions to the consortium will allow patterns to be determined based on current cyber attack attempts, so that we can quantify, qualify and rank threats in terms of impact. This creates a continuous learning loop: actionable insights go to our clients via the updated analytics, and more data flows back to the models via the consortium. This provides a broader and more detailed view than any one installation could obtain on its own.
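For illustration only, here is a hypothetical sketch of one turn of that loop, with the pooling reduced to its simplest form; the member names, data, and model update are all assumed stand-ins:

```python
# Hypothetical sketch of one consortium learning cycle: members
# contribute depersonalized records, the pooled data refits a shared
# model, and the refreshed model goes back to every member.
from statistics import mean

def pool(contributions):
    """Flatten per-member record lists into one consortium dataset,
    giving a broader view than any single installation's data."""
    return [record for member_records in contributions
            for record in member_records]

def refit(pooled_byte_counts):
    """Stand-in for a model update: here, just a refreshed traffic norm."""
    return mean(pooled_byte_counts)

# Two members' depersonalized observations, pooled for one update cycle.
bank_a = [1200, 1350, 1100]
bank_b = [400, 450, 380]
shared_norm = refit(pool([bank_a, bank_b]))
print(shared_norm)  # reflects both members' traffic, not just one's
```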

As governments and other bodies consider proposals for information sharing to build cybersecurity, they need to consult with the people who will be researching the problem and building the models.
