I get a flood of announcements about how using big data clusters to analyze the mountains of event data companies collect will "transform" security as we know it. I think this is wrong.
What I think is this: Big data capabilities close the gap between what SIEM platforms overpromised a decade ago and what they deliver today.
First, some background. Most of the owners and operators of SIEM platforms I speak with are not happy with the platform they own/manage/use. There are lots of reasons for this, but scalability, data collection, performance, and management difficulty top the list. SIEM is a must-have technology for most enterprises because it provides critical data to compliance and operations teams, and, in many cases, it's core to security teams as well. But that does not mean the customers are happy or are not looking for something better.
Three areas have invalidated the assumptions on which SIEM buying decisions were made during the past five years -- three areas where buyers' anticipated needs and vendors' anticipated capabilities both missed the mark. What buyers thought they would need to address compliance requirements, security threats, and volumes of event data was, in fact, well below what they really needed.
So how does big data play into this? Three reasons. First, the architectural advantages of big data clusters allow them to handle larger amounts of data very quickly and to scale up with the number of events being generated. Second, the query and analysis capabilities offered by most big data clusters provide faster, more flexible approaches to combing through the data collected. Finally, big data can be embedded into, or integrated as a peer application to, existing SIEM deployments.
But beyond the native capabilities to address scale and "input velocity" -- inherent traits of big data -- it's all speculation. Yes, big data offers a great deal of flexibility in how it can mine data. And adoption of big data infrastructure to house event data helps solve the volumetric and speed issues customers experience with SIEM. That's why it's being touted as a technology that will transform security analytics.
But for the analysis to occur, someone needs to actually write the scripts to analyze the data. Someone needs to actually put MapReduce or similar queries to work. And they need to know what data they are interested in examining. The reports, forensic analysis tools, visualization capabilities, data correlation, and enrichment will need to be built. Today, that's just an idea. It's potential. And as anyone who watches NFL or NBA drafts can tell you, a lot of the time potential does not pan out.
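To make that concrete, here is a minimal sketch of the kind of query someone has to sit down and write: a MapReduce-style count of failed logins per source IP. The log format, field positions, and function names here are all hypothetical illustrations, not any vendor's actual schema or API.

```python
# Toy MapReduce-style analysis of event data. The log format below is
# a made-up example; real SIEM event formats vary widely by vendor.
from collections import defaultdict

def map_phase(line):
    """Mapper: emit (source_ip, 1) for each failed-login event."""
    fields = line.split()  # e.g. "2013-01-15 auth fail 10.0.0.5"
    if len(fields) == 4 and fields[2] == "fail":
        yield fields[3], 1

def reduce_phase(pairs):
    """Reducer: sum the counts emitted by the mapper, per key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

events = [
    "2013-01-15 auth fail 10.0.0.5",
    "2013-01-15 auth ok   10.0.0.7",
    "2013-01-15 auth fail 10.0.0.5",
    "2013-01-15 auth fail 10.0.0.9",
]

mapped = (pair for line in events for pair in map_phase(line))
print(reduce_phase(mapped))  # → {'10.0.0.5': 2, '10.0.0.9': 1}
```

The framework handles the scale; deciding that failed logins per source IP is the question worth asking, and writing the mapper and reducer, is still on you.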
Someone needs to put in the work to make the promise of big data a reality. Yes, we can get better analysis, and get it faster than before, but there is a lot of work to be done to get there. Some features will undoubtedly be provided by vendors and some by your internal development teams; most likely it will be a combination of both. The core technology is likely to address a couple of limiting factors (scale, timeliness of analysis, flexible query options) in the near term; the rest (reports, rules, controls, alerts, forensic tools) will be the same problem you have today.
If you did not have the time to write SIEM policies in the past, then it's likely you won't have time to write the queries for big data deployments in your future.
Adrian Lane is an analyst/CTO with Securosis LLC, an independent security consulting practice. Special to Dark Reading.