Dark Reading | Cloud | Commentary
9/10/2019 10:00 AM
Howie Xu

AI Is Everywhere, but Don't Ignore the Basics

Artificial intelligence is no substitute for common sense, and it works best in combination with conventional cybersecurity technology. Here are the basic requirements and best practices you need to know.

The fourth industrial revolution is here, and experts anticipate organizations will continue to embrace artificial intelligence (AI) and machine learning (ML) technologies. A forecast by IDC indicates spending on AI/ML will reach $35.8 billion this year and hit $79.2 billion by 2022. Though the principles of the technology have been around for decades, the more recent mass adoption of cloud computing and the flood of big data have made the concept a reality.

The result? Companies based around software-as-a-service are best positioned to take advantage of AI/ML because cloud and data are second nature to them. 

In the past five years alone, AI/ML went from a technology that showed lots of promise to one that delivers on that promise, thanks to the convergence of easy access to inexpensive cloud computing and the integration of large data sets. AI and ML have already begun to see accelerated adoption for cybersecurity uses. With mountains of data that only continue to grow, machines that analyze data bring immense value to security teams: they can operate 24/7, and humans can't.

For your cybersecurity team to effectively launch AI/ML, be sure these three requirements are in place:

1. Data: If AI/ML is a rocket, data is the fuel. AI/ML requires massive amounts of data to help it train models that can do classifications and predictions with high accuracy. Generally, the more data that goes through the AI/ML system, the better the outcome.

2. Data science and data engineering: Data scientists and data engineers must be able to understand the data, sanitize it, extract it, transform it, load it, choose the right models and right features, engineer the features appropriately, measure the model appropriately, and update the model whenever needed.

3. Domain experts: They play an essential role in constructing an organization's dataset, identifying what is good and what is bad and providing insights into how this determination was made. This is often the aspect that gets overlooked when it comes to AI/ML.
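To make requirement No. 2 concrete, here is a minimal sketch of the extract/transform/feature-engineering step. The login events, field names, and the "off hours" threshold are all invented for illustration, not taken from any particular product:

```python
from datetime import datetime

# Hypothetical raw log records; the field names are illustrative.
raw_events = [
    {"user": "alice", "ts": "2019-09-10T02:14:00", "action": "login", "ok": False},
    {"user": "alice", "ts": "2019-09-10T02:15:00", "action": "login", "ok": True},
    {"user": "bob",   "ts": "2019-09-10T10:00:00", "action": "login", "ok": True},
]

def extract_transform(events):
    """Sanitize raw events and engineer simple per-user features."""
    features = {}
    for e in events:
        ts = datetime.fromisoformat(e["ts"])  # parse/validate timestamps
        f = features.setdefault(e["user"], {"logins": 0, "failures": 0, "off_hours": 0})
        f["logins"] += 1
        if not e["ok"]:
            f["failures"] += 1
        if ts.hour < 6 or ts.hour > 22:       # crude "off hours" flag
            f["off_hours"] += 1
    # Derived ratios a model would train on.
    for f in features.values():
        f["failure_rate"] = f["failures"] / f["logins"]
    return features

feats = extract_transform(raw_events)
print(feats["alice"]["failure_rate"])  # 0.5
```

Domain experts (requirement No. 3) are who tell you that a feature like "off-hours logins" matters at all, and what counts as off hours in your business.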

Once you have these three requirements, the engineering and analytics teams can move to solving very specific problems. Here are three example categories:

1. Security user risk analysis: Just like a credit score, you can compute a risk score based on user behavior, and with AI/ML you can now scale it to a very large user base.

2. Data exfiltration: With AI/ML, you'll be able to identify patterns more readily — what's normal, what's abnormal. 

3. Content classification: Variants of web pages, ransomware strains, destinations, and more.
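As a sketch of the data-exfiltration case, even a simple statistical baseline separates "what's normal" from "what's abnormal." The daily upload volumes and the three-sigma threshold below are invented for illustration; a real deployment would use trained models rather than a single rule:

```python
import statistics

# Hypothetical daily upload volumes (MB) for one user.
history = [120, 95, 110, 130, 105, 99, 115]

def is_abnormal(today_mb, baseline, sigmas=3.0):
    """Flag a day whose volume deviates more than `sigmas` std-devs from the norm."""
    mean = statistics.mean(baseline)
    stdev = statistics.stdev(baseline)
    return abs(today_mb - mean) > sigmas * stdev

print(is_abnormal(2500, history))  # True: a sudden 2.5-GB day is far outside the norm
print(is_abnormal(118, history))   # False: within the normal range
```

Note that "abnormal" is not the same as "malicious"; flagged days still need context from domain experts.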

Adopting AI/ML in your cybersecurity measures requires you to think differently and to plan and pace the project differently, but it doesn't replace common sense or conventional best practices. Nor is AI/ML a substitute for a layered security defense. In fact, we've seen that AI/ML does far better when combined with traditional cybersecurity technology.

Here are three tenets to execute an AI/ML project:

1. "Not all data can be treated equal." Enterprise data has custom privacy and access control requirements; the data often is spread around different departments and encoded with a long history of "tribal knowledge."

2. "Wars have been won or lost primarily because of logistics," as noted by General Eisenhower. In the context of the AI/ML battleground, the logistics is the data and model pipeline. Without an automated and flexible data and model pipeline, you may win one battle here and there but will likely lose the war.

3. "It takes a village" to raise a successful AI/ML project. Data scientists need to have tight alignment with domain experts, data engineers, and businesspeople.
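The "data and model pipeline" in tenet No. 2 can be sketched as chained stages that re-run automatically whenever new data lands. Everything below is illustrative stand-in code, not a specific pipeline framework, and the "model" is just a placeholder statistic:

```python
# Each stage is a plain function, so the whole flow can be re-run end to end
# on a schedule or whenever fresh data arrives.

def extract(source):
    return [r for r in source if r is not None]               # drop corrupt records

def transform(records):
    return [{"value": r, "squared": r * r} for r in records]  # engineer features

def train(features):
    # Stand-in "model": just the mean of one feature.
    return sum(f["value"] for f in features) / len(features)

def run_pipeline(source, stages):
    data = source
    for stage in stages:
        data = stage(data)
    return data

model = run_pipeline([3, None, 5, 4], [extract, transform, train])
print(model)  # 4.0
```

The point of the sketch is the shape, not the model: when every stage is automated and composable, retraining is a re-run rather than a fire drill.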

In the past, there have been two main criticisms of AI/ML: 1) AI is a black box, so it's hard for security practitioners to explain the results, and 2) AI/ML has too many false positives (that is, false alarms). But by combining AI/ML and tried-and-true conventional cybersecurity technology, AI/ML is more explainable, and you get fewer false positives than with conventional technology alone.
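One way to picture that combination: alert only when a model's score and a conventional signature-style rule agree, which both cuts false positives and gives the analyst an explainable reason for the alert. The scores, thresholds, and blocklist below are invented for illustration:

```python
BLOCKLIST = {"badsite.example"}  # conventional, signature-style rule

def ml_score(event):
    # Stand-in for a trained model's anomaly score in [0, 1].
    return 0.9 if event["bytes"] > 1_000_000 else 0.1

def should_alert(event, threshold=0.8):
    rule_hit = event["dest"] in BLOCKLIST
    return ml_score(event) >= threshold and rule_hit

# Large transfer to a known-bad host: both signals agree, so alert.
print(should_alert({"dest": "badsite.example", "bytes": 5_000_000}))  # True
# Large transfer to a benign host: the ML score alone would have been a false alarm.
print(should_alert({"dest": "cdn.example", "bytes": 5_000_000}))      # False
```

Requiring agreement trades some recall for precision; in practice you would tune which signals must concur per use case.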

AI/ML has already proved it can help businesses in a number of ways, but it still lacks context, common sense, and human awareness. That's the next step toward perfecting the technology. In the meantime, cybersecurity defense still requires domain experts, but now these experts are helping shape the future of AI/ML methodology.


Howie Xu is Vice President of AI and Machine Learning at Zscaler. He was the CEO and co-founder of TrustPath, which was acquired by Zscaler in 2018. Howie was formerly an EIR with Greylock Partners and the founder and head of the VMware networking unit.
Comments
tdsan | 9/25/2019 9:30:44 PM | Re: adversarial attacks


 

It is going to be interesting to see, once the market is flooded with companies, whether we will see improvement or whether the learning process will taper off as the algorithms improve or as data gets tainted. I am optimistic, but only time will tell whether companies like Sophos (Intercept X) and Carbon Black (with BluVector) will be able to look at data streams and determine whether an actor is just making a mistake or mounting an elaborate attack over time, using AI to find weaknesses.

Now that will be interesting.

T

 
howie.xu | 9/25/2019 7:59:11 PM | adversarial attacks

Well aware of this paper. Adversarial attacks will come to the cybersecurity world more and more. For instance, an attacker may modify a malicious activity so that its badness is preserved but it can fool an ML model into thinking it's all legitimate/benign. Zscaler, as a leader in cloud security, is leading R&D in this area, and we welcome top researchers and engineers out there to join our extremely interesting and rewarding journey! :)

 

-Howie Xu

 

tdsan | 9/25/2019 7:08:32 PM | Re: Key points that were left out
When you get a chance, check out this article; it elaborates on the discussions we had about AI/ML.

They cover the examples you and I brought up in those discussions. It seems it just takes a small adjustment and the data is tainted, so to me that is not real AI but ML. Once AI becomes self-aware, these problems will be a thing of the past, but there could be other things we need to address.

Todd

 
howie.xu | 9/25/2019 7:00:37 PM | Re: Key points that were left out
Hi Todd, very true.  I didn't elaborate in this article, but your point about data is very valid.  That's why my top best practice is "not all data is created equal."  :)

Data has privacy issues, and then data quantity (volume/processing capacity) and data quality (for instance, what data can be used for what use case) issues too. The list goes on. :)

 

cheers,

 

-Howie
tdsan | 9/25/2019 6:49:04 PM | Re: Key points that were left out
Yes, there is no silver bullet; it is still a work in progress, but we have to continue to move forward because the future (or the outcomes, I should say) seems to be getting brighter and brighter.

Of course, in the security realm, layering solutions to make it harder for an assailant to penetrate your defenses is common sense (the onion, or layered, approach).

And yes, I do agree that it is going to take time for AI to make decisions that match our expected outcomes. But I am curious about the validity of the data: if that data is tainted in any way (biases), the results of AI could be skewed in ways that affect people's personal lives where it has been deployed (such as, possibly, going into neighborhoods and opening fire on people of color). I would think we need to be able to filter data that is far outside normal parameters, though that is up for discussion. There will be one-offs.

T

 

 
howie.xu | 9/25/2019 3:15:23 PM | Re: Key points that were left out
Hi Todd, I appreciate your detailed feedback, compliments, comments, and questions.

 

AI/ML can help identify "what's normal, what's abnormal" faster, but the truth is that "abnormal" does not equal "malicious," as you probably meant to express too.

 

There is no silver bullet yet. AI/ML helps solve a large-scale problem, but one machine learning model often is not enough.  Often you need multiple models ensembled together, and you sometimes need heuristics to help too.

It is naive to think one machine learning model can detect anomalies, and hence bad/malicious behavior, on its own, but it is reasonable to think one machine learning model can be one of the critical pieces of the puzzle.

 

Hope it helps,

 

-howie
tdsan | 9/12/2019 1:49:48 PM | Key points that were left out

1. Data: If AI/ML is a rocket, data is the fuel. AI/ML requires massive amounts of data to help it train models that can do classifications and predictions with high accuracy. Generally, the more data that goes through the AI/ML system, the better the outcome.

I like the fact that you prefaced the statement with "generally," and in section 3 you addressed it quite nicely.

3. Domain experts: They play an essential role in constructing an organization's dataset, identifying what is good and what is bad and providing insights into how this determination was made. This is often the aspect that gets overlooked when it comes to AI/ML.

I do like the fact that you mentioned "what's normal, what's abnormal." But that statement I am not so sure of, because if we consider what falls outside the various thresholds, in the human world we have to take time, and one-offs, into consideration. What if someone forgot to do something and ran a task in the middle of the day, a task that goes out, runs a report, and provides that report to the management staff? That is not part of the norm from a business-process standpoint, but it is within the norm of normal business operations. The AI/ML could identify this task as a threat.


However, I do like this statement you wrote, very perceptive:

2. "Wars have been won or lost primarily because of logistics," as noted by General Eisenhower. In the context of the AI/ML battleground, the logistics is the data and model pipeline. Without an automated and flexible data and model pipeline, you may win one battle here and there but will likely lose the war.


I would think it is the processes and planning that create the data (the logistics), and that the pipeline is how the data is transferred, executed, and delivered to the right people at the right time; this is truly how wars are won.

"The more you sweat in peace, the less you bleed in war." - General Schwarzkopf


The details (data), planning (process), and execution (pipeline) are the key elements used to effectively address the issues we see every day. The only time we will be close to winning this war on cyber-terror is when we start looking at people as human beings and provide a roadmap to respect even the menial garbage worker, because no criminal (there are outliers) wants to remain in the position in which they started.


Todd