Dark Reading is part of the Informa Tech Division of Informa PLC



12/11/2017
08:35 AM
Bogdan Botezatu
News Analysis-Security Now

Machine Learning for Ransomware Defense

Ransomware keeps getting more dangerous but defense is improving, too. Machine learning might be the key to actually keeping up with the level of attacks.

In the past several years, ransomware has inflicted financial losses estimated at billions of dollars -- and that's only what victims have reported to law enforcement. Described by security researchers as one of the most prolific and financially lucrative malware categories, ransomware has been ported to every major operating system (OS), including mobile OSes.

Ransomware was originally developed to target computers running Windows, but as of 2016 Apple's macOS and Linux have seen their own ransomware strains. In fact, the ransomware business has been so lucrative that cybercriminals have even turned it into an as-a-service offering. Ransomware-as-a-service lets even less tech-savvy users rent ransomware and start infecting victims for their own financial gain.

Because it relies on tools such as encryption, obfuscation and polymorphism, ransomware has caused serious concern in the security industry: traditional detection mechanisms are ill-equipped to detect each victim-specific ransomware sample. Consequently, machine learning algorithms designed to automatically and correctly tag ransomware samples based on their behavior or their similarity to known ransomware have become a necessity.

While machine learning plays a significant part in detecting ransomware, no single machine learning algorithm can spot all ransomware. That requires an ensemble of specifically trained machine learning algorithms working together, each designed to identify a specific ransomware family, a ransomware-disseminating website, or a packer (packing, commonly referred to as "executable compression," compresses the ransomware payload to make it difficult for security tools to analyze).
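The ensemble idea can be sketched roughly as follows. This is only an illustration, not any vendor's implementation: the three detectors, their input field names (`files_encrypted_per_min`, `section_entropy`, `contacted_bad_url`) and the cut-off values are all hypothetical stand-ins for separately trained models.

```python
from typing import Callable, Dict, List, Tuple

Sample = Dict[str, float]
Detector = Callable[[Sample], float]  # returns a score between 0 and 1


def family_detector(sample: Sample) -> float:
    # Hypothetical stand-in for a model trained on one ransomware family:
    # fires on rapid file encryption.
    return 1.0 if sample.get("files_encrypted_per_min", 0.0) > 50 else 0.0


def packer_detector(sample: Sample) -> float:
    # Hypothetical stand-in for a packer model: very high entropy
    # suggests a compressed or encrypted payload.
    return 1.0 if sample.get("section_entropy", 0.0) > 7.2 else 0.0


def url_detector(sample: Sample) -> float:
    # Hypothetical stand-in for a model that flags distribution websites.
    return 1.0 if sample.get("contacted_bad_url", 0.0) >= 1 else 0.0


DETECTORS: Dict[str, Detector] = {
    "family": family_detector,
    "packer": packer_detector,
    "url": url_detector,
}


def ensemble_verdict(
    sample: Sample,
    detectors: Dict[str, Detector] = DETECTORS,
    threshold: float = 0.5,
) -> Tuple[bool, List[str]]:
    """Flag the sample if any specialised detector scores above the threshold."""
    fired = [name for name, detect in detectors.items() if detect(sample) >= threshold]
    return bool(fired), fired
```

Returning the names of the detectors that fired, and not just a yes/no verdict, makes it easier to see which specialised model caught a given sample.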

Ransomware doesn't discriminate
Ransomware indiscriminately selects targets, with the sole aim of locking access to critical files and demanding payment to restore access. However, it has recently undergone some transformations that have allowed it to infect victims even without any user interaction.

WannaCry was the first ransomware outbreak that leveraged a Windows vulnerability to spread automatically across networks and infect victims without any user interaction. Simply having a vulnerable PC with an Internet connection was enough to get infected, which is why hundreds of thousands of computers were rendered inoperative during its relatively short outbreak. GoldenEye was yet another example of a ransomware pandemic, one that affected several European countries, including Poland, Germany, Italy, Spain and France.

While in both cases the attackers made little money from ransoms, these incidents proved highly disruptive and demonstrated just how easily unpatched vulnerabilities can be exploited to deliver any type of threat, even one as pervasive as ransomware.

The Internet of Things is not immune either, but the business model is fundamentally different from that of the PC. Instead of locking you out of your files, for example, ransomware designed for IoT devices and smart homes could lock you out of your home. Even medical devices -- including implantable ones -- could be exploited and used to extort victims. For example, the Internet connection of Dick Cheney's pacemaker was allegedly disabled for fear that terrorists -- or even ransomware -- could threaten his life.

Machine learning steps in
Where traditional file-based detection technologies fail, machine learning algorithms can succeed. Neural networks and deep learning algorithms can detect unknown ransomware samples if they're properly trained and tuned to produce a low number of false positives. Augmenting cloud-based detections with machine learning and genetic algorithms is also effective in combating the rampant growth of ransomware caused by its polymorphic behavior.

In a nutshell, it all starts with a large dataset of ransomware files and an even larger set of clean files. The algorithm is tasked with finding characteristics of each file in the training set and normalizing each one into a number that is usually called a "feature." Because one characteristic may yield more than one feature, the feature count grows quickly, so only a subset of those features is used to train a model for the sample set.
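As a rough illustration of turning file characteristics into normalized numbers, the sketch below computes three features from raw bytes. The choice of features (byte entropy, share of printable bytes, presence of the Windows "MZ" executable header) is an assumption made for this example; the article does not say which characteristics production models use.

```python
import math
from collections import Counter
from typing import List


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0 to 8)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def extract_features(data: bytes) -> List[float]:
    """Map a file's bytes to a small vector of features, each scaled to [0, 1]."""
    if not data:
        return [0.0, 0.0, 0.0]
    entropy = byte_entropy(data) / 8.0                         # 1.0 = maximally random
    printable = sum(32 <= b < 127 for b in data) / len(data)   # share of printable ASCII
    has_mz_header = 1.0 if data[:2] == b"MZ" else 0.0          # Windows executable marker
    return [entropy, printable, has_mz_header]
```

For instance, `extract_features(open(path, "rb").read())` would yield one row of the training matrix for the file at `path`; a real pipeline would compute far more features and then select a subset.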

When neural networks are used to create models for ransomware identification, all samples are usually mapped into a matrix comprising tens of thousands of features. Instead of a three-dimensional space whose axes are three features that determine whether a file is ransomware, imagine an n-dimensional space with more than 40,000 features. That might sound extremely complicated, but the end result is actually a mathematical equation -- also known as a model -- that acts as a condition that, once satisfied, will tag a file as ransomware.
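To make "a model is an equation that acts as a condition" concrete, here is a deliberately tiny sketch. It uses a single logistic unit over three features, which is far simpler than the neural networks described above; the weights, bias and threshold are made-up values for illustration, not learned ones.

```python
import math
from typing import Sequence


def ransomware_score(features: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Evaluate the model equation: sigmoid(bias + sum of weight * feature)."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def is_ransomware(
    features: Sequence[float],
    weights: Sequence[float],
    bias: float,
    threshold: float = 0.5,
) -> bool:
    """The condition: tag the file once the score reaches the threshold."""
    return ransomware_score(features, weights, bias) >= threshold


# Hypothetical, hand-picked parameters for a three-feature model.
WEIGHTS = [4.0, -3.0, 1.0]
BIAS = -2.0
```

With these illustrative parameters, `is_ransomware([0.95, 0.1, 1.0], WEIGHTS, BIAS)` evaluates the equation for one file; a model with 40,000 features works the same way, only with a much longer weight vector and, in a neural network, several such layers stacked.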

A major benefit of using machine learning models to spot ransomware is that they increase the number of ransomware files that can be detected: if enough ransomware features are present in an unknown sample, the file is likely ransomware.

The second benefit is that machine learning models are extremely small, usually around 1 kilobyte, which makes them easy to deploy across the entire user base. The only downside of using machine learning models to detect ransomware is that they have to be extensively tested before deployment to avoid incorrectly tagging clean files as malicious.
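The pre-deployment testing mentioned above can be sketched as a simple gate on the false positive rate measured over a set of known-clean files. The function names and the 0.1% limit are assumptions for this example, not figures from the article.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def false_positive_rate(predict: Callable[[T], bool], clean_samples: Sequence[T]) -> float:
    """Fraction of known-clean samples that the model wrongly tags as malicious."""
    if not clean_samples:
        raise ValueError("need at least one clean sample")
    flagged = sum(1 for sample in clean_samples if predict(sample))
    return flagged / len(clean_samples)


def ready_to_deploy(
    predict: Callable[[T], bool],
    clean_samples: Sequence[T],
    max_fpr: float = 0.001,  # hypothetical limit: at most 0.1% of clean files flagged
) -> bool:
    """Allow deployment only if the measured false positive rate is within the limit."""
    return false_positive_rate(predict, clean_samples) <= max_fpr
```

The larger and more varied the clean set, the more trustworthy the measured rate, which is one reason the clean dataset is described as even larger than the ransomware one.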

Some machine learning algorithms can even identify suspicious URLs that are either used to disseminate ransomware or act as command and control servers. Using Natural Language Processing (NLP) algorithms and various clustering methods to parse texts, they can potentially block new or never-before-seen links from being accessed by victims, preventing the actual ransomware payload from reaching the computer.
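One simple way to picture text-based URL analysis is to break each URL into word-like tokens and compare it with URLs already known to be malicious. The sketch below uses Jaccard similarity on token sets as a stand-in for the NLP and clustering methods mentioned above; the function names and the example URLs (which use the reserved `.test` domain) are hypothetical.

```python
import re
from typing import Iterable, Set
from urllib.parse import urlsplit


def url_tokens(url: str) -> Set[str]:
    """Split the host, path and query of a URL into lowercase alphanumeric tokens."""
    parts = urlsplit(url)
    text = f"{parts.hostname or ''} {parts.path} {parts.query}".lower()
    return {token for token in re.split(r"[^a-z0-9]+", text) if token}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Similarity of two token sets: size of the overlap divided by size of the union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def nearest_known_bad(url: str, known_bad: Iterable[str]) -> float:
    """Highest similarity between this URL and any known ransomware URL."""
    tokens = url_tokens(url)
    return max((jaccard(tokens, url_tokens(bad)) for bad in known_bad), default=0.0)
```

A never-before-seen link that scores close to a known-bad cluster, for example one that differs only in a victim ID parameter, could then be blocked before the payload is downloaded.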

Machine learning algorithms for ransomware identification can be used as a proactive method for combating ransomware threats, regardless of whether they're designed for PCs, mobile devices or even IoT devices. The main benefit of machine learning is that it can augment existing security layers, making them more proactive, more effective and faster.

Ransomware is here to stay: So is defense
It's highly unlikely that ransomware will go away any time soon, especially since digitalization has brought increased interconnectivity between systems. With a proven and tested business model and financial gains in the billions of dollars, ransomware is likely the biggest mass-market threat to both end users and organizations.

However, machine learning algorithms can augment all security layers to detect and stop threats at pre-execution, on-execution and post-execution, making ransomware less of a threat and more of a nuisance.

-- Bogdan Botezatu is living his second childhood at Bitdefender as senior e-threat analyst.
