Li Chen & Ravi Sahita
In this talk, we compare the resiliency and trustworthiness of compositions of deep learning (DL) and classical machine learning (ML) algorithms for security, via a case study evaluating the resiliency of ransomware detection against the generative adversarial network (GAN). We propose using a GAN to automatically produce dynamic features that exhibit generalized malicious behaviors and reduce the efficacy of black-box ransomware classifiers. We examine the quality of the GAN-generated samples by measuring their statistical similarity to real ransomware and benign software. Further, we investigate the latent subspace in which the GAN-generated samples lie and explore why such samples cause a certain class of ransomware classifiers to degrade in performance. The automatically generated adversarial samples can then be fed back into the training set to reduce the detectors' blind spots.
There has been a surge of interest in using machine learning (ML), particularly deep learning (DL), to automatically detect malware through its dynamic behaviors. These approaches have achieved significantly higher detection rates and lower false-positive rates at large scale compared with traditional malware analysis methods. ML-based threat detection has proven to be a good cop guarding platform security. However, it is imperative to ask: is ML-powered security resilient enough?
To generate reliable traces of system activity, we can utilize CPU-based telemetry such as Intel Processor Trace, which can be extracted via a hypervisor without guest instrumentation. We argue that file I/O events extracted from Intel Processor Trace, together with algorithmic improvements, can provide a stronger defense for ML-based models deployed in the wild against ransomware attacks. Our results and discoveries pose relevant questions for defenders, such as how ML models can be made more resilient for robust enforcement of security objectives.
BlueHat Seattle 2019 || The good, the bad & the ugly of ML based approaches for ransomware detection
2. PRESENTERS: LI CHEN, RAVI SAHITA
CONTRIBUTORS: LI CHEN, RAVI SAHITA, CHIH-YUAN YANG, ANINDYA PAUL
Machine learning based
ransomware detection:
the good, the bad, and the ugly
4. OUTLINE OF THE TALK
• Ransomware detection case study
• The Good:
• Machine Learning (ML) is effective
• The Bad:
• ML can launch adversarial attacks on ML models
• The Ugly:
• ML Model durability
• Improving detection via complementary platform capabilities
5. What is Ransomware?
• Ransomware is a category of malware that hijacks a victim's data or
machine and demands payment
• Categories:
• Locker-ransomware: hijack resources without encryption
• Crypto-ransomware: deny access using encryption
• The damage done by crypto-ransomware is irreversible in most cases due
to the use of cryptography
7. Ransomware Data Description
• Downloaded ~22k ransomware samples
in total, using Microsoft's and Kaspersky's
labels from VirusTotal
• ~ 5min execution for each sample
• Decoy files to identify activated crypto-ransomware:
identified ~4.4k active samples
[Chart: distribution across ransomware families]
8. DATA ACQUISITION VIA Sandbox
System
• Bare-metal system built on
Windows*-based system
• Refresh system by checkpointing
SSD writes and restoring SSD
partition image
• Anti-evasion mechanisms
• Simulated human activities
• Opened applications
• Limited heuristics
[Diagram: sandbox architecture with control server, data storage, robots, router, programmable power control, and internet connection]
9. Behavior Data BASED ON I/O Events
• Collected: timestamp, I/O event type, target filename, entropy
• Based on C# .Net framework FileSystemWatcher
• Entropy of target files calculated by normalized Shannon entropy
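As a concrete illustration of the normalized Shannon entropy mentioned above, here is a minimal sketch over raw file bytes (the function name is ours, not from the talk). Raw byte entropy is at most log2(256) = 8 bits, so dividing by 8 normalizes it to [0, 1]; encrypted or compressed content scores close to 1, which is what makes entropy useful for spotting crypto-ransomware writes.

```python
import math
from collections import Counter

def normalized_entropy(data: bytes) -> float:
    """Normalized Shannon entropy of a byte string, in [0, 1].

    Shannon entropy over byte values is at most log2(256) = 8 bits,
    so dividing by 8 maps the result into [0, 1]. Encrypted or
    compressed file contents score close to 1.
    """
    if not data:
        return 0.0
    counts = Counter(data)
    n = len(data)
    h = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return h / 8.0
```

A file whose post-write entropy jumps near 1 is a candidate encryption victim.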
11. Feature extraction

Event                                        Feature encoding
Padding                                      0
File deleted                                 1
File content changed, entropy in [0.9, 1]    2
File content changed, entropy in [0.2, 0.4)  3
File content changed, entropy in [0, 0.2)    4
File created                                 5
File content changed, entropy in [0.8, 0.9)  6
File renamed                                 7
File content changed, entropy in [0.4, 0.6)  8
File content changed, entropy in [0.6, 0.8)  9

• Each execution log is represented by a sequence of events.
• We set the length = 3000 for each sample.
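The encoding table and fixed length of 3000 can be sketched as follows (function and variable names are our own; the talk does not publish its implementation). Event sequences are truncated or zero-padded to the fixed length, with 0 reserved as the padding code.

```python
from typing import List, Optional, Tuple

# (low, high, code) buckets for "file content changed" events; the
# upper bound of the top bucket is nudged past 1 so entropy == 1.0
# falls into code 2, matching the table's closed interval [0.9, 1].
ENTROPY_BUCKETS = [
    (0.9, 1.01, 2),
    (0.2, 0.4, 3),
    (0.0, 0.2, 4),
    (0.8, 0.9, 6),
    (0.4, 0.6, 8),
    (0.6, 0.8, 9),
]
EVENT_CODES = {"deleted": 1, "created": 5, "renamed": 7}
SEQ_LEN = 3000  # fixed sequence length per sample; 0 is padding

def encode_event(event: str, entropy: Optional[float] = None) -> int:
    """Map one I/O event (plus entropy, for content changes) to its code."""
    if event == "changed":
        for low, high, code in ENTROPY_BUCKETS:
            if low <= entropy < high:
                return code
    return EVENT_CODES[event]

def encode_log(events: List[Tuple[str, Optional[float]]]) -> List[int]:
    """Encode an execution log as a fixed-length integer sequence."""
    seq = [encode_event(e, h) for e, h in events[:SEQ_LEN]]
    return seq + [0] * (SEQ_LEN - len(seq))  # zero-pad to SEQ_LEN
```

Each sample then becomes one row of the n x 3000 matrix described on the next slide.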
12. ML model results for ransomware
detection
❖Train-Test ratio: 0.8:0.2
❖Training samples: 1292 benign, 3736 malicious
❖Test samples: 324 benign, 934 malicious
❖Dimensionality: n x 3000
❖7 ML models
We select Text-CNN as the feature
extractor due to its superior
performance compared with the
other classifiers.
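For intuition, here is a minimal NumPy sketch of a Text-CNN-style feature extractor (in the spirit of Kim's Text-CNN) over the encoded event sequences: embed each event code, convolve with several filter widths, apply ReLU, then max-over-time pool and concatenate. The weights below are random stand-ins; in the talk they are learned end-to-end, and the pooled vector is what later slides call the Text-CNN feature subspace.

```python
import numpy as np

rng = np.random.default_rng(0)

VOCAB, EMBED, SEQ_LEN = 10, 16, 3000        # 10 event codes (0-9)
FILTER_WIDTHS, N_FILTERS = (3, 4, 5), 32

# Random stand-ins for learned parameters.
embedding = rng.normal(size=(VOCAB, EMBED))
filters = {w: rng.normal(size=(N_FILTERS, w * EMBED)) for w in FILTER_WIDTHS}

def textcnn_features(seq):
    """Map a length-SEQ_LEN list of event codes to a feature vector."""
    x = embedding[np.asarray(seq)]                      # (SEQ_LEN, EMBED)
    feats = []
    for w, f in filters.items():
        # All width-w windows, flattened: (SEQ_LEN - w + 1, w * EMBED)
        windows = np.stack([x[i:i + w].ravel()
                            for i in range(SEQ_LEN - w + 1)])
        conv = np.maximum(windows @ f.T, 0.0)           # ReLU activation
        feats.append(conv.max(axis=0))                  # max-over-time pool
    return np.concatenate(feats)                        # (3 * N_FILTERS,)
```

Downstream classifiers are then trained on these pooled features rather than on the raw 3000-event sequence.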
14. Features are well-separated in Text-CNN subspace
Class-conditional density plot for each dimension in Text-CNN feature space.
15. Classifiers greatly improve in the Text-CNN feature subspace
Classifiers improve up to 55% in accuracy in Text-CNN space.
16. The Good - summary
Machine learning is highly effective for malware detection.
When ML classifiers are used in security-critical applications,
are accuracy, FPR, precision, recall, F1 scores enough?
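For reference, the metrics the slide names can be computed from a binary confusion matrix as in this minimal sketch (our own helper, with 1 = malicious, 0 = benign). Note that none of these measure robustness to adversarial inputs, which is the point of the question above.

```python
def detection_metrics(y_true, y_pred):
    """Accuracy, FPR, precision, recall, and F1 from binary labels
    (1 = malicious, 0 = benign)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    fpr = fp / (fp + tn) if fp + tn else 0.0          # false-positive rate
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "fpr": fpr, "precision": precision,
            "recall": recall, "f1": f1}
```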
18. Adversarial Machine Learning in Vision
• Image classification: the DNN labels the original image "Speed Limit Sign"; adding the (amplified) adversarial perturbation changes the output to "Ruler"
• Object detection: detection on the original image vs. detection on the original image with adversarial perturbation
https://arxiv.org/abs/1412.6572
https://arxiv.org/abs/1703.08603
19. Generative Adversarial Network
(GAN)
Generator
Discriminator
G generates fakes to fool D
D differentiates fakes and reals.
Over time, G and D get better.
Goodfellow, Ian, et al. "Generative adversarial nets." Advances in neural information processing systems. 2014.
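The game described above is formalized in Goodfellow et al. as the minimax objective over the generator G and discriminator D:

```latex
\min_G \max_D \; V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

D is trained to assign high scores to real samples and low scores to fakes, while G is trained to make D(G(z)) large; at equilibrium the generated distribution matches the data distribution.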
20. Core Idea: GAN to synthesize
ransomware logs
Our threat model assumes the adversary has access to the training dataset but has no
knowledge of the ML classifier.
21. Adversarial quality assessment
A successful evasion means the generated malicious samples not
only fool the ransomware classifier but also preserve their
maliciousness according to certain metrics.
We propose sample-based and batch-based adversarial quality
metrics for this evaluation.
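The talk does not publish its exact quality metrics, but one plausible batch-based check is statistical similarity between the event-code distribution of a generated batch and that of real ransomware, e.g. via Jensen-Shannon divergence. The sketch below is illustrative only; the helper names and the choice of JSD are our assumptions.

```python
import math
from collections import Counter

def event_distribution(batch, vocab=10):
    """Empirical distribution of event codes over a batch of sequences
    (padding code 0 excluded so sequence length doesn't dominate)."""
    counts = Counter(c for seq in batch for c in seq if c != 0)
    total = sum(counts.values())
    return [counts.get(c, 0) / total for c in range(vocab)]

def js_divergence(p, q):
    """Jensen-Shannon divergence, base 2, bounded in [0, 1]."""
    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)
```

A low divergence between the generated batch and real ransomware (and a high divergence from benign software) suggests the generated samples still look statistically malicious.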
23. GAN to bypass ransomware detection
• We use the same training data to train AC-GAN
• The stopping criterion is based on the loss of the
discriminator
• At test time, we generate 5000 malicious segments and
ensure their adversarial quality
24. Detection results on good quality
adversarial examples
Indicates a broad attack surface for ML
25. THE BAD - SUMMARY
Machine learning can automatically attack other highly effective
ML systems: a generative adversarial network can serve as an
intelligent attacker that bypasses effective detectors.
Robustness and resiliency are as important as accuracy,
FPR, precision, recall, and F1 scores.
Why does this happen?
27. Investigation
We investigate why the
generated samples can
bypass ML detection
The generated samples, in
dark red, lie close to a
linear boundary but much
closer to the real benign
samples in the Text-CNN
latent feature subspace
28. Non-linear decision boundaries are more robust
• SVM with a radial basis kernel in
Text-CNN space was able
to detect all the
adversarial examples
• The non-linear decision
boundary retains
robustness and indicates
a smaller blind spot
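One intuition for the RBF result above: the RBF (Gaussian) kernel scores similarity locally, decaying with squared distance, so the induced boundary can curve tightly around the benign cluster instead of being a single hyperplane that adversarial points can slip across. A minimal sketch of the kernel itself (not the talk's trained model):

```python
import math

def rbf_kernel(x, y, gamma=1.0):
    """RBF (Gaussian) kernel: exp(-gamma * ||x - y||^2).

    Similarity decays with squared Euclidean distance, so a kernel
    classifier built on it responds to local neighborhoods; points
    sitting just past the benign cluster still score as dissimilar.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)
```

Larger gamma tightens the neighborhoods, shrinking the region an adversarial sample can hide in (at the cost of potential overfitting).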
30. The Ugly - summary:
Investigation of ML boundaries indicates adversarial
samples lie close to the benign samples in feature
subspace.
Nonlinear decision boundaries show better resiliency
against adversarial examples.
32. Intel Labs
From analytical to real-world samples:
how can the output of the GAN be incorporated into a tool
that runs as actual ransomware?
33. PLATFORM capabilities to make ML system
more trustworthy
• Can we use ML + system capabilities to make the attackers’ job harder?
• Intel® Processor Trace and other telemetry can be used to make the system call
activity information more trustworthy
• Checkpointing technologies are a useful tool for recovery
• Trusted Execution capabilities can prevent model stealing attacks
• New storage mechanisms (such as persistent memory) provide new avenues for
access-control
35. ML VULNERABILITY RESEARCH PLATFORM
- MLsploit
• A Cloud-Based Framework for Adversarial
Machine Learning Research
• Tool for interactive investigation of ML
vulnerabilities
• Interactive interface and iterative
experimentation
• Comparison of attacks and defenses
36. SUMMARY
• ML can be used to build efficient, scalable, and accurate detectors for malicious
attacks such as ransomware
• ML can also be used to attack vulnerable ML systems
• ML models must account for adversarial approaches, concept drift, and time
variation
• Combining platform capabilities for attack-surface reduction (prevention) and
recovery with ML detection yields more robust solutions
40. The intersection of AI & Security
• Security Analytics
• Secure AI Workloads
• Adversarial Resilient AI ← today's focus
41. Case Study
• Collect real ransomware and benign software
• Examine ML effectiveness for ransomware detection
• Explore ML robustness when ML generates adversarial ransomware
samples
• Investigate ML blind spot and boundaries
43. Beyond vision: audio or malware
Attack in ASR domain on audio waveforms to fool
DeepSpeech (speech-to-text transcription)
AVPASS: adversarial malware variants that can
evade VirusTotal detection
https://arxiv.org/abs/1801.01944
https://www.blackhat.com/us-17/briefings/schedule/#avpass-leaking-and-bypassing-antivirus-detection-model-automatically-7354
Bypass VirusTotal
up to 100%
44. Training GAN
Challenges:
❖Convergence issues
❖Transfer learning
❖Learning rates adapted separately for the generator and discriminator
• We use the same training data to train the AC-GAN
• The stopping criterion is based on the loss of the discriminator
• At test time, we generate 5000 malicious segments