1. PhD in Electronic and Computer Engineering
Adversarial Pattern Classification
Battista Biggio
XXII cycle
Advisor: prof. Fabio Roli
Department of Electrical and Electronic Engineering
University of Cagliari, Italy
2. Outline
• Problem definition
• Open issues
• Contributions of this thesis
– Experiments
• Conclusions and future work
05-03-2010 Adversarial Classification - B. Biggio
3. What is adversarial classification?
• Pattern recognition in security applications
– spam filtering, intrusion detection, biometrics
• Malicious adversaries aim to mislead the system
[Figure: two-dimensional feature space (x1, x2) with a decision function f(x) separating legitimate from malicious samples; the spam message "Buy viagra!" is modified to "Buy vi4gr@!" to cross the decision boundary.]
4. Open issues
1. Vulnerability identification
• potential vulnerabilities may be exploited by an
adversary to mislead the system
2. Performance evaluation under attack
• standard performance evaluation does not provide
information about the robustness of a classifier under
attack
3. Defence strategies for robust classifier design
• classification algorithms were not originally designed to
be robust against adversarial attacks
5. Main contributions of this thesis
1. State of the art in adversarial classification
– to highlight the need for a unifying view of the
problem
2. Robustness evaluation
– to provide an estimate of the performance of a
classifier under attack
– to select a more appropriate classification model
3. Defence strategies for robust classifier design
– to improve the robustness of classifiers under attack
7. State of the art
• Vulnerability identification
– Good word attacks in spam filtering [Wittel, Lowd, Graham-Cumming]
– Polymorphic and poisoning attacks in IDSs [Fogla, Lee, Kloft, Laskov]
– Possible attacks against biometric verification systems [Ratha, Jain]
• Defence strategies against specific attacks
– Good word attacks in spam filtering [Jorgensen, Nelson]
– Polymorphic and poisoning attacks in IDSs [Perdisci, Cretu]
– Spoof attacks in biometrics [Rodrigues]
• No general methodology exists to evaluate the
performance of classifiers under attack
8. State of the art
Neither a clear and unifying view of the problem
nor practical guidelines for the design of classifiers
in adversarial environments exist yet!
10. Standard performance evaluation
[Figure: standard design cycle - the collected data is split into a training set, used to learn the classifier, and a testing set, on which accuracy is estimated.]
Techniques: validation, cross-validation, bootstrap, …
Performance measures: classification accuracy, ROC curve, Area Under the ROC curve (AUC), …
11. Problems
• Standard performance evaluation is likely to
provide an optimistic estimate of the
performance [Kolcz]
1. collected data may not include attacks at all
Biometric systems are not typically tested
against spoof attacks
12. Problems
• Standard performance evaluation is likely to
provide an optimistic estimate of the
performance [Kolcz]
2. collected data may contain attacks, but ones which
were not targeted at the system being designed
Attacks collected in spam filtering or IDSs might have
targeted systems based on different features
13. Problems
3. Collected data does not contain attacks of varying
strength
• e.g., the number of words modified in spam e-mails
[Figure: a spam message progressively modified with increasing attack strength - "Buy viagra!", then "Buy vi4gr4!", then "Buy vi4gr4! Did you ever play that game when you were a kid?"]
It is of interest to evaluate the robustness
of classifiers under attacks of different strength
14. Robustness evaluation
• Result of our robustness evaluation
– performance vs attack strength
Example
[Figure: performance degradation of text classifiers in spam filtering as the number of modified words grows; standard performance evaluation corresponds to the point of zero attack strength.]
15. Robustness evaluation
• Robustness evaluation is required to have a more
complete understanding of the classifier’s performance
– We need to figure out how an adversary may attack the
classifier (security by design)
• Designing attacks may be a very difficult task
– in-depth knowledge of the specific application is required
– costly and time-consuming
• e.g., fake fingerprints
• We thus propose to simulate the effect of attacks by
modifying the feature values of malicious samples
16. Attack simulation
• Biometric multi-modal verification system
• Potential attacks
– spoof attempts
[Figure: multimodal biometric verification - a face matcher and a fingerprint matcher produce scores s1 and s2 for the claimed identity; a fusion module computes f(x) and decides Genuine / Impostor. Spoofing the face or the fingerprint shifts impostor scores toward the genuine region.]
17. Attack simulation
• Text classifiers in spam filtering
– binary features (presence / absence of word)
• Potential attacks
– bad word obfuscation (BWO) / good word insertion (GWI)
Buy viagra! → Buy vi4gr4! Did you ever play that game when you were a kid where the little plastic hippo tries to gobble up all your marbles?
x = [0 0 1 0 0 0 0 0 …] → x′ = [0 0 0 0 1 0 0 1 …]
x′ = A(x)
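The BWO/GWI manipulation above can be sketched in code; the vocabulary and the bad/good word index lists below are illustrative, not taken from the thesis:

```python
# Illustrative sketch of BWO/GWI on binary word features; the vocabulary
# and the bad/good word index lists below are made up for the example.
import numpy as np

vocab = ["cheap", "free", "viagra", "meeting", "game", "kid", "hippo", "marbles"]

def to_features(words):
    """Binary feature vector: 1 iff the vocabulary word occurs in the message."""
    return np.array([1 if w in words else 0 for w in vocab])

def attack(x, bad_idx, good_idx, max_changes):
    """A(x): obfuscate bad words (1 -> 0), insert good words (0 -> 1),
    modifying at most max_changes features (the attack strength)."""
    x_prime = x.copy()
    changes = 0
    for i in bad_idx:                      # bad word obfuscation (BWO)
        if changes >= max_changes:
            break
        if x_prime[i] == 1:
            x_prime[i] = 0
            changes += 1
    for i in good_idx:                     # good word insertion (GWI)
        if changes >= max_changes:
            break
        if x_prime[i] == 0:
            x_prime[i] = 1
            changes += 1
    return x_prime

x = to_features({"buy", "viagra"})                     # "Buy viagra!"
x_attacked = attack(x, bad_idx=[0, 1, 2], good_idx=[4, 5, 6, 7], max_changes=3)
print(x, x_attacked)                                   # d(x, x') <= 3
```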
18. Attack strength
• Distance in the feature space
– chosen depending on the application and features
Example
• Text classifiers in spam filtering
– binary features (presence / absence of word)
Buy viagra! … → Buy vi@gr4! …
x = [0 0 1 0 1 …] → x′ = [0 0 0 0 1 …]
Hamming distance = number of words modified in the spam message: d(x, x′) = 1
19. Attack strategy A(x)
[Figure: two attack strategies A1(x) and A2(x) move the sample "Buy viagra!" to "B-u-y viagra!" or "Buy vi4gr@!" within the feature-space ball of radius D = 1 around x.]
d(x, x′) ≤ D
A(x) depends on the adversary’s knowledge about the classifier!
20. Worst case attack
• To simulate attacks which exploit knowledge
of the decision function of the classifier
f(x) = sign(g(x)) = { +1, malicious; −1, legitimate }
e.g., g(x) = Σ_i w_i x_i + w_0
A(x) = argmin_{x′} g(x′)   s.t. d(x, x′) ≤ D
[Figure: with D = 1, the worst-case attack moves "Buy viagra!" to the modification ("B-u-y viagra!" or "Buy vi4gr@!") that minimises g(x′).]
21. Worst case attack
• Linear classifiers / binary features
[Figure: bar chart of word weights - "buy" and "viagra" have large positive weights, "kid" and "game" negative ones; within budget D the attack first obfuscates the highest-weighted bad words ("Buy viagra!" → "Buy vi4gr@!" → "B-u-y vi4gr@!") and then inserts the most negative good words ("B-u-y vi4gr@! game").]
• Features which have been assigned the highest
absolute weights are modified first
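This greedy descent on g(x′) can be sketched as follows; the weight values are illustrative, not estimates from the experiments:

```python
# Sketch of the worst-case attack on a linear classifier with binary
# features: flip the highest-|w_i| features first, in the direction that
# decreases g(x') = sum_i w_i x_i + w_0. Weight values are illustrative.
import numpy as np

def worst_case_attack(x, w, w0, D):
    x_prime = x.copy()
    order = np.argsort(-np.abs(w))         # highest absolute weight first
    changes = 0
    for i in order:
        if changes >= D:
            break
        if x_prime[i] == 1 and w[i] > 0:
            x_prime[i] = 0                 # remove a "bad" word
            changes += 1
        elif x_prime[i] == 0 and w[i] < 0:
            x_prime[i] = 1                 # insert a "good" word
            changes += 1
    return x_prime

w = np.array([0.2, 2.5, 1.0, -1.5, -0.1])  # e.g. "viagra" large positive,
w0 = -0.5                                  # "game" negative (made up)
x = np.array([1, 1, 1, 0, 0])              # a spam message
for D in range(4):
    x_prime = worst_case_attack(x, w, w0, D)
    print(D, float(w @ x_prime + w0))      # g decreases with attack strength
```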
22. Experiments on spam filtering
Text classifiers (worst case)
• TREC 2007 public data set
– Training set: 10K emails
– Testing set: 10K emails
• Features: words (tokens)
• Classifiers (using different
numbers of features)
– Logistic Regression (LR)
– Linear SVM
• Performance measure: AUC10% (area under the ROC curve for FP ≤ 0.1)
[Figure: AUC10% vs attack strength for LR and the linear SVM.]
23. Mimicry attack
• To simulate attacks where no information on the
classification function is exploited
• Malicious samples are camouflaged to mimic legitimate
samples
– e.g., spoof attempts, polymorphic attacks
A(x) = argmin_{x′} d(x′, x̄)   s.t. d(x, x′) ≤ D
where x̄ is a legitimate sample to be mimicked
[Figure: with D = 2, "Buy viagra!" is camouflaged as "B-u-y vi4gr@!" or "Buy viagra! funny game", moving it toward legitimate samples such as "Yesterday I played a funny game…".]
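A minimal sketch of this mimicry strategy, assuming Hamming distance and binary features (the feature vectors are illustrative):

```python
# Sketch of the mimicry attack with binary features and Hamming distance:
# move x toward the nearest legitimate sample, changing at most D features.
# The feature vectors are illustrative.
import numpy as np

def mimicry_attack(x, legitimate, D):
    dists = [int(np.sum(x != l)) for l in legitimate]
    target = legitimate[int(np.argmin(dists))]   # nearest legitimate sample
    x_prime = x.copy()
    changes = 0
    for i in range(len(x)):
        if changes >= D:
            break
        if x_prime[i] != target[i]:
            x_prime[i] = target[i]               # copy the legitimate value
            changes += 1
    return x_prime

x = np.array([1, 1, 0, 0, 0])                    # malicious sample
legit = [np.array([0, 0, 1, 1, 0]),
         np.array([1, 0, 1, 0, 1])]
x_prime = mimicry_attack(x, legit, D=2)
print(x_prime)
```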
24. Experiments on spam filtering
Text classifiers (mimicry)
• TREC 2007 public data set
– Training set: 10K emails
– Testing set: 10K emails
• Features: words (tokens)
• Classifiers (using different
numbers of features)
– Logistic Regression (LR)
– Linear SVM
– Bayesian text classifier (SpamAssassin)
– SVM with RBF kernel
[Figure: performance vs attack strength for the four classifiers.]
25. Experiments on intrusion
detection (mimicry)
• Data set of real network traffic (Georgia Tech, 2006)
– Training set: 20K legitimate packets
– Testing set: 20K legitimate packets + 66 distinct HTTP attacks (205
packets)
• Packets are classified separately
– Features: relative byte frequencies (PAYL) [Wang]
[Figure: histogram of relative byte frequencies over byte values 0–255.]
• One-class classifiers
– Mahalanobis Distance classifier (MD)
– SVM with RBF kernel
• Attack strength
– percentage of bytes modified in a packet
[Figure: detection performance vs attack strength for MD and the RBF SVM.]
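A minimal sketch of the PAYL feature extraction, assuming the raw payload bytes of a packet as input:

```python
# Minimal sketch of PAYL-style features: the relative frequency of each
# byte value (0-255) in a packet payload.
import numpy as np

def payl_features(payload: bytes) -> np.ndarray:
    counts = np.bincount(np.frombuffer(payload, dtype=np.uint8), minlength=256)
    return counts / max(len(payload), 1)         # relative byte frequencies

x = payl_features(b"GET /index.html HTTP/1.1")
print(x.shape, float(x.sum()))                   # 256 features, sums to ~1
```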
26. To sum up
1. The proposed methodology for robustness
evaluation extends standard performance
evaluation to adversarial applications
2. Experiments showed how this methodology
may give useful insights for the design of PR
systems in adversarial tasks
• e.g., LR outperforms BayesSA, etc.
28. Defence strategies for robust
classifier design
• Rationale
– Discriminant capability of features may change at
operating phase due to attacks
– Avoiding under- or over-emphasising features may increase
robustness against attacks which exploit some knowledge
of the decision function
[Figure: word weights before and after robust feature weighting - the large weights on "buy" and "viagra" and the small ones on "kid" and "game" are made more uniform.]
• Feature weighting for improved classifier robustness [Kolcz]
– Algorithms for improving the robustness of linear classifiers
– Underlying idea: to obtain a more uniform set of weights
29. Robust classifiers by MCSs
[Figure: ensemble construction by bagging or the random subspace method (RSM) - K linear classifiers f_k(x) = Σ_i w_i^k x_i + w_0^k, k = 1, …, K, are trained on the data and averaged: (1/K) Σ_{k=1}^K f_k(x).]
• We investigated if bagging and RSM can be
exploited to design more robust linear classifiers
• The underlying idea is still to obtain a more uniform
set of weights
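The mechanism can be illustrated with a small numpy sketch (the base weights are synthetic, not learned): averaging K linear classifiers yields a single linear classifier whose weight vector is the mean of the base weights, and this mean tends to be flatter than any individual weight vector.

```python
# Synthetic illustration (weights are generated, not learned): an average
# of K linear classifiers is itself linear, with the mean weight vector,
# and the mean tends to be flatter than any single base weight vector.
import numpy as np

rng = np.random.default_rng(0)
n_features, K = 8, 20

# RSM-style base classifiers: each assigns weight only to a random
# subset of features.
W = np.zeros((K, n_features))
for k in range(K):
    subspace = rng.choice(n_features, size=4, replace=False)
    W[k, subspace] = rng.normal(loc=1.0, scale=0.3, size=4)

w_single = W[0]                  # one base classifier: spiky weights
w_ensemble = W.mean(axis=0)      # ensemble: more uniform weights

print(float(np.std(w_single)), float(np.std(w_ensemble)))
```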
30. Robust training
• Adding simulated attacks to the training set
[Figure: the same multimodal biometric verification system as before (face and fingerprint matchers, scores s1 and s2, fusion module), retrained with simulated spoof attacks added to the training set; the resulting decision function f′(x) is more robust than the original f(x).]
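In a spam-filtering-like setting, robust training can be sketched as follows; robust_training_set and toy_attack are hypothetical names, and toy_attack is only a stand-in for the simulated attacks actually used:

```python
# Hypothetical sketch of robust training: simulated attacks on the
# malicious training samples are appended to the training set with the
# malicious label. robust_training_set and toy_attack are made-up names,
# and toy_attack is a stand-in for the simulated attacks actually used.
import numpy as np

def robust_training_set(X, y, attack, D):
    X_att = np.array([attack(x, D) for x in X[y == 1]])
    X_aug = np.vstack([X, X_att])
    y_aug = np.concatenate([y, np.ones(len(X_att), dtype=int)])
    return X_aug, y_aug

def toy_attack(x, D):
    """Flip the first D active features (illustrative attack model)."""
    x = x.copy()
    flipped = 0
    for i in range(len(x)):
        if flipped >= D:
            break
        if x[i] == 1:
            x[i] = 0
            flipped += 1
    return x

X = np.array([[1, 1, 0], [0, 1, 1], [0, 0, 1]])
y = np.array([1, 1, 0])                  # 1 = malicious, 0 = legitimate
X_aug, y_aug = robust_training_set(X, y, toy_attack, D=1)
print(X_aug.shape)                       # two simulated attacks added
```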
31. Experiments on spam filtering
SpamAssassin
[Figure: SpamAssassin architecture - a set of tests (header analysis, URL filter, keyword filter, …, text classifier) produces binary outputs combined with weights w1, …, wn into a score s; the e-mail is labelled spam if s ≥ th and legitimate if s < th.]
• SpamAssassin: open source spam filter
– Linear classifier / binary features (tests)
• default weights are manually tuned by designers to improve robustness
• TREC 2007 public data set
– First 10,000 e-mails to train the text classifier
– Second 10,000 e-mails to train the linear decision function
– Third 10,000 e-mails as testing set
32. Experiments on spam filtering
SpamAssassin (worst case)
• Attack strength
– number of evaded tests
• Robust training
– to defend against worst-case attacks
• Defence strategies are not
effective against the mimicry
attack
• Strategies proposed by Kolcz
exhibited results similar to RSM
and bagging
[Figure: performance vs attack strength under the worst-case and mimicry attacks, with and without defences.]
33. Conclusions and future work
• Adversarial pattern classification and open issues
• Contributions of this thesis
– State of the art in adversarial classification
– Methodology for robustness evaluation
– Defence strategies for robust classifier design
• Experimental results provide useful insights for the design
of PR systems in adversarial environments
• Future work
– Theoretical investigation of adversarial classification
– Robustness evaluation of biometric verification systems