Pattern classification systems are currently used in security applications like intrusion detection in computer networks, spam filtering and biometric identity recognition. These are adversarial classification problems, since the classifier faces an intelligent adversary who adaptively modifies patterns (e.g., spam e-mails) to evade it. In these tasks the goal of a classifier is to attain both high classification accuracy and high hardness of evasion, an issue that has not yet been deeply investigated in the literature. We address it from the viewpoint of the choice of the architecture of a multiple classifier system. We propose a measure of the hardness of evasion of a classifier architecture, and give an analytical evaluation and comparison of an individual classifier and a classifier ensemble architecture. We finally report an experimental evaluation on a spam filtering task.
Multiple Classifier Systems for Adversarial Classification Tasks
1. Multiple Classifier Systems for Adversarial Classification Tasks. Battista Biggio, Giorgio Fumera and Fabio Roli, Dept. of Electrical and Electronic Eng., University of Cagliari
6. Design of pattern recognition systems
Goal in "traditional" applications: maximise accuracy.
Pipeline: data acquisition → feature extraction → model selection → classification.
7. Design of pattern recognition systems
Goal in adversarial classification tasks: maximise both accuracy and hardness of evasion, over the same pipeline (data acquisition → feature extraction → model selection → classification).
9. Hardness of evasion
[Figure: linear decision function — Boolean inputs x_1, ..., x_n are weighted and summed, the threshold th is subtracted, and the output y ∈ {malicious, legitimate} is assigned by the sign: ≥ 0 → malicious, < 0 → legitimate.]
Hardness of evasion: the expected value of the minimum number of features the adversary has to modify to evade the classifier (worst case: the adversary has full knowledge of the classifier).
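The decision function on the slide can be sketched as follows. The sign convention (weighted sum minus threshold, ≥ 0 → malicious) is taken from the slide; the function name is illustrative.

```python
def classify(x, w, th):
    """Linear decision function: weighted sum of Boolean features
    minus threshold; non-negative score means 'malicious'."""
    score = sum(wi * xi for wi, xi in zip(w, x)) - th
    return "malicious" if score >= 0 else "legitimate"
```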
11. Hardness of evasion: an example
[Figure: linear classifier with weights w = (0.3, 0.8, 3.0, 1.5, 1.0) and threshold th = 2; the weighted sum is compared with th (≥ 0: malicious, < 0: legitimate).]
First example pattern: x = (1 1 0 1 0); second example pattern: x = (0 1 1 0 0).
Hardness of evasion: the expected value of the minimum number of features the adversary has to modify to evade the classifier.
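For a linear classifier with Boolean features, the worst-case adversary on the slides (full knowledge of the classifier) can compute the minimum number of feature flips greedily: each flip reduces the score by a fixed amount, so taking the largest reductions first minimises the count. A minimal sketch using the slide's weights and threshold; the function name is an assumption.

```python
def evasion_cost(x, w, th):
    """Minimum number of Boolean features an adversary with full
    knowledge must flip so the score drops below zero (legitimate)."""
    score = sum(wi * xi for wi, xi in zip(w, x)) - th
    if score < 0:
        return 0  # already classified as legitimate
    # Flipping x_i 1->0 reduces the score by w_i (if w_i > 0);
    # flipping 0->1 reduces it by -w_i (if w_i < 0).
    gains = sorted((wi if xi == 1 else -wi
                    for wi, xi in zip(w, x)
                    if (xi == 1 and wi > 0) or (xi == 0 and wi < 0)),
                   reverse=True)
    flips = 0
    for g in gains:
        score -= g
        flips += 1
        if score < 0:
            return flips
    return None  # evasion not possible by flipping features

w, th = [0.3, 0.8, 3.0, 1.5, 1.0], 2.0
print(evasion_cost([1, 1, 0, 1, 0], w, th))  # flipping x_4 (w=1.5) suffices: 1
print(evasion_cost([0, 1, 1, 0, 0], w, th))  # flipping x_3 (w=3.0) suffices: 1
```

On both example patterns a single well-chosen flip already brings the score below zero, i.e., the hardness of evasion of this classifier on these patterns is 1.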
15. Comparison of two classifier architectures
[Figure: a single linear classifier over the whole feature set X — Boolean features x_1, ..., x_n (x_i ∈ {0,1}), weights w_1, ..., w_n, threshold t.]
16. Comparison of two classifier architectures
[Figure: an ensemble of N linear classifiers with thresholds t_1, ..., t_N, each operating on a disjoint feature subset, combined with the OR rule. The subsets partition the feature set: X_1 ∪ X_2 ∪ ... ∪ X_N = X, X_i ∩ X_j = ∅ for i ≠ j.]
17. Comparison of two classifier architectures
Analytical assumptions: x_1, x_2, ..., x_n i.i.d.; identical weights; t_1 = t_2 = ... = t_N; |X_i| = n/N.
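The two architectures being compared can be sketched as follows; the weights, thresholds, and subset sizes are illustrative, not the paper's actual parameters.

```python
def monolithic(x, w, t):
    # Single linear classifier over the whole feature set X.
    return sum(w[i] * x[i] for i in range(len(x))) - t >= 0  # True = malicious

def or_ensemble(x, w, thresholds, subsets):
    # N linear classifiers on disjoint feature subsets X_1, ..., X_N,
    # combined with the OR rule: malicious if any sub-classifier fires.
    return any(sum(w[i] * x[i] for i in X_k) - t_k >= 0
               for t_k, X_k in zip(thresholds, subsets))

# Illustrative setup matching the slide's assumptions: n = 4 Boolean
# features with identical weights, split into N = 2 subsets of n/N = 2.
w = [1.0, 1.0, 1.0, 1.0]
x = [1, 1, 1, 0]
print(monolithic(x, w, 2.0))                             # True
print(or_ensemble(x, w, [1.0, 1.0], [[0, 1], [2, 3]]))   # True
```

Note the consequence of the OR rule: a pattern is labelled legitimate only if no sub-classifier fires, so an adversary must simultaneously evade every sub-classifier that currently detects the pattern.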
18. Comparison of two classifier architectures
[Plot: analytical evaluation for p_1A = 0.25, p_1L = 0.15 — details are in the paper.]
20. Comparison of two classifier architectures
[Plot: ROC working point chosen to minimise C·FP + FN, for cost values C = 1, 2, 10, 100.]
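Selecting the ROC working point as on the slide amounts to picking, for each cost C, the operating point that minimises C·FP + FN. A minimal sketch; the (FP, FN) pairs below are illustrative, not the paper's actual curves.

```python
# Candidate operating points along a ROC curve, as (FP rate, FN rate).
points = [(0.30, 0.01), (0.10, 0.05), (0.05, 0.12), (0.01, 0.40)]

for C in (1, 2, 10, 100):
    # Working point minimising the weighted cost C*FP + FN.
    fp, fn = min(points, key=lambda p: C * p[0] + p[1])
    print(f"C={C}: FP={fp}, FN={fn}")
```

As C grows, false positives are penalised more heavily, so the chosen working point moves toward low-FP (high-FN) operating regions.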