Chen‐Chi Wu (1), Kuan‐Ta Chen (2), Yu‐Chun Chang (1), Chin‐Laung Lei (1)

(1) Department of Electrical Engineering, National Taiwan University
(2) Institute of Information Science, Academia Sinica

IPTComm 2008

Outline
    Motivation
    Methodology
    Performance evaluation
    Summary




Motivation
    VoIP is becoming popular because of
         Low call cost
         High voice quality
    Skype, a popular VoIP application
           over 10,000,000 concurrent users
    Accurately identifying VoIP flows in network traffic is 
    required for
         Traffic analysis
         Traffic management

Motivation
    Challenges of VoIP flow identification
         Various signaling protocols: SIP, H.323, and proprietary 
         protocols
         Non‐standard port numbers
         Packet payload encryption

    The interaction pattern of human conversation is unique
         It results in a characteristic pattern in VoIP traffic

4‐State Traffic Pattern
    Infer the on/off (talking/silence) pattern by the level of the 
    packet rate during a short period
    We model a two‐way conversation by a process of four states
         State A: a period that speaker A is talking and B is silent
         State B: B is talking and A is silent
         State D: both A and B are talking
         State M: mutual silence (both A and B are silent)

    [Figure: 2×2 grid of the four states, indexed by whether each speaker is ON or OFF]

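The on/off inference can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the 1-second period, and the packet-count threshold are all hypothetical choices, since the slides only say that the pattern is inferred from the level of the packet rate during a short period.

```python
from typing import List


def four_state_sequence(times_a: List[float], times_b: List[float],
                        period: float = 1.0, threshold: int = 10) -> str:
    """Map packet timestamps of both directions to a string over {A, B, D, M}.

    times_a / times_b: arrival times (seconds) of packets sent by speaker A / B.
    period and threshold are assumed values, not taken from the slides.
    """
    all_times = list(times_a) + list(times_b)
    if not all_times:
        return ""
    start = min(all_times)
    n_periods = int((max(all_times) - start) // period) + 1

    # Count packets per period for each direction.
    counts_a = [0] * n_periods
    counts_b = [0] * n_periods
    for t in times_a:
        counts_a[int((t - start) // period)] += 1
    for t in times_b:
        counts_b[int((t - start) // period)] += 1

    states = []
    for ca, cb in zip(counts_a, counts_b):
        a_on = ca >= threshold  # speaker A is talking in this period
        b_on = cb >= threshold  # speaker B is talking in this period
        if a_on and b_on:
            states.append("D")  # double talk
        elif a_on:
            states.append("A")
        elif b_on:
            states.append("B")
        else:
            states.append("M")  # mutual silence
    return "".join(states)
```

Each character of the returned string is the state of one period, so the string can be fed directly to any later step that works on state sequences.
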
Intuition behind Our Approach
     The 4‐state traffic pattern of VoIP traffic is unique 
     compared to that of other network applications

    [Figure: 4‐state traffic patterns of Web, P2P (BitTorrent), online game (WoW), TELNET, and VoIP (Skype) flows; legend: A or B, D, M]

Methodology
    Detect VoIP flows based on the unique human speech 
    conversation patterns embedded in voice traffic
    Derive features (attributes) from the conversation 
    patterns
    Adopt a naïve Bayesian classifier, a supervised machine 
    learning tool, to divide traffic into VoIP and non‐VoIP 
    classes
         A class label is required for each training flow

Methodology Overview
    Training phase
         Labeled training flows (VoIP or non‐VoIP)
         Extract 4‐state traffic patterns and derive features
         Flow vectors
         Learn the parameters of the naïve Bayesian classifier
    Identification phase
         Incoming flows (unknown class)
         Extract conversation patterns and derive features
         Flow vectors
         Classify with the naïve Bayesian classifier
         Flow labels (VoIP or non‐VoIP)

Naïve Bayesian Classifier
    The naïve Bayesian classifier is based on Bayes’ theorem

    $$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$


       Each flow is represented by a vector  X = (x1, x2,…, xn),  
       depicting n features A1, A2,…, An

       Suppose there are m classes, C1, C2,…, Cm


Naïve Bayesian Classifier
    Given a flow vector X, the classifier predicts that the flow 
    belongs to class Ci iff

    $$P(C_i \mid X) > P(C_j \mid X) \quad \text{for } 1 \le j \le m,\ j \ne i$$

    By Bayes’ theorem

    $$P(C_i \mid X) = \frac{P(X \mid C_i)\,P(C_i)}{P(X)}$$

    P(X) is constant across classes and P(Ci) is the prior probability, thus 
    the task is to maximize P(X | Ci) (weighted by P(Ci) when the priors differ)

Naïve Bayesian Classifier
    The naïve assumption is that the values of the features 
    are conditionally independent of one another given the class

    $$P(X \mid C_i) = \prod_{k=1}^{n} P(x_k \mid C_i) = P(x_1 \mid C_i) \times P(x_2 \mid C_i) \times \cdots \times P(x_n \mid C_i)$$

    P(x1 | Ci), P(x2 | Ci), ..., P(xn | Ci) can be easily estimated 
    from the training data

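A naïve Bayesian classifier of this form can be sketched in a few lines. The slides do not say how each P(xk | Ci) is estimated, so the sketch below assumes a Gaussian density per feature and class; the function names and that modelling choice are illustrative assumptions, not the authors' method. Scores are computed in log space to avoid underflow when many probabilities are multiplied.

```python
import math
from collections import defaultdict


def train_naive_bayes(vectors, labels):
    """Estimate log-priors and per-feature (mean, variance) for every class.

    Assumption: each feature is modelled as a Gaussian within each class.
    """
    by_class = defaultdict(list)
    for x, c in zip(vectors, labels):
        by_class[c].append(x)

    model = {}
    total = len(labels)
    for c, rows in by_class.items():
        stats = []
        for k in range(len(rows[0])):
            values = [row[k] for row in rows]
            mean = sum(values) / len(values)
            var = sum((v - mean) ** 2 for v in values) / len(values)
            stats.append((mean, max(var, 1e-9)))  # floor avoids division by zero
        model[c] = (math.log(len(rows) / total), stats)
    return model


def classify(model, x):
    """Return the class Ci that maximizes log P(Ci) + sum_k log P(xk | Ci)."""
    best_class, best_score = None, -math.inf
    for c, (log_prior, stats) in model.items():
        score = log_prior
        for value, (mean, var) in zip(x, stats):
            # Log of the Gaussian density of this feature value.
            score += -0.5 * math.log(2 * math.pi * var) - (value - mean) ** 2 / (2 * var)
        if score > best_score:
            best_class, best_score = c, score
    return best_class
```

Training takes one flow vector and one label per flow; classification takes a single flow vector and returns the most probable label.
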
How to derive features from the 4‐state traffic pattern?
    Use a Markov chain to model the VoIP traffic pattern
    Statistics of traffic patterns

    [Figure: 4‐state traffic patterns of Web, P2P, WoW, TELNET, and VoIP flows; legend: A or B, D, M]

Markov Chain
    Build a Markov chain model based on a set of known VoIP 
    traffic patterns
    Derive a feature: the likelihood value

    Transition probabilities of the Markov chain (row: current state, column: next state)

              A         B         D         M
    A      0.9022    0.0028    0.0380    0.0571
    B      0.0029    0.9030    0.0391    0.0550
    D      0.0607    0.0592    0.8763    0.0038
    M      0.0465    0.0439    0.0019    0.9078

    [Figure: 4‐state Markov chain]

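One plausible way to build such a chain from known VoIP patterns is to count state-to-state transitions and normalize each row. The slides do not describe the estimation procedure, so the function name, the optional smoothing parameter, and the uniform fallback for unseen states are assumptions made for this sketch.

```python
from typing import Dict, Iterable

STATES = "ABDM"


def estimate_transitions(sequences: Iterable[str],
                         smoothing: float = 0.0) -> Dict[str, Dict[str, float]]:
    """Estimate row-stochastic transition probabilities from state strings.

    smoothing is an assumed additive constant (0.0 means plain counting).
    """
    counts = {s: {t: smoothing for t in STATES} for s in STATES}
    for seq in sequences:
        # Count every pair of consecutive states.
        for s, t in zip(seq, seq[1:]):
            counts[s][t] += 1

    matrix = {}
    for s in STATES:
        row_total = sum(counts[s].values())
        if row_total == 0:
            # State never left in the training data: fall back to uniform.
            matrix[s] = {t: 1.0 / len(STATES) for t in STATES}
        else:
            matrix[s] = {t: counts[s][t] / row_total for t in STATES}
    return matrix
```

Each row of the result sums to 1, matching the layout of the transition table above.
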
Likelihood of Traffic Patterns
    Given a traffic pattern with a state sequence S1, S2,…, Sn, 
    where Si ∈ { A, B, D, M }
    Compute the log‐likelihood value as

    $$\log\left(P_{1,2} \times P_{2,3} \times \cdots \times P_{n-1,n}\right)$$

         P_{i,j}: the transition probability from Si to Sj
    Traffic flows may vary in length, thus define the 
    normalized log‐likelihood value as

    $$\frac{\log\left(P_{1,2} \times P_{2,3} \times \cdots \times P_{n-1,n}\right)}{N}$$

         N: the length of the sequence

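The normalized log-likelihood can be computed directly from the transition probabilities listed on the Markov chain slide. The sketch below sums the logs of the individual transition probabilities, which equals the log of their product, and divides by the sequence length. The function name and the use of the natural logarithm are assumptions; the slides do not state the base.

```python
import math

# Transition probabilities from the Markov chain slide
# (outer key: current state, inner key: next state).
TRANSITION = {
    "A": {"A": 0.9022, "B": 0.0028, "D": 0.0380, "M": 0.0571},
    "B": {"A": 0.0029, "B": 0.9030, "D": 0.0391, "M": 0.0550},
    "D": {"A": 0.0607, "B": 0.0592, "D": 0.8763, "M": 0.0038},
    "M": {"A": 0.0465, "B": 0.0439, "D": 0.0019, "M": 0.9078},
}


def normalized_log_likelihood(sequence: str) -> float:
    """Return log(P_{1,2} * ... * P_{n-1,n}) / N for a state string."""
    if len(sequence) < 2:
        raise ValueError("need at least two states to form a transition")
    log_likelihood = sum(
        math.log(TRANSITION[s][t]) for s, t in zip(sequence, sequence[1:])
    )
    return log_likelihood / len(sequence)
```

A sequence with long talk spurts and occasional turn-taking, such as "AAAAMMBBBB", scores much higher than one that flips state every period, such as "ABABABABAB".
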
Likelihood of Traffic Patterns
    The Markov chain represents typical human conversation

    VoIP flows => high log‐likelihood value

    Non‐VoIP flows => low log‐likelihood value
         They exhibit non‐human‐like behavior: non‐interactive, 
         independent, unidirectional

Statistics of Traffic Patterns
    Mean and standard deviation of the length of each period 
    in which party A (or B) is ON (talking)
         Captures bidirectional behavior

    Mean and standard deviation of the sojourn time in 
    each of the states A, B, D, M
         Captures interactive behavior

    State alternation frequency
         Captures how fragmented and disordered the traffic pattern is

Statistics of Traffic Patterns
    State alternation frequency
         Number of alternations between different states per unit time

         E.g., (6 alternations between different states) / (20 sec.) = 0.3 alternations per sec.

Feature Summary
    Feature set
         Normalized log‐likelihood value based on the Markov chain
         Speech period of party A or B (mean, standard deviation)
         Sojourn time in each state* (mean, standard deviation)
         Ratio of sojourn time in each state*
         Alternation rate between states*
         *states A, B, D, M

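The statistical features in this set can be derived from a state string by grouping consecutive identical states into runs. The sketch below is one possible reading of the feature definitions, not the authors' code: it assumes that party A is ON in states A and D (and party B in states B and D), that each character covers one period of fixed length, and that the population standard deviation is used. All names are hypothetical.

```python
from itertools import groupby
from statistics import mean, pstdev
from typing import Dict, List, Tuple


def runs(sequence: str) -> List[Tuple[str, int]]:
    """Collapse a string into (symbol, run length) pairs."""
    return [(symbol, len(list(group))) for symbol, group in groupby(sequence)]


def extract_features(sequence: str, period: float = 1.0) -> Dict[str, float]:
    """Compute sojourn, speech-period, and alternation features."""
    if not sequence:
        raise ValueError("empty state sequence")
    features: Dict[str, float] = {}
    total_time = len(sequence) * period
    state_runs = runs(sequence)

    # Sojourn time per state: mean, standard deviation, and share of total time.
    for state in "ABDM":
        durations = [n * period for s, n in state_runs if s == state]
        features[f"sojourn_mean_{state}"] = mean(durations) if durations else 0.0
        features[f"sojourn_std_{state}"] = pstdev(durations) if durations else 0.0
        features[f"sojourn_ratio_{state}"] = sum(durations) / total_time

    # Speech periods: a party is assumed ON in its own state and in D.
    for party, on_states in (("A", "AD"), ("B", "BD")):
        on_off = "".join("1" if s in on_states else "0" for s in sequence)
        talk = [n * period for flag, n in runs(on_off) if flag == "1"]
        features[f"speech_mean_{party}"] = mean(talk) if talk else 0.0
        features[f"speech_std_{party}"] = pstdev(talk) if talk else 0.0

    # Alternations between different states per second.
    features["alternation_rate"] = (len(state_runs) - 1) / total_time
    return features
```

For a 20-second pattern with seven runs, such as "AAADDBBBMMMAAADDBBBB" at one state per second, the alternation rate is 6 / 20 = 0.3 per second, the same value as the example on the alternation-frequency slide.
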
Methodology
    Training phase: labeled training flows (VoIP or non‐VoIP) -> extract 4‐state traffic patterns and derive features -> flow vectors -> learn classifier parameters
    Identification phase: incoming flows (unknown class) -> extract conversation patterns and derive features -> flow vectors -> classify -> flow labels (VoIP or non‐VoIP)

Trace Collection
       We collected network traffic from 5 categories of 
       applications
         VoIP (Skype), TELNET, Web, P2P (BitTorrent), online game 
         (World of Warcraft)

Category       Connections   Duration (min)       Packets   Size (MB)
VoIP                   462            2,388     4,728,240       4,318
TELNET               2,008            4,729    10,559,261       7,331
Web                  1,406            1,537     2,528,359         680
P2P                 15,845            3,334    29,220,870      30,500
Online game          2,224              120    28,264,360      59,097

Performance Evaluation
    Detect VoIP flows as early as possible
         Detection time is a major concern
         95% accuracy with 4‐second detection time
         97% accuracy with 11‐second detection time




Performance Evaluation
    Goal: detect VoIP flows
         VoIP flows are positives; non‐VoIP flows are negatives
    True positive rate

    $$\mathrm{TPR} = \frac{\text{number of VoIP flows correctly identified}}{\text{total number of VoIP flows}}$$

    True negative rate

    $$\mathrm{TNR} = \frac{\text{number of non-VoIP flows correctly identified}}{\text{total number of non-VoIP flows}}$$

    False positive rate

    $$\mathrm{FPR} = 1 - \mathrm{TNR}$$

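These rates can be computed from a list of true labels and a list of predicted labels. The sketch below is illustrative; the function name and the label strings are assumptions.

```python
from typing import Dict, Sequence


def detection_rates(true_labels: Sequence[str], predicted_labels: Sequence[str],
                    positive: str = "VoIP") -> Dict[str, float]:
    """Return TPR, TNR, and FPR, treating `positive` as the VoIP class."""
    tp = fn = tn = fp = 0
    for truth, predicted in zip(true_labels, predicted_labels):
        if truth == positive:
            if predicted == positive:
                tp += 1  # VoIP flow correctly identified
            else:
                fn += 1  # VoIP flow missed
        else:
            if predicted == positive:
                fp += 1  # non-VoIP flow flagged as VoIP
            else:
                tn += 1  # non-VoIP flow correctly identified
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative flow")
    return {
        "TPR": tp / (tp + fn),
        "TNR": tn / (tn + fp),
        "FPR": fp / (tn + fp),
    }
```
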
Performance Evaluation
    97% TPR with a detection time longer than 3 sec.
    World of Warcraft flows tend to be misidentified as VoIP
         90% TNR is achieved with a detection time longer than 10 sec.

ROC Curves
    ROC (Receiver Operating Characteristic)




Summary
    Propose a VoIP flow identification scheme based on 
    human conversation patterns

    Our scheme yields an identification accuracy of 95% with a 
    detection time of 4 sec., and 97% with 11 sec.

    High accuracy in short detection time

Thanks for your attention




Detecting VoIP Traffic Based on Human Conversation Patterns

  • 1. Chen‐Chi Wu (1), Kuan‐Ta Chen (2), Yu‐Chun Chang (1), Chin‐Laung Lei (1). (1) Department of Electrical Engineering, National Taiwan University; (2) Institute of Information Science, Academia Sinica. IPTComm 2008
  • 2. Outline: Motivation; Methodology; Performance evaluation; Summary
  • 3. Motivation: VoIP is becoming popular because of its low call cost and high voice quality. Skype, a popular VoIP application, has over 10,000,000 concurrent users. Accurately identifying VoIP flows in network traffic is required for traffic analysis and traffic management.
  • 4. Motivation: Challenges of VoIP flow identification: various signaling protocols (SIP, H.323, and various proprietary protocols); non‐standard port numbers; packet payload encryption. The interaction in human conversation is unique, which results in a specific characteristic of VoIP traffic.
  • 5. 4‐State Traffic Pattern: Infer the on/off (talking/silence) pattern of each party from the level of the packet rate during a short period. We model a two‐way conversation as a process with four states. State A: speaker A is talking and B is silent. State B: B is talking and A is silent. State D: both A and B are talking. State M: mutual silence. [Figure: ON/OFF periods of the two parties]
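The state inference described on slide 5 can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `four_state_sequence`, the fixed `threshold`, and the sample packet rates are all assumptions introduced here, since the slides do not state how the ON/OFF level is decided.

```python
from typing import Sequence


def four_state_sequence(rate_a: Sequence[float],
                        rate_b: Sequence[float],
                        threshold: float) -> str:
    """Map per-interval packet rates of both parties to states A, B, D, M.

    rate_a[i] and rate_b[i] are the packet rates sent by party A and party B
    during the i-th short period. A party is treated as ON (talking) when its
    rate exceeds `threshold` (a hypothetical tuning parameter).
    """
    if len(rate_a) != len(rate_b):
        raise ValueError("both directions need the same number of intervals")
    states = []
    for a, b in zip(rate_a, rate_b):
        a_on = a > threshold
        b_on = b > threshold
        if a_on and b_on:
            states.append("D")  # both parties talking
        elif a_on:
            states.append("A")  # only A talking
        elif b_on:
            states.append("B")  # only B talking
        else:
            states.append("M")  # mutual silence
    return "".join(states)


# Made-up packet rates (packets per interval) for illustration only.
example_a = [30, 32, 31, 2, 1, 0, 29, 30]
example_b = [1, 0, 28, 30, 33, 2, 1, 0]
pattern = four_state_sequence(example_a, example_b, threshold=10)
print(pattern)  # AADBBMAA
```

The resulting string is the kind of state sequence that the later slides turn into features.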
  • 6. Intuition behind Our Approach: The 4‐state traffic pattern of VoIP traffic is unique compared to that of other network applications. [Figure: 4‐state traffic patterns of Web, P2P (BitTorrent), online game (WoW), TELNET, and VoIP (Skype) flows; legend: A or B, D, M]
  • 7. Methodology: Detect VoIP flows based on the unique human speech conversation patterns embedded in voice traffic. Derive features (attributes) from the conversation patterns. Adopt a naïve Bayesian classifier, a supervised machine learning tool, to divide traffic into the VoIP and non‐VoIP classes. The class label of each training flow is required.
  • 8. Methodology Overview: Training phase: labeled training flows (VoIP or non‐VoIP) → extract 4‐state traffic patterns and derive features → flow vectors → learn classifier parameters → naïve Bayesian classifier. Identification phase: incoming flows (unknown class) → extract conversation patterns and derive features → flow vectors → classify with the naïve Bayesian classifier → flow labels (VoIP or non‐VoIP).
  • 9. Naïve Bayesian Classifier: The naïve Bayesian classifier is based on Bayes' theorem: $P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$. Each flow is represented by a vector $X = (x_1, x_2, \ldots, x_n)$ depicting $n$ features $A_1, A_2, \ldots, A_n$. Suppose there are $m$ classes $C_1, C_2, \ldots, C_m$.
  • 10. Naïve Bayesian Classifier: Given a flow vector $X$, the classifier predicts that the flow belongs to class $C_i$ iff $P(C_i \mid X) > P(C_j \mid X)$ for $1 \le j \le m$, $j \ne i$. By Bayes' theorem, $P(C_i \mid X) = \frac{P(X \mid C_i)\,P(C_i)}{P(X)}$. $P(X)$ is constant across classes and $P(C_i)$ is the prior probability, thus the task is to maximize $P(X \mid C_i)\,P(C_i)$, which requires estimating $P(X \mid C_i)$.
  • 11. Naïve Bayesian Classifier: The naïve assumption is that the values of the features are conditionally independent of one another given the class: $P(X \mid C_i) = \prod_{k=1}^{n} P(x_k \mid C_i) = P(x_1 \mid C_i) \times P(x_2 \mid C_i) \times \cdots \times P(x_n \mid C_i)$. The probabilities $P(x_1 \mid C_i), P(x_2 \mid C_i), \ldots, P(x_n \mid C_i)$ can be easily estimated from the training data.
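The classifier on slides 9 to 11 can be sketched in a few lines. The slides do not say how each $P(x_k \mid C_i)$ is modeled, so this sketch assumes a Gaussian per feature and per class; that choice, the function names `fit` and `predict`, and the toy feature values are all assumptions for illustration, not data from the traces.

```python
import math
from collections import defaultdict


def fit(vectors, labels):
    """Estimate log P(C_i) and a (mean, variance) pair per feature per class."""
    by_class = defaultdict(list)
    for x, y in zip(vectors, labels):
        by_class[y].append(x)
    total = len(vectors)
    model = {}
    for c, rows in by_class.items():
        n = len(rows)
        stats = []
        for column in zip(*rows):  # iterate over features
            mean = sum(column) / n
            # Small constant keeps the variance strictly positive.
            var = sum((v - mean) ** 2 for v in column) / n + 1e-9
            stats.append((mean, var))
        model[c] = (math.log(n / total), stats)
    return model


def log_gaussian(x, mean, var):
    """Log density of a normal distribution (assumed form of P(x_k | C_i))."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)


def predict(model, x):
    """Return the class maximizing log P(C_i) + sum_k log P(x_k | C_i)."""
    def score(c):
        log_prior, stats = model[c]
        return log_prior + sum(
            log_gaussian(v, m, s) for v, (m, s) in zip(x, stats))
    return max(model, key=score)


# Toy flow vectors: (normalized log-likelihood, alternation rate). Made up.
train_x = [(-0.40, 0.30), (-0.50, 0.25), (-0.45, 0.35),
           (-2.00, 0.05), (-2.50, 0.90), (-1.80, 0.02)]
train_y = ["VoIP", "VoIP", "VoIP", "non-VoIP", "non-VoIP", "non-VoIP"]
model = fit(train_x, train_y)
print(predict(model, (-0.42, 0.28)))
print(predict(model, (-2.20, 0.10)))
```

Summing log probabilities instead of multiplying probabilities is equivalent to the product on slide 11 and avoids numerical underflow when there are many features.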
  • 12. How to derive features from the 4‐state traffic pattern? Use a Markov chain to model the VoIP traffic pattern; compute statistics of traffic patterns. [Figure: 4‐state traffic patterns of Web, P2P, WoW, TELNET, and VoIP flows; legend: A or B, D, M]
  • 13. Markov Chain: Build a Markov chain model based on a set of known VoIP traffic patterns, and derive a feature from it: the likelihood value. Transition probabilities of the 4‐state Markov chain (row: current state; column: next state):
            A       B       D       M
        A   0.9022  0.0028  0.0380  0.0571
        B   0.0029  0.9030  0.0391  0.0550
        D   0.0607  0.0592  0.8763  0.0038
        M   0.0465  0.0439  0.0019  0.9078
    [Figure: 4‐state Markov chain]
  • 14. Likelihood of Traffic Patterns: Given a traffic pattern with a state sequence $S_1, S_2, \ldots, S_n$, where $S_i \in \{A, B, D, M\}$, compute the log‐likelihood value as $\log(P_{1,2} \times P_{2,3} \times \cdots \times P_{n-1,n})$, where $P_{i,j}$ is the transition probability from $S_i$ to $S_j$. Traffic flows may vary in length, thus define the normalized log‐likelihood value as $\frac{\log(P_{1,2} \times P_{2,3} \times \cdots \times P_{n-1,n})}{N}$, where $N$ is the length of the sequence.
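The normalized log‐likelihood on slide 14 can be computed as sketched below, using the transition probabilities listed on slide 13. The natural logarithm is an assumption (the slides do not state the base), and the function name and the two sample sequences are hypothetical.

```python
import math

# Transition probabilities from slide 13: P[current][next].
P = {
    "A": {"A": 0.9022, "B": 0.0028, "D": 0.0380, "M": 0.0571},
    "B": {"A": 0.0029, "B": 0.9030, "D": 0.0391, "M": 0.0550},
    "D": {"A": 0.0607, "B": 0.0592, "D": 0.8763, "M": 0.0038},
    "M": {"A": 0.0465, "B": 0.0439, "D": 0.0019, "M": 0.9078},
}


def normalized_log_likelihood(sequence: str) -> float:
    """Sum of log transition probabilities divided by the sequence length N."""
    if len(sequence) < 2:
        raise ValueError("need at least two states to form a transition")
    log_likelihood = sum(
        math.log(P[current][nxt])
        for current, nxt in zip(sequence, sequence[1:]))
    return log_likelihood / len(sequence)


# Hypothetical sequences: one with long talk spurts, one flipping every step.
conversational = "AAAAMMBBBBDDAAAA"
erratic = "ABABABABABABABAB"
print(normalized_log_likelihood(conversational))
print(normalized_log_likelihood(erratic))
```

The sequence with long sojourns in each state scores much higher than the one that flips state at every step, which is the contrast that slide 15 relies on.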
  • 15. Likelihood of Traffic Patterns: The Markov chain represents typical human conversation. VoIP flows => large log‐likelihood value. Non‐VoIP flows => low log‐likelihood value, because they exhibit non‐human‐like behavior: non‐interactive, independent, unidirectional.
  • 16. Statistics of Traffic Patterns: Mean and standard deviation of the period that party A (or B) is ON (talking) each time: captures bidirectional behavior. Mean and standard deviation of the sojourn time in states A, B, D, and M, respectively: captures interactive behavior. State alternation frequency: captures how fragmented and disordered the traffic pattern is.
  • 17. Statistics of Traffic Patterns: State alternation frequency: the frequency of alternation between different states. E.g., (6 alternations between different states) / (20 sec.) = 0.3 alternations per second.
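The statistics on slides 16 and 17 can be derived from a state sequence as sketched below. This assumes one state sample per fixed interval (`interval_sec`, a hypothetical parameter) and uses the population standard deviation; neither detail is specified in the slides. The sample sequence is made up to reproduce the 6 alternations in 20 seconds example.

```python
from itertools import groupby
from statistics import mean, pstdev


def runs(sequence: str):
    """Collapse a state sequence into (state, run length) pairs."""
    return [(state, len(list(group))) for state, group in groupby(sequence)]


def alternation_rate(sequence: str, interval_sec: float) -> float:
    """Number of changes between different states per second."""
    if not sequence:
        raise ValueError("empty state sequence")
    alternations = len(runs(sequence)) - 1
    return alternations / (len(sequence) * interval_sec)


def sojourn_stats(sequence: str, interval_sec: float):
    """Mean and standard deviation of the time spent per visit to each state."""
    durations = {}
    for state, length in runs(sequence):
        durations.setdefault(state, []).append(length * interval_sec)
    return {state: (mean(d), pstdev(d)) for state, d in durations.items()}


# 20 one-second samples containing 6 alternations between different states.
sample = "AAAMMBBBBDDAAAMMMBBB"
print(alternation_rate(sample, 1.0))  # 0.3
print(sojourn_stats(sample, 1.0))
```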
  • 18. Feature Summary: Feature set: normalized log‐likelihood value based on the Markov chain; speech period of party A or B (mean, standard deviation); sojourn time in each state* (mean, standard deviation); ratio of sojourn time in each state*; alternation rate between states*. (*States A, B, D, M.)
  • 19. Methodology: Training phase: labeled training flows (VoIP or non‐VoIP) → extract 4‐state traffic patterns and derive features → flow vectors → learn classifier parameters → naïve Bayesian classifier. Identification phase: incoming flows (unknown class) → extract conversation patterns and derive features → flow vectors → classify with the naïve Bayesian classifier → flow labels (VoIP or non‐VoIP).
  • 20. Trace Collection: We collected network traffic from 5 categories of applications: VoIP (Skype), TELNET, Web, P2P (BitTorrent), and online game (World of Warcraft).
        Category      # Connections   Duration (min)   # Packets    Bytes (MB)
        VoIP                    462            2,388    4,728,240        4,318
        TELNET                2,008            4,729   10,559,261        7,331
        Web                   1,406            1,537    2,528,359          680
        P2P                  15,845            3,334   29,220,870       30,500
        Online game           2,224              120   28,264,360       59,097
  • 21. Performance Evaluation: Detect VoIP flows as early as possible; detection time is a major concern. 95% accuracy with a 4‐second detection time; 97% accuracy with an 11‐second detection time.
  • 22. Performance Evaluation: The goal is to detect VoIP flows, so VoIP flows are positives and non‐VoIP flows are negatives. True positive rate: $\mathrm{TPR} = \frac{\text{number of VoIP flows correctly identified}}{\text{total number of VoIP flows}}$. True negative rate: $\mathrm{TNR} = \frac{\text{number of non‐VoIP flows correctly identified}}{\text{total number of non‐VoIP flows}}$. The false positive rate is $\mathrm{FPR} = 1 - \mathrm{TNR}$.
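The rates used in the evaluation can be computed from true and predicted flow labels as sketched below. The function name `detection_rates`, the label strings, and the ten sample flows are hypothetical and are not results from the presentation.

```python
def detection_rates(true_labels, predicted_labels, positive="VoIP"):
    """Return TPR, TNR, and FPR, treating `positive` flows as positives."""
    tp = fn = tn = fp = 0
    for truth, predicted in zip(true_labels, predicted_labels):
        if truth == positive:
            if predicted == positive:
                tp += 1  # VoIP flow correctly identified
            else:
                fn += 1  # VoIP flow missed
        else:
            if predicted == positive:
                fp += 1  # non-VoIP flow mistaken for VoIP
            else:
                tn += 1  # non-VoIP flow correctly identified
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one flow of each class")
    return {
        "TPR": tp / (tp + fn),
        "TNR": tn / (tn + fp),
        "FPR": fp / (tn + fp),
    }


# Made-up labels: 4 VoIP flows and 6 non-VoIP flows.
truth = ["VoIP"] * 4 + ["non-VoIP"] * 6
guess = ["VoIP", "VoIP", "VoIP", "non-VoIP"] + ["non-VoIP"] * 5 + ["VoIP"]
rates = detection_rates(truth, guess)
print(rates)
```

Here three of four VoIP flows and five of six non‐VoIP flows are labeled correctly, so TPR is 0.75 and TNR is 5/6.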
  • 23. Performance Evaluation: 97% TPR with a detection time longer than 3 sec. Flows of World of Warcraft tend to be misidentified. 90% TNR is achieved with a detection time longer than 10 sec.
  • 24. ROC Curves: ROC (Receiver Operating Characteristic). [Figure: ROC curves]
  • 25. Summary: We propose a VoIP flow identification scheme based on human conversation patterns. Our scheme yields an identification accuracy of 95% within 4 sec. of detection time, and 97% within 11 sec. It achieves high accuracy in a short detection time.