Outline
Motivation for Temporal Data Mining (TDM)
Examples of Temporal Data
TDM Concepts
Sequence Mining: temporal association mining
Calendric Association Rules
Trend Dependencies
Frequent Episodes
Markov Models & Hidden Markov Models
Motivation for Temporal Data Mining:
Time-varying versus Space-varying Processes
 • Most of the data mining techniques that we have
   discussed so far have focused on the classification,
   prediction, or characterization of single data points.
 • For example:
    – “Assign a record to one of a set of classes” ---
      techniques :
       • Decision Trees; Neural Networks; Bayesian classifiers; etc.
    – “Predict the value of a field in a record given the values of
      the other fields” --- techniques :
       • Regression; Neural Networks; Association rules; etc.
    – “Find regions of feature space where data points are
      densely grouped” --- techniques :
       • Clustering; Self-Organizing Maps
Time-varying versus Space-varying Processes:
  The Statistical Independence Assumption
 • In the methods that we have considered so far, we
   have assumed that each observed data point is
   statistically independent from the observation that
   preceded it. For example:
    – Classification: the class of data point x_k is not influenced by
      the class of x_(k-1) (or indeed by any other data point).
    – Prediction: the value of a specific field in a given record
      depends only on the values of the fields within that record,
      not on values in the fields of any other records.
 • Many important real-world data mining problems do
   not satisfy this Statistical Independence Assumption.
Motivation for Temporal Data Mining, continued
 There are many examples of time-ordered data (e.g., scientific, medical,
  financial, sports, computer network traffic, web logs, sales
  transactions, factory machinery performance, fleet repair records,
  weather/climate, stock market, telephone calls, etc.)
 Time-tagged data collections are frequently built from repeated
  measurements -- dynamic data volumes are therefore large, and continuing
  to grow!
  Many of the data mining methods that we have studied so far require
   some modification to handle temporal relationships (“before”, “after”,
   “during”, “in summer”, “whenever X happens”)
 Time-ordered data lend themselves to prediction -- what is the
  likelihood of an event, given the preceding history of events (e.g.,
  failure of parts in an electro-mechanical system)?
 Time-ordered data often link certain events to specific patterns of
  temporal behavior (e.g., network intrusion break-ins).
Temporal Data Examples
 Periodic -- sinusoidal
 Periodic -- smooth non-sine
 Periodic -- spiked events (chirp)
 Aperiodic events (noise?)
 Single spiked events
 Single long-duration events
[Figure: an example time-series plot illustrating each of the six categories]
Trivial Temporal Data Examples
  A flat line (constant measurement of some quantity).
  A linearly increasing or decreasing curve (straight line).
Even though these are not complex, they are still temporal
  data. Fortunately, these data streams can easily be
  represented by one or two parameters, and then you’re
  done!
Temporal Data Mining (TDM) Concepts
Event: the occurrence of some data pattern in time
Time Series: a sequence of data over a period of time
Temporal Pattern: the structure of the time series, perhaps
 represented as a vector in a Q-dimensional metric space, used
 to characterize and/or predict events
Temporal Pattern Cluster: the set of all vectors within
 some specified similarity distance of a temporal pattern
Phase Space: a state space of metrics that describe the
 temporal pattern (e.g., Fourier space, wavelets, …)
Event Characterization Function: connects events to
 temporal patterns; characterizes the event in phase space
TDM Concepts, continued

Absolute Time Cycle Events: Co-occurrence of two
 or more events at the same time
Contiguous Time Cycle Events: Co-occurrence of
 two or more events in consecutive time intervals
Sequence Mining
Temporal Association Mining is Sequence Mining.
Temporal data need to be “categorized” (e.g., by season, or
 month, or day of week, or time interval, such as morning,
 afternoon, night); a small sketch of such binning follows below.
The number of pair-wise associations could become huge
 (= combinatorial explosion), and so care must be taken in
 selecting categorizations.
Sequence mining must include concepts of “before” and
 “after”. And it can include time intervals for the “before” and
 “after” items. (e.g., it is found that people who have purchased a
  VCR are three times more likely to purchase a camcorder within the
  next two to four months.)
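To make the categorization step concrete, here is a minimal Python sketch. The particular bins (meteorological seasons, three parts of day) are illustrative assumptions, not a standard scheme:

  from datetime import datetime

  def categorize(ts: datetime):
      """Map a raw timestamp onto coarse categories (season, day of week,
      part of day) suitable for temporal association mining.
      The bin boundaries here are illustrative choices only."""
      season = ("winter", "spring", "summer", "fall")[(ts.month % 12) // 3]
      part_of_day = ("morning" if 5 <= ts.hour < 12 else
                     "afternoon" if 12 <= ts.hour < 18 else
                     "night")
      return (season, ts.strftime("%A"), part_of_day)

  print(categorize(datetime(2002, 9, 16, 14, 50)))  # ('fall', 'Monday', 'afternoon')

Each transaction is then tagged with these categories, and ordinary association mining runs over the (category, item) pairs.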
Calendric Association Rules
Each data item di in a temporal database D is associated with a
  timestamp ti.
The database time range is assumed to be divided into
  predefined time intervals, each having a duration Δt. Interval k
  is defined by:
                 Interval k: k·Δt ≤ (ti - t0) < (k + 1)·Δt.
The kth subset of D is D[k], corresponding to the set of all data
 items with timestamps in interval k.
Partitional Association Rule Mining is applied to each
 partition D[k] separately, to search for associations with high
 confidence that occur within specific time intervals k. These
 are Calendric Association Rules.
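A minimal sketch of the partitioning step, with timestamps as plain numbers (the rule mining itself would then be run on each partition by any ordinary association-rule miner):

  from collections import defaultdict

  def partition_by_interval(items, t0, dt):
      """Split timestamped items into subsets D[k]: item (ti, data)
      falls in interval k iff k*dt <= (ti - t0) < (k + 1)*dt."""
      D = defaultdict(list)
      for ti, data in items:
          D[int((ti - t0) // dt)].append(data)
      return D

  # Toy usage: three transactions, intervals of width 10.
  D = partition_by_interval(
      [(1, {"milk"}), (12, {"bread"}), (15, {"milk", "bread"})], t0=0, dt=10)
  # D[0] == [{'milk'}];  D[1] == [{'bread'}, {'milk', 'bread'}]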
Trend Dependencies
Initial thought (this is not serious):
  Two points define a tendency
  Three points define a trend
  Four points define a theory
Seriously… Trend Dependencies are association rules
 that discover time variations in attribute values (e.g., an
 employee’s salary always increases over time).
More generally, Trend Dependencies do not have to be
 time-dependent variations. Trends can occur in any kind
 of data, not just time-based data.
Frequent Episodes
Frequent Episode = a set of events that frequently occur together
 within a defined time-span; for example, a recording of all computer
 network connections that a single user made within five
 minutes.
An Episode Rule is a generalized association rule applied
 to sequences of events.
An Event Sequence is an ordered list of events, each
 occurring at a particular time.
Episode rules are used to predict failures in communications
 equipment (switching nodes): data mining is used to build a model
 that predicts when an event (failure) will occur based upon a
 sequence of earlier events (= a Markov Chain).
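As a sketch of how episode support might be counted, following the common sliding-window formulation for “parallel” episodes (the unit-step windows and support definition here are simplifying assumptions):

  def episode_support(events, episode, window):
      """Fraction of sliding windows (unit steps) containing every event
      type in `episode`. `events` is a time-sorted list of (time, type)."""
      t_min, t_max = events[0][0], events[-1][0]
      starts = range(int(t_min) - window + 1, int(t_max) + 1)
      hits = sum(1 for w0 in starts
                 if set(episode) <= {e for t, e in events
                                     if w0 <= t < w0 + window})
      return hits / len(starts)

  # e.g., hypothetical alarms from network switching nodes:
  events = [(1, "linkdown"), (2, "retry"), (4, "linkdown"), (5, "failure")]
  print(episode_support(events, {"linkdown", "failure"}, window=3))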
Outline
Motivation for Temporal Data Mining (TDM)
Examples of Temporal Data
TDM Concepts
Sequence Mining: temporal association mining
Calendric Association Rules
Trend Dependencies
Frequent Episodes
Markov Models & Hidden Markov Models
Dependent Sequences - Markov Chains
 We often encounter sequences of observations in which each
  observation may depend on the observations which preceded it. This is
  a Markov Chain (sometimes spelled “Markoff”). If the observation
  depends only on the immediately preceding value, and no other, then
  this is called a First-Order Markov Chain.
 Examples
    Sequences of phonemes (fundamental sounds) in speech -- used in speech
     recognition.
    Sequences of letters or words in text -- used in text categorization, information
     retrieval, text mining.
    Sequences of web page accesses -- used in web usage mining.
    Sequences of bases in DNA -- used in genome mining.
    Sequences of pen-strokes -- used in hand-writing analysis.
 In all these cases, the probability of observing a particular value in the
  sequence can depend on the values which came before it.
Sequence Example: Web Log
 Consider the following extract from a web log:
  xxx - - [16/Sep/2002:14:50:34 +1000]   "GET /courseware/cse5230/ HTTP/1.1"                                       200   13539
  xxx - - [16/Sep/2002:14:50:42 +1000]   "GET /courseware/cse5230/html/research_paper.html HTTP/1.1"               200   11118
  xxx - - [16/Sep/2002:14:51:28 +1000]   "GET /courseware/cse5230/html/tutorials.html HTTP/1.1"                    200   7750
  xxx - - [16/Sep/2002:14:51:30 +1000]   "GET /courseware/cse5230/assets/images/citation.pdf HTTP/1.1"             200   32768
  xxx - - [16/Sep/2002:14:51:31 +1000]   "GET /courseware/cse5230/assets/images/citation.pdf HTTP/1.1"             206   146390
  xxx - - [16/Sep/2002:14:51:40 +1000]   "GET /courseware/cse5230/assets/images/clustering.pdf HTTP/1.1"           200   17100
  xxx - - [16/Sep/2002:14:51:40 +1000]   "GET /courseware/cse5230/assets/images/clustering.pdf HTTP/1.1"           206   14520
  xxx - - [16/Sep/2002:14:51:56 +1000]   "GET /courseware/cse5230/assets/images/NeuralNetworksTute.pdf HTTP/1.1"   200   17137
  xxx - - [16/Sep/2002:14:51:56 +1000]   "GET /courseware/cse5230/assets/images/NeuralNetworksTute.pdf HTTP/1.1"   206   16017
  xxx - - [16/Sep/2002:14:52:03 +1000]   "GET /courseware/cse5230/html/lectures.html HTTP/1.1"                     200   9608
  xxx - - [16/Sep/2002:14:52:05 +1000]   "GET /courseware/cse5230/assets/images/week03.ppt HTTP/1.1"               200   121856
  xxx - - [16/Sep/2002:14:52:24 +1000]   "GET /courseware/cse5230/assets/images/week06.ppt HTTP/1.1"               200   527872



 Clearly the URL that is requested depends on the URL that was
  requested before
   If the user uses the “Back” button in his/her browser, the requested URL may
      depend on earlier URLs in the sequence too.
 Given a particular observed URL, we can then calculate the
  probabilities of observing other possible URLs in the next click.
   Note that we may even observe the same URL next.
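A minimal sketch of estimating those next-click probabilities from an observed clickstream, under the first-order assumption that only the current URL matters (the toy clickstream reuses URLs from the extract above):

  from collections import Counter, defaultdict

  def next_url_probabilities(clicks):
      """Estimate P(next URL | current URL) by counting observed transitions."""
      counts = defaultdict(Counter)
      for cur, nxt in zip(clicks, clicks[1:]):
          counts[cur][nxt] += 1
      return {cur: {nxt: n / sum(c.values()) for nxt, n in c.items()}
              for cur, c in counts.items()}

  clicks = ["/courseware/cse5230/",
            "/courseware/cse5230/html/tutorials.html",
            "/courseware/cse5230/html/tutorials.html",  # a reload: same URL next
            "/courseware/cse5230/html/lectures.html"]
  probs = next_url_probabilities(clicks)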
First-Order Markov Models
In order to model processes such as these, we make use of the
 idea of states. At any time t, we consider the system to be in
 state w(t).
We can consider a sequence of successive states -- this
 sequence has length T :
                        w^T = {w(1), w(2), …, w(T)}
We can model the production of a particular sequence using
  transition probabilities:

                    P(w_j(t + 1) | w_i(t)) = a_ij

  This a_ij is the probability that the system will be in state w_j at time t+1
    given that it was in state w_i at time t.
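A sketch of estimating the a_ij by maximum likelihood from an observed state sequence (states coded as integers 0..N-1; with sparse data one would add smoothing, omitted here):

  import numpy as np

  def estimate_transition_matrix(sequence, n_states):
      """a_ij = (# transitions i -> j) / (# times state i was left)."""
      A = np.zeros((n_states, n_states))
      for i, j in zip(sequence, sequence[1:]):
          A[i, j] += 1.0
      row_sums = A.sum(axis=1, keepdims=True)
      return np.divide(A, row_sums, out=np.zeros_like(A), where=row_sums > 0)

  seq = [0, 1, 1, 0, 2, 1, 0, 0, 2]        # toy sequence over 3 states
  A = estimate_transition_matrix(seq, 3)   # each visited row sums to 1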
First-Order Markov Models, continued
A model of states and transition probabilities, such as the one
 we have just described, is called a Markov Model.
Since we have assumed that the transition probabilities depend
 only on the one immediately preceding state, this is a First-order
 Markov Model.
  Higher-order Markov models are possible (dependent on multiple
    previous states), but we will not consider them here.
For example, Markov models for human speech could have
  states corresponding to the various phonemes:
  A Markov model for the word “cat” would have states for /k/,
    /a/, /t/ and a final silent state.
Example: Markov Model for “cat”
[Diagram: a left-to-right chain of states /k/ → /a/ → /t/ → /silent/, linked by transition arrows]
 /k/, /a/, and /t/ correspond to the phonemes (sounds of speech) that
  lead to the pronunciation of the word “cat”.
 The probability that /a/ will follow /k/ is based upon a “dictionary” of
  speech usage: the probability that /a/ will follow /k/ is the frequency of
  occurrence of that “state transition” in the dictionary. Likewise, for the
  transition from /a/ to /t/, and from /t/ to the /silent/ state, the
  probability is calculated from the frequency of occurrence of that state
  transition in the dictionary.
 These probabilities can be calculated directly (i.e., they are estimated from observed frequencies).
 The “dictionary” may be the existing database (=training set).
So, what does this have to do with anything?
 (i.e., what does it have to do with Data Mining?)
 One type of data mining is “Predictive”.
 Temporal data mining is often predictive.
 What are we predicting? -- usually, one wants to predict what will
  happen next, or what is the probability that a certain thing (e.g., an error
  condition or failure state or medical condition) will happen.
 How does one make the prediction? -- by using prior frequencies of state
  transitions, based upon the existing (historical) sequences of data items
  (state transitions) in the temporal database.
 Thus, Markov Models can be applied to this form of predictive data
  mining. This is another example of Bayes classification =
  probabilistic estimation based upon prior probabilities.
Hidden Markov Models (HMM)
 In the preceding “cat” example, we said that the states correspond to
  phonemes (the fundamental sounds of speech).
 In a speech-recognition system, however, we don’t have access to
  phonemes – we can only measure properties of the sound produced by a
  particular speaker (= the training set).
 In general, our observed data do not correspond directly to an underlying
  state of the model --- The data correspond only to the visible states of
  the system.
   The visible states are directly accessible for measurement.
 The system can also have internal “hidden” states, which cannot be
  observed directly.
   For each hidden state, there is a probability of observing each visible state.
 This sort of model is called a Hidden Markov Model (HMM).
Example: Coin Toss Experiments
 Let us imagine a scenario where we are in a room which is
  divided in two by a curtain.
 We are on one side of the curtain, and on the other side is an
  unseen person who will carry out a procedure using coins
  resulting in a head (H) or a tail (T).
 When the person has carried out the procedure, they call out
  the result, H or T, which we record.
 This system will allow us to generate a sequence of H’s and
  T’s, such as:

  HHTHTHTTHTTTTTHHTHHHHTHHHTTHHHHHHTTTT
  TTTTHTHHTHTTTTTHHTHTHHHTHTHHTTTTHHTTT
  HHTHHTTTHTHTHTHTHHHTHHTTHT….
Example: A Single Fair Coin
 Imagine that the person behind the curtain has a single fair coin (i.e., it
  has equal probabilities of coming up heads H or tails T).
 We could model the process that produces the sequence of H’s and T’s
  as a Markov model: with two states, and with equal transition
  probabilities:

 [Diagram: two states, H and T; all four transitions (H→H, H→T, T→H, T→T) have probability 0.5]
 Note that here the visible states correspond exactly to the internal states
  – the model is not hidden.
 Note also that states can transition to themselves. In other words, if a
  Head is tossed on one turn, then there is still a 50% chance that a Head
  will be tossed on the next turn.
Modified Example: A Fair and a Biased Coin
 Now imagine a more complicated scenario. The unseen person behind
  the curtain has two coins, one fair and one biased [for example, P(T)
  = 0.9 = 90% in favor of tossing a Tail].
   (1) The person starts by picking a coin at random.
   (2) The person tosses the coin, and calls out the result (H or T).
   (3) If at any time the result is H, the person switches coins.
   (4) Go back to step (2), and repeat.
 This modified process generates sequences like the following :

  TTTTTTTTTTTTTTTTTTTTTTTTHHTTTTTTTHHTTTTTTT
  TTTTTTTTTTTTTTTHHTTTTTTTTTHTTHTTHHTTTTTHHT
  TTTTTTTTTHHTTTTTTTTHTHHHTTTTTTTTTTTTTTHHTT
  TTTTTHTHTTTTTTTHHTTTTT…
 Note that this looks quite different from the sequence of tosses
  generated in the “fair coin” example on the preceding slide.
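A sketch that simulates this procedure directly (the 0.9 tail bias and the switch-on-heads rule come from the steps above; the seed is arbitrary):

  import random

  def two_coin_sequence(n, p_tail_biased=0.9, seed=None):
      """Simulate the curtained procedure: pick a coin at random, toss it
      and call out H or T, and switch coins whenever an H is called."""
      rng = random.Random(seed)
      coin = rng.choice(["fair", "biased"])
      calls = []
      for _ in range(n):
          p_tail = 0.5 if coin == "fair" else p_tail_biased
          result = "T" if rng.random() < p_tail else "H"
          calls.append(result)
          if result == "H":                 # step (3): switch coins on H
              coin = "biased" if coin == "fair" else "fair"
      return "".join(calls)

  print(two_coin_sequence(80, seed=1))      # long runs of T, as in the sample above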
A Fair and a Biased Coin, continued
In this scenario, the visible state no longer corresponds
  exactly to the hidden state of the system:
  Visible state: output of H or T (this is what we can see)
  Hidden state: which coin was tossed (we cannot know)

We can model this 2-coin process using a HMM:

[Diagram: hidden states “Fair coin” and “Biased coin”. Transition
 probabilities: Fair→Fair 0.5, Fair→Biased 0.5, Biased→Biased 0.9,
 Biased→Fair 0.1. Emission probabilities: the fair coin emits H or T
 with probability 0.5 each; the biased coin emits T with probability 0.9
 and H with probability 0.1.]
A Fair and a Biased Coin, continued
 We see from the diagram on the preceding slide that we have extended
  our model:
    The visible states (H and T) now appear alongside the hidden states, and the emission
     probabilities (probabilities of tossing H or T) are shown too.
 With the internal states w(t) and state transition probabilities aij, we also
  have visible states v(t) and emission probabilities bjk .


                            P(v_k(t) | w_j(t)) = b_jk

    This is the probability that we will observe a specific visible state, given a particular
     internal (hidden) state.
    Note that the b_jk do not need to be related to the a_ij [they are coincidentally the same in
     the “fair coin and biased coin” example above -- a consequence of step (3) of that
     procedure, which switches coins exactly when an H is tossed].
 We now have a full Hidden Markov Model (HMM).
HMM: formal definition
 We can now give a more formal definition of a First-order
   Hidden Markov Model (adapted from [RaJ1986]) :
    There is a finite number of (internal) states, N.
    At each time t, a new state is entered, based upon a transition probability
     distribution which depends on the state at time t – 1. Self-transitions are
     allowed.
    After each transition is made, a symbol is output, according to a
     probability distribution which depends only on the current state. There
     are thus N such probability distributions.
 Estimating the number of states N, as well as the transition and
   emission probabilities for each state, is usually a very complex
   problem, but solutions do exist.
 Reference: [RaJ1986] L. R. Rabiner and B. H. Juang, “An Introduction to Hidden
   Markov Models,” IEEE ASSP Magazine, vol. 3, no. 1, pp. 4-16, January 1986.
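One of those solutions, for the evaluation step, is the forward algorithm, which computes P(v^T | model) in O(N²T) time. A minimal NumPy sketch (pi is the initial state distribution, A the N×N transition matrix, B the N×M emission matrix, obs a list of symbol indices):

  import numpy as np

  def forward_likelihood(obs, pi, A, B):
      """Forward algorithm: alpha_t(j) = P(v(1..t), state j at time t | model).
      Returns P(obs | model) = sum_j alpha_T(j)."""
      alpha = pi * B[:, obs[0]]            # initialize with the first symbol
      for o in obs[1:]:
          alpha = (alpha @ A) * B[:, o]    # propagate one step, then emit
      return alpha.sum()

(For long sequences one would rescale alpha, or work in log space, to avoid numerical underflow.)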
Use of HMMs
 We have now seen what sorts of processes can be modeled using
  HMMs, and how an HMM is specified mathematically.
 We now consider how HMMs are actually used.
 Consider the two H and T sequences we saw in the previous examples:
   How could we decide which coin-toss system was most likely to have produced
    each sequence?
 To which system would you assign these sequences?
  1: TTTHHTTTTTTTTTTTTTHHTTTTTTHHTTTHH
  2: THHTTTHHHTTHTHTTHTHHTTHHHTTHTHTHT
  3: THHTHTHTHTHHHTTHTTTHHTTHTTTTTHHHT
  4: HTTTHTTHTTTTHTTTHHTTHTHTTTTTTTTHT
 We could answer this question using a Bayesian formulation.
Use of HMMs for classification
 HMMs are often used to classify sequences (temporal or not).
 To do this, a separate HMM is built and trained (i.e., the parameters
   are estimated) for each class of sequence in which we are interested.
    e.g., we might have an HMM for each word in a speech-recognition system. The
      hidden states would correspond to phonemes, and the visible states to measured
      sound features.
 For a given observed sequence v^T, we estimate the probability that a
   given Hidden Markov Model M_l generated it:

                  P(M_l | v^T) = P(v^T | M_l) · P(M_l) / P(v^T)     (Bayes theorem strikes again.)
 We assign the sequence to the model with the highest posterior
  probability (e.g., identifying a speaker by their speech patterns).
 The algorithms for calculating these probabilities can be found in
  various references [see [RaJ1986] above; see also Duda, Hart,
  & Stork, 2000, Pattern Classification (2nd Edition), Wiley, pp. 128-138].
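A sketch of that classification rule, reusing the forward_likelihood function above. The two coin models are encoded as the slides describe them (state 0 = fair, state 1 = biased; symbol 0 = H, symbol 1 = T), and the equal model priors are an assumption:

  import numpy as np

  models = {
      "single fair coin": (np.array([1.0]),            # pi
                           np.array([[1.0]]),          # A
                           np.array([[0.5, 0.5]])),    # B: P(H), P(T)
      "fair + biased":    (np.array([0.5, 0.5]),
                           np.array([[0.5, 0.5],       # fair: switch on H
                                     [0.1, 0.9]]),     # biased: switch on H
                           np.array([[0.5, 0.5],
                                     [0.1, 0.9]])),
  }
  priors = {name: 1.0 / len(models) for name in models}

  def classify(obs):
      """Assign obs to argmax_l P(v^T | M_l) P(M_l); P(v^T) is a common factor."""
      return max(models,
                 key=lambda m: forward_likelihood(obs, *models[m]) * priors[m])

  seq = [1] * 20 + [0, 0] + [1] * 10   # mostly tails: the two-coin model should win
  print(classify(seq))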
Summary
Summary of Topics Covered - Lecture 8
Motivation for Temporal Data Mining (TDM)
Examples of Temporal Data
TDM Concepts
Sequence Mining: temporal association mining
Calendric Association Rules
Trend Dependencies
Frequent Episodes
Markov Models & Hidden Markov Models

Más contenido relacionado

La actualidad más candente

Text data mining1
Text data mining1Text data mining1
Text data mining1KU Leuven
 
5.1 mining data streams
5.1 mining data streams5.1 mining data streams
5.1 mining data streamsKrish_ver2
 
5.2 mining time series data
5.2 mining time series data5.2 mining time series data
5.2 mining time series dataKrish_ver2
 
Data Mining: Application and trends in data mining
Data Mining: Application and trends in data miningData Mining: Application and trends in data mining
Data Mining: Application and trends in data miningDataminingTools Inc
 
Data mining , Knowledge Discovery Process, Classification
Data mining , Knowledge Discovery Process, ClassificationData mining , Knowledge Discovery Process, Classification
Data mining , Knowledge Discovery Process, ClassificationDr. Abdul Ahad Abro
 
Information retrieval-systems notes
Information retrieval-systems notesInformation retrieval-systems notes
Information retrieval-systems notesBAIRAVI T
 
Lecture6 introduction to data streams
Lecture6 introduction to data streamsLecture6 introduction to data streams
Lecture6 introduction to data streamshktripathy
 
Data mining :Concepts and Techniques Chapter 2, data
Data mining :Concepts and Techniques Chapter 2, dataData mining :Concepts and Techniques Chapter 2, data
Data mining :Concepts and Techniques Chapter 2, dataSalah Amean
 
Data warehousing and online analytical processing
Data warehousing and online analytical processingData warehousing and online analytical processing
Data warehousing and online analytical processingVijayasankariS
 
Data Mining: Concepts and Techniques chapter 07 : Advanced Frequent Pattern M...
Data Mining: Concepts and Techniques chapter 07 : Advanced Frequent Pattern M...Data Mining: Concepts and Techniques chapter 07 : Advanced Frequent Pattern M...
Data Mining: Concepts and Techniques chapter 07 : Advanced Frequent Pattern M...Salah Amean
 
The Future of Data Science
The Future of Data ScienceThe Future of Data Science
The Future of Data ScienceDataWorks Summit
 
Major issues in data mining
Major issues in data miningMajor issues in data mining
Major issues in data miningSlideshare
 
Data Warehouse Modeling
Data Warehouse ModelingData Warehouse Modeling
Data Warehouse Modelingvivekjv
 

La actualidad más candente (20)

Text data mining1
Text data mining1Text data mining1
Text data mining1
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
 
5.1 mining data streams
5.1 mining data streams5.1 mining data streams
5.1 mining data streams
 
5.2 mining time series data
5.2 mining time series data5.2 mining time series data
5.2 mining time series data
 
Data Mining: Application and trends in data mining
Data Mining: Application and trends in data miningData Mining: Application and trends in data mining
Data Mining: Application and trends in data mining
 
3. mining frequent patterns
3. mining frequent patterns3. mining frequent patterns
3. mining frequent patterns
 
Data mining , Knowledge Discovery Process, Classification
Data mining , Knowledge Discovery Process, ClassificationData mining , Knowledge Discovery Process, Classification
Data mining , Knowledge Discovery Process, Classification
 
Introduction to Data Mining
Introduction to Data MiningIntroduction to Data Mining
Introduction to Data Mining
 
Information retrieval-systems notes
Information retrieval-systems notesInformation retrieval-systems notes
Information retrieval-systems notes
 
Oltp vs olap
Oltp vs olapOltp vs olap
Oltp vs olap
 
Lecture6 introduction to data streams
Lecture6 introduction to data streamsLecture6 introduction to data streams
Lecture6 introduction to data streams
 
Data mining :Concepts and Techniques Chapter 2, data
Data mining :Concepts and Techniques Chapter 2, dataData mining :Concepts and Techniques Chapter 2, data
Data mining :Concepts and Techniques Chapter 2, data
 
Data mining tasks
Data mining tasksData mining tasks
Data mining tasks
 
DATA WAREHOUSING AND DATA MINING
DATA WAREHOUSING AND DATA MININGDATA WAREHOUSING AND DATA MINING
DATA WAREHOUSING AND DATA MINING
 
Data warehousing and online analytical processing
Data warehousing and online analytical processingData warehousing and online analytical processing
Data warehousing and online analytical processing
 
Data Mining: Concepts and Techniques chapter 07 : Advanced Frequent Pattern M...
Data Mining: Concepts and Techniques chapter 07 : Advanced Frequent Pattern M...Data Mining: Concepts and Techniques chapter 07 : Advanced Frequent Pattern M...
Data Mining: Concepts and Techniques chapter 07 : Advanced Frequent Pattern M...
 
The Future of Data Science
The Future of Data ScienceThe Future of Data Science
The Future of Data Science
 
01 Data Mining: Concepts and Techniques, 2nd ed.
01 Data Mining: Concepts and Techniques, 2nd ed.01 Data Mining: Concepts and Techniques, 2nd ed.
01 Data Mining: Concepts and Techniques, 2nd ed.
 
Major issues in data mining
Major issues in data miningMajor issues in data mining
Major issues in data mining
 
Data Warehouse Modeling
Data Warehouse ModelingData Warehouse Modeling
Data Warehouse Modeling
 

Destacado (9)

5.3 mining sequential patterns
5.3 mining sequential patterns5.3 mining sequential patterns
5.3 mining sequential patterns
 
Web Mining Presentation Final
Web Mining Presentation FinalWeb Mining Presentation Final
Web Mining Presentation Final
 
Web content mining
Web content miningWeb content mining
Web content mining
 
Web Mining
Web Mining Web Mining
Web Mining
 
Web mining slides
Web mining slidesWeb mining slides
Web mining slides
 
Apriori Algorithm
Apriori AlgorithmApriori Algorithm
Apriori Algorithm
 
Spatial data mining
Spatial data miningSpatial data mining
Spatial data mining
 
Spatial databases
Spatial databasesSpatial databases
Spatial databases
 
Data mining
Data miningData mining
Data mining
 

Similar a Temporal data mining

From Events to Networks: Time Series Analysis on Scale
From Events to Networks: Time Series Analysis on ScaleFrom Events to Networks: Time Series Analysis on Scale
From Events to Networks: Time Series Analysis on ScaleDr. Mirko Kämpf
 
Mythbusters: Event Stream Processing v. Complex Event Processing
Mythbusters: Event Stream Processing v. Complex Event ProcessingMythbusters: Event Stream Processing v. Complex Event Processing
Mythbusters: Event Stream Processing v. Complex Event ProcessingTim Bass
 
20IT501_DWDM_PPT_Unit_V.ppt
20IT501_DWDM_PPT_Unit_V.ppt20IT501_DWDM_PPT_Unit_V.ppt
20IT501_DWDM_PPT_Unit_V.pptPalaniKumarR2
 
Time Series Analysis… using an Event Streaming Platform
Time Series Analysis… using an Event Streaming PlatformTime Series Analysis… using an Event Streaming Platform
Time Series Analysis… using an Event Streaming Platformconfluent
 
Time Series Analysis Using an Event Streaming Platform
 Time Series Analysis Using an Event Streaming Platform Time Series Analysis Using an Event Streaming Platform
Time Series Analysis Using an Event Streaming PlatformDr. Mirko Kämpf
 
Time series-analysis-using-an-event-streaming-platform -_v3_final
Time series-analysis-using-an-event-streaming-platform -_v3_finalTime series-analysis-using-an-event-streaming-platform -_v3_final
Time series-analysis-using-an-event-streaming-platform -_v3_finalconfluent
 
Project Risk Analysis in Aerospace Industry
Project Risk Analysis in Aerospace IndustryProject Risk Analysis in Aerospace Industry
Project Risk Analysis in Aerospace IndustryIntaver Insititute
 
Shared time-series-analysis-using-an-event-streaming-platform -_v2
Shared   time-series-analysis-using-an-event-streaming-platform -_v2Shared   time-series-analysis-using-an-event-streaming-platform -_v2
Shared time-series-analysis-using-an-event-streaming-platform -_v2confluent
 
Environment Canada's Data Management Service
Environment Canada's Data Management ServiceEnvironment Canada's Data Management Service
Environment Canada's Data Management ServiceSafe Software
 
Re-envisioning the Lambda Architecture : Web Services & Real-time Analytics ...
Re-envisioning the Lambda Architecture : Web Services & Real-time Analytics ...Re-envisioning the Lambda Architecture : Web Services & Real-time Analytics ...
Re-envisioning the Lambda Architecture : Web Services & Real-time Analytics ...Brian O'Neill
 
Cassandra Day 2014: Re-envisioning the Lambda Architecture - Web-Services & R...
Cassandra Day 2014: Re-envisioning the Lambda Architecture - Web-Services & R...Cassandra Day 2014: Re-envisioning the Lambda Architecture - Web-Services & R...
Cassandra Day 2014: Re-envisioning the Lambda Architecture - Web-Services & R...DataStax Academy
 
Hmm Datamining
Hmm DataminingHmm Datamining
Hmm Dataminingjefftang
 
Design and Implementation of A Data Stream Management System
Design and Implementation of A Data Stream Management SystemDesign and Implementation of A Data Stream Management System
Design and Implementation of A Data Stream Management SystemErdi Olmezogullari
 
FlinkDTW: Time-series Pattern Search at Scale Using Dynamic Time Warping - Ch...
FlinkDTW: Time-series Pattern Search at Scale Using Dynamic Time Warping - Ch...FlinkDTW: Time-series Pattern Search at Scale Using Dynamic Time Warping - Ch...
FlinkDTW: Time-series Pattern Search at Scale Using Dynamic Time Warping - Ch...Flink Forward
 
Seattle Scalability Meetup 6-26-13
Seattle Scalability Meetup 6-26-13Seattle Scalability Meetup 6-26-13
Seattle Scalability Meetup 6-26-13specialk29
 
MongoDB for Time Series Data
MongoDB for Time Series DataMongoDB for Time Series Data
MongoDB for Time Series DataMongoDB
 
MRT 2018: reflecting on the past and the present with temporal graph models
MRT 2018: reflecting on the past and the present with temporal graph modelsMRT 2018: reflecting on the past and the present with temporal graph models
MRT 2018: reflecting on the past and the present with temporal graph modelsAntonio García-Domínguez
 
Semantics in Sensor Networks
Semantics in Sensor NetworksSemantics in Sensor Networks
Semantics in Sensor NetworksOscar Corcho
 

Similar a Temporal data mining (20)

From Events to Networks: Time Series Analysis on Scale
From Events to Networks: Time Series Analysis on ScaleFrom Events to Networks: Time Series Analysis on Scale
From Events to Networks: Time Series Analysis on Scale
 
Mythbusters: Event Stream Processing v. Complex Event Processing
Mythbusters: Event Stream Processing v. Complex Event ProcessingMythbusters: Event Stream Processing v. Complex Event Processing
Mythbusters: Event Stream Processing v. Complex Event Processing
 
20IT501_DWDM_PPT_Unit_V.ppt
20IT501_DWDM_PPT_Unit_V.ppt20IT501_DWDM_PPT_Unit_V.ppt
20IT501_DWDM_PPT_Unit_V.ppt
 
Time Series Analysis… using an Event Streaming Platform
Time Series Analysis… using an Event Streaming PlatformTime Series Analysis… using an Event Streaming Platform
Time Series Analysis… using an Event Streaming Platform
 
Time Series Analysis Using an Event Streaming Platform
 Time Series Analysis Using an Event Streaming Platform Time Series Analysis Using an Event Streaming Platform
Time Series Analysis Using an Event Streaming Platform
 
Time series-analysis-using-an-event-streaming-platform -_v3_final
Time series-analysis-using-an-event-streaming-platform -_v3_finalTime series-analysis-using-an-event-streaming-platform -_v3_final
Time series-analysis-using-an-event-streaming-platform -_v3_final
 
Project Risk Analysis in Aerospace Industry
Project Risk Analysis in Aerospace IndustryProject Risk Analysis in Aerospace Industry
Project Risk Analysis in Aerospace Industry
 
Shared time-series-analysis-using-an-event-streaming-platform -_v2
Shared   time-series-analysis-using-an-event-streaming-platform -_v2Shared   time-series-analysis-using-an-event-streaming-platform -_v2
Shared time-series-analysis-using-an-event-streaming-platform -_v2
 
Environment Canada's Data Management Service
Environment Canada's Data Management ServiceEnvironment Canada's Data Management Service
Environment Canada's Data Management Service
 
Re-envisioning the Lambda Architecture : Web Services & Real-time Analytics ...
Re-envisioning the Lambda Architecture : Web Services & Real-time Analytics ...Re-envisioning the Lambda Architecture : Web Services & Real-time Analytics ...
Re-envisioning the Lambda Architecture : Web Services & Real-time Analytics ...
 
Cassandra Day 2014: Re-envisioning the Lambda Architecture - Web-Services & R...
Cassandra Day 2014: Re-envisioning the Lambda Architecture - Web-Services & R...Cassandra Day 2014: Re-envisioning the Lambda Architecture - Web-Services & R...
Cassandra Day 2014: Re-envisioning the Lambda Architecture - Web-Services & R...
 
Chapter24
Chapter24Chapter24
Chapter24
 
Hmm Datamining
Hmm DataminingHmm Datamining
Hmm Datamining
 
Design and Implementation of A Data Stream Management System
Design and Implementation of A Data Stream Management SystemDesign and Implementation of A Data Stream Management System
Design and Implementation of A Data Stream Management System
 
FlinkDTW: Time-series Pattern Search at Scale Using Dynamic Time Warping - Ch...
FlinkDTW: Time-series Pattern Search at Scale Using Dynamic Time Warping - Ch...FlinkDTW: Time-series Pattern Search at Scale Using Dynamic Time Warping - Ch...
FlinkDTW: Time-series Pattern Search at Scale Using Dynamic Time Warping - Ch...
 
Seattle Scalability Meetup 6-26-13
Seattle Scalability Meetup 6-26-13Seattle Scalability Meetup 6-26-13
Seattle Scalability Meetup 6-26-13
 
MongoDB for Time Series Data
MongoDB for Time Series DataMongoDB for Time Series Data
MongoDB for Time Series Data
 
MRT 2018: reflecting on the past and the present with temporal graph models
MRT 2018: reflecting on the past and the present with temporal graph modelsMRT 2018: reflecting on the past and the present with temporal graph models
MRT 2018: reflecting on the past and the present with temporal graph models
 
Is this normal?
Is this normal?Is this normal?
Is this normal?
 
Semantics in Sensor Networks
Semantics in Sensor NetworksSemantics in Sensor Networks
Semantics in Sensor Networks
 

Más de ReachLocal Services India (12)

Excel ppt
Excel pptExcel ppt
Excel ppt
 
Virtual reality
Virtual realityVirtual reality
Virtual reality
 
Digital signatures
Digital signaturesDigital signatures
Digital signatures
 
System security
System securitySystem security
System security
 
Artificial intelligence
Artificial intelligenceArtificial intelligence
Artificial intelligence
 
Distributed database
Distributed databaseDistributed database
Distributed database
 
Loop invariant computation
Loop invariant computationLoop invariant computation
Loop invariant computation
 
Distributed dbms
Distributed dbmsDistributed dbms
Distributed dbms
 
Sexual harresment on women
Sexual harresment on womenSexual harresment on women
Sexual harresment on women
 
Digital signal processing
Digital signal processingDigital signal processing
Digital signal processing
 
Mobile network layer (mobile comm.)
Mobile network layer (mobile comm.)Mobile network layer (mobile comm.)
Mobile network layer (mobile comm.)
 
Regular expression (compiler)
Regular expression (compiler)Regular expression (compiler)
Regular expression (compiler)
 

Último

A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfhans926745
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesBoston Institute of Analytics
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilV3cube
 

Último (20)

A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 

Temporal data mining

  • 1.
  • 2. Outline Motivation for Temporal Data Mining (TDM) Examples of Temporal Data TDM Concepts Sequence Mining: temporal association mining Calendric Association Rules Trend Dependencies Frequent Episodes Markov Models & Hidden Markov Models
  • 3. Motivation for Temporal Data Mining: Time-varying versus Space-varying Processes • Most of the data mining techniques that we have discussed so far have focused on the classification, prediction, or characterization of single data points. • For example: – “Assign a record to one of a set of classes” --- techniques : • Decision Trees; Neural Networks; Bayesian classifiers; etc. – “Predict the value of a field in a record given the values of the other fields” --- techniques : • Regression; Neural Networks; Association rules; etc. – “Find regions of feature space where data points are densely grouped” --- techniques : • Clustering; Self-Organizing Maps
  • 4. Time-varying versus Space-varying Processes: The Statistical Independence Assumption • In the methods that we have considered so far, we have assumed that each observed data point is statistically independent from the observation that preceded it. For example: – Classification: the class of data point xk is not influenced by the class of xk-1 (or indeed by any other data point). – Prediction: the value of a specific field in a given record depends only on the values of the fields within that record, not on values in the fields of any other records. • Many important real-world data mining problems do not satisfy this Statistical Independence Assumption.
  • 5. Motivation for Temporal Data Mining, continued  There are many examples of time-ordered data (e.g, scientific, medical, financial, sports, computer network traffic, web logs, sales transactions, factory machinery performance, fleet repair records, weather/climate, stock market, telephone calls, etc.)  Time-tagged data collections frequently require repeated measurements -- dynamic data volumes are therefore growing, and continuing to grow!  Many of the data mining methods that we have studied so far require some modification to handle temporal relationships (“before”, “after”, “during”, “in summer”, “whenever X happens”)  Time-ordered data lend themselves to prediction -- what is the likelihood of an event, given the preceding history of events (e.g, failure of parts in an electro-mechanical system)?  Time-ordered data often link certain events to specific patterns of temporal behavior (e.g, network intrusion break-ins).
  • 6. Temporal Data Examples  Periodic -- sinusoidal: • Aperiodic events (noise?): • Single spiked events:  Periodic -- smooth non-sine: • Single long-duration events:  Periodic -- spiked events: (Chirp)
  • 7. Trivial Temporal Data Examples A flat line (constant measurement of some quantity). A linearly increasing or decreasing curve (straight line). Even though these are not complex, they are still temporal data. Fortunately, these data streams can easily be represented by one or two parameters, and then you’re done!
  • 8. Temporal Data Mining (TDM) Concepts Event: the occurrence of some data pattern in time Time Series: a sequence of data over a period of time Temporal Pattern: the structure of the time series, perhaps represented as a vector in a Q-dimensional metric space, used to characterize and/or predict events Temporal Pattern Cluster: the set of all vectors within some specified similarity distance of a temporal pattern Phase Space: a state space of metrics that describe the temporal pattern (e.g., Fourier space, wavelets, …) Event Characterization Function: connects events to temporal patterns; characterizes the event in phase space
  • 9. TDM Concepts, continued Absolute Time Cycle Events: Co-occurrence of two or more events at the same time Contiguous Time Cycle Events: Co-occurrence of two or more events in consecutive time intervals
  • 10. Sequence Mining Temporal Association Mining is Sequence Mining. Temporal data need to be “categorized” (e.g, by season, or month, or day of week, or time interval, such as morning, afternoon, night). The number of pair-wise associations could become huge (= combinatorial explosion), and so care must be taken in selecting categorizations. Sequence mining must include concepts of “before” and “after”. And it can include time intervals for the “before” and “after” items. (e.g., it is found that people who have purchased a VCR are three times more likely to purchase a camcorder within the next two to four months.)
  • 11. Calendric Association Rules Each data item di in a temporal database D is associated with a timestamp ti. The database time range is assumed to be divided into predefined time intervals, each having a duration t. Interval k is defined by: Interval k: kt < (ti - t0) < (k + 1)t. The kth subset of D is D[k], corresponding to the set of all data items with timestamps in interval k. Partitional Association Rule Mining is applied to each partition D[k] separately, to search for associations with high confidence that occur within specific time intervals k. These are Calendric Association Rules.
  • 12. Trend Dependencies Initial thought (this is not serious): Two points define a tendency • • Three points define a trend • Four points define a theory • Seriously… Trend Dependencies are association rules that discover time variations in attribute values (e.g, an employee’s salary always increases over time). More generally, Trend Dependencies do not have to be time-dependent variations. Trends can occur in any kind of data, not just time-based data.
  • 13. Frequent Episodes Frequent Episodes = a set of events that occur within a defined time-span; for example, a recording of all computer network connections that a single user made within five minutes. An Episode Rule is a generalized association rule applied to sequences of events. An Event Sequence is an ordered list of events, each occurring at a particular time. Episode rules are used to predict failures in communications equipment (switching nodes): data mining is used to find a predictive model to predict when an event (failure) will occur based upon a sequence of earlier events (= a Markov Chain).
  • 14. Outline Motivation for Temporal Data Mining (TDM) Examples of Temporal Data TDM Concepts Sequence Mining: temporal association mining Calendric Association Rules Trend Dependencies Frequent Episodes Markov Models & Hidden Markov Models
  • 15. Dependent Sequences - Markov Chains  We often encounter sequences of observations in which each observation may depend on the observations which preceded it. This is a Markov Chain (sometimes spelled “Markoff”). If the observation depends only on the immediately preceding value, and no other, then this is called a First-Order Markov Chain.  Examples  Sequences of phonemes (fundamental sounds) in speech -- used in speech recognition.  Sequences of letters or words in text -- used in text categorization, information retrieval, text mining.  Sequences of web page accesses -- used in web usage mining.  Sequences of bases in DNA -- used in genome mining.  Sequences of pen-strokes -- used in hand-writing analysis.  In all these cases, the probability of observing a particular value in the sequence can depend on the values which came before it.
  • 16. Sequence Example: Web Log  Consider the following extract from a web log: xxx - - [16/Sep/2002:14:50:34 +1000] "GET /courseware/cse5230/ HTTP/1.1" 200 13539 xxx - - [16/Sep/2002:14:50:42 +1000] "GET /courseware/cse5230/html/research_paper.html HTTP/1.1" 200 11118 xxx - - [16/Sep/2002:14:51:28 +1000] "GET /courseware/cse5230/html/tutorials.html HTTP/1.1" 200 7750 xxx - - [16/Sep/2002:14:51:30 +1000] "GET /courseware/cse5230/assets/images/citation.pdf HTTP/1.1" 200 32768 xxx - - [16/Sep/2002:14:51:31 +1000] "GET /courseware/cse5230/assets/images/citation.pdf HTTP/1.1" 206 146390 xxx - - [16/Sep/2002:14:51:40 +1000] "GET /courseware/cse5230/assets/images/clustering.pdf HTTP/1.1" 200 17100 xxx - - [16/Sep/2002:14:51:40 +1000] "GET /courseware/cse5230/assets/images/clustering.pdf HTTP/1.1" 206 14520 xxx - - [16/Sep/2002:14:51:56 +1000] "GET /courseware/cse5230/assets/images/NeuralNetworksTute.pdf HTTP/1.1" 200 17137 xxx - - [16/Sep/2002:14:51:56 +1000] "GET /courseware/cse5230/assets/images/NeuralNetworksTute.pdf HTTP/1.1" 206 16017 xxx - - [16/Sep/2002:14:52:03 +1000] "GET /courseware/cse5230/html/lectures.html HTTP/1.1" 200 9608 xxx - - [16/Sep/2002:14:52:05 +1000] "GET /courseware/cse5230/assets/images/week03.ppt HTTP/1.1" 200 121856 xxx - - [16/Sep/2002:14:52:24 +1000] "GET /courseware/cse5230/assets/images/week06.ppt HTTP/1.1" 200 527872  Clearly the URL that is requested depends on the URL that was requested before  If the user uses the “Back” button in his/her browser, the requested URL may depend on earlier URLs in the sequence too.  Given a particular observed URL, we can then calculate the probabilities of observing other possible URLs in the next click.  Note that we may even observe the same URL next.
  • 17. First-Order Markov Models In order to model processes such as these, we make use of the idea of states. At any time t, we consider the system to be in state w(t). We can consider a sequence of successive states -- this sequence has length T : wT = {w(1), w(2), …, w(T)} We can model the production of a particular sequence using transition probabilities: P( w j (t + 1) | wi (t )) = aij This aij is the probability that the system will be in state wj at time t+1 given that it was in state wi at time t.
  • 18. First-Order Markov Models, continued A model of states and transition probabilities, such as the one we have just described, is called a Markov Model. Since we have assumed that the transition probabilities depend only on the one immediately preceding state, this is a First- order Markov Model. Higher-order Markov models are possible (dependent on multiple previous states), but we will not consider them here. For example, Markov models for human speech could have states corresponding to the various phonemes: A Markov model for the word “cat” would have states for /k/, /a/, /t/ and a final silent state.
  • 19. Example: Markov Model for “cat” /k/ /a/ /t/ /silent/  /k/, /a/, and /t/ correspond to the phonemes (sounds of speech) that lead to the pronunciation of the word “cat”.  The probability that /a/ will follow /k/ is based upon a “dictionary” of speech usage: the probability that /a/ will follow /k/ is the frequency of occurrence of that “state transition” in the dictionary. Likewise, for the transition from /a/ to /t/, and from /t/ to the /silent/ state, the probability is calculated from the frequency of occurrence of that state transition in the dictionary.  These probabilities can be calculated (i.e., they are observed).  The “dictionary” may be the existing database (=training set).
• 20. So, what does this have to do with anything? (i.e., what does it have to do with Data Mining?)
 One type of data mining is “predictive”, and temporal data mining is often predictive.
 What are we predicting? Usually, one wants to predict what will happen next, or the probability that a certain event (e.g., an error condition, failure state, or medical condition) will occur.
 How does one make the prediction? By using prior frequencies of state transitions, based upon the existing (historical) sequences of data items (state transitions) in the temporal database.
 Thus, Markov Models can be applied to this form of predictive data mining. This is another example of Bayesian classification: probabilistic estimation based upon prior probabilities.
• 21. Hidden Markov Models (HMM)
 In the preceding “cat” example, we said that the states correspond to phonemes (the fundamental sounds of speech).
 In a speech-recognition system, however, we do not have direct access to phonemes; we can only measure properties of the sound produced by a particular speaker (these measurements form the training set).
 In general, our observed data do not correspond directly to an underlying state of the model: the data correspond only to the visible states of the system.
 The visible states are directly accessible for measurement.
 The system can also have internal “hidden” states, which cannot be observed directly.
 For each hidden state, there is a probability of observing each visible state.
 This sort of model is called a Hidden Markov Model (HMM).
• 22. Example: Coin Toss Experiments
 Let us imagine a scenario in which we are in a room that is divided in two by a curtain.
 We are on one side of the curtain; on the other side is an unseen person who will carry out a procedure using coins, resulting in a head (H) or a tail (T).
 When the person has carried out the procedure, they call out the result, H or T, which we record.
 This system allows us to generate a sequence of H’s and T’s, such as:
HHTHTHTTHTTTTTHHTHHHHTHHHTTHHHHHHTTTTTTTTHTHHTHTTTTTHHTHTHHHTHTHHTTTTHHTTTHHTHHTTTHTHTHTHTHHHTHHTTHT…
• 23. Example: A Single Fair Coin
 Imagine that the person behind the curtain has a single fair coin (i.e., it has equal probabilities of coming up heads H or tails T).
 We could model the process that produces the sequence of H’s and T’s as a Markov model with two states and equal transition probabilities:
[Diagram: two states, H and T; all four transition probabilities (H→H, H→T, T→H, T→T) are 0.5]
 Note that here the visible states correspond exactly to the internal states: the model is not hidden.
 Note also that states can transition to themselves. In other words, if a head is tossed on one turn, there is still a 50% chance that a head will be tossed on the next turn.
• 24. Modified Example: A Fair and a Biased Coin
 Now imagine a more complicated scenario. The unseen person behind the curtain has two coins, one fair and one biased [for example, P(T) = 0.9, i.e. 90% in favor of tossing a tail].
 (1) The person starts by picking a coin at random.
 (2) The person tosses the coin and calls out the result (H or T).
 (3) If at any time the result is H, the person switches coins.
 (4) Go back to step (2) and repeat.
 This modified process generates sequences like the following (a simulation sketch is given below):
TTTTTTTTTTTTTTTTTTTTTTTTHHTTTTTTTHHTTTTTTTTTTTTTTTTTTTTTTHHTTTTTTTTTHTTHTTHHTTTTTHHTTTTTTTTTTHHTTTTTTTTHTHHHTTTTTTTTTTTTTTHHTTTTTTTHTHTTTTTTTHHTTTTT…
 Note that this looks quite different from the sequence of tosses generated in the “fair coin” example (see Lecture Slide #23).
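A minimal simulation sketch (Python; the function and variable names are illustrative, not from the slides) of steps (1)–(4), which reproduces the tail-heavy behaviour shown above:

```python
# Simulate the two-coin procedure from steps (1)-(4) above.
# Fair coin: P(H) = 0.5; biased coin: P(T) = 0.9, i.e. P(H) = 0.1.
import random

random.seed(0)  # for a repeatable run

def toss(coin):
    p_heads = 0.5 if coin == "fair" else 0.1
    return "H" if random.random() < p_heads else "T"

def simulate(n_tosses):
    coin = random.choice(["fair", "biased"])    # step (1): pick a coin at random
    results = []
    for _ in range(n_tosses):
        result = toss(coin)                     # step (2): toss and call out H or T
        results.append(result)
        if result == "H":                       # step (3): on H, switch coins
            coin = "biased" if coin == "fair" else "fair"
    return "".join(results)                     # step (4) is the loop itself

print(simulate(80))  # mostly T's, with occasional short runs of H's
```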
• 25. A Fair and a Biased Coin, continued
 In this scenario, the visible state no longer corresponds exactly to the hidden state of the system:
 Visible state: the output of H or T (this is what we can see)
 Hidden state: which coin was tossed (this we cannot know)
 We can model this 2-coin process using an HMM:
[Diagram: hidden states “Fair coin” and “Biased coin”, with transition probabilities Fair→Fair 0.5, Fair→Biased 0.5, Biased→Biased 0.9, Biased→Fair 0.1; emission probabilities Fair: H 0.5, T 0.5 and Biased: H 0.1, T 0.9]
• 26. A Fair and a Biased Coin, continued
 We see from the diagram on the preceding slide that we have extended our model:
 The visible states (H and T) are now shown as well, together with the emission probabilities (the probabilities of tossing H or T).
 In addition to the internal states w(t) and state transition probabilities $a_{ij}$, we now have visible states v(t) and emission probabilities $b_{jk}$: $P(v_k(t) \mid w_j(t)) = b_{jk}$
 This is the probability that we will observe a specific visible state, given a particular internal (hidden) state.
 Note that the $b_{jk}$ do not need to be related to the $a_{ij}$ [they are coincidentally the same in the “fair coin and biased coin” example above, as a result of step (3) on Lecture Slide #24].
 We now have a full Hidden Markov Model (HMM).
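The full two-coin model can now be written down as two matrices. A sketch (the state and symbol orderings are our assumption):

```python
import numpy as np

# Hidden states: 0 = fair coin, 1 = biased coin.  Visible symbols: 0 = H, 1 = T.
A = np.array([[0.5, 0.5],     # a_ij: transition probabilities;
              [0.1, 0.9]])    # row i = state at t, column j = state at t+1

B = np.array([[0.5, 0.5],     # b_jk: emission probabilities;
              [0.1, 0.9]])    # row j = hidden state, column k = visible symbol

# Every row is a probability distribution, so each row must sum to 1.
assert np.allclose(A.sum(axis=1), 1.0) and np.allclose(B.sum(axis=1), 1.0)

# B happens to equal A here only because tossing H (an emission event) is
# exactly what triggers a coin switch (a transition event); in general the
# two matrices are unrelated.
```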
• 27. HMM: formal definition
 We can now give a more formal definition of a first-order Hidden Markov Model (adapted from [RaJ1986]):
 There is a finite number of (internal) states, N.
 At each time t, a new state is entered, based upon a transition probability distribution that depends on the state at time t − 1. Self-transitions are allowed.
 After each transition is made, a symbol is output, according to a probability distribution that depends only on the current state. There are thus N such output probability distributions.
 Estimating the number of states N, as well as the transition and emission probabilities for each state, is usually a very complex problem, but solutions do exist (e.g., the Baum–Welch algorithm estimates the probabilities for a fixed N).
 Reference: [RaJ1986] L. R. Rabiner and B. H. Juang, “An introduction to hidden Markov models,” IEEE ASSP Magazine, vol. 3, no. 1, pp. 4–16, January 1986.
• 28. Use of HMMs
 We have now seen what sorts of processes can be modeled using HMMs, and how an HMM is specified mathematically.
 We now consider how HMMs are actually used.
 Consider the two H and T sequences we saw in the previous examples:
 How could we decide which coin-toss system was most likely to have produced each sequence?
 To which system would you assign these sequences?
1: TTTHHTTTTTTTTTTTTTHHTTTTTTHHTTTHH
2: THHTTTHHHTTHTHTTHTHHTTHHHTTHTHTHT
3: THHTHTHTHTHHHTTHTTTHHTTHTTTTTHHHT
4: HTTTHTTHTTTTHTTTHHTTHTHTTTTTTTTHT
 We could answer this question using a Bayesian formulation, as sketched below.
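One way to make that decision concrete is to score each sequence under each model with the forward algorithm. A sketch (the algorithm is standard; the model parameters are taken from the examples above, and the uniform initial distribution pi is our assumption):

```python
# Sketch: score a sequence under two candidate models with the forward
# algorithm.  Model parameters come from the slides above; the uniform
# initial distribution pi is an assumption (coin picked at random).
import numpy as np

SYMBOL = {"H": 0, "T": 1}

def sequence_likelihood(seq, A, B, pi):
    """Forward algorithm: returns P(seq | model)."""
    alpha = pi * B[:, SYMBOL[seq[0]]]           # initialisation: alpha_1(j)
    for s in seq[1:]:
        alpha = (alpha @ A) * B[:, SYMBOL[s]]   # induction: sum over prior states
    return alpha.sum()                          # termination: sum over end states

# Model 1: the single fair coin (visible = hidden, deterministic emissions).
A_fair, B_fair = np.full((2, 2), 0.5), np.eye(2)
# Model 2: the fair/biased two-coin HMM.
A_two = np.array([[0.5, 0.5], [0.1, 0.9]])
B_two = np.array([[0.5, 0.5], [0.1, 0.9]])
pi = np.array([0.5, 0.5])

for seq in ("TTTHHTTTTTTTTTTTTTHHTTTTTTHHTTTHH",
            "THHTTTHHHTTHTHTTHTHHTTHHHTTHTHTHT"):
    print(seq,
          sequence_likelihood(seq, A_two, B_two, pi),
          sequence_likelihood(seq, A_fair, B_fair, pi))
```

We would expect the tail-heavy sequence 1 to score higher under the two-coin model, and a balanced sequence such as 2 to score higher under the single fair coin.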
• 29. Use of HMMs for classification
 HMMs are often used to classify sequences (temporal or not).
 To do this, a separate HMM is built and trained (i.e., its parameters are estimated) for each class of sequence in which we are interested.
 For example, we might have an HMM for each word in a speech-recognition system. The hidden states would correspond to phonemes, and the visible states to measured sound features.
 For a given observed sequence $v^T$, we estimate the probability that a given Hidden Markov Model $M_l$ generated it: $P(M_l \mid v^T) = \dfrac{P(v^T \mid M_l)\,P(M_l)}{P(v^T)}$ (Bayes’ theorem strikes again.)
 We assign the sequence to the model with the highest posterior probability (e.g., identifying a speaker by their speech patterns).
 The algorithms for calculating these probabilities can be found in various references [see [RaJ1986] on Lecture Slide #27; see also R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification (2nd edition), Wiley, 2000, pp. 128–138].
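Continuing the sketch above, the classification rule itself is short once the likelihoods are available (equal priors are assumed, and sequence_likelihood(), A_fair, B_fair, A_two, B_two, and pi are the hypothetical names from the previous sketch):

```python
def classify(seq, models, priors):
    """Assign seq to the model with the highest posterior P(M_l | seq).

    P(M_l | seq) is proportional to P(seq | M_l) * P(M_l); the evidence
    P(seq) is the same for every model, so it drops out of the argmax.
    """
    posteriors = {name: sequence_likelihood(seq, A, B, pi) * prior
                  for (name, (A, B, pi)), prior in zip(models.items(), priors)}
    return max(posteriors, key=posteriors.get)

models = {"fair coin": (A_fair, B_fair, pi),
          "two coins": (A_two, B_two, pi)}
print(classify("TTTHHTTTTTTTTTTTTTHHTTTTTTHHTTTHH", models, [0.5, 0.5]))
```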
• 31. Summary of Topics Covered - Lecture 8
Motivation for Temporal Data Mining (TDM)
Examples of Temporal Data
TDM Concepts
Sequence Mining: temporal association mining
Calendric Association Rules
Trend Dependencies
Frequent Episodes
Markov Models & Hidden Markov Models