SlideShare una empresa de Scribd logo
1 de 45
Descargar para leer sin conexión
Handling concept drift in data
       stream mining
Student: Manuel Martín Salvador
Supervisors: Luis M. de Campos and Silvia Acid




         Master in Soft Computing and Intelligent Systems
      Department of Computer Science and Artificial Intelligence
                       University of Granada
Who am I?
1. Current: PhD Student in Bournemouth University
2. Previous:
  ●   Computer Engineering in University of Granada
      (2004-2009)
  ●   Programmer and SCRUM Master in Fundación
      I+D del Software Libre (2009-2010)
  ●   Master in Soft Computing and Intelligent
      Systems in University of Granada (2010-2011)
  ●   Researcher in Department of Computer Science
      and Artificial Intelligence of UGR (2010-2012)
Index
1. Data streams
2. Online Learning
3. Evaluation
4. Taxonomy of methods
5. Contributions
6. MOA
7. Experimentation
8. Conclusions and future work


                                 3
Data streams
  1. Continous flow of instances.
       ●    In classification: instance = (a1 , a2 , …, an , c)
  2. Unlimited size
  3. May have changes in the underlying distribution of the
    data → concept drift




                                                                  4
Image: I. Žliobaitė thesis
Concept drifts
●   It happens when the data from a stream changes its
    probability distribution ПS1 to another ПS2. Potential
    causes:
    ●   Change in P(C)
    ●   Change in P(X|C)
    ●   Change in P(C|X)
●   Unpredictable
●   For example: spam


                                                             5
Gradual concept drift




                             6
Image: I. Žliobaitė thesis
Types of concept drifts




                              7
Image: D. Brzeziński thesis
Types of concept drifts




                              8
Image: D. Brzeziński thesis
Example: STAGGER
                          color=red   color=green    size=medium
Class=true if →              and           or              or
                         size=small   shape=cricle     size=large




                                                                    9

Image: Kolter & Maloof
Online learning (incremental)
●   Goal: incrementally learn a classifier at least as
    accurate as if it had been trained in batch
●   Requirements:
    1. Incremental
    2. Single pass
    3. Limited time and memory
    4. Any-time learning: availability of the model




                                                         10
Online learning (incremental)
●   Goal: incrementally learn a classifier at least as
    accurate as if it had been trained in batch
●   Requirements:
    1. Incremental
    2. Single pass
    3. Limited time and memory
    4. Any-time learning: availability of the model
●   Nice to have: deal with concept drift.


                                                         11
Evaluation
Several criteria:
   ●   Time → seconds
   ●   Memory → RAM/hour
   ●   Generalizability of the model → % success
   ●   Detecting concept drift → detected drifts, false
       positives and false negatives




                                                          12
Evaluation
Several criteria:
   ●   Time → seconds
   ●   Memory → RAM/hour
   ●   Generalizability of the model → % success
   ●   Detecting concept drift → detected drifts, false
       positives and false negatives


Problem: we can't use the traditional techniques for
evaluation (i.e. cross validation). → Solution: new
strategies.
                                                          13
Evaluation: prequential
●   Test y training each instance.                     errors
                                                 processed instances

●   Is a pessimistic estimator: holds the errors since the
    beginning of the stream. → Solution: forgetting
    mechanisms (sliding window and fading factor).
                       errors inside window
     Sliding window:       window size                                 …...


                        currentError⋅errors
     Fading factor:    1⋅processed instances                         …...




    Advantages: All instances are used for training.
                Useful for data streams with concept drifts.                  14
Evaluation: comparing




         Which method is better?   15
Evaluation: comparing




       Which method is better? → AUC   16
Evaluation: drift detection
●   First detected: correct.
●   Following detected: false positives.
●   Not detected: false negatives.
●   Distance = correct – real.




                                           17
Taxonomy of methods
                            ● Change detectors
  Learners with             ● Training windows
  triggers                  ● Adaptive sampling



   ✔ Advantages: can be used by any classification algorithm.
   ✗ Disadvantages: usually, once detected a change, they discard the old
     model and relearn a new one.




                                                                            18
Taxonomy of methods
                            ● Change detectors
  Learners with             ● Training windows
  triggers                  ● Adaptive sampling



   ✔ Advantages: can be used by any classification algorithm.
   ✗ Disadvantages: usually, once detected a change, they discard the old
     model and relearn a new one.

                            ● Adaptive ensembles
  Evolving                  ● Instance weighting

  Learners                  ● Feature space

                            ● Base model specific


   ✔ Advantages: they continually adapt the model over time
   ✗ Disadvantages: they don't detect changes.                              19
Contributions
●   Taxonomy: triggers → change detectors
    ●   MoreErrorsMoving
    ●   MaxMoving
    ●   Moving Average
        –   Heuristic 1
        –   Heuristic 2
        –   Hybrid heuristic: 1+2

●   P-chart with 3 levels: normal, warning and drift

                                                       20
Contributions: MoreErrorsMoving
●   n latest results of classification are monitored →
    History = {ei , ei+1 , …, ei+n} (i.e. 0,0,1,1)

●   History error rate:
●   The consecutive declines are controlled
●   At each time step:
    ●   If ci - 1 < ci (more errors) → declines++
    ●   If ci - 1 > ci (less errors) → declines=0
    ●   If ci - 1 = ci (same) → declines don't change
                                                         21
Contributions: MoreErrorsMoving
●   If consecutive declines > k → enable Warning
●   If consecutive declines > k+d → enable Drift
●   Otherwise → enable Normality




                                                   22
Contributions: MoreErrorsMoving
                        History = 8
                        Warning = 2
                        Drift = 4




                        Detected
                        drifts:
                        46 y 88




                        Distance to real drifts:
                        46-40 = 6
                        88-80 = 8



                                               23
Contributions: MaxMoving
●   n latest success accumulated rates are monitored since
    the last change
    ●   History={ai , ai+1 , …, ai+n} (i.e. H={2/5, 3/6, 4/7, 4/8})
●   History maximum:
●   The consecutive declines are controlled
●   At each time step:
    ●   If mi < mi - 1 → declines++
    ●   If mi > mi - 1 → declines=0
    ●   If mi = mi - 1 → declines don't change
                                                                      24
Contributions: MaxMoving
                           History = 4
                           Warning = 4
                           Drift = 8




                           Detected
                           drifts:
                           52 y 90




                           Distance to real drifts:
                           52-40 = 12
                           90-80 = 10



                                                  25
Contributions: Moving Average
      Goal: to smooth accuracy rates for better detection.




                                                             26
Contributions: Moving Average 1
●   m latest success accumulated rates are smoothed → Simple
    moving average (unweighted mean)




●   The consecutive declines are controlled
●   At each time step:
    ●   If st < st - 1 → declines++
    ●   If st > st - 1 → declines = 0
    ●   If st = st - 1 → declines don't change
                                                               27
Contributions: Moving Average 1
                         Smooth = 32
                         Warning = 4
                         Drift = 8




                         Detected
                         drifts:
                         49 y 91




                         Distance to real drifts:
                         49-40 = 9
                         91-80 = 11



                                                28
Contributions: Moving Average 2
●   History of size n with the smoothed success rates →
    History={si, si+1, …, si+n}
●   History maximum:
●   Difference between st and mt – 1 is monitored
●   At each time step:
    ●   If mt – 1 - st > u → enable Warning
    ●   If mt – 1 - st > v → enable Drift
    ●   Otherwise → enable Normality
●   Suitable for abrupt changes
                                                          29
Contributions: Moving Average 2
                         Smooth = 4
                         History = 32
                         Warning = 2%
                         Drift = 4%


                         Detected
                         drifts:
                         44 y 87




                         Distance to real drifts:
                         44-40 = 4
                         87-80 = 7



                                                30
Contributions: Moving Average Hybrid
 ●   Heuristics 1 and 2 are combined:
     ●   If Warning1 or Warning2 → enable Warning
     ●   If Drift1 or Drift2 → enable Drift
     ●   Otherwise → enable Normality




                                                    31
MOA: Massive Online Analysis
●   Framework for data stream mining. Algorithms for
    classification, regression and clustering.
●   University of Waikato → WEKA integration.
●   Graphical user interface and command line.
●   Data stream generators.
●   Evaluation methods (holdout and prequential).
●   Open source and free.



                                      http://moa.cs.waikato.ac.nz
                                                                    32
Experimentation
●   Our data streams:
    ●   5 synthetic with abrupt changes
    ●   2 synthetic with gradual changes
    ●   1 synthetic with noise
    ●   3 with real data




                                           33
Experimentation
●   Our data streams:
    ●   5 synthetic with abrupt changes
    ●   2 synthetic with gradual changes
    ●   1 synthetic with noise
    ●   3 with real data
●   Classification algorithm: Naive Bayes




                                            34
Experimentation
●   Our data streams:
    ●   5 synthetic with abrupt changes
    ●   2 synthetic with gradual changes
    ●   1 synthetic with noise
    ●   3 with real data
●   Classification algorithm: Naive Bayes
●   Detection methods:
        No detection         MovingAverage1
        MoreErrorsMoving     MovingAverage2
        MaxMoving            MovingAverageH
        DDM                  EDDM             35
Experimentation
●   Parameters tuning:
    ●   4 streams y 5 methods → 288 experiments




                                                  36
Experimentation
●   Parameters tuning:
    ●   4 streams y 5 methods → 288 experiments
●   Comparative study:
    ●   11 streams y 8+1 methods → 99 experiments




                                                    37
Experimentation
●   Parameters tuning:
    ●   4 streams y 5 methods → 288 experiments
●   Comparative study:
    ●   11 streams y 8+1 methods → 99 experiments
●   Evaluation: prequential




                                                    38
Experimentation
●   Parameters tuning:
    ●   4 streams y 5 methods → 288 experiments
●   Comparative study:
    ●   11 streams y 8+1 methods → 99 experiments
●   Evaluation: prequential
●   Measurements:
    ●   AUC: area under the curve of accumulated success
        rates
    ●   Number of correct drifts
    ●   Distance to drifts
    ●   False positives and false negatives                39
Experimentation: Agrawal




                           40
Experimentation: Electricity




                               41
Conclussions of experimentation
1. With abrupt changes:
  ●   More victories: DDM and MovingAverageH
  ●   Best in mean: MoreErrorsMoving → very responsive
2. With gradual changes:
  ●   Best: DDM and EDDM
  ●   Problem: many false positives → parameter tunning only with abrupt
      changes
3. With noise:
  ●   Only winner: DDM
  ●   Problem: noise sensitive → parameter tunning only with no-noise data
4. Real data:
  ●   Best: MovingAverage1 and MovingAverageH

                                                                             42
Conclussions of this work
1. Our methods are competitive, although sensitive to
  the parameters → Dynamic fit
2. Evaluation is not trivial → Standardization is needed
3. Large field of application in industry
4. Hot topic: last papers from 2011 + conferences




                                                           43
Future work
1. Dynamic adjustment of parameters.
2. Measuring the abruptness of change for:
  ●   Using differents forgetting mechanisms.
  ●   Setting the degree of change of the model.
3. Develop an incremental learning algorithm which
  allows partial changes of the model when a drift is
  detected.




                                                        44
Thank you




            45

Más contenido relacionado

La actualidad más candente

Performance Metrics for Machine Learning Algorithms
Performance Metrics for Machine Learning AlgorithmsPerformance Metrics for Machine Learning Algorithms
Performance Metrics for Machine Learning AlgorithmsKush Kulshrestha
 
Linear models for classification
Linear models for classificationLinear models for classification
Linear models for classificationSung Yub Kim
 
Logistic regression
Logistic regressionLogistic regression
Logistic regressionMartinHogg9
 
Dimensionality Reduction
Dimensionality ReductionDimensionality Reduction
Dimensionality Reductionmrizwan969
 
Stochastic Gradient Decent (SGD).pptx
Stochastic Gradient Decent (SGD).pptxStochastic Gradient Decent (SGD).pptx
Stochastic Gradient Decent (SGD).pptxShubham Jaybhaye
 
Feature Engineering
Feature EngineeringFeature Engineering
Feature EngineeringHJ van Veen
 
zernike moments for image classification
zernike moments for image classificationzernike moments for image classification
zernike moments for image classificationSandeep Kumar
 
Activation functions and Training Algorithms for Deep Neural network
Activation functions and Training Algorithms for Deep Neural networkActivation functions and Training Algorithms for Deep Neural network
Activation functions and Training Algorithms for Deep Neural networkGayatri Khanvilkar
 
Kernel density estimation (kde)
Kernel density estimation (kde)Kernel density estimation (kde)
Kernel density estimation (kde)Padma Metta
 
Outlier analysis and anomaly detection
Outlier analysis and anomaly detectionOutlier analysis and anomaly detection
Outlier analysis and anomaly detectionShantanuDeosthale
 
LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attent...
LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attent...LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attent...
LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attent...Po-Chuan Chen
 
Independent Component Analysis
Independent Component AnalysisIndependent Component Analysis
Independent Component AnalysisTatsuya Yokota
 
. An introduction to machine learning and probabilistic ...
. An introduction to machine learning and probabilistic .... An introduction to machine learning and probabilistic ...
. An introduction to machine learning and probabilistic ...butest
 
Spectral clustering
Spectral clusteringSpectral clustering
Spectral clusteringSOYEON KIM
 
Time series predictions using LSTMs
Time series predictions using LSTMsTime series predictions using LSTMs
Time series predictions using LSTMsSetu Chokshi
 
Introduction to Deep Learning
Introduction to Deep LearningIntroduction to Deep Learning
Introduction to Deep LearningOswald Campesato
 

La actualidad más candente (20)

Performance Metrics for Machine Learning Algorithms
Performance Metrics for Machine Learning AlgorithmsPerformance Metrics for Machine Learning Algorithms
Performance Metrics for Machine Learning Algorithms
 
Support Vector machine
Support Vector machineSupport Vector machine
Support Vector machine
 
Linear models for classification
Linear models for classificationLinear models for classification
Linear models for classification
 
Activation function
Activation functionActivation function
Activation function
 
Logistic regression
Logistic regressionLogistic regression
Logistic regression
 
Dimensionality Reduction
Dimensionality ReductionDimensionality Reduction
Dimensionality Reduction
 
Stochastic Gradient Decent (SGD).pptx
Stochastic Gradient Decent (SGD).pptxStochastic Gradient Decent (SGD).pptx
Stochastic Gradient Decent (SGD).pptx
 
Feature Engineering
Feature EngineeringFeature Engineering
Feature Engineering
 
zernike moments for image classification
zernike moments for image classificationzernike moments for image classification
zernike moments for image classification
 
Logistic regression
Logistic regressionLogistic regression
Logistic regression
 
Activation functions and Training Algorithms for Deep Neural network
Activation functions and Training Algorithms for Deep Neural networkActivation functions and Training Algorithms for Deep Neural network
Activation functions and Training Algorithms for Deep Neural network
 
Kernel density estimation (kde)
Kernel density estimation (kde)Kernel density estimation (kde)
Kernel density estimation (kde)
 
Outlier analysis and anomaly detection
Outlier analysis and anomaly detectionOutlier analysis and anomaly detection
Outlier analysis and anomaly detection
 
LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attent...
LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attent...LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attent...
LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attent...
 
Independent Component Analysis
Independent Component AnalysisIndependent Component Analysis
Independent Component Analysis
 
. An introduction to machine learning and probabilistic ...
. An introduction to machine learning and probabilistic .... An introduction to machine learning and probabilistic ...
. An introduction to machine learning and probabilistic ...
 
Spectral clustering
Spectral clusteringSpectral clustering
Spectral clustering
 
Missing data handling
Missing data handlingMissing data handling
Missing data handling
 
Time series predictions using LSTMs
Time series predictions using LSTMsTime series predictions using LSTMs
Time series predictions using LSTMs
 
Introduction to Deep Learning
Introduction to Deep LearningIntroduction to Deep Learning
Introduction to Deep Learning
 

Similar a Handling concept drift in data stream mining

How to easily find the optimal solution without exhaustive search using Genet...
How to easily find the optimal solution without exhaustive search using Genet...How to easily find the optimal solution without exhaustive search using Genet...
How to easily find the optimal solution without exhaustive search using Genet...Viach Kakovskyi
 
Machine learning using matlab.pdf
Machine learning using matlab.pdfMachine learning using matlab.pdf
Machine learning using matlab.pdfppvijith
 
adversarial robustness lecture
adversarial robustness lectureadversarial robustness lecture
adversarial robustness lectureMuhammadAhmedShah2
 
Optimization of power systems - old and new tools
Optimization of power systems - old and new toolsOptimization of power systems - old and new tools
Optimization of power systems - old and new toolsOlivier Teytaud
 
Tools for Discrete Time Control; Application to Power Systems
Tools for Discrete Time Control; Application to Power SystemsTools for Discrete Time Control; Application to Power Systems
Tools for Discrete Time Control; Application to Power SystemsOlivier Teytaud
 
Optimization (DLAI D4L1 2017 UPC Deep Learning for Artificial Intelligence)
Optimization (DLAI D4L1 2017 UPC Deep Learning for Artificial Intelligence)Optimization (DLAI D4L1 2017 UPC Deep Learning for Artificial Intelligence)
Optimization (DLAI D4L1 2017 UPC Deep Learning for Artificial Intelligence)Universitat Politècnica de Catalunya
 
Causality without headaches
Causality without headachesCausality without headaches
Causality without headachesBenoît Rostykus
 
Online machine learning in Streaming Applications
Online machine learning in Streaming ApplicationsOnline machine learning in Streaming Applications
Online machine learning in Streaming ApplicationsStavros Kontopoulos
 
Dimensionality Reduction | Machine Learning | CloudxLab
Dimensionality Reduction | Machine Learning | CloudxLabDimensionality Reduction | Machine Learning | CloudxLab
Dimensionality Reduction | Machine Learning | CloudxLabCloudxLab
 
Online Machine Learning: introduction and examples
Online Machine Learning:  introduction and examplesOnline Machine Learning:  introduction and examples
Online Machine Learning: introduction and examplesFelipe
 
Training DNN Models - II.pptx
Training DNN Models - II.pptxTraining DNN Models - II.pptx
Training DNN Models - II.pptxPrabhuSelvaraj15
 
Aaa ped-16-Unsupervised Learning: clustering
Aaa ped-16-Unsupervised Learning: clusteringAaa ped-16-Unsupervised Learning: clustering
Aaa ped-16-Unsupervised Learning: clusteringAminaRepo
 
ICML2012読み会 Scaling Up Coordinate Descent Algorithms for Large L1 regularizat...
ICML2012読み会 Scaling Up Coordinate Descent Algorithms for Large L1 regularizat...ICML2012読み会 Scaling Up Coordinate Descent Algorithms for Large L1 regularizat...
ICML2012読み会 Scaling Up Coordinate Descent Algorithms for Large L1 regularizat...sleepy_yoshi
 
Beyond Churn Prediction : An Introduction to uplift modeling
Beyond Churn Prediction : An Introduction to uplift modelingBeyond Churn Prediction : An Introduction to uplift modeling
Beyond Churn Prediction : An Introduction to uplift modelingPierre Gutierrez
 
Jay Yagnik at AI Frontiers : A History Lesson on AI
Jay Yagnik at AI Frontiers : A History Lesson on AIJay Yagnik at AI Frontiers : A History Lesson on AI
Jay Yagnik at AI Frontiers : A History Lesson on AIAI Frontiers
 
Making BIG DATA smaller
Making BIG DATA smallerMaking BIG DATA smaller
Making BIG DATA smallerTony Tran
 
1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...
1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...
1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...PyData
 
DRL #2-3 - Multi-Armed Bandits .pptx.pdf
DRL #2-3 - Multi-Armed Bandits .pptx.pdfDRL #2-3 - Multi-Armed Bandits .pptx.pdf
DRL #2-3 - Multi-Armed Bandits .pptx.pdfGulamSarwar31
 
Overview on Optimization algorithms in Deep Learning
Overview on Optimization algorithms in Deep LearningOverview on Optimization algorithms in Deep Learning
Overview on Optimization algorithms in Deep LearningKhang Pham
 

Similar a Handling concept drift in data stream mining (20)

How to easily find the optimal solution without exhaustive search using Genet...
How to easily find the optimal solution without exhaustive search using Genet...How to easily find the optimal solution without exhaustive search using Genet...
How to easily find the optimal solution without exhaustive search using Genet...
 
Machine learning using matlab.pdf
Machine learning using matlab.pdfMachine learning using matlab.pdf
Machine learning using matlab.pdf
 
adversarial robustness lecture
adversarial robustness lectureadversarial robustness lecture
adversarial robustness lecture
 
Optimization of power systems - old and new tools
Optimization of power systems - old and new toolsOptimization of power systems - old and new tools
Optimization of power systems - old and new tools
 
Tools for Discrete Time Control; Application to Power Systems
Tools for Discrete Time Control; Application to Power SystemsTools for Discrete Time Control; Application to Power Systems
Tools for Discrete Time Control; Application to Power Systems
 
Meetup_FGVA_Uplift @ Dataiku
Meetup_FGVA_Uplift @ DataikuMeetup_FGVA_Uplift @ Dataiku
Meetup_FGVA_Uplift @ Dataiku
 
Optimization (DLAI D4L1 2017 UPC Deep Learning for Artificial Intelligence)
Optimization (DLAI D4L1 2017 UPC Deep Learning for Artificial Intelligence)Optimization (DLAI D4L1 2017 UPC Deep Learning for Artificial Intelligence)
Optimization (DLAI D4L1 2017 UPC Deep Learning for Artificial Intelligence)
 
Causality without headaches
Causality without headachesCausality without headaches
Causality without headaches
 
Online machine learning in Streaming Applications
Online machine learning in Streaming ApplicationsOnline machine learning in Streaming Applications
Online machine learning in Streaming Applications
 
Dimensionality Reduction | Machine Learning | CloudxLab
Dimensionality Reduction | Machine Learning | CloudxLabDimensionality Reduction | Machine Learning | CloudxLab
Dimensionality Reduction | Machine Learning | CloudxLab
 
Online Machine Learning: introduction and examples
Online Machine Learning:  introduction and examplesOnline Machine Learning:  introduction and examples
Online Machine Learning: introduction and examples
 
Training DNN Models - II.pptx
Training DNN Models - II.pptxTraining DNN Models - II.pptx
Training DNN Models - II.pptx
 
Aaa ped-16-Unsupervised Learning: clustering
Aaa ped-16-Unsupervised Learning: clusteringAaa ped-16-Unsupervised Learning: clustering
Aaa ped-16-Unsupervised Learning: clustering
 
ICML2012読み会 Scaling Up Coordinate Descent Algorithms for Large L1 regularizat...
ICML2012読み会 Scaling Up Coordinate Descent Algorithms for Large L1 regularizat...ICML2012読み会 Scaling Up Coordinate Descent Algorithms for Large L1 regularizat...
ICML2012読み会 Scaling Up Coordinate Descent Algorithms for Large L1 regularizat...
 
Beyond Churn Prediction : An Introduction to uplift modeling
Beyond Churn Prediction : An Introduction to uplift modelingBeyond Churn Prediction : An Introduction to uplift modeling
Beyond Churn Prediction : An Introduction to uplift modeling
 
Jay Yagnik at AI Frontiers : A History Lesson on AI
Jay Yagnik at AI Frontiers : A History Lesson on AIJay Yagnik at AI Frontiers : A History Lesson on AI
Jay Yagnik at AI Frontiers : A History Lesson on AI
 
Making BIG DATA smaller
Making BIG DATA smallerMaking BIG DATA smaller
Making BIG DATA smaller
 
1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...
1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...
1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...
 
DRL #2-3 - Multi-Armed Bandits .pptx.pdf
DRL #2-3 - Multi-Armed Bandits .pptx.pdfDRL #2-3 - Multi-Armed Bandits .pptx.pdf
DRL #2-3 - Multi-Armed Bandits .pptx.pdf
 
Overview on Optimization algorithms in Deep Learning
Overview on Optimization algorithms in Deep LearningOverview on Optimization algorithms in Deep Learning
Overview on Optimization algorithms in Deep Learning
 

Más de Manuel Martín

Automatizando el aprendizaje basado en datos
Automatizando el aprendizaje basado en datosAutomatizando el aprendizaje basado en datos
Automatizando el aprendizaje basado en datosManuel Martín
 
Modelling Multi-Component Predictive Systems as Petri Nets
Modelling Multi-Component Predictive Systems as Petri NetsModelling Multi-Component Predictive Systems as Petri Nets
Modelling Multi-Component Predictive Systems as Petri NetsManuel Martín
 
Brand engagement with mobile gamification apps from a developer perspective
Brand engagement with mobile gamification apps from a developer perspectiveBrand engagement with mobile gamification apps from a developer perspective
Brand engagement with mobile gamification apps from a developer perspectiveManuel Martín
 
Effects of change propagation resulting from adaptive preprocessing in multic...
Effects of change propagation resulting from adaptive preprocessing in multic...Effects of change propagation resulting from adaptive preprocessing in multic...
Effects of change propagation resulting from adaptive preprocessing in multic...Manuel Martín
 
Improving transport timetables usability for mobile devices
Improving transport timetables usability for mobile devicesImproving transport timetables usability for mobile devices
Improving transport timetables usability for mobile devicesManuel Martín
 
Automating Machine Learning - Is it feasible?
Automating Machine Learning - Is it feasible?Automating Machine Learning - Is it feasible?
Automating Machine Learning - Is it feasible?Manuel Martín
 
Towards Automatic Composition of Multicomponent Predictive Systems
Towards Automatic Composition of Multicomponent Predictive SystemsTowards Automatic Composition of Multicomponent Predictive Systems
Towards Automatic Composition of Multicomponent Predictive SystemsManuel Martín
 
From sensor readings to prediction: on the process of developing practical so...
From sensor readings to prediction: on the process of developing practical so...From sensor readings to prediction: on the process of developing practical so...
From sensor readings to prediction: on the process of developing practical so...Manuel Martín
 
Quick presentation for the OpenML workshop in Eindhoven 2014
Quick presentation for the OpenML workshop in Eindhoven 2014Quick presentation for the OpenML workshop in Eindhoven 2014
Quick presentation for the OpenML workshop in Eindhoven 2014Manuel Martín
 
Online Detection of Shutdown Periods in Chemical Plants: A Case Study
Online Detection of Shutdown Periods in Chemical Plants: A Case StudyOnline Detection of Shutdown Periods in Chemical Plants: A Case Study
Online Detection of Shutdown Periods in Chemical Plants: A Case StudyManuel Martín
 
Artificial Intelligence for Automating Data Analysis
Artificial Intelligence for Automating Data AnalysisArtificial Intelligence for Automating Data Analysis
Artificial Intelligence for Automating Data AnalysisManuel Martín
 
Minería de secuencias de datos
Minería de secuencias de datosMinería de secuencias de datos
Minería de secuencias de datosManuel Martín
 
Minería de secuencias de datos
Minería de secuencias de datosMinería de secuencias de datos
Minería de secuencias de datosManuel Martín
 
AndalucíaPeople: Un sistema de recomendación para sitios de ocio de Andalucía
AndalucíaPeople: Un sistema de recomendación para sitios de ocio de AndalucíaAndalucíaPeople: Un sistema de recomendación para sitios de ocio de Andalucía
AndalucíaPeople: Un sistema de recomendación para sitios de ocio de AndalucíaManuel Martín
 
Operaciones Colectivas en MPI
Operaciones Colectivas en MPIOperaciones Colectivas en MPI
Operaciones Colectivas en MPIManuel Martín
 
Introducción a GNU/Linux
Introducción a GNU/LinuxIntroducción a GNU/Linux
Introducción a GNU/LinuxManuel Martín
 
Presentación Día de la Libertad del Software 2011
Presentación Día de la Libertad del Software 2011Presentación Día de la Libertad del Software 2011
Presentación Día de la Libertad del Software 2011Manuel Martín
 
Presentacion Taller de Introducción a Linux SFD2010
Presentacion Taller de Introducción a Linux SFD2010Presentacion Taller de Introducción a Linux SFD2010
Presentacion Taller de Introducción a Linux SFD2010Manuel Martín
 

Más de Manuel Martín (20)

Hogar (Des)Conectado
Hogar (Des)ConectadoHogar (Des)Conectado
Hogar (Des)Conectado
 
Automatizando el aprendizaje basado en datos
Automatizando el aprendizaje basado en datosAutomatizando el aprendizaje basado en datos
Automatizando el aprendizaje basado en datos
 
Modelling Multi-Component Predictive Systems as Petri Nets
Modelling Multi-Component Predictive Systems as Petri NetsModelling Multi-Component Predictive Systems as Petri Nets
Modelling Multi-Component Predictive Systems as Petri Nets
 
Brand engagement with mobile gamification apps from a developer perspective
Brand engagement with mobile gamification apps from a developer perspectiveBrand engagement with mobile gamification apps from a developer perspective
Brand engagement with mobile gamification apps from a developer perspective
 
Effects of change propagation resulting from adaptive preprocessing in multic...
Effects of change propagation resulting from adaptive preprocessing in multic...Effects of change propagation resulting from adaptive preprocessing in multic...
Effects of change propagation resulting from adaptive preprocessing in multic...
 
Improving transport timetables usability for mobile devices
Improving transport timetables usability for mobile devicesImproving transport timetables usability for mobile devices
Improving transport timetables usability for mobile devices
 
Automating Machine Learning - Is it feasible?
Automating Machine Learning - Is it feasible?Automating Machine Learning - Is it feasible?
Automating Machine Learning - Is it feasible?
 
Towards Automatic Composition of Multicomponent Predictive Systems
Towards Automatic Composition of Multicomponent Predictive SystemsTowards Automatic Composition of Multicomponent Predictive Systems
Towards Automatic Composition of Multicomponent Predictive Systems
 
From sensor readings to prediction: on the process of developing practical so...
From sensor readings to prediction: on the process of developing practical so...From sensor readings to prediction: on the process of developing practical so...
From sensor readings to prediction: on the process of developing practical so...
 
Quick presentation for the OpenML workshop in Eindhoven 2014
Quick presentation for the OpenML workshop in Eindhoven 2014Quick presentation for the OpenML workshop in Eindhoven 2014
Quick presentation for the OpenML workshop in Eindhoven 2014
 
Online Detection of Shutdown Periods in Chemical Plants: A Case Study
Online Detection of Shutdown Periods in Chemical Plants: A Case StudyOnline Detection of Shutdown Periods in Chemical Plants: A Case Study
Online Detection of Shutdown Periods in Chemical Plants: A Case Study
 
Artificial Intelligence for Automating Data Analysis
Artificial Intelligence for Automating Data AnalysisArtificial Intelligence for Automating Data Analysis
Artificial Intelligence for Automating Data Analysis
 
Minería de secuencias de datos
Minería de secuencias de datosMinería de secuencias de datos
Minería de secuencias de datos
 
Minería de secuencias de datos
Minería de secuencias de datosMinería de secuencias de datos
Minería de secuencias de datos
 
AndalucíaPeople: Un sistema de recomendación para sitios de ocio de Andalucía
AndalucíaPeople: Un sistema de recomendación para sitios de ocio de AndalucíaAndalucíaPeople: Un sistema de recomendación para sitios de ocio de Andalucía
AndalucíaPeople: Un sistema de recomendación para sitios de ocio de Andalucía
 
Decompiladores
DecompiladoresDecompiladores
Decompiladores
 
Operaciones Colectivas en MPI
Operaciones Colectivas en MPIOperaciones Colectivas en MPI
Operaciones Colectivas en MPI
 
Introducción a GNU/Linux
Introducción a GNU/LinuxIntroducción a GNU/Linux
Introducción a GNU/Linux
 
Presentación Día de la Libertad del Software 2011
Presentación Día de la Libertad del Software 2011Presentación Día de la Libertad del Software 2011
Presentación Día de la Libertad del Software 2011
 
Presentacion Taller de Introducción a Linux SFD2010
Presentacion Taller de Introducción a Linux SFD2010Presentacion Taller de Introducción a Linux SFD2010
Presentacion Taller de Introducción a Linux SFD2010
 

Último

EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024The Digital Insurer
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsNanddeep Nachan
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Angeliki Cooney
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdfSandro Moreira
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Victor Rentea
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityWSO2
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...apidays
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Zilliz
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 

Último (20)

EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 

Handling concept drift in data stream mining

  • 1. Handling concept drift in data stream mining Student: Manuel Martín Salvador Supervisors: Luis M. de Campos and Silvia Acid Master in Soft Computing and Intelligent Systems Department of Computer Science and Artificial Intelligence University of Granada
  • 2. Who am I? 1. Current: PhD Student in Bournemouth University 2. Previous: ● Computer Engineering in University of Granada (2004-2009) ● Programmer and SCRUM Master in Fundación I+D del Software Libre (2009-2010) ● Master in Soft Computing and Intelligent Systems in University of Granada (2010-2011) ● Researcher in Department of Computer Science and Artificial Intelligence of UGR (2010-2012)
  • 3. Index 1. Data streams 2. Online Learning 3. Evaluation 4. Taxonomy of methods 5. Contributions 6. MOA 7. Experimentation 8. Conclusions and future work 3
  • 4. Data streams 1. Continous flow of instances. ● In classification: instance = (a1 , a2 , …, an , c) 2. Unlimited size 3. May have changes in the underlying distribution of the data → concept drift 4 Image: I. Žliobaitė thesis
  • 5. Concept drifts ● It happens when the data from a stream changes its probability distribution ПS1 to another ПS2. Potential causes: ● Change in P(C) ● Change in P(X|C) ● Change in P(C|X) ● Unpredictable ● For example: spam 5
  • 6. Gradual concept drift 6 Image: I. Žliobaitė thesis
  • 7. Types of concept drifts 7 Image: D. Brzeziński thesis
  • 8. Types of concept drifts 8 Image: D. Brzeziński thesis
  • 9. Example: STAGGER color=red color=green size=medium Class=true if → and or or size=small shape=cricle size=large 9 Image: Kolter & Maloof
  • 10. Online learning (incremental) ● Goal: incrementally learn a classifier at least as accurate as if it had been trained in batch ● Requirements: 1. Incremental 2. Single pass 3. Limited time and memory 4. Any-time learning: availability of the model 10
  • 11. Online learning (incremental) ● Goal: incrementally learn a classifier at least as accurate as if it had been trained in batch ● Requirements: 1. Incremental 2. Single pass 3. Limited time and memory 4. Any-time learning: availability of the model ● Nice to have: deal with concept drift. 11
  • 12. Evaluation Several criteria: ● Time → seconds ● Memory → RAM/hour ● Generalizability of the model → % success ● Detecting concept drift → detected drifts, false positives and false negatives 12
  • 13. Evaluation Several criteria: ● Time → seconds ● Memory → RAM/hour ● Generalizability of the model → % success ● Detecting concept drift → detected drifts, false positives and false negatives Problem: we can't use the traditional techniques for evaluation (i.e. cross validation). → Solution: new strategies. 13
  • 14. Evaluation: prequential ● Test y training each instance. errors processed instances ● Is a pessimistic estimator: holds the errors since the beginning of the stream. → Solution: forgetting mechanisms (sliding window and fading factor). errors inside window Sliding window: window size …... currentError⋅errors Fading factor: 1⋅processed instances …... Advantages: All instances are used for training. Useful for data streams with concept drifts. 14
  • 15. Evaluation: comparing Which method is better? 15
  • 16. Evaluation: comparing Which method is better? → AUC 16
  • 17. Evaluation: drift detection ● First detected: correct. ● Following detected: false positives. ● Not detected: false negatives. ● Distance = correct – real. 17
  • 18. Taxonomy of methods ● Change detectors Learners with ● Training windows triggers ● Adaptive sampling ✔ Advantages: can be used by any classification algorithm. ✗ Disadvantages: usually, once detected a change, they discard the old model and relearn a new one. 18
  • 19. Taxonomy of methods ● Change detectors Learners with ● Training windows triggers ● Adaptive sampling ✔ Advantages: can be used by any classification algorithm. ✗ Disadvantages: usually, once detected a change, they discard the old model and relearn a new one. ● Adaptive ensembles Evolving ● Instance weighting Learners ● Feature space ● Base model specific ✔ Advantages: they continually adapt the model over time ✗ Disadvantages: they don't detect changes. 19
  • 20. Contributions ● Taxonomy: triggers → change detectors ● MoreErrorsMoving ● MaxMoving ● Moving Average – Heuristic 1 – Heuristic 2 – Hybrid heuristic: 1+2 ● P-chart with 3 levels: normal, warning and drift 20
  • 21. Contributions: MoreErrorsMoving ● n latest results of classification are monitored → History = {ei , ei+1 , …, ei+n} (i.e. 0,0,1,1) ● History error rate: ● The consecutive declines are controlled ● At each time step: ● If ci - 1 < ci (more errors) → declines++ ● If ci - 1 > ci (less errors) → declines=0 ● If ci - 1 = ci (same) → declines don't change 21
  • 22. Contributions: MoreErrorsMoving ● If consecutive declines > k → enable Warning ● If consecutive declines > k+d → enable Drift ● Otherwise → enable Normality 22
  • 23. Contributions: MoreErrorsMoving History = 8 Warning = 2 Drift = 4 Detected drifts: 46 y 88 Distance to real drifts: 46-40 = 6 88-80 = 8 23
  • 24. Contributions: MaxMoving ● n latest success accumulated rates are monitored since the last change ● History={ai , ai+1 , …, ai+n} (i.e. H={2/5, 3/6, 4/7, 4/8}) ● History maximum: ● The consecutive declines are controlled ● At each time step: ● If mi < mi - 1 → declines++ ● If mi > mi - 1 → declines=0 ● If mi = mi - 1 → declines don't change 24
  • 25. Contributions: MaxMoving History = 4 Warning = 4 Drift = 8 Detected drifts: 52 y 90 Distance to real drifts: 52-40 = 12 90-80 = 10 25
  • 26. Contributions: Moving Average Goal: to smooth accuracy rates for better detection. 26
  • 27. Contributions: Moving Average 1 ● m latest success accumulated rates are smoothed → Simple moving average (unweighted mean) ● The consecutive declines are controlled ● At each time step: ● If st < st - 1 → declines++ ● If st > st - 1 → declines = 0 ● If st = st - 1 → declines don't change 27
  • 28. Contributions: Moving Average 1 Smooth = 32 Warning = 4 Drift = 8 Detected drifts: 49 y 91 Distance to real drifts: 49-40 = 9 91-80 = 11 28
  • 29. Contributions: Moving Average 2 ● History of size n with the smoothed success rates → History={si, si+1, …, si+n} ● History maximum: ● Difference between st and mt – 1 is monitored ● At each time step: ● If mt – 1 - st > u → enable Warning ● If mt – 1 - st > v → enable Drift ● Otherwise → enable Normality ● Suitable for abrupt changes 29
  • 30. Contributions: Moving Average 2 Smooth = 4 History = 32 Warning = 2% Drift = 4% Detected drifts: 44 y 87 Distance to real drifts: 44-40 = 4 87-80 = 7 30
  • 31. Contributions: Moving Average Hybrid ● Heuristics 1 and 2 are combined: ● If Warning1 or Warning2 → enable Warning ● If Drift1 or Drift2 → enable Drift ● Otherwise → enable Normality 31
  • 32. MOA: Massive Online Analysis ● Framework for data stream mining. Algorithms for classification, regression and clustering. ● University of Waikato → WEKA integration. ● Graphical user interface and command line. ● Data stream generators. ● Evaluation methods (holdout and prequential). ● Open source and free. http://moa.cs.waikato.ac.nz 32
  • 33. Experimentation ● Our data streams: ● 5 synthetic with abrupt changes ● 2 synthetic with gradual changes ● 1 synthetic with noise ● 3 with real data 33
  • 34. Experimentation ● Our data streams: ● 5 synthetic with abrupt changes ● 2 synthetic with gradual changes ● 1 synthetic with noise ● 3 with real data ● Classification algorithm: Naive Bayes 34
  • 35. Experimentation ● Our data streams: ● 5 synthetic with abrupt changes ● 2 synthetic with gradual changes ● 1 synthetic with noise ● 3 with real data ● Classification algorithm: Naive Bayes ● Detection methods: No detection MovingAverage1 MoreErrorsMoving MovingAverage2 MaxMoving MovingAverageH DDM EDDM 35
  • 36. Experimentation ● Parameters tuning: ● 4 streams y 5 methods → 288 experiments 36
  • 37. Experimentation ● Parameters tuning: ● 4 streams y 5 methods → 288 experiments ● Comparative study: ● 11 streams y 8+1 methods → 99 experiments 37
  • 38. Experimentation ● Parameters tuning: ● 4 streams y 5 methods → 288 experiments ● Comparative study: ● 11 streams y 8+1 methods → 99 experiments ● Evaluation: prequential 38
  • 39. Experimentation ● Parameters tuning: ● 4 streams y 5 methods → 288 experiments ● Comparative study: ● 11 streams y 8+1 methods → 99 experiments ● Evaluation: prequential ● Measurements: ● AUC: area under the curve of accumulated success rates ● Number of correct drifts ● Distance to drifts ● False positives and false negatives 39
  • 42. Conclussions of experimentation 1. With abrupt changes: ● More victories: DDM and MovingAverageH ● Best in mean: MoreErrorsMoving → very responsive 2. With gradual changes: ● Best: DDM and EDDM ● Problem: many false positives → parameter tunning only with abrupt changes 3. With noise: ● Only winner: DDM ● Problem: noise sensitive → parameter tunning only with no-noise data 4. Real data: ● Best: MovingAverage1 and MovingAverageH 42
  • 43. Conclussions of this work 1. Our methods are competitive, although sensitive to the parameters → Dynamic fit 2. Evaluation is not trivial → Standardization is needed 3. Large field of application in industry 4. Hot topic: last papers from 2011 + conferences 43
  • 44. Future work 1. Dynamic adjustment of parameters. 2. Measuring the abruptness of change for: ● Using differents forgetting mechanisms. ● Setting the degree of change of the model. 3. Develop an incremental learning algorithm which allows partial changes of the model when a drift is detected. 44
  • 45. Thank you 45