SlideShare una empresa de Scribd logo
1 de 60
Descargar para leer sin conexión
Modelling globalized systems:
  challenges and examples


          Cécile Germain-Renaud
  Laboratoire de Recherche en Informatique
     Université Paris Sud, CNRS, INRIA
      Results from the GO collaboration
Outline


   ü  Globalized          systems




Franco-Taiwanese meeting              10 September 2012
Remember tomorrow


       A computational grid is a
       hardware and software
       infrastructure that provides
       dependable, consistent,
       pervasive, and inexpensive
       access to high-end
       computational capabilities

   Ian Foster, 1998




Franco-Taiwanese meeting                 10 September 2012
The Clouds take me higher


   Cloud computing is a model for enabling convenient, on-demand network
   access to a shared pool of configurable computing resources (e.g., networks,
   servers, storage, applications, and services) that can be rapidly provisioned
   and released with minimal management effort or service provider
   interaction. NIST (US National Institute of Standards and Technology)
   definition of clouds




Franco-Taiwanese meeting                                  10 September 2012
Yesterday's sorrows

                           How we configure our grids (EGEE 09)




Franco-Taiwanese meeting                                    10 September 2012
Tomorrow's white lies?




Franco-Taiwanese meeting                      10 September 2012
Globalized systems
                           Grid                Data Center          Cloud

  Distribution             Very large          Any                  Moderate

  Sharing                  Virtual             No                   Isolation –
                           Organisations –                          individualized
                           collective rights                        access
                           and control
  Large data (file)        Yes                 Yes                  Yes

  Big Data                 No                  Yes                  Yes
  (indexed)
  Economics                Long-term SLAs      Proprietary or       Pay as you go
                                               usual
                                               commercial
                                               contract


Franco-Taiwanese meeting                                        10 September 2012
Outline


   ü  Globalized          systems
   ü  Challenges




Franco-Taiwanese meeting              10 September 2012
The challenges


     We need to show that the research has
     verifiable and positive impact on
     production systems

   Demonstrating impact on complex systems
    —  requires experimental data
    —  raises serious scientific issues



Franco-Taiwanese meeting                    10 September 2012
The Grid Observatory


   —  Digital curation of the
       behavioural data of
       the EGI grid: observe
       and publish
   —  Complex systems
       description
   —  Models, optimization,
       Autonomics




Franco-Taiwanese meeting                    10 September 2012
Why the EGEE/EGI grid?
     Accessible globalized production system




      LHC is the                EGEE/EGI is the             Atlas
•    Largest                •  Largest (40K             Collaboration
     (26km),                    CPUs),                  (one in four)
•    Fastest(14TeV)         •  Most distributed    •  3000 scientists
•    Coldest (1.9K)             (250 sites),
                                                   •  38 countries
•    Emptiest               •  Most used (300K
     (10−13 atm)                jobs/day)          •  174 universities
                               Computer system        and labs
       machine.
 Franco-Taiwanese meeting                         10 September 2012
The Green Computing Observatory




                           2GBytes/day at 1 minute sampling period


Franco-Taiwanese meeting                           10 September 2012
The Grid Observatory collaboration
   —  Born in EGEE-III, now a collaborative effort of
       —    CNRS/UPS Laboratoire de Recherche en Informatique
       —    CNRS/UPS Laboratoire de l'Accélérateur Linéaire
       —    Imperial College London
       —    France Grilles – French NGI of EGI
       —    EGI-Inspire
       —    Ile de France council
       —        (Software and Complex Systems programme)
       —    INRIA – Saclay (ADT programme)
       —    CNRS (PEPS programme)
       —    University Paris Sud (MRM programme)

   —  Scientific Collaborations
       —    NSF Center for Autonomic Computing
       —    European Middleware Initiative
       —    Institut des Systèmes Complexes
       —    Cardiff University



Franco-Taiwanese meeting                                 10 September 2012
Globalized systems are complex ones


   Dynamic(al) system
       —  Entities change
           behavior as an
           effect of unexpected
           feedbacks,
           emergent behavior
       —  Organized self-
           criticality, minority
           games,...




Franco-Taiwanese meeting           10 September 2012
Predicting the response time




Franco-Taiwanese meeting          10 September 2012
Globalized systems are complex ones


   Lack of complete and
   common knowledge –
   Information uncertainty
       —  Monitoring is distributed
           too
       —  Resolution and
           calibration




Franco-Taiwanese meeting               10 September 2012
Outline


   ü  Globalized            systems
   ü  Challenges

   ü  Towards             realistic behavioural
       models



Franco-Taiwanese meeting                   10 September 2012
Issue I: Fundamentals in statistics


   —  “unusual” statistics: which metrics?
   —  Are our systems stationary?




Franco-Taiwanese meeting         10 September 2012
Metrics
          Root Mean Squared Error is inadequate




Franco-Taiwanese meeting             10 September 2012
Metrics
              Should make sense for the end user




Franco-Taiwanese meeting                10 September 2012
The ROC metrics: à la BQP


                                         —  Evaluation of binary
                                            predictors: False positives
                                            vs true positive curve

                                         —  Intervals of the response
                                            time define as many binary
                                            predictors

                                         —  Intervals of increasing size
                                         —  gLite prediction is
                                            definitely better than
                                            random


       [C. Germain-Renaud et al. The Grid Observatory. CCGRID 2011]
Franco-Taiwanese meeting                             10 September 2012
Statistical significance
         Extreme values may dominate the statistics




Franco-Taiwanese meeting                  10 September 2012
More on statistical significance
                           Can we predict anything?
                 Maybe as difficult as earthquakes and markets




Franco-Taiwanese meeting                            10 September 2012
A few keywords
  Heavy tail
                                                    Self-similarity


                                                    Long range dependence
     Heteroskedasticity




                            Harold Edwin Hurst
                               1880-1978




Franco-Taiwanese meeting                         10 September 2012
Stationarity
Joint probability distribution of the time series does not change when shifted




Franco-Taiwanese meeting                              10 September 2012
Do naïve statistics make sense?


       Non-stationarity and long-range dependence can
       easily be confused
       —  The Hurst effect under trends. J. Appl. Probab., 20(3), 1983.
       —  Occasional structural breaks and long memory with an
            application to the S&P 500 absolute stock returns. J. Empirical
            Finance, 11(3), 2004.
       —  Testing for long-range dependence in the presence of shifting
           means or a slowly declining trend, using a variance-type
           estimator. J. Time Ser. Anal., 18(3), 1997.
       —  Long memory and regime switching. J. Econometrics, 105(1),
            2001.




Franco-Taiwanese meeting                                10 September 2012
Do naïve statistics make sense?


   The “physical” process is not stationary

   —  Trends: Rogers’s curve of adoption
   —  Technology innovations
   —  Real-world events
       —  Experimental discoveries
       —  Slashdotted accesses



             NON-STATIONARITY   IS A REASONABLE ALTERNATIVE



Franco-Taiwanese meeting                      10 September 2012
Dealing with non-stationarity

   1.  Statistical testing
   —  Sequential jump
       detection
   —  Theoretical guarantees
       for known distributions
   —  Predictive, not
       generative
   —  Example: blackhole
       detection
   —  Calibration and
       Validation: by the Expert


Franco-Taiwanese meeting            10 September 2012
Dealing with non-stationarity

   2.  Segmentation
   —  Fit a piecewise time-
       series: infer the
       parameters of the
       local models and the
       breakpoints
   —  Model selection: AIC,
       MDL,… – based
   —  a priori hypotheses on
       the segment models:      [Towards non stationary Grid
       AR, ARMA, FARMA,…        Models, JoGC Dec. 2011]




Franco-Taiwanese meeting                    10 September 2012
Dealing with non-stationarity


   2.  Segmentation
   —  Mostly off-line and
       computationally
       expensive: generative,
       explanatory models
   —  Validation is not trivial
       —  Fit quality
       —  Stability: bootstrapping
       —  Randomized
            optimization: clustering
            the results
   —  Hints at global behavior

Franco-Taiwanese meeting               10 September 2012
Dealing with non-stationarity


   3.  Adaptive clustering:
   —  Adaptive: on-line rupture detection
   —  Back to statistical testing, but on the model, not on the data




 [Toward Autonomic Grids: Analyzing the Job
 Flow with Affinity Streaming". SIGKDD'2009]
Franco-Taiwanese meeting                           10 September 2012
Remember tomorrow (5 years later)


       Systems manage themselves
       according to an
       administrator’s goals. New
       components integrate as
       effortlessly as a new cell
       establishes itself in the
       human body

       J. Kephart and David M.
       Chess, the Autonomic
       Computing Manifesto,
       2003




Franco-Taiwanese meeting            10 September 2012
Issue II: Intelligibility

                                           S   E
                                 Autonomic Manager
                                 Analyze           Plan



                           Monitor    Knowledge           Execute



                                           S   E
                                 Managed Element



Franco-Taiwanese meeting                                       10 September 2012
Issue II: Intelligibility


       How to build the knowledge?                             Blackhole

   —  No Gold Standard, too rare
       experts




                                           Software
                                             fault



Franco-Taiwanese meeting                        10 September 2012
Issue II: Intelligibility


       How to build the knowledge?
   —  No Gold Standard, too rare
       experts
   —  Let’s build it on-line! Model-
       free policies eg
       Reinforcement Learning!
   —  Unfortunately, tabula rasa
       policies and vanilla ML            Exploration/exploitation
       methods are too often                      tradeoff
       defeated (Rish & Tesauro
       ICML 2006, Tesauro)
Franco-Taiwanese meeting                        10 September 2012
… Issue II: Intelligibility


   —  Transaction traces are text files, thus we can infer
       causes from data as latent topics, in the spirit of
       text mining.
       [Characterizing E-Science File Access Behavior via
       Latent Dirichlet Allocation, UCC 2011]

   —  The internals of a globalized system might be so
        complex that it might be more effective to consider
        it as a black box, but the causes of failures or
        performance can be elucidated from external
        observation.
       [Distributed Monitoring with Collaborative Prediction.
       CCGrid 2012]

Franco-Taiwanese meeting                      10 September 2012
Latent Dirichlet Allocation…

   —  A corpus is a set of documents, each built over a
       dictionary (set of words)
       —  A document is characterized by a mixture distribution over
           topics. Best example of topics: scientific keywords
       —  A topic is characterized by a distribution over words.
       —  The only observables are words.
       —  Bag of words - interchangeability
   —  LDA is a generative model
       —  For each document, choose the topic distribution.
       —  For each topic, choose a word distribution.
       —  For each word, choose:
            —  the topic along the selected topic distribution
            —  the word along the selected word distribution for this topic


Franco-Taiwanese meeting                                10 September 2012
…Latent Dirichlet Allocation


   M is the number of documents, N the size of a document




                                                    β	





                 α	

       θ	

            t               x N
                                                              M


Franco-Taiwanese meeting                                    10 September 2012
Analogy


   —  An analogy between text corpora and transaction
       traces
       —  Corpus ∼ Complete trace
       —  Document ∼ Segment of a trace (phase)
       —  Topic ∼ Activity
       —  Word ∼ Filename




Franco-Taiwanese meeting                     10 September 2012
And differences


   —  Unlike text corpora, trace files have...
       —  No natural segmentation.
       —  No well established, predefined set of activities
            equivalent to a set of topics.

   —  This work makes crude assumptions to avoid
       dealing with these issues.
       —  1 week phase
       —  Arbitrarily fix the number of activities




Franco-Taiwanese meeting                          10 September 2012
Inference and parameter estimation


   —  Exact algorithms are intractable due to coupling
       between θ and β

   —  Alternating variational EM for the MLE estimates of
       α and β [Blei,Ng,Jordan, JMLR 2003]

   —  Gibbs sampling for estimating θ and Φ
       [Griffiths&Steyvers, Procs Nat Academy Science
       2004]

   —  …



Franco-Taiwanese meeting                  10 September 2012
A tentative simpler model


   —  User data is included in each transaction thus is
       observable

   —  Assume each activity is associated with a unique
       user.

   —  Estimation and inference is much easier than
       standard LDA

   —  Goal: check the validity of this assumption




Franco-Taiwanese meeting                   10 September 2012
Experimental results


       —  Synthetic trace generated using the estimated
           parameters of the 2 models.
       —  2M different files. 63 activities (standard LDA,
           number of clustered users), 262 activities (observed).
       —  File popularity: χ-square test gives p-value of 1




Franco-Taiwanese meeting                       10 September 2012
Ongoing work


   —  Inferred segmentation: Probabilistic Context-Free
       Grammar




Franco-Taiwanese meeting                  10 September 2012
Fault management


   Operational motivation: all (CExSE) pairs tests
                      256 CEs                        121 SEs




                                  lcg_cr/srm_ls




Franco-Taiwanese meeting                          10 September 2012
All (CExSE) pairs tests




           256 CEs




                               121 SEs


Franco-Taiwanese meeting                 10 September 2012
Goal: Detection/Diagnosis, or Prediction?


    Detection/diagnosis: define a minimal set of probes
    that discovers all / any faulty component

    Equivalent to the minimum cover set problem

    Assumes that we know the internal dependencies




 Franco-Taiwanese meeting                10 September 2012
Assumes that we know the internal dependencies




Franco-Taiwanese meeting               10 September 2012
Goal
     Prediction! More precisely
     —  (Minimal) probe selection:
         choose which subset of the
         (CE,SE) pairs will actually be
         tested                           Less probes
     —  Prediction: predict the
         availability of all (CE,SE)
         pairs from a small number
         of them.




Franco-Taiwanese meeting                  10 September 2012
All (CExSE) pairs tests

      A case for collaborative filtering




           256 CEs
           ~ users




                               121 SEs ~items


Franco-Taiwanese meeting                        10 September 2012
Collaborative filtering


   —  Major aplication: recommendation systems eg
       netflix challenge

   —  Neigborhood approach
   —  Latent factor models approach
       —  Transform items AND users into the same latent
           factors space
       —  Factors are inferred from data
       —  Better if interpretable eg comedy, drama, action,
           scenery, music,… but this is another task
   Latent topics: LDA                 Implicit mapping to high-
                                      dimensional space: SVM
Franco-Taiwanese meeting                        10 September 2012
Maximum Margin Matrix factorization
   (Srebro, Rennie, Jaakkola, NIPS 2005)
   —  Linear factor model
   X the observed nxm sparse matrix
                     X = UV               U is nxk, V is kxm
               each line i of U is a feature vector (« tastes » of user i)
               each column j of V is a linear predictor for movie j
   —  Low-rank CP: regularizing by the rank k
   Trace (or Frobenius) norm as a convex surrogate for rank
   —  With uniform sample selection, theoretical bounds on
       misclassification error: learning both U and V is within
       log factors of learning only one

Franco-Taiwanese meeting                              10 September 2012
Maximum Margin Matrix factorization
   (Srebro, Rennie, Jaakkola, NIPS 2005)
   —  Linear factor model
   —  Low-rank CP: regularizing by the rank k
   Trace (or Frobenius) norm as a convex surrogate for rank




   —  With uniform sample selection, theoretical bounds on
       misclassification error: learning both U and V is within
       log factors of learning only one

Franco-Taiwanese meeting                         10 September 2012
MMMF with Active Learning


                            —  Rish & Tesauro, 2007
                            —  Min margin selection: get
                               the label for the most
                               uncertain entry

                            —  More applicable to system
                               problems than to movies
                               ratings

                            —  In our case, just launch a
                               probe




Franco-Taiwanese meeting                10 September 2012
Evaluation
   For once, we have ground truth: 51 days (March-April
   2011) of all (CExSE) probes outcomes




                                                    Less probes




Franco-Taiwanese meeting                10 September 2012
Evaluation
         —  Systematic failures: excellent results, too easy problem

         —  Without systematic failures: accuracy is excellent, but not a
                       significant performance indicator

         —  MMMF-based Active probing
             —  provides good results
             —  outperforms M3F, a combined low-rank/latent topic
                                method
                        
                       
                                              
                                             
                                                                                    
                                                                                                   
                                                                                                                          
                                                                                                                         
                                                                                                                                              


                                                                                                




                                                                                            




                                                                                            
  




                                                                                    


                                                                                            




                                                                                            




                                                                                                
                                                                                                                                                    
                                                                                                           


Franco-Taiwanese meeting                                                                                                 10 September 2012
Outline


   ü  Globalized            systems
   ü  Scientific           challenges
   ü  Towards             realistic behavioural
       models
   ü  Conclusion            and questions


Franco-Taiwanese meeting                   10 September 2012
Franco-Taiwanese meeting   10 September 2012
How to

—  Get an account                           —  Download files




                            www.grid-observatory.org

 Franco-Taiwanese meeting                          10 September 2012
Questions ?




Franco-Taiwanese meeting                 10 September 2012

Más contenido relacionado

La actualidad más candente

Deep Video Object Tracking 2020 - Xavier Giro - UPC TelecomBCN Barcelona
Deep Video Object Tracking 2020 - Xavier Giro - UPC TelecomBCN BarcelonaDeep Video Object Tracking 2020 - Xavier Giro - UPC TelecomBCN Barcelona
Deep Video Object Tracking 2020 - Xavier Giro - UPC TelecomBCN BarcelonaUniversitat Politècnica de Catalunya
 
Deep Learning Architectures for Video - Xavier Giro - UPC Barcelona 2019
Deep Learning Architectures for Video - Xavier Giro - UPC Barcelona 2019Deep Learning Architectures for Video - Xavier Giro - UPC Barcelona 2019
Deep Learning Architectures for Video - Xavier Giro - UPC Barcelona 2019Universitat Politècnica de Catalunya
 
Self-supervised Audiovisual Learning - Xavier Giro - UPC Barcelona 2019
Self-supervised Audiovisual Learning - Xavier Giro - UPC Barcelona 2019Self-supervised Audiovisual Learning - Xavier Giro - UPC Barcelona 2019
Self-supervised Audiovisual Learning - Xavier Giro - UPC Barcelona 2019Universitat Politècnica de Catalunya
 
Video Saliency Prediction with Deep Neural Networks - Juan Jose Nieto - DCU 2019
Video Saliency Prediction with Deep Neural Networks - Juan Jose Nieto - DCU 2019Video Saliency Prediction with Deep Neural Networks - Juan Jose Nieto - DCU 2019
Video Saliency Prediction with Deep Neural Networks - Juan Jose Nieto - DCU 2019Universitat Politècnica de Catalunya
 
Self-supervised Audiovisual Learning 2020 - Xavier Giro-i-Nieto - UPC Telecom...
Self-supervised Audiovisual Learning 2020 - Xavier Giro-i-Nieto - UPC Telecom...Self-supervised Audiovisual Learning 2020 - Xavier Giro-i-Nieto - UPC Telecom...
Self-supervised Audiovisual Learning 2020 - Xavier Giro-i-Nieto - UPC Telecom...Universitat Politècnica de Catalunya
 

La actualidad más candente (8)

Deep Video Object Tracking 2020 - Xavier Giro - UPC TelecomBCN Barcelona
Deep Video Object Tracking 2020 - Xavier Giro - UPC TelecomBCN BarcelonaDeep Video Object Tracking 2020 - Xavier Giro - UPC TelecomBCN Barcelona
Deep Video Object Tracking 2020 - Xavier Giro - UPC TelecomBCN Barcelona
 
Deep Learning Architectures for Video - Xavier Giro - UPC Barcelona 2019
Deep Learning Architectures for Video - Xavier Giro - UPC Barcelona 2019Deep Learning Architectures for Video - Xavier Giro - UPC Barcelona 2019
Deep Learning Architectures for Video - Xavier Giro - UPC Barcelona 2019
 
Deep Video Object Tracking - Xavier Giro - UPC Barcelona 2019
Deep Video Object Tracking - Xavier Giro - UPC Barcelona 2019Deep Video Object Tracking - Xavier Giro - UPC Barcelona 2019
Deep Video Object Tracking - Xavier Giro - UPC Barcelona 2019
 
Self-supervised Audiovisual Learning - Xavier Giro - UPC Barcelona 2019
Self-supervised Audiovisual Learning - Xavier Giro - UPC Barcelona 2019Self-supervised Audiovisual Learning - Xavier Giro - UPC Barcelona 2019
Self-supervised Audiovisual Learning - Xavier Giro - UPC Barcelona 2019
 
Video Saliency Prediction with Deep Neural Networks - Juan Jose Nieto - DCU 2019
Video Saliency Prediction with Deep Neural Networks - Juan Jose Nieto - DCU 2019Video Saliency Prediction with Deep Neural Networks - Juan Jose Nieto - DCU 2019
Video Saliency Prediction with Deep Neural Networks - Juan Jose Nieto - DCU 2019
 
Video Analysis (D4L2 2017 UPC Deep Learning for Computer Vision)
Video Analysis (D4L2 2017 UPC Deep Learning for Computer Vision)Video Analysis (D4L2 2017 UPC Deep Learning for Computer Vision)
Video Analysis (D4L2 2017 UPC Deep Learning for Computer Vision)
 
Intro til mfs service
Intro til mfs serviceIntro til mfs service
Intro til mfs service
 
Self-supervised Audiovisual Learning 2020 - Xavier Giro-i-Nieto - UPC Telecom...
Self-supervised Audiovisual Learning 2020 - Xavier Giro-i-Nieto - UPC Telecom...Self-supervised Audiovisual Learning 2020 - Xavier Giro-i-Nieto - UPC Telecom...
Self-supervised Audiovisual Learning 2020 - Xavier Giro-i-Nieto - UPC Telecom...
 

Similar a Modelling Globalized Systems

Ist africa2012 alert system in case of excess drawing of ground water_1
Ist africa2012 alert system in case of excess drawing of ground water_1Ist africa2012 alert system in case of excess drawing of ground water_1
Ist africa2012 alert system in case of excess drawing of ground water_1Karel Charvat
 
ENVIROFI for cross domain FI-PPP applications
ENVIROFI for cross domain FI-PPP applicationsENVIROFI for cross domain FI-PPP applications
ENVIROFI for cross domain FI-PPP applicationsDenis Havlik
 
Green Computing Observatory
Green Computing ObservatoryGreen Computing Observatory
Green Computing ObservatoryCecile Germain
 
Multi-Tenancy and Virtualization in Cloud Computing
Multi-Tenancy and Virtualization in Cloud ComputingMulti-Tenancy and Virtualization in Cloud Computing
Multi-Tenancy and Virtualization in Cloud ComputingAlexandru Iosup
 
OGF Standards Overview - ITU-T JCA Cloud
OGF Standards Overview - ITU-T JCA CloudOGF Standards Overview - ITU-T JCA Cloud
OGF Standards Overview - ITU-T JCA CloudAlan Sill
 
Jos van Sas - Testimonial Alcatel-Lucent Bell Labs
Jos van Sas - Testimonial Alcatel-Lucent Bell LabsJos van Sas - Testimonial Alcatel-Lucent Bell Labs
Jos van Sas - Testimonial Alcatel-Lucent Bell Labsimec.archive
 
Sociotechnical systems resilience
Sociotechnical systems resilienceSociotechnical systems resilience
Sociotechnical systems resilienceJean-René RUAULT
 
About an Immune System Understanding for Cloud-native Applications - Biology ...
About an Immune System Understanding for Cloud-native Applications - Biology ...About an Immune System Understanding for Cloud-native Applications - Biology ...
About an Immune System Understanding for Cloud-native Applications - Biology ...Nane Kratzke
 
Ieee wcnc 2012 workshop - befemto - Agenda
Ieee wcnc 2012   workshop - befemto - AgendaIeee wcnc 2012   workshop - befemto - Agenda
Ieee wcnc 2012 workshop - befemto - AgendaThierry Lestable
 
18 applied architectures_part_2
18 applied architectures_part_218 applied architectures_part_2
18 applied architectures_part_2Majong DevJfu
 
D1.1. State of The Art and Requirements Analysis for Hypervideo
D1.1. State of The Art and Requirements Analysis for HypervideoD1.1. State of The Art and Requirements Analysis for Hypervideo
D1.1. State of The Art and Requirements Analysis for HypervideoLinkedTV
 
NCOIC GCC OWS-10 presentation 10 7 2013
NCOIC GCC OWS-10 presentation 10 7 2013NCOIC GCC OWS-10 presentation 10 7 2013
NCOIC GCC OWS-10 presentation 10 7 2013GovCloud Network
 
Cloud computing - Terena 2011
Cloud computing - Terena 2011Cloud computing - Terena 2011
Cloud computing - Terena 2011Jisc
 
Ramin elahi fog_computing_ecosystem_final_dec22_updated
Ramin elahi fog_computing_ecosystem_final_dec22_updatedRamin elahi fog_computing_ecosystem_final_dec22_updated
Ramin elahi fog_computing_ecosystem_final_dec22_updatedHarshitParkar6677
 
OGF Standards Overview - Cloudscape V
OGF Standards Overview - Cloudscape VOGF Standards Overview - Cloudscape V
OGF Standards Overview - Cloudscape VAlan Sill
 
Dynamic Semantics for the Internet of Things
Dynamic Semantics for the Internet of Things Dynamic Semantics for the Internet of Things
Dynamic Semantics for the Internet of Things PayamBarnaghi
 
Software Sustainability Institute
Software Sustainability InstituteSoftware Sustainability Institute
Software Sustainability InstituteNeil Chue Hong
 

Similar a Modelling Globalized Systems (20)

Ist africa2012 alert system in case of excess drawing of ground water_1
Ist africa2012 alert system in case of excess drawing of ground water_1Ist africa2012 alert system in case of excess drawing of ground water_1
Ist africa2012 alert system in case of excess drawing of ground water_1
 
ENVIROFI for cross domain FI-PPP applications
ENVIROFI for cross domain FI-PPP applicationsENVIROFI for cross domain FI-PPP applications
ENVIROFI for cross domain FI-PPP applications
 
Green Computing Observatory
Green Computing ObservatoryGreen Computing Observatory
Green Computing Observatory
 
Open Geospatial Consortium (OGC) - Water/Hydro related activities
Open Geospatial Consortium (OGC) - Water/Hydro related activitiesOpen Geospatial Consortium (OGC) - Water/Hydro related activities
Open Geospatial Consortium (OGC) - Water/Hydro related activities
 
Multi-Tenancy and Virtualization in Cloud Computing
Multi-Tenancy and Virtualization in Cloud ComputingMulti-Tenancy and Virtualization in Cloud Computing
Multi-Tenancy and Virtualization in Cloud Computing
 
OGF Standards Overview - ITU-T JCA Cloud
OGF Standards Overview - ITU-T JCA CloudOGF Standards Overview - ITU-T JCA Cloud
OGF Standards Overview - ITU-T JCA Cloud
 
Jos van Sas - Testimonial Alcatel-Lucent Bell Labs
Jos van Sas - Testimonial Alcatel-Lucent Bell LabsJos van Sas - Testimonial Alcatel-Lucent Bell Labs
Jos van Sas - Testimonial Alcatel-Lucent Bell Labs
 
Sociotechnical systems resilience
Sociotechnical systems resilienceSociotechnical systems resilience
Sociotechnical systems resilience
 
About an Immune System Understanding for Cloud-native Applications - Biology ...
About an Immune System Understanding for Cloud-native Applications - Biology ...About an Immune System Understanding for Cloud-native Applications - Biology ...
About an Immune System Understanding for Cloud-native Applications - Biology ...
 
Ieee wcnc 2012 workshop - befemto - Agenda
Ieee wcnc 2012   workshop - befemto - AgendaIeee wcnc 2012   workshop - befemto - Agenda
Ieee wcnc 2012 workshop - befemto - Agenda
 
18 applied architectures_part_2
18 applied architectures_part_218 applied architectures_part_2
18 applied architectures_part_2
 
D1.1. State of The Art and Requirements Analysis for Hypervideo
D1.1. State of The Art and Requirements Analysis for HypervideoD1.1. State of The Art and Requirements Analysis for Hypervideo
D1.1. State of The Art and Requirements Analysis for Hypervideo
 
NCOIC GCC OWS-10 presentation 10 7 2013
NCOIC GCC OWS-10 presentation 10 7 2013NCOIC GCC OWS-10 presentation 10 7 2013
NCOIC GCC OWS-10 presentation 10 7 2013
 
The OHF Legacy
The OHF LegacyThe OHF Legacy
The OHF Legacy
 
Cloud computing - Terena 2011
Cloud computing - Terena 2011Cloud computing - Terena 2011
Cloud computing - Terena 2011
 
Ramin elahi fog_computing_ecosystem_final_dec22_updated
Ramin elahi fog_computing_ecosystem_final_dec22_updatedRamin elahi fog_computing_ecosystem_final_dec22_updated
Ramin elahi fog_computing_ecosystem_final_dec22_updated
 
Andreas Hense: Climate data for our future – acquired, analysed, archived
Andreas Hense: Climate data for our future – acquired, analysed, archivedAndreas Hense: Climate data for our future – acquired, analysed, archived
Andreas Hense: Climate data for our future – acquired, analysed, archived
 
OGF Standards Overview - Cloudscape V
OGF Standards Overview - Cloudscape VOGF Standards Overview - Cloudscape V
OGF Standards Overview - Cloudscape V
 
Dynamic Semantics for the Internet of Things
Dynamic Semantics for the Internet of Things Dynamic Semantics for the Internet of Things
Dynamic Semantics for the Internet of Things
 
Software Sustainability Institute
Software Sustainability InstituteSoftware Sustainability Institute
Software Sustainability Institute
 

Último

JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...
JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...
JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...anjaliyadav012327
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13Steve Thomason
 
Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Disha Kariya
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationnomboosow
 
Student login on Anyboli platform.helpin
Student login on Anyboli platform.helpinStudent login on Anyboli platform.helpin
Student login on Anyboli platform.helpinRaunakKeshri1
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...EduSkills OECD
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptxVS Mahajan Coaching Centre
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Krashi Coaching
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionSafetyChain Software
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introductionMaksud Ahmed
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsTechSoup
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfciinovamais
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3JemimahLaneBuaron
 
A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformChameera Dedduwage
 
The byproduct of sericulture in different industries.pptx
The byproduct of sericulture in different industries.pptxThe byproduct of sericulture in different industries.pptx
The byproduct of sericulture in different industries.pptxShobhayan Kirtania
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdfSoniaTolstoy
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeThiyagu K
 
mini mental status format.docx
mini    mental       status     format.docxmini    mental       status     format.docx
mini mental status format.docxPoojaSen20
 

Último (20)

JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...
JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...
JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13
 
Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communication
 
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptxINDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
 
Student login on Anyboli platform.helpin
Student login on Anyboli platform.helpinStudent login on Anyboli platform.helpin
Student login on Anyboli platform.helpin
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory Inspection
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 
Advance Mobile Application Development class 07
Advance Mobile Application Development class 07Advance Mobile Application Development class 07
Advance Mobile Application Development class 07
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3
 
A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy Reform
 
The byproduct of sericulture in different industries.pptx
The byproduct of sericulture in different industries.pptxThe byproduct of sericulture in different industries.pptx
The byproduct of sericulture in different industries.pptx
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
 
mini mental status format.docx
mini    mental       status     format.docxmini    mental       status     format.docx
mini mental status format.docx
 

Modelling Globalized Systems

  • 1. Modelling globalized systems: challenges and examples Cécile Germain-Renaud Laboratoire de Recherche en Informatique Université Paris Sud, CNRS, INRIA Results from the GO collaboration
  • 2. Outline ü  Globalized systems Franco-Taiwanese meeting 10 September 2012
  • 3. Remember tomorrow A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities Ian Foster, 1998 Franco-Taiwanese meeting 10 September 2012
  • 4. The Clouds take me higher Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction. NIST (US National Institute of Standards and Technology) definition of clouds Franco-Taiwanese meeting 10 September 2012
  • 5. Yesterday's sorrows How we configure our grids (EGEE 09) Franco-Taiwanese meeting 10 September 2012
  • 6. Tomorrow's white lies? Franco-Taiwanese meeting 10 September 2012
  • 7. Globalized systems Grid Data Center Cloud Distribution Very large Any Moderate Sharing Virtual No Isolation – Organisations – individualized collective rights access and control Large data (file) Yes Yes Yes Big Data No Yes Yes (indexed) Economics Long-term SLAs Proprietary or Pay as you go usual commercial contract Franco-Taiwanese meeting 10 September 2012
  • 8. Outline ü  Globalized systems ü  Challenges Franco-Taiwanese meeting 10 September 2012
  • 9. The challenges We need to show that the research has verifiable and positive impact on production systems Demonstrating impact on complex systems —  requires experimental data —  raises serious scientific issues Franco-Taiwanese meeting 10 September 2012
  • 10. The Grid Observatory —  Digital curation of the behavioural data of the EGI grid: observe and publish —  Complex systems description —  Models, optimization, Autonomics Franco-Taiwanese meeting 10 September 2012
  • 11. Why the EGEE/EGI grid? Accessible globalized production system LHC is the EGEE/EGI is the Atlas •  Largest •  Largest (40K Collaboration (26km), CPUs), (one in four) •  Fastest(14TeV) •  Most distributed •  3000 scientists •  Coldest (1.9K) (250 sites), •  38 countries •  Emptiest •  Most used (300K (10−13 atm) jobs/day) •  174 universities Computer system and labs machine. Franco-Taiwanese meeting 10 September 2012
  • 12. The Green Computing Observatory 2GBytes/day at 1 minute sampling period Franco-Taiwanese meeting 10 September 2012
  • 13. The Grid Observatory collaboration —  Born in EGEE-III, now a collaborative effort of —  CNRS/UPS Laboratoire de Recherche en Informatique —  CNRS/UPS Laboratoire de l'Accélérateur Linéaire —  Imperial College London —  France Grilles – French NGI of EGI —  EGI-Inspire —  Ile de France council —  (Software and Complex Systems programme) —  INRIA – Saclay (ADT programme) —  CNRS (PEPS programme) —  University Paris Sud (MRM programme) —  Scientific Collaborations —  NSF Center for Autonomic Computing —  European Middleware Initiative —  Institut des Systèmes Complexes —  Cardiff University Franco-Taiwanese meeting 10 September 2012
  • 14. Globalized systems are complex ones Dynamic(al) system —  Entities change behavior as an effect of unexpected feedbacks, emergent behavior —  Organized self- criticality, minority games,... Franco-Taiwanese meeting 10 September 2012
  • 15. Predicting the response time Franco-Taiwanese meeting 10 September 2012
  • 16. Globalized systems are complex ones Lack of complete and common knowledge – Information uncertainty —  Monitoring is distributed too —  Resolution and calibration Franco-Taiwanese meeting 10 September 2012
  • 17. Outline ü  Globalized systems ü  Challenges ü  Towards realistic behavioural models Franco-Taiwanese meeting 10 September 2012
  • 18. Issue I: Fundamentals in statistics —  “unusual” statistics: which metrics? —  Are our systems stationary? Franco-Taiwanese meeting 10 September 2012
  • 19. Metrics Root Mean Squared Error is inadequate Franco-Taiwanese meeting 10 September 2012
  • 20. Metrics Should make sense for the end user Franco-Taiwanese meeting 10 September 2012
  • 21. The ROC metrics: à la BQP —  Evaluation of binary predictors: False positives vs true positive curve —  Intervals of the response time define as many binary predictors —  Intervals of increasing size —  gLite prediction is definitely better than random [C. Germain-Renaud et al. The Grid Observatory. CCGRID 2011] Franco-Taiwanese meeting 10 September 2012
  • 22. Statistical significance Extreme values may dominate the statistics Franco-Taiwanese meeting 10 September 2012
  • 23. More on statistical significance Can we predict anything? Maybe as difficult as earthquakes and markets Franco-Taiwanese meeting 10 September 2012
  • 24. A few keywords Heavy tail Self-similarity Long range dependence Heteroskedasticity Harold Edwin Hurst 1880-1978 Franco-Taiwanese meeting 10 September 2012
  • 25. Stationarity Joint probability distribution of the time series does not change when shifted Franco-Taiwanese meeting 10 September 2012
  • 26. Do naïve statistics make sense? Non-stationarity and long-range dependence can easily be confused —  The Hurst effect under trends. J. Appl. Probab., 20(3), 1983. —  Occasional structural breaks and long memory with an application to the S&P 500 absolute stock returns. J. Empirical Finance, 11(3), 2004. —  Testing for long-range dependence in the presence of shifting means or a slowly declining trend, using a variance-type estimator. J. Time Ser. Anal., 18(3), 1997. —  Long memory and regime switching. J. Econometrics, 105(1), 2001. Franco-Taiwanese meeting 10 September 2012
  • 27. Do naïve statistics make sense? The “physical” process is not stationary —  Trends: Rogers’s curve of adoption —  Technology innovations —  Real-world events —  Experimental discoveries —  Slashdotted accesses NON-STATIONARITY IS A REASONABLE ALTERNATIVE Franco-Taiwanese meeting 10 September 2012
  • 28. Dealing with non-stationarity 1.  Statistical testing —  Sequential jump detection —  Theoretical guarantees for known distributions —  Predictive, not generative —  Example: blackhole detection —  Calibration and Validation: by the Expert Franco-Taiwanese meeting 10 September 2012
  • 29. Dealing with non-stationarity 2.  Segmentation —  Fit a piecewise time- series: infer the parameters of the local models and the breakpoints —  Model selection: AIC, MDL,… – based —  a priori hypotheses on the segment models: [Towards non stationary Grid AR, ARMA, FARMA,… Models, JoGC Dec. 2011] Franco-Taiwanese meeting 10 September 2012
  • 30. Dealing with non-stationarity 2.  Segmentation —  Mostly off-line and computationally expensive: generative, explanatory models —  Validation is not trivial —  Fit quality —  Stability: bootstrapping —  Randomized optimization: clustering the results —  Hints at global behavior Franco-Taiwanese meeting 10 September 2012
  • 31. Dealing with non-stationarity 3.  Adaptive clustering: —  Adaptive: on-line rupture detection —  Back to statistical testing, but on the model, not on the data [Toward Autonomic Grids: Analyzing the Job Flow with Affinity Streaming". SIGKDD'2009] Franco-Taiwanese meeting 10 September 2012
  • 32. Remember tomorrow (5 years later) Systems manage themselves according to an administrator’s goals. New components integrate as effortlessly as a new cell establishes itself in the human body J. Kephart and David M. Chess, the Autonomic Computing Manifesto, 2003 Franco-Taiwanese meeting 10 September 2012
  • 33. Issue II: Intelligibility S E Autonomic Manager Analyze Plan Monitor Knowledge Execute S E Managed Element Franco-Taiwanese meeting 10 September 2012
  • 34. Issue II: Intelligibility How to build the knowledge? Blackhole —  No Gold Standard, too rare experts Software fault Franco-Taiwanese meeting 10 September 2012
  • 35. Issue II: Intelligibility How to build the knowledge? —  No Gold Standard, too rare experts —  Let’s build it on-line! Model- free policies eg Reinforcement Learning! —  Unfortunately, tabula rasa policies and vanilla ML Exploration/exploitation methods are too often tradeoff defeated (Rish & Tesauro ICML 2006, Tesauro) Franco-Taiwanese meeting 10 September 2012
  • 36. … Issue II: Intelligibility —  Transaction traces are text files, thus we can infer causes from data as latent topics, in the spirit of text mining. [Characterizing E-Science File Access Behavior via Latent Dirichlet Allocation, UCC 2011] —  The internals of a globalized system might be so complex that it might be more effective to consider it as a black box, but the causes of failures or performance can be elucidated from external observation. [Distributed Monitoring with Collaborative Prediction. CCGrid 2012] Franco-Taiwanese meeting 10 September 2012
  • 37. Latent Dirichlet Allocation… —  A corpus is a set of documents, each built over a dictionary (set of words) —  A document is characterized by a mixture distribution over topics. Best example of topics: scientific keywords —  A topic is characterized by a distribution over words. —  The only observables are words. —  Bag of words - interchangeability —  LDA is a generative model —  For each document, choose the topic distribution. —  For each topic, choose a word distribution. —  For each word, choose: —  the topic along the selected topic distribution —  the word along the selected word distribution for this topic Franco-Taiwanese meeting 10 September 2012
  • 38. …Latent Dirichlet Allocation M is the number of documents, N the size of a document β α θ t x N M Franco-Taiwanese meeting 10 September 2012
  • 39. Analogy —  An analogy between text corpora and transaction traces —  Corpus ∼ Complete trace —  Document ∼ Segment of a trace (phase) —  Topic ∼ Activity —  Word ∼ Filename Franco-Taiwanese meeting 10 September 2012
  • 40. And differences —  Unlike text corpora, trace files have... —  No natural segmentation. —  No well established, predefined set of activities equivalent to a set of topics. —  This work makes crude assumptions to avoid dealing with these issues. —  1 week phase —  Arbitrarily fix the number of activities Franco-Taiwanese meeting 10 September 2012
  • 41. Inference and parameter estimation —  Exact algorithms are intractable due to coupling between θ and β —  Alternating variational EM for the MLE estimates of α and β [Blei,Ng,Jordan, JMLR 2003] —  Gibbs sampling for estimating θ and Φ [Griffiths&Steyvers, Procs Nat Academy Science 2004] —  … Franco-Taiwanese meeting 10 September 2012
  • 42. A tentative simpler model —  User data is included in each transaction thus is observable —  Assume each activity is associated with a unique user. —  Estimation and inference is much easier than standard LDA —  Goal: check the validity of this assumption Franco-Taiwanese meeting 10 September 2012
  • 43. Experimental results —  Synthetic trace generated using the estimated parameters of the 2 models. —  2M different files. 63 activities (standard LDA, number of clustered users), 262 activities (observed). —  File popularity: χ-square test gives p-value of 1 Franco-Taiwanese meeting 10 September 2012
  • 44. Ongoing work —  Inferred segmentation: Probabilistic Context-Free Grammar Franco-Taiwanese meeting 10 September 2012
  • 45. Fault management Operational motivation: all (CExSE) pairs tests 256 CEs 121 SEs lcg_cr/srm_ls Franco-Taiwanese meeting 10 September 2012
  • 46. All (CExSE) pairs tests 256 CEs 121 SEs Franco-Taiwanese meeting 10 September 2012
  • 47. Goal: Detection/Diagnosis, or Prediction? Detection/diagnosis: define a minimal set of probes that discovers all / any faulty component Equivalent to the minimum cover set problem Assumes that we know the internal dependencies Franco-Taiwanese meeting 10 September 2012
  • 48. Assumes that we know the internal dependencies Franco-Taiwanese meeting 10 September 2012
  • 49. Goal Prediction! More precisely —  (Minimal) probe selection: choose which subset of the (CE,SE) pairs will actually be tested Less probes —  Prediction: predict the availability of all (CE,SE) pairs from a small number of them. Franco-Taiwanese meeting 10 September 2012
  • 50. All (CExSE) pairs tests A case for collaborative filtering 256 CEs ~ users 121 SEs ~items Franco-Taiwanese meeting 10 September 2012
  • 51. Collaborative filtering —  Major aplication: recommendation systems eg netflix challenge —  Neigborhood approach —  Latent factor models approach —  Transform items AND users into the same latent factors space —  Factors are inferred from data —  Better if interpretable eg comedy, drama, action, scenery, music,… but this is another task Latent topics: LDA Implicit mapping to high- dimensional space: SVM Franco-Taiwanese meeting 10 September 2012
  • 52. Maximum Margin Matrix factorization (Srebro, Rennie, Jaakkola, NIPS 2005) —  Linear factor model X the observed nxm sparse matrix X = UV U is nxk, V is kxm each line i of U is a feature vector (« tastes » of user i) each column j of V is a linear predictor for movie j —  Low-rank CP: regularizing by the rank k Trace (or Frobenius) norm as a convex surrogate for rank —  With uniform sample selection, theoretical bounds on misclassification error: learning both U and V is within log factors of learning only one Franco-Taiwanese meeting 10 September 2012
  • 53. Maximum Margin Matrix factorization (Srebro, Rennie, Jaakkola, NIPS 2005) —  Linear factor model —  Low-rank CP: regularizing by the rank k Trace (or Frobenius) norm as a convex surrogate for rank —  With uniform sample selection, theoretical bounds on misclassification error: learning both U and V is within log factors of learning only one Franco-Taiwanese meeting 10 September 2012
  • 54. MMMF with Active Learning —  Rish & Tesauro, 2007 —  Min margin selection: get the label for the most uncertain entry —  More applicable to system problems than to movies ratings —  In our case, just launch a probe Franco-Taiwanese meeting 10 September 2012
  • 55. Evaluation For once, we have ground truth: 51 days (March-April 2011) of all (CExSE) probes outcomes Less probes Franco-Taiwanese meeting 10 September 2012
  • 56. Evaluation —  Systematic failures: excellent results, too easy problem —  Without systematic failures: accuracy is excellent, but not a significant performance indicator —  MMMF-based Active probing —  provides good results —  outperforms M3F, a combined low-rank/latent topic method                                     Franco-Taiwanese meeting 10 September 2012
  • 57. Outline ü  Globalized systems ü  Scientific challenges ü  Towards realistic behavioural models ü  Conclusion and questions Franco-Taiwanese meeting 10 September 2012
  • 58. Franco-Taiwanese meeting 10 September 2012
  • 59. How to —  Get an account —  Download files www.grid-observatory.org Franco-Taiwanese meeting 10 September 2012