Empirical Software Engineering, Version 2.0

Tim Menzies, WVU, USA
Forrest Shull, Fraunhofer, USA
(with John Hoskings, UoA, NZ)
Jan 27, 2011

About us
  Curators of large repositories of SE data
     Searched for conclusions

  Shull: NSF-funded CeBase, 2001-2005
     No longer on-line

  Menzies: PROMISE 2006-2011
     If you publish, offer data used in that pub
     http://promisedata.org/data

  Our question: What’s next?

  [Bar chart: text-mining, model-based, general, effort estimation, defect; scale 0 to 100]
Summary

  We need to do more “data mining”
    Not just on different projects
    But again and again on the same project

  And by “data mining” we really mean
    Automated agents that implement
      prediction
      monitoring
      diagnosis
      planning
      adaptive business intelligence


                                               3 of 48
Adaptive Business Intelligence

  Learning, and re-learning, how to….
    Detect death-march projects
    Repair death-march projects
    Find the best sell/buy point for software artifacts
    Invest more (or less) in staff training/development programs
    Prioritize software inspections
    Estimate development cost
    Change development costs
    etc.

                                                            4 of 48
This talk

  A plea for industrial partners to join in

  A roadmap for my next decade of research
    Many long term questions
    A handful of new results




                                               5 of 48
Data Mining &
  Software
 Engineering
            6
So many applications of data mining to SE

  Process data
     Input: developer skills, platform stability
     Output: effort estimation

  Product data
     Input: static code descriptions
     Output: defect predictors

  Usage data
     Input: what is everyone using?
     Output: recommendations on where to browse next

  Social data
     Input: e.g. which tester do you most respect?
     Output: predictions of what bugs get fixed first

  Trace data
     Input: what calls what?
     Output: call sequences that lead to a core dump

  Any textual form
     Input: text of any artifact
     Output: e.g. fault localization                                   7
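
As a concrete example of the product-data row above, here is a minimal defect-prediction sketch in Python with scikit-learn; the file name and metric columns are illustrative assumptions, not a specific PROMISE data set:

    import pandas as pd
    from sklearn.model_selection import cross_val_score
    from sklearn.naive_bayes import GaussianNB

    # Hypothetical CSV of static code metrics with a binary "defective" column.
    data = pd.read_csv("static_code_metrics.csv")
    X = data.drop(columns=["defective"])          # e.g. loc, cyclomatic complexity, fan-in
    y = data["defective"]

    # 10-way cross-validation of a simple learner over the metrics
    scores = cross_val_score(GaussianNB(), X, y, cv=10, scoring="recall")
    print("mean recall:", scores.mean())
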
The State of the Art

  If data collected, then usually forgotten

  Dashboards: visualizations for feature
  extraction; the intelligence is left to the user

  MapReduce, Hadoop, et al.: systems
  support for massive, parallel execution.
    http://hadoop.apache.org
    Implements the bus, but no bus drivers

  Many SE data mining publications
     e.g. Bird, Nagappan, Zimmermann,
      and the last slide
     But no agents that recognize when
      old models are no longer relevant,
     or that repair old models using new data   8 of 48
Of course, DM gets
             it wrong, sometimes
  Heh, nobody’s perfect

  E.g. look at all the
   mistakes people make:
     Wikipedia: list of
      cognitive biases
     38 decision-making biases
     30 biases in probability
     18 social biases
     10 memory biases

  At least with DM, we can
   repeat the analysis,
   audit the conclusion.

  Create agent communities, each
   with novel insights and limitations
     Data miners working with humans
      see more together than separately
     Partnership
                                                                  9 of 48
Does this change
  empirical SE
   research?

                   10
Ben Shneiderman, Mar’08

  The growth of the World Wide Web ... continues
  to reorder whole disciplines and industries. ...

  It is time for researchers in science to take
  network collaboration to the next phase and
  reap the potential intellectual and societal
  payoffs.

    -B. Shneiderman.
    Science 2.0.
    Science, 319(7):1349–1350, March 2008


                                                     11 of 48
Science
  2.0
          12
A proposal




  Add roller skates to software engineering

  Always use DM (data mining) on SE data
                                               13 of 48
What’s the difference?

  SE research v1.0
    Case studies: watch, don’t touch
    Experiments: vary a few conditions in a project
    Simple analysis: a little ANOVA, regression, maybe a t-test

  SE research v2.0
    Data generators: case studies, experiments
    Data analysis: 10,000s of possible data miners
    Crowd-sourcing: 10,000s of possible analysts

                                                               14 of 48
Value-added (to case-study-based research)

  Case studies: powerful for defining
  problems, highlighting open issues

  Have documented 100s of candidate
  methods for improving SE
  development

  e.g. Kitchenham et al., IEEE TSE, 2007,
    Cross versus Within-Company Cost
     Estimation
    Spawned a sub-culture of researchers
       checking if what works here also works
        there.
                                                 15 of 48
Case-Study-based Research:
                 Has Limits
  Too slow
     Years to produce conclusions
     Meanwhile, technology base changes

  Too many candidate methods
     No guidance on what methods to
      apply to particular projects

  Little generality
     Zimmermann et al., FSE 2009
        662 times: learn here, test there
        Worked in 4% of pairs
     Many similar no-generality results
        Chapter 1, Menzies & Shull
                                              16 of 48
Case-studies + DM =
               Better Research
  Propose a partnership between
    case study research
    And data mining

  Data mining is stupid
    Syntactic, no business knowledge

  Case studies are too slow
    And to check for generality? Even slower

  Case study research (on one project) to raise questions
    Data mining (on many projects) to check the answers


                                                           17 of 48
Adaptive
 Agents
           18
Need for adaptive agents
  No general rules in SE
     Zimmermann et al., FSE 2009

  But general methods to find the local rules

  Issues:
     How quickly can we learn the local models?
     How to check when local models start failing?
     How to repair local models?

  An adaptive agent watching a stream of data, learning
  and relearning as appropriate


                                                      19 of 48
Agents for adaptive business intelligence

  [Diagram: quality (e.g. PRED(30)) against data collected over time: learn mode #1, break down, learn mode #2, break down, learn mode #3]




                                                               20 of 48
Agents for adaptive business intelligence

  [Diagram as on slide 20: learn mode #1, break down, learn mode #2, break down, learn mode #3]


       What is different here?
         Not “apply data mining to build a predictor”
         But add monitor and repair tools to recognize and
          handle the breakdown of old predictors
         Trust = data mining + monitor + repair               21 of 48
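
A minimal sketch of that loop, assuming the data arrives in eras of (features, quality-class) batches; the Naive Bayes learner, the accuracy monitor, and the 0.6 threshold are placeholders, not the specific agents proposed here:

    from sklearn.metrics import accuracy_score
    from sklearn.naive_bayes import GaussianNB

    def adaptive_agent(eras, threshold=0.6):
        """eras: iterable of (X, y) batches collected over time."""
        model, history = None, []
        for X, y in eras:
            if model is not None:
                score = accuracy_score(y, model.predict(X))   # monitor the old model
                history.append(score)
                if score < threshold:                         # breakdown detected
                    model = None                              # repair = forget and re-learn
            if model is None:
                model = GaussianNB().fit(X, y)                # (re)learn a local model
        return model, history
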
If crowd sourcing
   [Diagram: quality (e.g. PRED(30)) over time for three companies: Company #1 passes through modes #1, #2, #3; Company #2 through modes #4, #5, #6; Company #3 through modes #7, #8, #9]
   With DM, we could recognize that e.g. 1=4=7
   •  i.e. when some “new” situation has happened before   22
   •  So we’d know what experience base to exploit
Research Questions. How to handle….

  Anonymization
    Make data public, without revealing private data

  Volume of data
    Especially if working from “raw” project artifacts
    Especially if crowd sourcing

  Explanation: of complex patterns

  Noise: from bad data collection

  Mode recognition
    Is the new stuff really new, or a repeat of old stuff?

  Trust: you did not collect the data
    Must surround the learners with assessment agents
       Anomaly detectors
       Repair
                                                                23 of 48
Most of the technology

required for this approach

can be implemented via

      data mining
                          24
So it would scale

to large data sets


                     25
Organizing the
   artifacts
             26
Visual Wiki (“Viki”) concept




                           27 of 48
Enterprise Artifacts example




  Add documents; organize; search; navigate

  Edit properties, documents, add links, extract links
                                                          28 of 48
Various Vikis




  Bottom left: business process descriptions
                                                29 of 48
VikiBuilder – generating Vikis




                            30 of 48
Text mining
  Key issue: dimensionality reduction
  In some domains, can be done in linear time




  Use standard data miners, applied to top
  100 terms in each corpus                    31 of 48
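
A sketch of that pipeline in Python with scikit-learn; TF-IDF scoring of the top 100 terms is an assumption, since the slide only fixes the number of terms:

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.naive_bayes import MultinomialNB

    docs   = ["text of artifact one ...", "text of artifact two ..."]  # any textual artifacts
    labels = [0, 1]                                                    # e.g. faulty vs. not

    vec = TfidfVectorizer(max_features=100)   # dimensionality reduction: keep top 100 terms
    X = vec.fit_transform(docs)
    model = MultinomialNB().fit(X, labels)    # then any standard data miner
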
Details
          32
Agents for adaptive business intelligence

  [Diagram as on slide 20: learn mode #1, break down, learn mode #2, break down, learn mode #3]

       Q1: How to learn faster?
         Technology: active learning: reflect on examples
            to date to ask the most informative next question
       Q2: How to recognize breakdown?
         Technology: Bayesian anomaly detection,
          focusing on frequency counts of contrast sets         33 of 48
Agents for adaptive business intelligence

  [Diagram as on slide 20: learn mode #1, break down, learn mode #2, break down, learn mode #3]


       Q3: How to classify a mode?
         Recognize if you’ve arrived at a mode seen before
         Technology: Bayes classifier
       Q4: How to make predictions?
         Using the norms of a mode, report expected behavior
         Technology: table look-up of data inside the Bayes classifier    34 of 48
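
A sketch of Q3/Q4, assuming each past era is summarized as a feature vector labeled with its mode, and each mode has a table of stored norms (both tables here are illustrative):

    import numpy as np
    from sklearn.naive_bayes import GaussianNB

    # Feature vectors summarizing past eras, labeled by the mode they belonged to.
    past_eras  = np.array([[0.2, 10], [0.3, 12], [0.8, 40], [0.9, 38]])
    mode_label = np.array([1, 1, 2, 2])
    norms = {1: {"PRED(30)": 0.7}, 2: {"PRED(30)": 0.4}}   # per-mode expected behavior

    clf = GaussianNB().fit(past_eras, mode_label)

    def predict_behavior(new_era):
        mode = clf.predict([new_era])[0]   # Q3: which known mode does this look like?
        return mode, norms[mode]           # Q4: report that mode's stored norms

    print(predict_behavior([0.25, 11]))
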
Agents for adaptive business intelligence

  [Diagram as on slide 20: learn mode #1, break down, learn mode #2, break down, learn mode #3]


       Q5: What went wrong? (diagnosis)
         Delta between current and prior, better, mode
       Q6: What to do? (planning)
         Delta between current and other, better, mode
       Technology: contrast set learning                       35 of 48
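
A crude sketch of those deltas: compare how often each (attribute, range) appears in the current mode versus a better mode. Real contrast-set learners such as TAR3 use discretization and lift; this only shows the flavor:

    from collections import Counter

    def contrast(current_rows, better_rows):
        """Each row is a set of (attribute, range) pairs describing one example."""
        cur = Counter(r for row in current_rows for r in row)
        bet = Counter(r for row in better_rows for r in row)
        n_cur, n_bet = len(current_rows), len(better_rows)
        delta = {r: bet[r] / n_bet - cur[r] / n_cur for r in set(cur) | set(bet)}
        # Most negative: common now, rare in the better mode -> diagnosis (what went wrong)
        # Most positive: rare now, common in the better mode -> planning (what to change)
        return sorted(delta.items(), key=lambda kv: kv[1])
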
Agents for adaptive business intelligence

  [Diagram as on slide 20: learn mode #1, break down, learn mode #2, break down, learn mode #3]


       Q7: How to understand a mode (explanation)
         Presentation of essential features of a mode
       Technology: contrast set learning

                                                                36 of 48
Bits and pieces


 Prototypes
                  37
Contrast set learning
  Minimal contrast set learning =
  diagnosis & planning

  A decision tree with weighted leaves
     Treatment = decisions that prune branches
       Culls bad weights
       Keeps good weights

  E.g. Simulator + C4.5 + 10-way
     10 * 1000-node trees
     TAR1: tiny rules: decisions on 4 ranges

  Why so small?
     Higher decisions prune more branches,
      so touch fewer nodes

                                                  38 of 48
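
A toy treatment learner in that spirit (not TAR1 itself): score each (attribute, range) by how much selecting it lifts the weighted class mix over the baseline, and report the few best ranges as the candidate treatment:

    from collections import defaultdict

    def treatments(rows, classes, weights, top=4):
        """rows: list of dicts {attribute: range}; classes: outcome per row;
        weights: class -> desirability weight."""
        def score(outcomes):
            return sum(weights[c] for c in outcomes) / len(outcomes) if outcomes else 0.0
        baseline = score(classes)
        lift = defaultdict(float)
        for a, r in {(a, r) for row in rows for a, r in row.items()}:
            selected = [c for row, c in zip(rows, classes) if row.get(a) == r]
            lift[(a, r)] = score(selected) - baseline   # how much this range improves the mix
        return sorted(lift.items(), key=lambda kv: -kv[1])[:top]
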
Contrast Set Learning →
 Succinct Explanations




                          39 of 48
Contrast Set Learning
               (10 years later)
  No longer a post-processor to a
   decision tree learner
     TAR3: Branch pruning operators
      applied directly to discretized
      data

  Summer ’09
     Shoot ‘em up at NASA Ames:
      state-of-the-art numerical
      optimizer vs. TAR3
     TAR3:
          Ran 40 times faster
          Generated better solutions

  Powerful succinct explanation tool
                                        40 of 48
Contrast Set Learning →
         Anomaly Detection
  Recognize when old ideas
  are now out-dated

  SAWTOOTH:
    Read data in “eras” of 100 instances
    Classify all examples as “seen it”

  SAWTOOTH1:
    Report average likelihood of
     examples belonging to “seen it”
    Alert if that likelihood drops

  SAWTOOTH2:
    Back-end to TAR3
    Track frequency of contrast sets
    Some uniformity between contrast
     sets and anomaly detection             41 of 48
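
A sketch of SAWTOOTH1-style monitoring, with a Gaussian mixture standing in for the "seen it" model and a two-sigma drop as the alert rule (both are assumptions):

    import numpy as np
    from sklearn.mixture import GaussianMixture

    def monitor(eras, n_components=2):
        """eras: list of numpy arrays, each holding ~100 instances."""
        seen, scores, model = None, [], None
        for era in eras:
            if model is not None:
                s = model.score(era)                     # mean log-likelihood of the new era
                if len(scores) >= 3 and s < np.mean(scores) - 2 * np.std(scores):
                    print("alert: likelihood drop; old model may be out-dated")
                scores.append(s)
            seen = era if seen is None else np.vstack([seen, era])
            model = GaussianMixture(n_components=n_components).fit(seen)
        return scores
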
Contrast sets → noise management
  CLIFF: post-processor to TAR3
    Linear-time instance selector
  Finds the attribute ranges that change classification
  Deletes all instances that lack the “power ranges”




                                                       42 of 48
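
A CLIFF-flavored sketch (not the published algorithm): rank each (attribute, range) by how strongly it selects for a class, then keep only the instances that contain one of their class's "power ranges":

    from collections import Counter, defaultdict

    def cliff(rows, classes):
        """rows: list of dicts {attribute: discretized range}; classes: label per row."""
        freq = defaultdict(Counter)                    # (attribute, range) -> class counts
        for row, c in zip(rows, classes):
            for a, r in row.items():
                freq[(a, r)][c] += 1
        power = defaultdict(dict)                      # class -> attribute -> (best range, score)
        for (a, r), counts in freq.items():
            for c, like in counts.items():
                rest = sum(counts.values()) - like
                p = like * like / (like + rest)        # frequent in this class, rare elsewhere
                if p > power[c].get(a, (None, -1.0))[1]:
                    power[c][a] = (r, p)
        return [(row, c) for row, c in zip(rows, classes)
                if any(power[c].get(a, (None, -1.0))[0] == r for a, r in row.items())]
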
Contrast Sets → CLIFF → Active Learning

  Many examples, too few labels

  Reduce the time required for business users
  to offer judgment on business cases

  Explore the reduced space generated by CLIFF
       Randomly sample the instances half-way
        between different classes

  Fast (in the reduced space)                          43 of 48
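
A sketch of that query step: pick random pairs of examples from different classes in the reduced space and ask for labels at their midpoints. Random pairing is an assumption; the slide only says half-way between classes:

    import random
    import numpy as np

    def query_points(X, y, budget=10):
        """X: numpy array of instances in the reduced space; y: labels."""
        X, y = np.asarray(X, float), np.asarray(y)
        classes = sorted(set(y.tolist()))
        queries = []
        for _ in range(budget):
            a, b = random.sample(classes, 2)                 # two different classes
            xa = X[random.choice(np.flatnonzero(y == a))]
            xb = X[random.choice(np.flatnonzero(y == b))]
            queries.append((xa + xb) / 2.0)                  # midpoint: near the boundary
        return queries
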
Contrast sets → CLIFF →
          Statistical databases
  Anonymize the data, preserving its distributions

  For KNN, that means keeping the boundaries between classes
     Which we get from CLIFF

  Also, CLIFF empties out the instance space
     Leaving us space to synthesize new instances




                                                       44 of 48
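
A sketch of the synthesis step: publish points interpolated between surviving same-class instances, so class boundaries survive while no original row is released. Plain interpolation is an assumption about how the emptied space is refilled:

    import numpy as np

    def synthesize(X, y, per_class=50, seed=1):
        """Return synthetic instances interpolated within each class."""
        rng = np.random.default_rng(seed)
        X, y = np.asarray(X, float), np.asarray(y)
        out_X, out_y = [], []
        for c in np.unique(y):
            rows = X[y == c]
            for _ in range(per_class):
                i, j = rng.integers(0, len(rows), size=2)     # two instances of this class
                w = rng.random()
                out_X.append(w * rows[i] + (1 - w) * rows[j]) # stay inside the class region
                out_y.append(c)
        return np.array(out_X), np.array(out_y)
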
And so…

          45
We seek industrial partners

1.  That will place textual versions of their products
   in a wiki

2.  That will offer joins of those products to quality
   measures

3.  That will suffer us interviewing their managers,
   from time to time, to learn the control points.

(Note: 1,2 can be behind your firewalls.)



                                                         46 of 48
In return, we offer

  Agents for
    automatic, adaptive business intelligence
    that tunes itself to your local domain




                                                  47 of 48
Questions?
Comments?
             48
