SlideShare una empresa de Scribd logo
1 de 39
Descargar para leer sin conexión
ON IMPACT IN SOFTWARE ENGINEERING RESEARCH
ANDREAS ZELLER, CISPA / SAARLAND UNIVERSITY
SEMINAR ON RESEARCH TRAINING IN SOFTWARE ENGINEERING
DAGSTUHL, JANUARY 15, 2018
ANDREAS ZELLER: KEY FACTS
• PhD in 1997 on Configuration Management with Feature Logic
• Since 2001 professor in Saarbrücken (Saarland University / CISPA)
• Four 10-year impact awards 2009–2017 (for papers 1999–2007)
• ACM Fellow in 2010
• Google Focused Impact Award in 2010
• ERC Advanced Grant in 2011
WHAT IS IMPACT?
WHAT IS IMPACT?
• How do your actions change the world?
• Often measured in citations, publications, funding, people, …
• All these are indicators of impact, but not goals in themselves
• We want to make the world a better place
• Gives meaning and purpose to our (professional) life
WHAT MAKES IMPACTFUL RESEARCH?
• Intellectual challenge – was it hard, or could anyone have done this?
• Elegance – is your research specific to a context, or can it be reused
again and again?
• Usefulness – can someone make money with it?
• Innovation is the delta in any of these metrics
PEOPLE LAUGH AT US
• Programming Languages folks miss the intellectual challenge
• Formal Methods folks miss elegance and challenge
• Industry folks miss the usefulness
• We recluse in our private bubbles
MY PATH TO IMPACT
MY PATH TO IMPACT
• Life can only be understood backwards; but it must be lived forwards

(Søren Kierkegaard)
CONFIGURATION MANAGEMENT

WITH FEATURE LOGIC (1991–1997)
• Topic defined by my PhD advisor
Gregor Snelting
• Idea: Formally describe variants and
revisions with feature logic
• “A unified model for configuration
management”
3.3 Combining Delta Features and other Features
..........
bug fixed f functionbug fixed f function
bug fixed f function procedurebug fixed f function procedure
bug fixed f function bug fixed f procedure
Figure 8: Delta features and other features
. This is only natural, since the change relies on the presence of the
change. Adding a new revision under implies that the revision is now tagged w
, so that the generalization property is not violated. Indeed,
.
FEATURE LOGIC: LESSONS LEARNED
• You can get plenty of papers accepted
• even if you miss the problem
• even if you do not evaluate
• “Modeling for the sake of modeling”
• Enabled much of my later work, though
DDD (1994–1999)
• During PhD, programmed a lot
• Debugging was hard!
• Built the DDD debugger GUI

with my student Dorothea Lütkehaus
• Welcome change from formal work
DDD (1994–1999)
• DDD was among the first dev tools

with a “professional” GUI
• Downloaded by the tens of thousands
• Adopted as a GNU project:

Street credibility with developers
• Impact through usefulness
DDD: LESSONS LEARNED
• Work on a real problem
• Assume as little as possible
• Keep things simple
–  “real” as in “real world”, not “real papers”
–  make things fit into real processes
–  complexity impresses, but prevents impact
DELTA DEBUGGING (1999–2003)
• After PhD, looking for new topic
• Delta Debugging brought together
debugging and version control
• Isolate failure causes through
repeated experiments
DELTA DEBUGGING (1999–2003)
• Delta debugging was a bomb
• Easy to teach + understand
• 7 lines of algorithm

(and 25 lines of Python)
• Spent two years on these
⎧
⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎨
⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎩
(c′
✔, c′
✘) if |∆| = 1
dd′
(c′
✘  ∆i, c′
✘, 2) if ∃i ∈ {1..n} · test(c′
✘  ∆i) = ✔
dd′
(c′
✔, c′
✔ ∪ ∆i, 2) if ∃i ∈ {1..n} · test(c′
✔ ∪ ∆i) = ✘
dd′
c′
✔ ∪ ∆i, c′
✘, max(n − 1, 2) else if ∃i ∈ {1..n} · test(c′
✔ ∪ ∆i) = ✔
dd′
c′
✔, c′
✘  ∆i, max(n − 1, 2) else if ∃i ∈ {1..n} · test(c′
✘  ∆i) = ✘
dd′
c′
✔, c′
✘, min(2n, |∆|) else if n < |∆| (“increase granularity”)
(c′
✔, c′
✘) otherwise
dd(c✔, c✘) = dd′
(c✔, c✘, 2)
dd′
(c′
✔, c′
✘, n) =
DELTA DEBUGGING: LESSONS LEARNED
• Work on a real problem
• Assume as little as possible
• Keep things simple
• Have a sound model
–  Version control? tests? Never heard of it
– 25 lines of Python is probably excessive
–  DD was my version model reborn
–  Why debug? We build correct software
MINING SOFTWARE ARCHIVES (2003–2010)
• In the early 2000s, open-source
version repositories became available
• Stephan Diehl saw an opportunity for
visualization and approached me
• Quickly expanded into data mining
• Tom Zimmermann: our MSc student
• Work of a research team
MINING SOFTWARE ARCHIVES (2003–2010)
• Our 2004 paper was the first ICSE
paper on mining software archives
• Handful of competing groups;

instant hit
• MSR now a conference on its own
• Paper has 1200+ citations so far
• Impact at Microsoft, Google, SAP…
MINING SOFTWARE ARCHIVES (2003–2010)
• We are now after the gold rush
• Data still exciting (if you have some)
• Few new insights on old data
• Get out of a field when too crowded 3.5 Programmer Actions and Defects
Now that we know how to predict defects, can we actually prevent
them? Of course, we could focus quality assurance on those files
predicted as most defect-prone. But are there also constructive
ways to avoid these defects? Is there a general rule to learn?
For this purpose, let us now focus on H2: Is there a correlation
between individual actions (= keystrokes) and defects? For this
purpose, we would search for correlations between the count of
the 256 characters and the overall post-defect count per file; our
null hypothesis would be:
H0. There is no correlation between character distribution and
defect-proneness.
After a number of preliminary experiments, we focused on the
Eclipse 3.0 dataset. It is well known that most metrics of software
do not follow a normal distribution and our measures of key-
strokes are no exception. The distributions of characters appear to
have an exponential rather than a power-law character. Nonethe-
less, due to the heavily skewed distribution, we used a standard Our results show a strong correlation between specific pro-
Figure 2: Color-coding keys by their defect correlation; (red = strong). The five strongest correlations are highlighted.
Figure 3: Defect correlation for the 26 lower-case letters.
MINING SOFTWARE REPOSITORIES:
LESSONS LEARNED
• Work on a real problem
• Assume as little as possible
• Keep things simple
• Have a sound model
• Keep on learning
–  Empirical research is core field of SE
–  simple parsers for multiple languages
–  essence of 2004 paper is one line of SQL
–  retrieval, precision, recall, etc, etc
–  statistics, data mining, machine learning
MINING APP ARCHIVES (2014–)
• How do we know an app does what
it should do?
• CHABADA checks for mismatches
between description and behavior
• Novel usage of NLP; 

novel app store mining
Alessandra Gorla · Ilaria Tavecchia⇤
· Florian Gross · Andreas Zeller
Saarland University
Saarbrücken, Germany
{gorla, tavecchia, fgross, zeller}@cs.uni-saarland.de
ABSTRACT
How do we know a program does what it claims to do? After clus-
tering Android apps by their description topics, we identify outliers
in each cluster with respect to their API usage. A “weather” app that
sends messages thus becomes an anomaly; likewise, a “messaging”
app would typically not be expected to access the current location.
Applied on a set of 22,500+ Android applications, our CHABADA
prototype identified several anomalies; additionally, it flagged 56%
of novel malware as such, without requiring any known malware
patterns.
Categories and Subject Descriptors
D.4.6 [Security and Protection]: Invasive software
General Terms
Security
Keywords
Android, malware detection, description analysis, clustering
1. App collection 2. Topics
"Weather",
"Map"…
"Travel",
"Map"…
"Theme"
3. Clusters
Weather
+ Travel
Themes
Access-LocationInternet Access-LocationInternet Send-SMS
4. Used APIs 5. Outliers
Weather
+ Travel
Figure 1: Detecting applications with unadvertised behavior.
Starting from a collection of “good” apps (1), we identify their
description topics (2) to form clusters of related apps (3). For
each cluster, we identify the sentitive APIs used (4), and can
MINING APP ARCHIVES (2014–)
• The ICSE paper of 2014 is among
most cited
• CHABADA techniques now adopted
by Google and Microsoft
• Most of your mobile apps have
gone through such an analysis :-)
Alessandra Gorla · Ilaria Tavecchia⇤
· Florian Gross · Andreas Zeller
Saarland University
Saarbrücken, Germany
{gorla, tavecchia, fgross, zeller}@cs.uni-saarland.de
ABSTRACT
How do we know a program does what it claims to do? After clus-
tering Android apps by their description topics, we identify outliers
in each cluster with respect to their API usage. A “weather” app that
sends messages thus becomes an anomaly; likewise, a “messaging”
app would typically not be expected to access the current location.
Applied on a set of 22,500+ Android applications, our CHABADA
prototype identified several anomalies; additionally, it flagged 56%
of novel malware as such, without requiring any known malware
patterns.
Categories and Subject Descriptors
D.4.6 [Security and Protection]: Invasive software
General Terms
Security
Keywords
Android, malware detection, description analysis, clustering
1. App collection 2. Topics
"Weather",
"Map"…
"Travel",
"Map"…
"Theme"
3. Clusters
Weather
+ Travel
Themes
Access-LocationInternet Access-LocationInternet Send-SMS
4. Used APIs 5. Outliers
Weather
+ Travel
Figure 1: Detecting applications with unadvertised behavior.
Starting from a collection of “good” apps (1), we identify their
description topics (2) to form clusters of related apps (3). For
each cluster, we identify the sentitive APIs used (4), and can
MINING APPS: LESSONS LEARNED
• Work on a real problem
• Assume as little as possible
• Keep things simple
• Have a sound model
• Keep on learning
• Keep on moving
–  Yes, there is malware
–  Descriptions and APIs
–  Standard NLP techniques
–  Standard NLP methods
–  NLP, machine learning, recommendation
–  Security starts with SE
MORE THINGS I DID
• Automatic repair
• Automatic parallelization
• Automatic website testing
• Structured fuzzing
• Automatic sandboxing
–  Wesley Weimer beat us to it
–  Struggled with complexity
–  Langfuzz found 4000 bugs at Mozilla
–  lots of potential in here
–  Built a company for that
THINGS I STAYED AWAY FROM
• Symbolic techniques
• Formal methods
• Modeling
• Architecture
• Work on a real problem
• Assume as little as possible
• Keep things simple
• Have a sound model
• Keep on learning
• Keep on moving
YOUR WAYS TO HAVE IMPACT
IMPACT AS A RESEARCHER
• Society funds research to take risks that no one else does
• Research is risky by construction – 

you should expect to fail, and fail again
• Tenure is meant to allow you to take arbitrarily grand challenges –

so work on the grand stuff
• If you lack resources, try smarter and harder
IMPACT AS A TEACHER
• Teaching can be a great way to multiply your message
• Not only focus on teaching the standards, but also your research
• Teaching your research helps to propagate it and make it accessible
• Engage students on topics dear to you
IMPACT WITH INDUSTRY
• Do work with industry to find problems and frame your work
• Do not work with industry to solve (their) concrete problems
• Your role as researcher is more than a cheap consulting tool
• Many “research” funding schemes are there to subsidize industry
IMPACT THROUGH TOOLS
• Getting your technique out as a tool is a great way to have impact!
• Also allows to check what actual users need (and if they exist)
• A tool can have far more impact than a paper
• Funding agencies and hiring committees begin to realize this
IMPACT AS FOUNDER
• Creating a company out of your research can be great fun!
• Allows you to push your research and ideas into practice
• Again, shows you what the market wants (and what not)
• Plenty of monetary and consultancy support available
IMPACT AS MENTOR
• Working with advanced students (MSc, PhD, PostDoc) can be the
most satisfying part of your job
• The variety of SE research needs universal problem solving skills
• Find such skills besides good grades
A GREAT ENVIRONMENT
• In 2000, Bernhard Westfechtel informed me of a
position in Saarbrücken
• I applied, got rejected, but dept would create a
position just for me
• Stayed there since 2001, despite several offers
A GREAT ENVIRONMENT
• Saarbrücken is the best-funded CS department in Germany
• For every 1€ that Saarland spends, we bring in 5€ more
• Great team and great team spirit
• Excellence Cluster, two Max-Planck Institutes, DFKI
• New CISPA Helmholtz Center for IT Security
® Visual

Computing

Institute
Center for Information Security, Privacy and
Accountability
375
PhD students
200
Post PhDs
9

new buildings

since 2001
11

ERC grants
6

Leibniz awards
4

ACM Fellows
2600
BSc + MSc
+400 +100 +7
® Visual

Computing

Institute
Center for Information Security, Privacy and
Accountability
We’re hiringWe’re hiring!
® Visual

Computing

Institute
Center for Information Security, Privacy and
Accountability
We’re hiring
past impact

future impact

direct impact

indirect impact
ON IMPACT IN SOFTWARE ENGINEERING RESEARCH
ANDREAS ZELLER, CISPA / SAARLAND UNIVERSITY
LESSONS LEARNED:

ON IMPACT IN SE RESEARCH
• Work on a real problem
• Assume as little as possible
• Keep things simple
• Have a sound model
• Keep on learning
• Keep on moving
–  possibly bursting your bubble
–  immediate impact on adoption
–  complexity inhibits impact
–  tools may fade away, concepts persist
– learn new stuff and leverage it
– do not stay in your cozy SE corner

Más contenido relacionado

La actualidad más candente

HotSoS16 Tutorial "Text Analytics for Security" by Tao Xie and William Enck
HotSoS16 Tutorial "Text Analytics for Security" by Tao Xie and William EnckHotSoS16 Tutorial "Text Analytics for Security" by Tao Xie and William Enck
HotSoS16 Tutorial "Text Analytics for Security" by Tao Xie and William EnckTao Xie
 
It Does What You Say, Not What You Mean: Lessons From A Decade of Program Repair
It Does What You Say, Not What You Mean: Lessons From A Decade of Program RepairIt Does What You Say, Not What You Mean: Lessons From A Decade of Program Repair
It Does What You Say, Not What You Mean: Lessons From A Decade of Program RepairClaire Le Goues
 
ACM Chicago March 2019 meeting: Software Engineering and AI - Prof. Tao Xie, ...
ACM Chicago March 2019 meeting: Software Engineering and AI - Prof. Tao Xie, ...ACM Chicago March 2019 meeting: Software Engineering and AI - Prof. Tao Xie, ...
ACM Chicago March 2019 meeting: Software Engineering and AI - Prof. Tao Xie, ...ACM Chicago
 
User Expectations in Mobile App Security
User Expectations in Mobile App SecurityUser Expectations in Mobile App Security
User Expectations in Mobile App SecurityTao Xie
 
Innoslate 101: A Webinar for New Users
Innoslate 101: A Webinar for New Users Innoslate 101: A Webinar for New Users
Innoslate 101: A Webinar for New Users SarahCraig7
 
What is the Lifecycle Modeling Language?
What is the Lifecycle Modeling Language?What is the Lifecycle Modeling Language?
What is the Lifecycle Modeling Language?SarahCraig7
 
Software Mining and Software Datasets
Software Mining and Software DatasetsSoftware Mining and Software Datasets
Software Mining and Software DatasetsTao Xie
 
What are Model-Based Reviews
What are Model-Based ReviewsWhat are Model-Based Reviews
What are Model-Based ReviewsSarahCraig7
 
Troubleshooting Deep Neural Networks - Full Stack Deep Learning
Troubleshooting Deep Neural Networks - Full Stack Deep LearningTroubleshooting Deep Neural Networks - Full Stack Deep Learning
Troubleshooting Deep Neural Networks - Full Stack Deep LearningSergey Karayev
 
Software Analytics: Data Analytics for Software Engineering
Software Analytics: Data Analytics for Software EngineeringSoftware Analytics: Data Analytics for Software Engineering
Software Analytics: Data Analytics for Software EngineeringTao Xie
 
Help! I need an empirical study for my PhD!
Help! I need an empirical study for my PhD!Help! I need an empirical study for my PhD!
Help! I need an empirical study for my PhD!Walid Maalej
 
Empirical evaluation in 2020: how big, how beautiful?
Empirical evaluation in 2020: how big, how beautiful?Empirical evaluation in 2020: how big, how beautiful?
Empirical evaluation in 2020: how big, how beautiful?Massimiliano Di Penta
 
ISEC'18 Tutorial: Research Methodology on Pursuing Impact-Driven Research
ISEC'18 Tutorial: Research Methodology on Pursuing Impact-Driven ResearchISEC'18 Tutorial: Research Methodology on Pursuing Impact-Driven Research
ISEC'18 Tutorial: Research Methodology on Pursuing Impact-Driven ResearchTao Xie
 
Lecture 13: ML Teams (Full Stack Deep Learning - Spring 2021)
Lecture 13: ML Teams (Full Stack Deep Learning - Spring 2021)Lecture 13: ML Teams (Full Stack Deep Learning - Spring 2021)
Lecture 13: ML Teams (Full Stack Deep Learning - Spring 2021)Sergey Karayev
 
MSRA 2018: Intelligent Software Engineering: Synergy between AI and Software ...
MSRA 2018: Intelligent Software Engineering: Synergy between AI and Software ...MSRA 2018: Intelligent Software Engineering: Synergy between AI and Software ...
MSRA 2018: Intelligent Software Engineering: Synergy between AI and Software ...Tao Xie
 
Creativity vs Best Practices
Creativity vs Best PracticesCreativity vs Best Practices
Creativity vs Best PracticesSupun Dissanayake
 
Analytics for software development
Analytics for software developmentAnalytics for software development
Analytics for software developmentThomas Zimmermann
 
ISEC'18 Keynote: Intelligent Software Engineering: Synergy between AI and Sof...
ISEC'18 Keynote: Intelligent Software Engineering: Synergy between AI and Sof...ISEC'18 Keynote: Intelligent Software Engineering: Synergy between AI and Sof...
ISEC'18 Keynote: Intelligent Software Engineering: Synergy between AI and Sof...Tao Xie
 
Verification and Validation with Innoslate Slide Deck
Verification and Validation with Innoslate Slide DeckVerification and Validation with Innoslate Slide Deck
Verification and Validation with Innoslate Slide DeckSarahCraig7
 
Publish or Perish: Questioning the Impact of Our Research on the Software Dev...
Publish or Perish: Questioning the Impact of Our Research on the Software Dev...Publish or Perish: Questioning the Impact of Our Research on the Software Dev...
Publish or Perish: Questioning the Impact of Our Research on the Software Dev...Margaret-Anne Storey
 

La actualidad más candente (20)

HotSoS16 Tutorial "Text Analytics for Security" by Tao Xie and William Enck
HotSoS16 Tutorial "Text Analytics for Security" by Tao Xie and William EnckHotSoS16 Tutorial "Text Analytics for Security" by Tao Xie and William Enck
HotSoS16 Tutorial "Text Analytics for Security" by Tao Xie and William Enck
 
It Does What You Say, Not What You Mean: Lessons From A Decade of Program Repair
It Does What You Say, Not What You Mean: Lessons From A Decade of Program RepairIt Does What You Say, Not What You Mean: Lessons From A Decade of Program Repair
It Does What You Say, Not What You Mean: Lessons From A Decade of Program Repair
 
ACM Chicago March 2019 meeting: Software Engineering and AI - Prof. Tao Xie, ...
ACM Chicago March 2019 meeting: Software Engineering and AI - Prof. Tao Xie, ...ACM Chicago March 2019 meeting: Software Engineering and AI - Prof. Tao Xie, ...
ACM Chicago March 2019 meeting: Software Engineering and AI - Prof. Tao Xie, ...
 
User Expectations in Mobile App Security
User Expectations in Mobile App SecurityUser Expectations in Mobile App Security
User Expectations in Mobile App Security
 
Innoslate 101: A Webinar for New Users
Innoslate 101: A Webinar for New Users Innoslate 101: A Webinar for New Users
Innoslate 101: A Webinar for New Users
 
What is the Lifecycle Modeling Language?
What is the Lifecycle Modeling Language?What is the Lifecycle Modeling Language?
What is the Lifecycle Modeling Language?
 
Software Mining and Software Datasets
Software Mining and Software DatasetsSoftware Mining and Software Datasets
Software Mining and Software Datasets
 
What are Model-Based Reviews
What are Model-Based ReviewsWhat are Model-Based Reviews
What are Model-Based Reviews
 
Troubleshooting Deep Neural Networks - Full Stack Deep Learning
Troubleshooting Deep Neural Networks - Full Stack Deep LearningTroubleshooting Deep Neural Networks - Full Stack Deep Learning
Troubleshooting Deep Neural Networks - Full Stack Deep Learning
 
Software Analytics: Data Analytics for Software Engineering
Software Analytics: Data Analytics for Software EngineeringSoftware Analytics: Data Analytics for Software Engineering
Software Analytics: Data Analytics for Software Engineering
 
Help! I need an empirical study for my PhD!
Help! I need an empirical study for my PhD!Help! I need an empirical study for my PhD!
Help! I need an empirical study for my PhD!
 
Empirical evaluation in 2020: how big, how beautiful?
Empirical evaluation in 2020: how big, how beautiful?Empirical evaluation in 2020: how big, how beautiful?
Empirical evaluation in 2020: how big, how beautiful?
 
ISEC'18 Tutorial: Research Methodology on Pursuing Impact-Driven Research
ISEC'18 Tutorial: Research Methodology on Pursuing Impact-Driven ResearchISEC'18 Tutorial: Research Methodology on Pursuing Impact-Driven Research
ISEC'18 Tutorial: Research Methodology on Pursuing Impact-Driven Research
 
Lecture 13: ML Teams (Full Stack Deep Learning - Spring 2021)
Lecture 13: ML Teams (Full Stack Deep Learning - Spring 2021)Lecture 13: ML Teams (Full Stack Deep Learning - Spring 2021)
Lecture 13: ML Teams (Full Stack Deep Learning - Spring 2021)
 
MSRA 2018: Intelligent Software Engineering: Synergy between AI and Software ...
MSRA 2018: Intelligent Software Engineering: Synergy between AI and Software ...MSRA 2018: Intelligent Software Engineering: Synergy between AI and Software ...
MSRA 2018: Intelligent Software Engineering: Synergy between AI and Software ...
 
Creativity vs Best Practices
Creativity vs Best PracticesCreativity vs Best Practices
Creativity vs Best Practices
 
Analytics for software development
Analytics for software developmentAnalytics for software development
Analytics for software development
 
ISEC'18 Keynote: Intelligent Software Engineering: Synergy between AI and Sof...
ISEC'18 Keynote: Intelligent Software Engineering: Synergy between AI and Sof...ISEC'18 Keynote: Intelligent Software Engineering: Synergy between AI and Sof...
ISEC'18 Keynote: Intelligent Software Engineering: Synergy between AI and Sof...
 
Verification and Validation with Innoslate Slide Deck
Verification and Validation with Innoslate Slide DeckVerification and Validation with Innoslate Slide Deck
Verification and Validation with Innoslate Slide Deck
 
Publish or Perish: Questioning the Impact of Our Research on the Software Dev...
Publish or Perish: Questioning the Impact of Our Research on the Software Dev...Publish or Perish: Questioning the Impact of Our Research on the Software Dev...
Publish or Perish: Questioning the Impact of Our Research on the Software Dev...
 

Similar a On Impact in Software Engineering Research

SCAM 2012 Keynote Slides on Cooperative Testing and Analysis by Tao Xie
SCAM 2012 Keynote Slides on Cooperative Testing and Analysis by Tao XieSCAM 2012 Keynote Slides on Cooperative Testing and Analysis by Tao Xie
SCAM 2012 Keynote Slides on Cooperative Testing and Analysis by Tao XieTao Xie
 
SBQS 2013 Keynote: Cooperative Testing and Analysis
SBQS 2013 Keynote: Cooperative Testing and AnalysisSBQS 2013 Keynote: Cooperative Testing and Analysis
SBQS 2013 Keynote: Cooperative Testing and AnalysisTao Xie
 
Synergy of Human and Artificial Intelligence in Software Engineering
Synergy of Human and Artificial Intelligence in Software EngineeringSynergy of Human and Artificial Intelligence in Software Engineering
Synergy of Human and Artificial Intelligence in Software EngineeringTao Xie
 
TechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - Trivadis
TechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - TrivadisTechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - Trivadis
TechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - TrivadisTrivadis
 
Introduction to object oriented language
Introduction to object oriented languageIntroduction to object oriented language
Introduction to object oriented languagefarhan amjad
 
Software Development and Quality
Software Development and QualitySoftware Development and Quality
Software Development and QualityHerwig Habenbacher
 
Software Analytics - Achievements and Challenges
Software Analytics - Achievements and ChallengesSoftware Analytics - Achievements and Challenges
Software Analytics - Achievements and ChallengesTao Xie
 
Believe it or not - keynote CAS 2015
Believe it or not - keynote CAS 2015Believe it or not - keynote CAS 2015
Believe it or not - keynote CAS 2015lantoli
 
Product! - The road to production deployment
Product! - The road to production deploymentProduct! - The road to production deployment
Product! - The road to production deploymentFilippo Zanella
 
Diagnosability vs The Cloud
Diagnosability vs The CloudDiagnosability vs The Cloud
Diagnosability vs The CloudBob Rhubart
 
Diagnosability versus The Cloud, Redwood Shores 2011-08-30
Diagnosability versus The Cloud, Redwood Shores 2011-08-30Diagnosability versus The Cloud, Redwood Shores 2011-08-30
Diagnosability versus The Cloud, Redwood Shores 2011-08-30Cary Millsap
 
Abcd iqs ssoftware-projects-mercecrosas
Abcd iqs ssoftware-projects-mercecrosasAbcd iqs ssoftware-projects-mercecrosas
Abcd iqs ssoftware-projects-mercecrosasMerce Crosas
 
The Semantic Knowledge Graph
The Semantic Knowledge GraphThe Semantic Knowledge Graph
The Semantic Knowledge GraphTrey Grainger
 
Andrea Mocci: Beautiful Design, Beautiful Coding at I T.A.K.E. Unconference 2015
Andrea Mocci: Beautiful Design, Beautiful Coding at I T.A.K.E. Unconference 2015Andrea Mocci: Beautiful Design, Beautiful Coding at I T.A.K.E. Unconference 2015
Andrea Mocci: Beautiful Design, Beautiful Coding at I T.A.K.E. Unconference 2015Mozaic Works
 
The Good the Bad and the Ugly of Dealing with Smelly Code (ITAKE Unconference)
The Good the Bad and the Ugly of Dealing with Smelly Code (ITAKE Unconference)The Good the Bad and the Ugly of Dealing with Smelly Code (ITAKE Unconference)
The Good the Bad and the Ugly of Dealing with Smelly Code (ITAKE Unconference)Radu Marinescu
 
Towards effective bug triage with software data reduction techniques
Towards effective bug triage with software data reduction techniquesTowards effective bug triage with software data reduction techniques
Towards effective bug triage with software data reduction techniquesPvrtechnologies Nellore
 
2014-10-10-SBC361-Reproducible research
2014-10-10-SBC361-Reproducible research2014-10-10-SBC361-Reproducible research
2014-10-10-SBC361-Reproducible researchYannick Wurm
 

Similar a On Impact in Software Engineering Research (20)

SCAM 2012 Keynote Slides on Cooperative Testing and Analysis by Tao Xie
SCAM 2012 Keynote Slides on Cooperative Testing and Analysis by Tao XieSCAM 2012 Keynote Slides on Cooperative Testing and Analysis by Tao Xie
SCAM 2012 Keynote Slides on Cooperative Testing and Analysis by Tao Xie
 
SBQS 2013 Keynote: Cooperative Testing and Analysis
SBQS 2013 Keynote: Cooperative Testing and AnalysisSBQS 2013 Keynote: Cooperative Testing and Analysis
SBQS 2013 Keynote: Cooperative Testing and Analysis
 
Synergy of Human and Artificial Intelligence in Software Engineering
Synergy of Human and Artificial Intelligence in Software EngineeringSynergy of Human and Artificial Intelligence in Software Engineering
Synergy of Human and Artificial Intelligence in Software Engineering
 
NEXiDA at OMG June 2009
NEXiDA at OMG June 2009NEXiDA at OMG June 2009
NEXiDA at OMG June 2009
 
TechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - Trivadis
TechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - TrivadisTechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - Trivadis
TechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - Trivadis
 
Introduction to object oriented language
Introduction to object oriented languageIntroduction to object oriented language
Introduction to object oriented language
 
Ch01lect1 et
Ch01lect1 etCh01lect1 et
Ch01lect1 et
 
Software Development and Quality
Software Development and QualitySoftware Development and Quality
Software Development and Quality
 
Software Analytics - Achievements and Challenges
Software Analytics - Achievements and ChallengesSoftware Analytics - Achievements and Challenges
Software Analytics - Achievements and Challenges
 
Believe it or not - keynote CAS 2015
Believe it or not - keynote CAS 2015Believe it or not - keynote CAS 2015
Believe it or not - keynote CAS 2015
 
Product! - The road to production deployment
Product! - The road to production deploymentProduct! - The road to production deployment
Product! - The road to production deployment
 
Diagnosability vs The Cloud
Diagnosability vs The CloudDiagnosability vs The Cloud
Diagnosability vs The Cloud
 
Diagnosability versus The Cloud, Redwood Shores 2011-08-30
Diagnosability versus The Cloud, Redwood Shores 2011-08-30Diagnosability versus The Cloud, Redwood Shores 2011-08-30
Diagnosability versus The Cloud, Redwood Shores 2011-08-30
 
Abcd iqs ssoftware-projects-mercecrosas
Abcd iqs ssoftware-projects-mercecrosasAbcd iqs ssoftware-projects-mercecrosas
Abcd iqs ssoftware-projects-mercecrosas
 
The Semantic Knowledge Graph
The Semantic Knowledge GraphThe Semantic Knowledge Graph
The Semantic Knowledge Graph
 
Andrea Mocci: Beautiful Design, Beautiful Coding at I T.A.K.E. Unconference 2015
Andrea Mocci: Beautiful Design, Beautiful Coding at I T.A.K.E. Unconference 2015Andrea Mocci: Beautiful Design, Beautiful Coding at I T.A.K.E. Unconference 2015
Andrea Mocci: Beautiful Design, Beautiful Coding at I T.A.K.E. Unconference 2015
 
The Good the Bad and the Ugly of Dealing with Smelly Code (ITAKE Unconference)
The Good the Bad and the Ugly of Dealing with Smelly Code (ITAKE Unconference)The Good the Bad and the Ugly of Dealing with Smelly Code (ITAKE Unconference)
The Good the Bad and the Ugly of Dealing with Smelly Code (ITAKE Unconference)
 
Msr2021 tutorial-di penta
Msr2021 tutorial-di pentaMsr2021 tutorial-di penta
Msr2021 tutorial-di penta
 
Towards effective bug triage with software data reduction techniques
Towards effective bug triage with software data reduction techniquesTowards effective bug triage with software data reduction techniques
Towards effective bug triage with software data reduction techniques
 
2014-10-10-SBC361-Reproducible research
2014-10-10-SBC361-Reproducible research2014-10-10-SBC361-Reproducible research
2014-10-10-SBC361-Reproducible research
 

Más de CISPA Helmholtz Center for Information Security

Más de CISPA Helmholtz Center for Information Security (15)

Language-Based Testing and Debugging.pdf
Language-Based Testing and Debugging.pdfLanguage-Based Testing and Debugging.pdf
Language-Based Testing and Debugging.pdf
 
Digital Networking and Community
Digital Networking and CommunityDigital Networking and Community
Digital Networking and Community
 
Fuzzing - A Tale of Two Cultures
Fuzzing - A Tale of Two CulturesFuzzing - A Tale of Two Cultures
Fuzzing - A Tale of Two Cultures
 
Illustrated Code (ASE 2021)
Illustrated Code (ASE 2021)Illustrated Code (ASE 2021)
Illustrated Code (ASE 2021)
 
Fast and Effective Fuzz Testing (Facebook TAV 2019)
Fast and Effective Fuzz Testing (Facebook TAV 2019)Fast and Effective Fuzz Testing (Facebook TAV 2019)
Fast and Effective Fuzz Testing (Facebook TAV 2019)
 
Software-Tests automatisch erzeugen: Frische Ansätze für Forschung, Praxis un...
Software-Tests automatisch erzeugen: Frische Ansätze für Forschung, Praxis un...Software-Tests automatisch erzeugen: Frische Ansätze für Forschung, Praxis un...
Software-Tests automatisch erzeugen: Frische Ansätze für Forschung, Praxis un...
 
Twelve tips on how to prepare an ERC grant proposal
Twelve tips on how to prepare an ERC grant proposalTwelve tips on how to prepare an ERC grant proposal
Twelve tips on how to prepare an ERC grant proposal
 
Getting your work funded
Getting your work fundedGetting your work funded
Getting your work funded
 
Learning from 6,000 projects mining specifications in the large
Learning from 6,000 projects   mining specifications in the largeLearning from 6,000 projects   mining specifications in the large
Learning from 6,000 projects mining specifications in the large
 
Debugging Debugging
Debugging DebuggingDebugging Debugging
Debugging Debugging
 
Seeding Bugs To Find Bugs
Seeding Bugs To Find BugsSeeding Bugs To Find Bugs
Seeding Bugs To Find Bugs
 
Mining Processes
Mining ProcessesMining Processes
Mining Processes
 
Mining Programs
Mining ProgramsMining Programs
Mining Programs
 
Getting your paper accepted (at ISSTA 2008)
Getting your paper accepted (at ISSTA 2008)Getting your paper accepted (at ISSTA 2008)
Getting your paper accepted (at ISSTA 2008)
 
Woher kommen Software-Fehler?
Woher kommen Software-Fehler?Woher kommen Software-Fehler?
Woher kommen Software-Fehler?
 

Último

lect1 introduction.pptx microbiology ppt
lect1 introduction.pptx microbiology pptlect1 introduction.pptx microbiology ppt
lect1 introduction.pptx microbiology pptzbyb6vmmsd
 
Pests of Sunflower_Binomics_Identification_Dr.UPR
Pests of Sunflower_Binomics_Identification_Dr.UPRPests of Sunflower_Binomics_Identification_Dr.UPR
Pests of Sunflower_Binomics_Identification_Dr.UPRPirithiRaju
 
Immunoblott technique for protein detection.ppt
Immunoblott technique for protein detection.pptImmunoblott technique for protein detection.ppt
Immunoblott technique for protein detection.pptAmirRaziq1
 
Combining Asynchronous Task Parallelism and Intel SGX for Secure Deep Learning
Combining Asynchronous Task Parallelism and Intel SGX for Secure Deep LearningCombining Asynchronous Task Parallelism and Intel SGX for Secure Deep Learning
Combining Asynchronous Task Parallelism and Intel SGX for Secure Deep Learningvschiavoni
 
Production technology of Brinjal -Solanum melongena
Production technology of Brinjal -Solanum melongenaProduction technology of Brinjal -Solanum melongena
Production technology of Brinjal -Solanum melongenajana861314
 
EGYPTIAN IMPRINT IN SPAIN Lecture by Dr Abeer Zahana
EGYPTIAN IMPRINT IN SPAIN Lecture by Dr Abeer ZahanaEGYPTIAN IMPRINT IN SPAIN Lecture by Dr Abeer Zahana
EGYPTIAN IMPRINT IN SPAIN Lecture by Dr Abeer ZahanaDr.Mahmoud Abbas
 
complex analysis best book for solving questions.pdf
complex analysis best book for solving questions.pdfcomplex analysis best book for solving questions.pdf
complex analysis best book for solving questions.pdfSubhamKumar3239
 
Interpreting SDSS extragalactic data in the era of JWST
Interpreting SDSS extragalactic data in the era of JWSTInterpreting SDSS extragalactic data in the era of JWST
Interpreting SDSS extragalactic data in the era of JWSTAlexander F. Mayer
 
Harry Coumnas Thinks That Human Teleportation May Ensure Humanity's Survival
Harry Coumnas Thinks That Human Teleportation May Ensure Humanity's SurvivalHarry Coumnas Thinks That Human Teleportation May Ensure Humanity's Survival
Harry Coumnas Thinks That Human Teleportation May Ensure Humanity's Survivalkevin8smith
 
KDIGO-2023-CKD-Guideline-Public-Review-Draft_5-July-2023.pdf
KDIGO-2023-CKD-Guideline-Public-Review-Draft_5-July-2023.pdfKDIGO-2023-CKD-Guideline-Public-Review-Draft_5-July-2023.pdf
KDIGO-2023-CKD-Guideline-Public-Review-Draft_5-July-2023.pdfGABYFIORELAMALPARTID1
 
Timeless Cosmology: Towards a Geometric Origin of Cosmological Correlations
Timeless Cosmology: Towards a Geometric Origin of Cosmological CorrelationsTimeless Cosmology: Towards a Geometric Origin of Cosmological Correlations
Timeless Cosmology: Towards a Geometric Origin of Cosmological CorrelationsDanielBaumann11
 
Advances in AI-driven Image Recognition for Early Detection of Cancer
Advances in AI-driven Image Recognition for Early Detection of CancerAdvances in AI-driven Image Recognition for Early Detection of Cancer
Advances in AI-driven Image Recognition for Early Detection of CancerLuis Miguel Chong Chong
 
Environmental acoustics- noise criteria.pptx
Environmental acoustics- noise criteria.pptxEnvironmental acoustics- noise criteria.pptx
Environmental acoustics- noise criteria.pptxpriyankatabhane
 
GLYCOSIDES Classification Of GLYCOSIDES Chemical Tests Glycosides
GLYCOSIDES Classification Of GLYCOSIDES  Chemical Tests GlycosidesGLYCOSIDES Classification Of GLYCOSIDES  Chemical Tests Glycosides
GLYCOSIDES Classification Of GLYCOSIDES Chemical Tests GlycosidesNandakishor Bhaurao Deshmukh
 
Gas-ExchangeS-in-Plants-and-Animals.pptx
Gas-ExchangeS-in-Plants-and-Animals.pptxGas-ExchangeS-in-Plants-and-Animals.pptx
Gas-ExchangeS-in-Plants-and-Animals.pptxGiovaniTrinidad
 
Efficient Fourier Pricing of Multi-Asset Options: Quasi-Monte Carlo & Domain ...
Efficient Fourier Pricing of Multi-Asset Options: Quasi-Monte Carlo & Domain ...Efficient Fourier Pricing of Multi-Asset Options: Quasi-Monte Carlo & Domain ...
Efficient Fourier Pricing of Multi-Asset Options: Quasi-Monte Carlo & Domain ...Chiheb Ben Hammouda
 
FBI Profiling - Forensic Psychology.pptx
FBI Profiling - Forensic Psychology.pptxFBI Profiling - Forensic Psychology.pptx
FBI Profiling - Forensic Psychology.pptxPayal Shrivastava
 

Último (20)

PLASMODIUM. PPTX
PLASMODIUM. PPTXPLASMODIUM. PPTX
PLASMODIUM. PPTX
 
lect1 introduction.pptx microbiology ppt
lect1 introduction.pptx microbiology pptlect1 introduction.pptx microbiology ppt
lect1 introduction.pptx microbiology ppt
 
Pests of Sunflower_Binomics_Identification_Dr.UPR
Pests of Sunflower_Binomics_Identification_Dr.UPRPests of Sunflower_Binomics_Identification_Dr.UPR
Pests of Sunflower_Binomics_Identification_Dr.UPR
 
Immunoblott technique for protein detection.ppt
Immunoblott technique for protein detection.pptImmunoblott technique for protein detection.ppt
Immunoblott technique for protein detection.ppt
 
Combining Asynchronous Task Parallelism and Intel SGX for Secure Deep Learning
Combining Asynchronous Task Parallelism and Intel SGX for Secure Deep LearningCombining Asynchronous Task Parallelism and Intel SGX for Secure Deep Learning
Combining Asynchronous Task Parallelism and Intel SGX for Secure Deep Learning
 
Production technology of Brinjal -Solanum melongena
Production technology of Brinjal -Solanum melongenaProduction technology of Brinjal -Solanum melongena
Production technology of Brinjal -Solanum melongena
 
Interferons.pptx.
Interferons.pptx.Interferons.pptx.
Interferons.pptx.
 
EGYPTIAN IMPRINT IN SPAIN Lecture by Dr Abeer Zahana
EGYPTIAN IMPRINT IN SPAIN Lecture by Dr Abeer ZahanaEGYPTIAN IMPRINT IN SPAIN Lecture by Dr Abeer Zahana
EGYPTIAN IMPRINT IN SPAIN Lecture by Dr Abeer Zahana
 
complex analysis best book for solving questions.pdf
complex analysis best book for solving questions.pdfcomplex analysis best book for solving questions.pdf
complex analysis best book for solving questions.pdf
 
Interpreting SDSS extragalactic data in the era of JWST
Interpreting SDSS extragalactic data in the era of JWSTInterpreting SDSS extragalactic data in the era of JWST
Interpreting SDSS extragalactic data in the era of JWST
 
Harry Coumnas Thinks That Human Teleportation May Ensure Humanity's Survival
Harry Coumnas Thinks That Human Teleportation May Ensure Humanity's SurvivalHarry Coumnas Thinks That Human Teleportation May Ensure Humanity's Survival
Harry Coumnas Thinks That Human Teleportation May Ensure Humanity's Survival
 
KDIGO-2023-CKD-Guideline-Public-Review-Draft_5-July-2023.pdf
KDIGO-2023-CKD-Guideline-Public-Review-Draft_5-July-2023.pdfKDIGO-2023-CKD-Guideline-Public-Review-Draft_5-July-2023.pdf
KDIGO-2023-CKD-Guideline-Public-Review-Draft_5-July-2023.pdf
 
Timeless Cosmology: Towards a Geometric Origin of Cosmological Correlations
Timeless Cosmology: Towards a Geometric Origin of Cosmological CorrelationsTimeless Cosmology: Towards a Geometric Origin of Cosmological Correlations
Timeless Cosmology: Towards a Geometric Origin of Cosmological Correlations
 
Advances in AI-driven Image Recognition for Early Detection of Cancer
Advances in AI-driven Image Recognition for Early Detection of CancerAdvances in AI-driven Image Recognition for Early Detection of Cancer
Advances in AI-driven Image Recognition for Early Detection of Cancer
 
Environmental acoustics- noise criteria.pptx
Environmental acoustics- noise criteria.pptxEnvironmental acoustics- noise criteria.pptx
Environmental acoustics- noise criteria.pptx
 
GLYCOSIDES Classification Of GLYCOSIDES Chemical Tests Glycosides
GLYCOSIDES Classification Of GLYCOSIDES  Chemical Tests GlycosidesGLYCOSIDES Classification Of GLYCOSIDES  Chemical Tests Glycosides
GLYCOSIDES Classification Of GLYCOSIDES Chemical Tests Glycosides
 
Gas-ExchangeS-in-Plants-and-Animals.pptx
Gas-ExchangeS-in-Plants-and-Animals.pptxGas-ExchangeS-in-Plants-and-Animals.pptx
Gas-ExchangeS-in-Plants-and-Animals.pptx
 
Efficient Fourier Pricing of Multi-Asset Options: Quasi-Monte Carlo & Domain ...
Efficient Fourier Pricing of Multi-Asset Options: Quasi-Monte Carlo & Domain ...Efficient Fourier Pricing of Multi-Asset Options: Quasi-Monte Carlo & Domain ...
Efficient Fourier Pricing of Multi-Asset Options: Quasi-Monte Carlo & Domain ...
 
Introduction Classification Of Alkaloids
Introduction Classification Of AlkaloidsIntroduction Classification Of Alkaloids
Introduction Classification Of Alkaloids
 
FBI Profiling - Forensic Psychology.pptx
FBI Profiling - Forensic Psychology.pptxFBI Profiling - Forensic Psychology.pptx
FBI Profiling - Forensic Psychology.pptx
 

On Impact in Software Engineering Research

  • 1. ON IMPACT IN SOFTWARE ENGINEERING RESEARCH ANDREAS ZELLER, CISPA / SAARLAND UNIVERSITY SEMINAR ON RESEARCH TRAINING IN SOFTWARE ENGINEERING DAGSTUHL, JANUARY 15, 2018
  • 2. ANDREAS ZELLER: KEY FACTS • PhD in 1997 on Configuration Management with Feature Logic • Since 2001 professor in Saarbrücken (Saarland University / CISPA) • Four 10-year impact awards 2009–2017 (for papers 1999–2007) • ACM Fellow in 2010 • Google Focused Impact Award in 2010 • ERC Advanced Grant in 2011
  • 4. WHAT IS IMPACT? • How do your actions change the world? • Often measured in citations, publications, funding, people, … • All these are indicators of impact, but not goals in themselves • We want to make the world a better place • Gives meaning and purpose to our (professional) life
  • 5. WHAT MAKES IMPACTFUL RESEARCH? • Intellectual challenge – was it hard, or could anyone have done this? • Elegance – is your research specific to a context, or can it be reused again and again? • Usefulness – can someone make money with it? • Innovation is the delta in any of these metrics
  • 6. PEOPLE LAUGH AT US • Programming Languages folks miss the intellectual challenge • Formal Methods folks miss elegance and challenge • Industry folks miss the usefulness • We recluse in our private bubbles
  • 7. MY PATH TO IMPACT
  • 8. MY PATH TO IMPACT • Life can only be understood backwards; but it must be lived forwards
 (Søren Kierkegaard)
  • 9. CONFIGURATION MANAGEMENT
 WITH FEATURE LOGIC (1991–1997) • Topic defined by my PhD advisor Gregor Snelting • Idea: Formally describe variants and revisions with feature logic • “A unified model for configuration management” 3.3 Combining Delta Features and other Features .......... bug fixed f functionbug fixed f function bug fixed f function procedurebug fixed f function procedure bug fixed f function bug fixed f procedure Figure 8: Delta features and other features . This is only natural, since the change relies on the presence of the change. Adding a new revision under implies that the revision is now tagged w , so that the generalization property is not violated. Indeed, .
  • 10. FEATURE LOGIC: LESSONS LEARNED • You can get plenty of papers accepted • even if you miss the problem • even if you do not evaluate • “Modeling for the sake of modeling” • Enabled much of my later work, though
  • 11. DDD (1994–1999) • During PhD, programmed a lot • Debugging was hard! • Built the DDD debugger GUI
 with my student Dorothea Lütkehaus • Welcome change from formal work
  • 12. DDD (1994–1999) • DDD was among the first dev tools
 with a “professional” GUI • Downloaded by the tens of thousands • Adopted as a GNU project:
 Street credibility with developers • Impact through usefulness
  • 13. DDD: LESSONS LEARNED • Work on a real problem • Assume as little as possible • Keep things simple –  “real” as in “real world”, not “real papers” –  make things fit into real processes –  complexity impresses, but prevents impact
  • 14. DELTA DEBUGGING (1999–2003) • After PhD, looking for new topic • Delta Debugging brought together debugging and version control • Isolate failure causes through repeated experiments
  • 15. DELTA DEBUGGING (1999–2003) • Delta debugging was a bomb • Easy to teach + understand • 7 lines of algorithm
 (and 25 lines of Python) • Spent two years on these ⎧ ⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎨ ⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎩ (c′ ✔, c′ ✘) if |∆| = 1 dd′ (c′ ✘ ∆i, c′ ✘, 2) if ∃i ∈ {1..n} · test(c′ ✘ ∆i) = ✔ dd′ (c′ ✔, c′ ✔ ∪ ∆i, 2) if ∃i ∈ {1..n} · test(c′ ✔ ∪ ∆i) = ✘ dd′ c′ ✔ ∪ ∆i, c′ ✘, max(n − 1, 2) else if ∃i ∈ {1..n} · test(c′ ✔ ∪ ∆i) = ✔ dd′ c′ ✔, c′ ✘ ∆i, max(n − 1, 2) else if ∃i ∈ {1..n} · test(c′ ✘ ∆i) = ✘ dd′ c′ ✔, c′ ✘, min(2n, |∆|) else if n < |∆| (“increase granularity”) (c′ ✔, c′ ✘) otherwise dd(c✔, c✘) = dd′ (c✔, c✘, 2) dd′ (c′ ✔, c′ ✘, n) =
  • 16. DELTA DEBUGGING: LESSONS LEARNED • Work on a real problem • Assume as little as possible • Keep things simple • Have a sound model –  Version control? tests? Never heard of it – 25 lines of Python is probably excessive –  DD was my version model reborn –  Why debug? We build correct software
  • 17. MINING SOFTWARE ARCHIVES (2003–2010) • In the early 2000s, open-source version repositories became available • Stephan Diehl saw an opportunity for visualization and approached me • Quickly expanded into data mining • Tom Zimmermann: our MSc student • Work of a research team
  • 18. MINING SOFTWARE ARCHIVES (2003–2010) • Our 2004 paper was the first ICSE paper on mining software archives • Handful of competing groups;
 instant hit • MSR now a conference on its own • Paper has 1200+ citations so far • Impact at Microsoft, Google, SAP…
  • 19. MINING SOFTWARE ARCHIVES (2003–2010) • We are now after the gold rush • Data still exciting (if you have some) • Few new insights on old data • Get out of a field when too crowded 3.5 Programmer Actions and Defects Now that we know how to predict defects, can we actually prevent them? Of course, we could focus quality assurance on those files predicted as most defect-prone. But are there also constructive ways to avoid these defects? Is there a general rule to learn? For this purpose, let us now focus on H2: Is there a correlation between individual actions (= keystrokes) and defects? For this purpose, we would search for correlations between the count of the 256 characters and the overall post-defect count per file; our null hypothesis would be: H0. There is no correlation between character distribution and defect-proneness. After a number of preliminary experiments, we focused on the Eclipse 3.0 dataset. It is well known that most metrics of software do not follow a normal distribution and our measures of key- strokes are no exception. The distributions of characters appear to have an exponential rather than a power-law character. Nonethe- less, due to the heavily skewed distribution, we used a standard Our results show a strong correlation between specific pro- Figure 2: Color-coding keys by their defect correlation; (red = strong). The five strongest correlations are highlighted. Figure 3: Defect correlation for the 26 lower-case letters.
  • 20. MINING SOFTWARE REPOSITORIES: LESSONS LEARNED • Work on a real problem • Assume as little as possible • Keep things simple • Have a sound model • Keep on learning –  Empirical research is core field of SE –  simple parsers for multiple languages –  essence of 2004 paper is one line of SQL –  retrieval, precision, recall, etc, etc –  statistics, data mining, machine learning
  • 21. MINING APP ARCHIVES (2014–) • How do we know an app does what it should do? • CHABADA checks for mismatches between description and behavior • Novel usage of NLP; 
 novel app store mining Alessandra Gorla · Ilaria Tavecchia⇤ · Florian Gross · Andreas Zeller Saarland University Saarbrücken, Germany {gorla, tavecchia, fgross, zeller}@cs.uni-saarland.de ABSTRACT How do we know a program does what it claims to do? After clus- tering Android apps by their description topics, we identify outliers in each cluster with respect to their API usage. A “weather” app that sends messages thus becomes an anomaly; likewise, a “messaging” app would typically not be expected to access the current location. Applied on a set of 22,500+ Android applications, our CHABADA prototype identified several anomalies; additionally, it flagged 56% of novel malware as such, without requiring any known malware patterns. Categories and Subject Descriptors D.4.6 [Security and Protection]: Invasive software General Terms Security Keywords Android, malware detection, description analysis, clustering 1. App collection 2. Topics "Weather", "Map"… "Travel", "Map"… "Theme" 3. Clusters Weather + Travel Themes Access-LocationInternet Access-LocationInternet Send-SMS 4. Used APIs 5. Outliers Weather + Travel Figure 1: Detecting applications with unadvertised behavior. Starting from a collection of “good” apps (1), we identify their description topics (2) to form clusters of related apps (3). For each cluster, we identify the sentitive APIs used (4), and can
  • 22. MINING APP ARCHIVES (2014–) • The ICSE paper of 2014 is among most cited • CHABADA techniques now adopted by Google and Microsoft • Most of your mobile apps have gone through such an analysis :-) Alessandra Gorla · Ilaria Tavecchia⇤ · Florian Gross · Andreas Zeller Saarland University Saarbrücken, Germany {gorla, tavecchia, fgross, zeller}@cs.uni-saarland.de ABSTRACT How do we know a program does what it claims to do? After clus- tering Android apps by their description topics, we identify outliers in each cluster with respect to their API usage. A “weather” app that sends messages thus becomes an anomaly; likewise, a “messaging” app would typically not be expected to access the current location. Applied on a set of 22,500+ Android applications, our CHABADA prototype identified several anomalies; additionally, it flagged 56% of novel malware as such, without requiring any known malware patterns. Categories and Subject Descriptors D.4.6 [Security and Protection]: Invasive software General Terms Security Keywords Android, malware detection, description analysis, clustering 1. App collection 2. Topics "Weather", "Map"… "Travel", "Map"… "Theme" 3. Clusters Weather + Travel Themes Access-LocationInternet Access-LocationInternet Send-SMS 4. Used APIs 5. Outliers Weather + Travel Figure 1: Detecting applications with unadvertised behavior. Starting from a collection of “good” apps (1), we identify their description topics (2) to form clusters of related apps (3). For each cluster, we identify the sentitive APIs used (4), and can
  • 23. MINING APPS: LESSONS LEARNED • Work on a real problem • Assume as little as possible • Keep things simple • Have a sound model • Keep on learning • Keep on moving –  Yes, there is malware –  Descriptions and APIs –  Standard NLP techniques –  Standard NLP methods –  NLP, machine learning, recommendation –  Security starts with SE
  • 24. MORE THINGS I DID • Automatic repair • Automatic parallelization • Automatic website testing • Structured fuzzing • Automatic sandboxing –  Wesley Weimer beat us to it –  Struggled with complexity –  Langfuzz found 4000 bugs at Mozilla –  lots of potential in here –  Built a company for that
  • 25. THINGS I STAYED AWAY FROM • Symbolic techniques • Formal methods • Modeling • Architecture • Work on a real problem • Assume as little as possible • Keep things simple • Have a sound model • Keep on learning • Keep on moving
  • 26. YOUR WAYS TO HAVE IMPACT
  • 27. IMPACT AS A RESEARCHER • Society funds research to take risks that no one else does • Research is risky by construction – 
 you should expect to fail, and fail again • Tenure is meant to allow you to take arbitrarily grand challenges –
 so work on the grand stuff • If you lack resources, try smarter and harder
  • 28. IMPACT AS A TEACHER • Teaching can be a great way to multiply your message • Not only focus on teaching the standards, but also your research • Teaching your research helps to propagate it and make it accessible • Engage students on topics dear to you
  • 29. IMPACT WITH INDUSTRY • Do work with industry to find problems and frame your work • Do not work with industry to solve (their) concrete problems • Your role as researcher is more than a cheap consulting tool • Many “research” funding schemes are there to subsidize industry
  • 30. IMPACT THROUGH TOOLS • Getting your technique out as a tool is a great way to have impact! • Also allows to check what actual users need (and if they exist) • A tool can have far more impact than a paper • Funding agencies and hiring committees begin to realize this
  • 31. IMPACT AS FOUNDER • Creating a company out of your research can be great fun! • Allows you to push your research and ideas into practice • Again, shows you what the market wants (and what not) • Plenty of monetary and consultancy support available
  • 32. IMPACT AS MENTOR • Working with advanced students (MSc, PhD, PostDoc) can be the most satisfying part of your job • The variety of SE research needs universal problem solving skills • Find such skills besides good grades
  • 33. A GREAT ENVIRONMENT • In 2000, Bernhard Westfechtel informed me of a position in Saarbrücken • I applied, got rejected, but dept would create a position just for me • Stayed there since 2001, despite several offers
  • 34. A GREAT ENVIRONMENT • Saarbrücken is the best-funded CS department in Germany • For every 1€ that Saarland spends, we bring in 5€ more • Great team and great team spirit • Excellence Cluster, two Max-Planck Institutes, DFKI • New CISPA Helmholtz Center for IT Security
  • 35. ® Visual
 Computing
 Institute Center for Information Security, Privacy and Accountability 375
PhD students 200
Post PhDs 9
 new buildings
 since 2001 11
 ERC grants 6
 Leibniz awards 4
 ACM Fellows 2600
BSc + MSc +400 +100 +7
  • 36. ® Visual
 Computing
 Institute Center for Information Security, Privacy and Accountability We’re hiringWe’re hiring!
  • 37. ® Visual
 Computing
 Institute Center for Information Security, Privacy and Accountability We’re hiring past impact
 future impact
 direct impact
 indirect impact
  • 38. ON IMPACT IN SOFTWARE ENGINEERING RESEARCH ANDREAS ZELLER, CISPA / SAARLAND UNIVERSITY
  • 39. LESSONS LEARNED:
 ON IMPACT IN SE RESEARCH • Work on a real problem • Assume as little as possible • Keep things simple • Have a sound model • Keep on learning • Keep on moving –  possibly bursting your bubble –  immediate impact on adoption –  complexity inhibits impact –  tools may fade away, concepts persist – learn new stuff and leverage it – do not stay in your cozy SE corner