A Closer Look at Real-world Patches
Kui Liu, Dongsun Kim, Anil Koyuncu, Tegawendé F. Bissyandé, and Yves Le Traon
Interdisciplinary Centre for Security, Reliability and Trust (SnT), University of Luxembourg, Luxembourg
Li Li
Monash Software Force (MSF), Monash University, Melbourne, Australia
34th ICSME 2018, Madrid, Spain. September 27, 2018
1
> Basic Process of Automated Program Repair (APR)
[Figure: APR pipeline. Passing and failing tests feed Fault Localization, which outputs suspicious buggy code; APR Tools generate a Patch Candidate; the candidate is run against the tests (Pass / Fail).]
Three questions: Where is the code to be fixed? How to generate patches? Is the patch correct?
2
> How many bugs are fixed by existing APR tools?
Benchmark: Defects4J [42] (395 bugs).
APR Tool      # Fixed bugs   # Correctly fixed bugs
jGenProg      29             5
jKali         22             1
jMutRepair    17             3
Nopol         35             5
HDRepair      23             6
ACS           23             18
ssFix         60             20
ELIXIR        41             26
JAID          26             9
CapGen        25             21
SketchFix     26             19
SimFix        56             34
Why are the quantity of bugs fixed by APR tools
and the quality of the patches they generate
so low?
3
> Scope Limitation of APR Tools
Fixing bugs at the statement level.
Bug Chart_1 in Defects4J fixed by jMutRepair, ELIXIR, ssFix,
JAID, SketchFix, CapGen, SimFix.
4
> Are Non-Statement Code Entities Bug-free?
Bug located in Type Declaration (Math-12 in Defects4J).
Bug located in Method Declaration (Lang-29 in Defects4J).
Bug located in Field Declaration (Lang-56 in Defects4J).
Bugs located in non-statement code entities.
None of the existing APR tools can fix these bugs.
5
> Statement Level VS. Finer Granularity Level
Statement level: UPD ReturnStatement.
This repair action is hard to reuse for fixing similar bugs.
Expression level: dim / 2 → 0.5 * dim.
Project: Commons-math.
Bug Report ID: MATH-929, “Fix truncated value.”
Commit cedf0d27f9e9341a9e9fa8a192735a0c2e11be40,
--- a/src/main/java/org/apache/commons/math3/distribution/MultivariateNormalDistribution.java
+++ b/src/main/java/org/apache/commons/math3/distribution/MultivariateNormalDistribution.java
@@ -895,1 +895,1 @@
- return FastMath.pow(2 * FastMath.PI, -dim / 2) *
+ return FastMath.pow(2 * FastMath.PI, -0.5 * dim) *
FastMath.pow(covarianceMatrixDeterminant, -0.5) * getExponentTerm(vals);
The fix pattern could be used to fix similar bugs.
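Why `dim / 2` truncates can be seen in a minimal sketch. It assumes `dim` is an `int`, as the commit log "Fix truncated value" suggests; the class and method names are illustrative and are not part of Commons Math.

```java
// Illustrative sketch only: isolates the exponent from the patch above.
class TruncatedExponentDemo {
    // Buggy form: -dim / 2 is integer division, so the fraction is dropped
    // before the result is widened to double.
    static double buggyExponent(int dim) {
        return -dim / 2;
    }

    // Fixed form: 0.5 is a double, so the whole product is computed in double.
    static double fixedExponent(int dim) {
        return -0.5 * dim;
    }

    public static void main(String[] args) {
        System.out.println(buggyExponent(3)); // prints -1.0
        System.out.println(fixedExponent(3)); // prints -1.5
    }
}
```

The two forms agree for even dimensions and differ only for odd ones, which is why a statement-level "update return statement" action says little, while the expression-level pattern `x / 2` → `0.5 * x` transfers to other code.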
6
> Objective
Deepen knowledge on repair ingredients
from real-world patches in a fine-grained
way for automated program repair.
7
STUDY DESIGN
8
> Research Questions.
RQ1. Do patches impact some specific statement types?
RQ2. Are there code elements in statements that are prone to be faulty?
RQ3. Which expression types are most impacted by patches?
RQ4. Which parts of buggy expressions are prone to be buggy?
In APR, fault localization techniques
(e.g., Tarantula [31], Ochiai [32], Ochiai2 [33], Zoltar [34], and
DStar [35]) are used to identify bug positions at the code-line level.
[Figure: a statement annotated with its parts: Data Type, Variable Name, Operator, Being-Assigned Expression.]
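Two of the fault localization techniques named on this slide, Ochiai and Tarantula, rank code lines by a suspiciousness score computed from test coverage. The sketch below uses the commonly published formulas; the class name, method names, and the sample counts are assumptions for illustration and do not come from any APR tool.

```java
// Illustrative sketch of two spectrum-based suspiciousness formulas.
// ef / nf: failing tests that do / do not execute the line.
// ep / np: passing tests that do / do not execute the line.
class SpectrumFaultLocalization {
    // Ochiai: ef / sqrt((ef + nf) * (ef + ep)); 0 when the denominator is 0.
    static double ochiai(int ef, int nf, int ep) {
        double denominator = Math.sqrt((double) (ef + nf) * (ef + ep));
        return denominator == 0.0 ? 0.0 : ef / denominator;
    }

    // Tarantula: failRatio / (failRatio + passRatio); 0 when both ratios are 0.
    static double tarantula(int ef, int nf, int ep, int np) {
        double failRatio = (ef + nf) == 0 ? 0.0 : (double) ef / (ef + nf);
        double passRatio = (ep + np) == 0 ? 0.0 : (double) ep / (ep + np);
        double sum = failRatio + passRatio;
        return sum == 0.0 ? 0.0 : failRatio / sum;
    }

    public static void main(String[] args) {
        // Hypothetical line covered by 2 of 2 failing tests and 1 of 8 passing tests.
        System.out.println(ochiai(2, 0, 1));
        System.out.println(tarantula(2, 0, 1, 7));
    }
}
```

Both scores are computed per line, which is the granularity the slide refers to; they point at a suspicious line but not at the faulty element inside it.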
9
> Bug-fixing Patches Collection
1) Keyword matching in commit logs:
bug, error, fault, fix, patch, or repair.
2) Bug linking.
Bug IDs (e.g., MATH-929) in the issue
tracking system:
(1) Issue Type is ‘bug’,
(2) Resolution is ‘fixed’.
Projects        # Commits identified   # Commits selected
Commons-io      222                    191
Commons-lang    643                    522
Mahout          751                    717
Commons-math    1,021                  909
Derby           3,788                  3,356
Lucene-solr     11,408                 10,755
Total           18,013                 16,450
[Figure: distribution of hunk sizes for Buggy_Hunk and Fixed_Hunk; axis: Hunk Size, 0 to 10.]
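The two collection criteria can be sketched as a filter over commit logs. This is a minimal illustration, not the authors' tooling: the class name, the prefix-style keyword match, and the issue-key shape (uppercase project key, hyphen, number) are assumptions. Checking that an issue has type ‘bug’ and resolution ‘fixed’ requires a query to the issue tracker and is left out.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch of keyword matching and bug-ID extraction on commit logs.
class CommitLogFilter {
    // Keywords from the slide. The leading word boundary lets "fixed" or "bugs"
    // match while rejecting words such as "prefix" (an assumed design choice).
    private static final Pattern BUG_KEYWORDS = Pattern.compile(
            "\\b(?:bug|error|fault|fix|patch|repair)", Pattern.CASE_INSENSITIVE);

    // Assumed issue-key shape, e.g. MATH-929 or SOLR-6959.
    private static final Pattern ISSUE_KEY = Pattern.compile("\\b[A-Z][A-Z0-9]+-\\d+\\b");

    static boolean mentionsBugKeyword(String commitLog) {
        return BUG_KEYWORDS.matcher(commitLog).find();
    }

    static Optional<String> firstIssueKey(String commitLog) {
        Matcher matcher = ISSUE_KEY.matcher(commitLog);
        return matcher.find() ? Optional.of(matcher.group()) : Optional.empty();
    }

    public static void main(String[] args) {
        String log = "SOLR-6959, fix incorrect base url for PDFs.";
        System.out.println(mentionsBugKeyword(log)); // prints true
        System.out.println(firstIssueKey(log).orElse("none")); // prints SOLR-6959
    }
}
```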
10
> Patch Differencing at AST Node Level
[Figure: the buggy version and the fixed version of a patch are differenced with GumTree [25]; the resulting code change actions are regrouped into a hierarchical construct of code change actions.]
11
> Hierarchical Construct of Code Change Actions of a Patch
“Fixed truncated value.”
12
RESULTS
13
> RQ1: Root AST Nodes Impacted by Patches
• Statements are the main buggy code entities.
• Declaration entities (~27%) could be buggy.
None of the existing APR tools can fix declaration-related bugs in Defects4J.
Distribution of root AST node types impacted by patches:
Statement, 73.29%
MethodDeclaration, 15.95%
FieldDeclaration, 9.32%
TypeDeclaration, 1.41%
EnumDeclaration, 0.03%
14
> RQ1: Statements Recurrently Impacted by Patches.
5 out of 22 statement types account for 88% of buggy statements.
APR tools could focus on fixing some specific statements.
15
> RQ1: Adoption of Update
Supports the investigation of repair
ingredients in a fine-grained way.
“Update” accounts for half of the repair actions.
1. double d = FastMath.pow(2 * FastMath.PI, -dim / 2);
2. double d = FastMath.pow(2 * FastMath.PI, -dim / 3);
Update:
- a = a + b;
+ a = a * b;
Delete:
int a = 0;
- a = a + b;
Move:
- a = a + b;
sum(a,b);
+ a = a + b;
Insert:
int a = 0;
+ a = a + b;
16
> Search Space at Statement Level VS. Expression Level
Expression-level granularity could reduce search space.
Number of buggy
ExpressionStatements: ~40,000.
Commit log: added protection against infinite loops by
setting a maximal number of valuations.
Number of buggy
PrefixExpression: 1,362.
17
> RQ2: Buggy Modifier.
Three kinds of repair actions for “modifier”-related bugs:
1) Add a missing modifier.
2) Delete an inappropriate modifier.
3) Replace an inappropriate modifier.
None of the existing APR tools can fix modifier-related bugs in Defects4J.
Distribution of inner-statement elements impacted by patches:
Expression, 82.40%
Type, 8.70%
Identifier, 5.50%
Modifier, 3.30%
Commit log: LANG-334: To avoid exposing a mutating map.
18
> RQ2: Buggy Type Usage.
Buggy Types:
1. Buggy primitive types.
2. Buggy non-primitive types.
Fixing bugs related to non-primitive types is a new challenge for APR tools.
Commit log: Fix integer overflow.
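The slide does not show the patched code behind this commit log, so the following is a hypothetical illustration of a buggy primitive type rather than the actual patch. It sketches how widening `int` to `long` removes an overflow; all names are made up for the example.

```java
// Hypothetical illustration of a primitive-type bug and its type-level fix.
class PrimitiveTypeFix {
    // Buggy: the sum is computed in int and wraps around on overflow.
    static int buggySum(int a, int b) {
        return a + b;
    }

    // Fixed: widening one operand to long makes the addition happen in long.
    static long fixedSum(int a, int b) {
        return (long) a + b;
    }

    public static void main(String[] args) {
        System.out.println(buggySum(Integer.MAX_VALUE, 1)); // prints -2147483648
        System.out.println(fixedSum(Integer.MAX_VALUE, 1)); // prints 2147483648
    }
}
```

The repair touches only the type elements of the statement, which matches the "Type" share of inner-statement changes reported on this slide.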
19
> RQ2: Buggy Identifiers.
APR tools do not fix buggy identifiers.
Developers also label the correction of an
inconsistent identifier as a bug fix.
Debugging buggy names [58, 59, 60, 61, 62].
20
> RQ3: Expressions Recurrently Impacted by Patches
5 out of 34 expression types account for 80% of buggy expressions.
APR tools could focus on fixing some specific expressions.
Distributions of repair actions at the expression level.
21
> RQ3: Buggy Literal Expressions.
Buggy Literal Expressions raise a new challenge for APR tools.
Commit log: SOLR-6959, fix incorrect base url for PDFs.
22
> RQ4: Fault-prone Parts in Expressions.
Non-buggy parts of expressions could provide context for fix
pattern mining at the expression level.
Distribution of whole vs. sub-element changes in some buggy expressions.
Expression              % whole exp   % each sub-exp
Assignment              18.1%         Left_Hand_Exp (13.3%), Operator (0.8%), Right_Hand_Exp (73.5%)
CastExpression          45.8%         Type (11.9%), Exp (42.9%)
ClassInstanceCreation   15.5%         Pre_Exp (9.2%), ClassType (19.7%), Arguments (63%)
ConditionalExpression   22.9%         Condition_Exp (24.1%), Then_Exp (33%), Else_Exp (49.5%)
InfixExpression         27.3%         Left_Hand_Exp (35%), Operator (5.6%), Right_Hand_Exp (68.7%)
MethodInvocation        14.7%         MethodName (22.1%), Arguments (79.8%)
23
> Fix Pattern Mining at Expression Level
Commit 44854912194177d67cdfa1dc765ba684eb013a4c
--- a/src/main/java/org/apache/commons/lang3/time/FastDateParser.java
+++ b/src/main/java/org/apache/commons/lang3/time/FastDateParser.java
@@ -895,1 +895,1 @@
- final TimeZone tz = TimeZone.getTimeZone(value.toUpperCase());
+ final TimeZone tz = TimeZone.getTimeZone(value.toUpperCase(Locale.ROOT));
Fix pattern:
- value.toUpperCase()
+ value.toUpperCase(Locale.ROOT)
Commit log: use toUpperCase(Locale) internally to avoid i18n issues.
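The i18n issue behind this fix pattern can be reproduced with a short sketch. The no-argument `toUpperCase()` uses the JVM's default locale, and under a Turkish locale the letter `i` maps to a dotted capital İ (U+0130). The class name, helper method, and sample value below are illustrative assumptions.

```java
import java.util.Locale;

// Illustrative sketch: locale-sensitive vs. locale-independent upper-casing.
class LocaleUpperCaseDemo {
    static String upper(String value, Locale locale) {
        return value.toUpperCase(locale);
    }

    public static void main(String[] args) {
        String value = "asia/istanbul";
        Locale turkish = Locale.forLanguageTag("tr");

        // Locale.ROOT gives the locale-independent mapping expected for IDs.
        System.out.println(upper(value, Locale.ROOT)); // prints ASIA/ISTANBUL

        // Under Turkish rules each 'i' becomes U+0130, so the strings differ.
        System.out.println(upper(value, turkish).equals(upper(value, Locale.ROOT))); // prints false
    }
}
```

Because only the argument list of the method invocation changes, the non-buggy receiver and method name (`value.toUpperCase`) serve as the context that identifies where the pattern applies.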
24
> Take-away
RQ1:
1. APR scope should be extended to declaration entities.
2. APR changes can be prioritized on a few specific statement types.
3. Move action can be ignored by APR tools.
4. Real-world patches support further investigation in a fine-grained way.
RQ2:
1. APR scope should be extended to modifiers.
2. Buggy non-primitive types could be a new direction for APR.
RQ3:
1. APR changes can be prioritized on a few specific expression types.
2. Buggy literal expressions raise a new challenge for APR.
RQ4:
Non-buggy parts of expressions could provide context for fix pattern mining at the expression level.
25
> Summary
https://github.com/AutoProRepair/PatchParser

The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 

A Closer Look at Real-World Patches

  • 1. A Closer Look at Real-world Patches. Kui Liu, Dongsun Kim, Anil Koyuncu, Tegawendé F. Bissyandé, and Yves Le Traon (Interdisciplinary Centre for Security, Reliability and Trust (SnT), University of Luxembourg, Luxembourg); Li Li (Monash Software Force (MSF), Monash University, Melbourne, Australia). 34th ICSME 2018, Madrid, Spain, September 27, 2018.
  • 2. 1 > Basic Process of Automated Program Repair (APR). [Figure: passing and failing tests feed Fault Localization, which reports suspicious buggy code; APR Tools generate a patch candidate; the candidate is tested (Pass / Fail).] Three questions: Where is the code to be fixed? How to generate patches? Is the patch correct?
  • 3. 2 > How many bugs are fixed by existing APR tools? Benchmark: Defects4J [42] (395 bugs). APR tool (# fixed bugs / # correctly fixed bugs): jGenProg (29 / 5), jKali (22 / 1), jMutRepair (17 / 3), Nopol (35 / 5), HDRepair (23 / 6), ACS (23 / 18), ssFix (60 / 20), ELIXIR (41 / 26), JAID (26 / 9), CapGen (25 / 21), SketchFix (26 / 19), SimFix (56 / 34). Why are the number of bugs that APR tools can fix and the quality of the patches they generate so low?
  • 4. 3 > Scope Limitation of APR Tools Fixing bugs at the statement level. Bug Chart_1 in Defects4J fixed by jMutRepair, ELIXIR, ssFix, JAID, SketchFix, CapGen, SimFix.
  • 5. 4 > Are Non-Statement Code Entities Bug-free? Bugs are also located in non-statement code entities: Type Declaration (Math-12 in Defects4J), Method Declaration (Lang-29 in Defects4J), Field Declaration (Lang-56 in Defects4J). None of the existing APR tools can fix these bugs.
  • 6. 5 > Statement Level VS. Finer Granularity Level. Project: Commons-math. Bug Report ID: MATH-29, “Fix truncated value.” Commit cedf0d27f9e9341a9e9fa8a192735a0c2e11be40:
        --- a/src/main/java/org/apache/commons/math3/distribution/MultivariateNormalDistribution.java
        +++ b/src/main/java/org/apache/commons/math3/distribution/MultivariateNormalDistribution.java
        @@ -895,1 +895,1 @@
        -   return FastMath.pow(2 * FastMath.PI, -dim / 2) *
        +   return FastMath.pow(2 * FastMath.PI, -0.5 * dim) *
                FastMath.pow(covarianceMatrixDeterminant, -0.5) * getExponentTerm(vals);
    Statement level: UPD ReturnStatement. This repair action is difficult to reuse for fixing similar bugs.
    Expression level: dim / 2 → 0.5 * dim. This fix pattern could be used to fix similar bugs.
  • 7. 6 > Objective Deepen knowledge on repair ingredients from real-world patches in a fine-grained way for automated program repair.
  • 9. 8 > Research Questions. RQ1. Do patches impact some specific statement types? RQ2. Are there code elements in statements that are prone to be faulty? RQ3. Which expression types are most impacted by patches? RQ4. Which parts of buggy expressions are prone to be buggy? In APR, fault localization techniques (e.g., Tarantula [31], Ochiai [32], Ochiai2 [33], Zoltar [34] and DStar [35]) are used to identify bug positions at the code line level. [Figure labels for the parts of a statement: Data Type, Variable Name, Operator, Being Assigned Expression.]
  • 10. 9 > Bug-fixing Patches Collection. 1) Keyword matching in commit logs: bug, error, fault, fix, patch or repair. 2) Bug linking: bug IDs (e.g., MATH-929) in the issue tracking system, where (1) Issue Type is ‘bug’ and (2) Resolution is ‘fixed’. Project (# commits identified / # commits selected): Commons-io (222 / 191), Commons-lang (643 / 522), Mahout (751 / 717), Commons-math (1,021 / 909), Derby (3,788 / 3,356), Lucene-solr (11,408 / 10,755), Total (18,013 / 16,450). [Figure: hunk size (axis 0 to 10) for Buggy_Hunk and Fixed_Hunk.]
  • 11. 10 > Patch Differencing at AST Node Level. [Figure: the buggy version and the fixed version of a patch are differenced with GumTree [25], and the resulting code change actions are regrouped into a hierarchical construct.]
  • 12. 11 > Hierarchical Construct of Code Change Actions of a Patch “Fixed truncated value.”
  • 14. 13 > RQ1: Root AST Nodes Impacted by Patches. Distribution of root AST node types impacted by patches: Statement, 73.29%; MethodDeclaration, 15.95%; FieldDeclaration, 9.32%; TypeDeclaration, 1.41%; EnumDeclaration, 0.03%. Statements are the main buggy code entities. Declaration entities (~27%) could also be buggy. None of the existing APR tools can fix declaration-related bugs in Defects4J.
  • 15. 14 > RQ1: Statements Recurrently Impacted by Patches. 5 out of 22 statement types account for 88% of buggy statements. APR tools could focus on fixing some specific statements.
  • 16. 15 > RQ1: Adoption of Update. “Update” accounts for half of the repair actions, which supports investigating repair ingredients in a fine-grained way. Example statements: 1. double d = FastMath.pow(2 * FastMath.PI, -dim / 2); 2. double d = FastMath.pow(2 * FastMath.PI, -dim / 3); The four repair actions:
        Update: - a = a + b; + a = a * b;
        Delete: int a = 0; - a = a + b;
        Move: - a = a + b; sum(a,b); + a = a + b;
        Insert: int a = 0; + a = a + b;
  • 17. 16 > Search Space at Statement Level VS. Expression Level Expression-level granularity could reduce search space. Number of buggy ExpressionStatements: ~40,000. Commit log: added protection against infinite loops by setting a maximal number of valuations. Number of buggy PrefixExpression: 1,362.
  • 18. 17 > RQ2: Buggy Modifier. Three kinds of repair actions for modifier-related bugs: 1) Add a missing modifier. 2) Delete an inappropriate modifier. 3) Replace an inappropriate modifier. None of the existing APR tools can fix modifier-related bugs in Defects4J. Distribution of inner-statement elements impacted by patches: Modifier, 3.30%; Type, 8.70%; Identifier, 5.50%; Expression, 82.40%. Commit log: LANG-334: To avoid exposing a mutating map.
  • 19. 18 > RQ2: Buggy Type Usage. Buggy Types: 1. Buggy primitive types. 2. Buggy non-primitive types. Modifier, 3.30% Type, 8.70% Identifier, 5.50% Expression, 82.40% Distributions of inner-statement elements impacted by patches. It is a new challenge for APR tools to fix non-primitive type related bugs. Commit log: Fix integer overflow.
  • 20. 19 > RQ2: Buggy Identifiers. APR tools do not fix buggy identifiers. Modifying an inconsistent identifier is also labeled as a bug fix by developers. Debugging buggy names [58, 59, 60, 61, 62]. Distribution of inner-statement elements impacted by patches: Modifier, 3.30%; Type, 8.70%; Identifier, 5.50%; Expression, 82.40%.
  • 21. 20 > RQ3: Expressions Recurrently Impacted by Patches. 5 out of 34 expression types account for 80% of buggy expressions. APR tools could focus on fixing some specific expressions. [Figure: distributions of repair actions at the expression level.]
  • 22. 21 > RQ3: Buggy Literal Expressions. Buggy Literal Expressions raise a new challenge for APR tools. Commit log: SOLR-6959, fix incorrect base url for PDFs.
  • 23. 22 > RQ4: Fault-prone Parts in Expressions. Non-buggy parts of expressions could provide context for fix pattern mining at the expression level. Distribution of whole VS. sub-element changes in some buggy expressions (% whole expression; % each sub-expression):
        Assignment: 18.1%; Left_Hand_Exp (13.3%), Operator (0.8%), Right_Hand_Exp (73.5%)
        CastExpression: 45.8%; Type (11.9%), Exp (42.9%)
        ClassInstanceCreation: 15.5%; Pre_Exp (9.2%), ClassType (19.7%), Arguments (63%)
        ConditionalExpression: 22.9%; Condition_Exp (24.1%), Then_Exp (33%), Else_Exp (49.5%)
        InfixExpression: 27.3%; Left_Hand_Exp (35%), Operator (5.6%), Right_Hand_Exp (68.7%)
        MethodInvocation: 14.7%; MethodName (22.1%), Arguments (79.8%)
  • 24. 23 > Fix Pattern Mining at Expression Level. Commit 44854912194177d67cdfa1dc765ba684eb013a4c, commit log: use toUpperCase(Locale) internally to avoid i18n issues.
        --- a/src/main/java/org/apache/commons/lang3/time/FastDateParser.java
        +++ b/src/main/java/org/apache/commons/lang3/time/FastDateParser.java
        @@ -895,1 +895,1 @@
        -   final TimeZone tz = TimeZone.getTimeZone(value.toUpperCase());
        +   final TimeZone tz = TimeZone.getTimeZone(value.toUpperCase(Locale.ROOT));
    Fix pattern: value.toUpperCase() → value.toUpperCase(Locale.ROOT).
  • 25. 24 > Take-away.
        RQ1: 1. APR scope should be extended to declaration entities. 2. APR changes can be prioritized on a few specific statement types. 3. Move action can be ignored by APR tools. 4. Real-world patches support further investigation in a fine-grained way.
        RQ2: 1. APR scope should be extended to modifiers. 2. Buggy non-primitive types could be a new direction for APR.
        RQ3: 1. APR changes can be prioritized on a few specific expression types. 2. Buggy literal expressions raise a new challenge for APR.
        RQ4: Non-buggy parts of expressions could provide context for fix pattern mining at the expression level.
  • 26. 25 > Summary. Recap of slide 10 (Patch Differencing at AST Node Level) and slide 15 (RQ1: Adoption of Update). https://github.com/AutoProRepair/PatchParser
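
The "Fix truncated value" patch quoted in the slides replaces `-dim / 2` with `-0.5 * dim`. Below is a minimal sketch of why that expression-level change matters, assuming `dim` is an `int` (which the commit message suggests but the slides do not state). The class and method names are hypothetical and are not taken from Commons Math.

```java
// Illustrative sketch only: names are hypothetical, not from Commons Math.
final class TruncatedExponentDemo {

    // Buggy expression from the patch: with an int dim, -dim / 2 is integer
    // division, so the fractional part is dropped before widening to double.
    static double buggyExponent(int dim) {
        return -dim / 2;
    }

    // Fixed expression from the patch: 0.5 is a double literal, so the whole
    // product is evaluated in floating point.
    static double fixedExponent(int dim) {
        return -0.5 * dim;
    }

    public static void main(String[] args) {
        for (int dim = 1; dim <= 4; dim++) {
            System.out.println("dim=" + dim
                    + " buggy=" + buggyExponent(dim)
                    + " fixed=" + fixedExponent(dim));
        }
    }
}
```

The two forms agree for even `dim` and differ for odd `dim`, which is why a statement-level "UPD ReturnStatement" label hides the reusable part of the fix.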

Editor's notes

  1. Chart_17, Lang_4: none of the APR tools can fix non-primitive-type-related bugs.
  2. Some bugs are also related to literal expressions.
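
The fix pattern on slide 24 replaces `value.toUpperCase()` with `value.toUpperCase(Locale.ROOT)`. The sketch below illustrates the kind of i18n issue the commit log alludes to, using Turkish as an assumed example locale (the slides do not name one). The locale is passed explicitly to simulate a JVM whose default locale is Turkish. The class and method names are hypothetical.

```java
import java.util.Locale;

// Illustrative sketch only: names are hypothetical, not from Commons Lang.
final class LocaleUpperCaseDemo {

    // Stands in for value.toUpperCase() on a JVM whose default locale is `locale`.
    static String localeSensitive(String value, Locale locale) {
        return value.toUpperCase(locale);
    }

    // The fixed form from the patch: locale-neutral case mapping.
    static String localeNeutral(String value) {
        return value.toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr-TR");
        String id = "asia/istanbul";
        // Under Turkish rules, 'i' upper-cases to U+0130 (dotted capital I).
        System.out.println(localeSensitive(id, turkish));
        System.out.println(localeNeutral(id));
    }
}
```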