CrashLocator: Locating Crashing Faults Based on Crash Stacks
Rongxin Wu (1), Hongyu Zhang (2), Shing-Chi Cheung (1), and Sunghun Kim (1)
(1) The Hong Kong University of Science and Technology
(2) Microsoft Research
July 24th, 2014
ISSTA 2014
Background
[Figure: crash reporting workflow, with the labels: Software Crash; Crash Information with Crash Stack; Crash Reporting System; Crash Buckets; Bug Reports; Developers]
Feedback from Mozilla Developers
• Locating crashing faults is hard
• Ad hoc approach
“… and look at the crash stack listed. It shows the line number
of the code, and then I go to the code and inspect it. If I am
unsure what it does I go to the second line of the stack and
code and inspect that, and so on and so forth …”
“Some crashes are hard to fix because it is not necessarily
indicative of the place where it crashes in the crash stack …”
“I use the top down method of following the crash backwards.”
“Sometimes it can be very difficult.”
Uncertain Fault Location
• The faulty function may not appear in the crash stack
About 33% to 41% of crashing faults in Firefox
cannot be located in crash stacks!
[Figure: functions A to H, with the labels "Crash Stack", "Crash Point", and "Buggy Code"]
Related Work: Spectrum-Based Fault Localization
• Tarantula (J. A. Jones et al., ICSE 2002; J. A. Jones et al., ASE 2005)
• Jaccard (R. Abreu et al., TAICPART-MUTATION 2007)
• Ochiai (R. Abreu et al., TAICPART-MUTATION 2007; S. Artzi et al., ISSTA 2010)
• …
• Passing Traces and Failing Traces
Spectrum-Based Fault Localization
• Are these techniques applicable?
[Figure: diagram with the labels: Instrumented Product Software; Failing Traces; Passing Traces; Test Cases; Crash Stack (f1, f2, f3, …, fn); Privacy Concern; Performance Overhead (C. Luk et al., PLDI 2005); Effectiveness (S. Artzi et al., ISSTA 2010); two items are marked with an x]
Our Research Goal
How to help developers fix crashing faults?
– Locate crashing faults based on crash stack
Our technique: CrashLocator
• Targets locating faulty functions
• No instrumentation needed
• Approximate failing traces
 – Based on crash stacks
 – Use static analysis techniques
• Rank suspicious functions
 – Without passing traces
 – Based on characteristics of faulty functions
Approximate Failing Traces
• Basic Stack Expansion Algorithm
Crash Stack: A, B, C, D
Depth-1: E, J, M, N
Depth-2: F, K, L
Depth-3: G, H
[Figure: call graph over the functions A, B, C, D, E, F, G, H, J, K, L, M, N]
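The basic expansion can be sketched as a breadth-first walk over a static call graph, starting from the functions in the crash stack. This is a minimal illustration, not the authors' implementation: the function name `expand_stack` and the edge list in `CALL_GRAPH` are assumptions, with the edges read off the layout of the call graph figure so that the depth levels match the slide.

```python
def expand_stack(crash_stack, call_graph, max_depth):
    """Return {depth: functions first reached at that depth} via breadth-first expansion."""
    seen = set(crash_stack)
    frontier = list(crash_stack)
    levels = {}
    for depth in range(1, max_depth + 1):
        next_frontier = []
        for caller in frontier:
            for callee in call_graph.get(caller, ()):
                if callee not in seen:
                    seen.add(callee)
                    next_frontier.append(callee)
        if not next_frontier:
            break
        levels[depth] = sorted(next_frontier)
        frontier = next_frontier
    return levels


# Hypothetical caller -> callees edges, inferred from the figure layout.
CALL_GRAPH = {
    "A": ["B", "J"],
    "B": ["C"],
    "C": ["D", "E"],
    "D": ["M", "N"],
    "E": ["F"],
    "J": ["K", "L"],
    "F": ["G", "H"],
}
CRASH_STACK = ["A", "B", "C", "D"]

levels = expand_stack(CRASH_STACK, CALL_GRAPH, max_depth=3)
for depth, functions in levels.items():
    print(f"Depth-{depth}: {', '.join(functions)}")
```

Every function reachable within the depth limit becomes a suspect, which is why the candidate set grows quickly on a large call graph.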
Approximate Failing Traces
• Basic Stack Expansion Algorithm
 – Function call information only
• Improved Stack Expansion Algorithm
 – Source file position information

Crash Stack
Function  Position  File    Line
D         0         file_0  l0
C         1         file_1  l1
B         2         file_2  l2
A         3         file_3  l3
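Improved Stack Expansion Algorithm
• Control Flow Analysis
[Figure: CFG of A with the nodes Entry, if, J(), …, B(), …, Exit, where the call to B() is marked "In Crash Stack"; expansion of the crash stack A, B, C, D with Depth-1: E, J, M, N; Depth-2: F, K, L; Depth-3: G, H]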
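One way to read the control flow step is that a call made inside a stack function is kept only if execution can continue from that call site to the call site recorded in the crash stack. The sketch below illustrates that reading with a plain reachability check. The pruning rule itself, the CFG encoding, and all names (`reaches`, `callees_on_crash_path`, the node labels) are assumptions made for illustration; the slides do not spell out the exact rule.

```python
def reaches(cfg, start, target):
    """Depth-first reachability from start to target in a CFG given as {node: successors}."""
    stack, seen = [start], set()
    while stack:
        node = stack.pop()
        if node == target:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(cfg.get(node, ()))
    return False


def callees_on_crash_path(cfg, call_sites, crash_site):
    """Keep callees whose call site can flow to the call site found in the crash stack."""
    return sorted(
        callee
        for node, callee in call_sites.items()
        if node != crash_site and reaches(cfg, node, crash_site)
    )


# Hypothetical CFG of A: J() and B() sit on different branches of the `if`.
BRANCHING_CFG = {
    "entry": ["if"],
    "if": ["call_J", "call_B"],
    "call_J": ["exit"],
    "call_B": ["exit"],
}
# Hypothetical variant where J() always runs before B().
SEQUENTIAL_CFG = {
    "entry": ["call_J"],
    "call_J": ["call_B"],
    "call_B": ["exit"],
}
CALL_SITES = {"call_J": "J", "call_B": "B"}

pruned = callees_on_crash_path(BRANCHING_CFG, CALL_SITES, "call_B")
kept = callees_on_crash_path(SEQUENTIAL_CFG, CALL_SITES, "call_B")
print(pruned, kept)
```

Under this reading, J drops out of the candidate set in the branching case, and so do the functions that are reachable only through J.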
Improved Stack Expansion Algorithm
• Backward Slicing
    Obj D() {
        Obj s;
        int a = M();
        char b = '';
        Obj[] c = N(b);
        s = c[1];  // crash here
        if (s != '') {
            …
        }
        …
    }
Slicing variables: {s, c}
[Figure: expansion of the crash stack A, B, C, D after pruning, with Depth-1: E, M, N; Depth-2: F; Depth-3: G, H; one function is marked "Not in slicing"]
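The slicing step can be illustrated on the straight-line part of function D. The sketch below is a deliberately simplified data-dependence slice: it ignores control dependences, aliasing, and loops, and the statement encoding and the names `backward_slice` and `calls_in_slice` are assumptions for illustration only. Line numbers follow the numbering on the original slide, where the crash is on line 6.

```python
def backward_slice(statements, crash_line, criterion):
    """Return the lines that the criterion variables depend on (data dependences only)."""
    relevant = set(criterion)
    in_slice = []
    for line, defined, used, _calls in reversed(statements):
        if line > crash_line:
            continue
        if line == crash_line or relevant & set(defined):
            in_slice.append(line)
            relevant -= set(defined)
            relevant |= set(used)
    return sorted(in_slice)


def calls_in_slice(statements, slice_lines):
    """Collect the functions called on lines that belong to the slice."""
    return sorted(
        callee
        for line, _defined, _used, calls in statements
        if line in slice_lines
        for callee in calls
    )


# (line, variables defined, variables used, functions called) for function D.
STATEMENTS = [
    (3, {"a"}, set(), ["M"]),
    (4, {"b"}, set(), []),
    (5, {"c"}, {"b"}, ["N"]),
    (6, {"s"}, {"c"}, []),
]

slice_lines = backward_slice(STATEMENTS, crash_line=6, criterion={"s", "c"})
callees = calls_in_slice(STATEMENTS, slice_lines)
print(slice_lines, callees)
```

In this sketch the call to M() falls outside the slice because the variable a never flows into s or c, so M would not be added when expanding from D.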
After crash stack expansion, a large number of suspicious functions still remain.
How to rank the suspicious functions?
Rank suspicious functions
• An empirical study on the characteristics of faulty
functions
• Quantify the suspiciousness of suspicious functions
Observation 1: Frequent Function
• Faulty functions appear frequently in the crash traces of the corresponding buckets.
 – Function Frequency (FF)
 – More frequent, more suspicious
For 89% to 92% of crashing faults, the associated faulty functions appear in all crash execution traces in the corresponding bucket.
[Figure: crash reports grouped into a crash bucket]
Frequent Function
• Some frequent functions are unlikely to be buggy
 – Entry points (main, _RtlUserThreadStart, …)
 – Event handling routines (CloseHandle)
• In information retrieval, some frequent words are useless
 – Stop-words, e.g. “the”, “an”, “a”
 – Inverse Document Frequency (IDF)
• Inverse Bucket Frequency (IBF)
 – If a function appears in many buckets, it is less likely to be buggy
Observation 2: Functions Close to Crash Point
• Faulty functions appear closer to the crash point
 – In Mozilla Firefox, for 84.3% of crashing faults, the distance between the crash point and the associated faulty functions is less than 5.
• Inverse Average Distance to Crash Point (IAD)
Observation 3:
Less Frequently Changed Functions
• Functions that do not contain crashing faults are often less frequently changed
 – 94.1% of faulty functions have been changed at least once during the past 12 months
 – Immune Functions (Y. Dang et al., ICSE 2012)
• Less frequently changed functions
 – Functions that have no changes in the past 12 months
 – Suspicious score is 0
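The zero-score rule for unchanged functions can be sketched as a filter applied to scores that were already computed. Everything in the sketch is an assumption for illustration: the name `zero_unchanged`, the 365-day approximation of 12 months, the sample dates, and the choice to treat a function with no recorded change as unchanged.

```python
from datetime import date, timedelta


def zero_unchanged(scores, last_changed, today, window_days=365):
    """Set the score to 0 for functions with no recorded change inside the window."""
    cutoff = today - timedelta(days=window_days)
    filtered = {}
    for func, score in scores.items():
        changed_on = last_changed.get(func)
        recently_changed = changed_on is not None and changed_on >= cutoff
        filtered[func] = score if recently_changed else 0.0
    return filtered


# Hypothetical scores and last-change dates.
SCORES = {"f": 0.8, "g": 0.5, "h": 0.3}
LAST_CHANGED = {"f": date(2014, 3, 1), "g": date(2012, 1, 1)}

filtered = zero_unchanged(SCORES, LAST_CHANGED, today=date(2014, 7, 24))
print(filtered)
```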
Observation 4: Large Functions
• Our prior study (H. Zhang, ICSM 2009) showed that large modules are more likely to be defect-prone
• Function’s Lines of Code (FLOC)
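Suspicious Score
$\mathrm{Score}(f, B) = \mathrm{FF}(f, B) \cdot \mathrm{IBF}(f) \cdot \mathrm{IAD}(f, B) \cdot \mathrm{FLOC}(f)$
• FF (Function Frequency)
$\mathrm{FF}(f, B) = \frac{N_{f,B}}{N_B}$
• IBF (Inverse Bucket Frequency)
$\mathrm{IBF}(f) = \log\left(\frac{\#B}{\#B_f} + 1\right)$
• IAD (Inverse Average Distance to Crash Point)
$\mathrm{IAD}(f, B) = \frac{N_{f,B}}{1 + \sum_{j=1}^{n} dis_j(f)}$
• FLOC (Function’s Lines of Code)
$\mathrm{FLOC}(f) = \log(\mathrm{LOC}(f) + 1)$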
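The four factors can be combined in a few lines of code. The sketch below rests on several assumptions that the slides do not state: N_{f,B} is read as the number of crash traces in bucket B that contain f, N_B as the number of traces in B, #B as the total number of buckets, #B_f as the number of buckets with at least one trace containing f, the sum in IAD runs over the traces in B that contain f, and the logarithm is natural. The data layout and the name `suspicious_score` are hypothetical.

```python
import math


def suspicious_score(func, bucket_id, buckets, loc):
    """Score(f, B) = FF * IBF * IAD * FLOC under the assumptions stated above.

    buckets: {bucket id: list of traces}; each trace maps a function to its
    distance from the crash point. loc: {function: lines of code}.
    """
    traces = buckets[bucket_id]
    containing = [trace for trace in traces if func in trace]
    n_fb = len(containing)
    if n_fb == 0:
        return 0.0  # f does not occur in this bucket's traces
    ff = n_fb / len(traces)
    buckets_with_f = sum(
        1 for bucket in buckets.values() if any(func in trace for trace in bucket)
    )
    ibf = math.log(len(buckets) / buckets_with_f + 1)
    iad = n_fb / (1 + sum(trace[func] for trace in containing))
    floc = math.log(loc[func] + 1)
    return ff * ibf * iad * floc


# Hypothetical data: two buckets, distances measured from the crash point.
BUCKETS = {
    "b1": [{"D": 0, "C": 1, "main": 3}, {"D": 0, "C": 1, "main": 4}],
    "b2": [{"X": 0, "main": 2}],
}
LOC = {"C": 40, "D": 5, "main": 10, "X": 12}

score_c = suspicious_score("C", "b1", BUCKETS, LOC)
score_main = suspicious_score("main", "b1", BUCKETS, LOC)
print(round(score_c, 3), round(score_main, 3))
```

Here `main` appears in every bucket and far from the crash point, so both IBF and IAD pull its score below that of `C`, which mirrors the intent behind those two factors.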
Evaluation Subjects
• Mozilla Products
 – 5 releases of Firefox
 – 2 releases of Thunderbird
 – 1 release of SeaMonkey
• 160 crashing faults (buckets)
• Large-Scale
 – More than 2 million LOC
 – More than 120K functions
Evaluation Metrics
• Recall@N: Percentage of successfully located faults
by examining top N recommended functions
• Mean Reciprocal Rank (MRR)
 – Measures the quality of ranking results in IR
 – Value range: 0 to 1
 – A higher value means better ranking
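Both metrics can be computed from one number per fault: the rank at which the faulty function first appears in the recommended list. The sketch below uses the standard definitions of these metrics; the names `recall_at_n` and `mean_reciprocal_rank`, the sample ranks, and the convention that `None` marks a fault whose function was never recommended are assumptions.

```python
def recall_at_n(ranks, n):
    """Fraction of faults whose faulty function is ranked within the top n."""
    hits = sum(1 for rank in ranks if rank is not None and rank <= n)
    return hits / len(ranks)


def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all faults; a fault that was not located contributes 0."""
    return sum(1.0 / rank for rank in ranks if rank is not None) / len(ranks)


# Hypothetical ranks for four faults; None means the fault was not located.
RANKS = [1, 3, None, 12]

recall_1 = recall_at_n(RANKS, 1)
recall_5 = recall_at_n(RANKS, 5)
recall_10 = recall_at_n(RANKS, 10)
mrr = mean_reciprocal_rank(RANKS)
print(recall_1, recall_5, recall_10, round(mrr, 3))
```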
Experimental Design
• RQ1: How many faults can be successfully located by
CrashLocator?
• RQ2: Can CrashLocator outperform the conventional
stack-only methods?
• RQ3: How does each factor contribute to the crash
localization performance?
• RQ4: How effective is the proposed crash stack
expansion algorithm?
RQ1: CrashLocator Performance
System            Recall@1  Recall@5  Recall@10  MRR
Firefox 4.0b4     55.6%     66.7%     77.8%      0.627
Firefox 4.0b5     47.1%     70.6%     70.6%      0.566
Firefox 4.0b6     48.0%     64.0%     64.0%      0.540
Firefox 14.0.1    52.0%     52.0%     56.0%      0.528
Firefox 16.0.1    53.8%     53.8%     53.8%      0.542
Thunderbird 17.0  48.5%     66.7%     78.8%      0.568
Thunderbird 24.0  50.0%     66.7%     66.7%      0.544
SeaMonkey 2.21    55.0%     70.0%     70.0%      0.600
Summary           50.6%     63.7%     67.5%      0.559
RQ2: Comparison with Stack-Only Methods
• Conventional Stack-Only Methods
• StackOnlySampling
• StackOnlyAverage
• StackOnlyChangeDate
RQ2: Comparison with Stack-Only Methods
[Figure: Recall@N (y-axis, 0 to 0.8) against the top N functions (N = 1, 5, 10, 20, 50, 100) for StackOnlySampling, StackOnlyAverage, StackOnlyChangeDate, and CrashLocator]
RQ3: Contribution of Each Factor
• Inverse Bucket Frequency (IBF)
• Function Frequency (FF)
• Function’s Lines of Code (FLOC)
• Inverse Average Distance to Crash Point (IAD)
RQ3: Contribution of Each Factor
[Figure: MRR (y-axis, 0 to 0.7) per subject (ff4.0b4, ff4.0b5, ff4.0b6, ff14.0.1, ff16.0.1, tb17.0, tb24.0, sm2.21, Summary) for the factor combinations IBF, IBF*FF, IBF*FF*FLOC, and IBF*FF*FLOC*IAD]
RQ4: Stack Expansion Algorithms
• Basic Stack Expansion Algorithm
 – Static Call Graph
• Improved Stack Expansion Algorithm
 – Static Call Graph
 – Control Flow Analysis
 – Backward Slicing
RQ4: Stack Expansion Algorithms
[Figure: Recall@1, Recall@5, Recall@10, Recall@20, Recall@50, and MRR (y-axis, 0 to 0.8) for Basic Stack Trace Expansion versus Improved Stack Trace Expansion]
Conclusions
• Propose a novel technique, CrashLocator, to locate crashing faults based on crash stacks only
• Evaluate on real, large-scale projects
• 50.6%, 63.7%, and 67.5% of crashing faults can be located by examining only the top 1, 5, and 10 functions, respectively
• CrashLocator significantly outperforms Stack-Only methods, improving MRR by at least 32% and Recall@10 by at least 23%

Más contenido relacionado

La actualidad más candente

A Survey on Dynamic Symbolic Execution for Automatic Test Generation
A Survey on  Dynamic Symbolic Execution  for Automatic Test GenerationA Survey on  Dynamic Symbolic Execution  for Automatic Test Generation
A Survey on Dynamic Symbolic Execution for Automatic Test Generation
Sung Kim
 
DeepAM: Migrate APIs with Multi-modal Sequence to Sequence Learning
DeepAM: Migrate APIs with Multi-modal Sequence to Sequence LearningDeepAM: Migrate APIs with Multi-modal Sequence to Sequence Learning
DeepAM: Migrate APIs with Multi-modal Sequence to Sequence Learning
Sung Kim
 
Improving Fault Localization for Simulink Models using Search-Based Testing a...
Improving Fault Localization for Simulink Models using Search-Based Testing a...Improving Fault Localization for Simulink Models using Search-Based Testing a...
Improving Fault Localization for Simulink Models using Search-Based Testing a...
Lionel Briand
 
Source code comprehension on evolving software
Source code comprehension on evolving softwareSource code comprehension on evolving software
Source code comprehension on evolving software
Sung Kim
 
It's Not a Bug, It's a Feature — How Misclassification Impacts Bug Prediction
It's Not a Bug, It's a Feature — How Misclassification Impacts Bug PredictionIt's Not a Bug, It's a Feature — How Misclassification Impacts Bug Prediction
It's Not a Bug, It's a Feature — How Misclassification Impacts Bug Prediction
sjust
 
The Impact of Test Ownership and Team Structure on the Reliability and Effect...
The Impact of Test Ownership and Team Structure on the Reliability and Effect...The Impact of Test Ownership and Team Structure on the Reliability and Effect...
The Impact of Test Ownership and Team Structure on the Reliability and Effect...
Kim Herzig
 
Change Impact Analysis for Natural Language Requirements
Change Impact Analysis for Natural Language RequirementsChange Impact Analysis for Natural Language Requirements
Change Impact Analysis for Natural Language Requirements
Lionel Briand
 
The Road Not Taken: Estimating Path Execution Frequency Statically
The Road Not Taken: Estimating Path Execution Frequency StaticallyThe Road Not Taken: Estimating Path Execution Frequency Statically
The Road Not Taken: Estimating Path Execution Frequency Statically
Ray Buse
 
Dealing with the Three Horrible Problems in Verification
Dealing with the Three Horrible Problems in VerificationDealing with the Three Horrible Problems in Verification
Dealing with the Three Horrible Problems in Verification
DVClub
 

La actualidad más candente (20)

A Survey on Dynamic Symbolic Execution for Automatic Test Generation
A Survey on  Dynamic Symbolic Execution  for Automatic Test GenerationA Survey on  Dynamic Symbolic Execution  for Automatic Test Generation
A Survey on Dynamic Symbolic Execution for Automatic Test Generation
 
DeepAM: Migrate APIs with Multi-modal Sequence to Sequence Learning
DeepAM: Migrate APIs with Multi-modal Sequence to Sequence LearningDeepAM: Migrate APIs with Multi-modal Sequence to Sequence Learning
DeepAM: Migrate APIs with Multi-modal Sequence to Sequence Learning
 
Deep API Learning (FSE 2016)
Deep API Learning (FSE 2016)Deep API Learning (FSE 2016)
Deep API Learning (FSE 2016)
 
Improving Fault Localization for Simulink Models using Search-Based Testing a...
Improving Fault Localization for Simulink Models using Search-Based Testing a...Improving Fault Localization for Simulink Models using Search-Based Testing a...
Improving Fault Localization for Simulink Models using Search-Based Testing a...
 
Dissertation Defense
Dissertation DefenseDissertation Defense
Dissertation Defense
 
Source code comprehension on evolving software
Source code comprehension on evolving softwareSource code comprehension on evolving software
Source code comprehension on evolving software
 
Survey on Software Defect Prediction
Survey on Software Defect PredictionSurvey on Software Defect Prediction
Survey on Software Defect Prediction
 
It's Not a Bug, It's a Feature — How Misclassification Impacts Bug Prediction
It's Not a Bug, It's a Feature — How Misclassification Impacts Bug PredictionIt's Not a Bug, It's a Feature — How Misclassification Impacts Bug Prediction
It's Not a Bug, It's a Feature — How Misclassification Impacts Bug Prediction
 
The Impact of Test Ownership and Team Structure on the Reliability and Effect...
The Impact of Test Ownership and Team Structure on the Reliability and Effect...The Impact of Test Ownership and Team Structure on the Reliability and Effect...
The Impact of Test Ownership and Team Structure on the Reliability and Effect...
 
Issre2014 test defectprediction
Issre2014 test defectpredictionIssre2014 test defectprediction
Issre2014 test defectprediction
 
Software testing: an introduction - 2017
Software testing: an introduction - 2017Software testing: an introduction - 2017
Software testing: an introduction - 2017
 
Improving Code Review Effectiveness Through Reviewer Recommendations
Improving Code Review Effectiveness Through Reviewer RecommendationsImproving Code Review Effectiveness Through Reviewer Recommendations
Improving Code Review Effectiveness Through Reviewer Recommendations
 
TMPA-2017: 5W+1H Static Analysis Report Quality Measure
TMPA-2017: 5W+1H Static Analysis Report Quality MeasureTMPA-2017: 5W+1H Static Analysis Report Quality Measure
TMPA-2017: 5W+1H Static Analysis Report Quality Measure
 
Change Impact Analysis for Natural Language Requirements
Change Impact Analysis for Natural Language RequirementsChange Impact Analysis for Natural Language Requirements
Change Impact Analysis for Natural Language Requirements
 
The Road Not Taken: Estimating Path Execution Frequency Statically
The Road Not Taken: Estimating Path Execution Frequency StaticallyThe Road Not Taken: Estimating Path Execution Frequency Statically
The Road Not Taken: Estimating Path Execution Frequency Statically
 
SBST 2019 Keynote
SBST 2019 Keynote SBST 2019 Keynote
SBST 2019 Keynote
 
Code Coverage and Test Suite Effectiveness: Empirical Study with Real Bugs in...
Code Coverage and Test Suite Effectiveness: Empirical Study with Real Bugs in...Code Coverage and Test Suite Effectiveness: Empirical Study with Real Bugs in...
Code Coverage and Test Suite Effectiveness: Empirical Study with Real Bugs in...
 
Automated and Scalable Solutions for Software Testing: The Essential Role of ...
Automated and Scalable Solutions for Software Testing: The Essential Role of ...Automated and Scalable Solutions for Software Testing: The Essential Role of ...
Automated and Scalable Solutions for Software Testing: The Essential Role of ...
 
Dealing with the Three Horrible Problems in Verification
Dealing with the Three Horrible Problems in VerificationDealing with the Three Horrible Problems in Verification
Dealing with the Three Horrible Problems in Verification
 
Presentation slides: "How to get 100% code coverage"
Presentation slides: "How to get 100% code coverage" Presentation slides: "How to get 100% code coverage"
Presentation slides: "How to get 100% code coverage"
 

Destacado

Heterogeneous Defect Prediction (

ESEC/FSE 2015)
Heterogeneous Defect Prediction (

ESEC/FSE 2015)Heterogeneous Defect Prediction (

ESEC/FSE 2015)
Heterogeneous Defect Prediction (

ESEC/FSE 2015)
Sung Kim
 
How We Get There: A Context-Guided Search Strategy in Concolic Testing (FSE 2...
How We Get There: A Context-Guided Search Strategy in Concolic Testing (FSE 2...How We Get There: A Context-Guided Search Strategy in Concolic Testing (FSE 2...
How We Get There: A Context-Guided Search Strategy in Concolic Testing (FSE 2...
Sung Kim
 

Destacado (13)

Heterogeneous Defect Prediction (

ESEC/FSE 2015)
Heterogeneous Defect Prediction (

ESEC/FSE 2015)Heterogeneous Defect Prediction (

ESEC/FSE 2015)
Heterogeneous Defect Prediction (

ESEC/FSE 2015)
 
How Do Software Engineers Understand Code Changes? FSE 2012
How Do Software Engineers Understand Code Changes? FSE 2012How Do Software Engineers Understand Code Changes? FSE 2012
How Do Software Engineers Understand Code Changes? FSE 2012
 
The Anatomy of Developer Social Networks
The Anatomy of Developer Social NetworksThe Anatomy of Developer Social Networks
The Anatomy of Developer Social Networks
 
How We Get There: A Context-Guided Search Strategy in Concolic Testing (FSE 2...
How We Get There: A Context-Guided Search Strategy in Concolic Testing (FSE 2...How We Get There: A Context-Guided Search Strategy in Concolic Testing (FSE 2...
How We Get There: A Context-Guided Search Strategy in Concolic Testing (FSE 2...
 
Automatic patch generation learned from human written patches
Automatic patch generation learned from human written patchesAutomatic patch generation learned from human written patches
Automatic patch generation learned from human written patches
 
A Survey on Automatic Test Generation and Crash Reproduction
A Survey on Automatic Test Generation and Crash ReproductionA Survey on Automatic Test Generation and Crash Reproduction
A Survey on Automatic Test Generation and Crash Reproduction
 
Transfer defect learning
Transfer defect learningTransfer defect learning
Transfer defect learning
 
Defect, defect, defect: PROMISE 2012 Keynote
Defect, defect, defect: PROMISE 2012 Keynote Defect, defect, defect: PROMISE 2012 Keynote
Defect, defect, defect: PROMISE 2012 Keynote
 
Tensor board
Tensor boardTensor board
Tensor board
 
CNN 초보자가 만드는 초보자 가이드 (VGG 약간 포함)
CNN 초보자가 만드는 초보자 가이드 (VGG 약간 포함)CNN 초보자가 만드는 초보자 가이드 (VGG 약간 포함)
CNN 초보자가 만드는 초보자 가이드 (VGG 약간 포함)
 
AWS 클라우드 기반 확장성 높은 천만 사용자 웹 서비스 만들기 - 윤석찬
AWS 클라우드 기반 확장성 높은 천만 사용자 웹 서비스 만들기 - 윤석찬AWS 클라우드 기반 확장성 높은 천만 사용자 웹 서비스 만들기 - 윤석찬
AWS 클라우드 기반 확장성 높은 천만 사용자 웹 서비스 만들기 - 윤석찬
 
Time series classification
Time series classificationTime series classification
Time series classification
 
의료빅데이터 컨테스트 결과 보고서
의료빅데이터 컨테스트 결과 보고서의료빅데이터 컨테스트 결과 보고서
의료빅데이터 컨테스트 결과 보고서
 

Similar a CrashLocator: Locating Crashing Faults Based on Crash Stacks (ISSTA 2014)

Effective Fault-Localization Techniques for Concurrent Software
Effective Fault-Localization Techniques for Concurrent SoftwareEffective Fault-Localization Techniques for Concurrent Software
Effective Fault-Localization Techniques for Concurrent Software
Sangmin Park
 
Technical Workshop - Win32/Georbot Analysis
Technical Workshop - Win32/Georbot AnalysisTechnical Workshop - Win32/Georbot Analysis
Technical Workshop - Win32/Georbot Analysis
Positive Hack Days
 
Talos: Neutralizing Vulnerabilities with Security Workarounds for Rapid Respo...
Talos: Neutralizing Vulnerabilities with Security Workarounds for Rapid Respo...Talos: Neutralizing Vulnerabilities with Security Workarounds for Rapid Respo...
Talos: Neutralizing Vulnerabilities with Security Workarounds for Rapid Respo...
Zhen Huang
 
CodeChecker Overview Nov 2019
CodeChecker Overview Nov 2019CodeChecker Overview Nov 2019
CodeChecker Overview Nov 2019
Olivera Milenkovic
 
Impact Analysis of Granularity Levels on Feature Location Technique
Impact Analysis of Granularity Levels on Feature Location TechniqueImpact Analysis of Granularity Levels on Feature Location Technique
Impact Analysis of Granularity Levels on Feature Location Technique
Chakkrit (Kla) Tantithamthavorn
 

Similar a CrashLocator: Locating Crashing Faults Based on Crash Stacks (ISSTA 2014) (20)

Effective Fault-Localization Techniques for Concurrent Software
Effective Fault-Localization Techniques for Concurrent SoftwareEffective Fault-Localization Techniques for Concurrent Software
Effective Fault-Localization Techniques for Concurrent Software
 
Fighting Software Inefficiency Through Automated Bug Detection
 Fighting Software Inefficiency Through Automated Bug Detection Fighting Software Inefficiency Through Automated Bug Detection
Fighting Software Inefficiency Through Automated Bug Detection
 
Technical Workshop - Win32/Georbot Analysis
Technical Workshop - Win32/Georbot AnalysisTechnical Workshop - Win32/Georbot Analysis
Technical Workshop - Win32/Georbot Analysis
 
It Does What You Say, Not What You Mean: Lessons From A Decade of Program Repair
It Does What You Say, Not What You Mean: Lessons From A Decade of Program RepairIt Does What You Say, Not What You Mean: Lessons From A Decade of Program Repair
It Does What You Say, Not What You Mean: Lessons From A Decade of Program Repair
 
How to Design a Program Repair Bot? Insights from the Repairnator Project
How to Design a Program Repair Bot? Insights from the Repairnator ProjectHow to Design a Program Repair Bot? Insights from the Repairnator Project
How to Design a Program Repair Bot? Insights from the Repairnator Project
 
Practical Windows Kernel Exploitation
Practical Windows Kernel ExploitationPractical Windows Kernel Exploitation
Practical Windows Kernel Exploitation
 
SAST, CWE, SEI CERT and other smart words from the information security world
SAST, CWE, SEI CERT and other smart words from the information security worldSAST, CWE, SEI CERT and other smart words from the information security world
SAST, CWE, SEI CERT and other smart words from the information security world
 
Automatic Fine-Grained Issue Report Reclassification
Automatic Fine-Grained Issue Report ReclassificationAutomatic Fine-Grained Issue Report Reclassification
Automatic Fine-Grained Issue Report Reclassification
 
Talos: Neutralizing Vulnerabilities with Security Workarounds for Rapid Respo...
Talos: Neutralizing Vulnerabilities with Security Workarounds for Rapid Respo...Talos: Neutralizing Vulnerabilities with Security Workarounds for Rapid Respo...
Talos: Neutralizing Vulnerabilities with Security Workarounds for Rapid Respo...
 
DEF CON 27 - CHRISTOPHER ROBERTS - firmware slap
DEF CON 27 - CHRISTOPHER ROBERTS - firmware slapDEF CON 27 - CHRISTOPHER ROBERTS - firmware slap
DEF CON 27 - CHRISTOPHER ROBERTS - firmware slap
 
Metasploit & Windows Kernel Exploitation
Metasploit & Windows Kernel ExploitationMetasploit & Windows Kernel Exploitation
Metasploit & Windows Kernel Exploitation
 
ACSAC2016: Code Obfuscation Against Symbolic Execution Attacks
ACSAC2016: Code Obfuscation Against Symbolic Execution AttacksACSAC2016: Code Obfuscation Against Symbolic Execution Attacks
ACSAC2016: Code Obfuscation Against Symbolic Execution Attacks
 
Building next gen malware behavioural analysis environment
Building next gen malware behavioural analysis environment Building next gen malware behavioural analysis environment
Building next gen malware behavioural analysis environment
 
Getting started with RISC-V verification what's next after compliance testing
Getting started with RISC-V verification what's next after compliance testingGetting started with RISC-V verification what's next after compliance testing
Getting started with RISC-V verification what's next after compliance testing
 
black-box testing is a type of software testing in which the tester is not co...
black-box testing is a type of software testing in which the tester is not co...black-box testing is a type of software testing in which the tester is not co...
black-box testing is a type of software testing in which the tester is not co...
 
CodeChecker Overview Nov 2019
CodeChecker Overview Nov 2019CodeChecker Overview Nov 2019
CodeChecker Overview Nov 2019
 
Lecture 1 Try Throw Catch.pptx
Lecture 1 Try Throw Catch.pptxLecture 1 Try Throw Catch.pptx
Lecture 1 Try Throw Catch.pptx
 
Presentation by Lionel Briand
Presentation by Lionel BriandPresentation by Lionel Briand
Presentation by Lionel Briand
 
Impact Analysis of Granularity Levels on Feature Location Technique
Impact Analysis of Granularity Levels on Feature Location TechniqueImpact Analysis of Granularity Levels on Feature Location Technique
Impact Analysis of Granularity Levels on Feature Location Technique
 
Using Static Binary Analysis To Find Vulnerabilities And Backdoors in Firmware
Using Static Binary Analysis To Find Vulnerabilities And Backdoors in FirmwareUsing Static Binary Analysis To Find Vulnerabilities And Backdoors in Firmware
Using Static Binary Analysis To Find Vulnerabilities And Backdoors in Firmware
 

Más de Sung Kim (7)

MSR2014 opening
MSR2014 openingMSR2014 opening
MSR2014 opening
 
Predicting Recurring Crash Stacks (ASE 2012)
Predicting Recurring Crash Stacks (ASE 2012)Predicting Recurring Crash Stacks (ASE 2012)
Predicting Recurring Crash Stacks (ASE 2012)
 
Puzzle-Based Automatic Testing: Bringing Humans Into the Loop by Solving Puzz...
Puzzle-Based Automatic Testing: Bringing Humans Into the Loop by Solving Puzz...Puzzle-Based Automatic Testing: Bringing Humans Into the Loop by Solving Puzz...
Puzzle-Based Automatic Testing: Bringing Humans Into the Loop by Solving Puzz...
 
Software Development Meets the Wisdom of Crowds
Software Development Meets the Wisdom of CrowdsSoftware Development Meets the Wisdom of Crowds
Software Development Meets the Wisdom of Crowds
 
BugTriage with Bug Tossing Graphs (ESEC/FSE 2009)
BugTriage with Bug Tossing Graphs (ESEC/FSE 2009)BugTriage with Bug Tossing Graphs (ESEC/FSE 2009)
BugTriage with Bug Tossing Graphs (ESEC/FSE 2009)
 
Self-defending software: Automatically patching errors in deployed software ...
Self-defending software: Automatically patching  errors in deployed software ...Self-defending software: Automatically patching  errors in deployed software ...
Self-defending software: Automatically patching errors in deployed software ...
 
ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)
ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)
ReCrash: Making crashes reproducible by preserving object states (ECOOP 2008)
 

Último

TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
mohitmore19
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female serviceCALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
anilsa9823
 

Último (20)

Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptx
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.js
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female serviceCALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 

CrashLocator: Locating Crashing Faults Based on Crash Stacks (ISSTA 2014)

  • 1. CrashLocator: Locating Crashing Faults Based on Crash Stacks Rongxin Wu1, Hongyu Zhang2, Shing-Chi Cheung1 and Sunghun Kim1 The Hong Kong University of Science and Technology1 Microsoft Research2 July 24th , 2014 ISSTA 2014
  • 2. Background 2 Crash Information with Crash Stack Crash Reporting System Software Crash Bug ReportsDevelopers Crash Buckets
  • 3. Feedback From Mozilla Developers • Locating crashing faults is hard • Ad hoc approach “… and look at the crash stack listed. It shows the line number of the code, and then I go to the code and inspect it. If I am unsure what it does I go to the second line of the stack and code and inspect that, and so on and so forth …” “Some crashes are hard to fix because it is not necessarily indicative of the place where it crashes in the crash stack …” “I use the top down method of following the crash backwards.” “Sometimes it can be very difficult.”
  • 4. Uncertain Fault Location • The faulty function may not appear in crash stack • About 33%~41% of crashing faults in Firefox cannot be located in crash stacks! [Diagram: functions A to H, marking the buggy code, the crash stack, and the crash point]
  • 5. Spectrum-Based Fault Localization • Related Work • Tarantula (J. A. Jones et al., ICSE 2002) (J. A. Jones et al., ASE 2005) • Jaccard (R. Abreu et al., TAICPART-MUTATION 2007) • Ochiai (R. Abreu et al., TAICPART-MUTATION 2007) (S. Artzi et al., ISSTA 2010) • … • Passing Traces and Failing Traces
  • 6. Spectrum-Based Fault Localization • Are these techniques applicable? • Failing Traces: instrumented product software is unavailable (Privacy Concern; Performance Overhead, C. Luk et al., PLDI 2005), leaving only the Crash Stack (f1, f2, f3, …, fn) • Passing Traces: Test Cases, with limited Effectiveness (S. Artzi et al., ISSTA’10)
  • 7. Our Research Goal • How to help developers fix crashing faults? – Locate crashing faults based on crash stack
  • 8. Our technique: CrashLocator • Targets locating faulty functions • No instrumentation needed • Approximate failing traces – Based on crash stacks – Use static analysis techniques • Rank suspicious functions – Without passing traces – Based on characteristics of faulty functions
  • 9. Approximate Failing Traces • Basic Stack Expansion Algorithm [Diagram: crash stack A, B, C, D expanded over the call graph; Depth-1 adds E, J, M, N; Depth-2 adds F, K, L; Depth-3 adds G, H]
  • 10. Approximate Failing Traces • Basic Stack Expansion Algorithm – Function call information only • Improved Stack Expansion Algorithm – Source file position information • Crash stack (function, position, file, line): D, 0, file_0, l0; C, 1, file_1, l1; B, 2, file_2, l2; A, 3, file_3, l3
  • 11. Improved Stack Expansion Algorithm • Control Flow Analysis [Diagram: CFG of A (Entry, if, J(), B(), Exit), with the call to B() marked as in the crash stack; crash stack A, B, C, D; Depth-1: E, J, M, N; Depth-2: F, K, L; Depth-3: G, H]
  • 12. Improved Stack Expansion Algorithm • Backward Slicing
      1.  Obj D(){
      2.    Obj s;
      3.    int a = M();
      4.    char b = ‘’;
      5.    Obj[] c = N(b);
      6.    s=c[1]; //crash here
      7.    if(s!=‘’){
      8.      …
      9.    }
      10.   …
      11. }
    Crash-related variables: {s, c} [Diagram: crash stack A, B, C, D; Depth-1: E, M, N, with M marked as not in the slice; Depth-2: F; Depth-3: G, H]
  • 13. After crash stack expansion, there are still a large number of suspicious functions. How to rank the suspicious functions?
  • 14. Rank suspicious functions • An empirical study on the characteristics of faulty functions • Quantify the suspiciousness of suspicious functions
  • 15. Observation 1: Frequent Function • Faulty functions appear frequently in the crash traces of the corresponding buckets. – Function Frequency (FF): More Frequent, More Suspicious • For 89-92% of crashing faults, the associated faulty functions appear in all crash execution traces in the corresponding bucket. [Diagram: crash reports grouped into a crash bucket]
  • 16. Frequent Function • Some frequent functions are unlikely to be buggy – Entry points (main, _RtlUserThreadStart, …) – Event handling routine (CloseHandle) • In information retrieval, some frequent words are useless – Stop-words, e.g. “the”, “an”, “a” – Inverse Document Frequency (IDF) • Inverse Bucket Frequency (IBF) – If a function appears in many buckets, it is less likely to be buggy
  • 17. Observation 2: Functions Close to Crash Point • Faulty functions appear closer to the crash point – In Mozilla Firefox, for 84.3% of crashing faults, the distance between the crash point and the associated faulty functions is less than 5. • Inverse Average Distance to Crash Point (IAD)
  • 18. Observation 3: Less Frequently Changed Functions • Functions that do not contain crashing faults are often less frequently changed – 94.1% of faulty functions have been changed at least once during the past 12 months – Immune Functions (Y. Dang et al., ICSE 2012) • Less frequently changed functions – Functions that have no changes in the past 12 months – Suspicious score is 0
  • 19. Observation 4: Large Functions • Our prior study (H. Zhang, ICSM 2009) showed that large modules are more likely to be defect-prone • Function’s Lines of Code (FLOC)
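  • 20. Suspicious Score
      $\mathrm{Score}(f, B) = \mathrm{FF}(f, B) \cdot \mathrm{IBF}(f) \cdot \mathrm{IAD}(f, B) \cdot \mathrm{FLOC}(f)$
    • FF (Function Frequency): $\mathrm{FF}(f, B) = \dfrac{N_{f,B}}{N_B}$
    • IBF (Inverse Bucket Frequency): $\mathrm{IBF}(f) = \log\left(\dfrac{\#B}{\#B_f} + 1\right)$
    • IAD (Inverse Distance to Crash Point): $\mathrm{IAD}(f, B) = \dfrac{N_{f,B}}{1 + \sum_{j=1}^{n} dis_j(f)}$
    • FLOC (Function Lines of Code): $\mathrm{FLOC}(f) = \log\left(\mathrm{LOC}(f) + 1\right)$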
  • 21. Evaluation Subjects • Mozilla Products – 5 releases of Firefox – 2 releases of Thunderbird – 1 release of SeaMonkey • 160 crashing faults (buckets) • Large-Scale – More than 2 million LOC – More than 120K functions
  • 22. Evaluation Metrics • Recall@N: Percentage of successfully located faults by examining top N recommended functions • Mean Reciprocal Rank (MRR) – Measures the quality of the ranking results in IR – Value range: 0 ~ 1 – Higher value means better ranking
  • 23. Experimental Design • RQ1: How many faults can be successfully located by CrashLocator? • RQ2: Can CrashLocator outperform the conventional stack-only methods? • RQ3: How does each factor contribute to the crash localization performance? • RQ4: How effective is the proposed crash stack expansion algorithm?
  • 24. RQ1: CrashLocator Performance
      System           | Recall@1 | Recall@5 | Recall@10 | MRR
      Firefox 4.0b4    | 55.6%    | 66.7%    | 77.8%     | 0.627
      Firefox 4.0b5    | 47.1%    | 70.6%    | 70.6%     | 0.566
      Firefox 4.0b6    | 48.0%    | 64.0%    | 64.0%     | 0.540
      Firefox 14.0.1   | 52.0%    | 52.0%    | 56.0%     | 0.528
      Firefox 16.0.1   | 53.8%    | 53.8%    | 53.8%     | 0.542
      Thunderbird 17.0 | 48.5%    | 66.7%    | 78.8%     | 0.568
      Thunderbird 24.0 | 50.0%    | 66.7%    | 66.7%     | 0.544
      SeaMonkey 2.21   | 55.0%    | 70.0%    | 70.0%     | 0.600
      Summary          | 50.6%    | 63.7%    | 67.5%     | 0.559
  • 25. RQ2: Comparison with Stack-Only methods • Conventional Stack-Only Methods • StackOnlySampling • StackOnlyAverage • StackOnlyChangeDate
  • 26. RQ2: Comparison with Stack-Only methods [Figure: Recall@N (y-axis, 0 to 0.8) against Top N Functions (x-axis: 1, 5, 10, 20, 50, 100) for StackOnlySampling, StackOnlyAverage, StackOnlyChangeDate, and CrashLocator]
  • 27. RQ3: Contribution of Each Factor • Inverse Bucket Frequency (IBF) • Function Frequency (FF) • Function’s Lines of Code (FLOC) • Inverse Average Distance to Crash Point (IAD)
  • 28. RQ3: Contribution of Each Factor [Figure: MRR (y-axis, 0 to 0.7) per subject (ff4.0b4, ff4.0b5, ff4.0b6, ff14.0.1, ff16.0.1, tb17.0, tb24.0, sm2.21, Summary) for IBF, IBF*FF, IBF*FF*FLOC, and IBF*FF*FLOC*IAD]
  • 29. RQ4: Stack Expansion Algorithms • Basic Stack Expansion Algorithm – Static Call Graph • Improved Stack Expansion Algorithm – Static Call Graph – Control Flow Analysis – Backward Slicing
  • 30. RQ4: Stack Expansion Algorithms [Figure: Recall@1, Recall@5, Recall@10, Recall@20, Recall@50, and MRR (y-axis, 0 to 0.8) for Basic Stack Trace Expansion versus Improved Stack Trace Expansion]
  • 31. Conclusions • Propose a novel technique, CrashLocator, to locate crashing faults based on crash stacks only • Evaluate on real, large-scale projects • 50.6%, 63.7%, and 67.5% of crashing faults can be located by examining only the top 1, 5, and 10 functions • CrashLocator significantly outperforms Stack-Only methods, improving MRR by at least 32% and Recall@10 by at least 23%
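
The basic stack expansion on slide 9 can be sketched as a breadth-first walk over a static call graph, starting from the functions in the crash stack. This is only an illustrative sketch: the name `expand_crash_stack`, the dictionary form of the call graph, and the individual call edges are assumptions, chosen so that the result reproduces the Depth-1 to Depth-3 sets shown on the slide. The slides do not list the edges explicitly.

```python
def expand_crash_stack(crash_stack, call_graph, max_depth):
    """Return {depth: [functions first reached at that call depth]}.

    crash_stack: function names taken from the crash stack.
    call_graph:  dict mapping a caller to the list of its callees.
    """
    seen = set(crash_stack)
    frontier = list(crash_stack)
    levels = {}
    for depth in range(1, max_depth + 1):
        reached = []
        for caller in frontier:
            for callee in call_graph.get(caller, []):
                if callee not in seen:
                    seen.add(callee)
                    reached.append(callee)
        if not reached:
            break  # nothing new to add at this depth
        levels[depth] = reached
        frontier = reached
    return levels


# Hypothetical call edges (an assumption, not taken from the slides).
CALL_GRAPH = {
    "A": ["B", "J"],
    "B": ["C"],
    "C": ["D", "E"],
    "D": ["M", "N"],
    "J": ["K", "L"],
    "E": ["F"],
    "F": ["G", "H"],
}
CRASH_STACK = ["A", "B", "C", "D"]

levels = expand_crash_stack(CRASH_STACK, CALL_GRAPH, max_depth=3)
for depth, functions in levels.items():
    print(f"Depth-{depth}:", " ".join(sorted(functions)))
```

Under the same assumptions, the improved algorithm of slides 11 and 12 would amount to removing edges from `call_graph` before the walk, for example A → J after control flow analysis and D → M after backward slicing.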

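The suspicious score on slide 20 can be computed directly from a set of crash buckets. The sketch below is a minimal illustration under stated assumptions: each crash trace is represented as a dictionary from function name to its distance from the crash point, the logarithm is taken as the natural log (the slides do not state the base), the distance sum runs over the traces of the bucket that contain the function, and the names `suspicious_score`, `unchanged`, and the sample data are all hypothetical.

```python
import math


def suspicious_score(f, bucket, all_buckets, loc, unchanged=None):
    """Score(f, B) = FF * IBF * IAD * FLOC, following slide 20.

    bucket:      list of traces; each trace maps function -> distance to crash point.
    all_buckets: list of all buckets (each a list of traces).
    loc:         dict mapping function -> lines of code.
    unchanged:   functions with no change in the past 12 months (score 0, slide 18).
    """
    if unchanged and f in unchanged:
        return 0.0
    traces_with_f = [t for t in bucket if f in t]
    n_fb = len(traces_with_f)
    if n_fb == 0:
        return 0.0
    ff = n_fb / len(bucket)
    buckets_with_f = sum(1 for b in all_buckets if any(f in t for t in b))
    ibf = math.log(len(all_buckets) / buckets_with_f + 1)
    iad = n_fb / (1 + sum(t[f] for t in traces_with_f))
    floc = math.log(loc[f] + 1)
    return ff * ibf * iad * floc


# Hypothetical data: two buckets, distances measured from the crash point.
bucket_1 = [
    {"D": 0, "C": 1, "E": 2, "main": 3},
    {"D": 0, "C": 1, "main": 2},
]
bucket_2 = [{"X": 0, "main": 1}]
buckets = [bucket_1, bucket_2]
loc = {"D": 40, "C": 120, "E": 15, "main": 30, "X": 10}

candidates = ["D", "C", "E", "main"]
ranking = sorted(
    candidates,
    key=lambda f: suspicious_score(f, bucket_1, buckets, loc),
    reverse=True,
)
print(ranking)
```

In this toy data `main` appears in every trace, but it also appears in every bucket and sits far from the crash point, so IBF and IAD push it below D and C, which is the effect the stop-word analogy on slide 16 describes.
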
Editor's notes

  1. Good afternoon. Thanks for joining this presentation. My name is … Today, I am going to present … This is joint work between … Let me begin.
  2. As we know, software crashes are common. A crash is a severe manifestation of a fault. Because crashes are important and severe, in recent years some companies and open source communities have developed crash reporting systems to collect crash reports from end users. Because of the large number of users, many crash reports are received daily, and it is impossible for developers to inspect each of them. Therefore, crash reporting systems organize the crash reports. This organizing process is called crash bucketing: it groups together the crash reports caused by the same bug. Bug reports are then generated from the crash buckets and sent to developers for debugging.
  3. Although crash reporting systems have proved useful in debugging, debugging crashing faults is still not easy. After communicating with Mozilla developers, we found that locating a crashing fault is sometimes hard. In particular, when developers cannot get direct evidence from the crash stack, the crashing fault becomes difficult to locate. To fix a crashing bug, they usually use an ad hoc approach, typically inspecting the crash stack top-down. The crash stack is useful, but using only the crash stack is insufficient.
  4. We conducted an empirical study on 3 release versions of Firefox. We found that the buggy code does not always appear in the crash stack. This is because the buggy code may have been executed and popped off the call stack, and its side effect then surfaces in statements executed later. In Firefox, 33%-41% of crashing faults cannot be located in crash stacks.
  5. We then considered fault localization techniques to assist debugging. In recent years, many spectrum-based fault localization techniques have been proposed, such as Tarantula, Jaccard, and Ochiai. These techniques contrast passing and failing execution traces, compute suspicious scores for program elements, and present a ranked list of program elements to developers.
  6. These techniques are well studied. However, are they directly applicable? They require both passing and failing traces, and collecting those traces usually means instrumenting the program. In production software, instrumentation is usually not allowed because of end users' privacy concerns and the performance overhead it causes, so we cannot obtain full traces from end users. We noticed that some researchers in our community have recently proposed low-overhead instrumentation technologies to profile the dynamic behavior of software. However, until these techniques are widely adopted, instrumenting production software to collect full traces remains unavailable. Without instrumentation, can we still obtain execution traces? For the failing trace, what we have is the crash stack. A crash stack is a snapshot of the call stack at the time of the crash, so it is only a partial execution trace, not a complete failing trace. For passing traces, we may be able to obtain them from existing test cases, but we cannot guarantee the effectiveness of those test cases. The study by S. Artzi et al. showed that fault localization is effective when the passing traces are similar to the failing traces, and it is not always possible to have test cases that generate such passing traces. Because of these limitations, conventional fault localization techniques may not be directly applicable at present.
  7. Then, with only the crash reports available in the crash reporting system, how can we help developers fix crashing faults? Our research goal is to locate crashing faults based on crash stacks.
  8. Our technique is named CrashLocator. It aims at locating faulty functions, because functions are commonly used in unit testing and are helpful for reproducing crashes. Unlike conventional fault localization, our technique does not need any instrumentation. CrashLocator contains two major steps. The first step approximates the failing traces, because faulty functions may not reside on the crash stack; here we use static analysis to generate the failing traces from the crash stacks. The second step ranks the suspicious functions, because the number of suspicious functions after approximation can be very large and we need to prioritize the list. In this step we do not use passing traces; instead, the ranking is based on the characteristics of faulty functions.
  9. Let us look at the details of our technique. A simple way to approximate the failing traces is to expand the crash stack using call graph information. For example, suppose we start with a crash stack and a call graph. Function A has two callees, B and J. B is in the crash stack; J is not, but it may have been executed before the crash, so we include J in the failing traces. We do the same for functions C and D in the crash stack, which adds J, E, M and N to the failing traces at call depth 1. We can expand further by analyzing the functions that J, E, M and N can call, which adds K, L and F to the failing traces. By expanding the crash stack to different call depths, we approximate the failing traces.
  10. The basic stack expansion algorithm is simple and conservative. It uses only the function call information in the crash stack. However, the crash stack contains more information, such as source file positions. We therefore propose an improved stack expansion algorithm that uses it.
  11. To remove functions that cannot have been executed before the crash, we perform control flow analysis on each function in the crash stack. For example, we first build the CFG of function A. The crash stack tells us which position in A was reached, so we can infer the possible control flow paths and see that J is not on them. We then filter out the call to J, and the expansion steps do not expand J from A.
  12. In our study, we found that the variables on the crash line are usually related to the crash. We therefore perform backward slicing to get the statements that can affect the crash-related variables. For example, in function D, line 6 is the crash line, and the crash-related variables are s and c. Backward slicing shows that line 3 is not in the slice: the call to M does not affect s or c. We therefore filter out the call from D to M in the expansion steps. With control flow analysis and backward slicing, we can approximate comparatively precise failing traces.
  13. Let us look at the first observation. A crashing bug may trigger a bucket of crash reports. The crash stacks in these reports may differ, since a single fault may manifest in different ways under different configurations and platforms. Intuitively, the faulty functions should appear frequently in the failing traces of these crash reports. Our empirical study showed that, for 89-92% of crashing faults, the associated faulty functions appear in all crash execution traces in the corresponding bucket. This is our first observation: faulty functions appear frequently in the crash traces of the corresponding buckets. Based on it, we propose our first factor, Function Frequency, to characterize faulty functions.
  14. However, some functions appear frequently but are unlikely to be buggy, e.g. entry points and some event handling routines. This is similar to the concept of “stop-words” in information retrieval. Words like “a”, “an” and “the” appear frequently but carry little meaning, so inverse document frequency is used to decrease their weight. We adopt a similar concept and define our second factor, Inverse Bucket Frequency, to lower the priority of frequent functions that appear across many buckets.
  15. We also found that, in Mozilla Firefox, for 84.3% of crashing faults the faulty function is very close to the crash point. We summarize this result as our second observation. Based on it, we propose our third factor, Inverse Average Distance to Crash Point (IAD). IAD gives high priority to functions closer to the crash point.
  16. Our empirical study also showed that 94.1% of faulty functions have been changed at least once during the past 12 months. This result is consistent with our previous study at Microsoft, in which we found the existence of immune functions. Immune functions are functions that are considered unlikely to be buggy. One category of immune functions is functions that have been used successfully for a long time without changes. We therefore summarize our third observation: functions that do not contain crashing faults are often less frequently changed. Using this observation, we select the functions that have had no changes in the past 12 months and assign them a suspicious score of 0.
  17. Our prior study found that large modules are more likely to be buggy. Therefore, we design the fourth factor, Function’s Lines of Code.
  18. Based on the four factors, we define the suspicious score as the product of all the factors. We then rank the functions in the approximated traces by their suspicious scores.
  19. For the evaluation, we selected three Mozilla products as our subjects. In total, there are 160 crash buckets. The programming language is C/C++. All the subjects are large-scale.
  20. We use Recall@N and MRR as evaluation metrics. Recall@N measures the percentage of bugs that can be located by examining the top N recommended functions. MRR is a widely used metric for measuring the quality of ranking results in IR. Its value ranges from 0 to 1, and a higher MRR means a better ranking.
  21. We designed four research questions. RQ1 evaluates the performance of our approach. RQ2 compares our approach with baseline approaches that we call stack-only methods, which originate from the Mozilla developers’ feedback. RQ3 evaluates the contribution of each factor. RQ4 evaluates the effectiveness of our proposed crash stack expansion algorithm by comparing it with the basic stack expansion algorithm.
  22. The table shows the evaluation for RQ1. For each product, we show Recall@1, Recall@5 and Recall@10, as well as MRR. Take Firefox 4.0b4 as an example: Recall@1 is 55.6%, which means that by examining only the top 1 recommended function we can locate 55.6% of crashing faults. Similarly, by examining the top 5 functions we can locate 66.7% of faults, and by examining the top 10 functions we can locate 77.8% of faults. The MRR value is 0.627. Overall, by examining the top 1 function, we can locate 50.6% of faults.
  23. For RQ2, we compare with the baseline approaches, the stack-only methods. According to the feedback from Mozilla developers, they usually inspect the functions in the crash stack when debugging. We therefore design three variants of the stack-only approach. In the StackOnlySampling method, for each bucket we randomly select one crash from the bucket and rank the functions by their position in the crash stack. In the StackOnlyAverage method, for each bucket we take all the crashes in the bucket and rank the functions by their average position in the crash stacks. In the StackOnlyChangeDate method, for each bucket we randomly select one crash from the bucket and rank the functions by their last modified date.
  24. The figure shows the comparison results. The X axis is the number of functions examined in the recommendation list. The Y axis is the Recall@N metric. As we can see, CrashLocator outperforms all the other approaches. For example, by examining the top 1 function, CrashLocator can locate 50.6% of faults, while the second best approach, StackOnlyAverage, can locate only 35.6% of faults. In terms of Recall@1, the improvement of CrashLocator over StackOnlyAverage is 42%. Similarly, in terms of Recall@10, the improvement ranges from 23.2% to 45.8%.
  25. In RQ3, we evaluate the contribution of the four proposed factors: IBF, FF, FLOC, and IAD.
  26. This figure shows the performance of CrashLocator, in terms of MRR, when the IBF, FF, FLOC and IAD factors are applied incrementally. When only IBF is applied, the performance is lowest, e.g. the overall MRR is about 0.1. Adding the FF and FLOC factors improves the performance. When all factors are considered, the performance is best. Therefore, each factor contributes to the performance, and the IAD factor contributes more significantly than the other factors.
  27. In RQ4, we evaluate the effectiveness of our proposed stack expansion algorithm by comparing it with the basic one, which uses only the static call graph.
  28. This figure compares the two stack expansion algorithms in terms of Recall@1, Recall@5, Recall@10, Recall@20, Recall@50 and MRR. In terms of Recall@N, the improvement of the proposed expansion algorithm over the basic one ranges from 13.3% to 72.3%. In terms of MRR, the improvement is 59.3%.
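
The two evaluation metrics described in note 20, and the relative improvements quoted in note 24, can be written down in a few lines. This is a sketch under assumptions: the input is taken to be one rank per crash bucket (the 1-based position of the faulty function in the recommended list, or `None` when it is not listed), and the function names and sample ranks are hypothetical. Only the 50.6% and 35.6% figures come from the notes.

```python
def recall_at_n(ranks, n):
    """Fraction of faults whose faulty function is ranked within the top n."""
    hits = sum(1 for r in ranks if r is not None and r <= n)
    return hits / len(ranks)


def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all faults; an unlisted fault contributes 0."""
    return sum(1.0 / r for r in ranks if r is not None) / len(ranks)


def relative_improvement(ours, baseline):
    """Improvement of `ours` over `baseline`, relative to the baseline."""
    return (ours - baseline) / baseline


# Hypothetical ranks of the faulty function for four buckets.
ranks = [1, 3, None, 12]
print(recall_at_n(ranks, 1), recall_at_n(ranks, 5), recall_at_n(ranks, 10))
print(mean_reciprocal_rank(ranks))

# Recall@1 of CrashLocator vs. StackOnlyAverage, as reported in note 24.
print(round(relative_improvement(50.6, 35.6) * 100))
```

The last line reproduces the 42% figure, which indicates that the improvements in the notes are stated relative to the baseline rather than as percentage-point differences.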