CrashLocator: Locating Crashing Faults Based on Crash Stacks (ISSTA 2014)

Rongxin Wu (1), Hongyu Zhang (2), Shing-Chi Cheung (1), and Sunghun Kim (1)
(1) The Hong Kong University of Science and Technology
(2) Microsoft Research
July 24th, 2014
ISSTA 2014

Background
[Figure: crash reporting workflow. Labels: Software Crash; Crash Information with Crash Stack; Crash Reporting System; Crash Buckets; Bug Reports; Developers]

Feedback from Mozilla Developers
• Locating crashing faults is hard
• Ad hoc approach
“… and look at the crash stack listed. It shows the line number of the code, and then I go to the code and inspect it. If I am unsure what it does I go to the second line of the stack and code and inspect that, and so on and so forth …”
“Some crashes are hard to fix because it is not necessarily indicative of the place where it crashes in the crash stack …”
“I use the top down method of following the crash backwards.”
“Sometimes it can be very difficult.”

Uncertain Fault Location
• The faulty function may not appear in the crash stack
About 33% to 41% of crashing faults in Firefox cannot be located in crash stacks!
[Figure: call graph over functions A to H, with labels Buggy Code, Crash Stack, and Crash Point]

Spectrum-Based Fault Localization
• Related Work
  – Tarantula
    (J. A. Jones et al., ICSE 2002)
    (J. A. Jones et al., ASE 2005)
  – Jaccard
    (R. Abreu et al., TAICPART-MUTATION 2007)
  – Ochiai
    (R. Abreu et al., TAICPART-MUTATION 2007)
    (S. Artzi et al., ISSTA 2010)
  – …
• Passing Traces and Failing Traces

Spectrum-Based Fault Localization
• Are these techniques applicable?
[Figure: sources of passing and failing traces, with two items marked "x". Instrumented Product Software: Privacy Concern, Performance Overhead (C. Luk et al., PLDI 2005). Test Cases: Effectiveness (S. Artzi et al., ISSTA’10). Also shown: Crash Stack f1, f2, f3, …, fn]

Our Research Goal
How to help developers fix crashing faults?
– Locate crashing faults based on crash stack

Our technique: CrashLocator
• Targets the location of faulty functions
• No instrumentation needed
• Approximate failing traces
  – Based on crash stacks
  – Use static analysis techniques
• Rank suspicious functions
  – Without passing traces
  – Based on characteristics of faulty functions

Approximate Failing Traces
• Basic Stack Expansion Algorithm
[Figure: call graph next to the expanded crash stack. Crash Stack: A, B, C, D. Depth-1: E, J, M, N. Depth-2: F, K, L. Depth-3: G, H]
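The basic expansion can be sketched as a breadth-first walk over a static call graph, starting from the functions on the crash stack. This is a minimal sketch, not the authors' implementation: the edge list in `CALL_GRAPH` is an assumed reading of the slide's figure (the extracted text only preserves node names and depths), and the names `CALL_GRAPH` and `expand_stack` are hypothetical.

```python
from collections import deque

# Hypothetical call graph (caller -> callees). The edges are an assumption
# chosen to be consistent with the depths listed on the slide.
CALL_GRAPH = {
    "A": ["B", "J"],
    "B": ["C"],
    "C": ["D", "E"],
    "D": ["M", "N"],
    "E": ["F"],
    "F": ["G", "H"],
    "J": ["K", "L"],
}


def expand_stack(crash_stack, call_graph, max_depth):
    """Return {function: depth}; depth 0 means the function is on the crash stack."""
    depth = {f: 0 for f in crash_stack}
    queue = deque(crash_stack)
    while queue:
        caller = queue.popleft()
        if depth[caller] >= max_depth:
            continue  # do not expand beyond the requested depth
        for callee in call_graph.get(caller, []):
            if callee not in depth:  # keep the smallest depth found
                depth[callee] = depth[caller] + 1
                queue.append(callee)
    return depth


depths = expand_stack(["A", "B", "C", "D"], CALL_GRAPH, max_depth=3)
for d in range(4):
    print(d, sorted(f for f, fd in depths.items() if fd == d))
```

With the assumed edges, the walk reproduces the grouping shown on the slide: E, J, M, N at depth 1; F, K, L at depth 2; G, H at depth 3.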

Approximate Failing Traces
Crash Stack
Function  Position  File    Line
D         0         file_0  l0
C         1         file_1  l1
B         2         file_2  l2
A         3         file_3  l3
• Basic Stack Expansion Algorithm
  – Function call information only
• Improved Stack Expansion Algorithm
  – Source file position information

Improved Stack Expansion Algorithm
• Control Flow Analysis
[Figure: CFG of A with nodes Entry, if, J(), B(), Exit, and the label "In Crash Stack", next to the expanded crash stack. Crash Stack: A, B, C, D. Depth-1: E, J, M, N. Depth-2: F, K, L. Depth-3: G, H]

Improved Stack Expansion Algorithm
• Backward Slicing
Obj D(){
    Obj s;
    int a = M();
    char b = ‘’;
    Obj[] c = N(b);
    s=c[1]; //crash here
    if(s!=‘’){
        …
    }
    …
}
Slicing variables: {s, c}
[Figure: expanded crash stack with some functions marked "Not in slicing". Crash Stack: A, B, C, D. Depth-1: E, M, N. Depth-2: F. Depth-3: G, H]

After crash stack expansion, there are still
a large number of suspicious functions
How to rank the suspicious functions?

Rank suspicious functions
• An empirical study on the characteristics of faulty
functions
• Quantify the suspiciousness of suspicious functions

Observation 1: Frequent Function
• Faulty functions appear frequently in the crash traces of the corresponding buckets.
  – Function Frequency (FF)
For 89% to 92% of crashing faults, the associated faulty functions appear in all crash execution traces in the corresponding bucket.
[Figure: crash reports in a crash bucket, captioned "More Frequent, More Suspicious"]

Frequent Function
• Some frequent functions are unlikely to be buggy
  – Entry points (main, _RtlUserThreadStart, …)
  – Event handling routines (CloseHandle)
• In information retrieval, some frequent words are useless
  – Stop words, e.g. “the”, “an”, “a”
  – Inverse Document Frequency (IDF)
• Inverse Bucket Frequency (IBF)
  – If a function appears in many buckets, it is less likely to be buggy

Observation 2: Functions Close to Crash Point
• Faulty functions appear closer to the crash point
  – In Mozilla Firefox, for 84.3% of crashing faults, the distance between the crash point and the associated faulty functions is less than 5.
• Inverse Average Distance to Crash Point (IAD)

Observation 3: Less Frequently Changed Functions
• Functions that do not contain crashing faults are often less frequently changed
  – 94.1% of faulty functions have been changed at least once during the past 12 months
  – Immune Functions (Y. Dang et al., ICSE 2012)
• Less frequently changed functions
  – Functions that have no changes in the past 12 months
  – Suspicious score is 0
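The rule above (no change in the past 12 months means a suspicious score of 0) can be sketched as a filter over already computed scores. This is a minimal sketch under stated assumptions: the function name `zero_unchanged`, the dictionary inputs, and the approximation of 12 months as 365 days are all hypothetical, and functions with no recorded change date are treated as unchanged.

```python
from datetime import date, timedelta


def zero_unchanged(scores, last_changed, today, window_days=365):
    """Set the score to 0 for functions with no change inside the window.

    scores:       {function: suspicious score}
    last_changed: {function: date of the most recent change}; a missing
                  entry is treated as "never changed" (an assumption).
    window_days:  365 approximates the slide's 12 months (an assumption).
    """
    cutoff = today - timedelta(days=window_days)
    filtered = {}
    for func, score in scores.items():
        changed_on = last_changed.get(func)
        recently_changed = changed_on is not None and changed_on >= cutoff
        filtered[func] = score if recently_changed else 0.0
    return filtered


scores = {"f": 1.5, "g": 2.0, "h": 0.7}
last_changed = {"f": date(2014, 6, 1), "g": date(2012, 1, 1)}
print(zero_unchanged(scores, last_changed, today=date(2014, 7, 24)))
```

In this example only `f` keeps its score, because `g` was last changed more than a year before the reference date and `h` has no recorded change.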

Observation 4: Large Functions
• Our prior study (H. Zhang. ICSM 2009) showed that
large modules are more likely to be defect-prone
• Function’s Lines of Code (FLOC)

Suspicious Score
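\[ \mathrm{Score}(f, B) = \mathrm{FF}(f, B) \cdot \mathrm{IBF}(f) \cdot \mathrm{IAD}(f, B) \cdot \mathrm{FLOC}(f) \]
• FF (Function Frequency)
\[ \mathrm{FF}(f, B) = \frac{N_{f,B}}{N_B} \]
• IBF (Inverse Bucket Frequency)
\[ \mathrm{IBF}(f) = \log\left(\frac{\#B}{\#B_f} + 1\right) \]
• IAD (Inverse Average Distance to Crash Point)
\[ \mathrm{IAD}(f, B) = \frac{N_{f,B}}{1 + \sum_{j=1}^{n} dis_j(f)} \]
• FLOC (Function Lines of Code)
\[ \mathrm{FLOC}(f) = \log(\mathrm{LOC}(f) + 1) \]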
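The four factors can be combined in a few lines of code. The sketch below is illustrative only and rests on assumptions the slides do not state: N_{f,B} is read as the number of traces in bucket B that contain f, N_B as the number of traces in B, the sum in IAD runs over the traces that contain f, and the logarithm is natural. The function name `suspicious_score` and the trace representation (a dict from function name to its distance from the crash point) are hypothetical.

```python
import math


def suspicious_score(func, bucket_traces, n_buckets, n_buckets_with_func, loc):
    """Score(f, B) = FF * IBF * IAD * FLOC, under the assumptions listed above.

    bucket_traces:       list of traces in bucket B; each trace maps a
                         function name to its distance from the crash point.
    n_buckets:           total number of buckets (#B).
    n_buckets_with_func: number of buckets whose traces contain func (#B_f).
    loc:                 lines of code of func.
    """
    distances = [trace[func] for trace in bucket_traces if func in trace]
    n_fb = len(distances)  # N_{f,B}
    if n_fb == 0:
        return 0.0  # func never appears in this bucket

    ff = n_fb / len(bucket_traces)                     # Function Frequency
    ibf = math.log(n_buckets / n_buckets_with_func + 1)  # Inverse Bucket Frequency
    iad = n_fb / (1 + sum(distances))                  # Inverse Average Distance
    floc = math.log(loc + 1)                           # Function Lines of Code
    return ff * ibf * iad * floc


traces = [{"f": 0, "g": 2}, {"f": 1}, {"g": 3}]
print(suspicious_score("f", traces, n_buckets=10, n_buckets_with_func=2, loc=120))
print(suspicious_score("g", traces, n_buckets=10, n_buckets_with_func=9, loc=40))
```

Here `f` sits closer to the crash point, appears in fewer buckets, and is larger than `g`, so it receives the higher score.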

Evaluation Subjects
• Mozilla Products
  – 5 releases of Firefox
  – 2 releases of Thunderbird
  – 1 release of SeaMonkey
• 160 crashing faults (buckets)
• Large-Scale
  – More than 2 million LOC
  – More than 120K functions

Evaluation Metrics
• Recall@N: Percentage of faults successfully located by examining the top N recommended functions
• Mean Reciprocal Rank (MRR)
  – Measures the quality of ranking results; a metric from information retrieval (IR)
  – Range: 0 to 1
  – A higher value means better ranking
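Both metrics can be computed from one list that holds, for each crashing fault, the rank at which a faulty function first appears. The sketch below is a minimal illustration: the function names are hypothetical, and treating an unlocated fault (rank `None`) as contributing 0 to MRR is an assumption.

```python
def recall_at_n(ranks, n):
    """Fraction of faults whose faulty function is ranked within the top n."""
    hits = sum(1 for r in ranks if r is not None and r <= n)
    return hits / len(ranks)


def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all faults; unlocated faults contribute 0."""
    return sum(1.0 / r for r in ranks if r is not None) / len(ranks)


# One entry per crashing fault; None means the fault was not located.
ranks = [1, 2, None, 10]
print(recall_at_n(ranks, 1))        # 1 of 4 faults is ranked first
print(recall_at_n(ranks, 10))       # 3 of 4 faults are within the top 10
print(mean_reciprocal_rank(ranks))  # (1 + 1/2 + 1/10) / 4
```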

Experimental Design
• RQ1: How many faults can be successfully located by
CrashLocator?
• RQ2: Can CrashLocator outperform the conventional
stack-only methods?
• RQ3: How does each factor contribute to the crash
localization performance?
• RQ4: How effective is the proposed crash stack
expansion algorithm?

RQ1: CrashLocator Performance
System            Recall@1  Recall@5  Recall@10  MRR
Firefox 4.0b4     55.6%     66.7%     77.8%      0.627
Firefox 4.0b5     47.1%     70.6%     70.6%      0.566
Firefox 4.0b6     48.0%     64.0%     64.0%      0.540
Firefox 14.0.1    52.0%     52.0%     56.0%      0.528
Firefox 16.0.1    53.8%     53.8%     53.8%      0.542
Thunderbird 17.0  48.5%     66.7%     78.8%      0.568
Thunderbird 24.0  50.0%     66.7%     66.7%      0.544
SeaMonkey 2.21    55.0%     70.0%     70.0%      0.600
Summary           50.6%     63.7%     67.5%      0.559

RQ2: Comparison with Stack-Only Methods
• Conventional Stack-Only Methods
  – StackOnlySampling
  – StackOnlyAverage
  – StackOnlyChangeDate

RQ2: Comparison with Stack-Only Methods
[Figure: Recall@N (y-axis, 0 to 0.8) against Top N Functions (x-axis: 1, 5, 10, 20, 50, 100) for StackOnlySampling, StackOnlyAverage, StackOnlyChangeDate, and CrashLocator]

RQ3: Contribution of Each Factor
• Inverse Bucket Frequency (IBF)
• Function Frequency (FF)
• Function’s Lines of Code (FLOC)
• Inverse Average Distance to Crash Point (IAD)

RQ3: Contribution of Each Factor
[Figure: MRR (y-axis, 0 to 0.7) per subject (ff4.0b4, ff4.0b5, ff4.0b6, ff14.0.1, ff16.0.1, tb17.0, tb24.0, sm2.21, Summary) for the factor combinations IBF, IBF*FF, IBF*FF*FLOC, and IBF*FF*FLOC*IAD]

RQ4: Stack Expansion Algorithms
• Basic Stack Expansion Algorithm
  – Static Call Graph
• Improved Stack Expansion Algorithm
  – Static Call Graph
  – Control Flow Analysis
  – Backward Slicing

RQ4: Stack Expansion Algorithms
[Figure: Recall@1, Recall@5, Recall@10, Recall@20, Recall@50, and MRR (y-axis, 0 to 0.8) for Basic Stack Trace Expansion and Improved Stack Trace Expansion]

Conclusions
• Proposed a novel technique, CrashLocator, to locate crashing faults based on crash stacks only
• Evaluated on real, large-scale projects
• 50.6%, 63.7%, and 67.5% of crashing faults can be located by examining only the top 1, 5, and 10 functions, respectively
• CrashLocator significantly outperforms Stack-Only methods, improving MRR by at least 32% and Recall@10 by at least 23%
