SlideShare una empresa de Scribd logo
1 de 61
Exposing Difficult
Compiler Bugs With
Random Testing
John Regehr, Xuejun Yang, Yang Chen, Eric Eide
University of Utah
• Found serious wrong-code bugs in
all C compilers we’ve tested
–Including GCC
–Including expensive commercial compilers
–Including 11 bugs in a research compiler
that was proved to be correct
–287 bugs reported so far
• Counting crash and wrong-code bugs
2
static int x;
static int *volatile z = &x;
static int foo (int *y) {
return *y;
}
int main (void) {
*z = 1;
printf ("%dn", foo(&x));
return 0;
}
• Should print “1”
• GCC r164319 at -O2 on x86-64 prints “0” 3
int foo (void) {
signed char x = 1;
unsigned char y = 255;
return x > y;
}
• Should return 0
• GCC 4.2.3 from Ubuntu Hardy (8.04)
for x86 returns 1 at all optimization
levels
4
const volatile int x;
volatile int y;
void foo(void) {
for (y=0; y>10; y++)
{
int z = x;
}
}
foo: movl $0, y
movl x, %eax
jmp .L3
.L2: movl y, %eax
incl %eax
movl %eax, y
.L3: movl y, %eax
cmpl $10, %eax
jg .L3
ret
• GCC 4.3.0 -Os for x86
6
We find and
report a bug
You fix it
(hopefully)
7
We find and
report a bug
You fix it
55 bugs fixed so far + a few
reported but not yet fixed
20 of these bugs were P1
Goal: Harden GCC by finding and
killing difficult optimizer bugs
8
0 5 10 15 20 25
Richard Guenther
Jakub Jelinek
Andrew Pinski
Jan Hubicka
Martin Jambor
Uros Bizjak
Michael Matz
Andrew Macleod
Eric Botcazou
Kai Tietz
Sebastian Pop
H. J. Lu
Ira Rosen
Who fixed
these bugs?
What Kind of Bugs?
• Compiler crash or ICE
• Compiler generates code that…
–Crashes
–Computes wrong value
–Wrongfully terminates
–Wrongfully fails to terminate
–Accesses a volatile wrong number of times
9
1. What we do
2. How we do it
3. What we learned
4. What still needs to happen
10
11
if ((((l_421 || (safe_lshift_func_uint8_t_u_u (l_421, 0xABE574F6L))) && (func_77(func_38((l_424 >=
l_425), g_394, g_30.f0), func_8(l_408, g_345[2], g_7, (*g_165), l_421), (*l_400), func_8(((*g_349) !=
(*g_349)), (l_426 != (*l_400)), (safe_lshift_func_int16_t_s_u ((**g_349), 0xD5C55EF8L)), 0x0B1F0B62L,
g_95), (safe_add_func_uint32_t_u_u ((*g_165), l_431))) ^ ((safe_rshift_func_uint8_t_u_s (((*g_165)
>= (**g_349)), (safe_mul_func_int8_t_s_s ((*g_165), l_421)))) <= func_77((*g_129), g_95, 1L, l_408,
(*l_400))))) | (*l_400))) {
struct S0 *l_443 = &g_30;
(*l_400) = ((safe_mod_func_int16_t_s_s ((safe_add_func_int16_t_s_s (l_421, (**g_164))), (**g_349)))
&& l_425);
l_447 ^= (safe_sub_func_int16_t_s_s (0x27AC345CL, ((**g_250) <= func_66(l_446, g_19, g_129,
(*g_129), l_407))));
(*l_446) = func_22(l_431, -1L, l_421, (0x1B625347L <= func_22(g_287, g_394, l_447, -1L)));
} else {
const uint32_t l_459 = 0x9671310DL;
l_448 = (*g_186);
(*l_400) = (0L & (0 == (*g_348)));
(*l_400) = func_77((*g_31), ((*g_165) && 6L), l_426, func_77((*l_441), (safe_lshift_func_uint16_t_u_u
((((safe_mul_func_int16_t_s_s ((**g_349), (*g_165))) | ((*g_165) > l_426)) < (0 != (*g_129))), (&l_431
== &l_408))), (l_453 == &l_407), func_77(func_38((*l_400), (safe_mod_func_uint16_t_u_u ((l_420 <
(*g_165)), func_77((*l_441), l_456, (*l_446), (*l_448), g_345[5]))), g_345[4]), g_287,
(func_77((*g_129), l_421, (l_424 & (**g_349)), ((*l_453) != (*g_129)), 0x6D4CA97DL) ==
(safe_div_func_int64_t_s_s (-1L, func_77((*g_129), l_459, l_447, (*l_446), l_459)))), g_95, g_19),
l_420), (*l_446));
} 12
A test program:
• Does lots of random stuff
• Checksums its globals
• Prints checksum and exits
13
Test case
generator
gcc -O0 gcc -O3 clang -O …
vote
minoritymajority
C program
results
Test Case Generator
• Driven by
–Random search
–Depth first search
• Based on
–Grammar for subset of C
–Analyses to ensure test case validity
14
Not a Bug #1
int foo (int x) {
return (x+1) > x;
}
int main (void) {
printf ("%dn",
foo (INT_MAX));
return 0;
}
$ gcc -O1 foo.c -o foo
$ ./foo
0
$ gcc -O2 foo.c -o foo
$ ./foo
1
15
Not a Bug #2
int a;
void bar (int x, int y) {
}
int main (void) {
bar (a=1, a=2);
printf ("%dn", a);
return 0;
}
$ gcc -O bar.c -o bar
$ ./bar
1
$ clang -O bar.c -o bar
$ ./bar
2
16
Not a Bug #3
int main (void) {
long a = -1;
unsigned b = 1;
printf ("%dn", a > b);
return 0;
}
$ gcc -m64 baz.c -o baz
$ ./baz
0
$ gcc -m32 baz.c -o baz
$ ./baz
1
17
• Key property for automated
compiler testing:
–C standard gives each test case a unique
meaning
–Results differ → COMPILER BUG
• Test cases must not…
–Execute undefined behavior (191 kinds)
–Rely on unspecified behavior (52 kinds)
18
• Expressive code generation is easy
–If you don’t care about undefined behavior
• Avoiding undefined behavior is easy
–If you don’t care about expressiveness
• Expressive code that avoids
undefined / unspecified behavior is
hard
19
20
More expressiveLess expressive
More undefined / unspecified behavior
Less undefined / unspecified behavior
Lindig 07
McKeeman 98
Sheridan 07
Our work
Avoiding Undefined and
Unspecified Behaviors
• Offline avoidance is too difficult
–E.g. ensuring in-bounds array access
• Online avoidance is too inefficient
–E.g. ensuring validity of pointer to stack
• Solution: Combine static analysis
and dynamic checks
21
Order of Evaluation Problems
• Problem: Order of evaluation of
function arguments is unspecified
• E.g.
foo(bar(),baz())
• Where bar() and baz() both modify
some variable
22
Order of Evaluation Problems
• Solution:
–Compute conservative read and write set
for each function
• Interprocedural analysis
• Including read/write through pointers
–In between sequence points, never invoke
functions where read and write sets
conflict
23
Integer Undefined Behaviors
• Problem: These are undefined in C
–Divide by zero
–INT_MIN % -1
• Debatable in C99 standard but
undefined in practice
–Shift by negative, shift past bitwidth
–Signed overflow
–Etc.
24
Undefined Integer Behaviors
• Solution: Wrap all potentially
undefined operations
int safe_signed_sub (int si1, int si2) {
if (((si1^si2) & (((si1^((si1^si2)
& (1 << (sizeof(int)*CHAR_BIT-1))))-si2)^si2))
< 0) {
return 0;
} else {
return si1 - si2;
}
}
25
Pointer Problems
• Problem: Undefined pointer
behaviors
–Null pointer deref
–Deref pointer into dead stack frame
–Create or use out of bounds pointer
26
Pointer Problems
• Solution:
–Some dynamic checks
• if (ptr) { … }
–Some static analysis
• Track alias set for each pointer to ensure
validity
• Avoid casting away qualifiers
27
• Arithmetic, logical, and
bit operations
• Loops
• Conditionals
• Function calls
• Const and volatile
• Structs
• Pointers and arrays
• Goto
• Break, continue
• Bitfields
• Comma operator
• Interesting type casts
• Strings
• Unions
• Floating point
• Nonlocal jumps
• Varargs
• Recursive functions
• Function pointers
• Malloc / free
28
SUPPORTED UNSUPPORTED
Design Compromise #1
• Implementation-defined behavior is
allowed
–Avoiding it is too restrictive
• Cannot do differential testing of e.g.
x86 GCC vs. AVR GCC
–Fine in practice
29
Design Compromise #2
• No ground truth
–If all compilers generate the same wrong
answer, we’ll never know
• We could write a C interpreter
–No reason to think ours would be better
than anyone else’s
–Not worth it
30
Design Compromise #3
• No attempt to generate terminating
programs
–Test harness uses timeouts
–In practice ~10% of random programs don’t
terminate within a few seconds
31
Design Compromise #4
• Not aiming for coverage of the C
standard
–E.g. exceeding max identifier length
–Existing test suites do a good job
• Goal is to find deep optimizer bugs
–Existing test suites are insufficient
32
1. What we do
2. How we do it
3. What we learned
4. What still needs to happen
33
• As expected: Higher optimization
levels are buggier
• But sometimes a compiler is wrong…
–Only at -O0
–Consistently at all optimization levels
–Because it was itself miscompiled
–Because a system library function is wrong
–Non-deterministically
• Due to HW faults, ASLR, ???
34
An Experiment
• Compiled and ran 1,000,000
random programs
• Using GCC 3.[0-4].0 and 4.[0-5].0
• -O0, -O1, -O2, -Os, -O3
• x86 only
35
37
38
• Fixing bugs we reported is
correlated with reduction in
observed error rate
• But is there causation?
–Not enough information
–This is not a controlled experiment – many
bugs fixed besides the ones we reported
39
Do These Bugs Matter?
• How often do regular GCC users hit
the kind of bugs we find?
–Several bugs we reported were
subsequently re-reported by application
developers
–We sometimes find known bugs
–But overall, not enough evidence
40
File # of wrong code bugs # of crash bugs
fold-const.c 3 6
combine.c 1 4
tree-ssa-pre.c 0 4
tree-vrp.c 0 4
tree-ssa-dce.c 0 3
tree-ssa-reassoc.c 0 2
reload1.c 1 1
tree-ssa-loop-niter.c 1 1
dse.c 2 0
tree-scalar-evolution.c 2 0
Other (12 files) 13 18
Total (22 files) 23 43 41
42
75.13%
82.23%
46.26%
75.58%
82.41%
47.11%
0%
10%
20%
30%
40%
50%
60%
70%
80%
90%
Line Function Branch
make check-c
make check-c +
10,000 random
programs
Coverage of GCC Code
1. What we do
2. How we do it
3. What we learned
4. What still needs to happen
43
• We’ve only reported bugs for…
–A few of GCC’s platforms
–The most basic compiler options
–About 2 years’ worth of GCC versions
–A subset of C
• A lot of work remains to be done
–Can we push some random testing out into
the community?
44
• Can a casual user find and report
compiler bugs using our tool?
• Need to…
–Run the test harness – EASY
–Run CPU emulators for testing cross
compilers – EASY
–Create reduced test cases – EASY (for ICEs)
–Figure out if bugs are reported yet – EASY
(for ICEs)
45
• However…
–Creating reduced test cases for wrong code
bugs is hard
–Figuring out if a wrong code bug was
already reported is hard
• Automation is needed
46
• Delta debugging is obvious way to
reduce size of failure-inducing tests
• Delta debugging == Repeatedly
remove part of the program and see
if it remains interesting
–Works well for crash bugs
–Works poorly for wrong code bugs
47
• Problem: Throwing away part of a
program may introduce undefined
behavior
• Example:
int foo (void) {
int x;
x = 1;
return x;
}
48
Oops!
Possible Solutions
1. Generate small random programs
2. Detect undefined and unspecified
behavior during reduction
3. Use the test case generator to
reduce program size
49
50
81 KB of C,
on average
Possible Solutions
1. Generate small random programs
2. Detect undefined and unspecified
behavior during reduction
3. Use the test case generator to
reduce program size
5151
• Prototype reduces size of failure-
inducing test cases by 93%
–Averaged over 33 wrong code bugs in GCC
and LLVM
–Takes a few minutes to reduce a program
• But given a few hours, a skilled
human can do quite a bit better
52
• What if manual and automated test
case reduction fails?
–If we cannot create a small testcase for a
failure, we don’t report the bug
• Small ≈ 15 lines
–This happens, but infrequently
–Are we bad at testcase reduction or are
there compiler bugs that only trigger on
complex inputs?
53
• What if an overnight run finds 500
programs that trigger wrong code
bugs?
–Did we just find one compiler bug or 500?
• If we can’t answer this, we have to
report 1 bug at a time
–This is what we currently do
–Need a way to do “bug triage”
54
• Idea for bug triage:
–Binary search on GCC versions to find the
revision causing the bug
–Same rev → likely same bug
–Different rev → inconclusive!
• Too often, bug was introduced earlier
• Latent until exposed by some other
change
• Could also search over passes
• Any other ideas? 55
• TODO for us: Create a turnkey tester
–Test harness needs a partial rewrite
• 7000 lines of Perl…
–Testcase reducer needs improvement
• TODO for you: Please keep fixing
bugs we report
–Even volatile bugs
56
One Last Idea
• Currently, compiler certification for
critical systems is a bad joke
• Can we certify a version of GCC by
–Restricting the set of optimization passes
–Selecting a simple target (Thumb2 maybe)
–Freeze features and fix bugs for a while
–Perform near-exhaustive whitebox testing
• Test paths in the compiler that matter
57
Conclusion #1
• Random testing is powerful
• But has drawbacks
–Never know when to stop testing
–Tuning probabilities is hard
–Generating expressive output that is still
correct is hard
–Our generator is very C specific
58
Conclusion #2
• Fixed test suites are not enough
–We find bugs other testing misses
–We can auto-generate reduced testcases
59
Conclusion #3
• Our work is the most extensive fuzz
attack on compilers to date
–Quickly finds bugs in every compiler we’ve
tested
• Compilers need random testing
60
Code Coverage Backup Slide
• make check-c
– Lines : 75.13% (246876 / 328609)
– Functions : 82.23% (15292 / 18596)
– Branches : 46.26% (243658 / 526724)
• make check-c + 10,000 random
programs
– Lines : 75.58% (248358 / 328609)
– Functions : 82.41% (15325 / 18596)
– Branches : 47.11% (248129 / 526724)
61

Más contenido relacionado

La actualidad más candente

Digital system design practical file
Digital system design practical fileDigital system design practical file
Digital system design practical fileArchita Misra
 
How Triton can help to reverse virtual machine based software protections
How Triton can help to reverse virtual machine based software protectionsHow Triton can help to reverse virtual machine based software protections
How Triton can help to reverse virtual machine based software protectionsJonathan Salwan
 
PVS-Studio is ready to improve the code of Tizen operating system
PVS-Studio is ready to improve the code of Tizen operating systemPVS-Studio is ready to improve the code of Tizen operating system
PVS-Studio is ready to improve the code of Tizen operating systemAndrey Karpov
 
EXTENT-2016: Industry Practices of Advanced Program Analysis
EXTENT-2016: Industry Practices of Advanced Program AnalysisEXTENT-2016: Industry Practices of Advanced Program Analysis
EXTENT-2016: Industry Practices of Advanced Program AnalysisIosif Itkin
 
C++ Code as Seen by a Hypercritical Reviewer
C++ Code as Seen by a Hypercritical ReviewerC++ Code as Seen by a Hypercritical Reviewer
C++ Code as Seen by a Hypercritical ReviewerAndrey Karpov
 
Experiences from Designing and Validating a Software Modernization Transforma...
Experiences from Designing and Validating a Software Modernization Transforma...Experiences from Designing and Validating a Software Modernization Transforma...
Experiences from Designing and Validating a Software Modernization Transforma...Alexandru-Florin Iosif-Lazăr
 
Sstic 2015 detailed_version_triton_concolic_execution_frame_work_f_saudel_jsa...
Sstic 2015 detailed_version_triton_concolic_execution_frame_work_f_saudel_jsa...Sstic 2015 detailed_version_triton_concolic_execution_frame_work_f_saudel_jsa...
Sstic 2015 detailed_version_triton_concolic_execution_frame_work_f_saudel_jsa...Jonathan Salwan
 
VHDL PROGRAMS FEW EXAMPLES
VHDL PROGRAMS FEW EXAMPLESVHDL PROGRAMS FEW EXAMPLES
VHDL PROGRAMS FEW EXAMPLESkarthik kadava
 
Update on C++ Core Guidelines Lifetime Analysis. Gábor Horváth. CoreHard Spri...
Update on C++ Core Guidelines Lifetime Analysis. Gábor Horváth. CoreHard Spri...Update on C++ Core Guidelines Lifetime Analysis. Gábor Horváth. CoreHard Spri...
Update on C++ Core Guidelines Lifetime Analysis. Gábor Horváth. CoreHard Spri...corehard_by
 
Session 6 sv_randomization
Session 6 sv_randomizationSession 6 sv_randomization
Session 6 sv_randomizationNirav Desai
 
Verilog Lecture5 hust 2014
Verilog Lecture5 hust 2014Verilog Lecture5 hust 2014
Verilog Lecture5 hust 2014Béo Tú
 
Les race conditions, nos très chères amies
Les race conditions, nos très chères amiesLes race conditions, nos très chères amies
Les race conditions, nos très chères amiesPierre Laporte
 
System Verilog Tutorial - VHDL
System Verilog Tutorial - VHDLSystem Verilog Tutorial - VHDL
System Verilog Tutorial - VHDLE2MATRIX
 

La actualidad más candente (20)

Digital system design practical file
Digital system design practical fileDigital system design practical file
Digital system design practical file
 
VHDL Programs
VHDL ProgramsVHDL Programs
VHDL Programs
 
Behavioral modelling in VHDL
Behavioral modelling in VHDLBehavioral modelling in VHDL
Behavioral modelling in VHDL
 
How Triton can help to reverse virtual machine based software protections
How Triton can help to reverse virtual machine based software protectionsHow Triton can help to reverse virtual machine based software protections
How Triton can help to reverse virtual machine based software protections
 
PVS-Studio is ready to improve the code of Tizen operating system
PVS-Studio is ready to improve the code of Tizen operating systemPVS-Studio is ready to improve the code of Tizen operating system
PVS-Studio is ready to improve the code of Tizen operating system
 
Programs of VHDL
Programs of VHDLPrograms of VHDL
Programs of VHDL
 
EXTENT-2016: Industry Practices of Advanced Program Analysis
EXTENT-2016: Industry Practices of Advanced Program AnalysisEXTENT-2016: Industry Practices of Advanced Program Analysis
EXTENT-2016: Industry Practices of Advanced Program Analysis
 
C++ Code as Seen by a Hypercritical Reviewer
C++ Code as Seen by a Hypercritical ReviewerC++ Code as Seen by a Hypercritical Reviewer
C++ Code as Seen by a Hypercritical Reviewer
 
Vhdl lab manual
Vhdl lab manualVhdl lab manual
Vhdl lab manual
 
Experiences from Designing and Validating a Software Modernization Transforma...
Experiences from Designing and Validating a Software Modernization Transforma...Experiences from Designing and Validating a Software Modernization Transforma...
Experiences from Designing and Validating a Software Modernization Transforma...
 
Sstic 2015 detailed_version_triton_concolic_execution_frame_work_f_saudel_jsa...
Sstic 2015 detailed_version_triton_concolic_execution_frame_work_f_saudel_jsa...Sstic 2015 detailed_version_triton_concolic_execution_frame_work_f_saudel_jsa...
Sstic 2015 detailed_version_triton_concolic_execution_frame_work_f_saudel_jsa...
 
VHDL PROGRAMS FEW EXAMPLES
VHDL PROGRAMS FEW EXAMPLESVHDL PROGRAMS FEW EXAMPLES
VHDL PROGRAMS FEW EXAMPLES
 
Update on C++ Core Guidelines Lifetime Analysis. Gábor Horváth. CoreHard Spri...
Update on C++ Core Guidelines Lifetime Analysis. Gábor Horváth. CoreHard Spri...Update on C++ Core Guidelines Lifetime Analysis. Gábor Horváth. CoreHard Spri...
Update on C++ Core Guidelines Lifetime Analysis. Gábor Horváth. CoreHard Spri...
 
Vhdl new
Vhdl newVhdl new
Vhdl new
 
Session 6 sv_randomization
Session 6 sv_randomizationSession 6 sv_randomization
Session 6 sv_randomization
 
Verilog Lecture5 hust 2014
Verilog Lecture5 hust 2014Verilog Lecture5 hust 2014
Verilog Lecture5 hust 2014
 
Vhdl programming
Vhdl programmingVhdl programming
Vhdl programming
 
Les race conditions, nos très chères amies
Les race conditions, nos très chères amiesLes race conditions, nos très chères amies
Les race conditions, nos très chères amies
 
LLDB Introduction
LLDB IntroductionLLDB Introduction
LLDB Introduction
 
System Verilog Tutorial - VHDL
System Verilog Tutorial - VHDLSystem Verilog Tutorial - VHDL
System Verilog Tutorial - VHDL
 

Similar a GCC Summit 2010

Digging for Android Kernel Bugs
Digging for Android Kernel BugsDigging for Android Kernel Bugs
Digging for Android Kernel BugsJiahong Fang
 
Pharo Virtual Machine: News from the Front
Pharo Virtual Machine: News from the FrontPharo Virtual Machine: News from the Front
Pharo Virtual Machine: News from the FrontESUG
 
App secforum2014 andrivet-cplusplus11-metaprogramming_applied_to_software_obf...
App secforum2014 andrivet-cplusplus11-metaprogramming_applied_to_software_obf...App secforum2014 andrivet-cplusplus11-metaprogramming_applied_to_software_obf...
App secforum2014 andrivet-cplusplus11-metaprogramming_applied_to_software_obf...Cyber Security Alliance
 
What’s eating python performance
What’s eating python performanceWhat’s eating python performance
What’s eating python performancePiotr Przymus
 
The operation principles of PVS-Studio static code analyzer
The operation principles of PVS-Studio static code analyzerThe operation principles of PVS-Studio static code analyzer
The operation principles of PVS-Studio static code analyzerAndrey Karpov
 
Code Analysis-run time error prediction
Code Analysis-run time error predictionCode Analysis-run time error prediction
Code Analysis-run time error predictionNIKHIL NAWATHE
 
SPARKNaCl: A verified, fast cryptographic library
SPARKNaCl: A verified, fast cryptographic librarySPARKNaCl: A verified, fast cryptographic library
SPARKNaCl: A verified, fast cryptographic libraryAdaCore
 
Making archive IL2C #6-55 dotnet600 2018
Making archive IL2C #6-55 dotnet600 2018Making archive IL2C #6-55 dotnet600 2018
Making archive IL2C #6-55 dotnet600 2018Kouji Matsui
 
Code lifecycle in the jvm - TopConf Linz
Code lifecycle in the jvm - TopConf LinzCode lifecycle in the jvm - TopConf Linz
Code lifecycle in the jvm - TopConf LinzIvan Krylov
 
Infecting the Embedded Supply Chain
 Infecting the Embedded Supply Chain Infecting the Embedded Supply Chain
Infecting the Embedded Supply ChainPriyanka Aash
 
Code quality par Simone Civetta
Code quality par Simone CivettaCode quality par Simone Civetta
Code quality par Simone CivettaCocoaHeads France
 
SAST, CWE, SEI CERT and other smart words from the information security world
SAST, CWE, SEI CERT and other smart words from the information security worldSAST, CWE, SEI CERT and other smart words from the information security world
SAST, CWE, SEI CERT and other smart words from the information security worldAndrey Karpov
 
Symbolic Execution (introduction and hands-on)
Symbolic Execution (introduction and hands-on)Symbolic Execution (introduction and hands-on)
Symbolic Execution (introduction and hands-on)Emilio Coppa
 
Presentation slides: "How to get 100% code coverage"
Presentation slides: "How to get 100% code coverage" Presentation slides: "How to get 100% code coverage"
Presentation slides: "How to get 100% code coverage" Rapita Systems Ltd
 
Improving Code Quality Through Effective Review Process
Improving Code Quality Through Effective  Review ProcessImproving Code Quality Through Effective  Review Process
Improving Code Quality Through Effective Review ProcessDr. Syed Hassan Amin
 
No liftoff, touchdown, or heartbeat shall miss because of a software failure
No liftoff, touchdown, or heartbeat shall miss because of a software failureNo liftoff, touchdown, or heartbeat shall miss because of a software failure
No liftoff, touchdown, or heartbeat shall miss because of a software failureRogue Wave Software
 
OptView2 MUC meetup slides
OptView2 MUC meetup slidesOptView2 MUC meetup slides
OptView2 MUC meetup slidesOfek Shilon
 

Similar a GCC Summit 2010 (20)

Digging for Android Kernel Bugs
Digging for Android Kernel BugsDigging for Android Kernel Bugs
Digging for Android Kernel Bugs
 
Pharo Virtual Machine: News from the Front
Pharo Virtual Machine: News from the FrontPharo Virtual Machine: News from the Front
Pharo Virtual Machine: News from the Front
 
App secforum2014 andrivet-cplusplus11-metaprogramming_applied_to_software_obf...
App secforum2014 andrivet-cplusplus11-metaprogramming_applied_to_software_obf...App secforum2014 andrivet-cplusplus11-metaprogramming_applied_to_software_obf...
App secforum2014 andrivet-cplusplus11-metaprogramming_applied_to_software_obf...
 
What’s eating python performance
What’s eating python performanceWhat’s eating python performance
What’s eating python performance
 
The operation principles of PVS-Studio static code analyzer
The operation principles of PVS-Studio static code analyzerThe operation principles of PVS-Studio static code analyzer
The operation principles of PVS-Studio static code analyzer
 
Code Analysis-run time error prediction
Code Analysis-run time error predictionCode Analysis-run time error prediction
Code Analysis-run time error prediction
 
SPARKNaCl: A verified, fast cryptographic library
SPARKNaCl: A verified, fast cryptographic librarySPARKNaCl: A verified, fast cryptographic library
SPARKNaCl: A verified, fast cryptographic library
 
Klee and angr
Klee and angrKlee and angr
Klee and angr
 
Making archive IL2C #6-55 dotnet600 2018
Making archive IL2C #6-55 dotnet600 2018Making archive IL2C #6-55 dotnet600 2018
Making archive IL2C #6-55 dotnet600 2018
 
Code lifecycle in the jvm - TopConf Linz
Code lifecycle in the jvm - TopConf LinzCode lifecycle in the jvm - TopConf Linz
Code lifecycle in the jvm - TopConf Linz
 
Infecting the Embedded Supply Chain
 Infecting the Embedded Supply Chain Infecting the Embedded Supply Chain
Infecting the Embedded Supply Chain
 
Code quality par Simone Civetta
Code quality par Simone CivettaCode quality par Simone Civetta
Code quality par Simone Civetta
 
SAST, CWE, SEI CERT and other smart words from the information security world
SAST, CWE, SEI CERT and other smart words from the information security worldSAST, CWE, SEI CERT and other smart words from the information security world
SAST, CWE, SEI CERT and other smart words from the information security world
 
Symbolic Execution (introduction and hands-on)
Symbolic Execution (introduction and hands-on)Symbolic Execution (introduction and hands-on)
Symbolic Execution (introduction and hands-on)
 
Presentation slides: "How to get 100% code coverage"
Presentation slides: "How to get 100% code coverage" Presentation slides: "How to get 100% code coverage"
Presentation slides: "How to get 100% code coverage"
 
College1
College1College1
College1
 
Improving Code Quality Through Effective Review Process
Improving Code Quality Through Effective  Review ProcessImproving Code Quality Through Effective  Review Process
Improving Code Quality Through Effective Review Process
 
Introduction to C ++.pptx
Introduction to C ++.pptxIntroduction to C ++.pptx
Introduction to C ++.pptx
 
No liftoff, touchdown, or heartbeat shall miss because of a software failure
No liftoff, touchdown, or heartbeat shall miss because of a software failureNo liftoff, touchdown, or heartbeat shall miss because of a software failure
No liftoff, touchdown, or heartbeat shall miss because of a software failure
 
OptView2 MUC meetup slides
OptView2 MUC meetup slidesOptView2 MUC meetup slides
OptView2 MUC meetup slides
 

Último

Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical InfrastructureVarsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructureitnewsafrica
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...Wes McKinney
 
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxfnnc6jmgwh
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesThousandEyes
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...itnewsafrica
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesKari Kakkonen
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Hiroshi SHIBATA
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesManik S Magar
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfpanagenda
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality AssuranceInflectra
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 

Último (20)

Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical InfrastructureVarsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 

GCC Summit 2010

  • 1. Exposing Difficult Compiler Bugs With Random Testing John Regehr, Xuejun Yang, Yang Chen, Eric Eide University of Utah
  • 2. • Found serious wrong-code bugs in all C compilers we’ve tested –Including GCC –Including expensive commercial compilers –Including 11 bugs in a research compiler that was proved to be correct –287 bugs reported so far • Counting crash and wrong-code bugs 2
  • 3. static int x; static int *volatile z = &x; static int foo (int *y) { return *y; } int main (void) { *z = 1; printf ("%dn", foo(&x)); return 0; } • Should print “1” • GCC r164319 at -O2 on x86-64 prints “0” 3
  • 4. int foo (void) { signed char x = 1; unsigned char y = 255; return x > y; } • Should return 0 • GCC 4.2.3 from Ubuntu Hardy (8.04) for x86 returns 1 at all optimization levels 4
  • 5. const volatile int x; volatile int y; void foo(void) { for (y=0; y>10; y++) { int z = x; } } foo: movl $0, y movl x, %eax jmp .L3 .L2: movl y, %eax incl %eax movl %eax, y .L3: movl y, %eax cmpl $10, %eax jg .L3 ret • GCC 4.3.0 -Os for x86
  • 6. 6 We find and report a bug You fix it (hopefully)
  • 7. 7 We find and report a bug You fix it 55 bugs fixed so far + a few reported but not yet fixed 20 of these bugs were P1 Goal: Harden GCC by finding and killing difficult optimizer bugs
  • 8. 8 0 5 10 15 20 25 Richard Guenther Jakub Jelinek Andrew Pinski Jan Hubicka Martin Jambor Uros Bizjak Michael Matz Andrew Macleod Eric Botcazou Kai Tietz Sebastian Pop H. J. Lu Ira Rosen Who fixed these bugs?
  • 9. What Kind of Bugs? • Compiler crash or ICE • Compiler generates code that… –Crashes –Computes wrong value –Wrongfully terminates –Wrongfully fails to terminate –Accesses a volatile wrong number of times 9
  • 10. 1. What we do 2. How we do it 3. What we learned 4. What still needs to happen 10
  • 11. 11
  • 12. if ((((l_421 || (safe_lshift_func_uint8_t_u_u (l_421, 0xABE574F6L))) && (func_77(func_38((l_424 >= l_425), g_394, g_30.f0), func_8(l_408, g_345[2], g_7, (*g_165), l_421), (*l_400), func_8(((*g_349) != (*g_349)), (l_426 != (*l_400)), (safe_lshift_func_int16_t_s_u ((**g_349), 0xD5C55EF8L)), 0x0B1F0B62L, g_95), (safe_add_func_uint32_t_u_u ((*g_165), l_431))) ^ ((safe_rshift_func_uint8_t_u_s (((*g_165) >= (**g_349)), (safe_mul_func_int8_t_s_s ((*g_165), l_421)))) <= func_77((*g_129), g_95, 1L, l_408, (*l_400))))) | (*l_400))) { struct S0 *l_443 = &g_30; (*l_400) = ((safe_mod_func_int16_t_s_s ((safe_add_func_int16_t_s_s (l_421, (**g_164))), (**g_349))) && l_425); l_447 ^= (safe_sub_func_int16_t_s_s (0x27AC345CL, ((**g_250) <= func_66(l_446, g_19, g_129, (*g_129), l_407)))); (*l_446) = func_22(l_431, -1L, l_421, (0x1B625347L <= func_22(g_287, g_394, l_447, -1L))); } else { const uint32_t l_459 = 0x9671310DL; l_448 = (*g_186); (*l_400) = (0L & (0 == (*g_348))); (*l_400) = func_77((*g_31), ((*g_165) && 6L), l_426, func_77((*l_441), (safe_lshift_func_uint16_t_u_u ((((safe_mul_func_int16_t_s_s ((**g_349), (*g_165))) | ((*g_165) > l_426)) < (0 != (*g_129))), (&l_431 == &l_408))), (l_453 == &l_407), func_77(func_38((*l_400), (safe_mod_func_uint16_t_u_u ((l_420 < (*g_165)), func_77((*l_441), l_456, (*l_446), (*l_448), g_345[5]))), g_345[4]), g_287, (func_77((*g_129), l_421, (l_424 & (**g_349)), ((*l_453) != (*g_129)), 0x6D4CA97DL) == (safe_div_func_int64_t_s_s (-1L, func_77((*g_129), l_459, l_447, (*l_446), l_459)))), g_95, g_19), l_420), (*l_446)); } 12 A test program: • Does lots of random stuff • Checksums its globals • Prints checksum and exits
  • 13. 13 Test case generator gcc -O0 gcc -O3 clang -O … vote minoritymajority C program results
  • 14. Test Case Generator • Driven by –Random search –Depth first search • Based on –Grammar for subset of C –Analyses to ensure test case validity 14
  • 15. Not a Bug #1 int foo (int x) { return (x+1) > x; } int main (void) { printf ("%dn", foo (INT_MAX)); return 0; } $ gcc -O1 foo.c -o foo $ ./foo 0 $ gcc -O2 foo.c -o foo $ ./foo 1 15
  • 16. Not a Bug #2 int a; void bar (int x, int y) { } int main (void) { bar (a=1, a=2); printf ("%dn", a); return 0; } $ gcc -O bar.c -o bar $ ./bar 1 $ clang -O bar.c -o bar $ ./bar 2 16
  • 17. Not a Bug #3 int main (void) { long a = -1; unsigned b = 1; printf ("%dn", a > b); return 0; } $ gcc -m64 baz.c -o baz $ ./baz 0 $ gcc -m32 baz.c -o baz $ ./baz 1 17
  • 18. • Key property for automated compiler testing: –C standard gives each test case a unique meaning –Results differ → COMPILER BUG • Test cases must not… –Execute undefined behavior (191 kinds) –Rely on unspecified behavior (52 kinds) 18
  • 19. • Expressive code generation is easy –If you don’t care about undefined behavior • Avoiding undefined behavior is easy –If you don’t care about expressiveness • Expressive code that avoids undefined / unspecified behavior is hard 19
  • 20. 20 More expressiveLess expressive More undefined / unspecified behavior Less undefined / unspecified behavior Lindig 07 McKeeman 98 Sheridan 07 Our work
  • 21. Avoiding Undefined and Unspecified Behaviors • Offline avoidance is too difficult –E.g. ensuring in-bounds array access • Online avoidance is too inefficient –E.g. ensuring validity of pointer to stack • Solution: Combine static analysis and dynamic checks 21
  • 22. Order of Evaluation Problems • Problem: Order of evaluation of function arguments is unspecified • E.g. foo(bar(),baz()) • Where bar() and baz() both modify some variable 22
  • 23. Order of Evaluation Problems • Solution: –Compute conservative read and write set for each function • Interprocedural analysis • Including read/write through pointers –In between sequence points, never invoke functions where read and write sets conflict 23
  • 24. Integer Undefined Behaviors • Problem: These are undefined in C –Divide by zero –INT_MIN % -1 • Debatable in C99 standard but undefined in practice –Shift by negative, shift past bitwidth –Signed overflow –Etc. 24
  • 25. Undefined Integer Behaviors • Solution: Wrap all potentially undefined operations int safe_signed_sub (int si1, int si2) { if (((si1^si2) & (((si1^((si1^si2) & (1 << (sizeof(int)*CHAR_BIT-1))))-si2)^si2)) < 0) { return 0; } else { return si1 - si2; } } 25
  • 26. Pointer Problems • Problem: Undefined pointer behaviors –Null pointer deref –Deref pointer into dead stack frame –Create or use out of bounds pointer 26
  • 27. Pointer Problems • Solution: –Some dynamic checks • if (ptr) { … } –Some static analysis • Track alias set for each pointer to ensure validity • Avoid casting away qualifiers 27
  • 28. • Arithmetic, logical, and bit operations • Loops • Conditionals • Function calls • Const and volatile • Structs • Pointers and arrays • Goto • Break, continue • Bitfields • Comma operator • Interesting type casts • Strings • Unions • Floating point • Nonlocal jumps • Varargs • Recursive functions • Function pointers • Malloc / free 28 SUPPORTED UNSUPPORTED
  • 29. Design Compromise #1 • Implementation-defined behavior is allowed –Avoiding it is too restrictive • Cannot do differential testing of e.g. x86 GCC vs. AVR GCC –Fine in practice 29
  • 30. Design Compromise #2 • No ground truth –If all compilers generate the same wrong answer, we’ll never know • We could write a C interpreter –No reason to think ours would be better than anyone else’s –Not worth it 30
  • 31. Design Compromise #3 • No attempt to generate terminating programs –Test harness uses timeouts –In practice ~10% of random programs don’t terminate within a few seconds 31
  • 32. Design Compromise #4 • Not aiming for coverage of the C standard –E.g. exceeding max identifier length –Existing test suites do a good job • Goal is to find deep optimizer bugs –Existing test suites are insufficient 32
  • 33. 1. What we do 2. How we do it 3. What we learned 4. What still needs to happen 33
  • 34. • As expected: Higher optimization levels are buggier • But sometimes a compiler is wrong… –Only at -O0 –Consistently at all optimization levels –Because it was itself miscompiled –Because a system library function is wrong –Non-deterministically • Due to HW faults, ASLR, ??? 34
  • 35. An Experiment • Compiled and ran 1,000,000 random programs • Using GCC 3.[0-4].0 and 4.[0-5].0 • -O0, -O1, -O2, -Os, -O3 • x86 only 35
  • 36.
  • 37. 37
  • 38. 38
  • 39. • Fixing bugs we reported is correlated with reduction in observed error rate • But is there causation? –Not enough information –This is not a controlled experiment – many bugs fixed besides the ones we reported 39
  • 40. Do These Bugs Matter? • How often do regular GCC users hit the kind of bugs we find? –Several bugs we reported were subsequently re-reported by application developers –We sometimes find known bugs –But overall, not enough evidence 40
  • 41. File # of wrong code bugs # of crash bugs fold-const.c 3 6 combine.c 1 4 tree-ssa-pre.c 0 4 tree-vrp.c 0 4 tree-ssa-dce.c 0 3 tree-ssa-reassoc.c 0 2 reload1.c 1 1 tree-ssa-loop-niter.c 1 1 dse.c 2 0 tree-scalar-evolution.c 2 0 Other (12 files) 13 18 Total (22 files) 23 43 41
  • 43. 1. What we do 2. How we do it 3. What we learned 4. What still needs to happen 43
  • 44. • We’ve only reported bugs for… –A few of GCC’s platforms –The most basic compiler options –About 2 years’ worth of GCC versions –A subset of C • A lot of work remains to be done –Can we push some random testing out into the community? 44
  • 45. • Can a casual user find and report compiler bugs using our tool? • Need to… –Run the test harness – EASY –Run CPU emulators for testing cross compilers – EASY –Create reduced test cases – EASY (for ICEs) –Figure out if bugs are reported yet – EASY (for ICEs) 45
  • 46. • However… –Creating reduced test cases for wrong code bugs is hard –Figuring out if a wrong code bug was already reported is hard • Automation is needed 46
  • 47. • Delta debugging is obvious way to reduce size of failure-inducing tests • Delta debugging == Repeatedly remove part of the program and see if it remains interesting –Works well for crash bugs –Works poorly for wrong code bugs 47
  • 48. • Problem: Throwing away part of a program may introduce undefined behavior • Example: int foo (void) { int x; x = 1; return x; } 48 Oops!
  • 49. Possible Solutions 1. Generate small random programs 2. Detect undefined and unspecified behavior during reduction 3. Use the test case generator to reduce program size 49
  • 50. 50 81 KB of C, on average
  • 51. Possible Solutions 1. Generate small random programs 2. Detect undefined and unspecified behavior during reduction 3. Use the test case generator to reduce program size 5151
  • 52. • Prototype reduces size of failure- inducing test cases by 93% –Averaged over 33 wrong code bugs in GCC and LLVM –Takes a few minutes to reduce a program • But given a few hours, a skilled human can do quite a bit better 52
  • 53. • What if manual and automated test case reduction fails? –If we cannot create a small testcase for a failure, we don’t report the bug • Small ≈ 15 lines –This happens, but infrequently –Are we bad at testcase reduction or are there compiler bugs that only trigger on complex inputs? 53
  • 54. • What if an overnight run finds 500 programs that trigger wrong code bugs? –Did we just find one compiler bug or 500? • If we can’t answer this, we have to report 1 bug at a time –This is what we currently do –Need a way to do “bug triage” 54
  • 55. • Idea for bug triage: –Binary search on GCC versions to find the revision causing the bug –Same rev → likely same bug –Different rev → inconclusive! • Too often, bug was introduced earlier • Latent until exposed by some other change • Could also search over passes • Any other ideas? 55
  • 56. • TODO for us: Create a turnkey tester –Test harness needs a partial rewrite • 7000 lines of Perl… –Testcase reducer needs improvement • TODO for you: Please keep fixing bugs we report –Even volatile bugs 56
  • 57. One Last Idea • Currently, compiler certification for critical systems is a bad joke • Can we certify a version of GCC by –Restricting the set of optimization passes –Selecting a simple target (Thumb2 maybe) –Freeze features and fix bugs for a while –Perform near-exhaustive whitebox testing • Test paths in the compiler that matter 57
  • 58. Conclusion #1 • Random testing is powerful • But has drawbacks –Never know when to stop testing –Tuning probabilities is hard –Generating expressive output that is still correct is hard –Our generator is very C specific 58
  • 59. Conclusion #2 • Fixed test suites are not enough –We find bugs other testing misses –We can auto-generate reduced testcases 59
  • 60. Conclusion #3 • Our work is the most extensive fuzz attack on compilers to date –Quickly finds bugs in every compiler we’ve tested • Compilers need random testing 60
  • 61. Code Coverage Backup Slide • make check-c – Lines : 75.13% (246876 / 328609) – Functions : 82.23% (15292 / 18596) – Branches : 46.26% (243658 / 526724) • make check-c + 10,000 random programs – Lines : 75.58% (248358 / 328609) – Functions : 82.41% (15325 / 18596) – Branches : 47.11% (248129 / 526724) 61