1. Debugging of Software Regressions
Abhik Roychoudhury
National University of Singapore
Intl. Seminar on Program Verification, Automated
Debugging and Symbolic Computation (PAS) 2012
Organized by Beihang U. and Chinese Acad. of Sciences
2. Software is Evolving
1329662 versions in
About 270 changes per day, i.e., roughly one change every 6 minutes
More than 250 billion lines of code have been created
More than 53116 issues publicly reported in Apache BugZilla
Maintaining software quality in this evolving process is challenging:
testing, debugging and bug-fixing after code changes.
3. Outline: Debugging software regressions
Describe intended behavior of program changes
Change Contract language (later part of the talk)
OR
Extract actual behavior resulting from program changes
Symbolic execution
Novel usage of symbolic execution, beyond guiding search.
4. Debugging vs. Bug Hunting
(a) Debugging: running program P on input = 0 produces output = 0, whereas we should have output > input.
(b) Model Checking: a model checker checks P against the property G(pc = end ⇒ output > input) and returns the counter-example input = 0, output = 0.
5. Debugging vs. Bug Hunting
Debugging
Have a problematic input i, or a "counter-example" trace.
The output does not match the expected output for i.
Not sure what desired "property" is violated.
Amounts to implicitly alerting the programmer about the program's intended specification as well.
Bug Hunting via Model Checking
Have a desired "property".
Tries to find a counter-example trace, and hence an input which violates the property.
6. Regression Debugging
Test input t passes on the old stable program P but fails on the new buggy program P'. Why?
7. Contributions
Debugging evolving programs
Introduction of formal techniques into debugging
Traditional: input mutation, trace comparison …
Ours: symbolic execution, SMT solving, dependency analysis, …
New usage of symbolic execution
From guiding search to extracting a glimpse of program semantics
8. Adapting Trace Comparison
The test input t follows path σ in the old stable program P and path π in the new buggy program P'. Directly comparing σ and π does not work (✗). We need a new input t' to compare against.
9. How to obtain the new test?
The new test input is derived from the buggy input, the old program P, and the new program P'.
10. Path Condition
input in;
if (in >= 0)      // Yes-branch taken for in == 0
    a = in;
else
    a = -1;
return a;

The path condition is useful to find "the set of all inputs which trace a given path". For the path taking the Yes-branch above, the path condition is in ≥ 0; the path is feasible since the formula ∃in. in ≥ 0 is satisfiable.
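As an illustrative sketch (not the talk's binary-level tooling), a path condition can be collected by instrumenting each branch of a concrete run; run_with_path_condition is a hypothetical helper:

```python
# Minimal sketch: record, at every branch of a concrete run, the
# constraint on the symbolic input 'in' that the taken direction implies.
# This mirrors the example program above.

def run_with_path_condition(in_val):
    """Execute the example and collect the conjuncts of the path condition."""
    pc = []                   # branch constraints accumulated along the run
    if in_val >= 0:
        pc.append("in >= 0")  # Yes-branch: constraint as taken
        a = in_val
    else:
        pc.append("in < 0")   # No-branch: negated branch condition
        a = -1
    return a, pc

result, path_cond = run_with_path_condition(0)
print(result, path_cond)  # in == 0 takes the Yes-branch: path condition in >= 0
```

Any solution of the collected conjunction (here, any in ≥ 0) retraces the same path.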
11. Our Approach
The test input t follows path σ (path condition f) in the old stable program P, and path π (path condition f') in the new buggy program P'.
1. Solve f ∧ ¬f' to get another input t', which follows path π' in P'.
2. Compare π and π' to get the bug report.
12. Generating New Input
1. Compute f, the path condition of t in P.
2. Compute f', the path condition of t in P'.
3. Solve f ∧ ¬f'.
   Many solutions: compare the trace of each t' in P' with the trace of t in P'. Return a bug report from P'.
   No solution: go to the next step.
4. Solve f' ∧ ¬f.
   Many solutions: compare the trace of each t' in P with the trace of t in P. Return a bug report from P.
   No solution: impossible, since then f ⇔ f'.
13. Simple Example
Old program P:
    int inp, outp;
    scanf("%d", &inp);
    if (inp >= 1) {
        outp = g(inp);
        if (inp > 9) {
            outp = g1(inp);
        }
    } else {
        outp = h(inp);
    }
    printf("%d", outp);

New program P' (the inner if is commented out):
    int inp, outp;
    scanf("%d", &inp);
    if (inp >= 1) {
        outp = g(inp);
        /* if (inp > 9) {
            outp = g1(inp);
        } */
    } else {
        outp = h(inp);
    }
    printf("%d", outp);

P partitions the inputs into {1, 2, ..., 9}, {10, 11, ...} and {0, -1, -2, ...}; P' merges the first two partitions into {1, 2, ..., 9, 10, 11, ...}. Task: explain the failing input inp == 100 using the alternative input 9.
14. Path Conditions in the Example
For the failing input inp == 100 on the two programs above:
Path condition f in P: (inp >= 1) && (inp > 9)
Path condition f' in P': (inp >= 1)
STP Solver: f ∧ ¬f' = (inp >= 1) && (inp > 9) && (inp < 1) has no solution.
STP Solver: f' ∧ ¬f = (inp >= 1) && (inp <= 9) yields inp == 9.
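The two solver queries above can be mimicked with a brute-force search over a small integer domain standing in for STP (g, g1 and h do not affect the path conditions and are omitted); this is a sketch, not the DARWIN implementation:

```python
# Path conditions of the failing input inp == 100, written as predicates.
def f(inp):      # path condition in the old program P
    return inp >= 1 and inp > 9

def f_new(inp):  # path condition in the changed program P'
    return inp >= 1

def solve(formula, domain=range(-50, 51)):
    """Return some model of the formula over the domain, or None if unsat."""
    for v in domain:
        if formula(v):
            return v
    return None

# f ∧ ¬f' is unsatisfiable: no input keeps the old path but deviates in P'.
assert solve(lambda v: f(v) and not f_new(v)) is None

# f' ∧ ¬f simplifies to (inp >= 1) ∧ (inp <= 9): any of 1..9 is a model
# (the STP run on the slide returned inp == 9).
t_alt = solve(lambda v: f_new(v) and not f(v))
print(t_alt)
```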
16. Choosing Alternative Inputs
Solve f ∧ ¬f' by decomposing f' into the branch constraints ψ1, ..., ψm collected at branches b1, ..., bm along the path in P': f' = (ψ1 ∧ ψ2 ∧ ... ∧ ψm).
Check for satisfiability of:
f ∧ ¬ψ1
f ∧ ψ1 ∧ ¬ψ2
f ∧ ψ1 ∧ ψ2 ∧ ¬ψ3
...
At most m alternate inputs!
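The decomposition above can be sketched as follows; constraints are plain Python predicates, and the toy f, ψ1, ψ2 are illustrative, not taken from the talk:

```python
# Build the m solver queries f ∧ ψ1 ∧ … ∧ ψ(i-1) ∧ ¬ψi, one per branch
# constraint along the path of t in P'.

def alternate_input_queries(f, psis):
    """Return the m sub-formulae whose models are candidate alternate inputs."""
    queries = []
    for i, psi in enumerate(psis):
        prefix = psis[:i]
        queries.append(lambda v, p=prefix, n=psi:
                       f(v) and all(q(v) for q in p) and not n(v))
    return queries

# Toy instance: f ≡ x >= 0 and f' ≡ (x >= 0) ∧ (x > 5), so m == 2.
f = lambda x: x >= 0
psis = [lambda x: x >= 0, lambda x: x > 5]
qs = alternate_input_queries(f, psis)

# Brute-force models over a small domain (a stand-in for the SMT solver).
models = [next((v for v in range(-10, 11) if q(v)), None) for q in qs]
print(models)  # [None, 0]: the first query is unsat, the second yields x == 0
```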
17. Bug report for one alternate input
tnew = the input obtained by solving f ∧ ψ1 ∧ ψ2 ∧ ¬ψ3.
The traces of tbug and tnew agree at branches b1 and b2 and deviate exactly at b3, so the bug report obtained by comparing the two traces should be the branch b3!
At most m alternate inputs ⇒ at most m lines in the bug report.
Since the traces compared deviate in only one branch, trace comparison can be removed altogether.
18. DARWIN: Putting Everything Together
Concrete and symbolic execution of test input t yields f, the path condition of t in the old stable program P, and f', the path condition of t in the new buggy program P'. The STP solver checks satisfiable sub-formulae from f ∧ ¬f' and validates the resulting inputs, giving alternative inputs t'. The output is a bug report at the assembly level, which is then mapped to a bug report at the source level.
19. Implementation
TEMU (Windows/Linux OS, x86 binaries): produces the assembly-level execution trace.
VINE: symbolic execution, path condition extraction, dynamic slicing, and other analyses on the binary trace.
http://bitblaze.cs.berkeley.edu/
20. Results
Buggy Program               Stable Program               Time taken   Bug report size
LibPNG v1.0.7 (31164 loc)   LibPNG v1.2.21 (36776 loc)   13m 34s      9
TCPflow (patched)           TCPflow (unpatched)          31m          6
Miniweb (2838 loc)          Apache (358379 loc)          14s          5
Savant (8730 loc)           Apache httpd (358379 loc)    9m           46

If we require the alternative input to behave the same in the buggy program and the reference program (a passing test), the bug report size is 1 in all three cases.
21. An experiment we tried
Validate embedded Linux (Busybox)
AGAINST
Linux (GNU Coreutils, net-tools)
The Busybox distribution is 121 KLOC.
Various errors to be root-caused in tr, arp, top, printf.
22. Trying on Embedded Linux
• The concept
– Golden: GNU Coreutils, net-tools
– Buggy: Busybox
– The de-facto distribution for embedded devices.
– Aims for low code size.
– Fewer checks, hence more errors.
– Try DARWIN!
• The practice
– The failing input takes logically equivalent paths in Busybox and Coreutils.
23. Going beyond
Program P:
    input x;
    y = 2 * x;
    output y
Program P':
    input x;
    y = 2 * x + 1; // bug
    output y
Observable error: for input x == 0, the expected output is y == 0, but the observed output is y == 1.
Employ DARWIN:
In program P, path condition f == true; in program P', path condition f' == true.
f ∧ ¬f' == false, and also f' ∧ ¬f == false.
No bug report generated!
24. A more direct approach
Program P:
    input x;
    y = 2 * x;
    output y
Program P':
    input x;
    y = 2 * x + 1; // bug
    output y
• Characterize the observable error (obs)
– y != 0
• Compute the weakest pre-condition (WP) along the failing path w.r.t. obs
– in P: 2*x != 0
– in P': 2*x + 1 != 0
• Compare the WPs and find the differing constraints.
• Map the differing constraints to the lines contributing them.
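For straight-line assignments, the WP step is just substitution of the right-hand side for the assigned variable. Here is a sketch on the example above using textual substitution; wp_assign is a hypothetical helper, whereas real tools work on a binary-level intermediate representation:

```python
import re

def wp_assign(post, var, expr):
    """WP of the assignment 'var = expr' w.r.t. post: post[expr / var]."""
    return re.sub(r"\b%s\b" % re.escape(var), "(%s)" % expr, post)

obs = "y != 0"                            # the observable error
wp_old = wp_assign(obs, "y", "2*x")       # along the failing path of P
wp_new = wp_assign(obs, "y", "2*x + 1")   # along the failing path of P'
print(wp_old, "|", wp_new)  # (2*x) != 0 | (2*x + 1) != 0

# The differing constraints point back at the changed assignment to y.
assert wp_old != wp_new
```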
25. Approach 2 - summary
• Set the observable error: x < 0
• Set the slicing criterion: value of x at line 8
• Simultaneously perform
– Slicing: control and data dependencies
– Symbolic execution along the slice
– WP computation along the slice
• The above is performed on both P and P'
– Produces WP, WP': conjunctions of constraints
– Find the differing constraints in WP, WP'
– Map the differing constraints to the contributing LOC; this is the bug report.
26. Approach 2 – in action
1. ...                // input inp1, inp2
2. if (inp1 > 0)      // WP: inp1 - 1 < 0 ∧ inp1 > 0   (control dep.)
3.     x = inp1 - 1;  // bug; WP: inp1 - 1 < 0         (data dep.)
4. else x = inp1 + 1;
5. if (inp2 > 0)
6.     y = inp2 - 2;
7. else y = inp2 + 2;
8. ...                // output x, y
We observe the unexpected x < 0 for inp1 == inp2 == 1; working backwards along the slice yields WP = inp1 - 1 < 0 ∧ inp1 > 0.
27. Comparing WP, WP’
WP = (φ1 ∧ φ2 ∧ ... ∧ φn)
WP' = (φ'1 ∧ φ'2 ∧ ... ∧ φ'm)
A solver may choke in trying to check WP ⇒ φ'1, ..., WP' ⇒ φ1, ...
Instead, we perform a pair-wise comparison of the constraints.
Tautology elimination during the WP computation along the slice gives a lot of reduction:
    X = 1;
    ...               // WP: 1 > 0 ∧ Y < 0   (due to the assignment of X)
    if (X > 0) {
        ...           // WP: X > 0 ∧ Y < 0   (due to the branch)
        printf("%d", Y);
    }                 // WP: Y < 0           (the constraint we start with)
The tautology 1 > 0 is eliminated.
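Tautology elimination can be sketched with a crude finite-domain validity check; a real implementation would ask the solver, and is_tautology with its small domain bound is illustrative only:

```python
# Drop conjuncts of a WP that hold for every input, such as the 1 > 0
# introduced by substituting the constant assignment X = 1 into X > 0.

def is_tautology(constraint, free_vars, domain=range(-5, 6)):
    """Crude check: the constraint evaluates to True under every assignment."""
    if not free_vars:
        return eval(constraint)
    var = free_vars[0]
    return all(is_tautology(constraint.replace(var, str(v)),
                            free_vars[1:], domain)
               for v in domain)

wp = ["1 > 0", "Y < 0"]  # WP conjuncts from the example above
reduced = [c for c in wp if not is_tautology(c, ["Y"])]
print(reduced)  # ['Y < 0']: only the genuine constraint survives
```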
28. Experiments on Embedded Linux
Utility   Trace Size      Slice Size       WP terms     WP terms (after elim.)   LOC in Bug Report   Time taken
arp       5039 : 4764     56524 : 51448    722 : 434    27 : 34                  1 : 3               1m30s
top       1637 : 3921     34523 : 332281   566 : 2501   8 : 6                    2 : 0               1m28s
printf    3702 : 3633     27781 : 40403    241 : 414    21 : 35                  1 : 3               1m20s
tr        5474 : 138538   85047 : 29375    445 : 280    9 : 9                    1 : 0               2m28s

• Each ":"-separated tuple in Columns 2-6 refers to data from embedded Linux and GNU Coreutils, in that order.
• Trace Size refers to the number of assembly/intermediate-level instructions.
• Tautology elimination removes a significant part of the WP analysis overhead.
• The bug report size is quite small in each of the cases.
29. Retrospective
Symbolic execution –
Test generation [ e.g. DART, KLEE, … ]
Path traversal in path sensitive searches such as model checkers
[e.g. JPF]
Debugging – some milestones
Visualization [e.g. Tarantula ]
Dynamic Slicing [e.g. JSlice]
Trace comparison and Delta Debugging
Symbolic Techniques
30. Perspective: Symbolic Execution
Guiding search: test generation, model checking, ...
Uncovering what went wrong: debugging! Summarization and semantics extraction.
31. Our use of the symbolic techniques
• Debugging evolving programs (code evolution)
– Program Versions
– Embedded SW against non-embedded version
– Two implementations of same specification
– Web-servers implementing http
• Related works in our group using symbolic techniques
– Test generation to stress a given program change
– Test suite augmentation and Program Path Partitioning
32. Where are we?
Debugging is aided by specification discovery.
Intended program behavior: what the program should do!
Actual program behavior: what the program is actually doing!
Symbolic execution can extract specifications; symbolic execution of a reference program gives a hint of the intended behavior!
One possible take from today's discussion: it is possible to bridge the gulf.
How can we directly specify the intended behavior of changes?
33. Outline: Debugging software regressions
Describe intended behavior of program changes
Change Contract language (now!)
OR
Extract actual behavior resulting from program changes
Symbolic execution
Novel usage of symbolic execution, beyond guiding search.
34. Programmer Intention
Run test input t on the old stable program P and on the new program P'.
Output ≠ Output'. Bug, or programmer intention?
35. Change Contract Example
Old version:
    Set m(String s){
        if (/*complex predicate on s*/)
            return new HashSet();
        else
            return new TreeSet();
    }
New version:
    Set m(String s){
        if (/*complex predicate on s*/)
            return new HashSet();
        else
            return new TreeSet().add(s);
    }
Change contract:
    /*@ changed_behavior
      @ when_ensured result instanceof TreeSet;
      @ ensures result.size() == prev(result).size() + 1;
      @*/
36. Change Contract
A specification language for program changes.
This is the intended change, not the actual change!
Two aspects:
Under which condition is the program semantics changed? (sub input space)
How is the program semantics changed?
Realized for Java, based on JML (the Java Modeling Language).
37. Why not Program Contract?
Old version:
    Set m(String s){
        if (/*complex predicate on s*/)
            return new HashSet();
        else
            return new TreeSet();
    }
New version:
    Set m(String s){
        if (/*complex predicate on s*/)
            return new HashSet();
        else
            return new TreeSet().add(s);
    }
Change contract:
    /*@ changed_behavior
      @ when_ensured result instanceof TreeSet;
      @ ensures result.size() == prev(result).size() + 1;
      @*/
Note: a full program contract would need to specify the complex predicate on s, whereas the change contract describes only the changed behavior.
38. Default Equal Assumption
For inputs not specified in the change contract, the program behavior remains the same.
/*@ changed_behavior
  @ when_ensured result instanceof TreeSet;
  @ ensures result.size() == prev(result).size() + 1;
  @*/
When the return value is not an instance of TreeSet, the previous method and the current method should behave the same.
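As an illustration of this semantics (outside the talk's JML tooling), the contract can be checked at run time in a Python sketch; set and the SortedStrings class stand in for HashSet and TreeSet, and complex_predicate is a placeholder for the slide's predicate:

```python
# Runtime-check sketch of the change contract, including the default
# equal assumption for inputs the contract does not mention.

class SortedStrings(list):           # illustrative stand-in for TreeSet
    def added(self, s):
        return SortedStrings(sorted(self + [s]))

def complex_predicate(s):            # placeholder predicate on s
    return s.startswith("#")

def m_old(s):                        # previous version of method m
    return set() if complex_predicate(s) else SortedStrings()

def m_new(s):                        # current version of method m
    return set() if complex_predicate(s) else SortedStrings().added(s)

def check_change_contract(s):
    old, new = m_old(s), m_new(s)
    if isinstance(new, SortedStrings):   # when_ensured clause holds
        assert len(new) == len(old) + 1  # ensures: size grew by one
    else:                                # default equal assumption
        assert new == old
    return True

print(check_change_contract("abc"))  # TreeSet case: ensures clause checked
print(check_change_contract("#x"))   # other case: old and new must agree
```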
39. Empty change contract
//@ changed_behavior
Old version:
    void m1() {
        x.f = x.f - 1;
    }
New version:
    void m2() {
        x.f = x.f + 1 - 2;
    }
The empty change contract asserts equivalent behavior: pre ≡ pre' and post ≡ post'.
40. Change contract checker
Versions V1 and V2, together with the change contract, are fed to the change contract (C.C.) checker, built on OpenJML, which comprises a parser, runtime assertion checking (RAC), and extended static checking (ESC). Inputs are generated automatically via test suite augmentation [ASE10].
41. Change Contract Language Expressiveness
User study on 3 open-source Java projects: Ant, JMeter, Log4j.
Change contracts were written for real changes by two MS students (CS majors):
not related to our research, to give veracity to the user study;
without a background on contracts.
52 changes in total; all of them could be expressed using change contracts.
42. User study
Proj.    Total changes   Refactoring   Diff   ??   n/a
Ant      43              4             28     3    8
JMeter   17              1             11     1    4
log4j    20              2             13     1    4

• ??: failed to be understood due to, e.g., a 3rd-party library
• n/a: e.g., multi-threading, non-program changes
43. Detecting Incorrect Changes through Change Contracts
V1 → V2: a buggy change. V2 → V3: a bugfix to the previous bug. A change contract describes the intended change.
Detected all incorrect changes for 10 studied cases in Ant, JMeter, Log4j, via randomly generated test cases from Randoop.
44. Change contract can ...
detect incorrect program changes.
serve as program change requirement.
substitute for ambiguous and often incorrect change logs.
guide more efficient test suite augmentation.
45. Wrap-up: Debugging software regressions
Describe intended behavior of program changes
Change Contract language
Extract actual behavior resulting from program changes
Symbolic execution
Novel usage of symbolic execution, beyond guiding search.
46. "Our dilemma is that we hate change and love it at the same time; what we really want is for things to remain the same but get better." – Sydney J. Harris
47. Acknowledgements
• Co-Authors
• Dawei Qi, Zhenkai Liang, Jooyong Yi, … – NUS
• Kapil Vaswani – Microsoft Research India
• Ansuman Banerjee – Indian Statistical Institute Kolkata
• Funding from
• Defense Research and Technology Office.
• Ministry of Education, Singapore.
• Papers: FSE 09, FSE 10, FSE 11, FSE 12, ASE10, TOSEM