Developer testing, a common step in software development, involves generating sufficient test inputs and checking the behavior of the program under test during the execution of the test inputs. Complicated logics inside a method make generating appropriate arguments difficult. In testing object-oriented programs, generating method sequences to put the receiver object or argument objects into appropriate states further complicates test-input generation. After the generated test inputs are executed, program crashes or uncaught exceptions can be used to indicate program problems, especially robustness problems. However, some program problems such as producing wrong program outputs do not crash the program.
In this talk, the speaker will present an overview of achievements and challenges in improving automation in developer testing, especially on test-input generation (i.e., generating sufficient test inputs) and test oracles (i.e., checking the behavior of the program under test).
About the speaker:
Tao Xie is an Associate Professor in the Department of Computer Science of the College of Engineering at North Carolina State University. He received his Ph.D. in Computer Science from the University of Washington in 2005. Before that, he received an M.S. in Computer Science from the University of Washington in 2002, an M.S. in Computer Science from Peking University in 2000, and a B.S. in Computer Science from Fudan University in 1997. He worked as a visiting researcher at Microsoft Research Redmond and Microsoft Research Asia.
His research interests are in software engineering, focusing on automated software testing and mining software engineering data. He has published more than 100 research papers in refereed journals and conference proceedings in the area of software engineering. Besides doing research, he has contributed to understanding the software engineering research community.
He has served as the ACM SIGSOFT History Liaison in the SIGSOFT Executive Committee as well as serving in the ACM History Committee. He received a National Science Foundation Faculty Early Career Development (CAREER) Award in 2009. He received 2008, 2009, and 2010 IBM Faculty Awards and a 2008 IBM Jazz Innovation Award. He received 2010 North Carolina State University Sigma Xi Faculty Research Award. He received the ASE 2009 Best Paper Award and an ACM SIGSOFT Distinguished Paper Award. He was Program Co-Chair of 2009 IEEE International Conference on Software Maintenance (ICSM) and is Program Co-Chair of 2011 and 2012 International Working Conference on Mining Software Repositories (MSR).
2. Automation in Developer Testing
• Background on developer testing
– http://www.developertesting.com/
– Kent Beck’s 2004 talk on “Future of Developer
Testing”
http://www.itconversations.com/shows/detail301.h
tml
• This talk focuses on developer testing
– Not system testing etc. conducted by testers
• Unit Test Automation commonly referred to writing
unit test cases manually, executed automatically 2
4. Software Testing Problems
=
?
Outputs Expected
Outputs
Program+Test
inputs
Test Oracles
4
• Faster: How can tools help developers create and run tests faster?
5. Software Testing Problems
=
?
Outputs Expected
Outputs
Program+Test
inputs
Test Oracles
5
• Faster: How can tools help developers create and run tests faster?
• Better Test Inputs: How can tools help generate new better test inputs?
6. Software Testing Problems
=
?
Outputs Expected
Outputs
Program+Test
inputs
Test Oracles
6
• Faster: How can tools help developers create and run tests faster?
• Better Test Inputs: How can tools help generate new better test inputs?
• Better Test Oracles: How can tools help generate better test oracles?
7. Example Unit Test Case
=
?
Outputs Expected
Outputs
Program+Test
inputs
Test Oracles
7
void addTest() {
ArrayList a = new ArrayList(1);
Object o = new Object();
a.add(o);
AssertTrue(a.get(0) == o);
}
• Appropriate method sequence
• Appropriate primitive argument values
• Appropriate assertions
Test Case = Test Input + Test
Oracle
8. Levels of Test Oracles
• Expected output for an individual test input
– In the form of assertions in test code
• Properties applicable for multiple test inputs
– Crash (uncaught exceptions) or not, related to
robustness issues, supported by most tools
– Properties in production code: Design by Contract
(precondition, postcondition, class invariants)
supported by Parasoft Jtest, Google CodePro
AnalytiX
– Properties in test code: Parameterized unit tests
supported by MSR Pex, AgitarOne
X. Xiao, S. Thummalapenta, and T. Xie. Advances on Improving Automation in
Developer Testing. In Advances in Computers, 2012
http://people.engr.ncsu.edu/txie/publications.htm#ac12-devtest
9. Economics of Test Oracles
9
• Expected output for an individual test input
– Easy to manually verify for one test input
– Expensive/infeasible to verify for many test inputs
– Limited benefits: only for one test input
• Properties applicable for multiple test inputs
– Not easy to write (need abstraction skills)
– But once written, broad benefits for multiple test
inputs
10. Assert behavior of multiple test inputs
Design by Contract
• Example tools: Parasoft Jtest, Google CodePro
AnalytiX, MSR Code Contracts, MSR Pex
• Class invariant: properties being satisfied by an
object (in a consistent state) [AgitarOne allows a
class invariant helper method used as test oracles]
• Precondition: conditions to be satisfied (on receiver
object and arguments) before a method can be
invoked
• Postcondition: properties being satisfied (on receiver
object and return) after the method has returned
• Other types of specs also exist
http://research.microsoft.com/en-
11. Microsoft Research Code
Contracts
[ContractInvariantMethod]
void ObjectInvariant() {
Contract.Invariant( items != null );
}
Features
§ Language expression
syntax
§ Type checking / IDE
§ Declarative
§ Special Encodings
§ Result and Old
public virtual int Add(object value)
{
Contract.Requires( value != null );
Contract.Ensures( Count == Contract.OldValue(Count) + 1 );
Contract.Ensures( Contract.Result<int>() == Contract.OldValue(Count) );
if (count == items.Length) EnsureCapacity(count + 1);
items[count] = value;
return count++;
}
- Slide adapted from MSRhttp://research.microsoft.com/en-us/projects/contracts/
12. Parameterized Unit Testing
void TestAdd(List list, int item) {
Assume.IsTrue(list != null);
var count = list.Count;
list.Add(item);
Assert.AreEqual(count + 1, list.Count);
}
• Parameterized Unit Test =
Unit Test with Parameters
• Separation of concerns
– Data is generated by a tool
– Developer can focus on functional specification
[Tillmann&Schulte ESEC/FSE 05]
http://research.microsoft.com/apps/pubs/default.aspx?id=77419
13. Parameterized Unit Tests are
Formal Specifications
Algebraic Specifications• A Parameterized Unit Test can be read as a
universally quantified, conditional axiom.
void TestReadWrite(Res r, string name, string data) {
Assume.IsTrue(r!=null & name!=null && data!=null);
r.WriteResource(name, data);
Assert.AreEqual(r.ReadResource(name), data);
}
string name, string data, Res r:
r ≠ null ⋀ name ≠ null ⋀ data ≠ null ⇒⇒
equals(
ReadResource(WriteResource(r, name, data).state, name),
data)
15. Parameterized Unit Testing
Getting PopularParameterized Unit Tests (PUTs) commonly supported
by various test frameworks
• .NET: Supported by .NET test frameworks
– http://www.mbunit.com/
– http://www.nunit.org/
– …
• Java: Supported by JUnit 4.X
– http://www.junit.org/
Generating test inputs for PUTs supported by tools
• .NET: Supported by Microsoft Research Pex
– http://research.microsoft.com/Pex/
• Java: Supported by Agitar AgitarOne
– http://www.agitar.com/
17. Assert behavior of multiple test inputs
Software Agitation in AgitarOne
Code
Software
Agitation
Observations
on code behavior,
plus
Test Coverage data
If an Observation
reveals a bug, fix it
If it describes desired behavior,
click to create a Test AssertionCode
Compile
Review
Agitate
- Slide adapted from Agitar Software Inc.
http://www.agitar.com
/
19. Automated Test Generation
19
Recent advanced
technique: Dynamic
Symbolic
Execution/Concolic Testing
•Instrument code to explore
feasible paths
P. Godefroid, N. Klarlund, and K. Sen. DART: directed automated random testing. In
Proc. PLDI 2005
K. Sen, D. Marinov, and G. Agha. CUTE: a concolic unit testing engine for C. In Proc.
ESEC/FSE 2005
N. Tillmann and J. de Halleux. Pex - White Box Test Generation for .NET. In Proc.
TAP 2008
20. void CoverMe(int[] a)
{
if (a == null) return;
if (a.Length > 0)
if (a[0] == 1234567890)
thrownewException("bug");
}
a.Length>0
a[0]==123…
TF
T
F
F
a==null
T
Constraints to solve
a!=null
a!=null &&
a.Length>0
a!=null &&
a.Length>0 &&
a[0]==123456890
Input
null
{}
{0}
{123…}
Execute&MonitorSolve
Choose next path
Observed constraints
a==null
a!=null &&
!(a.Length>0)
a==null &&
a.Length>0 &&
a[0]!=1234567890
a==null &&
a.Length>0 &&
a[0]==1234567890
Done: There is no path
left.
Dynamic Symbolic Execution in
Pex
http://pex4fun.com/HowDoesPexW
21. Automating Test Generation
• Method sequences
– MSeqGen/Seeker [Thummalapenta et al. OOSPLA 11, ESEC/FSE
09],
Covana [Xiao et al. ICSE 2011], OCAT [Jaygarl et al. ISSTA 10],
Evacon [Inkumsah et al. ASE 08], Symclat [d'Amorim et al. ASE 06]
• Environments e.g., db, file systems, network, …
– DBApp Testing [Taneja et al. ESEC/FSE 11], [Pan et al. ASE 11]
– CloudApp Testing [Zhang et al. IEEE Soft 12]
• Loops
– Fitnex [Xie et al. DSN 09]
@NCSU ASE
http://people.engr.ncsu.edu/txie/publications.htm
22. Pex on MSDN DevLabs
Incubation Project for Visual Studio
Download counts (20 months)
(Feb. 2008 - Oct. 2009 )
Academic: 17,366
Devlabs: 13,022
Total: 30,388
http://research.microsoft.com/projects/pe
23. Open Source Pex extensions
http://pexase.codeplex.com/
Publications: http://research.microsoft.com/en-us/projects/pex/community.aspx#publications
24. Writing Test Oracles
Learning Formal Methods!?
• Parameterized Unit Test =
Unit Test with Parameters
• Separation of concerns
– Data is generated by a tool
– Developer can focus on functional
specification
void TestAdd(List list, int item) {
Assume.IsTrue(list != null);
var count = list.Count;
list.Add(item);
Assert.AreEqual(count + 1, list.Count);
}
25. Automatic Test Generation
Human Assistance to Test Generation?!
Running Symbolic PathFinder ...
…
=====================================
================= results
no errors detected
=====================================
================= statistics
elapsed time: 0:00:02
states: new=4, visited=0, backtracked=4,
end=2
search: maxDepth=3, constraints=0
choice generators: thread=1, data=2
heap: gc=3, new=271, free=22
instructions: 2875
max memory: 81MB
loaded code: classes=71, methods=884
…
25
26. Challenges
Faced by Test Generation Tools
object-creation problems (OCP) - 65%
external-method call problems (EMCP) –
Total block coverage achieved is 50%, lowest coverage
16%.
26
Example: Dynamic Symbolic
Execution/Concolic Testing
• Instrument code to explore feasible paths
• Challenge: path explosion
27. Ø A graph example
from QuickGraph
library
Ø Includes two classes
Graph
DFSAlgorithm
Ø Graph
AddVertex
AddEdge: requires
both vertices to be
in graph
00: class Graph : IVEListGraph { …
03: public void AddVertex (IVertex v) {
04: vertices.Add(v); // B1 }
06: public Edge AddEdge (IVertex v1, IVertex v2) {
07: if (!vertices.Contains(v1))
08: throw new VNotFoundException("");
09: // B2
10: if (!vertices.Contains(v2))
11: throw new VNotFoundException("");
12: // B3
14: Edge e = new Edge(v1, v2);
15: edges.Add(e); } }
//DFS:DepthFirstSearch
18: class DFSAlgorithm { …
23: public void Compute (IVertex s) { ...
24: if (graph.GetEdges().Size() > 0) { // B4
25: isComputed = true;
26: foreach (Edge e in graph.GetEdges()) {
27: ... // B5
28: }
29: } } }
[Thummalapenta et al. OOPSLA
Example Object-Creation
Problem
28. 28
Ø Test target: Cover true
branch (B4) of Line 24
Ø Desired object
state: graph should
include at least one
edge
Ø Target sequence:
Graph ag = new Graph();
Vertex v1 = new Vertex(0);
Vertex v2 = new Vertex(1);
ag.AddVertex(v1);
ag.AddVertex(v2);
ag.AddEdge(v1, v2);
DFSAlgorithm algo = new
DFSAlgorithm(ag);
algo.Compute(v1);
00: class Graph : IVEListGraph { …
03: public void AddVertex (IVertex v) {
04: vertices.Add(v); // B1 }
06: public Edge AddEdge (IVertex v1, IVertex v2) {
07: if (!vertices.Contains(v1))
08: throw new VNotFoundException("");
09: // B2
10: if (!vertices.Contains(v2))
11: throw new VNotFoundException("");
12: // B3
14: Edge e = new Edge(v1, v2);
15: edges.Add(e); } }
//DFS:DepthFirstSearch
18: class DFSAlgorithm { …
23: public void Compute (IVertex s) { ...
24: if (graph.GetEdges().Size() > 0) { // B4
25: isComputed = true;
26: foreach (Edge e in graph.GetEdges()) {
27: ... // B5
28: }
29: } } }
Example Object-Creation
Problem
[Thummalapenta et al. OOPSLA
29. Challenges
Faced by Test Generation Tools
object-creation problems (OCP) - 65%
external-method call problems (EMCP) –
Total block coverage achieved is 50%, lowest coverage
16%.
29
Example: Dynamic Symbolic
Execution/Concolic (Pex)
• Instrument code to explore feasible paths
• Challenge: path explosion
30. Example External-Method Call
Problems (EMCP)
Example 1:
File.Exists has data
dependencies on program input
Subsequent branch at Line 1
using the return value of
File.Exists.
Example 2:
Path.GetFullPath has data
dependencies on program input
Path.GetFullPath throws
exceptions.
Example 3: String.Format
do not cause any problem
30
1
2
3
31. Human Can Help!
Object Creation Problems (OCP)
Tackle object-creation problems with Factory Methods
31
32. Human Can Help!
External-Method Call Problems (EMCP)
Tackle external-method call problems with Mock
Methods or Method Instrumentation
Mocking System.IO.File.ReadAllText
32
33. State-of-the-Art/Practice
Testing Tools
Running Symbolic PathFinder ...
…
=====================================
================= results
no errors detected
=====================================
================= statistics
elapsed time: 0:00:02
states: new=4, visited=0, backtracked=4,
end=2
search: maxDepth=3, constraints=0
choice generators: thread=1, data=2
heap: gc=3, new=271, free=22
instructions: 2875
max memory: 81MB
loaded code: classes=71, methods=884
…
Tools typically don’t communicate challenges faced by them to enable
cooperation between tools and users.
We typically don’t teach people how to cooperate with tools.
33
X. Xiao, T. Xie, N. Tillmann, and J. de Halleux. Precise Identification of Problems for
Structural Test Generation. In Proc. ICSE 2011
http://people.engr.ncsu.edu/txie/publications/icse11-covana.pdf
35. Coding Duels
Pex computes “semantic diff” in cloud
code written in browser vs.
secret reference implementation
You win when Pex finds no differences
secret
36. Behind the Scene of Pex for Fun
Secret
Implementation
class Secret {
public static int Puzzle(int x)
{
if (x <= 0) return 1;
return x * Puzzle(x-1);
}
}
Player Implementation
class Player {
public static int Puzzle(int x)
{
return x;
}
}
class Test {
public static void Driver(int x) {
if (Secret.Puzzle(x) !=
Player.Puzzle(x))
throw new
Exception(“Mismatch”);
}
behavior
Secret Impl == Player Impl
36
37. Coding Duels
Fun and Engaging
Iterative gameplay
Adaptive
Personalized
No cheating
Clear winning criterion
38. Example User Feedback
“It really got me *excited*. The part that got me most is
about spreading interest in teaching CS: I do think that
it’s REALLY great for teaching | learning!”
“I used to love the first person shooters and
the satisfaction of blowing away a whole team
of Noobies playing Rainbow Six, but this is far
more fun.”
“I’m afraid I’ll have to constrain myself to spend just
an hour or so a day on this really exciting stuff, as I’m
really stuffed with work.”
Released since
2010
X
42. Coding Duels for Training Testing
public static string Puzzle(int[] elems, int capacity, int elem) {
if ((maxsize <= 0) || (elems == null) || (elems.Length > (capacity
+ 1)))
return "Assumption Violation!";
Stack s= new Stack(capacity);
for (int i = 0; i < elems.Length; i++)
s.Push(elems[i]);
int origSize = s.GetNumOfElements();
//Please fill in below test scenario on the s stack
//The lines below include assertions to assert the program
behavior
PexAssert.IsTrue(s.GetNumOfElements() == origSize + 1);
PexAssert.IsTrue(s.Top() == elem);
PexAssert.IsTrue(!s.IsEmpty());
PexAssert.IsTrue(s.IsMember(elem));
return s.GetNumOfElements().ToString() + "; “ +
s.Top().ToString() + "; “
Set up a stack with some
elements
Cache values used in
assertions
43. Usage Scenarios of Pex4Fun
• Massive Open Online Courses (MOOC):
Challenges
– Grading, addressed by Pex4Fun
– Cheating [Open Challenge]
• Course assignments (students/professionals)
– E.g., intro programming, software engineering
• Student/professional competitions
– E.g., coding-duel competition at ICSE 2011
• Assessment of testing/programming/problem
solving skills for job applicants
– Not just final results of problem solving but also process!
44. More Reading
Nikolai Tillmann, Jonathan De Halleux, Tao
Xie, Sumit Gulwani and Judith Bishop
Teaching and Learning Programming
and Software Engineering via Interactive
Gaming
In Proceedings of the 35th International
Conference on Software Engineering (ICSE
2013), Software Engineering Education
(SEE), San Francisco, CA, May 2013.
http://people.engr.ncsu.edu/txie/publications
/icse13see-pex4fun.pdf
45. Conclusion
• Software testing is important and yet
costly; needs automation
• Better Test Inputs: help generate new
better test inputs
– Generate method arguments
– Generate method sequences
• Better Test Oracles: help generate better
test oracles
– Assert behavior of individual test inputs
– Assert behavior of multiple test inputs
• Software Testing Educational Gaming 45
46. Example Industrial
Developer Testing Tools
• Agitar AgitatorOne http://www.agitar.com/
• Parasoft Jtest http://www.parasoft.com/
• Google CodePro AnalytiX
https://developers.google.com/java-dev-tools/codepro/doc/
• SilverMark Test Mentor http://www.silvermark.com/
• Microsoft Research Pex (for .NET)
http://research.microsoft.com/Pex/
• Microsoft Research Spec Explorer (for .NET)
http://research.microsoft.com/specexplorer/
46
47. Trends in Practice
• Regression Test Selection/Prioritization
• Cloud Computing for Test Execution, e.g.,
http://www.skytap.com/
• Crowdsourcing for Testing, e.g.,
http://www.utest.com/
• Mocking Environments
– Google: EasyMock
– Microsoft VS: Fake/Moles
http://research.microsoft.com/en-us/projects/pex/
• Automatic Test Generation
– Microsoft: Pex, SAGE
http://research.microsoft.com/en-us/um/people/pg/
49. Automated Combinatorial Testing
Goals – reduce testing cost, improve cost-benefit ratio
Accomplishments – huge increase in performance,
scalability, 200+ users, most major IT firms and others
Also non-testing applications – modelling and
simulation, genome
http://csrc.nist.gov/groups/SNS/acts/index.html
51. NIST ACTS Tool
• Covering array generator
• Coverage analysis - what is the combinatorial coverage of
existing test set?
• .NET configuration file generator
• Fault characterization -
ongoing Current
users
http://csrc.nist.gov/groups/SNS/acts/documents/comparison-report.html
approximately 200 users as of
July 2009, in IT, defense,
finance, telecom, and many
other industries