2. Who Am I?
PhD Student at University of São Paulo
Master's thesis defended in April 2012
Software Developer
Consultancy for companies such as
VeriFone and Sony.
Nowadays: Caelum
Open Source
Restfulie.NET
1st Test-Driven Development book in Brazilian
Portuguese (in my unbiased opinion, the
best TDD book ever!)
3. Unit Tests and Code Quality
Great synergy between a testable class
and a well-designed class (Feathers, 2007)
Writing unit tests can become
complex as the interactions between
software components grow out of control
(McGregor, 2001)
Agile practitioners state that unit tests are
a way to validate and improve class
design (Beck et al, 2001)
4. What am I Going to Say?
A little bit about my master thesis.
The very first step of my PhD.
The tool I am working on.
5. 1st part: TDD and Class Design
Does the practice of TDD influence the
quality of class design?
Mixed study with ~20 experienced
developers from industry
33% have 6 to 10 years of experience
6 different companies in 3 different cities
Developers were asked to implement a
set of problems, using and not using TDD.
Exercises dealt with coupling, cohesion,
encapsulation problems.
6. Quantitative Analysis
264 production classes
831 methods / 2520 lines of code
73 test classes
225 methods / 1832 lines of code
Wilcoxon test to compare the difference
between both groups.
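The slide does not say which Wilcoxon variant was used. As a minimal sketch, assuming two independent samples (code written with and without TDD), the rank-sum (Mann-Whitney U) statistic can be computed as below. The class name `RankSumSketch` and the sample values are hypothetical, and the p-value lookup is left out.

```java
import java.util.Arrays;

public class RankSumSketch {

    /** Returns the Mann-Whitney U statistic for the first sample. */
    public static double uStatistic(double[] first, double[] second) {
        int n1 = first.length;
        int n2 = second.length;

        // Pool both samples and sort them so ranks can be read off by position.
        double[] pooled = new double[n1 + n2];
        System.arraycopy(first, 0, pooled, 0, n1);
        System.arraycopy(second, 0, pooled, n1, n2);
        Arrays.sort(pooled);

        // Sum the (tie-averaged) ranks of the first sample.
        double rankSum = 0.0;
        for (double value : first) {
            rankSum += averageRank(pooled, value);
        }

        // U1 = R1 - n1 * (n1 + 1) / 2
        return rankSum - n1 * (n1 + 1) / 2.0;
    }

    /** Average 1-based rank of a value in a sorted array; ties share the mean rank. */
    private static double averageRank(double[] sorted, double value) {
        int lo = 0;
        while (lo < sorted.length && sorted[lo] < value) {
            lo++;
        }
        int hi = lo;
        while (hi < sorted.length && sorted[hi] == value) {
            hi++;
        }
        // The value occupies 1-based positions lo + 1 through hi.
        return (lo + 1 + hi) / 2.0;
    }

    public static void main(String[] args) {
        // Hypothetical metric values for the two groups.
        double[] withTdd = {1, 2, 3};
        double[] withoutTdd = {4, 5, 6};
        System.out.println("U = " + uStatistic(withTdd, withoutTdd));
    }
}
```

A U value close to 0 or to n1 * n2 suggests the groups differ; a value near n1 * n2 / 2 suggests they do not. If the samples were paired per developer, the signed-rank variant would be the appropriate one instead.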
8. Quantitative Analysis
Filtering by their experience in TDD
No statistical significance.
Specialists’ opinion
Two different specialists reviewed all
generated code, without knowing if that
code was produced with or without TDD.
They evaluated in terms of “class
design”, “testability”, “simplicity”, using a
Likert scale from 1 to 5.
No difference in their evaluation.
9. Qualitative Analysis
Interviews with ~10 developers.
All of them said that “TDD does not guide
you to a better class design by itself; the
experience in OO and class design makes
such a difference”.
Some patterns emerged.
11. In my PhD
My idea is to check whether the presence
of those patterns in a unit test really
implies bad production code.
MSR techniques.
Open source repositories for exploratory
purposes and industry repositories for the
final study.
12. 2nd part: Unit Tests and Asserts
Every unit test contains three parts
Set up the scenario
Invoke the behavior under test
Validate the expected output
Assert instructions
assertEquals (expected, calculated);
assertTrue(), assertFalse(), and so on
No limits for the number of asserts per test
13. A little piece of code
class InvoiceTest {
    @Test
    public void shouldCalculateTaxes() {
        // (i) setting up the scenario
        Invoice inv = new Invoice(5000.0);
        // (ii) invoking the behavior
        double tax = inv.calculateTaxes();
        // (iii) validating the output (doubles are compared with a tolerance)
        assertEquals(5000 * 0.06, tax, 0.0001);
    }
}
14. Why would…
… a test contain more than one assert?
Is it a smell of bad code/design?
15. Research Design
We selected 22 projects
19 from ASF
3 from a Brazilian consultancy
Data extraction from all projects
Code metrics
Statistical test
Qualitative Analysis
16. Data Extraction
Test code
Number of asserts per test
Production method being tested
Production code
Cyclomatic Complexity (McCabe, 1976)
Number of method invocations (Li and
Henry, 1993)
Lines of Code
17. Heuristic to Extract the Production Method
class InvoiceTest {
    @Test
    public void shouldCalculateTaxes() {
        // (i) setting up the scenario
        Invoice inv = new Invoice(5000.0);
        // (ii) invoking the behavior
        double tax = inv.calculateTaxes();
        // (iii) validating the output (doubles are compared with a tolerance)
        assertEquals(5000 * 0.06, tax, 0.0001);
    }
}

class Invoice {
    public double calculateTaxes() {
        // something…
    }
}
20. Why more than 1 assert?
130 tests randomly selected
Qualitative analysis:
More than one assert for the same object
(40.4%)
Different inputs to the same method (38.9%)
List/Array (9.9%)
Others (6.8%)
Extra assert to check if object is not null (3.8%)
21. “Asserted Objects”
We coined the term “asserted objects”
It counts not the number of asserts in a unit
test, but the number of different instances
of objects that are being asserted
assertEquals(10, obj.getA());
assertEquals(20, obj.getB());
Counts as 1 “asserted object”
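To make the distinction concrete, here is a rough sketch that counts both asserts and asserted objects in the body of a test. It is a text-based approximation and not the tool used in the study: the class name `AssertedObjectCounter` is hypothetical, and it assumes the asserted object is the receiver of the last `receiver.method(` call on an assert line.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AssertedObjectCounter {

    // Matches "receiver.method(" and captures the receiver name.
    private static final Pattern RECEIVER_CALL = Pattern.compile(
            "([A-Za-z_][A-Za-z0-9_]*)\\s*\\.\\s*[A-Za-z_][A-Za-z0-9_]*\\s*\\(");

    private static boolean isAssert(String line) {
        return line.trim().startsWith("assert");
    }

    /** Counts the lines that start with an assert call. */
    public static int countAsserts(List<String> testBody) {
        int count = 0;
        for (String line : testBody) {
            if (isAssert(line)) {
                count++;
            }
        }
        return count;
    }

    /** Collects the distinct receivers that appear in assert lines. */
    public static Set<String> assertedObjects(List<String> testBody) {
        Set<String> receivers = new LinkedHashSet<>();
        for (String line : testBody) {
            if (!isAssert(line)) {
                continue;
            }
            // Assumption: the actual value is the last call on the line.
            Matcher matcher = RECEIVER_CALL.matcher(line);
            String last = null;
            while (matcher.find()) {
                last = matcher.group(1);
            }
            if (last != null) {
                receivers.add(last);
            }
        }
        return receivers;
    }

    public static void main(String[] args) {
        List<String> body = Arrays.asList(
                "assertEquals(10, obj.getA());",
                "assertEquals(20, obj.getB());");
        System.out.println("asserts: " + countAsserts(body));
        System.out.println("asserted objects: " + assertedObjects(body));
    }
}
```

For the two asserts above, the sketch reports two asserts but a single asserted object, `obj`. A real implementation would more likely work on the parsed syntax tree, so that asserts on plain variables and statically qualified calls are handled too.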
24. Findings
Counting the number of asserts in a unit test
does not give valuable feedback about
code quality
But counting the number of asserted objects
may provide useful information
However, the difference between both groups
was not “big”
A possible explanation:
Methods with higher CC, more lines of
code, and more method invocations contain many
different paths, and developers prefer to test
all of them in a single unit test rather than splitting them
across many tests
25. My current problem
How can we statistically identify whether test code is
a “unit test” or an “integration/system
test”?
26. 3rd Step: Metric Miner
Started as a command-line tool to
calculate code metrics in Git repositories.
As you can guess, I needed that for my
master's.
An undergraduate student ported my tool
to a web-based system.
Much more interesting!
27. What does it do?
Tool that facilitates studies in MSR.
Already contains the entire Apache
repository cloned.
Researchers can write a new metric and
just plug it into the system.
Later on, they can execute an SQL query
and extract data.
They can also execute a statistical test
on two sets of existing data.
28. Pros and Cons
You do not need to spend your computer
resources.
The power of cloud computing (thanks,
Locaweb!)
Still slow.
We need to parallelize the metric
execution.
Go for Google’s BigQuery (~300GB of
data).
29. Contact Information
Mauricio Aniche
aniche@ime.usp.br / @mauricioaniche
TDD no Mundo Real
http://www.tddnomundoreal.com.br
Software Engineering & Collaborative
Systems Research Lab (LAPESSC)
http://lapessc.ime.usp.br/