10. Largest Peer Code Review Study
Objectives:
– Compare lightweight vs. formal inspections
– Determine what constitutes an effective review
• 10-month case study at Cisco
• Cisco MeetingPlace product, a teleconferencing solution
• 3.2 million lines of code
• 2,500 reviews
• 50 developers
17. Measure

| Measure | Industry Average | High-Performance Teams |
|---|---|---|
| Net Promoter Score | 20% | > 70% |
| % of total injected defects found by customers | 15% | < 2% |
| % effort spent finding and fixing defects | 50% | < 10% |
| % effort for post-release support | 30% | < 5% |
| Unit test code coverage | Varies | > 80% |
| Post-release defect density | 7.5 defects/KLOC | < 0.5 defects/KLOC |
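The last row of the table is measured in defects per thousand lines of code (KLOC). A minimal sketch of that calculation (the function name and the sample figures below are illustrative, not from the study):

```python
def defect_density(defects_found, lines_of_code):
    """Post-release defect density in defects per KLOC."""
    return defects_found / (lines_of_code / 1000)

# Illustrative figures only: a 3.2 MLOC code base (the size of the
# Cisco MeetingPlace product in the study) with 1,600 field defects
# would sit exactly at the 0.5 defects/KLOC high-performance threshold.
print(defect_density(1600, 3_200_000))  # 0.5
```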
18. Measure
• Bugs found in development are 8-12X less expensive to fix than those found in the testing phase
• And 30-100X less expensive than bugs that reach customers
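The cost multipliers above can be turned into a rough back-of-the-envelope estimate. A hypothetical sketch (the function, dictionary, and dollar figure are illustrative; only the multipliers come from the slide):

```python
# Relative cost of fixing a bug by the phase in which it is found,
# with the development-phase cost normalized to 1.
PHASE_COST_MULTIPLIER = {
    "development": (1, 1),    # found in peer review / while coding
    "testing": (8, 12),       # 8-12X more expensive than development
    "customer": (30, 100),    # 30-100X more expensive than development
}

def fix_cost_range(phase, dev_cost=1.0):
    """Return the (low, high) estimated cost to fix a bug found in `phase`."""
    low, high = PHASE_COST_MULTIPLIER[phase]
    return (low * dev_cost, high * dev_cost)

# If a bug costs $100 to fix during development, the same bug found
# by a customer costs an estimated $3,000-$10,000 to fix.
print(fix_cost_range("customer", dev_cost=100.0))  # (3000.0, 10000.0)
```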
19. Case Study: Large National Insurance Company
• 2011: 350 developers
• 2013: 650 team members
• User stories are shared in Word format with the entire team
• Design documents are shared in PowerPoint with the entire team
• Code is shared with the entire team
• Test cases are shared in Excel format with the entire team
20. Benefits: Cross-Functional Peer Review
• Every member of the extended development team knows what's happening
• Problems with user stories, code, and test plans are found faster
• Developers are forced to write readable code
• Optimization methods, tricks, and productive programs spread faster
• Teams can iterate from story to code to test plan
• It's Agile
• It's fun
One of the study's key findings is that code review is effective at discovering defects up to about the 60-minute mark: the number of defects discovered keeps rising with time spent. After that mark it plateaus, and additional time does not yield more discovered defects.
As you can see from this set of data, defect density, in other words the number of defects discovered per thousand lines of code, tends to cluster between 0 and 500 lines of code examined per hour. If you go far too fast, say 1,200 lines of code per hour, the number of defects you discover drops sharply. So, as I tell my daughter, who just started driving: go slow.
This is similar to the last point: don't pile up thousands and thousands of lines of code for someone to review, because the sheer volume keeps them from focusing and detecting errors. Keep each review to a modest amount of code, in the 200-to-400-line range, which gives you the best yield.
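The three guidelines above (cap a session at about 60 minutes, keep the inspection rate under roughly 500 LOC per hour, and keep the changeset in the 200-400 LOC range) can be sketched as a simple pre-review check. The thresholds come from the study; the function itself is a hypothetical illustration:

```python
def review_warnings(loc, minutes):
    """Flag a proposed review session against the study's guidelines.

    loc: lines of code in the changeset; minutes: planned session length.
    Returns a list of warning strings (empty if all guidelines are met).
    """
    warnings = []
    if loc > 400:
        warnings.append("changeset over 400 LOC: split it up (200-400 is the sweet spot)")
    if minutes > 60:
        warnings.append("session over 60 minutes: defect yield plateaus after an hour")
    rate = loc / (minutes / 60)  # LOC examined per hour
    if rate > 500:
        warnings.append(f"inspection rate {rate:.0f} LOC/hour exceeds 500: slow down")
    return warnings

print(review_warnings(300, 45))    # [] -> within all three guidelines
print(review_warnings(1200, 60))   # flags both changeset size and inspection rate
```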