Source code comprehension on evolving software
1. Source Code Comprehension on Evolving Software:
A Literature Survey
Yida Tao
Supervisor: Sunghun Kim
2. Motivation
Code Change Comprehension
Tao et al., FSE’12
Code change comprehension is
• Frequently required
• In major development activities, in
particular the code-review process
• How do software engineers understand code changes? An exploratory study in industry. Tao et al., FSE’12
• Expectations, outcomes, and challenges of modern code review. Bacchelli and Bird, ICSE’13
Bacchelli & Bird, ICSE’13
• “…review and understand code they
have not seen before may be more
common than a developer working on
new code”
• “From interviews, no other code
review challenge emerged as clearly as
understanding the submitted change”
5. Text Differencing
Flat representation of a program
Sequence of strings
Unix diff
Outputs only added and deleted lines; cannot detect modified lines
Hard to determine when a code fragment has been moved up or down
Ldiff (Canfora et al., ICSE’09)
An enhanced line differencing tool
Limitations
Changes to *characters*
No syntactic-structure information
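The first limitation of line-based differencing can be seen in a small sketch. It uses Python's standard difflib module as a stand-in for Unix diff (it is not the same tool), and the two versions of the snippet are invented for illustration. A one-character modification is reported as one deleted line plus one added line.

```python
import difflib

# Hypothetical old and new versions of a three-line fragment
old = ["int total = 0;", "total = total + x;", "return total;"]
new = ["int total = 0;", "total = total + y;", "return total;"]

def line_diff(a, b):
    """Return only the added/removed lines of a unified diff, without headers."""
    out = []
    for line in difflib.unified_diff(a, b, lineterm="", n=0):
        # Skip the file headers and the hunk markers
        if line.startswith(("---", "+++", "@@")):
            continue
        out.append(line)
    return out

changes = line_diff(old, new)
print(changes)
```

The output contains a "-" line and a "+" line for the same statement; nothing in it says that the line was modified rather than replaced.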
6. Syntactic Differencing
Structured representation of a program
Abstract syntax tree; XML
ChangeDistiller (Fluri et al., TSE’07)
Tree differencing
Node: bigram string similarity
Control structure: subtree similarity
Output: tree edit script (insert, delete, move, update)
XML differencing
srcML (Maletic & Collard, ICSM’04): embeds abstract syntax and structure within the source code
diffX (Al-Ekram et al., CASCON '05)
Limitation
Cannot describe how the behavior of a program has changed
Still reports differences for behavior-preserving changes
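The bigram string similarity used to match nodes can be sketched as follows. This is a minimal sketch assuming one common definition (the Dice coefficient over sets of character bigrams); the exact variant and thresholds used by ChangeDistiller are not given in these slides, and the function names are illustrative.

```python
def bigrams(s):
    """Return the set of adjacent character pairs in s."""
    return {s[i:i + 2] for i in range(len(s) - 1)}

def bigram_similarity(a, b):
    """Dice coefficient over character bigrams, in the range [0, 1]."""
    ba, bb = bigrams(a), bigrams(b)
    if not ba and not bb:
        # Strings shorter than two characters have no bigrams
        return 1.0 if a == b else 0.0
    return 2 * len(ba & bb) / (len(ba) + len(bb))

print(bigram_similarity("night", "nacht"))
```

Two nodes whose labels score above a chosen threshold would be treated as the same node that was updated, rather than as a deletion plus an insertion.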
11. Code Change Summarization
LSdiff (Kim and Notkin, ICSE’09)
Group related changes
Detect potential inconsistencies in a code change
12. Code Change Summarization (cont.)
DeltaDoc (Buse and Weimer, ASE’10)
Symbolic execution: obtain path predicates for each statement in both versions
Identify statements that are added, deleted, or have changed predicates
Summarization
13. Code Change Summarization (cont.)
Multi-document summarization (Rastkar and Murphy, ICSE’13)
Linking evolutionary documents (commit log, issue tracking entries)
Finding the most informative sentences to extract to form a summary
Similarity between a sentence and the title of the enclosing document
Overlap between a sentence and the adjacent document
14. Code Change Summarization (cont.)
Challenges
Evolutionary documents
Linkage might not be found (Bachmann et al., FSE’10; Wu et al., FSE’11)
Human-written documents may be unavailable or uninformative (Buse and Weimer, ASE’10; Tao et al., FSE’12)
Automatically generated documents
Verbosity
Uninteresting changes are identified, e.g., “all types that declared toString() added constructors” (Kim and Notkin, ICSE’09)
[Figures: LSdiff; DeltaDoc]
15. Outline
Program Differencing
Text Differencing
Syntactic differencing
Semantic differencing
Code Change Summarization
Rules and exceptions
Control-flow changes
Evolutionary documentation
Code Change Comprehension
Querying and Filtering
Customization
16. Querying and Filtering
Specifying and detecting meaningful changes (Yu et al., ASE’11)
Normalize the program (user-specified) before differencing
Non-trivial to construct the query
17. Querying and Filtering (cont.)
Filtering non-essential changes (Kawrykow and
Robillard, ICSE’11)
Non-essential changes: rename-induced modifications, local
variable extraction, trivial keyword modification, whitespace
and documentation updates
ChangeDistiller (Fluri et al., TSE’07) + Partial program
analysis (Dagenais and Robillard, ICSE’08)
Goal: improving mining and recommendation accuracy
instead of developers’ comprehension
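One of the non-essential categories listed above, whitespace updates, can be filtered with very little machinery. The sketch below is only an illustration of that single category; it does not use ChangeDistiller or partial program analysis, and the function names are hypothetical.

```python
import re

def normalize(line):
    """Collapse runs of whitespace to one space and trim both ends."""
    return re.sub(r"\s+", " ", line).strip()

def is_whitespace_only_change(old_lines, new_lines):
    """True if the two versions differ, but only in whitespace or blank lines."""
    old_norm = [normalize(l) for l in old_lines if l.strip()]
    new_norm = [normalize(l) for l in new_lines if l.strip()]
    return old_lines != new_lines and old_norm == new_norm

old = ["if (x) {", "  y();", "}"]
new = ["if (x)   {", "", "\ty();", "}"]
print(is_whitespace_only_change(old, new))
```

Rename-induced modifications and local variable extraction need syntactic and type information, which is why the original work builds on tree differencing and type binding resolution rather than on text normalization like this.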
18. Outline
Program Differencing
Text Differencing
Syntactic differencing
Semantic differencing
Code Change Summarization
Rules and exceptions
Control-flow changes
Evolutionary documentation
Querying and Filtering
Meaningful changes
Non-essential changes
Code Change Comprehension
19. Research Directions
Program Differencing
Text Differencing
Syntactic differencing
Semantic differencing
Code Change Summarization
Rules and exceptions
Control-flow changes
Evolutionary documentation
Querying and Filtering
Meaningful changes
Non-essential changes
Source Code Changes
Work-item-based changes?
20. Work-item-based Changes
Multiple work-items in a single code change (e.g., a bug fix +
code cleanup + a new feature)
Very difficult to understand (Tao et al., FSE’12)
JFreeChart revision 1083
Trivial keyword removal
Bug fix
Formatting
21. Work-item-based Change Detection
Multiple work-items in a single code change (e.g., a bug fix +
code cleanup + a new feature)
Very difficult to understand (Tao et al., FSE’12)
Change decomposition
Program slicing (entity dependencies)
Pattern matching (similarities)
A single work-item spreads across multiple code changes
(e.g., 5 changes to finally fix a bug completely)
Change aggregation
Linkage to the same issue
Heuristics like time duration, commit authors, program dependencies, etc.
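The "linkage to the same issue" idea for change aggregation can be sketched as grouping commits by the issue they mention. This is a rough sketch under an assumed convention (issue references written as "#123" in commit messages); the commit data and the function name are hypothetical, and the other heuristics (time, author, program dependencies) are not modeled.

```python
import re
from collections import defaultdict

# Assumed convention: issues are referenced as "#<number>" in commit messages
ISSUE_RE = re.compile(r"#(\d+)")

def aggregate_by_issue(commits):
    """Group commit ids by the issue numbers mentioned in their messages."""
    groups = defaultdict(list)
    for commit_id, message in commits:
        for issue in sorted(set(ISSUE_RE.findall(message))):
            groups[issue].append(commit_id)
    return dict(groups)

commits = [
    ("c1", "Fix null check, see #42"),
    ("c2", "Formatting only"),
    ("c3", "Follow-up fix for #42"),
    ("c4", "Add feature #7"),
]
print(aggregate_by_issue(commits))
```

Commits without any issue reference, such as c2 here, are exactly the cases where the additional heuristics would be needed.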
22. Research Directions
Program Differencing
Text Differencing
Syntax differencing
Semantic differencing
Code Change Summarization
Rules and exceptions
Control-flow changes
Evolutionary documentation
Querying and Filtering
Meaningful changes
Non-essential changes
Code Change Comprehension
Work-item change detection
Change decomposition
Change aggregation
23. Research Directions
Program Differencing
Text Differencing
Syntax differencing
Semantic differencing
Code Change Summarization
Rules and exceptions
Control-flow changes
Evolutionary documentation
Querying and Filtering
Meaningful changes
Non-essential changes
Work-item-specific
changes
Code Change Comprehension
Work-item change detection
Change decomposition
Change aggregation
24. Research Directions
Program Differencing
Text Differencing
Syntax differencing
Semantic differencing
Code Change Summarization
Rules and exceptions
Control-flow changes
Evolutionary documentation
Querying and Filtering
Meaningful changes
Non-essential changes
Work-item-specific
changes
Code Change Comprehension
Concrete Execution
Work-item change detection
Change decomposition
Change aggregation
25. Explaining code changes with executions of co-changed test cases
Test cases
Best documentation for source code
Test cases co-changed with source code
Documentation for code changes?
Mostly synchronous co-evolution of production and test
code (Zaidman et al., Empirical Software Engineering’11)
Differential test executions
Co-changed test cases T
Executing T on the old version P and new version P’
Comparing executions to explain changed behaviors
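The differential test execution idea can be sketched in a few lines: run the same test inputs T against the old version P and the new version P’, and keep the inputs whose outcomes differ. This is a minimal sketch, not the proposed technique itself; the two versions of divide and all function names are hypothetical.

```python
def _outcome(func, args):
    """Capture either the return value or the type of the raised exception."""
    try:
        return ("returned", func(*args))
    except Exception as exc:
        return ("raised", type(exc).__name__)

def differential_run(old_version, new_version, test_inputs):
    """Run both versions on each input; return the inputs whose outcomes differ."""
    differences = []
    for args in test_inputs:
        old_out = _outcome(old_version, args)
        new_out = _outcome(new_version, args)
        if old_out != new_out:
            differences.append((args, old_out, new_out))
    return differences

# Hypothetical old version P
def divide_old(a, b):
    return a / b

# Hypothetical new version P' that guards against division by zero
def divide_new(a, b):
    return 0 if b == 0 else a / b

diffs = differential_run(divide_old, divide_new, [(4, 2), (1, 0)])
print(diffs)
```

Only the input (1, 0) is reported, which points directly at the behavior the change was meant to alter.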
From StackExchange
http://programmers.stackexchange.com/questions/154439/quality-of-code-in-unit-tests?newsletter=1&nlcode=67628%7c1a35
• “Unit tests are one of the best sources of documentation for your system,
and arguably the most reliable form”
• “Unit tests are often the first thing you look at when trying to grasp what
some piece of code does”
• “They can also serve as a starting point for people new to the code base”
26. Research Directions
Program Differencing
Text Differencing
Syntax differencing
Semantic differencing
Code Change Summarization
Rules and exceptions
Control-flow changes
Evolutionary documentation
Querying and Filtering
Meaningful changes
Non-essential changes
Work-item-specific
changes
Code Change Comprehension
Concrete Execution
• Co-changed test cases
• Differential test execution
Work-item change detection
Change decomposition
Change aggregation
Editor’s notes
Software evolves continuously because developers change source code all the time. As a consequence, developers also have to understand these code changes, which I refer to as CCC (code change comprehension) throughout this talk. Last year we conducted an exploratory study at Microsoft, where we surveyed and interviewed developers about their CCC practices. This work was published at FSE. We found, first, that CCC is frequently required: the majority of developers need to understand code changes several times each day.
At this year’s ICSE, Bacchelli and Bird reported similar findings in their empirical study of modern code review: CCC is more common than understanding an entire program, yet it is also the most challenging part of code review.
These findings motivate our work: CCC is challenging, yet fundamental to developers’ daily practice.
In this literature survey, I identify three major categories of work related to CCC.
The first is program differencing. This line of work tries to help developers by describing code changes.
The second is …. Studies in this category go one step further and try to reason about and explain code changes.
The third is querying and filtering. This is a sort of “customized” CCC.
Unix diff is the best-known example in this category, but it is also well known for two major limitations.
Ldiff:
Step 1: diff (longest common subsequence)
Step 2: All possible hunk pairs -> similarity (vector space cosine similarity) -> pick the topmost pairs
Step 3: Line matching -> Levenshtein edit distance -> pairs above a threshold are marked as changed
Step 4: Unmatched lines form new hunks -> iterate from step 2
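The line-matching step can be sketched as follows. This is a rough sketch of that single step only, assuming a similarity normalized by the longer line and a hypothetical threshold of 0.4; Ldiff's actual normalization and threshold are not stated in these notes.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

def line_similarity(a, b):
    """1.0 for identical lines, 0.0 for lines with nothing in common."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest

def classify(old_line, new_line, threshold=0.4):
    """Label a candidate line pair; the threshold value is an assumption."""
    if old_line == new_line:
        return "unchanged"
    if line_similarity(old_line, new_line) >= threshold:
        return "changed"
    return "unmatched"

print(classify("total = total + x;", "total = total + y;"))
```

Pairs labeled "unmatched" would go back into the pool of unmatched lines that forms new hunks for the next iteration.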
Because these techniques treat a program as plain text, they report program differences as changes to characters. From a developer’s point of view, however, the syntactic (structural) information about the source code is lost. This motivates another line of work, which we call “syntactic differencing”.
This line of work uses a structured representation of a program.
One example is ChangeDistiller, which represents a program as an abstract syntax tree and applies a tree differencing algorithm.
In addition to ASTs, some studies represent code in XML, which can also embed …. XML differencing algorithms, such as diffX (Al-Ekram et al., CASCON ’05), can then be applied to compute program differences.
When developers make behavior-preserving modifications, such as switching the order of an if-else, syntactic differencing still reports the differences, although developers may not consider them important changes.
Therefore, the next line of work focuses on semantic differencing of two program versions. Semantic Diff operates at the method level and compares variable dependencies to derive behavioral changes.
In the old version of the method add, if x is not equal to HI, x is added to TOT; otherwise, DEF is added to TOT. From this code we can derive a list of dependencies, for example, …
In the new version, the developer simply wanted to switch the order of the if-else branches but mistakenly used assignment instead of an equality test. Therefore, when the technique computes the variable dependencies and compares them with the previous ones, it reports that …
These behavioral differences are certainly not expected: because x is assigned HI, the initial value of x is always lost. In such cases semantic differencing is clearly better than syntactic differencing, since it can draw developers’ attention to unexpected behavioral changes in the program.
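The add example can be reconstructed roughly as below to make the lost initial value of x visible. This is only a sketch of the situation described in these notes: the original code is not shown here, the values of HI and DEF are invented, and Python's := operator (Python 3.8+) stands in for an accidental assignment inside a condition. It runs the two versions rather than computing variable dependencies.

```python
HI = 100   # hypothetical value
DEF = 10   # hypothetical value

def add_old(x, tot):
    """Old version: add x unless it equals HI, otherwise add the default."""
    if x != HI:
        return tot + x
    else:
        return tot + DEF

def add_new(x, tot):
    """New version: branches swapped, but ':=' was written instead of '=='."""
    if (x := HI):          # overwrites x; the condition is always true here
        return tot + DEF
    else:
        return tot + x

print(add_old(5, 0), add_new(5, 0))
```

In the new version the result no longer depends on the incoming x at all, which is the kind of dependency change a semantic differencing tool would flag.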
Another work, JDiff, published in …, addresses semantic differencing for object-oriented programs.
With syntactic differencing alone, we would only know that m1 is added, and …. But developers may be more interested in how the behavior of the program has changed.
If the dynamic type of a is B, the call a.m1 in the new version actually invokes m1 in B.
The exception thrown will be caught by a different catch block after the change.
JDiff extends the CFG to combine … The ECFG accounts for dynamic binding and exception handling in the previous example, and a graph differencing algorithm can be applied to reveal the difference.
Some studies also use symbolic execution to characterize a program’s behavior. This technique … instead of actual values. For example, a symbolic execution of this code fragment reads: if this condition is satisfied, return; otherwise, if …, return …
XXX proposed differential symbolic execution, which compares the symbolic executions of two program versions. The output looks like this: the conditions under which the two versions produce different results.
I have now covered three categories of program differencing. This work basically tries to help CCC by describing what the code change is. The next line of work, which I call “CCS” (code change summarization), goes a step further and tries to explain code changes.
A program is represented as a set of predicates, called “facts”, that describe code elements, containment relationships, and structural dependencies. LSdiff then computes the changed facts between two program versions.
It infers rules from the list of changed facts.
It also infers exceptions to the rules. Example: the start methods of all subtypes of Car added calls to the Key.chk method, except for the subtype Kia.
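The rule-with-exceptions idea can be illustrated with plain sets. This is a simplified sketch, not LSdiff's logic-rule inference: the subtype names other than Kia, the accuracy threshold, and the function name are all assumptions made for the example.

```python
def infer_rule(candidates, matches, min_accuracy=0.75):
    """Return (accuracy, exceptions) if enough candidates match, else None."""
    if not candidates:
        return None
    accuracy = len(candidates & matches) / len(candidates)
    if accuracy < min_accuracy:
        return None
    exceptions = sorted(candidates - matches)
    return accuracy, exceptions

# Hypothetical subtypes of Car (only Kia is named in the example above)
car_subtypes = {"BMW", "GM", "Kia", "Toyota"}
# Subtypes whose start method added a call to Key.chk
added_chk_call = {"BMW", "GM", "Toyota"}

print(infer_rule(car_subtypes, added_chk_call))
```

The rule is kept because three of the four subtypes follow it, and Kia is reported as the exception, which may point at a missed update.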
Finally, DeltaDoc uses transformation heuristics to summarize the differences between these statements as human-readable documentation.
The studies we have seen so far all extract information from the source code itself. However, other software artifacts, such as commit logs, can also help in understanding code changes, since we may find useful natural-language sentences related to the code changes in them. Motivated by this observation, … proposed …
Each sentence has some features, for example, its similarity to the title of the enclosing document and its overlap with the adjacent document. To locate the most informative or relevant sentences, the sentences are ranked by their feature values.
Here is an example of their output. For this change, the summary contains a list of relevant sentences extracted from its evolutionary documents.
The first major challenge of using evolutionary documents is that links between these documents might not exist, so we may not even be able to find the documents relevant to a code change. This problem is known as the “missing link” problem and has been studied recently.
In addition, documents may not … In such cases we cannot rely on them to extract informative change summaries.
As for the automatically generated documentation I introduced before, the biggest problem is verbosity. These are the rules and exceptions generated by LSdiff to describe a code change. This is the number of lines in the change documentation: compared with the human-written commit log, shown as the black bar, the documentation generated by DeltaDoc is still very long.
Another challenge is that some uninteresting changes are identified automatically. For example, a rule reported by LSdiff says …, and participants in the user study complained that such a rule is not useful.
Therefore, some studies customize CCC so that developers can query the changes they are interested in and filter out irrelevant ones.
Non-essential changes include …, which are less likely to be of interest to developers.
They use ChangeDistiller to detect changes and apply PPA (partial program analysis) to resolve type bindings for partial programs (i.e., code changes).
However, the goal of this work is to …
In general, studies in this category focus on querying meaningful changes and filtering out non-essential changes.