ERA - Measuring Disruption from Software Evolution Activities Using Graph-Based Metrics
1. Measuring Disruption from Software Evolution Activities Using Graph-Based Metrics
Prashant Paymal, Rajvardhan Patil, Sanjukta Bhowmick, Harvey Siy
Department of Computer Science,
University of Nebraska at Omaha
2. Introduction
• Real-world software systems have large numbers
of components (e.g., classes and functions)
• It is difficult to get a quick summary of how a
system evolved after a major change, such as a
perfective maintenance activity or a new software
release
3. Case Study
Version  Date        Commit Message
V1       3/9/2001    Merge to JHotDraw 5.2 (using JFC/Swing GUI components)
V2       10/24/2001  Before merge for version 5.3 (dnd, undo…) merge dnd (before 5.3)
V3       8/4/2002    After various merges… (before 5.4 release)
V4       11/8/2002   Refactor to use StandardStorageFormat as a superclass
V5       5/8/2003    Refactoring of Cursor. – java.awt.Cursor(class) has been systematically replaced
V6       1/9/2004    After renaming the CH.ifa.draw to org.jhotdraw
• Our case study consists of six versions of JHotDraw from
March 2001 to January 2004
4. Network Construction
• Extracted relationships from these versions
(inheritance, implementation, method calls and class member
access, object declaration and instantiation)
• Network was created by connecting class
dependencies, where each edge (u, v) is a
dependency from class 'u' to class 'v'
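The construction above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slides do not say how the relationships were stored, so the `(source, target, kind)` tuple format, the placeholder class names, and the choice to collapse repeated dependencies into a single edge are all hypothetical.

```python
# Hypothetical extracted relationships: (source class, target class, kind).
# The names below are placeholders, not data from the JHotDraw study.
relationships = [
    ("ClassA", "ClassB", "inheritance"),
    ("ClassB", "ClassC", "implementation"),
    ("ClassD", "ClassC", "method call"),
    ("ClassD", "ClassC", "object declaration"),  # same pair, different kind
    ("ClassE", "ClassD", "method call"),
]

def build_network(rels):
    """Return adjacency sets: network[u] holds every v with an edge (u, v)."""
    network = {}
    for u, v, _kind in rels:
        network.setdefault(u, set())
        network.setdefault(v, set())   # keep classes that have no outgoing edges
        if u != v:                     # assumption: ignore self-dependencies
            network[u].add(v)          # a set collapses repeated (u, v) pairs
    return network

network = build_network(relationships)
edges = sorted((u, v) for u, targets in network.items() for v in targets)
print(len(network), "classes,", len(edges), "dependency edges")
```

Keeping the relationship kind in the input tuple but dropping it in the network mirrors the slide's description of a single class-dependency edge; a variant could keep the kind as an edge label instead.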
5. Vertex Properties
• Degree Distribution
▫ Number of vertices at each degree; scale-free
for most real-world networks
• Clustering Coefficient
▫ How densely a vertex's neighbors are connected
to each other
• Betweenness Centrality
▫ Fraction of shortest paths that pass through a
vertex
• Articulation Points
▫ Vertices whose removal would cause the network to
become disconnected
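The four properties listed above can be computed with the standard library alone. The sketch below is not the authors' tooling: it assumes the dependency network is treated as undirected for these measures, uses a made-up five-vertex graph, reports betweenness as an unnormalized count of vertex pairs (computed with Brandes' algorithm), and finds articulation points by brute-force removal, which is simple but slow on large networks.

```python
from collections import Counter, deque

# Toy undirected graph as adjacency sets (made-up, not from the case study).
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}

def degree_distribution(g):
    """Map each degree to the number of vertices that have it."""
    return Counter(len(nbrs) for nbrs in g.values())

def clustering_coefficient(g, v):
    """Fraction of pairs of v's neighbors that are themselves connected."""
    nbrs = list(g[v])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for i in range(k) for j in range(i + 1, k)
                if nbrs[j] in g[nbrs[i]])
    return 2.0 * links / (k * (k - 1))

def betweenness_centrality(g):
    """Brandes' algorithm for unweighted graphs; unnormalized pair counts."""
    bc = dict.fromkeys(g, 0.0)
    for s in g:
        order, preds = [], {v: [] for v in g}
        sigma = dict.fromkeys(g, 0)      # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:                     # breadth-first search from s
            v = queue.popleft()
            order.append(v)
            for w in g[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(g, 0.0)
        for w in reversed(order):        # accumulate dependencies back toward s
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return {v: score / 2 for v, score in bc.items()}  # each pair was counted twice

def count_components(g, skip=None):
    """Number of connected components, optionally ignoring one vertex."""
    seen, count = set(), 0
    for start in g:
        if start == skip or start in seen:
            continue
        count += 1
        seen.add(start)
        stack = [start]
        while stack:
            v = stack.pop()
            for w in g[v]:
                if w != skip and w not in seen:
                    seen.add(w)
                    stack.append(w)
    return count

def articulation_points(g):
    """Vertices whose removal increases the number of components."""
    base = count_components(g)
    return {v for v in g if count_components(g, skip=v) > base}

print(degree_distribution(graph))
print({v: round(clustering_coefficient(graph, v), 3) for v in graph})
print(betweenness_centrality(graph))
print(sorted(articulation_points(graph)))
```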
6. Network Representing Version 1
• Lighter nodes: high betweenness centrality
• Larger nodes: high clustering coefficient
7. Objective
• Extract key combinatorial properties from these
six networks that would enable us to detect
evolutionary characteristics such as
▫ Points of significant change in the software
▫ How these changes affect crucial classes in the
network
8. Change in Vertex Properties
• All properties increased with version number
9. Correlation Between Properties
▫ Positive correlation between degree and betweenness centrality
▫ The correlation between clustering coefficient and betweenness
centrality varies across versions
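As an illustration of how such a correlation could be measured, the sketch below computes the Pearson coefficient between two per-vertex property lists. The slides do not state which correlation measure was used, so Pearson is an assumption here, and the sample values are made up for the example.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return float("nan")  # undefined when one property is constant
    return cov / (spread_x * spread_y)

# Made-up per-vertex values, listed in the same vertex order.
degree = [2, 2, 3, 2, 1]
betweenness = [0.0, 0.0, 4.0, 3.0, 0.0]

r = pearson(degree, betweenness)
print(f"degree vs. betweenness: r = {r:.3f}")
```

Running the same function once per version, for each pair of properties, would give one coefficient per version that can then be compared across versions.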
10. Disruption in Values and Rank
• We examine how the relationships between
these properties changed from one version to the
next
13. Identifying Crucial Vertices
• High
▫ If the vertex ranks high (within the top 25) in at least one of the
following categories: betweenness centrality, indegree, outdegree,
clustering coefficient / articulation point
• Extra High
▫ If the vertex ranks high in at least two categories
• Low
▫ If the vertex has a zero value for any one of the vertex-based
properties and is not marked as a High vertex
• Extra Low
▫ If the vertex has a zero value for both betweenness centrality and
clustering coefficient
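The labeling rules above can be sketched as a small classifier. Several details are assumptions made for this illustration, not statements from the slides: the combined clustering coefficient / articulation point category is satisfied by either condition, only nonzero values can rank high, the labels are checked in the order Extra High, High, Extra Low, Low, and vertices that match no rule get a hypothetical "Unlabeled" tag. The property values are made up, and the demo uses a cutoff of 1 instead of 25 so that the tiny example produces every label.

```python
TOP_N = 25  # "within top 25" on the slide

def top_ranked(values, n=TOP_N):
    """Vertices holding the n largest nonzero values of one property."""
    nonzero = [v for v in values if values[v] > 0]
    return set(sorted(nonzero, key=values.get, reverse=True)[:n])

def classify(props, articulation, n=TOP_N):
    """props maps a property name to {vertex: value}; returns {vertex: label}."""
    high_sets = [
        top_ranked(props["betweenness"], n),
        top_ranked(props["indegree"], n),
        top_ranked(props["outdegree"], n),
        # Assumption: one combined category for clustering / articulation point.
        top_ranked(props["clustering"], n) | set(articulation),
    ]
    labels = {}
    for v in props["betweenness"]:
        hits = sum(v in s for s in high_sets)
        if hits >= 2:
            labels[v] = "Extra High"
        elif hits == 1:
            labels[v] = "High"
        elif props["betweenness"][v] == 0 and props["clustering"][v] == 0:
            labels[v] = "Extra Low"
        elif any(values[v] == 0 for values in props.values()):
            labels[v] = "Low"
        else:
            labels[v] = "Unlabeled"  # hypothetical tag for everything else
    return labels

# Made-up property values for seven vertices.
props = {
    "betweenness": {"a": 0, "b": 0, "c": 4, "d": 3, "e": 0, "f": 1, "g": 1},
    "indegree":    {"a": 1, "b": 2, "c": 3, "d": 1, "e": 0, "f": 1, "g": 1},
    "outdegree":   {"a": 2, "b": 1, "c": 0, "d": 1, "e": 1, "f": 0, "g": 1},
    "clustering":  {"a": 0.5, "b": 1.0, "c": 0.3, "d": 0.0, "e": 0.0,
                    "f": 0.2, "g": 0.2},
}
labels = classify(props, articulation={"c", "d"}, n=1)
for vertex, label in sorted(labels.items()):
    print(vertex, label)
```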
17. Bug Frequencies
• Bug-fix changes are those with the keywords “bug fix” in the change log
• The periods with a high percentage of bug-fix changes are also the
periods that follow high disruption
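A minimal sketch of the keyword count described above is shown below. It assumes the percentage is taken over all change-log entries in a period and that matching is case-insensitive; neither detail is stated on the slide. The period names and log messages are invented for the example.

```python
def bug_fix_percentage(messages, keyword="bug fix"):
    """Percentage of change-log messages that contain the keyword."""
    if not messages:
        return 0.0
    hits = sum(keyword in message.lower() for message in messages)
    return 100.0 * hits / len(messages)

# Made-up change-log entries grouped by period (not data from the study).
change_log = {
    "period 1": ["Bug fix for redraw", "Add new tool", "Refactor storage",
                 "bug fix: null check"],
    "period 2": ["Update documentation", "Add undo support"],
}

percentages = {period: bug_fix_percentage(msgs)
               for period, msgs in change_log.items()}
print(percentages)
```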
18. Conclusion
• The significant evolutionary changes occur between
Version 2 and Version 3 and between Version 4 and Version 5
• The network has grown cumulatively. Newer vertices
tend to get added to the periphery of the network
• The top 25 ranking of vertices was generally stable
across versions. Important nodes stay important. This
indicates stability in the design.
• The bug frequency is higher after Version 3 and Version
5. The degree of disruption can help explain why bug
incidence increases (future work)