On the Relevance of Code Anomalies for Identifying Architecture Degradation Symptoms
1. On the Relevance of Code Anomalies for Identifying Architecture Degradation Symptoms
Isela Macía¹, Roberta Arcoverde¹, Alessandro Garcia¹, Christina Chavez², Arndt von Staa¹
¹ Pontifical Catholic University of Rio de Janeiro – PUC-Rio, Brazil
² Federal University of Bahia – UFBA, Brazil
LES | DI | PUC-Rio - Brazil
OPUS Group
2. Code Anomalies
Code Anomaly
“A code smell is a surface indication that usually corresponds to a deeper problem in the system.”
Martin Fowler, 1999
Roberta @ OPUS Group 2
3. Architectural Anomalies
[Figure: three subsystems. ModuleA holds Concern A1, Concern A2, and Concern C1; ModuleB holds Concern B1, Concern B2, and Concern C2; ModuleC holds Concern C.]
Scattered Functionality
4. Relevance of Code Anomalies
public class HWFacade {
    public void updateComplaint(..){..}
    public Complaint searchComplaint(..){..}
    public void insertComplaint(..){..}
    public void insertEmployee(..){..}
    public Employee searchEmployee(..){..}
    public void updateEmployee(..){..}
    public void insertSymptom(..){..}
    public Symptom searchSymptom(..){..}
    public void updateSymptom(..){..}
    ...
}
[Diagram labels: GUI <<subsystem>>, Business <<subsystem>>; elements Employee, Symptom, Complaint, HWFacade]
5. Relevance of Code Anomalies
public class ComplaintRepo {
    ...
    public int insert(..){..}
    public void update(..){..}
    public int getIndex(..){..}
    public boolean exists(..){..}
    public Complaint search(..){..}
    public void reset(..){..}
    public Object next(..){..}
    public void remove(..){..}
    public List getList(..){..}
    public boolean hasNext(..){..}
    public void updateTimestamp(..){..}
    public int searchTimestamp(..){..}
    ...
}
[Diagram labels: DATA <<subsystem>>; elements EmployeeArray, ComplaintRepo, Repository, ArrayRepository, Factory]
6. Previous Research on Code Anomaly Impact
Khomh et al. WCRE '09
Anomalous code elements tend to be changed more frequently than anomaly-free elements
7. Previous Research on Code Anomaly Impact
D'Ambros et al. QSIC '10
There are no code anomalies that
can be considered more harmful with
respect to software defects
9. Architecture Degradation
Major software engineering problem
May hinder system evolution
Early identification could help avoid it
But can it be identified from code anomalies?
10. Three Questions
1 Are anomalous code elements related to architecture problems?
2 If so, which characteristics of the code anomaly are relevant for the architecture design?
3 To what extent did the applied refactorings actually address architecturally-relevant code anomalies?
11. Target Systems
System          MIDAS     MM             HW             PDP
Language        C++       Java/AspectJ   Java/AspectJ   C#
Size            76 KLOC   54 KLOC        49 KLOC        22 KLOC
Code anomalies  111       170            252            175
6 different systems
40 revisions
Architecture information available
12. Study Phases
1. Data Collection
2. Analysis of the impact of code anomalies on identified architecture problems
3. Refactoring extraction
4. Analysis of refactoring on identified architecture problems
13. Data Collection
Recovering Actual Architecture
Identifying Architecture Problems
Detecting Code Anomalies
Analyzing the Impact of Code Anomalies
[Diagram labels: DATA, BUSINESS, GUI]
17. Analyzing the Impact of Code Anomalies I
Null hypothesis: There is no relation between
code anomalies and architecture problems
Fisher’s exact test
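As a rough illustration of how Fisher's exact test evaluates the null hypothesis above, the sketch below computes a two-sided p-value for a 2x2 contingency table. It assumes, purely for illustration, that each analyzed version yields a table of code elements split by "has a code anomaly" and "is involved in an architecture problem"; the class name `FisherExactSketch`, its method names, and the counts in `main` are hypothetical and are not taken from the study.

```java
class FisherExactSketch {
    // log(n!) computed by summing logarithms; adequate for the small counts used here.
    static double logFactorial(int n) {
        double sum = 0.0;
        for (int i = 2; i <= n; i++) {
            sum += Math.log(i);
        }
        return sum;
    }

    // Hypergeometric probability of the table [[a, b], [c, d]] given its fixed margins.
    static double tableProbability(int a, int b, int c, int d) {
        int n = a + b + c + d;
        double logP = logFactorial(a + b) + logFactorial(c + d)
                + logFactorial(a + c) + logFactorial(b + d)
                - logFactorial(n) - logFactorial(a) - logFactorial(b)
                - logFactorial(c) - logFactorial(d);
        return Math.exp(logP);
    }

    // Two-sided p-value: total probability of every table with the same margins
    // that is no more likely than the observed table.
    static double twoSidedPValue(int a, int b, int c, int d) {
        double observed = tableProbability(a, b, c, d);
        int row1 = a + b;
        int col1 = a + c;
        int n = a + b + c + d;
        int lo = Math.max(0, row1 + col1 - n);
        int hi = Math.min(row1, col1);
        double p = 0.0;
        for (int x = lo; x <= hi; x++) {
            double px = tableProbability(x, row1 - x, col1 - x, n - row1 - col1 + x);
            // Small relative tolerance so that ties survive floating-point rounding.
            if (px <= observed * (1 + 1e-7)) {
                p += px;
            }
        }
        return Math.min(1.0, p);
    }

    public static void main(String[] args) {
        // Hypothetical counts for one version (not data from the study):
        // rows = anomalous / non-anomalous elements,
        // columns = involved / not involved in an architecture problem.
        double p = twoSidedPValue(8, 2, 1, 5);
        System.out.println("two-sided p-value = " + p);
        System.out.println("reject null hypothesis at 0.05: " + (p < 0.05));
    }
}
```

Under this reading, a version counts as one where code anomalies and architecture problems are related when its p-value falls below the chosen significance level; the 0.05 level in `main` is an assumption.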
19. Analyzing the Impact of Code Anomalies I
Code anomalies and architecture problems were related in
77.5%
of the analyzed versions
20. Analyzing the Impact of Code Anomalies II
Downstream Analysis
Which architecture problems were caused by code anomalies?
21. Analyzing the Impact of Code Anomalies II
Downstream Analysis
[Bar chart: one bar per system (HW, MM, PDP, MIDAS) on a 0 to 100 scale, split into architecture problems "Caused by Code Anomalies" and "Not Caused by Code Anomalies"]
22. Analyzing the Impact of Code Anomalies II
Upstream Analysis
Which code anomalies caused architecture problems?
23. Analyzing the Impact of Code Anomalies II
Upstream Analysis
[Bar chart: one bar per system (HW, MM, PDP, MIDAS) on a 0 to 100 scale, split into "Relevant" and "Irrelevant" code anomalies]
25. Identifying Relevant Code Anomalies
Code anomalies were divided by
Type of code anomaly
Earliness of anomaly
27. Type of Code Anomaly
Number of releases in which each type of anomaly was statistically significant (causing architecture problems)
28. Earliness of Anomaly
Early anomaly: appears in the 1st version of each system
18% of all architecturally-relevant code anomalies were identified as early anomalies
and were related to more than 37% of all architecture problems
30. Refactoring of Relevant Anomalies
We wanted to analyze whether architecturally-
relevant anomalies were often refactored
Detecting refactorings from source code history
Commit messages
Source code diffs (manually inspected)
Checking whether the refactored anomaly was
architecturally-relevant
31. Refactoring of Relevant Anomalies
658 refactorings
33% high-level
Move member (16%)
Extract class or superclass (12%)
67% low-level
Rename (32%)
Extract local variable (16%)
37% of all architecture-relevant anomalies were
refactored
Isolated versions concentrated most of the refactoring
efforts
37. Cause-Effect Criteria
1 Recurrently inferred in all system versions
2 Observed in different modules of the same system
3 Modules involved contributions from different developers
Isela Macia et al – AOSD 2011: An Exploratory Study of Code Smells in Aspect-Oriented Systems.
Isela Macia et al – AOSD 2012: Are Automatically-detected Code Anomalies relevant to Architectural Modularity?
The metaphor of the code anomaly, or bad smell, was coined by Fowler and Beck to describe a program structure that usually indicates a deeper problem in the system.
However, code anomalies are particularly severe when they introduce architecture problems. Examples of these problems are architectural anomalies: compositions of architecture elements that hinder system maintainability. Several examples of architectural anomalies are documented in the literature.
We would also like to define architecturally-relevant code anomalies: all code anomalies that cause or are related to architecture problems. This God Class is one example. We took it from a real-world application that we analyzed in our study.
Many studies have been dedicated to analyzing the impact of code anomalies. Khomh et al. [17] investigate the impact of code anomalies on system changes. They found that anomalous code elements tend to change more frequently than anomaly-free elements.
Other works investigate the impact of automatically-detected code anomalies on software defects (i.e., the need for corrective maintenance). For instance, D'Ambros et al. found that, while some code anomalies are more frequent, none of them can be considered more harmful with respect to software defects.
This phase was based on a semi-automatic process. We used Sonar [43] and Understand [47] to support the recovery of the actual architecture from the source code. These tools support architecture and code analyses that help developers analyze and measure the modularity of the system's architecture and implementation.
Developers and architects collaborated to provide explicit mappings between the actual, extracted architecture (EA) and the intended architecture (IA). These mappings were used by the Reflexion Model-based tools to measure conformance in terms of convergence (a component or relationship that is in both EA and IA), divergence (a component or relationship that is in EA but not in IA), and absence (a component or relationship that is in IA but not in EA). For instance, all absence classifications were considered violations.
Because of the lack of tools, architectural anomalies were detected by architects based mainly on: (i) a visual inspection of the EA, and (ii) a careful analysis of the code-level elements mapped to architectural-level elements. We also asked the original architects to indicate other anomalies observed in the architecture design beyond those presented in Table 2. This helped us to better judge whether and which code anomalies are good indicators of architectural modularity problems.
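The convergence, divergence, and absence classification described above amounts to three set operations over the relationships of the EA and the IA. The following is a minimal sketch under that assumption; the class `ReflexionSketch`, the string encoding of relationships ("source->target"), and the sample layers are hypothetical and do not reproduce the Reflexion Model-based tools used in the study.

```java
import java.util.Set;
import java.util.TreeSet;

class ReflexionSketch {
    // Convergence: relationships present in both the extracted and the intended architecture.
    static Set<String> convergence(Set<String> extracted, Set<String> intended) {
        Set<String> result = new TreeSet<>(extracted);
        result.retainAll(intended);
        return result;
    }

    // Divergence: relationships found in the extracted architecture but not intended.
    static Set<String> divergence(Set<String> extracted, Set<String> intended) {
        Set<String> result = new TreeSet<>(extracted);
        result.removeAll(intended);
        return result;
    }

    // Absence: relationships that were intended but not found in the extracted architecture.
    static Set<String> absence(Set<String> extracted, Set<String> intended) {
        Set<String> result = new TreeSet<>(intended);
        result.removeAll(extracted);
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical layered architecture, encoded as "source->target" dependencies.
        Set<String> intended = new TreeSet<>(Set.of("GUI->BUSINESS", "BUSINESS->DATA"));
        Set<String> extracted = new TreeSet<>(Set.of("GUI->BUSINESS", "GUI->DATA"));

        System.out.println("convergence: " + convergence(extracted, intended));
        System.out.println("divergence:  " + divergence(extracted, intended));
        System.out.println("absence:     " + absence(extracted, intended));
    }
}
```

In this made-up example the GUI layer bypasses BUSINESS to reach DATA, which shows up as a divergence, while the missing BUSINESS to DATA dependency shows up as an absence.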
For detecting the code anomalies, we used different tools based on detection strategies. For example, for the Java projects, Together and Understand were used to identify code anomalies; PDP, on the other hand, is a C# project, so we used NDepend to analyze it. In this illustration, the red crosses mark detected code anomalies.
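A detection strategy is, roughly, a rule that combines thresholds over several code metrics. The sketch below shows what such a rule could look like for a God Class candidate, using three well-known metrics: WMC (weighted methods per class), ATFD (access to foreign data), and TCC (tight class cohesion). The class `GodClassStrategySketch` and the threshold values are illustrative assumptions; they are not the rules or thresholds configured in Together, Understand, or NDepend.

```java
class GodClassStrategySketch {
    // Illustrative thresholds only (assumed for this sketch, not taken from the study's tools).
    static final int WMC_VERY_HIGH = 47;
    static final int ATFD_FEW = 5;
    static final double TCC_ONE_THIRD = 1.0 / 3.0;

    // A class is flagged when it is complex, uses much foreign data, and has low cohesion.
    static boolean isGodClassCandidate(int wmc, int atfd, double tcc) {
        boolean highComplexity = wmc >= WMC_VERY_HIGH;
        boolean usesForeignData = atfd > ATFD_FEW;
        boolean lowCohesion = tcc < TCC_ONE_THIRD;
        return highComplexity && usesForeignData && lowCohesion;
    }

    public static void main(String[] args) {
        // Hypothetical metric values, not measurements from the analyzed systems.
        System.out.println("large facade-like class: " + isGodClassCandidate(60, 12, 0.10));
        System.out.println("small cohesive class:    " + isGodClassCandidate(20, 1, 0.80));
    }
}
```

A rule of this kind looks only at code-level metrics, so on its own it cannot tell whether a flagged class is architecturally relevant.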
Furthermore, our study provided some findings that can help developers to build more effective tools for identifying more severe code smells. For instance, some architecturally-relevant code smell occurrences cannot be detected and prioritized if architectural decisions are not somehow traced and mapped to the source code, and used by code-level smell detection tools.
Finally, our results suggest that mechanisms for detecting architecturally-relevant code anomalies should also analyze the relationship between code anomalies and their impact on the architecture design; that is, they should look for patterns of code anomalies rather than rely solely on individual code anomalies. We also observed that certain recurring patterns of co-occurring code anomalies, and the propagation of code anomalies from parents to children in inheritance trees, tend to be stronger indicators of architecture problems than individual anomaly occurrences.