2. Overview
Understanding object-oriented
distributed software applications
by reverse engineering
the source code,
focusing on the distribution-related
aspects of the system,
using a structural, technology-aware
analysis approach
3. Distributed Software
The distributed aspect is crucial for understanding
- systems are specifically built for distributed problems
- technology dependence: communication infrastructure
Making the distributed aspect central makes the analysis easier
- without ignoring the local functionality concerns
6. Model
Augments an OO meta-model (Memoria):
makes the distributed aspect a main concept
distributable feature -- feature directly involved in
the distributed functionality, either by providing remote services, or
by directly using such services
frontier classes -- act at the frontier between the system
and the communication infrastructure (“communication mediator”)
7. System - Mediator
[Figure: Model Overview diagram. Labeled elements: Mediator, Frontier Classes, Distributable Feature Cores, Core Classes, Acquaintance Classes, Local Feature, other Classes.]
9. Approach
0: Initial graph of classes
- vertex: a class
- edge: method call / attribute access / inheritance relation
1: Build the dependency graph of distributable features (DGDF)
2: Separate distinct cores of distributable features
3: Capture coarse-grained architecture
4: Assess impact of distributable features
5: Support for restructuring
[Figure: approach diagram. Labels: Mediator, remote call, frontier class, core class, utility class, core of distributable features, extracted distr. feat., new feature.]
12. Goals
Find the core entities involved in the distributed
functionality
Get an overview of the distributed architecture
13. Build a Core Graph
Identify the Frontier - technology dependent rules
Start with the Frontier Classes:
the best starting points describing involvement in distribution
Incrementally add new vertices and edges
until a configurable depth of
search is reached
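The incremental expansion above can be sketched as a depth-limited breadth-first search. This is only an illustrative sketch: the function name `build_core_graph`, the dictionary-based graph encoding, and the choice to follow dependencies in both directions are assumptions, not details given in the slides.

```python
from collections import deque


def build_core_graph(graph, frontier, max_depth):
    """Expand outward from the frontier classes up to max_depth steps.

    graph: dict mapping a class name to the set of classes it depends on.
    Dependencies are followed in both directions (an assumption).
    Returns (vertices, edges); each edge is a frozenset of two class names.
    """
    # Build a symmetric adjacency map from the directed dependencies.
    neighbors = {}
    for src, dsts in graph.items():
        neighbors.setdefault(src, set())
        for dst in dsts:
            neighbors[src].add(dst)
            neighbors.setdefault(dst, set()).add(src)

    depth = {cls: 0 for cls in frontier}
    edges = set()
    queue = deque(frontier)
    while queue:
        cls = queue.popleft()
        if depth[cls] == max_depth:
            continue  # configurable search depth reached, do not expand
        for nxt in neighbors.get(cls, ()):
            edges.add(frozenset((cls, nxt)))
            if nxt not in depth:
                depth[nxt] = depth[cls] + 1
                queue.append(nxt)
    return set(depth), edges


# Hypothetical class names, for illustration only.
example = {
    "RemoteStub": {"Service"},
    "Service": {"Repo"},
    "Repo": {"Util"},
    "Gui": {"Util"},
}
vertices, edges = build_core_graph(example, ["RemoteStub"], max_depth=2)
```

With a depth of 2, the sketch reaches `Service` and `Repo` but stops before `Util`, so classes far from the frontier stay out of the core graph.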
14. Identify the Distributable
Feature Cores
Detect and remove edges that connect
loosely coupled sets of classes
- technology-aware and cohesion-based
heuristics
The resulting connected components:
candidate DF cores
Identify the remote communication channels
The engineer reviews the result
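A minimal sketch of the core-separation step follows. The slides do not spell out the technology-aware and cohesion-based heuristics, so a plain edge-weight threshold stands in for them here; `candidate_cores`, `min_weight`, and the weighted-edge encoding are all hypothetical.

```python
def candidate_cores(vertices, weighted_edges, min_weight):
    """Drop weak edges, then return the connected components.

    weighted_edges: dict mapping (a, b) to the number of dependencies
    between classes a and b. Edges lighter than min_weight are treated as
    connecting loosely coupled sets of classes (a stand-in heuristic).
    """
    adjacency = {v: set() for v in vertices}
    for (a, b), weight in weighted_edges.items():
        if weight >= min_weight:
            adjacency[a].add(b)
            adjacency[b].add(a)

    seen, cores = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        # Depth-first traversal collects one connected component.
        component, stack = set(), [start]
        while stack:
            v = stack.pop()
            if v in component:
                continue
            component.add(v)
            stack.extend(adjacency[v] - component)
        seen |= component
        cores.append(component)
    return cores


# Hypothetical classes: the weak B-C edge is removed, leaving two cores.
cores = candidate_cores(
    {"A", "B", "C", "D"},
    {("A", "B"): 5, ("B", "C"): 1, ("C", "D"): 4},
    min_weight=2,
)
```

The components returned are only candidates; as the slide notes, the engineer still reviews the result.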
18. Goals
focus on the rest of the classes in the system (the
majority)
evaluate their involvement in providing the
Distributable Features
identify the classes that follow the
main patterns of involvement
make system-level and class-level characterizations
19. Class involvement
Set of coupling-based metrics
The collaboration of a class with the entire system
Total bidirectional coupling (TBC)
Involvement in providing a particular DF
Acquaintance with a Distributable Feature (ADF)
Involvement in providing all DFs = involvement in distribution
Total Acquaintance with Distributable Features (TADF)
System-wide “distributed awareness”
Average Total Acquaintance with Distributable Features (Average TADF)
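The metrics above might be computed along the following lines. The slides name the metrics but do not give formulas, so the definitions below are assumptions: coupling is counted as the number of dependencies in either direction, ADF counts coupling with the core classes of one DF, and Average TADF is a plain mean over the classes passed in.

```python
def tbc(cls, coupling):
    """Total bidirectional coupling: all dependencies touching cls."""
    return sum(w for (a, b), w in coupling.items() if cls in (a, b))


def adf(cls, core, coupling):
    """Acquaintance with one DF: coupling between cls and that DF's core."""
    return sum(
        w
        for (a, b), w in coupling.items()
        if (a == cls and b in core) or (b == cls and a in core)
    )


def tadf(cls, cores, coupling):
    """Total acquaintance: ADF summed over all DF cores."""
    return sum(adf(cls, core, coupling) for core in cores)


def average_tadf(classes, cores, coupling):
    """System-wide mean of TADF over the given classes."""
    return sum(tadf(c, cores, coupling) for c in classes) / len(classes)


# Hypothetical data: coupling maps (source, target) to a dependency count.
coupling = {
    ("Editor", "Engine"): 3,
    ("Engine", "Editor"): 1,
    ("Editor", "Parser"): 6,
    ("Engine", "Stub"): 2,
}
cores = [{"Engine", "Stub"}]
editor_ratio = tadf("Editor", cores, coupling) / tbc("Editor", coupling)
```

In this toy data `Editor` has TBC = 10 and TADF = 4, so 40% of its collaboration is distribution-related.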
20. Visualization
The Feature Affiliation Perspective
[Figure: Feature Affiliation Perspective. Axis labels: Total Collaboration Intensity, Total Collaboration Dispersion, Intensity of Feature Acquaintance, Dispersion of Feature Acquaintance.]
gray: total collaboration
color: distribution-related collaboration
- intensity: no. of collaborations
- dispersion: no. of collaborators
22. Patterns of Involvement
How does a class participate in providing the DFs
- The main patterns of involvement were detected (Patterns of
acquaintance)
- Define and use a set of detection strategies [Marinescu04] to
detect the classes following a certain pattern
- Put the visualization to use: see the interesting classes
23. Pattern I.
Significant Feature Acquaintance (Big Color Box)
Total coupling with distributable features is high
- TADF ≥ HIGH
AND
Class is mostly coupled with distributable features
- TADF / TBC ≥ AVERAGE
=> Significant Acquaintance of Distributable Features
Class has significant involvement with the distributed functionality
24. Pattern II.
Local Feature Contributor (Big Gray)
Class is strongly coupled with the other classes in the system
- TBC ≥ HIGH
AND
Class has (almost) no relation with the distributable features
- TADF / TBC ≤ LOW
=> Local Feature Contributor
Class has significant involvement with local (non-distributed) functionality
25. Pattern III.
Connector Class (Color-Spotted Gray)
Class has significant coupling with the distributable features
- TADF ≥ AVERAGE
AND
Class has significant coupling with other classes in the system
- LOW < TADF / TBC ≤ AVERAGE
=> Connector Class
Class connects a local feature with a distributed one
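The three detection strategies can be sketched as one classification function. The slides do not give numeric values for HIGH, AVERAGE, and LOW, so every threshold below is a hypothetical placeholder, as are the names `Thresholds` and `classify`.

```python
from typing import NamedTuple, Optional


class Thresholds(NamedTuple):
    # All values are hypothetical; the slides give no concrete numbers.
    tadf_high: float = 20
    tadf_average: float = 10
    tbc_high: float = 30
    ratio_average: float = 0.5
    ratio_low: float = 0.1


def classify(tadf: float, tbc: float, t: Thresholds = Thresholds()) -> Optional[str]:
    """Return the pattern of involvement a class follows, or None."""
    ratio = tadf / tbc if tbc else 0.0
    # Pattern I is checked first, so a ratio exactly equal to AVERAGE
    # with a high TADF is reported as Pattern I rather than Pattern III.
    if tadf >= t.tadf_high and ratio >= t.ratio_average:
        return "Significant Feature Acquaintance"
    if tbc >= t.tbc_high and ratio <= t.ratio_low:
        return "Local Feature Contributor"
    if tadf >= t.tadf_average and t.ratio_low < ratio <= t.ratio_average:
        return "Connector Class"
    return None


# A class with TADF = 15 and TADF/TBC = 0.2 (so TBC = 75) falls into
# Pattern III under these placeholder thresholds.
pattern = classify(tadf=15, tbc=75)
```

Classes matching none of the three rules return `None`, which reflects that the patterns only single out the interesting classes.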
28. System-level
characterization
FWS:
2 DF cores
Average TADF = 3, a lot of gray
- significant local functionality
[80 classes belong to a local tool;
the system was initially non-distributed]
EHCACHE:
5 DF cores
Average TADF=9, more color
- more distributed functionality
[documentation: system
redesigned specifically as distributed]
29. Class-level
characterization
Local Feature Contributor / Big Gray
• FWS
- 80 classes -- the local tool for visually editing workflow specifications
- 6 classes -- belonging to other local features
• EHCACHE
- Less than 5 classes
- Cache: highest TBC; heavily used, but local
- ConfigurationHelper – manages configuration files
30. Class-level
characterization
Significant Feature Acquaintance / Big Color Spot
• FWS
- 5 classes, related to the Workflow Engine
- Small number => the functionality is well localized in the system
• EHCACHE
- 12 classes, related to the Cache Peer Manager
- TADF/TBC close to 1 => classes are dedicated to the distributable feature
(e.g., Mutex, ConcurrencyUtil, Sync)
31. Class-level
characterization
Connector Class / Color-Spotted Gray
FWS
- 5 classes
- Most interesting case: ProcessDefinition
- TADF=15, TADF/TBC=0.2
- Models/stores the internal representation of workflows in execution
- Links the classes that run the workflow (detected as Significant Feature
Acquaintances) with an XML parser that reads the workflow specifications
EHCACHE
- 6 classes
- Most interesting case: Element
- Represents the data item cached by the system
- The only class that has a noticeable relation with Cache Replicator
- Links the Cache Replicator with the non-distributed feature of the
system that actually stores (caches) data
33. Goal
Apply concepts and measurements
similar to those used in the analysis
to help the engineer
explore / play with
tentative restructuring scenarios
34. Approach
Visualize (a part of) the graph of classes
Select a set of initial classes
See what happens if they are to be extracted (removed) as a
separate unit:
- evaluate the redesign layout
which classes should go with those selected,
which should remain in the initial system
- evaluate the cost
Apply such scenarios at will
35. Helpers
Metrics-based visualization to help select initial classes
- In-group Adequacy (IGA) metric
[Figure: three example views, a), b), c); axes: intensity, dispersion]
Compute the forecasted layout
- Acquaintance with Class Group (ACG)
- Configurable threshold value
Computing the extraction cost
- Extraction Cost (EC)
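One way to picture the forecast and cost steps is sketched below. The slides name ACG and EC without defining them, so the formulas here are assumptions: ACG is taken as the share of a class's coupling that goes to the selected group, classes at or above the configurable threshold are pulled into the group, and EC is taken as the number of dependencies that would cross the new boundary.

```python
def acg(cls, group, coupling):
    """Share of cls's total coupling that involves the group (assumed definition)."""
    total = sum(w for (a, b), w in coupling.items() if cls in (a, b))
    to_group = sum(
        w
        for (a, b), w in coupling.items()
        if (a == cls and b in group) or (b == cls and a in group)
    )
    return to_group / total if total else 0.0


def forecast_layout(selected, classes, coupling, threshold):
    """Grow the selected group with classes whose ACG meets the threshold."""
    group = set(selected)
    changed = True
    while changed:  # repeat until no further class joins the group
        changed = False
        for cls in sorted(set(classes) - group):
            if acg(cls, group, coupling) >= threshold:
                group.add(cls)
                changed = True
    return group


def extraction_cost(group, coupling):
    """Dependencies crossing the group boundary (assumed definition of EC)."""
    return sum(w for (a, b), w in coupling.items() if (a in group) != (b in group))


# Hypothetical classes and dependency counts.
coupling = {
    ("Stub", "Engine"): 4,
    ("Engine", "Helper"): 3,
    ("Helper", "Gui"): 1,
    ("Gui", "Editor"): 5,
}
classes = {"Stub", "Engine", "Helper", "Gui", "Editor"}
group = forecast_layout({"Stub", "Engine"}, classes, coupling, threshold=0.6)
cost = extraction_cost(group, coupling)
```

Here `Helper` follows the selected classes because most of its coupling is with them, while `Gui` and `Editor` remain in the initial system; lowering the threshold would pull more classes along and change the cost.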
38. niSiDe
“non-invasive Structural insight on Distributed environments”
Follows all the steps in the methodology, and provides
complete support for analysis
Generates all visualizations and support diagrams
Built for extensibility
Integrated in the iPlasma environment
40. Contributions
• A methodology for understanding object-oriented
distributed systems
• A model for object-oriented distributed systems
• The Distributable Features View (visualization)
• Basic restructuring support as a natural extension
to the understanding techniques
• Comprehensive tool support