Mining source code for structural regularities
Kim Mens, Andy Kellens, Gabriela Arevalo, Angela Lozano

Problem Context
• Context:
  • Programmers often use regularities: coding conventions, design patterns, crosscutting concerns...
  • Enforcing such regularities facilitates maintenance, evolution & comprehension
  • Regularities are often not fully respected (implicit, no support)
• Goal: provide automated (tool) support to discover source code (ir)regularities

Formal Concept Analysis (FCA)
Characterizing Mammals (rows = objects, columns = attributes)

Objects   4 legged  hair covered  intelligent  marine  thumbed
cats      *         *
dogs      *         *
dolphins                          *            *
gibbons             *             *                    *
humans                            *                    *
whales                            *            *
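The cross table above can be turned into formal concepts mechanically. The following is a minimal sketch, assuming a brute-force closure over all attribute subsets; the names `CONTEXT`, `extent`, `intent` and `formal_concepts` are illustrative and not part of the authors' tooling.

```python
from itertools import combinations

# Cross table from the slide: each mammal (object) with its attributes.
CONTEXT = {
    "cats": {"4 legged", "hair covered"},
    "dogs": {"4 legged", "hair covered"},
    "dolphins": {"intelligent", "marine"},
    "gibbons": {"hair covered", "intelligent", "thumbed"},
    "humans": {"intelligent", "thumbed"},
    "whales": {"intelligent", "marine"},
}
ALL_ATTRIBUTES = frozenset().union(*CONTEXT.values())


def extent(attributes):
    """Objects that have every attribute in `attributes`."""
    return frozenset(o for o, attrs in CONTEXT.items() if attributes <= attrs)


def intent(objects):
    """Attributes shared by every object in `objects`."""
    if not objects:
        return ALL_ATTRIBUTES
    return frozenset.intersection(*(frozenset(CONTEXT[o]) for o in objects))


def formal_concepts():
    """Enumerate all (extent, intent) pairs by closing every attribute subset.

    Exponential in the number of attributes, so only suitable for toy contexts.
    """
    concepts = set()
    attrs = sorted(ALL_ATTRIBUTES)
    for size in range(len(attrs) + 1):
        for subset in combinations(attrs, size):
            ext = extent(frozenset(subset))
            concepts.add((ext, intent(ext)))
    return concepts


for ext, att in sorted(formal_concepts(), key=lambda c: -len(c[0])):
    print(sorted(ext), "<->", sorted(att))
```

Real FCA tools use dedicated lattice-construction algorithms instead of this exhaustive enumeration; the sketch only shows what a concept is.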

FCA: Lessons learnt from mining aspects
• DISADVANTAGES
  • almost combinatorial amount of results
  • does not detect exceptional cases
  • description of the concept
  • redundancy
  • requires traversal heuristics (ad-hoc)
• ADVANTAGES
  • shared properties = hint of concept specification
  • does not require a priori knowledge

Rules Notation
Concept1 (k) --n m%--> (l) Concept2
should be read as:
• n elements in Concept1 also appear in Concept2
• m% of the elements in Concept1 (a.k.a. confidence) are also in Concept2
• k elements in Concept1 are NOT in Concept2
• l elements in Concept2 are NOT in Concept1
[Figure: Concept1 and Concept2 as overlapping sets, with k elements only in Concept1, n in the overlap and l only in Concept2]
• Special case: when m = 100%
Concept1 (0) ==n 100%==> (l) Concept2
[Figure: Concept1 (n elements) contained in Concept2, which has l further elements]
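The four numbers in the notation follow directly from set operations. A small sketch, assuming each concept is given as the plain set of its elements; `rule_stats` is a hypothetical helper name, and the two example sets are the mammal sets used in this deck.

```python
def rule_stats(concept1, concept2):
    """Return (k, n, m, l) for the rule Concept1 (k) --n m%--> (l) Concept2."""
    c1, c2 = set(concept1), set(concept2)
    n = len(c1 & c2)                          # elements in both concepts
    k = len(c1 - c2)                          # elements only in Concept1
    l = len(c2 - c1)                          # elements only in Concept2
    m = 100.0 * n / len(c1) if c1 else 0.0    # confidence, in percent
    return k, n, m, l


hair_covered = {"gibbons", "cats", "dogs"}
four_legged = {"cats", "dogs"}
marine = {"dolphins", "whales"}
intelligent = {"dolphins", "whales", "humans", "gibbons"}

print(rule_stats(hair_covered, four_legged))  # association: k > 0
print(rule_stats(marine, intelligent))        # implication: k == 0, m == 100
```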

Implications
Marine (0) ==2 (100%)==> (2) Intelligent
• Marine = {dolphins, whales}; Intelligent = {dolphins, whales, humans, gibbons}
• Marine mammals are intelligent (closed world)
• Subset relation in lattice

Associations
Hair covered (1) --2 (66%)--> (0) 4 legged
• Hair covered = {gibbons, cats, dogs}; 4 legged = {cats, dogs}
• Most hair-covered mammals are 4 legged; gibbons aren't
• Superset or siblings in the lattice

CASES
• Case 1: IntensiVE
  • Tool to validate if regularities documented through several views are respected
  • Smalltalk
  • 270 classes; 2729 methods
• Case 2: Freecol
  • Colonization game (graphic, multiplayer)
  • Java
  • 382 classes; 3252 methods

OUR Approach
• Objects = Classes, & Attributes =
  • (K) has keyword (in class name)
  • (I) implements a particular method / message
  • (H) in hierarchy of class
• E.g. FreeColAction, which implements getId() and toXMLImpl():
  • K>Action, H>FreeColAction, I>getId, I>toXMLImpl
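A sketch of how such attributes might be derived for one class. `ClassInfo` is a hypothetical stand-in for whatever the fact extractor provides, and splitting the class name on CamelCase boundaries is an assumed keyword heuristic: the slide lists only K>Action for FreeColAction, whereas this sketch also yields K>Free and K>Col.

```python
import re
from dataclasses import dataclass, field


@dataclass
class ClassInfo:
    """Hypothetical, minimal description of a class in the analysed system."""
    name: str
    ancestors: list = field(default_factory=list)  # superclasses, nearest first
    methods: list = field(default_factory=list)    # names of implemented methods


def keywords(class_name):
    """Split a CamelCase class name into words (an assumed keyword heuristic)."""
    return re.findall(r"[A-Z][a-z0-9]*|[a-z0-9]+", class_name)


def fca_attributes(cls):
    """Build the K>, H> and I> attributes of one class."""
    attrs = {f"K>{word}" for word in keywords(cls.name)}
    # A class counts as part of its own hierarchy, as in the slide's example.
    attrs |= {f"H>{c}" for c in [cls.name, *cls.ancestors]}
    attrs |= {f"I>{m}" for m in cls.methods}
    return attrs


action = ClassInfo("FreeColAction", methods=["getId", "toXMLImpl"])
print(sorted(fca_attributes(action)))
```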

algorithm
Source Code
→ Eliminate irrelevant SCEs (source code entities)
→ Extract FCA context
→ Calculate concept lattice
→ Calculate implication & association rules (Confidence >= 50% and Support >= 3)
→ Simplify rules
→ Eliminate redundant rules
→ Group by overlapping elements
→ Rule groups

eliminate SCEs
• Irrelevant entities:
  • Object class
  • Test classes & methods

rule calculation
• Calculate implications:
  • traverse child-parent relations of key nodes*
• Calculate associations:
  • traverse all parent-child relations of key nodes*
  • traverse all key node* pairs that have no connection in the lattice
• Filter relations with confidence & support below thresholds
  • confidence ≤ 75% and support ≥ 3
* those that add attributes or objects to the lattice
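The filtering step can be sketched as follows. The thresholds are parameters because the deck quotes both 50% and 75% for confidence; the defaults below follow the pipeline annotation (Confidence >= 50%, Support >= 3). Treating support as the number of matches is an assumption, and `Rule` and `keep_rule` are illustrative names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    condition: str
    conclusion: str
    matches: int          # n: entities satisfying both sides
    exc_condition: int    # k: entities satisfying only the condition
    exc_conclusion: int   # l: entities satisfying only the conclusion

    @property
    def confidence(self):
        total = self.matches + self.exc_condition
        return self.matches / total if total else 0.0


def keep_rule(rule, min_confidence=0.5, min_support=3):
    """Keep a rule only if it reaches both thresholds.

    Support is taken to be the number of matches (an assumption).
    """
    return rule.confidence >= min_confidence and rule.matches >= min_support


# First rule is quoted from the FreeCol results; the second is made up.
found = Rule("I>'initialize'", "H>'FreeColPanel'", 36, 7, 9)
weak = Rule("K>'Foo'", "I>'bar'", 2, 5, 0)
print([r.condition for r in (found, weak) if keep_rule(r)])
```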

simplify rules
GOAL: Eliminate redundant properties of a rule
H>IVEditorNodeFigure H>IVEditorFigure ----> K>'IVEditor' K>'Figure'
H>IVEditorNodeFigure ====> H>IVEditorFigure
[Figure: class hierarchy with IVEditorFigure above IVEditorNodeFigure]
H>IVEditorFigure can be dropped from the first rule because, given the implication, IVEditorFigure does not add information to the rule

simplify rules
[Figure: class hierarchy with IVEditorFigure above IVEditorNodeFigure]
H>IVEditorNodeFigure ==100%==> H>IVEditorFigure
H>IVEditorNodeFigure, H>IVEditorFigure ----> K>'IVEditor', K>'Figure'
K>'IVEditor', K>'Figure' ----> H>IVEditorNodeFigure, H>IVEditorFigure
K>'IVEditor', K>'Figure', H>IVEditorFigure ----> H>IVEditorNodeFigure
K>'IVEditor', K>'Figure', H>IVEditorNodeFigure ----> H>IVEditorFigure
H>Intensional.IVIntensiVEAction (0) ==24 (100%)==> (0) I>#undoAction,H>Classifications2.AbstractAction,I>#performAction
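Dropping a property that a 100% implication already guarantees might look like the sketch below. It assumes single-attribute implications stored in a dictionary; `simplify` is a hypothetical helper, not the tool's actual routine.

```python
def simplify(attributes, implications):
    """Drop every attribute that a 100% implication already guarantees.

    `implications` maps a premise attribute to the attribute it implies.
    Mutually implying attributes would both be dropped, so a real
    implementation needs an extra rule for that case.
    """
    attributes = set(attributes)
    redundant = {
        implied
        for premise, implied in implications.items()
        if premise in attributes and implied in attributes
    }
    return attributes - redundant


implications = {"H>IVEditorNodeFigure": "H>IVEditorFigure"}
condition = {"H>IVEditorNodeFigure", "H>IVEditorFigure"}
print(simplify(condition, implications))
```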

simplify rules: Priority to apply implications
Suppose
(R) a,b,c,e,f → h
(X) b → c
(Y) a → b
(Z) f → b,e
Is there an order to apply several implications to remove redundancies from R?

simplify rules: Priority to apply implications
(X) b → c
(Y) a → b
(Z) f → b,e
(R) a,b,c,e,f → h
Each of the six possible orders is applied to (R):
• Y, X, Z
• Y, Z, X
• Z, X, Y
• Z, Y, X
• X, Y, Z
• X, Z, Y
[Figure: the simplified form of (R) obtained for each order]
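To see why the order matters, the six orders can be tried mechanically. This is a sketch under one assumed reading of "applying" an implication: if its condition is still present in the condition of (R), the attributes it concludes are removed from (R). All names are illustrative.

```python
from itertools import permutations

# Implications from the slide, as (condition, conclusion) pairs.
IMPLICATIONS = {
    "X": ({"b"}, {"c"}),
    "Y": ({"a"}, {"b"}),
    "Z": ({"f"}, {"b", "e"}),
}
R_CONDITION = {"a", "b", "c", "e", "f"}   # condition of (R) a,b,c,e,f -> h


def apply_implication(rule_condition, implication):
    """Remove attributes implied by a premise that is present in the condition."""
    premise, implied = implication
    if premise <= rule_condition:
        return rule_condition - (implied - premise)
    return rule_condition


def simplify_in_order(order):
    """Apply the named implications to R's condition, one after the other."""
    condition = set(R_CONDITION)
    for name in order:
        condition = apply_implication(condition, IMPLICATIONS[name])
    return condition


for order in permutations("XYZ"):
    print(", ".join(order), "->", sorted(simplify_in_order(order)))
```

Under this reading, the orders that start with X reduce the condition to a,f, while every other order leaves a,c,f: once b has been removed, X can no longer fire to remove c.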

simplify rules: Priority to apply implications
A≤B if A.condition ⊆ B.conclusion ∨ B.conclusion ⊆ A.conclusion
(X) b → c
(Y) a → b
(Z) f → b,e
(R) a,b,c,e,f → h
X 1st:
• X≤Z because b ⊆ b,e (X.condition ⊆ Z.conclusion)
• X≤Y because b ⊆ b (X.condition ⊆ Y.conclusion)
• applying X: (XtoR) a,b,c,e,f → h
Z 2nd:
• Z≤Y because b ⊆ b+e (Y.conclusion ⊆ Z.conclusion)
• applying Z: (Zto(XtoR)) a,b,c,e,f → h
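The priority relation can be checked on the three implications with a few lines. In this sketch `precedes` encodes the A≤B definition as stated on the slide; ranking implications by how many others they precede is an assumed way of turning the relation into an application order.

```python
def precedes(a, b):
    """A <= B if A.condition is within B.conclusion or B.conclusion within A.conclusion."""
    a_condition, a_conclusion = a
    _, b_conclusion = b
    return a_condition <= b_conclusion or b_conclusion <= a_conclusion


# (condition, conclusion) pairs for X, Y and Z from the slide.
RULES = {
    "X": ({"b"}, {"c"}),
    "Y": ({"a"}, {"b"}),
    "Z": ({"f"}, {"b", "e"}),
}


def application_order(rules):
    """Sort implications so that those preceding more of the others come first."""
    def rank(name):
        return sum(
            precedes(rules[name], rules[other])
            for other in rules
            if other != name
        )
    return sorted(rules, key=rank, reverse=True)


print(application_order(RULES))
```

For these three implications the relation happens to be a strict chain, so the ranking is unambiguous; with ties or cycles a more careful ordering would be needed.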

eliminate rules
• Eliminate unrelated sets
K>'Colopedia' (0) ==7 (100%)==> (86) I>'actionPerformed'
For a rule a (exc. condition) --(matches)→ (exc. conclusion) b:
\[
\frac{1}{2}\left(\frac{\text{exc. condition}}{\text{condition size}} + \frac{\text{exc. conclusion}}{\text{conclusion size}}\right)
= \frac{1}{2}\left(\frac{\text{exc. condition}}{\text{exc. condition} + \text{matches}} + \frac{\text{exc. conclusion}}{\text{exc. conclusion} + \text{matches}}\right) < 0.25
\]
i.e. the average number of exceptions is below a quarter of any of the sets
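A sketch of the exception check, applied to the Colopedia rule from the slide. Reading the 0.25 bound as the condition for keeping a rule is an assumption, as are the function names.

```python
def average_exception_ratio(exc_condition, matches, exc_conclusion):
    """Mean share of exceptions on the condition and the conclusion side.

    Assumes matches >= 1, which the support threshold guarantees.
    """
    condition_size = exc_condition + matches
    conclusion_size = exc_conclusion + matches
    return (exc_condition / condition_size + exc_conclusion / conclusion_size) / 2


def relates_sets(exc_condition, matches, exc_conclusion, threshold=0.25):
    """True if the rule's two sets are related closely enough to keep it (assumed reading)."""
    return average_exception_ratio(exc_condition, matches, exc_conclusion) < threshold


# K>'Colopedia' (0) ==7 (100%)==> (86) I>'actionPerformed'
ratio = average_exception_ratio(0, 7, 86)
print(round(ratio, 3), relates_sets(0, 7, 86))
```

Although the Colopedia rule has 100% confidence, 86 of the 93 classes implementing actionPerformed fall outside it, so the average exception share is well above a quarter.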

eliminate rules
• Similar rules that conclude SubClass or SuperClass:
Property>X --matchesSub--> (exc. conclusion sub) H>SubClass
Property>X --matchesSuper--> (exc. conclusion super) H>SuperClass
• eliminate the super rule if it just adds noise (--matches, ++exceptions):
[Formula: relates (matchesSub) / (condition size) to (matchesSuper) / (condition size), and (exc. conclusion sub) to (exc. conclusion super), each with a factor of 0.9]
deleted K>'Classification' (0) --5 (100%)--> (24) H>Classifications2.AbstractClassification
K>'Classification' (0) --5 (100%)--> (0) H>Classifications2.Classification
• eliminate the sub rule if it has lower confidence:
\[
\frac{\text{matchesSub}}{\text{condition size}} < \frac{\text{matchesSuper}}{\text{condition size}}
\]
I>'shouldBeEnabled' (0) --46 (100%)--> (7) H>'FreeColAction'
deleted I>'shouldBeEnabled' (2) --44 (96%)--> (7) H>'MapboardAction'
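The two elimination criteria can be sketched together. The direction of the two 0.9 comparisons is an assumption (the sub rule covers at least 90% of the super rule's matches, and has at most 90% of its exceptions); it was chosen because it reproduces both examples on the slide. `pick_hierarchy_rule` is a hypothetical name.

```python
def pick_hierarchy_rule(condition_size, matches_sub, exc_conclusion_sub,
                        matches_super, exc_conclusion_super, ratio=0.9):
    """Return which of two similar rules to keep: "sub", "super" or "both"."""
    # Assumed reading of "--matches, ++exceptions": the super rule adds
    # few matches but clearly more exceptions than the sub rule.
    super_adds_few_matches = matches_sub >= ratio * matches_super
    super_adds_exceptions = exc_conclusion_sub <= ratio * exc_conclusion_super
    if super_adds_few_matches and super_adds_exceptions:
        return "sub"            # the super rule is just noise
    if matches_sub / condition_size < matches_super / condition_size:
        return "super"          # the sub rule has lower confidence
    return "both"


# K>'Classification': sub = Classification, super = AbstractClassification
print(pick_hierarchy_rule(5, 5, 0, 5, 24))
# I>'shouldBeEnabled': sub = MapboardAction, super = FreeColAction
print(pick_hierarchy_rule(46, 44, 7, 46, 7))
```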

eliminate rules
• Concluding the root of the classes in the app. does not add any information:
Property>X ----> H>RootClass
• When having converse pairs of rules,
Property>X ----> Property>Y
Property>Y ----> Property>X
  • prefer the one with better confidence.
  • If similar confidence, prefer the one that starts with the condition of stronger semantics:
    H > I, H > K, I=K, U > k, R > k, U=R
[Figure: converse pair H ----> I and I ----> H]
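Choosing between converse rules might be sketched as below. The numeric strengths encode only H > I, H > K and I = K; U and R are left out because the deck does not define them. The `tolerance` for "similar confidence" and the confidences in the example are made up for illustration.

```python
# Assumed encoding of the slide's ordering H > I, H > K, I = K.
STRENGTH = {"H": 2, "I": 1, "K": 1}


def kind(attribute):
    """Property kind of an attribute such as "H>'Mission'"."""
    return attribute.split(">", 1)[0]


def prefer(rule_a, rule_b, tolerance=0.05):
    """Choose between converse rules given as (condition, conclusion, confidence)."""
    conf_a, conf_b = rule_a[2], rule_b[2]
    if abs(conf_a - conf_b) > tolerance:
        return rule_a if conf_a > conf_b else rule_b
    # Similar confidence: prefer the condition with stronger semantics.
    strength_a = STRENGTH[kind(rule_a[0])]
    strength_b = STRENGTH[kind(rule_b[0])]
    return rule_a if strength_a >= strength_b else rule_b


h_to_i = ("H>'Mission'", "I>'doMission'", 0.95)   # hypothetical confidences
i_to_h = ("I>'doMission'", "H>'Mission'", 0.93)
print(prefer(h_to_i, i_to_h))
```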

group rules
• Those rules that share at least 85% of their matches are grouped together
• The threshold comes from analysis of results (might change depending on the case study)
• They represent common properties of a set of source code entities
• These groups are ordered by number of matches
[Figure: example group relating I>getId, I>getActionPerformed, K>Action, H>FreeColAction and H>MapboardAction]
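One possible reading of the grouping step, as a sketch: overlap is measured relative to the larger of the two match sets, and grouping is greedy against the first rule of each group. Both choices are assumptions, and the rule labels and match sets in the example are made up.

```python
def overlap(matches_a, matches_b):
    """Share of common matches, relative to the larger set (an assumption).

    Assumes both sets are non-empty, which the support threshold guarantees.
    """
    return len(matches_a & matches_b) / max(len(matches_a), len(matches_b))


def group_rules(rule_matches, threshold=0.85):
    """Greedy grouping: a rule joins the first group whose seed it overlaps enough.

    `rule_matches` maps a rule label to the set of entities it matches.
    Groups are returned ordered by number of matched entities, largest first.
    """
    groups = []
    for label, matches in rule_matches.items():
        for group in groups:
            if overlap(group["seed"], matches) >= threshold:
                group["rules"].append(label)
                group["matches"] |= matches
                break
        else:
            groups.append({"seed": matches, "rules": [label], "matches": set(matches)})
    groups.sort(key=lambda g: len(g["matches"]), reverse=True)
    return [g["rules"] for g in groups]


# Hypothetical match sets: entities are just numbered.
example = {
    "r1": set(range(20)),
    "r2": set(range(1, 20)),
    "r3": set(range(100, 105)),
}
print(group_rules(example))
```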

RESULTS: Reduction of information to process
• IntensiVE [270 classes; 2729 methods]
  • Concepts: 1289 / Relations: 4390
  • Rules: 325 / Groups: 50
• Freecol [382 classes; 3252 methods]
  • Concepts: 1261 / Relations: 5149
  • Rules: 134 / Groups: 42

RESULTS: Rules Freecol
• K → H = 2 rules
  • The concept described by the keyword is confined to classes in the hierarchy, e.g. K>'Mission' → H>'Mission'
• K → K = 4 rules
  • Combined words Free+Col, Trade+Route, Free+Col+Menu, etc.
• K → I = 8 rules
  • Classes named *Keyword* should implement the method I, e.g. K>'Info' → I>'update', K>'Action' → I>'actionPerformed', K>'Thread' → I>'run', K>'Mission' → I>'doMission', etc.

RESULTS: Rules Freecol
• H → K = 9 rules
  • Classes in the hierarchy can be described by the keyword, e.g.
    • H>'ReportPanel' → K>'Panel' K>'Report'
    • H>'NetworkRequestHandler' → K>'Handler'
    • H>'InputHandler' → K>'Input'
    • H>'OptionUpdater' → K>'UI'
    • H>'TradeItem' → K>'Item'

RESULTS: Rules Freecol
• H → I = 28 rules
  • Classes in the hierarchy should implement the method, e.g.
    • H>'NetworkRequestHandler' → I>'handle'
    • H>'OptionMap' → I>'addDefaultOptions'
    • H>'Location' → I>'getGoodsContainer' → I>'getLocationName'
    • H>'MapIterator' → I>'nextPosition'
    • H>'MapboardAction' → I>'getId' → I>'shouldBeEnabled'
    • H>'OptionUpdater' → I>'updateOption', I>'unregister'
    • H>'TradeItem' → I>'makeTrade'
    • H>'PersistentObject' → I>'readFromXMLImpl' → I>'toXML'

RESULTS: Rules Freecol
• I → I = 71 rules
  • Implementation protocols, e.g.
    • I>'getColony' → I>'getXMLElementTagName' → I>'toXMLImpl' → I>'readFromXMLImpl'
    • I>'installUI' → I>'createUI'
    • I>'getTransportDestination' → I>'doMission' → I>'dispose'
    • I>'contains' → I>'add' → I>'newTurn' → I>'remove'
    • I>'toXML' → I>'getXMLElementTagName' → I>'readFromXMLImpl'
    • I>'requestFocus' → I>'actionPerformed' → I>'initialize'
    • I>'setOwner' → I>'newTurn' → I>'getTile'
    • I>'setName' → I>'getName'

RESULTS: Groups Freecol
• Classes in the hierarchy FreeColAction are named *Action*, and tend to implement getId and actionPerformed
• 50 matches, 6 rules
  • H>'MapboardAction' (1) --50 (98%)--> (3) I>'getId'
  • H>'FreeColAction' (0) ==53 (100%)==> (3) K>'Action'
  • H>'FreeColAction' (3) --50 (94%)--> (43) I>'actionPerformed'
  • I>'getId' (3) --50 (94%)--> (43) I>'actionPerformed'
  • I>'getId',H>'FreeColAction' (0) --52 (100%)--> (4) K>'Action'
  • K>'Action' (5) --51 (91%)--> (42) I>'actionPerformed'

RESULTS: Groups Freecol
• Most of the classes that implement initialize belong to the hierarchy of FreeColPanel
  • initialize prepares a panel to be displayed
  • 36 matches, 1 rule
  • I>'initialize' (7) --36 (84%)--> (9) H>'FreeColPanel'
• Classes that implement toXMLImpl also implement getXMLElementTagName
  • toXMLImpl writes an XML representation of the object to a stream.
  • getXMLElementTagName gets the tag name that represents the object
  • Exception is FreeColAction, which is the XML root
  • 44 matches, 1 rule
  • I>'toXMLImpl' (1) --44 (98%)--> (16) I>'getXMLElementTagName'

RESULTS: IntensiVE
• Regularities documented & found
  • Interface
    • Action protocol & undoable protocol
    • Compilation
    • Relation evaluators
    • Cache / Save / Remove on definitions
    • Intension Editors (partial protocol)
    • Instantiable views
    • Constraint editors
    • Evaluators
  • Naming convention
    • Unit testing, View hierarchy
  • Interface + Naming convention
    • Quantifiers (naming + partial interface)

RESULTS: IntensiVE
• Regularities found & NOT documented
  • Interfaces
    • IntensiVE Explorer Visualization
    • Checkable entities
    • Fuzzy quantifiers
    • Query generation for visual querying
    • Context-menu in visual query language
    • Figure rendering
    • Special classifications
  • Naming conventions
    • Figures, Exceptions, Visualization, Classifications, Exceptions to views, Result pairs, Reporters.
  • Interfaces + Naming conventions
    • Starbrowser shells

CONCLUSIONS
• Use FCA with objects = source code entities
  • As attributes = several types of properties
• Calculate implications
  • to mine for intension of regularity rather than extension
  • not just entities that match regularity but explicit specification of regularity
• Allow for variations and irregularities = association rules
• To overcome previous pitfalls and make regularities explicit

THREATS & LIMITATIONS
• Redundant information
  • There are groups that are sub-sets of other groups
• All results are correct but....
  • some regularities found might be due to chance and not a conscious development decision
  • interpretation of results requires assuming a closed world
• Usefulness is subjective
  • i.e. separating useful from useless results
• Data analyzed could be more semantic

CURRENT & FUTURE WORK:
• ...Running the same case studies mixing the results of Classes and Methods
• ...Comparing the regularities found with those previously documented in IntensiVE
• Calculate which percentage of the irregularities of a group are indeed an error
• Use the results to guide the developer while adding or modifying SCEs
• Use a similar approach to mine feature dependencies

 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector Databases
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 

Mining source code for structural regularities (SATTOSE2010)

  • 1. Mining source code for structural regularities Kim Mens, Andy Kellens, Gabriela Arevalo, Angela Lozano
  • 2. Problem Context • Context: • Programmers often use regularities: coding conventions, design patterns, crosscutting concerns... • Enforcing such regularities facilitates maintenance, evolution & comprehension • Regularities are often not fully respected (implicit, no support) • Goal: provide automated (tool) support to discover source code (ir)regularities
  • 4. FCA: Lessons learnt from mining aspects • DISADVANTAGES: almost combinatorial amount of results; does not detect exceptional cases; redundancy; requires traversal heuristics (ad hoc) • ADVANTAGES: description of the concept; shared properties = hint of concept specification; does not require a priori knowledge
  • 5. Rules Notation • Concept1 (k) --n m%--> (l) Concept2 should be read as: • n elements in Concept1 also appear in Concept2 • m% of the elements in Concept1 (a.k.a. confidence) are also in Concept2 • k elements in Concept1 are NOT in Concept2 • l elements in Concept2 are NOT in Concept1 • Special case: when m = 100%, Concept1 (0) ==n 100%==> (l) Concept2
  • 6. Implications • Marine (0) ==2 (100%)==> (2) Intelligent • All marine mammals (dolphins, whales) are intelligent (closed world); humans and gibbons are intelligent but not marine • Subset relation in the lattice • Associations • Hair covered (1) --2 (66%)--> (0) 4 legged • Most hair-covered mammals (cats, dogs) are 4 legged; gibbons aren't • Superset or siblings in the lattice
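The notation can be reproduced mechanically from two sets of objects. The following is a minimal Python sketch that uses the sets named on the slide above; the function name `describe_rule` is illustrative only, and truncating the percentage (rather than rounding it) is an assumption chosen because it reproduces the 66% shown for 2 out of 3.

```python
def describe_rule(name1, concept1, name2, concept2):
    """Format 'Concept1 (k) --n (m%)--> (l) Concept2' for two non-empty sets."""
    n = len(concept1 & concept2)   # elements that appear in both concepts
    k = len(concept1 - concept2)   # exceptions: in Concept1 but not in Concept2
    l = len(concept2 - concept1)   # exceptions: in Concept2 but not in Concept1
    m = 100 * n // len(concept1)   # confidence in percent, truncated (assumption)
    # A rule without exceptions on the condition side is an implication (==>).
    left, right = ("==", "==>") if k == 0 else ("--", "-->")
    return f"{name1} ({k}) {left}{n} ({m}%){right} ({l}) {name2}"


marine = {"dolphins", "whales"}
intelligent = {"dolphins", "whales", "humans", "gibbons"}
hair_covered = {"gibbons", "cats", "dogs"}
four_legged = {"cats", "dogs"}

print(describe_rule("Marine", marine, "Intelligent", intelligent))
print(describe_rule("Hair covered", hair_covered, "4 legged", four_legged))
```

The two printed lines correspond to the implication and the association given on the slide.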
  • 7. CASES • Case 1 : IntensiVE • Tool to validate if regularities documented through several views are respected • Smalltalk • 270 classes; 2729 methods • Case 2 : Freecol • Colonization game (graphic, multiplayer) • Java • 382 classes; 3252 methods
  • 8. OUR Approach • Objects = Classes, & Attributes = • (K) has keyword (in class name) • (I) implements a particular method / message • (H) in hierarchy of class • E.g. FreeColAction, with methods getId() and toXMLImpl(): K>Action, H>FreeColAction, I>getId, I>toXMLImpl
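As an illustration of how such a context could be assembled, here is a small Python sketch. It is not the authors' extractor: the CamelCase splitting used for keywords, the helper names (`keywords`, `attributes`, `extent`) and the method list given for MapboardAction are all assumptions made for the example, and ancestors outside the application are simply left out.

```python
import re


def keywords(class_name):
    """Split a CamelCase class name into words (assumed keyword extraction)."""
    return re.findall(r"[A-Z][a-z0-9]+|[A-Z]+(?![a-z])", class_name)


def attributes(class_name, ancestors, methods):
    """Build the K>, H> and I> attributes of one class."""
    attrs = {f"K>{word}" for word in keywords(class_name)}
    # A class counts as part of its own hierarchy, as in the slide's example.
    attrs |= {f"H>{cls}" for cls in [class_name, *ancestors]}
    attrs |= {f"I>{method}" for method in methods}
    return attrs


def extent(context, wanted):
    """Classes of the context that have every attribute in `wanted`."""
    return {cls for cls, attrs in context.items() if wanted <= attrs}


# Toy context; the methods listed for MapboardAction are hypothetical.
context = {
    "FreeColAction": attributes("FreeColAction", [], ["getId", "toXMLImpl"]),
    "MapboardAction": attributes(
        "MapboardAction", ["FreeColAction"], ["getId", "shouldBeEnabled"]
    ),
}

print(sorted(context["FreeColAction"]))
print(sorted(extent(context, {"K>Action", "I>getId"})))
```

Note that this splitting also yields K>Free and K>Col for FreeColAction, whereas the slide lists only K>Action in its example.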
  • 9. Algorithm • Source code → Eliminate irrelevant source code entities (SCEs) → Extract FCA context → Calculate concept lattice → Calculate implication & association rules (Confidence >= 50% and Support >= 3) → Simplify rules → Eliminate redundant rules → Group by overlapping elements → Rule groups
  • 10. Eliminate irrelevant SCEs • Irrelevant entities: • Object class • Test classes & methods
  • 11. Rule calculation • Calculate implications: • traverse child-parent relations of key nodes* • Calculate associations: • traverse all parent-child relations of key nodes* • traverse all pairs of key nodes* that have no connection in the lattice • Filter relations with confidence & support below thresholds • confidence ≤ 75% and support ≥ 3 • * key nodes: those that add attributes or objects to the lattice
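The filtering step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: support is taken to be the number of matches, confidence is matches divided by the size of the condition concept, and the default thresholds follow the "Confidence >= 50% and Support >= 3" annotation of the pipeline; since this slide also mentions a 75% figure, both thresholds are left as parameters. The `Rule` class and the third rule are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    condition: str
    conclusion: str
    matches: int          # n: objects in both concepts
    exc_condition: int    # k: objects only in the condition concept

    @property
    def confidence(self):
        return self.matches / (self.matches + self.exc_condition)


def keep_rules(rules, min_confidence=0.5, min_support=3):
    """Keep rules that meet both thresholds (support = number of matches)."""
    return [
        r for r in rules
        if r.confidence >= min_confidence and r.matches >= min_support
    ]


rules = [
    Rule("I>'initialize'", "H>'FreeColPanel'", matches=36, exc_condition=7),
    Rule("Hair covered", "4 legged", matches=2, exc_condition=1),
    Rule("K>'Example'", "I>'example'", matches=4, exc_condition=6),  # hypothetical
]

for rule in keep_rules(rules):
    print(rule.condition, "->", rule.conclusion, f"{rule.confidence:.0%}")
```

With these defaults only the first rule survives: the second has too few matches and the third has a confidence of 40%.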
  • 12. Simplify rules • GOAL: Eliminate redundant properties of a rule • H>IVEditorNodeFigure H>IVEditorFigure ----> K>'IVEditor' K>'Figure' • H>IVEditorNodeFigure ====> H>IVEditorFigure (IVEditorNodeFigure is a subclass of IVEditorFigure) • H>IVEditorFigure can be dropped from the condition because it does not add information to the rule
  • 13. Simplify rules • H>IVEditorNodeFigure ==100%==> H>IVEditorFigure • H>IVEditorNodeFigure, H>IVEditorFigure ----> K>'IVEditor', K>'Figure' • K>'IVEditor', K>'Figure' ----> H>IVEditorNodeFigure, H>IVEditorFigure • K>'IVEditor', K>'Figure', H>IVEditorFigure ----> H>IVEditorNodeFigure • K>'IVEditor', K>'Figure', H>IVEditorNodeFigure ----> H>IVEditorFigure • H>Intensional.IVIntensiVEAction (0) ==24 (100%)==> (0) I>#undoAction,H>Classifications2.AbstractAction,I>#performAction
  • 14. Simplify rules: Priority to apply implications • Suppose (R) a,b,c,e,f → h • (X) b → c • (Y) a → b • (Z) f → b,e • Is there an order to apply several implications to remove redundancies from R?
  • 15. Simplify rules: Priority to apply implications • (R) a,b,c,e,f → h • (X) b → c • (Y) a → b • (Z) f → b,e • The six possible orders (Y, X, Z; Y, Z, X; Z, X, Y; Z, Y, X; X, Y, Z; X, Z, Y) are each applied to (R) a,b,c,e,f → h
  • 16. Simplify rules: Priority to apply implications • A ≤ B if A.condition ⊆ B.conclusion ∨ B.conclusion ⊆ A.conclusion • (X) b → c • (Y) a → b • (Z) f → b,e • (R) a,b,c,e,f → h • X 1st: X ≤ Z because b ⊆ b,e (X.condition ⊆ Z.conclusion); X ≤ Y because b ⊆ b (X.condition ⊆ Y.conclusion); applying X: (XtoR) a,b,c,e,f → h • Z 2nd: Z ≤ Y because b ⊆ b,e (Y.conclusion ⊆ Z.conclusion); applying Z: (Zto(XtoR)) a,b,c,e,f → h
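The ordering above can be tried out with a short Python sketch. Several parts are assumptions rather than the authors' algorithm: "applying" an implication is read as dropping its conclusion attributes from the rule's condition whenever its own condition is present, and the pairwise relation ≤ is turned into a sequence by counting how many other implications each one precedes. The names `precedes`, `order_implications` and `simplify` are illustrative.

```python
def precedes(a, b):
    """A <= B: A.condition is within B.conclusion, or B.conclusion within A.conclusion."""
    return a["if"] <= b["then"] or b["then"] <= a["then"]


def order_implications(implications):
    """Sort names so that an implication comes before the ones it precedes."""
    def rank(name):
        others = [o for o in implications if o != name]
        # More "precedes" relations means earlier in the sequence (heuristic).
        return -sum(precedes(implications[name], implications[o]) for o in others)
    return sorted(implications, key=rank)


def simplify(condition, implications, order):
    """Drop attributes that an applicable implication already entails."""
    remaining = set(condition)
    for name in order:
        imp = implications[name]
        if imp["if"] <= remaining:
            remaining -= imp["then"] - imp["if"]
    return remaining


implications = {
    "X": {"if": {"b"}, "then": {"c"}},
    "Y": {"if": {"a"}, "then": {"b"}},
    "Z": {"if": {"f"}, "then": {"b", "e"}},
}
R = {"a", "b", "c", "e", "f"}

order = order_implications(implications)
print(order, sorted(simplify(R, implications, order)))
print(sorted(simplify(R, implications, ["Z", "X", "Y"])))
```

Under this reading the computed order is X, Z, Y and the condition of R shrinks to a,f, whereas applying Z before X leaves c behind because b is removed before X can use it. That difference is what makes an application order necessary.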
  • 17. Eliminate redundant rules • Eliminate unrelated sets • K>'Colopedia' (0) ==7 (100%)==> (86) I>'actionPerformed' • a (exc. condition) --(matches)→ (exc. conclusion) b • i.e. the average number of exceptions is below a quarter of any of the sets; the quantity compared with 0.25 is $\frac{1}{2}\left(\frac{\text{exc. condition}}{\text{condition size}} + \frac{\text{exc. conclusion}}{\text{conclusion size}}\right) = \frac{1}{2}\left(\frac{\text{exc. condition}}{\text{exc. condition} + \text{matches}} + \frac{\text{exc. conclusion}}{\text{exc. conclusion} + \text{matches}}\right)$
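To make the measure concrete, the following Python sketch computes the average exception share for the rule quoted on the slide. It assumes the formula is the mean of the two exception fractions, with each set size equal to its exceptions plus the matches; the comparison against 0.25 is left to the caller because the comparison operator did not survive in the transcript. The function name is illustrative.

```python
def average_exception_ratio(exc_condition, matches, exc_conclusion):
    """Mean of the exception shares of the condition and conclusion sets."""
    condition_size = exc_condition + matches
    conclusion_size = exc_conclusion + matches
    return (exc_condition / condition_size + exc_conclusion / conclusion_size) / 2


# K>'Colopedia' (0) ==7 (100%)==> (86) I>'actionPerformed'
ratio = average_exception_ratio(exc_condition=0, matches=7, exc_conclusion=86)
print(f"{ratio:.3f}")
```

For this rule the value is about 0.46, well above a quarter, which would fit its use as an example of two sets that are largely unrelated despite the 100% confidence.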
  • 18. Eliminate redundant rules • Similar rules that conclude SubClass or SuperClass: Property>X --matchesSub--> (exc. conclusion sub) H>SubClass and Property>X --matchesSuper--> (exc. conclusion super) H>SuperClass • Eliminate the super rule if it just adds noise (--matches, ++exceptions) [formula: ratios over (matchesSub), (matchesSuper), (condition size), (exc. conclusion sub) and (exc. conclusion super), with thresholds of 0.9] • deleted: K>'Classification' (0) --5 (100%)--> (24) H>Classifications2.AbstractClassification • kept: K>'Classification' (0) --5 (100%)--> (0) H>Classifications2.Classification • Eliminate the sub rule if it has lower confidence, comparing (matchesSub)/(condition size) with (matchesSuper)/(condition size) • kept: I>'shouldBeEnabled' (0) --46 (100%)--> (7) H>'FreeColAction' • deleted: I>'shouldBeEnabled' (2) --44 (96%)--> (7) H>'MapboardAction'
  • 19. Eliminate redundant rules • Concluding the root of the classes in the application does not add any information: Property>X ----> H>RootClass • When having converse pairs of rules, Property>X ----> Property>Y and Property>Y ----> Property>X: • prefer the one with better confidence • if the confidence is similar, prefer the one whose condition has **stronger semantics**: H > I, H > K, I = K, U > k, R > k, U = R
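The choice between two converse rules can be sketched as follows in Python. The tolerance used to decide that two confidences are "similar" is an assumption (the slide gives no number), only the H, I and K attribute kinds are ranked, and the function name `prefer` and the FreeColPanel figures are hypothetical. The first pair uses the counts of the FreeColAction rule from the results: 53 matches and 3 extra classes named *Action*, so the converse rule has confidence 53/56.

```python
STRENGTH = {"H": 2, "I": 1, "K": 1}  # H > I, H > K, I = K


def prefer(rule_a, rule_b, tolerance=0.05):
    """Pick one of two converse rules, each a (condition, conclusion, confidence) tuple."""
    if abs(rule_a[2] - rule_b[2]) > tolerance:
        # Clearly different confidence: keep the more reliable rule.
        return max(rule_a, rule_b, key=lambda r: r[2])
    # Similar confidence: keep the rule whose condition kind is stronger.
    return max(rule_a, rule_b, key=lambda r: STRENGTH[r[0][0]])


h_to_k = ("H>'FreeColAction'", "K>'Action'", 53 / 53)
k_to_h = ("K>'Action'", "H>'FreeColAction'", 53 / 56)
print(prefer(h_to_k, k_to_h)[0])

# Hypothetical pair with similar confidence: the H> condition wins.
panel_k = ("K>'Panel'", "H>'FreeColPanel'", 0.90)
panel_h = ("H>'FreeColPanel'", "K>'Panel'", 0.88)
print(prefer(panel_k, panel_h)[0])
```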
  • 20. Group rules • Rules that share at least 85% of their matches are grouped together • the threshold comes from analysis of results (might change depending on the case study) • They represent common properties of a set of source code entities • These groups are ordered by number of matches • [figure: rules over I>getId, H>FreeColAction, I>getActionPerformed, K>Action and H>MapboardAction shown as one group]
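One possible way to implement this grouping is sketched below in Python. The slide does not say how "share at least 85% of their matches" is measured, so the sketch assumes the overlap is taken relative to the smaller match set, and it groups greedily against the first rule of each group; both choices, the function names and the toy match sets (integers standing in for classes) are assumptions.

```python
def overlap(a, b):
    """Share of the smaller (non-empty) match set that also appears in the other."""
    return len(a & b) / min(len(a), len(b))


def group_rules(rule_matches, threshold=0.85):
    """Greedy grouping: a rule joins the first group whose seed rule overlaps enough."""
    groups = []  # each group is a list of rule names; the first one is the seed
    for name, matches in rule_matches.items():
        for group in groups:
            if overlap(matches, rule_matches[group[0]]) >= threshold:
                group.append(name)
                break
        else:
            groups.append([name])
    # Order groups by the number of distinct matched entities, largest first.
    return sorted(
        groups,
        key=lambda g: -len(set().union(*(rule_matches[n] for n in g))),
    )


# Toy match sets; integers stand in for matched classes.
rule_matches = {
    "H>FreeColAction => K>Action": set(range(0, 20)),
    "I>getId -> I>actionPerformed": set(range(1, 20)),
    "I>initialize -> H>FreeColPanel": set(range(100, 110)),
}

for group in group_rules(rule_matches):
    print(group)
```

Here the first two rules share all 19 matches of the smaller set and end up in one group, while the third rule forms a group of its own.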
  • 21. RESULTS: Reduction of information to process • IntensiVE [270 classes; 2729 methods] • Concepts: 1289 / Relations: 4390 • Rules: 325 / Groups: 50 • Freecol [382 classes; 3252 methods] • Concepts: 1261 / Relations: 5149 • Rules: 134 / Groups: 42
  • 22. RESULTS: Rules Freecol • K → H = 2 rules • The concept described by the keyword is confined to classes in the hierarchy K>'Mission'→ H>'Mission' • K → K = 4 rules • Combined words Free+Col, Trade+Route, Free+Col+Menu, etc. • K → I = 8 rules • Classes named *Keyword* should implement the method I e.g. K>'Info'→ I>'update', K>'Action'→ I>'actionPerformed', K>'Thread'→ I>'run', K>'Mission'→ I>'doMission', etc.
  • 23. RESULTS: Rules Freecol • H → K = 9 rules • Classes in the hierarchy can be described by the keyword. e.g. • H>'ReportPanel' → K>'Panel', K>'Report' • H>'NetworkRequestHandler' → K>'Handler' • H>'InputHandler' → K>'Input' • H>'OptionUpdater' → K>'UI' • H>'TradeItem' → K>'Item'
  • 24. RESULTS: Rules Freecol • H->I = 28 rules • Classes in the hierarchy should implement the method. e.g. • H>'NetworkRequestHandler'→ I>'handle' • H>'OptionMap' → I>'addDefaultOptions' • H>'Location' → I>'getGoodsContainer'→ I>'getLocationName' • H>'MapIterator' → I>'nextPosition' • H>'MapboardAction' → I>'getId' → I>'shouldBeEnabled' • H>'OptionUpdater' → I>'updateOption',I>'unregister' • H>'TradeItem' → I>'makeTrade' • H>'PersistentObject'→ I>'readFromXMLImpl' → I>'toXML'
  • 25. RESULTS: Rules Freecol • I->I = 71 rules • Implementation protocols. e.g. • I>'getColony'→ I>'getXMLElementTagName'→ I>'toXMLImpl' →I>'readFromXMLImpl' • I>'installUI' → I>'createUI' • I>'getTransportDestination'→ I>'doMission' →I>'dispose' • I>'contains' →I>'add' →I>'newTurn' →I>'remove' • I>'toXML' →I>'getXMLElementTagName' →I>'readFromXMLImpl' • I>'requestFocus' → I>'actionPerformed' →I>'initialize' • I>'setOwner' → I>'newTurn' → I>'getTile' • I>'setName'→ I>'getName'
  • 26. RESULTS: Groups Freecol • Classes in the hierarchy FreeColAction are named *Action*, and tend to implement getId and actionPerformed • 50 matches, 6 rules • H>'MapboardAction' (1) --50 (98%)--> (3) I>'getId' • H>'FreeColAction' (0) ==53 (100%)==> (3) K>'Action' • H>'FreeColAction' (3) --50 (94%)--> (43) I>'actionPerformed' • I>'getId' (3) --50 (94%)--> (43) I>'actionPerformed' • I>'getId',H>'FreeColAction' (0) --52 (100%)--> (4) K>'Action' • K>'Action' (5) --51 (91%)--> (42) I>'actionPerformed'
  • 27. RESULTS: Groups Freecol • Most of the classes that implement initialize belong to the hierarchy of FreeColPanel • initialize prepares a panel to be displayed • 36 matches, 1 rule • I>'initialize' (7) --36 (84%)--> (9) H>'FreeColPanel' • Classes that implement toXMLImpl also implement getXMLElementTagName • toXMLImpl writes an XML representation of the object to a stream • getXMLElementTagName gets the tag name that represents the object • The exception is FreeColAction, which is the XML root • 44 matches, 1 rule • I>'toXMLImpl' (1) --44 (98%)--> (16) I>'getXMLElementTagName'
  • 28. RESULTS: IntensiVE • Regularities documented & found • Interface: Action protocol & undoable protocol; Compilation; Relation evaluators; Cache / Save / Remove on definitions; Intension Editors (partial protocol); Instantiable views; Constraint editors; Evaluators • Naming convention: Unit testing, View hierarchy • Interface + Naming convention: Quantifiers (naming + partial interface)
  • 29. RESULTS: IntensiVE • Regularities found & NOT documented • Interfaces: IntensiVE Explorer Visualization; Checkable entities; Fuzzy quantifiers; Query generation for visual querying; Context-menu in visual query language; Figure rendering; Special classifications • Naming conventions: Figures, Exceptions, Visualization, Classifications, Exceptions to views, Result pairs, Reporters • Interfaces + Naming conventions: Starbrowser shells
  • 30. CONCLUSIONS • Use FCA with objects = source code entities • As attributes = several types of properties • Calculate implications • to mine for intension of regularity rather than extension • not just entities that match regularity but explicit specification of regularity • Allow for variations and irregularities = association rules • To overcome previous pitfalls and make regularities explicit
  • 31. THREATS & LIMITATIONS • Redundant information • There are groups that are subsets of other groups • All results are correct but... • some regularities found might be due to chance rather than a conscious development decision • interpreting the results requires assuming a closed world • Usefulness is subjective • i.e. separating useful from useless results • The data analyzed could be more semantic
  • 32. CURRENT & FUTURE WORK: • ...Running the same case studies mixing the results of Classes and Methods • ...Comparing the regularities found with those previously documented in IntensiVE • Calculate what percentage of the irregularities of a group are indeed errors • Use the results to guide the developer while adding or modifying SCEs • Use a similar approach to mine feature dependencies