Product Derivation is a key activity in Software Product Line Engineering. During this process, derivation operators modify or create core assets (e.g., model elements, source code instructions, components) by adding, removing or substituting them according to a given configuration. The result is a derived product that generally needs to conform to a programming or modeling language. Some operators lead to invalid products when applied to certain assets, while others do not. Knowing this in advance helps to use them better, but it is challenging, especially for assets expressed in extensive and complex languages such as Java. In this paper, we empirically answer the following question: which product line operators, applied to which program elements, can synthesize variants of programs that are incorrect, correct or perhaps even conforming to test suites? We implement source code transformations based on the derivation operators of the Common Variability Language. We automatically synthesize more than 370,000 program variants from a set of 8 real large Java projects (up to 85,000 lines of code), obtaining an extensive panorama of the sanity of the operations.
Paper was presented at SPLC'15
Assessing Product Line Derivation Operators Applied to Java Source Code: An Empirical Study
1. Assessing Product Line Derivation Operators Applied to Java Source Code: An Empirical Study
João Bosco Ferreira Filho, Simon Allier, Olivier Barais, Mathieu Acher and Benoit Baudry
SPLC 2015, July 20 - 24, 2015, Nashville, TN, USA
3. Given a kind of artefact (expressed in a language), you want to make it vary
You want variants of a…
• Word document
• Java, HTML, CSS or C++ program
• Class diagram
• State machine model
[Figure: a state machine (states A, B, C; transitions t1, t2, t3) and two variants of it (A, B, C with t2; A, C)]
4. You need a solution for (de)activating, adding, removing or substituting some elements, and thus deriving variants of a…
• Word document
• Java, HTML, CSS or C++ program
• Class diagram
• State machine model
Ideally applicable to any kind of artefact expressed in a language
[Figure: a state machine (states A, B, C; transitions t1, t2, t3) and two variants of it (A, B, C with t2; A, C)]
5. Common Variability Language (CVL): automatically deriving products (e.g., models)
[Figure: CVL derivation. Variation points (Object Existence1, Link Existence1, Link Existence2, Link Existence3) refer to a base state machine (states A, B, C; transitions t1, t2, t3); the Derivation Engine applies Derivation Operators to produce Derived Products]
Ideally applicable to any kind of artefact expressed in a language (conformant to a metamodel)
6. [Figure: CVL derivation diagram (variation points Object Existence1 and Link Existence1 to Link Existence3, Derivation Engine, Derivation Operators, Derived Products)]
Previous work shows:
#1 Using CVL “as is” does not work. It is highly beneficial to specialize derivation operators for a given language [Filho et al. SPLC’13]; it is mandatory in an industrial context [Filho et al. STTT’14]
#2 It is hard for users not to make mistakes: verification techniques, e.g., [Czarnecki et al. GPCE’06, Batory et al. GPCE’07, Classen et al. ICSE’10]; or support for preventing errors and guiding users when specifying variability in an IDE
8. E.g., automatically FIND THESE PERCENTAGES!
Which derivation operations are more likely to work? Which are more likely to derive uncompilable code?
Can we identify operations subject to specialization?
#1 Using CVL “as is” does not work. We need to specialize derivation operators for a given language [Filho et al. SPLC’13] [Filho et al. STTT’14]
#2 It is hard for users not to make mistakes: support for preventing errors and guiding users when specifying variability in an IDE
[Figure: a variability model (VS1 to VS6, cardinality 0..2) linked to variation points Object Existence1 to Object Existence5, annotated with illustrative percentages (>80%, <10%, >40%), over a base state machine (states A, B, C; transitions t1, t2, t3)]
For any programming/modeling “language”
11. Kinds of derivation operations by example
Object Existence expresses whether a given object will be part of the derived variant; executing it deletes or adds a source code element (e.g., a statement, assignment, block or literal) of the original program.
12. Kinds of derivation operations by example
Object Substitution expresses that a given program element will be replaced by another of the same type.
13. Kinds of derivation operations
Object Existence expresses whether a given object will be part of the derived variant; executing it deletes or adds a source code element (e.g., a statement, assignment, block or literal) of the original program.
Link Existence expresses whether there is a relationship between two elements. In the case of Java programs, we consider as a link any relationship between classes: association, composition, inheritance, etc. (e.g., removing an extends A clause from a class header).
Object Substitution expresses that a given program element will be replaced by another of the same type, e.g., a method substituted by another method.
Link End Substitution expresses that a relationship between a class A and a class B will be replaced by another relationship of the same type between class A and a third class C (e.g., A extends C instead of A extends B).
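As a rough illustration of the four kinds of operations, here is a minimal Python sketch over a toy program model. The `ClassDecl` record and the four function names are hypothetical, introduced only for this example; they are not the CVL metamodel or the study's tooling, which transforms real Java source code.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class ClassDecl:
    """Toy stand-in for a Java class: name, optional superclass link, methods."""
    name: str
    extends: Optional[str] = None
    methods: Tuple[Tuple[str, str], ...] = ()  # (method name, body) pairs


def object_existence(cls: ClassDecl, method: str) -> ClassDecl:
    """Object Existence (resolved negatively): drop one element, here a method."""
    return replace(cls, methods=tuple(m for m in cls.methods if m[0] != method))


def link_existence(cls: ClassDecl) -> ClassDecl:
    """Link Existence (resolved negatively): drop the extends link from the header."""
    return replace(cls, extends=None)


def object_substitution(cls: ClassDecl, old: str, new: Tuple[str, str]) -> ClassDecl:
    """Object Substitution: replace one method by another element of the same type."""
    return replace(cls, methods=tuple(new if m[0] == old else m for m in cls.methods))


def link_end_substitution(cls: ClassDecl, new_parent: str) -> ClassDecl:
    """Link End Substitution: keep the extends link but point it at another class."""
    return replace(cls, extends=new_parent)


# Made-up example: class A extends B, with two methods.
a = ClassDecl("A", extends="B", methods=(("run", "return 1;"), ("stop", "return 0;")))
print(object_existence(a, "stop"))
print(link_existence(a))
print(object_substitution(a, "run", ("go", "return 2;")))
print(link_end_substitution(a, "C"))
```

Nothing in this toy model checks whether the result is still valid Java; that is exactly the question the study asks of the real operators.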
16. Methodology
Random operator applied to a random code element in the program
[Figure: pipeline. CVL VPs (Object Existence, Link Existence, Object Subst., LinkEnd Subst.) → Derivation Operation (p to p') → Compilation and Testing → Counterexample, Variant or Sosie [Baudry ISSTA’14]]
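The loop on this slide can be sketched roughly as follows. This is a minimal Python sketch under stated assumptions, not the study's implementation: `derive`, `compiles` and `passes_tests` are hypothetical callables standing in for the derivation engine, the Java compiler and the projects' test suites.

```python
import random
from typing import Callable, Sequence, Tuple, TypeVar

P = TypeVar("P")  # a program, in whatever representation the caller uses


def classify(program: P,
             compiles: Callable[[P], bool],
             passes_tests: Callable[[P], bool]) -> str:
    """Label a derived program p' with one of the three outcomes."""
    if not compiles(program):
        return "counterexample"  # does not compile
    if not passes_tests(program):
        return "variant"         # only compiles
    return "sosie"               # compiles and passes the test suites


def random_trial(program: P,
                 operators: Sequence[str],
                 elements: Sequence[str],
                 derive: Callable[[P, str, str], P],
                 compiles: Callable[[P], bool],
                 passes_tests: Callable[[P], bool],
                 rng: random.Random) -> Tuple[str, str, str]:
    """Apply a random operator to a random code element, then classify p'."""
    operator = rng.choice(operators)
    element = rng.choice(elements)
    derived = derive(program, operator, element)
    return operator, element, classify(derived, compiles, passes_tests)
```

Passing the compiler and test runner in as callables keeps the sketch independent of any particular build tool.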
17. Study (the big picture)
• 8 Java projects
• 86 CVL operations
▪Object Existence Field
▪Object Existence Interface
▪Object Existence Foreach
▪Object Existence Break
▪…
▪Object Substitution SuperAccess
▪Object Substitution Annotation
▪…
▪Link Existence …
▪LinkEnd Subst …
• Derivation, then Compilation and Testing of each derived program p'
• 370,000 programs
▪%Counterexamples (does not compile)
▪%Variants (only compiles)
▪%Sosies (compiles and passes the test suites)
19. Variables and measurements
Derivation operations:
• Object Existence Field
• Object Existence Interface
• Object Existence Foreach
• Object Existence Break
• …
• Object Substitution SuperAccess
• Object Substitution Annotation
• …
• Link Existence …
• LinkEnd Subst …
Measurements:
• %Counterexamples (does not compile)
• %Variants (only compiles)
• %Sosies (compiles and passes the test suites)
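One way to turn per-trial outcomes into the three percentages is sketched below in Python. The `(operation, outcome)` pair format and the name `rates_by_operation` are assumptions made for this example, and the sample trials are made up rather than taken from the study.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

OUTCOMES = ("counterexample", "variant", "sosie")


def rates_by_operation(trials: Iterable[Tuple[str, str]]) -> Dict[str, Dict[str, float]]:
    """Return, for each operation, the percentage of trials per outcome."""
    counts = defaultdict(Counter)
    for operation, outcome in trials:
        if outcome not in OUTCOMES:
            raise ValueError(f"unknown outcome: {outcome}")
        counts[operation][outcome] += 1
    rates = {}
    for operation, per_outcome in counts.items():
        total = sum(per_outcome.values())
        rates[operation] = {o: 100.0 * per_outcome[o] / total for o in OUTCOMES}
    return rates


# Made-up trials, for illustration only.
trials = [
    ("Object Existence Field", "counterexample"),
    ("Object Existence Field", "variant"),
    ("Object Existence Field", "variant"),
    ("Object Existence Field", "sosie"),
    ("Object Existence Break", "counterexample"),
]
for operation, rates in rates_by_operation(trials).items():
    print(operation, rates)
```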
23. Results and Findings
• 86% of the possible pairs (CVL operator + Type of Java Code Element) resulted in compilable programs at least once
▪Many possibilities to vary a Java program
• We found 72 types of CVL-based derivation operations that could produce compilable Java programs
– e.g., substituting one child or parent class by another,
– suppressing an if statement,
– introducing/suppressing a method invocation,
– etc.
24. Results and Findings
• There are operations that will always lead to counterexamples or to variants
[Figure: success rate of the operations, ranging from ~100% down to ~0%]
25. Results and Findings
• Operations with a low success rate should not be discarded outright; they can be specialized instead
▪Looking into those operations
• Qualitatively analysing them with the help of dedicated tooling
[Figure: operations with a success rate of ~0%]
27. Results and Findings
• Specialization to avoid recurrent errors
▪Simple specializations
• e.g., removing a try in the case of removing a catch
▪Static-analysis-based specializations
• e.g., identifying strongly connected classes to be removed together
• Specialized and typed operators
▪Object Existence
• Class existence, field existence, parameter existence
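The try/catch specialization mentioned above can be illustrated with a toy model. This Python sketch is one reading of the slide, not the authors' operator: `TryStmt` and both function names are hypothetical, and "removing the try" is modeled here as dropping the whole statement (returning `None`).

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class TryStmt:
    """Toy stand-in for a plain Java try statement (no resources)."""
    body: str
    catches: Tuple[str, ...]  # names of the caught exception types
    has_finally: bool = False


def is_well_formed(stmt: TryStmt) -> bool:
    """A plain Java try needs at least one catch clause or a finally block."""
    return bool(stmt.catches) or stmt.has_finally


def generic_remove_catch(stmt: TryStmt, exc: str) -> TryStmt:
    """Generic Object Existence: delete the catch clause and nothing else."""
    return replace(stmt, catches=tuple(c for c in stmt.catches if c != exc))


def specialized_remove_catch(stmt: TryStmt, exc: str) -> Optional[TryStmt]:
    """Specialized version: if the try would be left malformed, remove it too."""
    result = generic_remove_catch(stmt, exc)
    return result if is_well_formed(result) else None


single = TryStmt(body="read();", catches=("IOException",))
print(is_well_formed(generic_remove_catch(single, "IOException")))
print(specialized_remove_catch(single, "IOException"))
```

The generic operator leaves a try with neither catch nor finally, which a Java compiler rejects; the specialized one avoids that particular recurrent error, though it does not guarantee that the rest of the program still compiles.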
28. Results and Findings
• Varying entire blocks of code instead of single instructions is more likely to generate correct programs
▪ Do, For, ForEach, While, If, Throw
• 70% to 98% of variants when coupled with Object Existence
▪ Fine-grained variability works better (i.e., it is easier to put variability inside a method than to manipulate coarse elements like « interface »)
▪ Anomaly: Classes or Methods → 0.1%
– Invoked in other parts of the code
29. Results and Findings
• Object Existence is more likely to generate variants
▪ Object Existence → 21.97% of variants
▪ Link Existence → 11.81% of variants
▪ Object Substitution → 9.22% of variants
▪ Link End Substitution → 4.51% of variants
• Overall, CVL is not safe to apply directly to Java
▪ Many different ways to vary a program
• But a high probability of breaking it (90%)
▪ CVL is a generic language for varying any base model type, but specialization is clearly required
30. Conclusions
• Large-scale assessment of derivation operations
▪More than 370,000 derived products
• Many kinds of operations
▪86 ways of varying a Java program
• Some of them were never considered before by variability studies
• Quantitatively and qualitatively supported insights
▪Extensive panorama of success rates for each operation
▪Visualization tool for analyzing the transformations
• Opens new perspectives for supporting variability in languages, especially in Java
31. Future Work
• Using the results to
▪Devise specialized derivation operators for Java
▪Help current variability-supporting IDEs incorporate new possibilities of variation, knowing the risks of doing so
• Apply derivation operators in different areas, for different objectives (e.g., resilience), such as software diversification
32. Long-term vision: variability-aware IDE, (anti-)patterns for any language through automated exploration
[Figure: a variability model (VS1 to VS6, cardinality 0..2) linked to variation points Object Existence1 to Object Existence5, annotated with percentages (>80%, <10%, >40%)]
Questions?