TAROT 2013 9th International Summer School on Training And Research On Testing, Volterra, Italy, 9-13 July, 2013
These slides summarize Gilles Perrouin's presentation about "Feature-based Testing of SPLs: Pairwise and Beyond"
8. About me...
PhD (2007) jointly from U. Luxembourg & UNamur
Requirements engineering, Software Architecture and
Development Methodologies for SPLs
Postdoc (2007-2009) in INRIA: MDSPLE and SPL
Testing (2009)
Since 2010, UNamur, SPL Testing funded by FNRS (3yr
grant since Oct. 2012)
I still have SO much to learn about software testing...
9. Acknowledgments...
Benoit Baudry, Sagar Sen, Jacques Klein, Yves Le Traon,
Sebastian Oster
Arnaud Gotlieb, Aymeric Hervieu
Xavier Devroey, Maxime Cordy, Patrick Heymans, Pierre-
Yves Schobbens, Axel Legay, Eun-Young Kang, Andreas
Classen
Christopher Hénard, Mike Papadakis
17. Research Question (2009)
How to design model fragments so that they compose
well together ?
Methodological hints are insufficient
Need for an automated approach to validate SPL models...
Testing view: Extract relevant configurations of
the SPL and build them (composition = oracle)
27. Testing the universe ???
Renault Vans: 10^21 possible vehicles...
Source: Astesana et al. "Constraint-based vehicle
configuration: a case study." Tools with Artificial Intelligence
(ICTAI), IEEE, 2010.
The Linux Kernel FM: 7,000 features
Source: S. She, R. Lotufo, T. Berger, A. Wasowski, and K.
Czarnecki, “Reverse engineering feature models,” in ICSE,
2011, pp. 461–470.
The General Motors PL comprises a few thousand
features
Source: Flores et al., "Mega-scale product line engineering at
General Motors." SPLC '12, pp. 259-268.
36. Specific Challenges
MDSPLE Context
Integration in early SPL lifecycle (requirements/design level)
Applicability for SPL Engineer
No a priori knowledge of the SPL
Difficult to pinpoint faulty assets in a symmetric composition
approach (faulty interactions likely)...
Abstract models
Incremental testing [Uzu08,Loc12] unsuitable
42. Sampling-Based Top-down Testing
1) Select relevant configurations from the FM
2) Derive or retrieve the products realizing those
configurations
3) Write/Select test cases associated to those products
4) Run test cases on some inputs
A simple (simplistic?) scenario of which Steps #1 and #2
kept us busy the last 4 years => focus of this talk
47. Agenda
FM-level Configuration Selection
T-wise SAT-based with Alloy
Similarity-Driven and prioritization with evolutionary algorithms
Tool Demo
Multi-objective
Going Beyond: Unifying Verification and Test for SPLs
On the (actual) trustability of FMs
51. CIT for SPLs
Pros
Addresses the feature interaction problem
Small test suites (compared to 10^X possible tests)
Cons (2009)
Poor support for constraints
Limited SPL tool support [Cohen2006, Cohen2007]: FMs ≠ Covering
Arrays (input pb)
54. T-wise Coverage as a SAT problem
FM
Viewed as a set of constraints between boolean features
T-wise
Can be seen as a SAT problem: “Set of valid configurations
that satisfy the conjunction of all t-tuples of features”
Rely on SAT solvers for SPL T-wise testing
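A minimal sketch of this view, assuming a tiny hypothetical FM (features `soda`, `tea`, `free` with a single made-up constraint) that is small enough to enumerate by brute force rather than calling a real SAT solver. All names below are illustrative assumptions and not part of the original tool chain.

```python
from itertools import combinations, product

FEATURES = ["soda", "tea", "free"]  # hypothetical boolean features


def is_valid(cfg):
    """Hypothetical FM constraint: the machine serves at least one drink."""
    return cfg["soda"] or cfg["tea"]


def valid_configurations():
    """Enumerate every assignment and keep those satisfying the FM."""
    for values in product([False, True], repeat=len(FEATURES)):
        cfg = dict(zip(FEATURES, values))
        if is_valid(cfg):
            yield cfg


def t_tuples(cfg, t):
    """All t-tuples of (feature, value) pairs exhibited by one configuration."""
    return {tuple((f, cfg[f]) for f in group)
            for group in combinations(FEATURES, t)}


def valid_t_tuples(t):
    """t-tuples that appear in at least one valid configuration."""
    found = set()
    for cfg in valid_configurations():
        found |= t_tuples(cfg, t)
    return found


def coverage(suite, t):
    """Fraction of the valid t-tuples covered by a suite of configurations."""
    target = valid_t_tuples(t)
    covered = set()
    for cfg in suite:
        covered |= t_tuples(cfg, t)
    return len(covered & target) / len(target)
```

On a realistic FM the enumeration in `valid_configurations` is exactly what becomes intractable, which is why the slides delegate the search to a solver.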
55. Related Issues
However FMs are not SAT solvers’ inputs… (Usability
perspective)
Need to devise a solution to automatically encode FMs and the
T-wise selection problem
Scalability
SAT solving is NP-complete
We don’t know how to predict in advance the difficulty of a
given problem => need to assess it experimentally
Pragmatic solutions have to be found to address concrete
scalability issues
60. Approach overview
Use Alloy [Jac2006] as an intermediate
representation between FM+T-wise and SAT
solvers
Use MDE (EMF, Kermeta) to generate Alloy specs
Scalability: “divide-and-compose”
Split t-tuples into solvable sets
Generate Alloy commands and solve sets
Recompose solutions into a unique configuration suite
Configurable Java-based toolset performing test
selection and analysis of the selection strategies
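The "divide-and-compose" idea can be sketched as follows. This is only an illustration under stated assumptions: the FM is a hypothetical three-feature toy, and `solve` is a greedy brute-force stand-in for one Alloy/SAT call, not the actual Alloy-based implementation.

```python
from itertools import combinations, product

FEATURES = ["soda", "tea", "free"]  # hypothetical features


def is_valid(cfg):
    """Hypothetical FM constraint: at least one drink is sold."""
    return cfg["soda"] or cfg["tea"]


# All valid configurations of the toy FM (brute force, fine for 3 features).
VALID = [cfg for cfg in (dict(zip(FEATURES, bits))
                         for bits in product([False, True], repeat=len(FEATURES)))
         if is_valid(cfg)]


def covers(cfg, tup):
    """True when the configuration exhibits every (feature, value) of the tuple."""
    return all(cfg[f] == v for f, v in tup)


def all_valid_pairs():
    """2-tuples that at least one valid configuration can cover."""
    tuples = []
    for f, g in combinations(FEATURES, 2):
        for vf, vg in product([False, True], repeat=2):
            tup = ((f, vf), (g, vg))
            if any(covers(c, tup) for c in VALID):
                tuples.append(tup)
    return tuples


def solve(chunk):
    """Greedy stand-in for one solver call: cover every tuple of the chunk."""
    suite, remaining = [], list(chunk)
    while remaining:
        best = max(VALID, key=lambda c: sum(covers(c, t) for t in remaining))
        suite.append(best)
        remaining = [t for t in remaining if not covers(best, t)]
    return suite


def divide_and_compose(tuples, chunk_size):
    """Split tuples into chunks, solve each, merge and drop duplicates."""
    suite = []
    for start in range(0, len(tuples), chunk_size):
        for cfg in solve(tuples[start:start + chunk_size]):
            if cfg not in suite:
                suite.append(cfg)
    return suite
```

Because each chunk is solved independently, the merged suite can be larger than one computed globally; this is the duplicate and size overhead that the evaluation metrics on the following slides try to quantify.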
68. Evaluating T-wise Generation
Generation time
Generated configurations size
T-tuple occurrence: how many times a given T-tuple
appears?
Number of duplicates (“divide-and-compose” strategies)
Similarity: how different are my configurations?
$$ \mathrm{Sim}(tc_i, tc_j) = \frac{|Tc_{iv} \cap Tc_{jv}|}{|Tc_{iv} \cup Tc_{jv}|} $$
$Tc_{iv}$: variant features [Benavides2010] of configuration $i$
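The similarity measure is a ratio of shared variant features to all variant features of the two configurations (a Jaccard index). A minimal sketch, assuming each configuration is given as the set of variant features it selects; the feature names and the convention for two empty selections are assumptions made for the example.

```python
def similarity(variant_i, variant_j):
    """Jaccard index of two sets of selected variant features."""
    union = variant_i | variant_j
    if not union:
        return 1.0  # assumed convention: two empty selections are identical
    return len(variant_i & variant_j) / len(union)


# Hypothetical configurations, restricted to their variant features.
tc1 = {"soda", "free"}
tc2 = {"soda", "tea"}
shared_ratio = similarity(tc1, tc2)  # one shared feature out of three
```

A value close to 0 means the two configurations have almost nothing in common, while 1 means they select the same variant features.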
71. Initial Assessment
Computed those metrics on a case-study varying scope
and generation time
Wrote paper (that brought me here)
G. Perrouin, S. Sen, J. Klein, B. Baudry, and Y. Le Traon.
Automated and scalable t-wise test case generation
strategies for software product lines. ICST 2010 pp.
459-468, IEEE.
82. Key Differences
FM Expressivity
Scalability
‘A priori’: Flattening of the FM
‘A posteriori’ : “divide-and-compose” strategies
Determinism
CSP-based always provides the same suite on a given FM
Alloy-based can produce very different test suites due to
random tuple combinations and scope influence
93. Conclusions
CSP outperforms Alloy-based
Generation time and configurations size
Automated and scalable t-wise test case generation
strategies for software product lines
Specialization Generality
More details
Perrouin, Gilles, Sebastian Oster, Sagar Sen, Jacques Klein,
Benoit Baudry, and Yves Le Traon. Pairwise testing for
software product lines: Comparison of two approaches.
Software Quality Journal 20, no. 3-4 (2012): 605-643.
106. Lessons Learned
What went wrong
Alloy was not really meant for this: interactive model
exploration vs. batch tool chain running continuously
Exotic case study
Comparison is good
A testing tool is software too => should be tested ;)
Gives you insights into design decisions: e.g. flattening and
expressivity
Choose your case studies wisely
Go for repositories when they exist
Publish your models/tools so that others can play with them
116. Problem Solved ?
What about higher values of t (3,4,5,6)?
480 2-wise configurations for Linux FM: Where to start?
t-wise coverage remains essentially difficult to compute
for large models...
123. We used similarity to evaluate generated
configurations
H. Hemmati et al., “Achieving scalable model-based
testing through test case diversity,” ACM
TOSEM, vol. 22, no. 1, 2012.
Can we use similarity to mimic t-wise
coverage ?
Intuition: dissimilar configurations
cover more t-tuples than similar ones
127. Similarity-driven Selection
Use evolutionary algorithms to evolve a population of
configurations
SBSE well-suited for large configuration spaces
Ensure the validity of generated configurations (via SAT4J)
Designed for Scalability
Fitness function based on distance: correlated with t-wise
coverage but easier to compute
Flexibility
Tester decides generation time and # of configurations
Configurations are prioritized w.r.t. the fitness function: use a
subset if resources are lacking
134. SPL Similarity Search Problem
(1+1) EA [Dro02] (non-local variant of HC)
Individual = set of configurations
Population= 1 individual :)
No crossover, mutation = change one gene at a time
(configuration) depending on its fitness
Fitness function
$$ f : C^m \to \mathbb{R}^+, \qquad (C_1, \ldots, C_m) \mapsto \sum_{j > i \geq 1}^{m} d(C_i, C_j) $$
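The fitness of an individual is the sum of the distances between every pair of its configurations. A minimal sketch, assuming configurations are sets of selected features and that `d` is the Jaccard distance (one minus the similarity used earlier in the talk); the choice of distance is an assumption made for the example.

```python
from itertools import combinations


def distance(ci, cj):
    """Assumed distance d: Jaccard distance between two feature sets."""
    union = ci | cj
    return 0.0 if not union else 1.0 - len(ci & cj) / len(union)


def fitness(configs):
    """Sum of d(Ci, Cj) over all pairs with j > i."""
    return sum(distance(ci, cj) for ci, cj in combinations(configs, 2))


# Hypothetical individual made of three configurations.
individual = [{"a"}, {"b"}, {"a", "b"}]
score = fitness(individual)  # d = 1.0 + 0.5 + 0.5
```

Computing `fitness` only needs the m(m-1)/2 pairwise distances, which is far cheaper than enumerating t-tuples for large t.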
143. Configuration Selection Algorithm
1. Select a set of unpredictable configurations of size m
from the SAT solver
Unpredictable: unaffected by the solver's internal order of literals
and clauses, which favors locally similar configurations
2. While elapsedTime < t
compute fitness function f
Prioritize configurations
Global distance: make sure that each product maximizes its distance
with all others
Local distance: pairwise distance
remove worst configuration => iterate on new ones until f
improves
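The loop above can be sketched as a small (1+1)-style search. This is a hedged illustration only: the FM is a hypothetical four-feature toy, random draws from a brute-force list of valid configurations stand in for the randomized SAT solver, the distance is assumed to be Jaccard, and only the "global distance" prioritization is shown.

```python
import random
import time
from itertools import combinations, product

FEATURES = ["soda", "tea", "free", "cancel"]  # hypothetical features


def is_valid(cfg):
    """Hypothetical FM constraint: at least one drink is sold."""
    return "soda" in cfg or "tea" in cfg


# Stand-in for the solver: every valid configuration of the toy FM.
ALL_VALID = [c for c in (frozenset(f for f, on in zip(FEATURES, bits) if on)
                         for bits in product([False, True], repeat=len(FEATURES)))
             if is_valid(c)]


def distance(ci, cj):
    """Assumed distance: Jaccard distance between feature sets."""
    union = ci | cj
    return 0.0 if not union else 1.0 - len(ci & cj) / len(union)


def fitness(suite):
    """Sum of pairwise distances of the suite."""
    return sum(distance(a, b) for a, b in combinations(suite, 2))


def prioritize(suite):
    """Global distance: order by total distance to all other configurations."""
    return sorted(suite, key=lambda c: sum(distance(c, o) for o in suite),
                  reverse=True)


def select(m, budget_seconds, rng):
    """Evolve one suite of m configurations within a time budget."""
    suite = rng.sample(ALL_VALID, m)              # step 1
    deadline = time.monotonic() + budget_seconds
    while time.monotonic() < deadline:            # step 2
        suite = prioritize(suite)
        newcomer = rng.choice(ALL_VALID)          # try to replace the worst
        candidate = suite[:-1] + [newcomer]
        if newcomer not in suite[:-1] and fitness(candidate) > fitness(suite):
            suite = candidate
    return prioritize(suite)
```

Since the returned suite is ordered, a tester short on resources can simply keep its first k configurations, for example `select(4, 0.05, random.Random(0))[:2]`.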
150. Taming large FMs
eCos: 1% more coverage implies 2.1E+15 additional 6-tuples!
1,000 configurations may not be enough...
t=6 (1,000 confs)   0 runs     15,000 runs
eCos                94.191%    95.343%
FreeBSD             76.236%    76.494%
Linux               89.411%    90.671%
151. Conclusions
Balanced coverage and flexibility
Let testers decide w.r.t. the resources they have
Prioritization helps focus on the most-covering configurations
Does not replace 100% coverage CIT approaches for
SPLs [Johansen2012b, Garvin2011] but complements
them for intractable cases
Look at our TR
C. Henard, M. Papadakis, G. Perrouin, J. Klein, P. Heymans,
Y. Le Traon. Bypassing the combinatorial explosion: Using
similarity to generate and prioritize t-wise test suites for large
software product lines. arXiv preprint arXiv:1211.5451 (2012).
154. Prof Smith Says...
Approach not getting all
interactions, CIT assumption
does not hold any more: Fault
finding ability ?
I expect you to get back to
work...
159. Using Mutation to Evaluate Similarity
Mutate FMs and use sets of configurations of
various similarity levels to assess their mutant-killing
ability
Mutation operators: feature negation, disjunction
(or) - conjunction (and) (2 new clauses)
        22                1010              5050
        Dis     Sim       Dis     Sim       Dis     Sim
eCos    60.35%  48.32%    77.83%  48.41%    83.49%  49.64%
Dissimilar suites yield a better
mutation score in all cases
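A toy sketch of mutation-based assessment, under explicit assumptions: the FM is a hypothetical CNF formula over three features, only the feature negation operator is implemented, and a mutant counts as killed when at least one configuration of the suite violates it (this kill criterion is an assumption of the sketch). The numbers it yields are unrelated to the eCos results above.

```python
def satisfies(cfg, clauses):
    """cfg: set of selected feature ids; clauses: CNF with signed ints."""
    return all(any((lit > 0) == (abs(lit) in cfg) for lit in clause)
               for clause in clauses)


def negation_mutants(clauses):
    """Feature negation: one mutant per literal, with its sign flipped."""
    for i, clause in enumerate(clauses):
        for j in range(len(clause)):
            mutated = [list(c) for c in clauses]
            mutated[i][j] = -mutated[i][j]
            yield mutated


def mutation_score(suite, clauses):
    """Share of mutants violated by at least one configuration of the suite."""
    mutants = list(negation_mutants(clauses))
    killed = sum(any(not satisfies(cfg, m) for cfg in suite) for m in mutants)
    return killed / len(mutants)


# Hypothetical FM: 1=soda, 2=tea, 3=free; (soda or tea) and (free implies soda)
FM = [[1, 2], [-3, 1]]
dissimilar = [{1, 3}, {2}]   # configurations with little in common
similar = [{1}, {1, 3}]      # configurations sharing most features
```

On this toy model `mutation_score(dissimilar, FM)` is higher than `mutation_score(similar, FM)`, which is the effect the slide reports; equivalent mutants are not filtered out in this sketch.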
164. Using Mutation to Evaluate Similarity
Preliminary Work
Specialized mutation operators
Equivalent mutant discriminations (FM semantics)
More information
C. Henard, M. Papadakis, G. Perrouin, J. Klein, Y. Le Traon,
Assessing Software Product Line Testing via Model-based
Mutation: An Application to Similarity Testing, AMOST@ICST
2013
165. PLEDGE: A Product Line Editor and Test
Generation Tool
C. Henard, M. Papadakis, G. Perrouin, J. Klein, Y. Le
Traon, SPLC 2013 (Tool Demonstration Papers), ACM
http://research.henard.net/SPL/PLEDGE/
173. More Flexibility: Multi-Objective
Selecting configurations involves several objectives
Maximizing coverage
Minimizing # configurations
Minimizing the testing cost of each configuration
Use GAs + SAT: Search problem
Cost: value assigned to each variant feature
Coverage: pairwise
Objective function (F)
Weighted linear combination of coverage,
cost and # configurations
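A weighted linear combination of the three objectives could look like the sketch below. The weights, the per-feature costs, the absence of normalization and the use of a reference pool to define the coverable pairs are all assumptions made for illustration; the slides do not give the exact formula.

```python
from itertools import combinations

FEATURES = ["soda", "tea", "free"]                # hypothetical features
COST = {"soda": 1.0, "tea": 2.0, "free": 0.5}     # hypothetical testing costs


def pairs(cfg):
    """2-tuples of (feature, selected?) exhibited by a configuration."""
    return {((f, f in cfg), (g, g in cfg))
            for f, g in combinations(FEATURES, 2)}


def objective(suite, pool, w_cov=1.0, w_cost=0.1, w_size=0.1):
    """Reward pairwise coverage, penalize cost and number of configurations.

    `pool` is a non-empty reference set of valid configurations used to
    define which pairs are coverable at all.
    """
    all_pairs = set().union(*(pairs(c) for c in pool))
    covered = set().union(*(pairs(c) for c in suite))
    coverage = len(covered & all_pairs) / len(all_pairs)
    cost = sum(COST[f] for c in suite for f in c)
    return w_cov * coverage - w_cost * cost - w_size * len(suite)


pool = [{"soda"}, {"tea"}]
small = [{"soda"}]
```

A genetic algorithm would then evolve suites to maximize `objective`, with a solver rejecting or repairing invalid configurations; changing the weights shifts the trade-off between coverage and effort.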
184. Conclusion on Multi-Objective
Statistical significance of F guiding search after only 500
generations
Currently being integrated in PLEDGE
Threat: Small FMs so far (100 features)
More information
C. Henard, M. Papadakis, G. Perrouin, J. Klein, Y. Le Traon,
Multi-objective Test Generation for Software Product Lines,
SPLC2013, ACM
191. Featured Transition Systems
[Figure: vending-machine transition systems, one per variant; action labels listed below]
Sells soda: pay, soda, serveSoda, open, close, change, take
Sells soda and tea: pay, soda, serveSoda, open, tea, serveTea, close, change, take
Can cancel purchase: pay, soda, serveSoda, open, cancel, return, close, change, take
Drinks are free: soda, serveSoda, free, take
Salvador, 2 September 2012
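A featured transition system can be pictured as one transition system whose transitions are guarded by features, so that each product is obtained by keeping only the transitions its features enable. The sketch below is an assumption-laden reconstruction: the action labels come from the slide, but the state numbers, the guards and the feature names (`soda`, `tea`, `cancel`, `free`) are hypothetical, since the original diagrams are not recoverable from the text.

```python
# Each transition: (source state, action, target state, guard on the feature set).
FTS = [
    (1, "pay",       2, lambda p: "free" not in p),
    (2, "change",    3, lambda p: "free" not in p),
    (1, "free",      3, lambda p: "free" in p),
    (3, "soda",      5, lambda p: "soda" in p),
    (3, "tea",       6, lambda p: "tea" in p),
    (3, "cancel",    4, lambda p: "cancel" in p),
    (4, "return",    1, lambda p: "cancel" in p),
    (5, "serveSoda", 7, lambda p: "soda" in p),
    (6, "serveTea",  7, lambda p: "tea" in p),
    (7, "open",      8, lambda p: "free" not in p),
    (8, "take",      9, lambda p: "free" not in p),
    (9, "close",     1, lambda p: "free" not in p),
    (7, "take",      1, lambda p: "free" in p),
]


def project(fts, features):
    """Plain transition system of one product: keep the enabled transitions."""
    return [(s, a, t) for s, a, t, guard in fts if guard(features)]


def reachable_actions(transitions, initial=1):
    """Actions that can occur on some path from the initial state."""
    seen, frontier, actions = {initial}, [initial], set()
    while frontier:
        state = frontier.pop()
        for s, a, t in transitions:
            if s == state:
                actions.add(a)
                if t not in seen:
                    seen.add(t)
                    frontier.append(t)
    return actions
```

For example, `reachable_actions(project(FTS, {"soda", "free"}))` yields the four actions of the "Drinks are free" variant. Model checking the FTS directly avoids repeating such a projection for every product.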
196. FTS cont’d
Designed for Model-Checking
More efficient than product-by-product verification
Tool-support: SNIP [Classen2012], NuSMV [Classen2011]
Real-time [Cordy2012a], adaptive systems [Cordy2012b]
SPL of model-checkers: http://www.info.fundp.ac.be/fts/
SPL-dedicated
From product to set of products
From application to domain engineering
Goal: Combination with Testing
MC properties as test selection criteria
Verification of feature interactions
200. Combining Verification and Testing
Verification → Testing
LTL properties as testing criteria: “[](cancel => !serve W start)”
MC will get all configurations violating this property: if you
cancel your order you should not get your drink
Testing → Verification
FTS too large to be verified
Focus on some features or interactions → FTS’
Check the behavior of configuration containing them
Multiple scenarios can be devised
210. Leveraging FTS
Not really a user-friendly language
No structuring mechanism
Higher-level models (fPromela, fSMV) still require MC
expertise
Use of UML instead
Broaden the scope of these techniques to any SPL engineer
Abstraction: Hierarchical states, orthogonal regions
FTS as underlying formal semantics
219. Challenges
UML 2 FTS
Choice of relevant constructs
Symmetric vs Asymmetric composition
Flattening: Well-known pb but few usable solutions
Testability of FTS
Extended Actions, test criteria, FTS-ioco...
More information
X. Devroey, M. Cordy, G. Perrouin, E-Y Kang, P-Y Schobbens,
P. Heymans, A. Legay, and B. Baudry. A vision for behavioural
model-driven validation of software product lines. ISoLA
2012, pp. 208-222. Springer.
232. Source of FMs
Repositories (SPLOT) contain mostly small to medium-sized
academic FMs
may not represent actual systems
SPLs are rarely built from scratch
Set of existing products evolved as an SPL, not derived directly
from the FMs
FM Synthesis (or reverse-engineering)
Several approaches exist [She2011,Ach2011,Hasl2013...]
Tough pb: parsing code, limitations in the expressiveness of
target FM languages, heuristics…
=> FMs may not be fully representative of the systems
you want to test
237. Test-and-Fix Loop
Detect discrepancies in two ways
Check if existing products conform to the
FM (SCF, EWC)
Try to build products from configurations
randomly extracted from the FM
(GCF, ORF)
Fix them using Hill-Climbing EA
Mutate the FMs to align them with real
products (alter/insert/remove constraint)
Fitness function trying to minimize
different kinds of errors
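The loop above can be sketched on a toy model. This is a minimal illustration under stated assumptions, not the implementation from the talk: the FM is reduced to a list of propositional clauses, the build oracle `builds` is a hypothetical stand-in for a real build infrastructure, and the fitness counts only two generic kinds of error rather than the SCF/EWC/GCF/ORF measures named on the slide. Configurations are enumerated exhaustively here instead of being sampled at random, which only works for a handful of features.

```python
import itertools
import random

# Assumption: an FM is a list of clauses; each clause is a frozenset of
# (feature, wanted) literals and holds if at least one literal matches.
def satisfies(config, clause):
    return any((feat in config) == wanted for feat, wanted in clause)

def accepts(fm, config):
    return all(satisfies(config, clause) for clause in fm)

def all_configs(features):
    """Enumerate every feature subset (only feasible for toy models)."""
    for r in range(len(features) + 1):
        for combo in itertools.combinations(features, r):
            yield frozenset(combo)

def fitness(fm, products, features, builds):
    """Error count, lower is better: existing products the FM rejects,
    plus configurations the FM accepts that fail to build."""
    rejected = sum(1 for p in products if not accepts(fm, p))
    unbuildable = sum(1 for c in all_configs(features)
                      if accepts(fm, c) and not builds(c))
    return rejected + unbuildable

def mutate(fm, features, rng):
    """Return a copy of the FM with one constraint altered, inserted or removed."""
    fm = list(fm)
    op = rng.choice(["alter", "insert", "remove"])
    if op == "remove" and fm:
        fm.pop(rng.randrange(len(fm)))
    elif op == "alter" and fm:
        i = rng.randrange(len(fm))
        clause = set(fm[i])
        feat, wanted = rng.choice(sorted(clause))
        clause.discard((feat, wanted))
        clause.add((feat, not wanted))  # flip the polarity of one literal
        fm[i] = frozenset(clause)
    else:  # insert a random two-literal clause (also used when the FM is empty)
        a, b = rng.sample(features, 2)
        fm.append(frozenset({(a, rng.random() < 0.5), (b, rng.random() < 0.5)}))
    return fm

def hill_climb(fm, products, features, builds, steps=2000, seed=0):
    """Keep a mutant only if it strictly lowers the error count."""
    rng = random.Random(seed)
    best, best_fit = list(fm), fitness(fm, products, features, builds)
    for _ in range(steps):
        if best_fit == 0:
            break
        candidate = mutate(best, features, rng)
        cand_fit = fitness(candidate, products, features, builds)
        if cand_fit < best_fit:
            best, best_fit = candidate, cand_fit
    return best, best_fit

# Toy data, all hypothetical.
FEATURES = ["a", "b", "c", "d"]
PRODUCTS = [frozenset("a"), frozenset("ab"), frozenset("abc"), frozenset("d")]

def builds(config):
    """Hypothetical build oracle: the real system needs 'a' whenever 'b' is on."""
    return "b" not in config or "a" in config

INITIAL_FM = [frozenset({("a", False), ("b", True)})]  # wrongly states a => b
fixed_fm, errors = hill_climb(INITIAL_FM, PRODUCTS, FEATURES, builds)
print(fitness(INITIAL_FM, PRODUCTS, FEATURES, builds), "->", errors)
```

Counting errors in both directions matters: a fitness that only checked existing products against the FM would be trivially minimized by deleting every constraint.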
238. Evaluation
Experiment on the Linux kernel FM
Large reverse-engineered FM + manual edits
Easy build infrastructure (Kconfig + make)
       FM     2K runs  3K runs  4K runs  5K runs
EWC    50     46       43       41       39
SCF    1000   885      556      498      455
ORF    2468   1646     1395     1236     1084
GCF    1000   1000     1000     1000     1000
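To read the table, it can help to compute the relative change between the initial FM and the FM after 5K runs. The sketch below assumes each cell is an error count (lower is better), which the slides suggest but do not state explicitly; the variable and function names are illustrative only.

```python
# Values copied from the evaluation table: initial FM, then 2K..5K runs.
results = {
    "EWC": [50, 46, 43, 41, 39],
    "SCF": [1000, 885, 556, 498, 455],
    "ORF": [2468, 1646, 1395, 1236, 1084],
    "GCF": [1000, 1000, 1000, 1000, 1000],
}

def reduction_percent(values):
    """Relative decrease from the first to the last column, in percent."""
    return 100 * (values[0] - values[-1]) / values[0]

for name, values in results.items():
    print(f"{name}: {reduction_percent(values):.1f}% fewer after 5K runs")
```

Under that reading, SCF and ORF drop by a bit more than half, EWC by about a fifth, and GCF does not move at all.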
239. Outcome
Approach looks promising on some aspects (SCF/ORF)
But we cannot generalize from one case…
Future work
Assessing the improvement on all configurations (not only a
subset)
Mutation operators
A (bit) more information
C. Henard, M. Papadakis, G. Perrouin, J. Klein, Y. Le Traon,
Towards automated testing and fixing of re-engineered feature
models. NIER@ICSE2013, pp.1245-1248. IEEE Press.
241. Summary
Looked at selecting configurations from FMs
Important subproblem of SPL Testing
CIT main inspiration source
FMs are not covering arrays…
Constraints are important for CIT tools (outside SPLs) and
getting more and more interest = opportunities to compare
Achievements
Scalability
Usability
Ready for industrial practice ?
242. Summary cont’d
T-wise is “blind”
prioritization (weights, ordered suites...)
flexibility (time/budget constraints)
Complement with behavioural SPL Testing QA
Tough challenge: collaboration between Verification and
Testing communities required
MBT and SPL testing
Depending on their source, our techniques may be applied to
VIS, not only SPLs (e.g. Linux)
Their validity has to be challenged
Opportunities to work with (model) miners and program
analysts
252. Take Home Message(s)
SPL Configuration Selection is Hard
Don’t be afraid...
But start simple to understand
Is inter(sub)-disciplinary
SPL Testing culture: Helps find useful tradeoffs
Diversity of techniques: SAT, Evolutionary, even model-
checking ;)
Willingness to collaborate and communication skills needed, but
interesting :)
253. A Personal Note
I do not have long testing experience, but
Enjoyed an open and friendly community (so far ;))
Naive questions yield non-trivial answers
Pragmatism may help to face exponentials
I hope to still be wearing a tester hat in 4 years…
255. References
[Oster et al 2010] Sebastian Oster, Florian Markert, Philipp Ritter: Automated Incremental Pairwise Testing of
Software Product Lines. SPLC 2010:196-210
[Perrouin2008] Gilles Perrouin, Jacques Klein, Nicolas Guelfi, Jean-Marc Jézéquel: Reconciling Automation and
Flexibility in Product Derivation. SPLC 2008: 339-348
[Perrouin2010] Gilles Perrouin, Sagar Sen, Jacques Klein, Benoit Baudry, Yves Le Traon: Automated and Scalable
T-wise Test Case Generation Strategies for Software Product Lines. ICST 2010: 459-468
[Perrouin2012] Gilles Perrouin, Sebastian Oster, Sagar Sen, Jacques Klein, Benoit Baudry, Yves Le Traon: Pairwise
testing for software product lines: comparison of two approaches. Software Quality Journal 20(3-4): 605-643
(2012)
[Uzu08] E. Uzuncaova, D. Garcia, S. Khurshid, and D. Batory, “Testing software product lines using incremental test
generation,” in ISSRE. IEEE Computer Society, 2008, pp. 249–258.
[Cohen2006] M. B. Cohen, M. B. Dwyer, and J. Shi, “Coverage and adequacy in software product line testing,” in
ROSATEA@ISSTA, 2006, pp. 53–63.
[Cohen2007] M. Cohen, M. Dwyer, and J. Shi, “Interaction testing of highly-configurable systems in the presence of
constraints,” in ISSTA, 2007, pp. 129–139.
[Weißleder2010] Stephan Weißleder: Test models and coverage criteria for automatic model-based test generation
with UML state machines. PhD Thesis, Humboldt University of Berlin 2010, pp. 1-259
[Utting2006] Utting, M., Legeard, B.: Practical model-based testing: a tools approach. Morgan Kaufmann, 2006
[Kuhn2004] Kuhn DR, Wallace DR, Gallo AM (2004) Software fault interactions and implications for software testing.
IEEE Trans Softw Eng 30(6):418–421
256. References
[Batory2005] D. S. Batory, “Feature models, grammars, and propositional formulas,” in SPLC, 2005, pp. 7–20.
[Czarnecki2007] K. Czarnecki and A. Wasowski, “Feature diagrams and logics: There and back again,” in
SPLC. Los Alamitos, CA, USA: IEEE Computer Society, 2007, pp. 23–34.
[Schobbens2007] P. Schobbens, P. Heymans, J. Trigaux, and Y. Bontemps, “Generic semantics of feature
diagrams,” Computer Networks, vol. 51, no. 2, pp. 456–479, 2007.
[Benavides2010] Benavides D, Segura S, Ruiz-Cortés A (2010) Automated analysis of feature models 20 years
later: A literature review. Information Systems 35(6):615 – 63
[Mendonca2009] Mendonca M, Branco M, Cowan D (2009) SPLOT: software product lines online tools. In:
Proceeding of the 24th ACM SIGPLAN conference companion on Object oriented programming systems
languages and applications, ACM, pp 761–762
[Hervieu2011] Aymeric Hervieu, Benoit Baudry, Arnaud Gotlieb: PACOGEN: Automatic Generation of Pairwise Test
Configurations from Feature Models. ISSRE 2011: 120-129
[Johansen2012a] Martin Fagereng Johansen, Øystein Haugen, Franck Fleurey, Anne Grete Eldegard, Torbjørn
Syversen: Generating Better Partial Covering Arrays by Modeling Weights on Sub-product Lines. MoDELS 2012:
269-284
[Johansen2012b] Martin Fagereng Johansen, Øystein Haugen, Franck Fleurey: An algorithm for generating t-wise
covering arrays from large feature models. SPLC (1) 2012: 46-55
[Johansen2011] Martin Fagereng Johansen, Øystein Haugen, Franck Fleurey: Properties of Realistic Feature
Models Make Combinatorial Testing of Product Lines Feasible. MoDELS 2011: 638-652
257. References
[Classen2012] Classen, A.; Cordy, M.; Heymans, P.; Legay, A. and Schobbens, P-Y. Model checking software
product lines with SNIP. In International Journal on Software Tools for Technology Transfer (STTT), Springer-Verlag,
14 (5): 589-612, 2012.
[Cordy2012a] Maxime Cordy, Pierre-Yves Schobbens, Patrick Heymans, Axel Legay: Behavioural modelling and
verification of real-time software product lines. SPLC (1) 2012: 66-75
[Cordy2012b] Maxime Cordy, Andreas Classen, Patrick Heymans, Axel Legay, Pierre-Yves Schobbens. Model
Checking Adaptive Software with Featured Transition Systems, in Assurance for Self-Adaptive Systems, Lecture
Notes in Computer Science, to appear.
[Classen2008] Classen A, Heymans P, Schobbens P (2008) What’s in a feature: A requirements engineering
perspective. In: Proceedings of the Theory and practice of software, 11th international conference on Fundamental
approaches to software engineering, Springer-Verlag, pp 16–30
[Loc12] Malte Lochau, Ina Schaefer, Jochen Kamischke, and Sascha Lity. Incremental model-based testing of
delta-oriented software product lines. Tests and Proofs, pages 67–82, 2012.
[Ensan2012] Ensan, Faezeh, Ebrahim Bagheri, and Dragan Gašević. Evolutionary search-based test generation for
software product line feature models. In Advanced Information Systems Engineering, pp. 613-628. Springer Berlin
Heidelberg, 2012.
[Droste2002] S. Droste, T. Jansen, and I. Wegener, “On the analysis of the (1+ 1) evolutionary algorithm,” Theor.
Comput. Sci., vol. 276, no. 1-2, pp. 51–81, Apr. 2002
[Garvin2009] Garvin BJ, Cohen MB, Dwyer MB (2009) An improved meta-heuristic search for constrained
interaction testing. In: 1st international symposium on search based software engineering, pp 13–22, 2009
[Garvin2011] B. J. Garvin, M. B. Cohen, and M. B. Dwyer, “Evaluating improvements to a meta-heuristic search for
constrained interaction testing,” Empirical Softw. Engg., vol. 16, no. 1, pp. 61–102, Feb. 2011.
258. References
[Classen2010] Classen, A., Heymans, P., Schobbens, P., Legay, A., Raskin, J.: Model checking lots of systems:
efficient verification of temporal properties in software product lines. In: Proceedings of the 32nd ACM/IEEE
International Conference on Software Engineering - Volume 1. pp. 335–344. ICSE ’10, ACM, New York, NY, USA
(2010)
[Classen2011] Classen, A., Heymans, P., Schobbens, P., Legay, A.: Symbolic model checking of software product
lines. In: Proceedings 33rd International Conference on Software Engineering (ICSE 2011). ACM Press, New York
(2011)
[Ach2011] Acher, M., Cleve, A., Perrouin, G., Heymans, P., Vanbeneden, C., Collet, P., Lahire, P. (2012, January).
On extracting feature models from product descriptions. In Proceedings of the Sixth International Workshop on
Variability Modeling of Software-Intensive Systems (pp. 45-54). ACM.
[She2011] She, S., Lotufo, R., Berger, T., Wasowski, A., Czarnecki, K. (2011, May). Reverse engineering feature
models. In Software Engineering (ICSE), 2011 33rd International Conference on (pp. 461-470). IEEE.
[Hasl2013] Haslinger, E. N., Lopez-Herrejon, R. E., Egyed, A. (2013). On extracting feature models from sets of
valid feature combinations. In Fundamental Approaches to Software Engineering (pp. 53-67). Springer Berlin
Heidelberg.