4. Testing (CPS) Software
[Figure: CPS software = algorithms + differential equations; inputs X = 10, Y = 30 and parameter Z = 20 yield output signals S1(t), S2(t), S3(t), with a failing and a passing shape of S3(t) shown.]
5. Software Testing Challenges (CPS)
• Mixed discrete-continuous behavior (combination of algorithms and continuous dynamics)
• Inputs/outputs are signals (functions over time)
• Simulation is inexpensive but not yet systematically automated
• Partial test oracles
8. Stateflow
• A Statechart dialect integrated into Simulink
• Captures the state-based behavior of CPS software
• Has mixed discrete-continuous behavior
13. Test Suite Effectiveness (1)
• Test suite size should be small because:
  • Test oracles cannot be fully automated
  • Output signals need to be inspected by engineers
[Figure: each test case feeds input signals S1(t), S2(t), S3(t) into a model simulation, which produces the output signal(s); two test cases (Test Case 1, Test Case 2) are shown.]
14. Test Suite Effectiveness (2)
• Test suites should have a high fault revealing power
• Small deviations in outputs may not be recognized by engineers, or may not matter
• Test inputs that drastically impact the output signal shape are likely to have a higher fault revealing power
[Figure: CtrlSig vs. Time for Test Output 1 and Test Output 2, each overlaying the faulty model output on the correct model output.]
23. Failure-based Test Generation
• Maximizing the likelihood of presence of specific failure patterns (Instability, Discontinuity) in output signals

[Figure: two CtrlSig output plots over 0.0–2.0 s: one oscillating between -1.0 and 1.0 (Instability), one with abrupt jumps between 0.0 and 1.0 (Discontinuity).]
24. We developed our failure-based test generation algorithm using Meta-Heuristic Search
25. The Alternative Choice
Technique: Model Checking

Existing work:
- Requires precisely defined oracles (user-specified assertions)
- Has been largely applied to time-discrete models
- State-explosion problem

Our approach:
- No need for automated test oracles
- Applicable to time-continuous and non-linear models
- Our algorithms are black-box randomized search: not memory-intensive and parallelizable
26. Failure-based Test Generation using Meta-Heuristic Search
• Tweak: slightly modifying each input signal
• Fitness functions: capturing the likelihood of presence of failure patterns in the output signals

Search Procedure:
  S ← Initial Candidate Solution
  Repeat until maximum resources spent:
    R ← Tweak(S)
    if Fitness(R) > Fitness(S) then S ← R
  Return S
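The search procedure above is a single-state hill climb. A minimal Python sketch, assuming caller-supplied `tweak` and `fitness` functions (illustrative names, not the paper's implementation):

```python
import random

def hill_climb(initial, tweak, fitness, max_evals=1000):
    """Single-state meta-heuristic search: repeatedly tweak the current
    candidate and keep the tweaked version only if its fitness improves."""
    best = initial
    best_fit = fitness(best)
    for _ in range(max_evals):          # "until maximum resources spent"
        candidate = tweak(best)
        cand_fit = fitness(candidate)
        if cand_fit > best_fit:         # accept strict improvements only
            best, best_fit = candidate, cand_fit
    return best

# Toy usage: maximize -(x - 3)^2 by nudging a single number.
result = hill_climb(
    initial=0.0,
    tweak=lambda x: x + random.uniform(-0.5, 0.5),
    fitness=lambda x: -(x - 3.0) ** 2,
)
```

In the test generation setting, a candidate solution is a set of input signals and the fitness functions are the output-based ones defined on the following slides.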
27. Output Stability Fitness Function
• Sum of the differences of signal values for consecutive simulation steps

  stability(sg_o) = Σ_{i=1}^{k} |sg_o(i·Δt) − sg_o((i−1)·Δt)|

[Figure: an unstable CtrlSig output oscillating between -1.0 and 1.0 over 0.0–2.0 s.]
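As a sketch, the stability fitness can be computed directly from the sampled output signal (a plain Python list here; names are illustrative):

```python
def stability(samples):
    """Sum of absolute differences between consecutive simulation steps.
    Higher values indicate a more unstable (oscillating) output signal."""
    return sum(abs(samples[i] - samples[i - 1]) for i in range(1, len(samples)))

smooth = [0.0, 0.1, 0.2, 0.3, 0.4]          # slowly varying signal
oscillating = [0.0, 1.0, -1.0, 1.0, -1.0]   # unstable signal
# stability(oscillating) == 7.0, far larger than stability(smooth)
```

Maximizing this sum steers the search toward inputs that make the output oscillate, i.e. toward the Instability failure pattern.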
28. Output Continuity Fitness Function
• Maximum, over all simulation steps, of the minimum of the absolute left and right derivatives

  continuity(sg_o) = max_{i=1}^{K−1} min(|LeftDer(sg_o, i)|, |RightDer(sg_o, i)|)

[Figure: a discontinuous CtrlSig output with abrupt jumps between 0.0 and 1.0 over 0.0–2.0 s.]
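A sketch of this fitness over a sampled signal, approximating the one-sided derivatives with single-step differences (an assumption; the paper's derivative estimation may differ):

```python
def continuity(samples, dt=0.25):
    """Continuity fitness: the maximum, over interior simulation steps, of the
    smaller of the absolute left and right one-step derivative estimates.
    Taking the min rewards points that are steep on BOTH sides (spike-like
    jumps); a steep slope on one side only does not raise the score."""
    worst = 0.0
    for i in range(1, len(samples) - 1):
        left = abs(samples[i] - samples[i - 1]) / dt    # left derivative at step i
        right = abs(samples[i + 1] - samples[i]) / dt   # right derivative at step i
        worst = max(worst, min(left, right))
    return worst

spike = [0.0, 0.0, 1.0, 0.0, 0.0]    # discontinuity-like jump
ramp = [0.0, 0.25, 0.5, 0.75, 1.0]   # smooth slope
# continuity(spike) == 4.0, continuity(ramp) == 1.0
```

Maximizing this value drives the search toward the Discontinuity failure pattern.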
30. Research Questions
• RQ1 (Fault Revealing Ability)
• RQ2 (Fault Revealing Subsumption)
• RQ3 (Test Suite Size)
31. Experiment Setup
• Three Stateflow models: two industrial and one publicly
available case study
75 (faulty models) × 100 (algorithm runs) × 6 (generation algorithms) × 5 (test suite sizes) = 225,000 test suites in total

[Figure: from a Stateflow model (SF), fault seeding produces 75 faulty models; each generation algorithm then builds test suites of sizes 3, 5, 10, 25, and 50.]
32. Research Question 1: Fault Revealing Ability
How does the fault revealing ability of
our proposed test generation algorithms
compare with one another?
34. Research Question 2: Fault Revealing Subsumption
Is any of our generation algorithms
subsumed by other algorithms?
35. RQ2: Fault Revealing Subsumption
• For each of the 75 faulty models, we identified the best generation
algorithm(s) for different test suite sizes (5, 10, 25, and 50)
[Figure: a matrix marking, for each fault (Fault 1–4 shown), which of State Coverage, Transition Coverage, Output Diversity, Output Stability, and Output Continuity performed best.]
36. RQ2: Fault Revealing Subsumption (2)
1. The coverage-based algorithms found the fewest faults
2. The coverage-based algorithms are subsumed by the output diversity algorithm as the test suite size increases (size = 25, 50)
37. Research Question 3: Test Suite Size
What is the impact of the size of test suites
generated by our generation algorithms on
their fault revealing ability?
38. RQ3: Test Suite Size
1. The fault revealing rates of output stability/continuity are very high for small test suites (size = 3, 5) for Instability/Discontinuity failures
2. For other failures, the ability of output diversity to reveal failures rapidly increases as the test suite size increases

[Figure: mean fault revealing rate vs. test suite size (3, 5, 10, 25, 50), one panel each for Instability, Discontinuity, and Other failures, comparing Output Stability, Output Continuity, Output Diversity, State Coverage, and Transition Coverage.]
40. Lesson 1: Coverage-based algorithms are less effective than output-based algorithms
• The test cases resulting from state/transition coverage algorithms do cover the faulty parts of the models:
  • 97% state coverage and 81% transition coverage
  • The faulty parts are covered for 73 (out of 75) fault-seeded models
• However, they fail to generate output signals that are sufficiently distinct from the oracle signal, hence yielding a low fault revealing rate
41. Lesson 2: Combining Output-based Algorithms
• We suggest dividing the test suite size budget between the output-based algorithms:
  • Output Continuity
  • Output Stability
  • Output Diversity
43. Effective Test Suites for Mixed Discrete-Continuous Stateflow Controllers
Reza Matinnejad (reza.matinnejad@uni.lu)
Shiva Nejati
Lionel Briand
SnT Center, University of Luxembourg
Thomas Bruckmann
Delphi Automotive Systems, Luxembourg
44. Lesson 2: Combining Output-based Algorithms (2)
• We suggest dividing the test suite size budget between output stability, output continuity, and output diversity:
  1. Allocate a small part of the test budget to output continuity
  2. Share the rest of the budget between output stability and output diversity, giving output diversity the higher share
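As an illustrative sketch of such a split (the 10%/30%/60% proportions below are assumptions for illustration, not the paper's recommendation):

```python
def split_budget(total):
    """Divide a test-suite budget across the three output-based algorithms:
    a small share to output continuity, the remainder split between output
    stability and output diversity with diversity favored.
    The 10%/30%/60% proportions are illustrative only."""
    continuity = max(1, total // 10)          # small share for continuity
    stability = max(1, (total * 3) // 10)     # medium share for stability
    diversity = total - continuity - stability  # diversity gets the rest
    return {"continuity": continuity, "stability": stability, "diversity": diversity}

# e.g. split_budget(10) -> {'continuity': 1, 'stability': 3, 'diversity': 6}
```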
45. Input / Output Vectors
[Figure: the FuelLevelSensor input vector and the resulting FuelLevel output vector over 0–10 s; the sampled output values are 100.0, 91.43, 84.43, 75.62, 70.01, 66.19, 61.21, 56.66, 54.32, 52.81.]
47. Fault Revealing Rate (FRR)
• FRR is based on g_i, the output of the fault-free model, sg_i, the output of the fault-seeded model, and a threshold THR:

  FRR(SF, TS) = 1  if ∃ 1 ≤ i ≤ q : dist(sg_i, g_i) > THR
                0  if ∀ 1 ≤ i ≤ q : dist(sg_i, g_i) ≤ THR
1. For continuous dynamic systems, the system output is acceptable
when the deviation is small and not necessarily zero
2. It is more likely that manual testers recognize a faulty output signal
when the signal shape drastically differs from the oracle.
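Under this definition, a test suite reveals the seeded fault if at least one test's output deviates from the fault-free oracle by more than the threshold. A sketch, using the maximum pointwise absolute difference as a stand-in for the paper's dist measure (an assumption):

```python
def frr(oracle_outputs, faulty_outputs, thr):
    """FRR(SF, TS): 1 if some test's output from the fault-seeded model
    deviates from the fault-free oracle output by more than THR, else 0.
    dist is approximated by the max pointwise absolute difference."""
    def dist(sg, g):
        return max(abs(a - b) for a, b in zip(sg, g))
    return 1 if any(dist(sg, g) > thr
                    for sg, g in zip(faulty_outputs, oracle_outputs)) else 0

oracle = [[0.0, 0.5, 1.0], [1.0, 1.0, 1.0]]
faulty = [[0.0, 0.5, 1.0], [1.0, 0.2, 1.0]]   # second test deviates by ~0.8
# frr(oracle, faulty, thr=0.5) == 1; frr(oracle, faulty, thr=0.9) == 0
```

The threshold encodes point 1 above: small deviations are acceptable for continuous dynamic systems, so only deviations above THR count as revealed faults.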
48. RQ3: Test Suite Size
1. The fault revealing rates of output stability/continuity are very high for small test suites for Instability/Discontinuity failures
2. For "Other" failures, the ability of OD (output diversity) to reveal failures rapidly increases as the test suite size increases

[Figure: mean FRR vs. test suite size (3, 5, 10, 25, 50), one panel each for Instability, Discontinuity, and Other failures; SC = State Coverage, TC = Transition Coverage, OD = Output Diversity, OS = Output Stability, OC = Output Continuity.]