Se ha denunciado esta presentación.
Utilizamos tu perfil de LinkedIn y tus datos de actividad para personalizar los anuncios y mostrarte publicidad más relevante. Puedes cambiar tus preferencias de publicidad en cualquier momento.
Dirk Fahland
David Lo
Shahar Maoz

Mining
Branching-Time Scenarios
Eindhoven University of Technology
Singapore
Management
University

Tel Aviv University

Dirk Fahland
David Lo
Shahar Maoz...
Dirk Fahland
David Lo
Shahar Maoz

Mining
Branching-Time Scenarios
Understanding Existing Applications

fix bugs, add
features, test,
document, …

specification

understanding of objects an...
Understanding Existing Applications

fix bugs, add
features, test,
document, …

specification

Usually, there is no specif...
Understanding Existing Applications

fix bugs, add
features, test,
document, …

specification

?
?

?
?

source code

unde...
Understanding Existing Applications
Understanding applications from source
code is
• laborious
• fix bugs, add
time-consum...
Specification Mining

automatically
extract

specification

?
?

?
?

source code

understanding of objects and
object int...
Specification Mining from Event Logs

log

automatically
extract
specification

?
?

?
?

source code

understanding of ob...
This talk

log

automatically
extract
specification

?
?

?
?

source code

understanding of objects and
object interplay
...
This talk

log

automatically
extract
specification

?
What is a good specification
language to get an overview of
?
?
how...
Understanding Object Interplay
FTP server: How does Login/Logout work?
In object-oriented applications,
the hard part is t...
Understanding Object Interplay
An object of class A invokes
method onConnect() on an Scenario: Login/Logout
object of clas...
Understanding Object Interplay
Scenario: Login/Logout
A

UserCmd

B

C

C
onConnect()
onLogin()

setUser()
UserCmd
setLogo...
Understanding Object Interplay
Scenario: Login/Logout
A

UserCmd

B

C

C
onConnect()
onLogin()

setUser()
UserCmd
setLogo...
When does it happen?
Scenario: Login/Logout

This scenario tells us
which objects and
methods are involved in
login/logout...
When does it happen?
Scenario: Login/Logout
A

whenever the prechart
happens…
eventually the mainchart
will happen

UserCm...
Linear-Time LSCs - Invariants
pre
Login
Login

There can be other behaviors
between the behaviors shown in
the LSC.
2 runs...
Linear-Time LSCs - Invariants
pre
Login
Login

This run does not continue with
the complete main chart of the
LSC.
2 runs
...
Understanding Everything

FTP
download

FTP
delete

not everything is an invariant
 alternative behaviors

FTP
rename

2 ...
Linear-Time is Insufficient
scenario for FTP delete command

This LSC for the delete command
does not hold in every run.
2...
Branching Time

We merge all runs on their joint
prefixes into an execution tree.
execution tree

21
Branching-Time LSCs
[Sibay, Uchitel, Braberman ICSE 2008]

whenever the prechart happens

there exists a branch
where the ...
Describing Alternatives
We can define an LSC for the download
command, that is alternative to delete.

execution tree

23
Describing Alternatives
… and also an LSC for the Rename
command.

execution tree

24
Describing Alternatives

execution tree

25
Describing Alternatives

execution tree

26
LSC Mining from Event Logs

log

automatically
extract
complete set of LSCs
(linear / branching)
We want to discover a set...
Logs

log

automatically
extract
log method calls

complete set of LSCs
(linear / branching)

caller1, callee1, method1(…)...
Desired Outcome
log
log
=
tree

automatically
extract
complete set of LSCs
(occuring at least s times
= support)

29
Mining Algorithm
github.com/scenario-based-tools/sam/

tree

variant of [Lo, Maoz, Khoo ASE 2007]

30
Mining Algorithm
github.com/scenario-based-tools/sam/

tree

1. enumerate all sequences of events
occurring ≥ s times

onC...
Mining Algorithm
github.com/scenario-based-tools/sam/

tree

1. enumerate all sequences of events
occurring ≥ s times

onC...
Mining Algorithm
github.com/scenario-based-tools/sam/

tree

1. enumerate all sequences of events
occurring ≥ s times

onC...
Mining Algorithm
github.com/scenario-based-tools/sam/

tree

1. enumerate all sequences of events
occurring ≥ s times

onC...
LSC Mining from Event Logs

log

automatically
extract
complete set of LSCs
(linear and branching)
What do branching
scena...
Experiments
 CrossFTP server: 54 traces, 50 event types
s

Linear
LSC

covered
events

avg.
length

time
[s]

Branching
L...
Experiments
 CrossFTP server: 54 traces, 50 event types
s

Linear
LSC

covered
events

avg.
length

time
[s]

Branching
L...
Experiments
 CrossFTP server: 54 traces, 50 event types
s

Linear
LSC

covered
events

avg.
length

time
[s]

Branching
L...
Experiments
 CrossFTP server: 54 traces, 50 event types
s

Linear
LSC

covered
events

avg.
length

time
[s]

Branching
L...
Experiments
 CrossFTP server: 54 traces, 50 event types
s

Linear
LSC

covered
events

avg.
length

time
[s]

Branching
L...
Experiments
 CrossFTP server: 54 traces, 50 event types
s

Linear
LSC

covered
events

avg.
length

time
[s]

Branching
L...
Linear vs. Branching: CrossFTP
What is the qualitative contribution
of branching LSC to specification
mining?
application ...
Linear vs. Branching: CrossFTP

application life-cycle
from end to end

short invariants of
individual FTP
commands

invar...
Linear vs. Branching: CrossFTP

application life-cycle
from end to end FTP command
+
where triggered
short invariants of
i...
Linear vs. Branching: CrossFTP

application life-cycle individual FTP
individual FTP
FTP commands
from end to end all comm...
Linear vs. Branching: CrossFTP

application life-cycle individual FTP
individual FTP
FTP commands
from end to end all comm...
Take Home Points

http://github.com/scenario-based-tools/sam/

log

• mining branching scenarios
 alternatives, cycles
• ...
about.me/dirk.fahland
@dfahland

Mining
Branching-Time Scenarios
Q&A …is branching time really necessary?
Yes, here is a linear LSC showing a disjunction
for continuing after the pre-char...
The full execution tree satisfies this Linear LSC with disjunction
Branching Time vs. Disjunction
and two branching LSCs d...
Branching Time vs. Disjunction

Removing one branch
from the tree (the
execution of the
download command),
violates the br...
Próxima SlideShare
Cargando en…5
×

Mining Branch-Time Scenarios From Execution Logs

2.733 visualizaciones

Publicado el

This presentation was given at the International Conference on Automated Software Engineering (ASE 2013) in Palo Alto, November 2013.

We describe a technique for automatically extracting specifications from execution traces of an application. The particular specification that we extract are scenarios in the form of conditional existential Live-Sequence Charts (LSC), which are similar to UML Sequence Diagrams.

The technique is implemented in a tool and was evaluated on two real-life event logs.

Publicado en: Tecnología
  • Sé el primero en comentar

  • Sé el primero en recomendar esto

Mining Branch-Time Scenarios From Execution Logs

  1. 1. Dirk Fahland David Lo Shahar Maoz Mining Branching-Time Scenarios
  2. 2. Eindhoven University of Technology Singapore Management University Tel Aviv University Dirk Fahland David Lo Shahar Maoz Mining Branching-Time Scenarios
  3. 3. Dirk Fahland David Lo Shahar Maoz Mining Branching-Time Scenarios
  4. 4. Understanding Existing Applications fix bugs, add features, test, document, … specification understanding of objects and object interplay 3
  5. 5. Understanding Existing Applications fix bugs, add features, test, document, … specification Usually, there is no specification or worse, if it exists, it is outdated… understanding of objects and object interplay 4
  6. 6. Understanding Existing Applications fix bugs, add features, test, document, … specification ? ? ? ? source code understanding of objects and object interplay 5
  7. 7. Understanding Existing Applications Understanding applications from source code is • laborious • fix bugs, add time-consuming specification •features, test, error prone document, … ? ? ? ? source code understanding of objects and object interplay 6
  8. 8. Specification Mining automatically extract specification ? ? ? ? source code understanding of objects and object interplay 7
  9. 9. Specification Mining from Event Logs log automatically extract specification ? ? ? ? source code understanding of objects and object interplay 8
  10. 10. This talk log automatically extract specification ? ? ? ? source code understanding of objects and object interplay 9
  11. 11. This talk log automatically extract specification ? What is a good specification language to get an overview of ? ? how an application works? ? source code understanding of objects and object interplay 10
  12. 12. Understanding Object Interplay FTP server: How does Login/Logout work? In object-oriented applications, the hard part is to understand how the different objects relate to and interact with each other. Here are some classes of an FTP server. How does login/logout work? 11
  13. 13. Understanding Object Interplay An object of class A invokes method onConnect() on an Scenario: Login/Logout object of class B. A UserCmd B C C onConnect() UserCmd A B 12
  14. 14. Understanding Object Interplay Scenario: Login/Logout A UserCmd B C C onConnect() onLogin() setUser() UserCmd setLogout() A B 13
  15. 15. Understanding Object Interplay Scenario: Login/Logout A UserCmd B C C onConnect() onLogin() setUser() UserCmd setLogout() A • non-local behavior: multiple objects • logically related B 14
  16. 16. When does it happen? Scenario: Login/Logout This scenario tells us which objects and methods are involved in login/logout. A UserCmd B C onConnect() onLogin() setUser() But it does not tell when this scenario occurs in the application. setLogout() 15
  17. 17. When does it happen? Scenario: Login/Logout A whenever the prechart happens… eventually the mainchart will happen UserCmd B C onConnect() onLogin() setUser() setLogout() 16
  18. 18. Linear-Time LSCs - Invariants pre Login Login There can be other behaviors between the behaviors shown in the LSC. 2 runs 17
  19. 19. Linear-Time LSCs - Invariants pre Login Login This run does not continue with the complete main chart of the LSC. 2 runs 18
  20. 20. Understanding Everything FTP download FTP delete not everything is an invariant  alternative behaviors FTP rename 2 alternative runs 19
  21. 21. Linear-Time is Insufficient scenario for FTP delete command This LSC for the delete command does not hold in every run. 2 runs 20
  22. 22. Branching Time We merge all runs on their joint prefixes into an execution tree. execution tree 21
  23. 23. Branching-Time LSCs [Sibay, Uchitel, Braberman ICSE 2008] whenever the prechart happens there exists a branch where the mainchart happens execution tree 22
  24. 24. Describing Alternatives We can define an LSC for the download command, that is alternative to delete. execution tree 23
  25. 25. Describing Alternatives … and also an LSC for the Rename command. execution tree 24
  26. 26. Describing Alternatives execution tree 25
  27. 27. Describing Alternatives execution tree 26
  28. 28. LSC Mining from Event Logs log automatically extract complete set of LSCs (linear / branching) We want to discover a set of LSCs that can describe all behaviors (or as much as possible of the behaviors). understanding of objects and object interplay 27
  29. 29. Logs log automatically extract log method calls complete set of LSCs (linear / branching) caller1, callee1, method1(…) caller2, callee2, method2(…) … Each execution of the application gives one trace. Run application multiple times for a log. 28
  30. 30. Desired Outcome log log = tree automatically extract complete set of LSCs (occuring at least s times = support) 29
  31. 31. Mining Algorithm github.com/scenario-based-tools/sam/ tree variant of [Lo, Maoz, Khoo ASE 2007] 30
  32. 32. Mining Algorithm github.com/scenario-based-tools/sam/ tree 1. enumerate all sequences of events occurring ≥ s times onConnect() onConnect() onConnect() onLogin() onLogin() onLogin() setLogin() setLogin() setLogin() setLogout() setLogout() setLogout() candidate words Starting from sequences of length 1, recursively append events and check if it occurs often enough. Efficient implementation uses branch and bound and some heuristics. 31
  33. 33. Mining Algorithm github.com/scenario-based-tools/sam/ tree 1. enumerate all sequences of events occurring ≥ s times onConnect() onConnect() onConnect() onLogin() onLogin() onLogin() setLogin() setLogin() setLogin() setLogout() setLogout() setLogout() candidate words From LSC with pre-chart length 1 to LSC with mainchart length 1. 2. generate all candidate LSCs onConnect() onLogin() setLogin() setLogout() onConnect() onLogin() setLogin() setLogout() 32
  34. 34. Mining Algorithm github.com/scenario-based-tools/sam/ tree 1. enumerate all sequences of events occurring ≥ s times onConnect() onConnect() onConnect() onLogin() onLogin() onLogin() setLogin() setLogin() setLogin() setLogout() setLogout() setLogout() candidate words 3. test for each LSC if satisfied ≥ c% 2. generate all candidate LSCs onConnect() onLogin() setLogin() setLogout() onConnect() onLogin() setLogin() setLogout() 33
  35. 35. Mining Algorithm github.com/scenario-based-tools/sam/ tree 1. enumerate all sequences of events occurring ≥ s times onConnect() onConnect() onConnect() onLogin() onLogin() onLogin() setLogin() setLogin() setLogin() setLogout() setLogout() setLogout() candidate words 3. test for each LSC if satisfied ≥ c% 2. generate all candidate LSCs onConnect() onLogin() setLogin() setLogout() onConnect() onLogin() setLogin() setLogout() 34
  36. 36. LSC Mining from Event Logs log automatically extract complete set of LSCs (linear and branching) What do branching scenarios add to specification mining? understanding of objects and object interplay 35
  37. 37. Experiments  CrossFTP server: 54 traces, 50 event types s Linear LSC covered events avg. length time [s] Branching LSC covered events avg. length time [s] 20 7 90% 7 3 7+0 90% 7 3 14 9 90% 5 31 9+12 95% 13 26 10 9 90% 5 1008 9+18 95% 18 685 Branching LSC contain the linear LSC and some more strictly branching LSC that were not found before. Branching LSC are less frequent (lower support threshold). 36
  38. 38. Experiments  CrossFTP server: 54 traces, 50 event types s Linear LSC covered events avg. length time [s] Branching LSC covered events avg. length time [s] 20 7 90% 7 3 7+0 90% 7 3 14 9 90% 5 31 9+12 95% 13 26 10 9 90% 5 1008 9+18 95% 18 685 Branching LSC can explore more events of the log than just Linear LSC. 37
  39. 39. Experiments  CrossFTP server: 54 traces, 50 event types s Linear LSC covered events avg. length time [s] Branching LSC covered events avg. length time [s] 20 7 90% 7 3 7+0 90% 7 3 14 9 90% 5 31 9+12 95% 13 26 10 9 90% 5 1008 9+18 95% 18 685 Branching LSC are longer than Linear LSC. In other words, they show more details for a particular behavior. 38
  40. 40. Experiments  CrossFTP server: 54 traces, 50 event types s Linear LSC covered events avg. length time [s] Branching LSC covered events avg. length time [s] 20 7 90% 7 3 7+0 90% 7 3 14 9 90% 5 31 9+12 95% 13 26 10 9 90% 5 1008 9+18 95% 18 685 Running times for extraction are feasible. Note that LSCs shown here are the LSCs left after removing subsumed ones. Originally, the algorithm finds around 6 million branching LSC in 685 seconds. 39
  41. 41. Experiments  CrossFTP server: 54 traces, 50 event types s Linear LSC covered events avg. length time [s] Branching LSC covered events avg. length time [s] 20 7 90% 7 3 7+0 90% 7 3 14 9 90% 5 31 9+12 95% 13 26 10 9 90% 5 1008 9+18 95% 18 685  Columba mail client: 104 traces, 79 event types s/c Linear LSC covered events avg. length time [s] 20 57 70% 4 159 10 205 72% 6 10/.5 163 78% 6 Branching LSC covered events avg. length time [s] 57+1 71% 9 154 2191 205+53 75% 9 2055 2256 163+44 84% 6 2125 full data sets and results: http://dx.doi.org/10.4121/uuid:aa7db920-aae6-4750-8975-cb739262f432 40
  42. 42. Experiments  CrossFTP server: 54 traces, 50 event types s Linear LSC covered events avg. length time [s] Branching LSC covered events avg. length time [s] 20 7 90% 7 3 7+0 90% 7 3 14 9 90% 5 31 9+12 95% 13 26 10 9 90% 5 1008 9+18 95% 18 685  Columba mail client: 104 traces, 79 event types s/c Linear LSC covered events avg. length time [s] 20 57 70% 4 159 10 205 72% 6 10/.5 163 78% 6 Branching LSC covered events avg. length time [s] 57+1 71% 9 154 2191 205+53 75% 9 2055 2256 163+44 84% 6 2125 full data sets and results: http://dx.doi.org/10.4121/uuid:aa7db920-aae6-4750-8975-cb739262f432 41
  43. 43. Linear vs. Branching: CrossFTP What is the qualitative contribution of branching LSC to specification mining? application life-cycle from end to end connect login logout clean up 42
  44. 44. Linear vs. Branching: CrossFTP application life-cycle from end to end short invariants of individual FTP commands invariant of RENAME 43
  45. 45. Linear vs. Branching: CrossFTP application life-cycle from end to end FTP command + where triggered short invariants of individual FTP commands The branching LSC fills the gap between large and small invariants. login rename command 44
  46. 46. Linear vs. Branching: CrossFTP application life-cycle individual FTP individual FTP FTP commands from end to end all commands + commands + +where they are where they are can be triggered in triggered triggered the same situation short invariants of individual FTP commands We found all ftp commands supported by the server, as alternative LSC. 45
  47. 47. Linear vs. Branching: CrossFTP application life-cycle individual FTP individual FTP FTP commands from end to end all commands + commands + +where they are where they are can be triggered in triggered triggered the same situation … and we could discover cyclic behavior: after rename, there could be another delete command short invariants of individual FTP commands cycles: rename  delete 46
  48. 48. Take Home Points http://github.com/scenario-based-tools/sam/ log • mining branching scenarios  alternatives, cycles • combined with linear: comprehensive specification • future work: visualizing results distributed scenarios complete set of LSCs understanding of objects and object interplay 47
  49. 49. about.me/dirk.fahland @dfahland Mining Branching-Time Scenarios
  50. 50. Q&A …is branching time really necessary? Yes, here is a linear LSC showing a disjunction for continuing after the pre-chart. if then delete or download 49
  51. 51. The full execution tree satisfies this Linear LSC with disjunction Branching Time vs. Disjunction and two branching LSCs describing the two alternatives in separate LSCs. 50
  52. 52. Branching Time vs. Disjunction Removing one branch from the tree (the execution of the download command), violates the branching LSCs, but still satisfies the disjunctive linear LSCs (because only one of them has to hold). 51

×