Since the first algorithms for automatically discovering process models from event logs were proposed in the late 1990s, the problem of obtaining insights into processes by mining event logs has gained growing attention. By now, the field has grown into a maturing discipline, and industry has begun adopting process mining in regular operations, supported by several commercial process mining solutions available on the market.
In the early days of process mining, several algorithms for constructively discovering a process model from an event log were proposed, each pursuing its own principles for constructing a model. This first generation of process discovery techniques, which includes for instance the alpha-algorithm, paved the way for process mining as a research discipline. As these algorithms were applied in practice, new research challenges emerged, sparking new results in both pre-processing event data and evaluating process models on event logs. The latter in particular deepened the understanding of the challenges in process mining and established a reliable feedback mechanism in the form of conformance checking. This feedback mechanism enabled research on a second generation of process mining techniques addressing a large variety of problems, such as quality guarantees for discovered models, including the data perspective in discovered models, or discovering temporal logic constraints. In particular, the inductive miner family was seen as a new milestone, as it provided a systematic way to develop process discovery algorithms with reliable results. Yet again, as these more capable techniques are applied to the growing and more detailed event data recorded in practice, further unsolved challenges arise.
In the first part of my talk, I will draw an arc from the early days of process mining to the current state of the art, highlighting central techniques and their impact on later developments. In the second part, I will turn to the kinds of event data and challenges found in practice today, how existing process mining techniques fail to address them, and which open challenges and opportunities the field therefore offers, also for researchers from other domains.
19. Learning Models with Concurrency: ILP Miner
[Werf, Dongen, Hurkens, Serebrenik 2009]
Activities: A, B, C, D, E
Traces: #1 ABCD, #2 ACBD, #3 AED
Candidate place “D must happen before B”: prevents traces #1 and #2, so don’t add place
Candidate place “A must happen before B or E”: allows all traces, so add place
encode as ILP problem
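The two candidate places on this slide can be checked by replaying the traces: a place is admissible only if it never has to hold a negative number of tokens. The following is a minimal sketch of that check, not the ILP Miner itself. It assumes a place is described by two hypothetical sets, `inputs` (activities that put a token into the place) and `outputs` (activities that take one out), and that the place starts empty. The ILP Miner expresses this condition as linear constraints and solves for places instead of testing candidates one by one.

```python
from typing import Iterable, List, Set


def blocked_traces(inputs: Set[str], outputs: Set[str], log: Iterable[str]) -> List[int]:
    """Return the 1-based numbers of the traces that the candidate place blocks."""
    blocked = []
    for number, trace in enumerate(log, start=1):
        tokens = 0  # assumption: the place is initially empty
        for activity in trace:
            if activity in outputs:
                tokens -= 1  # the activity needs a token from the place
                if tokens < 0:
                    blocked.append(number)  # no token available: trace cannot be replayed
                    break
            if activity in inputs:
                tokens += 1  # the activity produces a token into the place
    return blocked


LOG = ["ABCD", "ACBD", "AED"]

# Candidate "D must happen before B": D produces, B consumes.
print(blocked_traces({"D"}, {"B"}, LOG))

# Candidate "A must happen before B or E": A produces, B and E consume.
print(blocked_traces({"A"}, {"B", "E"}, LOG))
```

The first call reports traces 1 and 2 as blocked, so that place is rejected; the second reports none, so that place can be added.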
20. Learning Models with Concurrency: ILP Miner
[Werf, Dongen, Hurkens, Serebrenik 2009]
Petri net (figure): transitions A, B, C, D, E; places start, p1, p2, p3, p4, end
Traces: ABCD, ACBD, AED
Alpha Algorithm: construct places based on binary relations (derived from directly-follows graph)
[Aalst, Weijters, Maruster 2004]
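As an illustration of the binary relations mentioned here, the sketch below derives the directly-follows relation from the three example traces and splits it into the causal and parallel relations used by the alpha algorithm. It stops at the relations; constructing places from them is left out. The function names are hypothetical and chosen for this example only.

```python
from typing import Iterable, Set, Tuple

Pair = Tuple[str, str]


def directly_follows(log: Iterable[str]) -> Set[Pair]:
    """All pairs (a, b) such that b occurs immediately after a in some trace."""
    return {(a, b) for trace in log for a, b in zip(trace, trace[1:])}


def alpha_relations(log: Iterable[str]) -> Tuple[Set[Pair], Set[Pair]]:
    """Split the directly-follows relation into causal and parallel pairs."""
    df = directly_follows(log)
    causal = {(a, b) for (a, b) in df if (b, a) not in df}  # a -> b
    parallel = {(a, b) for (a, b) in df if (b, a) in df}    # a || b
    return causal, parallel


LOG = ["ABCD", "ACBD", "AED"]
causal, parallel = alpha_relations(LOG)
print(sorted(causal))
print(sorted(parallel))
```

For this log, B and C end up in the parallel relation because both orders occur, while every other directly-follows pair is causal.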
21. Precise Semantics and “Messy” Data
Road Traffic Fines Log
ILP Miner: fitting, but complex
Alpha Miner: “unsound” (no proper behavior)
22. Less precise: the Visual Approach
Directly & Eventually Follows Relation:
thresholds for filtering edges + structural simplification
Heuristics Miner
[Agrawal, Gunopulos, Leymann 1998]
[Weijters, Aalst 2001]
Road Traffic Fines Log
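Threshold-based edge filtering can be sketched as follows. The example counts how often one activity directly follows another and keeps only edges whose dependency value reaches a threshold, using the dependency measure commonly given for the Heuristics Miner, (|a>b| - |b>a|) / (|a>b| + |b>a| + 1). The toy log, its frequencies, and the threshold of 0.6 are made up for illustration. The eventually-follows relation and the structural simplification step are not covered.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

Pair = Tuple[str, str]


def df_counts(log: Iterable[str]) -> Counter:
    """Count how often b directly follows a over all traces."""
    counts: Counter = Counter()
    for trace in log:
        counts.update(zip(trace, trace[1:]))
    return counts


def dependency(counts: Counter, a: str, b: str) -> float:
    """Dependency measure: close to 1 if a is mostly followed by b and not vice versa."""
    ab, ba = counts[(a, b)], counts[(b, a)]
    return (ab - ba) / (ab + ba + 1)


def filter_edges(log: Iterable[str], threshold: float) -> Dict[Pair, float]:
    """Keep only directly-follows edges whose dependency value reaches the threshold."""
    counts = df_counts(log)
    edges = {}
    for a, b in list(counts):
        value = dependency(counts, a, b)
        if value >= threshold:
            edges[(a, b)] = value
    return edges


# Hypothetical log: two frequent variants and one rare variant.
LOG = ["ABCD"] * 5 + ["ACBD"] * 5 + ["AED"]
edges = filter_edges(LOG, threshold=0.6)
for (a, b), value in sorted(edges.items()):
    print(f"{a} -> {b}: {value:.2f}")
```

With these numbers, the edges between B and C cancel each other out and the rare path through E falls below the threshold, so only the four frequent causal edges remain.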
23. Many Process Discovery Algorithms…
alpha
ILP
Heuristics
Transition System
Fuzzy
Disco
24. … and the Challenges of Real-Life Data
ILP
Transition System
alpha
Heuristics
Fuzzy
Disco
26. How to get correct models on real data?
Petri net (figure): transitions A, B, C, D, E; places start, p1, p2, p3, p4, end
Past, Present, Open Challenges
27. Quality and Forces in Process Discovery
log
process model
positive examples only
28. Quality and Forces in Process Discovery
[Buijs, van Dongen, Aalst 2014]
log
process model
ensure fitness
generalize
increase precision
simple models
29. The Process Discovery Problem
event log
discover process model
fitting and precise
can rediscover (generalizes)
Simple, Sound, Semantics
Analysis
30. Basic Process Discovery Principle
event log
extract behavioral specification
synthesize process model
process model
45. Inductive Miner
sound, fitting models (+/- filtering)
allows for reliable analysis of behavior
[Leemans, Fahland, Aalst 2013-2015]
Analyze performance: 87 days until fine is sent
46. Combining Process Mining and Data Mining
[Leoni et al 2013]
conditions for choices:
“Appeal to Judge” if amount 36 EUR
60. Process Mining = Discovery + Conformance + Extension + Log Preprocessing + …
event log
discover
model of actual process
model of intended process
check conformance
Deviations between actual and intended process
model of actual process
model of intended process
enriched model
extend
• Filtering
• Clustering
• Activity identification
• Deviation detection
• Partially ordered event data
• Event log visualization
• Database tables
• Database logs
• Event streams
• IoT devices
www.promtools.org
61. Opportunities for Data Mining in Process Mining
Find patterns and contexts
• identify variants
• identify independence/concurrency
• aggregate sets of low-level events to high-level activities
Learn prediction models
• outcomes of a process based on case features
• detect deviations/risks early on
Mine and integrate domain-knowledge
• Identify patterns/variants/views that fit domain expectations
• Enrich models with domain concepts
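To make "identify variants" concrete: a variant is a distinct sequence of activities, and a first analysis step is often to count how many cases follow each variant. The sketch below assumes the log is already available as a list of traces, each a list of activity names; the example traces and the function name are hypothetical.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def variants(log: Sequence[Sequence[str]]) -> List[Tuple[Tuple[str, ...], int]]:
    """Group identical traces and return (variant, case count), most frequent first."""
    return Counter(tuple(trace) for trace in log).most_common()


# Hypothetical log with six cases.
log = [
    ["A", "B", "C", "D"],
    ["A", "C", "B", "D"],
    ["A", "B", "C", "D"],
    ["A", "E", "D"],
    ["A", "B", "C", "D"],
    ["A", "C", "B", "D"],
]

for variant, count in variants(log):
    print(count, " ".join(variant))
```

Sorting by frequency separates the dominant behavior from rare variants, which is a common starting point for filtering or clustering a log.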
62. How to get started?
Get ProM
• www.promtools.org
Get event logs
• Real-life event logs
https://data.4tu.nl/repository/collection:event_logs_real
• Synthetic event logs
https://data.4tu.nl/repository/collection:event_logs_synthetic
Read up on analyses
• Case studies
https://www.win.tue.nl/ieeetfpm/doku.php?id=shared:process_mining_case_studies
• BPI Challenge 2017 (and all previous editions)
https://www.win.tue.nl/bpi/doku.php?id=2017:challenge
Take a free online course on Process Mining
• https://www.coursera.org/learn/process-mining/
• https://www.futurelearn.com/courses/process-mining
• https://www.futurelearn.com/courses/process-mining-healthcare
Check the literature list on the next page
63. Literature
1. Cook, Jonathan E. and Alexander L. Wolf. “Automating Process Discovery through Event-Data Analysis.” 1995 17th International Conference on Software Engineering (1995): 73-73.
2. Cook, Jonathan E. and Alexander L. Wolf. “Discovering Models of Software Processes from Event-Based Data.” ACM Trans. Softw. Eng. Methodol. 7 (1998): 215-249.
3. Cook, Jonathan E. and Alexander L. Wolf. “Event-Based Detection of Concurrency.” (1998). ACM SIGSOFT’98
4. Agrawal, Rakesh, Dimitrios Gunopulos and Frank Leymann. “Mining Process Models from Workflow Logs.” EDBT (1998).
5. Weijters, A J M M and W M P Van Der Aalst. “Process Mining: Discovering Workflow Models from Event-Based Data.” (2001).
6. Maruster, Laura, A. J. M. M. Weijters, Wil M. P. van der Aalst and Antal van den Bosch. “Process Mining: Discovering Direct Successors in Process Logs.” Discovery Science (2002).
7. Aalst, Wil M. P. van der, A. J. M. M. Weijters and Laura Maruster. “Workflow mining: discovering process models from event logs.” IEEE Transactions on Knowledge and Data Engineering 16 (2004): 1128-1142.
8. Jan Martijn E. M. van der Werf, Boudewijn F. van Dongen, Cor A. J. Hurkens, Alexander Serebrenik: Process Discovery using Integer Linear Programming. Fundam. Inform. 94(3-4): 387-412 (2009)
9. Sander J. J. Leemans, Dirk Fahland, Wil M. P. van der Aalst: Discovering Block-Structured Process Models from Event Logs - A Constructive Approach. Petri Nets 2013: 311-329
10. Adriano Augusto, Raffaele Conforti, Marlon Dumas, Marcello La Rosa, Giorgio Bruno: Automated Discovery of Structured Process Models: Discover Structured vs. Discover and Structure. ER 2016: 313-329
11. Wil M. P. van der Aalst, Anna A. Kalenkova, Vladimir A. Rubin, Eric Verbeek: Process Discovery Using Localized Events. Petri Nets 2015: 287-308
12. Cohen, Hila and Shahar Maoz. “The confidence in our k-tails.” ASE (2014).
13. Xixi Lu, Dirk Fahland, Wil M. P. van der Aalst: Interactively Exploring Logs and Mining Models with Clustering, Filtering, and Relabeling. BPM (Demos) 2016: 44-49
14. Xixi Lu, Dirk Fahland: A Conceptual Framework for Understanding Event Data Quality for Behavior Analysis. ZEUS 2017: 11-14
15. Xixi Lu, Dirk Fahland, Frank J. H. M. van den Biggelaar, Wil M. P. van der Aalst: Detecting Deviating Behaviors Without Models. Business Process Management Workshops 2015: 126-139
16. Maikel L. van Eck, Natalia Sidorova, Wil M. P. van der Aalst: Discovering and Exploring State-Based Models for Multi-perspective Processes. BPM 2016: 142-157
17. Massimiliano de Leoni, Marlon Dumas, Luciano García-Bañuelos: Discovering Branching Conditions from Business Process Execution Logs. FASE 2013: 114-129
18. Massimiliano de Leoni, Wil M. P. van der Aalst: Data-aware process mining: discovering decisions in processes using alignments. SAC 2013: 1454-1461
19. Joos C. A. M. Buijs, Boudewijn F. van Dongen, Wil M. P. van der Aalst: Quality Dimensions in Process Discovery: The Importance of Fitness, Precision, Generalization and Simplicity. Int. J. Cooperative Inf. Syst. 23(1) (2014)
20. Arya Adriansyah, Boudewijn F. van Dongen, Wil M. P. van der Aalst: Conformance Checking Using Cost-Based Fitness Analysis. EDOC 2011: 55-64
21. Jorge Munoz-Gama, Josep Carmona: A Fresh Look at Precision in Process Conformance. BPM 2010: 211-226
22. Arya Adriansyah, Jorge Munoz-Gama, Josep Carmona, Boudewijn F. van Dongen, Wil M. P. van der Aalst: Measuring precision of modeled behavior. Inf. Syst. E-Business Management 13(1): 37-67 (2015)