First part of my tutorial (jointly with Diego Calvanese) on the "Integrated Modeling and Verification of Processes and Data", held at the BPM 2017 conference. The first part focuses on the motivation for combining processes and data, and a review of the state of the art.
BPM2017 - Integrated Modeling and Verification of Processes and Data Part 1: Introduction
1. Integrated Modelling and Verification
of Processes and Data
Diego Calvanese Marco Montali
{calvanese,montali}@inf.unibz.it
Free University of Bozen-Bolzano
BPM 2017
2. Marrying processes and data
is extremely challenging….
… but is a must
if we want to really understand
how complex dynamic systems operate.
2
3. Two Questions
How to formally and conceptually
account for the process+data interplay?
How to verify such BPMs?
N.B.: modeling and verification go side-by-side
3
4. Two Questions
How to formally and conceptually
account for the process+data interplay?
How to verify such BPMs?
N.B.: modeling and verification go side-by-side
4
Business
Turing
Machines
BTMs
6. Outline
Part 1
• Introduction and motivation: why processes + data
• A quick tour through the literature and integrated models
Part 2
• The framework of Data-Centric Dynamic Systems
• Verification results
Part 3
• Connection to concrete integrated models and systems
• Concluding remarks
6
7. Information Assets
• Data: the main information source about the
history of the domain of interest and the
relevant aspects of the current state of affairs
• Processes: how work is orchestrated in the
domain of interest, so as to create value
• Resources: humans and devices responsible
for the execution of work units within a
process
7
9. Is this Synergy Reflected by BP
Methods and Models?
Survey by Forrester [Karel et al, 2009]: lack of interaction
between data and process experts.
• BPM professionals: data are subsidiary to processes
• Master data managers: data are the main driver for the
company’s existence
• 83/100 companies: no interaction at all between these
two groups
• This isolation propagates to models, languages and tools
9
15. 1. Customer PO
2. order decomposition
Material PO
Line item
Customer PO
15
16. 3. Selection and
interaction with suppliers
1. Customer PO
2. order decomposition
Material PO
Line item
Customer PO
16
17. 3. Selection and
interaction with suppliers
1. Customer PO
2. order decomposition
Material PO
Line item
Customer PO
17
18. 3. Selection and
interaction with suppliers
1. Customer PO
2. order decomposition
Material PO
Line item
Customer PO
4. material assembly18
19. 3. Selection and
interaction with suppliers
1. Customer PO
2. order decomposition
Material PO
Line item
Customer PO
4. material assembly
5. Shipment
19
20. Observations
• A complex process, where the company acts as an
intermediate hub between customers and suppliers
• Happy path
1) The customer issues a purchase order
2) The ordered material is obtained from suppliers
3) The material is shipped, possibly using different packages
• One exceptional path (in general, there are many):
1) The customer cancels the order
2) A cancelation policy is applied to calculate a penalty
20
21. Conventional Data Modeling
Focus: revelant entities, relations, static constraints
Supplier ManufacturingProcurement/Supplier
Sales
Customer PO Line Item
Work OrderMaterial PO
*
*
spawns
0..1
Material
But… how do data evolve?
Where can we find the “state” of a purchase order?
21
UML class diagram
22. Conventional Process Modeling
Focus: control-flow of activities in response to events
But… how do activities update data?
What is the impact of canceling an order?
22
BPMN
collaborative
process
24. Do you like Spaghetti?
Manage
Cancelation
ShipAssemble
Manage
Material POs
Decompose
Customer PO
Activities
Process
Data
Activities
Process
Data
Activities
Process
Data
Activities
Process
Data
Activities
Process
Data
Customers Suppliers&CataloguesCustomer POs Work Orders Material POs
IT integration: difficult to manage, understand, maintain
24
25. Too Late!
• Where are the data?
• Where shall we model relevant business rules?
• Consider an order cancelation policy that needs to check
which material has been already shipped towards
determining the customer penalty…
25
o late to reconstruct the missing pieces
Where is our data?
part is in the DBs,
part is hidden in the process execution engine.
Where are the relevant business rules, and how are they modeled?
At the DB level? Which DB? How to import the process data?
(Also) in the business model? How to import data from the DBs?
DataProcess
Supplier ManufacturingProcurement/Supplier
Sales
Customer PO Line Item
Work OrderMaterial PO
*
*
spawns
0..1
Determine
cancelation
penalty
Notify penalty
Material
Process Engine
Process State
Business rules
For each work order W
For each material PO M in W
if M has been shipped
add returnCost(M) to penalty
32. Case and Persistent Data
Review
Request
Fill Reim-
bursement
Review Reim-
bursement
Rejected
Accepted
req info result reimbursement
personal
info
32
36. A General Recipe
• Explicit control-flow
• Local, case data
• Global, persistent data
• Queries/updates on the persistent data
• External inputs
• Internal generation of fresh IDs
36
“REAL” PROCESS
37. Cooking with
Standard Process Languages
• Explicit control-flow
• Local, case data
• Global, persistent data
• Queries/updates on the persistent data
• External inputs
• Internal generation of fresh IDs
37
BPMN
~
~
38. Business Process
A set of logically related tasks performed to achieve a defined
business outcome for a particular customer or market.
(Davenport, 1992)
A collection of activities that take one or more kinds of input and
create an output that is of value to the customer.
(Hammer & Champy, 1993)
A set of activities performed in coordination in an organizational
and technical environment. These activities jointly realize a
business goal.
(Weske, 2011)
38
39. Business Process
A set of logically related tasks performed to achieve a defined
business outcome for a particular customer or market.
(Davenport, 1992)
A collection of activities that take one or more kinds of input and
create an output that is of value to the customer.
(Hammer & Champy, 1993)
A set of activities performed in coordination in an organizational
and technical environment. These activities jointly realize a
business goal.
(Weske, 2011)
39
Task logic:
tightly intertwined
with data updates!
42. Business Entities/Artifacts
Data-centric paradigm for process modeling
• First: elicitation of relevant business entities that are
evolved within given organizational boundaries
• Then: definition of the lifecycle of such entities, and
how tasks trigger the progression within the
lifecycle
• Active research area, with concrete languages
(e.g., IBM GSM, OMG CMMN)
• Cf. EU project ACSI (completed)
42
47. Chimera
47
Mathias Weske – Novel Challenges in BPM Research
Mathias Weske – Novel Challenges in BPM Research 11
Object Lifecycles
Mathias Weske – Novel Challenges in BPM Research 12
Process Fragments
Mathias Weske – Novel Challenges in BPM Research
Process Fragments
Process Fragments
48. Cooking with Business Entities
• Explicit control-flow
• Local, case data
• Global, persistent data
• Queries/updates on the persistent data
• External inputs
• Internal generation of fresh IDs
48
ARTIFACT-/OBJECT-CENTRIC PROCESSES
~
~
~
~
51. Colored Petri Nets
51
80 4 Formal Definition of Non-hierarchical Coloured Petri Nets
k
if n=k
then k+1
else k
k
data
n
n if success
then 1`n
else empty
n
if n=k
then k+1
else k
(n,d)(n,d)
n
if n=k
then data^d
else data
(n,d)
if success
then 1`(n,d)
else empty
(n,d)
Receive
Ack
Transmit
Ack
Receive
Packet
Transmit
Packet
Send
Packet
NextRec
1`1
NO
C
NO
D
NO
A
NOxDATA
NextSend
1`1
NO
Data
Received
1`""
DATA
B
NOxDATA
Packets
To Send
AllPackets
NOxDATA
11`3
4
1`(1,"COL")++
2`(2,"OUR")++
1`(3,"ED ")
1 1`3
11`"COLOUR"
6
1`(1,"COL")++
3`(2,"OUR")++
2`(3,"ED ")
6
1`(1,"COL")++
1`(2,"OUR")++
1`(3,"ED ")++
1`(4,"PET")++
1`(5,"RI ")++
1`(6,"NET")
Fig. 4.1 Example used to illustrate the formal definitions
colset NO = int;
colset DATA = string;
colset NOxDATA = product NO * DATA;
colset BOOL = bool;
No conceptual representation of persistent storage
52. Recipe?
• Explicit control-flow
• Local, case data
• Global, persistent data
• Queries/updates on the persistent data
• External inputs
• Internal generation of fresh IDs
52
COLORED PETRI NETS
implicit, or using
fresh variables
55. Formal Verification
Automated analysis
of a formal model of the system
against a property of interest,
considering all possible system behaviors
55
picture by Wil van der Aalst
56. Formal Verification
The Conventional, Propositional Case
Process control-flow
(Un)desired property
56
Abstract model underlying variants of artifact-centric systems.
Semantically equivalent to the most expressive models for business proc
systems (e.g., GSM).
Data Process Data+Process
Data Layer: Relational databases / ontologies
Data schema, specifying constraints on the allowed states
Data instance: state of the DCDS
Process Layer: key elements are
Atomic actions
Condition-action-rules for application of actions
Service calls: communication with external environment, new data!
alvanese (FUB) Foundations of Data-Aware Process Analysis INRIA Saclay Paris – 18/3/2016
57. (Un)desired property
Finite-state
transition
system
Propositional
temporal formula|=
Formal Verification
The Conventional, Propositional Case
Process control-flow
57
Abstract model underlying variants of artifact-centric systems.
Semantically equivalent to the most expressive models for business proc
systems (e.g., GSM).
Data Process Data+Process
Data Layer: Relational databases / ontologies
Data schema, specifying constraints on the allowed states
Data instance: state of the DCDS
Process Layer: key elements are
Atomic actions
Condition-action-rules for application of actions
Service calls: communication with external environment, new data!
alvanese (FUB) Foundations of Data-Aware Process Analysis INRIA Saclay Paris – 18/3/2016
58. (Un)desired property
Finite-state
transition
system
Propositional
temporal formula|=
Formal Verification
The Conventional, Propositional Case
Process control-flow
58
Verification
via model checking
2007 Turing award:
Clarke, Emerson, Sifakis
Abstract model underlying variants of artifact-centric systems.
Semantically equivalent to the most expressive models for business proc
systems (e.g., GSM).
Data Process Data+Process
Data Layer: Relational databases / ontologies
Data schema, specifying constraints on the allowed states
Data instance: state of the DCDS
Process Layer: key elements are
Atomic actions
Condition-action-rules for application of actions
Service calls: communication with external environment, new data!
alvanese (FUB) Foundations of Data-Aware Process Analysis INRIA Saclay Paris – 18/3/2016
59. (Un)desired property
Formal Verification
The Data-Aware Case
59
Data-aware process
el underlying variants of artifact-centric systems.
quivalent to the most expressive models for business process
GSM).
Data Process Data+Process
elational databases / ontologies
ma, specifying constraints on the allowed states
nce: state of the DCDS
key elements are
tions
action-rules for application of actions
alls: communication with external environment, new data!
Foundations of Data-Aware Process Analysis INRIA Saclay Paris – 18/3/2016 (24/1)
60. (Un)desired property
First-order
temporal formula|=
Formal Verification
The Data-Aware Case
Infinite-state, relational
transition system [Vardi 2005] 60
el underlying variants of artifact-centric systems.
quivalent to the most expressive models for business process
GSM).
Data Process Data+Process
elational databases / ontologies
ma, specifying constraints on the allowed states
nce: state of the DCDS
key elements are
tions
action-rules for application of actions
alls: communication with external environment, new data!
Foundations of Data-Aware Process Analysis INRIA Saclay Paris – 18/3/2016 (24/1)
Data-aware process
61. (Un)desired property
First-order
temporal formula|=
?
Formal Verification
The Data-Aware Case
61
Infinite-state, relational
transition system [Vardi 2005]
el underlying variants of artifact-centric systems.
quivalent to the most expressive models for business process
GSM).
Data Process Data+Process
elational databases / ontologies
ma, specifying constraints on the allowed states
nce: state of the DCDS
key elements are
tions
action-rules for application of actions
alls: communication with external environment, new data!
Foundations of Data-Aware Process Analysis INRIA Saclay Paris – 18/3/2016 (24/1)
Data-aware process
62. Why FO Temporal Logics
• To inspect data: FO queries
• To capture system dynamics: temporal
modalities
• To track the evolution of objects: FO
quantification across states
• Example: It is always the case that every
order is eventually either cancelled, or paid
and then delivered
• N.B.: the interplay between FO quantification
and temporal modalities is quite subtle!
62
64. Dimension 1
Static Information Model
How are data structured?
• Propositional symbols —> Finite state system
• Fixed number of values from an unbounded domain
• Full-fledged database:
• relational database
• tree-structured data, XML
• graph-structured data
64
65. Dimension 1
Static Information Model
Are constraints present? How are they interpreted?
• Complete data
• Data under incomplete information
• ontology (with intensional part typically fixed)
• full-fledged ontology-based data access system
• Hard vs soft-constraints (inconsistency-tolerance)
65
66. Dimension 2
Dynamic Component
• Implicit representation of time vs. implicit progression
mechanism vs. explicit process
• When an explicit process is present:
• how is the process dynamics represented?
• procedural vs. declarative approaches (e.g., finite state
machines vs. rule-based)
• Deterministic vs. non-deterministic behaviour
• Linear time vs. branching time model
• Finite vs. infinite traces
66
67. Dimension 3
Data-Process Interaction
How are data manipulated by the process?
• Data is only accessed, but not modified
• Data are updated, but no new values are inserted
• Full-fledged combination of the temporal and
structural dimensions
• Hybrid approaches (e.g., read-only database + read-
write registers)
67
68. Dimension 4
Interaction with the Environment
Is the system interacting with the external world?
• Closed systems vs. bounded input vs. unbounded
input
• Synchronous vs. asynchronous communication
• Message passing, possibly with queues
• One-way or two-way service calls
68
69. Dimension 4
Interaction with the Environment
Which parts of the environment are fixed? Which
change?
• Stateless vs stateful environment
• Fixed database vs. varying database vs. varying
portion of data
• Multiple devices/agents interacting with each other
• Fixed vs changing topologies
69
70. Dimension 5
Formal Analysis
How are (un)desired properties formulated?
• Analysis of fundamental properties: reachability,
absence of deadlock, boundedness, (weak)
soundness
• Analysis of arbitrary formulae in some temporal
logic
• Analysis of properties with queries across the
temporal dimension (in the style of temporal DBs)
70
71. Dimension 5
Formal Analysis
Which forms of analysis?
• Verification
• Dominance, simulation, equivalence
• Synthesis from a given specification
• Composition of available components
71
72. 72
1) Go to the essential
2) Find boundaries of decidability
in a general setting
3) Understand the connection with
concrete languages
4) Implement