Balancing Expressiveness and Verifiability in Data-Aware Business Processes

Balancing Between
Expressiveness and Veriﬁability
Marco Montali
Free University of Bozen-Bolzano
Data-Aware Business Processes
1

Our Starting Point
Marrying processes and data is a must if we
want to really understand how complex dynamic
systems operate
Dynamic systems of interest:
• business processes
• multiagent systems
• distributed systems
3

Complex Systems Lifecycle
4
picture by Wil van der Aalst

Formal Veriﬁcation
Automated analysis
of a formal model of the system
against a property of interest,
considering all possible system behaviors
5
picture by Wil van der Aalst

Our Thesis
Knowledge representation and  
computational logics  
 
can become a swiss-army knife to 
 
understand data-aware dynamic systems,
and  
provide automated reasoning and veriﬁcation
capabilities along their entire lifecycle
6

PracticeBPMN
Declare
UML YAWL
AUML
FCL
GSM
ORM
CMMN
ACM
Bloom
JADE
Dedalus
E-R
OWL
EPC
JASON
BPEL
SQL
SBVR
+ methodologies
8

Theory
Theoremeorem
Theorem
Theorem
Theorem
Theorem
Theorem
Theorem
Theorem
TheoremTheorem
Theorem
10

Our Approach
1. Develop formal models for data-aware dynamic systems
2. Outline a map of (un)decidability and complexity
3. Find robust conditions for decidability/tractability
4. Bring them back into practice
5. Implement proof-of-concept prototypes
11

Outline: 3 Acts
1. Loneliness
2. Marriage
3. Hate and love
12
? ?

The Three Pillars of Complex Systems
System
ProcessesData Resources
In AI and CS, we know a lot about each pillar!
13

Information Assets
• Data: the main information source about the history
of the domain of interest and the relevant aspects
of the current state of affairs
• Processes: how work is carried out in the domain
of interest, leading to manipulate data
• Resources: humans and devices responsible for
the execution of work units within a process
We focus on the ﬁrst two aspects!
14

State of the Art
• Traditional isolation between processes and data
• Why? To attack the complexity (divide et impera)
• AI has greatly contributed to these two aspects
• Data: knowledge bases, conceptual models,
ontologies, ontology-based data access and
integration, inconsistency-tolerant semantics, …
• Processes: reasoning about actions, temporal/
dynamic logics, situation/event calculus, temporal
reasoning, planning, veriﬁcation, synthesis, …
16

Data/Process Fragmentation
• A business process consists of a set of activities that
are performed in coordination in an organizational and
technical environment [Weske, 2007]
• Activities change the real world
• The corresponding updates are reﬂected into the
organizational information system(s)
• Data trigger decision-making, which in turn determines
the next steps to be taken in the process
• Survey by Forrester [Karel et al, 2009]: lack of
interaction between data and process experts
18

Experts Dichotomy
• BPM professionals: think that data are subsidiary to
processes, and neglect the importance of data quality
• Master data managers: claim that data are the main
driver for the company’s existence, and they only focus
on data quality
• Forrester: in 83/100 companies, no interaction at all
between these two groups
• This isolation propagates to languages and tools,
which never properly account for the process-data
connection
19

Conventional Data Modeling
Focus: revelant entities, relations, static constraints
Supplier ManufacturingProcurement/Supplier
Sales
Customer PO Line Item
Work OrderMaterial PO
*
*
spawns
0..1
Material
But… how do data evolve?
Where can we ﬁnd the “state” of a purchase order?
20

Conventional Process Modeling
Focus: control-ﬂow of activities in response to events
But… how do activities update data?
What is the impact of canceling an order?
21

Do you like Spaghetti?
Manage
Cancelation
ShipAssemble
Manage
Material POs
Decompose
Customer PO
Activities
Process
Data
Activities
Process
Data
Activities
Process
Data
Activities
Process
Data
Activities
Process
Data
Customers Suppliers&CataloguesCustomer POs Work Orders Material POs
IT integration: difﬁcult to manage, understand, evolve
22

Too late to reconstruct the
missing pieces…
• Where are the data?
• Part in the DBs
• Part inside the process engine
• Where shall we model relevant business rules?
• At the DB level? Which DB? How to get all the necessary data?
23
Too late to reconstruct the missing pieces
Where is our data?
part is in the DBs,
part is hidden in the process execution engine.
Where are the relevant business rules, and how are they modeled?
At the DB level? Which DB? How to import the process data?
(Also) in the business model? How to import data from the DBs?
DataProcess
Supplier ManufacturingProcurement/Supplier
Sales
Customer PO Line Item
Work OrderMaterial PO
*
*
spawns
0..1
Determine
cancelation
penalty
Notify penalty
Material
Process Engine
Process State
Business rules
For each work order W
For each material PO M in W
if M has been shipped
add returnCost(M) to penalty

“
How is Research Reacting?
A recent review…
Veriﬁcation typically takes place at the design stage
of a business process type. However, at this stage,
required knowledge about data (database
schema, integrity constraints) is typically not yet
available. Conditions should be explicitly stated
under which the proposed method is applicable.
24

…But There is Hope!
• [Meyer et al, 2011]: data-process integration
crucial to assess the value of processes and
evaluate KPIs
• [Dumas, 2011]: data-process integration crucial to
aggregate all relevant information, and to suitably
inject business rules into the system
• [Reichert, 2012]: “Process and data are just two
sides of the same coin”
25

Business Entities/Artifacts
Data-centric paradigm for process modeling
• First: elicitation of relevant business entities that are
evolved within given organizational boundaries
• Then: deﬁnition of the lifecycle of such entities, and
how tasks trigger the progression within the
lifecycle
• Active research area, with concrete languages
(e.g., IBM GSM, OMG CMMN)
• Cf. EU project ACSI (completed)
26

Finite-State Machines
27
d by the information model?
t?
ds).
me ontology language.

The Conventional, Propositional Case
Process control-ﬂow
(Un)desired property
30

Finite-state
transition
system
Propositional
temporal formula|=
31

Finite-state
transition
system
Propositional
temporal formula|=
Veriﬁcation
via model checking
2007 Turing award:
Clarke, Emerson, Sifakis
32

The Data-Aware Case
34
Process+Data

First-order
temporal formula|=
Process+Data
The Data-Aware Case
Inﬁnite-state, relational
transition system [Vardi 2005] 35

First-order
temporal formula|=
?
The Data-Aware Case
36
Process+Data
Inﬁnite-state, relational
transition system [Vardi 2005]

Why FO Temporal Logics
• To inspect data: FO queries
• To capture system dynamics: temporal
modalities
• To track the evolution of objects: FO
quantiﬁcation across states
• Example: It is always the case that every
order is eventually either cancelled, or
paid and then delivered
37

Problem Dimensions
Data
component
Relational DB Description
logic KB
OBDA system Inconsistency
tolerant KB
…
Process
component
condition-
action rules
ECA-like
rules
Golog program …
Task
modeling
Conditional
effects
Add/delete
assertions
Logic  
programs
…
External
inputs
None External
services
Input DB Fixed input …
Network
topology
Single
orchestrator
Full mesh Connected, ﬁxed
graph
…
Interaction
mechanism
None Synchronous Asynchronous
and ordered
…
38

Data-Centric Dynamic Systems
39
model underlying variants of artifact-centric systems
lly equivalent to the most expressive models for bu
e.g., GSM).
Data Process Data+Process
r: Relational databases / ontologies
schema, specifying constraints on the allowed states

DCDS [PODS 2013, …]
• A “pristine” framework for data-aware process modeling
• Data layer: relational DB with constraints
• Schema: constraints on the data
• Instance: state of the DCDS
• Fixed initial instance
• Inﬁnite data domain with a ﬁnite subset for the “initial data”
• Process layer
• Atomic actions
• Condition-action rules for action executability
40

More on Actions
• Actions have parameters
• Bound to corresponding values for execution
• Actions have conditional “CRUD” effects (a là ADL)
• IF-THEN rules
• IF part: query over the current DB
• THEN part: ADD/DELETE tuples, using
• Action parameters
• Results to the IF query (bulk interpretation)
• Service calls to account for new data
41

Example: User Cart
42
Customer
Id …
Product
Id …
InCart
BarCode Customer Product

User Cart Actions
• Any customer may decide to insert a new item of a
given product into her cart 
 
• Any customer may empty her own cart
43
9~y, ~z.Customer(c, ~y) ^ Product(p, ~z) 7! AddToCart(c, p)
9~y.Customer(c, ~y) 7! EmptyCart(c)

User Cart Actions
• Adding to a cart…
• Emptying a cart…
44
AddToCart(c, p) :
true add{InCart(getBarCode(p), c, p)}
EmptyCart(c) :
InCart(b, c, p) del{InCart(b, c, p)}

Execution Semantics
Relational transition systems with database-labeled states 
 
Successors constructed considering all possible ground
executable actions and all possible service call results 
(s.t. the resulting state satisﬁes the schema constraints)
…45

…
Sources of Inﬁnity
47
Inﬁnite-branching  
due to external input

…
Sources of Inﬁnity
48
Runs visiting inﬁnitely many DBs  
due to usage of external input

Our Goal
50
First-order
temporal formula
|=
Inﬁnite-state
transition system

Our Goal
51
First-order
temporal formula
|=
Inﬁnite-state
transition system
|=
Finite-state
abstraction
Propositional
temporal formula
‘

Our Goal
52
First-order
temporal formula
|=
Inﬁnite-state
transition system
|= Propositional
temporal formula
‘
If and only if
Finite-state
abstraction

Which logics?
• First-order temporal logics with active domain
quantiﬁcation
• Branching-time: μLa 
• Linear-time: LTL-FOa 
• Only initial constants can explicitly appear in the formulae
• Corresponding notions of bisimulation/trace equivalence
53
G(8s.Student(s) ! F(Retired(s) _ Graduated(s))
AG(8o.Order(o) ^ Status(o, open) ! EF(mathitStatus(o, shipped))

The Good…
DCDS are:
• Markovian: Next state only depends on the
current state + input.  
Two states with identical DBs are bisimilar.
• Generic: FO/SQL (as all query languages) does
not distinguish structures which are identical
modulo uniform renaming of data objects.
—> Two isomorphic states are bisimilar
54

…the Bad…
55
What is a DCDS? 
A Turing machine
I.e., You are doomed to
undecidability, even for

…the Bad…
56
What is a DCDS? 
A ﬁnite-state control process
operating over an unbounded
memory.
So what is it?
A Turing machine

…the Bad…
57
What is a DCDS?
A ﬁnite-state control process
operating over an unbounded
memory.
So what is it?
A Turing machine
propositional reachability!

…and the Ugly
Simulation of a 2-counter Minsky machine
• Counter —> “size” of a unary relation
• Test counter1 for zero: check whether counter relation is empty
• What matters is the # of tuples, not the actual values
• Can be reconstructed also without negation in the queries
58
New
Increment Decrement

State-Boundedness
Put a pre-defined bound on the DB size
• Resulting transition system: still infinite-state
(even in the 1-bounded case)
• But: infinitely-many encountered values along a
run cannot be “accumulated” in a single state
59

Effect of state-boundedness
60
General State-bounded
μLa undecidable
LTL-FOa undecidable

61
μLa undecidable
decidable  
abstraction must  
depend on the formula!
LTL-FOa undecidable undecidable

62
μLa undecidable
decidable  
Reason?
FO quantiﬁcation
across states.

63
μLa undecidable
decidable  
Reason?
Logic unable to isolate
single runs/computations

Logics with
Persistent Quantiﬁcation
• Intuition: control the ability of the logic to quantify
across states
• Only objects that persist in the active domain of
some node can be tracked
• When an object is lost, the formula trivializes to true
or false
• E.g.: “guarded” until
Persistence-Preserving µ-calculus (µLP)
In some cases, objects maintain their identity only if they persist in th
active domain (cf. business artifacts and their IDs).
. . .
StudId : 123
. . .
StudId : 123
. . .dismiss(123) newStud()
ID() = 123
µLA
µLFOG(8s.Student(s) ! Student(s)U(Retired(s) _ Graduated(s)))
64

Effect of Persistent
Quantiﬁcation
65
μLa undecidable
decidable  
μLp
LTL-FOp

Quantiﬁcation
66
μLa undecidable
decidable  
μLp undecidable
LTL-FOp undecidable

Quantiﬁcation
67
μLa undecidable
decidable  
μLp undecidable
decidable
formula-independent
abstraction!
LTL-FOp undecidable
decidable
formula-independent
abstraction!

Pruning Infinite-Branching
• Consider a system snapshot and its node DBs
• Input is bounded —> only boundedly many
isomorphic types relating the input objects and
those in the DDS active domain
• Input configurations in the same isomorphic
type produce isomorphic snapshots
• Keep only one representative successor
state per isomorphic type
• The “pruned” transition system is finite-
branching and bisimilar to the original one
68

Example
• Input: single (new?) value
• Current state: two objects
a,b
a
b
c
de
69

Example
• Input: single (new?) value
• Current state: two objects
a,b
a
b
c
70

Compacting Inﬁnite Runs
• Key observation: due to persistent quantiﬁcation, the
logic is unable to distinguish local freshness from
global freshness
• So we modify the transition system construction:  
whenever we need to consider a fresh representative
object…
• … if there is an old object that can be recycled  
—> use that one
• … if not —> pick a globally fresh object
• This recycling technique preserves bisimulation!
71

Compacting Inﬁnite Runs
• [Calvanese et al, 2013]: if the system is size-
bounded, the recycling technique reaches a
point were no new objects are needed 
—> ﬁnite-state transition system
• N.B.: the technique does not need to know
the value of the bound
72

Is My System State-
Bounded?
• For a ﬁxed k: decidable.
• For “some” bound: undecidable.
• Studied sufﬁcient, syntactic conditions in the
style of “weak acyclicity” in data exchange
• See PODS 2013, KR 2014
74

How to Make my System
State-Bounded?
• Different methodologies under investigation
• See in particular the work on BAUML (artifact-
centric UML processes)
• Bounds obtained by suitably “constraining” the
UML diagrams representing the data
75

What if I have “Cases”?
• Each case may reach boundedly many values in
each state
• If cases “interact”: also need to bound how many
cases are simultaneously present
• If they do not interact: unboundedly many cases
can coexist
• See [STTT 2016]
76

Conclusion
Marriage between processes and data is challenging,
though necessary
• State-boundedness is a robust condition towards the effective
veriﬁability of such systems
• The same results hold by enriching the data component with
knowledge (ontologies, inconsistency-tolerance, …) 
[RR 2012, ECAI 2012, IJCAI 2013, IJCAI 2015, JAIR 2013]
• The same results hold by changing the query language (Datalog,
…)
• Complexity: exponential in the “data that can be changed”
• Same formal model for execution and veriﬁcation
77

Current lines
• Emphasis on other reasoning services: monitoring, planning,
adversarial synthesis [RR 2013, IJCAI 2015, IJCAI 2016]
• Connections with several modeling languages [ICSOC 2013,
CIKM 2014, FAOC 2016]
• Distributed and multi agent systems [AAMAS 2014, AAAI
2015]
• Relaxations of size-boundedness, with the help of
• Parameterized veriﬁcation
• Veriﬁcation via underapproximation [PODS 2016]
78

Current and Future Work
• Implementations, leveraging the long-standing
literature in data management and formal verification
• Controlling state-boundedness is important
• Domain-specific case studies… how do they relate to
our approach?
• Relaxations of size-boundedness, with the help of
• Parameterized verification
• Verification via underapproximation [PODS 2016]
79

Acknowledgments
All coauthors of this research,  
in particular  
 
Babak Bagheri-Hariri (did PhD @ UNIBZ) 
Diego Calvanese (UNIBZ) 
Giuseppe De Giacomo (UNIROMA) 
Alin Deutsch (UCSD) 
Fabio Patrizi (UNIBZ) 
Ario Santoso (did PhD @ UNIBZ)
80

Balancing Expressiveness and Verifiability in Data-Aware Business Processes

Recommended

Recommended

More Related Content

What's hot

What's hot (15)

Similar to Balancing Expressiveness and Verifiability in Data-Aware Business Processes

Similar to Balancing Expressiveness and Verifiability in Data-Aware Business Processes (20)

More from Faculty of Computer Science - Free University of Bozen-Bolzano

More from Faculty of Computer Science - Free University of Bozen-Bolzano (20)

Recently uploaded

Recently uploaded (20)

Balancing Expressiveness and Verifiability in Data-Aware Business Processes