Incremental pattern matching in the VIATRA2 model transformation framework
1. Incremental pattern matching in the VIATRA2 model transformation framework
István Ráth (rath@mit.bme.hu)
Budapest University of Technology and Economics
Department of Measurement and Information Systems
2. Overview
Introduction to GT
o from the VIATRA2 perspective
Incremental pattern matching with RETE
Initial performance analysis
o Incremental vs. local search
Fine tuning
o Parallelization
o Hybrid pattern matching
Ongoing research / future work
Summary
4. MDA, DSM in practice
[Figure: MDA layers. Modeling and re-engineering produce the PIM and domain-specific models; below them are the platform-specific models (CORBA model, J2EE model, embedded platform model); at the bottom is the application code (CORBA application, J2EE application, embedded application, legacy code).]
BME‐MIT Miniszimpózium
5. MDA, DSM in practice
[Figure: the same MDA layers (PIM and domain-specific models, platform-specific models, application code), extended with domain-specific views and annotated with the applications of VIATRA2.]
6. Metamodeling
[Figure: Petri net metamodel and an instance model. Metamodel: classes Place, Token and Transition; associations tokens, inarc and outarc (slots), with multiplicity constraints ranging from "at most one" to "arbitrary". Instance model: objects p1:Place, p3:Place, t1:Transition, t2:Transition and to1:Token, connected by links a1:inarc, a2:outarc, a3:inarc, a4:outarc and tkn3:tokens.]
ICGT '08
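7. Graph Transformation
[Figure: a graph transformation rule. LHS: Place, Transition, Place connected by a1:inarc and a2:outarc, with a Token attached through tkn1:tokens. RHS: the same places and transition, with a Token attached through tkn2:tokens. Matching binds the LHS; updating produces the RHS.]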
Phases of GT
– Pattern matching phase
– Updating phase: delete + create
Pattern matching is the most critical phase from a performance viewpoint
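The two phases can be illustrated with a minimal sketch. It assumes a toy Petri net in which each transition has exactly one input and one output place; the function names and the dict-based model are illustrative only and are not VIATRA2 API.

```python
# Toy model (assumption): places hold token counts, and each transition
# maps to a single (input place, output place) pair.

def match_fireable(places, transitions):
    """Pattern matching phase: find transitions whose input place holds a token."""
    return [
        (name, src, dst)
        for name, (src, dst) in sorted(transitions.items())
        if places[src] > 0
    ]

def apply_fire(places, match):
    """Updating phase: delete the token at the source, create one at the target."""
    _, src, dst = match
    places[src] -= 1  # delete
    places[dst] += 1  # create

places = {"p1": 1, "p2": 0}
transitions = {"t1": ("p1", "p2")}

matches = match_fireable(places, transitions)
apply_fire(places, matches[0])
```

Note that match_fireable rescans the whole model on every call; this is the cost that incremental matching tries to avoid.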
8. Incremental model transformations
Key usage scenarios for MT:
o Mapping between languages
o Intra‐domain model manipulation
• Model execution
• Validity checking (constraint evaluation)
They work with evolving models.
o Users are constantly changing/modifying them.
o Users usually work with large models.
Problem: transformations are slow
o To execute… (large models)
o and to re‐execute again and again (always starting from scratch).
Solution: incrementality
o Take the source model, and its mapped counterpart;
o Use the information about how the source model was changed;
o Map and apply the changes (but ONLY the changes) to the target model.
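As a rough sketch of this idea, the snippet below propagates a list of source changes to a target model through a correspondence (trace) map, without regenerating untouched elements. The class-to-table mapping, the "T_" naming scheme and the change tuples are assumptions made for illustration.

```python
def apply_changes(tables, trace, changes):
    """Propagate source-side changes to the target, touching only affected elements.

    tables:  set of table names in the target model
    trace:   correspondence model, class name -> table name
    changes: list of ("add", cls), ("delete", cls) or ("rename", old, new)
    """
    for change in changes:
        kind = change[0]
        if kind == "add":
            cls = change[1]
            trace[cls] = "T_" + cls
            tables.add(trace[cls])
        elif kind == "delete":
            tables.discard(trace.pop(change[1]))
        elif kind == "rename":
            old, new = change[1], change[2]
            tables.discard(trace.pop(old))
            trace[new] = "T_" + new
            tables.add(trace[new])
    return tables

trace = {"Order": "T_Order", "Item": "T_Item"}
tables = {"T_Order", "T_Item"}
apply_changes(
    tables,
    trace,
    [("rename", "Item", "Product"), ("add", "Customer"), ("delete", "Order")],
)
```

The work done is proportional to the number of changes, not to the size of the source model.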
9. Towards incrementality
How to achieve incrementality?
o Incremental updates: avoid re‐generation.
• Don’t recreate what is already there.
• Use reference (correspondence) models.
o Incremental execution: avoid re‐computation.
• Don’t recalculate what was already computed.
• How?
11. Incremental graph pattern matching
Graph transformations require pattern matching
o Most expensive phase
Goal: retrieve the matching set quickly
How?
o Store (cache) matches
o Update them as the model changes
• Update precisely
Expected results: good, if…
o There is enough memory (*)
o Queries are dominant
o Model changes are relatively sparse (**)
o e.g. synchronization, constraint evaluation, …
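12. Operational overview
[Figure: the XForm interpreter issues pattern matching requests to the incremental pattern matcher and performs model manipulation on the VIATRA model space; the model space sends event notifications to the incremental pattern matcher, which updates its stored matches.]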
13. Architecture
[Figure: VIATRA2 Framework architecture. Components: model parser, XForm parser, XML serializer, XForm interpreter, LS pattern matcher, incremental pattern matcher, native importer & loader interface. They sit on the core interfaces, above the VIATRA model space and the program model store.]
14. Core idea: use RETE nets
RETE network
o node: (partial) matches of a (sub)pattern
o edge: update propagation
Demonstrating the principle
o input: Petri net
o pattern: fireable transition
o Model change: new transition (t3)
[Figure: RETE network built over the model space. Input nodes: Place (p1, p2, p3), Token (k1, k2), Transition (t1, t2, t3). Intermediate nodes hold partial matches such as (p1, k1), (p2, k2), (p1, k1, t1) and (p2, k2, t3). The production node holds complete matches such as (p1, k1, t1, p3) and (p2, k2, t2, p3). The newly added t3 travels along the edges as an update.]
15. RETE network construction
Key: pattern decomposition
o Pattern = set of constraints (defined over pattern variables)
o Types of constraints: type, topology (source/target), hierarchy (containment), attribute value, generics (instanceOf/supertypeOf), injectivity, [negative] pattern calls, …
Construction algorithm (roughly)
o 1. Decompose the pattern into elementary constraints (*)
o 2. Process the elementary constraints and connect them with appropriate intermediate nodes (JOIN, MINUS‐JOIN, UNION, …)
o 3. Create terminator production node
16. Key RETE components
JOIN node
o ~relational algebra: natural join
MINUS‐JOIN
o Negative existence (NACs)
[Figure: example network in which INPUT nodes feed JOIN nodes, whose result reaches the PRODUCTION node (sourcePlace).]
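The semantics of the two node types can be sketched as plain (non-incremental) relational operations over partial matches. Representing a partial match as a dict from variable name to model element is an assumption made for this illustration.

```python
def natural_join(left, right):
    """JOIN: combine tuples that agree on every shared variable."""
    out = []
    for l in left:
        for r in right:
            if all(l[v] == r[v] for v in l.keys() & r.keys()):
                out.append({**l, **r})
    return out

def minus_join(left, right):
    """MINUS-JOIN: keep left tuples with NO compatible partner on the right (NAC)."""
    out = []
    for l in left:
        blocked = any(
            all(l[v] == r[v] for v in l.keys() & r.keys()) for r in right
        )
        if not blocked:
            out.append(l)
    return out

arcs = [{"place": "p1", "trans": "t1"}, {"place": "p2", "trans": "t2"}]
tokens = [{"place": "p1", "token": "k1"}]

with_token = natural_join(arcs, tokens)
without_token = minus_join(arcs, tokens)
```

A real RETE node additionally keeps its inputs in memory so that it can react to single changes instead of recomputing the whole join.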
17. Supporting a rich pattern language
Pattern calls
o Simply connect the production nodes
o Pattern recursion is fully supported
OR‐patterns
o UNION intermediate nodes
Check conditions
o check (value(X) % 5 == 3)
o check (length(name(X)) < 4)
o check (myFunction(name(X))!=‘myException’)
o Filter and term evaluator nodes
Result: full VIATRA transformation language support; any pattern can be matched incrementally.
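A filter node for check conditions can be pictured as a predicate applied to the matches flowing through it. The sketch below mirrors the first two check expressions above; the dict-based match representation and the sample data are assumptions.

```python
def filter_node(predicate, matches):
    """Pass on only those matches for which the check condition holds."""
    return [m for m in matches if predicate(m)]

matches = [
    {"name": "p1", "value": 8},
    {"name": "place2", "value": 13},
    {"name": "p3", "value": 4},
]

# check (value(X) % 5 == 3)
by_value = filter_node(lambda m: m["value"] % 5 == 3, matches)
# check (length(name(X)) < 4), applied to the output of the previous filter
by_both = filter_node(lambda m: len(m["name"]) < 4, by_value)
```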
18. Updates
Needed when the model space changes
VIATRA notification mechanism (EMF is also possible)
o Transparent: user modification, model imports, results of a transformation, external modification, … RETE is always updated!
Input nodes receive elementary modifications and release an update token
o Represents a change in the partial matching (+/‐)
Nodes process updates and propagate them if needed
o PRECISE update mechanism
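The update mechanism can be sketched with a single join node that stores both of its inputs and reacts to signed update tokens. The class below is a simplified illustration under assumed names, joining on one shared variable; it is not the VIATRA2 implementation.

```python
class JoinNode:
    """Joins two inputs on one shared variable and keeps both sides in memory."""

    def __init__(self, key):
        self.key = key
        self.memory = {"left": [], "right": []}
        self.matches = []  # stands in for a downstream production node

    def update(self, side, sign, tup):
        """Receive an update token on one side; sign is '+' or '-'."""
        other = "right" if side == "left" else "left"
        if sign == "+":
            self.memory[side].append(tup)
        else:
            self.memory[side].remove(tup)
        # Only partners of the changed tuple are visited: the update is precise.
        for partner in self.memory[other]:
            if partner[self.key] == tup[self.key]:
                joined = {**tup, **partner}
                if sign == "+":
                    self.matches.append(joined)
                else:
                    self.matches.remove(joined)

node = JoinNode("place")
node.update("left", "+", {"place": "p1", "trans": "t1"})
node.update("right", "+", {"place": "p1", "token": "k1"})
after_insert = list(node.matches)
node.update("right", "-", {"place": "p1", "token": "k1"})
after_delete = list(node.matches)
```

Querying node.matches costs nothing beyond reading the list, while every model change pays for the propagation.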
20. Performance
In theory…
o Building phase is expensive (“warm‐up”)
• How expensive?
o Once the network is built, pattern matching is an “instantaneous” operation.
• Excluding the linear cost of reading the result set.
o But… there is a performance penalty on model manipulation.
• How much?
Dependencies?
o Pattern size
o Matching set size
o Model size
o …?
21. Benchmarking
Aim:
o systematic and reproducible measurements
o on performance
o under different and precisely defined circumstances
Overall goal:
o help transformation engineers in selecting tools, fine tuning options
o Serve as reference for future research
Popular approach in different fields
o AI
o relational databases
o rule‐based expert systems
22. Benchmarking in graph transformation
Specification examples for GT
o Goal: assessing expressiveness
o UML‐to‐XMI, object‐relational mapping, UML‐to‐EJB, etc.
“Generic” performance benchmarks for GT
o Varró benchmark
o R. Geiß and M. Kroll: On Improvements of the Varró Benchmark for Graph Transformation Tools
o (AGTIVE Tool Contest, GraBaTs ’08, …)
Our ICGT’08 paper: Benchmarks for graph transformation with incremental pattern matching
o simulation
o synchronization
23. Petri net simulation benchmark
Example transformation: Petri net simulation
o One complex pattern for the enabledness condition
o Two graph transformation rules for firing (tokens are re‐created)
o As‐long‐as‐possible (ALAP) style execution (“fire at will”)
o Model graphs:
• A “large” Petri net actually used in a research project (~60 places, ~70 transitions, ~300 arcs)
• Scaling up: automatic generation preserving liveness (up to 100000 places, 100000 transitions, 500000 arcs)
Analysis
o Measure execution time (average multiple runs)
o Take “warm‐up” runs into consideration
Profiling
o Measure overhead, network construction time
o “Normalize” results
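The measurement approach (averaging several runs after discarding warm-up runs) can be sketched with a small harness. The function name, the default run counts and the millisecond unit are assumptions; the slides do not say how the original measurements were scripted.

```python
import time

def measure(run, warmup=2, repeats=5):
    """Return the average wall-clock time of run() in ms, ignoring warm-up runs."""
    for _ in range(warmup):
        run()  # warm-up: caches, JIT, network construction, ...
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run()
        samples.append((time.perf_counter() - start) * 1000.0)
    return sum(samples) / len(samples)

calls = []
avg_ms = measure(lambda: calls.append(1), warmup=2, repeats=5)
```

For an incremental matcher it matters whether network construction is counted inside run() or treated as warm-up, so a report should state which was done.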
24. Profiling results
Model manipulation overhead: ~15% (of overall CPU time)
o Depends largely on the transformation!
Memory overhead
o Petri nets (with RETE networks) up to ~100000 fit into 1‐1.5GB RAM (VIATRA model space limitations)
o Grows linearly with model size (as expected)
o Nature of growth is pattern‐dependent
Network construction overhead
o Similar to memory; pattern‐dependent.
o PN: In the same order as VIATRA’s LS heuristics initialization.
25. Execution times for Petri net simulation
Matches/outperforms GrGen.NET for large models and high iteration counts.
[Figure: Sparse Petri net benchmark. Log-log plot of execution time (ms, 10 to 1000000) against Petri net size (100 to 100000). Series: Viatra/RETE (x1k), Viatra/LS (x1k), GrGen.NET (x1k), Viatra/RETE (x1M), GrGen.NET (x1M). Annotation: three orders of magnitude and growing…]
26. Object‐Relational Mapping: Synchronization
[Figure: correspondence between the models. Package and Schema are linked by schemaRef, Class and Table by tableRef, Attribute and Column by columnRef.]
[Figure: orphanTable pattern. A Table T, marked {DEL}, is matched if there is no (NEG) tableRef to it from a Class C OR from an Association A.]
Sync order
1. Orphan schema delete
2. Orphan table (class) delete
3. Orphan table (association) delete
4. Orphan column (attribute) delete
5. Renaming
6. New classes/associations/columns created
27. Object‐Relational Mapping: Synchronization
Test case generation
− Fully connected graph
− N classes
− N(N-1) directed associations
− K attributes
Source modification
− 1/3 of the classes deleted
− 1/5 of the associations deleted
− ½ of the attributes renamed
− 1 new class added and the fully connected graph is rebuilt
Execution
− Phases
1. Generation
2. Build
3. Modification
4. Synchronization
− Measured: Synchronization phase
Characteristics (paradigm features for ORM Syn.)
− LHS size: large
− fan-out: medium
− matchings: PD
− transformation sequence length: PD
29. Varró: STS Benchmark
• Non-reusable model elements
• Suits LS better
30. Initial benchmarking: summary
Predictable near‐linear growth
o As long as there is enough memory
o Certain problem classes: constant execution time
Benchmark example descriptions, specifications, and measurement results available at:
o http://wiki.eclipse.org/VIATRA2/Benchmarks
32. Improving performance
Strategies
o Improve the construction algorithm
• Memory efficiency (node sharing)
• Heuristics‐driven constraint enumeration (based on pattern [and model space] content)
o Parallelism
• Update the RETE network in parallel with the transformation
• Support parallel transformation execution
• Paper at GT‐VMT’09: Parallelization of Graph Transformation Based on Incremental Pattern Matching
o Hybrid pattern matching
• “mix” pattern matching strategies, use each at its best
• Paper at ICMT’09: Efficient model transformations by combining pattern matching strategies
33. Concurrent pattern matching
Asynchronous update propagation in RETE
Concurrent to the transformation
o Wait for update propagation only when querying
[Figure: message sequence between the transformation and RETE. query sourcePlace (instantaneous); token removed notification (asynchronous); token removed notification (asynchronous); query targetPlace (synchronizing); token added notification (asynchronous).]
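The behaviour described above (notifications return immediately, queries wait for pending propagation) can be sketched with a worker thread and a queue. The class and method names are hypothetical, and the "network" is reduced to a set of places that currently hold a token.

```python
import queue
import threading

class ConcurrentMatcher:
    """Propagates updates on a background thread; queries are the sync point."""

    def __init__(self):
        self._updates = queue.Queue()
        self._tokens = set()
        threading.Thread(target=self._propagate, daemon=True).start()

    def _propagate(self):
        while True:
            sign, place = self._updates.get()
            if sign == "+":
                self._tokens.add(place)
            else:
                self._tokens.discard(place)
            self._updates.task_done()

    def notify(self, sign, place):
        """Called by the transformation; asynchronous, returns immediately."""
        self._updates.put((sign, place))

    def query(self):
        """Synchronizing: block until every pending update has been processed."""
        self._updates.join()
        return set(self._tokens)

matcher = ConcurrentMatcher()
matcher.notify("+", "sourcePlace")
matcher.notify("-", "sourcePlace")
matcher.notify("+", "targetPlace")
result = matcher.query()
```

A query issued when no updates are pending returns without waiting, which corresponds to the "instantaneous" case in the figure.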
35. Multi‐threaded pattern matching
Multi‐threaded RETE propagation
o Node set partitioned into containers
o Dedicated update propagation thread per container
Detecting the fixpoint is nontrivial
Timestamp‐based synchronization protocol
Pattern query: wait for termination!
High connectivity is bad for performance
[Figure: the RETE network for the Petri net example split across thread 1, thread 2 and thread 3. Input nodes: Place (p1, p2, p3), Token (k1, k2), Transition (t1, t2, t3). Partial matches such as (p1, k1), (p2, k2), (p1, k1, t1) and (p3, k2, t3) are spread over the containers; containers that have finished are marked "OK!", while one still propagating t3 is marked "Whoops…".]
37. Parallel rule execution
Rule execution is dominant (if PM is fast)
o There is more to gain there!
Problem: parallel rules must not conflict
o Core conflicts: see critical pair analysis
o Pattern composition, recursion…
o Parallel rules require a thread‐safe PM and model
Exclusion / locks are required
[Figure: thread 1 and thread 2 each send a token removed notification to RETE; one notification is covered by a lock held by thread 2, the other by a lock held by thread 1.]
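A minimal sketch of mutual exclusion during parallel rule execution follows. It assumes one coarse lock around the whole match-and-update step, which is the simplest possible scheme and not necessarily what VIATRA2 does; the toy model is a pair of token counters.

```python
import threading

places = {"p1": 1000, "p2": 0}
model_lock = threading.Lock()

def fire_many(attempts):
    """Try to fire the rule repeatedly; match and update form one critical section."""
    for _ in range(attempts):
        with model_lock:
            if places["p1"] > 0:   # pattern matching
                places["p1"] -= 1  # updating: delete
                places["p2"] += 1  # updating: create

threads = [threading.Thread(target=fire_many, args=(400,)) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because the match is re-checked under the lock, no thread acts on a stale match, but all rule applications are serialized, so the lock itself can limit the achievable speedup.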
38. Evaluating parallel performance
Performance evaluation
o Code generation (PNML generator, read only): 50% speed increase

             Single   Double
  Serial     2.9s     5.8s
  Parallel   3.7s     3.9s

o GraBaTs ’08 AntWorld benchmark: 4 CPUs, was slower
Observation: locks can become bottlenecks
Proposed solutions
o Partial locking? Rule‐level locking?
Read‐intensive transformation is preferable
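The "50% speed increase" can be checked against the reported times, assuming the figures read as serial 2.9s and 5.8s and parallel 3.7s and 3.9s for the single and double workloads respectively. With speedup S defined as serial time over parallel time:

```latex
\[
S_{\text{double}} = \frac{T_{\text{serial}}}{T_{\text{parallel}}}
                  = \frac{5.8\,\text{s}}{3.9\,\text{s}} \approx 1.49,
\qquad
S_{\text{single}} = \frac{2.9\,\text{s}}{3.7\,\text{s}} \approx 0.78
\]
```

Under this reading, the double workload runs about 49% faster in parallel, while the single workload is slower in parallel, which would reflect the overhead of the parallel setup when there is no second task to overlap.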
39. Parallelism: lessons learned
Incremental PM might be good for parallelization
o No search, shorter locking!
Incremental PM might be bad for parallelization!
o Update propagation keeps locks longer
o Concurrent matcher solves this
Future work
o Enhancing model locking
o Clever multi‐threading of RETE
o Very large models: distributed MT, distributed PM
41. Where LS is better…
Memory consumption
o RETE sizes grow roughly linearly with the model space
o Constrained memory: thrashing
Cache construction time penalty
o RETE networks take time to construct
o “navigation patterns” can be matched quicker by LS
Expensive updates
o Certain patterns’ matching set is HUGE
o Depends largely on the transformation
42. Hybrid PM in the source code
Assign a PM implementation on a per‐pattern (per‐rule) basis: this gives the ability to tune performance at a very fine‐grained level.
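The per-pattern assignment can be pictured as a table that maps each pattern to a matching strategy. The sketch below uses two trivial type-based patterns and hypothetical names throughout; it shows the dispatch idea only, not VIATRA2's actual mechanism or syntax.

```python
def local_search(model, wanted_type):
    """LS: scan the model on every query; nothing is kept between queries."""
    return sorted(n for n, t in model.items() if t == wanted_type)

class IncrementalMatcher:
    """Caches the match set and updates it on change notifications."""

    def __init__(self, model, wanted_type):
        self.wanted_type = wanted_type
        self.cache = {n for n, t in model.items() if t == wanted_type}

    def on_add(self, name, type_):
        if type_ == self.wanted_type:
            self.cache.add(name)

    def matches(self):
        return sorted(self.cache)

model = {"p1": "Place", "t1": "Transition"}
pattern_type = {"allPlaces": "Place", "allTransitions": "Transition"}
# Hypothetical per-pattern strategy assignment.
strategy = {"allPlaces": "incremental", "allTransitions": "local_search"}
incremental = {"allPlaces": IncrementalMatcher(model, "Place")}

def add_element(name, type_):
    model[name] = type_
    for matcher in incremental.values():
        matcher.on_add(name, type_)

def query(pattern):
    if strategy[pattern] == "incremental":
        return incremental[pattern].matches()
    return local_search(model, pattern_type[pattern])

add_element("p2", "Place")
add_element("t2", "Transition")
places_found = query("allPlaces")
transitions_found = query("allTransitions")
```

Both strategies return the same match sets; they differ in where the cost is paid (on every model change or on every query), which is what the per-pattern choice trades off.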
51. Event‐driven live transformations
Problem: MT is mostly batch‐like
o But models are constantly evolving, so frequent re‐transformations are needed for
• mapping
• synchronization
• constraint checking
• …
An incremental PM can solve the performance problem, but a formalism is needed
o to specify when to (re)act
o and how.
Ideally, the formalism should be MT‐like.
53. Future work
Parallelization
o A lot of research going on (GrGen.NET, FUJABA, …)
o We have some tricks left to be explored, too
New applications of incremental PM
o Live transformations: model synchronization, constraint evaluation, trace model generation
o Stochastic model simulation: collaboration with Reiko’s group
Adaptation of our technology to other platforms
o ViatraEMF
54. Summary
Incremental pattern matching in VIATRA2
o A leap forward in performance
o Applicable to other tools
o In‐depth investigations revealed interesting details
Future
o Performance will be further improved
o We’re working on new applications
Thank you for your attention.
http://eclipse.org/gmt/VIATRA2
http://wiki.eclipse.org/VIATRA2