Presentation about an Eclipse framework for generating Ecore model instances as input for tests and benchmarks. Held at the 3rd BigMDE workshop at STAF in L'Aquila, Italy, in July 2015.
2. Agenda
▶ Benchmarks for MDE
▶ Input models for MDE benchmarks
▶ Generation of random models
■ Language
■ Examples
▶ Related Work
▶ Conclusion
3. Benchmarks (I)
▶ Small-scale MDE technology¹ is evaluated solely by its functionality
▶ Big MDE technology is evaluated by both its functionality and its
performance (execution time, memory consumption, ...)
▶ Benchmarks enable a sound comparison of technologies
based on their performance
1) technology = algorithms, methods, tools, frameworks
4. Benchmarks (II)
▶ A benchmark describes the measurement ...
■ of a well-defined property
■ acquired in a well-defined process
■ with a well-defined workload (tasks and inputs)
■ in a well-defined environment
5.–9. Benchmarks – Example (progressive build)
“All measurements were performed on a Notebook
computer with Intel Core i5 2.4GHz CPU, 8 GB
1067 MHz DDR3 RAM, running Mac OS 10.7.3.”
+ some environment
- software versions, JVM configuration

“All experiments were repeated at least 20 times, and
all present results are respective averages.”
+ some process
- distribution, variation, outliers
- warm-up: JIT, caches, GC

“We measured the performance of instantiating and
persisting objects.”
+ some property description
- exact task?
- comparable between technologies?

“We created test models with 10^5 objects, a binary
containment hierarchy, and two different densities of
cross references: one cross reference per object and no
cross references.”
+ two specific shapes
- only two specific shapes
- real-world likeness

[Figure: objects per second (×10^4, logarithmic scale from 10^3 to 10^5) for XMI, CDO, Morsa, and EMFFrag, with and without cross references]

M. Scheidgen, A. Zubow, J. Fischer, T. H. Kolbe: Automated and Transparent Model Fragmentation for Persisting Large Models; ACM/IEEE 15th International Conference on Model Driven Engineering Languages & Systems (MODELS); Innsbruck; 2012; LNCS, Springer
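The measurement process annotated above (warm-up runs for JIT and caches, at least 20 repetitions, reporting averages) can be sketched as a small harness. This is an illustrative sketch, not code from the cited paper; the `benchmark` function and its parameters are invented for this example.

```python
import statistics
import time

def benchmark(task, *, warmup=5, repetitions=20):
    """Time `task`: discard warm-up rounds (JIT, caches, GC settling),
    then return mean and standard deviation over the measured runs."""
    for _ in range(warmup):            # warm-up runs, results discarded
        task()
    samples = []
    for _ in range(repetitions):       # measured runs
        start = time.perf_counter()
        task()
        samples.append(time.perf_counter() - start)
    return statistics.mean(samples), statistics.stdev(samples)

# usage: measure instantiating many objects
mean, stdev = benchmark(lambda: [object() for _ in range(10_000)])
```

Reporting the standard deviation alongside the mean addresses the "distribution, variation, outliers" criticism on the slide.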
10. Input for Benchmarks (I)
▶ A benchmark input model should
■ include no bias
■ invoke real-world behavior
■ cover different scenarios
■ be on a metrical scale
11. Input for Benchmarks – In MDE
▶ For MDE technology, the input is usually a software engineering
artifact, which we commonly refer to as a model
▶ Usually the models from the GraBaTs 2009 graph
transformation contest are used
■ MoDisco models of the JDT
■ different sizes and shapes (with and without method
implementations)
■ sizes do not scale linearly
12. Input for Benchmarks – Properties: Size & Shape
▶ different properties to mimic different scenarios and invoke different
behavior/performance characteristics
▶ goal: understand the correlation between performance properties and model
sizes and shapes
▶ ordinal vs. metrical scales
▶ What defines a shape?
■ metrics (depending on the language, e.g. methods per class in OO
programming)
■ graph/tree properties (degree, connectedness, sparse vs. dense, etc.)
▶ What defines size?
■ # objects
■ # values
■ # links
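The three size measures can be made concrete on a toy object graph. `Obj` and `model_size` are invented for illustration and are not part of EMF.

```python
# A toy object graph: each object holds attribute values and links to others.
class Obj:
    def __init__(self, **values):
        self.values = values   # attribute values (name -> value)
        self.links = []        # references to other objects

def model_size(objects):
    """The three size measures from the slide: # objects, # values, # links."""
    return {
        "objects": len(objects),
        "values": sum(len(o.values) for o in objects),
        "links": sum(len(o.links) for o in objects),
    }

a, b = Obj(name="A"), Obj(name="B", abstract=False)
a.links.append(b)
model_size([a, b])  # → {'objects': 2, 'values': 3, 'links': 1}
```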
14. Input for Benchmarks – Approaches
▶ handcraft input models – does not scale
▶ take existing models – only the given shapes
▶ generate models – do not mimic the real world
▶ bias?
■ bias in creation, selection, and algorithms
■ a social problem – technology alone cannot solve social problems
15. Input for Benchmarks – Random Models
▶ random means neither arbitrary nor uniform
▶ element of surprise
▶ probability distributions as abstractions for the typical usage of
language constructs
■ e.g. the number of methods in a class typically follows a negative
binomial distribution (with certain parameters) [1]
▶ distribution parameters to define shapes
▶ random models can be sensible representatives of a large
class of models
[1] Tetsuo Tamai, Takako Nakatani: Analysis of Software Evolution Processes Using Statistical
Distribution Models, IWPSE '02
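A negative binomial variable can be sampled with the standard library alone, as the sum of geometric draws. The parameters (r=5, p=0.5) are placeholders, not the ones fitted in [1].

```python
import random

def geometric(p, rng):
    """Number of failures before the first success (success probability p)."""
    failures = 0
    while rng.random() >= p:
        failures += 1
    return failures

def neg_binomial(r, p, rng):
    """Number of failures before the r-th success: sum of r geometric draws."""
    return sum(geometric(p, rng) for _ in range(r))

rng = random.Random(42)
# hypothetical parameters: methods per class for five random classes
methods_per_class = [neg_binomial(5, 0.5, rng) for _ in range(5)]
```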
16. Generation of Random Models – A Generator DSL (I)

generator RandomEcore for ecore in "...ecore/model/Ecore.ecore" {
  ePackage: EPackage ->
    name := RandomID(Normal(8,3))
    eClassifiers += eClass#NegBinomial(5,0.5)
  ;
  eClass: EClass ->
    name := RandomID(Normal(10,4))
    abstract := UniformBool(0.2)
    eStructuralFeatures += eReference(UniformBool(0.3))#NegBinomial(4,0.7)
    eStructuralFeatures += eAttribute#NegBinomial(6,0.5)
  ;
  eReference(boolean composite): EReference ->
    name := RandomID(Normal(10,4))
    upperBound := if (UniformBool(0.5)) -1 else 1
    ordered := UniformBool(0.2)
    containment := composite
    eType: EClass := Uniform(model.EClassifiers.filter[it instanceof EClass])
  ;
  ...
}

http://github.com/markus1978/RandomEMF
17. Generation of Random Models – A Generator DSL (II)
▶ Maps the meta-model to a grammar-like description
▶ Rule-based
▶ Each rule creates an object of a certain meta-class
▶ Each rule calls other rules to create features
▶ Rules can have parameters
▶ Expressions with random values
■ different distributions for random number generation
■ random number of rule applications
■ random values (e.g. identifiers, choices)
▶ Xtext + Xbase DSL
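The rule structure above can be mimicked in plain Python: each function is a "rule" that creates one object and calls other rules for its features. This is an analogue for illustration only, not RandomEMF itself; the distributions are simplified (uniform integer counts instead of the DSL's negative binomial draws), and all names are invented.

```python
import random
import string

rng = random.Random(7)

def random_id(mean, sd):
    """Random identifier; its length follows a (clamped) normal distribution."""
    n = max(1, int(rng.gauss(mean, sd)))
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(n))

def uniform_bool(p):
    return rng.random() < p

def e_class():
    """Rule: create one class-like object with randomly chosen features."""
    return {
        "name": random_id(10, 4),
        "abstract": uniform_bool(0.2),
        # simplified: uniform count stands in for a negative binomial draw
        "attributes": [random_id(8, 3) for _ in range(rng.randint(0, 6))],
    }

def e_package():
    """Rule: create a package containing a random number of classes."""
    return {
        "name": random_id(8, 3),
        "classifiers": [e_class() for _ in range(rng.randint(1, 10))],
    }

model = e_package()
```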
22. Generation of Random Models – Problems
▶ randomness is a tool to reduce bias, but clients have to
use it correctly
▶ hard to generate models that are correct with respect to static semantics
23. Related Work
▶ Test-model generation with SAT solvers
■ meta-model/constraints divided into small partitions that cover
test cases
■ translation into logical equations
■ SAT solver
■ translation of the results into model fragments
■ composition of test models from model fragments
➡ small, valid models with statistically proven test coverage
Sagar Sen, Benoit Baudry, Jean-Marie Mottu: Automatic Model Generation Strategies for Model Transformation
Testing, Theory and Practice of Model Transformations, Springer, 2009
Erwan Brottier, Franck Fleurey, Jim Steel, Benoit Baudry, Yves Le Traon: Metamodel-based Test Generation for Model
Transformations: an Algorithm and a Tool, ISSRE’06, IEEE, 2006
24. Related Work
▶ Translation into a constructive formalism
■ meta-modeling is not constructive (the full set of instances cannot
be generated from a meta-model)
■ translation into context-free grammars or graph grammars
■ random application of rules to generate random models
➡ large models; shape can be influenced via probability
distributions on rule selection
K Ehrig, JM Küster, G Taentzer: Generating instance models from meta models, Formal
Methods for Open Object-Based Distributed Systems, Springer, 2006
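The grammar-based approach can be illustrated with a toy context-free grammar in which weights on the alternatives play the role of the "probability distributions on rule selection". The grammar, weights, and depth limit are invented for this sketch.

```python
import random

# A tiny context-free grammar: each non-terminal maps to weighted
# alternatives. Changing the weights changes the shape (depth, branching)
# of the generated trees.
GRAMMAR = {
    "Tree": [(["Leaf"], 0.4), (["Node", "Tree", "Tree"], 0.6)],
}

def derive(symbol, rng, depth=0, max_depth=8):
    """Randomly apply grammar rules, starting from `symbol`."""
    if symbol not in GRAMMAR:
        return symbol                      # terminal symbol
    alternatives, weights = zip(*GRAMMAR[symbol])
    if depth >= max_depth:                 # force termination at the limit
        alternative = alternatives[0]
    else:
        alternative = rng.choices(alternatives, weights)[0]
    return [derive(s, rng, depth + 1, max_depth) for s in alternative]

tree = derive("Tree", random.Random(1))
```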
25. Related Work
▶ Fitting meta-model instances onto randomly generated tree/
graph structures
■ existing methods for random tree or graph generation
■ interpretation of the randomly generated trees/graphs as
meta-model instances
➡ large but uniformly shaped models; not aware of static semantics
A Mougenot, A Darrasse, X Blanc, M Soria: Uniform random generation of huge metamodel
instances, ECMDA, Springer, 2009
26. Related Work
▶ Benchmark definitions for graph transformations
▶ different distributions for graph edges to create different
shapes
■ binomial
■ hypergeometric
■ uniform
■ preferential attachment
➡ large models; not aware of static semantics
Izsó, B., Szatmári, Z., Bergmann, G., Horváth, Á., & Ráth, I.: Towards precise
metrics for predicting graph query performance. 2013 28th IEEE/ACM International
Conference on Automated Software Engineering, ASE 2013
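Of the edge distributions listed, preferential attachment is the least standard; a minimal sketch (Barabási–Albert style, simplified to one edge per new node, with invented parameters):

```python
import random

def preferential_attachment(n, rng):
    """Grow a graph where each new node attaches to an existing node with
    probability proportional to that node's current degree."""
    edges = [(0, 1)]                  # seed edge between the first two nodes
    targets = [0, 1]                  # each node appears once per degree unit
    for new in range(2, n):
        old = rng.choice(targets)     # degree-proportional choice
        edges.append((new, old))
        targets += [new, old]         # both endpoints gain one degree
    return edges

edges = preferential_attachment(50, random.Random(3))
```

The resulting degree distribution is heavy-tailed (a few hub nodes), which is what makes this shape interesting for query-performance benchmarks.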
27. Conclusions
▶ benchmarking in MDE can be improved
▶ there are other options for input models than the GraBaTs'09
contest models
▶ different shapes (preferably on a metrical scale) should be
used to find the distinctive merits and flaws of the compared
technologies
▶ generators for random models
■ parameters to create differently shaped models
■ randomness and suitable distributions for real-world-like input
■ linearly scaled sizes