SlideShare una empresa de Scribd logo
1 de 106
Descargar para leer sin conexión
Design and Validation of Cloud Storage Systems
using Maude
Peter Csaba ¨Olveczky
University of Oslo
University of Illinois at Urbana-Champaign
Based on joint work with Jon Grov and members of UIUC’s Center for
Assured Cloud Computing
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 1 / 58
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 2 / 58
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 3 / 58
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 4 / 58
Cloud Computing Data Stores
Cloud computing systems store/retrieve large amounts of data
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 5 / 58
Availability
Data should always be available
network/site failures, network congestion, scheduled upgrades
−→ data must be replicated
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 6 / 58
Availability
Data should always be available
network/site failures, network congestion, scheduled upgrades
−→ data must be replicated
Large and growing data
Facebook (2014): 300 petabytes data; 350M photos uploaded every
day
−→ data must be partitioned
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 6 / 58
Consistency in Replicated Systems
Figure by Jiaqing Du
Consistency: All replicas of a data item should have same value
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 7 / 58
“CAP Theorem”
Data consistency + partition tolerance + availability impossible
(Figure from http://flux7.com/blogs/nosql/cap-theorem-why-does-it-matter/)
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 8 / 58
Slightly Different View
Trade-off
consistency level ←→ latency
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 9 / 58
Eventual Consistency
Weak consistency OK for some applications
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 10 / 58
Eventual Consistency
Weak consistency OK for some applications
. . . but not others:
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 10 / 58
Designing Data Stores
Complex systems
size
replication
concurrence
fault tolerance
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 11 / 58
Designing Data Stores
Complex systems
size
replication
concurrence
fault tolerance
Many hours of “whiteboard analysis”
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 11 / 58
Validating Data Store Designs
Correctness: “hand proofs”
error prone
informal
key assumptions implicit
does not scale to nontrivial systems
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 12 / 58
Validating Data Store Designs
Correctness: “hand proofs”
error prone
informal
key assumptions implicit
does not scale to nontrivial systems
Performance: simulation tools, real implementations
additional artifact
cannot be used to reason about correctness
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 12 / 58
Our Approach: Formal Methods
Use formal methods to develop and validate designs
define mathematical model of system
use mathematical rules to analyze system
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 13 / 58
Our Approach: Formal Methods
Use formal methods to develop and validate designs
define mathematical model of system
use mathematical rules to analyze system
Find errors early!
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 13 / 58
Using Formal Methods (I): Validation Perspective
Formal system model S
precise mathematical model
makes assumptions precise and explicit
amenable to mathematical analysis
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 14 / 58
Using Formal Methods (I): Validation Perspective
Formal system model S
precise mathematical model
makes assumptions precise and explicit
amenable to mathematical analysis
Formal property specification P
precise description of consistency model
can check whether S |= P
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 14 / 58
Using Formal Methods (I): Validation Perspective
Formal system model S
precise mathematical model
makes assumptions precise and explicit
amenable to mathematical analysis
Formal property specification P
precise description of consistency model
can check whether S |= P
What about performance analysis?
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 14 / 58
Using Formal Methods (II): Software Engineering
Perspective
Need:
expressive and intuitive modeling language
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 15 / 58
Using Formal Methods (II): Software Engineering
Perspective
Need:
expressive and intuitive modeling language
expressive and intuitive property specification language
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 15 / 58
Using Formal Methods (II): Software Engineering
Perspective
Need:
expressive and intuitive modeling language
expressive and intuitive property specification language
automatically check whether design satisfies property
quick and extensive feedback
saves days of whiteboard analysis
“extensive and automatic test suite”
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 15 / 58
Using Formal Methods (II): Software Engineering
Perspective
Need:
expressive and intuitive modeling language
expressive and intuitive property specification language
automatically check whether design satisfies property
quick and extensive feedback
saves days of whiteboard analysis
“extensive and automatic test suite”
design model also for performance analysis!
no new artifact for performance analysis
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 15 / 58
Which Formal Language/Tool?
Difficult challenges:
intuitive
expressive
useful automatic analyses
both correctness and performance analysis
complex properties to check
mature tool support
real-time and probabilistic features
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 16 / 58
Our Framework: Rewriting Logic
Rewriting logic: equations and rewrite rules
expressive
simple/intuitive
object-oriented
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 17 / 58
Our Framework: Rewriting Logic
Rewriting logic: equations and rewrite rules
expressive
simple/intuitive
object-oriented
Maude tool:
simulation
temporal logic model checking
expressive property specification language
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 17 / 58
Our Framework: Rewriting Logic
Rewriting logic: equations and rewrite rules
expressive
simple/intuitive
object-oriented
Maude tool:
simulation
temporal logic model checking
expressive property specification language
Extensions:
real-time systems
probabilistic systems
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 17 / 58
Maude: Software Engineering Perspective I
Models can be developed quickly
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 18 / 58
Maude: Software Engineering Perspective I
Models can be developed quickly
Simulation gives quick feedback (rapid prototyping)
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 18 / 58
Maude: Software Engineering Perspective I
Models can be developed quickly
Simulation gives quick feedback (rapid prototyping)
Model checking: analyze all behaviors from one initial state
http://embsys.technikum-wien.at/projects/decs/verification/formalmethods.php
formal test-driven development: “test-driven development approach
where many complex scenarios can be quickly tested by model
checking”
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 18 / 58
Maude: Software Engineering Perspective (cont.)
What about performance analysis?
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 19 / 58
Maude: Software Engineering Perspective (cont.)
What about performance analysis?
1 (Randomized) simulations
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 19 / 58
Maude: Software Engineering Perspective (cont.)
What about performance analysis?
1 (Randomized) simulations
2 Probabilistic analysis (using PVeStA)
statistical model checking
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 19 / 58
Maude: Software Engineering Perspective (cont.)
Same artifact for:
precise system description
rapid prototyping
extensive testing
correctness analysis
performance estimation
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 20 / 58
Case Study I
Modeling, Analyzing, and Extending Megastore
Joint work with Jon Grov (U. Oslo)
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 21 / 58
Megastore
Megastore:
Google’s wide-area replicated data store
3 billion write and 20 billion read transactions daily (2011)
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 22 / 58
Megastore: Key Ideas (I)
(Figure from http://cse708.blogspot.jp/2011/03/megastore-providing-scalable-highly.html)
Data divided into entity groups
Peter’s email
Books on rewriting logic
Narciso’s documents
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 23 / 58
Megastore: Key Ideas (II)
Consistency for transactions accessing a single entity group
no guarantee if transaction reads multiple entity groups
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 24 / 58
Our Work
[Developed and] formalized [our version of the] Megastore [approach]
in Maude
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 25 / 58
Our Work
[Developed and] formalized [our version of the] Megastore [approach]
in Maude
first (public) formalization/detailed description of Megastore
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 25 / 58
Our Work
[Developed and] formalized [our version of the] Megastore [approach]
in Maude
first (public) formalization/detailed description of Megastore
56 rewrite rules (37 for fault tolerance features)
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 25 / 58
Performance Estimation
Key performance measures:
average transaction latency
number of committed/aborted transactions
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 26 / 58
Performance Estimation
Key performance measures:
average transaction latency
number of committed/aborted transactions
Randomly generated transactions (rate 2.5 TPS)
Network delays:
30% 30% 30% 10%
Madrid ↔ Paris 10 15 20 50
Madrid ↔ New York 30 35 40 100
Paris ↔ New York 30 35 40 100
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 26 / 58
Performance Estimation
Key performance measures:
average transaction latency
number of committed/aborted transactions
Randomly generated transactions (rate 2.5 TPS)
Network delays:
30% 30% 30% 10%
Madrid ↔ Paris 10 15 20 50
Madrid ↔ New York 30 35 40 100
Paris ↔ New York 30 35 40 100
Simulating for 200 seconds:
Avg. latency (ms) Commits Aborts
Madrid 218 109 38
New York 336 129 16
Paris 331 116 21
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 26 / 58
Megastore-CGC: extending Megastore
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 27 / 58
Motivation
Some transactions must access multiple entity groups
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 28 / 58
Motivation
Some transactions must access multiple entity groups
Our work: extend Megastore with consistency for transactions
accessing multiple entity groups
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 28 / 58
Motivation
Some transactions must access multiple entity groups
Our work: extend Megastore with consistency for transactions
accessing multiple entity groups
Megastore-CGC piggybacks ordering and validation onto Megastore’s
coordination protocol
no additional messages for validation/commit!
maintains Megastore’s performance and fault tolerance
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 28 / 58
Performance Comparison using Real-Time Maude
Simulating for 1000 seconds (no failures)
Megastore:
Commits Aborts Avg. latency (ms)
Madrid 652 152 126
Paris 704 100 118
New York 640 172 151
Megastore-CGC:
Commits Aborts Val. aborts Avg.latency (ms)
Madrid 660 144 0 123
Paris 674 115 15 118
New York 631 171 10 150
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 29 / 58
Model Checking Megastore-CGC
Model checking scenarios
5 transactions , no failures, message delay 30 ms or 80 ms
−→ 108,279 reachable states, 124 seconds
3 transactions, one site failure and fixed message delay
−→ 1,874,946 reachable states, 6,311 seconds
3 transactions, fixed message delay and one message failure
−→ 265,410 reachable states, 858 seconds
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 30 / 58
Case Study II
Work by Si Liu, Muntasir Raihan Rahman, Stephen Skeirik, Indranil
Gupta, Jos´e Meseguer, Son Nguyen, Jatin Ganhotra (ICFEM’14,
QEST’15)
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 31 / 58
Apache Cassandra
Key-value data store originally developed at Facebook
Used by Amadeus, Apple, CERN, IBM, Netflix, Facebook/Instagram,
Twitter, . . .
Open source
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 32 / 58
Cassandra Overview
Read consistency either one, quorum, or all
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 33 / 58
Cassandra Overview
Read consistency either one, quorum, or all
Write consistency either zero, one, quorum, or all
[Figures from http://www.slideshare.net/nuboat/cassandra-distributed-data-store]
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 33 / 58
Motivation
1 Formal model from 345K LOC
allows experimenting with different optimizations/variations
2 Analyze basic property: eventual consistency
3 When/how often does Cassandra give stronger guarantees?
strong consistency
read-your-writes
4 Performance evaluation:
compare PVeStA analyses with real implementations
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 34 / 58
Formal Analysis with Multiple Clients
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 35 / 58
Performance Estimation
Formal model + PVeStA vs. actual implementation
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 36 / 58
P-Store
P-Store [N. Schiper, P. Sutra, and F. Pedone; IEEE SRDS’10]
Replicated and partitioned data store
Serializability
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 37 / 58
P-Store
P-Store [N. Schiper, P. Sutra, and F. Pedone; IEEE SRDS’10]
Replicated and partitioned data store
Serializability
Atomic multicast orders concurrent transactions
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 37 / 58
P-Store
P-Store [N. Schiper, P. Sutra, and F. Pedone; IEEE SRDS’10]
Replicated and partitioned data store
Serializability
Atomic multicast orders concurrent transactions
Group commitment for atomic commit
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 37 / 58
Atomic Multicast
Definition
Atomic Multicast: Consistent reception order of messages
(a): any pair of nodes receive the same atomic-multicast messages in
the same order
(b): induced “global read order” must be acyclic
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 38 / 58
Atomic Multicast
Definition
Atomic Multicast: Consistent reception order of messages
(a): any pair of nodes receive the same atomic-multicast messages in
the same order
(b): induced “global read order” must be acyclic
Example
A reads m1 < m2
B reads m2 < m3
C reads m3 < m1
satisfies (a) but not (b)
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 38 / 58
Atomic Multicast in Maude (I)
Fundamental problem in distributed systems
Impose order on conflicting concurrent transactions
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 39 / 58
Atomic Multicast in Maude (I)
Fundamental problem in distributed systems
Impose order on conflicting concurrent transactions
Many algorithms for atomic multicast
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 39 / 58
Atomic Multicast in Maude (I)
Fundamental problem in distributed systems
Impose order on conflicting concurrent transactions
Many algorithms for atomic multicast
Define generic atomic multicast primitive in Maude
abstract
covers all possible receiving orders
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 39 / 58
Atomic Multicast in Maude (I)
Fundamental problem in distributed systems
Impose order on conflicting concurrent transactions
Many algorithms for atomic multicast
Define generic atomic multicast primitive in Maude
abstract
covers all possible receiving orders
Infrastructure stores (un)read AM messages
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 39 / 58
My Work: Atomic Multicast in Maude (II)
Atomic-multicast message M:
rl [atomic-multicast] :
< O : Node | msgToSend : M, receivers : OS >
=>
< O : Node | ... >
(atomic-multicast M from O to OS) .
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 40 / 58
My Work: Atomic Multicast in Maude (II)
Atomic-multicast message M:
rl [atomic-multicast] :
< O : Node | msgToSend : M, receivers : OS >
=>
< O : Node | ... >
(atomic-multicast M from O to OS) .
Read:
crl [receiveAtomicMulticast] :
(msg M from O2 to O)
< O : Node | ... >
AM-TABLE
=>
< O : Node | ... >
updateAM(MC, O, AM-TABLE)
if okToRead(MC, O, AM-TABLE) .
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 40 / 58
Analyzing P-Store
Find all reachable final states from init3:
Maude> (search init3 =>! C:Configuration .)
Solution 1
C:Configuration --> ...
< c1 : Client | pendingTrans : t1, txns : emptyTransList >
< c2 : Client | pendingTrans : t2, txns : emptyTransList >
< r1 : PStoreReplica | aborted : none,
committed : < t1 : Transaction | ... >
< r2 : PStoreReplica | aborted : none,
committed : < t2 : Transaction | ... >
...
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 41 / 58
Analyzing P-Store
Find all reachable final states from init3:
Maude> (search init3 =>! C:Configuration .)
Solution 1
C:Configuration --> ...
< c1 : Client | pendingTrans : t1, txns : emptyTransList >
< c2 : Client | pendingTrans : t2, txns : emptyTransList >
< r1 : PStoreReplica | aborted : none,
committed : < t1 : Transaction | ... >
< r2 : PStoreReplica | aborted : none,
committed : < t2 : Transaction | ... >
...
sites validate transactions
but client never gets result
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 41 / 58
Analyzing P-Store (cont.)
Solution 5
...
< r1 : PStoreReplica | aborted : none, committed : none,
submitted : < t1 : Transaction | ... >, ... >
< r2 : PStoreReplica | aborted : none,
committed : < t2 : Transaction| ... > ... >
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 42 / 58
Analyzing P-Store (cont.)
Solution 5
...
< r1 : PStoreReplica | aborted : none, committed : none,
submitted : < t1 : Transaction | ... >, ... >
< r2 : PStoreReplica | aborted : none,
committed : < t2 : Transaction| ... > ... >
Host does not validate t1 even when needed info known
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 42 / 58
Fixing P-Store
Found the source of the errors
all replicas must be involved in voting and notification
not just write replicas
Modeled and analyzed proposed corrected version
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 43 / 58
P-Store Summary
“P-Store verified”
3 significant errors found
one confusing definition
key assumption missing
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 44 / 58
Our Conclusions
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 45 / 58
Our Conclusions I
Developed formal models of large industrial data stores
Google’s Megastore (from brief description)
Apache Cassandra (from 345K LOC and description)
P-Store (academic)
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 46 / 58
Our Conclusions I
Developed formal models of large industrial data stores
Google’s Megastore (from brief description)
Apache Cassandra (from 345K LOC and description)
P-Store (academic)
Automatic model checking analysis of consistency properties
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 46 / 58
Our Conclusions I
Developed formal models of large industrial data stores
Google’s Megastore (from brief description)
Apache Cassandra (from 345K LOC and description)
P-Store (academic)
Automatic model checking analysis of consistency properties
Designed own transactional data stores
Megastore-CGC
variation of Cassandra
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 46 / 58
Our Conclusions I
Developed formal models of large industrial data stores
Google’s Megastore (from brief description)
Apache Cassandra (from 345K LOC and description)
P-Store (academic)
Automatic model checking analysis of consistency properties
Designed own transactional data stores
Megastore-CGC
variation of Cassandra
Errors, ambiguities, missing assumptions found in “verified” P-Store
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 46 / 58
Our Conclusions I
Developed formal models of large industrial data stores
Google’s Megastore (from brief description)
Apache Cassandra (from 345K LOC and description)
P-Store (academic)
Automatic model checking analysis of consistency properties
Designed own transactional data stores
Megastore-CGC
variation of Cassandra
Errors, ambiguities, missing assumptions found in “verified” P-Store
Maude/PVeStA performance estimation close to real implementations
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 46 / 58
Our “Software Engineering” Conclusions
Quickly develop formal models/prototypes of complex systems
experiment with different design choices
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 47 / 58
Our “Software Engineering” Conclusions
Quickly develop formal models/prototypes of complex systems
experiment with different design choices
Simulation and model checking throughout design phase
model-checking-based-testing for subtle “corner cases”
replaces days of whiteboard analysis
too many scenarios for standard test-based development
catch bugs early!
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 47 / 58
Our “Software Engineering” Conclusions
Quickly develop formal models/prototypes of complex systems
experiment with different design choices
Simulation and model checking throughout design phase
model-checking-based-testing for subtle “corner cases”
replaces days of whiteboard analysis
too many scenarios for standard test-based development
catch bugs early!
Single artifact for
system description
rapid prototyping
model checking
performance estimation
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 47 / 58
Our “Software Engineering” Conclusions
Quickly develop formal models/prototypes of complex systems
experiment with different design choices
Simulation and model checking throughout design phase
model-checking-based-testing for subtle “corner cases”
replaces days of whiteboard analysis
too many scenarios for standard test-based development
catch bugs early!
Single artifact for
system description
rapid prototyping
model checking
performance estimation
Megastore and Megastore-CGC modeler had no formal methods
experience
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 47 / 58
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 48 / 58
Amazon Web Services
Amazon Web Services (AWS):
world’s largest cloud computing service provider
more profitable than Amazon’s retail business
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 49 / 58
Amazon Web Services
Amazon Web Services (AWS):
world’s largest cloud computing service provider
more profitable than Amazon’s retail business
Amazon Simple Storage Service (S3)
stores > 3 trillion objects
99.99% availability of objects
> 1 million requests per second
DynamoDB data store
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 49 / 58
Amazon Web Services and Formal Methods
Formal methods used extensively at AWS during design of S3,
DynamoDB, . . .
Used Lamports TLA+
model checking
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 50 / 58
Experiences at Amazon WS
Model checking finds “corner case” bugs that would be hard to find with
standard industrial methods:
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 51 / 58
Experiences at Amazon WS
Model checking finds “corner case” bugs that would be hard to find with
standard industrial methods:
“We have found that standard verification techniques in industry are
necessary but not sufficient. We routinely use deep design reviews,
static code analysis, stress testing, and fault-injection testing but still
find that subtle bugs can hide in complex fault-tolerant systems.”
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 51 / 58
Experiences at Amazon WS
Model checking finds “corner case” bugs that would be hard to find with
standard industrial methods:
“the model checker found a bug that could lead to losing data [...].
This was a very subtle bug; the shortest error trace exhibiting the bug
included 35 high-level steps. [...] The bug had passed unnoticed
through extensive design reviews, code reviews, and testing.”
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 51 / 58
Experiences at Amazon WS II
A formal specification is a valuable precise description of an algorithm:
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 52 / 58
Experiences at Amazon WS II
A formal specification is a valuable precise description of an algorithm:
“the author is forced to think more clearly, helping eliminating “hand
waving,” and tools can be applied to check for errors in the design,
even while it is being written. In contrast, conventional design
documents consist of prose, static diagrams, and perhaps
psuedo-code in an ad hoc untestable language.”
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 52 / 58
Experiences at Amazon WS II
A formal specification is a valuable precise description of an algorithm:
“Talk and design documents can be ambiguous or incomplete, and
the executable code is much too large to absorb quickly and might
not precisely reflect the intended design. In contrast, a formal
specification is precise, short, and can be explored and experimented
on with tools.”
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 52 / 58
Experiences at Amazon WS III
Formal methods are surprisingly feasible for mainstream software
development and give good return on investment:
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 53 / 58
Experiences at Amazon WS III
Formal methods are surprisingly feasible for mainstream software
development and give good return on investment:
“In industry, formal methods have a reputation for requiring a huge
amount of training and effort to verify a tiny piece of relatively
straightforward code. Our experience with TLA+ shows this
perception to be wrong. [...] Amazon engineers have used TLA+ on
10 large complex real-world systems. In each, TLA+ has added
significant value. [...] Engineers have been able to learn TLA+ from
scratch and get useful results in two to three weeks.”
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 53 / 58
Experiences at Amazon WS III
Formal methods are surprisingly feasible for mainstream software
development and give good return on investment:
“Using TLA+ in place of traditional proof writing would thus likely
have improved time to market, in addition to achieving greater
confidence in the system’s correctness.”
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 53 / 58
Experiences at Amazon WS III
Quick and easy to experiment with different design choices:
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 54 / 58
Experiences at Amazon WS III
Quick and easy to experiment with different design choices:
“We have been able to make innovative performance optimizations
[...] we would not have dared to do without having model-checked
those changes. A precise, testable description of a system becomes a
what-if tool for designs.”
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 54 / 58
Experiences at Amazon WS: Limitations
TLA+ did/could not analyze performance degradation
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 55 / 58
Maude vs TLA+
Maude should be better suited!
more intuitive and expressive specification language
OO
hierarchical states
dynamic object/message creation/deletion
. . .
Support for real-time and probabilistic systems
Also for performance estimation!
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 56 / 58
Conclusions at Amazon
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 57 / 58
Take Away from Talk
Formal methods can be an efficient way to
design
test
describe
validate correctness and performance
experiment with different design choices
industrial state-of-the-art fault-tolerant distributed systems also for
non-experts
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 58 / 58
Take Away from Talk
Formal methods can be an efficient way to
design
test
describe
validate correctness and performance
experiment with different design choices
industrial state-of-the-art fault-tolerant distributed systems also for
non-experts
Maude suitable modeling language and analysis toolset
Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 58 / 58

Más contenido relacionado

Destacado

Destacado (14)

Insurance Industry Research & Analysis
Insurance Industry Research & AnalysisInsurance Industry Research & Analysis
Insurance Industry Research & Analysis
 
I twin report
I twin reportI twin report
I twin report
 
Myths of validation
Myths of validationMyths of validation
Myths of validation
 
Lead Generation per B2C e B2B: concetti (Funnel di Conversione, Customer Jour...
Lead Generation per B2C e B2B: concetti (Funnel di Conversione, Customer Jour...Lead Generation per B2C e B2B: concetti (Funnel di Conversione, Customer Jour...
Lead Generation per B2C e B2B: concetti (Funnel di Conversione, Customer Jour...
 
Computer System Validation Then and Now — Learning Management in the Cloud
Computer System Validation Then and Now — Learning Management in the CloudComputer System Validation Then and Now — Learning Management in the Cloud
Computer System Validation Then and Now — Learning Management in the Cloud
 
GxP in the Cloud is a good practice. Here's why.
GxP in the Cloud is a good practice. Here's why.GxP in the Cloud is a good practice. Here's why.
GxP in the Cloud is a good practice. Here's why.
 
Enterprise & Media Storage in the Cloud
Enterprise & Media Storage in the CloudEnterprise & Media Storage in the Cloud
Enterprise & Media Storage in the Cloud
 
Good Practices for Computerised Systems : PIC/S Guidance
Good Practices for Computerised Systems : PIC/S GuidanceGood Practices for Computerised Systems : PIC/S Guidance
Good Practices for Computerised Systems : PIC/S Guidance
 
Regulatory Considerations for use of Cloud Computing and SaaS Environments
Regulatory Considerations for use of Cloud Computing and SaaS EnvironmentsRegulatory Considerations for use of Cloud Computing and SaaS Environments
Regulatory Considerations for use of Cloud Computing and SaaS Environments
 
PWC success story - Gamification in recruitment - Manu Melwin Joy
PWC success story - Gamification in recruitment - Manu Melwin JoyPWC success story - Gamification in recruitment - Manu Melwin Joy
PWC success story - Gamification in recruitment - Manu Melwin Joy
 
An introduction of cloud storage
An introduction of cloud storage An introduction of cloud storage
An introduction of cloud storage
 
Cloud storage
Cloud storageCloud storage
Cloud storage
 
Cloud storage slides
Cloud storage slidesCloud storage slides
Cloud storage slides
 
B2B vs. B2C: 10 Marketing Experts Have Their Say (SlideShare)
B2B vs. B2C: 10 Marketing Experts Have Their Say (SlideShare)B2B vs. B2C: 10 Marketing Experts Have Their Say (SlideShare)
B2B vs. B2C: 10 Marketing Experts Have Their Say (SlideShare)
 

Similar a Design and Validation of cloud storage systems using Maude - Dr. Peter Csaba Ölveczky

Enabling combined Software and Data engineering at Web-scale
Enabling combined Software and Data engineering at Web-scaleEnabling combined Software and Data engineering at Web-scale
Enabling combined Software and Data engineering at Web-scale
Monika Solanki
 
Towards maintainable constraint validation and repair for taxonomies: The Poo...
Towards maintainable constraint validation and repair for taxonomies: The Poo...Towards maintainable constraint validation and repair for taxonomies: The Poo...
Towards maintainable constraint validation and repair for taxonomies: The Poo...
Monika Solanki
 
Visual Metaphor of Mathematical Abstractions and their Visualization through ...
Visual Metaphor of Mathematical Abstractions and their Visualization through ...Visual Metaphor of Mathematical Abstractions and their Visualization through ...
Visual Metaphor of Mathematical Abstractions and their Visualization through ...
atsidaev
 
Smart Energy - Sebastian Blechmann.pptx
Smart Energy - Sebastian Blechmann.pptxSmart Energy - Sebastian Blechmann.pptx
Smart Energy - Sebastian Blechmann.pptx
FIWARE
 

Similar a Design and Validation of cloud storage systems using Maude - Dr. Peter Csaba Ölveczky (20)

Monitoring and Operational Data Analytics from a User Perspective at First Eu...
Monitoring and Operational Data Analytics from a User Perspective at First Eu...Monitoring and Operational Data Analytics from a User Perspective at First Eu...
Monitoring and Operational Data Analytics from a User Perspective at First Eu...
 
Managing and Testing Ensembles of IoT, Network functions, and Clouds
Managing and Testing Ensembles of IoT, Network functions, and CloudsManaging and Testing Ensembles of IoT, Network functions, and Clouds
Managing and Testing Ensembles of IoT, Network functions, and Clouds
 
Data Integration in a Big Data Context
Data Integration in a Big Data ContextData Integration in a Big Data Context
Data Integration in a Big Data Context
 
Early Analysis and Debuggin of Linked Open Data Cubes
Early Analysis and Debuggin of Linked Open Data CubesEarly Analysis and Debuggin of Linked Open Data Cubes
Early Analysis and Debuggin of Linked Open Data Cubes
 
Enabling combined Software and Data engineering at Web-scale
Enabling combined Software and Data engineering at Web-scaleEnabling combined Software and Data engineering at Web-scale
Enabling combined Software and Data engineering at Web-scale
 
Text mining and machine learning
Text mining and machine learningText mining and machine learning
Text mining and machine learning
 
Towards maintainable constraint validation and repair for taxonomies: The Poo...
Towards maintainable constraint validation and repair for taxonomies: The Poo...Towards maintainable constraint validation and repair for taxonomies: The Poo...
Towards maintainable constraint validation and repair for taxonomies: The Poo...
 
DRESD Project Presentation - December 2006
DRESD Project Presentation - December 2006DRESD Project Presentation - December 2006
DRESD Project Presentation - December 2006
 
Addressing open Machine Translation problems with Linked Data.
  Addressing open Machine Translation problems with Linked Data.  Addressing open Machine Translation problems with Linked Data.
Addressing open Machine Translation problems with Linked Data.
 
MEX Vocabulary - A Lightweight Interchange Format for Machine Learning Experi...
MEX Vocabulary - A Lightweight Interchange Format for Machine Learning Experi...MEX Vocabulary - A Lightweight Interchange Format for Machine Learning Experi...
MEX Vocabulary - A Lightweight Interchange Format for Machine Learning Experi...
 
Mapping Research Infrastructures with the ENVRI Reference Model
Mapping Research Infrastructures with the ENVRI Reference ModelMapping Research Infrastructures with the ENVRI Reference Model
Mapping Research Infrastructures with the ENVRI Reference Model
 
COMBINE standards & tools: Getting model management right
COMBINE standards & tools: Getting model management rightCOMBINE standards & tools: Getting model management right
COMBINE standards & tools: Getting model management right
 
Visual Metaphor of Mathematical Abstractions and their Visualization through ...
Visual Metaphor of Mathematical Abstractions and their Visualization through ...Visual Metaphor of Mathematical Abstractions and their Visualization through ...
Visual Metaphor of Mathematical Abstractions and their Visualization through ...
 
Aleksandar Kapisoda: The semantic approach for tracking scientific publications
Aleksandar Kapisoda: The semantic approach for tracking scientific publicationsAleksandar Kapisoda: The semantic approach for tracking scientific publications
Aleksandar Kapisoda: The semantic approach for tracking scientific publications
 
Isab 11 for_slideshare
Isab 11 for_slideshareIsab 11 for_slideshare
Isab 11 for_slideshare
 
Bayesian networks in AI
Bayesian networks in AIBayesian networks in AI
Bayesian networks in AI
 
Smart Energy - Sebastian Blechmann.pptx
Smart Energy - Sebastian Blechmann.pptxSmart Energy - Sebastian Blechmann.pptx
Smart Energy - Sebastian Blechmann.pptx
 
Systems Engineering Challenges
Systems Engineering ChallengesSystems Engineering Challenges
Systems Engineering Challenges
 
GENESIS – A Generic RDF Data Access Interface
GENESIS – A Generic RDF Data Access InterfaceGENESIS – A Generic RDF Data Access Interface
GENESIS – A Generic RDF Data Access Interface
 
Resume
ResumeResume
Resume
 

Más de Facultad de Informática UCM

Más de Facultad de Informática UCM (20)

¿Por qué debemos seguir trabajando en álgebra lineal?
¿Por qué debemos seguir trabajando en álgebra lineal?¿Por qué debemos seguir trabajando en álgebra lineal?
¿Por qué debemos seguir trabajando en álgebra lineal?
 
TECNOPOLÍTICA Y ACTIVISMO DE DATOS: EL MAPEO COMO FORMA DE RESILIENCIA ANTE L...
TECNOPOLÍTICA Y ACTIVISMO DE DATOS: EL MAPEO COMO FORMA DE RESILIENCIA ANTE L...TECNOPOLÍTICA Y ACTIVISMO DE DATOS: EL MAPEO COMO FORMA DE RESILIENCIA ANTE L...
TECNOPOLÍTICA Y ACTIVISMO DE DATOS: EL MAPEO COMO FORMA DE RESILIENCIA ANTE L...
 
DRAC: Designing RISC-V-based Accelerators for next generation Computers
DRAC: Designing RISC-V-based Accelerators for next generation ComputersDRAC: Designing RISC-V-based Accelerators for next generation Computers
DRAC: Designing RISC-V-based Accelerators for next generation Computers
 
uElectronics ongoing activities at ESA
uElectronics ongoing activities at ESAuElectronics ongoing activities at ESA
uElectronics ongoing activities at ESA
 
Tendencias en el diseño de procesadores con arquitectura Arm
Tendencias en el diseño de procesadores con arquitectura ArmTendencias en el diseño de procesadores con arquitectura Arm
Tendencias en el diseño de procesadores con arquitectura Arm
 
Formalizing Mathematics in Lean
Formalizing Mathematics in LeanFormalizing Mathematics in Lean
Formalizing Mathematics in Lean
 
Introduction to Quantum Computing and Quantum Service Oriented Computing
Introduction to Quantum Computing and Quantum Service Oriented ComputingIntroduction to Quantum Computing and Quantum Service Oriented Computing
Introduction to Quantum Computing and Quantum Service Oriented Computing
 
Computer Design Concepts for Machine Learning
Computer Design Concepts for Machine LearningComputer Design Concepts for Machine Learning
Computer Design Concepts for Machine Learning
 
Inteligencia Artificial en la atención sanitaria del futuro
Inteligencia Artificial en la atención sanitaria del futuroInteligencia Artificial en la atención sanitaria del futuro
Inteligencia Artificial en la atención sanitaria del futuro
 
Design Automation Approaches for Real-Time Edge Computing for Science Applic...
 Design Automation Approaches for Real-Time Edge Computing for Science Applic... Design Automation Approaches for Real-Time Edge Computing for Science Applic...
Design Automation Approaches for Real-Time Edge Computing for Science Applic...
 
Estrategias de navegación para robótica móvil de campo: caso de estudio proye...
Estrategias de navegación para robótica móvil de campo: caso de estudio proye...Estrategias de navegación para robótica móvil de campo: caso de estudio proye...
Estrategias de navegación para robótica móvil de campo: caso de estudio proye...
 
Fault-tolerance Quantum computation and Quantum Error Correction
Fault-tolerance Quantum computation and Quantum Error CorrectionFault-tolerance Quantum computation and Quantum Error Correction
Fault-tolerance Quantum computation and Quantum Error Correction
 
Cómo construir un chatbot inteligente sin morir en el intento
Cómo construir un chatbot inteligente sin morir en el intentoCómo construir un chatbot inteligente sin morir en el intento
Cómo construir un chatbot inteligente sin morir en el intento
 
Automatic generation of hardware memory architectures for HPC
Automatic generation of hardware memory architectures for HPCAutomatic generation of hardware memory architectures for HPC
Automatic generation of hardware memory architectures for HPC
 
Type and proof structures for concurrency
Type and proof structures for concurrencyType and proof structures for concurrency
Type and proof structures for concurrency
 
Hardware/software security contracts: Principled foundations for building sec...
Hardware/software security contracts: Principled foundations for building sec...Hardware/software security contracts: Principled foundations for building sec...
Hardware/software security contracts: Principled foundations for building sec...
 
Jose carlossancho slidesLa seguridad en el desarrollo de software implementad...
Jose carlossancho slidesLa seguridad en el desarrollo de software implementad...Jose carlossancho slidesLa seguridad en el desarrollo de software implementad...
Jose carlossancho slidesLa seguridad en el desarrollo de software implementad...
 
Do you trust your artificial intelligence system?
Do you trust your artificial intelligence system?Do you trust your artificial intelligence system?
Do you trust your artificial intelligence system?
 
Redes neuronales y reinforcement learning. Aplicación en energía eólica.
Redes neuronales y reinforcement learning. Aplicación en energía eólica.Redes neuronales y reinforcement learning. Aplicación en energía eólica.
Redes neuronales y reinforcement learning. Aplicación en energía eólica.
 
Challenges and Opportunities for AI and Data analytics in Offshore wind
Challenges and Opportunities for AI and Data analytics in Offshore windChallenges and Opportunities for AI and Data analytics in Offshore wind
Challenges and Opportunities for AI and Data analytics in Offshore wind
 

Último

Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Christo Ananth
 

Último (20)

Roadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and RoutesRoadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and Routes
 
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performance
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptx
 
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
 
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
 
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptx
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptx
 
UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM.        DIMENSIONAL ANALYSISUNIT-III FMM.        DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSIS
 
University management System project report..pdf
University management System project report..pdfUniversity management System project report..pdf
University management System project report..pdf
 
Glass Ceramics: Processing and Properties
Glass Ceramics: Processing and PropertiesGlass Ceramics: Processing and Properties
Glass Ceramics: Processing and Properties
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
 

Design and Validation of cloud storage systems using Maude - Dr. Peter Csaba Ölveczky

  • 1. Design and Validation of Cloud Storage Systems using Maude Peter Csaba ¨Olveczky University of Oslo University of Illinois at Urbana-Champaign Based on joint work with Jon Grov and members of UIUC’s Center for Assured Cloud Computing Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 1 / 58
  • 2. Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 2 / 58
  • 3. Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 3 / 58
  • 4. Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 4 / 58
  • 5. Cloud Computing Data Stores Cloud computing systems store/retrieve large amounts of data Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 5 / 58
  • 6. Availability Data should always be available network/site failures, network congestion, scheduled upgrades −→ data must be replicated Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 6 / 58
  • 7. Availability Data should always be available network/site failures, network congestion, scheduled upgrades −→ data must be replicated Large and growing data Facebook (2014): 300 petabytes data; 350M photos uploaded every day −→ data must be partitioned Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 6 / 58
  • 8. Consistency in Replicated Systems Figure by Jiaqing Du Consistency: All replicas of a data item should have same value Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 7 / 58
  • 9. “CAP Theorem” Data consistency + partition tolerance + availability impossible (Figure from http://flux7.com/blogs/nosql/cap-theorem-why-does-it-matter/) Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 8 / 58
  • 10. Slightly Different View Trade-off consistency level ←→ latency Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 9 / 58
  • 11. Eventual Consistency Weak consistency OK for some applications Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 10 / 58
  • 12. Eventual Consistency Weak consistency OK for some applications . . . but not others: Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 10 / 58
  • 13. Designing Data Stores Complex systems size replication concurrence fault tolerance Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 11 / 58
  • 14. Designing Data Stores Complex systems size replication concurrence fault tolerance Many hours of “whiteboard analysis” Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 11 / 58
  • 15. Validating Data Store Designs Correctness: “hand proofs” error prone informal key assumptions implicit does not scale to nontrivial systems Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 12 / 58
  • 16. Validating Data Store Designs Correctness: “hand proofs” error prone informal key assumptions implicit does not scale to nontrivial systems Performance: simulation tools, real implementations additional artifact cannot be used to reason about correctness Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 12 / 58
  • 17. Our Approach: Formal Methods Use formal methods to develop and validate designs define mathematical model of system use mathematical rules to analyze system Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 13 / 58
  • 18. Our Approach: Formal Methods Use formal methods to develop and validate designs define mathematical model of system use mathematical rules to analyze system Find errors early! Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 13 / 58
  • 19. Using Formal Methods (I): Validation Perspective Formal system model S precise mathematical model makes assumptions precise and explicit amenable to mathematical analysis Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 14 / 58
  • 20. Using Formal Methods (I): Validation Perspective Formal system model S precise mathematical model makes assumptions precise and explicit amenable to mathematical analysis Formal property specification P precise description of consistency model can check whether S |= P Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 14 / 58
  • 21. Using Formal Methods (I): Validation Perspective Formal system model S precise mathematical model makes assumptions precise and explicit amenable to mathematical analysis Formal property specification P precise description of consistency model can check whether S |= P What about performance analysis? Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 14 / 58
  • 22. Using Formal Methods (II): Software Engineering Perspective Need: expressive and intuitive modeling language Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 15 / 58
  • 23. Using Formal Methods (II): Software Engineering Perspective Need: expressive and intuitive modeling language expressive and intuitive property specification language Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 15 / 58
  • 24. Using Formal Methods (II): Software Engineering Perspective Need: expressive and intuitive modeling language expressive and intuitive property specification language automatically check whether design satisfies property quick and extensive feedback saves days of whiteboard analysis “extensive and automatic test suite” Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 15 / 58
  • 25. Using Formal Methods (II): Software Engineering Perspective Need: expressive and intuitive modeling language expressive and intuitive property specification language automatically check whether design satisfies property quick and extensive feedback saves days of whiteboard analysis “extensive and automatic test suite” design model also for performance analysis! no new artifact for performance analysis Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 15 / 58
  • 26. Which Formal Language/Tool? Difficult challenges: intuitive expressive useful automatic analyses both correctness and performance analysis complex properties to check mature tool support real-time and probabilistic features Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 16 / 58
  • 27. Our Framework: Rewriting Logic Rewriting logic: equations and rewrite rules expressive simple/intuitive object-oriented Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 17 / 58
  • 28. Our Framework: Rewriting Logic Rewriting logic: equations and rewrite rules expressive simple/intuitive object-oriented Maude tool: simulation temporal logic model checking expressive property specification language Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 17 / 58
  • 29. Our Framework: Rewriting Logic Rewriting logic: equations and rewrite rules expressive simple/intuitive object-oriented Maude tool: simulation temporal logic model checking expressive property specification language Extensions: real-time systems probabilistic systems Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 17 / 58
  • 30. Maude: Software Engineering Perspective I Models can be developed quickly Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 18 / 58
  • 31. Maude: Software Engineering Perspective I Models can be developed quickly Simulation gives quick feedback (rapid prototyping) Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 18 / 58
  • 32. Maude: Software Engineering Perspective I Models can be developed quickly Simulation gives quick feedback (rapid prototyping) Model checking: analyze all behaviors from one initial state http://embsys.technikum-wien.at/projects/decs/verification/formalmethods.php formal test-driven development: “test-driven development approach where many complex scenarios can be quickly tested by model checking” Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 18 / 58
  • 33. Maude: Software Engineering Perspective (cont.) What about performance analysis? Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 19 / 58
  • 34. Maude: Software Engineering Perspective (cont.) What about performance analysis? 1 (Randomized) simulations Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 19 / 58
  • 35. Maude: Software Engineering Perspective (cont.) What about performance analysis? 1 (Randomized) simulations 2 Probabilistic analysis (using PVeStA) statistical model checking Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 19 / 58
  • 36. Maude: Software Engineering Perspective (cont.) Same artifact for: precise system description rapid prototyping extensive testing correctness analysis performance estimation Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 20 / 58
  • 37. Case Study I Modeling, Analyzing, and Extending Megastore Joint work with Jon Grov (U. Oslo) Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 21 / 58
  • 38. Megastore Megastore: Google’s wide-area replicated data store 3 billion write and 20 billion read transactions daily (2011) Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 22 / 58
  • 39. Megastore: Key Ideas (I) (Figure from http://cse708.blogspot.jp/2011/03/megastore-providing-scalable-highly.html) Data divided into entity groups Peter’s email Books on rewriting logic Narciso’s documents Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 23 / 58
  • 40. Megastore: Key Ideas (II) Consistency for transactions accessing a single entity group no guarantee if transaction reads multiple entity groups Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 24 / 58
  • 41. Our Work [Developed and] formalized [our version of the] Megastore [approach] in Maude Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 25 / 58
  • 42. Our Work [Developed and] formalized [our version of the] Megastore [approach] in Maude first (public) formalization/detailed description of Megastore Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 25 / 58
  • 43. Our Work [Developed and] formalized [our version of the] Megastore [approach] in Maude first (public) formalization/detailed description of Megastore 56 rewrite rules (37 for fault tolerance features) Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 25 / 58
  • 44. Performance Estimation Key performance measures: average transaction latency number of committed/aborted transactions Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 26 / 58
  • 45. Performance Estimation Key performance measures: average transaction latency number of committed/aborted transactions Randomly generated transactions (rate 2.5 TPS) Network delays: 30% 30% 30% 10% Madrid ↔ Paris 10 15 20 50 Madrid ↔ New York 30 35 40 100 Paris ↔ New York 30 35 40 100 Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 26 / 58
  • 46. Performance Estimation Key performance measures: average transaction latency number of committed/aborted transactions Randomly generated transactions (rate 2.5 TPS) Network delays: 30% 30% 30% 10% Madrid ↔ Paris 10 15 20 50 Madrid ↔ New York 30 35 40 100 Paris ↔ New York 30 35 40 100 Simulating for 200 seconds: Avg. latency (ms) Commits Aborts Madrid 218 109 38 New York 336 129 16 Paris 331 116 21 Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 26 / 58
  • 47. Megastore-CGC: extending Megastore Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 27 / 58
  • 48. Motivation Some transactions must access multiple entity groups Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 28 / 58
  • 49. Motivation Some transactions must access multiple entity groups Our work: extend Megastore with consistency for transactions accessing multiple entity groups Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 28 / 58
  • 50. Motivation Some transactions must access multiple entity groups Our work: extend Megastore with consistency for transactions accessing multiple entity groups Megastore-CGC piggybacks ordering and validation onto Megastore’s coordination protocol no additional messages for validation/commit! maintains Megastore’s performance and fault tolerance Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 28 / 58
  • 51. Performance Comparison using Real-Time Maude Simulating for 1000 seconds (no failures) Megastore: Commits Aborts Avg. latency (ms) Madrid 652 152 126 Paris 704 100 118 New York 640 172 151 Megastore-CGC: Commits Aborts Val. aborts Avg.latency (ms) Madrid 660 144 0 123 Paris 674 115 15 118 New York 631 171 10 150 Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 29 / 58
  • 52. Model Checking Megastore-CGC Model checking scenarios 5 transactions , no failures, message delay 30 ms or 80 ms −→ 108,279 reachable states, 124 seconds 3 transactions, one site failure and fixed message delay −→ 1,874,946 reachable states, 6,311 seconds 3 transactions, fixed message delay and one message failure −→ 265,410 reachable states, 858 seconds Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 30 / 58
  • 53. Case Study II Work by Si Liu, Muntasir Raihan Rahman, Stephen Skeirik, Indranil Gupta, Jos´e Meseguer, Son Nguyen, Jatin Ganhotra (ICFEM’14, QEST’15) Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 31 / 58
  • 54. Apache Cassandra Key-value data store originally developed at Facebook Used by Amadeus, Apple, CERN, IBM, Netflix, Facebook/Instagram, Twitter, . . . Open source Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 32 / 58
  • 55. Cassandra Overview Read consistency either one, quorum, or all Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 33 / 58
  • 56. Cassandra Overview Read consistency either one, quorum, or all Write consistency either zero, one, quorum, or all [Figures from http://www.slideshare.net/nuboat/cassandra-distributed-data-store] Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 33 / 58
  • 57. Motivation 1 Formal model from 345K LOC allows experimenting with different optimizations/variations 2 Analyze basic property: eventual consistency 3 When/how often does Cassandra give stronger guarantees? strong consistency read-your-writes 4 Performance evaluation: compare PVeStA analyses with real implementations Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 34 / 58
  • 58. Formal Analysis with Multiple Clients Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 35 / 58
  • 59. Performance Estimation Formal model + PVeStA vs. actual implementation Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 36 / 58
  • 60. P-Store P-Store [N. Schiper, P. Sutra, and F. Pedone; IEEE SRDS’10] Replicated and partitioned data store Serializability Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 37 / 58
  • 61. P-Store P-Store [N. Schiper, P. Sutra, and F. Pedone; IEEE SRDS’10] Replicated and partitioned data store Serializability Atomic multicast orders concurrent transactions Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 37 / 58
  • 62. P-Store P-Store [N. Schiper, P. Sutra, and F. Pedone; IEEE SRDS’10] Replicated and partitioned data store Serializability Atomic multicast orders concurrent transactions Group commitment for atomic commit Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 37 / 58
  • 63. Atomic Multicast Definition Atomic Multicast: Consistent reception order of messages (a): any pair of nodes receive the same atomic-multicast messages in the same order (b): induced “global read order” must be acyclic Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 38 / 58
  • 64. Atomic Multicast Definition Atomic Multicast: Consistent reception order of messages (a): any pair of nodes receive the same atomic-multicast messages in the same order (b): induced “global read order” must be acyclic Example A reads m1 < m2 B reads m2 < m3 C reads m3 < m1 satisfies (a) but not (b) Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 38 / 58
  • 65. Atomic Multicast in Maude (I) Fundamental problem in distributed systems Impose order on conflicting concurrent transactions Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 39 / 58
  • 66. Atomic Multicast in Maude (I) Fundamental problem in distributed systems Impose order on conflicting concurrent transactions Many algorithms for atomic multicast Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 39 / 58
  • 67. Atomic Multicast in Maude (I) Fundamental problem in distributed systems Impose order on conflicting concurrent transactions Many algorithms for atomic multicast Define generic atomic multicast primitive in Maude abstract covers all possible receiving orders Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 39 / 58
  • 68. Atomic Multicast in Maude (I) Fundamental problem in distributed systems Impose order on conflicting concurrent transactions Many algorithms for atomic multicast Define generic atomic multicast primitive in Maude abstract covers all possible receiving orders Infrastructure stores (un)read AM messages Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 39 / 58
  • 69. My Work: Atomic Multicast in Maude (II) Atomic-multicast message M: rl [atomic-multicast] : < O : Node | msgToSend : M, receivers : OS > => < O : Node | ... > (atomic-multicast M from O to OS) . Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 40 / 58
  • 70. My Work: Atomic Multicast in Maude (II) Atomic-multicast message M: rl [atomic-multicast] : < O : Node | msgToSend : M, receivers : OS > => < O : Node | ... > (atomic-multicast M from O to OS) . Read: crl [receiveAtomicMulticast] : (msg M from O2 to O) < O : Node | ... > AM-TABLE => < O : Node | ... > updateAM(MC, O, AM-TABLE) if okToRead(MC, O, AM-TABLE) . Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 40 / 58
  • 71. Analyzing P-Store Find all reachable final states from init3: Maude> (search init3 =>! C:Configuration .) Solution 1 C:Configuration --> ... < c1 : Client | pendingTrans : t1, txns : emptyTransList > < c2 : Client | pendingTrans : t2, txns : emptyTransList > < r1 : PStoreReplica | aborted : none, committed : < t1 : Transaction | ... > < r2 : PStoreReplica | aborted : none, committed : < t2 : Transaction | ... > ... Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 41 / 58
  • 72. Analyzing P-Store Find all reachable final states from init3: Maude> (search init3 =>! C:Configuration .) Solution 1 C:Configuration --> ... < c1 : Client | pendingTrans : t1, txns : emptyTransList > < c2 : Client | pendingTrans : t2, txns : emptyTransList > < r1 : PStoreReplica | aborted : none, committed : < t1 : Transaction | ... > < r2 : PStoreReplica | aborted : none, committed : < t2 : Transaction | ... > ... sites validate transactions but client never gets result Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 41 / 58
  • 73. Analyzing P-Store (cont.) Solution 5 ... < r1 : PStoreReplica | aborted : none, committed : none, submitted : < t1 : Transaction | ... >, ... > < r2 : PStoreReplica | aborted : none, committed : < t2 : Transaction| ... > ... > Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 42 / 58
  • 74. Analyzing P-Store (cont.) Solution 5 ... < r1 : PStoreReplica | aborted : none, committed : none, submitted : < t1 : Transaction | ... >, ... > < r2 : PStoreReplica | aborted : none, committed : < t2 : Transaction| ... > ... > Host does not validate t1 even when needed info known Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 42 / 58
  • 75. Fixing P-Store Found the source of the errors all replicas must be involved in voting and notification not just write replicas Modeled and analyzed proposed corrected version Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 43 / 58
  • 76. P-Store Summary “P-Store verified” 3 significant errors found one confusing definition key assumption missing Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 44 / 58
  • 77. Our Conclusions Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 45 / 58
  • 78. Our Conclusions I Developed formal models of large industrial data stores Google’s Megastore (from brief description) Apache Cassandra (from 345K LOC and description) P-Store (academic) Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 46 / 58
  • 79. Our Conclusions I Developed formal models of large industrial data stores Google’s Megastore (from brief description) Apache Cassandra (from 345K LOC and description) P-Store (academic) Automatic model checking analysis of consistency properties Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 46 / 58
  • 80. Our Conclusions I Developed formal models of large industrial data stores Google’s Megastore (from brief description) Apache Cassandra (from 345K LOC and description) P-Store (academic) Automatic model checking analysis of consistency properties Designed own transactional data stores Megastore-CGC variation of Cassandra Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 46 / 58
  • 81. Our Conclusions I Developed formal models of large industrial data stores Google’s Megastore (from brief description) Apache Cassandra (from 345K LOC and description) P-Store (academic) Automatic model checking analysis of consistency properties Designed own transactional data stores Megastore-CGC variation of Cassandra Errors, ambiguities, missing assumptions found in “verified” P-Store Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 46 / 58
  • 82. Our Conclusions I Developed formal models of large industrial data stores Google’s Megastore (from brief description) Apache Cassandra (from 345K LOC and description) P-Store (academic) Automatic model checking analysis of consistency properties Designed own transactional data stores Megastore-CGC variation of Cassandra Errors, ambiguities, missing assumptions found in “verified” P-Store Maude/PVeStA performance estimation close to real implementations Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 46 / 58
  • 83. Our “Software Engineering” Conclusions Quickly develop formal models/prototypes of complex systems experiment with different design choices Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 47 / 58
  • 84. Our “Software Engineering” Conclusions Quickly develop formal models/prototypes of complex systems experiment with different design choices Simulation and model checking throughout design phase model-checking-based-testing for subtle “corner cases” replaces days of whiteboard analysis too many scenarios for standard test-based development catch bugs early! Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 47 / 58
  • 85. Our “Software Engineering” Conclusions Quickly develop formal models/prototypes of complex systems experiment with different design choices Simulation and model checking throughout design phase model-checking-based-testing for subtle “corner cases” replaces days of whiteboard analysis too many scenarios for standard test-based development catch bugs early! Single artifact for system description rapid prototyping model checking performance estimation Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 47 / 58
  • 86. Our “Software Engineering” Conclusions Quickly develop formal models/prototypes of complex systems experiment with different design choices Simulation and model checking throughout design phase model-checking-based-testing for subtle “corner cases” replaces days of whiteboard analysis too many scenarios for standard test-based development catch bugs early! Single artifact for system description rapid prototyping model checking performance estimation Megastore and Megastore-CGC modeler had no formal methods experience Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 47 / 58
  • 87. Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 48 / 58
  • 88. Amazon Web Services Amazon Web Services (AWS): world’s largest cloud computing service provider more profitable than Amazon’s retail business Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 49 / 58
  • 89. Amazon Web Services Amazon Web Services (AWS): world’s largest cloud computing service provider more profitable than Amazon’s retail business Amazon Simple Storage Service (S3) stores > 3 trillion objects 99.99% availability of objects > 1 million requests per second DynamoDB data store Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 49 / 58
  • 90. Amazon Web Services and Formal Methods Formal methods used extensively at AWS during design of S3, DynamoDB, . . . Used Lamports TLA+ model checking Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 50 / 58
  • 91. Experiences at Amazon WS Model checking finds “corner case” bugs that would be hard to find with standard industrial methods: Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 51 / 58
  • 92. Experiences at Amazon WS Model checking finds “corner case” bugs that would be hard to find with standard industrial methods: “We have found that standard verification techniques in industry are necessary but not sufficient. We routinely use deep design reviews, static code analysis, stress testing, and fault-injection testing but still find that subtle bugs can hide in complex fault-tolerant systems.” Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 51 / 58
  • 93. Experiences at Amazon WS Model checking finds “corner case” bugs that would be hard to find with standard industrial methods: “the model checker found a bug that could lead to losing data [...]. This was a very subtle bug; the shortest error trace exhibiting the bug included 35 high-level steps. [...] The bug had passed unnoticed through extensive design reviews, code reviews, and testing.” Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 51 / 58
  • 94. Experiences at Amazon WS II A formal specification is a valuable precise description of an algorithm: Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 52 / 58
  • 95. Experiences at Amazon WS II A formal specification is a valuable precise description of an algorithm: “the author is forced to think more clearly, helping eliminating “hand waving,” and tools can be applied to check for errors in the design, even while it is being written. In contrast, conventional design documents consist of prose, static diagrams, and perhaps psuedo-code in an ad hoc untestable language.” Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 52 / 58
  • 96. Experiences at Amazon WS II A formal specification is a valuable precise description of an algorithm: “Talk and design documents can be ambiguous or incomplete, and the executable code is much too large to absorb quickly and might not precisely reflect the intended design. In contrast, a formal specification is precise, short, and can be explored and experimented on with tools.” Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 52 / 58
  • 97. Experiences at Amazon WS III Formal methods are surprisingly feasible for mainstream software development and give good return on investment: Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 53 / 58
  • 98. Experiences at Amazon WS III Formal methods are surprisingly feasible for mainstream software development and give good return on investment: “In industry, formal methods have a reputation for requiring a huge amount of training and effort to verify a tiny piece of relatively straightforward code. Our experience with TLA+ shows this perception to be wrong. [...] Amazon engineers have used TLA+ on 10 large complex real-world systems. In each, TLA+ has added significant value. [...] Engineers have been able to learn TLA+ from scratch and get useful results in two to three weeks.” Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 53 / 58
  • 99. Experiences at Amazon WS III Formal methods are surprisingly feasible for mainstream software development and give good return on investment: “Using TLA+ in place of traditional proof writing would thus likely have improved time to market, in addition to achieving greater confidence in the system’s correctness.” Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 53 / 58
  • 100. Experiences at Amazon WS III Quick and easy to experiment with different design choices: Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 54 / 58
  • 101. Experiences at Amazon WS III Quick and easy to experiment with different design choices: “We have been able to make innovative performance optimizations [...] we would not have dared to do without having model-checked those changes. A precise, testable description of a system becomes a what-if tool for designs.” Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 54 / 58
  • 102. Experiences at Amazon WS: Limitations TLA+ did/could not analyze performance degradation Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 55 / 58
  • 103. Maude vs TLA+ Maude should be better suited! more intuitive and expressive specification language OO hierarchical states dynamic object/message creation/deletion . . . Support for real-time and probabilistic systems Also for performance estimation! Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 56 / 58
  • 104. Conclusions at Amazon Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 57 / 58
  • 105. Take Away from Talk Formal methods can be an efficient way to design test describe validate correctness and performance experiment with different design choices industrial state-of-the-art fault-tolerant distributed systems also for non-experts Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 58 / 58
  • 106. Take Away from Talk Formal methods can be an efficient way to design test describe validate correctness and performance experiment with different design choices industrial state-of-the-art fault-tolerant distributed systems also for non-experts Maude suitable modeling language and analysis toolset Peter Csaba ¨Olveczky (U. Oslo/UIUC) Cloud Storage Systems in Maude UCM, February 20, 2017 58 / 58