Extreme availability and self-healing data with 
CRDTs 
Uwe Friedrichsen (codecentric AG) – NoSQL matters – Barcelona, 22. November 2014
@ufried 
Uwe Friedrichsen | uwe.friedrichsen@codecentric.de | http://slideshare.net/ufried | http://ufried.tumblr.com
Why NoSQL? 
• Scalability 
• Easier schema evolution 
• Availability on unreliable off-the-shelf (OTS) hardware 
• It’s more fun ...
Challenges 
• Giving up ACID transactions 
• (Temporary) anomalies and inconsistencies 
short-term due to replication or long-term due to partitioning 
It might happen. Thus, it will happen!
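[Figure: CAP triangle with corners C (Consistency), A (Availability) and P (Partition Tolerance). Strict Consistency (ACID / 2PC) lies on the C-A edge, Strong Consistency (Quorum R&W / Paxos) on the C-P edge, and Eventual Consistency (CRDT / Gossip / Hinted Handoff) on the A-P edge.]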
Strict Consistency (CA) 
• Great programming model 
no anomalies or inconsistencies need to be considered 
• Does not scale well 
best for single node databases 
“We know ACID – It works!” 
“We know 2PC – It sucks!” 
Use for moderate data amounts
And what if I need more data? 
• Distributed datastore 
• Partition tolerance is a must 
• Need to give up strict consistency (CP or AP)
Strong Consistency (CP) 
• Majority based consistency model 
can tolerate up to N nodes failing out of 2N+1 nodes 
• Good programming model 
Single-copy consistency 
• Trades availability for consistency 
in case of partitioning 
Paxos (for sequential consistency) 
Quorum-based reads & writes
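The majority arithmetic on this slide can be checked with a few lines of Python. This is only a sketch: the function names `majority`, `tolerated_failures` and `quorums_overlap` are illustrative, and the overlap rule (read quorum plus write quorum must exceed the cluster size) is the usual condition for quorum-based reads and writes, which the slide does not spell out.

```python
def majority(n_nodes: int) -> int:
    """Smallest number of nodes that forms a majority of n_nodes."""
    return n_nodes // 2 + 1


def tolerated_failures(n_nodes: int) -> int:
    """Number of nodes that may fail while a majority stays reachable."""
    return n_nodes - majority(n_nodes)


def quorums_overlap(n_nodes: int, r: int, w: int) -> bool:
    """True if every read quorum of size r intersects every write quorum of size w."""
    return r + w > n_nodes


# A cluster of 2N+1 nodes tolerates N failures.
for n in (1, 2, 3):
    cluster_size = 2 * n + 1
    print(cluster_size, "nodes tolerate", tolerated_failures(cluster_size), "failures")
```

With five nodes, reading from three and writing to three always overlaps in at least one node, whereas reading from two and writing to three does not.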
And what if I need more availability? 
• Need to give up strong consistency (CP) 
• Relax required consistency properties even more 
• Leads to eventual consistency (AP)
Eventual Consistency (AP) 
• Gives up some consistency guarantees 
no sequential consistency, anomalies become visible 
• Maximum availability possible 
can tolerate up to N-1 nodes failing out of N nodes 
• Challenging programming model 
anomalies usually need to be resolved explicitly 
Gossip / Hinted Handoffs 
CRDT
Conflict-free Replicated Data Types 
• Eventually consistent, self-stabilizing data structures 
• Designed for maximum availability 
• Tolerates up to N-1 out of N nodes failing 
State-based CRDT: Convergent Replicated Data Type (CvRDT) 
Operation-based CRDT: Commutative Replicated Data Type (CmRDT)
A bit of theory first ...
Convergent Replicated Data Type 
State-based CRDT – CvRDT 
• All replicas (usually) connected 
• Exchange state between replicas, calculate new state on target replica 
• State transfer at least once over eventually-reliable channels 
• The set of possible states forms a semilattice 
• Partially ordered set in which every finite, non-empty subset has a Least Upper Bound (LUB) 
• All state changes advance upwards with respect to the partial order
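In practice, the semilattice requirement means the merge function must be commutative, associative and idempotent. The following sketch checks these three properties by brute force on small sample values; the helper `is_semilattice_join` and the sample data are illustrative assumptions, not part of the talk.

```python
from itertools import product


def join_max(a: int, b: int) -> int:
    """Least upper bound of two integers under the usual order."""
    return max(a, b)


def join_union(a: frozenset, b: frozenset) -> frozenset:
    """Least upper bound of two sets under the subset order."""
    return a | b


def is_semilattice_join(join, samples) -> bool:
    """Check commutativity, associativity and idempotence on the samples."""
    for a, b, c in product(samples, repeat=3):
        if join(a, b) != join(b, a):
            return False
        if join(join(a, b), c) != join(a, join(b, c)):
            return False
        if join(a, a) != a:
            return False
    return True


ints = [0, 1, 2, 5]
sets = [frozenset(), frozenset({"x"}), frozenset({"y"}), frozenset({"x", "y"})]

max_ok = is_semilattice_join(join_max, ints)
union_ok = is_semilattice_join(join_union, sets)
# Addition is commutative and associative but not idempotent.
add_ok = is_semilattice_join(lambda a, b: a + b, ints)
print(max_ok, union_ok, add_ok)
```

Idempotence is what makes "at least once" state transfer safe: merging the same state twice changes nothing.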
Commutative Replicated Data Type 
Operation-based CRDT - CmRDT 
• All replicas (usually) connected 
• Exchange update operations between replicas, apply on target replica 
• Reliable broadcast with ordering guarantee for non-concurrent updates 
• Concurrent updates must be commutative
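One way to see why concurrent updates must commute is to apply the same operations in every possible delivery order and compare the results. The sketch below does this for small integer states; `apply_all` and `converges` are hypothetical helper names introduced for illustration.

```python
from itertools import permutations


def apply_all(state, ops):
    """Apply the operations to the state in the given order."""
    for op in ops:
        state = op(state)
    return state


def converges(initial, ops) -> bool:
    """True if every delivery order of the concurrent ops yields the same state."""
    results = {apply_all(initial, order) for order in permutations(ops)}
    return len(results) == 1


def inc(i):
    return i + 1


def dec(i):
    return i - 1


def double(i):
    return i * 2


commuting = converges(0, [inc, inc, dec])   # increments and decrements commute
non_commuting = converges(0, [inc, double])  # (0 + 1) * 2 differs from 0 * 2 + 1
print(commuting, non_commuting)
```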
That’s enough theory ...
Counter
Op-based Counter 
Data 
Integer i 
Init 
i ≔ 0 
Query 
return i 
Operations 
increment(): i ≔ i + 1 
decrement(): i ≔ i - 1
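A minimal Python sketch of this op-based counter follows. The class name `OpCounter` and the string-encoded operations are illustrative choices. The sketch assumes that every operation reaches every replica exactly once, because increment and decrement are not idempotent.

```python
class OpCounter:
    """Op-based counter: replicas exchange operations, not state."""

    def __init__(self):
        self.i = 0

    def query(self) -> int:
        return self.i

    def apply(self, op: str) -> None:
        """Apply an operation, whether issued locally or received from a peer."""
        if op == "increment":
            self.i += 1
        elif op == "decrement":
            self.i -= 1
        else:
            raise ValueError(f"unknown operation: {op}")


r1, r2 = OpCounter(), OpCounter()
ops_from_r1 = ["increment", "increment"]
ops_from_r2 = ["decrement"]

# Each replica applies its own operations first, then the peer's.
for op in ops_from_r1 + ops_from_r2:
    r1.apply(op)
for op in ops_from_r2 + ops_from_r1:
    r2.apply(op)

print(r1.query(), r2.query())
```

Both replicas end at the same value although they saw the operations in different orders.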
State-based G-Counter (grow only) 
(Naïve approach) 
Data 
Integer i 
Init 
i ≔ 0 
Query 
return i 
Update 
increment(): i ≔ i + 1 
Merge(j) 
i ≔ max(i, j)
State-based G-Counter (grow only) 
(Naïve approach) 
[Figure: replicas R1, R2 and R3 are each initialized (I) with i = 0. Two replicas update (U) to i = 1. After the merges (M), the replicas hold i = 1.]
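The weakness of the naïve approach shows up as soon as two replicas increment concurrently. The sketch below implements the pseudocode as given (the class name `NaiveGCounter` is illustrative) and shows that merging with max() loses one of the two increments.

```python
class NaiveGCounter:
    """Naive state-based grow-only counter using a single integer."""

    def __init__(self):
        self.i = 0

    def query(self) -> int:
        return self.i

    def increment(self) -> None:
        self.i += 1

    def merge(self, j: int) -> None:
        self.i = max(self.i, j)


r1, r2 = NaiveGCounter(), NaiveGCounter()

# Two concurrent increments on different replicas.
r1.increment()
r2.increment()

# Exchange state in both directions.
r1.merge(r2.i)
r2.merge(r1.i)

# Both replicas agree, but on 1 rather than 2.
print(r1.query(), r2.query())
```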
State-based G-Counter (grow only) 
(Vector-based approach) 
Data 
Integer V[] / one element per replica set 
Init 
V ≔ [0, 0, ... , 0] 
Query 
return Σi V[i] 
Update 
increment(): V[i] ≔ V[i] + 1 / i is replica set number 
Merge(V′) 
∀i ∈ [0, n-1] : V[i] ≔ max(V[i], V′[i])
State-based G-Counter (grow only) 
(Vector-based approach) 
[Figure: replicas R1, R2 and R3 are each initialized (I) with V = [0, 0, 0]. R1 updates (U) to V = [1, 0, 0] and R3 updates to V = [0, 0, 1]. Successive merges (M) yield V = [1, 0, 0] and then V = [1, 0, 1].]
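The vector-based G-Counter translates almost line by line into Python. In this sketch the class name `GCounter` and the constructor arguments are assumptions; replica numbers are zero-based indexes into the vector.

```python
class GCounter:
    """State-based grow-only counter with one vector slot per replica."""

    def __init__(self, replica_id: int, n_replicas: int):
        self.replica_id = replica_id
        self.v = [0] * n_replicas

    def query(self) -> int:
        return sum(self.v)

    def increment(self) -> None:
        # A replica only ever writes to its own slot.
        self.v[self.replica_id] += 1

    def merge(self, other_v: list) -> None:
        # Element-wise maximum is the least upper bound of two vectors.
        self.v = [max(a, b) for a, b in zip(self.v, other_v)]


r1, r2, r3 = GCounter(0, 3), GCounter(1, 3), GCounter(2, 3)

r1.increment()   # r1.v == [1, 0, 0]
r3.increment()   # r3.v == [0, 0, 1]

r2.merge(r1.v)   # r2.v == [1, 0, 0]
r2.merge(r3.v)   # r2.v == [1, 0, 1]
r2.merge(r1.v)   # merging again changes nothing

print(r2.v, r2.query())
```

Because each replica increments only its own slot, concurrent increments land in different slots and none is lost in the merge.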
State-based PN-Counter (pos./neg.) 
• Simple vector approach as with G-Counter does not work 
• Violates monotonicity requirement of semilattice 
• Need to use two vectors 
• Vector P to track increments 
• Vector N to track decrements 
• Query result is Σi (P[i] – N[i])
State-based PN-Counter (pos./neg.) 
Data 
Integer P[], N[] / one element per replica set 
Init 
P ≔ [0, 0, ... , 0], N ≔ [0, 0, ... , 0] 
Query 
return Σi (P[i] – N[i]) 
Update 
increment(): P[i] ≔ P[i] + 1 / i is replica set number 
decrement(): N[i] ≔ N[i] + 1 / i is replica set number 
Merge(P′, N′) 
∀i ∈ [0, n-1] : P[i] ≔ max(P[i], P′[i]) 
∀i ∈ [0, n-1] : N[i] ≔ max(N[i], N′[i])
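A Python sketch of the PN-Counter, built from two grow-only vectors, could look as follows. The class name `PNCounter` and the zero-based replica indexes are illustrative assumptions.

```python
class PNCounter:
    """State-based counter: P tracks increments, N tracks decrements."""

    def __init__(self, replica_id: int, n_replicas: int):
        self.replica_id = replica_id
        self.p = [0] * n_replicas
        self.n = [0] * n_replicas

    def query(self) -> int:
        return sum(self.p) - sum(self.n)

    def increment(self) -> None:
        self.p[self.replica_id] += 1

    def decrement(self) -> None:
        # A decrement grows N, so both vectors only ever move upwards.
        self.n[self.replica_id] += 1

    def merge(self, other_p: list, other_n: list) -> None:
        self.p = [max(a, b) for a, b in zip(self.p, other_p)]
        self.n = [max(a, b) for a, b in zip(self.n, other_n)]


a, b = PNCounter(0, 2), PNCounter(1, 2)
a.increment()
a.increment()
b.decrement()

a.merge(b.p, b.n)
b.merge(a.p, a.n)
a.merge(b.p, b.n)  # repeated merge is harmless

print(a.query(), b.query())
```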
Non-negative Counter 
Problem: How to check a global invariant with local information only? 
• Approach 1: Only dec if local state is > 0 
• Concurrent decs could still lead to negative value 
• Approach 2: Externalize negative values as 0 
• inc(negative value) == noop(), violates counter semantics 
• Approach 3: Local invariant – only allow dec if P[i] - N[i] > 0 
• Works, but may be too strong limitation 
• Approach 4: Synchronize 
• Works, but violates assumptions and prerequisites of CRDTs
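Approach 3 can be sketched by guarding the decrement with the replica's own slots. The class and method names below (`LocalNonNegativeCounter`, `try_decrement`) are hypothetical. The example also shows why the limitation may be too strong: a replica that has seen a positive global value still refuses to decrement if it has not incremented itself.

```python
class LocalNonNegativeCounter:
    """PN-Counter variant that only decrements against its own increments."""

    def __init__(self, replica_id: int, n_replicas: int):
        self.replica_id = replica_id
        self.p = [0] * n_replicas
        self.n = [0] * n_replicas

    def query(self) -> int:
        return sum(self.p) - sum(self.n)

    def increment(self) -> None:
        self.p[self.replica_id] += 1

    def try_decrement(self) -> bool:
        """Decrement only if the local invariant P[i] - N[i] > 0 holds."""
        i = self.replica_id
        if self.p[i] - self.n[i] > 0:
            self.n[i] += 1
            return True
        return False

    def merge(self, other: "LocalNonNegativeCounter") -> None:
        self.p = [max(x, y) for x, y in zip(self.p, other.p)]
        self.n = [max(x, y) for x, y in zip(self.n, other.n)]


r0, r1 = LocalNonNegativeCounter(0, 2), LocalNonNegativeCounter(1, 2)
r0.increment()
r0.increment()
r1.merge(r0)

seen_by_r1 = r1.query()           # r1 sees the global value 2
refused = not r1.try_decrement()  # but has no local increments to spend
accepted = r0.try_decrement()     # r0 may decrement its own increments

r1.merge(r0)
r0.merge(r1)
print(seen_by_r1, refused, accepted, r0.query(), r1.query())
```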
Sets
Op-based Set 
(Naïve approach) 
Data 
Set S 
Init 
S ≔ {} 
Query(e) 
return e ∈ S 
Operations 
add(e) : S ≔ S ∪ {e} 
remove(e): S ≔ S \ {e}
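Op-based Set 
(Naïve approach) 
[Figure: replicas R1, R2 and R3 are each initialized (I) with S = {}. add(e) and rmv(e) operations are issued at different replicas and delivered downstream in different orders. States shown include S = {e} and S = {}, so the replicas do not all end in the same state.]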
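The naïve op-based set diverges because add and remove of the same element do not commute. The following sketch replays one concurrent add(e) and one concurrent remove(e) in the two possible delivery orders; the function names are illustrative.

```python
def add(s: frozenset, e) -> frozenset:
    """S := S ∪ {e}"""
    return s | {e}


def remove(s: frozenset, e) -> frozenset:
    """S := S \\ {e}"""
    return s - {e}


empty = frozenset()

# Replica A receives add(e) first, then remove(e).
replica_a = remove(add(empty, "e"), "e")

# Replica B receives the same two operations in the opposite order.
replica_b = add(remove(empty, "e"), "e")

print(set(replica_a), set(replica_b))
```

Both replicas have applied the same operations, yet one ends up empty and the other contains e.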
State-based G-Set (grow only) 
Data 
Set S 
Init 
S ≔ {} 
Query(e) 
return e ∈ S 
Update 
add(e): S ≔ S ∪ {e} 
Merge(S′) 
S ≔ S ∪ S′
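As a sketch, the G-Set needs little more than Python's built-in set. The class name `GSet` is an illustrative choice.

```python
class GSet:
    """State-based grow-only set: elements can be added but never removed."""

    def __init__(self):
        self.s = set()

    def query(self, e) -> bool:
        return e in self.s

    def add(self, e) -> None:
        self.s.add(e)

    def merge(self, other_s: set) -> None:
        # Set union is commutative, associative and idempotent.
        self.s |= other_s


x, y = GSet(), GSet()
x.add("a")
y.add("b")

x.merge(y.s)
y.merge(x.s)
x.merge(y.s)  # merging the same state again has no effect

print(sorted(x.s), x.s == y.s)
```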
State-based 2P-Set (two-phase) 
Data 
Set A, R / A: added, R: removed 
Init 
A ≔ {}, R ≔ {} 
Query(e) 
return e ∈ A ∧ e ∉ R 
Update 
add(e): A ≔ A ∪ {e} 
remove(e): (pre query(e)) R ≔ R ∪ {e} 
Merge(A′, R′) 
A ≔ A ∪ A′, R ≔ R ∪ R′
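A sketch of the 2P-Set in Python follows; the class name `TwoPSet` is illustrative. The example also shows the main consequence of the design: once an element is in R, adding it again has no visible effect.

```python
class TwoPSet:
    """State-based two-phase set: A holds added, R holds removed elements."""

    def __init__(self):
        self.a = set()
        self.r = set()

    def query(self, e) -> bool:
        return e in self.a and e not in self.r

    def add(self, e) -> None:
        self.a.add(e)

    def remove(self, e) -> None:
        # Precondition from the slide: the element must currently be present.
        if not self.query(e):
            raise ValueError("precondition failed: element is not in the set")
        self.r.add(e)

    def merge(self, other_a: set, other_r: set) -> None:
        self.a |= other_a
        self.r |= other_r


x, y = TwoPSet(), TwoPSet()
x.add("e")
y.merge(x.a, x.r)
y.remove("e")
x.merge(y.a, y.r)

present_after_remove = x.query("e")
x.add("e")  # the entry in R still wins
present_after_readd = x.query("e")
print(present_after_remove, present_after_readd)
```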
Op-based OR-Set (observed-remove) 
Data 
Set S / Set of pairs { (element e, unique tag u), ... } 
Init 
S ≔ {} 
Query(e) 
return ∃u : (e, u) ∈ S 
Operations 
add(e): S ≔ S ∪ { (e, u) } / u is generated unique tag 
remove(e): 
pre query(e) 
R ≔ { (e, u) | (e, u) ∈ S } / at source (“prepare”) 
S ≔ S \ R / downstream (“execute”)
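Op-based OR-Set (observed-remove) 
[Figure: replicas R1, R2 and R3 are each initialized (I) with S = {}. Two add(e) calls at different replicas produce the tagged pairs ea and eb, which are propagated as add(ea) and add(eb); a rmv(e) is propagated as rmv(ea). States shown include S = {ea}, S = {ea, eb} and S = {eb}; the replicas end with S = {eb}.]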
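The two-phase structure of the OR-Set operations (prepare at the source, execute downstream) can be sketched as below. The class name `ORSet`, the method names, the tuple encoding of operations and the use of `uuid.uuid4()` for unique tags are all assumptions made for illustration. The sketch relies on each remove being delivered after the adds it observed.

```python
import uuid


class ORSet:
    """Op-based observed-remove set holding (element, unique tag) pairs."""

    def __init__(self):
        self.s = set()

    def query(self, e) -> bool:
        return any(elem == e for elem, _tag in self.s)

    def prepare_add(self, e):
        """At source: attach a freshly generated unique tag."""
        return ("add", (e, uuid.uuid4().hex))

    def prepare_remove(self, e):
        """At source: collect only the pairs observed locally."""
        if not self.query(e):
            raise ValueError("precondition failed: element is not in the set")
        observed = frozenset(pair for pair in self.s if pair[0] == e)
        return ("remove", observed)

    def apply(self, op) -> None:
        """Downstream: execute a prepared operation on this replica."""
        kind, payload = op
        if kind == "add":
            self.s.add(payload)
        else:
            self.s -= payload


r1, r2 = ORSet(), ORSet()

add_a = r1.prepare_add("e")
r1.apply(add_a)
add_b = r2.prepare_add("e")   # concurrent add on another replica
r2.apply(add_b)

rmv = r1.prepare_remove("e")  # r1 has only observed add_a
r1.apply(rmv)

# Deliver the remaining operations to the other replica.
r1.apply(add_b)
r2.apply(add_a)
r2.apply(rmv)

print(r1.s == r2.s, r1.query("e"))
```

The remove only cancels the add it has observed, so the concurrent add survives and both replicas converge on the same single pair.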
More datatypes 
• Register 
• Dictionary (Map) 
• Tree 
• Graph 
• Array 
• List 
plus more representations for each datatype
Garbage collection 
• Sets could grow infinitely in worst case 
• Garbage collection possible, but a bit tricky 
• Out of scope for this session 
• Can induce surprising behavior sometimes 
• Sometimes stronger consensus is needed 
• Paxos, …
Limitations of CRDTs 
• Very weak consistency guarantees 
Strives for “quiescent consistency” 
• Eventually consistent 
Not suitable for a high-volume ID generator or the like 
• Not easy to understand and model 
• Not all data structures representable 
Use if availability is extremely important
Further reading 
1. Shapiro et al., Conflict-free Replicated Data Types, Inria Research report, 2011 
2. Shapiro et al., A comprehensive study of Convergent and Commutative Replicated Data Types, Inria Research report, 2011 
3. Basho Tag Archives: CRDT, https://basho.com/tag/crdt/ 
4. Leslie Lamport, Time, clocks, and the ordering of events in a distributed system, Communications of the ACM (1978)
Wrap-up 
• CAP requires rethinking consistency 
• Strict Consistency 
ACID / 2PC 
• Strong Consistency 
Quorum-based R&W, Paxos 
• Eventual Consistency 
CRDT, Gossip, Hinted Handoffs 
Pick your consistency model based on your consistency and availability requirements
The real world is not ACID 
Thus, it is perfectly fine to go for a relaxed consistency model
Uwe Friedrichsen – Extreme availability and self-healing data with CRDTs - NoSQL matters,Barcelona 2014

More Related Content

What's hot

Introduction into scalable graph analysis with Apache Giraph and Spark GraphX
Introduction into scalable graph analysis with Apache Giraph and Spark GraphXIntroduction into scalable graph analysis with Apache Giraph and Spark GraphX
Introduction into scalable graph analysis with Apache Giraph and Spark GraphX
rhatr
 
Anwar Rizal – Streaming & Parallel Decision Tree in Flink
Anwar Rizal – Streaming & Parallel Decision Tree in FlinkAnwar Rizal – Streaming & Parallel Decision Tree in Flink
Anwar Rizal – Streaming & Parallel Decision Tree in Flink
Flink Forward
 
Anomaly Detection with Apache Spark
Anomaly Detection with Apache SparkAnomaly Detection with Apache Spark
Anomaly Detection with Apache Spark
Cloudera, Inc.
 
Accumulo Summit 2015: Using D4M for rapid prototyping of analytics for Apache...
Accumulo Summit 2015: Using D4M for rapid prototyping of analytics for Apache...Accumulo Summit 2015: Using D4M for rapid prototyping of analytics for Apache...
Accumulo Summit 2015: Using D4M for rapid prototyping of analytics for Apache...
Accumulo Summit
 

What's hot (20)

Introduction into scalable graph analysis with Apache Giraph and Spark GraphX
Introduction into scalable graph analysis with Apache Giraph and Spark GraphXIntroduction into scalable graph analysis with Apache Giraph and Spark GraphX
Introduction into scalable graph analysis with Apache Giraph and Spark GraphX
 
Anwar Rizal – Streaming & Parallel Decision Tree in Flink
Anwar Rizal – Streaming & Parallel Decision Tree in FlinkAnwar Rizal – Streaming & Parallel Decision Tree in Flink
Anwar Rizal – Streaming & Parallel Decision Tree in Flink
 
Data streaming algorithms
Data streaming algorithmsData streaming algorithms
Data streaming algorithms
 
Generalized Linear Models with H2O
Generalized Linear Models with H2O Generalized Linear Models with H2O
Generalized Linear Models with H2O
 
Predictive Datacenter Analytics with Strymon
Predictive Datacenter Analytics with StrymonPredictive Datacenter Analytics with Strymon
Predictive Datacenter Analytics with Strymon
 
Time-Evolving Graph Processing On Commodity Clusters
Time-Evolving Graph Processing On Commodity ClustersTime-Evolving Graph Processing On Commodity Clusters
Time-Evolving Graph Processing On Commodity Clusters
 
Lazy Join Optimizations Without Upfront Statistics with Matteo Interlandi
Lazy Join Optimizations Without Upfront Statistics with Matteo InterlandiLazy Join Optimizations Without Upfront Statistics with Matteo Interlandi
Lazy Join Optimizations Without Upfront Statistics with Matteo Interlandi
 
ADMM-Based Scalable Machine Learning on Apache Spark with Sauptik Dhar and Mo...
ADMM-Based Scalable Machine Learning on Apache Spark with Sauptik Dhar and Mo...ADMM-Based Scalable Machine Learning on Apache Spark with Sauptik Dhar and Mo...
ADMM-Based Scalable Machine Learning on Apache Spark with Sauptik Dhar and Mo...
 
GraphX and Pregel - Apache Spark
GraphX and Pregel - Apache SparkGraphX and Pregel - Apache Spark
GraphX and Pregel - Apache Spark
 
Distributed Deep Learning + others for Spark Meetup
Distributed Deep Learning + others for Spark MeetupDistributed Deep Learning + others for Spark Meetup
Distributed Deep Learning + others for Spark Meetup
 
Anomaly Detection with Apache Spark
Anomaly Detection with Apache SparkAnomaly Detection with Apache Spark
Anomaly Detection with Apache Spark
 
Data Streaming Algorithms
Data Streaming AlgorithmsData Streaming Algorithms
Data Streaming Algorithms
 
Large Scale Machine Learning with Apache Spark
Large Scale Machine Learning with Apache SparkLarge Scale Machine Learning with Apache Spark
Large Scale Machine Learning with Apache Spark
 
Clustering: A Scikit Learn Tutorial
Clustering: A Scikit Learn TutorialClustering: A Scikit Learn Tutorial
Clustering: A Scikit Learn Tutorial
 
Latent Semantic Analysis of Wikipedia with Spark
Latent Semantic Analysis of Wikipedia with SparkLatent Semantic Analysis of Wikipedia with Spark
Latent Semantic Analysis of Wikipedia with Spark
 
Magellan FOSS4G Talk, Boston 2017
Magellan FOSS4G Talk, Boston 2017Magellan FOSS4G Talk, Boston 2017
Magellan FOSS4G Talk, Boston 2017
 
Clean, Learn and Visualise data with R
Clean, Learn and Visualise data with RClean, Learn and Visualise data with R
Clean, Learn and Visualise data with R
 
Accumulo Summit 2015: Using D4M for rapid prototyping of analytics for Apache...
Accumulo Summit 2015: Using D4M for rapid prototyping of analytics for Apache...Accumulo Summit 2015: Using D4M for rapid prototyping of analytics for Apache...
Accumulo Summit 2015: Using D4M for rapid prototyping of analytics for Apache...
 
Hybrid acquisition of temporal scopes for rdf data
Hybrid acquisition of temporal scopes for rdf dataHybrid acquisition of temporal scopes for rdf data
Hybrid acquisition of temporal scopes for rdf data
 
A More Scaleable Way of Making Recommendations with MLlib-(Xiangrui Meng, Dat...
A More Scaleable Way of Making Recommendations with MLlib-(Xiangrui Meng, Dat...A More Scaleable Way of Making Recommendations with MLlib-(Xiangrui Meng, Dat...
A More Scaleable Way of Making Recommendations with MLlib-(Xiangrui Meng, Dat...
 

Similar to Uwe Friedrichsen – Extreme availability and self-healing data with CRDTs - NoSQL matters,Barcelona 2014

Stack squeues lists
Stack squeues listsStack squeues lists
Stack squeues lists
James Wong
 
Stacksqueueslists
StacksqueueslistsStacksqueueslists
Stacksqueueslists
Fraboni Ec
 
Stacks queues lists
Stacks queues listsStacks queues lists
Stacks queues lists
Tony Nguyen
 
04-Data-Analysis-Overview.pptx
04-Data-Analysis-Overview.pptx04-Data-Analysis-Overview.pptx
04-Data-Analysis-Overview.pptx
Shree Shree
 

Similar to Uwe Friedrichsen – Extreme availability and self-healing data with CRDTs - NoSQL matters,Barcelona 2014 (20)

Self healing data
Self healing dataSelf healing data
Self healing data
 
No stress with state
No stress with stateNo stress with state
No stress with state
 
Approximation Data Structures for Streaming Applications
Approximation Data Structures for Streaming ApplicationsApproximation Data Structures for Streaming Applications
Approximation Data Structures for Streaming Applications
 
Jan vitek distributedrandomforest_5-2-2013
Jan vitek distributedrandomforest_5-2-2013Jan vitek distributedrandomforest_5-2-2013
Jan vitek distributedrandomforest_5-2-2013
 
Lecture 25
Lecture 25Lecture 25
Lecture 25
 
Stack squeues lists
Stack squeues listsStack squeues lists
Stack squeues lists
 
Stacks queues lists
Stacks queues listsStacks queues lists
Stacks queues lists
 
Stacksqueueslists
StacksqueueslistsStacksqueueslists
Stacksqueueslists
 
Stacks queues lists
Stacks queues listsStacks queues lists
Stacks queues lists
 
Stacks queues lists
Stacks queues listsStacks queues lists
Stacks queues lists
 
Stacks queues lists
Stacks queues listsStacks queues lists
Stacks queues lists
 
04-Data-Analysis-Overview.pptx
04-Data-Analysis-Overview.pptx04-Data-Analysis-Overview.pptx
04-Data-Analysis-Overview.pptx
 
Parallel Computing 2007: Bring your own parallel application
Parallel Computing 2007: Bring your own parallel applicationParallel Computing 2007: Bring your own parallel application
Parallel Computing 2007: Bring your own parallel application
 
Data structures
Data structuresData structures
Data structures
 
Basic terminologies & asymptotic notations
Basic terminologies & asymptotic notationsBasic terminologies & asymptotic notations
Basic terminologies & asymptotic notations
 
Building and deploying analytics
Building and deploying analyticsBuilding and deploying analytics
Building and deploying analytics
 
PT-4054, "OpenCL™ Accelerated Compute Libraries" by John Melonakos
PT-4054, "OpenCL™ Accelerated Compute Libraries" by John MelonakosPT-4054, "OpenCL™ Accelerated Compute Libraries" by John Melonakos
PT-4054, "OpenCL™ Accelerated Compute Libraries" by John Melonakos
 
Scaling PostgreSQL With GridSQL
Scaling PostgreSQL With GridSQLScaling PostgreSQL With GridSQL
Scaling PostgreSQL With GridSQL
 
Data structures and algorithms
Data structures and algorithmsData structures and algorithms
Data structures and algorithms
 
Tree representation in map reduce world
Tree representation  in map reduce worldTree representation  in map reduce world
Tree representation in map reduce world
 

More from NoSQLmatters

Mark Harwood - Building Entity Centric Indexes - NoSQL matters Dublin 2015
Mark Harwood - Building Entity Centric Indexes - NoSQL matters Dublin 2015Mark Harwood - Building Entity Centric Indexes - NoSQL matters Dublin 2015
Mark Harwood - Building Entity Centric Indexes - NoSQL matters Dublin 2015
NoSQLmatters
 
Akmal Chaudhri - How to Build Streaming Data Applications: Evaluating the Top...
Akmal Chaudhri - How to Build Streaming Data Applications: Evaluating the Top...Akmal Chaudhri - How to Build Streaming Data Applications: Evaluating the Top...
Akmal Chaudhri - How to Build Streaming Data Applications: Evaluating the Top...
NoSQLmatters
 
Chris Ward - Understanding databases for distributed docker applications - No...
Chris Ward - Understanding databases for distributed docker applications - No...Chris Ward - Understanding databases for distributed docker applications - No...
Chris Ward - Understanding databases for distributed docker applications - No...
NoSQLmatters
 

More from NoSQLmatters (20)

Nathan Ford- Divination of the Defects (Graph-Based Defect Prediction through...
Nathan Ford- Divination of the Defects (Graph-Based Defect Prediction through...Nathan Ford- Divination of the Defects (Graph-Based Defect Prediction through...
Nathan Ford- Divination of the Defects (Graph-Based Defect Prediction through...
 
Stefan Hochdörfer - The NoSQL Store everyone ignores: PostgreSQL - NoSQL matt...
Stefan Hochdörfer - The NoSQL Store everyone ignores: PostgreSQL - NoSQL matt...Stefan Hochdörfer - The NoSQL Store everyone ignores: PostgreSQL - NoSQL matt...
Stefan Hochdörfer - The NoSQL Store everyone ignores: PostgreSQL - NoSQL matt...
 
Adrian Colyer - Keynote: NoSQL matters - NoSQL matters Dublin 2015
Adrian Colyer - Keynote: NoSQL matters - NoSQL matters Dublin 2015Adrian Colyer - Keynote: NoSQL matters - NoSQL matters Dublin 2015
Adrian Colyer - Keynote: NoSQL matters - NoSQL matters Dublin 2015
 
Peter Bakas - Zero to Insights - Real time analytics with Kafka, C*, and Spar...
Peter Bakas - Zero to Insights - Real time analytics with Kafka, C*, and Spar...Peter Bakas - Zero to Insights - Real time analytics with Kafka, C*, and Spar...
Peter Bakas - Zero to Insights - Real time analytics with Kafka, C*, and Spar...
 
Dan Sullivan - Data Analytics and Text Mining with MongoDB - NoSQL matters Du...
Dan Sullivan - Data Analytics and Text Mining with MongoDB - NoSQL matters Du...Dan Sullivan - Data Analytics and Text Mining with MongoDB - NoSQL matters Du...
Dan Sullivan - Data Analytics and Text Mining with MongoDB - NoSQL matters Du...
 
Mark Harwood - Building Entity Centric Indexes - NoSQL matters Dublin 2015
Mark Harwood - Building Entity Centric Indexes - NoSQL matters Dublin 2015Mark Harwood - Building Entity Centric Indexes - NoSQL matters Dublin 2015
Mark Harwood - Building Entity Centric Indexes - NoSQL matters Dublin 2015
 
Prassnitha Sampath - Real Time Big Data Analytics with Kafka, Storm & HBase -...
Prassnitha Sampath - Real Time Big Data Analytics with Kafka, Storm & HBase -...Prassnitha Sampath - Real Time Big Data Analytics with Kafka, Storm & HBase -...
Prassnitha Sampath - Real Time Big Data Analytics with Kafka, Storm & HBase -...
 
Akmal Chaudhri - How to Build Streaming Data Applications: Evaluating the Top...
Akmal Chaudhri - How to Build Streaming Data Applications: Evaluating the Top...Akmal Chaudhri - How to Build Streaming Data Applications: Evaluating the Top...
Akmal Chaudhri - How to Build Streaming Data Applications: Evaluating the Top...
 
Michael Hackstein - NoSQL meets Microservices - NoSQL matters Dublin 2015
Michael Hackstein - NoSQL meets Microservices - NoSQL matters Dublin 2015Michael Hackstein - NoSQL meets Microservices - NoSQL matters Dublin 2015
Michael Hackstein - NoSQL meets Microservices - NoSQL matters Dublin 2015
 
Chris Ward - Understanding databases for distributed docker applications - No...
Chris Ward - Understanding databases for distributed docker applications - No...Chris Ward - Understanding databases for distributed docker applications - No...
Chris Ward - Understanding databases for distributed docker applications - No...
 
Philipp Krenn - Host your database in the cloud, they said... - NoSQL matters...
Philipp Krenn - Host your database in the cloud, they said... - NoSQL matters...Philipp Krenn - Host your database in the cloud, they said... - NoSQL matters...
Philipp Krenn - Host your database in the cloud, they said... - NoSQL matters...
 
Lucian Precup - Back to the Future: SQL 92 for Elasticsearch? - NoSQL matters...
Lucian Precup - Back to the Future: SQL 92 for Elasticsearch? - NoSQL matters...Lucian Precup - Back to the Future: SQL 92 for Elasticsearch? - NoSQL matters...
Lucian Precup - Back to the Future: SQL 92 for Elasticsearch? - NoSQL matters...
 
Bruno Guedes - Hadoop real time for dummies - NoSQL matters Paris 2015
Bruno Guedes - Hadoop real time for dummies - NoSQL matters Paris 2015Bruno Guedes - Hadoop real time for dummies - NoSQL matters Paris 2015
Bruno Guedes - Hadoop real time for dummies - NoSQL matters Paris 2015
 
DuyHai DOAN - Real time analytics with Cassandra and Spark - NoSQL matters Pa...
DuyHai DOAN - Real time analytics with Cassandra and Spark - NoSQL matters Pa...DuyHai DOAN - Real time analytics with Cassandra and Spark - NoSQL matters Pa...
DuyHai DOAN - Real time analytics with Cassandra and Spark - NoSQL matters Pa...
 
Benjamin Guinebertière - Microsoft Azure: Document DB and other noSQL databas...
Benjamin Guinebertière - Microsoft Azure: Document DB and other noSQL databas...Benjamin Guinebertière - Microsoft Azure: Document DB and other noSQL databas...
Benjamin Guinebertière - Microsoft Azure: Document DB and other noSQL databas...
 
David Pilato - Advance search for your legacy application - NoSQL matters Par...
David Pilato - Advance search for your legacy application - NoSQL matters Par...David Pilato - Advance search for your legacy application - NoSQL matters Par...
David Pilato - Advance search for your legacy application - NoSQL matters Par...
 
Tugdual Grall - From SQL to NoSQL in less than 40 min - NoSQL matters Paris 2015
Tugdual Grall - From SQL to NoSQL in less than 40 min - NoSQL matters Paris 2015Tugdual Grall - From SQL to NoSQL in less than 40 min - NoSQL matters Paris 2015
Tugdual Grall - From SQL to NoSQL in less than 40 min - NoSQL matters Paris 2015
 
Gregorry Letribot - Druid at Criteo - NoSQL matters 2015
Gregorry Letribot - Druid at Criteo - NoSQL matters 2015Gregorry Letribot - Druid at Criteo - NoSQL matters 2015
Gregorry Letribot - Druid at Criteo - NoSQL matters 2015
 
Michael Hackstein - Polyglot Persistence & Multi-Model NoSQL Databases - NoSQ...
Michael Hackstein - Polyglot Persistence & Multi-Model NoSQL Databases - NoSQ...Michael Hackstein - Polyglot Persistence & Multi-Model NoSQL Databases - NoSQ...
Michael Hackstein - Polyglot Persistence & Multi-Model NoSQL Databases - NoSQ...
 
Rob Harrop- Key Note The God, the Bad and the Ugly - NoSQL matters Paris 2015
Rob Harrop- Key Note The God, the Bad and the Ugly - NoSQL matters Paris 2015Rob Harrop- Key Note The God, the Bad and the Ugly - NoSQL matters Paris 2015
Rob Harrop- Key Note The God, the Bad and the Ugly - NoSQL matters Paris 2015
 

Recently uploaded

➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
amitlee9823
 
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
amitlee9823
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
AroojKhan71
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
MarinCaroMartnezBerg
 
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night StandCall Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
amitlee9823
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
ZurliaSoop
 
Probability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter LessonsProbability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter Lessons
JoseMangaJr1
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
amitlee9823
 
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night StandCall Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
amitlee9823
 
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
amitlee9823
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
amitlee9823
 
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
amitlee9823
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Riyadh +966572737505 get cytotec
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
amitlee9823
 

Recently uploaded (20)

5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
 
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
 
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
 
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
Predicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science ProjectPredicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science Project
 
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night StandCall Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
 

Uwe Friedrichsen – Extreme availability and self-healing data with CRDTs - NoSQL matters, Barcelona 2014

  • 1. Extreme availability and self-healing data with CRDTs Uwe Friedrichsen (codecentric AG) – NoSQL matters – Barcelona, 22. November 2014
  • 2. @ufried Uwe Friedrichsen | uwe.friedrichsen@codecentric.de | http://slideshare.net/ufried | http://ufried.tumblr.com
  • 3. Why NoSQL? • Scalability • Easier schema evolution • Availability on unreliable OTS hardware • It's more fun ...
  • 5. Challenges • Giving up ACID transactions • (Temporal) anomalies and inconsistencies: short-term due to replication or long-term due to partitioning. It might happen. Thus, it will happen!
  • 6. [Diagram] C: Consistency, A: Availability, P: Partition Tolerance. Strict Consistency: ACID / 2PC. Strong Consistency: Quorum R&W / Paxos. Eventual Consistency: CRDT / Gossip / Hinted Handoff
  • 7. Strict Consistency (CA) • Great programming model: no anomalies or inconsistencies need to be considered • Does not scale well: best for single-node databases. "We know ACID – It works!" "We know 2PC – It sucks!" Use for moderate data amounts
  • 8. And what if I need more data? • Distributed datastore • Partition tolerance is a must • Need to give up strict consistency (CP or AP)
  • 9. Strong Consistency (CP) • Majority-based consistency model: can tolerate up to N nodes failing out of 2N+1 nodes • Good programming model: single-copy consistency • Trades availability for consistency in case of partitioning. Techniques: Paxos (for sequential consistency), quorum-based reads & writes
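
The majority rule on slide 9 comes down to a little arithmetic. A minimal sketch, assuming a cluster of `n` replicas; the helper names `majority` and `tolerated_failures` are illustrative and do not come from the talk.

```python
def majority(n: int) -> int:
    """Smallest number of replicas that forms a majority of n."""
    return n // 2 + 1


def tolerated_failures(n: int) -> int:
    """Replicas that may fail while a majority can still be reached."""
    return n - majority(n)


# With 2N+1 replicas, up to N may fail (here N = 2, so 5 replicas).
print(tolerated_failures(5))  # 2

# Two majority quorums always overlap in at least one replica,
# which is what lets a read observe the latest acknowledged write.
print(majority(5) + majority(5) > 5)  # True
```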
  • 10. And what if I need more availability? • Need to give up strong consistency (CP) • Relax required consistency properties even more • Leads to eventual consistency (AP)
  • 11. Eventual Consistency (AP) • Gives up some consistency guarantees: no sequential consistency, anomalies become visible • Maximum availability possible: can tolerate up to N-1 nodes failing out of N nodes • Challenging programming model: anomalies usually need to be resolved explicitly. Techniques: Gossip / Hinted Handoffs, CRDT
  • 12. Conflict-free Replicated Data Types • Eventually consistent, self-stabilizing data structures • Designed for maximum availability • Tolerate up to N-1 out of N nodes failing. Two variants: state-based CRDT, the Convergent Replicated Data Type (CvRDT); operation-based CRDT, the Commutative Replicated Data Type (CmRDT)
  • 13. A bit of theory first ...
  • 14. Convergent Replicated Data Type: state-based CRDT (CvRDT) • All replicas (usually) connected • Exchange state between replicas, calculate new state on target replica • State transfer at least once over eventually-reliable channels • The set of possible states forms a semilattice • Partially ordered set of elements where every non-empty finite subset has a Least Upper Bound (LUB) • All state changes advance upwards with respect to the partial order
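
Using the LUB as the merge function works because a semilattice join is commutative, associative and idempotent, so state may arrive more than once and in any order. The sketch below checks these three properties on sample values; the two joins (integer `max` and set union) and the helper `is_semilattice_join` are illustrative assumptions, not part of the slides.

```python
from itertools import product


def join_int(a: int, b: int) -> int:
    """LUB of two integers under the usual ordering."""
    return max(a, b)


def join_set(a: frozenset, b: frozenset) -> frozenset:
    """LUB of two sets under the subset ordering."""
    return a | b


def is_semilattice_join(join, samples) -> bool:
    """Check commutativity, associativity and idempotence on sample states."""
    for a, b, c in product(samples, repeat=3):
        if join(a, b) != join(b, a):
            return False
        if join(join(a, b), c) != join(a, join(b, c)):
            return False
        if join(a, a) != a:
            return False
    return True


ints = [0, 1, 2, 5]
sets = [frozenset(), frozenset({"x"}), frozenset({"y"}), frozenset({"x", "y"})]

print(is_semilattice_join(join_int, ints))  # True
print(is_semilattice_join(join_set, sets))  # True
# Addition is not idempotent, so it cannot serve as a state merge.
print(is_semilattice_join(lambda a, b: a + b, ints))  # False
```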
  • 15. Commutative Replicated Data Type: operation-based CRDT (CmRDT) • All replicas (usually) connected • Exchange update operations between replicas, apply on target replica • Reliable broadcast with ordering guarantee for non-concurrent updates • Concurrent updates must be commutative
  • 16. That's been enough theory ...
  • 18. Op-based Counter. Data: Integer i. Init: i ≔ 0. Query: return i. Operations: increment(): i ≔ i + 1; decrement(): i ≔ i - 1
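
The op-based counter converges because increment and decrement commute. A minimal sketch, assuming every operation is delivered to every replica exactly once (duplicates would be counted twice); the class name `OpCounter` and the string encoding of operations are illustrative.

```python
from itertools import permutations


class OpCounter:
    """Op-based counter: replicas exchange operations, not state."""

    def __init__(self) -> None:
        self.i = 0

    def apply(self, op: str) -> None:
        # Both operations commute, so the delivery order does not matter.
        if op == "increment":
            self.i += 1
        elif op == "decrement":
            self.i -= 1
        else:
            raise ValueError(f"unknown operation: {op}")

    def query(self) -> int:
        return self.i


ops = ["increment", "increment", "decrement"]
results = set()
for order in permutations(ops):
    replica = OpCounter()
    for op in order:
        replica.apply(op)
    results.add(replica.query())

print(results)  # {1}: every delivery order yields the same value
```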
  • 19. State-based G-Counter (grow only) (Naïve approach). Data: Integer i. Init: i ≔ 0. Query: return i. Update: increment(): i ≔ i + 1. Merge(j): i ≔ max(i, j)
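  • 20. State-based G-Counter (grow only) (Naïve approach) [Diagram: replicas R1, R2, R3 each initialize (I) with i = 0; two replicas each apply an update (U) and reach i = 1; after the merges (M) the state is still i = 1, so one of the two increments is lost]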
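
The lost update of the naïve approach is easy to reproduce. A minimal sketch of the single-integer counter from slide 19; the class name `NaiveGCounter` and the choice of which replicas increment are assumptions made for illustration.

```python
class NaiveGCounter:
    """Naive state-based grow-only counter holding a single integer."""

    def __init__(self) -> None:
        self.i = 0

    def increment(self) -> None:
        self.i += 1

    def query(self) -> int:
        return self.i

    def merge(self, other: "NaiveGCounter") -> None:
        # max() is a valid LUB, but it cannot tell apart increments
        # that were made on different replicas.
        self.i = max(self.i, other.i)


r1, r2, r3 = NaiveGCounter(), NaiveGCounter(), NaiveGCounter()
r1.increment()  # first increment, on r1
r3.increment()  # second, concurrent increment, on r3
r2.merge(r1)
r2.merge(r3)

print(r2.query())  # 1, although two increments were issued
```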
  • 21. State-based G-Counter (grow only) (Vector-based approach). Data: Integer V[] (one element per replica set). Init: V ≔ [0, 0, ... , 0]. Query: return Σi V[i]. Update: increment(): V[i] ≔ V[i] + 1 (i is the replica set number). Merge(V'): ∀i ∈ [0, n-1] : V[i] ≔ max(V[i], V'[i])
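  • 22. State-based G-Counter (grow only) (Vector-based approach) [Diagram: replicas R1, R2, R3 each initialize (I) with V = [0, 0, 0]; an update (U) on R1 gives V = [1, 0, 0]; an update (U) on R3 gives V = [0, 0, 1]; successive merges (M) yield V = [1, 0, 0] and then V = [1, 0, 1], so both increments are preserved]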
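
The vector-based G-Counter from slides 21 and 22 can be sketched as follows. This is an illustrative implementation under the assumption of a fixed, known number of replicas `n`; the class and attribute names are not from the talk.

```python
class GCounter:
    """State-based grow-only counter with one slot per replica."""

    def __init__(self, replica_id: int, n: int) -> None:
        self.replica_id = replica_id
        self.v = [0] * n

    def increment(self) -> None:
        # A replica only ever writes to its own slot.
        self.v[self.replica_id] += 1

    def query(self) -> int:
        return sum(self.v)

    def merge(self, other: "GCounter") -> None:
        # Element-wise max is the LUB of the two vectors.
        self.v = [max(a, b) for a, b in zip(self.v, other.v)]


r1, r2, r3 = GCounter(0, 3), GCounter(1, 3), GCounter(2, 3)
r1.increment()  # V = [1, 0, 0] on r1
r3.increment()  # V = [0, 0, 1] on r3
r2.merge(r1)    # V = [1, 0, 0] on r2
r2.merge(r3)    # V = [1, 0, 1] on r2
r2.merge(r3)    # merging the same state again changes nothing

print(r2.v, r2.query())  # [1, 0, 1] 2
```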
  • 23. State-based PN-Counter (pos./neg.) • Simple vector approach as with G-Counter does not work: it violates the monotonicity requirement of the semilattice • Need to use two vectors: vector P to track increments, vector N to track decrements • Query result is Σi (P[i] – N[i])
  • 24. State-based PN-Counter (pos./neg.). Data: Integer P[], N[] (one element per replica set). Init: P ≔ [0, 0, ... , 0], N ≔ [0, 0, ... , 0]. Query: return Σi (P[i] – N[i]). Update: increment(): P[i] ≔ P[i] + 1; decrement(): N[i] ≔ N[i] + 1 (i is the replica set number). Merge(P', N'): ∀i ∈ [0, n-1] : P[i] ≔ max(P[i], P'[i]) and N[i] ≔ max(N[i], N'[i])
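
A PN-Counter is effectively two G-Counters, one for increments and one for decrements. A minimal sketch, again assuming a fixed number of replicas; the names `PNCounter`, `p` and `n` mirror the slide's P and N but are otherwise illustrative.

```python
class PNCounter:
    """State-based counter supporting increments and decrements."""

    def __init__(self, replica_id: int, size: int) -> None:
        self.replica_id = replica_id
        self.p = [0] * size  # increments per replica
        self.n = [0] * size  # decrements per replica

    def increment(self) -> None:
        self.p[self.replica_id] += 1

    def decrement(self) -> None:
        # Decrements also only grow a vector, which keeps the state monotonic.
        self.n[self.replica_id] += 1

    def query(self) -> int:
        return sum(self.p) - sum(self.n)

    def merge(self, other: "PNCounter") -> None:
        self.p = [max(a, b) for a, b in zip(self.p, other.p)]
        self.n = [max(a, b) for a, b in zip(self.n, other.n)]


a, b = PNCounter(0, 2), PNCounter(1, 2)
a.increment()
a.increment()
b.decrement()
a.merge(b)
b.merge(a)

print(a.query(), b.query())  # 1 1
```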
  • 25. Non-negative Counter. Problem: How to check a global invariant with local information only? • Approach 1: Only dec if local state is > 0 (concurrent decs could still lead to a negative value) • Approach 2: Externalize negative values as 0 (inc(negative value) == noop(), violates counter semantics) • Approach 3: Local invariant, only allow dec if P[i] - N[i] > 0 (works, but may be too strong a limitation) • Approach 4: Synchronize (works, but violates assumptions and prerequisites of CRDTs)
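
The difference between approach 1 and approach 3 shows up with two concurrent decrements. The sketch below is an illustrative PN-Counter variant with a guarded decrement; the class `GuardedCounter`, the `local_invariant` flag and the `run` scenario are assumptions made for this example.

```python
class GuardedCounter:
    """PN-Counter variant whose decrement is guarded by a precondition."""

    def __init__(self, replica_id: int, size: int, local_invariant: bool) -> None:
        self.replica_id = replica_id
        self.p = [0] * size
        self.n = [0] * size
        self.local_invariant = local_invariant

    def query(self) -> int:
        return sum(self.p) - sum(self.n)

    def increment(self) -> None:
        self.p[self.replica_id] += 1

    def decrement(self) -> bool:
        i = self.replica_id
        if self.local_invariant:
            allowed = self.p[i] - self.n[i] > 0  # approach 3
        else:
            allowed = self.query() > 0           # approach 1
        if allowed:
            self.n[i] += 1
        return allowed

    def merge(self, other: "GuardedCounter") -> None:
        self.p = [max(x, y) for x, y in zip(self.p, other.p)]
        self.n = [max(x, y) for x, y in zip(self.n, other.n)]


def run(local_invariant: bool) -> int:
    a = GuardedCounter(0, 2, local_invariant)
    b = GuardedCounter(1, 2, local_invariant)
    a.increment()
    b.merge(a)     # both replicas now see the value 1
    a.decrement()  # concurrent decrements on both replicas
    b.decrement()
    a.merge(b)
    return a.query()


print(run(local_invariant=False))  # -1: approach 1 goes negative
print(run(local_invariant=True))   # 0: approach 3 rejects the decrement on b
```

Approach 3 keeps the invariant here only because replica b may not spend what replica a added, which is the "too strong a limitation" the slide mentions.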
  • 26. Sets
  • 27. Op-based Set (Naïve approach). Data: Set S. Init: S ≔ {}. Query(e): return e ∈ S. Operations: add(e): S ≔ S ∪ {e}; remove(e): S ≔ S \ {e}
  • 28. Op-based Set (Naïve approach) [Diagram: replicas R1, R2, R3 each initialize (I) with S = {}; add(e) and rmv(e) operations are issued and delivered to the replicas in different orders; the replicas end in S = {e}, S = {e} and S = {}, so they do not converge]
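
The naïve op-based set diverges because add and remove of the same element do not commute. The sketch below does not reproduce the slide's exact message flow; it only replays the same three operations in two different delivery orders. The helper `apply_ops` and the tuple encoding are illustrative.

```python
def apply_ops(ops):
    """Apply (kind, element) operations to an initially empty set."""
    s = set()
    for kind, e in ops:
        if kind == "add":
            s.add(e)
        elif kind == "rmv":
            s.discard(e)
        else:
            raise ValueError(f"unknown operation: {kind}")
    return s


# Two concurrent add(e) operations and one rmv(e), delivered in different orders.
replica_a = apply_ops([("add", "e"), ("add", "e"), ("rmv", "e")])
replica_b = apply_ops([("add", "e"), ("rmv", "e"), ("add", "e")])

print(replica_a)  # set()
print(replica_b)  # {'e'}
```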
  • 29. State-based G-Set (grow only). Data: Set S. Init: S ≔ {}. Query(e): return e ∈ S. Update: add(e): S ≔ S ∪ {e}. Merge(S'): S ≔ S ∪ S'
  • 30. State-based 2P-Set (two-phase). Data: Set A, R (A: added, R: removed). Init: A ≔ {}, R ≔ {}. Query(e): return e ∈ A ∧ e ∉ R. Update: add(e): A ≔ A ∪ {e}; remove(e): (pre query(e)) R ≔ R ∪ {e}. Merge(A', R'): A ≔ A ∪ A', R ≔ R ∪ R'
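
The 2P-Set combines two grow-only sets. A minimal sketch of the specification on slide 30; the class name `TwoPSet` and the choice to raise `KeyError` when the remove precondition fails are assumptions.

```python
class TwoPSet:
    """State-based two-phase set: an element can be added, then removed, once."""

    def __init__(self) -> None:
        self.added = set()    # A on the slide
        self.removed = set()  # R on the slide

    def query(self, e) -> bool:
        return e in self.added and e not in self.removed

    def add(self, e) -> None:
        self.added.add(e)

    def remove(self, e) -> None:
        # Precondition from the slide: the element must currently be present.
        if not self.query(e):
            raise KeyError(e)
        self.removed.add(e)

    def merge(self, other: "TwoPSet") -> None:
        self.added |= other.added
        self.removed |= other.removed


a, b = TwoPSet(), TwoPSet()
a.add("e")
b.merge(a)
b.remove("e")
a.merge(b)
print(a.query("e"))  # False

# Once removed, an element cannot come back: R only grows.
a.add("e")
print(a.query("e"))  # False
```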
  • 31. Op-based OR-Set (observed-remove). Data: Set S (set of pairs { (element e, unique tag u), ... }). Init: S ≔ {}. Query(e): return ∃u : (e, u) ∈ S. Operations: add(e): S ≔ S ∪ { (e, u) } (u is a generated unique tag); remove(e): pre query(e); at source ("prepare"): R ≔ { (e, u) | ∃u : (e, u) ∈ S }; downstream ("execute"): S ≔ S \ R
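  • 32. Op-based OR-Set (observed-remove) [Diagram: replicas R1, R2, R3 each initialize (I) with S = {}; one add(e) creates the tagged element ea and a concurrent add(e) creates eb; rmv(e) removes only the observed ea; after add(ea), add(eb) and rmv(ea) have been delivered everywhere, all replicas converge to S = {eb}]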
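
The prepare/execute split of the OR-Set can be sketched as below. This is an illustrative implementation, assuming that a remove is delivered to a replica only after the adds it observed; the class `ORSet`, the tuple encoding of operations and the use of `uuid4` for tags are assumptions, not the talk's code.

```python
import uuid


class ORSet:
    """Op-based observed-remove set holding (element, unique tag) pairs."""

    def __init__(self) -> None:
        self.pairs = set()

    def query(self, e) -> bool:
        return any(elem == e for elem, _ in self.pairs)

    def prepare_add(self, e):
        # At source: generate a unique tag for this particular add.
        return ("add", e, uuid.uuid4().hex)

    def prepare_remove(self, e):
        # At source: collect only the pairs observed locally.
        if not self.query(e):
            raise KeyError(e)
        observed = frozenset(p for p in self.pairs if p[0] == e)
        return ("remove", observed)

    def execute(self, op) -> None:
        # Downstream: apply the prepared operation on any replica.
        if op[0] == "add":
            self.pairs.add((op[1], op[2]))
        else:
            self.pairs -= op[1]


r1, r2, r3 = ORSet(), ORSet(), ORSet()
add_a = r1.prepare_add("e")
r1.execute(add_a)
add_b = r2.prepare_add("e")     # concurrent add on another replica
r2.execute(add_b)
rmv = r1.prepare_remove("e")    # r1 has only observed add_a
r1.execute(rmv)

# Deliver the remaining operations in different orders.
r1.execute(add_b)
r2.execute(add_a)
r2.execute(rmv)
r3.execute(add_b)
r3.execute(add_a)
r3.execute(rmv)

print(r1.pairs == r2.pairs == r3.pairs)              # True
print(r1.query("e"), r2.query("e"), r3.query("e"))  # True True True
```

The concurrent add survives the remove because the remove only names the tags it observed at its source.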
  • 33. More datatypes • Register • Dictionary (Map) • Tree • Graph • Array • List. Plus more representations for each datatype
  • 34. Garbage collection • Sets could grow infinitely in worst case • Garbage collection possible, but a bit tricky • Out of scope for this session • Can induce surprising behavior sometimes • Sometimes stronger consensus is needed • Paxos, …
  • 35. Limitations of CRDTs • Very weak consistency guarantees: strives for "quiescent consistency" • Eventually consistent: not suitable for a high-volume ID generator or the like • Not easy to understand and model • Not all data structures representable. Use if availability is extremely important
  • 36. Further reading 1. Shapiro et al., Conflict-free Replicated Data Types, Inria Research report, 2011 2. Shapiro et al., A comprehensive study of Convergent and Commutative Replicated Data Types, Inria Research report, 2011 3. Basho Tag Archives: CRDT, https://basho.com/tag/crdt/ 4. Leslie Lamport, Time, clocks, and the ordering of events in a distributed system, Communications of the ACM (1978)
  • 37. Wrap-up • CAP requires rethinking consistency • Strict Consistency: ACID / 2PC • Strong Consistency: quorum-based R&W, Paxos • Eventual Consistency: CRDT, Gossip, Hinted Handoffs. Pick your consistency model based on your consistency and availability requirements
  • 38. The real world is not ACID. Thus, it is perfectly fine to go for a relaxed consistency model
  • 39. @ufried Uwe Friedrichsen | uwe.friedrichsen@codecentric.de | http://slideshare.net/ufried | http://ufried.tumblr.com