EVENTUAL CONSISTENCY
DESIGNING FAILPROOF SYSTEMS
Grzegorz Skorupa
Software Architect
Illustration: Getty Images
PROBLEMS EVERYWHERE …
• System fails roughly once every two days for no visible reason …
• System failed because a developer expected each post to have an author, but the author was not in the DB
• Problem occurs for Max … but not for any other user …
• One can find any article except for the one about …
Data consistency?
AGENDA
• Problem
• CAP Theorem
• Eventual Consistency
• Building an Eventually Consistent App
  - At Least Once
  - Source of Truth
  - Restoring Consistency
IT SYSTEMS TODAY
Business requirements:
• Highly Available
• Serve a large number of users
• Do complex tasks …
• … on large data sets
• Provide correct data
• Provide up-to-date data
Technical challenges:
• Scalable
• Distributed
• No Single Point of Failure
• Big Data
• Data consistency
FRIENDS EXAMPLE
[Figure: two panels. Alice's panel: "Friends of Alice" and "Friend invitations for Alice" (No invitations for Alice). Bob's panel: "Friends of Bob" and "Friend invitations for Bob" (with a ✓ to accept).]
Relations: Alice --friend--> ???, Bob --friend--> ???, ??? --invite--> Bob, ??? --invite--> Alice
SOME CONSTRAINTS:
• We do not want to see two friend invitations from the same person
• We do not want to be friends twice
• If we are friends, both of us should see the other person in the list of friends
FRIENDSHIP RELATIONSHIP
Alice invites Bob:
• add Alice --invite--> Bob
Bob accepts invitation:
• add Alice --friend--> Bob
• add Bob --friend--> Alice
• remove Alice --invite--> Bob
THE NIGHTMARE – INCONSISTENCIES IN DATA
• Alice is friends with Bob, but Bob is not friends with Alice
• Alice is friends with Bob, but Bob still sees the friend invitation from Alice
WELL, WE HAVE THE ACID APPROACH
Begin transaction
    add Alice --friend--> Bob
    remove Alice --invite--> Bob
    add Bob --friend--> Alice
Commit transaction
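A minimal sketch of the ACID approach, using Python's built-in sqlite3 module. The table and column names (friends, invitations, sender, recipient) and the fail_midway switch are assumptions made for illustration; they do not come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE friends (f1 TEXT, f2 TEXT, PRIMARY KEY (f1, f2));
    CREATE TABLE invitations (sender TEXT, recipient TEXT,
                              PRIMARY KEY (sender, recipient));
    INSERT INTO invitations VALUES ('Alice', 'Bob');
""")
conn.commit()


def accept_invitation(conn, sender, recipient, fail_midway=False):
    # "with conn" commits on success and rolls back if an exception escapes,
    # so either all three changes become visible or none of them does.
    with conn:
        conn.execute("INSERT INTO friends VALUES (?, ?)", (sender, recipient))
        if fail_midway:
            raise RuntimeError("simulated crash")  # hypothetical failure point
        conn.execute("INSERT INTO friends VALUES (?, ?)", (recipient, sender))
        conn.execute(
            "DELETE FROM invitations WHERE sender = ? AND recipient = ?",
            (sender, recipient),
        )
```

With a single database this is all that is needed. The rest of the deck is about what to do when the three changes cannot share one transaction.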
NO ACID? WHAT TO DO?
A)
• File upload:
1. Save file
2. Store file path in DB
• Deleting file:
1. Remove from DB
2. Remove file
VS
B)
• File upload:
1. Store file path in DB
2. Save file
• Deleting file:
1. Remove file
2. Remove from DB
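The two orderings above can be compared with a small sketch. It implements ordering A and assumes that A is the preferable one: a crash between the two steps then leaves at worst an orphaned file, never a DB record pointing at a missing file. The db dict, the fail_after_first_step switch and the dangling_records helper are hypothetical stand-ins.

```python
import os
import tempfile

db = {}  # hypothetical stand-in for the database: file id -> path


def upload(file_id, data, directory, fail_after_first_step=False):
    path = os.path.join(directory, file_id)
    with open(path, "wb") as fh:          # 1. save file
        fh.write(data)
    if fail_after_first_step:
        raise RuntimeError("simulated crash")
    db[file_id] = path                    # 2. store file path in DB


def delete(file_id, fail_after_first_step=False):
    path = db.pop(file_id)                # 1. remove from DB
    if fail_after_first_step:
        raise RuntimeError("simulated crash")
    os.remove(path)                       # 2. remove file


def dangling_records():
    # DB records whose file is missing: the inconsistency ordering A avoids.
    return [fid for fid, path in db.items() if not os.path.exists(path)]
```

Orphaned files still cost disk space, so this ordering only makes the inconsistency harmless; cleaning it up is a separate job.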
CAP THEOREM
Consistency
Availability
Partition Tolerance
We can have only 2 out of 3
Gilbert S., Lynch N.: Brewer’s Conjecture and the Feasibility of Consistent, Available, Partition-tolerant Web Services, SIGACT News v. 33, n. 2, 2002
TWO PHASE COMMIT
Coordinator                              Cohort
      --- QUERY TO COMMIT --->
      <--- VOTE YES/NO ---               Prepare / Abort
Commit / Abort
      --- COMMIT/ROLLBACK --->
      <--- ACKNOWLEDGEMENT ---           Commit / Abort
End
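The message flow above can be sketched in a few lines. This is an in-memory illustration only: the Cohort class, its can_commit flag and the two_phase_commit function are hypothetical names, and the sketch deliberately leaves out timeouts and coordinator crashes.

```python
class Cohort:
    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "initial"

    def prepare(self):
        # Phase 1: vote YES only if the local work can be committed.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


def two_phase_commit(cohorts):
    # Phase 1: query to commit and collect the votes.
    votes = [cohort.prepare() for cohort in cohorts]
    decision = all(votes)
    # Phase 2: broadcast the same decision to every cohort.
    for cohort in cohorts:
        if decision:
            cohort.commit()
        else:
            cohort.rollback()
    return decision
```

In a real deployment every call here is a network round trip, and a prepared cohort has to wait for the coordinator's decision before it can release its locks.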
CAP AND TWO PHASE COMMIT
When you do a Two Phase Commit:
• You sacrifice Availability: participants block while a node is down
• Scalability and performance suffer
• It has a Single Point of Failure (the coordinator)
• It does not guarantee Consistency
http://blog.thislongrun.com/2015/04/the-unclear-cp-vs-ca-case-in-cap.html
SO … OUR HIGH TRAFFIC SYSTEM CAN’T BE CONSISTENT
I am done, we have no consistency anyway
But what will happen WHEN 1 succeeds and 2 fails?
Operations: add Alice --friend--> Bob, remove Alice --invite--> Bob, add Bob --friend--> Alice
CAP: AVAILABILITY AND CONSISTENCY
Availability:
Every request received by a non-failing node must result in a (successful) response
Consistency:
There exists a total order of all operations such that each operation looks as if it were completed at a single instant
SET V = old value
SET V = new value
READ V → old value (from node 1)
READ V → new value (from node 2)
UNDERSTANDING CAP THEOREM
[Figure: two Application nodes, each accepting writes and reads. Labels: (1) Write, (2) Synchronize data, (3) Reads.]
BAD CAP, BAD MATH (1)
[Figure: an Application (Nginx, PHP) in front of a Database, handling writes and reads.]
WHAT IS A NODE?
[Figure: architecture diagram with writes and reads flowing between a Load Balancer, two Nginx (PHP) servers, MySQL (Master), MySQL (Replica), Redis, a GoLang service and the Facebook API.]
BAD CAP, BAD MATH (2)
[Figure, shown in three builds: one Primary receiving writes and four Secondaries serving reads; in the last build two elements are marked x.]
PARTITION VS. FAILURE
[Figure: four Application nodes serving Reads/Writes, with an x marking the break.]
WHAT ARE OUR HIGH TRAFFIC SYSTEMS?
• They are consistent most of the time
• They tolerate partitioning to some extent
• They are available most of the time
From a mathematical standpoint
they are neither CA, nor CP, nor AP
DESIGNING SYSTEMS – RELATION TO CAP
Available & Partition Tolerant:
POST /news
GET /news
Consistent & Partition Tolerant:
POST /friends/requests
GET /friends/requests
BUT … WE NEED SOME SORT OF CONSISTENCY:
EVENTUAL CONSISTENCY
The system guarantees that
if no new updates are made to the object,
eventually all accesses will return the last updated value.
Vogels, Werner. "Eventually consistent." Queue 6.6 (2008): 14-19.
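A toy model of that guarantee, as a sketch only. The Replica and Cluster classes, the version counter and last-writer-wins as the merge rule are assumptions made for illustration; the slides do not prescribe a replication scheme.

```python
class Replica:
    def __init__(self):
        self.value = None
        self.version = 0

    def read(self):
        return self.value

    def apply(self, value, version):
        # Last-writer-wins: keep only the update with the highest version.
        if version > self.version:
            self.value, self.version = value, version


class Cluster:
    def __init__(self, n):
        self.replicas = [Replica() for _ in range(n)]
        self.pending = []  # (replica index, value, version) not yet delivered
        self.clock = 0

    def write(self, value):
        self.clock += 1
        self.replicas[0].apply(value, self.clock)  # acknowledged right away
        for i in range(1, len(self.replicas)):
            self.pending.append((i, value, self.clock))

    def deliver_all(self):
        # Updates may arrive in any order; here they arrive newest first.
        while self.pending:
            i, value, version = self.pending.pop()
            self.replicas[i].apply(value, version)
```

Between write and deliver_all, different replicas can return different values. Once updates stop and everything is delivered, every read returns the last written value.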
USEFUL APPROACHES
• Eventually consistent instead of Consistent
• At least once + Idempotency instead of Exactly once
• Source of truth instead of Absolute truth
• Controlled inconsistency – A can live without B but B can’t live without A
• Restore Consistency procedures
• Minimize probability of inconsistent data
WHAT ORDER TO APPLY?
• add Alice --friend--> Bob
• remove Alice --invite--> Bob
• add Bob --friend--> Alice
FAILURE SCENARIOS – REMOVE INVITATION FIRST (1)
Order: remove Alice --invite--> Bob, then add Alice --friend--> Bob, then add Bob --friend--> Alice
If the system fails after the first step:
• Friend request will be gone
• Alice will not be friends with Bob
If the system fails after the second step:
• Alice will see Bob in her friends list
• Bob will not see Alice in his friends list
FAILURE SCENARIOS – BOB FIRST (2)
Order: add Bob --friend--> Alice, then add Alice --friend--> Bob, then remove Alice --invite--> Bob
If the system fails after the first step:
• Bob will see Alice in his friends list
• Alice will not see Bob in her friends list
• Bob will still see the friend request from Alice
If the system fails after the second step:
• Alice and Bob will see each other in their friends lists
• Bob will still see the friend request from Alice
FAILURE SCENARIOS – ALICE FIRST (3)
Order: add Alice --friend--> Bob, then add Bob --friend--> Alice, then remove Alice --invite--> Bob
If the system fails after the first step:
• Alice will see Bob in her friends list
• Bob will not see Alice in his friends list
• Bob will still see the friend request from Alice
If the system fails after the second step:
• Alice and Bob will see each other in their friends lists
• Bob will still see the friend request from Alice
1ST STEP: CHOOSE THE BEST ORDER
BOB FIRST (add Bob --friend--> Alice, add Alice --friend--> Bob, remove Alice --invite--> Bob):
Failure after the first step:
• Bob will see Alice in his friends list
• Alice will not see Bob in her friends list
• Bob will still see the friend request from Alice
Failure after the second step:
• Alice and Bob will see each other in their friends lists
• Bob will still see the friend request from Alice
ALICE FIRST (add Alice --friend--> Bob, add Bob --friend--> Alice, remove Alice --invite--> Bob):
Failure after the first step:
• Alice will see Bob in her friends list
• Bob will not see Alice in his friends list
• Bob will still see the friend request from Alice
Failure after the second step:
• Alice and Bob will see each other in their friends lists
• Bob will still see the friend request from Alice
TWO GENERALS PROBLEM
User: 1. POST /posts "my new post"
User: 2. Wait for response
Server: 3. Create the post
Server: 4. Send success response (X: the response never arrives)
User: 5. Where is my response?
User: 6. Should I resend?
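One common way to make the resend safe is a client-generated request id that the server uses to recognise retries. This technique is an assumption added here, not something the slide states; PostServer, create_post and the response shape are hypothetical.

```python
import uuid


class PostServer:
    def __init__(self):
        self.posts = []
        self.responses = {}  # request id -> response already produced

    def create_post(self, request_id, text):
        # A retry with a known id gets the stored response, not a second post.
        if request_id in self.responses:
            return self.responses[request_id]
        self.posts.append(text)
        response = {"status": 201, "post_id": len(self.posts)}
        self.responses[request_id] = response
        return response


server = PostServer()
request_id = str(uuid.uuid4())  # generated once by the client, reused on retry
first = server.create_post(request_id, "my new post")  # response gets lost
retry = server.create_post(request_id, "my new post")  # client resends
```

The user can now resend as often as needed: at least once delivery plus an idempotent server behaves like exactly once from the user's point of view.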
2ND STEP: MAKE THE ACTION IDEMPOTENT
IF (EXISTS(invitation {from: Alice, to: Bob})) {
    UPSERT {f1: Alice, f2: Bob} // NOT INSERT!!
    UPSERT {f1: Bob, f2: Alice} // NOT INSERT!!
    DELETE invitation {from: Alice, to: Bob}
}
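The pseudocode above can be tried out with plain Python sets, where adding an existing element plays the role of UPSERT. The crash_after parameter and the Crash exception are hypothetical test hooks for simulating a failure between steps.

```python
friends = set()                    # directed pairs (f1, f2)
invitations = {("Alice", "Bob")}   # pairs (sender, recipient)


class Crash(Exception):
    pass


def accept_invitation(sender, recipient, crash_after=None):
    if (sender, recipient) not in invitations:
        return False
    steps = [
        lambda: friends.add((sender, recipient)),   # upsert: repeat is harmless
        lambda: friends.add((recipient, sender)),   # upsert
        lambda: invitations.discard((sender, recipient)),
    ]
    for number, step in enumerate(steps, start=1):
        step()
        if crash_after == number:
            raise Crash(f"simulated failure after step {number}")
    return True
```

Because the invitation is removed last, a failed attempt leaves it in place, and simply calling accept_invitation again finishes the job.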
3RD STEP: MAKE SURE YOUR SYSTEM CAN WORK WITH INCONSISTENT DATA
What if Alice --friend--> Bob is there
but Bob --friend--> Alice is not?
What if Alice --friend--> Bob is there
but the invitation Alice --invite--> Bob is also there?
4TH STEP: DON’T LET OTHERS BREAK THE SYSTEM
Write a contract:
• Alice and Bob mentioned in the invitation MUST have respective user accounts in the system
• The friend relation SHOULD always be both ways
• When there is a friend relation there SHOULD be no invitation
Should – respects eventual consistency
Must – is always consistent
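A contract like this can be turned into an executable check. The sketch below assumes the data is available as plain sets (users, directed friend pairs, invitation pairs); check_contract and its message texts are hypothetical.

```python
def check_contract(users, friends, invitations):
    """Return (must_violations, should_violations) as lists of messages."""
    must, should = [], []
    for sender, recipient in invitations:
        # MUST: both sides of an invitation have user accounts.
        for name in (sender, recipient):
            if name not in users:
                must.append(f"invitation refers to unknown user {name}")
        # SHOULD: no invitation remains once the two are friends.
        if (sender, recipient) in friends or (recipient, sender) in friends:
            should.append(
                f"{sender} and {recipient} are friends but an invitation remains"
            )
    for f1, f2 in friends:
        # SHOULD: the friend relation goes both ways.
        if (f2, f1) not in friends:
            should.append(f"{f1} lists {f2} as friend but not the other way round")
    return must, should
```

A MUST violation is a bug that needs attention right away. A SHOULD violation is expected to disappear on its own and only becomes a problem if it persists.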
TEST AGAINST CONTRACT
• FIT https://medium.com/netflix-techblog/fit-failure-injection-testing-35d8e2a9bb2
• Chaos Monkey: https://github.com/Netflix/chaosmonkey
// Code
function acceptInvitation($from, $to) {
    $invite = $this->invites->find($from, $to);
    if ($invite) {
        // befriend() must behave like an upsert so that a retry is harmless
        $this->friends->befriend($from, $to);
        $this->friends->befriend($to, $from);
        $this->invites->remove($invite);
    }
}
// Test
function testIdempotency() {
    $from = 'Alice';
    $to = 'Bob';
    // Simulate an earlier attempt that failed after the first befriend()
    $this->invites->create($from, $to);
    $this->friends->befriend($from, $to);
    $response = $this->controller->acceptInvitation($from, $to);
    $this->assertEquals(200, $response->code());
    $this->assertTrue($this->friends->isFriend($from, $to));
    $this->assertTrue($this->friends->isFriend($to, $from));
    $this->assertNull($this->invites->find($from, $to));
}
YOU HAVE TO THINK ABOUT THE WHOLE FUNCTIONALITY
What about deleting a friend?
What if Alice could see her pending invitations?
What if Alice could cancel the invitation?
What if both send an invitation to each other?
And finally: what about race conditions?
EVENT SOURCING CQRS
Change log:
• Alice invited Bob
• Bob declined invitation from Alice
• Alice cancelled invitation to Bob
• Bob invited Alice
• Alice accepted invitation from Bob
Result: Alice and Bob are friends
Write model
Read model
CHANGE LOG
1. Add pending operation to change log
2. Handle operation (create friend associations, remove invitation)
3. Commit operation in change log
Bob accepts invitation from Alice PENDING
Bob accepts invitation from Alice SUCCESS
FAILURE IN THE MIDDLE OF OPERATION
REPLAY …
1. Read pending operations from change log
2. Handle operation (create friend associations, remove invitation), which must be idempotent!
3. Commit operation in change log
Bob accepts invitation from Alice PENDING
Bob accepts invitation from Alice SUCCESS
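The replay procedure can be sketched with an in-memory change log. Everything here is a stand-in: the list-based log, the entry layout, the crash_before_commit switch and the single supported operation are assumptions for illustration.

```python
friends, invitations = set(), {("Alice", "Bob")}
change_log = []  # entries: {"op": ..., "args": ..., "status": ...}


def handle(entry):
    # Only one operation type is modelled: accepting an invitation.
    sender, recipient = entry["args"]
    friends.add((sender, recipient))       # idempotent
    friends.add((recipient, sender))       # idempotent
    invitations.discard((sender, recipient))


def submit(op, args, crash_before_commit=False):
    entry = {"op": op, "args": args, "status": "PENDING"}
    change_log.append(entry)               # 1. add pending operation
    handle(entry)                          # 2. handle operation
    if crash_before_commit:
        return entry                       # simulated crash: stays PENDING
    entry["status"] = "SUCCESS"            # 3. commit operation
    return entry


def replay():
    for entry in change_log:
        if entry["status"] == "PENDING":
            handle(entry)                  # safe only because handle is idempotent
            entry["status"] = "SUCCESS"
```

Replay cannot tell how far a pending operation got before the failure, so it simply runs the handler again; that is why idempotency is required.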
PARTITION TOLERANCE
MERGING OF CHANGE LOGS
Alice invites Bob SUCCESS
Node 1: Bob accepts invitation from Alice PENDING
Node 2: Alice deletes friend invitation sent to Bob PENDING
Merge / Conflict solving:
Bob accepts invitation from Alice SUCCESS
Alice deletes friend invitation sent to Bob DECLINED
CHEAPER SOLUTION
TRAFFIC REDIRECTION
Alice invites Bob SUCCESS
Node 1: Bob accepts invitation from Alice PENDING, then SUCCESS
Node 2: Alice deletes friend invitation sent to Bob PENDING (x)
READING STALE DATA
[Figure: "Bob accepts invitation from Alice" is written to the Master, which replicates asynchronously (ASYNC) to two Replicas. One Replica answers "Alice and Bob are friends", the other "Alice and Bob are NOT friends".]
Synchronous replication is slow
FAST STORAGE THAT IS UP TO DATE
[Figure: "Bob accepts invitation from Alice" is written (1) to the Master, which feeds two Replicas, and (2) to a Cache with TTL. The read returns "Alice and Bob are friends".]
https://cloud.google.com/datastore/docs/articles/balancing-strong-and-eventual-consistency-with-google-cloud-datastore/#keys-only-global-query-followed-by-lookup-by-key
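A sketch of the idea: fresh writes are remembered in a cache with a TTL, so reads do not depend on a replica that may still be stale. TTLCache, FriendStore and the injectable clock are hypothetical names; sets stand in for the master and a replica.

```python
import time


class TTLCache:
    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl, self.clock, self.items = ttl_seconds, clock, {}

    def put(self, key, value):
        self.items[key] = (value, self.clock() + self.ttl)

    def get(self, key):
        value, expires = self.items.get(key, (None, 0.0))
        return value if self.clock() < expires else None


class FriendStore:
    def __init__(self, cache):
        self.master, self.replica, self.cache = set(), set(), cache

    def befriend(self, a, b):
        self.master.add((a, b))        # 1. write to the master
        self.cache.put((a, b), True)   # 2. remember the fresh write

    def replicate(self):
        self.replica |= self.master    # asynchronous in a real system

    def are_friends(self, a, b):
        if self.cache.get((a, b)):     # fresh writes are answered from the cache
            return True
        return (a, b) in self.replica  # older data comes from the replica
```

The TTL has to be longer than the worst replication lag that is expected; otherwise an entry can expire before the replica has caught up.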
MASTER FAILS
[Figure: the same setup (Master, two Replicas, Cache with TTL); the Master is marked x.]
ELECT NEW MASTER
CACHE FAILS
[Figure: the same setup (Master, two Replicas, Cache with TTL); the Cache is marked x.]
(A) READ FROM REPLICAS
(B) SPIN OFF NEW CACHE
5TH STEP: RETURN TO CONSISTENT STATE
A: Wait for Bob to fix it
B: Write a vacuum script
Search the most recently processed friend requests
Verify consistency
• Add missing {f1: Alice, f2: Bob} entry
• Or maybe remove it?
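Option B could look roughly like the sketch below. It assumes one particular repair policy: a one-way friendship is completed rather than removed, and a leftover invitation between friends is deleted. The vacuum function and its arguments are hypothetical.

```python
import logging

log = logging.getLogger("vacuum")


def vacuum(friends, invitations, recently_processed):
    """Repair the relations touched by recently processed friend requests."""
    repairs = []
    for sender, recipient in recently_processed:
        pair, reverse = (sender, recipient), (recipient, sender)
        # One direction exists without the other: add the missing entry.
        if (pair in friends) != (reverse in friends):
            friends.update({pair, reverse})
            repairs.append(("added missing friend entry", sender, recipient))
        # Already friends but the invitation is still there: remove it.
        if pair in friends and pair in invitations:
            invitations.discard(pair)
            repairs.append(("removed stale invitation", sender, recipient))
    for repair in repairs:
        log.warning("vacuum repair: %s (%s, %s)", *repair)
    return repairs
```

Whether to add the missing entry or remove the existing one is a product decision; the sketch only shows where that decision lives.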
A SELF-HEALING SYSTEM
A system that can work with inconsistent data AND
Applies various strategies to ensure eventual consistency
It basically does three things:
1. Allow for inconsistency
2. Discover inconsistencies
3. Fix inconsistencies
SELF-HEALING DONE BADLY (1)
try {
    INSERT A
    INSERT B
} catch (Exception $e) {
    // rollback (which can fail for the same reason the inserts did)
    DELETE A
    DELETE B
}
Rollback must be a separate process
SELF-HEALING DONE BADLY (2)
// Bounded retry
$tries = 0;
while (true) {
    $succeeded = A();
    if ($succeeded) {
        break;
    }
    $tries++;
    if ($tries > MAX_TRIES) {
        // log it, then throw an exception or break
        break;
    }
}
// Unbounded retry
while (true) {
    $succeeded = A();
    // A may fail due to race conditions
    if ($succeeded) {
        break;
    }
}
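For comparison, a bounded retry helper in Python. The exponential backoff between attempts is an addition that the slide does not mention; retry, max_tries, base_delay and the injectable sleep are hypothetical names.

```python
import logging
import time

log = logging.getLogger("retry")


def retry(action, max_tries=3, base_delay=0.01, sleep=time.sleep):
    """Call action() until it returns True; give up after max_tries attempts."""
    for attempt in range(1, max_tries + 1):
        if action():
            return attempt
        log.warning("attempt %d of %d failed", attempt, max_tries)
        if attempt < max_tries:
            # Exponential backoff: wait longer after each failed attempt.
            sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError(f"giving up after {max_tries} tries")
```

Giving up loudly matters as much as retrying: the exception and the log lines are what let a separate repair process, or a person, notice the problem.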
SOURCE OF TRUTH APPROACH
Each user should have a unique http://example.com/name.surname address
We are using Mongo and Redis
Register user:
1. Find next free unique name.surname.X
2. Try to store in Mongo (it has unique index) – N tries max
3. Store the name in Redis (for performance)
But what if our system fails before 3?
name.surname.1
name.surname.2
name.surname.3
…
SOURCE OF TRUTH
If one asks for http://example.com/name.surname
• We check name.surname against Redis
• Suppose it is not there
• Do we know there is no such user? No!
AD HOC SELF-HEALING
• Ask Mongo DB for the user with name.surname
• Not there? Then no such user > return 404
• Is there? Apply self-healing
1. Revert to source of truth
2. Fix data according to source of truth
3. Return valid result
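The read path above, sketched with two dicts standing in for Mongo (the source of truth) and Redis (the fast lookup). No real database client is used; find_user, the record layout and the returned status codes are assumptions for illustration.

```python
import logging

log = logging.getLogger("self_healing")

mongo = {"name.surname": {"id": 1}}  # stand-in for the source of truth
redis = {}                           # stand-in for the fast lookup store


def find_user(slug):
    user_id = redis.get(slug)
    if user_id is not None:
        return 200, user_id
    # A miss in the fast store proves nothing: ask the source of truth.
    record = mongo.get(slug)         # 1. revert to source of truth
    if record is None:
        return 404, None
    redis[slug] = record["id"]       # 2. fix data according to source of truth
    log.warning("repaired missing Redis entry for %s", slug)
    return 200, record["id"]         # 3. return valid result
```

The repair is logged on purpose, so that frequent repairs show up in monitoring instead of being silently absorbed.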
DO NOT NINJA CODE IT
When there is an inconsistency, you must be informed of it
Multiple inconsistencies suggest a bigger problem
Your consistency checking/fixing should log what it finds and fixes; someone has to monitor those logs
REMEMBER
Design Consistent and Available systems that:
• Become Eventually Consistent during failures
• Make sure Consistency gets restored
CAP does not disallow this
CHEAT SHEET
1. Allow for inconsistency
2. Design for inconsistent data
3. Test against the contract
4. Ensure eventual consistency (build a self-healing system)
5. Know the behavior of functionality when failures happen

More Related Content

Similar to Eventual Consistency - Desining Fail Proof Systems

Building Secure Open & Distributed Social Networks
Building Secure Open & Distributed Social NetworksBuilding Secure Open & Distributed Social Networks
Building Secure Open & Distributed Social NetworksHenry Story
 
A Little Graph Theory for the Busy Developer - Jim Webber @ GraphConnect Chic...
A Little Graph Theory for the Busy Developer - Jim Webber @ GraphConnect Chic...A Little Graph Theory for the Busy Developer - Jim Webber @ GraphConnect Chic...
A Little Graph Theory for the Busy Developer - Jim Webber @ GraphConnect Chic...Neo4j
 
Project # 3data (3).zipfriends6x6.csvDavid,FrankCindy.docx
Project # 3data (3).zipfriends6x6.csvDavid,FrankCindy.docxProject # 3data (3).zipfriends6x6.csvDavid,FrankCindy.docx
Project # 3data (3).zipfriends6x6.csvDavid,FrankCindy.docxwkyra78
 
How did I get here? Building confidence in a distributed stream processor
How did I get here? Building confidence in a distributed stream processorHow did I get here? Building confidence in a distributed stream processor
How did I get here? Building confidence in a distributed stream processorSean T Allen
 
Graphs for Ai and ML
Graphs for Ai and MLGraphs for Ai and ML
Graphs for Ai and MLNeo4j
 
Neo4j 20 minutes introduction
Neo4j 20 minutes introductionNeo4j 20 minutes introduction
Neo4j 20 minutes introductionAndrás Fehér
 
haventreddityet demo
haventreddityet demohaventreddityet demo
haventreddityet demoJames Pearce
 
Bootstrapping Recommendations OSCON 2015
Bootstrapping Recommendations OSCON 2015Bootstrapping Recommendations OSCON 2015
Bootstrapping Recommendations OSCON 2015Max De Marzi
 
Genetic Malware
Genetic MalwareGenetic Malware
Genetic MalwareOkta
 
Decentralised Applications on Bitcoin
Decentralised Applications on BitcoinDecentralised Applications on Bitcoin
Decentralised Applications on BitcoinFederico Tenga
 
Eat my data
Eat my dataEat my data
Eat my dataPeng Zuo
 
Life without CPAN
Life without CPANLife without CPAN
Life without CPANBob Ernst
 

Similar to Eventual Consistency - Desining Fail Proof Systems (19)

Building Secure Open & Distributed Social Networks
Building Secure Open & Distributed Social NetworksBuilding Secure Open & Distributed Social Networks
Building Secure Open & Distributed Social Networks
 
A Little Graph Theory for the Busy Developer - Jim Webber @ GraphConnect Chic...
A Little Graph Theory for the Busy Developer - Jim Webber @ GraphConnect Chic...A Little Graph Theory for the Busy Developer - Jim Webber @ GraphConnect Chic...
A Little Graph Theory for the Busy Developer - Jim Webber @ GraphConnect Chic...
 
Part2-Apps-Security.pptx
Part2-Apps-Security.pptxPart2-Apps-Security.pptx
Part2-Apps-Security.pptx
 
Project # 3data (3).zipfriends6x6.csvDavid,FrankCindy.docx
Project # 3data (3).zipfriends6x6.csvDavid,FrankCindy.docxProject # 3data (3).zipfriends6x6.csvDavid,FrankCindy.docx
Project # 3data (3).zipfriends6x6.csvDavid,FrankCindy.docx
 
How did I get here? Building confidence in a distributed stream processor
How did I get here? Building confidence in a distributed stream processorHow did I get here? Building confidence in a distributed stream processor
How did I get here? Building confidence in a distributed stream processor
 
Graphs for Ai and ML
Graphs for Ai and MLGraphs for Ai and ML
Graphs for Ai and ML
 
Scalax
ScalaxScalax
Scalax
 
Neo4J
Neo4JNeo4J
Neo4J
 
Neo4j 20 minutes introduction
Neo4j 20 minutes introductionNeo4j 20 minutes introduction
Neo4j 20 minutes introduction
 
haventreddityet demo
haventreddityet demohaventreddityet demo
haventreddityet demo
 
Intro to Neo4j 2.0
Intro to Neo4j 2.0Intro to Neo4j 2.0
Intro to Neo4j 2.0
 
Bootstrapping Recommendations OSCON 2015
Bootstrapping Recommendations OSCON 2015Bootstrapping Recommendations OSCON 2015
Bootstrapping Recommendations OSCON 2015
 
How's it Going?
How's it Going?How's it Going?
How's it Going?
 
Genetic Malware
Genetic MalwareGenetic Malware
Genetic Malware
 
Genetic Malware
Genetic MalwareGenetic Malware
Genetic Malware
 
Database History From Codd to Brewer
Database History From Codd to BrewerDatabase History From Codd to Brewer
Database History From Codd to Brewer
 
Decentralised Applications on Bitcoin
Decentralised Applications on BitcoinDecentralised Applications on Bitcoin
Decentralised Applications on Bitcoin
 
Eat my data
Eat my dataEat my data
Eat my data
 
Life without CPAN
Life without CPANLife without CPAN
Life without CPAN
 

Recently uploaded

ARM Talk @ Rejekts - Will ARM be the new Mainstream in our Data Centers_.pdf
ARM Talk @ Rejekts - Will ARM be the new Mainstream in our Data Centers_.pdfARM Talk @ Rejekts - Will ARM be the new Mainstream in our Data Centers_.pdf
ARM Talk @ Rejekts - Will ARM be the new Mainstream in our Data Centers_.pdfTobias Schneck
 
Kawika Technologies pvt ltd Software Development Company in Trivandrum
Kawika Technologies pvt ltd Software Development Company in TrivandrumKawika Technologies pvt ltd Software Development Company in Trivandrum
Kawika Technologies pvt ltd Software Development Company in TrivandrumKawika Technologies
 
IA Generativa y Grafos de Neo4j: RAG time
IA Generativa y Grafos de Neo4j: RAG timeIA Generativa y Grafos de Neo4j: RAG time
IA Generativa y Grafos de Neo4j: RAG timeNeo4j
 
Optimizing Business Potential: A Guide to Outsourcing Engineering Services in...
Optimizing Business Potential: A Guide to Outsourcing Engineering Services in...Optimizing Business Potential: A Guide to Outsourcing Engineering Services in...
Optimizing Business Potential: A Guide to Outsourcing Engineering Services in...Jaydeep Chhasatia
 
Big Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/ML
Big Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/MLBig Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/ML
Big Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/MLAlluxio, Inc.
 
Streamlining Your Application Builds with Cloud Native Buildpacks
Streamlining Your Application Builds  with Cloud Native BuildpacksStreamlining Your Application Builds  with Cloud Native Buildpacks
Streamlining Your Application Builds with Cloud Native BuildpacksVish Abrams
 
Growing Oxen: channel operators and retries
Growing Oxen: channel operators and retriesGrowing Oxen: channel operators and retries
Growing Oxen: channel operators and retriesSoftwareMill
 
eAuditor Audits & Inspections - conduct field inspections
eAuditor Audits & Inspections - conduct field inspectionseAuditor Audits & Inspections - conduct field inspections
eAuditor Audits & Inspections - conduct field inspectionsNirav Modi
 
Leveraging DxSherpa's Generative AI Services to Unlock Human-Machine Harmony
Leveraging DxSherpa's Generative AI Services to Unlock Human-Machine HarmonyLeveraging DxSherpa's Generative AI Services to Unlock Human-Machine Harmony
Leveraging DxSherpa's Generative AI Services to Unlock Human-Machine Harmonyelliciumsolutionspun
 
Fields in Java and Kotlin and what to expect.pptx
Fields in Java and Kotlin and what to expect.pptxFields in Java and Kotlin and what to expect.pptx
Fields in Java and Kotlin and what to expect.pptxJoão Esperancinha
 
online pdf editor software solutions.pdf
online pdf editor software solutions.pdfonline pdf editor software solutions.pdf
online pdf editor software solutions.pdfMeon Technology
 
Sales Territory Management: A Definitive Guide to Expand Sales Coverage
Sales Territory Management: A Definitive Guide to Expand Sales CoverageSales Territory Management: A Definitive Guide to Expand Sales Coverage
Sales Territory Management: A Definitive Guide to Expand Sales CoverageDista
 
Watermarking in Source Code: Applications and Security Challenges
Watermarking in Source Code: Applications and Security ChallengesWatermarking in Source Code: Applications and Security Challenges
Watermarking in Source Code: Applications and Security ChallengesShyamsundar Das
 
Why Choose Brain Inventory For Ecommerce Development.pdf
Why Choose Brain Inventory For Ecommerce Development.pdfWhy Choose Brain Inventory For Ecommerce Development.pdf
Why Choose Brain Inventory For Ecommerce Development.pdfBrain Inventory
 
20240319 Car Simulator Plan.pptx . Plan for a JavaScript Car Driving Simulator.
20240319 Car Simulator Plan.pptx . Plan for a JavaScript Car Driving Simulator.20240319 Car Simulator Plan.pptx . Plan for a JavaScript Car Driving Simulator.
20240319 Car Simulator Plan.pptx . Plan for a JavaScript Car Driving Simulator.Sharon Liu
 
Generative AI for Cybersecurity - EC-Council
Generative AI for Cybersecurity - EC-CouncilGenerative AI for Cybersecurity - EC-Council
Generative AI for Cybersecurity - EC-CouncilVICTOR MAESTRE RAMIREZ
 
Transforming PMO Success with AI - Discover OnePlan Strategic Portfolio Work ...
Transforming PMO Success with AI - Discover OnePlan Strategic Portfolio Work ...Transforming PMO Success with AI - Discover OnePlan Strategic Portfolio Work ...
Transforming PMO Success with AI - Discover OnePlan Strategic Portfolio Work ...OnePlan Solutions
 
ERP For Electrical and Electronics manufecturing.pptx
ERP For Electrical and Electronics manufecturing.pptxERP For Electrical and Electronics manufecturing.pptx
ERP For Electrical and Electronics manufecturing.pptxAutus Cyber Tech
 
OpenChain Webinar: Universal CVSS Calculator
OpenChain Webinar: Universal CVSS CalculatorOpenChain Webinar: Universal CVSS Calculator
OpenChain Webinar: Universal CVSS CalculatorShane Coughlan
 

Recently uploaded (20)

ARM Talk @ Rejekts - Will ARM be the new Mainstream in our Data Centers_.pdf
ARM Talk @ Rejekts - Will ARM be the new Mainstream in our Data Centers_.pdfARM Talk @ Rejekts - Will ARM be the new Mainstream in our Data Centers_.pdf
ARM Talk @ Rejekts - Will ARM be the new Mainstream in our Data Centers_.pdf
 
Kawika Technologies pvt ltd Software Development Company in Trivandrum
Kawika Technologies pvt ltd Software Development Company in TrivandrumKawika Technologies pvt ltd Software Development Company in Trivandrum
Kawika Technologies pvt ltd Software Development Company in Trivandrum
 
Salesforce AI Associate Certification.pptx
Salesforce AI Associate Certification.pptxSalesforce AI Associate Certification.pptx
Salesforce AI Associate Certification.pptx
 
IA Generativa y Grafos de Neo4j: RAG time
IA Generativa y Grafos de Neo4j: RAG timeIA Generativa y Grafos de Neo4j: RAG time
IA Generativa y Grafos de Neo4j: RAG time
 
Optimizing Business Potential: A Guide to Outsourcing Engineering Services in...
Optimizing Business Potential: A Guide to Outsourcing Engineering Services in...Optimizing Business Potential: A Guide to Outsourcing Engineering Services in...
Optimizing Business Potential: A Guide to Outsourcing Engineering Services in...
 
Big Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/ML
Big Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/MLBig Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/ML
Big Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/ML
 
Streamlining Your Application Builds with Cloud Native Buildpacks
Streamlining Your Application Builds  with Cloud Native BuildpacksStreamlining Your Application Builds  with Cloud Native Buildpacks
Streamlining Your Application Builds with Cloud Native Buildpacks
 
Growing Oxen: channel operators and retries
Growing Oxen: channel operators and retriesGrowing Oxen: channel operators and retries
Growing Oxen: channel operators and retries
 
eAuditor Audits & Inspections - conduct field inspections
eAuditor Audits & Inspections - conduct field inspectionseAuditor Audits & Inspections - conduct field inspections
eAuditor Audits & Inspections - conduct field inspections
 
Leveraging DxSherpa's Generative AI Services to Unlock Human-Machine Harmony
Leveraging DxSherpa's Generative AI Services to Unlock Human-Machine HarmonyLeveraging DxSherpa's Generative AI Services to Unlock Human-Machine Harmony
Leveraging DxSherpa's Generative AI Services to Unlock Human-Machine Harmony
 
Fields in Java and Kotlin and what to expect.pptx
Fields in Java and Kotlin and what to expect.pptxFields in Java and Kotlin and what to expect.pptx
Fields in Java and Kotlin and what to expect.pptx
 
online pdf editor software solutions.pdf
online pdf editor software solutions.pdfonline pdf editor software solutions.pdf
online pdf editor software solutions.pdf
 
Sales Territory Management: A Definitive Guide to Expand Sales Coverage
Sales Territory Management: A Definitive Guide to Expand Sales CoverageSales Territory Management: A Definitive Guide to Expand Sales Coverage
Sales Territory Management: A Definitive Guide to Expand Sales Coverage
 
Watermarking in Source Code: Applications and Security Challenges
Watermarking in Source Code: Applications and Security ChallengesWatermarking in Source Code: Applications and Security Challenges
Watermarking in Source Code: Applications and Security Challenges
 
Why Choose Brain Inventory For Ecommerce Development.pdf
Why Choose Brain Inventory For Ecommerce Development.pdfWhy Choose Brain Inventory For Ecommerce Development.pdf
Why Choose Brain Inventory For Ecommerce Development.pdf
 
20240319 Car Simulator Plan.pptx . Plan for a JavaScript Car Driving Simulator.
20240319 Car Simulator Plan.pptx . Plan for a JavaScript Car Driving Simulator.20240319 Car Simulator Plan.pptx . Plan for a JavaScript Car Driving Simulator.
20240319 Car Simulator Plan.pptx . Plan for a JavaScript Car Driving Simulator.
 
Generative AI for Cybersecurity - EC-Council
Generative AI for Cybersecurity - EC-CouncilGenerative AI for Cybersecurity - EC-Council
Generative AI for Cybersecurity - EC-Council
 
Transforming PMO Success with AI - Discover OnePlan Strategic Portfolio Work ...
Transforming PMO Success with AI - Discover OnePlan Strategic Portfolio Work ...Transforming PMO Success with AI - Discover OnePlan Strategic Portfolio Work ...
Transforming PMO Success with AI - Discover OnePlan Strategic Portfolio Work ...
 
ERP For Electrical and Electronics manufecturing.pptx
ERP For Electrical and Electronics manufecturing.pptxERP For Electrical and Electronics manufecturing.pptx
ERP For Electrical and Electronics manufecturing.pptx
 
OpenChain Webinar: Universal CVSS Calculator
OpenChain Webinar: Universal CVSS CalculatorOpenChain Webinar: Universal CVSS Calculator
OpenChain Webinar: Universal CVSS Calculator
 

Eventual Consistency - Desining Fail Proof Systems

  • 1. EVENTUAL CONSISTENCY DESIGNING FAILPROOF SYSTEMS Grzegorz Skorupa Software Architect Illustration: Getty Images
  • 2. PROBLEMS EVERYWHERE … • System is failing approx once each two days without visible reason … • System failed because developer expected each post to have an author but author was not in DB • Problem is occuring for Max … but not for any other user … • One can find any article except for the one about … Data consistency?
  • 3. AGENDA  Problem  CAP Theorem  Eventual Consistency  Building an Eventually Consistent App At Least Once Source of Truth Restoring Consistency
  • 4. IT SYSTEMS TODAY Business requirements:  Highly Available  Serve large amount of users  Do complex tasks …  … On large data sets  Provide correct data  Provide up-to-date data Technical challenges:  Scalable  Distributed  No Single Point of Failure  Big Data  Data consistency
  • 5. FRIENDS EXAMPLE No invitations for Alice Friends of Alice Friends of Bob Friend invitations for BobFriend invitations for Alice ✓
  • 6. No invitations for Alice FRIENDS EXAMPLE Friends of Alice Friends of Bob Friend invitations for Bob ✓ Friend invitations for Alice friendAlice ??? friendBob ???
  • 7. No invitations for Alice FRIENDS EXAMPLE Friends of Alice Friends of Bob Friend invitations for Bob ✓ Friend invitations for Alice ??? Bobinvite friendAlice ??? friendBob ??? ??? Aliceinvite
  • 8. SOME CONSTRAINTS: We do not want to see two friend invitations from the same person We do not want to be friends twice If we are friends both of us should see the other person in the list of friends
  • 10. FRIENDSHIP RELATIONSHIP Alice Bobinvite Alice invites Bob Bob accepts invitation friendAlice Bob
  • 11. FRIENDSHIP RELATIONSHIP Alice Bobinvite Alice invites Bob Bob accepts invitation friendAlice Bob friendBob Alice
  • 12. FRIENDSHIP RELATIONSHIP Alice Bobinvite Alice invites Bob Bob accepts invitation friendAlice Bob Alice Bobinvite friendBob Alice
  • 13. THE NIGHTMARE – INCONSISTENCIES IN DATA Alice is friends with Bob
  • 14. THE NIGHTMARE – INCONSISTENCIES IN DATA Alice is friends with Bob but Bob is not friends with Alice
  • 15. THE NIGHTMARE – INCONSISTENCIES IN DATA Alice is friends with Bob but Bob is not friends with Alice Alice is friends with Bob
  • 16. THE NIGHTMARE – INCONSISTENCIES IN DATA Alice is friends with Bob but Bob is not friends with Alice Alice is friends with Bob but Bob still sees friend invitation from Alice
  • 17. WELL, WE HAVE ACID APPROACH Begin transaction Commit transaction friendAlice Bob Alice Bobinvite friendBob Alice
  • 18. NO ACID? WHAT TO DO? B) • File upload: 1. Store file path in DB 2. Save file • Deleting file: 1. Remove file 2. Remove from DB A) • File upload: 1. Save file 2. Store file path in DB • Deleting file: 1. Remove from DB 2. Remove file VS
  • 19. CAP THEOREM Consistency Availability Toleration to Partitioning We can have only the 2 out of 3 Seth G., Lynch N.: Brewer’s Conjecture and the Feasibility of Consistent, Available, Partition-tolerant Web Services, SIGACT News v. 33, n. 2, 2002
  • 20. TWO PHASE COMMIT Coordinator Cohort QUERY TO COMMIT VOTE YES/NO COMMIT/ROLLBACK ACKNOWLEDGEMENT Prepare / Abort Commit / Abort Commit / Abort End
  • 21. CAP AND TWO PHASE COMMIT When you do a Two Phase Commit: You are sacrificing Availability – locking when node is down Scalability suffers, performance suffers It has a Single Point of Failure It does not guarantee Consistency http://blog.thislongrun.com/2015/04/the-unclear-cp-vs-ca-case-in-cap.html
  • 22. SO … OUR HIGH TRAFFIC SYSTEM CAN’T BE CONSISTENT I am done, we have no consistency anyways But what will happen WHEN 1 succeeds and 2 fails? friendAlice Bob Alice Bobinvite friendBob Alice
  • 23. CAP: AVAILABILITY AND CONSISTENCY: Availability: Every request received by a non-failing node must result in a (successful) response. Consistency: There exists a total order of all operations such that each operation looks as if it were completed at a single instant. Example: SET V = old value; SET V = new value; READ V → old value (from node 1); READ V → new value (from node 2).
  • 24. UNDERSTANDING CAP THEOREM (1) Write Reads Application Writes (3) Reads Application (2) Synchronize data
  • 25. BAD CAP, BAD MATH (1) Writes Reads Application Database Nginx PHP
  • 26. WHAT IS A NODE? Writes Reads Nginx (PHP) MySQL (Master) MySQL (Replica) Nginx (PHP) Facebook API GoLang Load Balancer Redis
  • 27. BAD CAP, BAD MATH (2) Primary Secondary Secondary Secondary Secondary Writes Reads Reads
  • 29. BAD CAP, BAD MATH (2) Primary Secondary Secondary Secondary Secondary Writes Reads x x
  • 30. PARTITION VS. FAILURE Application Application Application Application x Reads/Writes Reads/Writes
  • 31. WHAT ARE OUR HIGH TRAFFIC SYSTEMS? They are consistent most of the time. They tolerate partitioning to some extent. They are available most of the time. From a mathematical standpoint they are neither CA, nor CP, nor AP.
  • 32. DESIGNING SYSTEMS – RELATION TO CAP: Available & Partition Tolerant: POST /news, GET /news. Consistent & Partition Tolerant: POST /friends/requests, GET /friends/requests.
  • 33. BUT … WE NEED SOME SORT OF CONSISTENCY: EVENTUAL CONSISTENCY The system guarantees that if no new updates are made to the object, eventually all accesses will return the last updated value. Vogels, Werner. "Eventually consistent." Queue 6.6 (2008): 14-19.
  • 34. USEFUL APPROACHES: Eventually consistent instead of Consistent; At least once + Idempotency instead of Exactly once; Source of truth instead of Absolute truth; Controlled inconsistency (A can live without B but B can’t live without A); Restore Consistency procedures; Minimize the probability of inconsistent data.
  • 35. WHAT ORDER TO APPLY? friendAlice Bob Alice Bobinvite friendBob Alice
  • 36. FAILURE SCENARIOS – REMOVE INVITATION FIRST (1): If the system fails right after removing the invitation: the friend request will be gone and Alice will not be friends with Bob. If the system fails after also adding the friend record Alice → Bob: Alice will see Bob in her friends list but Bob will not see Alice in his friends list.
  • 37. FAILURE SCENARIOS – BOB FIRST (2): If the system fails right after adding the friend record Bob → Alice: Bob will see Alice in his friends list, Alice will not see Bob in her friends list, and Bob will still see the friend request from Alice. If the system fails after also adding the friend record Alice → Bob: Alice and Bob will see each other in their friends lists, but Bob will still see the friend request from Alice.
  • 38. FAILURE SCENARIOS – ALICE FIRST (3): If the system fails right after adding the friend record Alice → Bob: Alice will see Bob in her friends list, Bob will not see Alice in his friends list, and Bob will still see the friend request from Alice. If the system fails after also adding the friend record Bob → Alice: Alice and Bob will see each other in their friends lists, but Bob will still see the friend request from Alice.
  • 39. 1ST STEP: CHOOSE THE BEST ORDER. BOB FIRST (friend record Bob → Alice, then friend record Alice → Bob, then remove the invitation): after a failure following the first step, Bob will see Alice in his friends list, Alice will not see Bob in her friends list, and Bob will still see the friend request from Alice; after a failure following the second step, Alice and Bob will see each other in their friends lists and Bob will still see the friend request from Alice. ALICE FIRST (friend record Alice → Bob, then friend record Bob → Alice, then remove the invitation): after a failure following the first step, Alice will see Bob in her friends list, Bob will not see Alice in his friends list, and Bob will still see the friend request from Alice; after a failure following the second step, Alice and Bob will see each other in their friends lists and Bob will still see the friend request from Alice.
  • 40. TWO GENERALS PROBLEM: User: 1. POST /posts "my new post". 2. Wait for response. Server: 3. Create the post. 4. Send success response (X: the response never arrives). User: 5. Where is my response? 6. Should I resend?
  • 41. 2ND STEP: MAKE THE ACTION IDEMPOTENT
      IF (EXISTS(invitation {from: Alice, to: Bob})) {
          UPSERT {f1: Alice, f2: Bob}   // NOT INSERT!!
          UPSERT {f1: Bob, f2: Alice}   // NOT INSERT!!
          DELETE invitation {from: Alice, to: Bob}
      }
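The idempotent accept from slide 41 can be sketched with SQLite, where INSERT OR IGNORE plays the role of UPSERT. The schema is an assumption made for the example, and the connection runs in autocommit mode so that each statement stands alone, imitating a store without multi-statement transactions.

```python
import sqlite3

def accept_invitation(conn, sender, receiver):
    """Idempotent accept: safe to call again after a partial failure."""
    invite = conn.execute(
        "SELECT 1 FROM invitations WHERE sender = ? AND receiver = ?",
        (sender, receiver),
    ).fetchone()
    if invite is None:
        return False  # already accepted (or never invited): nothing to do
    upsert = "INSERT OR IGNORE INTO friends (f1, f2) VALUES (?, ?)"
    conn.execute(upsert, (sender, receiver))
    conn.execute(upsert, (receiver, sender))
    # The invitation goes last, so a retry still finds it and can finish the job.
    conn.execute(
        "DELETE FROM invitations WHERE sender = ? AND receiver = ?",
        (sender, receiver),
    )
    return True

conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit
conn.execute("CREATE TABLE friends (f1 TEXT, f2 TEXT, PRIMARY KEY (f1, f2))")
conn.execute("CREATE TABLE invitations (sender TEXT, receiver TEXT)")
conn.execute("INSERT INTO invitations VALUES ('Alice', 'Bob')")
# Leftover from an earlier attempt that failed half-way:
conn.execute("INSERT INTO friends VALUES ('Alice', 'Bob')")

first = accept_invitation(conn, "Alice", "Bob")   # completes the half-done accept
second = accept_invitation(conn, "Alice", "Bob")  # no-op on repeat
```

A plain INSERT would fail on the leftover row and the accept could never complete, which is why the slide insists on UPSERT.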
  • 42. 3RD STEP: MAKE SURE YOUR SYSTEM CAN WORK WITH INCONSISTENT DATA: What if the friend record Alice → Bob is there but the friend record Bob → Alice is not? What if the friend record Alice → Bob is there but the invitation from Alice to Bob is also there?
  • 43. 4TH STEP: DON’T LET OTHERS BREAK THE SYSTEM: Write a contract: Alice and Bob mentioned in the invitation MUST have respective user accounts in the system. The friend relation SHOULD always be both ways. When there is a friend relation there SHOULD be no invitation. Should: respects eventual consistency. Must: is always consistent.
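The contract from slide 43 can be turned into an executable check. The sketch below assumes the data is available as plain Python sets (a set of user names and sets of (from, to) pairs); the function name and message texts are made up for the example. It reports MUST violations separately from SHOULD violations, since only the former indicate a real bug.

```python
def check_contract(users, friends, invitations):
    """Return (must_violations, should_violations) as lists of messages."""
    must, should = [], []
    for sender, receiver in sorted(invitations):
        for user in (sender, receiver):
            if user not in users:
                must.append(f"invitation {sender}->{receiver}: no account for {user}")
        if (sender, receiver) in friends or (receiver, sender) in friends:
            should.append(f"invitation {sender}->{receiver} coexists with a friend relation")
    for a, b in sorted(friends):
        if (b, a) not in friends:
            should.append(f"friend relation {a}->{b} has no reverse {b}->{a}")
    return must, should
```

A SHOULD violation is expected to disappear once the system catches up; a MUST violation should fail a test or raise an alert immediately.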
  • 44. TEST AGAINST CONTRACT • FIT: https://medium.com/netflix-techblog/fit-failure-injection-testing-35d8e2a9bb2 • Chaos Monkey: https://github.com/Netflix/chaosmonkey
      // Code
      function acceptInvitation($from, $to) {
          $invite = $this->invites->find($from, $to);
          if ($invite) {
              $this->friends->befriend($from, $to);
              $this->friends->befriend($to, $from);
              $this->invites->remove($invite);
          }
      }
      // Test
      function testIdempotency() {
          $from = 'Alice'; // example users
          $to = 'Bob';
          // Simulate a partial failure: the invitation and one friend record already exist
          $this->invites->create($from, $to);
          $this->friends->befriend($from, $to);
          $response = $this->controller->acceptInvitation($from, $to);
          $this->assertEquals(200, $response->code());
          $this->assertTrue($this->friends->isFriend($from, $to));
          $this->assertTrue($this->friends->isFriend($to, $from));
          $this->assertNull($this->invites->find($from, $to));
      }
  • 45. YOU HAVE TO THINK ABOUT THE WHOLE FUNCTIONALITY: What about deleting a friend? What if Alice could see her pending invitations? What if Alice could cancel the invitation? What if both send an invitation to each other? And finally: what about race conditions?
  • 46. EVENT SOURCING, CQRS: Change log (write model): Alice invited Bob; Bob declined invitation from Alice; Alice cancelled invitation to Bob; Bob invited Alice; Alice accepted invitation from Bob. Result (read model): Alice and Bob are friends.
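The relationship between the change log and the result on slide 46 can be sketched as a replay function: the log is the write model, and the read model is whatever state falls out of applying the events in order. The tuple layout (action, actor, other) and the action names are assumptions made for the example.

```python
def replay(log):
    """Rebuild the read model (pending invitations, friendships) from the log."""
    invitations, friends = set(), set()
    for action, actor, other in log:
        if action == "invited":
            invitations.add((actor, other))
        elif action == "cancelled":          # actor withdraws their own invitation
            invitations.discard((actor, other))
        elif action == "declined":           # actor rejects other's invitation
            invitations.discard((other, actor))
        elif action == "accepted":           # actor accepts other's invitation
            if (other, actor) in invitations:
                invitations.discard((other, actor))
                friends.add(frozenset((actor, other)))
    return invitations, friends

change_log = [
    ("invited", "Alice", "Bob"),
    ("declined", "Bob", "Alice"),
    ("cancelled", "Alice", "Bob"),
    ("invited", "Bob", "Alice"),
    ("accepted", "Alice", "Bob"),
]
invitations, friends = replay(change_log)
```

Because the read model is derived, it can be thrown away and rebuilt from the log whenever it is suspected to be inconsistent.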
  • 47. CHANGE LOG: 1. Add pending operation to change log ("Bob accepts invitation from Alice": PENDING). 2. Handle operation (create friend associations, remove invitation). 3. Commit operation in change log ("Bob accepts invitation from Alice": SUCCESS).
  • 48. FAILURE IN THE MIDDLE OF OPERATION: REPLAY … 1. Read pending operations from change log ("Bob accepts invitation from Alice": PENDING). 2. Handle operation (create friend associations, remove invitation). This step must be idempotent! 3. Commit operation in change log ("Bob accepts invitation from Alice": SUCCESS).
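The pending/commit cycle from slides 47 and 48 can be sketched as follows. The log is a plain list, the state is a dict of sets, and the crash_before_commit flag is a hypothetical switch used only to simulate a failure between steps 2 and 3.

```python
PENDING, SUCCESS = "PENDING", "SUCCESS"

def handle(op, state):
    """Idempotent handler: set operations make a second run harmless."""
    sender, receiver = op["from"], op["to"]
    state["friends"].add((sender, receiver))
    state["friends"].add((receiver, sender))
    state["invitations"].discard((sender, receiver))

def submit(log, op, state, crash_before_commit=False):
    entry = {"op": op, "status": PENDING}
    log.append(entry)              # 1. add pending operation to the change log
    handle(op, state)              # 2. handle the operation
    if crash_before_commit:
        return                     # simulated failure: entry stays PENDING
    entry["status"] = SUCCESS      # 3. commit the operation in the change log

def replay_pending(log, state):
    for entry in log:
        if entry["status"] == PENDING:
            handle(entry["op"], state)   # may run a second time, hence idempotent
            entry["status"] = SUCCESS
```

After a crash, replay_pending cannot tell whether step 2 ran fully, partly, or not at all, so it simply runs it again. That is only safe because the handler is idempotent.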
  • 49. PARTITION TOLERANCE, MERGING OF CHANGE LOGS: Already in the log: Alice invites Bob (SUCCESS). Node 1 and Node 2 each hold an operation the other has not seen: Bob accepts invitation from Alice (PENDING); Alice deletes friend invitation sent to Bob (PENDING). Merge / conflict solving: Bob accepts invitation from Alice (SUCCESS); Alice deletes friend invitation sent to Bob (DECLINED).
  • 50. CHEAPER SOLUTION TRAFFIC REDIRECTION Bob accepts invitation from Alice PENDING Alice deletes friend invitation sent to Bob PENDING Node 1: Node 2: Alice invites Bob SUCCESS xBob accepts invitation from Alice SUCCESS
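  • 51. READING STALE DATA: "Bob accepts invitation from Alice" is written to the Master and replicated to two Replicas asynchronously (ASYNC), because synchronous replication is slow. Depending on where it is served from, a read returns "Alice and Bob are friends" or "Alice and Bob are NOT friends".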
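The stale read on slide 51 can be reproduced with a toy model of asynchronous replication. The Master class, its backlog list, and the use of plain dicts as replicas are all assumptions made for the example.

```python
class Master:
    def __init__(self, replicas):
        self.data = {}
        self.replicas = replicas   # each replica is just a dict here
        self.backlog = []          # writes not yet shipped to the replicas

    def write(self, key, value):
        self.data[key] = value
        self.backlog.append((key, value))  # replication happens later (ASYNC)

    def replicate(self):
        for key, value in self.backlog:
            for replica in self.replicas:
                replica[key] = value
        self.backlog.clear()

replica_1, replica_2 = {}, {}
master = Master([replica_1, replica_2])
master.write(("Alice", "Bob"), "friends")

stale = replica_1.get(("Alice", "Bob"))   # read before replication: nothing yet
master.replicate()
fresh = replica_1.get(("Alice", "Bob"))   # read after replication
```

Between write and replicate the master and the replicas disagree. The system is still eventually consistent, because once writes stop every replica converges to the master's value.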
  • 52. FAST STORAGE THAT IS UP TO DATE Master Replica Replica Bob accepts invitation from Alice 1 Cache with TTL 2 Alice and Bob are friends https://cloud.google.com/datastore/docs/articles/balancing-strong-and-eventual-consistency-with-google-cloud-datastore/#keys-only-global-query-followed-by-lookup-by-key
  • 53. MASTER FAILS Master Replica Replica Bob accepts invitation from Alice 1 Cache with TTL 2 Alice and Bob are friends x ELECT NEW MASTER
  • 54. CACHE FAILS Master Replica Replica Bob accepts invitation from Alice 1 Cache with TTL 2 Alice and Bob are friends(A)READ FROM REPLICAS (B) SPIN OFF NEW CACHE x
  • 55. 5TH STEP: RETURN TO CONSISTENT STATE: A: Wait for Bob to fix it. B: Write a vacuum script: search the most recently processed friend requests; verify consistency; add the missing {f1: Alice, f2: Bob} entry. Or maybe remove it?
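Option B from slide 55 can be sketched as a small repair function. It assumes the data fits in Python sets of (from, to) pairs and, for brevity, scans everything rather than only recently processed requests. It also takes the "add the missing entry" side of the slide's open question; removing the one-sided entry instead would be an equally valid policy.

```python
def vacuum(friends, invitations):
    """Repair friend data in place and return a list of the fixes applied."""
    fixes = []
    for f1, f2 in sorted(friends):   # sorted() copies, so adding inside the loop is safe
        if (f2, f1) not in friends:
            friends.add((f2, f1))
            fixes.append(f"added missing friend entry {f2}->{f1}")
    for sender, receiver in sorted(invitations):
        if (sender, receiver) in friends:
            invitations.discard((sender, receiver))
            fixes.append(f"removed stale invitation {sender}->{receiver}")
    return fixes
```

Returning the list of fixes makes it easy to log every repair, so that a growing number of repairs is noticed rather than silently absorbed.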
  • 56. A SELF-HEALING SYSTEM: A system that can work with inconsistent data AND applies various strategies to ensure eventual consistency. It basically does three things: 1. Allow for inconsistency. 2. Discover inconsistencies. 3. Fix inconsistencies.
  • 57. SELF-HEALING DONE BADLY (1)
      try {
          INSERT A
          INSERT B
      } catch (Exception $e) {
          // rollback
          DELETE A
          DELETE B
      }
      Rollback must be a separate process
  • 58. SELF-HEALING DONE BADLY (2)
      $tries = 0;
      while (true) {
          $succeeded = A();
          if ($succeeded) {
              break;
          }
          $tries++;
          if ($tries > MAX_TRIES) {
              // log it
              // throw exception or break
          }
      }

      while (true) {
          $succeeded = A(); // A may fail due to race conditions
          if ($succeeded) {
              break;
          }
      }
  • 59. SOURCE OF TRUTH APPROACH: Each user should have a unique http://example.com/name.surname address (name.surname.1, name.surname.2, name.surname.3, …). We are using Mongo and Redis. Register user: 1. Find the next free unique name.surname.X. 2. Try to store it in Mongo (it has a unique index), N tries max. 3. Store the name in Redis (for performance). But what if our system fails before step 3?
  • 60. SOURCE OF TRUTH: If one asks for http://example.com/name.surname: we check name.surname against Redis. Suppose it is not there. Do we know there is no such user? No!
  • 61. AD-HOC SELF-HEALING: Ask MongoDB for the user with name.surname. Not there? Then there is no such user: return 404. Is there? Apply self-healing: 1. Revert to the source of truth. 2. Fix data according to the source of truth. 3. Return a valid result.
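The lookup described on slides 60 and 61 can be sketched with two dicts standing in for Redis (the fast copy) and MongoDB (the source of truth). The function name and the logger name are hypothetical, and no real Redis or MongoDB client calls are used.

```python
import logging

logger = logging.getLogger("self_healing")

def find_user(slug, cache, source_of_truth):
    """Look a user up by name.surname, repairing the cache when it lags behind."""
    if slug in cache:
        return cache[slug]
    user = source_of_truth.get(slug)   # a cache miss proves nothing, so ask the source of truth
    if user is None:
        return None                    # really no such user: the caller returns 404
    # The user exists but the cache entry is missing: log it, fix it, answer correctly.
    logger.warning("cache entry for %s was missing; restored from source of truth", slug)
    cache[slug] = user
    return user
```

The warning matters as much as the repair: a handful of restored entries is normal after a crash, while a steady stream of them points to a bug in the registration path.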
  • 62. DO NOT NINJA CODE IT: When there is an inconsistency you must be informed of it. Multiple inconsistencies suggest a bigger problem. Your consistency checking/fixing should log everything it finds and fixes, and someone has to monitor those logs.
  • 63. REMEMBER: Design Consistent and Available systems that: • Become Eventually Consistent during failures • Make sure Consistency is Restored. CAP does not disallow this.
  • 64. CHEAT SHEET: 1. Allow for inconsistency. 2. Design for inconsistent data. 3. Test against the contract. 4. Ensure eventual consistency (build a self-healing system). 5. Know how the functionality behaves when failures happen.
  • 65. EVENTUAL CONSISTENCY DESIGNING FAILPROOF SYSTEMS Illustration: Getty Images Grzegorz Skorupa Software Architect