Classic Paxos Implemented in Orc
Hemanth Kumar Mantri
Makarand Damle
Term Project CS380D (Distributed Computing)
Consensus
• Agreeing on one result among a group of participants
• Consensus protocols are the basis for the state-machine approach in distributed computing
• Difficult to achieve when the participants or the network fail
Introduction
• Dealing with concurrency:
– Mutexes and semaphores
– Read/write locks in 2PL for transactions
• In a distributed system:
– There is no global master to issue locks
– Nodes and channels fail
Applications
• Chubby
– Distributed lock service by Google
– Provides coarse-grained locking for distributed resources
• Petal
– Distributed virtual disks
• Frangipani
– A scalable distributed file system
Why Paxos?
• Two-Phase Commit (2PC)
– Coordinator failures!!
• Three-Phase Commit (3PC)
– Network partitions!!!
• Paxos
– Correctness guaranteed
– Liveness not guaranteed
2PC
[State-machine diagram: Coordinator (left) and Participant (right)]
• Coordinator: from Initial, send Prepare (to all) and wait in state U. If any participant votes Abort, or on TimeOut, send Abort (to all) and enter Abort. If All are Ready, send Commit (to all) and enter Commit.
• Participant: from Initial, answer Prepare with Ready (entering Ready) or Abort (entering Abort). On Commit, reply ACK and enter Commit; on Abort, reply ACK and enter Abort.
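To make the coordinator's decision rule above concrete, here is a minimal Python sketch (illustrative only; the project itself is in Orc, and the participant objects with prepare() and deliver() methods are assumptions, not a real API):

def run_2pc(participants, timeout):
    # Phase 1: ask every participant to prepare; a timeout counts as Abort.
    votes = []
    for p in participants:
        try:
            votes.append(p.prepare(timeout=timeout))  # hypothetical API
        except TimeoutError:
            votes.append("abort")
    # Phase 2: commit only if every participant voted Ready.
    decision = "commit" if all(v == "ready" for v in votes) else "abort"
    for p in participants:
        p.deliver(decision)  # hypothetical API: send Commit/Abort to all
    return decision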
What’s Wrong?
• Coordinator fails
– In Phase 1: participants can re-elect a leader and restart
– In Phase 2: a decision has already been taken
• If at least one live participant knows the decision: OK!
• If every participant that knew it also dies:
– Re-election risks inconsistency
– So everyone is BLOCKED till the coordinator recovers!!
• Participant fails
– In Phase 1: timeout, so Abort
– In Phase 2: it checks with the leader after recovery
• Nobody is blocked
Problems
• 2PC is not resilient to coordinator failures in the 2nd phase
• Participants do not know the leader’s decision: Abort or Commit
• So a new phase, ‘Prepare to Commit’, is introduced to remove this ambiguity
Solution – 3PC
[State-machine diagram: Coordinator (left) and Participant (right)]
• Coordinator: from Init, send Prepare (to all) and wait in state U. On an Abort vote or TimeOut, send Abort (to all) and enter A. When All are Ready, send PrepareCommit (to all) and enter BC; when All reply OK, send Commit (to all) and enter C. If not all reply OK, send Abort (to all).
• Participant: from Init, answer Prepare with Ready (entering R) or Abort (entering A). On PrepareCommit, reply OK and enter PC. On Commit, enter C. The ‘After Recovery’ edge in the diagram leads to Abort.
Recovery
• Coordinator fails in the 2nd phase and a participant also fails
– Participant: should have been in PC
– Coordinator: should have been in BC
– The others can re-elect and restart 3PC (nothing was committed)
• Coordinator fails in the 3rd phase:
– The decision was taken and we know what it is
– No need to BLOCK!
So, what’s wrong again?
• Network partition!!
[Diagram: nodes A, B, C, D connected through a Hub; a partition separates the old Leader from the nodes that elect a New Leader, so the two sides can decide differently]
Problem
How to reach consensus/data consistency in a given distributed system that can tolerate non-malicious failures?
Requirements
• Safety
– Only a value that has been proposed may be chosen
– Only one value is chosen
– A node never learns that a value has been chosen unless it actually has been
• Liveness
– Eventually, some proposed value is chosen, and a node can learn the chosen value
• When the protocol is run on 2F+1 processes, F processes can fail
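The 2F+1 bound works because any two majorities must intersect; a small illustrative Python check:

def majority(n):
    # Smallest quorum size such that any two quorums share a member.
    return n // 2 + 1

# With 2F+1 = 5 processes, a quorum is 3, so F = 2 processes may fail
# while any two quorums still overlap in at least one process.
assert majority(5) == 3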
Terminology
• Classes/roles of agents:
– Client: issues a request, waits for the response
– Proposer: proposes the client’s request, convinces the acceptors, resolves conflicts
– Acceptor: accepts/rejects proposed values and lets the learners know when a value is accepted
– Learner: mostly serves as a replication factor
• A node can act as more than one agent!
Paxos Algorithm
• Phase 1:
– Proposer (Prepare)
• Selects a proposal number N
• Sends a prepare request with number N to all acceptors
– Acceptor (Promise)
• If N is greater than that of any prepare request it has seen: respond with a promise not to accept any more proposals numbered less than N
• Otherwise: reject the proposal and indicate the highest proposal number it is considering
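A minimal Python sketch of the Promise step above, with the acceptor's state as a plain dict (illustrative names only; the project's actual code is in Orc):

def on_prepare(state, n):
    # state = {"promised": highest prepare number answered so far (-1 if none),
    #          "accepted": highest-numbered (n, v) accepted so far, or None}
    if n > state["promised"]:
        state["promised"] = n
        # Promise not to accept proposals numbered below n, and report any
        # previously accepted proposal so the proposer can adopt its value.
        return ("promise", n, state["accepted"])
    # Reject, indicating the highest proposal number already promised.
    return ("reject", state["promised"])

acceptor = {"promised": -1, "accepted": None}
assert on_prepare(acceptor, 5) == ("promise", 5, None)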
Paxos Algorithm Contd.
• Phase 2
– Proposer (Accept):
• If a majority of acceptors promised N, send an accept request with a value ‘v’ to all acceptors
– Acceptor (Accepted):
• On receiving (N, v), accept the proposal unless it has already responded to a prepare request with a number greater than N
• If accepted, send the value to the Learners to store it
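Continuing the same illustrative Python sketch, the Accepted step with the learner notification described above (learners modeled as plain lists):

def on_accept(state, n, v, learners):
    # Accept unless a prepare with a higher number was already answered.
    if n >= state["promised"]:
        state["promised"] = n
        state["accepted"] = (n, v)
        for log in learners:
            log.append((n, v))  # let the learners store the accepted value
        return ("accepted", n)
    return ("reject", state["promised"])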
Paxos’ properties
• P1: Every proposal number is unique
– If there are T nodes in the system, the ith node uses {i, i+T, i+2T, …}
• P2: Any two sets (majorities) of acceptors have at least one acceptor in common
• P3: The value sent out in phase 2 is the value of the highest-numbered proposal among all the responses in phase 1
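P1’s numbering scheme translates directly into a generator; a short Python sketch:

def proposal_numbers(i, t):
    # Node i of t nodes draws from {i, i+t, i+2t, ...}: the residues mod t
    # differ across nodes, so no two nodes ever use the same number.
    n = i
    while True:
        yield n
        n += t

gen = proposal_numbers(2, 5)  # node 2 of 5 yields 2, 7, 12, ...
assert [next(gen) for _ in range(3)] == [2, 7, 12]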
Learning a chosen value
• Various options:
– Each acceptor, whenever it accepts a proposal, informs all the learners
• Our implementation:
– Acceptors inform a distinguished learner (usually the proposer), which then broadcasts the result
Successful Paxos Round
[Message sequence chart: Client, Proposer, Acceptors, Learners]
• Client → Proposer: Request(v)
• Proposer → Acceptors: Prepare(N)
• Acceptors → Proposer: Promise(N)
• Proposer → Acceptors: Accept(N,v)
• Acceptors → Learners: Accepted(N)
• Learners → Client: Response
Acceptor Failure – Okay!
[Message sequence chart: as above, but one acceptor FAILS mid-round]
The round proceeds exactly as before: Request, Prepare(N), Promise(N), Accept(N,v), Accepted(N), Response. As long as a majority of acceptors is still alive, the round succeeds.
Proposer Failure – Re-elect!
[Message sequence chart: the proposer FAILS after Prepare(N)/Promise(N)]
• Client → Proposer: Request
• Proposer → Acceptors: Prepare(N); Acceptors reply Promise(N); then the proposer FAILS
• A New Leader is elected and restarts with Prepare(N+1); Acceptors reply Promise(N+1); and so on
Dueling Proposers
[Diagram: two proposers keep preempting each other with higher-numbered Prepares, so no round completes; this is why liveness is not guaranteed]
Source: http://the-paper-trail.org/blog/consensus-protocols-paxos/
Issues
• Multiple nodes may believe they are the Proposer
• Simulating failures: wrap each channel so that puts are randomly dropped

def class faultyChannel(p) =
  val ch = Channel()
  def get() = ch.get()
  -- Drop a put with probability roughly p percent:
  -- Random(99)+1 publishes an integer in [1, 99], and ':>' is 'greater than'.
  def put(x) =
    if ((Random(99) + 1) :> p)
      then ch.put(x)  -- delivered
      else signal     -- dropped; the caller still proceeds
  stop
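For readers unfamiliar with Orc, a rough Python analogue of the same lossy-channel idea (an illustrative sketch, not part of the project):

import queue
import random

class FaultyChannel:
    """A channel whose put() silently drops messages roughly p% of the time."""
    def __init__(self, p):
        self.p = p
        self.ch = queue.Queue()

    def get(self):
        return self.ch.get()

    def put(self, x):
        if random.randint(1, 99) > self.p:  # mirrors (Random(99) + 1) :> p
            self.ch.put(x)                  # delivered
        # else: dropped, but the caller still returns normally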
Implementation
• Learn which nodes are alive
– HeartBeat messages, timeouts
• Simulate node failures
– Same as failing all of a node’s outgoing channels
• Stress test
– Fail and un-fail nodes at random times
– Ensure a leader is elected and the protocol continues
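A hedged sketch of the heartbeat/timeout liveness check in Python (the function and parameter names are assumptions, not the project's Orc code):

import time

def alive_nodes(last_heartbeat, timeout):
    # last_heartbeat maps node id -> arrival time of its latest heartbeat.
    # A node is presumed alive if that heartbeat is within `timeout` seconds.
    now = time.time()
    return {node for node, t in last_heartbeat.items() if now - t < timeout}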
Optimizations
• Not required for correctness
• Proposer:
– Send only to a majority of the live acceptors (the key idea of Cheap Paxos)
• Acceptor can reject:
– Prepare(N) if it has answered Prepare(M) with M > N
– Accept(N,v) if it has answered Accept(M,u) with M > N
– Prepare(N) if it has answered Accept(M,u) with M > N
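These three rejection rules translate directly into a predicate; an illustrative Python encoding (promised_m and accepted_m default to -1 when nothing has been answered yet):

def can_reject(promised_m, accepted_m, msg, n):
    # promised_m: highest Prepare(M) this acceptor has answered.
    # accepted_m: highest Accept(M, u) this acceptor has answered.
    if msg == "prepare":
        return promised_m > n or accepted_m > n  # rules 1 and 3
    if msg == "accept":
        return accepted_m > n                    # rule 2
    return False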
Possible Future Work
• Extend to Multi-Paxos, Fast Paxos, Byzantine Paxos, etc.
• Use remoteChannels to run across physical nodes
Questions
References
• Paxos Made Simple, Leslie Lamport
• Orc Reference Guide: http://orc.csres.utexas.edu/
• http://the-paper-trail.org/
• Prof. Seif Haridi’s YouTube video lectures