R2D2: LinkedIn’s Request/Response Infrastructure
Oby Sumampouw (pronounced o-bee soo-mum-pow)
osumampouw@linkedin.com
Why R2D2?
[Diagram: several server clusters (Cluster, Cluster 2, Cluster 3), each fronted by its own hardware load balancer (Load Balancer, Load Balancer 2, Load Balancer 3)]
R2D2 in a nutshell
[Diagram: a client sends a request (e.g. get profile?id=123) to one of the servers for resource “foo” in the Profile, Inbox, or Ads service, discovered through Zookeeper]
The client:
• Listens to the profile zookeeper node
• Gets a list of the servers’ URIs where profile is hosted
• Gets notified if a server leaves or joins a cluster
• Chooses one server to send the request to
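The listen/notify/choose flow above can be sketched in Java. This is a minimal in-memory stand-in for the zookeeper node, not the real D2 client; the class and method names (`DiscoverySketch`, `announce`, `watch`, `choose`) are illustrative only.

```java
import java.util.*;
import java.util.function.Consumer;

/** Minimal sketch of D2-style discovery: an in-memory stand-in for the
 *  per-service zookeeper node. All names here are illustrative. */
public class DiscoverySketch {
    private final Map<String, List<String>> uris = new HashMap<>();
    private final Map<String, List<Consumer<List<String>>>> watchers = new HashMap<>();
    private final Random rng = new Random(42);

    // A server announces its address (in D2 this creates an ephemeral znode).
    void announce(String cluster, String uri) {
        uris.computeIfAbsent(cluster, c -> new ArrayList<>()).add(uri);
        notifyWatchers(cluster);
    }
    // The ephemeral node disappears when the server leaves the cluster.
    void leave(String cluster, String uri) {
        uris.getOrDefault(cluster, new ArrayList<>()).remove(uri);
        notifyWatchers(cluster);
    }
    // A client listens to the cluster node and is told about joins/leaves.
    void watch(String cluster, Consumer<List<String>> cb) {
        watchers.computeIfAbsent(cluster, c -> new ArrayList<>()).add(cb);
    }
    private void notifyWatchers(String cluster) {
        for (Consumer<List<String>> cb : watchers.getOrDefault(cluster, List.of()))
            cb.accept(List.copyOf(uris.get(cluster)));
    }
    // The client picks one server to send the request to (random strategy).
    String choose(String cluster) {
        List<String> candidates = uris.get(cluster);
        return candidates.get(rng.nextInt(candidates.size()));
    }

    public static void main(String[] args) {
        DiscoverySketch zk = new DiscoverySketch();
        List<String> seen = new ArrayList<>();
        zk.watch("profile", seen::addAll);
        zk.announce("profile", "http://host1:1234/profile");
        zk.announce("profile", "http://host2:1234/profile");
        String target = zk.choose("profile"); // e.g. where to send profile?id=123
        if (!target.startsWith("http://host")) throw new AssertionError();
        zk.leave("profile", "http://host1:1234/profile");
        if (!zk.choose("profile").equals("http://host2:1234/profile")) throw new AssertionError();
        if (seen.isEmpty()) throw new AssertionError();
    }
}
```

Note that, as the speaker notes emphasize, zookeeper is only involved when membership changes; per-request routing happens entirely inside the client.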
Agenda
• R2D2 Architecture
• How information is stored and organized in zookeeper
• How R2D2 does load balancing and graceful degradation
• Partitioning and sticky routing
• Miscellaneous D2 use cases at LinkedIn:
- Redlining
- Cluster variants
• Q&A
Star Wars™?
• Note: This R2D2 is not related to Star Wars™. (LucasArts/Disney, please don’t sue us.)
What is rest.li?
• Open-source Java REST framework. Go to http://rest.li
What is D2?
• Primarily a name server and traffic router
• The global “address book” is stored in zookeeper
• We store a backup in the local filesystem
Definitions:
• A D2 Cluster represents a collection of identical servers that host one or many D2 services
• A D2 Service represents a service endpoint hosted by a cluster
• A D2 Uri represents a server’s address and weight
How is D2 information organized and stored?

/ (root)
/d2
/d2/clusters
/d2/clusters/clusterA                <- Cluster properties: partition configuration, etc.
/d2/clusters/clusterB
/d2/services
/d2/services/serviceA1               <- Service properties: cluster = clusterA; load-balancer,
/d2/services/serviceA2                  degrader, and strategy configuration; etc.
/d2/services/serviceB
/d2/uris
/d2/uris/clusterA
/d2/uris/clusterB
/d2/uris/clusterB/ephemeralNode1     <- Uri properties: machine URI, weight
/d2/uris/clusterB/ephemeralNode2
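The tree above boils down to two lookups: service to cluster, then cluster to server URIs. A sketch with a plain map standing in for zookeeper; the paths mirror the slide, while the lookup logic and host names are illustrative.

```java
import java.util.*;

/** Sketch of how the znode tree is consumed. A plain map stands in for
 *  zookeeper; paths mirror the slide, lookup logic is illustrative. */
public class D2TreeSketch {
    public static void main(String[] args) {
        // Permanent service znodes: each names the cluster that hosts it.
        Map<String, String> serviceToCluster = new LinkedHashMap<>();
        serviceToCluster.put("/d2/services/serviceA1", "clusterA");
        serviceToCluster.put("/d2/services/serviceA2", "clusterA");
        serviceToCluster.put("/d2/services/serviceB", "clusterB");

        // Ephemeral uri znodes under /d2/uris/<cluster>: server address + weight.
        Map<String, Map<String, Double>> urisByCluster = new LinkedHashMap<>();
        urisByCluster.put("/d2/uris/clusterB", Map.of(
            "http://hostB1:1234", 1.0,   // ephemeralNode1
            "http://hostB2:1234", 1.0)); // ephemeralNode2

        // Resolving a request: service -> cluster -> candidate server URIs.
        String cluster = serviceToCluster.get("/d2/services/serviceB");
        Map<String, Double> servers = urisByCluster.get("/d2/uris/" + cluster);
        if (!cluster.equals("clusterB")) throw new AssertionError();
        if (servers.size() != 2) throw new AssertionError();
        if (servers.get("http://hostB1:1234") != 1.0) throw new AssertionError();
    }
}
```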
How is zookeeper initialized?

A config file is fed to D2Config.java, which writes the permanent nodes into zookeeper:

/ (root)
/d2
/d2/clusters
/d2/clusters/clusterA
/d2/clusters/clusterB
/d2/clusters/clusterC
/d2/services
/d2/services/serviceA1
/d2/services/serviceA2
/d2/services/serviceA3
/d2/uris
/d2/uris/clusterA
/d2/uris/clusterA/ephemeralNode1     <- announced by the ClusterA server; watched by the ServiceA1 client
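A D2Config-style initialization job can be sketched as a loop over the config that writes the permanent cluster and service znodes. This is not the real D2Config.java; the `store` map stands in for zookeeper and the property strings are illustrative.

```java
import java.util.*;

/** Sketch of what a D2Config-style job does: read a config map and write the
 *  permanent cluster/service znodes. "store" stands in for zookeeper. */
public class D2ConfigSketch {
    public static void main(String[] args) {
        Map<String, List<String>> config = Map.of(
            "clusterA", List.of("serviceA1", "serviceA2", "serviceA3"));

        SortedMap<String, String> store = new TreeMap<>(); // path -> properties
        for (var e : config.entrySet()) {
            store.put("/d2/clusters/" + e.getKey(), "partition-config=none");
            for (String svc : e.getValue())
                store.put("/d2/services/" + svc, "cluster=" + e.getKey());
        }
        // /d2/uris/* is NOT written here: servers create those ephemeral
        // nodes themselves when they join the cluster.
        if (!store.containsKey("/d2/services/serviceA2")) throw new AssertionError();
        if (!store.get("/d2/services/serviceA3").equals("cluster=clusterA")) throw new AssertionError();
        if (store.keySet().stream().anyMatch(p -> p.startsWith("/d2/uris"))) throw new AssertionError();
    }
}
```

The split matters: cluster and service nodes are nearly static and written once by the config job, while the uri nodes are ephemeral and come and go with the servers.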
D2 Load Balancer
• Client-side load balancer
• The client keeps track of the state
• Two strategies to choose from:
- Random
- Degrader
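Under the degrader strategy, each server carries a number of "points" and receives a proportional share of traffic. A weighted-random selection over points can be sketched as follows; the class name and point values are illustrative, not the rest.li implementation.

```java
import java.util.*;

/** Sketch of point-based server selection: a server with more points
 *  receives proportionally more traffic. Names are illustrative. */
public class PointsSelectorSketch {
    static String choose(Map<String, Integer> points, Random rng) {
        int total = points.values().stream().mapToInt(Integer::intValue).sum();
        int pick = rng.nextInt(total);
        for (var e : points.entrySet()) {
            pick -= e.getValue();
            if (pick < 0) return e.getKey();
        }
        throw new IllegalStateException("unreachable");
    }

    public static void main(String[] args) {
        Map<String, Integer> points = new LinkedHashMap<>();
        points.put("server1", 61);   // degraded: fewer points
        points.put("server2", 100);  // healthy
        Random rng = new Random(7);
        Map<String, Integer> hits = new HashMap<>();
        for (int i = 0; i < 10_000; i++)
            hits.merge(choose(points, rng), 1, Integer::sum);
        // server2 should get roughly 100/161, i.e. about 62% of the traffic.
        double share = hits.get("server2") / 10_000.0;
        if (!(share > 0.55 && share < 0.70)) throw new AssertionError(share);
    }
}
```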
How does the degrader load balancer work?

LB configuration: latency low water mark 500 ms, latency high water mark 2000 ms, min call count 10.

[Worked example: a client tracks per-period call stats for Server 1 and Server 2]

Period 1 (LOAD_BALANCE): both servers start with 100 points and 0 calls. Each then receives 100 calls; Server 1 averages 4900 ms while Server 2 averages 100 ms (cluster: 200 calls, 2500 ms average latency, drop rate 0.0). Server 1 is degraded, so its points are reduced to 61.

Period 2 (CALL_DROPPING): Server 1 receives 67 calls at 3636.5 ms; Server 2 receives 133 calls at 3000 ms. The whole cluster is now slow, so the cluster drop rate is raised to 0.2. Notice: the number of points doesn’t change because we are in CALL_DROPPING mode.

Period 3 (LOAD_BALANCE): the period’s stats reset (cluster total call count 0, average latency 0 ms) and the cycle repeats.
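The CALL_DROPPING side of the example can be sketched with the water-mark rule: raise the cluster drop rate when the cluster average latency is above the high water mark, lower it when below the low water mark, and do nothing without enough calls. The configuration comes from the slide; the 0.2 adjustment step and method names are assumptions.

```java
/** Sketch of CALL_DROPPING mode, assuming the slide's configuration
 *  (low water mark 500 ms, high 2000 ms, min call count 10). */
public class DegraderSketch {
    static final double LOW_WM = 500, HIGH_WM = 2000;
    static final int MIN_CALL_COUNT = 10;
    static final double STEP = 0.2; // illustrative adjustment step

    /** Adjust the cluster-wide drop rate from one period's stats. */
    static double adjustDropRate(double dropRate, long clusterCalls, double clusterLatencyMs) {
        if (clusterCalls < MIN_CALL_COUNT) return dropRate; // not enough signal
        if (clusterLatencyMs > HIGH_WM) return Math.min(1.0, dropRate + STEP);
        if (clusterLatencyMs < LOW_WM)  return Math.max(0.0, dropRate - STEP);
        return dropRate;
    }

    public static void main(String[] args) {
        // Cluster average 3000 ms is above the high water mark: start dropping.
        double d = adjustDropRate(0.0, 200, 3000);
        if (Math.abs(d - 0.2) > 1e-9) throw new AssertionError();
        // Healthy cluster (100 ms average): back off the drop rate.
        d = adjustDropRate(d, 200, 100);
        if (Math.abs(d - 0.0) > 1e-9) throw new AssertionError();
        // Too few calls in the period: leave the drop rate alone.
        if (adjustDropRate(0.4, 3, 5000) != 0.4) throw new AssertionError();
    }
}
```

The min call count guard is what prevents flapping on thin traffic, as the speaker notes mention.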
How does the degrader recover from a bad state?

LB configuration: latency low water mark 500 ms, latency high water mark 2000 ms, min call count 10.

Period N: the cluster drop rate is 1.0 and both servers are down to 1 point; total call count is 0 and latency is 0 ms. Notice: we’re in recovery mode. Because we choke all traffic, we will try recovering regardless of call stats.

Period N+1 (CALL_DROPPING): the drop rate is lowered to 0.8 and each server is raised to 2 points.

Period N+2 (LOAD_BALANCE): Server 1 gets 15 calls at 150 ms and Server 2 gets 20 calls at 200 ms (cluster: 35 calls, 178.6 ms average, well under the water marks), so each server grows to 37 points.

Period N+3: Server 1 gets 50 calls at 200 ms and Server 2 gets 50 calls at 200 ms (cluster: 100 calls, 200 ms average), and the drop rate is lowered further to 0.6.
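The recovery rule above can be sketched as: when the server is fully choked there are no call stats, so probe anyway; once real calls flow, grow points on healthy latency and shrink on unhealthy latency. The doubling growth rate and all names are illustrative assumptions, not the actual D2 schedule (the slide grows points 2 to 37 in one period).

```java
/** Sketch of degrader recovery: when everything is being dropped there are
 *  no call stats, so the client must let some traffic through on a schedule
 *  and grow points again as healthy responses come back. Illustrative only. */
public class RecoverySketch {
    static int recoverPoints(int points, long calls, double latencyMs,
                             int minCallCount, double highWaterMarkMs) {
        // Fully degraded: probe regardless of call stats.
        if (points <= 1) return Math.max(2, points * 2);
        if (calls < minCallCount) return points;           // not enough signal
        if (latencyMs < highWaterMarkMs) return Math.min(100, points * 2);
        return Math.max(1, points / 2);                    // still unhealthy
    }

    public static void main(String[] args) {
        int p = 1;                                // choked: 1 point, ~no traffic
        p = recoverPoints(p, 0, 0, 10, 2000);     // period N: probe anyway
        if (p != 2) throw new AssertionError();
        p = recoverPoints(p, 15, 150, 10, 2000);  // healthy answers: grow
        if (p != 4) throw new AssertionError();
        p = recoverPoints(p, 20, 5000, 10, 2000); // relapse: shrink again
        if (p != 2) throw new AssertionError();
    }
}
```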
A few more details
• The min call count is reduced depending on how degraded the state is
• It’s not just latency: we also consider error rate and the number of outstanding calls
• We can use many types of latency:
- AVERAGE
- 90th percentile
- 95th percentile
- 99th percentile
• We can set different low/high water marks for the cluster vs. for an individual node
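The difference between the latency types matters in practice. A minimal sketch of average vs. percentile over one period's samples (the percentile method here uses simple nearest-rank selection, an assumption, not D2's exact estimator):

```java
import java.util.*;

/** Sketch of the latency signals above: average vs. 90th/95th/99th
 *  percentile over one period's call samples. */
public class LatencyStatsSketch {
    // Nearest-rank percentile over a sorted array of millisecond samples.
    static double percentile(long[] sortedMs, double p) {
        int idx = (int) Math.ceil(p / 100.0 * sortedMs.length) - 1;
        return sortedMs[Math.max(0, idx)];
    }

    public static void main(String[] args) {
        long[] samples = new long[100];
        for (int i = 0; i < 100; i++) samples[i] = 10 * (i + 1); // 10..1000 ms
        Arrays.sort(samples);
        double avg = Arrays.stream(samples).average().orElse(0);
        if (avg != 505.0) throw new AssertionError();
        if (percentile(samples, 90) != 900) throw new AssertionError();
        if (percentile(samples, 99) != 990) throw new AssertionError();
        // A single huge outlier barely moves p90 but can dominate the
        // average, which is why a percentile can be the better signal.
        samples[99] = 60_000;
        Arrays.sort(samples);
        if (percentile(samples, 90) != 900) throw new AssertionError();
    }
}
```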
Call Dropping vs Load Balancing

           | Call Dropping Mode         | Load Balancing Mode
Scope      | Affects the entire cluster | Affects only individual machines in the cluster
Purpose    | Graceful degradation       | Load balancing traffic
Mechanism  | Drop rate                  | Points
Hints      | Cluster latency            | Individual node latency, error rate, # outstanding calls
Partitioning and Sticky Routing
• D2 supports partitioning of clusters
- Range partitioning
- Hash partitioning (MD5 or modulo)
- Use a regex to extract the key from the URI to determine where a request should go
• Sticky routing within a partition is also supported
- Use a regex to extract the key from the URI (same as above)
- Use a consistent hash ring
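The regex-plus-modulo flavor above can be sketched as follows. The regex, URI scheme, and partition count are illustrative assumptions, not D2's configuration format.

```java
import java.util.regex.*;

/** Sketch of D2-style partitioning: a regex pulls the key out of the request
 *  URI and a modulo hash maps it to a partition. Regex and partition count
 *  are illustrative. */
public class PartitionSketch {
    static final Pattern KEY = Pattern.compile("id=(\\d+)");
    static final int PARTITIONS = 4;

    static int partitionFor(String uri) {
        Matcher m = KEY.matcher(uri);
        if (!m.find()) throw new IllegalArgumentException("no key in " + uri);
        long key = Long.parseLong(m.group(1));
        return (int) (key % PARTITIONS); // modulo partitioning
    }

    public static void main(String[] args) {
        if (partitionFor("d2://profile?id=123") != 3) throw new AssertionError();
        if (partitionFor("d2://profile?id=8") != 0) throw new AssertionError();
        // Sticky: the same key always lands on the same partition.
        if (partitionFor("d2://profile?id=123") != partitionFor("d2://profile?id=123"))
            throw new AssertionError();
    }
}
```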
Consistent Hash Ring
[Diagram: a ring of hash values spanning Integer.MIN_VALUE to Integer.MAX_VALUE; app1.foo.com, app2.foo.com, and app3.foo.com each place 100 points on the ring, and a request for “foo” is hashed onto it]
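The ring can be sketched with a sorted map: each host contributes many points, and a request key routes to the next point clockwise. The slide says D2 uses MD5; `hashCode` is used here only to keep the sketch short, and all names are illustrative.

```java
import java.util.*;

/** Sketch of a consistent hash ring: each host gets many points on the
 *  ring; a request key is hashed and routed to the next point clockwise.
 *  (D2 uses MD5; hashCode is used here only for brevity.) */
public class HashRingSketch {
    private final TreeMap<Integer, String> ring = new TreeMap<>();

    void addHost(String host, int numPoints) {
        for (int i = 0; i < numPoints; i++)
            ring.put((host + "#" + i).hashCode(), host);
    }
    String route(String key) {
        Map.Entry<Integer, String> e = ring.ceilingEntry(key.hashCode());
        return (e != null ? e : ring.firstEntry()).getValue(); // wrap around
    }

    public static void main(String[] args) {
        HashRingSketch ring = new HashRingSketch();
        for (String h : List.of("app1.foo.com", "app2.foo.com", "app3.foo.com"))
            ring.addHost(h, 100); // 100 points per host, as on the slide
        String target = ring.route("foo");
        if (!target.endsWith(".foo.com")) throw new AssertionError();
        // Sticky: the same key always routes to the same host.
        if (!ring.route("foo").equals(target)) throw new AssertionError();
    }
}
```

Because only the points of a joining or leaving host move, roughly 1/n of requests are reshuffled on a membership change, which is the property the speaker notes call out.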
Miscellaneous D2 use cases
• Redlining: measure the max capacity of a server
• Use real traffic
• Don’t have to worry about mutable operations
[Diagram: the same consistent hash ring (app1.foo.com, app2.foo.com, app3.foo.com), reused to illustrate redlining]
Miscellaneous D2 use cases
• What if different clients have different requirements?
• Let’s say we have a service called profile.
- Clients who can only view profiles should go to the read-only cluster
- Clients who can edit profiles should go to the read-write cluster
• Use the cluster variant technique
• A cluster variant changes the D2 Service’s namespace to get around the restriction that a zookeeper node’s name must be unique.
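The namespace trick can be sketched as two paths resolving the same service name to different clusters. The paths mirror the next slide; the lookup logic is illustrative, not the D2 client code.

```java
import java.util.*;

/** Sketch of the cluster-variant idea: the same service name "profile" lives
 *  under two namespaces, each pointing at a different cluster. */
public class ClusterVariantSketch {
    public static void main(String[] args) {
        Map<String, String> serviceToCluster = Map.of(
            "/d2/services/profile",              "readonly",   // default namespace
            "/d2/profileClusterVariant/profile", "readwrite"); // variant namespace

        // A view-only client resolves through the default namespace...
        String viewCluster = serviceToCluster.get("/d2/services/profile");
        // ...while an editing client is configured with the variant namespace.
        String editCluster = serviceToCluster.get("/d2/profileClusterVariant/profile");

        if (!viewCluster.equals("readonly")) throw new AssertionError();
        if (!editCluster.equals("readwrite")) throw new AssertionError();
    }
}
```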
Miscellaneous D2 use cases

/ (root)
/d2
/d2/clusters
/d2/clusters/readonly
/d2/clusters/readwrite
/d2/services
/d2/services/profile                 <- Service properties: cluster = readonly
/d2/profileClusterVariant
/d2/profileClusterVariant/profile    <- Service properties: cluster = readwrite
/d2/uris
/d2/uris/readonly
/d2/uris/readonly/ephemeralNode1     <- readonly server
/d2/uris/readwrite
/d2/uris/readwrite/ephemeralNode1    <- readwrite server

A view client’s request for profile resolves through /d2/services/profile to the readonly cluster; an edit client’s request for profile resolves through /d2/profileClusterVariant/profile to the readwrite cluster.
Q&A
Questions?
Email me at: osumampouw@linkedin.com
Check out http://rest.li and https://github.com/linkedin/rest.li for more info
We’re hiring!
Cross data center routing

[Diagram: a client and a server cluster in each of two data centers, each data center running its own zookeeper]

View of zookeeper in Data Center 1:

/ (root)
/d2
/d2/clusters
/d2/clusters/clusterA
/d2/clusters/clusterA-1
/d2/clusters/clusterA-2
/d2/services
/d2/services/serviceA
/d2/services/serviceA-1
/d2/services/serviceA-2              <- Service properties: cluster = clusterA-2
/d2/uris
/d2/uris/clusterA
/d2/uris/clusterA/ephemeralNode1     <- server cluster for Data Center 1
/d2/uris/clusterA-2
/d2/uris/clusterA-2/ephemeralNode1   <- server cluster for Data Center 2

©2013 LinkedIn Corporation. All Rights Reserved.

R2D2 slides from Velocity Conference London 2013


Editor's Notes

  1. Step back several years: LinkedIn had a small code base and a small binary. Easy to scale up: just add more servers. The one binary became too big to deploy on a single server -> split into multiple binaries. Birth of specialized services and Service Oriented Architecture (SOA). When a service wants to talk to another service, we have to wire in the address of the load balancer for that cluster. Now we have hundreds of services; manually wiring the load balancer address for each route is error prone and slow -> imagine as a developer having to ask around for the IP addresses of the load balancers. Load balancers are expensive and introduce an extra network hop.
  2. Imagine you have a client. Machines can leave and join the cluster at any given time. D2 has a server side and a client side. Zookeeper is a distributed service that is used to maintain the state of a system, so it’s pretty fault tolerant: even if a few servers inside zookeeper die, we’re still OK. Zookeeper is similar to a file system that provides a way to publish/subscribe messages to a znode. Servers announce their addresses to zookeeper. Point: zookeeper is not involved in sending every request.
  3. Open source Java REST framework currently being used at LinkedIn. This is how it works: the application sits on top of the rest.li layer, and rest.li sits on top of R2D2. D2 finds the services that rest.li creates, load balances traffic from clients to servers, and also provides graceful degradation. R2 handles the request/response interaction between the server and clients; R2 is asynchronous and is implemented using netty/jetty. R2D2 is independent of rest.li: D2 can be used outside of rest.li as a name resolver and load balancer.
  4. There are 3 different constructs that we use to store information for D2. A D2 Cluster is comprised of identical nodes: no first-class or second-class nodes, no master/slave, no ACL. D2 is ideal for a trusted middle-tier layer (simple to understand). Each D2 Uri will create a new client abstraction for sending traffic to.
  5. The URI node is a zookeeper ephemeral node; cluster and service nodes are zookeeper permanent nodes. Point: cluster properties and service properties are rarely updated and almost static, which is why they are permanent zookeeper nodes.
  6. Some restrictions: ZKFSUtil sets the d2 config writer to write to /services, /clusters, /uris; the /d2 path is configurable. Once the client listens, it keeps the global information inside its internal storage, so after it receives the information it won’t need to contact zookeeper. Zookeeper publishes info to update the D2Client internal state. If the ClusterA server dies, zookeeper automatically removes the ephemeral node, so the ServiceA1 client will know that it can’t send requests to the ClusterA server.
  7. Imagine we have a client and a cluster that consists of 2 servers. The client keeps track of call statistics to each server; we update statistics on a 5-second interval. Talk about the initial state (min call count is there to reduce flapping). We have 2 modes of operation: CALL_DROPPING and LOAD_BALANCING. CALL_DROPPING means we change the cluster’s drop rate; LOAD_BALANCING means directing traffic to healthier machines. Explain why there are 2 different modes: because there are 2 types of problem that can affect a service, cluster-wide vs. individual node. CALL_DROPPING is for the cluster -> problem with downstream services. LOAD_BALANCING is for an individual node -> problem with a particular server.
  8. Why does call dropping mode affect the entire cluster? Because we don’t want to double-penalize a bad client (reducing the number of points while also increasing the drop rate for a particular client).
  9. From the request URI we compute which partition it belongs to. We use a regex to extract the key. Extra attributes: “D2-KeyMapper-TargetHost”: URI (but it must be part of the cluster); “D2-Hint-TargetService” to override the URI.
  10. Imagine we have 3 servers; this is how the client views them. MD5-hash the URI. For each URI we’ll create 100 points. The number of points is based on the weight of the node * the number of points per weight (configurable). The reason we use a consistent hash ring is that servers can join/leave the cluster at any time; with a consistent hash ring, we’re guaranteed that only 1/n of the requests will be reshuffled if there’s a change to the cluster membership.
  11. Redlining means performance-testing a server so we know the maximum capacity it can handle. We can use real production traffic without being afraid of mutable requests.
  12. So we’ve talked about how R2D2 works: how we discover services and how to load balance traffic. For more information you can check http://rest.li.