SlideShare una empresa de Scribd logo
1 de 18
Descargar para leer sin conexión
Thursday, 5 November 2009
ZooKeeper Futures
            Expanding the Menagerie


            Henry Robinson
            Software engineer @ Cloudera
            Hadoop meetup - 11/5/2009

Thursday, 5 November 2009
Upcoming features for ZooKeeper



           ▪ Observers

           ▪ Dynamic        ensembles
           ▪ ZooKeeper        in Cloudera’s Distribution for Hadoop




Thursday, 5 November 2009
Observers
           ▪ ZOOKEEPER-368

           ▪ Problem:

              ▪   Every node in a ZooKeeper cluster has to vote
              ▪   So increasing the cluster size increases the cost of write
                  operations
              ▪   But increasing the cluster size is the only way currently to get
                  client scalability
              ▪   False tension between number of clients and performance
              ▪   Should only increase size of voting cluster to improve reliability




Thursday, 5 November 2009
Observers	

           ▪ It’s    worse than that
              ▪   Since clients are given a list of servers in the ensemble to connect
                  to, the cluster is not isolated from swamping due to the number of
                  clients
              ▪   That is, if a swarm of clients connect to one server and kill it,
                  they’ll move on to another and do the same.
              ▪   Now we are sharing the same number of clients amongst fewer
                  servers!
              ▪   So if these were enough clients originally to down a server, the
                  prognosis is not good for those remaining
              ▪   Only n/2 servers have to die before the cluster is no longer live



Thursday, 5 November 2009
Cascading Failures




Thursday, 5 November 2009
Cascading Failures




Thursday, 5 November 2009
Cascading Failures




Thursday, 5 November 2009
Cascading Failures




Thursday, 5 November 2009
Observers
           ▪ Simple         way to attack this problem: non-voting cluster members
           ▪ Act   as a fan-in point for client connections by proxying requests to
              the inner voting ensemble
           ▪ Doesn’t    matter if they die (in the sense that liveness is preserved) -
              cluster is still available for writes
           ▪ Write   throughput stays roughly constant as number of Observers
              increases
           ▪ So  we can freely scale the number of Observers to meet the
              requirements of the number of clients




Thursday, 5 November 2009
Observers: More benefits
           ▪ Voting  ensemble members must meet strict latency contracts in
              order to not be considered ‘failed’
           ▪ Therefore  distributing ZooKeeper across many racks, or even
              datacenters, is problematic.
           ▪ No        such requirements made of Observers
           ▪ So deploy the voting ensemble for reliability and low latency
              communicaton, and everywhere you need a client, add an Observer
           ▪ Reads get served locally, so wide distribution isn’t too painful for
              some workloads
           ▪ Likelihood  of partition increases relative to distribution of
              ensemble, so availability is increased in some cases
           ▪ Good   integration point for publish-subscribe, and for specific
              optimisations
Thursday, 5 November 2009
Observers: Current state
           ▪ This       patch required a lot of structural work
           ▪ Hoping           to get in to 3.3
           ▪ One            major refactor patch committed
           ▪ Core           patch up on ZOOKEEPER-368
              ▪   Check it out and add comments!
           ▪ Fully functional - you can apply the patch, update your configuration
              and start using Observers today
           ▪ Benchmarks show expected (and pleasing!) performance
              improvements
           ▪ To      come in future JIRAs - performance tweaking (batching)


Thursday, 5 November 2009
Dynamic Ensembles
           ▪ ZOOKEEPER-107

           ▪ Problem:

              ▪   What if you really do want to change the membership of your
                  cluster?
              ▪   Downtime is problematic for a ‘highly-available’ service
              ▪   But failures occur and machines get repurposed or upgraded




Thursday, 5 November 2009
Dynamic Ensembles
           ▪ We    would like to be able to add or remove machines from the
              cluster without stopping the world
           ▪ Conceptually, this is reasonably easy - we have a mechanism for
              updating information on every server synchronously, and in order
              ▪   (it’s called ZooKeeper)
           ▪ In    practice, this is rather involved:
              ▪   When is a new cluster ‘live’?
              ▪   Who votes on the cluster membership change?
              ▪   How do we deal with slow members?
              ▪   What happens when the leader changes?
              ▪   How do we find the cluster when it’s completely different?

Thursday, 5 November 2009
Dynamic Ensembles
           ▪ Getting        all this right is hard
              ▪   (good!)
           ▪A   fundamental change in how ZooKeeper is designed - much of the
              code is predicated on a static view of the cluster membership
           ▪ Ideally, we      want to prove that the resulting protocol is correct
           ▪ The  key observation is that membership changes must be voted
              upon by both the old and the new configuration
           ▪ So      this is no magic bullet if the cluster is down
           ▪ Need     to keep track of old configurations so that each vote can be
              tallied with the right quorum



Thursday, 5 November 2009
Dynamic Ensembles
           ▪ Lots           of discussion on the JIRA
              ▪   although no public activity for a couple of months
           ▪I     have code that pretty much works
           ▪ But waiting until Observers gets committed before I move focus
              completely to this
           ▪ Current   situation not *too* bad; there are upgrade workarounds
              that are a bit scary theoretically but in practice work ok.




Thursday, 5 November 2009
ZooKeeper Packages in CDH
           ▪ We        maintain Cloudera’s Distribution for Hadoop
              ▪   Packages for Mapred, HDFS, HBase, Pig and Hive
           ▪ We   see ZooKeeper as increasingly important to that stack, as well
              as having a wide variety of other applications
           ▪ Therefore, we’ve  packaged ZooKeeper 3.2.1 and are making it a first
              class part of CDH
           ▪ We’ll track the Apache releases, and also backport important
              patches
           ▪ Wrapped           up in the service framework:
              ▪   /sbin/service zookeeper start
           ▪ RPMs           and tarballs are done, DEBs to follow imminently
           ▪ Download           RPMs at http://archive.cloudera.com/redhat/cdh/unstable/
Thursday, 5 November 2009
Thanks! Questions?
                            henry@cloudera.com




Thursday, 5 November 2009

Más contenido relacionado

La actualidad más candente

Integration with EMC VNX and VNXe hybrid storage arrays
Integration with EMC VNX and VNXe hybrid storage arraysIntegration with EMC VNX and VNXe hybrid storage arrays
Integration with EMC VNX and VNXe hybrid storage arraysVeeam Software
 
VMworld 2013: Successfully Virtualize Microsoft Exchange Server
VMworld 2013: Successfully Virtualize Microsoft Exchange Server VMworld 2013: Successfully Virtualize Microsoft Exchange Server
VMworld 2013: Successfully Virtualize Microsoft Exchange Server VMworld
 
Leveraging CentOS and Xen for the Go Daddy Private Cloud
Leveraging CentOS and Xen for the Go Daddy Private CloudLeveraging CentOS and Xen for the Go Daddy Private Cloud
Leveraging CentOS and Xen for the Go Daddy Private CloudThe Linux Foundation
 
09 yong.luo-ceph in-ctrip
09 yong.luo-ceph in-ctrip09 yong.luo-ceph in-ctrip
09 yong.luo-ceph in-ctripYong Luo
 
Building vSphere Perf Monitoring Tools
Building vSphere Perf Monitoring ToolsBuilding vSphere Perf Monitoring Tools
Building vSphere Perf Monitoring ToolsPablo Roesch
 
What’s New in VMware vCenter Site Recovery Manager v5.0
What’s New in VMware vCenter Site Recovery Manager v5.0What’s New in VMware vCenter Site Recovery Manager v5.0
What’s New in VMware vCenter Site Recovery Manager v5.0Eric Sloof
 
VMware Site Recovery Manager - Architecting a DR Solution - Best Practices
VMware Site Recovery Manager - Architecting a DR Solution - Best PracticesVMware Site Recovery Manager - Architecting a DR Solution - Best Practices
VMware Site Recovery Manager - Architecting a DR Solution - Best Practicesthephuck
 
VMworld 2011 Review: Preparing for vSphere 5 with Virtualization Manager
VMworld 2011 Review: Preparing for vSphere 5 with Virtualization ManagerVMworld 2011 Review: Preparing for vSphere 5 with Virtualization Manager
VMworld 2011 Review: Preparing for vSphere 5 with Virtualization ManagerSolarWinds
 
Configuring policies in v c ops
Configuring policies in v c opsConfiguring policies in v c ops
Configuring policies in v c opsSunny Dua
 
NIC 2013 - Hyper-V Replica
NIC 2013 - Hyper-V ReplicaNIC 2013 - Hyper-V Replica
NIC 2013 - Hyper-V ReplicaKristian Nese
 
Cf Summit East 2018 Scaling ColdFusion
Cf Summit East 2018 Scaling ColdFusionCf Summit East 2018 Scaling ColdFusion
Cf Summit East 2018 Scaling ColdFusionmcollinsCF
 
VMworld 2013: VMware vCenter Site Recovery Manager – Solution Overview and Le...
VMworld 2013: VMware vCenter Site Recovery Manager – Solution Overview and Le...VMworld 2013: VMware vCenter Site Recovery Manager – Solution Overview and Le...
VMworld 2013: VMware vCenter Site Recovery Manager – Solution Overview and Le...VMworld
 
VMworld 2014: Virtualizing Databases
VMworld 2014: Virtualizing DatabasesVMworld 2014: Virtualizing Databases
VMworld 2014: Virtualizing DatabasesVMworld
 
Windows Server 2012 R2 Hyper-V Replica
Windows Server 2012 R2 Hyper-V ReplicaWindows Server 2012 R2 Hyper-V Replica
Windows Server 2012 R2 Hyper-V ReplicaRavikanth Chaganti
 
VMworld 2014: vSphere HA Best Practices and FT Tech Preview
VMworld 2014: vSphere HA Best Practices and FT Tech PreviewVMworld 2014: vSphere HA Best Practices and FT Tech Preview
VMworld 2014: vSphere HA Best Practices and FT Tech PreviewVMworld
 
VMworld 2014: Site Recovery Manager and vSphere Replication
VMworld 2014: Site Recovery Manager and vSphere ReplicationVMworld 2014: Site Recovery Manager and vSphere Replication
VMworld 2014: Site Recovery Manager and vSphere ReplicationVMworld
 
Xen: Hypervisor for the Cloud - CCC13
Xen: Hypervisor for the Cloud - CCC13Xen: Hypervisor for the Cloud - CCC13
Xen: Hypervisor for the Cloud - CCC13The Linux Foundation
 

La actualidad más candente (20)

Integration with EMC VNX and VNXe hybrid storage arrays
Integration with EMC VNX and VNXe hybrid storage arraysIntegration with EMC VNX and VNXe hybrid storage arrays
Integration with EMC VNX and VNXe hybrid storage arrays
 
VMworld 2013: Successfully Virtualize Microsoft Exchange Server
VMworld 2013: Successfully Virtualize Microsoft Exchange Server VMworld 2013: Successfully Virtualize Microsoft Exchange Server
VMworld 2013: Successfully Virtualize Microsoft Exchange Server
 
Leveraging CentOS and Xen for the Go Daddy Private Cloud
Leveraging CentOS and Xen for the Go Daddy Private CloudLeveraging CentOS and Xen for the Go Daddy Private Cloud
Leveraging CentOS and Xen for the Go Daddy Private Cloud
 
09 yong.luo-ceph in-ctrip
09 yong.luo-ceph in-ctrip09 yong.luo-ceph in-ctrip
09 yong.luo-ceph in-ctrip
 
Building vSphere Perf Monitoring Tools
Building vSphere Perf Monitoring ToolsBuilding vSphere Perf Monitoring Tools
Building vSphere Perf Monitoring Tools
 
What’s New in VMware vCenter Site Recovery Manager v5.0
What’s New in VMware vCenter Site Recovery Manager v5.0What’s New in VMware vCenter Site Recovery Manager v5.0
What’s New in VMware vCenter Site Recovery Manager v5.0
 
VMware Site Recovery Manager - Architecting a DR Solution - Best Practices
VMware Site Recovery Manager - Architecting a DR Solution - Best PracticesVMware Site Recovery Manager - Architecting a DR Solution - Best Practices
VMware Site Recovery Manager - Architecting a DR Solution - Best Practices
 
VMworld 2011 Review: Preparing for vSphere 5 with Virtualization Manager
VMworld 2011 Review: Preparing for vSphere 5 with Virtualization ManagerVMworld 2011 Review: Preparing for vSphere 5 with Virtualization Manager
VMworld 2011 Review: Preparing for vSphere 5 with Virtualization Manager
 
Configuring policies in v c ops
Configuring policies in v c opsConfiguring policies in v c ops
Configuring policies in v c ops
 
NIC 2013 - Hyper-V Replica
NIC 2013 - Hyper-V ReplicaNIC 2013 - Hyper-V Replica
NIC 2013 - Hyper-V Replica
 
Rails in the Cloud
Rails in the CloudRails in the Cloud
Rails in the Cloud
 
Veeam backup and_replication_whats_new_in_v7
Veeam backup and_replication_whats_new_in_v7Veeam backup and_replication_whats_new_in_v7
Veeam backup and_replication_whats_new_in_v7
 
Cf Summit East 2018 Scaling ColdFusion
Cf Summit East 2018 Scaling ColdFusionCf Summit East 2018 Scaling ColdFusion
Cf Summit East 2018 Scaling ColdFusion
 
VMworld 2013: VMware vCenter Site Recovery Manager – Solution Overview and Le...
VMworld 2013: VMware vCenter Site Recovery Manager – Solution Overview and Le...VMworld 2013: VMware vCenter Site Recovery Manager – Solution Overview and Le...
VMworld 2013: VMware vCenter Site Recovery Manager – Solution Overview and Le...
 
VMworld 2014: Virtualizing Databases
VMworld 2014: Virtualizing DatabasesVMworld 2014: Virtualizing Databases
VMworld 2014: Virtualizing Databases
 
Windows Server 2012 R2 Hyper-V Replica
Windows Server 2012 R2 Hyper-V ReplicaWindows Server 2012 R2 Hyper-V Replica
Windows Server 2012 R2 Hyper-V Replica
 
VMworld 2014: vSphere HA Best Practices and FT Tech Preview
VMworld 2014: vSphere HA Best Practices and FT Tech PreviewVMworld 2014: vSphere HA Best Practices and FT Tech Preview
VMworld 2014: vSphere HA Best Practices and FT Tech Preview
 
VMworld 2014: Site Recovery Manager and vSphere Replication
VMworld 2014: Site Recovery Manager and vSphere ReplicationVMworld 2014: Site Recovery Manager and vSphere Replication
VMworld 2014: Site Recovery Manager and vSphere Replication
 
5 Things to Ask Your Virtualization Administrator
5 Things to Ask Your Virtualization Administrator5 Things to Ask Your Virtualization Administrator
5 Things to Ask Your Virtualization Administrator
 
Xen: Hypervisor for the Cloud - CCC13
Xen: Hypervisor for the Cloud - CCC13Xen: Hypervisor for the Cloud - CCC13
Xen: Hypervisor for the Cloud - CCC13
 

Destacado

Introduction to Apache ZooKeeper
Introduction to Apache ZooKeeperIntroduction to Apache ZooKeeper
Introduction to Apache ZooKeeperknowbigdata
 
Introduction to apache zoo keeper
Introduction to apache zoo keeper Introduction to apache zoo keeper
Introduction to apache zoo keeper Omid Vahdaty
 
Zookeeper In Action
Zookeeper In ActionZookeeper In Action
Zookeeper In Actionjuvenxu
 
Hadoop, HBase and Zookeeper at Tamtay
Hadoop, HBase and Zookeeper at TamtayHadoop, HBase and Zookeeper at Tamtay
Hadoop, HBase and Zookeeper at TamtayEddie Bui
 
Distributed Applications with Apache Zookeeper
Distributed Applications with Apache ZookeeperDistributed Applications with Apache Zookeeper
Distributed Applications with Apache ZookeeperAlex Ehrnschwender
 
Introduction to Kafka and Zookeeper
Introduction to Kafka and ZookeeperIntroduction to Kafka and Zookeeper
Introduction to Kafka and ZookeeperRahul Jain
 

Destacado (7)

Introduction to Apache ZooKeeper
Introduction to Apache ZooKeeperIntroduction to Apache ZooKeeper
Introduction to Apache ZooKeeper
 
Introduction to apache zoo keeper
Introduction to apache zoo keeper Introduction to apache zoo keeper
Introduction to apache zoo keeper
 
Zookeeper In Action
Zookeeper In ActionZookeeper In Action
Zookeeper In Action
 
Hadoop, HBase and Zookeeper at Tamtay
Hadoop, HBase and Zookeeper at TamtayHadoop, HBase and Zookeeper at Tamtay
Hadoop, HBase and Zookeeper at Tamtay
 
Distributed Applications with Apache Zookeeper
Distributed Applications with Apache ZookeeperDistributed Applications with Apache Zookeeper
Distributed Applications with Apache Zookeeper
 
Apache ZooKeeper
Apache ZooKeeperApache ZooKeeper
Apache ZooKeeper
 
Introduction to Kafka and Zookeeper
Introduction to Kafka and ZookeeperIntroduction to Kafka and Zookeeper
Introduction to Kafka and Zookeeper
 

Similar a ZooKeeper Futures

MySQL 5.7 -- SCaLE Feb 2014
MySQL 5.7 -- SCaLE Feb 2014MySQL 5.7 -- SCaLE Feb 2014
MySQL 5.7 -- SCaLE Feb 2014Dave Stokes
 
Deploying your SaaS stack OnPrem
Deploying your SaaS stack OnPremDeploying your SaaS stack OnPrem
Deploying your SaaS stack OnPremKris Buytaert
 
CNPM: Private NPM for Company / 企業級私有NPM
CNPM: Private NPM for Company / 企業級私有NPMCNPM: Private NPM for Company / 企業級私有NPM
CNPM: Private NPM for Company / 企業級私有NPMFeng Yuan
 
SmugMug's Zero-Downtime Migration to AWS (ARC312) | AWS re:Invent 2013
SmugMug's Zero-Downtime Migration to AWS (ARC312) | AWS re:Invent 2013SmugMug's Zero-Downtime Migration to AWS (ARC312) | AWS re:Invent 2013
SmugMug's Zero-Downtime Migration to AWS (ARC312) | AWS re:Invent 2013Amazon Web Services
 
To Cloud or Not To Cloud?
To Cloud or Not To Cloud?To Cloud or Not To Cloud?
To Cloud or Not To Cloud?Greg Lindahl
 
MuraCon 2012 - Creating a Mura CMS plugin with FW/1
MuraCon 2012 - Creating a Mura CMS plugin with FW/1MuraCon 2012 - Creating a Mura CMS plugin with FW/1
MuraCon 2012 - Creating a Mura CMS plugin with FW/1jpanesar
 
Rock Solid Sametime for High Availability
Rock Solid Sametime for High AvailabilityRock Solid Sametime for High Availability
Rock Solid Sametime for High AvailabilityGabriella Davis
 
Webinar slides: How to Manage Replication Failover Processes for MySQL, Maria...
Webinar slides: How to Manage Replication Failover Processes for MySQL, Maria...Webinar slides: How to Manage Replication Failover Processes for MySQL, Maria...
Webinar slides: How to Manage Replication Failover Processes for MySQL, Maria...Severalnines
 
Immutable infrastructure isn’t the answer
Immutable infrastructure isn’t the answerImmutable infrastructure isn’t the answer
Immutable infrastructure isn’t the answerSam Bashton
 
Building trust within the organization, first steps towards DevOps
Building trust within the organization, first steps towards DevOpsBuilding trust within the organization, first steps towards DevOps
Building trust within the organization, first steps towards DevOpsGuido Serra
 
Scaling a High Traffic Web Application: Our Journey from Java to PHP
Scaling a High Traffic Web Application: Our Journey from Java to PHPScaling a High Traffic Web Application: Our Journey from Java to PHP
Scaling a High Traffic Web Application: Our Journey from Java to PHP120bi
 
Scaling High Traffic Web Applications
Scaling High Traffic Web ApplicationsScaling High Traffic Web Applications
Scaling High Traffic Web ApplicationsAchievers Tech
 
Scaling apps for the big time
Scaling apps for the big timeScaling apps for the big time
Scaling apps for the big timeproitconsult
 
Speed up your Serverless development flow
Speed up your Serverless development flowSpeed up your Serverless development flow
Speed up your Serverless development flowEfi Merdler-Kravitz
 
Make It Cooler: Using Decentralized Version Control
Make It Cooler: Using Decentralized Version ControlMake It Cooler: Using Decentralized Version Control
Make It Cooler: Using Decentralized Version Controlindiver
 
Re:Inventing your Innovation Cycle by Scaling Out with Spot Instances (CPN207...
Re:Inventing your Innovation Cycle by Scaling Out with Spot Instances (CPN207...Re:Inventing your Innovation Cycle by Scaling Out with Spot Instances (CPN207...
Re:Inventing your Innovation Cycle by Scaling Out with Spot Instances (CPN207...Amazon Web Services
 
Presentation Agile Telco
Presentation Agile TelcoPresentation Agile Telco
Presentation Agile TelcoDawid Mielnik
 
2016 NDC - 모바일 게임 서버 엔진 개발 후기
2016 NDC - 모바일 게임 서버 엔진 개발 후기2016 NDC - 모바일 게임 서버 엔진 개발 후기
2016 NDC - 모바일 게임 서버 엔진 개발 후기iFunFactory Inc.
 
Jay Kreps on Project Voldemort Scaling Simple Storage At LinkedIn
Jay Kreps on Project Voldemort Scaling Simple Storage At LinkedInJay Kreps on Project Voldemort Scaling Simple Storage At LinkedIn
Jay Kreps on Project Voldemort Scaling Simple Storage At LinkedInLinkedIn
 

Similar a ZooKeeper Futures (20)

MySQL 5.7 -- SCaLE Feb 2014
MySQL 5.7 -- SCaLE Feb 2014MySQL 5.7 -- SCaLE Feb 2014
MySQL 5.7 -- SCaLE Feb 2014
 
Deploying your SaaS stack OnPrem
Deploying your SaaS stack OnPremDeploying your SaaS stack OnPrem
Deploying your SaaS stack OnPrem
 
CNPM: Private NPM for Company / 企業級私有NPM
CNPM: Private NPM for Company / 企業級私有NPMCNPM: Private NPM for Company / 企業級私有NPM
CNPM: Private NPM for Company / 企業級私有NPM
 
SmugMug's Zero-Downtime Migration to AWS (ARC312) | AWS re:Invent 2013
SmugMug's Zero-Downtime Migration to AWS (ARC312) | AWS re:Invent 2013SmugMug's Zero-Downtime Migration to AWS (ARC312) | AWS re:Invent 2013
SmugMug's Zero-Downtime Migration to AWS (ARC312) | AWS re:Invent 2013
 
To Cloud or Not To Cloud?
To Cloud or Not To Cloud?To Cloud or Not To Cloud?
To Cloud or Not To Cloud?
 
MuraCon 2012 - Creating a Mura CMS plugin with FW/1
MuraCon 2012 - Creating a Mura CMS plugin with FW/1MuraCon 2012 - Creating a Mura CMS plugin with FW/1
MuraCon 2012 - Creating a Mura CMS plugin with FW/1
 
Rock Solid Sametime for High Availability
Rock Solid Sametime for High AvailabilityRock Solid Sametime for High Availability
Rock Solid Sametime for High Availability
 
Webinar slides: How to Manage Replication Failover Processes for MySQL, Maria...
Webinar slides: How to Manage Replication Failover Processes for MySQL, Maria...Webinar slides: How to Manage Replication Failover Processes for MySQL, Maria...
Webinar slides: How to Manage Replication Failover Processes for MySQL, Maria...
 
Immutable infrastructure isn’t the answer
Immutable infrastructure isn’t the answerImmutable infrastructure isn’t the answer
Immutable infrastructure isn’t the answer
 
Building trust within the organization, first steps towards DevOps
Building trust within the organization, first steps towards DevOpsBuilding trust within the organization, first steps towards DevOps
Building trust within the organization, first steps towards DevOps
 
Scaling a High Traffic Web Application: Our Journey from Java to PHP
Scaling a High Traffic Web Application: Our Journey from Java to PHPScaling a High Traffic Web Application: Our Journey from Java to PHP
Scaling a High Traffic Web Application: Our Journey from Java to PHP
 
Scaling High Traffic Web Applications
Scaling High Traffic Web ApplicationsScaling High Traffic Web Applications
Scaling High Traffic Web Applications
 
Scaling apps for the big time
Scaling apps for the big timeScaling apps for the big time
Scaling apps for the big time
 
Speed up your Serverless development flow
Speed up your Serverless development flowSpeed up your Serverless development flow
Speed up your Serverless development flow
 
Make It Cooler: Using Decentralized Version Control
Make It Cooler: Using Decentralized Version ControlMake It Cooler: Using Decentralized Version Control
Make It Cooler: Using Decentralized Version Control
 
Re:Inventing your Innovation Cycle by Scaling Out with Spot Instances (CPN207...
Re:Inventing your Innovation Cycle by Scaling Out with Spot Instances (CPN207...Re:Inventing your Innovation Cycle by Scaling Out with Spot Instances (CPN207...
Re:Inventing your Innovation Cycle by Scaling Out with Spot Instances (CPN207...
 
Presentation Agile Telco
Presentation Agile TelcoPresentation Agile Telco
Presentation Agile Telco
 
2016 NDC - 모바일 게임 서버 엔진 개발 후기
2016 NDC - 모바일 게임 서버 엔진 개발 후기2016 NDC - 모바일 게임 서버 엔진 개발 후기
2016 NDC - 모바일 게임 서버 엔진 개발 후기
 
Qcon talk
Qcon talkQcon talk
Qcon talk
 
Jay Kreps on Project Voldemort Scaling Simple Storage At LinkedIn
Jay Kreps on Project Voldemort Scaling Simple Storage At LinkedInJay Kreps on Project Voldemort Scaling Simple Storage At LinkedIn
Jay Kreps on Project Voldemort Scaling Simple Storage At LinkedIn
 

Más de Cloudera, Inc.

Partner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxPartner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxCloudera, Inc.
 
Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera, Inc.
 
2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards FinalistsCloudera, Inc.
 
Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Cloudera, Inc.
 
Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Cloudera, Inc.
 
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Cloudera, Inc.
 
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Cloudera, Inc.
 
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Cloudera, Inc.
 
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Cloudera, Inc.
 
Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Cloudera, Inc.
 
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Cloudera, Inc.
 
Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Cloudera, Inc.
 
Extending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformExtending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformCloudera, Inc.
 
Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Cloudera, Inc.
 
Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Cloudera, Inc.
 
Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Cloudera, Inc.
 
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Cloudera, Inc.
 

Más de Cloudera, Inc. (20)

Partner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxPartner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptx
 
Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists
 
2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists
 
Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019
 
Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19
 
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
 
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19
 
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19
 
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
 
Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19
 
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
 
Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18
 
Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3
 
Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2
 
Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1
 
Extending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformExtending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the Platform
 
Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18
 
Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360
 
Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18
 
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18
 

Último

AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Victor Rentea
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Orbitshub
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesrafiqahmad00786416
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...apidays
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Victor Rentea
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelDeepika Singh
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 

Último (20)

AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering Developers
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 

ZooKeeper Futures

  • 2. ZooKeeper Futures Expanding the Menagerie Henry Robinson Software engineer @ Cloudera Hadoop meetup - 11/5/2009 Thursday, 5 November 2009
  • 3. Upcoming features for ZooKeeper ▪ Observers ▪ Dynamic ensembles ▪ ZooKeeper in Cloudera’s Distribution for Hadoop Thursday, 5 November 2009
  • 4. Observers ▪ ZOOKEEPER-368 ▪ Problem: ▪ Every node in a ZooKeeper cluster has to vote ▪ So increasing the cluster size increases the cost of write operations ▪ But increasing the cluster size is the only way currently to get client scalability ▪ False tension between number of clients and performance ▪ Should only increase size of voting cluster to improve reliability Thursday, 5 November 2009
  • 5. Observers ▪ It’s worse than that ▪ Since clients are given a list of servers in the ensemble to connect to, the cluster is not isolated from swamping due to the number of clients ▪ That is, if a swarm of clients connect to one server and kill it, they’ll move on to another and do the same. ▪ Now we are sharing the same number of clients amongst fewer servers! ▪ So if these were enough clients originally to down a server, the prognosis is not good for those remaining ▪ Only n/2 servers have to die before the cluster is no longer live Thursday, 5 November 2009
  • 10. Observers ▪ Simple way to attack this problem: non-voting cluster members ▪ Act as a fan-in point for client connections by proxying requests to the inner voting ensemble ▪ Doesn’t matter if they die (in the sense that liveness is preserved) - cluster is still available for writes ▪ Write throughput stays roughly constant as number of Observers increases ▪ So we can freely scale the number of Observers to meet the requirements of the number of clients Thursday, 5 November 2009
  • 11. Observers: More benefits ▪ Voting ensemble members must meet strict latency contracts in order to not be considered ‘failed’ ▪ Therefore distributing ZooKeeper across many racks, or even datacenters, is problematic. ▪ No such requirements made of Observers ▪ So deploy the voting ensemble for reliability and low latency communicaton, and everywhere you need a client, add an Observer ▪ Reads get served locally, so wide distribution isn’t too painful for some workloads ▪ Likelihood of partition increases relative to distribution of ensemble, so availability is increased in some cases ▪ Good integration point for publish-subscribe, and for specific optimisations Thursday, 5 November 2009
  • 12. Observers: Current state ▪ This patch required a lot of structural work ▪ Hoping to get in to 3.3 ▪ One major refactor patch committed ▪ Core patch up on ZOOKEEPER-368 ▪ Check it out and add comments! ▪ Fully functional - you can apply the patch, update your configuration and start using Observers today ▪ Benchmarks show expected (and pleasing!) performance improvements ▪ To come in future JIRAs - performance tweaking (batching) Thursday, 5 November 2009
  • 13. Dynamic Ensembles ▪ ZOOKEEPER-107 ▪ Problem: ▪ What if you really do want to change the membership of your cluster? ▪ Downtime is problematic for a ‘highly-available’ service ▪ But failures occur and machines get repurposed or upgraded Thursday, 5 November 2009
  • 14. Dynamic Ensembles ▪ We would like to be able to add or remove machines from the cluster without stopping the world ▪ Conceptually, this is reasonably easy - we have a mechanism for updating information on every server synchronously, and in order ▪ (it’s called ZooKeeper) ▪ In practice, this is rather involved: ▪ When is a new cluster ‘live’? ▪ Who votes on the cluster membership change? ▪ How do we deal with slow members? ▪ What happens when the leader changes? ▪ How do we find the cluster when it’s completely different? Thursday, 5 November 2009
  • 15. Dynamic Ensembles ▪ Getting all this right is hard ▪ (good!) ▪A fundamental change in how ZooKeeper is designed - much of the code is predicated on a static view of the cluster membership ▪ Ideally, we want to prove that the resulting protocol is correct ▪ The key observation is that membership changes must be voted upon by both the old and the new configuration ▪ So this is no magic bullet if the cluster is down ▪ Need to keep track of old configurations so that each vote can be tallied with the right quorum Thursday, 5 November 2009
  • 16. Dynamic Ensembles ▪ Lots of discussion on the JIRA ▪ although no public activity for a couple of months ▪I have code that pretty much works ▪ But waiting until Observers gets committed before I move focus completely to this ▪ Current situation not *too* bad; there are upgrade workarounds that are a bit scary theoretically but in practice work ok. Thursday, 5 November 2009
  • 17. ZooKeeper Packages in CDH ▪ We maintain Cloudera’s Distribution for Hadoop ▪ Packages for Mapred, HDFS, HBase, Pig and Hive ▪ We see ZooKeeper as increasingly important to that stack, as well as having a wide variety of other applications ▪ Therefore, we’ve packaged ZooKeeper 3.2.1 and are making it a first class part of CDH ▪ We’ll track the Apache releases, and also backport important patches ▪ Wrapped up in the service framework: ▪ /sbin/service zookeeper start ▪ RPMs and tarballs are done, DEBs to follow imminently ▪ Download RPMs at http://archive.cloudera.com/redhat/cdh/unstable/ Thursday, 5 November 2009
  • 18. Thanks! Questions? henry@cloudera.com Thursday, 5 November 2009