SlideShare a Scribd company logo
1 of 14
© Hortonworks Inc. 2014
Data-Center Replication
Josh Elser
Member of Technical Staff
PMC, Apache Accumulo
June 12th, 2014
Page 1
Apache, Accumulo, Apache Accumulo, and ZooKeeper are trademarks of the Apache Software Foundation.
with Apache Accumulo
© Hortonworks Inc. 2014
Justification
Shouldn’t Accumulo be able to handle failures?
Absolutely
• Accumulo is great at surviving through failure conditions
• Always chooses the durable option by default
BUT
• Sometimes wide-spread failure over some set of
resources makes them unavailable for some time
Page 2
© Hortonworks Inc. 2014
Justification
• Expect the worst hardware failure
–Both in magnitude and frequency
• Administration problems
–Human error is inevitable
• A single data-center location may be limiting
–Geography: 150ms to send a packet from California to the
Netherlands and back again.
–Throughput: Concurrent user access (1K, 10K, 100K?)
• Applications must also consider failure conditions
–Must maintain its service-level agreement (SLA)
Page 3
Jeff Dean – “Numbers everyone should know”
http://static.googleusercontent.com/media/research.google.com/en/us/people/jeff/stanford-295-talk.pdf
© Hortonworks Inc. 2014
Justification
Wait, failure doesn’t really happen at the data center level
Page 4
… … right?
Inside Google data centers– http://www.google.com/about/datacenters/gallery/images/_2000/IDI_014.jpg
© Hortonworks Inc. 2014
Justification
• HVAC Issues
–Microsoft: http://blogs.office.com/2013/03/13/details-of-the-
hotmail-outlook-com-outage-on-march-12th/
• “Acts of God”
–Hurricane Sandy (NYC):
http://www.datacenterknowledge.com/archives/2012/10/30/major-
flooding-nyc-data-centers/
–Thunderstorms (AWS, Netflix, Heroku, Pinterest, Instagram):
http://www.datacenterknowledge.com/archives/2012/06/30/amazo
n-data-center-loses-power-during-storm/
• Squirrels
–Level(3): http://blog.level3.com/level-3-network/the-10-most-
bizarre-and-annoying-causes-of-fiber-cuts/
Page 5
© Hortonworks Inc. 2014
Justification
• “The Joys of Real Hardware” – Jeff Dean
• Typical first year with a new cluster
~0.5 overheating (~1-2 days)
~1 PDU failure (~6 hours)
~20 Rack failures (1-6 hours)
~3 Router failures (~1 hour)
• What happens when you suddenly lose a large
percent of nodes in your cluster?
Page 6
Jeff Dean – “The Joys of Real Hardware”
http://static.googleusercontent.com/media/research.google.com/en/us/people/jeff/stanford-295-talk.pdf
© Hortonworks Inc. 2014
Justification
And what about your software…
Page 7
Don’t forget about Grace Hopper. Bugs will show up one way or another
Lots of great technical “post-mortems” which involve software problems
from companies like Github, Google, Dropbox, Cloudflare, and others.
Unmodified source: http://www.bugzilla.org/img/buggie.png
License: http://creativecommons.org/licenses/by-sa/2.0/
© Hortonworks Inc. 2014
Implementation
• A framework for tracking data written to a table
• Interfaces to transmit data to another storage system
• Asynchronous replication, eventual consistency
• Configurable, cyclic replication graph
• “Primary-push” system from primary to peers
• Survives prolonged peer outages
Page 8
NoSQL X RDBMS
© Hortonworks Inc. 2014
Implementation
• Write-Ahead Logs is source of data for replication
–Tricky because WALs are per-TabletServer, not per-table.
• Primary storage of book-keeping in a new table
• Requires some records in Accumulo metadata table
• Pluggable replication work assignment by the Master
to TabletServers on primary
–Default implementation uses ZooKeeper
Page 9
© Hortonworks Inc. 2014
Implementation
• TabletServers track use of WALs
–Only for tables configured for replication
• Master reads these records
–Creates units of replication work
Unit = File needing replication to a peer
–Makes replication work units available
–Cleans up records which are fully replicated
• TabletServers acquire the replication work
–Read part of file, replicate, record progress, repeat
–Tries to replicate as much data as possible
• Garbage Collector manages unused WALs
–Records an update when a WAL is no longer referenced
–Only removes WALs when replication is complete
Page 10
© Hortonworks Inc. 2014
Implementation
ReplicaSystem – interface for defining how data is
replicated to a peer.
• Runs inside a Primary tserver, does the “heavy lifting”
• AccumuloReplicaSystem – implementation that sends
data to another Accumulo instance
–Primary tserver asks the Peer’s Master for a tserver
–Primary tserver sends serialized Mutations from the WAL for the
given table to the Peer tserver
–Peer tserver applies the Mutations locally and reports the number
of Mutations applied back to the Primary
Page 11
Primary Peer
© Hortonworks Inc. 2014
Implementation
• And it works!
• Will be introduced as a part of Apache Accumulo 1.7.0
• Also included in the next release of HDP
• Currently (6/2014) waiting on review from other devs
Page 12
© Hortonworks Inc. 2014
Future
• Replication to other types of systems
–Other NoSQL systems, RDBMSes, others?
• Support for replication of bulk imports
• Support for Conditional Mutations
–Maybe? Somehow?
• Consistency of table configurations
–Iterator/Combiner definitions
–Could cause undesired implications
– Peer system is smaller than primary, cannot hold as much data
– Need to set shorter TTL/age-off on peer than primary
Page 13
© Hortonworks Inc. 2014
Email: jelser@hortonworks.com
Twitter: @je2451
Page 14
I owe a huge thank you to everyone who was a
part of this in one way or another.
JIRA: https://issues.apache.org/jira/browse/ACCUMULO-378

More Related Content

What's hot

Meet HBase 2.0 and Phoenix 5.0
Meet HBase 2.0 and Phoenix 5.0Meet HBase 2.0 and Phoenix 5.0
Meet HBase 2.0 and Phoenix 5.0DataWorks Summit
 
Sizing your alfresco platform
Sizing your alfresco platformSizing your alfresco platform
Sizing your alfresco platformLuis Cabaceira
 
201511 - Alfresco Day - Platform Update and Roadmap - Gabriele Columbro - Bo...
201511 -  Alfresco Day - Platform Update and Roadmap - Gabriele Columbro - Bo...201511 -  Alfresco Day - Platform Update and Roadmap - Gabriele Columbro - Bo...
201511 - Alfresco Day - Platform Update and Roadmap - Gabriele Columbro - Bo...Symphony Software Foundation
 
Dataflow Management From Edge to Core with Apache NiFi
Dataflow Management From Edge to Core with Apache NiFiDataflow Management From Edge to Core with Apache NiFi
Dataflow Management From Edge to Core with Apache NiFiDataWorks Summit
 
Scale your Alfresco Solutions
Scale your Alfresco Solutions Scale your Alfresco Solutions
Scale your Alfresco Solutions Alfresco Software
 
Apache Zeppelin + LIvy: Bringing Multi Tenancy to Interactive Data Analysis
Apache Zeppelin + LIvy: Bringing Multi Tenancy to Interactive Data AnalysisApache Zeppelin + LIvy: Bringing Multi Tenancy to Interactive Data Analysis
Apache Zeppelin + LIvy: Bringing Multi Tenancy to Interactive Data AnalysisDataWorks Summit/Hadoop Summit
 
Building Data Pipelines for Solr with Apache NiFi
Building Data Pipelines for Solr with Apache NiFiBuilding Data Pipelines for Solr with Apache NiFi
Building Data Pipelines for Solr with Apache NiFiBryan Bende
 
NJ Hadoop Meetup - Apache NiFi Deep Dive
NJ Hadoop Meetup - Apache NiFi Deep DiveNJ Hadoop Meetup - Apache NiFi Deep Dive
NJ Hadoop Meetup - Apache NiFi Deep DiveBryan Bende
 
HTTP/2 Comes to Java - What Servlet 4.0 Means to You
HTTP/2 Comes to Java - What Servlet 4.0 Means to YouHTTP/2 Comes to Java - What Servlet 4.0 Means to You
HTTP/2 Comes to Java - What Servlet 4.0 Means to YouDavid Delabassee
 
Local Apache NiFi Processor Debug
Local Apache NiFi Processor DebugLocal Apache NiFi Processor Debug
Local Apache NiFi Processor DebugDeon Huang
 
Apache NiFi SDLC Improvements
Apache NiFi SDLC ImprovementsApache NiFi SDLC Improvements
Apache NiFi SDLC ImprovementsBryan Bende
 
NiFi Developer Guide
NiFi Developer GuideNiFi Developer Guide
NiFi Developer GuideDeon Huang
 
Devnexus 2018 - Let Your Data Flow with Apache NiFi
Devnexus 2018 - Let Your Data Flow with Apache NiFiDevnexus 2018 - Let Your Data Flow with Apache NiFi
Devnexus 2018 - Let Your Data Flow with Apache NiFiBryan Bende
 
HBaseCon2017 Spark HBase Connector: Feature Rich and Efficient Access to HBas...
HBaseCon2017 Spark HBase Connector: Feature Rich and Efficient Access to HBas...HBaseCon2017 Spark HBase Connector: Feature Rich and Efficient Access to HBas...
HBaseCon2017 Spark HBase Connector: Feature Rich and Efficient Access to HBas...HBaseCon
 
The Alfresco ECM 1 Billion Document Benchmark on AWS and Aurora - Benchmark ...
The Alfresco ECM 1 Billion Document Benchmark on AWS and Aurora  - Benchmark ...The Alfresco ECM 1 Billion Document Benchmark on AWS and Aurora  - Benchmark ...
The Alfresco ECM 1 Billion Document Benchmark on AWS and Aurora - Benchmark ...Symphony Software Foundation
 
Hive2.0 big dataspain-nov-2016
Hive2.0 big dataspain-nov-2016Hive2.0 big dataspain-nov-2016
Hive2.0 big dataspain-nov-2016alanfgates
 
Schema Registry & Stream Analytics Manager
Schema Registry  & Stream Analytics ManagerSchema Registry  & Stream Analytics Manager
Schema Registry & Stream Analytics ManagerSriharsha Chintalapani
 
Alfresco scalability and performnce
Alfresco   scalability and performnceAlfresco   scalability and performnce
Alfresco scalability and performncePaul Hampton
 

What's hot (20)

Meet HBase 2.0 and Phoenix 5.0
Meet HBase 2.0 and Phoenix 5.0Meet HBase 2.0 and Phoenix 5.0
Meet HBase 2.0 and Phoenix 5.0
 
Sizing your alfresco platform
Sizing your alfresco platformSizing your alfresco platform
Sizing your alfresco platform
 
201511 - Alfresco Day - Platform Update and Roadmap - Gabriele Columbro - Bo...
201511 -  Alfresco Day - Platform Update and Roadmap - Gabriele Columbro - Bo...201511 -  Alfresco Day - Platform Update and Roadmap - Gabriele Columbro - Bo...
201511 - Alfresco Day - Platform Update and Roadmap - Gabriele Columbro - Bo...
 
Dataflow Management From Edge to Core with Apache NiFi
Dataflow Management From Edge to Core with Apache NiFiDataflow Management From Edge to Core with Apache NiFi
Dataflow Management From Edge to Core with Apache NiFi
 
Scale your Alfresco Solutions
Scale your Alfresco Solutions Scale your Alfresco Solutions
Scale your Alfresco Solutions
 
Apache Zeppelin + LIvy: Bringing Multi Tenancy to Interactive Data Analysis
Apache Zeppelin + LIvy: Bringing Multi Tenancy to Interactive Data AnalysisApache Zeppelin + LIvy: Bringing Multi Tenancy to Interactive Data Analysis
Apache Zeppelin + LIvy: Bringing Multi Tenancy to Interactive Data Analysis
 
Building Data Pipelines for Solr with Apache NiFi
Building Data Pipelines for Solr with Apache NiFiBuilding Data Pipelines for Solr with Apache NiFi
Building Data Pipelines for Solr with Apache NiFi
 
NJ Hadoop Meetup - Apache NiFi Deep Dive
NJ Hadoop Meetup - Apache NiFi Deep DiveNJ Hadoop Meetup - Apache NiFi Deep Dive
NJ Hadoop Meetup - Apache NiFi Deep Dive
 
HTTP/2 Comes to Java - What Servlet 4.0 Means to You
HTTP/2 Comes to Java - What Servlet 4.0 Means to YouHTTP/2 Comes to Java - What Servlet 4.0 Means to You
HTTP/2 Comes to Java - What Servlet 4.0 Means to You
 
Local Apache NiFi Processor Debug
Local Apache NiFi Processor DebugLocal Apache NiFi Processor Debug
Local Apache NiFi Processor Debug
 
Apache NiFi SDLC Improvements
Apache NiFi SDLC ImprovementsApache NiFi SDLC Improvements
Apache NiFi SDLC Improvements
 
NiFi Developer Guide
NiFi Developer GuideNiFi Developer Guide
NiFi Developer Guide
 
Devnexus 2018 - Let Your Data Flow with Apache NiFi
Devnexus 2018 - Let Your Data Flow with Apache NiFiDevnexus 2018 - Let Your Data Flow with Apache NiFi
Devnexus 2018 - Let Your Data Flow with Apache NiFi
 
HBaseCon2017 Spark HBase Connector: Feature Rich and Efficient Access to HBas...
HBaseCon2017 Spark HBase Connector: Feature Rich and Efficient Access to HBas...HBaseCon2017 Spark HBase Connector: Feature Rich and Efficient Access to HBas...
HBaseCon2017 Spark HBase Connector: Feature Rich and Efficient Access to HBas...
 
The Alfresco ECM 1 Billion Document Benchmark on AWS and Aurora - Benchmark ...
The Alfresco ECM 1 Billion Document Benchmark on AWS and Aurora  - Benchmark ...The Alfresco ECM 1 Billion Document Benchmark on AWS and Aurora  - Benchmark ...
The Alfresco ECM 1 Billion Document Benchmark on AWS and Aurora - Benchmark ...
 
Hive2.0 big dataspain-nov-2016
Hive2.0 big dataspain-nov-2016Hive2.0 big dataspain-nov-2016
Hive2.0 big dataspain-nov-2016
 
Apache Nifi Crash Course
Apache Nifi Crash CourseApache Nifi Crash Course
Apache Nifi Crash Course
 
Schema Registry & Stream Analytics Manager
Schema Registry  & Stream Analytics ManagerSchema Registry  & Stream Analytics Manager
Schema Registry & Stream Analytics Manager
 
Hive Does ACID
Hive Does ACIDHive Does ACID
Hive Does ACID
 
Alfresco scalability and performnce
Alfresco   scalability and performnceAlfresco   scalability and performnce
Alfresco scalability and performnce
 

Viewers also liked

MySQL High Availability and Disaster Recovery with Continuent, a VMware company
MySQL High Availability and Disaster Recovery with Continuent, a VMware companyMySQL High Availability and Disaster Recovery with Continuent, a VMware company
MySQL High Availability and Disaster Recovery with Continuent, a VMware companyContinuent
 
MySQL Server Backup, Restoration, And Disaster Recovery Planning Presentation
MySQL Server Backup, Restoration, And Disaster Recovery Planning PresentationMySQL Server Backup, Restoration, And Disaster Recovery Planning Presentation
MySQL Server Backup, Restoration, And Disaster Recovery Planning PresentationColin Charles
 
High Availabiltity & Replica Sets with mongoDB
High Availabiltity & Replica Sets with mongoDBHigh Availabiltity & Replica Sets with mongoDB
High Availabiltity & Replica Sets with mongoDBGareth Davies
 
Webinar Slides : Migrating to MySQL, MariaDB Galera and/or Percona XtraDB Clu...
Webinar Slides : Migrating to MySQL, MariaDB Galera and/or Percona XtraDB Clu...Webinar Slides : Migrating to MySQL, MariaDB Galera and/or Percona XtraDB Clu...
Webinar Slides : Migrating to MySQL, MariaDB Galera and/or Percona XtraDB Clu...Severalnines
 
Advanced mysql replication techniques
Advanced mysql replication techniquesAdvanced mysql replication techniques
Advanced mysql replication techniquesGiuseppe Maxia
 
Best practices for MySQL High Availability
Best practices for MySQL High AvailabilityBest practices for MySQL High Availability
Best practices for MySQL High AvailabilityColin Charles
 
Galera Cluster for MySQL vs MySQL (NDB) Cluster: A High Level Comparison
Galera Cluster for MySQL vs MySQL (NDB) Cluster: A High Level Comparison Galera Cluster for MySQL vs MySQL (NDB) Cluster: A High Level Comparison
Galera Cluster for MySQL vs MySQL (NDB) Cluster: A High Level Comparison Severalnines
 
MySQL High Availability Solutions - Feb 2015 webinar
MySQL High Availability Solutions - Feb 2015 webinarMySQL High Availability Solutions - Feb 2015 webinar
MySQL High Availability Solutions - Feb 2015 webinarAndrew Morgan
 

Viewers also liked (8)

MySQL High Availability and Disaster Recovery with Continuent, a VMware company
MySQL High Availability and Disaster Recovery with Continuent, a VMware companyMySQL High Availability and Disaster Recovery with Continuent, a VMware company
MySQL High Availability and Disaster Recovery with Continuent, a VMware company
 
MySQL Server Backup, Restoration, And Disaster Recovery Planning Presentation
MySQL Server Backup, Restoration, And Disaster Recovery Planning PresentationMySQL Server Backup, Restoration, And Disaster Recovery Planning Presentation
MySQL Server Backup, Restoration, And Disaster Recovery Planning Presentation
 
High Availabiltity & Replica Sets with mongoDB
High Availabiltity & Replica Sets with mongoDBHigh Availabiltity & Replica Sets with mongoDB
High Availabiltity & Replica Sets with mongoDB
 
Webinar Slides : Migrating to MySQL, MariaDB Galera and/or Percona XtraDB Clu...
Webinar Slides : Migrating to MySQL, MariaDB Galera and/or Percona XtraDB Clu...Webinar Slides : Migrating to MySQL, MariaDB Galera and/or Percona XtraDB Clu...
Webinar Slides : Migrating to MySQL, MariaDB Galera and/or Percona XtraDB Clu...
 
Advanced mysql replication techniques
Advanced mysql replication techniquesAdvanced mysql replication techniques
Advanced mysql replication techniques
 
Best practices for MySQL High Availability
Best practices for MySQL High AvailabilityBest practices for MySQL High Availability
Best practices for MySQL High Availability
 
Galera Cluster for MySQL vs MySQL (NDB) Cluster: A High Level Comparison
Galera Cluster for MySQL vs MySQL (NDB) Cluster: A High Level Comparison Galera Cluster for MySQL vs MySQL (NDB) Cluster: A High Level Comparison
Galera Cluster for MySQL vs MySQL (NDB) Cluster: A High Level Comparison
 
MySQL High Availability Solutions - Feb 2015 webinar
MySQL High Availability Solutions - Feb 2015 webinarMySQL High Availability Solutions - Feb 2015 webinar
MySQL High Availability Solutions - Feb 2015 webinar
 

Similar to Data-Center Replication with Apache Accumulo

Applications on Hadoop
Applications on HadoopApplications on Hadoop
Applications on Hadoopmarkgrover
 
Hadoop Everywhere & Cloudbreak
Hadoop Everywhere & CloudbreakHadoop Everywhere & Cloudbreak
Hadoop Everywhere & CloudbreakSean Roberts
 
Hortonworks Technical Workshop: HDP everywhere - cloud considerations using...
Hortonworks Technical Workshop:   HDP everywhere - cloud considerations using...Hortonworks Technical Workshop:   HDP everywhere - cloud considerations using...
Hortonworks Technical Workshop: HDP everywhere - cloud considerations using...Hortonworks
 
Bay Area Impala User Group Meetup (Sept 16 2014)
Bay Area Impala User Group Meetup (Sept 16 2014)Bay Area Impala User Group Meetup (Sept 16 2014)
Bay Area Impala User Group Meetup (Sept 16 2014)Cloudera, Inc.
 
Introduction to Distributed Computing & Distributed Databases
Introduction to Distributed Computing & Distributed DatabasesIntroduction to Distributed Computing & Distributed Databases
Introduction to Distributed Computing & Distributed DatabasesShankar Iyer
 
Enterprise-Grade Rolling Upgrade for a Live Hadoop Cluster
Enterprise-Grade Rolling Upgrade for a Live Hadoop ClusterEnterprise-Grade Rolling Upgrade for a Live Hadoop Cluster
Enterprise-Grade Rolling Upgrade for a Live Hadoop ClusterDataWorks Summit
 
Docker based Hadoop provisioning - anywhere
Docker based Hadoop provisioning - anywhereDocker based Hadoop provisioning - anywhere
Docker based Hadoop provisioning - anywhereDataWorks Summit
 
Objective SQL Server Performance
Objective SQL Server PerformanceObjective SQL Server Performance
Objective SQL Server PerformanceDavid Klee
 
Hadoop engineering bo_f_final
Hadoop engineering bo_f_finalHadoop engineering bo_f_final
Hadoop engineering bo_f_finalRamya Sunil
 
LCNA14: Why Use Xen for Large Scale Enterprise Deployments? - Konrad Rzeszute...
LCNA14: Why Use Xen for Large Scale Enterprise Deployments? - Konrad Rzeszute...LCNA14: Why Use Xen for Large Scale Enterprise Deployments? - Konrad Rzeszute...
LCNA14: Why Use Xen for Large Scale Enterprise Deployments? - Konrad Rzeszute...The Linux Foundation
 
Discover.hdp2.2.storm and kafka.final
Discover.hdp2.2.storm and kafka.finalDiscover.hdp2.2.storm and kafka.final
Discover.hdp2.2.storm and kafka.finalHortonworks
 
Hdfs 2016-hadoop-summit-san-jose-v4
Hdfs 2016-hadoop-summit-san-jose-v4Hdfs 2016-hadoop-summit-san-jose-v4
Hdfs 2016-hadoop-summit-san-jose-v4Chris Nauroth
 
Hadoop operations-2014-strata-new-york-v5
Hadoop operations-2014-strata-new-york-v5Hadoop operations-2014-strata-new-york-v5
Hadoop operations-2014-strata-new-york-v5Chris Nauroth
 
Some Oracle AWR observations
Some Oracle AWR observationsSome Oracle AWR observations
Some Oracle AWR observationsConnor McDonald
 
Hdfs 2016-hadoop-summit-dublin-v1
Hdfs 2016-hadoop-summit-dublin-v1Hdfs 2016-hadoop-summit-dublin-v1
Hdfs 2016-hadoop-summit-dublin-v1Chris Nauroth
 
Best Practices for Virtualizing Apache Hadoop
Best Practices for Virtualizing Apache HadoopBest Practices for Virtualizing Apache Hadoop
Best Practices for Virtualizing Apache HadoopHortonworks
 
Deploying Containers in Production and at Scale
Deploying Containers in Production and at ScaleDeploying Containers in Production and at Scale
Deploying Containers in Production and at ScaleMesosphere Inc.
 

Similar to Data-Center Replication with Apache Accumulo (20)

Applications on Hadoop
Applications on HadoopApplications on Hadoop
Applications on Hadoop
 
Hadoop Everywhere & Cloudbreak
Hadoop Everywhere & CloudbreakHadoop Everywhere & Cloudbreak
Hadoop Everywhere & Cloudbreak
 
Hortonworks Technical Workshop: HDP everywhere - cloud considerations using...
Hortonworks Technical Workshop:   HDP everywhere - cloud considerations using...Hortonworks Technical Workshop:   HDP everywhere - cloud considerations using...
Hortonworks Technical Workshop: HDP everywhere - cloud considerations using...
 
Bay Area Impala User Group Meetup (Sept 16 2014)
Bay Area Impala User Group Meetup (Sept 16 2014)Bay Area Impala User Group Meetup (Sept 16 2014)
Bay Area Impala User Group Meetup (Sept 16 2014)
 
Introduction to Distributed Computing & Distributed Databases
Introduction to Distributed Computing & Distributed DatabasesIntroduction to Distributed Computing & Distributed Databases
Introduction to Distributed Computing & Distributed Databases
 
Enterprise-Grade Rolling Upgrade for a Live Hadoop Cluster
Enterprise-Grade Rolling Upgrade for a Live Hadoop ClusterEnterprise-Grade Rolling Upgrade for a Live Hadoop Cluster
Enterprise-Grade Rolling Upgrade for a Live Hadoop Cluster
 
Docker based Hadoop provisioning - anywhere
Docker based Hadoop provisioning - anywhereDocker based Hadoop provisioning - anywhere
Docker based Hadoop provisioning - anywhere
 
Containers and Big Data
Containers and Big DataContainers and Big Data
Containers and Big Data
 
Objective SQL Server Performance
Objective SQL Server PerformanceObjective SQL Server Performance
Objective SQL Server Performance
 
Hadoop engineering bo_f_final
Hadoop engineering bo_f_finalHadoop engineering bo_f_final
Hadoop engineering bo_f_final
 
LCNA14: Why Use Xen for Large Scale Enterprise Deployments? - Konrad Rzeszute...
LCNA14: Why Use Xen for Large Scale Enterprise Deployments? - Konrad Rzeszute...LCNA14: Why Use Xen for Large Scale Enterprise Deployments? - Konrad Rzeszute...
LCNA14: Why Use Xen for Large Scale Enterprise Deployments? - Konrad Rzeszute...
 
Discover.hdp2.2.storm and kafka.final
Discover.hdp2.2.storm and kafka.finalDiscover.hdp2.2.storm and kafka.final
Discover.hdp2.2.storm and kafka.final
 
Hoya for Code Review
Hoya for Code ReviewHoya for Code Review
Hoya for Code Review
 
Hdfs 2016-hadoop-summit-san-jose-v4
Hdfs 2016-hadoop-summit-san-jose-v4Hdfs 2016-hadoop-summit-san-jose-v4
Hdfs 2016-hadoop-summit-san-jose-v4
 
Hadoop operations-2014-strata-new-york-v5
Hadoop operations-2014-strata-new-york-v5Hadoop operations-2014-strata-new-york-v5
Hadoop operations-2014-strata-new-york-v5
 
Some Oracle AWR observations
Some Oracle AWR observationsSome Oracle AWR observations
Some Oracle AWR observations
 
HDFS: Optimization, Stabilization and Supportability
HDFS: Optimization, Stabilization and SupportabilityHDFS: Optimization, Stabilization and Supportability
HDFS: Optimization, Stabilization and Supportability
 
Hdfs 2016-hadoop-summit-dublin-v1
Hdfs 2016-hadoop-summit-dublin-v1Hdfs 2016-hadoop-summit-dublin-v1
Hdfs 2016-hadoop-summit-dublin-v1
 
Best Practices for Virtualizing Apache Hadoop
Best Practices for Virtualizing Apache HadoopBest Practices for Virtualizing Apache Hadoop
Best Practices for Virtualizing Apache Hadoop
 
Deploying Containers in Production and at Scale
Deploying Containers in Production and at ScaleDeploying Containers in Production and at Scale
Deploying Containers in Production and at Scale
 

More from Josh Elser

Apache Phoenix and Apache HBase: An Enterprise Grade Data Warehouse
Apache Phoenix and Apache HBase: An Enterprise Grade Data WarehouseApache Phoenix and Apache HBase: An Enterprise Grade Data Warehouse
Apache Phoenix and Apache HBase: An Enterprise Grade Data WarehouseJosh Elser
 
Apache Accumulo 1.8.0 Overview
Apache Accumulo 1.8.0 OverviewApache Accumulo 1.8.0 Overview
Apache Accumulo 1.8.0 OverviewJosh Elser
 
Calcite meetup-2016-04-20
Calcite meetup-2016-04-20Calcite meetup-2016-04-20
Calcite meetup-2016-04-20Josh Elser
 
Designing and Testing Accumulo Iterators
Designing and Testing Accumulo IteratorsDesigning and Testing Accumulo Iterators
Designing and Testing Accumulo IteratorsJosh Elser
 
Alternatives to Apache Accumulo’s Java API
Alternatives to Apache Accumulo’s Java APIAlternatives to Apache Accumulo’s Java API
Alternatives to Apache Accumulo’s Java APIJosh Elser
 
RPInventory 2-25-2010
RPInventory 2-25-2010RPInventory 2-25-2010
RPInventory 2-25-2010Josh Elser
 

More from Josh Elser (6)

Apache Phoenix and Apache HBase: An Enterprise Grade Data Warehouse
Apache Phoenix and Apache HBase: An Enterprise Grade Data WarehouseApache Phoenix and Apache HBase: An Enterprise Grade Data Warehouse
Apache Phoenix and Apache HBase: An Enterprise Grade Data Warehouse
 
Apache Accumulo 1.8.0 Overview
Apache Accumulo 1.8.0 OverviewApache Accumulo 1.8.0 Overview
Apache Accumulo 1.8.0 Overview
 
Calcite meetup-2016-04-20
Calcite meetup-2016-04-20Calcite meetup-2016-04-20
Calcite meetup-2016-04-20
 
Designing and Testing Accumulo Iterators
Designing and Testing Accumulo IteratorsDesigning and Testing Accumulo Iterators
Designing and Testing Accumulo Iterators
 
Alternatives to Apache Accumulo’s Java API
Alternatives to Apache Accumulo’s Java APIAlternatives to Apache Accumulo’s Java API
Alternatives to Apache Accumulo’s Java API
 
RPInventory 2-25-2010
RPInventory 2-25-2010RPInventory 2-25-2010
RPInventory 2-25-2010
 

Recently uploaded

Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...OnePlan Solutions
 
How to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion ApplicationHow to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion ApplicationBradBedford3
 
What’s New in VictoriaMetrics: Q1 2024 Updates
What’s New in VictoriaMetrics: Q1 2024 UpdatesWhat’s New in VictoriaMetrics: Q1 2024 Updates
What’s New in VictoriaMetrics: Q1 2024 UpdatesVictoriaMetrics
 
A healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfA healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfMarharyta Nedzelska
 
Patterns for automating API delivery. API conference
Patterns for automating API delivery. API conferencePatterns for automating API delivery. API conference
Patterns for automating API delivery. API conferencessuser9e7c64
 
Salesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZSalesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZABSYZ Inc
 
Odoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 EnterpriseOdoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 Enterprisepreethippts
 
Ronisha Informatics Private Limited Catalogue
Ronisha Informatics Private Limited CatalogueRonisha Informatics Private Limited Catalogue
Ronisha Informatics Private Limited Catalogueitservices996
 
Amazon Bedrock in Action - presentation of the Bedrock's capabilities
Amazon Bedrock in Action - presentation of the Bedrock's capabilitiesAmazon Bedrock in Action - presentation of the Bedrock's capabilities
Amazon Bedrock in Action - presentation of the Bedrock's capabilitiesKrzysztofKkol1
 
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdf
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdfEnhancing Supply Chain Visibility with Cargo Cloud Solutions.pdf
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdfRTS corp
 
Understanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM ArchitectureUnderstanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM Architecturerahul_net
 
Sending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdfSending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdf31events.com
 
UI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptxUI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptxAndreas Kunz
 
VictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News UpdateVictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News UpdateVictoriaMetrics
 
Keeping your build tool updated in a multi repository world
Keeping your build tool updated in a multi repository worldKeeping your build tool updated in a multi repository world
Keeping your build tool updated in a multi repository worldRoberto Pérez Alcolea
 
Large Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLarge Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLionel Briand
 
Comparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdfComparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdfDrew Moseley
 
VK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web DevelopmentVK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web Developmentvyaparkranti
 
SAM Training Session - How to use EXCEL ?
SAM Training Session - How to use EXCEL ?SAM Training Session - How to use EXCEL ?
SAM Training Session - How to use EXCEL ?Alexandre Beguel
 
Leveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
Leveraging AI for Mobile App Testing on Real Devices | Applitools + KobitonLeveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
Leveraging AI for Mobile App Testing on Real Devices | Applitools + KobitonApplitools
 

Recently uploaded (20)

Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
 
How to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion ApplicationHow to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion Application
 
What’s New in VictoriaMetrics: Q1 2024 Updates
What’s New in VictoriaMetrics: Q1 2024 UpdatesWhat’s New in VictoriaMetrics: Q1 2024 Updates
What’s New in VictoriaMetrics: Q1 2024 Updates
 
A healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfA healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdf
 
Patterns for automating API delivery. API conference
Patterns for automating API delivery. API conferencePatterns for automating API delivery. API conference
Patterns for automating API delivery. API conference
 
Salesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZSalesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZ
 
Odoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 EnterpriseOdoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 Enterprise
 
Ronisha Informatics Private Limited Catalogue
Ronisha Informatics Private Limited CatalogueRonisha Informatics Private Limited Catalogue
Ronisha Informatics Private Limited Catalogue
 
Amazon Bedrock in Action - presentation of the Bedrock's capabilities
Amazon Bedrock in Action - presentation of the Bedrock's capabilitiesAmazon Bedrock in Action - presentation of the Bedrock's capabilities
Amazon Bedrock in Action - presentation of the Bedrock's capabilities
 
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdf
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdfEnhancing Supply Chain Visibility with Cargo Cloud Solutions.pdf
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdf
 
Understanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM ArchitectureUnderstanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM Architecture
 
Sending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdfSending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdf
 
UI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptxUI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptx
 
VictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News UpdateVictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News Update
 
Keeping your build tool updated in a multi repository world
Keeping your build tool updated in a multi repository worldKeeping your build tool updated in a multi repository world
Keeping your build tool updated in a multi repository world
 
Large Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLarge Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and Repair
 
Comparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdfComparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdf
 
VK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web DevelopmentVK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web Development
 
SAM Training Session - How to use EXCEL ?
SAM Training Session - How to use EXCEL ?SAM Training Session - How to use EXCEL ?
SAM Training Session - How to use EXCEL ?
 
Leveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
Leveraging AI for Mobile App Testing on Real Devices | Applitools + KobitonLeveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
Leveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
 

Data-Center Replication with Apache Accumulo

  • 1. © Hortonworks Inc. 2014 Data-Center Replication Josh Elser Member of Technical Staff PMC, Apache Accumulo June 12th, 2014 Page 1 Apache, Accumulo, Apache Accumulo, and ZooKeeper are trademarks of the Apache Software Foundation. with Apache Accumulo
  • 2. © Hortonworks Inc. 2014 Justification Shouldn’t Accumulo be able to handle failures? Absolutely • Accumulo is great at surviving through failure conditions • Always chooses the durable option by default BUT • Sometimes wide-spread failure over some set of resources makes them unavailable for some time Page 2
  • 3. © Hortonworks Inc. 2014 Justification • Expect the worst hardware failure –Both in magnitude and frequency • Administration problems –Human error is inevitable • A single data-center location may be limiting –Geography: 150ms to send a packet from California to the Netherlands and back again. –Throughput: Concurrent user access (1K, 10K, 100K?) • Applications must also consider failure conditions –Must maintain its service-level agreement (SLA) Page 3 Jeff Dean – “Numbers everyone should know” http://static.googleusercontent.com/media/research.google.com/en/us/people/jeff/stanford-295-talk.pdf
  • 4. © Hortonworks Inc. 2014 Justification Wait, failure doesn’t really happen at the data center level Page 4 … … right? Inside Google data centers– http://www.google.com/about/datacenters/gallery/images/_2000/IDI_014.jpg
  • 5. © Hortonworks Inc. 2014 Justification • HVAC Issues –Microsoft: http://blogs.office.com/2013/03/13/details-of-the- hotmail-outlook-com-outage-on-march-12th/ • “Acts of God” –Hurricane Sandy (NYC): http://www.datacenterknowledge.com/archives/2012/10/30/major- flooding-nyc-data-centers/ –Thunderstorms (AWS, Netflix, Heroku, Pinterest, Instagram): http://www.datacenterknowledge.com/archives/2012/06/30/amazo n-data-center-loses-power-during-storm/ • Squirrels –Level(3): http://blog.level3.com/level-3-network/the-10-most- bizarre-and-annoying-causes-of-fiber-cuts/ Page 5
  • 6. © Hortonworks Inc. 2014 Justification • “The Joys of Real Hardware” – Jeff Dean • Typical first year with a new cluster ~0.5 overheating (~1-2 days) ~1 PDU failure (~6 hours) ~20 Rack failures (1-6 hours) ~3 Router failures (~1 hour) • What happens when you suddenly lose a large percent of nodes in your cluster? Page 6 Jeff Dean – “The Joys of Real Hardware” http://static.googleusercontent.com/media/research.google.com/en/us/people/jeff/stanford-295-talk.pdf
  • 7. © Hortonworks Inc. 2014 Justification And what about your software… Page 7 Don’t forget about Grace Hopper. Bugs will show up one way or another Lots of great technical “post-mortems” which involve software problems from companies like Github, Google, Dropbox, Cloudflare, and others. Unmodified source: http://www.bugzilla.org/img/buggie.png License: http://creativecommons.org/licenses/by-sa/2.0/
  • 8. © Hortonworks Inc. 2014 Implementation • A framework for tracking data written to a table • Interfaces to transmit data to another storage system • Asynchronous replication, eventual consistency • Configurable, cyclic replication graph • “Primary-push” system from primary to peers • Survives prolonged peer outages Page 8 NoSQL X RDBMS
  • 9. © Hortonworks Inc. 2014 Implementation • Write-Ahead Logs is source of data for replication –Tricky because WALs are per-TabletServer, not per-table. • Primary storage of book-keeping in a new table • Requires some records in Accumulo metadata table • Pluggable replication work assignment by the Master to TabletServers on primary –Default implementation uses ZooKeeper Page 9
  • 10. © Hortonworks Inc. 2014 Implementation • TabletServers track use of WALs –Only for tables configured for replication • Master reads these records –Creates units of replication work Unit = File needing replication to a peer –Makes replication work units available –Cleans up records which are fully replicated • TabletServers acquire the replication work –Read part of file, replicate, record progress, repeat –Tries to replicate as much data as possible • Garbage Collector manages unused WALs –Records an update when a WAL is no longer referenced –Only removes WALs when replication is complete Page 10
  • 11. © Hortonworks Inc. 2014 Implementation ReplicaSystem – interface for defining how data is replicated to a peer. • Runs inside a Primary tserver, does the “heavy lifting” • AccumuloReplicaSystem – implementation that sends data to another Accumulo instance –Primary tserver asks the Peer’s Master for a tserver –Primary tserver sends serialized Mutations from the WAL for the given table to the Peer tserver –Peer tserver applies the Mutations locally and reports the number of Mutations applied back to the Primary Page 11 Primary Peer
  • 12. © Hortonworks Inc. 2014 Implementation • And it works! • Will be introduced as a part of Apache Accumulo 1.7.0 • Also included in the next release of HDP • Currently (6/2014) waiting on review from other devs Page 12
  • 13. © Hortonworks Inc. 2014 Future • Replication to other types of systems –Other NoSQL systems, RDBMSes, others? • Support for replication of bulk imports • Support for Conditional Mutations –Maybe? Somehow? • Consistency of table configurations –Iterator/Combiner definitions –Could cause undesired implications – Peer system is smaller than primary, cannot hold as much data – Need to set shorter TTL/age-off on peer than primary Page 13
  • 14. © Hortonworks Inc. 2014 Email: jelser@hortonworks.com Twitter: @je2451 Page 14 I owe a huge thank you to everyone who was a part of this in one way or another. JIRA: https://issues.apache.org/jira/browse/ACCUMULO-378

Editor's Notes

  1. Myself. What is replication in this context
  2. High level - why is replication important? Availability
  3. Hardware failures suck. Unexpected configuration/ops problems. Scale past a single data center without having to worry about a single instance. Must satisfy SLAs. Cannot just accept unplanned downtime.
  4. Jeff Dean’s talk about what to expect in the first year of a 1K node cluster
  5. The software running in the application, database and operating system also might have bugs which cause unexpected unavailability.
  6. Describe the characteristics of the replication implementation Book-keeping of data written to tables Interfaces/implementation for replicating data from Accumulo instance to another application Asynchronous Eventually consistent Push data from primary to peer Resilient to prolonged outages
  7. BigTable basics – what is a write-ahead log Write-ahead logs used to track data that was written to a table WAL is the primary element in bookkeeping system Some data written to metadata table, most written to replication table Leverage ZooKeeper for work assignment to tservers
  8. Tservers track WALs used for local tables Master makes work entries for the WALs that tservers use Assign replication work back to the tservers Master cleans up fully-replicated records GC closes WALs that are no longer referenced by tservers
  9. ReplicaSystem interface – does the “heavy lifting”, runs inside of the tserver AccumuloReplicaSystem implementation to replicate to an accumulo instance Local tserver -> peer master Peer master replies with peer tserver Local tserver -> peer tserver
  10. Not just some academic adventure. Will be available in 1.7.0 and the next version of HDP
  11. Replicate to RDBMS or nosql systems Support bulk-imports – not a huge priority because typically easily to replicate on your own. Conditional mutation support somehow – would have to change from async to sync or introduce conflict resolution Problems in table configuration properties that are universal and others which are specific to an instance