Page1 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
HBase Read High Availability Using
Timeline-Consistent Region Replicas
Enis Soztutar (enis@hortonworks.com)
Devaraj Das (ddas@hortonworks.com)
Page2 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
About Us
Enis Soztutar
Committer and PMC member in Apache
HBase and Hadoop since 2007
HBase team @Hortonworks
Twitter @enissoz
Devaraj Das
Committer and PMC member in
Hadoop since 2006
Committer at HBase
Co-founder @Hortonworks
Page3 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Outline of the talk
PART I: Use case and semantics
 CAP recap
 Use case and motivation
 Region replicas
 Timeline consistency
 Semantics
PART II : Implementation and next steps
 Server side
 Client side
 Data replication
 Next steps & Summary
Page4 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Part I
Use case and semantics
Page5 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
CAP reCAP
[Diagram: the CAP triangle with Consistency, Availability, and Partition tolerance; "Pick Two". HBase is placed in the CP corner.]
Page6 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
CAP reCAP
• In a distributed system you cannot NOT have P
• C vs A is about what happens if there is a network
partition!
• A and C are NEVER binary values, always a range
• Different operations in the system can have
different A / C choices
• HBase cannot be simplified as CP
[Diagram: the CAP triangle repeated from the previous slide.]
Page7 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
HBase consistency model
For a single row, HBase is strongly consistent within a data center
Across rows HBase is not strongly consistent (but available!).
When a RS goes down, only the regions on that server become
unavailable. Other regions are unaffected.
HBase multi-DC replication is “eventually consistent”
HBase applications should design their schema carefully to get the right semantics / performance tradeoff
Page8 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Use cases and motivation
More and more applications are looking for a “zero downtime” platform
 30 seconds of downtime (an aggressive MTTR) is too much
Certain classes of apps are willing to tolerate decreased consistency
guarantees in favor of availability
 Especially for READs
Some build wrappers around the native API to be able to handle failures of
destination servers
 Multi-DC: when one server is down in one DC, the client switches to a different one
Can we do something in HBase natively?
 Within the same cluster?
Page9 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Use cases and motivation
Designing the application requires careful tradeoff consideration
 In schema design, since single-row operations are strongly consistent but there are no multi-row transactions
 Multi-datacenter replication (active-passive, active-active, backups etc)
It is good to be able to give the application flexibility to pick-and-choose
 Higher availability vs stronger consistency
Read vs Write
 Different consistency models for read vs write
 Read-repair, latest ts-wins vs linearizable updates
Page10 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Initial goals
Support applications talking to a single cluster really well
 No perceived downtime
 Only for READs
If apps want to tolerate cluster failures
 Use HBase replication
 Combine that with wrappers in the application
Page11 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Introducing….
Region Replicas in HBase
Timeline Consistency in HBase
Page12 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Region replicas
For every region of the table, there can be more than one replica
 Every region replica has an associated “replica_id”, starting from 0
 Each region replica is hosted by a different region server
Tables can be configured with a REGION_REPLICATION parameter (see the sketch after this list)
 Default is 1
 No change in the current behavior
One replica per region is the “default” or “primary”
 Only this replica accepts WRITEs
 All reads from this region replica return the most recent data
Other replicas, also called “secondaries”, follow the primary
 They see only committed updates
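As a rough illustration of the REGION_REPLICATION knob referenced above, the sketch below creates a table with three replicas per region through the Java admin API. It is a minimal sketch assuming an HBase 1.x-era client (HTableDescriptor.setRegionReplication, ConnectionFactory); the table and column family names are made up for the example.

// Minimal sketch, assuming the HBase 1.x client API; "t1"/"f1" are illustrative names.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class CreateReplicatedTable {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (Connection connection = ConnectionFactory.createConnection(conf);
         Admin admin = connection.getAdmin()) {
      HTableDescriptor desc = new HTableDescriptor(TableName.valueOf("t1"));
      desc.addFamily(new HColumnDescriptor("f1"));
      // REGION_REPLICATION = 3: one primary plus two secondary replicas per region
      desc.setRegionReplication(3);
      admin.createTable(desc);
    }
  }
}

The HBase shell exposes the same attribute on create/alter (REGION_REPLICATION => n).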
Page13 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Region replicas
Secondary region replicas are read-only
 No writes are routed to secondary replicas
 Data is replicated to secondary regions (more on this later)
 Serve data from the same data files as the primary
 May not have received the recent data
 Reads and Scans can be performed, returning possibly stale data
Region replica placement is done to maximize availability of any particular
region
 Region replicas are not co-located on the same region servers
 Or on the same racks (if possible)
Page14 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
[Diagram: a client reads and writes a row (rowkey, column:value, ...) through the RegionServer hosting its region; writes are buffered in the region's memstore, and the region's data files are stored as blocks (b1, b2, b9) on HDFS DataNodes.]
Page15 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
[Diagram: the same setup with a second RegionServer hosting a read-only replica of the region; the replica has its own memstore but serves the same data files from the DataNodes, while client writes still go only to the primary region.]
Page16 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
TIMELINE Consistency
Introduced a Consistency enum
 STRONG
 TIMELINE
Consistency.STRONG is the default
Consistency can be set per read operation (per-get or per-scan)
Timeline-consistent read RPCs are sent to more than one replica
Semantics are a bit different from the eventual consistency model
Page17 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
TIMELINE Consistency
public enum Consistency {
  STRONG,
  TIMELINE
}

Get get = new Get(row);
get.setConsistency(Consistency.TIMELINE);
...
Result result = table.get(get);
...
if (result.isStale()) {
  ...
}
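For scans, the same knob applies per operation; a minimal sketch, assuming a Table handle named table and the usual HBase client imports:

Scan scan = new Scan();
scan.setConsistency(Consistency.TIMELINE);
try (ResultScanner scanner = table.getScanner(scan)) {
  for (Result r : scanner) {
    if (r.isStale()) {
      // this row may have been served by a secondary replica and lag the primary
    }
    // process r ...
  }
}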
Page18 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
TIMELINE Consistency Semantics
Can be thought of as in-cluster active-passive replication
Single homed and ordered updates
 All writes are handled and ordered by the primary region
 All writes are strongly consistent
Secondaries apply the mutations in order
Only get/scan requests to secondaries
Get/Scan Result can be inspected to see whether the result was from
possibly stale data
Page19 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
TIMELINE Consistency Example
[Diagram: Client1 issues Write X=1 to the primary (replica_id=0); the primary records the update in its WAL and data, and it is replicated to replica_id=1 and replica_id=2.]
Page20 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
TIMELINE Consistency Example
[Diagram: X=1 has reached all three replicas; Client2 reads X=1 from the primary and from both secondaries.]
Page21 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
TIMELINE Consistency Example
[Diagram: Client1 writes X=2 to the primary; replica_id=1 has received the update through replication, while replica_id=2 still holds X=1.]
Page22 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
TIMELINE Consistency Example
[Diagram: Client2 now reads X=2 from the primary and from replica_id=1, but a stale X=1 from replica_id=2.]
Page23 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
TIMELINE Consistency Example
[Diagram: Client1 writes X=3 to the primary; replica_id=1 still holds X=2 and replica_id=2 still holds X=1.]
Page24 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
TIMELINE Consistency Example
[Diagram: Client2 reads X=3 from the primary, X=2 from replica_id=1, and X=1 from replica_id=2; each replica returns a consistent, possibly stale point on the write timeline.]
Page25 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
PART II
Implementation and next steps
Page26 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Region replicas – recap
Every region replica has an associated “replica_id”, starting from 0
Each region replica is hosted by a different region server
 All replicas can serve READs
One replica per region is the “default” or “primary”
 Only this replica accepts WRITEs
 All reads from this region replica return the most recent data
Page27 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Updates in the Master
Replica creation
 Created during table creation
No distinction between primary & secondary replicas
The meta table contains all information in one row
Load balancer improvements
 LB made aware of replicas
 Makes a best effort to place replicas across machines/racks to maximize availability
Alter table support
 For adjusting number of replicas
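As a rough sketch of the alter path mentioned above (assuming the HBase 1.x-era Admin API and an existing Admin handle named admin; the table name is illustrative), the replica count is adjusted by modifying the table descriptor. Depending on the version, the table may need to be disabled first:

TableName name = TableName.valueOf("t1");
admin.disableTable(name);
HTableDescriptor desc = admin.getTableDescriptor(name);
desc.setRegionReplication(3);   // raise (or lower) the number of replicas per region
admin.modifyTable(name, desc);
admin.enableTable(name);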
Page28 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Updates in the RegionServer
Treats non-default replicas as read-only
Storefile management
 Keeps itself up to date with store file creations/deletions
Page29 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
IPC layer high level flow
[Flowchart] The client sends the READ to the primary and waits for the response. If the response arrives within the timeout (10 millis), it is returned. If not, the client sends the READ to all secondaries, polls for responses, takes the first successful response, and cancels the others.
Similar flow for Get/Batch-Get/Scan, except that Scan is sticky to the server it sees success from. A sketch of this backup-request pattern follows.
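The sketch below is only an illustration of this backup-request pattern using plain java.util.concurrent primitives, not the actual HBase RPC code; the 10 ms head start for the primary mirrors the flow above.

import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class BackupRequestReader<R> {
  private final ExecutorService pool = Executors.newCachedThreadPool();

  R read(Callable<R> primary, List<Callable<R>> secondaries) throws Exception {
    CompletionService<R> done = new ExecutorCompletionService<>(pool);
    done.submit(primary);                                    // always ask the primary first
    Future<R> first = done.poll(10, TimeUnit.MILLISECONDS);  // give it a short head start
    if (first == null) {
      for (Callable<R> secondary : secondaries) {            // fan out to all secondaries
        done.submit(secondary);
      }
      first = done.take();                                   // first reply to complete wins
    }
    // a real implementation would also cancel the still-outstanding calls here
    return first.get();
  }
}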
Page30 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Performance and Testing
No significant performance issues discovered
 Added interrupt handling in the RPCs to cancel unneeded replica RPCs
Deeper level of performance testing work is still in progress
Tested via IT tests
 A test fails if a response is not received within a certain time
Page31 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Next steps
What has been described so far is in “Phase-1” of the project
Phase-2
 WAL replication
 Handling of Merges and Splits
 Latency guarantees
– Cancellation of RPCs server side
– Promotion of one Secondary to Primary, and recruiting a new Secondary
Use the infrastructure to implement consensus protocols for read/write
within a single datacenter
Page32 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Data Replication
Data should be replicated from primary regions to secondary regions
A region's data = data files on HDFS + in-memory data in memstores
Data files MUST be shared. We do not want to store multiple copies
Do not cause more writes than necessary
Two solutions:
 Region snapshots : Share only data files
 Async WAL Replication : Share data files, every region replica has its own in-memory data
Page33 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Data Replication – Region Snapshots
Primary region works as usual
 Buffer up mutations in memstore
 Flush to disk when full
 Compact files when needed
 Deleted files are kept in the archive directory for some time
Secondary regions periodically look for new files in the primary region (refresh knob sketched below)
 When a new flushed file is seen, just open it and start serving data from there
 When a compaction is seen, open new file, close the files that are gone
 Good for read-only, bulk load data or less frequently updated data
Implemented in phase 1
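As an aside on the "periodically look for new files" step above: the refresh interval is a RegionServer configuration knob. The property name below (hbase.regionserver.storefile.refresh.period, in milliseconds) and the 30-second value are stated as assumptions for illustration; in a real deployment it would be set in the RegionServer's hbase-site.xml rather than in client code.

// Sketch only; the property name and value are assumptions for illustration.
Configuration conf = HBaseConfiguration.create();
conf.setInt("hbase.regionserver.storefile.refresh.period", 30000); // 0 would disable the periodic refresh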
Page34 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Data Replication - Async WAL Replication
Being implemented in Phase 2
Uses replication source to tail the WAL files from RS
 Plugs in a custom replication sink to replay the edits on the secondaries
 Flush and Compaction events are written to WAL. Secondaries pick new files when they see
the entry
On open, a secondary region will:
 Open region files of the primary region
 Setup a replication queue based on last seen seqId
 Accumulate edits in memstore (memory management issues in the next slide)
 Mimic flushes and compactions from primary region
Page35 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Memory management & flushes
Memory Snapshots-based approach
 The secondaries look for the Start-Flush and Commit-Flush WAL-edit entries
 They mimic what the primary does in terms of taking snapshots
– When a flush is successful, the snapshot is let go
 If the RegionServer hosting the secondary is under memory pressure
– Make some other primary region flush
Flush-based approach
 Treat the secondary regions as regular regions
 Allow them to flush as usual
 Flush to the local disk, and clean them up periodically or on certain events
– Treat them as a normal store file for serving reads
Page36 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Summary
Pros
 High-availability for read-only tables
 High-availability for stale reads
 Very low-latency for the above
Cons
 Increased memory from memstores of the secondaries
 Increased blockcache usage
 Extra network traffic for the replica calls
 Increased number of regions to manage in the cluster
Page37 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
References
Apache branch hbase-10070 (https://github.com/apache/hbase/tree/hbase-10070)
HDP-2.1 comes with experimental support for Phase-1
More on the use cases for this work can be found in Sudarshan’s (Bloomberg) talk
 “Case Studies” track titled “HBase at Bloomberg: High Availability Needs for the Financial
Industry”
Page38 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Thanks
Q & A
