@TwitterAds | Confidential
@ctrezzo
HBaseCon 2013
Apache HBase Replication
Thursday, July 25, 13
About me
Active contributor to Apache HBase
Software Engineer @ Twitter
Core Storage Team - Hadoop/HBase
Follow me @ctrezzo
Agenda
Introduction
High-level Architecture
Replication State
Path of a replicated edit
Replication Source
Replication Sink
Replication Source Manager
HBase replication
Asynchronously copy data between two HBase clusters
Push-based architecture
WAL shipping technique similar to MySQL
Guarantees of replication
Eventually consistent
Deliver updates at least once
Atomicity of individual updates will be preserved
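A minimal sketch of why at-least-once delivery can still converge, assuming each shipped edit keeps the timestamp it was given on the source cluster. `Edit`, `ToySinkTable`, and `ship` are hypothetical names for illustration, not HBase classes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edit:
    row: str
    column: str
    timestamp: int  # timestamp assigned on the source cluster (assumption)
    value: str


class ToySinkTable:
    """Hypothetical stand-in for a table on the peer cluster."""

    def __init__(self):
        self.cells = {}   # (row, column, timestamp) -> value
        self.applied = 0  # number of apply calls, duplicates included

    def apply(self, edit):
        # Writing the same (row, column, timestamp) twice overwrites in place.
        self.cells[(edit.row, edit.column, edit.timestamp)] = edit.value
        self.applied += 1


def ship(batch, table, ack_lost):
    """Apply the whole batch, then report whether the source saw the ack."""
    for edit in batch:
        table.apply(edit)
    return not ack_lost


table = ToySinkTable()
batch = [Edit("row1", "cf:a", 100, "x"), Edit("row1", "cf:b", 100, "y")]

# First attempt: the sink applies the batch but the ack is lost,
# so the source ships the same batch again.
if not ship(batch, table, ack_lost=True):
    ship(batch, table, ack_lost=False)

print(table.applied, len(table.cells))  # prints: 4 2
```

The batch is applied twice, yet the table holds only two cells, which is the sense in which duplicate delivery is harmless in this toy model.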
Administering Replication
Set one parameter in hbase-site.xml: hbase.replication => true
Set up replication topologies: add_peer, remove_peer, disable_peer, enable_peer, list_peers
Create/alter tables with replication scope set: REPLICATION_SCOPE => '1'
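The three steps above can be sketched as follows: a small Python script that renders the hbase-site.xml property and lists the matching HBase shell commands as plain strings. The table name `demo_table`, the column family `cf`, and the reuse of the example ZooKeeper hosts are assumptions for illustration; the script only builds text and does not talk to a cluster.

```python
import xml.etree.ElementTree as ET


def property_element(name, value):
    """Build one <property> entry in the hbase-site.xml layout."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = value
    return prop


# Step 1: enable replication in hbase-site.xml.
config = ET.Element("configuration")
config.append(property_element("hbase.replication", "true"))
xml_text = ET.tostring(config, encoding="unicode")

# Steps 2 and 3: HBase shell commands, kept as strings.
# Peer id '1', the ZooKeeper hosts, 'demo_table' and 'cf' are placeholders.
shell_commands = [
    "add_peer '1', 'zk1.host.com,zk2.host.com,zk3.host.com:2181:/hbase'",
    "list_peers",
    "create 'demo_table', {NAME => 'cf', REPLICATION_SCOPE => '1'}",
]

print(xml_text)
for command in shell_commands:
    print(command)
```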
High-Level Architecture
[Diagram: region servers in Cluster 1 (HLog, ReplicationSourceManager, ReplicationSource) ship edits to region servers in Cluster 2 and Cluster 3 (ReplicationSink, HTable); steps numbered 1 to 4. ZooKeeper holds the znodes /state, /peers (/1, /2) and /rs. Replication Admin is shown alongside.]
Replication State
Persistently stored in ZooKeeper
Status: master kill switch
Peers: list of remote target clusters
Queues: list of remaining HLogs to replicate and the current position in each log
Path of a replicated edit
[Diagram: steps 1 to 4 of an edit's path from the HLog and ReplicationSource on a Cluster 1 region server to the ReplicationSink, HTable, and region server in Cluster 2.]
Path of a replicated edit
[Diagram: the same path repeated for Region Server 1, Region Server 2, and Region Server X in Cluster 1, each with its own HLog and ReplicationSource.]
Replication Source
End-point for shipping WAL entries
One instance for each queue
Runs as a separate thread on region server
Uses AdminProtocol RPC to synchronously ship entries
Filters edits based on replication scope
[Diagram: path of a replicated edit from Cluster 1 to Cluster 2.]
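A minimal sketch of the scope filtering a replication source performs before shipping, assuming scope is tracked per table and column family, with 0 meaning local only and 1 meaning replicate. `WalEntry` and `filter_by_scope` are hypothetical names, not the HBase implementation.

```python
from dataclasses import dataclass

SCOPE_LOCAL = 0   # edit stays on the source cluster
SCOPE_GLOBAL = 1  # edit is shipped to peers (REPLICATION_SCOPE => '1')


@dataclass(frozen=True)
class WalEntry:
    table: str
    family: str
    row: str
    value: str


def filter_by_scope(entries, scopes):
    """Keep only entries whose (table, family) has replication scope 1.

    `scopes` maps (table, family) to a scope; missing keys count as local.
    """
    return [
        entry for entry in entries
        if scopes.get((entry.table, entry.family), SCOPE_LOCAL) == SCOPE_GLOBAL
    ]


scopes = {("orders", "cf"): SCOPE_GLOBAL, ("orders", "tmp"): SCOPE_LOCAL}
entries = [
    WalEntry("orders", "cf", "row1", "a"),
    WalEntry("orders", "tmp", "row1", "b"),
    WalEntry("audit", "cf", "row9", "c"),
]
to_ship = filter_by_scope(entries, scopes)
print([(e.table, e.family) for e in to_ship])  # prints: [('orders', 'cf')]
```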
Replication Sink
End-point for receiving shipped WAL entries
One instance per region server
Synchronously receives entries and applies them using HTable
Batches rows in the same table
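The batching step can be pictured roughly like this: received entries are grouped by table so that each table gets one batched write. This is a sketch under that assumption; `batch_by_table` and the tuple layout are hypothetical, and the actual write through HTable is left out.

```python
from collections import defaultdict


def batch_by_table(entries):
    """Group (table, row, value) entries into one list of rows per table."""
    batches = defaultdict(list)
    for table, row, value in entries:
        batches[table].append((row, value))
    return dict(batches)


received = [
    ("orders", "row1", "a"),
    ("users", "row7", "b"),
    ("orders", "row2", "c"),
]
batches = batch_by_table(received)
for table, rows in batches.items():
    # A real sink would issue one batched write per table here.
    print(table, rows)
```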
[Diagram: path of a replicated edit from Cluster 1 to the ReplicationSink and HTable in Cluster 2.]
Load balancing
Balances load on remote cluster using randomization
Ships edits to random subset of remote region servers
Default is 10%
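A minimal sketch of the random subset selection, assuming the source picks a fixed percentage of the peer's region servers (rounded up, at least one) and ships to those. `choose_sinks` and the host names are hypothetical; integer arithmetic is used so the rounding is exact.

```python
import random


def choose_sinks(region_servers, percent=10, rng=random):
    """Pick a random subset of peer region servers to ship edits to."""
    # Ceiling of len * percent / 100, but never fewer than one server.
    count = max(1, (len(region_servers) * percent + 99) // 100)
    return rng.sample(region_servers, count)


peer_servers = [f"rs{i}.cluster2.example.org" for i in range(1, 21)]
sinks = choose_sinks(peer_servers, rng=random.Random(42))
print(len(sinks))  # prints: 2
```

With 20 region servers on the peer cluster and the 10% default, each source ships to 2 of them.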
[Diagram: Cluster 1 shipping edits to Cluster 2, which has 20 region servers.]
Path of a replicated edit
[Diagram: steps 1 to 4 of an edit's path from the HLog and ReplicationSource on a Cluster 1 region server to the ReplicationSink, HTable, and region server in Cluster 2.]
Replication Source Manager
Manages all replication sources
Manages changes in replication state: log rolling, region server failure, addition/deletion of peer clusters
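One way to picture two of these responsibilities, log rolling and peer addition/deletion, is a manager that keeps one queue of HLog names per peer. This is a sketch under that assumption; `SourceManagerSketch` and its methods are hypothetical, and region server failure handling is deliberately left out.

```python
class SourceManagerSketch:
    """Toy model: one queue of HLog names per peer cluster."""

    def __init__(self):
        self.queues = {}  # peer id -> list of HLog names, oldest first

    def add_peer(self, peer_id):
        # A new peer starts with an empty queue.
        self.queues.setdefault(peer_id, [])

    def remove_peer(self, peer_id):
        # Dropping a peer discards whatever was still queued for it.
        self.queues.pop(peer_id, None)

    def log_rolled(self, hlog_name):
        # Every peer still has to replicate the newly opened log.
        for queue in self.queues.values():
            queue.append(hlog_name)


manager = SourceManagerSketch()
manager.add_peer("1")
manager.log_rolled("12340993.22342")
manager.add_peer("2")
manager.log_rolled("23522342.23422")
manager.remove_peer("1")
print(manager.queues)  # prints: {'2': ['23522342.23422']}
```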
[Diagram: ReplicationSourceManager and ReplicationSources on Cluster 1 region servers shipping to ReplicationSinks in Cluster 2 and Cluster 3; ZooKeeper holds the znodes /state, /peers (/1, /2) and /rs.]
High-Level Architecture
[Diagram: region servers in Cluster 1 (HLog, ReplicationSourceManager, ReplicationSource) ship edits to region servers in Cluster 2 and Cluster 3 (ReplicationSink, HTable); steps numbered 1 to 4. ZooKeeper holds the znodes /state, /peers (/1, /2) and /rs. Replication Admin is shown alongside.]
Additional Resources
Apache HBase user mailing list
user@hbase.apache.org
Apache HBase reference guide
https://hbase.apache.org/book.html
Tweet me
@ctrezzo
Questions?
Replication State
Persistently stored in ZooKeeper
Three major replication znodes: Status, Peers, Queues
/hbase/replication
  /state [Value: true]
  /peers
    /1 [Value: zk1.host.com,zk2.host.com,zk3.host.com:2181:/hbase]
      /peer-state [Value: ENABLED]
    /2 [Value: zk5.host.com,zk6.host.com,zk7.host.com:2181:/hbase]
      /peer-state [Value: DISABLED]
  /rs
    /hostname.example.org,6020,1234
      /1
        /23522342.23422 [Value: 254]
        /12340993.22342 [Value: 0]
      /2
        /23522342.23422 [Value: 34]
        /12340993.22342 [Value: 0]
    /hostname2.example.org,6020,1234
      /1
        /23522348.23443 [Value: 87]
        /12340999.22362 [Value: 0]
      /2
        /23522348.23443 [Value: 127]
        /12340999.22362 [Value: 0]
Status znode
Master kill switch
Controlled by start_replication, stop_replication
Be careful what you wish for
/hbase/replication
  /state [Value: true]
Peers znode
A set of remote clusters registered as possible replication targets
Identified by peer id
Contains status of each peer cluster
/hbase/replication
  /peers
    /1 [Value: zk1.host.com,zk2.host.com,zk3.host.com:2181:/hbase]
      /peer-state [Value: ENABLED]
    /2 [Value: zk5.host.com,zk6.host.com,zk7.host.com:2181:/hbase]
      /peer-state [Value: DISABLED]
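The value stored under each peer id reads as a ZooKeeper quorum, a client port, and a parent znode, separated by colons. A minimal sketch of splitting it, assuming exactly that three-part layout; `ClusterKey` and `parse_cluster_key` are hypothetical helper names.

```python
from typing import List, NamedTuple


class ClusterKey(NamedTuple):
    quorum: List[str]   # ZooKeeper hosts of the peer cluster
    client_port: int    # ZooKeeper client port
    znode_parent: str   # parent znode of the peer's HBase instance


def parse_cluster_key(key):
    """Split 'host1,host2,host3:port:/parent' into its three parts."""
    quorum, port, parent = key.split(":")
    return ClusterKey(quorum.split(","), int(port), parent)


peer_1 = parse_cluster_key("zk1.host.com,zk2.host.com,zk3.host.com:2181:/hbase")
print(peer_1.quorum, peer_1.client_port, peer_1.znode_parent)
```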
Queues znode
Queues identified by region server and peer id
Queues contain list of HLogs and current position in log
/hbase/replication
  /rs
    /hostname.example.org,6020,1234
      /1
        /23522342.23422 [Value: 254]
        /12340993.22342 [Value: 0]
      /2
        /23522342.23422 [Value: 34]
        /12340993.22342 [Value: 0]
    /hostname2.example.org,6020,1234
      /1
        /23522348.23443 [Value: 87]
        /12340999.22362 [Value: 0]
      /2
        /23522348.23443 [Value: 127]
        /12340999.22362 [Value: 0]
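The queue layout above maps naturally onto a nested dictionary: region server, then peer id, then HLog name, with the stored position as the value. A minimal sketch that flattens such a structure back into znode paths; `queue_positions` is a hypothetical helper and only the first region server from the example is included.

```python
def queue_positions(queues, root="/hbase/replication/rs"):
    """Flatten {region server: {peer id: {hlog: position}}} into znode paths."""
    paths = {}
    for region_server, peers in queues.items():
        for peer_id, hlogs in peers.items():
            for hlog, position in hlogs.items():
                paths[f"{root}/{region_server}/{peer_id}/{hlog}"] = position
    return paths


queues = {
    "hostname.example.org,6020,1234": {
        "1": {"23522342.23422": 254, "12340993.22342": 0},
        "2": {"23522342.23422": 34, "12340993.22342": 0},
    },
}
positions = queue_positions(queues)
for path, position in positions.items():
    print(path, position)
```

The same HLog appears once per peer, each with its own position, because every peer is replicated independently.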
HBase Replication

  • 1. @TwitterAds | Confidential @ctrezzo HBaseCon 2013 Apache HBase Replication Thursday, July 25, 13
  • 2. @Twitter 2 About me Active contributor to Apache HBase Software Engineer @ Twitter Core Storage Team - Hadoop/HBase Follow me @ctrezzo Thursday, July 25, 13
  • 3. @Twitter 3 Agenda Introduction High-level Architecture Replication State Path of a replicated edit Replication Source Replication Sink Replication Source Manager Thursday, July 25, 13
  • 4. HBase replication: asynchronously copy data between two HBase clusters; push-based architecture; WAL shipping technique similar to MySQL.
  • 5. Guarantees of replication: eventually consistent; updates are delivered at least once; atomicity of individual updates is preserved.
  • 6. Administering Replication: set the parameter hbase.replication => true in hbase-site.xml; set up replication topologies with add_peer, remove_peer, disable_peer, enable_peer, list_peers; create/alter tables with the replication scope set, REPLICATION_SCOPE => '1'.
  • 7. High-Level Architecture. [Diagram: Cluster 1 region servers (ReplicationSourceManager, ReplicationSource, HLog); Cluster 2 and Cluster 3 region servers (ReplicationSink, HTable); ZooKeeper znodes /state, /peers/1, /peers/2, /rs; Replication Admin.]
  • 8. Replication State: persistently stored in ZooKeeper. Status: master kill switch. Peers: list of remote target clusters. Queues: list of remaining HLogs to replicate and the current position in each log.
  • 9. Path of a replicated edit. [Diagram: Cluster 1 region server (HLog, ReplicationSource) to Cluster 2 region server (ReplicationSink, HTable), steps numbered 1 to 4.]
  • 10. Path of a replicated edit. [Diagram: the same path repeated for Region Server 1, Region Server 2, and Region Server X in Cluster 1, each with its own HLog and ReplicationSource, shipping to ReplicationSink and HTable on Cluster 2 region servers.]
  • 11. Replication Source: end-point for shipping WAL entries; one instance for each queue; runs as a separate thread on the region server; uses AdminProtocol RPC to synchronously ship entries; filters edits based on replication scope. [Diagram: path of a replicated edit, Cluster 1 to Cluster 2.]
  • 12. Replication Sink: end-point for receiving shipped WAL entries; one instance per region server; synchronously receives entries and applies them using HTable; batches rows in the same table. [Diagram: path of a replicated edit, Cluster 1 to Cluster 2.]
  • 13. Load balancing: balances load on the remote cluster using randomization; ships edits to a random subset of remote region servers (default is 10%). [Diagram: Cluster 1 shipping to Cluster 2, which has 20 region servers.]
  • 14. Path of a replicated edit. [Diagram: Cluster 1 region server (HLog, ReplicationSource) to Cluster 2 region server (ReplicationSink, HTable), steps numbered 1 to 4.]
  • 15. Replication Source Manager: manages all replication sources; manages changes in replication state: log rolling, region server failure, addition/deletion of peer clusters. [Diagram: high-level architecture with ReplicationSourceManager on a Cluster 1 region server, Cluster 2, Cluster 3, and ZooKeeper znodes /state, /peers/1, /peers/2, /rs.]
  • 16. High-Level Architecture. [Diagram: Cluster 1 region servers (ReplicationSourceManager, ReplicationSource, HLog); Cluster 2 and Cluster 3 region servers (ReplicationSink, HTable); ZooKeeper znodes /state, /peers/1, /peers/2, /rs; Replication Admin.]
  • 17. Additional Resources: Apache HBase user mailing list, user@hbase.apache.org; Apache HBase reference guide, https://hbase.apache.org/book.html; tweet me @ctrezzo.
  • 19. Replication State: persistently stored in ZooKeeper. Three major replication znodes: Status, Peers, Queues.
      /hbase/replication
        /state [Value: true]
        /peers
          /1 [Value: zk1.host.com,zk2.host.com,zk3.host.com:2181:/hbase]
            /peer-state [Value: ENABLED]
          /2 [Value: zk5.host.com,zk6.host.com,zk7.host.com:2181:/hbase]
            /peer-state [Value: DISABLED]
        /rs
          /hostname.example.org,6020,1234
            /1
              /23522342.23422 [Value: 254]
              /12340993.22342 [Value: 0]
            /2
              /23522342.23422 [Value: 34]
              /12340993.22342 [Value: 0]
          /hostname2.example.org,6020,1234
            /1
              /23522348.23443 [Value: 87]
              /12340999.22362 [Value: 0]
            /2
              /23522348.23443 [Value: 127]
              /12340999.22362 [Value: 0]
  • 20. Status znode: master kill switch; controlled by start_replication, stop_replication; be careful what you wish for.
      /hbase/replication
        /state [Value: true]
  • 21. Peers znode: a set of remote clusters registered as possible replication targets; identified by peer id; contains the status of each peer cluster.
      /hbase/replication
        /peers
          /1 [Value: zk1.host.com,zk2.host.com,zk3.host.com:2181:/hbase]
            /peer-state [Value: ENABLED]
          /2 [Value: zk5.host.com,zk6.host.com,zk7.host.com:2181:/hbase]
            /peer-state [Value: DISABLED]
  • 22. Queues znode: queues are identified by region server and peer id; each queue contains a list of HLogs and the current position in each log.
      /hbase/replication
        /rs
          /hostname.example.org,6020,1234
            /1
              /23522342.23422 [Value: 254]
              /12340993.22342 [Value: 0]
            /2
              /23522342.23422 [Value: 34]
              /12340993.22342 [Value: 0]
          /hostname2.example.org,6020,1234
            /1
              /23522348.23443 [Value: 87]
              /12340999.22362 [Value: 0]
            /2
              /23522348.23443 [Value: 127]
              /12340999.22362 [Value: 0]
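
The queue layout on the Queues znode slide (region server, then peer id, then one znode per HLog holding a position value) can be modeled as a nested mapping. The following is only an illustrative sketch, not HBase code: the names `QUEUES`, `queue_paths`, and `pending_logs` are hypothetical, the data is copied from the slide, and nothing here talks to a real ZooKeeper ensemble.

```python
# Hypothetical in-memory model of the /hbase/replication/rs subtree shown on the slide.
# Layout: region server znode -> peer id -> HLog znode name -> position value.
QUEUES = {
    "hostname.example.org,6020,1234": {
        "1": {"23522342.23422": 254, "12340993.22342": 0},
        "2": {"23522342.23422": 34, "12340993.22342": 0},
    },
    "hostname2.example.org,6020,1234": {
        "1": {"23522348.23443": 87, "12340999.22362": 0},
        "2": {"23522348.23443": 127, "12340999.22362": 0},
    },
}


def queue_paths(queues, root="/hbase/replication/rs"):
    """Yield (znode_path, position) for every HLog in every queue."""
    for server, peers in queues.items():
        for peer_id, hlogs in peers.items():
            for hlog, position in hlogs.items():
                yield f"{root}/{server}/{peer_id}/{hlog}", position


def pending_logs(queues, peer_id):
    """Return {region_server: [hlog, ...]} for the HLogs still queued for one peer."""
    return {
        server: list(peers[peer_id])
        for server, peers in queues.items()
        if peer_id in peers
    }


if __name__ == "__main__":
    for path, position in queue_paths(QUEUES):
        print(path, position)
    print(pending_logs(QUEUES, "2"))
```

Keying the model by region server first mirrors the slide's statement that a queue is identified by region server and peer id, so each (server, peer) pair keeps its own independent position per HLog.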