hbaseconasia2017: HBase Disaster Recovery Solution at Huawei

730 views

Ashish Singhi

The HBase disaster recovery solution aims to keep the HBase service highly available when one HBase cluster suffers a disaster, with minimal user intervention. This session introduces the HBase disaster recovery use cases and the solutions adopted at Huawei, such as:
a) Cluster Read-Write mode
b) DDL operations synchronization with standby cluster
c) Mutation and bulk loaded data replication
d) Further challenges and pending work

hbaseconasia2017 hbasecon hbase https://www.eventbrite.com/e/hbasecon-asia-2017-tickets-34935546159#

Published in: Technology

  1. HBase Disaster Recovery Solution at Huawei • Ashish Singhi
  2. About.html • Senior Technical Leader at Huawei • Around 6 years of experience in Big Data related projects • Apache HBase Committer
  3. Agenda • Why Disaster Recovery? • Backup vs. Disaster Recovery • HBase Disaster Recovery • Solution • Miscellaneous • Future Work
  4. Why Disaster Recovery? • Cost of Downtime
  6. Backup vs. Disaster Recovery: two different problems and solutions
     | Aspect           | Backup                      | Disaster Recovery                        |
     |------------------|-----------------------------|------------------------------------------|
     | Process          | Archive items to cold media | Replicate to secondary site              |
     | Infrastructure   | Medium level                | Duplicate of active cluster (high level) |
     | Cost             | Affordable                  | Expensive                                |
     | Restore process  | One to few at a time        | One to everything                        |
     | Restore time     | Slow                        | Fast                                     |
     | Production usage | Common                      | Rare                                     |
  8. HBase Disaster Recovery • HBase disaster recovery is based on replication, which mirrors data across a network in real time • Replication moves data from a local source location to one or more target locations • Replication over WAN has become an ideal technology for disaster recovery, preventing data loss in the event of failure
  9. Deployment Strategies
  10. Active-Standby Cluster (diagram) • Active cluster: HBase serves read and write client requests; its ZooKeeper holds /hbase/clusterState: active • Standby cluster: HBase serves only read client requests; its ZooKeeper holds /hbase/clusterState: standby • Replication connects the active cluster to the standby cluster
  12. Replication (architecture diagram; recoverable labels only) • Source Cluster: Region Servers, each with a WAL, a Replication Source Manager and Replication Source/End Points • ZooKeeper: …/peers/, …/rs/, …/hfile-refs/ • Peer Cluster 1 and Peer Cluster 2, each with a tableCfs setting: Region Servers with a Replication Sink that receives batches and bulk loads into tables
  13. Sync DDL Operations • Synchronizes table properties across clusters • Any change in the source cluster is reflected immediately in the peer clusters • Does not break replication • DDL commands take an additional option to sync • With that option, the changes are synced internally to the peer clusters
  14. Sync Security-Related Data • Synchronizes security-related HBase data across clusters • Any update to the ACL, Quota or Visibility Labels table in the source cluster is reflected immediately in the peer clusters • A custom WAL entry filter is added to replication for this • Does not break security for HBase data access
  15. Read Only Cluster • Enables a cluster to serve only read requests • A coprocessor-based solution • The standby cluster serves all read requests • The standby cluster serves a write request only if it comes from a super user or from a list of accepted IPs
  16. Cluster Recovery (diagram) • The roles swap: the former standby cluster becomes active (/hbase/clusterState: standby → active) and serves read and write client requests • The former active cluster becomes standby (/hbase/clusterState: active → standby) and serves only read client requests • Replication connects the two clusters
  18. Miscellaneous • Increased the default replication.source.ratio to 0.5 • Adaptive hbase.replication.rpc.timeout • The active cluster's HDFS server configurations are maintained in the standby cluster's ZooKeeper for replication of bulk-loaded data
  20. Future Work • Move HBase replication tracking from ZooKeeper to an HBase table (HBASE-15867) • Copy bulk-loaded data to the peer with data locality • Network bandwidth throttling for replication data
  21. Thank You! • Email: ashishsinghi@apache.org • Twitter: ashishsinghi89
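
The Read Only Cluster slide gives the rule for a standby cluster: reads are always served, writes only from a super user or an accepted IP. The slides do not show the coprocessor itself, so the following is only a minimal Python sketch of that decision rule. The class `ReadOnlyGate`, its fields, and the sample user and IP values are hypothetical names chosen for illustration, not part of HBase or of Huawei's implementation.

```python
from dataclasses import dataclass, field


@dataclass
class ReadOnlyGate:
    """Hypothetical stand-in for the write check a coprocessor might perform."""

    cluster_state: str  # "active" or "standby", mirroring /hbase/clusterState
    super_users: set = field(default_factory=set)
    accepted_ips: set = field(default_factory=set)

    def allows(self, op: str, user: str, client_ip: str) -> bool:
        # Reads are served in both states.
        if op == "read":
            return True
        # An active cluster serves writes from any client.
        if self.cluster_state == "active":
            return True
        # A standby cluster accepts a write only from a super user
        # or from a client whose IP is on the accepted list.
        return user in self.super_users or client_ip in self.accepted_ips


# Example values are made up for the sketch.
gate = ReadOnlyGate("standby", super_users={"hbase"}, accepted_ips={"10.0.0.5"})
```

With this gate, an ordinary client's write to the standby cluster is rejected, while the same write from the super user or from the accepted IP goes through.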
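
The Sync Security-Related Data slide mentions a custom WAL entry filter but not its logic. One plausible reading is that the filter lets edits to the ACL, Quota and Visibility Labels tables through to the peers in addition to the user tables already configured for replication. The Python sketch below illustrates that reading only. The table names `hbase:acl`, `hbase:quota` and `hbase:labels` are HBase's system tables for this data; the functions `keep_entry` and `filter_batch` and the dictionary shape of an entry are assumptions made for the example.

```python
# HBase system tables that hold ACLs, quotas and visibility labels.
SECURITY_TABLES = {"hbase:acl", "hbase:quota", "hbase:labels"}


def keep_entry(table: str, replicated_user_tables: set) -> bool:
    """Return True if a WAL entry for `table` should be shipped to peers.

    Assumption: security tables are always shipped, user tables only
    when they are configured for replication.
    """
    return table in SECURITY_TABLES or table in replicated_user_tables


def filter_batch(entries: list, replicated_user_tables: set) -> list:
    """Drop the entries of a batch that should not be replicated."""
    return [e for e in entries if keep_entry(e["table"], replicated_user_tables)]


# Hypothetical batch of WAL entries, reduced to the table they belong to.
batch = [{"table": "hbase:acl"}, {"table": "orders"}, {"table": "scratch"}]
shipped = filter_batch(batch, replicated_user_tables={"orders"})
```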
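
The Miscellaneous slide raises `replication.source.ratio` to 0.5. As commonly documented, this setting is the fraction of the peer cluster's region servers that a source region server may pick as replication sinks. The Python sketch below shows the arithmetic under that assumption; the function name `sinks_to_use` is hypothetical and rounding up is assumed rather than taken from the slides.

```python
import math


def sinks_to_use(peer_region_servers: int, ratio: float) -> int:
    """Number of peer region servers eligible as sinks for a given ratio.

    Assumes the count is rounded up so that at least one sink is used
    whenever the peer cluster has any region server.
    """
    if peer_region_servers < 0:
        raise ValueError("peer_region_servers must be non-negative")
    if not 0 < ratio <= 1:
        raise ValueError("ratio must be in (0, 1]")
    return math.ceil(peer_region_servers * ratio)
```

Under these assumptions, a peer cluster with 10 region servers offers 5 sinks at a ratio of 0.5, which spreads replication traffic over more servers than a smaller ratio would.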
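
The Active-Standby and Cluster Recovery slides together describe a role swap recorded under /hbase/clusterState in each cluster's ZooKeeper. The slides do not show how the swap is triggered or ordered, so the Python sketch below models only the end result, using a plain dictionary in place of ZooKeeper. The function `fail_over` and the cluster names are hypothetical.

```python
def fail_over(states: dict) -> dict:
    """Return the cluster states after a role swap.

    `states` maps a cluster name to "active" or "standby"; a plain dict
    stands in for the /hbase/clusterState znode of each cluster.
    """
    if sorted(states.values()) != ["active", "standby"]:
        raise ValueError("expected exactly one active and one standby cluster")
    swap = {"active": "standby", "standby": "active"}
    return {cluster: swap[state] for cluster, state in states.items()}


# Hypothetical cluster names.
before = {"cluster-a": "active", "cluster-b": "standby"}
after = fail_over(before)
```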
