© Hortonworks Inc. 2014
HBase Backup and Restore
Vladimir Rodionov
Ted Yu
hbaseconasia2017: Backup / Restore feature in HBase


Vladimir Rodionov and Ted Yu

Backup and restore functionality is crucial to achieving fault tolerance for data management systems.
In this talk, we cover the newly merged backup and restore phases 2 and 3.
Previously, users could take snapshots to back up data. However, the associated execution cost may be high because of the flush across region servers, and there was no incremental snapshot.
Backup and restore functionality provides two types of backup:
• Full backup – foundation for incremental backups
• Incremental backup – can be periodic to capture changes over time
We cover three backup strategies:
• Intra-cluster backup
• Backup on a separate HDFS archive cluster
• Backup involving a cloud or a storage vendor
Best practices for backup and restore are presented next.
We then explain concepts such as Backup Image and Backup Set, with example commands showing how they are used.
The mechanism for incremental backups is covered next.
Finally, we cover bulk load support for backup.

hbaseconasia2017 hbasecon hbase https://www.eventbrite.com/e/hbasecon-asia-2017-tickets-34935546159#

Published in: Technology

hbaseconasia2017: Backup / Restore feature in HBase

  1. HBase Backup and Restore
     Vladimir Rodionov, Ted Yu
  2. About the authors
     • Ted Yu:
       • Has been working on HBase for over 6 years
       • HBase committer / PMC
       • Senior Staff Engineer at Hortonworks
     • Vladimir:
       • Active contributor to HBase (over 100 HBase JIRAs)
       • Completed most of the backup work based on IBM’s initial contribution
       • Senior Staff Engineer at Hortonworks
  3. HBase Backup – Why We Need It
     • A database needs a disaster recovery tool
     • Previously, users could take snapshots
     • However, the execution cost of a snapshot may be high, because a flush across region servers is involved
     • There was no incremental snapshot: a snapshot captures the whole dataset
     • Incremental backup doesn’t involve flushing, making continuous backup possible
  4. Brief History of Backup / Restore work
     • Started by engineers at IBM (see HBASE-7912)
     • Initial design included a backup manifest
     • Vladimir / Ted picked up the work last year
     • Vladimir produced many iterations of patches for the phase 2 work (see HBASE-14123)
     • In response to community feedback, the design has gone through major changes
     • Mostly tested by developers and QA engineers so far
  5. HBase Backup Types
     • Full backup – foundation for incremental backups
     • Incremental backup – can be periodic to capture changes over time
     • Supports table-level backup
  6. Required Configuration
     • Set hbase.backup.enable to true
     • Add BackupLogCleaner to hbase.master.logcleaner.plugins
     • Add LogRollMasterProcedureManager to hbase.procedure.master.classes
     • Add LogRollRegionServerProcedureManager to hbase.procedure.regionserver.classes
     • Backup may get stuck if these are not configured properly
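The settings on the Required Configuration slide belong in hbase-site.xml. As a rough sketch only, the Python below renders them as `<property>` elements. The property names are taken from the slide; the values are the short class names shown there, and a real configuration needs the fully qualified class names from your HBase release, which the slide does not give. `render_properties` is a hypothetical helper, not part of HBase.

```python
from xml.sax.saxutils import escape

# Property names come from the slide. The class values are the short names
# shown there (assumption: replace them with the fully qualified class names
# from your HBase release before using the output).
BACKUP_PROPERTIES = {
    "hbase.backup.enable": "true",
    "hbase.master.logcleaner.plugins": "BackupLogCleaner",
    "hbase.procedure.master.classes": "LogRollMasterProcedureManager",
    "hbase.procedure.regionserver.classes": "LogRollRegionServerProcedureManager",
}


def render_properties(props):
    """Render a dict of settings as hbase-site.xml <property> elements."""
    lines = []
    for name, value in props.items():
        lines.append("<property>")
        lines.append(f"  <name>{escape(name)}</name>")
        lines.append(f"  <value>{escape(value)}</value>")
        lines.append("</property>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_properties(BACKUP_PROPERTIES))
```

If any of these properties already lists other classes in your configuration, the backup class would normally be appended to the existing comma-separated value rather than replacing it.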
  7. Backup Strategy
     • Intra-cluster backup is appropriate for testing
  8. Backup Strategy: Dedicated HDFS Cluster
     • Backup on a separate HDFS archive cluster
  9. Backup Strategy: Cloud or a Storage Vendor
     • The vendor can be a public cloud provider or a storage vendor that uses a Hadoop-compatible file system
  10. Best Practices for Backup-and-Restore
     • Secure a full backup image first
     • Formulate a restore strategy and test it
     • Define and use backup sets for groups of tables that are logical subsets of the entire dataset
     • Document the backup-and-restore strategy, and ideally log information about each backup
  11. Creating/Maintaining Backup Image
     • Run the following command as the hbase superuser:
     • hbase backup create {{ full | incremental } {backup_root_path} {[-t tables] | [-set backup_set_name]}} [[-silent] | [-w number_of_workers] | [-b bandwidth_per_worker]]
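To make the `hbase backup create` usage line concrete, here is a minimal sketch that assembles the argument list from the options shown on the slide. `backup_create_args` is a hypothetical helper, and the sketch assumes that tables are passed to `-t` as a comma-separated list (as in the restore example elsewhere in the deck) and that exactly one of `-t` or `-set` is given. It only builds the arguments; it does not run the command.

```python
import shlex


def backup_create_args(backup_type, backup_root_path, tables=None,
                       backup_set=None, workers=None, bandwidth=None):
    """Build the argv for `hbase backup create` from the slide's usage line."""
    if backup_type not in ("full", "incremental"):
        raise ValueError("backup_type must be 'full' or 'incremental'")
    # Assumption: exactly one of a table list or a backup set name is required.
    if bool(tables) == bool(backup_set):
        raise ValueError("give exactly one of tables or backup_set")

    args = ["hbase", "backup", "create", backup_type, backup_root_path]
    if tables:
        # Assumption: tables are comma-separated, as in the restore example.
        args += ["-t", ",".join(tables)]
    else:
        args += ["-set", backup_set]
    if workers is not None:
        args += ["-w", str(workers)]
    if bandwidth is not None:
        args += ["-b", str(bandwidth)]
    return args


if __name__ == "__main__":
    # The path and table names here are placeholders for illustration.
    print(shlex.join(backup_create_args("full", "/tmp/backup",
                                        tables=["mytable1", "mytable2"])))
```

The returned list can be handed to `subprocess.run` when the command is executed as the hbase superuser on a node with the HBase client installed.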
  12. Using Backup Sets
     • Reduce the amount of repetitive input of table names
     • Created with the “hbase backup set add” command
     • You can have multiple backup sets
     • A backup set can be used in the “hbase backup create” or “hbase backup restore” commands
  13. Restoring a Backup Image
     • You can only restore on a live HBase cluster
     • Run the following command as the hbase superuser:
     • hbase restore {[-set backup_set_name] | [backup_root_path] | [backupId] | -t [tables]} [-m [table_mapping] | [-overwrite] | [-check]]
     • Example: hbase restore /tmp/backup_incremental backupId_1467823988425 -t mytable1,mytable2 -overwrite
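The restore example on the slide can be reproduced with a small argument builder. This is only a sketch: `restore_args` is a hypothetical helper that covers the form used in the example (root path, backup id, `-t` table list) plus the `-overwrite` and `-check` flags. The `-set` and `-m` options are left out because the slide does not show their argument formats in use.

```python
import shlex


def restore_args(backup_root_path, backup_id, tables,
                 overwrite=False, check=False):
    """Build the argv for `hbase restore` in the form used by the slide's example."""
    if not tables:
        raise ValueError("at least one table is required")
    args = ["hbase", "restore", backup_root_path, backup_id,
            "-t", ",".join(tables)]
    if overwrite:
        args.append("-overwrite")
    if check:
        args.append("-check")
    return args


if __name__ == "__main__":
    # Values copied from the example command on the slide.
    print(shlex.join(restore_args("/tmp/backup_incremental",
                                  "backupId_1467823988425",
                                  ["mytable1", "mytable2"],
                                  overwrite=True)))
```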
  14. Backup table
     • The backup table keeps track of all backup sessions:
       – Writes/reads backup session state
       – Writes/reads backup session progress (per region server)
       – Stores the last backed-up WAL file timestamp (per region server)
       – Stores the list of all backed-up WAL files (for BackupLogCleaner)
       – Stores backup sets
     • Must be backed up and restored separately from other tables
     • Information needed for restore is on HDFS
  15. Incremental backups
     • Use Write Ahead Logs (WALs) to capture the data changes since the previous backup
     • A log roll is executed across all RegionServers
     • All the WAL files from incremental backups between the last full backup and the incremental backup are converted to HFiles
     • A process similar to the DistCp tool is used to move the source backup files to the target file system
  16. Filter WALs on backup to only include relevant edits
     • Suppose an incremental backup request is for table t; t is unioned with T, the set of all tables already registered in the backup system
     • For every table K in the union:
       1. Convert new WAL files into HFiles, applying the table filter for K
       2. Move these HFile(s) to the backup destination
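The WAL filtering step can be illustrated with a toy model. This is a sketch under stated assumptions, not HBase code: WAL edits are modeled as plain `(table, edit)` tuples, and `tables_to_process` and `filter_edits_by_table` are hypothetical names. It shows the union of the requested table with the already registered tables, followed by a per-table filter that drops edits for any other table.

```python
from collections import defaultdict


def tables_to_process(requested, registered):
    """Union the requested tables with those already registered for backup."""
    return set(requested) | set(registered)


def filter_edits_by_table(wal_edits, tables):
    """Group WAL edits by table, dropping edits for tables outside `tables`.

    `wal_edits` is an iterable of (table, edit) tuples: a toy stand-in for
    WAL entries, not the real HBase WAL format.
    """
    per_table = defaultdict(list)
    for table, edit in wal_edits:
        if table in tables:
            per_table[table].append(edit)
    return dict(per_table)


if __name__ == "__main__":
    edits = [("t", "put r1"), ("u", "put r2"), ("x", "put r3"), ("t", "delete r1")]
    tables = tables_to_process({"t"}, {"u"})
    # Each per-table group stands in for the HFile(s) that would be written
    # for that table and then moved to the backup destination.
    for table, table_edits in sorted(filter_edits_by_table(edits, tables).items()):
        print(table, table_edits)
```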
  17. Restore
     • The full backup is restored from the full backup image
     • The HFileSplitter job collects all HFile(s) and splits them along the new region boundaries
     • The HBase Bulk Load utility is invoked by restore to import the HFiles as restored data in the table
  18. Backup Manifest
     • A backup image has the following:
       • Backup Id, Backup Type, Backup Rootdir, Table List, start timestamp, completion timestamp
       • Mapping between region server and last recorded WAL timestamp
     • A backup image keeps the lineage of all previously created backup images (ancestors)
     • When the backup image list already covers the image being considered, that image is removed from the restore
     • See message BackupImage in Backup.proto
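As an informal illustration of the manifest fields and lineage, the sketch below models a backup image as a Python dataclass and walks its ancestors to list each image once, oldest first. The field names are informal Python spellings of the fields listed on the slide, not the actual `Backup.proto` schema, and `restore_order` is a hypothetical helper that reflects one reading of how lineage is used during restore.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BackupImage:
    """Toy model of a backup image; not the Backup.proto message."""
    backup_id: str
    backup_type: str  # "full" or "incremental"
    root_dir: str
    tables: List[str]
    start_ts: int
    complete_ts: int
    ancestors: List["BackupImage"] = field(default_factory=list)


def restore_order(image):
    """Return the image and its ancestors, oldest first, each listed once."""
    seen = {}

    def visit(img):
        for parent in img.ancestors:
            visit(parent)
        # An image already covered by the list is not added a second time.
        seen.setdefault(img.backup_id, img)

    visit(image)
    return sorted(seen.values(), key=lambda img: img.start_ts)


if __name__ == "__main__":
    full = BackupImage("b1", "full", "/backup", ["t"], 1, 2)
    inc1 = BackupImage("b2", "incremental", "/backup", ["t"], 3, 4, [full])
    inc2 = BackupImage("b3", "incremental", "/backup", ["t"], 5, 6, [full, inc1])
    print([img.backup_id for img in restore_order(inc2)])
```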
  19. Bulk load support
     • Bulk-loaded HFiles are recorded in the backup table at the end of the bulk load, through the preCommitStoreFile() hook
     • During incremental backup, these HFiles are copied to the backup destination
     • During restore, these HFiles are loaded into the target table
  20. Limitations of the Backup-Restore
     • Only one active backup session is supported
     • Neither backup nor restore can be canceled while in progress (HBASE-15997, HBASE-15998)
     • Only a single backup destination is supported (HBASE-15476)
     • There is no merge for incremental images (HBASE-14135)
     • Only the superuser (hbase) is allowed to perform backup/restore
  21. Credit
     • Richard Ding
     • Vladimir Rodionov
  22. Q/A
  23. Thank you.
