2. How to perform Incremental Backup/Restore?
• HBase ships with a handful of useful tools
– CopyTable
– Export / Import
3. CopyTable
• Purpose:
– Copy part or all of a table, either to the same cluster or to another cluster
• Usage:
– bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable [--starttime=X] [--endtime=Y] [--new.name=NEW] [--peer.adr=ADR] <tablename>
• Options:
– starttime: Beginning of the time range.
– endtime: End of the time range. If omitted, everything from starttime onward is copied.
– new.name: New table's name.
– peer.adr: Address of the peer cluster, given in the format hbase.zookeeper.quorum:hbase.zookeeper.client.port:zookeeper.znode.parent
– families: Comma-separated list of ColumnFamilies to copy.
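Putting these options together, here is a minimal sketch of an incremental copy to a peer cluster. The table names, column families, ZooKeeper address, and timestamps below are illustrative placeholders, not values from the slides; starttime/endtime are epoch milliseconds (here, July 1–2, 2012 UTC):

  $ bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable \
      --starttime=1341100800000 --endtime=1341187200000 \
      --new.name=usertable_backup \
      --peer.adr=zk1.example.com:2181:/hbase \
      --families=cf1,cf2 \
      usertable

Because only cells written inside the [starttime, endtime) window are copied, running this with a sliding window is what makes the copy incremental.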
4. CopyTable (cont.)
• Limitations
– Can only back up to another HBase table (implemented as Scan + Put)
– Rows inserted or updated while the copy is running are not captured atomically, so concurrent edits can leave the destination inconsistent with the source
5. Export
• Purpose:
– Dump the contents of a table to HDFS as a sequence file
• Usage:
– $ bin/hbase org.apache.hadoop.hbase.mapreduce.Export <tablename> <outputdir> [<versions> [<starttime> [<endtime>]]]
• Options:
– *tablename: The name of the table to export
– *outputdir: The location in HDFS to store the exported data
– versions: The number of cell versions to export (defaults to 1)
– starttime: Beginning of the time range
– endtime: End of the time range for the scan
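As a concrete sketch, exporting one day of edits from a table (the table name, output path, and timestamps are illustrative assumptions; the timestamps are epoch milliseconds for July 1–2, 2012 UTC):

  $ bin/hbase org.apache.hadoop.hbase.mapreduce.Export \
      usertable /backup/usertable/2012/07/01 \
      1 1341100800000 1341187200000

Here 1 is the number of versions to keep, and the output path anticipates the date-based hierarchy suggested in the conclusion below.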
6. Export (cont.)
• Limitations
– Can only back up to HDFS as a sequence file (implemented as Scan + write to HDFS)
– Rows inserted or updated while the Export is running are not captured atomically, so concurrent edits can leave the dump inconsistent with the source
7. Import
• Purpose:
– Load previously exported data back into HBase
• Usage
– $ bin/hbase org.apache.hadoop.hbase.mapreduce.Import <tablename> <inputdir>
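For example, restoring the day exported in the sketch above (same illustrative path; note that Import does not create tables, so the target table must already exist on the cluster):

  $ bin/hbase org.apache.hadoop.hbase.mapreduce.Import \
      usertable /backup/usertable/2012/07/01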
8. Conclusion
• Regular (e.g., daily) incremental backups
– Use Export and organize the output dir as a meaningful hierarchy (a cron-ready sketch follows at the end of this section)
• /table_name
    /2012 (year)
      /07 (month)
        /01 (date)
        /02
        …
        /31
          /01 (hour)
          …
          /24
– Perform Import to restore the data on demand
• To reduce overhead, avoid running the restore during peak hours
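A minimal cron-ready sketch of such a daily incremental Export, assuming the hierarchy above; the table name, backup root, and time window are illustrative assumptions, not part of the slides:

  #!/bin/bash
  # Export the last 24 hours of edits for one table (illustrative sketch).
  TABLE=usertable                             # assumed table name
  OUTDIR=/backup/${TABLE}/$(date +%Y/%m/%d)   # matches the /year/month/date layout
  ENDTIME=$(($(date +%s) * 1000))             # now, in epoch milliseconds
  STARTTIME=$((ENDTIME - 86400000))           # 24 hours earlier

  bin/hbase org.apache.hadoop.hbase.mapreduce.Export \
      "$TABLE" "$OUTDIR" 1 "$STARTTIME" "$ENDTIME"

Scheduled off-peak (e.g., via cron at 02:00), each run scans only one day's window, which keeps the MapReduce job small and the cluster impact low.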