8. Built-in
• Export
– MapReduce job against HBase API
– Output to a single sequence file
• Copy Table
– MapReduce job against HBase API
– Output to another table
Yay
• Simple
• Heavily tested
• Can do point-in-time
Boo
• Slow
• High impact for running cluster
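Both built-in tools are launched from the command line as MapReduce jobs; a minimal sketch (table name `mytable` and all paths/addresses are placeholders, and exact options vary by HBase version):

```shell
# Export: MR job that scans the table through the HBase API and writes
# SequenceFiles to an HDFS directory. Extra args (versions, start/end
# time) give the point-in-time behavior.
hbase org.apache.hadoop.hbase.mapreduce.Export \
    mytable /backups/mytable-$(date +%Y%m%d)

# CopyTable: MR job that scans the source table and writes each row
# into another table, optionally on another cluster.
hbase org.apache.hadoop.hbase.mapreduce.CopyTable \
    --peer.adr=backup-zk:2181:/hbase \
    --new.name=mytable_copy mytable
```

Both read every cell through the HBase API, which is why they are slow and heavy on a running cluster.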
10. Replication
• Export all changes by tailing WAL
YAY
• Simple
• Gets all edits
• Minimal impact on running cluster
Boo
• Must be turned on from the beginning
• Can’t turn it off and later catch up
• No built-in point-in-time
• Still need ETL process to get multiple copies
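Replication is configured per column family from the HBase shell; a sketch (peer id, ZooKeeper quorum, and table/CF names are placeholders, and `hbase.replication` must be set to `true` in hbase-site.xml on both clusters):

```shell
hbase shell <<'EOF'
# Register the slave cluster as a replication peer
add_peer '1', 'slave-zk1,slave-zk2,slave-zk3:2181:/hbase'

# Mark the column family for replication (REPLICATION_SCOPE => 1)
disable 'mytable'
alter 'mytable', {NAME => 'cf', REPLICATION_SCOPE => 1}
enable 'mytable'
EOF
```

From this point on, edits are shipped by tailing the WAL — which is also why edits written before the peer was added are never replicated.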
11. (Facebook) Solution!1
Mozilla did something similar2
1. issues.apache.org/jira/browse/HBASE-5509
2. github.com/mozilla-metrics/akela/blob/master/src/main/java/com/mozilla/hadoop/Backup.java
12. Facebook Backup
• Copy existing hfiles, hlogs
Yay
• Through HDFS
– Doesn’t impact running cluster
• Fast
– distcp is 100% faster than M/R through HBase
Boo
• Not widely used
• Requires Hardlinks
• Recovery requires WAL replay
• Point-in-time needs filter
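The Facebook-style copy goes straight through HDFS rather than the HBase API; a sketch of the copy step (NameNode addresses and paths are placeholders, and the files must be hardlinked or otherwise pinned first so compactions don’t delete them mid-copy):

```shell
# Copy the table's hfiles and the write-ahead logs directly via HDFS,
# bypassing the HBase layer entirely.
hadoop distcp hdfs://prod-nn:8020/hbase/mytable \
              hdfs://backup-nn:8020/backups/mytable
hadoop distcp hdfs://prod-nn:8020/hbase/.logs \
              hdfs://backup-nn:8020/backups/logs
```

Recovery then means placing the files back and replaying the copied WALs.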
13. Backup through the ages
[Timeline diagram: Export → Copy Table → Replication (through HBase); HBASE-50 → Facebook backup (through HDFS)]
17. Hardlink workarounds
• HBASE-5547
– Move deleted hfiles to .archive directory
• HBASE-6610
– FileLink: equivalent to Windows link files
Enough to get started….
20. Snapshots
• Fast
– Zero-copy of files
• Point-in-time semantics
– Part of how it’s built
• Built-in recovery
– Make a table from a snapshot
• SLA enforcement
– Guaranteed max unavailability
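Taking and listing snapshots is a one-liner in the HBase shell; a sketch (snapshot and table names are placeholders):

```shell
hbase shell <<'EOF'
# Zero-copy, point-in-time snapshot of an online table
snapshot 'mytable', 'mytable_snap_20130101'
list_snapshots
EOF
```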
23. Snapshot Types
• Offline
– Table is already disabled
• Globally consistent
– Consistent across all servers
• Timestamp consistent
– Point-in-time according to each server
24. Offline Snapshots
• Table is already disabled
• Requires minimal log replay
– Especially if table is cleanly disabled
• State of the table when disabled
• Don’t need to worry about changing state
YAY
• Fast!
• Simple!
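An offline snapshot just requires disabling the table first; sketch (names are placeholders):

```shell
hbase shell <<'EOF'
# Cleanly disable the table: flushes memstores, so minimal log replay
disable 'mytable'
snapshot 'mytable', 'mytable_offline_snap'
enable 'mytable'
EOF
```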
26. Globally Consistent Snapshots
• All regions block writes until everyone agrees to snapshot
– Two-phase commit-ish
• Time-bound to prevent infinite blocking
– Unavailability SLA maintained per region
• No Flushing – it’s fast!
28. Cross-Server Consistency Problems
• General distributed coordination problems
– Block writes while waiting for all regions
– Limited by slowest region
– More servers = higher P(failure)
• Stronger guarantees than currently in HBase
• Requires WAL replay to restore table
30. Timestamp Consistent Snapshots
• All writes up to a TS are in the snapshot
• Leverages existing flush functionality
• Doesn’t block writes
• No WAL replay on recovery
34. Recovery
• Export snapshot
– Send snapshot to another cluster
• Clone snapshot
– Create new table from snapshot
• Restore table
– Rollback table to specific state
35. Export Snapshot
• Copy a full snapshot to another cluster
– All required HFiles/Hlogs
– Lots of options
• Fancy distcp
– Fast!
– Minimal impact on running cluster
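The export is its own MapReduce tool; a sketch (snapshot name, destination, and mapper count are placeholders):

```shell
# ExportSnapshot copies the snapshot manifest plus all referenced
# HFiles/HLogs to another cluster's hbase root dir - the "fancy distcp".
hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot \
    -snapshot mytable_snap_20130101 \
    -copy-to hdfs://backup-nn:8020/hbase \
    -mappers 16
```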
36. Clone Table
• New table from snapshot
• Create multiple tables from same snapshot
• Exact replica at the point-in-time
• Full Read/Write on new table
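Cloning is a single shell command; the new table is immediately read/write (names are placeholders):

```shell
hbase shell <<'EOF'
# New table backed by the snapshot's files - no data copy
clone_snapshot 'mytable_snap_20130101', 'mytable_dev'
EOF
```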
37. Restore
• Replace existing table with snapshot
• Snapshots current table, just in case
• Minimal overhead
– Handles creating/deleting regions
– Fixes META for you
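Restore works on a disabled table; a sketch (names are placeholders):

```shell
hbase shell <<'EOF'
disable 'mytable'
# Roll the table back to the snapshot's state; region create/delete
# and META fixup are handled for you
restore_snapshot 'mytable_snap_20130101'
enable 'mytable'
EOF
```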
40. Goodies
• Full support in shell
• Distributed Coordination Framework
• ‘Ragged Backup’ added along the way
• Coming in next CDH
• Backport to 0.94?
41. Special thanks!
• Matteo Bertozzi
– All the recovery code
– Shell support
• Jon Hsieh
– Distributed Two-Phase Commit refactor
• All our reviewers…
– Stack, Ted Yu, Jon Hsieh, Matteo
Data is flying around, and HBase is just chugging along. You’re adding servers weekly – daily? – to handle the excess capacity; life is good. But wait: one of your DBAs fat-fingers a command and deletes a table, a column family, the whole database. Or maybe your devs want to test out some new features – not on my production server! Or a customer makes a mistake and wants to get back to last Tuesday at 6 PM.
HBase has been around for a few years and well, these aren’t exactly new problems.
OK, if you’ve thought about this problem for at least 5 minutes, you’ve probably seen these before. You’re probably even running them already.
Ok, we can do better…
Just get a list of all the hfiles/hlogs and copy them over. Use hardlinks to ensure that we have the same state for the table. This is getting better – we aren’t directly impacting the cluster (except for bandwidth).
General trend down the stack – more knowledge of individual files, layout in HDFS, low-level functionality. Also trending towards a minimal impact on the running cluster – only take the hit on the wire, not through the HBase layer. HBASE-50: internal hardlinks using reference counting in META – a massive patch including restore, offline and online snapshots. WAY too much to review.
And for a few years people were really sad and made do with existing tooling. We are starting to run HBase in some large companies though, and they have stringent data requirements.
Story-ize the problem
Focus on TADA of the snapshots
Imagine you have 1000 servers, each with in memory state. How would you save it? How would you save it fast? Any problems?
Example for stronger guarantees than HBase: currently, we only support transactions on a single row on a single server. This gives you a semi-omniscient view over all servers hosting a table – full cross-server consensus over multiple rows. WAY more than HBase gives you now.
Guarantee that all writes are filtered on a timestamp, flushing on the regionserver so all the information in the snapshot is present entirely in HFiles – NO WAL REPLAY!