4. Outline
• Motivation
• History
• HBase Consistent Indexing
– Index Management
– Recovery Mechanism
• Conclusion
HBase BoF - June 2013
5. Why do we need them?
• Sorted by key
– Great for accessing on that key
What if we want to access by another dimension!?
6. A short example
• Easy to search by name of food
• Hard to search on another dimension
Name | Type | Date Received | Manufacturer | Current Count
Apple | Macintosh | 6/23/13 | Good Farm Inc. | 200
Turkey | Breast | 6/23/13 | Tasty Meat Co. | 42
Chicken | Drumstick | 6/18/13 | Pretty Ok Food | 3
Jam | Strawberry | 6/18/10 | Mash It Up Inc. | 700
7. A short example
Date Received | Name | Type | Manufacturer | Current Count
6/18/10 | Jam | Strawberry | Mash It Up Inc. | 700
6/18/13 | Chicken | Drumstick | Pretty Ok Food | 3
6/23/13 | Apple | Macintosh | Good Farm Inc. | 200
6/23/13 | Turkey | Breast | Tasty Meat Co. | 42
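The two orderings above can be modelled with a pair of sorted maps. The following is a minimal sketch in plain Java with no HBase dependency; the class name, the `date|name` index key layout, and the `yy/MM/dd` date strings (chosen so that lexicographic order matches chronological order) are illustrative assumptions, not something taken from the talk.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model: a "table" is a sorted map from row key to row contents.
class FoodIndexSketch {
    // Primary table, sorted by food name.
    final TreeMap<String, String> primary = new TreeMap<>();
    // Index table, sorted by "dateReceived|name"; the value points back to the primary row key.
    final TreeMap<String, String> byDate = new TreeMap<>();

    // Every write to the primary table also writes one index row.
    void put(String name, String dateReceived, String rest) {
        primary.put(name, dateReceived + "," + rest);
        byDate.put(dateReceived + "|" + name, name);
    }

    // Range scan over the index: names of all foods received on the given date.
    List<String> receivedOn(String date) {
        // '}' is the character right after '|', so this range covers every key "date|...".
        Map<String, String> range = byDate.subMap(date + "|", date + "}");
        return new ArrayList<>(range.values());
    }
}
```

Looking up by date becomes a short range scan over `byDate` rather than a full scan of `primary`, at the cost of one extra write per row.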
9. HBase is “Special”…
• Partitioned Keys (“HRegion”)
• Scales because regions are independent
• Built-in data recovery mechanisms
11. We’ve gotten better…
• NGData
– HBase-SEP
– HBase-Indexer
• Intel
– Lucene Full Text Indexing
12. Still missing some things
• In-HBase index storage
– Just another table in HBase
• Simple consistency guarantees
– If X fails, then Y
• Minimal overhead for covered indexes
– Network roundtrips
14. Two Major Components
• Index Management
– Builds index updates
– Ensures the index is ‘cleaned up’
• Recovery Mechanism
– Ensures index updates are “ACID”
15. Index Management
• Lives within a RegionCoprocessorObserver
• Access to the local HRegion
• Specifies the mutations to apply to the index tables
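import java.util.Map;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.Mutation;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
public interface IndexBuilder {
  void setup(RegionCoprocessorEnvironment env);          // called once, with access to the local region
  Map<Mutation, String> getIndexUpdate(Put put);         // index mutations to apply for a Put
  Map<Mutation, String> getIndexUpdate(Delete delete);   // index mutations to apply for a Delete
}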
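As a rough illustration of what a builder might do, here is a sketch of a single-column index builder. It is written against plain-Java stand-ins so that it runs without HBase on the classpath: `SimplePut`, `SimpleIndexBuilder`, `SingleColumnIndexBuilder`, and the `primary_row` column are all hypothetical names, and reading the `String` in the returned map as the target index table name is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for an HBase Put: one row key plus column -> value pairs (NOT the real HBase class).
class SimplePut {
    final String row;
    final Map<String, String> columns = new HashMap<>();
    SimplePut(String row) { this.row = row; }
    SimplePut add(String column, String value) { columns.put(column, value); return this; }
}

// Mirrors the shape of the IndexBuilder interface above, using the stand-in type.
interface SimpleIndexBuilder {
    // Assumed meaning: index mutation -> name of the index table it should be written to.
    Map<SimplePut, String> getIndexUpdate(SimplePut put);
}

// Indexes a single column: the index row key is "<column value>|<primary row key>".
class SingleColumnIndexBuilder implements SimpleIndexBuilder {
    private final String indexedColumn;
    private final String indexTable;

    SingleColumnIndexBuilder(String indexedColumn, String indexTable) {
        this.indexedColumn = indexedColumn;
        this.indexTable = indexTable;
    }

    @Override
    public Map<SimplePut, String> getIndexUpdate(SimplePut put) {
        Map<SimplePut, String> updates = new HashMap<>();
        String value = put.columns.get(indexedColumn);
        if (value != null) { // only build an index entry when the indexed column is written
            SimplePut indexPut = new SimplePut(value + "|" + put.row).add("primary_row", put.row);
            updates.put(indexPut, indexTable);
        }
        return updates;
    }
}
```

A real builder would also need to handle deletes and overwritten values so that stale index rows get cleaned up; that part is omitted here.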
17. Key Observation #1
“We shouldn’t need to provide stronger
guarantees than HBase - that is just asking for
a bad time.”
- Jon Hsieh (paraphrased)
18. HBase ACID
• Does NOT give you:
– Cross-row consistency
– Cross-table consistency
• Does give you:
– Durable data on success
– Visibility on success without partial rows
19. Key Observation #2
“Secondary indexing is inherently an easier
problem than full transactions… secondary
index updates are idempotent.”
- Lars Hofhansl
20. Idempotent Index Updates
• Doesn’t need full transactions
• Replay as many times as needed
• Can tolerate a little lag
– As long as we get the order right
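A small sketch of why replay is safe, assuming each index update carries an explicit timestamp (loosely modelled on HBase's versioned cells). The class and method names are hypothetical, and using timestamps to "get the order right" is an assumption about the mechanism rather than something the slide spells out.

```java
import java.util.TreeMap;

// Toy index table: row key -> (timestamp -> value).
class IdempotentIndexSketch {
    final TreeMap<String, TreeMap<Long, String>> rows = new TreeMap<>();

    // Applying the same (row, timestamp, value) again overwrites the cell with itself,
    // so the table ends up in exactly the same state.
    void apply(String row, long timestamp, String value) {
        rows.computeIfAbsent(row, r -> new TreeMap<>()).put(timestamp, value);
    }

    // Readers see the value with the highest timestamp, regardless of arrival order.
    String latest(String row) {
        TreeMap<Long, String> versions = rows.get(row);
        return versions == null ? null : versions.lastEntry().getValue();
    }
}
```

Because the timestamp is part of the update, replaying an older update after a newer one does not change what readers see.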
23. Durable Indexing: Standard Write Path
[Diagram: Client -> HRegion -> RegionCoprocessorHost -> WAL -> RegionCoprocessorHost -> MemStore]
25. Durable Indexing
[Diagram: RegionCoprocessorHost with an Indexer (Index Builder, WAL Updater) ahead of the WAL, marked “Durable!”; RegionCoprocessorHost with an Indexer after the WAL, writing to three Index Tables]
27. Durable Indexing
[Diagram: the standard write path (Client, HRegion, RegionCoprocessorHost, WAL, RegionCoprocessorHost, MemStore) with an Indexer at each RegionCoprocessorHost step and three Index Tables]
37. Failure Situations
• Before writing the WAL ✔
– Nothing is durable, nothing is visible
• After writing WAL, before index update ✔
– WAL Replay updates the index table and the primary table
• Mid-index update ✔
– WAL Replay finishes index update, primary table update
• After index updates, before primary ✔
– WAL Replay restores primary state, idempotently applies index updates
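The four cases can be exercised with a toy simulation of the write path: log first, then index, then primary. This is a sketch under simplifying assumptions, not the actual implementation: the class, the `CrashPoint` enum, and the in-memory "tables" are all hypothetical, and the index table is treated as simply durable even though a real index table has its own region and WAL.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Toy model of the indexed write path and its recovery.
class WalReplaySketch {
    // Hypothetical crash points, matching the four cases listed above (plus "no crash").
    enum CrashPoint { BEFORE_WAL, AFTER_WAL, MID_INDEX, AFTER_INDEX, NONE }

    // One WAL entry carries both the primary edit and the index rows derived from it.
    static final class WalEntry {
        final String primaryRow;
        final String primaryValue;
        final List<String> indexRows;
        WalEntry(String primaryRow, String primaryValue, List<String> indexRows) {
            this.primaryRow = primaryRow;
            this.primaryValue = primaryValue;
            this.indexRows = indexRows;
        }
    }

    final List<WalEntry> wal = new ArrayList<>();            // survives a crash
    final TreeMap<String, String> index = new TreeMap<>();   // survives a crash (separate table)
    TreeMap<String, String> memstore = new TreeMap<>();      // lost on crash

    // Returns true if the write completed, false if the simulated server died first.
    boolean write(WalEntry e, CrashPoint crash) {
        if (crash == CrashPoint.BEFORE_WAL) return false;
        wal.add(e);
        if (crash == CrashPoint.AFTER_WAL) return false;
        for (int i = 0; i < e.indexRows.size(); i++) {
            if (crash == CrashPoint.MID_INDEX && i == 1) return false; // die after the first index row
            index.put(e.indexRows.get(i), e.primaryRow);
        }
        if (crash == CrashPoint.AFTER_INDEX) return false;
        memstore.put(e.primaryRow, e.primaryValue);
        return true;
    }

    // Recovery: drop in-memory state, then re-apply every logged edit.
    // Re-applying index rows that were already written is harmless because the puts are idempotent.
    void replay() {
        memstore = new TreeMap<>();
        for (WalEntry e : wal) {
            for (String row : e.indexRows) index.put(row, e.primaryRow);
            memstore.put(e.primaryRow, e.primaryValue);
        }
    }
}
```

Whichever crash point is chosen, calling `replay()` leaves the index and the primary either both updated (the edit reached the WAL) or both untouched (it did not).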
38. Special Note: Failed Index Updates
• Index is corrupted
– Index table does not exist
– Index table does not have the right schema
– Etc.
• Fail-fast behavior
– Kill the whole server
– Forces WAL Replay to enforce correctness
– Modular enough to support alternative schemes
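One way to picture the "modular" part is a pluggable failure policy. The sketch below is illustrative only: the interface and class names are hypothetical, the `Runnable` is a stand-in for whatever actually stops the region server, and the second policy is just one conceivable alternative, not something described in the talk.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical pluggable policy: decides what happens when an index write cannot be applied.
interface IndexFailurePolicySketch {
    void onFailure(String indexTable, Exception cause);
}

// Fail-fast: ask the server to abort so that WAL replay can re-apply the update later.
class AbortServerPolicy implements IndexFailurePolicySketch {
    private final Runnable abortServer; // stand-in for the real server-abort hook

    AbortServerPolicy(Runnable abortServer) { this.abortServer = abortServer; }

    @Override
    public void onFailure(String indexTable, Exception cause) {
        abortServer.run();
    }
}

// One possible alternative scheme: remember which index tables failed and keep serving.
class RecordFailurePolicy implements IndexFailurePolicySketch {
    final List<String> failedTables = new ArrayList<>();

    @Override
    public void onFailure(String indexTable, Exception cause) {
        failedTables.add(indexTable);
    }
}
```

The fail-fast variant trades availability for correctness: the server goes down, but no acknowledged write is left without its index update.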
39. Key Points
• Custom KeyValues to enable index durability in primary table WAL
• Custom WALEdit Codec for index update with WAL Replay
• Will see index updates before primary
– Only a little bit of lag and never ‘wrong’
– Matches HBase consistency
• Fail-fast behavior to enforce correctness
40. Upcoming Work
• Performance testing
• Standard covered index managers
• Index cleanup on compaction
42. Conclusion
• Fully transparent to client
• Easy to build custom index maintenance
• Meets current HBase consistency guarantees
• Supports HBase 0.94.9+
– Coming to 0.96/0.98 soon!
43. hbase-index
https://github.com/forcedotcom/phoenix/tree/master/contrib/hbase-index
44. Detailed Blog Post
http://jyates.github.io/2013/06/11/hbase-consistent-secondary-indexing.html
45. Bonus!
• Usable as a standalone module
• Coming to phoenix*
– Built-in support
• Future: added to HBase core (?)
* https://github.com/forcedotcom/phoenix