3. Bronto Overview
Bronto Software provides a cloud-based marketing platform for organizations to drive revenue through their email, mobile, and social campaigns.
4. Bronto Contd.
● ESP (email service provider) for e-commerce retailers
● Our customers are marketers
● Charts, graphs, reports
● Market segmentation
● Automation
● We are also hiring
5. Where We Use HBase
● High volume scenarios
● Realtime data
● Batch processing
● HDFS staging area
● Sorting/Indexing not a priority
○ We are working on this
6. HBase Overview
● Implementation of Google’s BigTable
● Sparse, sorted, versioned map
● Built on top of HDFS
● Row level ACID
● Get, Put, Scan
● Assorted RMW operations (increment, checkAndPut; sketch below)
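
A minimal sketch of these operations against the classic 0.9x-era Java client; the table, family, and qualifier names are hypothetical:

    import java.io.IOException;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.*;
    import org.apache.hadoop.hbase.util.Bytes;

    public class BasicOps {
      public static void main(String[] args) throws IOException {
        HTable table = new HTable(HBaseConfiguration.create(), "contacts");

        // Put: write one cell (row, family, qualifier, value)
        Put put = new Put(Bytes.toBytes("row1"));
        put.add(Bytes.toBytes("d"), Bytes.toBytes("a"), Bytes.toBytes("1234 St."));
        table.put(put);

        // Get: read a single row back
        Result result = table.get(new Get(Bytes.toBytes("row1")));
        byte[] address = result.getValue(Bytes.toBytes("d"), Bytes.toBytes("a"));

        // Scan: iterate the sorted key range [row1, row2)
        ResultScanner scanner = table.getScanner(
            new Scan(Bytes.toBytes("row1"), Bytes.toBytes("row2")));
        for (Result r : scanner) { /* process each row */ }
        scanner.close();

        // RMW: atomic within a single row (row-level ACID)
        table.incrementColumnValue(Bytes.toBytes("row1"),
            Bytes.toBytes("s7"), Bytes.toBytes("opens"), 1L);
        table.checkAndPut(Bytes.toBytes("row1"), Bytes.toBytes("d"),
            Bytes.toBytes("a"), Bytes.toBytes("1234 St."), put);
        table.close();
      }
    }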
7. Tables Overview
Tables are lexicographically sorted key-value pairs of uninterpreted byte[]s. The keyspace is divided into regions of keys, and each region is hosted by exactly one machine.
8. Table Overview
[Diagram: a sorted table of keys (a, aa, b, bb, c, ca), each mapping to a byte[] value, split into three regions hosted across servers]
● R1: [a, b) holds a, aa
● R2: [b, c) holds b, bb
● R3: [c, d) holds c, ca
9. Operations
● Layers of complexity
● Normal failure modes
○ Hardware dies (or combusts)
○ Human error
● JVM
● HDFS considerations
● Lots of knobs
10. Cascading Failure
1. High write volume fragments heap
2. GC promotion failure
3. Stop the world GC
4. ZK timeout
5. Receive YouAreDeadException, die
6. Failover
7. Goto 1
11. Useful Tunings
● MSLAB enabled
● hbase.regionserver.handler.count
○ Increasing puts more IO load on RS
○ 50 is our sweet spot
● JVM tuning (config sketch after this list)
○ UseConcMarkSweepGC
○ UseParNewGC
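
A sketch of where these tunings live in a standard deployment; the handler count of 50 is this deck's sweet spot, not a shipped default:

    <!-- hbase-site.xml -->
    <property>
      <name>hbase.regionserver.handler.count</name>
      <value>50</value>
    </property>
    <property>
      <!-- MSLAB: allocate memstore memory in fixed-size chunks
           to curb the heap fragmentation described on slide 10 -->
      <name>hbase.hregion.memstore.mslab.enabled</name>
      <value>true</value>
    </property>

    # hbase-env.sh: RegionServer GC flags
    export HBASE_REGIONSERVER_OPTS="-XX:+UseConcMarkSweepGC -XX:+UseParNewGC"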
12. Monitoring Tools
● Nagios for hardware checks
● Cloudera Manager
○ Reporting and health checks
○ Apache Ambari and MapR provide similar tools
● Hannibal + custom scripts
○ Identify hot regions for splitting
13. Table Design
● Table design is deceptively simple
● Main Considerations:
○ Row key structure
○ Number of column families
● Know your queries in advance
14. Additional Context
● SAAS environment
○ “Twitter clone” model won’t work
● Thousands of users, millions of attributes
● Skewed customer base
○ Biggest clients have 10MM+ contacts
○ Smallest have thousands
15. Row Keys
● Most important decision
● The only (native) index in HBase
● Random reads and writes are fast
○ Sorted on disk and in memory
○ Bloom filters speed read performance (not in use)
16. Hotspotting
● Associated with monotonically increasing
keys
○ MySQL AUTO_INCREMENT
● Writes lock onto one region at a time
● Consequences:
○ Flush and compaction storms
○ $500K cluster limited by $10K machine
17. Row Key Advice
● Read/Write ratio should drive design
○ We pay a write time penalty for faster reads
● Identify queries you need to support
● Consider composite keys instead of indexes
● Bucketed/salted keys are an option (sketch after this list)
○ Distribute writes across N buckets
○ Rebucketing is difficult
○ Requires N reads, slow workers
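
A minimal salting sketch, assuming the bucket count N is fixed up front; the names here are illustrative, not Bronto's code:

    import java.util.Arrays;
    import org.apache.hadoop.hbase.util.Bytes;

    public class SaltedKeys {
      static final int BUCKETS = 16; // N; changing it later means rewriting keys

      // Prefix the key with a deterministic bucket byte so writes spread
      // across N regions instead of hammering one.
      static byte[] salted(byte[] key) {
        byte bucket = (byte) Math.abs(Arrays.hashCode(key) % BUCKETS);
        return Bytes.add(new byte[] { bucket }, key);
      }
    }

A range read must now issue N scans, one per bucket prefix, and merge the results; that is the "N reads, slow workers" cost above.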
18. Variable Width Keys
customer_hash::email
● Allows scans for a single customer (sketch after this list)
● Hashed id distributes customers
● Sorted by email address
○ Could also use reverse domain for gmail, yahoo, etc.
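
A sketch of composing this key; the MD5 hash choice and field types are illustrative assumptions:

    import java.security.MessageDigest;
    import java.security.NoSuchAlgorithmException;
    import org.apache.hadoop.hbase.util.Bytes;

    public class CustomerEmailKey {
      // Fixed-width hash prefix distributes customers across the keyspace;
      // the variable-width email suffix sorts one customer's contacts by address.
      static byte[] rowKey(long customerId, String email)
          throws NoSuchAlgorithmException {
        byte[] hash = MessageDigest.getInstance("MD5")
            .digest(Bytes.toBytes(customerId));
        return Bytes.add(hash, Bytes.toBytes(email));
      }
    }

Because every row for a customer shares the 16-byte hash prefix, a Scan bounded by that prefix returns exactly that customer's contacts, in email order.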
19. Fixed Width Keys
site::contact::create::email
● FuzzyRowFilter (sketch after this list)
○ Can fix site, contact, and reverse_create
○ Can search for any email address
○ Could use a fixed width encoding for domain
■ Search for just gmail, yahoo, etc
● Distributes sites and users
● Contacts sorted by create date
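
A sketch of the fuzzy match with hypothetical field widths (4-byte site, 8-byte contact, 8-byte reverse create date); in the mask, 0 means the byte at that position must match and 1 means any byte is accepted:

    import java.util.Arrays;
    import java.util.Collections;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.filter.FuzzyRowFilter;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.hadoop.hbase.util.Pair;

    public class FuzzyScan {
      static Scan scanForSite(int site) {
        byte[] key = new byte[20];  // site(4) + contact(8) + reverse_create(8)
        byte[] mask = new byte[20];
        System.arraycopy(Bytes.toBytes(site), 0, key, 0, 4); // fix the site id
        Arrays.fill(mask, 0, 4, (byte) 0);   // bytes 0-3 must match
        Arrays.fill(mask, 4, 20, (byte) 1);  // everything else is fuzzy

        Scan scan = new Scan();
        scan.setFilter(new FuzzyRowFilter(
            Collections.singletonList(new Pair<byte[], byte[]>(key, mask))));
        return scan;
      }
    }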
20. Column Families
● Groupings of named columns
● Versioning, compression, TTL
● Different from BigTable
○ BigTable: 100s
○ HBase: 1 or 2
21. Column Family Example
Id         d {VERSIONS => 2}                   s7 {TTL => 604800}
           a (address)       p (phone)         o:3-27 (open)   c:3-20 (click)
dfajkdh    byte[]            byte[]:555-5555   byte[]
hnvdzu9    byte[]:1234 St.   XXXX
hnvdzu9    byte[]:1233 St.
hnvdzu9                      XXXX                              byte[]
er9asyjk   byte[]:324 Ave
● PROTIP: Keep CF and qualifier names short
○ They are repeated on disk for every cell
● “d” supports 2 versions of each column, maps to demographics
● “s7” has a seven-day TTL, maps to stats kept for 7 days (shell syntax below)
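
The {...} options above are HBase shell syntax; creating such a table would look like this (the table name is hypothetical):

    create 'contacts', {NAME => 'd', VERSIONS => 2}, {NAME => 's7', TTL => 604800}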
22. Column Families In Depth
[Diagram: region “my_table,,1328551097416.12921bbc0c91869f88ba6a044a6a1c50.” with two column families, f1 and f2; each family has its own MemStore and its own StoreFiles (s1, s2, s3) in HDFS]
● StoreFile(s) for each CF in region
● Sparse
● One memstore per CF
○ Must flush together
● Compactions happen at region level
23. Compactions
● Rewrites StoreFiles
○ Improves read performance
○ IO Intensive
● Region scope
● Used to take > 50 hours
● Custom script took it down to 18 hours (sketch after this list)
○ Can (theoretically) run during the day
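
Not the actual script, but a minimal sketch of the idea against the 0.9x-era admin API: walk the table's regions, skip cold ones, and pace major compactions to spread the IO:

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HRegionInfo;
    import org.apache.hadoop.hbase.client.HBaseAdmin;
    import org.apache.hadoop.hbase.client.HTable;

    public class PacedCompactions {
      public static void main(String[] args) throws Exception {
        HBaseAdmin admin = new HBaseAdmin(HBaseConfiguration.create());
        HTable table = new HTable(HBaseConfiguration.create(), args[0]);
        for (HRegionInfo region : table.getRegionsInfo().keySet()) {
          if (isCold(region)) continue; // skip regions no longer taking writes
          admin.majorCompact(region.getRegionName());
          Thread.sleep(60000L); // pace requests to spread the IO load
        }
      }

      // Hypothetical check: with date-based rowkeys, a region's key range
      // reveals whether it can still receive writes.
      static boolean isCold(HRegionInfo region) {
        return false; // placeholder; inspect region.getEndKey() in practice
      }
    }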
25. The Table From Hell
● 19 Column Families
● 60% of our region count
● Skewed write pattern
○ KB size store files
○ Frequent compaction storms
○ hbase.hstore.compaction.min.size (HBASE-5461)
● Moved to its own cluster
26. And yet...
● Cluster remained operational
○ Table is still in use today
● Met read and write demand
● Regions only briefly active
○ Rowkeys by date and customer
27. What saved us
● Keyed by customer and date
● Effectively write once
○ Kept “active” region count low
● Custom compaction script
○ Skipped old regions
● More hardware
● Were able to selectively migrate
28. Column Family Advice
● Bad choice for fine-grained partitioning
● Good for
○ Similarly typed data
○ Varying versioning/retention requirements
● Prefer intra row scans
○ CF and qualifiers are sorted
○ ColumnRangeFilter (example below)
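
A sketch of an intra-row scan with ColumnRangeFilter; the family and qualifier layout are hypothetical, reusing the stats CF from slide 21:

    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.filter.ColumnRangeFilter;
    import org.apache.hadoop.hbase.util.Bytes;

    public class IntraRowScan {
      // Qualifiers sort within a family, so a wide row can be read in
      // slices: here, only the March "open" stats for one contact.
      static Get marchOpens(byte[] rowkey) {
        Get get = new Get(rowkey);
        get.addFamily(Bytes.toBytes("s7"));
        get.setFilter(new ColumnRangeFilter(
            Bytes.toBytes("o:3-01"), true,    // min qualifier, inclusive
            Bytes.toBytes("o:3-31"), false)); // max qualifier, exclusive
        return get;
      }
    }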