Hypertable is an open source, massively scalable database modeled after Google's Bigtable. It is written in C++ for high performance and supports Apache Thrift interfaces for popular languages. It has been under active development for over 8 years and supports features such as namespaces, atomic counters, secondary indexes, regex filtering, and Hadoop integration. It is designed for horizontal scalability and sparse data structures, which allows high throughput on both reads and writes even with large datasets.
4. Highlights
• Modeled after Google’s Bigtable database
• High Performance Implementation (C++)
• Apache Thrift interface for all popular languages (Java, PHP, Ruby, Python, Perl, etc.)
• Broad Hadoop distribution support
o Apache 2
o Cloudera CDH3, CDH4, CDH5
o IBM BigInsights 3
o Hortonworks HDP2
o MapR
• Actively developed for 8 years
5. Open Source
• Licensed under the GPL
• Hosted on GitHub
o git://github.com/hypertable/hypertable.git
o https://github.com/hypertable/hypertable.git
• Online source documentation
• Mailing Lists
o groups.google.com/group/hypertable-user
o groups.google.com/group/hypertable-dev
6. Bigtable
• Google’s most successful scalable database
• Bigtable underpins 100+ Google services
• YouTube, Blogger, Google Earth, Google Maps, Orkut, Gmail, Google Analytics, Google Book Search, Google Code, Crawl Database …
• Data is physically ordered by primary key – it’s not a distributed hash table
7. How Hypertable Differs From A Traditional RDBMS
• Horizontally Scalable
• Sparse Table Structure
o Variable number of columns per-row
o Rows can have billions of columns
• Cells can have multiple time stamped versions
8. Database Model
• Sparse, two-dimensional tables
• Cells can have multiple versions
• Cells addressed by 4-part key
o Row
o Column family
o Column qualifier
o Timestamp
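The 4-part key and the physical ordering by key can be illustrated with a small in-memory model. This is only a sketch in Python, not Hypertable's storage format: the `CellKey` class, the sample rows, and the newest-first ordering of versions are assumptions made for illustration.

```python
from typing import NamedTuple

class CellKey(NamedTuple):
    """Hypothetical 4-part cell address: row, column family, qualifier, timestamp."""
    row: str
    family: str
    qualifier: str
    timestamp: int

def sort_key(key: CellKey):
    # Order by row, then family, then qualifier; newest version first (assumed).
    return (key.row, key.family, key.qualifier, -key.timestamp)

# Illustrative data: a sparse table where rows hold different columns.
cells = {
    CellKey("user2", "profile", "email", 100): "old@example.com",
    CellKey("user1", "profile", "email", 100): "a@example.com",
    CellKey("user2", "profile", "email", 200): "new@example.com",
    CellKey("user1", "address", "", 150): "1 Main St",
}

# Keys kept in sorted order, so a range of rows is a contiguous slice.
ordered = sorted(cells, key=sort_key)

def versions(row, family, qualifier):
    """Return all stored versions of one cell, newest first."""
    return [cells[k] for k in ordered
            if (k.row, k.family, k.qualifier) == (row, family, qualifier)]

print(versions("user2", "profile", "email"))
```

Because the keys are sorted rather than hashed, all cells of a row sit next to each other, which is what makes row-range scans possible.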
29. Cluster Task Automation Tool
• ht_cluster
• Modeled after Capistrano
• Role
o Designates a function or service and the set of machines that will perform that function or service
o Examples: Hyperspace, Master, Slave (RangeServer), ThriftBroker
o Machines can belong to one or more roles
• Task
o Script written for specific roles and used to manage the associated function or service
o Examples: start_hyperspace, stop_hyperspace
34. Thrift Broker Metrics
Metric              Units
Connections         count
Requests            requests/s
Errors              errors/s
Virtual Memory      GB
Resident Memory     GB
Heap Size           GB
Heap Slack Bytes    GB
CPU user            percentage
CPU sys             percentage
Version             string
35. Range Server Metrics
Metric                 Units
Scans                  scans/s
Updates                updates/s
Bytes Returned         bytes/s
Bytes Scanned          bytes/s
Byte Scan Yield        percentage
Bytes Written          bytes/s
Cells Returned         cells/s
Cells Scanned          cells/s
Cell Scan Yield        percentage
Outstanding Scanners   count
Request Backlog        count
Major Compactions      count
Minor Compactions      count
Merging Compactions    count
GC Compactions         count
Virtual Memory         GB
Resident Memory        GB
Heap Size              GB
Heap Slack Bytes       GB
Tracked Memory         GB
CPU user               percentage
CPU sys                percentage
36. Range Server Metrics
Metric                 Units
Ranges                 count
CellStores             count
Block Cache Hits       percentage
Block Cache Memory     GB
Block Cache Fill       GB
Query Cache Hits       percentage
Query Cache Memory     GB
Query Cache Fill       GB
Version                string
37. FS Broker Metrics
Metric              Units
Read Throughput     MB/s
Write Throughput    MB/s
Syncs               syncs/s
Sync Latency        milliseconds
Errors              count
JVM GCs             count
JVM GC Time         milliseconds
JVM Heap Size       GB
Virtual Memory      GB
Resident Memory     GB
Heap Size           GB
Heap Slack Bytes    GB
CPU user            percentage
CPU sys             percentage
Version             string
38. Master and Hyperspace Metrics
Master
Metric              Units
Operations          operations/s
Virtual Memory      GB
Resident Memory     GB
Heap Size           GB
Heap Slack Bytes    GB
CPU user            percentage
CPU sys             percentage
Version             string
Hyperspace
Metric              Units
Requests            requests/s
Virtual Memory      GB
Resident Memory     GB
Heap Size           GB
Heap Slack Bytes    GB
CPU user            percentage
CPU sys             percentage
Version             string
39. Slow Query Log
• ThriftBroker feature
• Logs queries that take longer than 10 seconds
• Log line format
o End time (seconds)
o Start time (seconds)
o Function called
o Client IP/port
o Latency (milliseconds)
o Sub-scanner count
o Bytes Returned
o Bytes Scanned
o Disk read
o Servers contacted
o Namespace
o HQL representation of query
43. Atomic Counters
• Column option:
CREATE TABLE counts (
url COUNTER
);
• Modified via existing API using specially formatted values:
Value Format   Description
[+]n           Increment counter by n
-n             Decrement counter by n
=n             Reset counter to n
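The value formats in the table can be read as a small instruction set applied to the stored count. The following Python sketch models only that interpretation; `apply_counter_value` is a hypothetical helper, not part of the Hypertable API, and the real server-side handling may differ.

```python
def apply_counter_value(current: int, value: str) -> int:
    """Apply one specially formatted counter value to the current count."""
    value = value.strip()
    if value.startswith("="):
        return int(value[1:])            # =n resets the counter to n
    if value.startswith("-"):
        return current - int(value[1:])  # -n decrements the counter by n
    if value.startswith("+"):
        return current + int(value[1:])  # +n increments the counter by n
    return current + int(value)          # bare n also increments (the + is optional)

count = 0
for v in ["+1", "5", "-2", "=10", "+3"]:
    count = apply_counter_value(count, v)

print(count)
```

Starting from 0, the sequence above goes 1, 6, 4, 10, 13. Because the increment is expressed in the value itself, a client does not need to read the counter before updating it.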
44. Secondary Indexes
Total Cells Inserted: 1 billion
Total Time Taken: 45 minutes
Aggregate Throughput (inserts/s): 372,362
Aggregate Throughput (bytes/s): 14,763,300
• Six test machines
o Dual Six-core Opteron HE Processors
o 24 GB RAM
o 4X 2TB SATA drives
• Single indexed column
o Key: randomly generated 20-byte integer
o Value: two randomly chosen words from /usr/share/dict/words
45. Secondary Indexes (HQL)
CREATE TABLE products (
title,
section,
info,
category,
INDEX section,
INDEX info,
QUALIFIER INDEX info,
QUALIFIER INDEX category
);
46. Secondary Indexes
SELECT title
FROM products
WHERE info:actor = "Jack Nicholson";
B00002VWE0 title Five Easy Pieces (1970)
B002VWNIDG title The Shining (1980)
47. Secondary Indexes
SELECT title, info:author
FROM products
WHERE info:author =~ /^Stephen [PK]/;
0307743659 title The Shining Mass Market Paperback
0307743659 info:author Stephen King
0321776402 title C++ Primer Plus (6th Edition) (Developer's Library)
0321776402 info:author Stephen Prata
48. Secondary Indexes
SELECT title
FROM products
WHERE Exists(info:studio);
B00002VWE0 title Five Easy Pieces (1970)
B000Q66J1M title 2001: A Space Odyssey [Blu-ray]
B002VWNIDG title The Shining (1980)
49. Secondary Indexes
SELECT title
FROM products
WHERE info:author =~ /^Stephen P/ OR
info:publisher =~ /^Anchor/;
0307743659 title The Shining Mass Market Paperback
0321776402 title C++ Primer Plus (6th Edition) (Developer's Library)
50. Secondary Indexes
SELECT title
FROM products
WHERE info:author =~ /^Stephen [PK]/ AND
info:publisher =~ /^Anchor/;
0307743659 title The Shining Mass Market Paperback
51. Secondary Indexes
SELECT title
FROM products
WHERE ROW =^ 'B' AND
info:actor = 'Jack Nicholson';
B00002VWE0 title Five Easy Pieces (1970)
B002VWNIDG title The Shining (1980)
52. Regex Filtering
• Google’s RE2 regular expression engine
o Extremely fast (up to 50X Java regex)
o Searches run in time linear in the size of the input
o Searches constrained to a fixed amount of memory
• Supported Searches:
o Row key
o Column qualifier
o Value
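The three supported search targets can be pictured as independent regex predicates over each cell. The sketch below uses Python's built-in `re` module, which is a backtracking engine and not RE2, so it shows only the matching semantics and not the linear-time or fixed-memory guarantees. The `regex_filter` helper is hypothetical; the sample cells are taken from the product examples in this deck.

```python
import re

# (row key, column family, column qualifier, value)
cells = [
    ("0307743659", "info", "author", "Stephen King"),
    ("0321776402", "info", "author", "Stephen Prata"),
    ("B00002VWE0", "info", "actor", "Jack Nicholson"),
    ("B000Q66J1M", "info", "actor", "Keir Dullea"),
    ("B002VWNIDG", "title", "", "The Shining (1980)"),
]

def regex_filter(cells, row=None, qualifier=None, value=None):
    """Keep cells whose row key, qualifier, and value match the given patterns.

    A pattern left as None places no constraint on that part of the cell.
    """
    def ok(pattern, text):
        return pattern is None or re.search(pattern, text) is not None

    return [c for c in cells
            if ok(row, c[0]) and ok(qualifier, c[2]) and ok(value, c[3])]

by_row = regex_filter(cells, row="2")                  # row keys containing "2"
by_qualifier = regex_filter(cells, qualifier="^a")     # qualifiers starting with "a"
by_value = regex_filter(cells, value=r"^Stephen [PK]") # values by prefix pattern

print(len(by_row), len(by_qualifier), len(by_value))
```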
53. Regex Filtering
SELECT info:/^a/ FROM products;
0307743659 info:author Stephen King
0321321928 info:author Stephen C. Dewhurst
0321776402 info:author Stephen Prata
B00002VWE0 info:actor Karen Black
B00002VWE0 info:actor Jack Nicholson
B000Q66J1M info:actor Gary Lockwood
B000Q66J1M info:actor Keir Dullea
B002VWNIDG info:actor Shelley Duvall
B002VWNIDG info:actor Jack Nicholson
54. Regex Filtering
SELECT title
FROM products
WHERE ROW REGEXP "2";
0321321928 title C++ Common Knowledge: Essential Intermediate Programming [Paperback]
0321776402 title C++ Primer Plus (6th Edition) (Developer's Library)
B00002VWE0 title Five Easy Pieces (1970)
B002VWNIDG title The Shining (1980)
55. Regex Filtering
SELECT title
FROM products
WHERE VALUE REGEXP "(";
0321776402 title C++ Primer Plus (6th Edition) (Developer's Library)
B00002VWE0 title Five Easy Pieces (1970)
B002VWNIDG title The Shining (1980)
57. • Load data from HT to Hive and vice-versa
• Use Hive types
• Use Hive QL (joins, aggregations)
• Low latency data warehousing
• Uses Hypertable’s native MapReduce Input/Output format
58. Column Family Options
• TTL=<t>
o “time to live”
o Remove cells that are older than <t>
• MAX_VERSIONS=<n>
o Keep only most recent <n> cell versions
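The effect of the two options on the versions of a single cell can be sketched as follows. This is a Python model of the semantics described above, not Hypertable code: `apply_family_options` is a hypothetical helper, and timestamps and the TTL are assumed to share one unit (seconds here).

```python
def apply_family_options(versions, now, ttl=None, max_versions=None):
    """Return the versions of one cell that survive, newest first.

    versions: list of (timestamp, value) pairs for a single cell.
    ttl: drop versions older than this many time units (None = keep all).
    max_versions: keep only this many of the newest versions (None = keep all).
    """
    kept = sorted(versions, key=lambda v: v[0], reverse=True)
    if ttl is not None:
        kept = [(ts, val) for ts, val in kept if now - ts <= ttl]
    if max_versions is not None:
        kept = kept[:max_versions]
    return kept

history = [(100, "a"), (200, "b"), (300, "c"), (400, "d")]

after_ttl = apply_family_options(history, now=450, ttl=200)
after_max = apply_family_options(history, now=450, max_versions=3)
after_both = apply_family_options(history, now=450, ttl=200, max_versions=1)

print(after_ttl, after_max, after_both)
```

With `now=450` and `ttl=200`, only the versions written at 300 and 400 remain; adding `max_versions=1` leaves just the newest one.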
59. Access Groups
CREATE TABLE User (
name,
address,
photo,
profile,
ACCESS GROUP default (name, address, photo),
ACCESS GROUP profile (profile)
);
61. Group Commit
• Supports highly concurrent updates
• Trades average latency for better throughput
• By default, commit log writes are auto-coalesced
• Commit log write interval can be statically configured per-table:
CREATE TABLE counts (
url,
domain
) GROUP_COMMIT_INTERVAL=100;
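The latency-for-throughput trade can be seen in a toy model in which every update that arrives within the same interval is written to the commit log together. This Python sketch is an illustration only: `group_commits` is a hypothetical function, and treating the interval as milliseconds with fixed window boundaries is an assumption, not a statement about how the RangeServer schedules writes.

```python
from collections import defaultdict

def group_commits(updates, interval_ms):
    """Coalesce (arrival_ms, payload) updates into one batch per interval window.

    Returns a list of (flush_time_ms, [payloads]) pairs, one per commit log write.
    """
    batches = defaultdict(list)
    for arrival_ms, payload in updates:
        window = arrival_ms // interval_ms  # index of the window the update falls in
        batches[window].append(payload)
    # Each batch is flushed at the end of its window as a single log write.
    return [((w + 1) * interval_ms, batches[w]) for w in sorted(batches)]

updates = [(5, "a"), (40, "b"), (99, "c"), (120, "d"), (310, "e")]
writes = group_commits(updates, interval_ms=100)

print(writes)
```

Five updates become three log writes. The update that arrived at 5 ms waits until 100 ms to be committed, which is the added latency paid for issuing fewer, larger writes.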
62. Caching
• Block Cache
o Caches CellStore blocks
o Can be configured to store blocks compressed or uncompressed (default = compressed)
o Dynamically adjusted size based on workload
• Query Cache
o Caches query results
o Caches single row queries only
63. Compression
• Cell Store blocks are compressed
• Commit Log updates are compressed
• Supported Compression Schemes:
bmz, lzo, quicklz, snappy, zlib, none
• Quicklz performance numbers:
Language   Compression Speed (MB/s)   Decompression Speed (MB/s)
C++        308                        358
Java       127                        95
65. Hypertable vs. HBase
• Modeled after test described in Bigtable paper
• Hypertable 0.9.5.5 vs. HBase 0.90.4
• 16-node Cluster
o CPU: 2X AMD C32 Six-core model 4170 HE 2.1GHz
o RAM: 24GB
o Disk: 4X 2TB SATA
• Tests Run
o Random Write
o Scan
o Random Read Zipfian
o Random Read Uniform
70. Case Study: Noah System
• Operational Data Store
• System metrics
o CPU
o Memory
o IO
o Network
• Application metrics
o Web
o DB
o Caches
• Business metrics
o Usage
o Revenue
71. System Requirements
• Storage Capacity
o Up to 100TB
o Up to 1 trillion records
• Automatic Sharding
o Irregular data growth patterns
• Heavy Writes
o ~30K inserts/s
• Fast Reads of Recent Data
• Table Scans
73. Case Study: Rediff
• 2nd Largest Indian Internet Portal
• Rediffmail
o One of the world’s largest email services
o Over 100 Million registered users
• Active Deployments
o Rediffmail
o Email SPAM classification
o News Crawl Database
o Recommendation System