John Leach, Co-Founder and CTO of Splice Machine, with 15+ years of software development and machine learning experience, will discuss how to use HBase co-processors to build an ANSI SQL-99 database with 1) parallelization of SQL execution plans, 2) ACID transactions with snapshot isolation, and 3) consistent secondary indexing.
Transactions are critical in traditional RDBMSs because they ensure reliable updates across multiple rows and tables. Most operational applications require transactions, but even analytics systems use transactions to reliably update secondary indexes after a record insert or update.
In the Hadoop ecosystem, HBase is a key-value store with real-time updates, but it does not have multi-row, multi-table transactions, secondary indexes, or a robust query language like SQL. Combining SQL with a full transactional model over HBase opens a whole new set of OLTP and OLAP use cases for Hadoop that were traditionally reserved for RDBMSs like MySQL or Oracle. Moreover, a transactional HBase system has the advantage of scaling out with commodity servers, leading to a 5x-10x cost savings over traditional databases like MySQL or Oracle.
HBase co-processors, introduced in release 0.92, provide a flexible and high-performance framework for extending HBase. In this talk, we show how we used HBase co-processors to support a full ANSI SQL RDBMS without modifying the core HBase source. We will discuss how endpoint co-processors are used to serialize SQL execution plans over to regions so that computation is local to where the data is stored. Additionally, we will show how observer co-processors simultaneously support both transactions and secondary indexing.
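As a rough illustration of the observer approach, here is a minimal sketch of a RegionObserver co-processor that maintains a secondary index on writes. It assumes 0.98-era HBase APIs; the table, column family, and column names are hypothetical, and it ignores the transactional bookkeeping a real implementation such as Splice Machine's also has to do.

// Minimal sketch: keep a secondary index up to date from a prePut hook.
// Assumes HBase 0.98-era APIs; table/column names are hypothetical.
import java.io.IOException;
import java.util.List;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Durability;
import org.apache.hadoop.hbase.client.HTableInterface;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.coprocessor.BaseRegionObserver;
import org.apache.hadoop.hbase.coprocessor.ObserverContext;
import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
import org.apache.hadoop.hbase.regionserver.wal.WALEdit;
import org.apache.hadoop.hbase.util.Bytes;

public class EmailIndexObserver extends BaseRegionObserver {
  private static final byte[] CF = Bytes.toBytes("d");
  private static final byte[] EMAIL = Bytes.toBytes("email");

  @Override
  public void prePut(ObserverContext<RegionCoprocessorEnvironment> ctx,
                     Put put, WALEdit edit, Durability durability) throws IOException {
    List<Cell> cells = put.get(CF, EMAIL);
    if (cells.isEmpty()) {
      return; // this write does not touch the indexed column
    }
    byte[] indexedValue = CellUtil.cloneValue(cells.get(0));
    // Index row key = indexed value + base row key, so a prefix scan on the
    // index table finds every base row with that value.
    Put indexPut = new Put(Bytes.add(indexedValue, put.getRow()));
    indexPut.add(CF, Bytes.toBytes("src"), put.getRow());
    HTableInterface indexTable =
        ctx.getEnvironment().getTable(TableName.valueOf("USERS_EMAIL_IDX"));
    try {
      indexTable.put(indexPut);
    } finally {
      indexTable.close();
    }
  }
}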
The talk will also discuss how Splice Machine extended the work of Google Percolator, Yahoo Labs’ OMID, and the University of Waterloo’s research on distributed snapshot isolation for transactions. Lastly, performance benchmarks will be provided, including full TPC-C and TPC-H results that show how Hadoop/HBase can be a replacement for traditional RDBMS solutions.
To view the accompanying slide deck: http://www.slideshare.net/ChicagoHUG/
2. 2
Data Doubling Every 2 Years…
Driven by web, social, mobile, and Internet of Things
Source: 2013 IBM Briefing Book
3. 3
Traditional RDBMSs Overwhelmed…
Scale-up becoming cost-prohibitive
“Oracle is too darn expensive!”
“My DB is hitting the wall.”
“Users keep getting those spinning beach balls.”
“We have to throw data away.”
“Our reports take forever.”
4. 4
Scale-Out: The Future of Databases
Dramatic improvement in price/performance
Scale Up (increase server size) vs. Scale Out (more small servers)
5. 5
Who are We?
THE ONLY HADOOP RDBMS
Replace your old RDBMS with a scale-out SQL database
Affordable, Scale-Out
ACID Transactions
No Application Rewrites
10x Better Price/Perf
6. 6
Case Study: Harte-Hanks
Overview
Digital marketing services provider
Real-time campaign management
Complex OLTP and OLAP environment
Challenges
Oracle RAC too expensive to scale
Queries too slow – even up to ½ hour
Getting worse – expect 30-50% data growth
Looked for 9 months for a cost-effective solution
Solution diagram: cross-channel campaigns, real-time personalization, real-time actions
Initial Results
¼ the cost with commodity scale-out
3-7x faster through parallelized queries
10-20x price/perf with no application, BI, or ETL rewrites
7. Use Cases
Digital Marketing
Campaign management
Unified Customer Profile
Real-time personalization
Data Lake
Operational reporting and analytics
Operational Data Stores
Fraud Detection
Personalized Medicine
Internet of Things
Network monitoring
Cyber-threat security
Wearables and sensors
8. 8
Reference Architecture: Operational Apps
Provide affordable scale-out for applications with a high concurrency of real-time reads/writes
Diagram components: 3rd-party data sources, operational app (e.g., Unica Campaign Mgmt), customers, employees, operational reports & analytics
9. 9
Reference Architecture: Operational Data Lake
Offload real-time reporting and analytics from expensive OLTP and DW systems
Diagram components: OLTP systems (ERP, CRM, Supply Chain, HR, …) feeding in via stream or batch updates; Operational Data Lake; ETL into Data Warehouse and Datamart; serving ad hoc analytics, executive business reports, operational reports & analytics, and real-time, event-driven apps
10. 10
Reference Architecture: Unified Customer Profile
Improve marketing ROI with deeper customer intelligence and better cross-channel coordination
Diagram components: Unified Customer Profile (aka DMP); data sources: social feeds, web/eCommerce clickstreams, 1st-party/CRM data, 3rd-party data (e.g., Acxiom), ad performance data (e.g., DoubleClick), email marketing data, call center data, POS data (stream or batch updates); consumers: operational reports for campaign performance (BI tools), datamart, real-time personalization data for the website, ad exchange via demand-side platform (DSP), email marketing app, ad hoc audience segmentation (BI tools)
12. 12
Combines the Best of Both Worlds
Hadoop: scale-out on commodity servers, proven to 100s of petabytes, efficiently handles sparse data, extensive ecosystem
RDBMS: ANSI SQL, real-time updates, ACID transactions, ODBC/JDBC support
14. 14
Proven Building Blocks: Hadoop and Derby
APACHE DERBY
ANSI SQL-99 RDBMS
Java-based
ODBC/JDBC Compliant
APACHE HBASE/HDFS
Auto-sharding
Real-time updates
Fault-tolerance
Scalability to 100s of PBs
Data replication
15. Derby
100% JAVA ANSI SQL RDBMS – CLI, JDBC, embedded
Modular, Lightweight, Unicode
Authentication and Authorization
Concurrency
Project History
Started as Cloudscape in 1996
Acquired by Informix… then IBM…
IBM contributed the code to the Apache Software Foundation in 2004
An active Apache project with conservative development
DB2 influence – shares many of the same limits and features
Has Oracle’s stamp of approval – shipped as Java DB and included in JDK 6
16. Derby Advanced Features
Java Stored Procedures (see the example after this list)
Triggers
Two-phase commit (XA Support)
Updatable SQL Views
Full Transaction Isolation Support
Encryption
Custom Functions
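To make the stored-procedure item above concrete: in Derby a stored procedure is just a public static Java method, registered with CREATE PROCEDURE and an EXTERNAL NAME clause. The class, procedure, and table names below are hypothetical.

// A Derby stored procedure is a public static Java method.
// "jdbc:default:connection" gives it the calling session's connection and transaction.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class AuditProcs {
  public static void logEvent(String source, int severity) throws SQLException {
    Connection conn = DriverManager.getConnection("jdbc:default:connection");
    try (PreparedStatement ps = conn.prepareStatement(
        "INSERT INTO AUDIT_LOG (SOURCE, SEVERITY) VALUES (?, ?)")) {
      ps.setString(1, source);
      ps.setInt(2, severity);
      ps.executeUpdate();
    }
  }
}

// Registered once with DDL such as:
//   CREATE PROCEDURE LOG_EVENT(IN SOURCE VARCHAR(64), IN SEVERITY INT)
//     LANGUAGE JAVA PARAMETER STYLE JAVA MODIFIES SQL DATA
//     EXTERNAL NAME 'AuditProcs.logEvent'
// and then invoked from SQL with: CALL LOG_EVENT('web', 3)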
17. Splice SQL Processing
PreparedStatement ps = conn.prepareStatement("SELECT * FROM T WHERE ID = ?");
1. Look up in cache using exact text match (skip to 6 if plan found in cache; see the sketch after this list)
2. Parse using JavaCC generated parser
3. Bind to dictionary, acquire types
4. Optimize Plan
5. Generate code for plan
6. Create instance of plan
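A minimal sketch of step 1, the plan cache keyed by the exact statement text. ExecutionPlan and the compiler callback are hypothetical stand-ins for Derby's generated plan and its compile pipeline (steps 2-5).

import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

public class PlanCache {
  /** Hypothetical stand-in for the generated, executable plan. */
  public interface ExecutionPlan { }

  private final ConcurrentHashMap<String, ExecutionPlan> cache = new ConcurrentHashMap<>();
  private final Function<String, ExecutionPlan> compiler;

  public PlanCache(Function<String, ExecutionPlan> compiler) {
    // The callback performs steps 2-5: parse (JavaCC), bind to the dictionary,
    // optimize, and generate byte code for the plan.
    this.compiler = compiler;
  }

  public ExecutionPlan planFor(String sqlText) {
    // Step 1: exact text match - even a change in case or whitespace misses
    // the cache and compiles a new plan.
    return cache.computeIfAbsent(sqlText, compiler);
  }
}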
18. Splice Details
Parse Phase
Forms explicit tree of query nodes representing statement
Generate Phase
Generates Java byte code (an Activation) directly into an in-memory byte array
Loaded with a special ClassLoader that defines the class straight from that byte array (see the sketch after this list)
Binds arguments to proper types
Optimize Phase
Determine feasible join strategies
Optimize based on cost estimates
Execute Phase
Instantiates arguments to represent specific statement state
Expressions are methods on Activation
Trees of ResultSets generated that represent the state of the query
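A minimal sketch of the Generate Phase's class loading described above: the generated Activation byte code never touches disk; it is defined straight from the in-memory byte array. Class names are hypothetical.

// Defines a generated class (e.g., an Activation) from an in-memory byte array.
public class ByteArrayClassLoader extends ClassLoader {

  public ByteArrayClassLoader(ClassLoader parent) {
    super(parent);
  }

  public Class<?> defineGeneratedClass(String className, byte[] byteCode) {
    return defineClass(className, byteCode, 0, byteCode.length);
  }
}

// Usage (byte code produced by the Generate Phase):
//   byte[] generated = ...;
//   Class<?> activationClass =
//       new ByteArrayClassLoader(ByteArrayClassLoader.class.getClassLoader())
//           .defineGeneratedClass("org.example.GeneratedActivation42", generated);
//   Object activation = activationClass.newInstance();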
19. Splice Modifications to Derby
Derby Component | Derby | Splice Version
Store | Block file-based | HBase tables
Indexes | B-tree | Dense index in an HBase table
Concurrency | Lock-based (ARIES) | MVCC – snapshot isolation
Project-Restrict Plan | Predicates on centralized file scanner | Predicates pushed to shards and locally applied
Aggregation Plan | Aggregation serially computed | Aggregations pushed to shards and spliced together
Join Plan | Centralized Hash and NLJ chosen by optimizer | Distributed Broadcast, Sort-Merge, Merge, NLJ, and Batch NLJ chosen by optimizer
Resource Management | Number of connections and memory limitations | Task resource queues and write governor
20. 20
HBase: Proven Scale-Out
Auto-sharding
Scales with commodity hardware
Cost-effective from GBs to PBs
High availability through failover and replication
LSM-trees
21. 21
Distributed, Parallelized Query Execution
Parallelized computation across cluster
Moves computation to the data
Utilizes HBase co-processors
No MapReduce
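Splice Machine's own endpoint co-processors are not shown here, but HBase ships a built-in example of the same idea: the aggregation endpoint computes partial results inside each region and the client splices them together. The API below is roughly the 0.98-era client and varies between HBase versions; the table and column names are hypothetical, and the table must have org.apache.hadoop.hbase.coprocessor.AggregateImplementation loaded.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.coprocessor.AggregationClient;
import org.apache.hadoop.hbase.client.coprocessor.LongColumnInterpreter;
import org.apache.hadoop.hbase.util.Bytes;

public class RegionLocalAggregation {
  public static void main(String[] args) throws Throwable {
    Configuration conf = HBaseConfiguration.create();
    AggregationClient aggClient = new AggregationClient(conf);

    Scan scan = new Scan();
    scan.addColumn(Bytes.toBytes("d"), Bytes.toBytes("amount"));

    // Each region computes its partial sum locally (no MapReduce);
    // the client only merges the per-region results.
    Long total = aggClient.sum(TableName.valueOf("ORDERS"),
                               new LongColumnInterpreter(), scan);
    System.out.println("SUM(amount) = " + total);
  }
}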
22. Splice HBase Extensions
Asynchronous Write Pipeline
Non-blocking, flushable writes
Writes data, indexes, and constraints (index) concurrently
Batches writes in chunks for bulk WAL edits vs. single WAL edits (a loose client-side analogy is sketched after this list)
Synchronization-free internal scanner vs. synchronized external scanner
Linux Scheduler Modeled Resource Manager
Resource queues that handle DDL, DML, dictionary, and maintenance operations
Sparse Data Support
Efficiently store sparse data
Does not store nulls
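The write pipeline itself is internal to Splice Machine, but the batching idea can be pictured with the plain HBase client: send puts in chunks so many edits travel together instead of one round trip per row. API shown is the 0.94/0.98-era HTable client; table and column names are hypothetical.

import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class BatchedWrites {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "EVENTS");
    try {
      List<Put> chunk = new ArrayList<Put>();
      for (int i = 0; i < 10000; i++) {
        Put put = new Put(Bytes.toBytes("row-" + i));
        put.add(Bytes.toBytes("d"), Bytes.toBytes("v"), Bytes.toBytes("value-" + i));
        chunk.add(put);
        if (chunk.size() == 1000) {   // flush in chunks rather than per row
          table.put(chunk);
          chunk.clear();
        }
      }
      if (!chunk.isEmpty()) {
        table.put(chunk);
      }
    } finally {
      table.close();
    }
  }
}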
23. Schema Advantages
Non-Blocking Schema Changes
Add columns in a DDL transaction
No read/write locks while adding columns (example after this list)
Sparse Data Support
Efficiently store sparse data
Does not store nulls
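From the application's point of view, a non-blocking column add is just ordinary DDL over JDBC; the "no locks" behavior happens on the Splice side. The JDBC URL, table, and column names below are hypothetical.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class AddColumn {
  public static void main(String[] args) throws Exception {
    // Hypothetical connection string; adjust host and credentials for a real cluster.
    try (Connection conn =
             DriverManager.getConnection("jdbc:splice://localhost:1527/splicedb");
         Statement stmt = conn.createStatement()) {
      // Readers and writers of CUSTOMERS keep running while the column is added.
      stmt.executeUpdate("ALTER TABLE CUSTOMERS ADD COLUMN LOYALTY_TIER VARCHAR(16)");
    }
  }
}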
25. 25
Lockless, ACID transactions
State-of-the-Art Snapshot Isolation
Adds multi-row, multi-table transactions to HBase with rollback
Fast, lockless, high concurrency
ZooKeeper coordination
Extends research from Google Percolator, Yahoo Labs (OMID), and the University of Waterloo (a simplified model is sketched below)
Figure: overlapping transactions A, B, and C on a timeline between start (Ts) and commit (Tc) timestamps
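A deliberately simplified, single-node model of the snapshot-isolation rule in the Percolator/OMID spirit: every transaction gets a start timestamp Ts, and at commit time it receives a commit timestamp Tc only if no other transaction committed a write to one of its keys after Ts (first-committer-wins). This illustrates the idea; it is not Splice Machine's engine.

import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.atomic.AtomicLong;

public class SnapshotIsolationModel {
  private final AtomicLong clock = new AtomicLong();               // timestamp oracle
  private final Map<String, Long> lastCommitTs = new HashMap<>();  // key -> last commit timestamp

  /** Begin a transaction: hand out its start timestamp Ts. */
  public long begin() {
    return clock.incrementAndGet();
  }

  /** Commit: returns Tc, or throws if another transaction committed a conflicting write after Ts. */
  public synchronized long commit(long startTs, Set<String> writeSet) {
    for (String key : writeSet) {
      Long committed = lastCommitTs.get(key);
      if (committed != null && committed > startTs) {
        throw new IllegalStateException("write-write conflict on " + key + "; roll back");
      }
    }
    long commitTs = clock.incrementAndGet();
    for (String key : writeSet) {
      lastCommitTs.put(key, commitTs);
    }
    return commitTs;
  }
}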
26. 26
BI and SQL tool support via ODBC
No application rewrites needed
28. SQL Database Ecosystem
Chart: workload type (Ad-hoc Analytics vs. Operational, OLTP + OLAP) against cost (Lower Cost vs. Higher Cost)
Operational (OLTP + OLAP): high concurrency, ingest via real-time updates, operates on 100s of records at a time
Ad-hoc Analytics: low concurrency, ingest via batch loads, scans PBs of data at a time
Lower cost: commodity hardware, 10x price/performance
Higher cost: proprietary/custom hardware, millions of dollars
30. What People are Saying…
Recognized as a key innovator in databases
Quotes:
“Scaling out on Splice Machine presented some major benefits over Oracle.”
“...automatic balancing between clusters...avoiding the costly licensing issues.”
“An alternative to today’s RDBMSes, Splice Machine effectively combines traditional relational database technology with the scale-out capabilities of Hadoop.”
“The unique claim of … Splice Machine is that it can run transactional applications as well as support analytics on top of Hadoop.”
Awards
31. 31
Summary
THE ONLY HADOOP RDBMS
Replace your old RDBMS with a scale-out SQL database
Affordable, Scale-Out
ACID Transactions
No Application Rewrites
10x Better Price/Perf