Meet HBase-1.0
And the New Client API
Enis Soztutar
Solomon Duskis
About Us
Enis Söztutar
Hortonworks
Release Manager for 1.0
@enissoz
Solomon Duskis
Google / Bigtable
@sduskis
Outline
Why now?
Major features
Versioning / Compatibility
Upgrade
HBase-1.0 Interfaces
Examples
Why 1.0 now?
Ran out of numbers: reaching 1.0 was the plan when switching to the
0.9x versions
Community agreement that HBase has already reached the
maturity level
Start semantic versioning and compatibility guarantees
Apache HBase v1.0 marks a major milestone in the project's development. It is a monumental moment
that the army of contributors who have made this possible should all be proud of. The result is a thing
of collaborative beauty that also happens to power key, large-scale Internet platforms.
Michael Stack
The HBase 1.0 release appropriately acknowledges a maturity already achieved by the Apache
HBase community and software both, and is a great occasion to learn more about HBase, how it can
help you solve your scale data challenges, and the growing ecosystem of Open Source and
commercial software that chooses HBase as foundation.
Andrew Purtell
https://blogs.apache.org/foundation/entry/the_apache_software_foundation_announces72
Release goals
The 1.0.0 release has three goals:
1. Lay a stable foundation for future 1.x
releases
2. Stabilize running HBase cluster and its
clients; and
3. Make versioning and compatibility
dimensions explicit
Overview
Over 1500 jiras resolved on top of 0.98.0!
See release announcement for a
comprehensive summary
API overhaul
Introduced new base interfaces
Client API is explicitly marked
Javadoc for client side is separated
Client API will have source compat in 1.x
Read availability with region replicas
Phase 1 of “region replicas” feature. (Phase 2 in 1.1)
Each region can have “replicas” hosted in other RSs
Only primary accepts writes
Reads can be performed with STRONG or TIMELINE
consistency
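A minimal sketch of a TIMELINE read, assuming the hbase-client 1.0 jar is on the classpath. The class and method names (ReplicaReads, timelineGet, readIsStale) are illustrative and not part of HBase; the client calls used are Get.setConsistency, Consistency.TIMELINE and Result.isStale().

```java
import java.io.IOException;

import org.apache.hadoop.hbase.client.Consistency;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;

/** Illustrative helper (not part of HBase) for reads that a region replica may serve. */
final class ReplicaReads {
  private ReplicaReads() {}

  /** Builds a Get that accepts possibly stale data from a secondary replica. */
  static Get timelineGet(byte[] row) {
    Get get = new Get(row);
    // The default is Consistency.STRONG, which is served by the primary only.
    get.setConsistency(Consistency.TIMELINE);
    return get;
  }

  /** Runs the read and reports whether the returned data may be stale. */
  static boolean readIsStale(Table table, byte[] row) throws IOException {
    Result result = table.get(timelineGet(row));
    return result.isStale();
  }
}
```

A caller that cannot tolerate stale data can check isStale() and retry with the default STRONG consistency.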
Online config change
Configuration can be updated while the region
server is running
hbase> update_all_config
hbase> update_master_config
hbase> update_config '<serverName>'
Only some configs can be updated online
some compaction / load balancer configs for now
Other forward ports from 0.89-fb branch
New and noteworthy
Extensive documentation/website improvements
Automatic tuning of global memstore and block cache sizes
Bucket cache easier to configure
Compressed blocks in the block cache
Pluggable replication endpoint
Basic client backpressure mechanism
New and noteworthy cont.
Docker file
Per-cell TTL
CopyTable with --bulkload
Truncate table command
Atomic Table.checkAndMutate()
Namespace permissions
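Two of the items above, per-cell TTL and Table.checkAndMutate(), can be sketched together. This is a hedged illustration that assumes hbase-client 1.0 on the classpath; the column family "f", the qualifiers "state" and "lock", and the class name CheckAndMutateSketch are hypothetical.

```java
import java.io.IOException;

import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.RowMutations;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.util.Bytes;

/** Illustrative only: the family and qualifier names below are made up for this sketch. */
final class CheckAndMutateSketch {
  static final byte[] FAMILY = Bytes.toBytes("f");
  static final byte[] STATE = Bytes.toBytes("state");
  static final byte[] LOCK = Bytes.toBytes("lock");

  private CheckAndMutateSketch() {}

  /** Builds a Put whose cells expire after one minute (per-cell TTL, in milliseconds). */
  static Put expiringPut(byte[] row) {
    Put put = new Put(row);
    put.addColumn(FAMILY, STATE, Bytes.toBytes("DONE"));
    put.setTTL(60_000L);
    return put;
  }

  /**
   * Atomically marks the row DONE and drops its lock column, but only if the
   * current state is RUNNING. Returns true when the mutations were applied.
   */
  static boolean finishIfRunning(Table table, byte[] row) throws IOException {
    Delete delete = new Delete(row);
    delete.addColumns(FAMILY, LOCK);

    RowMutations mutations = new RowMutations(row);
    mutations.add(expiringPut(row));
    mutations.add(delete);

    return table.checkAndMutate(
        row, FAMILY, STATE, CompareOp.EQUAL, Bytes.toBytes("RUNNING"), mutations);
  }
}
```

The check and both mutations apply to a single row, which is what lets the server perform them as one atomic step.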
Under the covers
Cell based read/write path
Ring buffer based WAL improvements
Multi WAL files in HRegionServer
ZK-less assignment (disabled by default)
Client Preemptive Fast Fail
Combining mvcc and seqIds
Various security, tags and visibility labels improvements
Various fixes to REST server
Numerous improvements in other areas and bug fixes too long to list here.
Changes in behavior: JDK
JDK Version   HBase-1.1   HBase-1.0   HBase-0.98
JDK 6         ✗           ✗           ✓
JDK 7         ✓           ✓           ✓
JDK 8         ✓*          ✓*          ✓*
✓*: should work, but not well tested
https://hbase.apache.org/book.html#basic.prerequisites
Changes in behavior: Hadoop
Hadoop Version   HBase-1.1   HBase-1.0   HBase-0.98
Hadoop-1.x       ✗           ✗           ✓*
Hadoop-2.2       ✗           ✓*          ✓*
Hadoop-2.3       ✓*          ✓*          ✓
Hadoop-2.4       ✓           ✓           ✓
Hadoop-2.5       ✓           ✓           ✓
Hadoop-2.6       ✓           ✓           ✓*
✓*: should work, but not well tested
https://hbase.apache.org/book.html#basic.prerequisites
Changes in behavior
ZooKeeper-3.4.x is required
Default ports changed to 160XX (out of ephemeral range)
HFile v3 is the default
Slab cache removed
Default heap is ¼ of physical memory (instead of 1GB)
Semantic Versioning
Starting with the 1.0.0 release, HBase works toward
Semantic Versioning
MAJOR.MINOR.PATCH[-identifiers]
PATCH: only backward-compatible (BC) bug fixes
MINOR: backward-compatible new features
MAJOR: incompatible changes
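The MAJOR.MINOR.PATCH[-identifiers] scheme can be illustrated with a small classifier. This is a sketch using only the JDK; the class name VersionBump and its methods are hypothetical and not part of HBase.

```java
/** Illustrative classifier for the step between two MAJOR.MINOR.PATCH[-identifiers] versions. */
final class VersionBump {
  enum Kind { MAJOR, MINOR, PATCH, NONE }

  private VersionBump() {}

  /** Returns the most significant component that differs between the two versions. */
  static Kind classify(String from, String to) {
    int[] a = parse(from);
    int[] b = parse(to);
    if (a[0] != b[0]) {
      return Kind.MAJOR;
    }
    if (a[1] != b[1]) {
      return Kind.MINOR;
    }
    if (a[2] != b[2]) {
      return Kind.PATCH;
    }
    return Kind.NONE;
  }

  /** Parses "1.0.1" or "2.0.0-SNAPSHOT" into {major, minor, patch}. */
  private static int[] parse(String version) {
    // Drop the optional "-identifiers" suffix before splitting on dots.
    String core = version.split("-", 2)[0];
    String[] parts = core.split("\\.");
    if (parts.length != 3) {
      throw new IllegalArgumentException("Expected MAJOR.MINOR.PATCH but got: " + version);
    }
    return new int[] {
      Integer.parseInt(parts[0]), Integer.parseInt(parts[1]), Integer.parseInt(parts[2])
    };
  }
}
```

For example, classify("1.0.0", "1.0.1") yields PATCH, and classify("1.1.0", "2.0.0-SNAPSHOT") yields MAJOR, the only kind of step where incompatible changes are allowed.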
Post 1.0 versions
New versioning already in action
● 1.0.0
● 1.0.1 (patch release)
● 1.1.0 (minor release)
1.0.x and 1.1.x are expected to have ~monthly releases
1.2.0 and 2.0.0 in the works
HBase API surface
Client API
Explicitly marked with InterfaceAudience.Public
Get/Put/Table/Connection, etc
LimitedPrivate API
Explicitly marked with InterfaceAudience.LimitedPrivate
Coprocessors, replication APIs
Private API
Explicitly marked with InterfaceAudience.Private
All other classes not marked
Also InterfaceStability.{Stable,Evolving,Unstable}
Compatibility dimension                 Major   Minor    Patch
Client-Server Wire Compatibility        ✗       ✓        ✓
Server-Server Compatibility             ✗       ✓        ✓
File Format Compatibility               ✗*      ✓        ✓
Client API Compatibility                ✗       ✓        ✓
Client Binary Compatibility             ✗       ✗        ✓
Server-Side Limited API Compatibility   ✗       ✗*/✓*    ✓
Dependency Compatibility                ✗       ✓        ✓
Operation Compatibility                 ✗       ✗        ✓
1.0.x Compatibility with earlier: Source
1.0.x is (mostly) source compatible with earlier
versions
Filter / Coprocessor users will see some
changes
We strongly advise ALL users to switch to new
API
Deprecated APIs will be removed (in 2.0)
1.0.x Compatibility with earlier: Binary
1.0 is NOT binary compatible with earlier
versions
Clients/coprocessors have to be recompiled to
link against 1.0 jars
Cannot drop/replace jars against an application
compiled with 0.98
1.0.x Compatibility with earlier: Wire
1.0.x is wire compatible with 0.98.x releases
0.98.x client can be used to access 1.0.x
cluster (allows rolling upgrades)
NOT binary compatible with earlier (0.96,0.94)
HFile v3 is default. Once upgraded, cannot “go
back”
Upgrade to 1.0.x
From 0.98.x
Both regular and rolling upgrades are supported.
From 0.96.x
Supported with a shutdown and restart of the cluster.
No rolling upgrades.
No need to run extra steps/scripts.
From 0.94.x
Supported similarly to upgrade from 0.94 -> 0.96.
The upgrade script should be run to rewrite cluster level metadata.
From earlier versions (0.92,0.90,etc) upgrade is not supported
HBase 1.0 Interfaces
Better encapsulation
Why the new interfaces?
HBase 1.0 had a goal to create new client interfaces
● Explicit contracts - Clear definition of the surface
● Defining a standard API in the code
● Clearer focus of responsibility - each piece doing one
thing well.
Naming Overview
HBase 0.98 Name(s) HBase 1.0 name(s)
HConnectionManager, ConnectionManager ConnectionFactory
HConnection, ClusterConnection Connection
HBaseAdmin Admin
HTable Table
RegionLocator
BufferedMutator
ConnectionFactory
Creates new Connections.
Use this instead of new HTable(), new
HBaseAdmin()
User must manage Connections
FYI: Connection type can be overridden in the
Configuration
Managed Connections Going Away
HBase Client used to have implicit connection
management.
Managed Connections tried to do lifecycle
management without understanding the application,
sometimes with unpredictable results.
HBase 1.0 introduces explicit Connection management.
Connection
Simple replacement for HConnection
Focal point to get a Table, RegionLocator,
Admin, or BufferedMutator
Use TableName instead of String/byte[]
User Managed - must call connection.close()
Connections have a cache of region metadata and a
shared threadpool; close() releases shared resources.
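One way to apply this is to create a single Connection for the whole application and hand out short-lived Table instances from it. The sketch below assumes hbase-client 1.0 on the classpath and a reachable cluster; the SharedConnection class is hypothetical and not part of HBase.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;

/** Illustrative holder (not part of HBase) for one application-wide Connection. */
final class SharedConnection implements AutoCloseable {
  private final Connection connection;

  SharedConnection(Configuration conf) throws IOException {
    // Heavyweight: sets up the region metadata cache and the shared thread pool.
    this.connection = ConnectionFactory.createConnection(conf);
  }

  /** Each call gets its own lightweight Table and closes it when done. */
  Result read(TableName tableName, byte[] row) throws IOException {
    try (Table table = connection.getTable(tableName)) {
      return table.get(new Get(row));
    }
  }

  /** Releases the shared resources; call once, when the application shuts down. */
  @Override
  public void close() throws IOException {
    connection.close();
  }
}
```

The Connection can be shared between threads, while each Table obtained from it should stay with the thread that requested it.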
Admin
Replaces HBaseAdmin for administration
Functionality
create/delete/list tables and snapshots, split tables,
add/remove table columns, etc.
Retrieved via connection.getAdmin().
Use TableName object instead of String/byte[]
Remember to .close()
RegionLocator
Region metadata related functionality
get start/end keys, get all regions, get region for a row
No manipulation of regions. That’s in Admin.
Lightweight - uses cached region information
from connection
Remember to .close()
Table (part I)
Most of HTable’s methods - CRUD
put, delete, get - both single and list
increment, append
scan
batch
checkAnd*
coprocessor service
Table (Part II)
Removed autoflush
The autoflush functionality was complex and used for
batch writes. BufferedMutator was introduced for that
purpose.
One Table per thread
Remember to close()
Release the threadpool
BufferedMutator (part I)
Autoflush and BufferedMutator are used when
“writes are small and many; it especially makes
sense when there is no natural flush point.” --
stack on HBASE-12728
Supports batching all Mutations
Puts were supported before.
Adds batched Deletes, Appends, Increments,
RowMutations
BufferedMutator (part II)
Used in Map/Reduces
Can be used in high performance servlets, if
you can tolerate some data loss.
Use ExceptionListener
CLOSE!
does a flush(); without it you might lose data still in the buffer
also closes threadpools
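A hedged sketch of a BufferedMutator with an ExceptionListener, assuming hbase-client 1.0 on the classpath. The class name BufferedWrites, the 4 MB buffer size and the choice to log failures rather than rethrow them are illustrative assumptions.

```java
import java.io.IOException;
import java.util.List;

import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.BufferedMutator;
import org.apache.hadoop.hbase.client.BufferedMutatorParams;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException;

/** Illustrative batch writer (not part of HBase). */
final class BufferedWrites {
  /** Logs every mutation that could not be sent after retries were exhausted. */
  static final BufferedMutator.ExceptionListener LOG_FAILURES =
      new BufferedMutator.ExceptionListener() {
        @Override
        public void onException(RetriesExhaustedWithDetailsException e, BufferedMutator mutator) {
          for (int i = 0; i < e.getNumExceptions(); i++) {
            System.err.println("Failed to send mutation for row " + e.getRow(i));
          }
        }
      };

  private BufferedWrites() {}

  /** Bundles the table name, the listener and the client-side buffer size. */
  static BufferedMutatorParams params(TableName tableName, long writeBufferBytes) {
    return new BufferedMutatorParams(tableName)
        .listener(LOG_FAILURES)
        .writeBufferSize(writeBufferBytes);
  }

  /** Buffers the puts on the client; leaving the try block closes the mutator, which flushes. */
  static void writeAll(Connection conn, TableName tableName, List<Put> puts) throws IOException {
    try (BufferedMutator mutator = conn.getBufferedMutator(params(tableName, 4L * 1024 * 1024))) {
      mutator.mutate(puts);
    }
  }
}
```

Because writes are sent asynchronously, the listener is the place where a failed batch becomes visible; an application that cannot tolerate data loss would rethrow or requeue there instead of only logging.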
Old way
TableName tableName = TableName.valueOf(tableNameString);
HBaseAdmin admin = new HBaseAdmin(config);
HTableDescriptor descriptor = …;
admin.createTable(descriptor);
admin.close();
HTable table = new HTable(tableName, config);
… // do something interesting
table.close();
Table example
TableName tableName = TableName.valueOf(tableNameString);
try (Connection conn = ConnectionFactory.createConnection();
     Admin admin = conn.getAdmin()) {
  HTableDescriptor descriptor = …;
  admin.createTable(descriptor);
  try (Table table = conn.getTable(tableName)) {
    table.put(...);
    ...
  }
}
Other examples
TableName tableName = TableName.valueOf(tableNameString);
try (Connection conn = ConnectionFactory.createConnection()) {
  try (BufferedMutator mutator = conn.getBufferedMutator(tableName)) {
    mutator.mutate(...);
  }
  try (RegionLocator locator = conn.getRegionLocator(tableName)) {
    List<HRegionLocation> locations = locator.getAllRegionLocations();
    ...
  }
}
Thanks to the users and developers who made
1.0 happen!
References:
https://hbase.apache.org/book.html#hbase.versioning
https://hbase.apache.org/book.html#basic.prerequisites
https://hbase.apache.org/book.html#hadoop
https://hbase.apache.org/book.html#upgrade1.0
https://mail-archives.apache.org/mod_mbox/hbase-dev/201502.mbox/%3CCAMUu0w-3K1aZgY7nJReUaMBF1Qj%2B2DNwDNOth1su%2Bxr93zGy3w%40mail.gmail.com%3E
https://blogs.apache.org/foundation/entry/the_apache_software_foundation_announces72

More Related Content

What's hot

HBase Application Performance Improvement
HBase Application Performance ImprovementHBase Application Performance Improvement
HBase Application Performance ImprovementBiju Nair
 
Meet hbase 2.0
Meet hbase 2.0Meet hbase 2.0
Meet hbase 2.0enissoz
 
HBaseCon 2015: HBase 2.0 and Beyond Panel
HBaseCon 2015: HBase 2.0 and Beyond PanelHBaseCon 2015: HBase 2.0 and Beyond Panel
HBaseCon 2015: HBase 2.0 and Beyond PanelHBaseCon
 
HBase 0.20.0 Performance Evaluation
HBase 0.20.0 Performance EvaluationHBase 0.20.0 Performance Evaluation
HBase 0.20.0 Performance EvaluationSchubert Zhang
 
HBase Sizing Guide
HBase Sizing GuideHBase Sizing Guide
HBase Sizing Guidelarsgeorge
 
HBase Read High Availability Using Timeline Consistent Region Replicas
HBase  Read High Availability Using Timeline Consistent Region ReplicasHBase  Read High Availability Using Timeline Consistent Region Replicas
HBase Read High Availability Using Timeline Consistent Region Replicasenissoz
 
HBaseCon 2012 | Learning HBase Internals - Lars Hofhansl, Salesforce
HBaseCon 2012 | Learning HBase Internals - Lars Hofhansl, SalesforceHBaseCon 2012 | Learning HBase Internals - Lars Hofhansl, Salesforce
HBaseCon 2012 | Learning HBase Internals - Lars Hofhansl, SalesforceCloudera, Inc.
 
Apache HBase 1.0 Release
Apache HBase 1.0 ReleaseApache HBase 1.0 Release
Apache HBase 1.0 ReleaseNick Dimiduk
 
HBaseCon 2012 | HBase Coprocessors – Deploy Shared Functionality Directly on ...
HBaseCon 2012 | HBase Coprocessors – Deploy Shared Functionality Directly on ...HBaseCon 2012 | HBase Coprocessors – Deploy Shared Functionality Directly on ...
HBaseCon 2012 | HBase Coprocessors – Deploy Shared Functionality Directly on ...Cloudera, Inc.
 
HBase: Extreme Makeover
HBase: Extreme MakeoverHBase: Extreme Makeover
HBase: Extreme MakeoverHBaseCon
 
Rigorous and Multi-tenant HBase Performance
Rigorous and Multi-tenant HBase PerformanceRigorous and Multi-tenant HBase Performance
Rigorous and Multi-tenant HBase PerformanceCloudera, Inc.
 
HBase Blockcache 101
HBase Blockcache 101HBase Blockcache 101
HBase Blockcache 101Nick Dimiduk
 
HBaseCon 2015: Elastic HBase on Mesos
HBaseCon 2015: Elastic HBase on MesosHBaseCon 2015: Elastic HBase on Mesos
HBaseCon 2015: Elastic HBase on MesosHBaseCon
 
HBase Applications - Atlanta HUG - May 2014
HBase Applications - Atlanta HUG - May 2014HBase Applications - Atlanta HUG - May 2014
HBase Applications - Atlanta HUG - May 2014larsgeorge
 
Cross-Site BigTable using HBase
Cross-Site BigTable using HBaseCross-Site BigTable using HBase
Cross-Site BigTable using HBaseHBaseCon
 
Apache HBase, Accelerated: In-Memory Flush and Compaction
Apache HBase, Accelerated: In-Memory Flush and Compaction Apache HBase, Accelerated: In-Memory Flush and Compaction
Apache HBase, Accelerated: In-Memory Flush and Compaction HBaseCon
 
HBaseCon 2013: How to Get the MTTR Below 1 Minute and More
HBaseCon 2013: How to Get the MTTR Below 1 Minute and MoreHBaseCon 2013: How to Get the MTTR Below 1 Minute and More
HBaseCon 2013: How to Get the MTTR Below 1 Minute and MoreCloudera, Inc.
 
HBaseCon 2015: HBase Operations at Xiaomi
HBaseCon 2015: HBase Operations at XiaomiHBaseCon 2015: HBase Operations at Xiaomi
HBaseCon 2015: HBase Operations at XiaomiHBaseCon
 

What's hot (18)

HBase Application Performance Improvement
HBase Application Performance ImprovementHBase Application Performance Improvement
HBase Application Performance Improvement
 
Meet hbase 2.0
Meet hbase 2.0Meet hbase 2.0
Meet hbase 2.0
 
HBaseCon 2015: HBase 2.0 and Beyond Panel
HBaseCon 2015: HBase 2.0 and Beyond PanelHBaseCon 2015: HBase 2.0 and Beyond Panel
HBaseCon 2015: HBase 2.0 and Beyond Panel
 
HBase 0.20.0 Performance Evaluation
HBase 0.20.0 Performance EvaluationHBase 0.20.0 Performance Evaluation
HBase 0.20.0 Performance Evaluation
 
HBase Sizing Guide
HBase Sizing GuideHBase Sizing Guide
HBase Sizing Guide
 
HBase Read High Availability Using Timeline Consistent Region Replicas
HBase  Read High Availability Using Timeline Consistent Region ReplicasHBase  Read High Availability Using Timeline Consistent Region Replicas
HBase Read High Availability Using Timeline Consistent Region Replicas
 
HBaseCon 2012 | Learning HBase Internals - Lars Hofhansl, Salesforce
HBaseCon 2012 | Learning HBase Internals - Lars Hofhansl, SalesforceHBaseCon 2012 | Learning HBase Internals - Lars Hofhansl, Salesforce
HBaseCon 2012 | Learning HBase Internals - Lars Hofhansl, Salesforce
 
Apache HBase 1.0 Release
Apache HBase 1.0 ReleaseApache HBase 1.0 Release
Apache HBase 1.0 Release
 
HBaseCon 2012 | HBase Coprocessors – Deploy Shared Functionality Directly on ...
HBaseCon 2012 | HBase Coprocessors – Deploy Shared Functionality Directly on ...HBaseCon 2012 | HBase Coprocessors – Deploy Shared Functionality Directly on ...
HBaseCon 2012 | HBase Coprocessors – Deploy Shared Functionality Directly on ...
 
HBase: Extreme Makeover
HBase: Extreme MakeoverHBase: Extreme Makeover
HBase: Extreme Makeover
 
Rigorous and Multi-tenant HBase Performance
Rigorous and Multi-tenant HBase PerformanceRigorous and Multi-tenant HBase Performance
Rigorous and Multi-tenant HBase Performance
 
HBase Blockcache 101
HBase Blockcache 101HBase Blockcache 101
HBase Blockcache 101
 
HBaseCon 2015: Elastic HBase on Mesos
HBaseCon 2015: Elastic HBase on MesosHBaseCon 2015: Elastic HBase on Mesos
HBaseCon 2015: Elastic HBase on Mesos
 
HBase Applications - Atlanta HUG - May 2014
HBase Applications - Atlanta HUG - May 2014HBase Applications - Atlanta HUG - May 2014
HBase Applications - Atlanta HUG - May 2014
 
Cross-Site BigTable using HBase
Cross-Site BigTable using HBaseCross-Site BigTable using HBase
Cross-Site BigTable using HBase
 
Apache HBase, Accelerated: In-Memory Flush and Compaction
Apache HBase, Accelerated: In-Memory Flush and Compaction Apache HBase, Accelerated: In-Memory Flush and Compaction
Apache HBase, Accelerated: In-Memory Flush and Compaction
 
HBaseCon 2013: How to Get the MTTR Below 1 Minute and More
HBaseCon 2013: How to Get the MTTR Below 1 Minute and MoreHBaseCon 2013: How to Get the MTTR Below 1 Minute and More
HBaseCon 2013: How to Get the MTTR Below 1 Minute and More
 
HBaseCon 2015: HBase Operations at Xiaomi
HBaseCon 2015: HBase Operations at XiaomiHBaseCon 2015: HBase Operations at Xiaomi
HBaseCon 2015: HBase Operations at Xiaomi
 

Similar to Meet HBase 1.0

Meet HBase 2.0
Meet HBase 2.0Meet HBase 2.0
Meet HBase 2.0enissoz
 
HBase state of the union
HBase   state of the unionHBase   state of the union
HBase state of the unionenissoz
 
Hadoop 3.0 features
Hadoop 3.0 featuresHadoop 3.0 features
Hadoop 3.0 featuresanand murari
 
Hadoop 3.0 features
Hadoop 3.0 featuresHadoop 3.0 features
Hadoop 3.0 featuresanand murari
 
Meet HBase 2.0 and Phoenix 5.0
Meet HBase 2.0 and Phoenix 5.0Meet HBase 2.0 and Phoenix 5.0
Meet HBase 2.0 and Phoenix 5.0DataWorks Summit
 
Apache Hive 2.0; SQL, Speed, Scale
Apache Hive 2.0; SQL, Speed, ScaleApache Hive 2.0; SQL, Speed, Scale
Apache Hive 2.0; SQL, Speed, ScaleHortonworks
 
Meet HBase 2.0 and Phoenix 5.0
Meet HBase 2.0 and Phoenix 5.0Meet HBase 2.0 and Phoenix 5.0
Meet HBase 2.0 and Phoenix 5.0Ankit Singhal
 
Meet HBase 2.0 and Phoenix-5.0
Meet HBase 2.0 and Phoenix-5.0Meet HBase 2.0 and Phoenix-5.0
Meet HBase 2.0 and Phoenix-5.0DataWorks Summit
 
HBaseConAsia2018 Track2-4: HTAP DB-System: AsparaDB HBase, Phoenix, and Spark
HBaseConAsia2018 Track2-4: HTAP DB-System: AsparaDB HBase, Phoenix, and SparkHBaseConAsia2018 Track2-4: HTAP DB-System: AsparaDB HBase, Phoenix, and Spark
HBaseConAsia2018 Track2-4: HTAP DB-System: AsparaDB HBase, Phoenix, and SparkMichael Stack
 
Hive2.0 sql speed-scale--hadoop-summit-dublin-apr-2016
Hive2.0 sql speed-scale--hadoop-summit-dublin-apr-2016Hive2.0 sql speed-scale--hadoop-summit-dublin-apr-2016
Hive2.0 sql speed-scale--hadoop-summit-dublin-apr-2016alanfgates
 
Geo-based content processing using hbase
Geo-based content processing using hbaseGeo-based content processing using hbase
Geo-based content processing using hbaseRavi Veeramachaneni
 
HBase New Features
HBase New FeaturesHBase New Features
HBase New Featuresrxu
 
HBase In Action - Chapter 10 - Operations
HBase In Action - Chapter 10 - OperationsHBase In Action - Chapter 10 - Operations
HBase In Action - Chapter 10 - Operationsphanleson
 
HBaseCon 2012 | HBase and HDFS: Past, Present, Future - Todd Lipcon, Cloudera
HBaseCon 2012 | HBase and HDFS: Past, Present, Future - Todd Lipcon, ClouderaHBaseCon 2012 | HBase and HDFS: Past, Present, Future - Todd Lipcon, Cloudera
HBaseCon 2012 | HBase and HDFS: Past, Present, Future - Todd Lipcon, ClouderaCloudera, Inc.
 

Similar to Meet HBase 1.0 (20)

Meet Apache HBase - 2.0
Meet Apache HBase - 2.0Meet Apache HBase - 2.0
Meet Apache HBase - 2.0
 
Meet HBase 2.0
Meet HBase 2.0Meet HBase 2.0
Meet HBase 2.0
 
HBase state of the union
HBase   state of the unionHBase   state of the union
HBase state of the union
 
Apache HBase: State of the Union
Apache HBase: State of the UnionApache HBase: State of the Union
Apache HBase: State of the Union
 
Hadoop 3.0 features
Hadoop 3.0 featuresHadoop 3.0 features
Hadoop 3.0 features
 
Hadoop 3.0 features
Hadoop 3.0 featuresHadoop 3.0 features
Hadoop 3.0 features
 
Meet HBase 2.0 and Phoenix 5.0
Meet HBase 2.0 and Phoenix 5.0Meet HBase 2.0 and Phoenix 5.0
Meet HBase 2.0 and Phoenix 5.0
 
1.0 vs2.0
1.0 vs2.01.0 vs2.0
1.0 vs2.0
 
Apache Hive 2.0; SQL, Speed, Scale
Apache Hive 2.0; SQL, Speed, ScaleApache Hive 2.0; SQL, Speed, Scale
Apache Hive 2.0; SQL, Speed, Scale
 
Apache Hive 2.0: SQL, Speed, Scale
Apache Hive 2.0: SQL, Speed, ScaleApache Hive 2.0: SQL, Speed, Scale
Apache Hive 2.0: SQL, Speed, Scale
 
Meet HBase 2.0 and Phoenix 5.0
Meet HBase 2.0 and Phoenix 5.0Meet HBase 2.0 and Phoenix 5.0
Meet HBase 2.0 and Phoenix 5.0
 
Meet HBase 2.0 and Phoenix-5.0
Meet HBase 2.0 and Phoenix-5.0Meet HBase 2.0 and Phoenix-5.0
Meet HBase 2.0 and Phoenix-5.0
 
HBaseConAsia2018 Track2-4: HTAP DB-System: AsparaDB HBase, Phoenix, and Spark
HBaseConAsia2018 Track2-4: HTAP DB-System: AsparaDB HBase, Phoenix, and SparkHBaseConAsia2018 Track2-4: HTAP DB-System: AsparaDB HBase, Phoenix, and Spark
HBaseConAsia2018 Track2-4: HTAP DB-System: AsparaDB HBase, Phoenix, and Spark
 
Apache Hive 2.0: SQL, Speed, Scale
Apache Hive 2.0: SQL, Speed, ScaleApache Hive 2.0: SQL, Speed, Scale
Apache Hive 2.0: SQL, Speed, Scale
 
Apache Hive 2.0: SQL, Speed, Scale
Apache Hive 2.0: SQL, Speed, ScaleApache Hive 2.0: SQL, Speed, Scale
Apache Hive 2.0: SQL, Speed, Scale
 
Hive2.0 sql speed-scale--hadoop-summit-dublin-apr-2016
Hive2.0 sql speed-scale--hadoop-summit-dublin-apr-2016Hive2.0 sql speed-scale--hadoop-summit-dublin-apr-2016
Hive2.0 sql speed-scale--hadoop-summit-dublin-apr-2016
 
Geo-based content processing using hbase
Geo-based content processing using hbaseGeo-based content processing using hbase
Geo-based content processing using hbase
 
HBase New Features
HBase New FeaturesHBase New Features
HBase New Features
 
HBase In Action - Chapter 10 - Operations
HBase In Action - Chapter 10 - OperationsHBase In Action - Chapter 10 - Operations
HBase In Action - Chapter 10 - Operations
 
HBaseCon 2012 | HBase and HDFS: Past, Present, Future - Todd Lipcon, Cloudera
HBaseCon 2012 | HBase and HDFS: Past, Present, Future - Todd Lipcon, ClouderaHBaseCon 2012 | HBase and HDFS: Past, Present, Future - Todd Lipcon, Cloudera
HBaseCon 2012 | HBase and HDFS: Past, Present, Future - Todd Lipcon, Cloudera
 

Recently uploaded

Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rick Flair
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????blackmambaettijean
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 

Recently uploaded (20)

Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 

Meet HBase 1.0

  • 1. Meet HBase-1.0 And the New Client API Enis Soztutar Solomon Duskis
  • 2. About Us Enis Söztutar Hortonworks Release Manager for 1.0 @enissoz Solomon Duskis Google / Bigtable @sduskis
  • 3. Outline Why now? Major features Versioning / Compatibility Upgrade HBase-1.0 Interfaces Examples
  • 5. Why 1.0 now? Ran out of numbers, which was the plan for switching to the 0.9x versions Community agreement that HBase has already reached the maturity level Start semantic versioning and compatibility guarantees
  • 6. Apache HBase v1.0 marks a major milestone in the project's development. It is a monumental moment that the army of contributors who have made this possible should all be proud of. The result is a thing of collaborative beauty that also happens to power key, large-scale Internet platforms. Michael Stack The HBase 1.0 release appropriately acknowledges a maturity already achieved by the Apache HBase community and software both, and is a great occasion to learn more about HBase, how it can help you solve your scale data challenges, and the growing ecosystem of Open Source and commercial software that chooses HBase as foundation. Andrew Purtell https://blogs.apache.org/foundation/entry/the_apache_software_foundation_announces72
  • 10. Release goals The 1.0.0 release has three goals: 1. Lay a stable foundation for future 1.x releases 2. Stabilize running HBase cluster and its clients; and 3. Make versioning and compatibility dimensions explicit
  • 12. Overview Over 1500 jiras resolved on top of 0.98.0! See release announcement for a comprehensive summary
  • 13. API overhaul Introduced new base interfaces Client API is explicitly marked Javadoc for client side is separated Client API will have source compat in 1.x
  • 14. Read availability with region replicas Phase 1 of “region replicas” feature. (Phase 2 in 1.1) Each region can have “replicas” hosted in other RSs Only primary accepts writes Reads can be performed with STRONG or TIMELINE consistency
  • 15. Online config change Configuration can be updated while the region server is running:
      hbase> update_all_config
      hbase> update_master_config
      hbase> update_config '<serverName>'
    Only some configs can be updated online (some compaction / load balancer configs for now). Other forward ports from 0.89-fb branch
  • 16. New and noteworthy Extensive documentation/website improvements Automatic tuning of global memstore and block cache sizes Bucket cache easier to configure Compressed blocks in the block cache Pluggable replication endpoint Basic client backpressure mechanism
  • 17. New and noteworthy cont. Docker file Per-cell TTL CopyTable with --bulkload Truncate table command Atomic Table.checkAndMutate() Namespace permissions
  • 18. Under the covers Cell based read/write path Ring buffer based WAL improvements Multi WAL files in HRegionServer ZK-less assignment (disabled by default) Client Preemptive Fast Fail Combining mvcc and seqIds Various security, tags and visibility labels improvements Various fixes to REST server Numerous improvements in other areas and bug fixes too long to list here.
  • 19. Changes in behavior: JDK
      JDK Version   HBase-1.1   HBase-1.0   HBase-0.98
      JDK 6         ✗           ✗           ✓
      JDK 7         ✓           ✓           ✓
      JDK 8         ✓*          ✓*          ✓*
      ✓*: should work, but not well tested
      https://hbase.apache.org/book.html#basic.prerequisites
  • 20. Changes in behavior: Hadoop
      Hadoop Version   HBase-1.1   HBase-1.0   HBase-0.98
      Hadoop-1.x       ✗           ✗           ✓*
      Hadoop-2.2       ✗           ✓*          ✓*
      Hadoop-2.3       ✓*          ✓*          ✓
      Hadoop-2.4       ✓           ✓           ✓
      Hadoop-2.5       ✓           ✓           ✓
      Hadoop-2.6       ✓           ✓           ✓*
      ✓*: should work, but not well tested
      https://hbase.apache.org/book.html#basic.prerequisites
  • 21. Changes in behavior Zookeeper-3.4.x is required Default ports changed to 160XX (out of ephemeral range) Hfile v3 is default Slab cache removed Default heap is ¼ of physical memory (instead of 1GB)
  • 23. Semantic Versioning Starting with the 1.0.0 release, HBase works toward Semantic Versioning: MAJOR.MINOR.PATCH[-identifiers]. PATCH: only backward-compatible (BC) bug fixes. MINOR: backward-compatible new features. MAJOR: incompatible changes.
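The MAJOR.MINOR.PATCH scheme on this slide can be made concrete with a small sketch. The class and method names (`VersionBump`, `parse`, `classify`) are hypothetical and not part of HBase. The code only compares the three numeric components and ignores any `-identifiers` suffix.

```java
import java.util.Arrays;

/** Hypothetical helper, not part of HBase: classifies the step between two versions. */
final class VersionBump {

    /** Parses "MAJOR.MINOR.PATCH[-identifiers]" into its three numeric parts. */
    static int[] parse(String version) {
        String core = version.split("-", 2)[0];   // drop the optional "-identifiers" suffix
        String[] parts = core.split("\\.");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected MAJOR.MINOR.PATCH: " + version);
        }
        return Arrays.stream(parts).mapToInt(Integer::parseInt).toArray();
    }

    /** Returns "MAJOR", "MINOR", "PATCH" or "NONE", based on the first component that differs. */
    static String classify(String from, String to) {
        int[] a = parse(from);
        int[] b = parse(to);
        if (a[0] != b[0]) return "MAJOR";
        if (a[1] != b[1]) return "MINOR";
        if (a[2] != b[2]) return "PATCH";
        return "NONE";
    }

    public static void main(String[] args) {
        System.out.println(classify("1.0.0", "1.0.1"));          // PATCH
        System.out.println(classify("1.0.1", "1.1.0"));          // MINOR
        System.out.println(classify("1.1.0", "2.0.0-SNAPSHOT")); // MAJOR
    }
}
```

Under the rules on the slide, a "PATCH" or "MINOR" result indicates a backward-compatible step, while "MAJOR" signals possible incompatible changes.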
  • 24. Post 1.0 versions New versioning already in action ● 1.0.0 ● 1.0.1 (patch release) ● 1.1.0 (minor release) 1.0.x and 1.1.x are expected to have ~monthly releases 1.2.0 and 2.0.0 in the works
  • 25. HBase API surface
      Client API: explicitly marked with InterfaceAudience.Public (Get/Put/Table/Connection, etc.)
      LimitedPrivate API: explicitly marked with InterfaceAudience.LimitedPrivate (coprocessors, replication APIs)
      Private API: explicitly marked with InterfaceAudience.Private, plus all other classes not marked
      Also InterfaceStability.{Stable,Evolving,Unstable}
  • 26. Compatibility by release type
                                           Major   Minor    Patch
      Client-Server Wire Compatibility     ✗       ✓        ✓
      Server-Server Compatibility          ✗       ✓        ✓
      File Format Compatibility            ✗*      ✓        ✓
      Client API Compatibility             ✗       ✓        ✓
      Client Binary Compatibility          ✗       ✗        ✓
      Server Side Limited API Compat.      ✗       ✗*/✓*    ✓
      Dependency Compatibility             ✗       ✓        ✓
      Operation Compatibility              ✗       ✗        ✓
  • 27. 1.0.x Compatibility with earlier: Source 1.0.x is (mostly) source compatible with earlier versions Filter / Coprocessor users will see some changes We strongly advise ALL users to switch to new API Deprecated APIs will be removed (in 2.0)
  • 28. 1.0.x Compatibility with earlier: Binary 1.0 is NOT binary compatible with earlier versions Clients/coprocessors have to be recompiled to link against 1.0 jars Cannot drop/replace jars against an application compiled with 0.98
  • 29. 1.0.x Compatibility with earlier: Wire 1.0.x is wire compatible with 0.98.x releases 0.98.x client can be used to access 1.0.x cluster (allows rolling upgrades) NOT binary compatible with earlier (0.96,0.94) HFile v3 is default. Once upgraded, cannot “go back”
  • 31. Upgrade to 1.0.x
      From 0.98.x: regular upgrade or rolling upgrade is supported.
      From 0.96.x: supported with a shutdown and restart of the cluster. No rolling upgrades. No need to run extra steps/scripts.
      From 0.94.x: supported similarly to the upgrade from 0.94 -> 0.96. The upgrade script should be run to rewrite cluster-level metadata.
      From earlier versions (0.92, 0.90, etc.): upgrade is not supported.
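The upgrade rules on this slide amount to a lookup on the source release line. The sketch below restates them as code. `UpgradePath`, `Procedure` and `to10x` are hypothetical names, not an HBase tool, and any release line the slide does not mention is reported as `NOT_COVERED` rather than guessed.

```java
/** Hypothetical helper, not part of HBase: restates the slide's upgrade rules as a lookup. */
final class UpgradePath {

    enum Procedure {
        ROLLING_OR_REGULAR,     // from 0.98.x
        SHUTDOWN_AND_RESTART,   // from 0.96.x: no rolling upgrade, no extra scripts
        UPGRADE_SCRIPT,         // from 0.94.x: run the upgrade script, as for 0.94 -> 0.96
        NOT_SUPPORTED,          // from 0.92, 0.90 and earlier
        NOT_COVERED             // release lines the slide does not mention
    }

    /** Maps a source version such as "0.98.4" or "0.96.x" to the procedure named on the slide. */
    static Procedure to10x(String fromVersion) {
        String[] parts = fromVersion.split("\\.");
        if (parts.length < 2) {
            throw new IllegalArgumentException("Expected at least MAJOR.MINOR: " + fromVersion);
        }
        int major = Integer.parseInt(parts[0]);
        int minor = Integer.parseInt(parts[1]);
        if (major != 0) {
            return Procedure.NOT_COVERED;
        }
        switch (minor) {
            case 98: return Procedure.ROLLING_OR_REGULAR;
            case 96: return Procedure.SHUTDOWN_AND_RESTART;
            case 94: return Procedure.UPGRADE_SCRIPT;
            default: return minor < 94 ? Procedure.NOT_SUPPORTED : Procedure.NOT_COVERED;
        }
    }

    public static void main(String[] args) {
        System.out.println(to10x("0.98.4")); // ROLLING_OR_REGULAR
        System.out.println(to10x("0.96.x")); // SHUTDOWN_AND_RESTART
        System.out.println(to10x("0.92.1")); // NOT_SUPPORTED
    }
}
```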
  • 33. Why the new interfaces? HBase 1.0 had a goal to create new client interfaces ● Explicit contracts - Clear definition of the surface ● Defining a standard API in the code ● Clearer focus of responsibility - each piece doing one thing well.
  • 34. Naming Overview
      HBase 0.98 name(s)                      HBase 1.0 name(s)
      HConnectionManager, ConnectionManager   ConnectionFactory
      HConnection, ClusterConnection          Connection
      HBaseAdmin                              Admin
      HTable                                  Table, RegionLocator, BufferedMutator
  • 35. ConnectionFactory Creates new Connections. Use this instead of new HTable(), new HBaseAdmin() User must manage Connections FYI: Connection type can be overridden in the Configuration
  • 36. Managed Connections Going Away HBase Client used to have implicit connection management. Managed Connections was trying to do lifecycle management without understanding the application, sometimes with unpredictable results. HBase 1.0 introduces explicit Connection management.
  • 37. Connection Simple replacement for HConnection Focal point to get a Table, RegionLocator, Admin, or BufferedMutator Use TableName instead of String/byte[] User Managed - must call connection.close() Connections have a cache of region metadata and a shared threadpool; close() releases shared resources.
  • 38. Admin Replaces HBaseAdmin for administration. Functionality: create/delete/list tables and snapshots, split table, add/remove table columns, etc. Retrieved via connection.getAdmin(). Use TableName object instead of String/byte[]. Remember to .close()
  • 39. RegionLocator Region metadata related functionality get start/end keys, get all regions, get region for qualifier No manipulation of regions. That’s in Admin. Lightweight - uses cached region information from connection Remember to .close()
  • 40. Table (part I) Most of HTable’s methods - CRUD put, delete, get - both single and list increment, append scan batch checkAnd* coprocessor service
  • 41. Table (Part II) Removed autoflush. The autoflush functionality was complex and used for batch writes. BufferedMutator was introduced for that purpose. One Table per thread. Remember to close(): it releases the threadpool
  • 42. BufferedMutator (part I) Autoflush and BufferedMutator are used when “writes are small and many; it especially makes sense when there is no natural flush point.” -- stack on HBASE-12728 Supports all batched mutations. Puts were supported before. Adds batched Deletes, Appends, Increments, RowMutations
  • 43. BufferedMutator (part II) Used in Map/Reduces. Can be used in high performance servlets, if you can tolerate some data loss. Use ExceptionListener. Always close(): it does a flush(), without which you might lose data still in the buffer, and it also closes threadpools
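The buffering behaviour described for BufferedMutator (many small writes collected into batches, with close() performing a final flush) can be illustrated without HBase. The following is a stdlib-only toy model under stated assumptions: `BatchingWriter`, its item-count `batchSize` threshold and the `Consumer<List<T>>` sink are hypothetical, chosen for simplicity, and do not reflect how HBase implements BufferedMutator.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Toy model of write buffering; NOT the HBase BufferedMutator implementation. */
final class BatchingWriter<T> implements AutoCloseable {
    private final int batchSize;              // hypothetical flush threshold (item count)
    private final Consumer<List<T>> sink;     // receives each flushed batch
    private final List<T> buffer = new ArrayList<>();

    BatchingWriter(int batchSize, Consumer<List<T>> sink) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        this.batchSize = batchSize;
        this.sink = sink;
    }

    /** Buffers one item and flushes automatically once the threshold is reached. */
    void mutate(T item) {
        buffer.add(item);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    /** Hands the buffered items to the sink as one batch and empties the buffer. */
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        sink.accept(new ArrayList<>(buffer));
        buffer.clear();
    }

    /** Flushes whatever is left; skipping close() would silently drop those items. */
    @Override
    public void close() {
        flush();
    }

    public static void main(String[] args) {
        List<List<String>> batches = new ArrayList<>();
        try (BatchingWriter<String> writer = new BatchingWriter<>(2, batches::add)) {
            writer.mutate("a");
            writer.mutate("b");   // threshold reached: first batch is flushed
            writer.mutate("c");   // stays in the buffer until close()
        }
        System.out.println(batches);  // [[a, b], [c]]
    }
}
```

The try-with-resources block matters here: without it, the item "c" would never reach the sink, which is the data-loss risk the slide warns about.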
  • 45. Old way
      TableName tableName = TableName.valueOf(tableNameString);
      HBaseAdmin admin = new HBaseAdmin(config);
      HTableDescriptor descriptor = …;
      admin.createTable(descriptor);
      admin.close();
      HTable table = new HTable(config, tableName);
      … // do something interesting
      table.close();
  • 46. Table example
      TableName tableName = TableName.valueOf(tableNameString);
      try (Connection conn = ConnectionFactory.createConnection();
           Admin admin = conn.getAdmin()) {
        HTableDescriptor descriptor = …;
        admin.createTable(descriptor);
        try (Table table = conn.getTable(tableName)) {
          table.put(...);
          ...
        }
      }
  • 47. Other examples
      TableName tableName = TableName.valueOf(tableNameString);
      try (Connection conn = ConnectionFactory.createConnection()) {
        try (BufferedMutator mutator = conn.getBufferedMutator(tableName)) {
          mutator.mutate(...);
        }
        try (RegionLocator locator = conn.getRegionLocator(tableName)) {
          List<HRegionLocation> locations = locator.getAllRegionLocations();
          ...
        }
      }
  • 48. Thanks to the users and developers who made 1.0 happen!
      References:
      https://hbase.apache.org/book.html#hbase.versioning
      https://hbase.apache.org/book.html#basic.prerequisites
      https://hbase.apache.org/book.html#hadoop
      https://hbase.apache.org/book.html#upgrade1.0
      https://mail-archives.apache.org/mod_mbox/hbase-dev/201502.mbox/%3CCAMUu0w-3K1aZgY7nJReUaMBF1Qj%2B2DNwDNOth1su%2Bxr93zGy3w%40mail.gmail.com%3E
      https://blogs.apache.org/foundation/entry/the_apache_software_foundation_announces72