Apache HBase Road Map
A short history of nearly everything HBase. Past, Present, and Future.




Jonathan Gray
November 2011
Hadoop World NYC
Agenda


   Past (<= 0.90)

   Present (== 0.92)

   Future (>= 0.94)
Apache HBase
A Friendly Open Source Project

Disclaimer: These are the personal opinions of Jonathan Gray and do not necessarily reflect the opinions of
Facebook Inc., Apache HBase, the Apache HBase community, or any other person or organization. I also apologize
in advance to any individuals or companies that were left out of slides or discussion. This was not done
purposefully and I love you all.
Apache HBase
▪ A dynamic and pragmatic community
  ▪ HBase committers scattered around many companies

  ▪ A culture of acceptance (contributions please!)

    ▪ Perhaps, occasionally, to a fault

  ▪ Many HBase committers have moved companies


▪ “Road Map” driven by sponsoring companies
  ▪ Bugs fixed and features developed decided by them

  ▪ HBase has no Enterprise Software Company behind it
The Ghost of HBase Past
Early days through 0.20 and 0.90
HBase History
▪ Started in         as Bigtable clone for Hadoop
  ▪ First code released in 2007 as part of Hadoop 0.15
▪ Six major releases (three versioning schemes)
  ▪ 0.1.0 in March 2008
  ▪ 0.2.0 in August 2008
  ▪ 0.18.0 in September 2008
  ▪ 0.19.0 in January 2009
  ▪ 0.20.0 in September 2009
  ▪ 0.90.0 in January 2011
Random read/write access for offline processes
HBase History
▪ Early users focused on offline, crawl data storage
  ▪ Powerset was primary user

  ▪ Others like WorldLingo, OpenPlaces



▪ Augmenting Offline MapReduce
  ▪ Needed random writes for web crawl storage

  ▪ Also use random writes to store links and images

  ▪ The road map was easy... Bigtable
OLTP database for web startups
Online HBase
▪ Next generation of HBasers wanted OLTP
  ▪ Streamy.com (my previous startup)

  ▪ StumbleUpon and others


▪ HBase Goes Realtime
  ▪ Gave this talk at Hadoop Summit 2009 w/ JD Cryans
  ▪ “HBase 0.20 ... First ever Performance Release”

     “As a random-access store, we are well suited for the storing and serving of
     Web applications, but high latency and variability (100s of ms to seconds)
     has reduced the usefulness of HBase and required the use of external
     caching in the past”
HBase 0.20
▪ Performance Release (aka the Unjavafy release)
  ▪ Rewrite of entire read and write paths

    ▪ Introduction of KeyValue and zero-copy reads

    ▪ New block-based HFile format and LRU block cache

  ▪ New client APIs: Put, Get, Scan, Delete, Result


▪ ZooKeeper Integration
  ▪ Remove all dependencies on master for reads/writes

  ▪ Leader election, fault detection, remove SPOF
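The LRU block cache mentioned above can be illustrated with a toy sketch. This is not HBase's implementation; the class name `LruBlockCache`, the `(file name, offset)` key, and the `read_from_disk` loader are hypothetical, and capacity is counted in blocks rather than bytes to keep the example short.

```python
from collections import OrderedDict


class LruBlockCache:
    """Toy least-recently-used cache of file blocks (illustrative only)."""

    def __init__(self, capacity):
        self.capacity = capacity          # maximum number of cached blocks
        self._blocks = OrderedDict()      # oldest entry first, newest last
        self.hits = 0
        self.misses = 0

    def get(self, key, load_block):
        """Return the block for key, loading and caching it on a miss."""
        if key in self._blocks:
            self._blocks.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._blocks[key]
        self.misses += 1
        block = load_block(key)
        self._blocks[key] = block
        if len(self._blocks) > self.capacity:
            self._blocks.popitem(last=False)  # evict least recently used
        return block


def read_from_disk(key):
    """Hypothetical loader standing in for a real block read."""
    file_name, offset = key
    return f"{file_name}@{offset}".encode()


cache = LruBlockCache(capacity=2)
cache.get(("hfile-a", 0), read_from_disk)      # miss
cache.get(("hfile-a", 65536), read_from_disk)  # miss
cache.get(("hfile-a", 0), read_from_disk)      # hit, refreshes recency
cache.get(("hfile-b", 0), read_from_disk)      # miss, evicts hfile-a@65536
```

Repeated reads of a hot block are served from memory, while blocks that have not been touched recently are the first to be evicted.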
A highly available, scalable database for tech companies
HBase 0.90
▪ Durability, Stability, Availability Release
  ▪ “Production Ready HBase”

  ▪ Zero data loss

  ▪ Rewrite of Master and ZooKeeper interactions

  ▪ Testing, debugging, monitoring improvements

  ▪ Random read and large row improvements

  ▪ Lots of awesome new features
HBase 0.90: Production Ready
▪ Zero data loss
  ▪ HDFS Appends, HLog fixes, gremlin testing

▪ Master rewrite
  ▪ Remove from read/write path + failover, no SPOF

▪ Operational improvements
  ▪ HBCK (fsck for HBase), HFile/HLog command-line tools

  ▪ Rolling restarts for minor upgrades

  ▪ New testing framework and     k new lines of tests
HBase 0.90: New Features
▪ Cluster-to-cluster replication

▪ Read performance
  ▪ Bloom filters rewrite

  ▪ Efficient intra-row seeking for large row support

▪ Other stuff
  ▪ Mavenized

  ▪ Stargate REST server and Avro server

  ▪ Shell improvements and EC2 scripts
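As a rough illustration of why Bloom filters help read performance, the sketch below shows the general technique: a compact bit array that can say "definitely not in this file" without touching disk. It is a generic Bloom filter, not HBase's rewritten one; the class name, the sizes, and the double-hashing scheme over SHA-256 are assumptions made for the example.

```python
import hashlib


class BloomFilter:
    """Generic Bloom filter: no false negatives, some false positives."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive several bit positions from one digest (double hashing).
        digest = hashlib.sha256(key).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # False means the key was never added; True means "maybe".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )


empty = BloomFilter(num_bits=1024, num_hashes=4)
row_filter = BloomFilter(num_bits=1024, num_hashes=4)
stored_rows = [b"row-001", b"row-002", b"row-003"]
for row in stored_rows:
    row_filter.add(row)
```

A reader would consult `might_contain` before opening a file and skip the file entirely when the answer is `False`.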
HBase Today
A large scale production-capable database system
HBase 0.92
▪ Stability and feature release
    ▪ Lots of usability and stability improvements

    ▪ Coprocessors and security

    ▪ Multi-Master cluster replication



▪ 0.92.0 RC sometime in November
    ▪  blockers and criticals as of this morning
    ▪ FB already deploying a 0.92-based branch in dev
HBase 0.92: Big new features
▪ Coprocessors
  ▪ Triggers and Stored Procedures

  ▪ Pre/Post hooks to all client calls and server operations

  ▪ Dynamically add new RPC calls

  ▪ ACL security atop Coprocessors


▪ HFile v2
  ▪ Support for very large regions / files

  ▪ Multi-level block index and inline blooms
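The pre/post hook idea behind Coprocessors, and how access control can sit on top of it, can be sketched in a few lines. This is a conceptual model only and does not use the HBase Coprocessor API; `Region`, `Observer`, `AccessControlObserver`, and `AuditObserver` are hypothetical names invented for the example.

```python
class Observer:
    """Base class: hooks called before and after each put."""

    def pre_put(self, user, row, value):
        pass

    def post_put(self, user, row, value):
        pass


class AccessControlObserver(Observer):
    """Rejects writes from users outside an allow-list (ACL as a pre-hook)."""

    def __init__(self, allowed_users):
        self.allowed_users = set(allowed_users)

    def pre_put(self, user, row, value):
        if user not in self.allowed_users:
            raise PermissionError(f"{user} may not write {row}")


class AuditObserver(Observer):
    """Records every successful write (a trigger-style post-hook)."""

    def __init__(self):
        self.log = []

    def post_put(self, user, row, value):
        self.log.append((user, row))


class Region:
    """Toy stand-in for a server-side store with pluggable observers."""

    def __init__(self, observers):
        self.observers = list(observers)
        self.store = {}

    def put(self, user, row, value):
        for observer in self.observers:
            observer.pre_put(user, row, value)   # may veto by raising
        self.store[row] = value
        for observer in self.observers:
            observer.post_put(user, row, value)


audit = AuditObserver()
region = Region([AccessControlObserver(["alice"]), audit])
region.put("alice", "row-1", "hello")
try:
    region.put("mallory", "row-2", "oops")
    rejected = False
except PermissionError:
    rejected = True
```

Because the pre-hook runs before the write, a vetoed request never reaches the store or the post-hooks.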
HBase 0.92: Performance
▪ Performance improvements
  ▪ More seeking and early-out hints

  ▪ Distributed log splitting

  ▪ CacheOnWrite, EvictOnClose


▪ Compaction improvements
  ▪ Multi-threaded compactions

  ▪ Vastly improved file selection algorithm

  ▪ Lots of metrics and highly configurable
HBase 0.92: Improvements
▪ Operational improvements
  ▪ HBCK improvements, Web UI improvements

  ▪ Slow query log, running tasks and thread status

  ▪ Online schema modifications

▪ Usability and API improvements
  ▪ Increment client API

  ▪ String-based Filter language

  ▪ Multi-family bulk load

  ▪ The HBase Books!
HBase 0.92: Documentation!
▪ The (O’Reilly) HBase Book
  ▪ HBase: The Definitive Guide released in September

  ▪ Massive effort by committer Lars George

  ▪ Lots of input and feedback from the community

▪ The (Apache) HBase Book
  ▪ Apache HBase now has a DocBook-format book

  ▪ Every HBase release will ship with a versioned book

  ▪ From installation to schema design and architecture

  ▪ Latest version @ http://hbase.apache.org/book.html
HBase of the Future
0.94 and beyond
?
           You!
A usable, large scale production database system
HBase 0.94
▪ Stability and usability is the core focus
  ▪ Increase stability by decreasing complexity

  ▪ More work on UI, tools, monitoring, operability

  ▪ Table/family-level metrics


▪ But features will always continue...
  ▪ Fast backups w/ point-in-time recovery

  ▪ Multi-Slave Replication

  ▪ Constraints and other Coprocessor-based contribs
HBase 0.94: New Stuff
▪ Thrift 2.0
  ▪ New Thrift API to more closely match Java API

  ▪ Embedded Thrift w/ RS short-circuit


▪ Other Goodies
  ▪ TTL + minVersions

  ▪ Point-in-time snapshot scanners

  ▪ Atomic Append operation
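One way to read "TTL + minVersions" is that expired cells are dropped unless doing so would leave fewer than a minimum number of versions. The sketch below models that reading. It is an assumption about the intended semantics, not HBase code, and the function name and the `(timestamp, value)` cell layout are hypothetical.

```python
def surviving_versions(cells, now, ttl, min_versions=0):
    """Return the cells kept under a TTL with a minimum-versions floor.

    cells: iterable of (timestamp, value) pairs for a single column.
    A cell is kept if it has not expired, or if it is among the newest
    min_versions cells even though it has expired.
    """
    newest_first = sorted(cells, key=lambda cell: cell[0], reverse=True)
    kept = []
    for index, (timestamp, value) in enumerate(newest_first):
        expired = now - timestamp > ttl
        if not expired or index < min_versions:
            kept.append((timestamp, value))
    return kept


cells = [(100, "a"), (200, "b"), (900, "c")]
ttl_only = surviving_versions(cells, now=1000, ttl=300)
with_floor = surviving_versions(cells, now=1000, ttl=300, min_versions=2)
all_expired = surviving_versions([(100, "a"), (200, "b")], now=1000, ttl=300,
                                 min_versions=1)
```

With the floor set to 1, a column whose cells have all expired still returns its newest value instead of disappearing.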
HBase 0.94: Performance
▪ Scaling for throughput vs. latency
  ▪ Early-lock-release to decrease row contention

  ▪ Early-thread-release to increase throughput

  ▪ Remove all global wait()/notify() on HLog


▪ Improved seeking and file selection
  ▪ “Lazy-seek” in-order file processing

  ▪ DeleteFamily bloom filter
HBase 0.94: Project Management
▪ Renewed focus on fast release cycle
  ▪ HBase   . branch cut immediately after . release
  ▪ Already close to .   feature freeze, . dev release?
  ▪ blockers and     criticals left

▪ Apache HBase: A slightly less accepting project
  ▪ Stability is really code stability

  ▪ Push towards iterative feature dev and branch dev

  ▪ Coprocessors and Service Interfaces go a long way
Nanobots, jetpacks, flying cars
Holographic storage renders HBase obsolete
Beyond HBase 0.94
▪ Stability and usability is still the core focus
  ▪ More tests, testing frameworks, integration tests


▪ But features will always continue...
  ▪ RPC redux

  ▪ Dynamic configuration

  ▪ Request, IO, and locality based load balancing

  ▪ Multi-Tenancy (QoS, ACL)

  ▪ Tighter coordination with rest of stack (HDFS, Linux)
Conclusion
▪ Apache HBase has come a long way
  ▪ Use case driven development


▪ HBase 0.92 coming very soon
  ▪ Most stable release to date


▪ Contributors and committers drive development
  ▪ Consumers can’t dictate the road map

  ▪ Individuals and organizations solve their problems

      (They have their own users... and jobs to keep)
Check out the HBase at Facebook Page:

facebook.com/UsingHbase


    Thanks! Questions?

DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 

Último (20)

The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 

Hadoop World 2011: Apache HBase Road Map - Jonathan Gray - Facebook

  • 2. Apache HBase Road Map A short history of nearly everything HBase. Past, Present, and Future. Jonathan Gray November 2011, Hadoop World NYC
  • 3. Agenda Past (<= 0.90) Present (== 0.92) Future (>= 0.94)
  • 4. Apache HBase A Friendly Open Source Project Disclaimer: These are the personal opinions of Jonathan Gray and do not necessarily reflect the opinions of Facebook Inc., Apache HBase, the Apache HBase community, or any other person or organization. I also apologize in advance to any individuals or companies that were left out of slides or discussion. This was not done purposefully and I love you all.
  • 5. Apache HBase ▪ A dynamic and pragmatic community ▪ HBase committers scattered around many companies ▪ A culture of acceptance (contributions please!) ▪ Perhaps, occasionally, to a fault ▪ Many HBase committers have moved companies ▪ “Road Map” driven by sponsoring companies ▪ Bugs fixed and features developed decided by them ▪ HBase has no Enterprise Software Company behind it
  • 6. The Ghost of HBase Past Early days through 0.20 and 0.90
  • 7. HBase History ▪ Started in as Bigtable clone for Hadoop ▪ First code released in as part of Hadoop . ▪ Six major releases (three versioning schemes) ▪ . . in March ▪ . . in August ▪ . . in September ▪ . . in January ▪ . . in September ▪ . . in January
  • 8. Random read/write access for offline processes
  • 9. HBase History ▪ Early users focused on offline, crawl data storage ▪ Powerset was primary user ▪ Others like WorldLingo, OpenPlaces ▪ Augmenting Offline MapReduce ▪ Needed random writes for web crawl storage ▪ Also use random writes to store links and images ▪ The road map was easy... Bigtable
  • 10. OLTP database for web startups
  • 11. Online HBase ▪ Next generation of HBasers wanted OLTP ▪ Streamy.com (my previous startup) ▪ StumbleUpon and others ▪ HBase Goes Realtime ▪ Gave this talk at Hadoop Summit w/ JD Cryans ▪ “HBase 0.20 ... First ever Performance Release” “As a random-access store, we are well suited for the storing and serving of Web applications, but high latency and variability (100s of ms to seconds) has reduced the usefulness of HBase and required the use of external caching in the past”
  • 12. HBase 0.20 ▪ Performance Release (aka the Unjavafy release) ▪ Rewrite of entire read and write paths ▪ Introduction of KeyValue and zero-copy reads ▪ New block-based HFile format and LRU block cache ▪ New client APIs: Put, Get, Scan, Delete, Result ▪ ZooKeeper Integration ▪ Remove all dependencies on master for reads/writes ▪ Leader election, fault detection, remove SPOF
  • 13. A highly available, scalable database for tech companies
  • 14. HBase 0.90 ▪ Durability, Stability, Availability Release ▪ “Production Ready HBase” ▪ Zero data loss ▪ Rewrite of Master and ZooKeeper interactions ▪ Testing, debugging, monitoring improvements ▪ Random read and large row improvements ▪ Lots of awesome new features
  • 15. HBase 0.90: Production Ready ▪ Zero data loss ▪ HDFS Appends, HLog fixes, gremlin testing ▪ Master rewrite ▪ Remove from read/write path + failover, no SPOF ▪ Operational improvements ▪ HBCK (fsck for HBase), HFile/HLog command-line tools ▪ Rolling restarts for minor upgrades ▪ New testing framework and k new lines of tests
  • 16. HBase 0.90: New Features ▪ Cluster-to-cluster replication ▪ Read performance ▪ Bloom filters rewrite ▪ Efficient intra-row seeking for large row support ▪ Other stuff ▪ Mavenized ▪ Stargate REST server and AVRO server ▪ Shell improvements and EC2 scripts
  • 18. A large scale production-capable database system
  • 19. HBase 0.92 ▪ Stability and feature release ▪ Lots of usability and stability improvements ▪ Coprocessors and security ▪ Multi-Master cluster replication ▪ 0.92.0 RC sometime in November ▪ blockers and criticals as of this morning ▪ FB already deploying a 0.92-based branch in dev
  • 20. HBase 0.92: Big new features ▪ Coprocessors ▪ Triggers and Stored Procedures ▪ Pre/Post hooks to all client calls and server operations ▪ Dynamically add new RPC calls ▪ ACL security atop Coprocessors ▪ HFile v2 ▪ Support for very large regions / files ▪ Multi-level block index and inline blooms
  • 21. HBase 0.92: Performance ▪ Performance improvements ▪ More seeking and early-out hints ▪ Distributed log splitting ▪ CacheOnWrite, EvictOnClose ▪ Compaction improvements ▪ Multi-threaded compactions ▪ Vastly improved file selection algorithm ▪ Lots of metrics and highly configurable
  • 22. HBase 0.92: Improvements ▪ Operational improvements ▪ HBCK improvements, Web UI improvements ▪ Slow query log, running tasks and thread status ▪ Online schema modifications ▪ Usability and API improvements ▪ Increment client API ▪ String-based Filter language ▪ Multi-family bulk load ▪ The HBase Books!
  • 23. HBase 0.92: Documentation! ▪ The (O’Reilly) HBase Book ▪ HBase: The Definitive Guide released in September ▪ Massive effort by committer Lars George ▪ Lots of input and feedback from the community ▪ The (Apache) HBase Book ▪ Apache HBase now has a DocBook-format book ▪ Every HBase release will ship with a versioned book ▪ From installation to schema design and architecture ▪ Latest version @ http://hbase.apache.org/book.html
  • 24. HBase of the Future 0.94 and beyond
  • 25. ? You! A usable, large scale production database system
  • 26. HBase 0.94 ▪ Stability and usability is the core focus ▪ Increase stability by decreasing complexity ▪ More work on UI, tools, monitoring, operability ▪ Table/family-level metrics ▪ But features will always continue... ▪ Fast backups w/ point-in-time recovery ▪ Multi-Slave Replication ▪ Constraints and other Coprocessor-based contribs
  • 27. HBase 0.94: New Stuff ▪ Thrift . ▪ New Thrift API to more closely match Java API ▪ Embedded Thrift w/ RS short-circuit ▪ Other Goodies ▪ TTL + minVersions ▪ Point-in-time snapshot scanners ▪ Atomic Append operation
  • 28. HBase 0.94: Performance ▪ Scaling for throughput vs. latency ▪ Early-lock-release to decrease row contention ▪ Early-thread-release to increase throughput ▪ Remove all global wait()/notify() on HLog ▪ Improved seeking and file selection ▪ “Lazy-seek” in-order file processing ▪ DeleteFamily bloom filter
  • 29. HBase 0.94: Project Management ▪ Renewed focus on fast release cycle ▪ HBase . branch cut immediately after . release ▪ Already close to . feature freeze, . dev release? ▪ blockers and criticals left ▪ Apache HBase: A slightly less accepting project ▪ Stability is really code stability ▪ Push towards iterative feature dev and branch dev ▪ Coprocessors and Service Interfaces go a long way
  • 30. flying nanobots jetpacks cars Holographic storage renders HBase obsolete
  • 31. Beyond HBase 0.94 ▪ Stability and usability is still the core focus ▪ More tests, testing frameworks, integration tests ▪ But features will always continue... ▪ RPC redux ▪ Dynamic configuration ▪ Request, IO, and locality based load balancing ▪ Multi-Tenancy (QoS, ACL) ▪ Tighter coordination with rest of stack (HDFS, Linux)
  • 32. Conclusion ▪ Apache HBase has come a long way ▪ Use case driven development ▪ HBase . coming very soon ▪ Most stable release to date ▪ Contributors and committers drive development ▪ Consumers can’t dictate the road map ▪ Individuals and organizations solve their problems (They have their own users... and jobs to keep)
  • 33. Check out the HBase at Facebook Page: facebook.com/UsingHbase Thanks! Questions?