SlideShare una empresa de Scribd logo
1 de 57
Descargar para leer sin conexión
© 2019 Bloomberg Finance L.P. All rights reserved.
HBase Internals and
Operations
SRECon19 Asia/Pacific
June 13, 2019
Biju Nair
Software Engineer
bnair10@bloomberg.net
© 2019 Bloomberg Finance L.P. All rights reserved.
Agenda
• Introduction to HBase
• Operating HBase
• Questions
© 2019 Bloomberg Finance L.P. All rights reserved.
Bloomberg in a nutshell
The Bloomberg Terminal delivers a diverse
array of information on a single platform to
facilitate financial decision-making.
© 2019 Bloomberg Finance L.P. All rights reserved.
Bloomberg technology by the numbers
• 5,000+ software engineers
• 150+ technologists and data scientists devoted to machine learning
• One of the largest private networks in the world
• 120 billion pieces of data from the financial markets each day, with a peak of more than
10 million messages/second
• 2 million news stories ingested / published each day (500+ news stories ingested/second)
• News content from 125K+ sources
• Over 1 billion messages and Instant Bloomberg (IB) chats handled daily
© 2019 Bloomberg Finance L.P. All rights reserved.
HBase at Bloomberg
• Started with v0.94.6
• >2 billion reads per day
• >1 billion writes per day
• 51+ TB of compressed data stored in HBase
© 2019 Bloomberg Finance L.P. All rights reserved.
HBase Principles
• Ordered Key Value Store
• Distributed shared nothing
© 2019 Bloomberg Finance L.P. All rights reserved.
Key Value
Key-9999 Value-a
Key-9998 Value-b
Key-9997 Value-c
Key-9996 Value-d
…
…
Key-9995 Value-e
Key-9994 Value-f
© 2019 Bloomberg Finance L.P. All rights reserved.
Ordered Key Value
Key-9999 Value-a
Key-9998 Value-b
Key-9997 Value-c
Key-9996 Value-d
…
…
Key-9995 Value-e
Key-9994 Value-a
Lexicographicorder
Key-9993 Value-g
© 2019 Bloomberg Finance L.P. All rights reserved.
Ordered Key Value
Key-9999 Value-a
Key-9998 Value-b
Key-9997 Value-c
Key-9996 Value-d
…
…
Key-9995 Value-e
Key-9994 Value-a
Lexicographicorder
Key-9993 Value-g
Region
© 2019 Bloomberg Finance L.P. All rights reserved.
Distributed Ordered Key Value
Key-199 Value
Key-198 Value
Key-197 Value
Key-499 Value
…
Key-498 Value
Key-497 Value
ordered
…
Key-299 Value
…
Key-298 Value
Key-297 Value
ordered
…
…
…
Key-599 Value
…
Key-598 Value
Key-597 Value
…
Key-399 Value
…
Key-398 Value
Key-397 Value
…
Key-999 Value
…
Key-998 Value
Key-997 Value
…
Region
Region
© 2019 Bloomberg Finance L.P. All rights reserved.
Distributed Ordered Key Value
RegionServer
Key-199 Value
Key-198 Value
Key-197 Value
Key-499 Value
…
Key-498 Value
Key-497 Value
…
Key-299 Value
…
Key-298 Value
Key-297 Value
…
…
…
Key-599 Value
…
Key-598 Value
Key-597 Value
…
Key-399 Value
…
Key-398 Value
Key-397 Value
…
Key-999 Value
…
Key-998 Value
Key-997 Value
…
Region
Region
RegionServer RegionServer
RegionServer RegionServer RegionServer
© 2019 Bloomberg Finance L.P. All rights reserved.
Table Row View
Key Value
Column Id ValueRow Id Timestamp
© 2019 Bloomberg Finance L.P. All rights reserved.
Table Row View
R11|col1|1234567 Value-A
R11|col2|1234567 Value-B
R11|col3|1234567 Value-C
R11|col4|1234567 Value-D
R11
Value-A Value-B Value-C
Col1 Col2 Col3
Value-D
Col4
© 2019 Bloomberg Finance L.P. All rights reserved.
Versioning
R11|col1|1234567 Value-A1
R11|col1|1234566 Value-A
R11|col2|1234567 Value-B
R11|col3|1234567 Value-CC
R11|col3|1234563 Value-C
R11|col4|1234567 Value-DD
R11|col4|1234560 Value-D1
R11|col4|1234557 Value-D
Descendingorder
© 2019 Bloomberg Finance L.P. All rights reserved.
cf2cf1
Column Family
R11|col1|1234567 Value-A
R11|col2|1234567 Value-B
R11|col3|1234567 Value-C
R11
Value-A Value-B Value-C
cf1:col1 cf1:col2 cf1:col3
Value-1
cf2:col1
R11|col1|1234567 Value-1
R11|col2|1234567 Value-2
R11|col3|1234567 Value-3
Value-2
cf2:col2
Value-3
cf2:col3
© 2019 Bloomberg Finance L.P. All rights reserved.
ACIDity
• Atomic at row level
• Consistent to a point in time before the request
• Isolation through MVCC (reads) and row locks (mutations)
• Durability is guaranteed for all successful mutations
© 2019 Bloomberg Finance L.P. All rights reserved.
Namespace
Namespace: Blue Namespace: Yellow
ACL ACL
ACL ACL
ACL
© 2019 Bloomberg Finance L.P. All rights reserved.
HBase Write
HBase Region Server
File System
WAL
Write
1
© 2019 Bloomberg Finance L.P. All rights reserved.
HBase Write
HBase RS
Memstore
File System
WAL
Write
1
2
© 2019 Bloomberg Finance L.P. All rights reserved.
HBase Write
HBase RS
Memstore
File System
WAL t1 t1 t1
Store files
Write
1
2
3
Block Block Block
Idx BlmData
© 2019 Bloomberg Finance L.P. All rights reserved.
HBase Write
HBase RS
Memstore
DFSClient
WAL t1 t1 t1
Store files
Write
1
2
3
© 2019 Bloomberg Finance L.P. All rights reserved.
HBase Write
HBase RS
Memstore
DFSClient
WAL t1 t1 t1
Store files
Write
1
2
3
HDFS DataNode
WAL t1 t1 t1
© 2019 Bloomberg Finance L.P. All rights reserved.
Compaction
File System
K-x K-x D-1
Store files
K-xK-xK-x
© 2019 Bloomberg Finance L.P. All rights reserved.
Compaction
File System
Store files
K-x
Compaction
File System
K-x K-x D-1
Store files
K-xK-xK-x
© 2019 Bloomberg Finance L.P. All rights reserved.
HBase Read
HBase RS Memstore
File System
WAL
Read
1
Block Cache Block
t1 t1 t1
Store files
© 2019 Bloomberg Finance L.P. All rights reserved.
HBase Read
HBase RS Memstore
File System
WAL
Read
1
Block Cache
2
Block
t1 t1 t1
Store files
© 2019 Bloomberg Finance L.P. All rights reserved.
HDFS Read
HBase/
DFSClient
HDFS
File System
Data
TCP
© 2019 Bloomberg Finance L.P. All rights reserved.
HDFS Short-Circuit Read
HBase/
DFSClient
HDFS
File System
Data
TCP
Open
File
Pass
FD
© 2019 Bloomberg Finance L.P. All rights reserved.
Large Read Cache
HBase Memstore
File System
WAL t1 t1 t1
Store files
Read
1
Block Cache
2
Block
© 2019 Bloomberg Finance L.P. All rights reserved.
Large Read Cache
HBase
Memstore
File System
WAL t1 t1 t1
Store files
Read
1
Block Cache
3
Block
Off-heap Cache Block
2
© 2019 Bloomberg Finance L.P. All rights reserved.
HBase Complete
HMaster
RS1 RS2 RS3 RS4 RS5 RS6
system
© 2019 Bloomberg Finance L.P. All rights reserved.
HMaster
HBase Complete
HMaster
RS1 RS2 RS3 RS4 RS5 RS6
system
© 2019 Bloomberg Finance L.P. All rights reserved.
HMaster
HBase Complete
HMaster
RS1 RS2 RS3 RS4 RS5 RS6
system
ZK
© 2019 Bloomberg Finance L.P. All rights reserved.
HMaster
HBase Complete
HMaster
RS1 RS2 RS3 RS4 RS5 RS6
system
ZK
© 2019 Bloomberg Finance L.P. All rights reserved.
HMaster
HBase Complete
HMaster
RS1 RS2 RS3 RS4 RS5 RS6
system
ZK
HBase
Client
© 2019 Bloomberg Finance L.P. All rights reserved.
Region Server Failure
HMaster
RS1 RS2 RS3 RS4 RS5 RS6
system
ZK
T1 r1
HBase
Client
© 2019 Bloomberg Finance L.P. All rights reserved.
Region Server Failure
HMaster
RS1 RS2 RS3 RS4 RS5 RS6
system
ZK
T1 r1
HBase
Client
© 2019 Bloomberg Finance L.P. All rights reserved.
Region Server Failure
HMaster
RS1 RS2 RS3 RS4 RS5 RS6
system
ZK
T1 r1
HBase
Client
© 2019 Bloomberg Finance L.P. All rights reserved.
Region Replication
HMaster
RS1 RS2 RS3 RS4 RS5 RS6
system
ZK
T1 r1
HBase
Client
T1 r1
© 2019 Bloomberg Finance L.P. All rights reserved.
Region Replication
HMaster
RS1 RS2 RS3 RS4 RS5 RS6
system
ZK
T1 r1T1 r1
HBase
Client
© 2019 Bloomberg Finance L.P. All rights reserved.
Region Replication
HMaster
RS1 RS2 RS3 RS4 RS5 RS6
system
ZK
T1 r1T1 r1
HBase
Client
https://www.youtube.com/watch?v=l6S-Vbs9WsU
© 2019 Bloomberg Finance L.P. All rights reserved.
Load Balancing
HMaster
RS1 RS2 RS3 RS4 RS5 RS6
system
ZK
HBase
Client
© 2019 Bloomberg Finance L.P. All rights reserved.
Load Balancing
HMaster
Balancer/AM
RS1 RS2 RS3 RS4 RS5 RS6
system
ZK
HBase
Client
© 2019 Bloomberg Finance L.P. All rights reserved.
Balancer
• Region Count Cost
• Primary Region Count Cost
• Table Skew Cost
• Locality Cost
• Rack Locality Cost
• Region Replica Host Cost
• Region Replica Rack Cost
• Read Request Cost
• Write Request Cost
• Memstore Size Cost
• Storefile Size Cost
• Move Cost
© 2019 Bloomberg Finance L.P. All rights reserved.
Other Features
• HBase Replication
• HBase multi-tenancy support
— https://www.youtube.com/watch?v=bZjz2G38Ju0
• HBase Co-processors and Filters
— https://www.slideshare.net/Hadoop_Summit/hbase-coprocessors-uses-abuses-solutions
© 2019 Bloomberg Finance L.P. All rights reserved.
NameNode
RS4
HMaster
ZQuorum
ZQuorum
Operator View
HMaster
RS1 RS2 RS4 RS5 RS6
HBase
ClientZK
Quorum
DN1 DN2 DN3 DN4 DN5 DN6
NameNode
JVM
OS
JVM
OS
JVM
OS
JVM
OS
JVM
OS
JVM
OS
HW HW HW HW HW HW
© 2019 Bloomberg Finance L.P. All rights reserved.
ZooKeeper Availability
• ZK Quorum
• One leader and remaining followers
— stat
— ruok
— mntr
• Test for availability
— e.g., List children of a znode
© 2019 Bloomberg Finance L.P. All rights reserved.
HBase Availability
• Master Availability - http://hmaster-node:16010/jmx
— "name" : "Hadoop:service=HBase,name=Master,sub=Server","tag.isActiveMaster" : "true"
• Dead RegionServers
— "name" : "Hadoop:service=HBase,name=Master,sub=Server", "numDeadRegionServers" : 0
• Region In Transition
— "name" : "Hadoop:service=HBase,name=Master,sub=AssignmentManger","ritCount" : 0
• Test for availability
— e.g., Query system table by listing tables
© 2019 Bloomberg Finance L.P. All rights reserved.
HDFS Availability
• Namenode Availability - http://namenode-host:50070/jmx
—"name" : "Hadoop:service=NameNode,name=FSNamesystem", "tag.HAState" : "active"
• Dead Datanodes
— "name" : "Hadoop:service=NameNode,name=NameNodeInfo", "DeadNodes" : "{}“
• Missing Blocks
— "name" : "Hadoop:service=NameNode,name=NameNodeInfo", "NumberOfMissingBlocks" : 0
• Percentage Used
— "name" : "Hadoop:service=NameNode,name=NameNodeInfo", "PercentUsed" : 59
• Under replicated blocks
— "name" : "Hadoop:service=NameNode,name=FSNamesystemState","UnderReplicatedBlocks":0
• Test for availability
— e.g., Append data to a test file
© 2019 Bloomberg Finance L.P. All rights reserved.
HBase Performance
• RegionServer JMX metrics - http://rs-node:60300/jmx
— "name“:"Hadoop:service=HBase,name=RegionServer,sub=Server“
• Blockcache hit ratio
• Request counts
• Request response time
• Compaction related metrics
• Region count
• Flush related metrics
• Percentage of files local
• Split related metrics
— "name" : "Hadoop:service=HBase,name=RegionServer,sub=Tables",
• Table level metrics
https://www.slideshare.net/MichaelStack4/hbaseconasia2018-track31-serving-billions-of-queries-in-millisecond-latencies
© 2019 Bloomberg Finance L.P. All rights reserved.
JVM
• GC – JMX Metrics
— "name" : "java.lang:type=GarbageCollector,name=ParNew",
— "name" : "java.lang:type=GarbageCollector,name=ConcurrentMarkSweep",
• GC Logging
— -verbose:gc
— -XX:+PrintHeapAtGC
— -XX:+PrintGCDetails
— -XX:+PrintGCTimeStamps
— -XX:+PrintGCDateStamps
— -XX:+PrintGCApplicationStoppedTime
— -XX:+PrintClassHistogram
— -XX:+PrintGCApplicationConcurrentTime
— -XX:+PrintTenuringDistribution
— -Xloggc:
© 2019 Bloomberg Finance L.P. All rights reserved.
OS/HW
• Memory
• CPU
• Disk
• Networking
© 2019 Bloomberg Finance L.P. All rights reserved.
Logs
• ZooKeeper Log
• HDFS
— Namenode log
— Datanode log
• HBase
— Master log
— RegionServer log
• OS
— Syslog
© 2019 Bloomberg Finance L.P. All rights reserved.
Interacting with HBase
• HBase shell
— DDL: create namespace/table, alter
— Security: grant, revoke
— DML: get, put, scan
— Tools: assign, compact, balance
— General: status
• HBase admin API
• HBase client API
© 2019 Bloomberg Finance L.P. All rights reserved.
Data Backup / Restore
• Snapshot
— hbase shell > snapshot 'table', 'table_mmddyy’
• Restore from snapshot
— hbase shell > restore_snapshot 'table_mmddyy'
• Export Snapshot
— $ hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot
• CopyTable
— hbase org.apache.hadoop.hbase.mapreduce.CopyTable
© 2019 Bloomberg Finance L.P. All rights reserved.
Thank You!
Acknowledgement: Apache HBase Community
Reference: http://hbase.apache.org
Connect with Hadoop Team: hadoop@bloomberg.net
© 2019 Bloomberg Finance L.P. All rights reserved.
We are hiring!
Questions?
https://www.bloomberg.com/careers

Más contenido relacionado

La actualidad más candente

Hadoop development series(1)
Hadoop development series(1)Hadoop development series(1)
Hadoop development series(1)Amar kumar
 
Geospatial Big Data - Foss4gNA
Geospatial Big Data - Foss4gNAGeospatial Big Data - Foss4gNA
Geospatial Big Data - Foss4gNAnormanbarker
 
ElastiCache & Redis: Database Week San Francisco
ElastiCache & Redis: Database Week San FranciscoElastiCache & Redis: Database Week San Francisco
ElastiCache & Redis: Database Week San FranciscoAmazon Web Services
 
Intro to big data and hadoop ubc cs lecture series - g fawkes
Intro to big data and hadoop   ubc cs lecture series - g fawkesIntro to big data and hadoop   ubc cs lecture series - g fawkes
Intro to big data and hadoop ubc cs lecture series - g fawkesgfawkesnew2
 
ElastiCache & Redis: Database Week SF
ElastiCache & Redis: Database Week SFElastiCache & Redis: Database Week SF
ElastiCache & Redis: Database Week SFAmazon Web Services
 
Hive - Cost Based Optimizer
Hive - Cost Based OptimizerHive - Cost Based Optimizer
Hive - Cost Based OptimizerJohn pullokkaran
 
Federated Queries Across Both Different Storage Mediums and Different Data En...
Federated Queries Across Both Different Storage Mediums and Different Data En...Federated Queries Across Both Different Storage Mediums and Different Data En...
Federated Queries Across Both Different Storage Mediums and Different Data En...VMware Tanzu
 

La actualidad más candente (10)

Hadoop development series(1)
Hadoop development series(1)Hadoop development series(1)
Hadoop development series(1)
 
Geospatial Big Data - Foss4gNA
Geospatial Big Data - Foss4gNAGeospatial Big Data - Foss4gNA
Geospatial Big Data - Foss4gNA
 
ElastiCache & Redis: Database Week San Francisco
ElastiCache & Redis: Database Week San FranciscoElastiCache & Redis: Database Week San Francisco
ElastiCache & Redis: Database Week San Francisco
 
Intro to big data and hadoop ubc cs lecture series - g fawkes
Intro to big data and hadoop   ubc cs lecture series - g fawkesIntro to big data and hadoop   ubc cs lecture series - g fawkes
Intro to big data and hadoop ubc cs lecture series - g fawkes
 
ElastiCache & Redis: Database Week SF
ElastiCache & Redis: Database Week SFElastiCache & Redis: Database Week SF
ElastiCache & Redis: Database Week SF
 
Hive - Cost Based Optimizer
Hive - Cost Based OptimizerHive - Cost Based Optimizer
Hive - Cost Based Optimizer
 
CBO-2
CBO-2CBO-2
CBO-2
 
Federated Queries Across Both Different Storage Mediums and Different Data En...
Federated Queries Across Both Different Storage Mediums and Different Data En...Federated Queries Across Both Different Storage Mediums and Different Data En...
Federated Queries Across Both Different Storage Mediums and Different Data En...
 
Big data course
Big data  courseBig data  course
Big data course
 
ElastiCache and Redis
ElastiCache and RedisElastiCache and Redis
ElastiCache and Redis
 

Similar a HBase Internals And Operations

How data modelling helps serve billions of queries in millisecond latency wit...
How data modelling helps serve billions of queries in millisecond latency wit...How data modelling helps serve billions of queries in millisecond latency wit...
How data modelling helps serve billions of queries in millisecond latency wit...DataWorks Summit
 
The Power of Event Driven Caches (Brendan Powers, Bloomberg L.P) Kafka Summit...
The Power of Event Driven Caches (Brendan Powers, Bloomberg L.P) Kafka Summit...The Power of Event Driven Caches (Brendan Powers, Bloomberg L.P) Kafka Summit...
The Power of Event Driven Caches (Brendan Powers, Bloomberg L.P) Kafka Summit...confluent
 
Real-Time Market Data Analytics Using Kafka Streams
Real-Time Market Data Analytics Using Kafka StreamsReal-Time Market Data Analytics Using Kafka Streams
Real-Time Market Data Analytics Using Kafka Streamsconfluent
 
Kafka Connect and KSQL: Useful Tools in Migrating from a Legacy System to Kaf...
Kafka Connect and KSQL: Useful Tools in Migrating from a Legacy System to Kaf...Kafka Connect and KSQL: Useful Tools in Migrating from a Legacy System to Kaf...
Kafka Connect and KSQL: Useful Tools in Migrating from a Legacy System to Kaf...confluent
 
높은 가용성과 성능 향상을 위한 ElastiCache 활용 팁 - 임근택, SendBird :: AWS Summit Seoul 2019
높은 가용성과 성능 향상을 위한 ElastiCache 활용 팁 - 임근택, SendBird :: AWS Summit Seoul 2019 높은 가용성과 성능 향상을 위한 ElastiCache 활용 팁 - 임근택, SendBird :: AWS Summit Seoul 2019
높은 가용성과 성능 향상을 위한 ElastiCache 활용 팁 - 임근택, SendBird :: AWS Summit Seoul 2019 Amazon Web Services Korea
 
HBaseConAsia2018 Track3-1: Serving billions of queries in millisecond latencies
HBaseConAsia2018 Track3-1: Serving billions of queries in millisecond latenciesHBaseConAsia2018 Track3-1: Serving billions of queries in millisecond latencies
HBaseConAsia2018 Track3-1: Serving billions of queries in millisecond latenciesMichael Stack
 
MongoDB @ Fiverr: The Road to Atlas
MongoDB @ Fiverr: The Road to AtlasMongoDB @ Fiverr: The Road to Atlas
MongoDB @ Fiverr: The Road to AtlasMongoDB
 
HBase coprocessors, Uses, Abuses, Solutions
HBase coprocessors, Uses, Abuses, SolutionsHBase coprocessors, Uses, Abuses, Solutions
HBase coprocessors, Uses, Abuses, SolutionsDataWorks Summit
 
BuildingGlobalServices_DevDays2019.pdf
BuildingGlobalServices_DevDays2019.pdfBuildingGlobalServices_DevDays2019.pdf
BuildingGlobalServices_DevDays2019.pdfAmazon Web Services
 
BuildingGlobalServices_DevDays2019.pdf
BuildingGlobalServices_DevDays2019.pdfBuildingGlobalServices_DevDays2019.pdf
BuildingGlobalServices_DevDays2019.pdfAmazon Web Services
 
Building a Data Subscription Service with Kafka Connect (Danica Fine & Ajay V...
Building a Data Subscription Service with Kafka Connect (Danica Fine & Ajay V...Building a Data Subscription Service with Kafka Connect (Danica Fine & Ajay V...
Building a Data Subscription Service with Kafka Connect (Danica Fine & Ajay V...confluent
 
Architecture Patterns for Multi-Region Active-Active Applications (ARC209-R2)...
Architecture Patterns for Multi-Region Active-Active Applications (ARC209-R2)...Architecture Patterns for Multi-Region Active-Active Applications (ARC209-R2)...
Architecture Patterns for Multi-Region Active-Active Applications (ARC209-R2)...Amazon Web Services
 
Building a Streaming Microservices Architecture - Data + AI Summit EU 2020
Building a Streaming Microservices Architecture - Data + AI Summit EU 2020Building a Streaming Microservices Architecture - Data + AI Summit EU 2020
Building a Streaming Microservices Architecture - Data + AI Summit EU 2020Databricks
 
"How to build a global serverless service", Alex Casalboni, AWS Dev Day Kyiv ...
"How to build a global serverless service", Alex Casalboni, AWS Dev Day Kyiv ..."How to build a global serverless service", Alex Casalboni, AWS Dev Day Kyiv ...
"How to build a global serverless service", Alex Casalboni, AWS Dev Day Kyiv ...Provectus
 
AWS DevDay Berlin 2019 - Going Global With Serverless
AWS DevDay Berlin 2019 - Going Global With ServerlessAWS DevDay Berlin 2019 - Going Global With Serverless
AWS DevDay Berlin 2019 - Going Global With ServerlessDarko Mesaroš
 
Scaling a database with Amazon RDS for Oracle - ADB208 - Chicago AWS Summit
Scaling a database with Amazon RDS for Oracle - ADB208 - Chicago AWS SummitScaling a database with Amazon RDS for Oracle - ADB208 - Chicago AWS Summit
Scaling a database with Amazon RDS for Oracle - ADB208 - Chicago AWS SummitAmazon Web Services
 
Budget management with Cloud Economics | AWS Summit Tel Aviv 2019
Budget management with Cloud Economics | AWS Summit Tel Aviv 2019Budget management with Cloud Economics | AWS Summit Tel Aviv 2019
Budget management with Cloud Economics | AWS Summit Tel Aviv 2019AWS Summits
 
Module 1: Introduction to the AWS Cloud - AWSome Day Online Conference 2019
Module 1: Introduction to the AWS Cloud - AWSome Day Online Conference 2019Module 1: Introduction to the AWS Cloud - AWSome Day Online Conference 2019
Module 1: Introduction to the AWS Cloud - AWSome Day Online Conference 2019Amazon Web Services
 
Databases in the Cloud em Amazon Web Services
Databases in the Cloud em Amazon Web Services Databases in the Cloud em Amazon Web Services
Databases in the Cloud em Amazon Web Services Amazon Web Services LATAM
 

Similar a HBase Internals And Operations (20)

How data modelling helps serve billions of queries in millisecond latency wit...
How data modelling helps serve billions of queries in millisecond latency wit...How data modelling helps serve billions of queries in millisecond latency wit...
How data modelling helps serve billions of queries in millisecond latency wit...
 
The Power of Event Driven Caches (Brendan Powers, Bloomberg L.P) Kafka Summit...
The Power of Event Driven Caches (Brendan Powers, Bloomberg L.P) Kafka Summit...The Power of Event Driven Caches (Brendan Powers, Bloomberg L.P) Kafka Summit...
The Power of Event Driven Caches (Brendan Powers, Bloomberg L.P) Kafka Summit...
 
Real-Time Market Data Analytics Using Kafka Streams
Real-Time Market Data Analytics Using Kafka StreamsReal-Time Market Data Analytics Using Kafka Streams
Real-Time Market Data Analytics Using Kafka Streams
 
Kafka Connect and KSQL: Useful Tools in Migrating from a Legacy System to Kaf...
Kafka Connect and KSQL: Useful Tools in Migrating from a Legacy System to Kaf...Kafka Connect and KSQL: Useful Tools in Migrating from a Legacy System to Kaf...
Kafka Connect and KSQL: Useful Tools in Migrating from a Legacy System to Kaf...
 
높은 가용성과 성능 향상을 위한 ElastiCache 활용 팁 - 임근택, SendBird :: AWS Summit Seoul 2019
높은 가용성과 성능 향상을 위한 ElastiCache 활용 팁 - 임근택, SendBird :: AWS Summit Seoul 2019 높은 가용성과 성능 향상을 위한 ElastiCache 활용 팁 - 임근택, SendBird :: AWS Summit Seoul 2019
높은 가용성과 성능 향상을 위한 ElastiCache 활용 팁 - 임근택, SendBird :: AWS Summit Seoul 2019
 
HBaseConAsia2018 Track3-1: Serving billions of queries in millisecond latencies
HBaseConAsia2018 Track3-1: Serving billions of queries in millisecond latenciesHBaseConAsia2018 Track3-1: Serving billions of queries in millisecond latencies
HBaseConAsia2018 Track3-1: Serving billions of queries in millisecond latencies
 
MongoDB @ Fiverr: The Road to Atlas
MongoDB @ Fiverr: The Road to AtlasMongoDB @ Fiverr: The Road to Atlas
MongoDB @ Fiverr: The Road to Atlas
 
HBase coprocessors, Uses, Abuses, Solutions
HBase coprocessors, Uses, Abuses, SolutionsHBase coprocessors, Uses, Abuses, Solutions
HBase coprocessors, Uses, Abuses, Solutions
 
BuildingGlobalServices_DevDays2019.pdf
BuildingGlobalServices_DevDays2019.pdfBuildingGlobalServices_DevDays2019.pdf
BuildingGlobalServices_DevDays2019.pdf
 
BuildingGlobalServices_DevDays2019.pdf
BuildingGlobalServices_DevDays2019.pdfBuildingGlobalServices_DevDays2019.pdf
BuildingGlobalServices_DevDays2019.pdf
 
Building a Data Subscription Service with Kafka Connect (Danica Fine & Ajay V...
Building a Data Subscription Service with Kafka Connect (Danica Fine & Ajay V...Building a Data Subscription Service with Kafka Connect (Danica Fine & Ajay V...
Building a Data Subscription Service with Kafka Connect (Danica Fine & Ajay V...
 
Architecture Patterns for Multi-Region Active-Active Applications (ARC209-R2)...
Architecture Patterns for Multi-Region Active-Active Applications (ARC209-R2)...Architecture Patterns for Multi-Region Active-Active Applications (ARC209-R2)...
Architecture Patterns for Multi-Region Active-Active Applications (ARC209-R2)...
 
Building a Streaming Microservices Architecture - Data + AI Summit EU 2020
Building a Streaming Microservices Architecture - Data + AI Summit EU 2020Building a Streaming Microservices Architecture - Data + AI Summit EU 2020
Building a Streaming Microservices Architecture - Data + AI Summit EU 2020
 
"How to build a global serverless service", Alex Casalboni, AWS Dev Day Kyiv ...
"How to build a global serverless service", Alex Casalboni, AWS Dev Day Kyiv ..."How to build a global serverless service", Alex Casalboni, AWS Dev Day Kyiv ...
"How to build a global serverless service", Alex Casalboni, AWS Dev Day Kyiv ...
 
Cloud Migration Workshop
Cloud Migration WorkshopCloud Migration Workshop
Cloud Migration Workshop
 
AWS DevDay Berlin 2019 - Going Global With Serverless
AWS DevDay Berlin 2019 - Going Global With ServerlessAWS DevDay Berlin 2019 - Going Global With Serverless
AWS DevDay Berlin 2019 - Going Global With Serverless
 
Scaling a database with Amazon RDS for Oracle - ADB208 - Chicago AWS Summit
Scaling a database with Amazon RDS for Oracle - ADB208 - Chicago AWS SummitScaling a database with Amazon RDS for Oracle - ADB208 - Chicago AWS Summit
Scaling a database with Amazon RDS for Oracle - ADB208 - Chicago AWS Summit
 
Budget management with Cloud Economics | AWS Summit Tel Aviv 2019
Budget management with Cloud Economics | AWS Summit Tel Aviv 2019Budget management with Cloud Economics | AWS Summit Tel Aviv 2019
Budget management with Cloud Economics | AWS Summit Tel Aviv 2019
 
Module 1: Introduction to the AWS Cloud - AWSome Day Online Conference 2019
Module 1: Introduction to the AWS Cloud - AWSome Day Online Conference 2019Module 1: Introduction to the AWS Cloud - AWSome Day Online Conference 2019
Module 1: Introduction to the AWS Cloud - AWSome Day Online Conference 2019
 
Databases in the Cloud em Amazon Web Services
Databases in the Cloud em Amazon Web Services Databases in the Cloud em Amazon Web Services
Databases in the Cloud em Amazon Web Services
 

Más de Biju Nair

Chef conf-2015-chef-patterns-at-bloomberg-scale
Chef conf-2015-chef-patterns-at-bloomberg-scaleChef conf-2015-chef-patterns-at-bloomberg-scale
Chef conf-2015-chef-patterns-at-bloomberg-scaleBiju Nair
 
Apache Kafka Reference
Apache Kafka ReferenceApache Kafka Reference
Apache Kafka ReferenceBiju Nair
 
Hadoop security
Hadoop securityHadoop security
Hadoop securityBiju Nair
 
Chef patterns
Chef patternsChef patterns
Chef patternsBiju Nair
 
HBase Application Performance Improvement
HBase Application Performance ImprovementHBase Application Performance Improvement
HBase Application Performance ImprovementBiju Nair
 
HDFS User Reference
HDFS User ReferenceHDFS User Reference
HDFS User ReferenceBiju Nair
 
NENUG Apr14 Talk - data modeling for netezza
NENUG Apr14 Talk - data modeling for netezzaNENUG Apr14 Talk - data modeling for netezza
NENUG Apr14 Talk - data modeling for netezzaBiju Nair
 
Netezza workload management
Netezza workload managementNetezza workload management
Netezza workload managementBiju Nair
 
Row or Columnar Database
Row or Columnar DatabaseRow or Columnar Database
Row or Columnar DatabaseBiju Nair
 
Using Netezza Query Plan to Improve Performace
Using Netezza Query Plan to Improve PerformaceUsing Netezza Query Plan to Improve Performace
Using Netezza Query Plan to Improve PerformaceBiju Nair
 
Netezza fundamentals for developers
Netezza fundamentals for developersNetezza fundamentals for developers
Netezza fundamentals for developersBiju Nair
 
Project Risk Management
Project Risk ManagementProject Risk Management
Project Risk ManagementBiju Nair
 
Websphere MQ (MQSeries) fundamentals
Websphere MQ (MQSeries) fundamentalsWebsphere MQ (MQSeries) fundamentals
Websphere MQ (MQSeries) fundamentalsBiju Nair
 

Más de Biju Nair (14)

Chef conf-2015-chef-patterns-at-bloomberg-scale
Chef conf-2015-chef-patterns-at-bloomberg-scaleChef conf-2015-chef-patterns-at-bloomberg-scale
Chef conf-2015-chef-patterns-at-bloomberg-scale
 
Apache Kafka Reference
Apache Kafka ReferenceApache Kafka Reference
Apache Kafka Reference
 
Hadoop security
Hadoop securityHadoop security
Hadoop security
 
Chef patterns
Chef patternsChef patterns
Chef patterns
 
HBase Application Performance Improvement
HBase Application Performance ImprovementHBase Application Performance Improvement
HBase Application Performance Improvement
 
HDFS User Reference
HDFS User ReferenceHDFS User Reference
HDFS User Reference
 
NENUG Apr14 Talk - data modeling for netezza
NENUG Apr14 Talk - data modeling for netezzaNENUG Apr14 Talk - data modeling for netezza
NENUG Apr14 Talk - data modeling for netezza
 
Netezza workload management
Netezza workload managementNetezza workload management
Netezza workload management
 
Row or Columnar Database
Row or Columnar DatabaseRow or Columnar Database
Row or Columnar Database
 
Using Netezza Query Plan to Improve Performace
Using Netezza Query Plan to Improve PerformaceUsing Netezza Query Plan to Improve Performace
Using Netezza Query Plan to Improve Performace
 
Netezza fundamentals for developers
Netezza fundamentals for developersNetezza fundamentals for developers
Netezza fundamentals for developers
 
Concurrency
ConcurrencyConcurrency
Concurrency
 
Project Risk Management
Project Risk ManagementProject Risk Management
Project Risk Management
 
Websphere MQ (MQSeries) fundamentals
Websphere MQ (MQSeries) fundamentalsWebsphere MQ (MQSeries) fundamentals
Websphere MQ (MQSeries) fundamentals
 

Último

Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 

Último (20)

Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 

HBase Internals And Operations

  • 1. © 2019 Bloomberg Finance L.P. All rights reserved. HBase Internals and Operations SRECon19 Asia/Pacific June 13, 2019 Biju Nair Software Engineer bnair10@bloomberg.net
  • 2. © 2019 Bloomberg Finance L.P. All rights reserved. Agenda • Introduction to HBase • Operating HBase • Questions
  • 3. © 2019 Bloomberg Finance L.P. All rights reserved. Bloomberg in a nutshell The Bloomberg Terminal delivers a diverse array of information on a single platform to facilitate financial decision-making.
  • 4. © 2019 Bloomberg Finance L.P. All rights reserved. Bloomberg technology by the numbers • 5,000+ software engineers • 150+ technologists and data scientists devoted to machine learning • One of the largest private networks in the world • 120 billion pieces of data from the financial markets each day, with a peak of more than 10 million messages/second • 2 million news stories ingested / published each day (500+ news stories ingested/second) • News content from 125K+ sources • Over 1 billion messages and Instant Bloomberg (IB) chats handled daily
  • 5. © 2019 Bloomberg Finance L.P. All rights reserved. HBase at Bloomberg • Started with v0.94.6 • >2 billion reads per day • >1 billion writes per day • 51+ TB of compressed data stored in HBase
  • 6. © 2019 Bloomberg Finance L.P. All rights reserved. HBase Principles • Ordered Key Value Store • Distributed shared nothing
  • 7. © 2019 Bloomberg Finance L.P. All rights reserved. Key Value Key-9999 Value-a Key-9998 Value-b Key-9997 Value-c Key-9996 Value-d … … Key-9995 Value-e Key-9994 Value-f
  • 8. © 2019 Bloomberg Finance L.P. All rights reserved. Ordered Key Value Key-9999 Value-a Key-9998 Value-b Key-9997 Value-c Key-9996 Value-d … … Key-9995 Value-e Key-9994 Value-a Lexicographicorder Key-9993 Value-g
  • 9. © 2019 Bloomberg Finance L.P. All rights reserved. Ordered Key Value Key-9999 Value-a Key-9998 Value-b Key-9997 Value-c Key-9996 Value-d … … Key-9995 Value-e Key-9994 Value-a Lexicographicorder Key-9993 Value-g Region
  • 10. © 2019 Bloomberg Finance L.P. All rights reserved. Distributed Ordered Key Value Key-199 Value Key-198 Value Key-197 Value Key-499 Value … Key-498 Value Key-497 Value ordered … Key-299 Value … Key-298 Value Key-297 Value ordered … … … Key-599 Value … Key-598 Value Key-597 Value … Key-399 Value … Key-398 Value Key-397 Value … Key-999 Value … Key-998 Value Key-997 Value … Region Region
  • 11. © 2019 Bloomberg Finance L.P. All rights reserved. Distributed Ordered Key Value RegionServer Key-199 Value Key-198 Value Key-197 Value Key-499 Value … Key-498 Value Key-497 Value … Key-299 Value … Key-298 Value Key-297 Value … … … Key-599 Value … Key-598 Value Key-597 Value … Key-399 Value … Key-398 Value Key-397 Value … Key-999 Value … Key-998 Value Key-997 Value … Region Region RegionServer RegionServer RegionServer RegionServer RegionServer
  • 12. © 2019 Bloomberg Finance L.P. All rights reserved. Table Row View Key Value Column Id ValueRow Id Timestamp
  • 13. © 2019 Bloomberg Finance L.P. All rights reserved. Table Row View R11|col1|1234567 Value-A R11|col2|1234567 Value-B R11|col3|1234567 Value-C R11|col4|1234567 Value-D R11 Value-A Value-B Value-C Col1 Col2 Col3 Value-D Col4
  • 14. © 2019 Bloomberg Finance L.P. All rights reserved. Versioning R11|col1|1234567 Value-A1 R11|col1|1234566 Value-A R11|col2|1234567 Value-B R11|col3|1234567 Value-CC R11|col3|1234563 Value-C R11|col4|1234567 Value-DD R11|col4|1234560 Value-D1 R11|col4|1234557 Value-D Descendingorder
  • 15. © 2019 Bloomberg Finance L.P. All rights reserved. cf2cf1 Column Family R11|col1|1234567 Value-A R11|col2|1234567 Value-B R11|col3|1234567 Value-C R11 Value-A Value-B Value-C cf1:col1 cf1:col2 cf1:col3 Value-1 cf2:col1 R11|col1|1234567 Value-1 R11|col2|1234567 Value-2 R11|col3|1234567 Value-3 Value-2 cf2:col2 Value-3 cf2:col3
  • 16. © 2019 Bloomberg Finance L.P. All rights reserved. ACIDity • Atomic at row level • Consistent to a point in time before the request • Isolation through MVCC (reads) and row locks (mutations) • Durability is guaranteed for all successful mutations
  • 17. © 2019 Bloomberg Finance L.P. All rights reserved. Namespace Namespace: Blue Namespace: Yellow ACL ACL ACL ACL ACL
  • 18. © 2019 Bloomberg Finance L.P. All rights reserved. HBase Write HBase Region Server File System WAL Write 1
  • 19. © 2019 Bloomberg Finance L.P. All rights reserved. HBase Write HBase RS Memstore File System WAL Write 1 2
  • 20. © 2019 Bloomberg Finance L.P. All rights reserved. HBase Write HBase RS Memstore File System WAL t1 t1 t1 Store files Write 1 2 3 Block Block Block Idx BlmData
  • 21. © 2019 Bloomberg Finance L.P. All rights reserved. HBase Write HBase RS Memstore DFSClient WAL t1 t1 t1 Store files Write 1 2 3
  • 22. © 2019 Bloomberg Finance L.P. All rights reserved. HBase Write HBase RS Memstore DFSClient WAL t1 t1 t1 Store files Write 1 2 3 HDFS DataNode WAL t1 t1 t1
  • 23. © 2019 Bloomberg Finance L.P. All rights reserved. Compaction File System K-x K-x D-1 Store files K-xK-xK-x
  • 24. © 2019 Bloomberg Finance L.P. All rights reserved. Compaction File System Store files K-x Compaction File System K-x K-x D-1 Store files K-xK-xK-x
  • 25. © 2019 Bloomberg Finance L.P. All rights reserved. HBase Read HBase RS Memstore File System WAL Read 1 Block Cache Block t1 t1 t1 Store files
  • 26. © 2019 Bloomberg Finance L.P. All rights reserved. HBase Read HBase RS Memstore File System WAL Read 1 Block Cache 2 Block t1 t1 t1 Store files
  • 27. © 2019 Bloomberg Finance L.P. All rights reserved. HDFS Read HBase/ DFSClient HDFS File System Data TCP
  • 28. © 2019 Bloomberg Finance L.P. All rights reserved. HDFS Short-Circuit Read HBase/ DFSClient HDFS File System Data TCP Open File Pass FD
  • 29. © 2019 Bloomberg Finance L.P. All rights reserved. Large Read Cache HBase Memstore File System WAL t1 t1 t1 Store files Read 1 Block Cache 2 Block
  • 30. © 2019 Bloomberg Finance L.P. All rights reserved. Large Read Cache HBase Memstore File System WAL t1 t1 t1 Store files Read 1 Block Cache 3 Block Off-heap Cache Block 2
  • 31. © 2019 Bloomberg Finance L.P. All rights reserved. HBase Complete HMaster RS1 RS2 RS3 RS4 RS5 RS6 system
  • 32. © 2019 Bloomberg Finance L.P. All rights reserved. HMaster HBase Complete HMaster RS1 RS2 RS3 RS4 RS5 RS6 system
  • 33. © 2019 Bloomberg Finance L.P. All rights reserved. HMaster HBase Complete HMaster RS1 RS2 RS3 RS4 RS5 RS6 system ZK
  • 34. © 2019 Bloomberg Finance L.P. All rights reserved. HMaster HBase Complete HMaster RS1 RS2 RS3 RS4 RS5 RS6 system ZK
  • 35. © 2019 Bloomberg Finance L.P. All rights reserved. HMaster HBase Complete HMaster RS1 RS2 RS3 RS4 RS5 RS6 system ZK HBase Client
  • 36. © 2019 Bloomberg Finance L.P. All rights reserved. Region Server Failure HMaster RS1 RS2 RS3 RS4 RS5 RS6 system ZK T1 r1 HBase Client
  • 37. © 2019 Bloomberg Finance L.P. All rights reserved. Region Server Failure HMaster RS1 RS2 RS3 RS4 RS5 RS6 system ZK T1 r1 HBase Client
  • 38. © 2019 Bloomberg Finance L.P. All rights reserved. Region Server Failure HMaster RS1 RS2 RS3 RS4 RS5 RS6 system ZK T1 r1 HBase Client
  • 39. © 2019 Bloomberg Finance L.P. All rights reserved. Region Replication HMaster RS1 RS2 RS3 RS4 RS5 RS6 system ZK T1 r1 HBase Client T1 r1
  • 40. © 2019 Bloomberg Finance L.P. All rights reserved. Region Replication HMaster RS1 RS2 RS3 RS4 RS5 RS6 system ZK T1 r1T1 r1 HBase Client
  • 41. © 2019 Bloomberg Finance L.P. All rights reserved. Region Replication HMaster RS1 RS2 RS3 RS4 RS5 RS6 system ZK T1 r1T1 r1 HBase Client https://www.youtube.com/watch?v=l6S-Vbs9WsU
  • 42. © 2019 Bloomberg Finance L.P. All rights reserved. Load Balancing HMaster RS1 RS2 RS3 RS4 RS5 RS6 system ZK HBase Client
  • 43. © 2019 Bloomberg Finance L.P. All rights reserved. Load Balancing HMaster Balancer/AM RS1 RS2 RS3 RS4 RS5 RS6 system ZK HBase Client
  • 44. © 2019 Bloomberg Finance L.P. All rights reserved. Balancer • Region Count Cost • Primary Region Count Cost • Table Skew Cost • Locality Cost • Rack Locality Cost • Region Replica Host Cost • Region Replica Rack Cost • Read Request Cost • Write Request Cost • Memstore Size Cost • Storefile Size Cost • Move Cost
  • 45. © 2019 Bloomberg Finance L.P. All rights reserved. Other Features • HBase Replication • HBase multi-tenancy support — https://www.youtube.com/watch?v=bZjz2G38Ju0 • HBase Co-processors and Filters — https://www.slideshare.net/Hadoop_Summit/hbase-coprocessors-uses-abuses-solutions
  • 46. © 2019 Bloomberg Finance L.P. All rights reserved. NameNode RS4 HMaster ZQuorum ZQuorum Operator View HMaster RS1 RS2 RS4 RS5 RS6 HBase ClientZK Quorum DN1 DN2 DN3 DN4 DN5 DN6 NameNode JVM OS JVM OS JVM OS JVM OS JVM OS JVM OS HW HW HW HW HW HW
  • 47. © 2019 Bloomberg Finance L.P. All rights reserved. ZooKeeper Availability • ZK Quorum • One leader and remaining followers — stat — ruok — mntr • Test for availability — e.g., List children of a znode
  • 48. © 2019 Bloomberg Finance L.P. All rights reserved. HBase Availability • Master Availability - http://hmaster-node:16010/jmx — "name" : "Hadoop:service=HBase,name=Master,sub=Server","tag.isActiveMaster" : "true" • Dead RegionServers — "name" : "Hadoop:service=HBase,name=Master,sub=Server", "numDeadRegionServers" : 0 • Region In Transition — "name" : "Hadoop:service=HBase,name=Master,sub=AssignmentManger","ritCount" : 0 • Test for availability — e.g., Query system table by listing tables
  • 49. © 2019 Bloomberg Finance L.P. All rights reserved. HDFS Availability • Namenode Availability - http://namenode-host:50070/jmx —"name" : "Hadoop:service=NameNode,name=FSNamesystem", "tag.HAState" : "active" • Dead Datanodes — "name" : "Hadoop:service=NameNode,name=NameNodeInfo", "DeadNodes" : "{}“ • Missing Blocks — "name" : "Hadoop:service=NameNode,name=NameNodeInfo", "NumberOfMissingBlocks" : 0 • Percentage Used — "name" : "Hadoop:service=NameNode,name=NameNodeInfo", "PercentUsed" : 59 • Under replicated blocks — "name" : "Hadoop:service=NameNode,name=FSNamesystemState","UnderReplicatedBlocks":0 • Test for availability — e.g., Append data to a test file
  • 50. © 2019 Bloomberg Finance L.P. All rights reserved. HBase Performance • RegionServer JMX metrics - http://rs-node:60300/jmx — "name“:"Hadoop:service=HBase,name=RegionServer,sub=Server“ • Blockcache hit ratio • Request counts • Request response time • Compaction related metrics • Region count • Flush related metrics • Percentage of files local • Split related metrics — "name" : "Hadoop:service=HBase,name=RegionServer,sub=Tables", • Table level metrics https://www.slideshare.net/MichaelStack4/hbaseconasia2018-track31-serving-billions-of-queries-in-millisecond-latencies
  • 51. © 2019 Bloomberg Finance L.P. All rights reserved. JVM • GC – JMX Metrics — "name" : "java.lang:type=GarbageCollector,name=ParNew", — "name" : "java.lang:type=GarbageCollector,name=ConcurrentMarkSweep", • GC Logging — -verbose:gc — -XX:+PrintHeapAtGC — -XX:+PrintGCDetails — -XX:+PrintGCTimeStamps — -XX:+PrintGCDateStamps — -XX:+PrintGCApplicationStoppedTime — -XX:+PrintClassHistogram — -XX:+PrintGCApplicationConcurrentTime — -XX:+PrintTenuringDistribution — -Xloggc:
  • 52. © 2019 Bloomberg Finance L.P. All rights reserved. OS/HW • Memory • CPU • Disk • Networking
  • 53. © 2019 Bloomberg Finance L.P. All rights reserved. Logs • ZooKeeper Log • HDFS — Namenode log — Datanode log • HBase — Master log — RegionServer log • OS — Syslog
  • 54. © 2019 Bloomberg Finance L.P. All rights reserved. Interacting with HBase • HBase shell — DDL: create namespace/table, alter — Security: grant, revoke — DML: get, put, scan — Tools: assign, compact, balance — General: status • HBase admin API • HBase client API
  • 55. © 2019 Bloomberg Finance L.P. All rights reserved. Data Backup / Restore • Snapshot — hbase shell > snapshot 'table', 'table_mmddyy’ • Restore from snapshot — hbase shell > restore_snapshot 'table_mmddyy' • Export Snapshot — $ hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot • CopyTable — hbase org.apache.hadoop.hbase.mapreduce.CopyTable
  • 56. © 2019 Bloomberg Finance L.P. All rights reserved. Thank You! Acknowledgement: Apache HBase Community Reference: http://hbase.apache.org Connect with Hadoop Team: hadoop@bloomberg.net
  • 57. © 2019 Bloomberg Finance L.P. All rights reserved. We are hiring! Questions? https://www.bloomberg.com/careers