SlideShare una empresa de Scribd logo
1 de 40
Descargar para leer sin conexión
HBase at Xiaomi
Phil Yang & Guanghao Zhang
{yangzhe1991, zghao}@apache.org
About Xiaomi
● Founded in 2010
● $45 billion valuation
● 200+ million global users
● More than 20 independent Apps with DAU 10M+
● Products: smart phone, TV, router, smart band…
Since HBaseCon 2016
● 1 new PMC member
● 2 new committers (5 committers now)
● Resolved 180+ issues
Agenda
● HBase at Xiaomi
● Replication Improvements
● Confusing Behaviors
● Scan Improvements
● Async Client
Clusters and Scenarios
● IDC
4 data centers (China), 30+ online clusters / 2 offline clusters
● AWS / Alibaba Cloud
7 clusters (China/Singapore/US/Europe)
Upgrade 0.94 ⇒ 0.98
Not compatible
Compatible
Upgrade 0.94 ⇒ 0.98
0.94
Cluster
0.98
Cluster
Add disabled
thrift replication
peer
Take a snapshot
ExportSnapshot
via MR
Need webHDFS
if 0.94 cluster is
on HDFS 2.0
HBASE-12814 : Zero downtime upgrade from 94 to 98
(Using thrift replication to transfer data)
Upgrade 0.94 ⇒ 0.98
Enable thrift
peer
Wait for no
log delay
Shutdown
application
with 0.94
client and
startup the
new version
with 0.98
clientClone
Snapshot
Shutdown
0.94 cluster
and merge
machines to
0.98 cluster
Use a Separate Thrift Server for Replication
Purpose:
Reduce GC of RS
No need to restart RS while changing table name mapping config
0.94
cluster
thrift
replication
server
0.98
cluster
thrift
protocol
0.98
native
protocol
Need a mapping
config to convert
table name from
0.94 to 0.98
Use G1 GC
24 core CPU / 128G memory
Why: Reduce Full GC, reduce number of instances
Must use latest jdk8 to prevent crashing in heavy load
12G
+
12G
12G
+
12G
12G
+
12G
12G
+
12G
50G heap
+
50G offheap
CMS to G1
Use G1 GC
-XX:+UnlockExperimentalVMOptions
-XX:MaxGCPauseMillis={50/90/500} for SSD/HDD/offline cluster
-XX:G1NewSizePercent={2/5} for normal/heavy load cluster
-XX:InitiatingHeapOccupancyPercent=65
-XX:+ParallelRefProcEnabled
-XX:ConcGCThreads=4
-XX:ParallelGCThreads=16
-XX:MaxTenuringThreshold=1
-XX:G1HeapRegionSize=32m
-XX:G1MixedGCCountTarget=64
-XX:G1OldCSetRegionThresholdPercent=5
Replication Improvements
● HBASE-16447 Replication by namespaces config in peer
● HBASE-17296 Provide per peer throttling for replication
● HBASE-17314 Limit total buffered size for all replication sources
● HBASE-12770 Don't transfer all the queued hlogs of a dead server to the
same alive server
● HBASE-9465 Push entries to peer clusters serially
HBASE-9465 Serial Replication
Before HBASE-9465(<= 1.3.x):
RS 1 HLog RS 2 HLog
Log1
Log2
Log3
region move
/ fail over
Log4
Log5
Log6
RepliactionSource 1 ReplicationSource 2
Log1
Log2
Log3
Log4
Log5
Log6
init
Log4/5 is pushed before log3
HBASE-9465 Serial Replication
After HBASE-9465(>= 1.4.0/2.0.0):
RS 1 HLog RS 2 HLog
Log1
Log2
Log3
region move
/ fail over
Log4
Log5
Log6
RepliactionSource 1 ReplicationSource 2
Log1
Log2
Log3
Log4
init
Log4 is pushed after log3
waiting
log3
pushed
Log5
Log6
Need set
REPLICATION_SCOPE=2
Confusing Behaviors
Inconsistent results between source cluster and peer cluster
● HBASE-9465
Deletes mask puts, even puts that happened after the delete was entered
● HBASE-15968
Inconsistent Results when Using Filter to Read Data
Column family’s MaxVersions = 3
1. put T1, T2, T3, T4, T5
2. scan with a ValueFilter(value<=3)
T5 | Value=5
T4 | Value=4
T2 | Value=2
T3 | Value=3
T1 | Value=1
T1, T2 are gone from
user view
Inconsistent Results when Using Filter to Read Data
Column family’s MaxVersions = 3
1. put T1, T2, T3, T4, T5
2. scan with a ValueFilter(value<=3)
T5 | Value=5
T4 | Value=4
T2 | Value=2
T3 | Value=3
T1 | Value=1
T5 | Value=5
T4 | Value=4
T2 | Value=2
T3 | Value=3
T1 | Value=1
T1, T2 are back again!
Inconsistent Results when Using Filter to Read Data
Solution:Adjust the execution order in ScanQueryMatcher.matchColumn
➢ check column
➢ check by filter
➢ check versions
Inconsistent Results when Using Filter to Read Data
Solution:Adjust the execution order in ScanQueryMatcher.matchColumn
➢ check column ➢ check column
➢ check by filter ➢ check versions
➢ check versions ➢ check by filter
Inconsistent Results when Using Filter to Read Data
Solution:Adjust the execution order in ScanQueryMatcher.matchColumn
➢ check column ➢ check column
➢ check by filter ➢ check versions
➢ check versions ➢ check by filter
1. put T1, T2, T3, T4, T5
2. scan with a ValueFilter(value<=3)
scan can’t read T1, T2
T5 | Value=5
T4 | Value=4
T2 | Value=2
T3 | Value=3
T1 | Value=1
Scan Improvements
● Add inclusive/exclusive support for startRow and stopRow
● Add Scan.setLimit(int) to limit the number of rows for Scan
● Unify the implementation of small scan and regular scan
● Pass mvcc to client when scan
Unify the Implementation of Small Scan and Regular Scan
client server
open scanner
.
.
.
scan
next
close scanner
client server
small scanregular scan
Unify the Implementation of Small Scan and Regular Scan
● All scan rpc requests return results
● Use pread by default and switch to streaming read if needed
scan
client server
unified scan
.
.
.
scan
Scan May Break Row Level Consistency
1. put column a, put column b
2. scan
3. put column a, delete column b
4. region move and restrat scan
a | value=1 | mvcc=1
b | value=1 | mvcc=1
Scan May Break Row Level Consistency
1. put column a, put column b
2. scan
3. put column a, delete column b
4. region move and restart scan
scan with read point 1
and read column a
value = 1
a | value=1 | mvcc=1
b | value=1 | mvcc=1
Scan May Break Row Level Consistency
1. put column a, put column b
2. scan
3. put column a, delete column b
4. region move and restart scan b | value=1 | mvcc=1
b | delete | mvcc=2
a | value=1 | mvcc=1
a | value=2 | mvcc=2
Consistent Result
column a value=1 & column b value=1
column a value=2 & column b nothing
Scan May Break Row Level Consistency
1. put column a, put column b
2. scan
3. put column a, delete column b
4. region move and restart scan
restart scan with read
point 2 and read
column b nothing
b | value=1 | mvcc=1
b | delete | mvcc=2
a | value=1 | mvcc=1
a | value=2 | mvcc=2
Scan May Break Row Level Consistency
1. put column a, put column b
2. scan
3. put column a, delete column b
4. region move and restart scan
read column a value = 1 and
read b nothing.
row level consistency broken!
b | value=1 | mvcc=1
b | delete | mvcc=2
a | value=1 | mvcc=1
a | value=2 | mvcc=2
HBASE-17167 Pass mvcc to client when scan
Solution: pass read point to client. When region is moved, use the previous read
point to restart a scan to get a consistent view.
1. put column a, put column b
2. scan
3. put column a, delete column b
4. region move and restart scan
restart scan with
previous read point 1
and read column b
value = 1
b | value=1 | mvcc=1
b | delete | mvcc=2
a | value=1 | mvcc=1
a | value=2 | mvcc=2
Compaction May Break Row Level Consistency
1. put column a, put column b
2. scan
3. put column a, delete column b
4. region move and compact
5. region move and restrat scan
a | value=1 | mvcc=1
b | value=1 | mvcc=1
Compaction May Break Row Level Consistency
1. put column a, put column b
2. scan
3. put column a, delete column b
4. region move and compact
5. region move and restart scan
scan with read point 1
and read column a
value = 1
a | value=1 | mvcc=1
b | value=1 | mvcc=1
Compaction May Break Row Level Consistency
1. put column a, put column b
2. scan
3. put column a, delete column b
4. region move and compact
5. region move and restart scan
Consistent Result
column a value=1 & column b value=1
column a value=2 & column b nothing
b | value=1 | mvcc=1
b | delete | mvcc=2
a | value=1 | mvcc=1
a | value=2 | mvcc=2
Compaction May Break Row Level Consistency
1. put column a, put column b
2. scan
3. put column a, delete column b
4. region move and compact
5. restart scan
b | value=1 | mvcc=1
b | delete | mvcc=2
a | value=1 | mvcc=1
a | value=2 | mvcc=2
Compaction May Break Row Level Consistency
1. put column a, put column b
2. scan
3. put column a, delete column b
4. region move and compact
5. restart scan
restart scan with
previous read point 1
and read column b
nothing
b | value=1 | mvcc=1
b | delete | mvcc=2
a | value=1 | mvcc=1
a | value=2 | mvcc=2
Compaction May Break Row Level Consistency
1. put column a, put column b
2. scan
3. put column a, delete column b
4. region move and compact
5. restart scan
read column a value = 1 and
read b nothing.
row level consistency broken!
b | value=1 | mvcc=1
b | delete | mvcc=2
a | value=1 | mvcc=1
a | value=2 | mvcc=2
HBASE-17177
Status: OPEN
Candidate solution: Disable compaction for a while when open region
1. put column a, put column b
2. scan
3. put column a, delete column b
4. region move and restart scan
restart scan with
previous read point 1
and read column b
value = 1
b | value=1 | mvcc=1
b | delete | mvcc=2
a | value=1 | mvcc=1
a | value=2 | mvcc=2
Async HBase Client
Implementation
● Use the asynchronous protobuf stub
● Use CompletableFuture (Must use latest jdk8 for performance issue)
Async HBase Client
User API
● AsyncTable using users’ ExecutorService
● RawAsyncTable for experts
● ResultScanner in old style
● (Raw)ScanResultConsumer using observer pattern
PeformanceEvaluation Test
Thank you!
Phil Yang & Guanghao Zhang
{yangzhe1991, zghao}@apache.org

Más contenido relacionado

La actualidad más candente

Update on OpenTSDB and AsyncHBase
Update on OpenTSDB and AsyncHBase Update on OpenTSDB and AsyncHBase
Update on OpenTSDB and AsyncHBase HBaseCon
 
Samza memory capacity_2015_ieee_big_data_data_quality_workshop
Samza memory capacity_2015_ieee_big_data_data_quality_workshopSamza memory capacity_2015_ieee_big_data_data_quality_workshop
Samza memory capacity_2015_ieee_big_data_data_quality_workshopTao Feng
 
HBaseCon2017 Highly-Available HBase
HBaseCon2017 Highly-Available HBaseHBaseCon2017 Highly-Available HBase
HBaseCon2017 Highly-Available HBaseHBaseCon
 
What's the time? ...and why? (Mattias Sax, Confluent) Kafka Summit SF 2019
What's the time? ...and why? (Mattias Sax, Confluent) Kafka Summit SF 2019What's the time? ...and why? (Mattias Sax, Confluent) Kafka Summit SF 2019
What's the time? ...and why? (Mattias Sax, Confluent) Kafka Summit SF 2019confluent
 
Update on OpenTSDB and AsyncHBase
Update on OpenTSDB and AsyncHBase Update on OpenTSDB and AsyncHBase
Update on OpenTSDB and AsyncHBase HBaseCon
 
Benchmarking Apache Samza: 1.2 million messages per sec per node
Benchmarking Apache Samza: 1.2 million messages per sec per nodeBenchmarking Apache Samza: 1.2 million messages per sec per node
Benchmarking Apache Samza: 1.2 million messages per sec per nodeTao Feng
 
OpenTSDB: HBaseCon2017
OpenTSDB: HBaseCon2017OpenTSDB: HBaseCon2017
OpenTSDB: HBaseCon2017HBaseCon
 
Terror & Hysteria: Cost Effective Scaling of Time Series Data with Cassandra ...
Terror & Hysteria: Cost Effective Scaling of Time Series Data with Cassandra ...Terror & Hysteria: Cost Effective Scaling of Time Series Data with Cassandra ...
Terror & Hysteria: Cost Effective Scaling of Time Series Data with Cassandra ...DataStax
 
SignalFx: Making Cassandra Perform as a Time Series Database
SignalFx: Making Cassandra Perform as a Time Series DatabaseSignalFx: Making Cassandra Perform as a Time Series Database
SignalFx: Making Cassandra Perform as a Time Series DatabaseDataStax Academy
 
SignalFx Kafka Consumer Optimization
SignalFx Kafka Consumer OptimizationSignalFx Kafka Consumer Optimization
SignalFx Kafka Consumer OptimizationSignalFx
 
The Flux Capacitor of Kafka Streams and ksqlDB (Matthias J. Sax, Confluent) K...
The Flux Capacitor of Kafka Streams and ksqlDB (Matthias J. Sax, Confluent) K...The Flux Capacitor of Kafka Streams and ksqlDB (Matthias J. Sax, Confluent) K...
The Flux Capacitor of Kafka Streams and ksqlDB (Matthias J. Sax, Confluent) K...HostedbyConfluent
 
Flink Forward Berlin 2017: Stefan Richter - A look at Flink's internal data s...
Flink Forward Berlin 2017: Stefan Richter - A look at Flink's internal data s...Flink Forward Berlin 2017: Stefan Richter - A look at Flink's internal data s...
Flink Forward Berlin 2017: Stefan Richter - A look at Flink's internal data s...Flink Forward
 
Eventually, Scylla Chooses Consistency
Eventually, Scylla Chooses ConsistencyEventually, Scylla Chooses Consistency
Eventually, Scylla Chooses ConsistencyScyllaDB
 
stream-processing-at-linkedin-with-apache-samza
stream-processing-at-linkedin-with-apache-samzastream-processing-at-linkedin-with-apache-samza
stream-processing-at-linkedin-with-apache-samzaAbhishek Shivanna
 
OpenTSDB for monitoring @ Criteo
OpenTSDB for monitoring @ CriteoOpenTSDB for monitoring @ Criteo
OpenTSDB for monitoring @ CriteoNathaniel Braun
 
Scylla Summit 2018: Worry-free ingestion - flow-control of writes in Scylla
Scylla Summit 2018: Worry-free ingestion - flow-control of writes in ScyllaScylla Summit 2018: Worry-free ingestion - flow-control of writes in Scylla
Scylla Summit 2018: Worry-free ingestion - flow-control of writes in ScyllaScyllaDB
 
HBaseConAsia2018 Track1-2: WALLess HBase with persistent memory devices
HBaseConAsia2018 Track1-2: WALLess HBase with persistent memory devicesHBaseConAsia2018 Track1-2: WALLess HBase with persistent memory devices
HBaseConAsia2018 Track1-2: WALLess HBase with persistent memory devicesMichael Stack
 

La actualidad más candente (20)

Update on OpenTSDB and AsyncHBase
Update on OpenTSDB and AsyncHBase Update on OpenTSDB and AsyncHBase
Update on OpenTSDB and AsyncHBase
 
Samza memory capacity_2015_ieee_big_data_data_quality_workshop
Samza memory capacity_2015_ieee_big_data_data_quality_workshopSamza memory capacity_2015_ieee_big_data_data_quality_workshop
Samza memory capacity_2015_ieee_big_data_data_quality_workshop
 
HBaseCon2017 Highly-Available HBase
HBaseCon2017 Highly-Available HBaseHBaseCon2017 Highly-Available HBase
HBaseCon2017 Highly-Available HBase
 
What's the time? ...and why? (Mattias Sax, Confluent) Kafka Summit SF 2019
What's the time? ...and why? (Mattias Sax, Confluent) Kafka Summit SF 2019What's the time? ...and why? (Mattias Sax, Confluent) Kafka Summit SF 2019
What's the time? ...and why? (Mattias Sax, Confluent) Kafka Summit SF 2019
 
Update on OpenTSDB and AsyncHBase
Update on OpenTSDB and AsyncHBase Update on OpenTSDB and AsyncHBase
Update on OpenTSDB and AsyncHBase
 
Benchmarking Apache Samza: 1.2 million messages per sec per node
Benchmarking Apache Samza: 1.2 million messages per sec per nodeBenchmarking Apache Samza: 1.2 million messages per sec per node
Benchmarking Apache Samza: 1.2 million messages per sec per node
 
OpenTSDB: HBaseCon2017
OpenTSDB: HBaseCon2017OpenTSDB: HBaseCon2017
OpenTSDB: HBaseCon2017
 
Terror & Hysteria: Cost Effective Scaling of Time Series Data with Cassandra ...
Terror & Hysteria: Cost Effective Scaling of Time Series Data with Cassandra ...Terror & Hysteria: Cost Effective Scaling of Time Series Data with Cassandra ...
Terror & Hysteria: Cost Effective Scaling of Time Series Data with Cassandra ...
 
ApacheCon BigData Europe 2015
ApacheCon BigData Europe 2015 ApacheCon BigData Europe 2015
ApacheCon BigData Europe 2015
 
SignalFx: Making Cassandra Perform as a Time Series Database
SignalFx: Making Cassandra Perform as a Time Series DatabaseSignalFx: Making Cassandra Perform as a Time Series Database
SignalFx: Making Cassandra Perform as a Time Series Database
 
SignalFx Kafka Consumer Optimization
SignalFx Kafka Consumer OptimizationSignalFx Kafka Consumer Optimization
SignalFx Kafka Consumer Optimization
 
The Flux Capacitor of Kafka Streams and ksqlDB (Matthias J. Sax, Confluent) K...
The Flux Capacitor of Kafka Streams and ksqlDB (Matthias J. Sax, Confluent) K...The Flux Capacitor of Kafka Streams and ksqlDB (Matthias J. Sax, Confluent) K...
The Flux Capacitor of Kafka Streams and ksqlDB (Matthias J. Sax, Confluent) K...
 
Flink Forward Berlin 2017: Stefan Richter - A look at Flink's internal data s...
Flink Forward Berlin 2017: Stefan Richter - A look at Flink's internal data s...Flink Forward Berlin 2017: Stefan Richter - A look at Flink's internal data s...
Flink Forward Berlin 2017: Stefan Richter - A look at Flink's internal data s...
 
Eventually, Scylla Chooses Consistency
Eventually, Scylla Chooses ConsistencyEventually, Scylla Chooses Consistency
Eventually, Scylla Chooses Consistency
 
stream-processing-at-linkedin-with-apache-samza
stream-processing-at-linkedin-with-apache-samzastream-processing-at-linkedin-with-apache-samza
stream-processing-at-linkedin-with-apache-samza
 
Apache Kafka at LinkedIn
Apache Kafka at LinkedInApache Kafka at LinkedIn
Apache Kafka at LinkedIn
 
Aerospike & GCE (LSPE Talk)
Aerospike & GCE (LSPE Talk)Aerospike & GCE (LSPE Talk)
Aerospike & GCE (LSPE Talk)
 
OpenTSDB for monitoring @ Criteo
OpenTSDB for monitoring @ CriteoOpenTSDB for monitoring @ Criteo
OpenTSDB for monitoring @ Criteo
 
Scylla Summit 2018: Worry-free ingestion - flow-control of writes in Scylla
Scylla Summit 2018: Worry-free ingestion - flow-control of writes in ScyllaScylla Summit 2018: Worry-free ingestion - flow-control of writes in Scylla
Scylla Summit 2018: Worry-free ingestion - flow-control of writes in Scylla
 
HBaseConAsia2018 Track1-2: WALLess HBase with persistent memory devices
HBaseConAsia2018 Track1-2: WALLess HBase with persistent memory devicesHBaseConAsia2018 Track1-2: WALLess HBase with persistent memory devices
HBaseConAsia2018 Track1-2: WALLess HBase with persistent memory devices
 

Similar a HBaseCon2017 HBase at Xiaomi

running stable diffusion on android
running stable diffusion on androidrunning stable diffusion on android
running stable diffusion on androidKoan-Sin Tan
 
Inside LoLA - Experiences from building a state space tool for place transiti...
Inside LoLA - Experiences from building a state space tool for place transiti...Inside LoLA - Experiences from building a state space tool for place transiti...
Inside LoLA - Experiences from building a state space tool for place transiti...Universität Rostock
 
2014.02.13 (Strata) Graph Analysis with One Trillion Edges on Apache Giraph
2014.02.13 (Strata) Graph Analysis with One Trillion Edges on Apache Giraph2014.02.13 (Strata) Graph Analysis with One Trillion Edges on Apache Giraph
2014.02.13 (Strata) Graph Analysis with One Trillion Edges on Apache GiraphAvery Ching
 
Training CNNs with Selective Allocation of Channels (ICML 2019)
Training CNNs with Selective Allocation of Channels (ICML 2019)Training CNNs with Selective Allocation of Channels (ICML 2019)
Training CNNs with Selective Allocation of Channels (ICML 2019)Jongheon Jeong
 
BlueHat v18 || Hardening hyper-v through offensive security research
BlueHat v18 || Hardening hyper-v through offensive security researchBlueHat v18 || Hardening hyper-v through offensive security research
BlueHat v18 || Hardening hyper-v through offensive security researchBlueHat Security Conference
 
Understanding low latency jvm gcs V2
Understanding low latency jvm gcs V2Understanding low latency jvm gcs V2
Understanding low latency jvm gcs V2Jean-Philippe BEMPEL
 
Restoring Restoration's Reputation in Kafka Streams with Bruno Cadonna & Luca...
Restoring Restoration's Reputation in Kafka Streams with Bruno Cadonna & Luca...Restoring Restoration's Reputation in Kafka Streams with Bruno Cadonna & Luca...
Restoring Restoration's Reputation in Kafka Streams with Bruno Cadonna & Luca...HostedbyConfluent
 
Graph processing
Graph processingGraph processing
Graph processingyeahjs
 
Week5-Faster R-CNN.pptx
Week5-Faster R-CNN.pptxWeek5-Faster R-CNN.pptx
Week5-Faster R-CNN.pptxfahmi324663
 
Streams Don't Fail Me Now - Robustness Features in Kafka Streams
Streams Don't Fail Me Now - Robustness Features in Kafka StreamsStreams Don't Fail Me Now - Robustness Features in Kafka Streams
Streams Don't Fail Me Now - Robustness Features in Kafka StreamsHostedbyConfluent
 
RIP Routing Information Protocol Extreme Networks
RIP Routing Information Protocol Extreme NetworksRIP Routing Information Protocol Extreme Networks
RIP Routing Information Protocol Extreme NetworksDani Royman Simanjuntak
 
LeanIX WeAreDevelopers 2018
LeanIX WeAreDevelopers 2018LeanIX WeAreDevelopers 2018
LeanIX WeAreDevelopers 2018LeanIX GmbH
 
Faster R-CNN - PR012
Faster R-CNN - PR012Faster R-CNN - PR012
Faster R-CNN - PR012Jinwon Lee
 
Solve it Differently with Reactive Programming
Solve it Differently with Reactive ProgrammingSolve it Differently with Reactive Programming
Solve it Differently with Reactive ProgrammingSupun Dissanayake
 
Dynamic Binary Analysis and Obfuscated Codes
Dynamic Binary Analysis and Obfuscated Codes Dynamic Binary Analysis and Obfuscated Codes
Dynamic Binary Analysis and Obfuscated Codes Jonathan Salwan
 
OVN DBs HA with scale test
OVN DBs HA with scale testOVN DBs HA with scale test
OVN DBs HA with scale testAliasgar Ginwala
 
Linear Algebra and Matlab tutorial
Linear Algebra and Matlab tutorialLinear Algebra and Matlab tutorial
Linear Algebra and Matlab tutorialJia-Bin Huang
 
Matlab for diploma students(1)
Matlab for diploma students(1)Matlab for diploma students(1)
Matlab for diploma students(1)Retheesh Raj
 

Similar a HBaseCon2017 HBase at Xiaomi (20)

running stable diffusion on android
running stable diffusion on androidrunning stable diffusion on android
running stable diffusion on android
 
Inside LoLA - Experiences from building a state space tool for place transiti...
Inside LoLA - Experiences from building a state space tool for place transiti...Inside LoLA - Experiences from building a state space tool for place transiti...
Inside LoLA - Experiences from building a state space tool for place transiti...
 
2014.02.13 (Strata) Graph Analysis with One Trillion Edges on Apache Giraph
2014.02.13 (Strata) Graph Analysis with One Trillion Edges on Apache Giraph2014.02.13 (Strata) Graph Analysis with One Trillion Edges on Apache Giraph
2014.02.13 (Strata) Graph Analysis with One Trillion Edges on Apache Giraph
 
Training CNNs with Selective Allocation of Channels (ICML 2019)
Training CNNs with Selective Allocation of Channels (ICML 2019)Training CNNs with Selective Allocation of Channels (ICML 2019)
Training CNNs with Selective Allocation of Channels (ICML 2019)
 
BlueHat v18 || Hardening hyper-v through offensive security research
BlueHat v18 || Hardening hyper-v through offensive security researchBlueHat v18 || Hardening hyper-v through offensive security research
BlueHat v18 || Hardening hyper-v through offensive security research
 
Understanding low latency jvm gcs V2
Understanding low latency jvm gcs V2Understanding low latency jvm gcs V2
Understanding low latency jvm gcs V2
 
Restoring Restoration's Reputation in Kafka Streams with Bruno Cadonna & Luca...
Restoring Restoration's Reputation in Kafka Streams with Bruno Cadonna & Luca...Restoring Restoration's Reputation in Kafka Streams with Bruno Cadonna & Luca...
Restoring Restoration's Reputation in Kafka Streams with Bruno Cadonna & Luca...
 
Graph processing
Graph processingGraph processing
Graph processing
 
Week5-Faster R-CNN.pptx
Week5-Faster R-CNN.pptxWeek5-Faster R-CNN.pptx
Week5-Faster R-CNN.pptx
 
Streams Don't Fail Me Now - Robustness Features in Kafka Streams
Streams Don't Fail Me Now - Robustness Features in Kafka StreamsStreams Don't Fail Me Now - Robustness Features in Kafka Streams
Streams Don't Fail Me Now - Robustness Features in Kafka Streams
 
RIP Routing Information Protocol Extreme Networks
RIP Routing Information Protocol Extreme NetworksRIP Routing Information Protocol Extreme Networks
RIP Routing Information Protocol Extreme Networks
 
LeanIX WeAreDevelopers 2018
LeanIX WeAreDevelopers 2018LeanIX WeAreDevelopers 2018
LeanIX WeAreDevelopers 2018
 
Faster R-CNN - PR012
Faster R-CNN - PR012Faster R-CNN - PR012
Faster R-CNN - PR012
 
Solve it Differently with Reactive Programming
Solve it Differently with Reactive ProgrammingSolve it Differently with Reactive Programming
Solve it Differently with Reactive Programming
 
Dynamic Binary Analysis and Obfuscated Codes
Dynamic Binary Analysis and Obfuscated Codes Dynamic Binary Analysis and Obfuscated Codes
Dynamic Binary Analysis and Obfuscated Codes
 
OVN DBs HA with scale test
OVN DBs HA with scale testOVN DBs HA with scale test
OVN DBs HA with scale test
 
Linear Algebra and Matlab tutorial
Linear Algebra and Matlab tutorialLinear Algebra and Matlab tutorial
Linear Algebra and Matlab tutorial
 
Samza la hug
Samza la hugSamza la hug
Samza la hug
 
Understanding low latency jvm gcs
Understanding low latency jvm gcsUnderstanding low latency jvm gcs
Understanding low latency jvm gcs
 
Matlab for diploma students(1)
Matlab for diploma students(1)Matlab for diploma students(1)
Matlab for diploma students(1)
 

Más de HBaseCon

hbaseconasia2017: HBase on Beam
hbaseconasia2017: HBase on Beamhbaseconasia2017: HBase on Beam
hbaseconasia2017: HBase on BeamHBaseCon
 
hbaseconasia2017: HBase Disaster Recovery Solution at Huawei
hbaseconasia2017: HBase Disaster Recovery Solution at Huaweihbaseconasia2017: HBase Disaster Recovery Solution at Huawei
hbaseconasia2017: HBase Disaster Recovery Solution at HuaweiHBaseCon
 
hbaseconasia2017: Removable singularity: a story of HBase upgrade in Pinterest
hbaseconasia2017: Removable singularity: a story of HBase upgrade in Pinteresthbaseconasia2017: Removable singularity: a story of HBase upgrade in Pinterest
hbaseconasia2017: Removable singularity: a story of HBase upgrade in PinterestHBaseCon
 
hbaseconasia2017: HareQL:快速HBase查詢工具的發展過程
hbaseconasia2017: HareQL:快速HBase查詢工具的發展過程hbaseconasia2017: HareQL:快速HBase查詢工具的發展過程
hbaseconasia2017: HareQL:快速HBase查詢工具的發展過程HBaseCon
 
hbaseconasia2017: Apache HBase at Netease
hbaseconasia2017: Apache HBase at Neteasehbaseconasia2017: Apache HBase at Netease
hbaseconasia2017: Apache HBase at NeteaseHBaseCon
 
hbaseconasia2017: HBase在Hulu的使用和实践
hbaseconasia2017: HBase在Hulu的使用和实践hbaseconasia2017: HBase在Hulu的使用和实践
hbaseconasia2017: HBase在Hulu的使用和实践HBaseCon
 
hbaseconasia2017: 基于HBase的企业级大数据平台
hbaseconasia2017: 基于HBase的企业级大数据平台hbaseconasia2017: 基于HBase的企业级大数据平台
hbaseconasia2017: 基于HBase的企业级大数据平台HBaseCon
 
hbaseconasia2017: HBase at JD.com
hbaseconasia2017: HBase at JD.comhbaseconasia2017: HBase at JD.com
hbaseconasia2017: HBase at JD.comHBaseCon
 
hbaseconasia2017: Large scale data near-line loading method and architecture
hbaseconasia2017: Large scale data near-line loading method and architecturehbaseconasia2017: Large scale data near-line loading method and architecture
hbaseconasia2017: Large scale data near-line loading method and architectureHBaseCon
 
hbaseconasia2017: Ecosystems with HBase and CloudTable service at Huawei
hbaseconasia2017: Ecosystems with HBase and CloudTable service at Huaweihbaseconasia2017: Ecosystems with HBase and CloudTable service at Huawei
hbaseconasia2017: Ecosystems with HBase and CloudTable service at HuaweiHBaseCon
 
hbaseconasia2017: hbase-2.0.0
hbaseconasia2017: hbase-2.0.0hbaseconasia2017: hbase-2.0.0
hbaseconasia2017: hbase-2.0.0HBaseCon
 
HBaseCon2017 Democratizing HBase
HBaseCon2017 Democratizing HBaseHBaseCon2017 Democratizing HBase
HBaseCon2017 Democratizing HBaseHBaseCon
 
HBaseCon2017 Apache HBase at Didi
HBaseCon2017 Apache HBase at DidiHBaseCon2017 Apache HBase at Didi
HBaseCon2017 Apache HBase at DidiHBaseCon
 
HBaseCon2017 gohbase: Pure Go HBase Client
HBaseCon2017 gohbase: Pure Go HBase ClientHBaseCon2017 gohbase: Pure Go HBase Client
HBaseCon2017 gohbase: Pure Go HBase ClientHBaseCon
 
HBaseCon2017 Spark HBase Connector: Feature Rich and Efficient Access to HBas...
HBaseCon2017 Spark HBase Connector: Feature Rich and Efficient Access to HBas...HBaseCon2017 Spark HBase Connector: Feature Rich and Efficient Access to HBas...
HBaseCon2017 Spark HBase Connector: Feature Rich and Efficient Access to HBas...HBaseCon
 
HBaseCon2017 Efficient and portable data processing with Apache Beam and HBase
HBaseCon2017 Efficient and portable data processing with Apache Beam and HBaseHBaseCon2017 Efficient and portable data processing with Apache Beam and HBase
HBaseCon2017 Efficient and portable data processing with Apache Beam and HBaseHBaseCon
 
HBaseCon2017 HBase/Phoenix @ Scale @ Salesforce
HBaseCon2017 HBase/Phoenix @ Scale @ SalesforceHBaseCon2017 HBase/Phoenix @ Scale @ Salesforce
HBaseCon2017 HBase/Phoenix @ Scale @ SalesforceHBaseCon
 
HBaseCon2017 Community-Driven Graphs with JanusGraph
HBaseCon2017 Community-Driven Graphs with JanusGraphHBaseCon2017 Community-Driven Graphs with JanusGraph
HBaseCon2017 Community-Driven Graphs with JanusGraphHBaseCon
 
HBaseCon2017 Warp 10, a novel approach to managing and analyzing time series ...
HBaseCon2017 Warp 10, a novel approach to managing and analyzing time series ...HBaseCon2017 Warp 10, a novel approach to managing and analyzing time series ...
HBaseCon2017 Warp 10, a novel approach to managing and analyzing time series ...HBaseCon
 
HBaseCon2017 Analyzing cryptocurrencies in real time with hBase, Kafka and St...
HBaseCon2017 Analyzing cryptocurrencies in real time with hBase, Kafka and St...HBaseCon2017 Analyzing cryptocurrencies in real time with hBase, Kafka and St...
HBaseCon2017 Analyzing cryptocurrencies in real time with hBase, Kafka and St...HBaseCon
 

Más de HBaseCon (20)

hbaseconasia2017: HBase on Beam
hbaseconasia2017: HBase on Beamhbaseconasia2017: HBase on Beam
hbaseconasia2017: HBase on Beam
 
hbaseconasia2017: HBase Disaster Recovery Solution at Huawei
hbaseconasia2017: HBase Disaster Recovery Solution at Huaweihbaseconasia2017: HBase Disaster Recovery Solution at Huawei
hbaseconasia2017: HBase Disaster Recovery Solution at Huawei
 
hbaseconasia2017: Removable singularity: a story of HBase upgrade in Pinterest
hbaseconasia2017: Removable singularity: a story of HBase upgrade in Pinteresthbaseconasia2017: Removable singularity: a story of HBase upgrade in Pinterest
hbaseconasia2017: Removable singularity: a story of HBase upgrade in Pinterest
 
hbaseconasia2017: HareQL:快速HBase查詢工具的發展過程
hbaseconasia2017: HareQL:快速HBase查詢工具的發展過程hbaseconasia2017: HareQL:快速HBase查詢工具的發展過程
hbaseconasia2017: HareQL:快速HBase查詢工具的發展過程
 
hbaseconasia2017: Apache HBase at Netease
hbaseconasia2017: Apache HBase at Neteasehbaseconasia2017: Apache HBase at Netease
hbaseconasia2017: Apache HBase at Netease
 
hbaseconasia2017: HBase在Hulu的使用和实践
hbaseconasia2017: HBase在Hulu的使用和实践hbaseconasia2017: HBase在Hulu的使用和实践
hbaseconasia2017: HBase在Hulu的使用和实践
 
hbaseconasia2017: 基于HBase的企业级大数据平台
hbaseconasia2017: 基于HBase的企业级大数据平台hbaseconasia2017: 基于HBase的企业级大数据平台
hbaseconasia2017: 基于HBase的企业级大数据平台
 
hbaseconasia2017: HBase at JD.com
hbaseconasia2017: HBase at JD.comhbaseconasia2017: HBase at JD.com
hbaseconasia2017: HBase at JD.com
 
hbaseconasia2017: Large scale data near-line loading method and architecture
hbaseconasia2017: Large scale data near-line loading method and architecturehbaseconasia2017: Large scale data near-line loading method and architecture
hbaseconasia2017: Large scale data near-line loading method and architecture
 
hbaseconasia2017: Ecosystems with HBase and CloudTable service at Huawei
hbaseconasia2017: Ecosystems with HBase and CloudTable service at Huaweihbaseconasia2017: Ecosystems with HBase and CloudTable service at Huawei
hbaseconasia2017: Ecosystems with HBase and CloudTable service at Huawei
 
hbaseconasia2017: hbase-2.0.0
hbaseconasia2017: hbase-2.0.0hbaseconasia2017: hbase-2.0.0
hbaseconasia2017: hbase-2.0.0
 
HBaseCon2017 Democratizing HBase
HBaseCon2017 Democratizing HBaseHBaseCon2017 Democratizing HBase
HBaseCon2017 Democratizing HBase
 
HBaseCon2017 Apache HBase at Didi
HBaseCon2017 Apache HBase at DidiHBaseCon2017 Apache HBase at Didi
HBaseCon2017 Apache HBase at Didi
 
HBaseCon2017 gohbase: Pure Go HBase Client
HBaseCon2017 gohbase: Pure Go HBase ClientHBaseCon2017 gohbase: Pure Go HBase Client
HBaseCon2017 gohbase: Pure Go HBase Client
 
HBaseCon2017 Spark HBase Connector: Feature Rich and Efficient Access to HBas...
HBaseCon2017 Spark HBase Connector: Feature Rich and Efficient Access to HBas...HBaseCon2017 Spark HBase Connector: Feature Rich and Efficient Access to HBas...
HBaseCon2017 Spark HBase Connector: Feature Rich and Efficient Access to HBas...
 
HBaseCon2017 Efficient and portable data processing with Apache Beam and HBase
HBaseCon2017 Efficient and portable data processing with Apache Beam and HBaseHBaseCon2017 Efficient and portable data processing with Apache Beam and HBase
HBaseCon2017 Efficient and portable data processing with Apache Beam and HBase
 
HBaseCon2017 HBase/Phoenix @ Scale @ Salesforce
HBaseCon2017 HBase/Phoenix @ Scale @ SalesforceHBaseCon2017 HBase/Phoenix @ Scale @ Salesforce
HBaseCon2017 HBase/Phoenix @ Scale @ Salesforce
 
HBaseCon2017 Community-Driven Graphs with JanusGraph
HBaseCon2017 Community-Driven Graphs with JanusGraphHBaseCon2017 Community-Driven Graphs with JanusGraph
HBaseCon2017 Community-Driven Graphs with JanusGraph
 
HBaseCon2017 Warp 10, a novel approach to managing and analyzing time series ...
HBaseCon2017 Warp 10, a novel approach to managing and analyzing time series ...HBaseCon2017 Warp 10, a novel approach to managing and analyzing time series ...
HBaseCon2017 Warp 10, a novel approach to managing and analyzing time series ...
 
HBaseCon2017 Analyzing cryptocurrencies in real time with hBase, Kafka and St...
HBaseCon2017 Analyzing cryptocurrencies in real time with hBase, Kafka and St...HBaseCon2017 Analyzing cryptocurrencies in real time with hBase, Kafka and St...
HBaseCon2017 Analyzing cryptocurrencies in real time with hBase, Kafka and St...
 

Último

The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 

Último (20)

The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 

HBaseCon2017 HBase at Xiaomi

  • 1. HBase at Xiaomi Phil Yang & Guanghao Zhang {yangzhe1991, zghao}@apache.org
  • 2. About Xiaomi ● Founded in 2010 ● $45 billion valuation ● 200+ million global users ● More than 20 independent Apps with DAU 10M+ ● Products: smart phone, TV, router, smart band…
  • 3. Since HBaseCon 2016 ● 1 new PMC member ● 2 new committers (5 committers now) ● Resolved 180+ issues
  • 4. Agenda ● HBase at Xiaomi ● Replication Improvements ● Confusing Behaviors ● Scan Improvements ● Async Client
  • 5. Clusters and Scenarios ● IDC 4 data centers (China), 30+ online clusters / 2 offline clusters ● AWS / Alibaba Cloud 7 clusters (China/Singapore/US/Europe)
  • 6. Upgrade 0.94 ⇒ 0.98 Not compatible Compatible
  • 7. Upgrade 0.94 ⇒ 0.98 0.94 Cluster 0.98 Cluster Add disabled thrift replication peer Take a snapshot ExportSnapshot via MR Need webHDFS if 0.94 cluster is on HDFS 2.0 HBASE-12814 : Zero downtime upgrade from 94 to 98 (Using thrift replication to transfer data)
  • 8. Upgrade 0.94 ⇒ 0.98 Enable thrift peer Wait for no log delay Shutdown application with 0.94 client and startup the new version with 0.98 clientClone Snapshot Shutdown 0.94 cluster and merge machines to 0.98 cluster
  • 9. Use a Separate Thrift Server for Replication Purpose: Reduce GC of RS No need to restart RS while changing table name mapping config 0.94 cluster thrift replication server 0.98 cluster thrift protocol 0.98 native protocol Need a mapping config to convert table name from 0.94 to 0.98
  • 10. Use G1 GC 24 core CPU / 128G memory Why: Reduce Full GC, reduce number of instances Must use latest jdk8 to prevent crashing in heavy load 12G + 12G 12G + 12G 12G + 12G 12G + 12G 50G heap + 50G offheap CMS to G1
  • 11. Use G1 GC -XX:+UnlockExperimentalVMOptions -XX:MaxGCPauseMillis={50/90/500} for SSD/HDD/offline cluster -XX:G1NewSizePercent={2/5} for normal/heavy load cluster -XX:InitiatingHeapOccupancyPercent=65 -XX:+ParallelRefProcEnabled -XX:ConcGCThreads=4 -XX:ParallelGCThreads=16 -XX:MaxTenuringThreshold=1 -XX:G1HeapRegionSize=32m -XX:G1MixedGCCountTarget=64 -XX:G1OldCSetRegionThresholdPercent=5
  • 12. Replication Improvements ● HBASE-16447 Replication by namespaces config in peer ● HBASE-17296 Provide per peer throttling for replication ● HBASE-17314 Limit total buffered size for all replication sources ● HBASE-12770 Don't transfer all the queued hlogs of a dead server to the same alive server ● HBASE-9465 Push entries to peer clusters serially
  • 13. HBASE-9465 Serial Replication Before HBASE-9465(<= 1.3.x): RS 1 HLog RS 2 HLog Log1 Log2 Log3 region move / fail over Log4 Log5 Log6 RepliactionSource 1 ReplicationSource 2 Log1 Log2 Log3 Log4 Log5 Log6 init Log4/5 is pushed before log3
  • 14. HBASE-9465 Serial Replication After HBASE-9465(>= 1.4.0/2.0.0): RS 1 HLog RS 2 HLog Log1 Log2 Log3 region move / fail over Log4 Log5 Log6 RepliactionSource 1 ReplicationSource 2 Log1 Log2 Log3 Log4 init Log4 is pushed after log3 waiting log3 pushed Log5 Log6 Need set REPLICATION_SCOPE=2
  • 15. Confusing Behaviors Inconsistent results between source cluster and peer cluster ● HBASE-9465 Deletes mask puts, even puts that happened after the delete was entered ● HBASE-15968
  • 16. Inconsistent Results when Using Filter to Read Data Column family’s MaxVersions = 3 1. put T1, T2, T3, T4, T5 2. scan with a ValueFilter(value<=3) T5 | Value=5 T4 | Value=4 T2 | Value=2 T3 | Value=3 T1 | Value=1 T1, T2 are gone from user view
  • 17. Inconsistent Results when Using Filter to Read Data Column family’s MaxVersions = 3 1. put T1, T2, T3, T4, T5 2. scan with a ValueFilter(value<=3) T5 | Value=5 T4 | Value=4 T2 | Value=2 T3 | Value=3 T1 | Value=1 T5 | Value=5 T4 | Value=4 T2 | Value=2 T3 | Value=3 T1 | Value=1 T1, T2 are back again!
  • 18. Inconsistent Results when Using Filter to Read Data Solution:Adjust the execution order in ScanQueryMatcher.matchColumn ➢ check column ➢ check by filter ➢ check versions
  • 19. Inconsistent Results when Using Filter to Read Data Solution:Adjust the execution order in ScanQueryMatcher.matchColumn ➢ check column ➢ check column ➢ check by filter ➢ check versions ➢ check versions ➢ check by filter
  • 20. Inconsistent Results when Using Filter to Read Data Solution:Adjust the execution order in ScanQueryMatcher.matchColumn ➢ check column ➢ check column ➢ check by filter ➢ check versions ➢ check versions ➢ check by filter 1. put T1, T2, T3, T4, T5 2. scan with a ValueFilter(value<=3) scan can’t read T1, T2 T5 | Value=5 T4 | Value=4 T2 | Value=2 T3 | Value=3 T1 | Value=1
  • 21. Scan Improvements ● Add inclusive/exclusive support for startRow and stopRow ● Add Scan.setLimit(int) to limit the number of rows for Scan ● Unify the implementation of small scan and regular scan ● Pass mvcc to client when scan
  • 22. Unify the Implementation of Small Scan and Regular Scan client server open scanner . . . scan next close scanner client server small scanregular scan
  • 23. Unify the Implementation of Small Scan and Regular Scan ● All scan rpc requests return results ● Use pread by default and switch to streaming read if needed scan client server unified scan . . . scan
  • 24. Scan May Break Row Level Consistency 1. put column a, put column b 2. scan 3. put column a, delete column b 4. region move and restrat scan a | value=1 | mvcc=1 b | value=1 | mvcc=1
  • 25. Scan May Break Row Level Consistency 1. put column a, put column b 2. scan 3. put column a, delete column b 4. region move and restart scan scan with read point 1 and read column a value = 1 a | value=1 | mvcc=1 b | value=1 | mvcc=1
  • 26. Scan May Break Row Level Consistency 1. put column a, put column b 2. scan 3. put column a, delete column b 4. region move and restart scan b | value=1 | mvcc=1 b | delete | mvcc=2 a | value=1 | mvcc=1 a | value=2 | mvcc=2 Consistent Result column a value=1 & column b value=1 column a value=2 & column b nothing
  • 27. Scan May Break Row Level Consistency 1. put column a, put column b 2. scan 3. put column a, delete column b 4. region move and restart scan restart scan with read point 2 and read column b nothing b | value=1 | mvcc=1 b | delete | mvcc=2 a | value=1 | mvcc=1 a | value=2 | mvcc=2
  • 28. Scan May Break Row Level Consistency 1. put column a, put column b 2. scan 3. put column a, delete column b 4. region move and restart scan read column a value = 1 and read b nothing. row level consistency broken! b | value=1 | mvcc=1 b | delete | mvcc=2 a | value=1 | mvcc=1 a | value=2 | mvcc=2
  • 29. HBASE-17167 Pass mvcc to client when scan Solution: pass read point to client. When region is moved, use the previous read point to restart a scan to get a consistent view. 1. put column a, put column b 2. scan 3. put column a, delete column b 4. region move and restart scan restart scan with previous read point 1 and read column b value = 1 b | value=1 | mvcc=1 b | delete | mvcc=2 a | value=1 | mvcc=1 a | value=2 | mvcc=2
  • 30. Compaction May Break Row Level Consistency 1. put column a, put column b 2. scan 3. put column a, delete column b 4. region move and compact 5. region move and restrat scan a | value=1 | mvcc=1 b | value=1 | mvcc=1
  • 31. Compaction May Break Row Level Consistency 1. put column a, put column b 2. scan 3. put column a, delete column b 4. region move and compact 5. region move and restart scan scan with read point 1 and read column a value = 1 a | value=1 | mvcc=1 b | value=1 | mvcc=1
  • 32. Compaction May Break Row Level Consistency 1. put column a, put column b 2. scan 3. put column a, delete column b 4. region move and compact 5. region move and restart scan Consistent Result column a value=1 & column b value=1 column a value=2 & column b nothing b | value=1 | mvcc=1 b | delete | mvcc=2 a | value=1 | mvcc=1 a | value=2 | mvcc=2
  • 33. Compaction May Break Row Level Consistency 1. put column a, put column b 2. scan 3. put column a, delete column b 4. region move and compact 5. restart scan b | value=1 | mvcc=1 b | delete | mvcc=2 a | value=1 | mvcc=1 a | value=2 | mvcc=2
  • 34. Compaction May Break Row Level Consistency 1. put column a, put column b 2. scan 3. put column a, delete column b 4. region move and compact 5. restart scan restart scan with previous read point 1 and read column b nothing b | value=1 | mvcc=1 b | delete | mvcc=2 a | value=1 | mvcc=1 a | value=2 | mvcc=2
  • 35. Compaction May Break Row Level Consistency 1. put column a, put column b 2. scan 3. put column a, delete column b 4. region move and compact 5. restart scan read column a value = 1 and read b nothing. row level consistency broken! b | value=1 | mvcc=1 b | delete | mvcc=2 a | value=1 | mvcc=1 a | value=2 | mvcc=2
  • 36. HBASE-17177 Status: OPEN Candidate solution: Disable compaction for a while when open region 1. put column a, put column b 2. scan 3. put column a, delete column b 4. region move and restart scan restart scan with previous read point 1 and read column b value = 1 b | value=1 | mvcc=1 b | delete | mvcc=2 a | value=1 | mvcc=1 a | value=2 | mvcc=2
  • 37. Async HBase Client Implementation ● Use the asynchronous protobuf stub ● Use CompletableFuture (Must use latest jdk8 for performance issue)
  • 38. Async HBase Client User API ● AsyncTable using users’ ExecutorService ● RawAsyncTable for experts ● ResultScanner in old style ● (Raw)ScanResultConsumer using observer pattern
  • 40. Thank you! Phil Yang & Guanghao Zhang {yangzhe1991, zghao}@apache.org