SlideShare una empresa de Scribd logo
1 de 16
Descargar para leer sin conexión
HBCK2: Concepts, trends and recipes
for fixing issues within HBase 2
Wellington Chevreuil
HBase Committer Cloudera HBase SW Engineer
HBCK (1) - Little bit of history
• Main tool for general inconsistencies in hbase-1.x
• The Swiss Knife for operators
• Packaged together with hbase main project
• Provides both diagnosing and fixing commands
• Some reports may be misleading, e.g., "holes in the region chain"
• Some options can cause damages if not well understood, e.g., "-sidelineBigOverlaps", "-removeParents"
• Commands often work independent of Master
• Can introduce conflicts on meta information maintained by Master
• Lack of implementation details on documentation/help guide
HBCK1 Commands user guide:
HBCK2 in a nutshell
• Simpler tool
• Less fix commands
• No diagnosis command
• Requires deeper HBase internal workings from operators
• Shipped independently from hbase
• Packaged with hbase-operators-tool project
• https://github.com/apache/hbase-operator-tools
• Can evolve on its own pace
• New versions can be run without needing whole hbase upgrade
• Master oriented (more later)
• More detailed documentation about each command
• Still a WIP
• By the time of this presentation, there's still no official release for HBCK2
HBCK2 Concepts
• AMv2 compliant
• HBCK1 does not work with HBase 2 AssignmentManager re-implementation
• Thinner, but more interactive commands
• No such thing as hbck1 -fix command
• Operators required to fix an issue at a time
• Master oriented
• Master must be online
• Commands implementation should use Master HbckService as much as possible
• However, new commands may initially require a client side implementation, then get ported to Master's
HbckService facade
• Fix only, requires other tools for issue diagnosing
• Available only for 2.0.3 onwards, and 2.1.1 onwards
HBCK2 Commands user guide:
HBCK2 Usage trends
• Master not completing initialisation
• Meta/Namespace table "NOT online" issues
• Table RIT issues
• Procedures stuck
• Table in wrong state
• Missing regions in META
• User induced via incompatible OfflineMetaRepair tool
HBCK2 for Operators: How do I get and run it?
• Not released so far, requires local build
• Requirements
• JDK 1.8 or higher
• Git
• Maven
• Checkout related apache github repository:
• $ git clone https://github.com/apache/hbase-operator-tools.git
• Build HBCK2 upon desired hbase version:
• $ mvn -Dhbase.version=2.1.5 clean install
• Above command will produce HBCK2 jar file under ./hbase-hbck2/target/, named
hbase-hbck2-1.0.0-SNAPSNOT.jar (assuming current version is 1.0.0-SNAPSHOT)
• Upload generated jar to the given hbase cluster and run it as below:
• $ hbase hbck -j ../hbase-hbck2-1.0.0-SNAPSHOT.jar
HBCK2 for Operators: Recipes
• Meta/Namespace table regions "NOT online"
• Due to corruption or manual deletion of /hbase/MasterProcWALs files
• Meta may miss info about RS assignment
• Master logs show regions assigned to an old RS start code
• Run HBCK2 assigns command for META region 1588230740:
• $ hbase hbck -j ../hbase-hbck2-1.0.0-SNAPSHOT.jar assigns 1588230740
• Similar issue may affect namespace and user tables regions
• Affected regions names would be mentioned on log messages similar to above
WARN org.apache.hadoop.hbase.master.HMaster: hbase:meta,,1.1588230740 is NOT online; state={1588230740 state=OPENING,
ts=1550754721289, server=regionserver01.example.com,16020,1550676598448}; ServerCrashProcedures=true. Master startup cannot
progress, in holding-pattern until region onlined.
HBCK2 for Operators: Recipes
• Table RIT issues
• Usually, due several RSes crashes/slowness while regions are transitioning
• Run HBCK2 assigns command for the given region encoded name
11bf6b18ddacdd864728e6cf1199b2a7:
• $ hbase hbck -j ../hbase-hbck2-1.0.0-SNAPSHOT.jar assigns 11bf6b18ddacdd864728e6cf1199b2a7
WARN org.apache.hadoop.hbase.master.assignment.AssignmentManager: STUCK Region-In-Transition rit=OPENING,
location=regionserver01.example.com,16020,1542314816394, table=hbase:acl, region=11bf6b18ddacdd864728e6cf1199b2a7
...
WARN org.apache.hadoop.hbase.ipc.RpcServer: Dropping timed out call: callId: 702 service: ClientService methodName: Mutate size: 272
connection: 1.1.1.1:56492 deadline: 1542316740911 param: region= hbase:meta,,1,
row=hbase:acl,,1404406671604.11bf6b18ddacdd864728e6cf1199b2a7. connection: 1.1.1.1:56492
HBCK2 for Operators: Recipes
• Procedures stuck
• While troubleshooting causes for RITs, check for procedures attempting to transition regions states:
• $ echo "list_procedures" | hbase shell
• Output for list_procedures shows WAITING_TIMEOUT and/or procedures running for
days
• Other procedures fail to acquire lock owned by one of the stuck procedures:
• Run HBCK2 bypass command to get rid of stuck procedures:
• $ hbase hbck -j ../hbase-hbck2-1.0.0-SNAPSHOT.jar bypass 6 7hbase hbck -j ../hbase-hbck2-1.0.0-SNAPSHOT.jar bypass 6 7
PID Name State Submitted Last_Update Parameters
6 org.apache.hadoop.hbase.master.assignment.UnassignProcedure WAITING_TIMEOUT 2019-03-29 11:15:06 2019-04-08 06:33:35 ...
7 org.apache.hadoop.hbase.master.procedure.DeleteTableProcedure RUNNABLE 2019-03-29 11:24:39 2019-03-29 11:24:39 ...
ERROR: org.apache.hadoop.hbase.procedure2.ProcedureAbortedException: f7910bfc9c9... owned by pid=6, CANNOT run 'this' (pid=347).
HBCK2 for Operators: Recipes
• Table in wrong state
• Can happen after hanging enable/disable table procedures, or related sub-procedures
• Bypassing procedures can lead to this as well
• Table indefinitely in temporary states ENABLING/DISABLING
• scan 'hbase:meta', {COLUMN => "table:state"}
• enable 'usertable'
• Run HBCK2 setTableState to manually bring table state to one of the final ones
ENABLED/DISABLED:
• $ hbase hbck -j ../hbase-hbck2-1.0.0-SNAPSHOT.jar setTableState usertable DISABLEDbase hbck -j ../hbase-hbck2-1.0.0-SNAPSHOT.jar bypass 6 7
usertable column=table:state, timestamp=1555406568751, value=x08x03.
ERROR: Table tableName=usertable, state=ENABLING should be disabled!
HBCK2 for Operators: Recipes
• Missing regions in META
• Operator induced when running incompatible tool OfflineMetaRepair (HBASE-21665)
• Typically manifests as holes on the region chain, or in the case of namespace region missing, master fails
initialisation
• scan 'hbase:meta', {COLUMN => "table:state", ROWPREFIXFILTER => 'hbase:namespace'}
• Still under development through HBASE-22567, HBCK2 addMissingRegionsInMeta
can be used to re-add missing regions:
• $ hbase hbck -j ../hbase-hbck2-1.0.0-SNAPSHOT.jar addMissingRegionInMeta hbase:namespace
• Still WIP, so syntax might change.
• Check HBASE-22567 for latest developments
e-hbck2-1.0.0-SNAPSHOT.jar bypass 6 7
ROW COLUMN+CELL
0 row(s)
HBCK2 for Contributors
• Apache github repository: https://github.com/apache/hbase-operator-tools
• HBCK2 defined as sub-module hbase-hbck2 of hbase-operator-tools
• HBASE-21745
• Umbrella jira for tracking potential new HBCK2 features
• Faced a new issue in HBase 2? Have a new idea for HBCK2 command?
• Great! Contributions are welcome!
• Start a [DISCUSS] mail thread on dev@hbase.apache.org
• Post a comment on HBASE-21745 describing your idea
e-hbck2-1.0.0-SNAPSHOT.jar bypass 6 7
Thanks!

Más contenido relacionado

La actualidad más candente

HBase Blockcache 101
HBase Blockcache 101HBase Blockcache 101
HBase Blockcache 101Nick Dimiduk
 
Hadoop World 2011: Advanced HBase Schema Design - Lars George, Cloudera
Hadoop World 2011: Advanced HBase Schema Design - Lars George, ClouderaHadoop World 2011: Advanced HBase Schema Design - Lars George, Cloudera
Hadoop World 2011: Advanced HBase Schema Design - Lars George, ClouderaCloudera, Inc.
 
HBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation Buffers
HBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation BuffersHBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation Buffers
HBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation BuffersCloudera, Inc.
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EngineDataWorks Summit
 
Apache HBase Improvements and Practices at Xiaomi
Apache HBase Improvements and Practices at XiaomiApache HBase Improvements and Practices at Xiaomi
Apache HBase Improvements and Practices at XiaomiHBaseCon
 
Apache Tez: Accelerating Hadoop Query Processing
Apache Tez: Accelerating Hadoop Query Processing Apache Tez: Accelerating Hadoop Query Processing
Apache Tez: Accelerating Hadoop Query Processing DataWorks Summit
 
Managing 2000 Node Cluster with Ambari
Managing 2000 Node Cluster with AmbariManaging 2000 Node Cluster with Ambari
Managing 2000 Node Cluster with AmbariDataWorks Summit
 
The evolution of Netflix's S3 data warehouse (Strata NY 2018)
The evolution of Netflix's S3 data warehouse (Strata NY 2018)The evolution of Netflix's S3 data warehouse (Strata NY 2018)
The evolution of Netflix's S3 data warehouse (Strata NY 2018)Ryan Blue
 
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013Introduction and Overview of Apache Kafka, TriHUG July 23, 2013
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013mumrah
 
Apache Tez - A unifying Framework for Hadoop Data Processing
Apache Tez - A unifying Framework for Hadoop Data ProcessingApache Tez - A unifying Framework for Hadoop Data Processing
Apache Tez - A unifying Framework for Hadoop Data ProcessingDataWorks Summit
 
HBase Advanced - Lars George
HBase Advanced - Lars GeorgeHBase Advanced - Lars George
HBase Advanced - Lars GeorgeJAX London
 
Apache HBase Performance Tuning
Apache HBase Performance TuningApache HBase Performance Tuning
Apache HBase Performance TuningLars Hofhansl
 
Ceph and RocksDB
Ceph and RocksDBCeph and RocksDB
Ceph and RocksDBSage Weil
 
Hive 3 - a new horizon
Hive 3 - a new horizonHive 3 - a new horizon
Hive 3 - a new horizonThejas Nair
 
Ozone and HDFS's Evolution
Ozone and HDFS's EvolutionOzone and HDFS's Evolution
Ozone and HDFS's EvolutionDataWorks Summit
 

La actualidad más candente (20)

HBase Blockcache 101
HBase Blockcache 101HBase Blockcache 101
HBase Blockcache 101
 
Hadoop World 2011: Advanced HBase Schema Design - Lars George, Cloudera
Hadoop World 2011: Advanced HBase Schema Design - Lars George, ClouderaHadoop World 2011: Advanced HBase Schema Design - Lars George, Cloudera
Hadoop World 2011: Advanced HBase Schema Design - Lars George, Cloudera
 
Hive tuning
Hive tuningHive tuning
Hive tuning
 
HBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation Buffers
HBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation BuffersHBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation Buffers
HBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation Buffers
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything Engine
 
Apache HBase Improvements and Practices at Xiaomi
Apache HBase Improvements and Practices at XiaomiApache HBase Improvements and Practices at Xiaomi
Apache HBase Improvements and Practices at Xiaomi
 
Apache Tez: Accelerating Hadoop Query Processing
Apache Tez: Accelerating Hadoop Query Processing Apache Tez: Accelerating Hadoop Query Processing
Apache Tez: Accelerating Hadoop Query Processing
 
Managing 2000 Node Cluster with Ambari
Managing 2000 Node Cluster with AmbariManaging 2000 Node Cluster with Ambari
Managing 2000 Node Cluster with Ambari
 
The evolution of Netflix's S3 data warehouse (Strata NY 2018)
The evolution of Netflix's S3 data warehouse (Strata NY 2018)The evolution of Netflix's S3 data warehouse (Strata NY 2018)
The evolution of Netflix's S3 data warehouse (Strata NY 2018)
 
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013Introduction and Overview of Apache Kafka, TriHUG July 23, 2013
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013
 
Apache Tez - A unifying Framework for Hadoop Data Processing
Apache Tez - A unifying Framework for Hadoop Data ProcessingApache Tez - A unifying Framework for Hadoop Data Processing
Apache Tez - A unifying Framework for Hadoop Data Processing
 
Hive Does ACID
Hive Does ACIDHive Does ACID
Hive Does ACID
 
HBase Advanced - Lars George
HBase Advanced - Lars GeorgeHBase Advanced - Lars George
HBase Advanced - Lars George
 
Apache HBase Performance Tuning
Apache HBase Performance TuningApache HBase Performance Tuning
Apache HBase Performance Tuning
 
Ceph and RocksDB
Ceph and RocksDBCeph and RocksDB
Ceph and RocksDB
 
Apache Ranger Hive Metastore Security
Apache Ranger Hive Metastore Security Apache Ranger Hive Metastore Security
Apache Ranger Hive Metastore Security
 
Hive 3 - a new horizon
Hive 3 - a new horizonHive 3 - a new horizon
Hive 3 - a new horizon
 
Apache Ranger
Apache RangerApache Ranger
Apache Ranger
 
HDFS Overview
HDFS OverviewHDFS Overview
HDFS Overview
 
Ozone and HDFS's Evolution
Ozone and HDFS's EvolutionOzone and HDFS's Evolution
Ozone and HDFS's Evolution
 

Similar a HBCK2: Concepts, trends and recipes for fixing issues within HBase 2

HBaseCon2017 Removable singularity: a story of HBase upgrade in Pinterest
HBaseCon2017 Removable singularity: a story of HBase upgrade in PinterestHBaseCon2017 Removable singularity: a story of HBase upgrade in Pinterest
HBaseCon2017 Removable singularity: a story of HBase upgrade in PinterestHBaseCon
 
hbaseconasia2017: Removable singularity: a story of HBase upgrade in Pinterest
hbaseconasia2017: Removable singularity: a story of HBase upgrade in Pinteresthbaseconasia2017: Removable singularity: a story of HBase upgrade in Pinterest
hbaseconasia2017: Removable singularity: a story of HBase upgrade in PinterestHBaseCon
 
HBase tales from the trenches
HBase tales from the trenchesHBase tales from the trenches
HBase tales from the trencheswchevreuil
 
Meet hbase 2.0
Meet hbase 2.0Meet hbase 2.0
Meet hbase 2.0enissoz
 
Meet HBase 2.0
Meet HBase 2.0Meet HBase 2.0
Meet HBase 2.0enissoz
 
Apache HBase 1.0 Release
Apache HBase 1.0 ReleaseApache HBase 1.0 Release
Apache HBase 1.0 ReleaseNick Dimiduk
 
HBaseCon 2012 | HBase Coprocessors – Deploy Shared Functionality Directly on ...
HBaseCon 2012 | HBase Coprocessors – Deploy Shared Functionality Directly on ...HBaseCon 2012 | HBase Coprocessors – Deploy Shared Functionality Directly on ...
HBaseCon 2012 | HBase Coprocessors – Deploy Shared Functionality Directly on ...Cloudera, Inc.
 
H base introduction & development
H base introduction & developmentH base introduction & development
H base introduction & developmentShashwat Shriparv
 
HBase System Tables / Metadata Info
HBase System Tables / Metadata InfoHBase System Tables / Metadata Info
HBase System Tables / Metadata Infowchevreuil
 
HBase state of the union
HBase   state of the unionHBase   state of the union
HBase state of the unionenissoz
 
HBaseCon 2015: Meet HBase 1.0
HBaseCon 2015: Meet HBase 1.0HBaseCon 2015: Meet HBase 1.0
HBaseCon 2015: Meet HBase 1.0HBaseCon
 
Meet HBase 1.0
Meet HBase 1.0Meet HBase 1.0
Meet HBase 1.0enissoz
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...DataWorks Summit
 
Debugging Hive with Hadoop-in-the-Cloud by David Chaiken of Altiscale
Debugging Hive with Hadoop-in-the-Cloud by David Chaiken of AltiscaleDebugging Hive with Hadoop-in-the-Cloud by David Chaiken of Altiscale
Debugging Hive with Hadoop-in-the-Cloud by David Chaiken of AltiscaleData Con LA
 
Rails on HBase
Rails on HBaseRails on HBase
Rails on HBasezpinter
 
Chef Patterns at Bloomberg Scale
Chef Patterns at Bloomberg ScaleChef Patterns at Bloomberg Scale
Chef Patterns at Bloomberg ScaleChef
 
HBase Status Report - Hadoop Summit Europe 2014
HBase Status Report - Hadoop Summit Europe 2014HBase Status Report - Hadoop Summit Europe 2014
HBase Status Report - Hadoop Summit Europe 2014larsgeorge
 

Similar a HBCK2: Concepts, trends and recipes for fixing issues within HBase 2 (20)

HBaseCon2017 Removable singularity: a story of HBase upgrade in Pinterest
HBaseCon2017 Removable singularity: a story of HBase upgrade in PinterestHBaseCon2017 Removable singularity: a story of HBase upgrade in Pinterest
HBaseCon2017 Removable singularity: a story of HBase upgrade in Pinterest
 
hbaseconasia2017: Removable singularity: a story of HBase upgrade in Pinterest
hbaseconasia2017: Removable singularity: a story of HBase upgrade in Pinteresthbaseconasia2017: Removable singularity: a story of HBase upgrade in Pinterest
hbaseconasia2017: Removable singularity: a story of HBase upgrade in Pinterest
 
HBase tales from the trenches
HBase tales from the trenchesHBase tales from the trenches
HBase tales from the trenches
 
Meet Apache HBase - 2.0
Meet Apache HBase - 2.0Meet Apache HBase - 2.0
Meet Apache HBase - 2.0
 
Meet hbase 2.0
Meet hbase 2.0Meet hbase 2.0
Meet hbase 2.0
 
Meet HBase 2.0
Meet HBase 2.0Meet HBase 2.0
Meet HBase 2.0
 
Apache HBase 1.0 Release
Apache HBase 1.0 ReleaseApache HBase 1.0 Release
Apache HBase 1.0 Release
 
HBaseCon 2012 | HBase Coprocessors – Deploy Shared Functionality Directly on ...
HBaseCon 2012 | HBase Coprocessors – Deploy Shared Functionality Directly on ...HBaseCon 2012 | HBase Coprocessors – Deploy Shared Functionality Directly on ...
HBaseCon 2012 | HBase Coprocessors – Deploy Shared Functionality Directly on ...
 
H base introduction & development
H base introduction & developmentH base introduction & development
H base introduction & development
 
HBase System Tables / Metadata Info
HBase System Tables / Metadata InfoHBase System Tables / Metadata Info
HBase System Tables / Metadata Info
 
HBase state of the union
HBase   state of the unionHBase   state of the union
HBase state of the union
 
Apache HBase: State of the Union
Apache HBase: State of the UnionApache HBase: State of the Union
Apache HBase: State of the Union
 
HBaseCon 2015: Meet HBase 1.0
HBaseCon 2015: Meet HBase 1.0HBaseCon 2015: Meet HBase 1.0
HBaseCon 2015: Meet HBase 1.0
 
Meet HBase 1.0
Meet HBase 1.0Meet HBase 1.0
Meet HBase 1.0
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...
 
Debugging Hive with Hadoop-in-the-Cloud by David Chaiken of Altiscale
Debugging Hive with Hadoop-in-the-Cloud by David Chaiken of AltiscaleDebugging Hive with Hadoop-in-the-Cloud by David Chaiken of Altiscale
Debugging Hive with Hadoop-in-the-Cloud by David Chaiken of Altiscale
 
Rails on HBase
Rails on HBaseRails on HBase
Rails on HBase
 
Chef Patterns at Bloomberg Scale
Chef Patterns at Bloomberg ScaleChef Patterns at Bloomberg Scale
Chef Patterns at Bloomberg Scale
 
HBase Status Report - Hadoop Summit Europe 2014
HBase Status Report - Hadoop Summit Europe 2014HBase Status Report - Hadoop Summit Europe 2014
HBase Status Report - Hadoop Summit Europe 2014
 
ApacheCon-HBase-2016
ApacheCon-HBase-2016ApacheCon-HBase-2016
ApacheCon-HBase-2016
 

Más de Michael Stack

hbaseconasia2019 HBase Table Monitoring and Troubleshooting System on Cloud
hbaseconasia2019 HBase Table Monitoring and Troubleshooting System on Cloudhbaseconasia2019 HBase Table Monitoring and Troubleshooting System on Cloud
hbaseconasia2019 HBase Table Monitoring and Troubleshooting System on CloudMichael Stack
 
hbaseconasia2019 Recent work on HBase at Pinterest
hbaseconasia2019 Recent work on HBase at Pinteresthbaseconasia2019 Recent work on HBase at Pinterest
hbaseconasia2019 Recent work on HBase at PinterestMichael Stack
 
hbaseconasia2019 Phoenix Practice in China Life Insurance Co., Ltd
hbaseconasia2019 Phoenix Practice in China Life Insurance Co., Ltdhbaseconasia2019 Phoenix Practice in China Life Insurance Co., Ltd
hbaseconasia2019 Phoenix Practice in China Life Insurance Co., LtdMichael Stack
 
hbaseconasia2019 HBase at Didi
hbaseconasia2019 HBase at Didihbaseconasia2019 HBase at Didi
hbaseconasia2019 HBase at DidiMichael Stack
 
hbaseconasia2019 The Practice in trillion-level Video Storage and billion-lev...
hbaseconasia2019 The Practice in trillion-level Video Storage and billion-lev...hbaseconasia2019 The Practice in trillion-level Video Storage and billion-lev...
hbaseconasia2019 The Practice in trillion-level Video Storage and billion-lev...Michael Stack
 
hbaseconasia2019 HBase at Tencent
hbaseconasia2019 HBase at Tencenthbaseconasia2019 HBase at Tencent
hbaseconasia2019 HBase at TencentMichael Stack
 
hbaseconasia2019 Spatio temporal Data Management based on Ali-HBase Ganos and...
hbaseconasia2019 Spatio temporal Data Management based on Ali-HBase Ganos and...hbaseconasia2019 Spatio temporal Data Management based on Ali-HBase Ganos and...
hbaseconasia2019 Spatio temporal Data Management based on Ali-HBase Ganos and...Michael Stack
 
hbaseconasia2019 Bridging the Gap between Big Data System Software Stack and ...
hbaseconasia2019 Bridging the Gap between Big Data System Software Stack and ...hbaseconasia2019 Bridging the Gap between Big Data System Software Stack and ...
hbaseconasia2019 Bridging the Gap between Big Data System Software Stack and ...Michael Stack
 
hbaseconasia2019 Pharos as a Pluggable Secondary Index Component
hbaseconasia2019 Pharos as a Pluggable Secondary Index Componenthbaseconasia2019 Pharos as a Pluggable Secondary Index Component
hbaseconasia2019 Pharos as a Pluggable Secondary Index ComponentMichael Stack
 
hbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibaba
hbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibabahbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibaba
hbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at AlibabaMichael Stack
 
hbaseconasia2019 OpenTSDB at Xiaomi
hbaseconasia2019 OpenTSDB at Xiaomihbaseconasia2019 OpenTSDB at Xiaomi
hbaseconasia2019 OpenTSDB at XiaomiMichael Stack
 
hbaseconasia2019 BigData NoSQL System: ApsaraDB, HBase and Spark
hbaseconasia2019 BigData NoSQL System: ApsaraDB, HBase and Sparkhbaseconasia2019 BigData NoSQL System: ApsaraDB, HBase and Spark
hbaseconasia2019 BigData NoSQL System: ApsaraDB, HBase and SparkMichael Stack
 
hbaseconasia2019 Test-suite for Automating Data-consistency checks on HBase
hbaseconasia2019 Test-suite for Automating Data-consistency checks on HBasehbaseconasia2019 Test-suite for Automating Data-consistency checks on HBase
hbaseconasia2019 Test-suite for Automating Data-consistency checks on HBaseMichael Stack
 
hbaseconasia2019 Distributed Bitmap Index Solution
hbaseconasia2019 Distributed Bitmap Index Solutionhbaseconasia2019 Distributed Bitmap Index Solution
hbaseconasia2019 Distributed Bitmap Index SolutionMichael Stack
 
hbaseconasia2019 HBase Bucket Cache on Persistent Memory
hbaseconasia2019 HBase Bucket Cache on Persistent Memoryhbaseconasia2019 HBase Bucket Cache on Persistent Memory
hbaseconasia2019 HBase Bucket Cache on Persistent MemoryMichael Stack
 
hbaseconasia2019 The Procedure v2 Implementation of WAL Splitting and ACL
hbaseconasia2019 The Procedure v2 Implementation of WAL Splitting and ACLhbaseconasia2019 The Procedure v2 Implementation of WAL Splitting and ACL
hbaseconasia2019 The Procedure v2 Implementation of WAL Splitting and ACLMichael Stack
 
hbaseconasia2019 BDS: A data synchronization platform for HBase
hbaseconasia2019 BDS: A data synchronization platform for HBasehbaseconasia2019 BDS: A data synchronization platform for HBase
hbaseconasia2019 BDS: A data synchronization platform for HBaseMichael Stack
 
hbaseconasia2019 Further GC optimization for HBase 2.x: Reading HFileBlock in...
hbaseconasia2019 Further GC optimization for HBase 2.x: Reading HFileBlock in...hbaseconasia2019 Further GC optimization for HBase 2.x: Reading HFileBlock in...
hbaseconasia2019 Further GC optimization for HBase 2.x: Reading HFileBlock in...Michael Stack
 
HBaseConAsia2019 Keynote
HBaseConAsia2019 KeynoteHBaseConAsia2019 Keynote
HBaseConAsia2019 KeynoteMichael Stack
 
HBaseConAsia2018 Track3-1: Serving billions of queries in millisecond latencies
HBaseConAsia2018 Track3-1: Serving billions of queries in millisecond latenciesHBaseConAsia2018 Track3-1: Serving billions of queries in millisecond latencies
HBaseConAsia2018 Track3-1: Serving billions of queries in millisecond latenciesMichael Stack
 

Más de Michael Stack (20)

hbaseconasia2019 HBase Table Monitoring and Troubleshooting System on Cloud
hbaseconasia2019 HBase Table Monitoring and Troubleshooting System on Cloudhbaseconasia2019 HBase Table Monitoring and Troubleshooting System on Cloud
hbaseconasia2019 HBase Table Monitoring and Troubleshooting System on Cloud
 
hbaseconasia2019 Recent work on HBase at Pinterest
hbaseconasia2019 Recent work on HBase at Pinteresthbaseconasia2019 Recent work on HBase at Pinterest
hbaseconasia2019 Recent work on HBase at Pinterest
 
hbaseconasia2019 Phoenix Practice in China Life Insurance Co., Ltd
hbaseconasia2019 Phoenix Practice in China Life Insurance Co., Ltdhbaseconasia2019 Phoenix Practice in China Life Insurance Co., Ltd
hbaseconasia2019 Phoenix Practice in China Life Insurance Co., Ltd
 
hbaseconasia2019 HBase at Didi
hbaseconasia2019 HBase at Didihbaseconasia2019 HBase at Didi
hbaseconasia2019 HBase at Didi
 
hbaseconasia2019 The Practice in trillion-level Video Storage and billion-lev...
hbaseconasia2019 The Practice in trillion-level Video Storage and billion-lev...hbaseconasia2019 The Practice in trillion-level Video Storage and billion-lev...
hbaseconasia2019 The Practice in trillion-level Video Storage and billion-lev...
 
hbaseconasia2019 HBase at Tencent
hbaseconasia2019 HBase at Tencenthbaseconasia2019 HBase at Tencent
hbaseconasia2019 HBase at Tencent
 
hbaseconasia2019 Spatio temporal Data Management based on Ali-HBase Ganos and...
hbaseconasia2019 Spatio temporal Data Management based on Ali-HBase Ganos and...hbaseconasia2019 Spatio temporal Data Management based on Ali-HBase Ganos and...
hbaseconasia2019 Spatio temporal Data Management based on Ali-HBase Ganos and...
 
hbaseconasia2019 Bridging the Gap between Big Data System Software Stack and ...
hbaseconasia2019 Bridging the Gap between Big Data System Software Stack and ...hbaseconasia2019 Bridging the Gap between Big Data System Software Stack and ...
hbaseconasia2019 Bridging the Gap between Big Data System Software Stack and ...
 
hbaseconasia2019 Pharos as a Pluggable Secondary Index Component
hbaseconasia2019 Pharos as a Pluggable Secondary Index Componenthbaseconasia2019 Pharos as a Pluggable Secondary Index Component
hbaseconasia2019 Pharos as a Pluggable Secondary Index Component
 
hbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibaba
hbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibabahbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibaba
hbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibaba
 
hbaseconasia2019 OpenTSDB at Xiaomi
hbaseconasia2019 OpenTSDB at Xiaomihbaseconasia2019 OpenTSDB at Xiaomi
hbaseconasia2019 OpenTSDB at Xiaomi
 
hbaseconasia2019 BigData NoSQL System: ApsaraDB, HBase and Spark
hbaseconasia2019 BigData NoSQL System: ApsaraDB, HBase and Sparkhbaseconasia2019 BigData NoSQL System: ApsaraDB, HBase and Spark
hbaseconasia2019 BigData NoSQL System: ApsaraDB, HBase and Spark
 
hbaseconasia2019 Test-suite for Automating Data-consistency checks on HBase
hbaseconasia2019 Test-suite for Automating Data-consistency checks on HBasehbaseconasia2019 Test-suite for Automating Data-consistency checks on HBase
hbaseconasia2019 Test-suite for Automating Data-consistency checks on HBase
 
hbaseconasia2019 Distributed Bitmap Index Solution
hbaseconasia2019 Distributed Bitmap Index Solutionhbaseconasia2019 Distributed Bitmap Index Solution
hbaseconasia2019 Distributed Bitmap Index Solution
 
hbaseconasia2019 HBase Bucket Cache on Persistent Memory
hbaseconasia2019 HBase Bucket Cache on Persistent Memoryhbaseconasia2019 HBase Bucket Cache on Persistent Memory
hbaseconasia2019 HBase Bucket Cache on Persistent Memory
 
hbaseconasia2019 The Procedure v2 Implementation of WAL Splitting and ACL
hbaseconasia2019 The Procedure v2 Implementation of WAL Splitting and ACLhbaseconasia2019 The Procedure v2 Implementation of WAL Splitting and ACL
hbaseconasia2019 The Procedure v2 Implementation of WAL Splitting and ACL
 
hbaseconasia2019 BDS: A data synchronization platform for HBase
hbaseconasia2019 BDS: A data synchronization platform for HBasehbaseconasia2019 BDS: A data synchronization platform for HBase
hbaseconasia2019 BDS: A data synchronization platform for HBase
 
hbaseconasia2019 Further GC optimization for HBase 2.x: Reading HFileBlock in...
hbaseconasia2019 Further GC optimization for HBase 2.x: Reading HFileBlock in...hbaseconasia2019 Further GC optimization for HBase 2.x: Reading HFileBlock in...
hbaseconasia2019 Further GC optimization for HBase 2.x: Reading HFileBlock in...
 
HBaseConAsia2019 Keynote
HBaseConAsia2019 KeynoteHBaseConAsia2019 Keynote
HBaseConAsia2019 Keynote
 
HBaseConAsia2018 Track3-1: Serving billions of queries in millisecond latencies
HBaseConAsia2018 Track3-1: Serving billions of queries in millisecond latenciesHBaseConAsia2018 Track3-1: Serving billions of queries in millisecond latencies
HBaseConAsia2018 Track3-1: Serving billions of queries in millisecond latencies
 

HBCK2: Concepts, trends and recipes for fixing issues within HBase 2

  • 1.
  • 2. HBCK2: Concepts, trends and recipes for fixing issues within HBase 2 Wellington Chevreuil HBase Committer Cloudera HBase SW Engineer
  • 3. HBCK (1) - Little bit of history • Main tool for general inconsistencies in hbase-1.x • The Swiss Knife for operators • Packaged together with hbase main project • Provides both diagnosing and fixing commands • Some reports may be misleading, e.g., "holes in the region chain" • Some options can cause damages if not well understood, e.g., "-sidelineBigOverlaps", "-removeParents" • Commands often work independent of Master • Can introduce conflicts on meta information maintained by Master • Lack of implementation details on documentation/help guide
  • 5. HBCK2 in a nutshell • Simpler tool • Less fix commands • No diagnosis command • Requires deeper HBase internal workings from operators • Shipped independently from hbase • Packaged with hbase-operators-tool project • https://github.com/apache/hbase-operator-tools • Can evolve on its own pace • New versions can be run without needing whole hbase upgrade • Master oriented (more later) • More detailed documentation about each command • Still a WIP • By the time of this presentation, there's still no official release for HBCK2
  • 6. HBCK2 Concepts • AMv2 compliant • HBCK1 does not work with HBase 2 AssignmentManager re-implementation • Thinner, but more interactive commands • No such thing as hbck1 -fix command • Operators required to fix an issue at a time • Master oriented • Master must be online • Commands implementation should use Master HbckService as much as possible • However, new commands may initially require a client side implementation, then get ported to Master's HbckService facade • Fix only, requires other tools for issue diagnosing • Available only for 2.0.3 onwards, and 2.1.1 onwards
  • 8. HBCK2 Usage trends • Master not completing initialisation • Meta/Namespace table "NOT online" issues • Table RIT issues • Procedures stuck • Table in wrong state • Missing regions in META • User induced via incompatible OfflineMetaRepair tool
  • 9. HBCK2 for Operators: How do I get and run it? • Not released so far, requires local build • Requirements • JDK 1.8 or higher • Git • Maven • Checkout related apache github repository: • $ git clone https://github.com/apache/hbase-operator-tools.git • Build HBCK2 upon desired hbase version: • $ mvn -Dhbase.version=2.1.5 clean install • Above command will produce HBCK2 jar file under ./hbase-hbck2/target/, named hbase-hbck2-1.0.0-SNAPSNOT.jar (assuming current version is 1.0.0-SNAPSHOT) • Upload generated jar to the given hbase cluster and run it as below: • $ hbase hbck -j ../hbase-hbck2-1.0.0-SNAPSHOT.jar
  • 10. HBCK2 for Operators: Recipes • Meta/Namespace table regions "NOT online" • Due to corruption or manual deletion of /hbase/MasterProcWALs files • Meta may miss info about RS assignment • Master logs show regions assigned to an old RS start code • Run HBCK2 assigns command for META region 1588230740: • $ hbase hbck -j ../hbase-hbck2-1.0.0-SNAPSHOT.jar assigns 1588230740 • Similar issue may affect namespace and user tables regions • Affected regions names would be mentioned on log messages similar to above WARN org.apache.hadoop.hbase.master.HMaster: hbase:meta,,1.1588230740 is NOT online; state={1588230740 state=OPENING, ts=1550754721289, server=regionserver01.example.com,16020,1550676598448}; ServerCrashProcedures=true. Master startup cannot progress, in holding-pattern until region onlined.
  • 11. HBCK2 for Operators: Recipes • Table RIT issues • Usually, due several RSes crashes/slowness while regions are transitioning • Run HBCK2 assigns command for the given region encoded name 11bf6b18ddacdd864728e6cf1199b2a7: • $ hbase hbck -j ../hbase-hbck2-1.0.0-SNAPSHOT.jar assigns 11bf6b18ddacdd864728e6cf1199b2a7 WARN org.apache.hadoop.hbase.master.assignment.AssignmentManager: STUCK Region-In-Transition rit=OPENING, location=regionserver01.example.com,16020,1542314816394, table=hbase:acl, region=11bf6b18ddacdd864728e6cf1199b2a7 ... WARN org.apache.hadoop.hbase.ipc.RpcServer: Dropping timed out call: callId: 702 service: ClientService methodName: Mutate size: 272 connection: 1.1.1.1:56492 deadline: 1542316740911 param: region= hbase:meta,,1, row=hbase:acl,,1404406671604.11bf6b18ddacdd864728e6cf1199b2a7. connection: 1.1.1.1:56492
  • 12. HBCK2 for Operators: Recipes • Procedures stuck • While troubleshooting causes for RITs, check for procedures attempting to transition regions states: • $ echo "list_procedures" | hbase shell • Output for list_procedures shows WAITING_TIMEOUT and/or procedures running for days • Other procedures fail to acquire lock owned by one of the stuck procedures: • Run HBCK2 bypass command to get rid of stuck procedures: • $ hbase hbck -j ../hbase-hbck2-1.0.0-SNAPSHOT.jar bypass 6 7hbase hbck -j ../hbase-hbck2-1.0.0-SNAPSHOT.jar bypass 6 7 PID Name State Submitted Last_Update Parameters 6 org.apache.hadoop.hbase.master.assignment.UnassignProcedure WAITING_TIMEOUT 2019-03-29 11:15:06 2019-04-08 06:33:35 ... 7 org.apache.hadoop.hbase.master.procedure.DeleteTableProcedure RUNNABLE 2019-03-29 11:24:39 2019-03-29 11:24:39 ... ERROR: org.apache.hadoop.hbase.procedure2.ProcedureAbortedException: f7910bfc9c9... owned by pid=6, CANNOT run 'this' (pid=347).
  • 13. HBCK2 for Operators: Recipes • Table in wrong state • Can happen after hanging enable/disable table procedures, or related sub-procedures • Bypassing procedures can lead to this as well • Table indefinitely in temporary states ENABLING/DISABLING • scan 'hbase:meta', {COLUMN => "table:state"} • enable 'usertable' • Run HBCK2 setTableState to manually bring table state to one of the final ones ENABLED/DISABLED: • $ hbase hbck -j ../hbase-hbck2-1.0.0-SNAPSHOT.jar setTableState usertable DISABLEDbase hbck -j ../hbase-hbck2-1.0.0-SNAPSHOT.jar bypass 6 7 usertable column=table:state, timestamp=1555406568751, value=x08x03. ERROR: Table tableName=usertable, state=ENABLING should be disabled!
  • 14. HBCK2 for Operators: Recipes • Missing regions in META • Operator induced when running incompatible tool OfflineMetaRepair (HBASE-21665) • Typically manifests as holes on the region chain, or in the case of namespace region missing, master fails initialisation • scan 'hbase:meta', {COLUMN => "table:state", ROWPREFIXFILTER => 'hbase:namespace'} • Still under development through HBASE-22567, HBCK2 addMissingRegionsInMeta can be used to re-add missing regions: • $ hbase hbck -j ../hbase-hbck2-1.0.0-SNAPSHOT.jar addMissingRegionInMeta hbase:namespace • Still WIP, so syntax might change. • Check HBASE-22567 for latest developments e-hbck2-1.0.0-SNAPSHOT.jar bypass 6 7 ROW COLUMN+CELL 0 row(s)
  • 15. HBCK2 for Contributors • Apache github repository: https://github.com/apache/hbase-operator-tools • HBCK2 defined as sub-module hbase-hbck2 of hbase-operator-tools • HBASE-21745 • Umbrella jira for tracking potential new HBCK2 features • Faced a new issue in HBase 2? Have a new idea for HBCK2 command? • Great! Contributions are welcome! • Start a [DISCUSS] mail thread on dev@hbase.apache.org • Post a comment on HBASE-21745 describing your idea e-hbck2-1.0.0-SNAPSHOT.jar bypass 6 7