Improving HBase Availability and Repair


           Jeff Bean, Jonathan Hsieh
         {jwfbean,jon}@cloudera.com
                    6/13/12
Who Are We?

• Jeff Bean
   • Designated Support Engineer, Cloudera
   • Education Program Lead, Cloudera



• Jonathan Hsieh
   • Software Engineer, Cloudera
   • Apache HBase Committer and PMC member




                    Hadoop Summit 2012. 6/13/12 Copyright 2012   2
                          Cloudera Inc, All Rights Reserved
What is Apache HBase?

Apache HBase is a reliable, column-oriented data store that provides
consistent, low-latency, random read/write access.

Fault Tolerance vs Highly Available

• Fault tolerant:
   • Ability to recover service if a component fails, without losing data.

• Highly available:
   • Ability to quickly recover service if a component fails, without losing data.

• Goal: Minimize downtime!

[Figure labels: Fault Tolerant, Highly Available]
HBase Architecture
• HBase is designed to be fault tolerant and highly available
   • It depends on other systems to be as well.
• Replication for fault tolerance
   • Serve regions from any region server
   • Failover HMasters
   • ZK quorums
   • HDFS block replication on DataNodes
• But replication doesn't guarantee high availability
   • There can still be software or human faults

[Figure labels: App, MR, ZK, HDFS]
Causes of HBase Downtime

• Unplanned Maintenance
  • Hardware failures
  • Software errors
  • Human error
• Planned Maintenance
  • Upgrades
  • Migrations

[Pie chart: HBase Downtime Distribution, split into Planned and Unplanned]
Causes of Unexpected Maintenance Incidents

• Misconfiguration
• Metadata Corruptions
• Network / HW problems
• SW problems
• Long recovery time
    • Automated and manual

Unplanned Maintenance: Root Cause from Cloudera Support
    HBase, ZK, MR, HDFS Misconfig    44%
    Repair Needed                    28%
    Fix HW/NW                        16%
    Patch Required                   12%
Source: Cloudera's production HBase Support Tickets
        (CDH3's HBase 0.90.x, Hadoop 0.20.x/1.0.x)
Outline
• Where we were
  • HBase 0.90.x + Hadoop 0.20.x/1.0.x
  • Case Studies


• Where we are today
  • HBase 0.92.x/0.94.x + Hadoop 2.0.x
  • Feature Summary

• Where we are going
  • HBase 0.96.x + Hadoop 2.x
  • Feature Preview
[T]here are known knowns; there are things we know we know.
      We also know there are known unknowns; that is to say we know
      there are some things we do not know.
      But there are also unknown unknowns – there are things we do not
      know we don't know.
                    —United States Secretary of Defense Donald Rumsfeld




WHERE WE WERE:
CASE STUDIES

Best Practices to avoid hazards

CAN PREVENT HBASE MISCONFIGURATIONS

[Pie chart: Unplanned Maintenance: Root Cause from Cloudera Support.
 HBase, ZK, MR, HDFS Misconfig 44%; Repair Needed 28%; Fix HW/NW 16%; Patch Required 12%.
 Source: Cloudera's production HBase Support Tickets (CDH3's HBase 0.90.x, Hadoop 0.20.x/1.0.x)]
Case #1: Memory Over-subscription Hazard


Misconfig (node A under load)
• Too many MR slots
• MR slots too large

Node A swaps
• "Arbitrary" processes pause or become unresponsive

Node B can't connect to node A
• MapReduce tasks fail
• HDFS DataNode operations time out
• HBase client operations fail

Masters take action (bad outcome)
• JobTracker blacklists TT on node B
• Jobs fail or run slow
• NameNode re-replicates blocks from node A
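
The over-subscription hazard comes down to arithmetic: if every task slot and every co-located daemon can grow to its full heap and the total exceeds physical RAM, the node can start to swap. The sketch below illustrates that check. The function name, its parameters, and every figure in the example are illustrative assumptions, not values from the case study.

```python
def memory_headroom_mb(physical_mb, os_reserve_mb, daemon_heaps_mb,
                       map_slots, reduce_slots, task_heap_mb):
    """Return the memory (MB) left over when every slot and daemon is at full heap.

    A negative result means the node is over-subscribed and may swap.
    Heap sizes understate real JVM footprints, so this estimate is optimistic.
    """
    slots_mb = (map_slots + reduce_slots) * task_heap_mb
    committed_mb = os_reserve_mb + sum(daemon_heaps_mb.values()) + slots_mb
    return physical_mb - committed_mb


# Hypothetical 24 GB worker running a DataNode, TaskTracker, and RegionServer.
daemons = {"datanode": 1024, "tasktracker": 1024, "regionserver": 8192}

too_many_slots = memory_headroom_mb(24576, 2048, daemons, 12, 6, 1024)
conservative = memory_headroom_mb(24576, 2048, daemons, 6, 3, 1024)
print(too_many_slots, conservative)
```

With these made-up numbers the first configuration is over-committed and the second leaves some headroom, which is the difference between "Node A swaps" and a node that stays responsive.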
Case #2, #3: Hazards of Abusing HDFS and ZK

Case #2: Millions of HDFS files
• 500,000 blocks per datanode [bad practice]
• Heartbeat thread blocks IO [SW bug]
• RS cannot access HDFS
• HBase goes down [bad outcome]

Case #3: Millions of ZK nodes
• Millions of ZK znodes, 400MB snapshot [misconfiguration]
• ZK fails to create new snapshots, fails
• HBase goes down [bad outcome]
• HBase fails to restart [SW bug, worse outcome]
Case #4: Splitting Corruption from HW failure

Sequence of events:
• Region attempts to split
• Network failure (takes out NN) [HW failure]
• Split recovery incomplete [SW bug]
• HBase has region inconsistencies (overlaps / holes)
• Multiple 6-hour manual repair sessions [manual, slow, and requires expert]
Case #5: Slow recovery from HW failure

Sequence of events:
• Network HW failure
• RS loses HDFS, WALs
• On restart, Root and .META. assign fails
• Manual repairs
• 9 hour hlog splitting recovery [correct but slow!]

[Other diagram annotations: Human error, SW error]
Initial Lessons

• Use Best practices to avoid problems
   • Conservative first
   • Avoid unstable features


• What can we do?
   •   Fix the bugs
   •   Recover from problems faster
   •   Make people smarter to avoid hazards and misconfigurations
   •   Make software smarter to prevent hazards and
       misconfigurations

In war, then, let your great object be
                       victory, not lengthy campaigns.
                                                                -- Sun Tzu




WHERE WE ARE TODAY
HBASE 0.92.X + HADOOP 2.0.X

Goal: Reduce unexpected downtime by
recovering faster

• Removing the SPOFs
  • HA HDFS


• Faster Recovery
  • Improved hbck
  • Distributed Log splitting




Problem: HDFS NN goes down under HBase

• HBase depends on HDFS.
   • If HDFS is down, HBase goes down.
• Ramifications.
   • Forces recovery mechanism
   • Caused some data corruptions

• Ideally we avoid having to do recovery at all.

[Figure labels: App, MR, ZK, HDFS]
HBase-HDFS HA Nodes

[Diagram: NameNode (active), the metadata server, and NameNode (standby),
 providing active-standby hot failover; HMaster (region metadata) and
 HMaster (hot standby); ZooKeeper Quorum; HDFS DataNodes; HBase RegionServers]
HBase-HDFS HA Nodes: Transparent to HBase

[Diagram: a single NameNode (active); HMaster (region metadata) and
 HMaster (hot standby); ZooKeeper Quorum; HDFS DataNodes; HBase RegionServers]
HBase-HDFS HA Nodes: No more SPOF



[Diagram: NameNode (active); HMaster (active); ZooKeeper Quorum;
 HDFS DataNodes; HBase RegionServers]
Recovery operations

• If a network switch fails or if there is a power outage,
   • HBase, ZK, and HA HDFS will fail
   • Will always still rely on recovery mechanisms.



• Need to be able to quickly recover
   • Metadata Invariants to fix metadata corruptions
   • Data Consistency to restore ACID guarantees




HBase Metadata Corruptions

• Internal HBase metadata corruptions
  • Prevent HBase from starting
  • Cause some regions to be unavailable.

• Repairs are intricate and can cause extended periods of downtime.

[Pie chart: Unplanned Maintenance: Root Cause from Cloudera Support.
 Repair Needed 28%; HBase, ZK, MR, HDFS Misconfig 44%; Fix HW/NW 16%; Patch Required 12%]
HBase Metadata Invariants

Table Integrity
• Every key shall get assigned to a single region.
  Example ranges: [' ', A), [A, B), [B, C), [C, D), [D, E), [E, F), [F, G), [G, ' ')

Region Consistency
• Metadata about regions should agree in HDFS, META, and region server assignment.
  [Diagram: the "Good" state, where regioninfo in META, .regioninfo in HDFS,
   and the region assigned to an RS all agree]
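
The table integrity invariant can be illustrated with a small check over region key ranges: sorted by start key, each region should begin exactly where the previous one ended, with no holes and no overlaps. This is a minimal sketch only. The function name and the tuple representation are assumptions made for illustration, keys are plain strings rather than HBase byte arrays, and it is not how hbck is implemented.

```python
def check_table_integrity(regions):
    """Report holes and overlaps in a table's [start, end) region key ranges.

    An empty start key means "from the first key" and an empty end key means
    "through the last key", mirroring the [' ', A) ... [G, ' ') ranges.
    Holes are reported as the missing range, overlaps as the offending region.
    """
    problems = []
    covered_to = ""      # every key below this is covered by some region
    open_ended = False   # True once a region reaches the end of the key space
    for start, end in sorted(regions):
        if open_ended or start < covered_to:
            problems.append(("overlap", start, end))
        elif start > covered_to:
            problems.append(("hole", covered_to, start))
        if end == "":
            open_ended = True
        elif end > covered_to:
            covered_to = end
    if not open_ended:
        problems.append(("hole", covered_to, ""))
    return problems


full = [("", "A"), ("A", "B"), ("B", "C"), ("C", "D"),
        ("D", "E"), ("E", "F"), ("F", "G"), ("G", "")]
with_hole = [r for r in full if r != ("C", "D")]
with_overlap = full + [("B", "D")]

print(check_table_integrity(full))
print(check_table_integrity(with_hole))
print(check_table_integrity(with_overlap))
```

A region consistency check would follow the same spirit, comparing what META, HDFS, and the region servers each report for every region.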
Detecting and Repairing corruption with hbck
• HBase 0.90 hbck
  • Checks an HBase instance's internal invariants.
• HBase hbck today
  • Checks and can fix problems in an HBase instance's internal invariants
  • 0.90.7, 0.92.2, 0.94.0
  • CDH3u4, CDH4
Case #4 redux: Splitting Corruption

Sequence of events (before):
• Region attempts to split
• Network failure (takes out NN) [HW failure]
• Split recovery incomplete [SW bug]
• HBase has region inconsistencies (overlaps / holes)
• Multiple 6-hour manual repair sessions [manual, slow, and requires expert]
Case #4 redux: Splitting Corruption

Sequence of events (with the repair tool):
• Region attempts to split
• Network failure (takes out NN) [HW failure]
• Split recovery incomplete [SW bug]
• HBase has region inconsistencies (overlaps / holes)
• Automated repair tool (minutes) [fixes are quicker, operator can use]
Case #4 redux: Splitting Corruption

Sequence of events (with the SW bug fixed):
• Region attempts to split
• Network failure (takes out NN) [HW failure]
• Split recovery incomplete [fixed SW bug]
• Minor HBase inconsistencies (bad assignments)
• Automated repair tool (seconds)
Data Consistency

• When a region server goes down, it tries to flush data in
  memory to HDFS.
• If it cannot write to HDFS, it relies on the WAL/HLog.

• Recovery via the HLog is vital to prevent data loss
   • Understand the write path.
   • Recovery: HLog splitting.
   • Faster Recovery: Distributed HLog splitting.



Write Path (Put / Delete / Increment)

[Diagram: an HBase client talks to a Region Server. Inside the Region Server
 are a Server component, one HLog, and two HRegions, each with a MemStore and
 several HStores. A Put appears in the HLog and in the first HRegion's MemStore.]
Write Path (Put / Delete / Increment)
Note: both regions write to the same HLog.

[Diagram: the HBase client sends a Put to the Region Server. The HLog now holds
 two Puts, and each HRegion's MemStore holds one Put. Each HRegion has a
 MemStore and several HStores.]
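
The write path above can be sketched as a toy model: one append-only log shared by every region on the server, plus one in-memory store per region. The class and attribute names are hypothetical, and the log-then-memory ordering shown is the conventional write-ahead pattern assumed for illustration. Real HBase adds syncing, sequence ids, and flushes to HStores that are left out here.

```python
class RegionServerSketch:
    """Toy model: one shared HLog per server, one MemStore per region."""

    def __init__(self, region_names):
        self.hlog = []  # shared, append-only list of (region, row, value)
        self.memstores = {name: {} for name in region_names}

    def put(self, region, row, value):
        # 1. Append to the shared log first so the edit can be replayed after a crash.
        self.hlog.append((region, row, value))
        # 2. Then apply the edit to the in-memory store of the target region.
        self.memstores[region][row] = value


rs = RegionServerSketch(["region1", "region2"])
rs.put("region1", "row-a", "v1")
rs.put("region2", "row-x", "v2")
rs.put("region1", "row-b", "v3")

# Edits for both regions are interleaved in the single log.
print([region for region, _, _ in rs.hlog])
```

Because edits from different regions are interleaved in one log, recovering a single region means first separating its edits from everyone else's, which is why log splitting is needed.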
Log Splitting
[Diagram: the HMaster above three RegionServers (and more, …). Each
 RegionServer has its own HLog (HLog1, HLog2, HLog3) and two HRegions, each
 with an in-memory store (mem).]
Log Splitting
[Diagram: same layout as the previous slide: the HMaster above three
 RegionServers with HLog1, HLog2, HLog3, their HRegions, and in-memory stores.]
Log Splitting
[Diagram: the RegionServer boxes and in-memory stores are no longer shown.
 HLog1, HLog2, HLog3, … and the HRegions remain under the HMaster.]
Log Splitting
[Diagram: HMaster: "Splitting log 1". HLog1, HLog2, HLog3, … and the HRegions
 are shown without RegionServers.]
Log Splitting
[Diagram: HMaster: "Splitting log 2". HLog1, HLog2, HLog3, … and the HRegions
 are shown without RegionServers.]
Log Splitting
[Diagram: HMaster: "Splitting log 3". HLog1, HLog2, HLog3, … and the HRegions
 are shown without RegionServers.]
Log Splitting
[Diagram: HMaster: "Splitting log 100". The HLogs and the HRegions are shown
 without RegionServers.]
Log Splitting
[Diagram: HMaster: "Whew. I did a lot of splitting work. That took 9 hours!"
 The HLogs and the HRegions are shown without RegionServers.]
Log Splitting
[Diagram: HMaster: "RegionServers, here are your region assignments."
 RegionServer4, RegionServer5, RegionServer6, … appear above the HRegions.]
Log Splitting
[Diagram: HMaster: "Victory!" RegionServer4, RegionServer5, RegionServer6, …
 now host the HRegions, each with an in-memory store (mem) again.]
Can we recover more quickly?

• In the case study, this was all done serially by the master
   • The master took 9 hours to recover.
   • The 100 region server nodes were idle.


• Let’s use the idle machines to do splitting in parallel!

• Distributed log splitting (HBASE-1364)
   • Introduced in 0.92.0 by Prakash Khemani (Facebook)
   • Included in CDH4 (0.92.1)
   • Backported to CDH3u3 (off by default)
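
A back-of-the-envelope check of the idea above, as a minimal sketch. It assumes the splitting work divides perfectly evenly across the idle region servers and that coordination costs nothing; both are simplifying assumptions, not claims from the case study. The function name is hypothetical.

```python
# Idealized estimate: serial split time divided evenly across workers.
# Assumes perfectly even work distribution and zero coordination overhead.

def ideal_parallel_minutes(serial_hours: float, workers: int) -> float:
    """Return the idealized wall-clock minutes when work is spread evenly across workers."""
    if workers < 1:
        raise ValueError("workers must be >= 1")
    return serial_hours * 60.0 / workers

serial_hours = 9      # master-only log splitting time from the case study
region_servers = 100  # idle region server nodes from the case study

estimate = ideal_parallel_minutes(serial_hours, region_servers)
print(f"{estimate:.1f} minutes")
```

Under these assumptions the estimate lines up with the 5.4 minute recovery time reported later in the deck.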
Distributed Log Splitting

(Diagram sequence, one callout from the HMaster per frame.)
• "I’m the boss." Three RegionServers hold HLog1, HLog2 and HLog3; each HRegion is shown with a mem box.
• "There is a lot of splitting work here, let’s split it up." The RegionServers are no longer shown; HLog1, HLog2 and HLog3 remain.
• "You guys do the work for me." RegionServer4, RegionServer5 and RegionServer6 are shown above HLog1, HLog2 and HLog3.
• "Great, that took 5.4 minutes." The HLogs are no longer shown.
• "Good job, here are your region assignments."
• "Like a Boss." Each HRegion is again shown with a mem box.
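
The division of labor in the Distributed Log Splitting frames can be sketched as a toy in-process model. This is not HBase's implementation: a thread pool stands in for the region servers, and the function names, region names and edits are all hypothetical. It only illustrates the shape of the work, where each log interleaves edits for several regions and splitting regroups them per region.

```python
from concurrent.futures import ThreadPoolExecutor

def split_log(log_name, edits):
    """Worker role: group one log's edits by the region they belong to."""
    by_region = {}
    for region, edit in edits:
        by_region.setdefault(region, []).append(edit)
    return log_name, by_region

def distributed_split(logs, workers):
    """Master role: hand each log to the worker pool, then merge per-region results."""
    recovered = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in input order, so the merge is deterministic.
        for _, by_region in pool.map(lambda item: split_log(*item), logs.items()):
            for region, edits in by_region.items():
                recovered.setdefault(region, []).extend(edits)
    return recovered

# Hypothetical logs: each HLog interleaves edits for several regions.
logs = {
    "HLog1": [("region-a", "put k1"), ("region-b", "put k2")],
    "HLog2": [("region-a", "put k3"), ("region-c", "put k4")],
    "HLog3": [("region-b", "put k5"), ("region-c", "put k6")],
}
recovered = distributed_split(logs, workers=3)
```

Each log is an independent unit of work, which is why the master can delegate them instead of processing them serially.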
Case #5 redux: Network failure and slow recovery

Correct but slow!
• Network HW failure -> RS loses HDFS, WALs -> On restart, Root and .META. assign fails -> Manual repair -> 9 hour hlog splitting recovery
• Diagram label: Human error

Correct and faster!
• Network HW failure -> RS loses HDFS, WALs -> On restart, Root and .META. assign fails -> Automatic repairs -> 5.4 minute hlog splitting recovery
• Diagram labels: Human error, Fixed!
WHERE WE ARE GOING
HBASE 0.96 + HADOOP 2.X

Themes

• Minimizing planned downtime
  • Changing configurations
  • Online Schema Change (experimental in 0.92, 0.94)
  • Rolling Restarts
  • Wire compatibility

(Pie chart: HBase Downtime Distribution, with Planned and Unplanned slices.)




Table unavailable when changing schema

• Changing table schema requires disabling the table
   • Disable table, alter table schema, enable table
   • Schema includes compression, column families (cf’s), caching, TTL, versions.


• Goal: Quickly change table and column configuration
  settings without having to disable HBase tables.
   • Feature: Online Schema Change (HBASE-1730)
   • Included in, but considered experimental in, HBase 0.92/0.94.
   • Contributed by Facebook


Changing Server Configs and Software updates

• Rolling restart is an operation for upgrading an HBase
  cluster to a compatible version while keeping HBase
  available and serving data.
   • Handle server config changes.
   • Handle code changes like hotfixes or compatible upgrades
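
The rolling restart idea above can be sketched as a loop that takes down one server at a time and refuses to continue if that server does not come back. This is a minimal in-memory illustration, not HBase tooling: the `rolling_restart`, `restart` and `is_healthy` names are hypothetical, and the server order is an assumption taken from the diagram frames in this deck.

```python
def rolling_restart(servers, restart, is_healthy):
    """Restart servers one at a time; abort if a restarted server is not healthy."""
    restarted = []
    for server in servers:
        restart(server)
        if not is_healthy(server):
            raise RuntimeError(f"{server} unhealthy after restart; aborting roll")
        restarted.append(server)
    return restarted

# Hypothetical in-memory cluster used only for illustration.
cluster = {name: {"up": True, "config": "old"}
           for name in ["RS1", "RS2", "RS3", "RS4", "HM1", "HM2"]}
observed_down = []  # number of servers down at the moment of each restart

def restart(server):
    cluster[server]["up"] = False
    observed_down.append(sum(not s["up"] for s in cluster.values()))
    cluster[server]["config"] = "new"  # pick up the new config or code
    cluster[server]["up"] = True

def is_healthy(server):
    return cluster[server]["up"]

order = rolling_restart(list(cluster), restart, is_healthy)
```

Because only one server is ever down, the remaining servers can keep serving data while the change rolls through the cluster.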




Rolling Restart

(Diagram sequence. Components: Client, Shell, ZK, HM1, HM2, RS1, RS2, RS3, RS4. Labels: User operations, Admin operations, Internal operations.)
• Frames show one server at a time missing and then restored, in this order: RS1, RS2, RS3, RS4, HM1, HM2.
• All other servers remain present in every frame.
Rolling restart limitations

• There are limitations on rolling restarts
   • All servers and clients must be wire compatible
   • All must be able to read old data in FS and ZK.

• Ramifications:
   • Only minor version upgrades possible
   • New features that change RPCs require custom compatibility shims.
   • Data format changes not possible across minor versions.

(Pie chart: Unplanned Maintenance: Root Cause from Cloudera Support)
   • HBase, ZK, MR, HDFS Misconfig: 44%
   • Repair Needed: 28%
   • Fix HW/NW: 16%
   • Patch Required: 12%
   • Source: Cloudera’s production HBase Support Tickets, CDH3’s HBase 0.90.x, Hadoop 0.20.x/1.0.x
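
The "only minor version upgrades" limitation can be expressed as a small version check. This is a sketch under assumptions: it follows the deck's own usage, where a minor version step changes only the last component (0.96.0 -> 0.96.1) and a major version step changes the second (0.96.x -> 0.98.x). The function names are hypothetical and the version strings are illustrative.

```python
def parse_version(text):
    """Split a three-part version string such as '0.90.4' into integers."""
    first, second, third = (int(part) for part in text.split("."))
    return first, second, third

def rolling_upgrade_allowed(current, target):
    """True only for a forward step within the same release line (same first two parts)."""
    cur = parse_version(current)
    new = parse_version(target)
    same_line = cur[:2] == new[:2]
    return same_line and new[2] > cur[2]

print(rolling_upgrade_allowed("0.90.4", "0.90.6"))
print(rolling_upgrade_allowed("0.90.6", "0.92.1"))
```

A check like this only encodes the version rule; it says nothing about whether a particular pair of releases is actually wire compatible.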
HBase Compatibility and Extensibility

• Coming in HBase 0.96
  • HBASE-5305 and friends


• Goals:
  • Allow API changes and persistent data structure changes
    while guaranteeing compatibility between different minor
    versions (0.96.0 -> 0.96.1)
  • HBase client-server compatibility between major versions
    (0.96.x -> 0.98.x)



HDFS Wire Compatibility

• Here in HDFS 2.0.x
   • HADOOP-7347 and friends

• Goals:
   • Allow API changes while guaranteeing wire compatibility
     between different minor versions
   • HDFS client-server compatibility between major versions.

(Diagram: App, MR, ZK, HDFS.)



CONCLUSIONS


Improving how we handle causes of downtime

(Two pie charts, annotated with callouts.)
• HBase Downtime Distribution: Planned, Unplanned
• Unplanned Maintenance: Root Cause from Cloudera Support
   • HBase, ZK, MR, HDFS Misconfig: 44%
   • Repair Needed: 28%
   • Fix HW/NW: 16%
   • Patch Required: 12%
• Callouts on the charts: Wire compat, hbck, Best practices, hbck and distributed log splitting, Wire compat
QUESTIONS?

jon@cloudera.com
Twitter: @jmhsieh
We’re hiring!

Más contenido relacionado

La actualidad más candente

Nevmug Martins Point Health Care J Anuary 2009
Nevmug   Martins Point Health Care   J Anuary 2009Nevmug   Martins Point Health Care   J Anuary 2009
Nevmug Martins Point Health Care J Anuary 2009csharney
 
HDFS - What's New and Future
HDFS - What's New and FutureHDFS - What's New and Future
HDFS - What's New and FutureDataWorks Summit
 
Monitoring virtual environments
Monitoring virtual environments Monitoring virtual environments
Monitoring virtual environments Stefan Bergstein
 
[Hi c2011]building mission critical messaging system(guoqiang jerry)
[Hi c2011]building mission critical messaging system(guoqiang jerry)[Hi c2011]building mission critical messaging system(guoqiang jerry)
[Hi c2011]building mission critical messaging system(guoqiang jerry)baggioss
 
Hadoop World 2011: Practical HBase - Ravi Veeramchaneni, Informatica
Hadoop World 2011: Practical HBase - Ravi Veeramchaneni, InformaticaHadoop World 2011: Practical HBase - Ravi Veeramchaneni, Informatica
Hadoop World 2011: Practical HBase - Ravi Veeramchaneni, InformaticaCloudera, Inc.
 
Datavail Health Check
Datavail Health CheckDatavail Health Check
Datavail Health CheckDatavail
 
VMware & Riverbed
VMware & RiverbedVMware & Riverbed
VMware & Riverbedvmug
 
Top 6 Reasons to Use a Distributed Data Grid
Top 6 Reasons to Use a Distributed Data GridTop 6 Reasons to Use a Distributed Data Grid
Top 6 Reasons to Use a Distributed Data GridScaleOut Software
 
STN Event 12.8.09 - Chris Vain Powerpoint Presentation
STN Event 12.8.09 - Chris Vain Powerpoint PresentationSTN Event 12.8.09 - Chris Vain Powerpoint Presentation
STN Event 12.8.09 - Chris Vain Powerpoint Presentationmcini
 
Strata + Hadoop World 2012: Data Science on Hadoop: How Cloudera Impala Unloc...
Strata + Hadoop World 2012: Data Science on Hadoop: How Cloudera Impala Unloc...Strata + Hadoop World 2012: Data Science on Hadoop: How Cloudera Impala Unloc...
Strata + Hadoop World 2012: Data Science on Hadoop: How Cloudera Impala Unloc...Cloudera, Inc.
 
Finding Virtual Coins in the Couch
Finding Virtual Coins in the CouchFinding Virtual Coins in the Couch
Finding Virtual Coins in the CouchNovell
 
Real-Time Loading to Sybase IQ
Real-Time Loading to Sybase IQReal-Time Loading to Sybase IQ
Real-Time Loading to Sybase IQSybase Türkiye
 
The 5 Keys to Virtual Backup Excellence
The 5 Keys to Virtual Backup ExcellenceThe 5 Keys to Virtual Backup Excellence
The 5 Keys to Virtual Backup ExcellenceBill Hobbib
 

Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 

Hadoop Summit 2012 | Improving HBase Availability and Repair

  • 1. Improving HBase Availability and Repair Jeff Bean, Jonathan Hsieh {jwfbean,jon}@cloudera.com 6/13/12
  • 2. Who Are We? • Jeff Bean • Designated Support Engineer, Cloudera • Education Program Lead, Cloudera • Jonathan Hsieh • Software Engineer, Cloudera • Apache HBase Committer and PMC member Hadoop Summit 2012. 6/13/12 Copyright 2012 2 Cloudera Inc, All Rights Reserved
  • 3. What is Apache HBase? Apache HBase is a reliable, column-oriented data store that provides consistent, low-latency, random read/write access.
  • 4. Fault Tolerance vs Highly Available • Fault tolerant: • Ability to recover service if a component fails, without losing data. • Highly Available: • Ability to quickly recover service if a component fails, without losing data. • Goal: Minimize downtime! (Diagram labels: Fault Tolerant, Highly Available)
  • 5. HBase Architecture • HBase is designed to be fault tolerant and highly available • It depends on other systems to be as well. • Replication for fault tolerance • Serve regions from any Region server • Failover HMasters • ZK Quorums • HDFS Block replication on Data Nodes • But replication doesn’t guarantee high availability • There can still be software or human faults (Diagram: App, MR, ZK, HDFS)
  • 6. Causes of HBase Downtime • Unplanned Maintenance • Hardware failures • Software errors • Human error • Planned Maintenance • Upgrades • Migrations (Chart: HBase Downtime Distribution, Planned vs. Unplanned)
  • 7. Causes of Unexpected Maintenance Incidents • Misconfiguration • Metadata Corruptions • Network / HW problems • SW problems • Long recovery time • Automated and manual (Chart: Unplanned Maintenance: Root Cause from Cloudera Support. HBase, ZK, MR, HDFS Misconfig 44%; Repair Needed 28%; Fix HW/NW 16%; Patch Required 12%. Source: Cloudera’s production HBase Support Tickets, CDH3’s HBase 0.90.x, Hadoop 0.20.x/1.0.x)
  • 8. Outline • Where we were • HBase 0.90.x + Hadoop 0.20.x/1.0.x • Case Studies • Where we are today • HBase 0.92.x/0.94.x + Hadoop 2.0.x • Feature Summary • Where we are going • HBase 0.96.x + Hadoop 2.x • Feature Preview
  • 9. [T]here are known knowns; there are things we know we know. We also know there are known unknowns; that is to say we know there are some things we do not know. But there are also unknown unknowns – there are things we do not know we don't know. -- United States Secretary of Defense Donald Rumsfeld WHERE WE WERE: CASE STUDIES
  • 10. Best Practices to avoid hazards CAN PREVENT HBASE MISCONFIGURATIONS (Chart: Unplanned Maintenance: Root Cause from Cloudera Support. HBase, ZK, MR, HDFS Misconfig 44%; Repair Needed 28%; Fix HW/NW 16%; Patch Required 12%. Source: Cloudera’s production HBase Support Tickets, CDH3’s HBase 0.90.x, Hadoop 0.20.x/1.0.x)
  • 11. Case #1: Memory Over-subscription Hazard • Misconfig • Too many MR Slots • MR Slots too large • “Arbitrary” processes • Node A Under Load -> Node A swaps, pause or unresponsive -> Node B can’t connect to node A • Bad Outcome • MapReduce tasks fail • HDFS datanode operations time out • HBase client operations fail • Masters Take Action • JobTracker blacklists TT on node B • Jobs fail or run slow • NameNode re-replicates blocks from node A
  • 12. Case #2, #3: Hazards of Abusing HDFS and ZK • Millions of HDFS files (Bad Practice) -> 500,000 blocks per datanode -> Heartbeat thread blocks IO (SW Bug) -> RS cannot access HDFS (Bad outcome) -> HBase goes down • Millions of ZK nodes (Misconfiguration) -> Millions of ZK znodes, 400MB snapshot -> ZK fails to create new snapshots, fails -> HBase goes down (Bad outcome) -> HBase fails to restart (SW Bug, Worse outcome)
  • 13. Case #4: Splitting Corruption from HW failure • Network failure (takes out NN) -> Region Split incomplete -> Recovery attempts to split -> HBase has region inconsistencies (overlaps / holes) -> Multiple 6 hour manual repair sessions. (Annotations: HW Failure; SW Bug; Manual, Slow, and requires expert)
  • 14. Case #5: Slow recovery from HW failure • Network HW failure -> Manual Repairs -> RS loses HDFS, WALs -> On restart, Root and .META. assign fails -> 9 hour hlog splitting recovery (Annotations: Human error; SW error; Correct but slow!)
  • 15. Initial Lessons • Use Best practices to avoid problems • Conservative first • Avoid unstable features • What can we do? • Fix the bugs • Recover from problems faster • Make people smarter to avoid hazards and misconfigurations • Make software smarter to prevent hazards and misconfigurations
  • 16. In war, then, let your great object be victory, not lengthy campaigns. -- Sun Tzu WHERE WE ARE TODAY HBASE 0.92.X + HADOOP 2.0.X
  • 17. Goal: Reduce unexpected downtime by recovering faster • Removing the SPOFs • HA HDFS • Faster Recovery • Improved hbck • Distributed Log splitting
  • 18. Problem: HDFS NN goes down under HBase • HBase depends on HDFS. • If HDFS is down, HBase goes down. • Ramifications. • Forces Recovery mechanism • Caused some data corruptions • Ideally we avoid having to do recovery at all. (Diagram: App, MR, ZK, HDFS)
  • 19. HBase-HDFS HA Nodes (Diagram: NameNode (active) (metadata server); NameNode (standby) (active-standby hot failover); HMaster (region metadata); HMaster (hot standby); ZooKeeper Quorum; HDFS DataNodes; HBase RegionServers)
  • 20. HBase-HDFS HA Nodes: Transparent to HBase (Diagram: NameNode (active); HMaster (region metadata); HMaster (hot standby); ZooKeeper Quorum; HDFS DataNodes; HBase RegionServers)
  • 21. HBase-HDFS HA Nodes: No more SPOF (Diagram: NameNode (active); HMaster (active); ZooKeeper Quorum; HDFS DataNodes; HBase RegionServers)
  • 22. Recovery operations • If a network switch fails or if there is a power outage, • HBase, ZK, and HA HDFS will fail • Will always still rely on recovery mechanisms. • Need to be able to quickly recover • Metadata Invariants to fix metadata corruptions • Data Consistency to restore ACID guarantees
  • 23. HBase Metadata Corruptions • Internal HBase metadata corruptions • Prevent HBase from starting • Cause some regions to be unavailable. • Repairs are intricate and can cause extended periods of downtime. (Chart: Unplanned Maintenance: Root Cause from Cloudera Support. HBase, ZK, MR, HDFS Misconfig 44%; Repair Needed 28%; Fix HW/NW 16%; Patch Required 12%)
  • 24. HBase Metadata Invariants • Table Integrity • Every key shall get assigned to a single region. (Diagram: region chain [‘ ‘, A), [A, B), [B, C), [C, D), [D, E), [E, F), [F, G), [G, ‘ ‘)) • Region Consistency • Metadata about regions should agree in hdfs, meta and region server assignment. (Diagram: Good region; regioninfo in META; .regioninfo in HDFS; region assigned to RS)
  • 25. Detecting and Repairing corruption with hbck • HBase 0.90 hbck • Checks an HBase instance’s internal invariants. • HBase hbck today • Checks and can fix problems in an HBase instance’s internal invariants • 0.90.7, 0.92.2, 0.94.0 • CDH3u4, CDH4
  • 26. Case #4 redux: Splitting Corruption • Network failure (takes out NN) -> Region Split incomplete -> Recovery attempts to split -> HBase has region inconsistencies (overlaps / holes) -> Multiple 6 hour manual repair sessions. (Annotations: HW Failure; SW Bug; Manual, Slow, and requires expert)
  • 27. Case #4 redux: Splitting Corruption • Network failure (takes out NN) -> Region Split incomplete -> Recovery attempts to split -> HBase has region inconsistencies (overlaps / holes) -> Automated repair tool (Minutes) (Annotations: HW Failure; SW Bug; Fixes are quicker, operator can use)
  • 28. Case #4 redux: Splitting Corruption • Network failure (takes out NN) -> Region Split incomplete -> Recovery attempts to split -> Minor HBase inconsistencies (bad assignments) -> Automated repair tool (seconds) (Annotations: HW Failure; Fixed SW Bug)
  • 29. Data Consistency • When a region server goes down, it tries to flush data in memory to HDFS. • If it cannot write to HDFS, it relies on the WAL/HLog. • Recovery via the HLog is vital to prevent data loss • Understand the write path. • Recovery: HLog splitting. • Faster Recovery: Distributed HLog splitting.
  • 30. Write Path (Put / Delete / Increment) (Diagram: HBase client; Put; Region Server containing Server, HLog, and two HRegions, each with a MemStore and two HStores)
  • 31. Write Path (Put / Delete / Increment) Note, both regions write to the same HLog (Diagram: HBase client; Put; Region Server containing Server, HLog, and two HRegions, each with a MemStore and two HStores)
  • 32. Log Splitting (Diagram: HMaster; three RegionServers with HLog1, HLog2, HLog3, …; six HRegions, each with mem)
  • 33. Log Splitting (Diagram: HMaster; three RegionServers with HLog1, HLog2, HLog3, …; six HRegions, each with mem)
  • 34. Log Splitting (Diagram: HMaster; HLog1, HLog2, HLog3, …; six HRegions)
  • 35. Log Splitting Splitting log 1 (Diagram: HMaster; HLog1, HLog2, HLog3, …; six HRegions)
  • 36. Log Splitting Splitting log 2 (Diagram: HMaster; one split HLog; HLog1, HLog2, HLog3, …; six HRegions)
  • 37. Log Splitting Splitting log 3 (Diagram: HMaster; two split HLogs; HLog1, HLog2, HLog3, …; six HRegions)
  • 38. Log Splitting Splitting log 100 (Diagram: HMaster; split HLogs, …; six HRegions)
  • 39. Log Splitting Whew. I did a lot of splitting work. That took 9 hours! (Diagram: HMaster; split HLogs, …; six HRegions)
  • 40. Log Splitting RegionServers, here are your region assignments. (Diagram: HMaster; RegionServer4, RegionServer5, RegionServer6, …; six HRegions)
  • 41. Log Splitting Victory! (Diagram: HMaster; RegionServer4, RegionServer5, RegionServer6, …; six HRegions, each with mem)
  • 42. Can we recover more quickly? • In the case study, this is all done serially by the master • The master took 9 hours to recover. • The 100 region server nodes were idle. • Let’s use the idle machines to do splitting in parallel! • Distributed log splitting (HBASE-1364) • Introduced in 0.92.0 by Prakash Khemani (Facebook) • Included in CDH4 (0.92.1) • Backported to CDH3u3 (off by default)
  • 43. Distributed Log Splitting I’m the boss. (Diagram: HMaster; three RegionServers with HLog1, HLog2, HLog3, …; six HRegions, each with mem)
  • 44. Distributed Log Splitting There is a lot of splitting work here, let’s split it up. (Diagram: HMaster; HLog1, HLog2, HLog3, …; six HRegions)
  • 45. Distributed Log Splitting You guys do the work for me. (Diagram: HMaster; RegionServer4, RegionServer5, RegionServer6; HLog1, HLog2, HLog3, …; six HRegions)
  • 46. Distributed Log Splitting You guys do the work for me. (Diagram: HMaster; RegionServer4, RegionServer5, RegionServer6; HLog1, HLog2, HLog3, …; six HRegions)
  • 47. Distributed Log Splitting Great, that took 5.4 minutes. (Diagram: HMaster; RegionServer4, RegionServer5, RegionServer6, …; six HRegions)
  • 48. Distributed Log Splitting Good Job, here are your region assignments. (Diagram: HMaster; RegionServer4, RegionServer5, RegionServer6, …; six HRegions)
  • 49. Distributed Log Splitting Like a Boss. (Diagram: HMaster; RegionServer4, RegionServer5, RegionServer6, …; six HRegions, each with mem)
  • 50. Case #5 redux: Network failure and slow recovery • Network HW failure -> Manual Repair -> RS loses HDFS, WALs -> On restart, Root and .META. assign fails -> 9 hour hlog splitting recovery (Annotations: Human error; Correct but slow!)
  • 51. Case #5 redux: Network failure and slow recovery • Network HW failure -> Automatic repairs -> RS loses HDFS, WALs -> On restart, Root and .META. assign fails -> 5.4 Minute hlog splitting recovery (Annotations: Human error; Fixed!; Correct and Faster!)
  • 52. WHERE WE ARE GOING HBASE 0.96 + HADOOP 2.X
  • 53. Themes • Minimizing Planned downtime • Changing configurations • Online Schema Change (experimental in 0.92, 0.94) • Rolling Restarts • Wire compatibility (Chart: HBase Downtime Distribution, Planned vs. Unplanned)
  • 54. Table unavailable when changing schema • Changing table schema requires disabling table • disable table, alter table schema, enable table • Schema includes compression, cf’s, caching, ttl, versions. • Goal: Quickly change table and column configuration settings without having to disable HBase tables. • Feature: Online Schema Change (HBASE-1730) • Included in, but considered experimental in, HBase 0.92/0.94. • Contributed by Facebook
  • 55. Changing Server Configs and Software updates • Rolling restart is an operation for upgrading an HBase cluster to a compatible version while keeping HBase available and serving data. • Handle server config changes. • Handle code changes like hotfixes or compatible upgrades
  • 56. Rolling Restart (Diagram: ZK, Client, Shell; HM1, HM2; RS1, RS2, RS3, RS4. Legend: Admin operations, User operations, Internal operations)
  • 57. Rolling Restart (Diagram: ZK, Client, Shell; HM1, HM2; RS2, RS3, RS4. Legend: Admin operations, User operations, Internal operations)
  • 58. Rolling Restart (Diagram: ZK, Client, Shell; HM1, HM2; RS1, RS2, RS3, RS4. Legend: Admin operations, User operations, Internal operations)
  • 59. Rolling Restart (Diagram: ZK, Client, Shell; HM1, HM2; RS1, RS3, RS4. Legend: Admin operations, User operations, Internal operations)
  • 60. Rolling Restart (Diagram: ZK, Client, Shell; HM1, HM2; RS1, RS2, RS3, RS4. Legend: Admin operations, User operations, Internal operations)
  • 61. Rolling Restart (Diagram: ZK, Client, Shell; HM1, HM2; RS1, RS2, RS4. Legend: Admin operations, User operations, Internal operations)
  • 62. Rolling Restart (Diagram: ZK, Client, Shell; HM1, HM2; RS1, RS2, RS3, RS4. Legend: Admin operations, User operations, Internal operations)
  • 63. Rolling Restart (Diagram: ZK, Client, Shell; HM1, HM2; RS1, RS2, RS3. Legend: Admin operations, User operations, Internal operations)
  • 64. Rolling Restart (Diagram: ZK, Client, Shell; HM1, HM2; RS1, RS2, RS3, RS4. Legend: Admin operations, User operations, Internal operations)
  • 65. Rolling Restart (Diagram: ZK, Client, Shell; HM2; RS1, RS2, RS3, RS4. Legend: Admin operations, User operations, Internal operations)
  • 66. Rolling Restart (Diagram: ZK, Client, Shell; HM1, HM2; RS1, RS2, RS3, RS4. Legend: Admin operations, User operations, Internal operations)
  • 67. Rolling Restart (Diagram: ZK, Client, Shell; HM1; RS1, RS2, RS3, RS4. Legend: Admin operations, User operations, Internal operations)
  • 68. Rolling Restart (Diagram: ZK, Client, Shell; HM1, HM2; RS1, RS2, RS3, RS4. Legend: Admin operations, User operations, Internal operations)
  • 69. Rolling Restart (Diagram: ZK, Client, Shell; HM1, HM2; RS1, RS2, RS3, RS4. Legend: Admin operations, User operations, Internal operations)
  • 70. Rolling restart limitations • There are limitations on rolling restarts • All Servers and clients must be wire compatible • All must be able to read old data in FS and ZK. • Ramifications: • Only minor version upgrades possible • New features that change RPCs require custom compatibility shims. • Data format changes not possible across minor versions. (Chart: Unplanned Maintenance: Root Cause from Cloudera Support. HBase, ZK, MR, HDFS Misconfig 44%; Repair Needed 28%; Fix HW/NW 16%; Patch Required 12%. Source: Cloudera’s production HBase Support Tickets, CDH3’s HBase 0.90.x, Hadoop 0.20.x/1.0.x)
  • 71. HBase Compatibility and Extensibility • Coming in HBase 0.96 • HBASE-5305 and friends • Goals: • Allow API changes and persistent data structure changes while guaranteeing compatibility between different minor versions (0.96.0 -> 0.96.1) • HBase client-server compatibility between major versions (0.96.x -> 0.98.x)
  • 72. HDFS Wire Compatibility • Here in HDFS 2.0.x • HADOOP-7347 and friends • Goals: • Allow API changes while guaranteeing wire compatibility between different minor versions • HDFS client-server compatibility between major versions. (Diagram: App, MR, ZK, HDFS)
  • 73. HDFS Wire Compatibility • Here in HDFS 2.0.x • HADOOP-7347 and friends • Goals: • Allow API changes while guaranteeing wire compatibility between different minor versions • HDFS client-server compatibility between major versions. (Diagram: App, MR, ZK, HDFS)
  • 74. CONCLUSIONS
  • 75. Improving how we handle causes of downtime (Charts: HBase Downtime Distribution, Planned vs. Unplanned; Unplanned Maintenance: Root Cause from Cloudera Support. HBase, ZK, MR, HDFS Misconfig 44%; Repair Needed 28%; Fix HW/NW 16%; Patch Required 12%. Annotations: Wire compat; Best practices; hbck; hbck and distributed log splitting; Wire compat)
  • 76. QUESTIONS? jon@cloudera.com Twitter: @jmhsieh We’re hiring!
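
The log-splitting slides (32 to 49) describe one master splitting every HLog serially, then handing the same work to idle region servers in parallel. The following is a minimal sketch of that idea only, not HBase's implementation: an HLog is modelled as a hypothetical list of `(region, edit)` pairs, and the names `split_log`, `split_serially`, and `split_distributed` are assumptions made for illustration. A thread pool stands in for the region servers that pick up splitting tasks.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def split_log(hlog):
    """Group one log's edits by region, preserving write order."""
    per_region = defaultdict(list)
    for region, edit in hlog:
        per_region[region].append(edit)
    return dict(per_region)


def merge(results):
    """Combine per-log results into one edit list per region."""
    merged = defaultdict(list)
    for per_region in results:
        for region, edits in per_region.items():
            merged[region].extend(edits)
    return dict(merged)


def split_serially(hlogs):
    """The 'master does everything' case: one log after another."""
    return merge(split_log(log) for log in hlogs)


def split_distributed(hlogs, workers=4):
    """Hand each log to a worker; the pool stands in for idle region servers."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in input order, so log order is preserved.
        return merge(pool.map(split_log, hlogs))


# Hypothetical logs: each contains edits from several regions.
hlogs = [
    [("r1", "put a"), ("r2", "put b"), ("r1", "delete a")],
    [("r2", "put c"), ("r3", "put d")],
]
```

Both functions produce the same per-region edit lists; only the scheduling differs, which is the point the slides make when the same recovery drops from 9 hours to 5.4 minutes.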

Editor's notes

  1. This pie chart is the product of analyzing critical production HBase tickets over the past 6 months: misconfig 44%, patch 12%, hw/nw 16%, repair 28%. "Misconfig" means that correcting a misconfiguration was all that it took to bring HBase back up again. As you can see, misconfigurations and bugs break the most HBase clusters. Fixing bugs is up to the community. Fixing misconfigurations is up to you and is the focus of the next segment. Because they are hard to diagnose, misconfigurations are not what you want to spend your time on. If your cluster is broken, it’s probably a misconfiguration. This is a hard problem because the error messages are not tightly tied to the root cause.
  2. CDH3 goes GA 4/12/12
  3. HDFS-2379
  4. CDH4 GA 6/5/12
  5. Tested under HBase
  6. Transparent to clients
  7. Coupled with HBase Master failover means no SPOF
  8. Cause: Disconnect a region server for a while; kill -9 a region server. Why? All writes at a region server go to a single HLog. This can contain edits from multiple regions. Regions may get reassigned to multiple other region servers. Need to split up the HLog.
  9. region_mover.rb: Move regions off, recording which regions were present. Restore regions, based on recorded regions. Still mostly a manual process.
  10. region_mover.rb: Move regions off, recording which regions were present. Restore regions, based on recorded regions. Still mostly a manual process.
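
The drain-and-restore procedure in the last two notes can be sketched as a small model. This is an illustration under stated assumptions, not the logic of the real script: the cluster is a hypothetical dict mapping server names to sets of region names, and `drain`, `restore`, and `rolling_restart` are names invented here.

```python
def drain(assignments, server):
    """Move every region off `server`; return the regions it held.

    Assumes the cluster has at least one other server to receive them.
    """
    recorded = sorted(assignments[server])
    others = sorted(s for s in assignments if s != server)
    for i, region in enumerate(recorded):
        # Spread the drained regions round-robin over the remaining servers.
        assignments[others[i % len(others)]].add(region)
    assignments[server] = set()
    return recorded


def restore(assignments, server, recorded):
    """Move the recorded regions back onto `server`."""
    for region in recorded:
        for regions in assignments.values():
            regions.discard(region)
        assignments[server].add(region)


def rolling_restart(assignments, restart):
    """Drain, restart, and restore one server at a time."""
    for server in sorted(assignments):
        recorded = drain(assignments, server)
        restart(server)  # hypothetical callback that restarts the process
        restore(assignments, server, recorded)
```

Because only one server is empty at any moment, every region stays assigned somewhere throughout, which is what keeps the cluster serving data during the restart.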