SlideShare a Scribd company logo
1 of 23
HDFS HA using Journal Nodes
Introducing Journal Nodes
Manual
Failover
7/26/2013 Copyright 2013 Trend Micro Inc.
Architecture
7/26/2013 Copyright 2013 Trend Micro Inc.
JN
1
JN
2
JN
3
NN
Active
NN
Standby
DN DNDNDN
Block locations map
• When any namespace modification is performed
it durably logs a record of the modification to JNs
• The Standby reads the edits from the JNs and applies
them to its own namespace
JournalNodes’ job
7/26/2013 Copyright 2013 Trend Micro Inc.
JN
1
JN
2
JN
3
NN
Active
NN
Standby
Edits Edits Edits
Edits Edits EditsEdits Edits Edits
Safe
Mode
• Specify path on local disk
• tolerate at most (N - 1) / 2 failures
JournalNodes’ storage
7/26/2013 Copyright 2013 Trend Micro Inc.
• JournalNodes will only allow a single NameNode to be a
writer at a time.
• no potential for corrupting the file system metadata from
a split-brain scenario.
JournalNodes’ fencing
7/26/2013 Copyright 2013 Trend Micro Inc.
JN
1
JN
2
JN
3
NN
Active
NN
Standby
WRITE READ
• Whenever a NameNode becomes active, it first
generate an epoch number.
• first active NameNode after the namespace is initialized
starts with epoch number 1
• any failovers or restarts result in an increment of the
epoch number
JournalNodes’ fencing
7/26/2013 Copyright 2013 Trend Micro Inc.
• When a new NameNode becomes active, it has an
epoch number higher than any previous NameNode
• Call JournalNodes to increment their promised epochs
• Fencing:
– JNs receive newer epoch
 update majority of JNs’ promised epochs  accept
– JNs receive older epoch
 reject
JournalNodes’ fencing
7/26/2013 Copyright 2013 Trend Micro Inc.
• previous Active NameNode could serve read requests
to clients which may be out of date until a write access
performed
• You can specify some fencing method to avoid this
happened
But…
7/26/2013 Copyright 2013 Trend Micro Inc.
Fencing
Method
7/26/2013 Copyright 2013 Trend Micro Inc.
• sshfence
SSH to the Active NameNode and kill the process
Fencing Method
7/26/2013 Copyright 2013 Trend Micro Inc.
• shell
run a shell command to fence the Active NameNode
• The script may have properties with the '_' character
replacing any '.'
ex : dfs_namenode_rpc-address
Fencing Method
7/26/2013 Copyright 2013 Trend Micro Inc.
• Additional environment variable
Fencing Method
7/26/2013 Copyright 2013 Trend Micro Inc.
Automatic
Failover
7/26/2013 Copyright 2013 Trend Micro Inc.
7/26/2013 Copyright 2013 Trend Micro Inc.
JN
1
JN
2
JN
3
• Health monitoring
– the ZKFC pings its local NameNode on a periodic basis with a
health-check command. (healthy/unhealthy)
• ZooKeeper session management
– when the local NameNode is healthy, the ZKFC holds a session
open in ZooKeeper.
– If the local NameNode is active, it also holds a special "lock"
znode.
– if the session expires, the lock node will be automatically
deleted.
ZKFailoverController
7/26/2013 Copyright 2013 Trend Micro Inc.
• ZooKeeper-based election
– if the local NameNode is healthy, and no other node currently
holds the lock znode, it will itself try to acquire the lock.
– If it succeeds, then it has "won the election“
• Failover
– the previous active is fenced
– local NameNode transitions to active state.
ZKFailoverController
7/26/2013 Copyright 2013 Trend Micro Inc.
7/26/2013 Copyright 2013 Trend Micro Inc.
JN
1
JN
2
JN
3
NN
Active
1
2
3
4
5 6
7
Client
Side
7/26/2013 Copyright 2013 Trend Micro Inc.
• Client connect to Active Namenode via proxy
• When Active Namenode down, client receive Exception
 retry and send RPC to another namenode
(implement by ConfiguredFailoverProxyProvider)
Client Failover
7/26/2013 Copyright 2013 Trend Micro Inc.
Steps to Apply
HDFS HA
7/26/2013 Copyright 2013 Trend Micro Inc.
• If setting up a fresh HDFS cluster,
hdfs namenode –format
• copy over the contents of your NameNode metadata
directories to the other
hdfs namenode –bootstrapStandby
./format-failover-namenode.sh
• hdfs –initializeSharedEdits to initialize edits log in
journalnode
• Startup both Namenode
converting a non-HA-enabled cluster
to be HA-enabled
7/26/2013 Copyright 2013 Trend Micro Inc.
• http://hadoop.apache.org/docs/current/hadoop-
yarn/hadoop-yarn-
site/HDFSHighAvailabilityWithQJM.html#Deployment
• http://blog.cloudera.com/blog/2012/10/quorum-based-
journaling-in-cdh4-1/
• https://issues.apache.org/jira/secure/attachment/125475
98/qjournal-design.pdf
Reference
7/26/2013 Copyright 2013 Trend Micro Inc.

More Related Content

What's hot

How netflix manages petabyte scale apache cassandra in the cloud
How netflix manages petabyte scale apache cassandra in the cloudHow netflix manages petabyte scale apache cassandra in the cloud
How netflix manages petabyte scale apache cassandra in the cloud
Vinay Kumar Chella
 

What's hot (20)

Percona XtraDB Cluster ( Ensure high Availability )
Percona XtraDB Cluster ( Ensure high Availability )Percona XtraDB Cluster ( Ensure high Availability )
Percona XtraDB Cluster ( Ensure high Availability )
 
Disaster Recovery Plans for Apache Kafka
Disaster Recovery Plans for Apache KafkaDisaster Recovery Plans for Apache Kafka
Disaster Recovery Plans for Apache Kafka
 
How netflix manages petabyte scale apache cassandra in the cloud
How netflix manages petabyte scale apache cassandra in the cloudHow netflix manages petabyte scale apache cassandra in the cloud
How netflix manages petabyte scale apache cassandra in the cloud
 
Multi-Cluster and Failover for Apache Kafka - Kafka Summit SF 17
Multi-Cluster and Failover for Apache Kafka - Kafka Summit SF 17Multi-Cluster and Failover for Apache Kafka - Kafka Summit SF 17
Multi-Cluster and Failover for Apache Kafka - Kafka Summit SF 17
 
The Google Chubby lock service for loosely-coupled distributed systems
The Google Chubby lock service for loosely-coupled distributed systemsThe Google Chubby lock service for loosely-coupled distributed systems
The Google Chubby lock service for loosely-coupled distributed systems
 
Apache Kafka Introduction
Apache Kafka IntroductionApache Kafka Introduction
Apache Kafka Introduction
 
Deep Dive into Apache Kafka
Deep Dive into Apache KafkaDeep Dive into Apache Kafka
Deep Dive into Apache Kafka
 
Apache kafka 관리와 모니터링
Apache kafka 관리와 모니터링Apache kafka 관리와 모니터링
Apache kafka 관리와 모니터링
 
Apache ZooKeeper
Apache ZooKeeperApache ZooKeeper
Apache ZooKeeper
 
Monitoring Apache Kafka
Monitoring Apache KafkaMonitoring Apache Kafka
Monitoring Apache Kafka
 
Introduction to Apache Kudu
Introduction to Apache KuduIntroduction to Apache Kudu
Introduction to Apache Kudu
 
Single Page Application (SPA) using AngularJS
Single Page Application (SPA) using AngularJSSingle Page Application (SPA) using AngularJS
Single Page Application (SPA) using AngularJS
 
Producer Performance Tuning for Apache Kafka
Producer Performance Tuning for Apache KafkaProducer Performance Tuning for Apache Kafka
Producer Performance Tuning for Apache Kafka
 
Architecture patterns for distributed, hybrid, edge and global Apache Kafka d...
Architecture patterns for distributed, hybrid, edge and global Apache Kafka d...Architecture patterns for distributed, hybrid, edge and global Apache Kafka d...
Architecture patterns for distributed, hybrid, edge and global Apache Kafka d...
 
Spectrum Scale Best Practices by Olaf Weiser
Spectrum Scale Best Practices by Olaf WeiserSpectrum Scale Best Practices by Olaf Weiser
Spectrum Scale Best Practices by Olaf Weiser
 
Apache kafka performance(throughput) - without data loss and guaranteeing dat...
Apache kafka performance(throughput) - without data loss and guaranteeing dat...Apache kafka performance(throughput) - without data loss and guaranteeing dat...
Apache kafka performance(throughput) - without data loss and guaranteeing dat...
 
Backup & disaster recovery for Solr
Backup & disaster recovery for SolrBackup & disaster recovery for Solr
Backup & disaster recovery for Solr
 
Oracle Real Application Clusters 19c- Best Practices and Internals- EMEA Tour...
Oracle Real Application Clusters 19c- Best Practices and Internals- EMEA Tour...Oracle Real Application Clusters 19c- Best Practices and Internals- EMEA Tour...
Oracle Real Application Clusters 19c- Best Practices and Internals- EMEA Tour...
 
Why My Streaming Job is Slow - Profiling and Optimizing Kafka Streams Apps (L...
Why My Streaming Job is Slow - Profiling and Optimizing Kafka Streams Apps (L...Why My Streaming Job is Slow - Profiling and Optimizing Kafka Streams Apps (L...
Why My Streaming Job is Slow - Profiling and Optimizing Kafka Streams Apps (L...
 
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...
 

Similar to Hdfs ha using journal nodes

Disaster recovery of OpenStack Cinder using DRBD
Disaster recovery of OpenStack Cinder using DRBDDisaster recovery of OpenStack Cinder using DRBD
Disaster recovery of OpenStack Cinder using DRBD
Viswesuwara Nathan
 
Tuning parallelcodeonsolaris005
Tuning parallelcodeonsolaris005Tuning parallelcodeonsolaris005
Tuning parallelcodeonsolaris005
dflexer
 
Elasticsearch Data Analyses
Elasticsearch Data AnalysesElasticsearch Data Analyses
Elasticsearch Data Analyses
Alaa Elhadba
 
Life in the fast lane. Full speed XPages
Life in the fast lane. Full speed XPagesLife in the fast lane. Full speed XPages
Life in the fast lane. Full speed XPages
Ulrich Krause
 
Romanticos com drbd 2
Romanticos com drbd 2Romanticos com drbd 2
Romanticos com drbd 2
eiichi2009
 
Android Forensics: Exploring Android Internals and Android Apps
Android Forensics: Exploring Android Internals and Android AppsAndroid Forensics: Exploring Android Internals and Android Apps
Android Forensics: Exploring Android Internals and Android Apps
Moe Tanabian
 

Similar to Hdfs ha using journal nodes (20)

Zookeeper Introduce
Zookeeper IntroduceZookeeper Introduce
Zookeeper Introduce
 
Hadoop Summit 2012 | HDFS High Availability
Hadoop Summit 2012 | HDFS High AvailabilityHadoop Summit 2012 | HDFS High Availability
Hadoop Summit 2012 | HDFS High Availability
 
Drbd
DrbdDrbd
Drbd
 
Disaster recovery of OpenStack Cinder using DRBD
Disaster recovery of OpenStack Cinder using DRBDDisaster recovery of OpenStack Cinder using DRBD
Disaster recovery of OpenStack Cinder using DRBD
 
Gnr writepath v1.0
Gnr writepath v1.0Gnr writepath v1.0
Gnr writepath v1.0
 
Tuning parallelcodeonsolaris005
Tuning parallelcodeonsolaris005Tuning parallelcodeonsolaris005
Tuning parallelcodeonsolaris005
 
From 1000/day to 1000/sec: The Evolution of Incapsula's BIG DATA System [Surg...
From 1000/day to 1000/sec: The Evolution of Incapsula's BIG DATA System [Surg...From 1000/day to 1000/sec: The Evolution of Incapsula's BIG DATA System [Surg...
From 1000/day to 1000/sec: The Evolution of Incapsula's BIG DATA System [Surg...
 
HDFS - What's New and Future
HDFS - What's New and FutureHDFS - What's New and Future
HDFS - What's New and Future
 
Elasticsearch Data Analyses
Elasticsearch Data AnalysesElasticsearch Data Analyses
Elasticsearch Data Analyses
 
Migrating to XtraDB Cluster
Migrating to XtraDB ClusterMigrating to XtraDB Cluster
Migrating to XtraDB Cluster
 
Life in the fast lane. Full speed XPages
Life in the fast lane. Full speed XPagesLife in the fast lane. Full speed XPages
Life in the fast lane. Full speed XPages
 
brief introduction of drbd in SLE12SP2
brief introduction of drbd in SLE12SP2brief introduction of drbd in SLE12SP2
brief introduction of drbd in SLE12SP2
 
Romanticos com drbd 2
Romanticos com drbd 2Romanticos com drbd 2
Romanticos com drbd 2
 
SFBay Area Solr Meetup - June 18th: Benchmarking Solr Performance
SFBay Area Solr Meetup - June 18th: Benchmarking Solr PerformanceSFBay Area Solr Meetup - June 18th: Benchmarking Solr Performance
SFBay Area Solr Meetup - June 18th: Benchmarking Solr Performance
 
Art of the Possible_Tim Faulkes.pdf
Art of the Possible_Tim Faulkes.pdfArt of the Possible_Tim Faulkes.pdf
Art of the Possible_Tim Faulkes.pdf
 
Apache Hadoop- Hadoop Basics.pptx
Apache Hadoop- Hadoop Basics.pptxApache Hadoop- Hadoop Basics.pptx
Apache Hadoop- Hadoop Basics.pptx
 
Distributed replicated block device
Distributed replicated block deviceDistributed replicated block device
Distributed replicated block device
 
Android Forensics: Exploring Android Internals and Android Apps
Android Forensics: Exploring Android Internals and Android AppsAndroid Forensics: Exploring Android Internals and Android Apps
Android Forensics: Exploring Android Internals and Android Apps
 
Migrating to XtraDB Cluster
Migrating to XtraDB ClusterMigrating to XtraDB Cluster
Migrating to XtraDB Cluster
 
Benchmarking Solr Performance
Benchmarking Solr PerformanceBenchmarking Solr Performance
Benchmarking Solr Performance
 

More from Evans Ye

ONE FOR ALL! Using Apache Calcite to make SQL smart
ONE FOR ALL! Using Apache Calcite to make SQL smartONE FOR ALL! Using Apache Calcite to make SQL smart
ONE FOR ALL! Using Apache Calcite to make SQL smart
Evans Ye
 
Docker workshop
Docker workshopDocker workshop
Docker workshop
Evans Ye
 
Fits docker into devops
Fits docker into devopsFits docker into devops
Fits docker into devops
Evans Ye
 
Network Traffic Search using Apache HBase
Network Traffic Search using Apache HBaseNetwork Traffic Search using Apache HBase
Network Traffic Search using Apache HBase
Evans Ye
 

More from Evans Ye (20)

Join ASF to Unlock Full Possibilities of Your Professional Career.pdf
Join ASF to Unlock Full Possibilities of Your Professional Career.pdfJoin ASF to Unlock Full Possibilities of Your Professional Career.pdf
Join ASF to Unlock Full Possibilities of Your Professional Career.pdf
 
非常人走非常路:參與ASF打世界杯比賽
非常人走非常路:參與ASF打世界杯比賽非常人走非常路:參與ASF打世界杯比賽
非常人走非常路:參與ASF打世界杯比賽
 
TensorFlow on Spark: A Deep Dive into Distributed Deep Learning
TensorFlow on Spark: A Deep Dive into Distributed Deep LearningTensorFlow on Spark: A Deep Dive into Distributed Deep Learning
TensorFlow on Spark: A Deep Dive into Distributed Deep Learning
 
2017 big data landscape and cutting edge innovations public
2017 big data landscape and cutting edge innovations public2017 big data landscape and cutting edge innovations public
2017 big data landscape and cutting edge innovations public
 
ONE FOR ALL! Using Apache Calcite to make SQL smart
ONE FOR ALL! Using Apache Calcite to make SQL smartONE FOR ALL! Using Apache Calcite to make SQL smart
ONE FOR ALL! Using Apache Calcite to make SQL smart
 
The Apache Way: A Proven Way Toward Success
The Apache Way: A Proven Way Toward SuccessThe Apache Way: A Proven Way Toward Success
The Apache Way: A Proven Way Toward Success
 
The Apache Way
The Apache WayThe Apache Way
The Apache Way
 
Leveraging docker for hadoop build automation and big data stack provisioning
Leveraging docker for hadoop build automation and big data stack provisioningLeveraging docker for hadoop build automation and big data stack provisioning
Leveraging docker for hadoop build automation and big data stack provisioning
 
Using the SDACK Architecture to Build a Big Data Product
Using the SDACK Architecture to Build a Big Data ProductUsing the SDACK Architecture to Build a Big Data Product
Using the SDACK Architecture to Build a Big Data Product
 
Trend Micro Big Data Platform and Apache Bigtop
Trend Micro Big Data Platform and Apache BigtopTrend Micro Big Data Platform and Apache Bigtop
Trend Micro Big Data Platform and Apache Bigtop
 
How bigtop leveraged docker for build automation and one click hadoop provis...
How bigtop leveraged docker for build automation and  one click hadoop provis...How bigtop leveraged docker for build automation and  one click hadoop provis...
How bigtop leveraged docker for build automation and one click hadoop provis...
 
How bigtop leveraged docker for build automation and one click hadoop provis...
How bigtop leveraged docker for build automation and  one click hadoop provis...How bigtop leveraged docker for build automation and  one click hadoop provis...
How bigtop leveraged docker for build automation and one click hadoop provis...
 
BigTop vm and docker provisioner
BigTop vm and docker provisionerBigTop vm and docker provisioner
BigTop vm and docker provisioner
 
Docker workshop
Docker workshopDocker workshop
Docker workshop
 
Fits docker into devops
Fits docker into devopsFits docker into devops
Fits docker into devops
 
Getting involved in world class software engineering tips and tricks to join ...
Getting involved in world class software engineering tips and tricks to join ...Getting involved in world class software engineering tips and tricks to join ...
Getting involved in world class software engineering tips and tricks to join ...
 
Deep dive into enterprise data lake through Impala
Deep dive into enterprise data lake through ImpalaDeep dive into enterprise data lake through Impala
Deep dive into enterprise data lake through Impala
 
How we lose etu hadoop competition
How we lose etu hadoop competitionHow we lose etu hadoop competition
How we lose etu hadoop competition
 
Network Traffic Search using Apache HBase
Network Traffic Search using Apache HBaseNetwork Traffic Search using Apache HBase
Network Traffic Search using Apache HBase
 
Vagrant
VagrantVagrant
Vagrant
 

Recently uploaded

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
giselly40
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
Enterprise Knowledge
 

Recently uploaded (20)

The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 

Hdfs ha using journal nodes

  • 1. HDFS HA using Journal Nodes
  • 2. Introducing Journal Nodes Manual Failover 7/26/2013 Copyright 2013 Trend Micro Inc.
  • 3. Architecture 7/26/2013 Copyright 2013 Trend Micro Inc. JN 1 JN 2 JN 3 NN Active NN Standby DN DNDNDN Block locations map
  • 4. • When any namespace modification is performed it durably logs a record of the modification to JNs • The Standby reads the edits from the JNs and applies them to its own namespace JournalNodes’ job 7/26/2013 Copyright 2013 Trend Micro Inc. JN 1 JN 2 JN 3 NN Active NN Standby Edits Edits Edits Edits Edits EditsEdits Edits Edits Safe Mode
  • 5. • Specify path on local disk • tolerate at most (N - 1) / 2 failures JournalNodes’ storage 7/26/2013 Copyright 2013 Trend Micro Inc.
  • 6. • JournalNodes will only allow a single NameNode to be a writer at a time. • no potential for corrupting the file system metadata from a split-brain scenario. JournalNodes’ fencing 7/26/2013 Copyright 2013 Trend Micro Inc. JN 1 JN 2 JN 3 NN Active NN Standby WRITE READ
  • 7. • Whenever a NameNode becomes active, it first generate an epoch number. • first active NameNode after the namespace is initialized starts with epoch number 1 • any failovers or restarts result in an increment of the epoch number JournalNodes’ fencing 7/26/2013 Copyright 2013 Trend Micro Inc.
  • 8. • When a new NameNode becomes active, it has an epoch number higher than any previous NameNode • Call JournalNodes to increment their promised epochs • Fencing: – JNs receive newer epoch  update majority of JNs’ promised epochs  accept – JNs receive older epoch  reject JournalNodes’ fencing 7/26/2013 Copyright 2013 Trend Micro Inc.
  • 9. • previous Active NameNode could serve read requests to clients which may be out of date until a write access performed • You can specify some fencing method to avoid this happened But… 7/26/2013 Copyright 2013 Trend Micro Inc.
  • 11. • sshfence SSH to the Active NameNode and kill the process Fencing Method 7/26/2013 Copyright 2013 Trend Micro Inc.
  • 12. • shell run a shell command to fence the Active NameNode • The script may have properties with the '_' character replacing any '.' ex : dfs_namenode_rpc-address Fencing Method 7/26/2013 Copyright 2013 Trend Micro Inc.
  • 13. • Additional environment variable Fencing Method 7/26/2013 Copyright 2013 Trend Micro Inc.
  • 15. 7/26/2013 Copyright 2013 Trend Micro Inc. JN 1 JN 2 JN 3
  • 16. • Health monitoring – the ZKFC pings its local NameNode on a periodic basis with a health-check command. (healthy/unhealthy) • ZooKeeper session management – when the local NameNode is healthy, the ZKFC holds a session open in ZooKeeper. – If the local NameNode is active, it also holds a special "lock" znode. – if the session expires, the lock node will be automatically deleted. ZKFailoverController 7/26/2013 Copyright 2013 Trend Micro Inc.
  • 17. • ZooKeeper-based election – if the local NameNode is healthy, and no other node currently holds the lock znode, it will itself try to acquire the lock. – If it succeeds, then it has "won the election“ • Failover – the previous active is fenced – local NameNode transitions to active state. ZKFailoverController 7/26/2013 Copyright 2013 Trend Micro Inc.
  • 18. 7/26/2013 Copyright 2013 Trend Micro Inc. JN 1 JN 2 JN 3 NN Active 1 2 3 4 5 6 7
  • 20. • Client connect to Active Namenode via proxy • When Active Namenode down, client receive Exception  retry and send RPC to another namenode (implement by ConfiguredFailoverProxyProvider) Client Failover 7/26/2013 Copyright 2013 Trend Micro Inc.
  • 21. Steps to Apply HDFS HA 7/26/2013 Copyright 2013 Trend Micro Inc.
  • 22. • If setting up a fresh HDFS cluster, hdfs namenode –format • copy over the contents of your NameNode metadata directories to the other hdfs namenode –bootstrapStandby ./format-failover-namenode.sh • hdfs –initializeSharedEdits to initialize edits log in journalnode • Startup both Namenode converting a non-HA-enabled cluster to be HA-enabled 7/26/2013 Copyright 2013 Trend Micro Inc.
  • 23. • http://hadoop.apache.org/docs/current/hadoop- yarn/hadoop-yarn- site/HDFSHighAvailabilityWithQJM.html#Deployment • http://blog.cloudera.com/blog/2012/10/quorum-based- journaling-in-cdh4-1/ • https://issues.apache.org/jira/secure/attachment/125475 98/qjournal-design.pdf Reference 7/26/2013 Copyright 2013 Trend Micro Inc.