Deploying Grid Services Using Hadoop

Allen Wittenauer
April 11, 2008
Agenda

      • Grid Computing At Yahoo!
      • Quick Overview of Hadoop
             – “Hadoop for Systems Administrators”
      • Scaling Hadoop Deployments
      • Yahoo!’s Next Generation Grid Infrastructure
      • Questions (and maybe even Answers!)




Grid Computing at Yahoo!

      • Drivers
          – 500M unique users per month
          – Billions of interesting events per day
          – “Data analysis is the inner-loop at Yahoo!”
      • Yahoo! Grid Vision and Focus
          – On-demand, shared access to vast pool of resources
          – Support for massively parallel execution (1000s of processors)
          – Data Intensive Super Computing (DISC)
          – Centrally provisioned and managed
          – Service-oriented, elastic
      • What We’re Not
          – Not “Grid” in the sense of the scientific community (Globus, etc.)
          – Not focused on public or 3rd-party utility (Amazon EC2/S3, etc.)



Yahoo! Grid Services


      • Operate multiple grids within Yahoo!
      • 10,000s of nodes, 100,000s of cores, TBs of RAM, PBs of disk
      • Support large internal user community
             – Account management, training, etc
      • Manage data needs
             – Ingest TBs per day
      • Deploy and manage software (Hadoop, Pig, etc)




“Grid” Computing

      • What we really do is utility computing
             – Nobody knows what “utility computing” is, but everyone has heard of
               “grid computing”
                     • Grid computing implies sharing across resources owned by multiple,
                       independent organizations
                     • Utility computing implies sharing one owner’s resources by multiple,
                       independent customers
      • Ultimate goal is to provide shared compute and storage resources
             – Instead of going to a hardware committee to provision balkanized
               resources, a project allocates a part of its budget for use on Yahoo!’s
               shared grids
             – Pay as you go
                      • Buy 15 minutes of compute time on 100 computers vs. owning 100
                        computers 24x7



What is a Yahoo! Grid Service?

      • Thousands of machines using basic network hardware
             – It’s hard to program for many machines
      • Clustering and sharing software
             – Hadoop, HoD, Torque, Maui, and other bits...
      • Petabytes of data
             – It’s an engineering challenge to load so much data from many sources
      • Attached development environment
             – A clean, well-lit place to interact with a grid
      • User support
             – Learning facilitation
      • Usage tracking and billing
             – Someone has to pay the bills...



Quick MapReduce Overview

      • Application writer specifies
             – two functions: Map and Reduce
             – set of input files
      • Workflow
             – input phase generates a number of FileSplits from input files (one per Map Task)
             – Map phase executes user function to transform key/value inputs into new key/value outputs
             – Framework sorts and shuffles
             – Reduce phase combines all k/v’s with the same key into new k/v’s
             – Output phase writes the resulting pairs to files
      • Think: cat * | grep | sort | uniq -c | cat > out

      [Diagram: Input 0-2 -> Map 0-2 -> Shuffle -> Reduce 0-1 -> Out 0-1]
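      (Aside, not on the original slide: the pipeline analogy maps almost directly
      onto Hadoop Streaming, which ships in contrib/ in this era and lets shell
      commands serve as the Map and Reduce functions. A minimal sketch; the jar
      and HDFS paths are illustrative:)

          # Grep-and-count, MapReduce-style: the mapper filters, the
          # framework sorts and shuffles, the reducer counts adjacent
          # duplicates (safe because the reducer's input arrives sorted).
          hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-*-streaming.jar \
              -input   /user/jerry/logs   \
              -output  /user/jerry/counts \
              -mapper  "grep ERROR"       \
              -reducer "uniq -c"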
Hadoop MapReduce: Process Level

      • Job
             – Map Function + Reduce Function + List of inputs

      [Diagram: the user submits a Job to the JobTracker, which splits it into tasks
      and hands them to per-node TaskTrackers; TaskTrackers report task status back,
      and failed tasks are rescheduled]
Hadoop Distributed File System

      [Diagram: clients pull file system metadata from the Name Node and read/write
      actual data directly from Data Nodes, which store blocks on multiple local
      file systems (e.g., ext3, zfs, ufs)]
Gateways: Multi-user Land

      • Provided for two main purposes
             – Meaningful development interaction with a compute cluster
                     • High bandwidth, low latency, and few network barriers enable a tight
                       development loop when creating MapReduce jobs
             – Permission and privilege separation
                     • Limit exposure to sensitive data
                          – Hadoop 0.15 and lower lack users, permissions, etc.
                          – Hadoop 0.16 has users and “weak” permissions
      • Characteristics
             – “Replacement” Lab machine
             – World-writable local disk space
                     • Any single-threaded processing
                     • Code debugging




Compute Cluster and Name Nodes

      • Compute Cluster Node
             – Users cannot log in!
             – All nodes run both MapReduce and HDFS
               frameworks
             – Usually 500 to 2000 machines
             – Each cluster kept relatively homogeneous
             – Hardware configuration
                     • 2 sockets (2- or 4-core)
                     • 4 x 500-750GB disks
                     • 6-8GB RAM


      • Name Node
             – 16GB RAM (heap setting sketched below)
                     • 14GB Java heap ≈ 18-20 million files
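
      (Not from the original slides: in Hadoop of this era the daemon heap comes
      from conf/hadoop-env.sh, where the value is in megabytes and applies to every
      daemon started from that config tree, so a name node of this size typically
      gets its own config. A minimal sketch:)

          # conf/hadoop-env.sh on the name node: 14GB of heap buys
          # roughly 18-20 million files' worth of metadata.
          # HADOOP_HEAPSIZE applies to all daemons launched from this
          # config, so keep a separate tree for the name node.
          export HADOOP_HEAPSIZE=14000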


Queuing and Scheduling

      • Hadoop does not have an advanced scheduling
        system
             – MapReduce JobTracker manages one or more jobs
               running within a set of machines
             – Works well for “dedicated” applications, but does not
               work so well for shared resources
      • Grid Services are intended to be a shared multi-user,
        multi-application environment
             – Need to combine Hadoop with an external queuing and
               scheduling system...




Hadoop On Demand (HoD)

      • Wrapper around PBS commands
             – We use freely available Torque and Maui
      • Big win: virtual private JobTracker clusters
             – Job isolation
             – Users create clusters of the size they need
             – Submit jobs to their private JT
      • Big costs:
             –   Lose data locality
             –   Increased complexity
             –   Lose a node for private JobTracker
             –   Single reducer doesn’t free unused nodes
                     • ~ 30% efficiency lost!
      • Looking at changing Hadoop scheduling
             – Task scheduling flexibility combined with node elasticity
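
      (Not on the slide, but a typical HoD session from a gateway looked roughly
      like this; option names follow the 0.16/0.17-era HoD client and may differ
      by version:)

          # Allocate a 5-node private cluster: HoD submits a Torque job,
          # brings up a JobTracker on one of the nodes, and writes the
          # client-side configuration into the cluster directory.
          hod allocate -d ~/hod/mycluster -n 5

          # Point the stock Hadoop client at the private JobTracker.
          hadoop --config ~/hod/mycluster jar my-job.jar ...

          # Return the nodes; holding them through a long single-reducer
          # tail is where the ~30% efficiency loss above comes from.
          hod deallocate -d ~/hod/mycluster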

HoD Job Scheduling

      [Diagram: three users sharing a 12-node cluster; each HoD allocation dedicates
      one node to a private JobTracker plus N task nodes for the lifetime of the
      whole job, so a later, larger request can queue even while allocated nodes
      sit idle]
The Reality of a 1000 Node Grid

      [Chart: the 12-node example re-scaled to 1000 nodes; user jobs run on hundreds
      of nodes, and anywhere from 10-20% of nodes are down at any given time]
Putting It All Together

      [Diagram: users log into gateways; HoD contacts the resource manager on their
      behalf to stand up a private JobTracker and TaskTrackers; jobs are then
      submitted directly to the virtual cluster, with HDFS reads/writes going
      through the Name Node and Data Nodes]
Network Configuration

      [Diagram: each rack of forty hosts has a dedicated switch; each rack switch
      has two gigabit uplinks to each of four cores, so any host is at most one
      hop from any other]
Yahoo!’s Next Generation Grid Infrastructure
A Work In Progress
Background Information

      • Internal deployments
             – Mostly Yahoo! proprietary technologies
      • M45
             – Educational outreach grid
             – Non-Yahoo users on Yahoo! resources
                     • Legal required us not to use any Y! technology!
      • Decision made to start from scratch!
             – Hard to share best practices
             – Potential legal issues
             – Don’t want to support two ways to do the same operation
      • Internal grids converting to be as completely OSS as possible
             – Custom glue code to deal with any Y!<-->OSS incompatibilities
                     • user and group data


Naming and Provisioning Services

      • Naming services
             – Kerberos for secure authentication
             – DNS for host resolution
             – LDAP for everything else
      • ISC DHCP
             – Reads table information from LDAP (sketched after this list)
             – In pairs for redundancy
      • Kickstart
             – We run RHEL 5.x
             – base image + bcfg2
      • bcfg2
             – host customization
             – centralized configuration management
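
      (A sketch of what reading DHCP tables from LDAP can look like, assuming a
      dhcpd built with the common LDAP patch; the directives come from that patch
      and the server/DN values are illustrative:)

          # dhcpd.conf: pull host and subnet declarations from the
          # directory rather than a flat file; "dynamic" re-queries
          # LDAP as requests arrive, so edits take effect immediately.
          ldap-server   "ldap1.grid.example.com";
          ldap-port     389;
          ldap-base-dn  "ou=dhcp,dc=example,dc=com";
          ldap-method   dynamic;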


NFS for Multi-user Support

      • NFS
             – Home Directories
             – Project Directories
                     • Group shared data


      • Grids with service level agreements (SLAs) shouldn’t use NFS!
             – Single point of failure
                     • HA-NFS == $$$
             – Performance
             – Real data should be in HDFS




Big Picture

      [Diagram: top-level DNS, Kerberos, and LDAP replicated to a second data center;
      read-only slaves feed a horizontally scalable farm of configuration and
      provisioning servers; an NFS layer shares home directories across grids with
      per-grid project directories]
Self Healing/Reporting

      • Torque
             – Use the Torque node health mechanism to disable/fix ‘sick’ nodes
               (configuration sketched after this list)
                     • Great reduction in the number of support issues
                     • Address problems in bulk


      • Nagios
             – Usual stuff
             – Custom hooks into Torque


      • Simon
             – Yahoo!’s distributed cluster and application monitoring tools
             – Similar to Ganglia
             – On the roadmap to be open sourced
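
      (A sketch of the Torque side of this; $node_check_script and
      $node_check_interval are Torque pbs_mom directives, while the health script
      path is a hypothetical placeholder:)

          # mom_priv/config on each compute node. pbs_mom runs the script
          # periodically (the interval is counted in MOM polling cycles);
          # if its output starts with "ERROR", the node is marked down and
          # kept out of scheduling until the check passes again.
          $node_check_script   /usr/local/bin/grid-node-health
          $node_check_interval 10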


Node Usage Report

      [Chart: available nodes over time; drops to zero nodes are upgrade downtimes,
      the largest being the 0.16 upgrade]
Ranges and Groups

      • Range: group of hosts
             – example: @GRID == all grid hosts
             – custom tools to manipulate hosts based upon ranges:
                     • ssh -r @GRID uptime
                          – Report uptime on all of the hosts in @GRID


      • Netgroup
             – Used to implement ranges (sketched after this list)
             – The most underrated naming service switch ever?
                     • Cascaded!
                     • Scalable!
                     • Supported in lots of useful places!
                           – PAM (e.g., pam_succeed_if on Linux)
                          – NFS
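
      (A minimal sketch of ranges backed by netgroups; @GRID comes from the slide,
      while the file contents and hostnames are illustrative:)

          # /etc/netgroup (or equivalent nisNetgroup entries in LDAP).
          # Netgroups cascade, so GRID is composed from per-cluster
          # groups; triples are (host,user,domain).
          grid-cluster1   (node1001.example.com,,) (node1002.example.com,,)
          grid-cluster2   (node2001.example.com,,) (node2002.example.com,,)
          GRID            grid-cluster1 grid-cluster2

          # /etc/nsswitch.conf: serve netgroups from LDAP so every host
          # sees the same ranges.
          netgroup: files ldap

          # Expansion is then an ordinary naming-service lookup, which
          # tools like the ssh -r wrapper above can build on:
          getent netgroup GRID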



Some Links

      • Apache Hadoop
             – http://hadoop.apache.org/


      • Yahoo! Hadoop Blog
             – http://developer.yahoo.com/blogs/hadoop/


      • M45 Press Release
             – http://research.yahoo.com/node/1879


      • Hadoop Summit and DISC Slides
             – http://research.yahoo.com/node/2104






Editor’s Notes

  1. You don’t log into thousands of nodes to run a job, so then you install clustering software, but that isn’t that interesting unless you have data to go with it. Now that you have a lot of data, you need some place to interact with that data and you need to have the tools and know-how to actually do your work.
  2. three things that a mapreduce developer needs to provide: two functions and the set of inputs. inputs are generally large--think terabytes. reducer functions will generally produce multiple output files. all of this is happening in parallel. think of cat * | grep | sort | uniq, but running over tens, hundreds, or thousands of machines at once.
  3. so in hadoop land, the combination of the map function, the reduce function, and the list of inputs constitutes ‘a job’. this job is submitted by our user, who we’ll call jerry, to this entity called the job tracker. the job tracker is typically configured so that it is on a dedicated machine in larger clusters or on a machine that might be shared with other hadoop framework dedicated processes on smaller clusters. anyway, the job tracker takes our job, splits it up into tasks, and hands them to processes called task trackers. task trackers live on every ‘worker’ node in the cluster. these task trackers basically watch the individual tasks and report back to the job tracker the status of the tasks. if a task fails, the job tracker will try to reschedule that task.
  4. Hadoop also includes a distributed file system. some key points here. clients, like the process being run by our user jerry, pull metadata from a process called the name node and the actual data from a process called the data node. The data node reads the data from local storage, typically stored on multiple “real” file systems on the host machine, such as ext3, zfs, or ufs.
  5. before we get into what we’ve done to turn hadoop into a grid service, a bit of a warning. in some cases, the configurations and solutions we’re using are meant to scale to really large numbers of jobs and/or nodes and/or users. for your own installation, a lot of this just might be overkill. so don’t take this as “the” way to use hadoop. instead, view this as “a” way. and, in fact, you’ll discover that amongst the existing hadoop community, we’re on the extreme end of just about everything.
  6. so given the previous general outline of hadoop, how do yahoo!’s engineers access grid services? as i mentioned earlier you can’t log into thousands of machines. instead, we provide a landing zone of sorts that users log into to launch their jobs and do other grid-related work. we call these machines gateways. gateway nodes are geared towards solving two problems: reduce development time by providing fast access to the grid and to provide some security, especially against accidental data leaks. currently, hadoop doesn’t have a very strong security model, so it takes some extra work outside of the hadoop configuration to make it somewhat protected from intentional or even accidental exposure.
  7. so if the users log into the gateways to launch jobs, where do the jobs actually run? we have similarly configured machines that we call the compute cluster. the compute cluster nodes and the name node associated with that cluster are provisioned such that they are both HDFS and MR machines. We try to keep the machines relatively homogeneous to cut down on the amount of custom configuration. so while i list the specs for the hardware that we use, note that on a given compute cluster, they will all generally have the exact same configuration. the one big exception is the name node. since more memory means more and larger files, we make it a bit beefier in order to have a bigger HDFS.
  8. so now we know where a user does the work, and the configuration of the nodes that do that work, but how does the user submit a job to the system? because of the number of users and the types of jobs they submit, we ran into a bit of a problem. hadoop’s built-in scheduler just isn’t that advanced. there is no guarantee of job isolation or service levels or any of those other tools that admins like to utilize. we clearly had to do something in the interim to get us over this bump in the road. the obvious and quick answer was to look at what the HPC people were doing.
  9. Enter Hadoop on Demand or HoD. It uses a set of applications called torque and maui as the basis of its system. by using HoD in combination with torque and maui we get those important features like job isolation, but at a pretty big cost. We lost some significant functionality, gained quite a bit of complexity, and even worse, lost a significant amount of efficiency due to nodes that weren’t able to be utilized for other work even though they were idle. to that end, the hadoop development community is looking at what can be done to provide this sort of functionality into the hadoop scheduling system. so if you are interested in those sorts of problems, please help contribute. :) [If it gets asked: why not condor? Had to choose something, and torque was simpler, with a fairly centralized administration system and widespread adoption in the HPC community.]
  10. to give you an idea of how the system works and some of the problems associated with it, let’s take a look at three users using a 12 node cluster. our first user sue requests 4 nodes. 1 node goes to run her private job tracker. the remaining three nodes are used to run her tasks. so, sue has her job up and running and then jerry comes along. he needs a bit more computing power, so he requests 5 nodes. 1 node for the job tracker and 4 nodes for the task trackers. at this point, we only have two nodes free, since one node is being used for the name node. now david comes along. he’s a bit of a power user, so he needs 6 nodes. one node for his job tracker and 5 nodes for his task trackers. but since there are only two nodes completely free, he has to wait. if sue and jerry’s jobs are using only a single reducer or a single mapper, those other nodes might actually be free. but since we dedicate every machine for the lifetime of the entire job, david may be waiting for resources that are actually free.
  11. our grids typically have more than 12 nodes and user jobs are usually measured in the hundreds of nodes. anywhere from 10-20% of nodes are down at any given time. if we re-scale the picture for 1000 nodes, the picture looks more like this.
  12. So, let’s put the whole system together. our users log into gateway machines. They use hod to establish private jobtrackers for themselves. hod contacts the resource manager on behalf of the user to determine which nodes will comprise the private job tracker and the task trackers. once the private job tracker is established, Jerry will submit jobs directly to his virtual cluster. additionally, he may read/write files from the data nodes either from his job or from the gateway, which, for simplicity’s sake, is designated as a connection to the name node.
  13. When scaling for this many machines, the network layout is extremely important. there can be a significant amount of network traffic when dealing with large scale jobs. In our configuration, each rack of forty hosts has a dedicated switch. this switch then has 2 gigabit connections to each of four cores. as you can see, we are always but one hop from any given host to any other host.
  14. The big message here is that our team was tasked to support two very different environments. given that we’re probably one of the most, if not the most, hard core of the open source advocates inside yahoo, the decision to dump all of our internal tools was an easy one. we want to be part of the larger open source community. this means running, using, and more importantly, contributing back the same things that all of you do.
  15. We had some requirements given to us by the yahoo! side of the hadoop development community as to what they wanted in place for future work. So, putting all of our requirements to support a pure multi-user environment together, this is what we came up with. We’re using LDAP and DNS for things that people normally use them for with one big exception. We put in place Kerberos for our password management such that our ldap system has no user passwords in it. Using Kerberos allows us to not only support fancy things like single sign-on without using SSH keys, it also allowed us to take advantage of Active Directory being used for some of the corporate IT functions. our kickstart image installs a base OS, adds a couple of tweaks, and then installs bcfg2 to do the real work. bcfg2 allows us to centralize configuration management so that we’re not doing mass changes using trusted ssh.
  16. now, it might surprise you that we’re using NFS at all. keep in mind that we do have actual users. when you have actual users, this also means you need to supply them home directories and project directories for groups. for our research grids, having this dependency upon nfs is fine. for grids that do “real work” where SLAs are at stake, NFS should be avoided as much as possible.
  17. Let’s put this all together. We duplicate the top level DNS, Kerberos, and LDAP services to another data center for redundancy purposes and to provide access to these services in the other data center. From our local copy, we create a set of slaves that are read only. DNS is configured to be caching on each node, so there is no need to duplicate that particular service in the same way. From there, we have a cloud of servers that provide configuration and provisioning management, with almost all of the boot data coming from ldap or from a source code control system that isn’t pictured. Given that the above services are horizontally scalable, we can pretty much support as many grids as we want. we just throw more servers into the farms. and, just to top it all off, we have our NFS layer. we generally share the home directory amongst all grids, but we have dedicated project directories per grid. depending upon the data center, the NFS layer may or may not be the same server.
  18. i mentioned hod using torque earlier. one of the features that we rely upon heavily is the fact that torque has a health check function that will automatically enable or disable a host from being used by the scheduler. This has been invaluable for us to cut down on the number of service tickets we get from users. if a node is sick, we don’t want people to use it. We also use Nagios like a large chunk of the rest of the planet. We did write some custom bits to hook into our torque health as another means to report how sick a given grid is. Something completely custom we’re using is called Simon. Simon started life as part of the search engine’s suite of monitoring tools. unfortunately, i can’t talk much about it, but i can say that it is similar in concept to ganglia, but we like to believe it is better. simon is on the roadmap to get open sourced at some point in the mysterious future.
  19. breaks that drop to 0 nodes are downtimes == upgrades. the bigger downtimes are from the 0.16 upgrade, due to the number of issues we encountered.
  20. we also manipulate machines in terms of groups. we have some custom tools written that allow us to submit commands and copy files to batches of machines at once. netgroup is likely the most underrated and underappreciated tool that system administrators have. what if you could cascade groups? and store hostnames in them? this is exactly what netgroup gives you the capability to do. and it is built into pretty much every UNIX in existence. if you have a large set of machines, and aren’t using netgroup, you have probably built something custom that does the exact same thing. in our case, we’re using the netgroup functionality to provide the backend for our group utilities. i’m hoping at some point we’ll be able to open source them as well, but we’re not quite there yet.