 Introduction to Distributed Programming
› Background of Hadoop
› What is Hadoop?
› How does Hadoop work?
 Installing Hadoop
› Setting up SSH
› Setting up Environment Variables
› Running Hadoop
› Web-Based Cluster
 Components of Hadoop
› Working with Hadoop File-System
› Understanding Hadoop Map-Reduce
› Reading and Writing
 Writing a Basic Map-Reduce Program
› Getting the Patent Data Set
› Constructing a Basic Map-Reduce Program
› Working with Hadoop Streaming
› Improving Performance with Combiners
 Advanced Map-Reduce
› Summarization Patterns
› Filtering Patterns
› Data Organization Patterns
› Join Patterns
› Meta Patterns
› Input and Output Patterns
 Programming Practices
› Developing Map-Reduce Programs
› Monitoring and Debugging on a cluster
› Tuning for performance
 Hadoop Cookbook
› Passing Job-Specific Parameters to your tasks
› Probing for Task-Specific Parameters
› Partitioning into multiple output files
› Reading Input from and Writing Output to a Database
› Keeping Output in Sorted Order
 Managing Hadoop
› Checking System’s Health
› Setting permissions
› Managing Quotas, Enabling Trash, Adding/Deleting Nodes, Recovering from a Failed NameNode
 Running Hadoop in the Cloud
› Introducing Amazon Web Services
› Setting up AWS and Setting up cloud on EC2
› Running Map-Reduce Programs on EC2
› Cleaning up and Shutting down your EC2 Instances
› Amazon Elastic Map-Reduce and other AWS Services
 Programming with Pig
› Thinking like a pig
› Installing Pig
› Running Pig
› Learning Pig Latin through Grunt
› Pig Latin Syntax
› Working with UDFs
› Working with Scripts
 Getting Started on Hive
 Data Types and File Formats
 HiveQL – Data Definition
 HiveQL – Data Manipulation
 HiveQL – Queries, Views and Indexes
 Schema Design, Tuning & Record Formats
 Hive Integration with Oozie
 Hive and Amazon Web Services
 NoSQL Database
› Why NoSQL?
› Aggregate Data Models
› Distribution Models
› Consistency
 NoSQL Databases
› Key-Value Databases
› Document Databases
› Column Family Stores
› Graph Databases
 MongoDB
› Introduction
› MongoDB through JavaScript Shell
› Writing Programs using MongoDB
› Document Oriented Data
› Queries and Aggregation
› Updates, Atomic Operations and Deletes
› Indexing, Replication and Sharding
 Mahout – Machine Learning
› Introduction
› Recommenders
 Representing Recommender Data
 Making Recommendations
› Clustering
 Clustering Algorithms in Mahout
› Classification
 Training a Classifier
 Evaluating and Tuning a Classifier
 Moving Data in and out of Hadoop
› Flume
› Oozie
› Sqoop
› HBase
 Data Serialization Formats
› XML, JSON
› SequenceFiles, Protocol Buffers, Thrift and Avro
 Utilizing Data Structures and Algorithms
› Modelling Data & Solving Problems with Graphs
› Parallelized Bloom Filter Creation in Map-Reduce
 Programming Pipelines with Pig
› Using Pig to Find Malicious Actors in Log Data
› Optimizing User Workflow with Pig
 Crunch
 Cascading
 Puppet
 Unit Testing Map-Reduce
 Heavyweight Job Testing using LocalJobRunner
 Debugging User-Space Problems

Hadoop course curriculum
