SlideShare una empresa de Scribd logo
1 de 1
Descargar para leer sin conexión
FURTHERMORE:
Big Data Hadoop Certification Training
BIG DATA HADOOP
CHEAT SHEETBig Data
Comprises of large datasets that cannot be processed using traditional computing
techniques, which includes huge volumes, high velocity and extensible variety of data.
Hadoop
An Apache open source framework written in JAVA which allows distributed processing
of large datasets across clusters of computers using simple programming models.
Hadoop Common
These are the JAVA libraries and utilities required by other Hadoop modules which
contains the necessary scripts and files required to start Hadoop
Hadoop YARN
A framework used for job scheduling and managing the cluster resources
Hadoop Distributed File System
A Java based file system that provides scalable and reliable data storage and it provides
high throughput access to the application data
Hadoop MapReduce
A software framework, which is used for writing the applications easily which process
big amount of data in parallel on large clusters
Apache hive
An infrastructure for data warehousing for Hadoop
Apache oozie
An application in Java responsible for scheduling Hadoop jobs
Apache Pig
A data flow platform that is responsible for the execution of the MapReduce jobs
Apache Spark
An open source framework used for cluster computing
Flume
An open source aggregation service responsible for collection and transport of data
from source to destination
Hbase
A column-oriented database of Hadoop that stores big data in a scalable way
Sqoop
An interface application that is used to transfer data between Hadoop and relational
database through commands
Hadoop Distributed File System
Map-Reduce
HBASE
Database with Real-
Time Access
PIG Hive Sqoop
ETL BI Reporting RDBMSTop Level
Interfaces
Top Level
Abstractions
Distributed
Data
Processing
At the base is a Self-
healing clustered
storage system
Different components of Hadoop Architecture involves:
Hadoop File Automation Commands
Commands Task Syntax
cat
Used to copy the source path to the
destination or the standard output
hdfsdfs –cat URI [URI- – -]
chgrp Used to change the group of the files hdfsdfs –chgrp [-R] GROUP URI [URI—]
chmod Used to change the permissions of the file
hdfsdfs –chmod [-R] <MODE[,MODE]- – -:
OCTALMODE> URI [URI – – -]
chown Used to change the owner of the file
hdfsdfs –chown [-
R][OWNER][:{GROUP]]URI[URI]
count Used to count the number of directories hdfs dfs –count [-q] <paths>
cp
Used to copy one or more than one files from
the source to destination path
hdfsdfs –cp URI[URI – – -]<dest>
Du Used to display the size of directories or files hdfsdfs –du [-s][-h]URI [URI – – -]
get Used to copy files to the local file system hdfs dfs –get[-ignorecrc][-crc]<src><localdst>
ls
Used to display the statistics of any file or
directory
hdfsdfs –ls <args>
mkdir Used to create one or more directories hdfsdfs –mkdir<path>
mv
Used to move one or more files from one
location to other
hdfs dfs –mv URI[URI – – -]<dest>
put Used to read from one file system to other hdfsdfs –put<localsrc>- – -<dest>
rm Used to delete one or more than one files hdfsdfs –rmr[-skipTrash]URI[URI- – – ]
stat
Used to display the information of any specific
path
hdfsdfs –stat URI[URI – – -]
help
Used to display the usage information of the
command
help<cmd-name>
Commands Task
Balancer To run cluster balancing utility
daemonlog To get or set the log level of each daemon
dfsadmin To run many HDFS administrative operations
Datanode To run HDFS datanode service
mradmin
To run a number of mapReduce administrative
operations
Jobtracker To run mapReduce job tracker
Namenode To run name node
Tasktracker To run mapReduce task tracker node
Secondary namenode To run secondary namenode
Hadoop Administration Commands
• Learn from industry experts and be sought-after by the industry!
• Learn any technology, show exemplary skills and have an
unmatched career!
• The most trending technology courses to help you fast-track your
career!
• Logical modules for both beginners and mid-level learners
• All recorded sessions available in LMS for lifetime
• 24*7 Support for Lifetime
• Learn Anytime, Anywhere

Más contenido relacionado

La actualidad más candente

Hive Tutorial | Hive Architecture | Hive Tutorial For Beginners | Hive In Had...
Hive Tutorial | Hive Architecture | Hive Tutorial For Beginners | Hive In Had...Hive Tutorial | Hive Architecture | Hive Tutorial For Beginners | Hive In Had...
Hive Tutorial | Hive Architecture | Hive Tutorial For Beginners | Hive In Had...
Simplilearn
 
Hive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep DiveHive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep Dive
DataWorks Summit
 

La actualidad más candente (20)

TechEvent Databricks on Azure
TechEvent Databricks on AzureTechEvent Databricks on Azure
TechEvent Databricks on Azure
 
Introduction to Spark Internals
Introduction to Spark InternalsIntroduction to Spark Internals
Introduction to Spark Internals
 
Partner Connect APAC - 2022 - April
Partner Connect APAC - 2022 - AprilPartner Connect APAC - 2022 - April
Partner Connect APAC - 2022 - April
 
Intro to Delta Lake
Intro to Delta LakeIntro to Delta Lake
Intro to Delta Lake
 
A Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and HudiA Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and Hudi
 
Spark rdd vs data frame vs dataset
Spark rdd vs data frame vs datasetSpark rdd vs data frame vs dataset
Spark rdd vs data frame vs dataset
 
Spark + Cassandra = Real Time Analytics on Operational Data
Spark + Cassandra = Real Time Analytics on Operational DataSpark + Cassandra = Real Time Analytics on Operational Data
Spark + Cassandra = Real Time Analytics on Operational Data
 
Databricks Platform.pptx
Databricks Platform.pptxDatabricks Platform.pptx
Databricks Platform.pptx
 
Speeding Time to Insight with a Modern ELT Approach
Speeding Time to Insight with a Modern ELT ApproachSpeeding Time to Insight with a Modern ELT Approach
Speeding Time to Insight with a Modern ELT Approach
 
The delta architecture
The delta architectureThe delta architecture
The delta architecture
 
PySpark in practice slides
PySpark in practice slidesPySpark in practice slides
PySpark in practice slides
 
Migrating ETL Workflow to Apache Spark at Scale in Pinterest
Migrating ETL Workflow to Apache Spark at Scale in PinterestMigrating ETL Workflow to Apache Spark at Scale in Pinterest
Migrating ETL Workflow to Apache Spark at Scale in Pinterest
 
Spark (Structured) Streaming vs. Kafka Streams
Spark (Structured) Streaming vs. Kafka StreamsSpark (Structured) Streaming vs. Kafka Streams
Spark (Structured) Streaming vs. Kafka Streams
 
Stl meetup cloudera platform - january 2020
Stl meetup   cloudera platform  - january 2020Stl meetup   cloudera platform  - january 2020
Stl meetup cloudera platform - january 2020
 
Azure DataBricks for Data Engineering by Eugene Polonichko
Azure DataBricks for Data Engineering by Eugene PolonichkoAzure DataBricks for Data Engineering by Eugene Polonichko
Azure DataBricks for Data Engineering by Eugene Polonichko
 
Databricks Delta Lake and Its Benefits
Databricks Delta Lake and Its BenefitsDatabricks Delta Lake and Its Benefits
Databricks Delta Lake and Its Benefits
 
Hive Tutorial | Hive Architecture | Hive Tutorial For Beginners | Hive In Had...
Hive Tutorial | Hive Architecture | Hive Tutorial For Beginners | Hive In Had...Hive Tutorial | Hive Architecture | Hive Tutorial For Beginners | Hive In Had...
Hive Tutorial | Hive Architecture | Hive Tutorial For Beginners | Hive In Had...
 
Making Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta LakeMaking Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta Lake
 
Apache Spark Architecture
Apache Spark ArchitectureApache Spark Architecture
Apache Spark Architecture
 
Hive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep DiveHive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep Dive
 

Similar a Big data-cheat-sheet

HADOOP AND MAPREDUCE ARCHITECTURE-Unit-5.ppt
HADOOP AND MAPREDUCE ARCHITECTURE-Unit-5.pptHADOOP AND MAPREDUCE ARCHITECTURE-Unit-5.ppt
HADOOP AND MAPREDUCE ARCHITECTURE-Unit-5.ppt
ManiMaran230751
 

Similar a Big data-cheat-sheet (20)

MapReduce1.pptx
MapReduce1.pptxMapReduce1.pptx
MapReduce1.pptx
 
BIG DATA: Apache Hadoop
BIG DATA: Apache HadoopBIG DATA: Apache Hadoop
BIG DATA: Apache Hadoop
 
Hadoop_arunam_ppt
Hadoop_arunam_pptHadoop_arunam_ppt
Hadoop_arunam_ppt
 
Topic 9a-Hadoop Storage- HDFS.pptx
Topic 9a-Hadoop Storage- HDFS.pptxTopic 9a-Hadoop Storage- HDFS.pptx
Topic 9a-Hadoop Storage- HDFS.pptx
 
Data analysis on hadoop
Data analysis on hadoopData analysis on hadoop
Data analysis on hadoop
 
Hadoop vs Apache Spark
Hadoop vs Apache SparkHadoop vs Apache Spark
Hadoop vs Apache Spark
 
Managing Big data with Hadoop
Managing Big data with HadoopManaging Big data with Hadoop
Managing Big data with Hadoop
 
Hadoop introduction
Hadoop introductionHadoop introduction
Hadoop introduction
 
Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component
 
Unit 1
Unit 1Unit 1
Unit 1
 
HADOOP AND MAPREDUCE ARCHITECTURE-Unit-5.ppt
HADOOP AND MAPREDUCE ARCHITECTURE-Unit-5.pptHADOOP AND MAPREDUCE ARCHITECTURE-Unit-5.ppt
HADOOP AND MAPREDUCE ARCHITECTURE-Unit-5.ppt
 
Hadoop, Map Reduce and Apache Pig tutorial
Hadoop, Map Reduce and Apache Pig tutorialHadoop, Map Reduce and Apache Pig tutorial
Hadoop, Map Reduce and Apache Pig tutorial
 
Bd class 2 complete
Bd class 2 completeBd class 2 complete
Bd class 2 complete
 
Design and Research of Hadoop Distributed Cluster Based on Raspberry
Design and Research of Hadoop Distributed Cluster Based on RaspberryDesign and Research of Hadoop Distributed Cluster Based on Raspberry
Design and Research of Hadoop Distributed Cluster Based on Raspberry
 
Big Data and Hadoop Guide
Big Data and Hadoop GuideBig Data and Hadoop Guide
Big Data and Hadoop Guide
 
Introduction to HDFS and MapReduce
Introduction to HDFS and MapReduceIntroduction to HDFS and MapReduce
Introduction to HDFS and MapReduce
 
Hadoop ecosystem; J.Ayeesha parveen 2 nd M.sc., computer science Bon Secours...
Hadoop ecosystem; J.Ayeesha parveen 2 nd M.sc., computer science  Bon Secours...Hadoop ecosystem; J.Ayeesha parveen 2 nd M.sc., computer science  Bon Secours...
Hadoop ecosystem; J.Ayeesha parveen 2 nd M.sc., computer science Bon Secours...
 
Hadoop
HadoopHadoop
Hadoop
 
Introduction to HDFS and MapReduce
Introduction to HDFS and MapReduceIntroduction to HDFS and MapReduce
Introduction to HDFS and MapReduce
 
Cppt Hadoop
Cppt HadoopCppt Hadoop
Cppt Hadoop
 

Último

pdfcoffee.com_business-ethics-q3m7-pdf-free.pdf
pdfcoffee.com_business-ethics-q3m7-pdf-free.pdfpdfcoffee.com_business-ethics-q3m7-pdf-free.pdf
pdfcoffee.com_business-ethics-q3m7-pdf-free.pdf
JOHNBEBONYAP1
 
Call Girls in Prashant Vihar, Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Prashant Vihar, Delhi 💯 Call Us 🔝9953056974 🔝 Escort ServiceCall Girls in Prashant Vihar, Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Prashant Vihar, Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
在线制作约克大学毕业证(yu毕业证)在读证明认证可查
在线制作约克大学毕业证(yu毕业证)在读证明认证可查在线制作约克大学毕业证(yu毕业证)在读证明认证可查
在线制作约克大学毕业证(yu毕业证)在读证明认证可查
ydyuyu
 
Low Sexy Call Girls In Mohali 9053900678 🥵Have Save And Good Place 🥵
Low Sexy Call Girls In Mohali 9053900678 🥵Have Save And Good Place 🥵Low Sexy Call Girls In Mohali 9053900678 🥵Have Save And Good Place 🥵
Low Sexy Call Girls In Mohali 9053900678 🥵Have Save And Good Place 🥵
Chandigarh Call girls 9053900678 Call girls in Chandigarh
 

Último (20)

pdfcoffee.com_business-ethics-q3m7-pdf-free.pdf
pdfcoffee.com_business-ethics-q3m7-pdf-free.pdfpdfcoffee.com_business-ethics-q3m7-pdf-free.pdf
pdfcoffee.com_business-ethics-q3m7-pdf-free.pdf
 
VIP Model Call Girls NIBM ( Pune ) Call ON 8005736733 Starting From 5K to 25K...
VIP Model Call Girls NIBM ( Pune ) Call ON 8005736733 Starting From 5K to 25K...VIP Model Call Girls NIBM ( Pune ) Call ON 8005736733 Starting From 5K to 25K...
VIP Model Call Girls NIBM ( Pune ) Call ON 8005736733 Starting From 5K to 25K...
 
Ganeshkhind ! Call Girls Pune - 450+ Call Girl Cash Payment 8005736733 Neha T...
Ganeshkhind ! Call Girls Pune - 450+ Call Girl Cash Payment 8005736733 Neha T...Ganeshkhind ! Call Girls Pune - 450+ Call Girl Cash Payment 8005736733 Neha T...
Ganeshkhind ! Call Girls Pune - 450+ Call Girl Cash Payment 8005736733 Neha T...
 
Pune Airport ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready...
Pune Airport ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready...Pune Airport ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready...
Pune Airport ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready...
 
Call Girls in Prashant Vihar, Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Prashant Vihar, Delhi 💯 Call Us 🔝9953056974 🔝 Escort ServiceCall Girls in Prashant Vihar, Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Prashant Vihar, Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
 
Story Board.pptxrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr
Story Board.pptxrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrStory Board.pptxrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr
Story Board.pptxrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr
 
在线制作约克大学毕业证(yu毕业证)在读证明认证可查
在线制作约克大学毕业证(yu毕业证)在读证明认证可查在线制作约克大学毕业证(yu毕业证)在读证明认证可查
在线制作约克大学毕业证(yu毕业证)在读证明认证可查
 
Nanded City ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready ...
Nanded City ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready ...Nanded City ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready ...
Nanded City ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready ...
 
Wadgaon Sheri $ Call Girls Pune 10k @ I'm VIP Independent Escorts Girls 80057...
Wadgaon Sheri $ Call Girls Pune 10k @ I'm VIP Independent Escorts Girls 80057...Wadgaon Sheri $ Call Girls Pune 10k @ I'm VIP Independent Escorts Girls 80057...
Wadgaon Sheri $ Call Girls Pune 10k @ I'm VIP Independent Escorts Girls 80057...
 
2nd Solid Symposium: Solid Pods vs Personal Knowledge Graphs
2nd Solid Symposium: Solid Pods vs Personal Knowledge Graphs2nd Solid Symposium: Solid Pods vs Personal Knowledge Graphs
2nd Solid Symposium: Solid Pods vs Personal Knowledge Graphs
 
Microsoft Azure Arc Customer Deck Microsoft
Microsoft Azure Arc Customer Deck MicrosoftMicrosoft Azure Arc Customer Deck Microsoft
Microsoft Azure Arc Customer Deck Microsoft
 
20240510 QFM016 Irresponsible AI Reading List April 2024.pdf
20240510 QFM016 Irresponsible AI Reading List April 2024.pdf20240510 QFM016 Irresponsible AI Reading List April 2024.pdf
20240510 QFM016 Irresponsible AI Reading List April 2024.pdf
 
Trump Diapers Over Dems t shirts Sweatshirt
Trump Diapers Over Dems t shirts SweatshirtTrump Diapers Over Dems t shirts Sweatshirt
Trump Diapers Over Dems t shirts Sweatshirt
 
Low Sexy Call Girls In Mohali 9053900678 🥵Have Save And Good Place 🥵
Low Sexy Call Girls In Mohali 9053900678 🥵Have Save And Good Place 🥵Low Sexy Call Girls In Mohali 9053900678 🥵Have Save And Good Place 🥵
Low Sexy Call Girls In Mohali 9053900678 🥵Have Save And Good Place 🥵
 
APNIC Policy Roundup, presented by Sunny Chendi at the 5th ICANN APAC-TWNIC E...
APNIC Policy Roundup, presented by Sunny Chendi at the 5th ICANN APAC-TWNIC E...APNIC Policy Roundup, presented by Sunny Chendi at the 5th ICANN APAC-TWNIC E...
APNIC Policy Roundup, presented by Sunny Chendi at the 5th ICANN APAC-TWNIC E...
 
Dubai=Desi Dubai Call Girls O525547819 Outdoor Call Girls Dubai
Dubai=Desi Dubai Call Girls O525547819 Outdoor Call Girls DubaiDubai=Desi Dubai Call Girls O525547819 Outdoor Call Girls Dubai
Dubai=Desi Dubai Call Girls O525547819 Outdoor Call Girls Dubai
 
Sarola * Female Escorts Service in Pune | 8005736733 Independent Escorts & Da...
Sarola * Female Escorts Service in Pune | 8005736733 Independent Escorts & Da...Sarola * Female Escorts Service in Pune | 8005736733 Independent Escorts & Da...
Sarola * Female Escorts Service in Pune | 8005736733 Independent Escorts & Da...
 
Hire↠Young Call Girls in Tilak nagar (Delhi) ☎️ 9205541914 ☎️ Independent Esc...
Hire↠Young Call Girls in Tilak nagar (Delhi) ☎️ 9205541914 ☎️ Independent Esc...Hire↠Young Call Girls in Tilak nagar (Delhi) ☎️ 9205541914 ☎️ Independent Esc...
Hire↠Young Call Girls in Tilak nagar (Delhi) ☎️ 9205541914 ☎️ Independent Esc...
 
VIP Model Call Girls Hadapsar ( Pune ) Call ON 9905417584 Starting High Prof...
VIP Model Call Girls Hadapsar ( Pune ) Call ON 9905417584 Starting  High Prof...VIP Model Call Girls Hadapsar ( Pune ) Call ON 9905417584 Starting  High Prof...
VIP Model Call Girls Hadapsar ( Pune ) Call ON 9905417584 Starting High Prof...
 
Real Men Wear Diapers T Shirts sweatshirt
Real Men Wear Diapers T Shirts sweatshirtReal Men Wear Diapers T Shirts sweatshirt
Real Men Wear Diapers T Shirts sweatshirt
 

Big data-cheat-sheet

  • 1. FURTHERMORE: Big Data Hadoop Certification Training BIG DATA HADOOP CHEAT SHEETBig Data Comprises of large datasets that cannot be processed using traditional computing techniques, which includes huge volumes, high velocity and extensible variety of data. Hadoop An Apache open source framework written in JAVA which allows distributed processing of large datasets across clusters of computers using simple programming models. Hadoop Common These are the JAVA libraries and utilities required by other Hadoop modules which contains the necessary scripts and files required to start Hadoop Hadoop YARN A framework used for job scheduling and managing the cluster resources Hadoop Distributed File System A Java based file system that provides scalable and reliable data storage and it provides high throughput access to the application data Hadoop MapReduce A software framework, which is used for writing the applications easily which process big amount of data in parallel on large clusters Apache hive An infrastructure for data warehousing for Hadoop Apache oozie An application in Java responsible for scheduling Hadoop jobs Apache Pig A data flow platform that is responsible for the execution of the MapReduce jobs Apache Spark An open source framework used for cluster computing Flume An open source aggregation service responsible for collection and transport of data from source to destination Hbase A column-oriented database of Hadoop that stores big data in a scalable way Sqoop An interface application that is used to transfer data between Hadoop and relational database through commands Hadoop Distributed File System Map-Reduce HBASE Database with Real- Time Access PIG Hive Sqoop ETL BI Reporting RDBMSTop Level Interfaces Top Level Abstractions Distributed Data Processing At the base is a Self- healing clustered storage system Different components of Hadoop Architecture involves: Hadoop File Automation Commands Commands Task Syntax cat Used to copy the source path to the destination or the standard output hdfsdfs –cat URI [URI- – -] chgrp Used to change the group of the files hdfsdfs –chgrp [-R] GROUP URI [URI—] chmod Used to change the permissions of the file hdfsdfs –chmod [-R] <MODE[,MODE]- – -: OCTALMODE> URI [URI – – -] chown Used to change the owner of the file hdfsdfs –chown [- R][OWNER][:{GROUP]]URI[URI] count Used to count the number of directories hdfs dfs –count [-q] <paths> cp Used to copy one or more than one files from the source to destination path hdfsdfs –cp URI[URI – – -]<dest> Du Used to display the size of directories or files hdfsdfs –du [-s][-h]URI [URI – – -] get Used to copy files to the local file system hdfs dfs –get[-ignorecrc][-crc]<src><localdst> ls Used to display the statistics of any file or directory hdfsdfs –ls <args> mkdir Used to create one or more directories hdfsdfs –mkdir<path> mv Used to move one or more files from one location to other hdfs dfs –mv URI[URI – – -]<dest> put Used to read from one file system to other hdfsdfs –put<localsrc>- – -<dest> rm Used to delete one or more than one files hdfsdfs –rmr[-skipTrash]URI[URI- – – ] stat Used to display the information of any specific path hdfsdfs –stat URI[URI – – -] help Used to display the usage information of the command help<cmd-name> Commands Task Balancer To run cluster balancing utility daemonlog To get or set the log level of each daemon dfsadmin To run many HDFS administrative operations Datanode To run HDFS datanode service mradmin To run a number of mapReduce administrative operations Jobtracker To run mapReduce job tracker Namenode To run name node Tasktracker To run mapReduce task tracker node Secondary namenode To run secondary namenode Hadoop Administration Commands • Learn from industry experts and be sought-after by the industry! • Learn any technology, show exemplary skills and have an unmatched career! • The most trending technology courses to help you fast-track your career! • Logical modules for both beginners and mid-level learners • All recorded sessions available in LMS for lifetime • 24*7 Support for Lifetime • Learn Anytime, Anywhere