Hadoop 101 v2
John Berns
Given at IoT Asia 2014
1.
Hadoop 101: A really quick overview of the concepts…
2.
A few Terabytes of Data...
3.
4.
5.
Text processing--a few hours?
6.
But what if you have more data?
7.
Network Storage--Petabytes!
8.
Network Storage--Petabytes!
9.
What if you need compute power for complex algorithms?
10.
8 cores? 16 cores? 64 cores? 512 GB RAM?
11.
A network of commodity computers
12.
Run jobs on PART of the data on each computer, then AGGREGATE the intermediate results from each computer.
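The split-then-aggregate idea on this slide can be sketched in a few lines of Python. This is only an illustration: the function names, the round-robin split, and the "count ERROR lines" job are assumptions made for the example, and the per-part jobs run one after another here rather than on separate computers.

```python
def split_into_parts(records, num_computers):
    """Deal the records out round-robin so each computer gets one part."""
    return [records[i::num_computers] for i in range(num_computers)]


def run_job_on_part(part):
    """The per-computer job: count the ERROR lines in this part only."""
    return sum(1 for line in part if "ERROR" in line)


def aggregate(partial_results):
    """Combine the intermediate results from every computer."""
    return sum(partial_results)


log_lines = ["INFO start", "ERROR disk", "INFO ok", "ERROR net", "ERROR cpu"]
parts = split_into_parts(log_lines, num_computers=3)
# On a real cluster each part would be processed on a different machine.
partials = [run_job_on_part(part) for part in parts]
total_errors = aggregate(partials)
print(partials, total_errors)
```

Each computer only ever sees its own part; the single number it returns is the intermediate result that gets aggregated.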
13.
Let’s add a computer to manage the process of job delegation, merging the results... and keeping track of the results...
14.
We also need something to keep track of what files are where, so we know where the data that needs to be processed is...
15.
When you have a lot of computers, and even more hard drives, one thing I can guarantee...
16.
Computers will eventually fail.
17.
Computers will eventually fail.
18.
Hard drives will eventually fail.
19.
Hard drives will eventually fail.
20.
Hard drives will eventually fail.
21.
Hard drives will eventually fail.
22.
Even whole racks will fail.
23.
If a computer fails and you only have one copy of your data...
24.
You will be very, very unhappy.
25.
So let’s store multiple copies of the data. Hard drives are CHEAP!
26.
So let’s store multiple copies of the data. Hard drives are CHEAP!
27.
So let’s store multiple copies of the data. Hard drives are CHEAP!
28.
So let’s store multiple copies of the data. Hard drives are CHEAP!
29.
If one hard drive fails... we are still OK
30.
If one computer fails... we are still OK
31.
Even if a whole rack fails... we are still OK
32.
Once we find a failure, let’s have the system re-create the lost copies.
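A rough sketch of the idea in slides 25 to 32: keep several copies of each block on different racks, and when nodes fail, make new copies from the survivors. Everything here is hypothetical (the node and rack names, the `pick_nodes` and `re_replicate` helpers, the "prefer an unused rack" rule); it is a simplification for illustration and not a description of the placement policy Hadoop actually uses.

```python
COPIES = 3  # assumed number of copies per block


def pick_nodes(node_racks, needed, holders):
    """Pick `needed` nodes that do not hold the block yet, unused racks first."""
    holders = list(holders)
    picked = []
    for _ in range(needed):
        used_racks = {node_racks[n] for n in holders}
        candidates = sorted(n for n in node_racks if n not in holders)
        if not candidates:
            break  # not enough nodes left to reach the target copy count
        # False sorts before True, so nodes on racks without a copy win.
        best = min(candidates, key=lambda n: (node_racks[n] in used_racks, n))
        holders.append(best)
        picked.append(best)
    return picked


def re_replicate(block_map, node_racks, failed_nodes, copies=COPIES):
    """After a failure, top every block back up to `copies` live holders."""
    live = {n: r for n, r in node_racks.items() if n not in failed_nodes}
    for block, holders in block_map.items():
        survivors = [n for n in holders if n in live]
        extra = pick_nodes(live, copies - len(survivors), survivors)
        block_map[block] = survivors + extra
    return block_map


node_racks = {"n1": "rackA", "n2": "rackA", "n3": "rackB",
              "n4": "rackB", "n5": "rackC"}
block_map = {"blk_1": pick_nodes(node_racks, COPIES, [])}
initial_holders = list(block_map["blk_1"])

# Simulate losing all of rackA: the block still has two copies, then gets a third.
re_replicate(block_map, node_racks, failed_nodes={"n1", "n2"})
print(initial_holders, block_map["blk_1"])
```

Because the three copies start on three different racks, losing a drive, a computer, or a whole rack leaves at least two copies to recover from.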
33.
Send the compute job to all nodes.
34.
And let it run on its part of the data…
35.
And let it run on its part of the data…
36.
And let it run on its part of the data…
37.
And let it run on its part of the data…
38.
One is stuck…
39.
We have three copies, so we can redistribute the compute
40.
And take the one that finishes fastest
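The "run it on more than one copy and take the fastest" idea can be sketched with Python threads. This is a toy model under stated assumptions: `run_task`, `run_speculatively`, the node names, and the sleep-based delays are all made up for the example, and all attempts are started at once, whereas a real scheduler would more likely start an extra attempt only after noticing a slow one.

```python
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def run_task(node, delay_seconds):
    """Stand-in for one task attempt; the delay simulates a slow or stuck node."""
    time.sleep(delay_seconds)
    return node


def run_speculatively(attempts):
    """Start the same task on several nodes and keep the first result."""
    with ThreadPoolExecutor(max_workers=len(attempts)) as pool:
        futures = [pool.submit(run_task, node, delay) for node, delay in attempts]
        done, _pending = wait(futures, return_when=FIRST_COMPLETED)
        # This sketch simply lets the slower attempts finish in the background
        # before the pool closes; their results are ignored.
        return next(iter(done)).result()


# node-1 plays the "stuck" node; the same data also lives on node-2 and node-3.
winner = run_speculatively([("node-1", 0.5), ("node-2", 0.01), ("node-3", 0.2)])
print(winner)
```

This only works because every attempt reads its own copy of the same data, which is what the replication described earlier provides.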
41.
Merge sorted sets based on some key… (Diagram: A-E, F-J, K-O, P-T, U-Z)
42.
…and write partial results (Diagram: PART-01, PART-02, PART-03, PART-04, PART-05)
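As a sketch of slides 41 and 42, the snippet below merges already-sorted runs and groups the records by the letter ranges shown in the diagram. The helper names and sample records are assumptions, and the "files" are plain in-memory lists named after the PART labels rather than real output files.

```python
import heapq

# Key ranges from the slide; each range feeds one partial-result "file".
RANGES = [("A", "E"), ("F", "J"), ("K", "O"), ("P", "T"), ("U", "Z")]


def partition_for(key):
    """Return the 1-based partition number whose letter range covers the key."""
    first = key[0].upper()
    for number, (low, high) in enumerate(RANGES, start=1):
        if low <= first <= high:
            return number
    raise ValueError(f"no range for key {key!r}")


def merge_and_partition(sorted_runs):
    """Merge already-sorted (key, value) runs and group them by partition."""
    parts = {f"PART-{n:02d}": [] for n in range(1, len(RANGES) + 1)}
    # heapq.merge keeps the combined stream sorted without re-sorting everything.
    for key, value in heapq.merge(*sorted_runs):
        parts[f"PART-{partition_for(key):02d}"].append((key, value))
    return parts


run_a = [("apple", 1), ("hadoop", 1), ("zebra", 1)]
run_b = [("big", 1), ("data", 1), ("node", 1)]
parts = merge_and_partition([run_a, run_b])
print(parts)
```

Every key in a given range ends up in the same part, in sorted order, so each part can be written out independently.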
43.
Guess what? We’ve just invented Hadoop! (Diagram: PART-03, PART-01, PART-02, A-E, F-J)
44.
So let’s talk about the pieces of Hadoop.
45.
Data nodes store and manage the data on a single “slave” computer. (Diagram: Data Node)
46.
Task trackers manage the compute. (Diagram: Data Node, Task Tracker)
47.
Job tracker manages task trackers, ships code to compute nodes. (Diagram: Data Node, Task Tracker, Job Tracker)
48.
Name node manages distribution and replication on the data nodes. (Diagram: Data Node, Task Tracker, Job Tracker, Name Node)
49.
MapReduce (Diagram: Task Tracker, Job Tracker)
50.
HDFS (Hadoop Distributed File System) (Diagram: Data Node, Name Node)
51.
HDFS
52.
Visual Example
53.
Map
54.
Shuffle
55.
Reduce
56.
Putting It All Together
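The Map, Shuffle, and Reduce stages named on slides 53 to 55 can be put together in a single-process word count. This is a minimal sketch for illustration only: the function names and input lines are assumptions, and a real Hadoop job would spread each phase across many machines rather than run it in one Python process.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: bring all values for the same key together, sorted by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(sorted(grouped.items()))


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single result."""
    return key, sum(values)


lines = ["Hadoop stores data", "Hadoop processes data"]
mapped = [pair for line in lines for pair in map_phase(line)]
shuffled = shuffle_phase(mapped)
result = dict(reduce_phase(key, values) for key, values in shuffled.items())
print(result)
```

The map and reduce functions are the only job-specific parts; the shuffle in between is generic grouping by key.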