6. Hadoop is Flawed
“You can’t install it without an expert.”
“Fine for R&D, but not for real production.”
“Hadoop is just for batch processing.”
“The dirty-little-secret with Hadoop is…”
7. Hadoop isn’t for RealWork™
1. Adopt Hadoop for pilot projects.
2. Scale Hadoop to production use.
3. Observe an unacceptable performance penalty.
4. Morph to a real parallel DBMS.
13. Evolve
“Hadoop has become the kernel of the distributed operating system for Big Data… No one uses the kernel alone.”
-Doug Cutting, Strata 2012 (Cloudera, ASF)
14. Hadoop + MapReduce
“There is nothing really embarrassing about embarrassingly parallel applications.”
-Luiz André Barroso, ACM 2011 (Distinguished Engineer, Google)
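To make "embarrassingly parallel" concrete, here is a minimal sketch of the map, shuffle, and reduce phases as a word count in plain Python. It is an illustration only, not Hadoop's API: the function names (map_words, shuffle, reduce_counts, word_count) are hypothetical, and a local thread pool stands in for a cluster of workers. Each line is mapped independently of every other line, which is what makes the job trivially parallel.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_words(line):
    # Map phase: emit a (word, 1) pair for every word in one input line.
    return [(word.lower(), 1) for word in line.split()]


def shuffle(mapped):
    # Shuffle phase: group all emitted values by key.
    groups = defaultdict(list)
    for pairs in mapped:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_counts(key, values):
    # Reduce phase: collapse the values for one key into a single count.
    return key, sum(values)


def word_count(lines, workers=4):
    # Assumption: threads stand in for cluster nodes; no data is distributed.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_words, lines))  # lines are independent
        groups = shuffle(mapped)
        reduced = pool.map(lambda kv: reduce_counts(*kv), groups.items())
        return dict(reduced)
```

Calling word_count(["big data", "Big cluster"]) returns a dictionary mapping each lowercased word to its count. No map call needs anything from another map call, so adding workers requires no coordination beyond the shuffle.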
15. Not Just for Batch Anymore…
Apache Hama
Apache Drill
16. Apache Hadoop YARN
The per-application ApplicationMaster is, in effect, a framework-specific library and is tasked with negotiating resources from the ResourceManager and working with the NodeManager(s) to execute and monitor the tasks.
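The division of labor described above can be sketched as a toy model in plain Python. This is not the YARN client API: the class names mirror the YARN roles, but every method here (allocate, release, launch, run) and the run_demo helper are hypothetical, and "containers" are just slot counters. The sketch only shows the flow: the ApplicationMaster asks the ResourceManager for capacity, then works with the granted NodeManagers to run its tasks and collect their results.

```python
from dataclasses import dataclass, field


@dataclass
class NodeManager:
    """Toy stand-in for the per-node agent that runs containers."""
    name: str
    free_slots: int
    finished: list = field(default_factory=list)

    def launch(self, task):
        # Run the task in-process and record its result (no real isolation).
        result = task()
        self.finished.append(result)
        return result


class ResourceManager:
    """Toy stand-in for the cluster-wide resource arbiter."""

    def __init__(self, nodes):
        self.nodes = nodes

    def allocate(self, wanted):
        # Grant up to `wanted` containers, taking one free slot per grant.
        granted = []
        for node in self.nodes:
            while node.free_slots > 0 and len(granted) < wanted:
                node.free_slots -= 1
                granted.append(node)
        return granted

    def release(self, node):
        # Return a container's slot to the pool.
        node.free_slots += 1


class ApplicationMaster:
    """Toy per-application coordinator: negotiate, launch, monitor."""

    def __init__(self, rm, tasks):
        self.rm = rm
        self.pending = list(tasks)
        self.results = []

    def run(self):
        while self.pending:
            containers = self.rm.allocate(len(self.pending))
            if not containers:
                raise RuntimeError("no capacity in the cluster")
            for node in containers:
                task = self.pending.pop(0)
                self.results.append(node.launch(task))
                self.rm.release(node)
        return self.results


def run_demo():
    # Hypothetical cluster: two nodes with three slots in total, five tasks.
    nodes = [NodeManager("nm1", 1), NodeManager("nm2", 2)]
    tasks = [lambda i=i: i * i for i in range(5)]
    return ApplicationMaster(ResourceManager(nodes), tasks).run()
```

With three slots and five tasks, the ApplicationMaster needs two rounds of negotiation, which is the point of the design: scheduling policy stays in the ResourceManager, while the framework-specific logic for what to run next stays in the per-application library.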