INTRODUCTION TO THE HADOOP ECOSYSTEM
BAKING A LAYER CAKE AND BEYOND…
“Qu’ils mangent de la brioche.”
BEFORE WE BEGIN
Questions for the audience…
How many of you have:
Been working with Hadoop for more than 3 months?
Been working with Hadoop for more than 6 months?
Been working with Hadoop for more than 1 year?
How many of you have heard about this thing called ‘Hadoop’ / ‘Big Data’ and thought it would be fun to check it out?
About the Speaker
BSCIS - The College of Engineering, The Ohio State University
‘Big Data’ Consultant with > 25 years in IT
Working solely in the ‘Big Data’ space since 2009
Founded Chicago area Hadoop User Group (CHUG) in April 2010
1600+ Members
Over 200 different companies across all industries in the Chicagoland area.
Has routinely spoken at conferences around the US on Hadoop.
Guest lecture at Illinois Institute of Technology.
Co-authored papers found on InfoQ.
MapR Admin, Cloudera Admin & Developer Certified.
email: MSegel (at) segel.com
Skype: Michael_Segel
What is Hadoop?
‘A Framework of software tools to allow one to take a large problem and process individual pieces in parallel.’
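The definition above can be illustrated with a minimal single-machine sketch. It assumes the "large problem" is summing a list of numbers and uses a thread pool as a stand-in for cluster nodes; the names `split` and `process_in_parallel` are hypothetical and are not part of Hadoop.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, pieces):
    """Cut the input into roughly equal, independent pieces."""
    size = max(1, -(-len(data) // pieces))  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def process_in_parallel(data, pieces=4):
    """Process each piece on its own worker, then combine the partial results."""
    with ThreadPoolExecutor(max_workers=pieces) as pool:
        # Each worker only sees its own piece (threads stand in for nodes here).
        partial_sums = list(pool.map(sum, split(data, pieces)))
    return sum(partial_sums)
```

The point of the sketch is the shape of the work: pieces that can be processed without talking to each other, followed by a cheap combine step.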
Our Hadoop Layer Cake: Circa 2010
Storage
Job Control
Data Access
Programming Languages
Our Hadoop Layer Cake: Circa 2013 Hadoop 2.0
Data Access
Storage
Job Control
Resource Control
Real Time
Messages
Data Frameworks
Confused? This is just the tip of the iceberg.
The only constant is change…
Hadoop is a disruptive technology, forcing the enterprise to rethink how it handles data.
The core Apache Framework is just the starting point.
Disruption allows new vendors to compete with established vendors.
If you can build a better mousetrap, you will attract customers.
Hadoop plays nice with others…
Myth: PROPRIETARY SOFTWARE IS BAD.
“Qu’ils mangent de la brioche.” (‘Let them eat cake’)
Reality: VENDOR LOCK IN IS BAD.
Myth: HADOOP IS ONLY GOOD FOR BATCH PROCESSING.
Reality: HADOOP CAN ALSO BE USED FOR ‘REAL TIME’ PROBLEMS.
REAL TIME HADOOP
SINGLE DATA CENTER SOLUTION
Nightly batch jobs create the next day’s advertising lists.
Client phone connects to the web service.
Web service talks to Ad Engine.
Phone connects to Ad Engine to get Ad.
Ad Engine connects to HBase to get list of potential Ads to display, sending the correct Ad to phone.
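The flow above (batch job writes the lists, the serving path does a keyed read) can be sketched as follows. This is only an illustration under stated assumptions: a plain dictionary stands in for the HBase table, the "segment" row key is hypothetical, and no real HBase client API is shown.

```python
# Hypothetical stand-in for an HBase table: row key -> list of candidate ads.
ad_table = {}


def nightly_batch(impressions):
    """Batch step: build the next day's candidate ad list per segment."""
    lists = {}
    for segment, ad_id in impressions:
        candidates = lists.setdefault(segment, [])
        if ad_id not in candidates:
            candidates.append(ad_id)
    # Swap in the freshly built lists for the serving path to read.
    ad_table.clear()
    ad_table.update(lists)


def serve_ad(segment, default_ad="house-ad"):
    """Real-time step: one keyed lookup, then pick the ad to send to the phone."""
    candidates = ad_table.get(segment)
    if not candidates:
        return default_ad  # assumed fallback when no list exists for the key
    return candidates[0]
```

The heavy computation happens offline; the request path is a single lookup by key, which is why a store like HBase can answer it quickly.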
Myth: HADOOP IS A STAND ALONE SYSTEM AND WILL REPLACE TRADITIONAL VENDORS’ PRODUCTS.
Reality: HADOOP IS PART OF THE ENTERPRISE. IT CAN BE STANDALONE, OR IT CAN WORK WITH EXISTING INFRASTRUCTURE.
HADOOP AND THE ENTERPRISE
WE CAN ALL GET ALONG…
Hadoop communicates well with the rest of the Enterprise…
Central cluster feeds distributed web services with local database backing…
Traditional data stores play nice with Hadoop. Some see HDFS files as external tables.
How Traditional Vendors view Hadoop
In the beginning they saw Hadoop as a threat. They will crush them.
If you can’t beat them, join them…
Oracle partners with Cloudera.
EMC partnered with MapR, then released its own distribution. (Green Stack)
Teradata partners with Hortonworks.
Microsoft partnered with Hortonworks.
Intel tried to create its own distro. Last week, it dumped that distro and made a large investment in Cloudera.
IBM has its own distro, yet certifies its tools to run on Cloudera.
Cisco partners with MapR.
Amazon (AWS) has its own distro and partners with MapR.
Myth: HADOOP CLUSTERS SHOULD BE BUILT ON COMMODITY HARDWARE.
Reality: YOU CAN DESIGN YOUR CLUSTER AROUND CONSTRAINTS…
ALTERNATIVE CLUSTER LAYOUT
STORAGE / COMPUTE CLUSTER
A Higher Density of Disk and Compute Cluster
Premium over Commodity Hardware
I/O Latency
Could be part of a virtualization solution.
Myth: HADOOP IS OPEN SOURCE AND THEREFORE FREE.
Reality: T.A.N.S.T.A.A.F.L. ‘TANS - TAH - FELL’ (THERE AIN’T NO SUCH THING AS A FREE LUNCH)
There ain’t no such thing as a free lunch…
Customers are paying for support.
Tools are primitive and require work; there is no real point-and-click solution in place yet, but it is getting there.
Hadoop fills the gap where you want a custom solution.
Merging semi-structured and structured data is going to be data dependent, requiring customization.
Beyond ETL, SQL, custom apps require developer expertise. (You must invest in skills.)
Depending on the use case, Time to Value (TtV) will differ.
Bottom line: there is a cost reduction over traditional solutions, but it’s not free.
Take away…
Hadoop is a tool set that is constantly evolving.
Beware of marketing myths…
Do your own homework and talk to the vendors.
Make them earn your business.
T.A.N.S.T.A.A.F.L. applies: you need to make an investment in terms of skills.
Hadoop isn’t a separate solution and should be part of your overall Enterprise strategy.
Hadoop isn’t a silver bullet. By itself, it doesn’t solve your business problems.
YOU CAN HAVE YOUR CAKE AND EAT IT TOO!
QUESTIONS?
Thank You For Your Time
What is a layer cake?
layer cake
noun [C] US
: two or more soft cakes put on top of each other with jam, cream, icing, etc. (= a sweet mixture made from sugar) between the cakes and covering the top and sides
: a term for a diagram showing how various parts of a group of components tie together in terms of a functional stack.
What is Hadoop?
Storage Layer
The Storage Layer is a Distributed File System that accomplishes the following:
Uniform Access from any machine in the cluster.
Fast Access (
Resiliency (Self Healing)
Redundancy (Replication)
This is known as HDFS, the Hadoop Distributed File System.
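Replication and self healing can be sketched with a toy placement model. This is not how HDFS actually chooses nodes; the round-robin placement and the function names `place_block` and `heal` are assumptions made for illustration. The replication factor of 3 matches the HDFS default.

```python
REPLICATION = 3  # HDFS's default replication factor; everything else is a toy.


def place_block(block_id, nodes, replication=REPLICATION):
    """Choose `replication` distinct nodes for a block (simple round-robin).

    Assumes replication <= len(nodes), otherwise nodes would repeat.
    """
    start = block_id % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(replication)]


def heal(placement, live_nodes, replication=REPLICATION):
    """Re-replicate blocks whose copies were lost when nodes died."""
    healed = {}
    for block_id, holders in placement.items():
        survivors = [n for n in holders if n in live_nodes]
        spares = [n for n in live_nodes if n not in survivors]
        needed = max(0, replication - len(survivors))
        # Copy the block to as many spare nodes as are needed (and available).
        healed[block_id] = survivors + spares[:needed]
    return healed
```

Redundancy is what makes resiliency possible: as long as one copy of a block survives, the missing copies can be recreated on other nodes.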
What is Hadoop?
Job Control Layer
The Job Control Layer is the layer that accomplishes the following:
Manages and schedules jobs to be run. (Default [FIFO], Capacity Scheduler,
Manages the overall job, and distributes the subprocesses across the cluster.
Manages the subprocesses being run on each node in the cluster.
This is accomplished by a Job Tracker (Cluster level) and Task Tracker (Node Level)
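The division of labor described above can be sketched with a toy FIFO scheduler. The class below is hypothetical and only mimics the idea of a cluster-level tracker handing tasks to nodes; it is not the Hadoop JobTracker API, and the round-robin assignment is an assumption.

```python
from collections import deque


class JobTracker:
    """Toy cluster-level scheduler: runs jobs in FIFO order."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.queue = deque()

    def submit(self, job_name, tasks):
        """Queue a job; jobs run in the order they were submitted."""
        self.queue.append((job_name, list(tasks)))

    def run_next(self):
        """Take the oldest job and spread its tasks round-robin over the nodes."""
        job_name, tasks = self.queue.popleft()
        assignment = {node: [] for node in self.nodes}
        for i, task in enumerate(tasks):
            assignment[self.nodes[i % len(self.nodes)]].append(task)
        return job_name, assignment
```

In this picture each node's list is what a node-level tracker would then be responsible for running and reporting on.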
What is Hadoop?
Data Access Layer
The Data Access Layer is the layer that accomplishes the following:
Allows for a higher level access which can be translated to a Map/Reduce Job
Pig (Yahoo!)
Hive (Facebook)
Allows for ad hoc access to data outside of the Map/Reduce Framework (HBase)
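To make "translated to a Map/Reduce Job" concrete, here is a rough sketch of how a grouped count, the kind of request a higher level tool accepts, breaks down into a map step and a reduce step. The function names are hypothetical and the code runs in a single process; it shows the shape of the translation, not what Pig or Hive actually generate.

```python
from collections import defaultdict


def map_phase(rows, key_column):
    """Map: emit (key, 1) for every input row."""
    for row in rows:
        yield row[key_column], 1


def reduce_phase(pairs):
    """Shuffle and reduce: group the pairs by key and sum the values."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def count_by(rows, key_column):
    """Roughly what a 'count rows grouped by column' query asks for."""
    return reduce_phase(map_phase(rows, key_column))
```

For example, `count_by(rows, "city")` over rows with a `city` field returns one total per distinct city.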
What is Hadoop?
Job Flow Control Layer
The Data Access Layer is the layer that accomplishes the following:
Allows for a higher level access which can be translated to a Map/Reduce Job
Pig (Yahoo!)
Hive (Facebook)
Allows for ad hoc access to data outside of the Map/Reduce Framework (HBase)
Allows for processes to be chained together to create a workflow (Oozie)*
*Nowhere else to put it…
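Chaining processes into a workflow can be sketched as a list of named steps where each step consumes the previous step's output. This is a simplified, linear illustration with a hypothetical `run_workflow` helper; it does not use Oozie's workflow definition format or API.

```python
def run_workflow(steps, data):
    """Run each (name, function) step in order, feeding output to the next."""
    completed = []
    for name, step in steps:
        data = step(data)  # an exception here stops the chain at this step
        completed.append(name)
    return data, completed
```

A usage example under the same assumptions: the steps `[("clean", drop_nulls), ("double", double_all), ("total", sum)]` would run in that order, and the returned list records which steps finished.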
List of Apache Incubator Projects associated with Hadoop:
Storm
Accumulo
Knox
Sentry
Falcon
DataFu
Drill
Tez
Twill
Phoenix
Hadoop Dev Tools
Tajo

Dubai Big Data in Finance, Intro to Hadoop 2-Apr-14 - Michael Segel

  • 1. INTRODUCTION TO THE HADOOP ECOSYSTEM BAKING A LAYER CAKE AND BEYOND… “Qu’ils mangent de la brioche.” 1
  • 2. BEFORE WE BEGIN Questions for the audience…. How Many of You have : Been working with Hadoop for more than 3 months? Been working with Hadoop for more than 6 months? Been working with Hadoop for more than 1 year?How many of you have heard about this thing called ‘Hadoop’ / ‘Big Data’ and thought it would be fun to check it out?
  • 3. About the Speaker BSCIS - The College of Engineering, The Ohio State University ‘Big Data’ Consultant with > 25 years in IT Working solely in the ‘Big Data’ space since 2009 Founded Chicago area Hadoop User Group (CHUG) in April 2010 1600+ Members Over 200 different companies across all industries in the Chicagoland area. Routinely has talked at different Conferences around the US on Hadoop. Guest Lecture at Illinois Institute of Technology. CoAuthored papers found on InfoQ. MapR Admin, Cloudera Admin & Developer Certified. 3 email: MSegel (at) segel.com Skype: Michael_Segel
  • 4. What is Hadoop? ‘A Framework of software tools to allow one to take a large problem and process individual pieces in parallel. ‘ 4
  • 5. Our Hadoop Layer Cake: Circa 2010 Storag e Job Control Data Access 5 Programmin g Languages
  • 6. Data Access Our Hadoop Layer Cake: Circa 2013 Hadoop 2.0 Storag e Job Control 6 Resourc e Control Real Time Messag es Confused? This is just the tip of the iceberg. Data Frameworks
  • 7. The only constant is change… Hadoop is a disruptive technology, forcing the enterprise to rethink how it handles data. The core Apache Framework is just the starting point. Disruption allows new vendors to compete with established vendors. If you can build a better mousetrap, you will attract customers. Hadoop plays nice with others…
  • 8. PROPRIETARY SOFTWARE IS BAD. “Qu’ils mangent de la brioche.” 8 ‘Let them eat cake’ Myth : Reality :VENDOR LOCK IN IS BAD.
  • 9. HADOOP IS ONLY GOOD FOR BATCH PROCESSING “Qu’ils mangent de la brioche.” 9 ‘Let them eat cake’ Myth : Reality :HADOOP CAN ALSO BE USED FOR ‘REAL TIME’ PROBLEMS.
  • 10. [CENSOR ED] PROJE CT DAT E CLIE NT REAL TIME HADOOP SINGLE DATA CENTER SOLUTION Nightly Batch Jobs Create the Next Days Advertising Lists Client Phone Connects to the web serviceWeb Service talks to Ad EnginePhone connects to Ad Engine to get Ad Ad Engine connects to HBase to get list of potential Ads to display, sending the correct Ad to phone.
  • 11. HADOOP IS A STAND ALONE SYSTEM AND WILL REPLACE TRADITIONAL VENDOR’S PRODUCTS “Qu’ils mangent de la brioche.” 11 ‘Let them eat cake’ Myth : Reality :HADOOP IS PART OF THE ENTERPRISE . IT CAN BE STANDALONE, OR IT CAN WORK WITH EXISTING INFRASTRUCTURE.
  • 12. HADOOP AND THE ENTERPRISE: WE CAN ALL GET ALONG… Hadoop communicates well with the rest of the Enterprise. A central cluster feeds distributed web services with local database backing. [split into two slides]
  • 13. HADOOP AND THE ENTERPRISE: WE CAN ALL GET ALONG… Hadoop communicates well with the rest of the Enterprise. Traditional data stores play nice with Hadoop. Some can see HDFS files as external tables. [split into two slides]
  • 14. How Traditional Vendors view Hadoop: In the beginning they saw Hadoop as a threat. They would crush it. If you can’t beat them, join them… Oracle partners with Cloudera. EMC partnered with MapR, then released its own distribution (Green Stack). Teradata partners with Hortonworks. Microsoft partnered with Hortonworks. Intel tried to create its own distro; last week it dumped that distro and made a large investment into Cloudera. IBM has its own distro, yet certifies its tools to run on Cloudera. Cisco partners with MapR. Amazon (AWS) has its own distro and partners with MapR.
  • 15. Myth: HADOOP CLUSTERS SHOULD BE BUILT ON COMMODITY HARDWARE. Reality: YOU CAN DESIGN YOUR CLUSTER AROUND CONSTRAINTS…
  • 16. ALTERNATIVE CLUSTER LAYOUT: STORAGE / COMPUTE CLUSTER. A Higher Density of Disk and Compute Cluster. Premium over Commodity Hardware. I/O Latency. Could be part of a virtualization solution.
  • 17. Myth: HADOOP IS OPEN SOURCE AND THEREFORE FREE. Reality: T.A.N.S.T.A.A.F.L (‘TANS - TAH - FELL’): THERE AIN’T NO SUCH THING AS A FREE LUNCH.
  • 18. There ain’t no such thing as a free lunch… Customers are paying for support. Tools are primitive and require work; there is no real point-and-click solution in place, but it is getting there. Hadoop fills the gap where you want a custom solution. Merging semi-structured and structured data is going to be data dependent, requiring customization. Beyond ETL and SQL, custom apps require developer expertise. (You must invest in skills.) Depending on the use case, Time to Value (TtV) will differ. Bottom line: there is a cost reduction over traditional solutions, but it’s not free.
  • 19. Take away… Hadoop is a tool set that is constantly evolving. Beware of marketing myths… Do your own homework and talk to the vendors. Make them earn your business. T.A.N.S.T.A.A.F.L applies: you need to make an investment in terms of skills. Hadoop isn’t a separate solution and should be part of your overall Enterprise strategy. Hadoop isn’t a silver bullet. By itself, it doesn’t solve your business problems.
  • 20. YOU CAN HAVE YOUR CAKE AND EAT IT TOO!
  • 22. What is a layer cake? layer cake, noun [C], US. 1: two or more soft cakes put on top of each other with jam, cream, icing (= a sweet mixture made from sugar), etc. between the cakes and covering the top and sides. 2: a term for a diagram showing how various parts of a group of components tie together in terms of a functional stack.
  • 23. What is Hadoop? Storage Layer: The Storage Layer is a Distributed File System that accomplishes the following: Uniform Access from any machine in the cluster. Fast Access. Resiliency (Self Healing). Redundancy (Replication). This is known as HDFS, the Hadoop Distributed File System.
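Redundancy and self-healing, as listed on slide 23, can be sketched with a toy block-placement model. This is an assumption-laden illustration, not HDFS's actual placement policy (which, for example, takes racks into account): blocks are spread round-robin, the function names are hypothetical, and the default of three replicas simply mirrors HDFS's default replication factor.

```python
def place_blocks(block_ids, nodes, replication=3):
    """Assign each block to `replication` distinct nodes, round-robin."""
    if replication > len(nodes):
        raise ValueError("not enough nodes for the requested replication")
    placement = {}
    for i, block in enumerate(block_ids):
        placement[block] = [nodes[(i + r) % len(nodes)] for r in range(replication)]
    return placement


def heal(placement, live_nodes, replication=3):
    """Drop replicas on dead nodes and copy surviving blocks onto live nodes."""
    healed = {}
    for block, replicas in placement.items():
        alive = [n for n in replicas if n in live_nodes]
        if not alive:
            healed[block] = []  # every copy is gone: nothing left to copy from
            continue
        spares = [n for n in live_nodes if n not in alive]
        healed[block] = alive + spares[:replication - len(alive)]
    return healed


nodes = ["n1", "n2", "n3", "n4"]
placement = place_blocks(["b0", "b1"], nodes)
healed = heal(placement, ["n1", "n3", "n4"])  # simulate losing node n2
print(healed)
```

Because every block lives on several nodes, losing one node loses no data, and the surviving copies are enough to bring the block back to full replication.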
  • 24. What is Hadoop? Job Control Layer: The Job Control Layer is the layer that accomplishes the following: Manages and schedules jobs to be run (Default [FIFO], Capacity Scheduler). Manages the overall job and distributes the subprocesses across the cluster. Manages the subprocesses being run on each node in the cluster. This is accomplished by a Job Tracker (cluster level) and Task Trackers (node level).
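The division of labor on slide 24 (one cluster-level coordinator, one worker per node, jobs taken in FIFO order) can be sketched as below. The class names echo the slide, but the code is a hypothetical in-process model, not the Hadoop API: tasks are plain Python callables and ‘distribution’ is a round-robin loop.

```python
from collections import deque


class TaskTracker:
    """Node-level worker: runs the tasks handed to it and records the results."""

    def __init__(self, name):
        self.name = name
        self.completed = []

    def run(self, job_name, task):
        self.completed.append((job_name, task()))


class JobTracker:
    """Cluster-level coordinator with a FIFO job queue."""

    def __init__(self, trackers):
        self.trackers = trackers
        self.queue = deque()

    def submit(self, job_name, tasks):
        self.queue.append((job_name, tasks))

    def run_all(self):
        finished = []
        while self.queue:
            job_name, tasks = self.queue.popleft()  # FIFO: oldest job first
            for i, task in enumerate(tasks):
                # Assumption: round-robin assignment across the nodes.
                self.trackers[i % len(self.trackers)].run(job_name, task)
            finished.append(job_name)
        return finished


trackers = [TaskTracker("node-a"), TaskTracker("node-b")]
job_tracker = JobTracker(trackers)
job_tracker.submit("job1", [lambda: 1, lambda: 2, lambda: 3])
job_tracker.submit("job2", [lambda: 10])
finished = job_tracker.run_all()
print(finished)
```

A capacity-style scheduler would replace the single queue with several queues, each given a share of the cluster; the FIFO default is simply the simplest policy to show.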
  • 25. What is Hadoop? Data Access Layer: The Data Access Layer is the layer that accomplishes the following: Allows for higher-level access which can be translated to a Map/Reduce job: Pig (Yahoo!), Hive (Facebook). Allows for ad hoc access to data outside of the Map/Reduce framework (HBase).
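To make ‘higher-level access which can be translated to a Map/Reduce job’ concrete, here is a rough sketch of how a grouped count (the kind of query one would write in a few lines of Hive or Pig) decomposes into map, shuffle and reduce steps. This is not how Hive or Pig actually compile queries; the function names are hypothetical and everything runs in memory.

```python
from collections import defaultdict


def map_phase(records, key_fn):
    """Map: emit a (key, 1) pair for every input record."""
    return [(key_fn(record), 1) for record in records]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: collapse each key's values into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def count_group_by(records, key_fn):
    """Roughly: SELECT key, COUNT(*) FROM records GROUP BY key."""
    return reduce_phase(shuffle(map_phase(records, key_fn)))


visits = [{"country": "US"}, {"country": "FR"}, {"country": "US"}]
counts = count_group_by(visits, lambda record: record["country"])
print(counts)
```

The value of the higher-level tools is that the user writes only the one-line query and never sees the three phases.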
  • 26. What is Hadoop? Job Flow Control Layer: The Data Access Layer is the layer that accomplishes the following: Allows for higher-level access which can be translated to a Map/Reduce job: Pig (Yahoo!), Hive (Facebook). Allows for ad hoc access to data outside of the Map/Reduce framework (HBase). Allows for processes to be chained together to create a workflow (Oozie)* *Nowhere else to put it…
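Chaining processes into a workflow, the role slide 26 gives to Oozie, amounts to running actions in dependency order and passing results along. The sketch below shows only that idea, using the standard-library `graphlib` module (Python 3.9+). It is not Oozie's workflow format or API, and the `run_workflow` helper and action names are hypothetical.

```python
from graphlib import TopologicalSorter


def run_workflow(actions, depends_on):
    """Run each action after the actions it depends on, feeding it their results."""
    sorter = TopologicalSorter()
    for name in actions:
        sorter.add(name, *depends_on.get(name, ()))
    order = list(sorter.static_order())  # predecessors always come first
    results = {}
    for name in order:
        inputs = [results[dep] for dep in depends_on.get(name, ())]
        results[name] = actions[name](*inputs)
    return order, results


actions = {
    "ingest": lambda: [3, 1, 2],
    "clean": lambda rows: sorted(rows),
    "report": lambda rows: sum(rows),
}
depends_on = {"clean": ["ingest"], "report": ["clean"]}
order, results = run_workflow(actions, depends_on)
print(order, results["report"])
```

In a real cluster each action would itself be a Map/Reduce, Pig or Hive job; the workflow layer only decides what runs when.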
  • 27. List of Apache Incubator Projects associated with Hadoop: Storm, Accumulo, Knox, Sentry, Falcon, DataFu, Drill, Tez, Twill, Phoenix, Hadoop Dev Tools, Tajo