SlideShare una empresa de Scribd logo
1 de 42
Hive -  Data Warehousing &   Analytics on Hadoop Wednesday, June 10, 2009  Santa Clara Marriott Namit Jain, Zheng Shao Facebook
Agenda ,[object Object],[object Object],[object Object],[object Object],Facebook
[object Object],Facebook
Why Another Data Warehousing System? ,[object Object],[object Object],[object Object],Facebook
 
Lets try Hadoop… ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Facebook
Lets try Hadoop… (continued) ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Facebook
What is HIVE? ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Facebook
Simplifying Hadoop ,[object Object],[object Object],[object Object],Facebook
[object Object],Facebook
Data Warehousing at Facebook Today Facebook Web Servers Scribe Servers Filers Hive on  Hadoop Cluster Oracle RAC Federated MySQL
Hive/Hadoop Usage @ Facebook ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Facebook
Hadoop Usage @ Facebook ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Facebook
[object Object],Facebook
[object Object],Facebook
Data Model Facebook Logical Partitioning Hash Partitioning clicks HDFS MetaStore /hive/clicks /hive/clicks/ds=2008-03-25 /hive/clicks/ds=2008-03-25/0 … Tables Metastore DB Data Location Bucketing Info Partitioning Cols
HIVE: Components Facebook HDFS Hive CLI DDL Queries Browsing Map Reduce MetaStore Thrift API SerDe Thrift CSV JSON.. Execution Parser Planner Web UI Optimizer DB
Hive Query Language ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Facebook
[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Facebook
Hive Query Language (continued) ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Facebook
[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Pluggable Map-Reduce Scripts Facebook
Map Reduce Example Facebook Machine 2 Machine 1 <k1, v1> <k2, v2> <k3, v3> <k4, v4> <k5, v5> <k6, v6> <nk1, nv1> <nk2, nv2> <nk3, nv3> <nk2, nv4> <nk2, nv5> <nk1, nv6> Local Map <nk2, nv4> <nk2, nv5> <nk2, nv2> <nk1, nv1> <nk3, nv3> <nk1, nv6> Global Shuffle <nk1, nv1> <nk1, nv6> <nk3, nv3> <nk2, nv4> <nk2, nv5> <nk2, nv2> Local Sort <nk2, 3> <nk1, 2> <nk3, 1> Local Reduce
Hive QL – Join ,[object Object],[object Object],[object Object],[object Object],[object Object],Facebook
Hive QL – Join in Map Reduce Facebook page_view user pv_users Map Reduce key value 111 < 1, 1> 111 < 1, 2> 222 < 1, 1> pageid userid time 1 111 9:08:01 2 111 9:08:13 1 222 9:08:14 userid age gender 111 25 female 222 32 male key value 111 < 2, 25> 222 < 2, 32> key value 111 < 1, 1> 111 < 1, 2> 111 < 2, 25> key value 222 < 1, 1> 222 < 2, 32> Shuffle Sort Pageid age 1 25 2 25 pageid age 1 32
Join Optimizations ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Facebook
Hive QL – Map Join Facebook page_view user Hash table pv_users key value 111 <1,2> 222 <2> pageid userid time 1 111 9:08:01 2 111 9:08:13 1 222 9:08:14 userid age gender 111 25 female 222 32 male Pageid age 1 25 2 25 1 32
Hive QL – Group By ,[object Object],[object Object],[object Object],Facebook
Hive QL – Group By in Map Reduce Facebook pv_users Map Reduce pageid age 1 25 1 25 pageid age count 1 25 3 pageid age 2 32 1 25 key value <1,25> 2 key value <1,25> 1 <2,32> 1 key value <1,25> 2 <1,25> 1 key value <2,32> 1 Shuffle Sort pageid age count 2 32 1
Group by Optimizations ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Facebook
Columnar Storage ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Facebook
Speed Improvements over Time Facebook ,[object Object],[object Object],[object Object],[object Object],[object Object],Date SVN Revision Major Changes Query A Query B Query C 2/22/2009 746906 Before Lazy Deserialization 83 sec 98 sec 183 sec 2/23/2009 747293 Lazy Deserialization 40 sec 66 sec 185 sec 3/6/2009 751166 Map-side Aggregation 22 sec 67 sec 182 sec 4/29/2009 770074 Object Reuse  21 sec 49 sec 130 sec 6/3/2009 781633 Map-side Join * 21 sec 48 sec 132 sec
Overcoming Java Overhead ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Facebook
Generic UDF and UDAF ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Facebook
HQL Optimizations ,[object Object],[object Object],[object Object],Facebook
[object Object],Facebook
Open Source Community ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Facebook
[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Facebook
Deployment Options ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Facebook
Future Work ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Facebook
Information ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Facebook
Contributors ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Facebook ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
[object Object],Facebook

Más contenido relacionado

La actualidad más candente

Hadoop & Hive Change the Data Warehousing Game Forever
Hadoop & Hive Change the Data Warehousing Game ForeverHadoop & Hive Change the Data Warehousing Game Forever
Hadoop & Hive Change the Data Warehousing Game ForeverDataWorks Summit
 
Hive ICDE 2010
Hive ICDE 2010Hive ICDE 2010
Hive ICDE 2010ragho
 
Hive User Meeting August 2009 Facebook
Hive User Meeting August 2009 FacebookHive User Meeting August 2009 Facebook
Hive User Meeting August 2009 Facebookragho
 
report on aadhaar anlysis using bid data hadoop and hive
report on aadhaar anlysis using bid data hadoop and hivereport on aadhaar anlysis using bid data hadoop and hive
report on aadhaar anlysis using bid data hadoop and hivesiddharthboora
 
Join optimization in hive
Join optimization in hive Join optimization in hive
Join optimization in hive Liyin Tang
 
How to understand and analyze Apache Hive query execution plan for performanc...
How to understand and analyze Apache Hive query execution plan for performanc...How to understand and analyze Apache Hive query execution plan for performanc...
How to understand and analyze Apache Hive query execution plan for performanc...DataWorks Summit/Hadoop Summit
 
Hive query optimization infinity
Hive query optimization infinityHive query optimization infinity
Hive query optimization infinityShashwat Shriparv
 
Hive : WareHousing Over hadoop
Hive :  WareHousing Over hadoopHive :  WareHousing Over hadoop
Hive : WareHousing Over hadoopChirag Ahuja
 
Introduction to Map Reduce
Introduction to Map ReduceIntroduction to Map Reduce
Introduction to Map ReduceApache Apex
 
Hw09 Hadoop Development At Facebook Hive And Hdfs
Hw09   Hadoop Development At Facebook  Hive And HdfsHw09   Hadoop Development At Facebook  Hive And Hdfs
Hw09 Hadoop Development At Facebook Hive And HdfsCloudera, Inc.
 
Data profiling with Apache Calcite
Data profiling with Apache CalciteData profiling with Apache Calcite
Data profiling with Apache CalciteJulian Hyde
 
MapReduce Design Patterns
MapReduce Design PatternsMapReduce Design Patterns
MapReduce Design PatternsDonald Miner
 
Introduction to Map-Reduce
Introduction to Map-ReduceIntroduction to Map-Reduce
Introduction to Map-ReduceBrendan Tierney
 
Mapreduce in Search
Mapreduce in SearchMapreduce in Search
Mapreduce in SearchAmund Tveit
 
Introduction to map reduce
Introduction to map reduceIntroduction to map reduce
Introduction to map reduceBhupesh Chawda
 

La actualidad más candente (19)

Hadoop & Hive Change the Data Warehousing Game Forever
Hadoop & Hive Change the Data Warehousing Game ForeverHadoop & Hive Change the Data Warehousing Game Forever
Hadoop & Hive Change the Data Warehousing Game Forever
 
Hive ICDE 2010
Hive ICDE 2010Hive ICDE 2010
Hive ICDE 2010
 
Hive User Meeting August 2009 Facebook
Hive User Meeting August 2009 FacebookHive User Meeting August 2009 Facebook
Hive User Meeting August 2009 Facebook
 
Hive
HiveHive
Hive
 
report on aadhaar anlysis using bid data hadoop and hive
report on aadhaar anlysis using bid data hadoop and hivereport on aadhaar anlysis using bid data hadoop and hive
report on aadhaar anlysis using bid data hadoop and hive
 
Join optimization in hive
Join optimization in hive Join optimization in hive
Join optimization in hive
 
How to understand and analyze Apache Hive query execution plan for performanc...
How to understand and analyze Apache Hive query execution plan for performanc...How to understand and analyze Apache Hive query execution plan for performanc...
How to understand and analyze Apache Hive query execution plan for performanc...
 
Hive query optimization infinity
Hive query optimization infinityHive query optimization infinity
Hive query optimization infinity
 
Hive : WareHousing Over hadoop
Hive :  WareHousing Over hadoopHive :  WareHousing Over hadoop
Hive : WareHousing Over hadoop
 
Map Reduce
Map ReduceMap Reduce
Map Reduce
 
Introduction to Map Reduce
Introduction to Map ReduceIntroduction to Map Reduce
Introduction to Map Reduce
 
Hw09 Hadoop Development At Facebook Hive And Hdfs
Hw09   Hadoop Development At Facebook  Hive And HdfsHw09   Hadoop Development At Facebook  Hive And Hdfs
Hw09 Hadoop Development At Facebook Hive And Hdfs
 
Map Reduce
Map ReduceMap Reduce
Map Reduce
 
Map Reduce Online
Map Reduce OnlineMap Reduce Online
Map Reduce Online
 
Data profiling with Apache Calcite
Data profiling with Apache CalciteData profiling with Apache Calcite
Data profiling with Apache Calcite
 
MapReduce Design Patterns
MapReduce Design PatternsMapReduce Design Patterns
MapReduce Design Patterns
 
Introduction to Map-Reduce
Introduction to Map-ReduceIntroduction to Map-Reduce
Introduction to Map-Reduce
 
Mapreduce in Search
Mapreduce in SearchMapreduce in Search
Mapreduce in Search
 
Introduction to map reduce
Introduction to map reduceIntroduction to map reduce
Introduction to map reduce
 

Similar a Hive - Data Warehousing & Analytics on Hadoop

HIVE: Data Warehousing & Analytics on Hadoop
HIVE: Data Warehousing & Analytics on HadoopHIVE: Data Warehousing & Analytics on Hadoop
HIVE: Data Warehousing & Analytics on HadoopZheng Shao
 
Hadoop and Hive
Hadoop and HiveHadoop and Hive
Hadoop and HiveZheng Shao
 
Hive Training -- Motivations and Real World Use Cases
Hive Training -- Motivations and Real World Use CasesHive Training -- Motivations and Real World Use Cases
Hive Training -- Motivations and Real World Use Casesnzhang
 
Hadoop Hive Talk At IIT-Delhi
Hadoop Hive Talk At IIT-DelhiHadoop Hive Talk At IIT-Delhi
Hadoop Hive Talk At IIT-DelhiJoydeep Sen Sarma
 
Hive @ Hadoop day seattle_2010
Hive @ Hadoop day seattle_2010Hive @ Hadoop day seattle_2010
Hive @ Hadoop day seattle_2010nzhang
 
Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit Jain
Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit JainApache Hadoop India Summit 2011 talk "Hive Evolution" by Namit Jain
Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit JainYahoo Developer Network
 
02 data warehouse applications with hive
02 data warehouse applications with hive02 data warehouse applications with hive
02 data warehouse applications with hiveSubhas Kumar Ghosh
 
Distributed computing poli
Distributed computing poliDistributed computing poli
Distributed computing poliivascucristian
 
Hw09 Rethinking The Data Warehouse With Hadoop And Hive
Hw09   Rethinking The Data Warehouse With Hadoop And HiveHw09   Rethinking The Data Warehouse With Hadoop And Hive
Hw09 Rethinking The Data Warehouse With Hadoop And HiveCloudera, Inc.
 
Hadoop institutes in hyderabad
Hadoop institutes in hyderabadHadoop institutes in hyderabad
Hadoop institutes in hyderabadKelly Technologies
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google Cloud Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google Cloud lohitvijayarenu
 
Extending twitter's data platform to google cloud
Extending twitter's data platform to google cloud Extending twitter's data platform to google cloud
Extending twitter's data platform to google cloud Vrushali Channapattan
 
An introduction to Hadoop for large scale data analysis
An introduction to Hadoop for large scale data analysisAn introduction to Hadoop for large scale data analysis
An introduction to Hadoop for large scale data analysisAbhijit Sharma
 
WaterlooHiveTalk
WaterlooHiveTalkWaterlooHiveTalk
WaterlooHiveTalknzhang
 
Hpdw 2015-v10-paper
Hpdw 2015-v10-paperHpdw 2015-v10-paper
Hpdw 2015-v10-paperrestassure
 
Sawmill - Integrating R and Large Data Clouds
Sawmill - Integrating R and Large Data CloudsSawmill - Integrating R and Large Data Clouds
Sawmill - Integrating R and Large Data CloudsRobert Grossman
 
Better Hackathon 2020 - Fraunhofer IAIS - Semantic geo-clustering with SANSA
Better Hackathon 2020 - Fraunhofer IAIS - Semantic geo-clustering with SANSABetter Hackathon 2020 - Fraunhofer IAIS - Semantic geo-clustering with SANSA
Better Hackathon 2020 - Fraunhofer IAIS - Semantic geo-clustering with SANSAPRBETTER
 
Presentation_BigData_NenaMarin
Presentation_BigData_NenaMarinPresentation_BigData_NenaMarin
Presentation_BigData_NenaMarinn5712036
 

Similar a Hive - Data Warehousing & Analytics on Hadoop (20)

HIVE: Data Warehousing & Analytics on Hadoop
HIVE: Data Warehousing & Analytics on HadoopHIVE: Data Warehousing & Analytics on Hadoop
HIVE: Data Warehousing & Analytics on Hadoop
 
Hadoop and Hive
Hadoop and HiveHadoop and Hive
Hadoop and Hive
 
Hive Training -- Motivations and Real World Use Cases
Hive Training -- Motivations and Real World Use CasesHive Training -- Motivations and Real World Use Cases
Hive Training -- Motivations and Real World Use Cases
 
Hadoop Hive Talk At IIT-Delhi
Hadoop Hive Talk At IIT-DelhiHadoop Hive Talk At IIT-Delhi
Hadoop Hive Talk At IIT-Delhi
 
Hive @ Hadoop day seattle_2010
Hive @ Hadoop day seattle_2010Hive @ Hadoop day seattle_2010
Hive @ Hadoop day seattle_2010
 
Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit Jain
Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit JainApache Hadoop India Summit 2011 talk "Hive Evolution" by Namit Jain
Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit Jain
 
02 data warehouse applications with hive
02 data warehouse applications with hive02 data warehouse applications with hive
02 data warehouse applications with hive
 
Distributed computing poli
Distributed computing poliDistributed computing poli
Distributed computing poli
 
Hw09 Rethinking The Data Warehouse With Hadoop And Hive
Hw09   Rethinking The Data Warehouse With Hadoop And HiveHw09   Rethinking The Data Warehouse With Hadoop And Hive
Hw09 Rethinking The Data Warehouse With Hadoop And Hive
 
Pig latin
Pig latinPig latin
Pig latin
 
Hadoop institutes in hyderabad
Hadoop institutes in hyderabadHadoop institutes in hyderabad
Hadoop institutes in hyderabad
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google Cloud Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google Cloud
 
Extending twitter's data platform to google cloud
Extending twitter's data platform to google cloud Extending twitter's data platform to google cloud
Extending twitter's data platform to google cloud
 
An introduction to Hadoop for large scale data analysis
An introduction to Hadoop for large scale data analysisAn introduction to Hadoop for large scale data analysis
An introduction to Hadoop for large scale data analysis
 
WaterlooHiveTalk
WaterlooHiveTalkWaterlooHiveTalk
WaterlooHiveTalk
 
Tableau training course
Tableau training courseTableau training course
Tableau training course
 
Hpdw 2015-v10-paper
Hpdw 2015-v10-paperHpdw 2015-v10-paper
Hpdw 2015-v10-paper
 
Sawmill - Integrating R and Large Data Clouds
Sawmill - Integrating R and Large Data CloudsSawmill - Integrating R and Large Data Clouds
Sawmill - Integrating R and Large Data Clouds
 
Better Hackathon 2020 - Fraunhofer IAIS - Semantic geo-clustering with SANSA
Better Hackathon 2020 - Fraunhofer IAIS - Semantic geo-clustering with SANSABetter Hackathon 2020 - Fraunhofer IAIS - Semantic geo-clustering with SANSA
Better Hackathon 2020 - Fraunhofer IAIS - Semantic geo-clustering with SANSA
 
Presentation_BigData_NenaMarin
Presentation_BigData_NenaMarinPresentation_BigData_NenaMarin
Presentation_BigData_NenaMarin
 

Último

NO1 WorldWide Amil baba in pakistan Amil Baba in Karachi Black Magic Islamaba...
NO1 WorldWide Amil baba in pakistan Amil Baba in Karachi Black Magic Islamaba...NO1 WorldWide Amil baba in pakistan Amil Baba in Karachi Black Magic Islamaba...
NO1 WorldWide Amil baba in pakistan Amil Baba in Karachi Black Magic Islamaba...Amil baba
 
Call Girls in Faridabad 9000000000 Faridabad Escorts Service
Call Girls in Faridabad 9000000000 Faridabad Escorts ServiceCall Girls in Faridabad 9000000000 Faridabad Escorts Service
Call Girls in Faridabad 9000000000 Faridabad Escorts ServiceTina Ji
 
QUIZ BOLLYWOOD ( weekly quiz ) - SJU quizzers
QUIZ BOLLYWOOD ( weekly quiz ) - SJU quizzersQUIZ BOLLYWOOD ( weekly quiz ) - SJU quizzers
QUIZ BOLLYWOOD ( weekly quiz ) - SJU quizzersSJU Quizzers
 
No,1 Amil baba Islamabad Astrologer in Karachi amil baba in pakistan amil bab...
No,1 Amil baba Islamabad Astrologer in Karachi amil baba in pakistan amil bab...No,1 Amil baba Islamabad Astrologer in Karachi amil baba in pakistan amil bab...
No,1 Amil baba Islamabad Astrologer in Karachi amil baba in pakistan amil bab...Amil Baba Company
 
Authentic No 1 Amil Baba In Pakistan Authentic No 1 Amil Baba In Karachi No 1...
Authentic No 1 Amil Baba In Pakistan Authentic No 1 Amil Baba In Karachi No 1...Authentic No 1 Amil Baba In Pakistan Authentic No 1 Amil Baba In Karachi No 1...
Authentic No 1 Amil Baba In Pakistan Authentic No 1 Amil Baba In Karachi No 1...First NO1 World Amil baba in Faisalabad
 
Call Girls In Moti Bagh (8377877756 )-Genuine Rate Girls Service
Call Girls In Moti Bagh (8377877756 )-Genuine Rate Girls ServiceCall Girls In Moti Bagh (8377877756 )-Genuine Rate Girls Service
Call Girls In Moti Bagh (8377877756 )-Genuine Rate Girls Servicedollysharma2066
 
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377087607
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377087607FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377087607
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377087607dollysharma2066
 
VIP Call Girls In Goa 7028418221 Call Girls In Baga Beach Escorts Service
VIP Call Girls In Goa 7028418221 Call Girls In Baga Beach Escorts ServiceVIP Call Girls In Goa 7028418221 Call Girls In Baga Beach Escorts Service
VIP Call Girls In Goa 7028418221 Call Girls In Baga Beach Escorts ServiceApsara Of India
 
Call Girls Prahlad Nagar 9920738301 Ridhima Hire Me Full Night
Call Girls Prahlad Nagar 9920738301 Ridhima Hire Me Full NightCall Girls Prahlad Nagar 9920738301 Ridhima Hire Me Full Night
Call Girls Prahlad Nagar 9920738301 Ridhima Hire Me Full Nightssuser7cb4ff
 
North Avenue Call Girls Services, Hire Now for Full Fun
North Avenue Call Girls Services, Hire Now for Full FunNorth Avenue Call Girls Services, Hire Now for Full Fun
North Avenue Call Girls Services, Hire Now for Full FunKomal Khan
 
Call Girl Contact Number Andheri WhatsApp:+91-9833363713
Call Girl Contact Number Andheri WhatsApp:+91-9833363713Call Girl Contact Number Andheri WhatsApp:+91-9833363713
Call Girl Contact Number Andheri WhatsApp:+91-9833363713Sonam Pathan
 
Call Girls CG Road 7397865700 Independent Call Girls
Call Girls CG Road 7397865700  Independent Call GirlsCall Girls CG Road 7397865700  Independent Call Girls
Call Girls CG Road 7397865700 Independent Call Girlsssuser7cb4ff
 
8377087607 Full Enjoy @24/7 Call Girls in Patel Nagar Delhi NCR
8377087607 Full Enjoy @24/7 Call Girls in Patel Nagar Delhi NCR8377087607 Full Enjoy @24/7 Call Girls in Patel Nagar Delhi NCR
8377087607 Full Enjoy @24/7 Call Girls in Patel Nagar Delhi NCRdollysharma2066
 
(伦敦大学毕业证学位证成绩单-PDF版)
(伦敦大学毕业证学位证成绩单-PDF版)(伦敦大学毕业证学位证成绩单-PDF版)
(伦敦大学毕业证学位证成绩单-PDF版)twfkn8xj
 
Real NO1 Amil baba in Faisalabad Kala jadu in faisalabad Aamil baba Faisalaba...
Real NO1 Amil baba in Faisalabad Kala jadu in faisalabad Aamil baba Faisalaba...Real NO1 Amil baba in Faisalabad Kala jadu in faisalabad Aamil baba Faisalaba...
Real NO1 Amil baba in Faisalabad Kala jadu in faisalabad Aamil baba Faisalaba...Amil Baba Company
 
ViP Call Girls In Udaipur 9602870969 Gulab Bagh Escorts SeRvIcE
ViP Call Girls In Udaipur 9602870969 Gulab Bagh Escorts SeRvIcEViP Call Girls In Udaipur 9602870969 Gulab Bagh Escorts SeRvIcE
ViP Call Girls In Udaipur 9602870969 Gulab Bagh Escorts SeRvIcEApsara Of India
 
NO1 Certified Black magic specialist,Expert in Pakistan Amil Baba kala ilam E...
NO1 Certified Black magic specialist,Expert in Pakistan Amil Baba kala ilam E...NO1 Certified Black magic specialist,Expert in Pakistan Amil Baba kala ilam E...
NO1 Certified Black magic specialist,Expert in Pakistan Amil Baba kala ilam E...Amil Baba Dawood bangali
 
NO1 Certified Black magic/kala jadu,manpasand shadi in lahore,karachi rawalpi...
NO1 Certified Black magic/kala jadu,manpasand shadi in lahore,karachi rawalpi...NO1 Certified Black magic/kala jadu,manpasand shadi in lahore,karachi rawalpi...
NO1 Certified Black magic/kala jadu,manpasand shadi in lahore,karachi rawalpi...Amil baba
 
Cash Payment Contact:- 7028418221 Goa Call Girls Service North Goa Escorts
Cash Payment Contact:- 7028418221 Goa Call Girls Service North Goa EscortsCash Payment Contact:- 7028418221 Goa Call Girls Service North Goa Escorts
Cash Payment Contact:- 7028418221 Goa Call Girls Service North Goa EscortsApsara Of India
 
Statement Of Intent - - Copy.documentfile
Statement Of Intent - - Copy.documentfileStatement Of Intent - - Copy.documentfile
Statement Of Intent - - Copy.documentfilef4ssvxpz62
 

Último (20)

NO1 WorldWide Amil baba in pakistan Amil Baba in Karachi Black Magic Islamaba...
NO1 WorldWide Amil baba in pakistan Amil Baba in Karachi Black Magic Islamaba...NO1 WorldWide Amil baba in pakistan Amil Baba in Karachi Black Magic Islamaba...
NO1 WorldWide Amil baba in pakistan Amil Baba in Karachi Black Magic Islamaba...
 
Call Girls in Faridabad 9000000000 Faridabad Escorts Service
Call Girls in Faridabad 9000000000 Faridabad Escorts ServiceCall Girls in Faridabad 9000000000 Faridabad Escorts Service
Call Girls in Faridabad 9000000000 Faridabad Escorts Service
 
QUIZ BOLLYWOOD ( weekly quiz ) - SJU quizzers
QUIZ BOLLYWOOD ( weekly quiz ) - SJU quizzersQUIZ BOLLYWOOD ( weekly quiz ) - SJU quizzers
QUIZ BOLLYWOOD ( weekly quiz ) - SJU quizzers
 
No,1 Amil baba Islamabad Astrologer in Karachi amil baba in pakistan amil bab...
No,1 Amil baba Islamabad Astrologer in Karachi amil baba in pakistan amil bab...No,1 Amil baba Islamabad Astrologer in Karachi amil baba in pakistan amil bab...
No,1 Amil baba Islamabad Astrologer in Karachi amil baba in pakistan amil bab...
 
Authentic No 1 Amil Baba In Pakistan Authentic No 1 Amil Baba In Karachi No 1...
Authentic No 1 Amil Baba In Pakistan Authentic No 1 Amil Baba In Karachi No 1...Authentic No 1 Amil Baba In Pakistan Authentic No 1 Amil Baba In Karachi No 1...
Authentic No 1 Amil Baba In Pakistan Authentic No 1 Amil Baba In Karachi No 1...
 
Call Girls In Moti Bagh (8377877756 )-Genuine Rate Girls Service
Call Girls In Moti Bagh (8377877756 )-Genuine Rate Girls ServiceCall Girls In Moti Bagh (8377877756 )-Genuine Rate Girls Service
Call Girls In Moti Bagh (8377877756 )-Genuine Rate Girls Service
 
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377087607
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377087607FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377087607
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377087607
 
VIP Call Girls In Goa 7028418221 Call Girls In Baga Beach Escorts Service
VIP Call Girls In Goa 7028418221 Call Girls In Baga Beach Escorts ServiceVIP Call Girls In Goa 7028418221 Call Girls In Baga Beach Escorts Service
VIP Call Girls In Goa 7028418221 Call Girls In Baga Beach Escorts Service
 
Call Girls Prahlad Nagar 9920738301 Ridhima Hire Me Full Night
Call Girls Prahlad Nagar 9920738301 Ridhima Hire Me Full NightCall Girls Prahlad Nagar 9920738301 Ridhima Hire Me Full Night
Call Girls Prahlad Nagar 9920738301 Ridhima Hire Me Full Night
 
North Avenue Call Girls Services, Hire Now for Full Fun
North Avenue Call Girls Services, Hire Now for Full FunNorth Avenue Call Girls Services, Hire Now for Full Fun
North Avenue Call Girls Services, Hire Now for Full Fun
 
Call Girl Contact Number Andheri WhatsApp:+91-9833363713
Call Girl Contact Number Andheri WhatsApp:+91-9833363713Call Girl Contact Number Andheri WhatsApp:+91-9833363713
Call Girl Contact Number Andheri WhatsApp:+91-9833363713
 
Call Girls CG Road 7397865700 Independent Call Girls
Call Girls CG Road 7397865700  Independent Call GirlsCall Girls CG Road 7397865700  Independent Call Girls
Call Girls CG Road 7397865700 Independent Call Girls
 
8377087607 Full Enjoy @24/7 Call Girls in Patel Nagar Delhi NCR
8377087607 Full Enjoy @24/7 Call Girls in Patel Nagar Delhi NCR8377087607 Full Enjoy @24/7 Call Girls in Patel Nagar Delhi NCR
8377087607 Full Enjoy @24/7 Call Girls in Patel Nagar Delhi NCR
 
(伦敦大学毕业证学位证成绩单-PDF版)
(伦敦大学毕业证学位证成绩单-PDF版)(伦敦大学毕业证学位证成绩单-PDF版)
(伦敦大学毕业证学位证成绩单-PDF版)
 
Real NO1 Amil baba in Faisalabad Kala jadu in faisalabad Aamil baba Faisalaba...
Real NO1 Amil baba in Faisalabad Kala jadu in faisalabad Aamil baba Faisalaba...Real NO1 Amil baba in Faisalabad Kala jadu in faisalabad Aamil baba Faisalaba...
Real NO1 Amil baba in Faisalabad Kala jadu in faisalabad Aamil baba Faisalaba...
 
ViP Call Girls In Udaipur 9602870969 Gulab Bagh Escorts SeRvIcE
ViP Call Girls In Udaipur 9602870969 Gulab Bagh Escorts SeRvIcEViP Call Girls In Udaipur 9602870969 Gulab Bagh Escorts SeRvIcE
ViP Call Girls In Udaipur 9602870969 Gulab Bagh Escorts SeRvIcE
 
NO1 Certified Black magic specialist,Expert in Pakistan Amil Baba kala ilam E...
NO1 Certified Black magic specialist,Expert in Pakistan Amil Baba kala ilam E...NO1 Certified Black magic specialist,Expert in Pakistan Amil Baba kala ilam E...
NO1 Certified Black magic specialist,Expert in Pakistan Amil Baba kala ilam E...
 
NO1 Certified Black magic/kala jadu,manpasand shadi in lahore,karachi rawalpi...
NO1 Certified Black magic/kala jadu,manpasand shadi in lahore,karachi rawalpi...NO1 Certified Black magic/kala jadu,manpasand shadi in lahore,karachi rawalpi...
NO1 Certified Black magic/kala jadu,manpasand shadi in lahore,karachi rawalpi...
 
Cash Payment Contact:- 7028418221 Goa Call Girls Service North Goa Escorts
Cash Payment Contact:- 7028418221 Goa Call Girls Service North Goa EscortsCash Payment Contact:- 7028418221 Goa Call Girls Service North Goa Escorts
Cash Payment Contact:- 7028418221 Goa Call Girls Service North Goa Escorts
 
Statement Of Intent - - Copy.documentfile
Statement Of Intent - - Copy.documentfileStatement Of Intent - - Copy.documentfile
Statement Of Intent - - Copy.documentfile
 

Hive - Data Warehousing & Analytics on Hadoop

  • 1. Hive - Data Warehousing & Analytics on Hadoop Wednesday, June 10, 2009 Santa Clara Marriott Namit Jain, Zheng Shao Facebook
  • 2.
  • 3.
  • 4.
  • 5.  
  • 6.
  • 7.
  • 8.
  • 9.
  • 10.
  • 11. Data Warehousing at Facebook Today Facebook Web Servers Scribe Servers Filers Hive on Hadoop Cluster Oracle RAC Federated MySQL
  • 12.
  • 13.
  • 14.
  • 15.
  • 16. Data Model Facebook Logical Partitioning Hash Partitioning clicks HDFS MetaStore /hive/clicks /hive/clicks/ds=2008-03-25 /hive/clicks/ds=2008-03-25/0 … Tables Metastore DB Data Location Bucketing Info Partitioning Cols
  • 17. HIVE: Components Facebook HDFS Hive CLI DDL Queries Browsing Map Reduce MetaStore Thrift API SerDe Thrift CSV JSON.. Execution Parser Planner Web UI Optimizer DB
  • 18.
  • 19.
  • 20.
  • 21.
  • 22. Map Reduce Example Facebook Machine 2 Machine 1 <k1, v1> <k2, v2> <k3, v3> <k4, v4> <k5, v5> <k6, v6> <nk1, nv1> <nk2, nv2> <nk3, nv3> <nk2, nv4> <nk2, nv5> <nk1, nv6> Local Map <nk2, nv4> <nk2, nv5> <nk2, nv2> <nk1, nv1> <nk3, nv3> <nk1, nv6> Global Shuffle <nk1, nv1> <nk1, nv6> <nk3, nv3> <nk2, nv4> <nk2, nv5> <nk2, nv2> Local Sort <nk2, 3> <nk1, 2> <nk3, 1> Local Reduce
  • 23.
  • 24. Hive QL – Join in Map Reduce Facebook page_view user pv_users Map Reduce key value 111 < 1, 1> 111 < 1, 2> 222 < 1, 1> pageid userid time 1 111 9:08:01 2 111 9:08:13 1 222 9:08:14 userid age gender 111 25 female 222 32 male key value 111 < 2, 25> 222 < 2, 32> key value 111 < 1, 1> 111 < 1, 2> 111 < 2, 25> key value 222 < 1, 1> 222 < 2, 32> Shuffle Sort Pageid age 1 25 2 25 pageid age 1 32
  • 25.
  • 26. Hive QL – Map Join Facebook page_view user Hash table pv_users key value 111 <1,2> 222 <2> pageid userid time 1 111 9:08:01 2 111 9:08:13 1 222 9:08:14 userid age gender 111 25 female 222 32 male Pageid age 1 25 2 25 1 32
  • 27.
  • 28. Hive QL – Group By in Map Reduce Facebook pv_users Map Reduce pageid age 1 25 1 25 pageid age count 1 25 3 pageid age 2 32 1 25 key value <1,25> 2 key value <1,25> 1 <2,32> 1 key value <1,25> 2 <1,25> 1 key value <2,32> 1 Shuffle Sort pageid age count 2 32 1
  • 29.
  • 30.
  • 31.
  • 32.
  • 33.
  • 34.
  • 35.
  • 36.
  • 37.
  • 38.
  • 39.
  • 40.
  • 41.
  • 42.

Notas del editor

  1. What is this? This is huge amount of data. Along with the fast growth of active users on Facebook, the size of our data is exploding. In the last 12 months, the amount of data increased by 500%. These data are very valuable. They can be used to understand the user behavior, measure the impact of a new product, and make data-based decisions. Traditionally people store data in data warehouse solutions on top of Oracle and MySQL. In the recent years, we are also seeing new proprietary solutions like AsterData and Netezza. However, these solutions either do not scale to the amount of data that we have, or they are very inflexible that cannot satisfy our data analysis requirements. In order to provide the capability to analyze the huge amount of data that we have, we started the Hive project. Hive is based on Hadoop but does much more than Hadoop. We will show the details in the following slides. ============