Enviar búsqueda
Cargar
Hive sq lfor-hadoop
•
1 recomendación
•
834 vistas
Pragati Singh
Seguir
Hive hadoop learning hive for bigdata,Hive hadoop learning hive for bigdata
Leer menos
Leer más
Ingeniería
Tecnología
Vista de diapositivas
Denunciar
Compartir
Vista de diapositivas
Denunciar
Compartir
1 de 59
Descargar ahora
Descargar para leer sin conexión
Recomendados
Getting started with Hadoop, Hive, and Elastic MapReduce
Getting started with Hadoop, Hive, and Elastic MapReduce
obdit
Data Analysis with Hadoop and Hive, ChicagoDB 2/21/2011
Data Analysis with Hadoop and Hive, ChicagoDB 2/21/2011
Jonathan Seidman
Introduction to the Hadoop Ecosystem with Hadoop 2.0 aka YARN (Java Serbia Ed...
Introduction to the Hadoop Ecosystem with Hadoop 2.0 aka YARN (Java Serbia Ed...
Uwe Printz
Jan 2013 HUG: Cloud-Friendly Hadoop and Hive
Jan 2013 HUG: Cloud-Friendly Hadoop and Hive
Yahoo Developer Network
Introduction to the Hadoop Ecosystem (IT-Stammtisch Darmstadt Edition)
Introduction to the Hadoop Ecosystem (IT-Stammtisch Darmstadt Edition)
Uwe Printz
Practical Problem Solving with Apache Hadoop & Pig
Practical Problem Solving with Apache Hadoop & Pig
Milind Bhandarkar
SQL in Hadoop
SQL in Hadoop
Sven Bayer
Hadoop basics
Hadoop basics
Antonio Silveira
Recomendados
Getting started with Hadoop, Hive, and Elastic MapReduce
Getting started with Hadoop, Hive, and Elastic MapReduce
obdit
Data Analysis with Hadoop and Hive, ChicagoDB 2/21/2011
Data Analysis with Hadoop and Hive, ChicagoDB 2/21/2011
Jonathan Seidman
Introduction to the Hadoop Ecosystem with Hadoop 2.0 aka YARN (Java Serbia Ed...
Introduction to the Hadoop Ecosystem with Hadoop 2.0 aka YARN (Java Serbia Ed...
Uwe Printz
Jan 2013 HUG: Cloud-Friendly Hadoop and Hive
Jan 2013 HUG: Cloud-Friendly Hadoop and Hive
Yahoo Developer Network
Introduction to the Hadoop Ecosystem (IT-Stammtisch Darmstadt Edition)
Introduction to the Hadoop Ecosystem (IT-Stammtisch Darmstadt Edition)
Uwe Printz
Practical Problem Solving with Apache Hadoop & Pig
Practical Problem Solving with Apache Hadoop & Pig
Milind Bhandarkar
SQL in Hadoop
SQL in Hadoop
Sven Bayer
Hadoop basics
Hadoop basics
Antonio Silveira
Hadoop - Overview
Hadoop - Overview
Jay
Facebooks Petabyte Scale Data Warehouse using Hive and Hadoop
Facebooks Petabyte Scale Data Warehouse using Hive and Hadoop
royans
Intro to Hadoop
Intro to Hadoop
jeffturner
מיכאל
מיכאל
sqlserver.co.il
Hadoop Family and Ecosystem
Hadoop Family and Ecosystem
tcloudcomputing-tw
Integration of HIve and HBase
Integration of HIve and HBase
Hortonworks
Introduction to Hadoop
Introduction to Hadoop
joelcrabb
An Introduction to Hadoop
An Introduction to Hadoop
DerrekYoungDotCom
Hadoop pig
Hadoop pig
Sean Murphy
The Hadoop Ecosystem
The Hadoop Ecosystem
J Singh
Hadoop overview
Hadoop overview
Deborah Akuoko
Hadoop online training
Hadoop online training
Keylabs
Learning Apache HIVE - Data Warehouse and Query Language for Hadoop
Learning Apache HIVE - Data Warehouse and Query Language for Hadoop
Someshwar Kale
Hadoop
Hadoop
Peerapat Asoktummarungsri
6.hive
6.hive
Prashant Gupta
Introduction to the Hadoop Ecosystem (SEACON Edition)
Introduction to the Hadoop Ecosystem (SEACON Edition)
Uwe Printz
Introduction to Pig
Introduction to Pig
Prashanth Babu
Introduction to the Hadoop Ecosystem (codemotion Edition)
Introduction to the Hadoop Ecosystem (codemotion Edition)
Uwe Printz
Hadoop hive presentation
Hadoop hive presentation
Arvind Kumar
20131205 hadoop-hdfs-map reduce-introduction
20131205 hadoop-hdfs-map reduce-introduction
Xuan-Chao Huang
20140708hcj
20140708hcj
Hatayama Hideharu
Apache hive
Apache hive
Vaibhav Kadu
Más contenido relacionado
La actualidad más candente
Hadoop - Overview
Hadoop - Overview
Jay
Facebooks Petabyte Scale Data Warehouse using Hive and Hadoop
Facebooks Petabyte Scale Data Warehouse using Hive and Hadoop
royans
Intro to Hadoop
Intro to Hadoop
jeffturner
מיכאל
מיכאל
sqlserver.co.il
Hadoop Family and Ecosystem
Hadoop Family and Ecosystem
tcloudcomputing-tw
Integration of HIve and HBase
Integration of HIve and HBase
Hortonworks
Introduction to Hadoop
Introduction to Hadoop
joelcrabb
An Introduction to Hadoop
An Introduction to Hadoop
DerrekYoungDotCom
Hadoop pig
Hadoop pig
Sean Murphy
The Hadoop Ecosystem
The Hadoop Ecosystem
J Singh
Hadoop overview
Hadoop overview
Deborah Akuoko
Hadoop online training
Hadoop online training
Keylabs
Learning Apache HIVE - Data Warehouse and Query Language for Hadoop
Learning Apache HIVE - Data Warehouse and Query Language for Hadoop
Someshwar Kale
Hadoop
Hadoop
Peerapat Asoktummarungsri
6.hive
6.hive
Prashant Gupta
Introduction to the Hadoop Ecosystem (SEACON Edition)
Introduction to the Hadoop Ecosystem (SEACON Edition)
Uwe Printz
Introduction to Pig
Introduction to Pig
Prashanth Babu
Introduction to the Hadoop Ecosystem (codemotion Edition)
Introduction to the Hadoop Ecosystem (codemotion Edition)
Uwe Printz
Hadoop hive presentation
Hadoop hive presentation
Arvind Kumar
20131205 hadoop-hdfs-map reduce-introduction
20131205 hadoop-hdfs-map reduce-introduction
Xuan-Chao Huang
La actualidad más candente
(20)
Hadoop - Overview
Hadoop - Overview
Facebooks Petabyte Scale Data Warehouse using Hive and Hadoop
Facebooks Petabyte Scale Data Warehouse using Hive and Hadoop
Intro to Hadoop
Intro to Hadoop
מיכאל
מיכאל
Hadoop Family and Ecosystem
Hadoop Family and Ecosystem
Integration of HIve and HBase
Integration of HIve and HBase
Introduction to Hadoop
Introduction to Hadoop
An Introduction to Hadoop
An Introduction to Hadoop
Hadoop pig
Hadoop pig
The Hadoop Ecosystem
The Hadoop Ecosystem
Hadoop overview
Hadoop overview
Hadoop online training
Hadoop online training
Learning Apache HIVE - Data Warehouse and Query Language for Hadoop
Learning Apache HIVE - Data Warehouse and Query Language for Hadoop
Hadoop
Hadoop
6.hive
6.hive
Introduction to the Hadoop Ecosystem (SEACON Edition)
Introduction to the Hadoop Ecosystem (SEACON Edition)
Introduction to Pig
Introduction to Pig
Introduction to the Hadoop Ecosystem (codemotion Edition)
Introduction to the Hadoop Ecosystem (codemotion Edition)
Hadoop hive presentation
Hadoop hive presentation
20131205 hadoop-hdfs-map reduce-introduction
20131205 hadoop-hdfs-map reduce-introduction
Destacado
20140708hcj
20140708hcj
Hatayama Hideharu
Apache hive
Apache hive
Vaibhav Kadu
Apache Hive
Apache Hive
Ajit Koti
Big Data Timeline
Big Data Timeline
DeZyre
A Big Data Timeline
A Big Data Timeline
Big Cloud
Introduction to HiveQL
Introduction to HiveQL
kristinferrier
SQL in the Hybrid World
SQL in the Hybrid World
Tanel Poder
Hadoop World 2011: Replacing RDB/DW with Hadoop and Hive for Telco Big Data -...
Hadoop World 2011: Replacing RDB/DW with Hadoop and Hive for Telco Big Data -...
Cloudera, Inc.
Hadoop Internals (2.3.0 or later)
Hadoop Internals (2.3.0 or later)
Emilio Coppa
SQL Monitoring in Oracle Database 12c
SQL Monitoring in Oracle Database 12c
Tanel Poder
Hadoop Interview Questions and Answers by rohit kapa
Hadoop Interview Questions and Answers by rohit kapa
kapa rohit
Hadoop Interview Questions and Answers | Big Data Interview Questions | Hadoo...
Hadoop Interview Questions and Answers | Big Data Interview Questions | Hadoo...
Edureka!
HIVE: Data Warehousing & Analytics on Hadoop
HIVE: Data Warehousing & Analytics on Hadoop
Zheng Shao
Destacado
(13)
20140708hcj
20140708hcj
Apache hive
Apache hive
Apache Hive
Apache Hive
Big Data Timeline
Big Data Timeline
A Big Data Timeline
A Big Data Timeline
Introduction to HiveQL
Introduction to HiveQL
SQL in the Hybrid World
SQL in the Hybrid World
Hadoop World 2011: Replacing RDB/DW with Hadoop and Hive for Telco Big Data -...
Hadoop World 2011: Replacing RDB/DW with Hadoop and Hive for Telco Big Data -...
Hadoop Internals (2.3.0 or later)
Hadoop Internals (2.3.0 or later)
SQL Monitoring in Oracle Database 12c
SQL Monitoring in Oracle Database 12c
Hadoop Interview Questions and Answers by rohit kapa
Hadoop Interview Questions and Answers by rohit kapa
Hadoop Interview Questions and Answers | Big Data Interview Questions | Hadoo...
Hadoop Interview Questions and Answers | Big Data Interview Questions | Hadoo...
HIVE: Data Warehousing & Analytics on Hadoop
HIVE: Data Warehousing & Analytics on Hadoop
Similar a Hive sq lfor-hadoop
Dallas TDWI Meeting Dec. 2012: Hadoop
Dallas TDWI Meeting Dec. 2012: Hadoop
lamont_lockwood
Why Scala Is Taking Over the Big Data World
Why Scala Is Taking Over the Big Data World
Dean Wampler
Cloud Austin Meetup - Hadoop like a champion
Cloud Austin Meetup - Hadoop like a champion
Ameet Paranjape
Agile analytics applications on hadoop
Agile analytics applications on hadoop
Hortonworks
Hortonworks: Agile Analytics Applications
Hortonworks: Agile Analytics Applications
russell_jurney
Finding the needles in the haystack. An Overview of Analyzing Big Data with H...
Finding the needles in the haystack. An Overview of Analyzing Big Data with H...
Chris Baglieri
Hadoop, SQL & NoSQL: No Longer an Either-or Question
Hadoop, SQL & NoSQL: No Longer an Either-or Question
Tony Baer
Hadoop, SQL and NoSQL, No longer an either/or question
Hadoop, SQL and NoSQL, No longer an either/or question
DataWorks Summit
Dancing elephants - efficiently working with object stores from Apache Spark ...
Dancing elephants - efficiently working with object stores from Apache Spark ...
DataWorks Summit
Hadoop crash course workshop at Hadoop Summit
Hadoop crash course workshop at Hadoop Summit
DataWorks Summit
Running Hadoop as Service in AltiScale Platform
Running Hadoop as Service in AltiScale Platform
InMobi Technology
2014 july 24_what_ishadoop
2014 july 24_what_ishadoop
Adam Muise
Sql on everything with drill
Sql on everything with drill
Julien Le Dem
Hadoop demo ppt
Hadoop demo ppt
Phil Young
OPERATING SYSTEM .pptx
OPERATING SYSTEM .pptx
AltafKhadim
Hadoop in action
Hadoop in action
Mahmoud Yassin
Big data-at-detik
Big data-at-detik
k4ndar
A gentle introduction to the world of BigData and Hadoop
A gentle introduction to the world of BigData and Hadoop
Stefano Paluello
Utrecht NL-HUG/Data Science-NL - Agile Data Slides
Utrecht NL-HUG/Data Science-NL - Agile Data Slides
Hortonworks
Paris HUG - Agile Analytics Applications on Hadoop
Paris HUG - Agile Analytics Applications on Hadoop
Hortonworks
Similar a Hive sq lfor-hadoop
(20)
Dallas TDWI Meeting Dec. 2012: Hadoop
Dallas TDWI Meeting Dec. 2012: Hadoop
Why Scala Is Taking Over the Big Data World
Why Scala Is Taking Over the Big Data World
Cloud Austin Meetup - Hadoop like a champion
Cloud Austin Meetup - Hadoop like a champion
Agile analytics applications on hadoop
Agile analytics applications on hadoop
Hortonworks: Agile Analytics Applications
Hortonworks: Agile Analytics Applications
Finding the needles in the haystack. An Overview of Analyzing Big Data with H...
Finding the needles in the haystack. An Overview of Analyzing Big Data with H...
Hadoop, SQL & NoSQL: No Longer an Either-or Question
Hadoop, SQL & NoSQL: No Longer an Either-or Question
Hadoop, SQL and NoSQL, No longer an either/or question
Hadoop, SQL and NoSQL, No longer an either/or question
Dancing elephants - efficiently working with object stores from Apache Spark ...
Dancing elephants - efficiently working with object stores from Apache Spark ...
Hadoop crash course workshop at Hadoop Summit
Hadoop crash course workshop at Hadoop Summit
Running Hadoop as Service in AltiScale Platform
Running Hadoop as Service in AltiScale Platform
2014 july 24_what_ishadoop
2014 july 24_what_ishadoop
Sql on everything with drill
Sql on everything with drill
Hadoop demo ppt
Hadoop demo ppt
OPERATING SYSTEM .pptx
OPERATING SYSTEM .pptx
Hadoop in action
Hadoop in action
Big data-at-detik
Big data-at-detik
A gentle introduction to the world of BigData and Hadoop
A gentle introduction to the world of BigData and Hadoop
Utrecht NL-HUG/Data Science-NL - Agile Data Slides
Utrecht NL-HUG/Data Science-NL - Agile Data Slides
Paris HUG - Agile Analytics Applications on Hadoop
Paris HUG - Agile Analytics Applications on Hadoop
Más de Pragati Singh
Nessus Scanner: Network Scanning from Beginner to Advanced!
Nessus Scanner: Network Scanning from Beginner to Advanced!
Pragati Singh
Tenable Certified Sales Associate - CS.pdf
Tenable Certified Sales Associate - CS.pdf
Pragati Singh
Analyzing risk (pmbok® guide sixth edition)
Analyzing risk (pmbok® guide sixth edition)
Pragati Singh
Pragati Singh | Sap Badge
Pragati Singh | Sap Badge
Pragati Singh
Ios record of achievement
Ios record of achievement
Pragati Singh
Ios2 confirmation ofparticipation
Ios2 confirmation ofparticipation
Pragati Singh
Certificate of completion android studio essential training 2016
Certificate of completion android studio essential training 2016
Pragati Singh
Certificate of completion android development essential training create your ...
Certificate of completion android development essential training create your ...
Pragati Singh
Certificate of completion android development essential training design a use...
Certificate of completion android development essential training design a use...
Pragati Singh
Certificate of completion android development essential training support mult...
Certificate of completion android development essential training support mult...
Pragati Singh
Certificate of completion android development essential training manage navig...
Certificate of completion android development essential training manage navig...
Pragati Singh
Certificate of completion android development essential training local data s...
Certificate of completion android development essential training local data s...
Pragati Singh
Certificate of completion android development essential training distributing...
Certificate of completion android development essential training distributing...
Pragati Singh
Certificate of completion android app development communicating with the user
Certificate of completion android app development communicating with the user
Pragati Singh
Certificate of completion building flexible android apps with the fragments api
Certificate of completion building flexible android apps with the fragments api
Pragati Singh
Certificate of completion android app development design patterns for mobile ...
Certificate of completion android app development design patterns for mobile ...
Pragati Singh
Certificate of completion java design patterns and apis for android
Certificate of completion java design patterns and apis for android
Pragati Singh
Certificate of completion android development concurrent programming
Certificate of completion android development concurrent programming
Pragati Singh
Certificate of completion android app development data persistence libraries
Certificate of completion android app development data persistence libraries
Pragati Singh
Certificate of completion android app development restful web services
Certificate of completion android app development restful web services
Pragati Singh
Más de Pragati Singh
(20)
Nessus Scanner: Network Scanning from Beginner to Advanced!
Nessus Scanner: Network Scanning from Beginner to Advanced!
Tenable Certified Sales Associate - CS.pdf
Tenable Certified Sales Associate - CS.pdf
Analyzing risk (pmbok® guide sixth edition)
Analyzing risk (pmbok® guide sixth edition)
Pragati Singh | Sap Badge
Pragati Singh | Sap Badge
Ios record of achievement
Ios record of achievement
Ios2 confirmation ofparticipation
Ios2 confirmation ofparticipation
Certificate of completion android studio essential training 2016
Certificate of completion android studio essential training 2016
Certificate of completion android development essential training create your ...
Certificate of completion android development essential training create your ...
Certificate of completion android development essential training design a use...
Certificate of completion android development essential training design a use...
Certificate of completion android development essential training support mult...
Certificate of completion android development essential training support mult...
Certificate of completion android development essential training manage navig...
Certificate of completion android development essential training manage navig...
Certificate of completion android development essential training local data s...
Certificate of completion android development essential training local data s...
Certificate of completion android development essential training distributing...
Certificate of completion android development essential training distributing...
Certificate of completion android app development communicating with the user
Certificate of completion android app development communicating with the user
Certificate of completion building flexible android apps with the fragments api
Certificate of completion building flexible android apps with the fragments api
Certificate of completion android app development design patterns for mobile ...
Certificate of completion android app development design patterns for mobile ...
Certificate of completion java design patterns and apis for android
Certificate of completion java design patterns and apis for android
Certificate of completion android development concurrent programming
Certificate of completion android development concurrent programming
Certificate of completion android app development data persistence libraries
Certificate of completion android app development data persistence libraries
Certificate of completion android app development restful web services
Certificate of completion android app development restful web services
Último
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptx
upamatechverse
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptx
upamatechverse
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )
Tsuyoshi Horigome
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
9953056974 Low Rate Call Girls In Saket, Delhi NCR
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptx
purnimasatapathy1234
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
RajkumarAkumalla
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writing
rakeshbaidya232001
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur High Profile
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
Suhani Kapoor
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Suman Mia
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
ranjana rawat
result management system report for college project
result management system report for college project
Tonystark477637
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls in Nagpur High Profile
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
ranjana rawat
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptx
Asutosh Ranjan
UNIT-II FMM-Flow Through Circular Conduits
UNIT-II FMM-Flow Through Circular Conduits
rknatarajan
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
ranjana rawat
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
SIVASHANKAR N
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performance
sivaprakash250
Extrusion Processes and Their Limitations
Extrusion Processes and Their Limitations
120cr0395
Último
(20)
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptx
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptx
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptx
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writing
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
result management system report for college project
result management system report for college project
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptx
UNIT-II FMM-Flow Through Circular Conduits
UNIT-II FMM-Flow Through Circular Conduits
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performance
Extrusion Processes and Their Limitations
Extrusion Processes and Their Limitations
Hive sq lfor-hadoop
1.
Copyright © 2011-‐2012,
Think Big Analy8cs, All Rights Reserved Hive: SQL for Hadoop An Essen8al Tool for Hadoop-‐based Data Warehouses Tuesday, January 10, 12 I’ll argue that Hive is indispensable to people crea8ng “data warehouses” with Hadoop, because it gives them a “similar” SQL interface to their data, making it easier to migrate skills and even apps from exis8ng rela8onal tools to Hadoop.
2.
Copyright © 2011-‐2012,
Think Big Analy8cs, All Rights Reserved Adapted from our Hadoop 3-Day Developer Course info@thinkbiganalytics.com thinkbiganalytics.com Tuesday, January 10, 12 These notes are (heavily) adapted from our Hadoop 3-‐day developer course. Of course, we won’t do exercises in this talk ;) The second day focuses on Hive. We also offer Hive training by itself, admin courses, etc.
3.
Copyright © 2011-‐2012,
Think Big Analy8cs, All Rights Reserved Programming Hive Summer, 2012 O’Reilly Tuesday, January 10, 12 I’m collabora8ng with two other colleagues to write “Programming Hive”, which will be published by O’Reilly this summer. Un8l it’s ready, the best source of documenta8on is on the Hive site, hYp://hive.apache.org.
4.
Copyright © 2011-‐2012,
Think Big Analy8cs, All Rights Reserved4 programmingscala.compolyglotprogramming.com/fpjava Dean Wampler Functional Programming for Java Developers Dean Wampler 20+ years exp. Think Big Academy Tuesday, January 10, 12
5.
Copyright © 2011-‐2012,
Think Big Analy8cs, All Rights Reserved5 A True Story... Tuesday, January 10, 12
6.
Once upon an
Internet time, in the land of Enterprise, the King’s Warehouse of Data was bursting at the seams. Tuesday, January 10, 12
7.
And lo, the
scribes decided they could cut costs and store more data by moving to Hadoop. Tuesday, January 10, 12
8.
But then the
scribes complained amongst themselves, “The language of the land of Java hurts our tongues! Tuesday, January 10, 12
9.
“It takes forever
to say anything!” Tuesday, January 10, 12
10.
There was a
scribe who walked through the Kingdom-wide Labyrinth (kwl), where he found The Book of Faces. Tuesday, January 10, 12
11.
In that book,
he found the secret language the Bee Keepers use to query the god Hadoop! Tuesday, January 10, 12
12.
And there was
much rejoicing, for the scribes could work with their data again! Tuesday, January 10, 12
13.
The End. Tuesday, January
10, 12
14.
Copyright © 2011-‐2012,
Think Big Analy8cs, All Rights Reserved14 Enter Hive Tuesday, January 10, 12
15.
Copyright © 2011-‐2012,
Think Big Analy8cs, All Rights Reserved15 Since your team knows SQL and all your DataWarehouse apps are written in SQL, Hive minimizes the effort of migrating to Hadoop. Tuesday, January 10, 12
16.
Copyright © 2011-‐2012,
Think Big Analy8cs, All Rights Reserved • Ideal for data warehousing. • Ad-hoc queries of data. • Familiar SQL dialect. • Analysis of large data sets. • Hadoop MapReduce jobs. 16 Hive Tuesday, January 10, 12 Hive is a killer app, in our opinion, for data warehouse teams migrating to Hadoop, because it gives them a familiar SQL language that hides the complexity of MR programming.
17.
Copyright © 2011-‐2012,
Think Big Analy8cs, All Rights Reserved • Invented at Facebook. • Open sourced to Apache in 2008. • http://hive.apache.org 17 Hive Tuesday, January 10, 12
18.
Copyright © 2011-‐2012,
Think Big Analy8cs, All Rights Reserved18 A Scenario: Mining Daily Click Stream Logs Tuesday, January 10, 12
19.
Copyright © 2011-‐2012,
Think Big Analy8cs, All Rights Reserved19 Ingest & Transform: • From: file://server1/var/log/clicks.log Jan 9 09:02:17 server1 movies[18]: 1234: search for “vampires in love”. … Tuesday, January 10, 12 As we copy the daily click stream log over to a local staging location, we transform it into the Hive table format we want.
20.
Copyright © 2011-‐2012,
Think Big Analy8cs, All Rights Reserved20 Ingest & Transform: • From: file://server1/var/log/clicks.log Jan 9 09:02:17 server1 movies[18]: 1234: search for “vampires in love”. … Timestamp Tuesday, January 10, 12
21.
Copyright © 2011-‐2012,
Think Big Analy8cs, All Rights Reserved21 Ingest & Transform: • From: file://server1/var/log/clicks.log Jan 9 09:02:17 server1 movies[18]: 1234: search for “vampires in love”. … The server Tuesday, January 10, 12
22.
Copyright © 2011-‐2012,
Think Big Analy8cs, All Rights Reserved22 Ingest & Transform: • From: file://server1/var/log/clicks.log Jan 9 09:02:17 server1 movies[18]: 1234: search for “vampires in love”. … The process (“movies search”) and the process id. Tuesday, January 10, 12
23.
Copyright © 2011-‐2012,
Think Big Analy8cs, All Rights Reserved23 Ingest & Transform: • From: file://server1/var/log/clicks.log Jan 9 09:02:17 server1 movies[18]: 1234: search for “vampires in love”. … Customer id Tuesday, January 10, 12
24.
Copyright © 2011-‐2012,
Think Big Analy8cs, All Rights Reserved24 Ingest & Transform: • From: file://server1/var/log/clicks.log Jan 9 09:02:17 server1 movies[18]: 1234: search for “vampires in love”. … The log “message” Tuesday, January 10, 12
25.
Copyright © 2011-‐2012,
Think Big Analy8cs, All Rights Reserved25 Ingest & Transform: • From: file://server1/var/log/clicks.log 09:02:17^Aserver1^Amovies^A18^A1234^Asea rch for “vampires in love”. … • To: /staging/2012-01-09.log Jan 9 09:02:17 server1 movies[18]: 1234: search for “vampires in love”. … Tuesday, January 10, 12 As we copy the daily click stream log over to a local staging location, we transform it into the Hive table format we want.
26.
Copyright © 2011-‐2012,
Think Big Analy8cs, All Rights Reserved26 Ingest & Transform: 09:02:17^Aserver1^Amovies^A18^A1234^Asea rch for “vampires in love”. … • To: /staging/2012-01-09.log • Removed month (Jan) and day (09). • Added ^A as field separators (Hive convention). • Separated process id from process name. Tuesday, January 10, 12 The transformations we made. (You could use many different Linux, scripting, code, or Hadoop-related ingestion tools to do this.
27.
Copyright © 2011-‐2012,
Think Big Analy8cs, All Rights Reserved27 Ingest & Transform: hadoop fs -put /staging/2012-01-09.log /clicks/2012/01/09/log.txt • Put in HDFS: • (The final file name doesn’t matter…) Tuesday, January 10, 12 Here we use the hadoop shell command to put the file where we want it in the file system. Note that the name of the target file doesn’t matter; we’ll just tell Hive to read all files in the directory, so there could be many files there!
28.
Copyright © 2011-‐2012,
Think Big Analy8cs, All Rights Reserved28 Back to Hive... CREATE EXTERNAL TABLE clicks ( hms STRING, hostname STRING, process STRING, pid INT, uid INT, message STRING) PARTITIONED BY ( year INT, month INT, day INT); • Create an external Hive table: You don’t have to use EXTERNAL and PARTITIONED together…. Tuesday, January 10, 12 Now let’s create an “external” table that will read those files as the “backing store”. Also, we make it partitioned to accelerate queries that limit by year, month or day. (You don’t have to use external and partitioned together…)
29.
Copyright © 2011-‐2012,
Think Big Analy8cs, All Rights Reserved29 Back to Hive... ALTER TABLE clicks ADD IF NOT EXISTS PARTITION ( year=2012, month=01, day=09) LOCATION '/clicks/2012/01/09'; • Add a partition for 2012-01-09: • A directory in HDFS. Tuesday, January 10, 12 We add a partition for each day. Note the LOCATION path, which is a the directory where we wrote our file.
30.
Copyright © 2011-‐2012,
Think Big Analy8cs, All Rights Reserved30 Now,Analyze!! SELECT hms, uid, message FROM clicks WHERE message LIKE '%vampire%' AND year = 2012 AND month = 01 AND day = 09; • What’s with the kids and vampires?? • After some MapReduce crunching... … 09:02:29 1234 search for “twilight of the vampires” 09:02:35 1234 add to cart “vampires want their genre back” … Tuesday, January 10, 12 And we can run SQL queries!!
31.
Copyright © 2011-‐2012,
Think Big Analy8cs, All Rights Reserved • SQL analysis with Hive. • Other tools can use the data, too. • Massive scalability with Hadoop. 31 Recap Tuesday, January 10, 12
32.
Copyright © 2011-‐2012,
Think Big Analy8cs, All Rights Reserved32 Hadoop Master ✲ Job Tracker Name Node HDFS Hive Driver (compiles, optimizes, executes) CLI HWI Thrift Server Metastore JDBC ODBC Karmasphere Others... Tuesday, January 10, 12 Hive queries generate MR jobs. (Some operations don’t invoke Hadoop processes, e.g., some very simple queries and commands that just write updates to the metastore.) CLI = Command Line Interface. HWI = Hive Web Interface.
33.
Copyright © 2011-‐2012,
Think Big Analy8cs, All Rights Reserved • HDFS • MapR • S3 • HBase (new) • Others... 33 Tables Hadoop Master ✲ Job Tracker Name Node HDFS Hive Driver (compiles, optimizes, executes) CLI HWI Thrift Server Metastore JDBC ODBC Karmasphere Others... Tuesday, January 10, 12 There is “early” support for using Hive with HBase. Other databases and distributed file systems will no doubt follow.
34.
Copyright © 2011-‐2012,
Think Big Analy8cs, All Rights Reserved • Table metadata stored in a relational DB. 34 Tables Hadoop Master ✲ Job Tracker Name Node HDFS Hive Driver (compiles, optimizes, executes) CLI HWI Thrift Server Metastore JDBC ODBC Karmasphere Others... Tuesday, January 10, 12 For production, you need to set up a MySQL or PostgreSQL database for Hive’s metadata. Out of the box, Hive uses a Derby DB, but it can only be used by a single user and a single process at a time, so it’s fine for personal development only.
35.
Copyright © 2011-‐2012,
Think Big Analy8cs, All Rights Reserved • Most queries use MapReduce jobs. 35 Queries Hadoop Master ✲ Job Tracker Name Node HDFS Hive Driver (compiles, optimizes, executes) CLI HWI Thrift Server Metastore JDBC ODBC Karmasphere Others... Tuesday, January 10, 12 Hive generates MapReduce jobs to implement all the but the simplest queries.
36.
Copyright © 2011-‐2012,
Think Big Analy8cs, All Rights Reserved • Benefits • Horizontal scalability. • Drawbacks • Latency! 36 MapReduce Queries Tuesday, January 10, 12 The high latency makes Hive unsuitable for “online” database use. (Hive also doesn’t support transactions and has other limitations that are relevant here…) So, these limitations make Hive best for offline (batch mode) use, such as data warehouse apps.
37.
Copyright © 2011-‐2012,
Think Big Analy8cs, All Rights Reserved • Benefits • Horizontal scalability. • Data redundancy. • Drawbacks • No insert, update, and delete! 37 HDFS Storage Tuesday, January 10, 12 You can generate new tables or write to local files. Forthcoming versions of HDFS will support appending data.
38.
Copyright © 2011-‐2012,
Think Big Analy8cs, All Rights Reserved • Schema on Read • Schema enforcement at query time, not write time. 38 HDFS Storage Tuesday, January 10, 12 Especially for external tables, but even for internal ones since the files are HDFS files, Hive can’t enforce that records written to table files have the specified schema, so it does these checks at query time.
39.
Copyright © 2011-‐2012,
Think Big Analy8cs, All Rights Reserved • No Transactions. • Some SQL features not implemented (yet). 39 Other Limitations Tuesday, January 10, 12
40.
Copyright © 2011-‐2012,
Think Big Analy8cs, All Rights Reserved40 More on Tables and Schemas Tuesday, January 10, 12
41.
Copyright © 2011-‐2012,
Think Big Analy8cs, All Rights Reserved41 Data Types • The usual scalar types: •TINYINT, …, BIGNT. •FLOAT, DOUBLE. •BOOLEAN. •STRING. Tuesday, January 10, 12 Like most databases...
42.
Copyright © 2011-‐2012,
Think Big Analy8cs, All Rights Reserved42 Data Types • The unusual complex types: •STRUCT. •MAP. •ARRAY. Tuesday, January 10, 12 Structs are like “objects” or “c-style structs”. Maps are key-value pairs, and you know what arrays are ;)
43.
Copyright © 2011-‐2012,
Think Big Analy8cs, All Rights Reserved43 CREATE TABLE employees ( name STRING, salary FLOAT, subordinates ARRAY<STRING>, deductions MAP<STRING,FLOAT>, address STRUCT< street:STRING, city:STRING, state:STRING, zip:INT> ); Tuesday, January 10, 12 subordinates references other records by the employee name. (Hive doesn’t have indexes, in the usual sense, but an indexing feature was recently added.) Deductions is a key-value list of the name of the deduction and a float indicating the amount (e.g., %). Address is like a “class”, “object”, or “c-style struct”, whatever you prefer.
44.
Copyright © 2011-‐2012,
Think Big Analy8cs, All Rights Reserved44 File & Record Formats CREATE TABLE employees (…) ROW FORMAT DELIMITED FIELDS TERMINATED BY '001' COLLECTION ITEMS TERMINATED BY '002' MAP KEYS TERMINATED BY '003' LINES TERMINATED BY 'n' STORED AS TEXTFILE; All the defaults for text files! Tuesday, January 10, 12 Suppose our employees table has a custom format and field delimiters. We can change them, although here I’m showing all the default values used by Hive!
45.
Copyright © 2011-‐2012,
Think Big Analy8cs, All Rights Reserved45 Select,Where, Group By, Join,... Tuesday, January 10, 12
46.
Copyright © 2011-‐2012,
Think Big Analy8cs, All Rights Reserved46 Common SQL... • You get most of the usual suspects for SELECT, WHERE, GROUP BY and JOIN. Tuesday, January 10, 12 We’ll just highlight a few unique features.
47.
Copyright © 2011-‐2012,
Think Big Analy8cs, All Rights Reserved47 ADD JAR MyUDFs.jar; CREATE TEMPORARY FUNCTION net_salary AS 'com.example.NetCalcUDF'; SELECT name, net_salary(salary, deductions) FROM employees; “User Defined Functions” Tuesday, January 10, 12 Following a Hive defined API, implement your own functions, build, put in a jar, and then use them in your queries. Here we (pretend to) implement a function that takes the employee’s salary and deductions, then computes the net salary.
48.
Copyright © 2011-‐2012,
Think Big Analy8cs, All Rights Reserved48 SELECT name, salary FROM employees ORDER BY salary ASC; ORDER BY vs. SORT BY • A total ordering - one reducer. SELECT name, salary FROM employees SORT BY salary ASC; • A local ordering - sorts within each reducer. Tuesday, January 10, 12 For a giant data set, piping everything through one reducer might take a very long time. A compromise is to sort “locally”, so each reducer sorts it’s output. However, if you structure your jobs right, you might achieve a total order depending on how data gets to the reducers. (E.g., each reducer handles a year’s worth of data, so joining the files together would be totally sorted.)
49.
Copyright © 2011-‐2012,
Think Big Analy8cs, All Rights Reserved • Only equality (x = y). 49 Inner Joins SELECT ... FROM clicks a JOIN clicks b ON ( a.uid = b.uid, a.day = b.day) WHERE a.process = 'movies' AND b.process = 'books' AND a.year > 2012; Tuesday, January 10, 12 Note that the a.year > ‘…’ is in the WHERE clause, not the ON clause for the JOIN. (I’m doing a correlation query; which users searched for movies and books on the same day?) Some outer and semi join constructs supported, as well as some Hadoop-specific optimization constructs.
50.
Copyright © 2011-‐2012,
Think Big Analy8cs, All Rights Reserved50 A Final Example of Controlling MapReduce... Tuesday, January 10, 12
51.
Copyright © 2011-‐2012,
Think Big Analy8cs, All Rights Reserved • Calling out to external programs to perform map and reduce operations. 51 Specify Map & Reduce Processes Tuesday, January 10, 12
52.
Copyright © 2011-‐2012,
Think Big Analy8cs, All Rights Reserved52 Example FROM ( FROM clicks MAP message USING '/tmp/vampire_extractor' AS item_title, count CLUSTER BY item_title) it INSERT OVERWRITE TABLE vampire_stuff REDUCE it.item_title, it.count USING '/tmp/thing_counter.py' AS item_title, counts; Tuesday, January 10, 12 Note the MAP … USING and REDUCE … USING. We’re also using CLUSTER BY (distributing and sorting on “item_title”).
53.
Copyright © 2011-‐2012,
Think Big Analy8cs, All Rights Reserved53 Example FROM ( FROM clicks MAP message USING '/tmp/vampire_extractor' AS item_title, count CLUSTER BY item_title) it INSERT OVERWRITE TABLE vampire_stuff REDUCE it.item_title, it.count USING '/tmp/thing_counter.py' AS item_title, counts; Call specific map and reduce processes. Tuesday, January 10, 12 Note the MAP … USING and REDUCE … USING. We’re also using CLUSTER BY (distributing and sorting on “item_title”).
54.
Copyright © 2011-‐2012,
Think Big Analy8cs, All Rights Reserved54 … And Also: FROM ( FROM clicks MAP message USING '/tmp/vampire_extractor' AS item_title, count CLUSTER BY item_title) it INSERT OVERWRITE TABLE vampire_stuff REDUCE it.item_title, it.count USING '/tmp/thing_counter.py' AS item_title, counts; Like GROUP BY, but directs output to specific reducers. Tuesday, January 10, 12 Note the MAP … USING and REDUCE … USING. We’re also using CLUSTER BY (distributing and sorting on “item_title”).
55.
Copyright © 2011-‐2012,
Think Big Analy8cs, All Rights Reserved55 … And Also: FROM ( FROM clicks MAP message USING '/tmp/vampire_extractor' AS item_title, count CLUSTER BY item_title) it INSERT OVERWRITE TABLE vampire_stuff REDUCE it.item_title, it.count USING '/tmp/thing_counter.py' AS item_title, counts; How to populate an “internal” table. Tuesday, January 10, 12 Note the MAP … USING and REDUCE … USING. We’re also using CLUSTER BY (distributing and sorting on “item_title”).
56.
Copyright © 2011-‐2012,
Think Big Analy8cs, All Rights Reserved56 Hive: Conclusions Tuesday, January 10, 12
57.
Copyright © 2011-‐2012,
Think Big Analy8cs, All Rights Reserved • Not a real SQL Database. • Transactions, updates, etc. • … but features will grow. • High latency queries. • Documentation poor. 57 Hive Disadvantages Tuesday, January 10, 12
58.
Copyright © 2011-‐2012,
Think Big Analy8cs, All Rights Reserved • Indispensable for SQL users. • Easier than Java MR API. • Makes porting data warehouse apps to Hadoop much easier. 58 Hive Advantages Tuesday, January 10, 12
59.
Copyright © 2011-‐2012,
Think Big Analy8cs, All Rights Reserved59 Questions? dean.wampler@thinkbiganalytics.com thinkbiganalytics.com This presentation: github.com/deanwampler/Presentations Tuesday, January 10, 12
Descargar ahora