SQL-H a new way to enable SQL analytics
DataWorks Summit
1.
SQL-H: A New Way to Enable SQL Analytics on Hadoop
Sushil Thomas
June 2012
2.
Outline
• HCatalog primer
• Aster primer
• SQL-H definition and features
• SQL-H example usage
Confidential and proprietary. Copyright © 2011 Teradata Corporation.
3.
HCatalog Primer
• HCatalog provides table management and storage management for Apache Hadoop
  - Provides a shared schema and data type mechanism
  - Provides a table abstraction so that users need not be concerned with where or how their data is stored
  - Provides interoperability across data processing tools such as Pig, MapReduce, Streaming, and Hive
• Uses Hive-like DDL commands. Supports tables, views, and partitions.
• Provides parallel load and store interfaces
• Agnostic to the file format of stored data
  - Currently supports RCFile, CSV text, JSON text, and SequenceFile
4.
HCatalog Primer: Example Syntax

    CREATE EXTERNAL TABLE apachelog (
      host STRING, identity STRING, user STRING,
      time STRING, request STRING, status STRING,
      size STRING, referer STRING, agent STRING)
    ROW FORMAT
    SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
    WITH SERDEPROPERTIES ("input.regex" = "([^]*) …")
    STORED AS TEXTFILE
    LOCATION 'hdfs://data/apachelogs';

Note: This is run via HCatalog interfaces to record the format of data stored in HDFS for later use by Hive, Pig, etc. It is not run on the Aster system.
5.
HCatalog Primer: Read Flow (Hadoop Job Submission)
[Diagram: the Job Controller passes a table name and partitions to the HCatalog Server (on the HCatalog Server Node), which returns splits.]
6.
HCatalog Primer: Read Flow (Hadoop Job Execution)
[Diagram: on the Processing Nodes (running Hive, Pig or MR jobs), each Map Task reads one Split of the Source Data and produces Tuples.]
7.
Aster Primer
[Diagram labels: Queen Node; Worker Nodes; Parser; Optimizer; Executor; SQL-MapReduce; SQL Engine; ARC Engine; Data Partition; Inter Cluster Express.]
8.
Aster SQL-H
• Direct access to HCatalog data within AsterDB
  - HCatalog tables available without duplicating DDL commands on the Aster side
• HCatalog tables are first-class objects within AsterDB
  - Full support for all SQL operators
• We use the HCatalog interfaces to read tuples in parallel on all data nodes
9.
Aster Reads From HCatalog (Planning)
[Diagram, query planning phase: the Aster Optimizer passes a table name and partitions to the HCatalog Server (on the HCatalog Server Node), which returns splits.]
10.
Aster Reads From HCatalog (Execution)
[Diagram, execution phase on a single worker node: Splits are read from the HDFS Data Nodes and arrive as Tuples at each ARC Engine and its Data Partition.]
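
The planning and execution phases on the two slides above can be sketched as a toy model. This is only an illustration under stated assumptions: the in-memory `SPLITS` dictionary, the function names, and the round-robin assignment of splits to workers are all hypothetical and are not taken from HCatalog or Aster.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for HDFS: split id -> tuples stored in that split.
SPLITS = {
    "split-0": [("u1", 10, 4.0), ("u2", 11, 3.5)],
    "split-1": [("u3", 10, 5.0)],
    "split-2": [("u4", 12, 2.0), ("u5", 13, 4.5)],
    "split-3": [("u6", 11, 1.0)],
}


def plan_splits(table_name):
    """Planning phase: turn a table name into a list of split ids."""
    # A real catalog would look the table up; this toy holds a single table.
    return sorted(SPLITS)


def assign_round_robin(split_ids, num_workers):
    """Give worker w every num_workers-th split (an assumed policy)."""
    return [split_ids[w::num_workers] for w in range(num_workers)]


def read_splits(split_ids):
    """Execution phase on one worker: read its splits into tuples."""
    return [row for split_id in split_ids for row in SPLITS[split_id]]


def parallel_read(table_name, num_workers=2):
    """Plan once, then let every worker read its own splits in parallel."""
    assignments = assign_round_robin(plan_splits(table_name), num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        per_worker = list(pool.map(read_splits, assignments))
    return assignments, per_worker


assignments, per_worker = parallel_read("movieratings")
```

The point of the sketch is the separation of concerns: the catalog is consulted once to obtain splits, and the data itself flows directly from storage to each worker.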
11.
Features – Simple and Comprehensive Support
• Interactions with HCatalog master server and HDFS only
  - No MapReduce slots used
  - Hadoop system can be used for other activity simultaneously
• Aster runs native HCatalog InputReader code for translating HCatalog table names into input splits, and then getting data from input splits
  - No impedance mismatch between the two systems
  - Everything supported by HCatalog interfaces is supported in Aster
• Changes made on HCatalog are reflected immediately on the Aster side
  - New tables, modified schemas, new partitions, etc. are available immediately. No extra steps required.
12.
Features - Usability
• Full integration with BI tools
  - Tableau, MSTR, etc. now work with data in Hadoop seamlessly
• Data in Hadoop can now be joined with relational data in your Aster system
  - Previously, using data from multiple systems involved complex ETL tasks
• Full SQL support
  - HCatalog table data can be inserted into a SQL flow just like native table data
• If desired, provides a load pipeline into Aster from Hadoop
13.
Features – Teradata Aster Analytical Foundation
• Full suite of Aster Analytical Foundation functions available for data in Hadoop
  - Time-Series/Path Analysis
  - Statistical Analysis
  - Relational Analysis
  - Text Analysis
  - Clustering Analysis
  - Data Transformations
• Makes users productive faster
• Spend time analyzing data, not building functionality and tools
14.
Features - Performance
• Partition pruning is transparently supported
  - select * from hadoop_weblogs where ds='2012-06-10'
• If "hadoop_weblogs" is partitioned on 'ds', then this command will only scan data in this particular partition
• Performance Notes
  - Data transfer is required, but the network may not be your bottleneck. Time taken for the initial data read may be a small part of overall query performance
  - Aster's native SQL execution engine is a lot faster than Hive's MR-based execution engine
  - As queries get more complex, the performance advantage increases
  - If required, the impact on the Hadoop system and network bandwidth usage can be tuned down
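
Partition pruning as described above can be illustrated with a small sketch. It assumes a table stored as one group of rows per value of the partition column `ds`; the `HADOOP_WEBLOGS` data and the `scan` function are hypothetical and only model the idea, not how Aster or HCatalog implement it.

```python
# Hypothetical partitioned table: value of the partition column 'ds' -> rows.
HADOOP_WEBLOGS = {
    "2012-06-09": [{"userid": "u1", "page": "/home"}],
    "2012-06-10": [{"userid": "u2", "page": "/cart"},
                   {"userid": "u3", "page": "/home"}],
    "2012-06-11": [{"userid": "u4", "page": "/checkout"}],
}


def scan(table, ds=None):
    """Return (rows, partitions_scanned).

    With an equality filter on the partition column, only the matching
    partition is read; without one, every partition is read.
    """
    if ds is None:
        wanted = list(table)
    else:
        wanted = [partition for partition in table if partition == ds]
    rows = [dict(row, ds=partition)
            for partition in wanted
            for row in table[partition]]
    return rows, wanted


# Equivalent in spirit to: select * from hadoop_weblogs where ds='2012-06-10'
rows, scanned = scan(HADOOP_WEBLOGS, ds="2012-06-10")
```

Because the filter is resolved against partition names before any rows are read, the other two partitions are never touched.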
15.
Example SQL Syntax – Remote Catalog

    beehive=> extl host=hcatalog1.asterdata.com
    List of databases
      Name
    ----------
     prod
     testdb
    (2 rows)

    beehive=> extd host=hcatalog1.asterdata.com database=prod
    List of tables
      Name
    --------------
     apachelogs
     movieratings
    (2 rows)
16.
Example SQL Syntax – Remote Catalog

    beehive=> extd host=hcatalog1.asterdata.com database=prod table=movieratings
    Table "prod"."movieratings"
      Name   |  Type  | Partitioned Column
    ---------+--------+--------------------
     userid  | string | f
     movieid | int    | f
     rating  | double | f
     ds      | string | t
    (4 rows)
17.
Example SQL Syntax – HCatalog Data Access

    SELECT * FROM load_from_hcatalog(
        ON mr_driver
        server('hcatalog1.asterdata.com')
        dbname('prod')
        tablename('student')
        columns('userid', 'movieid', 'rating'));

    CREATE VIEW hadoop_weblogs AS
    SELECT * FROM load_from_hcatalog(
        ON mr_driver
        . . .);
18.
Example SQL Syntax – Data Load From HCatalog

    CREATE TABLE aster_weblogs DISTRIBUTE BY HASH(userid) AS
    SELECT * FROM hadoop_weblogs;
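
The effect of `DISTRIBUTE BY HASH(userid)` in the load statement above can be sketched as follows. This is a minimal illustration of hash distribution in general, assuming three partitions and using CRC32 as the hash; the hash function Aster actually uses is not stated in the slides, and `partition_for` and `distribute` are hypothetical names.

```python
import zlib


def partition_for(key, num_partitions):
    """Map a distribution key to a partition number with a stable hash."""
    # CRC32 is an assumption, chosen only because it is deterministic.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def distribute(rows, key_column, num_partitions):
    """Place every row in the partition chosen by hashing its key column."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        target = partition_for(row[key_column], num_partitions)
        partitions[target].append(row)
    return partitions


rows = [
    {"userid": "u1", "page": "/home"},
    {"userid": "u2", "page": "/cart"},
    {"userid": "u1", "page": "/checkout"},
    {"userid": "u3", "page": "/home"},
]
partitions = distribute(rows, "userid", 3)
```

Rows that share a `userid` always land in the same partition, which is what lets later joins or aggregations on that column run locally on each worker.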
19.
Example SQL Syntax – Partition Pruning

    beehive=> extd host=hcatalog1.asterdata.com database=prod table=movieratings
    Table "prod"."movieratings"
      Name   |  Type  | Partitioned Column
    ---------+--------+--------------------
     userid  | string | f
     movieid | int    | f
     rating  | double | f
     ds      | string | t
    (4 rows)

    -- Because 'ds' is a partitioned column, the query below
    -- will only pull in data from the '2011-06-10' partition
    SELECT * FROM hadoop_movieratings
    WHERE ds='2011-06-10';
20.
Example SQL Join Syntax – Complex Queries

    -- Join example
    select t1.name, t2.page_url, t1.price
    from
        aster_product t1,
        hadoop_weblogs t2
    where t1.product_id=t2.product_id;
21.
Example SQL-MapReduce Syntax

    -- Find all the sessions with a particular page visit pattern where
    -- at least 3 products have been checked out during the session
    SELECT * FROM npath(
        ON hadoop_weblogs
        PARTITION BY sessionid ORDER BY clicktime
        MODE(nonoverlapping)
        PATTERN('h.h*.d*.c{3,}.d')
        SYMBOLS(pagetype = 'home' as h, pagetype='checkout' as c,
                pagetype<>'home' and pagetype<>'checkout' as d)
        RESULT(first(sessionid of c) as sessionid,
               max_choose(productprice, productname of c) as most_expensive,
               max(productprice of c) as max_price,
               min_choose(productprice, productname of c) as least_expensive,
               min(productprice of c) as min_price))
    ORDER BY sessionid;
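
The pattern in the query above can be approximated with an ordinary regular expression, which may help in reading it. The sketch below assumes that `.` in the pattern means "followed by", so `h.h*.d*.c{3,}.d` becomes the regex `hh*d*c{3,}d` over a string with one symbol per click. It is only an approximation of the idea, not the nPath implementation; the helper names and sample data are hypothetical, and only the `max_price` and `min_price` outputs are modeled.

```python
import re
from itertools import groupby


def symbol(row):
    """Classify a click the way the SYMBOLS clause does."""
    if row["pagetype"] == "home":
        return "h"
    if row["pagetype"] == "checkout":
        return "c"
    return "d"


# Assumed translation of PATTERN('h.h*.d*.c{3,}.d') into a regex.
PATTERN = re.compile(r"hh*d*c{3,}d")


def match_sessions(clicks):
    """Return one result per non-overlapping pattern match in each session."""
    results = []
    ordered = sorted(clicks, key=lambda r: (r["sessionid"], r["clicktime"]))
    for sessionid, group in groupby(ordered, key=lambda r: r["sessionid"]):
        rows = list(group)
        symbols = "".join(symbol(r) for r in rows)
        # finditer yields non-overlapping matches, similar to MODE(nonoverlapping)
        for match in PATTERN.finditer(symbols):
            matched = rows[match.start():match.end()]
            checkouts = [r for r in matched if symbol(r) == "c"]
            results.append({
                "sessionid": sessionid,
                "max_price": max(r["productprice"] for r in checkouts),
                "min_price": min(r["productprice"] for r in checkouts),
            })
    return results


def clicks_for(sessionid, events):
    """Build click rows from (pagetype, productprice) pairs in time order."""
    return [{"sessionid": sessionid, "clicktime": t,
             "pagetype": pagetype, "productprice": price}
            for t, (pagetype, price) in enumerate(events)]


clicks = (
    clicks_for("s1", [("home", 0), ("product", 0), ("checkout", 5),
                      ("checkout", 20), ("checkout", 10), ("product", 0)])
    + clicks_for("s2", [("home", 0), ("checkout", 7),
                        ("checkout", 9), ("product", 0)])
)
matches = match_sessions(clicks)
```

Session `s1` produces the symbol string `hdcccd` and matches; session `s2` produces `hccd`, which has only two checkouts and does not match.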
22.
Example BI Tool Usage – Path Analysis on Data Stored in Aster and Hadoop
23.
Example BI Tool Usage – Path Analysis on Data Stored in Aster and Hadoop