SQL-H a new way to enable SQL analytics
DataWorks Summit
1.
SQL-H: A New Way to Enable SQL Analytics on Hadoop
Sushil Thomas
June 2012
2.
Outline
• HCatalog primer
• Aster primer
• SQL-H definition and features
• SQL-H example usage
Confidential and proprietary. Copyright © 2011 Teradata Corporation.
3.
HCatalog Primer
• HCatalog provides table management and storage management for Apache Hadoop
  - Provides a shared schema and data type mechanism
  - Provides a table abstraction so that users need not be concerned with where or how their data is stored
  - Provides interoperability across data processing tools such as Pig, MapReduce, Streaming, and Hive
• Uses Hive-like DDL commands. Supports tables, views, and partitions.
• Provides parallel load and store interfaces
• Agnostic to the file format of stored data
  - Currently supports RCFile, CSV text, JSON text, and SequenceFile
4.
HCatalog Primer: Example Syntax
    CREATE EXTERNAL TABLE apachelog (
        host STRING, identity STRING, user STRING,
        time STRING, request STRING, status STRING,
        size STRING, referer STRING, agent STRING)
    ROW FORMAT
        SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
        WITH SERDEPROPERTIES ("input.regex" = "([^]*) …")
    STORED AS TEXTFILE
    LOCATION 'hdfs://data/apachelogs';

Note: This is run via HCatalog interfaces to record the format of data stored in HDFS for later use by Hive, Pig, etc. This is not run on the Aster system.
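To illustrate what a regex-based SerDe does with a table definition like the one above, here is a minimal Python sketch. It is not the Hive RegexSerDe itself. The pattern and the sample log line are assumptions made up for illustration, because the slide elides the real input.regex. The idea shown is only that each capture group fills one STRING column, in declaration order.

```python
import re

# Column names from the apachelog DDL on the slide.
COLUMNS = ["host", "identity", "user", "time", "request",
           "status", "size", "referer", "agent"]

# Hypothetical pattern for a combined-format access log line; the slide's
# actual input.regex is elided, so this is only an illustrative stand-in.
LOG_REGEX = re.compile(
    r'(\S+) (\S+) (\S+) \[([^\]]*)\] "([^"]*)" (\S+) (\S+) "([^"]*)" "([^"]*)"'
)

def deserialize(line):
    """Map one raw text line to a dict of STRING columns, or None on mismatch."""
    match = LOG_REGEX.fullmatch(line)
    if match is None:
        return None
    return dict(zip(COLUMNS, match.groups()))

# Made-up sample line, not taken from the slides.
SAMPLE = ('10.0.0.1 - alice [10/Jun/2012:10:00:00 -0700] '
          '"GET /index.html HTTP/1.1" 200 512 "-" "Mozilla/5.0"')
row = deserialize(SAMPLE)
```

Every value stays a string, which mirrors the all-STRING column list in the DDL; casting to numeric types would happen later in the query.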
5.
HCatalog Primer: Read Flow (Hadoop Job Submission)
[Diagram; labels: Job Controller; HCatalog Server Node; HCatalog Server; Table Name, Partitions; Splits]
6.
HCatalog Primer: Read Flow (Hadoop Job Execution)
[Diagram; labels: Processing Nodes (running Hive, Pig or MR jobs); Map Task; Tuples; Split; Source Data]
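The two read-flow slides can be sketched as a pair of small functions: one that resolves a table name and partitions into splits at submission time, and one that turns a single split into tuples at execution time. This is a toy model under stated assumptions; the catalog contents, split names, and function names are all hypothetical and do not correspond to the HCatalog API.

```python
# Toy catalog: table name -> partition value -> list of split identifiers.
# All names here are hypothetical.
CATALOG = {
    "apachelog": {
        "ds=2012-06-09": ["part-00000", "part-00001"],
        "ds=2012-06-10": ["part-00000"],
    }
}

# Toy storage: (partition, split) -> raw lines held in that split.
STORAGE = {
    ("ds=2012-06-09", "part-00000"): ["a,1", "b,2"],
    ("ds=2012-06-09", "part-00001"): ["c,3"],
    ("ds=2012-06-10", "part-00000"): ["d,4"],
}

def get_splits(table, partitions=None):
    """Job submission: resolve a table name (and optional partitions) to splits."""
    parts = CATALOG[table]
    wanted = partitions if partitions is not None else sorted(parts)
    return [(p, s) for p in wanted for s in parts[p]]

def map_task(split):
    """Job execution: one task reads one split and emits tuples."""
    return [tuple(line.split(",")) for line in STORAGE[split]]

splits = get_splits("apachelog")
tuples = [t for split in splits for t in map_task(split)]
```

Here the tasks run one after another; on a cluster each split would be handed to its own map task.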
7.
Aster Primer
[Diagram; labels: Queen Node; Parser; Optimizer; Executor; Worker Nodes; SQL Engine; SQL-MapReduce; ARC; Data Engine Partition; Inter Cluster Express]
8.
Aster SQL-H
• Direct access to HCatalog data within AsterDB
  - HCatalog tables available without duplicating DDL commands on the Aster side
• HCatalog tables are first class objects within AsterDB
  - Full support for all SQL operators
• We use the HCatalog interfaces to read tuples in parallel on all data nodes
9.
Aster Reads From HCatalog (Planning)
[Diagram, query planning phase; labels: Aster Optimizer; HCatalog Server Node; HCatalog Server; Table Name, Partitions; Splits]
10.
Aster Reads From HCatalog (Execution)
[Diagram, execution phase on a single worker node; labels: HDFS Data Nodes; Split; Tuples; ARC; Data Engine Partition]
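One way to picture the planning and execution phases together is the sketch below: splits are dealt out to workers during planning, and each worker then reads its own splits concurrently. The round-robin assignment is an assumption (the slides do not say how splits are assigned), and the split names, row counts, and function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def assign_splits(splits, num_workers):
    """Planning: deal splits out round-robin, one list per worker (assumed policy)."""
    assignment = [[] for _ in range(num_workers)]
    for i, split in enumerate(splits):
        assignment[i % num_workers].append(split)
    return assignment

def read_split(split):
    """Stand-in for reading one split; returns (split_name, row_index) tuples."""
    name, row_count = split
    return [(name, i) for i in range(row_count)]

def worker_read(my_splits):
    """Execution on one worker: read every assigned split into tuples."""
    return [t for split in my_splits for t in read_split(split)]

# Hypothetical splits as (name, row_count) pairs.
SPLITS = [("split-0", 2), ("split-1", 1), ("split-2", 3), ("split-3", 1)]
plan = assign_splits(SPLITS, num_workers=3)

# Threads stand in for worker nodes reading in parallel.
with ThreadPoolExecutor(max_workers=3) as pool:
    per_worker = list(pool.map(worker_read, plan))
```

No worker depends on another worker's splits, which is what lets the read proceed in parallel on all nodes.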
11.
Features – Simple and Comprehensive Support
• Interactions with HCatalog master server and HDFS only
  - No MapReduce slots used
  - Hadoop system can be used for other activity simultaneously
• Aster runs native HCatalog InputReader code for translating HCatalog table names into input splits, and then getting data from input splits
  - No impedance mismatch between the two systems
  - Everything supported by HCatalog interfaces is supported in Aster
• Changes made on HCatalog are reflected immediately on the Aster side
  - New tables, modified schemas, new partitions etc. are available immediately. No extra steps required.
12.
Features - Usability
• Full integration with BI tools
  - Tableau, MSTR etc. now work with data in Hadoop seamlessly
• Data in Hadoop can now be joined with relational data in your Aster system
  - Previously, using data from multiple systems involved complex ETL tasks
• Full SQL support
  - HCatalog table data can be inserted into a SQL flow just like native table data
• If desired, provides a load pipeline into Aster from Hadoop
13.
Features – Teradata Aster Analytical Foundation
• Full suite of Aster Analytical Foundation functions available for data in Hadoop
  - Time-Series/Path Analysis
  - Statistical Analysis
  - Relational Analysis
  - Text Analysis
  - Clustering Analysis
  - Data Transformations
• Makes users productive faster
• Spend time analyzing data, not building functionality and tools
14.
Features - Performance
• Partition pruning is transparently supported
  - select * from hadoop_weblogs where ds='2012-06-10'
• If "hadoop_weblogs" is partitioned on 'ds', then this command will only scan data in this particular partition
• Performance Notes
  - Data transfer is required, but the network may not be your bottleneck. Time taken for the initial data read may be a small part of overall query performance
  - Aster's native SQL execution engine is a lot faster than Hive's MR-based execution engine
  - As queries get more complex, the performance advantage increases
  - If required, the impact on the Hadoop system and network bandwidth usage can be tuned down
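The effect of partition pruning can be shown with a small sketch: when the filter is an equality on the partition column, only the matching partition is touched. The data layout, row contents, and function name below are hypothetical and only model the idea, not how Aster or HCatalog implement it.

```python
# Hypothetical layout: partition value of 'ds' -> rows stored in that partition.
PARTITIONS = {
    "2012-06-09": [{"userid": "u1", "page": "/home"}],
    "2012-06-10": [{"userid": "u2", "page": "/cart"},
                   {"userid": "u3", "page": "/home"}],
    "2012-06-11": [{"userid": "u4", "page": "/checkout"}],
}

def scan(ds=None):
    """Return (rows, partitions_scanned); an equality filter on ds prunes partitions."""
    if ds is None:
        selected = list(PARTITIONS)
    else:
        selected = [p for p in PARTITIONS if p == ds]
    # The partition value is attached to each row as the 'ds' column.
    rows = [dict(row, ds=p) for p in selected for row in PARTITIONS[p]]
    return rows, len(selected)

pruned_rows, pruned_scanned = scan(ds="2012-06-10")
all_rows, all_scanned = scan()
```

In this model the filtered scan reads one partition instead of three, so the saving grows with the number of partitions that the filter excludes.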
15.
Example SQL Syntax – Remote Catalog
    beehive=> extl host=hcatalog1.asterdata.com
    List of databases
       Name
    ----------
     prod
     testdb
    (2 rows)

    beehive=> extd host=hcatalog1.asterdata.com database=prod
    List of tables
       Name
    ---------
     apachelogs
     movieratings
    (2 rows)
16.
Example SQL Syntax – Remote Catalog
    beehive=> extd host=hcatalog1.asterdata.com database=prod table=movieratings
    Table "prod"."movieratings"
      Name   |  Type   | Partitioned Column
    ---------+---------+--------------------
     userid  | string  | f
     movieid | int     | f
     rating  | double  | f
     ds      | string  | t
    (4 rows)
17.
Example SQL Syntax – HCatalog Data Access
    SELECT * FROM load_from_hcatalog(
        ON mr_driver
        server('hcatalog1.asterdata.com')
        dbname('prod')
        tablename('student')
        columns('userid', 'movieid', 'rating'));

    CREATE VIEW hadoop_weblogs AS
    SELECT * FROM load_from_hcatalog(
        ON mr_driver
        . . .);
18.
Example SQL Syntax – Data Load From HCatalog
    CREATE TABLE aster_weblogs DISTRIBUTE BY HASH(userid) AS
    SELECT * FROM hadoop_weblogs;
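As a rough picture of what hash distribution on userid means for the loaded table, the sketch below places each row in a partition chosen by hashing its distribution key. CRC32 is used only as a stable stand-in; the hash function Aster actually uses is not stated in the slides, and the sample rows and function names are hypothetical.

```python
import zlib

def partition_for(userid, num_partitions):
    """Pick a partition from a stable hash of the distribution key (CRC32 stand-in)."""
    return zlib.crc32(userid.encode("utf-8")) % num_partitions

def distribute(rows, num_partitions):
    """Place each row in the partition chosen by hashing its userid."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        partitions[partition_for(row["userid"], num_partitions)].append(row)
    return partitions

# Made-up rows standing in for the result of SELECT * FROM hadoop_weblogs.
ROWS = [
    {"userid": "u1", "page": "/home"},
    {"userid": "u2", "page": "/cart"},
    {"userid": "u1", "page": "/checkout"},
    {"userid": "u3", "page": "/home"},
]
placed = distribute(ROWS, num_partitions=4)
```

Because the partition depends only on the key, all rows for one userid end up together, which is what makes later joins and aggregations on userid local to a partition.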
19.
Example SQL Syntax – Partition Pruning
    beehive=> extd host=hcatalog1.asterdata.com database=prod table=movieratings
    Table "prod"."movieratings"
      Name   |  Type   | Partitioned Column
    ---------+---------+--------------------
     userid  | string  | f
     movieid | int     | f
     rating  | double  | f
     ds      | string  | t
    (4 rows)

    -- Because 'ds' is a partitioned column, the query below
    -- will only pull in data from the '2011-06-10' partition
    SELECT * FROM hadoop_movieratings
    WHERE ds='2011-06-10';
20.
Example SQL Join Syntax – Complex Queries
    -- Join example

    select t1.name, t2.page_url, t1.price
    from
        aster_product t1,
        hadoop_weblogs t2
    where t1.product_id=t2.product_id;
21.
Example SQL-MapReduce Syntax
    -- Find all the sessions with a particular page visit pattern where
    -- at least 3 products have been checked out during the session

    SELECT * FROM npath(
        ON hadoop_weblogs
        PARTITION BY sessionid ORDER BY clicktime
        MODE(nonoverlapping)
        PATTERN('h.h*.d*.c{3,}.d')
        SYMBOLS(pagetype = 'home' as h, pagetype = 'checkout' as c,
                pagetype <> 'home' and pagetype <> 'checkout' as d)
        RESULT(first(sessionid of c) as sessionid,
               max_choose(productprice, productname of c) as most_expensive,
               max(productprice of c) as max_price,
               min_choose(productprice, productname of c) as least_expensive,
               min(productprice of c) as min_price))
    ORDER BY sessionid;
22.
Example BI Tool Usage – Path Analysis on Data Stored in Aster and Hadoop
23.
Example BI Tool Usage – Path Analysis on Data Stored in Aster and Hadoop