Hortonworks and Teradata have partnered to provide a clear path to Big Analytics via stable and reliable Hadoop for the enterprise. The Teradata® Portfolio for Hadoop is a flexible offering of products and services for customers to integrate Hadoop into their data architecture while taking advantage of the world-class service and support Teradata provides.
2. Shift from a Single Platform to an Ecosystem
"Logical" Data Warehouse
“The hype around replacing the data warehouse gives way to the more sensible strategy of augmenting it … The influence of the logical data warehouse has created a situation in which multiple repository strategies are now expected.”
“Big Data requirements are solved by a range of platforms including analytical databases, discovery platforms, and NoSQL solutions beyond Hadoop.”
Source: “Big Data Comes of Age”. EMA and 9sight Consulting. Nov 2012.
3. Unified Data Architecture: System Conceptual View
[Diagram labels. Sources: ERP, SCM, CRM, images, audio and video, machine logs, text, web and social. Platforms: Data Platform, Integrated Data Warehouse, Integrated Discovery Platform. Layers: Move, Manage, Access. Analytic tools and apps: marketing applications, business intelligence, data mining, math and stats, languages. Users: customers, partners, business analysts, data scientists, marketing, executives, operational systems, frontline workers, engineers.]
4. Unified Data Architecture: Business Conceptual View
[Diagram labels. Sources: ERP, SCM, CRM, images, audio and video, machine logs, text, web and social. Platforms: Data Platform, Integrated Data Warehouse, Integrated Discovery Platform. Layers: Move, Manage, Access. Analytic tools and apps: marketing applications, business intelligence, data mining, math and stats, languages. Users: customers, partners, business analysts, data scientists, marketing, executives, operational systems, frontline workers, engineers. Capabilities: business intelligence; predictive analytics; operational intelligence; data discovery; fast-fail hypothesis testing; path, graph, time-series analysis; pattern detection; fast data loading and availability; filtering and processing; deep history (online archival); data management (data lake).]
5. Discovering Deep Retail Insights with UDA
Transforming Web Walks into DNA Sequences
Situation
Largest German online retailer, a conglomerate with numerous brands and 50 websites. 1 million visitors viewing 2M products.
Problem
Needed a better way of analyzing consumer behavior on the websites and communicating with category managers.
Solution
Treat each web visit sequence like a DNA sequence. Built fast query tools so analysts can express queries easily for their categories and get deeper insights.
Impact
• Leverages the Aster platform to generate rapid path insights
• Drives 15% increase in market baskets through personalization
• Drives 10-20% increase in conversions by shortening paths
• Can now see what does and doesn’t lead to sales
• Widening use across all the Corporate Group websites
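The "web walk as DNA sequence" idea can be illustrated with a small sketch. This is not the retailer's or Aster's implementation; the page-type alphabet, the sample visits, and the function names are all hypothetical. It only shows the general technique: encode each visit as a string, then count short sub-paths separately for visits that ended in a purchase and those that did not.

```python
from collections import Counter

# Hypothetical one-letter code per page type (an assumption for illustration).
PAGE_CODES = {
    "home": "H", "search": "S", "category": "C",
    "product": "P", "cart": "K", "checkout": "B",
}

def encode_visit(pages):
    """Turn a list of page types into a string, one letter per page view."""
    return "".join(PAGE_CODES[p] for p in pages)

def count_subpaths(sequences, length):
    """Count every contiguous sub-path of the given length across all visits."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - length + 1):
            counts[seq[i:i + length]] += 1
    return counts

# Made-up sample visits; real clickstream data would come from web logs.
visits = [
    ["home", "search", "product", "cart", "checkout"],
    ["home", "category", "product", "cart", "checkout"],
    ["home", "search", "product", "search", "product"],
]
encoded = [encode_visit(v) for v in visits]

# Split visits by outcome: "B" (checkout) as the last step marks a sale.
buyers = [s for s in encoded if s.endswith("B")]
non_buyers = [s for s in encoded if not s.endswith("B")]

buyer_paths = count_subpaths(buyers, 3)
non_buyer_paths = count_subpaths(non_buyers, 3)
```

Comparing `buyer_paths` with `non_buyer_paths` is one simple way to see which short paths tend to precede a sale and which do not.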
6. Modern Data Architecture: Teradata
TVI – Proactive system monitoring tied to Teradata customer support
[Diagram labels. Source data: sensor log data, customer/inventory data, clickstream data, flat files, sentiment analysis data. Interfaces: DB, file, JMS, REST, HTTP, streaming. Load: Sqoop, Flume, NFS, WebHDFS. Hadoop: HDFS, YARN, MapReduce, Knox, Ambari. Refine: Hive, Pig, ETL, custom. Structuring: HCatalog. Interactive: QueryGrid. Export: Sqoop / Hive. Load and extract (bidirectional): TDCH. Extract: bulk copy, DistCp, AFS. Analytical platforms: Aster Discovery Platform, Teradata IDW. Query/visualization/reporting/analytical tools and apps: JDBC/ODBC-compliant tools. Viewpoint, alerts, services, system: system health, node, space usage, capacity heatmap, metrics analysis.]
7. Teradata Portfolio for Hadoop
“Bringing Hadoop to the Enterprise”
• Most Trusted and Flexible Hadoop Platforms for Your Next-Generation Unified Data Architecture™
1. Teradata Aster Big Analytics Appliance
2. Teradata Appliance for Hadoop
3. Teradata Commodity Offering with Dell
4. Hortonworks Data Platform software-only support resell
• Complete consulting and training capability
> Big Analytics Services—across the UDA
> Data Integration Optimization—ETL, ELT across the UDA
> Hadoop deployment and mentoring
> Teradata delivering Hortonworks training
> Hadoop Managed Services—operations and administration
• Customer Support for Hadoop
> World-class Teradata customer support, backed by Hortonworks
8. Teradata Loom® 2.3
“Integrated metadata management, data lineage and data wrangling for Enterprise Hadoop”
Loom is a platform for profiling, preparing and tracking data lineage for data in Hadoop
• Hadoop Data Governance and Metadata Management
– Rich information model for capturing and managing the relationships
– Data dictionary for the big data landscape
– Support for non-Hadoop sources
Free version of Loom pre-installed with
Hortonworks Sandbox
• Automation (Activescan)
– Discovering and introspecting new data in the cluster
– Triggering external processing (e.g. Oozie script for ETL)
– Automatically collecting metadata about the job - lineage, statistics
– Polling YARN job history for lineage
• User Interactivity (Workbench)
– Advanced user interfaces for data exploration, profiling and preparation
– Data wrangling for interactively cleaning/reshaping raw data into useable data
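The Activescan bullets describe a scan-and-profile loop: notice new data, then record metadata about it. The sketch below illustrates that general pattern on a local directory, using only the Python standard library. It is not Loom's API or behavior; the function names and the metadata fields are assumptions chosen for illustration, and a real deployment would scan HDFS rather than a temporary folder.

```python
import tempfile
from pathlib import Path

def scan_for_new_files(root, seen):
    """Return files under root that earlier scans have not seen, and remember them."""
    new_files = sorted(
        p for p in Path(root).rglob("*") if p.is_file() and p not in seen
    )
    seen.update(new_files)
    return new_files

def profile_file(path):
    """Collect simple metadata for one file: name, size in bytes, line count."""
    data = path.read_bytes()
    return {"name": path.name, "bytes": len(data), "lines": data.count(b"\n")}

# Demo on a temporary directory standing in for a cluster path (an assumption).
with tempfile.TemporaryDirectory() as root:
    seen = set()

    Path(root, "clicks.csv").write_text("user,page\n1,home\n2,cart\n")
    first_scan = [profile_file(p) for p in scan_for_new_files(root, seen)]

    # A second file arrives later; only it should be reported by the next scan.
    Path(root, "orders.csv").write_text("id,total\n7,19.99\n")
    second_scan = [profile_file(p) for p in scan_for_new_files(root, seen)]
```

Keeping the `seen` set between scans is what makes each pass incremental, so already-profiled files are not processed again.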
9. Teradata Appliance for Hadoop
Teradata QueryGrid®
Teradata Studio with Smart Loader
Value Added Software from Partners
Teradata Viewpoint
Teradata Connector for Hadoop (TDCH)
Intelligent Start and Stop
NameNode Failover
Optimized hardware for Hadoop
BYNET™ V5 40Gb/s InfiniBand interconnect
Teradata Vital Infrastructure
Teradata Distribution for Hadoop (based on Hortonworks HDP)
Kerberos
HCatalog
Teradata Loom® (for data management)
10. Teradata QueryGrid™ Vision
[Diagram labels. Users: business users, data scientists. IDW: Teradata Database. Discovery: Teradata Aster Database (SQL, SQL-MR, SQL-GR). Connected systems: multiple Teradata systems (Teradata Database); Hadoop (push-down to Hadoop system); Teradata Aster Database; compute cluster (run SAS, Perl, Ruby, Python, R); RDBMS databases (push-down to other database); MongoDB database (push-down to NoSQL databases).]
11. Teradata QueryGrid™: Teradata - Hadoop
Give business users on-the-fly access to data in Hadoop
• Trusted: Use existing tools/skills and enable self-service BI with granular security
• Standard: 100% ANSI SQL access to Hadoop data
• Fast: Queries run on Teradata or Aster, data accessed from Hadoop
• Efficient: Intelligent data access leveraging the Hadoop HCatalog
[Diagram labels. QueryGrid: Teradata-Hadoop; QueryGrid: Aster-Hadoop. Hadoop: MR, Hive, Pig, HCatalog. Hadoop layer: HDFS. Arrows: data, data filtering.]
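One way table metadata can make data access more efficient is partition pruning: consulting the catalog to decide which files need to be read before touching any data. The sketch below is a toy illustration of that idea only. The `CATALOG` dictionary, the table name, the paths, and `prune_partitions` are hypothetical and are not the HCatalog or QueryGrid API.

```python
# Hypothetical catalog entry: partition values mapped to storage locations.
CATALOG = {
    "weblogs": {
        "partition_key": "dt",
        "partitions": {
            "2014-01-01": "/data/weblogs/dt=2014-01-01",
            "2014-01-02": "/data/weblogs/dt=2014-01-02",
            "2014-01-03": "/data/weblogs/dt=2014-01-03",
        },
    }
}

def prune_partitions(table, low, high):
    """Return only the partition paths whose key falls in [low, high].

    ISO-formatted dates compare correctly as plain strings.
    """
    parts = CATALOG[table]["partitions"]
    return [path for value, path in sorted(parts.items()) if low <= value <= high]

# A query filtered to two days needs to read two of the three partitions.
paths_to_read = prune_partitions("weblogs", "2014-01-02", "2014-01-03")
```

The filter is evaluated against metadata alone, so the partition for 2014-01-01 is never opened.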
12. Teradata Viewpoint
Single Operational View (SOV)
for Teradata, Aster, & Hadoop
• Hadoop Portlets:
– Node Monitor (Aster & Hadoop)
– Hadoop Services
• Integration into existing:
– Monitoring: System Health, Metrics Analysis, Metrics Graph, Capacity Heatmap, Space Usage
– Admin: Alert Viewer, Alert Setup, Teradata Systems, Role Manager
13. Teradata Connector for Hadoop (TDCH)
• Key Features
– High-speed connector between Teradata and Hadoop based on the Apache Sqoop framework
– Both imports and exports data between Teradata and Hadoop
– Leverages the JDBC-FastLoad/FastExport mechanism from Teradata
– Imports/exports Hive rcfile/sequencefile/textfile formats and Hive partitioned files
[Diagram labels: Integrated Data Warehouse; Capture | Store | Refine]
• Available through Hortonworks
– Teradata Connector for Apache Hadoop (Release v1.2.0)
– Download link: http://hortonworks.com/download/
14. Teradata Studio: Smart Loader for Hadoop
Self-Service Load
• Hadoop View
– Browse through tables within the Hadoop cluster
- View table properties
– Bi-directional table copies
- Drag and drop interface
- Maps data types between Hadoop and Teradata tables
– Transfer Status and History
- Track load status
• Benefits
– Simplifies Hadoop browsing
– Ad hoc data movement between Teradata and Hadoop
– No scripting required
– Point and click
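To make the data type mapping step concrete, here is a minimal sketch of what translating Hive column types into Teradata column types could look like. The mapping table is an illustrative assumption, not Smart Loader's actual rules, and the function names and the default VARCHAR length are hypothetical.

```python
# Illustrative Hive-to-Teradata type mapping (an assumption, not the tool's rules).
HIVE_TO_TERADATA = {
    "tinyint": "BYTEINT",
    "smallint": "SMALLINT",
    "int": "INTEGER",
    "bigint": "BIGINT",
    "double": "FLOAT",
    "string": "VARCHAR(1024)",  # Hive strings are unbounded; the length is a guess
    "timestamp": "TIMESTAMP",
}

def map_column(name, hive_type):
    """Return a Teradata column definition for one Hive column."""
    try:
        teradata_type = HIVE_TO_TERADATA[hive_type.lower()]
    except KeyError:
        raise ValueError(f"no mapping defined for Hive type {hive_type!r}")
    return f"{name} {teradata_type}"

def teradata_ddl(table, columns):
    """Build a CREATE TABLE statement from (name, hive_type) pairs."""
    body = ", ".join(map_column(name, hive_type) for name, hive_type in columns)
    return f"CREATE TABLE {table} ({body})"

ddl = teradata_ddl(
    "clicks",
    [("user_id", "BIGINT"), ("page", "STRING"), ("ts", "TIMESTAMP")],
)
```

Raising an error for unmapped types, rather than guessing a default, keeps a copy from silently producing a table with the wrong column definitions.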
15. Questions and Next Steps
More about Teradata & Hortonworks
http://www.hortonworks.com/partner/teradata/
Teradata Loom for HDP
http://www.teradata.com/tryloom
Find Us
@Strata
Booth # 324
Teradata Hadoop Station