In this webinar, we will discuss how Apache Hadoop works with your current infrastructure and how you can use data discovery and visualization tools to gain deeper insights from new data types stored in Hadoop and your existing data center investments.
Data Discovery, Visualization, and Apache Hadoop
1. Data Discovery, Visualization
and Apache Hadoop
An InformationWeek Webcast
Sponsored by
26. Q&A
Ted J. Wasserman
Product Manager
Tableau Software
John Kreisa
VP Strategic Marketing
Hortonworks
Lenny Liebmann
Contributing Editor
InformationWeek
27. Resources
To View This or Other Events On-Demand Please Visit:
http://www.informationweek.com/events
http://www.netseminar.com
For more information please visit:
http://hortonworks.com/products/hortonworks-sandbox/
Editor's Notes
For the visual thinkers out there, let's expand our mathematical model to show some concrete examples. ERP, SCM, CRM, and transactional web applications are classic examples of systems processing Transactions; the highly structured data in these systems is typically stored in SQL databases. Interactions are about how people and things interact with each other or with your business: web logs, user click streams, social interactions and feeds, and user-generated content are classic places to find Interaction data. Observational data tends to come from the "Internet of Things": sensors for heat, motion, and pressure, and the RFID and GPS chips inside mobile devices, ATMs, and even aircraft engines, are just some examples of "things" that output Observation data.
Most folks would agree that video is "big" data. The analysis of what's happening in that video (i.e., what you, me, and others are doing in it) may not be "big," but it is valuable and it does fit under our umbrella. Moreover, business data feeds and publicly available data sets are also big data, so we should not limit our thinking to just the data that flows through an organization. For example, the mortgage-related data you may have could benefit from being blended with external data found in Zillow. The government, through the Open Data Initiative, is making more and more data publicly available. One use case I find interesting is predictive policing, where state and local law enforcement apply analytics to crime databases and other publicly available data to help predict where and when pockets of crime might spring up. These proactive analytics efforts have yielded real reductions in crime!
Anyhow, this is what Big Data means to me; hopefully it makes sense to you. It is important to note that we think of big data beyond the traditional concepts of volume, velocity, and variety, and instead in terms of transactions, interactions, and observations. In reality, this IS the big data our customers are dealing with.
While overly simplistic, this graphic represents what we commonly see as a general data architecture: a set of data sources producing data; a set of data systems to capture and store that data, most typically a mix of RDBMSs and data warehouses; and a set of applications that leverage the data stored in those data systems. These could be packaged BI applications (Business Objects, Tableau, etc.), enterprise applications (e.g., SAP), or custom applications (e.g., custom web applications), ranging from ad-hoc reporting tools to mission-critical enterprise operations applications. Your environment is undoubtedly more complicated, but conceptually it is likely similar.
As the volume of data has exploded, we increasingly see organizations acknowledge that not all data belongs in a traditional database. The drivers are both cost (as volumes grow, database licensing costs can become prohibitive) and technology (databases are not optimized for very large datasets). Instead, we increasingly see Hadoop, and HDP in particular, being introduced as a complement to the traditional approaches. It is not replacing the database; it is a complement, and as such it must integrate easily with existing tools and approaches. This means it must interoperate with existing applications such as Tableau, SAS, and Business Objects; with existing databases and data warehouses, for loading data to and from the warehouse; with the development tools used for building custom applications; and with the operational tools used for managing and monitoring. A sketch of that database-to-Hadoop handoff follows.
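To make that handoff concrete, here is a minimal Python sketch of the kind of offload job this architecture implies: rows are pulled from an existing data system and landed in HDFS as a flat file for downstream processing. The table name, connection details, endpoint, and paths are hypothetical placeholders, and in production this movement is usually handled by a purpose-built tool such as Apache Sqoop rather than hand-written code.

import csv
import io
import sqlite3  # stand-in for any JDBC/ODBC-reachable database or warehouse

from hdfs import InsecureClient  # WebHDFS client from the open source `hdfs` (HdfsCLI) package

# 1. Pull rows out of the existing data system (a local SQLite file here,
#    purely as a placeholder for your RDBMS or warehouse).
conn = sqlite3.connect("warehouse.db")
rows = conn.execute("SELECT id, ts, amount FROM transactions")  # hypothetical table

# 2. Serialize the rows to CSV in memory.
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["id", "ts", "amount"])
writer.writerows(rows)

# 3. Land the file in HDFS, where Hive, Pig, or MapReduce jobs can pick it up.
client = InsecureClient("http://namenode:50070", user="etl")  # hypothetical endpoint
client.write("/data/raw/transactions.csv", data=buf.getvalue(),
             encoding="utf-8", overwrite=True)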
It is for that reason that we focus on HDP interoperability across all of these categories. Data systems: HDP is endorsed by and embedded with SQL Server, Teradata, and more. BI tools: HDP is certified for use with the packaged applications you already use, from Microsoft to Tableau, MicroStrategy, Business Objects, and more. Development tools: for .NET developers, Visual Studio, used to build more than half the custom applications in the world, is certified with HDP so Microsoft application developers can build custom apps with Hadoop; for Java developers, Spring for Apache Hadoop makes it quick and easy to build Hadoop-based applications with HDP. Operational tools: integration with System Center and with Teradata Viewpoint.
Now that we've covered the overall architecture and how Hadoop fits, let's discuss the patterns of use we're seeing for Hadoop. At a high level, we describe the three key patterns as Refine, Explore, and Enrich. Refine captures data into the platform and transforms (or refines) it into the desired formats. Explore is about creating lakes of data that you can interactively surf through to find valuable insights. Enrich is about leveraging analytics and models to influence your online applications, making them more intelligent. So while some categorize Hadoop as just a batch platform, it is increasingly being used, and evolving, to serve a wide range of usage patterns that span batch, interactive, and online needs. Let me cover these patterns in a little more detail.
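Before going further, here is what the Refine pattern can look like in practice: a minimal Hadoop Streaming sketch in Python that turns raw web-log lines already sitting in HDFS into a refined page-view count. The log field position, file names, and HDFS paths are hypothetical stand-ins for your own data.

#!/usr/bin/env python
# mapper.py: emit "page<TAB>1" for each request in a raw access-log line
# read from stdin. The field index (6) assumes a common/combined log
# format; adjust it for your own logs.
import sys

for line in sys.stdin:
    fields = line.split()
    if len(fields) > 6:              # skip malformed lines
        print("%s\t1" % fields[6])   # fields[6] is the request path

#!/usr/bin/env python
# reducer.py: sum the per-page counts emitted by mapper.py. Hadoop
# Streaming delivers mapper output sorted by key, so a running total
# per key is sufficient.
import sys

current_key, total = None, 0
for line in sys.stdin:
    key, _, value = line.rstrip("\n").partition("\t")
    if key != current_key:
        if current_key is not None:
            print("%s\t%d" % (current_key, total))
        current_key, total = key, 0
    total += int(value)
if current_key is not None:
    print("%s\t%d" % (current_key, total))

A job like this is typically launched with the hadoop-streaming jar that ships with the distribution, along the lines of: hadoop jar hadoop-streaming.jar -input /data/raw/logs -output /data/refined/pageviews -mapper mapper.py -reducer reducer.py -file mapper.py -file reducer.py (the jar's exact path varies by install).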
In summary, by addressing these elements, we can provide an Enterprise Hadoop distribution which includes the Core Services, Platform Services, Data Services, and Operational Services required by the enterprise user. All of this is done in 100% open source and tested at scale by our team (together with our partner Yahoo) to bring enterprise process to an open source approach. And finally, this is the distribution that is endorsed by the ecosystem to ensure interoperability in your environment.
At Hortonworks today, our focus is very clear: we develop, distribute, and support a 100% open source distribution of Enterprise Apache Hadoop. We employ the core architects, builders, and operators of Apache Hadoop and drive the innovation in the open source community. We distribute the only 100% open source Enterprise Hadoop distribution: the Hortonworks Data Platform. Given our operational expertise running some of the largest Hadoop infrastructure in the world at Yahoo, our team is uniquely positioned to support you. Our approach is also uniquely endorsed by some of the biggest vendors in the IT market. Yahoo is an investor, a customer, and, most importantly, a development partner: we partner to develop Hadoop, and no distribution of HDP is released without first being tested on Yahoo's infrastructure, using the same regression suite they have used for years as they grew to run the largest production cluster in the world. Microsoft has partnered with Hortonworks to include HDP in HDP for Windows, HDInsight Server, and the HDInsight Service.