1. Using Power View and Hive to Gain Business Insights
Finding Hidden Answers in Data
Joey D’Antoni, Comcast Cable
Stacia Misner, Data Inspirations
April 10-12 | Chicago, IL
3. About Us
Joey D’Antoni
• Principal Architect for SQL Server at Comcast Cable
• @jdanton on Twitter
• joedantoni.wordpress.com
Stacia Misner
• Principal Consultant at Data Inspirations
• @StaciaMisner on Twitter
• blog.datainspirations.com
4. Agenda
• Introducing Big Data
• Overview and Summary of Data Set
• Insights into the Data
• Conclusions
12. Extract, Transform, Load (ETL) Process
[Diagram: Your Database → Some Process Your Business Doesn’t Care About → Some Database]
Credit: Buck Woody, Microsoft
13. Our ETL Process
[Diagram: Collection Server → HDFS]
Hive is a data warehouse system that connects to Hadoop and allows SQL queries to be written against data sets stored in Hadoop.
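As a feel for what "scripting a Hive table" over delimited files in HDFS looks like, here is a minimal HiveQL sketch. The table name, columns, and HDFS path are assumptions for illustration, not the actual Comcast schema:

```sql
-- Assumed schema and HDFS path, for illustration only.
-- An EXTERNAL table projects structure onto files already in HDFS;
-- dropping the table does not delete the underlying data.
CREATE EXTERNAL TABLE stb_engagement (
  region            STRING,
  interval_start    TIMESTAMP,
  max_boxes         INT,
  viewing_seconds   BIGINT,
  potential_seconds BIGINT
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/data/stb/engagement';
```

Because the table is EXTERNAL, loading data is just copying files into that directory; Hive applies the schema at query time.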
14. The Data Set
Set Top Box Engagement Times
• Max Set Top Boxes Viewing Channels
• Aggregate Viewing Seconds
• Potential Total Seconds Watched
• Recorded in 5-, 15-, and 60-minute aggregates
This data is from the week of July 11-17, 2012
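Given the fields on this slide, one natural summary is the share of potential viewing time actually watched. A hedged HiveQL sketch, where the table and column names are illustrative assumptions:

```sql
-- Share of potential viewing time actually watched, per region,
-- for the week covered by the data set. Names are assumed.
SELECT region,
       SUM(viewing_seconds) / SUM(potential_seconds) AS engagement_ratio,
       MAX(max_boxes)                                AS peak_boxes
FROM stb_engagement
WHERE interval_start >= '2012-07-11'
  AND interval_start <  '2012-07-18'
GROUP BY region;
```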
15. Preparation for Data Analysis
• Define question to answer
• Define ideal data set
• Find data
17. Diving into Data Analysis
• Cleanse
  • Reformat as needed
  • Decide what is usable
• Explore
  • Create summaries
  • Perform statistical analysis
  • Use visualizations
So why is this “new” territory? Don’t we have a handle on our data? Sure, we build DW and BI solutions, but honestly, the data we pull into this environment is just a fragment of the full range of data that we could explore. Therein lies the problem.

One type of data comes from cell phones: there are 4 billion of them in the world, each generating its own mountain of data. Or maybe we’re involved in scientific research, working with instruments. Did you know that the Large Hadron Collider generates 40 TB of data… per second? Even if your industry doesn’t deal with mobile or scientific data, surely you have email. Consider how much data exists there. Or maybe there are audio files that get stored, such as in a call center operation. Or video files.

Image credits:
http://crowinfodesign.com/2009/10/19/iphone-cost-analysis/
http://themuse.ca/articles/52015
http://www.mylearning.org/digital-storytelling--recording-equipment-and-editing-in-audacity/images/1-2155/
http://wherewhywhen.com/panasonic-hc-v10eb-k-hd-camcorder-review/
Another thing about classic data analysis is that it inherently imposes structure. Although we may not necessarily use data from relational sources, we generally use data that we can easily break down into records, which in turn break down into fields. We can store all of that data relationally, repackage it in OLAP form, and use it as a source for our reports, dashboards, and so on. In other words, we rely on structure.
Thinking about how we approach a Big Data solution, there are some key differences from traditional data warehousing. First, we can scale as needed with commodity hardware. Second, we don’t have to know in advance how to structure the data, which seems rather counterintuitive for those of us who have spent a lot of time learning how to model data to support BI. Third, we have something called BASE, which stands for Basically Available, Soft-state, Eventually consistent. This is diametrically opposed to ACID, which requires atomicity (all operations in a transaction must complete), consistency (at the beginning and end of the transaction), isolation (the transaction is independent of everything else), and durability (nothing is going to eliminate that transaction). BASE says things are fluid: something can fail in one partition without failing everything everywhere. (Think of an exchange of assets that has left one party but has yet to arrive at the other; the window may be too small for either to notice, but technically it creates an out-of-sync situation.)
Hadoop provides distributed storage through HDFS (the Hadoop Distributed File System) for high throughput, and distributed processing through MapReduce. You store, index, and process data in place. (A DW makes you move data before you can use it, which is heavy lifting; imagine pushing 1 PB through a 1 Gb pipe.) Instead, you move code to the data and send results back to the user. You no longer have to sample the data; you can actually use all of it (imagine putting away the magnifying glass on a subset). More data means better predictability.

• HBase – a column-store NoSQL database: a scalable, distributed database that supports structured data storage for large tables.
• Pig – a high-level data-flow language and execution framework for parallel computation. The language layer is called Pig Latin. You can combine commands into batches and use it to read and write data on parallel systems; for example, you can use it to find the frequency of search phrases stored in a log.
• Hive – a data warehouse system for Hadoop that facilitates easy data summarization, ad hoc queries, and the analysis of large data sets stored in Hadoop-compatible file systems. Hive provides a mechanism to project structure onto this data and to query it using a SQL-like language called HiveQL.
• Mahout – a scalable machine learning and data mining library.
• Sqoop – transfers data in bulk to and from an RDBMS.
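To give a feel for Hive’s SQL-like language, the phrase-frequency task described above for Pig could be written in HiveQL roughly like this; the `search_log` table and its `phrase` column are assumptions for illustration:

```sql
-- Frequency of search phrases in a log.
-- Assumes a table with one phrase per row; names are illustrative.
SELECT phrase, COUNT(*) AS freq
FROM search_log
GROUP BY phrase
ORDER BY freq DESC
LIMIT 20;
```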
Almost all data solutions live in some sort of database. To take that data and transform it into something practical that our business users can analyze, we have to go through what’s known as an ETL process. I’m sure many of you are familiar with SQL Server Integration Services, a very common ETL tool in our space. From an IT perspective that process can be pretty painful: we have to version-control packages, and there are limitations to what we can do in a package. Right, Stacia? (Stacia: fill in here with the Chicago story.)

So that brings us to our decisions on how to handle our data for this project. We had a pretty large number of files, but we weren’t exactly sure how we wanted to handle the data. We wanted to be able to do a wide variety of analysis and not really be confined. So that leads us into our ETL process…
So for our project, we are collecting data from set top boxes. It’s aggregated for an entire region for privacy purposes, and then loaded onto a collection server in the form of comma-delimited files. Part of our strategy at Comcast is to work toward using more open source solutions, so this seemed like the perfect time to leverage Hadoop. I’m not going to cover Hadoop 101, but if you don’t know what it is, it’s basically a distributed file system (there are a lot more components than that). Our ETL process is as simple as loading files into Hadoop, an O/S-level operation that happens really quickly, and then scripting a Hive table, which we’ve also automated. Then I hand off to Excel, where Stacia can work her magic using Power View. One interesting point is that we can create multiple data structures on the same set of data.

Hive design principles: scalable, extensible (via UDF and UDAF), fault tolerant, and loosely coupled with file formats.

What Hive is not: low-latency response times on queries.

What Hive is: a data warehousing framework on Hadoop. It imposes metadata and familiar-looking HiveQL, acts as a simple translation layer for MapReduce, is extensible via custom mappers and reducers, is loosely coupled with input formats, and enables analytics from high-level BI tools via ODBC.
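The point about multiple data structures over the same data follows from Hive’s loose coupling with file formats: several EXTERNAL tables can project different schemas onto the same HDFS directory. A sketch under assumed names and an assumed path:

```sql
-- Two views of the same delimited files: one treating each line
-- as a single raw string, one fully parsed into columns.
-- Table names, columns, and the path are illustrative assumptions.
CREATE EXTERNAL TABLE stb_raw (line STRING)
LOCATION '/data/stb/engagement';

CREATE EXTERNAL TABLE stb_parsed (
  region            STRING,
  interval_start    TIMESTAMP,
  max_boxes         INT,
  viewing_seconds   BIGINT,
  potential_seconds BIGINT
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LOCATION '/data/stb/engagement';
```

Neither table owns the files, so both definitions can coexist, and dropping one leaves the data untouched.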
Do demo here to show the aggregate statistics about the data.