Making Sense of Big Data

October 11, 2012
#MakeSenseBD

Information is powerful.

But it is how we use it that will define us.

10/15/2012 Infochimps Confidential

First
What is Big Data?

“data sets so large and complex that it becomes
difficult to process using on-hand database
management tools.”



#Volume
#Velocity
#Variety

2010 = 1.2 zettabytes/yr
2020 = 35.2 zettabytes/yr

Source: 2011 IDC Digital Universe Study
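
To put the two IDC figures in perspective, here is a minimal sketch of the annual growth rate they imply. It assumes smooth compound growth between 2010 and 2020, which is a simplification of ours and not something the IDC study states; the function name `cagr` is purely illustrative.

```python
def cagr(start, end, years):
    """Compound annual growth rate between two values."""
    return (end / start) ** (1 / years) - 1

# Figures from the slide: zettabytes per year in 2010 and 2020.
rate = cagr(1.2, 35.2, 2020 - 2010)
print(f"Implied annual growth: {rate:.1%}")
```

Under that assumption, the data created each year grows by roughly 40% annually.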


It’s All About The Data
DIGITAL CONTENT

OPERATIONAL DATA

WEB LOGS

SOCIAL MEDIA
FILES
SMART GRIDS

TRANSACTIONAL DATA

AD IMPRESSIONS
R&D DATA


Problem
“Little Data for Business Users”



Problem
One Size Does Not Fit All

[Figure: database vendor landscape. Products are grouped as Relational or Non-Relational and as Analytic or Operational, with further categories for NoSQL (Key Value, Document, Big Tables, Graph), NewSQL, and ‘Data as a Service’. Dozens of products appear, from Teradata, Oracle, and IBM DB2 to Hadoop, MongoDB, Cassandra, and Neo4j.]



Problem
Complexity of A New Data Architecture
[Figure: a new data architecture spanning structured and semi-structured data. Sources: online click data, online BI data, CRM data, POS data, customer service call logs, and social. Components: Teradata data warehouse, departmental and virtual data marts, SQL BI server, real-time data streaming, Hadoop analytics platform, NoSQL warehouse, in-memory analytics, and sandboxes. Consumers: BI users (reports), an operational customer application, analytics users, business users (reports), and IT (ETL).]


“Big Data for Business Users”



[Figure labels: Executive, Data, $, ?]


#thisisreallygood



#timeforaPOLL



Next
Hadoop + NoSQL technologies =

the ability to process large and complex
data sets without the challenges
associated with legacy, and at a fraction
of the price.



Enterprise Data Warehouse

[Figure: a request enters the parsing engines, crosses the BYNET interconnect to a row of AMP nodes, and returns as an answer.]

PARC | 16


Big Data Warehouse

[Figure: an analytic request (search, recommend, rank, score, next-best-action) goes to the master (Name Node, Job Tracker), which connects over an Ethernet interconnect to slaves (each a Task Tracker and Data Node) holding semi-structured data, and returns as an answer.]
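
The master/slave layout on this slide can be made concrete with a toy example. The sketch below imitates, in a single Python process, the map, shuffle, and reduce steps that a Job Tracker would hand out to Task Trackers. It is an illustration only, not Hadoop code: the function names and the sample splits are hypothetical, and a real cluster runs each split on a separate Data Node.

```python
from collections import defaultdict
from itertools import chain

def map_phase(split):
    """Emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for line in split for word in line.split()]

def shuffle(pairs):
    """Group emitted values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}

def run_job(splits):
    """Map every split independently, then shuffle and reduce the results."""
    mapped = chain.from_iterable(map_phase(split) for split in splits)
    return reduce_phase(shuffle(mapped))

# Two hypothetical splits, standing in for blocks stored on two data nodes.
splits = [["big data big"], ["data sets"]]
counts = run_job(splits)
print(counts)
```

Because each split is mapped independently, adding nodes adds capacity without changing the job itself.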


[Figure: quadrant chart. Vertical axis: Batch to Real Time. Horizontal axis: Large Enterprise to Small Enterprise. Labels: Traditional Operational Application Ecosystem; Analytic Appliances; Traditional Decision Support; Deployment in Public/Private Cloud; Toolset Integration; Hardened.]



#lotsofdata + #simpleanalytics



[Figure: data sources (images, docs, text, web logs, social, sensors, GPS) and business transactions and interactions (web, mobile, CRM, ERP, SCM…) feed SQL, NoSQL, NewSQL, EDW, and MPP stores, which in turn feed business intelligence and analytics (dashboards, reports, visualization…).]



Use Case
Hedge Fund

How do I predict whether companies will
make their quarterly earnings forecast?



Walmart



Target



[Figure: inputs to a quarterly revenue prediction: cars in lot, news text, web pricing, social sentiment, weather sensors, and local employment.]



Use Case
Media Company

How do I merge my traditional media
sources with new media sources to
provide improved and instant insights to
my customers?


[Figure: media listening architecture. New media sources (Gnip PowerTrack, Gnip EDC, Moreover Metabase) and traditional media sources (TV, radio, and print transcription) flow through an in-motion data delivery service into NoSQL, with sentiment and APIs feeding a listening application. Users shown: data scientist, app developer, business users, and IT staff.]


Use Case
Retail Company

How do I increase online revenue?


[Figure: a dynamically populated personalized email, assembled from known and unknown customers' online/offline behavior, current approved offers, and existing, approved product content. The sample email shows offer tiles such as Welcome 15%, Weekend 15%, Sunday 25%, and Spring 25%, across products such as Denim, Khakis, Hoodies, Kids, and Baby.]

[Figure: retail data flow. Sources: current campaign offers, online click data, online BI data, past CRM data, POS data, customer service call logs, social, and product content. Processing: traditional analytics, plus a Hadoop cluster with graph analytics, producing data models. Output: targeted offers and products for a personalized email campaign, followed by performance measurement.]


#85%AccurateFirstTime



#timeforaPOLL



I’m Ready
So How Do I Start?

…without spending a *$#&-load of
money before proving ROI?



Deployment Options

On-Premise

Public Cloud Provider

Trusted Data Center Provider



You Manage
• Private Big Data Cloud
• Virtual Private Big Data Cloud
• Public Big Data Cloud

Someone Else Manages
• Virtual Private Big Data Cloud (Managed Service)
• Public Big Data Cloud (Managed Service)

[Figure: each option is compared on cost ($), security risk, and time to market.]



Who?
#InfochimpsOfCourse



Infochimps
• Managed Big Data Services
• Elastic & Secure Private & Public Clouds
• Across a Global Network of Trusted Data Center Providers
• With Batch & Real-time Delivery
• Supporting Structured & Unstructured Data

[Figure: Data Intelligence Network. Labels: Enterprise Customers (App, BI, Analytics Sys); Data Delivery; Analytic Framework (Hadoop, NoSQL); Global Network of Data Center Infrastructure Providers.]


Data Intelligence Network

Cloud-based Data PaaS

Data Marketplace
• 15,000 Data Sets

Big Data PaaS
• Virtual Private & Public Cloud: Tier 4 lights-out data centers, OpenStack & vSphere, managed services
• Public Cloud: Amazon & Rackspace, managed services


Elastic Big Data PaaS

Deployment From Laptop to Cloud (Public & Private)
Amazon, Rackspace, OpenStack & vSphere
Ironfan



Big Data Managed Service Offerings

Community
• Access to Infochimps Big Data Platform via open source
• Deploy anywhere
• Try it under your control

Public Cloud
• Pre-integrated, pre-tested Big Data stack
• Quickly deploy in Amazon Cloud or Rackspace Cloud
• Fully managed service

Virtual Private Cloud
• Pre-integrated, pre-tested Big Data stack
• Deployed in a trusted lights-out data center network
• High-SLA managed service

Private Cloud
• Pre-integrated, pre-tested Big Data stack
• Deployed in your data center (OpenStack or vSphere)
• Customized managed service


#LastPOLL



#1 Big Data Platform For The Cloud

www.infochimps.com/demo

1-855-DATA-FUN (1-855-328-2386)

