© Copyright 2014 EMC Corporation. All rights reserved.
New IT Agenda
Provide Access To
All Applications & Data Through
Mobile Devices.
Use Adaptive, Data-Driven
Security To Rapidly Respond To
Emerging Threats.
Build A Data Lake To Deliver
Insights And Applications On
All Data.
Move To A Software-Defined
Data Center Infrastructure And
Expand It To A Hybrid Cloud.
Use Agile Development To
Build New Customer-Centric
Applications.
Balance Risk
Cut Operational Costs &
Legacy More Than Ever
React Faster To
Find New Growth
Today’s Business
Challenges
Best Of Breed. Architected Horizontally, Not Vertically. Choice.
Our Strategy: Build A Differentiated Stack
BIG DATA SOLUTIONS
PLATFORM AS A SERVICE
AGILE APPLICATION DEVELOPMENT
INFORMATION INFRASTRUCTURE
CONVERGED INFRASTRUCTURE
SOFTWARE-DEFINED DATA CENTER
HYBRID-CLOUD, MOBILITY
ADVANCED SECURITY
Many Industries Face Structural Change
Volvo Cars – Big Data app & service
Deliveries to your car – Roam Delivery
The EMC Exabyte Journey
20,772 HARD DRIVES
300 PALLETS
8 TRUCKS
85 PETABYTES SOLD INTO ONE WEB-SCALE PROVIDER IN ONE ORDER
Capacity
Performance
Low Service
Level
High Service
Level
Performance “Good Enough”
Capacity Optimized ($/GB)
Data Loss Not A Disaster
Consistently Good Performance
Eventual Consistency Of Data
Data Loss Not A Disaster
Performance “Good Enough”
Capacity Optimized ($/GB)
Data Loss A Disaster
Consistently Good Performance
Consistent Data
Data Loss A Disaster
Great Performance
Consistent Data
Data Loss A Disaster
The Core is Super-Scaling
The Edge is Hyper-Extending
THE 3RD PLATFORM OF IT
TODAY’S DATA CENTER SOFTWARE-DEFINED DATA CENTER
TRADITIONAL APPLICATIONS NEXT GEN CLOUD APPLICATIONS
120M
-2016-
91M
-2013-
34M
-2016-
11M
-2013-
TRADITIONAL APPLICATION GROWTH NEXT GEN CLOUD APPLICATION GROWTH
THE 3RD PLATFORM REDEFINES EVERYTHING
BUILT FOR THE SPEED OF BUSINESS
Pivotal Confidential–Internal Use Only
Data Driven Application
Development
Pivotal
At-a-Glance
•  New Independent Venture: Spun out & jointly owned by EMC & VMware
•  Deep Execution Talent: 1700 employees
•  Proven Leadership: Paul Maritz, CEO
•  Global Customer Validation: +1000 Tier-1 Enterprise Customers
•  Strategic Backing: $100M investment by GE
•  Bold Vision: New platform for a new era, focused on the intersection of apps, big data and analytics
Need for Speed
•  Enterprises are being driven to compete, innovate & execute faster than ever before:
–  Global reach and emerging markets
–  Ever-increasing customer expectations
–  Legacy environment & cost pressures
•  At a time when we’re witnessing the most disruptive platform shifts and advances in technology in over 30 years
•  Every business is quickly becoming a software business
•  Software is how companies engage with customers, powered by new data insights and new
Social
Cloud
Big
Data
Mobile
What Matters: Apps. Data. Analytics.
Apps power businesses, and
those apps generate data
Analytic insights from that data
drive new app functionality,
which in turn drives new data
The faster you can move
around that cycle, the faster
you learn, innovate & pull
away from the competition
“Software is Eating the World”
Software is Changing Industries
$3.5B valuation
Financial Services
$3.5B valuation
Travel & Hospitality
$3.5B valuation
Transportation
$3.2B Acquisition by Google
Home Automation
$20B valuation
Entertainment
$1.1B acquisition
Monsanto--Agriculture
We need to innovate or we die. Pivotal and Cloud Foundry is
our big bet to leapfrog the competition
With Cloud Foundry, we built what looks like a software
company… Moving from silos to a single platform.
Cloud Foundry's potential to transform business is vast…
companies leveraging software will outperform their peers
Enterprises Must Become Great at Software
Francisco Gonzalez
CEO at BBVA
“Banks need to take on Amazon and
Google or die. The shift to digital
requires a complete overhaul of
banks technology…it is a matter of
survival.”
Rapid Execution Requires a New Approach
•  Agile teams and rapid iteration
•  Continuous delivery without downtime
•  Horizontal scalability (data and app)
•  Standardized service binding and
discovery
•  First class Mobile support
•  Deep user analytics
Development
Delivery
Operation
Iteration
Jonathan Rosenberg
CTO & VP, Collaboration
“PaaS is the operating system for
the cloud. As the set of APIs and
services for PaaS's grow, the choice
of PaaS becomes more crucial as the
costs of porting go up. This is one of
the benefits of open source PaaS
offerings like Cloud Foundry.”
Is Your Enterprise Ready?
Fail, Learn, Adapt, Repeat.
6 Months to 6 Weeks
$1.1M Saving per App
Incredible Cloud Foundry Ecosystem
The Cloud Foundry Foundation
more to come…
“This is a significant announcement for PaaS in general and for
Cloud Foundry in particular. It potentially signals a
consolidation that is going to become apparent…predicts that
Red Hat will shutter OpenShift and throw its hat in with Cloud
Foundry within the year.” - Forbes, Ben Kepes, 2/24/14
Elastic Runtime
Java, Spring, Ruby, Node.JS
Built-in “Middleware” Services
Operation Manager
Installation, Management,
Monitoring, Upgrades/Updates
...ETC
Pivotal Approach: Your Platform for Building Great
Software
PivotalOne
Pivotal One
CO-INNOVATION
Agile Software Development
Data Lake Solutions
PIVOTAL
MySQL
Pivotal One
SERVICES
Pivotal Approach: An Application Centric World
Infrastructure Specific
JVM
VM
Pre-Provisioned Pool of VMs
Container 1
App Server
JVM, etc..
Container 2
App Server
JVM, etc..
App1 Common Access Tier (App1, App2)
App Server
Configurations Built-in Middleware Services
JVM
VM
App2
App Server
Configurations
IaaS Agnostic
GE Capital Builds Foundation For Value Add Insights
“Critical insights and data are deleted
because it’s too expensive to store.”
“We need the ability to blend data
fabric, build analytics, and create
applications on top of this.”
“Access any internal or external
data of interest through a familiar
interface.”
“Now we analyze Social Media to
predict trends, and help dealers
make decisions.”
Use-Case: Data and PaaS Drives Business Agility
Pivotal CF Operation Manager
Any Infrastructure
Big/Fast Data
Real-time change to customer-facing
application based on data analysis
Deploy/Update
(Private/Public)
Spring becomes the enabler
Deploy to
Cloud or on
premise
Big, Fast, Flexible Data
Data Processing, Integration
Spring Data
•  JPA/JDBC
•  MongoDB
•  Redis
•  Neo4j
•  GemFire
•  Data REST
•  Spring
Hadoop
•  Spring
Integration
•  Spring
Batch
•  CloudFoundry
•  vCloud Suite
•  Google App
Engine
•  Amazon Elastic
Beanstalk
•  CloudBees
Data Driven: Harder Than it Sounds
Operationalize
Ingest
Distill
Interface
Process
Analytical Transactional
Real Time: Predictive call routing, fraud prediction, dynamic pricing, re-marketing, stream analytics
Near Real Time: Analytic model designs, transaction analysis, trend analysis
Batch: ETL, archive, trending, monthly and weekly jobs
Data Driven: Impossible in Silos
Finance Manufacturing Marketing IT
Data Growth Over 60%
Floods These Silos
One Platform, Multiple Use Cases
Flexibility to expand the
ingestion of network data as
the underlying infrastructure
changes – easy expansion into
4G LTE.
External
data sources
•  LinkedIn
•  Twitter
•  Facebook
•  Weather
•  …
Internal
data sources
•  CRM
•  EDW
•  Customer Portals
•  …
Mobile Network Infrastructure
Intelligently set triggers at the
edge of the network that look
for 'interesting' events that
require instant action
Build
New Apps
Integrate
Existing Apps
Agile approach to enable apps to access in real-time
All
History
Current
Status
Predicted
Intelligence
Single Unified Platform
•  Capture everything
•  Real-time data processing
•  Single version of the truth
•  End-to-end visibility across
different data sources
•  Scalable and cost effective
RTI In Telco Ecosystem
RTI-T
Integration
Sanity Check
Filter
Enrich
Transform
Apply Logic
OSS
BSS
Network
Elements
Application
Hosting
Environment
for Real-time
Use-case
Realization
Network
Optimization
Customer
Experience
Management
....
Ingest Distribute Process
Route
Persist
Long Term Storage
Operational
Datastore
Analytics
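The integration stages named on this slide (Sanity Check, Filter, Enrich, Transform, Apply Logic) can be pictured as a chain of small functions applied to each event. The following is only a sketch under stated assumptions: the event fields (`cell_id`, `dropped_calls`), the region lookup and the alert threshold are hypothetical and do not come from the deck or from any Pivotal product API.

```python
from typing import Callable, Iterable, List, Optional

Event = dict
# A stage returns the (possibly modified) event, or None to drop it.
Stage = Callable[[Event], Optional[Event]]


def sanity_check(event: Event) -> Optional[Event]:
    # Drop records that lack the fields the later stages rely on.
    return event if "cell_id" in event and "dropped_calls" in event else None


def filter_stage(event: Event) -> Optional[Event]:
    # Keep only events that report at least one dropped call.
    return event if event["dropped_calls"] > 0 else None


def make_enricher(cell_regions: dict) -> Stage:
    # Enrich each event with reference data (here: a hypothetical region lookup).
    def enrich(event: Event) -> Optional[Event]:
        return {**event, "region": cell_regions.get(event["cell_id"], "unknown")}
    return enrich


def transform(event: Event) -> Optional[Event]:
    # Normalise the identifier format for downstream consumers.
    return {**event, "cell_id": event["cell_id"].upper()}


def apply_logic(event: Event) -> Optional[Event]:
    # Hypothetical business rule: flag cells with five or more dropped calls.
    return {**event, "alert": event["dropped_calls"] >= 5}


def run_pipeline(events: Iterable[Event], stages: List[Stage]) -> List[Event]:
    """Pass every event through the stages in order, dropping it if a stage returns None."""
    results = []
    for event in events:
        current: Optional[Event] = event
        for stage in stages:
            current = stage(current)
            if current is None:
                break
        if current is not None:
            results.append(current)
    return results


events = [
    {"cell_id": "a1", "dropped_calls": 7},
    {"cell_id": "b2", "dropped_calls": 0},
    {"dropped_calls": 3},
]
stages = [sanity_check, filter_stage, make_enricher({"a1": "north"}), transform, apply_logic]
processed = run_pipeline(events, stages)
```

In a real deployment the resulting events would then be routed and persisted; that distribution step is left out of this sketch.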
Solution Architecture
RTI-T
Integration
Sanity Check
Filter
Enrich
Transform
Apply Logic
OSS
BSS
Network
Elements
Cloud
Foundry
Network
Optimization
Customer
Experience
Management
....
Ingest Distribute Process
Route
Persist
Pivotal Data Lake
GemFire XD, ADS/HAWQ
World’s Leading Experts
Pivotal Labs – Pivotal Data Labs
Pivotal One
Pivotal Application Suite
BATCH
NEAR TIME: HAWQ, Greenplum DB
Pivotal HD
REAL TIME: GemFire XD, GemFire
Pivotal’s Opportunity
Uniquely positioned to help
enterprises modernize each
facet of this cycle today
Comprehensive portfolio of
products spanning Apps, Data
& Analytics
Converging these technologies
into a coherent, next-gen
Enterprise PaaS platform
BUILT FOR THE SPEED OF BUSINESS
© Copyright 2014 Pivotal. All rights reserved.
“It is not the strongest of the species that survives, nor the most intelligent that survives. It is the one that is most adaptable to change.”
- Charles Darwin
BE PREPARED FOR
THE SPEED OF BUSINESS
BUILT FOR THE SPEED OF BUSINESS
•  What is the Pivotal platform?
•  Why is it so cool?
•  What amazing things have we done with it?
BE PREPARED FOR
THE SPEED OF BUSINESS
Evolving Enterprise Data Architecture
Analytic
Data Marts
MPP Database
Operational
Intelligence
In-Memory DB
Run-Time
Applications
In-Memory Object
Enterprise Data
Warehouse
RDBMS
Data Staging
Platform
Traditional BI/Reporting
Data Visualization
Data
Ingestion
System
Stream/CEP
Analytic
Data Marts
Operational
Intelligence
Run-Time
Applications
Enterprise Data
Warehouse
Data Staging
Platform
Traditional BI/Reporting
Data Visualization
Data Ingestion
System
Pivotal Data Product Portfolio
Why is the Pivotal platform so awesome?
Infrastructure
Independent
Fast &
Scalable
Schema
Free
Easy to
Use
Real
Time
Big Data & Data Science
Decision = Data + Rules
“Big Data”
Data Science
Social Media
Commercial & Public Data
Dark Data
“Big Data”
Operational Data
Combining data sources: Example
IPSQ (Quality)
Owner: TS Production team
Test flags from production line
1 year ~300GB
APDM
Owner: TS Production team
Full vehicle history including IPST (technical),
IPSL (logistics), IPSQ test flags and all test
results.
30 years ~TBs
FASTA
Owner: Aftersales
Dealership electronic tests
Identifies early issues with cars
>25TB
IQS: Initial Quality Survey from JD Power
Owner: R&D
Survey responses from new owners after 90
days for approx 1700 vehicles
Few thousand lines ~MB
Social Data
Owner: R&D
Pulling 500MB per day from Twitter
TQP
Owner: Supplier management
PDFs of parts spec sheets
~ 500GB
Generating value from data:
Car configurator example
Sales
Configurations
Customers
Sales
Configurations
Customers
Basic
Recommendation
Engine
Sales
Configurations
Customers
Advanced
Recommendation
Engine
Sales
Configurations
Customers
Yield
Optimization
Engine
All car elements:
•  Attribute frequencies (colors etc)
•  Attribute combination frequencies
For instance:
•  Browsing history
•  Usage patterns
•  Demographic insights
Ideally:
•  Volumes
•  Pricing
•  By market
•  Linkable to configurations
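The basic recommendation engine on this slide is described in terms of attribute frequencies and attribute combination frequencies. A minimal sketch of how such counts could be computed and used follows; the configuration records, attribute names and the `recommend` helper are illustrative assumptions, not the engine built in the project.

```python
from collections import Counter
from itertools import combinations


def attribute_frequencies(configs):
    """Count how often each (attribute, value) pair appears across configurations."""
    counts = Counter()
    for config in configs:
        counts.update(config.items())
    return counts


def combination_frequencies(configs):
    """Count how often each pair of (attribute, value) choices appears together."""
    counts = Counter()
    for config in configs:
        items = sorted(config.items())  # fixed order so each pair has one key
        counts.update(combinations(items, 2))
    return counts


def recommend(configs, chosen_attr, chosen_value, target_attr):
    """Return the most common target_attr value among configs matching the chosen value."""
    matching = Counter(
        c[target_attr]
        for c in configs
        if c.get(chosen_attr) == chosen_value and target_attr in c
    )
    return matching.most_common(1)[0][0] if matching else None


# Hypothetical configurator records.
configs = [
    {"color": "red", "wheels": "sport"},
    {"color": "red", "wheels": "sport"},
    {"color": "blue", "wheels": "standard"},
    {"color": "red", "wheels": "standard"},
]
suggestion = recommend(configs, "color", "red", "wheels")
```

The more advanced engines on the slide would add customer and sales data to these counts; that is outside the scope of this sketch.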
The value of data over time
[Figure: Value of Data ($) versus Time (µs, ms, s, hour, day, month, year, yr+), with regions labeled Traditional Systems, “Big Data” and “Fast Data”. Pivotal Data Science Labs.]
"Pivotal is aiming for the
enterprise market that's
realizing that software
is the biggest
differentiator in
any industry."
— Larry Dignan, ZDNet
“The number of companies that
have bought into the initiative, the
amount of code being contributed,
the customer wins that
ecosystem members are
enjoying suggest that Cloud
Foundry is preeminent among all
the open source PaaS initiatives."
— Ben Kepes, Forbes
"If you're in the business of
building enterprise software,
scrambling to figure out what
your company is doing around
big data and analytics, mobile
and the cloud, then there's a fair
chance you'll want to pay
attention to Pivotal."
— Arik Hesseldahl, WSJD
But don’t just take our word for it…
What does
traffic data
look like?
…like this?
…or this?
(Note: This is the
least offensive topic
cluster in our
Twitter data!)
Velocity by Time of Day
Velocity Distribution
[Figure: density of velocity (km/h, 0 to 200) for Link 1000064869]
Gaussian Mixture Model
[Figure: Gaussian mixture fitted to the velocity density (km/h, 0 to 200) for Link 1000064869, showing the Combined curve and Components 1 to 4]
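A Gaussian mixture describes the velocity distribution of a road link as a weighted sum of normal curves, for example one per traffic regime. The sketch below fits a one-dimensional mixture with expectation-maximisation (EM). It is an illustration only: the function names, the quantile-based initialisation and the synthetic speeds are assumptions, and nothing here is the presenters' code or data.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_gmm_1d(data, k, iterations=100):
    """Fit a k-component 1-D Gaussian mixture with EM.

    Returns a list of (weight, mean, variance) tuples sorted by mean.
    """
    data = sorted(data)
    n = len(data)
    # Deterministic start: spread the initial means over the data quantiles.
    means = [data[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    overall_mean = sum(data) / n
    overall_var = max(sum((x - overall_mean) ** 2 for x in data) / n, 1e-6)
    variances = [overall_var] * k
    weights = [1.0 / k] * k

    for _ in range(iterations):
        # E-step: responsibility of each component for each observation.
        resp = []
        for x in data:
            dens = [w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300  # guard against numerical underflow
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave an empty component unchanged
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, 1e-6)
    return sorted(zip(weights, means, variances), key=lambda c: c[1])


def mixture_density(x, components):
    """Combined density: the weighted sum of the component densities."""
    return sum(w * gaussian_pdf(x, m, v) for w, m, v in components)


# Synthetic speeds in km/h: a congested regime and a free-flow regime.
rng = random.Random(42)
speeds = [rng.gauss(30, 5) for _ in range(300)] + [rng.gauss(90, 10) for _ in range(300)]
components = fit_gmm_1d(speeds, k=2)
for weight, mean, var in components:
    print(f"weight={weight:.2f} mean={mean:.1f} km/h sd={math.sqrt(var):.1f}")
```

The number of components `k` is a modelling choice; the legend on the slide lists four components for this link, while the synthetic example uses two to stay short.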
Decision Trees
Example
[Figure: velocity density (0 to 250) with the Combined curve and Components 1 to 3, next to a decision tree that splits on hour (>= 14, < 20), weekday (1,2,3,4,5 versus 6,7) and nextlink; each leaf shows a component number (1 to 3) and a value between 0.47 and 0.85]
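The tree on this slide splits on features such as hour and weekday, and each leaf appears to show a mixture component together with a proportion. As a rough sketch of the underlying technique, the code below picks the threshold on one numeric feature that minimises weighted Gini impurity and summarises a leaf by its majority class. The toy rows and the `component` label are assumptions made for illustration; this is not the model from the slide.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the list is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_threshold(rows, feature, label="component"):
    """Return (threshold, weighted_gini) for the best split `row[feature] >= threshold`."""
    best = (None, float("inf"))
    values = sorted({row[feature] for row in rows})
    for threshold in values[1:]:
        left = [row[label] for row in rows if row[feature] < threshold]
        right = [row[label] for row in rows if row[feature] >= threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
        if score < best[1]:
            best = (threshold, score)
    return best


def leaf_summary(labels):
    """Majority class in a leaf and the share of observations it covers."""
    label, count = Counter(labels).most_common(1)[0]
    return label, count / len(labels)


# Toy data: component 1 dominates before 14:00, component 2 afterwards.
rows = [{"hour": h, "component": 1 if h < 14 else 2} for h in range(8, 20)]
threshold, impurity = best_threshold(rows, "hour")
```

A full tree would apply the same search recursively to each side of the split and across all features.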
Sneak Peek at our TfL Data Demo
•  Used the freely accessible TfL data for a demo
•  Shows the number of active disruptions over different days in London
–  Rush hour effects visible
–  Nights are quieter, but there are more disruptions on weekend nights
Kaiser Permanente Hackathon
Insight
Patient Application
Text Analytics for Churn Prediction
Customer
A major telecom company
Business Problem
Reducing churn through more accurate
models
Challenges
•  Existing models only used structured features
•  Call center memos had poor structure and lots of typos
Solution
•  Built sentiment analysis models to predict churn and topic models to understand topics of conversation in call center memos
•  Achieved a 16% improvement in the ROC curve for churn prediction
Predicting Commodity Futures through Twitter
Customer
A major agri-business cooperative
Business Problem
Predict price of commodity futures through
Twitter
Challenges
•  Language on Twitter does not adhere to rules of grammar and has poor structure
•  No domain-specific labeled corpus of tweet sentiment, so the problem is semi-supervised
Solution
•  Built Sentiment Analysis and Text Regression algorithms to predict commodity futures from Tweets
•  Established the foundation for blending the structured data (market fundamentals) with unstructured data (tweets)
Network Intrusion Detection
Customer
One of the world's largest health care providers
Business Problem
Detect advanced cyber threats in a large
heterogeneous environment and reduce
malware ‘free-time’
Challenges
•  Covert threats employ advanced techniques to bypass traditional security appliances.
•  Last year, the median time before detection on a compromised network was 416 days. (Source: Mandiant)
Solution
•  Built a new behavioral intrusion detection framework based on machine learning, graph theory and security research
•  Designed operational components of their next generation SIEM.
Firewall
•  Engineered a full-featured, custom, social-graph-based intrusion model
•  Identified breaches that none of the security products they owned was able to detect
Website Classification
Customer
An internet domain name service provider
Business Problem
Create a multilevel website classification that
groups websites by function rather than topic
Challenges
•  Complex unstructured data format required several transformations
•  Model needed to be language independent, so classic language features could not be used
Solution
•  New hierarchical model resulted in reducing the number of previously ‘unclassified’ websites by ~75%
•  Created an in-database analytics framework for unsupervised learning models and enabled real-time validation of current production model
Map of Domains
Pivotal’s Platform
Uniquely positioned to help
enterprises modernize each
facet of this cycle today
Comprehensive portfolio of
products spanning Apps, Data
& Analytics
Converging these technologies
into a coherent, next-gen
Enterprise PaaS platform
BUILT FOR THE SPEED OF BUSINESS
Infrastructure for Data
Driven Workloads
PLATFORM
AS A SERVICE
VIRTUAL
WORKSPACE
BUSINESS
DATA LAKE
SECURITY
ANALYTICS
SOFTWARE
DEFINED
DATA CENTER
SERVICE PROVIDER
ENTERPRISE DATA CENTER
A Unique Federation Of Companies
Delivering The Software-Defined Enterprise. Solutions & Choice.
Partners
BIG DATA SOLUTIONS
PLATFORM AS A SERVICE
AGILE APPLICATION DEVELOPMENT
SOFTWARE-DEFINED DATA CENTER
HYBRID-CLOUD, MOBILITY
INFORMATION INFRASTRUCTURE
CONVERGED INFRASTRUCTURE
ADVANCED SECURITY
Converged
Infrastructures
Partners
vCloud
Hybrid Service
Hybrid Cloud
Managed As One Cloud
Federation Solutions
5 Solutions Enabling The Software-Defined Enterprise
Next Gen Cloud Apps
PLATFORM
AS A SERVICE
SOFTWARE-DEFINED
DATA CENTER
VIRTUAL
WORKSPACE
BUSINESS
DATA LAKE
SECURITY
ANALYTICS
SOFTWARE-DEFINED
DATA CENTER
VIRTUAL
WORKSPACE
PLATFORM
AS A SERVICE
SECURITY
ANALYTICS
BUSINESS
DATA LAKE
Hadoop Overview
Hadoop
is an open-source framework from Apache that allows for parallel batch processing of very large data sets
MapReduce
is the Hadoop programming model that divides a job into map and reduce tasks so that multiple nodes can process it in parallel
HDFS
is the Hadoop file system for the data. It provides data protection and locality by keeping multiple replicas of each block (usually 3)
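To make the MapReduce idea concrete, here is a minimal in-process word count written in the map, shuffle and reduce style. It is only a sketch of the programming model in plain Python; it does not use the Hadoop API, and the function names are assumptions chosen for illustration.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one chunk of input."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(mapped_chunks):
    """Shuffle: group all emitted values by key across the mapped chunks."""
    grouped = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: combine the values of each key, here by summing the counts."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(chunks):
    # On a Hadoop cluster each chunk would be mapped on a different node,
    # ideally the node that already stores that block of the file.
    mapped = [map_phase(chunk) for chunk in chunks]
    return reduce_phase(shuffle(mapped))


counts = word_count(["big data big", "data lake"])
```

Because each chunk is mapped independently, the map step can run in parallel; only the shuffle needs to move data between workers.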
Isilon Scale-Out NAS Architecture
OneFS Operating
Environment
Intra-cluster
Communication Layer
Client/Application Layer Ethernet Layer
SingleFS/Volume
CIFS, NFS
FTP, HTTP
HDFS for
Hadoop
REST for
Object
Gig-e
10 Gig-e
Network
Protocols
Traditional “Share-Nothing” Hadoop
Existing Virtualized Data Center SHARE-NOTHING Hadoop Infrastructure
Unstructured Data
1
Existing Primary Storage
2 3 4 2 3 4 2 3 4 2 3 4
•  Hadoop replication count
(R=3) means 4 data copies
•  Data has to copy to the
Hadoop cluster before analysis
can begin (Time to Results)
How will you maintain data
consistency when a file changes
on your primary storage?
Existing Virtualized Data Center
Existing Primary Storage
Isilon “Share-Everything” Hadoop
1
•  Start using Hadoop NOW with unused processing and RAM available in your VMware environment
•  No replication required (Use your existing data)
•  Access to same data via NAS and HDFS protocols
•  Time to results extremely fast using already existing data with NO COPIES or wasted $
Analysis Can
Begin with the
1st VM
New Hadoop Compute Nodes
Unstructured Data
Use Native HDFS Protocol
Ethernet
Job Tracker Task Tracker DataNode 2nd NameNode
NameNode
Hadoop Architecture - Traditional
R (RHIPE) Mahout Hive HBase PIG
NameNode
Data Node + Compute Node
Data Node + Compute Node
Data Node + Compute Node
Data Node + Compute Node
Data Node + Compute Node
Data Node + Compute Node
Ethernet
R (RHIPE)
PIG
Mahout Hive HBase
Job Tracker Task Tracker DataNode
Compute Node Compute Node Compute Node
Compute Node Compute Node Compute Node
NameNode
Hadoop Architecture with Isilon
name
node
name
node
name
node
name
node
datanode
HDFS
SMB, NFS,
HTTP, FTP,
HDFS
Node
reply
Node
reply
Node
reply
Node
reply
NameNode
Data
Support for Multiple Hadoop Distributions
name
node
name
node
name
node
name
node
datanode
NFS
SMB
SMB
NFS
MAP Reduce
Dependent Scaling
Traditional Hadoop HDFS
•  Storage to Compute ratio is fixed
•  Scaling compute means scaling capacity
•  Difficult to provide QoS
•  Compute upgrade is a forklift
Isilon HDFS
•  Scale compute independent of storage
•  Achieve optimal performance balance even as workloads evolve
•  No data migrations, ever!
•  Add new performance as hardware evolves
Compute
Storage
Required
performance/
capacity
Required Hadoop
Cluster Nodes
Independent Scaling
Traditional Hadoop HDFS
•  Storage to Compute ratio is fixed
•  Scaling compute means scaling capacity
•  Difficult to provide QoS
•  Compute upgrade is a forklift
Isilon HDFS
•  Scale compute independent of storage
•  Achieve optimal performance balance even as workloads evolve
•  No data migrations, ever!
•  Add new performance as hardware evolves
Compute
Storage
Required
performance/
capacity
Required Hadoop
Cluster Nodes
Snapshot & Version Control
Before
•  Traditional HDFS does not have replication between sites
•  No Snapshotting of data
•  Loss of version control
•  Not designed for Mission Critical
After
•  Full SnapshotIQ integration identifies changes
•  Multi-threaded, Multi-Node Scale-Out replication
•  Improved RPO/RTO for business continuity
•  Geo-replicated Hadoop!
Data Center Network
Time-to-Results
Data Copy Analysis In-Place Analysis
Existing Primary Storage
Hadoop on a Stick
Have you ever
copied 100TB from
Primary Storage to
a Hadoop system?
How long does it
take to copy
100TB from one
place to another
over a 10Gb link?
>24 Hours
Data Center Network
Existing Primary Storage
Hadoop Compute Nodes
Reading relevant data for analysis
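The estimate of more than 24 hours for copying 100TB over a 10Gb link can be checked with simple arithmetic. The sketch below assumes decimal units (1 TB = 10^12 bytes, 10 Gb/s = 10^10 bits per second) and a hypothetical `efficiency` factor for protocol and storage overhead; neither assumption is stated on the slide.

```python
def copy_time_hours(data_terabytes, link_gigabits_per_s, efficiency=1.0):
    """Hours needed to move the data over the link at the given fraction of line rate."""
    bits = data_terabytes * 1e12 * 8            # terabytes to bits
    usable_bits_per_s = link_gigabits_per_s * 1e9 * efficiency
    return bits / usable_bits_per_s / 3600


ideal = copy_time_hours(100, 10)                # full line rate, no overhead
realistic = copy_time_hours(100, 10, efficiency=0.9)
print(f"ideal: {ideal:.1f} h, at 90% efficiency: {realistic:.1f} h")
```

At full line rate the copy takes a little over 22 hours, so any realistic overhead pushes it past the 24-hour mark quoted on the slide.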
A real world example
Cost Comparison
Customer requirements
•  640 TB raw capacity
•  64 Compute (1 per 10TB)
DAS Option
•  14.8% usable capacity/DataNode
•  38 racks of servers
Isilon Option
•  10 Racks (including Compute)
•  65% less expensive than DAS
Hadoop on Isilon is often significantly less costly!
Network
Hadoop
Licensing
Management
Config
Installation
Energy
Isilon
Servers
$ 0
$ 1M
$ 3M
$ 4M
$ 5M
$ 6M
$ 2M
Efficiency and flexibility
The Isilon Advantage for Hadoop
•  No data ingest necessary
•  Eliminate 3x mirroring
•  Over 80% storage utilization
•  SmartDedupe to further reduce storage needs by up to 30%
•  Scale compute and data independently
•  Multi-protocol access
•  Simultaneous multi-distribution support
•  Ability to leverage VMware vSphere Big Data Extensions to reduce datacenter footprint, power, space, and cooling
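The difference between 3x mirroring and 80% storage utilization is easiest to see as raw capacity needed per unit of usable data. A small sketch, assuming that utilization means usable capacity divided by raw capacity and using a hypothetical 100 TB data set:

```python
def mirrored_utilization(copies=3):
    """Fraction of raw capacity that holds unique data when every block is stored `copies` times."""
    return 1.0 / copies


def raw_capacity_needed(usable_tb, utilization):
    """Raw capacity (TB) required to store `usable_tb` of data at the given utilization."""
    return usable_tb / utilization


usable_tb = 100  # hypothetical data set size
raw_mirrored = raw_capacity_needed(usable_tb, mirrored_utilization(3))
raw_at_80_percent = raw_capacity_needed(usable_tb, 0.80)
print(f"3x mirroring: {raw_mirrored:.0f} TB raw, 80% utilization: {raw_at_80_percent:.0f} TB raw")
```

Under these assumptions the mirrored layout needs 300 TB of raw disk for 100 TB of data, against 125 TB at 80% utilization, before any deduplication savings.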
Data protection and security
The Isilon Advantage for Hadoop
•  Highly resilient architecture
•  Robust data protection options (DR, snapshots, etc.)
•  Eliminate NameNode single point of failure
•  SEC 17a-4 compliant WORM
•  Kerberos authentication
•  Hadoop multi-tenancy
How Do I Start Using Hadoop?
EMC Hadoop Starter Kit (HSK)
•  Visit https://community.emc.com/docs/DOC-26892
•  Watch the demo video
•  Follow the instructions to deploy Hadoop to your existing Isilon and VMware infrastructure in about an hour
•  There are customized HSKs for Apache, Pivotal, Cloudera, and Hortonworks
Becoming a data driven organization

Slides: Case Study — How J.B. Hunt is Driving Efficiency with AI and Real-Tim...Slides: Case Study — How J.B. Hunt is Driving Efficiency with AI and Real-Tim...
Slides: Case Study — How J.B. Hunt is Driving Efficiency with AI and Real-Tim...
 
Neil Sholay - Data Driven Business - #OracleCloudDay London
Neil Sholay - Data Driven Business - #OracleCloudDay LondonNeil Sholay - Data Driven Business - #OracleCloudDay London
Neil Sholay - Data Driven Business - #OracleCloudDay London
 
Data Catalog as the Platform for Data Intelligence
Data Catalog as the Platform for Data IntelligenceData Catalog as the Platform for Data Intelligence
Data Catalog as the Platform for Data Intelligence
 
Chief Data Officer: Evolution to the Chief Analytics Officer and Data Science
Chief Data Officer: Evolution to the Chief Analytics Officer and Data ScienceChief Data Officer: Evolution to the Chief Analytics Officer and Data Science
Chief Data Officer: Evolution to the Chief Analytics Officer and Data Science
 
Accelerate Your Move to the Cloud with Data Catalogs and Governance
Accelerate Your Move to the Cloud with Data Catalogs and GovernanceAccelerate Your Move to the Cloud with Data Catalogs and Governance
Accelerate Your Move to the Cloud with Data Catalogs and Governance
 
RWDG Slides: Operationalize Data Governance for Business Outcomes
RWDG Slides: Operationalize Data Governance for Business OutcomesRWDG Slides: Operationalize Data Governance for Business Outcomes
RWDG Slides: Operationalize Data Governance for Business Outcomes
 
Reinventing the Modern Information Pipeline: Paxata and MapR
Reinventing the Modern Information Pipeline: Paxata and MapRReinventing the Modern Information Pipeline: Paxata and MapR
Reinventing the Modern Information Pipeline: Paxata and MapR
 
Presumption of Abundance: Architecting the Future of Success
Presumption of Abundance: Architecting the Future of SuccessPresumption of Abundance: Architecting the Future of Success
Presumption of Abundance: Architecting the Future of Success
 
Modern Data Integration Expert Session Webinar
Modern Data Integration Expert Session Webinar Modern Data Integration Expert Session Webinar
Modern Data Integration Expert Session Webinar
 
Self-Service Analytics
Self-Service AnalyticsSelf-Service Analytics
Self-Service Analytics
 
Using Machine Learning to Understand and Predict Marketing ROI
Using Machine Learning to Understand and Predict Marketing ROIUsing Machine Learning to Understand and Predict Marketing ROI
Using Machine Learning to Understand and Predict Marketing ROI
 
Keys to Formulating an Effective Data Management Strategy in the Age of Data
Keys to Formulating an Effective Data Management Strategy in the Age of DataKeys to Formulating an Effective Data Management Strategy in the Age of Data
Keys to Formulating an Effective Data Management Strategy in the Age of Data
 
Article Evaluation 4
Article Evaluation 4Article Evaluation 4
Article Evaluation 4
 
Chief Data Officer: DataOps - Transformation of the Business Data Environment
Chief Data Officer: DataOps - Transformation of the Business Data EnvironmentChief Data Officer: DataOps - Transformation of the Business Data Environment
Chief Data Officer: DataOps - Transformation of the Business Data Environment
 

Similar a Becoming a data driven organization

Keynote: Software Kept Eating the World (Pivotal Cloud Platform Roadshow)
Keynote: Software Kept Eating the World (Pivotal Cloud Platform Roadshow)Keynote: Software Kept Eating the World (Pivotal Cloud Platform Roadshow)
Keynote: Software Kept Eating the World (Pivotal Cloud Platform Roadshow)VMware Tanzu
 
Connecta Event: Big Query och dataanalys med Google Cloud Platform
Connecta Event: Big Query och dataanalys med Google Cloud PlatformConnecta Event: Big Query och dataanalys med Google Cloud Platform
Connecta Event: Big Query och dataanalys med Google Cloud PlatformConnectaDigital
 
Why Infrastructure matters?!
Why Infrastructure matters?!Why Infrastructure matters?!
Why Infrastructure matters?!Gabi Bauer
 
Make from your it department a competitive differentiator for your business
Make from your it department a competitive differentiator for your businessMake from your it department a competitive differentiator for your business
Make from your it department a competitive differentiator for your businessMarcos Quezada
 
Why Infrastructure Matters for Big Data & Analytics
Why Infrastructure Matters for Big Data & AnalyticsWhy Infrastructure Matters for Big Data & Analytics
Why Infrastructure Matters for Big Data & AnalyticsRick Perret
 
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...Hortonworks
 
Building Confidence in Big Data - IBM Smarter Business 2013
Building Confidence in Big Data - IBM Smarter Business 2013 Building Confidence in Big Data - IBM Smarter Business 2013
Building Confidence in Big Data - IBM Smarter Business 2013 IBM Sverige
 
DevOps for Enterprise Systems : Innovate like a Startup
DevOps for Enterprise Systems : Innovate like a StartupDevOps for Enterprise Systems : Innovate like a Startup
DevOps for Enterprise Systems : Innovate like a StartupDevOps for Enterprise Systems
 
CMOfinalpresentation.ppt
CMOfinalpresentation.pptCMOfinalpresentation.ppt
CMOfinalpresentation.pptMr Garg
 
¿Cómo las manufacturas están evolucionando hacia la Industria 4.0 con la virt...
¿Cómo las manufacturas están evolucionando hacia la Industria 4.0 con la virt...¿Cómo las manufacturas están evolucionando hacia la Industria 4.0 con la virt...
¿Cómo las manufacturas están evolucionando hacia la Industria 4.0 con la virt...Denodo
 
IBM Private Cloud Platform - Setting Foundation for Hybrid (JUKE, 2015)
IBM Private Cloud Platform - Setting Foundation for Hybrid (JUKE, 2015)IBM Private Cloud Platform - Setting Foundation for Hybrid (JUKE, 2015)
IBM Private Cloud Platform - Setting Foundation for Hybrid (JUKE, 2015)Denny Muktar
 
MT01 The business imperatives driving cloud adoption
MT01 The business imperatives driving cloud adoptionMT01 The business imperatives driving cloud adoption
MT01 The business imperatives driving cloud adoptionDell EMC World
 
Ibm symp14 referent_christian klezl_cloud
Ibm symp14 referent_christian klezl_cloudIbm symp14 referent_christian klezl_cloud
Ibm symp14 referent_christian klezl_cloudIBM Switzerland
 
Cloud,beyond the hype, looking at the journey to Cloud
Cloud,beyond the hype, looking at the journey to CloudCloud,beyond the hype, looking at the journey to Cloud
Cloud,beyond the hype, looking at the journey to CloudChristian Verstraete
 
Big Data Management: A Unified Approach to Drive Business Results
Big Data Management: A Unified Approach to Drive Business ResultsBig Data Management: A Unified Approach to Drive Business Results
Big Data Management: A Unified Approach to Drive Business ResultsCA Technologies
 
Cloud what is the best model for vietnam
Cloud   what is the best model for vietnamCloud   what is the best model for vietnam
Cloud what is the best model for vietnamPhuc (Peter) Huynh
 
Drive DBMS Transformation with EDB Postgres
Drive DBMS Transformation with EDB PostgresDrive DBMS Transformation with EDB Postgres
Drive DBMS Transformation with EDB PostgresEDB
 
Customer Presentation - IBM Cloud Pak for Data Overview (Level 100).PPTX
Customer Presentation - IBM Cloud Pak for Data Overview (Level 100).PPTXCustomer Presentation - IBM Cloud Pak for Data Overview (Level 100).PPTX
Customer Presentation - IBM Cloud Pak for Data Overview (Level 100).PPTXtsigitnist02
 

Similar a Becoming a data driven organization (20)

Keynote: Software Kept Eating the World (Pivotal Cloud Platform Roadshow)
Keynote: Software Kept Eating the World (Pivotal Cloud Platform Roadshow)Keynote: Software Kept Eating the World (Pivotal Cloud Platform Roadshow)
Keynote: Software Kept Eating the World (Pivotal Cloud Platform Roadshow)
 
Connecta Event: Big Query och dataanalys med Google Cloud Platform
Connecta Event: Big Query och dataanalys med Google Cloud PlatformConnecta Event: Big Query och dataanalys med Google Cloud Platform
Connecta Event: Big Query och dataanalys med Google Cloud Platform
 
Why Infrastructure matters?!
Why Infrastructure matters?!Why Infrastructure matters?!
Why Infrastructure matters?!
 
Make from your it department a competitive differentiator for your business
Make from your it department a competitive differentiator for your businessMake from your it department a competitive differentiator for your business
Make from your it department a competitive differentiator for your business
 
Azure Biz
Azure BizAzure Biz
Azure Biz
 
Why Infrastructure Matters for Big Data & Analytics
Why Infrastructure Matters for Big Data & AnalyticsWhy Infrastructure Matters for Big Data & Analytics
Why Infrastructure Matters for Big Data & Analytics
 
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
 
Building Confidence in Big Data - IBM Smarter Business 2013
Building Confidence in Big Data - IBM Smarter Business 2013 Building Confidence in Big Data - IBM Smarter Business 2013
Building Confidence in Big Data - IBM Smarter Business 2013
 
Cloud the current future v6
Cloud   the current future v6Cloud   the current future v6
Cloud the current future v6
 
DevOps for Enterprise Systems : Innovate like a Startup
DevOps for Enterprise Systems : Innovate like a StartupDevOps for Enterprise Systems : Innovate like a Startup
DevOps for Enterprise Systems : Innovate like a Startup
 
CMOfinalpresentation.ppt
CMOfinalpresentation.pptCMOfinalpresentation.ppt
CMOfinalpresentation.ppt
 
¿Cómo las manufacturas están evolucionando hacia la Industria 4.0 con la virt...
¿Cómo las manufacturas están evolucionando hacia la Industria 4.0 con la virt...¿Cómo las manufacturas están evolucionando hacia la Industria 4.0 con la virt...
¿Cómo las manufacturas están evolucionando hacia la Industria 4.0 con la virt...
 
IBM Private Cloud Platform - Setting Foundation for Hybrid (JUKE, 2015)
IBM Private Cloud Platform - Setting Foundation for Hybrid (JUKE, 2015)IBM Private Cloud Platform - Setting Foundation for Hybrid (JUKE, 2015)
IBM Private Cloud Platform - Setting Foundation for Hybrid (JUKE, 2015)
 
MT01 The business imperatives driving cloud adoption
MT01 The business imperatives driving cloud adoptionMT01 The business imperatives driving cloud adoption
MT01 The business imperatives driving cloud adoption
 
Ibm symp14 referent_christian klezl_cloud
Ibm symp14 referent_christian klezl_cloudIbm symp14 referent_christian klezl_cloud
Ibm symp14 referent_christian klezl_cloud
 
Cloud,beyond the hype, looking at the journey to Cloud
Cloud,beyond the hype, looking at the journey to CloudCloud,beyond the hype, looking at the journey to Cloud
Cloud,beyond the hype, looking at the journey to Cloud
 
Big Data Management: A Unified Approach to Drive Business Results
Big Data Management: A Unified Approach to Drive Business ResultsBig Data Management: A Unified Approach to Drive Business Results
Big Data Management: A Unified Approach to Drive Business Results
 
Cloud what is the best model for vietnam
Cloud   what is the best model for vietnamCloud   what is the best model for vietnam
Cloud what is the best model for vietnam
 
Drive DBMS Transformation with EDB Postgres
Drive DBMS Transformation with EDB PostgresDrive DBMS Transformation with EDB Postgres
Drive DBMS Transformation with EDB Postgres
 
Customer Presentation - IBM Cloud Pak for Data Overview (Level 100).PPTX
Customer Presentation - IBM Cloud Pak for Data Overview (Level 100).PPTXCustomer Presentation - IBM Cloud Pak for Data Overview (Level 100).PPTX
Customer Presentation - IBM Cloud Pak for Data Overview (Level 100).PPTX
 

Último

Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfSeasiaInfotech2
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 

Último (20)

Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdf
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 

Becoming a data driven organization

  • 1.
  • 2. © Copyright 2014 EMC Corporation. All rights reserved.
  • 3. New IT Agenda: Provide Access To All Applications & Data Through Mobile Devices. Use Adaptive, Data-Driven Security To Rapidly Respond To Emerging Threats. Build A Data Lake To Deliver Insights And Applications On All Data. Move To A Software-Defined Data Center Infrastructure And Expand It To A Hybrid Cloud. Use Agile Development To Build New Customer-Centric Applications. Today’s Business Challenges: Balance Risk Cut Operational Costs & Legacy More Than Ever React Faster To Find New Growth
  • 4. Our Strategy: Build A Differentiated Stack. Best Of Breed. Architected Horizontally, Not Vertically. Choice. BIG DATA SOLUTIONS; PLATFORM AS A SERVICE; AGILE APPLICATION DEVELOPMENT; INFORMATION INFRASTRUCTURE; CONVERGED INFRASTRUCTURE; SOFTWARE-DEFINED DATA CENTER; HYBRID-CLOUD, MOBILITY; ADVANCED SECURITY
  • 5.
  • 6. Many Industries Face Structural Change
  • 7. Volvo Cars – Big Data app & service Deliveries to your car – Roam Delivery
  • 8. The EMC Exabyte Journey
  • 9. 20,772 HARD DRIVES, 300 PALLETS, 8 TRUCKS: 85 PETABYTES SOLD INTO ONE WEB-SCALE PROVIDER IN ONE ORDER
  • 10. Chart axes: Capacity, Performance; Low Service Level, High Service Level. Tiers: (1) Performance “Good Enough”, Capacity Optimized ($/GB), Data Loss Not A Disaster; (2) Consistently Good Performance, Eventual Consistency Of Data, Data Loss Not A Disaster; (3) Performance “Good Enough”, Capacity Optimized ($/GB), Data Loss A Disaster; (4) Consistently Good Performance, Consistent Data, Data Loss A Disaster; (5) Great Performance, Consistent Data, Data Loss A Disaster
  • 11.
  • 12.
  • 13. The Core is Super-Scaling
  • 14. The Edge is Hyper-Extending
  • 15.
  • 17. TODAY’S DATA CENTER: TRADITIONAL APPLICATIONS. SOFTWARE-DEFINED DATA CENTER: NEXT GEN CLOUD APPLICATIONS. TRADITIONAL APPLICATION GROWTH: 91M (2013) to 120M (2016). NEXT GEN CLOUD APPLICATION GROWTH: 11M (2013) to 34M (2016)
  • 18. THE 3RD PLATFORM REDEFINES EVERYTHING
  • 19.
  • 20. BUILT FOR THE SPEED OF BUSINESS
  • 21. Pivotal Confidential–Internal Use Only. Data Driven Application Development
  • 22. Pivotal At-a-Glance • New Independent Venture: Spun out & jointly owned by EMC & VMware • Deep Execution Talent: 1700 employees • Proven Leadership: Paul Maritz, CEO • Global Customer Validation: +1000 Tier-1 Enterprise Customers • Strategic Backing: $100M investment by GE • Bold Vision: New platform for a new era, focused on the intersection of apps, big data and analytics
  • 23. Need for Speed • Enterprises are being driven to compete, innovate & execute faster than ever before: – Global reach and emerging markets – Ever-increasing customer expectations – Legacy environment & cost pressures • At a time when we’re witnessing the most disruptive platform shifts and advances in technology in over 30 years • Every business is quickly becoming a software business • Software is how companies engage with customers, powered by new data insights and new Social, Cloud, Big Data, Mobile
  • 24. What Matters: Apps. Data. Analytics. Apps power businesses, and those apps generate data. Analytic insights from that data drive new app functionality, which in turn drives new data. The faster you can move around that cycle, the faster you learn, innovate & pull away from the competition.
  • 25. “Software is Eating the World”
  • 26. Software is Changing Industries: $3.5B valuation, Financial Services; $3.5B valuation, Travel & Hospitality; $3.5B valuation, Transportation; $3.2B Acquisition by Google, Home Automation; $20B valuation, Entertainment; $1.1B acquisition, Monsanto, Agriculture
  • 27. Enterprises Must Become Great at Software. We need to innovate or we die. Pivotal and Cloud Foundry is our big bet to leapfrog the competition. With Cloud Foundry, we built what looks like a software company… Moving from silos to a single platform. Cloud Foundry's potential to transform business is vast… companies leveraging software will outperform their peers.
  • 28. Francisco Gonzalez, CEO at BBVA: “Banks need to take on Amazon and Google or die. The shift to digital requires a complete overhaul of banks technology…it is a matter of survival.”
  • 29. Rapid Execution Requires a New Approach • Agile teams and rapid iteration • Continuous delivery without downtime • Horizontal scalability (data and app) • Standardized service binding and discovery • First-class mobile support • Deep user analytics. Cycle: Development, Delivery, Operation, Iteration
  • 30. Jonathan Rosenberg, CTO & VP, Collaboration: “PaaS is the operating system for the cloud. As the set of APIs and services for PaaS's grow, the choice of PaaS becomes more crucial as the costs of porting go up. This is one of the benefits of open source PaaS offerings like Cloud Foundry.”
  • 31. Is Your Enterprise Ready?
  • 32. Fail, Learn, Adapt, Repeat. 6 Months to 6 Weeks. $1.1M Saving per App
  • 33. Incredible Cloud Foundry Ecosystem
  • 34. The Cloud Foundry Foundation: more to come… “This is a significant announcement for PaaS in general and for Cloud Foundry in particular. It potentially signals a consolidation that is going to become apparent…predicts that Red Hat will shutter OpenShift and throw its hat in with Cloud Foundry within the year.” - Forbes, Ben Kepes, 2/24/14
  • 35. Pivotal Approach: Your Platform for Building Great Software. Elastic Runtime: Java, Spring, Ruby, Node.JS; Built-in “Middleware” Services. Operation Manager: Installation, Management, Monitoring, Upgrades/Updates, etc. Other diagram labels: Pivotal One; Pivotal One Co-Innovation; Agile Software Development; Data Lake Solutions; Pivotal MySQL; Pivotal One Services
  • 36. Pivotal Approach: An Application Centric World. Diagram labels: Infrastructure Specific; JVM; VM; Pre-Provisioned Pool of VMs; Container 1 (App Server, JVM, etc.); Container 2 (App Server, JVM, etc.); App1; Common Access Tier (App1, App2); App Server Configurations; Built-in Middleware Services; JVM; VM; App2; App Server Configurations; IaaS Agnostic
  • 37. GE Capital Builds Foundation For Value Add Insights. “Critical Insights and data is deleted because it’s too expensive to store.” “We need the ability to blend data fabric, build analytics, and create applications on top of this.” “Access any internal or external data of interest through a familiar interface.” “Now we analyze Social Media to predict trends, and help dealers make decisions.”
  • 38. Use-Case: Data and PaaS Drives Business Agility. Diagram labels: Pivotal CF; Operation Manager; Any Infrastructure (Private/Public); Big/Fast Data; Deploy/Update; Real-time change to customer-facing application based on data analysis
  • 39. Spring becomes the enabler. Big, Fast, Flexible Data: Spring Data • JPA/JDBC • MongoDB • Redis • Neo4j • GemFire • Data REST. Data Processing, Integration: • Spring Hadoop • Spring Integration • Spring Batch. Deploy to Cloud or on premise: • CloudFoundry • vCloud Suite • Google App Engine • Amazon Elastic Beanstalk • CloudBees
  • 40. Data Driven: Harder Than it Sounds. Three tiers, each labeled Operationalize, Ingest, Distill, Interface, Process, Analytical, Transactional. Real Time: Predictive call routing, fraud prediction, dynamic pricing, re-marketing, stream analytics. Near Real Time: Analytic model designs, transaction analysis, trend analysis. Batch: ETL, archive, trending, monthly and weekly jobs
  • 41. 41Pivotal Confidential–Internal Use Only Data Driven: Impossible in Silos Finance Manufacturing Marketing IT Data Growth Over 60% Floods These Silos
  • 42. One Platform, Multiple Use Cases Flexibility to expand the ingestion of network data as the underlying infrastructure changes – easy expansion into 4G LTE. External data sources •  LinkedIn •  Twitter •  Facebook •  Weather •  … Internal data sources •  CRM •  EDW •  Customer Portals •  … Mobile Network Infrastructure Intelligently set triggers at the edge of the network that look for 'interesting' events that require instant action Build New Apps Integrate Existing Apps Agile approach to enable apps to access in real-time All History Current Status Predicted Intelligence Single Unified Platform •  Capture everything •  Real-time data processing •  Single version of the truth •  End-to-end visibility across different data sources •  Scalable and cost effective
  • 43. RTI In Telco Ecosystem RTI-T Integration Sanity Check Filter Enrich Transform Apply Logic OSS BSS Network Elements Application Hosting Environment for Real-time Use-case Realization Network Optimization Customer Experience Management .... Ingest Distribute Process Route Persist Long Term Storage Operational Datastore Analytics
  • 44. Solution Architecture RTI-T Integration Sanity Check Filter Enrich Transform Apply Logic OSS BSS Network Elements Cloud Foundry Network Optimization Customer Experience Management .... Ingest Distribute Process Route Persist Pivotal Data Lake GemFire XD ADS/HAWQ
  • 45. World’s Leading Experts Pivotal Labs – Pivotal Data Labs Pivotal One Pivotal Application Suite BATCH NEAR TIME HAWQ Greenplum DB Pivotal HD REAL TIME GemFire XD GemFire
  • 46. 46Pivotal Confidential–Internal Use Only Pivotal’s Opportunity Uniquely positioned to help enterprises modernize each facet of this cycle today Comprehensive portfolio of products spanning Apps, Data & Analytics Converging these technologies into a coherent, next-gen Enterprise PaaS platform
  • 47. BUILT FOR THE SPEED OF BUSINESS
  • 48. “It is not the strongest of the species that survives, nor the most intelligent that survives. It is the one that is most adaptable to change.” -Charles Darwin BE PREPARED FOR THE SPEED OF BUSINESS
  • 49. BUILT FOR THE SPEED OF BUSINESS
  • 50. •  What is the Pivotal platform? •  Why is it so cool? •  What amazing things have we done with it? BE PREPARED FOR THE SPEED OF BUSINESS
  • 51. Evolving Enterprise Data Architecture Analytic Data Marts MPP Database Operational Intelligence In-Memory DB Run-Time Applications In-Memory Object Enterprise Data Warehouse RDBMS Data Staging Platform Traditional BI/Reporting Data Visualization Data Ingestion System Stream/CEP
  • 52. Analytic Data Marts Operational Intelligence Run-Time Applications Enterprise Data Warehouse Data Staging Platform Traditional BI/Reporting Data Visualization Data Ingestion System Pivotal Data Product Portfolio
  • 53. •  What is the Pivotal platform? •  Why is it so cool? •  What amazing things have we done with it? BE PREPARED FOR THE SPEED OF BUSINESS
  • 54. Why is the Pivotal platform so awesome? Infrastructure Independent Fast & Scalable Schema Free Easy to Use Real Time
  • 55. Big Data & Data Science Decision = Data + Rules “Big Data” Data Science
  • 56. Social Media Commercial & Public Data Dark Data “Big Data” Operational Data
  • 57. © Copyright 2014 Pivotal. All rights reserved. Combining data sources: Example IPSQ (Quality) Owner: TS Production team Test flags from production line 1 year ~300GB APDM Owner: TS Production team Full vehicle history including IPST (technical), IPSL (logistics), IPSQ test flags and all test results. 30 years ~TBs FASTA Owner: Aftersales Dealership electronic tests Identifies early issues with cars >25TB IQS: Initial Quality Survey from JD Power Owner: R&D Survey responses from new owners after 90 days for approx 1700 vehicles Few thousand lines ~MB Social Data Owner: R&D Pulling 500MB per day from Twitter TQP Owner: Supplier management PDFs of parts spec sheets ~ 500GB
  • 58. © Copyright 2014 Pivotal. All rights reserved. Generating value from data: Car configurator example Sales Configurations Customers Sales Configurations Customers Basic Recommendation Engine Sales Configurations Customers Advanced Recommendation Engine Sales Configurations Customers Yield Optimization Engine All car elements: •  Attribute frequencies (colors etc) •  Attribute combination frequencies For instance: •  Browsing history •  Usage patterns •  Demographic insights Ideally: •  Volumes •  Pricing •  By market •  Linkable to configurations
  • 59. © Copyright 2014 Pivotal. All rights reserved. Traditional Systems “Big Data”“Fast Data” The value of data over time Time Value of Data ($) µs ms s hour day month year yr+ Pivotal Data Science Labs
  • 60. © Copyright 2014 Pivotal. All rights reserved. "Pivotal is aiming for the enterprise market that's realizing that software is the biggest differentiator in any industry." — Larry Dignan, ZDNet “The number of companies that have bought into the initiative, the amount of code being contributed, the customer wins that ecosystem members are enjoying suggest that Cloud Foundry is preeminent among all the open source PaaS initiatives." — Ben Kepes, Forbes "If you're in the business of building enterprise software, scrambling to figure out what your company is doing around big data and analytics, mobile and the cloud, then there's a fair chance you'll want to pay attention to Pivotal." — Arik Hesseldahl, WSJD But don’t just take our word for it…
  • 61. •  What is the Pivotal platform? •  Why is it so cool? •  What amazing things have we done with it? BE PREPARED FOR THE SPEED OF BUSINESS
  • 62. What does traffic data look like?
  • 63. © Copyright 2014 Pivotal. All rights reserved. …like this?
  • 64. © Copyright 2014 Pivotal. All rights reserved. …or this? (Note: This is the least offensive topic cluster in our Twitter data!)
  • 65. © Copyright 2014 Pivotal. All rights reserved. Velocity by Time of Day
  • 66. Velocity Distribution [Figure: density of observed velocity (km/h, 0 to 200) for link 1000064869]
  • 67. Gaussian Mixture Model [Figure: velocity density (km/h, 0 to 200) for link 1000064869, showing the combined density and mixture components 1 to 4]
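The mixture idea on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the model fitted in the deck: the slides do not give the fitted parameters for link 1000064869, so the weights, means, and standard deviations below are hypothetical placeholders. The combined density is the weighted sum of the component Gaussian densities, and the "responsibility" of a component is its share of that sum at a given velocity.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def mixture_pdf(x, components):
    """Combined density: sum over components of weight * component density."""
    return sum(w * gaussian_pdf(x, m, s) for w, m, s in components)


def responsibilities(x, components):
    """Posterior probability that an observation x came from each component."""
    total = mixture_pdf(x, components)
    return [w * gaussian_pdf(x, m, s) / total for w, m, s in components]


# Hypothetical (weight, mean km/h, std km/h) values; the weights sum to 1.
COMPONENTS = [
    (0.10, 20.0, 8.0),
    (0.25, 60.0, 12.0),
    (0.45, 95.0, 10.0),
    (0.20, 120.0, 15.0),
]

if __name__ == "__main__":
    for v in (20, 60, 95, 120):
        print(v, round(mixture_pdf(v, COMPONENTS), 5))
```

In practice the parameters would be estimated from the observed velocities (for example with expectation-maximization) rather than written by hand.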
  • 68. Decision Trees Example [Figure: velocity density (0 to 250) with the combined curve and components 1 to 3, next to a decision tree that splits on hour (>= 14, < 20), weekday (1,2,3,4,5 vs. 6,7), and nextlink; each leaf is labelled with a component and a probability]
  • 69. Sneak Peek at our TfL Data Demo •  Used the freely accessible TfL data for a demo •  Shows # of active disruptions over different days in London ◦  Rush hour effects visible ◦  Nights are quieter, but more disruptions on weekend nights
  • 70. © Copyright 2014 Pivotal. All rights reserved. Kaiser Permanente Hackathon Insight Patient Application
  • 71. © Copyright 2014 Pivotal. All rights reserved. Kaiser Permanente Hackathon
  • 72. Text Analytics for Churn Prediction Customer A major telecom company Business Problem Reducing churn through more accurate models Challenges •  Existing models only used structured features •  Call center memos had poor structure and had lots of typos Solution •  Built sentiment analysis models to predict churn and topic models to understand topics of conversation in call center memos •  Achieved a 16% improvement in the ROC curve for churn prediction
  • 73. Predicting Commodity Futures through Twitter Customer A major agri-business cooperative Business Problem Predict price of commodity futures through Twitter Challenges •  Language on Twitter does not adhere to rules of grammar and has poor structure •  No domain-specific labeled corpus of tweet sentiment, so the problem is semi-supervised Solution •  Built Sentiment Analysis and Text Regression algorithms to predict commodity futures from Tweets •  Established the foundation for blending the structured data (market fundamentals) with unstructured data (tweets)
  • 74. Network Intrusion Detection Customer One of the world's largest health care providers Business Problem Detect advanced cyber threats in a large heterogeneous environment and reduce malware ‘free-time’ Challenges •  Covert threats employ advanced techniques to bypass traditional security appliances. •  Last year the median time before detection on a compromised network was 416 days. (Source: Mandiant) Solution •  Built a new behavioral intrusion detection framework based on machine learning, graph theory and security research •  Designed operational components of their next generation SIEM. Firewall •  Engineered a full featured custom social graph based intrusion model •  Identified breaches that not a single security product they owned was able to detect
  • 75. Website Classification Customer An internet domain name service provider Business Problem Create a multilevel website classification that groups websites by function rather than topic Challenges •  Complex unstructured data format required several transformations •  Model needed to be language independent, so classic language features could not be used Solution •  New hierarchical model reduced the number of previously ‘unclassified’ websites by ~75% •  Created an in-database analytics framework for unsupervised learning models and enabled real-time validation of current production model Map of Domains
  • 76. © Copyright 2014 Pivotal. All rights reserved. Pivotal’s Platform Uniquely positioned to help enterprises modernize each facet of this cycle today Comprehensive portfolio of products spanning Apps, Data & Analytics Converging these technologies into a coherent, next-gen Enterprise PaaS platform
  • 77. BUILT FOR THE SPEED OF BUSINESS
  • 79. PLATFORM AS A SERVICE VIRTUAL WORKSPACE BUSINESS DATA LAKE SECURITY ANALYTICS SOFTWARE DEFINED DATA CENTER SERVICE PROVIDER ENTERPRISE DATA CENTER A Unique Federation Of Companies Delivering The Software-Defined Enterprise. Solutions & Choice. Partners BIG DATA SOLUTIONS PLATFORM AS A SERVICE AGILE APPLICATION DEVELOPMENT SOFTWARE-DEFINED DATA CENTER HYBRID-CLOUD, MOBILITY INFORMATION INFRASTRUCTURE CONVERGED INFRASTRUCTURE ADVANCED SECURITY
  • 80. 80© Copyright 2014 EMC Corporation. All rights reserved.© Copyright 2014 EMC Corporation. All rights reserved. Converged Infrastructures Partners vCloud Hybrid Service Hybrid Cloud Managed As One Cloud Federation Solutions 5 Solutions Enabling The Software-Defined Enterprise Next Gen Cloud Apps PLATFORM AS A SERVICE SOFTWARE-DEFINED DATA CENTER VIRTUAL WORKSPACE BUSINESS DATA LAKE SECURITY ANALYTICS SOFTWARE-DEFINED DATA CENTER VIRTUAL WORKSPACE PLATFORM AS A SERVICE SECURITY ANALYTICS BUSINESS DATA LAKE
  • 81. 81© Copyright 2014 EMC Corporation. All rights reserved.© Copyright 2014 EMC Corporation. All rights reserved. Hadoop Overview Hadoop is an open-source framework from Apache that allows for parallel batch processing of very large data sets MapReduce is the Hadoop process that divides the workload so multiple devices can process it HDFS is the file system for the data. It provides data protection and locality with multiple mirrors (usually 3 times)
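The division of work that the slide attributes to MapReduce can be illustrated with a tiny single-process word count. This is a sketch of the programming model only, not Hadoop's API: the function names are made up for illustration, and in a real cluster each chunk would be an HDFS block processed by a separate task on a different node.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one chunk of input."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key so each key is reduced once."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: combine the values of each key, here by summing the counts."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(chunks):
    """Run map over every chunk, then shuffle and reduce the combined output."""
    pairs = []
    for chunk in chunks:  # in Hadoop these map calls would run in parallel
        pairs.extend(map_phase(chunk))
    return reduce_phase(shuffle(pairs))


if __name__ == "__main__":
    print(word_count(["big data", "fast data Data"]))
```

Because each map call only sees its own chunk and each reduce call only sees one key, both phases can be spread over many machines without coordination beyond the shuffle.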
  • 82. Isilon Scale-Out NAS Architecture OneFS Operating Environment Intra-cluster Communication Layer Client/Application Layer Ethernet Layer Single FS/Volume CIFS NFS FTP HTTP HDFS for Hadoop REST for Object GigE 10 GigE Network Protocols
  • 83. 83© Copyright 2014 EMC Corporation. All rights reserved.© Copyright 2014 EMC Corporation. All rights reserved. Traditional “Share-Nothing” Hadoop Existing Virtualized Data Center SHARE-NOTHING Hadoop Infrastructure Unstructured Data 1 Existing Primary Storage 2 3 4 2 3 4 2 3 4 2 3 4 •  Hadoop replication count (R=3) means 4 data copies •  Data has to copy to the Hadoop cluster before analysis can begin (Time to Results) How will you maintain data consistency when a file changes on your primary storage?
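The "R=3 means 4 data copies" point is simple arithmetic: the original on primary storage plus three HDFS replicas. A minimal sketch follows; the 100 TB dataset size in the usage example is a hypothetical figure, not one taken from the slide.

```python
def total_copies(replication, primary_copies=1):
    """Copies of the data: those on primary storage plus the HDFS replicas."""
    return primary_copies + replication


def raw_capacity_tb(dataset_tb, replication, primary_copies=1):
    """Raw capacity consumed across primary storage and the Hadoop cluster."""
    return dataset_tb * total_copies(replication, primary_copies)


if __name__ == "__main__":
    # Hypothetical 100 TB dataset with the default HDFS replication of 3.
    print(total_copies(3), raw_capacity_tb(100, 3))
```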
  • 84. Isilon “Share-Everything” Hadoop Existing Virtualized Data Center Existing Primary Storage 1 •  Start using Hadoop NOW with unused processing and RAM available in your VMware environment •  No replication required (Use your existing data) •  Access to same data via NAS and HDFS protocols •  Time to results extremely fast using already existing data with NO COPIES or wasted $ Analysis Can Begin with the 1st VM New Hadoop Compute Nodes Unstructured Data Use Native HDFS Protocol
  • 85. Hadoop Architecture - Traditional Ethernet Job Tracker Task Tracker DataNode 2nd NameNode NameNode R (RHIPE) Mahout Hive HBase PIG NameNode Data Node + Compute Node (x6)
  • 86. Hadoop Architecture with Isilon Ethernet R (RHIPE) PIG Mahout Hive HBase Job Tracker Task Tracker DataNode Compute Node (x6) NameNode name node (x4) datanode
  • 87. Support for Multiple Hadoop Distributions [Diagram: groups of MapReduce tasks reach the Isilon cluster over HDFS, alongside SMB, NFS, HTTP, and FTP clients; every Isilon node answers NameNode requests (name node x4) and serves data (datanode)]
  • 88. Dependent Scaling Traditional Hadoop HDFS: •  Storage to Compute ratio is fixed •  Scaling compute means scaling capacity •  Difficult to provide QoS •  Compute upgrade is a forklift Isilon HDFS: •  Scale compute independent of storage •  Achieve optimal performance balance even as workloads evolve •  No data migrations, ever! •  Add new performance as hardware evolves Compute Storage Required performance/capacity Required Hadoop Cluster Nodes
  • 89. Independent Scaling Traditional Hadoop HDFS: •  Storage to Compute ratio is fixed •  Scaling compute means scaling capacity •  Difficult to provide QoS •  Compute upgrade is a forklift Isilon HDFS: •  Scale compute independent of storage •  Achieve optimal performance balance even as workloads evolve •  No data migrations, ever! •  Add new performance as hardware evolves Compute Storage Required performance/capacity Required Hadoop Cluster Nodes
  • 90. Snapshot & Version Control Before: •  Traditional HDFS does not have replication •  No Snapshotting of data •  Loss of version control •  Not designed for Mission Critical After: •  Full SnapshotIQ integration identifies changes •  Multi-threaded, Multi-Node Scale-Out replication •  Improved RPO/RTO for business continuity •  Geo-replicated Hadoop!
  • 91. 91© Copyright 2014 EMC Corporation. All rights reserved.© Copyright 2014 EMC Corporation. All rights reserved. Data Center Network Time-to-Results Data Copy Analysis In-Place Analysis Existing Primary Storage Hadoop on a Stick Have you ever copied 100TB from Primary Storage to a Hadoop system? How long does it take to copy 100TB from one place to another over a 10Gb link? >24 Hours Data Center Network Existing Primary Storage Hadoop Compute Nodes Reading relevant data to analysis
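The ">24 Hours" figure can be checked with back-of-the-envelope arithmetic. The sketch below assumes decimal units (1 TB = 10^12 bytes, 1 Gb/s = 10^9 bits per second); the `efficiency` parameter is a hypothetical knob for protocol and storage overhead, not a value from the slide. At full line rate the copy takes roughly 22 hours, so any realistic overhead pushes it past a day.

```python
def transfer_hours(size_tb, link_gbps, efficiency=1.0):
    """Hours needed to move size_tb terabytes over a link of link_gbps Gb/s.

    efficiency is the fraction of the nominal link rate actually achieved.
    """
    bits = size_tb * 1e12 * 8  # terabytes -> bits
    seconds = bits / (link_gbps * 1e9 * efficiency)  # bits / (bits per second)
    return seconds / 3600


if __name__ == "__main__":
    print(round(transfer_hours(100, 10), 1))       # ideal line rate
    print(round(transfer_hours(100, 10, 0.8), 1))  # assumed 80% efficiency
```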
  • 92. A real world example: Cost Comparison Customer requirements •  640 TB raw capacity •  64 Compute (1 per 10TB) DAS Option •  14.8% usable capacity/DataNode •  38 racks of servers Isilon Option •  10 Racks (including Compute) •  65% less expensive than DAS Hadoop on Isilon is often significantly less costly! [Chart: stacked cost comparison, $0 to $6M, by category: Network, Hadoop Licensing, Management, Config, Installation, Energy, Isilon, Servers]
  • 93. The Isilon Advantage for Hadoop: Efficiency and flexibility •  No data ingest necessary •  Eliminate 3x mirroring •  Over 80% storage utilization •  SmartDedupe to further reduce storage needs by up to 30% •  Scale compute and data independently •  Multi-protocol access •  Simultaneous multi-distribution support •  Ability to leverage VMware vSphere Big Data Extensions to reduce datacenter footprint, power, space, and cooling
  • 94. The Isilon Advantage for Hadoop: Data protection and security •  Highly resilient architecture •  Robust data protection options (DR, snapshots, etc.) •  Eliminate NameNode single point of failure •  SEC 17a-4 compliant WORM •  Kerberos authentication •  Hadoop multi-tenancy
  • 95. How Do I Start Using Hadoop? EMC Hadoop Starter Kit (HSK) •  Visit https://community.emc.com/docs/DOC-26892 •  Watch the demo video •  Follow the instructions to deploy Hadoop to your existing Isilon and VMware infrastructure in about an hour •  There are customized HSKs for Apache, Pivotal, Cloudera, and Hortonworks