Digital transformations require a new hybrid cloud—one that’s open by design, and frees clients to choose and change environments, data and services as needed. This approach allows cloud apps and services to be rapidly composed using the best relevant data and insights available, while maintaining clear visibility, control and security—everywhere. How do you decide where to put data on a hybrid cloud and how to use it? What’s the best hybrid cloud strategy in terms of data and workload? How should you leverage a 50/50 rule or a 80/20 rule and user interaction to evaluate which data/workload to move to the cloud and which data/workload to keep on-premise? Hybrid cloud provides an open platform for innovation, including cognitive computing. Organizations are looking for taking shadow IT out of the shadows by providing a self-service way to the information and a hybrid cloud strategy is allowing that. Also, how to use hybrid cloud for better manage data sovereignty & compliance?
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
Hybrid Cloud Strategy for Big Data and Analytics
1. Hybrid Cloud Strategy for
Big Data & Analytics
Christine Ouyang
Marcio Moura
04/05/2017
2. Agenda
– What is Hybrid Cloud?
– Why Hybrid Cloud?
– The North Star, Hybrid Cloud Big Data & Analytics
– Implementation Considerations for Hybrid Cloud
– Hybrid Cloud Big Data & Analytics Use Cases
4. What is Hybrid Cloud?
The connection of on-premise environments and/or dedicated cloud with public cloud.
LOCAL
PUBLIC
ON-PREMISES
DEDICATED
PRIVATE
VPN
Tunnel
5. The Integrated Digital Enterprise is Hybrid
Data
Engineers
Developer Data
Scientists
Business
Analyst
Data Owner
CDO
Devices
IoT
Communities
APIs/Services
Social & Internet
Advanced
Analytics
Big Data
Digital
Transformation
APIs/Service
s
Real-Time
Analytics
Statistical
Modeling
Information Governance & Security
Self-Service
Data & Analytics
Public Cloud Dedicated Cloud On-Premises
Metadata
Catalog
Public Cloud
Dedicated Cloud
On-Premises /
Local Cloud
Analytics
Statistical
Modeling
Edge
Analytics
Sensors
Sensors
VPN
Tunnel
Private
LOCAL
7. Why Hybrid Cloud?
• Provides the ability for different personas to tap into
data and analytics where it makes the most sense
(localization wise)
• Allows the analytics to run where the data is stored
(data gravity)
• Provides better management of legal and regulatory
requirements in terms of privacy regulation associated
with data sovereignty and regulatory requirements
such as HIPAA, PCI, and SOX
• Provides an agile platform for developing/deploying
new applications
• Allows workload portability and cost optimization
8. Hybrid Cloud is the path forward
Dedicated Cloud & IT
Benefits:
Fully customizable
Robust management
Secure by design
Best of both worlds.
Better outcomes.
Lower total cost of ownership
Enhance operational efficiency
Facilitate innovation
Meet customer expectations more readily
Create new business models
Benefits:
Low entry cost
Pay-per-use
Highly elastic
Hybrid Cloud
Public Cloud
9. A Hybrid Cloud enables multi-speed IT
Composable Environments to rapidly build and deploy new cloud-native and mobile
solutions
Flexibility to move apps to the cloud as-is or build cloud native solutions
Leverage Existing Investments by connecting them to cloud services
Cloud Native
Solutions
Cloud Enabled
Solutions
Multi-speed IT
Partners Apps Access
Info Process Interaction
Core
Enterprise
Digital
Ecosystem
Core
Enterprise
Digital
Ecosystem
Traditional Projects: The
capabilities to capitalize on
your institutional knowledge
New Projects: The speed
and agility to drive
innovation and growth
Multi-Speed IT
10. Hybrid cloud model enables organizations to redefine customer
service
Customer-facing architecture:
Optimized for speed and agility
Transactional architecture:
Optimized for availability and stability
Dedicated Public
On-premises Off-premises
Local
Private
11. Four primary drivers for Hybrid Cloud
Integration
Workload / Resource
Optimization
Portability Compliance
12. X
%
2
70
Banking is projected to account for the largest proportion of
cloud activity across any industry
Market forecast 2019
$620 bn
cloud market
Banking —16%
Telecom —13%
CPG & Retail —13%
Government —8%
Travel & Transportation —8%
Insurance —8%
Industrial Products —7%
Media & Entertainment —5%
Financial Markets —5%
Healthcare & Life Sciences —5%
Energy and Utility —3%
Electronics —3%
Education —2%
Automotive —2%
Chemical & Petroleum —2%
Aerospace & Defense —1%
increase in the workload on cloud by top-
tier banks
of total IT spend on cloud solutions by 2020
14. The Future of Hybrid Cloud for Big Data & Analytics
(The North Star)
• Extend the private environment to cloud
• Exposing the private environment data to the cloud
• Migrating/ modernizing some of the private environment applications to the cloud
• Integrating some of the workloads & data across the private environment and public cloud
• Provide a seamless & consistent experience for all workloads across all environments of
hybrid cloud.
15. The North Star
Data & Analytics
Data Movement
Data Preparation
& Integration
Data Sovereignty
& Compliance
Data Governance
& Security
Workload & Portability
17. How to implement a Hybrid Cloud Strategy?
Cultural Shift
Varying Levels of
Hybrid Sophistication
18. Hybrid Cloud Scenarios
On-
Premises /
Local
Dedicated
Public
SOE SOAu
Public
Dedicated
On-Premises /
Local
On-Premises
/ Local
Dedicated
Public
On-Premises
/ Local
Dedicated
Public
On-Premises /
Local
Data
Dedicated Public
On-Premises /
Local
Data
Dedicated
Data1
Public
Data2
Next Gen Hybrid Workloads Hybrid Infrastructure Scale Out Hybrid Cloud
Brokerage &
ManagementIndependent Workloads
SOE/SOAu – SOI/SOR
Integration
Portability & Optimization Capacity Access Backup & Archive Disaster & Recovery
Choose on-premises / local,
dedicated or public cloud
based on independent data
& workload requirements
Systems of record on
private environment and
systems of engagement &
systems of automation on
public cloud, and system
of insights across all
environments
Application and data are
portable and can go from
on-premises / local to
dedicated or public cloud
for improved optimization
The use of public or
dedicated cloud as
additional resources for on-
premises / local large
private jobs / applications
Leverage off-premises
resources for backup &
archive of on-premises /
local resources
Setup and make available a
parallel on-premises / local
environment using the
public or dedicated cloud
Planned or policy based
management and sourcing
across multiple
environments of the hybrid
cloud
SOI
SOI
Public CloudDedicated Cloud
On-Premises /
Local Cloud
Data
App
Data
App
Private Private
SOR
Private
App
Private
Data
App App
App 1
App 1
Data1
App 2
On-
Premises /
Local
Dedicated
Public
App
Data
Private
Policy
19. Big Data & Analytics on Hybrid Cloud
Today the most important considerations for Data & Analytics on the Hybrid Cloud
strategy are:
Data Gravity Localization Wise Data Topology
20. Big Data & Analytics on Hybrid Cloud
The future of the hybrid cloud strategy for the best-in-class traditional organizations will
be based on exposing/extending private environment to the cloud. On the other hand,
the internet born organizations are using an 80/20 rule by moving 80% of their data to
the cloud with only 20% retained on-premises under the following three categories of
data:
1. Data that the organization wants to share publicly (public cloud) – ex(s):
Software as a Service (SaaS) online office applications churn or fraud detection
2. Data that the organization wants to share across the enterprise (dedicated
cloud) – ex(s): metadata catalog, master customer data, activity data hub, asset
hub, statistical models hub, and content hub
3. Data that is highly sensitive data (on-premises) – ex(s): high-confidential
customer data, revenue data, and intellectual property
22. Analytics In-Motion
Analytics Operating System
Security
Platform
Information Management & Governance
Actionable
Insight
Discovery & Exploration
Ingestion &
Integration
Analytical Data Lake
Data
Access
Data
Sources
Enhanced
Applications
Big Data & Analytics Components
23. Big Data & Analytics Capabilities
Security
Information Management & Governance
Actionable
Insight
Analytics In-Motion
Enhanced
Applications
Discovery & Exploration
Analytics Operating System
Data Sources Ingestion &
Integration
Data
Access
Machine &
Sensor Data
Image
& Video
Enterprise
Content
Social
Data
Weather
Data
Commercial
Data Sets
New
sources
Traditional
sources
Third-Party
Data
Transactional
Data
System of
Record Data
Dataacquisition&applicationaccess
Internet
Data Sets
Application
Data
Batch
Ingestion
Change Data
Capture
Self-Service
Data
Virtualization
Data
Federation
APIs
Visualization
&
Storyboarding
Reporting, Analysis
&
Content Analytics
Decision
Management
Predictive Analytics
&
Modeling
Cognitive
Analytics
Insight
as a
Service
Customer
Experience
New
Business Models
Financial
performance
Risk
Fraud
& Operations
IT
Economics
In-Memory Processing
Simple Programming
Paradigm
Data Lifecycle
Management
Master & Entity
Data
Reference
Data
Data Catalog Data Models Data Quality
Data EncryptionData Masking & Redaction Data Protection Security Intelligence
Data Science Search
Real-Time
ingestion
Document
Interpretation
&
Classification
Consistent
Analytics Engine
Data Quality
Streaming Analytics Complex Event Processing Data Enrichment
Analytical Data Lake
Data Warehouses
&
Data Marts
Landing Zone
Data Archive
Historical Data
Deep Analytics
Repository
Exploratory
Analytics
Repositories
Sand Boxes
PlatformOn-Premise Cloud Hybrid
24. Data Topology on Hybrid Cloud
Mixture:
Raw and Modeled
Legend
RAW
(Years)
Detailed
System
of
Records
Data
Modeled
(Months)Data
Integration
(ELT/ELT)
(Years) (Years) (Years) (Years)
Detailed
Provisioning
Subject Area
Data
Delta
Processing
Detailed
Integrated
Subject Area
Data
Detailed Data
Aggregates
Summary
Data
Aggregates
Dimensional
Data
Landing Area
Self Provisioning Data
Private
Public /
Private
Private /
On-Premises
Data Discovery & ExplorationData Sandbox Areas
Data Prediction & Statistical Modeling
More Refined Data
User Guided,
Advanced
Analytics &
Visualization
Business
Analysts
Data
Scientists
Analytical or
Predictive Models
Visualization, Data
Mining &
Exploration
User Reports &
Dashboards
Landing
Zone
Provisioning
Zone
Self Service & Collaboration and
Advanced Analytics Zone
(Months)
Shared
Operational
Information
Data Warehouse &
Data Marts Zone
Sensors
Social Media
Customer
Conversations
External
Sources
Back Office
Applications
Devices
Information Catalog /
Master Data
Management
Public
Internal
Sources
On-Premises
Weather
Data Sources
Trusted,HighlyGoverned,
HighlySecured
Agile,LessGoverned,
LessSecure
Streaming
Streaming
Activity Hub /
Asset Hub /
Statistical Models Hub /
Content Hub
Shared Operational ZoneOn-Line Services
Hybrid Cloud Data Lake
Front Office
Applications
Internet
Wearable's
Sensors
Streaming
26. Data Movement on Hybrid Cloud
One of the biggest challenges for ta hybrid cloud strategy
Top 5 considerations to reduce latency and maintain performance
Location, Location, Location Dedicated
Connections
Streaming
Traffic Optimization Workload Optimization
27. Data Preparation & Integration on Hybrid Cloud
Data prep isn’t just for data pros.
The 3 phases of overcoming hybrid cloud data integration
Exposing on-premises
data to SaaS apps Hybrid Cloud Data Lake
Real-time analytics
with streaming data
28. 28
Security on Hybrid Cloud
3/29/17World of Watson 2016
Security has been and continues to be the #1 concern with the adoption of public cloud,
and the same challenges are also present in a hybrid cloud scenario.
Some key considerations for security on a hybrid cloud strategy includes:
Mind The Gap Get ComplaintDeal with Data
34. 34
IBM Cloud Architecture Center
World of Watson 2016
https://developer.ibm.com/architecture/
• Cloud solutions across eight architectures
• Based on open standards from customer-
proven enterprise grade architectures
Learn. Discover.
• Forty-one solutions across domains, cross-
domain, emerging and industries
• Prescriptive guides explain implementation best
practices
Try.
• Experiment with nine sample
solution implementations
• Using single click deploy or a
guided walkthrough
Prove
architecture
works quickly
Build what
experts
recommend
Find your solution
Built on IBM Cloud
Solutions
– Data and Analytics
– Internet of Things
– Cognitive
36. 36 3/29/17
Notices and disclaimers
continued
Information concerning non-IBM products was obtained from the
suppliers of those products, their published announcements or other
publicly available sources. IBM has not tested those products in
connection with this publication and cannot confirm the accuracy of
performance, compatibility or any other claims related to non-IBM
products. Questions on the capabilities of non-IBM products should be
addressed to the suppliers of those products. IBM does not warrant the
quality of any third-party products, or the ability of any such third-party
products to interoperate with IBM’s products. IBM expressly
disclaims all warranties, expressed or implied, including but not
limited to, the implied warranties of merchantability and fitness
for a particular, purpose.
The provision of the information contained herein is not intended to,
and does not, grant any right or license under any IBM patents,
copyrights, trademarks or other intellectual property right.
IBM, the IBM logo, ibm.com, Aspera®, Bluemix, Blueworks Live, CICS,
Clearcase, Cognos®, DOORS®, Emptoris®, Enterprise Document
Management System™, FASP®, FileNet®, Global Business Services®,
Global Technology Services®, IBM ExperienceOne™, IBM
SmartCloud®, IBM Social Business®, Information on Demand, ILOG,
Maximo®, MQIntegrator®, MQSeries®, Netcool®, OMEGAMON,
OpenPower, PureAnalytics™, PureApplication®, pureCluster™,
PureCoverage®, PureData®, PureExperience®, PureFlex®,
pureQuery®, pureScale®, PureSystems®, QRadar®, Rational®,
Rhapsody®, Smarter Commerce®, SoDA, SPSS, Sterling Commerce®,
StoredIQ, Tealeaf®, Tivoli® Trusteer®, Unica®, urban{code}®, Watson,
WebSphere®, Worklight®, X-Force® and System z® Z/OS, are
trademarks of International Business Machines Corporation, registered
in many jurisdictions worldwide. Other product and service names
might be trademarks of IBM or other companies. A current list of IBM
trademarks is available on the Web at "Copyright and trademark
information" at: www.ibm.com/legal/copytrade.shtml.
37. Thank You
Dr. Christine Ouyang
Distinguished Engineer and Master Inventor
IBM – couyang@us.ibm.com
Marcio Moura
Executive Architect
IBM – mtmoura@us.ibm.com