Hybrid is the New Normal
3. 3 © Cloudera, Inc. All rights reserved.
Three Types of Workload Lifecycles
[Figure: three workload lifecycle timelines labeled Persistent, Transient, and Elastic; timeline annotations: 24/7, 1hr, spin up, spin down]
4.
HOW CLOUDERA HELPS WITH HYBRID
5.
• The modern platform for machine learning and analytics
• with multiple deployment options
• and one shared data experience
6.
CLOUDERA VALUE PROPOSITION
• MODERN PLATFORM: Data Engineering, Data Warehouse, Data Science, Operational Database
• DEPLOYMENT OPTIONS
  Location: Amazon, Microsoft, Data Center
  Storage: S3, ADLS, HDFS, Kudu
  Manageability: Self Managed, Managed Service
• SHARED DATA EXPERIENCE: Security, Governance, Lifecycle Management, Data Catalog
7.
Big Data Infrastructure Evolution
Traditional Infrastructure
Each cluster is self-contained with compute, data context, and data
Data context = HMS, Sentry, Navigator
[Diagram: three clusters, each containing Compute, Context, and Data]
• Compute, Context and Data together. Designed for best performance.
• Highly Available, mission critical
• Multi-tenant, Secure and Governed
• Fixed Size, provisioned for peak capacity
• Low utilization rates if workloads are only transient
• Not easy to Scale
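The low-utilization point above can be made concrete with a little arithmetic. The sketch below is illustrative only: the function name `utilization` and the one-hour-per-day workload are assumptions chosen for the example, not figures from the deck.

```python
def utilization(busy_hours_per_day: float,
                provisioned_hours_per_day: float = 24.0) -> float:
    """Fraction of provisioned cluster time that is spent doing work."""
    if provisioned_hours_per_day <= 0:
        raise ValueError("provisioned_hours_per_day must be positive")
    return busy_hours_per_day / provisioned_hours_per_day


# Hypothetical case: a single 1-hour transient job per day
# running on a fixed-size cluster that stays up 24/7.
always_on = utilization(busy_hours_per_day=1.0)
print(f"{always_on:.1%}")  # 4.2%
```

Under these assumed numbers the cluster sits idle for roughly 96% of the time it is provisioned.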
8.
Big Data Infrastructure Evolution
Decoupled Infrastructure
Data is separate from compute (e.g., in ADLS/S3), but context needs to be managed redundantly in each cluster
[Diagram: three clusters, each containing Compute and Context, on top of one shared Data store]
• Decoupled data and compute
• Use Object Store
• Scale easily
• Workload Specific infrastructure
• Support Transient and Persistent workloads
• Compute and Context are still together
• Schemas must be maintained in the application. You lose Context when a transient workload completes.
• No way to troubleshoot. You lose Workload information (logs and statistics) when the cluster goes away.
9.
Big Data Infrastructure Evolution
Modern Infrastructure with Cloudera
Compute clusters are launched as needed
Data and data context are stored externally and are long-running
Workload Analytics stores your logs and job statistics
[Diagram: three Compute clusters on top of a shared Cloudera SDX/WXM layer and a shared Data store]
• Decoupled data and compute
• Use Object Store
• Scale easily
• Workload Specific infrastructure
• Support Transient and Persistent workloads
• Persistent Context and Workload Analytics
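The difference between per-cluster context and persistent context can be illustrated with a toy model. This is a minimal sketch, not Cloudera's implementation: `ContextStore` and `Cluster` are hypothetical classes, and a plain dictionary stands in for the data context (schemas) that the slides attribute to HMS, Sentry, and Navigator.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ContextStore:
    """Hypothetical stand-in for data context, reduced to table schemas."""
    schemas: Dict[str, str] = field(default_factory=dict)


class Cluster:
    """Toy compute cluster using either a private or a shared context."""

    def __init__(self, shared_context: Optional[ContextStore] = None) -> None:
        # No shared context given: the cluster owns a private one,
        # as in the decoupled model where context lives in each cluster.
        if shared_context is not None:
            self.context = shared_context
        else:
            self.context = ContextStore()

    def create_table(self, name: str, schema: str) -> None:
        self.context.schemas[name] = schema

    def terminate(self) -> None:
        # Dropping the reference models the cluster going away.
        self.context = None


# Per-cluster context: the schema disappears with the transient cluster.
transient = Cluster()
transient.create_table("events", "id INT, ts TIMESTAMP")
transient.terminate()
replacement = Cluster()
print("events" in replacement.context.schemas)  # False

# Persistent context: the store outlives any single cluster.
shared = ContextStore()
first = Cluster(shared_context=shared)
first.create_table("events", "id INT, ts TIMESTAMP")
first.terminate()
second = Cluster(shared_context=shared)
print("events" in second.context.schemas)  # True
```

The only design change between the two halves is where the context object lives: inside the cluster, or outside it with a longer lifetime.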
10.
CLOUDERA ALTUS
Flexible cloud deployment options including workload-optimized Managed Services with Cloudera Shared Data Experience (SDX)
[Diagram labels: Data Engineering, Data Warehouse, Multi Function, Director; Control Plane; Data Catalog, Governance, Security, Lifecycle Management; Cloud Storage: Microsoft ADLS, AWS S3]
11.
Cloudera Altus Architecture
[Diagram: CLI, Web, and SDK clients; the Altus Control Plane with Altus Data Warehouse and Altus Data Engineering; the Customer Cloud with Compute and Storage]
12.
On-Premises Cloud Bursting
• On-Premises Cluster (Bare Metal/Private Cloud): HDFS; Data Science, Data Eng, Data Warehouse
• Continuous or On-Demand
• Transient Cluster(s), Altus Managed: Data Engineering Workflow (Batch ML, Simulations, etc.), 1+ Jobs (e.g. Spark)
• Elastic Cluster(s), Altus Managed: Data Warehouse (Impala), used from BI Tools (e.g. Tableau) and a SQL Editor (e.g. Hue)
• Persistent SDX: Schema (HMS), Security (Sentry), Lineage/Metadata (Navigator)
• Altus Control Plane
• Object Store (ADLS, S3)
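One way to picture cloud bursting is as a placement decision per job: run on the on-premises cluster while it has room, otherwise send the job to a transient cloud cluster. The sketch below is an assumption-laden illustration, not how Altus schedules work: the `Job` type, the `slots` capacity unit, the `place_jobs` function, and the first-fit policy are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Job:
    name: str
    slots: int  # hypothetical unit of compute demand


def place_jobs(jobs: List[Job], on_prem_free_slots: int) -> Dict[str, str]:
    """First-fit placement: on-premises if the job fits, else burst to cloud."""
    placement: Dict[str, str] = {}
    free = on_prem_free_slots
    for job in jobs:
        if job.slots <= free:
            placement[job.name] = "on-premises"
            free -= job.slots
        else:
            # Not enough on-premises capacity left: burst this job.
            placement[job.name] = "transient-cloud"
    return placement


jobs = [Job("etl", 40), Job("batch-ml", 80), Job("report", 10)]
print(place_jobs(jobs, on_prem_free_slots=100))
# {'etl': 'on-premises', 'batch-ml': 'transient-cloud', 'report': 'on-premises'}
```

Because schema, security, and lineage sit in the persistent SDX layer on the slide, a job placed on either side would see the same context; the sketch models only the placement step.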
13.
Altus Demo
[Diagram: the same architecture as on the "On-Premises Cloud Bursting" slide: an on-premises cluster (HDFS), Altus-managed transient and elastic clusters, Persistent SDX, the Altus Control Plane, and an object store (ADLS, S3)]