1. © Cloudera, Inc. All rights reserved.
How to Lower TCO and
Avoid Cloud Lock-in
Jim Fisher, Director of Systems Engineering at Cloudera
Ifi Derekli, Systems Engineer at Cloudera
Susan Greslik, Systems Engineer at Cloudera
2.
Agenda
● Moving from on-prem to Cloud: Best practices for lowering your TCO
● Portability and selecting the right Cloud provider
● Demo of using multiple Cloud providers
● Bringing it all together with Cloudera Altus
3.
Benefits of the Cloud
Big Data deployments in the cloud are accelerating
● Increased agility through end-user self-service
● Organization focused on higher-value items
● Perceived lower overall TCO by optimizing infrastructure usage
4.
Three Types of Deployment Models in Cloud
[Figure: timelines for the three models, labeled Persistent, Transient, and Elastic, annotated with "24/7", "1hr", "spin up", and "spin down".]
5.
Characteristics are Different for Each

Persistent
Usage:
● Runs 24/7
● Only expands when new capacity is needed
Requirements:
● High availability & disaster recovery
● Cluster operational management
● Resource management
● Security
Example Workloads:
● NoSQL
● Streaming
● BI analytics
● Multi-user

Transient
Usage:
● Runs on an intermittent basis (e.g., daily, weekly, hourly)
Requirements:
● Object store integration
● Fast cluster provisioning
● Cluster metadata persistence
● Usage-based pricing
Example Workloads:
● ETL workflows
● Model training
● Ad hoc analytics
● Dev/Test workflows

Elastic
Usage:
● Some nodes run 24/7
● Others added & removed as needed (e.g., daily, weekly, monthly, quarterly)
Requirements:
● Combination of requirements from persistent and transient clusters
Example Workloads:
● BI analytics during peak hours
● End of week, month, quarter processing
6.
Cost Models are Different

Persistent
● Cost Structure: Yearly “rental” of infrastructure
● Cost Optimization Option: Multi-year agreements (e.g., Reserved Instances)

Transient
● Cost Structure: Hourly “rental” of infrastructure
● Cost Optimization Option: Preemptible VMs (e.g., Spot Instances)

Elastic
● Cost Structure: Yearly “rental” for persistent nodes and hourly “rental” for transient nodes
● Cost Optimization Option: Multi-year agreements for persistent nodes and preemptible VMs for transient nodes

On-Premise
● Cost Structure: Purchase of infrastructure that is typically depreciated over 3 years
● Cost Optimization Option: Limited
7.
Some Assumptions
● Publicly available list pricing is used, with no discounts
● AWS pricing is used since AWS is the market leader, but the same concepts apply to other Cloud providers
● Your mileage may vary: the numbers will be different for every organization, but the concepts and figures are directionally correct
● Only infrastructure costs are included since they are often the majority of TCO
8.
How Much Does a Server Cost?

Component | Details | Cost Estimates
Server | 20 cores, 256 GB RAM, 12 x 4 TB disks | $18,000
Data Center | Power, cooling, and data center space* | $4,000
Networking | Switches & networking equip. | $5,000
Administrator | One person who manages 100 servers | $6,000
TOTAL (3 Years) | | $33,000
TOTAL (Annually) | | $11,000

* https://ongoingoperations.com/data-center-pricing-credit-unions/
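
The per-server total above is a straight sum of the four component estimates, annualized over the three-year depreciation period. A minimal sketch of that arithmetic, using only the figures from the table (the variable names are illustrative):

```python
# Three-year cost estimates per server, taken from the table above (USD).
component_costs = {
    "server": 18_000,
    "data_center": 4_000,
    "networking": 5_000,
    "administrator": 6_000,
}
YEARS = 3  # depreciation period used in the slides

total_3yr = sum(component_costs.values())
annual = total_3yr / YEARS

print(f"3-year total: ${total_3yr:,}")
print(f"Annual:       ${annual:,.0f}")
```

Swapping in your own component estimates is the quickest way to see how far your on-premise baseline sits from the $11,000 used in the rest of the deck.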
9.
How Much is a Similar Server in the Cloud?

Component | Details
d2.8xlarge | 36 vCPUs, 244 GB RAM, 12 x 4 TB disks

Option | Unit Cost | Total (3 Years) | Total (Annual)
On-Premise Server | $33,000 / 3 years | $33,000 | $11,000
On-Demand Pricing | $5.52 / hour | $145,065 | $48,355
Reserved Instance (1-Year) | $23,616 / year | $70,848 | $23,616
Reserved Instance (3-Year) | $41,560 / 3 years | $41,560 | $13,853
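
The annual figures in the table can be reproduced by putting every option on a per-year basis. The sketch below assumes a 365-day year running 24 hours a day and uses only the unit costs quoted on the slide; the variable names are illustrative.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours; leap days ignored

on_demand_hourly = 5.52   # d2.8xlarge list price from the slide (USD/hour)
reserved_1yr = 23_616     # USD per year
reserved_3yr = 41_560     # USD per 3-year term
on_prem_annual = 11_000   # USD per year, from the previous slide

annual_costs = {
    "on_premise": on_prem_annual,
    "on_demand": round(on_demand_hourly * HOURS_PER_YEAR),
    "reserved_1yr": reserved_1yr,
    "reserved_3yr": round(reserved_3yr / 3),
}

for option, cost in annual_costs.items():
    ratio = cost / on_prem_annual
    print(f"{option:<13} ${cost:>7,} per year ({ratio:.1f}x on-premise)")
```

Running always-on hardware at on-demand rates is the expensive case; the longer the commitment, the closer the cloud price gets to the on-premise baseline.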
10.
What if You Want to Use Object Storage?
Benefits of using object storage
● Create a data lake in object store for multiple clusters and eliminate data silos
● Provides durability for the data, so you don’t have to worry about replication
● Allows you to separate compute and storage so each can grow independently
● ...which leads to lower costs than local storage
Except when…
● Performance is critical. Each attached disk delivers roughly 100Mb/s, and that is also a rough estimate of the throughput from one server to object storage
11.
How Does Object Storage Affect the Costs?

Component | Details
r4.8xlarge | 32 vCPUs, 244 GB RAM
EBS Disks | 640 GB
S3 Storage | 12 TB* (no replication required)

Option | Compute | Storage | Total (Annual)
On-Premise Server | $11,000 | $0 | $11,000
d2.8xl Reserved Instance (3-Year) | $13,853 | $0 | $13,853
r4.8xl Reserved Instance (3-Year) | $7,009 | $5,458 | $12,467

* Object Storage costs may be less since you pay for what you use
12.
Cloud Workloads are often Transient
Benefits of Transient Clusters
● Pay only for what you use
● Right-size cluster based on workload needs
● Better isolation between different users and groups
13.
What if You Only Needed 6 Hours per Day?

Component | Details
r4.8xlarge | 32 vCPUs, 244 GB RAM, 6 hours/day
EBS Disks | 640 GB, 6 hours/day
S3 Storage | 12 TB*, 24x7

Option | Compute | Storage | Total (Annual)
On-Premise Server | $11,000 | $0 | $11,000
Reserved Instance (3-Year) | $7,009 | $5,458 | $12,467
On-Demand Pricing (6 hours/day) | $4,648 | $4,728 | $9,376

* Object Storage costs may be less since you pay for what you use
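
One way to read the table above: a reserved instance is paid for around the clock, so at 6 hours/day its cost is spread over only a quarter of the hours in a year. The rough sketch below uses the compute figures from the table; the per-hour values are derived here for illustration and do not appear in the slides.

```python
HOURS_USED_PER_DAY = 6
DAYS_PER_YEAR = 365

hours_used = HOURS_USED_PER_DAY * DAYS_PER_YEAR  # hours of real work per year
utilization = HOURS_USED_PER_DAY / 24            # fraction of the day in use

# Annual compute cost from the table above (USD).
reserved_compute = 7_009   # paid whether or not the instance is busy
on_demand_compute = 4_648  # paid only for the 6 hours/day actually used

reserved_per_used_hour = reserved_compute / hours_used
on_demand_per_used_hour = on_demand_compute / hours_used

print(f"Utilization: {utilization:.0%} ({hours_used:,} hours/year)")
print(f"Reserved:  ${reserved_per_used_hour:.2f} per hour of work")
print(f"On-demand: ${on_demand_per_used_hour:.2f} per hour of work")
```

At low utilization the commitment discount is outweighed by the idle hours, which is why on-demand comes out ahead for this transient workload.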
14.
Preemptible Instances can be used to lower costs
What are Preemptible Instances?
Spare computing capacity that you can bid on at significant discounts compared to on-demand pricing. AWS suggests that costs can be 50-90% less than On-Demand, and Google says they can be up to 80% cheaper.
Known as Spot Instances in AWS and Preemptible VMs in Google.
15.
How Much Can You Save with Spot at 70% Discount?

Component | Details
r4.8xlarge | 32 vCPUs, 244 GB RAM, 6 hours/day
EBS Disks | 640 GB, 6 hours/day
S3 Storage | 12 TB*, 24x7

Option | Compute | Storage | Total (Annual)
On-Premise Server | $11,000 | $0 | $11,000
On-Demand Pricing (6 hours/day) | $4,648 | $4,728 | $9,376
Spot Pricing - 70% (6 hours/day) | $1,394 | $4,728 | $6,122

* Object Storage costs may be less since you pay for what you use
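
The Spot row follows from applying the discount to the on-demand compute cost only; storage is unchanged. A small sketch of that calculation, with a hypothetical helper `spot_compute_cost` and the table's figures as inputs:

```python
def spot_compute_cost(on_demand_cost: float, discount: float) -> float:
    """Return the compute cost after a Spot discount given as a fraction (0.0 to 1.0)."""
    if not 0.0 <= discount <= 1.0:
        raise ValueError("discount must be between 0 and 1")
    return on_demand_cost * (1.0 - discount)


on_demand_compute = 4_648  # annual compute at 6 hours/day, from the table (USD)
storage = 4_728            # annual storage cost from the table (USD)

spot_compute = round(spot_compute_cost(on_demand_compute, 0.70))
spot_total = spot_compute + storage
print(f"70% discount: compute ${spot_compute:,}, total ${spot_total:,}")

# The discount is not fixed; the slides quote a 50-90% range for AWS.
for discount in (0.50, 0.70, 0.90):
    total = round(spot_compute_cost(on_demand_compute, discount)) + storage
    print(f"{discount:.0%} discount -> total ${total:,} per year")
```

Because storage is a fixed floor, deeper Spot discounts give diminishing returns on the total once compute is already the smaller share.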
16.
Let’s Walk through a Scenario

ETL requirements
● SLAs to complete within 6 hours
● Need 20 servers to meet SLAs
BI requirements
● 15 servers to meet minimum workload
● 25 servers needed during business hours

25 servers cost about $275,000 per year
35 servers cost about $385,000 per year
17.
Two Clusters in the Cloud

BI - Persistent, running 24/7
Component | Details
r4.2xlarge | 100 instances, each with 8 vCPUs, 61 GB RAM
EBS Disks | 15.6 TB, 24 hours/day
S3 Storage | 300 TB

ETL - Transient, running 6 hours daily
Component | Details
r4.2xlarge | 80 instances, each with 8 vCPUs, 61 GB RAM, 6 hours/day
EBS Disks | 12.8 TB, 6 hours/day
S3 Storage | (included with BI workload)
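
The slides do not say how the instance counts were chosen. One reading that reproduces them, offered here as an assumption, is that each 32-vCPU server from the scenario is replaced by four 8-vCPU r4.2xlarge instances, and that S3 is sized at 12 TB per BI server as on the earlier single-server slides. The helper name below is hypothetical.

```python
VCPUS_PER_SCENARIO_SERVER = 32  # r4.8xlarge-sized server from the earlier slides
VCPUS_PER_R4_2XLARGE = 8
S3_TB_PER_SERVER = 12           # from the single-server slides


def r4_2xlarge_count(servers: int) -> int:
    """Instances needed to match the servers' total vCPUs (assumed sizing rule)."""
    return servers * VCPUS_PER_SCENARIO_SERVER // VCPUS_PER_R4_2XLARGE


bi_instances = r4_2xlarge_count(25)   # BI peak during business hours
etl_instances = r4_2xlarge_count(20)  # ETL servers needed to meet the SLA
s3_tb = 25 * S3_TB_PER_SERVER

print(f"BI: {bi_instances} instances, ETL: {etl_instances} instances, S3: {s3_tb} TB")
```

Smaller instances keep the same total capacity while letting each cluster grow or shrink in finer steps.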
18.
How Do the Costs Compare?

Option | Compute | Storage | Total (Annual)
On-Premise (25 Servers) | $275,000 | $0 | $275,000
On-Premise (35 Servers) | $385,000 | $0 | $385,000
Cloud (ETL) | $27,888 | $4,853 | $32,741
Cloud (BI) | $175,233 | $136,395 | $311,628

● Cloud can cost less when on-premise is not highly utilized and more when on-premise is efficiently utilized
● Cloud TCO best practices used
● Cloud provides benefits of isolation and on-demand flexibility
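
Adding the two cloud rows gives a single figure to set against each on-premise option. The sketch below uses only the numbers from the table and the $11,000 annual per-server cost from earlier in the deck; the dictionary layout is illustrative.

```python
ON_PREM_ANNUAL_PER_SERVER = 11_000  # USD, from the server cost slide

# Annual cloud costs from the table above (USD).
cloud = {
    "ETL": {"compute": 27_888, "storage": 4_853},
    "BI": {"compute": 175_233, "storage": 136_395},
}
cloud_total = sum(c["compute"] + c["storage"] for c in cloud.values())
print(f"Cloud (ETL + BI): ${cloud_total:,} per year")

for servers in (25, 35):
    on_prem = servers * ON_PREM_ANNUAL_PER_SERVER
    diff = cloud_total - on_prem
    verdict = "more" if diff > 0 else "less"
    print(f"vs {servers} on-premise servers (${on_prem:,}): "
          f"cloud costs ${abs(diff):,} {verdict}")
```

The combined cloud cost lands between the two on-premise cases, so the outcome depends on how tightly the on-premise cluster can be utilized.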
19.
Best Practices on How to Lower Cloud TCO
General
● Don’t look at Cloud as infrastructure hosted in another data center
● Understand the workloads so you use the right tool to optimize the TCO
Storage
● Utilize object storage when possible to eliminate data silos
● Use local storage when performance SLAs are more critical
Persistent
● Use Reserved Instances when workloads are known and can be committed to for multiple years
Transient
● Use Preemptible instances when possible, but you may have to redesign the application
20.
Portability and selecting the right Cloud provider
21.
Why is portability important?
Financial considerations
• Price negotiation position
• Instance pricing fluctuation
• Project type can dictate cost (storage, processing power)
Conflict of Interest
• Competition with cloud vendors
• Freedom to have choices for given projects is critical
Maximize Capabilities
• Freedom to leverage all features available across vendors
Diversify Risk
• Eliminate 100% dependency on vendor’s technology
• Ensure uptime of your environment despite potential Cloud vendor issues
22.
What to consider?
Questions to ask

Pricing model
● Does the vendor round to the nearest minute? Hour? Do they offer discounts for upfront commitment?
Variety of services provided
● Does the vendor provide enough options for instances (e.g. dense disk, memory-optimized, CPU-optimized), storage (e.g. local disk, object store), or network capabilities to meet your needs?
Ease of Use
● Do you have an existing skill-set for a particular vendor?
● Is the platform simple to deploy and easy to learn?
Support & Management
● What type of assistance will the vendor provide?
● How easy is it to troubleshoot your cloud environment?
23.
What is Cloudera Director?
Orchestration tool for deploying, monitoring and scaling Cloudera EDH on cloud infrastructure
Characteristics:
• Embodies Cloudera best practices and reference architectures
• Complements on-prem offerings for IaaS users
• Extends capabilities of Cloudera Manager
• Grows and shrinks clusters via single pane of glass
Main Goals:
• Reduce time-to-value
• Enable new usage patterns (on-demand clusters, self-service)
• Facilitate portability amongst cloud vendors
• Allow predictability of workloads
24.
Cloudera Director Economics
• Re-usability
  • Cluster configuration files
  • Node templates (Master, Workers, Edge)
  • “Standard” cluster configs -> predictable project costs
• Only Pay for What You Use
  • Automatic billing
  • Flexible SKUs based on use case
25.
How Does Director Help with Portability?
[Slide placeholder: screenshot of Director]
28.
Cloudera Altus is a PaaS for big data analytics
● Brand for Cloudera PaaS offerings
● Foundation acts as a framework for building services
● Altus for data engineers is the first user-facing service
29.
Key Takeaways
● Understand your workloads so you use the right tools and right vendor to optimize the TCO
● Plan for portability to reduce risk and costs and enable options
● Cloudera can help you plan and architect efficiently
● Cloudera Altus - PaaS offering so you can focus on your applications
31.
An Enterprise Data Hub reimagined in the cloud
[Figure: architecture diagram with the labels Object Store, Data Science Workbench, SQL Workbench, Partner Ecosystem, Workload Management, Common Governance, Common Security, and "Common: Operations, Governance, Security, Schema, Catalog".]