AWS Summit 2014 Melbourne - Breakout 3
A behind-the-scenes look at key aspects of AWS infrastructure deployments. We cover some of the true differences between a cloud infrastructure design and a conventional enterprise infrastructure deployment, and why the cloud fundamentally changes application deployment speed and economics while providing more and better tools for delivering high-reliability applications. Few companies can afford to have a datacenter in every region in which they serve customers or have employees. Even fewer can afford to have multiple datacenters in each region where they have a presence. Fewer still can afford to invest in custom optimized network, server, storage, monitoring, cooling, and power distribution systems and software. We'll look more closely at these systems, how they work, how they are scaled, and the advantages they bring to customers.
Presenter: Rodney Haywood, Manager, Solutions Architects, Amazon Web Services
2. Agenda
Redefining Scale at AWS
AWS Designed Hardware & Infrastructure
Multi-AZ Design Point & Why it Works
3. Perspective on Scaling
On average, AWS adds enough
new server capacity every day
to support Amazon’s global
infrastructure when it was a
$7B business (2004).
10. Infrastructure-as-a-service Magic Quadrant 2014
“AWS is the overwhelming
market share leader, with
more than five times the
compute capacity in use
than the aggregate total of
the other fourteen
providers.”
Gartner “Magic Quadrant for Cloud Infrastructure as a Service,” Lydia Leong, Douglas Toombs, Bob Gill, Gregor Petri, Tiny Haynes, May 2014. This Magic Quadrant graphic was published by
Gartner, Inc. as part of a larger research note and should be evaluated in the context of the entire report. Gartner does not endorse any vendor, product or service depicted in its research publications,
and does not advise technology users to select only those vendors with the highest ratings. Gartner research publications consist of the opinions of Gartner's research organization and should not be
construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.
12. Pace of Innovation
Infrastructure pace of
innovation increasing
– Driven by cloud service providers and
high-scale internet applications
– Cost of datacenter and H/W
infrastructure dominates
– Infrastructure more than just a cost
center
High focus on innovation
– Driving down cost
– Increasing aggregate reliability
– Reducing resource consumption
footprint
13. AWS Custom Server Designs
OEM Server Ecosystem
– Optimized for 10s to 100s of thousands of customers
– Broadly applicable servers can run a variety of workloads
Cloud Server Ecosystem
– Optimized for single customer
– Highly specialized servers optimized for specific workload
– Large scale deployments allow hardware specialization
– Move hot s/w kernels to hardware implementations
– Datacenters, servers, networking, and storage designed to an integrated spec.
14. AWS Custom Storage Designs
Commercial high-density storage:
• Quanta M4600H 4U Disk Enclosure
• Impressive best in class general purpose design
• We use custom design with still higher density
OEM storage & servers must target vast workload
diversity
High scale supports AWS-specific optimizations
– More space, power, & cost efficient
15. Networking Equipment
• Relative cost of networking
increasing quickly
• Profit margins high
• Ecosystem vertically
integrated
[Chart: Monthly Costs, with an 8% share called out; 3-year server & 10-year infrastructure amortization]
16. Get the Network Out of the Way
Mainframe Model Goes Commodity
• Current networks over-subscribed
• Forces workload placement
restrictions
• Goal: Make all points in
datacenter equidistant
• Amazon custom routers &
protocol stacks
17. Power Infrastructure
Negotiated power purchasing
agreements
AWS custom high-voltage
sub-stations in some regions
– Lower power cost
– Build faster
18. Procurement & Supply Chain Optimization
Procurement
Global demand allows purchasing power at volume
Direct component purchasing
– Precise inventory control
– Better pricing
– Optimized designs
Supply Chain
Demand-driven supply chain
Shorter cycle time drives higher utilization
– Predicting next week easier than 4 to 6 months out
Less overbuy & less capacity risk yielding lower costs
19. Utilization & Economics
Server Utilization Problem
– On premise 30% utilization VERY good & 10% to 20% more common
– Solution: Pool number of heterogeneous services
Pay as You Go, Pay as You Grow
– Don’t block the business
– Don’t over-buy
– Transfers capital expense to variable expense
– Apply capital for business investments rather than infrastructure
Chargeback Models Drive Good Behavior
– Cost encourages prioritization of work by application developers
– High scale needed to make a spot market for low priority work
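The on-premise utilization figures above translate directly into cost per unit of useful work. The following is a minimal sketch of that arithmetic, not a model from the talk: the function name and the normalized hourly cost of 1.0 are assumptions chosen purely for illustration.

```python
def cost_per_utilized_hour(cost_per_server_hour: float, utilization: float) -> float:
    """Cost of one hour of useful work when a server is only partly busy.

    `cost_per_server_hour` is a hypothetical all-in cost of owning one server
    for one hour; `utilization` is the fraction of that hour doing useful work.
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return cost_per_server_hour / utilization


# Normalized cost of 1.0 per server-hour (an assumption, not a real price).
for utilization in (0.10, 0.20, 0.30):
    multiple = cost_per_utilized_hour(1.0, utilization)
    print(f"{utilization:.0%} utilization: {multiple:.2f}x the raw server-hour cost")
```

Under this simple model, a fleet running at 10% utilization pays ten times the raw server-hour cost for each hour of useful work, which is why pooling workloads to raise utilization matters so much.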
20. Amazon Cycle of Innovation
15+ years of
operational excellence
Reduce Prices
Innovate
Listen to Customers
Lower Costs
Re-invest in Features
Improve Processes
45 AWS price
reductions since 2006
21. AWS Rapid Pace of Innovation
2008 (+24): Amazon EBS, Amazon CloudFront
2009 (+48): Elastic Load Balancing, Auto Scaling, Amazon VPC, Amazon RDS
2010 (+61): Amazon SNS, AWS Identity & Access Management, Amazon Route 53
2011 (+82): Amazon SES, AWS Elastic Beanstalk, AWS CloudFormation, Amazon ElastiCache, AWS Direct Connect, GovCloud
2012 (+159): AWS Storage Gateway, Amazon DynamoDB, Amazon CloudSearch, Amazon SWF, Amazon Glacier, Amazon Redshift, AWS Data Pipeline
2013 (+280): Amazon Elastic Transcoder, AWS OpsWorks, Amazon CloudHSM, Amazon AppStream, Amazon CloudTrail, Amazon WorkSpaces, Amazon Kinesis
2014 (+270*): Amazon Cognito, Amazon Mobile Analytics, Amazon Zocalo
Since inception AWS has:
• Released 927 new services and features
• Introduced over 35 major new services
*as of July 31, 2014
23. Conventional Design: Cross-Region Replication
5th app availability “9” only via multi-datacenter replication
Conventional approach:
– Two datacenters in distant locations
– Replicate all data to both datacenters
99.999%
The industry-wide dominant multi-DC availability approach
– Looks rock solid but performs remarkably poorly in
practice
Acid Test: Are you willing to pull the plug on the primary server?
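To make the fifth "9" concrete, the following minimal sketch converts a number of nines into allowed downtime per year. The function name and the 365.25-day year are assumptions made for illustration; they are not figures from the talk.

```python
MINUTES_PER_YEAR = 365.25 * 24 * 60  # assumes an average year of 365.25 days


def downtime_minutes_per_year(nines: int) -> float:
    """Allowed downtime per year for an availability with `nines` nines.

    For example, nines=3 means 99.9% availability.
    """
    if nines < 1:
        raise ValueError("nines must be at least 1")
    unavailability = 10.0 ** (-nines)
    return MINUTES_PER_YEAR * unavailability


for n in range(2, 6):
    print(f"{n} nines: {downtime_minutes_per_year(n):.1f} minutes of downtime per year")
```

At five nines the budget is roughly 5.3 minutes per year, which leaves almost no room for a slow, manual failover decision.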
24. What is wrong with inter-regional replication?
Asynchronous replication between datacenters
– Committing to an SSD: on the order of 1 to 2 msec
– LA to New York 74 msec round trip
On failure, a difficult & high skill decision:
– Fail-over & lose transactions, or
– Don’t fail-over & lose availability
I’ve been on these calls in the past
– No win situation
– Very hard to get right
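The latency figures above explain why cross-country replication ends up asynchronous. The sketch below is a deliberately simplified model, assuming a commit that must wait for one round trip plus the remote SSD commit before acknowledging; the function names and the single-threaded throughput estimate are illustrative assumptions.

```python
SSD_COMMIT_MS = 2.0          # upper end of the 1 to 2 msec SSD figure quoted above
LA_NY_ROUND_TRIP_MS = 74.0   # LA to New York round trip quoted above


def sync_commit_ms(remote_commit_ms: float, round_trip_ms: float) -> float:
    """Latency of a commit that waits for a remote replica to acknowledge.

    Simplified model: one network round trip plus the remote commit time.
    """
    return round_trip_ms + remote_commit_ms


def serial_commits_per_second(commit_ms: float) -> float:
    """Commits per second for a single writer issuing commits back to back."""
    return 1000.0 / commit_ms


local = SSD_COMMIT_MS
remote = sync_commit_ms(SSD_COMMIT_MS, LA_NY_ROUND_TRIP_MS)
print(f"local commit:              {local:.0f} ms, {serial_commits_per_second(local):.0f} commits/s")
print(f"synchronous cross-country: {remote:.0f} ms, {serial_commits_per_second(remote):.1f} commits/s")
```

Under these assumptions a synchronous cross-country commit is dozens of times slower than a local one, so replication is run asynchronously and a failover risks losing the transactions still in flight.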
25. What Else is Wrong with X-Country Replication?
Fragile: Active/Passive Doesn’t Work
– Failover to a system that hasn’t been taking operational load
– Passive secondary not recently tested
– Secondary config or S/W version different, incorrect load balancer config,
incorrect network ACLs, latent hardware problem, router problem,
resource shortage under load
– Can’t test without negative customer impact
– If you don’t test it, it won’t work
2-Way Redundancy Expensive:
– More than ½ capacity reserved to handle failure
– 3 datacenters much less expensive but impractical w/o high scale
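The cost difference between 2-way and 3-way redundancy follows from simple arithmetic. The sketch below assumes load is spread evenly across sites and that the surviving sites must carry the full peak load after any single site fails; the function names are hypothetical.

```python
from fractions import Fraction


def reserved_fraction(sites: int) -> Fraction:
    """Fraction of total deployed capacity held in reserve for one site failure.

    With `sites` equal sites, the remaining sites - 1 must carry the full load,
    so each site can run at most (sites - 1) / sites busy.
    """
    if sites < 2:
        raise ValueError("need at least 2 sites to survive a site failure")
    return Fraction(1, sites)


def overprovision_factor(sites: int) -> Fraction:
    """Total capacity deployed relative to peak load."""
    if sites < 2:
        raise ValueError("need at least 2 sites to survive a site failure")
    return Fraction(sites, sites - 1)


for n in (2, 3, 4):
    print(f"{n} sites: reserve {reserved_fraction(n)} of capacity, "
          f"deploy {float(overprovision_factor(n)):.2f}x peak load")
```

This model gives exactly one half reserved for two sites; real deployments also need headroom, which is why the slide says more than ½. Three sites cut the reserve to one third.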
26. AWS Multi-Availability Zone Model
Choose Region to be close to user, close to data, or meeting jurisdictional
requirements
Synchronous replication to 2 (or better 3) Availability Zones
– Easy when less than 2 to 3 msec away
– Can failover w/o customer impact
ELB over EC2 instances in different AZs
Stateless EC2 apps easy
For persistent state use
– DynamoDB
– Simple Storage Service
– Multi-AZ RDS
27. New Research: Customers
Improve Availability by Migrating
Apps to AWS
32% reduction in total
application downtime
2013 AWS Customer Survey
Research Note: “Benchmarking availability and reliability in the cloud: Amazon Web Services,” Nucleus Research, November 2013, Document N168
28. Is Hosting On-premises Less Expensive?
Utilization fundamentally higher in cloud
– Aggregating non-correlated workloads,
scale, spot market
Amazon specific H/W designs
– ODM acquisition of custom servers & net
gear
– Direct purchasing of disk, memory, & CPU
– AWS controlled hypervisor & net protocol
layers
Deep R&D: Many new data centers built each
year
Immense scale
– Volume purchasing, highly automated,
specialists in all areas
Amazon margins are tiny compared to
enterprise margins
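The claim that aggregating non-correlated workloads raises utilization can be illustrated with a small simulation. This is a sketch under stated assumptions, not data from the talk: each workload is modeled as independently idling at 10% of a server and spiking to 100% one hour in ten, and all names and parameters are hypothetical.

```python
import random


def utilization_dedicated_vs_pooled(num_workloads=50, hours=1000, spike_prob=0.1, seed=0):
    """Compare utilization when each workload is provisioned for its own peak
    against provisioning one shared pool for the peak of the combined load.

    Returns (dedicated_utilization, pooled_utilization).
    """
    rng = random.Random(seed)
    # demand[w][h]: servers needed by workload w in hour h (0.1 idle, 1.0 spike)
    demand = [
        [1.0 if rng.random() < spike_prob else 0.1 for _ in range(hours)]
        for _ in range(num_workloads)
    ]
    total_per_hour = [sum(w[h] for w in demand) for h in range(hours)]
    mean_total = sum(total_per_hour) / hours

    sum_of_peaks = sum(max(w) for w in demand)  # capacity if every workload owns its peak
    peak_of_sum = max(total_per_hour)           # capacity if all workloads share one pool
    return mean_total / sum_of_peaks, mean_total / peak_of_sum


dedicated, pooled = utilization_dedicated_vs_pooled()
print(f"dedicated utilization: {dedicated:.0%}")
print(f"pooled utilization:    {pooled:.0%}")
```

Because independent spikes rarely coincide, the peak of the combined load is well below the sum of the individual peaks, so the shared pool runs at higher utilization for the same work.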
29. Summary
AWS Economics driven by scale & singular focus
– Economies of scale
– Increased availability through multiple-datacenter deployment
– Steadily declining price
Mega-scale advantages available to all customers regardless of size
– Datacenter presence near all customers world-wide
– Multiple datacenters in each region for high availability
– Deeper R&D investment & operational focus in datacenter, server, storage, &
networking than any IT organization in the world
– Buying power that rivals the biggest in the world
Cloud Model Fundamentally different from the last 30 years
– Even if rebranded as “cloud enabled”, “private cloud”, “cloud-like”