Felix Candelario, Global Financial Services Solutions Architect, explains the high-level AWS Cloud architecture and the concepts behind Regions and Availability Zones, including how they relate to the traditional concepts of data centers and pods. He concludes with a presentation on how applications can be architected for the AWS Cloud, and how mission-critical, disaster recovery, and business continuity architectures are in use by Financial Services customers today.
5. DR Terminology Map
Your Data Centers -> AWS
• Load Balancers -> ELB/Appliance
• Web/App Servers -> EC2/Auto Scaling
• DNS -> Route 53
• Database Servers -> Amazon RDS
• Firewall -> Security Groups / ACL
• Data Centers -> Availability Zones / VPC
• Geographical Redundancy -> Multi-region
6. What is an AWS Region?
• A geographic location that contains a cluster of Availability Zones in a given metropolitan area
• Each region is completely isolated and independent from other regions
• Each region consists of 2 or more AZs to support high availability (HA) through AZ independence
7. Highly Reliable Global Footprint
• Over 1 million active customers per month across 190 countries
• 2,300 government agencies
• 7,000 educational institutions
• 13+ worldwide regions
• 35 availability zones + 9 more coming soon
• 59 edge locations
8. What are Availability Zones?
• Groupings of one or more data centers that are physically isolated
• AZs are connected to each other over low-latency links within the same region
• Using 2 or more AZs within a region can provide support for capabilities such as synchronous database replication and better pricing when using Amazon EC2 Spot instances
9. Availability Zones are Notated as Letters
35 Availability Zones (AZs)
• Example
• US East 1 (Northern VA)
– us-east-1a
– us-east-1b
– us-east-1c
– us-east-1d
– us-east-1e
[Diagram: the US-EAST-1 region containing Availability Zones A, B, C, D, and E]
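The naming convention on this slide (a region name followed by a single letter) can be sketched in a few lines of Python. The pattern and the function name `split_az_name` are assumptions made for illustration, inferred only from the us-east-1a to us-east-1e examples above; this is not an AWS API.

```python
import re

# Pattern assumed from the slide's examples: <region name><one lowercase letter>,
# e.g. "us-east-1" + "a". It is an illustration, not an official grammar.
AZ_PATTERN = re.compile(r"^(?P<region>[a-z]+(?:-[a-z]+)+-\d+)(?P<letter>[a-z])$")


def split_az_name(az_name: str) -> tuple[str, str]:
    """Return (region, zone_letter) for a name such as 'us-east-1a'."""
    match = AZ_PATTERN.match(az_name)
    if match is None:
        raise ValueError(f"not a recognised AZ name: {az_name!r}")
    return match.group("region"), match.group("letter")


region, letter = split_az_name("us-east-1a")
print(region, letter)
```

A bare region name such as "us-east-1" has no trailing letter, so the sketch rejects it with a ValueError rather than guessing a zone.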
10. What is an Amazon VPC?
• Virtual isolated network that you define in which you can launch AWS resources such as Amazon EC2 instances
• Complete control of your virtual networking environment, such as:
– Set your own IP address ranges
– Create subnets
– Configure routing tables and network gateways
• Allows extension of your corporate network to the AWS Cloud
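As a rough illustration of "set your own IP address ranges" and "create subnets", the sketch below carves one subnet per Availability Zone out of a VPC address range using Python's standard `ipaddress` module. The CIDR block, the zone names, and the function name `plan_subnets` are hypothetical; the code only plans addresses and does not call AWS.

```python
import ipaddress
from itertools import islice


def plan_subnets(vpc_cidr: str, zones: list[str], new_prefix: int) -> dict[str, str]:
    """Assign one subnet of size /new_prefix to each zone, in order."""
    vpc = ipaddress.ip_network(vpc_cidr)
    # subnets() yields the non-overlapping blocks of the requested size.
    blocks = list(islice(vpc.subnets(new_prefix=new_prefix), len(zones)))
    if len(blocks) < len(zones):
        raise ValueError("VPC range is too small for one subnet per zone")
    return {zone: str(block) for zone, block in zip(zones, blocks)}


# Hypothetical values chosen for the example.
plan = plan_subnets("10.0.0.0/16", ["us-east-1a", "us-east-1b"], 24)
print(plan)
```

Planning the ranges up front helps keep subnets from overlapping when the corporate network is later extended into the VPC.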
11. VPC Pattern Diagram - Example
• Development Amazon VPC
• Integration Amazon VPC
• Pre-production Amazon VPC
• Production Amazon VPC
13. What Compute Services are available?
• Amazon EC2: Elastic virtual servers in the cloud
• Auto Scaling: Automated scaling of EC2 capacity
• Elastic Load Balancing: Dynamic traffic distribution
14. What Network Services are available?
• Amazon VPC: Private, isolated section of the AWS Cloud
• AWS Direct Connect: Private connectivity between AWS and your datacenter
• Amazon Route 53: Domain Name System (DNS) web service
16. Resiliency
• Reducing likelihood of service failure
• Backup: Maintaining data integrity
• Disaster Recovery: Recovery after loss of availability
It’s not all or nothing. Choose a strategy that fits the business objective.
18. Ascending levels of DR options
• Backup & Restore: Backup of on-premises data to AWS to use in a DR event. RPO: 24 hours. RTO: 24 hours. Cost: $
• Pilot Light: Replicate data and minimal running services into AWS, ready to take over and flare up. RPO: 12 hours. RTO: 4 hours. Cost: $$
• Warm Standby: Replicate data and services into AWS, ready to take over. RPO: 1-4 hours. RTO: 15 min. Cost: $$$
• Hot-Site: Replicated and load balanced environments that are both actively taking production traffic. RPO: <15 min. RTO: 0-5 min. Cost: $$$
Range: "Business continuity begins" to "Uninterrupted business continuity"
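The trade-off between the four levels can be sketched as a small lookup: given a target RPO and RTO, pick the least expensive level that meets both. The figures are the ones quoted on this slide, with ranges taken at their upper bound ("1-4 hours" as 4 hours, "<15 min" as 15 minutes, "0-5 min" as 5 minutes); that rounding, the class name `DrOption`, and the function name `cheapest_option` are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DrOption:
    name: str
    rpo_minutes: int  # upper bound of the RPO quoted on the slide
    rto_minutes: int  # upper bound of the RTO quoted on the slide
    cost_rank: int    # number of "$" symbols shown on the slide


DR_OPTIONS = [
    DrOption("Backup & Restore", 24 * 60, 24 * 60, 1),
    DrOption("Pilot Light", 12 * 60, 4 * 60, 2),
    DrOption("Warm Standby", 4 * 60, 15, 3),
    DrOption("Hot-Site", 15, 5, 3),
]


def cheapest_option(max_rpo_minutes: int, max_rto_minutes: int) -> Optional[DrOption]:
    """Return the lowest-cost level meeting both targets, or None if none does."""
    candidates = [
        option for option in DR_OPTIONS
        if option.rpo_minutes <= max_rpo_minutes
        and option.rto_minutes <= max_rto_minutes
    ]
    if not candidates:
        return None
    # min() keeps the first of equally ranked options, so ties favour the lower level.
    return min(candidates, key=lambda option: option.cost_rank)


choice = cheapest_option(max_rpo_minutes=12 * 60, max_rto_minutes=6 * 60)
print(choice.name if choice else "no listed level meets the target")
```

This mirrors the point that DR is not all or nothing: the business objective (RPO and RTO) drives the choice, and cost follows from it.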
19. Backup and Restore Architecture
[Diagram: internet traffic for www.example.com goes to the active production environment in the on-premises corporate data center. Components shown: web servers, app servers, DB server, backup system, 1TB data volume, Storage Gateway (iSCSI), and a VPN connection to the AWS region used for AWS DR failover, which holds S3 buckets and a Glacier archive.]
Cost: ~$200 / Month in US-EAST + VPN
• S3 (1TB): $31/Month (shown for each of the two S3 buckets)
• Glacier (2TB): $22/Month
• Storage Gateway: $125/Month
20. Backup and Restore Details
Suitable for:
• Solutions that can sustain higher technical debt
• Lower business critical nature
• Low cost DR option
Leverage existing investments in:
• De-duplication
• Compression
• WAN Acceleration
23. Pilot light–recovery
[Diagram: www.example.com. Recovery environment: reverse proxy/caching server, application server, database server, and data volume, annotated "Start in minutes" and "Add additional capacity, if needed". Corporate data center: reverse proxy/caching server, application server, and master database server.]
26. Warm standby–prep
[Diagram: Route 53 resolves www.example.com. The corporate data center is active; the AWS region runs a scaled-down standby behind an Elastic Load Balancer that is not active for production traffic. Each side has a reverse proxy/caching server and an application server. The master database server is mirrored/replicated to a subordinate database server with its data volume. Annotation: "Application data source cut over".]
27. Warm standby–recover
[Diagram: Route 53 resolves www.example.com to the Elastic Load Balancer in the AWS region, which is now active and scaled up for production. Components shown across the corporate data center and the AWS region: reverse proxy/caching servers, application servers, a master database server, a database server, and a data volume.]
29. Hot site–prep
[Diagram: Route 53 resolves www.example.com. The corporate data center and the AWS region (behind an Elastic Load Balancer) are both marked "Active". Each side has a reverse proxy/caching server and an application server. The master database server is mirrored/replicated to a subordinate database server with its data volume. Annotation: "Application data source cut over".]
30. Hot site–recovery
[Diagram: Route 53 resolves www.example.com to the Elastic Load Balancer in the AWS region, which is marked "Active" and "Scaled up for production use". Components shown across the corporate data center and the AWS region: reverse proxy/caching servers, application servers, a master database server, a database server, and a data volume.]
31. Warm Standby and Multi-site Details
Considerations
Suitable for:
• Solutions that require RTO & RPO in minutes
• Core business critical functions
• Higher cost DR option
33. Continuous Testing of Infrastructure
• Continuously and constantly test.
• Regularly execute tests in stable, production & production-like test environments.
• Infrastructure as Code
• CI/CD Test in Infrastructure Build Pipeline
• Testing of infrastructure during Integration Test
34. Warm Standby – Testing
[Diagram: Route 53 resolves www.example.com. The corporate data center is active; the AWS region runs a scaled-down standby behind an Elastic Load Balancer that is not active for production traffic. Each side has a reverse proxy/caching server and an application server. The master database server is mirrored/replicated to a subordinate database server with its data volume. Annotation: "Application data source cut over".]
37. Warm Standby – Testing
[Diagram: the warm standby layout (active corporate data center, scaled-down standby in the AWS region that is not active for production traffic, master database server mirrored/replicated to a subordinate database server), shown with the following command.]
aws rds reboot-db-instance --db-instance-identifier dbInstanceID --force-failover
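For repeatable tests, the command above can be assembled in a script instead of being typed by hand. The sketch below only builds and prints the argument list; it does not contact AWS. The function name `build_forced_failover_command` is hypothetical, and "dbInstanceID" is the slide's own placeholder. Note that a forced failover on reboot applies to Multi-AZ RDS instances.

```python
import shlex


def build_forced_failover_command(db_instance_id: str) -> list[str]:
    """Build the AWS CLI arguments for a reboot with forced failover."""
    if not db_instance_id:
        raise ValueError("db_instance_id must not be empty")
    return [
        "aws", "rds", "reboot-db-instance",
        "--db-instance-identifier", db_instance_id,
        "--force-failover",
    ]


# "dbInstanceID" is the placeholder used on the slide, not a real instance.
command = build_forced_failover_command("dbInstanceID")
print(shlex.join(command))
```

In a test pipeline, the returned list could be passed to `subprocess.run(command, check=True)` on a machine with the AWS CLI installed and credentials configured.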
41. Cloud Based Architectures
• High level of control over the environment
• Automate Everything! – Utilise AWS APIs
• Infrastructure as code – CloudFormation
• Parallel environment
• Rolling Update / All at Once
• Blue / Green Deployments
• The significant difference between physical and cloud environments is the control and visibility the cloud provides
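The difference between "Rolling Update" and "All at Once" can be sketched as a batching decision over a fleet of instances. The instance IDs and the function name `rolling_batches` are hypothetical; the code only plans the batches and performs no deployment.

```python
def rolling_batches(instances: list[str], batch_size: int) -> list[list[str]]:
    """Split a fleet into consecutive batches that are updated one after another."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [
        instances[start:start + batch_size]
        for start in range(0, len(instances), batch_size)
    ]


# Hypothetical instance IDs used only for the example.
fleet = ["i-1", "i-2", "i-3", "i-4", "i-5"]

rolling = rolling_batches(fleet, batch_size=2)               # three batches
all_at_once = rolling_batches(fleet, batch_size=len(fleet))  # a single batch
print(rolling)
print(all_at_once)
```

Smaller batches keep more capacity in service during the update at the price of a longer rollout; a single batch is fastest but touches every instance at the same time.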
42. Common thread: Environment automation
Deployment success depends on mitigating risk for:
• Application issues (functional)
• Application performance
• People/process errors
• Infrastructure failure
• Rollback capability
• Large costs
Strength of automation platform: CloudFormation is the most comprehensive automation platform
• Scope stacks from network to software
• Control higher-level automation services: Elastic Beanstalk, ECS, OpsWorks, Auto Scaling
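To make "infrastructure as code" concrete, the sketch below generates a minimal CloudFormation template as JSON from Python. It declares a single `AWS::EC2::VPC` resource, the network end of the "network to software" scope. The function name `build_vpc_template`, the logical ID "Vpc", the CIDR block, and the environment name are all assumptions for the example; the code writes no stack and calls no AWS API.

```python
import json


def build_vpc_template(vpc_cidr: str, environment: str) -> str:
    """Return a minimal CloudFormation template (JSON text) for one VPC."""
    template = {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": f"Hypothetical {environment} network stack",
        "Resources": {
            "Vpc": {  # logical ID chosen for this example
                "Type": "AWS::EC2::VPC",
                "Properties": {
                    "CidrBlock": vpc_cidr,
                    "EnableDnsSupport": True,
                    "Tags": [{"Key": "Environment", "Value": environment}],
                },
            }
        },
    }
    return json.dumps(template, indent=2)


# Hypothetical values; the same function could describe each environment's VPC.
print(build_vpc_template("10.0.0.0/16", "Development"))
```

Because the template is generated from parameters, the same definition can be reused for development, integration, pre-production, and production environments, which is what makes parallel environments and automated testing practical.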
43. Benefits of deployment on AWS
AWS:
• Agile deployments
• Flexible options
• RPO/RTO & Business Continuity objectives
• Scalable capacity
• Pay for what you use
• Automation capabilities
45. Art of the Possible - State of DevOps 2016
• Frequent Deployments: 200x more frequent deployment
• Faster Recovery: 24x faster recovery from failure
• Lower Failure Rate: 3x lower change failure rate
• Less Unplanned Work: 22% less time spent on unplanned work and rework
• Shorter Lead Times: 2,555x shorter lead times
Source: Puppet Labs - State of DevOps 2016 Report