Chris Dagdigian provides practical tips for life science IT leadership based on his experience working in bioinformatics. Some key points include:
1) Cloud adoption in life sciences is driven by the need for flexible capabilities and collaboration rather than cost savings alone.
2) Common mistakes include lack of planning, bypassing security reviews, and forcing legacy patterns onto cloud infrastructure.
3) AWS is the leader in cloud capabilities but all providers oversimplify challenges in their marketing. Real-world requirements around networking, security and provisioning need to be considered.
Cloud Sobriety for Life Science IT Leadership (2018 Edition)
1. 1
Cloud Sobriety 2018
May 2018
Third Rock Ventures
Informatics/IT Symposium
Practical Tips for Life
Science IT Leadership
Chris Dagdigian
Senior Director - BioTeam
chris@bioteam.net v2
2. 2
Content Warning
I am not an “expert”
… or a “thought leader”
I try to speak honestly about what I
see, do and experience “on the
ground” as an IT worker
My views are biased by the types of
work I perform. Filter my words through
your own expertise …
TRV has asked me to concentrate on
topics relevant to IT Directors dealing
with computational teams
5. 5
‣ Science changes faster than we can refresh infrastructure
‣ You must be prepared to design, deploy and support complex
IT infrastructure that lives for years
‣ … in an environment where scientists cannot really predict
their requirements or tooling needs beyond 6 months
‣ This is what keeps Research IT professionals awake at night
If you have to support scientists …
6. Why Cloud — Core Life Science Adoption Drivers
‣ The primary driver for IaaS Cloud Adoption in our
space is NOT SAVING MONEY
‣ Cloud is a CAPABILITY and a
COLLABORATION IT strategy
7. Why Cloud — Core Life Science Adoption Drivers
‣ Capability Drivers
• Pressure-relief valve for when on-prem IT is not aligned to current need
• Delegate significant infrastructure controls to end-users & teams
• Leverage game-changing AWS services that have no comparable on-
prem option (s3:// + event triggers, Lambda, Glue, Batch, Polly, Lex,
SageMaker, API Gateway, etc.)
• Leverage hosted/managed services like RDS, ECS, EKS and Fargate
that can reduce operational burden of your SysAdmin & Ops people
8. Why Cloud — Core Life Science Adoption Drivers
‣ Collaboration Drivers
• The future of pharmaceutical drug development increasingly relies on
complex multi-party relationships. Companies may be “friends” in one
area and fierce competitors in a different market/area.
• Nobody wants to bring a frenemy inside the corporate firewall
• Also
• s3:// is ideal for high-velocity data exchange and ingest. Petabytes of open-
access lifesci data already hosted/available on AWS
• Your third party data providers, outsourced genome sequencing shops,
CROs and other business partners are already on AWS - data delivery is
easy!
9. Why Cloud — Get yer mind right
‣ It is absolutely essential that Senior Leadership
across the enterprise share the same vision for
cloud including core use cases and security/risk
model
‣ Or else this will happen …
10. Why Cloud —When Your Org is Not “Cloud Comfy”
‣ AD Administrators say this: “The cloud is scary and insecure, we will not allow your
cloud servers to bind to our AD Forest”
‣ ITIL Addicted Support Org says this: “All linux hosts must run RedHat and be
registered to our Satellite Server, Must be bound to Active Directory, Must run Anti-
Virus. All servers must be listed in our CMDB. No exceptions”
• This breaks tons of stuff. Plus heads explode when “serverless” is brought up !
‣ Security Team says this: “All egress traffic will be blocked unless you fill out this MS
Word .doc listing all possible SRC and DST hostnames + all applicable protocols. We
will not allow subnet-wide firewall policies - only policies that list specific SRC
hostnames and IPs will be allowed.”
• This breaks anything that auto-scales and any AWS service (like Lambda) that pulls an ENI
from one of your private subnets for temporary use
11. 11
Biotech/Pharma Cloud: Most Common Mistakes
‣ C-Level executives shouting “cloud first!” without doing any sort of math or detailed planning
‣ Bypassing InfoSec and Legal stakeholders in early stages of cloud training, research and
knowledge gathering - lack of awareness in these groups breeds horrific policy
‣ Treating cloud as hostile/alien environment rather than a remote/virtual datacenter
‣ Forcing legacy IT design patterns, governance models, inventory management and
provisioning mechanisms into the IaaS space without evaluating alternatives or even bothering
to think that alternatives are worth considering
‣ Not enough bandwidth to the cloud footprint
‣ Improper IP space allocation and network subnet planning
‣ Allowing users and developers full access before “safety rails” and operational procedures are
in place
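As one concrete example of a "safety rail" worth having before opening up access (my sketch, not from the talk): an AWS Organizations Service Control Policy that denies API activity outside approved regions. The approved region list and the exempted global services below are assumptions for illustration only.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyOutsideApprovedRegions",
      "Effect": "Deny",
      "NotAction": ["iam:*", "organizations:*", "route53:*", "support:*"],
      "Resource": "*",
      "Condition": {
        "StringNotEquals": { "aws:RequestedRegion": ["us-east-1"] }
      }
    }
  ]
}
```

Attached to an Organizational Unit, this keeps exploratory accounts from sprawling into unmonitored regions while still permitting the global services that have no region.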
13. 13
Which Cloud? IaaS Cloud Ranking
‣ Data Intensive Science rarely fits into ‘canned’ solution stacks|services
‣ We need flexibility and diverse options to build suitable tooling
‣ The best IaaS scoring metric: “How many building blocks do you offer?”
‣ Best sign of a pathetic “pretender cloud”
‣ Only offers Object Storage, Block Storage and VMs (little else)
14. 14
IaaS Rankings: My $.02
1. AWS
2. Azure
3. Google*
‣ AWS is still ~2 years or more ahead of all others in
terms of capability and useful IaaS building blocks
‣ AWS is best all around for exploratory and R&D type
use cases and workloads
‣ Azure has proven itself and is improving very rapidly.
Corporate/Enterprise workloads are well supported
‣ Google* (see next slide)
15. 15
IaaS Rankings: My $.02
1. AWS
2. Azure
3. Google*
‣ Google can be attractive for high-value pipelines and workflows where it
makes sense to commit engineering and enhancement resources. Also
attractive if you go “all-in” and are willing to rewrite and redesign to
support “the google” way of operations
‣ Consider: Google should be evaluated as a potential competitive threat to
your company and business model. Alphabet is far more likely to start life
science, biotech, pharma or other companies in “our world” that could
end up competing with you.
‣ ** Edit: <angsty personal anti-google opinions removed because it turns
out to be impossible to separate ‘personal’ thoughts from my corporate
role AND I was using old info/experiences in a 2018 talk which is not really
fair to google cloud as they exist today>
17. 17
Screening the AWS Kool-Aid:
‣ AWS is incredible at outreach and evangelism across all media and
platforms. They give away tons of stuff including advice, code, CF
templates, “how-to guides” etc. because they know this is a huge
driver for client uptake and adoption.
‣ The AWS re:Invent youtube channel is among the best technical
training resources I’ve ever encountered
‣ But …
‣ IT leadership must understand that AWS always presents the rosiest
possible ‘cloud-native’ view of the universe in their public outreach
• … and sometimes this conflicts with the real world.
18. 18
Screening the AWS Kool-Aid
‣ [1 of 2] In the AWS outreach universe …
• Everybody is fluent in DevOps, CloudFormation, Lambda & API Gateway
• Legacy things don’t exist except as migrate/rewrite opportunities
• All workloads and apps can be rewritten to be serverless
• All apps are stateless and thus targets for auto-scaling, multi-az, spot & load balancers
• EC2 servers are ‘cattle-nodes’ or managed via immutable architecture design patterns.
No server is ever special or manually configured with critical stuff.
• Direct Connect, VPN Links, Routing and VPC Peering setup is easy/magical and rarely
has to be discussed, let alone planned out carefully. VPC-to-Premise and VPC-to-VPC
traffic flows are assumed.
19. 19
Screening the AWS Kool-Aid:
‣ [2 of 2] In the AWS outreach universe …
• Unrestricted outbound internet access is assumed (even from within private subnets)
• Everybody has all the IAM permissions they need and/or every company has a smooth
IAM/RBAC implementation process
• Most things you build will be public/internet facing
• Route53 handles DNS for your domain name(s) so healthcheck-influenced failover, multi-
az and geo-redundancy can magically occur
• All AWS services are fair game for use, can store corporate data and are not subject to
advance review/approval by internal Legal/IT/InfoSec teams
• RDS is perfect for all the things and you’ll never have to run a gnarly Oracle or MS SQL
Server setup on EC2
20. 20
Screening the AWS Kool-Aid: Real World Example #1
‣ Midlevel Manager: “Why are you not granting me Administrator level
IAM access in the prod VPC? This CloudFormation template I found on
the internet looks awesome!”
‣ Narrator: “… the CF template in question would have created
untagged non-compliant security groups, subnets and servers
exposed to the internet via ELB and ElasticIP … within a production
VPC explicitly designed to have zero internet-facing surfaces”
21. 21
Screening the AWS Kool-Aid: Real World Example #2
‣ Watches re:Invent video and falls in love with containers: “Why is it
taking you weeks to get me a functional ECS cluster within our
production VPC’s private subnets?”
‣ Narrator: “… Because egress traffic to Internet IPs is screened by a
firewall and AWS decided that it was acceptable for the ‘ecs-agent'
binary to communicate with external “telemetry” endpoints that are
undisclosed in any public documentation or list of AWS endpoints.
Also - the hostname pattern used for that telemetry endpoint makes it
impossible to globally whitelist that service on our firewall. ”
22. 22
Screening the AWS Kool-Aid: Real World Example #3
‣ IT Project Manager: “Please take Application X out of our managed
datacenter environment and deploy it according to this neat AWS Best
Practices diagram I saw on the AWS blog … we really need the HA,
scaling and multi-az failover benefits !!!! ”
‣ Narrator: “Your legacy application is not stateless, runs a database
inside the app server, uses hardcoded hostnames in its
configuration, uses insanely low TCP port #s (like TCP:16 !) for
cluster communication and the domain/hostname you are demanding
to use is not currently managed via AWS Route53”
23. 23
Screening the AWS Kool-Aid: Real World Example #4
‣ Developer: “You folks are horrible at cloud, all my code is failing with
‘ssl verification’ errors!”
‣ Narrator: “Our organization intercepts and decrypts outbound SSL
traffic to monitor for data exfiltration, malware, botnet control traffic
and other bad things. If you are going to talk TLS/HTTPS to the
outside world you need to install a few extra CA certificates … ”
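In practice "install a few extra CA certificates" usually means pointing each tool at the corporate interception CA bundle. A sketch, with the certificate path being an assumption for illustration:

```shell
# Example/assumed path to the corporate SSL-interception CA bundle
CORP_CA=/etc/pki/tls/certs/corp-interception-ca.pem

export AWS_CA_BUNDLE=$CORP_CA        # AWS CLI and boto3
export REQUESTS_CA_BUNDLE=$CORP_CA   # Python requests
export NODE_EXTRA_CA_CERTS=$CORP_CA  # Node.js
git config --global http.sslCAInfo $CORP_CA
```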
24. 24
Screening the AWS Kool-Aid: Wrap Up
‣ The Good: AWS is incredible at open access training, documentation,
“Getting Started with X” guides and other public-facing evangelism
that truly drives excitement among potential users
‣ The Bad: Only the shiniest possible fully-automated cloud-native views
are really expressed, legacy baggage is under-mentioned and the ‘ease
of use’ message often runs up hard against real world corporate
security, operational and provisioning rules.
‣ Advice: We need to be just as good at outreach and expectation setting
as the AWS Evangelists are. Extensive work needs to be done to
promote and advertise how we actually use the cloud within our
Organization. Failures of expectation are OUR FAULT, not AWS.
26. 26
Hybrid Cloud Thoughts (Scientific Computing Focus)
‣ Awesome in theory and on the whiteboard. Awkward in reality.
‣ Hybrid cloud is not ideal for data intensive science workloads where
our apps are encumbered by tera|petabyte data access requirements -
especially when egress data transfer fees enter the mix
‣ Can work great for Chemistry, Molecular Modeling, Protein
structure/folding work where the compute demands are very high but
the data movement requirements are very low
‣ Blunt Advice: Design hybrid cloud to spec for your requirements using
your people. Buying a Hybrid Cloud “vision” from a vendor who does
not understand your data footprint can be a recipe for failure
27. 27
Multi-Cloud Thoughts
‣ Also awesome in theory and on the whiteboard but …
‣ Egress data charges and DNS control can be annoying. Plus you’ve
doubled your cloud training/knowledge/ops needs
‣ Blunt Advice: Start small and keep it simple.
The best multi-cloud design pattern I’ve started to see lately is this:
‣ AWS for Scientific Computing, Collaboration & Commercial Ops
‣ Azure for Active Directory, Federation and all things SSO + a few
Business applications running native in Azure stack
30. 30
Data Intensive Science: Scalable Compute Power
‣ If you are cloud-native, greenfield or able to re-write your existing
pipelines then use AWS Batch as the baseline compute platform
‣ If you need/prefer something that looks like a more traditional HPC
Cluster (that still auto-scales) then use AWS CfnCluster
‣ If you are worried about AWS lock-in: Every commercial HPC
scheduler or stack ISV has a cloud-friendly vision they’d be happy to
sell you
‣ If you want to submit compute jobs via API but are worried about AWS
Batch lock-in then code your submit scripts to the DRMAA API - that
will at least make your stuff portable across many HPC schedulers and
environments
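One way to sketch the lock-in-avoidance idea in Python: keep job definitions in a neutral structure and render them per back end. The helper names below are invented for illustration; a real DRMAA binding would replace the qsub half.

```python
# Illustrative sketch only: a neutral job description rendered either as a
# classic HPC submit command or as AWS Batch submit_job keyword arguments.

def render_qsub(job):
    """Render as an SGE-style qsub command line."""
    return [
        "qsub", "-N", job["name"],
        "-pe", "smp", str(job["cpus"]),
        "-l", f"h_vmem={job['memory_mb']}M",
        job["script"],
    ]

def render_batch_payload(job, queue, job_definition):
    """Render the same job as AWS Batch submit_job kwargs."""
    return {
        "jobName": job["name"],
        "jobQueue": queue,
        "jobDefinition": job_definition,
        "containerOverrides": {
            "vcpus": job["cpus"],
            "memory": job["memory_mb"],
            "command": ["bash", job["script"]],
        },
    }

# Example job (names/values are invented)
job = {"name": "align-sample-01", "cpus": 4, "memory_mb": 8192,
       "script": "align.sh"}
```

The point is that the science-facing job description never mentions a scheduler, so moving between Batch, SGE or Slurm is a renderer change, not a pipeline rewrite.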
32. 32
Data Intensive Science: Storage
‣ Target s3:// wherever possible. Object storage is the future of scientific
data at rest. Period.
‣ Understand all the s3:// permutations including IA-class, Single-AZ IA-
class and s3:// transfer acceleration options
‣ You will still probably need Windows CIFS shares at some point. Most
of my clients just do this with EBS + Windows Servers
33. 33
Data Intensive Science: Storage, continued
‣ You will probably need NFS
‣ AWS EFS … is … kinda sorta ok … for some use cases
‣ Look at Avere vFXT for high-end options that can handle single
namespace NFS across on-prem and multi-cloud environments
‣ Look at AWS Marketplace (SoftNas, etc) for other NAS/filer options
‣ If you need a parallel filesystem
‣ Look at GlusterFS - much easier to deploy on AWS vs. alternatives
that require low-latency interconnects or metadata controllers
‣ And RedHat seems to be supporting/pushing it hard
35. 35
Data Intensive Science: Server Access
‣ Your staff will need friction-free access to deployed AWS servers.
Despite what AWS says, not everything is an immutable cattle node
‣ Remember that one key cloud capability win is ability to delegate
significant control over infrastructure to project teams — it won't ONLY
be cloud Ops and Devs connecting to your footprint. Users too!
‣ This makes VPC design, multi-account strategy, VPC peering topology,
Direct Connect decisions, Routing and connectivity config of extreme
importance
‣ Don’t make the mistake of just designing to the technical need —
account for how users/developers/support will connect!
36. 36
Data Intensive Science: Server Access, Continued
‣ Use of disconnected VPCs (no access to production resources or on-
premise networks) and ‘sandbox accounts’ is totally legit
‣ Use of SSH Bastion or JumpHosts for access to VPCs when Direct
Connect, VPN or VPC Peering is not wanted is totally legit
‣ Jumphost Advice: Invest major time and effort into training or wiki
documentation on “SSH Tips & Tricks” including specific examples of
how to do SSH port forwarding and proxy commands that tunnel
through the JumpHost to hit the server the user wants
‣ This can have a 100x improvement on staff morale and productivity —
one of my major early career mistakes was assuming that all SSH
users were aware of all the convenient time and effort-saving SSH
features.
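A sketch of the kind of wiki content that pays off — an ~/.ssh/config using OpenSSH's ProxyJump, where all hostnames and subnets are invented for illustration:

```
# ~/.ssh/config sketch (hostnames/IPs are example values)
Host jump
    HostName jumphost.example-vpc.mycompany.com
    User alice

# Reach any private-subnet host transparently through the jump host
Host 10.1.*
    ProxyJump jump

# Forward a remote notebook/RStudio port to the laptop, e.g.:
#   ssh -L 8787:10.1.2.15:8787 jump
```

With this in place users type `ssh 10.1.2.15` and the tunnel through the jump host happens automatically.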
37. 37
Data Intensive Science: Linux Role Based Access Control
‣ Remember this is the real world
‣ Not all linux EC2 hosts are immutable cattle nodes
‣ You should be prepared to support humans logging in with corporate
network usernames and credentials
‣ Your super awesome cloud-native architecture may not have this need
now but “corporate cruft” grows over time and eventually you’ll have
this need. Easier to prepare now and keep it on the shelf.
‣ Basic password checking is often not enough - what about sudo/admin
access for certain users and groups?
38. 38
Data Intensive Science: Linux Role Based Access Control
‣ Modern linux w/ modern SSSD software can easily “realm join” an
Active Directory domain and authenticate network users
‣ Sudo/admin access can be allowed/denied via AD group membership
and an /etc/sudoers rule. SSH keys can be dropped by DevOps tooling
‣ But you won’t be able to do this:
‣ Cross transitive AD child domains. Your linux client will not be
allowed to transit trust boundaries in AD forest. Simple Linux/AD
integration via “realm join” is insufficient for complex AD
environments
‣ No central management of POSIX stuff like personal SSH keys,
$SHELL and $HOME preferences. No custom per-host sudo rules etc.
Linux / AD Integration (Simple AD Topology)
39. 39
Wat?
‣ Works great (simple AD topology):
‣ Company AD is “COMPANY.COM”
‣ Single Domain/Forest
‣ ‘realm join company.com’ will work just fine
‣ Can login with user@company.com credentials
‣ DOES NOT WORK (Linux and complex AD topologies):
‣ Company AD is “COMPANY.COM”
‣ Company child domains in Forest:
‣ NAFTA.COMPANY.COM
‣ EAME.COMPANY.COM
‣ APAC.COMPANY.COM
Linux can join ‘company.com’ but SSSD on Linux will not be
allowed by Active Directory to cross transitive trust boundaries
to access users in any child domain within the AD Forest.
Logging in as user@company.com will not work if ‘user’ is in a
child domain.
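For the simple single-domain case, the whole setup can be as small as the sketch below; the domain and group names are example values:

```shell
# Join the (example) company.com domain via realmd/SSSD
sudo realm join --user=admin company.com
id alice@company.com    # verify AD user lookups now resolve

# /etc/sudoers.d/cloud-admins -- sudo granted via AD group membership
# (group name is an example; exact name format depends on SSSD config)
%LinuxAdmins@company.com ALL=(ALL) ALL
```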
40. 40
Data Intensive Science: Linux Role Based Access Control
‣ Consider deploying FreeIPA (Open Source) or Redhat Identity
Management (“RHEL IdM”) server as middleware
‣ FreeIPA/IdM kinda masquerades as a Domain Controller and can cross
transitive trust boundaries AND maintain trust relationships across
multiple domains and directories — SUPER USEFUL
‣ FreeIPA/IdM provides full RBAC controls for Linux users, offers fine-
grained control over sudo rules, can store all of the POSIX info we care
about ($shell, $home preferences, personal SSH keys for each user)
without requiring an AD schema change
Linux / AD Integration (More capable or for more complex AD environments)
43. 43
Data Intensive Science: App & Data Security
‣ AWS is excellent in this area. Use the AWS security offerings wherever
possible
‣ s3:// SSE (server-side encryption) makes at-rest encryption trivial with no key
management or user/code behavior change required
‣ AWS KMS makes encrypted EBS and RDS storage pretty easy
‣ KMS and cross-region data replication are not friendly to each other - research
what this does to snapshot handling and replication costs if this applies to you
‣ If your threat model includes “AWS gets hacked or subpoenaed” then please
understand that DIY encryption is EASY but DIY PKI management is super hard
to do safely and securely
‣ AWS has great tools for automated scanning of your environment for governance,
policy and security violations. Use them, log them and alert upon them!
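One way to make the SSE guidance above enforceable rather than advisory is a bucket policy that rejects unencrypted uploads. The bucket name below is a placeholder:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyUnencryptedUploads",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::example-research-bucket/*",
      "Condition": {
        "StringNotEquals": {
          "s3:x-amz-server-side-encryption": ["AES256", "aws:kms"]
        }
      }
    }
  ]
}
```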
45. 45
Other Security Related Topics …
‣ AWS design patterns, evangelists and docs all assume that everything we run
has unrestricted access to the internet via IGW or NAT Gateway
‣ AWS does not hide the alternatives (Proxies, SGs, Firewalls) but they do not
go out of their way to make them visible as viable design patterns
‣ And remember, AWS API endpoints are “on the internet” because the IPs still
resolve to public IPv4 or IPv6 addresses. The only thing that changes with
VPC Service Endpoints is the routing path
‣ I think this is a bad assumption for AWS to make. In 2018 we need to rethink
unrestricted and unmonitored VPC egress traffic.
‣ Unrestricted egress should not just be enabled for everything by default.
46. 46
Other Security Related Topics …
‣ I don’t know when they did this but AWS did make my single biggest gripe
about their IP space go away!
‣ AWS does publish all of their IP space in easy to find documentation but they
used to make NO DISTINCTION between IP space used by their Service APIs
and IP space that could be assigned to EC2 servers operated by AWS
customers
‣ For a long time this meant that it was really hard to firewall access ONLY to
AWS API services — we had to switch to name-based destination policies
because the AWS API IP space was commingled with IPs that could be used
by dodgy EC2 servers hacking us or slinging malware our way
‣ But this has changed! You can now filter “AMAZON” out of “EC2” lists!
https://docs.aws.amazon.com/general/latest/gr/aws-ip-ranges.html#aws-ip-egress-control
47. 47
Other Security Related Topics …
My $.02 advice:
‣ Internet egress from a VPC should require informed action with a positive
control mechanism. It should not be enabled globally by default.
‣ At a minimum, distinguish between the following and build suitable controls
for each option:
‣ Things that need no external access at all
‣ Things that need access ONLY to AWS API endpoints & AWS Services
‣ Things that require access that can be whitelisted by destination hostname
(software repos, patching services, code repos, etc.)
‣ Things that really require unrestricted outbound access
48. 48
Other Security Related Topics …
My $.02 advice:
‣ Filter access to AWS APIs via Security Group membership
‣ Operating your own fleet of proxy servers to control outbound access is legit
‣ Squid is great software
‣ Security controls in squid.conf allow easy rules for clients and
destinations
‣ Logging is very customizable and you can export the log
stream to your SIEM system to look for abuse or dark patterns.
Or just archive the logs in case of future audit need
‣ … you do have a SIEM, right?
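A minimal squid.conf sketch of the destination-whitelisting idea; the subnet and domain lists are example values, not recommendations:

```
# squid.conf sketch -- egress whitelisting for a private subnet
# (CIDR and domain lists below are invented examples)
acl vpc_clients src 10.1.0.0/16
acl allowed_dst dstdomain .amazonaws.com .centos.org .pypi.org

http_access allow vpc_clients allowed_dst
http_access deny all

# Customizable logging; ship this stream to your SIEM
access_log stdio:/var/log/squid/access.log combined
```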
49. 49
Other Security Related Topics …
My $.02 advice:
‣ Despite all of the awesome AWS security features and services it
is still totally legit to consider running virtual firewall appliances
at the edge of your VPCs
‣ If your on-premise firewall maker has a virtual AWS version there
are management/visibility/trend-analysis advantages to running
the same platform inside AWS
50. 50
Other Security Related Topics …
My $.02 advice:
‣ Encrypted network traffic is a significant security threat surface
‣ IP Data theft & exfiltration of info out of your environment
‣ Malware downloads and payload delivery that bypass traditional firewall
and screening methods. Control Plane for botnet management, etc. etc.
‣ I fully support organizations that decide they need to intercept, decrypt and
monitor encrypted traffic streams leaving their environments. However …
‣ This is a total nightmare when done promiscuously (like intercepting AWS
API traffic) — SSL decrypt needs to be weighed against the problems and
hassles it causes. Should be applied judiciously
‣ Intercepting and decrypting AWS API traffic causes far more problems than
the threat surface it purports to protect against (personal view)
52. 52
Final AWS Cloud Advice
Lessons learned from many projects and lots of mistakes
53. 53
Final Thoughts
Lessons learned from many projects and lots of mistakes
‣ Fat pipes benefit all. Investing in fast connectivity to the Internet is
worthwhile. Having 1gbps links to the outside world should be the new
normal
‣ This also means that 1gbps or 10gbps Direct Connect is also worthwhile -
but maybe a bit more business justification is needed!
‣ All IT departments and senior leaders must be read-in on cloud plans —
extend training and “why we are doing X” evangelism to these folks if
needed.
‣ If you don’t have complete support for the cloud roadmap you’ll end up in
nasty conflict with Risk Management, Compliance, InfoSec, Network teams
and the AD admins
54. 54
Final Thoughts
Lessons learned from many projects and lots of mistakes
‣ Carve out ALL YOUR RFC1918 PRIVATE IP SPACE NOW and reserve it for
planned and future usage. Make sure those IP addresses and CIDR blocks
are never deployed at any office, site location or site-to-site VPN
‣ Reserve a ton of space so you can grow into multi-region, multi-account and
multi-cloud without hitting IP address and subnet allocation limits
‣ Any sort of CIDR overlap or IP address conflict causes massive headaches
with essential cloud services including VPC design, VPC peering, VPC
Routing, Direct Connect, VPN Connections, etc. — careful planning will help
you avoid this nightmare.
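The overlap check above is easy to automate before anything is deployed. A stdlib-only Python sketch, where the specific CIDR carve-up is an invented example and not a recommendation:

```python
# Illustrative sketch: reserving RFC1918 space up front and validating
# the plan with Python's stdlib ipaddress module.
import ipaddress

# Pretend all of 10.0.0.0/8 is reserved for cloud use, carved into
# /16 allocations per region/account (example values only).
cloud_supernet = ipaddress.ip_network("10.0.0.0/8")
allocations = {
    "aws-us-east-1-prod": ipaddress.ip_network("10.1.0.0/16"),
    "aws-us-east-1-dev":  ipaddress.ip_network("10.2.0.0/16"),
    "aws-eu-west-1-prod": ipaddress.ip_network("10.3.0.0/16"),
    "azure-ad-hub":       ipaddress.ip_network("10.10.0.0/16"),
}
on_prem = ipaddress.ip_network("192.168.0.0/16")  # existing office space

def check_plan(allocations, reserved_supernet, existing_networks):
    """Fail loudly on any overlap -- the 'massive headache' scenario."""
    nets = list(allocations.values())
    for i, a in enumerate(nets):
        assert a.subnet_of(reserved_supernet), f"{a} outside reserved space"
        for b in nets[i + 1:]:
            assert not a.overlaps(b), f"overlap: {a} vs {b}"
        for ext in existing_networks:
            assert not a.overlaps(ext), f"conflicts with existing {ext}"
    return True

check_plan(allocations, cloud_supernet, [on_prem])
```

Running a check like this against every proposed VPC, VPN and site allocation is far cheaper than renumbering a peered VPC later.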
55. 55
Final Thoughts
Lessons learned from many projects and lots of mistakes
‣ Think about domain names and DNS. Seriously.
‣ AWS Route53 DNS is almost essential for seamless load balancer failover,
multi-az high-availability and/or global service failover.
‣ But … not all companies can or want to place their primary business domain
name and DNS zone records under control of Route53
‣ Purchase additional domain names if it makes sense. Use those with
Route53 and your AWS footprint
‣ Also legit: Different domain names for “private facing” vs. “public facing”
services hosted in AWS
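For the failover point above, this is roughly the shape of a Route53 change batch for a health-check-driven failover record; every name, zone ID and target below is a placeholder:

```json
{
  "Comment": "Failover record sketch (all identifiers are placeholders)",
  "Changes": [
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "app.research.example.com",
        "Type": "A",
        "SetIdentifier": "primary",
        "Failover": "PRIMARY",
        "AliasTarget": {
          "HostedZoneId": "ZEXAMPLE12345",
          "DNSName": "my-elb.us-east-1.elb.amazonaws.com",
          "EvaluateTargetHealth": true
        }
      }
    }
  ]
}
```

A matching SECONDARY record with its own SetIdentifier completes the pair; Route53 serves the secondary only when the primary's health evaluation fails.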
56. 56
Final Thoughts
Lessons learned from many projects and lots of mistakes
‣ Use AWS Organizations and their multi-account best practices
‣ Seriously. My only wish is that this was available 5 years earlier!
57. 57
Final Thoughts
Lessons learned from many projects and lots of mistakes
‣ Log all the things to a dedicated AWS account
‣ CloudTrail (including S3 data events and Lambda if desired), AWS Config
Events, VPC Flow Logs and all CloudWatch log streams
‣ Do this on day 0 even if you don’t plan to set up your dashboard/monitoring
service until much much later. This is the data you need to respond to a
security event, API key breach or other serious incident.
‣ It’s also very useful for troubleshooting
58. 58
Final Thoughts
Lessons learned from many projects and lots of mistakes
‣ Train your people and make training resources easy to access
‣ Cloud Unicorns are hard to find, hire and keep
‣ Everyone benefits with more skill, experience, training and lab access —
and not just the front-line Ops and Dev teams. Security, Legal, networking,
end-users and management all benefit from more info and awareness.
‣ Strongly advise online learning portals like https://acloud.guru/
and https://qwiklabs.com/ (there are many options in this space)
‣ My personal AWS certs are a result of cloudacademy.com, acloud.guru and
the AWS re:Invent conference presentation videos
59. 59
Final Thoughts
Lessons learned from many projects and lots of mistakes
‣ It’s ok to hire outside experts and professional services
‣ I have a very positive impression of AWS Professional Services
‣ I have a very positive impression of AWS Database Migration Services
‣ The specific hairy areas where professional advice is helpful:
‣ VPC, VPN, direct-connect and peering setup/planning
‣ Complex IAM policies and access rationalization
‣ Complex database migration efforts
‣ Big Data guidance
60. 60
Final Thoughts
Lessons learned from many projects and lots of mistakes
‣ Plan on (and budget for) AWS Support subscription
‣ If you can afford AWS Enterprise Support then go for it
‣ It’s very very very useful
‣ Named account managers, direct access to TAMs over email & Slack
‣ Fast service on Support Cases
61. 61
Final AWS Cloud Advice
Things you should BUY vs things you should BUILD
62. 62
Final Thoughts
Things you should BUY vs things you should BUILD
‣ AWS Backup, DR and Cross-Region/Cross-Account Replication
‣ Don’t even think about a DIY solution
‣ Just go to https://www.n2ws.com and sign up for their Cloud Protection
Manager marketplace subscription
‣ CPM is probably the best AWS Marketplace product I’ve ever seen or used.
It’s seriously fantastic and pricing is reasonable
‣ It provides fantastic capabilities but does so by simply placing a UI and
scheduling layer over standard AWS API calls. Nothing proprietary and
nothing to trap you into a particular backup/DR product.
63. 63
Final Thoughts
Things you should BUY vs things you should BUILD
‣ SSO and Identity Federation
‣ This is a major hassle for mere IT mortals
‣ Just purchase SSO/Federation as a service from one of the many
companies (Okta, etc.) that do this professionally and move on with your
life
64. 64
Final Thoughts
Things you should BUY vs things you should BUILD
‣ Cloud Monitoring, Reporting, Spend Optimization & Governance
‣ There are dozens of companies that offer this as a service - do yourself a
favor and just screen the available options and purchase the one that best
fits your need and budget model. They offer far more than native AWS
services at the moment.
‣ Warning: These companies are still trying to figure out pricing models.
Some charge by data volume (Splunk, etc.) while others have pricing
models that may penalize Multi-Account usage. You need to screen
vendors for the Features they offer as well as for any potential pricing
gotchas
65. 65
Not a CloudCheckr endorsement :) All my other screenshots from dashboard
systems had sensitive info requiring time consuming redaction efforts …