In this session, Encirca Services by DuPont Pioneer discusses how they performed a lift-and-shift migration from their on-premises data center to AWS in less than six months. First, they cover how they aligned organizational stakeholders to prepare for the migration. Then, they discuss strategies used to increase the pace of their mass migration. Finally, they talk about actions taken after the migration to measure success and solicit feedback from customers.
2. 2
Agenda
1. Intro and Background
2. Challenges and Enablers
3. Migration Details
4. The Future
I N T R O
3. 3
Takeaway
What it takes to perform a lift-and-shift to Amazon Web Services (AWS)
4. 4
Lift-and-Shift
What?
• A cloud migration strategy that replicates in-house apps in the cloud without redesigning or re-architecting them
Why?
• Re-architecting is expensive
• Risk mitigation
• Quick wins
• Improved HA and DR: disaster prevention
5. 5
Poll
How many people:
• Work for a company of more than 1,000 employees?
• Have their entire company on AWS?
• Have their area of the company all-in on AWS?
• Have some workloads on AWS?
• Are considering a migration to AWS?
• Don’t think AWS is right for them?
7. 7
About Me
From Des Moines, IA
Graduate of University of Iowa
Started in financial services
Last 10 years in agriculture
• DuPont Pioneer
• Encirca services
• Granular
Background in software engineering and cloud architecture
Email: brycehemme@granular.ag
Twitter: @brycehemme
Bryce Hemme – Director of Platform Engineering
8. 8
B A C K G R O U N D
Parent Companies, Agriculture Seed Business, Agriculture Software Business, Agriculture Software Products
• Merged in 2017
• Iowa-based seed company acquired by DuPont in 1999
• Silicon Valley-based software company acquired by DuPont in 2017
• Atlanta-based software company acquired by DuPont in 2015
• Software product developed by DuPont Pioneer
11. 11
Digital Agriculture
• Managing Farm as a Business: Granular FMS
• Managing Critical Input: AcreValue
• Advanced Agronomy Management: Encirca®
BUSINESS, AGRONOMY, LAND
12. 12
Encirca Services
Encirca Services by DuPont Pioneer provides advanced agronomic
support, services and online analysis tools to help increase farmer
productivity, profitability, and sustainability
13. 13
Encirca Services
• Four engineering staff in 2013…300+ in 2017
• Windows, Linux, massive relational databases and big
compute
• Started in on-premises data center
• Small cloud footprint started in 2013
14. 14
The Team
Front row: Nate Faue, Seraj Islam, Rashmi Shrestha
Back row: Jeremy Hofman, Eduardo Gamboa, Nate Mecklenburg, Gary Jepsen, Robbie Hornung, Jason Thompson
Not pictured: Ryan Zanger, Josh Kehoe, Andrew Lear
15. 15
Drivers for Our Cloud Migration
• Growing company with reduced IT budgets
• Silo between Dev and Ops
• Limited automation
• Desired autonomy
• Need to innovate and move fast
• Reliability
T H E C H A L L E N G E
17. 17
Convincing Leadership
• Why would I leave a data center that costs me nothing?
• Quantify the gains/quantify the cost of doing nothing
• What is the cost of slowing innovation?
• How much time is wasted due to org silos? (dev as opposed to ops)
• What can be gained by increased reliability?
• Market the cloud
C H A L L E N G E S A N D E N A B L E R S
19. 19
Now you’re likely responsible for…
PLATFORMS
• Windows
• Linux (CentOS)
• DBMS
• Appliances
INFRASTRUCTURE
• Web, app, and DB servers
• Load balancers
• Web application firewalls
• Source control
• Build infrastructure (don’t forget Apple)
• Networking
SERVICES
• DNS
• Provisioning, patching, base images
• Identity and access management
• Monitoring
• Security
• Compliance
20. 20
What are the challenges we had to address to
enable a successful migration?
21. 21
People and Skills
Issue
• Engineering staff has limited knowledge of AWS and
cloud engineering
Solution
• Hire these skills early, before you think you need
them
• Identify an internal evangelist (or two), get them
proper training/exposure
• Leverage AWS Solutions Architect to develop a
training plan
22. 22
Admin Connectivity
Issue
• No dedicated connection from corporate network to AWS
• Implementing would require significant help from corporate IT
• Don’t want AWS resources facing Internet, except when
necessary
Solution
• EC2 bastion hosts as SSH gateways
• Follow hub-and-spoke model outlined by AWS
• All traffic is tunneled through bastion hosts
http://amzn.to/2iX6gaN http://amzn.to/2yvQ7DE
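The bastion pattern on this slide can be sketched as a small helper that builds an OpenSSH command using ProxyJump (`-J`), so a private instance is reached only through the bastion. This is a minimal sketch: the host names, the `ec2-user` login, and the helper name are hypothetical, and the talk does not say which SSH options the team actually used.

```python
import shlex


def build_ssh_command(bastion, target, user="ec2-user", local_forward=None):
    """Build an OpenSSH command that reaches `target` through `bastion`.

    Uses ProxyJump (-J), available in OpenSSH 7.3 and later.
    `local_forward` is an optional (local_port, remote_host, remote_port) tuple.
    """
    cmd = ["ssh", "-J", f"{user}@{bastion}"]
    if local_forward is not None:
        local_port, remote_host, remote_port = local_forward
        # -L forwards a local port through the tunnel to a host inside the VPC.
        cmd += ["-L", f"{local_port}:{remote_host}:{remote_port}"]
    cmd.append(f"{user}@{target}")
    return cmd


if __name__ == "__main__":
    # Hypothetical hosts: a public bastion and a private instance behind it.
    print(shlex.join(build_ssh_command("bastion.example.com", "10.0.1.25")))
```

Returning the command as a list keeps it safe to pass to `subprocess.run` without shell quoting concerns.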
23. 23
Monitoring and Alerting
Issue
• Existing monitoring solutions work well for on-premises applications
• Need insight into environment in order to self-support
Solution
• Heavily invest in SaaS-based solutions
• Monitor applications, infrastructure, and use logs to identify
anomalies
• Simple integration with AWS services provides additional
insight
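As an illustration of using logs to identify anomalies, the sketch below flags intervals whose error count jumps well above the trailing average. It is not what the SaaS tools mentioned here do internally; the window size, threshold, and function name are assumptions chosen for the example.

```python
from statistics import mean, pstdev


def find_anomalies(error_counts, window=5, threshold=3.0):
    """Return indexes whose count exceeds the trailing mean by `threshold` std devs."""
    anomalies = []
    for i in range(window, len(error_counts)):
        history = error_counts[i - window:i]
        mu = mean(history)
        sigma = pstdev(history)
        # Use a floor of 1.0 so a flat history does not flag tiny increases.
        limit = mu + threshold * max(sigma, 1.0)
        if error_counts[i] > limit:
            anomalies.append(i)
    return anomalies


if __name__ == "__main__":
    # Hypothetical per-minute error counts parsed from application logs.
    print(find_anomalies([2, 3, 2, 3, 2, 40, 3]))
```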
24. 24
Source Control and Builds
Issue
• All source control and build infrastructure lives inside corporate
network
Solution
• Migrate to cloud-hosted source control: GitLab
• Develop scalable build system in the cloud: Jenkins
• Develop tooling to make life easier at scale: Tauoneer and Jenkins DSL
25. 25
Authentication
Issue
• Used SAML 2.0 and a custom authentication provider
• Not easily portable
Solution
• Migrated to Amazon Cognito user pools
• Amazon Cognito and JWT fit much better with our needs
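To make the Cognito and JWT fit concrete, here is a minimal sketch of the claim checks an application might run on a user pool token. It deliberately skips signature verification, which a real service must perform against the user pool's published JWKS; the region, pool ID, and client ID in the example are hypothetical.

```python
import base64
import json
import time


def decode_jwt_payload(token):
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    payload_b64 = token.split(".")[1]
    # JWT segments are base64url without padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def check_cognito_claims(claims, region, user_pool_id, client_id, now=None):
    """Check issuer, expiry, and app client claims of a Cognito user pool token."""
    now = time.time() if now is None else now
    expected_iss = f"https://cognito-idp.{region}.amazonaws.com/{user_pool_id}"
    if claims.get("iss") != expected_iss:
        return False
    if claims.get("exp", 0) <= now:
        return False
    token_use = claims.get("token_use")
    if token_use == "id":
        # ID tokens carry the app client ID in `aud`.
        return claims.get("aud") == client_id
    if token_use == "access":
        # Access tokens carry it in `client_id` instead.
        return claims.get("client_id") == client_id
    return False
```

Because the checks only read standard claims, the same function can sit behind any gateway or service that receives the token.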
26. 26
API Gateway Routing
Issue
• Used third-party API Gateway appliance for complex routing
scenarios
• Also used it as an authentication gateway, which is great for decoupling authentication from apps
Solution
• Custom-built Apache-based API Gateway
• Built completely with infrastructure-as-code approach
• Enhanced mod_authn_jwt to support Amazon Cognito JWTs
http://bit.ly/2xa8tWS
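One way to picture an Apache-based gateway built as infrastructure as code is to generate the routing directives from a route table kept in source control. The sketch below renders standard `mod_proxy` `ProxyPass` and `ProxyPassReverse` lines; the route table and backend URLs are hypothetical, and it covers routing only, not the JWT authentication module.

```python
def render_proxy_config(routes):
    """Render Apache mod_proxy directives from a {path_prefix: backend_url} map."""
    lines = []
    # Longest prefixes first: Apache applies the first ProxyPass rule that matches.
    for prefix in sorted(routes, key=len, reverse=True):
        backend = routes[prefix]
        lines.append(f'ProxyPass "{prefix}" "{backend}"')
        lines.append(f'ProxyPassReverse "{prefix}" "{backend}"')
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical route table; in practice this would live in version control.
    print(render_proxy_config({
        "/api/": "http://api.internal:8080/",
        "/api/fields/": "http://fields.internal:8080/",
    }))
```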
27. 27
SQL Server Clustering
Issue
• Massive SQL Server cluster, too big for RDS
• Requires high disk IOPS
• SQL Server Always On availability groups handle failover for clients using Microsoft’s drivers, but not for other clients/drivers
Solution
• EC2-based SQL Server Always On availability groups
• Created RAID array using EBS volumes
• HAProxy to the rescue: it handles routing of failover traffic and is compatible with all clients/drivers
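The routing decision a proxy makes during failover can be sketched as below: clients connect to one stable endpoint, and traffic goes to whichever node currently reports itself as the healthy primary. This illustrates the idea only. HAProxy is driven by its own configuration and health checks, which the talk does not show, and the node fields used here are hypothetical.

```python
def pick_backend(nodes):
    """Return the address of the first healthy node reporting the primary role.

    `nodes` is a list of dicts with hypothetical keys: address, healthy, role.
    """
    for node in nodes:
        if node["healthy"] and node["role"] == "primary":
            return node["address"]
    # No healthy primary: refuse traffic rather than send writes to a replica.
    return None


if __name__ == "__main__":
    # After a failover, the former secondary reports the primary role.
    cluster = [
        {"address": "10.0.1.10:1433", "healthy": False, "role": "primary"},
        {"address": "10.0.2.10:1433", "healthy": True, "role": "primary"},
    ]
    print(pick_backend(cluster))
```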
28. 28
Coupling to Internal Systems
Issue
• Dependencies on internal systems exist, and no connectivity to them will exist after the lift-and-shift
Solution
• Identify dependencies early
• Establish remediation plans
• Focus engineers on implementing remediation
29. 29
Testing of Deployment
Issue
• Doing a large cutover to the cloud is highly risky
Solution
• Test release through identical environments
• Run on premises and cloud in parallel (dogfooding)
• Plan for the worst, hope for the best
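Running both environments in parallel makes it possible to compare their behavior before cutover. The sketch below shows one assumed way to do that: diff the same API response from each environment while ignoring fields expected to differ. The field names are hypothetical, and the talk does not describe how the team compared results.

```python
def diff_responses(on_prem, cloud, ignore=("timestamp", "request_id")):
    """Return {field: (on_prem_value, cloud_value)} for fields that differ."""
    keys = (set(on_prem) | set(cloud)) - set(ignore)
    return {
        key: (on_prem.get(key), cloud.get(key))
        for key in sorted(keys)
        if on_prem.get(key) != cloud.get(key)
    }


if __name__ == "__main__":
    # Hypothetical responses to the same request from each environment.
    old = {"field_id": 7, "acres": 120.5, "timestamp": "t1"}
    new = {"field_id": 7, "acres": 120.0, "timestamp": "t2"}
    print(diff_responses(old, new))
```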
30. 30
[Architecture diagram. Lift-and-Shift Account: Elastic Load Balancer; API Gateways and Web Servers and Event Processing in each of Availability Zone 1 and Availability Zone 2; Replication and Failover; S3 Web Assets; S3 Link. Shared Account: Cognito, Access, Route 53 DNS Resolution. Also shown: VPC Peering w/ Other Accounts, Corporate Data Center, Jenkins and Source Control, Testers, Customers.]
32. 32
Migration
• Iterative over six months
• Slowly moving through each environment
• Production cutover started at 7:00 p.m. Saturday
• < 15 minutes of downtime
T H E M I G R A T I O N
33. 33
Issues Encountered
Increased error rates
• Monitoring helped identify minor issues that were mostly related
to performance
Performance degraded
• Use of CDC and full sync of replicas caused contention
• One non-production DB didn’t mimic production (100 GB vs. 4 TB)
• Non-production lacked production-level load, so DB instances had to be resized
All issues resolved throughout the evening
34. 34
[Chart: API Error Rates, annotated with "Performance degradation" and "Mistake in deploy job"]
35. 35
[Chart: API Response Time, annotated with "Shutdown of non-essential processing", "Cutover of traffic", and "Performance degradation"]
36. 36
Direct Customer Feedback
Lift and Shift Team,
I want to kiss all you…who moved everything to AWS. The speed…is
exceptional. But what impressed me the most is how fast a crop zone
change or DZ reset in Studio translates to Encirca!
I do that more than the average bear and it was lightning fast!
Everything looks and feels so good!
Happiest day ever.
38. 38
G O I N G C L O U D N A T I V E
Where We’re Going
• API redesign: Amazon API Gateway, AWS Lambda, Python
• Break up SQL Server DBs into smaller, domain-focused
DBs - PostgreSQL, Amazon DynamoDB
• More robust event messaging and event sourcing
• Convergence of disparate systems - where it makes
sense
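For the API redesign direction, a minimal Python handler for an API Gateway Lambda proxy integration might look like the sketch below. The `field_id` path parameter and the response payload are hypothetical; the talk names the services but shows no code.

```python
import json


def handler(event, context):
    """Minimal AWS Lambda handler for an API Gateway Lambda proxy integration."""
    # `pathParameters` is null in the event when the route has no parameters.
    params = event.get("pathParameters") or {}
    field_id = params.get("field_id")  # hypothetical path parameter
    if field_id is None:
        return {
            "statusCode": 400,
            "body": json.dumps({"error": "field_id is required"}),
        }
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"field_id": field_id}),
    }
```

The proxy integration expects `body` to be a string, which is why the payload is serialized with `json.dumps` rather than returned as a dict.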
39. 39
[Architecture diagram. Lift-and-Shift Account: Elastic Load Balancer; API Gateways and Web Servers and Event Processing in each of Availability Zone 1 and Availability Zone 2; Replication and Failover; S3 Web Assets; S3 Link. CloudNative Account (connected by VPC Peering): API Gateway; Microservices and Redis Cache in each of Availability Zone 1 and Availability Zone 2; Replication; DynamoDB; SQS; S3 Web Assets. Shared Account: Auth, CloudFormation, Amazon Cognito, Route 53 DNS Resolution, Replication and Failover.]