Silver Linings
Our journey in migrating a cloud
Tom Dance
Head of Platform Development
Tristan Davey
Senior Software Engineer
Moving Clouds
Relocating from Google to Amazon
Large monolith - 500,000+ LOC
We outgrew Google App Engine
➔ Poor documentation and support
➔ Immature APIs (everything is beta!)
➔ Bumping into limitations
➔ Proprietary technology
Feature development and fixes slow
Needed more flexibility
Scaling the engineering team was hard
SafetyCloud Google App Engine
SafetyCloud Monolithic Architecture
[Diagram: SafetyCloud ‘The Monolith’ on Google Cloud Platform, comprising Web Frontend, Web API, Email, Client API and Binary Data]
SafetyCulture The Goal
Improve our product
➔ More reliable and performant syncing
➔ New modern user interface
➔ Feature equivalent
➔ Full backwards compatibility with iAuditor
Address the problems a large monolithic codebase brings
Scalable, flexible, open technologies
Strong partner for infrastructure
SafetyCulture The Solution
10 Microservices built with Node.js
Single Page App built with Ember.js
Document store with Couchbase
Document indexing with Elasticsearch
Scalable cloud-based infrastructure with Amazon Web Services
SafetyCulture Microservice Architecture
[Diagram: separate services on Amazon Web Services: Web API, Email, Client API, Binary Data and Web Frontend]
Rebuilding an API
Reconstructing our client API in Node.js
Client API
HTTP API for SafetyCulture iOS and Android Applications
● Authentication
● Document Synchronisation
● User Management
● Document Permissions
Client API Change Considerations
Consumed by over 500,000 devices
Many users on legacy versions of consuming clients:
● 2% of users on a version older than 1 year
● 8.5% of users on a version older than 6 months
● 25.3% of users on a version older than 1 month
Consuming clients relied on undocumented quirks and edge cases to function, so these had to be preserved
Client API Rebuild
                    Original API             Rebuilt API
Language            Python                   CoffeeScript
Server Framework    WebApp2                  Hapi.js
Database            Google Datastore         Couchbase Server
Query Engine        SQL-Like Queries         MapReduce Indexes
Binary Storage      Google Blobstore         Amazon S3
Scaling             Vertical + Horizontal    Horizontal
Client API Maintaining Backwards Compatibility
API Specification-based implementation
External specification of original API became the internal implementation specification of the new API.
Manual and Automated testing
Automated unit and integration tests.
Production device testing with large scale, multi-hour real-world tests.
Replay-based Regression testing
Production device traffic was observed, recorded and replayed with a custom-built tool. Allowed us to identify request/response behaviours.
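The slides do not describe how the custom replay tool worked. As a minimal sketch of the general idea only: recorded request/response pairs are replayed against the new implementation and any difference in status or body is reported. The names `Exchange`, `replay` and `new_api`, and the paths used, are hypothetical.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Exchange:
    """One recorded request plus the response the original API returned."""
    method: str
    path: str
    body: str
    status: int
    response_body: str


def replay(recorded, handler):
    """Replay each recorded request against `handler` and collect mismatches."""
    mismatches = []
    for ex in recorded:
        status, response_body = handler(ex.method, ex.path, ex.body)
        # Compare parsed JSON so key order and whitespace do not raise false alarms.
        same_body = json.loads(response_body) == json.loads(ex.response_body)
        if status != ex.status or not same_body:
            mismatches.append((ex, status, response_body))
    return mismatches


def new_api(method, path, body):
    """Hypothetical stand-in for the rebuilt API."""
    if method == "GET" and path == "/ping":
        return 200, '{"ok": true}'
    return 404, '{"error": "not found"}'


RECORDED = [
    Exchange("GET", "/ping", "", 200, '{ "ok": true }'),
    Exchange("GET", "/missing", "", 404, '{"error": "not found"}'),
    # A quirk of the old API that the stand-in does not reproduce.
    Exchange("POST", "/ping", "", 405, '{"error": "method not allowed"}'),
]

MISMATCHES = replay(RECORDED, new_api)
for ex, status, _ in MISMATCHES:
    print(f"{ex.method} {ex.path}: expected {ex.status}, got {status}")
```

Each mismatch points at a behaviour, documented or not, that legacy clients may depend on.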
Client API Rebuild Outcomes
Built, tested, deployed in under 9 engineering-months
Client API Codebase: 10,000+ LOC
Regression Test Codebase: 22,000+ LOC
Seamlessly continued working with legacy clients
Horizontally scales to easily meet peak demand
Now serves 1,200,000 requests/day
Data Migration
Google Datastore to Couchbase Server
Google Datastore
Non-relational key-value store
Proprietary software
Eventually consistent
1MB value limit
Basic indexing and querying
Couchbase Server
Non-relational document store
Open-source project
Eventually consistent
Configurable document limit
MapReduce-based indexing
[Diagram: migration pipeline. Labels: Google Datastore (production), LevelDB database, Stage 1 Migration, migration Couchbase, Stage 2 Migration, Couchbase Server (production)]
Stage 1 Migration
LevelDB ➜ Couchbase
1 Key/Value Pair = 1 Document
JSON Serialise Data
Index data for Stage 2
Stage 2 Migration
Couchbase ➜ Couchbase
Many Key/Value Pair = 1 Document
Data denormalisation
Transform data structures
Reformat data types
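The two stages can be sketched roughly as follows. This is an illustration under assumptions, not the team's transform code: the field names (`parent`, `label`, `modified`), the key prefixes and the grouping rule are all hypothetical.

```python
import json
from collections import defaultdict


def stage1(key, value):
    """Stage 1: one key/value pair becomes one JSON document."""
    doc = {
        "type": "raw",
        "source_key": key,
        # Kept at the top level so Stage 2 can look records up by parent (assumed).
        "parent": value.get("parent"),
        "value": value,
    }
    return f"raw::{key}", json.dumps(doc, sort_keys=True)


def stage2(raw_docs):
    """Stage 2: fold many raw documents into one denormalised document each."""
    groups = defaultdict(list)
    for raw in raw_docs:
        doc = json.loads(raw)
        groups[doc["parent"]].append(doc)

    merged = {}
    for parent, docs in groups.items():
        docs.sort(key=lambda d: d["source_key"])
        merged[f"audit::{parent}"] = {
            "type": "audit",
            "id": parent,
            "items": [
                # Reformat data types: string timestamps become integers.
                {"label": d["value"]["label"], "modified": int(d["value"]["modified"])}
                for d in docs
            ],
        }
    return merged


PAIRS = {
    "item-1": {"parent": "a1", "label": "Fire exits clear", "modified": "1420070400"},
    "item-2": {"parent": "a1", "label": "Extinguisher tagged", "modified": "1420070460"},
    "item-3": {"parent": "a2", "label": "Spill kit stocked", "modified": "1420070520"},
}

RAW = [stage1(k, v)[1] for k, v in PAIRS.items()]
DOCS = stage2(RAW)
print(len(RAW), "raw documents ->", len(DOCS), "denormalised documents")
```

Splitting the work this way keeps Stage 1 a mechanical copy, so structural changes are confined to Stage 2.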
Migration Process
40+ Quad Core Machines
320+ documents migrated concurrently
Concurrency and tasks controlled by Amazon Simple Queue Service
6.5 hours from start to finish
80+ test migrations
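A queue-driven worker pool of this kind can be sketched as below. Python's in-memory `queue.Queue` stands in for Amazon Simple Queue Service, and `run_workers` and `fake_migrate` are hypothetical names. The default of 8 workers is what 320 concurrent documents over 40 machines would give if spread evenly, which the slides do not state.

```python
import queue
import threading


def run_workers(task_ids, migrate, workers=8):
    """Drain a task queue with a fixed number of concurrent workers."""
    tasks = queue.Queue()
    for task_id in task_ids:
        tasks.put(task_id)

    done, failed = [], []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                task = tasks.get_nowait()
            except queue.Empty:
                return  # queue drained, worker exits
            try:
                migrate(task)
                with lock:
                    done.append(task)
            except Exception as exc:
                # Record the failure instead of crashing the worker.
                with lock:
                    failed.append((task, repr(exc)))

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return done, failed


def fake_migrate(doc_id):
    """Hypothetical migration step that fails for one document."""
    if doc_id == "doc-13":
        raise ValueError("malformed value")


DONE, FAILED = run_workers([f"doc-{i}" for i in range(20)], fake_migrate)
print(len(DONE), "migrated,", len(FAILED), "failed")
```

With a real queue service the worker count per machine caps concurrency, and failed tasks can be retried or inspected rather than lost.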
[Diagram: validation alongside the migration pipeline. Labels: Clean-room Validation, Migration Validation, Process Validation]
Clean-room Validation
Rebuilt transforms from
specification
Compared random
subsamples of original
data with result data
Migration Validation
SQS Queue validation
Numerical validation of
entities and sizes
Strict error monitoring
Process Validation
Issue management
Documented procedures
Checklists & audits
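Two of these checks lend themselves to a short sketch: comparing a random subsample against an independently written transform, and comparing record counts. The function names, the sample data and the one-to-one key mapping are assumptions made for illustration. The real Stage 2 merged many entities into one document, so its counts would not match one-to-one.

```python
import random


def clean_room_check(source, migrated, transform, sample_size, seed=0):
    """Re-derive a random subsample with an independent transform and compare."""
    rng = random.Random(seed)
    keys = rng.sample(sorted(source), min(sample_size, len(source)))
    return [k for k in keys if transform(source[k]) != migrated.get(k)]


def count_check(source, migrated):
    """Numerical validation: record counts and any keys that went missing."""
    return {
        "source": len(source),
        "migrated": len(migrated),
        "missing": sorted(set(source) - set(migrated)),
    }


def spec_transform(value):
    """Hypothetical transform, written from the specification alone."""
    return {"label": value["label"].strip(), "modified": int(value["modified"])}


SOURCE = {f"k{i}": {"label": f" item {i} ", "modified": str(1000 + i)} for i in range(50)}
MIGRATED = {k: spec_transform(v) for k, v in SOURCE.items()}
MIGRATED["k7"]["modified"] = 0  # simulate a transform bug
del MIGRATED["k9"]              # simulate a dropped record

BAD = clean_room_check(SOURCE, MIGRATED, spec_transform, sample_size=50)
COUNTS = count_check(SOURCE, MIGRATED)
print(sorted(BAD), COUNTS)
```

Writing the checking transform separately from the migration code means a shared misunderstanding of the data is less likely to pass unnoticed.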
Google Datastore
129,607,422 KV Entities
121 Query Indexes
1900 Ops/sec average
Couchbase Server
2,596,011 Documents
25 MapReduce Indexes
260 Ops/sec average
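As a quick worked comparison of the figures above: dividing the Datastore numbers by the Couchbase numbers gives roughly 50 entities per document, about 4.8 times fewer indexes and about 7.3 times fewer operations per second. The slides do not say whether the two ops/sec averages were measured under the same load, so that last ratio is only indicative.

```python
DATASTORE = {"records": 129_607_422, "indexes": 121, "ops_per_sec": 1900}
COUCHBASE = {"records": 2_596_011, "indexes": 25, "ops_per_sec": 260}

# Ratio of each Datastore figure to its Couchbase counterpart.
RATIOS = {name: DATASTORE[name] / COUCHBASE[name] for name in DATASTORE}

for name, ratio in RATIOS.items():
    print(f"{name}: {ratio:.1f}x")
```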
[Diagram: production, with Google Datastore and Couchbase Server]
SafetyCulture
Moving into our new cloud...
Instant Switchover
“Google App Engine one day, Amazon Web Services the next”
28 Hour Switchover Process
➔ Downtime required
➔ Minimum Load Period - Saturday to Sunday
➔ Required 15 engineering staff
➔ Additional support staff
SafetyCulture Moving Clouds
12 microservices
➔ Unique scaling requirements for each
➔ Stateless and fault tolerant
Infrastructure
➔ 30+ Virtual machines serving simultaneously
➔ 14 Load balancers
Use AWS services where possible
➔ DynamoDB, ELBs, ASGs, CloudWatch, Route53...
SafetyCulture The Infrastructure
SafetyCulture Development
Continuous integration and delivery
➔ ~500 deploys in under five months
➔ Zero downtime deploys
Better team workflow
➔ Agile development methodology
➔ Every pull request gets reviewed and tested
➔ Microservices allow for faster and isolated development
➔ Features hidden behind feature flags
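The slides do not say how the feature flags were implemented. In its simplest form, a flag is a named switch checked at the point where the new behaviour would appear, so unfinished work can ship in a deploy without being visible. `FeatureFlags`, `render_menu` and the `analytics` flag are hypothetical names.

```python
class FeatureFlags:
    """A minimal in-memory flag store (assumed design, for illustration)."""

    def __init__(self, enabled=()):
        self._enabled = set(enabled)

    def is_enabled(self, name):
        return name in self._enabled


def render_menu(flags):
    """Build the navigation menu, showing flagged items only when enabled."""
    items = ["Audits", "Templates"]
    if flags.is_enabled("analytics"):
        items.append("Analytics")
    return items


MENU_OFF = render_menu(FeatureFlags())
MENU_ON = render_menu(FeatureFlags(enabled=["analytics"]))
print(MENU_OFF, MENU_ON)
```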
SafetyCulture The Business
A better product for customers
➔ Faster and more reliable
➔ Clean and modern UI
➔ More features and fixes being released
In the five months since launch
➔ 100% growth in database records
➔ 50% user growth
➔ 40% saving in infrastructure costs
May the safe be with you...
safetyculture.io
@safetycultrehq

Silver Linings - North Queensland IT Industry Conference

  • 1. Silver Linings Our journey in migrating a cloud
  • 2. Tom Dance Head of Platform Development Tristan Davey Senior Software Engineer
  • 3. Moving Clouds Relocating from Google to Amazon
  • 4. SafetyCloud: Google App Engine. Large monolith (500,000+ LOC). We outgrew Google App Engine ➔ Poor documentation and support ➔ Immature APIs (everything is beta!) ➔ Bumping into limitations ➔ Proprietary technology. Feature development and fixes were slow. We needed more flexibility. Scaling the engineering team was hard.
  • 5. Google Cloud Platform Web Frontend Web API Email Client API Binary Data SafetyCloud ‘The Monolith’ SafetyCloud Monolithic Architecture
  • 6. SafetyCulture The Goal Improve our product ➔ More reliable and performant syncing ➔ New modern user interface ➔ Feature equivalent ➔ Full backwards compatibility with iAuditor Address the problems a large monolithic codebase brings Scalable, flexible, open technologies Strong partner for infrastructure
  • 7. SafetyCulture The Solution 10 Microservices built with Node.js Single Page App built with Ember.js Document store with Couchbase Document indexing with ElasticSearch Scalable cloud based infrastructure with Amazon Web Services
  • 8. Amazon Web Services Web API Email Client API Binary Data Web Frontend SafetyCulture Microservice Architecture
  • 9. Rebuilding an API Reconstructing our client API in Node.js
  • 10. Client API HTTP API for SafetyCulture iOS and Android Applications ● Authentication ● Document Synchronisation ● User Management ● Document Permissions
  • 11. Client API Change Considerations. Consumed by over 500,000 devices. Many users on legacy versions of consuming clients: ● 2% of users on a version older than 1 year ● 8.5% of users on a version older than 6 months ● 25.3% of users on a version older than 1 month. Consuming clients relied on undocumented quirks and edge cases to function; these needed to be maintained.
  • 12. Client API Rebuild (Original API ➜ Rebuilt API). Language: Python ➜ Coffeescript. Server Framework: WebApp2 ➜ Hapi.js. Database: Google Datastore ➜ Couchbase Server. Query Engine: SQL-Like Queries ➜ MapReduce Indexes. Binary Storage: Google Blobstore ➜ Amazon S3. Scaling: Vertical + Horizontal ➜ Horizontal.
  • 13. Client API: Maintaining Backwards Compatibility. API specification-based implementation: the external specification of the original API became the internal implementation specification of the new API. Manual and automated testing: automated unit and integration tests; production device testing with large-scale, multi-hour real-world tests. Replay-based regression testing: production device traffic was observed, recorded and replayed with a custom-built tool, which allowed us to identify request/response behaviours.
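The replay-based regression testing described on slide 13 can be sketched roughly as follows. This is a minimal illustration, not the custom-built tool from the talk: the `Recorded` type, the `replay` function and the `rebuilt_api` stand-in are all hypothetical names, and a real tool would issue HTTP requests against the new service rather than call a local function.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Recorded:
    """One recorded exchange: a request and the response the original API gave."""
    method: str
    path: str
    body: str
    expected_status: int
    expected_body: str


# A handler takes (method, path, body) and returns (status, body).
Handler = Callable[[str, str, str], Tuple[int, str]]


def replay(recordings: List[Recorded], handler: Handler) -> List[Dict[str, object]]:
    """Replay each recorded request against `handler` and collect mismatches."""
    mismatches: List[Dict[str, object]] = []
    for rec in recordings:
        status, body = handler(rec.method, rec.path, rec.body)
        if (status, body) != (rec.expected_status, rec.expected_body):
            mismatches.append({
                "request": f"{rec.method} {rec.path}",
                "expected": (rec.expected_status, rec.expected_body),
                "actual": (status, body),
            })
    return mismatches


def rebuilt_api(method: str, path: str, body: str) -> Tuple[int, str]:
    """Hypothetical stand-in for the rebuilt API."""
    if method == "GET" and path == "/ping":
        return 200, "pong"
    return 404, ""


recordings = [
    Recorded("GET", "/ping", "", 200, "pong"),
    # A legacy behaviour the stand-in does not reproduce, so it is reported.
    Recorded("GET", "/legacy-quirk", "", 200, "ok"),
]
report = replay(recordings, rebuilt_api)
```

Each entry in `report` points at a request whose response differs from the recorded one, which is how undocumented quirks relied on by old clients would surface before launch.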
  • 14. Client API Rebuild Outcomes Built, tested, deployed in under 9 engineering-months Client API Codebase: 10000+ LOC Regression Test Codebase: 22000+ LOC Seamlessly continued working with legacy clients Horizontally scales to easily meet peak demand Now serves 1,200,000 requests/day
  • 15. Data Migration Google Datastore to Couchbase Server
  • 16. Google Datastore: non-relational key-value store; proprietary software; eventually consistent; 1MB value limit; basic indexing and querying. Couchbase Server: non-relational document store; open-source project; eventually consistent; configurable document limit; MapReduce-based indexing.
  • 18. Migration. Stage 1 Migration (LevelDB ➜ Couchbase): 1 key/value pair = 1 document; JSON-serialise data; index data for Stage 2. Stage 2 Migration (Couchbase ➜ Couchbase): many key/value pairs = 1 document; data denormalisation; transform data structures; reformat data types.
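The two migration stages on slide 18 can be illustrated with a small in-memory sketch. The field names (`audit_id`, `position`, `label`, `score`) and the grouping rule are assumptions made for the example; the slides do not describe the real schema or transforms.

```python
import json
from collections import defaultdict
from typing import Dict, List, Tuple


def stage1(pairs: List[Tuple[str, dict]]) -> Dict[str, str]:
    """Stage 1: one key/value pair becomes one JSON-serialised document."""
    return {key: json.dumps(value, sort_keys=True) for key, value in pairs}


def stage2(documents: Dict[str, str]) -> Dict[str, dict]:
    """Stage 2: merge many Stage 1 documents into one denormalised document."""
    grouped: Dict[str, List[dict]] = defaultdict(list)
    for raw in documents.values():
        doc = json.loads(raw)
        # Hypothetical grouping key; Stage 1 would have indexed on it.
        grouped[doc["audit_id"]].append(doc)

    merged: Dict[str, dict] = {}
    for audit_id, docs in grouped.items():
        items = sorted(docs, key=lambda d: d["position"])
        merged[audit_id] = {
            "audit_id": audit_id,
            # Reformat data types: this made-up source stored scores as strings.
            "items": [{"label": d["label"], "score": int(d["score"])} for d in items],
        }
    return merged


pairs = [
    ("item:2", {"audit_id": "a1", "position": 2, "label": "Exits clear", "score": "1"}),
    ("item:1", {"audit_id": "a1", "position": 1, "label": "Fire extinguisher", "score": "0"}),
    ("item:3", {"audit_id": "a2", "position": 1, "label": "Signage", "score": "1"}),
]
result = stage2(stage1(pairs))
```

Here three key/value pairs end up as two documents, which mirrors the "many key/value pairs = 1 document" step on a toy scale.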
  • 19. Migration Process: 40+ quad-core machines; 320+ documents migrated concurrently; concurrency and tasks controlled by Amazon Simple Queue Service; 6.5 hours from start to finish; 80+ test migrations.
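Slide 19's figures imply roughly 8 documents in flight per machine (320 / 40), or about 2 per core. The queue-driven worker pattern can be sketched as below, using Python's standard `queue.Queue` as a local stand-in for Amazon Simple Queue Service. The `run_workers` and `migrate` names are hypothetical, and a real SQS consumer would also have to handle visibility timeouts and at-least-once delivery, which this sketch ignores.

```python
import queue
import threading
from typing import Callable, List


def run_workers(task_queue: "queue.Queue[str]",
                migrate: Callable[[str], None],
                workers: int) -> None:
    """Drain `task_queue` with a fixed number of concurrent workers."""
    def worker() -> None:
        while True:
            try:
                key = task_queue.get_nowait()
            except queue.Empty:
                return  # nothing left to migrate
            try:
                migrate(key)
            finally:
                task_queue.task_done()

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()


tasks: "queue.Queue[str]" = queue.Queue()
for i in range(100):
    tasks.put(f"doc-{i}")

migrated: List[str] = []
lock = threading.Lock()


def migrate(key: str) -> None:
    """Hypothetical migration step; here it only records the key."""
    with lock:
        migrated.append(key)


# 8 workers per machine is inferred from 320 concurrent documents over 40 machines.
run_workers(tasks, migrate, workers=8)
```

Because the queue hands each task to exactly one worker, the worker count is the only knob needed to control concurrency on a machine.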
  • 21. Validation. Clean-room validation: rebuilt transforms from specification; compared random subsamples of original data with result data. Migration validation: SQS queue validation; numerical validation of entities and sizes; strict error monitoring. Process validation: issue management; documented procedures; checklists & audits.
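The clean-room and numerical checks on slide 21 might look something like the following sketch. It assumes an independently written transform (`clean_room_transform`, a hypothetical name) is applied to a random subsample of the source data and compared with what the migration produced; the data shapes are invented for illustration.

```python
import random
from typing import Callable, Dict, List


def validate_sample(source: Dict[str, dict],
                    migrated: Dict[str, dict],
                    transform: Callable[[dict], dict],
                    sample_size: int,
                    seed: int = 0) -> List[str]:
    """Return the sampled keys whose migrated value differs from the re-derived one."""
    rng = random.Random(seed)
    keys = rng.sample(sorted(source), min(sample_size, len(source)))
    return [k for k in keys if migrated.get(k) != transform(source[k])]


def clean_room_transform(value: dict) -> dict:
    """Hypothetical transform rebuilt from the specification, not from the migration code."""
    return {"score": int(value["score"])}


source = {f"k{i}": {"score": str(i)} for i in range(50)}
migrated = {k: clean_room_transform(v) for k, v in source.items()}
migrated["k7"] = {"score": -1}  # simulate one corrupted document

failures = validate_sample(source, migrated, clean_room_transform, sample_size=50)
counts_match = len(source) == len(migrated)  # simple numerical validation
```

The count check alone would pass here; only the sampled content comparison catches the corrupted document, which is why the two kinds of validation complement each other.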
  • 22. Production. Google Datastore: 129,607,422 KV entities; 121 query indexes; 1900 ops/sec average. Couchbase Server: 2,596,011 documents; 25 MapReduce indexes; 260 ops/sec average.
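The before-and-after figures on slide 22 can be turned into ratios with a few lines of arithmetic. The interpretation is an assumption: the entities-per-document figure only holds if every Datastore entity was carried into some Couchbase document, which the slides do not state.

```python
# Figures taken from slide 22.
kv_entities = 129_607_422
documents = 2_596_011

# Average number of key/value entities folded into each document (about 50).
entities_per_document = kv_entities / documents

# Reduction in average database operations per second (about 7.3x).
ops_reduction = 1900 / 260

# Reduction in the number of indexes (about 4.8x).
index_reduction = 121 / 25
```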
  • 24. SafetyCulture: Moving Clouds. Instant Switchover: “Google App Engine one day, Amazon Web Services the next”. 28 Hour Switchover Process ➔ Downtime required ➔ Minimum Load Period - Saturday to Sunday ➔ Required 15 engineering staff ➔ Additional support staff
  • 26. 12 microservices ➔ Unique scaling requirements for each ➔ Stateless and fault tolerant Infrastructure ➔ 30+ Virtual machines serving simultaneously ➔ 14 Load balancers Use AWS services where possible ➔ DynamoDB, ELBs, ASGs, CloudWatch, Route53... SafetyCulture The Infrastructure
  • 27. SafetyCulture Development Continuous integration and delivery ➔ ~500 deploys in under five months ➔ Zero downtime deploys Better team workflow ➔ Agile development methodology ➔ Every pull request gets reviewed and tested ➔ Microservices allow for faster and isolated development ➔ Features hidden behind feature flags
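Hiding features behind feature flags, as mentioned on slide 27, can be sketched in a few lines. The `FeatureFlags` class, the `"*"` wildcard convention and the flag name are all hypothetical; the slides do not say how SafetyCulture implemented its flags.

```python
from typing import Dict, Set


class FeatureFlags:
    """Maps a flag name to the set of user IDs it is enabled for ("*" means everyone)."""

    def __init__(self, flags: Dict[str, Set[str]]) -> None:
        self._flags = flags

    def is_enabled(self, flag: str, user_id: str) -> bool:
        allowed = self._flags.get(flag, set())  # unknown flags are off
        return "*" in allowed or user_id in allowed


flags = FeatureFlags({
    "new-dashboard": {"user-1"},   # rolled out to a single tester
    "faster-sync": {"*"},          # released to everyone
})

shown_to_tester = flags.is_enabled("new-dashboard", "user-1")
shown_to_others = flags.is_enabled("new-dashboard", "user-2")
sync_for_all = flags.is_enabled("faster-sync", "user-2")
```

With a check like this, unfinished code can ship in a deploy while staying invisible to most users, which fits the frequent, zero-downtime deploys the slide describes.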
  • 28. SafetyCulture The Business A better product for customers ➔ Faster and more reliable ➔ Clean and modern UI ➔ More features and fixes being released In the five months since launch ➔ 100% growth in database records ➔ 50% user growth ➔ 40% saving in infrastructure costs
  • 29. May the safe be with you... safetyculture.io @safetycultrehq