Andrey Koleshko
Back-end developer @ Toptal
Github/twitter: @ka8725
Email: ka8725@gmail.com
Rails data migrations
The problem definition
● Code is never set in stone
● DB structure mutates
○ Columns/tables get renamed or dropped
○ One type of relationship changes to another (e.g. from “belongs to” to “has and belongs to many”, from “has many” to “has one”, etc.)
● Zero-downtime policy (production experience)
● Tons of data to migrate
● Public API exposed for other services
● NoSQL
Different situations
● No production yet
● Production without zero-downtime policy
● Production with zero-downtime policy (the hardest case)
Schema migrations != Data migrations
Tell things apart
Tell things apart: schema migrations
# AR is slide shorthand for ActiveRecord
class AddStatusToUser < AR::Migration
  def up
    add_column :users, :status, :string
  end

  def down
    remove_column :users, :status
  end
end
Tell things apart: data migrations
class AddStatusToUser < AR::Migration
  def up
    add_column :users, :status, :string
    User.find_each do |user|
      user.status = 'active'
      user.save!
    end
  end
  ...
Different solutions
Where the data migration lives:
● Write data migrations inside schema migrations (1)
● Write data migrations separately from schema migrations (2)
How the data migration is written:
● Write any Rails code carelessly (a)
● Redefine models and use them in place (b)
● Call data migration code written elsewhere (seeds, services, etc.) (c)
● Raw SQL (d)
● Rake tasks (e)
|{1, 2} x {a, b, c, d, e}| = 10 possible combinations
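The count of 10 is simply the size of the Cartesian product of the two lists. A minimal sketch in plain Ruby; the constant names are made up for illustration:

```ruby
# (1) inside schema migrations, (2) separately from them
PLACEMENTS = [1, 2]
# (a) careless Rails code, (b) redefined models, (c) outside code,
# (d) raw SQL, (e) Rake tasks
TECHNIQUES = %w[a b c d e]

# Every placement paired with every technique.
COMBINATIONS = PLACEMENTS.product(TECHNIQUES)

puts COMBINATIONS.size
COMBINATIONS.each { |placement, technique| puts "#{placement}#{technique}" }
```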
Pick a solution based on balance
● Do you need the migrations to keep working forever?
● Is the developer environment more important than production?
My choice
● Do you need the migrations to keep working forever?
○ No, clean them up from time to time
○ Don’t run all migrations on a fresh start
○ Local/staging loads a dump and the final schema at once
○ Obfuscate the dump if needed
● Is the developer environment more important than production?
○ Obviously not, see the points above
Solution #1: Ruby code inside schema migration
class AddStatusToUser < AR::Migration
  def up
    add_column :users, :status, :string
    User.find_each do |user|
      user.status = 'active'
      user.save!
    end
  end
  ...
● Error-prone: what if someone renames the User model later?
● Not recommended
Solution #2: Redefine models inside migrations
class AddStatusToUser < AR::Migration
  # Local model: independent of whatever app/models/user.rb looks like later
  class User < ActiveRecord::Base; end

  def up
    add_column :users, :status, :string
    User.find_each { |user| user.update!(status: 'active') }
  end
  ...
Solution #2: Redefine models inside migrations. Bug
class AddStatusToUser < AR::Migration
  class User < AR::Base; belongs_to :role, polymorphic: true; end
  class Role < AR::Base; has_many :users, as: :role; end
----------------------------------------------------------
role = Role.create!(name: 'admin')
User.create!(nick: '@ka8725', role: role)
Solution #2: Redefine models inside migrations. Bug
> user = User.find_by(nick: '@ka8725')
> user.role # => nil
> user.role_type # => "AddStatusToUser::Role"
Expected:
> user.role_type == "Role" # => true
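The namespaced value can be explained without Rails. A class defined inside another class gets a fully qualified name, and ActiveRecord (as far as I know) writes the class name into the polymorphic type column. The sketch below uses plain Ruby classes as stand-ins for the migration and the models; `role_class_seen_by_migration` is a made-up helper for illustration only.

```ruby
# Stand-in for the application's model in app/models/role.rb.
class Role; end

# Stand-in for the migration class.
class AddStatusToUser
  # Model redefined inside the migration.
  class Role; end

  # Inside the migration, the bare constant `Role` resolves lexically
  # to the nested class, not to the application's model.
  def self.role_class_seen_by_migration
    Role
  end
end

puts AddStatusToUser.role_class_seen_by_migration.name
puts Role.name
```

The first name is the one that would end up in `role_type`, which the application cannot map back to its own `Role` model.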
● Much better than the previous one
● Error-prone - How to deal with tricky associations?
● Interesting bug with polymorphic associations
● Not recommended
Solution #2: Redefine models inside migrations
Common approach in Rails community?
Solution #3: Call data migration code written elsewhere from schema migrations
● Has all previous problems
● Not a better choice
● Not recommended
Solution #4: Raw SQL
Pros:
● Fast execution
● No previous problems
Cons:
● Requires SQL knowledge
● Takes more time to code
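A minimal sketch of the raw SQL approach for the status backfill from the earlier slides. In a Rails migration, `execute` sends the statement to the database. Here `FakeMigration` is a made-up stand-in that only records the SQL, so the sketch runs without Rails or a database. The `WHERE status IS NULL` condition and the class names are assumptions.

```ruby
# Stand-in for the migration base class: records SQL instead of running it.
# In a real app the class below would inherit from the ActiveRecord
# migration class and `execute` would hit the database.
class FakeMigration
  def executed_sql
    @executed_sql ||= []
  end

  def execute(sql)
    executed_sql << sql.strip
  end
end

class BackfillUserStatus < FakeMigration
  def up
    # One set-based statement, no model classes involved.
    execute(<<~SQL)
      UPDATE users SET status = 'active' WHERE status IS NULL
    SQL
  end
end

migration = BackfillUserStatus.new
migration.up
puts migration.executed_sql
```

Because no model is loaded, renamed models and redefined associations cannot break this migration, and the database updates all rows in one statement instead of one `save!` per record.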
Solution #5: Rake tasks
● Define custom Rake tasks
● Run when needed
rake db_migration:fix_data
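A sketch of what such a task could look like. In a Rails app the file would typically live under lib/tasks and depend on the real `:environment` task. Here `:environment` is stubbed and `USERS` is a made-up in-memory stand-in for the users table, so the sketch runs with only the rake gem that ships with Ruby.

```ruby
require "rake"

# In-memory stand-in for the users table (hypothetical data).
USERS = [
  { nick: "@ka8725", status: nil },
  { nick: "@someone", status: "blocked" }
]

# Rails defines the real :environment task, which loads the app; stub here.
task :environment

namespace :db_migration do
  desc "Backfill missing user statuses"
  task fix_data: :environment do
    # Only touch rows that have no status yet.
    USERS.each { |user| user[:status] ||= "active" }
  end
end

# Equivalent to running `rake db_migration:fix_data` from the shell.
Rake::Task["db_migration:fix_data"].invoke
puts USERS.inspect
```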
Solution #5: Rake tasks
● Not a bad choice
● Requires some manual work
● Can be automated
● Can be developed into a solution similar to schema migrations in Rails
Not a bad solution for a start
● Define data migrations inside schema migrations
● But write tests for data migrations
● https://railsguides.net/change-data-in-migrations-like-a-boss/
● https://github.com/ka8725/migration_data
The best choice for production with zero downtime
● A solution similar to schema migrations, with versioning
○ https://github.com/ilyakatz/data-migrate
● Write SQL
● Schema migrations are made in several steps
○ https://blog.codeship.com/rails-migrations-zero-downtime/
● Heavy migrations (lasting for hours) are split into several background jobs scheduled at some interval
● Sort and run combined: for local env only!
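"Sort and run combined" presumably means merging schema and data migrations into one list ordered by version. A sketch of that ordering in plain Ruby; the `Migration` struct, the versions and the names are made up for illustration:

```ruby
# Hypothetical record describing one pending migration.
Migration = Struct.new(:version, :kind, :name)

SCHEMA_MIGRATIONS = [
  Migration.new(20180101120000, :schema, "AddStatusToUser"),
  Migration.new(20180103090000, :schema, "RemoveLegacyFlagFromUser")
]

DATA_MIGRATIONS = [
  Migration.new(20180102100000, :data, "BackfillUserStatus")
]

# Merge both lists and order by version timestamp, so each data migration
# runs against the schema state it was written for.
def combined_order(schema, data)
  (schema + data).sort_by(&:version)
end

combined_order(SCHEMA_MIGRATIONS, DATA_MIGRATIONS).each do |migration|
  puts "#{migration.version} #{migration.kind} #{migration.name}"
end
```

On production the two kinds stay separate: schema migrations run during the deploy, data migrations afterwards.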
Production zero-downtime: deployment caveats
● Schema migrations should be fast (<1s)
● Avoid data migrations inside schema migrations
● Data migrations run after deployment
● Complementary actions are made on subsequent deploys, once the data migration has run successfully
Zero downtime
[Diagram: deploy timeline for production code and DB, with steps in order: schema migrations, symlink, data migrations]
Split to smaller jobs
Process(1-1000) Process(1001-2000) Process(2001-3000)
[Diagram: background jobs j#1 through j#8]
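A sketch of how the ID ranges and start delays for such jobs could be computed. `batch_ranges` and `schedule` are made-up helper names, and the 60-second interval is an arbitrary assumption. Enqueuing the jobs with an actual background job framework is left out.

```ruby
# Split an inclusive ID range into consecutive batches of at most `size` IDs.
def batch_ranges(first_id, last_id, size)
  (first_id..last_id).step(size).map do |start|
    start..[start + size - 1, last_id].min
  end
end

# Pair each batch with a delay in seconds so jobs start `interval` apart.
def schedule(ranges, interval)
  ranges.each_with_index.map do |range, index|
    { ids: range, delay: index * interval }
  end
end

schedule(batch_ranges(1, 3000, 1000), 60).each do |job|
  puts "Process(#{job[:ids].first}-#{job[:ids].last}) in #{job[:delay]}s"
end
```

Spacing the jobs out keeps each run short and leaves the database room for regular traffic between batches.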
@ka8725
Andrey Koleshko
Remotely working veteran
Questions?
