Database
Synchronization
     Shaun Haber
  Warner Bros. Records
What is it?


• Merging content between a dev site and a
  production site
Disclaimer

• No single answer
• No “Drupally” solution
• Not exclusive to Drupal
• Not magic
Who the hell am I?
Warner Music Group
Warner Bros. Records

• Subsidiary of Warner Music Group
• Family of labels (Reprise, Sire, etc.)
• Over 100 artists
• Top-selling albums
• It’s music biz after all!
So what?
WBR Tech

• Only label with an in-house Tech team
• “Start-up” mentality
• Fast-paced, hectic, and fun!
• We use Drupal... religiously
93 Drupal Sites
  1 new site every week
Launching like crazy!
Source: http://flickr.com/photos/krosinsky/2848288562/




Web sites in the wild!
Websites in the wild


• Always collecting new data!
[Chart: Data plotted against Time, with Launch marked]
Not a bad thing,
       obviously

• Want websites to grow
• More users + more data = PROFIT
But...

• How do we keep the site updated?
 -   New content
 -   New features
 -   Code fixes
 -   <insert your own update here>
Minor updates

                Major updates




                     Source: http://flickr.com/photos/nimboo/132386298
Minor Updates

• CSS tweak
• template.php change
• Add a new Block
• Change settings on a View
• Install a new module
Major Updates

• Schema changes
• Information re-architecture
• Significant configuration changes
• User flow changes
• New theme integration
Strategy?




            Maintain a separate Dev site!
[Diagram, built up over several slides: a timeline (Time axis) with a Prod server lane and a Dev server lane. The Dev lane shows New, QA, Dev, Dev; the Prod lane shows Prod, Prod, and then a "?" where the two must be reconciled.]
Syncing Databases Sucks

• Code: Easy
• Files: Easy
• Database: Hard
[Diagram: the Prod server / Dev server timeline again (Dev lane: New, QA, Dev, Dev; Prod lane: Prod, Prod), now ending in "Prod 2.0" on the Prod lane.]
Order of Events
1. Develop a new site
2. Launch site
3. Take snapshot of prod site
4. Develop on snapshot
5. Magic? => Relaunch new version of site
But it’s not Magic!
1. Take dev site down
2. Shift sequenced IDs on Dev
3. Take prod site down
4. Merge content from Prod to Dev
5. QA “new” dev site
6. Copy dev site to prod site
7. Bring “new” prod site live
Source: http://flickr.com/photos/interplast/6339098/




It’s Database Surgery!
2 Step Process

• Step 1 - Shift Sequenced IDs

• Step 2 - Merge content
[Diagram, built up over several slides: side-by-side stacks of sequenced IDs. The stacks start with IDs 1 to 3 in common, later slides add IDs 4 to 6 and then 10 and 11, and the final slides mark item 2 as "2a".]
Step 1 - Shifting IDs

• comments_cid
• files_fid
• node_revisions_vid
• node_nid
• users_uid
Need to know

• Highest common ID between Dev and
  Prod
• Delta value to shift
• Reference of known tables and fields
Highest Common ID

• Top item on the “stack” at time of the
  snapshot.

[Diagram: ID stacks holding 1, 2 and 3, with 3 on top.]
Delta value
• Amount to shift the conflicted items, with
  extra padding

[Diagram: an ID stack showing the values 3, 7, 10 and 11.]
UPDATE table
SET id = id + $delta
WHERE id > $common
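A minimal sketch of the shift in Python, using an in-memory SQLite database in place of the Drupal/MySQL setup on the slides. The helper names `compute_delta` and `shift_ids` are made up for illustration, and the delta formula (distance from the highest common ID to Prod's highest ID, plus padding) is an assumption about how "with extra padding" might be worked out, not something the slides specify.

```python
import sqlite3

def compute_delta(common_id, prod_max_id, padding=3):
    """Assumed rule: lift Dev's new IDs above Prod's highest ID, plus padding."""
    return (prod_max_id - common_id) + padding

def shift_ids(conn, table, column, delta, common_id):
    """Run the slide's UPDATE for one table and one ID column."""
    # Table and column names cannot be bound as parameters; they are assumed
    # to come from a trusted, hand-maintained reference list.
    sql = f"UPDATE {table} SET {column} = {column} + ? WHERE {column} > ?"
    conn.execute(sql, (delta, common_id))

# Stand-in for the Dev database: IDs 1-3 existed at snapshot time,
# 4 and 5 were created on Dev afterwards.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE node (nid INTEGER, title TEXT)")
conn.executemany(
    "INSERT INTO node VALUES (?, ?)",
    [(1, "a"), (2, "b"), (3, "c"), (4, "dev-only"), (5, "dev-only")],
)

delta = compute_delta(common_id=3, prod_max_id=6)
shift_ids(conn, "node", "nid", delta, common_id=3)
shifted = [row[0] for row in conn.execute("SELECT nid FROM node ORDER BY nid")]
print(delta, shifted)
```

The demo table has no primary key on purpose: with a unique key in place, a row-by-row shift can collide with a row that has not been shifted yet, so a real run would need to account for that.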
And that’s it for Step 1
Actually, it’s MUCH
more complicated...
What tables have nid?
comments.nid                          poll.nid
content_field_*.nid, field_*_nid      poll_choices.nid
content_type_*.nid, field_*_nid       poll_votes.nid
files.nid                             term_node.nid
forum.nid                             uc_cart_products.nid
forward_log.nid                       uc_order_products.nid
history.nid                           uc_product_features.nid
node.nid                              uc_products.nid
node_access.nid                       uc_roles_products.nid
node_comment_statistics.nid           usernode.nid
node_counter.nid                      webform.nid
node_revisions.nid                    webform_component.nid
nodefamily.parent_nid, child_nid      webform_submissions.nid
panels_node.nid                       webform_submitted_data.nid
Also...
• Special tables:
 • location, sequences, url_alias, etc.
• node-nid.tpl.php
• Serialized PHP variables in DB
• PHP code in DB
• URLs in DB or elsewhere (e.g., /node/123)
Well shit!
Do the best we can!

• Reference of all known tables
• Reference of all known sequence fields
• Reference of all known “special cases”
• Automate as much as possible
Scripting Time!
Check for unknown tables
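// List every table in the database (MySQL).
$rs = db_query("SHOW TABLES");

// Log any table that is missing from the known-tables reference.
while ($row = db_fetch_row($rs)) {
  if (!is_known_table($row[0])) {
    log_unknown_table($row[0]);
  }
}

// Stop before shifting anything if an unknown table turned up.
if (found_unknown_tables()) {
  print_unknown_tables();
  exit;
}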
Store all known tables in a txt file
access                   buddylist_groups
accesslog                buddylist_pending_requests
audio_widget_thumbnail   cache*
audio_widget_track       comments
authmap                  contact
blocks                   content_field_*
blocks_roles             content_type_*
boxes                    devel_queries
buddylist                devel_times
buddylist_buddy_group    ...
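The check against the known-tables file can be sketched in Python as follows. This is an illustration under assumptions: the file is taken to hold one table name or wildcard pattern per line (as in `cache*` and `content_field_*` above), SQLite's `sqlite_master` catalog stands in for MySQL's `SHOW TABLES`, and the function names are hypothetical.

```python
import sqlite3
from fnmatch import fnmatchcase

def load_known_patterns(text):
    """One table name or wildcard pattern per line; blank lines are ignored."""
    return [line.strip() for line in text.splitlines() if line.strip()]

def find_unknown_tables(conn, patterns):
    """Return the tables that match none of the known names or patterns."""
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    return sorted(
        name for (name,) in rows
        if not any(fnmatchcase(name, pattern) for pattern in patterns)
    )

# Contents of a hypothetical known-tables file.
known = load_known_patterns("access\ncache*\ncomments\ncontent_type_*\n")

conn = sqlite3.connect(":memory:")
for table in ("access", "cache_menu", "comments", "content_type_album", "mystery"):
    conn.execute(f"CREATE TABLE {table} (id INTEGER)")

unknown = find_unknown_tables(conn, known)
if unknown:
    print("Unknown tables, review before shifting:", unknown)
```

Stopping on the first unknown table, as the PHP snippet does with `exit`, keeps a newly installed module from slipping through with unshifted IDs.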
Store all fields in separate txt files
comments.nid                          node_comment_statistics.nid
content_field_*.nid, field_*_nid      node_counter.nid
content_type_*.nid, field_*_nid       node_revisions.nid
files.nid                             nodefamily.parent_nid, child_nid
forum.nid                             panels_node.nid
forward_log.nid                       poll.nid
history.nid                           poll_choices.nid
node.nid                              poll_votes.nid
node_access.nid                       ...
Now we can shift IDs!


• Iterate thru DB tables
• If table has known fields, shift IDs
  (remember that SQL command?)
• Rinse and repeat for each sequenced ID
UPDATE table
SET id = id + $delta
WHERE id > $common
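A sketch of the loop in Python, again with SQLite standing in for MySQL. It assumes the field reference uses lines such as `comments.nid` or `nodefamily.parent_nid, child_nid`, with `*` wildcards allowed in table names; `parse_index` and `shift_all` are hypothetical helpers, not part of any Drupal API.

```python
import sqlite3
from fnmatch import fnmatchcase

def parse_index(text):
    """Turn 'table.field_a, field_b' lines into (table_pattern, [fields]) pairs."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        table, _, fields = line.partition(".")
        entries.append((table, [f.strip() for f in fields.split(",")]))
    return entries

def shift_all(conn, entries, delta, common_id):
    """Shift every listed field in every table matching its pattern."""
    tables = [name for (name,) in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    touched = []
    for pattern, fields in entries:
        for table in tables:
            if not fnmatchcase(table, pattern):
                continue
            for field in fields:
                # Names come from the trusted reference file, not user input.
                conn.execute(
                    f"UPDATE {table} SET {field} = {field} + ? WHERE {field} > ?",
                    (delta, common_id),
                )
                touched.append(f"{table}.{field}")
    return touched

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE node (nid INTEGER)")
conn.execute("CREATE TABLE comments (cid INTEGER, nid INTEGER)")
conn.execute("CREATE TABLE nodefamily (parent_nid INTEGER, child_nid INTEGER)")
conn.executemany("INSERT INTO node VALUES (?)", [(3,), (4,)])
conn.execute("INSERT INTO comments VALUES (1, 4)")
conn.execute("INSERT INTO nodefamily VALUES (3, 4)")

entries = parse_index("node.nid\ncomments.nid\nnodefamily.parent_nid, child_nid\n")
touched = shift_all(conn, entries, delta=6, common_id=3)
nodes = [r[0] for r in conn.execute("SELECT nid FROM node ORDER BY nid")]
comment_nids = [r[0] for r in conn.execute("SELECT nid FROM comments")]
family = conn.execute("SELECT parent_nid, child_nid FROM nodefamily").fetchall()
print(touched, nodes, comment_nids, family)
```

One such index per sequenced ID (nid, uid, cid, and so on) would be run in turn, each with its own delta and highest common ID.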
Special Cases
Sequences table


• Simply reset the value to new highest ID
• Do this after shifting IDs in the “primary”
  table (node.nid, users.uid, etc.)
UPDATE sequences
SET id = $max
WHERE name = '$seq'
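A small Python sketch of the reset, with SQLite in place of MySQL. It assumes a sequences table with one row per sequence, holding the sequence name and its current value in columns called `name` and `id`; the helper `reset_sequence` is hypothetical.

```python
import sqlite3

def reset_sequence(conn, seq_name, table, column):
    """Set a sequence to the highest ID now present in its primary table."""
    (max_id,) = conn.execute(
        f"SELECT COALESCE(MAX({column}), 0) FROM {table}"
    ).fetchone()
    conn.execute("UPDATE sequences SET id = ? WHERE name = ?", (max_id, seq_name))
    return max_id

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sequences (name TEXT PRIMARY KEY, id INTEGER)")
conn.execute("CREATE TABLE node (nid INTEGER)")
conn.execute("INSERT INTO sequences VALUES ('node_nid', 5)")
# State after the shift: Dev's nodes 4 and 5 have become 10 and 11.
conn.executemany("INSERT INTO node VALUES (?)", [(1,), (2,), (3,), (10,), (11,)])

new_value = reset_sequence(conn, "node_nid", "node", "nid")
stored = conn.execute("SELECT id FROM sequences WHERE name = 'node_nid'").fetchone()[0]
print(new_value, stored)
```

Running this only after the primary table has been shifted matters, because the new value is read from that table.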
Location table

• Stores ID val in column `eid`
• Stores sequence type in column `type`
 • type = node, user
UPDATE location
SET `eid` = `eid` + $delta
WHERE `eid` > $common
AND `type` = '$type'
Url_alias table

• ID values are embedded as strings
• Use pattern matching to parse the ID
 • node: node/nid
 • user: user/uid, blog/uid
• Add the delta, update new alias
Pseudo-code
SELECT * FROM url_alias WHERE src LIKE 'node/%'

preg_match('#^node/([0-9]+)#', $src, $matches)

$id = $matches[1]

if ($id > $common) $id = $id + $delta

UPDATE url_alias SET src = 'node/$id' WHERE pid = $pid
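The same idea as a runnable Python sketch, with SQLite in place of MySQL. It assumes a `url_alias` table with `pid`, `src` and `dst` columns as on the slides; `shift_src` and `shift_aliases` are hypothetical helpers. Unlike the pseudo-code, it keeps any trailing path after the ID and leaves IDs at or below the highest common ID alone.

```python
import re
import sqlite3

def shift_src(src, prefixes, delta, common_id):
    """Shift the numeric ID in paths such as 'node/12' or 'node/12/edit'."""
    pattern = r"^(%s)/(\d+)(/.*)?$" % "|".join(map(re.escape, prefixes))
    m = re.match(pattern, src)
    if m is None or int(m.group(2)) <= common_id:
        return src  # not an ID path for this sequence, or older than the snapshot
    return f"{m.group(1)}/{int(m.group(2)) + delta}{m.group(3) or ''}"

def shift_aliases(conn, prefixes, delta, common_id):
    """Rewrite url_alias.src for every row whose embedded ID must move."""
    rows = conn.execute("SELECT pid, src FROM url_alias").fetchall()
    for pid, src in rows:
        new_src = shift_src(src, prefixes, delta, common_id)
        if new_src != src:
            conn.execute("UPDATE url_alias SET src = ? WHERE pid = ?", (new_src, pid))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE url_alias (pid INTEGER PRIMARY KEY, src TEXT, dst TEXT)")
conn.executemany("INSERT INTO url_alias VALUES (?, ?, ?)", [
    (1, "node/2", "about"),
    (2, "node/4", "news/launch"),
    (3, "node/5/feed", "news/feed"),
    (4, "user/4", "people/sam"),
])

# Node aliases use the nid delta; user and blog paths would be a second
# call with prefixes ["user", "blog"] and the uid delta.
shift_aliases(conn, ["node"], delta=6, common_id=3)
sources = [r[0] for r in conn.execute("SELECT src FROM url_alias ORDER BY pid")]
print(sources)
```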
Manually
• Rename any node-nid.tpl.php files
• Search for ID vals in DB:
 • Eval’ed PHP code
 • Serialized PHP code
 • URLs
 • anything else?
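The manual search can be given a head start by flagging text that mentions an ID above the highest common one. The Python sketch below is only a rough aid: the regular expression, the `find_id_references` helper and the sample strings are all made up for illustration, and it will miss IDs that are not written as a `node/` or `user/` path.

```python
import re

# Matches path-style references such as node/12 or user/7 inside any text.
ID_REFERENCE = re.compile(r"\b(node|user)/(\d+)\b")

def find_id_references(text, common_ids):
    """Return (kind, id) pairs whose ID is above that kind's highest common ID."""
    hits = []
    for kind, number in ID_REFERENCE.findall(text):
        if int(number) > common_ids.get(kind, 0):
            hits.append((kind, int(number)))
    return hits

# Hypothetical values pulled from the database: a serialized PHP array
# and a block body containing PHP code and a link.
samples = [
    'a:1:{s:4:"path";s:7:"node/12";}',
    '<a href="/node/2">About</a> <?php print l("Me", "user/7"); ?>',
]

flagged = []
for sample in samples:
    flagged.extend(find_id_references(sample, {"node": 3, "user": 5}))
print(flagged)
```

Serialized PHP strings store their length (the `s:7:` above), so a hit inside one cannot be fixed with a plain text replace when the shifted ID has more digits.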
Step 1 Recap

• Maintain indexes for tables and fields
• Automate using the indexes
• Review indexes before each shift
• Inspect for manual cases after each shift
• Document every new case you find!
At least most of this
 can be automated!
Step 2 - Merging Content
Merging Content
[Diagram: ID stacks side by side during the merge, showing IDs 3 to 6 and 10.]
What to merge?

• Content
• Really, just the content
• No variables, settings, etc.
Need to know


• Highest Common ID (same from Step 1)
• Reference of tables
Process

• Iterate thru Prod tables:
 • Skip
 • INSERT IGNORE (I)
 • REPLACE (R)
 • DROP and INSERT (A)
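The per-table strategies can be sketched in Python with SQLite, whose `INSERT OR IGNORE` and `INSERT OR REPLACE` are the counterparts of MySQL's `INSERT IGNORE` and `REPLACE`. Everything here is illustrative: the `merge_table` helper, the plan dictionary and the sample tables are assumptions, the Dev and Prod databases are simulated as two schemas on one connection, and strategy A empties the table rather than dropping it.

```python
import sqlite3

def merge_table(conn, table, strategy):
    """Copy rows from prod.<table> into main.<table> (Dev) using one strategy."""
    if strategy == "skip":
        return
    if strategy == "I":    # keep Dev's row when the key already exists
        conn.execute(f"INSERT OR IGNORE INTO main.{table} SELECT * FROM prod.{table}")
    elif strategy == "R":  # Prod's row wins when the key already exists
        conn.execute(f"INSERT OR REPLACE INTO main.{table} SELECT * FROM prod.{table}")
    elif strategy == "A":  # throw away Dev's rows and take all of Prod's
        conn.execute(f"DELETE FROM main.{table}")
        conn.execute(f"INSERT INTO main.{table} SELECT * FROM prod.{table}")
    else:
        raise ValueError(f"unknown strategy: {strategy}")

conn = sqlite3.connect(":memory:")                  # "main" plays the Dev database
conn.execute("ATTACH DATABASE ':memory:' AS prod")  # second schema plays Prod
for schema in ("main", "prod"):
    conn.execute(f"CREATE TABLE {schema}.node (nid INTEGER PRIMARY KEY, title TEXT)")
    conn.execute(f"CREATE TABLE {schema}.users (uid INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(f"CREATE TABLE {schema}.variable (name TEXT PRIMARY KEY, value TEXT)")
    conn.execute(f"CREATE TABLE {schema}.accesslog (aid INTEGER PRIMARY KEY, path TEXT)")

conn.executemany("INSERT INTO main.node VALUES (?, ?)",
                 [(1, "home"), (2, "about (dev edit)"), (10, "new dev page")])
conn.executemany("INSERT INTO prod.node VALUES (?, ?)",
                 [(1, "home"), (2, "about"), (4, "prod post")])
conn.executemany("INSERT INTO main.users VALUES (?, ?)", [(1, "admin"), (2, "old name")])
conn.executemany("INSERT INTO prod.users VALUES (?, ?)",
                 [(1, "admin"), (2, "new name"), (4, "fan")])
conn.execute("INSERT INTO main.variable VALUES ('theme', 'new')")
conn.execute("INSERT INTO prod.variable VALUES ('theme', 'old')")
conn.execute("INSERT INTO main.accesslog VALUES (1, 'dev-hit')")
conn.executemany("INSERT INTO prod.accesslog VALUES (?, ?)", [(7, "a"), (8, "b")])

plan = {"node": "I", "users": "R", "variable": "skip", "accesslog": "A"}
for table, strategy in plan.items():
    merge_table(conn, table, strategy)

dev_nodes = conn.execute("SELECT * FROM main.node ORDER BY nid").fetchall()
dev_users = conn.execute("SELECT * FROM main.users ORDER BY uid").fetchall()
dev_vars = conn.execute("SELECT * FROM main.variable").fetchall()
dev_log = conn.execute("SELECT * FROM main.accesslog ORDER BY aid").fetchall()
print(dev_nodes, dev_users, dev_vars, dev_log)
```

Which table gets which strategy is exactly what the table reference has to record; the assignments in `plan` are examples only.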
Special Cases

• Url_alias table
• Sequences table
• Some nodes
Url_Alias table


• Don’t go by pid
• REPLACE INTO url_alias SET src = '$src',
  dst = '$dst'
Sequences table


• Manually inspect sequence values!
Node timestamps


• Get timestamp of Highest Common nid
• Check for older nodes on Prod that have
  been modified recently
Replace on Dev with

SELECT nid
FROM node
WHERE changed > $timestamp
AND nid <= $common
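A Python sketch of this check, with two in-memory SQLite databases standing in for Prod and Dev. It makes two assumptions: the timestamp of the highest common nid is that node's `created` value, and "older nodes" means nodes whose nid is at or below the highest common ID. The function names and the four-column node table are illustrative.

```python
import sqlite3

def modified_old_nodes(prod, common_id):
    """Prod nodes that existed at snapshot time but were edited afterwards."""
    (snapshot_ts,) = prod.execute(
        "SELECT created FROM node WHERE nid = ?", (common_id,)
    ).fetchone()
    return prod.execute(
        "SELECT nid, title, created, changed FROM node "
        "WHERE nid <= ? AND changed > ? ORDER BY nid",
        (common_id, snapshot_ts),
    ).fetchall()

def replace_on_dev(dev, rows):
    """Overwrite Dev's copies with Prod's newer versions."""
    dev.executemany("INSERT OR REPLACE INTO node VALUES (?, ?, ?, ?)", rows)

schema = "CREATE TABLE node (nid INTEGER PRIMARY KEY, title TEXT, created INTEGER, changed INTEGER)"
prod = sqlite3.connect(":memory:")
dev = sqlite3.connect(":memory:")
prod.execute(schema)
dev.execute(schema)

prod.executemany("INSERT INTO node VALUES (?, ?, ?, ?)", [
    (1, "home", 100, 100),
    (2, "about v2", 110, 250),   # edited on Prod after the snapshot
    (3, "contact", 200, 200),    # highest common nid, created at time 200
    (4, "prod post", 300, 300),  # newer than the snapshot, handled by the merge
])
dev.executemany("INSERT INTO node VALUES (?, ?, ?, ?)", [
    (1, "home", 100, 100), (2, "about", 110, 110), (3, "contact", 200, 200),
])

stale = modified_old_nodes(prod, common_id=3)
replace_on_dev(dev, stale)
dev_titles = [r[0] for r in dev.execute("SELECT title FROM node ORDER BY nid")]
print(stale, dev_titles)
```

Replacing a node this way discards any edit made to the same node on Dev, so the list is worth reviewing by hand before it is applied.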
That’s it... for now.
Future

• Share sequences table between Dev and
  Prod
• Even/odd IDs (Drupal 6+)
• Macro recordings and playbacks
Questions?
• Shaun Haber
  shaun.haber@wbr.com

  http://srhaber.com
  Twitter: @srhaber
