Migrate an
infrastructure to
Galera Cluster
Art van Scheppingen
Head of Database Engineering
1. Who are we?
2. What is Galera?
3. What is Spil Games using Galera for?
4. What have we learned?
5. Future technologies
6. Conclusion
7. Questions?
Overview
Who are we?
Who is Spil Games?
• Game publishers
• Company founded in 2001
• 350+ employees worldwide
• 180M+ unique visitors per month
• Over 60M registered users
• 45 portals in 19 languages
• Casual games
• Social games
• Real time multiplayer games
• Mobile games
• 40+ MySQL clusters
• 65k queries per second
• 8 Sphinx servers
• 8k queries per second
Facts
Geographic Reach
180 Million Monthly Active Users(*)
Source: (*) Google Analytics, August 2012
What is Galera?
How to get Highly Available
and beyond
1. Replication plugin for MySQL by Codership
• Synchronous (parallel) replication
• Supports InnoDB
• MyISAM “works”
• Committing transactions actually replicates data
2. Allows clustering of nodes
• Minimum of 3 nodes for HA
• Galera Arbitrator allows 2 nodes
• Primary Component
• Normally the full cluster is a PC
• Split: several components, one PC is allowed
What is Galera?
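The node-count rules on this slide follow from majority voting: a partition stays the Primary Component only while it holds strictly more than half of the votes. A minimal sketch of that rule, assuming every node (and the arbitrator) carries one vote; the function name is hypothetical and not part of Galera:

```python
def is_primary_component(reachable_votes: int, previous_votes: int) -> bool:
    """Return True when a partition holds strictly more than half of the votes."""
    return 2 * reachable_votes > previous_votes


# Three nodes, one lost: the remaining two still form a majority.
print(is_primary_component(2, 3))
# Two nodes split apart: neither side has a majority, so neither stays primary.
print(is_primary_component(1, 2))
# Two data nodes plus an arbitrator: one data node and the arbitrator keep quorum.
print(is_primary_component(2, 3))
```

This is why two data nodes alone are not highly available, while two data nodes plus Galera Arbitrator are.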
How does Galera work?
Galera
Server-1 Server-2 Server-n
MySQL MySQL MySQL
Connect/read/write
to any node
Synchronous replication
Galera synchronous replication
Galera replication
Server-n
MySQL MySQL MySQL
commit
Client receives OK
Transaction applied
to slaves
Galera vs Master-Slave replication
Server-1 Server-2 Server-n
MySQL
master
MySQL
slave
read+write read only
Asynchronous
single-threaded
replication
High Availability (1)
Galera
Server-1 Server-2 Server-n
MySQL MySQL MySQL
High Availability (2)
Galera
Server-1 Server-2 Server-n
Load balancer / Query router
MySQL MySQL MySQL
High Availability (3)
Galera
Server-1
Load balancer
MySQL MySQL MySQL
Server-2
Load balancer
Server-n
Load balancer
High Availability (4)
Galera
Server-1 Server-2 Server-n
Load balancer (port 3306 read, 3307 write)
MySQL MySQL MySQL
read+write read only read only
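The split between one writable node and several readable nodes can be sketched as a small selection function. This is an illustration only, assuming the load balancer health check reads `wsrep_local_state` (where 4 means Synced) from each node; the function and the node names are hypothetical:

```python
from typing import Dict, List, Optional, Tuple

SYNCED = 4  # wsrep_local_state value for a node in the Synced state


def pick_backends(node_states: Dict[str, int]) -> Tuple[Optional[str], List[str]]:
    """Return (writer, readers) from a mapping of node name to wsrep_local_state."""
    # Only synced nodes receive traffic; sorting keeps the writer choice stable.
    healthy = sorted(name for name, state in node_states.items() if state == SYNCED)
    writer = healthy[0] if healthy else None
    return writer, healthy


# server-3 is in state 2 (Donor/Desynced), so it is left out of both pools.
print(pick_backends({"server-1": 4, "server-2": 4, "server-3": 2}))
```

Sending all writes to a single node avoids write conflicts between nodes, while reads can still be spread over every synced node.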
Galera Galera
Server-1 Server-2 Server-n
Load balancer
read/write to two nodes
New
node
SST
Requests to
join cluster
Cluster drains
node
MySQL MySQL MySQL
Synchronous replication
MySQL
Node joining SST (State Snapshot Transfer)
Galera Galera
Server-1 Server-2 Server-n
Load balancer
Existing
node
IST
Requests to
join cluster
MySQL MySQL MySQL
Synchronous replication
MySQL
Node joining IST (Incremental State Transfer)
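Whether a joining node gets an IST or a full SST depends on how far behind it is. A simplified sketch of that decision, assuming the joiner position comes from its saved state (UUID and seqno, where -1 means unknown) and the donor's oldest cached write-set from `wsrep_local_cached_downto`; the function name is hypothetical and the real logic lives inside Galera:

```python
def choose_state_transfer(joiner_uuid: str, joiner_seqno: int,
                          cluster_uuid: str, donor_cached_downto: int) -> str:
    """Return "IST" when the donor can replay the missing write-sets, else "SST"."""
    if joiner_uuid != cluster_uuid:
        return "SST"  # the node belongs to a different cluster history
    if joiner_seqno < 0:
        return "SST"  # the node's position is unknown, for example after a crash
    # The first write-set the joiner misses must still be in the donor's cache.
    if joiner_seqno + 1 >= donor_cached_downto:
        return "IST"
    return "SST"


print(choose_state_transfer("abc", 100, "abc", 50))
print(choose_state_transfer("abc", 10, "abc", 50))
```

A short restart usually stays within the donor's cache and gets an IST; a new node, or one that was away too long, needs an SST.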
Galera replication over WAN
Galera replication
Server-n
MySQL MySQL
DC1 DC2
Commit is delayed by RTT
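As a back-of-the-envelope illustration of the RTT delay: if every commit waits for at least one round trip between the data centers, a single connection that commits sequentially cannot exceed 1/RTT commits per second. A rough sketch that ignores local processing time (the function name is hypothetical):

```python
def max_commits_per_second(rtt_ms: float) -> float:
    """Upper bound on sequential commits per connection when each waits one RTT."""
    if rtt_ms <= 0:
        raise ValueError("rtt_ms must be positive")
    return 1000.0 / rtt_ms


# With a 50 ms round trip one connection tops out at about 20 commits per second.
print(max_commits_per_second(50.0))
```

Total throughput can still scale with the number of concurrent connections, since commits from different connections replicate in parallel.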
WAN replication Galera 2.x
DC1 DC2
Node 1
Node 2 Node 3
Node 4
Node 5 Node 6
WAN replication Galera 3.x
DC1 DC2
Node 1
Node 2 Node 3
Node 4
Node 5 Node 6
What are we
using Galera for?
Synchronous replication for
the masses
1. Legacy services databases
• MySQL Master-Master
2. SSP (Spil Storage Platform)
• MySQL Master-Master (last pair remaining)
• Galera
3. ROAR (Read Often, Alter Rarely)
• Galera / Sphinx
Our systems
Master-Master setup used at Spil Games
Server-1 Server-2 Server-n
MySQL
active
master
MySQL
inactive
master
Asynchronous replication
read+write read only
MMM
db-something (192.168.1.1)
db-something-r1 (192.168.1.2)
db-something-r2 (192.168.1.3)
Master-Master setup used at Spil Games
MySQL
active
master
MySQL
inactive
master
read+write read only
MMM
db-something (192.168.1.1)
db-something-r1 (192.168.1.2)
db-something-r2 (192.168.1.3)
MySQL
slave
MySQL
slave
db-something-r3 (192.168.1.4) db-something-r4 (192.168.1.5)
read only
Migrating legacy dbs to Galera (lab)
Galera
MySQL MySQL MySQL
legacy1
inactive
master
MySQL
Clone database
(innobackupex) Feed database dump
(mysqldump)
Start slaving
legacy3
inactive
master
legacy2
inactive
master
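The slide does not show how slaving is started after the clone. One common approach, sketched here as an assumption, is to read the binlog coordinates that the backup tool records in its `xtrabackup_binlog_info` file and turn them into a `CHANGE MASTER TO` statement. The helper name is hypothetical and replication credentials are left out on purpose:

```python
def change_master_statement(binlog_info: str, master_host: str) -> str:
    """Build a CHANGE MASTER TO statement from xtrabackup_binlog_info contents."""
    # The file holds the binlog file name and position, separated by whitespace.
    fields = binlog_info.split()
    log_file, log_pos = fields[0], int(fields[1])
    return (
        f"CHANGE MASTER TO MASTER_HOST='{master_host}', "
        f"MASTER_LOG_FILE='{log_file}', MASTER_LOG_POS={log_pos}"
    )


# Example coordinates are made up for illustration.
print(change_master_statement("mysql-bin.000042\t1337\n", "legacy1"))
```

Running the resulting statement on one Galera node, followed by `START SLAVE`, would make that node an asynchronous slave of the legacy master while the cluster replicates the changes to the other nodes.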
Scaling Galera (1)
Server-1 Server-2 Server-n
Load balancer (port 3306 write+read)
Galera Galera Galera
MySQL MySQL MySQL MySQL MySQL
Scaling Galera (2)
Server-1 Server-2 Server-n
Load balancer (port 3306 write, 3307 read)
Galera
MySQL MySQL MySQL MySQL MySQL
read only read only
asynchronous
replication
asynchronous
replication
1. Around 20 legacy database clusters
• 50 servers in total
2. Maintenance
• Master-Master requires a lot of (manual)
maintenance
3. Replacement is needed
• 35 of them will be older than 3 years in 2014
4. Current state: 1 pair migrated to a cluster in
Openstack
Why consolidate legacy systems?
• Storage API between application and databases
• All data is sharded
• User
• Function
• Location
• Initial design:
• Every cluster (two masters) will contain two shards
• Data written interleaved
• HA for both shards
• Both masters active and “warmed up”
SSP (Spil Storage Platform)
SSP
Shard 1 Shard 2
SSP Master-Master setup
Server-1 Server-2 Server-n
MySQL
active
master
MySQL
active
master
Asynchronous replication
read+write read+write
MMM
db-ssp001 (192.168.2.1) db-ssp002 (192.168.2.2)
SSP Master-Master setup
Server-1 Server-2 Server-n
MySQL
broken
master
MySQL
active
master
read+write
MMM
db-ssp001 (192.168.2.1)
db-ssp002 (192.168.2.2)
SSP Galera setup
Galera
Server-1 Server-2 Server-n
Load balancer
MySQL MySQL MySQL
read/write to any node
Synchronous replication
1. Total of 2 old style SSP shard nodes (1 pair)
2. Total of 6 Galera SSP shard nodes (2 clusters)
3. 6 more nodes are planned for Q2
4. Add Galera nodes/clusters when necessary
Current state of the SSP
What have we
learned so far?
Pitfalls, hurdles, etc
Creating backups
1. Two ways to make backups:
• Issue SST
• Either mysqldump or Innobackupex
• Regular Innobackupex
• --galera-info
• set global wsrep_desync=on to remove node
Backup SST
Galera
Server-1 Server-2 Server-n
Load balancer
MySQL MySQL MySQL
read/write to two nodes
Synchronous replication
Backup
receiver
SST
Request SST
Cluster drains
node
Backup
done
Backup Innobackupex
Galera
Server-1 Server-2 Server-n
Load balancer
MySQL MySQL MySQL
read/write to two nodes
Synchronous replication
BackupPC
wsrep_desync=ON
Stream backup
wsrep_desync=OFF
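The desync, backup, resync sequence in this diagram maps naturally onto a context manager, so the node is resynced even when the backup fails. A minimal sketch, assuming a DB-API style cursor with an `execute` method; `RecordingCursor` is a stand-in for demonstration only and the backup command itself is left out:

```python
from contextlib import contextmanager


class RecordingCursor:
    """Stand-in for a DB-API cursor that only records statements (demo helper)."""

    def __init__(self):
        self.statements = []

    def execute(self, sql):
        self.statements.append(sql)


@contextmanager
def desynced(cursor):
    """Desync the node for the duration of the block, then resync it."""
    cursor.execute("SET GLOBAL wsrep_desync = ON")
    try:
        yield cursor
    finally:
        # Always switch desync off again, even if the backup raised an error.
        cursor.execute("SET GLOBAL wsrep_desync = OFF")


cursor = RecordingCursor()
with desynced(cursor):
    pass  # the streaming backup would run here
print(cursor.statements)
```

With a real connection the same `with desynced(cursor):` block would wrap the call that streams the backup to the backup server.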
Restoring backups
1. Restored backup can be used to prevent SST of new
joiners
2. Automated backup verification
• Restores (randomly) chosen backup
• Installs necessary MySQL version (5.1/5.5)
• Perform basic checks
• Enable replication
• Will not work fully as it needs a working
cluster to join
Monitoring
1. Cluster
• Nodes in the cluster
• Warning at 2, critical at 1
• Availability of the address
2. Load balancer
• Node checks
• Write only on one node
3. Performance monitoring
• Adding metrics to mysql_statsd is easy
• wsrep_flow_control
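The cluster-size thresholds above translate directly into a Nagios-style check. A small sketch, assuming the values come from the `wsrep_cluster_size` and `wsrep_cluster_status` status variables and that the usual exit codes 0, 1 and 2 are used; the function name is hypothetical:

```python
OK, WARNING, CRITICAL = 0, 1, 2


def cluster_size_status(cluster_size: int, cluster_status: str = "Primary") -> int:
    """Map cluster size and status to a monitoring state: warning at 2, critical at 1."""
    # A node outside the Primary Component is critical regardless of size.
    if cluster_status != "Primary" or cluster_size <= 1:
        return CRITICAL
    if cluster_size == 2:
        return WARNING
    return OK


print(cluster_size_status(3))
print(cluster_size_status(2))
print(cluster_size_status(1))
```

Treating a non-primary status as critical is an addition to the slide's thresholds; it catches a node that has dropped out of the Primary Component even though it still counts other members.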
Flow control
1. Usage of replication threads
• Scale from 0.0 to 1.0
2. Recommended to stay below 0.1 (10% blocked)
3. Adding more nodes will not solve your problem
4. Increase replication threads
• Recommended 2*CPU cores
• What if 64 is not enough?
• How do you close flood gates?
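The two rules of thumb on this slide can be written down as small helpers. A sketch, assuming the 0.0 to 1.0 value is read from the `wsrep_flow_control_paused` status variable and that the thread count would be applied through `wsrep_slave_threads`; both function names are hypothetical:

```python
def flow_control_blocked(paused_fraction: float, threshold: float = 0.1) -> bool:
    """Return True when replication is paused for at least the threshold fraction."""
    return paused_fraction >= threshold


def recommended_applier_threads(cpu_cores: int) -> int:
    """Rule of thumb from the slide: twice the number of CPU cores."""
    return 2 * cpu_cores


print(flow_control_blocked(0.25))
print(flow_control_blocked(0.05))
print(recommended_applier_threads(16))
```

The helpers only flag the condition and suggest a starting value; they do not answer the open questions on the slide about what to do once more threads stop helping.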
Other things we bumped into…
1. MySQL version updates
• Update one by one
• PXC SST changes
2. Availability after restart
• Joins cluster after IST/SST
• LRU still loading
3. Nondescriptive errors during SST
• Local user authentication (after starting mysqld
with sudo!)
4. Schema changes
Future for Galera
at Spil Games
What will we do in the near
future?
Openstack
1. Offer DAAS to our (internal) customers
2. Spawning (automated) database nodes and clusters
when necessary
3. Mix and match Galera and regular MySQL
replication
WAN Replication
1. No immediate use case (yet)
• No need for WAN in sharded environment
• Game catalogue cluster needs it in the future
2. Test Galera 3.0
• Datacenter awareness
MaxScale
1. Beta testing MaxScale for SkySQL
• Works flawlessly in the lab (so far)
• Not yet tested with mixed Galera/MySQL
replication
2. MaxScale itself is not HA (yet)
• Keepalived?
Conclusion
What is our verdict?
Conclusion(s)
1. Galera definitely lives up to expectations
2. Decreased cluster-wide performance
3. Increased replication performance
4. High investment in time for initial setup/tools
5. Maintenance is easier
6. Well worth the investment for us
Questions?
• Building globally available storage layers
Friday 1:50PM - 2:40PM @ Ballroom F
• MaxScale, the pluggable router
Today 4:50PM - 5:40PM @ Ballroom A
• Geographical Disaster recovery challenges and solutions
Today 4:50PM - 5:40PM @ Ballroom C
• Write Conflicts in Multi-Master Replication Topologies
Thursday 4:30PM - 5:20PM @ Ballroom H
Other (Galera-related) talks to attend
• Presentation can be found at:
http://spil.com/plsc2014galera
• If you wish to contact me:
Email: art@spilgames.com
Twitter: @banpei
• Engineering @ Spil Games
Blog: http://engineering.spilgames.com
Twitter: @spilengineering
Thank you!
Our current HA environment:
http://thinkaurelius.com/2013/03/30/titan-server-from-a-single-server-to-a-highly-available-cluster/
What we have learned so far:
http://renaissanceronin.wordpress.com/2009/10/05/playing-with-plasma-cutters/
Near future:
http://www.example-infographics.com/envisioning-the-near-future-of-technology/
Conclusion:
http://www.flickr.com/photos/louisephotography/5796499806/in/photostream/
Photo sources



Editor's notes

  1. so that may be the reason our name is not widely known.