Real Life Scaling:
      A Tale of Two Websites

By Justin Carmony - Utah Open Source Conference 2009
“It was the best of times, it was
the worst of times...”
           - A Tale of Two Cities, Charles Dickens
About Me
• Website Hobbyist since 1997
• Professional Since 2005
• Worked in Large Open Source & Proprietary Solutions
• Full Time Contractor & Consultant
• Sponsorship Manager for UTOSC
• Blog: http://www.justincarmony.com/blog/
This Presentation
• Point You In The Right Direction
 ‣ Not a Substitute for Research & Homework
• Developer’s Perspective, Not So Much Sys Admin
• Q & A Session Afterwards
• Available for Questions Remainder of Conference
• Ask via Email: justin@justincarmony.com
The Scaling Conundrum
  Because No One Likes Premature Escalation
The Scale of Scaling
    [Chart: # of Websites plotted against Traffic]
The Problem:
Many Developers Skip “Medium”
   and go Straight to “XXL”
The Solution:
Implement Reasonable Solutions
    For Your Website’s Size
Scalability vs Performance
    The Difference Is Important... Yet Useless
Performance Before Scaling
        Except With Limitations
“Zero Theory”
200 Current Visitors   2,000 Current Visitors
500 New Sign-Ups       5,000 New Sign-Ups
300 Messages Sent      3,000 Messages Sent
1,000 Users            10,000 Users
10,000 Users           100,000 Users
100,000 Users          1,000,000 Users
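One way to read the “Zero Theory” slide, sketched in Python: take today's numbers, add a zero, and ask which part of the site would give out first. The metric values come from the slide; the `add_a_zero` and `over_capacity` helpers and every capacity figure are hypothetical, invented only for illustration.

```python
# Today's numbers, taken from the slide.
current = {
    "current_visitors": 200,
    "new_sign_ups": 500,
    "messages_sent": 300,
    "users": 1_000,
}

# Hypothetical capacity limits, invented for this sketch.
capacity = {
    "current_visitors": 1_500,
    "new_sign_ups": 10_000,
    "messages_sent": 5_000,
    "users": 50_000,
}


def add_a_zero(metrics):
    """Return the same metrics at ten times the load."""
    return {name: value * 10 for name, value in metrics.items()}


def over_capacity(metrics, limits):
    """List the metrics that would exceed their assumed limit."""
    return sorted(name for name, value in metrics.items() if value > limits[name])


projected = add_a_zero(current)
print(projected)
print("Would break first:", over_capacity(projected, capacity))
```

With these made-up limits, only concurrent visitors would exceed capacity at ten times the load, which is the kind of early warning the exercise is meant to produce.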
Tale #1: Cyber Evolution
Website: www.cevo.com

Online Gaming League

Co-Developed w/ Eric Ping

Communication Between Many
“Clients” & “Servers”

Great Growth Over 4 Years

Learning by Fire for Myself
Tale #2: Dating DNA
Website: http://www.datingdna.com

Online Free Dating Service

#1 iPhone Dating App

Overnight 1,000% Growth

Continuing to Grow Rapidly

Unique Scaling Challenges
#1 Challenge: The Database
•   80% of Performance Issues were
    Database Related

•   Issues Aren’t Noticeable Until
    After Growth & High Load

•   Most Issues Were Due to Poor
    Queries and/or Poor Indexes
Database Abuse!
Just Because You Can, Doesn’t Mean You Should
Database Abuse
• Excessive Logging (instead of using rotated files, etc)
• Excessive Writes (INSERT, UPDATE, REPLACE, DELETE)
• Sheer Volume of Repetitive Queries
• Non-Indexed Searches
• Sub-Queries Instead of JOINs
Prevention: Learn About Database Design
      This is NOT for DBAs Only
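To make “Non-Indexed Searches” concrete, here is a minimal sketch that asks the database how it plans to run a lookup before and after adding an index. It uses Python's built-in `sqlite3` module as a stand-in for MySQL, purely so the example is self-contained; the talk's sites ran MySQL, where `EXPLAIN` plays a similar role. The `users` table, its columns, and the index name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)

query = "SELECT * FROM users WHERE email = ?"
params = ("user500@example.com",)


def query_plan(connection, sql, args):
    """Return SQLite's plan description for a query as one string."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, args).fetchall()
    return " | ".join(row[-1] for row in rows)


plan_before = query_plan(conn, query, params)  # no index yet: full table scan
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = query_plan(conn, query, params)   # lookup goes through the index

print("before:", plan_before)
print("after: ", plan_after)
```

The exact wording of the plan varies between SQLite versions, but the first plan reports a scan of the table and the second names `idx_users_email`.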
Example: Custom Forums
•   In Our Defense, We Made This In
    About 24 Hours

•   Nested Queries = 1000+ Queries
    Per Page - Not Joking

•   10-Second Load Times

•   End Result: Users Hated It
Example: Standings Page
• Before:
 ‣ Nested Queries To Determine
   Stats for Each Row
 ‣ Regenerated Every Time a Match
   Was Updated

• After:
 ‣ Generate Every 30 Minutes
 ‣ Proper JOINs, Only One Query
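The before/after difference can be sketched as follows. This is not CEVO's schema: the `teams` and `matches` tables are hypothetical, and Python's built-in `sqlite3` module stands in for MySQL so the example runs on its own. The first function issues one query per row; the second gets the same standings from a single JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE teams (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE matches (id INTEGER PRIMARY KEY, winner_id INTEGER);
    INSERT INTO teams (id, name) VALUES (1, 'Red'), (2, 'Blue'), (3, 'Green');
    INSERT INTO matches (winner_id) VALUES (1), (1), (2);
""")


def standings_nested(connection):
    """The 'before' pattern: one extra query per row (1 + N queries)."""
    result = []
    teams = connection.execute("SELECT id, name FROM teams").fetchall()
    for team_id, name in teams:
        wins = connection.execute(
            "SELECT COUNT(*) FROM matches WHERE winner_id = ?", (team_id,)
        ).fetchone()[0]
        result.append((name, wins))
    return sorted(result, key=lambda row: (-row[1], row[0]))


def standings_joined(connection):
    """The 'after' pattern: a single query with a JOIN and GROUP BY."""
    return connection.execute("""
        SELECT t.name, COUNT(m.id) AS wins
        FROM teams AS t
        LEFT JOIN matches AS m ON m.winner_id = t.id
        GROUP BY t.id, t.name
        ORDER BY wins DESC, t.name
    """).fetchall()


print(standings_nested(conn))
print(standings_joined(conn))
```

Both functions return the same rows. With three teams the difference is four queries versus one; with hundreds of rows per page the nested version is where the load comes from. Regenerating on a schedule, as the slide describes, would then mean running the single query periodically and storing its result.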
MySQL Locks
• Problem: Popular Tables Using MyISAM
• Solutions:
 ‣ Switched to InnoDB
 ‣ Optimized Logic to Reduce Writes
 ‣ Reduced JOINs in Queries
Database Replication
Complicated, Major Trade-Offs, Requires Preparation
Open APIs
•   Can Get Heavily Abused

•   Can Grow Unexpectedly

•   Must Be Extremely Efficient

•   CEVO’s APIs
    ‣ 100,000s of Player Lookups
    ‣ Thousands of CMN Players
    ‣ Match History Calls
Identifying What’s Unique
•   “What Unique Part of Your Site
    Will be Complicated to Scale?”

•   Important to Identify Early

•   Allows you to Plan for the Future

•   Many Solutions for the Common
    Stuff to Scale
Dating DNA’s Compatibility Score Growth
    # of Score Records = $X^2 - X$, where X = # of Users
# of Score Records
10 Users                     90 Records
200 Users                39,800 Records
5,000 Users          24,995,000 Records
25,000 Users        624,975,000 Records
300,000 Users     89,999,700,000 Records
1,000,000 Users   One Trillion Records!
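The table can be reproduced from the growth formula on the preceding slide, X^2 - X with X the number of users. A short Python check (the function name `score_records` is just a label for this sketch):

```python
def score_records(users):
    """Records needed when every user is scored against every other user."""
    return users * users - users  # X^2 - X, equivalently X * (X - 1)


for users in (10, 200, 5_000, 25_000, 300_000, 1_000_000):
    print(f"{users:>9,} users -> {score_records(users):>17,} records")
```

The last row works out to 999,999,000,000 records, which the slide rounds to one trillion.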
How We Solved It
• Introduce Limits on Records per User
• Only Save Decent Scores
• Smart Logic to Predetermine Good Matches
• Shard Table Into Multiple Tables
• We’re Still Managing & Finding More Optimizations
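A rough sketch of how three of these ideas could fit together: a cutoff for “decent” scores, a cap on records per user, and a rule for picking a shard table. The slides do not say how Dating DNA chose its limits or its sharding key, so the cutoff, the cap, the shard count, the table naming, and the modulo-on-user-ID rule are all assumptions made for illustration.

```python
import heapq

MIN_SCORE = 60      # hypothetical cutoff for a "decent" score
MAX_PER_USER = 3    # hypothetical limit on records kept per user
NUM_SHARDS = 4      # hypothetical number of score tables


def shard_table(user_id, num_shards=NUM_SHARDS):
    """Pick the table that holds a user's scores (simple modulo sharding)."""
    return f"scores_{user_id % num_shards}"


def scores_to_save(candidate_scores, min_score=MIN_SCORE, limit=MAX_PER_USER):
    """Keep only decent scores, capped at the best `limit` per user."""
    decent = [(other, s) for other, s in candidate_scores.items() if s >= min_score]
    return heapq.nlargest(limit, decent, key=lambda pair: pair[1])


# Scores for one user against six others (made-up numbers).
candidates = {101: 92, 102: 35, 103: 77, 104: 61, 105: 88, 106: 59}
print(shard_table(42), scores_to_save(candidates))
```

With a fixed cap per user, the number of stored records grows linearly with the user count instead of quadratically, which is the point of the first two bullets.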
User Uploaded Content
• Multiple Aspects to Consider
 ‣ Storage (File Size)
 ‣ Serving (File Sizes, Bandwidth, Redundancy)
 ‣ Backups (Time, Speed)
• Challenges
 ‣ Content Outgrowing Server
 ‣ Serving Multiple Versions
Memcached
• Scales Much Easier than Traditional Databases
• Reduced Database Load by 70%
• Implement Quickly -- It’s Awesome!
• Gave a Presentation on It @ UPHPU
 ‣ Check My Blog for Recording
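The usual way a cache takes load off the database is the cache-aside pattern: look in the cache first and only query the database on a miss. The sketch below assumes that pattern, which the slides do not spell out. `FakeCache` is an in-memory stand-in written for this example rather than a real memcached client, and `load_profile_from_db` is a hypothetical placeholder for a database query.

```python
import time


class FakeCache:
    """In-memory stand-in for a memcached client (get/set with expiry)."""

    def __init__(self):
        self._items = {}

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        return value if time.monotonic() < expires_at else None

    def set(self, key, value, ttl_seconds):
        self._items[key] = (value, time.monotonic() + ttl_seconds)


db_queries = 0


def load_profile_from_db(user_id):
    """Hypothetical slow database lookup; counts how often it runs."""
    global db_queries
    db_queries += 1
    return {"id": user_id, "name": f"User {user_id}"}


def get_profile(cache, user_id, ttl_seconds=300):
    """Cache-aside read: try the cache first, fall back to the database."""
    key = f"profile:{user_id}"
    profile = cache.get(key)
    if profile is None:
        profile = load_profile_from_db(user_id)
        cache.set(key, profile, ttl_seconds)
    return profile


cache = FakeCache()
for _ in range(10):
    get_profile(cache, 7)
print("database queries for 10 page views:", db_queries)
```

Ten reads of the same profile cost one database query here; the other nine are served from the cache until the entry expires.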
Hardware
• Be careful of “Throwing Hardware” at Problems
• Run on Realistic Hardware
• Developers, You Need To Be Cost Conscious
• Can You Afford to Scale Up? How About Scale Down?
Other Challenges
•   Apache Configurations
    ‣ MaxClients Limit Reached

•   MySQL Configurations
    ‣ Max Connections Reached

•   Network Communication
    ‣ Saturated Bandwidth Between
      Servers
Tools I’ve Used
• MySQL - JetProfiler (No FOSS, Hopefully One Day)
• Server Performance - top, htop, atop, iotop
• PHP - Zend Debugger & Profiler, Xdebug
• FirePHP & Firebug
So... Where’s The Scaling?
Getting Ready To Scale
• Compartmentalize “Parts” Into “Components”
• Create “Scaling Strategy” For Each “Component”
• Keep It Simple, Complexity == Complications
• Create Metrics to Monitor the Health of the
  Components
• “Zero Theory” For Each Component
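As a sketch of what this checklist might look like written down, each component below carries a scaling strategy and one health metric, and the “Zero Theory” check is applied per component. The component names are taken from the Dating DNA diagram; the strategies, metrics, readings, and limits are hypothetical, and treating every metric as growing linearly with load is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    scaling_strategy: str   # what to do when the component runs hot
    metric: str             # the health metric to watch
    current: float          # latest reading of that metric
    limit: float            # reading at which the component is in trouble

    def healthy(self, load_factor=1.0):
        """True if the metric stays under its limit at the given load."""
        return self.current * load_factor < self.limit


# All numbers below are invented for illustration.
components = [
    Component("Web Servers", "add a server behind the load balancer",
              "requests/sec", 120, 2_000),
    Component("Database Cluster", "add a read replica",
              "queries/sec", 900, 5_000),
    Component("Memcache Cluster", "add a cache node",
              "items stored (millions)", 2, 50),
]

for c in components:
    status = "ok" if c.healthy(load_factor=10) else c.scaling_strategy
    print(f"{c.name}: {c.metric} at 10x -> {status}")
```

With these made-up readings, only the database cluster fails the ten-times check, so its scaling strategy is the one to prepare first.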
Makes Components of Dating DNA
[Diagram: the Dating DNA Application sits in the center, surrounded by Web Servers, the Jobs / Cron System, the Score Generation System, User Uploaded Content, the Database Cluster, and the Memcache Cluster. In the final build the center is relabeled “Communication”.]
Scaling Components
[Diagram, built up in steps: the Web Application starts on Server #1 and is split into components A1 through E1. Server #2 is added and takes D1 and E1, then a second copy of A (A2). Servers #3 and #4 are added and receive further copies. Final layout:]

Server #1      Server #2   Server #3   Server #4
Comp C1        Comp A2     Comp C2
Comp B1        Comp E1     Comp B2     Comp B3
Comp A1        Comp D1     Comp A3     Comp A4
Scaling Components
• Should Be Able to “Live” With Each Other
• Deployment & Management all Automated
 ‣ Not Necessarily “Automatic”
• Fault Tolerance
• Testing & Staging Environments
Scaling Web Servers
• Challenges
 ‣ Routing
 ‣ Sessions
• Ideas
 ‣ Separate Dynamic & Static
 ‣ Use Sub Domains
[Diagram: a Load Balancer in front of two Web Servers, which share a Cache and a DB]
Asynchronous Score Generation
• Pass Message To “Score Server”
 ‣ Pass User ID Only
• Quickly Generate a Small Set of Scores
• Queue User for Full Process
 ‣ Spawn Small Generation
• Update “MEMORY” MySQL Table on Status
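The flow above can be sketched in a single process. In the talk's system a message goes to a separate score server and status lives in a MySQL MEMORY table; here a `deque` stands in for the queue and a dict for the status table, both assumptions made to keep the example runnable. The `score` function is a placeholder, since the real compatibility formula is not in the slides, and `QUICK_BATCH` is a hypothetical size for the small set.

```python
from collections import deque

QUICK_BATCH = 3  # hypothetical size of the "small set" generated right away

status = {}            # stand-in for the MEMORY status table
full_queue = deque()   # users waiting for the full process


def score(user_id, other_id):
    """Placeholder compatibility score; the real formula is not in the slides."""
    return (user_id * 31 + other_id * 17) % 101


def handle_message(user_id, all_user_ids):
    """Score server entry point: receives only a user ID."""
    others = [u for u in all_user_ids if u != user_id]
    quick = {o: score(user_id, o) for o in others[:QUICK_BATCH]}
    status[user_id] = "queued"
    full_queue.append(user_id)
    return quick


def run_worker(all_user_ids):
    """Drain the queue, generating the full score set for each user."""
    results = {}
    while full_queue:
        user_id = full_queue.popleft()
        status[user_id] = "generating"
        results[user_id] = {o: score(user_id, o) for o in all_user_ids if o != user_id}
        status[user_id] = "done"
    return results


users = list(range(1, 11))
quick_scores = handle_message(1, users)
print(len(quick_scores), status[1])
full_scores = run_worker(users)
print(len(full_scores[1]), status[1])
```

The caller gets a handful of scores back immediately, while the expensive full pass happens later and reports its progress through the status store.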
The Open Source Advantage
• Choice & Options w/ Cost
• Solid Applications, Tested & Proven
• Great Communities
• Give Back, Karma++
Last Thoughts
• Once again, K.I.S.S
• Proactive > Reactive
• Monitoring, Monitoring, Monitoring!
• The “Cloud”
• Get Advice from the Community
Questions?
Thank You
Website   www.justincarmony.com

  Email   justin@justincarmony.com

Twitter   JustinCarmony

 Skype    JustinCarmony

   IRC    irc.freenode.net #uphpu

Real Life Scaling: A Tale of Two Websites

  • 1. Real Life Scaling: A Tale of Two Websites By Justin Carmony - Utah Open Source Conference 2009
  • 2. “It was the best of times, it was the worst of times...” - A Tale of Two Cities, Charles Dickens
  • 3. About Me • Website Hobbyist since 1997 • Professional Since 2005 • Worked in Large Open Source & Proprietary Solutions • Full Time Contractor & Consultant • Sponsorship Manager for UTOSC • Blog: http://www.justincarmony.com/blog/
  • 4. This Presentation • Point You In The Right Direction ‣ Not a Substitute for Research & Homework • Developers Perspective, Not So Much Sys Admin • Q & A Session Afterwards • Available for Questions Remainder of Conference • Ask via Email: justin@justincarmony.com
  • 5. The Scaling Conundrum Because No One Likes Premature Escalation
  • 6. The Scale of Scaling Traffic
  • 7. The Scale of Scaling # of Websites Traffic
  • 8. The Scale of Scaling # of Websites Traffic
  • 9. The Problem: Many Developers Skip “Medium” and go Straight to “XXL”
  • 10. The Solution: Implement Reasonable Solutions For Your Website’s Size
  • 11. Scalability vs Performance The Difference Is Important... Yet Useless
  • 12. Performance Before Scaling Except With Limitations
  • 14. “Zero Theory” 200 Current Visitors 500 New Sign-Ups 300 Messages Sent 1,000 Users 10,000 Users 100,000 Users
  • 15. “Zero Theory” 200 Current Visitors 2,000 Current Visitors 500 New Sign-Ups 5,000 New Sign-Ups 300 Messages Sent 3,000 Messages Sent 1,000 Users 10,000 Users 10,000 Users 100,000 Users 100,000 Users 1,000,000 Users
  • 16. Tale #1: Cyber Evolution Website: www.cevo.com Online Gaming League Co-Developed w/ Eric Ping Communication Between Many “Clients” & “Servers” Great Growth Over 4 Years Learning by Fire for Myself
  • 17. Tale #2: Dating DNA Website: http://www.datingdna.com Online Free Dating Service #1 iPhone Dating App Overnight 1,000% Growth Continuing to Grow Rapidly Unique Scaling Challenges
  • 18. #1 Challenge: The Database
  • 19. #1 Challenge: The Database • 80% of Performance Issues were Database Related
  • 20. #1 Challenge: The Database • 80% of Performance Issues were Database Related • Issues Aren’t Noticeable Until After Growth & High Load
  • 21. #1 Challenge: The Database • 80% of Performance Issues were Database Related • Issues Aren’t Noticeable Until After Growth & High Load • Most Issues Were Due to Poor Queries and/or Poor Indexes
  • 22. Database Abuse! Just Because You Can, Doesn’t Mean You Should
  • 23. Database Abuse • Excessive Logging (instead of using rotated files, etc) • Excessive Writes (INSERT, UPDATE, REPLACE, DELETE) • Shear Volume of Repetitive Queries • Non-Indexed Searches • Sub-Queries Instead of JOINs
  • 24. Database Abuse • Excessive Logging (instead of using rotated files, etc) • Excessive Writes (INSERT, UPDATE, REPLACE, DELETE) • Shear Volume of Repetitive Queries • Non-Indexed Searches • Sub-Queries Instead of JOINs Prevention: Learn About Database Design
  • 25. Database Abuse • Excessive Logging (instead of using rotated files, etc) • Excessive Writes (INSERT, UPDATE, REPLACE, DELETE) • Shear Volume of Repetitive Queries • Non-Indexed Searches • Sub-Queries Instead of JOINs Prevention: Learn About Database Design This is NOT for DBAs Only
  • 26. Example: Custom Forums • In Our Defense, We Made This In About 24 Hours • Nested Queries = 1000+ Queries Per Page - Not Joking • 10-Second Load Times • End Result: Users Hated It
  • 27. Example: Standings Page • Before: ‣ Nested Queries To Determine Stats for Each Row ‣ Regenerated Every Time a Match Was Updated • After: ‣ Generate Every 30 Minutes ‣ Proper JOINs, Only One Query
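The before/after on the standings page can be sketched roughly as follows. This is not CEVO's actual schema or code: the `teams` and `matches` tables are hypothetical, and SQLite (via Python's sqlite3) stands in for MySQL. The first function issues one query per row, the second gets the same result from a single JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE teams (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE matches (id INTEGER PRIMARY KEY, team_id INTEGER, won INTEGER);
    INSERT INTO teams (id, name) VALUES (1, 'Alpha'), (2, 'Bravo'), (3, 'Charlie');
    INSERT INTO matches (team_id, won) VALUES
        (1, 1), (1, 1), (1, 0), (2, 1), (2, 0), (3, 0);
""")

def standings_nested():
    """The 'before' shape: one query for the teams, then one more per row."""
    queries = 1
    result = []
    teams = conn.execute("SELECT id, name FROM teams ORDER BY id").fetchall()
    for team_id, name in teams:
        wins = conn.execute(
            "SELECT COUNT(*) FROM matches WHERE team_id = ? AND won = 1",
            (team_id,),
        ).fetchone()[0]
        queries += 1
        result.append((name, wins))
    return result, queries

def standings_joined():
    """The 'after' shape: a single JOIN with aggregation."""
    rows = conn.execute("""
        SELECT t.name, COALESCE(SUM(m.won), 0) AS wins
        FROM teams AS t
        LEFT JOIN matches AS m ON m.team_id = t.id
        GROUP BY t.id, t.name
        ORDER BY t.id
    """).fetchall()
    return rows, 1

nested_rows, nested_queries = standings_nested()
joined_rows, joined_queries = standings_joined()
print(nested_queries, "queries vs", joined_queries)
```

With three teams the nested version already needs four queries, and the count grows with every row on the page. Storing the joined result and regenerating it on a timer (every 30 minutes in the talk) removes even that single query from most page loads.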
  • 28. MySQL Locks • Problem: Popular Tables Using MyISAM • Solutions: ‣ Switched to InnoDB ‣ Optimized Logic to Reduce Writes ‣ Reduced JOINs in Queries
  • 29. Database Replication: Complicated, Major Trade-Offs, Requires Preparation
  • 30. Open APIs • Can Get Heavily Abused • Can Grow Unexpectedly • Must Be Extremely Efficient • CEVO’s APIs ‣ 100,000s of Player Lookups ‣ Thousands of CMN Players ‣ Match History Calls
  • 32. Identifying What’s Unique • “What Unique Part of Your Site Will be Complicated to Scale?”
  • 33. Identifying What’s Unique • “What Unique Part of Your Site Will be Complicated to Scale?”
  • 34. Identifying What’s Unique • “What Unique Part of Your Site Will be Complicated to Scale?” • Important to Identify Early • Allows you to Plan for the Future • Many Solutions for the Common Stuff to Scale
  • 35. Dating DNA’s Compatibility Score Growth: X² - X, where X = # of Users
  • 36. # of Score Records
  • 37. # of Score Records 10 Users 200 Users 5,000 Users 25,000 Users 300,000 Users 1,000,000 Users
  • 38. # of Score Records • 10 Users: 90 Records • 200 Users • 5,000 Users • 25,000 Users • 300,000 Users • 1,000,000 Users
  • 39. # of Score Records • 10 Users: 90 Records • 200 Users: 39,800 Records • 5,000 Users • 25,000 Users • 300,000 Users • 1,000,000 Users
  • 40. # of Score Records • 10 Users: 90 Records • 200 Users: 39,800 Records • 5,000 Users: 24,995,000 Records • 25,000 Users • 300,000 Users • 1,000,000 Users
  • 41. # of Score Records • 10 Users: 90 Records • 200 Users: 39,800 Records • 5,000 Users: 24,995,000 Records • 25,000 Users: 624,975,000 Records • 300,000 Users • 1,000,000 Users
  • 42. # of Score Records • 10 Users: 90 Records • 200 Users: 39,800 Records • 5,000 Users: 24,995,000 Records • 25,000 Users: 624,975,000 Records • 300,000 Users: 89,999,700,000 Records • 1,000,000 Users
  • 43. # of Score Records • 10 Users: 90 Records • 200 Users: 39,800 Records • 5,000 Users: 24,995,000 Records • 25,000 Users: 624,975,000 Records • 300,000 Users: 89,999,700,000 Records • 1,000,000 Users: One Trillion Records!
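The record counts on these slides follow directly from the X² - X growth formula. A minimal sketch that reproduces them, assuming one score record per ordered pair of distinct users (which is what X² - X counts); the function name is made up for illustration:

```python
def score_records(users: int) -> int:
    """Ordered pairs of distinct users: X^2 - X."""
    return users * users - users

for users in (10, 200, 5_000, 25_000, 300_000, 1_000_000):
    print(f"{users:>9,} users -> {score_records(users):>15,} records")
```

For 1,000,000 users the exact figure is 999,999,000,000, which the slide rounds to one trillion.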
  • 44. How We Solved It • Introduce Limits on Records per User • Only Save Decent Scores • Smart Logic to Predetermine Good Matches • Shard Table Into Multiple Tables • We’re Still Managing & Finding More Optimizations
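Three of these ideas (a per-user limit, saving only decent scores, and sharding the table) can be sketched in a few lines. The talk gives no details on how Dating DNA implemented them, so the cap, the cutoff, the shard count, the modulo rule, and the `scores_N` table names below are all hypothetical.

```python
import heapq

MAX_SCORES_PER_USER = 3   # hypothetical cap on stored records per user
MIN_DECENT_SCORE = 60     # hypothetical cutoff for a "decent" score
NUM_SHARDS = 4            # hypothetical number of score tables

def shard_table(user_id: int) -> str:
    """Pick the table holding a user's scores by user ID modulo shard count."""
    return f"scores_{user_id % NUM_SHARDS}"

def scores_to_save(candidates):
    """Keep only decent scores, capped at the best MAX_SCORES_PER_USER.

    candidates: iterable of (other_user_id, score) pairs.
    """
    decent = [(other, score) for other, score in candidates
              if score >= MIN_DECENT_SCORE]
    return heapq.nlargest(MAX_SCORES_PER_USER, decent, key=lambda pair: pair[1])

candidates = [(2, 40), (3, 95), (4, 61), (5, 70), (6, 88), (7, 59)]
print(shard_table(10), scores_to_save(candidates))
```

With a cap in place, stored records grow at most linearly (cap × X) instead of following X² - X.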
  • 45. User Uploaded Content • Multiple Aspects to Consider ‣ Storage (File Size) ‣ Serving (File Sizes, Bandwidth, Redundancy) ‣ Backups (Time, Speed) • Challenges ‣ Content Outgrowing Server ‣ Serving Multiple Versions
  • 46. Memcached • Scales Much Easier than Traditional Databases • Reduced Load on Databases by 70% • Implement Quickly -- It’s Awesome! • Gave a Presentation on It @ UPHPU ‣ Check My Blog for Recording
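A common way to take read load off a database with Memcached is the cache-aside pattern: look in the cache first, fall back to the database on a miss, then store the result. The slides do not say which pattern was used, so this is only a sketch. `FakeCache` is an in-memory stand-in written for this example rather than a real Memcached client, and the profile loader is hypothetical.

```python
import time

class FakeCache:
    """In-memory stand-in for a Memcached client (get/set with expiry)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._data[key]  # expired entries behave like misses
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._data[key] = (value, time.monotonic() + ttl_seconds)

stats = {"db_queries": 0}

def load_profile_from_db(user_id):
    """Hypothetical slow database read; counts how often it runs."""
    stats["db_queries"] += 1
    return {"id": user_id, "name": f"User {user_id}"}

def get_profile(cache, user_id, ttl_seconds=300):
    """Cache-aside read: try the cache, fall back to the database, then store."""
    key = f"profile:{user_id}"
    profile = cache.get(key)
    if profile is None:
        profile = load_profile_from_db(user_id)
        cache.set(key, profile, ttl_seconds)
    return profile

cache = FakeCache()
for _ in range(5):
    get_profile(cache, 7)
print(stats["db_queries"], "database query for 5 page views")
```

The expiry time is the main design choice: a longer TTL saves more queries but serves staler data.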
  • 47. Hardware • Be careful of “Throwing Hardware” at Problems • Run on Realistic Hardware • Developers, You Need To Be Cost Conscious • Can You Afford to Scale Up? How About Scale Down?
  • 48. Other Challenges • Apache Configurations ‣ MaxClients Limit Reached • MySQL Configurations ‣ Max Connections Reached • Network Communication ‣ Saturated Bandwidth Between Servers
  • 49. Tools I’ve Used • MySQL - Jet Profiler (No FOSS, Hopefully One Day) • Server Performance - top, htop, atop, iotop • PHP - Zend Debugger & Profiler, Xdebug • FirePHP & Firebug
  • 52. Getting Ready To Scale • Compartmentalize “Parts” Into “Components”
  • 53. Getting Ready To Scale • Compartmentalize “Parts” Into “Components” • Create “Scaling Strategy” For Each “Component”
  • 54. Getting Ready To Scale • Compartmentalize “Parts” Into “Components” • Create “Scaling Strategy” For Each “Component” • Keep It Simple, Complexity == Complications
  • 55. Getting Ready To Scale • Compartmentalize “Parts” Into “Components” • Create “Scaling Strategy” For Each “Component” • Keep It Simple, Complexity == Complications • Create Metrics to Monitor the Health of the Components
  • 56. Getting Ready To Scale • Compartmentalize “Parts” Into “Components” • Create “Scaling Strategy” For Each “Component” • Keep It Simple, Complexity == Complications • Create Metrics to Monitor the Health of the Components • “Zero Theory” For Each Component
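Applying “Zero Theory” per component can be sketched as a simple check, reading the term as "add a zero to today's numbers", which is what the figures on the earlier “Zero Theory” slides suggest. The component names, metrics, and capacity figures below are invented for illustration and are not Dating DNA's real numbers.

```python
# Hypothetical numbers for illustration only.
components = {
    "web_servers": {"current": 200, "capacity": 5_000},     # concurrent visitors
    "database":    {"current": 300, "capacity": 2_000},     # writes per minute
    "memcache":    {"current": 1_000, "capacity": 50_000},  # reads per second
}

def add_a_zero(components, factor=10):
    """Project each component's load at `factor` times today and flag shortfalls."""
    report = {}
    for name, c in components.items():
        projected = c["current"] * factor
        report[name] = {"projected": projected, "ok": projected <= c["capacity"]}
    return report

report = add_a_zero(components)
for name, row in report.items():
    print(f"{name:<12} {row['projected']:>7,} {'ok' if row['ok'] else 'NEEDS A PLAN'}")
```

Any component flagged here is a candidate for its own scaling strategy before the traffic arrives.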
  • 57. Makes Components of Dating DNA
  • 58. Makes Components of Dating DNA Dating DNA Application
  • 59. Makes Components of Dating DNA • Dating DNA Application (center) • Jobs / Cron System • User Uploaded Content • Database Cluster • Web Servers • Score Generation System • Memcache Cluster
  • 60. Makes Components of Dating DNA • Dating DNA Application (center) • Jobs / Cron System • User Uploaded Content • Database Cluster • Web Servers • Score Generation System • Memcache Cluster
  • 61. Makes Components of Dating DNA • Communication (center) • Jobs / Cron System • User Uploaded Content • Database Cluster • Web Servers • Score Generation System • Memcache Cluster
  • 64. Scaling Components • Server #1: Comp A1, Comp B1, Comp C1, Comp D1, Comp E1
  • 65. Scaling Components • Server #1: Comp A1, Comp B1, Comp C1, Comp D1, Comp E1 • Server #2: (empty)
  • 66. Scaling Components • Server #1: Comp A1, Comp B1, Comp C1 • Server #2: Comp D1, Comp E1
  • 67. Scaling Components • Server #1: Comp A1, Comp B1, Comp C1 • Server #2: Comp D1, Comp E1, Comp A2
  • 68. Scaling Components • Server #1: Comp A1, Comp B1, Comp C1 • Server #2: Comp D1, Comp E1, Comp A2 • Server #3: (empty) • Server #4: (empty)
  • 69. Scaling Components • Server #1: Comp A1, Comp B1, Comp C1 • Server #2: Comp D1, Comp E1, Comp A2 • Server #3: Comp A3, Comp B2, Comp C2 • Server #4: (empty)
  • 70. Scaling Components • Server #1: Comp A1, Comp B1, Comp C1 • Server #2: Comp D1, Comp E1, Comp A2 • Server #3: Comp A3, Comp B2, Comp C2 • Server #4: Comp A4, Comp B3
  • 71. Scaling Components • Should Be Able to “Live” With Each Other • Deployment & Management all Automated ‣ Not Necessarily “Automatic” • Fault Tolerance • Testing & Staging Environments
  • 72. Scaling Web Servers • Challenges ‣ Routing ‣ Sessions • Ideas ‣ Separate Dynamic & Static ‣ Use Sub Domains • Diagram: Load Balancer, Web Server, Web Server, Cache, DB
  • 73. Asynchronous Score Generation • Pass Message To “Score Server” ‣ Pass User ID Only • Quickly Generate a Small Set of Scores • Queue User for Full Process ‣ Spawn Small Generation • Update “MEMORY” MySQL Table on Status
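The flow on this slide can be sketched with a queue and a background worker. This is one interpretation, not the actual Dating DNA score server: a plain dict stands in for the MySQL “MEMORY” status table, a thread stands in for the separate score server, and both scoring functions are placeholders with made-up arithmetic.

```python
import queue
import threading

status = {}                 # stand-in for the MySQL "MEMORY" status table
full_queue = queue.Queue()  # users waiting for the full scoring pass

def quick_scores(user_id, sample_size=3):
    """Hypothetical fast pass: score against a small sample of other users."""
    return {other: (user_id * other) % 100 for other in range(1, sample_size + 1)}

def full_scores(user_id, total_users=50):
    """Hypothetical slow pass over every other user."""
    return {o: (user_id * o) % 100
            for o in range(1, total_users + 1) if o != user_id}

def request_scores(user_id):
    """Called by the web tier: pass the user ID only, return the quick results."""
    status[user_id] = "queued"
    full_queue.put(user_id)
    return quick_scores(user_id)

def worker():
    """Score-server loop: drain the queue and record progress."""
    while True:
        user_id = full_queue.get()
        if user_id is None:  # sentinel: stop the worker
            full_queue.task_done()
            break
        status[user_id] = "running"
        full_scores(user_id)
        status[user_id] = "done"
        full_queue.task_done()

thread = threading.Thread(target=worker, daemon=True)
thread.start()
first = request_scores(42)  # returns immediately with a small set of scores
full_queue.join()           # only for the demo: wait for the full pass
full_queue.put(None)
thread.join()
print(first, status[42])
```

Passing only the user ID keeps the message small and lets the worker read fresh data itself, while the status record lets the web tier show progress without blocking on the full pass.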
  • 74. The Open Source Advantage • Choice & Options w/ Cost • Solid Applications, Tested & Proven • Great Communities • Give Back, Karma++
  • 75. Last Thoughts • Once again, K.I.S.S • Proactive > Reactive • Monitoring, Monitoring, Monitoring! • The “Cloud” • Get Advice from the Community
  • 78. Thank You • Website: www.justincarmony.com • Email: justin@justincarmony.com • Twitter: JustinCarmony • Skype: JustinCarmony • IRC: irc.freenode.net #uphpu