194B
GETTING 100B METRICS TO DISK
Jonathan Thurman - Site Reliability Engineer
@jthurman42

http://www.flickr.com/photos/meteopassione/9157134653/
NEW RELIC

• Performance Monitoring
• Web Apps
• Mobile Apps
• Servers
• Databases, Caches & More…
• Software Analytics
OKAY, YOU
COLLECT DATA
• 194 Billion Metrics
• 100,000 req/sec
• 2 Gbps Inbound
• 216 Terabytes
• All backed by MySQL

http://www.flickr.com/photos/bobsfever/6658919861/
HOW WE GOT HERE

http://www.flickr.com/photos/auvet/853157494/
BUILDING BLOCKS

• Hosted Environment
• Xen Virtual Machines
• Data storage
• ATA over Ethernet
• SATA drives
• MySQL 5.0
• Single Ruby on Rails Application
http://www.flickr.com/photos/riekhavoc/4648423297/
SHARDING FROM
INCEPTION
• Account Information
• Read heavy
• Single HA Instance
• Agent Data
• Write heavy
• 8 shards based on AccountId

http://www.flickr.com/photos/erikb/48221952/
TALE OF

TWO MODELS

• Ruby on Rails
• class ShardData < ActiveRecord::Base
• Look up shard for Account
• Override ConnectionHandler
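
A minimal sketch of the "look up shard for Account" step, written in Python rather than the Rails code from the talk. The table contents, host names, and the function name `host_for_account` are hypothetical; the real application does the equivalent by overriding ActiveRecord's ConnectionHandler.

```python
# Hypothetical lookup table: account ID -> shard number (illustrative values).
SHARD_FOR_ACCOUNT = {72: 3, 1001: 9}

# Hypothetical mapping: shard number -> database host.
SHARD_HOSTS = {3: "shard03.db.example", 9: "shard09.db.example"}


def host_for_account(account_id: int) -> str:
    """Return the database host that holds this account's agent data."""
    shard = SHARD_FOR_ACCOUNT[account_id]  # raises KeyError for unknown accounts
    return SHARD_HOSTS[shard]
```

Every query for agent data would pass through a step like this before a connection is chosen, while account information stays on the single HA instance.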

http://www.flickr.com/photos/jungle_boy/140279885/
TRIBBLES TABLES

• Metric table name contains
• AccountID
• Year and Julian Day
• Resolution
• ts_72_13221_1h
• Currently ~200k tables per DB
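
The naming scheme can be sketched as follows. The field order (account ID, two-digit year plus day of year, resolution) is an assumption read off the single example `ts_72_13221_1h`; the function name `timeslice_table` is hypothetical.

```python
from datetime import date


def timeslice_table(account_id: int, day: date, resolution: str) -> str:
    """Build a per-account, per-day, per-resolution metric table name."""
    # %y is the two-digit year, %j the zero-padded day of the year.
    return f"ts_{account_id}_{day.strftime('%y%j')}_{resolution}"
```

Under that reading, `timeslice_table(72, date(2013, 8, 9), "1h")` reproduces the name on the slide, since 9 August is day 221 of 2013.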

http://www.flickr.com/photos/15942690@N00/4571141076/
BINGE AND PURGE

• Purging data
• DELETE FROM …
• DROP TABLE …
• innodb_file_per_table
• innodb_lazy_drop_table (pre 5.5.30-30.2)
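
With one table per account per day, purging can be a DROP TABLE per expired table instead of a DELETE over rows. A rough sketch, assuming table names follow the `ts_<account>_<yyddd>_<resolution>` pattern inferred from the earlier example; the helper names and the cutoff policy are hypothetical.

```python
from datetime import date, datetime


def table_day(table_name: str) -> date:
    """Extract the day encoded in a name such as 'ts_72_13221_1h'."""
    _prefix, _account, yyddd, _resolution = table_name.split("_")
    return datetime.strptime(yyddd, "%y%j").date()


def drop_statements(tables, cutoff: date):
    """Return a DROP TABLE statement for every table older than the cutoff."""
    return [f"DROP TABLE `{t}`;" for t in tables if table_day(t) < cutoff]
```

The sketch only generates SQL text; how the statements are throttled and executed is outside what the slides describe.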

http://www.flickr.com/photos/exalthim/2261294871/
http://www.flickr.com/photos/heliocentric/1571127347/

http://www.flickr.com/photos/davidmonro/8331755849/

http://www.flickr.com/photos/aigle_dore/6225535459/
GROWING PAINS

http://www.flickr.com/photos/aigle_dore/5626285743/
MULTIPLE POINTS
OF FAILURE

• Single shard slows down
• App servers wait for response
• DB connection pool becomes full
• Site goes down

http://www.flickr.com/photos/boston_public_library/8204384670/
SHARDGUARD

• Monitor all databases
• Identify shard status:
• Bad? Mark as “wedged”
• Good? Clear “wedged” flag
• ShardData checks status!
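
The wedged flag is essentially a circuit breaker: rather than letting app servers queue behind a slow shard until the connection pool fills, the data layer fails fast. A minimal sketch of that idea, not the actual ShardGuard code; the class and method names are hypothetical and the flag lives in memory only.

```python
class ShardWedged(Exception):
    """Raised instead of waiting on a shard that is marked unhealthy."""


class ShardStatus:
    def __init__(self):
        self._wedged = set()

    def report(self, shard: int, healthy: bool) -> None:
        """Called by the monitor after each health check."""
        if healthy:
            self._wedged.discard(shard)  # Good? Clear the flag.
        else:
            self._wedged.add(shard)      # Bad? Mark as wedged.

    def check(self, shard: int) -> None:
        """Called by the data layer before it touches a shard."""
        if shard in self._wedged:
            raise ShardWedged(f"shard {shard} is wedged")
```

Accounts on healthy shards keep working while requests for the wedged shard return an error immediately.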

http://www.flickr.com/photos/mac_filko/5486980804/
STABILITY AND
PERFORMANCE

• Degraded performance
• New Accounts => Shard 9!
• Old accounts remain as-is

http://www.flickr.com/photos/ejpphoto/7823027272/
DATA COLLECTION

• Rails isn’t great for data collection
• Ruby isn’t great either…
• Rewritten in Java using Jetty

http://www.flickr.com/photos/autograt/224540606/
http://www.flickr.com/photos/epsos/8474532085/

CACHE IS KING

• Buffered, not queued
• RAM is cheaper than I/O
• Get creative with batch processing
INSERT INTO
(SELECT …

• Select rows and re-process
• Cache last hour in Java’s Heap
• Write a journal and post-process it
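
One way to read "buffered, not queued" is that incoming data points are aggregated in memory and written out as a batch, rather than stored one message at a time. The sketch below illustrates that reading in Python (the real collector is Java); the class name, the one-minute bucket, and the count/sum aggregate are all assumptions.

```python
from collections import defaultdict


class MinuteBuffer:
    """Aggregate data points in RAM per (metric, minute) until flushed."""

    def __init__(self):
        self._totals = defaultdict(lambda: [0, 0.0])  # key -> [count, sum]

    def record(self, metric: str, epoch_seconds: int, value: float) -> None:
        minute = epoch_seconds - epoch_seconds % 60  # start of the minute
        slot = self._totals[(metric, minute)]
        slot[0] += 1
        slot[1] += value

    def flush(self):
        """Return rows ready for one batched insert and empty the buffer."""
        rows = [(m, t, c, s) for (m, t), (c, s) in sorted(self._totals.items())]
        self._totals.clear()
        return rows
```

Many data points for the same metric and minute collapse into one row, which is where RAM is traded for I/O.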

http://www.flickr.com/photos/esoteric_13/4741001804/
READ / WRITE
PROBLEM

• Sequential Inserts
• Batched in 5k chunks
• Optimize for Throughput
• Must complete < 1 minute
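
Batching in 5k chunks can be sketched with a small generator. The chunk size comes from the slide; the function name and the idea of feeding each chunk to one multi-row insert are assumptions.

```python
from itertools import islice


def chunks(rows, size=5000):
    """Yield successive lists of at most `size` rows from any iterable."""
    it = iter(rows)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch
```

Each yielded list would become one insert statement, so 12,000 buffered rows turn into three round trips instead of 12,000.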
READ / WRITE
PROBLEM

• Scattered Reads
• Optimized for Latency
• Unique Covering Indexes
MOVE TO
HARDWARE
• Instant performance!
• Just add…
• Datacenter - Chicago, US
• Servers - Dell
• Storage - Direct Attached
• Time - About 6 months

http://www.flickr.com/photos/zebble/9621007/
SPINNING

RUST

• Dell MD1200 shelves
• 8 Disks per shelf
• RAID 5 virtual disk
• Dedicated Hot-spare

http://www.flickr.com/photos/walkn/5472536812/
THE GREAT
EXPANSE

• MD1200s support 12 disks
• Add four more!
• Online RAID expansion

http://www.flickr.com/photos/aigle_dore/5853807037/
#FAIL

• “On-line” expansion, not so much
• Added second 4 disk RAID 5
• LVM Concatenation for space

http://www.flickr.com/photos/fireflythegreat/2845637227/
NEED MORE
CAPACITY

• Tight on disk space
• Performance not an issue
• New Accounts => Shard 10!
• Old Accounts as-is

http://www.flickr.com/photos/seandreilinger/6289721616/
SHARD PITFALLS

http://www.flickr.com/photos/21206761@N00/469110140/
MIGRATION
PROBLEM

• Accounts cannot move
• Not all tables have the shard key
• Rails defaults to auto-increment IDs
• Massive primary key collisions
• Punt and move the metrics

http://www.flickr.com/photos/tzafrir/125380911/
BREAKING UP IS
HARD TO DO

• Agent Databases
• Metadata / Notes / Errors
• Timeslice Databases
• Time-series metric data
• 1 Minute and 1 Hour resolution

http://www.flickr.com/photos/rsepulveda/4275236049/
RESOURCE POOLS

• Distributed by Shard Key
• Distribution can CHANGE
• Lookup table, not hash
• Data can be MOVED
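
Why a lookup table rather than a hash: a rough illustration, assuming a naive `account_id % shard_count` hash as the alternative (the talk does not say which hash was considered). Function names are hypothetical.

```python
def moved_by_rehash(account_ids, old_shards, new_shards):
    """Count accounts whose shard changes when a modulo hash is resized."""
    return sum(1 for a in account_ids if a % old_shards != a % new_shards)


def move_account(lookup, account_id, new_shard):
    """Return a copy of the lookup table with one account reassigned."""
    updated = dict(lookup)
    updated[account_id] = new_shard
    return updated
```

Going from 8 to 9 shards with the modulo hash relocates 64 of the account IDs 0 to 71, whereas with a lookup table nothing moves until an entry is changed on purpose, one account at a time.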

http://www.flickr.com/photos/dclark3996/4971906528/
BACKUPS

• Custom mysqldump wrapper
• Based on business need
• Backup per table
• Ignore tables to be purged
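
A per-table backup that skips soon-to-be-purged tables might be driven by something like the sketch below. It is not the wrapper from the talk: the function name, the output file naming, and the way purge candidates are supplied are assumptions. It only builds `mysqldump` argument lists, using the documented `--result-file` option and the `db_name tbl_name` form.

```python
def dump_commands(database, tables, purge_soon):
    """Build one mysqldump command per table, skipping tables due for purge."""
    skip = set(purge_soon)
    return [
        ["mysqldump", f"--result-file={table}.sql", database, table]
        for table in tables
        if table not in skip
    ]
```

Each list could be handed to `subprocess.run`, which also makes it straightforward to back up or restore a single account's tables on their own.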

http://www.flickr.com/photos/usdagov/6896218334/
EVOLUTION

http://www.flickr.com/photos/pfsullivan_1056/3485953405/
SSD REVOLUTION

• 600GB Intel 320 SSDs
• Dell MD1220 Direct Attached shelf
• Disks are no longer the bottleneck
• Inserts in Read-optimized order are “fast enough”
YOU CAN USE SSD
WITH DATABASES

• 6 of 420 drives RMA’d
• March 2012 to Aug 2013
• Average 180TB lifetime writes
• 91% wear remaining
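
The figures above can be turned into two back-of-the-envelope numbers. The endurance projection assumes wear is linear in bytes written, which is an assumption made here and not a claim from the talk.

```python
rma_count, drive_count = 6, 420
written_tb, wear_remaining_pct = 180, 91

# Share of drives returned over the March 2012 to Aug 2013 period.
rma_pct = round(rma_count / drive_count * 100, 2)

# If 9% of rated wear was consumed by 180 TB of writes, linear
# extrapolation gives the total writes a drive could absorb.
wear_used_pct = 100 - wear_remaining_pct
projected_endurance_tb = written_tb * 100 / wear_used_pct
```

That works out to about 1.43% of drives RMA'd and roughly 2,000 TB of projected write endurance per drive under the linear-wear assumption.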

http://www.flickr.com/photos/joeshlabotnik/3584172834/
REDUNDANT ARRAY
OF EXPENSIVE DISKS

• Rebuilds under load > 4 hours
• Migrated to RAID 60
• 2 x 12 disk span
• Ditch the Hot-spares
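
Usable capacity for the RAID 60 layout follows from RAID 6 spending two disks per span on parity. The 600 GB disk size is taken from the SSD slide; applying it to this layout is an assumption, and the function name is hypothetical.

```python
def raid60_usable_gb(spans: int, disks_per_span: int, disk_gb: int) -> int:
    """RAID 60 stripes across RAID 6 spans; each span loses two disks to parity."""
    return spans * (disks_per_span - 2) * disk_gb
```

For 2 spans of 12 disks at 600 GB this gives 12,000 GB usable out of 14,400 GB raw, and each span tolerates two disk failures, which is the trade that makes dropping the hot spares reasonable.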

http://www.flickr.com/photos/mbk/27640225/
XFS TUNING

• mkfs.xfs -s size=4096
• options
• noatime
• nobarrier
• inode64
• logbsize=256k

http://www.flickr.com/photos/rocketlass/5169004165/
SHARDGUARD
PART DEUX

• Protect all the things!
• Kill UI queries over 75 seconds
• Kill background queries over 1 hour
• Yes, all of them
• No really, kill them, now
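
The kill policy can be sketched as a filter over the process list. The thresholds are the ones on the slide; how a query is classified as "ui" or "background", the tuple layout, and the function name are hypothetical. The output uses MySQL's `KILL <thread_id>` statement.

```python
UI_LIMIT_SECONDS = 75
BACKGROUND_LIMIT_SECONDS = 60 * 60


def queries_to_kill(processlist):
    """processlist: iterable of (thread_id, kind, seconds_running) tuples."""
    limits = {"ui": UI_LIMIT_SECONDS, "background": BACKGROUND_LIMIT_SECONDS}
    return [
        f"KILL {thread_id};"
        for thread_id, kind, seconds in processlist
        if seconds > limits[kind]
    ]
```

There is deliberately no allow-list in the sketch: any query past its limit is selected, matching the "yes, all of them" rule.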

http://www.flickr.com/photos/chiky/7194089194/
IF YOU DON’T
BELIEVE ME…

• Delayed Job
• Long running background query
• InnoDB History List Traversal
TO INFINITY AND BEYOND

http://www.flickr.com/photos/temma2/1149223191/
HARDWARE V2

• Dell R620
• 2 x Intel E5-2690 @ 2.90GHz
• 96GB RAM
• MD1220 Storage Shelf
• 800GB Intel SSD S3500

http://www.flickr.com/photos/tnarik/2590037637/
CONTINUOUS

IMPROVEMENT

• EXT4 / ZFS / XFS
• RAID Card vs HBA
• Percona Server 5.6
• Multiple MySQL Instances
• Databases per Service

http://www.flickr.com/photos/shawnclover/8555834230/
JOIN THE TEAM

NewRelic.com/jobs

"ML in Production",Oleksandr Bagan
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 

Getting 100B Metrics to Disk

  • 1. 194B GETTING 100B METRICS TO DISK Jonathan Thurman - Site Reliability Engineer @jthurman42 http://www.flickr.com/photos/meteopassione/9157134653/
  • 2. NEW RELIC • Performance Monitoring • Web Apps • Mobile Apps • Servers • Databases, Caches & More… • Software Analytics
  • 3. OKAY, YOU COLLECT DATA • 194 Billion Metrics • 100,000 req/sec • 2 Gbps Inbound • 216 Terabytes • All backed by MySQL http://www.flickr.com/photos/bobsfever/6658919861/
  • 4. HOW WE GOT HERE http://www.flickr.com/photos/auvet/853157494/
  • 5. BUILDING BLOCKS • Hosted Environment • Xen Virtual Machines • Data storage • ATA over Ethernet • SATA drives • MySQL 5.0 • Single Ruby on Rails Application http://www.flickr.com/photos/riekhavoc/4648423297/
  • 6. SHARDING FROM INCEPTION • Account Information • Read heavy • Single HA Instance • Agent Data • Write heavy • 8 shards based on AccountId http://www.flickr.com/photos/erikb/48221952/
  • 7. TALE OF TWO MODELS • Ruby on Rails • class ShardData < ActiveRecord::Base • Look up shard for Account • Override ConnectionHandler http://www.flickr.com/photos/jungle_boy/140279885/
  • 8.
  • 9. TRIBBLES TABLES • Metric table name contains • AccountID • Year and Julian Day • Resolution • ts_72_13221_1h • Currently ~200k tables per DB http://www.flickr.com/photos/15942690@N00/4571141076/
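
As a rough illustration of the naming scheme on slide 9, the sketch below builds a table name from an account ID, a date, and a resolution. It assumes the middle field of `ts_72_13221_1h` is a two-digit year followed by a three-digit day of year (13 and 221); the slide does not spell out the exact encoding, and `timeslice_table_name` is a hypothetical helper, not code from the talk.

```ruby
require "date"

# Hypothetical helper: builds a per-account, per-day, per-resolution table name.
# Assumed layout: ts_<account_id>_<2-digit year><3-digit day of year>_<resolution>
def timeslice_table_name(account_id, date, resolution)
  # %y = two-digit year, %j = zero-padded day of the year (001..366)
  "ts_#{account_id}_#{date.strftime('%y%j')}_#{resolution}"
end

# 9 August 2013 is day 221 of the year.
puts timeslice_table_name(72, Date.new(2013, 8, 9), "1h")  # prints ts_72_13221_1h
```

One table per account, day, and resolution is what makes purging by `DROP TABLE` possible, at the cost of a very large table count.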
  • 10. BINGE AND PURGE • Purging data • DELETE FROM … • DROP TABLE … • innodb_file_per_table • innodb_lazy_drop_table (pre 5.5.30-30.2) http://www.flickr.com/photos/exalthim/2261294871/
  • 12. GROWING PAINS http://www.flickr.com/photos/aigle_dore/5626285743/
  • 13. MULTIPLE POINTS OF FAILURE • Single shard slows down • App servers wait for response • DB connection pool becomes full • Site goes down http://www.flickr.com/photos/boston_public_library/8204384670/
  • 14. SHARDGUARD • Monitor all databases • Identify shard status: • Bad? Mark as “wedged” • Good? Clear “wedged” flag • ShardData checks status! http://www.flickr.com/photos/mac_filko/5486980804/
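
The ShardGuard idea on slide 14 can be sketched as a shared status flag that a monitor sets and the data-access layer checks before each query. Everything below is an assumption made for illustration: `ShardStatus`, `ShardWedgedError`, and `with_shard` are hypothetical names, and the in-memory hash stands in for wherever the real "wedged" flag is stored.

```ruby
# Hypothetical in-memory stand-in for the status that ShardGuard maintains.
class ShardStatus
  def initialize
    @wedged = {} # shard id => true while the shard is marked "wedged"
  end

  # Monitor side: record the outcome of a health check.
  def report(shard_id, healthy)
    if healthy
      @wedged.delete(shard_id) # Good? Clear the "wedged" flag.
    else
      @wedged[shard_id] = true # Bad? Mark as "wedged".
    end
  end

  def wedged?(shard_id)
    @wedged.key?(shard_id)
  end
end

class ShardWedgedError < StandardError; end

# Application side: check status first, so a request for a wedged shard
# fails immediately instead of waiting on that shard.
def with_shard(status, shard_id)
  raise ShardWedgedError, "shard #{shard_id} is wedged" if status.wedged?(shard_id)
  yield shard_id
end

status = ShardStatus.new
status.report(3, false)
begin
  with_shard(status, 3) { |id| puts "querying shard #{id}" }
rescue ShardWedgedError => e
  puts "skipped: #{e.message}"
end
```

Failing fast on one shard is presumably what keeps a single slow database from filling the connection pool and taking the whole site down, as described on slide 13.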
  • 15. STABILITY AND PERFORMANCE • Degraded performance • New Accounts => Shard 9! • Old accounts remain as-is http://www.flickr.com/photos/ejpphoto/7823027272/
  • 16. DATA COLLECTION • Rails isn’t great for data collection • Ruby isn’t great either… • Rewritten in Java using Jetty http://www.flickr.com/photos/autograt/224540606/
  • 17. CACHE IS KING • Buffered, not queued • RAM is cheaper than I/O • Get creative with batch processing http://www.flickr.com/photos/epsos/8474532085/
  • 18. INSERT INTO (SELECT … • Select rows and re-process • Cache last hour in Java’s Heap • Write a journal and post-process it http://www.flickr.com/photos/esoteric_13/4741001804/
  • 19. READ / WRITE PROBLEM • Sequential Inserts • Batched in 5k chunks • Optimize for Throughput • Must complete < 1 minute
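
A minimal sketch of the batching on slide 19, assuming rows are already buffered in memory: the writer slices them into chunks of 5,000 and hands each chunk to whatever performs the multi-row insert. `each_insert_batch` and the row layout are hypothetical; the actual collector described in the talk is written in Java.

```ruby
BATCH_SIZE = 5_000 # chunk size taken from the slide

# Hypothetical helper: yields the buffered rows in fixed-size batches.
# Without a block it returns an Enumerator over the batches.
def each_insert_batch(rows, batch_size = BATCH_SIZE)
  return enum_for(:each_insert_batch, rows, batch_size) unless block_given?

  rows.each_slice(batch_size) { |batch| yield batch }
end

# Assumed row shape for illustration: [metric_id, timestamp, value]
rows = (1..12_000).map { |i| [i, 1_376_000_000, i * 0.5] }

each_insert_batch(rows) do |batch|
  # A real writer would issue one multi-row INSERT per batch here.
  puts "would insert #{batch.size} rows"
end
```

With 12,000 buffered rows this yields batches of 5,000, 5,000, and 2,000, so the last batch is simply smaller than the rest.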
  • 20. READ / WRITE PROBLEM • Scattered Reads • Optimized for Latency • Unique Covering Indexes
  • 21. MOVE TO HARDWARE • Instant performance! • Just add… • Datacenter - Chicago, US • Servers - Dell • Storage - Direct Attached • Time - About 6 months http://www.flickr.com/photos/zebble/9621007/
  • 22. SPINNING RUST • Dell MD1200 shelves • 8 Disks per shelf • RAID 5 virtual disk • Dedicated Hot-spare http://www.flickr.com/photos/walkn/5472536812/
  • 23. THE GREAT EXPANSE • MD1200s support 12 disks • Add four more! • Online RAID expansion http://www.flickr.com/photos/aigle_dore/5853807037/
  • 24. #FAIL • “On-line” expansion, not so much • Added second 4 disk RAID 5 • LVM Concatenation for space http://www.flickr.com/photos/fireflythegreat/2845637227/
  • 25. NEED MORE CAPACITY • Tight on disk space • Performance not an issue • New Accounts => Shard 10! • Old Accounts as-is http://www.flickr.com/photos/seandreilinger/6289721616/
  • 26.
  • 27. SHARD PITFALLS http://www.flickr.com/photos/21206761@N00/469110140/
  • 28. MIGRATION PROBLEM • Accounts cannot move • Not all tables have the shard key • Rails defaults to auto-increment IDs • Massive primary key collisions • Punt and move the metrics http://www.flickr.com/photos/tzafrir/125380911/
  • 29. BREAKING UP IS HARD TO DO • Agent Databases • Metadata / Notes / Errors • Timeslice Databases • Time-series metric data • 1 Minute and 1 Hour resolution http://www.flickr.com/photos/rsepulveda/4275236049/
  • 30.
  • 31. RESOURCE POOLS • Distributed by Shard Key • Distribution can CHANGE • Lookup table, not hash • Data can be MOVED http://www.flickr.com/photos/dclark3996/4971906528/
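
The "lookup table, not hash" point on slide 31 can be illustrated by comparing the two placement strategies. This is a sketch under assumptions: `ShardDirectory` is a hypothetical in-memory directory, whereas a real lookup table would live in a database.

```ruby
# Hash-style placement: the shard is a pure function of the key, so changing
# the shard count remaps existing accounts whether or not their data moved.
def shard_by_hash(account_id, shard_count)
  account_id % shard_count
end

# Lookup-style placement (hypothetical structure): the mapping is stored data.
class ShardDirectory
  attr_accessor :default_shard

  def initialize(default_shard)
    @default_shard = default_shard
    @assignments = {} # account id => shard
  end

  # Explicitly place (or move) an account on a shard.
  def assign(account_id, shard)
    @assignments[account_id] = shard
  end

  # Known accounts keep their shard; unseen accounts are pinned to the
  # current default the first time they are looked up.
  def shard_for(account_id)
    @assignments.fetch(account_id) { @assignments[account_id] = @default_shard }
  end
end

puts shard_by_hash(100, 8) # 4
puts shard_by_hash(100, 9) # 1, the account "moved" just because a shard was added

directory = ShardDirectory.new(8)
directory.shard_for(72)        # existing account lands on shard 8
directory.default_shard = 9    # new accounts now go to shard 9
puts directory.shard_for(1001) # 9
puts directory.shard_for(72)   # 8, unchanged
directory.assign(72, 10)       # after migrating the data, repoint the account
puts directory.shard_for(72)   # 10
```

Because placement is data rather than arithmetic, distribution can change and individual accounts can be moved without touching anyone else.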
  • 32. BACKUPS • Custom mysqldump wrapper • Based on business need • Backup per table • Ignore tables to be purged http://www.flickr.com/photos/usdagov/6896218334/
  • 34. SSD REVOLUTION • 600GB Intel 320 SSDs • Dell MD1220 Direct Attached shelf • Disks are no longer the bottleneck • Inserts in Read-optimized order are “fast enough”
  • 35. YOU CAN USE SSD WITH DATABASES • 6 of 420 drives RMA’d • March 2012 to Aug 2013 • Average 180TB lifetime writes • 91% wear remaining http://www.flickr.com/photos/joeshlabotnik/3584172834/
  • 36. REDUNDANT ARRAY OF EXPENSIVE DISKS • Rebuilds under load > 4 hours • Migrated to RAID 60 • 2 x 12 disk span • Ditch the Hot-spares http://www.flickr.com/photos/mbk/27640225/
  • 37. XFS TUNING • mkfs.xfs -s size=4096 • Mount options • noatime • nobarrier • inode64 • logbsize=256k http://www.flickr.com/photos/rocketlass/5169004165/
  • 38. SHARDGUARD PART DEUX • Protect all the things! • Kill UI queries over 75 seconds • Kill background queries over 1 hour • Yes, all of them • No really, kill them, now http://www.flickr.com/photos/chiky/7194089194/
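
The kill policy on slide 38 amounts to a per-class time limit. The sketch below applies the two thresholds from the slide to a list of running queries; the `Query` struct, the `:ui` / `:background` tags, and `queries_to_kill` are hypothetical, and how the real tool classifies a query is not stated in the talk.

```ruby
UI_LIMIT_SECONDS = 75            # from the slide
BACKGROUND_LIMIT_SECONDS = 3_600 # 1 hour, from the slide

# Hypothetical row describing a running query. In practice the id and runtime
# could come from MySQL's SHOW PROCESSLIST; the :kind tag is an assumption.
Query = Struct.new(:id, :kind, :seconds_running)

# Returns the queries that have exceeded the limit for their class.
def queries_to_kill(queries)
  queries.select do |q|
    limit = q.kind == :ui ? UI_LIMIT_SECONDS : BACKGROUND_LIMIT_SECONDS
    q.seconds_running > limit
  end
end

running = [
  Query.new(1, :ui, 12),
  Query.new(2, :ui, 90),
  Query.new(3, :background, 1_800),
  Query.new(4, :background, 4_000)
]

# KILL <id> is the MySQL statement that terminates a connection's thread.
queries_to_kill(running).each { |q| puts "KILL #{q.id}" }
```

Here queries 2 and 4 are selected, since 90 s exceeds the UI limit and 4,000 s exceeds the background limit.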
  • 39. IF YOU DON’T BELIEVE ME… • Delayed Job • Long running background query • InnoDB History List Traversal
  • 40. TO INFINITY AND BEYOND http://www.flickr.com/photos/temma2/1149223191/
  • 41. HARDWARE V2 • Dell R620 • 2 x Intel E5-2690 @ 2.90GHz • 96GB RAM • MD1220 Storage Shelf • 800GB Intel SSD S3500 http://www.flickr.com/photos/tnarik/2590037637/
  • 42. CONTINUOUS IMPROVEMENT • EXT4 / ZFS / XFS • RAID Card vs HBA • Percona Server 5.6 • Multiple MySQL Instances • Databases per Service http://www.flickr.com/photos/shawnclover/8555834230/