Events & Metrics
The Lifeblood Of Webops

 Alexis Lê-Quôc (Product Guy) at Datadog
                NYCBUG
              July 6th, 2011
I <3 BSD
  ‣OpenBSD user since 2.8 (pf)
  ‣Love the documentation
  ‣m0n0wall/pfSense
  ‣ZFS-envy
What I’m going to talk about
 ‣Briefly, what we do and for whom
 ‣Where we started
 ‣The kind of data we deal with
 ‣How it all fits together
 ‣A few things we learned along the way
 ‣Q+A
SaaS Platform for Dev & Ops
‣Aggregation
‣Correlation
‣Collaboration


       What We Do
Where We Started
The Mess
[Diagram: a tangle of arrows labeled metrics, events, changes, choices, billing, insights, and advice + feedback, connecting the Dev team and the Ops team to Usage Analytics, IAAS / PAAS, Issue Resolution, Servers and Devices, Applications, Cap. Planning, SDLC support, Monitoring, Hosting, Asset Mgmt, and CDNs]
‣Too many data streams, too many silos
‣Too many choices to make, too often
‣Only getting worse as SaaS silos multiply
‣Separate Dev and Ops teams, looking at separate data streams
WHERE WE STARTED
   Discourages exploration
Very Specific View
Different View
   Same Reality
Dev Interdiction
      Part 1
Where We Are
In Action
https://app.datad0g.com/dash/host/8#/date_range/1309383780732-1309988580732
TRENDS
Visible through Datadog and others

Welcome developers
‣Graphite
‣statsd

Context Matters
‣Ganglia Event API

Large Datasets
‣OpenTSDB

Data Exploration
‣d3, protovis
Sides Of A Coin
Events          Metrics
User comments   Unique visitors
Alert           Load
Build           Transaction duration
Batch job       etc.
etc.
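
The two sides of the coin can be pictured as two small record types. This is a minimal sketch, assuming hypothetical `Event` and `MetricPoint` types whose names and fields are purely illustrative and not Datadog's data model: an event is a discrete occurrence carrying text, a metric is a numeric sample in a time series, and the helper lines the two up by time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """A discrete occurrence: a build, an alert, a user comment (hypothetical type)."""
    timestamp: float  # seconds since the epoch
    source: str
    text: str


@dataclass(frozen=True)
class MetricPoint:
    """One numeric sample of a time series, e.g. load (hypothetical type)."""
    timestamp: float
    name: str
    value: float


def events_near(point, events, window=60.0):
    """Return the events that happened within `window` seconds of a metric point."""
    return [e for e in events if abs(e.timestamp - point.timestamp) <= window]


# Made-up sample data for illustration only.
events = [
    Event(1000.0, "build", "deployed web app"),
    Event(5000.0, "alert", "disk almost full"),
]
spike = MetricPoint(1030.0, "system.load.1", 7.5)
print([e.text for e in events_near(spike, events)])
```

Keeping both kinds of record on a shared time axis is what makes it possible to ask "what happened around this spike?".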
       Aggregation
Taxonomy
CLASSICS
http://en.wikipedia.org/wiki/Eventual_consistency

ACID (e.g. SQL DBs)                          BASE (e.g. DNS)
Atomicity                                    Basically
Consistency                                  Available
Isolation                                    Soft-state
Durability                                   Eventual consistency
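
To make the "A" in ACID concrete, here is a minimal sketch using Python's built-in `sqlite3` module as a stand-in for "SQL DBs". The `accounts` table and the `transfer` and `balances` helpers are assumptions made up for this illustration. A transfer either applies both updates or neither.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` between accounts; both updates commit together or not at all."""
    with conn:  # commits on success, rolls back if an exception escapes the block
        conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src))
        (balance,) = conn.execute(
            "SELECT balance FROM accounts WHERE name = ?", (src,)
        ).fetchone()
        if balance < 0:
            raise ValueError("insufficient funds")
        conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst))


def balances(conn):
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


transfer(conn, "alice", "bob", 30)       # succeeds
try:
    transfer(conn, "alice", "bob", 500)  # fails midway and is rolled back
except ValueError:
    pass
print(balances(conn))
```

The failed transfer leaves no trace: the debit that had already been applied is undone by the rollback. A BASE system gives up this kind of guarantee in exchange for availability.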
NEWCOMER
Bryan Cantrill: http://dtrace.org/resources/bmc/DIRT.pdf

DIRT: Data-Intensive Real-Time
e.g. real-time web
Aggregation (BASE, DIRT)
Constant data influx
Large data sets

Correlation (BASE)
On-demand visualization
Background data analysis

Collaboration (DIRT)
Real-time updates
On-the-fly data analysis

  Datadog = DIRT + BASE + a tiny bit of ACID
How It All Fits Together
    http://www.flickr.com/photos/tom-margie/1253798184/
Architecture
Simplified
[Diagram: simplified architecture, with components labeled BASE, DIRT, and ACID in successive builds]
The Environment
4 Dimensions
Compute
Storage
Network
Management
ON-PREMISE TRAITS
http://www.flickr.com/photos/theplanetdotcom/4879419788/sizes/l/in/photostream/

Compute
Fast
Inelastic

Storage
Fast
Centralized
Redundant

Network
Fast
Localized

Management
People-based
Full access
CLOUD TRAITS

Compute
Slow
Elastic

Storage
Slow
Jittery
Maybe durable
Low memory

Network
“Fast”
Geo-distributed

Management
No bare-metal
“Magic” API
What We Have Found
Network
Network
Layer 2: Virtual Domain
Layer 3: Crude Edge Filtering
Layer 7: Crude Load Balancing
DNS
CDN
It Works!
Storage
[Chart: latency (y-axis) vs. capacity (x-axis); both increase from the first entry to the last]
‣DIRT: Redis
‣ACID: PostgreSQL
‣BASE: Apache Cassandra
‣BASE: Amazon S3
Overlaid annotations, from bottom-left to top-right: Low memory, Jittery, Limited throughput, Latency
Low Memory
 http://aws.amazon.com/ec2/#instance
Jittery, Limited Throughput
          Network Block Storage (EBS)

  https://app.datad0g.com/dash/dash/1032#/date_range/1308608717016-1309213517016
Limited Throughput In Numbers
RAID 0 EBS Volumes, m1.large instances

Time          DEV       tps     rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz  await   svctm  %util
03:35:02 PM   dev8-80   375.95  23614.08  5.70      62.83     47.21     125.58  1.26   47.34
03:35:02 PM   dev8-96   373.63  23749.65  5.64      63.58     45.55     121.91  1.22   45.72
03:35:02 PM   dev8-112  375.28  23693.47  5.52      63.15     45.52     121.22  1.23   46.31
03:35:02 PM   dev8-128  375.31  23721.57  7.19      63.22     56.00     148.96  1.34   50.35

‣rd_sec/s: read throughput in sectors/s (total: 368Mb/s)
‣await: average wait in ms
‣svctm: average service time in ms
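
The conversion from per-device sector counts to an aggregate rate can be sketched as follows. This is a rough illustration, assuming 512-byte sectors (the usual unit for `rd_sec/s`) and decimal megabits. The slide does not say how its 368Mb/s total was derived, so this single sample is only expected to land in the same range, not to reproduce it exactly.

```python
SECTOR_BYTES = 512  # assumption: rd_sec/s is reported in 512-byte sectors

# rd_sec/s per device, copied from the table above
read_sectors_per_s = {
    "dev8-80": 23614.08,
    "dev8-96": 23749.65,
    "dev8-112": 23693.47,
    "dev8-128": 23721.57,
}


def total_read_mbit_per_s(sectors_per_s, sector_bytes=SECTOR_BYTES):
    """Sum per-device read rates and convert sectors/s to megabits per second."""
    total_bytes = sum(sectors_per_s.values()) * sector_bytes
    return total_bytes * 8 / 1e6


total = total_read_mbit_per_s(read_sectors_per_s)
print(f"{total:.0f} Mbit/s across {len(read_sectors_per_s)} volumes")
print(f"{total / len(read_sectors_per_s):.0f} Mbit/s per volume on average")
```

Note how evenly the four volumes are loaded, while `await` sits above 120 ms against a service time near 1.3 ms: requests spend most of their time queued.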
Some Tricks
Software RAID           Limited by slowest
RAID 0                  volume
Offsite backups

Ephemeral volumes
And Offsite backups     Complexity
                        Recovery Time Objective
Streaming replication   Recovery Point Objective
S3 backups

Database Service        Trust
MySQL/Oracle RDS        RDS Outage 2 months ago
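
Why a stripe is "limited by slowest volume" can be shown with an idealized model. This is a sketch only: the `raid0_throughput` helper and the per-volume figures are made up for illustration, and real arrays also depend on chunk size, queue depth, and access pattern.

```python
def raid0_throughput(volume_mb_per_s):
    """Idealized sequential throughput of a RAID 0 stripe.

    Each stripe is spread evenly over all volumes, so the array moves data
    only as fast as its slowest member allows: n * min(volumes).
    """
    if not volume_mb_per_s:
        raise ValueError("need at least one volume")
    return len(volume_mb_per_s) * min(volume_mb_per_s)


healthy = [12.0, 12.0, 12.0, 12.0]   # made-up per-volume MB/s
degraded = [12.0, 12.0, 12.0, 3.0]   # same array with one jittery volume

print(raid0_throughput(healthy))
print(raid0_throughput(degraded))
```

In this model a single slow volume drags the four-volume array down to what one healthy volume would deliver on its own.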
Network Block Storage Is The Dark Side
Bait For Enterprise Customers
Hard Problem For Cloud Providers
Some Do’s And Don’ts

Don’t rely on networked block storage
‣Small data sets only if you have to

Don’t trust data-at-rest
‣Copy, replicate, back up

Do use S3 if you can
‣Object semantics a limitation
‣Slow but durable
Compute
[Chart: “Performance” (y-axis) vs. number of nodes (x-axis). ACID nodes: scale up, then shard. BASE and DIRT nodes: add more nodes.]
Some Do’s And Don’ts

Don’t rely on scale-ups
‣Low memory a hard limit for DBs
‣Noisy neighbors
‣Individual performance poor and jittery

Scale out
‣First scale up
‣Then shard
‣Parallelize across machines
‣Vector-processing via GPUs
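
As an illustration of the "then shard" step, here is a minimal sketch of hash-based shard routing. The `shard_for` helper and the host names are hypothetical, and simple modulo placement is only one option, not necessarily what was used in production.

```python
import hashlib


def shard_for(key, num_shards):
    """Map a key to a shard index with a stable hash.

    Python's built-in hash() is randomized per process for strings, so a
    digest is used to keep the mapping identical across machines.
    """
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Made-up keys: spread 1000 host names over 4 shards and count the result.
hosts = [f"host-{i}" for i in range(1000)]
counts = [0] * 4
for h in hosts:
    counts[shard_for(h, 4)] += 1
print(counts)
```

Modulo placement is easy to reason about, but changing `num_shards` moves most keys to a different shard. Consistent hashing is a common alternative when shards are added often.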
Management
An API for everything
Compute
Storage
Network
Management
Questions!

http://datadoghq.com
      twitter: @alq

 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search Intent
 
How to have difficult conversations
How to have difficult conversations How to have difficult conversations
How to have difficult conversations
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best Practices
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project management
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
 

Events & Metrics: The Lifeblood of Webops

  • 1. Events & Metrics The Lifeblood Of Webops Alexis Lê-Quôc (Product Guy) at Datadog NYCBUG July 6th, 2011
  • 2. I <3 BSD ‣OpenBSD user since 2.8 (pf) ‣Love the documentation ‣m0n0wall/pfSense ‣ZFS-envy
  • 3. What I’m going to talk about ‣Briefly, what we do and for whom ‣Where we started ‣The kind of data we deal with ‣How it all fits together ‣A few things we learned along the way ‣Q+A
  • 4.
  • 5. SaaS Platform for Dev & Ops ‣Aggregation ‣Correlation ‣Collaboration What we do?
  • 7. The Mess. Too many data streams, too many silos. Too many choices to make, too often. Only getting worse as SaaS silos multiply. Separate Dev and Ops teams, looking at separate data streams. [Diagram: Dev team and Ops team linked to Applications, Servers and Devices, IAAS / PAAS, Usage Analytics, Issue Resolution, Cap. Planning, SDLC support, Monitoring, Hosting, Asset Mgmt and CDNs by arrows labeled metrics, events, changes, choices, billing, insight and feedback]
  • 8. WHERE WE STARTED Discourages exploration
  • 10. Different View Same Reality
  • 14. Welcome developers Context Matters ‣Graphite ‣Ganglia Event API ‣statsd Large Datasets Data Exploration ‣OpenTSDB ‣d3, protovis TRENDS Visible through Datadog and others
  • 15. Sides Of A Coin
  • 16. Events: User comments, Alert, Build, Batch job. Metrics: Unique visitors, Load, Transaction duration, etc.
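The two sides of the coin on this slide can be sketched as two small record types. This is a minimal illustration only, not Datadog's data model: the class names `Event` and `MetricPoint`, their fields, and the helper `events_near` are all hypothetical.

```python
from dataclasses import dataclass, field
import time


@dataclass
class Event:
    """A discrete occurrence (alert, build, batch job, user comment)."""
    title: str
    source: str
    timestamp: float = field(default_factory=time.time)
    tags: tuple = ()


@dataclass
class MetricPoint:
    """One numeric sample of a named series (load, unique visitors)."""
    name: str
    value: float
    timestamp: float = field(default_factory=time.time)
    tags: tuple = ()


def events_near(events, point, window_s=300.0):
    """Return the events within window_s seconds of a metric sample."""
    return [e for e in events if abs(e.timestamp - point.timestamp) <= window_s]


# Hypothetical data: a build shortly before a load sample, an alert much later.
build = Event("Build", source="ci", timestamp=1000.0)
alert = Event("Alert", source="monitoring", timestamp=5000.0)
load = MetricPoint("load", 7.5, timestamp=1120.0)
print([e.title for e in events_near([build, alert], load)])
```

Keeping a timestamp on both record types is what makes the correlation step possible: a metric sample can be lined up with whatever events happened around it.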
  • 17. etc. Aggregation
  • 19. CLASSICS. ACID: Atomicity, Consistency, Isolation, Durability (e.g. SQL DBs). http://en.wikipedia.org/wiki/Eventual_consistency
  • 20. CLASSICS. ACID: Atomicity, Consistency, Isolation, Durability (e.g. SQL DBs). BASE: Basically Available, Soft-state, Eventual consistency (e.g. DNS). http://en.wikipedia.org/wiki/Eventual_consistency
  • 21. NEWCOMER. DIRT: Data Intensive Real Time (e.g. real-time web). Bryan Cantrill: http://dtrace.org/resources/bmc/DIRT.pdf
  • 22. Aggregation: Constant data influx, Large data sets. Correlation: On-demand visualization, Background data analysis. Collaboration: Real-time updates, On-the-fly data analysis.
  • 23. Aggregation (BASE): Constant data influx, Large data sets. Correlation: On-demand visualization, Background data analysis. Collaboration: Real-time updates, On-the-fly data analysis.
  • 24. Aggregation (BASE, DIRT): Constant data influx, Large data sets. Correlation: On-demand visualization, Background data analysis. Collaboration: Real-time updates, On-the-fly data analysis.
  • 25. Aggregation (BASE, DIRT): Constant data influx, Large data sets. Correlation (BASE): On-demand visualization, Background data analysis. Collaboration: Real-time updates, On-the-fly data analysis.
  • 26. Aggregation (BASE, DIRT): Constant data influx, Large data sets. Correlation (BASE): On-demand visualization, Background data analysis. Collaboration (DIRT): Real-time updates, On-the-fly data analysis.
  • 27. Aggregation (BASE, DIRT): Constant data influx, Large data sets. Correlation (BASE): On-demand visualization, Background data analysis. Collaboration (DIRT): Real-time updates, On-the-fly data analysis. Datadog = DIRT + BASE + a tiny bit of ACID
  • 28. How It All Fits Together http://www.flickr.com/photos/tom-margie/1253798184/
  • 29. Architecture Simplified
  • 30. Architecture Simplified (labeled: BASE)
  • 31. Architecture Simplified (labeled: BASE, DIRT)
  • 32. Architecture Simplified (labeled: BASE, ACID, DIRT)
  • 36. ON-PREMISE TRAITS. Compute: Fast, Inelastic. http://www.flickr.com/photos/theplanetdotcom/4879419788/sizes/l/in/photostream/
  • 37. ON-PREMISE TRAITS. Compute: Fast, Inelastic. Storage: Fast, Centralized, Redundant. http://www.flickr.com/photos/theplanetdotcom/4879419788/sizes/l/in/photostream/
  • 38. ON-PREMISE TRAITS. Compute: Fast, Inelastic. Network: Fast, Localized. Storage: Fast, Centralized, Redundant. http://www.flickr.com/photos/theplanetdotcom/4879419788/sizes/l/in/photostream/
  • 39. ON-PREMISE TRAITS. Compute: Fast, Inelastic. Network: Fast, Localized. Storage: Fast, Centralized, Redundant. Management: People-based, Full access. http://www.flickr.com/photos/theplanetdotcom/4879419788/sizes/l/in/photostream/
  • 41. CLOUD TRAITS. Compute: Slow, Elastic.
  • 43. CLOUD TRAITS. Compute: Slow, Elastic. Network: “Fast”, Geo-distributed. Storage: Slow, Jittery, Maybe durable, Low memory.
  • 44. CLOUD TRAITS. Compute: Slow, Elastic. Network: “Fast”, Geo-distributed. Storage: Slow, Jittery, Maybe durable, Low memory. Management: No bare-metal, “Magic” API.
  • 45. What We Have Found
  • 47. Network Layer 2: Virtual Domain Layer 3: Crude Edge Filtering Layer 7: Crude Load Balancing DNS CDN
  • 48. Network Layer 2: Virtual Domain Layer 3: Crude Edge Filtering Layer 7: Crude Load Balancing DNS CDN It Works!
  • 50. Storage [chart, Latency vs. Capacity]: Amazon S3 (BASE), Apache Cassandra (BASE), PostgreSQL (ACID), Redis (DIRT)
  • 51. Storage [chart, Latency vs. Capacity]: Amazon S3 (BASE), Apache Cassandra (BASE), PostgreSQL (ACID), Redis (DIRT). Annotations: Latency, Limited throughput, Jittery, Low memory
  • 53. Jittery, Limited Throughput Network Block Storage (EBS) https://app.datad0g.com/dash/dash/1032#/date_range/1308608717016-1309213517016
  • 54. Limited Throughput In Numbers (RAID 0 EBS Volumes, m1.large instances)
        DEV          tps      rd_sec/s   wr_sec/s  avgrq-sz  avgqu-sz  await   svctm  %util
        03:35:02 PM  dev8-80   375.95  23614.08  5.70      62.83     47.21     125.58  1.26   47.34
        03:35:02 PM  dev8-96   373.63  23749.65  5.64      63.58     45.55     121.91  1.22   45.72
        03:35:02 PM  dev8-112  375.28  23693.47  5.52      63.15     45.52     121.22  1.23   46.31
        03:35:02 PM  dev8-128  375.31  23721.57  7.19      63.22     56.00     148.96  1.34   50.35
        rd_sec/s: read throughput in sector/s (Total: 368Mb/s). await: average wait in ms. svctm: average service time in ms.
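As a rough cross-check of the total quoted on this slide, the per-device `rd_sec/s` values can be summed and converted to megabits per second. This sketch assumes 512-byte sectors (the unit sysstat uses for these columns) and decimal megabits; the function name is hypothetical. Under those assumptions it lands close to the slide's total, and the exact figure depends on the unit convention and rounding used.

```python
SECTOR_BYTES = 512  # assumption: sar/iostat sectors are 512 bytes

# rd_sec/s per device, copied from the slide
rd_sec_per_s = {
    "dev8-80": 23614.08,
    "dev8-96": 23749.65,
    "dev8-112": 23693.47,
    "dev8-128": 23721.57,
}


def total_read_mbit_per_s(rates, sector_bytes=SECTOR_BYTES):
    """Sum sector rates across devices and convert to decimal Mbit/s."""
    bytes_per_s = sum(rates.values()) * sector_bytes
    return bytes_per_s * 8 / 1e6


total = total_read_mbit_per_s(rd_sec_per_s)
print(f"{total:.1f} Mbit/s across {len(rd_sec_per_s)} volumes")
```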
  • 56. Some Tricks. Software RAID: RAID 0, offsite backups.
  • 57. Some Tricks. Software RAID: RAID 0, offsite backups (limited by slowest volume).
  • 58. Some Tricks. Software RAID: RAID 0, offsite backups (limited by slowest volume). Streaming replication: S3 backups.
  • 59. Some Tricks. Software RAID: RAID 0, offsite backups (limited by slowest volume). Ephemeral volumes and offsite backups. Streaming replication: S3 backups.
  • 60. Some Tricks. Software RAID: RAID 0, offsite backups (limited by slowest volume). Ephemeral volumes and offsite backups; streaming replication with S3 backups (complexity, Recovery Time Objective, Recovery Point Objective).
  • 61. Some Tricks. Software RAID: RAID 0, offsite backups (limited by slowest volume). Ephemeral volumes and offsite backups; streaming replication with S3 backups (complexity, Recovery Time Objective, Recovery Point Objective). Database Service: MySQL/Oracle RDS.
  • 62. Some Tricks. Software RAID: RAID 0, offsite backups (limited by slowest volume). Ephemeral volumes and offsite backups; streaming replication with S3 backups (complexity, Recovery Time Objective, Recovery Point Objective). Database Service: MySQL/Oracle RDS (trust: RDS outage 2 months ago).
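The "limited by slowest volume" caveat for RAID 0 can be illustrated with an idealized model: each stripe member moves an equal share of the data, so the array only finishes when the slowest member does. The function name and the throughput figures below are hypothetical, and real arrays will deviate from this simple model.

```python
def raid0_throughput(volume_mb_per_s):
    """Idealized striped throughput: member count times the slowest member."""
    if not volume_mb_per_s:
        raise ValueError("need at least one volume")
    return len(volume_mb_per_s) * min(volume_mb_per_s)


# Hypothetical per-volume throughput in MB/s.
healthy = [12.0, 12.0, 12.0, 12.0]
one_slow = [12.0, 12.0, 12.0, 4.0]

print(raid0_throughput(healthy))   # 48.0
print(raid0_throughput(one_slow))  # 16.0
```

In this model a single jittery volume drags the whole array down to four times its own rate, which is why striping over networked block storage does not remove the jitter problem.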
  • 63. Network Block Storage Is The Dark Side
  • 64. Network Block Storage Is The Dark Side Bait For Enterprise Customers
  • 65. Network Block Storage Is The Dark Side Bait For Enterprise Customers Hard Problem For Cloud Providers
  • 66. Some Do’s And Don’ts. Don’t rely on networked block storage: small data sets only, if you have to. Don’t trust data-at-rest: copy, replicate, back up. Do use S3 if you can: object semantics a limitation; slow but durable.
  • 68. Compute [chart, “Performance” vs. Number]: ACID Nodes: Scale up, Shard. BASE Nodes, DIRT Nodes: Add more.
  • 69. Some Do’s And Don’ts. Don’t rely on scale-ups: low memory a hard limit for DBs; noisy neighbors, individual performance poor and jittery. Scale out: first scale up, then shard; parallelize across machines; vector-processing via GPUs.
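The "then shard" advice can be sketched with simple hash-based routing. The talk does not say which sharding scheme was used, so hash-modulo routing is an assumption here, and `shard_for` and `partition` are hypothetical helper names.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index with a hash that is stable across runs."""
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def partition(keys, num_shards):
    """Group keys by shard so each group can be handled on its own machine."""
    shards = [[] for _ in range(num_shards)]
    for key in keys:
        shards[shard_for(key, num_shards)].append(key)
    return shards


# Hypothetical metric names spread across 4 shards.
names = [f"host{i}.load" for i in range(10)]
for index, members in enumerate(partition(names, 4)):
    print(index, members)
```

A plain modulo scheme moves most keys whenever the shard count changes; it is used here only because it is the shortest way to show the idea of spreading work across nodes.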
  • 71. An API for everything Compute Storage Network Management