100 Million Events
    Eric Lubow
    @elubow
    elubow@simplereach.com
Overview
•   SimpleReach
•   100 Million Events
•   Finding Patterns in Your Data
•   What Mistakes?
•   Questions


Socially Intelligent



Size
•   100m events
    recorded per day and
    growing
•   500m Pageviews per
    month and growing
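
For a sense of scale, the figures above can be turned into average rates. This is a minimal sketch, assuming traffic is spread evenly across the day and that a month has 30 days. Both are simplifying assumptions that are not in the slides; real traffic is bursty, so peak rates will be higher than these averages.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

events_per_day = 100_000_000       # from the slide
pageviews_per_month = 500_000_000  # from the slide
DAYS_PER_MONTH = 30                # assumption, for illustration only

# Average rates under the even-spread assumption.
avg_events_per_second = events_per_day / SECONDS_PER_DAY
avg_pageviews_per_day = pageviews_per_month / DAYS_PER_MONTH

print(f"{avg_events_per_second:,.0f} events/second on average")
print(f"{avg_pageviews_per_day:,.0f} pageviews/day on average")
```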




Right Tool For The Job




Why?
•   Heavier READ loads vs heavier write loads
•   Data relationships may be less important
•   Different aspects of a system have different requirements
•   Know your compromises




Cassandra
•   Large data volume ingestion
•   Really fast writes to many locations (eventual consistency)
•   Query by column groups within rows
•   Range queries in Hive (Slice predicate ranges)
•   Fault tolerant
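
The "column groups within rows" and slice range bullets above can be pictured with a toy model: a row whose columns are kept sorted by name, so a contiguous range of names can be read in one pass. This is a sketch in plain Python, not the Cassandra API; `slice_columns` and the hourly-counter row are hypothetical names invented for illustration, and both ends of the range are treated as inclusive.

```python
from bisect import bisect_left, bisect_right

def slice_columns(row, start, finish):
    """Return the (name, value) pairs with start <= name <= finish.

    `row` is a list of (name, value) pairs kept sorted by name, which is
    how this toy model mimics columns stored in comparator order.
    """
    names = [name for name, _ in row]
    lo = bisect_left(names, start)    # first column at or after `start`
    hi = bisect_right(names, finish)  # one past the last column <= `finish`
    return row[lo:hi]

# Hypothetical row: one counter column per hour of the day.
row = [(hour, hour * 10) for hour in range(24)]

print(slice_columns(row, 6, 9))
```

Because the columns are already ordered, the slice is located with two binary searches rather than a scan of the whole row.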




What Mistakes?
•   Manage how many servers?
•   Re-inventing the wheel (Helenus)
•   Composites Rock
•   Snapshots before drop keyspace
•   How many experts does it take to run a cluster?
•   You can tune Cassandra?!?

Server Management
•   Hand tools - AWS, csshx (Cluster SSH)
•   Configuration Management
•   Monitoring and Alerting Tools
•   Performance
•   Security




Helenus
•   Built Node.js driver for Cassandra
•   https://github.com/simplereach/helenus
•   CQL 2/3, Composite Column, Thrift Interface
•   Parallel querying (split up queries)
•   Fault tolerance and resilience
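
The "parallel querying (split up queries)" idea can be sketched as follows: break one large multi-key read into small chunks, issue the chunks concurrently, and merge the partial results. This is written in Python for illustration and is not Helenus code or its API; `fetch_rows` is a hypothetical stand-in for a real read, and the chunk size and worker count are arbitrary assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

def chunked(keys, size):
    """Split a list of keys into consecutive chunks of at most `size`."""
    return [keys[i:i + size] for i in range(0, len(keys), size)]

def fetch_rows(keys):
    """Hypothetical stand-in for one multi-key read against the cluster."""
    return {key: f"row-for-{key}" for key in keys}

def parallel_fetch(keys, chunk_size=3, workers=4):
    """Issue one small query per chunk concurrently and merge the results."""
    merged = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(fetch_rows, chunked(keys, chunk_size)):
            merged.update(partial)
    return merged

keys = [f"url-{n}" for n in range(8)]
rows = parallel_fetch(keys)
print(len(rows), "rows fetched in", len(chunked(keys, 3)), "chunks")
```

Smaller chunks also limit the damage of a single slow or failed request, since only that chunk needs to be retried.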


Data Patterns
•   Storage is cheap
•   Composites are WAY better than underscores
•   Beyond UTF8Type
•   Timestamps as LongType
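
Why composites beat underscore-joined names, and why timestamps are better stored as longs than as text, comes down to sort order. The sketch below is plain Python rather than Cassandra: tuples stand in for composite column names, and a big-endian 8-byte encoding stands in for a long. The column names and the `encode_long` helper are hypothetical.

```python
import struct

# Underscore-joined names compare as text, so numeric parts sort wrongly.
joined = ["page_9", "page_10", "page_100"]
print(sorted(joined))

# Tuples stand in for composites: each component compares by its own type.
composite = [("page", 9), ("page", 10), ("page", 100)]
print(sorted(composite))

# Splitting a joined name is ambiguous when a component contains "_".
print("my_site_9".split("_"))

def encode_long(ts):
    """Encode a timestamp as a big-endian signed 64-bit integer."""
    return struct.pack(">q", ts)

timestamps = [1_000_000_000, 999_999_999]

# As text, the later timestamp sorts first because "1" < "9".
as_text = sorted(str(ts) for ts in timestamps)

# As fixed-width longs, byte order matches numeric order
# (this holds for non-negative values, which timestamps here are).
as_long = sorted(encode_long(ts) for ts in timestamps)
decoded = [struct.unpack(">q", raw)[0] for raw in as_long]

print(as_text)
print(decoded)
```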




Safety Mechanisms
•   Snapshots before dropping keyspaces
•   Authorization and authentication
•   (Limit) Direct access to the data store
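
One way to make "snapshot before dropping" hard to skip is to generate both steps together, in order. The sketch below only builds the steps and does not run anything. It assumes the `nodetool snapshot -t <tag> <keyspace>` form of the command; the keyspace name `events`, the tag, and both helper functions are hypothetical. Note that `nodetool snapshot` acts on the node it is run against, so the snapshot step would need to run on every node.

```python
def snapshot_command(keyspace, tag):
    """Build the nodetool invocation that snapshots one keyspace."""
    return ["nodetool", "snapshot", "-t", tag, keyspace]

def drop_keyspace_plan(keyspace, tag):
    """Return the ordered steps: snapshot first, destructive CQL last."""
    return [
        ("shell", snapshot_command(keyspace, tag)),
        ("cql", f"DROP KEYSPACE {keyspace};"),
    ]

# Hypothetical keyspace and tag, for illustration only.
for kind, step in drop_keyspace_plan("events", "pre-drop"):
    print(kind, step)
```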




Expertise
•   What happens when you need help?
•   How do you become an expert?
•   What happens when you need more experts?




Tunables
•   Replication factor and read_repair_chance
•   phi_convict_threshold and RPC timeout for AWS or DC separation
•   MAX_HEAP_SIZE and HEAP_NEWSIZE (Analytics vs Realtime)
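
Replication factor matters beyond durability because it fixes how many replicas a quorum read or write has to reach. The slides do not spell this out, so the sketch below is the standard rule of thumb rather than the speaker's own numbers: a quorum is a majority of replicas, and a read is guaranteed to overlap the latest write when the replicas written plus the replicas read exceed the replication factor. The function names are hypothetical.

```python
def quorum(replication_factor):
    """Smallest majority of replicas for a given replication factor."""
    return replication_factor // 2 + 1

def reads_see_latest_write(replication_factor, write_replicas, read_replicas):
    """True when every read set must overlap every write set."""
    return write_replicas + read_replicas > replication_factor

for rf in (1, 3, 5):
    q = quorum(rf)
    print(f"RF={rf}: quorum={q}, "
          f"quorum reads + quorum writes overlap: "
          f"{reads_see_latest_write(rf, q, q)}")
```

When reads and writes both touch a single replica, overlap is not guaranteed for a replication factor above 1, which is where background repair settings such as read_repair_chance come in.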




Future
•   Priam
•   Asgard
•   Curator
•   Work for [company logo]?
•   Hastur



Summary
•   Learn from others' mistakes
•   Tuning and data patterns
•   It’s ok to re-invent the wheel
•   Applications for/with Cassandra




We’re Hiring




Questions are guaranteed in life.
Answers aren’t.

               Eric Lubow
               @elubow
               elubow@simplereach.com


               Thank you.

100m Events

  • 1. 100 Million Events Eric Lubow @elubow elubow@simplereach.com
  • 2. Overview 100 Million Events Eric Lubow @elubow
  • 3. Overview • SimpleReach 100 Million Events Eric Lubow @elubow
  • 4. Overview • SimpleReach • 100 Million Events 100 Million Events Eric Lubow @elubow
  • 5. Overview • SimpleReach • 100 Million Events • Finding Patterns in Your Data 100 Million Events Eric Lubow @elubow
  • 6. Overview • SimpleReach • 100 Million Events • Finding Patterns in Your Data • What Mistakes? 100 Million Events Eric Lubow @elubow
  • 7. Overview • SimpleReach • 100 Million Events • Finding Patterns in Your Data • What Mistakes? • Questions 100 Million Events Eric Lubow @elubow
  • 8. Socially Intelligent 100 Million Events Eric Lubow @elubow
  • 9. Size 100 Million Events Eric Lubow @elubow
  • 10. Size • 100m events recorded per day and growing 100 Million Events Eric Lubow @elubow
  • 11. Size • 100m events recorded per day and growing • 500m Pageviews per month and growing 100 Million Events Eric Lubow @elubow
  • 12. Right Tool For The Job 100 Million Events Eric Lubow @elubow
  • 13. Why? 100 Million Events Eric Lubow @elubow
  • 14. Why? • Heavier READ loads vs heavier write loads 100 Million Events Eric Lubow @elubow
  • 15. Why? • Heavier READ loads vs heavier write loads • Data relationships may be less important 100 Million Events Eric Lubow @elubow
  • 16. Why? • Heavier READ loads vs heavier write loads • Data relationships may be less important • Different aspects of a system have different requirements 100 Million Events Eric Lubow @elubow
  • 17. Why? • Heavier READ loads vs heavier write loads • Data relationships may be less important • Different aspects of a system have different requirements • Know your compromises 100 Million Events Eric Lubow @elubow
  • 18. Cassandra 100 Million Events Eric Lubow @elubow
  • 19. Cassandra • Large data volume ingestion 100 Million Events Eric Lubow @elubow
  • 20. Cassandra • Large data volume ingestion • Really fast writes to many locations (eventual consistency) 100 Million Events Eric Lubow @elubow
  • 21. Cassandra • Large data volume ingestion • Really fast writes to many locations (eventual consistency) • Query by column groups within rows 100 Million Events Eric Lubow @elubow
  • 22. Cassandra • Large data volume ingestion • Really fast writes to many locations (eventual consistency) • Query by column groups within rows • Range queries in Hive (Slice predicate ranges) 100 Million Events Eric Lubow @elubow
  • 23. Cassandra • Large data volume ingestion • Really fast writes to many locations (eventual consistency) • Query by column groups within rows • Range queries in Hive (Slice predicate ranges) • Fault tolerant 100 Million Events Eric Lubow @elubow
  • 24. What Mistakes? 100 Million Events Eric Lubow @elubow
  • 25. What Mistakes? • Manage how many servers? 100 Million Events Eric Lubow @elubow
  • 26. What Mistakes? • Manage how many servers? • Re-inventing the wheel (Helenus) 100 Million Events Eric Lubow @elubow
  • 27. What Mistakes? • Manage how many servers? • Re-inventing the wheel (Helenus) • Composites Rock 100 Million Events Eric Lubow @elubow
  • 28. What Mistakes? • Manage how many servers? • Re-inventing the wheel (Helenus) • Composites Rock • Snapshots before drop keyspace 100 Million Events Eric Lubow @elubow
  • 29. What Mistakes? • Manage how many servers? • Re-inventing the wheel (Helenus) • Composites Rock • Snapshots before drop keyspace • How many experts does it take to run a cluster? 100 Million Events Eric Lubow @elubow
  • 30. What Mistakes? • Manage how many servers? • Re-inventing the wheel (Helenus) • Composites Rock • Snapshots before drop keyspace • How many experts does it take to run a cluster? • You can tune Cassandra?!? 100 Million Events Eric Lubow @elubow
  • 31. Server Management Cluster SSH 100 Million Events Eric Lubow @elubow
  • 32. Server Management • Hand tools - AWS, csshx Cluster SSH 100 Million Events Eric Lubow @elubow
  • 33. Server Management • Hand tools - AWS, csshx • Configuration Management Cluster SSH 100 Million Events Eric Lubow @elubow
  • 34. Server Management • Hand tools - AWS, csshx • Configuration Management • Monitoring and Alerting Tools Cluster SSH 100 Million Events Eric Lubow @elubow
  • 35. Server Management • Hand tools - AWS, csshx • Configuration Management • Monitoring and Alerting Tools Cluster SSH • Performance 100 Million Events Eric Lubow @elubow
  • 36. Server Management • Hand tools - AWS, csshx • Configuration Management • Monitoring and Alerting Tools Cluster SSH • Performance • Security 100 Million Events Eric Lubow @elubow
  • 37. Helenus 100 Million Events Eric Lubow @elubow
  • 38. Helenus • Built Node.js driver for Cassandra 100 Million Events Eric Lubow @elubow
  • 39. Helenus • Built Node.js driver for Cassandra • https://github.com/simplereach/helenus 100 Million Events Eric Lubow @elubow
  • 40. Helenus • Built Node.js driver for Cassandra • https://github.com/simplereach/helenus • CQL 2/3, Composite Column, Thrift Interface 100 Million Events Eric Lubow @elubow
  • 41. Helenus • Built Node.js driver for Cassandra • https://github.com/simplereach/helenus • CQL 2/3, Composite Column, Thrift Interface • Parallel querying (split up queries) 100 Million Events Eric Lubow @elubow
  • 42. Helenus • Built Node.js driver for Cassandra • https://github.com/simplereach/helenus • CQL 2/3, Composite Column, Thrift Interface • Parallel querying (split up queries) • Fault tolerance and resilience
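The "Parallel querying (split up queries)" bullet on the Helenus slide can be illustrated with a minimal Python sketch. This is not Helenus code (Helenus is a Node.js driver): `fetch_rows` is a hypothetical stand-in for a real driver call, and the batch size and worker count are arbitrary assumptions. It shows only the general technique of splitting one large multi-key read into smaller batches, running them concurrently, and merging the results.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(keys, size):
    """Split a list of keys into consecutive batches of at most `size`."""
    return [keys[i:i + size] for i in range(0, len(keys), size)]


def fetch_rows(batch):
    """Hypothetical stand-in for a driver call that reads one batch of rows."""
    return {key: f"row-for-{key}" for key in batch}


def parallel_fetch(keys, batch_size=2, max_workers=4):
    """Run one small query per batch concurrently and merge the results."""
    merged = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in batch order, one dict per batch.
        for partial in pool.map(fetch_rows, chunked(keys, batch_size)):
            merged.update(partial)
    return merged
```

A side benefit of smaller batches is that a single slow or failed batch can be retried on its own, which may be part of what the "Fault tolerance and resilience" bullet refers to; the slides do not say.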
  • 47. Data Patterns • Storage is cheap • Composites are WAY better than underscores • Beyond UTF8Type • Timestamps as LongType
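A small Python sketch of why the Data Patterns slide might favour composites over underscore-joined names, and integer timestamps over string ones. The slides give no rationale, so the reasoning here is an assumption: the example column names are made up, tuples stand in for composite columns, and plain integers stand in for LongType values.

```python
import calendar
from datetime import datetime, timezone


def underscore_key(*parts):
    """Join components into one string, as with underscore-delimited names."""
    return "_".join(parts)


def split_underscore_key(key):
    """Try to recover the components; ambiguous if a part contains '_'."""
    return key.split("_")


def to_millis(moment):
    """Convert a timezone-aware datetime to integer milliseconds since epoch."""
    return calendar.timegm(moment.utctimetuple()) * 1000 + moment.microsecond // 1000


# A component that itself contains an underscore cannot be recovered.
parts = ("my_site.com", "page_views")
lossy = split_underscore_key(underscore_key(*parts))  # four pieces, not two

# A tuple keeps each component separate, like a composite column name.
composite = parts

# Strings sort lexicographically, integers sort by value.
as_text = sorted(["900", "1000"])
as_numbers = sorted([900, 1000])

one_second = to_millis(datetime(1970, 1, 1, 0, 0, 1, tzinfo=timezone.utc))
```

Here `lossy` has four elements while `composite` still has two, and `as_text` puts "1000" before "900" while `as_numbers` keeps numeric order.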
  • 51. Safety Mechanisms • Snapshots before dropping keyspaces • Authorization and authentication • (Limit) Direct access to the data store
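The "Snapshots before dropping keyspaces" rule can be enforced in tooling rather than left to memory. The Python sketch below is one hypothetical way to do it: it builds a `nodetool snapshot -t <tag> <keyspace>` command and hands back the `DROP KEYSPACE` statement only if the snapshot command succeeded. The function names, the keyspace, and the tag are assumptions, and the exact `nodetool` argument syntax should be checked against the Cassandra version in use.

```python
import subprocess


def snapshot_command(keyspace, tag):
    """Build the nodetool command that snapshots one keyspace under a tag."""
    return ["nodetool", "snapshot", "-t", tag, keyspace]


def snapshot_then_drop_statement(keyspace, tag, run=subprocess.run):
    """Snapshot first; return the DROP statement only if that succeeded.

    `run` is injectable so the guard can be exercised without a cluster.
    Snapshots are local to a node, so a real tool would repeat this on
    every node before executing the returned statement.
    """
    result = run(snapshot_command(keyspace, tag))
    if result.returncode != 0:
        raise RuntimeError(f"snapshot of {keyspace!r} failed; refusing to drop")
    return f"DROP KEYSPACE {keyspace};"
```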
  • 55. Expertise • What happens when you need help? • How do you become an expert? • What happens when you need more experts?
  • 59. Tunables • Replication factor and read_repair_chance • Phi Convict and RPC timeout for AWS or DC separation • MAX_HEAP_SIZE and HEAP_NEWSIZE (Analytics vs Realtime)
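For the replication factor tunable, the arithmetic behind quorum reads and writes is short enough to sketch in Python. The slides do not state which consistency levels SimpleReach used, so the numbers below are illustrative only. The sketch uses the standard rules that a quorum is a majority of replicas, and that a read is guaranteed to overlap the latest acknowledged write when the replicas read plus the replicas written exceed the replication factor.

```python
def quorum(replication_factor):
    """Number of replicas that make up a majority."""
    return replication_factor // 2 + 1


def read_overlaps_write(read_replicas, write_replicas, replication_factor):
    """True when every read set must share a replica with every write set."""
    return read_replicas + write_replicas > replication_factor


rf = 3
q = quorum(rf)                                  # majority of 3 replicas
quorum_both = read_overlaps_write(q, q, rf)     # QUORUM reads and writes
one_and_one = read_overlaps_write(1, 1, rf)     # ONE for reads and writes
```

When reads and writes do not overlap, as in the `one_and_one` case, background mechanisms such as read repair carry more of the burden of converging replicas, which is presumably why the slide pairs replication factor with `read_repair_chance`.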
  • 60. Future • Priam • Asgard • Curator • Work for ? • Hastur
  • 65. Summary • Learn from others’ mistakes • Tuning and data patterns • It’s OK to re-invent the wheel • Applications for/with Cassandra
  • 66. We’re Hiring
  • 67. Questions are guaranteed in life. Answers aren’t. Eric Lubow @elubow elubow@simplereach.com Thank you.

Editor's notes

  7. SimpleReach is a social intelligence tool for content creators. We track every social action, on every major network, across the entire web in real time. That means every like, tweet, pin, stumble, and many more.