Messaging Architecture @FB

      Joydeep Sen Sarma
Background
• WAS:
  – Key participant in the design team (10/2009-04/2010)
  – Vocal proponent of the final consensus architecture

• WAS NOT:
   – Part of actual implementation
   – Involved in FB chat backend

• IS NOT: A Facebook employee (since 08/2011)
Problem
• 1 Billion Users

• Volume:
  – 25 messages/day * 4 KB (excluding attachments)
  – 10 TB per day


• Indexes/Summaries:
  – Keyword/Threadid/Label Index
  – Label/Thread message counts
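
A back-of-the-envelope check of the volume figures above, as a minimal sketch. It assumes decimal units (1 KB = 1000 bytes, 1 TB = 10^12 bytes), and the function names are hypothetical. The slide does not say which user population the per-day rate applies to, so the sketch computes both directions: the volume produced by a given number of senders, and the number of senders implied by a given volume.

```python
KB = 1000          # assumption: decimal kilobytes
TB = 10**12        # assumption: decimal terabytes


def daily_volume_tb(senders, msgs_per_day=25, msg_bytes=4 * KB):
    """Daily message volume in TB for `senders` users at the stated rate."""
    return senders * msgs_per_day * msg_bytes / TB


def senders_for_volume(volume_tb, msgs_per_day=25, msg_bytes=4 * KB):
    """Number of users at the stated rate needed to produce `volume_tb` per day."""
    return volume_tb * TB // (msgs_per_day * msg_bytes)


print(daily_volume_tb(1_000_000_000))  # 100.0
print(senders_for_volume(10))          # 100000000
```

Under these assumptions, each user at the stated rate produces 100 KB/day, so 10 TB/day corresponds to about 100 million users sending at that rate.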
Problem – Cont.
• Must have: Cross-continent copy
  – Ideally concurrent updates across regions
  – At least disaster recoverable


• No Single Point of Failure for the Entire Service
  – FB has downtime on a few MySQL databases/day
  – No one cares


• Cannot Cannot Cannot lose Data
Solved Problem
• Attachment Store
  – Haystack
  – Stores FB Photos
  – Optimized for Immutable Data


• Hiring best programmers available 
  – Choose best design, not implementation
  – But get things done Fast
Write Throughput
• Disk:
  – Need a log-structured container
  – Can store small messages inline
  – Can store keyword index as well
  – What about read performance?


• Flash/Memory
  – Expensive
  – Only metadata
LSM Trees




• High write throughput

• Recent Data Clustered
   – Nice! Fits a mailbox access pattern

• Inherently Snapshotted
   – Backups/DR should be easy
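
The three properties above can be illustrated with a toy LSM tree. This is a minimal in-memory sketch, not how HBase implements it; the class name `TinyLSM` and its parameters are hypothetical. Writes go to a mutable memtable, which is periodically frozen into an immutable sorted segment; reads consult the newest data first; and because segments are never modified in place, a list of them is a consistent snapshot.

```python
import bisect


class TinyLSM:
    """Toy LSM tree: a mutable memtable plus immutable sorted segments."""

    def __init__(self, memtable_limit=4):
        self.memtable = {}
        self.segments = []  # oldest first; each is a sorted list of (key, value)
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        # Writes only touch memory; a real store would write segments in bulk.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Freeze the memtable into an immutable, sorted segment.
        if self.memtable:
            self.segments.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        # Newest data wins: memtable first, then segments newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for segment in reversed(self.segments):
            i = bisect.bisect_left(segment, (key,))
            if i < len(segment) and segment[i][0] == key:
                return segment[i][1]
        return None

    def snapshot(self):
        # Segments are immutable, so copying the list of them is enough.
        return list(self.segments)


db = TinyLSM(memtable_limit=2)
db.put("msg:001", "hello")
db.put("msg:002", "world")        # second write fills the memtable and flushes it
db.put("msg:001", "hello again")  # newer version lives in the memtable
print(db.get("msg:001"))          # hello again
```

Recent writes sit in the memtable and the newest segments, which is why a lookup for recent mail rarely has to reach the older segments.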
Reads?
• Write-Optimized => Read-Penalty

• Cache working set in App Server
  – At most one App Server per User
  – All mailbox updates via Application Server
  – Serve directly from cache

• Cold-Start
  – LSM tree clustering should make retrieving recent
    messages/threads fast.
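
The caching scheme above might look roughly like the following sketch. The class `MailboxAppServer`, its methods, and the dict standing in for the backing store are all hypothetical; the point is only that when a single app server owns a user and every update passes through it, the cache can be served without checking the store.

```python
class MailboxAppServer:
    """Hypothetical app server that owns a set of users' mailboxes."""

    def __init__(self, store, recent_limit=50):
        self.store = store              # stand-in for the backing store: user -> messages
        self.cache = {}                 # user -> most recent messages
        self.recent_limit = recent_limit

    def _load(self, user):
        # Cold start: fetch only the most recent messages for this user.
        if user not in self.cache:
            messages = self.store.get(user, [])
            self.cache[user] = list(messages[-self.recent_limit:])
        return self.cache[user]

    def append(self, user, message):
        # Write-through: the update is persisted and the cache is updated in
        # the same call, so the cache never misses an update.
        recent = self._load(user)
        self.store.setdefault(user, []).append(message)
        recent.append(message)
        del recent[:-self.recent_limit]  # keep only the working set

    def recent(self, user):
        # Served directly from the cache after the first access.
        return list(self._load(user))


store = {"alice": ["m1", "m2", "m3"]}
server = MailboxAppServer(store, recent_limit=2)
server.append("alice", "m4")
print(server.recent("alice"))  # ['m3', 'm4']
```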
SPOF?
• Single HBase/HDFS cluster?

• NO!
  – Lots of 100-node clusters
  – HDFS NameNode HA
Cassandra vs. HBase (abridged)
• Tested it out (c. 2010)
  – HBase held up, (FB Internal) Cassandra didn’t

• Tried to understand internals
  – HBase held up, Cassandra didn’t

• Really Really trusted HDFS
  – Stored PB of data for years with no loss

• Missing features in HBase/HDFS can be added
Disaster Recovery (HBase)
1. Ship HLog to Remote Data Center in real time
2. Update Remote Snapshot every day
3. Reset remote HLog

• No need to synchronize #2 and #3 perfectly
  – HLog replay is idempotent
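
Why idempotent replay removes the need to synchronize the snapshot update with the log reset can be shown with a simplified model. This sketch assumes every log entry is a put identified by (row, column, timestamp), in the spirit of HBase's versioned cells; it does not model deletes or use any HBase API, and the names `replay` and `latest` are hypothetical.

```python
def replay(snapshot, log_entries):
    """Apply log entries (row, column, timestamp, value) on top of a snapshot."""
    state = {key: dict(versions) for key, versions in snapshot.items()}
    for row, column, timestamp, value in log_entries:
        # A put is keyed by (row, column, timestamp), so applying the same
        # entry twice still leaves exactly one version of that cell.
        state.setdefault((row, column), {})[timestamp] = value
    return state


def latest(state, row, column):
    """Return the value with the highest timestamp, or None if absent."""
    versions = state.get((row, column), {})
    return versions[max(versions)] if versions else None


log = [
    ("user1", "thread:7", 100, "hi"),
    ("user1", "thread:7", 105, "hi there"),
    ("user2", "thread:9", 110, "hello"),
]

# The daily snapshot already covers the first two entries, but the remote log
# was reset late and still contains them.
snapshot = replay({}, log[:2])
recovered = replay(snapshot, log)

print(recovered == replay({}, log))           # True
print(latest(recovered, "user1", "thread:7"))  # hi there
```

Replaying entries the snapshot already contains changes nothing, so the remote log only has to cover at least everything after the snapshot, not exactly that.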
Test!




    Try to avoid writing a cache in Java
What about Flash?
• In HBase:
  – Store recent LSM tree segments in Flash
  – Store HBase block cache
  – Inefficient in Cassandra! (3x LSM trees/cache)


• In App Server
  – Page User cache in/out of Flash
Lingering Doubts
• Small Components vs. Big Systems
  – Small Components are better
  – Is HDFS too big?
     • Separate DataNode, BlockManager, NameNode
     • HBase doesn’t need NameNode


• Gave up on Cross-DC concurrency
  – Partition Users if required
  – Global user->DC registry needs to deal with
    partitions and conflict resolution
  – TBD
Cassandra vs. HBase
Cassandra: Flat Earth
• The world is hierarchical
  – PCI Bus, Rack, Data Center, Region, Continent ..
  – Odds of Partitioning differ


vs.

• Symmetric hash ring spanning continents
  – Odds of partitioning considered constant
Cassandra – No Centralization
• The world has central (but HA) tiers:
  – DNS servers, Core-Switches, Memcache-Tier, …


• Cassandra: all servers independent
  – No authoritative commit log or snapshot
  – Do Repeat Your Reads (DRYR) paradigm
Philosophies have Consequences
• Consistent Reads are expensive
  – N=3, R=2, W=2
  – Ugh: why are reads expensive in a write-optimized
    system?


• Is Consistency foolproof?
  – Edge cases with failed writes
  – Internet still debating
  – If Science has Bugs – then imagine Code!
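
The N=3, R=2, W=2 setting above follows from the usual quorum rule: a read is guaranteed to see the latest acknowledged write only if every read quorum overlaps every write quorum, which holds exactly when R + W > N. The brute-force check below is a small sketch of that rule (the function name is hypothetical); it also shows why a consistent read has to contact two of the three replicas.

```python
from itertools import combinations


def quorums_overlap(n, r, w):
    """True if every read quorum of size r shares a replica with every write quorum of size w."""
    replicas = range(n)
    return all(
        set(read) & set(write)
        for read in combinations(replicas, r)
        for write in combinations(replicas, w)
    )


print(quorums_overlap(3, 2, 2))  # True
print(quorums_overlap(3, 1, 2))  # False
```

With W=2, dropping to R=1 makes reads cheap but allows a read to land on the one replica that missed the write.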
Distributed Storage vs. Database
• How to recover failed block or disk?

• Distributed Storage (HDFS):
  – Simple - Find other replicas for that block.


• Distributed Database (Cassandra):
  – A ton of my databases lived on that drive
  – Hard: Let’s merge all the affected databases
Eventual Consistency
• Read-Modify-Write pattern problematic
  1. Read value
  2. Apply Business Logic
  3. Write value
  Stale Read leads to Junk


• What about atomic increments?
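
A small sketch of the problem above, using a hypothetical unread-message counter. Two clients each add 1; with read-modify-write, a stale read makes one update disappear, while increments expressed as deltas can be merged in any order because addition is commutative. No real replication is modeled here.

```python
def read_modify_write(value_read, delta):
    # The client computes a new absolute value from whatever it read.
    return value_read + delta


counter = 5

# Client B reads from a replica that has not yet seen client A's write.
write_a = read_modify_write(counter, 1)  # A reads 5, writes 6
write_b = read_modify_write(counter, 1)  # B reads stale 5, also writes 6
rmw_result = write_b                     # last write wins: one update is lost

# With increments, replicas exchange deltas instead of absolute values.
deltas = [1, 1]
increment_result = counter + sum(deltas)

print(rmw_result, increment_result)  # 6 7
```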
Conflict Resolution
• Easy to resolve conflicts in Increments

• Imagine multi-row transactions
  – Pointless resolving conflicts at row level


Solve conflicts at highest possible layer
  – Transaction Monitor
How did it work out?
• Ton of missing HBase/HDFS features added
  – Bloom Filters, NameNode HA
  – Remote HLog shipping
  – Modified Block Placement Policy
  – Sticky Regions
  – Improved Block Cache
  –…
• User -> AppServer via ZooKeeper
• App Server worked out

Messaging architecture @FB (Fifth Elephant Conference)
