Building Content-Based
Publish/Subscribe Systems
with Distributed Hash Tables



 David Tam, Reza Azimi, Hans-Arno Jacobsen
          University of Toronto, Canada
                 September 8, 2003
Introduction:
Publish/Subscribe Systems

  Push model (a.k.a. event-notification)
      subscribe     publish      match
  Applications:
      stock-market, auction, eBay, news
  2 types
      topic-based: ≈ Usenet newsgroup topics
      content-based: attribute-value pairs
         e.g. (attr1 = value1) ∧ (attr2 = value2) ∧ (attr3 > value3)
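
A minimal sketch of what a content-based match could look like, assuming a subscription is a list of (attribute, operator, value) predicates joined by logical AND and a publication is a dictionary of attribute-value pairs. The attribute names (symbol, exchange, price) and the operator set are illustrative assumptions, not taken from the slides.

```python
import operator

# Comparison operators allowed in a predicate (an assumed, minimal set).
OPS = {"=": operator.eq, ">": operator.gt, "<": operator.lt}


def matches(subscription, publication):
    """Return True if the publication satisfies every predicate (logical AND)."""
    for attr, op, value in subscription:
        if attr not in publication:
            return False  # a missing attribute cannot satisfy a predicate
        if not OPS[op](publication[attr], value):
            return False
    return True


# Hypothetical stock-market example in the (attr1 = v1) AND (attr2 = v2) AND (attr3 > v3) form.
subscription = [("symbol", "=", "IBM"), ("exchange", "=", "NYSE"), ("price", ">", 100)]
publication = {"symbol": "IBM", "exchange": "NYSE", "price": 105, "volume": 3000}

is_match = matches(subscription, publication)
```

Extra attributes in the publication (volume here) are ignored; only the attributes named by the subscription are checked.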
The Problem:
Content-Based Publish/Subscribe

  Traditionally centralized
      scalability?
  More recently: distributed
      e.g. SIENA
         small set of brokers
         not P2P
  How about fully distributed?
      exploit P2P
      1000s of brokers
Proposed Solution:
Use Distributed Hash Tables
  DHTs
      hash buckets are mapped to P2P nodes
  Why DHTs?
      scalability, fault-tolerance, load-balancing
  Challenges
      distributed but co-ordinated and light-weight:
         subscribing
         publishing
         matching is difficult
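
To make "hash buckets are mapped to P2P nodes" concrete, here is a rough sketch that hashes node names and keys into one identifier space and assigns each key to the node whose identifier is closest on a ring. The 32-bit identifier space, the use of SHA-1, and the node names are assumptions for illustration; a real DHT finds this node by routing through the overlay rather than by scanning a full node list.

```python
import hashlib

ID_BITS = 32              # assumed size of the identifier space
ID_SPACE = 2 ** ID_BITS


def to_id(text):
    """Map a string (node name or hash key) into the shared identifier space."""
    digest = hashlib.sha1(text.encode("utf-8")).digest()
    return int.from_bytes(digest[: ID_BITS // 8], "big")


def ring_distance(a, b):
    """Distance between two identifiers on a circular identifier space."""
    d = abs(a - b)
    return min(d, ID_SPACE - d)


def home_node(key, nodes):
    """Pick the node whose identifier is closest to the key's identifier."""
    key_id = to_id(key)
    # The node name breaks ties so the choice is deterministic.
    return min(nodes, key=lambda n: (ring_distance(to_id(n), key_id), n))


nodes = [f"node{i}" for i in range(8)]      # hypothetical P2P nodes
bucket_owner = home_node("attr1=value1", nodes)
```

Because every participant applies the same function, any node that computes the same key arrives at the same home node without central co-ordination.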
Basic Scheme
   A matching publisher & subscriber must come up with
   the same hash keys based on the content

   [Figure, animated over five frames: the buckets of the Distributed Hash
   Table are spread over the nodes of the distributed publish/subscribe system.]
      Frame 1: the DHT buckets and the nodes of the system
      Frame 2: the subscriber issues a subscription; its hash key selects a home node
      Frame 3: the subscription is stored at the home node; the publisher issues a publication
      Frame 4: the publication reaches the same home node and meets the subscription
      Frame 5: the publication is delivered to the subscriber
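
The basic scheme can be sketched with a single dictionary standing in for the whole DHT: a subscription is stored under its hash key, and a publication sent to the same key is checked against whatever is stored there. The class name, method names, and the use of a callable as the subscription predicate are assumptions made for this sketch.

```python
from collections import defaultdict


class HomeNodeStore:
    """Simulates the DHT as one dictionary: hash key -> stored subscriptions."""

    def __init__(self):
        self.buckets = defaultdict(list)

    def subscribe(self, key, subscriber, predicate):
        """Store the subscription at the home node responsible for `key`."""
        self.buckets[key].append((subscriber, predicate))

    def publish(self, key, publication):
        """Match the publication at the home node; return subscribers to notify."""
        stored = self.buckets.get(key, [])
        return [subscriber for subscriber, predicate in stored if predicate(publication)]


store = HomeNodeStore()
store.subscribe("key1", "alice", lambda pub: pub["price"] > 100)

notified = store.publish("key1", {"price": 105})   # same key, predicate holds
missed = store.publish("key2", {"price": 105})     # different key, never meets the subscription
```

The second call shows why publisher and subscriber must derive the same key: a publication hashed to a different bucket never reaches the stored subscription.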
Naïve Approach
   [Figure: the publication carries values for all seven attributes
   (Attr1 to Attr7); the subscription constrains only Attr1, Attr4 and Attr6.
   Each side passes its attributes through a hash function to produce a key.]


   Publisher must produce keys for all possible attribute
   combinations:
        2^N keys for each publication
   Bottleneck at hash bucket node
        subscribing, publishing, matching
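
A small sketch of why the naive approach is expensive, assuming the publisher hashes every non-empty subset of its attribute-value pairs because it cannot know which subset a subscriber constrained. The function name and the seven-attribute record are illustrative.

```python
from itertools import combinations


def attribute_subsets(publication):
    """Yield every non-empty subset of the publication's attribute-value pairs."""
    attrs = sorted(publication)
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            yield tuple((attr, publication[attr]) for attr in combo)


publication = {f"attr{i}": f"value{i}" for i in range(1, 8)}   # N = 7 attributes
subsets = list(attribute_subsets(publication))

# 2^7 - 1 = 127 non-empty subsets; counting the empty set as well gives 2^7 = 128.
key_count = len(subsets)
```

Each additional attribute doubles the number of keys a single publication has to submit.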
Our Approach
  Domain Schema
     eliminates the 2^N problem
     similar to RDBMS schema
     set of attribute names
     set of value constraints
     set of indices
        create hash keys for indices only
        choose group of attributes that are common
        but combination of values rare
     well-known
Hash Key Composition
    Indices: {attr1}, {attr1, attr4}, {attr6, attr7}
    [Figure: the publication carries values for Attr1 to Attr7, so it yields
    one key per index: Key1 from {attr1}, Key2 from {attr1, attr4} and Key3
    from {attr6, attr7}. The subscription constrains only Attr1, Attr4 and
    Attr6, so it yields Key1 and Key2 but not Key3.]

    Possible false-positives
         because partial matching
         filtered by system
    Possible duplicate notifications
         because multiple subscription keys
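
The key composition above can be sketched as follows, assuming a key is produced for an index only when the record supplies a value for every attribute in that index. The string encoding of the key, the use of SHA-1, and the helper names are assumptions for this sketch, not the authors' implementation.

```python
import hashlib

# Indices from the slide's example domain schema.
INDICES = [("attr1",), ("attr1", "attr4"), ("attr6", "attr7")]


def index_keys(record, indices=INDICES):
    """Return {index: hash key} for every index the record fully covers."""
    keys = {}
    for index in indices:
        if all(record.get(attr) is not None for attr in index):
            text = "|".join(f"{attr}={record[attr]}" for attr in index)
            keys[index] = hashlib.sha1(text.encode("utf-8")).hexdigest()
    return keys


def fully_matches(subscription, publication):
    """Final filter: every constrained attribute must equal the published value."""
    return all(publication.get(attr) == value for attr, value in subscription.items())


publication = {f"attr{i}": f"value{i}" for i in range(1, 8)}
subscription = {"attr1": "value1", "attr4": "value4", "attr6": "value6"}
loose = {"attr1": "value1", "attr4": "other"}   # shares only the {attr1} key

pub_keys = index_keys(publication)      # Key1, Key2, Key3
sub_keys = index_keys(subscription)     # Key1, Key2 (attr7 is unconstrained)
shared = set(pub_keys.values()) & set(sub_keys.values())
```

The `subscription` record meets the publication under two keys, which is where duplicate notifications come from. The `loose` record meets it under the {attr1} key even though attr4 differs, which is a false positive that the full comparison in `fully_matches` filters out.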
Our Approach (cont’d)
  Multicast Trees
      eliminates bottleneck at hash bucket nodes
      distributed subscribing, publishing, matching
   [Figure, animated over three frames: a tree rooted at the home node
   (hash bucket node) connects the existing subscribers. Frame 1 adds a new
   subscriber, frame 2 also labels non-subscriber nodes, and frame 3 shows
   new subscribers.]
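
A rough sketch of how a multicast tree can take load off the home node. It assumes a Scribe-style join, in which a join request travels along the overlay route toward the home node and stops at the first node that is already part of the tree, so intermediate non-subscribers become forwarders. The class, the explicit `route` argument, and the node names are assumptions for illustration; the slides do not give this level of detail.

```python
class MulticastTree:
    """Toy multicast tree rooted at the home node (hash bucket node)."""

    def __init__(self, root):
        self.root = root
        self.children = {root: set()}   # tree member -> its children
        self.subscribers = set()

    def join(self, route):
        """Join along `route` (new subscriber first, root last); return the graft point."""
        self.subscribers.add(route[0])
        if route[0] in self.children:
            return route[0]             # already a forwarder in the tree
        for child, parent in zip(route, route[1:]):
            already_in_tree = parent in self.children
            self.children.setdefault(child, set())
            self.children.setdefault(parent, set()).add(child)
            if already_in_tree:
                return parent           # the join stops here, not at the root
        return self.root

    def multicast(self):
        """Push one publication down the tree; return (subscribers reached, messages sent)."""
        delivered, messages, stack = set(), 0, [self.root]
        while stack:
            node = stack.pop()
            if node in self.subscribers:
                delivered.add(node)
            for child in self.children[node]:
                messages += 1
                stack.append(child)
        return delivered, messages


tree = MulticastTree("home")
first_graft = tree.join(["s1", "n1", "home"])    # n1 is a non-subscriber forwarder
second_graft = tree.join(["s2", "n1", "home"])   # stops at n1, never reaches home
delivered, messages = tree.multicast()
```

In this toy run the second join is absorbed by `n1`, and the home node sends a single message per publication while `n1` fans it out to both subscribers.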
Handling Range Queries
  Hash function ruins locality
      [Figure: inputs 1, 2, 3, 4 pass through hash( ); the outputs shown are 40, 90, 120, 170.]

  Divide range of values into intervals
           hash on interval labels
               e.g. RAM attribute
                  Label A: 0 to 128
                  Label B: 128 to 256
                  Label C: 256 to 512
                  Label D: 512 to ∞
               For RAM > 384, submit hash keys:
                  RAM = C
                  RAM = D
           intervals can be sized according to probability distribution
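
The interval scheme for the RAM example can be sketched as below. It assumes each interval includes its lower boundary and excludes its upper one, which the slide does not specify; the function names are illustrative.

```python
import bisect

BOUNDARIES = [128, 256, 512]        # interval boundaries from the RAM example
LABELS = ["A", "B", "C", "D"]       # A: 0-128, B: 128-256, C: 256-512, D: 512-inf


def label_for(value):
    """Label a publication hashes on (lower boundary inclusive, by assumption)."""
    return LABELS[bisect.bisect_right(BOUNDARIES, value)]


def labels_for_greater_than(threshold):
    """Labels a subscription must submit for the range query `attr > threshold`."""
    return LABELS[bisect.bisect_right(BOUNDARIES, threshold):]


query_labels = labels_for_greater_than(384)   # RAM > 384
hit = label_for(400)                          # inside C and above 384: true match
false_positive = label_for(300)               # inside C but not above 384
```

A publication with RAM = 300 lands in the same bucket as the RAM > 384 subscription because both use label C, so the exact comparison still has to run after the hash lookup.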
Implementation & Evaluation
  Main Goal: scalability
  Metric: message traffic
  Built using:
      Pastry DHT
      Scribe multicast trees
  Workload Generator: uniformly random distributions
Event Scalability: 1000 nodes




     Need well-designed schema with low false-positives
Node Scalability: 40000 subs, pubs
Range Query Scalability: 1000 nodes




  Multicast tree benefits
       e.g. 1 range vs 0 range, at 40000 subs, pubs
           Expected 2.33 × msgs, but got 1.6 ×
       subscription costs decrease
Range Query Scalability: 40000 subs, pubs
Conclusion
  Method: DHT + domain schema
  Scales to 1000s of nodes
  Multicast trees are important
  Interesting point in design space
      some restrictions on expression of content
          must adhere to domain schema



Future Work
  range query techniques
  examine multicast tree in detail
  locality-sensitive workload distributions
  real-world workloads
  detailed modelling of P2P network
  fault-tolerance
