SlideShare una empresa de Scribd logo
1 de 28
Scaling DDSto Millions of Computers and Devices Rick Warren, Director of Technology Solutions rick.warren@rti.com
System Component Component Class Last Year in Review:“Large-Scale System Integration with DDS” © 2011 Real-Time Innovations, Inc.   2 Recommended scaling approach: Hierarchical composition of subsystems  Multi-path data routing Class State Component Behavior
Last Year in Review:“Large-Scale System Integration with DDS” Proven within a fully connected subsystem: DDS-RTPS Simple Discovery of nearly 2,000 participants Large fan-out with little performance degradation © 2011 Real-Time Innovations, Inc.   3 <	15% decrease 	1~900 readers RTI Data Distribution Service 4.3 Red Hat EL 5.0, 32-bit 2.4 GHz processors Gigabit Ethernet UDP/IPv4 Reliable ordered delivery
Many users Each an instance of a common topic Subscribe to your friends Larger Zone User ID Andrew Zone Status Idle Status change start time Since 07:20 x 1,000’s x 10,000’s x … This Year: Example Scenario © 2011 Real-Time Innovations, Inc.   4 Topic: “Presence”
Problem Statement Network designed hierarchically Peers subscribe to their own data Peers may subscribe on behalf of other peers Large numbers of publishers and subscribers, distributed geographically Volume of data in total is very large …compared to the volume of data of interest …at any one location Must scale no worse than O(n), where n == # parties producing and/or consuming a certain set of data …in terms of bandwidth requirements …in terms of state storage requirements …in terms of processing time © 2011 Real-Time Innovations, Inc.   5
Future R&D needed based on principled extensions to existing stds. Problem Statement—Translation for DDS Routed network …at network / transport levels (Layer 3, Layer 4) …at data level Unicast communication No routable multicast Probably crossing at least one firewall Instance-level routing and filtering critical, in addition to topic-level Must consider distribution of application data + discovery data © 2011 Real-Time Innovations, Inc.   6 Products available and interoperable based on existing standards. DDS-RTPS TCP mapping available. Standardization kicking off. Not the subject of this talk.
Data Distribution © 2011 Real-Time Innovations, Inc.   7
Algorithmic Scalability Sending time is unavoidably linearin number of destinations Unicast delivery means must send() to each Implication: Avoid other factors > O(logn)—preferably O(1) e.g. Don’t perform a O(n) calculation with each send Keep n small! Don’t send stuff to destinations that don’t care about it. Discovery state growth is unavoidably linear [O(n)]in total system size Data must be maintained durably for each application, publication, subscription—at least once (Just like an IM or email system must keep a record of each user) Implication: Avoid other factors > O(logn)—preferably O(1) e.g. Don’t store information about everything everywhere: O(n2) Keep n small! Don’t store or transmit things you don’t care about. © 2011 Real-Time Innovations, Inc.   8
Everyone knows who wants anything State scales O((w+r)2) w = number of data writers r = number of data readers Discovery stabilizes very quickly for small n, changes are also quick Impractical for large n Each publisher knows who wants its stuff State scales ~O(kw + r) Number of writers on each topic ~= 1 Number of readers on each topic >=1, << w + r Discovery takes longer to stabilize Practical even for very large n—so long as fan-out, fan-in not too high Each publisher knows some who want its stuff, others who know more Increases latency: data traverses multiple hops Practical for arbitrarily large n Just special case of #2, where some subscribers forward to others Distributed Data Routing Approaches © 2011 Real-Time Innovations, Inc.   9
Distributed Data Routing Approaches Example scenario: 1 topic, very many instances, few writers/readers for each Everyone gets all instances (…on that topic) Network traffic scales ~O(wr)—quadratic Aggregate subscriber filtering CPU scales ~O(wr) Each subscriber gets only instances of interest Network traffic scales ~O(w)—linear, assuming few readers per writer Aggregate subscriber filtering CPU scales <= ~O(r)—depending on whether readers have additional filtering to do Needed: way to specify instance-level publications, subscriptions Needed: filter implementation scalable in space, time © 2011 Real-Time Innovations, Inc.   10 Topic Time Happy Sleepy Dopey Grumpy Doc DDS Enhancements
Expressing Instance-Level Interest Recommended enhancement: discoverable instance-level interest/intention on both pub and sub sides For efficient management by middleware implementations Example: Discovery data: don’t propagate pub data to non-matching subs Publishing Today: Infer from writer instance registration (register() or write()) Potential future: Pre-declared, e.g. through QoS policy Subscribing Today: Evaluate based on reader content-based filter Potential future: Explicit instance expression, intersected with content filters © 2011 Real-Time Innovations, Inc.   11
Efficient Instance Filters: Use Case © 2011 Real-Time Innovations, Inc.   12 Applications Data Router User = Doc User = Grumpy User = Happy User = Doc User = Dopey User = Grumpy User = Happy User = Sleepy User = Grumpy User = Sleepy … User = Dopey union
Efficient Instance Filters? Today’s SQL Filters © 2011 Real-Time Innovations, Inc.   13 Applications Data Router User = Doc OR User = Grumpy OR User = Happy Duplicate information (User = Doc OR User = Grumpy OR User = Happy) OR (User = Grumpy OR User = Sleepy) OR (User = Dopey) Size scales O(n) User = Grumpy OR User = Sleepy Eval time scales O(n) User = Dopey Resulting scalability approx. quadratic: Each send from “outside” router pays price proportional to system size “inside”. Problem
Efficient Instance Filters: Bloom Filters Designed for filtering based on membership in a set Total potential number of elements in set may be large—even infinite Evaluation may have false positives, but never false negatives Well suited to representing instances in a topic Compact representation Filter has fixed size, regardless of number of elements of interest Larger size => lower probability of false positives Well suited to interest aggregation Fast updates and evaluations Adding to filter: O(1) Evaluating membership in filter: O(1) Removing from filter requires recalculation: O(n) Well suited to high performance with changing interest © 2011 Real-Time Innovations, Inc.   14
Efficient Instance Filters: Bloom Filters © 2011 Real-Time Innovations, Inc.   15 Topic = “Presence” User = “Grumpy” + fixed number of bits; “hash” to set them Topic = “Presence” User = “Happy” = Union to add to filter; intersect bits to test membership Filter size can be easily modeled Filter maintenance, evaluation costs are negligible Open challenge: testing set overlap requires very sparse patterns
Per-Instance Late Joiners DDS-RTPS implements reliability at topic level Per-instance sequence numbers, heartbeats, state maintenance too expensive for large numbers of instances Challenge: how to be “late joiner” at instance level? For example: There are 10 million people (instances). I cared about 20 of them. Now I care about 21. How do I negatively acknowledge just the one instance? Implication: subscribers may need to cache “uninteresting” information in case it becomes interesting later Simple Endpoint Discovery is example of this © 2011 Real-Time Innovations, Inc.   16
Per-Instance Late Joiners DDS-RTPS implements reliability at topic level Per-instance sequence numbers, heartbeats, state maintenance too expensive for large numbers of instances Response: Several possible protocol improvements Maintain per-instance DDS-RTPS sessions anyway Number of instances per writer/reader is small in this use case …or maintain exactly 2 DDS-RTPS sessions: live data + “late joiner” instances …or introduce new instance-level protocol messages Supported by necessary state © 2011 Real-Time Innovations, Inc.   17
Discovery © 2011 Real-Time Innovations, Inc.   18
Distributing Discovery Data Needed for discovery: Identities of interested parties Data of interest—topics, instances, filters Conditions of interest—reliability, durability, history, other QoS Where to send data of interest Just data—can be distributed over DDS like any other Principle behind DDS-RTPS “built-in topics” …for participants …for publications and subscriptions …for topics One more step: bootstrapping Discovery service provides topics to describe the system…but how to find that discovery service? May be statically configured statically or dynamically-but-optimistically © 2011 Real-Time Innovations, Inc.   19
What is a DDS “Discovery Service”? Provides topics to tell you about DDS objects(or some subset): Domain participants Topics Publications Subscriptions Must be bootstrapped by static configuration Fully static configuration of which services to use—OR Static configuration of potential services (locators) + dynamic discovery of actual services Dynamic information must be propagated optimistically, best effort Scope of this presentation © 2011 Real-Time Innovations, Inc.   20
Today: DDS-RTPS “Simple Discovery” Two parts: 1. Participant Discovery Combines discovery bootstrapping with discovery of 1 participant Advertizes (built-in) topics for describing arbitrary (system-defined) topics, pubs, and subs Any Participant Built-in Topic Data sample canrepresent a discovery service Recommended additions: For scalable filtering, partitioning of discovery data: Include built-in topic data for other built-in endpoints For limiting endpoint discovery: Bloom filter of aggregated interest across all topics For aggregation of participant information: New reliable built-in topic for propagating data for other participants (“participant proxy”) © 2011 Real-Time Innovations, Inc.   21
Today: DDS-RTPS “Simple Discovery” Two parts: 2. Endpoint Discovery Regular DDS topics for describing topics, publications, subscriptions Reliable, durable—cache of most recent state Assumes known discovery service—provided by Participant Discovery Service may propagate “its own” endpoints or forward endpoints of others using existing DDS mechanisms Recommended additions: To limit discovery of unmatched endpoints: Bloom filters to express instance interest (Proposed “participant proxy” topic would fit into this model) © 2011 Real-Time Innovations, Inc.   22
Zone ~1K–10K Zone ~1K–10K Putting It Together: Scalable Communication © 2011 Real-Time Innovations, Inc.   23 assign, reassign Zone Routing Services (Redundant) Zone Discovery Services (Redundant) Gateway Discovery Services (Redundant) App App App preconfigured
Putting It Together: Participant Discovery © 2011 Real-Time Innovations, Inc.   24 ,[object Object]
New samples only on change
State scales O(az) (‘a’ in this zone)
4 KB x 10K = 40 MBZone Discovery Service (‘z’) 2 Gateway Discovery Service (‘g’) App (‘a’) ,[object Object]
ZDS announces participants matching app’s filter; app does endpoint discovery with that participant directly

Más contenido relacionado

La actualidad más candente

Performing Network & Security Analytics with Hadoop
Performing Network & Security Analytics with HadoopPerforming Network & Security Analytics with Hadoop
Performing Network & Security Analytics with Hadoop
DataWorks Summit
 
Chapter 1 organizing data vantage domain action and validity
Chapter 1  organizing data  vantage domain action and validityChapter 1  organizing data  vantage domain action and validity
Chapter 1 organizing data vantage domain action and validity
Phu Nguyen
 
Survey on Division and Replication of Data in Cloud for Optimal Performance a...
Survey on Division and Replication of Data in Cloud for Optimal Performance a...Survey on Division and Replication of Data in Cloud for Optimal Performance a...
Survey on Division and Replication of Data in Cloud for Optimal Performance a...
IJSRD
 
Standardizing the Data Distribution Service (DDS) API for Modern C++
Standardizing the Data Distribution Service (DDS) API for Modern C++Standardizing the Data Distribution Service (DDS) API for Modern C++
Standardizing the Data Distribution Service (DDS) API for Modern C++
Sumant Tambe
 

La actualidad más candente (20)

Performing Network & Security Analytics with Hadoop
Performing Network & Security Analytics with HadoopPerforming Network & Security Analytics with Hadoop
Performing Network & Security Analytics with Hadoop
 
Proposal for System Analysis and Desing
Proposal for System Analysis and DesingProposal for System Analysis and Desing
Proposal for System Analysis and Desing
 
Hadoop / Spark on Malware Expression
Hadoop / Spark on Malware ExpressionHadoop / Spark on Malware Expression
Hadoop / Spark on Malware Expression
 
IoT Protocols Integration with Vortex Gateway
IoT Protocols Integration with Vortex GatewayIoT Protocols Integration with Vortex Gateway
IoT Protocols Integration with Vortex Gateway
 
Reactive Data Centric Architectures with Vortex, Spark and ReactiveX
Reactive Data Centric Architectures with Vortex, Spark and ReactiveXReactive Data Centric Architectures with Vortex, Spark and ReactiveX
Reactive Data Centric Architectures with Vortex, Spark and ReactiveX
 
Chapter 1 organizing data vantage domain action and validity
Chapter 1  organizing data  vantage domain action and validityChapter 1  organizing data  vantage domain action and validity
Chapter 1 organizing data vantage domain action and validity
 
Mncs 16-08-3주-변승규-opportunistic flooding in low-duty-cycle wireless sensor ne...
Mncs 16-08-3주-변승규-opportunistic flooding in low-duty-cycle wireless sensor ne...Mncs 16-08-3주-변승규-opportunistic flooding in low-duty-cycle wireless sensor ne...
Mncs 16-08-3주-변승규-opportunistic flooding in low-duty-cycle wireless sensor ne...
 
The DDS Security Standard
The DDS Security StandardThe DDS Security Standard
The DDS Security Standard
 
Deep Learning for Fraud Detection
Deep Learning for Fraud DetectionDeep Learning for Fraud Detection
Deep Learning for Fraud Detection
 
IRJET- Usage of Multiple Clouds for Storing and Securing Data through Identit...
IRJET- Usage of Multiple Clouds for Storing and Securing Data through Identit...IRJET- Usage of Multiple Clouds for Storing and Securing Data through Identit...
IRJET- Usage of Multiple Clouds for Storing and Securing Data through Identit...
 
Survey on Division and Replication of Data in Cloud for Optimal Performance a...
Survey on Division and Replication of Data in Cloud for Optimal Performance a...Survey on Division and Replication of Data in Cloud for Optimal Performance a...
Survey on Division and Replication of Data in Cloud for Optimal Performance a...
 
Standardizing the Data Distribution Service (DDS) API for Modern C++
Standardizing the Data Distribution Service (DDS) API for Modern C++Standardizing the Data Distribution Service (DDS) API for Modern C++
Standardizing the Data Distribution Service (DDS) API for Modern C++
 
IRJET- Cloud based Deduplication using Middleware Approach
IRJET- Cloud based Deduplication using Middleware ApproachIRJET- Cloud based Deduplication using Middleware Approach
IRJET- Cloud based Deduplication using Middleware Approach
 
Review and Performance Comparison of Distributed Wireless Reprogramming Proto...
Review and Performance Comparison of Distributed Wireless Reprogramming Proto...Review and Performance Comparison of Distributed Wireless Reprogramming Proto...
Review and Performance Comparison of Distributed Wireless Reprogramming Proto...
 
Keep your Hadoop cluster at its best!
Keep your Hadoop cluster at its best!Keep your Hadoop cluster at its best!
Keep your Hadoop cluster at its best!
 
Hades
HadesHades
Hades
 
Open source network forensics and advanced pcap analysis
Open source network forensics and advanced pcap analysisOpen source network forensics and advanced pcap analysis
Open source network forensics and advanced pcap analysis
 
Deep Learning Based Real-Time DNS DDoS Detection System
Deep Learning Based Real-Time DNS DDoS Detection SystemDeep Learning Based Real-Time DNS DDoS Detection System
Deep Learning Based Real-Time DNS DDoS Detection System
 
El35782786
El35782786El35782786
El35782786
 
Building and Scaling Internet of Things Applications with Vortex Cloud
Building and Scaling Internet of Things Applications with Vortex CloudBuilding and Scaling Internet of Things Applications with Vortex Cloud
Building and Scaling Internet of Things Applications with Vortex Cloud
 

Destacado

Destacado (15)

Introduction to Robotic Technology Components (RTC), Robotics DTF
Introduction to Robotic Technology Components (RTC), Robotics DTFIntroduction to Robotic Technology Components (RTC), Robotics DTF
Introduction to Robotic Technology Components (RTC), Robotics DTF
 
Web-Enabled DDS: Revised Submission
Web-Enabled DDS: Revised SubmissionWeb-Enabled DDS: Revised Submission
Web-Enabled DDS: Revised Submission
 
Engineering Interoperable and Reliable Systems
Engineering Interoperable and Reliable SystemsEngineering Interoperable and Reliable Systems
Engineering Interoperable and Reliable Systems
 
From the Tactical Edge to the Enterprise: Integrating DDS and JMS
From the Tactical Edge to the Enterprise: Integrating DDS and JMSFrom the Tactical Edge to the Enterprise: Integrating DDS and JMS
From the Tactical Edge to the Enterprise: Integrating DDS and JMS
 
DDS in a Nutshell
DDS in a NutshellDDS in a Nutshell
DDS in a Nutshell
 
Java 5 Language PSM for DDS: Final Submission
Java 5 Language PSM for DDS: Final SubmissionJava 5 Language PSM for DDS: Final Submission
Java 5 Language PSM for DDS: Final Submission
 
Java 5 API for DDS RFP (out of date)
Java 5 API for DDS RFP (out of date)Java 5 API for DDS RFP (out of date)
Java 5 API for DDS RFP (out of date)
 
Extensible and Dynamic Topic Types For DDS (out of date)
Extensible and Dynamic Topic Types For DDS (out of date)Extensible and Dynamic Topic Types For DDS (out of date)
Extensible and Dynamic Topic Types For DDS (out of date)
 
Data-centric Invocable Services
Data-centric Invocable ServicesData-centric Invocable Services
Data-centric Invocable Services
 
Data-Centric and Message-Centric System Architecture
Data-Centric and Message-Centric System ArchitectureData-Centric and Message-Centric System Architecture
Data-Centric and Message-Centric System Architecture
 
Robotic Technology Component (RTC) Specification
Robotic Technology Component (RTC) SpecificationRobotic Technology Component (RTC) Specification
Robotic Technology Component (RTC) Specification
 
Java 5 PSM for DDS: Revised Submission (out of date)
Java 5 PSM for DDS: Revised Submission (out of date)Java 5 PSM for DDS: Revised Submission (out of date)
Java 5 PSM for DDS: Revised Submission (out of date)
 
Mapping the RESTful Programming Model to the DDS Data-Centric Model
Mapping the RESTful Programming Model to the DDS Data-Centric ModelMapping the RESTful Programming Model to the DDS Data-Centric Model
Mapping the RESTful Programming Model to the DDS Data-Centric Model
 
Letters from the Trenches: Lessons Learned Taking MongoDB to Production
Letters from the Trenches: Lessons Learned Taking MongoDB to ProductionLetters from the Trenches: Lessons Learned Taking MongoDB to Production
Letters from the Trenches: Lessons Learned Taking MongoDB to Production
 
Introduction to Robotic Technology Components (RTC), MARS PTF
Introduction to Robotic Technology Components (RTC), MARS PTFIntroduction to Robotic Technology Components (RTC), MARS PTF
Introduction to Robotic Technology Components (RTC), MARS PTF
 

Similar a Scaling DDS to Millions of Computers and Devices

Introduction to Streaming Analytics
Introduction to Streaming AnalyticsIntroduction to Streaming Analytics
Introduction to Streaming Analytics
Guido Schmutz
 
Introduction to Stream Processing
Introduction to Stream ProcessingIntroduction to Stream Processing
Introduction to Stream Processing
Guido Schmutz
 
Handling Data in Mega Scale Web Systems
Handling Data in Mega Scale Web SystemsHandling Data in Mega Scale Web Systems
Handling Data in Mega Scale Web Systems
Vineet Gupta
 

Similar a Scaling DDS to Millions of Computers and Devices (20)

OMG Data-Distribution Service Security
OMG Data-Distribution Service SecurityOMG Data-Distribution Service Security
OMG Data-Distribution Service Security
 
Distributed Systems: How to connect your real-time applications
Distributed Systems: How to connect your real-time applicationsDistributed Systems: How to connect your real-time applications
Distributed Systems: How to connect your real-time applications
 
Scaling Streaming - Concepts, Research, Goals
Scaling Streaming - Concepts, Research, GoalsScaling Streaming - Concepts, Research, Goals
Scaling Streaming - Concepts, Research, Goals
 
Easing Integration of Large-Scale Real-Time Systems with DDS
Easing Integration of Large-Scale Real-Time Systems with DDSEasing Integration of Large-Scale Real-Time Systems with DDS
Easing Integration of Large-Scale Real-Time Systems with DDS
 
SplunkLive! Frankfurt 2018 - Data Onboarding Overview
SplunkLive! Frankfurt 2018 - Data Onboarding OverviewSplunkLive! Frankfurt 2018 - Data Onboarding Overview
SplunkLive! Frankfurt 2018 - Data Onboarding Overview
 
Qo Introduction V2
Qo Introduction V2Qo Introduction V2
Qo Introduction V2
 
SplunkLive! Munich 2018: Data Onboarding Overview
SplunkLive! Munich 2018: Data Onboarding OverviewSplunkLive! Munich 2018: Data Onboarding Overview
SplunkLive! Munich 2018: Data Onboarding Overview
 
Open Source Bristol 30 March 2022
Open Source Bristol 30 March 2022Open Source Bristol 30 March 2022
Open Source Bristol 30 March 2022
 
Introduction to Streaming Analytics
Introduction to Streaming AnalyticsIntroduction to Streaming Analytics
Introduction to Streaming Analytics
 
Introduction to Stream Processing
Introduction to Stream ProcessingIntroduction to Stream Processing
Introduction to Stream Processing
 
RTI Data-Distribution Service (DDS) Master Class 2011
RTI Data-Distribution Service (DDS) Master Class 2011RTI Data-Distribution Service (DDS) Master Class 2011
RTI Data-Distribution Service (DDS) Master Class 2011
 
Distributed Algorithms with DDS
Distributed Algorithms with DDSDistributed Algorithms with DDS
Distributed Algorithms with DDS
 
Self-Tuning Data Centers
Self-Tuning Data CentersSelf-Tuning Data Centers
Self-Tuning Data Centers
 
Cyclone DDS Unleashed: Scalability in DDS and Dealing with Large Systems
Cyclone DDS Unleashed: Scalability in DDS and Dealing with Large SystemsCyclone DDS Unleashed: Scalability in DDS and Dealing with Large Systems
Cyclone DDS Unleashed: Scalability in DDS and Dealing with Large Systems
 
Detecting Hacks: Anomaly Detection on Networking Data
Detecting Hacks: Anomaly Detection on Networking DataDetecting Hacks: Anomaly Detection on Networking Data
Detecting Hacks: Anomaly Detection on Networking Data
 
Solving Cybersecurity at Scale
Solving Cybersecurity at ScaleSolving Cybersecurity at Scale
Solving Cybersecurity at Scale
 
Introduction Big Data
Introduction Big DataIntroduction Big Data
Introduction Big Data
 
SQL Server 2008 R2 StreamInsight
SQL Server 2008 R2 StreamInsightSQL Server 2008 R2 StreamInsight
SQL Server 2008 R2 StreamInsight
 
Handling Data in Mega Scale Web Systems
Handling Data in Mega Scale Web SystemsHandling Data in Mega Scale Web Systems
Handling Data in Mega Scale Web Systems
 
Vitus Masters Defense
Vitus Masters DefenseVitus Masters Defense
Vitus Masters Defense
 

Más de Rick Warren

Más de Rick Warren (11)

Real-World Git
Real-World GitReal-World Git
Real-World Git
 
Building Scalable Stateless Applications with RxJava
Building Scalable Stateless Applications with RxJavaBuilding Scalable Stateless Applications with RxJava
Building Scalable Stateless Applications with RxJava
 
Patterns of Data Distribution
Patterns of Data DistributionPatterns of Data Distribution
Patterns of Data Distribution
 
C++ PSM for DDS: Revised Submission
C++ PSM for DDS: Revised SubmissionC++ PSM for DDS: Revised Submission
C++ PSM for DDS: Revised Submission
 
Java 5 PSM for DDS: Initial Submission (out of date)
Java 5 PSM for DDS: Initial Submission (out of date)Java 5 PSM for DDS: Initial Submission (out of date)
Java 5 PSM for DDS: Initial Submission (out of date)
 
Extensible and Dynamic Topic Types for DDS, Beta 1
Extensible and Dynamic Topic Types for DDS, Beta 1Extensible and Dynamic Topic Types for DDS, Beta 1
Extensible and Dynamic Topic Types for DDS, Beta 1
 
Large-Scale System Integration with DDS for SCADA, C2, and Finance
Large-Scale System Integration with DDS for SCADA, C2, and FinanceLarge-Scale System Integration with DDS for SCADA, C2, and Finance
Large-Scale System Integration with DDS for SCADA, C2, and Finance
 
Extensible and Dynamic Topic Types for DDS
Extensible and Dynamic Topic Types for DDSExtensible and Dynamic Topic Types for DDS
Extensible and Dynamic Topic Types for DDS
 
Introduction to DDS
Introduction to DDSIntroduction to DDS
Introduction to DDS
 
Extensible and Dynamic Topic Types for DDS
Extensible and Dynamic Topic Types for DDSExtensible and Dynamic Topic Types for DDS
Extensible and Dynamic Topic Types for DDS
 
Proposed Java 5 API for DDS (out of date)
Proposed Java 5 API for DDS (out of date)Proposed Java 5 API for DDS (out of date)
Proposed Java 5 API for DDS (out of date)
 

Último

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 

Último (20)

Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 

Scaling DDS to Millions of Computers and Devices

  • 1. Scaling DDSto Millions of Computers and Devices Rick Warren, Director of Technology Solutions rick.warren@rti.com
  • 2. System Component Component Class Last Year in Review:“Large-Scale System Integration with DDS” © 2011 Real-Time Innovations, Inc. 2 Recommended scaling approach: Hierarchical composition of subsystems Multi-path data routing Class State Component Behavior
  • 3. Last Year in Review:“Large-Scale System Integration with DDS” Proven within a fully connected subsystem: DDS-RTPS Simple Discovery of nearly 2,000 participants Large fan-out with little performance degradation © 2011 Real-Time Innovations, Inc. 3 < 15% decrease 1~900 readers RTI Data Distribution Service 4.3 Red Hat EL 5.0, 32-bit 2.4 GHz processors Gigabit Ethernet UDP/IPv4 Reliable ordered delivery
  • 4. Many users Each an instance of a common topic Subscribe to your friends Larger Zone User ID Andrew Zone Status Idle Status change start time Since 07:20 x 1,000’s x 10,000’s x … This Year: Example Scenario © 2011 Real-Time Innovations, Inc. 4 Topic: “Presence”
  • 5. Problem Statement Network designed hierarchically Peers subscribe to their own data Peers may subscribe on behalf of other peers Large numbers of publishers and subscribers, distributed geographically Volume of data in total is very large …compared to the volume of data of interest …at any one location Must scale no worse than O(n), where n == # parties producing and/or consuming a certain set of data …in terms of bandwidth requirements …in terms of state storage requirements …in terms of processing time © 2011 Real-Time Innovations, Inc. 5
  • 6. Future R&D needed based on principled extensions to existing stds. Problem Statement—Translation for DDS Routed network …at network / transport levels (Layer 3, Layer 4) …at data level Unicast communication No routable multicast Probably crossing at least one firewall Instance-level routing and filtering critical, in addition to topic-level Must consider distribution of application data + discovery data © 2011 Real-Time Innovations, Inc. 6 Products available and interoperable based on existing standards. DDS-RTPS TCP mapping available. Standardization kicking off. Not the subject of this talk.
  • 7. Data Distribution © 2011 Real-Time Innovations, Inc. 7
  • 8. Algorithmic Scalability Sending time is unavoidably linearin number of destinations Unicast delivery means must send() to each Implication: Avoid other factors > O(logn)—preferably O(1) e.g. Don’t perform a O(n) calculation with each send Keep n small! Don’t send stuff to destinations that don’t care about it. Discovery state growth is unavoidably linear [O(n)]in total system size Data must be maintained durably for each application, publication, subscription—at least once (Just like an IM or email system must keep a record of each user) Implication: Avoid other factors > O(logn)—preferably O(1) e.g. Don’t store information about everything everywhere: O(n2) Keep n small! Don’t store or transmit things you don’t care about. © 2011 Real-Time Innovations, Inc. 8
  • 9. Everyone knows who wants anything State scales O((w+r)2) w = number of data writers r = number of data readers Discovery stabilizes very quickly for small n, changes are also quick Impractical for large n Each publisher knows who wants its stuff State scales ~O(kw + r) Number of writers on each topic ~= 1 Number of readers on each topic >=1, << w + r Discovery takes longer to stabilize Practical even for very large n—so long as fan-out, fan-in not too high Each publisher knows some who want its stuff, others who know more Increases latency: data traverses multiple hops Practical for arbitrarily large n Just special case of #2, where some subscribers forward to others Distributed Data Routing Approaches © 2011 Real-Time Innovations, Inc. 9
  • 10. Distributed Data Routing Approaches Example scenario: 1 topic, very many instances, few writers/readers for each Everyone gets all instances (…on that topic) Network traffic scales ~O(wr)—quadratic Aggregate subscriber filtering CPU scales ~O(wr) Each subscriber gets only instances of interest Network traffic scales ~O(w)—linear, assuming few readers per writer Aggregate subscriber filtering CPU scales <= ~O(r)—depending on whether readers have additional filtering to do Needed: way to specify instance-level publications, subscriptions Needed: filter implementation scalable in space, time © 2011 Real-Time Innovations, Inc. 10 Topic Time Happy Sleepy Dopey Grumpy Doc DDS Enhancements
  • 11. Expressing Instance-Level Interest Recommended enhancement: discoverable instance-level interest/intention on both pub and sub sides For efficient management by middleware implementations Example: Discovery data: don’t propagate pub data to non-matching subs Publishing Today: Infer from writer instance registration (register() or write()) Potential future: Pre-declared, e.g. through QoS policy Subscribing Today: Evaluate based on reader content-based filter Potential future: Explicit instance expression, intersected with content filters © 2011 Real-Time Innovations, Inc. 11
  • 12. Efficient Instance Filters: Use Case © 2011 Real-Time Innovations, Inc. 12 Applications Data Router User = Doc User = Grumpy User = Happy User = Doc User = Dopey User = Grumpy User = Happy User = Sleepy User = Grumpy User = Sleepy … User = Dopey union
  • 13. Efficient Instance Filters? Today’s SQL Filters © 2011 Real-Time Innovations, Inc. 13 Applications Data Router User = Doc OR User = Grumpy OR User = Happy Duplicate information (User = Doc OR User = Grumpy OR User = Happy) OR (User = Grumpy OR User = Sleepy) OR (User = Dopey) Size scales O(n) User = Grumpy OR User = Sleepy Eval time scales O(n) User = Dopey Resulting scalability approx. quadratic: Each send from “outside” router pays price proportional to system size “inside”. Problem
  • 14. Efficient Instance Filters: Bloom Filters Designed for filtering based on membership in a set Total potential number of elements in set may be large—even infinite Evaluation may have false positives, but never false negatives Well suited to representing instances in a topic Compact representation Filter has fixed size, regardless of number of elements of interest Larger size => lower probability of false positives Well suited to interest aggregation Fast updates and evaluations Adding to filter: O(1) Evaluating membership in filter: O(1) Removing from filter requires recalculation: O(n) Well suited to high performance with changing interest © 2011 Real-Time Innovations, Inc. 14
  • 15. Efficient Instance Filters: Bloom Filters © 2011 Real-Time Innovations, Inc. 15 Topic = “Presence” User = “Grumpy” + fixed number of bits; “hash” to set them Topic = “Presence” User = “Happy” = Union to add to filter; intersect bits to test membership Filter size can be easily modeled Filter maintenance, evaluation costs are negligible Open challenge: testing set overlap requires very sparse patterns
  • 16. Per-Instance Late Joiners DDS-RTPS implements reliability at topic level Per-instance sequence numbers, heartbeats, state maintenance too expensive for large numbers of instances Challenge: how to be “late joiner” at instance level? For example: There are 10 million people (instances). I cared about 20 of them. Now I care about 21. How do I negatively acknowledge just the one instance? Implication: subscribers may need to cache “uninteresting” information in case it becomes interesting later Simple Endpoint Discovery is example of this © 2011 Real-Time Innovations, Inc. 16
  • 17. Per-Instance Late Joiners DDS-RTPS implements reliability at topic level Per-instance sequence numbers, heartbeats, state maintenance too expensive for large numbers of instances Response: Several possible protocol improvements Maintain per-instance DDS-RTPS sessions anyway Number of instances per writer/reader is small in this use case …or maintain exactly 2 DDS-RTPS sessions: live data + “late joiner” instances …or introduce new instance-level protocol messages Supported by necessary state © 2011 Real-Time Innovations, Inc. 17
  • 18. Discovery © 2011 Real-Time Innovations, Inc. 18
  • 19. Distributing Discovery Data Needed for discovery: Identities of interested parties Data of interest—topics, instances, filters Conditions of interest—reliability, durability, history, other QoS Where to send data of interest Just data—can be distributed over DDS like any other Principle behind DDS-RTPS “built-in topics” …for participants …for publications and subscriptions …for topics One more step: bootstrapping Discovery service provides topics to describe the system…but how to find that discovery service? May be statically configured statically or dynamically-but-optimistically © 2011 Real-Time Innovations, Inc. 19
  • 20. What is a DDS “Discovery Service”? Provides topics to tell you about DDS objects(or some subset): Domain participants Topics Publications Subscriptions Must be bootstrapped by static configuration Fully static configuration of which services to use—OR Static configuration of potential services (locators) + dynamic discovery of actual services Dynamic information must be propagated optimistically, best effort Scope of this presentation © 2011 Real-Time Innovations, Inc. 20
  • 21. Today: DDS-RTPS “Simple Discovery” Two parts: 1. Participant Discovery Combines discovery bootstrapping with discovery of 1 participant Advertizes (built-in) topics for describing arbitrary (system-defined) topics, pubs, and subs Any Participant Built-in Topic Data sample canrepresent a discovery service Recommended additions: For scalable filtering, partitioning of discovery data: Include built-in topic data for other built-in endpoints For limiting endpoint discovery: Bloom filter of aggregated interest across all topics For aggregation of participant information: New reliable built-in topic for propagating data for other participants (“participant proxy”) © 2011 Real-Time Innovations, Inc. 21
  • 22. Today: DDS-RTPS “Simple Discovery” Two parts: 2. Endpoint Discovery Regular DDS topics for describing topics, publications, subscriptions Reliable, durable—cache of most recent state Assumes known discovery service—provided by Participant Discovery Service may propagate “its own” endpoints or forward endpoints of others using existing DDS mechanisms Recommended additions: To limit discovery of unmatched endpoints: Bloom filters to express instance interest (Proposed “participant proxy” topic would fit into this model) © 2011 Real-Time Innovations, Inc. 22
  • 23. Zone ~1K–10K Zone ~1K–10K Putting It Together: Scalable Communication © 2011 Real-Time Innovations, Inc. 23 assign, reassign Zone Routing Services (Redundant) Zone Discovery Services (Redundant) Gateway Discovery Services (Redundant) App App App preconfigured
  • 24.
  • 25. New samples only on change
  • 26. State scales O(az) (‘a’ in this zone)
  • 27.
  • 28. ZDS announces participants matching app’s filter; app does endpoint discovery with that participant directly
  • 30. zz == 2 or 3
  • 31. p = # interesting participants
  • 32.
  • 34. State scales O(z) + O(a) (‘a’ across all zones)
  • 35. Expected z = 1K; a = 10M but expire quickly
  • 36.
  • 38. …for participants with matching endpointsZone Discovery Service App App App Apps receive matching endpoints …directly from apps that have them …based on filters attached to built-in readers Do not discover non-matching endpoints Treated as late joiners for new matches when endpoints change
  • 39. Putting It Together: Data Distribution Traffic of interest outside zone routed there by dedicated service Traffic withinzone flowsdirectly, peerto peer © 2011 Real-Time Innovations, Inc. 26 Zone Zone Routing Services (Redundant) App App App
  • 40. Conclusion Last year’s talk: demonstration of current scalability Thousands of collaborating applications Based on: Federated and/or composed DDS domains Multi-path data routing Today: roadmap for scaling DDS out further Millions of collaborating applications Based on: Managing discovery over the WAN Improving efficiency of data filtering, routing for all topics—discovery also benefits Based on incremental extensions to existing standard technology © 2011 Real-Time Innovations, Inc. 27
  • 41. Thank You © 2011 Real-Time Innovations, Inc. 28

Notas del editor

  1. Distributed routing = don’t have one broker who knows everything and forwards the data for you