PUNCH
Partitioning Using Natural-Cut Heuristics

       Daniel Delling (Microsoft Research)
    Andrew V. Goldberg (Microsoft Research)
     Ilya Razenshteyn (Moscow University)
    Renato F. Werneck (Microsoft Research)



                 May 19, 2010
Motivation




      Goal: process a continental-sized road network in parallel
      (Europe: 18M nodes and 43M arcs).
      The first natural step: divide it into “small” parts with few
      arcs between them.
      Partitioning problems are NP-hard, but are routinely solved
      in practice using heuristics.
An example
Applications: routing on road networks
   Idea: Precompute distances between boundary nodes of each cell.

   Overlay Graph:
       Nodes: boundary nodes,
       Edges between boundary nodes.

   Search Graph (for a query from s to t):
       Source and target cell,
       Overlay graph,
       Use bidirectional Dijkstra.

   Number of cut edges affects the performance heavily.
   More applications: arc-flags and reach.
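The overlay idea above can be sketched in a few lines. This is a minimal illustration, not the routing code behind the talk: the function names, the toy graph, and the choice to keep cut edges in the overlay alongside the precomputed in-cell distances are all assumptions made for the example.

```python
import heapq
from collections import defaultdict

def boundary_nodes(edges, cell_of):
    """Endpoints of cut edges, i.e. edges whose ends lie in different cells."""
    boundary = set()
    for u, v, _w in edges:
        if cell_of[u] != cell_of[v]:
            boundary.update((u, v))
    return boundary

def dijkstra_within_cell(adj, cell_of, source):
    """Shortest distances from source that never leave source's cell."""
    cell = cell_of[source]
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            if cell_of[v] != cell:
                continue
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def build_overlay(edges, cell_of):
    """Overlay on boundary nodes: cut edges plus in-cell shortest distances."""
    adj = defaultdict(list)
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    boundary = boundary_nodes(edges, cell_of)
    overlay = {}
    for u, v, w in edges:  # assumption: cut edges are kept unchanged
        if cell_of[u] != cell_of[v]:
            overlay[(u, v)] = overlay[(v, u)] = w
    for b in boundary:  # precomputed distances between boundary nodes of a cell
        for t, d in dijkstra_within_cell(adj, cell_of, b).items():
            if t != b and t in boundary:
                overlay[(b, t)] = d
    return boundary, overlay

# Toy instance (hypothetical): cell 0 = {a, b, c}, cell 1 = {d, e}.
edges = [("a", "b", 2), ("b", "c", 3), ("a", "d", 1), ("c", "e", 1), ("d", "e", 10)]
cell_of = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1}
boundary, overlay = build_overlay(edges, cell_of)
```

In this toy instance the interior node b disappears from the overlay, and each cut edge contributes two boundary nodes, which is why the number of cut edges drives the size of the search graph.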
Existing solvers




       METIS [KK’99], SCOTCH [PR’96], KAPPA [HSS’10],
       KASPAR [OS’10], KAFFPA [SS’10]. General purpose; some
       are fast, some produce very good solutions.
       There are many more . . .
       Our goal: partitioner tailored to road networks,
       emphasize quality, still fast enough in practice.
Formal definition



      Input: undirected graph G = (V, E), with n = |V|.
      Result: partition V = V1 ∪ V2 ∪ . . . ∪ Vk, with Vi ∩ Vj = ∅ for i ≠ j.
      Goal: minimize the number of edges between different cells Vi.
      Two common variants:
          given U, require |Vi| ≤ U for every i,
          given k and ε, require |Vi| ≤ (1 + ε) n/k.
      We focus on the first one.
      Rebalancing is possible.
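As a small illustration of the definition, the sketch below evaluates a given partition: it counts the edges between different cells and checks the bound |Vi| ≤ U. The function names and the toy graph are assumptions made for the example.

```python
from collections import Counter

def cut_size(edges, cell_of):
    """Number of edges whose endpoints lie in different cells."""
    return sum(1 for u, v in edges if cell_of[u] != cell_of[v])

def respects_bound(cell_of, U):
    """True if every cell has at most U nodes."""
    sizes = Counter(cell_of.values())
    return all(size <= U for size in sizes.values())

# Toy instance (hypothetical): two triangles joined by the single edge (2, 3).
edges = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4), (4, 5), (5, 3)]
cell_of = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}

print(cut_size(edges, cell_of))       # cost of this partition
print(respects_bound(cell_of, U=3))   # feasibility for U = 3
```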
Intuition




   Road networks: dense regions (grids, cities) interleaved with
   natural cuts (mountains, parks, rivers, deserts, sparse areas,
   freeways).
Summary of the algorithm




      Filtering:
           Contracts dense regions,
           Reduces graph size,
           Preserves natural cuts structure.
      Assembly phase:
           Works with much smaller graph,
           Finds actual partition.
Outline of the talk


   Introduction


   Natural cuts


   Assembly phase


   Experiments


   Conclusion
Natural cuts


      Sparse sets that
      separate dense areas.
      Minimum s–t cuts are
      trivial (average degree
      < 3).
      Sparsest cuts would be
      OK, but they are
      intractable.
      Our notion of natural
      cut is both tractable and
      useful.
Natural cuts

   Pick centers v in a randomized manner.
   Around each center v, grow a core of U/10 nodes inside a
   larger region of U nodes bounded by the ring.
   Compute minimum cut between the core and the ring.
   Repeat until every node is inside of at least two cores.
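One iteration of this procedure can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the region is grown by plain breadth-first search, the core is taken to be the first U/10 nodes visited, the ring is every node adjacent to the region but outside it, all edges have unit weight, and the minimum cut is found with a simple augmenting-path maximum flow. The names `grow_region` and `natural_cut` are hypothetical.

```python
from collections import deque, defaultdict

SRC, SNK = object(), object()  # stand-ins for the contracted core and ring

def grow_region(adj, center, U, core_fraction=10):
    """BFS from center; returns (core, region, ring)."""
    order, seen = [], {center}
    queue = deque([center])
    while queue and len(order) < U:
        u = queue.popleft()
        order.append(u)
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    region = set(order)
    core = set(order[:max(1, U // core_fraction)])
    ring = seen - region  # discovered but not absorbed: the region's outer neighbours
    return core, region, ring

def natural_cut(adj, center, U):
    """Edges of a minimum cut separating the core from the ring."""
    core, region, ring = grow_region(adj, center, U)
    if not ring:
        return set()  # the region swallowed its whole component

    def label(x):
        return SRC if x in core else SNK if x in ring else x

    cap, nbrs = defaultdict(int), defaultdict(set)
    originals, done = defaultdict(list), set()
    for u in region:
        for v in adj[u]:
            edge = frozenset((u, v))
            if edge in done:
                continue
            done.add(edge)
            a, b = label(u), label(v)
            if a == b:
                continue
            cap[(a, b)] += 1
            cap[(b, a)] += 1
            nbrs[a].add(b)
            nbrs[b].add(a)
            originals[frozenset((a, b))].append(edge)
    while True:  # augment along shortest residual paths until none is left
        parent = {SRC: None}
        queue = deque([SRC])
        while queue and SNK not in parent:
            u = queue.popleft()
            for v in nbrs[u]:
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if SNK not in parent:
            break
        path, v = [], SNK
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        flow = min(cap[e] for e in path)
        for u, v in path:
            cap[(u, v)] -= flow
            cap[(v, u)] += flow
    reach = set(parent)  # source side of the residual graph
    cut = set()
    for pair, edge_list in originals.items():
        a, b = tuple(pair)
        if (a in reach) != (b in reach):
            cut.update(edge_list)
    return cut

# Toy instance (hypothetical): two 4-cliques joined by the edge (3, 4).
pairs = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4),
         (4, 5), (4, 6), (4, 7), (5, 6), (5, 7), (6, 7)]
adj = defaultdict(list)
for u, v in pairs:
    adj[u].append(v)
    adj[v].append(u)
cut = natural_cut(adj, center=0, U=5)
```

On the toy graph the only edge returned is the one joining the two cliques. The full procedure repeats this from many random centers and keeps the union of all cuts found.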
Natural cuts
   Take a union of all natural cuts found and contract everything
   between them.
   The resulting graph is much smaller than the original one.
       U = 10^6: 18M nodes to 10K nodes
       U = 10^3: 18M nodes to 1.3M nodes
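The contraction step can be illustrated with a union-find pass: every edge that is not in the union of cuts merges its endpoints, and what remains is a small weighted graph of fragments. This is a sketch under assumptions (unit edge weights, hypothetical function name `contract`), not the authors' code.

```python
from collections import Counter

def contract(nodes, edges, cut_edges):
    """Contract everything between the cut edges.

    Returns (fragment_of, size, weight): the fragment of each node, the
    number of nodes per fragment, and the number of original edges
    between each pair of distinct fragments.
    """
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    cut = {frozenset(e) for e in cut_edges}
    for u, v in edges:
        if frozenset((u, v)) not in cut:
            parent[find(u)] = find(v)

    fragment_of = {v: find(v) for v in nodes}
    size = Counter(fragment_of.values())
    weight = Counter()
    for u, v in edges:
        a, b = fragment_of[u], fragment_of[v]
        if a != b:
            weight[frozenset((a, b))] += 1
    return fragment_of, size, weight

# Toy instance (hypothetical): two triangles joined by the cut edge (2, 3).
edges = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4), (4, 5), (5, 3)]
fragment_of, size, weight = contract(range(6), edges, cut_edges=[(2, 3)])
```

Fragment sizes are kept because the assembly phase still has to respect the bound U on the number of original nodes per cell.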
Tiny cuts


   The most obvious natural cuts: 1-cuts and 2-cuts.
       We handle them explicitly before processing natural cuts.
       This greatly decreases graph size (by half) and overall running
       time.
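For the 1-cut case, the candidates are the bridges of the graph: single edges whose removal disconnects it. The sketch below finds them with an iterative depth-first search (iterative because recursion would overflow on a graph with millions of nodes). It assumes a simple graph without parallel edges, and it covers neither 2-cuts nor the subsequent contraction, which the slides do not detail.

```python
def find_bridges(adj):
    """Return the bridges (1-cuts) of an undirected simple graph.

    adj maps each node to a list of its neighbours.
    """
    disc, low = {}, {}  # discovery time, lowest discovery time reachable
    bridges = []
    timer = 0
    for root in adj:
        if root in disc:
            continue
        disc[root] = low[root] = timer
        timer += 1
        stack = [(root, None, iter(adj[root]))]
        while stack:
            u, parent, neighbours = stack[-1]
            descended = False
            for v in neighbours:
                if v == parent:
                    continue  # tree edge back to the parent (no parallel edges assumed)
                if v in disc:
                    low[u] = min(low[u], disc[v])
                else:
                    disc[v] = low[v] = timer
                    timer += 1
                    stack.append((v, u, iter(adj[v])))
                    descended = True
                    break
            if not descended:
                stack.pop()
                if parent is not None:
                    low[parent] = min(low[parent], low[u])
                    if low[u] > disc[parent]:
                        bridges.append((parent, u))  # nothing below u climbs past parent
    return bridges

# Toy instance (hypothetical): two triangles joined by the bridge (2, 3).
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
print(find_bridges(adj))
```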
Assembly phase




   Three ingredients:
       Greedy algorithm,
       Local search,
       Multistart and combination heuristics (optional).
Greedy algorithm

      We combine well-connected small fragments in a randomized
      fashion.
      Repeat until maximal.
      Finds initial partition.
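A minimal sketch of such a greedy pass is given below. The slides only say that well-connected small fragments are combined in a randomized fashion, so the scoring rule used here (a random factor times the connecting weight, biased toward small fragments) is an assumption chosen for illustration, as are the function name and the quadratic-time search for the best pair.

```python
import random

def greedy_assemble(size, weight, U, rng=None):
    """Merge adjacent fragments until no merge fits within the bound U.

    size:   fragment -> number of original nodes
    weight: (a, b) -> number of edges between fragments a and b
    Returns (members, cut): cell representative -> set of fragments,
    and the total weight of edges left between cells.
    """
    rng = rng or random.Random(0)
    size = dict(size)
    weight = {frozenset(pair): w for pair, w in weight.items()}
    members = {f: {f} for f in size}
    while True:
        best, best_score = None, -1.0
        for pair, w in weight.items():
            a, b = tuple(pair)
            if size[a] + size[b] > U:
                continue
            # Hypothetical score: favour heavy connections and small fragments.
            score = rng.random() * w * (size[a] ** -0.5 + size[b] ** -0.5)
            if score > best_score:
                best, best_score = (a, b), score
        if best is None:
            break  # maximal: no adjacent pair fits within U
        a, b = best
        size[a] += size.pop(b)
        members[a] |= members.pop(b)
        merged = {}
        for pair, w in weight.items():
            if pair == frozenset((a, b)):
                continue  # this edge is now inside the merged cell
            if b in pair:
                (other,) = pair - {b}
                pair = frozenset((a, other))
            merged[pair] = merged.get(pair, 0) + w
        weight = merged
    return members, sum(weight.values())

# Toy instance (hypothetical): six unit fragments, two triangles and a bridge.
size = {f: 1 for f in range(6)}
weight = {(0, 1): 1, (1, 2): 1, (0, 2): 1, (2, 3): 1, (3, 4): 1, (4, 5): 1, (3, 5): 1}
members, cut = greedy_assemble(size, weight, U=3)
```

Because the choice is randomized, different seeds give different initial partitions, which is what the later heuristics exploit.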
The local search

      Pick two neighboring cells, disassemble them, apply greedy
      algorithm to the subproblem.
      Repeat several times for every pair of neighboring cells.
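The loop can be sketched as follows, in a simplified form. Assumptions made for this illustration: only the fragments of the two chosen cells take part in the subproblem, the reassembly is a basic random-order greedy rather than the greedy algorithm of the talk, and a new solution is accepted only if it strictly lowers the total cut. All names are hypothetical.

```python
import random

def cut_weight(weight, cell_of):
    """Total weight of edges whose endpoints are in different cells."""
    return sum(w for (a, b), w in weight.items() if cell_of[a] != cell_of[b])

def random_greedy(frags, size, weight, U, rng):
    """Reassemble frags by merging along edges in random order (bound U)."""
    cell = {f: f for f in frags}
    csize = {f: size[f] for f in frags}
    edges = [(a, b) for (a, b) in weight if a in cell and b in cell]
    rng.shuffle(edges)
    changed = True
    while changed:
        changed = False
        for a, b in edges:
            ra, rb = cell[a], cell[b]
            if ra != rb and csize[ra] + csize[rb] <= U:
                for f in frags:
                    if cell[f] == rb:
                        cell[f] = ra
                csize[ra] += csize.pop(rb)
                changed = True
    return cell

def local_search(cell_of, size, weight, U, tries=3, rng=None):
    """Repeatedly disassemble pairs of neighbouring cells and reassemble them."""
    rng = rng or random.Random(0)
    rep = {}
    for f, c in cell_of.items():
        rep.setdefault(c, f)  # name every cell after one of its own fragments
    cell_of = {f: rep[c] for f, c in cell_of.items()}
    best = cut_weight(weight, cell_of)
    improved = True
    while improved:
        improved = False
        pairs = {frozenset((cell_of[a], cell_of[b]))
                 for a, b in weight if cell_of[a] != cell_of[b]}
        for pair in pairs:
            for _ in range(tries):
                frags = {f for f, c in cell_of.items() if c in pair}
                trial = dict(cell_of)
                trial.update(random_greedy(frags, size, weight, U, rng))
                cost = cut_weight(weight, trial)
                if cost < best:
                    cell_of, best, improved = trial, cost, True
                    break
            if improved:
                break  # the set of neighbouring pairs changed; recompute it
    return cell_of, best

# Toy instance (hypothetical): a path of four unit fragments with a light middle edge.
size = {0: 1, 1: 1, 2: 1, 3: 1}
weight = {(0, 1): 5, (1, 2): 1, (2, 3): 5}
start = {0: "A", 1: "B", 2: "B", 3: "C"}  # cuts both heavy edges
result, cost = local_search(start, size, weight, U=2)
```

Each accepted move strictly reduces an integer-valued cut, so the loop terminates; since the reassembly is random, it may stop at a local optimum rather than the best partition.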
Multistart and combination heuristics




   Since the local search is typically much faster than the natural cuts
   detection, we can use the following two heuristics:
       Multistart: since the local search is randomized, we can
       repeat it several times.
       Combination: keep track of several solutions, and combine
       them from time to time.
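Multistart is the simpler of the two heuristics and can be sketched generically: run a randomized solver several times with different seeds and keep the best result. The `solve` and `cost` callables below are placeholders for "greedy plus local search" and "cut size"; the toy solver is hypothetical and only serves to make the example runnable. The combination heuristic is not sketched here, since the slides do not describe how solutions are combined.

```python
import random

def multistart(solve, cost, starts, seed=0):
    """Run a randomized solver `starts` times and keep the cheapest solution.

    solve(rng) must return a solution built with the given random.Random;
    cost(solution) returns the value to minimize.
    """
    seeder = random.Random(seed)
    best, best_cost = None, float("inf")
    for _ in range(starts):
        candidate = solve(random.Random(seeder.getrandbits(32)))
        candidate_cost = cost(candidate)
        if candidate_cost < best_cost:
            best, best_cost = candidate, candidate_cost
    return best, best_cost

# Toy stand-in (hypothetical): split a path at a random edge; cost is that edge's weight.
edge_weights = [7, 3, 9, 1, 4]

def toy_solve(rng):
    return rng.randrange(len(edge_weights))  # index of the edge to cut

def toy_cost(split):
    return edge_weights[split]

best, best_cost = multistart(toy_solve, toy_cost, starts=20)
```

With a fixed seed, the first runs of a longer multistart repeat those of a shorter one, so more starts can only match or improve the result.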
Experimental evaluation




      C++/OpenMP
      Tested on Western Europe map (18M nodes, 43M arcs).
      Machine: Intel Xeon X5680 (two six-core 3.33GHz CPUs)
      with DDR3-1333MHz RAM.
A typical use-case




       Europe, U = 64K.
       Tiny cuts contraction: 25 seconds (18M nodes to 9M nodes).
       Natural cuts identification: 50 seconds (12 cores, 9M nodes to
       100K nodes).
       Greedy + local search: only 5 seconds (12 cores).
Running times on Europe

   [Figure: running time in seconds (0 to 200) against maximum cell
   size (2^10 to 2^22), with one curve each for tiny cuts, natural
   cuts, and greedy + local search.]
Influence of ϕ
   The local search tries every edge ϕ times.

   [Figure: cut size (10000 to 15000) against time in seconds
   (logarithmic axis, 0.1 to 10000); one curve, "Dependence on phi".]
Multistart and combination
   Combination helps!

   [Figure: cut size (10000 to 15000) against time in seconds
   (logarithmic axis, 0.1 to 10000); three curves: "Dependence on
   phi", "Multistart", and "Combination".]
Balanced partitions



       Recall that there are two variants of requirements on |Vi|:
           given U, require |Vi| ≤ U for every i,
           given k and ε, require |Vi| ≤ (1 + ε) n/k.
       PUNCH solves the first, but most existing solvers find
       ε-balanced partitions.
       Rebalancing:
           Run PUNCH with U = (1 + ε) n/k,
           If there are too many regions (more than k), redistribute them.
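As a worked example of the first rebalancing step, the sketch below computes the bound U = (1 + ε) n/k and checks whether a run produced more than k cells. Rounding the bound down to an integer is an assumption, and n = 18,000,000 is the rounded node count quoted for Europe, not the exact figure. How surplus regions are redistributed is not specified on the slide, so that step is left out.

```python
import math

def size_bound(n, k, eps):
    """Cell-size bound U = (1 + eps) * n / k, rounded down (assumed rounding)."""
    return math.floor((1 + eps) * n / k)

def too_many_cells(cell_sizes, k):
    """True if the partition has more than k cells and needs redistribution."""
    return len(cell_sizes) > k

n, k, eps = 18_000_000, 64, 0.03  # rounded Europe size, 64 cells, 3% imbalance
U = size_bound(n, k, eps)
print(U)
```

Here n/k is 281,250 nodes per cell, and the 3% slack raises the bound to 289,687.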
Balanced partitions: Europe, ε = 0.03 (quality and time)

                              Cut size
     K    PUNCH   KAFFPA   KASPAR    KAPPA   SCOTCH    METIS
     2      129      130      133        -      469        -
     4      309      412      355      543      952      846
     8      634      749      774      986     1667     1675
    16     1293     1454     1401     1760     2922     3519
    32     2289     2428     2595     3186     4336     7424
    64     3828     4240     4502     5290     6772    11313

                                Time
     K    PUNCH   KAFFPA   KASPAR    KAPPA   SCOTCH    METIS
     2      255     1013     1946        -       12        -
     4      215     1823     2168      441       25       29
     8      176     2067     2232      418       39       29
    16      151     2340     2553      498       52       31
    32      130     2445     2599      418       65       31
    64      145     2533     2534      308       77       30
Concluding remarks




      PUNCH is good for partitioning road networks.
      It doesn’t work as well on instances without natural cuts.
Seattle by METIS
Seattle by PUNCH
Portland by METIS
Portland by PUNCH
Vancouver by METIS
Vancouver by PUNCH
Thank you for your attention!

 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 

PUNCH: Partitioning Using Natural-Cut Heuristics

  • 1. PUNCH: Partitioning Using Natural-Cut Heuristics. Daniel Delling (Microsoft Research), Andrew V. Goldberg (Microsoft Research), Ilya Razenshteyn (Moscow University), Renato F. Werneck (Microsoft Research). May 19, 2010.
  • 2. Motivation Goal: process a continental-sized road network in parallel (Europe: 18M nodes and 43M arcs). The first natural step: divide it into “small” parts with few arcs between them. Partition problems are NP-hard, but routinely solved using different heuristics.
  • 4. Applications: routing on road networks Idea: precompute distances between boundary nodes of each cell. Overlay graph: the nodes are the boundary nodes, with edges between boundary nodes. Search graph for a query from s to t: the source and target cells plus the overlay graph, searched with bidirectional Dijkstra. The number of cut edges heavily affects performance. More applications: arc-flags and reach.
  • 5. Existing solvers METIS [KK’99], SCOTCH [PR’96], KAPPA [HSS’10], KASPAR [OS’10], KAFFPA [SS’10]. They are general purpose; some are fast, some produce very good solutions. There are many more. Our goal: a partitioner tailored to road networks that emphasizes quality and is still fast enough in practice.
  • 6. Formal definition Input: undirected graph G = (V, E). Result: partition (V = V1 ∪ V2 ∪ ... ∪ Vk, with Vi ∩ Vj = ∅). Goal: minimize the number of edges between the cells Vi. Two common variants: given U, require |Vi| ≤ U for every i; given k and ε, require |Vi| ≤ (1 + ε) n/k. We focus on the first one. Rebalancing is possible.
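
The objective and the first size constraint on the slide above can be checked directly. The following is a minimal sketch, not code from the talk: the names `cut_size`, `respects_bound`, and `cell_of` are hypothetical, and the toy graph is made up for illustration.

```python
from collections import Counter

def cut_size(edges, cell_of):
    """Number of edges whose endpoints lie in different cells."""
    return sum(1 for u, v in edges if cell_of[u] != cell_of[v])

def respects_bound(cell_of, U):
    """True if every cell contains at most U nodes."""
    sizes = Counter(cell_of.values())
    return all(size <= U for size in sizes.values())

# Toy graph (illustrative only): two triangles joined by the edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
cell_of = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}

print(cut_size(edges, cell_of))     # prints 1
print(respects_bound(cell_of, 3))   # prints True
```

Splitting the toy graph along the joining edge gives two cells of three nodes each and a single cut edge, so it is feasible for U = 3 but not for U = 2.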
  • 7. Intuition Road networks: dense regions (grids, cities) interleaved with natural cuts (mountains, parks, rivers, deserts, sparse areas, freeways).
  • 8. Summary of the algorithm Filtering: Contracts dense regions, Reduces graph size, Preserves natural cuts structure. Assembly phase: Works with much smaller graph, Finds actual partition.
  • 9. Outline of the talk Introduction Natural cuts Assembly phase Experiments Conclusion
  • 11. Natural cuts Sparse sets that separate dense areas. Minimum s–t cuts are trivial (average degree < 3). Sparsest cuts would be OK, but they are intractable. Our notion of natural cut is both tractable and useful.
  • 12. Natural cuts Pick centers in a randomized manner. For each center v, compute the minimum cut between the core and the ring (figure labels: the core has U/10 nodes; the region enclosed by the ring has U nodes). Repeat until every node is inside at least two cores.
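
One way to read the core-and-ring step is sketched below. This is an assumption-laden illustration, not the authors' implementation: it assumes the region is grown by breadth-first search from the center, that the core is the first U/10 nodes reached, and that the ring is the set of nodes just outside the region. The minimum cut is found with a plain augmenting-path (Edmonds-Karp style) search on unit capacities. The function names, and the reserved labels "s" and "t" for the contracted core and ring, are hypothetical.

```python
from collections import defaultdict, deque

def bfs_order(adj, center, limit):
    """Return the first `limit` nodes reached by BFS from `center`."""
    order, seen, queue = [], {center}, deque([center])
    while queue and len(order) < limit:
        u = queue.popleft()
        order.append(u)
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return order

def natural_cut(adj, center, U, core_size=None):
    """Minimum cut between the core around `center` and the surrounding ring."""
    region = bfs_order(adj, center, U)
    core = set(region[:core_size or max(1, U // 10)])
    inside = set(region)
    # Ring: nodes just outside the BFS region (an assumption of this sketch).
    ring = {v for u in region for v in adj[u] if v not in inside}
    if not ring:
        return []  # the region covers the whole component; nothing to cut

    def label(x):
        # Contract the core into source "s" and the ring into sink "t".
        return "s" if x in core else "t" if x in ring else x

    cap, nbrs = defaultdict(int), defaultdict(set)
    for u in inside:
        for v in adj[u]:
            a, b = label(u), label(v)
            if a != b:
                cap[(a, b)] += 1
                nbrs[a].add(b)
                nbrs[b].add(a)

    while True:
        # BFS for an augmenting path in the residual graph.
        parent, queue = {"s": None}, deque(["s"])
        while queue and "t" not in parent:
            a = queue.popleft()
            for b in nbrs[a]:
                if b not in parent and cap[(a, b)] > 0:
                    parent[b] = a
                    queue.append(b)
        if "t" not in parent:
            break  # `parent` now holds the source side of a minimum cut
        b = "t"
        while parent[b] is not None:  # push one unit of flow along the path
            a = parent[b]
            cap[(a, b)] -= 1
            cap[(b, a)] += 1
            b = a

    return [(u, v) for u in inside for v in adj[u]
            if label(u) in parent and label(v) not in parent]

def clique(nodes):
    return [(a, b) for i, a in enumerate(nodes) for b in nodes[i + 1:]]

# Toy graph (illustrative only): two 5-cliques joined by the edge (4, 5).
edges = clique(list(range(5))) + clique(list(range(5, 10))) + [(4, 5)]
adj = defaultdict(list)
for u, v in edges:
    adj[u].append(v)
    adj[v].append(u)

print(natural_cut(adj, center=0, U=8))  # prints [(4, 5)]
```

On the toy graph the region grown from node 0 spills into the second clique, yet the cut returned is the single edge joining the two dense parts, which is the behaviour the slide describes.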
  • 16. Natural cuts Take a union of all natural cuts found and contract everything between them. The resulting graph is much smaller than the original one. U = 10^6: 18M nodes to 10K nodes. U = 10^3: 18M nodes to 1.3M nodes.
  • 19. Tiny cuts The most obvious natural cuts are 1-cuts and 2-cuts. We handle them explicitly before processing natural cuts. This greatly decreases graph size (by half) and overall running time.
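
A 1-cut is a single edge whose removal disconnects the graph (a bridge). The sketch below finds all bridges with the standard depth-first low-link technique. It only illustrates what a 1-cut is: the slides do not say how PUNCH detects or contracts tiny cuts, 2-cuts are not covered here, and the function name `find_bridges` is hypothetical. It assumes a simple graph (no parallel edges) and uses recursion, so it suits small inputs only.

```python
def find_bridges(adj):
    """Return all bridges (1-cuts) of an undirected simple graph."""
    disc, low, bridges = {}, {}, []

    def dfs(u, parent):
        disc[u] = low[u] = len(disc)
        for v in adj[u]:
            if v == parent:
                continue  # do not reuse the tree edge we arrived on
            if v in disc:
                low[u] = min(low[u], disc[v])  # back edge
            else:
                dfs(v, u)
                low[u] = min(low[u], low[v])
                if low[v] > disc[u]:
                    bridges.append((u, v))  # nothing below v reaches u or above

    for start in adj:
        if start not in disc:
            dfs(start, None)
    return bridges

# Toy graph (illustrative only): two triangles joined by (2, 3), plus a
# pendant node 6 attached to node 5.
adj = {
    0: [1, 2], 1: [0, 2], 2: [1, 0, 3],
    3: [2, 4, 5], 4: [3, 5], 5: [4, 3, 6], 6: [5],
}
print(sorted(find_bridges(adj)))  # prints [(2, 3), (5, 6)]
```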
  • 21. Assembly phase Three ingredients: Greedy algorithm, Local search, Multistart and combination heuristics (optional).
  • 22. Greedy algorithm We combine well-connected small fragments in a randomized fashion. Repeat until maximal. Finds initial partition.
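
The greedy step can be sketched as repeated merging of adjacent fragments until no pair fits within U. The slides do not give the rule for choosing which pair to merge, so the score used below (connecting edge weight, favouring small fragments, times a random factor) is an assumption made for illustration, as are the names `greedy_assemble`, `sizes`, and `weights`. The sketch also assumes there are no self-loops among fragments.

```python
import random

def greedy_assemble(sizes, weights, U, rng):
    """Merge adjacent fragments until no adjacent pair fits within U.

    sizes:   {fragment: number of original nodes}
    weights: {(a, b): number of edges between fragments a and b}
    Returns  {cell: set of fragments merged into it}.
    """
    size = dict(sizes)
    members = {f: {f} for f in sizes}
    w = {}
    for (a, b), x in weights.items():
        key = frozenset((a, b))
        w[key] = w.get(key, 0) + x

    while True:
        best, best_score = None, -1.0
        for key, x in w.items():
            a, b = tuple(key)
            if size[a] + size[b] > U:
                continue  # merging would violate the size bound
            # Assumed score: well-connected, small fragments are preferred,
            # with a random factor so that repeated runs differ.
            score = rng.random() * x * (size[a] ** -0.5 + size[b] ** -0.5)
            if score > best_score:
                best, best_score = key, score
        if best is None:
            return members  # maximal: no adjacent pair can be merged
        a, b = tuple(best)
        size[a] += size.pop(b)
        members[a] |= members.pop(b)
        merged = {}
        for key, x in w.items():
            if key == best:
                continue  # this edge is now internal to the merged cell
            new_key = frozenset(a if f == b else f for f in key)
            merged[new_key] = merged.get(new_key, 0) + x
        w = merged

# Toy instance (illustrative only): six unit fragments forming two triangles
# joined by one edge.
sizes = {f: 1 for f in range(6)}
weights = {(0, 1): 1, (1, 2): 1, (0, 2): 1, (2, 3): 1,
           (3, 4): 1, (4, 5): 1, (3, 5): 1}
cells = greedy_assemble(sizes, weights, U=3, rng=random.Random(0))
print(sorted(sorted(c) for c in cells.values()))
```

Because the choice is randomized, different seeds can produce different maximal partitions, which is what makes the later local search and multistart steps worthwhile.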
  • 28. The local search Pick two neighboring cells, disassemble them, apply greedy algorithm to the subproblem. Repeat several times for every pair of neighboring cells.
  • 31. Multistart and combination heuristics Since the local search is typically much faster than the natural cuts detection, we can use the following two heuristics: Multistart: since the local search is randomized, we can repeat it several times. Combination: keep track of several solutions, and combine them from time to time.
  • 33. Experimental evaluation Implementation: C++/OpenMP. Tested on the Western Europe map (18M nodes, 43M arcs). Machine: Intel Xeon X5680 (two six-core 3.33GHz CPUs) with DDR3-1333MHz RAM.
  • 34. A typical use-case Europe, U = 64K. Tiny cuts contraction: 25 seconds (18M nodes to 9M nodes). Natural cuts identification: 50 seconds (12 cores, 9M nodes to 100K nodes). Greedy + local search: only 5 seconds (12 cores).
  • 35. Running times on Europe [Plot: time (s), 0 to 200, against maximum cell size, 2^10 to 2^22. Series: tiny cuts, natural cuts, greedy + local search.]
  • 36. Influence of ϕ The local search tries every edge ϕ times. [Plot: cut size, 10000 to 15000, against time (s), 0.1 to 10000. Series: dependence on ϕ.]
  • 37. Multistart and combination Combination helps! [Plot: cut size, 10000 to 15000, against time (s), 0.1 to 10000. Series: dependence on ϕ, multistart, combination.]
  • 38. Balanced partitions Recall that there are two variants of requirements on |Vi|: given U, require |Vi| ≤ U for every i; given k and ε, require |Vi| ≤ (1 + ε) n/k. PUNCH solves the first, but most existing solvers find ε-balanced partitions. Rebalancing: run PUNCH with U = (1 + ε) n/k. If there are too many regions, redistribute them.
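
As a worked example of the first rebalancing step, the bound U can be computed from n, k, and ε. The sketch below is an illustration only: the name `cell_bound` is hypothetical, the result is rounded down on the assumption that cell sizes are integers, and the node count is the rounded 18M figure from the slides. Exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction
from math import floor

def cell_bound(n, k, eps):
    """Largest integer cell size allowed by |Vi| <= (1 + eps) * n / k."""
    return floor((1 + Fraction(eps)) * n / k)

# Europe (about 18M nodes), k = 4 cells, eps = 0.03.
print(cell_bound(18_000_000, 4, "0.03"))  # prints 4635000
```

The redistribution step for the case of too many regions is not specified on the slide and is left out here.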
  • 39. Balanced partitions: Europe, ε = 0.03 (quality and time)

      Cut size
      K    PUNCH  KAFFPA  KASPAR  KAPPA  SCOTCH  METIS
      2      129     130     133      -     469      -
      4      309     412     355    543     952    846
      8      634     749     774    986    1667   1675
      16    1293    1454    1401   1760    2922   3519
      32    2289    2428    2595   3186    4336   7424
      64    3828    4240    4502   5290    6772  11313

      Time
      K    PUNCH  KAFFPA  KASPAR  KAPPA  SCOTCH  METIS
      2      255    1013    1946      -      12      -
      4      215    1823    2168    441      25     29
      8      176    2067    2232    418      39     29
      16     151    2340    2553    498      52     31
      32     130    2445    2599    418      65     31
      64     145    2533    2534    308      77     30
  • 40. Concluding remarks PUNCH is good for partitioning road networks. It doesn’t work as well on instances without natural cuts.
  • 47. Thank you for your attention!