Large Graph Analysis In
The GMine System
By Saurabh Jogalekar
TE C 51
Seminar Guide: Prof. S V Jagtap
Large Graph
A large graph is a graph with hundreds of thousands of nodes and millions of edges.
Social networks are the most familiar example: friend lists, recommendations, likes, and comments all form large graphs.
Other examples include web graphs (web pages pointing to each other through hyperlinks), bipartite graphs, and computer communication graphs in which IP addresses send packets to other IP addresses.
Representing Graphs

The three techniques traditionally used for graph representation are:

1. Adjacency matrix
2. Adjacency list
3. Binary Decision Diagrams
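
To make the first two representations concrete, here is a minimal sketch in Python. The function names and the tiny four-node example are illustrative assumptions, not part of the original slides, and the graph is assumed to be undirected with nodes numbered from 0.

```python
from collections import defaultdict


def to_adjacency_matrix(num_nodes, edges):
    """Build an n x n 0/1 matrix for an undirected graph."""
    matrix = [[0] * num_nodes for _ in range(num_nodes)]
    for u, v in edges:
        matrix[u][v] = 1
        matrix[v][u] = 1
    return matrix


def to_adjacency_list(edges):
    """Map each node to the set of its neighbours."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


# Hypothetical example graph: 0-1, 0-2, 2-3
edges = [(0, 1), (0, 2), (2, 3)]
matrix = to_adjacency_matrix(4, edges)
adj = to_adjacency_list(edges)
```

The matrix always stores n x n cells, so a graph with 100,000 nodes would need 10^10 cells regardless of how few edges it has, while the adjacency list grows only with the number of nodes and edges.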
Representing Large Graphs
• Representing a large graph is challenging because the sheer number of nodes and edges reduces the overall visibility of the graph.
• Thus the traditional methods of representation fail.

[Figure: example of a large graph]
Large Graph Representation
• Another problem is that acquiring (mining) the required nodes and edges of a large graph requires several complex calculations.
• To overcome these hindrances, a graph summarization method called CEPS (CEntre-Piece Subgraph) is used.
GRAPH-TREE
• CEPS is applied on top of the Graph-Tree, a hierarchical representation of the graph made up of a SuperGraph, SuperNodes, and SuperEdges.
• The Graph-Tree is formed as shown in the figure.
FILLING A GRAPH-TREE
Algorithm FillGraphTree(ptr)

If ptr is a leaf then
    Set ptr -> filepath to the file of the corresponding subgraph
Else
    For each child of ptr do:
        FillGraphTree(child)
    Instantiate a SuperEdge for each pair of children, find matches between the unresolved edges of each pair, and store them in the SuperEdges
    Use external edges to determine ptr's open nodes
    Propagate unresolved external edges to the parent
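
The recursion above can be sketched in Python as follows. This is only an illustration under stated assumptions: the `TreeNode` class, its field names, the `subgraph_file` argument, and the choice to treat an "unresolved" edge as one with exactly one endpoint inside a node's coverage are all hypothetical and are not taken from GMine itself.

```python
from itertools import combinations


class TreeNode:
    """Hypothetical Graph-Tree node; leaves own graph nodes, inner nodes own children."""

    def __init__(self, name, children=None, nodes=None, subgraph_file=None):
        self.name = name
        self.children = children or []
        self.nodes = set(nodes or [])       # graph nodes stored in a leaf
        self.subgraph_file = subgraph_file  # assumed location of the leaf's subgraph
        self.filepath = None
        self.coverage = set()               # all graph nodes below this tree node
        self.unresolved = set()             # edges leaving the coverage
        self.super_edges = {}               # (child, child) -> matched edges
        self.open_nodes = set()             # covered nodes touching external edges


def fill_graph_tree(ptr, edges):
    if not ptr.children:
        # Leaf: point at the subgraph file and collect edges that leave the leaf.
        ptr.filepath = ptr.subgraph_file
        ptr.coverage = set(ptr.nodes)
        ptr.unresolved = {
            e for e in edges if (e[0] in ptr.coverage) != (e[1] in ptr.coverage)
        }
    else:
        for child in ptr.children:
            fill_graph_tree(child, edges)
            ptr.coverage |= child.coverage
        matched = set()
        for a, b in combinations(ptr.children, 2):
            # An edge unresolved in both children connects exactly those two.
            shared = a.unresolved & b.unresolved
            ptr.super_edges[(a.name, b.name)] = shared
            matched |= shared
        # Whatever is still unmatched is external and goes up to the parent.
        all_unresolved = set().union(*(c.unresolved for c in ptr.children))
        ptr.unresolved = all_unresolved - matched
    ptr.open_nodes = {n for e in ptr.unresolved for n in e if n in ptr.coverage}


# Hypothetical example: a path 1-2-3-4-5-6 split into three leaves.
edges = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6)]
leaf_a = TreeNode("A", nodes=[1, 2], subgraph_file="A.graph")
leaf_b = TreeNode("B", nodes=[3, 4], subgraph_file="B.graph")
leaf_c = TreeNode("C", nodes=[5, 6], subgraph_file="C.graph")
inner = TreeNode("P", children=[leaf_a, leaf_b])
root = TreeNode("root", children=[inner, leaf_c])
fill_graph_tree(root, edges)
```

In this example the edge (2, 3) is resolved at `P` as a SuperEdge between `A` and `B`, while (4, 5) stays unresolved until the root, where it becomes the SuperEdge between `P` and `C`.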
SuperNodes and GraphNodes connectivity
• SuperNode connectivity between two SuperNodes is the set of edges whose source belongs to the coverage of the first SuperNode and whose target belongs to the coverage of the second SuperNode.
• GraphNode connectivity is the set of edges connecting a graph node to graph nodes outside the coverage of the SuperNode that contains it.
• Both kinds of connectivity are used to reconstruct the graph from its hierarchical representation.
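
The two definitions above translate almost directly into set comprehensions. The sketch below assumes directed edges stored as (source, target) pairs and coverages stored as plain sets; the function names and the example data are hypothetical.

```python
def supernode_connectivity(edges, coverage_a, coverage_b):
    """Edges whose source is covered by the first SuperNode and target by the second."""
    return {(s, t) for s, t in edges if s in coverage_a and t in coverage_b}


def graphnode_connectivity(edges, node, coverage):
    """Edges linking `node` to graph nodes outside the coverage of its SuperNode."""
    return {
        (s, t)
        for s, t in edges
        if (s == node and t not in coverage) or (t == node and s not in coverage)
    }


# Hypothetical example: two SuperNodes covering {1, 2} and {3, 4, 5}.
edges = [(1, 2), (2, 3), (3, 4), (4, 1), (2, 5)]
coverage_a = {1, 2}
coverage_b = {3, 4, 5}

a_to_b = supernode_connectivity(edges, coverage_a, coverage_b)
b_to_a = supernode_connectivity(edges, coverage_b, coverage_a)
node2_external = graphnode_connectivity(edges, 2, coverage_a)
```

Because the edges are directed here, the connectivity from the first SuperNode to the second differs from the connectivity in the opposite direction.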
Motivation behind CEPS
• Using a Graph-Tree and the hierarchical representation of a SuperGraph eases the problem of inspecting large graphs.
• However, the subgraph reached this way sometimes carries much more information than required. CEPS is used to close this gap.
CEPS
• A centre-piece subgraph contains the collection of paths connecting a subset of graph nodes of interest.
• CEPS helps interaction by significantly reducing the number of edges and nodes to be inspected.
• CEPS uses the Random Walk with Restart method to find the 'importance' score between two nodes.
GOODNESS SCORE
• The goodness score is calculated using Random Walk with Restart. A matrix A(i, j) is defined that stores the steady-state probability of each node 'j' with respect to the query 'i'.
[Figure: 13-node example graph annotated with Random Walk with Restart scores]

Individual Score Matrix

Node       Q1       Q2       Q3
Node 1     0.5767   0.0088   0.0088
Node 2     0.1235   0.0076   0.0076
Node 3     0.0283   0.0283   0.0283
Node 4     0.0076   0.1235   0.0076
Node 5     0.0088   0.5767   0.0088
Node 6     0.0076   0.0076   0.1235
Node 7     0.0088   0.0088   0.5767
Node 8     0.0333   0.0024   0.1260
Node 9     0.1260   0.0024   0.0333
Node 10    0.1260   0.0333   0.0024
Node 11    0.0333   0.1260   0.0024
Node 12    0.0024   0.1260   0.0333
Node 13    0.0024   0.0333   0.1260
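
One way to obtain steady-state scores of this kind is a simple power iteration. The sketch below is an assumption-laden illustration rather than the method used to produce the numbers above: the function name, the `restart` parameter and its value of 0.5, and the four-node path graph are all hypothetical, and every node is assumed to have at least one neighbour.

```python
def random_walk_with_restart(adj, query, restart, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities of a walker that restarts at `query`.

    adj maps each node to a list of neighbours (undirected, no isolated nodes).
    restart is the probability of jumping back to the query node at each step.
    """
    nodes = sorted(adj)
    scores = {n: 1.0 if n == query else 0.0 for n in nodes}
    for _ in range(max_iter):
        # Restart mass goes to the query node; the rest is spread over neighbours.
        nxt = {n: (restart if n == query else 0.0) for n in nodes}
        for u in nodes:
            share = (1.0 - restart) * scores[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        delta = sum(abs(nxt[n] - scores[n]) for n in nodes)
        scores = nxt
        if delta < tol:
            break
    return scores


# Hypothetical example: the path graph 1 - 2 - 3 - 4 with two query nodes.
adj = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
score_matrix = {q: random_walk_with_restart(adj, q, restart=0.5) for q in (1, 4)}
```

Each entry `score_matrix[i][j]` plays the role of A(i, j): the score of node j with respect to query i. With this restart value the scores fall off with distance from the query node, which is the same pattern visible in each column of the table.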
EXTRACT ALGORITHM
• The EXTRACT algorithm takes as input the weighted graph W, the importance scores of all nodes, and the budget b, and produces as output a small, unweighted, undirected graph H.
• It is performed using dynamic programming or a greedy method.

1. Initialize the output graph H to be empty
2. Let len be the maximum allowable path length
3. While H is not big enough
    3.1. Pick a destination node pd
    3.2. For each active source node qi with respect to pd
        3.2.1. Discover a key path P(qi, pd)
        3.2.2. Add P(qi, pd) to H
4. Output the final H
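
The loop structure above can be sketched as a greedy procedure in Python. This is a simplified stand-in, not the EXTRACT algorithm itself: it assumes that the destination node is the highest-scoring node not yet in H, that every query node counts as an active source, that the budget is a node count, and that a breadth-first shortest path of at most `max_len` edges serves as the "key path". All names and the example data are hypothetical.

```python
from collections import deque


def bounded_shortest_path(adj, source, target, max_len):
    """Breadth-first path from source to target using at most max_len edges."""
    parents = {source: None}
    depth = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if u == target:
            path = []
            while u is not None:
                path.append(u)
                u = parents[u]
            return path[::-1]
        if depth[u] == max_len:
            continue
        for v in adj[u]:
            if v not in parents:
                parents[v] = u
                depth[v] = depth[u] + 1
                queue.append(v)
    return None


def extract(adj, scores, queries, budget, max_len):
    h_nodes, h_edges = set(), set()            # step 1: H starts empty
    candidates = sorted(
        (n for n in adj if n not in queries), key=lambda n: scores[n], reverse=True
    )
    for pd in candidates:                      # step 3.1: next destination node
        if len(h_nodes) >= budget:             # step 3: stop once H is big enough
            break
        if pd in h_nodes:
            continue
        for qi in queries:                     # step 3.2: every query is "active"
            path = bounded_shortest_path(adj, qi, pd, max_len)
            if path is None:
                continue
            h_nodes.update(path)               # step 3.2.2: add the path to H
            h_edges.update(frozenset(e) for e in zip(path, path[1:]))
    return h_nodes, h_edges                    # step 4


# Hypothetical example: path 1-2-3-4-5 with a side branch 3-6, queries 1 and 5.
adj = {1: [2], 2: [1, 3], 3: [2, 4, 6], 4: [3, 5], 5: [4], 6: [3]}
scores = {1: 0.3, 2: 0.1, 3: 0.4, 4: 0.1, 5: 0.3, 6: 0.05}
h_nodes, h_edges = extract(adj, scores, queries=[1, 5], budget=5, max_len=4)
```

Because whole paths are added at once, H can slightly overshoot the budget in general. In this example node 3 is picked first, the two paths from the query nodes fill the budget exactly, and the low-scoring side branch to node 6 is left out.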
GMINE SYSTEM
• GMine is a graph visualization tool for handling large graphs.
• The tool uses Graph-Trees to offer clear, readable graph exploration.
• As the user interacts with the visualization, the system keeps track of the connectivity among communities of nodes at different levels of the partitioned graph.
• When the user changes the focus position on the tree structure, the system works on demand to calculate and present contextual information.
GMINE VISUALIZATION
REFERENCES
• Jose F. Rodrigues Jr., Hanghang Tong, Jia-Yu Pan, Agma J.M. Traina, Caetano Traina Jr., and Christos Faloutsos, "Large Graph Analysis in the GMine System", IEEE Transactions on Knowledge and Data Engineering, vol. 25, no. 1, January 2013.
• Christos Faloutsos, Jose F. Rodrigues Jr., Hanghang Tong, Agma J.M. Traina, "GMine: A system for scalable, interactive, graph visualization and mining", in IEEE/ACM International Conference, pages 1195–1198, Oconomowoc, Wisconsin, USA.
• Hanghang Tong, Christos Faloutsos, "Center-Piece Subgraphs: Problem definition and fast solutions", Carnegie Mellon University, Research Track Paper, pages 404–414.
• www.cmu.edu (Carnegie Mellon University site)
• Jose F. Rodrigues Jr., Agma J.M. Traina, Caetano Traina Jr., Caio Cesar Moreli, "GMine: Interactive browsing of large graphs", Workshop on Information Visualization and Analysis in Social Networks (WIVA 2008).
QUESTIONS / QUERIES .. ?
THANK-YOU

Más contenido relacionado

La actualidad más candente

Meta-MapReduce- A Technique for Reducing Communication in MapReduce Computations
Meta-MapReduce- A Technique for Reducing Communication in MapReduce ComputationsMeta-MapReduce- A Technique for Reducing Communication in MapReduce Computations
Meta-MapReduce- A Technique for Reducing Communication in MapReduce ComputationsShantanu Sharma
 
Integrating Network Discovery and Community Detection (IRE IIITH) Team 24
Integrating Network Discovery and Community Detection (IRE IIITH) Team 24Integrating Network Discovery and Community Detection (IRE IIITH) Team 24
Integrating Network Discovery and Community Detection (IRE IIITH) Team 24Nikhil Daliya
 
A NOBEL HYBRID APPROACH FOR EDGE DETECTION
A NOBEL HYBRID APPROACH FOR EDGE  DETECTIONA NOBEL HYBRID APPROACH FOR EDGE  DETECTION
A NOBEL HYBRID APPROACH FOR EDGE DETECTIONijcses
 
Data input and transformation
Data input and transformationData input and transformation
Data input and transformationMohsin Siddique
 
Maximizing network capacity and reliable transmission
Maximizing network capacity and reliable transmissionMaximizing network capacity and reliable transmission
Maximizing network capacity and reliable transmissioneSAT Publishing House
 
Maximizing network capacity and reliable transmission in mimo cooperative net...
Maximizing network capacity and reliable transmission in mimo cooperative net...Maximizing network capacity and reliable transmission in mimo cooperative net...
Maximizing network capacity and reliable transmission in mimo cooperative net...eSAT Journals
 
Shortest path estimation for graph
Shortest path estimation for graphShortest path estimation for graph
Shortest path estimation for graphijdms
 
GRAPH MATCHING ALGORITHM FOR TASK ASSIGNMENT PROBLEM
GRAPH MATCHING ALGORITHM FOR TASK ASSIGNMENT PROBLEMGRAPH MATCHING ALGORITHM FOR TASK ASSIGNMENT PROBLEM
GRAPH MATCHING ALGORITHM FOR TASK ASSIGNMENT PROBLEMIJCSEA Journal
 
Lecture+12+topology+2013 (3)
Lecture+12+topology+2013 (3)Lecture+12+topology+2013 (3)
Lecture+12+topology+2013 (3)Mei Chi Lo
 
A Unified Framework for Retrieving Diverse Social Images
A Unified Framework for Retrieving Diverse Social ImagesA Unified Framework for Retrieving Diverse Social Images
A Unified Framework for Retrieving Diverse Social Imagesmultimediaeval
 
대용량 데이터 분석을 위한 병렬 Clustering 알고리즘 최적화
대용량 데이터 분석을 위한 병렬 Clustering 알고리즘 최적화대용량 데이터 분석을 위한 병렬 Clustering 알고리즘 최적화
대용량 데이터 분석을 위한 병렬 Clustering 알고리즘 최적화NAVER Engineering
 

La actualidad más candente (20)

Meta-MapReduce- A Technique for Reducing Communication in MapReduce Computations
Meta-MapReduce- A Technique for Reducing Communication in MapReduce ComputationsMeta-MapReduce- A Technique for Reducing Communication in MapReduce Computations
Meta-MapReduce- A Technique for Reducing Communication in MapReduce Computations
 
Integrating Network Discovery and Community Detection (IRE IIITH) Team 24
Integrating Network Discovery and Community Detection (IRE IIITH) Team 24Integrating Network Discovery and Community Detection (IRE IIITH) Team 24
Integrating Network Discovery and Community Detection (IRE IIITH) Team 24
 
A NOBEL HYBRID APPROACH FOR EDGE DETECTION
A NOBEL HYBRID APPROACH FOR EDGE  DETECTIONA NOBEL HYBRID APPROACH FOR EDGE  DETECTION
A NOBEL HYBRID APPROACH FOR EDGE DETECTION
 
Data input and transformation
Data input and transformationData input and transformation
Data input and transformation
 
Maximizing network capacity and reliable transmission
Maximizing network capacity and reliable transmissionMaximizing network capacity and reliable transmission
Maximizing network capacity and reliable transmission
 
Maximizing network capacity and reliable transmission in mimo cooperative net...
Maximizing network capacity and reliable transmission in mimo cooperative net...Maximizing network capacity and reliable transmission in mimo cooperative net...
Maximizing network capacity and reliable transmission in mimo cooperative net...
 
Daniel Lee STAN
Daniel Lee STANDaniel Lee STAN
Daniel Lee STAN
 
Gis Concepts 5/5
Gis Concepts 5/5Gis Concepts 5/5
Gis Concepts 5/5
 
Shortest path estimation for graph
Shortest path estimation for graphShortest path estimation for graph
Shortest path estimation for graph
 
GRAPH MATCHING ALGORITHM FOR TASK ASSIGNMENT PROBLEM
GRAPH MATCHING ALGORITHM FOR TASK ASSIGNMENT PROBLEMGRAPH MATCHING ALGORITHM FOR TASK ASSIGNMENT PROBLEM
GRAPH MATCHING ALGORITHM FOR TASK ASSIGNMENT PROBLEM
 
Mr Share 11 Sep 2010
Mr Share 11 Sep 2010Mr Share 11 Sep 2010
Mr Share 11 Sep 2010
 
Lecture+12+topology+2013 (3)
Lecture+12+topology+2013 (3)Lecture+12+topology+2013 (3)
Lecture+12+topology+2013 (3)
 
Visual Search
Visual SearchVisual Search
Visual Search
 
Isam2_v1_2
Isam2_v1_2Isam2_v1_2
Isam2_v1_2
 
adcom2013_submission_59
adcom2013_submission_59adcom2013_submission_59
adcom2013_submission_59
 
TerraWorld
TerraWorldTerraWorld
TerraWorld
 
GIS data structure
GIS data structureGIS data structure
GIS data structure
 
A Unified Framework for Retrieving Diverse Social Images
A Unified Framework for Retrieving Diverse Social ImagesA Unified Framework for Retrieving Diverse Social Images
A Unified Framework for Retrieving Diverse Social Images
 
대용량 데이터 분석을 위한 병렬 Clustering 알고리즘 최적화
대용량 데이터 분석을 위한 병렬 Clustering 알고리즘 최적화대용량 데이터 분석을 위한 병렬 Clustering 알고리즘 최적화
대용량 데이터 분석을 위한 병렬 Clustering 알고리즘 최적화
 
Spatial Data Model 2
Spatial Data Model 2Spatial Data Model 2
Spatial Data Model 2
 

Destacado

Destacado (19)

Presentación peype en inglés
Presentación peype en inglésPresentación peype en inglés
Presentación peype en inglés
 
boost ur income
boost ur incomeboost ur income
boost ur income
 
Lines
LinesLines
Lines
 
W1 time management
W1 time management W1 time management
W1 time management
 
Filosofi Hidup untuk berhasil
Filosofi Hidup untuk berhasilFilosofi Hidup untuk berhasil
Filosofi Hidup untuk berhasil
 
CV_Zaghini Fabio
CV_Zaghini FabioCV_Zaghini Fabio
CV_Zaghini Fabio
 
Sitogenetika tiram mutiara
Sitogenetika tiram mutiaraSitogenetika tiram mutiara
Sitogenetika tiram mutiara
 
Pwc network-decommissioning-redacted copy
Pwc network-decommissioning-redacted copyPwc network-decommissioning-redacted copy
Pwc network-decommissioning-redacted copy
 
Digitechx Services
Digitechx ServicesDigitechx Services
Digitechx Services
 
How to make presentations
How to make presentationsHow to make presentations
How to make presentations
 
Publikasi jurnal ilmiah
Publikasi jurnal ilmiahPublikasi jurnal ilmiah
Publikasi jurnal ilmiah
 
Jaypal_Updated_CV
Jaypal_Updated_CVJaypal_Updated_CV
Jaypal_Updated_CV
 
GlowTouch Company Overview
GlowTouch Company OverviewGlowTouch Company Overview
GlowTouch Company Overview
 
Rain org
Rain orgRain org
Rain org
 
MVC
MVCMVC
MVC
 
2013 10-16 第9回、第10回萩本匠道場
2013 10-16 第9回、第10回萩本匠道場2013 10-16 第9回、第10回萩本匠道場
2013 10-16 第9回、第10回萩本匠道場
 
Guia hotel quo vadis
Guia hotel quo vadisGuia hotel quo vadis
Guia hotel quo vadis
 
Jisaセミナー講演
Jisaセミナー講演Jisaセミナー講演
Jisaセミナー講演
 
Innovations & Trends in Hearing conservation
Innovations & Trends in Hearing conservationInnovations & Trends in Hearing conservation
Innovations & Trends in Hearing conservation
 

Similar a Large graph analysis using g mine system

Ling liu part 01:big graph processing
Ling liu part 01:big graph processingLing liu part 01:big graph processing
Ling liu part 01:big graph processingjins0618
 
Optimal Chain Matrix Multiplication Big Data Perspective
Optimal Chain Matrix Multiplication Big Data PerspectiveOptimal Chain Matrix Multiplication Big Data Perspective
Optimal Chain Matrix Multiplication Big Data Perspectiveপল্লব রায়
 
A Graph Summarization: A Survey | Summarizing and understanding large graphs
A Graph Summarization: A Survey | Summarizing and understanding large graphsA Graph Summarization: A Survey | Summarizing and understanding large graphs
A Graph Summarization: A Survey | Summarizing and understanding large graphsaftab alam
 
NS-CUK Joint Journal Club : S.T.Nguyen, Review on "Graph Neural Networks for ...
NS-CUK Joint Journal Club : S.T.Nguyen, Review on "Graph Neural Networks for ...NS-CUK Joint Journal Club : S.T.Nguyen, Review on "Graph Neural Networks for ...
NS-CUK Joint Journal Club : S.T.Nguyen, Review on "Graph Neural Networks for ...ssuser4b1f48
 
240115_Thanh_LabSeminar[Don't walk, skip! online learning of multi-scale netw...
240115_Thanh_LabSeminar[Don't walk, skip! online learning of multi-scale netw...240115_Thanh_LabSeminar[Don't walk, skip! online learning of multi-scale netw...
240115_Thanh_LabSeminar[Don't walk, skip! online learning of multi-scale netw...thanhdowork
 
Large Scale Graph Processing with Apache Giraph
Large Scale Graph Processing with Apache GiraphLarge Scale Graph Processing with Apache Giraph
Large Scale Graph Processing with Apache Giraphsscdotopen
 
NS-CUK Seminar: J.H.Lee, Review on "Graph Propagation Transformer for Graph R...
NS-CUK Seminar: J.H.Lee, Review on "Graph Propagation Transformer for Graph R...NS-CUK Seminar: J.H.Lee, Review on "Graph Propagation Transformer for Graph R...
NS-CUK Seminar: J.H.Lee, Review on "Graph Propagation Transformer for Graph R...ssuser4b1f48
 
NEAL-2016 ARL Symposium Poster
NEAL-2016 ARL Symposium PosterNEAL-2016 ARL Symposium Poster
NEAL-2016 ARL Symposium PosterBarbara Jean Neal
 
Mining quasi bicliques using giraph
Mining quasi bicliques using giraphMining quasi bicliques using giraph
Mining quasi bicliques using giraphHsiao-Fei Liu
 
Edge Representation Learning with Hypergraphs
Edge Representation Learning with HypergraphsEdge Representation Learning with Hypergraphs
Edge Representation Learning with HypergraphsMLAI2
 
Building graphs to discover information by David Martínez at Big Data Spain 2015
Building graphs to discover information by David Martínez at Big Data Spain 2015Building graphs to discover information by David Martínez at Big Data Spain 2015
Building graphs to discover information by David Martínez at Big Data Spain 2015Big Data Spain
 
A Dependent Set Based Approach for Large Graph Analysis
A Dependent Set Based Approach for Large Graph AnalysisA Dependent Set Based Approach for Large Graph Analysis
A Dependent Set Based Approach for Large Graph AnalysisEditor IJCATR
 
Communication costs in parallel machines
Communication costs in parallel machinesCommunication costs in parallel machines
Communication costs in parallel machinesSyed Zaid Irshad
 
STIC-D: algorithmic techniques for efficient parallel pagerank computation on...
STIC-D: algorithmic techniques for efficient parallel pagerank computation on...STIC-D: algorithmic techniques for efficient parallel pagerank computation on...
STIC-D: algorithmic techniques for efficient parallel pagerank computation on...Subhajit Sahu
 
NS - CUK Seminar: S.T.Nguyen, Review on "Hypergraph Neural Networks", AAAI 2019
NS - CUK Seminar: S.T.Nguyen, Review on "Hypergraph Neural Networks", AAAI 2019NS - CUK Seminar: S.T.Nguyen, Review on "Hypergraph Neural Networks", AAAI 2019
NS - CUK Seminar: S.T.Nguyen, Review on "Hypergraph Neural Networks", AAAI 2019ssuser4b1f48
 

Similar a Large graph analysis using g mine system (20)

Ling liu part 01:big graph processing
Ling liu part 01:big graph processingLing liu part 01:big graph processing
Ling liu part 01:big graph processing
 
Optimal Chain Matrix Multiplication Big Data Perspective
Optimal Chain Matrix Multiplication Big Data PerspectiveOptimal Chain Matrix Multiplication Big Data Perspective
Optimal Chain Matrix Multiplication Big Data Perspective
 
Sun_MAPL_GNN.pptx
Sun_MAPL_GNN.pptxSun_MAPL_GNN.pptx
Sun_MAPL_GNN.pptx
 
A Graph Summarization: A Survey | Summarizing and understanding large graphs
A Graph Summarization: A Survey | Summarizing and understanding large graphsA Graph Summarization: A Survey | Summarizing and understanding large graphs
A Graph Summarization: A Survey | Summarizing and understanding large graphs
 
NS-CUK Joint Journal Club : S.T.Nguyen, Review on "Graph Neural Networks for ...
NS-CUK Joint Journal Club : S.T.Nguyen, Review on "Graph Neural Networks for ...NS-CUK Joint Journal Club : S.T.Nguyen, Review on "Graph Neural Networks for ...
NS-CUK Joint Journal Club : S.T.Nguyen, Review on "Graph Neural Networks for ...
 
240115_Thanh_LabSeminar[Don't walk, skip! online learning of multi-scale netw...
240115_Thanh_LabSeminar[Don't walk, skip! online learning of multi-scale netw...240115_Thanh_LabSeminar[Don't walk, skip! online learning of multi-scale netw...
240115_Thanh_LabSeminar[Don't walk, skip! online learning of multi-scale netw...
 
Large Scale Graph Processing with Apache Giraph
Large Scale Graph Processing with Apache GiraphLarge Scale Graph Processing with Apache Giraph
Large Scale Graph Processing with Apache Giraph
 
Chapter 3.pptx
Chapter 3.pptxChapter 3.pptx
Chapter 3.pptx
 
NS-CUK Seminar: J.H.Lee, Review on "Graph Propagation Transformer for Graph R...
NS-CUK Seminar: J.H.Lee, Review on "Graph Propagation Transformer for Graph R...NS-CUK Seminar: J.H.Lee, Review on "Graph Propagation Transformer for Graph R...
NS-CUK Seminar: J.H.Lee, Review on "Graph Propagation Transformer for Graph R...
 
NEAL-2016 ARL Symposium Poster
NEAL-2016 ARL Symposium PosterNEAL-2016 ARL Symposium Poster
NEAL-2016 ARL Symposium Poster
 
Pregel
PregelPregel
Pregel
 
Mining quasi bicliques using giraph
Mining quasi bicliques using giraphMining quasi bicliques using giraph
Mining quasi bicliques using giraph
 
Presentation
PresentationPresentation
Presentation
 
SuperGraph visualization
SuperGraph visualizationSuperGraph visualization
SuperGraph visualization
 
Edge Representation Learning with Hypergraphs
Edge Representation Learning with HypergraphsEdge Representation Learning with Hypergraphs
Edge Representation Learning with Hypergraphs
 
Building graphs to discover information by David Martínez at Big Data Spain 2015
Building graphs to discover information by David Martínez at Big Data Spain 2015Building graphs to discover information by David Martínez at Big Data Spain 2015
Building graphs to discover information by David Martínez at Big Data Spain 2015
 
A Dependent Set Based Approach for Large Graph Analysis
A Dependent Set Based Approach for Large Graph AnalysisA Dependent Set Based Approach for Large Graph Analysis
A Dependent Set Based Approach for Large Graph Analysis
 
Communication costs in parallel machines
Communication costs in parallel machinesCommunication costs in parallel machines
Communication costs in parallel machines
 
STIC-D: algorithmic techniques for efficient parallel pagerank computation on...
STIC-D: algorithmic techniques for efficient parallel pagerank computation on...STIC-D: algorithmic techniques for efficient parallel pagerank computation on...
STIC-D: algorithmic techniques for efficient parallel pagerank computation on...
 
NS - CUK Seminar: S.T.Nguyen, Review on "Hypergraph Neural Networks", AAAI 2019
NS - CUK Seminar: S.T.Nguyen, Review on "Hypergraph Neural Networks", AAAI 2019NS - CUK Seminar: S.T.Nguyen, Review on "Hypergraph Neural Networks", AAAI 2019
NS - CUK Seminar: S.T.Nguyen, Review on "Hypergraph Neural Networks", AAAI 2019
 

Último

How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...AliaaTarek5
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...panagenda
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentPim van der Noll
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Alkin Tezuysal
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterMydbops
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsRavi Sanghani
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...Wes McKinney
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationKnoldus Inc.
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesThousandEyes
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI AgeCprime
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfpanagenda
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesThousandEyes
 

Último (20)

How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog Presentation
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI Age
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
 

Large graph analysis using g mine system

  • 1. Large Graph Analysis In The GMine System By Saurabh Jogalekar TE C 51 Seminar Guide: Prof. S V Jagtap
  • 2. Large Graph A large graph is a graph with hundreds of thousands of nodes and a million edges Our friend list, recommendations, likes, comments in case of social networks is the best example of Large Graphs Other examples of large graphs include web graphs i.e. web pages pointing to each other through hyperlinks, bipartite graphs and computer communication graphs in which IP addresses send packets to other IP addresses.
  • 3. Representing Graphs The three techniques traditionally used for graph representation are • • • 1. Adjacency matrix 2. Adjacency list 3. Binary Decision Diagrams
  • 4. Representing Large Graphs: Representing large graphs is challenging because the overall visibility of the graph drops as the number of nodes and edges grows. The traditional representation methods therefore fail. (Figure: example of a large graph.)
  • 5. Large Graph Representation: Another problem with representing large graphs is that acquiring, or mining, the required nodes and edges takes several complex calculations. To overcome such hindrances, a graph summarization method called CEPS (CEntre Piece Subgraph) is used.
  • 6. GRAPH-TREE: CEPS is applied on top of the Graph-Tree, a hierarchical representation of the graph made up of a SuperGraph, SuperNodes and SuperEdges. The Graph-Tree is formed as shown in the figure.
  • 7.
  • 8. FILLING A GRAPH-TREE: Algorithm FillGraphTree(ptr)
      If ptr is a leaf, set ptr->filepath to the file of the corresponding subgraph.
      Else:
        1. For each child of ptr, call FillGraphTree(child).
        2. Instantiate a SuperEdge for each pair of children; find the matches between the unresolved edges of each pair and store them in the SuperEdges.
        3. Use the external edges to determine ptr's open nodes.
        4. Propagate the unresolved external edges to the parent.
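The recursion on this slide can be sketched in Python. This is only an illustrative reading of the pseudocode, not the GMine implementation: the `TreeNode` class, its fields, the `path_for` callback and the tiny example tree are all assumptions made for the sketch, and an edge is treated as "unresolved" for a tree node when exactly one of its endpoints lies inside that node's coverage.

```python
from dataclasses import dataclass, field
from itertools import combinations
from typing import Optional


@dataclass
class TreeNode:
    # Hypothetical graph-tree node; the field names are assumptions.
    name: str
    children: list = field(default_factory=list)
    nodes: set = field(default_factory=set)        # leaf only: its graph nodes
    filepath: Optional[str] = None                 # leaf only: subgraph file
    coverage: set = field(default_factory=set)     # all graph nodes below
    open_nodes: set = field(default_factory=set)
    super_edges: dict = field(default_factory=dict)


def crosses(edge, coverage):
    """True when exactly one endpoint of the edge lies inside the coverage."""
    u, v = edge
    return (u in coverage) != (v in coverage)


def fill_graph_tree(ptr, edges, path_for):
    """Fill ptr recursively and return its unresolved external edges."""
    if not ptr.children:
        ptr.filepath = path_for(ptr)
        ptr.coverage = set(ptr.nodes)
        unresolved = {e for e in edges if crosses(e, ptr.coverage)}
    else:
        pending = {}
        for child in ptr.children:
            pending[child.name] = fill_graph_tree(child, edges, path_for)
            ptr.coverage |= child.coverage
        # One SuperEdge per pair of children: the unresolved edges they share.
        for a, b in combinations(ptr.children, 2):
            ptr.super_edges[(a.name, b.name)] = pending[a.name] & pending[b.name]
        # Edges still leaving ptr's coverage are passed up to the parent.
        candidates = set().union(*pending.values())
        unresolved = {e for e in candidates if crosses(e, ptr.coverage)}
    # Open nodes: endpoints inside ptr that touch an external edge.
    ptr.open_nodes = {n for e in unresolved for n in e if n in ptr.coverage}
    return unresolved


# Tiny example: leaves A, B under X; X and leaf C under the root.
a = TreeNode("A", nodes={1, 2})
b = TreeNode("B", nodes={3, 4})
c = TreeNode("C", nodes={5})
x = TreeNode("X", children=[a, b])
root = TreeNode("root", children=[x, c])
edges = {(1, 2), (2, 3), (4, 5)}
leftover = fill_graph_tree(root, edges, lambda n: f"{n.name}.graph")
```

In this example the edge (2, 3) is resolved at X as the SuperEdge between A and B, while (4, 5) stays unresolved until the root, where it becomes the SuperEdge between X and C.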
  • 9. SuperNodes and GraphNodes connectivity: The SuperNode connectivity of two SuperNodes is the set of edges whose source belongs to the coverage of the first SuperNode and whose target belongs to the coverage of the second. The GraphNode connectivity of a graph node is the set of edges connecting it to graph nodes outside the coverage of the SuperNode that contains it. Both kinds of connectivity are used to reconstruct the graph from its hierarchical representation.
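Both definitions translate directly into set comprehensions. The sketch below assumes edges are `(source, target)` tuples and that a SuperNode's coverage is available as a plain set of graph nodes; the function names are made up for illustration.

```python
def supernode_connectivity(edges, coverage_a, coverage_b):
    """Edges whose source lies in coverage_a and whose target lies in coverage_b."""
    return {(u, v) for u, v in edges if u in coverage_a and v in coverage_b}


def graphnode_connectivity(edges, node, parent_coverage):
    """Edges linking `node` to graph nodes outside its own SuperNode's coverage."""
    return {
        (u, v)
        for u, v in edges
        if (u == node and v not in parent_coverage)
        or (v == node and u not in parent_coverage)
    }


edges = {(1, 2), (2, 3), (4, 5), (3, 1)}
cov_a, cov_b = {1, 2}, {3, 4}

between_a_and_b = supernode_connectivity(edges, cov_a, cov_b)
around_node_3 = graphnode_connectivity(edges, 3, cov_b)
```

Note that the SuperNode connectivity is directional under this reading: the edge (3, 1) goes from the second coverage to the first, so it is not part of `between_a_and_b`.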
  • 10. Motivation behind CEPS: A Graph-Tree and the hierarchical representation of a SuperGraph ease the problem of inspecting large graphs. However, the sub-graph reached this way sometimes holds much more information than required. CEPS is used to overcome this shortcoming.
  • 11. CEPS: A centre-piece subgraph contains the collection of paths connecting a subset of graph nodes of interest. CEPS helps interaction by significantly reducing the number of edges and nodes to be inspected. CEPS uses the Random Walk with Restart (RWR) method to find the 'importance' score between two nodes.
  • 12. GOODNESS SCORE: The goodness score is calculated with the Random Walk with Restart method. A matrix A(i, j) is defined that stores the steady-state probability of each node j with respect to the query i.
      (Figure: 13-node example graph annotated with scores.)
      Individual Score Matrix
      Node       Q1       Q2       Q3
      Node 1     0.5767   0.0088   0.0088
      Node 2     0.1235   0.0076   0.0076
      Node 3     0.0283   0.0283   0.0283
      Node 4     0.0076   0.1235   0.0076
      Node 5     0.0088   0.5767   0.0088
      Node 6     0.0076   0.0076   0.1235
      Node 7     0.0088   0.0088   0.5767
      Node 8     0.0333   0.0024   0.1260
      Node 9     0.1260   0.0024   0.0333
      Node 10    0.1260   0.0333   0.0024
      Node 11    0.0333   0.1260   0.0024
      Node 12    0.0024   0.1260   0.0333
      Node 13    0.0024   0.0333   0.1260
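A minimal sketch of how the steady-state scores for one query node could be computed by power iteration. The restart probability of 0.15, the iteration count, the adjacency-list format and the three-node example are assumptions; the slides give no parameters, so this does not reproduce the matrix above.

```python
def random_walk_with_restart(adj, query, restart=0.15, iters=200):
    """Steady-state visiting probabilities of a walker that restarts at `query`.

    adj maps each node to a non-empty list of neighbours (unweighted graph).
    """
    score = {n: 1.0 if n == query else 0.0 for n in adj}
    for _ in range(iters):
        nxt = {n: 0.0 for n in adj}
        for u, neighbours in adj.items():
            # With probability 1 - restart, move to a uniformly chosen neighbour.
            share = (1.0 - restart) * score[u] / len(neighbours)
            for v in neighbours:
                nxt[v] += share
        # With probability `restart`, jump back to the query node.
        nxt[query] += restart
        score = nxt
    return score


# Path graph 1 - 2 - 3, queried from node 1.
path = {1: [2], 2: [1, 3], 3: [2]}
scores = random_walk_with_restart(path, query=1)
```

Running this once per query node would give one row of scores per query, which is the shape of the A(i, j) matrix described on the slide.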
  • 13. EXTRACT ALGORITHM: The EXTRACT algorithm takes as input the weighted graph W, the importance scores of all nodes and the budget b, and produces as output a small, unweighted, undirected graph H. It is performed using dynamic programming or a greedy method.
      1. Initialize the output graph H to be null.
      2. Let len be the maximum allowable path length.
      3. While H is not big enough:
        3.1. Pick a destination node pd.
        3.2. For each active source node qi with respect to pd:
          3.2.1. Discover a key path P(qi, pd).
          3.2.2. Add P(qi, pd) to H.
      4. Output the final H.
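The loop structure above can be illustrated with a simplified greedy sketch. Two substitutions are made here and should be read as assumptions, not as the published method: destination nodes are picked in decreasing order of a single combined score, and a bounded breadth-first shortest path stands in for key-path discovery (the slide mentions dynamic programming or a greedy method for that step). The budget is also checked only between destinations, so it acts as a soft limit.

```python
from collections import deque


def bounded_path(adj, src, dst, max_len):
    """Shortest path from src to dst using at most max_len edges, else None."""
    parent, depth = {src: None}, {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return path[::-1]
        if depth[u] == max_len:
            continue
        for v in adj[u]:
            if v not in parent:
                parent[v], depth[v] = u, depth[u] + 1
                queue.append(v)
    return None


def extract(adj, combined_score, queries, budget, max_len):
    """Greedy stand-in for EXTRACT: returns (nodes of H, undirected edges of H)."""
    h_nodes, h_edges = set(), set()                      # step 1: H starts empty
    by_score = sorted(adj, key=lambda n: combined_score[n], reverse=True)
    for pd in by_score:                                  # step 3.1: next destination
        if len(h_nodes) >= budget:                       # step 3: H is big enough
            break
        if pd in h_nodes:
            continue
        for q in queries:                                # step 3.2: each source
            path = bounded_path(adj, q, pd, max_len)     # step 3.2.1 (BFS stand-in)
            if path is None:
                continue
            h_nodes.update(path)                         # step 3.2.2
            h_edges.update(frozenset(e) for e in zip(path, path[1:]))
    return h_nodes, h_edges                              # step 4


adj = {"a": ["c", "d"], "b": ["c"], "c": ["a", "b"], "d": ["a"]}
combined = {"a": 0.5, "b": 0.5, "c": 0.9, "d": 0.1}
h_nodes, h_edges = extract(adj, combined, queries=["a", "b"], budget=3, max_len=2)
```

Here the high-scoring node `c` is chosen as the first destination, the paths from both query nodes are added, and the low-scoring leaf `d` is left out once the budget of three nodes is reached.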
  • 14. GMINE SYSTEM: GMine is a graph visualisation tool for handling large graphs. It uses Graph-Trees to offer readable graph exploration. As the user interacts with the visualization, the system keeps track of the connectivity among communities of nodes at different levels of the partitioned graph. When the user changes the focus position on the tree structure, the system works on demand to calculate and present contextual information.
  • 16. REFERENCES
      Jose F. Rodrigues Jr., Hanghang Tong, Jia-Yu Pan, Agma J.M. Traina, Caetano Traina Jr. and Christos Faloutsos, “Large Graph Analysis in the GMine System”, IEEE Transactions on Knowledge and Data Engineering, vol. 25, no. 1, January 2013.
      Christos Faloutsos, Jose F. Rodrigues Jr., Hanghang Tong, Agma J.M. Traina, “GMine: A system for scalable, interactive, graph visualization and mining”, in IEEE/ACM International Conference, pages 1195–1198, Oconomowoc, Wisconsin, USA.
      Hanghang Tong, Christos Faloutsos, “Center Piece Subgraphs: Problem definition and fast solutions”, Carnegie-Mellon University, Research Track Paper, pages 404–414.
      www.cmu.edu (Carnegie-Mellon University site)
      Jose F. Rodrigues Jr., Agma J.M. Traina, Caetano Traina Jr., Caio Cesar Moreli, “GMine: Interactive browsing of large graphs”, Workshop on Information Visualization and Analysis in Social Networks (WIVA 2008).