Graph Analytics - From the Whiteboard to Your Toolbox - Sam Lerma
This is a novice-track talk, so all concepts and examples are kept simple
1. Basic graph theory concepts and definitions
2. A few real-world scenarios framed as graph data
3. Working with graphs in Python
The overall goal of this talk is to spark your interest in graphs and show you what’s
out there as a jumping-off point for you to go deeper
Graph: “A structure amounting to a set of objects in which some
pairs of the objects are in some sense ‘related’. The objects
correspond to mathematical abstractions called vertices (also called
nodes or points) and each of the related pairs of vertices is called an
edge (also called an arc or line)” – Richard Trudeau, Introduction to
Graph Theory (1st edition, 1993)
Graph Analytics: “Analysis of data structured as a graph
(sometimes also part of network analysis or link analysis depending
on scope and context)” – Me, talking to a stress ball as I made these
slides
• We see two vertices joined by
a single edge
• Vertex 1 is adjacent to vertex 2
• The neighborhood of vertex 1
is all adjacent vertices (vertex
2 in this case)
• We see that there is a loop on
vertex a
• Vertices a and b have multiple
edges between them
• Vertex c has a degree of 3
• There exists a path from vertex a
to vertex e
• Vertices f, g, and h form a 3-cycle
• We have no single cut vertex or cut
edge (one that would create more
disjoint vertex/edge sets if
removed)
• We can separate this graph into two
disconnected sets:
1) Vertex Set 1 = {a, b, c, d, e}
2) Vertex Set 2 = {f, g, h}
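The ideas above (degree, paths, disconnected vertex sets) can be tried out in plain Python. This is a minimal sketch, assuming an adjacency-dict representation and a made-up set of edges chosen so that vertex c has degree 3 and the graph splits into the two vertex sets listed above; the edges are not read off the slide's figure.

```python
from collections import deque

# Hypothetical undirected graph stored as an adjacency dict.
# The edges are illustrative only; they are not taken from the slide.
graph = {
    "a": ["b", "c", "e"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c", "e"],
    "e": ["a", "d"],
    "f": ["g", "h"],
    "g": ["f", "h"],
    "h": ["f", "g"],
}

def degree(g, v):
    """Number of neighbors of v (this sketch has no loops or multiple edges)."""
    return len(g[v])

def connected_components(g):
    """Return the vertex set of each connected component, found by BFS."""
    seen, components = set(), []
    for start in g:
        if start in seen:
            continue
        component, queue = {start}, deque([start])
        seen.add(start)
        while queue:
            v = queue.popleft()
            for w in g[v]:
                if w not in seen:
                    seen.add(w)
                    component.add(w)
                    queue.append(w)
        components.append(component)
    return components

components = connected_components(graph)
print(degree(graph, "c"), components)
```

Any vertex reachable by the search from a (such as e) has a path from a, which is why the search also recovers the two disconnected sets.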
• Imagine the same vertex
labels along the top and
left-hand sides of the
matrix
• A one in a particular slot
tells us that the two
vertices are adjacent
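Building an adjacency matrix from an edge list takes only a few lines. A small sketch, assuming a hypothetical four-vertex undirected graph (the vertex names and edges are made up for illustration):

```python
# Hypothetical edge list for a small undirected graph (not taken from the slide).
vertices = ["a", "b", "c", "d"]
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]

index = {v: i for i, v in enumerate(vertices)}
n = len(vertices)
matrix = [[0] * n for _ in range(n)]
for u, v in edges:
    matrix[index[u]][index[v]] = 1
    matrix[index[v]][index[u]] = 1  # undirected: mirror across the diagonal

# Print the matrix with vertex labels along the top and left-hand side.
print("  " + " ".join(vertices))
for v, row in zip(vertices, matrix):
    print(v, " ".join(str(x) for x in row))
```

Because the graph is undirected, the matrix equals its own transpose.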
• In this graph two vertices are
joined by a single directed
edge
• There is a dipath from vertex 1
to vertex 2 but not from vertex
2 to vertex 1
• Every vertex has ‘played’ every
other vertex
• We can see that there is no clear
winner (every vertex has
indegree and outdegree of 2)
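The "no clear winner" observation can be checked by counting indegrees and outdegrees. This sketch assumes a hypothetical five-player round robin in which player i beats players i+1 and i+2 (mod 5); the slide's actual edges are not reproduced here.

```python
n = 5
# Hypothetical round-robin result: an edge (u, v) means "u beat v".
edges = [(i, (i + k) % n) for i in range(n) for k in (1, 2)]

outdegree = {v: 0 for v in range(n)}
indegree = {v: 0 for v in range(n)}
for winner, loser in edges:
    outdegree[winner] += 1
    indegree[loser] += 1

print(outdegree)
print(indegree)
```

Every pair of vertices is joined by exactly one directed edge, and each vertex ends up with two wins and two losses.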
• Vertices from Set 1 = {a, b, c, d} are
only adjacent to vertices from Set 2
= {e, f, g, h}
• This can be extended to tripartite
graphs (3 sets) or as many sets as we
like (n-partite graphs)
• Can we pair vertices from each set
together?
We can pair every vertex
from one set to a vertex
from the other using only
existing edges
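One way to look for such a pairing is an augmenting-path search (often called Kuhn's algorithm). The sketch below is one possible approach, not necessarily what the talk used, and the edges between the two sets are hypothetical.

```python
# Hypothetical bipartite graph: each Set 1 vertex lists its neighbors in Set 2.
adjacency = {
    "a": ["e", "f"],
    "b": ["e"],
    "c": ["f", "g"],
    "d": ["g", "h"],
}

def maximum_matching(adj):
    """Augmenting-path matching; returns {set2_vertex: set1_vertex}."""
    match = {}

    def try_assign(u, visited):
        for v in adj[u]:
            if v in visited:
                continue
            visited.add(v)
            # v is free, or its current partner can be moved to another vertex
            if v not in match or try_assign(match[v], visited):
                match[v] = u
                return True
        return False

    for u in adj:
        try_assign(u, set())
    return match

matching = maximum_matching(adjacency)
is_perfect = len(matching) == len(adjacency)
print(matching, is_perfect)
```

Here b can only pair with e, so the search moves a from e to f to make room.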
• We can assign weights to edges
of a graph
• As we follow a path through the
graph, these weights accumulate
• For example, the path a -> b -> c has an associated
weight of 0.5 + 0.4 = 0.9
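The accumulation of weights along a path can be sketched as a simple sum. The edge weights are the ones from the example above; storing them in a dict keyed by (u, v) pairs is an assumption made for illustration.

```python
# Edge weights from the example; the dict-of-pairs representation is assumed.
weights = {("a", "b"): 0.5, ("b", "c"): 0.4}

def path_weight(path, weights):
    """Sum the weights of consecutive edges along a path."""
    return sum(weights[(u, v)] for u, v in zip(path, path[1:]))

total = path_weight(["a", "b", "c"], weights)
print(round(total, 2))
```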
• We can assign colors to vertices
• The graph we see here has a
proper coloring (no two vertices
of the same color are adjacent)
• We can also color edges!
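Checking whether a vertex coloring is proper only requires looking at each edge once. A minimal sketch, assuming a hypothetical 4-cycle and color assignment:

```python
# Hypothetical graph and color assignment, used only to illustrate the check.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
colors = {"a": "red", "b": "blue", "c": "red", "d": "blue"}

def is_proper_coloring(edges, colors):
    """True when no edge joins two vertices of the same color."""
    return all(colors[u] != colors[v] for u, v in edges)

proper = is_proper_coloring(edges, colors)
# Adding the edge a-c joins two red vertices, so the coloring stops being proper.
improper = is_proper_coloring(edges + [("a", "c")], colors)
print(proper, improper)
```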
• Are we focused more on objects or the relationships/interactions
between them?
• Are we looking at transition states?
• Is orientation important?
If you can imagine a graph to represent it, it’s probably worth giving it a
shot, if only for your own learning and exploration!
• If the lines represent
connections, what can we say
about the people highlighted
in red?
• What kinds of questions might
a graph be able to answer?
• e and d have the highest
degree
• What might the c-d-e cycle
tell us?
• What can we say about cut
vertices?
If we have page view
data with timestamps,
how might we
represent this as a
graph?
• What might loops or multiple edges
between vertices represent?
• What types of data might we want to
use as values on the edges?
• What might comparing indegrees and
outdegrees on different vertices
represent?
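One possible answer to the questions above: treat pages as vertices and each consecutive pair of views by the same user as a directed edge, with the number of times the transition occurs as the edge value. The log format (user, timestamp, page) and the records themselves are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical page-view log: (user, timestamp, page). Not real data.
views = [
    ("u1", 1, "home"), ("u1", 2, "search"), ("u1", 3, "product"),
    ("u2", 1, "home"), ("u2", 4, "search"), ("u2", 5, "search"),
    ("u1", 6, "home"),
]

# Group views per user so each user's views can be ordered by timestamp.
by_user = defaultdict(list)
for user, ts, page in views:
    by_user[user].append((ts, page))

# Each consecutive pair of pages becomes a directed edge; the count is its weight.
transitions = Counter()
for user_views in by_user.values():
    pages = [page for _, page in sorted(user_views)]
    for src, dst in zip(pages, pages[1:]):
        transitions[(src, dst)] += 1

print(transitions)
```

In this representation a loop such as search -> search is a page reload or repeated search, and a count above one plays the role of multiple edges.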
If we have to regularly pick up a
load at the train station, make
deliveries to every factory, and
then return to the garage, how can
a graph help us find an optimal
route?
• We can assign weights to each edge to
represent distance, travel time, gas cost
for the distance, etc.
• The path with the lowest total weight
represents the shortest, cheapest, or
fastest route, depending on the weight used
• Note that edge weights are only
displayed for f-e and f-a
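For a handful of stops, the lowest-weight route can be found by simply trying every delivery order. This is a brute-force sketch under assumed data: the stop names and travel times are hypothetical, and the approach does not scale beyond a few factories.

```python
from itertools import permutations

# Hypothetical travel times between stops (symmetric); not the slide's values.
dist = {
    ("station", "f1"): 4, ("station", "f2"): 2, ("station", "f3"): 7,
    ("f1", "f2"): 3, ("f1", "f3"): 2, ("f2", "f3"): 6,
    ("f1", "garage"): 5, ("f2", "garage"): 8, ("f3", "garage"): 1,
}

def d(u, v):
    """Look up an undirected edge weight in either orientation."""
    return dist[(u, v)] if (u, v) in dist else dist[(v, u)]

def route_cost(route):
    """Total weight of the edges along a route."""
    return sum(d(u, v) for u, v in zip(route, route[1:]))

factories = ["f1", "f2", "f3"]
# Start at the station, visit every factory in some order, end at the garage.
best_route = min(
    (["station", *order, "garage"] for order in permutations(factories)),
    key=route_cost,
)
best_cost = route_cost(best_route)
print(best_route, best_cost)
```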
If the following people want to
attend the following talks (a-h),
what’s the minimum number of
sessions we need to satisfy
everyone?
• We can use the talks as
vertices and add edges
between talks that have the
same person interested
• The minimum number of
colors needed for a proper
coloring shows us the
minimum number of
sessions we need to satisfy
everyone
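The scheduling idea can be sketched with a greedy coloring. The attendee names and their interests are made up; only the talk labels a-h come from the slide. Greedy coloring gives an upper bound on the number of sessions and is not guaranteed to find the minimum in general.

```python
from itertools import combinations

# Hypothetical attendee interests; the names and choices are made up.
interests = {
    "Ana": ["a", "b", "c"],
    "Ben": ["c", "d"],
    "Cleo": ["d", "e", "f"],
    "Dev": ["f", "g"],
    "Eli": ["g", "h", "a"],
}

# Two talks conflict when the same person wants to attend both.
talks = sorted({t for wanted in interests.values() for t in wanted})
conflicts = {t: set() for t in talks}
for wanted in interests.values():
    for s, t in combinations(wanted, 2):
        conflicts[s].add(t)
        conflicts[t].add(s)

# Greedy coloring: each talk gets the smallest session number
# not already used by a conflicting talk.
session = {}
for t in talks:
    taken = {session[nbr] for nbr in conflicts[t] if nbr in session}
    session[t] = next(c for c in range(len(talks)) if c not in taken)

sessions_needed = len(set(session.values()))
print(session, sessions_needed)
```

In this made-up data Ana wants three talks, so at least three sessions are needed, and the greedy result happens to match that lower bound.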
https://github.com/igraph/python-igraph
https://github.com/networkx
https://graph-tool.skewed.de
• GraphML (XML-based)
• GML (ASCII-based)
• NetworkX has built in functions to work with a Pandas DataFrame or a
NumPy array/matrix
import networkx as nx
import matplotlib.pyplot as plt
G = nx.Graph()
# Build the vertex list 1..5
vertices = []
for x in range(1, 6):
    vertices.append(x)
G.add_nodes_from(vertices)
G.add_edges_from([(1, 2), (2, 3), (5, 4),
                  (4, 2), (1, 3), (5, 1), (5, 2), (3, 4)])
# Compute a layout, then draw nodes, edges, and labels as separate layers
pos = nx.spring_layout(G)
nx.draw_networkx_nodes(G, pos, node_size=20)
nx.draw_networkx_edges(G, pos, width=5)
nx.draw_networkx_labels(G, pos, font_size=14)
nx.draw(G, pos)
plt.show()
import networkx as nx
import matplotlib.pyplot as plt
G = nx.Graph()
G.add_nodes_from(['a', 'b', 'c'])
# Each edge carries a 'weight' attribute
G.add_edge('a', 'b', weight=0.5)
G.add_edge('b', 'c', weight=0.2)
G.add_edge('c', 'a', weight=0.7)
pos = nx.spring_layout(G)
nx.draw_networkx_nodes(G, pos, node_size=500)
nx.draw_networkx_edges(G, pos, width=6)
nx.draw_networkx_labels(G, pos, font_size=14)
# With no edge_labels argument, the edge attributes are used as the labels
nx.draw_networkx_edge_labels(G, pos, font_size=14)
nx.draw(G, pos)
plt.show()
>>> G.nodes()
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12, 13, 14, 15, 16, 17, 18, 19, 20]
>>> nx.shortest_path(G, 1, 18)
[1, 3, 18]
>>> G.degree()
{1: 4, 2: 3, 3: 4, 4: 4, 5: 4, 6: 3,
7: 3, 8: 3, 9: 4, 10: 3, 11: 2,
12: 2, 13: 2, 14: 4, 15: 3, 16: 3,
17: 2, 18: 3, 19: 3, 20: 3}
>>> nx.greedy_color(G)
{'d': 0, 'a': 0, 'e': 1, 'b': 1,
'c': 1, 'f': 2, 'h': 1, 'g': 0}
>>> temp = nx.greedy_color(G)
>>> len(set(temp.values()))
3
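The outputs above have the shape returned by older NetworkX releases, where `G.nodes()` gave a list and `G.degree()` gave a dict. In NetworkX 2.0 and later these calls return view objects, so a rough equivalent looks like the sketch below. The small graph is hypothetical; the 20-vertex graph from the slides is not reproduced here.

```python
import networkx as nx

# Hypothetical small graph used only to show the calls.
G = nx.Graph([(1, 2), (1, 3), (2, 3), (3, 4)])

nodes = list(G.nodes)        # NodeView -> plain list
degrees = dict(G.degree)     # DegreeView -> plain dict
path = nx.shortest_path(G, 1, 4)

# greedy_color returns {vertex: color index}; count the distinct colors used.
coloring = nx.greedy_color(G)
colors_used = len(set(coloring.values()))

print(nodes, degrees, path, colors_used)
```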
import networkx as nx
import matplotlib.pyplot as plt
# Directed graph: each pair (u, v) is an edge from u to v
G = nx.DiGraph([(1, 2), (1, 3), (4, 1),
                (1, 5), (2, 3), (2, 4), (2, 5), (3, 4),
                (3, 5), (4, 5)])
pos = nx.circular_layout(G)
nx.draw_networkx_nodes(G, pos, node_size=200)
nx.draw_networkx_edges(G, pos)
nx.draw_networkx_labels(G, pos, font_size=14)
plt.show()
>>> nx.has_path(G, 1, 5)
True
>>> nx.has_path(G, 5, 1)
False
>>> nx.shortest_path(G, 1, 4)
[1, 2, 4]
>>> nx.maximal_matching(G)
{(1, 4), (5, 2), (6, 3)}
• There’s a NetworkX tutorial tomorrow!
• In-browser Graphviz: webgraphviz.com
• Free graph theory textbook: An Introduction to Combinatorics and
Graph Theory, David Guichard
• Open problems in graph theory: openproblemgarden.org
• Graph databases
• Association for Computational Linguistics (ACL) 2010 Workshop on
Graph-based Methods for Natural Language Processing
• Free papers: researchgate.net

Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 

Graph Analytics - From the Whiteboard to Your Toolbox - Sam Lerma

  • 1.
  • 2. This is a novice-track talk, so all concepts and examples are kept simple 1. Basic graph theory concepts and definitions 2. A few real-world scenarios framed as graph data 3. Working with graphs in Python The overall goal of this talk is to spark your interest and show you what’s out there, as a jumping-off point for you to go deeper
  • 3. Graph: “A structure amounting to a set of objects in which some pairs of the objects are in some sense ‘related’. The objects correspond to mathematical abstractions called vertices (also called nodes or points) and each of the related pairs of vertices is called an edge (also called an arc or line)” – Richard Trudeau, Introduction to Graph Theory (1st edition, 1993) Graph Analytics: “Analysis of data structured as a graph (sometimes also part of network analysis or link analysis depending on scope and context)” – Me, talking to a stress ball as I made these slides
  • 4.
  • 5. • We see two vertices joined by a single edge • Vertex 1 is adjacent to vertex 2 • The neighborhood of vertex 1 is all adjacent vertices (vertex 2 in this case)
  • 6.
  • 7. • We see that there is a loop on vertex a • Vertices a and b have multiple edges between them • Vertex c has a degree of 3 • There exists a path from vertex a to vertex e • Vertices f, g, and h form a 3-cycle
  • 8. • We have no single cut vertex or cut edge (one that would create more disjoint vertex/edge sets if removed) • We can separate this graph into two disconnected sets: 1) Vertex Set 1 = {a, b, c, d, e} 2) Vertex Set 2 = {f, g, h}
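A minimal sketch of the idea on slide 8, in plain Python (no NetworkX needed). The figure is not part of this transcript, so the edge list is an assumption: a 5-cycle on a to e, a triangle on f, g, h, and two hypothetical bridging edges. Removing any single edge leaves the graph connected (no cut edge), while removing both bridging edges splits it into the two vertex sets named on the slide.

```python
from collections import deque

# Hypothetical edges: the slide's figure is not in the transcript.
group_1 = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("e", "a")]
group_2 = [("f", "g"), ("g", "h"), ("h", "f")]
bridging = [("c", "f"), ("e", "h")]
all_edges = group_1 + group_2 + bridging
vertices = sorted({v for edge in all_edges for v in edge})


def connected_components(vertices, edges):
    """Return the connected components as a list of vertex sets (BFS)."""
    adjacency = {v: set() for v in vertices}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    seen = set()
    components = []
    for start in vertices:
        if start in seen:
            continue
        component = {start}
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for neighbor in adjacency[current]:
                if neighbor not in component:
                    component.add(neighbor)
                    queue.append(neighbor)
        seen |= component
        components.append(component)
    return components


# The full graph is a single connected piece.
whole = connected_components(vertices, all_edges)

# No cut edge: dropping any single edge still leaves one component.
has_cut_edge = any(
    len(connected_components(vertices, [e for e in all_edges if e != removed])) > 1
    for removed in all_edges
)

# Dropping both bridging edges separates the two groups.
split = connected_components(vertices, group_1 + group_2)

print(len(whole), has_cut_edge, split)
```

The same check could be repeated for single vertices to confirm there is no cut vertex either.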
  • 9. • Imagine symmetric vertex labels along the top and left-hand sides of the matrix • A 1 in a particular slot tells us that the two vertices labeling that row and column are adjacent
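As a rough illustration, here is one way to build an adjacency matrix from an edge list in plain Python. The four vertices and their edges are made up for the example; they are not taken from the slide's figure.

```python
# Hypothetical undirected graph, used only for illustration.
vertices = ["a", "b", "c", "d"]
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]

# Map each vertex label to its row/column position.
index = {vertex: position for position, vertex in enumerate(vertices)}

# Start with all zeros, then mark each edge in both directions,
# which is what makes the matrix symmetric.
matrix = [[0] * len(vertices) for _ in vertices]
for u, v in edges:
    matrix[index[u]][index[v]] = 1
    matrix[index[v]][index[u]] = 1

# Print with the labels along the top and the left-hand side.
print("  " + " ".join(vertices))
for label, row in zip(vertices, matrix):
    print(label, " ".join(str(entry) for entry in row))

# In a simple graph, the sum of a row is the degree of that vertex.
degree = {label: sum(row) for label, row in zip(vertices, matrix)}
```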
  • 10. • In this graph two vertices are joined by a single directed edge • There is a dipath from vertex 1 to vertex 2 but not from vertex 2 to vertex 1
  • 11. • Every vertex has ‘played’ every other vertex • We can see that there is no clear winner (every vertex has indegree and outdegree of 2)
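A small sketch of the tournament on slide 11. An indegree and outdegree of 2 for every vertex implies five players, but the actual results are not in the transcript, so the outcomes below are an assumption: player i beats players i+1 and i+2 (wrapping around).

```python
n = 5

# Hypothetical results as directed edges (winner, loser).
games = [(i, (i + k) % n) for i in range(n) for k in (1, 2)]

outdegree = {player: 0 for player in range(n)}  # number of wins
indegree = {player: 0 for player in range(n)}   # number of losses
for winner, loser in games:
    outdegree[winner] += 1
    indegree[loser] += 1

# Every pair of players meets exactly once: 5 choose 2 = 10 games.
pairs_played = {frozenset(game) for game in games}

print(outdegree)
print(indegree)
```

With every player on two wins and two losses, comparing outdegrees gives no way to rank anyone above anyone else.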
  • 12. • Vertices from Set 1 = {a, b, c, d} are only adjacent to vertices from Set 2 = {e, f, g, h} • This can be extended to tripartite graphs (3 sets) or as many sets as we like (n-partite graphs) • Can we pair vertices from each set together?
  • 13. We can pair every vertex from one set to a vertex from the other using only existing edges
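One way to find such a pairing is the classic augmenting-path approach, sketched here in plain Python. The two vertex sets match slide 12, but the edges between them are hypothetical, since the figure is not in the transcript.

```python
# Hypothetical edges from Set 1 = {a, b, c, d} to Set 2 = {e, f, g, h}.
neighbors = {
    "a": ["e", "f"],
    "b": ["e"],
    "c": ["f", "g"],
    "d": ["g", "h"],
}


def maximum_matching(neighbors):
    """Pair left vertices with right vertices using augmenting paths."""
    matched_to = {}  # right vertex -> left vertex currently paired with it

    def try_assign(left, visited):
        for right in neighbors[left]:
            if right in visited:
                continue
            visited.add(right)
            # Take a free right vertex, or re-route its current partner.
            if right not in matched_to or try_assign(matched_to[right], visited):
                matched_to[right] = left
                return True
        return False

    for left in neighbors:
        try_assign(left, set())
    return {left: right for right, left in matched_to.items()}


pairs = maximum_matching(neighbors)
print(pairs)
```

Note how b can only pair with e, so a gets re-routed to f when b is processed; that re-routing is the "augmenting path" step.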
  • 14. • We can assign weights to edges of a graph • As we follow a path through the graph, these weights accumulate • For example, the path a -> b -> c has an associated weight of 0.5 + 0.4 = 0.9
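The accumulation on slide 14 can be sketched in a few lines of plain Python. The two edge weights come from the slide; the dictionary layout and the `path_weight` helper are just one possible way to write it.

```python
# Edge weights from the slide, keyed by (start, end).
weights = {("a", "b"): 0.5, ("b", "c"): 0.4}


def path_weight(path, weights):
    """Add up the weights of consecutive edges along a path."""
    return sum(weights[(u, v)] for u, v in zip(path, path[1:]))


total = path_weight(["a", "b", "c"], weights)
print(total)
```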
  • 15. • We can assign colors to vertices • The graph we see here has a proper coloring (no two vertices of the same color are adjacent) • We can also color edges!
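Checking whether a coloring is proper only takes one pass over the edges. This is a sketch on a made-up 4-cycle, not the graph from the slide.

```python
# Hypothetical graph: a 4-cycle a - b - c - d - a.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]


def is_proper_coloring(edges, colors):
    """True if no edge joins two vertices of the same color."""
    return all(colors[u] != colors[v] for u, v in edges)


alternating = {"a": "red", "b": "blue", "c": "red", "d": "blue"}
clashing = {"a": "red", "b": "blue", "c": "blue", "d": "blue"}

print(is_proper_coloring(edges, alternating))
print(is_proper_coloring(edges, clashing))
```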
  • 16. • Are we focused more on objects or the relationships/interactions between them? • Are we looking at transition states? • Is orientation important? If you can imagine a graph to represent it, it’s probably worth giving it a shot, if only for your own learning and exploration!
  • 17. • If the lines represent connections, what can we say about the people highlighted in red? • What kinds of questions might a graph be able to answer?
  • 18. • e and d have the highest degree • What might the c-d-e cycle tell us? • What can we say about cut vertices?
  • 19. If we have page view data with timestamps how might we represent this as a graph?
  • 20. • What might loops or multiple edges between vertices represent? • What types of data might we want to use as values on the edges? • What might comparing indegrees and outdegrees on different vertices represent?
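One possible answer, sketched in plain Python: treat pages as vertices and each consecutive pair of views by the same user as a directed edge, with the number of times it happens as the edge weight. The log below is invented for the example.

```python
from collections import Counter

# Hypothetical page-view log: (user, timestamp, page), in no particular order.
views = [
    ("u2", 5, "product"), ("u1", 1, "home"), ("u3", 3, "search"),
    ("u1", 3, "search"), ("u2", 1, "home"), ("u1", 2, "search"),
    ("u3", 2, "home"), ("u1", 4, "product"), ("u2", 9, "home"),
]

# Directed edge (from_page, to_page) -> how many times that step was taken.
transitions = Counter()
last_page = {}
for user, timestamp, page in sorted(views, key=lambda view: (view[0], view[1])):
    if user in last_page:
        transitions[(last_page[user], page)] += 1
    last_page[user] = page

# Weighted outdegree: steps leaving a page. Weighted indegree: steps arriving.
outdegree = Counter()
indegree = Counter()
for (source, target), count in transitions.items():
    outdegree[source] += count
    indegree[target] += count

print(dict(transitions))
print(dict(outdegree), dict(indegree))
```

Here the loop on "search" is a user repeating a search, and the weight of 2 on home -> search is the kind of value multiple edges might otherwise represent.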
  • 21. If we have to regularly pick up a load at the train station, make deliveries to every factory, and then return to the garage, how can a graph help us find an optimal route?
  • 22. • We can assign weights to each edge to represent distance, travel time, gas cost for the distance, etc. • The path with the lowest total weight is the shortest/cheapest/fastest route, depending on what the weights measure • Note that edge weights are only displayed for f-e and f-a
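For a handful of stops, the delivery question can be answered by brute force: try every order of the factories and keep the cheapest. This is only a sketch; the stop names and distances are hypothetical (the slide shows just two weights), and trying every order stops being practical as the number of factories grows.

```python
from itertools import permutations

# Hypothetical distances in km. Roads are two-way, so keys are unordered pairs.
distance = {
    frozenset(("station", "factory_1")): 4,
    frozenset(("station", "factory_2")): 7,
    frozenset(("station", "factory_3")): 3,
    frozenset(("factory_1", "factory_2")): 2,
    frozenset(("factory_1", "factory_3")): 6,
    frozenset(("factory_2", "factory_3")): 5,
    frozenset(("factory_1", "garage")): 8,
    frozenset(("factory_2", "garage")): 3,
    frozenset(("factory_3", "garage")): 9,
}
factories = ["factory_1", "factory_2", "factory_3"]


def route_cost(route):
    """Total weight of the edges between consecutive stops."""
    return sum(distance[frozenset((u, v))] for u, v in zip(route, route[1:]))


# Start at the station, visit the factories in some order, end at the garage.
candidates = [["station", *order, "garage"] for order in permutations(factories)]
best_route = min(candidates, key=route_cost)
best_cost = route_cost(best_route)

print(best_route, best_cost)
```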
  • 23. If the following people want to attend the following talks (a-h), what’s the minimum number of sessions we need to satisfy everyone?
  • 24. • We can use the talks as vertices and add edges between talks that have the same person interested • The minimum number of colors needed for a proper coloring shows us the minimum number of sessions we need to satisfy everyone
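Here is a sketch of that construction in plain Python. The attendee names and their interests are invented, since the slide's table is not in the transcript. The coloring is a simple greedy pass, which always gives a valid schedule but is not guaranteed to use the minimum number of sessions on every graph.

```python
from itertools import combinations

# Hypothetical attendees and the talks (a-h) each one wants to see.
interests = {
    "Ana": ["a", "b", "c"],
    "Ben": ["c", "d"],
    "Cho": ["d", "e", "f"],
    "Dev": ["f", "g"],
    "Eli": ["g", "h"],
    "Fay": ["a", "h"],
}

# Talks are vertices; two talks conflict if one person wants both.
conflicts = {talk: set() for talk in "abcdefgh"}
for wanted in interests.values():
    for first, second in combinations(wanted, 2):
        conflicts[first].add(second)
        conflicts[second].add(first)


def greedy_coloring(adjacency):
    """Give each vertex the smallest color not used by a colored neighbor."""
    colors = {}
    for vertex in sorted(adjacency):
        taken = {colors[n] for n in adjacency[vertex] if n in colors}
        color = 0
        while color in taken:
            color += 1
        colors[vertex] = color
    return colors


session = greedy_coloring(conflicts)  # talk -> session number
session_count = len(set(session.values()))

print(session, session_count)
```

In this made-up data, talks a, b, and c all conflict with each other, so no schedule can use fewer than three sessions; the greedy result happens to be optimal here.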
  • 27. • GraphML (XML-based) • GML (ASCII-based) • NetworkX has built-in functions to work with a Pandas DataFrame or a NumPy array/matrix
  • 28.
        import networkx as nx
        import matplotlib.pyplot as plt

        G = nx.Graph()
        G.add_nodes_from(range(1, 6))  # vertices 1 through 5
        G.add_edges_from([(1, 2), (2, 3), (5, 4), (4, 2), (1, 3), (5, 1), (5, 2), (3, 4)])

        pos = nx.spring_layout(G)  # compute a position for each vertex
        nx.draw_networkx_nodes(G, pos, node_size=20)
        nx.draw_networkx_edges(G, pos, width=5)
        nx.draw_networkx_labels(G, pos, font_size=14)
        nx.draw(G, pos)
        plt.show()
  • 29.
        import networkx as nx
        import matplotlib.pyplot as plt

        G = nx.Graph()
        G.add_nodes_from(['a', 'b', 'c'])
        G.add_edge('a', 'b', weight=0.5)
        G.add_edge('b', 'c', weight=0.2)
        G.add_edge('c', 'a', weight=0.7)

        pos = nx.spring_layout(G)  # compute a position for each vertex
        nx.draw_networkx_nodes(G, pos, node_size=500)
        nx.draw_networkx_edges(G, pos, width=6)
        nx.draw_networkx_labels(G, pos, font_size=14)
        nx.draw_networkx_edge_labels(G, pos, font_size=14)
        nx.draw(G, pos)
        plt.show()
  • 30.
        >>> G.nodes()
        [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
        >>> nx.shortest_path(G, 1, 18)
        [1, 3, 18]
        >>> G.degree()
        {1: 4, 2: 3, 3: 4, 4: 4, 5: 4, 6: 3, 7: 3, 8: 3, 9: 4, 10: 3, 11: 2, 12: 2, 13: 2, 14: 4, 15: 3, 16: 3, 17: 2, 18: 3, 19: 3, 20: 3}
  • 31.
  • 32.
        >>> nx.greedy_color(G)
        {'d': 0, 'a': 0, 'e': 1, 'b': 1, 'c': 1, 'f': 2, 'h': 1, 'g': 0}
        >>> temp = nx.greedy_color(G)
        >>> len(set(temp.values()))
        3
  • 33.
        import networkx as nx
        import matplotlib.pyplot as plt

        # Directed graph: each pair is an edge from the first vertex to the second
        G = nx.DiGraph([(1, 2), (1, 3), (4, 1), (1, 5), (2, 3), (2, 4), (2, 5), (3, 4), (3, 5), (4, 5)])

        pos = nx.circular_layout(G)
        nx.draw_networkx_nodes(G, pos, node_size=200)
        nx.draw_networkx_edges(G, pos)
        nx.draw_networkx_labels(G, pos, font_size=14)

        >>> nx.has_path(G, 1, 5)
        True
        >>> nx.has_path(G, 5, 1)
        False
        >>> nx.shortest_path(G, 1, 4)
        [1, 2, 4]
  • 35. • There’s a NetworkX tutorial tomorrow! • In-browser Graphviz: webgraphviz.com • Free graph theory textbook: An Introduction to Combinatorics and Graph Theory, David Guichard • Open problems in graph theory: openproblemgarden.org • Graph databases • Association for Computational Linguistics (ACL) 2010 Workshop on Graph-based Methods for Natural Language Processing • Free papers: researchgate.net