It was a cool experience to spend time with programmers and computer engineers. At this code camp, I talked about the science behind complex networks and how to program for complex network analysis. I also gave a brief introduction to graph databases.
Complex Networks: Science, Programming, and Databases
1. Complex Networks: Science, Databases, Programming
Mahdi Seyednezhad
Ph.D. candidate, Computer Science and Cybersecurity department,
Florida Institute of Technology, Melbourne, Florida.
Data scientist at Alstom.
sseyednezhad2013@my.fit.edu
https://seyednejad.wixsite.com/home
2. Who is Mahdi?
• I am Seyed Mohammad Mahdi Seyednezhad.
• Don’t worry! You can call me Mahdi/Mehdi.
• I am working at Alstom as a data scientist.
[Photos: Mahdi and Dr. Ronaldo Menezes, Biocomplex Lab.]
3. My awesome talk plan!
• Data Science
• Complex Network
• Graph Database
5. Data Science: Industry expectations!
• Data engineering
  • Data acquisition
  • ETL (Extract, Transform & Load)
  • Database management, data warehousing
• Feature engineering
  • Feature extraction, selection, reduction
• Machine learning
  • Supervised, unsupervised, reinforcement
  • Time series, images, tables, text, ...
• Data analytics
  • Statistical models
  • Structural models
  • Complex models
Data science is a spectrum of skills and applications!
6. What is a graph?
• A graph is a discrete math modeling tool.
• A graph has 2 main elements:
• Node (Vertex)
• Edge (Relation)
• For example:
  • Graph (Network) of cities: Orlando, Melbourne, Tampa, Miami, and Jacksonville are the nodes, and each connection between two cities is an edge.
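As a rough sketch, the city graph can be written down in Python with the networkx package. The five city names come from the slide; the list of links is hypothetical, because the slide does not say which pairs of cities are connected.

```python
import networkx as nx

# Nodes: the five cities named on the slide.
cities = ["Orlando", "Melbourne", "Tampa", "Miami", "Jacksonville"]

# Edges: hypothetical connections chosen only for illustration;
# the slide does not say which pairs of cities are linked.
links = [
    ("Orlando", "Melbourne"),
    ("Orlando", "Tampa"),
    ("Orlando", "Jacksonville"),
    ("Melbourne", "Miami"),
    ("Tampa", "Miami"),
]

city_graph = nx.Graph()
city_graph.add_nodes_from(cities)
city_graph.add_edges_from(links)

print(city_graph.number_of_nodes())             # 5
print(city_graph.number_of_edges())             # 5
print(sorted(city_graph.neighbors("Orlando")))  # cities linked to Orlando
```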
7. Complex Network
• A graph generated from real data is called a “network”.
• In the context of network theory, a complex network is a graph (network) with non-trivial topological features: features that do not occur in simple networks such as lattices or random graphs but often occur in graphs modeling real systems.
[Figure: a simple network next to a complex network]
8. Properties of a Network
• The degree of a node is the number of edges attached to it.
[Figure: Node1 is linked to Node2 and to Node3]
Degree(Node1)=2
Degree(Node2)=1
Degree(Node3)=1
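The same three-node example can be checked with networkx. This is only a small sketch; it assumes the two edges implied by the degrees on the slide (Node1 to Node2 and Node1 to Node3).

```python
import networkx as nx

# The two edges implied by the degrees on the slide.
G = nx.Graph()
G.add_edges_from([("Node1", "Node2"), ("Node1", "Node3")])

# G.degree() pairs every node with its number of attached edges.
degrees = dict(G.degree())
print(degrees)  # {'Node1': 2, 'Node2': 1, 'Node3': 1}
```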
9. Why network science?
• Let’s assume we have data of some people and tweets on
Twitter.
• Who is important?
• We can count!
• What relation is important?
• Who is important by number of relations?
• Who has attracted the most attention?
• How does the relation between entities change?
• It is easy to generalize to various applications.
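A minimal sketch of how these questions turn into network measures, assuming a made-up list of (author, mentioned user) pairs. The names and counts below are hypothetical, not real Twitter data.

```python
import networkx as nx

# Hypothetical (author, mentioned_user) pairs, one per tweet; not real data.
mentions = [
    ("ann", "bob"), ("cat", "bob"), ("dan", "bob"),
    ("ann", "bob"), ("bob", "cat"), ("dan", "cat"),
]

# Directed network: an edge author -> mentioned user, weighted by count.
G = nx.DiGraph()
for author, target in mentions:
    if G.has_edge(author, target):
        G[author][target]["weight"] += 1
    else:
        G.add_edge(author, target, weight=1)

# Who has attracted the most attention? Sum the weights of incoming edges.
attention = dict(G.in_degree(weight="weight"))
most_mentioned = max(attention, key=attention.get)

# What relation is important? Take the heaviest edge.
strongest = max(G.edges(data="weight"), key=lambda edge: edge[2])

print(most_mentioned, attention[most_mentioned])  # bob 4
print(strongest)                                  # ('ann', 'bob', 2)
```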
10. Who is involved with complex networks?
• Physicists
• Computer scientists
• Social Scientists
• Biologists
• Mathematicians
• Industrial engineers
• Economists
22. When should we use network science?
• The structure is important!
• The collective behavior matters!
• The sequence reveals information.
• Relation-based communities are desired.
• The flow of information is important.
• An epidemic should be monitored.
• The graph model is suitable for the problem.
24. Scientific Tools of Complex Networks
• Strong nerve and good temper!
• Python, R
• NetworkX (Python)
  • Easy and straightforward
  • Somewhat slower
• igraph (Python, R)
  • Straightforward
  • Reasonably good network visualization
  • Core library implemented in C (a bit faster than NetworkX)
• graph-tool (Python)
  • Hard to use
  • Very nice network visualization
  • Contains most of the complicated network algorithms
27. It starts with a question?
Sometimes it ends up with more meaningful questions!
28. Start a project with a question/problem
• It starts with a question.
• Do people have the same kinds of feelings?
• Do people show their feelings in the same way all around the world?
• Do we actually feel different when we talk about various topics?
• Is it possible to group people based on their feelings?
• The questions can explain the motivations.
29. Data Data Data Data Data Data ……
• We should find a good data set to address the questions.
• Do you want more data?
• Yes! Give me more data!
30. Data in our project
• How to identify the feelings?
• Emojis! 😁
• Source of emojis?
• Twitter!
• Is it free?
• Kinda! Enough for research.
• Collect tweets on different topics!
31. Create the network!
To create a network, two essential things should be clearly defined:
• Node (Entity)
• Edge (Relation)
• In our project, each emoji is a node and a co-occurrence of two emojis is an edge.
[Figure: Node1 and Node2 joined by an edge]
32. Network of Emojis
[Figure: an example tweet (“What a beautiful day out! We are going to the beach. Can’t believe these waves! #beachday”) and the emoji network built from it, with a weight on every edge]
• N-gram of the emojis in a tweet
• Combine the N-grams to create the directed, weighted co-occurrence network of emojis
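One possible reading of this construction, sketched with networkx: take the emojis of each tweet in order, form bigrams (N = 2), and add one unit of weight to the directed edge for every bigram. The choice N = 2, the function name emoji_bigram_network, and the sample tweets are assumptions made for illustration; the project's actual procedure may differ.

```python
import networkx as nx


def emoji_bigram_network(emoji_sequences):
    """Build a directed, weighted network from consecutive emoji pairs.

    emoji_sequences holds one list per tweet, with the emojis in the
    order in which they appear in the text.
    """
    G = nx.DiGraph()
    for sequence in emoji_sequences:
        # zip(sequence, sequence[1:]) yields the bigrams of one tweet.
        for first, second in zip(sequence, sequence[1:]):
            if G.has_edge(first, second):
                G[first][second]["weight"] += 1
            else:
                G.add_edge(first, second, weight=1)
    return G


# Hypothetical emoji sequences extracted from three tweets.
tweets = [["😁", "🌊", "😎"], ["😁", "🌊"], ["🌊", "😁"]]
net = emoji_bigram_network(tweets)

for source, target, weight in net.edges(data="weight"):
    print(source, "->", target, weight)
```

Because the network is directed, 😁 -> 🌊 and 🌊 -> 😁 are stored as two separate edges with their own weights.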
33. Python code - Networkx
• Install the package
  • pip install networkx
• Import networkx and create a graph instance
  • import networkx as nx
  • G = nx.Graph()
• Add one node
  • G.add_node(1)
• Add a list of nodes
  • G.add_nodes_from([2, 3])
• Add one edge
  • G.add_edge(1, 2)
• Add a list of edges
  • G.add_edges_from([(1, 2), (1, 3)])
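Put together as one script, the calls from this slide look like the sketch below. The two print lines are added here only to inspect the result.

```python
import networkx as nx

G = nx.Graph()
G.add_node(1)
G.add_nodes_from([2, 3])
G.add_edge(1, 2)
# (1, 2) already exists in this simple graph, so only (1, 3) is new.
G.add_edges_from([(1, 2), (1, 3)])

print(list(G.nodes()))  # [1, 2, 3]
print(list(G.edges()))  # [(1, 2), (1, 3)]
```

Note that nx.Graph() is a simple undirected graph: adding the same edge twice does not create a parallel edge.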
36. Call the function for different groups
• The function:
def network_builder(group_1, key_based, key_text, out_graphml, emoji_list, tag):
    ...  # body not shown on this slide
• Call it:
net = network_builder(group, 'emoji_List', 'emoji_position_in_text', file_graphml, emoji_list, tag=group[0])
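The body of network_builder is not shown here, so the following is not its implementation. It is only a sketch of what an out_graphml argument could be used for: writing a network to a GraphML file with networkx and reading it back. The node names, weights, and file name are hypothetical.

```python
import os
import tempfile

import networkx as nx

# A tiny directed, weighted network; emoji names stand in for emojis.
G = nx.DiGraph()
G.add_edge("grinning_face", "water_wave", weight=2)
G.add_edge("water_wave", "sunglasses_face", weight=1)

# Hypothetical output file in a temporary directory.
out_graphml = os.path.join(tempfile.mkdtemp(), "emoji_network.graphml")
nx.write_graphml(G, out_graphml)

# Reading the file back restores the direction and the edge weights.
loaded = nx.read_graphml(out_graphml)
print(loaded.is_directed())
print(loaded["grinning_face"]["water_wave"]["weight"])
```

GraphML files written this way can also be opened in visualization tools that support the format.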
39. Network of Emoji
Stochastic block model on the Organ data set.
The emojis are chosen randomly from the corresponding network.
https://link.springer.com/chapter/10.1007/978-3-319-72150-7_67
40. Word-emoji Network
We found semantics in emojis similar to the semantics of languages.
We found that people are normally happy when they use emojis.
https://www.aaai.org/ocs/index.php/FLAIRS/FLAIRS18/paper/viewPaper/17653
41. After some mathematical analysis…
• The structure of feelings is the same!
• The way we show feelings is different.
• Similar to languages! 😉
43. Network Databases
• There are many options for graph-based databases.
• Most of them are supported by the main cloud services.
• It is important that the graph database actually implements things in a graph model.
  • Some databases just show you the data in a graph model! They do not store it as a graph!
• Nowadays, major providers offer graph DBMSs along with the classic database services.
• A graph database fits when the relations between our data are important.
• It is really useful for real-time recommendation.
45. Comparing Graph Databases
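Neo4j
• Rank (G-DBMS): 1
• DB model: Graph
• Initial release: 2007
• Licence: Open source
• Server OS: Linux, OS X, Solaris, Windows
• Supported programming languages: .Net, Clojure, Elixir, Go, Groovy, Haskell, Java, JavaScript, Perl, PHP, Python, Ruby, Scala
Microsoft Azure Cosmos DB
• Rank (G-DBMS): 2
• DB model: Multi-model
• Initial release: 2017
• Licence: Commercial
• Server OS: Hosted
• Supported programming languages: .Net, C#, Java, JavaScript, JavaScript (Node.js), Python
OrientDB
• Rank (G-DBMS): 3
• DB model: Multi-model
• Initial release: 2010
• Licence: Open source (Apache 2)
• Server OS: All OS with a Java JDK 6+
• Supported programming languages: .Net, C, C#, C++, Clojure, Java, JavaScript, Node.js, PHP, Python, Ruby, Scala
ArangoDB
• Rank (G-DBMS): 4
• DB model: Multi-model
• Initial release: 2012
• Licence: Open source (Apache 2)
• Server OS: Linux, OS X, Raspbian, Solaris, Windows
• Supported programming languages: C#, Clojure, Java, JavaScript (Node.js), PHP, Python, Ruby
JanusGraph
• Rank (G-DBMS): 6
• DB model: Graph
• Initial release: 2017
• Licence: Open source (Apache 2)
• Server OS: Linux, OS X, Unix, Windows
• Supported programming languages: Clojure, Java, Python
Amazon Neptune
• Rank (G-DBMS): (not given)
• DB model: Multi-model
• Initial release: 2017
• Licence: Commercial
• Server OS: Hosted
• Supported programming languages: C#, Go, JavaScript (Node.js), PHP, Python, Ruby, Scala
GraphDB
• Rank (G-DBMS): 8
• DB model: Multi-model
• Initial release: 2000
• Licence: Commercial
• Server OS: All OS with a Java VM
• Supported programming languages: .Net, Java, C#, Clojure, JavaScript (Node.js), PHP, Python, Ruby, Scala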
47. Neo4j - Cypher
• Nodes are surrounded by parentheses
  • () or (p)
• Labels or tags start with ':' and group nodes by roles or types
  • (p:Person:Mammal)
• Nodes can have properties
  • (p:Person {name: 'John'})
48. Neo4j - Cypher (continued)
• Relationships are written with hyphens, optionally with square brackets.
  • --> or -[h:HIRED]->
• The direction of a relationship is specified with "<" and ">"
  • (p1)-[:HIRED]->(p2)
• Relationships can have properties.
  • -[:HIRED {type: 'contract'}]->
49. Neo4j, Python, Cypher
• Jennifer likes graphs.
//data stored with this direction
CREATE (p:Person)-[:LIKES]->(t:Technology)
//query relationship backwards will not return results
MATCH (p:Person)<-[:LIKES]-(t:Technology)
//better to query with undirected relationship unless sure of direction
MATCH (p:Person)-[:LIKES]-(t:Technology)
50. Cypher and Python
from neo4jrestclient import client
q = 'MATCH (u:User)-[r:likes]->(m:Beer) WHERE u.name="Marco" RETURN u, type(r), m'
# "db" as defined before
results = db.query(q, returns=(client.Node, str, client.Node))
for r in results:
    print("(%s)-[%s]->(%s)" % (r[0]["name"], r[1], r[2]["name"]))
# The output:
# (Marco)-[likes]->(Punk IPA)
# (Marco)-[likes]->(Hoegaarden Rosee)
• Source: https://marcobonzanini.com/2015/04/06/getting-started-with-neo4j-and-python/
51. Neo4j - Python
• First, install the Neo4j server.
• Neo4j Python driver
  • It is officially supported by Neo4j and connects to the database using the binary protocol. It aims to be minimal while being idiomatic to Python.
  • pip install neo4j
• Py2neo
  • It is a client library and comprehensive toolkit for working with Neo4j from within Python applications and from the command line. It has been carefully designed to be easy and intuitive to use.
  • pip install py2neo
• neomodel
  • An Object Graph Mapper built on top of the Neo4j Python driver. Familiar Django-style node definitions with a powerful query API, thread safety, and full transaction support.
  • pip install neomodel
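A minimal sketch of the official driver in use, assuming a Neo4j server is running locally. The URI, the credentials, the name property, and the helper find_likes are all hypothetical; the Person, Technology, and LIKES names follow the Cypher example with Jennifer.

```python
def find_likes(session, person_name):
    """Return (person, technology) name pairs for one person's LIKES edges."""
    query = (
        "MATCH (p:Person {name: $name})-[:LIKES]->(t:Technology) "
        "RETURN p.name AS person, t.name AS technology"
    )
    # Query parameters are passed as keyword arguments, not pasted into the text.
    result = session.run(query, name=person_name)
    return [(record["person"], record["technology"]) for record in result]


def main():
    # Imported here so that find_likes can be read without the driver installed.
    from neo4j import GraphDatabase

    # Hypothetical connection details; replace them with your own settings.
    uri = "bolt://localhost:7687"
    auth = ("neo4j", "password")

    driver = GraphDatabase.driver(uri, auth=auth)
    try:
        with driver.session() as session:
            # Store "Jennifer likes graphs" if it is not there yet.
            session.run(
                "MERGE (p:Person {name: $name}) "
                "MERGE (t:Technology {name: $tech}) "
                "MERGE (p)-[:LIKES]->(t)",
                name="Jennifer",
                tech="Graphs",
            )
            print(find_likes(session, "Jennifer"))
    finally:
        driver.close()
```

Calling main() needs a reachable server; find_likes itself only needs an object with a run method, which makes the query logic easy to try out separately.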
52. Important Complex Network Conferences
• CompleNet
  • International Conference on Complex Networks
  • Exeter, UK (2020)
• Complex Networks
  • International Conference on Complex Networks and their Applications
  • December 10-12, 2019, Lisbon, Portugal
• NetSci
  • International School and Conference on Network Science
  • May 27-31, 2019, Burlington, United States