1. Betweenness
an attribute of each edge:
the number of pairs of nodes whose shortest path contain current edge.
To find communities, we could do this:
while exist betweenness > threshold
a) cut the edge with largest betweenness;
b) recalculate betweenness
Community partition
2. GN algorithm
GN algorithm is used to calculate
the betweenness of a graph.
step1
step2
step3
Community partition
3. Maximal clique, Complete bi-partite subgraph, and Frequent itemset
Find a core of a graph, expand it by including more nodes whose number of
edges to the core are large than a threshold.
Here is a conclusion about complete bi-partite subgraph:
assume the average degree of a graph with n nodes is d, then there must have at
least one complete bi-partite subgraph Ks,t if below inequality is true.
n(d/n)t
≥ s
Community partition
4. Normailized cuts
Community partition
Vol(S) is the number of edges with at least one end in S.
Cut(S,T) is the number of edges that connect a node in S to a node in T.
The small the normalized cut, the better, for it means we cut less edge and keep
more edges in each subgraph.
5. Eigenvalues and Eigenvectors of the Laplacian Matrix
Laplacian Matrix = Degree Matrix - Adjacency Matrix
Community partition
Article in below link answered why the 2nd eigenvector of Laplacian Matrix give us a
suggestion of how to partition.
http://www.cnblogs.com/vivounicorn/archive/2012/02/10/2343377.html
in this article, there are 3 partition methods based on Laplacian matrix:
Minimum Cut, Ratio Cut and Normalized Cut
6. Simrank approach
simrank is used to measure the similarity between nodes.
Node Similarity
Random walks with restart.
v' = βMv + (1-β)eN
v : the column vector that reflects the probability the walker is at each of the nodes at
previous round, we could init v as eN.
v' : the result of current round.
M : the transition matrix.
N : the node we are dealing with.
β: the probability that walker continue going to other nodes but node N. we say 'restart'
based on the probability of (1-β) that walker would go back to N on each round.
eN : a column vector that has 1 in the row for node N and 0’s elsewhere, thus (1-β)eN is
the contribution to current result when walker 'restart'.
transition matrix iterator result
7. A scenario when dealing
with Page Rank
Add a probability factor
to walk out circle
No probability factor,
trapped in the circle
8. m
Counting Triangles
m
Intuitively, in a social network, a friend of my friend stands a good chance to be
my friend too, so counting the number of triangles helps us to measure the extent
to which a graph looks like a social network.
In a graph that has n nodes with m≥n edges, there must be no more than 2 nodes
whose degree are larger than , we call these nodes the heavy hitter.
We reduce the time complexity of counting triangles by dividing nodes into heavy
hitter group and non-heavy hitter group.
9. Neighborhood Properties of Graphs
Neighborhood
The neighborhood of radius d for a node v is the set of nodes u for which
there is a path of length at most d from v to u, denoted by N(v,d).
The diameter of a directed graph is the smallest integer d such that for every
two nodes u and v there is a path of length d or less from u to v. For a undirected
graph, we treat each edge as double-direction edge.
Transitive Closure and Reachability
The transitive closure of a graph is the set of pairs of nodes (u, v) such that
there is a path from u to v of length zero or more.
We say node u reaches node v if (u,v) is an item of transitive closure of the
graph.
10. Neighborhood Properties of Graphs
Neighborhood
The neighborhood of radius d for a node v is the set of nodes u for which
there is a path of length at most d from v to u, denoted by N(v,d).
The diameter of a directed graph is the smallest integer d such that for every
two nodes u and v there is a path of length d or less from u to v. For a undirected
graph, we treat each edge as double-direction edge.
Transitive Closure and Reachability
The transitive closure of a graph is the set of pairs of nodes (u, v) such that
there is a path from u to v of length zero or more.
We say node u reaches node v if (u,v) is an item of transitive closure of the
graph.