Intermediate Data Science NX
Erdős-Rényi Model: The Erdős-Rényi model is a random graph model in which edges are added successively between randomly chosen pairs of vertices.
Watts-Strogatz Model: This model generates random graphs with small-world properties. It was introduced by Duncan Watts and Steven Strogatz.
Barabási-Albert Model: This model is based on growth and preferential attachment and produces a scale-free degree distribution. It captures the tendency of a new node to attach to nodes that are already highly connected.
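The two mechanisms described above (adding random edges, and growth with preferential attachment) can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function names, the edge-set representation and the `seed` parameter are assumptions made for this example, and the Watts-Strogatz model is not covered.

```python
import random

def erdos_renyi(n, num_edges, seed=None):
    """Add num_edges distinct edges, one at a time, between random vertex pairs."""
    rng = random.Random(seed)
    edges = set()
    while len(edges) < num_edges:          # requires num_edges <= n*(n-1)/2
        u, v = rng.sample(range(n), 2)     # two distinct vertices
        edges.add((min(u, v), max(u, v)))  # store each undirected edge once
    return edges

def barabasi_albert(n, m, seed=None):
    """Grow a graph to n nodes; each new node links to m existing nodes,
    picked with probability proportional to their current degree."""
    rng = random.Random(seed)
    edges = set()
    targets = list(range(m))   # the first new node links to the m seed nodes
    pool = []                  # every node appears here once per incident edge
    for new in range(m, n):
        edges.update((t, new) for t in targets)
        pool.extend(targets)
        pool.extend([new] * m)
        chosen = set()
        while len(chosen) < m:             # degree-proportional, no repeats
            chosen.add(rng.choice(pool))
        targets = list(chosen)
    return edges

er_edges = erdos_renyi(100, 200, seed=1)
ba_edges = barabasi_albert(100, 2, seed=1)
```

Drawing from `pool` is what makes the attachment preferential: a node with degree k appears k times in the list, so it is k times as likely to be chosen as a node of degree 1.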
The network and the backbone
(Left) The network obtained for 46 years of events. (Right) The backbone in the network with 190 sources and 280 targets.
Network Resilience
Network resilience. (a) The structure of the (directed) network under targeted attack: terrorist source nodes are removed in order of their out-degree, starting from the highest. The plots show the behavior of the giant component GC (the fraction of nodes in the largest connected component) and the average number of nodes in the isolated clusters other than the giant component, as the fraction of removed nodes increases. (b) The structure of the (directed) network under random failure: nodes with nonzero out-degree are removed at random. The network and the giant component are destroyed faster by targeted node removal (attack) than by random node removal (failure): GC drops to zero after about 35% of the sources are removed by the former method, and after about 97% of the sources are removed by the latter. The results are shown for the network aggregated over 1970−2016.
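The attack-versus-failure comparison can be sketched on a toy graph. The code below is an illustration under simplifying assumptions: it uses a small undirected hub-and-spoke graph rather than the directed network in the figure, and the function names are hypothetical. It tracks only the giant component fraction after each removal.

```python
import random
from collections import deque

def giant_fraction(adj, removed):
    """Size of the largest connected component among surviving nodes,
    as a fraction of the original number of nodes."""
    seen, best = set(removed), 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        best = max(best, size)
    return best / len(adj)

def removal_curve(adj, order):
    """Giant component fraction after each successive node removal."""
    removed, curve = set(), []
    for node in order:
        removed.add(node)
        curve.append(giant_fraction(adj, removed))
    return curve

# Toy graph (not the data set from the figure): one hub, nine leaves.
adj = {0: set(range(1, 10))}
adj.update({leaf: {0} for leaf in range(1, 10)})

by_degree = sorted(adj, key=lambda u: len(adj[u]), reverse=True)
shuffled = list(adj)
random.Random(0).shuffle(shuffled)

attack = removal_curve(adj, by_degree)   # targeted: highest degree first
failure = removal_curve(adj, shuffled)   # random order
```

In this toy case removing the hub first leaves only isolated nodes, while removing a leaf first leaves nine of the ten nodes connected, which is the qualitative behavior the caption describes.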
Black Death spread. Courtesy: University of Florida
Network of Networks
Jianxi Gao, Daqing Li, Shlomo Havlin, From a single network to a network of networks, National Science Review (September 2014) 1 (3): 346-356
Edge weights can have positive or negative values
▪ One gene activates/inhibits another
▪ One person trusting/distrusting another
▪ Research challenge:
• How does one ‘propagate’ negative feelings in a social network?
• Is my enemy’s enemy my friend?
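One common way to make the last question concrete is structural balance, which is an assumption brought in here and not something the slide states: a triangle of signed ties counts as balanced when the product of its three signs is positive. The sketch below uses hypothetical names and encodes trust as +1 and distrust as -1.

```python
FRIEND, ENEMY = +1, -1

def implied_sign(sign_ab, sign_bc):
    """Sign of the a-c tie that would make the triangle a-b-c balanced."""
    return sign_ab * sign_bc

def is_balanced(sign_ab, sign_bc, sign_ac):
    """A signed triangle is balanced when the product of its signs is positive."""
    return sign_ab * sign_bc * sign_ac > 0

enemy_of_enemy = implied_sign(ENEMY, ENEMY)
```

Under this rule the enemy of an enemy comes out as a friend, but whether real social networks follow the rule is exactly the open question raised above.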
Transcription regulatory network in baker’s yeast
Basic concepts about networks
❑ Adjacency matrix
❑ Weights matrix
❑ Shortest path length
❑ Average path length
❑ Diameter of a network
BASIC CONCEPTS ABOUT NETWORKS
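• Adjacency and Weights Matrix:
• All of the networks above can be described using a matrix formalism.
• Given a set of N nodes with M connections between them, the adjacency matrix A is an N × N matrix with $A_{ij} = 1$ if there is a link from node i to node j and $A_{ij} = 0$ otherwise. The weights matrix W stores the weight of each link in place of the 1.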
• Example adjacency matrix:

$$A = \begin{pmatrix} 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 1 & 0 \\ 0 & 1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 1 & 1 & 0 & 0 & 0 \end{pmatrix}$$

• Outdegree = $\sum_{j=1}^{N} A_{ij}$
• Example: the outdegree for node 3 is 2, which we obtain by summing the number of non-zero entries in the 3rd row: $\sum_{j=1}^{N} A_{3j} = 2$
• Indegree = $\sum_{i=1}^{N} A_{ij}$
• Example: the indegree for node 3 is 1, which we obtain by summing the number of non-zero entries in the 3rd column: $\sum_{i=1}^{N} A_{i3} = 1$
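The row-sum and column-sum rules can be checked directly on the 5 × 5 example matrix as reconstructed from the slide. This is a small sketch with hypothetical function names; nodes are numbered from 1 as in the text, while Python lists are indexed from 0.

```python
# Example adjacency matrix: A[i][j] == 1 means a link from node i+1 to node j+1.
A = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 0, 0, 1],
    [1, 1, 0, 0, 0],
]

def out_degree(A, node):
    """Number of non-zero entries in the node's row (node is 1-based)."""
    return sum(1 for a in A[node - 1] if a != 0)

def in_degree(A, node):
    """Number of non-zero entries in the node's column (node is 1-based)."""
    return sum(1 for row in A if row[node - 1] != 0)

total_links = sum(out_degree(A, k) for k in range(1, len(A) + 1))
```

Summing the out-degrees over all nodes gives the same total as summing the in-degrees, since each link is counted once in its row and once in its column.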
• Shortest path, average path length and diameter:
• Shortest path (dij):
• The shortest path dij between nodes i and j is the minimal distance (or total weight) over all paths that connect i and j.
• Average path length (l):
• The average path length l is the average of the shortest path lengths over all pairs of nodes in the network: $l = \frac{1}{N(N-1)} \sum_{i \neq j} d_{ij}$
• Diameter (D):
• The maximum over all shortest path lengths: $D = \max_{i,j} d_{ij}$
• What happens if the network is broken
into several components?
• Component:
• The set of nodes reachable from a given node.
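For unweighted graphs these quantities can be computed with breadth-first search. The sketch below is illustrative, with hypothetical function names and an adjacency-list dictionary as the assumed input format. It also shows one common way to handle a broken network: split it into components and compute the statistics on one component at a time, since distances between different components are undefined.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest path length (in hops) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def components(adj):
    """List of node sets, one per connected component."""
    seen, parts = set(), []
    for node in adj:
        if node not in seen:
            part = set(bfs_distances(adj, node))
            seen |= part
            parts.append(part)
    return parts

def path_stats(adj):
    """(average path length, diameter) of a connected, undirected graph."""
    n = len(adj)
    total, diameter = 0, 0
    for source in adj:
        dist = bfs_distances(adj, source)
        total += sum(dist.values())
        diameter = max(diameter, max(dist.values()))
    return total / (n * (n - 1)), diameter

path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}   # chain 0-1-2-3
broken = {**path, 4: [5], 5: [4]}               # same chain plus a separate pair

parts = components(broken)
largest = max(parts, key=len)
l, D = path_stats({u: broken[u] for u in largest})
```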
❏ Average shortest path length: The average number of steps along the shortest paths over all possible pairs of network nodes.
❏ Eccentricity: For a node n in a graph G, the eccentricity of n is the largest shortest-path distance between n and any other node.
❏ Diameter: The maximum shortest-path distance between any pair of nodes in a graph G. It is the largest eccentricity value of any node.
❏ Radius: The minimum eccentricity value of any node.
❏ Periphery: The set of nodes whose eccentricity is equal to the diameter of the graph.
❏ Center: The set of nodes whose eccentricity is equal to the radius of the graph.
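These definitions all follow from the eccentricity of each node, as the sketch below illustrates on a five-node chain. The function name is hypothetical and the graph is assumed to be connected and unweighted.

```python
from collections import deque

def eccentricities(adj):
    """Eccentricity of every node of a connected, unweighted graph, via BFS."""
    ecc = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        ecc[source] = max(dist.values())   # farthest node from source
    return ecc

chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}   # 0-1-2-3-4
ecc = eccentricities(chain)

diameter = max(ecc.values())
radius = min(ecc.values())
periphery = {v for v, e in ecc.items() if e == diameter}
center = {v for v, e in ecc.items() if e == radius}
```

NetworkX offers the same quantities through `eccentricity`, `diameter`, `radius`, `periphery` and `center`.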
• Degree, strength and betweenness:
• Degree (ki):
• The degree ki of a node i is the number of connections of the node.
• Strength (si):
• The strength si of a node i is the sum of the weights of the connections to that node.
• Betweenness (bi):
• The betweenness of a node i (or a link) accounts for the number of shortest paths passing through that node (or link).
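Degree and strength can be read off a weights matrix, as in the sketch below. The matrix values and function names are made up for illustration, the graph is assumed to be undirected (symmetric matrix), and nodes are indexed from 0.

```python
# Symmetric weights matrix of a small undirected graph (illustrative values).
W = [
    [0.0, 2.0, 0.0],
    [2.0, 0.0, 0.5],
    [0.0, 0.5, 0.0],
]

def degree(W, i):
    """Number of connections of node i: non-zero entries in row i."""
    return sum(1 for w in W[i] if w != 0)

def strength(W, i):
    """Sum of the weights of the connections of node i."""
    return sum(W[i])

k1, s1 = degree(W, 1), strength(W, 1)
```

Node 1 has two connections, so its degree is 2, while its strength adds the weights 2.0 and 0.5.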
• The clustering coefficient C accounts for the number of triangles in the network. Specifically, Ci is the ratio between the number of links Ei connecting the nearest neighbors of i and the total number of possible links between these neighbors: $C_i = \frac{2E_i}{k_i(k_i - 1)}$, where ki is the degree of node i.
• The clustering coefficient of the network C is the average of Ci over all nodes.
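The ratio can be computed by counting links among each node's neighbors, as in this sketch. The function names are hypothetical, and the convention that nodes with fewer than two neighbors get Ci = 0 is an assumption (the slide does not say how to treat them).

```python
from itertools import combinations

def local_clustering(adj, i):
    """Links among the neighbors of i, divided by the number of possible links."""
    k = len(adj[i])
    if k < 2:
        return 0.0            # assumed convention: no neighbor pair to connect
    links = sum(1 for u, v in combinations(adj[i], 2) if v in adj[u])
    return 2 * links / (k * (k - 1))

def average_clustering(adj):
    """Network clustering coefficient: mean of the local coefficients."""
    return sum(local_clustering(adj, i) for i in adj) / len(adj)

# Triangle 0-1-2 with one extra node 3 attached to node 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
C = average_clustering(adj)
```

Here nodes 0 and 1 have Ci = 1, node 2 has Ci = 1/3 (one link among three neighbors), and node 3 has Ci = 0, so the network average is 7/12.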
Community Structure (I):
Figure from: M. E. J. Newman, Proc. Natl. Acad. Sci. USA 103, 8577
Network centrality is a common measure in network analysis. However, it is often poorly understood.
Among the many definitions of centrality, four are most commonly considered, each with its own meaning.
WHAT IS CENTRALITY?
Centrality offers an answer to the question:
Which node is important?
Here, the word “important” has a multitude of meanings, leading to many
different definitions of centrality.
CENTRALITY?
❏ CLOSENESS CENTRALITY
❏ Closeness centrality measures how close a node is to all other nodes in the entire network, based on the sum of its shortest path lengths to them. The most central node is the one that can reach all others in the network most quickly, so this measure helps to find good broadcasters. This is important in diffusion processes.
❏ Node position in total network.
❏ Sensitive to changes in this network.
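A minimal sketch of closeness centrality follows, assuming a connected, unweighted graph and the common normalization (n − 1) divided by the sum of shortest path lengths. The slide does not give a formula, so this normalization is an assumption, as is the function name.

```python
from collections import deque

def closeness_centrality(adj):
    """(n - 1) / (sum of shortest path lengths to all other nodes)."""
    n = len(adj)
    result = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total = sum(dist.values())
        result[source] = (n - 1) / total if total > 0 else 0.0
    return result

chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}   # 0-1-2-3-4
closeness = closeness_centrality(chain)
```

On the chain, the middle node reaches everyone in at most two steps and scores highest, while the two end nodes score lowest.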
❏ BETWEENNESS CENTRALITY
❏ Betweenness centrality seeks to capture the role of nodes as a bridge between other groups of nodes by calculating all the shortest paths and then counting how many times each node falls on one. The betweenness centrality indicates the degree of potential control: a node standing between many others can exert more influence on the flow in a network.
❏ Flow perspective.
❏ A bridge can also be on the periphery of the network and this centrality is also sensitive to changes in the network.
❏ EIGENVECTOR CENTRALITY
❏ Eigenvector centrality states that a node is important if its neighbours are important. An extension of the degree centrality. Not what you know, but who you know is important.
❏ Node position in total network.
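Both measures can be sketched in plain Python for an unweighted, undirected, connected graph. The betweenness function follows Brandes' accumulation scheme and returns unnormalized counts; the eigenvector function uses power iteration on A + I (a shift chosen here so that the iteration also converges on bipartite graphs). Function names, the iteration limit and the tolerance are assumptions for this illustration.

```python
from collections import deque

def betweenness_centrality(adj):
    """Unnormalized count of shortest paths through each node (Brandes' scheme)."""
    bc = dict.fromkeys(adj, 0.0)
    for s in adj:
        stack, preds = [], {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0)      # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v precedes w on a shortest path
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        while stack:                        # accumulate from the farthest nodes back
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return {v: b / 2 for v, b in bc.items()}   # each undirected pair was seen twice

def eigenvector_centrality(adj, iterations=200, tol=1e-12):
    """Power iteration on A + I; a node scores high if its neighbours do."""
    x = dict.fromkeys(adj, 1.0)
    for _ in range(iterations):
        new = {v: x[v] + sum(x[u] for u in adj[v]) for v in adj}
        norm = sum(val * val for val in new.values()) ** 0.5
        new = {v: val / norm for v, val in new.items()}
        if max(abs(new[v] - x[v]) for v in adj) < tol:
            return new
        x = new
    return x

chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}   # 0-1-2-3-4
bc = betweenness_centrality(chain)
ev = eigenvector_centrality(chain)
```

On the chain, the middle node lies on the shortest paths of four node pairs and the end nodes on none, which matches the bridge interpretation above.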
Beyond these, there are other centralities as well, namely eccentricity centrality, PageRank, Katz centrality, … percolation, alpha, etc.
An important structure:
The n-barbell graph is a special type of undirected graph consisting of two non-overlapping n-vertex cliques together with a single edge that has an endpoint in each clique.
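The definition translates directly into a construction. In this sketch the function name and the node numbering are assumptions made for the example: the first clique uses nodes 0 to n−1, the second uses nodes n to 2n−1, and the bridge joins nodes n−1 and n.

```python
def barbell_graph(n):
    """Two n-vertex cliques joined by a single edge, as an adjacency dict."""
    adj = {v: set() for v in range(2 * n)}
    for offset in (0, n):                      # build each clique
        for u in range(offset, offset + n):
            for v in range(offset, offset + n):
                if u != v:
                    adj[u].add(v)
    adj[n - 1].add(n)                          # the single bridging edge
    adj[n].add(n - 1)
    return adj

barbell = barbell_graph(4)
num_edges = sum(len(nbrs) for nbrs in barbell.values()) // 2
```

The graph has n(n − 1) + 1 edges, and the two bridge endpoints are the only nodes of degree n. NetworkX provides a related generator, `barbell_graph(m1, m2)`, where `m2 = 0` gives the single-edge case.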