Intermediate Data Science NX

The document discusses various concepts related to network centrality including degree centrality, closeness centrality, betweenness centrality, and eigenvector centrality. It explains that degree centrality counts the number of neighbors a node has, closeness centrality measures how close a node is to other nodes based on shortest path lengths, betweenness centrality measures the number of shortest paths that pass through a node, and eigenvector centrality considers the centrality of a node's neighbors. The document aims to clarify different definitions of centrality and their meanings for identifying important nodes in networks.

Uploaded by

NANDINI JAIN

Intermediate Data Science

Instructor: Dr. S. Shariq Husain


Assistant Professor, JSGP, JGU,
Sonipat, Haryana, India
Email: [email protected]
Tentative Sketch of Overall Course
❏ Data Science and its Implications
❏ Complex Systems
❏ Dynamical Systems
❏ Complex Networks
❏ Non-linear Dynamics/ Differential Equations
❏ Computational Social Science
❏ Digital Epidemiology
❏ Computational Linguistics
❏ Computational Anthropology
❏ Machine Learning Approaches
❏ Deep Learning: Introduction
❏ Reservoir Computing: Introduction
Network models

Erdős-Rényi Model: A random graph model in which edges are added one after another
between randomly chosen pairs of vertices.

Watts-Strogatz Model: Random graphs with small-world properties. The model
was introduced by Duncan Watts and Steven Strogatz.

Barabási-Albert Model: Based on growth and preferential attachment, it produces
a scale-free degree distribution. It captures the tendency of a new node to attach to
an already highly connected node.
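
The first and third models can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function names `random_graph_gnm` and `preferential_attachment`, the starting clique of m + 1 nodes, and the parameter values are choices made here, not part of the slides.

```python
import itertools
import random

def random_graph_gnm(n, m, seed=None):
    """Erdős-Rényi style sketch: pick m distinct edges uniformly at random among n nodes."""
    rng = random.Random(seed)
    possible_edges = list(itertools.combinations(range(n), 2))
    if m > len(possible_edges):
        raise ValueError("m exceeds the number of possible edges")
    return rng.sample(possible_edges, m)

def preferential_attachment(n, m, seed=None):
    """Barabási-Albert style sketch: each new node links to m existing nodes,
    chosen with probability proportional to their current degree."""
    rng = random.Random(seed)
    # Assumption: growth starts from a small clique on m + 1 nodes.
    edges = list(itertools.combinations(range(m + 1), 2))
    # Every node appears in `stubs` once per incident edge, so a uniform
    # choice from `stubs` is a degree-proportional choice of node.
    stubs = [v for edge in edges for v in edge]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for t in targets:
            edges.append((t, new))
            stubs.extend((t, new))
    return edges

edges_random = random_graph_gnm(n=10, m=15, seed=42)
edges_pa = preferential_attachment(n=10, m=2, seed=42)
print(len(edges_random), len(edges_pa))
```

In the second function, nodes that already have many links appear more often in `stubs`, which is what makes well-connected nodes attract further links.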
The network and the backbone

(Left) The network obtained for 46 years of events. (Right) The backbone in the network with 190 sources and 280 targets.
Network Resilience

Network resilience. (a) The structure of the network (directed) under targeted attack: terrorist source nodes are removed in order
of their out-degree, starting from the highest out-degree. The plots show the behavior of the giant component GC (fraction of nodes in the
largest connected component) and the average number of nodes in the isolated clusters outside the giant component, with
increasing fraction of removed nodes. (b) The structure of the network (directed) under random failure: nodes with non-zero out-degree are
removed at random. The network and the giant component are destroyed faster by targeted node removal (attack) than by
random node removal (failure): GC becomes zero after about 35% of the sources are removed by the former method, and after about 97%
of the sources are removed by the latter. The results are shown for the network aggregated over 1970-2016.
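
The attack-versus-failure comparison can be tried on a toy graph. The sketch below is a simplification, not the analysis behind the figure: it assumes an undirected graph, uses plain degree in place of out-degree, and the helper name `giant_component_fraction` and the six-node example are hypothetical.

```python
from collections import deque

def giant_component_fraction(adj, removed):
    """Fraction of the original nodes in the largest connected component
    once the nodes in `removed` have been deleted."""
    alive = set(adj) - set(removed)
    seen, best = set(), 0
    for start in alive:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if v in alive and v not in seen:
                    seen.add(v)
                    queue.append(v)
        best = max(best, size)
    return best / len(adj)

# Toy graph: node 0 is a hub, nodes 1 and 2 are also linked to each other.
adj = {0: [1, 2, 3, 4, 5], 1: [0, 2], 2: [0, 1], 3: [0], 4: [0], 5: [0]}

# Targeted attack: remove the highest-degree node first.
hub = max(adj, key=lambda u: len(adj[u]))
after_attack = giant_component_fraction(adj, [hub])

# A single failure of a low-degree node, for comparison.
after_failure = giant_component_fraction(adj, [5])
print(after_attack, after_failure)
```

Removing the hub leaves only the pair {1, 2} connected, while removing a leaf leaves the remaining five nodes in one component.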
Black Death spread (Courtesy: University of Florida)
Network of Networks

Jianxi Gao, Daqing Li, Shlomo Havlin, From a single network to a network of networks, National Science Review (September 2014) 1 (3): 346-356
Edge weights can have positive or negative values
▪ One gene activates/inhibits another
▪ One person trusting/distrusting another
▪ Research challenge:
• How does one ‘propagate’ negative feelings in a social network?
• Is my enemy’s enemy my friend?

Transcription regulatory network in baker’s yeast
Basic concepts about networks
❑ Adjacency matrix
❑ Weights matrix
❑ Shortest path length
❑ Average path length
❑ Diameter of a network
BASIC CONCEPTS ABOUT NETWORKS
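• Adjacency and Weights Matrix:
• All the former networks can be described using a matrix formalism.
• Given a set of N nodes with M connections between them, the adjacency matrix A is the N × N matrix with $A_{ij} = 1$ if there is a link from node i to node j and $A_{ij} = 0$ otherwise; the weights matrix W stores the weight $w_{ij}$ of that link instead.
• Matrices will be symmetric if networks are undirected.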
Node degree:

Example adjacency matrix:
$$A = \begin{pmatrix} 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 1 & 0 \\ 0 & 1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 1 & 1 & 0 & 0 & 0 \end{pmatrix}$$

• Outdegree = $\sum_{j=1}^{n} A_{ij}$
• Example: the outdegree for node 3 is 2, which we obtain by summing the number of non-zero entries in the 3rd row: $\sum_{j=1}^{n} A_{3j}$
• Indegree = $\sum_{i=1}^{n} A_{ij}$
• Example: the indegree for node 3 is 1, which we obtain by summing the number of non-zero entries in the 3rd column: $\sum_{i=1}^{n} A_{i3}$
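
The row and column sums are easy to check in Python. The rows below are read off the slide's 5 × 5 example matrix; the variable names are chosen here for illustration, and node 3 corresponds to index 2 because Python counts from zero.

```python
# Adjacency matrix of the example: A[i][j] = 1 means a link from node i+1 to node j+1.
A = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 0, 0, 1],
    [1, 1, 0, 0, 0],
]

# Outdegree: sum along each row. Indegree: sum along each column.
out_degree = [sum(row) for row in A]
in_degree = [sum(col) for col in zip(*A)]

print(out_degree[2], in_degree[2])  # node 3: prints "2 1"
```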
• Shortest path, average path length and diameter:
• Shortest path (dij):
• The shortest path dij between nodes i and j is the path of minimal length (or minimal total weight) among all paths that connect i and j.
• Average path length (l):
• The average path length l is the average shortest path over all pairs of nodes in the network: $l = \frac{1}{N(N-1)} \sum_{i \neq j} d_{ij}$
• Diameter (D):
• The longest of all shortest paths: $D = \max_{i,j} d_{ij}$
• What happens if the network is broken into several components?
• Component:
• The set of nodes reachable from a given node.
❏ Average shortest path length: The average number of steps along the shortest paths
over all possible pairs of network nodes.
❏ Eccentricity: For a node n in a graph G, the eccentricity of n is the largest
shortest-path distance between n and any other node.
❏ Diameter: The maximum shortest distance between a pair of nodes in a
graph G is its diameter. It is the largest eccentricity value of any
node.
❏ Radius: The minimum eccentricity value of any node.
❏ Periphery: The set of nodes whose eccentricity is equal to the
diameter of the graph.
❏ Center: The set of nodes whose eccentricity is equal
to the radius of the graph.
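
These definitions can all be computed from one breadth-first search per node. The following is a minimal sketch that assumes an unweighted, undirected, connected graph stored as an adjacency dictionary; the helper name `bfs_distances` and the five-node path graph are illustrative choices.

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop distance from `source` to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

# Path graph 0 - 1 - 2 - 3 - 4
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}

all_dist = {u: bfs_distances(adj, u) for u in adj}
eccentricity = {u: max(d.values()) for u, d in all_dist.items()}
diameter = max(eccentricity.values())
radius = min(eccentricity.values())
periphery = [u for u in adj if eccentricity[u] == diameter]
center = [u for u in adj if eccentricity[u] == radius]

# Average over all ordered pairs (i, j) with i != j.
n = len(adj)
avg_path_length = sum(sum(d.values()) for d in all_dist.values()) / (n * (n - 1))

print(diameter, radius, periphery, center, avg_path_length)
```

On the path graph the two end nodes form the periphery and the middle node is the center. If the graph splits into several components, some distances are undefined and these quantities have to be computed per component.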
• Degree, strength and betweenness:

• Degree (ki):
• The degree ki of a node i is the number of connections of
the node.
• Strength (si):
• The strength si of a node i is the sum of the weights of the
connections to that node.

• Betweenness (bi):
• The betweenness of a node i (or a link) accounts for the
number of shortest paths passing through that node (or
link): $b_i = \sum_{j \neq k \neq i} \frac{n_{jk}(i)}{n_{jk}}$

• where njk is the number of shortest paths connecting j and
k, and njk(i) is the number of those shortest paths between j and k that
pass through i.
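
The ratio of shortest paths through i to all shortest paths between j and k can be computed directly for small graphs. The sketch below is a straightforward illustration, not an efficient algorithm: it assumes an unweighted, undirected graph, counts each unordered pair {j, k} once, and uses helper names (`bfs_counts`, `betweenness`) chosen here.

```python
from collections import deque
from itertools import combinations

def bfs_counts(adj, source):
    """Return (dist, sigma): hop distance from `source` and the number of
    distinct shortest paths from `source` to each reachable node."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                sigma[v] += sigma[u]
    return dist, sigma

def betweenness(adj):
    info = {u: bfs_counts(adj, u) for u in adj}
    b = {u: 0.0 for u in adj}
    for j, k in combinations(adj, 2):
        dist_j, sigma_j = info[j]
        if k not in dist_j:
            continue  # j and k are in different components
        for i in adj:
            if i in (j, k) or i not in dist_j:
                continue
            dist_i, sigma_i = info[i]
            # i lies on a shortest j-k path exactly when the distances add up.
            if k in dist_i and dist_j[i] + dist_i[k] == dist_j[k]:
                b[i] += sigma_j[i] * sigma_i[k] / sigma_j[k]
    return b

# Square 0 - 1 - 2 - 3 - 0: opposite corners are joined by two shortest paths.
square = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
b_square = betweenness(square)
print(b_square)
```

In the square, each node carries half of the shortest paths between its two neighbours, so every node ends up with the same value.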
• Clustering coefficient:

• The clustering coefficient C accounts for the number of triangles in the network. Specifically, Ci is the
ratio between the number of links E connecting the nearest neighbors of i and the total number of
possible links between these neighbors. For an undirected network and a node of degree ki: $C_i = \frac{2E}{k_i (k_i - 1)}$

• The clustering coefficient of the network C is the average of Ci over all nodes.
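
A short sketch of the local and average clustering coefficient for an undirected graph follows. Assigning Ci = 0 to nodes with fewer than two neighbours is a convention assumed here, and the function name and the four-node example are illustrative.

```python
from itertools import combinations

def local_clustering(adj, i):
    """Links among the neighbours of i divided by the number of possible links."""
    neighbours = adj[i]
    k = len(neighbours)
    if k < 2:
        return 0.0  # assumed convention: no pair of neighbours, so no triangles
    links = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
    return 2 * links / (k * (k - 1))

# Triangle 0-1-2 with an extra node 3 attached to node 0.
adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}

C = {i: local_clustering(adj, i) for i in adj}
average_C = sum(C.values()) / len(C)
print(C, average_C)
```

Node 0 has three neighbours but only one link among them, so its coefficient is 1/3, while nodes 1 and 2 each sit in a closed triangle and score 1.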
Community Structure (I):

Given a graph $G_{N,M}$, a community is a subgraph $G'_{N',M'}$
whose nodes are tightly connected (or at least, more
connected than in an equivalent random network).

Figure from: M. E. J. Newman, Proc. Natl. Acad. Sci. USA 103, 8577
Network centrality is a common measure in network analysis. However, it
is often poorly understood.
Among the many definitions of centrality, four are most commonly used, each
with its own meaning.

WHAT IS CENTRALITY?
This offers an answer to the question:
Which node is important?
Here, the word “important” has a multitude of meanings, leading to many
different definitions of centrality.
CENTRALITY?

Why does centrality in a network matter?


❏ Because networks are everywhere, and centrality gives us crucial
information about these networks.
❏ For example, in an undirected social network, centrality is about
identifying the most influential members/actors.
❏ In a small network centrality is sometimes easy to see; in other
networks this is more difficult. The larger the network, the more
challenging it becomes.
❏ DEGREE CENTRALITY
❏ The simplest and most popular measure is degree centrality,
which counts the number of neighbours of a
node. It can be used to find nodes with many direct
contacts and is often considered a measure of activity.
❏ The simplest centrality to calculate.
❏ A purely local measure; it says nothing about where the
node sits in the network.

❏ CLOSENESS CENTRALITY
❏ Closeness centrality measures how close a node is to all other nodes in the
entire network, based on the sum of its shortest path lengths to them (the smaller
the sum, the higher the closeness). The most central node is the one that can reach
all others in the network most quickly, which helps to
find good broadcasters. This is important in diffusion processes.
❏ Node position in total network.
❏ Sensitive to changes in this network.
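
Degree and closeness centrality can be compared on a small example. The sketch below assumes an unweighted, undirected, connected graph; the normalisation (n - 1) divided by the sum of distances is one common convention and is an assumption here, as are the function names.

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop distance from `source` to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def degree_centrality(adj):
    """Number of neighbours of each node."""
    return {u: len(adj[u]) for u in adj}

def closeness_centrality(adj):
    """(n - 1) divided by the sum of shortest path lengths to all other nodes."""
    n = len(adj)
    result = {}
    for u in adj:
        total = sum(bfs_distances(adj, u).values())
        result[u] = (n - 1) / total if total > 0 else 0.0
    return result

# Path graph 0 - 1 - 2 - 3 - 4
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
degree = degree_centrality(adj)
closeness = closeness_centrality(adj)
print(degree)
print(closeness)
```

Nodes 1, 2 and 3 all have degree 2, but only node 2 is in the middle of the path, which closeness picks up and degree does not.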
❏ BETWEENNESS CENTRALITY
❏ Betweenness centrality seeks to capture the role of nodes as a bridge between other
groups of nodes by calculating all the shortest paths and then counting how
many times each node falls on one. The betweenness centrality indicates the
degree of potential control: a node standing between many others can exert
more influence on the flow in a network.
❏ Flow perspective.
❏ A bridge can also be on the periphery of the network and this centrality is also
sensitive to changes in the network.

❏ EIGENVECTOR CENTRALITY
❏ Eigenvector centrality states that a node is important if its
neighbours are important. An extension of the degree centrality.
Not what you know, but who you know is important.
❏ Node position in total network.
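
The idea that a node is important if its neighbours are important can be sketched with power iteration. This is a minimal illustration under assumptions: the graph is undirected and connected, scores are scaled so the largest is 1, and each step multiplies by A + I rather than A (the eigenvectors are the same, and the shift avoids oscillation on bipartite graphs such as a path). The function name and iteration count are choices made here.

```python
def eigenvector_centrality(adj, iterations=100):
    """Power iteration on (A + I); scores are scaled so the maximum is 1."""
    x = {u: 1.0 for u in adj}
    for _ in range(iterations):
        # New score of u: its own score plus the scores of its neighbours.
        new = {u: x[u] + sum(x[v] for v in adj[u]) for u in adj}
        norm = max(new.values())
        x = {u: value / norm for u, value in new.items()}
    return x

# Path graph 0 - 1 - 2 - 3 - 4
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
scores = eigenvector_centrality(adj)
print(scores)
```

Nodes 1 and 2 have the same degree, yet node 2 scores higher because both of its neighbours are themselves well connected, whereas node 1 has an end node as a neighbour.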
Beyond these, there are other centralities as well, namely eccentricity
centrality, PageRank, Katz centrality, percolation centrality, alpha centrality, etc.

An important structure:
An n-barbell graph is a special type of undirected graph consisting
of two non-overlapping n-vertex cliques together with a single edge
that has an endpoint in each clique.
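
The construction can be written out directly. In this sketch the node labelling (0 to n-1 for the first clique, n to 2n-1 for the second) and the choice of nodes n-1 and n as the endpoints of the connecting edge are assumptions made for illustration.

```python
from itertools import combinations

def barbell_graph(n):
    """Edge list of two n-vertex cliques joined by the single edge (n - 1, n)."""
    edges = list(combinations(range(n), 2))           # first clique
    edges += list(combinations(range(n, 2 * n), 2))   # second clique
    edges.append((n - 1, n))                          # the connecting edge
    return edges

edges = barbell_graph(4)

# Degree of every node, counted from the edge list.
degree = {}
for u, v in edges:
    degree[u] = degree.get(u, 0) + 1
    degree[v] = degree.get(v, 0) + 1

print(len(edges), degree)
```

The two endpoints of the connecting edge have only one more link than the other nodes, so degree barely distinguishes them, yet every shortest path between the two cliques passes through them.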
