SNA Unit3
permuted the matrix), we would see a distinctive pattern of "1-blocks" and "0-
blocks." All connections among actors within a faction would be present, all
connections between actors in different factions would be absent.
Network Measurements
A number of measurable network characteristics were developed to gain a
greater insight into networks, with many of them having their roots in social
studies on the relationships among social actors. In this section, we will discuss
three categories of measurements that have been defined in the social network
analysis stream:
1. Network connection, which includes transitivity, multiplexity, homophily,
dyads and mutuality, balance and triads, and reciprocity
2. Network distribution, which includes the distance between nodes, degree
centrality, closeness centrality, betweenness centrality, eigenvector centrality, and
density
3. Network segmentation, which includes cohesive subgroups, cliques, clustering
coefficient, k-cores, core/periphery, block models, and hierarchical clustering
Network Connection
Network connection (or connectivity) refers to the ability to move from one node
to another in a network. It is the ratio between route distance and geodesic
distance. Connectivity can be calculated locally (for a part of the network) and
globally (for the entire network). Let’s take a look at some of the important
metrics of network connection.
Transitivity
SIT1610-Social Network Analysis –Unit 3
Homophily
Homophily is the tendency of individuals to connect with others who share the
same attitudes and beliefs. The tendency of individuals to associate with similar
others based on gender, education, race, or other socioeconomic characteristics is
very common in social communities. Coordination and cooperation are typically
more successful between people who show some similarity to each other such
that individuals in homophilic relationships are likely to hear about new ideas or
ask for help from each other. Homophily in the context of online social
networking can be understood from the similarity of users who are using the
network in terms of age, educational background, region, or profession. In the
sense of corporate networks, homophily is translated as the similarity of
professional or academic qualifications.
The above graph is a type of signed graph. Signed graphs have been studied since the
1950s. They are a special case of valued graphs in which ties are allowed to have
one of two opposing values to convey the positive or negative sentiment.
Examples of signed graphs include friend/foe, trust/distrust or like/dislike,
esteem/disesteem, praise/blame, influence/negative influence, etc. They are very
common in sociology and psychology but less common in fields such as physics
and chemistry.
• In figure a, all three actors have positive feelings, and there is no place for
conflict among them. The configuration is coherent and lacks inner tensions
between members.
• Figure b is also stable since two actors (B and C) share the same negative
feeling towards actor A, but they like each other.
• Figure c is unstable because actors A and B have a negative feeling towards
each other, while both have a positive feeling towards actor C, which has to
divide its loyalty between the other two actors.
• Figure d is also unstable and will eventually break down, as it has an odd
number of negative signs.
In Fig. 3.2a, b, types of balanced subgraphs are shown, whereas in Fig. 3.2c, d,
types of unbalanced graphs are presented. An obvious way to avoid imbalance
in subgraphs is sign shifting, which means changing signs such that
enmities (negative signs) become friendships (positive signs) or vice versa.
Within real networks, stable configurations appear far more often than unstable
configurations. It should be noted here that a negative sign between two nodes
does not mean the lack of tie between these two nodes. While a negative sign
between two nodes is a clear mark of an inimical relationship, the absence of a tie
between these nodes suggests the absence of interaction or communication
between them.
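The stability rule used in the four cases above can be checked mechanically: a triad is balanced when the product of its three tie signs is positive, that is, when it contains an even number of negative ties. The sketch below is only an illustration. The function name, the +1/-1 encoding, and the reading of figure d as the all-negative triad are assumptions, since the figure itself is not reproduced here.

```python
from math import prod


def is_balanced(signs):
    """Return True if a signed triad is balanced.

    `signs` holds the three tie signs, +1 for positive and -1 for negative.
    The triad is balanced when the product of the signs is positive,
    i.e. when the number of negative ties is even.
    """
    return prod(signs) > 0


# Tie order in each tuple: A-B, A-C, B-C.
triads = {
    "a": (+1, +1, +1),  # everyone likes everyone
    "b": (-1, -1, +1),  # B and C both dislike A but like each other
    "c": (-1, +1, +1),  # A and B dislike each other, both like C
    "d": (-1, -1, -1),  # assumed: all three ties negative
}

for name, signs in triads.items():
    print(name, "balanced" if is_balanced(signs) else "unbalanced")
```

Sign shifting corresponds to flipping one entry of an unbalanced tuple, which always changes the parity of the negative ties and therefore restores balance.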
Reciprocity
Reciprocity is a measure of the tendency towards building mutually directed
connections between two actors. It refers to the number of reciprocated ties for a
specific actor in a network. For example, if u connects to v, then v connects to u
and vice versa. In real-life scenarios, it is important to know whether received
help is also given or whether given help is translated as help by the receiver. For a
given node v, reciprocity is the ratio between the number of nodes which have
both incoming and outgoing connections from/to v, to the number of nodes
which only have incoming connections from v. For an entire network, reciprocity
is calculated as the fraction of edges that are reciprocated. Average reciprocity is
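
The network-level definition (the fraction of edges that are reciprocated) can be sketched in a few lines. This is a minimal illustration that assumes the directed network is given as a list of (source, target) pairs; the function name and the sample edges are made up for the example.

```python
def network_reciprocity(edges):
    """Fraction of directed edges (u, v) whose reverse edge (v, u) also exists."""
    edge_set = set(edges)
    if not edge_set:
        return 0.0
    reciprocated = sum(1 for (u, v) in edge_set if (v, u) in edge_set)
    return reciprocated / len(edge_set)


# u and v point at each other; the other two ties are one-way.
edges = [("u", "v"), ("v", "u"), ("u", "w"), ("w", "x")]
print(network_reciprocity(edges))
```

Here two of the four directed edges have a matching reverse edge, so half of the ties are reciprocated.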
Network Distribution
Measurements of network distribution are related to how nodes and edges are
distributed in a network.
Distance Between Two Nodes
Distance is a network metric that allows the calculation of the number of edges
between any pair of nodes in a network. Measuring distances between nodes in
graphs is critical for many implementations like graph clustering and outlier
detection. Sometimes, the distance measure is used to see if the two nodes are
similar or not. Any commonly used shortest path calculation algorithm (e.g.,
Dijkstra) can be used to provide all shortest paths in a network with their
lengths. We can use the distance measure to calculate node eccentricity, which is
the maximum distance from a given node to any other node in a network. It is
also possible to calculate network diameter, which is the highest eccentricity of
its nodes and thus represents the maximum distance between nodes. In most
social networks, the shortest path is computed based on the cost of transition
from one node to another such that the longer the path value, the greater the
cost. Within a community, there might be many edges between nodes, but
between communities, there are fewer edges.
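As a small illustration of distance, eccentricity, and diameter, the sketch below uses breadth-first search instead of Dijkstra. That substitution is valid only under the assumption made here that every edge has the same cost, so a path's length is its number of edges. The graph is also assumed to be connected, and the adjacency-list layout and function names are choices made for this example.

```python
from collections import deque


def bfs_distances(graph, source):
    """Shortest-path lengths (edge counts) from `source` in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def eccentricity(graph, node):
    """Maximum distance from `node` to any other node (connected graph assumed)."""
    return max(bfs_distances(graph, node).values())


def diameter(graph):
    """Highest eccentricity over all nodes."""
    return max(eccentricity(graph, node) for node in graph)


# A simple path A - B - C - D.
graph = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}

print(bfs_distances(graph, "A"))
print(eccentricity(graph, "B"))
print(diameter(graph))
```

For weighted networks, where the cost of moving between nodes differs per edge, the breadth-first search would have to be replaced by a weighted shortest-path algorithm such as Dijkstra.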
Degree Centrality
In the degree centrality metric, the importance of a node is determined by how many
nodes it is connected to. It is a measurement of the number of direct links to
other actors in the network. This means that the larger the number of adjacent
nodes, the more important the node, since it does not depend on other actors to
reach large parts of the network. It is a local measure since its value is computed
based on the number of links an actor has to the other actors directly adjacent to
it. Actors in social networks with a high degree of centrality serve as hubs and
as major channels of information. In social networks, for example, node degree
distribution follows a power law distribution, which means that very few nodes
have an extremely large number of connections. Naturally, those high-degree
nodes have more impact in the network than other nodes and thus are considered
more important. A node i’s degree centrality d(i) can be formulated as
$$ d(i) = \sum_{j} m_{ij} $$
where mij = 1 if there is a link between nodes i and j and mij = 0 if there is no
such link. For directed networks, it is important to differentiate between the in-
degree centrality and the out-degree centrality.
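Using the mij indicator described above, a node's degree is simply the sum of its row in the adjacency matrix. The sketch below illustrates this, along with in-degree and out-degree for a directed network. It assumes the convention that m[i][j] = 1 means a link from node i to node j; the function names and the sample matrices are illustrative.

```python
def degree_centrality(m):
    """Degree of every node: the row sums of the adjacency matrix m."""
    return [sum(row) for row in m]


def in_out_degree(m):
    """Return (in_degrees, out_degrees) for a directed adjacency matrix.

    Assumed convention: m[i][j] == 1 means a link from node i to node j,
    so out-degree is a row sum and in-degree is a column sum.
    """
    out_degrees = [sum(row) for row in m]
    in_degrees = [sum(column) for column in zip(*m)]
    return in_degrees, out_degrees


# Undirected star: node 0 is linked to nodes 1, 2 and 3.
undirected = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]

# Directed network: 0 -> 1, 0 -> 2, 1 -> 2.
directed = [
    [0, 1, 1],
    [0, 0, 1],
    [0, 0, 0],
]

print(degree_centrality(undirected))
print(in_out_degree(directed))
```

In the directed example, node 2 receives the most links, so it has the highest in-degree, while node 0 has the highest out-degree.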
Identifying individuals with the highest degree centrality is essential in
network analysis because having many ties means having multiple ways to fulfill
the requirements of satisfying needs, becoming less dependent on other
individuals, and having better access to network resources. Persons with the
highest degree centrality are often third parties and deal makers and are able to
benefit from this brokerage. For directed networks, in-degree is often used as a
proxy for popularity. The figure shows that node A and node B are at exceptional
structural positions. All communication lines must go through them. This leads
to the conclusion that both nodes, A and B, are powerful merely because of their
excellent positions. However, such a finding is largely based on the nature of
links and the nature of embedded relationships.
Closeness Centrality
where dij is the geodesic distance from node i to node j (number of links in the
shortest path from node i to node j).
Closeness centrality is important to understand information dissemination in
networks in the way that the distance between one particular node and others
has an effect on how this node can receive information from or send information (e.g.,
gossip) to other nodes. In social networks, this ability is limited by what is called
the “horizon of observability,” which states that individuals have almost no sight into
what is going on after two steps. Because closeness centrality is based on the
distance between network nodes, the raw distance total can be considered the inverse of centrality
because large values refer to lower centrality, whereas small values refer to
high centrality. Computationally, the normalized value of C(i) is a number between 0 and 1,
where higher numbers mean greater closeness (lower average distance) whereas
lower numbers mean insignificant closeness (higher average distance). In the
figure, the nodes in gray are the most central regarding closeness because they
can reach the rest of the nodes in the network easily and equally.
They have the ability to reach all other nodes in the shortest amount of time. The
other nodes lack these privileged positions. Because closeness centrality is based
on shortest path calculations, its usefulness when applied to large networks can
be brought into question in the way that closeness produces little variation in the
results, which makes differentiating between nodes more difficult. In information
networks, closeness reveals how long it takes for a bit of information to flow from
one node to others in the network. High-scoring nodes usually have shorter paths
to the rest of the nodes in the network.
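To make the 0-to-1 scale concrete, the sketch below computes closeness as (n - 1) divided by the sum of geodesic distances from a node. This normalization is an assumption chosen to match the range described above, not necessarily the exact formula intended by these notes. The graph is assumed to be connected, undirected, and unweighted.

```python
from collections import deque


def bfs_distances(graph, source):
    """Geodesic distances (edge counts) from `source` in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def closeness(graph, node):
    """(n - 1) / (sum of distances from node); connected graph assumed."""
    total = sum(bfs_distances(graph, node).values())
    return (len(graph) - 1) / total if total > 0 else 0.0


# Star network: "hub" is one step from everyone, leaves are two steps apart.
graph = {
    "hub": ["a", "b", "c"],
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub"],
}

for node in graph:
    print(node, round(closeness(graph, node), 2))
```

The hub reaches every other node in one step and so gets the maximum score, while each leaf needs two steps to reach the other leaves and scores lower.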
Betweenness Centrality
where gjk is the number of shortest paths from node (j) to node k (j and k ≠ i)
and gjik is the number of shortest paths from node (j) to node k passing through
the node (i).
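The two quantities defined above are the ingredients of the standard betweenness score, which sums the ratio gjik / gjk over all pairs j, k different from i. The sketch below computes that sum directly for small graphs. It assumes an undirected, unweighted graph, counts each unordered pair once, and applies no normalization; the function names are illustrative, and the approach is a brute-force one rather than an efficient algorithm.

```python
from collections import deque
from itertools import combinations


def bfs_counts(graph, source):
    """Return (dist, sigma): distances and shortest-path counts from source."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                sigma[neighbor] = 0
                queue.append(neighbor)
            if dist[neighbor] == dist[node] + 1:
                # Every shortest path to `node` extends to `neighbor`.
                sigma[neighbor] += sigma[node]
    return dist, sigma


def betweenness(graph, i):
    """Sum over unordered pairs {j, k}, both different from i, of g_jik / g_jk."""
    info = {v: bfs_counts(graph, v) for v in graph}
    dist_i, sigma_i = info[i]
    total = 0.0
    for j, k in combinations([v for v in graph if v != i], 2):
        dist_j, sigma_j = info[j]
        if k not in dist_j or i not in dist_j or k not in dist_i:
            continue  # skip pairs that are not connected through the graph
        if dist_j[i] + dist_i[k] == dist_j[k]:
            g_jik = sigma_j[i] * sigma_i[k]  # shortest j-k paths through i
            total += g_jik / sigma_j[k]
    return total


# Four-node cycle A - B - C - D - A.
cycle = {"A": ["B", "D"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "A"]}
print(betweenness(cycle, "B"))
```

In the cycle, A and C are joined by two shortest paths and only one of them passes through B, so B receives half a point for that pair and nothing for the others.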
Eigenvector Centrality
Density
Density is defined as the degree to which network nodes are connected one to
another. It can be used as a measure of how close a network is to complete. In
the case of a complete graph (a graph in which all possible edges are present),
density is equal to one. In real life, a dense group of objects has many
connections among its entities (i.e., has a high density), while a sparse group has
few of them (i.e., has a low density).
Formally, the density D(G) of graph G is defined as the ratio of the number of edges in G
to the number of all possible edges. Density values range between zero and one,
[0, 1].
Proportion of ties between alters compared to the number of possible ties
Fig. 3.5 Ties
Total ties = 1
Number of possible ties = 6 (N*(N-1)/2, where N is the number of nodes)
Density = 1/6
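The arithmetic of the example above can be reproduced with a short function. This sketch assumes an undirected network without self-loops, so the number of possible ties is N*(N-1)/2. The value N = 4 is inferred from the six possible ties in the example, and the function name is illustrative.

```python
def density(num_nodes, num_ties):
    """Observed ties divided by possible ties, N*(N-1)/2, for an undirected graph."""
    possible = num_nodes * (num_nodes - 1) / 2
    return num_ties / possible if possible else 0.0


# One observed tie among four nodes: 1 / 6.
print(density(4, 1))

# A complete graph on three nodes has all three possible ties.
print(density(3, 3))
```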
Cohesive Subgroups
Cohesive groups are communities in which the nodes (members) are connected
to others in the same group more frequently than they are to those who are outside
of the group, allowing all of the members of the group to reach each other.
Within such a highly cohesive group, members tend to have strong homogenous
beliefs. Connections between community members can be formed either through
personal contacts (i.e., direct) or joint group membership (i.e., indirect). As such,
the more tightly the individuals are tied into a community, the more they are
affected by group standards.
Cliques
A clique is a graph (or subgraph) in which every node is connected to every other
node. Socially translated, a clique is a social grouping in which all individuals
know each other (i.e., there is an edge between each pair of nodes). A triangle is
an example of a clique of size three since it has three nodes and all the nodes
are connected. A maximal clique is a clique that is not a subset of any other clique
in the graph. A clique with size greater than or equal to that of every other clique
in the graph is called a maximum clique.
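The difference between a maximal and a maximum clique is easy to see on a small example. The sketch below uses the basic Bron-Kerbosch procedure (without pivoting) to list maximal cliques. That algorithm is a choice made for this illustration, not something these notes prescribe, and the graph is assumed to be undirected and stored as a dictionary of neighbor sets.

```python
from itertools import combinations


def is_clique(graph, nodes):
    """True if every pair of the given nodes is joined by an edge."""
    return all(v in graph[u] for u, v in combinations(nodes, 2))


def maximal_cliques(graph):
    """List all maximal cliques using basic Bron-Kerbosch (no pivoting)."""
    cliques = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            cliques.append(current)  # nothing can extend it: maximal
            return
        for v in list(candidates):
            expand(current | {v}, candidates & graph[v], excluded & graph[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(graph), set())
    return cliques


# Triangle A-B-C with an extra node D attached to C.
graph = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}

found = maximal_cliques(graph)
print(sorted(sorted(c) for c in found))
print(sorted(max(found, key=len)))
```

Both {A, B, C} and {C, D} are maximal because neither can be extended, but only {A, B, C} is a maximum clique because no other clique in the graph is larger.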
Relaxation of Strict Cliques
• Distance (length of paths)
Ego Net
The ego-net approach to social network analysis, which takes discrete
individual actors and their contacts as its starting point, is one of the most widely
used approaches.
Ego network (personal network)
Ego: focal node/respondent
Alter: actors ego has ties with
Ties between alters
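These three ingredients (ego, alters, and ties between alters) can be pulled out of a whole-network dataset with a few lines of code. The sketch below assumes an undirected graph stored as a dictionary of neighbor sets; the function name and the sample data are hypothetical.

```python
def ego_network(graph, ego):
    """Return (alters, alter_ties) for `ego` in an undirected graph.

    alters     : the nodes ego has ties with
    alter_ties : ties between alters, each stored once as a frozenset pair
    """
    alters = set(graph[ego])
    alter_ties = {
        frozenset((u, v))
        for u in alters
        for v in graph[u]
        if v in alters and v != u
    }
    return alters, alter_ties


# Ego "E" knows A, B and C; A and B know each other; Z is outside ego's network.
graph = {
    "E": {"A", "B", "C"},
    "A": {"E", "B"},
    "B": {"E", "A"},
    "C": {"E", "Z"},
    "Z": {"C"},
}

alters, alter_ties = ego_network(graph, "E")
print(sorted(alters))
print(len(alter_ties))
```

Node Z is tied to an alter but not to ego, so it is left out of the ego network, and the only alter-alter tie is the one between A and B.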
– E.g., who are the key players in a group? How do ideas diffuse
through a group?
Consider the people with whom you like to spend your free time.
– Over the last six months, who are the one or two people you
have been with the most often for informal social activities
such as going out to lunch, dinner, drinks, films, visiting
one another’s homes, and so on?
• All of these questions will be asked for each alter named in the
previous section
Sample Alter Attribute Questions
• As far as you know, what is <alter>’s highest level of education?
– Age, occupation, race, gender, nationality, salary, drug use
habits, etc
• Some approaches do not distinguish between name interpreters and alter
attribute
Sample Alter-Alter Relationship Questions
Kinds of Analyses
• In Ego-Centric Network analyses we are typically looking
to use network-derived measures as variables in more
traditional case-based analyses
– E.g., instead of just age, education, and family SES to predict
earning potential, we might also include heterogeneity of
network or brokerage statistics
There are many different kinds of network measures; the simplest is degree (size)
Data Analysis of Ego Networks
1. Size
– How many contacts does Ego have?
2. Composition
– What types of resources does ego have access to? (e.g., quality)
– Does ego interact with others like him/herself? (e.g., homophily)
– Are ego’s alters all alike? (e.g., homogeneity?)
3. Structure
– Does ego connect otherwise unconnected alters? (e.g.,
brokerage, density, etc.)
– Does ego have ties with non-redundant alters? (e.g.,
effective size, efficiency, constraint)
Size
Degree = 7
Composition: Content
• The attributes (resources) of others to whom I am connected
affect my success or opportunities
– Access to resources or information
Heterophily
– We may posit that a relationship exists between some phenomenon and a
difference between ego and alters along some attribute
• Mentoring tends to be heterophilous with age
Composition: Homophily/Heterophily
Krackhardt and Stern’s E-I index
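The E-I index compares external ties (E, to alters in a different category from ego) with internal ties (I, to alters in the same category) as (E - I) / (E + I). It ranges from -1 (all ties internal, complete homophily) to +1 (all ties external, complete heterophily). The sketch below is a minimal illustration; the function name and the category labels are hypothetical.

```python
def ei_index(ego_group, alter_groups):
    """Krackhardt and Stern's E-I index for one ego: (E - I) / (E + I).

    ego_group    : ego's category on some attribute
    alter_groups : the category of each alter on the same attribute
    """
    internal = sum(1 for group in alter_groups if group == ego_group)
    external = len(alter_groups) - internal
    if internal + external == 0:
        raise ValueError("ego has no alters")
    return (external - internal) / (external + internal)


# Ego is in group "x"; two alters share the group, one does not.
print(ei_index("x", ["x", "x", "y"]))
```

With two internal ties and one external tie the index is negative, which indicates a tendency towards homophily on this attribute.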
Composition: Heterogeneity
• Similar to homophily, but distinct in that it looks not at similarity
to ego, but just among the alters
Structural Analyses
• Burt’s work is particularly and explicitly ego-network based in
calculation
• Effective size
• Efficiency
• Constraint
Effective Size
= 6 – 1.33 = 4.67
Efficiency
Actual Size = 6
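
The numbers above are consistent with the common simplification of Burt's effective size for unweighted, undirected ego networks: n - 2t/n, where n is the number of alters and t is the number of ties among the alters (not counting ties to ego). Efficiency is then effective size divided by actual size. In the sketch below, t = 4 is an inference from the 1.33 term (2 * 4 / 6), not a value stated in these notes, and the function names are illustrative.

```python
def effective_size(num_alters, alter_ties):
    """Simplified effective size for a binary, undirected ego network: n - 2t/n."""
    return num_alters - 2 * alter_ties / num_alters


def efficiency(num_alters, alter_ties):
    """Effective size as a proportion of actual size."""
    return effective_size(num_alters, alter_ties) / num_alters


n = 6  # actual size: number of alters
t = 4  # assumed number of alter-alter ties, inferred from 2t/n = 1.33

print(round(effective_size(n, t), 2))
print(round(efficiency(n, t), 2))
```

An efficiency close to 1 means that almost every alter is non-redundant, while lower values mean that many of ego's contacts are also connected to one another.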