Borgatti
Graph Theory
This is a FIRST draft. It is very likely to contain errors.
Figure 1.

Figure 2. Adjacency matrix of the graph in Figure 1:

        a  b  c  d  e  f
    a   0  1  0  0  0  0
    b   1  0  1  0  0  0
    c   0  1  0  1  1  0
    d   0  0  1  0  1  0
    e   0  0  1  1  0  1
    f   0  0  0  0  1  0
Examining either Figure 1 or Figure 2, we can see that not every vertex is adjacent to
every other. A graph in which all vertices are adjacent to all others is said to be complete.
The extent to which a graph is complete is indicated by its density, which is defined as
the number of edges divided by the number possible. If self-loops are excluded, then the
number possible is n(n-1)/2. If self-loops are allowed, then the number possible is
n(n+1)/2. Hence the density of the graph in Figure 1 is 6/15 = 0.40.
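To make the calculation concrete, here is a minimal Python sketch (the function and variable names are ours, not part of any standard library) that computes density from a 0/1 adjacency matrix, using the matrix in Figure 2:

    def density(adj):
        """Density of an undirected graph with no self-loops, given a 0/1 adjacency matrix."""
        n = len(adj)
        # Count each undirected edge once by summing the upper triangle.
        edges = sum(adj[i][j] for i in range(n) for j in range(i + 1, n))
        return edges / (n * (n - 1) / 2)

    # Adjacency matrix of the graph in Figure 1 (rows/columns ordered a-f).
    A = [[0,1,0,0,0,0],
         [1,0,1,0,0,0],
         [0,1,0,1,1,0],
         [0,0,1,0,1,0],
         [0,0,1,1,0,1],
         [0,0,0,0,1,0]]
    print(density(A))   # 0.4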
A clique is a maximal complete subgraph. A subgraph of a graph G is a graph whose
points and lines are contained in G. A complete subgraph of G is a section of G that is
complete (i.e., has density = 1). A maximal complete subgraph is a subgraph of G that is
complete and is maximal in the sense that no other node of G could be added to the
subgraph without losing the completeness property. In Figure 1, the nodes {c,d,e}
together with the lines connecting them form a clique. Cliques have been seen as a way to
represent what social scientists have called primary groups.
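Readers who wish to experiment can enumerate maximal cliques with the networkx library; a brief sketch using our encoding of the Figure 1 graph (edge list transcribed from Figure 2):

    import networkx as nx

    G = nx.Graph([("a","b"), ("b","c"), ("c","d"), ("c","e"), ("d","e"), ("e","f")])
    print(list(nx.find_cliques(G)))
    # Maximal cliques (order may vary): [['a','b'], ['b','c'], ['c','d','e'], ['e','f']]
    # Only {c,d,e} has three or more members, matching the clique identified above.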
While not every pair of vertices in the graph in Figure 1 is adjacent, one can construct a sequence
of adjacent vertices from any vertex to any other. Graphs with this property are called
connected. Similarly, any pair of vertices in which one vertex can reach the other via a
path is said to be reachable.
Figure 3.
Geodesic distance matrix for the graph in Figure 3:

        a  b  c  d  e  f  g
    a   0  1  2  3  3  4  3
    b   1  0  1  2  2  3  2
    c   2  1  0  1  1  2  1
    d   3  2  1  0  1  2  2
    e   3  2  1  1  0  1  1
    f   4  3  2  2  1  0  2
    g   3  2  1  2  1  2  0
The powers of a graph's adjacency matrix, A^p, give the number of walks of length p
between all pairs of nodes. For example, A^2, obtained by multiplying the matrix by itself,
has entries a^(2)_ij that give the number of walks of length 2 that join node v_i to node v_j.
Hence, the geodesic distance matrix D has entries d_ij = p, where p is the smallest power
such that a^(p)_ij > 0. (However, there exist much faster algorithms for computing the distance
matrix.)
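The following sketch (a deliberately naive helper of our own; breadth-first search would be faster, as noted) makes the definition concrete by raising A to successive powers and recording, for each pair, the first power at which the walk count becomes positive:

    import numpy as np

    def geodesic_distances(A):
        """Distance matrix from a 0/1 adjacency matrix via successive powers of A."""
        n = len(A)
        A = np.asarray(A)
        D = np.full((n, n), np.inf)
        np.fill_diagonal(D, 0)
        walks = np.eye(n, dtype=int)
        for p in range(1, n):                      # no geodesic is longer than n - 1
            walks = walks @ A                      # entries now count walks of length p
            newly_reached = (walks > 0) & np.isinf(D)
            D[newly_reached] = p                   # first p with a positive walk count
        return D                                   # inf marks unreachable pairs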
The eccentricity e(v) of a point v in a connected graph G(V,E) is max d(u,v), taken over all
u ∈ V. In other words, a point's eccentricity is equal to the distance from itself to the point
farthest away. The eccentricity of node b in Figure 3 is 3. The minimum eccentricity of
all points in a graph is called the radius r(G) of the graph, while the maximum
eccentricity is the diameter of the graph. In Figure 3, the radius is 2 and the diameter is 4.
A vertex that is least distant from all other vertices (in the sense that its eccentricity
equals the radius of the graph) is a member of the center of the graph and is called a
central point. Every tree has a center consisting of either one point or two adjacent
points.
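Given the distance matrix for Figure 3, these quantities follow directly; a short sketch (variable names are ours):

    labels = list("abcdefg")
    D = [[0,1,2,3,3,4,3],
         [1,0,1,2,2,3,2],
         [2,1,0,1,1,2,1],
         [3,2,1,0,1,2,2],
         [3,2,1,1,0,1,1],
         [4,3,2,2,1,0,2],
         [3,2,1,2,1,2,0]]
    ecc = {labels[i]: max(row) for i, row in enumerate(D)}   # eccentricity of each node
    radius, diameter = min(ecc.values()), max(ecc.values())
    center = [v for v, e in ecc.items() if e == radius]
    print(ecc["b"], radius, diameter, center)                # 3 2 4 ['c']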
The number of vertices adjacent to a given vertex is called the degree of the vertex and is
denoted d(v). It can be obtained from the adjacency matrix of a graph by simply
computing each row sum. For example, the degree of vertex c in Figure 3 is 4. The
average degree, d̄, of all vertices depicted in Figure 3 is 2.29. There is a direct
relationship between the average degree, d̄, of all vertices in a graph and the graph's
density:

    density = d̄ / (n - 1)
The minimum degree of a graph G is denoted δ(G). A vertex with degree 0 is known as
an isolate (and constitutes a component of size 1), while a vertex with degree 1 is a
pendant. Holding average degree constant, there is a tendency for graphs that contain
some nodes of high degree (i.e., high variance in degree) to have shorter distances than
graphs in which degree is more evenly distributed.
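The relationship between average degree and density is easy to verify numerically; a brief check on the Figure 3 graph (edge list read off the distance-1 entries in the matrix above):

    edges = [("a","b"), ("b","c"), ("c","d"), ("c","e"), ("c","g"),
             ("d","e"), ("e","f"), ("e","g")]
    nodes = sorted({v for e in edges for v in e})
    n = len(nodes)
    degree = {v: sum(v in e for e in edges) for v in nodes}
    avg_degree = sum(degree.values()) / n                   # 16/7, about 2.29
    density = len(edges) / (n * (n - 1) / 2)                # 8/21, about 0.38
    print(abs(density - avg_degree / (n - 1)) < 1e-12)      # True: density = avg_degree/(n-1)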
Directed Graphs
As noted at the outset, the edges contained in graphs are unordered pairs of nodes (i.e.,
(u,v) is the same thing as (v,u)). As such, graphs are useful for encoding directionless
relationships, such as the social relation "sibling of" or the physical relation "is near".
However, many relations that we would like to model are not directionless. For example,
"is the boss of" is usually asymmetric in the sense that if u is the boss of v, it is
unlikely that v is also the boss of u. Other relations, such as "gives advice to", are simply
non-symmetric in the sense that if u gives advice to v, v may or may not give advice to u.
To model non-symmetric relations we use directed graphs, also known as digraphs. A
digraph D(V,E) consists of a set of nodes V and a set of ordered pairs of nodes E called
arcs or directed lines. The arc (u,v) points from u to v.
Digraphs are usually represented visually like graphs, except that arrowheads are placed
on lines to indicate direction (see Figure 5). When both arcs (u,v) and (v,u) are present in
a digraph, they may be represented by a double-headed arrow (as in Figure 5a), or two
separate arrows (as shown in Figure 5b).
Figures 5a and 5b.
Figure 6.
Note that the path of length n or less linking a member of the n-clique to another member
may pass through an intermediary who is not in the group. In the 2-clique in Figure 6,
nodes c and e are distance 2 only because of d, which is not a member of the 2-clique. In
this sense, n-cliques are not as cohesive as they might otherwise appear. The notion of an
n-clan avoids this. An n-clan is an n-clique S in which the diameter of the subgraph of G
induced by S is less than or equal to n. The subgraph of G induced by a set of nodes S,
denoted G[S], is defined as the maximal subgraph of G that has point set S; in other words, it
is the subgraph of G obtained by taking all nodes in S and all ties among them. Therefore,
an n-clan S is an n-clique in which all pairs of members have distance less than or equal to n even
when we restrict all paths to involve only members of S. In Figure 6, the set {b,c,d,e,f} is
a 2-clan, but {a,b,c,d,e} is not, because b and c have distance greater than 2 in the induced
subgraph. Note that {a,b,f,e} also fails the 2-clan criterion, because n-clans are defined
to be n-cliques and {a,b,f,e} is not a 2-clique (it fails the maximality criterion, since
further nodes can be added without any pairwise distance exceeding 2). An n-club corrects this
problem by eliminating the n-clique criterion from the definition altogether: an n-club is a
maximal subset of nodes whose induced subgraph has diameter less than or equal to n.
1. Cohesive subsets are traditionally defined in terms of subgraphs rather than subsets of nodes. However, since most people think about them in terms of node sets, and because using subgraphs complicates notation, we used subsets here.
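To make the induced-subgraph condition concrete, here is a small helper of our own (using networkx) that checks whether the subgraph induced by a candidate set S has diameter n or less; the n-clique and maximality checks that complete the n-clan definition are not performed here:

    import networkx as nx

    def induced_diameter_at_most(G, S, n):
        """True if the subgraph of G induced by node set S has diameter <= n."""
        H = G.subgraph(S)
        if not nx.is_connected(H):
            return False                 # a disconnected induced subgraph fails trivially
        return nx.diameter(H) <= n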
Figure 7.
More cohesive than k-plexes are LS sets. Let H be a set of nodes in graph G(V,E) and let
K be a proper subset of H. Let α(K) denote the number of edges linking members of K to
V-K (the set of nodes not in K). Then H is an LS set of G if for every proper subset K of
H, α(K) > α(H). The basic idea is that individuals in H have more ties with other members
than they do with outsiders. Another way to define LS sets that makes this more evident is
as follows. Let α(X,Y) denote the number of edges from members of set X to members of
set Y. Then H is an LS set if α(K,H-K) > α(K,V-H) for every proper subset K of H. In Figure 7,
the set {a,b,d,e} is not an LS set since α({b,d,e},{a}) is not greater than α({b,d,e},{c}).
In contrast, the set {a,b,d,e} in Figure 8 does qualify as an LS set.
Figure 8.
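The edge-count function α(X,Y) is straightforward to compute, and for small H the LS condition can be checked by brute force over all proper subsets; a sketch (helper names are ours):

    from itertools import combinations

    def alpha(edges, X, Y):
        """Number of edges with one endpoint in X and the other in Y."""
        return sum((u in X and v in Y) or (u in Y and v in X) for u, v in edges)

    def is_ls_set(edges, nodes, H):
        """Brute-force test: alpha(K, H-K) > alpha(K, V-H) for every nonempty proper subset K of H."""
        H, V = set(H), set(nodes)
        for r in range(1, len(H)):
            for K in map(set, combinations(H, r)):
                if alpha(edges, K, H - K) <= alpha(edges, K, V - H):
                    return False
        return True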
A key property of LS sets is high edge connectivity. Specifically, every node in an LS set
has higher edge connectivity (λ) with other members of the LS set than with any non-member.
Taking this as the sole criterion for defining a cohesive subset, a lambda set is
defined as a maximal subset of nodes S such that for all a,b,c ∈ S and d ∈ V-S, λ(a,b) >
λ(c,d). To the extent that λ is high, members of the same lambda set are difficult to
disconnect from one another, because λ defines the number of edges that must be removed
from the graph in order to disconnect the nodes within the lambda set.
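Pairwise edge connectivity can be computed with networkx; for illustration we reuse the Figure 1 graph (lambda sets themselves then require comparing these values across all pairs):

    import networkx as nx

    G = nx.Graph([("a","b"), ("b","c"), ("c","d"), ("c","e"), ("d","e"), ("e","f")])  # Figure 1
    print(nx.edge_connectivity(G, "c", "d"))   # 2: both c-d and d-e must be removed
    print(nx.edge_connectivity(G, "a", "b"))   # 1: removing a-b alone separates a from b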
A k-core is a maximal subgraph H in which δ(H) >= k. Hence, every member of a 2-core
is connected to at least 2 other members, and no node outside the 2-core is connected to 2
or more members of the core (otherwise it would not be maximal). Every k-core contains
at least k+1 vertices, and vertices in different k-cores cannot be adjacent. A 1-core is
simply a component. k-cores can be described as loosely cohesive regions within which more
cohesive subsets will be found. For example, every k-plex on n vertices is contained in the
graph's (n-k)-core.
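k-cores can be extracted directly with networkx; again using the Figure 1 graph for illustration:

    import networkx as nx

    G = nx.Graph([("a","b"), ("b","c"), ("c","d"), ("c","e"), ("d","e"), ("e","f")])  # Figure 1
    print(nx.core_number(G))         # largest k such that each node belongs to a k-core
    print(sorted(nx.k_core(G, 2)))   # nodes of the 2-core: ['c', 'd', 'e']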
Roles/Positions
Given a digraph D(V,E), the in-neighborhood of a node v, denoted Ni(v), is the set of
vertices that send arcs to v. That is, Ni(v) = {u : (u,v) ∈ E}. The out-neighborhood of a
node v, denoted No(v), is the set of vertices that receive arcs from v. That is, No(v) = {u :
(v,u) ∈ E}.
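In code, both neighborhoods fall directly out of the arc list; a minimal sketch with a hypothetical digraph:

    arcs = [("a","b"), ("a","c"), ("d","b")]            # hypothetical arc set E

    def in_neighborhood(arcs, v):
        return {u for (u, w) in arcs if w == v}         # Ni(v) = {u : (u,v) in E}

    def out_neighborhood(arcs, v):
        return {w for (u, w) in arcs if u == v}         # No(v) = {u : (v,u) in E}

    print(in_neighborhood(arcs, "b"))    # {'a', 'd'}
    print(out_neighborhood(arcs, "a"))   # {'b', 'c'}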
A coloration C is an assignment of colors to the vertices V of a digraph. The color of a
vertex v is denoted C(v) and the set of distinct colors assigned to nodes in a set S is
denoted C(S) and termed the spectrum of S. In Figure 9, a coloration of nodes is depicted
by labeling the nodes with letters such as r for red, and y for yellow. Nodes colored the
same are said to be equivalent.
Figure 9.
A coloration is a strong structural coloration if nodes are assigned the same color if and
only if they have identical in- and out-neighborhoods. That is, for all u,v ∈ V, C(u) = C(v)
if and only if Ni(u) = Ni(v) and No(u) = No(v). The coloration in Figure 9 is a strong
structural coloration. We can check this by taking pairs of nodes and verifying that if
structural coloration. We can check this by taking pairs of nodes and verifying that if
they are colored the same (i.e., are strongly structurally equivalent) they have identical
neighborhoods, and if they are not colored the same, they have different neighborhoods.
For example, b and d are colored the same, and both of their neighborhoods consist of
{a,c,e}.
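The check described above can be automated; a minimal sketch of our own that tests whether two nodes of a digraph (given as an arc list) are strongly structurally equivalent:

    def strongly_structurally_equivalent(arcs, u, v):
        """True if u and v have identical in-neighborhoods and identical out-neighborhoods."""
        def n_in(x):
            return {a for (a, b) in arcs if b == x}
        def n_out(x):
            return {b for (a, b) in arcs if a == x}
        return n_in(u) == n_in(v) and n_out(u) == n_out(v)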
Note that in strong structural colorations, any two nodes that are colored the same are
structurally identical: if we remove the identifying labels from the identically colored
nodes, then spin the graph around in space before placing it back down on the page, we
would not be able to figure out which of the same-colored nodes was which.
Consequently, any property of the nodes that stems from their structural position (such as
expected time until arrival of something flowing through the network) should be the same
for nodes that are equivalent.
A coloration C is regular if C(u) = C(v) implies that C(Ni(u)) = C(Ni(v)) and C(No(u)) =
C(No(v)) for all u,v ∈ V. In other words, in regular colorations, every pair of nodes that
has the same color must receive arcs from nodes comprising the same set of colors and
must send arcs to nodes comprising the same set of colors. Every structural coloration is a
regular coloration.
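Regularity can likewise be checked mechanically by comparing the color spectra of neighborhoods; a sketch of our own (arcs, nodes, and the color mapping are hypothetical inputs):

    def is_regular(arcs, nodes, color):
        """True if C(u) = C(v) implies equal in- and out-neighborhood spectra, for all u, v."""
        n_in  = {v: {u for (u, w) in arcs if w == v} for v in nodes}   # Ni(v)
        n_out = {v: {w for (u, w) in arcs if u == v} for v in nodes}   # No(v)
        def spectrum(S):                                               # C(S)
            return {color[x] for x in S}
        return all(spectrum(n_in[u]) == spectrum(n_in[v]) and
                   spectrum(n_out[u]) == spectrum(n_out[v])
                   for u in nodes for v in nodes if color[u] == color[v])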