An Introduction To Graph Theory in Complex Systems Studies: Why Use A Graph-Theoretic Representation?
Table of contents
Why use a graph-theoretic representation?
Representing a problem as a graph
Obtaining metrics through graph-theoretic analysis
  Case study: metrics in a genetic regulatory network
Summary: steps in representing a problem as a graph
Tools for analysing graphs
  Programs
  Libraries
Bibliography
Glossary
Note on terminology
Graph-theoretic or complex-systems terminology is identified by the use
of SMALL CAPS in the text. While brief definitions of these terms are
available in the glossary section, this introduction is intended to be used in
conjunction with the online reference material at:
http://130.102.66.173/wiki/index.php/Main_Page
This reference material provides definitions for common graph-theoretic
terms, and explanations of many complex systems metrics and techniques
for analysing graph-based representations.
In situations where the graph must represent richer information than simple nodes and
edges will allow, it is possible to have nodes and edges with attached properties. Some
common examples of properties that may be attached to nodes are the location
information of a node in a transportation network or the activation value of a gene in a
genetic regulatory network; an edge's properties may include a weight, representing the
degree of influence a gene exerts upon the activation of another, or the strength of a
relationship in a social network. Some network types, such as transportation networks,
have sophisticated techniques for dealing with certain attached properties, such as
distance between nodes and flow through edges. In general, however, attached properties
will require domain-specific tuning of generic metrics and techniques if such properties
are to be taken into account.
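As a minimal sketch of attached properties, the Python fragment below stores a property dictionary on each node and each edge of a small genetic regulatory network. The gene names, activation values and weights are hypothetical, and the dictionary layout is only one of many possible representations. It also illustrates the kind of tuning mentioned above: in-degree simply counts incoming edges, while a weight-aware variant sums their weights.

```python
# Illustrative only: every name and value below is hypothetical.
# Node properties: the activation value of each gene.
nodes = {
    "geneA": {"activation": 0.8},
    "geneB": {"activation": 0.1},
    "geneC": {"activation": 0.5},
}

# Directed edge properties: (source, target) -> weight, the degree of
# influence the source gene has on the activation of the target gene.
edges = {
    ("geneA", "geneB"): {"weight": 0.9},
    ("geneA", "geneC"): {"weight": -0.4},
    ("geneC", "geneB"): {"weight": 0.2},
}


def in_degree(node):
    """Generic metric: the number of edges pointing at `node`."""
    return sum(1 for (_, target) in edges if target == node)


def weighted_in_strength(node):
    """Weight-aware variant: the sum of the weights on edges pointing at `node`."""
    return sum(
        props["weight"] for (_, target), props in edges.items() if target == node
    )


print(in_degree("geneB"), weighted_in_strength("geneB"))
```

Whether summing weights is a sensible adaptation depends on what the weights mean in the domain; here it is assumed that influences combine additively.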
● Clustering coefficient (global): Broadly speaking, how closely clustered are nodes
in the network?
● Cliques (local): If the network appears to be modular, do cliques (or similar
measures) identify interesting sub-components of the network? This metric can help
to identify functionally related genes/proteins in the network.
● Bridges/pivots (local): If there are cliques, how do they communicate? Are there
specific nodes or edges in the network that join multiple cliques together? This
metric may identify nodes that are not highly connected, but are nevertheless
crucial to the network's operation.
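The metrics above can be sketched in a few lines of Python. The example below is illustrative rather than definitive: the network (two triangles joined through a single node) is hypothetical, the clustering coefficient follows the glossary definition (the average over all nodes of the fraction of neighbour pairs that are themselves connected), and pivots are found by the simplest possible method of removing each node in turn and testing connectivity, which is only practical for small graphs.

```python
from itertools import combinations

# Hypothetical undirected network: triangles a-b-c and e-f-g joined through d.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e"},
    "e": {"d", "f", "g"},
    "f": {"e", "g"},
    "g": {"e", "f"},
}


def local_clustering(graph, node):
    """Fraction of pairs of neighbours of `node` that are connected to each other."""
    neighbours = graph[node]
    k = len(neighbours)
    if k < 2:
        return 0.0
    linked = sum(1 for u, v in combinations(neighbours, 2) if v in graph[u])
    return linked / (k * (k - 1) / 2)


def global_clustering(graph):
    """Average of the local clustering coefficients of all nodes."""
    return sum(local_clustering(graph, n) for n in graph) / len(graph)


def is_connected(graph, skip=None):
    """True if the graph, ignoring node `skip`, forms a single connected piece."""
    remaining = [n for n in graph if n != skip]
    if not remaining:
        return True
    seen = {remaining[0]}
    stack = [remaining[0]]
    while stack:
        current = stack.pop()
        for nxt in graph[current]:
            if nxt != skip and nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen) == len(remaining)


def pivots(graph):
    """Nodes whose removal disconnects an otherwise connected graph."""
    if not is_connected(graph):
        return []
    return sorted(n for n in graph if not is_connected(graph, skip=n))


print(round(global_clustering(graph), 3))
print(pivots(graph))
```

In this network node d has only two connections, yet it is reported as a pivot because removing it separates the two triangles.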
Programs
Pajek
URL: http://vlado.fmf.uni-lj.si/pub/networks/pajek/
Additional resources: http://vlado.fmf.uni-lj.si/pub/networks/pajek/apply.htm
Pajek is a program for the analysis and visualisation of large graphs. While it can be
intimidating at first glance, it is a highly capable package that is freely available, in active
development and has a large user community – particularly in the field of social network
analysis.
UCINET
URL: http://www.analytictech.com/ucinet.htm
UCINET is a package developed for social network analysis and contains a range of
network analysis tools. There is a freely available demonstration version, with registration
required after 30 days. The interface for UCINET is much more polished than that of
Pajek, but makes heavy use of terms from social network analysis.
It is a good tool for graph analysis, and a reasonable tool for graph visualisation. The
range of analysis tools included is not as wide as that of Pajek, but the available tools are
easier to use. It is accompanied by a high quality manual and a PDF version of a good
(free) social network analysis textbook.
Although UCINET is an MS Windows program, it also runs under Unix-based operating
systems via the Wine project (http://www.winehq.org/). UCINET works reasonably well
when run in this way – the analysis tools generally work well, but there are problems with
graph drawing.
Matlab
URL: http://www.mathworks.com/
User community: http://www.mathworks.com/matlabcentral/
While Matlab is not designed for working with graph-based representations, it is capable
of efficiently and simply dealing with connectivity matrix representations of graphs.
Although not a recommended starting point, researchers familiar with Matlab may find its
capabilities sufficient for a variety of graph analyses. There are some user-contributed
libraries available to support the use of Matlab for graph analysis that can be found
through the user community. However, there are no official packages available, and the
tools available through the user community are not as well developed as similar packages
for C++ or Java (see below). Some of the available user-contributed packages are:
● GrTheory – a graph theory toolbox.
http://www.mathworks.com/matlabcentral/fileexchange/loadFile.do?objectId=4266&objectType=file
This toolbox provides several graph algorithms such as minimum spanning tree,
minimum vertex cover and all-pairs shortest path, but has no internal graph
representation; instead, it uses simple node and edge vectors.
● Toolbox Graph – a toolbox for graph manipulation and plotting.
http://www.mathworks.com/matlabcentral/fileexchange/loadFile.do?objectId=5355&objectType=file
This toolbox includes few algorithms, but has an internal graph representation with
support for building several graph types, as well as having some support for graph
visualisation.
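To show why a connectivity matrix is convenient for matrix-oriented tools such as Matlab, the sketch below works through two simple operations on such a matrix. It is written in plain Python rather than Matlab, the four-node graph is hypothetical, and the helper names are illustrative. Row sums give node degrees, and entry (i, j) of the matrix raised to the power n counts the walks of n edges from node i to node j.

```python
# Hypothetical undirected graph on four nodes (numbered 0 to 3).
# matrix[i][j] == 1 means that nodes i and j are joined by an edge.
matrix = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]


def degrees(matrix):
    """Degree of each node: the sum of its row in the connectivity matrix."""
    return [sum(row) for row in matrix]


def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    size = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
        for i in range(size)
    ]


def count_walks(matrix, length):
    """Matrix whose entry [i][j] is the number of walks of `length` edges from i to j.

    `length` must be at least 1.
    """
    result = matrix
    for _ in range(length - 1):
        result = matmul(result, matrix)
    return result


print(degrees(matrix))
print(count_walks(matrix, 2))
```

In an environment with native matrix support the same operations reduce to a row sum and a matrix power, which is the main attraction of this representation.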
Libraries
Jung (Java)
URL: http://jung.sourceforge.net/
Jung is an actively developed Java graph library with a promising set of graph algorithms
available, and a competent visualisation framework. It is easily extensible through
standard Java mechanisms. It also imports and exports Pajek data files.
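Because Pajek data files are a common interchange format between these tools, it can help to see roughly what one contains. The sketch below, in Python rather than Java, reads a simplified subset of the Pajek .net format: a *Vertices section of numbered, optionally quoted labels, followed by *Edges (undirected) or *Arcs (directed) sections holding one "source target [weight]" entry per line. It does not use Jung, it ignores the many other features of the format, and the function name and sample data are hypothetical.

```python
import shlex


def parse_pajek(text):
    """Parse a simplified subset of the Pajek .net format.

    Returns (labels, edges): labels maps vertex number to label, and edges is a
    list of (source, target, weight, directed) tuples. A missing weight is
    assumed to be 1.0.
    """
    labels = {}
    edges = []
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("*"):
            # Section header such as "*Vertices 3", "*Edges" or "*Arcs".
            section = line.split()[0].lower()
            continue
        fields = shlex.split(line)  # keeps quoted labels together
        if section == "*vertices":
            number = int(fields[0])
            labels[number] = fields[1] if len(fields) > 1 else str(number)
        elif section in ("*edges", "*arcs"):
            source, target = int(fields[0]), int(fields[1])
            weight = float(fields[2]) if len(fields) > 2 else 1.0
            edges.append((source, target, weight, section == "*arcs"))
    return labels, edges


# Hypothetical sample file contents.
sample = """*Vertices 3
1 "geneA"
2 "geneB"
3 "geneC"
*Arcs
1 2 0.9
3 2
"""

labels, edges = parse_pajek(sample)
print(labels)
print(edges)
```

For real data it is safer to rely on the import facilities of Pajek, Jung or another established tool than on a hand-written reader like this one.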
Bibliography
Diestel, Reinhard (2000). “Graph Theory” Springer: NY.
Hanneman, Robert (2001). “Introduction to Social Network Methods” Available from
http://faculty.ucr.edu/~hanneman/networks/nettext.pdf and included in the UCINET
package.
Strogatz, Stephen H. (2001). “Exploring Complex Networks” Nature 410, pp. 268-276.
Wasserman, Stanley and Faust, Katherine (1994). “Social Network Analysis: Methods
and Applications” Cambridge University Press: Cambridge, NY.
Glossary
CLOSENESS CENTRALITY An inverse measure of centrality associated with a NODE. The sum
of the SHORTEST PATH LENGTHS between a given NODE and all other
nodes in the GRAPH. See also BETWEENNESS CENTRALITY.
CLUSTERING COEFFICIENT A measure of the closeness of NODES in a GRAPH. The average of
the clustering coefficient of all the nodes in a graph, where the
clustering coefficient of a node is the probability that two
neighbours of the node are connected to each other.
CUT-POINT See PIVOT.
DEGREE The number of other nodes that a given node is connected to by
EDGES. For directed graphs, see IN-DEGREE and OUT-DEGREE.
K-PLEX A set of NODES in a GRAPH such that each node in the set is
connected to at least all but k nodes in the set. (e.g., in a 2-plex,
every node is connected to at least all but 2 nodes in the set) See
also CLIQUE, K-CORE.
N-CLAN A set of NODES in a GRAPH such that each node in the set is
connected to every other node in the set by a path of n or fewer
EDGES, all of which are between two nodes in the set. (e.g., in a
2-clan, every pair of nodes is connected by a path of no more than
2 edges within the set) See also CLIQUE, N-CLIQUE.
N-CLIQUE A set of NODES in a GRAPH such that each node in the set is
connected to every other node in the set by a path of n or fewer
EDGES. (e.g., in a 2-clique, every pair of nodes is connected by a
path of no more than 2 edges) See also CLIQUE, N-CLAN.
NODE A node is a fundamental component of a GRAPH, generally used to
represent an entity in a problem. Nodes are connected together by
EDGES.
PIVOT A NODE that will, if removed, cause a GRAPH (or subgraph) that was
connected to become disconnected. See also BRIDGE.
RADIUS The smallest value, over all NODES in the GRAPH, of the longest
SHORTEST PATH from that node to any other node; that is, the
greatest distance from the most central node to any other node in
the graph.
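
Several of the glossary definitions translate directly into code. The Python sketch below is illustrative only: the five-node path graph is hypothetical, the graph is assumed to be undirected, unweighted and connected, and shortest path lengths are found with a breadth-first search. Closeness centrality follows the glossary definition (a sum of shortest path lengths, so smaller values mean more central nodes), and the radius is the smallest value over all nodes of the longest shortest path from that node.

```python
from collections import deque

# Hypothetical undirected graph: a simple path a - b - c - d - e.
graph = {
    "a": {"b"},
    "b": {"a", "c"},
    "c": {"b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}


def degree(graph, node):
    """Number of other nodes that `node` is connected to by edges."""
    return len(graph[node])


def shortest_path_lengths(graph, source):
    """Breadth-first search: edge count of the shortest path from `source` to each node."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for nxt in graph[current]:
            if nxt not in distances:
                distances[nxt] = distances[current] + 1
                queue.append(nxt)
    return distances


def closeness(graph, node):
    """Sum of shortest path lengths from `node` to all other nodes (lower is more central)."""
    return sum(shortest_path_lengths(graph, node).values())


def eccentricity(graph, node):
    """Length of the longest shortest path starting at `node`."""
    return max(shortest_path_lengths(graph, node).values())


def radius(graph):
    """Smallest eccentricity over all nodes; assumes the graph is connected."""
    return min(eccentricity(graph, n) for n in graph)


print(degree(graph, "c"), closeness(graph, "c"), closeness(graph, "a"), radius(graph))
```

In this path graph the middle node c has the lowest closeness value and determines the radius, while the end nodes a and e are the least central.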