Optimization Algorithms
in Physics
Authors:
Alexander K. Hartmann, Institute of Theoretical Physics, University of Goettingen, Germany
e-mail: [email protected]
Heiko Rieger, Institute of Theoretical Physics, Saarland University, Germany
e-mail: [email protected]
This book was carefully produced. Nevertheless, authors and publisher do not warrant the information con-
tained therein to be free of errors. Readers are advised to keep in mind that statements, data, illustrations,
procedural details or other items may inadvertently be inaccurate.
1st edition
British Library Cataloguing-in-Publication Data: A catalogue record for this book is available from the British
Library.
ISBN 3-527-40307-8
This book is an interdisciplinary book: it tries to teach physicists the basic knowledge
of combinatorial and stochastic optimization and describes to computer scientists
physical problems and theoretical models in which their optimization algorithms are
needed. It is a unique book since it describes theoretical models and practical situations
in physics in which optimization problems occur, and it explains from a physicist's point
of view the sophisticated and highly efficient algorithmic techniques that otherwise can
only be found in specialized computer science textbooks or even just in research journals.
Traditionally, there has always been a strong scientific interaction between physicists
and mathematicians in developing physics theories. However, even though numerical
computations are now commonplace in physics, no comparable interaction between
physicists and computer scientists has been developed. Over the last three decades
the design and the analysis of algorithms for decision and optimization problems have
evolved rapidly. Most of the active transfer of the results was to economics and
engineering and many algorithmic developments were motivated by applications in
these areas.
The few interactions between physicists and computer scientists were often successful
and provided new insights in both fields. For example, in one direction, the algorith-
mic community has profited from the introduction of general purpose optimization
tools like the simulated annealing technique that originated in the physics community.
In the opposite direction, algorithms in linear, nonlinear, and discrete optimization
sometimes have the potential to be useful tools in physics, in particular in the study of
strongly disordered, amorphous and glassy materials. These systems have in common
a highly non-trivial minimal energy configuration, whose characteristic features dominate
the physics at low temperatures. For a theoretical understanding, knowledge
of the so-called "ground states" of model Hamiltonians, or optimal solutions of appropriate
cost functions, is mandatory. To this end an efficient algorithm, applicable to
reasonably sized instances, is a necessary condition.
The list of interesting physical problems in this context is long: it ranges from disordered
magnets, structural glasses and superconductors through polymers, membranes,
and proteins to neural networks. The predominant methods used by physicists to study
these questions numerically are Monte Carlo simulations and/or simulated annealing.
These methods are doomed to fail in the most interesting situations. But, as pointed
out above, many useful results in optimization algorithms research never reach the
physics community, and interesting computational problems in physics do not come to
the attention of algorithm designers. We therefore think that there is a definite need
to intensify the interaction between the computer science and physics communities.
We hope that this book will help to extend the bridge between these two groups. Since
one end is on the physics side, we will try to guide a number of physicists on a journey
to the other side such that they can profit from the enormous wealth in algorithmic
techniques they will find there and that could help them in solving their computational
problems.
In preparing this book we benefited greatly from many collaborations and discussions
with many of our colleagues. We would like to thank Timo Aspelmeier, Wolfgang
Bartel, Ian Campbell, Martin Feix, Martin Garcia, Ilia Grigorenko, Martin Weigt,
and Annette Zippelius for critical reading of the manuscript, many helpful discus-
sions and other manifold types of support. Furthermore, we have profited very much
from fruitful collaborations and/or interesting discussions with Mikko Alava, Jürgen
Bendisch, Ulrich Blasum, Eytan Domany, Phil Duxbury, Dieter Heermann, Guy Hed,
Heinz Horner, Jérôme Houdayer, Michael Jünger, Naoki Kawashima, Jens Kisker,
Reimer Kühn, Andreas Linke, Olivier Martin, Alan Middleton, Cristian Moukarzel,
Jae-Dong Noh, Uli Nowak, Matthias Otto, Raja Paul, Frank Pfeiffer, Gerhard Reinelt,
Federico Ricci-Tersenghi, Giovanni Rinaldi, Roland Schorr, Eira Seppälä, Klaus Usadel,
and Peter Young. We are particularly indebted to Michael Baer, Vera Dederichs
and Cornelia Reinemuth from Wiley-VCH for the excellent cooperation and Judith
Egan-Shuttler for the copy editing.
Work on this book was carried out at the University of the Saarland, University of
Göttingen, Forschungszentrum Jülich and the University of California at Santa Cruz
and we would like to acknowledge financial support from the Deutsche Forschungsgemeinschaft
(DFG) and the European Science Foundation (ESF).
Santa Cruz and Saarbrücken, May 2001 Alexander K. Hartmann and Heiko Rieger
Contents
1 Introduction to Optimization
Bibliography
2 Complexity Theory
2.1 Algorithms
2.2 Time Complexity
2.3 NP Completeness
2.4 Programming Techniques
Bibliography
3 Graphs
3.1 Graphs
3.2 Trees and Lists
3.3 Networks
3.4 Graph Representations
3.5 Basic Graph Algorithms
3.6 NP-complete Graph Problems
Bibliography
6 Maximum-flow Methods
6.1 Random-field Systems and Diluted Antiferromagnets
6.2 Transformation to a Graph
6.3 Simple Maximum Flow Algorithms
6.4 Dinic's Method and the Wave Algorithm
6.5 Calculating all Ground States
6.6 Results for the RFIM and the DAFF
Bibliography
7 Minimum-cost Flows
7.1 Motivation
7.2 The Solution of the N-Line Problem
7.3 Convex Mincost-flow Problems in Physics
7.4 General Minimum-cost-flow Algorithms
7.5 Miscellaneous Results for Different Models
Bibliography
8 Genetic Algorithms
8.1 The Basic Scheme
8.2 Finding the Minimum of a Function
8.3 Ground States of One-dimensional Quantum Systems
8.4 Orbital Parameters of Interacting Galaxies
Bibliography
10 Matchings
10.1 Matching and Spin Glasses
10.2 Definition of the General Matching Problem
10.3 Augmenting Paths
10.4 Matching Algorithms
10.4.1 Maximum-cardinality Matching on Bipartite Graphs
10.4.2 Minimum-weight Perfect Bipartite Matching
10.4.3 Cardinality Matching on General Graphs
10.4.4 Minimum-weight Perfect Matching for General Graphs
10.5 Ground-state Calculations in 2d
Bibliography
Index
1 Introduction to Optimization
Optimization problems [1, 2, 3] are very common in everyday life. For example, when
driving to work one usually tries to take the shortest route. Sometimes additional
constraints have to be fulfilled, e.g. a bakery should be located along the path, in case
you did not have time for breakfast, or you are trying to avoid busy roads when riding
by bicycle.
In physics many applications of optimization methods are well known, e.g.
Even in beginners' courses on theoretical physics, in classical mechanics, optimization
problems occur: e.g. the Euler-Lagrange differential equation is obtained from
an optimization process.
Many physical systems are governed by minimization principles. For example, in
thermodynamics, a system coupled to a heat bath always takes the state with
minimal free energy.
The TSP has attracted the interest of physicists several times. For an introduction,
see Ref. [8]. The model is briefly presented here. Consider n cities
distributed randomly in a plane. Without loss of generality the plane is considered
to be the unit square. The minimization task is to find the shortest
round-tour through all cities which visits each city only once. The tour stops
at the city where it started. The problem is described by
$\text{Minimize } H(\sigma) = \sum_{i=1}^{n} d(\sigma_i, \sigma_{i+1})$
where $d(\sigma_\alpha, \sigma_\beta)$ is the distance between cities $\sigma_\alpha$ and $\sigma_\beta$, and $\sigma_{n+1} = \sigma_1$. The
constraint that every city is visited only once can be realized by constraining
the vector $\sigma$ to be a permutation of the sequence [1, 2, ..., n].
As an example 15 cities in a plane are given in Fig. 1.1. You can try to
find the shortest tour. The solution is presented in Chap. 2. For the general
TSP the cities are not placed in a plane, but an arbitrary distance matrix d
is given.
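The length of a round-tour can be evaluated directly for any permutation of the cities. The following Python sketch is only an illustration: the function name `tour_length`, the fixed random seed and the use of Euclidean distances between random points in the unit square are assumptions made for this example.

```python
import math
import random


def tour_length(cities, order):
    """Length of the closed tour visiting `cities` in the given `order`."""
    n = len(order)
    total = 0.0
    for i in range(n):
        a = cities[order[i]]
        b = cities[order[(i + 1) % n]]  # wrap around: the tour ends where it started
        total += math.dist(a, b)
    return total


rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility
cities = [(rng.random(), rng.random()) for _ in range(15)]

order = list(range(15))  # one permutation of the cities
rng.shuffle(order)       # another, random permutation
print(tour_length(cities, order))
```

Solving the TSP means finding the permutation for which this value is smallest; the sketch only evaluates the cost of one candidate tour.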
The optimum order of the cities for a TSP depends on their exact positions, i.e.
on the random values of the distance matrix d. It is a feature of all problems we
will encounter here that they are characterized by various random parameters. Each
random realization of the parameters is called an instance of the problem. In general,
if we have a collection of optimization problems of the same (general) type, we will
call each single problem an instance of the general problem.
Because the values of the random parameters are fixed for each instance of the TSP,
one speaks of frozen or quenched disorder. To obtain information about the general
structure of a problem one has to average measurable quantities, like the length of the
shortest tour for the TSP, over the disorder. Later we will see that usually one has to
consider many different instances to get reliable results.
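Averaging over the disorder can be sketched for very small TSP instances, where the optimum is still found by enumerating all tours. The helper names `shortest_tour_length` and `disorder_average`, the instance size and the number of instances are hypothetical choices for this illustration; exhaustive enumeration is used only because it is simple, not because it is a practical method.

```python
import itertools
import math
import random
import statistics


def shortest_tour_length(cities):
    """Exact minimum tour length by enumerating all tours (tiny n only)."""
    n = len(cities)
    best = math.inf
    # Fix city 0 as the start and permute the remaining n - 1 cities.
    for rest in itertools.permutations(range(1, n)):
        order = (0,) + rest
        length = sum(math.dist(cities[order[i]], cities[order[(i + 1) % n]])
                     for i in range(n))
        best = min(best, length)
    return best


def disorder_average(n_cities, n_instances, seed=0):
    """Mean and standard error of the optimum over random instances."""
    rng = random.Random(seed)
    lengths = []
    for _ in range(n_instances):
        cities = [(rng.random(), rng.random()) for _ in range(n_cities)]
        lengths.append(shortest_tour_length(cities))
    mean = statistics.mean(lengths)
    error = statistics.stdev(lengths) / math.sqrt(n_instances)
    return mean, error


print(disorder_average(n_cities=6, n_instances=50))
```

The standard error shrinks only with the square root of the number of instances, which is one reason why many realizations are usually needed.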
While the TSP originates from everyday life, in the following example from physics a
simple model describing complex magnetic materials is presented.
give a negative contribution to the total energy. On the other hand the
thermal noise causes different spins to point randomly up or down. For low
temperatures T the thermal noise is small, thus the system is ordered, i.e.
ferromagnetic. For temperatures higher than a critical temperature $T_c$, no
long range order exists. One says that a phase transition occurs at $T_c$, see
Chap. 5. For a longer introduction to phase transitions, we refer the reader
e.g. to Ref. [9].
A spin configuration which occurs at T = 0 is called a ground state. It is just
the absolute minimum of the energy $H(\sigma)$ of the system since no thermal
excitations are possible at T = 0. Ground states are of great interest because they
serve as the basis for understanding the low temperature behavior of physical
systems. From what was said above, it is clear that in the ground state of
a ferromagnet all spins have the same orientation (if quantum mechanical
effects are neglected).
A more complicated class of materials are spin glasses which exhibit not only
ferromagnetic but also antiferromagnetic interactions, see Chap. 9. Pairs of
neighboring spins connected by an antiferromagnetic interaction like to be in
different orientations. In a spin glass, ferromagnetic and antiferromagnetic
interactions are distributed randomly within the lattice. Consequently, it is
not obvious what ground state configurations look like, i.e. finding the minimum
energy is a non-trivial minimization problem. Formally the problem
reads as follows:
$\text{Minimize } H(\sigma) = -\sum_{\langle i,j \rangle} J_{ij} \sigma_i \sigma_j$
where $J_{ij}$ denotes the interaction between the spins on site i and site j and
the sum $\langle i, j \rangle$ runs over all pairs of nearest neighbors. The values of the
interactions are chosen according to some probability distribution. Each random
realization is given by the collection of all interactions $\{J_{ij}\}$. Even the
simplest distribution, where $J_{ij} = 1$ or $J_{ij} = -1$ with the same probability,
induces a highly non-trivial behavior of the system. Please note that the interaction
parameters are frozen variables, while the spins $\sigma_i$ are free variables
which are to be adjusted in such a way that the energy becomes minimized.
Fig. 1.2 shows a small two-dimensional spin glass and one of its ground
states. For this type of system usually many different ground states for
each realization of the disorder are feasible. One says, the ground state is
degenerate. Algorithms for calculating degenerate spin-glass ground states
are explained in Chap. 9.
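For a very small lattice the ground states and their degeneracy can be found by trying all spin configurations. The sketch below assumes the usual Ising form of the energy, H = -sum of J_ij s_i s_j over nearest-neighbor bonds, a 3 x 3 square lattice with open boundaries and a fixed random seed; the function names are hypothetical. Exhaustive enumeration grows as 2^N and is not one of the efficient algorithms discussed later.

```python
import itertools
import random


def make_bonds(L, rng):
    """Random +/-1 bonds on an L x L square lattice with open boundaries."""
    bonds = {}
    for x in range(L):
        for y in range(L):
            i = x * L + y
            if x + 1 < L:
                bonds[(i, (x + 1) * L + y)] = rng.choice((-1, 1))
            if y + 1 < L:
                bonds[(i, x * L + y + 1)] = rng.choice((-1, 1))
    return bonds


def energy(spins, bonds):
    """H = -sum over bonds of J_ij * s_i * s_j (assumed sign convention)."""
    return -sum(J * spins[i] * spins[j] for (i, j), J in bonds.items())


def ground_states(n_spins, bonds):
    """Enumerate all 2^n configurations; return minimum energy and its states."""
    best, states = None, []
    for spins in itertools.product((-1, 1), repeat=n_spins):
        e = energy(spins, bonds)
        if best is None or e < best:
            best, states = e, [spins]
        elif e == best:
            states.append(spins)
    return best, states


rng = random.Random(1)  # arbitrary seed for one realization of the disorder
L = 3
bonds = make_bonds(L, rng)
e0, states = ground_states(L * L, bonds)
print("ground-state energy:", e0, "degeneracy:", len(states))
```

Because flipping all spins leaves the energy unchanged, the degeneracy found this way is always even.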
Figure 1.2: Two-dimensional spin glass. Solid lines represent ferromagnetic inter-
actions while jagged lines represent antiferromagnetic interactions. The small arrows
represent the spins, adjusted to a ground-state configuration. For all except two in-
teractions (marked with a cross) the spins are oriented relative to each other in an
energetically favorable way. It is not possible to find a state with lower energy (try
it!).
In this book, efficient discrete computer algorithms and recent applications to problems
from physics are presented. The book is organized as follows. In the second chapter,
the foundations of complexity theory are explained. They are needed as a basis for
understanding the rest of the book. In the next chapter an introduction to graph theory
is given. Many physical questions can be mapped onto graph theoretical optimization
problems. Then, some simple algorithms from graph theory are explained, and sample
applications from percolation theory are presented. In the following chapter, the
basic notions from statistical physics, including phase transitions and finite-size scaling
are given. You can skip this chapter if you are familiar with the subject. The main
part of the book starts with the sixth chapter. Many algorithms are presented along
with sample problems from physics, which can be solved using the algorithms. First,
techniques to calculate the maximum flow in networks are exhibited. They can be
used to calculate the ground states of certain disordered magnetic materials. Next,
minimum-cost-flow methods are introduced and applied to solid-on-solid models and
vortex glasses. In the eighth chapter genetic algorithms are presented. They are
general purpose optimization methods and have been applied to various problems.
Here it is shown how ground states of electronic systems can be calculated and how the
parameters of interacting galaxies can be determined. Another type of general purpose
algorithm, the Monte Carlo method, is introduced along with several variants in the
following chapter. In the succeeding chapter, the emphasis is on algorithms for spin
glasses, which is a model that has been at the center of interest of statistical physicists
over the last two decades. In the twelfth chapter, a phase transition in a classical
combinatorial optimization problem, the vertex-cover problem, is studied. The final
chapter is dedicated to the practical aspects of scientific computing. An introduction
to software engineering is given, along with many hints on how to organize the program
development in an efficient way, several tools for programming, debugging and data
analysis, and finally, it is shown how to find information using modern techniques such
as databases and the Internet, and how you can prepare your results such that they
can be published in scientific journals.
Bibliography
[1] C.H. Papadimitriou and K. Steiglitz, Combinatorial Optimization, (Dover Publications,
Mineola (NY) 1998)
[2] W.J. Cook, W.H. Cunningham, W.R. Pulleyblank, and A. Schrijver, Combinatorial
Optimization, (J. Wiley & Sons, New York 1998)
[3] B. Korte and J. Vygen, Combinatorial Optimization, (Springer, Berlin and Heidelberg
2000)
[4] J.C. Anglès d'Auriac, M. Preissmann, and A. Sebő, J. Math. and Comp. Model.
26, 1 (1997)
[5] H. Rieger, in: J. Kertesz and I. Kondor (ed.), Advances in Computer Simulation,
Lecture Notes in Physics 501, (Springer, Heidelberg 1998)
[6] M.J. Alava, P.M. Duxbury, C. Moukarzel, and H. Rieger, Exact Combinatorial
Algorithms: Ground States of Disordered Systems, in: C. Domb and J.L. Lebowitz
(ed.), Phase Transitions and Critical Phenomena 18, (Academic Press, New York
2001)
[7] W.H. Press, S.A. Teukolsky, W.T. Vetterling, and B.P. Flannery, Numerical
Recipes in C, (Cambridge University Press, Cambridge 1995)
[8] S. Kirkpatrick, C.D. Gelatt, Jr., and M.P. Vecchi, Science 220, 671 (1983)
[9] J.M. Yeomans, Statistical Mechanics of Phase Transitions, (Clarendon Press, Oxford
1992)
2 Complexity Theory
Programming languages are used to instruct a computer what to do. Here no specific
language is chosen, since this is only a technical detail. We are more interested in
the general way a method works, i.e. in the algorithm. In the following sections
we introduce a notation for algorithms, give some examples and explain the most
important results about algorithms provided by theoretical computer science.
2.1 Algorithms
Here we do not want to try to give a precise definition of what an algorithm is. We
assume that an algorithm is a sequence of statements which is computer readable and
has an unambiguous meaning. Each algorithm may have input and output (see Fig.
2.1) which are well defined objects such as sequences of numbers or letters. Neither
user-computer interaction nor high-level output such as graphics or sound are covered.
Please note that the communication between the main processing units and keyboards
or graphic-/sound- devices takes place via sequences of numbers as well. Thus, our
notion of an algorithm is universal.
Algorithms for several specific purposes will be presented later. We will concentrate
on the main ideas of each method and not on implementational details. Thus, the
algorithms will not be presented using a specific programming language. Instead, we
will use a notation for algorithms called pidgin Algol, which resembles modern high-
level languages like Algol, Pascal or C. But unlike any conventional programming
language, variables of an arbitrary type are allowed, e.g. they can represent numbers,
strings, lists, sets or graphs. It is not necessary to declare variables and there is no
strict syntax.
For the definition of pidgin Algol, we assume that the reader is familiar with at least one
high-level language and that the meaning of the terms variable, expression, condition
and label is clear. A pidgin Algol program is one of the following statements. Please
note that by using the compound statement, programs can be of arbitrary length:
1. Assignment
variable := expression
A value is assigned to a variable. Examples: a := 5*b + c, A := {a_1, ..., a_n}
Also more complex and informal structures are allowed, like
let z be the first element of the queue Q
2. Condition
if condition then statement 1
else statement 2
The else clause is optional. If the condition is true, statement 1 is executed, else
statement 2, if it exists.
Example: if money>100 then restaurant := 1 else restaurant := 0
3. Cases
case: condition 1
statement1-A;
statement1-B;
...
case: condition 2
statement2-A;
statement2-B;
...
case: condition 3
statement3-A;
statement3-B;
...
end cases
This statement is useful if many different cases can occur, thus making a sequence
of if statements too complex. If condition 1 is true, then the first block of statements
is executed (here no begin ... end is necessary). If condition 2 is true,
then the second block of statements is executed, etc.
4. While loop
while condition do statement
The statement is performed as long as the condition is true.
Example: while counter < 200 do counter := counter+1
5. For loop
for list do statement
The statement is executed for all parameters in the list. Examples:
for i := 1, 2, ..., n do sum := sum+i
for all elements q of queue Q do waits[q] := waits[q]+1
6. Goto statement
a) label: statement
b) goto label
When the execution of an algorithm reaches a goto statement the execution is
continued at the statement which carries the corresponding label.
7. Compound statement
begin
statement 1;
statement 2;
...
statement n;
end
The compound statement is used to convert a sequence of statements into one
statement. It is useful e.g. if a for-loop should be executed for a body of several
statements.
Example:
for i := 1, 2, ..., n do
begin
a := a + i;
b := b + i*i;
c := c + i*i*i;
end
8. Procedures
procedure procedure-name (list of parameters)
begin
statements
return expression
end
The return statement is optional. A procedure is used to define a new name for
one statement or, using a compound statement, for a collection of statements. A
procedure can be invoked by writing: procedure-name (arguments)
Example:
procedure minimum (x, y)
begin
if x>y then return y
else return x
end
9. Comments
comment text
Comments are used to explain parts of an algorithm, i.e. to aid in its understanding.
Sometimes a comment is given at the right end of a line without the
comment keyword.
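Readers who prefer a concrete language may find it helpful to see how some of the pidgin Algol examples above translate. The following Python sketch is one possible rendering, not part of the notation itself; the function name `power_sums` is hypothetical, and Python's indentation plays the role of begin ... end.

```python
def minimum(x, y):
    """Counterpart of the pidgin Algol procedure `minimum`."""
    if x > y:
        return y
    else:
        return x


def power_sums(n):
    """Counterpart of the compound-statement example: sums of i, i^2 and i^3."""
    a = b = c = 0
    for i in range(1, n + 1):  # for i := 1, 2, ..., n do
        a = a + i
        b = b + i * i
        c = c + i * i * i
    return a, b, c


counter = 0
while counter < 200:  # while counter < 200 do counter := counter+1
    counter = counter + 1

print(minimum(3, 7), power_sums(4), counter)
```

Python has no goto statement, so that construct has no direct counterpart here.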
Please note that the length of the tour constructed in this way depends on the city
where the tour starts and that this city is randomly chosen. This algorithm and
many other heuristics for the TSP can be found on the well-presented TSP web-pages
Figure 2.2: A sample TSP containing 15 cities. The results for the nearest-neighbor
heuristic (top) and the exact optimum tour (bottom) are shown. The starting city
for the heuristic is marked by a white square. The nearest neighbor of that city is
located above it.
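A nearest-neighbor heuristic of the kind shown in the figure can be sketched in a few lines. The Python code below is an illustrative version under stated assumptions: cities are points with Euclidean distances, the function name `nearest_neighbor_tour` and the seed are hypothetical, and the details may differ from the pidgin Algol formulation in the text.

```python
import math
import random


def nearest_neighbor_tour(cities, rng):
    """Greedy tour: start at a random city, always go to the nearest unvisited one."""
    n = len(cities)
    visited = [False] * n
    current = rng.randrange(n)      # randomly chosen starting city
    visited[current] = True
    tour = [current]
    for _ in range(n - 1):
        best, best_dist = None, math.inf
        for j in range(n):          # scan all unvisited cities
            if not visited[j]:
                d = math.dist(cities[current], cities[j])
                if d < best_dist:
                    best, best_dist = j, d
        visited[best] = True
        tour.append(best)
        current = best
    return tour


rng = random.Random(0)  # arbitrary seed
cities = [(rng.random(), rng.random()) for _ in range(15)]
tour = nearest_neighbor_tour(cities, rng)
print(tour)
```

The result is always a valid tour, but in general not the shortest one, and it changes with the starting city.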
quite a long natural number, representing the program in a unique way. Now, let $f_n$
be the function which is defined through the text with number n, if the text is a valid
program. If text n is not a valid program or if the program has more than one input
or output number, then we define $f_n(x) = \mathrm{div}$ for all $x \in \mathbb{N}$. In total, this procedure
assigns a function to each number.
All functions which can be programmed on a computer are called computable. Please
note that for every computable function f there are multiple ways to write a program,
thus there are many numbers n with $f_n = f$. Now we want to show:
There are functions $f: \mathbb{N} \to \mathbb{N}$ which are not computable
Proof: We define the following function
Evidently, this is a well defined partial function on the natural numbers. The point is
that it is different from all computable functions $f_n$, i.e. $f^*$ itself is not computable:
QED
The technique applied in the proof above is called diagonalization. The reason is that
if one tabulates the infinite matrix consisting of the values $f_n(i)$ then the function $f^*$ is
different from each $f_n$ on the diagonal $f_n(n)$. The principle used for the construction
of $f^*$ is visualized in Fig. 2.3. The technique of diagonalization is very useful for many
proofs occurring not only in the area of theoretical computer science but also in many
fields of mathematics. The method was probably introduced by Georg Cantor at the
end of the 19th century to show that there are more than a countable number of real
numbers.
Figure 2.3: Principle of diagonalization: define a function which differs from all
computable functions on the diagonal.
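One common way to write down such a diagonal function is sketched below. It is given as an assumed illustration of the principle and is not necessarily the exact definition used in the proof; any value that differs from $f_n(n)$ would serve equally well.

```latex
f^*(n) =
\begin{cases}
  1          & \text{if } f_n(n) = \mathrm{div},\\
  f_n(n) + 1 & \text{otherwise},
\end{cases}
\qquad\Longrightarrow\qquad
f^*(n) \neq f_n(n) \quad \forall n
\quad\Longrightarrow\quad
f^* \neq f_n \quad \forall n .
```

Since every computable function appears as some $f_n$, a function that differs from each $f_n$ at the argument n cannot itself be computable.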
forever ($f_n(x) = \mathrm{div}$). The question whether a given program stops or not is called
the halting problem. With a similar and related diagonalization argument as we have
seen above, it can be shown that there is indeed no solution to this problem. It means
that no universal algorithm exists which decides for all programs whether the program
will halt with a given input or run forever. On the other hand, if a test for the halting
problem was available it would be easy to implement the function $f^*$ on a computer,
i.e. $f^*$ would be computable. Thus, the undecidability of the halting problem follows
from the fact that $f^*$ is also not computable.
In principle, it is always possible to prove for a given program whether it will halt
on a given input or not by checking the code and deep thinking. The insolvability of
the halting problem just means that there is no systematic way, i.e. no algorithm to
construct a proof for any given program. Here, as for most proofs in mathematics, the
person who carries it out must rely on his/her creativity. But with increasing length
of the program the proof usually becomes extremely difficult. It is not surprising that
for realistic programs like word processors or databases no such proofs are available.
The same is true for the correctness problem: There is no systematic way to prove
that a given program works according to a given specification. On the other hand, this
is fortunate, since otherwise many computer scientists and programmers would be
unemployed.
The halting problem is a so-called recognition problem: for the question "will Program
$P_n$ halt on input x" only the answers "yes" or "no" are possible. In general, we will
call an instance (here a program) yes-instance if the answer is "yes" for it, otherwise
no-instance. As we have seen, the halting problem is not decidable, because it is not
possible to prove the answer "no" systematically. But if the answer is "yes", i.e. if the
program stops, this can always be proven: just take the program $P_n$, supply input x,
run it and wait till it stops. This is the reason why the halting problem at least is
provable.
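The asymmetry between the answers "yes" and "no" can be made concrete with a step-limited simulation. In the following Python sketch, programs are modeled as generators that yield once per step; this modeling, the function `halts_within` and the two sample programs are hypothetical devices for the illustration.

```python
def halts_within(program, x, max_steps):
    """Run `program` on input x for at most max_steps steps.

    Returns True if the program stopped within the budget, and None
    (meaning "unknown") otherwise. It can never return a definite "no".
    """
    run = program(x)
    for _ in range(max_steps):
        try:
            next(run)
        except StopIteration:
            return True
    return None


def countdown(x):
    """Hypothetical program that halts after x steps."""
    while x > 0:
        x -= 1
        yield


def loop_forever(x):
    """Hypothetical program that never halts."""
    while True:
        yield


print(halts_within(countdown, 10, 1000))     # True
print(halts_within(loop_forever, 10, 1000))  # None
```

Increasing the step budget eventually confirms every yes-instance, but no finite budget can turn "unknown" into "no".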
execution time of a program may also depend on the number of distances, i.e. the
number of nonzero entries of the matrix d(i, j).
Usually one is not interested in the actual running time t(x) of an algorithm for a
specific implementation on a given computer. Obviously, this depends on the input,
the skills of the programmer, the quality of the compiler and the amount of money
which is available for buying the computer. Instead, one would like to have some kind
of measure that characterizes the algorithm itself.
As a first step, one takes the longest running time over all inputs of a given length.
This is called the worst case running time or worst case time complexity T(n):
$T(n) = \max_{x:\,|x|=n} t(x)$
Here, the time is measured in some arbitrary units. Which unit is used is not relevant:
on a computer B which has exactly twice the speed of computer A a program will
consume only half the time. We want to characterize the algorithm itself. Therefore,
a good measure must be independent of such constant factors like the speed of a
computer. To get rid of these constant factors one tries to determine the asymptotic
behavior of a program by giving upper bounds:
Definition: O/Θ notation Let T, g be functions from natural numbers to real
numbers.
We write $T(n) \in O(g(n))$ if there exists a positive constant c with $T(n) \le c\,g(n)$
for all n. We say: T(n) is of order at most g(n).
$T(n) \in \Theta(g(n))$ if there exist two positive constants $c_1, c_2$ with $c_1 g(n) \le T(n) \le c_2 g(n)$
$\forall n$. We say: T(n) is of order g(n).
Example: O/Θ notation
For $T(n) = pn^3 + qn^2 + rn$, the cubic term is the fastest growing part: Let
$c \equiv p + q + r$. Then $T(n) \le cn^3\ \forall n$, which means $T(n) \in O(n^3)$. Since e.g.
$n^4$, $2^n$ are growing faster than $n^3$, we have $T(n) \in O(n^4)$ and $T(n) \in O(2^n)$.
Let $c' \equiv \min\{p, q, r\}$. Then $c'n^3 \le T(n) \le cn^3$. Hence, $T(n) \in \Theta(n^3)$. This
smallest upper bound characterizes T(n) most precisely.
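The two bounds in this example can be checked numerically. The Python sketch below uses arbitrarily chosen positive coefficients p = 2, q = 3, r = 5, which are assumptions for the illustration only.

```python
def T(n, p=2, q=3, r=5):
    """Example running time T(n) = p*n^3 + q*n^2 + r*n (coefficients are arbitrary)."""
    return p * n**3 + q * n**2 + r * n


p, q, r = 2, 3, 5
c_upper = p + q + r        # c  = p + q + r
c_lower = min(p, q, r)     # c' = min{p, q, r}

for n in (1, 10, 100, 1000):
    assert c_lower * n**3 <= T(n) <= c_upper * n**3
    print(n, T(n) / n**3)  # ratio stays between c' and c and approaches p
```

The ratio T(n)/n^3 tends to p for large n, which shows that only the cubic term matters asymptotically.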
We are interested in obtaining the time complexity of a given algorithm without actu-
ally implementing and running it. The aim is to analyze the algorithm given in pidgin
Algol. For this purpose we have to know how long basic operations like assignments,
increments and multiplications take. Here we assume that a machine is available where
all basic operations take one time-step. This restricts our arithmetic operations to a
fixed number of bits, i.e. numbers of arbitrary length cannot be computed. If we en-
counter a problem where numbers of arbitrary precision can occur, we must include
the time needed for the arithmetic operations explicitly in the analysis.
As an example, the time complexity of the TSP heuristic will now be investigated,
which was presented in the last section. At the beginning of the algorithm a loop
is performed which sets all variables v[i] to zero. Each assignment costs a constant
amount of time. The loop is performed n times. Thus, the loop is performed in
O(n). The random choice of a city and the assignment of $v[\sigma_1]$ take constant time.
The rest of the algorithm is a loop over all cities which is performed n - 1 times.
The loop consists of assignments, comparisons and another loop which runs over all
unvisited cities. Therefore, the inner loop is performed n - 1 times for i = 2, n - 2
times for i = 3, etc. Consequently, the if-statement inside that loop is performed
$\sum_{i=2}^{n} (n + 1 - i) = \sum_{i=1}^{n-1} (n - i) = n(n-1)/2$ times. Asymptotically this pair of
nested loops is the most time-consuming part of the algorithm. Thus, in total the
algorithm has a time complexity of $\Theta(n^2)$.
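The count of n(n-1)/2 can be confirmed by simply counting loop iterations. The following Python sketch mirrors only the loop structure described above, not the heuristic itself; the function name `count_inner_steps` is hypothetical.

```python
def count_inner_steps(n):
    """Count executions of the if-statement in the nested loops described above."""
    steps = 0
    unvisited = n - 1                # one city is already the starting point
    for i in range(2, n + 1):        # outer loop: positions 2, ..., n of the tour
        for _ in range(unvisited):   # inner loop over all unvisited cities
            steps += 1
        unvisited -= 1
    return steps


for n in (5, 15, 100):
    print(n, count_inner_steps(n), n * (n - 1) // 2)
```

Both columns agree for every n, in line with the quadratic growth derived in the text.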
Can the TSP heuristic be considered as being fast? Tab. 2.1 shows the growth of
several functions as a function of input size n .
[Table 2.1: Growth of the functions $T(n) = n$, $n \log n$, $n^2$, $n^3$, $n^{\log n}$, $2^n$ and $n!$ with the input size n.]
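The growth of these functions is easy to tabulate. In the Python sketch below, the base-2 logarithm, the sample sizes n = 10, 20, 50 and the reading of the fourth function as n^3 are assumptions; the function name `growth_row` is hypothetical.

```python
import math


def growth_row(n):
    """Values of the functions listed above at input size n (log taken base 2 here)."""
    log_n = math.log2(n)
    return {
        "n": n,
        "n log n": n * log_n,
        "n^2": n**2,
        "n^3": n**3,
        "n^log n": n**log_n,
        "2^n": 2**n,
        "n!": math.factorial(n),
    }


for n in (10, 20, 50):
    row = growth_row(n)
    print(n, {name: f"{value:.3g}" for name, value in row.items()})
```

Already at n = 50 the last three functions exceed the polynomial ones by many orders of magnitude.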
For algorithms which have a time complexity that grows faster than all polynomials,
even moderate increases of the system size make the problem impossible to treat.
Therefore, we will call an algorithm effective if its time complexity is bounded by
a polynomial: $T(n) \in O(n^k)$. In practice, values of the exponent up to k = 3 are
considered as suitable. For very large exponents and small system sizes algorithms
with exponentially growing time complexity may be more useful. Compare for example
an algorithm with $T_1(n) = n^{80}$ and another with $T_2(n) = 2^n$. The running-time of the
first algorithm is astronomical even for n = 3, while the second one is able to treat at
least small input sizes.
The application of the O/Θ notation neglects constants or lower order terms for the
time complexity. Again, in practice an algorithm with running time $T_3(n) = n^3$ may
be faster for small input sizes than another with $T_4(n) = 100n^2$. But these kinds of
examples are very rare and rather artificial.
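Both comparisons can be checked with exact integer arithmetic. The Python sketch below reads the first polynomial as n^80, which is an assumption about the garbled exponent, and finds the crossover points by direct search.

```python
# n^80 versus 2^n: the polynomial is far larger for small inputs.
print(3**80)                 # already about 1.5e38 at n = 3
print(2**3)                  # 8

# Smallest n >= 2 at which 2^n exceeds n^80 (found by direct search).
n = 2
while 2**n <= n**80:
    n += 1
print("2^n overtakes n^80 at n =", n)

# n^3 versus 100 n^2: n^3 is smaller exactly when n < 100.
crossover = next(m for m in range(1, 1000) if m**3 >= 100 * m**2)
print("n^3 reaches 100 n^2 at n =", crossover)
```

Beyond the crossover points the asymptotically slower-growing function always wins, which is why the asymptotic classification is the more useful guide for large inputs.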
In general, finding an algorithm which has a lower time complexity is always more
effective than waiting for a computer to arrive that is ten times faster. Consider two
algorithms with time complexities $T_5(n) = n \log n$ and $T_6(n) = n^3$. Let $n_5$, respectively
$n_6$, be the maximum problem sizes which can be treated within one day of computer
time. If a computer is available which is ten times faster, the first algorithm can treat
approximately inputs of size $n_5 \times 10$ (if $n_5$ is large) within one day while for the second
the maximum input size grows only as $n_6 \times \sqrt[3]{10} \approx 2.15\, n_6$.
To summarize, algorithms which run in polynomial time are considered as being fast.
But there are many problems, especially optimization problems, where no polynomial-time
algorithm is known. Then one must apply algorithms where the running time
increases exponentially or even faster with the system size. This holds e.g. for the
TSP if the exact minimum tour is to be computed. The study of such problems led to
the concept of NP-completeness, which is introduced in the next section.
2.3 NP Completeness
For the moment, we will only consider recognition problems. Please remember that
these are problems for which only the answers "yes" or "no" are possible. We have
already introduced the halting problem and the correctness problem, which are not
decidable. The following example of a recognition problem, called SAT, is of more
practical interest. In the field of theoretical computer science it is one of the most
basic recognition problems. SAT was the first problem for which it was shown that many
other recognition problems can be mapped onto it. This will be explained in detail later on.
Recently SAT has attracted much attention within the physics community [3].
A boolean variable $x_i$ may only take the values 0 (false) and 1 (true). Here
we consider three boolean operations:
• NOT (negation): the clause $\overline{x_i}$ ("NOT $x_i$") is true ($\overline{x_i} = 1$), if and
  only if (iff) $x_i$ is false: $x_i = 0$.
• AND $\land$ (conjunction): the clause $x_i \land x_j$ ("$x_i$ AND $x_j$") is true, iff both
  variables are true: $x_i = 1$ AND $x_j = 1$.
• OR $\lor$ (disjunction): the clause $x_i \lor x_j$ ("$x_i$ OR $x_j$") is true, iff at least
  one of the variables is true: $x_i = 1$ OR $x_j = 1$.
We have already seen that some recognition problems are undecidable. For these
problems it has been proven that no algorithm can be provided to solve them. The k-SAT
problem is decidable, i.e. there is a so-called decision algorithm which gives for each
instance of a k-SAT problem the answer "yes" or "no". The simplest algorithm uses
the fact that each formula contains a finite number n of variables. Therefore, there
are exactly $2^n$ different assignments for the values of all variables. To check whether a
formula is satisfiable, one can scan through all possible assignments and check whether
the formula evaluates to true or to false. If for one of them the formula is true, then it
is satisfiable, otherwise not. In Tab. 2.2 all possible assignments for the variables
of $(x_2 \lor x_3) \land (\overline{x_1} \lor \overline{x_3})$ and the results for both clauses and the whole formula are
displayed. A table of this kind is called a truth table.
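The exhaustive scan over all assignments can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the encoding of a formula as a list of clauses with signed integers (k for $x_k$, -k for NOT $x_k$) and the function names are assumptions made here, and the example formulas are made up.

```python
from itertools import product


def evaluate(formula, assignment):
    """Evaluate a formula in conjunctive normal form under an assignment.

    formula: list of clauses; each clause is a list of non-zero ints,
             k for the literal x_k and -k for NOT x_k (assumed encoding).
    assignment: dict mapping variable index -> 0 or 1.
    """
    return all(
        any(assignment[abs(lit)] == (1 if lit > 0 else 0) for lit in clause)
        for clause in formula
    )


def satisfiable(formula):
    """Scan all 2^n assignments; return a satisfying one or None."""
    variables = sorted({abs(lit) for clause in formula for lit in clause})
    for values in product((0, 1), repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if evaluate(formula, assignment):
            return assignment
    return None


if __name__ == "__main__":
    # (x1 OR x2) AND (NOT x1 OR x3), a made-up example formula
    print(satisfiable([[1, 2], [-1, 3]]))
    # x1 AND NOT x1 has no satisfying assignment
    print(satisfiable([[1], [-1]]))
```

The loop over product((0, 1), repeat=n) is exactly the row-by-row scan of a truth table, which is why the running time grows as $2^n$.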
Since for each formula up to $2^n$ assignments have to be tested, this general algorithm
has an exponential time complexity (in the number of variables). Since the number of
variables is bounded by the number km (m = number of clauses), the algorithm is of
order $O(2^{km})$. But there are special cases where a faster algorithm exists. Consider
for example the 1-SAT class. Here each formula has the form $l_1 \land l_2 \land \ldots \land l_m$ where
the $l_k$ are literals, i.e. $l_k = x_i$ or $l_k = \overline{x_i}$ for some i. Since each literal has to be true so
that the formula is true, the following simple algorithm tests whether a given 1-SAT
formula is satisfiable. Its idea is to scan the formula from left to right. Variables are
set such that each literal becomes true. If a literal cannot be satisfied because the
corresponding variable is already fixed, then the formula is not satisfiable. If on the
other hand the end of the formula is reached, it is satisfiable.
algorithm 1-SAT
begin
   initially all x_i are unset;
   for i := 1, 2, ..., m do
   begin
      let k be the number of the variable occurring in literal l_i: l_i = x_k or l_i = NOT x_k;
      if x_k is unset then
         choose x_k such that l_i = true;
      else
         if literal l_i = false then
            return(no); comment not satisfiable
   end
   return(yes); comment satisfiable
end
Figure 2.4: Sample run of algorithm 1-SAT for formula $x_1 \land \overline{x_3} \land \overline{x_1} \land x_2$.
Obviously the algorithm tests whether a 1-SAT formula is satisfiable or not. Fig. 2.4
shows, as an example, how the formula $x_1 \land \overline{x_3} \land \overline{x_1} \land x_2$ is processed. In the left
column the formula is displayed and an arrow indicates the literal which is treated.
The right column shows the assignments of the variables. The first line shows the
initial situation. The first literal ($l_1 = x_1 \rightarrow k = 1$) causes $x_1 = 1$ (second line). In
the second round ($l_2 = \overline{x_3} \rightarrow k = 3$) $x_3 = 0$ is set. The variable of the third literal
($l_3 = \overline{x_1} \rightarrow k = 1$) is set already, but the literal is false. Consequently, the formula is
not satisfiable.
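A direct Python transcription of algorithm 1-SAT might look as follows. It is a sketch under assumptions: the formula is passed as a list of signed integers (k for $x_k$, -k for NOT $x_k$), and the function name one_sat and the returned pair are illustrative choices, not part of the original algorithm.

```python
def one_sat(literals):
    """Decide a 1-SAT formula l_1 AND l_2 AND ... AND l_m with one scan.

    literals: list of non-zero ints, k for x_k and -k for NOT x_k
              (an assumed encoding).
    Returns (True, assignment) if the formula is satisfiable, otherwise
    (False, assignment fixed up to the point of the conflict).
    """
    value = {}                        # variables missing from the dict are unset
    for lit in literals:
        k = abs(lit)
        wanted = 1 if lit > 0 else 0  # value of x_k that makes the literal true
        if k not in value:
            value[k] = wanted         # choose x_k such that the literal is true
        elif value[k] != wanted:
            return False, value       # literal is false: not satisfiable
    return True, value                # end of formula reached: satisfiable


if __name__ == "__main__":
    # The sample formula x1 AND (NOT x3) AND (NOT x1) AND x2
    print(one_sat([1, -3, -1, 2]))
```

For the sample formula the scan stops at the third literal with $x_1 = 1$ and $x_3 = 0$ already fixed, as in the run described above.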
The algorithm contains only one loop. The operations inside the loop take a constant
time. Therefore, the algorithm is O(m), which is clearly faster than $O(2^n)$. For 2-SAT
also a polynomial-time algorithm is known, for more details see [4]. Both 1-SAT
and 2-SAT belong to the following class of problems:
Definition: P (polynomial) The class P contains all recognition-problems for which
there exists a polynomial-time decision algorithm.
For 3-SAT problems no polynomial-time algorithm which
checks satisfiability is known. On the other hand, up to now there is no proof that
3-SAT $\notin$ P! But, since many clever people have failed to find an effective algorithm, it
is very likely that 3-SAT (and k-SAT for k > 3) is not decidable in polynomial time.
There is another class of recognition problems A, which will now be defined. For
this purpose we use certificate-checking (CC) algorithms. These are algorithms A
which get as input instances $a \in A$ like decision algorithms and additionally strings
$s = s_1 s_2 \ldots s_m$, called certificates (made from a suitable alphabet). Like decision
algorithms they halt on all inputs (a, s) and return only "yes" or "no". The meaning
of the certificate strings will become clear from the following. A new class, called NP,
can be described as follows:
The difference between P and NP is (see Fig. 2.5): for a yes-instance of a P problem
the decision algorithm answers "yes". For a yes-instance of an NP problem there exists
at least one certificate string s such that the CC algorithm answers "yes", i.e. there
may be many certificate strings s with A(a, s) = "no" even if a is a yes-instance. For
a no-instance of a P problem the decision algorithm answers "no", while for a
no-instance of an NP problem the CC algorithm answers "no" for all possible certificate
strings s. As a consequence, P is a subset of NP, since every decision algorithm can
be extended to a certificate-checking algorithm by ignoring the certificate.
The formal definition of NP is as follows:
Definition: NP (nondeterministic polynomial) A recognition problem A is in
the class NP, if there is a polynomial-time (in $|a|$, $a \in A$) certificate-checking algorithm
with the following property:
An instance $a \in A$ is a yes-instance if there is at least one certificate s with A(a, s) = yes,
for which the length $|s|$ is polynomial in $|a|$ ($\exists z: |s| \le |a|^z$).
In fact, the requirement that the length of s is polynomial in $|a|$ is redundant, since the
algorithm is allowed to run only a polynomial number of steps. During that time the
algorithm can read only a certain number of symbols from s which cannot be larger
than the number of steps itself. Nevertheless, the length requirement on s is included
for clarity in the definition.
The concept of certificate-checking seems rather strange at first. It becomes clearer if
one takes a look at k-SAT. We show k-SAT $\in$ NP:
Proof: Let $F(x_1, \ldots, x_n)$ be a boolean formula. The suitable certificate s for the
k-SAT problem represents just one assignment for all variables of the formula:
$s = s_1 s_2 \ldots s_n$, $s_i \in \{0, 1\}$. Clearly, the number of variables occurring in a formula is
bounded by the length of the formula: $|s| \le |F|$. The certificate-checking algorithm
just assigns the values to the variables ($x_i := s_i$) and evaluates the formula. This
can be done in linear time by scanning the formula from left to right, similar to the
algorithm for 1-SAT. The algorithm answers "yes" if the formula is true and "no"
otherwise. If a formula is satisfiable, then, by definition, there is an assignment of the
variables for which the formula F is true. Consequently, there is a certificate s
for which the algorithm answers A(F, s) = yes. QED
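The certificate-checking algorithm from the proof can be sketched in a few lines of Python. The encoding is an assumption made for illustration: the formula is a list of clauses with signed integers (k for $x_k$, -k for NOT $x_k$), the variables are numbered 1, ..., n without gaps, and the certificate is a string of the characters '0' and '1'.

```python
def check_certificate(formula, s):
    """Certificate-checking algorithm A(F, s) for a formula in clause form.

    formula: list of clauses; literal k means x_k, -k means NOT x_k
             (assumed encoding, variables numbered 1..n).
    s: certificate string over {'0', '1'}; s[i-1] is the value given to x_i.
    Returns "yes" if s encodes a satisfying assignment, otherwise "no".
    """
    n = max(abs(lit) for clause in formula for lit in clause)
    if len(s) != n or any(ch not in "01" for ch in s):
        return "no"                   # malformed certificates are rejected
    for clause in formula:            # a single pass over the formula
        if not any((s[abs(lit) - 1] == "1") == (lit > 0) for lit in clause):
            return "no"               # one false clause makes F false
    return "yes"


if __name__ == "__main__":
    F = [[1, 2], [-1, 3]]             # (x1 OR x2) AND (NOT x1 OR x3), made up
    for s in ("010", "100", "111"):
        print(s, check_certificate(F, s))
```

Note that the checker never searches for an assignment. It only verifies the one it is given, which is why it runs in time linear in the length of the formula.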
The name "nondeterministic polynomial" comes from the fact that one can show
that a nondeterministic algorithm can decide NP problems in polynomial time. A
normal algorithm is deterministic, i.e. from a given state of the algorithm, which
consists of the values of all variables and the program line where the execution is
at one moment, the next state follows in a deterministic way. Nondeterministic
algorithms are able to choose the next state randomly. Thus, a machine executing
nondeterministic algorithms is just a theoretical construct, which in reality cannot be
built yet¹. The definition of NP relies on certificate-checking algorithms. For each CC
algorithm an equivalent nondeterministic algorithm can be formulated in the following
way. The steps where a CC algorithm reads the certificate can be replaced by the
nondeterministic changes of state. An instance is a yes-instance if there is at least one
run of the nondeterministic algorithm which answers "yes" with the instance as input.
Thus, both models are equivalent.
As we have stated above, different recognition problems can be mapped onto each
other. Since all algorithms which we encounter in this context are polynomial, only
transformations are of interest which can be carried through in polynomial time as well
(as a function of the length of an instance). The precise definition of the transformation
is as follows:
Definition: Polynomial-time reducible Let A, B be two recognition problems.
We say A is polynomial-time reducible to B ($A \le_p B$), if there is a polynomial-time
algorithm f such that
x is a yes-instance of A $\Leftrightarrow$ f(x) is a yes-instance of B
Fig. 2.6 shows how a certificate-checking algorithm for B can be transformed into a
certificate-checking algorithm for A using the polynomial-time transformation f.
As an example we will prove SAT $\le_p$ 3-SAT, i.e. every boolean formula F can be written
as a 3-CNF formula $F_3$ such that $F_3$ is satisfiable iff F is satisfiable. The transformation
runs in polynomial time in $|F|$.
¹ Quantum computers can be seen as a realization of nondeterministic algorithms.
[Figure 2.6: construction of a certificate-checking algorithm for A from the transformation f and a certificate-checking algorithm for B.]
There is a special subset of NP problems which reflects in some sense the general
attributes of all problems in NP: it is possible to reduce all problems of NP to them.
This leads to the following definition:
It can be shown that SAT is NP-complete. The proof is quite technical. It requires
an exact machine model for certificate-checking algorithms. The basic idea is: each
problem in NP has a certificate-checking algorithm. For that algorithm, a given
instance and a given certificate, an equivalent boolean formula is constructed which is
only satisfiable if the algorithm answers "yes" given the instance and the certificate.
For more details see [4, 2, 5].
Above we have outlined a simple certificate-checking algorithm for SAT. Consequently,
using the transformation from the proof that SAT is NP-complete, one can construct a
certificate-checking algorithm for every problem in NP. In practice this is never done,
since it is always easier to invent such an algorithm for each problem in NP directly.
Since SAT is NP-complete, $A \le_p$ SAT for every problem $A \in$ NP. Above we have shown
SAT $\le_p$ 3-SAT. Since the $\le_p$-relation is transitive, we obtain $A \le_p$ 3-SAT. Consequently,
3-SAT is NP-complete as well. There are many other problems in NP which are
NP-complete. To prove that a problem $B \in$ NP is NP-complete, it is sufficient to show
$A \le_p B$ for any known NP-complete problem A, e.g. 3-SAT $\le_p B$. The list of
NP-complete problems is growing steadily. Several of them can be found in [6].
As we have said above, P is a subset of NP. If for one NP-complete problem a
polynomial-time decision algorithm is found one day, then, using the polynomial-time
reducibility, this algorithm can decide every problem in NP in polynomial
time. Consequently, P = NP would hold. But for no NP-complete problem has a
polynomial-time decision algorithm been found so far. On the other hand, for no problem in NP
is there a proof that no such algorithm exists. Therefore the so-called P-NP problem,
whether P $\neq$ NP or P = NP, is still unsolved, but P = NP seems to be very unlikely. We
can draw everything that we know about the different classes in a figure: NP is a
subset of the set of decidable problems. The NP-complete problems are a subset of
NP. P is a subset of NP. If we assume P $\neq$ NP, then problems in P are not NP-complete
(see Fig. 2.7).
In this section we have concentrated on recognition problems. Optimization problems
are not recognition problems since one tries to find a minimum or maximum. This is
not a question that can be answered by "yes" or "no". But every problem $\min H(\sigma)$
can be transformed into a recognition problem of the form
"given a value K, is there a $\sigma$ with $H(\sigma) \le K$?"
It is very easy to see that the recognition problems for the TSP and the spin-glass
ground state are in NP: given an instance of the problem and given a tour/a spin
configuration (the certificates), the length of the tour/energy of the configuration can
be computed in polynomial time. Thus, the question "is $H(\sigma) \le K$?" can be answered
easily.
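For the TSP, checking such a certificate amounts to summing up the tour length. The following Python sketch illustrates this under assumptions that are not part of the text: cities are points in the plane with Euclidean distances, a tour is a list of city indices, and the function names are hypothetical.

```python
import math


def tour_length(cities, tour):
    """Length of the closed tour visiting the cities in the given order."""
    return sum(
        math.dist(cities[tour[i]], cities[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


def check_tsp_certificate(cities, tour, K):
    """Answer "yes" iff `tour` is a valid round trip of length <= K.

    cities: list of (x, y) coordinates (assumed Euclidean instance).
    tour:   the certificate, a list of city indices.
    """
    if sorted(tour) != list(range(len(cities))):
        return "no"                   # not a permutation of all cities
    return "yes" if tour_length(cities, tour) <= K else "no"


if __name__ == "__main__":
    square = [(0, 0), (1, 0), (1, 1), (0, 1)]   # made-up instance
    print(check_tsp_certificate(square, [0, 1, 2, 3], K=4))
    print(check_tsp_certificate(square, [0, 2, 1, 3], K=4))
```

Both the permutation test and the summation take polynomial time in the number of cities, so the recognition version of the TSP is in NP. Finding a tour that passes the test is the hard part.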
If the corresponding recognition problem for an optimization problem is NP-complete,
then the optimization problem is called NP-hard. In general, these are problems which
are harder than problems from NP or which are not recognition problems, but every
problem in NP can be reduced to them. This leads to the definition:
Definition: NP-hard Let A be a problem such that every problem in NP is
polynomial-time reducible to A. If $A \notin$ NP then A is called NP-hard.
From what we have learned in this section, it is clear that for an NP-hard problem
no algorithm is known which finds the optimum in polynomial time. Otherwise the
corresponding recognition problem could be solved in polynomial time as well, by just
testing whether the optimum obtained in this way is lower than K or not.
The TSP and the search for a ground state of spin glasses in three dimensions are
both NP-hard. Thus, only algorithms with exponentially increasing running time are
available, if one is interested in obtaining the exact minimum. Unfortunately this
is true for most interesting optimization problems. Therefore, clever programming
techniques are needed to implement fast algorithms. Here "fast" means that the running
time grows slowly, but still exponentially. In the next section, some of the most basic
programming techniques are presented. They are not only very useful for the implementation of
optimization methods but for all kinds of algorithms as well.
with the while-statement from pidgin Algol. Sometimes it is more convenient to use
the concept of recursion, especially if the quantity to be calculated has a recursive
definition. One speaks of recursion if an algorithm calls itself. As a simple example we
present an algorithm for the calculation of the factorial n! of a natural number n > 0.
Its recursive definition is given by:
$n! = \begin{cases} 1 & \text{if } n = 1 \\ n \times (n-1)! & \text{else} \end{cases}$
This definition can be translated directly into an algorithm:
algorithm factorial(n)
begin
   if n ≤ 1 then
      return 1;
   else
      return n*factorial(n - 1);
end
[Figure: hierarchy of recursive calls for the calculation of factorial(4).]
algorithm factorial2(n)
begin
   t := 1; comment this is a counter
   f := 1; comment here the result is stored
   while t ≤ n do
   begin
      f := f * t;
      t := t + 1;
   end
   return f;
end
The sequential factorial algorithm contains one loop which is executed n times. Thus,
the algorithm runs in O(n) steps. For the recursive variant the time complexity is
not so obvious. For the analysis of recursive algorithms, one has to write down a
recurrence equation for the execution time. For n = 1, the factorial algorithm takes
constant time T(1). For n > 1 the algorithm takes the time T(n - 1) for the execution
of factorial(n - 1) plus another constant time for the multiplication. Here and in the
following, let C be the maximum of all occurring constants. Then, we obtain
$T(n) = \begin{cases} C & \text{for } n = 1 \\ C + T(n-1) & \text{for } n > 1 \end{cases}$
One can verify easily that T(n) = Cn is the solution of the recurrence, i.e. both recursive
and sequential algorithms have the same asymptotic time complexities. There are
many examples where a recursive algorithm is asymptotically faster than a straightforward
sequential solution, e.g. see [7].
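Both variants translate almost literally into Python. The sketch below keeps the names factorial and factorial2 from the pseudocode; everything else (Python itself, the use of <= in the base case) is a choice made for this illustration.

```python
def factorial(n):
    """Recursive variant, following n! = n * (n-1)! with 1! = 1."""
    if n <= 1:
        return 1
    return n * factorial(n - 1)      # one recursive call per level: T(n) = C + T(n-1)


def factorial2(n):
    """Sequential variant using a loop with counter t."""
    t = 1                             # counter
    f = 1                             # here the result is stored
    while t <= n:
        f = f * t
        t = t + 1
    return f


if __name__ == "__main__":
    for n in (1, 4, 10):
        print(n, factorial(n), factorial2(n))
```

Each call of factorial(n) triggers exactly one further call until n = 1 is reached, which is the recurrence T(n) = C + T(n - 1) written as code.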
An important area for the application of efficient algorithms are sorting problems.
Given n numbers (or strings or complex data structures) $A_i$ (i = 1, 2, ..., n) we want
to find a permutation $B_i$ of them such that they are sorted in (say) increasing order:
$B_i \le B_{i+1}$ for all i < n. There is a simple recursive algorithm for sorting elements.
Please note that the sorting is performed within the array $A_i$ they were provided in.
Here this means the values of the numbers are not taken as arguments, i.e. there
are no local variables which take the values. Instead the variables (or their memory
positions) themselves are passed to the following algorithm. Therefore, the algorithm
can change the original data. The basic idea of the algorithm is to look for the largest
element of the array, store it in the last position, and sort the first n - 1 elements by
a recursive call. The algorithmic presentation follows on the next page.
2.4 Programming Techniques
algorithm sort(n, {A_1, ..., A_n})
begin
   if n = 1 then
      return;
   max := A_1; comment will contain maximum of all A_i
   pos := 1; comment will contain position of maximum
   t := 2;
   while t ≤ n do comment look for maximum
   begin
      if A_t > max then
      begin
         max := A_t;
         pos := t;
      end
      t := t + 1;
   end
   exchange maximum and last element;
   sort(n - 1, {A_1, ..., A_{n-1}});
end
In Fig. 2.9 it is shown how the algorithm runs with input (6, {5,9,3,6,2,1}). On the
left side the recursive sequence of calls is itemized. The maximum element for each
call is marked. In the right column the actual state of the array before the next call
is displayed.
Figure 2.9: Run of the sorting algorithm with input (6, {5,9,3,6,2,1}).
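The recursive sorting algorithm can be sketched in Python as shown below. It works in place on a list, which corresponds to passing the variables themselves rather than their values. The 0-based indices and the optional argument n are assumptions of this sketch, not part of the pseudocode.

```python
def sort(a, n=None):
    """Sort the first n elements of list a in place (recursive variant)."""
    if n is None:
        n = len(a)
    if n <= 1:
        return
    pos = 0                               # position of the maximum found so far
    for t in range(1, n):                 # look for the maximum of a[0..n-1]
        if a[t] > a[pos]:
            pos = t
    a[pos], a[n - 1] = a[n - 1], a[pos]   # exchange maximum and last element
    sort(a, n - 1)                        # sort the remaining n-1 elements


if __name__ == "__main__":
    data = [5, 9, 3, 6, 2, 1]             # the input used in Fig. 2.9
    sort(data)
    print(data)
```

Since Python limits the recursion depth, this transcription is only meant for short lists; it mirrors the structure of the algorithm rather than aiming at practical use.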
The algorithm takes linear time to find the maximum element plus the time for sorting
n - 1 numbers, i.e. for the time complexity T(n) one obtains the following recurrence:
$T(n) = \begin{cases} C & (n = 1) \\ Cn + T(n-1) & (n > 1) \end{cases}$
Obviously, the solution of the recurrence is $O(n^2)$. Compared with algorithms for NP-hard
problems this is very fast. But there are sorting algorithms which can do even
      C_i := A_{i+n/2};
   end
   mergesort(n/2, {B_1, ..., B_{n/2}}); comment sort subsets
   mergesort(n/2, {C_1, ..., C_{n/2}});
   x := 1; y := 1; comment positions of the smallest remaining elements
   for i := 1, ..., n do comment merge sorted subsets
      if x ≤ n/2 AND (y > n/2 OR B_x < C_y) then
         A_i := B_x; x := x + 1;
      else
         A_i := C_y; y := y + 1;
end
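Since only the second half of the mergesort pseudocode is shown above, the following Python sketch fills in the whole divide-and-conquer scheme in the usual way. It is an illustration under assumptions: it returns a sorted copy instead of sorting in place, and it also handles lists whose length is not a power of two.

```python
def mergesort(a):
    """Return a sorted copy of list a using divide-and-conquer."""
    n = len(a)
    if n <= 1:
        return list(a)
    b = mergesort(a[: n // 2])            # sort both subsets recursively
    c = mergesort(a[n // 2:])
    merged = []
    x = y = 0                             # positions of the smallest remaining elements
    for _ in range(n):                    # merge the sorted subsets
        if x < len(b) and (y == len(c) or b[x] < c[y]):
            merged.append(b[x])
            x += 1
        else:
            merged.append(c[y])
            y += 1
    return merged


if __name__ == "__main__":
    print(mergesort([5, 9, 3, 6, 2, 1]))
```

The merge loop takes the next element from b only while b is not exhausted, and it stops comparing once c is exhausted. This is the same guard as in the condition of the if-statement in the pseudocode.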
Thus, e.g. fib(4) = fib(3) + fib(2) = (fib(2) + fib(1)) + fib(2) = 3, fib(5) = fib(4) +
fib(3) = 3 + 2 = 5. The function grows very rapidly: fib(10) = 55, fib(20) = 6765,
fib(30) = 832040, fib(40) > $10^8$. Let us assume that this definition is translated directly
into a recursive algorithm. Then a call to fib(n) would call fib(n - 1) and fib(n - 2).
The recursive call of fib(n - 1) would call again fib(n - 2) and fib(n - 3) [which is called
from the two calls of fib(n - 2), etc.]. The total number of calls increases rapidly, even
more than fib(n) itself increases. In Fig. 2.11 the top of a hierarchy of calls is shown.
Obviously, every call to fib with a specific argument is performed frequently, which is
definitely a waste of time.
[Figure 2.11: Top of the hierarchy of calls for the recursive calculation of fib(n).]
Instead, one can apply the principle of dynamic programming. The basic idea is
to start with small problems, solve them and store them for later use. Then one
proceeds with larger problems by using divide-and-conquer. If
for the solution of a larger problem a smaller one is necessary, it is already available.
Since the algorithm contains just one loop it runs in O(n) time.
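The difference between the direct recursive translation and dynamic programming can be sketched in Python as follows. The function names and the call counter are assumptions introduced for illustration; fib(1) = fib(2) = 1 is assumed as the starting point, consistent with fib(4) = 3 above.

```python
def fib_recursive(n, counter):
    """Direct translation of fib(n) = fib(n-1) + fib(n-2).

    counter is a one-element list; counter[0] counts the calls performed.
    """
    counter[0] += 1
    if n <= 2:
        return 1
    return fib_recursive(n - 1, counter) + fib_recursive(n - 2, counter)


def fib_dynamic(n):
    """Dynamic programming: solve small problems first and store the results."""
    f = [0, 1, 1] + [0] * max(0, n - 2)   # f[k] will hold fib(k)
    for k in range(3, n + 1):             # just one loop: O(n) time
        f[k] = f[k - 1] + f[k - 2]
    return f[n]


if __name__ == "__main__":
    calls = [0]
    print(fib_recursive(20, calls), "computed with", calls[0], "calls")
    print(fib_dynamic(20), "computed with one loop")
```

In the recursive variant the same small subproblems are recomputed over and over, whereas the dynamic programming variant looks each of them up in the list f.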
The last basic programming principle which is presented here is backtracking. This
method is applied when there is no direct way to compute a solution. This is typical for
many optimization problems. Remember the simple TSP algorithm which constructs
a feasible tour, but which does not guarantee to find the optimum. In order to improve
the method, one has to try some (sub-)solutions and throw them away if they are not
good enough. This is the basic principle of backtracking.
In the following we will present a backtracking algorithm for the solution of the
N-queens problem.
The algorithm starts in the last column and places a queen. Next a queen is placed in
the second last column by a recursive call and so forth. If all columns are filled a valid
configuration has been found. If at any stage it is not possible to place any further
queen in a given column then the backtracking step is performed: the recursive call
finishes, the queen which was set in the recursion step before is removed. Then it is
placed elsewhere and the algorithm proceeds again.
algorithm queens(n)
begin
   if n = 0 then
   begin
      print array Q_1, ..., Q_N; comment problem solved
      return;
   end
   for i := 1, ..., N do
   begin
      Q_n := i;
      if queen n does not check against any other then
         queens(n - 1);
   end
   Q_n := 0;
   return;
end
In Fig. 2.12 it is shown how the algorithm solves the problem for N = 4. It starts
with a queen in column 4 and row 1. Then queens(3) is called. The positions where
no queen is allowed are marked with a cross. For the third column no queens in row
1 and row 2 are allowed. Thus, a queen is placed in row 3 and queens(2) is called. In
the second column it is not possible to place a queen. Hence, the call to queens(2)
finishes. The queen in column 3 is placed on a deeper row (second line in Fig. 2.12).
Then, by calling queens(2), a queen is placed in row 2. Now, no queen can be placed
in the first column. Since there was only one possible position in column 2, the queen
is removed and the call finishes. Now, both possible positions for the queen in column
3 have been tried. Therefore, the call for queens(3) finishes as well and we are back at
queens(4). Now, the queen in the last column is placed in the second row (third line
in Fig. 2.12). From here it is straightforward to place queens in all columns and the
algorithm succeeds.
Although this algorithm avoids many "dead ends" it still has an exponential time
complexity as a function of N. This is quite common for many hard optimization
problems. More sophisticated algorithms, which we will encounter later, try to reduce
the number of configurations visited even more.
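The backtracking scheme can be tried out with the following Python sketch. It follows the structure of algorithm queens, with two deviations that are assumptions of this illustration: it stops at the first valid configuration instead of printing every one, and it returns the configuration as a list instead of printing it.

```python
def queens(N):
    """Return the first solution found as a list Q, where Q[c] is the row
    (1..N) of the queen in column c+1, or None if there is no solution."""
    Q = [0] * N                          # 0 means: no queen in this column yet

    def safe(col):
        """True if the queen in column col attacks no queen in the columns > col."""
        for other in range(col + 1, N):
            if Q[other] == Q[col] or abs(Q[other] - Q[col]) == other - col:
                return False
        return True

    def place(n):
        """Place queens in columns n, n-1, ..., 1, i.e. the last column first."""
        if n == 0:
            return True                  # all columns filled
        for row in range(1, N + 1):
            Q[n - 1] = row
            if safe(n - 1) and place(n - 1):
                return True
        Q[n - 1] = 0                     # backtracking step: remove the queen
        return False

    return Q if place(N) else None


if __name__ == "__main__":
    print(queens(4))
    print(queens(8))
```

For N = 4 the search first tries the queen of the last column in row 1, runs into the dead ends described above, and succeeds after moving that queen to row 2.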
Bibliography
[1] Stephan Mertens, http://itp.nat.uni-magdeburg.de/~mertens/TSP/index.html
(1999)
[2] M.D. Davis, R. Sigal, and E.J. Weyuker, Computability, Complexity and Languages,
(Academic Press, San Diego 1994)
[6] M.R. Garey and D.S. Johnson, Computers and Intractability: A Guide to the
Theory of NP-Completeness, (W.H. Freeman & Company, San Francisco 1979)
[7] A.V. Aho, J.E. Hopcroft, and J.D. Ullman, The design and analysis of computer
algorithms, (Addison-Wesley, Reading (MA) 1974)
[9] K. Mehlhorn and St. Näher, The LEDA Platform of Combinatorial and Geometric
Computing, (Cambridge University Press, Cambridge 1999);
see also http://www.mpi-sb.mpg.de/LEDA/leda.html
3 Graphs
3.1 Graphs
Many optimization problems from physics or other areas can be mapped on optimization
problems on graphs. Some of these transformations will be useful later on. For
this reason a short introduction to graph theory is given here. Only the basic definitions
and algorithms are presented. For more information, the reader should consult
a specialized textbook on graph theory, e.g. Refs. [1, 2, 3]. We will sometimes only
mention a real world application which can be treated with a given graph theoretical
concept. The precise definitions of the applications will be given later on.
Consider a map of a country where several towns, villages or other places are connected
by roads or railways. Mathematically this setting can be represented by a graph. A
graph consists of nodes and edges. The nodes represent the towns, villages or other
places and the edges describe the roads or railways. Formally, the definition of a graph
is given by:
Definition: Graph A graph G is an ordered pair G = (V, E) where V is a set and
$E \subset V \times V$. An element of V is called a vertex or node. An element $(i, j) \in E$ is called
an edge or arc. In a physical context, where edges represent interactions between
particles, edges are often called bonds.
If the pairs $(i, j) \in E$ are ordered pairs, then G is called a directed graph. Otherwise G
is called undirected; then (i, j) and (j, i) denote the same edge. A graph G' = (V', E')
is called a subgraph of G if it has the properties $V' \subset V$ and $E' \subset E$ ($E' \subset V' \times V'$ by
definition). The empty graph $(\emptyset, \emptyset)$ is a subgraph of all graphs.
First, some further notations are given which apply to both directed and undirected
graphs. Usually we restrict ourselves to finite graphs, i.e. the sets of nodes and edges
are finite. In this case we denote by n = |V| the number of vertices and by m = |E|
the number of edges. Let $i \in V$ be a vertex. If $(i, j) \in E$ we call j a neighbor of i (and
vice versa). Both nodes are adjacent to each other. The set N(i) of neighbors of i is
given by $N(i) = \{ j \mid (i, j) \in E \lor (j, i) \in E \}$. The degree d(i) of node i is the cardinality
of the set of neighbors: d(i) = |N(i)|. A vertex with degree 0 is called isolated.
A path from $v_1$ to $v_k$ is a sequence of vertices $v_1, v_2, \ldots, v_k$ which are connected by
edges: $(v_i, v_{i+1}) \in E \;\; \forall i = 1, 2, \ldots, k - 1$. The length of the path is k - 1. If $v_1 = v_k$
the path is called closed. If no node except the first and the last one appears twice in
a closed path, it is called a cycle. A set of nodes is called a connected component, if
it contains all nodes where from each node a path to each other node of the set exists.
A graph is called connected if it has only one connected component.
Figure 3.1: An undirected graph.
Example: Graph
In Fig. 3.1 the graph G = ({1,2,3,4,5,6}, {(1,3), (3,4), (4,1), (4,2), (6,1)})
is shown. The nodes are represented by circles and the edges by lines connecting
the circles. The graph has n = 6 vertices and m = 5 edges; e.g., nodes
3 and 4 are adjacent. The set of neighbors of vertex 1 is N(1) = {3,4,6}.
Thus, node 1 has degree 3 while node 2 has only degree 1. Node 5 is isolated.
The graph contains the path 6, 1, 4, 3 from node 6 to node 3 of length 3 and
the cycle 1, 3, 4, 1.
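The quantities of this example can be reproduced with a small Python sketch. The representation of the graph by a vertex list and an edge list, and the function names neighbors and connected_components, are assumptions chosen for illustration.

```python
from collections import defaultdict


def neighbors(edges):
    """Build the neighbor sets N(i) of an undirected graph from its edge list."""
    N = defaultdict(set)
    for i, j in edges:
        N[i].add(j)
        N[j].add(i)
    return N


def connected_components(vertices, edges):
    """Return the connected components as a list of sets of nodes."""
    N = neighbors(edges)
    seen, components = set(), []
    for start in vertices:
        if start in seen:
            continue
        component, todo = set(), [start]
        while todo:                      # explore everything reachable from start
            v = todo.pop()
            if v not in component:
                component.add(v)
                todo.extend(N[v] - component)
        seen |= component
        components.append(component)
    return components


if __name__ == "__main__":
    V = [1, 2, 3, 4, 5, 6]
    E = [(1, 3), (3, 4), (4, 1), (4, 2), (6, 1)]
    N = neighbors(E)
    print("N(1) =", sorted(N[1]), " d(1) =", len(N[1]), " d(2) =", len(N[2]))
    print("components:", connected_components(V, E))
```

The isolated node 5 forms a connected component of its own, so the graph of Fig. 3.1 is not connected.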
Now some definitions are given which apply only to directed graphs. For an edge
e = (i, j), i is the head and j the tail of e. The edge e is called outgoing from
i and incoming to j. Please note that for a directed path it is important that all
edges point into the direction of the path; formally the definition is the same as in the
case of an undirected graph. A set of nodes is called a strongly connected component
(SCC), if from each of its nodes a directed path to each other node of the set exists.
In a directed graph, the outgoing and incoming edges can be counted separately. The
indegree is given by $id(i) = |\{ j \mid (j, i) \in E \}|$ and the outdegree is $od(i) = |\{ j \mid (i, j) \in E \}|$.
Obviously, for all vertices d(i) = id(i) + od(i).
Example: Tree
In Fig. 3.3 an undirected tree is shown. The root is usually displayed at the
top. Here, node 7 is the root. Node 6 is a descendant of the root and an
ancestor of node 5. Nodes 1, 2, 5, 9, 10, 11 are leaves. The tree has height 3.
The subtree which has node 3 as a root contains the nodes 2, 3, 4, 11.
For directed graphs, usually graphs without cycles are called trees only in the case
where there are only edges from levels l to l + 1 and no edges in the other direction.
Sometimes a directed graph is called a tree only if the undirected version of the graph
contains no cycles.
An important application of trees in the field of computer science are search trees.
They are used to store a collection of objects in an ordered way. Thus, an order
relation "≤" must be given for the objects. Search trees have the following properties:
• They are binary trees. This means that each node has at most two sons, called
  left and right son. The subtree which has the left (right) son as the root is
  called left (right) subtree.
• The following property is true for all objects $O_1$: in the left (right) subtree only
  objects are stored which are smaller (equal to or larger) than $O_1$ according to the
  order relation "≤".
This special organization allows the design of very fast algorithms for finding, inserting
and deleting elements within a search tree while always keeping the correct order. An
example will be given in Sec. 3.5. In Fig. 3.4 a search tree containing natural numbers
sorted in ascending order is shown.
Figure 3.4: A binary tree containing the nodes 2, 4, 11, 17, 22, 24, 33, 46, 61, 62,
63, 99 sorted in ascending order.
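A minimal Python sketch of such a search tree is given below. It is not the implementation referred to in Sec. 3.5; the class Node, the functions insert, contains and in_order, and the insertion order of the numbers are assumptions made for this illustration.

```python
class Node:
    """Node of a binary search tree holding one object (here: a number)."""

    def __init__(self, key):
        self.key = key
        self.left = None                 # root of the left subtree (smaller keys)
        self.right = None                # root of the right subtree (equal or larger keys)


def insert(root, key):
    """Insert key below root and return the (possibly new) root."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root


def contains(root, key):
    """Follow one path from the root; the cost is bounded by the tree height."""
    while root is not None:
        if key == root.key:
            return True
        root = root.left if key < root.key else root.right
    return False


def in_order(root):
    """List all keys in ascending order."""
    if root is None:
        return []
    return in_order(root.left) + [root.key] + in_order(root.right)


if __name__ == "__main__":
    root = None
    for key in [33, 11, 61, 4, 22, 46, 63, 2, 17, 24, 62, 99]:  # assumed order
        root = insert(root, key)
    print(in_order(root))
    print(contains(root, 46), contains(root, 5))
```

The shape of the tree, and therefore the cost of a search, depends on the order in which the objects are inserted. The sketch makes no attempt to keep the tree balanced.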
Similar to search trees are heaps. They are binary trees as well, but here the smallest
element of each subtree is always stored at the root of the subtree, which means that
the smallest element of the whole tree is stored at the root of the tree. This allows for
an efficient implementation of priority queues, which always give you the next element
with the currently lowest priority (remove operation). After the root element has
been removed, it is replaced by the lower of the two elements in the roots of the left
and right subtree, i.e. the lower of both elements moves up one level. In the same
way the element which has moved up is replaced by its smaller son, etc. Adding an
element to a heap is done in the following fashion. The element is added as a son of
a leaf. If it is smaller than the element of its father, both elements are swapped. This
process is continued until the current father of the new element is smaller than the new
element. Thus, eventually the new element may rise to the root, if it is smaller than
all other elements in the tree. As a consequence, the insert and remove operations
can be performed in logarithmic time.
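In Python a priority queue of this kind is available through the standard module heapq. The sketch below only illustrates the insert and remove operations; heapq stores the tree in a list and may organize the individual steps differently from the description above, and the jobs and priorities are made up.

```python
import heapq


def process_by_priority(items):
    """Insert all (priority, job) pairs into a heap and remove them again.

    Returns the pairs in the order in which the remove operation delivers
    them, i.e. with the currently lowest priority first.
    """
    heap = []
    for item in items:
        heapq.heappush(heap, item)       # insert operation, logarithmic time
    order = []
    while heap:
        order.append(heapq.heappop(heap))  # remove operation: smallest element
    return order


if __name__ == "__main__":
    jobs = [(5, "e"), (1, "a"), (3, "c"), (2, "b")]   # made-up priorities
    print(process_by_priority(jobs))
```

Removing all n elements one after the other delivers them in sorted order, using n insert and n remove operations.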
A binary tree is called complete if each node is either a leaf or has two sons. It can
easily be shown by induction that a complete tree with height h has $n = 2^{h+1} - 1$
nodes and $2^h$ leaves.
A very simple type of graph is a list. Lists can be regarded as special types of trees
which have exactly one root and exactly one leaf and every node has at most one son.
The nodes of a list are called elements, the root is called the head of the list and the
leaf is called the tail. For historic reasons lists are often drawn in a slightly different
manner to other graphs. In Fig. 3.5 a list containing 5 elements is shown.
Several special types of lists are very useful in computer science. For queues it is only
possible to add new elements at the tail and to remove elements at the head, see Fig.
3.6. They are called FIFO ("first in first out") lists as well. An application of
queues are printer queues. The job which enters the queue first will be printed
first. Other jobs which are added while a job is being printed have to wait.
Lists are called stacks if adding and removing elements takes place only at the head
(see Fig. 3.7). They behave in a LIFO ("last in first out") manner. Stacks are used
for example to realize recursion. Whenever a call to a procedure occurs the computer
has to remember where to proceed when the procedure is finished. These so-called
return addresses are stored on a stack, because the procedure which was called last
will be finished first.
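A minimal Python sketch of both list types, assuming the standard-library collections.deque as the queue and a plain Python list as the stack (the job and procedure names are purely illustrative):

```python
from collections import deque

# Queue (FIFO): add at the tail, remove at the head.
queue = deque()
for job in ["job A", "job B", "job C"]:
    queue.append(job)
first_printed = queue.popleft()   # the job that entered first

# Stack (LIFO): add and remove at the same end.
stack = []
for procedure in ["main", "f", "g"]:
    stack.append(procedure)       # "g" was called last
first_finished = stack.pop()      # and therefore finishes first
```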
3.3 Networks
Consider again the TSP (see Chap. 1). It can be represented by an undirected graph
if one identifies the cities with nodes and the connections between cities with edges.
In order to describe the TSP completely, the distances between different cities have to
be represented as well. This can be done by introducing a function $d : E \to \mathbb{R}$ from
edges to real numbers. For an edge $e = (i, j) \in E$ the distance between the cities $i$
and $j$ is given by $d(e)$.
Sometimes functions $f : V \to A$ ($A$ an arbitrary set) on vertices are useful as well. An
arbitrary function on vertices or on edges is called a labeling. A graph together with a
labeling is called a labeled graph. Typical examples for labelings are distances, costs
or capacities.
The case where the edges are labeled with capacities is so important that it has its
own name: a network. For example a system of pumps connected via pipes which can
transport some fluid like water is a network. Our kind of network contains two special
pumps: the source where the fluid enters the network and the sink where the fluid
leaves the network. The capacities of the pipes tell how much fluid can be transported
through each pipe per time unit. The exact definition reads as follows:
Definition: Network
A network is a tuple $N = (G, c, s, t)$ where
$G = (V, E)$ is a directed graph without edges of the form $(i, i)$.
$c : E \to \mathbb{R}_0^+$ is a positive labeling of the edges. Additionally, we assume
$c((i,j)) = 0$ if $(i,j) \notin E$.
$s \in V$ is a vertex called source with no incoming edges, $d_i(s) = 0$.
$t \in V$ is a vertex called sink with no outgoing edges, $d_o(t) = 0$.
In Fig. 3.8 an example of a network is shown. The capacities are the numbers written
next to the edges. Please note that it is possible to define a network via an undirected
graph as well. For the models we encounter here, directed networks are sufficient.
An actual flow of fluid through the network can be described by introducing another
labeling $f : V \times V \to \mathbb{R}$. A flow through an edge is always bounded by its capacity.
For technical reasons, negative values of the flow are allowed as well. A negative value
$f(i, j) < 0$ means a positive flow from node $j$ to node $i$ and vice versa. Furthermore,
the flow is conserved in all nodes, except the source and the sink. In total we obtain
[by writing $f_{ij} \equiv f(i,j)$, $c_{ij} \equiv c((i,j))$]
capacity: $f_{ij} \le c_{ij}$ for all $i, j \in V$,
antisymmetry: $f_{ij} = -f_{ji}$ for all $i, j \in V$,
conservation: $\sum_j f_{ij} = 0$ for all $i \in V \setminus \{s, t\}$.
Please note that by combining the first and second properties $f_{ji} \ge -c_{ij}$ is obtained.
The total flow through the graph is given by $f_0 = \sum_j f_{sj} = -\sum_j f_{tj}$. Determining
the maximum possible amount which can flow from $s$ to $t$ through a network, called the
maximum flow, is an optimization problem. For example, the maximum flow for the
network in Fig. 3.8 is eight. Later we will see that the determination of the maximum
flow can be solved in polynomial time. In physics, we apply these methods to calculate
ground states of disordered magnetic materials such as random field systems or diluted
antiferromagnets.
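The three flow properties (bounded by capacity, antisymmetric, conserved away from source and sink) can be checked mechanically. The following Python sketch assumes nodes numbered 0, ..., n-1 and dictionaries keyed by node pairs; the function names and the small example network are hypothetical and are not the network of Fig. 3.8.

```python
def antisymmetric(edge_flow):
    """Expand positive edge flows {(i, j): value} into an antisymmetric labeling f."""
    flow = {}
    for (i, j), value in edge_flow.items():
        flow[(i, j)] = flow.get((i, j), 0.0) + value
        flow[(j, i)] = flow.get((j, i), 0.0) - value
    return flow


def is_valid_flow(n, cap, flow, s, t, eps=1e-12):
    """Check capacity, antisymmetry and conservation for nodes 0..n-1."""
    def c(i, j):
        return cap.get((i, j), 0.0)   # c((i, j)) = 0 if (i, j) is not an edge

    def f(i, j):
        return flow.get((i, j), 0.0)

    for i in range(n):
        for j in range(n):
            if f(i, j) > c(i, j) + eps:           # bounded by the capacity
                return False
            if abs(f(i, j) + f(j, i)) > eps:      # f_ij = -f_ji
                return False
    for i in range(n):
        if i not in (s, t) and abs(sum(f(i, j) for j in range(n))) > eps:
            return False                          # conservation violated
    return True


def total_flow(n, flow, s):
    """Total flow f_0: everything that leaves the source s."""
    return sum(flow.get((s, j), 0.0) for j in range(n))


# Hypothetical example: source 0, sink 3.
cap = {(0, 1): 3, (0, 2): 2, (1, 2): 1, (1, 3): 2, (2, 3): 3}
good = antisymmetric({(0, 1): 3, (0, 2): 2, (1, 2): 1, (1, 3): 2, (2, 3): 3})
bad = antisymmetric({(0, 1): 4, (1, 3): 4})   # exceeds the capacity of (0, 1)
```

Storing the flow as an antisymmetric labeling means that the conservation check is a plain sum over all partners of a node, exactly as in the formula above.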
The minimum-cost-flow problem is related to the maximum-flow problem. Here,
additional costs $d : E \to \mathbb{R}$ and a total flow $f_0$ are given. $d(e)$ specifies the cost
of sending one unit of flow through edge $e$. More generally, the cost may depend
arbitrarily, but in a convex way, on the flow through an edge. The minimum-cost-flow
problem is to find a flow of minimum total cost $\sum_e d(e) f(e)$, which is compatible
with the capacity and conservation constraints and has the total value $f_0$. Later it will be
shown how ground states of solid-on-solid systems or of vortex glasses (from the theory
of superconductivity) can be obtained by minimizing cost flows. For this problem
polynomial-time algorithms are known as well. The case $f_0 = 0$ frequently occurs.
Please note that the optimum flow need not vanish inside the network even in this case,
since the costs can take negative values, thus making circular flows favorable.
simple if along with each element two pointers are stored: one to the next element and one
to the preceding element. In this case the list is called doubly linked. A drawback of
the realization with pointers is that it is not possible to access elements of the list (say
the 20th element) directly. One has to start at the head or the tail and go through
the list until the element is reached. Furthermore, this type of realization consumes
more memory, since not only the elements but the pointers have to be stored as well.
The situation which we encounter here is typical for computer science. There are no
perfect data structures. The application has to be taken into account to find the best
realization.
Figure 3.10: Representation of the tree from Fig. 3.3 via nodes containing arrays of
pointers. Here each array contains at most three entries per node.
This is true for trees as well. If the maximum number of sons for each node is known,
usually the following type of realization is used, which is similar to the representation
of lists via pointers. Along with the information which is to be stored in a node, an
array containing pointers to the sons is kept. If a node has fewer than the maximum
number of sons, some pointers contain the value NULL. In Fig. 3.10 a realization
of the tree from Fig. 3.3 is shown.
For binary trees (among them heaps) there is a very simple realization using one array
T. The basic idea is to start at the root and go through each level from left to right.
This gives us the order in which nodes are stored in the array. One obtains:
The root is stored in T[1].
For each node T[k] the left son is stored in T[2k] and the right son is kept in
T[2k+1].
In Fig. 3.11 the array representation of the search tree from Fig. 3.4 is shown. This
type of tree realization is comparable to the array representation of lists: it is very
easy to build up the structure and to access nodes but insert and delete operations
are time consuming.
We finish this section by explaining two methods to represent arbitrary graphs. Let
$G = (V, E)$ be a graph. Without loss of generality we assume $V = \{1, 2, \ldots, n\}$.
An adjacency matrix $A = (a_{ij})$ for $G$ is an $n \times n$ matrix with entries 0, 1, where $n$ is
the number of nodes. In the case $G$ is directed we have:
$a_{ij} = \begin{cases} 1 & \text{for } (i,j) \in E \\ 0 & \text{else} \end{cases}$
For an undirected graph, the adjacency matrix $A$ contains nonzero elements $a_{ij} =
a_{ji} = 1$ for each edge $(i,j) \in E$, i.e. in this case the matrix is symmetric.
The adjacency matrices for the example graphs of Sec. 3.1 are shown in Fig. 3.12.
[Figure 3.12: adjacency matrices of the example graphs; panels: undirected, directed]
The advantage of the matrix representation is that one can access each edge within
a constant amount of computer time. On the other hand, an adjacency matrix needs
$O(n^2)$ memory elements to be stored. Consequently, this choice is inefficient for storing
sparse graphs, where the number of edges is $m \in O(n)$.
For sparse graphs, it is more efficient to use adjacency lists: for each node $i$ a list $L_i$
is stored which contains all its neighbors $N(i)$. Hence, for directed graphs, $L_i$ contains
all vertices $j$ with $(i, j) \in E$ while for an undirected graph it contains all vertices
$j$ with $(i, j) \in E$ or $(j, i) \in E$. Please note that the elements in the list can be in
arbitrary order. The list representation uses only $O(m)$ memory elements. With this
realization it is not possible to access an edge $(i, j)$ directly, since one has to scan
through all elements of $L_i$ to find it. But for most applications direct access is not
necessary, so this realization of graphs is widely used. A similar method is applied in
the LEDA library package [5].
Figure 3.13: Adjacency lists for the example graphs. The left lists represent the
case when G is regarded as an undirected graph, while the right are for the case of a
directed graph.
The adjacency lists for the sample graphs from above are shown in Fig. 3.13.
For a directed graph, it is sometimes very convenient to have all incoming edges for
a given vertex $j$ available. Then one can store a second set of lists $K_j$ containing all
nodes $i$ with $(i, j) \in E$. The lists $K_j$ for the directed version of the sample graph are
shown in Fig. 3.14.
The methods presented above can easily be extended to implement labeled graphs.
Labels of vertices can be represented by arrays. For labels of edges one can either use
matrices or store the labels along with the list elements representing the edges.
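Both representations can be sketched in a few lines of Python. The function names and the three-node example are assumptions made for illustration; they are not the sample graphs of Sec. 3.1. Nodes are numbered 1, ..., n as in the text, and the second dictionary returned by adjacency_lists plays the role of the lists $K_j$ of incoming edges.

```python
def adjacency_matrix(n, edges, directed=True):
    """n x n matrix of 0/1 entries for nodes 1..n; needs O(n^2) memory."""
    a = [[0] * n for _ in range(n)]
    for i, j in edges:
        a[i - 1][j - 1] = 1
        if not directed:
            a[j - 1][i - 1] = 1   # symmetric matrix for an undirected graph
    return a


def adjacency_lists(n, edges, directed=True):
    """Lists L_i of neighbors and, for directed graphs, lists K_j of incoming nodes."""
    out = {i: [] for i in range(1, n + 1)}
    inc = {i: [] for i in range(1, n + 1)}
    for i, j in edges:
        out[i].append(j)
        if directed:
            inc[j].append(i)
        else:
            out[j].append(i)
    return out, inc


edges = [(1, 2), (1, 3), (3, 2)]
A = adjacency_matrix(3, edges)                 # direct access: A[i-1][j-1]
L, K = adjacency_lists(3, edges)               # O(m) memory
L_undirected, _ = adjacency_lists(3, edges, directed=False)
```

Testing for an edge (i, j) is a single lookup in the matrix, whereas with lists one has to scan $L_i$, which mirrors the trade-off described above.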
The algorithm performs one descent into the tree. Hence, its time complexity is $O(h)$
if $h$ is the height of the search tree. If the search tree is complete, the height is
$h \in O(\log n)$, where $n$ is the number of elements in the tree.
The find algorithm can be directly extended to insert elements into the tree: if the
algorithm reaches an empty subtree, then the element is not stored in the tree. Thus, a
node containing the object can be added at the position where the search has stopped.
Consider as an example the case where number 21 is to be inserted into the tree from
Fig. 3.15. At node 22 the algorithm branches to the left subtree, but this tree is empty.
Therefore, a node with the number 21 will be added as the left son of the node 22.
Figure 3.16: A binary search tree built by inserting numbers 33, 46, 61, 62, 63, 99
in the given order using a simple insert operation.
This simple algorithm for inserting new objects has a major drawback. Assume that
numbers sorted in increasing order are inserted into a tree which is empty at the
beginning. During the execution of the algorithm the right subtree is always taken.
Consequently, the tree which is built in this way is in fact a list (see Fig. 3.16). In
this case the height of the tree is equal to the number of elements, i.e. the search for
an object takes $O(n)$ time instead of logarithmic time.
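The effect can be reproduced with a short Python sketch of the simple insert operation. The class and function names are assumptions made here, and the height is counted as the number of nodes on the longest path from the root, so that a list of six elements has height six.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Simple (unbalanced) insert: descend as in find, attach at the empty subtree."""
    if root is None:
        return Node(key)
    node = root
    while True:
        if key == node.key:
            return root                      # already stored
        branch = "left" if key < node.key else "right"
        child = getattr(node, branch)
        if child is None:
            setattr(node, branch, Node(key))
            return root
        node = child


def height(root):
    """Number of nodes on the longest path from the root to a leaf."""
    best, stack = 0, [(root, 1)]
    while stack:
        node, depth = stack.pop()
        if node is not None:
            best = max(best, depth)
            stack.append((node.left, depth + 1))
            stack.append((node.right, depth + 1))
    return best


def build(keys):
    root = None
    for key in keys:
        root = insert(root, key)
    return root


sorted_height = height(build([33, 46, 61, 62, 63, 99]))   # degenerates into a list
mixed_height = height(build([61, 46, 63, 33, 62, 99]))    # same keys, other order
```

With the keys of Fig. 3.16 inserted in increasing order, sorted_height equals the number of elements, while the second insertion order gives a tree of height three.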
In order to avoid this trouble the tree can always be kept balanced. This means that
for every node the heights of the left and right subtree differ at most by one. There are
several clever algorithms which assure that a search tree remains balanced. The main
idea is to reorganize the tree in case of imbalance while keeping the sorted order. More
details can be found in [6, 4]. For balanced trees, it is guaranteed that the height of
the tree grows only logarithmically with the number of nodes $n$. Thus, all operations
like insert, search and delete can be realized in $O(\log n)$.
Figure 3.17: An undirected graph containing an Euler cycle (left) and a Hamilton
cycle (right). The numbers next to the edges give one possible order of the edges
within an Euler cycle.
The first example is about cycles. Let $G = (V, E)$ be an undirected graph. An Euler
cycle is a cycle which contains all edges of the graph exactly once. It was shown by
Euler in 1736 that an undirected graph contains an Euler cycle if and only if each vertex
has an even degree. Thus, the recognition problem, whether an undirected graph has
an Euler cycle or not (EC), can be decided in polynomial time. The first algorithm
for constructing an Euler cycle was presented in 1873 [7]. A Hamilton cycle is a cycle
which contains each vertex exactly once. The problem of whether an undirected graph
contains a Hamilton cycle or not (HC) is NP-complete. A proof can be found in [8];
the main idea is to show 3-SAT $\le_p$ HC. In Fig. 3.17 a graph containing both an Euler
cycle and a Hamilton cycle is shown.
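The polynomial-time decision of EC can be sketched in Python as below. The function name and the edge-list input format are assumptions. In addition to the even-degree test, the sketch checks that all vertices carrying edges are connected, a condition the degree criterion implicitly presupposes.

```python
def has_euler_cycle(n, edges):
    """Decide EC for an undirected graph with nodes 0..n-1 in O(n + m) time."""
    degree = [0] * n
    adj = [[] for _ in range(n)]
    for i, j in edges:
        degree[i] += 1
        degree[j] += 1
        adj[i].append(j)
        adj[j].append(i)
    if any(d % 2 for d in degree):
        return False                      # a vertex of odd degree rules out a cycle
    start = next((v for v in range(n) if degree[v] > 0), None)
    if start is None:
        return True                       # no edges at all: trivial case
    seen, stack = {start}, [start]
    while stack:                          # all edges must lie in one component
        v = stack.pop()
        for w in adj[v]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return all(degree[v] == 0 or v in seen for v in range(n))


square = [(0, 1), (1, 2), (2, 3), (3, 0)]
path = [(0, 1), (1, 2)]
two_triangles = [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3)]
```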
The next pair of graph problems is defined via covers of undirected graphs. An edge
cover is a subset $E' \subset E$ of edges such that each vertex is contained in at least one
edge $e \in E'$. Each graph which has no isolated vertices has an edge cover, since in
that case $E$ itself is an edge cover. A minimum edge cover is an edge cover where
$|E'|$ is minimal. In Fig. 3.18 a graph and a minimum edge cover are shown. An
algorithm which constructs a minimum edge cover in polynomial time can be found
in [9]. Consequently, the corresponding recognition problem "does graph $G$ have an
edge cover with $|E'| \le K$" is in P. Related to the edge cover is the vertex cover. It
is a subset $V' \subset V$ of vertices such that each edge contains at least one node from $V'$.
In Fig. 3.18 a graph together with its minimum vertex cover is displayed. It has been
shown in [10] that the problem "does graph $G$ have a vertex cover with $|V'| \le K$"
(VC) belongs to the class of NP-complete problems. The proof works by transforming
3-SAT $\le_p$ VC. Thus, there is no algorithm known which finds a minimum vertex cover
in polynomial time.
Figure 3.18: A graph and a minimum edge cover (left) and a minimum vertex cover
(right). The edges/vertices belonging to the cover are shown in bold.
We have already seen that the TSP is a hard optimization problem. The corresponding
decision problem is NP-complete [10]. The proof works by demonstrating HC $\le_p$ TSP.
HC has already been recognized as being NP-complete (see above). The proof is short,
which is quite unusual for this type of problem.
Proof: Let $G = (V, E)$ be a graph, $n = |V|$. Then we build a second graph $G' =
(V, V \times V)$ with the following distance labeling
$d(i, j) = \begin{cases} 1 & \text{if } (i,j) \in E \\ 2 & \text{else} \end{cases}$
The transformation works in polynomial time since it contains just two nested loops
over all nodes. Each tour in $G'$ contains exactly $n$ edges and visits all $n$ nodes. Thus,
a tour must have the length $n$ at least. If it has exactly length $n$, all distances along
the tour must have value 1, i.e. all edges from the tour are in $G$ as well, and vice versa.
Consequently, the question "does $G'$ have a shortest round tour with length $\le n$" has
the answer "yes" if and only if $G$ has a Hamilton cycle. Hence, HC $\le_p$ TSP. QED
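The transformation used in the proof can be sketched in Python. The function names are assumptions, nodes are numbered 0, ..., n-1 with $n \ge 3$, and the brute-force tour search is only there to make the example self-checking on tiny graphs; it is of course not a polynomial-time TSP solver.

```python
from itertools import permutations


def tsp_distances(n, edges):
    """Distance labeling of G': 1 for edges of G, 2 for all other pairs."""
    edge_set = {frozenset(e) for e in edges}
    return [[0 if i == j else (1 if frozenset((i, j)) in edge_set else 2)
             for j in range(n)] for i in range(n)]


def shortest_tour_length(d):
    """Exhaustive search over all round tours starting at node 0 (tiny n only)."""
    n = len(d)
    best = None
    for perm in permutations(range(1, n)):
        tour = (0,) + perm
        length = sum(d[tour[k]][tour[(k + 1) % n]] for k in range(n))
        if best is None or length < best:
            best = length
    return best


def has_hamilton_cycle(n, edges):
    """G has a Hamilton cycle iff the shortest tour in G' has length <= n."""
    return shortest_tour_length(tsp_distances(n, edges)) <= n


ring = [(0, 1), (1, 2), (2, 3), (3, 0)]    # contains a Hamilton cycle
star = [(0, 1), (0, 2), (0, 3)]            # does not
```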
On the other hand, if one is interested only in the shortest distance between two given
cities, this problem can be solved in polynomial time. Several algorithms can be found
in [11]. Some of them will be presented in Chap. 4. A recent application in physics is
the study of so-called directed polymers.
Another type of graph problem which can be solved in polynomial time is the matching
problem. Matching problems will be introduced in Chap. 10.
Bibliography
[1] B. Bollobás, Modern Graph Theory, (Springer, New York 1998)
[2] J.D. Claiborne, Mathematical Preliminaries for Computer Networking, (Wiley,
New York 1990)
[3] M.N.S. Swamy and K. Thulasiraman, Graphs, Networks and Algorithms, (Wiley,
New York 1991)
[4] R. Sedgewick, Algorithms in C, (Addison-Wesley, Reading (MA) 1990)
[5] K. Mehlhorn and St. Näher, The LEDA Platform of Combinatorial and Geometric
Computing, (Cambridge University Press, Cambridge 1999);
see also http://www.mpi-sb.mpg.de/LEDA/leda.html
[6] A.V. Aho, J.E. Hopcroft, and J.D. Ullman, The Design and Analysis of Computer
Algorithms, (Addison-Wesley, Reading (MA) 1974)
[7] C. Hierholzer, Math. Ann. 6, 30 (1873)
[8] C.H. Papadimitriou and K. Steiglitz, Combinatorial Optimization, (Dover Publi-
cations, Mineola (NY) 1998)
[9] E.L. Lawler, Combinatorial Optimization: Networks and Matroids, (Holt, Rinehart
and Winston, New York 1976)
[10] M.R. Garey and D.S. Johnson, Computers and Intractability, A Guide to the Theory
of NP-Completeness, (W.H. Freeman & Company, San Francisco 1979)
[11] D.P. Bertsekas, Network Optimization, Continuous and Discrete Models, (Athena
Scientific, Belmont (MA) 1998)
4 Simple Graph Algorithms
it. At low concentrations one does not expect any percolating clusters, i.e. connected
clusters that extend from one boundary to the opposite one. On the other hand, at
high concentration most probably there will be one, in other words: at low $p$ the
sample will be (with probability one) insulating, at large $p$ it will be conducting,
see Fig. 4.1. As it turns out, there is a critical value $p_c$ above which almost all (i.e.
with probability 1) samples have a percolating cluster (or "infinite cluster") in the
thermodynamic limit, and below which they do not. The precise value of $p_c$ depends
on the lattice structure (i.e. the class of random graphs) one considers, but otherwise
many features of this transition are identical for different lattices in two dimensions,
but different in 3, 4, etc. dimensions.
Figure 4.1: Example for site percolation on the square lattice ($L = 50$) with site
occupation probability $p = 0.3$ (left), $p = 0.5$ (middle), $p = 0.7$ (right). The percolation
threshold is known to be at $p_c \approx 0.59275$ [1]. Therefore the right picture is above the
percolation threshold and indeed one recognizes easily a percolating cluster extending
from the bottom to the top of the system.
For instance, exactly at the critical point $p = p_c$ at least one infinite cluster exists and
it is a fractal, i.e. its volume $V_\infty$ (defined as the number of sites it contains) scales with
the linear system size $L$ as $V_\infty \sim L^{d_f}$. The value of this fractal dimension is one of the
aforementioned features that do not depend on the lattice but only on the dimension in
which one studies percolation. The number $d_f$, called a critical exponent, is universal
(for instance $d_f = 91/48$ for 2d percolation [2], and $d_f \approx 2.524$ for 3d percolation [3]), as
are other exponents describing the percolation transition: the infinite-cluster probability
(which is the probability that a site is on the infinite cluster) behaves like
$P_\infty \sim (p - p_c)^\beta$ close to the transition ($\beta = 5/36$ in 2d [2], $\beta \approx 0.417$ in 3d [3]);
the correlation length $\xi \sim |p - p_c|^{-\nu}$ ($\nu = 4/3$ in 2d [2], $\nu \approx 0.875$ in 3d [3]) and the
average cluster size $S \sim |p - p_c|^{-\gamma}$ ($\gamma = 43/18$ in 2d [2], $\gamma \approx 1.795$ in 3d [3]). In Sec. 5.3
we will show how these exponents are computed using the methods presented in the
next section.
Figure 4.2: Example for the Hoshen-Kopelman algorithm. One starts in the upper
left corner and proceeds row by row. In the first row the first site is occupied and gets
label 1, the second gets the same label since it is a neighbor of the first, the third is
empty and the fourth gets label 2. In the second line the first site has an occupied
upper neighbor with label 1, thus it gets also label 1. The second is empty and the
third is labeled 3. The fourth site is now a neighbor of two sites, one labeled 2 and
one labeled 3. All three sites belong to the same cluster, which was first labeled 2.
Accordingly we also assign the label 2 to the new site, but we have to keep track that
cluster 2 and 3 are connected. This is achieved by setting the array size(3)=-2, saying
that 3 is not a proper label, since size(3) is negative, and that the proper label is 2.
After distributing labels to all sites one changes them into the proper ones.
procedure start-new(x, y)
begin
    label(x, y) := c;
    size(c) := 1;
    c := c + 1;
end
procedure connect(x1, y1, x2, y2)
begin
    label(x1, y1) := label(x2, y2);
    size(label(x2, y2)) := size(label(x2, y2)) + 1;
end
In Fig. 4.2 we show an example of the cluster labeling at work. For the implementation
it might be advantageous to replace the cluster labels by the proper ones as soon as
neighbor sites are checked. In Sec. 5.3 we show how the results generated by an
application of this algorithm are analyzed with finite-size scaling to estimate the critical
exponents for percolation mentioned in the last section.
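A compact Python sketch of the labeling, following the description in the caption of Fig. 4.2, is given below. The function name, the representation of the lattice as a list of rows of 0/1 entries and the choice of always keeping the smaller label when two clusters meet are assumptions of this sketch; a negative entry size[c] stores the proper label with opposite sign, as in the caption.

```python
def hoshen_kopelman(occupied):
    """Label clusters row by row; returns (labels, sizes of proper labels)."""
    rows, cols = len(occupied), len(occupied[0])
    label = [[0] * cols for _ in range(rows)]
    size = [0]   # size[c] > 0: cluster size; size[c] < 0: proper label is -size[c]

    def proper(c):
        while size[c] < 0:
            c = -size[c]
        return c

    for y in range(rows):
        for x in range(cols):
            if not occupied[y][x]:
                continue
            up = proper(label[y - 1][x]) if y > 0 and occupied[y - 1][x] else 0
            left = proper(label[y][x - 1]) if x > 0 and occupied[y][x - 1] else 0
            if up == 0 and left == 0:
                size.append(1)                    # start a new cluster
                label[y][x] = len(size) - 1
            else:
                c = min(l for l in (up, left) if l)
                label[y][x] = c                   # connect to an existing cluster
                size[c] += 1
                other = max(up, left)
                if other != c:                    # two clusters meet: merge them
                    size[c] += size[other]
                    size[other] = -c
    for y in range(rows):                         # replace labels by proper ones
        for x in range(cols):
            if occupied[y][x]:
                label[y][x] = proper(label[y][x])
    sizes = {c: s for c, s in enumerate(size) if s > 0}
    return label, sizes


# The two rows discussed in the caption of Fig. 4.2 (1 = occupied).
lattice = [[1, 1, 0, 1],
           [1, 0, 1, 1]]
labels, sizes = hoshen_kopelman(lattice)
```

For these two rows the third site of the second row first receives label 3 and is relabeled to 2 in the final pass, since size[3] = -2 after the merge.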
The main work of the first algorithm is done within the procedure depth-first(), which
starts at a given vertex i and visits all vertices which are connected to i. The main
idea is to perform recursive calls of depth-first() for all neighbors of i which have not
been visited at that moment. The array comp[] is used to keep track of the process.
If comp[i] = 0 vertex i has not been visited yet. Otherwise it contains the number
t of the component. Please note that the array comp[] is passed as reference, i.e. it
behaves like a global variable.
algorithm components(G)
begin
    initialize comp[i] := 0 for all i ∈ V;
    t := 1;
    while there is a node i with comp[i] = 0 do
    begin
        depth-first(G, i, comp, t);
        t := t + 1;
    end
end
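A direct Python transcription of the two procedures might look as follows, assuming the graph is given as a dictionary that maps each vertex to the list of its neighbors (this input format and the example graph are assumptions). Since Python limits the recursion depth, the recursive form is only suitable for small graphs; for large lattices an explicit stack would be used instead.

```python
def depth_first(adj, i, comp, t):
    """Visit all vertices connected to i and mark them with component number t."""
    comp[i] = t
    for j in adj[i]:
        if comp[j] == 0:              # neighbor not visited yet
            depth_first(adj, j, comp, t)


def components(adj):
    """Return comp[], the component number of every vertex (numbers start at 1)."""
    comp = {i: 0 for i in adj}        # 0 means "not visited"
    t = 1
    for i in adj:
        if comp[i] == 0:
            depth_first(adj, i, comp, t)
            t += 1
    return comp


# Hypothetical graph with three connected components.
graph = {1: [2], 2: [1, 3], 3: [2], 4: [5], 5: [4], 6: []}
comp = components(graph)
```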
Figure 4.3: Depth-first spanning forest for a graph with three connected components.
In Fig. 4.3 an example graph is shown. The numbers next to the nodes indicate
a possible order in which the nodes are visited. The edges which are used by the
algorithm are indicated by thick lines. Since each node is visited not more than once,
there are no cycles in the subgraph given by these thick edges. Thus, they constitute
a tree for each component. Since all nodes are visited, the tree spans the whole
component. For this reason the tree is called a spanning tree. The collection of all
spanning trees of a graph is called its spanning forest.
Please note that the algorithm tries to go as far as possible within a graph, before
visiting other neighbors of the node where the search started. It is for this reason that
the procedure has its name. The spanning trees constructed by this procedure are also
called depth-first spanning trees.
For readers more interested in the subject, we should note that a depth-first search
uses singly connected edges in its first probe to maximum depth. These are edges
which are not contained in loops which include the source and traverse each edge
only once.
Since depth-first search excludes crossing an edge more than once, the node at the
end of a singly connected edge (or sequence of singly connected edges) which is closest
to the source of the search defines an articulation point in a depth-first tree. The
articulation points divide the depth-first-search tree into its biconnected components.
This procedure was well known in computer science in the early 1970s [10], and
has recently provided efficient algorithms for the backbone in connectivity percolation
[11]. The backbone is a subset of the infinite cluster, and is found from the infinite
cluster by trimming off "dangling ends" which are unable to transport an electric
current. The original burning algorithm for backbone identification uses forward and
reverse breadth-first searches to find the backbone [6]. This is inefficient compared
with the depth-first search procedure, though high accuracy results have been found
by applying this method at $p_c$ [12].
A similar algorithm, which instead first visits all neighbors of a node before proceeding
with nodes further away, is called breadth-first search (BFS). This means that at first all
neighbors of the source are visited; they have distance one from the source. In the
previous example in Fig. 4.3, the edge (1,2) would be included in the spanning tree,
if it was constructed using a breadth-first search. In the next step of the algorithm,
all neighbors of the vertices treated in the first step are visited, and so on. Thus, a
queue (see Sec. 3.2) can be used to store the vertices which are to be processed. The
neighbors of the current vertex are always stored at the end of the queue. Initially
the queue contains only the source. The algorithmic representation reads as follows,
where level(i) denotes the distance of vertex i from the source s and pred(i) is the
predecessor of i in a shortest path, which is obtained as a byproduct:
During a run of the algorithm, for each vertex all neighbors are visited. Thus each
edge is touched twice, which results in a time complexity of $O(|E|)$.
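A minimal Python sketch of breadth-first search with the arrays level and pred, assuming an adjacency-list dictionary as input and using -1 for "not visited yet". The example graph is hypothetical: it was chosen to be consistent with the description of Fig. 4.4 (vertices 1 and 3 at distance one, 2 and 4 at distance two, 5 at distance three), not copied from the figure.

```python
from collections import deque


def breadth_first_search(adj, s):
    """Return (level, pred): distances from s and predecessors on shortest paths."""
    level = {i: -1 for i in adj}      # -1 marks vertices not visited yet
    pred = {i: None for i in adj}
    level[s] = 0
    queue = deque([s])                # initially the queue contains only the source
    while queue:
        i = queue.popleft()           # take the next vertex from the head
        for j in adj[i]:
            if level[j] == -1:
                level[j] = level[i] + 1
                pred[j] = i
                queue.append(j)       # neighbors are stored at the end
    return level, pred


graph = {0: [1, 3], 1: [0, 2, 3], 2: [1, 5], 3: [0, 1, 4], 4: [3, 5], 5: [2, 4]}
level, pred = breadth_first_search(graph, 0)
```

Each vertex enters the queue at most once and every adjacency list is scanned once, which is where the $O(|E|)$ bound comes from.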
Figure 4.4: Example graph for breadth-first search. The search starts at vertex 0.
In the first iteration, vertices 1 and 3 are visited. In the next iteration, vertices 2 and
4 are treated, finally vertex 5.
Example: Breadth-first search
We consider the graph shown in Fig. 4.4. Initially the queue contains the
source and all values level(i) are undefined, except level(0) = 0.
Q = {0}, level(0) = 0
While treating the source, its neighbors, vertices 1 and 3, are added to the
queue, thus pred(1) = pred(3) = 0. They have distance 1 from the source
(level(1) = level(3) = 1). The order in which the vertices are added has no influence
on the final results, only on the details of the computation.
Next vertex 1 is processed. It has two neighbors, vertices 2 and 3, but vertex
3 has been visited already (level(3) ≠ -1), thus only vertex 2 is added to
Q with pred(2) = 1, level(2) = level(1) + 1 = 2. After this iteration the
situation is as follows:
4.2 Shortest-path Algorithms
configuration of the polymer. In what follows we will present the algorithms with
which these problems can be solved efficiently.
Figure 4.5: Models for a DPRM. Left: In the (10) orientation. Right: In the (11)
orientation.
where the sum is over all bonds $(ij)$ of the lattice and $x_{ij}$ represents the DPRM
configuration starting at one particular point $s$ of the lattice and ending at another
point $t$ (see Fig. 4.5). It is $x_{ij} = 1$ if the DPRM passes through the bond $(ij)$ and
$x_{ij} = 0$ otherwise. Typically $s$ is on one side of a lattice of linear size $L$ and $t$ on the
opposite and the ground state of (4.1) is the minimum energy path from $s$ to $t$.
Interpreting the energies as distances (after making them all positive by adding a
sufficiently large constant to all energies), and the lattice as a directed graph, this
becomes a shortest-path problem that can be solved by using, for instance, Dijkstra's
algorithm, which will be described below. In addition, owing to the directed structure
of the lattice one can compute the minimum energies of DP configurations ending
at (or shortest paths leading to) target nodes $t$ recursively (this is the way in which
Dijkstra's algorithm would proceed for this particular case) [16]. This is the same as
the transfer-matrix algorithm, encountered in statistical mechanics [17]. It reduces in
the zero temperature limit to a simple recursion relation for the energies or distances
from $s$ to nodes $t$ in the $(n+1)$-th layer, once we know the shortest paths to the $n$-th
layer:
$E_s^{(n+1)} = \min\{E_{s'}^{(n)} + e_{ss'} \mid s' \in n\text{-th layer and } s' \text{ nearest neighbor of } s\}$   (4.2)
In Fig. 4.6 we show a collection of such optimal paths in the 1+1 dimensional case.
Figure 4.6: A collection of polymers of lowest energy directed along the diagonals of
a square lattice with random bonds. Each polymer (crossing 500 bonds) has one end
fixed to the apex of the triangle, the other to various points on its base, and finds the
optimal path in between.
In the following we will discuss the two basic algorithms to find the shortest paths in
a general graph.
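The layer-by-layer recursion can be sketched in Python for a triangular geometry like the one in Fig. 4.6. Everything about the data layout is an assumption of this sketch: layer n contains the sites k = 0, ..., n, the site k of layer n+1 has the nearest neighbors k-1 and k in layer n, and the hypothetical callback bond_energy(n, k_from, k_to) supplies the energy of the bond between them.

```python
import random


def dprm_ground_state_energies(layers, bond_energy):
    """Minimum energies of all paths from the apex to each site of the last layer."""
    energies = [0.0]                       # layer 0 contains only the starting point
    for n in range(layers):
        new = []
        for k in range(n + 2):             # sites of layer n + 1
            candidates = []
            if k <= n:                     # nearest neighbor k in layer n
                candidates.append(energies[k] + bond_energy(n, k, k))
            if k >= 1:                     # nearest neighbor k - 1 in layer n
                candidates.append(energies[k - 1] + bond_energy(n, k - 1, k))
            new.append(min(candidates))
        energies = new
    return energies


# Deterministic check: straight bonds cost 1, diagonal bonds cost 2.
simple = dprm_ground_state_energies(3, lambda n, a, b: 1.0 if a == b else 2.0)

# Random bonds, drawn once and stored so that every bond has a fixed energy.
rng = random.Random(0)
table = {}
def random_bond(n, a, b):
    return table.setdefault((n, a, b), rng.random())
random_energies = dprm_ground_state_energies(50, random_bond)
```

One sweep visits every bond once, so the effort is proportional to the number of bonds rather than to the number of paths.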
Figure 4.7: The tree of minimal paths from the source node (shaded) to all other
nodes in a directed square lattice (the edges of the graph only allow paths which
are in the positive (01) and positive (10) directions). All bonds between nearest
neighbors are labeled with their costs, but only the tree of minimal paths is shown.
Each node is labeled with the cost of a minimal path from the source to that node.
(Generated using the demonstration programs from the LEDA library available at
http://www.mpi-sb.mpg.de/LEDA/)
(i) For each edge belonging to the shortest-path tree from node $s$:
$d(j) = d(i) + c_{ij}$.
(ii) For each edge which does not belong to a shortest path:
$d(j) \le d(i) + c_{ij}$.
These properties are often discussed in terms of reduced costs, defined as
$c^d_{ij} = c_{ij} + d(i) - d(j)$.   (4.3)
Properties (i) and (ii) above are then
$c^d_{ij} = 0$, if $(i, j) \in T_p$,   (4.4)
and
$c^d_{ij} \ge 0$, otherwise.   (4.5)
The proof of these properties relies on the spanning-tree structure of the set of minimal
paths, namely that each site of the tree has only one predecessor. Thus if there is a
bond with $c^d_{ij} < 0$ which is not on the minimal-path tree, adding that bond to the tree
and removing the current predecessor bond for site $j$ [which from condition (4.4) has
zero reduced cost] leads to a reduction in the cost of the minimal-path tree. Thus any
tree for which there exists a bond with $c^d_{ij} < 0$ is not a minimal-path tree.
For later reference we also note that a directed cycle, W, has the property,
The algorithm maintains the minimal distance growth front by adding the node $j \in \bar{S}$
with minimal distance label $d(j)$. The proof that $d(j)$ generated in this way is actually
a minimal-cost label proceeds as follows:
(i) Assume that we have a growth front consisting of sites which are labeled with their
minimal path lengths to the source $s$.
(ii) The next candidate for growth is chosen to be a site which is not already labeled,
and which is connected to the growth front by an edge $(i, j) \in E$.
(iii) We choose the site $j$ for which $d(j)$ is minimal,
$d(j) = \min_{k,m}\{d(k) + c_{km} \mid k \in S,\ m \in \bar{S},\ (k, m) \in E\}$.
(iv) Because of (iii) and because all of the costs are non-negative there can be no
path from the current growth front to site $j$ which has a smaller distance than $d(j)$.
This is because any such path must originate at the current growth front and hence
must use a non-optimal path to generate any alternative path to $j$ (negative costs can
compensate for locally non-optimal paths from the growth front and hence Dijkstra's
method is restricted to non-negative costs). In Fig. 4.8 an example demonstrates how
Dijkstra's algorithm works.
The "generic" Dijkstra's algorithm scales as $O(|X|^2)$, if the choose statement in the
above algorithm requires a search over all the sites in the lattice. It is easy to do much
better than this by maintaining a list of active sites at the growth front (as in breadth-first
search). However, now we must choose the lowest-cost site from among this list.
Thus the potential growth sites must be ordered according to their distance label.
This ordering must be reshuffled every time a new growth site (with a new distance
label) is added to the list. In the computer science community this is typically done
with heaps or priority queues, which consist of a tree-like data structure, see Sec. 3.2.
Heap reshuffling is $O(\ln|E|)$, which reduces the algorithmic bound to $O(|E|\ln|E|)$.
Figure 4.8: Demonstration of Dijkstra's algorithm. The numbers in the circles denote
the distance labels $d(i)$. White circles stand for temporary nodes, filled circles for
permanently labeled nodes. The numbers on the edges are the distances $c_{ij}$. The
edges that have been considered in the last step are marked.
Each element of the heap has a key, which is here the temporary distance. The
heap operations create-heap(), find-min() and delete-min() are self-explanatory. The
decrease-key() operation assigns a new lower temporary distance to an element of the
heap and, if necessary, moves it towards the root of the heap by exchanging elements.
By $A(i)$ we denote the edges adjacent to vertex $i$. The heap implementation of the
Dijkstra algorithm reads as follows:
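A minimal Python sketch of a heap-based Dijkstra algorithm is given here, assuming the graph is a dictionary that maps each node to a list of (neighbor, cost) pairs with non-negative costs. One deliberate deviation should be noted: Python's standard heapq module offers no decrease-key() operation, so this sketch pushes a new entry whenever a temporary distance is lowered and skips outdated entries when they are popped. The names and the example graph are hypothetical.

```python
import heapq


def dijkstra(adj, s):
    """Return (d, pred): shortest distances from s and predecessors on the tree."""
    d = {i: float("inf") for i in adj}
    pred = {i: None for i in adj}
    d[s] = 0
    heap = [(0, s)]                       # entries are (temporary distance, node)
    permanent = set()
    while heap:
        dist, i = heapq.heappop(heap)     # find-min and delete-min
        if i in permanent:
            continue                      # outdated entry, replaces decrease-key
        permanent.add(i)
        for j, c in adj[i]:               # edges adjacent to i
            if dist + c < d[j]:
                d[j] = dist + c
                pred[j] = i
                heapq.heappush(heap, (d[j], j))
    return d, pred


graph = {0: [(1, 2), (2, 5)], 1: [(2, 1), (3, 6)], 2: [(3, 2)], 3: []}
d, pred = dijkstra(graph, 0)
```

At most one heap entry is created per edge, so the heap never holds more than $|E| + 1$ elements and each operation costs $O(\ln|E|)$.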
If the bond costs are integers, the site distance labels themselves can be used as
pointers. Thus we can set up a queue with the distance labels as pointers and the
site labels with that distance in the queue (or, in computer science terminology, we
use buckets). The number of buckets that is required, $n_b$, is $n_b > 2C$ where $C =
\max_{(i,j)}\{c_{ij}\}$ is the maximum cost. For example if the costs are chosen from the set
$\{1, 2, \ldots, 10\}$, then $C = 10$. As long as $C$ is finite, and the graph is sparse, buckets are
very efficient both in speed and storage; for details see Ref. [22]. Using buckets, one
can implement a version of Dijkstra's algorithm for integer costs which has a running time
scaling as $O(|E|)$.
As long as one reduced edge length is negative, the distance labels $d(i)$ are not shortest-path
distances:
$d(\cdot)$ are shortest-path distances $\Leftrightarrow$ $c^d_{ij} \ge 0 \;\; \forall (i,j) \in A$   (4.9)
The criterion (4.9) suggests the following algorithm for the shortest-path problem:
algorithm label-correcting
begin
    d(s) := 0; pred(s) := 0;
    d(j) := ∞ for each node j ∈ N\{s};
    while some edge (i, j) satisfies d(j) > d(i) + c_ij (i.e. c^d_ij < 0) do
    begin
        d(j) := d(i) + c_ij (so that c^d_ij = 0);
        pred(j) := i;
    end
end
Initially the distance labels at each site are set to a very large number [except the
reference site $s$, which has distance label $d(s) = 0$]. This method requires that the
starting distance labels $d(j)$ are greater than the exact shortest distances, and the
choice of $d(j) = \infty$ ensures that. In Fig. 4.9 an example of the operation of the
algorithm is shown.
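A direct transcription of the label-correcting idea into Python might look as follows. It is a minimal sketch that simply sweeps over all edges until no edge violates the criterion; the function name and the edge-list format are assumptions, `None` replaces the predecessor value 0 of the pseudocode, and the graph is assumed to contain no negative cycle (otherwise the loop does not terminate).

```python
import math

def label_correcting(nodes, edges, s):
    """edges is a list of (i, j, c_ij); costs may be negative.
    Returns the distance labels d and predecessor labels pred."""
    d = {j: math.inf for j in nodes}
    pred = {j: None for j in nodes}
    d[s] = 0
    changed = True
    while changed:                       # repeat sweeps until all c^d_ij >= 0
        changed = False
        for i, j, cij in edges:
            if d[i] + cij < d[j]:        # reduced edge length is negative
                d[j] = d[i] + cij        # now the reduced edge length is zero
                pred[j] = i
                changed = True
    return d, pred

edges = [(0, 1, 4), (0, 2, 5), (2, 1, -3), (1, 3, 2)]
d, pred = label_correcting([0, 1, 2, 3], edges, 0)
```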
In practice it is efficient to grow outward from the starting site $s$. The algorithm may
sweep the lattice many times until the correct distance labels are identified. The worst-case
bound on the running time is $O(\min\{|X|^2 |E| C,\ |E| 2^{|X|}\})$ with $C = \max\{|c_{ij}|\}$, which is
pseudo-polynomial. An alternative procedure is to sweep the lattice once to establish
approximate distance labels and then to iterate locally until local convergence is found.
This so-called first in, first out (FIFO) implementation has complexity $O(|X| |E|)$.
Note that if there are negative cycles, the instruction $d(j) := d(i) + c_{ij}$ would decrease
some distance labels ad (negative) infinitum. However, if there are negative cycles,
one can detect them with an appropriate modification of the label-correcting code:
one can terminate if $d(k) < -nC$ for some node $k$ (again $C = \max |c_{ij}|$, and $n = |X|$ is
the number of nodes) and obtain these negative cycles by tracing them through the
predecessor indices starting at node $k$. This will be useful in the negative-cycle-canceling
method for minimum-cost flow (Chap. 7).
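The FIFO variant and the negative-cycle test can be combined in a short sketch. The queue discipline below is one common way to realize "iterate locally until convergence"; the names `fifo_label_correcting` and `trace_cycle` and the data layout are assumptions of this sketch, not taken from the text.

```python
from collections import deque
import math

def trace_cycle(pred, k, n):
    """Follow predecessor labels from node k and return the nodes of the cycle."""
    for _ in range(n):                   # after n steps we are surely on the cycle
        k = pred[k]
    cycle = [k]
    v = pred[k]
    while v != k:
        cycle.append(v)
        v = pred[v]
    cycle.reverse()                      # list the nodes in edge direction
    return cycle

def fifo_label_correcting(adj, s):
    """adj[i] is a list of (j, c_ij); nodes are 0..n-1.
    Returns (d, pred, cycle); cycle is None if no negative cycle was detected."""
    n = len(adj)
    C = max((abs(c) for out in adj for _, c in out), default=0)
    d = [math.inf] * n
    pred = [None] * n
    d[s] = 0
    queue = deque([s])
    in_queue = [False] * n
    in_queue[s] = True
    while queue:
        i = queue.popleft()              # first in, first out
        in_queue[i] = False
        for j, cij in adj[i]:
            if d[i] + cij < d[j]:
                d[j] = d[i] + cij
                pred[j] = i
                if d[j] < -n * C:        # impossible without a negative cycle
                    return d, pred, trace_cycle(pred, j, n)
                if not in_queue[j]:
                    queue.append(j)
                    in_queue[j] = True
    return d, pred, None

d, pred, cycle = fifo_label_correcting([[(1, 4), (2, 5)], [(3, 2)], [(1, -3)], []], 0)
_, _, neg = fifo_label_correcting([[(1, 1)], [(2, -1)], [(1, -1)]], 0)
```

In the second call the edges 1 → 2 and 2 → 1 form a cycle of total cost −2, which the test $d(k) < -nC$ eventually detects.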
set of points using a measure of performance that on the surface has little resemblance
to the minimum spanning tree objective (sum of edge costs), or the problem itself
bears little resemblance to an optimal tree problem.
The minimum spanning tree [18, 19] of a connected graph with edge costs $c_{ij}$ is a
tree $T$ which (i) visits each node of the graph and (ii) for which $\sum_{(ij) \in T} c_{ij}$ is
a minimum.
Prim's algorithm and Kruskal's algorithm are two methods for finding the minimum
spanning tree [18, 23]. The two algorithms are based on the following (equivalent)
optimality conditions:
Cut-optimality condition: A spanning tree $T^*$ is a minimum spanning tree if and
only if, for every tree edge $(ij) \in T^*$, $c_{ij} \le c_{kl}$ for every edge $(kl)$ contained in
the cut¹ which is induced by deleting edge $(ij)$ from $T^*$; i.e. after deleting the edge
from the tree, the tree falls apart into two parts. The cut which is induced by this
contains all edges of the graph $G$ which run between the two parts.
Path-optimality condition: A spanning tree $T^*$ is a minimum spanning tree if and
only if, for every non-tree edge $(kl)$ of $G$, $c_{ij} \le c_{kl}$ for every edge $(ij)$ contained in
the path in $T^*$ connecting nodes $k$ and $l$.
¹A cut of a graph $G$ is a separation of the node set $V$ of $G$ into two disjoint subsets $S$ and $S'$ with
$S \cup S' = V$, see also Chap. 6. Usually one also identifies all edges $(i,j)$ with $i \in S$ and $j \in S'$ with
the cut.
70 4 Simple Graph Algorithms
Kruskal's algorithm is based on the path-optimality condition, whereas Prim's algorithm
is based on the cut-optimality condition. The latter is very similar in structure
to Dijkstra's algorithm. Recently it has been observed [24] that Prim's algorithm is
essentially equivalent to the invasion algorithm for percolation.
In Prim's algorithm we start by choosing the lowest-cost bond in the graph. The
algorithm then uses the two sites at the ends of this minimal-cost bond as the starting
sites for growth. Growth proceeds to the lowest-cost bond which is adjacent to the growth
front. The algorithm terminates when every site has been visited. The cost of the
minimum spanning tree is stored in $C_T$, and the bonds making up the minimum spanning
tree are stored in $T$.
algorithm Prim()
begin
    choose (s, r): c_sr := min{c_km | (k, m) ∈ E};
    S := {s, r}; S̄ := X\{s, r};
    T := {(s, r)}; C_T := c_sr;
    while |S| < |X| do
    begin
        choose (i, j): c_ij := min{c_km | k ∈ S, m ∈ S̄, (k, m) ∈ E};
        S̄ := S̄\{j}; S := S ∪ {j};
        C_T := C_T + c_ij; T := T ∪ {(i, j)};
    end
end
Figure 4.10: Illustration of Prim's algorithm. From (a), starting with the lowest
cost edge, to (d) successively new edges are added until a minimum spanning tree is
reached.
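Prim's algorithm can be sketched in Python as follows. The pseudocode leaves open how the minimum over the growth front is found; this sketch assumes a heap of front edges (via the standard-library `heapq`), which is one possible choice. The function name `prim`, the edge format `(cost, i, j)` and the assumption of a connected graph with at least two nodes are specific to this sketch.

```python
import heapq

def prim(n, edges):
    """edges is a list of undirected edges (cost, i, j) on nodes 0..n-1.
    Returns (C_T, T): the cost and the edge list of a minimum spanning tree."""
    adj = [[] for _ in range(n)]
    for c, i, j in edges:
        adj[i].append((c, i, j))
        adj[j].append((c, j, i))
    c, s, r = min(edges)                 # lowest-cost bond starts the growth
    in_tree = [False] * n
    in_tree[s] = in_tree[r] = True
    T = [(s, r)]
    C_T = c
    front = adj[s] + adj[r]              # bonds adjacent to the growth front
    heapq.heapify(front)
    while len(T) < n - 1:
        c, i, j = heapq.heappop(front)
        if in_tree[j]:
            continue                     # bond no longer crosses the cut
        in_tree[j] = True
        T.append((i, j))
        C_T += c
        for e in adj[j]:
            if not in_tree[e[2]]:
                heapq.heappush(front, e)
    return C_T, T

C_T, T = prim(4, [(1, 0, 1), (4, 0, 2), (3, 1, 2), (2, 2, 3), (5, 1, 3)])
```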
Bibliography
[1] D. Stauffer and A. Aharony, Perkolationstheorie: eine Einfuehrung, (Wiley-VCH,
Weinheim 1995)
[2] M.P.M. den Nijs, J. Phys. A 12, 1857 (1979); B. Nienhuis, J. Phys. A 15, 199
(1982)
[3] P.N. Strenski, R.M. Bradley, and J.M. Debierre, Phys. Rev. Lett. 66, 133 (1991)
[4] P.L. Leath, Phys. Rev. B 14, 5046 (1976)
[5] J. Hoshen and R. Kopelman, Phys. Rev. B 14, 3438 (1976)
[6] H.J. Herrmann, D.C. Hong, and H.E. Stanley, J. Phys. A 17, L261 (1984)
[7] D. Wilkinson and J.F. Willemsen, J. Phys. A 16, 3365 (1983); D. Wilkinson and
M. Barsony, J. Phys. A 17, L129 (1984)
[8] D.C. Rapaport, J. Stat. Phys. 66, 679 (1992)
[9] C. Lorenz and R. Ziff, Phys. Rev. E 57, 230 (1998)
[10] R. Tarjan, SIAM J. Comput. 1, 146 (1972)
[11] P. Grassberger, J. Phys. A 25, 5475 (1992); Physica A 262, 251 (1999)
[12] M. Rintoul and H. Nakanishi, J. Phys. A 25, L945 (1992); J. Phys. A 27, 5445
(1994)
[13] T. Halpin-Healy and Y.-C. Zhang, Phys. Rep. 254, 215 (1995) and references
therein
[14] M. Kardar, G. Parisi, and Y.-C. Zhang, Phys. Rev. Lett. 56, 889 (1986)
[15] M. Kardar and Y.-C. Zhang, Phys. Rev. Lett. 58, 2087 (1987); T. Nattermann
and R. Lipowsky, Phys. Rev. Lett. 61, 2508 (1988); B. Derrida and H. Spohn, J.
Stat. Phys. 51, 817 (1988); G. Parisi, J. Physique 51, 1695 (1990); M. Mézard, J.
Physique 51, 1831 (1990); D. Fisher and D. Huse, Phys. Rev. B 43, 10728 (1991)
[16] M. Kardar, Phys. Rev. Lett. 55, 2235 (1985); D. Huse and C. L. Henley, Phys.
Rev. Lett. 54, 2708 (1985); M. Kardar, Phys. Rev. Lett. 55, 2923 (1985)
[17] L.E. Reichl, A Modern Course in Statistical Physics, (John Wiley & Sons, New
York 1998)
[18] T.H. Cormen, C.E. Leiserson, and R.L. Rivest, Introduction to Algorithms, (MIT
Press, Cambridge (MA) 1990)
[19] C.H. Papadimitriou and K. Steiglitz, Combinatorial Optimization: Algorithms
and Complexity, (Prentice Hall, New Jersey 1992)
[20] A.V. Aho, J.E. Hopcroft, and J.D. Ullman, The Design and Analysis of Computer
Algorithms, (Addison-Wesley, Reading (MA) 1974)
[21] R. Sedgewick, Algorithms in C, (Addison-Wesley, Reading (MA) 1990)
[22] P.M. Duxbury and R. Dobrin, Physica A 270, 263 (1999)
[23] D.B. West, Introduction to Graph Theory, (Prentice-Hall, New Jersey 1996)
[24] A.L. Barabasi, Phys. Rev. Lett. 76, 3750 (1996)
5 Introduction to Statistical Physics
function (or Hamiltonian) or energy function $H(S)$ of the system assigns to each state
an energy; from a microscopic viewpoint this is all one needs to know to describe
the physics of the system. In the case of the aforementioned Ising model it is, for instance,
Here the sum is over all states of the system, $k_b$ is the Boltzmann constant, and
$Z = \sum_S \exp(-H(S)/k_bT)$ is the partition function, which simply ensures the normalization
of the Boltzmann weight $P(S) = Z^{-1} \exp(-H(S)/k_bT)$ of Eq. (5.3),
that is, the probability for a system coupled to a heat bath with temperature $T$ to
be in state $S$. A thermal expectation value of an observable $A(S)$ is therefore just a
weighted sum of the values $A(S)$ that the observable $A$ takes on in state $S$. In the Ising
system defined above, one such observable is for instance the magnetization
However, it is the free energy F and not the internal energy that is crucial for the
thermodynamic properties of a physical system at non-vanishing temperatures ($T > 0$).
It is defined to be proportional to the logarithm of the partition function:
5.1 Basics of Statistical Physics 75
All thermodynamic quantities can be calculated from the free energy/partition function
by taking derivatives, e.g. the expectation values of the energy $E$, the magnetization
$M$, the specific heat $c$ and the susceptibility $\chi$ (with $\beta = 1/k_bT$)
By a short calculation, you will find that the second derivatives, specific heat and
susceptibility, are related to the fluctuations of the related first derivatives:
The importance of the free energy is due to the fact that for the thermodynamics not
only the energetics of the states of the system but also their number, the density of
states, play a role. The measure for the latter is the (Boltzmann) entropy defined by
We do not have the space here to uncover all the deep implications of this quantity,
and it is not necessary for the purpose of this book, which is mainly concerned with
zero temperature problems. If we insert P(S) from (5.3) into (5.13), use (5.6) and
compare with (5.5), we see immediately that
Thus the entropy does not play a role at $T = 0$, where the internal energy becomes equal
to the free energy. In that case, a closer look at Eq. (5.3) for the Boltzmann probability
reveals the crucial role that the energy $H(S)$ has in determining the thermodynamics
of a system: for fixed temperature $T$ the weight $P(S)$ of state $S$ gets smaller the
larger its energy is. In the zero-temperature limit $T \to 0$ only the state $S_0$ with the
lowest energy $E_0 = \min_S\{H(S)\}$, the ground state, contributes to the thermodynamic
average (5.2) of any observable $A$. All the other states with energies $H(S) > E_0$ are
exponentially suppressed and are irrelevant at $T = 0$. This clarifies the importance of
the ground state $S_0$ of a physical system described by a Hamiltonian $H(S)$.
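The dominance of the ground state at low temperature can be checked by exact enumeration of a very small system. The sketch below assumes, purely for illustration, an open ferromagnetic Ising chain with $H(S) = -J \sum_i S_i S_{i+1} - h \sum_i S_i$, units with $k_b = 1$, and a small field $h$ that makes the ground state unique; the function names are hypothetical.

```python
import itertools
import math

def energy(spins, J=1.0, h=0.1):
    """Energy of an open Ising chain in a uniform field (illustrative model)."""
    bonds = sum(spins[i] * spins[i + 1] for i in range(len(spins) - 1))
    return -J * bonds - h * sum(spins)

def ground_state_weight(N, T):
    """Boltzmann probability P(S_0) of the ground state, by summing all 2^N states."""
    states = list(itertools.product((-1, 1), repeat=N))
    energies = [energy(s) for s in states]
    E0 = min(energies)
    # subtracting E0 leaves the probabilities unchanged and avoids overflow
    weights = [math.exp(-(E - E0) / T) for E in energies]
    Z = sum(weights)
    return weights[energies.index(E0)] / Z

for T in (10.0, 1.0, 0.1):
    print(T, ground_state_weight(4, T))
```

At high temperature all $2^N$ states carry nearly the same weight $2^{-N}$, while for $T \to 0$ the printed weight of the ground state approaches 1.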
For non-zero temperature it is a tremendous task to perform the sum given in (5.2)
explicitly and it can be achieved analytically only in a few exactly solvable cases.
Note that the number of states grows exponentially with the size $N$ of the system; for
instance an Ising system with $N$ spins can take on $2^N$ different states $S$. In many cases
the individual particles even have to be described by a continuous variable with one
or more components, e.g. $S_i = (\cos\varphi_i, \sin\varphi_i)$, a two-component vector parameterized
by the angle $\varphi_i \in [0, 2\pi)$, also called an XY spin, which makes the task even more
difficult. On the other hand, not all states $S$ are equally important in an approximate
calculation of the thermodynamic expectation value at low temperature, for instance,
In the sum on the right-hand side over all states $S'$, the first term describes processes
which move a system into state $S$, while the second term accounts for processes leaving
state $S$. After a large number of steps ($t \to \infty$) the probabilities $P_t(S)$ will approach
a stationary distribution $P(S) = \lim_{t\to\infty} P_t(S)$, which we want to be the Boltzmann
distribution $P(S) = P_{eq}(S) = Z^{-1} \exp(-\beta H(S))$. We can design the transition probabilities
$w(S \to S')$ in such a way that for $P_t(S) = P_{eq}(S)$ each term in the sum on
the right-hand side of (5.15) vanishes:
This relation is called detailed balance. Now it is ensured that the desired distribution
is a stationary distribution of the Markov process defined by (5.15). Please note that,
in principle, you can use other measures to ensure that the equilibrium distribution
is obtained. To ensure detailed balance, Metropolis et al. [4] chose
where $\Delta H = H(S') - H(S)$ is the energy difference between the old state $S$ and the
new state $S'$ and $\Gamma$ is an arbitrary transition rate such that $w$ has the meaning of a
transition probability per unit time. By inserting these transition probabilities into
(5.16), one sees that they fulfill the detailed balance condition.
There is a considerable amount of freedom in the choice of the move $S \to S'$, but
one should note that because of $w(S \to S')/w(S' \to S) = \exp(-\Delta H/k_bT)$ the energy
change $\Delta H$ should not be too large when going from $S$ to $S'$ or vice versa. Hence
typically it is necessary to consider small changes of $S$ only, since otherwise the acceptance
rate of either $S \to S'$ or $S' \to S$ would be very small (and a lot of computer
time would be wasted since the procedure would be poorly convergent). There are
exceptions to this rule, like the cluster algorithms for spin models [5, 6], where
clever schemes make large changes in configuration space possible, while the energy
changes $\Delta H$ are kept small. But these cluster algorithms do not work in general.
So, for instance, for the Ising model one conventionally uses single-spin-flip dynamics,
where at each move $S \to S'$ only a single spin, say $S_i$, is flipped, $S_i \to -S_i$, and the
other spins do not change, $S'_j = S_j\ \forall j \ne i$. Then
$$ w(S_i \to -S_i) = \begin{cases} \Gamma & \text{for } S_i = -\mathrm{sign}(h_i) \\ \Gamma \exp(-2 S_i h_i/k_bT) & \text{for } S_i = \mathrm{sign}(h_i) \end{cases} $$
where $h_i = \sum_{j (\ne i)} J_{ij} S_j + b_i$ is the local field acting upon spin $S_i$ in state $S$.
In practice one realizes the sequence of states $S$ with the chosen transition probabilities
$w$ by executing the following Monte Carlo program to calculate an estimate of the
thermal expectation value $\langle A \rangle_T$ of an observable $A(S)$:
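A single-spin-flip Metropolis simulation along these lines can be sketched in Python. The sketch makes several assumptions that are not fixed by the text: a two-dimensional $L \times L$ lattice with periodic boundaries, uniform coupling $J$, zero external field, units with $k_b = 1$, $\Gamma = 1$ per trial, and the observable $A = |m|$ (absolute magnetization per spin). The function name and parameters are hypothetical.

```python
import math
import random

def metropolis_ising(L, T, n_steps, rng, J=1.0, n_equil=0):
    """Estimate <|m|>_T for an L x L Ising model with periodic boundaries.
    One Monte Carlo step consists of N = L*L trial spin flips."""
    spins = [[1] * L for _ in range(L)]          # start from the all-up state
    N = L * L
    m_sum = 0.0
    for step in range(n_equil + n_steps):
        for _ in range(N):
            x, y = rng.randrange(L), rng.randrange(L)
            # local field h_i from the four nearest neighbors
            h = J * (spins[(x + 1) % L][y] + spins[(x - 1) % L][y]
                     + spins[x][(y + 1) % L] + spins[x][(y - 1) % L])
            dE = 2.0 * spins[x][y] * h           # energy change of the flip
            if dE <= 0 or rng.random() < math.exp(-dE / T):
                spins[x][y] = -spins[x][y]       # accept the trial state
        if step >= n_equil:                      # measure after equilibration
            m_sum += abs(sum(map(sum, spins))) / N
    return m_sum / n_steps

rng = random.Random(1)
m_low = metropolis_ising(8, 0.5, 50, rng)
m_high = metropolis_ising(8, 100.0, 50, rng, n_equil=20)
```

Deep in the ordered phase the estimate stays close to 1, whereas at very high temperature it is small and decreases with the system size.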
After a number of steps $N_{steps}$ (the number of trial states is usually called the number of
Monte Carlo steps) one stops and obtains $\overline{A}(N_{steps})$ as an estimate of $\langle A \rangle_T$. This is
true because according to what we have said about the stationary distribution of this
process, we have
where $\tau$ is the correlation time of the quantity, which describes how long you have
to simulate to obtain two independent measurements. The correlation time can be
obtained from an autocorrelation function $C_A(t)$ in equilibrium
through $C_A(\tau) = e^{-1} C_A(0)$. The value of $\delta t$ measures the distance between two
consecutive measurements, using the same time unit that is chosen to measure $\tau$. For
the algorithm above, $\delta t = 1$ (one Monte Carlo step). In case you take your samples
very rarely ($\delta t \gg \tau$), Eq. (5.20) becomes
What is important for our topic, "optimization", is that the Monte Carlo procedure
can also be used to find the ground-state configuration $S_0$ of the system under
consideration: for low temperature only the states with the lowest energies have a
significant weight $P_{eq}(S)$, and therefore by reducing the temperature step by step
down to $T = 0$ the process just described should converge to the state with the lowest
energy, $S_0$. This procedure is called simulated annealing [9] and is one of many methods
in what is known as the field of "stochastic optimization", see Chap. 11. Of course
for lower and lower temperatures the equilibration time of the process gets larger and
larger, necessitating huge numbers of steps $N_{steps}$ to guarantee equilibration. Since
this condition cannot be fulfilled for arbitrarily low temperatures, there is no guarantee
of finding the exact ground state with simulated annealing. However, sometimes this
is the only method at hand.
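A minimal simulated-annealing sketch is given below. The geometric cooling schedule, the number of sweeps per temperature and the model (Ising spins with a symmetric coupling matrix `J`, $H(S) = -\sum_{i<j} J_{ij} S_i S_j$, $k_b = 1$) are assumptions chosen for illustration. The routine returns the lowest energy it has encountered, which is an upper bound on the ground-state energy and not necessarily the ground state itself.

```python
import itertools
import math
import random

def energy(spins, J):
    """H(S) = -sum_{i<j} J[i][j] S_i S_j for a symmetric coupling matrix J."""
    n = len(spins)
    return -sum(J[i][j] * spins[i] * spins[j]
                for i in range(n) for j in range(i + 1, n))

def simulated_annealing(J, rng, T_start=3.0, T_end=0.05, cooling=0.9, sweeps=20):
    """Metropolis dynamics with a slowly decreasing temperature."""
    n = len(J)
    spins = [rng.choice((-1, 1)) for _ in range(n)]
    E = energy(spins, J)
    best_E, best = E, spins[:]
    T = T_start
    while T > T_end:
        for _ in range(sweeps * n):
            i = rng.randrange(n)
            h_i = sum(J[i][j] * spins[j] for j in range(n) if j != i)
            dE = 2.0 * spins[i] * h_i            # energy change of flipping S_i
            if dE <= 0 or rng.random() < math.exp(-dE / T):
                spins[i] = -spins[i]
                E += dE
                if E < best_E:
                    best_E, best = E, spins[:]
        T *= cooling                             # reduce the temperature
    return best_E, best

# random +-1 couplings on a small, fully connected system (illustrative only)
rng = random.Random(3)
n = 6
J = [[0] * n for _ in range(n)]
for i in range(n):
    for j in range(i + 1, n):
        J[i][j] = J[j][i] = rng.choice((-1, 1))
best_E, best = simulated_annealing(J, rng)
exact_E0 = min(energy(s, J) for s in itertools.product((-1, 1), repeat=n))
```

For such a tiny system the result can be compared with the exact ground-state energy `exact_E0` obtained by enumeration; for large systems no such check is available.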
Figure 5.1: Schematic picture of the liquid-gas transition of water. Below $T_c$ the
system is a fluid (left), while for higher temperatures it is a gas (right).
Figure 5.2: A two dimensional Ising ferromagnet. Near neighbors of the lattice
interact ferromagnetically, indicated by straight lines joining them.
the free energy, whereas in the former case it is of first order, meaning that a first
derivative of the free energy (the density) is already discontinuous.
In general, a phase transition in a many particle system is induced by the variation of
a parameter (such as temperature, pressure, etc.) and is (usually) mathematically
characterized by a discontinuity or a singularity in an appropriately chosen order
parameter. It separates regions in the phase diagram with different physical properties,
most elegantly expressed by their symmetries and reflected by the behavior of the
order parameter. The concept is most easily visualized by an example, the Ising
model in an external field, which can simultaneously serve as a lattice gas model for
the liquid/gas transition (spin up ≙ particle present, spin down ≙ particle absent,
magnetic interaction ≙ nearest-neighbor attraction, field ≙ chemical potential). The
spins $S_i = \pm 1$ are localized on the sites of a 2-dimensional lattice (Fig. 5.2), the
Hamiltonian is
$(ij)$ means the sum over all nearest-neighbor pairs of the lattice. For all nonzero
magnetic fields $h$, the model is paramagnetic, i.e. the average orientation of the spins
is determined by the sign of $h$. Below a critical temperature $T_c$, the order parameter,
the magnetization per spin $m$ as a function of the field $h$, has a discontinuous jump
at $h = 0$, which is the sign of a first-order phase transition (Fig. 5.3).
Figure 5.3: The $(T, h)$ phase diagram of the 2d Ising model. The insets show the
magnetization as a function of the field for two different temperatures, indicating the
critical behavior for $T = T_c$ and the first-order transition at $T < T_c$.
For $h = 0$ the model possesses a second-order phase transition at the critical temperature
$T_c$, where the magnetization has a singularity $m \sim (T_c - T)^{\beta}$ for $T < T_c$, but $m = 0$
for $T > T_c$. The exponent $\beta$, not to be confused with $\beta = 1/k_bT$, characterizes the
behavior near the phase transition, called the critical behavior. Thus, $\beta$ is called a
critical exponent¹. Also for other interesting quantities critical exponents are defined,
see Secs. 4.1 and 5.4. It can be shown that for many models the actual values of the
critical exponents do not depend on details of the model. In that case one speaks of
universality.
For the purpose of this book (the computation of various quantities in a finite system)
we should emphasize that the above scenario describes the behavior of the infinite
¹It turns out that $\beta = 1/8$.
5.3 Percolation and Finite-size Scaling 81
where $n(x)$ is a (usually unknown) scaling function. For all different lattice types
(square, triangular, cubic, ...) the exponents depend on each other in the same way:
$\tau = 2 + \beta/(\gamma + \beta)$ and $\sigma = 1/(\gamma + \beta)$,
but the actual values are not the same for different lattice types. The percolation
cluster at $p = p_c$ is a fractal and its mass scales as $M \propto r^{d_f}$ with its radius (or linear
size) $r$, where $d_f$ is the fractal dimension. In Fig. 5.4 we show an example for the
determination of the fractal dimension for $d = 2$ using the methods described in Sec.
4.1 for percolation problems.
Figure 5.4: Monte Carlo data for the size of the largest cluster at the percolation
threshold $p = p_c = 1/2$ of the triangular lattice, as a function of the linear system size
$L$ of the lattice. The slope of the straight line in this log-log plot is the exactly known
fractal dimension $d_f = 91/48$. From [12].
Hence one finds a percolating cluster in the finite subsystem with a probability of
order 1 (i.e. $P_L^{\mathrm{perc}}(p_c) \approx 1$) if
where we used $\tau = 2 + \beta/(\gamma + \beta)$ and the hyperscaling relation [10, 11] $d\nu = \gamma + 2\beta$,
which involves the dimension $d$ of the system. Furthermore, $\nu$ is the critical exponent
of the correlation length $\xi$, which is defined as the typical length scale over which the
connectedness probability, i.e. the probability that two sites a distance $r$ apart belong
to the same connected cluster, decays in the infinite system. It diverges at the critical
point as
The mass (number of sites) of a percolating cluster in a finite system is simply the
mass of a cluster of radius L
which means that the probability for a site to belong to a percolation cluster is
since $P_L(p_c) = M/L^d$.
The distribution $n_s(p)$ in a finite system is cut off at $s \propto L^{d_f}$. Therefore also the
divergence of the susceptibility is
We will look more closely at the combination of exponents that appears in the expression
for $\chi_L(p)$: since $(3 - \tau)/\sigma = \gamma$ and $1/(\sigma d_f) = \nu$ we get
in other words: $\chi_L(p)$ is composed of a diverging factor $\delta^{-\gamma}$, which describes the
behavior in the infinite system, and a scaling function that depends on the scaling
variable $L/\xi$, where $\xi = \delta^{-\nu}$. Since there cannot be a divergence in a finite system,
one expects that $\tilde{\chi}(x)$ vanishes for $x \to 0$ in such a way that the divergence of the
prefactor is canceled. In this case, $\tilde{\chi}(x) \propto x^{\gamma/\nu}$ for $x \to 0$. This yields at $p = p_c$
($\delta = 0$) the correct finite-size behavior $\chi_L(p = p_c) \propto L^{\gamma/\nu}$.
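The exponent relations used here can be checked with exact rational arithmetic. The sketch below assumes the exactly known exponents of two-dimensional percolation, $\beta = 5/36$, $\gamma = 43/18$ and $\nu = 4/3$; these numerical values are not quoted in this section and are inserted as an assumption for the check.

```python
from fractions import Fraction as F

# assumed exact exponents of two-dimensional percolation
beta, gamma, nu, d = F(5, 36), F(43, 18), F(4, 3), 2

tau = 2 + beta / (gamma + beta)      # tau = 2 + beta/(gamma + beta)
sigma = 1 / (gamma + beta)           # sigma = 1/(gamma + beta)
d_f = 1 / (sigma * nu)               # from 1/(sigma d_f) = nu

hyperscaling_ok = (d * nu == gamma + 2 * beta)
gamma_ok = ((3 - tau) / sigma == gamma)
print(tau, sigma, d_f)               # prints 187/91 36/91 91/48
```

The resulting fractal dimension $d_f = 91/48$ agrees with the slope quoted in the caption of Fig. 5.4.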
This is the essence of finite-size scaling (FSS): scaling functions always depend on the
ratio of the two lengths $L$ and $\xi$, and the prefactors describe the singularities, either
in terms of the distance from the critical point, $\delta = \xi^{-1/\nu}$, or in terms of the system size
$L$, which then replaces $\xi$, i.e. for (5.33)
Analogously
Note that these relations only hold asymptotically, i.e. close to $p_c$ and for large values
of $L$. Further away from $p_c$ and for smaller system sizes, corrections to scaling occur.
The general scheme is to perform simulations at different system sizes and use these
data and equations like (5.34), (5.35) and (5.36) to extract all information. Equation
(5.36) is particularly useful for the numerical determination of $p_c$: for $p = p_c$ the value
of $P_L^{\mathrm{perc}}$ is independent of the system size, $P_L^{\mathrm{perc}}(p_c) = \tilde{P}(0)$, and therefore the curves
for different fixed system sizes should intersect at the critical point $p = p_c$!
Practical issues of performing a FSS analysis are covered in Sec. 13.8.3. In the following
section, we will explain the FSS behavior of magnets.
where $M = M(S) = \sum_{i=1}^{N} S_i$ and
is the susceptibility. Here $\delta := (T - T_c)/T_c$ denotes the distance from the critical
point. The singular dependencies of $m(T)$ and $\chi(T)$ on $\delta$ are valid for the infinite
system. If $N = L^d$ is finite one gets
The correlation length $\xi = |\delta|^{-\nu}$ is defined via the decay of spatial spin correlations:
close to the critical point. Since the correlation length diverges at the critical point,
$C(r) \propto r^{-(d-2+\eta)}$ holds for $T = T_c$. $\eta$ is another critical exponent that is related to $\gamma$
via [since $\int \mathrm{d}r\, C(r) = \chi$]
5.4 Magnetic Transition 85
A useful way in which one can estimate the critical exponents from a set of Monte
Carlo data for finite sizes $L$ and different temperatures $T$ (or distances from the critical
point $\delta$) is to plot for instance $\chi_L(T)/L^{\gamma/\nu}$ versus $L\delta^{\nu}$ or $t = L^{1/\nu}\delta$. Then data for
different system sizes should fall on a single curve, the scaling function $\tilde{\chi}$, according
to (5.40). The exponents $\gamma$ and $\nu$ as well as the critical temperature $T_c$ are then fit
parameters, by which one has to try to achieve the best possible data collapse for
different system sizes. In Fig. 5.5 we show such a scaling plot for the two-dimensional
Ising model.
Note that the presence of three free fit parameters makes the determination of the
critical quantities using such a data collapse open to large systematic errors. It would
be much better if one could estimate one or two of these quantities separately such
that the number of free parameters is reduced. Fortunately there is such a method
for determining the critical temperature, similar to the use of the percolating cluster
probability in the aforementioned percolation problem, namely the dimensionless ratio
of moments ("Binder cumulant")
This quantity is the second cumulant of the probability distribution of the fluctuating
magnetization per site m and it is 0 for a Gaussian and 1 for a double-delta function.
Since this is exactly what one expects for the paramagnetic and the ferromagnetic
phase, respectively, of a ferromagnet in the infinite system-size limit, this cumulant is
a step function in the thermodynamic limit. Since for any moment of the magnetization
one expects a scaling behavior similar to (5.39):
it follows that $g_L(T = T_c)$ is independent of system size and a family of curves for
$g_L(T)$ with different but fixed system sizes intersects at $T = T_c$ (with corrections to
scaling, i.e. the deviation from the asymptotic behavior for smaller system sizes).
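The ratio of moments is easy to evaluate from a list of magnetization measurements. The sketch below assumes the commonly used form $g = \frac{1}{2}\left(3 - \langle m^4 \rangle / \langle m^2 \rangle^2\right)$, which reproduces the two limits quoted above (0 for a Gaussian, 1 for a double-delta distribution); the function name is hypothetical.

```python
import random

def binder_cumulant(m_samples):
    """g = (3 - <m^4>/<m^2>^2)/2 from a list of magnetization measurements."""
    n = len(m_samples)
    m2 = sum(m * m for m in m_samples) / n
    m4 = sum(m ** 4 for m in m_samples) / n
    return 0.5 * (3.0 - m4 / (m2 * m2))

# ferromagnetic limit: magnetization sits at +1 or -1 (double-delta distribution)
g_ferro = binder_cumulant([-1.0, 1.0, 1.0, -1.0])

# paramagnetic limit: Gaussian fluctuations around zero
rng = random.Random(0)
g_para = binder_cumulant([rng.gauss(0.0, 0.1) for _ in range(100000)])
```

In a simulation one would evaluate `binder_cumulant` for each temperature and system size and look for the common intersection point of the resulting curves.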
The practical procedure is exemplified in Fig. 5.5 for the two-dimensional Ising model.
First the location of the critical point, $T_c$, is determined by the intersection point of
the curves for different system sizes of the dimensionless ratio of moments $g_L(T)$,
cf. Fig. 5.5a. Then the same data for $g_L(T)$ are plotted against the scaling variable
$L^{1/\nu}(T - T_c)/T_c$, where one chooses the exponent $\nu$ such that the best data collapse
is achieved, cf. Fig. 5.5b. Next the magnetization exponent $\beta$ and the susceptibility
exponent $\gamma$ are estimated by plotting the rescaled magnetization $L^{\beta/\nu} m_L(T)$ and
the rescaled susceptibility $L^{-\gamma/\nu} \chi_L(T)$ against the scaling variable $L^{1/\nu}(T - T_c)/T_c$, see
Fig. 5.5c and Fig. 5.5d, where one chooses the exponents $\beta$ and $\gamma$ such that the best
data collapse is achieved and $T_c$ and $\nu$ are, for instance, taken from a and b. Other
quantities, like e.g. the specific heat, can also be studied in this way, and the critical
exponents ($\alpha = 0$, i.e. a logarithmic divergence in the case of the specific heat of the 2d
Ising model) extracted.
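The rescaling step of such a data-collapse analysis can be sketched as a small helper. The data layout (a dictionary mapping each system size to a list of (temperature, value) pairs) and the function name are assumptions; the synthetic susceptibility data are generated from a made-up scaling function only to demonstrate that correctly chosen parameters map all sizes onto one curve.

```python
def rescale_for_collapse(data, T_c, nu, exponent):
    """Map (T, value) pairs for each system size L to scaling coordinates
    x = L^(1/nu) (T - T_c)/T_c and y = L^(exponent/nu) * value.
    Use exponent = beta for the magnetization and -gamma for the susceptibility."""
    collapsed = {}
    for L, points in data.items():
        collapsed[L] = [
            (L ** (1.0 / nu) * (T - T_c) / T_c, L ** (exponent / nu) * value)
            for T, value in points
        ]
    return collapsed

def f(x):
    """Made-up scaling function, used only to generate synthetic test data."""
    return 1.0 / (1.0 + x * x)

T_c, nu, gamma = 2.269, 1.0, 1.75
data = {
    L: [(T, L ** (gamma / nu) * f(L ** (1.0 / nu) * (T - T_c) / T_c))
        for T in (2.0, 2.2, 2.269, 2.3, 2.5)]
    for L in (8, 16, 32)
}
collapsed = rescale_for_collapse(data, T_c, nu, -gamma)
```

With real data, $T_c$, $\nu$ and the exponent are varied until the rescaled points of all system sizes fall onto a common curve as closely as possible.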
One recognizes the similarity between the critical singularities for percolation and the
ones for a thermal phase transition, and also of the finite-size scaling behavior. Actually,
for the case that there is one single length scale determining the transition (the
Figure 5.5: Finite-size scaling (FSS) behavior of the two-dimensional Ising model
on the square lattice (with $J = 1$). The data are obtained with a conventional single
spin-flip Monte Carlo simulation using Metropolis transition rates (5.17). (a) The
dimensionless ratio of moments $g_L(T)$, Eq. (5.43). The data for different system sizes
intersect at the temperature $T_c = 2.27$, which is indicated by the vertical line. This
is the estimate for the critical temperature (which is exactly $T_c = 2.269\ldots$). (b)
Scaling plot of the dimensionless ratio of moments according to (5.43). The data of
(a) are plotted against the scaling variable $L^{1/\nu}(T - T_c)/T_c$ with $T_c = 2.27$ (from a)
and $\nu = 1$ (which is the exact value). Note that the data collapse is good close to
the critical point (around 0 on the x-axis) and gets worse far away from it. (c) The
magnetization $m_L(T)$, rescaled by its FSS behavior at $T_c$, $L^{-\beta/\nu}$, versus the scaling
variable $L^{1/\nu}(T - T_c)/T_c$, with $T_c = 2.27$ (from a), $\nu = 1$ and $\beta = 1/8$ (which are the
exact values). Close to the critical point the data collapse according to (5.39) is good.
(d) FSS plot of the susceptibility $\chi_L(T)$, Eq. (5.40), which is rescaled by its FSS behavior at
$T_c$, $L^{\gamma/\nu}$, versus the scaling variable $L^{1/\nu}(T - T_c)/T_c$, with $T_c = 2.27$ (from a), $\nu = 1$
and $\gamma = 7/4$ (which are the exact values).
correlation length) this is the generic scenario (there are always exceptions, but as a
working hypothesis it is good for many systems): an order parameter, its susceptibility
and the correlation function (including a characteristic length scale, the correlation
length) yield a set of critical exponents: $\alpha$ for the specific heat, $\beta$ for the order parameter,
$\gamma$ for the order parameter susceptibility, $\nu$ for the correlation length, etc.,
5.5 Disordered Systems 87
and usually only 2 of them are independent; in the case when the hyperscaling relation
$d\nu = 2 - \alpha$ is violated, 3 of them. These determine the universality class of the
system.
Now we will briefly introduce the notion of universality. According to the theory of
critical phenomena including the renormalization group, the exponents determining
the critical singularities of a many-particle system at a second-order phase transition
depend only on features like the space dimension $d$, the number of components of the
order parameter (e.g. 1 for the Ising model, 2 for XY spins etc.), the range of the
interactions (short-range versus long-range, like e.g. Coulomb), possibly on quenched
disorder (see below, depending on whether it is relevant or not), sometimes on the
type of frustration (e.g. for spin glasses, see Chap. 9), etc. But they do not depend on
microscopic features like the detailed lattice structure or next and next-to-next nearest
neighbor interactions (as long as no frustration arises through these additional interactions),
i.e. they are universal. The two-dimensional Ising model has the same critical
exponents ($\alpha$, $\beta$, $\gamma$, etc.) for nearest-neighbor interactions on the square lattice, the
triangular lattice, the hexagonal lattice, the Kagomé lattice etc. Even if ferromagnetic
next-nearest (or n-nearest) neighbor interactions are considered, the exponents will
not change. The same holds for the three-dimensional case, which is particularly useful,
since the critical point in the aforementioned $(p, V)$ phase diagram of, for instance,
water is in the universality class of the 3d Ising model (the density corresponds to
the magnetization, the chemical potential to the external field, and on a lattice the
presence of a molecule corresponds to spin up, the absence to spin down). This is one
of the most important reasons why physicists are interested in the models we discuss
in this book. These models will never describe an experimental system in all details
- but the concept of universality tells us that this is not necessary as far as critical
phenomena are concerned.
where $S_i = \pm 1$, the sum runs over all nearest-neighbor pairs of spins of a simple cubic
lattice and the $J_{ij}\ (> 0)$ are quenched random variables (independently identically
describing the variance of the distribution of the observable $O$ generated by the
disorder realization, do in general not decay to zero when the ratio of system size and
correlation length, $L/\xi(T)$, approaches zero, i.e. when $\xi$ diverges, e.g. at a critical point.
where $[\ldots]_{av}$ denotes the average over the quenched disorder, and $\langle \ldots \rangle_T$ the thermal
average with fixed realization of the disorder. More explicitly this means for the
random Ising ferromagnet (5.45):
In practice (i.e. in a computer simulation) the exact average over the random variables
has to be replaced by an unbiased sum over a large enough number of disorder
realizations.
When computing ground-state properties of a model with quenched disorder the ther-
mal average is simply replaced by the computation of the (exact) ground state(s) S
and an evaluation of the observables under consideration in this state(s) S.
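As a toy illustration of a disorder average over ground states, the sketch below draws random fields, finds the ground state of each realization by brute-force enumeration, and averages the absolute magnetization per spin. Everything specific here is an assumption made for the example: the one-dimensional open chain with $H = -J \sum_i S_i S_{i+1} - \sum_i h_i S_i$, Gaussian fields of width `h_r`, the tiny system size, and the function names. Realistic system sizes require the exact optimization algorithms discussed in the later chapters.

```python
import itertools
import random

def ground_state(J, fields):
    """Brute-force ground state of an open chain with local fields."""
    N = len(fields)
    best_E, best_S = None, None
    for S in itertools.product((-1, 1), repeat=N):
        E = -J * sum(S[i] * S[i + 1] for i in range(N - 1))
        E -= sum(h * s for h, s in zip(fields, S))
        if best_E is None or E < best_E:
            best_E, best_S = E, S
    return best_S

def disorder_averaged_magnetization(N, h_r, n_samples, rng, J=1.0):
    """Average of |m| in the ground state over n_samples field realizations."""
    total = 0.0
    for _ in range(n_samples):
        fields = [rng.gauss(0.0, h_r) for _ in range(N)]   # one realization
        S0 = ground_state(J, fields)
        total += abs(sum(S0)) / N
    return total / n_samples

m_av = disorder_averaged_magnetization(6, 1.0, 20, random.Random(0))
```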
The random-field Ising model, for instance (see Chap. 6),
where $h_i$ is a random variable, modeling a random external field, obeying some distribution
with zero mean and variance $h_r^2$, has in three space dimensions a phase transition
also at zero temperature ($T = 0$) from a paramagnetic phase ($m = 0$) to a ferromagnetic
phase ($m > 0$) at a critical strength $h_c$ of the random fields. If $S_0(h)$ denotes the
ground state of the Hamiltonian $H$ with random fields $h$, the average magnetization is
simply
Sometimes one is interested in correlated disorder, for which only the generation of
the random variables has to be adapted such that the joint distribution is obeyed, for
instance $P(h_1, \ldots, h_N)$ instead of $\prod_i P(h_i)$.
The physical applications presented in this book are predominantly disordered systems
so that we can skip the presentation of examples here: they will come in abundance
in the later chapters.
The concept of universality, which we mentioned in the previous section concerning
critical phenomena, carries over to critical points in disordered systems as well and
implies here, besides the irrelevance of microscopic details of the model, that the critical
exponents do not depend on the detailed probability distribution of the disorder.
However, this idea is far from being as well established for disordered systems as it
is for homogeneous systems, simply because exact results for disordered systems are
rare and renormalization group calculations for field theories of disordered systems are
difficult. Here one should note that only the exactly solvable disordered models do
display this universality (e.g. the McCoy-Wu model, which is a 2d random ferromagnet
with disorder only in one space direction).
On the other hand a number of numerical calculations (Monte Carlo at finite temperature
as well as ground-state calculations at zero temperature) for spin-glass models
[14] as well as random-field models [15] appear to be incompatible with the concept of
universality: a binary distribution seems to yield a different set of critical exponents
than a continuous distribution. The usual argument brought against these numerical
results by the community of believers in universality is that these computations are
still in a pre-asymptotic regime, where finite-size effects are still strong and the true
and unique disorder fixed point is still too far away when renormalizing the system
sizes that could be studied. We are inclined to use Occam's razor in any unclear situation
and simply recommend that it is always good to think twice before one abandons
an appealingly elegant concept like universality, even if there is no rigorous proof that
it holds in the particular case one is considering. Nevertheless, when the numerical
evidence against it grows in strength over the years, it might be a good time to start
to think about something new.
Bibliography
[1] D. Chandler, Introduction to Modern Statistical Mechanics, (Oxford University Press, Oxford 1987)
[2] J.M. Yeomans, Statistical Mechanics of Phase Transitions, (Clarendon Press, Oxford 1992)
[3] L.E. Reichl, A Modern Course in Statistical Physics, (John Wiley & Sons, New York 1998)
[4] N. Metropolis, A.W. Rosenbluth, M.N. Rosenbluth, A.H. Teller, and E. Teller, J. Chem. Phys. 21, 1087 (1953)
[5] R.H. Swendsen and J.S. Wang, Phys. Rev. Lett. 58, 86 (1987)
[6] U. Wolff, Phys. Rev. Lett. 60, 1461 (1988)
[7] M.E.J. Newman and G.T. Barkema, Monte Carlo Methods in Statistical Physics, (Clarendon Press, Oxford 1999)
[8] D.P. Landau and K. Binder, A Guide to Monte Carlo Simulations in Statistical Physics, (Cambridge University Press, Cambridge 2000)
[9] S. Kirkpatrick, C.D. Gelatt, and M.P. Vecchi, Science 220, 671 (1983)
[10] D.J. Amit, Field Theory, The Renormalization Group and Critical Phenomena, (World Scientific, Singapore 1984)
[11] J. Cardy, Scaling and Renormalization in Statistical Physics, (Cambridge University Press 1996)
[12] D. Stauffer and A. Aharony, Perkolationstheorie: eine Einfuehrung, (Wiley-VCH, Weinheim 1995)
[13] S. Wiseman and E. Domany, Phys. Rev. E 52, 3469 (1995)
[14] L. Bernardi and I.A. Campbell, Phys. Rev. B 52, 12501 (1995)
[15] J.-C. Anglès d'Auriac and N. Sourlas, Europhys. Lett. 39, 473 (1997)
6 Maximum-flow Methods
This chapter introduces a model for a certain class of random magnetic systems. They
consist of lattices where the atoms have a small magnetic moment, i.e. a spin. Within
this model, for neighboring spins it is energetically favorable to point in the same
direction, which means they interact ferromagnetically. A magnetic field acts on
each spin; its sign and strength may change randomly and independently from spin to
spin. These types of systems are described by random-field models. They have been
studied widely by means of computer simulations and analytical calculations.
There are no direct experimental realizations of random-field systems, but it can be
shown that another class of systems, the diluted antiferromagnet in a field (DAFF),
can be mapped onto random-field systems. The DAFF consists of spins on lattices
as well, but neighboring spins interact antiferromagnetically. The system is called
diluted because not all sites of the lattice are occupied by magnetic atoms. Similar
to the random-field model, a magnetic field acts on the spins, but it has the same
direction and the same strength for all spins, i.e. it is homogeneous.
Here we are interested in the low-temperature properties of random-field systems and
diluted antiferromagnets, especially in the ground states. From the viewpoint of a
computational physicist, the calculation of ground states of this type of system belongs
to the class of P problems, i.e. it can be done in polynomial time, so the T = 0 behavior
can be studied for large system sizes.
Initially both models and some basic results are presented. Then it is shown how
these systems can be mapped onto networks. It is demonstrated that the maximum
flow through such a network is equivalent to the minimum energy of the corresponding
system. Next, for pedagogical reasons, a simple algorithm for calculating the maximum
flow is presented. Unfortunately, this algorithm turns out to be very slow. Therefore,
in the main part of this chapter a highly efficient method is presented. Sometimes
random-field systems and diluted antiferromagnets exhibit a huge number of ground
states, i.e. the ground state is degenerate. It is explained how all different ground
states of such a system can be calculated by a second calculation after the maximum
flow has been obtained. Finally some of the most interesting results obtained with
these algorithms are shown.
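The random-field Ising model (RFIM) consists of N Ising spins $\sigma_i = \pm 1$ on a lattice, each subject to a local field $B_i$. In its standard form, which is assumed here to be the form referred to as (6.1) in this chapter, the Hamiltonian reads:

```latex
% standard random-field Ising Hamiltonian (assumed form of Eq. (6.1))
H = -J \sum_{\langle i,j \rangle} \sigma_i \sigma_j - \sum_{i} B_i \sigma_i
```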
The sum $\langle i,j \rangle$ runs over pairs of nearest neighbors. $J > 0$ denotes the magnitude of the
interaction. Here only simple cubic lattices are considered. Since physical properties
depend on the values of the $B_i$, one has to implement quenched disorder. This means
that the local magnetic fields are drawn independently from a probability distribution
$p(B)$ and remain fixed throughout a calculation or simulation. A system with a certain
choice of the values $\{B_i\}$ is called a realization of the disorder. Since the result of each
calculation depends on the choice of the $\{B_i\}$, one has to repeat the process several
times and average the results over many realizations. It turns out that for random
systems the results vary strongly from realization to realization. Therefore one needs
many of them to obtain reliable results, which is very common when studying random
systems.
For the distribution of the random fields, usually either a bimodal or a Gaussian
distribution is used to draw the random realizations. In the case of a bimodal distribution
the local fields take one of two values $B_i = \pm\Delta$ with equal probability. For
the Gaussian case arbitrary fields are allowed. The probability density functions are
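Written out in the usual normalization (an assumption here, with $\Delta$ denoting the width of the distribution), the two densities are:

```latex
% bimodal distribution: two delta peaks of equal weight at B = +Delta and B = -Delta
p(B) = \frac{1}{2} \left[ \delta(B - \Delta) + \delta(B + \Delta) \right]

% Gaussian distribution with zero mean and standard deviation Delta
p(B) = \frac{1}{\sqrt{2\pi}\,\Delta} \exp\!\left( -\frac{B^2}{2\Delta^2} \right)
```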
The parameter $\Delta$, the width of the distribution, is a measure for the strength of the
disorder. For $\Delta = 0$ a pure ferromagnet is obtained. Several attempts to treat the
system analytically have been performed [1, 2]. For a review of results obtained by
means of computer simulation, see [3].
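Drawing realizations of the disorder and averaging over them can be sketched as follows. This is only an illustration: the function names `draw_fields` and `disorder_average`, the parameter names and the choice of the Gaussian width as standard deviation are assumptions made for the example, not part of the text.

```python
import random

def draw_fields(n, delta, kind="bimodal", rng=None):
    """Draw one realization {B_i} of n independent random fields of width delta."""
    if rng is None:
        rng = random.Random()
    if kind == "bimodal":
        # B_i = +delta or -delta with equal probability
        return [delta if rng.random() < 0.5 else -delta for _ in range(n)]
    if kind == "gaussian":
        # zero mean, standard deviation delta (assumed meaning of the width)
        return [rng.gauss(0.0, delta) for _ in range(n)]
    raise ValueError("kind must be 'bimodal' or 'gaussian'")

def disorder_average(observable, n, delta, realizations, kind="bimodal", seed=0):
    """Average observable(fields) over many independent realizations."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(realizations):
        fields = draw_fields(n, delta, kind, rng)  # fields stay fixed per realization
        total += observable(fields)
    return total / realizations

# example: mean field strength per spin for the bimodal case
mean_abs = disorder_average(lambda f: sum(abs(b) for b in f) / len(f), 100, 1.5, 20)
print(mean_abs)
```

In a real study `observable` would be a ground-state calculation for the given realization; here it is a placeholder.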
As already mentioned, there is no direct experimental realization of a random-field
system. Instead diluted antiferromagnets in a field are studied. Later we will see that
indeed the behavior of the RFIM agrees remarkably well with that of the DAFF. Before
we present the algorithms to calculate ground states for these models, we will focus
briefly on the DAFF.
The system FeF$_2$ is an antiferromagnet. By replacing some of the magnetic Fe by
non-magnetic Zn one obtains Fe$_x$Zn$_{1-x}$F$_2$, and a diluted antiferromagnet is created.
The crystal structure is that of rutile (TiO$_2$) (see Fig. 6.1). The only relevant type of
interaction is the antiferromagnetic superexchange next-nearest-neighbor interaction
between the Fe atoms on the body-corner and the body-center sites. This kind of
interaction is generated by the intermediate F atoms. Otherwise the Fe atoms would
be ferromagnetic.
6.1 Random-field Systems and Diluted Antiferromagnets 93
At low temperatures this system exhibits many peculiar properties, e.g. it develops
frozen domains, as found by neutron-scattering experiments [4]. An overview of experimental
results is given in [5].
Figure 6.1: Crystal structure of Fe$_x$Zn$_{1-x}$F$_2$. Small circles denote the Fe/Zn sites
while the large circles are the F sites. The eight corner sites (Fe/Zn) are shared by 8
lattice cells, while the body-center site is not shared. The upper and lower pairs of F
sites are shared by two cells, while the left and right sites are not shared. All together
there are 2 Fe/Zn sites and 4 F sites per lattice cell.
For diluted antiferromagnets with a high anisotropy, like Fe$_x$Zn$_{1-x}$F$_2$, at low temperatures
the spins, i.e. the magnetic Fe moments, are usually oriented along one axis
(the c axis of the crystal). Thus, they can take only two directions, called "up" and
"down". It is assumed that the reason for this anisotropy is a slight deviation of the
crystal structure from the pure cubic symmetry.
The Ising model, which has been presented above, is very well suited to describe this
type of system. For the case of a DAFF, the spins interact antiferromagnetically with
their neighbors. Please note that only the strong interactions are considered in the
following model. So the F atoms are not considered here, since they do not contribute
directly to the magnetic behavior. As a consequence, a simple cubic crystal structure
is sufficient. To describe the dilution, i.e. the fact that not all Fe/Zn sites are occupied
by magnetic Fe atoms, a second variable $\epsilon_i = 0, 1$ is introduced. A non-magnetic Zn
site is represented by $\epsilon_i = 0$ while $\epsilon_i = 1$ holds for a site occupied by a spin (Fe).
Additionally, an external magnetic field B can act on the system. Therefore, the
energy of a diluted antiferromagnet in a field is given by
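With the conventions just introduced (antiferromagnetic coupling $J > 0$, occupation numbers $\epsilon_i$, homogeneous field $B$), the energy takes the following form. The expression is a sketch written out from these definitions, assuming the same sign convention for the field term as in the random-field model:

```latex
% assumed form of the DAFF energy, Eq. (6.4)
H = J \sum_{\langle i,j \rangle} \epsilon_i \sigma_i\, \epsilon_j \sigma_j - B \sum_{i} \epsilon_i \sigma_i
```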
The strength of the interaction is denoted by $J > 0$. The sum $\langle i,j \rangle$ runs over pairs
of nearest neighbors. Here only simple cubic lattices are considered. Like the random-field
model, diluted antiferromagnets exhibit quenched disorder. Each realization is
characterized by a set $\{\epsilon_i\}$ of independent random numbers; here we will consider
$\epsilon_i = 0, 1$ with equal probability.
What can we expect for the behavior of this model? In zero field and at zero temperature
the system assumes the ground state, i.e. the state with the lowest energy. Therefore, in
(6.4) configurations $\{\sigma_i\}$ are favorable where neighboring spins take different orientations,
because their contribution to the energy $J\sigma_i\sigma_j = -J$ turns out to be negative.
This means that, apart from the fact that not all lattice sites are occupied, the system can
be divided into two sublattices, which penetrate each other in checkerboard fashion.
On one of these sublattices all spins point in one direction (say up), while the spins of
the other sublattice are oriented in the opposite way. In Fig. 6.2 an example of a d = 2
dimensional diluted antiferromagnet at B = T = 0 is shown. In higher dimensions the
model behaves similarly, but it is more difficult to draw.
By increasing the temperature, the DAFF is driven away from the ground state.
This means the spins start to fluctuate thermally. When some critical temperature is
reached the system becomes paramagnetic (PM). Also by increasing the magnitude of
the external magnetic field, the antiferromagnetic order is disturbed. For large values
of B, in particular if $|B| > 2dJ$, all spins point in the direction of the field (at T = 0).
These expectations can be recovered in the schematic phase diagram of the three-dimensional
DAFF model (50% dilution) which is shown in Fig. 6.3. It was measured
by means of Monte Carlo simulations at finite temperature T, see Refs. [6, 7]. The
Figure 6.3: Schematic phase diagram of the DAFF in the B-T plane, obtained by
MC simulation [6]. Three regions can be identified: antiferromagnetic (AFM), domain
state (DS) and paramagnetic (PM).
ordered phase for low temperatures and low fields was established by evaluating the
staggered magnetization
Here, x, y, z are the spatial coordinates of spin i. This order parameter accounts for
the fact that in the presence of order the two penetrating sublattices have opposite
magnetizations. Therefore it takes the value 1 for antiferromagnetic order on a cubic lattice.
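As an illustration, the staggered magnetization of a spin configuration can be computed as in the following minimal sketch. The function name, the nested-list data layout for a cubic L x L x L lattice and the normalization by the number of occupied sites are assumptions made for this example.

```python
def staggered_magnetization(sigma, eps):
    """Staggered magnetization of a cubic lattice.

    sigma[x][y][z] is the spin (+1 or -1), eps[x][y][z] the occupation (0 or 1).
    Each spin is weighted with the checkerboard sign (-1)^(x+y+z).
    """
    L = len(sigma)  # linear size, the lattice is assumed to be L x L x L
    total = 0
    occupied = 0
    for x in range(L):
        for y in range(L):
            for z in range(L):
                sign = -1 if (x + y + z) % 2 else 1
                total += sign * eps[x][y][z] * sigma[x][y][z]
                occupied += eps[x][y][z]
    # normalization by the number of occupied sites is an assumption
    return total / occupied if occupied else 0.0

L = 2
ones = [[[1] * L for _ in range(L)] for _ in range(L)]
afm = [[[1 if (x + y + z) % 2 == 0 else -1 for z in range(L)]
        for y in range(L)] for x in range(L)]
print(staggered_magnetization(afm, ones))   # perfectly ordered antiferromagnet
print(staggered_magnetization(ones, ones))  # all spins up: sublattices cancel
```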
The transition from the so-called domain state (DS) to the disordered phase is established
in the following way: for a given field B the system is initialized in an ordered state
at T = 0 and the MC simulation starts. Then the temperature is slowly increased,
here up to T = 2.5. This process is called field heating (FH). The temperature is
then slowly decreased again (FC = field cooling). Now the system is not able to find its
way back to the starting configuration. This can be seen by recording the staggered
magnetization. At some temperature $T_{\mathrm{irr}}(B)$ the FH and FC curves begin to deviate
from each other. This is called the onset of irreversibility. The line which separates
the PM and the DS regions in Fig. 6.3 is just $T_{\mathrm{irr}}$ as a function of B. Please note
that the value of $T_{\mathrm{irr}}(B)$ defined in this fashion depends on the dynamics, i.e. on the
heating/cooling rate. A more detailed study has shown that the DS region can be
characterized by large fractal domains penetrating the system.
The behavior of true diluted antiferromagnets agrees very well with these computer
simulations. But it is relatively hard to treat this model by means of an analytical
approach. As a consequence random-field systems are usually studied. It can
be shown fairly easily that quite often a DAFF can be mapped onto an RFIM. The
mapping works for example in the case of a simple cubic (or square) lattice. One performs
a gauge transformation by introducing new spin variables $\sigma'_i = (-1)^{x+y+z}\sigma_i$,
where x, y, z are the spatial coordinates of spin i. This transformation multiplies
every second spin, in a checkerboard fashion, with minus one, thus all bonds become
ferromagnetic. The resulting Hamiltonian describes a diluted ferromagnet with staggered
field $B_i = (-1)^{x+y+z}B$. Please note that this kind of transformation does not work
for all lattice types; it fails for example for triangular lattices. An example of such a
transformation can be found in Sec. 9.2.
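The gauge transformation can be checked numerically on a small system. The sketch below uses a square lattice with open boundaries and compares the DAFF energy of a random configuration with the energy of the gauged configuration in a diluted ferromagnet with staggered field. The energy functions follow the standard forms assumed for this illustration, and all function names are hypothetical.

```python
import random

def bonds(L):
    """Nearest-neighbor pairs of an L x L square lattice with open boundaries."""
    for x in range(L):
        for y in range(L):
            if x + 1 < L:
                yield (x, y), (x + 1, y)
            if y + 1 < L:
                yield (x, y), (x, y + 1)

def daff_energy(sigma, eps, J, B, L):
    """Antiferromagnetic bonds (+J) and homogeneous field B."""
    e = sum(J * eps[a] * eps[b] * sigma[a] * sigma[b] for a, b in bonds(L))
    return e - B * sum(eps[s] * sigma[s] for s in sigma)

def gauged_energy(sigma_new, eps, J, B, L):
    """Ferromagnetic bonds (-J) and staggered field B_i = (-1)^(x+y) B."""
    e = -sum(J * eps[a] * eps[b] * sigma_new[a] * sigma_new[b] for a, b in bonds(L))
    return e - sum((-1) ** (x + y) * B * eps[(x, y)] * sigma_new[(x, y)]
                   for (x, y) in sigma_new)

def gauge_transform(sigma):
    """New spins: every second spin (checkerboard) is multiplied by -1."""
    return {(x, y): (-1) ** (x + y) * s for (x, y), s in sigma.items()}

L, J, B = 4, 1, 3  # integer values keep the comparison exact
rng = random.Random(0)
sites = [(x, y) for x in range(L) for y in range(L)]
eps = {s: rng.choice((0, 1)) for s in sites}      # 50% dilution
sigma = {s: rng.choice((-1, 1)) for s in sites}   # arbitrary configuration
same = daff_energy(sigma, eps, J, B, L) == gauged_energy(gauge_transform(sigma), eps, J, B, L)
print(same)
```

The two energies agree for every configuration because neighboring sites always have opposite checkerboard signs on this lattice, which is exactly what fails on a triangular lattice.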
As already mentioned, random-field systems are easier to treat analytically. For this
reason, most of the theoretical research has focused on this model. We finish this
section by stating the expectations about the basic behavior of the model. With low
temperatures and low fields the system exhibits a ferromagnetic long-range order. For
large temperatures the system becomes paramagnetic. By increasing the strength $\Delta$
of the random fields, the spins tend to be oriented along the direction of the field. If
$|B_i| > 2dJ$, spin i is fixed (at T = 0). But at zero temperature for intermediate values
of the field a peculiar behavior of the RFIM appears, similar to the DS phase of the
diluted antiferromagnet. This region of the phase diagram can be studied by means
of ground-state calculations. The basic idea of the method for obtaining the ground
states is explained in the next section.
$s \in S, \quad t \in \bar{S}$
Usually, we draw the elements of $S$ to the left and the elements of $\bar{S}$ to the right of the cut. In
Fig. 6.4 an example network with 5 nodes and 6 edges is shown.
Figure 6.4: A graph with 3 inner vertices 1, 2, 3. A cut $(S, \bar{S}) = (\{s, 1\}, \{2, 3, t\})$ is
represented by a dashed-dotted line. The capacity of the cut is $C(\{s, 1\}, \{2, 3, t\}) = c_{s2} + c_{13}$.
The edge connecting vertices 1 and 2 does not contribute to the cut, because it is oriented in the
opposite direction to the cut.
Since the amount of work (or TNT) to remove a pipeline grows with its capacity, the
terrorists are interested in the capacity of the pipes they have to remove. These are
exactly the pipes which go from $S$ to $\bar{S}$, that is, from left to right. We say these
pipes/edges cross the cut. Edges in the opposite direction cannot contribute to a flow
from the source to the sink, thus they are disregarded. The sum of the capacities of
the edges crossing the cut is called the capacity $C(S, \bar{S})$ of the cut:
$$C(S, \bar{S}) = \sum_{i \in S,\ j \in \bar{S}} c_{ij}$$
In the example presented in Fig. 6.4 the capacity of the cut is $C(S, \bar{S}) = c_{s2} + c_{13}$.
The edge connecting vertices 1 and 2 does not contribute, since it leads from right to left.
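The definition of the cut capacity translates directly into code. The following sketch represents a network as a dictionary of directed edges; the capacity values are hypothetical, chosen only to mimic the structure of the example with 5 nodes and 6 edges, and the function name is likewise an assumption.

```python
def cut_capacity(capacity, S):
    """Sum the capacities of all directed edges leading from S to its complement.

    capacity maps directed edges (i, j) to c_ij, S is the set of vertices
    on the source side of the cut. Edges pointing back into S are ignored.
    """
    return sum(c for (i, j), c in capacity.items() if i in S and j not in S)

# hypothetical capacities for a network with source "s", sink "t"
# and inner vertices 1, 2, 3
network = {
    ("s", 1): 3, ("s", 2): 2,
    (2, 1): 1,              # leads from right to left for the cut below
    (1, 3): 4,
    (2, "t"): 2, (3, "t"): 5,
}
print(cut_capacity(network, {"s", 1}))  # c_s2 + c_13 = 2 + 4
```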
The basic idea to represent a Hamiltonian by a network is to represent a cut by a
binary vector $X = (x_0, x_1, \ldots, x_n, x_{n+1})$ with $x_i = 1$ if $i \in S$ and $x_i = 0$ if $i \in \bar{S}$,
where vertex 0 is the source and vertex $n+1$ is the sink.
For an (s,t)-cut the values $x_0 = 1$ and $x_{n+1} = 0$ are fixed. An edge (i, j) goes
from left to right of the cut only if $x_i = 1$ and $x_j = 0$. Therefore, the formula for the
capacity of a cut can be rewritten in the following way (all sums run from 0 to $n+1$):
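A compact way to write this, given here as a sketch that follows directly from the condition $x_i = 1$, $x_j = 0$ for a contributing edge (i, j), is:

```latex
C(X) = \sum_{i,j} c_{ij}\, x_i (1 - x_j)
     = -\sum_{i,j} c_{ij}\, x_i x_j + \sum_{i} \Bigl( \sum_{j} c_{ij} \Bigr) x_i
```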
Here, it can already be seen that the structure of the formula resembles the energy
functions for the random-field system: it consists of a linear and a quadratic term.
To make the correspondence complete, the values of the source and the sink, $x_0 = 1$ and
$x_{n+1} = 0$, have to be inserted; additionally $c_{ii} = 0\ \forall i$ is assumed. For didactic
reasons, the capacities involving the source 0 or the sink $n+1$ are written as extra
terms. One obtains:
The last slight difference to a Hamiltonian is that the capacity is given in terms of
zero-one variables $x_i = 0, 1$ while a spin may take the values $\sigma_i = \pm 1$. Thus, one can
identify $x_i = 0.5(\sigma_i + 1)$. In the Hamiltonian the sum runs over all bonds, while for $C(x_1, \ldots, x_n)$
each pair of vertices appears twice in the quadratic term. Thus, the identity
$\sum_{i,j} c_{ij} = \sum_{i<j} (c_{ij} + c_{ji})$ is used. In the final formula all sums run from 1 to n:
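Carrying out the substitution $x_i = 0.5(\sigma_i + 1)$ term by term gives the following expression. It is a reconstruction under the stated conventions ($x_0 = 1$, $x_{n+1} = 0$, $c_{ii} = 0$), so the grouping of terms may differ from the printed formula (6.10):

```latex
C = -\frac{1}{4} \sum_{i<j} (c_{ij} + c_{ji})\, \sigma_i \sigma_j
    + \sum_{i} \Bigl[ -\frac{1}{2} c_{0i} + \frac{1}{2} c_{i,n+1}
                      + \frac{1}{4} \sum_{j} (c_{ij} - c_{ji}) \Bigr] \sigma_i
    + \frac{1}{4} \sum_{i,j} c_{ij}
    + \frac{1}{2} \sum_{i} (c_{0i} + c_{i,n+1}) + c_{0,n+1}
```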
This formula represents a system which has $n+2$ vertices: one source, one sink and
n (inner) vertices, one for each spin $\sigma_i$. The model is slightly more general than the
system represented by the Hamiltonians in (6.1) and (6.4). Thus, formula (6.10) can
be compared with an energy function which generalizes both Hamiltonians ($\epsilon_i = 0, 1$):
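A form consistent with the comparison carried out in the following paragraphs (where $c_{ij} + c_{ji} = 4 J_{ij} \epsilon_i \epsilon_j$ is obtained) is given below; the symbols $J_{ij}$ and $B_i$ for bond strengths and local fields are assumed to match the notation of (6.11):

```latex
% assumed form of the generalized energy function, Eq. (6.11)
H = -\sum_{i<j} J_{ij}\, \epsilon_i \sigma_i\, \epsilon_j \sigma_j - \sum_{i} B_i\, \epsilon_i \sigma_i
```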
This is a diluted random-field system with bonds of variable strengths. Another kind
of model which falls into this class is the random-bond ferromagnet [12, 13], where
the system is not diluted nor does it have a random field, but the strengths of the
ferromagnetic bonds are drawn randomly.
Now we want to choose the capacities $c_{ij}$ in such a way that the system (6.11) is
represented by a network. By comparison with (6.10) we find $c_{ij} + c_{ji} = 4 J_{ij} \epsilon_i \epsilon_j$ for
$i, j \in \{1, \ldots, n\}$. Since for a network only non-negative capacities are allowed, the bond
values have to be non-negative as well. Both $c_{ij}$ and $c_{ji}$ appear in the equality, so
there is some freedom of choice. We choose that non-zero capacities shall be present
only for edges (i, j) with i < j. The reason for this choice is that we have only edges
6.2 Transformation to a Graph 99
going from vertices with smaller number to vertices with higher number. This allows
some algorithms to be implemented in a way that they run faster. For the capacities
we obtain ($i, j \in \{1, \ldots, n\}$):
Finally, since the sum of the constant terms in (6.10) must vanish, we obtain
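Matching the cut capacity to the energy function term by term leads to the following choice. This is a sketch derived from the two formulas rather than a transcription of Eqs. (6.12) to (6.15); in particular the auxiliary quantity $w_i$ is a name introduced here for illustration:

```latex
% inner edges, only from lower to higher vertex number
c_{ij} = 4 J_{ij}\, \epsilon_i \epsilon_j \quad (i < j), \qquad c_{ij} = 0 \quad (i \ge j)

% net linear weight of vertex i
w_i = 2 B_i \epsilon_i + \frac{1}{2} \sum_{j} (c_{ij} - c_{ji})

% edges from the source (vertex 0) and to the sink (vertex n+1)
c_{0i} = \max(w_i, 0), \qquad c_{i,n+1} = \max(-w_i, 0)

% constant term: edge between source and sink
c_{0,n+1} = -\frac{1}{4} \sum_{i,j} c_{ij} - \frac{1}{2} \sum_{i} (c_{0i} + c_{i,n+1})
```

These expressions reproduce the value $c_{05} = -8$ quoted for the four-spin example later in this section.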
The capacity $c_{0,n+1}$ of the edge connecting source and sink may be positive or negative.
But in this case it does not matter, since this edge crosses every (s,t)-cut of the network.
Consequently, it can be removed from the network at the beginning. Later, after the capacity of a cut
has been obtained, the value of $c_{0,n+1}$ has to be added to obtain the energy of the
corresponding system.
To summarize, for a system defined by the energy function (6.11) an equivalent network
can be constructed by creating a graph which has one vertex for each spin and
additionally a source and a sink. The capacities of the edges are chosen by performing
the steps given through Eqs. (6.12), (6.14) and (6.15). Then, every configuration of the system
corresponds to a cut in the network ($\sigma_i = 2x_i - 1$) and the energy $H(\sigma_1, \ldots, \sigma_n)$
is equal to the capacity $C(x_1, \ldots, x_n)$ of the corresponding cut.
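The construction can be sketched in a few lines and checked by brute force. The code below is an illustration under assumptions: the capacities are obtained by matching the cut capacity with an energy of the form $H = -\sum_{i<j} J_{ij}\epsilon_i\epsilon_j\sigma_i\sigma_j - \sum_i B_i\epsilon_i\sigma_i$, the helper quantity `w` and all function names are hypothetical, and the three-spin test system is invented for the example.

```python
from itertools import product

def build_network(J, B, eps):
    """Return (capacities, c_const) for spins 1..n; vertex 0 = source, n+1 = sink.

    J: dict {(i, j): J_ij >= 0} with i < j; B and eps: dicts {i: value}.
    c_const plays the role of the source-sink edge and is kept separately.
    """
    n = len(B)
    cap = {(i, j): 4 * Jij * eps[i] * eps[j] for (i, j), Jij in J.items()}
    w = {i: 2 * B[i] * eps[i] for i in range(1, n + 1)}
    for (i, j), c in cap.items():
        w[i] += 0.5 * c   # outgoing inner edges
        w[j] -= 0.5 * c   # incoming inner edges
    c_const = -0.25 * sum(cap.values())
    for i, wi in w.items():
        if wi > 0:
            cap[(0, i)] = wi        # edge from the source
        elif wi < 0:
            cap[(i, n + 1)] = -wi   # edge to the sink
        c_const -= 0.5 * abs(wi)
    return cap, c_const

def energy(sigma, J, B, eps):
    e = -sum(Jij * eps[i] * eps[j] * sigma[i] * sigma[j] for (i, j), Jij in J.items())
    return e - sum(B[i] * eps[i] * sigma[i] for i in B)

def cut_capacity(cap, S):
    return sum(c for (i, j), c in cap.items() if i in S and j not in S)

# invented three-spin chain, used only to check the mapping
J = {(1, 2): 1.0, (2, 3): 2.0}
B = {1: 0.5, 2: -1.0, 3: 0.25}
eps = {1: 1, 2: 1, 3: 1}
cap, c_const = build_network(J, B, eps)
for spins in product((-1, 1), repeat=3):
    sigma = dict(zip((1, 2, 3), spins))
    S = {0} | {i for i, s in sigma.items() if s == 1}   # sigma_i = 2 x_i - 1
    assert abs(cut_capacity(cap, S) + c_const - energy(sigma, J, B, eps)) < 1e-12
```

Every spin configuration is compared with its cut, so the check covers the whole mapping for this small system, not only the ground state.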
Since we are interested in ground states, a minimum of the energy is to be obtained.
Consequently, we are looking for a minimum cut, that is a cut among all cuts which
has minimum capacity. Such a cut cannot be obtained directly, but it is related to the
flow going through the network. Each flow must pass the edges crossing an arbitrary
cut, in particular the minimum cut. Therefore, the minimum cut capacity is an upper
bound for the flow. On the other hand, it can be shown that the maximum flow which
is possible is indeed given by the capacity of a minimum cut. The proof was found by
Ford and Fulkerson in 1956 [14]. Versions of the proof which are more instructive can
be found in [15, 16, 17]. Along with the proof a simple method for finding the maximum
flow was introduced. After constructing an equivalent network such a technique can
be applied to find a ground state of a random-field Ising system. The Ford-Fulkerson
algorithm and an extension of it are presented in the next section.
But before we proceed with the algorithms, as an example, we look at a small random-field
system, which is shown in Fig. 6.5. There are four spins arranged in a square,
Figure 6.5: A small random-field Ising magnet. It contains 4 spins coupled via
ferromagnetic interactions ($J_{12} = J_{13} = J_{24} = J_{34} = 1$). The local fields have the
values $B_1 = 0$, $B_2 = -2\Delta$, $B_3 = 2\Delta$, and $B_4 = 0$.
At first the case $\Delta = 0$ is investigated. Thus, no magnetic field acts on the spins.
A ferromagnetically ordered ground state is to be expected, thus all spins point up
or all spins point down at T = 0. Since each bond contributes an amount of -1 to
the energy, the total ground-state energy sums up to $E_0 = -4$. This behavior can
be extracted from the corresponding network as well. By setting $\Delta = 0$ in (6.17) we
Figure 6.6: The network obtained for the system from Fig. 6.5 for the case $\Delta = 0$.
Vertex 0 is the source and vertex 5 the sink. The numbers next to the edges denote
their capacities.
The capacity of the phantom edge connecting source and sink evaluates according to
(6.15) to $c_{05} = -0.25(4 + 4 + 4 + 4) - 0.5(4 + 0 + 0 + 4) = -8$. The final network
is presented in Fig. 6.6. The network allows for two minimum (0,5)-cuts, indicated
by dot-dashed lines in the drawing. Both cuts have capacity $C(S, \bar{S}) = 4 + c_{05} = -4$,
which is indeed equal to the ground-state energy obtained above. The cut denoted
with an "A" is given by $(S, \bar{S}) = (\{0\}, \{1, 2, 3, 4, 5\})$. This means $x_0 = 1$ and
$x_1 = x_2 = x_3 = x_4 = x_5 = 0$. Using $\sigma_i = 2x_i - 1$, $\sigma_i = -1$ for i = 1, 2, 3, 4 is obtained. Thus, indeed
all spins are oriented in the same direction, namely down. The cut "B", $(S, \bar{S}) = (\{0, 1, 2, 3, 4\}, \{5\})$,
corresponds to the configuration where all spins are pointing up.
In the case $\Delta = 1$ the following values for the capacities are obtained:
The resulting network is shown in Fig. 6.7. Now six different minimum cuts are
possible. At first glance the figure might look somewhat complicated. But it is
easy to verify that indeed all minimum cuts A to F have capacity $C = 8 + c_{05} = -4$.
Please note that only the edges contribute which go from the left side of the cut to the
Figure 6.7: The network obtained for the system from Fig. 6.5 for the case $\Delta = 1$.
Vertex 0 is the source and vertex 5 the sink. The numbers next to the edges denote
their capacities.
right side. Thus, for the cut F, the edges (0,1) and (4,5) contribute to the capacity,
but not the edges (1,3) or (2,4). The corresponding spin configurations are drawn in
Fig. 6.8. There the small plus and minus signs state the contributions of the single
spins/spin pairs to the total ground-state energy. In all 6 cases the ground-state energy
is $E_0 = -4$, which is again equal to the capacity of all minimum cuts. The six ground
states can be described as follows: for two configurations the system is ordered, all
spins either point up or all spins point down. For the two ordered configurations one
of the spins 2/3 is oriented against its local field. The remaining four ground states are
characterized by the fact that spins 2, 3 point in the direction of their local magnetic
fields and the remaining spins can be either up or down independently.
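For a system this small the ground states can also be found by enumerating all 16 configurations. The sketch below does this for the four-spin example at $\Delta = 1$. The signs of the fields, $B_2 = -2\Delta$ and $B_3 = +2\Delta$, are an assumption inferred from the edges of the network used in the text, and the energy is taken to be of the standard random-field form.

```python
from itertools import product

# four spins on a square, ferromagnetic bonds of strength 1
J = {(1, 2): 1, (1, 3): 1, (2, 4): 1, (3, 4): 1}
delta = 1
B = {1: 0, 2: -2 * delta, 3: 2 * delta, 4: 0}   # assumed signs of the local fields

def energy(sigma):
    """H = -sum_<ij> J_ij s_i s_j - sum_i B_i s_i for a dict of spins."""
    bond = -sum(Jij * sigma[i] * sigma[j] for (i, j), Jij in J.items())
    field = -sum(B[i] * sigma[i] for i in B)
    return bond + field

configs = [dict(zip((1, 2, 3, 4), s)) for s in product((-1, 1), repeat=4)]
e0 = min(energy(c) for c in configs)
ground_states = [c for c in configs if energy(c) == e0]

print(e0, len(ground_states))
for c in ground_states:
    print("".join("+" if c[i] == 1 else "-" for i in (1, 2, 3, 4)))
```

The enumeration yields six configurations of energy -4: the two ordered states and the four states in which spins 2 and 3 follow their local fields.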
The example graphs we have presented here will also be used in the next section to
elucidate how the algorithms for calculating maximum flows work.
Figure 6.8: Ground states of the RFIM shown in Fig. 6.5 for the case $\Delta = 1$. The
system has 6 ground states with the same energy Eo = -4. The arrows indicate the
orientations of the spins. The plus and minus symbols represent contributions to the
ground-state energy.
in all vertices except the source and the sink. It is assumed that all capacities are
non-negative integer (or rational) values. The aim is to find the maximum flow from
the source to the sink.
The basic idea of the Ford-Fulkerson algorithm is to start with an empty network and
to try to push additional flow from the source to the sink. This is done by searching
for paths along which the flow can be increased; they are called augmenting paths. If
no such path is found, the algorithm stops and the maximum flow has been found.
However, the flow through an individual edge does not always increase or stay
the same. Sometimes it is necessary to decrease the flow in some edges in order to increase
the total flow through the network. This case appears when a certain amount of flow
is redirected within the graph. To treat this redirection process efficiently the notion
of a residual network is useful. The residual network R (often called residual graph)
for network N and flow f has the same vertices as the graph G. It contains edges with
nonzero capacities $r_{ij}$ whenever in G with a given flow f it is possible to increase the
flow along edge (i, j). Please note that it is also possible to increase a negative flow in
(i, j) by decreasing the flow in the reverse edge (j, i). Formally R is defined as follows:
$R = (G', r, s, t)$, $G' = (V, E')$ with
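The definition can be written out as follows, assuming the antisymmetric flow convention $f_{ji} = -f_{ij}$ that the worked example in this section uses (there $r_{10} = c_{10} - f_{10} = 0 - (-4) = 4$):

```latex
% residual capacity of edge (i,j) for flow f, with f_{ji} = -f_{ij}
r_{ij} = c_{ij} - f_{ij}

% edges of the residual graph
E' = \{ (i,j) \in V \times V \mid r_{ij} > 0 \}
```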
The graph G' contains directed edges (i, j) for all nonzero capacities $r_{ij} > 0$. Please
note that in the residual network two vertices i, j may be connected by two edges
(i, j) and (j, i).
For each augmenting path the edge with the smallest capacity is a bottleneck. This
is the reason why the flow along the path can be increased only by the amount of the
minimum capacity $r_{\min}$. The augmenting path can be constructed using a breadth-first
search, i.e. beginning at the source, neighboring vertices which have not been visited
before are visited iteratively. During this search, at each vertex i the predecessor in
the current path and the minimum residual capacity $r_{\min}(i)$ along the path up to i are
stored. In this way each vertex which is connected to the source is visited exactly once.
If an augmenting path is found, i.e. if the sink has been visited during the breadth-first
search, the flow can be augmented directly by starting at the sink. Iteratively
the predecessor is visited and the flow increased by $r_{\min}(t)$ until the source is reached
again.
Please note that the algorithm may not converge to the true maximum flow if the
capacities are irrational; an example is given in [20].
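The procedure described above can be sketched as follows. This is a minimal illustration, not the book's listing: the function name `max_flow`, the dictionary representation of the network and the returned values are assumptions. The breadth-first search stores, for each visited vertex, its predecessor and the minimum residual capacity along the path, as described in the text. The example network has eight edges of capacity 4 and is assumed to correspond to the network for the four-spin system at $\Delta = 1$.

```python
from collections import deque

def max_flow(capacity, source, sink):
    """Augmenting-path maximum flow with breadth-first search.

    capacity: dict {(i, j): c_ij} of directed edges with c_ij >= 0.
    Returns (flow value, set of vertices reachable from the source in the
    final residual network, i.e. the source side of a minimum cut).
    """
    residual = {}
    neighbors = {}
    for (i, j), c in capacity.items():
        residual[(i, j)] = residual.get((i, j), 0) + c
        residual.setdefault((j, i), 0)              # reverse edge for redirection
        neighbors.setdefault(i, set()).add(j)
        neighbors.setdefault(j, set()).add(i)
    total = 0
    while True:
        pred = {source: None}                       # predecessor in the search tree
        r_min = {source: float("inf")}              # bottleneck up to each vertex
        queue = deque([source])
        while queue and sink not in pred:
            i = queue.popleft()
            for j in neighbors.get(i, ()):
                if j not in pred and residual[(i, j)] > 0:
                    pred[j] = i
                    r_min[j] = min(r_min[i], residual[(i, j)])
                    queue.append(j)
        if sink not in pred:                        # no augmenting path is left
            return total, set(pred)
        push = r_min[sink]
        j = sink
        while pred[j] is not None:                  # walk back from sink to source
            i = pred[j]
            residual[(i, j)] -= push
            residual[(j, i)] += push
            j = i
        total += push

network = {(0, 1): 4, (0, 3): 4, (1, 2): 4, (1, 3): 4,
           (2, 4): 4, (3, 4): 4, (2, 5): 4, (4, 5): 4}
flow, source_side = max_flow(network, 0, 5)
print(flow, source_side)
```

The second return value gives the source side S of a minimum cut, from which a ground state follows via $\sigma_i = +1$ for inner vertices in S and $\sigma_i = -1$ otherwise. Because the paths are found by breadth-first search, each augmenting path is also a shortest one.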
As an example we will investigate how the algorithm calculates the maximum flow
through the network given in Fig. 6.7. At the beginning the flow is empty, so the
6.3 Simple Maximum Flow Algorithms 105
residual network is equal to the original network. A possible augmenting path from
the source 0 to the sink 5 is given by the vertices 0, 1, 3, 4, 5. Each edge along
the path has (residual) capacity 4, thus it is possible to increase the flow along the
path by this value. The residual graph is shown in Fig. 6.9; the augmenting path
is highlighted. The values in parentheses next to the edges state the amount of
additional flow which can be pushed through each edge along the augmenting path.
In this case the resulting original network with the flow after the augmentation looks
the same.
Figure 6.10: Residual network of small random-field system at the second iteration.
Now, only one augmenting path exists (highlighted). Again, the values next to the
edges denote the residual capacity. The values in parentheses state the amount of
flow which can be pushed along the augmenting path.
For the second iteration, again the residual network has to be calculated. Now, for
example, in edge (0,1) the flow $f_{01} = 4 = c_{01}$ is present. This means that the residual
capacity in direction (0,1) is $r_{01} = c_{01} - f_{01} = 0$, but for the reversed direction a
value of $r_{10} = c_{10} - f_{10} = 0 - (-4) = 4$ is obtained. The complete residual network
is shown in Fig. 6.10. In this case only one augmenting path is feasible: 0, 3, 1, 2, 5.
The capacity of this path is 4. Please note that augmenting the flow through edge
(3,1) in the residual network results in a decrease of the flow through edge (1,3)
in the original graph. The resulting total flow for the graph is displayed in Fig. 6.11.
Now both edges (0,1) and (0,3) are saturated. As a consequence, in the corresponding
residual network $r_{01} = 0$ and $r_{03} = 0$, i.e. no edge leaves the source. This means that
no augmenting path exists. Consequently, the maximum flow has been found.
How fast is the Ford-Fulkerson algorithm? Since in each step the flow is increased at
least by the amount of one (if the capacities are integers, as assumed), a bound
of $O(f_{\max})$ iterations is obtained, where $f_{\max}$ is the maximum flow. Unfortunately, it is not
possible to find a polynomial in the number of vertices and the number of edges which
bounds the time complexity of the method. The reason is that the way an augmenting
path is chosen is not specified. The effect of this can be demonstrated
by a simple example. Consider the graph in Fig. 6.12. It is similar to the preceding
example graph, but the capacities are different. Assume that the augmenting path is
0, 1, 3, 4, 5 (highlighted). Then the residual capacity of this path is 1; by this amount
the flow through the graph is increased. The resulting residual network is shown in
Fig. 6.13. Now, an augmenting path is given by 0, 3, 1, 2, 5. Again, the flow is increased by
Figure 6.11: Final flow in network of small random-field system. The values in
parentheses state the amount of flow which flows through each edge.
Figure 6.12: A tiny network where the Ford-Fulkerson algorithm may spend much
time to calculate the maximum flow. Displayed is the residual graph of the first
iteration. The resulting flow in the network after the first iteration is represented by
the bold edges and the numbers in parentheses.
Figure 6.13: The residual network of the tiny network after one iteration of the
Ford-Fulkerson algorithm.
Obviously, by always using these bad choices of the augmenting paths, the Ford-Fulkerson
algorithm can be iterated 10000 times; each time the flow is increased
by one unit. On the other hand, it is possible to calculate the maximum flow within
two iterations if the augmenting paths 0, 1, 2, 5 and 0, 2, 4, 5 are chosen.
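The effect of the bad path choice can be reproduced with a few lines of code. The sketch below alternates between the two augmenting paths named in the text and counts the iterations. The capacities are hypothetical, since the figure is not reproduced here: one edge (1,3) of capacity 1 and a common capacity K for the other edges used, which for K = 5000 gives the 10000 iterations mentioned above. The function names are assumptions as well.

```python
def augment_along(residual, path):
    """Push the bottleneck amount along path; return the amount (0 if blocked)."""
    edges = list(zip(path, path[1:]))
    push = min(residual.get(e, 0) for e in edges)
    if push > 0:
        for (i, j) in edges:
            residual[(i, j)] -= push
            residual[(j, i)] = residual.get((j, i), 0) + push   # reverse edge
    return push

def count_bad_iterations(K):
    """Alternate between the two unfavorable augmenting paths until one is blocked."""
    residual = {(0, 1): K, (0, 3): K, (1, 2): K, (1, 3): 1,
                (3, 4): K, (2, 5): K, (4, 5): K}
    paths = [(0, 1, 3, 4, 5), (0, 3, 1, 2, 5)]
    flow = 0
    iterations = 0
    while True:
        push = augment_along(residual, paths[iterations % 2])
        if push == 0:
            return iterations, flow
        flow += push
        iterations += 1

print(count_bad_iterations(5000))
```

Each augmentation passes through the edge between vertices 1 and 3, alternately in the forward and the reverse direction, so the bottleneck is always one unit.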
6.4 Dinic's Method and the Wave Algorithm
Figure 6.14: The tiny network after the flow has been increased two times by the
Ford-Fulkerson algorithm. The numbers in parentheses give the flow through the edges, the
other numbers state the capacities of the edges.
This unfavorable behavior is avoided by an extension of the algorithm, which was presented
by Edmonds and Karp in 1972 [18]. The basic idea is to choose an augmenting
path which has the shortest distance from the source to the sink, where each edge
counts with distance one. Algorithms for finding the shortest paths can be found in
Chap. 4. It can be shown that indeed this algorithm has a polynomial worst-case time
complexity of $O(|V||E|^2)$, which is in fact independent of the capacities of the edges.
Since the networks we are considering here are lattices, each vertex has a maximum
number of neighbors, i.e. the number of edges is proportional to the number of vertices.
This results in a running time of $O(|V|^3)$. Modern algorithms are even faster. One is
presented in the next section. Another approach, the push-relabel method, is described
in Refs. [21, 22, 23].
The starting point for the level network^2 is the residual network R, as defined
previously. The basic idea is the same as for the algorithm by Edmonds and Karp: the
level network contains all shortest paths from the source to the sink. In contrast to
the previous method, here the flow can be increased along several paths in parallel.
This is the key idea for obtaining an efficient algorithm. Let level(i) be the length of
the shortest path in the residual network R from the source to vertex i. The levels can
be constructed using a breadth-first search, i.e. the algorithm given in the preceding
section. This introduces the so-called topological order of the vertices, which is just the
order in which the vertices are visited during the search. Consequently, for each vertex
all its predecessors have lower topological numbers while all successors have higher
numbers. After the breadth-first search, the level network LN(R) is obtained by removing
all edges (i, j) from R which do not fulfill level(j) = level(i) + 1, thus only edges between
neighboring levels are left. Please note that the level network is unique although the
topological order may not be unique. We denote the capacities of the level network
with $\hat{c}(i,j)$.
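The construction of the level network can be illustrated by the following sketch. It
assumes that the residual network is given as a dictionary of positive residual
capacities; the name `level_network` and the small sample network are hypothetical.
Dead ends are not removed here.

```python
from collections import deque


def level_network(residual, source):
    """Breadth-first search from the source, then keep only level-increasing edges.

    residual: dict {(i, j): residual capacity} (hypothetical format).
    Returns (level of each reached vertex, topological order, level-network edges).
    """
    adj = {}
    for (i, j), r in residual.items():
        if r > 0:
            adj.setdefault(i, []).append(j)

    level = {source: 0}
    order = [source]                  # order in which the vertices are visited
    queue = deque([source])
    while queue:
        i = queue.popleft()
        for j in adj.get(i, []):
            if j not in level:
                level[j] = level[i] + 1
                order.append(j)
                queue.append(j)

    # keep only edges (i, j) with level(j) = level(i) + 1
    edges = {(i, j): r for (i, j), r in residual.items()
             if r > 0 and i in level and j in level and level[j] == level[i] + 1}
    return level, order, edges


# hypothetical residual network; edge (1, 2) connects two vertices of the same level
res = {(0, 1): 5, (0, 2): 3, (1, 2): 2, (1, 3): 4, (2, 3): 6}
level, order, edges = level_network(res, 0)
print(level, order, sorted(edges))
```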
Since the methods presented here are more involved than the techniques presented
previously, we need a slightly more complicated network to illustrate the algorithms.
It is given in Fig. 6.15. The graphical representation clearly shows the levels of the
vertices. For example level(2) = 1, level(7) = 3 and level(12) = 5. The vertices are
already numbered according to an arbitrary topological order.
The resulting level network does not contain the edge (2,6) since level(2) = level(6).
For the same reason the edges (7,10) and (8,11) are not included. Now, there may be
"dead ends" in the graph, i.e. vertices which are not connected to the sink. Thus, they
cannot contribute to the maximum flow. These vertices are removed in the next step
of the algorithm. For the example graph vertices 1 and 3 are removed. The outcome
is displayed in Fig. 6.16.
Figure 6.16: The resulting level network contains only edges (i, j) with level(j) =
level(i) + 1. Dead ends are removed. The values next to the vertices are the vertex
capacities introduced by Traff.
^2 The level network is often called the level graph.
The reader may already have noticed the values written next to the vertices in Fig.
6.16. They are due to the idea of Traff, who introduced capacities p(i) also for the
vertices i. These capacities give upper bounds for the maximum amount of flow which
can leave a vertex i due to the capacities of all succeeding edges on paths to the
sink. Since the sink itself has no successors, its capacity can be set to infinity. All its
predecessors allow for a flow which is bounded by the capacities of the edges to the
sink. For all other vertices we have to realize that each edge cannot carry more flow
than that given by its capacity and by the capacity of the vertex at the end
of the edge. Therefore, the amount of flow leaving a vertex cannot exceed the sum
of the bounds calculated in this way for all outgoing edges. Since the capacity of a
vertex is determined by its succeeding vertices, the values of p(i) can be calculated in a
recursive way, by starting from the sink and visiting all vertices in reversed topological
order:
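Written out, the recursion described in words above presumably takes the following
form. This is a reconstruction from the verbal description, with N + 1 denoting the sink
and $\hat{c}(i,j)$ the capacities of the level network; it is meant as a reading aid rather
than as a quotation of the original formula.

```latex
p(N+1) = \infty, \qquad
p(i) = \sum_{j:\,(i,j) \in LN(R)} \min\{\hat{c}(i,j),\; p(j)\}
\quad \text{for } i \neq N+1 .
```

For a predecessor i of the sink with the single outgoing edge (i, N+1), this reduces to
$p(i) = \hat{c}(i, N+1)$, in agreement with the statement that such vertices are bounded
by the capacities of the edges to the sink.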
The resulting values for the sample graph are also presented in Fig. 6.16. Please note
that p(i) is indeed only an upper bound. Vertex 3 has p(3) = 15, while it is not actually
possible to have a flow through this vertex which is larger than 10 = c(9,12) + c(10,12).
Now the main part of the method will be explained, the wave algorithm. The target is
to obtain a blocking flow. The algorithm starts by calculating a preflow $\{\hat{f}(i,j)\}$. It
is a kind of flow, but the flow may not be conserved at the inner vertices. The excess
$e(i) = \sum_j \hat{f}(i,j)$ states the amount of flow which is lost [e(i) < 0] or created [e(i) > 0]
at vertex i. A vertex with e(i) > 0 is called overflowing while it is called balanced if
e(i) = 0. Thus, a flow is a preflow where all inner vertices are balanced. The wave
algorithm starts with $\hat{f}(0,j) = c(0,j)$, i.e. with a preflow where all edges leaving the
source are saturated^3, while all other edges are empty. The maximum flow can certainly
not be larger than $\sum_j c(0,j)$. During the execution of the algorithm, for all vertices it is
recorded whether they are blocked or not. A vertex $i \neq N+1$ is called blocked if it
is not possible to push additional flow towards the sink through i. Initially all vertices
are not blocked.
After this initialization the wave algorithm starts. It consists of forward waves and
backward waves. Within a forward wave, the preflow is pushed into the direction of the
sink as far as possible. If some part of the preflow stops somewhere, the corresponding
vertex is blocked for further iterations. Within a backward wave, flow which has been
blocked is pushed in the opposite direction. Later, during the next forward wave,
pushing it along other paths is attempted. Both steps, forward wave and backward
wave, are iterated until all excess flow has disappeared. This means that at the end
the preflow has been turned into a flow^4. The two main steps of the wave algorithm
read in detail as follows:
forward wave:
All vertices i are visited in topological order. All outgoing edges (i,j) are treated
which are not satisfied and where the end vertex j is not blocked. The preflow
through (i,j) is increased by the value of
$\min\{\hat{c}(i,j) - \hat{f}(i,j),\ p(j) - e(j),\ e(i)\}$.
This means the preflow cannot be increased by more than the current residual
capacity, by not more than vertex j can take and by not more than is available as
excess. The excess e(i) is decreased and e(j) is increased by that amount.
If it is not possible to balance vertex i, i.e. if not all of the excess flow can be
pushed forward, then vertex i is blocked. So in subsequent iterations pushing
further flow through this vertex is not tried.
backward wave:
All vertices j except the sink are visited in reversed topological order. If j is
blocked, all incoming edges (i,j) are treated: the preflow $\hat{f}(i,j)$ is decreased by
$\min\{\hat{f}(i,j),\ e(j)\}$ as long as the excess e(j) > 0. The excess e(i) of the predecessor
is increased by the same amount.
This means, for each blocked node, that all positive excess is pushed backwards.
In this way the flow is moved towards the source until a non-blocked vertex is
reached. There the flow may be directed along another path during the next
forward wave.
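The two steps can be combined into a small program. The following Python sketch is
an illustration under simplifying assumptions, not the implementation used by the
authors: the function name `wave_blocking_flow` and the input format are hypothetical,
the level network is assumed to be given with its vertices already in topological order,
and the vertex capacities p(i) are computed only once instead of being recalculated after
each pair of waves.

```python
from collections import defaultdict


def wave_blocking_flow(order, cap, source, sink):
    """Blocking flow in a level network via forward and backward waves.

    order: vertices in topological order (source first, sink last).
    cap:   dict {(i, j): capacity} containing only level-network edges.
    """
    succ, pred = defaultdict(list), defaultdict(list)
    for (i, j) in cap:
        succ[i].append(j)
        pred[j].append(i)

    # vertex capacities p(i), evaluated once in reversed topological order
    p = {sink: float("inf")}
    for i in reversed(order):
        if i != sink:
            p[i] = sum(min(cap[i, j], p[j]) for j in succ[i])

    flow = {e: 0 for e in cap}
    excess = {v: 0 for v in order}
    blocked = {v: False for v in order}
    for j in succ[source]:            # initial preflow: saturate the source edges
        flow[source, j] = cap[source, j]
        excess[j] += cap[source, j]

    inner = [v for v in order if v not in (source, sink)]
    while any(excess[v] > 0 for v in inner):
        # forward wave: push excess towards the sink
        for i in inner:
            for j in succ[i]:
                if excess[i] == 0:
                    break
                if blocked[j]:
                    continue
                delta = min(cap[i, j] - flow[i, j], p[j] - excess[j], excess[i])
                if delta > 0:
                    flow[i, j] += delta
                    excess[i] -= delta
                    excess[j] += delta
            if excess[i] > 0:         # vertex could not be balanced
                blocked[i] = True
        # backward wave: blocked vertices return their excess
        for j in reversed(inner):
            if not blocked[j]:
                continue
            for i in pred[j]:
                if excess[j] == 0:
                    break
                delta = min(flow[i, j], excess[j])
                flow[i, j] -= delta
                excess[j] -= delta
                if i != source:       # flow returned to the source simply vanishes
                    excess[i] += delta
    return flow


# hypothetical level network: source 0, sink 5
cap = {(0, 1): 10, (1, 2): 10, (1, 3): 10, (2, 4): 10, (3, 4): 10, (4, 5): 5}
flow = wave_blocking_flow([0, 1, 2, 3, 4, 5], cap, 0, 5)
print(flow[4, 5])
```

In this small example the vertex capacities overestimate what vertex 4 can forward, so
several backward waves are needed before the surplus is returned to the source.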
^3 Since f(i,j) = -f(j,i), we also have $\hat{f}(j,0) = -c(0,j)$. But the negative values are not
used within the wave algorithm, they are considered only for the calculation of the residual network.
^4 Please note that by the wave algorithm a flow which has already found a path to the sink will
never be redirected during the current waves. So the flow obtained in this way is indeed a blocking
flow and may not be a maximum flow. Nevertheless, the multiple execution of the whole procedure
(see later) guarantees that at the end a maximum flow is obtained.
Figure 6.17: The level network after the first forward wave. Vertex 7 is blocked.
The value next to the vertex capacity p(7) is the excess e(7) = 5. All other vertices
are not blocked, so they have excess zero. The values at the edges have the format
$\hat{c}(i,j)/\hat{f}(i,j)$.
• First vertex i = 1 is treated. For edge (i,j) = (1,3) the flow is increased
by $\min\{\hat{c}(i,j) - \hat{f}(i,j),\ p(j) - e(j),\ e(i)\} = \min\{20 - 0,\ 15 - 0,\ 25\} = 15$.
Consequently e(3) = 15 and the excess for vertex 1 is decreased to e(1) =
10. Then edge (1,4) is considered. The remaining flow can be directed
along that edge, consequently f(1,4) = 10, e(4) = 10 and e(1) = 0.
• Next vertex i = 3 is treated. Edge (3,6) can carry flow of amount
f(3,6) = 10, so e(6) = 10 and now e(3) = 5. The remaining excess can
travel along edge (3,7), resulting in f(3,7) = 5, e(7) = 5 and e(3) = 0
after completing the treatment of vertex 3.
• The excess of vertex i = 4 can be passed on completely to vertex 8,
resulting in f(4,8) = 10, e(4) = 0 and e(8) = 10.
• Next vertex i = 6 is considered. The excess flow e(6) = 10 can be
completely partitioned among the successors j = 9, 10, resulting in e(6) =
0, e(9) = 5, e(10) = 5, f(6,9) = 5 and f(6,10) = 5.
• Vertex i = 7 is the first one which will be blocked: For edge (i,j) =
(7,10) the flow can be increased by $\min\{\hat{c}(i,j) - \hat{f}(i,j),\ p(j) - e(j),\ e(i)\}$
The resulting preflow system is shown in Fig. 6.17. Only vertex 7 is blocked,
it is marked by a box containing the value of the excess.
During the backward wave only vertex 7 is treated. The flow is moved back
to vertex 3 which then has excess e(3) = 5, while now e(7) = 0.
After completing one forward and one backward wave the vertex capacities are dynam-
ically adjusted by visiting all vertices again in reversed topological order. Therefore,
they represent only non-used capacities:
Figure 6.18: The level network after the recalculation of the vertex capacities. Vertex
3 has excess e ( 3 ) = 5 .
Figure 6.19: The level network after the final iteration of the wave algorithm.
Dinic's algorithm iterates the construction of the level network and the calculation of
a blocking flow until the sink is not part of the level network any more. Then the maximum
flow has been obtained, since there is no path from the source to the sink in the residual
network. Finally a summary of the algorithm is given:
algorithm dinics-algorithm(N, {c(i,j)})
begin
    initialize flow f(i,j) := 0 ∀ i,j
    build residual network R
    build level network LN(R)
    while (N+1 ∈ LN(R))
    begin
        initialize vertex capacities p(i)
        initialize preflow {\hat{f}(i,j)}
        while (∃ unbalanced vertices e(i) > 0)
        begin
            scan LN(R) in topological order:
                push forward preflow
            scan LN(R) in reversed topological order:
                push backward preflow
            recalculate vertex capacities p(i)
        end
        increase flow: f(i,j) := f(i,j) + \hat{f}(i,j) ∀ i,j
        build residual network R
        build level network LN(R)
    end
    return {f(i,j)}
end
After finishing the inner loop containing the forward and backward waves, the
flow values obtained are transferred to the original graph [f(i,j) := f(i,j) +
\hat{f}(i,j) ∀ i,j]. The residual network after the first iteration of the inner loop
is displayed in Fig. 6.20 [please remember f(i,j) = -f(j,i)].
The corresponding level network, after removing dead ends and calculating
the vertex capacities, is shown in Fig. 6.21. Please note that now the topological
order of the nodes is 0, 2, 5, 4, 8, 11, 12. Obviously, the wave algorithm
treats this network by performing just one forward wave. This is due to the
vertex capacities introduced by Traff. By adding the flow obtained in such a
way to the total flow, the maximum flow has been calculated, since no further
increment is possible: the level network obtained after this step is not
connected to the sink.
6.5 Calculating all Ground States
Figure 6.20: The residual network after the first execution of the inner loop
Figure 6.21: The level network graph before executing the wave loop a second time.
Please note, the graph in Fig. 6.13, which may keep the original Ford-Fulkerson algo-
rithm busy for a while, is treated by the wave algorithm within one forward wave.
stops. Then the spins which have been visited are left of the cut, so they are set to
orientation +1, see the definition of X in (6.7) and remember $\sigma_i = 2x_i - 1$. The
remaining spins are set to orientation -1.
Please note that in this way only one ground state can be obtained, even if the system
is degenerate, i.e. if it has many ground states. A simple method to find different
ground states in different runs of an algorithm has been presented in [27]. To actually
find all different ground states one must find all existing minimum cuts. This can be
achieved with a method explained in [28]. The result is a graph which describes all
possible minimum cuts. Now the method will be explained in detail. In this section, we
will show why the method is indeed correct. For a formal proof the reader is referred
to [28].
The basic idea is to describe all ground states as a set of clusters of spins and as
dependencies between these clusters. Each cluster consists of spins which have in
all ground states the same relative orientation to each other. This means that no
minimum cut will divide any two of the corresponding vertices of one cluster in the
network. We have already mentioned that only satisfied edges may cross a cut. Thus,
the vertices of one cluster are connected somehow by paths of unsatisfied edges.
Figure 6.22: Maximum flow for the network already presented in Fig. 6.6. The
flow through edges (1,3) and (3,4) may be (partially) redirected through edges (1,2),
(2,4). Therefore, only edges (0,1) and (4,5) are satisfied in all possible distributions
of the maximum flow among the edges.
On the other hand, if an edge is satisfied this does not guarantee that at least one of
the minimum cuts crosses this edge. The reason is that it may be possible to redirect
the maximum flow such that there is another topology of the flow values resulting in
the same amount of total flow. In this case it is just accidental that the edge was
satisfied. This is illustrated by an example given in Fig. 6.22.
The starting point for constructing the cluster graph representing all minimum cuts is
the residual network. There the edges just represent unused capacities [f(i,j) < c(i,j)]
or they represent directions which can be used to redirect the flow [f(j,i) > 0] in the
original network. Here we are only interested in whether there is an edge in the residual
network or not, so we can neglect the residual capacities. The main statement of [28] is: the
strongly connected components (SCC) in the residual graph are the above mentioned
clusters of vertices, which are for all minimum cuts on the same side of the cut, i.e.
for each cut all vertices of a cluster are left or all are right of a cut. Please remember
the definition of the SCC: these are subsets of vertices of maximum cardinality, where
each vertex is connected to each other one. An algorithm for the calculation of the
SCC was given in Chap. 3.
After all clusters have been obtained, a new reduced graph R" is built, which contains
one vertex for each SCC. Each edge in the residual network which runs between
different SCCs is also present in R" (multiple edges are kept only once). All existing
minimum cuts can be built using the reduced graph. This works in the following way:
the following implication gives the meaning of each edge (k, l) in R":
    (k is left of the cut) implies (l is left of the cut)    (IMP)
Please remember that the left side of a cut $(S, \bar{S})$ is S and that the source is always
left of the cut. The central statement of this section is: all cuts are represented by the
reduced graph. All cuts are allowed that are compatible with all constraints imposed
by the reduced graph.
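The construction of the reduced graph can be sketched as follows. The code is an
illustration only: it uses Kosaraju's two-pass method for the strongly connected
components, which is not necessarily the algorithm given in Chap. 3, and the function
names as well as the sample residual network (a six-vertex network in the spirit of
Fig. 6.22, with assumed flow values) are hypothetical.

```python
def strongly_connected_components(vertices, edges):
    """Kosaraju's method; returns a dict {vertex: component index}."""
    adj = {v: [] for v in vertices}
    radj = {v: [] for v in vertices}
    for i, j in edges:
        adj[i].append(j)
        radj[j].append(i)

    # first pass: depth-first search recording the finishing order
    seen, finish = set(), []
    for s in vertices:
        if s in seen:
            continue
        seen.add(s)
        stack = [(s, iter(adj[s]))]
        while stack:
            v, it = stack[-1]
            for w in it:
                if w not in seen:
                    seen.add(w)
                    stack.append((w, iter(adj[w])))
                    break
            else:                     # all neighbors of v are done
                finish.append(v)
                stack.pop()

    # second pass: search the reversed graph in decreasing finishing order
    comp, n_comp = {}, 0
    for s in reversed(finish):
        if s in comp:
            continue
        comp[s] = n_comp
        stack = [s]
        while stack:
            v = stack.pop()
            for w in radj[v]:
                if w not in comp:
                    comp[w] = n_comp
                    stack.append(w)
        n_comp += 1
    return comp


def reduced_graph(vertices, residual_edges):
    """One vertex per SCC; edges between different SCCs are kept only once."""
    comp = strongly_connected_components(vertices, residual_edges)
    edges = {(comp[i], comp[j]) for i, j in residual_edges if comp[i] != comp[j]}
    return comp, edges


# hypothetical residual network: edges (0,1) and (4,5) are satisfied,
# all inner edges carry flow without being satisfied
residual_edges = [(1, 0), (1, 2), (2, 1), (1, 3), (3, 1),
                  (2, 4), (4, 2), (3, 4), (4, 3), (5, 4)]
comp, edges = reduced_graph(range(6), residual_edges)
print(comp, edges)
```

Here the four inner vertices fall into one component, so the reduced graph has three
vertices and two edges, each of which is to be read as an implication of the form IMP.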
We will now explain why this interpretation of the edges in R" is meaningful and how
the notion of SCC as clusters follows from this. Please note that the implication IMP
also applies to the edges in the residual network, since the vertices can be seen as tiny
clusters. Let us assume that (i,j) is such an edge. So IMP reads: if vertex i is left of
the cut, then vertex j must be left as well. The existence of edge (i,j) allows for three
cases regarding the original network after the maximum flow has been calculated:
• f(i,j) = 0 < c(i,j). In this case the edge (i,j) must not be removed by any cut,
because only satisfied edges may cross a cut. So the three possibilities
    (i left, j left), (i right, j right), (i right, j left)
are allowed. The third case is allowed, since an edge contributes only to the cut
if it goes from the left to the right side but not if it runs from right to left. That
these three cases are allowed is exactly what IMP tells us.
• f(i,j) = -c(j,i), this is the other extreme. Here there is a non-vanishing flow
in the opposite direction (j,i) and the edge (j,i) is satisfied. Now again it may
be allowed to have both i, j on the same side of the cut, also edge (j,i) may be
removed (j left, i right, if there is no contradiction from other constraints). But it
is not allowed to have i on the left and j on the right side of the cut, since in that
case there is flow from the right to the left side of the cut. Since all flow originates
at the source, which is always left, this means it must cross the cut from left to
right twice. This contradicts the fact that the cut is minimal. Summarizing, again
IMP holds.
• 0 < f(i,j) < c(i,j). In this case, since f(i,j) = -f(j,i), in addition to edge
(i,j), also edge (j,i) is part of the residual network. Please note that the case
-c(j,i) < f(i,j) < 0 is covered here as well. Consequently we have two IMP
clauses which combine to "both i, j must be on the same side of the cut". This is
reasonable, since there is a non-vanishing flow as well as an unsatisfied capacity,
so both reasonings given above for the other two cases hold.
If there is a chain of implications ($i_1$ left) implies ($i_2$ left), ($i_2$ left) implies ($i_3$ left),
..., ($i_n$ left) implies ($i_1$ left), then from the transitivity of the implication rule follows
that all vertices $i_1, \ldots, i_n$ must always be on the same side of a minimum cut. On the
other hand, the chain of implications corresponds to a closed loop of edges in the residual
graph. Thus, all vertices belong to the same strongly connected component. This
explains why the method to construct all minimum cuts given above is indeed correct.
Figure 6.23: Residual network of the network with maximum flow of Fig. 6.22.
reduced graph is presented in Fig. 6.5. Since SCC A contains the source, it
must always be left of the cut. C contains the sink, so it must always be right
of the cut. So the reduced graph tells us that two minimum cuts are allowed:
({A}, {B, C}) and ({A, B}, {C}). They are identical with the minimum cuts
which were obtained by direct inspection in Sec. 6.2. The reduced graph
for the network of Fig. 6.7 calculated from its maximum flow (see Fig. 6.11)
is shown in Fig. 6.5.
Figure 6.24: Reduced graph of the network with maximum flow of Fig.
6.22. All minimum cuts are represented by this graph.
After one has obtained the reduced graph describing all minimum cuts, it is possible
to enumerate all of them for their evaluation. This can be done by applying the
following algorithm [29]. It considers only the N" components which do not contain
the source or the sink. We assume that the SCCs are numbered in a decreasing
topological order, i.e. whenever (k, l) is an edge in the reduced graph, then k > l. The basic
idea is to start with all SCCs right of the cut. Similar to Sec. 6.2, we use a variable
$X_k = 0, 1$ to describe whether an SCC k is left ($X_k = 1$) or right ($X_k = 0$) of the cut.
Thus, the string $X = \{X_k\}$ describes a cut. This string can be read as a binary number
X. This means all states can be enumerated by just increasing X starting from 0
up to $2^{N''} - 1$ and considering only the configurations which are compatible with the
constraints imposed by the reduced graph. This is done by the following algorithm:
algorithm enumerate-cuts(R")
begin
    Let X_k := 0 for all k
    while not all X_k = 1 do
    begin
        Find the smallest component l_0 with X_{l_0} = 0
        Let X_{l_0} := 1
        for all l < l_0 do
            if component l is not a successor of l_0 in R" then
                Let X_l := 0
    end
end
This algorithm can be used for all kinds of graphs or relations R" which denote a
priority relation, i.e. they must not contain cycles. Then all configurations which obey
all priority rules can be enumerated with this method.
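A compact Python sketch of this enumeration is given next. It is an illustration under
stated assumptions: the function name `enumerate_cuts` and the edge-list input are
hypothetical, "successor" is read as "reachable via one or more edges", and the reset
rule is slightly more defensive than the pseudocode: a lower-numbered component is
kept left of the cut whenever any component that remains left requires it, not only
the newly added one. This matters if a higher-numbered component that is already left
of the cut has successors which the newly added component does not share.

```python
def enumerate_cuts(n, edges):
    """Enumerate all sets of inner components that may lie left of the cut.

    n:     number of inner components, numbered 1..n.
    edges: list of (k, l) meaning "k left implies l left", with k > l.
    Returns a list of frozensets, ordered like the binary number X.
    """
    assert all(k > l for k, l in edges), "components must be topologically numbered"

    # transitive successors; successors always have smaller numbers
    succ = {k: set() for k in range(1, n + 1)}
    for k in range(1, n + 1):
        for a, b in edges:
            if a == k:
                succ[k] |= {b} | succ[b]

    left = set()                      # X = 00...0: everything right of the cut
    cuts = [frozenset(left)]
    while len(left) < n:
        l0 = min(k for k in range(1, n + 1) if k not in left)
        left.add(l0)
        # components required by l0 or by any higher component that stays left
        required = set()
        for k in left:
            if k >= l0:
                required |= succ[k]
        left = {k for k in left if k >= l0 or k in required}
        cuts.append(frozenset(left))
    return cuts


# hypothetical reduced graph with four inner components:
# 2 -> 1, 3 -> 1, 4 -> 2, 4 -> 3
for cut in enumerate_cuts(4, [(2, 1), (3, 1), (4, 2), (4, 3)]):
    print(sorted(cut))
```

For this reduced graph six configurations are generated; the total number of cuts is
obtained as the length of the returned list.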
The system shown in Fig. 6.5 has only one strongly connected component which
contains neither the source nor the sink, so only the cuts $X_1 = 0, 1$ are possible. The
application to the network which was shown in Fig. 6.11 is more instructive.
Figure 6.25: Residual graph of a small network which was presented along
with a maximum flow in Fig. 6.11.
The algorithm starts with $X = X_4X_3X_2X_1 = 0000$ (we denote at the right
the least significant bit/component with the smallest number), i.e. all inner
components are to the right of the cut: $(S, \bar{S}) = (\{0\}, \{1,2,3,4,5\})$.
Within the first iteration $l_0 = 1$, X is increased to X = 0001. So we obtain
$(S, \bar{S}) = (\{0,1\}, \{2,3,4,5\})$.
By the second iteration $l_0 = 2$ and X = 0011 is obtained. Since component
1 is a successor of component 2, the string X is not altered by the for-loop.
Therefore, we obtain $(S, \bar{S}) = (\{0,1,2\}, \{3,4,5\})$.
Next $l_0 = 3$ and initially X = 0111. Since component 2 is not a successor of
3 (but SCC 1 is), we get X = 0101. This means $(S, \bar{S}) = (\{0,1,3\}, \{2,4,5\})$.
The next iteration results in $l_0 = 2$, thus X = 0111. Similar to the second
iteration, now X remains unchanged and we get $(S, \bar{S}) = (\{0,1,2,3\}, \{4,5\})$.
For the final iteration $l_0 = 4$ and X = 1111. All other components are
successors of $l_0$, so the for-loop does not alter X and we get
$(S, \bar{S}) = (\{0,1,2,3,4\}, \{5\})$.
Summarizing, 6 minimum cuts are obtained. They are the same as those
already found in Sec. 6.2 by direct inspection, see Fig. 6.7 (please note that
the order of the vertices is different from the order of the corresponding
components due to the topological order).
Many types of graphs exhibit an exponential number of minimum cuts, so the
enumeration is quite time consuming. The number of different cuts itself has to be obtained
by the enumeration as well, so it is fairly time consuming for large systems. But in
the case of the systems considered here, DAFF and RFIM which have an exponential
degeneracy as well, it turns out that the reduced graph is usually very sparse.
Therefore, the reduced graph can be divided into many small groups of connected SCCs
without edges to other clusters. Then all groups can be treated independently, i.e. the
number of minimum cuts is the product of the number of cuts allowed by each group.
This drastically reduces the running time of the algorithm. Furthermore, it is possible
to extract information from the graph directly. For example one can use the extremal
cuts, i.e. the cuts having all/no (except one) SCCs left of the cut, to calculate the
minimum/maximum magnetization of the corresponding system.
Now all tools for examining the zero temperature properties of random field systems
and diluted antiferromagnets are available. To summarize: first an equivalent
network is constructed (see Sec. 6.2), then the maximum flow is obtained with the wave
algorithm (Sec. 6.4) and finally the graph describing all degenerate ground states is
calculated. Some basic results for RFIM and DAFF, which were found with these
techniques, are shown in the next section.
6.6 Results for the RFIM and the DAFF
Figure 6.27: Magnetization m as a function of the strength of the random fields for
one realization of an RFIM (L = 8) with bimodal distribution.
Since this type of system exhibits a ground-state degeneracy, the magnetization may
vary from one ground state to the other. In the figure the maximum and minimum
values which the magnetization can take are shown. How many intermediate values
are possible depends on the degree of degeneracy. More results can be found in [30].
In general, the behavior turns out to be as expected in the first section: for small fields
the system is ferromagnetically ordered. With increasing strength of the random fields
the spins tend to be oriented in the direction of the local fields and the magnetization
vanishes. The order parameter changes only at a finite number of values for $\Delta$, in
between it is constant. A detailed analysis [26] shows that the steps in $m(\Delta)$ occur
whenever the sum of local fields on some cluster is strong enough to flip the whole
cluster.
The behavior is similar for other realizations, but the values where the clusters turn
around vary from system to system. Only the jumps at some integer values of $\Delta$ appear
in all realizations. These discontinuities are due to the reversing of several single spins.
By averaging over different realizations one gets a smooth curve. The result is shown
in Fig. 6.28 for the maximum of the absolute value of the magnetization averaged
over typically 1000 different realizations. For other quantities like the minimum or
the average of the magnetization the results look similar. In the figure the data for 4
different system sizes $N = 10^3$, $20^3$, $40^3$ and $80^3$ are presented. With increasing system
size a monotonic shift of the curves can be observed. We are interested in performing
the thermodynamic limit, i.e. in obtaining the behavior of very large systems. This
can be done by applying the technique of finite-size scaling. The basic assumption is
that the average magnetization shows the following behavior [31]:
$\Delta_c$ is the critical value of the infinite system where the magnetization vanishes. The
values of the critical exponents $\beta$ and $\nu$ describe the asymptotic behavior of the order
parameter and the correlation length near the transition, respectively. The exact form
of the function h is in general unknown. The values for $\Delta_c$, $\nu$ and $\beta$ have to be
determined. They can be obtained by rescaling the numerical data in a way that all
data points collapse onto one curve. The resulting scaling plot is shown in Fig. 6.29.
The values obtained in this way are $\Delta_c = 2.20$, $\nu = 1.67$ and $\beta = -0.01$.
Figure 6.29: Finite-size scaling plot for RFIM with bimodal distribution. The
parameters are $\Delta_c = 2.20$, $\nu = 1.67$ and $\beta = -0.01$.
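The rescaling step can be illustrated by a short sketch. It assumes the standard
finite-size scaling form $m(\Delta, L) = L^{-\beta/\nu}\, h\big((\Delta - \Delta_c) L^{1/\nu}\big)$, with L
the linear system size; this form is an assumption made for the example. The function name
`rescale` and the synthetic data are hypothetical as well.

```python
def rescale(L, delta, m, delta_c, nu, beta):
    """Map one data point (delta, m) of a system of linear size L
    onto the axes of a scaling plot (assumed standard scaling form)."""
    x = (delta - delta_c) * L ** (1.0 / nu)
    y = m * L ** (beta / nu)
    return x, y


# synthetic check: data generated from an arbitrary scaling function h
def h(x):
    return 1.0 / (1.0 + x * x)


delta_c, nu, beta = 2.2, 1.67, 0.5    # hypothetical values for the check
for L in (10, 20, 40):
    delta = delta_c + 0.3 * L ** (-1.0 / nu)
    m = L ** (-beta / nu) * h(0.3)
    print(L, rescale(L, delta, m, delta_c, nu, beta))
```

With the correct parameters all sizes are mapped onto the same point of the scaling
plot. For real data the parameters are varied until the curves for different sizes
coincide as well as possible.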
As mentioned at the beginning, it is assumed that the RFIM is similar to the DAFF.
In this case the values for the exponents should be equal. Additionally, they should
not depend on details of the model such as the choice of the distribution of the random
fields. In this case one says that the exponents are universal. On the other hand, the
exact value of the critical strength of the field is not expected to be universal. By
calculating similar finite-size scaling plots for the RFIM with Gaussian distribution
and the DAFF, the following values are found:
Consequently, the exponents seem not to be universal, since $\nu$ is different for the
different distributions of the RFIM. On the other hand, the Gaussian random-field
system seems to be a good model for diluted antiferromagnets, since in this case the
values of the exponents agree.
For an intermediate strength of the field, the order parameter vanishes. Nevertheless,
in this region the systems exhibit fractal domains, which are described by non-trivial
exponents. An example of such a domain is displayed in Fig. 6.30. Quantitative
results can be found e.g. in Ref. [32].
Finally, we will give some examples regarding the ground-state degeneracy of the
DAFF and the RFIM with bimodal interactions [26]. It turns out that the number of
ground states grows exponentially with system size, so there is a finite entropy, see
also Ref. [33]. On the other hand, the ground-state landscape is rather simple. Most
of the spins have the same orientation in all ground states, i.e. they are frozen. This
fraction is more than 95% for the DAFF and even 98% for the RFIM. The clusters of
spins, which may have two orientations in different ground states, interact only rarely
with each other. This means the reduced graph contains only few edges. Therefore,
the clusters can choose one of their orientations more or less independently of each
other.
To describe the ground-state landscape a quantity called overlap q can be used. Let
$\{\sigma_i^A\}$ and $\{\sigma_i^B\}$ be two independent ground-state configurations for the same system,
i.e. the same realization of the disorder. One can compare these two configurations by
calculating
where N' is the number of spins, i.e. $N' = \sum_i \epsilon_i$ for the DAFF and N' = N for the
RFIM. Thus, if $\{\sigma_i^A\}$ and $\{\sigma_i^B\}$ are equal, q = 1, while q = -1 if $\{\sigma_i^A\}$ and $\{\sigma_i^B\}$
are inverted relative to each other. In general $-1 \le q \le 1$.
For a set of M ground states one compares each state with each other one, resulting
in M(M - 1)/2 values. So one gets a whole distribution of overlaps. To describe the
behavior of a random ensemble, again one takes different realizations of the disorder
and calculates an average distribution P(q). The results for the DAFF are shown
in Fig. 6.31, see also [27]. The fact that the ground-state landscape is rather simple
is reflected by the fact that P(q) is zero for most of the overlap values and by the
shrinking of the width of P(q). In the thermodynamic limit the distribution becomes
a delta function. The result for the RFIM looks similar.
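The pairwise comparison can be sketched as follows. The code assumes that the overlap
is the normalized sum of the products of corresponding spins, $q = \frac{1}{N'} \sum_i \epsilon_i \sigma_i^A \sigma_i^B$,
with $\epsilon_i = 1$ for all sites in the case of the RFIM; this form is an assumption consistent
with the properties stated above (q = 1 for equal and q = -1 for inverted configurations).
The function names and the sample configurations are hypothetical.

```python
from itertools import combinations


def overlap(sigma_a, sigma_b, occupied=None):
    """Overlap of two spin configurations (lists of +1/-1).

    occupied: optional list of 0/1 occupation numbers (DAFF); all sites
    are treated as occupied if it is omitted (RFIM).
    """
    if occupied is None:
        occupied = [1] * len(sigma_a)
    n_spins = sum(occupied)
    total = sum(e * a * b for e, a, b in zip(occupied, sigma_a, sigma_b))
    return total / n_spins


def all_overlaps(states, occupied=None):
    """Overlaps of all M(M-1)/2 pairs of a list of M ground states."""
    return [overlap(a, b, occupied) for a, b in combinations(states, 2)]


# hypothetical set of four configurations of a six-spin system
states = [[1, 1, 1, -1, -1, 1],
          [1, 1, 1, -1, -1, -1],
          [1, 1, 1, 1, -1, 1],
          [1, 1, 1, 1, -1, -1]]
print(all_overlaps(states))
```

A histogram of the returned values, averaged over many realizations of the disorder,
gives an estimate of the distribution P(q).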
Other recent results for the ground states of diluted antiferromagnets and random-field
systems can be found e.g. in [32, 34, 35, 36, 37, 38, 39, 40].
For another class of random systems, the spin glass model, the ground state landscape
for individual realizations looks much more interesting and P(q) is very broad for finite
sizes. This subject is covered in Chap. 9.
Bibliography
[1] S. Fishman and A. Aharony, J. Phys. C 12, L729 (1979)
[2] J.L. Cardy, Phys. Rev. B 29, 505 (1984)
[3] H. Rieger, in: D. Stauffer (ed.), Annual Reviews of Computational Physics II,
(World Scientific, Singapore 1995)
[4] R.A. Cowley, H. Yoshizawa, G. Shirane, and R.J. Birgeneau, Z. Phys. B 58, 15
(1984)
[5] D.P. Belanger, in: A.P. Young (ed.), Spin Glasses and Random Fields, (World
Scientific, Singapore 1998)
[6] U. Nowak and K.D. Usadel, Phys. Rev. B 44, 7426 (1991)
[7] K.D. Usadel and U. Nowak, JMMM 104-107, 179 (1992)
[8] J.-C. Picard and H.D. Ratliff, Networks 5, 357 (1975)
[9] E.T. Seppala, M.J. Alava, and P.M. Duxbury, Phys. Rev. E 63, 036126 (2001)
[10] E.T. Seppala and M.J. Alava, Phys. Rev. Lett. 84, 3982 (2000)
[11] E.T. Seppala, V.I. Raisanen, and M.J. Alava, Phys. Rev. E 61, 6312 (2000)
[14] L.R. Ford and D.R. Fulkerson, Canadian J. Math. 8, 399 (1956)
[15] J.D. Claiborne, Mathematical Preliminaries for Computer Networking, (John
Wiley & Sons, New York 1990)
[17] M.N.S. Swamy and K. Thulasiraman, Graphs, Networks and Algorithms, (John
Wiley & Sons, New York 1991)
[19] R.E. Tarjan, Data Structures and Network Algorithms, (Society for Industrial and
Applied Mathematics, Philadelphia 1983)
[20] L.R. Ford and D.R. Fulkerson, Flows in Networks, (Princeton University Press,
Princeton 1962)
[21] A.V. Goldberg and R.E. Tarjan, J. ACM 35, 921 (1988)
[22] B. Cherkassky and A. Goldberg, Algorithmica 19, 390 (1997)
7.1 Motivation
The ground-state configuration of a directed polymer in a random environment, in
which all (binding) energies are non-negative, can be obtained with Dijkstra's algorithm
to find the shortest path in a directed network with non-negative costs on the edges.
We will also refer to this problem as the 1-line problem since the directed polymer
configuration or shortest path is a line-like object threading, in typical physical situations,
a two- or three-dimensional system.
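For orientation, the 1-line problem can be stated in code as an ordinary shortest-path
search. The following Python sketch of Dijkstra's algorithm is a generic illustration:
the function name `dijkstra`, the edge dictionary and the small example network (with
"s" standing for the top and "t" for the bottom of the sample) are assumptions made for
this example.

```python
import heapq


def dijkstra(edges, source, target):
    """Shortest path in a directed network with non-negative edge costs.

    edges: dict {(i, j): cost} (hypothetical format).
    Returns (list of vertices on the path, total cost).
    """
    adj = {}
    for (i, j), w in edges.items():
        adj.setdefault(i, []).append((j, w))

    dist = {source: 0}
    parent = {source: None}
    heap = [(0, source)]
    done = set()
    while heap:
        d, i = heapq.heappop(heap)
        if i in done:                 # outdated heap entry
            continue
        done.add(i)
        if i == target:
            break
        for j, w in adj.get(i, []):
            if j not in dist or d + w < dist[j]:
                dist[j] = d + w
                parent[j] = i
                heapq.heappush(heap, (d + w, j))

    if target not in dist:
        return None, float("inf")
    path, v = [], target
    while v is not None:
        path.append(v)
        v = parent[v]
    return path[::-1], dist[target]


# hypothetical energy landscape with non-negative bond energies
edges = {("s", "a"): 1, ("s", "b"): 4, ("a", "b"): 2, ("a", "t"): 6, ("b", "t"): 1}
print(dijkstra(edges, "s", "t"))
```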
At this point it appears natural to ask for the minimal energy configuration (or ground
state) of more than one, say N, lines in the same disordered environment under the
constraint (the "capacity constraint") that only one line can pass a single bond in the
lattice or edge in the graph. Having a particular physical situation in mind, namely
magnetic flux lines threading through a disordered superconductor in the mixed phase
(Shubnikov phase), we usually want the lines to enter the system through a top surface
and to leave the system through a bottom surface as, for instance, the z = H and the
z = 0 plane, respectively, in a two-dimensional square lattice or a three-dimensional
simple cubic geometry, see Fig. 7.1.
The physical situations one is interested in when considering such cases of interacting
elastic lines in a random potential are commonly described by the Hamiltonian
top to the bottom in the remaining network, giving (together with the removed bonds)
a 2-line configuration. One proceeds successively until N lines thread the sample from
top to bottom. Apart from the obvious problems one could run into by successively
removing edges from the graph (it could happen that we cannot add further lines since
all paths from top to bottom are blocked), a little bit of thinking will convince us that
in this way one will, in general, not find the minimum energy configuration: it could
be that the ground state of 2 lines is not separable into the 1-line ground state plus
the "first excited state" that one constructs in the way described above (see Fig. 7.2
and Fig. 7.3). In many cases one has to deform pieces (or even all) of the 1st line
before adding the 2nd line in order to minimize the total energy. As long as there
are only a few lines in a large system the difference will vanish in the thermodynamic
limit. However, if the density of lines is fixed a difference will exist with probability
one even in the infinite system size limit. For dense systems, as we will consider in
what follows, the difference will be essential.
How does one take into account all possible deformations of N - 1 lines when one wants
to add the Nth line to the system? At first sight this appears to be a tremendous
Figure 7.2: (top) Two-polymer ground state in 2d, (bottom) the same system but
with the first line (the 1-line GS) frozen. The energy of the configuration (top) is lower.
Note that in the (top) case the 1-line GS is the left line, compared to the (bottom) figure.
The fact that in (top) the 1-line GS is minimally deformed (in the lower part, in order
to give the 2nd line a bit of space, which is energetically favorable) produces a 2-line
GS that is totally different (concerning the 2nd line) from the configuration (bottom).
In both cases the disorder landscape is the same.
task, but there is an elegant trick by which one can devise an efficient algorithm for
the N-line problem. First, one does not work with the original network but with the
so-called residual network, which depends on the number of lines one has to put into
the system and on their actual configuration. It is defined as follows: since we confine
ourselves in the beginning to capacity-one edges (i.e. only a single line can pass each
edge), we remove each edge $(ij)$ that is occupied by a line segment ($x_{ij} = 1$) and
insert the reversed edge $(ji)$ with cost $c_{ji} = -c_{ij} \le 0$! These reversed edges can
now be occupied by (virtual) line segments ($x_{ji} = 1$) through which one gains energy
($c_{ji} \le 0$), which is due to the fact that in reality one removes a line segment from edge
7 Minimum-cost Flows
Figure 7.3: Similar to Fig. 7.2 but in 3d. In both cases the disorder landscape is the
same.
$(ij)$ when occupying edge $(ji)$ and thus reduces the total energy by an amount $c_{ij}$. In
this way one elegantly incorporates into the residual network the possibility of deforming
the already existing lines when adding a new line.
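The construction of the residual network for capacity-one edges can be sketched in a few lines of Python. This is only a minimal illustration and not code from the text: the function name `residual_network` and the representation of the network as a list of `(i, j, cost)` tuples are assumptions made for this example.

```python
def residual_network(edges, occupied):
    """Build the residual network for capacity-one edges.

    edges    -- list of (i, j, cost) tuples, the directed bonds of the network
    occupied -- set of (i, j) pairs that currently carry a line segment
    (Both names are hypothetical; the text does not prescribe a data layout.)
    """
    residual = []
    for i, j, cost in edges:
        if (i, j) in occupied:
            # An occupied edge is saturated: remove it and insert the
            # reversed edge (j, i) with cost -c_ij <= 0.
            residual.append((j, i, -cost))
        else:
            # A free edge is kept with its original cost.
            residual.append((i, j, cost))
    return residual


edges = [("s", "a", 1), ("a", "b", 1), ("b", "t", 1), ("s", "b", 4), ("a", "t", 4)]
line = {("s", "a"), ("a", "b"), ("b", "t")}  # the 1-line ground state
print(residual_network(edges, line))
```

In this small example the only remaining route from s to t in the residual network is s, b, a, t, which uses the reversed edge (b, a) and thereby deforms the first line.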
One difficulty occurs. Now a number of edges have negative costs, which renders
the efficient shortest path finder, Dijkstra's algorithm, inapplicable. However, in this
particular situation, in which the (N - 1)-line configuration is indeed the one with the
lowest total cost, there is no negative cycle in the residual network (i.e. one cannot find
a closed path in the residual network whose addition to the existing configuration would
lower the energy). Therefore it is possible to find so-called node potentials $\pi(i)$ for all
nodes i that can be used to modify the costs in the residual network in such a way that they
are all non-negative. These reduced costs are defined as $c^\pi_{ij} = c_{ij} + \pi(i) - \pi(j)$, where
$\pi$ is chosen such that $c^\pi_{ij} \ge 0$ for all edges in the residual network and the shortest paths
with respect to these reduced costs $c^\pi_{ij}$ are still the shortest paths with respect to the
original costs $c_{ij}$ of the residual network. In this residual network with these reduced
costs one now proceeds in the same way as we proposed naively at the beginning of
this section: one adds lines successively to the system by finding the shortest paths in
the residual network, updating the line (or flow) configuration and then updating the
residual network including the reduced costs again. We will now put this idea on a
formal basis.
7.2 The Solution of the N-Line Problem
where $\sum_{(ij)}$ is a sum over all bonds $(ij)$ joining sites i and j of a d-dimensional lattice,
e.g. a rectangular ($L^{d-1} \times H$) lattice, with arbitrary boundary conditions (b.c.) in $d-1$
space directions (i.e. they can be specified later) and free b.c. in one direction. The
bond energies $c_{ij} \ge 0$ are quenched random variables that indicate how much energy
it costs to put a segment of fluxline on a specific bond $(ij)$. The fluxline configuration
$x$ ($x_{ij} \ge 0$), also called a flow, is given by specifying $x_{ij} = 1$ for each bond $(ij)$ that
is occupied by the fluxline and $x_{ij} = 0$ otherwise. For the configuration to form lines,
on each site of the lattice all incoming flow should balance the outgoing flow, i.e. the
flow is divergence free
$$(\nabla \cdot x)_i = 0 \tag{7.3}$$
where $\nabla\cdot$ denotes the lattice divergence. Obviously the fluxline has to enter, and to
leave, the system somewhere. We attach all sites of one free boundary to an extra site
(via energetically neutral edges, e = 0), which we call the source s, and the other side
to another extra site, the target t, as indicated in Fig. 7.4. Now one can push one line
through the system by inferring that s has a source strength of +1 and that t has a
sink strength of -1, i.e.
$$(\nabla \cdot x)_s = +N, \qquad (\nabla \cdot x)_t = -N \tag{7.4}$$
with N = 1. Thus, the 1-line problem consists of minimizing the energy (7.2) by
finding a flow x in the network (the lattice plus the two extra sites s and t) fulfilling
the constraints (7.3) and (7.4).
Figure 7.4: Sketch of the successive shortest path algorithm for the solution of
the minimum cost flow problem described in the text. (a) Network for N = 0; the
numbers are the bond energies (or costs) $c_{ij}$. The bold thick line is a shortest path
from s to t. (b) The residual network $G_c$ for a flow as in (a) with the updated node
potentials. (c) $G_c$ from (b) with the updated reduced costs plus the shortest path
from s to t in $G_c$, indicated by the thick light line. (d) Optimal flow configuration for
N = 2 in the original network. Note that the 2-line state is not separable, i.e. it does
not consist of the line of (a) plus a 2nd line.
To solve this problem along the lines described in Sec. 7.1 we first define the residual
network $G_c(x)$ corresponding to the actual fluxline configuration. As described in
Sec. 7.1 it also contains the information about possibilities of sending flow backwards (now
with energy $-c_{ij}$, since one wins energy by reducing $x_{ij}$), i.e. of modifying the actual
flow. Suppose that we put one fluxline along a shortest path $P(s,t)$ from s to t, which
means that we set $x_{ij} = 1$ for all edges on the path $P(s,t)$. Then the residual network
is obtained by reversing all edges and inverting all energies along this path, indicating
that here we cannot put any further flow in the forward direction (since we assume
hard-core interaction, i.e. $x_{ij} \le 1$), but can send flow backwards by reducing $x_{ij}$ on
the forward edges by one unit. This procedure is sketched in Fig. 7.4.
Next we introduce a node potential $\pi$ that fulfills the relation
$$\pi(j) \le \pi(i) + c_{ij} \tag{7.5}$$
for all edges $(ij)$ in the residual network, indicating how much energy $\pi(j)$ it would
at least take to send one unit of flow from s to site j, if it would cost an energy $\pi(i)$
to send it to site i. With the help of these potentials one defines the reduced costs
$$c^\pi_{ij} = c_{ij} + \pi(i) - \pi(j) \ge 0. \tag{7.6}$$
The last inequality, which follows from the property (7.5) of the potential $\pi$, actually
ensures that there is no loop $\mathcal{L}$ in the current residual network (corresponding to a
flow x) with negative total energy, since $\sum_{(ij)\in\mathcal{L}} c_{ij} = \sum_{(ij)\in\mathcal{L}} c^\pi_{ij}$, implying that the
flow x is optimal (for a proof see Sec. 7.3).
It is important to note that the inequality (7.5) is reminiscent of a condition for shortest
path distances d(i) from s to all sites i with respect to the energies $c_{ij}$: they have to
fulfill $d(j) \le d(i) + c_{ij}$. Thus, one uses these distances d to construct the potential $\pi$
when putting one fluxline after the other into the network.
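To see why shifting the costs by node potentials does not change which path from s to t is the shortest, one can sum the reduced costs $c^\pi_{ij} = c_{ij} + \pi(i) - \pi(j)$ along an arbitrary path P from s to t. The potentials of all intermediate sites cancel pairwise, so that only a path-independent constant remains:

```latex
\sum_{(ij)\in P} c^\pi_{ij}
  = \sum_{(ij)\in P} \bigl[ c_{ij} + \pi(i) - \pi(j) \bigr]
  = \sum_{(ij)\in P} c_{ij} + \pi(s) - \pi(t)
```

All paths from s to t are shifted by the same amount $\pi(s) - \pi(t)$, hence their ordering by cost is unchanged. For a closed loop the same telescoping sum gives no shift at all.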
The iterative procedure we described in Sec. 7.1 now works as follows. We start with the
empty network (zero fluxlines) $x^0 = 0$, which is certainly an optimal flow for N = 0,
and set $\pi^0 = 0$, $c^0_{ij} = c_{ij}$. Next, let us suppose that we have an optimal (N - 1)-line
configuration corresponding to the flow $x^{N-1}$. The current potential is $\pi^{N-1}$,
the reduced costs are $c^{N-1}_{ij} = c_{ij} + \pi^{N-1}(i) - \pi^{N-1}(j)$ and we consider the residual
network $G_c^{N-1}$ corresponding to the flow $x^{N-1}$ with the reduced costs $c^{N-1}_{ij} \ge 0$. The
iteration leading to an optimal N-line configuration $x^N$ is
An example of how the algorithm operates is given in Fig. 7.4. The complexity of this
iteration is the same as that of Dijkstra's algorithm for finding the shortest paths in a
network, which is $O(M^2)$ in the worst case (M is the number of nodes in the network).
We find, however, that for the cases we consider (d-dimensional lattices) it roughly scales
linearly in $M = L^d$. Thus, for N fluxlines the complexity of this algorithm is $O(N L^d)$.
For the actual implementation of the above iteration it is important to note that it
is not necessary actually to find the shortest paths to all other nodes: in the find
routine one uses Dijkstra's algorithm only to find a shortest path from s to t, and in
the compute statement it is sufficient to update only those potentials of the nodes
that have been permanently labeled during Dijkstra's algorithm. It is easy to show
that these node potentials still fulfill the requirement $c^\pi_{ij} \ge 0$ (7.6).
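The iteration described in this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `n_line_ground_state`, the edge-list input format and the adjacency-list layout are all hypothetical. For simplicity the sketch runs Dijkstra's algorithm to completion and updates the potentials of all reached nodes, instead of stopping at t as suggested above. It uses the reduced costs $c_{ij} + \pi(i) - \pi(j)$.

```python
import heapq

INF = float("inf")


def n_line_ground_state(num_nodes, bonds, s, t, n_lines):
    """Successive shortest paths for n_lines capacity-one lines from s to t.

    bonds -- list of (i, j, energy) directed edges with energy >= 0 and i != j
    Returns (total_energy, sorted list of occupied bonds).
    """
    # graph[u] holds residual edges [head, capacity, cost, reverse_index, is_bond].
    graph = [[] for _ in range(num_nodes)]
    for i, j, energy in bonds:
        graph[i].append([j, 1, energy, len(graph[j]), True])
        graph[j].append([i, 0, -energy, len(graph[i]) - 1, False])
    pi = [0] * num_nodes  # node potentials, zero for the empty network
    total = 0
    for _ in range(n_lines):
        # Dijkstra on the reduced costs c_ij + pi(i) - pi(j), which are >= 0.
        dist = [INF] * num_nodes
        prev = [None] * num_nodes
        dist[s] = 0
        heap = [(0, s)]
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist[u]:
                continue
            for k, (v, cap, cost, _, _) in enumerate(graph[u]):
                nd = d + cost + pi[u] - pi[v]
                if cap > 0 and nd < dist[v]:
                    dist[v] = nd
                    prev[v] = (u, k)
                    heapq.heappush(heap, (nd, v))
        if dist[t] == INF:
            raise ValueError("no further line fits into the network")
        # New potentials: add the distance labels of all reached nodes.
        for v in range(num_nodes):
            if dist[v] < INF:
                pi[v] += dist[v]
        # Augment one unit of flow along the path; this updates the residual graph.
        v = t
        while v != s:
            u, k = prev[v]
            edge = graph[u][k]
            edge[1] -= 1
            graph[v][edge[3]][1] += 1
            total += edge[2]
            v = u
    occupied = sorted((u, e[0]) for u in range(num_nodes)
                      for e in graph[u] if e[4] and e[1] == 0)
    return total, occupied


bonds = [(0, 1, 1), (1, 2, 1), (2, 3, 1), (0, 2, 4), (1, 3, 4)]
print(n_line_ground_state(4, bonds, 0, 3, 1))
print(n_line_ground_state(4, bonds, 0, 3, 2))
```

In this toy network (source 0, target 3) the single line runs along 0, 1, 2, 3. With two lines the bond (1, 2) is released again, so the 2-line state is not the 1-line state plus a second line.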
Figure 7.5: The SOS model on a disordered substrate. The substrate heights are
denoted by $d_i \in [0,1]$, the number of particles on site i by $n_i \in \mathbb{Z}$, which means that
they could also be negative, and the total height on site i by $h_i = d_i + n_i$.
The model (7.7) has a phase transition at a temperature $T_c$ from a (thermally) rough
phase for $T > T_c$ to a super-rough low temperature phase for $T < T_c$. In two
dimensions "rough" means that the height-height correlation function diverges loga-
7.3 Convex Mincost-flow Problems in Physics
where $\phi_i \in [0, 2\pi)$ are phase variables, $\theta_i \in [0, 2\pi)$ are quenched random phase shifts
and $\lambda$ is a coupling constant. One might anticipate that both models (7.7) and (7.8)
are closely related by realizing that both have the same symmetries [the energy is
invariant under the replacement $n_i \to n_i + m$ ($\phi_i \to \phi_i + 2\pi m$) with m being an
integer]. Close to the transition one can show that all higher order harmonics apart
from the one present in the sine-Gordon model (7.8) are irrelevant in a field theory for
(7.7), which establishes the identity of the universality classes. Note, however, that
far away from $T_c$, as for instance at zero temperature, there might be differences between
the two models.
To calculate the ground states of the SOS model on a disordered substrate with general
interaction function f(x) we map it onto a minimum cost flow model. Let us comment,
however, that the special case $f(x) = |x|$ can be mapped onto the interface problem in
the random bond Ising ferromagnet in 3d with columnar disorder [5] (i.e. all bonds
in a particular direction are identical), so that it can be treated with the maximum
flow algorithm we have discussed already (see Chap. 6).
We define a network G by the set of nodes N being the sites of the dual lattice of our
original problem (which is simply made of the centers of each elementary plaquette of
the square lattice, thus being again a square lattice) and the set of directed edges A
connecting nearest neighbor sites (in the dual lattice) $(i,j)$ and $(j,i)$. If we have a set
of height variables $n_i$ we define a flow x in the following way. Suppose two neighboring
sites i and j have a positive (!) height difference $n_i - n_j > 0$. Then we assign the flow
value $x_{ij} = n_i - n_j$ to the directed edge $(i,j)$ in the dual lattice, for which the site i
with the larger height value is on the right hand side, and assign zero to the opposite
edge $(j,i)$, i.e. $x_{ji} = 0$. Also $x_{ij} = 0$ whenever sites i and j are of the same height.
See Fig. 7.6 for a visualization of this scheme. The flow pattern is made up of closed
cycles that separate regions of different height and therefore we have:
$$(\nabla \cdot x)_i = 0. \tag{7.9}$$
On the other hand, for an arbitrary set of values for $x_{ij}$ the constraint (7.9) has to be
fulfilled in order to be a flow, i.e. in order to allow a reconstruction of height variables
out of the height differences. This observation becomes immediately clear by looking
at Fig. 7.6.
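The statement that a flow derived from height variables is automatically divergence free can be checked numerically. The following Python sketch is an illustration only: the function names and the array layout `n[x][y]` (x to the right, y upwards, open boundaries) are assumptions of this example. For each plaquette it computes the signed flow leaving the dual node through each of its four sides, using the convention that the higher site lies to the right of the arrow, and sums these contributions.

```python
def plaquette_outflows(n, x, y):
    """Signed flow leaving the dual node at the centre of plaquette (x, y).

    n[x][y] are integer heights on the original square lattice. On each side
    of the plaquette the flow on the crossing dual edge is the height
    difference of the two sites on that side, counted as positive when the
    higher site lies to the right of the outward direction.
    """
    bottom = n[x][y] - n[x + 1][y]          # outward -y: right-hand site is (x, y)
    right = n[x + 1][y] - n[x + 1][y + 1]   # outward +x: right-hand site is (x+1, y)
    top = n[x + 1][y + 1] - n[x][y + 1]     # outward +y: right-hand site is (x+1, y+1)
    left = n[x][y + 1] - n[x][y]            # outward -x: right-hand site is (x, y+1)
    return bottom, right, top, left


def divergence(n):
    """Net outflow at every dual node (plaquette) of the height field n."""
    size_x, size_y = len(n), len(n[0])
    return [[sum(plaquette_outflows(n, x, y)) for y in range(size_y - 1)]
            for x in range(size_x - 1)]


heights = [[0, 0, 0], [0, 3, 0], [0, 0, 0]]  # a "mountain" of height 3
print(plaquette_outflows(heights, 0, 0))
print(divergence(heights))
```

The four contributions around a plaquette form a telescoping sum of height differences, which is why the divergence vanishes for any height field. A negative entry in `plaquette_outflows` simply means that the flow on that dual edge points into the plaquette.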
Figure 7.6: The flow representation of a surface (here a "mountain" of height ni = 3).
The broken lines represent the original lattice, the open dots are the nodes of the dual
lattice. The arrows indicate a flow on the dual lattice, which results from the height
differences of the variables ni on the original lattice. Thin arrows indicate a height
difference of $x_{ij} = 1$, medium arrows $x_{ij} = 2$ and thick arrows $x_{ij} = 3$. According to
our convention the larger height values are always on the right of an arrow. Observe
that on each node the mass balance constraint (7.9) is fulfilled.
Vortex glass in 3d
The starting point of a field theory for superconductors is the Ginzburg-Landau theory
containing the phase variables (or XY-spins) for the local superconducting order
parameter. As is well known from the classical XY-model [6], the spin-wave degrees of
freedom of such a model can be integrated out and one is left with an effective Hamiltonian
for the topological defects, the vortices, which are the singularities of the phase
field $\theta$ and interact with one another like currents in the Biot-Savart law from classical
electrodynamics, i.e. like $1/r$, where $r$ is the distance (see also [7]). An additional
integration over the fluctuating vector potential sets a cutoff for this long-range interaction
beyond which the interaction decays exponentially and can thus be neglected.
Thus a standard model for interacting magnetic flux lines in high temperature
superconductors [8] is, in the vortex representation, the Hamiltonian [9]
7.4 General Minimum-cost-flow Algorithms
where the sum runs over all pairs of bonds of a simple cubic lattice (thus it is a
three-dimensional model!) and the integer variables $x_{ij}$, assigned to these bonds, can take
on any integer value but have to satisfy the divergence free condition
$$(\nabla \cdot x)_i = 0 \tag{7.12}$$
on every site i, the same as in the two-dimensional SOS model (7.9). The $b_{ij}$ are
magnetic fields which are constructed from quenched random vector potentials A by
a lattice curl, i.e. one obtains $b_{ij}$ as $1/(2\pi)$ times the directed sum of the vector
potentials on the plaquette surrounding the link $(ij)$ on which $b_{ij}$ lives. By definition,
the magnetic fields satisfy the divergence free condition $(\nabla \cdot b)_i = 0$ on every site
i, since they stem from a lattice curl. The vortex interaction is given by the lattice
Green's function
Thus we have to minimize (7.14) under the mass balance constraint (7.12), which is a
convex minimum-cost-flow problem.
The parameters b(i), $c_{ij}$ are integers, while $x_{ij}$, $u_{ij}$ are non-negative integers. Note
that we have allowed for a general set of sources and sinks b(i), though we will restrict
attention to one source, s, and one sink, t, in the applications. In fact, it is easy to
convert a problem with a general set b(i) to one with just one source s and one target
t in the following way:
(i) Connect all of the nodes with b(i) > 0 to s by edges with capacity $u_{si} = b(i)$.
(ii) Connect all of the nodes with b(i) < 0 to t by edges with capacity $u_{it} = -b(i)$.
Flow conservation requires $\sum_i b(i) = 0$ so that the flow into the network is equal to
the flow out of the network.
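Steps (i) and (ii) translate directly into code. The following Python sketch is illustrative only: the function name `add_super_source_sink`, the dictionary input and the labels `"s"` and `"t"` are assumptions of this example, and the direction of the new edges (from s into the network, from the network into t) is the natural reading of the two steps rather than something spelled out in the text.

```python
def add_super_source_sink(b, s="s", t="t"):
    """Return the extra edges (tail, head, capacity) for a single source and sink.

    b -- dict mapping each node i to its supply b(i); b(i) > 0 marks a source,
         b(i) < 0 a sink and b(i) = 0 an ordinary node.
    """
    if sum(b.values()) != 0:
        raise ValueError("flow conservation requires sum_i b(i) = 0")
    extra_edges = []
    for i, supply in b.items():
        if supply > 0:
            # (i) connect s to every source node with capacity b(i)
            extra_edges.append((s, i, supply))
        elif supply < 0:
            # (ii) connect every sink node to t with capacity -b(i)
            extra_edges.append((i, t, -supply))
    return extra_edges


print(add_super_source_sink({1: 2, 2: 0, 3: -1, 4: -1}))
```

The total flow that then has to be sent from s to t equals the sum of all positive b(i).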
The quantity $h_{ij}(x_{ij})$ is the cost function and may be a different function on each
bond. The cost functions $h_{ij}(x_{ij})$ can be any convex function, that is,
$$\forall\, x, y \text{ and } \theta \in [0,1]: \quad h_{ij}(\theta x + (1-\theta) y) \le \theta h_{ij}(x) + (1-\theta) h_{ij}(y). \tag{7.18}$$
Linear cost is the special case $h_{ij}(x_{ij}) = c_{ij} x_{ij}$. There are faster algorithms for linear
cost than for convex cost, but for the applications considered here the difference is
not large. The general convex-cost case is as simple to discuss as the linear-cost case,
so we will discuss the general algorithm.
The residual network G(x), corresponding to a flow x, is defined as in the N-line
problem. However, the residual costs need to be constructed differently due to the
non-linearity of the local cost functions $h_{ij}(x_{ij})$. We need to know the cost of augmenting
the flow on arc $(i,j)$ when there is already a flow $x_{ij}$ in that edge. In the general
convex-cost problem, we always augment the flow by one flow unit. Because we have
defined $x_{ij} \ge 0$ and $x_{ji} \ge 0$, we must treat three cases:
(i) If $x_{ij} \ge 1$, then $x_{ji} = 0$
$(i,j)$, the first unit of flow goes into the 1st replicated edge, the 2nd unit of flow
into the 2nd replicate etc. When the flow is reversed, the flow is canceled first in
the highest replicated edge, provided the cost function is convex. That is, we need
$h_{ij}(k) - h_{ij}(k-1) \ge h_{ij}(k-1) - h_{ij}(k-2)$ so that this replication procedure makes
sense. Unfortunately no analogous procedure is possible when the convexity condition
is violated. In the case of linear costs there is no need to replicate the edges, as the
cost for incrementing the flow does not depend on the existing flow.
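The unit-augmentation costs and the replication picture can be made concrete with a short Python sketch. It is an illustration under assumptions: the helper names are hypothetical, and the residual costs are taken to be the cost differences $h(x+1) - h(x)$ for adding one unit and $h(x-1) - h(x)$ for removing one unit on an arc that carries $x \ge 0$ units, which is the standard choice for unit augmentations and is not copied from the equations of the text.

```python
def unit_residual_costs(h, x):
    """Cost change for adding or removing one unit on an arc carrying x >= 0 units.

    h -- convex cost function of the integer flow on this arc (hypothetical input)
    Returns (forward, backward); backward is None if there is no flow to cancel.
    """
    forward = h(x + 1) - h(x)
    backward = h(x - 1) - h(x) if x >= 1 else None
    return forward, backward


def replicated_edge_costs(h, max_flow):
    """Costs of the unit-capacity replicas: the k-th replica costs h(k) - h(k-1)."""
    return [h(k) - h(k - 1) for k in range(1, max_flow + 1)]


def h(x):
    return x * x  # a simple convex example cost


print(replicated_edge_costs(h, 4))
print(unit_residual_costs(h, 2))
```

For a convex h the replica costs are non-decreasing, so a shortest-path search fills the cheapest replica first and cancels the most expensive one first, which is exactly the ordering required above.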
We will now discuss two methods for solving minimum-cost-flow problems, namely
the negative-cycle-canceling method and the successive-shortest-path method, both of
which rely on residual-graph ideas. The first method starts with an arbitrary feasible
solution that is not yet optimal and improves it iteratively until optimality is reached.
The latter method starts with an optimal solution that violates certain constraints and
fulfills the constraints one after the other, keeping optimality all the time, so that when
all constraints are fulfilled simultaneously optimality is guaranteed. The latter method is
more efficient, but we will also discuss the negative-cycle algorithm, since the negative-cycle
theorem presented below is needed to prove the correctness of the
successive-shortest-path method.
Negative-cycle-canceling Algorithm
The idea of this algorithm is to find a feasible flow, that is, one which satisfies the
mass-conservation rules, and then to improve its cost by canceling negative-cost cycles.
A negative-cost cycle in the original network is also a negative-cost cycle in the
residual graph, so we can work with the residual graph. Moreover, flow cycles do not
change the total flow into or out of the network, and they do not alter the mass-balance
conditions at each node. Thus, augmenting the flow on a negative-cost cycle maintains
feasibility and reduces the cost, which forms the basis of the negative-cycle-canceling
algorithm. This is formalized as follows.
Proof: Suppose the flow x is feasible and G(x) contains a negative cycle. Then a
flow augmentation along this cycle improves the function value z(x), thus x is not
optimal. Now suppose that $x^*$ is feasible and $G(x^*)$ contains no negative cycles, and
let $x^0 \ne x^*$ be an optimal solution. Now decompose $x^0 - x^*$ into augmenting cycles;
the sum of the costs along these cycles is $c \cdot x^0 - c \cdot x^*$. Since $G(x^*)$ contains no
negative cycles, $c \cdot x^0 - c \cdot x^* \ge 0$, and therefore $c \cdot x^0 = c \cdot x^*$, because optimality of
$x^0$ implies $c \cdot x^0 \le c \cdot x^*$. Thus $x^*$ is also optimal. QED
A minimum-cost algorithm based on the negative-cycle theorem, valid for
graphs with convex costs and no negative-cost cycles in G(0), is given below; an
example is presented in Fig. 7.7.
Figure 7.7: Illustrating the cycle canceling algorithm for linear costs: (a) network
example with a feasible flow x; (b) residual network G(x); (c) residual network after
augmenting 2 units along the cycle 4-2-3-4; (d) residual network after augmenting 1
unit along the cycle 4-2-1-3-4.
To begin the algorithm, it is necessary to find a feasible flow, i.e. a flow which
satisfies the injected flow at each of the sources, the extracted flow at each of the sinks,
and the mass-balance constraints at each node. A robust procedure to
find a feasible flow is to find a flow which satisfies the capacity constraints using the
maximum-flow algorithm (see Chap. 6). To detect negative cycles in the residual network
G(x), one can use the label-correcting algorithm for the shortest-path problem
presented in Sec. 4.2.3.
In the linear cost case, the maximum possible improvement of the cost function is
$O(|E| C U)$, where $C = \max |c_{ij}|$ and $U = \max u_{ij}$. Since each augmenting cycle
contains at least one edge and at least one unit of flow, the upper bound on the number
of augmenting cycle iterations for convergence is also $O(|E| C U)$. Negative-cycle
detection is $O(|E|^2)$ generically, but for sparse graphs with integer costs it is $O(|E|)$.
Thus for sparse graphs with integer costs, negative-cycle canceling is $O(|E|^2 C U)$.
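The negative-cycle detection step can be sketched as follows. The label-correcting algorithm of Sec. 4.2.3 is not reproduced in this section, so this Python example substitutes the Bellman-Ford method, a standard label-correcting scheme, as a stand-in. The function name `find_negative_cycle` and the edge-list format are assumptions of this example.

```python
def find_negative_cycle(num_nodes, edges):
    """Return the nodes of a negative-cost cycle in forward order, or None.

    edges -- list of (u, v, cost) residual edges with positive residual capacity
    """
    # Starting all labels at 0 acts like a virtual source joined to every node.
    dist = [0] * num_nodes
    pred = [None] * num_nodes
    last = None
    for _ in range(num_nodes):
        last = None
        for u, v, cost in edges:
            if dist[u] + cost < dist[v]:
                dist[v] = dist[u] + cost
                pred[v] = u
                last = v
        if last is None:
            return None  # labels are stable: no negative cycle exists
    # A label was still corrected in the last pass, so a negative cycle exists.
    # Walking back num_nodes steps guarantees that we stand on the cycle.
    node = last
    for _ in range(num_nodes):
        node = pred[node]
    cycle = [node]
    current = pred[node]
    while current != node:
        cycle.append(current)
        current = pred[current]
    cycle.reverse()
    return cycle


residual = [(0, 1, 1), (1, 2, -3), (2, 0, 1), (2, 3, 5)]
print(find_negative_cycle(4, residual))
```

In a cycle-canceling loop one would then augment the flow along the returned cycle by its smallest residual capacity, rebuild the residual network and repeat until `None` is returned.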
Successive-shortest-path algorithm
The successive-shortest-path algorithm iteratively sends flow along minimal-cost paths
from source nodes to sink nodes to finally fulfill the mass-balance constraints. A
pseudo flow satisfies the capacity and non-negativity constraints, but not necessarily
the mass-balance constraints. Such flows are called infeasible as they do not satisfy
all the constraints. The successive-shortest-path algorithm is an infeasible method
which maintains an optimal-cost solution at each step. In contrast, the
negative-cycle-canceling algorithm always satisfies the constraints (so it is a feasible method) and it
iteratively lowers the cost of the solution.
The imbalance of node i is defined as
If e(i) > 0 then we call e(i) the excess of node i; if e(i) < 0 then we call it the
deficit. The successive-shortest-path algorithm sends flow along minimal-cost paths
from excess sites to deficit sites until no excess or deficit nodes remain in the graph.
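As a small illustration of excess and deficit nodes, the following Python sketch computes the imbalances of a pseudo flow. The defining equation is not reproduced in this copy of the text, so the sketch assumes the common convention e(i) = b(i) + (flow into i) - (flow out of i). The function name and the dictionary layout are likewise hypothetical.

```python
def imbalances(b, flow):
    """Imbalance e(i) of every node for a pseudo flow.

    b    -- dict node -> supply b(i)
    flow -- dict (i, j) -> x_ij, the flow on the directed edge (i, j)
    Assumed convention: e(i) = b(i) + inflow(i) - outflow(i).
    """
    e = dict(b)
    for (i, j), x in flow.items():
        e[i] -= x  # x units leave node i
        e[j] += x  # and arrive at node j
    return e


b = {0: 2, 1: 0, 2: -2}
e = imbalances(b, {(0, 1): 1})
excess_nodes = [i for i, value in e.items() if value > 0]
deficit_nodes = [i for i, value in e.items() if value < 0]
print(e, excess_nodes, deficit_nodes)
```

Here one unit has been moved from node 0 to node 1 but has not yet reached the sink, so nodes 0 and 1 each carry an excess of one unit and node 2 still has a deficit of two.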
Dijkstra's algorithm (Sec. 4.2.2) is efficient in finding minimum-cost paths, but it only
works for non-negative costs. The successive-shortest-path algorithm uses Dijkstra's method
to find augmenting paths, but to make this work we have to develop a different sort of
residual network with non-negative reduced costs [remember that the residual costs can be
negative, see Eqs. (7.19-7.21)]. Surprisingly, this is possible. To construct non-negative
reduced costs from which the optimal flow can be calculated, we use the concept of
node potential $\pi(i)$ already encountered in Sec. 7.2.
The reduced costs used in the successive-shortest-path problem are inspired by the
reduced costs $c^d_{ij}$ introduced in the shortest-path problem [Eq. (4.3)]. $c^d_{ij}$ has the
attractive feature that, with respect to the optimal distances, every arc has a
non-negative cost. To generalize the definition (4.3) so that it can be used in the
minimum-cost-flow problem, one defines the reduced cost of edge $(i,j)$ in terms of a set of node
potentials $\pi(i)$,
$$c^\pi_{ij} = c_{ij} - \pi(i) + \pi(j). \tag{7.23}$$
(ii) For any cycle C,
$$\sum_{(ij)\in C} c^\pi_{ij} = \sum_{(ij)\in C} c_{ij}.$$
In particular, property (ii) means that negative cycles with respect to $c^\pi_{ij}$ are also
negative cycles with respect to $c_{ij}$. We define the residual network $G^\pi(x)$ to be the
residual graph with residual capacities defined as before, but with reduced costs as
given by Eq. (7.23).
The next step is to find a way to construct the potentials $\pi(i)$. This is carried out
recursively, starting with $\pi(i) = 0$ when there is no flow in the network. The procedure
for generating potentials iteratively relies on the potential lemma given below.
Lemma: (Potential)
(i) Given a valid node potential $\pi(i)$, a set of reduced costs $c^\pi_{ij}$ and a set of distance
labels d(i) (found using $c^\pi_{ij} \ge 0$), then the potential $\pi'(i) = \pi(i) - d(i)$ also has
non-negative reduced costs, $c^{\pi'}_{ij} \ge 0$.
(ii) $c^{\pi'}_{ij} = 0$ for all edges $(i,j)$ on shortest paths.
Proof: Properties (i) and (ii) follow from the analogous properties for the minimal
path [see Eqs. (4.3-4.5)]. To prove (i), using (4.3-4.5), we have $d(j) \le d(i) + c^\pi_{ij}$; then
we have $c^{\pi'}_{ij} = c_{ij} - [\pi(i) - d(i)] + [\pi(j) - d(j)] = c^\pi_{ij} - d(j) + d(i) \ge 0$. For (ii) simply
repeat the discussion after replacing the inequality by an equality. QED
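The potential lemma can be verified numerically on a small example. The Python sketch below is an illustration only: the helper names are hypothetical, the reduced costs are taken as $c_{ij} - \pi(i) + \pi(j)$ in agreement with the proof above, and every node is assumed to be reachable from the source so that all distance labels are finite.

```python
import heapq


def dijkstra(num_nodes, edges, source):
    """Distance labels d(i) from source for non-negative edge costs."""
    adjacency = [[] for _ in range(num_nodes)]
    for i, j, cost in edges:
        adjacency[i].append((j, cost))
    dist = [float("inf")] * num_nodes
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue
        for v, cost in adjacency[u]:
            if d + cost < dist[v]:
                dist[v] = d + cost
                heapq.heappush(heap, (dist[v], v))
    return dist


def reduced_costs(edges, pi):
    """Reduced costs c_ij - pi(i) + pi(j) for every edge."""
    return [(i, j, cost - pi[i] + pi[j]) for i, j, cost in edges]


def update_potential(num_nodes, edges, pi, source):
    """New potential pi'(i) = pi(i) - d(i), with d found on the reduced costs."""
    d = dijkstra(num_nodes, reduced_costs(edges, pi), source)
    return [pi[i] - d[i] for i in range(num_nodes)]


edges = [(0, 1, 2), (0, 2, 5), (1, 2, 1), (2, 3, 2), (1, 3, 6)]
pi_new = update_potential(4, edges, [0, 0, 0, 0], source=0)
print(pi_new)
print(reduced_costs(edges, pi_new))
```

With the updated potential all reduced costs are non-negative, and they vanish exactly on the edges (0, 1), (1, 2) and (2, 3) of the shortest-path tree, as stated in parts (i) and (ii) of the lemma.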
Now that we have a method for constructing potentials, it is necessary to demonstrate
that this construction produces an optimal flow.
define $\pi = -d$, then $c^\pi_{ij} = c_{ij} + d(i) - d(j) \ge 0$. Hence we have constructed a set of
node potentials associated with the optimal flow. QED
The reduced cost optimality theorem proves that each optimal flow has an associated
set of potentials, while the potential lemma shows how to construct these potentials.
The final step is to demonstrate how to augment the flow using the potentials. To
demonstrate this, suppose that we have an optimal flow x and its associated potential
$\pi(i)$ which produces reduced costs $c^\pi_{ij}$ that satisfy the reduced cost optimality
condition. Suppose that we want to add one unit of flow to the system, injecting at a source
at site k and extracting at a site l. Find a minimal path $P_{kl}$ (using the reduced
costs $c^\pi_{ij}$) from an excess site k to a deficit site l. Now augment the flow by one unit for
all edges $(i,j) \in P_{kl}$. We call this flow augmentation $\delta$. The following augmentation
lemma ensures that this procedure maintains optimality.
Proof: Take $\pi$ and $\pi'$ as in the potential lemma and let P be the shortest path from
node s to node k. Part (ii) of the potential lemma implies that $\forall\, (i,j) \in P$: $c^{\pi'}_{ij} = 0$.
Therefore $c^{\pi'}_{ji} = -c^{\pi'}_{ij} = 0$. Thus a flow augmentation on $(i,j) \in P$ might add $(j,i)$ to
the residual network, but $c^{\pi'}_{ji} = 0$, which means that the reduced cost optimality
condition $c^{\pi'}_{ij} \ge 0$ is still fulfilled. QED
The strategy for the successive-shortest-path algorithm is now clear. Given a set
of excess nodes $E = \{i \mid e(i) > 0\}$ and a set of deficit nodes $D = \{i \mid e(i) < 0\}$, we
iteratively find minimal paths from a node $i \in E$ to a node $j \in D$ until no excess or
deficit remains:
The minimal path for flow augmentation is found using Dijkstra's method on the resid-
ual network with the reduced costs given by Eq. (7.23). After each flow augmentation,
the node potentials are recalculated using the potential lemma. We demonstrate the
algorithm in Fig. 7.8, where for simplicity we use a linear cost function.
Since we worked hard to construct a system with non-negative reduced costs, the "find"
operation above can be carried out using Dijkstra's algorithm. If we denote the sum of
all the sources by $v = \sum_{i \mid b(i)>0} b(i)$, then the number of flow augmentations needed
to find the optimal flow is simply v. Each flow augmentation requires a search for a
minimum-cost path from a node $k \in E$ to a node $l \in D$, which for sparse graphs and
integer flows can be efficiently accomplished with Dijkstra's method, which is $O(|E|)$.
Thus for integer flows on sparse graphs with positive costs (as is typical of the physics
applications) the successive-shortest-path algorithm is $O(v|E|)$.
A final note on the initialization statement: for the physical problems we presented in
the preceding sections (7.2 and 7.3) it is not hard to find a flow x and a node potential
$\pi$ that fulfill the requirement that the reduced costs $c^\pi_{ij}(x) = c_{ij}(x_{ij}) + \pi_i - \pi_j \ge 0$
are all non-negative. In the N-line problem it is simply x = 0 and $\pi = 0$, i.e. the
system without any flux line (FL), since all bond energies are non-negative. In the
convex flow problem with general local costs $h_{ij}(x_{ij})$ one just chooses the integer
$x_{ij}$ that is closest to the minimum of $h_{ij}(x)$; for the specific examples of Sec. 7.3, where
$h_{ij}(x_{ij}) = (x_{ij} - d_{ij})^2$, it is simply the integer that is closest to the real number $d_{ij}$.
With this configuration $c_{ij}(x_{ij}) \ge 0$, and with $\pi = 0$ the reduced costs are also
non-negative.
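The initialization for the quadratic costs $h(x) = (x - d)^2$ can be illustrated with a few lines of Python. This is a sketch under assumptions: the function names are hypothetical, x is treated as a signed integer flow on a single bond, and the residual costs are taken to be the cost changes for adding or removing one unit of flow.

```python
import math


def initial_flow(d):
    """Integer closest to the minimum d of h(x) = (x - d)**2."""
    return math.floor(d + 0.5)


def unit_costs(d, x):
    """Cost change for raising and for lowering the flow x by one unit."""
    def h(y):
        return (y - d) ** 2
    return h(x + 1) - h(x), h(x - 1) - h(x)


for d in (0.3, 1.7, -0.6):
    x0 = initial_flow(d)
    print(d, x0, unit_costs(d, x0))
```

Since the starting value is the integer minimizer of h, both unit changes cost a non-negative amount, so together with zero potentials the reduced costs are non-negative at the start.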
Flux-line array in a periodic potential
As one example for typical flux-line (FL) problems we demonstrate here how one can
study the competition between point disorder and a periodic potential. Depending on
the strength of the disorder with respect to the depth of the periodic potential one may
or may not have a roughening transition. As a starting point we use the continuum
model (7.1) with a periodic potential V, and write down an appropriate lattice model
for it. Here we study the situation in which each potential valley is occupied by one
line and its minima are well localized, i.e. they have a width that is small compared to the
interaction range of the lines.
We use the N-line model defined in Sec. 7.2 and simply modify the bond energies
$e_{ij}$ in (7.2) appropriately: to the uncorrelated bond energy variables, taken from
some probability distribution $P(\epsilon_{ij})$ with variance $\epsilon$, we add a periodic part, setting
$e_{ij} = \epsilon_{ij} + \Delta_{ij}$. The structure of the periodic part resembles periodically arranged
columnar defects and is depicted in Fig. 7.9, where one has valleys of effective depth
Figure 7.9: Top: Periodic potential in 2d. The depth of the valleys is denoted by $\Delta$
and the nearest neighbor distance by a. Additional point disorder $\epsilon$ completes the
energy landscape. The FL can only enter and leave the system via the energetically
neutral edges connecting the source and the sink, respectively, with the potential
valleys. Bottom: Schematic phase diagram in 2d and 3d for bounded disorder. In
the case of unbounded disorder the flat phase vanishes.
$\Delta$ such that the $\Delta_{ij}$ values are zero inside the potential wells and constant elsewhere.
This also reproduces the elastic energy, since all bonds cost some positive energy.
We define the ratio of the disorder strength and the depth of the potential valleys as $\eta = \epsilon/\Delta$.
Figure 7.10 demonstrates with a series of snapshots the geometry involved in the
calculations and the typical behavior with increasing $\eta$ in 2d and 3d. In both cases
the lines are pinned to the energetically favorable valleys for small $\eta$, and finally for
large $\eta$ a cross-over to a rough state takes place. In 3d one can observe that the
lines wander almost freely under such conditions. The examples of Fig. 7.10 represent
different regions in the $a$-$\eta$ phase diagram, which is sketched in Fig. 7.9.
One discriminates between the different regions in the phase diagram by looking at
7.5 Miscellaneous Results for Different Models
Figure 7.10: Optimal ground state configurations in 2d (top) and 3d (bottom) for
different point disorder strengths $\eta$, increasing from left to right. In the flat phase
(left) the FL are trapped completely inside the potential valleys.
For unbounded disorder this flat region does not exist, since the probability for a
sequence of high-energy bonds in the valleys that pushes the lines out of them is always
positive. In the weakly fluctuating region for $\eta_{c1} < \eta \le \eta_{c2}$ the lines roughen locally.
Here one has w > 0 and Ill> 0, independent of the system size L, and P V = 1.
The transverse fluctuations of flux lines are bounded by the average line distance or
valley separation a. The central feature is that lines fluctuate individually, so that a
columnar defect competes with point disorder. Both in 2d and in 3d a strong columnar
pin strictly localizes the line [12], reducing the line-to-line interaction to zero. More
details of this analysis can be found in [13].
With respect to the balance constraint (7.9) the only feasible optimal solution (ground
state) is a flat surface, i.e. $x_{ij} = 0$ for all links $(ij)$. On the other hand, dislocations
can be introduced if one treats the height field $h_i$ as a multi-valued function which
may jump by 1 along lines that connect two point defects (i.e. a dislocation pair) [14].
Therefore, for the given example (Fig. 7.12) it should be clear that the minimal
configuration $\{x\}_{\min}$ (see above) is exactly the optimal (i.e. ground state) configuration
with one dislocation pair. One of the two defects has a Burgers charge b = +1 and
the other one b = -1. The pair is connected by a dislocation line (dashed line in Fig.
7.12) along which one has $x_{ij} = 1$. This already demonstrates that due to the disorder
the presence of dislocations decreases the ground state energy and a proliferation
of defects appears. Alternatively one can introduce a dislocation pair by fixing the
boundary to zero and one [15].
The defect pairs in the disordered SOS model are source and sink nodes of strength $+b$
and $-b$, respectively, for the network flow field $x_{ij}$, which otherwise fulfills $(\nabla \cdot x)_i = 0$,
i.e. we have to modify the mass balance constraint (7.9) as follows:
$$(\nabla \cdot x)_i = \begin{cases} 0 & \text{no dislocation at } i \\ \pm b & \text{dislocation at } i \end{cases} \qquad (7.26)$$
Thus the ground-state problem is to minimize the Hamiltonian (7.10) subject to the
mass balance constraint (7.26). In the following we concentrate on defect pairs with
$b = \pm 1$.
The defect energy $\Delta E$ is the difference of the minimal energy configuration with and
without dislocations for each disorder realization, i.e. $\Delta E = E_1 - E_0$. More precisely,
for the configuration with $N$ defect pairs of Burgers charge $b = \pm 1$ we introduce two
extra nodes $s$ and $t$ with source strength $n_s = +N$ and $n_t = -N$, respectively, and
connect them via external edges or bonds with particular sites of the lattice depending
on the degree of optimization: (a) with two sites separated by $L/2$ [Fig. 7.13(a)], (b)
the source node with one site $i$ and the sink node with the sites on a circle of radius $L/2$
around $i$ [Fig. 7.13(b)] and (c) both nodes with the whole lattice. Case (a) corresponds
to a fixed defect pair, (b) to a partially optimized pair along a circle, both separated
by a distance $L/2$, and (c) to a completely optimized pair with an arbitrary separation.
In all cases the energy costs for flow along these external edges are set to a positive
value in order to ensure the algorithm will find the optimal defect pair on the chosen
sites. These "costs" have no contribution to the ground-state energy. In the case of multiple
pairs we always use graph (c). Here the optimal number $N$ of defects in the system
is determined step by step, starting with one pair ($N = 1$) with a vortex core energy $2E_c$
and checking whether there is an energy gain or not. If yes, a further pair (with
$2E_c$) is added and the procedure is repeated until there is no energy gain from the
difference of the ground-state energy between two iterations.
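The stepwise determination of the number of defect pairs can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `ground_state_energy(N)` is a hypothetical callable standing in for the minimum-cost-flow solver (it is not defined in the text), and the total energy is assumed to be the flow energy plus $2E_c$ per pair.

```python
def optimal_pair_number(ground_state_energy, core_energy, max_pairs=1000):
    """Add defect pairs one at a time while the total energy still decreases.

    ground_state_energy(N): hypothetical stand-in for the min-cost-flow
    solver, returning the flow energy with N defect pairs.
    core_energy: the vortex core energy E_c (each pair costs 2 * E_c).
    """
    n = 0
    best = ground_state_energy(0)
    while n < max_pairs:
        trial = ground_state_energy(n + 1) + 2.0 * core_energy * (n + 1)
        if trial >= best:      # no energy gain: stop
            break
        n, best = n + 1, trial
    return n
```

The loop stops at the first pair that brings no gain, so it finds a local optimum in $N$; that this is also the global one is an assumption of the sketch.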
One can study the defect energy $\Delta E$ and its probability distribution $P(\Delta E)$ on an
$L \times L$ lattice with $L = 6, 12, 24, 48, 96, 192$ and $2 \cdot 10^3$ to $10^5$ samples for each size and
consider the three cases (a)-(c) (see earlier). With an increasing degree of optimization
a negative defect energy $\Delta E$ becomes more probable and its probability distribution
$P(\Delta E)$ differs more and more from the Gaussian fit, Fig. 7.14. The resulting disorder
averaged defect energy $[\Delta E]_{\rm dis}$ scales like
From the fact that the partially and completely optimized dislocation pairs have on
average a negative energy that increases in modulus for increasing distance $L$ it follows
that the ground state is unstable with respect to the formation of unbound dislocation
pairs. More details can be found in [16].
We choose $\mathbf{B}^{\rm ext} = B\,\mathbf{e}_z$, i.e. the external field points in the z-direction. The dependence
of $\Delta E(L)$ on $L$ provides the essential evidence about the stability of the ground state
with respect to thermal fluctuations. If $\Delta E(L)$ decreases with increasing length $L$, it
costs less energy to turn over larger domains, which indicates the absence
of a true ordered (glass) state at any $T \neq 0$. Usually one studies such excitations of
length scale $L$ by manipulating the boundary condition for the phase variables of
the original gauge glass Hamiltonian [17, 18]. One induces a so-called domain wall
of length scale $L$ into the system by changing the boundary condition of a particular
sample from periodic to anti-periodic (or vice versa) in one space direction and measures
the energy of such an excitation by comparing the energies of these two ground-state
configurations. This is the common procedure for a domain-wall renormalization group
(DWRG) analysis, as it was first introduced in connection with spin glasses (see Chap.
9), which, however, contains some technical complications [17] and some conceptual
ambiguities [18, 19].
Here we follow the basic idea of DWRG. We will, however, avoid the complications
and the ambiguities that appear when manipulating the boundary conditions (b.c.) and
try to induce the low-energy excitation in a different way, as was first done by one
of us in [20] for the zero-field case. First we will clarify what a low-energy excitation
of length scale $L$ is: in the model under consideration here it is a global vortex loop
encircling the 3d torus (i.e. the $L^3$ lattice with periodic b.c.) once (or several times)
with minimum energy cost. How can we induce the above-mentioned global vortex
loop, if not by manipulating the b.c.? Schematically the solution is the following
numerical procedure:
2. Determine the resulting global flux along, say, the x-axis, $f_x = \sum_i J_i^{0x}$.
3. Study a minimum-cost-flow problem in which the actual costs for increasing the
flow on any bond in the x-direction, $\Delta c_i^x = c_i(J_i^{0x} + 1) - c_i(J_i^{0x})$, are smoothly
modified, letting the cost of a topologically simple connected loop remain unchanged
and only affecting global loops.
4. Reduce the $\Delta c_i^x$ until the optimal flow configuration $\{J^1\}$ for this min-cost-flow
problem has the global flux $(f_x + 1)$, corresponding to the so-called elementary
low-energy excitation on the length scale $L$.
where $B$ is fixed, $[\ldots]_{\rm av}$ denotes the disorder average and $\theta$ is the stiffness exponent,
whose sign determines whether there is a finite-temperature phase transition or not,
as explained above. If $\theta < 0$, the transition to a true superconducting vortex state
appears only at $T = 0$ [10, 20], as shown in Fig. 7.15.
For any fixed value of $B$ the finite-size scaling relation (7.29) is confirmed and gives
$\theta = -0.95 \pm 0.04$, cf. Ref. [20], independent of the field strength $B$. Nevertheless
Fig. 7.16 shows that in one individual sample the excitation loops themselves change
their form dramatically with $B$. Only small parts of the loop seem to persist over
a significant range of the field strength, see for instance the vicinity of the plane
$z = 20$ in Fig. 7.16.
Bibliography
[1] See e.g. S.T. Chui and J.D. Weeks, Phys. Rev. B 14, 4978 (1976)
[2] Y.-C. Tsai and Y. Shapir, Phys. Rev. Lett. 69, 1773 (1992); Phys. Rev. E 40,
3546, 4445 (1994)
[3] J. Toner and D.P. Di Vincenzo, Phys. Rev. B 41, 632 (1990)
Figure 7.15: The domain-wall energy $[\Delta E]_{\rm av}$ in a log-log plot. The straight line
is a fit to $[\Delta E]_{\rm av} \sim L^{\theta}$ with $\theta = -0.95 \pm 0.03$. This implies a thermal exponent of
$\nu = 1.05 \pm 0.03$. The disorder average is over 500 samples for $L = 48$, 1500 samples
for $L = 32$, and for the smaller sizes several thousand samples have been used.
[5] C. Zeng, A.A. Middleton, and Y. Shapir, Phys. Rev. Lett. 77, 3204 (1996)
[7] M.S. Li, T. Nattermann, H. Rieger, and M. Schwartz, Phys. Rev. B 54, 16024
(1996)
[8] G. Blatter, M.V. Feigel'man, V.B. Geshkenbein, A.I. Larkin, and V.M. Vinokur,
Rev. Mod. Phys. 66, 1125 (1994)
[9] M.P.A. Fisher, T.A. Tokuyasu, and A.P. Young, Phys. Rev. Lett. 66, 2931 (1991)
[10] C. Wengel and A.P. Young, Phys. Rev. B 54, R6869 (1996)
[11] R. Ahuja, T. Magnanti, and J. Orlin, Network Flows, (Prentice Hall, New Jersey
1993)
[12] L. Balents and M. Kardar, Phys. Rev. B 49, 13030 (1994); T. Hwa and T. Nattermann,
Phys. Rev. B 51, 455 (1995)
[14] P.M. Chaikin and T.C. Lubensky, Principles of Condensed Matter Physics, (Cam-
bridge University Press, Cambridge 1997)
Figure 7.16: The minimum energy global excitation loop perpendicular to the external
field in the z-direction is shown for one particular sample ($L = 24$) and three
different field strengths $B$ (note the periodic b.c. in all space directions). a) (left)
$B \in [0.0065, 0.0069]$ is in a range where the defect energy $\Delta E$ varies linearly with
respect to the field. Note that the loop also has a winding number $n_z = 1$ in the
direction parallel to the external field. Hence $d\Delta E/dB = 2L$. b) (middle) The same
sample as in (a) with $B \in [0.0070, 0.0075]$. In this interval the defect energy is constant;
no loop along the direction of the applied field occurs. c) (right) The same
sample as in (a, b) with $B \in [0.0076, 0.0081]$. The system is very sensitive to the
variation of the applied field $\Delta B$. Even for a small change by $\Delta B = 0.0001$ the form of
the excitation loop changes drastically.
In this chapter, genetic algorithms (GA) are explained. For a detailed introduction,
see e.g. [1, 2, 3]. The basic idea is to mimic the evolution of a group of creatures of the
same species. Individuals which adapt better to the requirements imposed by their
environment have a higher probability of survival. Thus, they pass their genes more
frequently to subsequent generations than others. This means that the average fitness of
the population increases with time. This basic idea can be transferred to optimization
problems: instead of looking for skilled creatures, the aim is to find a minimum or a
maximum of an objective function. Rather than different individuals, different vectors
of arguments of the objective function are treated. The evolutionary process can easily
be adapted for algorithms creating better and better vectors. The scheme has already
been applied to various problems.
In this section we first give the basic framework of a GA. Then we present an example
in detail, which enables us to understand the underlying mechanisms better. Finally
two applications from the physics of the smallest and the largest particles are presented:
finding ground states in one-dimensional electronic quantum systems and determining
the parameters of interacting galaxies. In Chap. 9 another application is shown,
where genetic algorithms are applied among other techniques for the calculation of
spin-glass ground states.
of individuals, on average better and better creatures appear. Please note that this
scheme is oversimplified, because it neglects the influence of learning, society (which
is itself determined somehow by the genes as well) etc.
Population
This simple principle can be transferred to other optimization problems. One is not
necessarily interested in obtaining an optimum creature, but maybe one would like to
have a configuration of minimum energy (ground state) of a physical system, a motor
with low energy consumption or a scheme to organize a company in an efficient way.
Physical systems, motors or companies are not represented by a sequence of genes but
are given through a configuration of particles or a vector of parameters. These will be
denoted as individuals as well. The population (see Fig. 8.1) is again just a set of
different individuals.
Mutation
Figure 8.2: The effect of the mutation operation on a population. Individuals are
randomly changed. Here, the values of the vectors are turned from + to - or vice
versa with a given small probability. In the upper part the initial population is shown,
in the lower part the result after the mutation has been carried through. The values
which have been turned are highlighted.
8.1 The Basic Scheme
Crossover
Figure 8.3: The crossover operation. Offspring are created by assembling parts from
different individuals (parents). Here just one part from a +/- vector and another
part from a second vector are taken.
Now the different operations affecting the population will be considered. Mutations
of genes correspond to random changes of the individual, e.g. displacing a particle or
changing a parameter, see Fig. 8.2. The reproduction scheme, often called crossover,
can be transferred in many different ways to genetic algorithms. The general principle
is that one or several individuals (the parents) are taken, divided into small parts and
reassembled in different ways to create new individuals, called offspring or children,
see Fig. 8.3. For example a new particle configuration can be created by taking the
positions of some particles from one configuration and the positions of the other particles
from a second configuration.
The selection step can be performed in many ways as well. First of all one has to
evaluate the fitness of the individuals, i.e. to calculate the energy of the configurations
or to calculate the efficiency of the motor with given parameters. In general, better
individuals are kept, while worse individuals are thrown away, see Fig. 8.4. The details
of the implementation depend on the problem. Sometimes the whole population is
evaluated, and only the better half is kept. Or one could keep each individual with a
probability which depends on the fitness, i.e. bad ones have a nonzero probability of
being removed. Other selection schemes just compare the offspring with their parents
and replace them if the offspring are better.
Another detail, which has to be considered first when thinking about implementing a
GA, is the way the individuals are represented in a computer. In general, arbitrary
data structures are possible, but they should facilitate the genetic operations such as
mutation and crossover. In many cases a binary representation is chosen, i.e. each
individual is stored via a string of bits. This is the standard case where one speaks of
a "genetic algorithm". In other cases, where the data structures are more complicated,
sometimes the denotation "evolutionary program" is used. For simplicity, we just keep
8 Genetic Algorithms
Selection
Figure 8.4: The effect of the selection operation on the population. The fitness F
is evaluated, here for all four individuals. Individuals with a low fitness have a small
probability of survival. In this case, individual two is removed from the population.
"configuration"
local
optimization
"configuration"
Figure 8.5: Local optimization. Here the population is shown in an energy landscape.
Large energy means small fitness. This local optimization moves individuals to the
next local optimum.
a ground state, one could move particles in a way that the energy is decreased, using
the forces calculated from a given configuration. Whether and how local optimizations
can be applied depends strongly on the current problem.
We finish this section by summarizing a possible general structure of a GA. Please
note that many different ways of implementing a genetic algorithm exist, in particular
the order of the operations crossover, mutation, local optimization and selection may
vary. At the beginning the population is usually initialized randomly; $M$ denotes its
size and $n_R$ the number of iterations.
algorithm genetic
begin
    Initialize population x_1, ..., x_M;
    for t := 1 to n_R do
    begin
        select parents p_1, ..., p_k;
        create offspring c_1, ..., c_l via crossover;
        perform mutations;
        optionally perform local optimization;
        calculate fitness values;
        select individuals staying alive;
    end
end
Genetic algorithms are very general optimization schemes. It is possible to apply them
to various problems appearing in science and everyday life. Furthermore, the programs
are relatively easy to implement; no special mathematical knowledge is required. As a
result an enormous number of applications have appeared during the last two decades.
There are several specialized journals and conferences dedicated to this subject. When
you search in the database INSPEC for articles in scientific journals which contain the
term "genetic algorithm" in the abstract, you will obtain more than 15000 references.
On the other hand, applications in physics are less common; only several hundred
publications can be found in INSPEC. Maybe this work will encourage more physicists
to employ these methods. Recent applications include lattice gauge theory [5], analysis
of X-ray data [6], study of the structures of clusters [7, 8, 9], optimization of lasers
[10], laser pulses [11] and optical fibers [12], examining nuclear reactions [13], data
assimilation in meteorology [14] and reconstructing geological structures from seismic
measurements [15]. An application to a problem of statistical physics in conjunction
with other methods is presented in Chap. 9. Below, sample applications from quantum
physics and astronomy are covered.
However, one should mention that GAs have two major drawbacks. Firstly the method
is not exact. Hence, if you are interested in finding the exact optimum along with
a proof that the global optimum has really been obtained, then you should use other
techniques. But if you want to get only "very good" solutions, genetic algorithms
could be suitable. It is also possible to find the global optimum with GAs, but usually
you have to spend a huge numerical effort when the problem is NP-hard. Secondly,
although the method has a general applicability, the actual implementation depends on
the problem. Additionally, you have to tune parameters like the size of the population
or the mutation rate to obtain a good performance. Very often the effort is worthwhile.
The implementation is usually not very difficult. Some people might find it helpful to
use the package Genetic and Evolutionary Algorithm Toolbox (GEATbx) [16], which
is written for use in conjunction with the program Matlab [17]. In the next section
a simple example is presented, which can be implemented very easily without any
additional libraries.
in the interval $[0, 1]$. The function is plotted in Fig. 8.6. The goal is to find, via a
genetic algorithm, the minimum of $f(x)$, which is obviously located at $x_0 = 0.5$ with
$f(x_0) = 0$.
Here the individuals are simply different real values $x_i \in [0, 1]$. In order to apply genetic
operations like mutation and crossover, we have to find a suitable representation. We
choose the method used to store numbers in a computer, the binary representation: it
is a string $x_i^1 \ldots x_i^P$ of zeros and ones, where $P$ denotes the length of the strings, i.e.
the precision. Since all values belong to the interval $[0, 1]$ we use:
$$x_i = \sum_{q=1}^{P} 2^{-q} x_i^q$$
procedure bit-sequence(x, P)
begin
    f := 0.5;
    for q := 1 to P do
    begin
        if x >= f then
        begin
            x^q := 1; x := x - f;
        end
        else
            x^q := 0;
        f := f/2;
    end
    return(x^1, ..., x^P);
end
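As a cross-check of the procedure above, here is a minimal Python sketch of the encoding together with the inverse conversion from a bit string back to a number. The function names and the use of Python lists are illustrative choices, not part of the original text.

```python
def bit_sequence(x, P):
    """Return the first P binary digits of x in [0, 1) as a list of 0/1."""
    bits = []
    f = 0.5
    for _ in range(P):
        if x >= f:
            bits.append(1)
            x -= f
        else:
            bits.append(0)
        f /= 2
    return bits


def bits_to_value(bits):
    """Inverse conversion: the q-th bit (q = 1, 2, ...) has weight 2**-q."""
    return sum(b * 2.0 ** -(q + 1) for q, b in enumerate(bits))
```

For instance, `bit_sequence(0.625, 4)` yields `[1, 0, 1, 0]`, since 0.625 = 1/2 + 1/8.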
Next, we present the realization of the genetic operations. For the mutation with
rate $p_m$, each bit is reversed with probability $p_m$ (the random numbers drawn in this
algorithm are assumed to be equally distributed in $[0, 1]$):
procedure mutation({x^q})
begin
    for q := 1 to P do
    begin
        r := random number in [0, 1];
        if r < p_m then
            x^q := 1 - x^q;
    end
    return(x^1, ..., x^P);
end
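The same operation in Python might look as follows. This is a sketch only: the optional `rng` argument is an addition made here so that a seeded generator can be passed in for reproducible runs.

```python
import random


def mutation(bits, p_m, rng=random):
    """Flip each bit independently with probability p_m; returns a new list."""
    return [1 - b if rng.random() < p_m else b for b in bits]
```

Because `random()` returns values in [0, 1), a rate of `p_m = 0` leaves the string unchanged and `p_m = 1` flips every bit.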
Example: Mutation
We will consider for the example the following bit string (P = 20)
For a mutation rate of $p_m = 0.1$, by chance 3 bits (on average two bits) could
be reversed, e.g. bit 5, bit 10 and bit 17, resulting in
The crossover is slightly more complicated. It creates two children $c_1, c_2$ from two
parents $x_i, x_j$ in the following way. A random crossover point $s \in \{1, 2, \ldots, P\}$ is
chosen. The first child is assigned the bits $x_i^1$ to $x_i^s$ from the first parent and the bits
$x_j^{s+1}$ to $x_j^P$ from the second parent, while the second child takes the remaining bits
$x_j^1, \ldots, x_j^s, x_i^{s+1}, \ldots, x_i^P$:
procedure crossover(x_i, x_j)
begin
    s := random integer number in {1, 2, ..., P};
    for q := 1 to s do
        c_1^q := x_i^q; c_2^q := x_j^q;
    for q := s + 1 to P do
        c_1^q := x_j^q; c_2^q := x_i^q;
    return(c_1, c_2);
end
8.2 Finding the Minimum of a Function 167
Example: Crossover
We assume that the crossover point is $s = 7$ (denoted by a vertical bar |).
For the two parents (P = 20)
In this problem, the selection is done in the following way: each child replaces a parent
if it has a better fitness, i.e. the evaluation of the function $f$ results in a lower value.
For simplicity we just compare the first child with parent $x_i$ and the second with $x_j$.
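The crossover and the selection rule just described can be sketched in Python as follows. The function names, the optional fixed crossover point `s` and the `fitness` callable are illustrative assumptions; lower fitness values are treated as better, as in the minimization problem of this section.

```python
import random


def crossover(parent_i, parent_j, s=None, rng=random):
    """One-point crossover; s is the number of leading bits kept from each parent."""
    P = len(parent_i)
    if s is None:
        s = rng.randint(1, P)          # random crossover point in {1, ..., P}
    child_1 = parent_i[:s] + parent_j[s:]
    child_2 = parent_j[:s] + parent_i[s:]
    return child_1, child_2


def select(parent, child, fitness):
    """Keep the child only if its fitness value is strictly lower than the parent's."""
    return child if fitness(child) < fitness(parent) else parent
```

Note that for `s = P` the children are plain copies of the parents.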
The complete genetic algorithm is organized as follows. Initially $M$ strings of length
$P$ are created randomly; zeros and ones appear with the same probability. The main
loop is performed $n_R$ times. In the main loop, two parents $x_i, x_j$ are chosen randomly,
where each individual has the same probability, and two children $c_1, c_2$ are created via the
crossover. Then the mutation is applied to the children. Finally, the selection is
performed. After the main loop is completed, the individual $x$ having the best fitness
$f(x)$ is chosen as the result of the algorithm. The following representation summarizes
the algorithm:
Please note that before the evaluation of the fitness $f(x_i)$, the value of the bit string
$x_i^1, \ldots, x_i^P$ has to be converted into the number $x_i$.
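Putting the pieces together, a compact Python sketch of the whole algorithm could read as follows. The objective `example_f` is a hypothetical stand-in with several local minima and its global minimum at $x = 0.5$; it is not the function $f$ used in this section. The parameter names follow the text ($M$, $P$, $p_m$, $n_R$); the `seed` argument is an addition for reproducibility.

```python
import math
import random


def example_f(x):
    """Hypothetical test objective (NOT the book's f): global minimum f(0.5) = 0."""
    return 4.0 * (x - 0.5) ** 2 + 0.1 * (1.0 - math.cos(40.0 * math.pi * (x - 0.5)))


def decode(bits):
    """Convert a bit string into the number x = sum_q 2^-q x^q."""
    return sum(b * 2.0 ** -(q + 1) for q, b in enumerate(bits))


def genetic_minimize(f, M=50, P=20, p_m=0.1, n_R=10000, seed=0):
    rng = random.Random(seed)
    # random initial population of M bit strings of length P
    pop = [[rng.randint(0, 1) for _ in range(P)] for _ in range(M)]
    for _ in range(n_R):
        i, j = rng.randrange(M), rng.randrange(M)      # choose two parents
        s = rng.randint(1, P)                          # crossover point
        children = (pop[i][:s] + pop[j][s:], pop[j][:s] + pop[i][s:])
        for parent_index, child in zip((i, j), children):
            # mutation: flip each bit with probability p_m
            child = [1 - b if rng.random() < p_m else b for b in child]
            # selection: a child replaces its parent only if it is better
            if f(decode(child)) < f(decode(pop[parent_index])):
                pop[parent_index] = child
    best = min(pop, key=lambda bits: f(decode(bits)))
    return decode(best)
```

A call such as `genetic_minimize(example_f, n_R=2000)` returns the best value of $x$ found. Since a parent is only ever replaced by a better child, the best fitness in the population can never get worse during the run.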
Now we study the algorithm with the parameters $M = 50$ and $p_m = 0.1$. We recommend
that readers write the program themselves. It is very short and the implementation
teaches a lot about genetic algorithms. The choice of $p_m = 0.1$ for the mutation
rate is very typical for many optimization problems. Much smaller mutation
rates do not change the individuals very much, so new areas in configuration space
are explored only very slowly. On the other hand, if $p_m$ is too large, too much genetic
information is destroyed by the mutation.
The optimum size $M$ of the population usually has to be determined by tests. It
depends on whether one is interested in really obtaining the global optimum. As a
rule of thumb, the larger the size of the population is, the better the results are. On
the other hand, one does not want to spend much computer time on this, so one can
decrease the population size if the optimum is rather easy to find.
Figure 8.7: Evolution of the current minimum ("best") and average fitness with time $t$, here
$M = 50$, $p_m = 0.1$, $n_R = 10000$.
In Fig. 8.7 the evolution of the fitness of the best individual and the average fitness are
shown as a function of the step number. Here $n_R = 10000$ iterations have been performed.
The global minimum has been found after 1450 steps. Since in each iteration two
members of the population are considered, this means on average each member has
been treated $2 \times 1450/M = 58$ times. Please note that if you are only interested in a very
good value, not in the global optimum, you could stop the program after, say, only
100 iterations. At time step 1450 only one individual has found the global optimum at
$x_0 = 0.5$; the average value still decreases.
Figure 8.8: Evolution of the population with time $t$; here $M = 50000$, $p_m = 0.1$ and
$t = 0$, $1 \times M$, $10 \times M$.
To gain more insight into the driving mechanism of a GA, we will study the development
of the population with time. Since we are interested in average values, we will
study a much larger population $M = 50000$ to reduce the statistical fluctuations. At
the beginning, the members of the population are independently distributed in the
interval $[0, 1]$, see the top of Fig. 8.8.
The situation after $t = 1 \times M$ steps is shown in the middle part of Fig. 8.8. Now it
is more likely to find individuals near the local minima. The highest density of the
population is near the global minimum. After $t = 10 \times M$ steps, most members of
the population have gathered near the global optimum. Please note the change of
the scale in the lower part of Fig. 8.8. Finally, after $t = 100 \times M$, the population is
completely centered at the minimum (not shown).
As mentioned before, the efficiency of a genetic algorithm may be increased by applying
local optimizations. Which method is suitable depends highly on the nature of the
problem. Here, for the minimization of a function of one variable, we use a very
simple technique: an individual $x$ is just moved to the next local minimum. This can
be achieved by the following method. If the function $f$ has a positive slope at $x$, then $x$
is decreased by a value $|\delta|$, otherwise $x$ is increased. The sign of $\delta$ indicates the direction
of the change. This process is iterated. Each time the slope of $f$ changes compared
with the last step, the size of $|\delta|$ is reduced and the sign reversed. The iteration stops
if $|\delta|$ becomes of the order of $2^{-P}$. The procedure reads as follows:
procedure local-optimization(x, P)
begin
    delta := 0.01;
    while |delta| > 2^(-P) do
    begin
        if delta * f'(x) > 0 then
            delta := -0.5 * delta;
        x := x + delta;
    end
end
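A direct Python transcription of this procedure is given below as a sketch. The derivative is passed in as a callable `f_prime`, and the function returns the final $x$; both are choices made here, since the pseudocode leaves them implicit.

```python
def local_optimization(x, P, f_prime, delta=0.01):
    """Move x to a nearby local minimum of f, given its derivative f_prime."""
    while abs(delta) > 2.0 ** -P:
        if delta * f_prime(x) > 0:     # step would go uphill:
            delta = -0.5 * delta       # halve the step and reverse it
        x = x + delta
    return x
```

For example, with $f(x) = (x - 0.3)^2$ one may call `local_optimization(0.25, 20, lambda x: 2 * (x - 0.3))`, which ends close to 0.3. The loop only terminates if a local minimum is actually bracketed; for a function that keeps decreasing in one direction it would run forever, so a practical version should also bound $x$ to the interval $[0, 1]$.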
The algorithm iteratively approaches the closest local minimum. This simple local
optimization is suitable, because $f$ has a smooth behavior. For arbitrary functions
more sophisticated methods have to be applied to search for local minima, see e.g.
[18]. Please note that due to the limited numerical accuracy, it is possible that the
exact numerical local optimum cannot be found.
The above algorithm is a special case of the steepest descent method, which is applied
for functions of several variables. The basic idea is to calculate the gradient of the
function at the current value of the arguments and alter the arguments in the direction
of the negative gradient. Similar to the simple method presented above, this technique is
not very fast. More efficient for minimizing functions of real variables are conjugate
gradient methods [18].
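For illustration, a minimal steepest-descent sketch in Python with a fixed step length is shown below. The fixed step, the stopping tolerance and the iteration limit are assumptions of this sketch; practical implementations usually determine the step by a line search.

```python
def steepest_descent(grad, x0, step=0.1, tol=1e-8, max_iter=10000):
    """Follow the negative gradient until its norm drops below tol."""
    x = list(x0)
    for _ in range(max_iter):
        g = grad(x)
        if sum(gi * gi for gi in g) ** 0.5 < tol:
            break
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x
```

For $f(x, y) = (x - 1)^2 + 2(y + 2)^2$ the gradient is $(2(x - 1), 4(y + 2))$, and `steepest_descent(lambda v: [2 * (v[0] - 1), 4 * (v[1] + 2)], [0.0, 0.0])` converges to a point close to $(1, -2)$.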
The GA is altered in such a way that the local optimization is applied to each offspring
after the mutation. This drastically decreases the number of steps which are necessary
to find the global optimum. The minimum and average fitness as a function of step
number of the extended algorithm (run with the same parameters as before) are shown
in Fig. 8.9. Please note that the decrease of the number of steps needed to obtain the
minimum does not mean that the minimum is found quicker in terms of CPU time.
In many cases the numerical demand of the local optimization is very large. Also one
has to take into account the additional effort for the implementation. Whether it is
worth including a local optimization or not depends heavily on the problem and must
be decided each time again.
Figure 8.9: Evolution of the current minimum and average fitness with time $t$ when
additionally a local optimization is applied, here $M = 50$, $p_m = 0.1$, $n_R = 1000$.
The impact on the population can be observed in Fig. 8.10. The distribution of the
individuals after $t = 1 \times M$ steps is shown. Most individuals are located near the local
minima. Only the members of the population which have not been considered so far
remain at their original positions.
The minimization problem presented in this section serves just as an example to il-
lustrate the basic concepts of genetic algorithms. Two interesting applications from
physics are presented in the following sections, while another special-purpose genetic
algorithm is presented in Chap. 9.
[Figure 8.10: distribution of the population at $t = 1 \times M$; horizontal axis: value]
of the energy, where $\psi^*$ denotes the complex conjugate of $\psi$. To solve the ground-state
problem via a genetic algorithm, one considers a population $\{\psi_j\}$ of $M$ different
wave functions. For the problem presented here, we assume that the electrons are
confined by potential walls of infinite height to the interval $[a, b]$. So the population
contains wave functions obeying the boundary conditions $\psi(a) = \psi(b) = 0$. Since we
are interested in time-independent problems, we can neglect the phase of the wave
function, i.e. concentrate on real-valued functions.
Numerically, the functions are represented as a list of $K + 1$ numbers $\psi_j(a + k(b - a)/K)$
with $k = 0, 1, \ldots, K$. The fitness of individual $\psi_j$ is given by $E[\psi_j]$. For the evaluation
of the fitness, a cubic spline [18] can be used to interpolate the wave function. Then
the integral (8.4) can be calculated numerically.
To specify the details of the GA, we have to state the choice of the initial population,
the crossover mechanism and the way mutations are performed. No local optimization
is applied here.
The initial population $\{\psi_j(x)\}$ consists of Gaussian-like functions of the form
For each individual the values for the center $x_j \in [a, b]$ and the width $\sigma_j \in (0, b - a]$
are chosen randomly. The constant $A_j$ is calculated from the normalization condition
$\int |\psi_j(x)|^2 dx = 1$. Please note that the boundary conditions are fulfilled by definition.
For the mutation operation a random mutation function $\psi^m(x)$ is added and the
result normalized again:
$$\psi_j^{\rm new}(x) = A\,[\psi_j(x) + \psi^m(x)] . \qquad (8.6)$$
The simplest approach is just to draw the mutation function $\psi^m$ from the same
ensemble as the initial population. The only difference is that $\psi^m$ is not normalized.
This allows for slight changes of the wave function. In Fig. 8.11 a sample mutation
operation is presented.
The crossover operation combines two randomly chosen parent functions $\psi_1(x)$, $\psi_2(x)$
and creates two offspring $\phi_1(x)$, $\phi_2(x)$. From the four wave functions $\psi_1(x)$, $\psi_2(x)$,
$\phi_1(x)$, $\phi_2(x)$ the best two are taken over to the next generation. Similar to the crossover
presented in the last section, the resulting wave functions consist of two parts, where
the left part comes from one parent and the right part from another. Since wave
functions have to be smooth functions, the offspring interpolate smoothly between
both parents. For that purpose a smooth step function $S_t(x)$ is used:
The constants $B_1, B_2$ are chosen in such a way that $\phi_1, \phi_2$ are again normalized. Here
$S_t(x) = \frac{1}{2}[1 + \tanh((x - x_0)/k^2)]$ is taken, where $x_0 \in (a, b)$ is chosen randomly and
$k$ is a parameter which determines the sharpness of the step function. In Fig. 8.12 an
example of the crossover operation is shown. The two parent functions belong to the
initial population. The step function has $x_0 = 0.5$, $k^2 = 0.1$.
Figure 8.11: The mutation operation for the quantum ground-state calculation.
Top: The original function $\psi_j$ and the random mutation function $\psi^m$. Bottom: The
resulting wave function.
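The smooth crossover can be sketched in Python on the discretized wave functions. The mixing rule used here, $\phi_1 = B_1[\psi_1 S_t + \psi_2 (1 - S_t)]$ and $\phi_2 = B_2[\psi_1 (1 - S_t) + \psi_2 S_t]$, is an assumption chosen to match the verbal description; the function names and the trapezoidal rule for the normalization integral are likewise illustrative choices.

```python
import math


def smooth_step(x, x0, k2):
    """S_t(x) = (1/2) [1 + tanh((x - x0) / k^2)]; k2 stands for k^2."""
    return 0.5 * (1.0 + math.tanh((x - x0) / k2))


def normalize(psi, dx):
    """Rescale psi so that the trapezoidal integral of psi^2 equals 1."""
    norm2 = sum(0.5 * (p * p + q * q) for p, q in zip(psi[:-1], psi[1:])) * dx
    return [p / math.sqrt(norm2) for p in psi]


def crossover_wave(psi1, psi2, a, b, x0, k2):
    """Blend two parents given on K + 1 equidistant grid points in [a, b]."""
    K = len(psi1) - 1
    dx = (b - a) / K
    xs = [a + k * dx for k in range(K + 1)]
    st = [smooth_step(x, x0, k2) for x in xs]
    # assumed mixing rule: each child takes one parent on each side of x0
    phi1 = [p1 * s + p2 * (1.0 - s) for p1, p2, s in zip(psi1, psi2, st)]
    phi2 = [p1 * (1.0 - s) + p2 * s for p1, p2, s in zip(psi1, psi2, st)]
    return normalize(phi1, dx), normalize(phi2, dx)
```

Since both parents vanish at $a$ and $b$, the children fulfill the boundary conditions automatically.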
For this specific genetic algorithm, at each iteration step M times either a mutation
8.3 Ground States of One-dimensional Quantum Systems 175
Such potentials occur for example when clusters are studied, which are created by
intense laser pulses. To speed up the convergence, the initial wave functions $\psi_j$ should
represent the symmetry of the potential. Thus, superpositions of five random functions
of the form (8.5) are taken. The resulting ground-state probability distribution after
20 iterations is shown in Fig. 8.14. Please note that the symmetry of the potential is
reflected by the wave function.
With an extension of the genetic algorithm it is possible to calculate excited states
as well. More details can be found in [19]. Recently the method has been applied to
two-dimensional quantum systems as well [20]. Now we leave the world of quantum
physics and turn to the physics of large objects, which provides another example where
GAs have been applied successfully.
Figure 8.12: The crossover operation for the quantum ground-state calculation.
Top: two parent functions. Middle: the step function ($k^2 = 0.1$, $x_0 = 0.5$) regulating
the crossover. Bottom: resulting children functions.
Figure 8.13: Calculated spatial distribution of electron density $|\psi(x)|^2$ (solid line)
for the 1d harmonic potential (dotted line). The inset figure shows the evolution of the
fitness during the GA iterations. Reprinted from [19] with permission from Elsevier
Science.
Figure 8.15: Two colliding galaxies. The figure is generated from a simulation.
the z-axis points towards the observer. The movement of the centers of the masses
of the galaxies can be described by the distances $\Delta x, \Delta y, \Delta z$, the relative velocities
$\Delta v_x, \Delta v_y, \Delta v_z$, the masses $m_1, m_2$ and the spins $s_1, s_2$ (clockwise or counterclockwise).
Using astronomical observations it is possible to measure the separation $\Delta x, \Delta y$ in the
plane of the sky and the relative velocity $\Delta v_z$ along the line of sight [22]. The distance
$\Delta z$, the velocities $\Delta v_x, \Delta v_y$, the masses $m_1, m_2$ and the spins $s_1, s_2$ are not known. In
the following it is shown how these parameters can be estimated by applying a genetic
algorithm. For that purpose, the individuals of the underlying population are chosen
to be vectors of values $(\Delta z, \Delta v_x, \Delta v_y, m_1, m_2, s_1, s_2)$.
First of all, since the distance $\Delta z$ cannot be measured, how do we know that the
galaxies are interacting at all? Since one can observe only the projection of the scenery
onto the x-y plane, the galaxies might be far away from each other. The basic idea to answer this question
is that the evolution of a single galaxy is well studied. Its distribution of emitted light
obeys an exponential falloff. Usually it is assumed that the amount of emitted light is
proportional to the mass density, thus one can use an exponential density distribution.
Only the length and mass scales may vary from galaxy to galaxy. For the case when two
galaxies are far away from each other, both mass distributions obey this standard
picture. On the other hand, if two galaxies interact with each other, the standard
distribution is perturbed. For example, in Fig. 8.15 a bridge of stars connecting both
galaxies can be observed. The main idea of the genetic algorithm for determining the
unknown parameters is to take advantage of this type of perturbation for the analysis.
We start the description of the GA with the encoding scheme for the individuals.
Secondly, it is shown how the fitness of the individuals is calculated, i.e. how the
collision which results from the parameters is compared with the measurement data.
Next the mutation and crossover operations are given. Finally a result from a sample
simulation is presented.
Figure 8.16: The encoding of the parameters $(m_1, m_2, \Delta z, \Delta v_x, \Delta v_y, s_1, s_2)$ via 25
digits $g_i \in \{0, 1, \ldots, 9\}$. Four digits are used for each mass $m_1$, $m_2$, while the values
$\Delta z$, $\Delta v_x$, $\Delta v_y$ take 5 digits (one for the sign); the spins $s_1, s_2 = \pm 1$ are stored in the
last two digits. An odd number represents the sign +, otherwise it is a negative sign.
of time is 1.05 Myr = $3.3143 \times 10^{13}$ s, the unit of the mass is $2 \times 10^{11}$ times the mass
of the sun ($3.9782 \times 10^{41}$ kg). The unit of velocity is then 931 km/s. Owing to the
rescaling, all masses are taken in the interval between 0 and 100, $\Delta z$ between $-1000$ and $+1000$
and $\Delta v_x$, $\Delta v_y$ between $-10$ and $10$. For encoding the masses 4 digits are used, while
the distance $\Delta z$ and the velocities take 5 digits. Thus, the first mass is obtained from
the vector $g$ through $m_1 = g_1 \times 10^{1} + g_2 \times 10^{0} + g_3 \times 10^{-1} + g_4 \times 10^{-2}$. The distances
and velocities are encoded in a similar way using $4 + 1$ digits. One digit is reserved to
store the sign; it is negative if the corresponding value is odd and positive otherwise.
The spins $s_1, s_2 = \pm 1$ of the galaxies are stored in the last two digits $g_{24}$, $g_{25}$ in the
same way.
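The decoding of the 25 digits can be sketched as follows. This is only an illustration under stated assumptions: the function name `decode_genome` is hypothetical, the sign digit is assumed to come first within each five-digit group, the scaling of $\Delta z$ and of the velocities is inferred from the quoted value ranges, and the sign rule follows the body text (odd digit means negative).

```python
def decode_genome(g):
    """Decode 25 decimal digits into (m1, m2, dz, dvx, dvy, s1, s2)."""
    if len(g) != 25 or any(d not in range(10) for d in g):
        raise ValueError("expected 25 digits in 0..9")

    def magnitude(digits, top_exponent):
        # digits[0] * 10**top_exponent + digits[1] * 10**(top_exponent - 1) + ...
        return sum(d * 10.0 ** (top_exponent - k) for k, d in enumerate(digits))

    def sign(digit):
        # Assumed convention (body text): odd digit -> negative, even -> positive.
        return -1 if digit % 2 == 1 else 1

    m1 = magnitude(g[0:4], 1)                    # masses: 0 .. 99.99
    m2 = magnitude(g[4:8], 1)
    dz = sign(g[8]) * magnitude(g[9:13], 2)      # assumed scaling: |dz| < 1000
    dvx = sign(g[13]) * magnitude(g[14:18], 0)   # assumed scaling: |dv| < 10
    dvy = sign(g[18]) * magnitude(g[19:23], 0)
    s1, s2 = sign(g[23]), sign(g[24])            # spins +-1 from the last two digits
    return m1, m2, dz, dvx, dvy, s1, s2


# Example genome encoding m1 = m2 = 1, dz = 3.0, dvx = -0.672, dvy = 0.839, s1 = s2 = +1
example = [0, 1, 0, 0,  0, 1, 0, 0,  0, 0, 0, 3, 0,
           1, 0, 6, 7, 2,  0, 0, 8, 3, 9,  0, 0]
print(decode_genome(example))
```

Storing plain decimal digits keeps crossover and mutation trivial: any digit string of length 25 decodes to a valid parameter set.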
The part of the GA which consumes the most computer time is the evaluation of
the fitness function, because for each set of parameters $g$ a complete many-particle
simulation has to be carried through. This works as follows. First the centers of mass of
the two galaxies are calculated. With the set of known (i.e. fixed) parameters and with
given values for the parameters in $g$, the development of the system consisting of both
centers of mass is completely determined. Hence, it is possible to integrate the
equations of motion backwards in time until a state is obtained where the two galaxies
are far away from each other. Next, the two centers of mass are replaced by two sets
of stars with a rescaled standard mass distribution. The numbers of particles in the
galaxies are denoted by $N_{p,1}$ and $N_{p,2}$, respectively. For the examples presented here,
$N_{p,1} = N_{p,2} = 1000$ were chosen. For simplicity, it is assumed that the distribution of
the stars is rotationally symmetric. Then the system is propagated forwards in time.
To speed up calculations, the simulations can be performed non-self-gravitatingly, i.e.
inter-particle forces are neglected. In this case only the gravitational forces between
the particles and the centers of mass are included. For an improved simulation the
self-gravitation and even gas dynamics and dark matter can be added, but this is
beyond the scope of a first test of the method. Another enhancement is to drop the
assumption of rotational symmetry and to allow galaxies to be oriented differently in
space relative to each other, see [21]. Nevertheless, when the two galaxies approach
each other, they begin to interact, i.e. the motion of the stars of each system is
affected by the other galaxy and the morphology of the galaxies changes. The system
is propagated until the initial time 0 has been reached.
Now the distribution of masses obtained by the simulation can be compared with the
given distribution from the two galaxies under investigation. This comparison works
as follows. A grid of size $n_g \times n_g$ is superimposed on the data, see Fig. 8.17. The grid
corresponds to a grid of pixels which is the result of an observation with a telescope,
digitized and stored on a computer¹. For each grid cell $(i, j)$ the total mass $m_{i,j}$ in
the cell is evaluated; each particle from the first galaxy contributes with $m_1/N_{p,1}$ and
each particle from the second one with $m_2/N_{p,2}$. When comparing with observations,
the amount of light corresponding to each mass $m_{i,j}$ has to be calculated. Here the
method is evaluated for testing purposes. Thus, only artificial observations are taken,
i.e. they are generated via a simulation of particles in the same way as explained. This
¹ Please note that galaxies usually consist of several billion stars, so it is not possible to identify
individual stars via a telescope.
8.4 Orbital Parameters of Interacting Galaxies
Figure 8.17: Computation of the data for the comparison. A grid is superimposed
on the image of interacting galaxies. For each grid cell the corresponding mass is
calculated. For clarity, only a coarse grid is shown here. Reprinted from [21] with
permission from Springer Science.
allows the final result of the best set $g$ obtained from the GA to be compared with
the values which have been used to set up the system. For the "observed" system, the
masses $m^{\mathrm{obs}}_{i,j}$ can be calculated in the same way as the values of $m_{i,j}$. The deviation
$\delta$ between the result of the simulation and the observational data is defined here as
The sum runs over all grid cells. The contribution $m_c$ in the denominator prevents
a divergence in case the mass observed in a grid cell is zero. In Ref. [21] $m_c = 1/(N_{p,1} + N_{p,2})$
has been chosen, so particles in a region where no particle is supposed
to be have the largest impact on the value of $\delta$. Finally, the fitness $F$ of the individual
$g$ is taken as
Now the structure of the genetic algorithm will be described. Similar to the first
example, the initial population of $M$ individuals is chosen completely at random. Each
generation is treated in the following way. First, the fitness values for all individuals
are calculated, as explained above. Then the individuals are ranked according to their
fitness values: the lowest fitness comes first, the highest fitness last. The best individual
is always taken over twice to the next generation. The other $M - 2$ members of the
next generation are obtained by crossover and mutation.
For each crossover two parents are selected randomly from the population. Here the
linear fitness ranking is applied. This means each parent is chosen with a probability
which is proportional to its position in the ranking. Thus, individuals with a low
fitness have a lower probability of being selected. This can be achieved by drawing
a natural random number $r$ with $0 \le r < M(M+1)/2$. The individual which has
position $i$ in the ranking is chosen if $(i-1)i/2 \le r < i(i+1)/2$. The crossover is carried
out as explained in Sec. 8.2: a random crossover point $s \in \{1, 2, \ldots, 25\}$ is chosen. The
first child consists of the left part up to digit $g_s$ from the first parent and the right
part from the second parent. The second child takes the remaining digits.
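The linear ranking rule and the one-point crossover can be sketched as below. The function names and the `rng` parameter are illustrative assumptions, and individuals are assumed to be Python lists of digits.

```python
import random


def select_rank(M, rng=random):
    """Return a ranking position i in 1..M with probability proportional to i."""
    r = rng.randrange(M * (M + 1) // 2)    # integer with 0 <= r < M(M+1)/2
    i = 1
    while i * (i + 1) // 2 <= r:           # find i with (i-1)i/2 <= r < i(i+1)/2
        i += 1
    return i


def crossover(parent_a, parent_b, rng=random):
    """One-point crossover at a random digit position s in {1, ..., len}."""
    s = rng.randrange(1, len(parent_a) + 1)
    child_1 = parent_a[:s] + parent_b[s:]  # left part of a, right part of b
    child_2 = parent_b[:s] + parent_a[s:]  # the remaining digits
    return child_1, child_2
```

Position $i$ owns exactly $i$ of the $M(M+1)/2$ possible values of $r$, which is what makes the selection probability proportional to the rank. For $s = 25$ the children are copies of the parents.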
Finally, mutations are applied for all new individuals. With probability $p_m$ a digit is
set to a new randomly chosen value from $\{0, 1, \ldots, 9\}$. The result is a new generation
of $M$ individuals. The whole process is repeated for $n_G$ generations and after the last
iteration the best individual is picked.
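One full generation, as described above, might look like the following sketch. It is not the implementation of Ref. [21]: the name `next_generation` is hypothetical, rank selection is done here with `random.choices` and rank weights instead of the explicit number $r$, and it is assumed that the two copies of the best individual are not mutated.

```python
import random


def next_generation(population, fitness, p_m, rng=random):
    """Elitism (best kept twice), rank selection, one-point crossover, mutation."""
    M = len(population)
    ranked = sorted(population, key=fitness)         # lowest fitness first, best last
    ranks = list(range(1, M + 1))                    # selection weight = ranking position
    new_pop = [list(ranked[-1]), list(ranked[-1])]   # best individual taken over twice
    while len(new_pop) < M:
        a, b = rng.choices(ranked, weights=ranks, k=2)
        s = rng.randrange(1, len(a) + 1)             # crossover point
        for child in (a[:s] + b[s:], b[:s] + a[s:]):
            # each digit is replaced by a random digit with probability p_m
            child = [rng.randrange(10) if rng.random() < p_m else d for d in child]
            if len(new_pop) < M:
                new_pop.append(child)
    return new_pop
```

In a real run the fitness values would be computed once per generation and cached, since each evaluation requires a complete many-particle simulation.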
Figure 8.18: Comparison of the simulation with the best parameters with the original
system. The two pictures are not identical, but very similar.
To test the genetic algorithm, an artificial pair of colliding galaxies was created with
a given set of parameters ($m_1 = 1.0$, $m_2 = 1.0$, $\Delta z = 3.0$, $\Delta v_x = -0.672$, $\Delta v_y = 0.839$
and $s_1 = s_2 = 1$); the measurable parameters are $\Delta x = 3.5$, $\Delta y = 8.0$, and $\Delta v_z = 0.44$,
see Fig. 8.15. Then the GA was run with the aim of finding the parameters again. A
population size $M = 500$ was chosen, and the program ran for $n_G = 100$ generations
with $p_m = 0.003$. To decrease the numerical effort, the values of $m_1$ and $m_2$ were
constrained to be between 0.3 and 3.0, $\Delta z$ to lie in $[-50, 50]$ and $\Delta v_x$, $\Delta v_y$ between $-1$ and
1. When all individuals obey these restrictions, the children created by the crossover
remain in the same sample space as well. Mutations were accepted only if the resulting
parameters remained in the given intervals. These constraints reduce the number of
possible combinations of the parameters to $1.2 \times 10^{15}$, still much too large to be
searched systematically. The GA was able to find these parameters in 100 generations
within an accuracy of about 10%. The best individual exhibited $m_1 = 0.99$, $m_2 = 1.00$,
$\Delta z = 2.899$, $\Delta v_x = -0.676$, $\Delta v_y = 0.897$, $s_1 = 1$, $s_2 = 1$. The resulting distribution
of the stars is compared in Fig. 8.18 with the given observation. Only slight deviations
are visible.
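The quoted size of the restricted search space can be checked with a short calculation. The step sizes below are assumptions read off the digit encoding (0.01 for the masses, 0.1 for $\Delta z$, 0.001 for the velocities), with both interval endpoints included.

```python
def grid_points(lo, hi, step):
    """Number of values lo, lo + step, ..., hi (both endpoints included)."""
    return round((hi - lo) / step) + 1


n_mass = grid_points(0.3, 3.0, 0.01)    # allowed values per mass
n_dz = grid_points(-50.0, 50.0, 0.1)    # allowed values of the distance
n_dv = grid_points(-1.0, 1.0, 0.001)    # allowed values per velocity component
n_spin = 2                              # each spin is +1 or -1

# two masses, one distance, two velocity components, two spins
total = n_mass**2 * n_dz * n_dv**2 * n_spin**2
print(f"{total:.2e}")
```

Under these assumptions the product is of order $10^{15}$, in line with the figure given in the text.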
The results of further tests are presented in Ref. [21]. It can be shown that the GA is
much more effective than a random search. Furthermore the algorithm is not sensitive
to noise. Even when the observational data $m^{\mathrm{obs}}_{i,j}$ were disturbed with a 30% level of
noise, the parameters could be recovered, but with slightly lower accuracy. Finally it
should be mentioned that real observational data usually exhibit features such as bars,
rings etc. in the central regions of the galaxies. This may cause problems for the
GA. Hence, one should leave out the inner region for the calculation of the fitness $F$.
Bibliography
[1] M. Mitchell, An Introduction to Genetic Algorithms, (MIT Press, Cambridge
(USA) 1996)
[4] J.H. Holland, Adaptation in Natural and Artificial Systems, (University of Michigan
Press, Ann Arbor 1975)
[5] A. Yamaguchi and A. Sugamoto, Nucl. Phys. B Proc. Suppl. 83-84, 837 (2000)
[7] T.X. Li, S.Y. Yin, Y.L. Ji, B.L. Wang, C.H. Wang, and J.J. Zhao, Phys. Lett. A
267, 403 (2000)
[11] R.S. Judson and H. Rabitz, Phys. Rev. Lett. 68, 1500 (1992)
[12] F.G. Omenetto, B.P. Luce, and A.J. Taylor, J. Opt. Soc. Am. B 16, 2005 (1999)
[15] H. Sadeghi, S. Suzuki, and H. Takenaka, Phys. Earth Plan. Inter. 113, 355 (1999)
[16] Genetic and Evolutionary Algorithm Toolbox, can be tested three weeks for free,
see http://www.geatbx.com/index.html
[17] Matlab 5.3.1 is a commercial program for data analysis, numerical computations,
plotting and simulation, see http://www.mathworks.com/
[18] W.H. Press, S.A. Teukolsky, W.T. Vetterling, and B.P. Flannery, Numerical
Recipes in C, (Cambridge University Press, Cambridge 1995)
[19] I. Grigorenko and M.E. Garcia, Physica A 284, 131 (2000)
[20] I. Grigorenko and M.E. Garcia, Physica A 291, 439 (2001)
Figure 9.1: A two-dimensional spin glass with bond disorder. Spins are placed on the
sites of a regular grid. They interact with their neighbors, the interaction is random,
either ferromagnetic or antiferromagnetic.
The sum $\langle i, j \rangle$ runs over all pairs of nearest neighbors and $J_{ij}$ denotes the strength
of the bond connecting spins $i$ and $j$. It is also possible to add a term describing the
interaction with an external field $B$; here we will concentrate on the case $B = 0$. This
kind of model was introduced by Edwards and Anderson [6] in 1975; usually it is called
the EA model. It has a broad range of applications. Models involving similar energy
formulae have been developed e.g. for representing neural networks, social systems or
stock markets.
For each realization of the disorder, the values $J_{ij}$ of the bonds are drawn according
to a given probability distribution. Very common are the Gaussian distribution and
the bimodal $\pm J$ distribution, which have the following probability densities:
Once the values of the bonds are fixed for a realization, they keep their values throughout
the whole calculation or simulation; one speaks of quenched disorder. Since the
system is random itself, to calculate physical quantities like magnetization or energy,
one must perform not only a thermal average but also an average over different
realizations of the disorder.
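Drawing one quenched realization and evaluating the energy can be sketched as follows. Several points are assumptions made for illustration only: a square lattice with periodic boundaries and $L \ge 3$, a Gaussian with zero mean and unit variance, $J = 1$ for the bimodal case, and the standard form $H = -\sum_{\langle i,j \rangle} J_{ij}\sigma_i\sigma_j$ without external field. The function names are hypothetical.

```python
import random


def draw_bonds(L, kind="gaussian", rng=random):
    """Quenched bonds of an L x L square lattice (periodic boundaries, L >= 3).

    Returns a dict mapping (site_a, site_b) -> J for the two forward neighbors.
    """
    bonds = {}
    for x in range(L):
        for y in range(L):
            i = x * L + y
            for j in (((x + 1) % L) * L + y, x * L + (y + 1) % L):
                if kind == "gaussian":
                    bonds[(i, j)] = rng.gauss(0.0, 1.0)      # assumed: mean 0, variance 1
                else:
                    bonds[(i, j)] = rng.choice((-1.0, 1.0))  # bimodal +-J with J = 1
    return bonds


def energy(bonds, spins):
    """H = -sum over bonds of J_ij * sigma_i * sigma_j (assumed form, B = 0)."""
    return -sum(J * spins[i] * spins[j] for (i, j), J in bonds.items())
```

A disorder average is then obtained by repeating the whole calculation for many independent calls of `draw_bonds`, keeping each set of bonds fixed while the spins change.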
The main ingredients constituting a spin glass are mixed signs of interactions and
disorder. As a consequence, there are inevitably spins which cannot fulfill all constraints
imposed by their neighbors and the connecting bonds, i.e. there will be some
ferromagnetic bonds connecting antiparallel spins and vice versa. One says that it is not
possible to satisfy all bonds. This situation is called frustration; the concept was
introduced in [7]. In Fig. 9.2 an example of a small frustrated system is shown. In Sec.
9.2 it will be explained what impact the presence of frustration has on the choice of the
algorithms.
But first we will show how bond-randomness and frustration are created
in real materials.
to 10 K, a jump of the susceptibility occurs. Again the system evolves for a while.
After time $t_2$ the system is heated back to 12 K. Now the susceptibility switches
back to the value it had at time $t_1$. Hence, the system has remembered its state at
temperature 12 K. Such types of experiments are not understood so far in detail; only
heuristic explanations exist. This is one reason why spin glasses have attracted so
much attention in the past and will probably also attract it in the future.
The basic reason for this strange behavior is the type of interaction which is present in
this class of materials. The behavior of the Fe$_x$Au$_{1-x}$ alloy is governed by the
indirect-exchange interaction, usually called RKKY (Ruderman, Kittel, Kasuya, Yosida)
interaction. Placing a magnetic spin $S_i$ (iron) in a sea of conducting electrons results in
a damped oscillation in space of the susceptibility. Another spin $S_j$ placed at distance
$r$ will create the same kind of oscillations, resulting in an energy $H = J(r)\, S_i \cdot S_j$ where
Figure 9.5: The RKKY (Ruderman, Kittel, Kasuya, Yosida) interaction sketched:
strength of the interaction of two spins with distance r in a sea of conducting electrons.
the lattice sites are occupied in a regular way, but the sign and the strength of the
bonds are random. Examples of such random-bond systems are Rb$_2$Cu$_{1-x}$Co$_x$F$_4$ and
Fe$_{1-x}$Mn$_x$TiO$_3$.
Figure 9.6: Mean-field solution vs. Droplet picture. The solution of the SK model
exhibits a complicated energy landscape, resulting in a broad distribution $P(|q|)$ of
the overlaps. In the Droplet picture instead the system is dominated by one pair of
states at low temperatures, giving rise to a delta-distributed $P(|q|)$.
On the other hand, analytically it is very hard to treat the EA model [2, 3]. Since
it is actually impossible to solve the simple cubic ferromagnet analytically, the reader
may imagine that due to the additional average over the disorder and the varying sign
of the bonds, only very rough approximations could be performed successfully for spin
glasses. But there is a special spin-glass model, which was introduced by Sherrington
and Kirkpatrick also in 1975 [11], the SK model. Its Hamiltonian is similar to the
EA model, see Eq. (9.1), but it includes interactions between all pairs of spins. This
means that the spins do not have any positions in space, the system does not have a
dimension. Usually one says the system is infinite-dimensional, because in the thermodynamic
limit each spin has infinitely many "neighbors". The model is denoted as
the mean-field (MF) model as well, since the MF approximation is exact here. For a
Gaussian distribution of the interactions, the SK model has been solved analytically
through the use of several enhanced techniques by Parisi in the 1980s [2]. The main
property of the solution is that a complicated energy landscape is obtained (see upper
left part of Fig. 9.6) and that the states are organized in a special hierarchical
tree-like structure which is called ultrametric, for details see [12]. Especially at low
temperatures for typical realizations, there are always many configurations which are
arbitrarily different. What does this mean? To measure the difference between two
configurations $\{\sigma_i^{\alpha}\}$, $\{\sigma_i^{\beta}\}$, the overlap $q$ is introduced:
$$q = \frac{1}{N} \sum_i \sigma_i^{\alpha} \sigma_i^{\beta}$$
Thus if $\{\sigma_i^{\alpha}\}$ and $\{\sigma_i^{\beta}\}$ are the same, we obtain $q = 1$, while $q = -1$ if the configurations
are inverted relative to each other. If only about half of the spins have the same
orientation, we get $q \approx 0$. Since a spin glass may exhibit many configurations
with large thermodynamical weight even at low temperatures, one has to compare
all of them pairwise. Each comparison results in an overlap value, so we end up
with a distribution of overlaps $P_J(q)$. After averaging over the disorder an average
distribution is obtained, denoted by $P(q)$. Since the Hamiltonian Eq. (9.1) does not
contain an external magnetic field, it is symmetrical with respect to the inversion of
all spins. Thus, $P(q)$ is symmetrical with respect to $q = 0$ and it is sufficient to study
$P(|q|)$. The result for the SK model at low temperatures is shown in the lower left
part of Fig. 9.6. It contains a large peak called self overlap, which results from the
overlaps of states belonging to the same valley in the energy landscape. Additionally,
there is a long tail down to $q = 0$ resulting from pairs of states from different valleys.
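The pairwise comparison can be sketched as below, assuming the usual site-averaged product $q = \frac{1}{N}\sum_i \sigma_i^{\alpha}\sigma_i^{\beta}$, which reproduces the limiting values $q = 1$, $-1$ and $\approx 0$ discussed above. The function names are hypothetical, and the histogram counts every pair once without thermodynamic weights, so it is only a crude stand-in for $P_J(|q|)$.

```python
def overlap(config_a, config_b):
    """q = (1/N) * sum_i sigma_i^a * sigma_i^b for two +-1 configurations."""
    if len(config_a) != len(config_b):
        raise ValueError("configurations must have the same number of spins")
    return sum(a * b for a, b in zip(config_a, config_b)) / len(config_a)


def overlap_histogram(configs):
    """Count the values of |q| over all distinct pairs of configurations."""
    counts = {}
    for k, a in enumerate(configs):
        for b in configs[k + 1:]:
            q = abs(overlap(a, b))
            counts[q] = counts.get(q, 0) + 1
    return counts
```

Taking the absolute value uses the inversion symmetry mentioned in the text: a configuration and its mirror image are treated as the same state.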
Although the solution of the SK model is very elegant, it is restricted to this special
spin-glass model. The question concerning the behavior of realistic, i.e. finite-dimensional,
spin glasses is currently unsolved. One part of the physics community
believes that also for e.g. three-dimensional systems a hierarchical organization
of the states similar to that of the SK/mean-field model can be found. Another group
favors a description which predicts a much simpler behavior, the Droplet picture
[13, 14, 15, 16, 17]. In that framework it is assumed that the low-temperature behavior
is governed by basically one class of similar states (and the inverses), i.e. the
energy landscape is dominated by one large valley, see right half of Fig. 9.6. The
main signature of this behavior is that the distribution of overlaps is a delta function.
Please note that for finite system sizes, the distribution of overlaps always has a finite
width. Thus, the delta function is found only in the thermodynamic limit $N \to \infty$.
Recently, many results have been made available addressing this question, especially
with numerical techniques. Since near the transition temperature $T_G$ the systems are
very difficult to equilibrate, one is restricted to small system sizes, and even lower temperatures
are not accessible using the usual Monte Carlo methods. As a consequence,
a definite answer has not yet been obtained.
even all bonds are negative. Here, the transformation to a network is possible, since via
a gauge transformation all bonds can be made positive. We will now give an example
of such a gauge transformation and we will show why it does not work for spin glasses.
In the left part of Fig. 9.7 a small DAFF is shown. The basic idea of the gauge
transformation is to multiply every second spin in a checkerboard manner with $-1$
and leave the other half unchanged:
$$\sigma_i' = t_i \sigma_i \quad \text{with} \quad t_i = (-1)^{x+y+z},$$
where $x, y, z$ are the spatial coordinates of spin $i$ and $t_i$ is the sign of the local gauge
transformation. The resulting system is shown in the right part of Fig. 9.7. All
bonds have turned ferromagnetic. The transformation has the following effect on the
Hamiltonian; in particular, the sign of the quadratic term has changed, so in the $\sigma_i'$ a
ferromagnet is obtained:
Figure 9.8: A small $\pm J$ spin glass. Via a gauge transformation the aim is to make
all bonds ferromagnetic. It fails because the system is frustrated.
Why is this transformation not suitable for spin glasses? In Fig. 9.8 a tiny spin glass
with four spins is shown. Let us try to apply the gauge transformation. Because bonds
with different signs are present, the signs of the transformation have to be chosen for
each spin individually. We start with the spin in the upper left corner. Without loss of
generality we leave it unchanged: $\sigma_1' = \sigma_1$, i.e. $t_1 = 1$. Now we turn to the spin below.
The bond between this and the first spin is antiferromagnetic, so we choose $\sigma_2' = -\sigma_2$
($t_2 = -1$), which makes the bond ferromagnetic. In a similar way we have to choose
$\sigma_3' = -\sigma_3$ ($t_3 = -1$). Now we are left with the spin in the upper right corner. It is
connected by one ferromagnetic bond and one antiferromagnetic bond to its neighbors.
Consequently, whatever sign for the transformation of $\sigma_4$ we select, always one bond
remains negative. The reason that not all bonds can be made positive is equivalent to
the fact that in a ground state it is not possible to satisfy all bonds, i.e. the presence
of frustration. In general a system is frustrated if closed loops exist which have an odd
number of antiferromagnetic bonds, e.g. an antiferromagnet on a triangular lattice.
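The spin-by-spin choice of gauge signs can be automated. The sketch below (hypothetical function `find_gauge`, bonds given as a dictionary) propagates the signs along the bonds with a breadth-first search and reports failure as soon as a closed loop makes two contradictory demands on the same spin.

```python
from collections import deque


def find_gauge(n_spins, bonds):
    """Try to find signs t_i = +-1 with J_ij * t_i * t_j > 0 for every bond.

    bonds maps (i, j) -> J_ij (non-zero). Returns the list t, or None if
    some closed loop is frustrated.
    """
    neighbors = {i: [] for i in range(n_spins)}
    for (i, j), J in bonds.items():
        neighbors[i].append((j, J))
        neighbors[j].append((i, J))
    t = [0] * n_spins
    for start in range(n_spins):
        if t[start] != 0:
            continue
        t[start] = 1                       # free choice, without loss of generality
        queue = deque([start])
        while queue:
            i = queue.popleft()
            for j, J in neighbors[i]:
                wanted = t[i] if J > 0 else -t[i]
                if t[j] == 0:
                    t[j] = wanted
                    queue.append(j)
                elif t[j] != wanted:
                    return None            # contradiction: frustrated loop
    return t


# Pure antiferromagnet on a square: the checkerboard gauge is found.
print(find_gauge(4, {(0, 1): -1, (1, 2): -1, (2, 3): -1, (3, 0): -1}))
# One antiferromagnetic bond on the loop (odd number): no gauge exists.
print(find_gauge(4, {(0, 1): -1, (1, 2): 1, (2, 3): 1, (3, 0): 1}))
```

For the unfrustrated square the result is the alternating pattern of signs, while the second call and an antiferromagnetic triangle both return `None`.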
So we can see that due to the existence of frustration a spin glass cannot be transformed
into a ferromagnet. This is the reason why the fast algorithms for the ground-state
calculations cannot be applied in this case. In fact it can be shown that the ground-state
problem for spin glasses is NP-hard [18], i.e. only algorithms with exponentially
increasing running time are known. As an exception, for the special case of two-dimensional
spin glasses without external field and with periodic boundary conditions
in at most one direction, efficient polynomial algorithms [19] for the calculation of
exact ground states are available. The most recent applications are based on matching
algorithms [20, 21, 22, 23, 24, 25, 26, 27, 28], see also Chap. 10; other exact approaches
can be found in Refs. [29, 30]. Recently results for systems of size $1800 \times 1800$ were
obtained [31].
For the general problem several algorithms are available, for an overview see [32, 33].
The simplest method works by enumerating all $2^N$ possible states and obviously has an
exponential running time. Even a system size of $4^3$ is too large. The basic idea of the
so-called Branch-and-Bound algorithm [34] is to exclude the parts of the state space
where no low-lying states can be found, so that the complete low-energy landscape of
systems of size $4^3$ can be calculated [35, 36, 37, 38]. Also transfer-matrix techniques
have been applied [39] for $4^3$ spin glasses. To evaluate all ground states, similar
algorithms have been applied to two-dimensional systems as well [40, 41, 42, 43, 44, 45].
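The enumeration approach is easy to write down, which also makes its cost obvious. The sketch below uses a hypothetical function `ground_states` and assumes the energy $H = -\sum J_{ij}\sigma_i\sigma_j$ without field. For $4^3 = 64$ spins the loop would have to visit $2^{64} \approx 1.8 \times 10^{19}$ configurations.

```python
from itertools import product


def ground_states(n_spins, bonds):
    """Enumerate all 2**N configurations; return (E_min, list of minimizers)."""
    best_energy, best = None, []
    for spins in product((-1, 1), repeat=n_spins):
        e = -sum(J * spins[i] * spins[j] for (i, j), J in bonds.items())
        if best_energy is None or e < best_energy:
            best_energy, best = e, [spins]   # strictly lower energy found
        elif e == best_energy:
            best.append(spins)               # degenerate with the current minimum
    return best_energy, best


# Frustrated four-spin loop: one bond always stays unsatisfied.
e_min, states = ground_states(4, {(0, 1): -1, (1, 2): 1, (2, 3): 1, (3, 0): 1})
print(e_min, len(states))
```

With integer bonds the comparison `e == best_energy` is exact; for Gaussian bonds the ground state is unique up to a global spin flip, so degeneracy is not an issue there.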
A more sophisticated method called Branch-and-Cut [46, 47] works by rewriting the
quadratic energy function as a linear function with an additional set of inequalities
which must hold for the feasible solutions. Since not all inequalities are known a priori,
the method iteratively solves the linear problem, looks for inequalities which are
violated, and adds them to the set until the solution is found. Since the number of
inequalities grows exponentially with the system size, the same holds for the computation
time of the algorithm. With Branch-and-Cut, nevertheless, small systems up to $8^3$
are feasible. Further applications of this method can be found in Refs. [48, 49].
Since finding ground states in three-dimensional systems is computationally very demanding,
several heuristic methods have been applied. At the beginning simulated
annealing (see Chap. 11) was very popular; recent results can be found in Refs.
[50, 51]. But usually it is very difficult to obtain true ground states using this technique.
A more sophisticated method is the multicanonical method [52], which is based
on Monte Carlo simulations as well, but incorporates a reweighting scheme to speed
up the simulation. Very low-lying states of some systems up to size $12^3$ have been
obtained [53, 54, 55, 56, 57]. Heuristics which are able to find low-energy states but not
true ground states unless the systems are very small can be found in Refs. [58, 59, 60].
Another approach included the application of neural networks [61].
Several variants of genetic algorithms (see Chap. 8) have been applied, usually with
minor success [62, 63, 64, 65, 66, 67, 68]. In a similar approach the population of different
configurations was replaced by a distribution describing the population [69]. At first
sight these approaches looked very promising, but it is not possible to prove whether
a true ground state has been found. One always has to check carefully whether it is
possible to obtain states with slightly lower energy by applying more computational
effort. A genetic approach [70, 71] which was designed especially for spin glasses has
been more successful. Realizations up to size $10^3$ with Gaussian distributions of the
bonds have been studied recently [72, 73, 74], supporting the Droplet picture for spin
glasses. A combination with a recursive renormalization method can be found in Ref.
[75]. The basic idea is to divide the problem into subproblems of smaller size and
treat the subproblems in the same way. The technique has been applied to finite-dimensional
systems with Gaussian distribution of the bonds [76, 77], again up to size
$10^3$, but also to other combinatorial optimization problems like the TSP.
The method presented in this chapter is able to calculate true ground states [78] up to
size $14^3$. The method is based on the special genetic algorithm of Ref. [70] as well but
also incorporates the cluster-exact approximation (CEA) [79] algorithm. CEA is an
optimization method designed specifically for spin glasses, but it should be applicable
to all problems where each element (here: spin) interacts with a small number of other
elements. Its basic idea is to transform the spin glass in such a way that the max-flow
methods, which work only for systems exhibiting no frustration, can be applied nevertheless.
Next a description of the genetic CEA is given. We start with the CEA part and later
we turn to the genetic algorithm.
The basic idea of CEA is to treat the system as if it were possible to turn it into a ferromagnet:
a cluster of spins is built in such a way that all interactions between the cluster
spins can be made ferromagnetic by choosing the appropriate gauge-transformation
signs $t_i = \pm 1$. All other spins are left out ($t_i = 0$). In this way the frustration is
broken. The interaction with the non-cluster spins is included in the total energy, but
the non-cluster spins are not allowed to flip; they remain fixed. Usually one starts
with some random orientations of all spins, so the non-cluster spins just keep this
orientation. Let us consider a pair interaction $J_{ij}\sigma_i\sigma_j$ between a cluster spin $\sigma_i$ and a
non-cluster spin $\sigma_j$. Since $\sigma_j$ is kept fixed, say $\sigma_j = +1$, we can write $J_{ij}\sigma_i\sigma_j = J_{ij}\sigma_i$.
Now the interaction has turned into an interaction of spin $i$ with a local field of size
$J_{ij}\sigma_j = J_{ij}$. After the construction of the cluster has been completed, for each
cluster spin all interactions with non-cluster spins and the original local field $B_i$ are
summed up to calculate the new local field:
The sum runs over all neighbors $j$ of spin $i$. Because of the factor $(1 - |t_j|)$ only the
interactions with non-cluster spins are included in the new local field. The gauge-
The constant $C$ contains the interactions among non-cluster spins. Since these
spins will not change their orientations during the calculation, $C$ does not change.
Thus, it can be neglected for the ground-state calculation. Since the signs $t_i$ have been
chosen such that all pair interactions $J'_{ij} = J_{ij}t_it_j$ are either zero or ferromagnetic and
because several spins may not be included in the cluster ($t_i = 0$), the system we
have obtained is a diluted ferromagnet with random local fields. As we have learned
before, for this system the ground state can be calculated in polynomial time using
the fast methods of Chap. 6.
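How the couplings and fields of the cluster system follow from given gauge signs can be sketched as below. The function name and data layout are assumptions, and the field rule is inferred from the description in the text: the original field plus $J_{ij}(1 - |t_j|)\sigma_j$ summed over the neighbors. The exact equations of the original are not reproduced here.

```python
def cea_effective_system(bonds, spins, t, fields=None):
    """Couplings and local fields of the cluster system for signs t_i in {-1, 0, +1}.

    bonds maps (i, j) -> J_ij, spins holds the current +-1 orientations.
    The fields refer to the original spins; expressed in gauge-transformed
    variables, the field of cluster spin i is additionally multiplied by t_i.
    """
    n = len(spins)
    new_fields = list(fields) if fields is not None else [0.0] * n
    new_bonds = {}
    for (i, j), J in bonds.items():
        # J'_ij = J_ij * t_i * t_j: zero unless both ends belong to the cluster
        new_bonds[(i, j)] = J * t[i] * t[j]
        # factor (1 - |t|): only fixed non-cluster neighbors contribute a field
        if t[i] != 0:
            new_fields[i] += J * (1 - abs(t[j])) * spins[j]
        if t[j] != 0:
            new_fields[j] += J * (1 - abs(t[i])) * spins[i]
    return new_bonds, new_fields
```

For a valid cluster every entry of the returned couplings is zero or positive, which is the property the max-flow based ground-state methods rely on.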
How does the construction of the non-frustrated cluster work? The method is similar
to the construction demonstrated in the example shown in Fig. 9.8: spins are chosen
iteratively. If it is possible to make all adjacent bonds positive by a gauge transformation,
the sign of the transformation is chosen accordingly. It is important to note
that only those bonds have to become ferromagnetic where $t_j \neq 0$ on the other end
of the bond, i.e. for bonds connecting cluster spins. The other bonds may have been
considered already or they can be treated later on. The algorithm for the cluster construction
is presented below. The variable $\delta_i$ is used to remember which spins have
been treated already.
algorithm build-cluster($\{J_{ij}\}$)
begin
    Initialize $\delta_i := 0$ for all $i$;
    Initialize $t_i := 0$ for all $i$;
    while there are untreated spins ($\delta_i = 0$)
    begin
        Choose a spin $i$ with $\delta_i = 0$;
        $\delta_i := 1$;
Next, we assume that spin 6 is chosen. It has one neighbor in the cluster,
spin 5. We obtain $\alpha = J_{65}t_5 = 1$. Consequently, we set $t_6 := 1$. Again, this
transformation leaves the bonds unchanged. Then spin 8 is considered. It has
one neighbor in the cluster: spin 5, with $\alpha = J_{85}t_5 = -1$. Therefore, we set
$t_8 = -1$ to turn the bond between spins 5 and 8 ferromagnetic. The resulting
situation is presented in Fig. 9.10. Spin 9 cannot be added to the cluster. It
has two neighbors with $t_j \neq 0$, spins 8 and 6 with $J_{98}t_8 = -1 \neq 1 = J_{96}t_6$.
During the following iteration spin 2 is chosen. Similar to the preceding steps
$t_2 = 1$ is obtained. If now spin 3 is taken into account, we see that it has two
neighbors which already belong to the cluster: spin 2 and spin 6. We get
$J_{32}t_2 = 1 \neq -1 = J_{36}t_6$. Therefore, spin 3 cannot become a member of the
cluster, so $t_3 = 0$.
If we assume that next spins 4, 7 are treated, then both of them can be added
to the cluster without creating frustration. The gauge-transformation signs
obtained are $t_4 = 1$, $t_7 = -1$. Last, spin 1 cannot be added to the cluster.
The final situation is shown in Fig. 9.11.
In the algorithm above, the order in which the spins $i$ are chosen is not specified.
Several heuristics are possible. From our tests we have found that the most efficient
method so far is as follows. The spins are selected randomly among the spins which
have not been treated so far. For each spin the probability of selection within the
current step is proportional to the number of unsatisfied bonds adjacent to it. These
numbers are calculated using the current spin configuration. Thus, a spin with a high
(bad) contribution to the total energy is selected more often than a spin with a low
(negative = good) contribution. This results in a very quick decrease in the energy,
see below.
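The selection heuristic can be sketched as follows. A bond is counted as unsatisfied when $J_{ij}\sigma_i\sigma_j < 0$. The function names are hypothetical, and the uniform fallback for the case that no untreated spin has an unsatisfied bond is an assumption, since the text does not say what happens then.

```python
import random


def unsatisfied_count(i, neighbors, spins):
    """Number of bonds at spin i with J_ij * sigma_i * sigma_j < 0."""
    return sum(1 for j, J in neighbors[i] if J * spins[i] * spins[j] < 0)


def choose_spin(untreated, neighbors, spins, rng=random):
    """Pick an untreated spin with probability proportional to its unsatisfied bonds."""
    weights = [unsatisfied_count(i, neighbors, spins) for i in untreated]
    if sum(weights) == 0:
        return rng.choice(untreated)       # assumed fallback: uniform choice
    return rng.choices(untreated, weights=weights, k=1)[0]
```

Because the weights depend on the current spin configuration, they change from one CEA step to the next, so different clusters are built in successive runs.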
After the cluster of non-frustrated spins has been constructed and the cluster ground
state has been obtained, the cluster spins are set accordingly, while the other spins
keep their previous orientation. For the whole system this means that the total energy
has either been decreased or it has remained the same, because the cluster spins have
been oriented in an optimum way. Please note that for the total system no ground
state has usually been found! But, since the cluster is built in a random way, another
run can be performed and the cluster will probably be constructed differently. So
the whole step can be repeated and maybe again the energy of the whole system is
decreased. This decrease is very efficient in the beginning, because usually the clusters
are quite big. For three-dimensional spin glasses on average about 55% of all spins
are members of a non-frustrated cluster (70% for two dimensions). In Fig. 9.12 a
sample run for a two-dimensional spin glass is shown. Initially the spin configuration
was chosen randomly, which results on average in energy 0 at step 0. In the beginning
the energy decreases rapidly (please note the logarithmic scale of the x-axis). Later
on the energy levels off and cannot be decreased further: the system has run into one
(probably local) minimum.
Figure 9.12: Energy per spin for a two-dimensional sample spin glass treated with
the CEA algorithm as a function of the number of the step (x-axis: step, logarithmic scale).
From comparisons with exact results for small systems, we know that the states found
in this way really have a very low energy, but, except for very small systems, they are
usually slightly above the true ground states. To make the method even more efficient,
and to find true ground states even for larger system sizes, the CEA method can be
combined with a genetic algorithm (for an introduction, see Chap. 8).
The genetic algorithm starts [70] with an initial population of $M_i$ randomly initialized
spin configurations (= individuals), which are linearly arranged using an array. The
last one is also a neighbor of the first one. Then $n_o \times M_i$ times two neighbors from the
population are taken (called parents) and two new configurations, called offspring, are
created. For that purpose the triadic crossover is used, which turns out to be very
efficient for spin glasses: a mask is used which is a third randomly chosen (usually
distant) member of the population with a randomly chosen fraction of 0.1 of its spins
reversed. In a first step the offspring are created as copies of the parents. Then those
spins are selected where the orientations of the first parent and the mask agree [80].
The values of these spins are swapped between the two offspring.
Figure 9.13: The triadic crossover. Initially the offspring are copies of the
parents. Then the spins between the offspring are exchanged at positions
where parent1 and a mask agree. The mask belongs to the population of
configurations as well.
The effect can be seen especially in this example, because the first halves of
both parents are assumed to be inverted relative to each other. Consequently,
in that part the offspring are equal to the mask and to its mirror configuration,
respectively. This is the reason why the mask must be similar to an
existing low-energy configuration. Since the parents agree in some spins, the
offspring are only partly copies of the mask/its inverse.
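The triadic crossover can be written down compactly. The following Python sketch assumes that configurations are lists of ±1 values; the function name and the parameter `flip_fraction` (the fraction 0.1 of reversed mask spins mentioned in the text) are illustrative choices, not the authors' implementation.

```python
import random

def triadic_crossover(parent1, parent2, mask_source, flip_fraction=0.1, rng=random):
    """Return two offspring built from two parents and a third configuration (mask)."""
    n = len(parent1)
    mask = list(mask_source)
    # Reverse a randomly chosen fraction of the mask spins.
    for i in rng.sample(range(n), int(flip_fraction * n)):
        mask[i] = -mask[i]
    # The offspring start as copies of the parents.
    child1, child2 = list(parent1), list(parent2)
    # Swap spins between the offspring where parent1 and the mask agree.
    for i in range(n):
        if parent1[i] == mask[i]:
            child1[i], child2[i] = child2[i], child1[i]
    return child1, child2
```

Where the parents are inverted relative to each other, one offspring reproduces the mask and the other its inverse; where the parents agree, the swap has no effect.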
Then a mutation with a rate of $p_m$ is applied to each offspring, i.e. a randomly chosen
fraction $p_m$ of the spins is reversed.
Next, for both offspring the energy is reduced by applying CEA. This CEA minimization
step is performed $n_{min}$ times for each offspring. Afterwards each offspring is
compared with one of its parents. The offspring/parent pairs are chosen in such a way
that the sum of the phenotypic differences between them is minimal. The phenotypic
difference is defined here as the number of spins where the two configurations differ.
Each parent is replaced if its energy is not lower (i.e. not better) than that of the
corresponding offspring. After this whole step is conducted $n_o \times M_i$ times, the population is
halved: from each pair of neighbors the configuration which has the higher energy is
eliminated. If more than 4 individuals remain, the process is continued; otherwise it is
stopped and the best remaining individual is taken as the result of the calculation.
The following representation summarizes the algorithm.
algorithm genetic CEA({J_ij}, M_i, n_o, p_m, n_min)
begin
    create M_i configurations randomly;
    while (M_i > 4) do
    begin
        for i = 1 to n_o x M_i do
        begin
            select two neighbors;
            create two offspring using triadic crossover;
            do mutations with rate p_m;
            for both offspring do
            begin
                for j = 1 to n_min do
                begin
                    construct unfrustrated cluster of spins;
                    construct equivalent network;
                    calculate maximum flow;
                    construct minimum cut;
                    set new orientations of cluster spins;
                end
                if offspring is not worse than related parent
                then
                    replace parent with offspring;
            end
        end
        halve population; M_i = M_i/2;
    end
    return one configuration with lowest energy
end
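The population-halving step of the algorithm can be illustrated as follows. This Python sketch makes two assumptions that the text leaves open: the neighbor pairs are taken as the array positions (0,1), (2,3), ..., and a leftover individual in an odd-sized population simply survives. The `energy` argument is any function returning the energy of a configuration.

```python
def halve_population(population, energy):
    """From each pair of array neighbors keep the configuration with the lower energy."""
    survivors = []
    for k in range(0, len(population) - 1, 2):
        a, b = population[k], population[k + 1]
        survivors.append(a if energy(a) <= energy(b) else b)
    if len(population) % 2 == 1:
        # Assumption: an unpaired last individual is kept as it is.
        survivors.append(population[-1])
    return survivors
```

Starting from 16 individuals, for example, the loop condition of the algorithm leads to two halvings (16, 8, 4) before the best remaining configuration is returned.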
The whole algorithm is performed $n_R$ times and all configurations which exhibit the
Figure 9.14: Average ground-state energy per spin $e_0$ as a function of system size
$L$ for three-dimensional Ising $\pm J$ spin glasses.
With increasing system size the average ground-state energy decreases monotonically.
Thus, one can try to fit a function of the form $e_0(L) = e_0(\infty) + a L^{-b}$ to the data
points, resulting in an estimate of the ground-state energy $e_0(\infty)$ of a very large, i.e.
nearly infinite, system. Please note that so far no justification for this special form of
the fitting function has been found. It just fits very well. It has been reported in the
literature that other functions such as exponentials have also been tried, but the results
for $e_0(\infty)$ are similar. When using the fit procedure of the gnuplot program, which is
available for free, the value $e_0(\infty) = -1.7876(3)$ is obtained (the values of $a$, $b$ are not
important here). This value means that in a ground state about 0.6 unsatisfied bonds
per spin exist (if all bonds were satisfied this would result in a ground-state energy of
$-3$; each unsatisfied bond increases the energy by a value of 2).
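As an illustration of such a fit, the following Python sketch uses a simple stand-in for the gnuplot procedure: for each trial exponent $b$ the model is linear in $e_0(\infty)$ and $a$, so these two are obtained by ordinary least squares and the best $b$ is found by scanning a grid. The function names and the grid scan are assumptions made for this example. The second function reproduces the bond-counting argument of the text.

```python
def fit_power_law_offset(sizes, energies, b_grid):
    """Least-squares fit of e0(L) = e_inf + a * L**(-b), scanning b over b_grid.
    Returns (e_inf, a, b) for the grid value of b with the smallest squared error."""
    n = len(sizes)
    best = None
    for b in b_grid:
        x = [L ** (-b) for L in sizes]
        mx, my = sum(x) / n, sum(energies) / n
        sxx = sum((xi - mx) ** 2 for xi in x)
        sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, energies))
        a = sxy / sxx                 # slope of the linear sub-problem
        e_inf = my - a * mx           # intercept = extrapolated energy
        sse = sum((e_inf + a * xi - yi) ** 2 for xi, yi in zip(x, energies))
        if best is None or sse < best[0]:
            best = (sse, e_inf, a, b)
    return best[1], best[2], best[3]

def unsatisfied_bonds_per_spin(e0, dim=3):
    """With J = 1 each spin has `dim` bonds; all satisfied gives e0 = -dim,
    and every unsatisfied bond raises the energy by 2."""
    return (e0 + dim) / 2.0
```

For $e_0(\infty) = -1.7876$ in three dimensions the second function gives $(3 - 1.7876)/2 \approx 0.606$ unsatisfied bonds per spin, the value of about 0.6 quoted in the text.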
For two and four dimensions, values of $e_0(\infty) = -1.4015(3)$ and $e_0(\infty) = -2.095(1)$
have been obtained, respectively, using genetic CEA. These are currently the lowest
values found for the ground-state energies of spin glasses. This is another indication
that the genetic CEA method is indeed a very powerful optimization tool.
Apart from obtaining some values, the calculation of ground-state energies can tell
us a lot more about the underlying physics. In particular, one is able to determine
whether a system exhibits a transition from an ordered low-temperature state to a
disordered high-temperature state at a non-zero temperature $T_c > 0$.
To show how this can be done, first the simple Ising ferromagnet is considered. It
is known from basic courses in statistical physics [83] that the one-dimensional Ising
chain exhibits a phase transition only at $T_c = 0$, i.e. for all finite temperatures the
9.3 Energy and Ground-state Statistics 205
Ising chain is paramagnetic. The two-dimensional ferromagnet, on the other hand, has
a ferromagnetically ordered phase ($T_c = 2.269J$, where $J$ is the interaction constant).
Figure 9.15: A two-dimensional Ising ferromagnet. In the case where all boundary
conditions are periodic (left), the system is ferromagnetic at T = 0. When antiperiodic
boundary conditions are imposed in one direction (right), a domain wall is introduced,
raising the ground-state energy by an amount which is proportional to the length of
the domain wall.
Why the one-dimensional Ising ferromagnet does not exhibit an ordered phase at
T > 0, while the two-dimensional has one, can be seen from ground-state calculations
as well. Consider a two-dimensional ferromagnet of size $N = L \times L$ with periodic
boundary conditions (pbc) in all directions. Since all bonds are ferromagnetic, in the
ground state all spins have the same orientation, see the left half of Fig. 9.15. The
ground-state energy is $E_{pbc} = -2NJ$. Now antiperiodic boundary conditions (abc)
in one direction are considered. This can be achieved by turning negative all bonds
which connect the spins of the first and the last column. Again the ground state is
calculated. Now, it is favorable for the spins of the first and the last column to have
different orientations. This means we get two domains: in one domain all spins are
up, in the other all spins are down. This introduces two domain walls in the system.
The first domain wall is between the first and the last column, connected through the
boundary. This is compatible with the negative sign of the bonds there. The other
domain wall is somewhere else in the system. This means the bonds on the second
domain wall are ferromagnetic, while the spins left and right of the domain wall have
different orientations. Thus, $L$ bonds are broken (footnote 1), raising the ground-state energy
$E_{abc}$ by an amount of $2LJ$. As a consequence, we get $E_{abc} = -2NJ + 2LJ$. The
difference $\Delta = E_{abc} - E_{pbc} = 2LJ$ is called the stiffness energy. Here it depends on
the linear system size $L$. When performing the thermodynamic limit $L \to \infty$, the
Footnote 1: Please note that the fully ordered state is still possible. In this case both domain walls fall onto
each other. But again $L$ bonds are broken.
goes to zero. This indicates that the two-dimensional ferromagnet exhibits some kind
of stiffness against flips of finite domains. Thus, they will not occur at low temperatures,
which means that the ferromagnetically ordered state is stable at low temperatures.
For higher temperatures more complicated domains with longer domain walls
also have to be considered; thus, for entropic reasons, at some temperature $T_c > 0$ the
ordered state is destroyed. Please note that only the fact that $T_c > 0$ holds can be
shown from ground-state calculations; the value of $T_c$ itself cannot be calculated this
way.
If one performs the same considerations for the one-dimensional spin chain, one
sees immediately that when flipping the boundary conditions from periodic to antiperiodic,
again a domain wall is introduced. But now it has only the length one for
all chain lengths $L$. This means the stiffness energy $\Delta = 2J$ does not depend on the
system size. In the thermodynamic limit an arbitrarily small temperature is sufficient
to break the long-range order.
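For very small systems the two stiffness energies can be checked by brute force. The Python sketch below enumerates all $2^N$ configurations of $H = -\sum J_{ij} s_i s_j$; the function names are illustrative, and the construction assumes $L \geq 3$ so that no bond is counted twice on the periodic lattice. It is meant as a check of the argument, not as a practical ground-state method.

```python
from itertools import product

def ground_state_energy(n_spins, bonds):
    """Brute-force minimum of H = -sum J_ij s_i s_j over all 2**n_spins configurations."""
    return min(
        -sum(J * s[i] * s[j] for i, j, J in bonds)
        for s in product((-1, 1), repeat=n_spins)
    )

def square_lattice_bonds(L, J=1, antiperiodic=False):
    """Bonds of an L x L ferromagnet (L >= 3) with periodic boundaries; if requested,
    the bonds connecting the last and the first column are turned negative."""
    bonds = []
    for y in range(L):
        for x in range(L):
            i = y * L + x
            right = y * L + (x + 1) % L
            down = ((y + 1) % L) * L + x
            wrap = antiperiodic and x == L - 1
            bonds.append((i, right, -J if wrap else J))
            bonds.append((i, down, J))
    return bonds

def chain_bonds(L, J=1, antiperiodic=False):
    """Bonds of a closed chain of L >= 3 spins; optionally one bond is negative."""
    return [(i, (i + 1) % L, -J if (antiperiodic and i == L - 1) else J)
            for i in range(L)]

def stiffness(bonds_pbc, bonds_abc, n_spins):
    """Stiffness energy Delta = E_abc - E_pbc."""
    return ground_state_energy(n_spins, bonds_abc) - ground_state_energy(n_spins, bonds_pbc)
```

For the $3 \times 3$ lattice this yields $E_{pbc} = -18 = -2NJ$ and $\Delta = 6 = 2LJ$, while closed chains of any length give $\Delta = 2J$.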
Figure 9.16: Stiffness energy as a function of system size $L$. The line represents
the function $|\Delta(L)| \sim L^{\Theta_S}$ with $\Theta_S = 0.19(2)$. The inset shows the same figure on
a log-log scale. The increase of $|\Delta|$ with system size indicates that in 3d Ising spin
glasses an ordered phase exists below a non-zero temperature $T_c$. The figure is taken
from [78].
The same kind of calculations can be applied to spin glasses as well. One obtains the
ground states with periodic and antiperiodic boundary conditions, respectively, and
calculates the stiffness energy $\Delta = |E_{abc} - E_{pbc}|$. Since spin glasses are disordered
systems, here again an average over the disorder has to be taken. It is necessary to take
the absolute value, because for some systems the state with pbc has a lower energy,
while for others the abc state is favorable. From theoretical considerations [84, 85], one
obtains a behavior of the form $\Delta(L) \sim L^{\Theta_S}$, where $\Theta_S$ is the stiffness exponent.
If $\Theta_S > 0$ one expects that the system has an ordered low-temperature phase, while
for $\Theta_S < 0$ only ordering exactly at $T = 0$ should exist. In Fig. 9.16 the stiffness
energy of three-dimensional spin glasses as a function of system size $L$ is shown. The
value of $\Theta_S = 0.19$ indicates that the three-dimensional spin glass has $T_c > 0$, which
is compatible with Monte Carlo simulations [86, 87]. On the other hand, for two-dimensional
spin glasses, a value of $\Theta_S < 0$ has been found [25], i.e. in two dimensions,
spin glasses seem to have an ordered phase only at $T = 0$.
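Given disorder-averaged stiffness energies for several system sizes, the exponent can be estimated as the slope of a straight line on a log-log scale. The following Python sketch assumes exactly this simple procedure (an unweighted least-squares line through the points); the function name is illustrative and no error analysis is included.

```python
import math

def stiffness_exponent(sizes, deltas):
    """Slope of log|Delta(L)| versus log L by ordinary least squares."""
    xs = [math.log(L) for L in sizes]
    ys = [math.log(abs(d)) for d in deltas]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den
```

A positive slope corresponds to a stiffness energy growing with system size, a negative slope to one that decays.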
Now we turn from the analysis of ground-state energies to the ground-state configurations
themselves. As was pointed out before, we have to perform two kinds of averages
to characterize the behavior of the ground-state landscape. The first kind of average
is that we have to consider many different realizations of the random bonds. Secondly,
for each realization, many independent configurations have to be calculated.
The statistical weight $p(\{\sigma_i\})$ of a spin configuration $\{\sigma_i\}$ with energy $E = H(\{\sigma_i\})$ is
$p(\{\sigma_i\}) = \exp(-H(\{\sigma_i\})/T)/Z$, where $Z$ is the partition sum. In the limit of zero temperature
only the ground states contribute to the behavior of the system. As an important
consequence, we can see from this formula that each ground state contributes with the
same weight, because all of them have exactly the same energy.
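The zero-temperature limit can be made explicit by factoring the ground-state energy out of the Boltzmann weights. In the following short derivation $E_0$ denotes the ground-state energy and $D_0$ the number of ground states of the realization; both symbols are introduced here only for illustration.

```latex
\begin{align}
p(\{\sigma_i\})
  &= \frac{e^{-H(\{\sigma_i\})/T}}{Z}
   = \frac{e^{-\left(H(\{\sigma_i\})-E_0\right)/T}}
          {\sum_{\{\sigma'_i\}} e^{-\left(H(\{\sigma'_i\})-E_0\right)/T}} \\
  &\xrightarrow{\;T \to 0\;}
   \begin{cases}
     1/D_0 & \text{if } H(\{\sigma_i\}) = E_0, \\
     0     & \text{otherwise.}
   \end{cases}
\end{align}
```

Every term with $H > E_0$ vanishes in the limit, while each of the $D_0$ ground states contributes exactly 1 to the denominator, which gives the uniform weight.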
The genetic CEA method returns in each run at most one ground state. Thus, any
such algorithm which is used to sample a ground-state landscape of spin glasses (or
any other system) must return each ground state with the same probability. In that
case, by taking an average over different ground states, it is ensured that the physical
quantities calculated indeed represent the true thermodynamic behavior.
For the algorithm which has been presented in the preceding section, it is not known a
priori, whether the ground states obey the correct statistics, i.e. whether each ground
state appears with the same probability. To test this issue [88], we take one small spin
glass of size $N = 5^3$ and let the algorithm run $n_R = 10^5$ times. Each ground state
which appears is stored and it is counted how often each ground state is found. This
gives us a histogram of the frequencies of how often each ground state is calculated by
the algorithm. The result is shown in Fig. 9.17. Obviously the large deviations from
state to state cannot be explained by the presence of statistical fluctuations. Thus,
genetic CEA samples different ground states from the same realization with different
weights.
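A quick estimate shows why statistical fluctuations cannot explain the histogram. If all 56 ground states of this realization were equally likely, the count of each state in $10^5$ runs would follow a binomial distribution. The Python sketch below (function name chosen for this example) expresses an observed count in units of the binomial standard deviation; the counts used afterwards, about 4500 and about 200, are the approximate values quoted in the caption of Fig. 9.17.

```python
import math

def z_score(count, n_runs, n_states):
    """Deviation of an observed count from the uniform expectation n_runs/n_states,
    in units of the binomial standard deviation."""
    p = 1.0 / n_states
    mean = n_runs * p
    sigma = math.sqrt(n_runs * p * (1.0 - p))
    return (count - mean) / sigma
```

The uniform expectation is about 1786 hits per state with a standard deviation of about 42, so a state found about 4500 times lies more than 60 standard deviations above the mean and a state found about 200 times more than 35 below it.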
Consequently, when just the configurations are taken as they are given by the algorithm,
the physical quantities calculated by taking an average are not reliable [89, 90].
This is true especially for the overlap parameter $q$ which is used to compare different
configurations and evaluate the ground-state landscape.
It should be pointed out that this drawback does not appear only for the genetic CEA
method. No algorithm known so far guarantees the correct statistics of the ground
Figure 9.17: Number of times each ground state is found in $10^5$ runs of the genetic
CEA algorithm for one sample realization of size $N = 5^3$ (x-axis: id of ground state). This realization has 56
different ground states. Obviously, different states have significantly different probabilities
of being calculated in one run. Example: ground state 10 occurred about 4500
times among the $10^5$ outcomes while state 56 was found only about 200 times.
states. This is true also for methods which are based on thermodynamics itself, like
simulated annealing (see Chap. 11), which is very often used to study ground states
of disordered systems. For that algorithm, if the rate of the temperature decrease
is chosen in a way that one in two runs results in a true ground state, then for the
$N = 5^3$ system treated above a similar histogram is obtained. By slowing down the
cooling process the weights of the different ground states become more equal, but one
has to cool 100 times slower, i.e. spend 100 times the computational effort, to find each
ground state with almost the same probability.
So far no method is known which allows the algorithms to be changed in such a way
that each ground state appears with the proper weight. On the other hand, it is
possible, by applying a post-processing step, to remove the bias which is imposed by
the algorithms. A suitable method is presented in the following section.
exhibit about $10^{16}$ ground states. In this case one can only sample a small subset of
states. But then each of the existing ground states must have the same probability of
appearing in the sample to guarantee the correct result.
The basic concept of the method, which ensures correct thermodynamic statistics of
the ground states, is to apply post processing. The method which calculates the ground
states remains unchanged. The input to the post processing is a set of ground states.
The output is another set of ground states, which now have the correct thermodynamic
weights, see Fig. 9.18. This post processing can be used in conjunction with all kinds
of methods which calculate ground states; it is not restricted to the genetic CEA
algorithm.
Figure 9.18: The correct thermodynamical weight of the ground states is ensured by
a post-processing step which is applied after a set of ground states has been calculated.
(Diagram: calculate ground states, ground states, post processing, ground states.)
Now the idea behind the post processing is explained. Assume we investigate a three-dimensional
spin glass of size $N = 8^3$ which has about $10^{16}$ different ground states.
Using genetic CEA we calculate 100 ground states. From the preceding section we
know that some ground states are more likely to be found than others. The first step
of the post processing consists of dividing the 100 states into groups according to some
criterion. A convenient criterion is explained later on. For the moment we take a toy
criterion: we assume that ground states have colors, e.g. blue, red and green, and that
they are divided according to their colors.
Next we have to get access to the other ground states, which have not been found
before. They have colors as well. We assume that it is possible to perform $T = 0$
Monte Carlo simulations which preserve the color and visit only ground states. This
is similar to a Monte Carlo simulation which preserves the magnetization by flipping
only pairs of up/down spins. We assume that the MC simulation is ergodic, which means that
starting with a blue ground state we can access all blue ground states by just running
the simulation long enough. Furthermore, we assume that the simulation satisfies
detailed balance, which means that each ground state obtains its correct thermodynamic
weight within its group. This means that after performing $n_{RUN}$ runs of the simulation we
have a set of $n_{RUN}$ ground states (the initial states are not used any more), where all
existing blue ground states have the same probability of being in this set, all red have
the same probability, etc.
Now, we still do not know whether a red and a blue ground state have the same
probability of being visited during the MC simulation. To obtain the final sample
where all ground states have the same probability, we have to estimate the size of
each group. This must be done using the small number of states we have obtained
for each group by the MC simulation. We cannot just count all ($\sim 10^{15}$) blue states,
simply because we do not have them all available. Later on we will explain how the
group sizes are estimated for the real criterion we use. For the moment we just assume
that it is possible somehow to estimate the total number of blue states etc. for each
realization.
The final sample of states is obtained by drawing from each group a number of states
which is proportional to the size of the group, so each group is represented in the
sample with the correct weight. Since we have made all ground states within each
group equiprobable by the MC simulation, we end up with a sample (e.g. of size 100)
where each of the $10^{16}$ ground states is included with the same probability. Thus,
the sample is thermodynamically correctly distributed. A summary of the method is
given in Fig. 9.19.
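The final drawing step of the toy example can be sketched as follows. This Python fragment assumes that the states of each group have already been equilibrated by the MC simulation and that group-size estimates are available; the function name, the rounding of the per-group counts and the drawing with replacement are choices made only for this illustration.

```python
import random

def draw_final_sample(states_by_group, group_sizes, sample_size, rng=random):
    """Draw from each group a number of states proportional to its estimated size."""
    total = sum(group_sizes.values())
    sample = []
    for group, states in states_by_group.items():
        # Rounded share of this group; the rounded counts need not add up
        # exactly to sample_size in general.
        k = round(sample_size * group_sizes[group] / total)
        # Assumption: states are drawn with replacement from the equilibrated set.
        sample.extend(rng.choices(states, k=k))
    return sample
```

With hypothetical group sizes of $6 \times 10^{15}$ (blue), $3 \times 10^{15}$ (red) and $10^{15}$ (green), a sample of 100 states contains 60 blue, 30 red and 10 green states.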
A final problem may be that ground states having a specific color, e.g. yellow, have
not been detected by the initial run of the genetic CEA method, although the system
may have some (e.g. $10^4$) yellow states. In this case they will never occur during
the MC simulation, because there the color is preserved. For the actual criterion we
use, it can be shown that the probability that a member of a specific group is found
increases with the size of the group [88]. Furthermore, for the system sizes which are
accessible the number of groups is small compared with the number of ground states
usually obtained. Thus, only some small groups are missed, representing just a tiny
fraction of all ground states. Consequently, when calculating physical properties, the
error made is very small.
So far we have explained how the post processing works in general. Now it is time
to be precise: the actual criterion we use is given. We start with a definition. Two
ground states are said to be neighbors if they differ by the orientation of just one spin.
Thus, the spin can be flipped without changing the energy; such a spin is called free.
All ground states connected to each other by this neighbor relation in a transitive way
are said to be in one valley. Different valleys are separated by states having a higher
energy. Therefore, one can travel in phase space within a valley by iteratively flipping
free spins. The valleys are used as the groups mentioned above, i.e. for obtaining each
ground state with the same probability. In the first step the configurations are sorted
according to the valleys and then the valley sizes are estimated.
The MC simulation, which preserves the group identity, is very simple: iteratively a
spin is chosen randomly. If the spin is free, it is flipped, otherwise the spin remains
unchanged. This is an ordinary $T = 0$ MC simulation. Consequently, ergodicity and
detailed balance are fulfilled as usual. One just has to ensure that the runs are of
sufficient length. This length can be estimated by test simulations. One starts with
a given ground state and performs, say, 20 different runs of length $n_{MC}$ MC steps per
spin, resulting in 20 new ground states of the same valley. Then one compares the
ground states by calculating the distribution $P(q)$ of overlaps; this is the quantity we
are finally interested in. The behavior of $P(q)$ is observed as a function of the number
9.4 Ballistic Search
$n_{MC}$ of steps. If the shapes no longer change even for the largest valleys, the runs
have a sufficient length. For three-dimensional systems $n_{MC} = 100$ turned out to be
sufficient for all system sizes.
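The $T = 0$ simulation can be sketched in Python as follows. The sketch assumes integer couplings (as in the $\pm J$ model), so that a spin is free exactly when its local field $\sum_j J_{ij} s_j$ vanishes; for non-integer couplings a tolerance would be needed. The data layout `bonds[i][j] = J_ij` and all function names are assumptions of this example, and the small ring with one negative bond serves only as a toy system.

```python
import random

def ring_bonds(n, antiperiodic=False):
    """Toy system: closed ferromagnetic chain of n >= 3 spins, optionally with one negative bond."""
    bonds = {i: {} for i in range(n)}
    for i in range(n):
        j = (i + 1) % n
        J = -1 if (antiperiodic and i == n - 1) else 1
        bonds[i][j] = J
        bonds[j][i] = J
    return bonds

def local_field(i, spins, bonds):
    """Sum of J_ij * s_j over the neighbors j of spin i."""
    return sum(J * spins[j] for j, J in bonds[i].items())

def energy(spins, bonds):
    """H = -sum over bonds of J_ij s_i s_j (each bond counted once)."""
    return -sum(J * spins[i] * spins[j]
                for i in bonds for j, J in bonds[i].items() if i < j)

def zero_temperature_mc(spins, bonds, n_mc, rng=random):
    """n_mc MC steps per spin: choose a random spin and flip it only if it is free."""
    spins = list(spins)
    n = len(spins)
    for _ in range(n_mc * n):
        i = rng.randrange(n)
        if local_field(i, spins, bonds) == 0:   # free spin: flipping costs no energy
            spins[i] = -spins[i]
    return spins
```

Started from a ground state, the simulation only visits configurations of the same energy, i.e. other ground states of the same valley.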
More difficult to implement is the first step of the post processing, the division of the
ground states into different valleys. If all ground states were available, the method
for dividing the ground states would work as follows. The construction starts with
one arbitrarily chosen ground state. All other states which differ from this state by
one free spin are its neighbors. They are added to the valley. These neighbors are
treated recursively in the same way: all their neighbors which are not yet included in
the valley are added. After the construction of one valley is complete, the construction
of the next one starts with a ground state which has not been visited so far.
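When the complete set of ground states is available, this construction is a standard search for connected components. The Python sketch below implements it with a breadth-first search instead of explicit recursion; the function names are illustrative, and the helper that enumerates the ground states of a small ring with one negative bond is only a toy input for trying it out.

```python
from collections import deque
from itertools import product

def valleys_by_flooding(ground_states):
    """Group a COMPLETE list of ground states into valleys, i.e. into
    connected components under flips of single (free) spins."""
    remaining = set(map(tuple, ground_states))
    valleys = []
    while remaining:
        start = remaining.pop()
        valley, queue = [start], deque([start])
        while queue:
            state = queue.popleft()
            for i in range(len(state)):
                neighbor = state[:i] + (-state[i],) + state[i + 1:]
                # If the neighbor is a ground state too, spin i is free.
                if neighbor in remaining:
                    remaining.remove(neighbor)
                    valley.append(neighbor)
                    queue.append(neighbor)
        valleys.append(valley)
    return valleys

def ring_ground_states(n):
    """Toy input: all ground states of a closed ferromagnetic chain of
    n >= 3 spins with one negative bond, found by enumeration."""
    bonds = [(i, (i + 1) % n, -1 if i == n - 1 else 1) for i in range(n)]
    def H(s):
        return -sum(J * s[i] * s[j] for i, j, J in bonds)
    states = list(product((-1, 1), repeat=n))
    e0 = min(H(s) for s in states)
    return [s for s in states if H(s) == e0]
```

For the ring of 5 spins all 10 ground states end up in a single valley, whereas the two ground states of a purely ferromagnetic system would form two separate valleys.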
Unfortunately the number of ground states grows very quickly with system size, so
even for small systems like $N = 8^3$ it is impossible to calculate all of them. Thus, the
valley structure of the, say, 100 states which were obtained has to be established in a
different way.
The basic tool is a test which tells whether two ground states are in the same valley
or not. The test is called ballistic search (BS); the reason for the choice of the name
will become clear later on. Assume that it is known that some ground states belong
to the same valley. Now we know that another state $z$ belongs to the valley as well, if
the test tells us that $z$ is in the same valley as any of the states already treated. The
main feature of the test is that it can be performed for states which are not direct
neighbors in phase space. This is the reason why only a small subset of all ground
states is needed.
The test works as follows. Given two independent replicas $\{\sigma_i^\alpha\}$ and $\{\sigma_i^\beta\}$, let $\Delta$ be
the set of spins which are different in both states: $\Delta \equiv \{i \mid \sigma_i^\alpha \neq \sigma_i^\beta\}$. Now BS tries to
build a path in configuration space of successive flips of free spins, which leads from
$\{\sigma_i^\alpha\}$ to $\{\sigma_i^\beta\}$. The path consists of states which differ only by flips of free spins from
$\Delta$ (see Fig. 9.20). For the simplest version, iteratively a free spin is selected randomly
from $\Delta$, flipped and removed from $\Delta$, i.e. each spin is flipped at most once. Therefore,
a straight path is built in phase space. This is the reason why the search for a path is
called ballistic. This test does not guarantee to find a path between two ground states
which belong to the same valley. It may depend on the order of selection of the spins
whether a path is found or not, because not all free spins are independent of each
other. Additionally, for very large valleys, when $\Delta$ percolates, it may be necessary to
flip some spins not contained in $\Delta$ twice to allow other spins to flip in between. Thus,
a path is found with a certain probability $p_f$, which decreases with the size of $\Delta$. If the
size of $\Delta$ is small, $p_f$ is very close to one. For the identification of the valley structure,
it does not matter that $p_f < 1$. This is explained next.
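The simplest version of the test can be sketched in Python as follows. As before, the sketch assumes integer couplings stored as `bonds[i][j] = J_ij`, so that a spin is free when its local field is exactly zero; the function names and the toy ring system are assumptions of this example. A return value of `False` only means that no path was found in this attempt.

```python
import random

def ring_bonds(n, antiperiodic=False):
    """Toy system: closed ferromagnetic chain of n >= 3 spins, optionally with one negative bond."""
    bonds = {i: {} for i in range(n)}
    for i in range(n):
        j = (i + 1) % n
        J = -1 if (antiperiodic and i == n - 1) else 1
        bonds[i][j] = J
        bonds[j][i] = J
    return bonds

def local_field(i, spins, bonds):
    """Sum of J_ij * s_j over the neighbors j of spin i."""
    return sum(J * spins[j] for j, J in bonds[i].items())

def ballistic_search(state_a, state_b, bonds, rng=random):
    """Try to reach state_b from state_a by flipping free spins of the
    difference set, each at most once. True if a path was found."""
    spins = list(state_a)
    diff = {i for i in range(len(spins)) if spins[i] != state_b[i]}
    while diff:
        free = [i for i in sorted(diff) if local_field(i, spins, bonds) == 0]
        if not free:
            return False          # stuck: no path found in this attempt
        i = rng.choice(free)      # random selection among the free spins
        spins[i] = -spins[i]
        diff.remove(i)
    return True
```

On the ring with one negative bond a path from a ground state to its inverse is always found, because the domain wall can be moved once around the ring; in the purely ferromagnetic ring the two ground states have no free spins and the test fails, as it should.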
Figure 9.21: Algorithm for the identification of all valleys: several ground states
(circles) "cover" parts of valleys (filled areas). During the processing of all states a
set of valleys is kept. When state $c^3$ is treated, it is established using BS to how many
of the already existing valleys the state belongs. Three cases can occur: a) the
ground state is found to belong to no valley, b) it is found in exactly one valley and c)
it is found in several valleys. In the first case a new valley is found, in the second one
nothing changes and in the third case several smaller valleys are identified as subsets
of the same larger valley.
The algorithm for the identification of valleys using BS works as follows: the basic
idea is to let a ground state represent that part of a valley which can be reached
using BS with a high probability by starting at this ground state. If a valley is
large it has to be represented by a collection of states, such that the whole valley
is "covered". For example, a typical valley of an $L = 8$ spin glass consisting of $10^{10}$
ground states is usually represented by only a few ground states (e.g. two or three). A
detailed analysis of how many representing ground states are needed as a function of
valley and system size can be found in [91]. At each time the algorithm stores a set of
$m$ valleys $A = \{A(r) \mid r = 1, \ldots, m\}$ ($r$ being the ID of the valley $A(r)$), each consisting
of a set $A(r) = \{c^{rl}\}$ of representing configurations $c^{rl} = \{\sigma_i^{rl}\}$ ($l = 1, \ldots, |A(r)|$). At
the beginning the valley set is empty. Iteratively all available ground states $c^j = \{\sigma_i^j\}$
($j = 1, \ldots, D$) are treated: the BS algorithm tries to find paths from $c^j$ or its inverse
to all representing configurations in $A$. Let $F$ be the set of valley IDs, where a path
Figure 9.22: Example showing that the order in which the states are treated may affect the result.
Consider three states $c^1$, $c^2$, $c^3$, all belonging to the same valley. Assume that BS finds
a path between $(c^1, c^2)$ and $(c^2, c^3)$, but not between $(c^1, c^3)$. In the first case two
valleys are found (false), in the second case one valley (correct). In order that the
correct result is always obtained, two iterations are needed.
path is found from $c^4$ to $c^1$ and to $c^2$. Thus, all states encountered so far
belong to the same valley. Both valleys are merged and now are represented
by $c^1$ and $c^2$. For $c^5$ a path to $c^2$ but not to $c^1$ is found. Nevertheless, this
means that $c^5$ belongs to the valley as well. Finally, for $c^6$ no path is found
to either $c^1$ or $c^2$. Therefore, $c^6$ belongs to another valley, which is created
in the last step.
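The bookkeeping of the valley identification can be sketched as follows. The Python fragment takes the pairwise test as a function argument (`same_valley_test`, a hypothetical name, which in practice would be a BS attempt, possibly also tried with the inverse state) and distinguishes the three cases of Fig. 9.21. Storing representatives and members in dictionaries is a choice made for this illustration, and the second iteration over all states mentioned in Fig. 9.22 is not included.

```python
def identify_valleys(states, same_valley_test):
    """Sort states into valleys using a pairwise test against the representatives."""
    valleys = []   # each valley: {"reps": [...], "members": [...]}
    for c in states:
        hits = [v for v in valleys
                if any(same_valley_test(c, r) for r in v["reps"])]
        if not hits:
            # Case a: no path to any valley, c founds (and represents) a new one.
            valleys.append({"reps": [c], "members": [c]})
        elif len(hits) == 1:
            # Case b: c belongs to exactly one known valley, representatives unchanged.
            hits[0]["members"].append(c)
        else:
            # Case c: c connects several valleys, which are merged into one.
            merged = {"reps": [], "members": [c]}
            for v in hits:
                merged["reps"] += v["reps"]
                merged["members"] += v["members"]
            valleys = [v for v in valleys if not any(v is h for h in hits)]
            valleys.append(merged)
    return valleys
```

With a test outcome patterned loosely after the example run (paths found only for the pairs $(c^1,c^3)$, $(c^1,c^4)$, $(c^2,c^4)$ and $(c^2,c^5)$, an assumption for this illustration), the states $c^1$ to $c^5$ end up in one valley represented by $c^1$ and $c^2$, and $c^6$ forms a valley of its own.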
Since the probability $p_f$ that a path is found between two ground states belonging to
the same valley is smaller than one, how can we be sure that the valley construction
has been performed correctly? Hence, how can we avoid that a number of states
belonging to one large valley are divided into two or several valleys, as a result of
some paths which have not been detected? If we only have a small number of states
available, it is indeed very likely that the large valley is not identified correctly, see
Fig. 9.24. If more states of the valley are available, it is more likely that all ground
states are identified as being members of the same valley. The reason is that BS finds
a path to neighbors which are close in phase space with a high probability. Now the
probability that some path between two ground states is found increases due to the
growing number of possible paths. It is always possible to generate additional ground
Figure 9.23: Example run of the algorithm for the identification of ground-state
valleys; 6 ground states are processed. For details, see text. (Recoverable panel labels:
BS($c^5$, $c^1$) = no, BS($c^5$, $c^2$) = yes.)
states within each valley, so the probability that everything is done correctly can be
increased arbitrarily close to one. It is very easy to obtain a 99.9% level of certainty;
for details see [91].
So far we have explained the basic idea of the post-processing method and presented
the BS algorithm which allows a number of ground states to be divided into valleys.
Also the MC algorithm has been given which ensures that within each valley all ground
states have the same probability of being selected. The final part which is missing is
a technique that allows the size of a valley to be estimated. This allows each valley to
be considered with its proper weight, which is proportional to its size.
A method similar to BS is used to estimate the sizes of the valleys. Starting from
an arbitrary state $\{\sigma_i\}$ in a valley $C$, free spins are flipped iteratively, but each spin
not more than once. During the iteration additional free spins may be generated and
other spins may become fixed. When there are no more free spins left which have not
been flipped already, the process stops. Thus, one has constructed a straight path in
Figure 9.24: If two states, belonging to the same valley, are far apart in state space
it is very unlikely that the BS algorithm detects that they belong to the same valley. If
more states of the valley are available, the probability for the correct answer increases.
The thickness of the lines indicates the probability that a path between two ground
states is found by BS.
state space from the ground state to the border of the valley $C$. The number of spins
that has been flipped is counted. By averaging over several trials and several ground
states of a valley, one obtains an average value $l_{max}$, which is a measure of the size of
the valley.
For small ($d = 3$) system sizes $L = 3, 4, 5$ it is easy to obtain all ground states by
performing many runs of the genetic CEA algorithm. Thus, the valley sizes can be
calculated exactly just by counting. Fig. 9.25 displays [91] the average size $V$ of a
valley as a function of $l_{max}$. An exponential dependence is found, yielding
$V = 2^{\alpha l_{max}}$    (9.11)
with $\alpha = 0.90(5)$. The deviation from the pure exponential behavior for the largest
valleys of each system size should be a finite-size effect. Similar measurements can be
performed also for two- and four-dimensional systems.
Another method could be just to count the static number $n_f$ of free spins. This is
slightly simpler to implement, but it has turned out that the quantity $l_{\max}$ describes
the size of a valley better than $n_f$. The reason is that by flipping spins additional free
spins are created and deleted. Consider for example a one-dimensional chain of N
ferromagnetically coupled spins with antiperiodic boundary conditions. Each ground
state consists of two linear domains of spins. In each domain all spins have the same
orientation. For each ground state there are just two free spins, but all 2N ground
states belong to the same valley. The possibility of similar but more complicated
ground-state topologies is taken into account when using the quantity $l_{\max}$.
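The chain example can be checked by brute force. The following is a minimal sketch, assuming that the antiperiodic boundary is represented by a single bond of strength -1 closing a ring of otherwise ferromagnetic bonds, and that a free spin is a spin which can be flipped without changing the energy. All function names are illustrative only.

```python
from itertools import product

def chain_energy(spins):
    """Energy of a ferromagnetic ring (J = +1) with one antiperiodic bond (J = -1)."""
    n = len(spins)
    energy = 0
    for i in range(n):
        # Assumption: the bond closing the ring carries the antiperiodic coupling.
        coupling = -1 if i == n - 1 else 1
        energy -= coupling * spins[i] * spins[(i + 1) % n]
    return energy

def ground_states(n):
    """All configurations of minimum energy, found by complete enumeration."""
    configs = list(product((-1, 1), repeat=n))
    e_min = min(chain_energy(c) for c in configs)
    return [c for c in configs if chain_energy(c) == e_min]

def free_spins(spins):
    """Indices of spins that can be flipped without changing the energy."""
    e0 = chain_energy(spins)
    free = []
    for i in range(len(spins)):
        flipped = spins[:i] + (-spins[i],) + spins[i + 1:]
        if chain_energy(flipped) == e0:
            free.append(i)
    return free

def valley_of(start, states):
    """Ground states reachable from start by flipping one free spin at a time."""
    allowed = set(states)
    seen = {start}
    stack = [start]
    while stack:
        s = stack.pop()
        for i in free_spins(s):
            t = s[:i] + (-s[i],) + s[i + 1:]
            if t in allowed and t not in seen:
                seen.add(t)
                stack.append(t)
    return seen

n = 6
gs = ground_states(n)
print(len(gs), {len(free_spins(s)) for s in gs}, len(valley_of(gs[0], gs)))
```

For N = 6 the enumeration gives 2N = 12 ground states with two free spins each, and all of them are connected by single flips of free spins, so counting free spins alone would strongly underestimate the size of the valley.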
With relation (9.11) one can obtain for each valley an estimate of its size, even in the
case when only a small number of ground states are available. Using these sizes one can
draw ground states in such a way that each valley is represented with its proper weight.
The selection is done in a manner such that many small valleys may contribute as a
collection as well; e.g. assume that 100 states should be drawn from a valley consisting
of $10^{10}$ ground states; then for a set of 500 valleys of size $10^7$ each, a total number
of 50 states is selected. This is achieved automatically by first sorting the valleys in
ascending order. Then the generation of states starts with the smallest valley. For
each valley the number of states generated is proportional to its size multiplied by a
factor $f$. If the number of states grows too large, only a certain fraction $f_2$ of the
states which have already been selected is kept, the factor is recalculated ($f \leftarrow f \cdot f_2$)
and the process continues with the next valley.
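The selection scheme can be sketched as follows. This is only an illustration under several assumptions that are not taken from the text: the factor starts at f = 1, the check for "too many states" is made before the states of a valley are generated, fractional target numbers are rounded at random, and the states of a valley are drawn from a given list of sample states. The names `select_states`, `limit` and `seed` are hypothetical.

```python
import random

def select_states(valleys, limit, f2=0.5, seed=0):
    """valleys: list of (size, sample_states). Returns (selected_states, final_f)."""
    rng = random.Random(seed)
    f = 1.0
    selected = []
    # Start with the smallest valley.
    for size, states in sorted(valleys, key=lambda v: v[0]):
        # If this valley would push the total over the limit, keep only a
        # fraction f2 of the states selected so far and reduce f accordingly.
        while len(selected) + size * f >= limit:
            selected = [s for s in selected if rng.random() < f2]
            f *= f2
        # Number of states for this valley is proportional to its size.
        target = size * f
        count = int(target)
        if rng.random() < target - count:  # random rounding of the remainder
            count += 1
        selected.extend(rng.choice(states) for _ in range(count))
    return selected, f

# 500 small valleys of size 10^7 and one large valley of size 10^10.
valleys = [(10**7, ["small"])] * 500 + [(10**10, ["big"])]
selected, f = select_states(valleys, limit=200)
print(selected.count("small"), selected.count("big"), f)
```

Because states selected earlier are thinned by the same fraction by which f is reduced, every valley ends up with an expected number of states proportional to its size times the final f, so the 500 small valleys together contribute about half as many states as the large one.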
When again measuring the frequency of occurrence of each ground state for small
systems, now after the post-processing step, indeed a flat distribution is obtained.
Hence, now we have all the tools available to investigate the ground-state landscape
of Ising spin glasses thermodynamically correctly. In the next section some results
obtained with these methods are presented. We close this section with a summary of the
post-processing method. Input is the realization $\{J_{ij}\}$ of the bonds and a set $\{c^k\}$ of
ground states.
algorithm post-processing({J_ij}, {c^k})
begin
    divide configurations {c^k} into different valleys with BS;
    perform n_RUN runs of a T = 0 MC simulation
    calculate valley sizes;
    select sample of ground states according to valley sizes;
    return ground states;
end
9.5 Results
We finish this chapter by presenting some results [92] which were obtained with the
combination of the genetic CEA and the method for ensuring the correct thermodynamic
distribution. Since we are interested in the ground-state landscape, the distribution
P(q) of overlaps is a suitable quantity to study. You may remember from the
first section the question of whether the mean-field scenario or the Droplet picture
describes the behavior of realistic (i.e. three-dimensional) spin glasses correctly. By
studying the finite-size behavior of P(q), i.e. the width as a function of the system size,
we will be able to show that the mean-field scenario does not hold for the ground states
of three-dimensional spin glasses. But the behavior turns out to be more complex than
predicted by a naive version of the Droplet theory.
For this purpose, ground states were generated using genetic CEA for sizes
L ∈ [3, ..., 14]. The number of random realizations per lattice size ranged from 100
realizations for L = 14 up to 1000 realizations for L = 3.
Each run resulted in one configuration which was stored, if it exhibited the ground-state
energy. For the smallest sizes L = 3, 4 all ground states were calculated for each
realization by performing up to $10^4$ runs. For larger sizes it is not possible to obtain
all ground states, because of the exponentially rising degeneracy. For L = 5, 6, 8 in
fact almost all valleys are obtained using at most $10^4$ runs [91]; only for about 25% of
the L = 8 realizations may some small valleys have been missed.
For L > 8 not only the number of states but also the number of valleys grows too
large; consequently, only 40 independent runs were made for each realization. For
L = 14 this resulted in an average of 13.8 states per realization having the lowest
energy while for L = 10 on average 35.3 states were stored. This seems a rather small
number. However, the probability that genetic CEA calculates a specific ground state
increases (sublinearly) with the size of the valley the state belongs to [88]. Thus,
ground states from small valleys do appear with a small probability. Because the
behavior is dominated by the largest valleys, the results shown later on are the same
(within error bars) as if all ground states were available. This was tested explicitly for
100 realizations of L = 10 by doubling the number of runs, i.e. increasing the number
of valleys found.
Using this initial set of states for each realization (L > 4) a second set was produced
using the techniques explained before, which ensures that each ground state enters
the results with the same weight. The number of states was chosen in such a way that
$n_{\max} = 100$ states were available for the largest valleys of each realization, i.e. a single
valley smaller than one hundredth of the largest valley does not contribute to physical
quantities, but, as explained before, a collection of many small valleys contributes to
the results as well. Finally, it was verified that the results did not change by increasing
$n_{\max}$.
The order parameter selected here for the description of the complex ground-state
behavior of spin glasses is the total distribution P(|q|) of overlaps. The result is shown
in Fig. 9.26 for L = 6, 10. The distributions are dominated by a large peak for q > 0.8.
Additionally there is a long tail down to q = 0, which means that arbitrarily different
ground states are possible. Qualitatively the result is similar to the P(|q|) obtained
Figure 9.26: Distribution P(|q|) of overlaps for L = 6, 10. Each ground state enters
the result with the same probability. The fraction of small overlaps decreases by about
a factor 0.6 by going from L = 6 to L = 10 (please note the logarithmic scale).
for the SK model for a small but nonzero temperature. But with increasing system
size, the weight of the long tail decreases. To obtain a definite answer we have to
extrapolate to very large system sizes.
To study the finite-size dependence of P(|q|), the variance $\sigma^2(|q|)$ was evaluated as
a function of the system size L. The values are displayed in Fig. 9.27. Additionally,
the data points are given which are obtained when the post-processing step is omitted
[93, 94]. Obviously, by guaranteeing that every ground state has the same weight,
the result changes dramatically. To extrapolate to $L \to \infty$, a fit of the data to
$\sigma^2_L = \sigma^2_\infty + a_0 L^{-a_1}$ was performed. A slightly negative value of
$\sigma^2_\infty = -0.01(1)$ was
obtained, indicating that the width of P(|q|) is zero for the infinite system. For the
plot a double-logarithmic scale was used. The fact that the data points are found to be
more or less on a straight line is another indication that the width of P(|q|) converges
towards zero in the thermodynamic limit $L \to \infty$. Consequently, the MF picture with
a continuous breaking of replica symmetry, which predicts a distribution of overlaps
with finite width, cannot be true for the ground-state landscape of three-dimensional
±J spin glasses. Please note that only a small range of system sizes could be treated.
Unfortunately, due to the NP-hardness of the ground-state calculation for spin glasses,
larger sizes are not feasible at the moment.
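To make the measured quantity concrete, the following minimal sketch evaluates the overlaps and the variance of |q| for a handful of configurations. It assumes the usual definition of the overlap between two configurations a and b of N spins, q = (1/N) Σ_i a_i b_i; the four example states are arbitrary and not ground states of any particular realization.

```python
from itertools import combinations

def overlap(a, b):
    """Overlap q = (1/N) * sum_i a_i * b_i of two spin configurations."""
    return sum(x * y for x, y in zip(a, b)) / len(a)

def abs_overlaps(states):
    """|q| for all pairs of different states."""
    return [abs(overlap(a, b)) for a, b in combinations(states, 2)]

def variance(values):
    """Variance of a list of numbers (normalized by the number of values)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

# Arbitrary example states of N = 4 spins.
states = [(1, 1, 1, 1), (1, 1, 1, -1), (-1, -1, -1, -1), (1, -1, 1, -1)]
qs = abs_overlaps(states)
print(qs, variance(qs))
```

In an actual analysis the list of states would be the equally weighted sample of ground states of one realization, and the variance would additionally be averaged over the realizations for each system size L.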
By collecting all results, see also [92], one obtains the following description of the
shape of P(|q|). It consists of a large delta peak and a tail down to q = 0, but the
weight of that tail goes to zero with lattice size going to infinity. This expression is
used to point out that by going to larger sizes small overlaps still occur: the number
of arbitrarily different ground states diverges [91]. But the size of the largest valleys,
which determine the self overlap leading to the large peak, diverges even faster. The
delta peak is centered around a finite value $q_{EA}$. From further evaluation of the results
$q_{EA} = 0.90(1)$ was obtained.
[Figure 9.27: variance of P(|q|) as a function of the system size L; legend: genetic CEA, true TD.]
It should be stressed that this just tells us what the structure of the ground-state
landscape looks like. To finally decide whether the mean-field-like description or
the Droplet picture describes the behavior of realistic spin glasses better, one has
to study small but finite temperatures as well. This can be done by extending the
methods described here to finite temperatures T > 0. Using the genetic CEA method,
excited states can be generated even faster than true ground states: smaller population
sizes and fewer minimization steps are sufficient. The configurations obtained
in this way can be divided into different valleys in the same way as has been done
for the ground states. The only difference is that one has to take several different
energy levels into account and one has to weight each excited state with its proper
thermodynamic weight. Work in this direction is in progress.
Another approach to generate excited states has been presented recently [95] for the
Bibliography
[1] K. Binder and A.P. Young, Rev. Mod. Phys. 58, 801 (1986)
[2] M. Mezard, G. Parisi, and M.A. Virasoro, Spin Glass Theory and Beyond, (World
Scientific, Singapore 1987)
[3] K.H. Fisher and J.A. Hertz, Spin Glasses, (Cambridge University Press, Cam-
bridge 1991)
[4] J.A. Mydosh, Spin Glasses: an Experimental Introduction, (Taylor and Francis,
London 1993)
[5] A.P. Young (ed.), Spin Glasses and Random Fields, (World Scientific, Singapore
1998)
[6] S.F. Edwards and P.W. Anderson, J. de Phys. 5, 965 (1975)
[7] G. Toulouse, Commun. Phys. 2, 115 (1977)
[8] Y. Cannella and J.A. Mydosh, Phys. Rev. B 6, 4220 (1972)
[9] P. Norblad and P. Svendlidh, in: A.P. Young (ed.), Spin Glasses and Random
Fields, (World Scientific, Singapore 1998)
[10] H. Rieger, in D. Stauffer (ed.): Annual Reviews of Computational Physics II,
295-341, World Scientific, 1995
[11] D. Sherrington and S. Kirkpatrick, Phys. Rev. Lett. 51, 1791 (1975)
[12] R. Rammal, G. Toulouse, and M.A. Virasoro, Rev. Mod. Phys. 58, 765 (1986)
[13] W.L. McMillan, J. Phys. C 17, 3179 (1984)
[14] A.J. Bray and M.A. Moore, J. Phys. C 18, L699 (1985)
[15] D.S. Fisher and D.A. Huse, Phys. Rev. Lett. 56, 1601 (1986)
[16] D.S. Fisher and D.A. Huse, Phys. Rev. B 38, 386 (1988)
[17] A. Bovier and J. Frohlich, J. Stat. Phys. 44, 347 (1986)
[18] F. Barahona, J. Phys. A 15, 3241 (1982)
[19] F. Barahona, R. Maynard, R. Rammal, and J.P. Uhry, J. Phys. A 15, 673 (1982)
[20] J. Bendisch, J. Stat. Phys. 62, 435 (1991)
[21] J. Bendisch, J. Stat. Phys. 67, 1209 (1992)
[22] J. Bendisch, Physica A 202, 48 (1994)
[23] J. Bendisch, Physica A 216, 316 (1995)
[24] J. Kisker, L. Santen, M. Schreckenberg, and H. Rieger, Phys. Rev. B 53, 6418
(1996)
[25] N. Kawashima and H. Rieger, Europhys. Lett. 39, 85 (1997)
[26] J. Bendisch, Physica A 245, 560 (1997)
[27] J. Bendisch and H. v. Trotha, Physica A 253, 428 (1998)
[28] M. Achilles, J. Bendisch, and H. v. Trotha, Physica A 275, 178 (2000)
[29] Y. Ozeki, J. Phys. Soc. J. 59, 3531 (1990)
[30] T. Kadowaki, Y. Nonomura, H. Nishimori, in: M. Suzuki and N. Kawashima
(ed.), Hayashibara Forum '95. International Symposium on Coherent Approaches
to Fluctuations, (World Scientific, Singapore 1996)
[31] R.G. Palmer and J. Adler, Int. J. Mod. Phys. C 10, 667 (1999)
[32] J.C. Angles d'Auriac, M. Preissmann, and A.S. Leibniz-Imag, Math. Comp. Mod.
26, 1 (1997)
[33] H. Rieger, in J. Kertesz and I. Kondor (ed.): Advances in Computer Simulation,
Lecture Notes in Physics 501, Springer, Heidelberg, 1998
[34] A. Hartwig, F. Daske, and S. Kobe, Comp. Phys. Commun. 32, 133 (1984)
[35] T. Klotz and S. Kobe, J. Phys. A 27, L95 (1994)
[36] T. Klotz and S. Kobe, Act. Phys. Slov. 44, 347 (1994)
[37] T. Klotz-T and S. Kobe, in: M. Suzuki and N. Kawashima (ed.), Hayashibara
Forum '95. International Symposium on Coherent Approaches to Fluctuations,
(World Scientific, Singapore 1996)
[38] T. Klotz and S. Kobe, J. Magn. Magn. Mat. 17, 1359 (1998)
[39] P. Stolorz, Phys. Rev. B 48, 3085 (1993)
[40] E.E. Vogel, J. Cartes, S. Contreras, W. Lebrecht, and J. Villegas, Phys. Rev. B
49, 6018 (1994)
[41] A.J. Ramirez-Pastor, F. Nieto, and E.E. Vogel, Phys. Rev. B 55, 14323
[42] E.E. Vogel and J. Cartes, J. Magn. Magn. Mat. 17, 777
[43] E.E. Vogel, S. Contreras, M.A. Osorio, J. Cartes, F. Nieto, and A.J. Ramirez-
Pastor, Phys. Rev. B 58, 8475 (1998)
[44] E.E. Vogel, S. Contreras, F. Nieto, and A.J. Ramirez-Pastor, Physica A 257, 256
(1998)
[45] J.F. Valdes, J. Cartes, E.E. Vogel, S. Kobe, and T. Klotz, Physica A 257, 557
(1998)
[46] C. De Simone, M. Diehl, M. Junger, P. Mutzel, G. Reinelt, and G. Rinaldi, J.
Stat. Phys. 80, 487 (1995)
[47] C. De Simone, M. Diehl, M. Junger, P. Mutzel, G. Reinelt, and G. Rinaldi, J.
Stat. Phys. 84, 1363 (1996)
[48] H. Rieger, L. Santen, U. Blasum-U, M. Diehl, M. Junger, and G. Rinaldi, J. Phys.
A 29, 3939 (1996)
[49] M. Palassini and A.P. Young, Phys. Rev. B 60, R9919 (1999)
[50] D. Stauffer and P.M. Castro-de-Oliveira, Physica A 215, 407 (1995)
[51] P. Ocampo-Alfaro and Hong-Guo, Phys. Rev. E 53, 1982 (1996)
[52] B.A. Berg and T. Celik, Phys. Rev. Lett. 69, 2292 (1992)
[53] B.A. Berg and T. Celik, Int. J. Mod. Phys. C 3, 1251 (1992)
[54] B.A. Berg, T. Celik, and U.H.E. Hansmann, Europhys. Lett. 22, 63 (1993)
[55] T. Celik, Nucl. Phys. B Proc. Suppl. 30, 908
[56] B.A. Berg, U.E. Hansmann, and T. Celik, Phys. Rev. B 50, 16444 (1994)
[57] B.A. Berg, U.E. Hansmann, and T. Celik, Nucl. Phys. B Suppl. 42, 905 (1995)
[58] H. Freund and P. Grassberger, J. Phys. A 22, 4045 (1989)
[59] N. Kawashima and M. Suzuki, J. Phys. A 25, 1055 (1992)
[60] D. Petters, Int. J. Mod. Phys. C 8, 595 (1997)
[74] M. Palassini and A.P. Young, Phys. Rev. Lett. 85, 3333 (2000)
[75] J. Houdayer and O.C. Martin, Phys. Rev. Lett. 83, 1030 (1999)
[76] J. Houdayer and O.C. Martin, Phys. Rev. Lett. 82, 4934 (1999)
[77] J. Houdayer and O.C. Martin, Phys. Rev. Lett. 84, 1057 (2000)
[78] A.K. Hartmann, Phys. Rev. E 59, 84 (1999)
[79] A.K. Hartmann, Physica A 224, 480 (1996)
[80] K.F. Pal, Biol. Cybern. 73, 335 (1995)
[81] A.K. Hartmann, Eur. Phys. J. B 8, 619 (1999)
[82] A.K. Hartmann, Phys. Rev. E 60, 5135 (1999)
[83] L.E. Reichl, A Modern Course in Statistical Physics, (John Wiley & Sons, New
York 1998)
[84] A.J. Bray and M.A. Moore, J. Phys. C 17, L463 (1984)
[85] W.L. McMillan, Phys. Rev. B 30, 476 (1984)
[87] E. Marinari, G. Parisi, and J.J. Ruiz-Lorenzo, in: A.P. Young (ed.), Spin Glasses
and Random Fields, (World Scientific, Singapore 1998)
[88] A.K. Hartmann, Physica A 275, 1 (1999)
[89] A. Sandvic, Europhys. Lett. 45, 745 (1999)
[90] A.K. Hartmann, Europhys. Lett. 45, 747 (1999)
[91] A.K. Hartmann, J. Phys. A 33, 657 (2000)
[92] A.K. Hartmann, Eur. Phys. J. B 13, 591 (2000)
[93] A.K. Hartmann, Europhys. Lett. 40, 429 (1997)
[94] A.K. Hartmann, Europhys. Lett. 44, 249 (1998)
[95] F. Krzakala and O.C. Martin, Phys. Rev. Lett. 85, 3013 (2000)
[96] M. Palassini and A.P. Young, Phys. Rev. Lett. 85, 3017 (2000)
[97] E. Marinari and G. Parisi, Phys. Rev. Lett. 86, 3887 (2001)
[98] J. Houdayer and O.C. Martin, Europhys. Lett. 49, 794 (2000)
10 Matchings
After introducing spin glasses and discussing general approximation algorithms for
ground states in Chap. 9, we now turn to two-dimensional systems. We will first show
how ground states of certain two-dimensional spin glasses can be calculated by mapping
the problem onto a matching problem. Next, a general introduction to matching
problems is given. In the third section, the foundation of all matching algorithms, the
augmenting path theorem, is presented. In the central section, algorithms for different
types of matching problems are explained. Finally, an overview of some results for spin
glasses is given.
Let us for example put all spins pointing upwards, so that each negative bond will be
unsatisfied and thus cut by an energy string [Fig. 10.1 (a)]. Unfrustrated plaquettes
have by definition an even number of strings crossing their boundary, therefore energy
lines always enter and leave unfrustrated plaquettes. Frustrated plaquettes on the
other hand have an odd number of strings, and therefore one string must begin or end
in each frustrated plaquette. These observations hold for any spin configuration.
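The parity argument can be verified numerically. The sketch below assumes a square lattice with open boundaries and ±J couplings, stored in two hypothetical arrays `jh` (horizontal bonds) and `jv` (vertical bonds); a bond is counted as unsatisfied, i.e. cut by an energy string, if J s_i s_j < 0.

```python
import random

def make_lattice(size, rng):
    """Random +-1 couplings: jh[y][x] joins (x, y)-(x+1, y), jv[y][x] joins (x, y)-(x, y+1)."""
    jh = [[rng.choice((-1, 1)) for _ in range(size - 1)] for _ in range(size)]
    jv = [[rng.choice((-1, 1)) for _ in range(size)] for _ in range(size - 1)]
    return jh, jv

def is_frustrated(jh, jv, x, y):
    """A plaquette is frustrated if the product of its four couplings is negative."""
    return jh[y][x] * jh[y + 1][x] * jv[y][x] * jv[y][x + 1] < 0

def strings_crossing(jh, jv, spins, x, y):
    """Number of unsatisfied bonds (energy strings) on the boundary of plaquette (x, y)."""
    bonds = [
        (jh[y][x], spins[y][x], spins[y][x + 1]),              # lower edge
        (jh[y + 1][x], spins[y + 1][x], spins[y + 1][x + 1]),  # upper edge
        (jv[y][x], spins[y][x], spins[y + 1][x]),              # left edge
        (jv[y][x + 1], spins[y][x + 1], spins[y + 1][x + 1]),  # right edge
    ]
    return sum(1 for j, a, b in bonds if j * a * b < 0)

rng = random.Random(1)
size = 5
jh, jv = make_lattice(size, rng)
spins = [[rng.choice((-1, 1)) for _ in range(size)] for _ in range(size)]
parity_ok = all(
    (strings_crossing(jh, jv, spins, x, y) % 2 == 1) == is_frustrated(jh, jv, x, y)
    for y in range(size - 1)
    for x in range(size - 1)
)
print(parity_ok)
```

The check holds for every spin configuration because each spin of a plaquette enters the product of the four terms J s_i s_j exactly twice, so that product equals the product of the couplings alone.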
If boundary conditions are open or fixed, some of the energy strings can end at the
boundary. To unify the case of (partly) periodic/non-periodic boundary conditions,
it is possible to introduce "external" plaquettes for each side with open boundary
conditions. In that case plaquettes always occur in pairs. One says the plaquettes
belonging to a pair are matched.
¹ Or with periodic boundary conditions in one direction and free boundary conditions in the other.
The general rule is that the graph must be planar, i.e. it is possible to draw the graph in a plane
without crossing edges.
Figure 10.1: Thick lines represent negative couplings. Energy strings (dotted) are
drawn perpendicular to each unsatisfied coupling. Frustrated plaquettes (odd number
of negative couplings) are marked by a dot. a) All spins up configuration. b) A ground
state. c) Another ground state.
From Eq. (10.1), finding a ground state, i.e. a state of minimum energy, is equivalent
to finding a minimum length perfect matching, as we define it below, in the graph
of frustrated plaquettes [1, 2, 4]. For the definition of frustration, see Chap. 9. If
$|J_{ij}| = J$, i.e. all interactions have the same strength, Fig. 10.1 (b) shows one possible
ground state. An equivalent ground state is obtained by flipping all spins inside the
gray area in Fig. 10.1 (c), since the numbers of satisfied and unsatisfied bonds along its
contour are equal. Degenerate ground states are related to each other by the flipping
of irregularly shaped clusters, which have an equal number of satisfied and unsatisfied
bonds on their boundary.
If the amplitudes of the interactions are also random, this large degeneracy of the
ground state will in general be lost, but it is easy to see that there will be a large number
of low-lying excited states, i.e. spin configurations which differ from the ground state
in the flipping of a cluster such that the "length" of unsatisfied bonds on its boundary
almost cancels the length of satisfied ones.
Please note, in the case of a two-dimensional spin glass with periodic boundary condi-
tions in all directions, or in the presence of an external field, the ground-state problem
becomes NP-hard.
is matched, then u and v are called mates. An alternating path is a path along which
the edges are alternately matched and unmatched. A bipartite graph is a graph which
can be subdivided into two sets of vertices, say X and Y, such that the edges in the
graph (i, j) only connect vertices in X to vertices in Y, with no edges internal to X
or Y. Nearest-neighbor hypercubic lattices are bipartite, while the triangular and
face-centered-cubic lattices are not bipartite.
Example: Matching
In Fig. 10.2 a sample graph is shown. Please note that the graph is bipartite
with vertex sets X = {1, 2, 3} and Y = {4, 5, 6}. Matched edges
are indicated by thick lines. The matching shown in the left half is
M = {(1,4), (2,5)}. This means, e.g. edge (1,4) is matched, while edge (3,4)
is free. Vertices 1, 2, 4 and 5 are covered, while vertices 3 and 6 are exposed.
In the right half of the figure, a perfect matching M = {(1,5), (2,6), (3,4)}
is shown, i.e., there are no exposed vertices.
An example of an alternating path is shown in the left part of Fig. 10.3.
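The notions of this example can be written down directly. In the following minimal sketch the edge set of the graph is an assumption: it contains only the five edges mentioned in the example, since the complete graph of Fig. 10.2 cannot be read off from the text. The function names are illustrative.

```python
def is_matching(edges, matching):
    """True if all matched edges belong to the graph and no vertex is used twice."""
    ends = [v for edge in matching for v in edge]
    return set(matching) <= set(edges) and len(ends) == len(set(ends))

def exposed_vertices(vertices, matching):
    """Vertices that are not covered by any edge of the matching."""
    covered = {v for edge in matching for v in edge}
    return sorted(v for v in vertices if v not in covered)

vertices = [1, 2, 3, 4, 5, 6]
# Assumed edge set: only the edges named in the example above.
edges = [(1, 4), (1, 5), (2, 5), (2, 6), (3, 4)]
m_left = [(1, 4), (2, 5)]
m_right = [(1, 5), (2, 6), (3, 4)]

print(is_matching(edges, m_left), exposed_vertices(vertices, m_left))
print(is_matching(edges, m_right), exposed_vertices(vertices, m_right))
```

A matching is perfect exactly when the list of exposed vertices is empty, as for the second matching.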
problem rather than the optimization problems we consider here. As a general rule,
graph enumeration problems are harder than graph-optimization problems.
Owing to the fact that all cycles on bipartite graphs have an even number of edges,
matching on bipartite graphs is considerably easier than matching on general graphs.
In addition, maximum-cardinality matching and maximum-weight matching on bipartite
graphs can be easily related to the maximum-flow and minimum-cost-flow
problems discussed in Chaps. 6 and 7, respectively. Matching on general
graphs and maximum/minimum perfect matching are more complicated. Thus,
after presenting a fundamental theorem in the next section, in Sec. 10.4 matching
algorithms for different types of problems are explained.
Proof: (i) $\Rightarrow$ is trivial. To prove $\Leftarrow$, assume M is not maximum. Then some matching
M' must exist with |M'| > |M|. Consider now the graph G' whose edge set is E' =
$M \triangle M'$ (the symmetric difference of M and M',
$M \triangle M' = (M \setminus M') \cup (M' \setminus M)$).
Clearly each node of G' is incident to at most one edge of M and at most one edge
of M'. Therefore nodes in G' have at most two incident edges and the connected
components must be either simple paths or cycles of even length, and all paths are
alternating paths. In all cycles we have the same number of edges from M as from M',
so we can forget them. But since |M'| > |M| there must be at least one path in G'
with more edges from M' than from M. This path must necessarily be an augmenting
path.
(ii) $\Rightarrow$ is again trivial. Assume M is not maximum. Some matching M' must therefore
exist with c(M') > c(M). Consider again G' = (X, $M \triangle M'$). By the same reasoning
as before, we conclude that G' must contain at least one alternating path or cycle of
positive weight. QED
Figure 10.4: a) An initial matching is given for a bipartite graph. b) The enlarged
matching obtained after inverting the augmenting path discovered from node v5 (see
Fig. 10.5).
Figure 10.5: a) The BFS or alternating tree built from exposed node v5 in Fig. 10.4a.
Dashed lines represent non-tree edges to already visited nodes. The search finishes
when an exposed node u4 (double circle) is found. b) The auxiliary tree obtained after
removing odd-level nodes and identifying them with their mates. After inverting
the augmenting path {v5, ug, v1, u2, v4, u4}, the enlarged matching in Fig. 10.4b is
obtained.
augmenting path. After inverting it, the augmented matching shown in Fig. 10.4b is
obtained. If no exposed node were found when the BFS ends, then node v5 will never
be matched and can be forgotten because of the following result [9], which is valid for
general graphs:
Theorem:
If there is no augmenting path from node u0 at some stage, then there never will be
an augmenting path from u0.
Since the BFS or alternating tree is the basic structure maintained by matching
algorithms, we introduce a convenient notation. In what follows we denote the odd-level
set of nodes of the alternating tree T by A(T) and the even-level set by B(T). These sets,
beginning with A = ∅, B = {r}, can be built up by iteratively calling the following
procedure; we denote V(T) ≡ A(T) ∪ B(T):
procedure extend-alternating-tree(T, (i, j), M)
begin
    if i ∈ B(T), j ∉ V(T), (j, k) ∈ M then
        add j to A(T), k to B(T);
end
The set V(T) and the edges used in its construction have the structure indicated in
Fig. 10.5a, without the broken lines and without the node u4, which is found to be
exposed. Note that an alternating tree always ends in B-nodes and once an edge in
G has been found having one end in B(T) and the other end not in A(T), we find an
augmenting path.
Using alternating trees, the augmentation of a matching by exchanging matched and
unmatched edges along an augmenting path can be written in the following way;
this procedure will be used later on as well [we denote the edges of a path by E(P)]:
procedure Augment(T, (i, j), M)
begin
    Let r be the root of T;
    Let P be the path in T from r to i plus (i, j);
    replace M by M Δ E(P);
end
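The augmentation step can be sketched in a few lines. The sketch assumes that the alternating tree is stored as a dictionary `parent` mapping each tree node to its predecessor (the root maps to `None`), and that edges are compared as unordered pairs; both are choices made for this illustration only. The example uses the matching M = {(1,4), (2,5)} of the matching example above with root 3 and the edge (2,6) leading to the exposed vertex 6.

```python
def augment(parent, edge, matching):
    """Return M symmetric-difference E(P), where P is the tree path from the root to i plus (i, j)."""
    i, j = edge
    path = [j, i]
    # Walk from i back to the root of the alternating tree.
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    path_edges = {frozenset(e) for e in zip(path, path[1:])}
    # Symmetric difference: matched path edges leave M, unmatched path edges enter M.
    return {frozenset(e) for e in matching} ^ path_edges

# Alternating tree with root 3: 3 - 4 - 1 - 5 - 2 (4 and 5 are A-nodes, 3, 1, 2 are B-nodes).
parent = {3: None, 4: 3, 1: 4, 5: 1, 2: 5}
new_matching = augment(parent, (2, 6), [(1, 4), (2, 5)])
print(sorted(tuple(sorted(e)) for e in new_matching))
```

The result is the perfect matching {(1,5), (2,6), (3,4)}; the matching has grown by exactly one edge because the path starts and ends with an unmatched edge.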
For the algorithms which we present later on, the sets A(T) and B(T) are very
useful. For the maximum cardinality bipartite matching, there is a difference between
the search technique actually applied and the usual BFS, since the searches from
odd-numbered levels are trivial. They always lead to the mate of that node, if the node
itself is not exposed. Therefore the search can in practice be simplified by ignoring
odd-numbered nodes and going directly to their mates, as shown in Fig. 10.5b. The
search for alternating paths can in fact be seen as a usual BFS on an auxiliary graph
from which odd-level nodes have been removed by identifying them with their mates.
The basic augmenting path algorithm is then as follows [10, 11].
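A minimal sketch of such an augmenting-path search for bipartite graphs is given below. It is not a transcription of the algorithm of Refs. [10, 11], but an illustration under stated assumptions: the graph is given as a hypothetical adjacency dictionary `adj` from vertices in X to their neighbors in Y, the vertex labels of X and Y are distinct, and the BFS skips the odd levels by moving directly from a matched Y-vertex to its mate.

```python
from collections import deque

def max_bipartite_matching(x_nodes, adj):
    """Maximum-cardinality matching; returns a dict mapping each matched vertex to its mate."""
    mate = {}
    for root in x_nodes:
        if root in mate:
            continue
        via = {}               # via[j] = B-node from which the Y-vertex j was reached
        queue = deque([root])  # queue of B-nodes (even levels)
        end = None
        while queue and end is None:
            i = queue.popleft()
            for j in adj[i]:
                if j in via:
                    continue
                via[j] = i
                if j not in mate:      # exposed vertex: augmenting path found
                    end = j
                    break
                queue.append(mate[j])  # skip the odd level, go directly to the mate
        # Invert the augmenting path from the exposed end back to the root.
        j = end
        while j is not None:
            i = via[j]
            previous = mate.get(i)     # old mate of i, None at the root
            mate[i] = j
            mate[j] = i
            j = previous
    return mate

# Assumed edges of the sample graph with X = {1, 2, 3} and Y = {4, 5, 6}.
adj = {1: [4, 5], 2: [5, 6], 3: [4]}
mate = max_bipartite_matching([1, 2, 3], adj)
print(mate)
```

If the search from a root ends without finding an exposed vertex, the root is simply left unmatched, which is justified by the theorem stated above. Each search costs O(|E|) time, so this simple variant needs O(|V| |E|) time in total.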
Figure 10.6: The resulting network, after adding vertices s and t to the graph from
Fig. 10.2 and connecting all vertices from X with s and all vertices from Y with t.
Please note that the initial feasible matching can be empty, M = ∅. The best known
implementation for this algorithm is due to Hopcroft and Karp [12]. It runs in time
$O(|E| \sqrt{|V|})$ and is based on doing more than one augmentation in one step. There is
also a simple way in which to map the maximum-cardinality-matching problem to the
maximum-flow problem, see Chap. 6. Let B = (X, Y, E), and define B' by adding a
source node s and a target node t, and connecting all nodes in X to s, and all nodes in
Y to t, by edges of capacity 1, see e.g. Fig. 10.6. Now let all edges e ∈ E have capacity
1. Because of the integer flow theorem, maximum flows in B' are integral. Every flow
of size f thus identifies f matched edges in B, and vice versa. Since maximum flows
in 0-1 networks are computable in $O(|E| \sqrt{|V|})$ time, so are maximum-cardinality
matchings on B.
The mapping of matching problems to flow problems also applies to the
maximum-weight-matching problem on bipartite graphs:
Given an edge-weighted bipartite graph, find a matching for which the sum of the
weights of the edges is maximum.
This is also known as the assignment problem, because it can be identified with
optimal assignment, e.g. of workers to machines, if worker i ∈ Y produces a value $c_{ij}$
working at machine j ∈ X:
Given an n × n matrix, find a subset of elements in the matrix, exactly one element
in each column and one in each row, such that the sum of the chosen elements is
maximal.
In a similar manner to that described in the previous paragraph, this problem can
easily be formulated as a flow problem. Again add a source node s and a sink node t,
connect s to all nodes in U by unit-capacity, zero-cost edges, and connect all nodes in V to t
by unit-capacity, zero-cost edges. Also interpret edge $e_{ij}$ as a directed edge with unit
capacity and cost $\bar c(e_{ij}) = C_{\max} - c_{ij}$, where $C_{\max} > \max_{(i,j)}(c_{ij})$. The solution to
the minimum-cost-flow problem (see Chap. 7) from s to t is then equivalent to the
maximum-weight matching which we seek. Please note that in this case a mapping
onto the maximum-flow problem is not possible, because otherwise it is not possible to
take the edge weights into account and guarantee that each worker works at exactly
one machine. The minimum-weight perfect bipartite matching problem can be solved
in the same way; for this case the weights $c_{ij}$ can be used directly without inversion:
$\bar c(e_{ij}) = c_{ij}$.
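The effect of the cost inversion can be checked on a small instance by complete enumeration. The sketch below does not solve a flow problem; it only illustrates, for an arbitrary 3 × 3 matrix chosen here, that maximizing the original values and minimizing the inverted costs select the same assignment. The function name `best_assignment` is hypothetical.

```python
from itertools import permutations

def best_assignment(matrix, maximize):
    """Try all assignments; p[i] is the column (machine) assigned to row (worker) i."""
    n = len(matrix)

    def total(p):
        return sum(matrix[i][p[i]] for i in range(n))

    choose = max if maximize else min
    return choose(permutations(range(n)), key=total)

# Arbitrary example values c_ij.
c = [[4, 1, 3],
     [2, 0, 5],
     [3, 2, 2]]
c_max = max(max(row) for row in c) + 1          # any C_max larger than all c_ij
inverted = [[c_max - v for v in row] for row in c]

print(best_assignment(c, maximize=True), best_assignment(inverted, maximize=False))
```

Both calls return the assignment (0, 2, 1) with total value 11. This works because every perfect assignment contains exactly n entries, so the inverted total is n C_max minus the original total.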
Any vector x which fulfills these conditions is called feasible. The optimization task to
find a minimum perfect matching can be written as a linear program (LP)
minimize $\sum_{(i,j) \in E} c_{ij} x_{ij}$
subject to (10.2)
Linear programming is the problem of optimizing (minimizing or maximizing) a linear
cost function while satisfying a set of linear equality and/or inequality constraints.
The full subject of linear optimization is beyond the scope of this work, but there are
many books devoted to it (cf. Ref. [10]) and its applications. Most of the problems
described in this book can be cast as linear or, more generally, convex-programming
problems. The simplex method, published by Dantzig in 1949, solves a standard LP
minimize $c^t x$
subject to $B^t y \le c$.
Please note that the sign of the variables $y_i$ is not restricted and that the dual of the
dual is the primal. For the matching problem (10.3) we get
maximize $\sum_i y_i$
subject to $y_i + y_j \le c_{ij}$
We will now give a little background information; for a comprehensive presentation
we refer to the literature. In LP theory, it can be shown that the optimum values
$c^t x$ and $b^t y$ agree. The idea behind duality is that when solving a primal LP, the
algorithm always keeps feasible solutions while stepwise decreasing the value of $c^t x$,
thus keeping an upper bound. On the other hand, when solving the dual problem,
which is a maximization problem, a lower bound $b^t y$ is always kept. Thus, one can
treat a primal and a dual problem in parallel by iteratively increasing the lower bound
and decreasing the upper bound. The solution is found when upper and lower bounds
agree.
The simplex technique does not guarantee to run in polynomial time. There is a polynomial
algorithm to solve (LP), the ellipsoid method. But in practice, the simplex method is usually faster,
so it is used.
Another important result, which we need for the Hungarian algorithm, is that the
following orthogonality conditions are necessary and sufficient for optimality of the
primal and dual solutions, here directly written in the form suitable for our application
($\forall i \in Y$ and $\forall j \in X$):
In Fig. 10.7, a bipartite graph, a minimum-weight perfect matching and the corre-
sponding dual solution are shown. Please note that conditions yi + yj = cij are only
necessary for matched edges, not sufficient.
Figure 10.7: A bipartite graph. The numbers next to the edges are the weights $c_{ij}$.
The minimum-weight perfect matching is indicated by bold lines. The numbers next
to the vertices are the values $y_i$ of the dual solution. Zero values are not shown. Edges
not belonging to the matching, but having $y_i + y_j = c_{ij}$, are indicated by broken lines.
The weight of the minimum perfect matching (= 17) equals the value $\sum y_i$ of the dual
solution.
Next, we introduce a convenient notation. Given a vector $y \in \mathbb{R}^V$ and an edge (i, j) ∈ E,
we denote by $\bar c_{ij} = \bar c_{ij}(y)$ the difference $c_{ij} - (y_i + y_j)$. Thus, y is feasible in (10.6)
if and only if $\bar c_{ij} \ge 0$ for all (i, j) ∈ E. In this case we denote by $E_=$ [or $E_=(y)$] the set
$E_= = \{(i,j) \in E \mid \bar c_{ij} = 0\}$; (10.8)
its elements are the equality edges with respect to y.
If x is the characteristic vector of a perfect matching M of G, the optimality conditions
are equivalent to
$M \subseteq E_=$ (10.9)
We will now explain the basic idea of the Hungarian algorithm. Given a feasible
solution y to (10.6), we can now use the maximum cardinality bipartite matching
algorithm described above to search for a perfect matching having this property. If we
succeed, we have a perfect matching whose characteristic vector is optimal for (10.3),
as required. Otherwise the algorithm will deliver a matching M of $G_= = (V, E_=)$ and
an M-alternating (BFS) tree T such that its B-nodes are joined by equality edges only
to A-nodes (in Fig. 10.5 the A-nodes are those indicated by $u_i$ and the B-nodes by
$v_i$). See the example in Fig. 10.8, where a perfect matching exists, but with respect
to the current y (indicated at the nodes), it is not completely contained in $E_=$.
Figure 10.8: The graph of Fig. 10.7 with a preliminary matching. Vertices e and h
are exposed, h is the current root of the alternating tree, which is shown on the right.
By increasing the values at the B-nodes b, c, h and g by ε = 3 and decreasing them by ε at
the A-nodes a, f and d, the edge (b, e) joins $E_=$ and $\sum y_i$ attains the maximum value.
The final result of Fig. 10.7 is obtained by inverting the alternating path h, f, c, a, b, e.
In that case there is a natural way to change y, keeping in mind that we would like
edges of M and $\mathcal{T}$ to remain in $E_=$, and that we would like $\bar c_{ij}$ to decrease for edges
(i, j) joining nodes in $B(\mathcal{T})$ to nodes not in $A(\mathcal{T})$. Namely, we increase $y_i$ by $\epsilon > 0$
for all $i \in B(\mathcal{T})$, and we decrease $y_i$ by $\epsilon$ for $i \in A(\mathcal{T})$. This has the desired effect; we
will choose $\epsilon$ as large as possible so that feasibility in (10.6) is not lost and as a result
(provided that G has a perfect matching at all), some edge joining a node $u \in B(\mathcal{T})$
to a node $w \notin A(\mathcal{T})$ will enter $E_=$. Since G is bipartite, we will have $w \notin V(\mathcal{T})$,
leading to an augmentation or a tree-extension step. In the example of Fig. 10.8 we will
choose $\epsilon = 3$ because of the edge (b, e), and then that edge enters $E_=$, allowing for an
augmenting path.
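The dual change described above can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the function name `dual_change`, the dictionary-based graph representation and the example numbers are hypothetical and not part of the algorithm listing of this chapter. The reduced cost is taken as $\bar c_{ij} = c_{ij} - y_i - y_j$.

```python
def dual_change(cost, y, tree_B, tree_A):
    """One dual update step for the bipartite case (illustrative sketch).

    cost   : dict mapping an edge (i, j) to its weight c_ij (each edge stored once)
    y      : dict mapping a node to its dual value y_i (modified in place)
    tree_B : set of B-nodes of the alternating tree
    tree_A : set of A-nodes of the alternating tree
    Returns (eps, list of edges that newly become equality edges).
    """
    in_tree = tree_B | tree_A
    candidates = []
    for (i, j), c in cost.items():
        # look at the edge from both ends, since it is stored only once
        for u, w in ((i, j), (j, i)):
            if u in tree_B and w not in in_tree:
                candidates.append((c - y[u] - y[w], (i, j)))
    if not candidates:
        raise ValueError("no edge leaves the tree: no perfect matching exists")
    # largest step that keeps all reduced costs non-negative
    eps = min(rc for rc, _ in candidates)
    for i in tree_B:
        y[i] += eps
    for i in tree_A:
        y[i] -= eps
    entering = [e for rc, e in candidates if rc == eps]
    return eps, entering
```

Edges between B-nodes and A-nodes of the tree keep their reduced cost, because one endpoint gains $\epsilon$ and the other loses it; only edges leaving the tree from a B-node get cheaper.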
To summarize, we get the following so-called Hungarian algorithm, due to Kuhn [13]
and Munkres [14]:
10.4 Matching Algorithms
Figure 10.9: After the first edge has been matched using the minimum-weight
perfect bipartite matching algorithm. The root of $\mathcal{T}$ has been a and
$\epsilon = 2$.
Figure 10.10: The graph after a, d and f have been the root of the alternating
tree: $M = E_= = \{(a,b), (c,f), (d,g)\}$ and $\sum_i y_i = 11$.
Now, only two exposed vertices are left. We assume that h is the root of the
next alternating tree. First, no edges of $E_=$ can be used to extend the tree.
Then, using $\bar c_{fh} = \bar c_{hd} = 1$, we obtain $\epsilon = 1$, leading to $y_h = 1$, $\sum_i y_i = 12$ and
the edges (f, h) and (h, d) join $E_=$. Now, $\mathcal{T}$ can be extended twice, leading
to the situation displayed in Fig. 10.11.
Now we have edges (c, a) and (g, e) which fulfill the condition $i \in B(\mathcal{T})$ and
$j \notin V(\mathcal{T})$. Hence, $\epsilon = \min\{\bar c_{ca}, \bar c_{ge}\} = \min\{6, 2\} = 2$ is obtained, leading to
the situation we already encountered in Fig. 10.8. As we have seen already,
$\epsilon = 3$ and edge (b, e) joins $E_=$. Then, using (b, e) the matching is augmented
and the final result is obtained, which has already been presented in Fig.
10.7.
Figure 10.11: Bipartite graph and alternating tree with root h. Now
$\sum_i y_i = 12$ and $E_= = \{(a,b), (c,f), (f,g), (d,g)\}$.
Figure 10.12: a) A blossom is an odd cycle which is as heavy as possible in matched
edges, i.e. contains $2k + 1$ edges among which k are matched. b) The reduced graph
$G \times C$ obtained by shrinking a blossom $C = \{c, d, e, f, y, x, k, j, i\}$ to a single node
(the pseudo node B) contains the same augmenting paths as the original graph.
The first polynomial-time algorithm to handle blossoms is due to Edmonds [16].
Edmonds' idea was to shrink blossoms, that is, replace them by a single pseudo node B
obtaining a modified or derived graph G' as shown in Fig. 10.12b. The possibility of
shrinking is justified by the following theorem due to Edmonds.
Theorem: (Edmonds)
There is an augmenting path in G if and only if there is an augmenting path in G'.
The existence of a blossom is discovered when edge (x, y) between two even-level nodes
is first found, and its nodes and edges are identified by backtracking from x and y until
the first common node is found (c in our example), which is the blossom's base.
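The backtracking step can be illustrated with a short Python sketch. It assumes that the alternating tree is stored as a dictionary of predecessor pointers and that x and y belong to the same tree; the helper name `blossom_nodes` and this data layout are our own assumptions, chosen only for illustration.

```python
def blossom_nodes(parent, x, y):
    """Return (base, cycle) of the blossom closed by the edge (x, y).

    parent maps each tree node to its predecessor in the alternating
    tree; the root maps to None. Both x and y must lie in this tree.
    """
    # collect x and all its ancestors up to the root
    path_x = []
    v = x
    while v is not None:
        path_x.append(v)
        v = parent[v]
    ancestors = set(path_x)
    # walk up from y until the first common node is met: the base
    path_y = []
    v = y
    while v not in ancestors:
        path_y.append(v)
        v = parent[v]
    base = v
    # cycle: base down to x, then across edge (x, y), then from y back up
    down_to_x = path_x[: path_x.index(base) + 1][::-1]
    return base, down_to_x + path_y
```

For an arrangement like the one in Fig. 10.12, with the two branches c, d, e, f, y and c, i, j, k, x hanging below c, the function returns c as base together with the nine nodes of the odd cycle.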
Once identified, the blossom is shrunk, replacing all its nodes (among which there
might be previously shrunk blossoms) by a single node, and reconnecting all edges
incident to nodes in the blossom (necessarily uncovered edges) to this one. The search
proceeds as usual, until an augmenting path is found, or none, in which case a is
abandoned and no search will ever be started from it again. If an augmenting path
is found, which does not involve any shrunk blossom, it is inverted as usual. If it
contains blossom-nodes, they must be expanded first and one has to identify which
way around the blossom the augmenting path goes. This may need to be repeated
several times if blossoms are nested. After inverting the resulting augmenting path, a
new search is started from a different node. A simple implementation of these ideas
runs in $\mathcal{O}(|V|^4)$ time [10]. The fastest known algorithm for non-bipartite matching
also runs in $\mathcal{O}(|E| \sqrt{|V|})$ time [17].
are fractional (Fig. 10.13 shows an example of this) and thus do not correspond to a
matching. Here, the reason for the complication is the existence of odd cycles [16].
Thus we need to explicitly impose the condition $x_{ij} \in \{0, 1\}$.
Figure 10.13: Example of fractional optimal solution on graphs with odd cycles. a)
The graph and its weights. b) An optimal solution of P1 is fractional. c) The desired
solution.
A solution to this problem for the case of general graphs has been found by Edmonds,
and consists in adding new constraints, which impose $x_{ij} \in \{0, 1\}$ indirectly. In the
subsequent discussion that will lead us finally to the Blossom algorithm for minimum-
weight perfect matchings we follow Ref. [18], see also Ref. [15].
For each odd subset $S \subset G$ (i.e. S contains an odd number of nodes), we impose an
additional set of constraints: let D be an odd cut generated by S, i.e.
This is called a blossom inequality. By adding these inequalities to the problem (10.3),
we get a stronger linear-programming bound, i.e. the space of feasible vectors shrinks.
[For example, if we add the inequality (10.11) to (10.2) for D consisting of the vertical bond
in Fig. 10.13, it is no longer possible to get a solution of value 3; in fact the new
optimal value for the resulting problem is 4, which is also the minimum weight of a
perfect matching.]
Let $\mathcal{C}$ denote the set of all odd cuts of G that are not of the form $\delta(i) = \{(i,j) \in E \mid j \in V\}$
for some node i. Then we are led to consider the linear-programming problem
minimize $\sum_{(i,j) \in E} c_{ij} x_{ij}$
subject to
$\forall i \in V$: $x(\delta(i)) = 1$,
$\forall D \in \mathcal{C}$: $x(D) \ge 1$, (10.12)
$\forall (i,j) \in E$: $x_{ij} \ge 0$,
where $x(\delta(i)) = \sum_{j\ \mathrm{with}\ (i,j) \in E} x_{ij}$.
As we have indicated, (10.12) provides a better approximation to the minimum weight
of a perfect matching, but a much stronger statement can be made. Its optimal value
is the minimum weight of a perfect matching. This is the fundamental theorem of
Edmonds [19] (also "Matching Polytope Theorem"):
Theorem: Let G be a graph, and let $c \in \mathbb{R}^E$. Then G has a perfect matching if
and only if (10.12) has a feasible solution. Moreover, if G has a perfect matching, the
minimum weight of a perfect matching is equal to the optimal value of (10.12).
The algorithm that we will describe will construct a perfect matching M whose
characteristic vector $x^*$ is an optimal solution to (10.12), and so M is a minimum-weight
perfect matching. This will provide a proof of the above theorem. The way in which
we will know that $x^*$ is optimal is analogous to the bipartite case discussed above. We
will also have a feasible solution to the dual linear-programming problem of (10.12)
that satisfies orthogonality conditions for optimality of the primal and dual solutions.
The dual problem to (10.12) is:
maximize $\sum_{i \in V} y_i + \sum_{D \in \mathcal{C}} Y_D$
subject to
$\forall (i,j) \in E$: $y_i + y_j + \sum_{D \in \mathcal{C}\ \mathrm{with}\ (i,j) \in D} Y_D \le c_{ij}$, (10.13)
$\forall D \in \mathcal{C}$: $Y_D \ge 0$.
Given a vector (y, Y) as in (10.13) and an edge (i, j), we denote, as in the bipartite
case, by $\bar c_{ij} = \bar c_{ij}(y, Y)$ the difference
$\bar c_{ij} = c_{ij} - \Big( y_i + y_j + \sum_{D \in \mathcal{C}\ \mathrm{with}\ (i,j) \in D} Y_D \Big)$ (10.14)
which we also call the reduced cost of the edge (i, j). Thus (y, Y) is feasible in Eq.
(10.13) if and only if $Y_D \ge 0$ for all $D \in \mathcal{C}$ and $\bar c_{ij} \ge 0$ for all $(i,j) \in E$.
Again, the important result from LP theory that we need for the algorithm is that
the following orthogonality conditions are necessary and sufficient for the optimality
of the primal and dual solutions
It is not obvious how an algorithm will work with the dual variable Y, but the answer
is suggested by the maximum-cardinality matching algorithm we discussed in Sec.
10.4.3. We will be working with derived graphs G' of G, and such graphs have the
property
every odd cut of G' is an odd cut of G.
It follows from this, in particular, that every cut of the form $\delta_{G'}(v)$ for a pseudo node
v of G' is an odd cut of G. These are the only odd cuts D of G' for which we will
assign a positive value to $Y_D$. Note, however, that such a cut of G' need not have this
property in G: it is of the form $\delta(S)$ where S is an odd subset of G which has become
a pseudo node of G' after (repeated) odd-circuit shrinkings. It follows that we can
handle $Y_D$ by replacing it by $y_v$, with the additional provision that $y_v \ge 0$.
We take the same approach as in the bipartite case, trying to find a perfect matching
in G, using tree-extension and augmentation steps. When we get stuck, we change
y in the same way, except that the existence of edges joining two nodes in $B(\mathcal{T})$ will
limit the size of $\epsilon$. In particular, there may be an equality edge joining two such nodes.
Then we shrink the circuit C, but there is now a small problem: how do we take into
account the variables $y_i$ for $i \in V(C)$, when those nodes are no longer in the graph?
The answer is that we update c as well. Namely, we replace $c_{ij}$ by $c_{ij} - y_i$ for each edge
(i, j) with $i \in V(C)$ and $j \notin V(C)$. Notice that by this transformation, and the setting
of $y_C = 0$ for the new pseudo node, the reduced costs $\bar c_{ij}$ are the same for edges of the
reduced graph $G \times C$ (in which the circuit C is replaced by a pseudo node) as they
were for those edges in G. We will use c' to denote these updated weights, so when we
speak of a derived graph G' of G, the weight vector c' is understood to come with G',
or we may refer to the derived pair (G', c'). The observation about the invariance of
the reduced costs, however, means that we can avoid $\bar c'$ in favor of $\bar c$. The subroutine
for blossom shrinking is just the same as for the maximum-cardinality matching, cf.
Fig. 10.12. We assume that M' is a matching of a graph G', $\mathcal{T}$ an M'-alternating tree,
and (i, j) is an edge of G' such that $i, j \in B(\mathcal{T})$. Here, and in the following, E(G)
denotes the edges of a graph (e.g. circuit or alternating tree) G:
procedure Shrink-update((i, j), M', $\mathcal{T}$, c')
begin
    Let C be the circuit formed by (i, j) together with the path in $\mathcal{T}$ from i to j;
    Replace G' by $G' \times C$, M' by $M' \setminus E(C)$;
    Replace $\mathcal{T}$ by the tree in G' having edge-set $E(\mathcal{T}) \setminus E(C)$;
    update c' and set $y_C := 0$;
end
An example of shrinking and updating weights is shown in Fig. 10.14. On the left is
the graph with a feasible dual solution y (shown at the nodes) for which the edges
of the left triangle are all equality edges. On the right we see the result of shrinking
this circuit and updating the weights. Notice that an optimal perfect matching of this
smaller graph extends to an optimal perfect matching of the original.
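The weight update that accompanies shrinking can be checked numerically with a small Python sketch. The function `shrink_update`, the dictionary representation and the treatment of parallel edges (only the cheapest one is kept) are simplifying assumptions made for this illustration; the numbers used with it are invented and do not reproduce Fig. 10.14.

```python
def shrink_update(cost, y, circuit, pseudo="C"):
    """Shrink the nodes of `circuit` into one pseudo node (illustrative sketch).

    cost : dict mapping an edge (i, j) to its weight c_ij
    y    : dict mapping a node to its dual value y_i
    Returns (new_cost, new_y) for the reduced graph G x C.
    """
    circuit = set(circuit)
    new_cost = {}
    for (i, j), c in cost.items():
        inside_i, inside_j = i in circuit, j in circuit
        if inside_i and inside_j:
            continue  # edges inside the circuit vanish
        if inside_i:
            key, w = (pseudo, j), c - y[i]  # c_ij <- c_ij - y_i
        elif inside_j:
            key, w = (pseudo, i), c - y[j]
        else:
            key, w = (i, j), c  # untouched edge
        # assumption: of several parallel edges keep only the cheapest
        if key not in new_cost or w < new_cost[key]:
            new_cost[key] = w
    new_y = {v: val for v, val in y.items() if v not in circuit}
    new_y[pseudo] = 0  # y_C := 0 for the new pseudo node
    return new_cost, new_y
```

Since the pseudo node starts with $y_C = 0$, the reduced cost $c'_{Cj} - y_C - y_j$ of an edge leaving the pseudo node equals $c_{ij} - y_i - y_j$ of the original edge, which is the invariance used in the text.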
The essential justification of the stopping condition of the algorithm is that, when we
have solved the problem on a derived graph, then we have solved the problem on the
Figure 10.14: Shrinking and updating weights. The left triangle of the graph G in
(a) is a circuit or blossom C that is shrunk to a pseudo node, yielding the reduced
graph $G \times C$ in (b). Its variable $y_C$ is set to zero. The weights of the edges connected
to C are updated according to $c_{ij} \leftarrow c_{ij} - y_i$ for each edge (i, j) with $i \in V(C)$ and
$j \notin V(C)$. Notice that by this transformation, the reduced costs $\bar c_{ij} = c_{ij} - (y_i + y_j)$
stay the same for edges of G and of the reduced graph $G \times C$.
original graph. For this to be correct, we have to be very specific about what we mean
by "solved the problem":
Proposition: Let G', c' be obtained from G, c by shrinking the odd circuit C of
equality edges with respect to the dual-feasible solution y. Let M' be a perfect matching
of G' and, with respect to G', c', let (y', Y') be a feasible solution to (10.12) such
that M', (y', Y') satisfy conditions (10.16) and such that $y'_C \ge 0$. Let M be the perfect
matching of G obtained by extending M' with edges from E(C). Let (y, Y) be defined
as follows. For $i \in V \setminus V(C)$ define $y_i = y'_i$ [for $i \in V(C)$, $y_i$ is already defined]. For
$D \in \mathcal{C}$, we put $Y_D = Y'_D$ if $Y'_D > 0$; we put $Y_D = y'_C$ if $D = \delta(C)$; otherwise, we put
$Y_D = 0$. Then with respect to G, c, (y, Y) is a feasible solution to (10.12) and M,
(y, Y) satisfy the condition (10.16).
Figure 10.15: An example for a dual change where G is non-bipartite. The current
matching M is indicated by the full lines, a perfect matching still needs edge (g, h).
Node r is exposed, r, b, and f are B-nodes, a and d are A-nodes. The weights $c_{ij}$ are
indicated at each edge, the current value of $y_i$ is indicated on each node.
Now let us describe the dual variable change. It is the same one used in the bipartite
case, but with different rules for the choice of $\epsilon$. First we need to consider edges (i, j)
with $i, j \in B(\mathcal{T})$ when choosing $\epsilon$, so we will need $\bar c_{ij}/2$ for such edges. Second we
need to ensure that $y_i$ remains nonnegative if i is a pseudo node, so we need $\epsilon \le y_i$ for
such nodes. Since it is the nodes in $A(\mathcal{T})$ whose y-values are decreased by the change,
these are the ones whose y-values affect the choice of $\epsilon$. To illustrate, see Fig. 10.15,
where $\epsilon = 1/2$ is taken and (g, h) then becomes an equality edge. In the following the
procedure for changing y is shown. It takes as input a derived pair (G', c'), a feasible
solution y of (10.12) for this pair, a matching M' of G' consisting of equality edges,
and an M'-alternating tree $\mathcal{T}$ consisting of equality edges in G'.
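procedure Change-y((G', c'), y, M', $\mathcal{T}$)
begin
    Let $\epsilon_1 := \min\{\bar c_{ij} \mid (i,j) \in E(G'), i \in B(\mathcal{T}), j \notin V(\mathcal{T})\}$;
    $\epsilon_2 := \min\{\bar c_{ij}/2 \mid (i,j) \in E(G'), i \in B(\mathcal{T}), j \in B(\mathcal{T})\}$;
    $\epsilon_3 := \min\{y_i \mid i \in A(\mathcal{T}), i$ pseudo node of $G'\}$;
    $\epsilon := \min\{\epsilon_1, \epsilon_2, \epsilon_3\}$;
    Replace $y_i$ by
        $y_i + \epsilon$, if $i \in B(\mathcal{T})$;
        $y_i - \epsilon$, if $i \in A(\mathcal{T})$;
        $y_i$ otherwise;
end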
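As a rough illustration, the procedure can be written in Python as follows. The sketch assumes that the derived weights c' are given as a dictionary with each edge stored once and that reduced costs are computed as $c'_{ij} - y_i - y_j$; the name `change_y` and all argument names are our own choices, not part of the original listing.

```python
import math


def change_y(cost, y, B, A, pseudo_nodes):
    """Dual change for the weighted non-bipartite case (illustrative sketch).

    cost         : dict mapping an edge (i, j) of G' to its weight c'_ij
    y            : dict mapping a node of G' to y_i (modified in place)
    B, A         : sets of B-nodes and A-nodes of the alternating tree
    pseudo_nodes : set of pseudo nodes of G'
    Returns the step size eps.
    """
    tree = B | A

    def reduced(i, j):
        return cost[(i, j)] - y[i] - y[j]

    eps1 = eps2 = eps3 = math.inf
    for (i, j) in cost:
        if i in B and j in B:
            # both endpoints are raised, so only half the slack is available
            eps2 = min(eps2, reduced(i, j) / 2)
        elif (i in B and j not in tree) or (j in B and i not in tree):
            eps1 = min(eps1, reduced(i, j))
    for i in A & pseudo_nodes:
        # pseudo nodes must keep a non-negative dual value
        eps3 = min(eps3, y[i])
    eps = min(eps1, eps2, eps3)
    if math.isinf(eps):
        raise ValueError("dual change is unbounded: no perfect matching exists")
    for i in B:
        y[i] += eps
    for i in A:
        y[i] -= eps
    return eps
```

Which of the three minima is attained decides the next step: an edge leaving the tree allows an extension or augmentation, an edge between two B-nodes leads to shrinking, and a pseudo node reaching $y_i = 0$ has to be expanded.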
The last ingredient of the algorithm that we need is a way to handle the expansion
of pseudo nodes. Note that we no longer have the luxury of expanding them all after
finding an augmentation. The reason is that expanding the pseudo node i when $y_i$ is
positive would mean giving a value to a variable $Y_D$, where D is a cut of the current
derived graph that is not of the form $\delta_{G'}(j)$ for some pseudo node j. Therefore, we can
expand a pseudo node i only if $y_i = 0$. Moreover, in some sense we need to do such
expansions, because a dual variable change may not result in any equality edge that
could be used to augment, extend, or shrink (because the choice of $\epsilon$ was determined
by some odd pseudo node). However, in this case, unlike the unweighted case, we are
still in the process of constructing a tree, and we do not want to lose the progress that
has been made. So the expanding step should, as well as updating M' and c', also
update $\mathcal{T}$. The example in Fig. 10.16 suggests what to do.
Suppose that the odd node i of the M-alternating tree $\mathcal{T}$ is a pseudo node and expansion
of the corresponding circuit C with the updating of M leaves us with the graph
on the right. There is a natural way to update $\mathcal{T}$ after the expansion by keeping a
subset of the edges of C. In Fig. 10.16c we show the new alternating tree after the
pseudo node expansion illustrated in Fig. 10.16b.
The expansion of pseudo nodes and the related updating is performed using the following
procedure. It takes as input a matching M' consisting of equality edges of a
derived graph G', an M'-alternating tree $\mathcal{T}$ consisting of equality edges, costs c' and
an odd pseudo node i of G' with $y_i = 0$:
Figure 10.16: a) and b): Expanding an odd pseudo node; c): Tree update after
pseudo node expansion (see text).
Proposition: After the application of the expand routine, M' is a matching contained
in $E_=$, and $\mathcal{T}$ is an M'-alternating tree whose edges are all contained in $E_=$.
regime $0.10 \le x \le 0.15$ a random antiphase state exists which has zero magnetization
but long-range order. This state is, according to the authors, characterized by the
existence of magnetic walls (which are different from fracture lines) across which the
magnetization changes sign. Thus the system is composed in this regime of "chunks"
of opposite magnetization, so $\langle M \rangle = 0$ although the spin-spin correlation does not
go to zero with distance. At x = 0.15 a second transition occurs, this time due
to the proliferation of fracture lines, and rigidity (long-range order) is lost since the
system is now broken into finite pieces which can be flipped without energy cost.
Their conclusions were supported by later work using zero-temperature transfer-matrix
methods [21]. Freund and Grassberger [22], using an approximation algorithm to find
low-energy states on systems up to size $210 \times 210$, located the ferromagnetic transition
at $x^* = 0.105$ but found no evidence of the random antiphase state. The largest 2d
systems studied to date using exact matching algorithms appear to be L = 300 by
Bendisch et al. [3, 23]. The authors examined the ground-state magnetization as a
function of the density p of negative bonds on square lattices, and concluded that
$0.096 < x^* < 0.108$, but their finite-size scaling analysis is not the best one could
think of. They did not analyze the morphology of the states found, so no conclusion
could be reached regarding the existence of the random antiphase state.
More recently Kawashima and Rieger [24], again using exact matching methods, compared
previous analyses of the ground state of the 2d ($\pm J$) spin glass, in addition to
performing new simulations. They summarized the results in this area in the phase
diagram given in Fig. 10.17.
However, Kawashima and Rieger found that the "spin-glass phase" is absent and that
there is only one value of $p_c$. They thus argued for a direct transition from the
ferromagnetic state to a paramagnetic state, for both site- and bond-random spin-
glass models. Their analysis is based on the stiffness energy, i.e. the difference $\Delta E =
E_p - E_a$, where $E_p$ is the ground-state energy with periodic boundary conditions and
$E_a$ is the ground-state energy with antiperiodic boundary conditions, see also Sec. 9.3.
The scaling behavior [25],
case, the ground state is found using a minimal weighted matching algorithm (with the
modification that now not only the length of a path between two matched plaquettes
counts for the weight, but also the strength of the bonds lying on this path).
The latest estimate for the stiffness exponent of the 2d Ising-spin-glass model with a
uniform bond distribution between 0 and 1 obtained via exact ground-state calculations
[28] is
$[|\Delta E|]_{av} \propto L^{\theta_S}$ with $\theta_S = -0.281 \pm 0.002$, (10.18)
which implies that in the infinite system arbitrarily large clusters can be flipped with
vanishingly small excitation energy. Therefore the spin-glass order is unstable with
respect to thermal fluctuations and one does not have a spin-glass transition at finite
temperature.
Nevertheless, the spin-glass correlation length (defining the length scale over which
spatial correlations like $[\langle S_i S_{i+r} \rangle_T^2]_{av}$ decay) will diverge at zero temperature as
$\xi \sim T^{-\nu}$, where $\nu$ is the thermal exponent. A scaling theory (for a zero-temperature
fixed-point scenario as is given here) predicts that $\nu = 1/|\theta_S|$. The most
accurate Monte Carlo [29] and transfer-matrix [30] calculations give $\nu = 2.0 \pm 0.2$,
which is inconsistent with Eq. (10.18), since the latter implies $1/|\theta_S| \approx 3.6$. This is
certainly an important unsolved puzzle,
which might be rooted in some conceptual problems concerning the use of periodic/antiperiodic
boundary conditions to calculate large-scale low-energy excitations
in a spin glass (these problems were first discussed in the context of the XY spin glass
[31] and the gauge glass [32]). Further work in this direction will be rewarding.
Next we would like to focus our attention on the concept of chaos in spin glasses
[33]. This notion implies an extreme sensitivity of the SG-state with respect to small
parameter changes like temperature or field variations. There is a length scale $\lambda$,
the so-called overlap length, beyond which the spin configurations within the
same sample become completely decorrelated if compared for instance at two different
temperatures
$C_{\Delta T}(r) = [\langle \sigma_i \sigma_{i+r} \rangle_T \langle \sigma_i \sigma_{i+r} \rangle_{T+\Delta T}]_{av} \sim \exp(-r/\lambda(\Delta T))$. (10.19)
This should also hold for the ground states if one slightly varies the interaction
strengths $J_{ij}$ in a random manner with amplitude $\delta$. Let $\{\sigma\}$ be the ground state
of a sample with couplings $J_{ij}$ and let $\{\sigma'\}$ be the ground state of a sample with
couplings $J_{ij} + \delta K_{ij}$, where the $K_{ij}$ are random (with zero mean and variance one)
and $\delta$ is a small amplitude. Now define the overlap correlation function as
$C_\delta(r) = [\sigma_i \sigma_{i+r} \sigma'_i \sigma'_{i+r}]_{av} \sim \tilde c(r \delta^{1/\zeta})$, (10.20)
where the last relation indicates the scaling behavior we would expect (the overlap
length being $\lambda \sim \delta^{-1/\zeta}$) and $\zeta$ is the chaos exponent. In [28] this scaling prediction
was confirmed with $1/\zeta = 1.2 \pm 0.1$ by exact ground-state calculations of the 2d Ising
spin glass with a uniform coupling distribution, and the corresponding scaling plot for
$C_\delta(r)$ is shown in Fig. 10.18.
Bibliography
[1] I. Bieche, R. Maynard, R. Rammal, and J.P. Uhry, J. Phys. A 13, 2553 (1980)
Figure 10.18: Scaling plot of the overlap correlation function $C_\delta(r)$ versus $r/L^*$ with
$L^* = \delta^{-1/\zeta}$. The best data collapse (for data confined to $r < L/4$) is obtained for
$1/\zeta = 1.05$. The system size is L = 50 and the data are averaged over 400 samples.
These were obtained by creating about 80 reference instances and creating 5 random
perturbations of strength $\delta$ for each. From [28].
[3] J. Bendisch, U. Derigs, and A. Metz, Disc. Appl. Math. 52, 139 (1994)
[4] F. Barahona, R. Maynard, R. Rammal, and J.P. Uhry, J. Phys. A 15, 673 (1982)
[5] R.L. Graham, M. Grötschel, and L. Lovász, Handbook of Combinatorics, (Elsevier
Science Publishers, Amsterdam 1995)
[6] L. Lovász and M.D. Plummer, Matching Theory, (Elsevier Science Publishers,
Amsterdam 1986)
[8] R.Z. Norman and M. Rabin, Proc. Am. Math. Soc. 10, 315 (1959)
[11] E. Minieka, Optimization Algorithms for Networks and Graphs, (Marcel Dekker
Inc., New York and Basel 1978)
Figure 11.1: Comparison of the random-walk method (left) with the greedy algorithm
(right). Displayed are, respectively, the energy as a function of the state S and sample
trajectories in configuration space. $\mathcal{H}_0$ denotes the energy of the minimum. Here
the random walk results in the second configuration as minimum, while the greedy
algorithm always results in the final state.
For a well-behaved energy landscape with just one minimum the greedy algorithm is
often successful (Fig. 11.1 right). It accepts only moves that generate states S' with
a lower energy and is defined by $p(S \to S') = \Theta(-\Delta\mathcal{H})$, where $\Delta\mathcal{H} = \mathcal{H}(S') - \mathcal{H}(S)$ is
the energy difference between the old and the new state and $\Theta(x) = 1$ for $x \ge 0$ and 0
otherwise, the Heaviside step function.
As soon as one deals with problems that have many local energy minima one needs
to accept moves to states with higher energy too, in order to escape local minima and
overcome energy barriers separating different minima. One of the simplest of these
techniques appears to be threshold accepting [1], which is defined by the acceptance
rule
$p(S \to S') = \Theta(\epsilon - \Delta\mathcal{H})$ (11.2)
which means that all moves leading to an energy decrease are accepted, as well as
moves with a positive energy difference $\Delta\mathcal{H}$ as long as it is smaller than the threshold
$\epsilon$. This means that in principle arbitrarily large energy barriers can be overcome, if
enough intermediate configurations with small energy differences are present (Fig. 11.2
left).
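The acceptance rule (11.2) translates almost literally into code. The following Python sketch is meant as an illustration only: the function name `threshold_accepting`, the way the thresholds are supplied as a list and the bookkeeping of the best state seen so far are our own assumptions.

```python
import random


def threshold_accepting(energy, neighbor, state, thresholds, steps_per_level, rng):
    """Threshold accepting with a given sequence of thresholds (sketch).

    energy   : function returning the energy of a state
    neighbor : function (state, rng) -> randomly chosen neighboring state
    Returns the lowest-energy state that was visited.
    """
    best = state
    for eps in thresholds:  # the threshold is lowered from level to level
        for _ in range(steps_per_level):
            candidate = neighbor(state, rng)
            delta = energy(candidate) - energy(state)
            if delta <= eps:  # Theta(eps - delta) = 1: accept the move
                state = candidate
                if energy(state) < energy(best):
                    best = state
    return best
```

With the single threshold 0 the routine reduces to a greedy search that also accepts moves of equal energy. A call such as `threshold_accepting(f, step, s0, [4, 2, 1, 0], 1000, random.Random(0))` lowers the threshold in four stages, where `f`, `step` and `s0` stand for a user-supplied energy function, move generator and start state.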
Such a procedure is already an improvement on the greedy method but it gets stuck
in an energy landscape that has local minima surrounded by energy barriers that are
larger than the threshold $\epsilon$, so-called golf holes in the energy landscape (Fig. 11.2
right). One should note that in a high-dimensional configuration space the number
of escape routes or directions out of a local energy minimum is also enlarged and
with this also the chance to get out of a local minimum even at low threshold values.
This parameter $\epsilon$, which is reduced during the run down to 0, plays the role of a
temperature, although the acceptance rate does not at all fulfill detailed balance and
the stationary distribution of this process does not have anything to do with the
Boltzmann distribution, see Chap. 5. Obviously it is advantageous to repeat the
11.2 Simulated Annealing
Figure 11.2: Threshold accepting: large energy barriers can be overcome starting
from $S_0$ by several small steps (left), unless they are "golf holes" (right).
i.e. a new configuration S' is accepted when its energy is below E and it is rejected
otherwise. Note that the acceptance rate is independent of the old state S (a random
walk) and each state S' is accepted with equal probability as long as its energy is below
E, which plays the role of a temperature that is slowly decreased until the system settles
in a (local) minimum. This algorithm reminds one of a great deluge, since if one inverts
the energy landscape, one looks for the absolute maximum, its highest top. One lets
the water level rise continuously and one can only visit spots that are not yet flooded.
The great deluge algorithm obeys detailed balance (cf. Chap. 5), since $p(S \to S') =
p(S' \to S)$ for all states S and S' that can be reached in the stationary state [in which
$\mathcal{H}(S) < E$ and $\mathcal{H}(S') < E$] of this stochastic process. However, it is not ergodic, since
for low enough E the part of the configuration space that is sampled splits into several
islands between which no transition is possible any more.
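A minimal Python sketch of this acceptance rule is given below. The name `great_deluge` and the stepwise lowering of the level E through a list are assumptions made for the illustration; the text describes a level that changes continuously.

```python
import random


def great_deluge(energy, neighbor, state, levels, steps_per_level, rng):
    """Random walk restricted to states with energy below the level E (sketch)."""
    for E in levels:  # the level E is lowered step by step
        for _ in range(steps_per_level):
            candidate = neighbor(state, rng)
            # acceptance depends only on the new state, not on the old one
            if energy(candidate) < E:
                state = candidate
    return state
```

If the level is lowered too abruptly, the current state and all its neighbors may lie above the new level, and the walk freezes; this is one practical reason for decreasing E slowly.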
configurations) has to decrease very slowly since otherwise the molecular configuration
gets stuck and the crystal will be imperfect. This is called annealing in materials
science and various stochastic optimization methods are guided by this physical spirit
such that they have been called simulated annealing [3].
We recall what we said in Sec. 5.1. We consider a physical system that can be in
states S and is described by a Hamiltonian $\mathcal{H}(S)$. Its equilibrium properties at a
temperature T are determined by the Boltzmann distribution
where $\Delta\mathcal{H} = \mathcal{H}(S') - \mathcal{H}(S)$ is the energy difference between the old state S and the
new state S', leads to a stationary state that is governed by the stationary distribution
$P_{eq}(S)$. Another choice leading to the same results is given by the heat-bath transition rates
Figure 11.3: Simulated annealing. A linear cooling schedule T(t) (left) and a sketch
of the typical resulting average energy E(t) as a function of time (right).
where a and b are positive constants that depend on the special problem. Practical
applications, of course, have to be finished in finite time and therefore prefer the
following empirical cooling protocols. The linear scheme (Fig. 11.3 left)
where a is the starting temperature and b the step size for decreasing the temperature
(usually $0.01 \le b \le 0.2$); and the exponential scheme:
where again a is the starting temperature and b the cooling rate (usually $0.8 \le b <
0.999$). For completeness we present a schematic listing of a simulated annealing
procedure:
algorithm simulated annealing
begin
    choose start configuration S;
    for $t := 1, \ldots, t_{\max}$ do
    begin
        set temperature T := T(t);
        Monte Carlo(M(t), T) at temperature T with M(t) steps;
    end
end
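To make the schematic listing concrete, here is a short Python sketch. Several ingredients are assumptions of ours rather than statements of the text: the exponential schedule is taken in the form $T(t) = a\,b^t$, the Monte Carlo step uses the Metropolis acceptance probability $\min(1, e^{-\Delta\mathcal{H}/T})$ with $k_B = 1$, the number of steps per temperature is constant, and the routine remembers the best state seen so far.

```python
import math
import random


def simulated_annealing(energy, neighbor, state, a, b, t_max, sweeps, rng):
    """Simulated annealing with an assumed exponential schedule T(t) = a * b**t."""
    e = energy(state)
    best, e_best = state, e
    for t in range(1, t_max + 1):
        T = a * b ** t  # assumed cooling protocol, must stay positive
        for _ in range(sweeps):  # Monte Carlo steps at temperature T
            candidate = neighbor(state, rng)
            e_cand = energy(candidate)
            d_e = e_cand - e
            # Metropolis rule: always go downhill, go uphill with exp(-dE/T)
            if d_e <= 0 or rng.random() < math.exp(-d_e / T):
                state, e = candidate, e_cand
                if e < e_best:
                    best, e_best = state, e
    return best, e_best
```

The parameters a and b play the roles of the starting temperature and the cooling rate described above; a linear scheme could be obtained by replacing the line that sets T.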
the system at different temperatures, which are not too far from each other in order to
get a sufficiently large overlap in the energy distribution of the state (we shall soon see
why). One simulation of the system is done at temperature $T_1$, the other at $T_2 > T_1$.
The respective inverse temperatures are called $\beta_1 = 1/k_B T_1$ and $\beta_2 = 1/k_B T_2$. The
Monte Carlo algorithm works as follows. We perform a conventional Monte Carlo
simulation with both systems in parallel (which is done best on a parallel computer with
one CPU for one system, but it is of course also manageable on a serial machine) using
for instance a simple Metropolis update rule (11.5). But from time to time instead of
doing this we calculate the energy difference of the current states $S_1$ and $S_2$ of the two
simulations, $\Delta E = E_2 - E_1$, where $E_1 = \mathcal{H}(S_1)$ and $E_2 = \mathcal{H}(S_2)$, and we swap the
two spin configurations ($[S_1, S_2] \to [S_2, S_1]$) with an acceptance probability
Here we use the notation [S, S'] to describe the state of the two combined systems:
first at temperature $T_1$ in state S and second at temperature $T_2$ in state S'. These
transition rates for swapping fulfill detailed balance (5.15) with respect to the joint
equilibrium distribution function for the two systems
11.3 Parallel Tempering
The transition rates in the conventional Monte Carlo steps where we do not swap
states fulfill detailed balance anyway and they are also ergodic. Hence the combined
procedure fulfills detailed balance and is ergodic, and therefore both simulations will
after a sufficiently long time sample the Boltzmann distribution for this model at
temperature $T_1$ in the first system and temperature $T_2$ in the second. The crucial
advantage is, again, that the $T_2$ system helps the $T_1$ system to equilibrate faster:
algorithm Parallel tempering
begin
    choose start configurations $S_1$ and $S_2$;
    for $t := 1, \ldots, t_{\max}$ do
    begin
        Monte Carlo(M, $T_1$) for system 1;
        Monte Carlo(M, $T_2$) for system 2;
        $\Delta E := \mathcal{H}(S_2) - \mathcal{H}(S_1)$;
        if ($\Delta E < 0$) then
            accept $[S_1, S_2] \to [S_2, S_1]$;
        else
        begin
            $w := \exp(-(\beta_1 - \beta_2)\Delta E)$;
            generate uniform random number $x \in [0, 1]$;
            if ($x < w$) then
                accept $[S_1, S_2] \to [S_2, S_1]$;
        end
    end
end
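The swap decision of the listing can be isolated in a small Python function. The name `try_swap` and its argument list are our own choices for this sketch; the logic follows the listing above, with $\beta_1 > \beta_2$ because $T_1 < T_2$.

```python
import math
import random


def try_swap(E1, E2, beta1, beta2, rng):
    """Decide whether to exchange the configurations of systems 1 and 2.

    E1, E2       : current energies of the systems at T1 and T2 (T1 < T2)
    beta1, beta2 : the corresponding inverse temperatures
    """
    d_e = E2 - E1
    if d_e < 0:
        # the hotter system holds the lower energy: always swap
        return True
    w = math.exp(-(beta1 - beta2) * d_e)
    return rng.random() < w
```

In a full simulation the function would be called after each block of M Monte Carlo steps, and on success either the configurations or, more cheaply, the temperatures of the two systems would be exchanged.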
One question is how often one should perform the swap moves (i.e. how large should
the parameter M in the above listing be). It is clear that it does not make sense to
try this too frequently: the high-temperature system should have time to be carried
away from a local minimum, otherwise it is useless. To estimate the time needed for
this one could, for instance, calculate the normalized energy autocorrelation function
determine its correlation time $\tau$ from $C_E(\tau) = e^{-1} C_E(0)$ and choose $M > \tau$. A
practical point concerning the swapping procedure is that obviously we do not need
to shuffle the configurations $S_1$ and $S_2$ from one array to the other; the same effect
is achieved by simply exchanging the temperatures $T_1$ and $T_2$ for the two systems.
Finally: We mentioned before that the method becomes particularly efficient if it is
done on a parallel computer (which is the origin of its name parallel tempering), where
each node (or a block of nodes) simulates one copy of the system at one temperature
out of not only two but many (typically 32 or so), see Fig. 11.4. Since the swapping
probabilities are exponentially small in the energy difference of the systems at two
different temperatures and because the average energy increases with the temperature,
the different temperatures should not be too far from each other and one should swap
only between systems at neighboring temperatures. But, of course, the temperatures
also should not be chosen too close to each other if we want to find good approximations
to the lowest energy configurations, i.e. to reach as low temperatures as possible. A
good rule of thumb is that one should achieve an acceptance ratio of 0.5 for the
swapping move for each pair of neighboring temperatures.
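The correlation time $\tau$ that sets the number M of steps between swap attempts can be estimated from a recorded energy time series. The following Python sketch assumes the estimator $C_E(t) = \langle (E(s) - \bar E)(E(s+t) - \bar E) \rangle_s$ for the autocorrelation function, which is our own choice because the defining formula is not reproduced here; the function name `correlation_time` is hypothetical as well.

```python
import math


def correlation_time(energies):
    """Smallest lag t with C_E(t) <= C_E(0) / e, or None if never reached."""
    n = len(energies)
    mean = sum(energies) / n

    def c(t):
        m = n - t  # number of pairs (s, s + t) available in the series
        return sum((energies[s] - mean) * (energies[s + t] - mean)
                   for s in range(m)) / m

    c0 = c(0)
    if c0 == 0:
        return 0  # constant series: nothing to decorrelate
    for t in range(1, n):
        if c(t) <= c0 / math.e:
            return t
    return None
```

This direct evaluation costs of the order of $n^2$ operations, which is acceptable for a short test run used to fix M.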
Figure 11.4: Parallel tempering with k different temperatures $T_1 < T_2 < \ldots < T_k$.
At each temperature a system is simulated using conventional Monte Carlo. From
time to time, configurations are exchanged between neighboring temperatures, such
that detailed balance is fulfilled.
site of the last placed monomer. In order to obtain the correct statistics, an already
occupied neighbor should not be avoided, but any attempt to place the monomer at
such a place is punished by discarding the entire chain. This leads to an exponential
"attrition", which makes the method useless for long chains.
Rosenbluth and Rosenbluth [10] observed that this exponential attrition can be strongly
reduced by simulating a biased sample, and correcting the bias by means of a weight
associated with each configuration. The biased sample is simply obtained by replacing
any "illegal" step, which would violate the self avoidance constraint, by a random
"legal" one, provided such a legal step exists. More generally, assume we want to
simulate a distribution in which each configuration S is weighted by a (Boltzmann-)
weight Q(S), so that for any observable A one has $\langle A \rangle = \sum_S A(S) Q(S) / \sum_S Q(S)$. If
we sample unevenly with probability p(S), then we must compensate this by giving a
weight $W(S) \propto Q(S)/p(S)$,
$\langle A \rangle = \lim_{M \to \infty} \frac{\sum_{i=1}^{M} A(S_i) Q(S_i)/p(S_i)}{\sum_{i=1}^{M} Q(S_i)/p(S_i)} = \lim_{M \to \infty} \frac{1}{M} \sum_{i=1}^{M} A(S_i) W(S_i)$. (11.14)
This is called the generalized Rosenbluth method. If p(S) were chosen close t o Q ( S ) ,
this would lead to importance sampling and obviously would be very efficient. But in
general this is not possible, and Eq. (11.14) suffers from the problem that the sum is
dominated by very few events with high weight.
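The reweighting in Eq. (11.14) can be illustrated with a few lines of code. This is a minimal sketch, assuming that the weights Q(S) and the sampling probabilities p(S) can be evaluated for every sampled configuration; the helper name `weighted_average` and the toy state space in the usage note are hypothetical.

```python
import math

def weighted_average(samples, observable, target_weight, sampling_prob):
    """Estimate <A> from configurations drawn with probability p(S),
    reweighting each one by W(S) = Q(S)/p(S) as in Eq. (11.14)."""
    numerator = 0.0
    denominator = 0.0
    for s in samples:
        w = target_weight(s) / sampling_prob(s)   # weight W(S)
        numerator += observable(s) * w
        denominator += w
    return numerator / denominator
```

As a toy usage example, take four states S = 0, 1, 2, 3 with Q(S) = exp(-S), drawn uniformly (p = 1/4). If every state happens to occur equally often in the sample, `weighted_average` reproduces the exact Boltzmann average of A(S) = S. With a real finite sample, a few configurations with large W(S) can dominate both sums, which is the problem mentioned above.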
Consider now a lattice chain of length N + 1 with self avoidance and with nearest-
neighbor interaction $-\epsilon$ between unbonded neighbors. In the original Rosenbluth
method, p(S) is then a product,
$$p(S) = \prod_{n=1}^{N} \frac{1}{m_n} ,$$
where $m_n$ is the number of free neighbors in the n-th step, i.e. the number of possible
lattice sites where to place the n-th monomer (monomers are labeled n = 0, 1, ..., N).
The corresponding weight is
$$W(S) = \prod_{n=1}^{N} m_n e^{-\beta E_n} ,$$
where $\beta = 1/kT$ and $E_n = -\epsilon \sum_{k=0}^{n-1} \Delta_{kn}$ is the energy of the n-th monomer in the
field of all previous ones ($\Delta_{kn} = 1$ if and only if monomers k and n are neighbors and
non-bonded, otherwise $\Delta_{kn} = 0$).
Obviously, p(S) favors compact configurations where monomers have only few free
neighbors. This renders the Rosenbluth method unsuitable for long chains, except
near the collapse ("theta") point where simulations with $N \lesssim 1000$ are feasible on the
simple cubic lattice [12]. In general we should find ways to modify the sampling so
that "good" configurations are sampled more frequently, and "bad" ones less. The key
to this is the product structure of the weights, $W_n = \prod_{n'=1}^{n} w_{n'}$ (with $w_n = m_n e^{-\beta E_n}$).
If the partial weight $W_n$
gets too large (i.e. is above some threshold $W^+$), we replace the configuration by k
copies, each with weight $W_n/k$. Growth of one of these copies is continued, all others
are placed on a stack for later use. As a consequence, configurations with high weight
are simulated more often. To account for this and to keep the final result correct,
the weight is reduced accordingly. The whole idea is applied recursively. Following
Ref. [11] we call this enrichment. The opposite action, when $W_n$ falls below another
threshold $W^-$ (pruning), is done stochastically: with probability 1/2 the configuration
is killed and replaced by the top of the stack, while its weight is doubled in the other
half of cases.
In this way PERM (the pruned-enriched Rosenbluth method [8]) gives a sample with
exactly the right statistical weights, independently of the thresholds $W^\pm$, the selection
probability p(S), and the clone multiplicity k. But its efficiency depends strongly on
good choices for these parameters. Notice that one has complete freedom in choosing
them, and can even change them during a run. Fortunately, reasonably good choices
are easy to find (more sophisticated choices needed at very low temperatures are
discussed in Refs. [13, 14]). The guiding principle for p(S) is that it should lead as
closely as possible to the correct final distribution, so that pruning and enrichment are
kept to a minimum. This is also part of the guiding principles for $W^\pm$. In addition, $W^+$
and $W^-$ have to be chosen such that roughly the same effort is spent on simulating any
part of the configuration. For polymers this means that the sample size should neither
grow nor decrease with chain length n. This is easily done by adjusting $W^+$ and $W^-$
"on the fly". We now show a listing for the central part of the PERM algorithm (after
[8]); x denotes the current position of the head of the chain and n the current chain
length:
procedure PERM-Step(x, n)
begin
    choose x' near x w. density p(x' - x);          simplest case: p(x) = (1/m_n) delta_{|x|,1}
    w_n := c_n p(x' - x)^(-1) exp(-E(x')/k_B T);    if c_n = const. -> grand canonical
    W_n := W_{n-1} w_n;
    begin do statistics
        Z_n := Z_n + W_n;                           partition function
        R2_n := R2_n + x'^2 W_n;                    end-to-end distance
        t := t + 1;                                 total number of subroutine calls
        etc.
    end
    if n < N_max and W_n > 0 then
    begin
        W+ := c+ Z_n/Z_1;                           adaption of W+ (optional)
        W- := c- Z_n/Z_1;                           adaption of W- (optional)
        if W_n > W+ then
        begin
            W_n := W_n/2;
            PERM-Step(x', n + 1);
            PERM-Step(x', n + 1);                   enrichment!
        end
        else if W_n < W- then
        begin
            W_n := W_n * 2;                         prune w. prob. 1/2
            draw xi uniform in [0, 1];
            if (xi < 1/2) then
                PERM-Step(x', n + 1);
        end
        else
            PERM-Step(x', n + 1);                   normal Rosenbluth step
    end
    return
end
The subroutine PERM-Step is called from the main routine with arguments x = 0,
n = 1. $W_n$, $c_n$, $Z_n$ and $R2_n$ are global arrays indexed by the chain length. $W_n$ is
the current weight, $c_n$ a reweighting factor, which allows the simulation of different
statistical ensembles, $Z_n$ the estimate of the partition function and $R2_n$ the weighted
sum of the squared end-to-end distances, i.e. $R2_n/Z_n$ is the mean squared end-to-end
distance. $N_{max}$, t, $c_+$ and
$c_-$ are global scalar variables, where $N_{max}$ is the maximum chain length, t counts
the total number of subroutine calls and $c_+$/$c_-$ control the adaptation of the weights.
Without adaptation, the lines involving $c_+$ and $c_-$ can be dropped, and then $W^+$
and $W^-$ are global scalars. In more sophisticated implementations, p, $c_+$ and $c_-$ will
depend on n and/or on the configuration of the monomers with indices n' < n. Good
choices for these functions may be crucial for the efficiency of the algorithm, but are
not important for its correctness. To compute the energy E(x') of the newly placed
monomer in the field of the earlier ones, one can use either bit maps (in lattice models
with small lattices), hashing or neighbor lists. If none of these are suitable, E(x') has
to be computed as an explicit sum over all earlier monomers.
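To make the listing concrete, here is a small runnable sketch of the same recursion for a self-avoiding walk on the square lattice. Several details are assumptions made only for this illustration and are not prescribed by the text: the weight is passed as a function argument instead of living in a global array, $c_n = 1$, the thresholds are set to `c_plus` and `c_minus` times the running average weight `z_sum[n] / started` (instead of $Z_n/Z_1$), a Python set serves as the hash of occupied sites, and the function name `perm_saw` and the default parameter values are hypothetical.

```python
import math
import random

STEPS = ((1, 0), (-1, 0), (0, 1), (0, -1))   # square-lattice directions

def perm_saw(n_max, tours, eps=0.0, beta=0.0, c_plus=3.0, c_minus=0.3, seed=1):
    """Return estimates Z[0..n_max] of the partition sum of an interacting
    self-avoiding walk with n steps (contact energy -eps, inverse temp. beta)."""
    rng = random.Random(seed)
    z_sum = [0.0] * (n_max + 1)      # accumulated weights per chain length
    occupied = {(0, 0)}              # monomer 0 sits at the origin
    started = 0                      # number of tours started so far

    def perm_step(x, n, w_prev):
        free = [(x[0] + dx, x[1] + dy) for dx, dy in STEPS
                if (x[0] + dx, x[1] + dy) not in occupied]
        if not free:
            return                                   # dead end: weight zero
        x_new = rng.choice(free)                     # p = 1/m_n
        # occupied neighbors of x_new, minus the bonded predecessor x
        contacts = sum((x_new[0] + dx, x_new[1] + dy) in occupied
                       for dx, dy in STEPS) - 1
        w = w_prev * len(free) * math.exp(beta * eps * contacts)
        z_sum[n] += w                                # do statistics
        if n == n_max:
            return
        w_hi = c_plus * z_sum[n] / started           # adaptive thresholds
        w_lo = c_minus * z_sum[n] / started
        occupied.add(x_new)
        if w > w_hi:                                 # enrichment: 2 copies
            perm_step(x_new, n + 1, w / 2)
            perm_step(x_new, n + 1, w / 2)
        elif w < w_lo:                               # prune with prob. 1/2
            if rng.random() < 0.5:
                perm_step(x_new, n + 1, 2 * w)
        else:
            perm_step(x_new, n + 1, w)               # normal Rosenbluth step
        occupied.remove(x_new)

    for _ in range(tours):
        started += 1
        perm_step((0, 0), 1, 1.0)
    return [1.0] + [z / tours for z in z_sum[1:]]
```

For `eps = 0` the estimate `Z[n]` should approach the number of n-step self-avoiding walks on the square lattice (4, 12, 36, 100, 284, 780, ...), which gives a simple consistency check. Since the sketch recurses once per monomer, long chains would need an explicit stack or a raised recursion limit.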
In selecting the good and killing the bad, PERM is similar to evolutionary and
genetic algorithms [15], see Chap. 8, to population based growth algorithms for chain
polymers [16, 17, 18, 19], to diffusion type quantum Monte Carlo algorithms [20],
and to the "go with the winners" strategy of Ref. [21]. The main difference with
the first three groups of methods is that one does not keep the entire population of
instances simultaneously in computer memory. Indeed, even on the stack one does
not keep copies of good configurations but only the steps involved in constructing the
configurations and flags telling us when to make a copy [8]. In genetic algorithms,
keeping the entire population in memory is needed for cross-overs, and it allows a one-
to-one competition between instances. But in our case this is not needed since every
instance can be compared with the average behavior of all others. The same would
be true for diffusion type quantum Monte Carlo simulations. The main advantage of
our strategy is that it reduces computer memory enormously. This, together with the
surprisingly easy determination of the thresholds $W^\pm$, could make PERM also a very
useful strategy for quantum Monte Carlo simulations.
non-bonded monomers. These interactions can have continuous distributions [23], but
the majority of authors have considered only two kinds of monomers. In the HP model
[24, 25] they are hydrophobic (H) and polar (P), with $(\epsilon_{HH}, \epsilon_{HP}, \epsilon_{PP}) = -(1, 0, 0)$.
Since this leads to highly degenerate ground states, alternative models were proposed,
e.g. $\epsilon = -(3, 1, 3)$ [26] and $\epsilon = -(1, 0, 1)$ [27].
The algorithms that were applied in [28, 29] were variants of the pruned-enriched
Rosenbluth method (PERM) described in the last section and we present here a few
of the impressive results obtained in this way, improving substantially on previous
work.
2d HP model
Two chains with N = 100 were studied in [30]. The authors claimed that their native
configurations were compact, fitting exactly into a 10 x 10 square, and had energies
-44 and -46:
Sequence 1:
P6 HPH2 P5 H3 PH5 PH2 P2 (P2H2)2 PH5 PH10 PH2 PH7 P11 H7 P2 HPH3 P6 HPH2
Sequence 2:
P3 H2 P2 H4 P2 H3 (PH2)3 H2 P8 H6 P2 H6 P9 HPH2 PH11 P2 H3 PH2 PHP2 HPH3 P6 H3
In Fig. 11.6 we show the respective proposed ground-state structures. These con-
formations were found by a specially designed MC algorithm which was claimed to be
particularly efficient for compact configurations. We do not discuss the method here.
By applying the PERM algorithm at low temperatures to these two HP chains, Grass-
berger et al. [29] found (within ca. 40 hours of CPU time) several compact states that
had energies lower than those of the compact putative ground states proposed in [30],
namely with E = -46 for sequence 1 and E = -47 for sequence 2. Moreover, they
found (again within 1-2 days of CPU time) several non-compact configurations with
energies even lower: E = -47 and E = -48 for sequences 1 and 2, respectively.
Forbidding non-bonded HP pairs, Grassberger et al. [28, 29] obtained even E = -49
for sequence 2. Figure 11.6 shows representative non-compact structures with these
energies. These results reflect the well-known property that HP sequences (and those
of other models) usually have ground states that are not maximally compact, see, e.g.
[31], although there is a persistent prejudice to the contrary [27, 30, 32].
Figure 11.6: Top: Putative compact native structure of sequence 1 (left) with E =
-44 and sequence 2 (right) with E = -46 according to [30]; (filled circle) H monomers,
(open circle) P monomers. Bottom: One of the (non-compact) lowest energy configurations
for sequence 1, left, with E = -47 and sequence 2, right, with E = -49.
3d modified HP model
A most interesting case is a 2-species 80-mer with interactions (-1, 0, -1) studied first
in [27]. These particular interactions were chosen instead of the HP choice (-1, 0, 0)
because it was hoped that this would lead to compact configurations. Indeed the
sequence [27, 33]
was specially designed to form a "four helix bundle" which fits perfectly into a 4 x 4 x 5
box, see Fig. 11.7. Its energy in this putative native state is -94. Although the authors
of [27] used highly optimized codes, they were not able to recover this state by MC.
Instead, they reached only E = -91. Supposedly a different state with E = -94 was
found in [30], but figure 10 of this paper, which it is claimed shows this configuration,
has a much higher value of E. Configurations with E = -94 but slightly different from
that in [27] and with E = -95 were found in [33] by means of an algorithm similar to
that in [30]. For each of these low energy states the author needed about one week of
CPU time on a Pentium.
Figure 11.7: Putative native state of the "four helix bundle" sequence, as proposed
by O'Toole and Panagiotopoulos [27]. It has E = -94, fits into a rectangular box,
and consists of three homogeneous layers. Structurally, it can be interpreted as four
helix bundles.
Figure 11.8: Conformation of the "four helix bundle" sequence with E = -98.
Grassberger et al. [28, 29] proposed that this is the actual ground state. Its shape is
highly symmetric although it does not fit into a rectangular box. It is not degenerate
except for a flipping of the central front 2 x 2 x 2 box.
Grassberger et al. [28, 29] applied the PERM algorithm to the aforementioned system.
Even without much tuning the algorithm gave E = -94 after a few hours, but it did
not stop there. After a number of rather disordered configurations with successively
lower energies, the final candidate for the native state has E = -98. It again has a
highly symmetric shape, although it does not fit into a 4 x 4 x 5 box, see Fig. 11.8.
It has two-fold degeneracy (the central 2 x 2 x 2 box in the front of Fig. 11.8 can be
flipped), and both configurations were actually found in the simulations. The optimal
Bibliography
[1] G. Dueck and T. Scheuer, J. Comp. Phys. 90, 161 (1990)
[3] S. Kirkpatrick, C.D. Gelatt, and M.P. Vecchi, Science 220, 671 (1983)
[4] N. Metropolis, A.W. Rosenbluth, M.N. Rosenbluth, A.H. Teller, and E. Teller, J.
Chem. Phys. 21, 1087 (1953)
[5] E. Marinari and G. Parisi, Europhys. Lett. 19, 451 (1992)
[6] B. Berg and T. Neuhaus, Phys. Lett. B 267, 249 (1991); Phys. Rev. Lett. 68, 9
(1992)
[7] K. Hukushima and K. Nemoto, J. Phys. Soc. Jpn. 65, 1604 (1996)
[11] F.T. Wall and J.J. Erpenbeck, J. Chem. Phys. 30, 634, 637 (1959)
[18] B. Velikson, T. Garel, J.-C. Niel, H. Orland, and J.C. Smith, J. Comput. Chem.
13, 1216 (1992)
[19] H. Orland, in: P. Grassberger, G.T. Barkema, and W. Nadler (eds.), Workshop
on: Monte Carlo Approach to Biopolymers and Protein Folding, (World Scientific,
Singapore, 1998)
[20] C.J. Umrigar, M.P. Nightingale, and K.J. Runge, J. Chem. Phys. 99, 2865 (1993)
[21] D. Aldous and U. Vazirani, in: Proc. 35th IEEE Sympos. on Foundations of
Computer Science (1994)
[22] T.E. Creighton (ed.), Protein Folding, (Freeman, New York, 1992)
[23] D.K. Klimov and D. Thirumalai, Proteins: Struct., Funct., Genet. 26, 411 (1996);
sequences are available from http://www.glue.umd.edu/~klimov.
[25] K.F. Lau and K.A. Dill, Macromolecules 22, 3986 (1989); J. Chem. Phys. 95,
3775 (1991); D. Shortle, H.S. Chan, and K.A. Dill, Protein Sci. 1, 201 (1992)
[26] N.D. Socci and J.N. Onuchic, J. Chem. Phys. 101, 1519 (1994)
[30] R. Ramakrishnan, B. Ramachandran, and J.F. Pekny, J. Chem. Phys. 106, 2418
(1997)
[31] K. Yue, K.M. Fiebig, P.D. Thomas, H.S. Chan, E.I. Shakhnovich, and K.A. Dill,
Proc. Natl. Acad. Sci. USA 92, 325 (1995)
[32] E.I. Shakhnovich and A.M. Gutin, J. Chem. Phys. 93, 5967 (1990); A.M. Gutin
and E.I. Shakhnovich, J. Chem. Phys. 98, 8174 (1993)
[33] J.M. Deutsch, J. Chem. Phys. 106, 8849 (1997)
12 Branch-and-bound Methods
Figure 12.1: Graphs and covers. Covered vertices are shown in bold and
dark, covered edges are indicated by dashed lines. Left: a partially covered
graph. Vertices 1 and 2 are covered. Thus, edges (1,3), (1,4), and (2,3) are
covered. Right: by also covering vertices 4 and 5, the graph is covered.
12.1 Vertex Covers
The vertex-cover decision problem asks whether there are VCs of fixed given cardinality
$X = |V_{vc}|$; we define $x = X/N$. In other words, we are interested in whether it is possible to
cover all edges of G by covering xN suitably chosen vertices, i.e. by distributing xN
covering marks among the vertices. To measure the extent to which a graph is not covered, we
introduce an energy $e(G, x) = E(G, x)/N$ with
$$E(G, x) = \min\{\text{number of uncovered edges when covering } xN \text{ vertices}\} \qquad (12.1)$$
Thus, a graph G is coverable by xN vertices if e(G, x) = 0. This means that you can
answer the VC decision problem by first solving a minimization problem, and then
testing whether the minimum e(G, x) is zero or not.
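For very small graphs, the energy (12.1) can be evaluated by simply trying all subsets of X vertices. The following sketch is only meant to make the definition concrete. The function names and the example graph in the usage note are hypothetical, and the exhaustive search is of course exponential in N.

```python
from itertools import combinations

def uncovered_edges(edges, cover):
    """Number of edges with neither end point in the set `cover`."""
    return sum(1 for u, v in edges if u not in cover and v not in cover)

def min_energy(vertices, edges, X):
    """E(G, x) of Eq. (12.1) for X = xN covered vertices, by exhaustive search."""
    return min(uncovered_edges(edges, set(c)) for c in combinations(vertices, X))

def coverable(vertices, edges, X):
    """Vertex-cover decision problem: is there a cover with X vertices?"""
    return min_energy(vertices, edges, X) == 0
```

For a triangle with a two-edge tail, i.e. edges (1,2), (1,3), (2,3), (3,4), (4,5), the minimum energies for X = 0, 1, 2, 3 are 5, 2, 1, 0, so the graph is coverable with three vertices but not with two.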
For the preceding case, the energy was minimized with fixed X. The decision problem
can also be solved by solving another optimization problem: for a given graph G it asks
for a minimum vertex cover $V_{vc}$, i.e. a vertex cover of minimum size $X_c(G) = |V_{vc}|$.
Thus, here the number of vertices in the vertex cover is minimized, while the energy
is kept at zero. The answer to the vertex-cover decision problem is "yes" if $X \ge X_c$.
Also minimum vertex covers may not be unique. In case several vertex covers $V_{vc}^{(1)}$,
..., $V_{vc}^{(K)}$ exist, each with the same cardinality X (not necessarily minimum vertex
covers), a vertex i is called a backbone vertex if it is either a member of all covers
($\forall k = 1, \ldots, K: i \in V_{vc}^{(k)}$) or else a member of no cover ($\forall k = 1, \ldots, K: i \notin V_{vc}^{(k)}$).
These vertices can be regarded as frozen in all vertex-cover configurations. All other
vertices are called non-backbone vertices.
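The backbone definition can be checked directly on small graphs by enumerating all covers of a given cardinality. The sketch below does this by brute force; the names `covers_of_size` and `backbone` and the example graph in the usage note are hypothetical and serve only as an illustration.

```python
from itertools import combinations

def covers_of_size(vertices, edges, X):
    """All vertex covers with exactly X vertices (exhaustive enumeration)."""
    return [set(c) for c in combinations(vertices, X)
            if all(u in c or v in c for u, v in edges)]

def backbone(vertices, edges, X):
    """Return (always, never): vertices contained in all covers of size X,
    and vertices contained in none of them. Both sets form the backbone."""
    covers = covers_of_size(vertices, edges, X)
    if not covers:
        return set(), set()
    always = set.intersection(*covers)
    never = set(vertices) - set.union(*covers)
    return always, never
```

For the graph with edges (1,2), (1,3), (1,4), (4,5) the covers of size 2 are {1,4} and {1,5}. Vertex 1 is covered in both, vertices 2 and 3 in neither, so 1, 2 and 3 are backbone vertices, while 4 and 5 are non-backbone vertices.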
Figure 12.2: All three minimum vertex covers of the graph from the preceding
example.
Turning back to VC, when the number xN of covering marks is increased (c is kept
constant), the model is expected to undergo an uncoverable-coverable transition. Being
able to cover only a small number xN of vertices, it is very unlikely that one will be
able to cover all edges of a random graph. With increasing size of the cover set this
will become more and more likely, while for X = N it is certain that all edges are
covered. For a given graph G, we will denote the minimum fraction of covered vertices
necessary to cover the whole graph by $x_c(G) = X_c(G)/N$. The value of $x_c(G)$ is
related to the energy e(G, x): it is just the smallest number x where the energy e(G, x)
vanishes. The average of e(G, x) over all random graphs as a function of x will be
denoted e(x). Later we will see that by taking the thermodynamic limit $N \to \infty$ the
size $x_c = X_c/N$ of the minimum cover set will only depend on the concentration c of
the edges, i.e. $x_c = x_c(c)$.
Using probabilistic tools, rigorous lower and upper bounds for this threshold [7]
and the asymptotic behavior for large connectivities [8] have been deduced. Recently,
the problem has been investigated using a statistical physics approach [9] and the results
have been improved drastically. Up to an average concentration $c = e/2 \approx 1.359$ the
transition line is given exactly by
Figure 12.3: Phase diagram. Fraction $x_c(c)$ of vertices in a minimal vertex cover as
a function of the average connectivity c. For $x > x_c(c)$, almost all graphs have covers
with xN vertices, while they have almost surely no cover for $x < x_c(c)$. The solid
line shows the result from statistical physics. The upper bound of Harant is given by
the dashed line, the bounds of Gazmuri by the dash-dotted lines. The vertical line is
at $c = e/2 \approx 1.359$. The coverable and uncoverable regions are labeled COV and UNCOV.
vertices as are necessary. Thus, it seems favorable to cover vertices with a high degree.
This step can be iterated, while the degree of the vertices is adjusted dynamically by
removing edges which are covered. This leads to the following greedy algorithm, which
returns an approximation of the minimum vertex cover $V_{vc}$; the size $|V_{vc}|$ is an upper
bound of the true minimum vertex-cover size:
algorithm greedy-cover(G)
begin
    initialize V_vc = {} (empty set);
    while there are uncovered edges do
    begin
        take one vertex i with the largest current degree d_i;
        mark i as covered: V_vc = V_vc + {i};
        remove from E all edges (i, j) incident to i;
    end;
    return(V_vc);
end
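A direct transcription of the greedy heuristic into runnable code could look as follows. It is a minimal sketch: the graph is assumed to be given as a list of edges with integer vertex labels, ties between vertices of equal degree are broken in favor of the smallest label (an arbitrary choice not fixed by the listing), and degrees are recomputed from scratch in each iteration for brevity rather than updated incrementally.

```python
def greedy_cover(edges):
    """Greedy vertex-cover heuristic: repeatedly cover a vertex of largest
    current degree and remove its incident edges. Returns the cover set."""
    remaining = {frozenset(e) for e in edges if e[0] != e[1]}  # uncovered edges
    cover = set()
    while remaining:
        degree = {}
        for e in remaining:
            for v in e:
                degree[v] = degree.get(v, 0) + 1
        # largest current degree; ties go to the smallest label (assumption)
        i = max(degree, key=lambda v: (degree[v], -v))
        cover.add(i)
        remaining = {e for e in remaining if i not in e}
    return cover
```

On a small tree whose root (vertex 1) has three children 2, 3, 4, each carrying one leaf, the heuristic first covers the root and then needs three more vertices, so it returns a cover of size 4 although {2, 3, 4} is a cover of size 3. The returned size is therefore only an upper bound on the minimum cover size.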
Example: Heuristic
To demonstrate how the heuristic operates, we consider the graph shown in
Fig. 12.4. In the first iteration, vertex 3 is chosen, because it has degree 4,
which is the largest in the graph. The vertex is covered, and the incident edges
(1,3), (2,3), (3,4) and (3,5) are removed. Now, vertices 6 and 7 have the
highest degree 3. We assume that in the second iteration vertex 6 is covered
and vertex 7 in the third iteration. Then the algorithm stops, because all
edges are covered.
□
In the preceding example we have seen that the heuristic is sometimes able to find a
true minimum vertex cover. But this is not always the case. In Fig. 12.5 a simple
counterexample is presented, where the heuristic fails to find the true minimum vertex
cover. First the algorithm covers the root vertex, because it has degree 3. Thus, three
additional vertices have to be subsequently covered, i.e. the heuristic covers 4 vertices.
But the minimum vertex cover has only size 3, as indicated in Fig. 12.5.
The heuristic can be easily altered for the case where the number X of covered vertices
is fixed and a minimum number of uncovered edges is asked for. Then the iteration
also has to stop when the size of the cover set has reached X. In case a vertex
cover is found before X vertices are covered, arbitrary vertices can be added to the
vertex-cover set $V_{vc}$ until $|V_{vc}| = X$.
So far we have presented a simple heuristic to find approximations of minimum vertex
covers. Next, two exact algorithms are explained: divide-and-conquer and branch-
and-bound, which both incorporate the heuristic to gain a high efficiency. Without
the heuristic, the algorithms would still be exact, but slower.
12.2 Numerical Methods
Figure 12.4: Example of the cover heuristic. Upper left: initial graph. Upper
right: graph after the first iteration, vertex 3 has been covered (shown in bold) and the
incident edges removed (shown with dashed line style). Bottom: graph after second
and third iteration.
The basic idea of both methods is as follows, where again we are interested first in a VC
of minimum size: as each vertex is either covered or uncovered, there are $2^N$ pos-
sible configurations which can be arranged as leaves of a binary (backtracking) tree,
see Fig. 12.6. At each node, the two subtrees represent the subproblems where the
corresponding vertex is either covered ("left subtree") or uncovered ("right subtree").
Vertices which have not been touched at a certain level of the tree are said to be free.
Both algorithms do not descend further into the tree when a cover has been found,
i.e. when all edges are covered. Then the search continues in higher levels of the tree
(backtracking) for a cover which possibly has a smaller size. Since the number of
nodes in a tree grows exponentially with system size, algorithms which are based on
backtracking trees have a running time which may grow exponentially with the system
size. This is not surprising, since the minimal-VC problem is NP-hard, so all exact
Figure 12.5: A small sample graph with minimum vertex cover of size 3. The vertices
belonging to the minimum $V_{vc}$ are indicated by dark/bold circles. For this graph the
heuristic fails to find the true minimum cover, because it starts by covering the root
vertex, which has the highest degree 3.
Figure 12.6: Binary backtracking tree for the VC. Each node of the backtracking tree
corresponds to a vertex which is either covered ("left subtree") or uncovered ("right
subtree").
algorithm divide-and-conquer(G)
begin
    take one free vertex i with the largest current degree d_i;
    mark i as free;
    if size1 < size2 then
        return(size1);
    else
        return(size2);
end
Figure 12.7: Example of how the divide-and-conquer algorithm operates. Above the
graph is shown. The vertex i with the highest degree is considered. In the case where
it is covered (left subtree), all incident edges can be removed. In the case where it
is uncovered, (right subtree) all neighbors have to be covered and all edges incident
to the neighbors can be removed. In both cases, the graph may split into several
components, which can be treated independently by recursive calls of the algorithm.
The algorithm can be easily extended to record the cover sets as well or to calculate
the degeneracy. In Fig. 12.7 an example is given of how the algorithm operates.
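The operation described in the caption of Fig. 12.7 can be sketched in runnable form. This is an illustration under stated assumptions, not the listing of the text: the graph is stored as a dictionary of neighbor sets, covered and uncovered vertices are removed by building reduced copies of the graph instead of marking and unmarking them, and the helper names (`build_adjacency`, `components`, `min_cover_size`) are hypothetical.

```python
def build_adjacency(edges):
    """Dictionary vertex -> set of neighbors."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def remove_vertices(adj, gone):
    """Copy of the graph without the vertices in `gone` and their edges."""
    return {u: nb - gone for u, nb in adj.items() if u not in gone}

def components(adj):
    """Connected components (as subgraphs), skipping isolated vertices."""
    seen, result = set(), []
    for start in adj:
        if start in seen or not adj[start]:
            continue
        comp, stack = {start}, [start]
        while stack:
            u = stack.pop()
            for w in adj[u]:
                if w not in comp:
                    comp.add(w)
                    stack.append(w)
        seen |= comp
        result.append({u: adj[u] for u in comp})
    return result

def min_cover_size(adj):
    """Size of a minimum vertex cover by divide-and-conquer."""
    if not any(adj.values()):
        return 0                                   # no edges left
    i = max(adj, key=lambda v: len(adj[v]))        # largest current degree
    # left subtree: i is covered, its incident edges disappear
    left = remove_vertices(adj, {i})
    size1 = 1 + sum(min_cover_size(c) for c in components(left))
    # right subtree: i is uncovered, so all its neighbors must be covered
    nbrs = set(adj[i])
    right = remove_vertices(adj, nbrs | {i})
    size2 = len(nbrs) + sum(min_cover_size(c) for c in components(right))
    return min(size1, size2)
```

For the counterexample tree of the greedy heuristic (root with three children, each carrying one leaf) the function returns 3, obtained in the right subtree where the root is left uncovered.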
The algorithm is able to treat large graphs deep in the percolating regime. We have
calculated for example minimum vertex covers for graphs of size N = 560 with average
edge density c = 1.3
For average edge densities larger than 4, the divide-and-conquer algorithm is too slow,
because the graph only rarely splits into several components. Then a branch-and-
bound approach [12, 13, 14] is favorable. It differs from the previous method by
the fact that no independent components of the graph are calculated. Instead, some
subtrees of the backtracking tree are omitted by introducing a bound, which saves a
lot of running time. This is achieved by storing three quantities, assuming that the
algorithm is currently at some node in the backtracking tree:
• best: the size of the smallest vertex cover found in subtrees visited so far (initially
best = N).
• X denotes the number of vertices which have been covered in higher levels of the
tree.
• Always a table of free vertices ordered in descending current degree $d_i$ is kept
Then the algorithm returns to the preceding level of the backtracking tree.
Vertex 7 is uncovered. Thus, since only full covers of the graph are desired, all
its neighbors must be covered, namely vertices 5, 9 and 11. Again a cover of
the whole graph has been obtained. Now X = 4 vertices have been covered,
so no better optimum has been found, i.e. still best = 3.
For vertices 5, 9 and 11 only covered states had to be considered. Thus, the
algorithm returns directly 3 levels in the backtracking tree. Then it returns
one more level, since for vertex 7 both states have been considered. Next,
vertex 6 is uncovered and its neighbors 4, 8 and 10 are covered (X = 4),
see Fig. 12.2. Now three uncovered edges remain, i.e. the algorithm does not
return immediately. Hence, the calculations for the bound are performed.
F = best - X = 3 - 4 = -1 is obtained. This means no more vertices can
be covered, i.e. D = 0 which is smaller than the number of uncovered edges
(= 3). Therefore, the bound becomes active and the algorithm returns to the
preceding level.
Now the algorithm again reaches the top level of the backtracking tree. Vertex
3 is uncovered and all its neighbors (1, 2, 4 and 5) are covered (X = 4), see
Fig. 12.2. Again, no VC has been found, i.e. the algorithm proceeds with the
The algorithm returns again to the top level and has finished. The
minimum vertex cover has size best = 3. Please note that the backtracking
tree only contains 17 nodes on 7 levels, while a full configuration tree would
contain 12 levels, $2^{11}$ leaves and $2^{12} - 1 = 4095$ nodes. This shows clearly
that by the branch-and-bound method the running time can be decreased
considerably.
□
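The bound used in the example can be put into a compact runnable sketch. It follows the quantities named in the example (best, X, F = best - X, and D, taken here to be the sum of the F largest current degrees among the free vertices), but it is only an illustration under assumptions: reduced copies of the graph replace the marking of vertices, the degrees are re-sorted at every node instead of being kept in a table, and the names `build_adjacency` and `min_cover_size_bb` are hypothetical.

```python
def build_adjacency(edges):
    """Dictionary vertex -> set of neighbors."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def min_cover_size_bb(adj):
    """Minimum vertex-cover size by branch-and-bound.
    Returns (best, number of visited nodes of the backtracking tree)."""
    best = len(adj)          # initially best = N
    visited = 0

    def reduce(graph, gone):
        return {u: nb - gone for u, nb in graph.items() if u not in gone}

    def search(graph, X):
        nonlocal best, visited
        visited += 1
        uncovered = sum(len(nb) for nb in graph.values()) // 2
        if uncovered == 0:               # a cover has been found
            best = min(best, X)
            return
        F = best - X                     # vertices that may still be covered
        degrees = sorted((len(nb) for nb in graph.values()), reverse=True)
        D = sum(degrees[:max(F, 0)])     # most edges F vertices can cover
        if D < uncovered:
            return                       # bound active: omit this subtree
        i = max(graph, key=lambda v: len(graph[v]))
        search(reduce(graph, {i}), X + 1)                    # i covered
        nbrs = set(graph[i])
        search(reduce(graph, nbrs | {i}), X + len(nbrs))     # i uncovered

    search(adj, 0)
    return best, visited
```

With F = best - X the search still enters subtrees that can only reproduce the current best size. This is harmless for finding the minimum and corresponds to the example above, where the bound becomes active for F = -1.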
For the preceding example, $F \le 0$ was always obtained in the calculation of the bounds.
This is due to the small graph sizes, which can be treated within the scope
of an example. With real applications, F > 0 occurs. Then, for every calculation of
the bound, one has to access the F vertices of largest current connectivity. Therefore,
it is favorable to implement the table of vertices as two arrays $v_1$, $v_2$ of sets of ver-
tices. The arrays are indexed by the current degree of the vertices. The sets in the
first array $v_1$ contain the F free vertices of the largest current degree, while the other
array contains all other free vertices. Every time a vertex changes its degree, it is
moved to another set, and possibly even changes the array. Also, in case the mark
(free/covered/uncovered) of a vertex changes, it may be entered in or removed from
an array; possibly the smallest-degree vertex of $v_1$ is moved to $v_2$ or vice versa. Since
we are treating graphs of finite average connectivity, this implementation ensures that
the running time spent in each node of the graph is growing slower than linear in the
graph size¹. For the sake of clarity, we have omitted the update operation for both
arrays from the algorithmic presentation.
The algorithm, as it has been presented, is suitable for solving the optimization prob-
lem, i.e. finding the smallest feasible size $X_c = N x_c$ of a vertex cover, i.e. the minimum
number of covered vertices needed to cover the graph fully. The algorithm can be eas-
ily modified to treat the problem where the size $X = |V_{vc}|$ is given and a configuration
with minimum energy is to be found. Then, in best, it is not the current smallest size
of a vertex cover but the smallest number of uncovered edges (i.e. the energy) that
is stored. The bound becomes active if F + D is larger than or equal to the current
number of uncovered edges. Furthermore, when a vertex is uncovered, the step where
all neighbors are covered cannot be applied, because the configuration of the lowest
energy may not be a VC. On the other hand, if a VC has been found, i.e. all edges are
covered, the algorithm can stop, because for sure no better solution can be obtained.
But the algorithm can stop only for the case when one optimum has to be obtained.
In case all minima have to be enumerated, the algorithm has to proceed and the bound
becomes active only if F + D is strictly larger than (not equal to) the current
number of uncovered edges.
Although the branch-and-bound algorithm is very simple, in the regime 4 < c < 10
random graphs up to size N = 140 can be treated. It is difficult to compare the
branch-and-bound algorithm to more elaborate algorithms [14, 15], because they are
usually tested on a different graph ensemble where each edge appears with a certain
probability, independently of the graph size (high-connectivity regime). Still,
in the computer-science literature usually graphs with up to 200 vertices are treated,
which is slightly larger than the systems considered here. However, the algorithm
presented here has the advantage that it is easy to implement and its power is sufficient
to study interesting problems. Some results are presented in the next section.
12.3 Results
First, the problem is considered where the energy is minimized for fixed values of x.
As stated in the first section, we know that for small values of x the energy (12.1)
is not zero [e(0) = c], i.e. no vertex covers with xN vertices covered exist. On the
other hand, for large values of x, the random graphs are almost surely coverable,
i.e. e(x) = 0. In Fig. 12.12 the average ground-state energy and the probability
$P_{cov}(x)$ that a graph is coverable with xN vertices are shown for different system sizes
N = 25, 50, 100 (c = 1.0). The results were obtained using the branch-and-bound
algorithm presented in the last section. The data are averages over $10^3$ (N = 100)
to $10^4$ (N = 25, 50) samples. As expected, the value of $P_{cov}(x)$ increases with the
fraction of covered vertices. With growing graph sizes, the curves become steeper.
This indicates that in the limit $N \to \infty$, which we are interested in, a sharp threshold
$x_c \approx 0.39$ appears. Above $x_c$ a graph is coverable with probability one, below it is
almost surely not coverable. Thus, in the language of a physicist, a phase transition
from an uncoverable phase to a coverable phase occurs. Please note that the value $x_c$
of the critical threshold depends on the average density c of edges. The result for the
phase boundary $x_c$ as a function of c obtained from the simulations is shown later on.
¹ Efficient implementations of sets require at most O(log S) time for the operations, where S is the
size of a set.
Figure 12.12: Probability $P_{cov}(x)$ that a cover exists for a random graph (c = 2) as
a function of the fraction x of covered vertices. The result is shown for three different
system sizes N = 25, 50, 100 (averaged over $10^3$ to $10^4$ samples). Lines are guides to the
eyes only. In the left part, where $P_{cov}$ is zero, the energy e (see text) is displayed.
The inset enlarges the result for the energy in the region $0.3 \le x \le 0.5$.
In Fig. 12.13 the median running time of the branch-and-bound algorithm is shown
as a function of the fraction x of covered vertices. The running time is measured
in terms of the number of nodes which are visited in the backtracking tree. Again
graphs with c = 1.0 were considered and an average over the same realizations has
been performed. A sharp peak can be observed near the transition $x_c$. This means
that the instances which are the hardest to solve can be found near the phase transition.
Please note that for values $x < x_c$ the running time still increases exponentially, as
can be seen from the inset of Fig. 12.13. For values x considerably larger than the
critical value $x_c$, the running time is linear. The reason is that the heuristic is already
able to find a VC, i.e. the algorithm terminates after the first descent into the backtracking
tree².
Please note that in physics phase transitions are usually indicated by divergences in
measurable quantities such as specific heat or magnetic susceptibilities. The peak
appearing in the time complexity serves as a similar indicator, but is not really equiv-
alent, because the time complexity diverges everywhere; only the rate of divergence is
much stronger near the phase transition.
² The algorithm terminates after a full cover of the graph has been found.
Figure 12.13: Time complexity of the vertex cover. Median number of nodes vis-
ited in the backtracking tree as a function of the fraction x of covered vertices for
graph sizes N = 20, 25, 30, 35, 40 (c = 1.0). The inset shows the region below the
threshold with logarithmic scale, including also data for N = 45, 50. The fact that
in this representation the lines are equidistant shows that the time complexity grows
exponentially with N.
For the uncoverable region, the running time is also fast, but still exponential. This is
due to the fact that a configuration with a minimum number of uncovered edges has
to be obtained. If only the question whether a VC exists or not is to be answered, the
algorithm can be easily changed³, such that for small values of x again a polynomial
running time will be obtained.
To calculate the phase diagram numerically, as presented in Fig. 12.3 for the analytical
results, it is sufficient to calculate for each graph the size X_c = N x_c of a minimum
vertex cover, as done by the version of the branch-and-bound algorithm or the divide-
and-conquer method presented in the last chapter. To compare with the analytical
result from Eq. (12.2), one has to perform the thermodynamic limit numerically. This
can be achieved by calculating an average value x_c(N) for different graph sizes N.
The results for c = 1.0 are shown in the inset of Fig. 12.14. Then one fits a function
to the data. The form of the function is purely heuristic; no exact justification exists.
But in case you do not know anything about the finite-size behavior of a quantity, an
algebraic ansatz of the form (12.3) is always a good guess. As can be seen from the
inset, the fit matches well.
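The extrapolation can be written out explicitly. The following lines assume that the algebraic ansatz (12.3) has the form quoted in the caption of Fig. 12.14, with a and b as fit parameters next to the limiting value x_c; the logarithmic form in the second line is only a consistency check that follows from the first line for a > 0.

```latex
x_c(N) = x_c + a\,N^{-b}, \qquad b > 0
\quad\Longrightarrow\quad \lim_{N\to\infty} x_c(N) = x_c ,
\\[4pt]
\ln\bigl[x_c(N) - x_c\bigr] = \ln a - b\,\ln N \qquad (a > 0).
```

With the fitted x_c inserted, the data should therefore fall on a straight line of slope -b in a double-logarithmic plot of x_c(N) - x_c versus N.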
Figure 12.14: Phase diagram showing the fraction x_c(c) of vertices in a minimal
vertex cover as a function of the average connectivity c. For x > x_c(c), almost all
graphs have covers with xN vertices, while they have almost surely no cover for
x < x_c(c). The solid line shows the analytical result. The circles represent the results
of the numerical simulations. Error bars are much smaller than symbol sizes. The
upper bound of Harant is given by the dashed line, the bounds of Gazmuri by the dash-
dotted lines. The vertical line is at c = e/2 ≈ 1.359. Inset: all numerical values were
calculated from finite-size scaling fits of x_c(N, c) using functions x_c(N) = x_c + a N^(-b).
We show the data for c = 1.0 as an example.
This procedure has been performed for several concentrations c of the edges. The
result is indicated in Fig. 12.14 by symbols. Obviously, the numerical data and the
analytical result, which has been obtained by using methods from statistical physics,
agree very well in the region c < e/2 ≈ 1.359, as expected. But for larger connectivities
of the graph the agreement is also very good.
A stronger deviation between numerical and analytical results can be observed for the
fraction b_c of the backbone vertices. Please remember that the backbone of a graph
contains all vertices which have the same state in all degenerate minimum vertex covers
(or all lowest energy configurations), i.e. they are either always covered or always
uncovered. The numerical result can be obtained in a similar fashion as the threshold.
For each graph G all minimum vertex covers are enumerated. All vertices having the
Figure 12.15: The total backbone size of minimal vertex covers as a function of
c. The solid line shows the analytical result. Numerical data are represented by the
error bars. They were obtained from finite-size scaling fits similar to the calculation
for x_c(c). The vertical line is at c = e/2 ≈ 1.359, where the analytical result ceases
to be exact.
same state in all configurations belong to the backbone B. Then, the resulting fraction
b_c(G) = |B|/N of backbone vertices is averaged by considering different realizations
of the random graphs, for one graph size N. The process is performed for different
values of N. These data can be used to extrapolate to the thermodynamic limit via a
finite-size scaling function. The result, as a function of different edge concentrations c,
is displayed in Fig. 12.15 and compared with the analytical result. Again, a very good
agreement can be observed for low values c < e/2 ≈ 1.359, while for graphs having a
stronger connectivity, larger deviations occur. Please note that for the case c = 0.0,
where the graph has no edges, no vertex needs to be covered, meaning that all vertices
belong to the backbone [b_c(0) = 1].
Many more results can be found in [16, 17]. In particular the critical concentration
c = e/2 ≈ 1.359 is related to the behavior of the subgraphs consisting only of the
non-backbone vertices. The analytical calculations are displayed in [17]. A calculation
of the average running time for a simple algorithm, distinguishing the polynomial and
the exponential regime, can be found in [18].
Bibliography
[1] B. Hayes, Am. Scient. 85, 108 (1997)
292 12 Branch-and-bound Methods
Here practical aspects of conducting research via computer simulations are discussed.
It is assumed that you are familiar with an operating system such as UNIX (e.g.
Linux), a high-level programming language such as C, Fortran or Pascal and have
some experience with at least small software projects.
Because of the limited space, usually only short introductions to the specific areas are
given and references to more extensive literature are cited. All examples of code are
in C/C++.
First, a short introduction to software engineering is given and several hints allowing
the construction of efficient and reliable code are stated. In the second section a short
introduction to object-oriented software development is presented. In particular, it is
shown that this kind of programming style can be achieved with standard procedural
languages such as C as well. Next, practical hints concerning the actual process of
writing the code are given. In the fourth section macros are introduced. Then it
is shown how the development of larger pieces of code can be organized with the
help of so-called makefiles. In the subsequent section the benefits of using libraries like
Numerical Recipes or LEDA are explained and it is shown how you can build your own
libraries. In the seventh section the generation of random numbers is covered, while
in the eighth section three very useful debugging tools are presented. Afterwards,
programs to perform data analysis, curve fitting and finite-size scaling are explained.
In the last section an introduction to information retrieval and literature search on the
Internet and to the preparation of presentations and publications is given.
[1, 2]. Here just the steps that should be undertaken to create a sophisticated software
development process are stated. The following descriptions refer to the usual situation
you find in physics: one or a few people are involved in the project. How to manage
the development of big programs involving many developers is explained in the literature.
• Definition of the problem and solution strategies
You should write down which problem you would like to solve. Drawing diagrams
is always helpful! Discuss your problem with others and tell them how you would
like to solve it. In this context many questions may appear; here some examples
are given:
- What is the input you have to supply? In case you have only a few parameters,
they can be passed to the program via options. In other cases, especially when
chemical systems are to be simulated, many parameters have to be controlled
and it may be advisable to use extra parameter files.
- Which results do you want to obtain and which quantities do you have to
analyze? Very often it is useful to write the raw results of your simulations,
e.g. the positions of all atoms or the orientations of all spins of your system, to
a configuration file. The physical results can be obtained by post-processing.
Then, in case new questions arise, it is very easy to analyze the data again.
When using configuration files, you should estimate the amount of data you
generate. Is there enough space on your disk? It may be helpful to include
the compression of the data files directly in your programs^1.
- Can you identify "objects" in your problem? Objects may be physical entities
like atoms or molecules, but also internal structures like nodes in a tree or
elements of tables. Seeing the system and the program as a hierarchical
collection of objects usually makes the problem easier to understand. More
on object-oriented development can be found in Sec. 13.2.
- Is the program to be extended later on? Usually a code is "never" finished.
You should foresee later extensions of the program and set up everything in
a way it can be reused easily.
- Do you have existing programs available which can be included into the software
project? If you have implemented your previous projects in the above
mentioned fashion, it is very likely that you can recycle some code. But this
requires experience and is not very easy to achieve at the beginning. Over
the years, however, you will have a growing library of programs which enables you to
finish future software projects much quicker.
- Has somebody else created a program which you can reuse? Sometimes you
can rely on external code like libraries. Examples are the Numerical Recipes
[3] and the LEDA library [4] which are covered in Sec. 13.5.
- Which algorithms are known? Are you sure that you can solve the problem
at all? Many other techniques have been invented already. You should always
1 In C this can be achieved by calling system("gzip -f <filename>"); after the file has been
written and closed.
search the literature for solutions which already exist. How searches can be
simplified by using electronic data bases is covered more deeply in Sec. 13.9.
Sometimes it is necessary to invent new methods. This part of a project may
be the most time consuming.
But it is not necessary to identify all basic operations at the beginning.
During the development of the code, new applications may arise, which lead to
the need for further operations. Also it may be required to change or extend the
data structures defined before. However, the more you think in advance, the less
you need to change the program later on.
As an example, the problem of finding ground states in Ising spin glasses via
simulated annealing is considered. Some of the basic operations are:
- Set up the data structures for storing the realizations of the interactions and
for storing the spin glass configurations.
- Create a random realization of the interactions.
- Initialize a random spin configuration.
- Calculate the energy of a spin in the local field of its neighbors.
- Calculate the total energy of a system.
- Calculate the energy changes associated with a spin flip.
- Execute a Monte Carlo step.
- Execute a whole annealing run.
- Calculate the magnetization.
- Save a realization and corresponding spin configurations in a file.
• Distributing work
In case several people are involved in a project, the next step is to split up the
work between the coworkers. If several types of objects appear in the program
design, a natural approach is to make everyone responsible for one or several
types of objects and the related operations. The code should be broken up into
several modules (i.e. source files), such that every module is written by only one
person. This makes the implementation easier and also helps testing the code (see
below). Nevertheless, the partitioning of the work requires much care, since quite
often some modules or data types depend on others. For this reason, the actual
implementation of a data type should be hidden. This means that all interactions
should be performed through exactly defined interfaces which do not depend on
the internal representation, see also Sec. 13.2 on object-oriented programming.
When several people are editing the same files, which is usually necessary later
on, even when initially each file was created by only one person, then you should
use a source-code management system. It prevents several people from performing
changes on the same file in parallel, which would cause a lot of trouble. Additionally,
a source-code management system enables you to keep track of all changes
made. An example of such a system is the Revision Control System (RCS), which
is freely available through the GNU project [5] and part of the free operating
system Linux.
• Implementing the code
With good preparation, the actual implementation becomes only a small part of
the software development process. General style rules, guaranteeing clearly structured
code which can even be understood several months later, are explained in
Sec. 13.3. You should use a different file, i.e. a different module, for each coherent
unit of data structures and subroutines; when using an object-oriented language
you should define different classes (see Sec. 13.2). This rule should be obeyed for
the case of a one-person project as well. Large software projects containing many
modules are easily maintained via makefiles (see Sec. 13.4.2).
Each subroutine and each module should be tested separately, before integrating
many modules into one program. In the following some general hints concerning
testing are presented.
• Testing
When performing tests on single subroutines, standard cases usually are used.
This is the reason why many errors become apparent much later. Then, because
the modules have already been integrated into one single program, errors are much
harder to localize. For this reason, you should always try to find special and rare
cases as well when testing a subroutine. Consider for example a procedure which
inserts an element into a list. Then not only inserting in the middle of the list, but
also at the beginning, at the end and into an empty list must be tested. Also, it is
strongly recommended to read your code carefully once again before considering
it finished. In this way many bugs can be found easily which otherwise must be
tracked down by intensive debugging.
The actual debugging of the code can be performed by placing print instructions at
selected positions in the code. But this approach is quite time consuming, because
you have to modify and recompile your program several times. Therefore, it is
advisable to use debugging tools like a source-code debugger and a program for
checking the memory management. More about these tools can be found in Sec.
13.7. But usually you also need special operations which are not covered by an
available tool. You should always write a procedure which prints out the current
instance of the system that is simulated, e.g. the nodes and edges of a graph or
the interaction constants of an Ising system. This facilitates the types of tests
which are described in the following.
After the raw operation of the subroutines has been verified, more complex tests
can be performed. When e.g. testing an optimization routine, you should compare
the outcome of the calculation for a small system with the result which can be
obtained by hand. If the outcome is different from the expected result, the small
size of the test system allows you to follow the execution of the program step
by step. For each operation you should think about the expected outcome and
compare it with the result originating from the running program.
298 13 Practical Issues
But this is not true for some C++ compilers when combining with option -g.
13.1 Software Engineering 299
• Writing documentation
This part of the software development process is very often disregarded, especially
in the context of scientific research, where no direct customers exist. But even if
you are using your own code, you should write good documentation. It should
consist of at least three parts:
- Comments in the source code: You should place comments at the beginning
of each module, in front of each subroutine or each self-defined data structure,
for blocks of the code and for selected lines. Additionally, meaningful names
for the variables are crucial. Following these rules makes later changes and
extensions of the program much more straightforward. You will find more
hints on how a good programming style can be achieved in Sec. 13.3.
- On-line help: You should include a short description of the program, its
parameters and its options in the main program. It should be printed when
the program is called with the wrong number/form of the parameters, or when
the option -help is passed. Even when you are the author of the program,
after it has grown larger it is quite hard to remember all options and usages.
- External documentation: This part of the documentation process is important
when you would like to make the program available to other users or
when it grows really complex. Writing good instructions is really a hard job.
When you remember how often you have complained about the instructions
for a video recorder or a word processor, you will understand why there is a
high demand for good authors of documentation in industry.
- Where to put the results? In many cases you have to investigate your model
for different parameters. You should organize the directories where you put
the data and the names of the files in such a way that even years later the
former results can be found quickly. You should put a README file in each
directory, explaining what it contains.
If you want to start a sequence of several simulations, you can write a short
script, which calls your program with different parameters within a loop.
Logfiles are very helpful, where during each simulation some information
about the ongoing processes is written automatically. Your program should
put its version number and the parameters which have been used to start the
simulation in the first line of each logfile. This allows a reconstruction of how
the results have been obtained.
The steps given do not usually occur in linear order. It is quite common that after you
have written a program and performed some simulations, you are not satisfied with
the performance or new questions arise. Then you start to define new problems and
the program will be extended. It may also be necessary to extend the data structures,
when e.g. new attributes of the simulated models have to be included. It is also possible
that a nasty bug is still hidden in the program, which is found later on during the
actual simulations and becomes obvious by results which cannot be explained. In this
case changes cannot be circumvented either.
In other words, the software development process is a cycle which is traversed several
times. As a consequence, when planning your code, you should always keep this in
mind and set up everything in a flexible way, so that extensions and code recycling
can be performed easily.
different chairs belong to the class "chairs". The objects of many classes can have
internal states, e.g. a traffic light can be red, yellow or green. The state of a
computer is much more difficult to describe. Furthermore, objects are useful for
the environment, because other objects interact via operations with the object.
You (belonging to the class "human") can read the state of a traffic light, some
central computer may set the state or even switch the traffic light off.
Similar to the real world, you can have objects in programs as well. The internal
state of an object is given by the values of the variables describing the object. Also
it is possible to interact with the objects by calling subroutines (called methods
in this context) associated with the objects.
Objects and the related methods are seen as coherent units. This means you define
within one class definition the way the objects look, i.e. the data structures,
together with the methods which access/alter the content of the objects. The
syntax of the class definition depends on the programming language you use.
Since implementational details are not relevant here, the reader is referred to the
literature.
When you take the viewpoint of a pure object-oriented programmer, then all
programs can be organized as collections of objects calling methods of each other.
This is derived from the structure the real world has: it is a large set of interacting
objects. But for writing good programs it is as in real life: taking an orthodox
position imposes too many restrictions. You should take the best of both worlds,
the object-oriented and the procedural world, depending on the actual problem.
• Data capsuling
When using a computer, you do not care about the implementation. When you
press a key on the keyboard, you would like to see the result on the screen. You
are not interested in how the key converts your pressing into an electrical signal,
how this signal is sent to the input ports of the chips, how the algorithm treats
the signal and so on.
Similarly, a main principle of object-oriented programming is to hide the actual
implementation of the objects. Access to them is only allowed via given interfaces,
i.e. via methods. The internal data structures are hidden; this is called private
in C++. The data capsuling has several advantages:
- You do not have to remember the implementation of your objects. When
using them later on, they just appear as a black box fulfilling some duties.
- You can change the implementation later on without the need to change the
rest of the program. Changes of the implementation may be useful e.g. when
you want to increase the performance of the code or to include new features.
- Furthermore, you can have flexible data structures: several different types of
implementations may coexist. Which one is chosen depends on the requirements.
An example are graphs which can be implemented via arrays, lists,
hash tables or in other ways. In the case of sparse graphs, the list implementation
has a better performance. When the graph is almost complete, the
array representation is favorable. Then you only have to provide the basic access
methods, such as inserting/removing/testing vertices/edges and iterating
over them, for the different internal representations. Therefore, higher-level
algorithms like computing a spanning tree can be written in a simple way to
work with all internal implementations. When using such a class, the user
just has to specify the representation he wants; the rest of the program is
independent of this choice.
- Last but not least, software debugging is made easier. Since you have only
defined ways the data can be changed, undesired side-effects become less
common. Also the memory management can be controlled more easily.
• Inheritance
This means lower level objects can be specializations of higher level
objects. For example the class of (German) "ICE trains" is a subclass of "trains"
which itself is a subclass of "vehicles".
In computational physics, you may have a basic class of "atoms" containing mass,
position and velocity, and build upon this a class of "charged atoms" by including
the value of the charge. Then you can use the subroutines you have written for the
uncharged atoms, like moving the particles or calculating correlation functions,
for the charged atoms as well.
A similar form of hierarchical organization of objects works the other way round:
higher level objects can be defined in terms of lower level objects. For example a
book is composed of many objects belonging to the class "page". Each page can
be regarded as a collection of many "letter" objects.
For the physical example above, when modeling chemical systems, you can have
"atoms" as basic objects and use them to define "molecules". Another level up
would be the "system" object, which is a collection of molecules.
• Function/operator overloading
This inheritance of methods to lower level classes is an example of operator over-
loading. It just means that you can have methods for different classes having the
same name; sometimes the same code applies to several classes. This applies also
to classes which are not connected by inheritance. For example you can define
how to add integers, real numbers, complex numbers or larger objects like lists,
graphs or documents. In languages like C or Pascal you can define subroutines to
add numbers and subroutines to add graphs as well, but they must have different
names. In C++ you can define the operator "+" for all different classes. Hence,
the operator-overloading mechanism of object-oriented languages is just a tool
to make the code more readable and more clearly structured.
13.2 Object-oriented Software Development 303
• Software reuse
Once you have an idea of how to build a chair, you can do it several times.
Because you have a blueprint, the tools and the experience, building another
chair is an easy task.
This is true for building programs as well: both data capsuling and inheritance
facilitate the reuse of software. Once you have written your class for e.g. treating
lists, you can include it in other programs as well. This is easy, because later
on you do not have to care about the implementation. With a class designed in
a flexible way, much time can be saved when realizing new software projects.
As mentioned before, for object-oriented programming you do not necessarily have to
use an object-oriented language. It is true that they are helpful for the implementation
and the resulting programs will look slightly more elegant and clear, but you can
program everything with a language like C as well. In C an object-oriented style can
be achieved very easily. As an example a class histo implementing histograms is
outlined, which are needed for almost all types of computer simulations as evaluation
and analysis tools.
First you have to think about the data you would like to store. That is the histogram
itself, i.e. an array table of bins. Each bin just counts the number of events which
fall into a small interval. To achieve a high degree of flexibility, the range and the
number of bins must be variable. From this, the width delta of each bin can be
calculated. For convenience delta is stored as well. To count the number of events
which are outside the range of the table, the entries low and high are introduced.
Furthermore, statistical quantities like mean and variance should be available quickly
and with high accuracy. Thus, several summarized moments sum of the distribution
are stored separately as well. Here the number of moments _HISTO_NOM_ is defined as
a macro; converting this macro to a variable is straightforward. All together, this leads
to the following C data structure:
/* holds statistical information for a set of numbers: */
/* histogram, # of Numbers, sum of numbers, squares, ... */
typedef struct
{
  double  from, to;             /* range of histogram */
  double  delta;                /* width of bins */
  int     n_bask;               /* number of bins */
  double *table;                /* bins */
  int     low, high;            /* No. of data out of range */
  double  sum[_HISTO_NOM_];     /* sum of 1s, numbers, numbers^2 ... */
} histo_t;
Here, the postfix _t is used to stress the fact that the name histo_t denotes a type.
The bins are double variables, which allows for more general applications. Please
note that it is still possible to access the internal structures from outside, but it is
not necessary and not recommended. In C++, you could prevent this by declaring
the internal variables as private. Nevertheless, everything can be done via special
subroutines. First of all one must be able to create and delete histograms; please note
that some simple error-checking is included in the program:
  }
  else
  {
    for(t=0; t<n_bask; t++)
      his->table[t] = 0;
  }
  return(his);
}

/** Deletes a histogram 'his' **/
void histo_delete(histo_t *his)
{
  free(his->table);
  free(his);
}
• Split your code into several modules. This has several advantages:
- When you perform changes, you have to recompile only the modules which
have been edited. Otherwise, if everything is contained in a long file, the
whole program has to be recompiled each time again.
- Subroutines which are related to each other can be collected in single modules.
It is much easier to navigate in several short files than in one large program.
- After one module has been finished and tested it can be used for other projects.
Thus, software reuse is facilitated.
- Distributing the work among several people is impossible if everything is written
into one file. Furthermore, you should use a source-code management system
(see Sec. 13.1) in case several people are involved, in order to avoid uncontrolled
editing.
• To keep your program logically structured, you should always put data structures
and implementations of the operations in separate files. In C/C++ this means
you have to write the data structures in a header (.h) file and the code into a
source code (.c/.cpp) file.
• Try to find meaningful names for your variables and subroutines. Then,
during the programming process it is much easier to remember their meanings,
which helps a lot in avoiding bugs. Additionally, it is not necessary to look up
the meaning frequently. For local variables like loop counters, it is sufficient and
more convenient to have short (e.g. one letter) names.
In the beginning this might seem to take additional time (writing e.g.
'kinetic_energy' for a variable instead of 'x10'). But several months after you
have written the program, you will appreciate your effort, when you read the line
instead of
• You should use proper indentation of your lines. This helps a great deal in rec-
ognizing the structure of a program. Many bugs are caused by misaligned braces
forming a block of code. Furthermore, you should place at most one command
per line of code. The reader will probably agree that
• Avoid jumping to other parts of a program via the "goto" command. This is bad
style originating from programming in assembler or BASIC. In modern program-
ming languages, for every logical programming construct there are corresponding
commands. "Goto" commands make a program harder to understand and much
harder to debug if it does not work as it should.
In case you want to break out of a loop, you can use a while/until loop with a
flag that indicates if the loop is to be stopped. In C, if you are lazy, you can use
the commands break or continue.
• Do not use global variables. At first sight the use of global variables may seem
tempting: you do not have to care about parameters for subroutines, everywhere
the variables are accessible and everywhere they have the same name. Program-
ming is done much faster.
But later on you will have a bad time: many bugs are created by improper use of
global variables. When you want to check for a definition of a variable you have
to search the whole list of global variables, instead of just checking the parameter
- Block comments: You should divide each subroutine, unless it is very short,
into several logical blocks. A rule of thumb is that no block should be longer
than the number of lines you can display in your editor window. Within one
or two lines you should explain what is done in the block. Example:
/* go through all nodes except source s and sink t in */
/* reversed topological order and set capacities */
for(t2=num_nodes-2; t2>0; t2--)
- Line comments: They are the lowest level comments. Since you are using
(hopefully) sound names for data types, variables and subroutines, many lines
should be self explanatory. But in case the meaning is not obvious, you should
add a small comment at the end of a line, for example:
C(t, SOURCE) = cap_s2t[t];              /* restore capacities */
Aligning all comments to the right makes a code easier to read. Please avoid
unnecessary comments like
counter++;                              /* increase counter */
or unintelligible comments like
minimize_energy(spin, N, next, 5);      /* I try this one */
The line containing C(t, SOURCE) is an example of the application of a macro. This
subject is covered in the following section.
You can use the same sorts of names for macros as for variables. It is convention to use
only upper-case letters for macros. A macro can be deleted via the #undef directive.
When scanning the code, the preprocessor just replaces literally every occurrence of a
macro by its definition. If you have for example the expression 2.0*PI*omega in your
13.4 Programming Tools 311
code, the preprocessor will convert it into 2.0*3.1415926536*omega. You can use
macros also in the definition of other macros. But macros are not replaced in strings,
i.e. p r i n t f ("PI") ; will print PI and not 3.1415926536 when the program is running.
It is possible to test for the (non)existence of macros using the #ifdef and #ifndef
directives. This allows for conditional compiling or for platform-independent code,
such as e.g. in
#ifdef UNIX
...
#endif
#ifdef MSDOS
...
#endif
Please note that it is possible to supply definitions of macros to the compiler via
the -D option, e.g. gcc -o program program.c -DUNIX=1. If a macro is used only
for conditional #ifdef/#ifndef statements, an assignment like =1 can be omitted, i.e.
-DUNIX is sufficient.
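The conditional compilation described above can be sketched in a few lines. The function names platform_name() and print_platform() and the returned strings are hypothetical and serve only as illustration; the macros UNIX and MSDOS and the -D option are taken from the text.

```c
#include <stdio.h>

/* Hypothetical helper: reports which platform macro was defined at
   compile time, e.g. via  gcc -o program program.c -DUNIX          */
const char *platform_name(void)
{
#ifdef UNIX
  return "UNIX";               /* compiled only if UNIX is defined */
#else
#ifdef MSDOS
  return "MSDOS";              /* compiled only if MSDOS is defined */
#else
  return "unknown";            /* neither macro was supplied */
#endif
#endif
}

/* prints the platform selected at compile time */
void print_platform(void)
{
  printf("compiled for: %s\n", platform_name());
}
```

Only one of the three return statements reaches the compiler, so the executable contains no trace of the branches that were not selected.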
When programs are divided into several modules, or when library functions are used,
the definitions of data types and functions are provided in header files (.h files). Each
header file should be read by the compiler only once. When projects become more
complex, many header files have to be managed, and it may become difficult to avoid
multiple scanning of some header files. This can be prevented automatically by this
simple construction using macros:
#ifndef _MYFILE_H_
#define _MYFILE_H_
.... (rest of .h file)
(may contain other #include directives)
#endif /* _MYFILE_H_ */
After the body of the header file has been read the first time during a compilation
process, the macro _MYFILE_H_ is defined, thus the body will never be read again.
So far, macros are just constants. You will benefit from their full power when using
macros with arguments. They are given in parentheses after the name of the macro, such
as e.g. in
You do not have to worry more than usual about the names you choose for the ar-
guments, there cannot be a conflict with other variables of the same name, because
they are replaced by the expression you provide when a macro is used, e.g. MIN(4*a,
b-32) will be expanded to (4*a) < (b-32) ? (4*a) : (b-32).
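A definition of MIN that is consistent with the expansion quoted above might look as follows. This is only a sketch: the parameter names x and y, the additional outer parentheses, the deliberately broken variant MIN_BAD and the two small functions are assumptions made for illustration.

```c
/* Assumed definition of the macro discussed in the text; the outer
   parentheses are an extra safeguard for use inside larger expressions */
#define MIN(x,y) ( (x)<(y) ? (x):(y) )

/* Deliberately broken variant without any parentheses (comparison only) */
#define MIN_BAD(x,y) x<y ? x:y

/* MIN(4*a, b-32) is expanded to ( (4*a)<(b-32) ? (4*a):(b-32) ) */
int min_example(int a, int b)
{
  return MIN(4*a, b-32);
}

/* 2*MIN_BAD(3,5) is expanded to 2*3<5 ? 3:5, i.e. (6<5) ? 3:5,
   which is 5 and not the intended 2*3 = 6                     */
int bad_example(void)
{
  return 2*MIN_BAD(3,5);
}
```

The preprocessor substitutes text, not values, so the parentheses are the only thing that protects the arguments and the result from neighboring operators.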
The arguments are enclosed in parentheses () in the macro, because the comparison < must have
the lowest priority, regardless of which operators are included in the expressions that are
supplied as actual arguments. Furthermore, you should take care of unexpected side
effects. Macros do not behave like functions. For example when calling MIN(a++,b++)
the variable a or b may be increased twice when the program is executed. Usually it
is better to use inline functions (or sometimes templates in C++) in such cases. But
there are many applications of macros, which cannot be replaced by inline functions,
like in the following example, which closes this section.
The example illustrates how a program can be written in a clear way using macros,
making the program less error-prone, and furthermore allowing for a broad applicability.
A system of Ising spins is considered, that is a lattice where at each site i a particle
$\sigma_i$ is placed. Each particle can have only two states $\sigma_i = \pm 1$. It is assumed that all
lattice sites are numbered from 1 to N. This is different from C arrays, which start at
index 0; the benefit of starting with index 1 for the sites will become clear below. For
the simplest version of the model only neighbors of spins are interacting. With a
two-dimensional square lattice of size N = L x L a spin i, which is not at the boundary,
interacts with spins i+1 (+x-direction), i-1 (-x-direction), i+L (+y-direction) and
i-L (-y-direction). A spin at the boundary may interact with fewer neighbors when
free boundary conditions are assumed. With periodic boundary conditions (pbc), all
spins have exactly 4 neighbors. In this case, a spin at the boundary interacts also
with the nearest mirror images, i.e. with the sites that are neighbors if you consider
the system repeated in each direction. For a 10 x 10 system spin 5, which is in the
first row, interacts with spins 5+1 = 6, 5-1 = 4, 5+10 = 15 and through the pbc
with spin 95, see Fig. 13.1. The spin in the upper left corner, spin 1, interacts with
spins 2, 11, 10 and 91. In a program pbc can be realized by performing all calculations
modulo L (for the ±x-directions) and modulo $L^2$ (for the ±y-directions), respectively.
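Such a direct calculation of the neighbors with periodic boundary conditions can be sketched as follows. The function name neighbor_pbc() and the direction codes 0 to 3 are assumptions made for illustration; the sites are numbered row by row from 1 to L*L as in the text.

```c
/* Hypothetical helper (not taken from the book): returns the neighbor
   of site i (numbered 1..L*L, row by row) on an L x L square lattice
   with periodic boundary conditions.
   dir: 0 = +x, 1 = -x, 2 = +y, 3 = -y                               */
int neighbor_pbc(int i, int L, int dir)
{
  int x = (i-1) % L;                 /* column 0..L-1 of site i */
  int y = (i-1) / L;                 /* row    0..L-1 of site i */

  switch(dir)
  {
    case 0: x = (x+1) % L;   break;  /* step right, wrap modulo L */
    case 1: x = (x-1+L) % L; break;  /* step left,  wrap modulo L */
    case 2: y = (y+1) % L;   break;  /* next row,     wrap modulo L */
    case 3: y = (y-1+L) % L; break;  /* previous row, wrap modulo L */
  }
  return y*L + x + 1;                /* back to numbering 1..L*L */
}
```

For L = 10 the function gives the neighbors 6, 4, 15 and 95 for spin 5 and the neighbors 2, 10, 11 and 91 for spin 1, in agreement with the example above.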
This way of realizing the neighbor relations in a program has several disadvantages:
- You have to write the code everywhere where the neighbor relation is needed.
This makes the source code larger and less clear.
- When switching to free boundary conditions, you have to include further code to
check whether a spin is at the boundary.
- Your code works only for one lattice type. If you want to extend the program
to lattices of higher dimension you have to rewrite the code or provide extra
tests/calculations.
- Even more complicated would be an extension to different lattice structures such
as triangular or face-centered cubic. This would make the program look even more
confusing.
An alternative is to write the program directly in a way it can cope with almost
arbitrary lattice types. This can be achieved by setting up the neighbor relation in
one special initialization subroutine (not discussed here) and storing it in an array
next[]. Then, the code outside the subroutine remains the same for all lattice types
and dimensions. Since the code should work for all possible lattice dimensions, the
array next is one dimensional. It is assumed that each site has num_n neighbors.
Then the neighbors of site i can be stored in next[i*num_n], next[i*num_n+1], ...,
next[i*num_n+num_n-1]. Please note that the sites are numbered beginning with 1.
This means, a system with N spins needs an array next of size (N+1)*num_n. When
using free boundary conditions, missing neighbors can be set to 0. The access to the
array can be made easier using a macro NEXT:
#define NEXT(i,r) next[(i)*num_n + r]
NEXT(i,r) contains the neighbor of spin i in direction r. For e.g. a quadratic system,
r=0 is the +x-direction, r=1 the -x-direction, r=2 the +y-direction and r=3 the
-y-direction. Which convention you use is up to you, but you should make
sure you are consistent. For the case of a quadratic lattice, it is num_n=4. Please note
that whenever the macro NEXT is used, there must be a variable num_n defined, which
stores the number of neighbors. You could include num_n as a third parameter of the
macro, but in this case a call of the macro looks slightly more confusing. Nevertheless,
the way you define such a macro depends on your personal preferences.
Please note that the NEXT macro cannot be realized by an inline function, in case you
want to set values directly like in NEXT(i,0)=i+1. Also, when using an inline function,
you would have to include all parameters explicitly, i.e. num_n in the example. The
last requirement could be circumvented by using global variables, but this is bad
programming style as well.
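Since the initialization subroutine is not shown in the text, the following is only a sketch of how it might look for an L x L square lattice. The function name init_square_lattice() and the flag pbc are assumptions; the layout of next[], the macro NEXT and the direction convention follow the description above.

```c
#include <stdlib.h>

#define NEXT(i,r) next[(i)*num_n + r]

/* Hypothetical initialization routine: allocates and fills the neighbor
   array for an L x L square lattice with sites numbered 1..L*L.
   pbc != 0: periodic boundaries; pbc == 0: free boundaries, where
   missing neighbors are set to 0.  Returns NULL if malloc() fails;
   the caller has to free() the array.                               */
int *init_square_lattice(int L, int pbc)
{
  int num_n = 4;                         /* neighbors per site */
  int N = L*L;                           /* number of sites */
  int i, x, y;
  int *next = (int *) malloc((N+1)*num_n*sizeof(int));

  if(next == NULL)
    return(NULL);
  for(i=0; i<num_n; i++)                 /* entries of unused site 0 */
    next[i] = 0;
  for(i=1; i<=N; i++)
  {
    x = (i-1) % L;                       /* column of site i */
    y = (i-1) / L;                       /* row of site i */
    NEXT(i,0) = (x<L-1) ? i+1 : (pbc ? i+1-L : 0);   /* +x */
    NEXT(i,1) = (x>0)   ? i-1 : (pbc ? i-1+L : 0);   /* -x */
    NEXT(i,2) = (y<L-1) ? i+L : (pbc ? i+L-N : 0);   /* +y */
    NEXT(i,3) = (y>0)   ? i-L : (pbc ? i-L+N : 0);   /* -y */
  }
  return(next);
}
```

The macro appears on the left-hand side of the assignments here, which is exactly the use that an inline function could not provide.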
When the system is an Ising spin glass, the sign and magnitude of the interaction may
be different for each pair of spins. The interaction strengths can be stored in a similar
way to the neighbor relation, e.g. in an array j[]. The access can be simplified via
the macro J:
A subroutine for calculating the energy $H = \sum_{\langle i,j \rangle} J_{ij} \sigma_i \sigma_j$ may look as follows; please
note that the parameter N denotes the number of spins and the values of the spins are
stored in the array sigma[]:
double spinglass_energy(int N, int num_n, int *next, int *j,
                        short int *sigma)
{
  double energy = 0.0;
  int i, r; /* counters */
For this piece of code the comments explaining the parameters and the purpose of the
code are omitted only for brevity. In an actual program they should be included.
The code for spinglass_energy() is very short and clear. It works for all kinds of
lattices. Only the subroutine where the array next[] is set up has to be rewritten
when implementing a different type of lattice. This is true for all kinds of code
realizing e.g. a Monte Carlo scheme or the calculation of a physical quantity. For free
boundary conditions, additionally sigma[0]=0 must be assigned to be consistent with
the convention that missing neighbors have the id 0. This is the reason why the spin
site numbering starts with index 1 while C arrays start with index 0.
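A complete version of the energy subroutine might look as follows. This is a sketch under stated assumptions: the macro J is assumed to be defined in analogy to NEXT, the loop body is a plausible reconstruction and not necessarily the authors' code, and the sign convention follows the formula for H as printed above.

```c
#define NEXT(i,r) next[(i)*num_n + r]
#define J(i,r)    j[(i)*num_n + r]       /* assumed analogous to NEXT */

/* Sketch of the energy calculation: sums J_ij*sigma_i*sigma_j over all
   bonds.  N: number of spins, num_n: number of neighbors per site,
   next: neighbor array, j: interaction strengths (same layout as next),
   sigma: spin values, with sigma[0]=0 if missing neighbors have id 0  */
double spinglass_energy(int N, int num_n, int *next, int *j,
                        short int *sigma)
{
  double energy = 0.0;
  int i, r;                              /* counters */

  for(i=1; i<=N; i++)                    /* loop over all sites */
    for(r=0; r<num_n; r++)               /* loop over all neighbors */
      energy += J(i,r)*sigma[i]*sigma[NEXT(i,r)];

  return(energy/2);                      /* each bond was counted twice */
}
```

The division by two is only correct if the interactions are stored symmetrically, i.e. the entry for the bond from i to its neighbor equals the entry for the reverse bond. The routine can be checked on a ring of four spins with num_n=2 and all interactions set to 1, where four parallel spins give the value 4 and alternating spins give -4.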
object (.o) files. Each pair of dependencies and commands is called a rule. The file
containing all rules of a project is called a makefile; usually it is named Makefile and
should be placed in the directory where the source files are stored.
A rule can be coded by two lines of the form
target : sources
<tab> command(s)
The first line contains the dependencies, the second one the commands. The command
line must begin with a tabulator symbol <tab>. It is allowed to have several targets
depending on the same sources. You can extend the lines with the backslash "\" at
the end of each line. The command line is allowed to be left empty. An example of a
dependency/command pair is
object1: <sources of object1>
<tab> <command to generate object1>

object2: ...
<tab> <command to generate object2>

object3: ...
<tab> <command to generate object3>
It is not necessary to separate different rules by blank lines. Here it is just for better
readability. If you want to rebuild just e.g. object3, you can call make object3. This
allows several independent targets to be combined into one makefile. When compiling
programs via make, it is common to include the target "clean" in the makefile such
that all object files are removed when make clean is called. Thus, the next call of
make (without further arguments) compiles the whole program again from scratch.
The rule for "clean" reads like
clean:
<tab> rm -f *.o
Also iterated dependencies are allowed, for example
object1: object2
The order of the rules is not important, except that make always starts with the first
target. Please note that the make tool is not just intended to manage the software
development process and toggle compile commands. Any project where some output
files depend on some input files in an arbitrary way can be controlled. For example you
could control the setting of a book, where you have text-files, figures, a bibliography
and an index as input files. The different chapters and finally the whole book are the
target files.
Furthermore, it is possible to define variables, sometimes also called macros. They
have the format
Also variables belonging to your environment like $HOME can be referenced in the
makefile. The value of a variable can be used, similar to shell variables, by placing
a $ sign in front of the name of the variable, but you have to enclose the name in
(...) or {...}. There are some special variables, e.g. $@ holds the name of the target
in each corresponding command line; here no braces are necessary. The variable CC is
predefined to hold the compiling command; you can change it by including for example
CC=gcc
in the makefile. In the command part of a rule the compiler is called via $(CC). Thus,
you can change your compiler for the whole project very quickly by altering just one
line of the makefile.
Finally, it will be shown what a typical makefile for a small software project might
look like. The resulting program is called simulation. There are two additional
modules init.c, run.c and the corresponding header .h files. In datatypes.h types
are defined which are used in all modules. Additionally, an external precompiled object
file analysis.o in the directory $HOME/lib is to be linked, the corresponding header
where the variable CFLAGS may contain options passed to the compiler and is initially
empty. The makefile looks like this; please note that lines beginning with "#" are
comments.
#
# sample make file
#
OBJECTS=simulation.o init.o run.o
OBJECTSEXT=$(HOME)/lib/analysis.o
CC=gcc
CFLAGS=-g -Wall -I$(HOME)/include
LIBS=-lm
$(OBJECTS): datatypes.h
clean:
<tab> rm -f *.o
The first three lines are comments, then five variables OBJECTS, OBJECTSEXT, CC,
CFLAGS and LIBS are assigned. The final part of the makefile are the rules.
Please note that bugs are sometimes introduced if the makefile is incomplete. For
example consider a header file which is included in several code files, but this is not
mentioned in the makefile. Then, if you change e.g. a data type in the header file, some
of the code files might not be compiled again, especially those you did not change. Thus
the same objects can be treated with different formats in your program, yielding
bugs which seem hard to explain. Hence, in case you encounter mysterious bugs, a
make clean might help. But most of the time, bugs which are hard to explain are due
to errors in your memory management. How to track down those bugs is explained in
Sec. 13.7.
The make tool exhibits many other features. For additional details, please consult the
references given above.
13.4.3 Scripts
Scripts are even more general tools than makefiles. They are in fact small programs,
but they are usually not compiled, i.e. they are quickly written but they run slowly.
Scripts can be used to perform many administration tasks like backing up data,
installing software or running simulation programs for many different parameters. Here
only an example concerning the last task is presented. For a general introduction to
scripts, please refer to a book on UNIX/Linux.
Assume that you have a simulation program called coversim21 which calculates vertex
covers of graphs. In case you do not know what a vertex cover is, it does not matter,
just regard it as one optimization problem characterized by some parameters. You
want to run the program for a fixed graph size L, for a fixed concentration c of the
edges, average over num realizations and write the results to a file, which contains a
string appendix in its name to distinguish it from other output files. Furthermore, you
want to iterate over different relative sizes x. Then you can use the following script
run.scr:
#!/bin/bash
L=$1
c=$2
num=$3
appendix=$4
shift
shift
shift
shift
for x
do
  ${HOME}/cover/coversim21 -mag $L $c $x $num > \
    mag_${c}_${x}${appendix}.out
done
The first line starting with "#" is a comment line, but it has a special meaning. It
tells the operating system the language in which the script is written. In this case it
is for the bash shell; the absolute pathname of the shell is given. Each UNIX shell
has its own script language; you can use all commands which are allowed in the shell.
There are also more elaborate script languages like perl or python, but they are not
covered here.
Scripts can have command line arguments, which are referred to via $1, $2, $3 etc.; the
name of the script itself is stored in $0. Thus, in lines 2 to 5, four variables are
assigned. In general, you can use the arguments everywhere in the script directly, i.e.
it is not necessary to store them in other variables. It is done here because in the
next four lines the arguments $1 to $4 are thrown away by four shift commands.
Then, the argument which was at position five at the beginning is stored in the first
argument. Argument zero, containing the script name, is not affected by the shift.
Next, the script enters a loop, given by "for x; do ... done". This construction
means that iteratively all remaining arguments are assigned to the variable "x" and
each time the body of the loop is executed. In this case, the simulation is started with
some parameters and the output directed to a file. Please note that you can state the
loop parameters explicitly like in "for size in 10 20 40 80 160; do ... done".
The above script can be called for example by
run.scr 100 0.5 100 testA 0.20 0.22 0.24 0.26 0.28 0.30
which means that the graph size is 100, the fraction of edges is 0.5, the number of
realizations per run is 100, the string testA appears in the output file name and the
simulation is performed for the relative sizes 0.20, 0.22, 0.24, 0.26, 0.28, 0.30.
13.5 Libraries
Libraries are collections of subroutines and data types, which can be used in other
programs. There are libraries for numerical methods such as integration or solving
differential equations, for storing, sorting and accessing data, for fancy data types like
lists or trees, for generating colorful graphics and for thousands of other applications.
Some can be obtained for free, while other, usually specialized libraries have to be
purchased. The use of libraries speeds up the software development process enormously,
because you do not have to implement every standard method by yourself. Hence, you
should always check whether someone has done the job for you already, before
starting to write a program. Here, two standard libraries are briefly presented, providing
routines which are needed for most computer simulations.
Nevertheless, sometimes it is inevitable to implement some methods by yourself. In
this case, after the code has been proven to be reliable and useful for some time, you
can put it in a self-created library. How to create libraries is explained in the last part
of this section.
- performing interpolations
- minimizing functions
- diagonalization of matrices
- Fourier transform
To give you an impression how the subroutines can be used, just a short example is
presented. Consider the case that a symmetrical matrix is given and that all eigen-
values are to be determined. For more information on the library the reader should
consult Ref. [3]. There it is not only shown how the library can be applied, but also
all algorithms are explained.
The program to calculate the eigenvalues reads as follows.
In the third part the main work is done by the Numerical Recipes subroutines tred2()
and tqli(). First, the matrix is written in tridiagonal form by a Householder
transformation (tred2()) and then the actual eigenvalues are calculated by calling
tqli(d, e, n, m). The eigenvalues are returned in the vector d[] and the eigenvectors in the
matrix m[][] (not used here), which is overwritten. Finally the memory allocated for
the matrix and the vectors is freed again.
This small example should be sufficient to show how simply the subroutines from the
Numerical Recipes can be incorporated into a program. When you have a problem of
this kind you should always consult the NR library first, before starting to write code
by yourself.
13.5.2 LEDA
While the Numerical Recipes are dedicated to numerical problems, the Library of
Efficient Data types and Algorithms (LEDA) [4] can help a great deal in writing
efficient programs in general. It is written in C++, but it can be used by C style
programmers as well via mixing C++ calls to LEDA subroutines within C code. LEDA
contains many basic and advanced data types such as:
- strings
- sets
- trees
- dictionaries, where you can store objects with arbitrary key words as indices
- data types for two and three dimensional geometries, like points, segments or
spheres
For most data types, it is possible to create arbitrarily complex structures by using
templates. For example you can make lists of self defined structures or stacks of
trees. The most efficient implementations known in literature so far are taken for all
data structures. Usually, you can choose between different implementations, to match
special requirements. For every data type, all necessary operations are included; e.g.
for lists: creating, appending, splitting, printing and deleting lists as well as inserting,
searching, sorting and deleting elements in a list, also iterating over all elements of a
list. The major part of the library is dedicated to graphs and related algorithms. You
will find for example subroutines to calculate strongly connected components, shortest
paths, maximum flows, minimum cost flows and (minimum) matchings.
Here again, just a short example is given to illustrate how the library can be utilized
and to show how easily LEDA can be used. A list of a self defined class Mydatatype is
considered. Each element contains the data entries info and flag. In the first part
of the program below, the class Mydatatype is partly defined. Please note that input
and output stream operators <</>> must be provided to be able to create a list of
Mydatatype elements, otherwise the program will not compile. In the main part of
the program a list is defined via the LEDA data type list. Elements are inserted into
the list with append(). Finally an iteration over all list elements is performed using
the LEDA macro forall. The program leda_test.cc reads as follows:
The program has to be compiled with a C++ compiler. Depending on your system,
you have to specify some compiler flags to include LEDA; please consult your system's
documentation or the system administrator. The compile command may look like this:
The -I flag specifies where the compiler searches for header files like LEDA/list.h,
the -L flag tells where the libraries (-lG -lL) are located. The environment variable
LEDAROOT must point to the directory where LEDA is stored in your system.
Please note that using Numerical Recipes and LEDA together results in conflicts,
since the objects vector and matrix are defined in both libraries. You can
circumvent this problem by taking the source code of Numerical Recipes (here: nrutil.c,
nrutil.h), renaming the subroutines matrix() and vector(), compiling again and
including nrutil.o directly in your program.
Here, it should be stressed: Before trying to write everything by yourself, you should
check whether someone else has done it for you already. LEDA is a highly effective
and very convenient tool. It will save you a lot of time and effort when you use it for
your program development.
In a library several object files may be collected. The option "r" replaces the given
object files, if they already belong to the library, otherwise they are added. If the
library does not exist yet it is created. For more options, please refer to the man page
of ar.
After including an object file, you have to update an internal object table of the library.
This is done by
Now you can compile a program prog.c using your library via
cc -o prog prog.c libmy.a
In case libmy.a contains several object files, it saves some typing by just writing
libmy.a, furthermore you do not have to remember the names of all your object files.
To make the handling of the library more comfortable, you can create a directory,
e.g. ~/lib and put your libraries there. Additionally, you should create the
directory ~/include where all personal header files can be collected. Then your compile
command may look like this:
cc -o prog prog.c -I$HOME/include -L$HOME/lib -lmy
The option -I states the search path for additional header files, the -L option tells
the linker where your libraries are stored and via -lmy the library libmy.a is actually
included. Please note that the prefix lib and the postfix .a are omitted with the -l
option. Finally, it should be pointed out that the compiler command given above
works in all directories, once you have set up the structure as explained. Hence, you
do not have to remember directories or names of object files.
in exactly the same way. This is the reason why pseudo random numbers are usually
taken. They are generated by deterministic rules, but they look like and have many
of the properties of true random numbers. One would like to have a random number
generator rand(), such that each possible number has the same probability of
occurrence. Each time rand() is called, a new random number is returned. Additionally,
if two numbers $r_i$, $r_k$ differ only slightly, the random numbers $r_{i+1}$, $r_{k+1}$ returned by
the respective subsequent calls should have a low correlation.
The simplest methods to generate pseudo random numbers are linear congruential
generators. They generate a sequence $I_1, I_2, \ldots$ of integer numbers between 0 and
$m - 1$ by a recursive recipe:
$I_{n+1} = (a I_n + c) \bmod m$
To generate random numbers r distributed in the interval [0,1) one has to divide the
current random number by m. It is desirable to obtain equally distributed values in
the interval, i.e. a uniform distribution. Below, you will see how random numbers
obeying other distributions can be generated from uniformly distributed numbers.
The real art is to choose the parameters a, c, m in a way that "good" random numbers
are obtained, where "good" means "with few correlations". In the past several results
from simulations have turned out to be wrong, because of the application of bad
random number generators [17].
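A linear congruential generator takes only a few lines of C. The following sketch uses the parameters a = 12351, c = 1 and m = 2^15 that are quoted for the figures of this section; the names lcg_state, lcg_seed() and lcg_rand() and the initial state are chosen here for illustration. With such a small modulus the generator is suitable for demonstration purposes only.

```c
/* Sketch of a linear congruential generator
   I_{n+1} = (a*I_n + c) mod m                                     */
#define LCG_A 12351UL
#define LCG_C 1UL
#define LCG_M 32768UL                    /* m = 2^15 */

static unsigned long lcg_state = 1000UL; /* current number I_n */

/* sets the starting value of the sequence */
void lcg_seed(unsigned long seed)
{
  lcg_state = seed % LCG_M;
}

/* returns the next pseudo random number in the interval [0,1) */
double lcg_rand(void)
{
  lcg_state = (LCG_A*lcg_state + LCG_C) % LCG_M;
  return((double) lcg_state / LCG_M);    /* divide by m */
}
```

Since the state is always smaller than m, the product a*I_n stays below 2^32 and cannot overflow an unsigned long. Starting from the seed 0, the first two integers of the sequence are 1 and 12352.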
Figure 13.2: Distribution of random numbers in the interval [0,1). They are
generated using a linear congruential generator with the parameters a = 12351, c = 1,
m = $2^{15}$.
Figure 13.3: Two point correlations $x_{i+1}(x_i)$ between successive random numbers
$x_i, x_{i+1}$. The top case is generated using a linear congruential generator with the
parameters a = 12351, c = 1, m = $2^{15}$, the bottom case has instead a = 12349.
The target is to find a function g(X), such that after the transformation Z = g(U), the
values of Z are distributed according to (13.2). It is assumed that g can be inverted
and is strictly monotonically increasing, then one obtains
uniformly in $[x_0, x_1] \times [0, z_{\max}]$ and accept only those values x where $y \le p(x)$ holds,
i.e. the pairs which are located below p(x), see Fig. 13.5. Therefore, the probability
that x is drawn is proportional to p(x), as desired. The algorithm for the rejection
method is:
algorithm rejection_method(z_max, p)
begin
    found := false;
    while not found do
    begin
        u_1 := random number in [0,1);
        x := x_0 + (x_1 - x_0) × u_1;
        u_2 := random number in [0,1);
        y := z_max × u_2;
        if y ≤ p(x) then
            found := true;
    end;
    return(x);
end
The rejection method always works if the probability density is boxed, but it has the
drawback that more random numbers have to be generated than can be used.
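The algorithm translates almost line by line into C. The following is a minimal sketch: the helper uniform01(), which is simply based on the standard function rand(), the passing of the interval bounds x0, x1 as arguments and the example density linear_density() are assumptions made for illustration.

```c
#include <stdlib.h>

/* uniform random number in [0,1), here simply based on rand() */
double uniform01(void)
{
  return((double) rand() / ((double) RAND_MAX + 1.0));
}

/* Sketch of the rejection method: returns a random number in [x0,x1)
   distributed according to the density p, which has to fulfill
   0 <= p(x) <= z_max everywhere in the interval                     */
double rejection_method(double x0, double x1, double z_max,
                        double (*p)(double))
{
  double x, y;

  do
  {
    x = x0 + (x1-x0)*uniform01();        /* candidate value */
    y = z_max*uniform01();               /* uniform height in [0,z_max) */
  } while(y > p(x));                     /* accept only if y <= p(x) */
  return(x);
}

/* hypothetical example density: p(x) = 2x on [0,1), maximum 2 */
double linear_density(double x)
{
  return(2.0*x);
}
```

A call like rejection_method(0.0, 1.0, 2.0, linear_density) then returns numbers whose mean approaches 2/3. On average half of the candidate pairs are rejected for this density, which illustrates the drawback mentioned above.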
Figure 13.5: The rejection method: points (x,y) are scattered uniformly over a
bounded rectangle. The probability that $y \le p(x)$ is proportional to p(x).
In case neither the distribution function can be inverted nor the probability fits into
a box, special methods have to be applied. As an example a method for generating
random numbers distributed according to a Gaussian distribution is considered. Other
methods and examples of how different techniques can be combined are collected in
Ref. [16].
Figure 13.6: Gaussian distribution with zero mean and unit width. The circles
represent a histogram obtained from $10^4$ values drawn with the Box-Muller method.
$u_i$ (with mean m and variance v) will converge to a Gaussian distribution with mean
Nm and variance Nv. If again $u_i$ is taken to be uniformly distributed in [0,1)
(which has mean m = 0.5 and variance v = 1/12), one can choose N = 12 and
$Z = \sum_{i=1}^{12} u_i - 6$ will be distributed approximately normally. The drawback of this
method is that 12 random numbers are needed to generate one final random number
and that values larger than 6 never appear.
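The sum of twelve uniform numbers is quickly coded. In the following sketch the function name gauss_sum12() is hypothetical and the uniform numbers are simply taken from the standard function rand().

```c
#include <stdlib.h>

/* Sketch: approximately Gaussian random number with zero mean and
   unit variance, obtained as the sum of 12 uniform numbers in [0,1)
   minus 6.  The result always lies in [-6,6).                      */
double gauss_sum12(void)
{
  double z = 0.0;
  int i;                                 /* counter */

  for(i=0; i<12; i++)
    z += (double) rand() / ((double) RAND_MAX + 1.0);
  return(z - 6.0);
}
```

The variance is 12 x 1/12 = 1, so no further rescaling is necessary.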
In contrast to this technique the Box-Muller method is exact. You need two random
variables $U_1, U_2$ uniformly distributed in [0,1) to generate two independent normal
variables $N_1, N_2$. This can be achieved by setting
$N_1 = \sqrt{-2 \log(1 - U_1)} \cos(2\pi U_2)$
$N_2 = \sqrt{-2 \log(1 - U_1)} \sin(2\pi U_2)$
A proof that $N_1$ and $N_2$ are indeed distributed according to (13.5) can be found in
Refs. [3, 16], where also other methods for generating Gaussian random numbers,
some even more efficient, are explained. A method which is based on the simulation
of particles in a box is explained in Ref. [18]. In Fig. 13.6 a histogram of $10^4$ random
numbers drawn with the Box-Muller method is shown.
cess. Please note again that the tools run under UNIX/Linux operating systems.
Similar programs are available for other operating systems as well. The tools covered
here are gdb, a source-code debugger, ddd, a graphic front-end to gdb, and checkergcc,
which finds bugs resulting from bad memory management.
13.7.1 gdb
The GNU debugger gdb is a source-code debugger. Its main purpose is that
you can watch the execution of your code. You can stop the program at arbitrarily
chosen points by setting breakpoints at lines or subroutines in the source code, inspect
variables/data structures, change them and let the program continue (e.g. line by line).
Here some examples for the most basic operations are given; detailed instructions can
be obtained within the program via the help command.
As an example of how to debug, please consider the following little program gdbtest.c:
(gdb) c
Continuing.
sum= 4949
As you can see, the final value (4949) the program prints is affected by the change of
the variable array[99].
The above given commands are sufficient for most of the standard debugging tasks.
For more specialized cases gdb offers many other commands; please have a look at the
documentation [5].
13.7.2 ddd
Some users may find graphical user interfaces more convenient. For this reason there
exists a graphical front-end to gdb, the data display debugger (ddd). On UNIX
operating systems it is just invoked by typing ddd (see also man page for options).
Then a window pops up, see Fig. 13.7. The lower part of the window is an
ordinary gdb interface; several other windows are available. By typing file <program>
you can load a program into the debugger. Then the source code is shown in the main
window of the debugger. All gdb commands are available, the most important ones
can be entered via menus or buttons using the mouse. For example to set a breakpoint
it is sufficient to place the cursor in a source-code line in the main ddd window and
click on the break button. A good feature is that the content of a variable is shown
when moving the mouse onto it. For more details, please consult the online help of
ddd.
13.7.3 checkergcc
Most program bugs are revealed by systematically running the program and
cross-checking with the expected results. But other errors seem to appear in a rather
irregular and unpredictable fashion. Sometimes a program runs without a problem,
in other cases it crashes with a Segmentation fault at rather puzzling locations in
the code. Very often bad memory management is the cause of such behavior.
Writing beyond the boundaries of an array, reading uninitialized memory locations
or addressing data which has been freed already are the most common bugs of this
class. Since the operating system organizes the memory in a different way each time
a program is run, it is rather unpredictable whether these errors become apparent
or not. Furthermore it is very hard to track them down, because the effect of such
errors most of the time becomes visible at positions different from where the error has
occurred.
Figure 13.7: The data display debugger (ddd). In the main window the source code
is shown. Commands can be invoked via a mouse or by entering them into the lower
part of the window.
As an example, consider the case where a program writes beyond the boundary of an
array. If, in the heap, from which all dynamically allocated memory is taken, another
variable is stored at the location behind the array, it will be overwritten in this
case. Hence, the error becomes visible the next time the other variable is read. On
the other hand, if the memory block behind the array is not used, the program may
run that time without any problems. Unfortunately, the programmer is not able to
influence the memory management directly.
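The delayed effect of such an overwrite can be imitated in a fully defined way. The
following is only a minimal sketch: it assumes a hypothetical layout in which a
99-element array and one neighbouring variable share a single valid block of 100
integers. The names demo_overwrite and neighbour are illustrative, and a real heap
gives no guarantee about what is stored behind an array.

```c
#include <stdlib.h>

/* Imitates an out-of-bounds write inside one valid block of 100 ints:
   block[0..98] plays the role of the array, block[99] that of a
   variable which happens to be stored directly behind it.
   Returns the final value of the neighbour, or -1 if malloc fails. */
int demo_overwrite(void)
{
  int t, result;
  int *array, *neighbour;
  int *block = (int *) malloc(100 * sizeof(int));

  if (block == NULL)
    return -1;
  array = block;              /* intended length: 99 elements */
  neighbour = block + 99;     /* the "other variable" behind the array */
  *neighbour = 12345;

  for (t = 0; t < 100; t++)   /* bug: the loop should stop at t < 99 */
    array[t] = t;             /* t = 99 overwrites the neighbour */

  result = *neighbour;        /* the error only shows up here */
  free(block);
  return result;
}
```

The function returns 99 instead of the value 12345 that was stored in the neighbour,
although the faulty statement is the loop and not the later read access.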
To detect such types of nasty bugs, one can take advantage of several tools. A list of free
and commercial tools can be found in Ref. [19]. Here checkergcc is considered, which
is a very convenient tool and freely available. It works under UNIX and is used
by compiling everything with checkergcc instead of cc or gcc. Unfortunately, the
current version does not have full support for C++, but you should try it on your
own project. The checkergcc compiler replaces all memory allocations/deallocations
and accesses by its own routines. Any access to non-authorized memory locations is
reported, regardless of the positions of other variables in the memory area (heap).
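The idea of routing every allocation and access through checking routines can be
illustrated by hand. The following is a sketch under stated assumptions: the type
checked_array_t and the functions checked_alloc, checked_write and checked_free are
hypothetical names introduced here, and the code does not describe how checkergcc
itself is implemented.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical checked array: stores its length next to the data. */
typedef struct
{
  int    *data;
  size_t  length;
} checked_array_t;

/* Allocates 'length' ints; returns 0 on success, -1 on failure. */
int checked_alloc(checked_array_t *a, size_t length)
{
  a->data = (int *) malloc(length * sizeof(int));
  a->length = (a->data != NULL) ? length : 0;
  return (a->data != NULL) ? 0 : -1;
}

/* Writes 'value' at 'index'; accesses outside the block are reported
   and refused. Returns 0 if the write was performed, -1 otherwise. */
int checked_write(checked_array_t *a, size_t index, int value)
{
  if (a->data == NULL || index >= a->length)
  {
    fprintf(stderr, "invalid write at index %lu (length %lu)\n",
            (unsigned long) index, (unsigned long) a->length);
    return -1;
  }
  a->data[index] = value;
  return 0;
}

/* Frees the block and marks it unusable, so later accesses are reported. */
void checked_free(checked_array_t *a)
{
  free(a->data);
  a->data = NULL;
  a->length = 0;
}
```

With a block of length 99, a write to index 99 is rejected immediately, independent
of what happens to be stored behind the block, and so is any write after checked_free.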
As an example, the program from Sec. 13.7.1 is considered in a slightly modified form:
the memory block allocated for the array is now slightly too short (length 99 instead
of 100):
#include <stdio.h>
#include <stdlib.h>
sum= 4950
Initialization of detector.. .
Searching in data
Searching in stack
Searching in registers
From Checker (pid:30900): (gar) garbage detector results.
Obviously, the memory leak has been found. Further information on the various
features of checkergcc can be found in Ref. [20]. A last hint: you should always test
a program with a memory checker, even if everything seems to be fine.
and the third the standard error of the energy. Please note that lines starting with "#"
are comment lines, which are ignored on reading:
# ground state energy of +-J spin glasses
# L e_0 error
3 -1.6710 0.0037
4 -1.7341 0.0019
5 -1.7603 0.0008
6 -1.7726 0.0009
8 -1.7809 0.0008
10 -1.7823 0.0015
12 -1.7852 0.0004
14 -1.7866 0.0007
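If such a file is to be read by your own program as well, the same convention can be
followed. The sketch below assumes three white-space separated columns (an integer
size, the energy and its error) as in the file above; the function names
parse_data_line and read_data_file are chosen here for illustration only.

```c
#include <stdio.h>

/* Parses one line of a three-column data file (L, e_0, error).
   Returns 1 if the line holds a data point, 0 if it is a comment
   line starting with '#', is blank, or cannot be parsed. */
int parse_data_line(const char *line, int *L, double *e0, double *error)
{
  while (*line == ' ' || *line == '\t')   /* skip leading white space */
    line++;
  if (*line == '#' || *line == '\n' || *line == '\0')
    return 0;
  return sscanf(line, "%d %lf %lf", L, e0, error) == 3;
}

/* Reads all data points from 'filename' into the given arrays, which
   must provide room for 'max' entries. Returns the number of points
   read, or -1 if the file cannot be opened. */
int read_data_file(const char *filename, int max,
                   int *L, double *e0, double *error)
{
  char line[256];
  int n = 0;
  FILE *file = fopen(filename, "r");

  if (file == NULL)
    return -1;
  while (n < max && fgets(line, sizeof(line), file) != NULL)
    if (parse_data_line(line, &L[n], &e0[n], &error[n]))
      n++;
  fclose(file);
  return n;
}
```

Applied to the file shown above, read_data_file would skip the two comment lines and
return the eight data points.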
To plot the data enter
gnuplot> plot "sg_e0_L.dat" with yerrorbars
which can be abbreviated as p "sg_e0_L.dat" w e. Please do not forget the quotation
marks around the file name. Next, a window pops up, showing the result, see Fig.
For the plot command many options and styles are available, e.g. with lines produces
lines instead of symbols. It is possible to read files with multiple columns via the
Figure 13.9: The xmgr program, just after a data file has been loaded, and the AS
button has been pressed to adjust the figure range automatically.
The actual fit is performed via the fit command. The program uses the nonlinear
least-squares Marquardt-Levenberg algorithm [3], which allows a fit according to
almost arbitrary functions. To issue the command, you have to state the fit function,
the data set and the parameters which are to be adjusted. For our example you enter:
Then gnuplot writes log information to the output describing the fitting process. After
the fit has converged it prints for the given example:
After 17 iterations the fit converged.
final sum of squares of residuals : 7.55104e-06
rel. change during last iteration : -2.42172e-09
degrees of freedom (ndf) : 5
rms of residuals (stdfit) = sqrt(WSSR/ndf) : 0.00122891
variance of residuals (reduced chisquare) = WSSR/ndf : 1.51021e-06
The most interesting lines are those where the results for your parameters along with
the standard error are printed. Additionally, the quality of the fit can be estimated
from the information provided in the three lines beginning with "degrees of freedom".
The first of these lines states the number of degrees of freedom, which is just the
number of data points minus the number of parameters in the fit (here 8 - 3 = 5).
The deviation of the fit function f(x) from the data points (x_i, y_i ± σ_i)
(i = 1, ..., N) is given by
$$\chi^2 = \sum_{i=1}^{N} \left[ \frac{y_i - f(x_i)}{\sigma_i} \right]^2 ,$$
which is denoted by WSSR in the gnuplot output. A measure
of the quality of the fit is the probability Q that the value of χ² is worse than in the
current fit, given the assumption that the data points y_i are Gaussian distributed with
mean f(x_i) and variance one [3]. The larger the value of Q, the better is the quality
of the fit. To calculate Q you can use the little program Q.c
#include <stdio.h>
#include "nr.h"
int main(int argc, char **argv)
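The two quantities entering the quality estimate, the weighted sum of squared
residuals and the number of degrees of freedom, can also be evaluated by hand. The
following sketch is not the program Q.c; the function names wssr, degrees_of_freedom
and line_function are assumptions made for illustration, and the straight line merely
stands in for whatever fit function is actually used.

```c
#include <stddef.h>

/* Weighted sum of squared residuals (WSSR, i.e. chi^2):
   sum over i of ((y[i] - f(x[i])) / sigma[i])^2.
   'f' is the fit function, 'params' its (already fitted) parameters. */
double wssr(size_t n, const double *x, const double *y, const double *sigma,
            double (*f)(double x, const double *params),
            const double *params)
{
  size_t i;
  double sum = 0.0;

  for (i = 0; i < n; i++)
  {
    double r = (y[i] - f(x[i], params)) / sigma[i];
    sum += r * r;
  }
  return sum;
}

/* Degrees of freedom: number of data points minus fit parameters. */
int degrees_of_freedom(int num_points, int num_params)
{
  return num_points - num_params;
}

/* Example fit function, a straight line a + b*x (illustrative only). */
double line_function(double x, const double *params)
{
  return params[0] + params[1] * x;
}
```

For a fit in which all points enter with the same weight, one would pass an array of
ones for sigma. Dividing the result of wssr by the number of degrees of freedom gives
the reduced chi-square printed in the last line of the gnuplot output.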
Figure 13.10: Gnuplot window showing the result of a fit command along with the
input data.
away from the best values. Try the initial values c=1, a=-3 and b=1! Furthermore,
not all function parameters have to be subjected to the fitting. Alternatively, you can
set some parameters to fixed values and omit them from the list at the end of the
fit command. You should also know that in the example given above all data points
enter into the result with the same weight. You can tell the algorithm to consider the
error bars by typing fit f(x) "sg_e0_L.dat" using 1:2:3 via a,b,c. Then, data
points with larger error bars have less influence on the results. More on how to use
the fit command can be found by entering help fit.
gnuplot> n=1.1
gnuplot> pc=0.222
gnuplot> plot [-1:1] "m_scale.dat" u (($2-pc)*$1**(1/n)):($3*$1**(b/n))
The plot command makes use of the feature that with the u(sing) option you can
transform the data of the input in an arbitrary way. For each data set, the variables
$1, $2 and $3 refer to the first, second and third columns, e.g. $1**(1/n) raises the
system size to the power 1/ν. The resulting plot is shown in Fig. 13.12. Near the
transition p - p_c ≈ 0 a good collapse of the data points can be observed.
Figure 13.12: Gnuplot output of a finite-size scaling plot. The ground-state
magnetization of a three-dimensional ±J spin glass as a function of the concentration p of
the antiferromagnetic bonds is shown. For the fit, the parameters p_c = 0.222, β = 0.27
and ν = 1.1 have been used.
In case you do not know the values of p_c, β, ν you can start with some estimated
values and perform the plot, which will probably result in a bad collapse. Then you may alter
the parameters iteratively and watch the resulting changes by plotting again. In this
way you can converge to a set of parameters, where all data points show a satisfying
collapse.
The process of determining the finite-size scaling parameters can be performed more
conveniently by using the special purpose program fsscale. It can be obtained free of
charge from [22]. This tool allows the scaling parameters to be changed interactively
by pressing buttons on the keyboard, making a finite-size scaling fit very convenient
to perform. Several different scaling forms are available. To obtain more information,
start the program with fsscale -help. A sample screen-shot is shown in Fig. 13.13.
Please note that the data have to be presented to fsscale in a file containing three
columns, where the first column contains the system size, the second the x-value and
the third the y-value. If you only have data files with more columns, you can use the
standard UNIX tool awk to project out the relevant columns. For example, assume
that your data file results.dat has 10 columns, and you are interested in columns
3, 8, and 9. Then you have to enter:
You can also use awk to perform calculations with the values in the columns, similar
to gnuplot, as in
Literature databases
In case you want to obtain all articles from a specific author or all articles on a
certain subject, you should consult a literature database. In physics the INSPEC
[25] database is the appropriate source of information. Unfortunately, the access
is not free of charge. But usually your library should allow access to INSPEC,
either via CD-ROMs or via the Internet. If your library/university does not offer
access, you should complain.
INSPEC frequently surveys almost all scientific journals in the areas of physics,
electronics and computers. For each paper that appears, all bibliographic information
along with the abstract is stored. You can search the database for example
for author names, keywords (in the abstract or title), publication years or journals.
Via INSPEC it is possible to keep track of recent developments happening
in a certain field.
There are many other specialized databases. You should consult the web page of
your library to find out which of them you can access. Modern scientific work
is not possible without regularly checking literature databases.
Preprint server
In the time of the Internet, speed of publication becomes increasingly important.
By now, many researchers put their publications on the Los Alamos Preprint
server [26], where they become available worldwide at most 72 (usually 24)
hours after submission. The database is free of charge and can be accessed from
almost everywhere via a browser. The preprint database is divided into several
sections such as astrophysics (astro-ph), condensed matter (cond-mat) or quantum
physics (quant-ph). Similar to a conventional literature database, you can search
the database, optionally restricted to a section, for author names, publication
years or keywords in the title/abstract. But furthermore, after you have found an
interesting article, you can download it and print it immediately. File formats are
postscript and pdf. The submission can also be in TeX/LaTeX (see Sec. 13.9.2).
Please note that there is no editorial processing at all, which means you do not
have any guarantee on the quality of a paper. If you like, you can submit a poem
describing the beauty of your garden. Nevertheless, the aim of the server is to
make important scientific results available very quickly. Thus, before submitting
an article, you should be sure that it is correct and interesting, otherwise you
might get a poor reputation.
The preprint server also offers access via email. It is possible to subscribe to a
certain subject. Then every working day you will receive a list of all new papers
which have been submitted. This is a very convenient way of keeping track of
recent developments. But be careful, not everyone submits to the preprint server.
Hence, you still have to read scientific journals regularly.
Scientific journals
Journals are the most important sources of information in science. Most of them
allow access via the Internet, when your university or institute has subscribed to
them. Some of the most important physics journals, which are available online,
are published by (in alphabetical order)
Citation databases
In every scientific paper some other articles are cited. Sometimes it is interesting
to get the reverse information, i.e. to obtain all papers which are citing a given
article A. This can be useful if one wants to learn about the most recent developments
which are triggered by article A. In that case you have to access a citation
index. For physics, probably the most important is the Science Citation Index
(SCI) which can be accessed via the Web of Science [35]. You have to ask your
system administrator or your librarian whether and how you can access it from
your site.
The American Physical Society (APS) [28] also includes links to citing articles
with the online versions of recent papers. If the citing article is available via the
APS as well, you can immediately access the article from the Internet. This works
not only for citing papers, but also for cited articles.
Phys Net
If you want to have access to the web pages of a certain physics department, you
should go via your web browser to the Phys Net pages [36]. They offer a list of all
physics departments in the world. Additionally, you will find lists of forthcoming
conferences, job offers and many other useful links. Also, the home page of your
department probably offers many interesting links to other web pages related to
physics.
Web browsing
Apart from the sources mentioned so far, nowadays much information is available
on the net. Many researchers present their work, their results and their publications
on their home pages. Quite often talks or computer codes can also be downloaded.
In case you cannot find a specific page through the Phys Net (see above), or you
are interested in obtaining all web pages concerning a specific subject, you should
ask a search engine. There are some very popular all purpose engines like Yahoo
[37] or Alta Vista [38]. A very convenient way to start a query on several search
engines in parallel is a meta search engine, e.g. Metacrawler [39]. To find out
more, please contact a search engine.
Large projects do not give rise to any problems, in contrast to many commercial
office programs. When treating a LaTeX text, your computer will never complain
when your text is more than 300 pages or contains many huge postscript figures.
Typesetting of formulae is very convenient and fast. You do not have to care
about sizes of indices of indices etc. Furthermore, in case you want for example
to replace all α in your formulae with β, this can be done with a conventional
replace, by replacing all \alpha strings by \beta strings. For the case of an
office system, please do not ask how to do this conveniently.
There are many additional packages for enhanced styles such as letters,
transparencies or books. The bibtex package, which allows a nice
literature database to be built up, is very convenient.
Since you can use a conventional editor, the writing process is very fast. You do
not have to wait for a huge package to come up.
On the other hand, if you still prefer a WYSIWYG ("what you see is what you
get") system, there is a program called lyx [42] which operates like a conventional
word processor but creates LaTeX files as output. Nevertheless, once you get used
to LaTeX, you will never want to lose it.
Please note that this text was written entirely with LaTeX. Since LaTeX is a
typesetting language, you have to compile your text to create the actual output. Now, an
example is given of what a LaTeX text looks like and how it can be compiled. This
example will just give you an impression of how the system operates. For a complete
reference, please consult the literature mentioned above.
The following file example.tex produces a text with different fonts and a formula:
\documentclass[12pt]{article}
\begin{document}
This is just a small sample text. You can write some words {\em
emphasized}\/, or in {\bf bold face}. Also different {\small sizes}
are possible.
e.g. with \begin{equation} and \end{equation}. For the mathematical mode a huge
number of commands exists. Here only examples for Greek letters (\alpha), subscripts
(x_i), fractions (\frac), integrals (\int) and vectors (\vec) are given.
The text can be compiled by entering latex example.tex. This is the command for
UNIX, but LaTeX exists for all operating systems. Please consult the documentation
of your local installation.
The output of the compiling process is the file example.dvi, where "dvi" means
"device independent". The .dvi file can be inspected on screen by a viewer via
entering xdvi example.dvi or converted into a postscript file via typing dvips -o
example.ps example.dvi and then transferred to a printer. On many systems it can
be printed directly as well. The result will look like this:
This is just a small sample text. You can write some words emphasized, or
in bold face. Also different sizes are possible.
An empty line generates a new paragraph. LaTeX is very convenient for
writing formulae, e.g.
This example should be sufficient to give you an impression of what the philosophy
of LaTeX is. Comprehensive instructions are beyond the scope of this section; please
consult the literature [40, 41].
Under UNIX/Linux, the spell checker ispell is available. It allows a simple spell check
to be performed. The tool is built on a dictionary, i.e. a huge list of known words.
The program scans any given text; a special LaTeX mode is available as well. Every time
a word occurs which is not contained in the list, ispell stops. Should similar words
exist in the list, they are suggested. Now the user has to decide whether the word
should be replaced, changed, accepted or even added to the dictionary. The whole
text is treated in this way. Please note that many mistakes cannot be found in this
way, especially when the misspelled word is equal to another word in the dictionary.
However, at least ispell finds many spelling mistakes quickly and conveniently, so you
should use the tool.
Most scientific texts do not only contain text, formulae and curves, but also schematic
figures showing the models, algorithms or devices covered in the publication. A very
convenient but also simple tool to create such figures is xfig. It is a window based
vector-oriented drawing program. Among its features are the creation of simple objects
like lines, arrows, polylines, splines, arcs as well as rectangles, circles and other closed,
possibly filled, areas. Furthermore you can create text or include arbitrary (eps, jpg)
picture files. You may place the objects on different layers, which allows complex
sceneries to be created. Different simple objects can be combined into more complex
objects. For editing you can move, copy, delete, rotate or scale objects. To give you
an impression of what xfig looks like, a screen-shot is shown in Fig. 13.14, displaying xfig
with the picture that is shown in Fig. 13.1. Again, for further help, please consult the
online help function or the man pages.
The figures can be saved in the internal fig format, and exported in several file formats
such as (encapsulated) postscript, LaTeX, JPEG, TIFF or bitmap. The xfig program can
be called in a way that it produces just an output file with a given fig input file. This
is very convenient when you have larger projects where some small picture objects
are contained in other pictures and you want to change the appearance of the small
objects in all other files. With the help of the make program pretty large projects can
be realized.
Also, xfig is very convenient when creating transparencies for talks, which is the
standard method of presenting results in physics. With different colors, text sizes and all
the objects mentioned before, very clear transparencies can be created quickly. The
possibility of including picture files, like postscript files which were created by a data
plotting program such as xmgr, is very helpful. In the beginning it may seem that
more effort is necessary than when creating the transparencies by hand. However,
once you have a solid base of transparencies you can reuse many parts and preparing
a talk may become a question of minutes. In particular, when your handwriting looks
awful, the audience will be much obliged for transparencies prepared with xfig.
Last but not least, please note that xfig is vector oriented, not pixel oriented.
Therefore, you cannot treat pictures like jpg files (e.g. photos) and apply operations
like smoothing, sharpening or filtering. For these purposes the package gimp is suitable.
It is freely available, again from GNU [5].
It is also possible to draw three-dimensional figures with xfig, but there is no
special support for it. This means that xfig has only a two-dimensional coordinate system.
A very convenient and powerful tool for making three-dimensional figures is Povray
(Persistence Of Vision Raytracer). Here, again, only a short example is given; for a
detailed documentation please refer to the home page [43], where the program can be
downloaded for many operating systems free of charge.
Povray is, as can be realized from its name, a raytracer. This means you present a
scene consisting of several objects to the program. These objects have characteristics
like color, reflectivity or transparency. Furthermore the position of one or several
light sources and a virtual camera have to be defined. The output of a raytracer is a
photo-realistic picture of the scene, seen through the camera. The name "raytracer"
originates from the fact that the program creates a picture by starting several rays of
light at the light sources and traces their way through the scene, where they may be
absorbed, reflected or refracted, until they hit the camera, disappear into infinity or
become too weak. Hence, the creation of a picture may take a while, depending on
the complexity of the scene.
A scene is described in a human readable file, which can be entered with any text editor.
But for more complex scenes, special editors exist, which allow a scene to be created
interactively. Also several tools for making animations are available on the Internet.
Here, a simple example is given. The scene consists of three spheres connected by
two cylinders, forming a molecule. Furthermore, a light source, a camera, an infinite
plane and the background color are defined. Please note that a sphere is defined by its
center and a radius and a cylinder by two end points and a radius. Additionally, for all
objects color information has to be included; the center sphere is slightly transparent.
The scene description file test1.pov reads as follows:
background { color White }
sphere { <0, 2, 10>, 4
  pigment { Green transmit 0.4 } }
The creation of the picture is started by calling (here on a Linux system via command
line) x-povray +I test1.pov. The resulting picture is shown in Fig. 13.15; please
note the shadows on the plane.
Povray is really powerful. You can create almost arbitrarily shaped objects, combine
them into complex objects and impose many transformations. Also special effects
like blurring or fog are available. All features of Povray are described in a 400 page
manual. The use of Povray is widespread in the artists' community. For scientists
it is very convenient as well, because you can easily convert e.g. configuration files
of molecules or three-dimensional domains of magnetic systems into nice looking
perspective pictures. This can be accomplished by writing a small program which reads
e.g. your configuration file containing a list of positions of atoms and a list of links,
and puts for every atom a sphere and for every link a cylinder into a Povray scene
file. Finally the program must add suitably chosen light sources and a camera. Then,
a three-dimensional picture is created by calling Povray.
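Such a conversion program can be very short. The following is a minimal sketch under
several assumptions: the configuration is already held in arrays (reading the
configuration file is omitted), and the colors, radii, camera position and light
position are arbitrary illustrative choices. The function names write_atom, write_link
and write_scene are hypothetical. Colors are given as rgb vectors, so that the scene
file does not depend on predefined color names.

```c
#include <stdio.h>

/* Writes a sphere for one atom at (x, y, z) with radius r. */
void write_atom(FILE *out, double x, double y, double z, double r)
{
  fprintf(out, "sphere { <%g, %g, %g>, %g\n"
               "  pigment { color rgb <0, 0, 1> } }\n", x, y, z, r);
}

/* Writes a cylinder with radius r for one link between points a and b. */
void write_link(FILE *out, const double *a, const double *b, double r)
{
  fprintf(out, "cylinder { <%g, %g, %g>, <%g, %g, %g>, %g\n"
               "  pigment { color rgb <1, 0, 0> } }\n",
          a[0], a[1], a[2], b[0], b[1], b[2], r);
}

/* Writes a complete scene: background, camera, light source, one
   sphere per atom and one cylinder per link. links[i] holds the
   indices of the two atoms joined by link i.
   Returns the number of objects (atoms plus links) written. */
int write_scene(FILE *out, int num_atoms, double pos[][3],
                int num_links, int links[][2])
{
  int i;

  fprintf(out, "background { color rgb <1, 1, 1> }\n");
  fprintf(out, "camera { location <0, 5, -30> look_at <0, 0, 0> }\n");
  fprintf(out, "light_source { <20, 30, -30> color rgb <1, 1, 1> }\n");
  for (i = 0; i < num_atoms; i++)
    write_atom(out, pos[i][0], pos[i][1], pos[i][2], 1.0);
  for (i = 0; i < num_links; i++)
    write_link(out, pos[links[i][0]], pos[links[i][1]], 0.3);
  return num_atoms + num_links;
}
```

The resulting file can then be passed to Povray in the same way as the hand-written
scene file above. For real configurations the camera and the light source would have
to be placed according to the size of the system.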
The tools described in this section should allow all technical problems occurring in the
process of preparing a publication (a "paper") to be solved. Once you have prepared it,
you should give it to at least one other person, who should read it carefully. Probably
he/she will find some errors or indicate passages which might be difficult to understand
or that are misleading. You should always take such comments very seriously, because
the average reader knows much less about your problem than you do.
After all necessary changes have been performed, and you and other readers are
satisfied with the publication, you can submit it to a scientific journal. You should choose
a journal which suits your paper. Where to submit is a question you should discuss with
experienced researchers. It is not possible to give general advice on this issue. Nevertheless,
technically the submission can be performed nowadays almost everywhere
electronically. For a list of publishers of some important journals in physics, please see Sec.
13.9.1. Submitting one paper to several journals in parallel is not allowed. However,
you should also consider submitting to the preprint server [26] to make your
results quickly available to the physics community.
Nevertheless, although this text provides many useful hints concerning performing
computer simulations, the main part of the work is still having good ideas and carefully
conducting the actual research projects.
Bibliography
[1] I. Sommerville, Software Engineering, (Addison-Wesley, Reading (MA) 1989)
[3] W.H. Press, S.A. Teukolsky, W.T. Vetterling, and B.P. Flannery, Numerical
Recipes in C, (Cambridge University Press, Cambridge 1995)
[4] K. Mehlhorn and St. Näher, The LEDA Platform of Combinatorial and Geometric
Computing, (Cambridge University Press, Cambridge 1999);
see also http://www.mpi-sb.mpg.de/LEDA/leda.html
[5] M. Loukides and A. Oram, Programming with GNU Software, (O'Reilly, London
1996);
see also http://www.gnu.org/manual
[6] H.R. Lewis and C.H. Papadimitriou, Elements of the Theory of Computation,
(Prentice Hall, London 1981)
[9] J. Skansholm, C++ from the Beginning, (Addison-Wesley, Reading (MA) 1997)
[11] B.W. Kernighan and D.M. Ritchie, The C Programming Language, (Prentice Hall,
London 1988)
[12] A. Oram and S. Talbott, Managing Projects With Make, (O'Reilly, London 1991)
[13] The programs and manuals can be found on http://www.gnu.org. For some there
is a texinfo file. To read it, call the editor 'emacs' and type <ctrl>+'h' and then
'i' to start the texinfo mode.
[14] J. Phillips, The Nag Library: A Beginner's Guide, (Oxford University Press,
Oxford 1987);
see also http://www.nag.com
[17] A.M. Ferrenberg, D.P. Landau and Y.J. Wong, Phys. Rev. Lett. 69, 3382 (1992);
I. Vattulainen, T. Ala-Nissila and K. Kankaala, Phys. Rev. Lett. 73, 2313 (1994)
[18] J.F. Fernandez and C. Criado, Phys. Rev. E 60, 3361 (1999)
[20] The tool can be obtained under the GNU public license from
http://www.gnu.org/software/checker/checker.html
[21] J. Cardy, Scaling and Renormalization in Statistical Physics, (Cambridge
University Press, Cambridge 1996)
[22] The program fsscale is written by A. Hucht, please contact him via email:
[email protected]