Boolean Functions - Theory, Algorithms, and Applications (Crama & Hammer 2011-05-16)
Boolean Functions
Written by prominent experts in the field, this monograph provides the first compre-
hensive and unified presentation of the structural, algorithmic, and applied aspects of
the theory of Boolean functions.
The book focuses on algebraic representations of Boolean functions, especially dis-
junctive and conjunctive normal form representations. It presents within this framework
the fundamental elements of the theory (Boolean equations and satisfiability problems,
prime implicants and associated short representations, dualization), an in-depth study
of special classes of Boolean functions (quadratic, Horn, shellable, regular, threshold,
read-once functions and their characterization by functional equations), and two fruit-
ful generalizations of the concept of Boolean functions (partially defined functions and
pseudo-Boolean functions). Several topics are presented here in book form for the first
time.
Because of the unique depth and breadth of the unified treatment that it provides and
its emphasis on algorithms and applications, this monograph will have special appeal
for researchers and graduate students in discrete mathematics, operations research,
computer science, engineering, and economics.
Dr. Yves Crama is Professor of Operations Research and Production Management and
the former Director General of the HEC Management School of the University of
Liège, Belgium. He is widely recognized as a prominent expert in the field of Boolean
functions, combinatorial optimization, and operations research, and he has coauthored
more than seventy papers and three books on these subjects. Dr. Crama is a member of
the editorial board of Discrete Applied Mathematics, Discrete Optimization, Journal
of Scheduling, and 4OR – The Quarterly Journal of the Belgian, French and Italian
Operations Research Societies.
Boolean Functions
Theory, Algorithms, and Applications
YVES CRAMA
University of Liège, Belgium
PETER L. HAMMER
cambridge university press
Cambridge, New York, Melbourne, Madrid, Cape Town,
Singapore, São Paulo, Delhi, Tokyo, Mexico City
A catalog record for this publication is available from the British Library.
To Edith,
by way of apology for countless days
spent in front of the computer.
YC
Contents
Part I Foundations
1 Fundamental concepts and applications 3
1.1 Boolean functions: Definitions and examples 3
1.2 Boolean expressions 8
1.3 Duality 13
1.4 Normal forms 14
1.5 Transforming an arbitrary expression into a DNF 19
1.6 Orthogonal DNFs and number of true points 22
1.7 Implicants and prime implicants 24
1.8 Restrictions of functions, essential variables 28
1.9 Geometric interpretation 31
1.10 Monotone Boolean functions 33
1.11 Recognition of functional and DNF properties 40
1.12 Other representations of Boolean functions 44
1.13 Applications 49
1.14 Exercises 65
2 Boolean equations 67
2.1 Definitions and applications 67
2.2 The complexity of Boolean equations: Cook’s theorem 72
2.3 On the role of DNF equations 74
2.4 What does it mean to “solve a Boolean equation”? 78
2.5 Branching procedures 80
2.6 Variable elimination procedures 87
Bibliography 635
Index 677
Contributors
Claude Benzaken
Laboratoire G-SCOP
Université Joseph Fourier
Grenoble, France
Endre Boros
RUTCOR – Rutgers Center for Operations Research
Rutgers University
Piscataway, NJ, USA
Nadia Brauner
Laboratoire G-SCOP
Université Joseph Fourier
Grenoble, France
Martin C. Golumbic
The Caesarea Rothschild Institute
University of Haifa
Haifa, Israel
Vladimir Gurvich
RUTCOR – Rutgers Center for Operations Research
Rutgers University
Piscataway, NJ, USA
Lisa Hellerstein
Department of Computer and Information Science
Polytechnic Institute of New York University
Brooklyn, NY, USA
Toshihide Ibaraki
Kyoto College of Graduate Studies for Informatics
Kyoto, Japan
Alexander Kogan
Rutgers Business School and RUTCOR
Rutgers University
Piscataway, NJ, USA
Kazuhisa Makino
Department of Mathematical Informatics
University of Tokyo
Tokyo, Japan
Bruno Simeone
Department of Statistics
La Sapienza University
Rome, Italy
Preface
Boolean functions, meaning {0, 1}-valued functions of a finite number of {0, 1}-
valued variables, are among the most fundamental objects investigated in pure and
applied mathematics. Their importance can be explained by several interacting
factors.
computers on binary digits (or bits). Turing machines and Boolean circuits
are prime examples illustrating this claim. Similarly, electrical engineers rely
on the Boolean formalism for the description, synthesis, or verification of
digital circuits.
• In operations research or management science, binary variables and Boolean
functions are frequently used to formulate problems where a number of “go –
no go” decisions are to be made; these could be, for instance, investment
decisions arising in a financial management framework, or location deci-
sions in logistics, or assignment decisions for production planning. In most
cases, the variables have to be fixed at values that satisfy constraints express-
ible as Boolean conditions and that optimize an appropriate real-valued
objective function. This leads to – frequently difficult – Boolean equations
(“satisfiability problems”) or integer programming problems.
• Voting games and related systems of collective choice are frequently repre-
sented by Boolean functions, where the variables are associated with (binary)
alternatives available to the decision makers, and the value of the function
indicates the outcome of the process.
• Various branches of artificial intelligence rely on Boolean functions to express
deductive reasoning processes (in the above-mentioned propositional frame-
work), or to model primitive cognitive and memorizing activities of the brain
by neural networks, or to investigate efficient learning strategies, or to devise
storing and retrieving mechanisms in databases, and so on.
We could easily extend this list to speak of Boolean models arising in reliability
theory, in cryptography, in coding theory, in multicriteria analysis, in mathematical
biology, in image processing, in theoretical physics, in statistics, and so on.
The main objective of the present monograph is to introduce the reader to the
fundamental elements of the theory of Boolean functions. It focuses on algebraic
representations of Boolean functions, especially disjunctive or conjunctive nor-
mal form expressions, and it provides a very comprehensive presentation of the
structural, algorithmic, and applied aspects of the theory in this framework.
The monograph is divided into three main parts.
Part I: Foundations proposes in Chapter 1: Fundamental concepts and applica-
tions, an introduction to the major concepts and applications of the theory. It then
successively tackles three generic classes of problems that play a central role in the
theory and in the applications of Boolean functions, namely, Boolean equations
and their extensions in Chapter 2: Boolean equations, the generation of prime
implicants and of optimal normal form representations in Chapter 3: Prime impli-
cants and minimal DNFs, and various aspects of the relation between functions
and their dual in Chapter 4: Duality theory.
Part II: Special Classes presents an in-depth study of several remarkable classes
of Boolean functions. Each such class is investigated from both the structural and
the algorithmic points of view. Chapter 5 is devoted to Quadratic functions, Chapter
6 to Horn functions, Chapter 7 to Orthogonal forms and shellability, Chapter 8 to
The genesis of this book spread over many years, and over this long period, the
authors have benefited from the support and advice provided by many individuals.
First and foremost, several colleagues have contributed important material to
the monograph: Endre Boros, Marty Golumbic, Vladimir Gurvich, Lisa Heller-
stein, Toshi Ibaraki, Oya Ekin Karaşan, Alex Kogan, Kaz Makino, and Bruno
Simeone have coauthored several chapters and have provided input on various
sections. Claude Benzaken and Nadia Brauner have developed a software pack-
age for manipulating Boolean functions that serves as a useful companion to the
monograph. The contributions of these prominent experts of Boolean functions
greatly enhance the appeal of the volume.
Comments, reviews, and corrections have been provided at different stages by
colleagues and by RUTCOR students, including Nina Feferman, Noam Goldberg,
Levent Kandiller, Shaoji Li, Tongyin Liu, Irina Lozina, Martin Milanic, Devon
Morrese, David Neu, Sergiu Rudeanu, Gábor Rudolf, Jan-Georg Smaus, and Mine
Subasi.
Special thanks are due to Endre Boros, who provided constant encouragement
and tireless advice to the authors over the gestation period of the volume. Terry
Hart provided the efficient administrative assistance that allowed the authors to
keep track of countless versions of the manuscript and endless mail exchanges.
Finally, I am deeply indebted to my mentor, colleague, and friend, Peter L.
Hammer, for getting us started on this ambitious project, many years ago. Peter
spent much of his academic career stressing the importance and relevance of
Boolean models in different fields of applied mathematics, and he was very keen
on completing this monograph. It is extremely unfair that he did not live to see
the outcome of our joint effort. I am sure that he would have loved it, and that he
would have been very proud of this contribution to the dissemination of the theory,
algorithms, and applications of Boolean functions.
Yves Crama
Liège, Belgium, September 2010
Notations
Part I
Foundations
Fundamental concepts and applications
The purpose of this introductory chapter is threefold. First, it contains the main
definitions, terminology, and notations that are used throughout the book. After the
introduction of our main feature characters – namely, Boolean functions – several
sections are devoted to a discussion of alternative representations, or expressions,
of Boolean functions. Disjunctive and conjunctive normal forms, in particular, are
discussed at length in Sections 1.4–1.11. These special algebraic expressions play
a very central role in our investigations, as we frequently focus on the relation
between Boolean functions and their normal forms. Section 1.12, however, also
provides a short description of different types of function representations, namely,
representations over GF(2), pseudo-Boolean polynomial expressions, and binary
decision diagrams.
A second objective of this chapter is to introduce several of the topics to be
investigated in more depth in subsequent chapters, namely: fundamental algorith-
mic problems (Boolean equations, generation of prime implicants, dualization,
orthogonalization, etc.) and special classes of Boolean functions (bounded-degree
normal forms, monotone functions, Horn functions, threshold functions, etc.).
Finally, the chapter briefly presents a variety of applications of Boolean functions
in such diverse fields as logic, electrical engineering, reliability theory, game the-
ory, combinatorics, and so on. These applications have often provided the primary
motivation for the study of the problems to be encountered in the next chapters.
In a sense, this introductory chapter provides a (very) condensed digest of
what’s to come. It can be considered a degustation: Its main purpose is to whet the
appetite, so that readers will decide to embark on the full course!
(x1 , x2 , x3 )    f (x1 , x2 , x3 )
(0,0,0)            1
(0,0,1)            1
(0,1,0)            0
(0,1,1)            1
(1,0,0)            0
(1,0,1)            1
(1,1,0)            0
(1,1,1)            1

Of course, the use of truth tables becomes extremely cumbersome when the
function to be defined depends on more than, say, 5 or 6 arguments. As a matter
of fact, Boolean functions are often defined implicitly rather than explicitly, in the
sense that they are described through a procedure that allows us, for any 0 − 1 point
in the domain of interest, to compute the value of the function at this point. In some
theoretical developments, or when we analyze the computational complexity of
certain problems, such a procedure can simply be viewed as a black box oracle, of
which we can observe the output (that is, the function value) for any given input,
but not the inner working (that is, the details of the algorithm that computes the
output). In most applications, however, more information is available regarding
the process that generates the function of interest, as illustrated by the examples
below. (We come back to these applications in much greater detail in Section 1.13
and in many subsequent chapters of the book.)
For instance, the circuit represented in Figure 1.1 computes the function given
in Example 1.1. This can easily be verified by computing the state of the output gate
(in this case, the OR-gate) for all possible 0–1 inputs. For example, if (x1 , x2 , x3 ) =
(0, 0, 0), then one successively finds that the state of each NOT-gate is 1 (= 1 − 0);
the state of the AND-gate is 1 (= min(1, 1)); and the state of the output gate is 1
(= max(1, 0)).
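This check is easy to mechanize. The following sketch (in Python, with names of our own choosing) evaluates the circuit with the arithmetic rules used above and compares it against the truth table of Example 1.1 at all eight points:

```python
from itertools import product

# Truth table of the function of Example 1.1, copied from the table above.
truth_table = {
    (0, 0, 0): 1, (0, 0, 1): 1, (0, 1, 0): 0, (0, 1, 1): 1,
    (1, 0, 0): 0, (1, 0, 1): 1, (1, 1, 0): 0, (1, 1, 1): 1,
}

def circuit(x1, x2, x3):
    # Two NOT-gates on x1 and x2, an AND-gate, and the output OR-gate,
    # evaluated with the arithmetic rules NOT = 1 - x, AND = min, OR = max.
    return max(min(1 - x1, 1 - x2), x3)

# The circuit computes the tabulated function at all eight 0-1 points.
assert all(circuit(*X) == truth_table[X] for X in product((0, 1), repeat=3))
```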
More generally, the gates of a combinational circuit may compute “primitive”
Boolean functions drawn from a different collection than the {AND, OR, NOT} set used
in our small example. In all cases, the gates may be viewed as atomic units of hardware,
providing the building blocks for the construction of larger circuits.
Historically, propositional logic and electrical engineering have been the main
nurturing fields for the development of research on Boolean functions. However,
because they are such fundamental mathematical objects, Boolean functions have
also been used to model a large number of applications in a variety of areas. To
describe these applications, we introduce a few more notations.
Given a point X ∈ Bn , we denote by supp(X) the support of X, that is, supp(X)
is the set { i ∈ {1, 2, . . . , n} | xi = 1}. (Conversely, X is the characteristic vector of
supp(X).)
Figure 1.1. A small combinational circuit. [The diagram shows the first two inputs passing through NOT-gates into an AND-gate, whose output, together with the third input, feeds the output OR-gate.]
Application 1.3. (Game theory.) Many group decision procedures (such as those
used in legislative assemblies or in corporate stockholder meetings) can be viewed,
in abstract terms, as decision rules that associate a single dichotomous “Yes–No”
outcome (for instance, adoption or rejection of a resolution) with a collection
of dichotomous “Yes–No” votes (for instance, assent or disagreement of indi-
vidual lawmakers). Such procedures have been studied in the game-theoretic
literature under the name of simple games or voting games. More formally, let
N = {1, 2, . . . , n} be a finite set, the elements of which are to be called players.
A simple game on N is a function v : {A | A ⊆ N } → B. Clearly, from our van-
tage point, a simple game can be equivalently modeled as a Boolean function fv
on Bn : The variables of fv are in 1-to-1 correspondence with the players of the
game (variable i takes value 1 exactly when player i votes “Yes”), and the value
of the function reflects the outcome of the vote for each point X ∗ ∈ B n describing
a vector of individual votes:
fv (X∗ ) =  1  if v(supp(X∗ )) = 1,
            0  otherwise.
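By way of illustration, the correspondence between a simple game v and the Boolean function fv can be coded directly from the definition. The following is a minimal sketch; the majority rule and all function names are our own assumed examples, not the book's notation:

```python
from itertools import product

def supp(X):
    # Support of a 0-1 point X = (x1, ..., xn): the indices i with xi = 1.
    return {i + 1 for i, xi in enumerate(X) if xi == 1}

# A hypothetical simple game on N = {1, 2, 3}: the simple majority rule.
def v(A):
    return 1 if len(A) >= 2 else 0

def f_v(X):
    # The Boolean function associated with the game, as in Application 1.3.
    return 1 if v(supp(X)) == 1 else 0

# Two "Yes" votes carry the decision; a single "Yes" vote does not.
assert f_v((1, 1, 0)) == 1 and f_v((0, 0, 1)) == 0
```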
Application 1.4. (Reliability theory.) Reliability theory investigates the relation-
ship between the operating state of a complex system S and the operating state
of its individual components, say components 1, 2, . . . , n. It is commonly assumed
that the system and its components can be in either of two states: operative or
failed. Moreover, the state of the system is completely determined by the state of
its components via a deterministic rule embodied in a Boolean function fS on B n ,
called the structure function of the system: For each X∗ ∈ Bn ,
fS (X∗ ) =  1  if the system operates when all components in supp(X∗ ) operate
               and all other components fail,
            0  otherwise.
A central issue is to compute the probability that the system operates (meaning
that fS takes value 1) when each component is subject to probabilistic failure.
Thus, reliability theory deals primarily with the stochastic theory of Boolean
functions.
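The central computation mentioned above can be sketched by brute-force enumeration over the structure function. The three-component system and its operating probabilities below are assumed examples of our own, not taken from the text:

```python
from itertools import product

# A hypothetical 3-component system: components 1 and 2 in series, in
# parallel with component 3. Its structure function, in the sense above:
def f_S(x1, x2, x3):
    return max(min(x1, x2), x3)

# Assumed independent operating probabilities of the three components.
p = (0.9, 0.8, 0.5)

def prob(X):
    # Probability of the component-state vector X under independence.
    r = 1.0
    for xi, pi in zip(X, p):
        r *= pi if xi == 1 else 1.0 - pi
    return r

# Probability that the system operates: sum over the true points of f_S.
reliability = sum(prob(X) for X in product((0, 1), repeat=3) if f_S(*X) == 1)
assert abs(reliability - 0.86) < 1e-9
```

Enumeration is exponential in n, of course; much of reliability theory is concerned with doing better for structured systems.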
Keeping in line with our focus on functions, we often regard the three elemen-
tary Boolean operations as defining Boolean functions on B2 : disj(x, y) = x ∨ y,
conj(x, y) = x ∧ y, and on B: neg(x) = x̄. When the elements of B = {0, 1} are
interpreted as integers rather than abstract symbols, these operations can be defined
by simple arithmetic expressions: For all x, y ∈ B,
x ∨ y = max{x, y} = x + y − x y,
x ∧ y = min{x, y} = x y,
x̄ = 1 − x.
Observe that the conjunction of two elements of B is equal to their arithmetic
product. By analogy with the usual convention for products, we often omit the
operator ∧ and denote conjunction by mere juxtaposition.
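These arithmetic expressions are immediate to verify by enumeration; a minimal sketch:

```python
# Checking the arithmetic expressions of the three elementary operations
# over all of B (for complementation) and B^2 (for disjunction, conjunction).
for x in (0, 1):
    assert 1 - x == (1 if x == 0 else 0)   # complementation
    for y in (0, 1):
        assert max(x, y) == x + y - x * y  # disjunction
        assert min(x, y) == x * y          # conjunction
```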
We can extend the definitions of all three elementary operators to Bn by writing:
For all X, Y ∈ Bn ,
X ∨ Y = (x1 ∨ y1 , x2 ∨ y2 , . . . , xn ∨ yn ),
X ∧ Y = (x1 ∧ y1 , x2 ∧ y2 , . . . , xn ∧ yn ) = (x1 y1 , x2 y2 , . . . , xn yn ),
X̄ = (x̄1 , x̄2 , . . . , x̄n ).
Let us enumerate some of the elementary properties of disjunction, conjunction,
and complementation. (We note for completeness that the properties listed
in Theorem 1.1 can be viewed as the defining properties of a general Boolean
algebra.)

Theorem 1.1. The following identities hold for all x, y, z ∈ B:
(1) x ∨ 1 = 1 and x ∧ 0 = 0;
(2) x ∨ 0 = x and x ∧ 1 = x;
(3) x ∨ y = y ∨ x and x y = y x (commutativity);
(4) (x ∨ y) ∨ z = x ∨ (y ∨ z) and x (y z) = (x y) z (associativity);
(5) x ∨ x = x and x x = x (idempotency);
(6) x ∨ (x y) = x and x (x ∨ y) = x (absorption);
(7) x ∨ (y z) = (x ∨ y) (x ∨ z) and x (y ∨ z) = (x y) ∨ (x z) (distributivity);
(8) x ∨ x̄ = 1 and x x̄ = 0;
(9) (x̄)̄ = x (involution);
(10) (x ∨ y)̄ = x̄ ȳ and (x y)̄ = x̄ ∨ ȳ (De Morgan’s laws);
(11) x ∨ (x̄ y) = x ∨ y and x (x̄ ∨ y) = x y (Boolean absorption).
Proof. These identities are easily verified, for example, by exhausting all possible
values for x, y, z.
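The exhaustive verification suggested in the proof is easy to mechanize. Here is a sketch for three representative identities, with disjunction, conjunction, and complementation rendered arithmetically as max, min, and 1 − x:

```python
from itertools import product

# Exhausting all values of x, y, z in B to verify identities of Theorem 1.1.
OR, AND = max, min
NOT = lambda x: 1 - x

for x, y, z in product((0, 1), repeat=3):
    assert OR(x, AND(y, z)) == AND(OR(x, y), OR(x, z))   # distributivity (7)
    assert NOT(OR(x, y)) == AND(NOT(x), NOT(y))          # De Morgan (10)
    assert OR(x, AND(NOT(x), y)) == OR(x, y)             # Boolean absorption (11)
```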
Building upon Definition 1.3, we are now in a position to introduce the important
notion of Boolean expression.
Definition 1.4. Given a finite collection of Boolean variables x1 , x2 , . . . , xn , a
Boolean expression (or Boolean formula) in the variables x1 , x2 , . . . , xn is defined
as follows:
(1) The constants 0, 1, and the variables x1 , x2 , . . . , xn are Boolean expressions
in x1 , x2 , . . . , xn .
(2) If φ and ψ are Boolean expressions in x1 , x2 , . . . , xn , then (φ ∨ ψ), (φ ψ)
and φ are Boolean expressions in x1 , x2 , . . . , xn .
(3) Every Boolean expression is formed by finitely many applications of the
rules (1)–(2).
We also say that a Boolean expression in the variables x1 , x2 , . . . , xn is a Boolean
expression on Bn .
We use notations like φ(x1 , x2 , . . . , xn ) or ψ(x1 , x2 , . . . , xn ) to denote Boolean
expressions in the variables x1 , x2 , . . . , xn .
Example 1.2. Here are some examples of Boolean expressions:
φ1 (x) = x,
φ2 (x) = x̄,
ψ1 (x, y, z) = (((x ∨ y)(y ∨ z)) ∨ ((xy)z)),
ψ2 (x1 , x2 , x3 , x4 ) = ((x1 x2 ) ∨ (x̄3 x̄4 )).
Definition 1.6. We say that two Boolean expressions ψ and φ are equivalent
if they represent the same Boolean function. When this is the case, we write
ψ = φ.
Note that any two expressions that can be deduced from each other by repeated
use of the properties listed in Theorem 1.1 are equivalent even though they are not
identical.
Thus, ψ1 (x, y, z) and φ(x, y, z) are equivalent, that is, ψ1 (x, y, z) = φ(x, y, z).
Definition 1.7. The length (or size) of a Boolean expression φ is the number of
symbols used in an encoding of φ as a binary string. The length of φ is denoted by
|φ|.
where we identify the expression ψ with the function fψ that it represents (thus,
(1.2) simply boils down to function composition). In particular, if f and g are two
(f ∨ g)(X∗ ) = f (X ∗ ) ∨ g(X∗ ),
(f ∧ g)(X∗ ) = f (X ∗ ) ∧ g(X∗ ),
f̄ (X∗ ) = (f (X∗ ))̄.
1.3 Duality
With every Boolean function f , the following definition associates another
Boolean function f d called the dual of f :
Definition 1.8. The dual of a Boolean function f is the function f d defined by
f d (X) = f̄ (X̄).
Dual functions arise naturally in many Boolean models. We only describe here
one simple occurrence of this concept; more applications are discussed in Chapter 4.
Application 1.6. (Voting theory.) Suppose that a voting procedure is modeled by
a Boolean function f on Bn , as explained in Application 1.3. Thus, when the play-
ers’ votes are described by the Boolean point X∗ ∈ Bn , the outcome of the voting
procedure is f (X∗ ). What happens if all the players simultaneously change their
minds and vote X̄∗ rather than X∗ ? In many cases, we would expect the outcome of
the procedure to be reversed as well, that is, we would expect f (X̄∗ ) = (f (X∗ ))̄, or
equivalently, f (X∗ ) = f̄ (X̄∗ ) = f d (X∗ ). When the property f (X) = f d (X) holds
for all X ∈ B n , we say that the function f (and the voting procedure it describes) is
self-dual. Note, however, that some common voting procedures are not self-dual,
as exemplified by the two-thirds majority rule.
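The self-duality check, and the failure of a supermajority rule to pass it, can be sketched by enumeration. The four-player two-thirds rule below is our own assumed instance of the phenomenon:

```python
from itertools import product

# f is self-dual when f(X) equals f^d(X), that is, the complement of f
# evaluated at the complemented point: f(X) = 1 - f(X-bar) for all X.
def is_self_dual(f, n):
    return all(f(X) == 1 - f(tuple(1 - xi for xi in X))
               for X in product((0, 1), repeat=n))

# Simple majority among 3 players is self-dual ...
assert is_self_dual(lambda X: 1 if sum(X) >= 2 else 0, 3)
# ... but a two-thirds rule among 4 players (at least 3 "Yes" votes) is not:
# on a 2-2 split, both X and its complement lead to rejection.
assert not is_self_dual(lambda X: 1 if sum(X) >= 3 else 0, 4)
```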
Proof. Definition 1.8 immediately implies (a) and (b). For property (c), observe
that
(f ∨ g)d (X) = ((f ∨ g)(X̄))̄
             = (f (X̄) ∨ g(X̄))̄
             = (f (X̄))̄ (g(X̄))̄   (by De Morgan’s laws)
             = f d (X) g d (X).
Property (d) follows from (a) and (c).
If φk1 , φk2 , . . . , φkm are Boolean expressions indexed over the set K = {k1 , k2 , . . . , km }, then we denote by ⋁_{k∈K} φk
the expression (φk1 ∨ φk2 ∨ . . . ∨ φkm ), and we denote by ⋀_{k∈K} φk the expression
(φk1 ∧ φk2 ∧ . . . ∧ φkm ). By convention, when K is empty, ⋁_{k∈K} φk is equivalent to
the constant 0 and ⋀_{k∈K} φk is equivalent to the constant 1.
Definition 1.10. A literal is an expression of the form x or x̄, where x is a Boolean
variable. An elementary conjunction (sometimes called term, or monomial, or
cube) is an expression of the form

C = ⋀_{i∈A} xi ⋀_{j∈B} x̄j , where A ∩ B = ∅,
A disjunctive normal form (DNF) is a disjunction of elementary conjunctions, that is, an expression of the form

⋁_{k=1}^{m} Ck = ⋁_{k=1}^{m} ⋀_{i∈Ak} xi ⋀_{j∈Bk} x̄j ,
In particular, we have been able to derive both a DNF representation (1.4) and a
CNF representation (1.5) of the original expression (1.3) (which is not a normal
form). This is not an accident. Indeed, we can now establish a fundamental property
of Boolean functions.
Theorem 1.4. Every Boolean function can be represented by a disjunctive normal
form and by a conjunctive normal form.
Proof. Let f be a Boolean function on B n , let T be the set of true points of f , and
consider the DNF
φf (x1 , x2 , . . . , xn ) = ⋁_{Y∈T} ⋀_{i | yi =1} xi ⋀_{j | yj =0} x̄j .    (1.6)
But condition (1.7) simply means that xi∗ = 1 whenever yi = 1, and xi∗ = 0 when-
ever yi = 0, that is, X∗ = Y . Hence, X∗ is a true point of φf if and only if X ∗ ∈ T ,
and we conclude that φf represents f .
A similar reasoning establishes that f is also represented by the CNF
ψf (x1 , x2 , . . . , xn ) = ⋀_{Y∈F} ( ⋁_{j | yj =0} xj ∨ ⋁_{i | yi =1} x̄i ),    (1.8)
Note that, alternatively, the second part of Theorem 1.4 can also be derived
from its first part by an easy duality argument. Indeed, in view of Theorem 1.3,
the function f is represented by the CNF
⋀_{(A,B)∈K} ( ⋁_{i∈A} xi ∨ ⋁_{j∈B} x̄j )    (1.9)
is the minterm expression (or canonical DNF) of f , and the terms of φf are the
minterms of f . The CNF
ψf (x1 , x2 , . . . , xn ) = ⋀_{Y∈F(f)} ( ⋁_{j | yj =0} xj ∨ ⋁_{i | yi =1} x̄i )    (1.12)
is the maxterm expression (or canonical CNF) of f , and the terms of ψf are the
maxterms of f .
Observe that Definition 1.11 actually involves a slight abuse of language, since
the minterm (or the maxterm) expression of a function is unique only up to the
order of its terms and literals. In the sequel, we shall not dwell on this subtle, but
usually irrelevant, point and shall continue to speak of “the” minterm (or maxterm)
expression of a function.
With this terminology, the proof of Theorem 1.4 establishes that every Boolean
function is represented by its minterm expression. This observation can be traced all
the way back to Boole [103]. In view of its unicity, the minterm expression provides
a “canonical” representation of a function. In general, however, the number of
minterms (or, equivalently, of true points) of a function can be very large, so that
handling the minterm expression often turns out to be rather impractical.
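The construction (1.6) is easy to carry out mechanically. Here is a brute-force sketch (function names and the tiny XOR example are our own) that builds the minterm expression from the true points of a function and checks that it represents the function:

```python
from itertools import product

def minterm_dnf(f, n):
    # One term per true point Y of f: variable x_i appears uncomplemented
    # when y_i = 1 and complemented when y_i = 0, as in (1.6).
    return [(frozenset(i for i in range(n) if Y[i] == 1),
             frozenset(j for j in range(n) if Y[j] == 0))
            for Y in product((0, 1), repeat=n) if f(Y)]

def eval_dnf(terms, X):
    # A DNF takes value 1 when some term (A, B) has all its literals at 1.
    return 1 if any(all(X[i] == 1 for i in A) and all(X[j] == 0 for j in B)
                    for A, B in terms) else 0

# XOR on B^2 has two true points, hence two minterms, and the resulting
# minterm expression represents the function exactly.
f = lambda X: X[0] ^ X[1]
terms = minterm_dnf(f, 2)
assert len(terms) == 2
assert all(eval_dnf(terms, X) == f(X) for X in product((0, 1), repeat=2))
```

Note that the number of terms equals the number of true points, which is exactly why this canonical representation is impractical for most functions.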
Normal form expressions play a central role in the theory of Boolean functions.
Their preeminence is partially justified by Theorem 1.4, but this justification is
not sufficient in itself. Indeed, the property described in Theorem 1.4 similarly
holds for many other special classes of Boolean expressions. For instance, it can
be observed that, besides its DNF and CNF expressions, every Boolean function
also admits expressions involving only disjunctions and complementations, but no
conjunctions (as well as expressions involving only conjunctions and complemen-
tations, but no disjunctions). Indeed, as an immediate consequence of De Morgan’s
laws, every conjunction x y can be replaced by the equivalent expression (x̄ ∨ ȳ)̄,
and similarly, every disjunction x ∨ y can be replaced by the expression (x̄ ȳ)̄.
More to the point, and as we will repeatedly observe, normal forms arise
quite naturally when one attempts to model various problems within a Boolean
framework. For this reason, normal forms are ubiquitous in this book: Many of
the problems to be investigated will be based on the assumption that the Boolean
functions at hand are expressed in normal form or, conversely, will have as a goal
constructing a normal form of a function described in some alternative way (truth
table, arbitrary Boolean expression, etc.).
However, it should be noticed that DNF and CNF expressions provide closely
related frameworks for representing or manipulating Boolean functions (remember
the duality argument invoked at the end of the proof of Theorem 1.4). This dual
relationship between DNFs and CNFs will constitute, in itself, an object of study
in Chapter 4.
Most of the time, we display a slight preference for DNF representations of
Boolean functions over their CNF counterparts, to the extent that we discuss many
problems in terms of DNF rather than CNF representations. This choice is in
agreement with much of the classical literature on propositional logic, electrical
engineering, and reliability theory, but is opposite to the standard convention in the
artificial intelligence and computational complexity communities. Our preference
for DNFs is partially motivated by their analogy with real polynomials: Indeed,
since the identities x ∧ y = x y and x̄ = 1 − x hold when x, y are interpreted as
numbers in {0, 1}, every DNF of the form

⋁_{k=1}^{m} ⋀_{i∈Ak} xi ⋀_{j∈Bk} x̄j
Note that the (encoding) length |φ| of a DNF φ, as introduced in Definition 1.7,
comes within a constant factor of the number of literals appearing in φ. Therefore,
we generally feel free to identify these two measures of the size of φ (especially
when discussing asymptotic complexity results). We denote by ||φ|| the number
of terms of a DNF φ.
The problem with this method (and, actually, with any method that transforms
a Boolean expression into an equivalent DNF) is that it may very well require an
exponential number of steps, as illustrated by the following example:
Example 1.11. The function represented by the CNF

φ = (x1 ∨ x2 )(x3 ∨ x4 ) · · · (x2n−1 ∨ x2n )

has a unique shortest DNF expression; call it ψ (this will result from Theorem 1.23
hereunder). The terms of ψ are exactly those elementary conjunctions of n variables
that involve one variable out of each of the pairs {x1 , x2 }, {x3 , x4 }, ..., {x2n−1 , x2n }.
Thus, ψ has 2n terms. Writing down all these terms requires exponentially large
time and space in terms of the length of the original formula φ.
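The blowup is easy to witness numerically. A sketch (with n = 4 and helper names of our own) that expands the CNF of Example 1.11 by distributivity, picking one variable out of each clause:

```python
from itertools import product

def expand_pairs_cnf(n):
    # Distributing the conjunction (x1 v x2)(x3 v x4)...(x_{2n-1} v x_{2n})
    # chooses one variable from each clause, giving one term per choice.
    clauses = [(2 * k + 1, 2 * k + 2) for k in range(n)]
    return {frozenset(choice) for choice in product(*clauses)}

# With only n = 4 clauses (8 variables), the DNF already has 2^4 = 16 terms.
assert len(expand_pairs_cnf(4)) == 2 ** 4
```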
The same DNF ψ2 is also the DNF expansion of ((x1 ∨ (x2 x̄3 )))̄, this time with
ȳ2 as associated literal.
Finally, putting all the pieces together, we obtain the desired expression of ψ:
ψ(x1 , x2 , x3 , y1 , y2 , y4 , z) = ȳ1 z ∨ ȳ2 z ∨ y1 y2 z̄
∨ x1 ȳ1 ∨ x4 ȳ1 ∨ x̄1 x̄4 y1
∨ x1 ȳ2 ∨ y4 ȳ2 ∨ x̄1 ȳ4 y2
∨ x̄2 y4 ∨ x3 y4 ∨ x2 x̄3 ȳ4
with distinguished literal z. We leave it as an easy exercise to check that, for every
X = (x1 , x2 , x3 , x4 ) ∈ B4 , the unique solution of the equation ψ = 0 satisfies
y1 = φ1 (X) = x1 ∨ x4
y2 = φ2 (X) = x1 ∨ (x2 x̄3 )
y4 = φ4 (X) = x2 x̄3
z = φ(X).
Figure 1.2 presents a formal description of the procedure Expand. Let us now
establish the correctness of this procedure.
Theorem 1.5. With every Boolean expression φ(X) on B n , the procedure Expand
associates a DNF ψ(X, Y ) on Bn+m (m ≥ 0) and a distinguished literal z among the
literals on {x1 , x2 , . . . , xn , y1 , y2 , . . . , ym } with the property that, for each X∗ ∈ B n ,
there is a unique point Y (X∗ ) ∈ Bm such that ψ(X∗ , Y (X∗ )) = 0; moreover, at this
point, the distinguished literal z is equal to φ(X∗ ). Expand can be implemented
to run in linear time.
Fix X∗ ∈ Bn . By induction, there exist k points Y1∗ , Y2∗ , . . . , Yk∗ , each of them
uniquely defined, such that ψj (X ∗ , Yj∗ ) = 0 and zj∗ = φj (X ∗ ) for j = 1, 2, . . . , k. It is
then straightforward to verify that the condition ψ(X ∗ , Y1 , Y2 , . . . , Yk , y) = 0 holds
for a unique choice of (Y1 , Y2 , . . . , Yk , y), namely, for Yj = Yj∗ (j = 1, 2, . . . , k), and
for
Procedure Expand(φ)
Input: A Boolean expression φ(x1 , x2 , . . . , xn ).
Output: A DNF ψ(x1 , x2 , . . . , xn , y1 , y2 , . . . , ym ), with a distinguished literal z among the
literals on {x1 , x2 , . . . , xn , y1 , y2 , . . . , ym }.
begin
if φ = xi for some i ∈ {1, 2, . . . , n}
then return ψ(x1 , x2 , . . . , xn ) = 0n and the distinguished literal xi
else if φ = φ̄1 for some expression φ1 then
begin
let ψ1 := Expand(φ1 ) and let z be the distinguished literal of ψ1 ;
return ψ := ψ1 and the distinguished literal z̄;
end
else if φ = (φ1 ∨ φ2 ∨ . . . ∨ φk ) for some expressions φ1 , φ2 , . . . , φk then
begin
for j = 1 to k do ψj := Expand(φj );
let zj be the distinguished literal of ψj , for j = 1, 2, . . . , k;
create a new variable y;
return ψ := z1 ȳ ∨ z2 ȳ ∨ . . . ∨ zk ȳ ∨ z̄1 z̄2 . . . z̄k y ∨ ψ1 ∨ ψ2 ∨ . . . ∨ ψk
and the distinguished literal y;
end
else if φ = (φ1 φ2 . . . φk ) for some expressions φ1 , φ2 , . . . , φk then
begin
for j = 1 to k do ψj := Expand(φj );
let zj be the distinguished literal of ψj , for j = 1, 2, . . . , k;
create a new variable y;
return ψ := z̄1 y ∨ z̄2 y ∨ . . . ∨ z̄k y ∨ z1 z2 . . . zk ȳ ∨ ψ1 ∨ ψ2 ∨ . . . ∨ ψk
and the distinguished literal y;
end
end
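For readers who wish to experiment, the procedure can be sketched in Python for nested {AND, OR, NOT} expressions. The tuple encoding, the helper names, and the brute-force uniqueness check below are all our own choices, not the book's; the gadgets are the ones displayed in the procedure above:

```python
from itertools import product

# Expressions are nested tuples: ('var', i), ('not', e), ('or', e1, ..., ek),
# ('and', e1, ..., ek). A literal is a pair (variable, polarity); a DNF is a
# list of terms, each term a list of literals.
def expand(phi, terms, counter):
    op = phi[0]
    if op == 'var':
        return (('x', phi[1]), True)
    if op == 'not':
        v, pol = expand(phi[1], terms, counter)
        return (v, not pol)                # same DNF, complemented literal
    counter[0] += 1
    y = ('y', counter[0])                  # fresh auxiliary variable
    zs = [expand(sub, terms, counter) for sub in phi[1:]]
    if op == 'or':                         # gadget: z_j y-bar, z-bar_1...z-bar_k y
        for z in zs:
            terms.append([z, (y, False)])
        terms.append([(v, not p) for v, p in zs] + [(y, True)])
    else:                                  # 'and': z-bar_j y, z_1...z_k y-bar
        for v, p in zs:
            terms.append([(v, not p), (y, True)])
        terms.append(list(zs) + [(y, False)])
    return (y, True)

def lit_value(lit, asg):
    v, pol = lit
    return asg[v] if pol else 1 - asg[v]

def dnf_value(terms, asg):
    return max((min(lit_value(l, asg) for l in t) for t in terms), default=0)

# For phi = x1 v (x2 x3-bar): each X admits a unique Y with psi(X, Y) = 0,
# and at that point the distinguished literal equals phi(X).
phi = ('or', ('var', 1), ('and', ('var', 2), ('not', ('var', 3))))
terms, counter = [], [0]
z = expand(phi, terms, counter)
for X in product((0, 1), repeat=3):
    asg0 = {('x', i + 1): X[i] for i in range(3)}
    sols = []
    for Y in product((0, 1), repeat=counter[0]):
        asg = {**asg0, **{('y', j + 1): Y[j] for j in range(counter[0])}}
        if dnf_value(terms, asg) == 0:
            sols.append(asg)
    assert len(sols) == 1
    assert lit_value(z, sols[0]) == (X[0] | (X[1] & (1 - X[2])))
```

The output DNF has one auxiliary variable and a constant number of terms per connective, which is the source of the linear-time bound claimed in Theorem 1.5.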
Note also that a DNF is orthogonal if and only if, for every pair of terms
k, l ∈ {1, 2, . . . , m}, k ≠ l, and for every X∗ ∈ Bn ,

( ⋀_{i∈Ak} xi∗ ⋀_{j∈Bk} x̄j∗ ) ( ⋀_{i∈Al} xi∗ ⋀_{j∈Bl} x̄j∗ ) = 0.
The terminology “orthogonal” is quite natural in view of this observation, the proof
of which is left to the reader.
One of the main motivations for the interest in orthogonal DNFs is that, for
functions expressed in this form, computing the number of true points turns out to
be extremely easy.
Theorem 1.8. If the Boolean function f on B n is represented by an orthogonal
DNF of the form (1.13), then the number of its true points is equal to
ω(f ) = ∑_{k=1}^{m} 2^{n−|Ak|−|Bk|} .
Proof. The DNF (1.13) takes value 1 exactly when one of its terms takes value 1.
Since the terms are pairwise orthogonal, ω(f ) = ∑_{k=1}^{m} αk , where αk denotes the
number of true points of the k-th term. The statement follows, since the k-th term
takes value 1 at exactly αk = 2^{n−|Ak|−|Bk|} points: its |Ak | + |Bk | literals are fixed
to 1, and the remaining n − |Ak | − |Bk | coordinates can be chosen freely.
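Theorem 1.8 turns counting into simple arithmetic. A sketch on an assumed orthogonal DNF of our own, cross-checked by brute-force enumeration:

```python
from itertools import product

def omega(terms, n):
    # Orthogonal terms have pairwise disjoint sets of true points, so the
    # individual counts 2^(n - |A_k| - |B_k|) simply add up (Theorem 1.8).
    return sum(2 ** (n - len(A) - len(B)) for A, B in terms)

# An assumed orthogonal DNF on B^3: f = x1 x2 v x1-bar x3 (the two terms
# conflict on x1, hence are orthogonal).
terms = [({1, 2}, set()), ({3}, {1})]

def f(X):
    return 1 if any(all(X[i - 1] == 1 for i in A) and
                    all(X[j - 1] == 0 for j in B) for A, B in terms) else 0

# The formula agrees with direct enumeration of the true points.
assert omega(terms, 3) == sum(f(X) for X in product((0, 1), repeat=3)) == 4
```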
At this point, the reader may be wondering (with some reason) why anyone
would ever want to compute the number of true points of a Boolean function. We
present several applications of this concept in Section 1.13. For now, it may be
sufficient to note that determining the number of true points of a function f is a
roundabout way to check the consistency of the Boolean equation f = 0.
Chow [194] introduced several parameters of a Boolean function that are closely
related to the number ω(f ) defined in Theorem 1.8.
Definition 1.14. The Chow parameters of a Boolean function f on B n are the
n + 1 integers (ω1 , ω2 , . . . , ωn , ω), where ω = ω(f ) is the number of true points of
f and ωi is the number of true points X∗ of f such that xi∗ = 1:
ωi = | {X ∗ ∈ B n | f (X∗ ) = 1 and xi∗ = 1} |, i = 1, 2, . . . , n.
The same reasoning as in Theorem 1.8 shows that the Chow parameters of a
function represented in orthogonal form can be efficiently computed: For ω, this
is just a consequence of Theorem 1.8; for ωi (1 ≤ i ≤ n), this follows from the fact
that the DNF obtained by fixing xi to 1 in an orthogonal DNF remains orthogonal.
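The observation can be turned into a short program: in the sketch below, a DNF is a list of pairs (Ak , Bk ) of index sets (a representation chosen for illustration), and each ωi is obtained by applying the counting formula of Theorem 1.8 to the restricted DNF:

```python
def fix_to_one(dnf, i):
    """Restriction x_i := 1 of a DNF: terms containing the complemented literal
    of x_i vanish, and x_i is deleted from the remaining terms; this operation
    clearly preserves orthogonality."""
    return [(A - {i}, B) for A, B in dnf if i not in B]

def chow_parameters(dnf, n):
    """(omega_1, ..., omega_n, omega) for an orthogonal DNF (Definition 1.14):
    omega by Theorem 1.8, and omega_i by applying Theorem 1.8 on B^(n-1)
    after fixing x_i = 1."""
    count = lambda d, m: sum(2 ** (m - len(A) - len(B)) for A, B in d)
    return [count(fix_to_one(dnf, i), n - 1) for i in range(n)], count(dnf, n)

# ODNF  x0 x1 v (not x0) x2  on B^3: 4 true points, of which 2, 3, 3
# have x0 = 1, x1 = 1, x2 = 1 respectively.
phi = [({0, 1}, set()), ({2}, {0})]
assert chow_parameters(phi, 3) == ([2, 3, 3], 4)
```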
Theorem 1.9. For all Boolean functions f and g on Bn , the following statements
are equivalent:
(1) f ≤ g;
(2) f ∨ g = g;
(3) f̄ ∨ g = 1n ;
(4) f g = f ;
(5) f ḡ = 0n .
Proof. It suffices to note that each of the assertions (1)–(5) fails exactly when
there exists X ∈ Bn such that f (X) = 1 and g(X) = 0.
(1) 0 n ≤ f ≤ 1n ;
(2) f g ≤ f ≤ f ∨ g;
(3) f = g if and only if (f ≤ g and g ≤ f );
(4) (f ≤ h and g ≤ h) if and only if f ∨ g ≤ h;
(5) (f ≤ g and f ≤ h) if and only if f ≤ g h;
(6) if f ≤ g then f h ≤ g h;
(7) if f ≤ g then f ∨ h ≤ g ∨ h;
Theorem 1.11. The elementary conjunction $C_{AB} = \bigwedge_{i\in A} x_i \bigwedge_{j\in B} \bar{x}_j$ implies the
elementary conjunction $C_{FG} = \bigwedge_{i\in F} x_i \bigwedge_{j\in G} \bar{x}_j$ if and only if F ⊆ A and G ⊆ B.
Proof. Assume that F ⊆ A and G ⊆ B and consider any point X = (x1 , x2 , . . . , xn ) ∈
Bn . If CAB (X) = 1, then xi = 1 for all i ∈ A and xj = 0 for all j ∈ B, so that xi = 1
for all i ∈ F and xj = 0 for all j ∈ G. Hence, CF G (X) = 1 and we conclude that
CAB implies CF G .
To prove the converse statement, assume for instance that F is not contained
in A. Set xi = 1 for all i ∈ A, xj = 0 for all j ∉ A, and let X = (x1 , x2 , . . . , xn ). Then,
CAB (X) = 1 but CF G (X) = 0 (since xk = 0 for some k ∈ F \ A), so that CAB does
not imply CF G .
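Theorem 1.11 replaces a check over all of Bn by two set inclusions; the sketch below (with a representation of conjunctions chosen for illustration) cross-checks the two criteria on a few small conjunctions:

```python
from itertools import product

def conj(A, B, x):
    """Value of the elementary conjunction  AND_{i in A} x_i AND_{j in B} not x_j."""
    return all(x[i] for i in A) and not any(x[j] for j in B)

def implies_by_inclusion(A, B, F, G):
    """Theorem 1.11: C_AB implies C_FG iff F is a subset of A and G of B
    (A and B are assumed disjoint, so that C_AB is not identically 0)."""
    return F <= A and G <= B

# Cross-check against the definition of implication, on B^4:
n = 4
A, B = {0, 1}, {2}
for F, G in [({0}, {2}), ({0, 3}, set()), (set(), {2, 3})]:
    semantic = all(conj(F, G, x) for x in product((0, 1), repeat=n) if conj(A, B, x))
    assert semantic == implies_by_inclusion(A, B, F, G)
```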
Example 1.16. By Theorem 1.12, the function f = xy ∨ xyz (see Example 1.15)
admits the DNF expression xy ∨ xyz ∨ xz = xy ∨ xz (the last equality is easily
verified to hold).
So, a redundant DNF expression can be turned into a shorter equivalent DNF by
dropping some of its terms. For instance, Example 1.18 shows that the complete
DNF of a Boolean function is not necessarily irredundant. Similarly, if a DNF is
not prime, then at least one of its terms can be replaced by a prime implicant that
absorbs it (remember Theorem 1.12 and the comments following it). Therefore,
the shortest DNF representations of a Boolean function are always to be found
among its prime irredundant DNFs. We return to a detailed study of prime
irredundant DNFs in Chapter 3.
Of course, the concepts of implicants and prime implicants have their natural
disjunctive counterparts.
Definition 1.21. Let f be a Boolean function and D be an elementary disjunction.
We say that D is an implicate of f if f implies D. We say that the implicate D is
prime if it is not implied by any other implicate of f .
Similarly to Theorem 1.13, we obtain:
Theorem 1.14. Every Boolean function can be represented by the conjunction of
all its prime implicates.
Proof. The proof is a straightforward adaptation of the proof of Theorem 1.13.
Example 1.19. The function g considered in Example 1.18 has four implicates,
namely, (x ∨ y), (x ∨ y ∨ z), (x ∨ y ∨ z), and (x ∨ y ∨ z). However, only the first
and the last implicates in this list are prime, and we conclude that g = (x ∨ y)
(x ∨ y ∨ z).
and similarly for f|xk =0 (x1 , x2 , . . . , xn ). This slight abuse of definitions is innocuous,
and we use it whenever it proves convenient. Also, we use shorthand like
$f_{|x_1=0,\,x_2=1,\,x_3=0}$ instead of the more cumbersome notation $((f_{|x_1=0})_{|x_2=1})_{|x_3=0}$.
The link between representations of a function and representations of its
restrictions is straightforward.
Theorem 1.15. Let f be a Boolean function on Bn , let ψ be a representation of f ,
and let k ∈ {1, 2, . . . , n}. Then, the expression obtained by substituting the constant
1 (respectively, 0) for every occurrence of xk in ψ represents f|xk =1 (respectively,
f|xk =0 ).
Proof. This is an immediate consequence of Definitions 1.22 and 1.5.
Example 1.20. Consider the function f = (xz ∨ y)(x ∨ z) ∨ x y. After some easy
simplifications, we can derive simple expressions for f|y=1 and f|y=0 .
The right-hand side of the identity (1.15) is often called the Shannon expansion
of the function f with respect to xk , by reference to its use by Shannon in [827],
although this identity was already well-known to Boole [103]. It can be used, in
particular, to construct the minterm DNF of a function (Theorem 1.4 and Definition
1.11). More interestingly, by applying the Shannon expansion to a function and to
its successive restrictions until these restrictions become either 0, or 1, or a literal,
we obtain an orthogonal DNF of the function (this is easily proved by induction
on n). Not every orthogonal DNF, however, can be obtained in this way.
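The recursive construction just described is easily programmed. The sketch below branches on the lowest-indexed free variable until the restriction becomes constant (a slightly coarser stopping rule than "constant or literal", which still yields an orthogonal DNF); functions are given as Python callables, and the majority function serves as a test case:

```python
from itertools import product

def shannon_odnf(f, n, fixed=None):
    """Orthogonal DNF of f : B^n -> {0,1} by repeated Shannon expansion (1.15):
    branch on the first free variable until the restriction is constant.
    Terms are pairs (A, B): A = variables fixed to 1, B = variables fixed to 0;
    any two branches of the recursion disagree on some fixed variable, so the
    resulting terms are pairwise orthogonal."""
    fixed = fixed or {}
    pts = [p for p in product((0, 1), repeat=n)
           if all(p[i] == v for i, v in fixed.items())]
    values = {f(p) for p in pts}
    if values == {0}:
        return []
    if values == {1}:
        return [(frozenset(i for i in fixed if fixed[i] == 1),
                 frozenset(i for i in fixed if fixed[i] == 0))]
    k = min(i for i in range(n) if i not in fixed)
    return (shannon_odnf(f, n, {**fixed, k: 1})
            + shannon_odnf(f, n, {**fixed, k: 0}))

# Majority of three variables: 4 true points.
maj = lambda p: int(p[0] + p[1] + p[2] >= 2)
odnf = shannon_odnf(maj, 3)
assert all(int(any(all(p[i] for i in A) and not any(p[j] for j in B)
                   for A, B in odnf)) == maj(p)
           for p in product((0, 1), repeat=3))
assert sum(2 ** (3 - len(A) - len(B)) for A, B in odnf) == 4
```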
Example 1.21. Consider again the function f in Example 1.20. The Shannon
expansion of f|y=1 with respect to x is
$$x f_{|y=1,x=1} \vee \bar{x} f_{|y=1,x=0} = x \vee \bar{x} z,$$
and the Shannon expansion of f|y=0 with respect to z is
$$z f_{|y=0,z=1} \vee \bar{z} f_{|y=0,z=0} = z \vee \bar{z} x.$$
Expanding f with respect to y and substituting these expressions, we obtain the orthogonal DNF
$$f(x, y, z) = y\,(x \vee \bar{x} z) \vee \bar{y}\,(z \vee \bar{z} x) = x y \vee \bar{x} y z \vee \bar{y} z \vee x \bar{y} \bar{z}.$$
We claim that C is an implicant of f : This will in turn entail that the prime impli-
cants of f do not involve xk . To prove the claim, let X = (x1 , x2 , . . . , xn ) be any
point in Bn such that C(X) = 1, and let us show that f (X) = 1. Since neither C nor
f depends on xk , we may as well suppose that xk = 1. Then, C(X) = CAB (X) = 1
and hence f (X) = 1, as required.
function on B n (except the null function 0 n ), even when the function depends on
far fewer than n variables.
Example 1.22. The DNF φ(x1 , x2 , x3 , x4 ) = x1 x2 ∨ x1 x̄2 ∨ x̄1 x2 ∨ x̄1 x̄2 represents
the constant function 14 on B 4 . In particular, the function represented by φ does
not depend on any of its variables.
We will prove later that, for a function represented by an arbitrary DNF expres-
sion, it is generally difficult to determine whether any given variable is essential
or not (see Theorem 1.32 in Section 1.11).
Finally, let us mention an interesting connection between the concept of
essential variables and of Chow parameters.
Theorem 1.18. Let f be a Boolean function on B n , let (ω1 , ω2 , . . . , ωn , ω) be its
vector of Chow parameters, and let k ∈ {1, 2, . . . , n}. If the variable xk is inessential
for f , then ω = 2 ωk .
Proof. The sets A = {X ∈ Bn | f (X) = 1, xk = 1} and B = {X ∈ Bn | f (X) = 1,
xk = 0} partition the set of true points of f , and |A| = ωk , |B| = ω − ωk . If xk is
inessential, then A and B are in one-to-one correspondence, so ω = 2 ωk .
The converse of Theorem 1.18 is not valid since the function f (x1 , x2 ) =
x1 x̄2 ∨ x̄1 x2 has Chow parameters (1, 1, 2), and both variables x1 , x2 are essential.
Geometrically, the points in TAB are exactly the vertices contained in a face of
U n . Every such face is itself a hypercube of dimension n − |A| − |B| containing
2n−|A|−|B| vertices, and will therefore be referred to as a subcube. (Some authors,
especially in the electrical engineering literature, actually use the term “cube”
instead of “elementary conjunction.”)
Consider now a Boolean function f . In view of the previous observation, each
implicant of f corresponds to a subcube of U n that contains no false points of f .
The implicant
is prime if the corresponding subcube is maximal with this property.
Let $\varphi = \bigvee_{k=1}^{m} C_k$ be an arbitrary DNF expression of the function f . The set
of true points of f coincides with the union of the sets of true points of the
terms C1 , C2 , . . . , Cm .
Figure 1.3. A 3-dimensional view of the Boolean function of Example 1.23: the vertices of the unit cube are marked as true points T (f ) or false points F (f ).
Example 1.23. Consider again the function given by Table 1.1 in Section 1.1.
A Karnaugh map for this function is given by the matrix displayed in Table 1.2.
The rows of the map are indexed by the values of the variable x1 ; its columns are
indexed by the values of the pair of variables (x2 , x3 ); and each cell contains the
value of the function in the corresponding Boolean point. For instance, the cell in
the second row, fourth column, of the map contains a 0, since f (1, 1, 0) = 0.
1.10 Monotone Boolean functions 33
Because of the special way in which the columns are ordered, two adjacent
cells always correspond to neighboring vertices of the unit hypercube U ; that is,
the corresponding points differ in exactly one component. This remains true if we
think of the Karnaugh map as being wrapped on a torus, with cell (0, 10) adjacent
to cell (0, 00). Likewise, each row of the map corresponds to a 2-dimensional face
of U , and so do squares formed by 4 adjacent cells, like (0, 01), (0, 11), (1, 01),
and (1, 11).
Now, note that every cell of the map containing a 1 can alternatively be viewed as
representing a minterm of the function f . For instance, the cell (0, 01) corresponds
to the minterm x̄1 x̄2 x3 . Moreover, any two adjacent cells with value 1 can be
combined to produce an implicant of degree 2. So, the cells (0, 01) and (0, 11)
generate the implicant x̄1 x3 , and so on. Finally, each row or square containing
four 1’s generates an implicant of degree 1; e.g., the cells (0, 01), (0, 11), (1, 01),
and (1, 11) correspond to the implicant x3 .
So, in order to derive from the map a DNF expression of f , we just have to
find a collection of subsets of adjacent cells corresponding to implicants of f
and covering all the true points of f . Each such collection generates a different
DNF of f . For instance, the pairs of cells ((0, 00), (0, 01)), ((0, 01), (0, 11)), and
((1, 01), (1, 11)) simultaneously cover all the true points of f and generate the DNF
φ = x̄1 x̄2 ∨ x̄1 x3 ∨ x1 x3 .
Alternatively, the true points can be covered by the pair ((0, 00), (0, 01)) and by
the square ((0, 01), (0, 11), (1, 01), (1, 11)), thus giving rise to the DNF
ψ = x̄1 x̄2 ∨ x3 .
Karnaugh maps have been mostly used by electrical engineers to identify short
(irredundant, prime) DNFs of Boolean functions of a small number of variables.
Extensions of this problem to arbitrary functions will be discussed in Section 3.3.
So, when a monotone function is neither positive nor negative (as in the pre-
ceding Example 1.24), it can always be brought to one of these two forms by an
elementary change of variables. This suggests that, in many cases, it is sufficient to
study the properties of positive functions to understand the properties of monotone
functions. This is our point of view in the next sections.
Let us give a characterization of positive functions that can be seen as a simple
restatement of Definitions 1.24 and 1.25. For two points X = (x1 , x2 , . . . , xn ) and
Y = (y1 , y2 , . . . , yn ) in B n , we write X ≤ Y if xi ≤ yi for all i = 1, 2, . . . , n.
Theorem 1.20. A Boolean function f on B n is positive if and only if f (X) ≤ f (Y )
for all X, Y ∈ Bn such that X ≤ Y .
Proof. The “if” part of the statement is trivial, and the “only if” part is easily
established by induction on the number of components of X and Y such that
xi < yi .
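Theorem 1.20 yields a simple (if exponential) positivity test; since every pair X ≤ Y is linked by a chain of single-coordinate flips, it suffices to check the edges of the hypercube. A sketch, with two small test functions chosen for illustration:

```python
from itertools import product

def is_positive(f, n):
    """Theorem 1.20 in code: f is positive iff its value never decreases when a
    single coordinate is raised from 0 to 1 (chains of flips give any X <= Y)."""
    for x in product((0, 1), repeat=n):
        for i in range(n):
            if x[i] == 0 and f(x) > f(x[:i] + (1,) + x[i + 1:]):
                return False
    return True

assert is_positive(lambda x: int(x[0] and x[1] or x[2]), 3)   # x0 x1 v x2
assert not is_positive(lambda x: int(x[0] != x[1]), 2)        # x0 XOR x1
```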
Theorem 1.21. Let f be a Boolean function on B n , and let k ∈ {1, 2, . . . , n}. The
following statements are equivalent:
$$C = \bigwedge_{i\in A} x_i \bigwedge_{j\in B\setminus\{k\}} \bar{x}_j ,$$
Theorem 1.22. Let φ and ψ be two DNFs and assume that ψ is positive. Then, φ
implies ψ if and only if each term of φ is absorbed by some term of ψ.
Proof. We suppose, without loss of generality, that φ and ψ are expressions in the
same n variables. The “if” part of the statement holds even when ψ is not positive,
as an easy corollary of Theorem 1.11. For the converse statement,
let us assume that
φ implies ψ, and let us consider some term of φ, say, $C_k = \bigwedge_{i\in A} x_i \bigwedge_{j\in B} \bar{x}_j$.
Consider the characteristic vector of A, denoted eA . There holds Ck (eA ) = φ(eA ) = 1.
Thus, ψ(eA ) = 1 (since φ ≤ ψ), and therefore, some term of ψ must take value 1
at the point eA : Denote this term by $C_j = \bigwedge_{i\in F} x_i$ (remember that ψ is positive).
Theorem 1.23 is due to Quine [767]. It is important because it shows that the
complete DNF provides a “canonical” shortest DNF representation of a positive
Boolean function: Since the shortest DNF representation of a Boolean function is
necessarily prime and irredundant (see the comments following Definition 1.20),
no other DNF representation of a positive function can be as short as its com-
plete DNF. Notice that this uniqueness result does not hold in general for nonpositive
functions, as illustrated by the example below.
Example 1.26. The DNFs ψ1 = x ȳ ∨ y z̄ ∨ x̄ z and ψ2 = x z̄ ∨ ȳ z ∨ x̄ y are two
shortest (prime and irredundant) expressions of the same function.
We conclude this section with a useful result that extends Theorem 1.23: This
result states that the complete DNF of a positive function can be obtained by first
dropping the complemented literals from any DNF representation of the function
and then deleting the redundant implicants from the resulting expression.
Theorem 1.24. Let $\varphi = \bigvee_{k=1}^{m} \bigl(\bigwedge_{i\in A_k} x_i \bigwedge_{j\in B_k} \bar{x}_j\bigr)$ be a DNF representation
of a positive Boolean function f . Then, $\psi = \bigvee_{k=1}^{m} \bigwedge_{i\in A_k} x_i$ is a positive DNF
representation of f . The prime implicants of f are the terms of ψ that are not
absorbed by other terms of ψ.
Proof. Clearly, f = φ ≤ ψ (see Theorem 1.22). To prove the reverse inequality,
consider any point X ∗ = (x1∗ , x2∗ , . . . , xn∗ ) ∈ B n such that ψ(X∗ ) = 1. There is a term
Example 1.27. As already observed in the comments following Example 1.25, the
DNF φ(x, y, z) = xy ∨ x ȳ z̄ ∨ xz represents a positive function; call it f . An
alternative representation of f is derived by deleting all complemented literals from φ.
In this way, we obtain the redundant DNF ψ = xy ∨ x ∨ xz, and we conclude that
x is the only prime implicant of f .
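Theorem 1.24 translates into a two-line procedure: strip the complemented literals, then discard absorbed terms. The sketch below applies it to a DNF like the one in Example 1.27 (terms are pairs (A, B) of index sets, a representation chosen for illustration; the function represented must be known to be positive, otherwise the result is meaningless):

```python
def prime_implicants_positive(dnf):
    """Theorem 1.24 in code: for a DNF of a function *known to be positive*,
    drop every complemented literal, then discard absorbed terms; the surviving
    index sets A are exactly the prime implicants."""
    stripped = {frozenset(A) for A, B in dnf}
    return {A for A in stripped if not any(C < A for C in stripped)}

# xy v x (not y)(not z) v xz represents the positive function x:
phi = [({0, 1}, set()), ({0}, {1, 2}), ({0, 2}, set())]
assert prime_implicants_positive(phi) == {frozenset({0})}
```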
Example 1.28. Consider the positive function f (x, y, z, u) = xy ∨ xzu ∨ yz. From
Theorem 1.26, we conclude that the minimal true points of f are (1,1,0,0), (1,0,1,1)
and (0,1,1,0).
Theorem 1.26 crucially depends on the positivity assumption. To see this, consider
the function g(x, y, z) = x y ∨ x̄ z̄. The point (1,1,0) is a true point of g derived
from the prime implicant x y, as explained in Theorem 1.26. However, (0,0,0) is
the unique minimal true point of g.
Proof. It suffices to mimic the proof of Theorem 1.26. Alternatively, Theorem 1.27
can be derived as a corollary of Theorem 1.26 via De Morgan’s laws or simple
duality arguments.
Example 1.29. The function f given in Example 1.28 has four prime implicates,
namely, (x ∨ y), (x ∨ z), (y ∨ z) and (y ∨ u). Accordingly, it has four maximal false
points, namely, (0,0,1,1), (0,1,0,1), (1,0,0,1) and (1,0,1,0).
DNF Membership in C
Instance: A DNF expression φ.
Question: Is φ in C?
Functional Membership in C
Instance: A DNF expression of a function f .
Question: Is f in C?
Theorem 1.30. Let C be any class of Boolean functions with the following
properties:
(a) There exists a function g such that g ∉ C.
(b) For all n ∈ N, the constant function 1n is in C.
(c) C is closed under restrictions; that is, if f is a function in C, then all functions
obtained by fixing some variables of f to either 0 or 1 are also in C.
Then, the problem Functional Membership in C is NP-hard.
In spite of its apparent simplicity, Theorem 1.30 is a very general result that
can be applied to numerous classes of interest due to the weakness of its premises.
Indeed, condition (a) is perfectly trivial because the membership question would
be vacuous without it. Condition (b) is quite weak as well: It is fulfilled by all
the classes introduced earlier in this section, except by Z (remember Theorem
1.28). Condition (c) is stronger than the first two, but it arises naturally in many
situations. In particular, the condition holds again for all the classes of functions
discussed above.
Without further knowledge about the class C, Theorem 1.30 does not allow us to
draw conclusions about NP-completeness or co-NP-completeness of the member-
ship problem. In any specific application of the theorem, however, we may know
1.11 Recognition of functional and DNF properties 43
Let us finally observe that, even though they are not direct corollaries of The-
orem 1.30, some related complexity results may sometimes be derived from it as
well. For instance:
⊕(x1 , x2 ) = x1 x̄2 ∨ x̄1 x2
Proof. We provide a constructive proof from first principles. To establish the exis-
tence of the representation, we use induction on n. A representation of the form
(1.16) clearly exists when n = 0, or when n = 1 (since x̄ = x ⊕ 1). For n > 1, the
existence of the representation directly follows from the trivial identity (note the
analogy with the Shannon expansion (1.15)):
$$f = f_{|x_n=0} \oplus x_n \,(f_{|x_n=0} \oplus f_{|x_n=1}). \tag{1.17}$$
Indeed, by induction, both f|xn =0 and f|xn =1 can be expressed in the form (1.16).
Substituting these expressions in (1.17) yields a sum-of-products modulo 2 that
may contain pairs of identical terms. In this case, these pairs of terms can be
removed using the identity x ⊕ x = 0.
1.12 Other representations of Boolean functions 45
To prove uniqueness, it suffices to observe that there are exactly $2^{2^n}$ expressions
of the form (1.16) and that this is also the number of Boolean functions on B n .
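The existence proof is constructive, and its recursion can be run directly on a truth table: split on one variable, recurse, and cancel duplicated monomials by x ⊕ x = 0. A sketch (the truth-table ordering, with the first variable most significant, is a convention fixed here):

```python
from itertools import product

def mod2_monomials(table):
    """Monomials of the unique XOR-of-products form (1.16), from a truth table
    of length 2^n listed with the first variable most significant.  The
    recursion mirrors the proof: f = f0 XOR x (f0 XOR f1), and duplicated
    monomials cancel by x XOR x = 0 (here: symmetric difference of sets)."""
    return _rec(table, 0)

def _rec(table, var):
    if len(table) == 1:
        return {frozenset()} if table[0] else set()
    half = len(table) // 2
    g0 = _rec(table[:half], var + 1)         # restriction x_var = 0
    g1 = _rec(table[half:], var + 1)         # restriction x_var = 1
    return g0 ^ {m | frozenset({var}) for m in g0 ^ g1}

# x0 v x1 = x0 XOR x1 XOR x0 x1 :
assert mod2_monomials([0, 1, 1, 1]) == {frozenset({0}), frozenset({1}),
                                        frozenset({0, 1})}
# Sanity check: the expansion reproduces the function.
for bits in product((0, 1), repeat=2):
    val = sum(all(bits[i] for i in m) for m in mod2_monomials([0, 1, 1, 1])) % 2
    assert val == (bits[0] | bits[1])
```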
Proof. This result will be established in Chapter 13; see Theorem 13.1.
For Boolean functions, we could actually have observed the existence of the rep-
resentation (1.19) while discussing orthogonal expressions in Section 1.6. Indeed,
the existence of a polynomial representation is an immediate corollary of The-
orem 1.7. The latter result also underlines that Boolean functions admit various
representations over the reals.
However, the uniqueness of the multilinear polynomial (1.19) makes it espe-
cially attractive, as it provides a canonical representation of every Boolean
function. Note that checking whether a given multilinear polynomial represents a
Boolean function, rather than an arbitrary pseudo-Boolean function, is quite easy.
Indeed, a pseudo-Boolean function f is Boolean if and only if f 2 (X) = f (X)
for all X ∈ Bn . This condition can be checked efficiently due to the uniqueness of
expression (1.19).
Finally, we note that, when the Boolean function f is viewed as a function on
the domain {−1, +1}n and taking its values in {−1, +1}, then f obviously admits
an alternative polynomial representation of the form (1.19), sometimes called the
Fourier expansion of f . Although the Fourier expansion is perfectly equivalent
to the multilinear polynomial in 0-1 variables, one of these two expressions may
occasionally prove more useful than the other, depending on the intended purpose.
Applications of the Fourier expansion in the theoretical computer-science literature
are numerous; some illustrations can be found, for instance, in [163, 543, 714, 716];
we refer to Bruck [157] for an introduction to this very fruitful topic. (See also Carlet
[170] for uses of the pseudo-Boolean representation and of the Fourier expansion
of Boolean functions in cryptography and in coding theory.)
begin
if f is constant then
D(f ) has a unique vertex r(f ) (which is both its root and its leaf);
r(f ) is labeled with the constant value of f (either 0 or 1);
else
select a variable xi on which f depends;
let f0 := f|xi =0 and f1 := f|xi =1 ;
run Decision Tree(f0 ) to build D(f0 ) with root r(f0 );
run Decision Tree(f1 ) to build D(f1 ) with root r(f1 );
introduce a root r(f ) labeled by xi ;
make r(f0 ) the right son and r(f1 ) the left son of r(f );
label the arc (r(f ), r(f0 )) by 0 and the arc (r(f ), r(f1 )) by 1;
return D(f );
end
is not identically 0 (that is, a variable xk and its complement x̄k do not simultaneously
appear in the conjunction). Then, C(P ) is an implicant of fG . Moreover,
$\bigvee_{P\in\mathcal{P}:\,C(P)\not\equiv 0} C(P)$ is an orthogonal DNF of fG .
Of course, by applying the same procedure to the paths from the root to the
0–leaves of G, one can similarly compute an orthogonal DNF of the complement
f̄G and of the dual function fGd .
Example 1.33. When we apply this procedure to the binary decision diagram in
Figure 1.4, we obtain the orthogonal DNF ψ = x̄2 ∨ x1 x2 x3 ∨ x̄1 x2 x̄3 for the
function f represented by the BDD, and the orthogonal DNF φ = x̄1 x2 x3 ∨ x1 x2 x̄3 for
its complement f̄ .
For arbitrary BDDs, the above procedure may be inefficient because the number
of paths in P may be exponentially large in the size of G. When G is a decision
tree, however, we obtain a stronger result:
Theorem 1.35. Let f be a Boolean function represented by a decision tree D, let
L be the number of leaves of D and let δ be the depth of D, that is, the length of
a longest path from root to leaf in D. Then, an ODNF of f and an ODNF of f d ,
each of degree at most δ, can be computed in time O(δL).
Proof. When D is a decision tree, there is exactly one path from the root to each
leaf of D. Hence, the number of terms in the ODNF is at most L, and each term
can be built in time O(δ).
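Theorem 1.35 in code: collect one term per root-to-leaf path. In the sketch below, a decision tree is a nested triple (i, child-for-0, child-for-1) with 0/1 leaves (a representation chosen for illustration); two distinct paths part ways at some node, where they fix that node's variable to opposite values, so the resulting DNF is orthogonal:

```python
def path_terms(tree, target, A=frozenset(), B=frozenset()):
    """One term (A, B) per root-to-'target'-leaf path of a decision tree:
    A = variables fixed to 1 along the path, B = variables fixed to 0.
    A tree is either a 0/1 leaf or a triple (i, child_for_0, child_for_1)."""
    if tree in (0, 1):
        return [(A, B)] if tree == target else []
    i, child0, child1 = tree
    return (path_terms(child0, target, A, B | {i})
            + path_terms(child1, target, A | {i}, B))

# Tree for f = (not x0) v x0 x1 : branch on x0, then (if x0 = 1) on x1.
t = (0, 1, (1, 0, 1))
odnf, co_odnf = path_terms(t, 1), path_terms(t, 0)
assert odnf == [(frozenset(), frozenset({0})), (frozenset({0, 1}), frozenset())]
assert co_odnf == [(frozenset({0}), frozenset({1}))]
# The two ODNFs together partition B^2:
assert sum(2 ** (2 - len(A) - len(B)) for A, B in odnf + co_odnf) == 4
```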
1.13 Applications
In this section, we return to some of the areas of application that we briefly men-
tioned earlier in this chapter: propositional logic, electrical engineering, game
theory, reliability, combinatorics, and integer programming. We sketch how the
basic Boolean concepts arise in these various frameworks and introduce some of
the problems and concepts investigated in subsequent chapters. We stress again,
however, that Boolean functions and expressions play a role in many other fields of
science. We have already mentioned their importance in complexity theory (see,
for instance, Krause and Wegener [583], Papadimitriou [725], Wegener [902]);
in coding theory or in cryptography (see Carlet [170], MacWilliams and Sloane
[642]); and we could cite a variety of additional applications arising in social
sciences (qualitative analysis of data; see Ragin [775]); in psychology (human
concept learning; see Feldman [326, 327]); in medicine (diagnosis, risk assessment;
see Bonates and Hammer [102]); in biology (genetic regulatory networks;
see Kauffman [553], Shmulevich, Dougherty and Zhang [831], Shmulevich and
Zhang [832]), and so on.
Beyond the specific issues arising in connection with each particular applica-
tion, we want to stress that the unifying role played by Boolean functions and,
more generally, by Boolean models, should probably provide the main motivation
for studying this book (it certainly provided one of the main motivations for writ-
ing it). This theme will be recurrent throughout subsequent chapters, where we
will see that the same basic Boolean concepts and results have repeatedly been
reinvented in various areas of applied mathematics.
φ(x, y, z) = x̄ y z̄ ∨ x̄ y z ∨ x z ∨ ȳ z, (1.20)
where each term of φ corresponds in a straightforward way to one of rules 1–4. The
interpretation of φ is easy: A 0–1 point (x, y, z) is a false point of φ if and only if the
corresponding assignment of True–False values to the propositional variables does
not contradict any of the rules in the knowledge base. Thus, in the terminology of
logic theory, the set of solutions of the Boolean equation φ(x, y, z) = 0 is exactly
the set of models of the knowledge base. In particular, the set of rules is not “self-
contradictory” if and only if φ is not identically 1, that is, if and only if the Boolean
equation φ = 0 admits at least one solution (which is easily seen to be the case for
our small example).
The main purpose of an expert system is to draw inferences and to answer
queries involving the propositional variables, such as: “Is the assignment z = 1
consistent with the given set of rules?” (that is, “Does diagnosis Z apply under at
least one imaginable scenario?”). This question can be answered by plugging the
value z = 1 into φ and checking whether the resulting Boolean equation φ|z=1 = 0
remains consistent. For our example, this procedure yields the equation
x̄ y ∨ x ∨ ȳ = 0,
which has no solution, since its left-hand side is identically equal to 1.
Definition 1.30. A DNF is a Horn DNF if each of its terms contains at most one
complemented variable.
We will show in Chapter 6 that, when φ is a Horn DNF, the Boolean equation
φ(X) = 0 can be solved easily, more precisely, in linear time. This single fact
suffices to explain the importance of Horn DNFs in the context of expert systems,
where large Boolean equations must be solved repeatedly. Moreover, we also
discover in Chapter 6 that Horn DNFs possess a host of additional remarkable
properties making them a worthwhile object of study.
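The forward-chaining argument behind this linear-time bound can already be sketched here: terms with a complemented variable act as implications, purely positive terms as prohibitions (the sketch below re-scans the term list instead of maintaining the occurrence queue that gives true linear time, and the representation of terms is chosen for illustration):

```python
def solve_horn_equation(dnf):
    """Solve phi(X) = 0 for a Horn DNF phi, given as pairs (A, B) with |B| <= 1.
    A term (A, {j}) is 0 iff 'all of A true' forces x_j = 1; a positive term
    (A, empty) is 0 iff not all of A is true.  Forward chaining computes the
    variables forced to 1; the result is the minimal solution, or None."""
    forced = set()
    changed = True
    while changed:
        changed = False
        for A, B in dnf:
            if B and set(A) <= forced and not set(B) <= forced:
                forced |= set(B)
                changed = True
    for A, B in dnf:
        if not B and set(A) <= forced:
            return None          # this term equals 1 even at the minimal point
    return forced

# (not x0) v x0 (not x1) v x1 x2 = 0 forces x0 = 1, then x1 = 1, and x2 = 0:
phi = [(set(), {0}), ({0}, {1}), ({1, 2}, set())]
assert solve_horn_equation(phi) == {0, 1}
# Adding the positive term x0 x1 makes the equation inconsistent:
assert solve_horn_equation(phi + [({0, 1}, set())]) is None
```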
Before we close this section, we must warn the reader that our view that (propo-
sitional) knowledge bases define Boolean expressions and, concomitantly, Boolean
functions, is quite unorthodox in the artificial intelligence literature, where rules
are more traditionally regarded as forming a “loose” collection of Boolean clauses
rather than a single function. We claim, however, that our point of view has defi-
nite advantages over the traditional one. Indeed, it allows us to take advantage of
the huge body of knowledge regarding Boolean functions and to draw inspiration
from concepts and properties pertaining to such functions.
As an example of this general claim, let us go back to the small knowledge base
just given and to the corresponding DNF φ displayed in equation (1.20). It should
be clear from our previous discussion that, as far as drawing inferences goes, all the
information contained in the knowledge base is adequately translated in φ. More
precisely, any Boolean expression representing the same Boolean function as φ
provides the same information as the original knowledge base. Indeed, if ψ is any
expression such that ψ = φ, then the set of models of the knowledge base is in one-
to-one correspondence with the set of false points of ψ, which coincides with the
set of false points of φ. This observation implies that Boolean transformations can
sometimes be applied in order to obtain a simpler, but equivalent, representation
of the knowledge base. The simplification of Boolean expressions is one of the
main topics of Chapter 3. For now, however, the discussion in Section 1.7 already
suggests that the prime implicants of φ may play an interesting role in this respect.
For our example, it turns out that φ only has two prime implicants, namely, x̄ y and
z. By way of consequence (recall Theorem 1.13), φ = x̄ y ∨ z, so that the original
rules 1–4 are equivalent to the conjunction of the following two rules:
Rule 5 : “either x is true or y is false.”
Rule 6 : “z is false.”
(Note that Rule 6 provides a confirmation of our previous conclusion, according
to which z can never be true.)
Recently, the Boolean formalism has found a very central role in another area of
artificial intelligence, namely, in computational learning theory. In intuitive terms,
many of the fundamental questions in this field take the following form: Given a
class C of Boolean functions and an unknown function f in C, how many rows
of the truth table of f is it necessary to query to be “reasonably confident” that f
is known with “sufficient accuracy?” Another type of question would be: Given a
class C of Boolean functions and two subsets (or “samples”) T , F ⊆ B n , is there a
function f ∈ C such that f takes value 1 on T and value 0 on F ? Related issues
will be tackled in Chapter 12 of this book. For more information on computational
learning theory, we refer the reader to the textbook by Anthony and Biggs [29] and
to survey papers by Anthony [26] and Sloan, Szörényi, and Turán [838].
compute two functions, say f1 and f2 , for which we can (recursively) construct
the representations φ1 and φ2 . Then, the expression φ1 ∨ φ2 represents f .
Example 1.34. The circuit displayed in Figure 1.6 computes the function
φ = (x1 ∨ x4 )(x̄1 ∨ (x2 x̄3 )). (1.21)
It can be much more difficult, however, to obtain a DNF of the function asso-
ciated to a given circuit. Fortunately, for many applications, it is sufficient to have
an implicit representation of f via a DNF ψ(X, Y , z) similar to the DNF pro-
duced by the procedure Expand (see Section 1.4). In the DNF ψ(X, Y , z), the
vector X represents the inputs of the circuit, z represents its output, and Y can be
viewed as a vector of variables associated with the outputs of the “hidden” gates
of the circuit (in the physical realization of a switching circuit, the input and output
signals can be directly observed, whereas the value of all other signals cannot,
hence the qualifier “hidden” applied to these internal gates). On every input signal
X∗ , the circuit produces the output z∗ , where (X∗ , Y ∗ , z∗ ) is the unique solution
of the equation ψ(X∗ , Y , z) = 0. (See Abdulla, Bjesse, and Eén [1] for a more
detailed contribution along similar lines.) Let us illustrate this construction on an
example.
Example 1.35. Consider again the circuit displayed in Figure 1.6 and the corre-
sponding expression φ given by (1.21). We have already shown in Example 1.12
Figure 1.6. Combinational circuit for Example 1.34.
Proof. Let Y ∗ be any true point of f such that yk∗ = 1 (note that there are ωk such
points), and write Y ∗ = X∗ ∨ ek , where xk∗ = 0. Then, either X ∗ is a swing for
k, or X ∗ is a true point of f , but not both. Moreover, all swings of f for k and
all true points of f whose k-th component is zero can be obtained in this way.
Denoting by sk the number of swings for k, we conclude that ωk = sk + (ω − ωk )
or, equivalently, sk = 2 ωk − ω = πk .
In voting terms, a swing for variable (that is, player) k corresponds to a losing
coalition (namely, the coalition {i ∈ N | xi∗ = 1}) that turns into a winning coalition
when player k joins it. Intuitively, then, player k retains a lot of power in the game
v if fv has many swings for k, since this means that k plays a “pivotal” role in
many winning coalitions.
Accordingly, many authors define power indices as functions of the number
of swings or, equivalently, of the modified Chow parameters. Banzhaf [52], for
instance, made a proposal which translates as follows in our terminology (see also
Penrose [739] for pioneering work on this topic).
$$\beta_k = \frac{s_k}{\sum_{j=1}^{n} s_j}, \qquad \text{for } k = 1, 2, \ldots, n.$$
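Both the swing counts and the identity sk = 2 ωk − ω established above are easy to verify computationally; the weighted majority game [3; 2, 1, 1] used below is a small example of our own choosing:

```python
from itertools import product

def swings(f, n, k):
    """Swings of f for variable k: points X with x_k = 0, f(X) = 0 and
    f(X v e_k) = 1 (losing coalitions that player k turns into winning ones)."""
    return sum(1 for x in product((0, 1), repeat=n)
               if x[k] == 0 and not f(x) and f(x[:k] + (1,) + x[k + 1:]))

# Weighted majority game [3; 2, 1, 1]: a coalition wins iff its weight is >= 3.
f = lambda x: int(2 * x[0] + x[1] + x[2] >= 3)
omega = sum(f(x) for x in product((0, 1), repeat=3))
omega_k = [sum(f(x) for x in product((0, 1), repeat=3) if x[k]) for k in range(3)]
s = [swings(f, 3, k) for k in range(3)]
assert s == [3, 1, 1]
assert all(s[k] == 2 * omega_k[k] - omega for k in range(3))
```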
The Banzhaf index ranks among the most extensively studied and widely
accepted power indices for voting games. In spite of some fundamental draw-
backs, it agrees on many accounts with what we would intuitively expect from
a reasonable measure of power (see Dubey and Shapley [279]; Felsenthal and
Machover [329]; and Straffin [850] for an axiomatic characterization and exten-
sive discussions of the relation between Banzhaf and other power indices).
Note, for instance, that, in view of Theorem 1.18, the Banzhaf index of an
inessential player is equal to zero. The converse statement also holds for pos-
itive Boolean functions (the proof is left to the reader as an exercise). We
return to this topic in Chapter 9. Many other connections between the theory
of Boolean functions and the theory of simple games will also be established in
the monograph.
Finally, it is interesting to observe that Boolean functions provide useful models
for investigating certain types of nonsimple games, for example, 2-player posi-
tional games in normal form. We do not further discuss this topic now but refer the
reader to Chapter 10 and to Gurvich [421, 423, 424, etc.] for more information.
When viewed as a function from [0, 1]n to [0, 1], RelS is called the reliability
function or reliability polynomial of S (see, e.g., [54, 205, 206, 777]). Observe
that the polynomial RelS extends the Boolean function fS : {0, 1}n → {0, 1} over
the whole unit cube U n = [0, 1]n . As a matter of fact, if fS is viewed as a pseudo-
Boolean function, and if it is represented as a multilinear polynomial over the reals
(see Section 1.12.2)
fS (x1 , x2 , . . . , xn ) = ∑_{A∈P(N)} c(A) ∏_{i∈A} x_i ,

then RelS (p1 , p2 , . . . , pn ) = ∑_{A∈P(N)} c(A) ∏_{i∈A} p_i .
Similar observations have also been made in the game theory literature (see [456,
720, 777]).
When pi = 1/2 for i = 1, 2, . . . , n, all vertices of B n are equiprobable with
probability 2^−n . As a result,

RelS (1/2, 1/2, . . . , 1/2) = Prob[fS = 1] = ω(fS ) / 2^n ,
where ω(fS ) denotes as usual the number of true points of fS . Similarly, if pk = 1
for some component k and pi = 1/2 for i = 1, 2, . . . , n, i ≠ k, then

RelS (1/2, . . . , 1/2, 1, 1/2, . . . , 1/2) = ωk / 2^{n−1} ,
where ωk is the k-th Chow parameter of fS . These observations show that com-
puting the Chow parameters of a positive Boolean function is just a special case of
computing the reliability of a coherent system. Also, similarly to what happened
for simple games, variants of the modified Chow parameters have been used in
the literature to estimate the “importance” of individual components of a coherent
system. Ramamurthy [777] explains nicely how Banzhaf and other power indices
(like the Shapley-Shubik index) have been rediscovered in this framework.
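The two identities above are easy to check numerically. Here is a small Python sketch (the series-parallel system and all names are ours): the reliability polynomial is evaluated by enumerating all component states, and the special values p = (1/2, . . . , 1/2) and p_k = 1 recover ω(f_S)/2^n and ω_k/2^{n−1}.

```python
from itertools import product

def reliability(f, p):
    """Rel_S(p) = Prob[f_S(X) = 1] when component i works independently
    with probability p[i] (brute force over all component states)."""
    n = len(p)
    total = 0.0
    for X in product((0, 1), repeat=n):
        if f(X):
            w = 1.0
            for xi, pi in zip(X, p):
                w *= pi if xi else 1.0 - pi
            total += w
    return total

# A series-parallel system: f = x1 (x2 ∨ x3), a positive Boolean function.
def f(X):
    return int(X[0] and (X[1] or X[2]))

n = 3
omega = sum(f(X) for X in product((0, 1), repeat=n))  # number of true points
print(reliability(f, [0.5] * n), omega / 2 ** n)      # these two values agree
```

Setting p = [1.0, 0.5, 0.5] similarly returns ω_1/2^{n−1}, the first Chow parameter scaled by 2^{n−1}.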
1.13.5 Combinatorics
Relations between Boolean functions and other classical combinatorial constructs,
such as graphs, hypergraphs, independence systems, clutters, block designs,
matroids, colorings, and so forth, are amazingly rich and diverse. Over time, these
relations have been exploited to gain insights into the constructs themselves (see,
e.g., Benzaken [64]), to handle algorithmic issues related to the functions or the
constructs (see, e.g., Hammer and Rudeanu [460]; Aspvall, Plass, and Tarjan [34];
Simeone [834]) and to introduce previously unknown classes of combinatorial
objects (see, e.g., Chvátal and Hammer [201]). These are but a few examples, and
we will encounter plenty more throughout this book. In this section, we only men-
tion a few useful connections between the study of hypergraphs and the concepts
introduced so far.
The stability function f = fH of a hypergraph H = (N , E) was introduced in
Application 1.5. We observed in Application 1.9 that fH is a positive function. In
fact, if N = {1, 2, . . . , n}, then it is easy to see that
fH (x1 , x2 , . . . , xn ) = ⋁_{A∈E} ⋀_{j∈A} x_j .    (1.23)
It is important to realize that the function fH does not completely define the
hypergraph H. Indeed, consider two hypergraphs H = (N , E) and H = (N , E ).
If E ⊆ E , and if every edge in E \ E contains some edge in E, then H and H have
exactly the same stable sets, so that fH = fH . Thus, the expression (1.23) of fH
can be rewritten as
fH (x1 , x2 , . . . , xn ) = ⋁_{A∈P} ⋀_{j∈A} x_j ,

where P is the collection of minimal edges of H. This collection is a clutter, that
is, a family of pairwise incomparable sets:

A ∈ P, B ∈ P, A ≠ B ⇒ A ⊄ B.
Conversely, any clutter can also be viewed as defining the collection of minimal
edges of a hypergraph or, equivalently, the collection of prime implicants of a
positive Boolean function.
Many operations on hypergraphs or clutters are natural counterparts of oper-
ations on Boolean expressions. For instance, if H = (N, E) is a clutter and
j ∈ N , the clutter H \ j is defined as follows: H \ j = (N \ {j }, F), where
F = E \ {A ∈ E | j ∈ A} (deletion of j ; see, e.g., Seymour [821] or the literature
on matroid theory). Thus, fH\j is simply the restriction of fH to xj = 0.
Similarly, the clutter H/j is defined as H/j = (N \ {j }, G), where G is the
collection of minimal sets in {A \ {j } | A ∈ E} (contraction of j ). We see that
fH/j is simply the restriction of fH to xj = 1.
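These clutter operations can be sketched directly in Python (the representation of edges as frozensets, and all function names, are ours):

```python
def minimal_sets(sets):
    """Keep only the inclusion-minimal members, turning a hypergraph
    edge set into a clutter."""
    sets = [frozenset(s) for s in sets]
    return {A for A in sets if not any(B < A for B in sets)}

def delete(E, j):
    """H \\ j: drop every edge containing j (f_H restricted to x_j = 0)."""
    return {A for A in E if j not in A}

def contract(E, j):
    """H / j: remove j from each edge, then keep the minimal sets
    (f_H restricted to x_j = 1)."""
    return minimal_sets({A - {j} for A in E})

E = minimal_sets([{1, 2}, {2, 3}, {1, 3, 4}])
print(delete(E, 1))    # edges avoiding vertex 1
print(contract(E, 1))  # minimal sets after forcing x_1 = 1
```

In the example, contracting vertex 1 makes {2} absorb the edge {2, 3}, mirroring the absorption of implicants after fixing x_1 = 1.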
Theorem. Computing the number of true points of a quadratic positive Boolean
function is #P-complete.
Proof. Let f (x1 , x2 , . . . , xn ) = ⋁_{{i,j}∈E} x_i x_j , and let G be the corresponding graph,
namely, G = (N , E). We denote by s(G) the number of stable sets of G, and by
ω(f ) the number of true points of f . Valiant [883] proved that computing s(G) is
#P-complete. Since s(G) = 2^n − ω(f ), the result follows.
As a corollary of this theorem, we can also conclude that computing the Chow
parameters of a quadratic positive function is #P-hard. Observe that these results
actually hold independently of the representation of f . Indeed, if we know in
advance that f is purely quadratic and positive, then the complete DNF of f can
easily be obtained by querying O(n2 ) values of f : For all pairs of indices i, j ∈ N ,
compute f (e{i,j } ), where e{i,j } is the characteristic vector of {i, j }. Those pairs
{i, j } such that f (e{i,j } ) = 1 are exactly the edges of G.
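The querying argument in this paragraph can be sketched as follows (the hidden graph and function names are ours): the edge set of G is recovered exactly by evaluating f on the characteristic vectors of all pairs.

```python
from itertools import combinations

def edges_by_queries(f, n):
    """Recover the edge set of G by querying f at the characteristic
    vectors e_{i,j} of all pairs {i, j} (O(n^2) membership queries)."""
    edges = set()
    for i, j in combinations(range(n), 2):
        e = [0] * n
        e[i] = e[j] = 1
        if f(e):
            edges.add((i, j))
    return edges

# A hidden graph on 4 vertices with edges {0,1} and {2,3}.
E = {(0, 1), (2, 3)}
def f(x):  # the purely quadratic positive function of the graph
    return int(any(x[i] and x[j] for (i, j) in E))

print(edges_by_queries(f, 4))
```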
We conclude this section by mentioning one last connection between combina-
torial structures and positive Boolean functions. In 1897, Dedekind asked for the
number d(n) of elements of the free distributive lattice on n elements. This famous
question is often referred to as Dedekind’s problem [572]. As it turns out, d(n) is
equal to the number of positive Boolean functions of n variables. The number
d(n) grows quite fast and its exact value is only known for small values of n; see
Table 1.3 based on Berman and Köhler [73]; Church [196]; Wiedemann [908];
and sequence A000372 in Sloane [840]. Kleitman [572] proved that log2 d(n) is
asymptotic to the middle binomial coefficient C(n, ⌊n/2⌋) (see also [542, 573, 578, 579]
for extensions and refinements of this deep result).
We should warn the reader, however, that the relations between combina-
torics and Boolean theory are by no means limited to the study of positive
Boolean functions. Later in the book, we shall have several opportunities to
encounter nonpositive Boolean functions linked, in various ways, to graphs
or hypergraphs.
Table 1.3. Known values of d(n).

n    d(n)
0    2
1    3
2    6
3    20
4    168
5    7581
6    7828354
7    2414682040998
8    56130437228687557907788
ψ = ⋁_{k=1}^{m} ⋀_{i∈A_k} x_i ⋀_{j∈B_k} x̄_j    (1.29)

maximize z(x1 , x2 , . . . , xn ) = ∑_{i=1}^{n} c_i x_i    (1.30)

subject to ∑_{i∈A_k} x_i − ∑_{j∈B_k} x_j ≤ |A_k | − 1,  k = 1, 2, . . . , m    (1.31)

(x1 , x2 , . . . , xn ) ∈ B^n .    (1.32)
Proof. We must show that the set of false points of fF coincides with the set of
solutions of (1.31). Let X∗ be a false point of fF . For each k = 1, 2, . . . , m, since
ψ(X∗ ) = 0, either there is an index i ∈ Ak such that xi∗ = 0, or there is an index
j ∈ Bk such that xj∗ = 1. In either case, we see that X ∗ satisfies the k-th inequality
in (1.31). The converse statement is equally easy.
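The correspondence established in this proof is easy to check mechanically. The following Python sketch (the encoding of each term as a pair (A_k, B_k) is ours) verifies on a small instance that the false points of ψ are exactly the solutions of the linear system (1.31):

```python
from itertools import product

# Terms of the DNF (1.29): (A_k, B_k) lists the uncomplemented and
# complemented variable indices of term k.
terms = [({0, 1}, {2}), ({2}, {0, 3})]

def psi(x):
    return int(any(all(x[i] for i in A) and all(not x[j] for j in B)
                   for A, B in terms))

def satisfies_ineqs(x):
    """The system (1.31): sum_{i in A_k} x_i - sum_{j in B_k} x_j
    <= |A_k| - 1 for every k."""
    return all(sum(x[i] for i in A) - sum(x[j] for j in B) <= len(A) - 1
               for A, B in terms)

n = 4
print(all((psi(x) == 0) == satisfies_ineqs(x)
          for x in product((0, 1), repeat=n)))
```

The k-th inequality is violated exactly when every variable of A_k is 1 and every variable of B_k is 0, that is, exactly when the k-th term of ψ takes value 1.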
(y1 , y2 , . . . , yn ) ∈ B^n ,

ψ = ⋁_{k=1}^{m} ⋀_{i∈A_k} x_i    (1.33)

(y1 , y2 , . . . , yn ) ∈ B^n .    (1.36)
(x1 , x2 , . . . , xn ) ∈ Bn .
Note that the latter result motivates the terminology “generalized covering”
used in Theorem 1.39.
Another way to look at Theorem 1.40 is suggested by the connections estab-
lished in Section 1.13.5. Indeed, when fF is positive, the feasible solutions of P
are exactly the stable sets of a hypergraph, and the feasible solutions of SCP are
the transversals of this hypergraph. So, Theorem 1.40 simply builds on the well-
known observation that stable sets are exactly the complements of transversals
(see, e.g., Berge [72]).
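Berge's observation is simple to confirm on a small hypergraph; the following Python sketch (hypergraph and names ours) enumerates all subsets and checks that the stable sets are exactly the complements of the transversals:

```python
from itertools import combinations

E = [{0, 1}, {1, 2}, {2, 3}]      # a small hypergraph on N = {0, 1, 2, 3}
N = set(range(4))

def subsets(S):
    S = list(S)
    return [set(c) for r in range(len(S) + 1) for c in combinations(S, r)]

# S is stable iff it contains no edge; T is a transversal iff it meets every edge.
stable = [S for S in subsets(N) if not any(A <= S for A in E)]
transversals = [T for T in subsets(N) if all(A & T for A in E)]

print(sorted(map(sorted, stable)) ==
      sorted(sorted(N - T) for T in transversals))
```

Indeed, A ∩ (N \ S) = ∅ holds for some edge A exactly when A ⊆ S, so S misses no complementation.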
Algorithms based on the transformations described in Theorems 1.39 and 1.40
have been proposed in [408, 409, 410]. Several recent approaches to the solution
of Boolean equations also rely on this transformation (see, e.g., [184]).
We shall come back to integer programming problems of the form P in sub-
sequent chapters of the book (see, in particular, Sections 4.2, 8.6, and 9.4). For
now, we conclude this section with a discussion of the complexity of computing
a DNF expression of the resolvent. For this question to make sense, we must first
specify how the set F is described in (1.25). In the integer programming context,
F would typically be defined as the solution set of a system of linear inequalities
in the variables x1 , x2 , . . . , xn , say,
∑_{i=1}^{n} a_{ki} x_i ≤ b_k ,  k = 1, 2, . . . , s.    (1.37)
Finally, one should also notice that, as long as the description of F is in NP,
Cook’s theorem [208] guarantees the existence of a polynomial-time procedure
which, given any instance of F , produces an integer t ≥ n and a DNF expression
φ(y1 , y2 , . . . , yt ) such that X∗ ∈ F if and only if φ(y1∗ , y2∗ , . . . , yt∗ ) = 0 for some
(y1∗ , y2∗ , . . . , yt∗ ) ∈ B t . However, although the DNF φ bears some resemblance to
the resolvent of F , it usually involves a large number of additional variables
beside the original variables x1 , x2 , . . . , xn (compare with the DNF produced by the
procedure Expand in Section 1.5).
1.14 Exercises
1. Compute the number of Boolean functions and DNF expressions in n
variables, for n = 1, 2, . . . , 6.
2. Show that the complement of a x ∨ b x̄ is ā x ∨ b̄ x̄, for all a, b, x ∈ B.
3. Prove that every Boolean function has an expression involving only dis-
junctions and negations, but no conjunctions, as well as an expression
involving only conjunctions and negations, but no disjunctions.
4. The binary operator NOR is defined by NOR(x, y) = x̄ ȳ. Show that every
Boolean expression is equivalent to an expression involving only the NOR
operator (and parentheses). Show that the same property holds for the
NAND operator defined by NAND(x, y) = x̄ ∨ ȳ. (See, e.g., [752] for far-
reaching extensions of these observations.)
5. A Boolean function f is called symmetric if f (x1 , x2 , . . . , xn ) =
f (xσ1 , xσ2 , . . . , xσn ) for all permutations (σ1 , σ2 , . . . , σn ) of {1, 2, . . . , n}.
(a) Prove that f is symmetric if and only if there exists a function
g : {0, 1, . . . , n} → B such that, for all X ∈ B n , f (x1 , x2 , . . . , xn ) =
g(∑_{i=1}^{n} x_i ).
(b) For k = 0, 1, . . . , n, define the Boolean function rk by rk (X) = 1 if and
only if ∑_{i=1}^{n} x_i = k. Prove that f is symmetric if and only if there exists
A ⊆ {0, 1, . . . , n} such that f = ⋁_{k∈A} r_k .
(c) Prove that the set of all symmetric functions is closed under disjunc-
tions, conjunctions, and complementations.
(d) What is the complexity of deciding whether a given DNF represents a
symmetric function?
6. Design a data structure to store a DNF φ in which
(a) φ can be stored in O(|φ|) space built in O(|φ|) time;
(b) finding a term of φ of a given degree requires O(1) time;
(c) finding a negative linear term of φ requires O(1) time;
(d) adding/deleting a term of degree k requires O(k) time;
(e) fixing/reactivating a literal occurring l times in φ requires O(l) time.
7. Show that the degree of a DNF expression of a Boolean function may be
strictly smaller than the degree of its complete DNF.
8. For an arbitrary Boolean function f on Bn , define the influence of variable
k (k = 1, 2, . . . , n) to be the probability that f|xk =1 (X) ≠ f|xk =0 (X), where
X is drawn uniformly at random over B^{n−1} (see Kahn, Kalai, and Linial
[543]). Show that, when f is positive, the influence of variable k is equal
to πk / 2^{n−1} , where πk is the k-th modified Chow parameter of f .
9. Show that the binary operator ⊕ is commutative and associative, that is,
x1 ⊕ x2 = x2 ⊕ x1 and (x1 ⊕ x2 ) ⊕ x3 = x1 ⊕ (x2 ⊕ x3 ) for all x1 , x2 , x3 ∈ B.
10. The parity function on B n is the function pn (x1 , x2 , . . . , xn ) = x1 ⊕ x2 ⊕
. . . ⊕ xn .
(a) Write a DNF expression of pn .
(b) Compute the Chow parameters of pn .
11. Assume that f is represented either as a sum-of-products modulo 2 of the
form (1.16) or as a multilinear polynomial over the reals of the form (1.19).
In each case, show how to efficiently solve the equation f (X) = 0.
12. Show that, if f is a Boolean function on B n , and f has an odd number of
true points, then
(a) every orthogonal DNF of f has degree n;
(b) every decision tree for f contains a path of length n from the root to
some leaf.
13. Prove that every Boolean function f on B n has a unique largest positive
minorant f− and a unique smallest positive majorant f + , where
(a) f− and f + are positive functions on B n ;
(b) f− ≤ f ≤ f + ;
(c) if g and h are any two positive functions such that g ≤ f ≤ h, then
g ≤ f− and f + ≤ h.
14. Prove that every Boolean function has the same maximal false points as its
largest positive minorant and the same minimal true points as its smallest
positive majorant (see previous exercise).
15. Consider the 0-1 integer programming problem (1.26)–(1.28) in Section
1.13.6. Prove that, when cj > 0 for j = 1, 2, . . . , n, (1.26)–(1.28) has the
same optimal solutions as the set covering problem obtained upon replacing
the resolvent fF by its largest positive minorant (see previous exercises,
and Hammer, Johnson, and Peled [443]).
68 2 Boolean equations
Boolean equations not only play a fundamental role in propositional logic and
in theoretical computer science but also occur directly and naturally in many
applications, such as artificial intelligence, electrical engineering, mathematical
programming, and so on. Here are brief outlines of some typical applications.
Application 2.1. (Propositional logic, artificial intelligence.) In propositional
logic, a formula (or a Boolean expression) φ is called satisfiable if the equation
φ(X) = 1 is consistent, and it is called a contradiction otherwise. The formula is
valid, or is a tautology, if φ is identically equal to 1, that is, if the equation φ(X) = 0
is inconsistent. These classical concepts play a central role in (propositional) logic
and in all applications of artificial intelligence in which propositional logic is used
to model knowledge.
To illustrate, consider a knowledge base of rules involving the propositional
variables x1 , x2 , . . . , xn , and let φ(x1 , x2 , . . . , xn ) be the Boolean expression associ-
ated with the knowledge base, as in Section 1.13.1 (Chapter 1). Then, as we have
seen, the set of solutions of the equation φ = 0 describes the set of models of the
knowledge base, that is, the set of truth assignments that satisfy all the rules. In
particular, the equation φ = 0 is consistent if and only if the collection of rules is
not self-contradictory. Also, questions relative to the atomic propositions – e.g.,
questions of the form, “Is xi = 1 consistent with the given rules?” – are directly
reducible to the solution of Boolean equations.
Similar principles are used in many other areas of artificial intelligence, notably
in automated theorem proving. Assume, for instance, that a theorem proving system
must prove or disprove a general implication of the form

φ(X) = 0 ⇒ ψ(X) = 0,    (2.1)
where φ and ψ are arbitrary Boolean expressions (the premise φ(X) = 0 could
express the axioms of the theory as well as a number of more specific hypotheses).
The usual way to attack this question is to reason by contradiction and to solve
the equation
φ(X) ∨ ψ̄(X) = 0.
If this equation is consistent, then any of its solutions yields a counter-example to
the conjecture (2.1). Conversely, if the equation is inconsistent, then the implication
(2.1) is a theorem.
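This refutation scheme is easy to illustrate by brute force. The Python sketch below (formulas and names ours) searches for a solution of φ(X) ∨ ψ̄(X) = 0, that is, a point with φ(X) = 0 and ψ(X) = 1:

```python
from itertools import product

def implies(phi, psi, n):
    """Check the conjecture 'phi(X) = 0 implies psi(X) = 0' by solving
    phi(X) ∨ not-psi(X) = 0: any solution is a counter-example."""
    for X in product((0, 1), repeat=n):
        if phi(X) == 0 and psi(X) == 1:
            return False, X          # counter-example found
    return True, None                # equation inconsistent: a theorem

phi = lambda X: int(X[0] and not X[1])             # x1 x̄2
psi = lambda X: int(X[0] and not X[1] and X[2])    # x1 x̄2 x3
print(implies(phi, psi, 3))
```

Here ψ(X) = 1 forces φ(X) = 1, so the equation is inconsistent and the implication holds; real theorem provers, of course, replace the exhaustive loop by the equation-solving techniques of this chapter.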
Our discussion focused on propositional logic. However, testing the validity
of formulas in first-order predicate logic, even though an undecidable problem,
can, in principle, be “reduced” to the solution of an infinite number of Boolean
equations through an application of Herbrand’s theorem. This type of reduction is
used, either explicitly or implicitly, in many theorem-proving procedures for first-
order logic; see, for example, Gilmore [380], Davis and Putnam [261], Robinson
[787], Chang and Lee [186], Jeroslow [533], Thayse [863]. Boolean equations
also find applications in solving decision problems from modal logic, as discussed
in [384, 510].
infer stuck-at faults from the observed input and output signals of the circuit. In
general terms, the test generation problem for stuck-at faults can be expressed as
follows: Generate an input vector X∗ (or possibly several) such that the output of
the circuit is incorrect on that input when certain gates have stuck-at faults.
To make this more explicit, let us focus on the test generation problem for diag-
nosing whether a specific OR-gate, say, gate k, is stuck at 1. (In practice, one may
often safely assume that only a few gates are faulty in a circuit. It is even common
to posit the “single fault hypothesis” according to which one gate at most could
be faulty.) Let ψ(X, Y , z) be the Boolean expression modeling the combinational
circuit as explained in Section 1.13.2, let y1 , y2 model the inputs of gate k, and let
yk model its output. So, in the expression ψ, we can isolate the terms associated
with gate k by rewriting ψ(X, Y , z) as
ψ(X, Y , z) = φ(X, Y , z) ∨ y1 ȳk ∨ y2 ȳk ∨ ȳ1 ȳ2 yk .    (2.2)
Observe that the role of the last three terms of ψ in (2.2) is only to describe
the correct operation of the OR-gate k (all three terms must be 0 when the gate is
operating properly).
To model the behavior of the circuit when gate k is stuck at 1, we introduce
a new variable w, representing the output of the faulty circuit, and a new vector
of variables V = (v1 , v2 , . . . , vm ), where vi represents the output signal of gate
i (i = 1, 2, . . . , m) in the faulty circuit. Applying the same reasoning as in the
absence of any fault, we can state: In every solution (X ∗ , V ∗ , w∗ ) of the equation
φ|vk =1 (X, V , w) = 0, the variable associated with each gate represents the output
of that gate on the input signal X∗ on the assumption that gate k is stuck at 1 (note
that the terms linking y1 , y2 , and yk are absent from this equation).
It is then easy to conclude that every solution (X∗ , Y ∗ , z∗ , V ∗ , w∗ ) of the equation
ψ(X, Y , z) ∨ φ|vk =1 (X, V , w) ∨ z w ∨ z̄ w̄ = 0    (2.3)
has the following property: On the input signal X ∗ , the correct circuit described
by ψ produces the output z∗ , while the faulty circuit in which gate k is stuck at 1
produces the output w∗ = z̄∗ . In other words, a valid test vector for the stuck-at-1
fault at gate k can be generated by solving the Boolean equation (2.3).
Example 2.1. Let us illustrate this procedure for the detection of a stuck-at-1 fault
at the second (lower) OR-gate of the circuit displayed in Figure 1.6. The expres-
sion ψ(X, Y , z) associated with this circuit is given by equation (1.22), where the
output of the OR-gate under consideration is represented by variable y2 . As in
equation (2.2), we can rewrite
ψ(X, Y , z) = φ(X, Y , z) ∨ x1 ȳ2 ∨ y4 ȳ2 ∨ x̄1 ȳ4 y2 ,
with
φ(X, Y , z) = ȳ1 z ∨ y2 z ∨ y1 ȳ2 z̄ ∨ x1 ȳ1 ∨ x4 ȳ1 ∨ x̄1 x̄4 y1 ∨ x̄2 y4 ∨ x3 y4 ∨ x2 x̄3 ȳ4 .
Then, after some simplifications, equation (2.3) reduces to
x1 ∨ x̄4 ∨ ȳ1 ∨ y2 ∨ y4 ∨ z̄ ∨ v̄1 ∨ w ∨ x2 x̄3 ∨ x̄2 v4 ∨ x3 v4 = 0.
The conclusion is that any input vector X ∗ satisfying x1∗ = 0, x4∗ = 1, and
x2∗ x̄3∗ = 0 is a valid test vector for a stuck-at-1 fault at the lower OR-
gate. Any such vector produces the output w∗ = 0 in the faulty circuit, when
it should produce the output z∗ = 1 in the correct circuit (it is not very
difficult to check that this is indeed so, by direct verification).
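The direct verification mentioned here can be sketched in Python. The gate structure below is our reading of the expressions in Example 2.1 (y1 = x1 ∨ x4, y4 = x2 x̄3, y2 = x1 ∨ y4, z = y1 ȳ2), so it should be taken as an assumption of the sketch:

```python
from itertools import product

def correct(x1, x2, x3, x4):
    """Fault-free circuit: y1 = x1 OR x4, y4 = x2 AND NOT x3,
    y2 = x1 OR y4, z = y1 AND NOT y2."""
    y1 = x1 | x4
    y4 = x2 & (1 - x3)
    y2 = x1 | y4
    return y1 & (1 - y2)

def faulty(x1, x2, x3, x4):
    """Same circuit with the lower OR-gate (y2) stuck at 1."""
    y1 = x1 | x4
    v2 = 1                      # stuck-at-1 fault
    return y1 & (1 - v2)

# All inputs with x1 = 0, x4 = 1 and x2 x̄3 = 0:
tests = [(x1, x2, x3, x4)
         for x1, x2, x3, x4 in product((0, 1), repeat=4)
         if x1 == 0 and x4 == 1 and x2 & (1 - x3) == 0]
print(all(correct(*t) == 1 and faulty(*t) == 0 for t in tests))
```

Every such vector yields z∗ = 1 in the correct circuit and w∗ = 0 in the faulty one, so the two circuits are distinguished, as claimed.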
Larrabee [599] has demonstrated that a Boolean approach to test pattern gen-
eration, based on the formulation just described, is extremely effective in practice
and produces excellent results on well-known benchmark problems. In her exper-
iments, the approach proved competitive with alternative structural approaches
proposed in the specialized literature (see, e.g., [178, 590]).
In more recent work, Clarke et al. [204] describe successful reformulations
of other verification problems as Boolean DNF equations. They observe that this
approach, known as bounded model checking, appears to be remarkably efficient
and robust on industrial systems that would be difficult for the more traditional
model checking techniques based on binary decision diagrams; see also Jiang and
Villa [535].
φ(x1 , x2 , . . . , xn ) = ⋁_{k=1}^{m} ⋀_{i∈A_k} x_i ⋀_{j∈B_k} x̄_j = 0,    (2.4)
hundreds of questions. Indeed, it has long been observed that numerous
problems of a combinatorial nature can be reduced to the solution of Boolean
equations (see, for instance, Fortet [342, 343]; Hammer and Rudeanu [460]). This
statement was given a more precise and very dramatic formulation by Cook [208],
who proved that each and every decision problem in a broad class of problems
(namely, the so-called class NP) can be transformed in polynomial time into an
equivalent Boolean equation. In order to express Cook’s theorem in the usual for-
mat of complexity theory, we first pose the problem of solving Boolean equations
as a decision problem (see Appendix B).
Boolean Equation
Instance: Two Boolean expressions φ(X) and ψ(X).
Question: Is the equation φ(X) = ψ(X) consistent?
DNF Equation
Instance: A DNF expression φ(X).
Question: Is the equation φ(X) = 0 consistent?
φ(X) = ⋁_{k=1}^{m} ⋀_{i∈A_k} x_i ⋀_{j∈B_k} x̄_j = 0,    (2.5)
Applying De Morgan's laws, equation (2.5) can equivalently be written as

ψ(X) = ⋀_{k=1}^{m} ( ⋁_{i∈A_k} x̄_i ∨ ⋁_{j∈B_k} x_j ) = 1,    (2.6)

where ψ is now a CNF. As a matter of fact, Cook's theorem is frequently stated (and
was originally proved) in its dual form involving CNF rather than DNF equations.
Satisfiability
Instance: A CNF ψ(X).
Question: Is the equation ψ(X) = 1 consistent?
In view of Theorem 2.1 and of the equivalence of (2.5) and (2.6), we immedi-
ately conclude that Satisfiability is NP-complete, even when each clause of the
CNF ψ involves at most three literals (3-Satisfiability or 3-Sat problem).
Boolean equations have been frequently stated as satisfiability problems in
the artificial intelligence and computational complexity literatures. On the other
hand, DNF formulations are more commonly used in electrical engineering and
in propositional logic. In this book, we mostly deal with DNF equations rather
than satisfiability problems, but it should be clear that this is purely a matter of
convention.
The reader should also be aware that, in contrast to the foregoing comments,
equations of the form φ = 1, where φ is a DNF, are extremely easy to solve (and
thus rather uninteresting). To see this, simply remember that a DNF takes value 1
if and only if at least one of its terms takes value 1.
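The argument in this paragraph translates into a one-pass algorithm. A minimal Python sketch (term encoding ours): scan the terms, discard those containing a complementary pair of literals, and satisfy the first remaining term.

```python
def solve_dnf_eq_1(terms, n):
    """Solve phi(X) = 1 where phi is a DNF given as a list of terms
    (P_k, N_k): index sets of positive and negative literals. A term is
    satisfiable iff P_k and N_k are disjoint; set its literals accordingly
    and fix the free variables arbitrarily (here, to 0)."""
    for P, N in terms:
        if P & N:
            continue            # term contains both x and x̄: identically 0
        X = [0] * n
        for i in P:
            X[i] = 1            # variables in N and free variables stay 0
        return X
    return None                 # phi is identically 0: equation inconsistent

# x1 x̄1 ∨ x2 x̄3 = 1 is solved by the second term:
print(solve_dnf_eq_1([({0}, {0}), ({1}, {2})], 3))
```

The whole procedure runs in time linear in the length of the DNF, in sharp contrast with the NP-complete equation φ = 0.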
Finally, it should be noted that the bound on the degree of the equation in
Theorem 2.1 is tight, in the sense that the equation φ(X) = 0 can be solved in
polynomial time if φ is a quadratic DNF, as we shall see in Chapter 5. Numerous
extensions of Theorem 2.1, of the form Boolean Equation is NP-complete,
even when restricted to equations satisfying condition C, have been established in
the literature. We refer to [371, 571] for a discussion of such extensions, and we
propose some of them as end-of-chapter exercises.
⋀_{i∈A_k} x_i ⋀_{j∈B_k} x̄_j = 0,  k = 1, 2, . . . , m.
For instance, the production rules used in the knowledge base of an expert system frequently
constitute a system of conditions of this type; see Applications 1.13.1 and 2.1.
This is also the case for the Boolean equation associated with a logic circuit, as
explained in Applications 1.13.2 and 2.2.
More generally, systems of (possibly complex) Boolean conditions also arise
when instantiation techniques based on Herbrand’s theorem are used to prove the
validity of first-order logic formulas. Davis and Putnam [261] argued that, in this
framework, it is quite natural and efficient to work with DNFs. Their argument
goes as follows (for the sake of clarity, we replace the word “conjunctive” by
“disjunctive” in the authors’ original statement, without altering its meaning):
That the disjunctive normal form can be employed follows from the remark that to put
a whole system of formulas into disjunctive normal form we have only to put the
individual formulas into disjunctive normal form. Thus, even if a system has hundreds
or thousands of formulas, it can be put into disjunctive normal form “piece by piece”,
without any “multiplying out” (Davis and Putnam [261]).
In the remainder of this section, we show how an arbitrary system of Boolean
conditions (equations and inequalities) can be efficiently transformed into an
equivalent DNF equation. Let us first define what we mean by a “system of Boolean
conditions.”
Definition 2.3. A Boolean system on B n is a collection of Boolean equations and
inequalities of the form
φk (X) = ψk (X) k = 1, 2, . . . , p, (2.7)
φk (X) ≤ ψk (X) k = p + 1, p + 2, . . . , p + q, (2.8)
where φk and ψk are Boolean expressions on B n , for k = 1, 2, . . . , p + q. A solution
of the system is a point X ∗ ∈ B n such that φk (X ∗ ) = ψk (X ∗ ) for k = 1, 2, . . . , p
and φk (X ∗ ) ≤ ψk (X ∗ ) for k = p + 1, p + 2, . . . , p + q.
An easy, but fundamental, result due to Boole [103] allows us to transform any
Boolean system into a single Boolean equation.
Theorem 2.2. The Boolean system (2.7)–(2.8) has the same set of solutions as the
Boolean equation
⋁_{k=1}^{p} ( φk (X) ψ̄k (X) ∨ φ̄k (X) ψk (X) ) ∨ ⋁_{k=p+1}^{p+q} φk (X) ψ̄k (X) = 0.    (2.9)
Proof. It suffices to observe that the system (2.7) is equivalent to the system
φk (X) ≤ ψk (X) k = 1, 2, . . . , p,
φk (X) ≥ ψk (X) k = 1, 2, . . . , p,
and that each inequality of the form φk (X) ≤ ψk (X) is in turn equivalent to the
equation φk (X) ψ̄k (X) = 0.
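Boole's reduction is mechanical enough to sketch in a few lines of Python (the encoding of expressions as 0-1 valued callables is ours):

```python
from itertools import product

def combine(equalities, inequalities):
    """Theorem 2.2 as code: merge the equations phi_k = psi_k and the
    inequalities phi_k <= psi_k into a single equation F(X) = 0, with
    F = OR_k (phi_k AND NOT psi_k OR NOT phi_k AND psi_k)   [equalities]
        OR_k (phi_k AND NOT psi_k)                          [inequalities]."""
    def F(X):
        v = 0
        for phi, psi in equalities:
            v |= phi(X) * (1 - psi(X)) | (1 - phi(X)) * psi(X)
        for phi, psi in inequalities:
            v |= phi(X) * (1 - psi(X))
        return v
    return F

# The system x1 = x2, x1 <= x3 on B^3:
F = combine([(lambda X: X[0], lambda X: X[1])],
            [(lambda X: X[0], lambda X: X[2])])
sols = [X for X in product((0, 1), repeat=3) if F(X) == 0]
print(sols)  # exactly the points with x1 = x2 and x1 <= x3
```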
In view of Theorem 2.2, it only remains to show that every Boolean equation of
the form φ(X) = 0 can be efficiently transformed into an equivalent DNF equation.
A polynomial time transformation could of course be read from the proof of Cook’s
theorem (Theorem 2.1), but the resulting procedure would be too cumbersome to
be of practical interest.
On the other hand, since every Boolean function has a DNF expression, the left-
hand side of (2.9) could, in principle, be rewritten as an equivalent DNF. However,
we have already observed that this may lead to an exponential explosion in the size
of the problem (see Example 1.11). As a matter of fact, Example 1.11 essentially
shows that there is no hope of achieving the desired polynomial time transformation
of an arbitrary equation into an equivalent DNF equation, unless one is willing to
introduce additional variables in the picture.
Definition 2.4. Consider two Boolean systems, say S1 (X) and S2 (X, Y ), where S1
involves only the variables (x1 , x2 , . . . , xn ), whereas S2 involves (x1 , x2 , . . . , xn ) and
possibly additional variables (y1 , y2 , . . . , ym ). We say that S1 and S2 are equivalent
if the following two conditions hold:
(a) if X ∗ is a solution of S1 , then there exists Y ∗ such that (X ∗ , Y ∗ ) is a solution of S2 ;
(b) if (X ∗ , Y ∗ ) is a solution of S2 , then X ∗ is a solution of S1 .
So, when S1 and S2 are equivalent, the solution set of S1 is the projection on
B n of the solution set of S2 . In particular, if S2 only involves the X-variables, then
S1 and S2 are equivalent if and only if they have the same solution set.
We are now ready for the main result of this section (Tseitin [872]; see also [78]
for a broader discussion and for extensions of this result to first-order predicate
logic).
Theorem 2.3. Every Boolean system can be reduced in linear time to an equivalent
DNF equation.
Proof. First, Theorem 2.2 can be used to rewrite (in linear time) the system as a sin-
gle equation of the form φ(X) = 0. Then, apply the procedure Expand described
in Section 1.5 to the expression φ(X). The output of Expand is a DNF ψ(X, Y )
and a distinguished literal z among the literals on (X, Y ), with the property that
the equation φ(X) = 0 is equivalent to the DNF equation ψ|z=0 (X, Y ) = 0.
Actually, we do not need the full power of the procedure Expand in order to
establish Theorem 2.3. Indeed, we leave it to the reader to verify that the procedure
Expand∗ in Figure 2.1, which introduces fewer additional variables and produces
shorter DNFs than Expand, also achieves the required transformation (we refer, for
instance, to Blair, Jeroslow and Lowe [98], Clarke, Biere, Raimi and Zhu [204],
Eén and Sörensson [290], Jeroslow [533], Plaisted and Greenbaum [750], and
Wilson [914], for descriptions and applications of related procedures).
Procedure Expand∗ (φ)
begin
if φ is a DNF then return ψ := φ;
else if φ = ᾱ for some expression α then
begin
if α = β̄ for some expression β then return Expand∗ (β);
else if α = (φ1 ∨ φ2 ) for some expressions φ1 , φ2 then return Expand∗ (φ̄1 φ̄2 );
else if α = (φ1 φ2 ) for some expressions φ1 , φ2 then return Expand∗ (φ̄1 ∨ φ̄2 );
end
else if φ = (φ1 ∨ φ2 ∨ . . . ∨ φk ) for some expressions φ1 , φ2 , . . . , φk then
begin
for j = 1 to k do ψj := Expand∗ (φj );
return ψ := ψ1 ∨ ψ2 ∨ . . . ∨ ψk ;
end
else if φ = (φ1 φ2 . . . φk ) for some expressions φ1 , φ2 , . . . , φk then
begin
for j = 1 to k do ψj := Expand∗ (φj );
create k new variables, say y1 , y2 , . . . , yk ;
return ψ := ȳ1 ψ1 ∨ ȳ2 ψ2 ∨ . . . ∨ ȳk ψk ∨ y1 y2 . . . yk ;
end
end

Figure 2.1. Procedure Expand∗ .
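The recursive case analysis of Expand∗ can be sketched in Python. The expression representation below ('lit', 'not', 'or', 'and' tuples) and the literal encoding (name, polarity) are ours; the sketch returns a list of terms whose disjunction ψ satisfies: φ(X) = 0 if and only if ψ(X, Y ) = 0 for some assignment of the new y-variables.

```python
from itertools import count

fresh = count()  # supplies indices for the new y-variables

def expand(phi):
    """Sketch of Expand*: return the terms of psi as sets of literals
    (name, polarity), polarity 1 for x and 0 for x-bar.
    Expressions: ('lit', name, pol), ('not', e), ('or', [...]), ('and', [...])."""
    op = phi[0]
    if op == 'lit':
        return [{(phi[1], phi[2])}]
    if op == 'not':
        inner = phi[1]
        if inner[0] == 'lit':                      # complemented literal
            return [{(inner[1], 1 - inner[2])}]
        if inner[0] == 'not':                      # double complementation
            return expand(inner[1])
        if inner[0] == 'or':                       # De Morgan
            return expand(('and', [('not', e) for e in inner[1]]))
        return expand(('or', [('not', e) for e in inner[1]]))
    if op == 'or':                                 # psi := psi_1 ∨ ... ∨ psi_k
        return [t for e in phi[1] for t in expand(e)]
    # op == 'and': create y_1, ..., y_k and build
    # psi := ȳ1 psi_1 ∨ ... ∨ ȳk psi_k ∨ y_1 y_2 ... y_k
    ys = ['y%d' % next(fresh) for _ in phi[1]]
    terms = [{(y, 0)} | t for y, e in zip(ys, phi[1]) for t in expand(e)]
    terms.append({(y, 1) for y in ys})
    return terms

# Complement of (x1 ∨ x2): three terms, using two new y-variables.
print(expand(('not', ('or', [('lit', 'x1', 1), ('lit', 'x2', 1)]))))
```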
The next result underlines the special role played by DNF equations of degree 3.
It can be seen as a strengthening of the second half of Theorem 2.1.
Theorem 2.4. Every DNF equation can be reduced in linear time to an equivalent
DNF equation of degree 3.
Proof. Consider a DNF equation of the form (2.5) and assume that |A1 | + |B1 | > 3.
Select two distinct indices in A1 ∪ B1 , say, h, ℓ ∈ A1 (similar arguments apply if
one of the indices is in B1 ). Let y be an additional Boolean variable, different from
x1 , x2 , . . . , xn , and define

ψ(X, y) = ⋀_{i∈A_1\{h,ℓ}} x_i ⋀_{j∈B_1} x̄_j y ∨ ⋁_{k=2}^{m} ⋀_{i∈A_k} x_i ⋀_{j∈B_k} x̄_j ∨ x_h x_ℓ ȳ ∨ x̄_h y ∨ x̄_ℓ y.
We claim that the equations φ(X) = 0 and ψ(X, y) = 0 are equivalent. To see
this, consider any point (X ∗ , y ∗ ) ∈ B n+1 . It is easy to see that the expression
xh∗ xℓ∗ ȳ ∗ ∨ x̄h∗ y ∗ ∨ x̄ℓ∗ y ∗ is equal to 0 if and only if y ∗ = xh∗ xℓ∗ . This implies that,
for all solutions (X ∗ , y ∗ ) of the equation ψ(X, y) = 0, there also holds φ(X ∗ ) = 0.
And conversely, every solution X ∗ of φ(X) = 0 gives rise to a solution (X ∗ , y ∗ )
of the equation ψ(X, y) = 0, by simply setting y ∗ = xh∗ xℓ∗ . Thus, the equations are
equivalent.
Note that the degree of the first term of ψ is equal to |A1 | + |B1 | − 1. Thus,
repeatedly applying this reduction eventually yields a DNF equation of degree 3. It
can be checked that the total number of additional variables and terms introduced
by this transformation is O(∑_{k=1}^{m} (|Ak | + |Bk |)). We leave to the reader a more
complete analysis of the complexity of this procedure.
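One reduction step of this proof is easy to verify by brute force. The Python sketch below (term and literal encodings ours) replaces a degree-4 term by the degree-3 terms of ψ and checks that the solution sets of φ = 0 and ψ = 0 agree after projecting out y:

```python
from itertools import product

def eval_terms(terms, assign):
    """Value of a DNF given as a list of terms; each term is a set of
    literals (variable index, polarity)."""
    return int(any(all(assign[v] == p for v, p in t) for t in terms))

def reduce_term(A, B, y):
    """One step of the reduction in Theorem 2.4: replace the term
    AND_{i in A} x_i AND_{j in B} x̄_j, with h, l taken from A, by a term
    of degree |A| + |B| - 1 plus gadget terms forcing y = x_h x_l."""
    h, l = sorted(A)[:2]
    main = {(i, 1) for i in A if i not in (h, l)} | {(j, 0) for j in B} | {(y, 1)}
    gadget = [{(h, 1), (l, 1), (y, 0)}, {(h, 0), (y, 1)}, {(l, 0), (y, 1)}]
    return [main] + gadget

# phi = x0 x1 x2 x̄3, a single term of degree 4; y is variable 4.
phi_terms = [{(0, 1), (1, 1), (2, 1), (3, 0)}]
psi_terms = reduce_term({0, 1, 2}, {3}, 4)

# phi(X) = 0 iff psi(X, y) = 0 for some value of y:
ok = all((eval_terms(phi_terms, X) == 0)
         == any(eval_terms(psi_terms, X + (y,)) == 0 for y in (0, 1))
         for X in product((0, 1), repeat=4))
print(ok)
```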
Relying on Theorem 2.3 (and Theorem 2.4), the remainder of this chapter
mostly concentrates on the solution of DNF equations. The reader should be aware,
however, that the transformation of an arbitrary Boolean equation into a DNF
equation typically introduces a large number of new variables into the picture,
even when procedure Expand∗ is used, rather than Expand. Hence, in some cases,
this transformation may artificially increase the difficulty of the problem at hand.
Since some Boolean equations naturally arise in non-DNF form (e.g., equations of
the form φ(X) = ψ(X) arising in logic circuit verification; see Application 2.2),
it may sometimes be desirable to develop procedures capable of dealing directly
with these alternative forms, rather than blindly relying on the general techniques
discussed earlier.
To illustrate this comment, let us consider an equation of the form φ(X) = ψ(X),
where φ and ψ are DNFs. According to our previous discussion, one way of
handling this equation is to rewrite it as φ(X) ψ(X) ∨ φ(X) ψ(X) = 0, and next
to apply Expand∗ to the latter equation. However, a more efficient approach can
be used here. First, check whether the system φ(X) = 1, ψ(X) = 1 has a solution.
Since φ and ψ are both DNFs, this system turns out to be very easy to solve
(we leave this for the reader to check). If it is consistent, then we can stop right
away. Otherwise, the original equation φ(X) = ψ(X) has been reduced to the
system φ(X) = 0, ψ(X) = 0, which is, in turn, equivalent to the DNF equation
φ(X) ∨ ψ(X) = 0. Clearly, this approach usually involves much less work than
the “standard” procedure.
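The first step of this approach, solving the system φ = 1, ψ = 1, amounts to scanning pairs of terms for complementary literals. A small Python sketch (term encoding ours):

```python
def consistent_pair(t1, t2):
    """Two terms can be simultaneously satisfied iff they contain no
    complementary pair of literals; literals are (variable, polarity)."""
    vals = dict(t1)
    return all(vals.get(v, p) == p for v, p in t2)

def solve_eq_phi_psi(phi_terms, psi_terms, n):
    """First look for X with phi(X) = psi(X) = 1 by scanning term pairs
    in O(|phi| |psi|) time; if none exists, the equation phi = psi reduces
    to the DNF equation phi ∨ psi = 0 (handled elsewhere)."""
    for t1 in phi_terms:
        for t2 in psi_terms:
            if consistent_pair(t1, t2):
                X = [0] * n
                for v, p in t1 | t2:
                    X[v] = p
                return X            # phi(X) = psi(X) = 1
    return None                     # fall back to solving phi ∨ psi = 0

phi = [{(0, 1), (1, 0)}]            # x1 x̄2
psi = [{(1, 1)}, {(0, 1), (2, 1)}]  # x2 ∨ x1 x3
print(solve_eq_phi_psi(phi, psi, 3))
```

Here the terms x1 x̄2 and x1 x3 are compatible, so a common true point is found directly, and no DNF equation needs to be solved at all.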
It will also be easy to see that some of the equation-solving techniques presented
in the following sections (e.g., the enumeration techniques) can be modified in a
straightforward way to handle non-DNF equations. Other techniques have been
generalized in a more sophisticated way with the same goal in mind, for example,
the consensus technique (see Thayse [863], Van Gelder [885]) or local search
heuristics (see Stachniak [844]).
Theorem 2.5. The equation φ(x1 , . . . , xn−1 , xn ) = 0 is consistent if and only if at least one of the equations
φ(x1 , . . . , xn−1 , 0) = 0
or
φ(x1 , . . . , xn−1 , 1) = 0
is consistent.
Theorem 2.5 suggests that we can solve the equation φ(x1 , . . . , xn−1 , xn ) = 0
using a branching, or enumerative, procedure similar in spirit to the branch-and-
bound methods developed for integer programming problems. We are now going
to describe the basic scheme of such a procedure, first informally, and then more
rigorously. We restrict ourselves to a depth-first search version of the procedure,
partly for the sake of simplicity, and also because many efficient implementations
fall under this category (the reader will easily figure out what a more general
branching scheme may look like).
The procedure can be viewed as growing a binary enumeration tree (or “seman-
tic tree”), where each node of the tree corresponds to a partial assignment of values
to the variables. More precisely, each node is associated with a subproblem that we
denote by (φ, T , F ), where T and F are two disjoint subsets of {1, . . . , n}. This sub-
problem is defined as follows: Find a solution X∗ = (x1∗ , x2∗ , . . . , xn∗ ) of the equation
φ(X) = 0 such that xi∗ = 1 for all i ∈ T , and xi∗ = 0 for all i ∈ F , or decide that no
such solution exists. The root of the tree corresponds to the subproblem (φ, ∅, ∅),
meaning that all variables are initially free.
The branching procedure uses a subroutine Preprocess(φ, T , F ) which could
perform a variety of preprocessing operations on the subproblem (φ, T , F ). We
simply assume that this subroutine always returns one of three possible outputs:
Procedure Branch(φ, T , F ).
Input: A Boolean expression φ(x1 , x2 , . . . , xn ) and two subsets T , F of {1, . . . , n} such that
T ∩ F = ∅.
Output: A solution X∗ = (x1∗ , x2∗ , . . . , xn∗ ) of the equation φ(X) = 0 such that xi∗ = 1 for all i ∈ T ,
and xi∗ = 0 for all i ∈ F , if such a solution exists; No otherwise.
begin
if Preprocess(φ, T , F ) = X∗ then return X ∗ ;
if Preprocess(φ, T , F ) = No then return No;
if Preprocess(φ, T , F ) = (ψ, S, G) then
{comment: branch }
begin
select an index i ∈ {1, . . . , n} \ (S ∪ G);
{comment: fix xi to 0}
if Branch(ψ, S, G ∪ {i}) = X∗ then return X ∗
{comment: fix xi to 1}
else return Branch(ψ, S ∪ {i}, G);
end
end
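A minimal executable rendering of Branch, with a deliberately simple Preprocess that only evaluates the terms under the partial assignment (the term representation and the naive selection rule are ours, not the book's):

```python
# A term is a dict {variable: literal value making it true}; equation: phi = 0.

def preprocess(phi, T, F):
    """Evaluate every term under the partial assignment: variables in T are
    fixed to 1, variables in F to 0. Returns ('sol', fixed), ('no', None),
    or ('open', reduced_terms)."""
    fixed = {i: True for i in T}
    fixed.update({i: False for i in F})
    open_terms = []
    for term in phi:
        if all(fixed[v] == val for v, val in term.items() if v in fixed):
            reduced = {v: val for v, val in term.items() if v not in fixed}
            if not reduced:
                return ('no', None)        # some term is already fixed to 1
            open_terms.append(reduced)
    if not open_terms:
        return ('sol', fixed)              # phi vanishes under the assignment
    return ('open', open_terms)

def branch(phi, n, T=frozenset(), F=frozenset()):
    """Depth-first version of Branch for the equation phi = 0."""
    status, result = preprocess(phi, T, F)
    if status == 'sol':                    # complete free variables with 0
        return {i: result.get(i, False) for i in range(1, n + 1)}
    if status == 'no':
        return None
    i = min(set(range(1, n + 1)) - T - F)  # naive selection rule
    return branch(phi, n, T, F | {i}) or branch(phi, n, T | {i}, F)
```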
where wi = 2−i . Jeroslow and Wang [534] and Harche, Hooker, and Thompson [476] obtained good computational results with this branching rule, but Dubois et al. [281] report that other choices of the weights wi may be more effective.

2.5 Branching procedures 83

One alternative combines the scores of the two literals of a candidate variable, as in

hmin(φ) (x) + hmin(φ) (x̄) + α min{ hmin(φ) (x), hmin(φ) (x̄) }, (2.11)
Other practical branching rules are discussed in [58, 166, 239, 281, 418, 534,
613, 886], and so on. Dubois et al. [281], in particular, stress the fact that the branch-
ing strategies that prove most efficient on consistent instances may be different
from those that perform well on inconsistent instances.
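Expression (2.10) is not reproduced in this excerpt; assuming the Jeroslow–Wang score W(u) = Σ_t 2^(−|t|) over the terms t containing the literal u (consistent with the weights wi = 2^(−i) above), the scores can be computed as:

```python
from fractions import Fraction

def jw_scores(phi):
    """W(u) = sum over the terms t containing the literal u of 2^(-|t|);
    terms are dicts {variable: value making the literal true}."""
    scores = {}
    for term in phi:
        weight = Fraction(1, 2 ** len(term))
        for literal in term.items():
            scores[literal] = scores.get(literal, Fraction(0)) + weight
    return scores

def best_literal(phi):
    """Literal with the largest score, a natural candidate for branching."""
    scores = jw_scores(phi)
    return max(scores, key=scores.get)
```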
In order to improve the effectiveness of branching, several authors have sug-
gested focusing on control sets, where a control set is a set S of indices such that,
after branching on all the variables with indices in S, in any arbitrary order, the
remaining equation is always “easy” to solve (that is, the subproblem (φ, T , F ) is
“easy” for every partition T ∪ F of S). This type of strategy appears, for instance,
in publications by Brayton et al. [153], Chandru and Hooker [183], Boros et al.
[116], Truemper [871], and so on. Crama, Ekin, and Hammer [229] proved that
finding a smallest control set is NP-hard for a broad range of specifications of what
constitutes an “easy” equation. Closely related concepts have recently been reex-
amined by Williams, Gomes, and Selman [913] under the name backdoor sets; see
also [568, 581, 715, 854] and the discussion of relaxation schemes in Section 2.5.2
hereunder.
The branching rules described earlier may lead to ties that can be broken either
deterministically (e.g., by choosing the variable with smallest index among the
candidates) or by random selection. Implementations of sophisticated randomized
branching rules are found, for instance, in Bayardo and Schrag [58] and Crawford
and Auton [239]. Interestingly, Gomes et al. [403] provide evidence that random-
ized selection may noticeably influence the performance of branching procedures
(namely, the variance of the running time is usually large when the randomized
procedure is applied several times to a single instance).
Departing from the basic algorithm Branch, some authors have suggested
branching on terms of the current DNF rather than on its variables. For
instance, Monien and Speckenmeyer [690] proposed the following approach: If
84 2 Boolean equations
2.5.2 Preprocessing
Let us now discuss some of the possible ingredients that may go into the sub-
routine Preprocess. We assume again, for the sake of simplicity, that the current
subproblem is the DNF equation φ = 0. We successively handle rewriting rules,
the Davis-Putnam rules, general heuristics, and relaxation schemes.
Rewriting rules
Any rewriting operation that replaces φ by an equivalent DNF can be applied.
Examples of such operations are the removal of duplicate terms or, more generally,
the removal of any term of φ that is absorbed by another term. Several authors
have also experimented with rules which replace φ by an equivalent DNF of the
form φ ∨ C1 ∨ C2 ∨ . . . ∨ Cr , where C1 , C2 , . . . , Cr are (prime) implicants of φ.
The consensus procedure (see Section 2.7) can be interpreted in this framework;
related ideas are found in [599, 886].
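The absorption rule — a term C absorbs a term D when every literal of C appears in D — can be sketched as follows (terms as frozensets of literals, a representation of our own choosing):

```python
def remove_absorbed(terms):
    """Delete duplicate terms and terms absorbed by another term:
    C absorbs D when the literal set of C is contained in that of D."""
    kept = []
    for t in sorted(set(terms), key=len):   # duplicates vanish in set()
        if not any(k <= t for k in kept):   # some kept term absorbs t
            kept.append(t)
    return kept
```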
Davis-Putnam rules
In an oft-cited paper, Davis and Putnam [261] proposed a number of simple prepro-
cessing rules that have attracted an enormous amount of attention in the literature
on Boolean equations and that are implemented in most of the efficient equation
solvers (strictly speaking, Davis and Putnam’s suggestions were formulated in the
framework of elimination algorithms – to be discussed in Section 2.6 – rather
than branching algorithms; the application of these rules within branching proce-
dures was popularized by Davis, Logemann, and Loveland [260] and Loveland
[627]).
The Davis-Putnam rules identify various special circumstances under which a
variable xi can be fixed to a specific value without affecting the consistency of the
equation. The rules fall into two categories: unit literal rules (sometimes called
unit clause rules, unit deduction rules, forward chaining rules, etc.) and monotone
literal rules (sometimes called pure literal rules, affirmative-negative rules, etc.).
To state them, it is convenient to assume that the terms of the DNF φ have been
grouped as follows:
φ = x i φ0 ∨ xi φ1 ∨ φ2 , (2.12)
The unit literal rules are obviously valid; that is, the equation obtained after
applying the rules is consistent if and only if the original equation is consistent.
Within branching algorithms, they are usually applied in an iterative fashion until
their premises are no longer satisfied. At this point, either a complete solution of
the equation φ = 0 has been found, or an equivalent, but simpler equation has been
derived. In the artificial intelligence literature, this procedure sometimes goes by
the name of unit resolution, clausal chaining, or Boolean constraint propagation
(BCP) (see, e.g., [186, 533, 627, 670, 693]).
The unit literal rules can be implemented to run in linear time and are computationally efficient. It is worth noting that they are somewhat redundant with
most of the branching rules described in the previous subsection, in the sense that
these branching rules tend to select a variable appearing in a term of degree 1
when such a term exists (since the branching rules often give priority to variables
appearing in short terms). Thus, many branching rules can be seen as automatically
enforcing the unit literal rules when they are applicable, and as generalizing these
rules to terms of higher degree otherwise. Separately handling the unit literal rules,
however, usually allows for more efficient implementations.
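For the DNF equation φ = 0, a term reduced to the single literal xi forces xi = 0 (and a term x̄i forces xi = 1). A sketch of the iterated unit literal rules (Boolean constraint propagation), with terms again as dicts from variables to required literal values (our representation):

```python
def unit_propagate(phi):
    """Iterate the unit literal rules on the DNF equation phi = 0: each
    degree-1 term forces its literal to 0. Returns (assignment, phi') or
    (None, None) when a contradiction (a term fixed to 1) appears."""
    phi = [dict(t) for t in phi]
    assignment = {}
    while True:
        unit = next((t for t in phi if len(t) == 1), None)
        if unit is None:
            return assignment, phi
        (v, val), = unit.items()
        assignment[v] = not val                   # falsify the unit term
        new_phi = []
        for t in phi:
            if v in t:
                if t[v] == assignment[v]:         # literal satisfied: drop it
                    reduced = {u: w for u, w in t.items() if u != v}
                    if not reduced:
                        return None, None         # term became constant 1
                    new_phi.append(reduced)
                # else the literal is falsified and the term vanishes
            else:
                new_phi.append(t)
        phi = new_phi
```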
Let us now turn to the monotone literal rules.
The monotone literal rules are valid in the sense that φ = 0 has a solution if and
only if the equation obtained after applying the rules has a solution. From a practical
viewpoint, they can be implemented to run in linear time but seem to have only
a marginal effect on the performance of branching procedures. Generalizations of
these rules have been investigated in [126].
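For the monotone literal rules: if a variable occurs with a single polarity in φ, fixing it so as to falsify that literal deletes every term containing it, without affecting the consistency of φ = 0. A sketch, under the same term representation as above:

```python
def monotone_literal(phi):
    """One pass of the monotone (pure) literal rules on the DNF equation
    phi = 0: a variable occurring with a single polarity is fixed so as to
    falsify its literal, which deletes every term containing it."""
    polarities = {}
    for term in phi:
        for v, val in term.items():
            polarities.setdefault(v, set()).add(val)
    fixed = {v: not vals.pop() for v, vals in polarities.items()
             if len(vals) == 1}
    remaining = [term for term in phi if not set(term) & set(fixed)]
    return fixed, remaining
```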
Heuristics
Any heuristic approach to consistency testing can be used within the branching
framework. For instance, Jeroslow and Wang [534] implement a “greedy” heuris-
tic, which essentially consists in iteratively fixing to 0 any literal u that maximizes
the expression W (u) defined by (2.10). This process is repeated until either a solu-
tion X ∗ of φ(X) = 0 has been produced or a contradiction has been detected. In
the latter case, Preprocess simply returns the original equation. Jaumard, Stan,
and Desrosiers [532] similarly rely on a tabu search heuristic at every node of the
branching tree.
Relaxation schemes
An interesting approach to preprocessing has been initiated by Gallo and Urbani
[365] (who also credit Minoux [unpublished] with a similar idea) and exploited
by several other researchers in various frameworks. This approach makes use of a
basic ingredient of enumerative algorithms: the notion of relaxation of a problem.
We define here a relaxation scheme as an operator that associates with every
(DNF) equation φ(X) = 0 another (DNF) equation ψ(X, Y ) = 0 (its relax-
ation), with the property that φ(X) = 0 is inconsistent whenever ψ(X, Y ) = 0
is inconsistent.
Given a relaxation scheme, the subroutine Preprocess can proceed along the
following lines:
For the current subproblem φ(X) = 0,
Thus, solving the relaxation ψ = 0 either proves that the original equation φ = 0
is inconsistent (in step (b)) or produces a candidate (heuristic) solution of φ = 0
(in step (c)).
Generally speaking, the art consists in choosing the relaxation scheme in such
a way that the relaxed equation ψ(X, Y ) = 0 is “easy” to solve, while remaining
sufficiently “close” to the original equation. One way of defining a relaxation
scheme is to construct ψ so that ψ(X, Y ) ≤ φ(X) for all (X, Y ), which can be
achieved by removing a subset of terms from φ. In this framework, the goal is to
remove as few terms as possible from φ (so that ψ remains “close” to φ) until
the equation ψ = 0 becomes “easy” to solve. (This idea is related to the notion
of control set introduced in Section 2.5.1.) Crama, Ekin, and Hammer [229] have
investigated the computational complexity of several versions of this problem.
Gallo and Urbani [365] use Horn equations as relaxations of arbitrary DNF
equations. Horn equations are precisely those DNF equations in which each term
contains at most one complemented variable (recall Definition 1.30 in Section
2.6 Variable elimination procedures 87
1.13.1). As we will see in Chapter 6, Horn equations can be solved in linear time
(essentially, by repeated application of the unit literal rules). A DNF equation φ = 0
can be relaxed to a Horn equation by dropping from φ any term that contains more
than one complemented variable. More elaborate schemes are discussed in Gallo
and Urbani [365] or Pretolani [758].
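The Horn relaxation described above simply discards every term with more than one complemented variable; a sketch (terms as dicts, a complemented variable being one whose literal requires the value False):

```python
def horn_relaxation(phi):
    """Drop every term of phi with more than one complemented variable;
    the remaining DNF psi is Horn and satisfies psi <= phi, so the
    inconsistency of psi = 0 implies that of phi = 0."""
    return [term for term in phi
            if sum(1 for val in term.values() if val is False) <= 1]
```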
Other authors have similarly proposed to relax the given DNF equation to a
quadratic equation (quadratic equations, like Horn equations, are easily solved in
linear time; see Chapter 5). Buro and Kleine Büning [166]; Dubois and Dequen
[283]; Groote and Warners [412]; Jaumard, Stan, and Desrosiers [532]; Larrabee
[599]; and Van Gelder and Tsuji [886] report on computational experiments relying
on (variants of) such schemes. As Larrabee observed [599], one may expect these
approaches to perform particularly well when the equation contains a relatively
high number of quadratic terms, as is the case with the equations arising from
stuck-at fault detection in combinational circuits (see Application 2.2).
Finally, we note that the decomposition techniques described by Truemper [871]
share some similarities with relaxation schemes.
the variables are not immediately relevant, but have rather been introduced in the
equation in order to facilitate the formulation of a problem. For instance, in the
Boolean equation φ(X, Y , z) = 0 describing the correct functioning of a switching
circuit (see Application 1.13.2), the variables Y associated with the output of the
hidden gates are usually not of direct interest. In this application, eliminating the Y -
variables from φ = 0 leads to an equation whose solution set describes the relation
between the input signals X and the output signal z (viz., the function computed
by the circuit).
More specifically, successive elimination of all variables of the equation (2.13)
eventually provides a straightforward consistency test for this equation. Before we
make this more precise, however, we would like to address the following question:
Suppose that the equation (2.14) is consistent, and that we know one of its solutions, say (x1∗ , . . . , x∗n−1 ); how can we use this knowledge to produce a solution of the original equation (2.13)? The next result provides a constructive answer to this question.

Theorem 2.7. If (x1∗ , . . . , x∗n−1 ) is a solution of (2.14), if xn∗ = φ(x1∗ , . . . , x∗n−1 , 0), and if xn∗∗ is the complement of φ(x1∗ , . . . , x∗n−1 , 1), then both (x1∗ , . . . , x∗n−1 , xn∗ ) and (x1∗ , . . . , x∗n−1 , xn∗∗ ) are solutions of (2.13).
Let us now illustrate the use of Theorems 2.6 and 2.7 on a small example.
Example 2.2. Consider the DNF equation φ3 (x1 , x2 , x3 ) = 0, where
φ3 = x 1 x2 x 3 ∨ x1 x 2 x3 ∨ x 1 x 2 ∨ x 1 x3 ∨ x2 x3 .
By Theorem 2.6, the equation φ3 (x1 , x2 , x3 ) = 0 is consistent if and only if the
equation φ2 (x1 , x2 ) = 0 is consistent, where
φ2 (x1 , x2 ) = φ3 (x1 , x2 , 0)φ3 (x1 , x2 , 1) = (x 1 x2 ∨ x 1 x 2 )(x1 x 2 ∨ x 1 x 2 ∨ x 1 ∨ x2 ).
Applying once again Theorem 2.6, φ2 (x1 , x2 ) = 0 is consistent if and only if
φ1 (x1 ) = 0 is consistent, where
φ1 (x1 ) = φ2 (x1 , 0)φ2 (x1 , 1) = x 1 .
Finally, eliminating x1 yields
φ0 = 0.
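The elimination and backtracking steps of Theorem 2.7 can be sketched for an arbitrary Boolean function given as a Python predicate (a rendering of our own); applied to φ3 of Example 2.2, it yields the solution (1, 0, 0):

```python
def eliminate(phi, n):
    """Successive variable elimination: build phi_{j-1}(X) = phi_j(X, 0)
    AND phi_j(X, 1), test phi_0, then backtrack with x_j := phi_j(x*, 0)
    as in Theorem 2.7."""
    funcs = [phi]                          # funcs[k] represents phi_{n-k}
    for _ in range(n):
        funcs.append((lambda g: lambda *xs: g(*xs, 0) and g(*xs, 1))(funcs[-1]))
    if funcs[-1]():                        # phi_0 = 1: inconsistent
        return None
    sol = []
    for j in range(1, n + 1):              # funcs[n - j] is phi_j
        sol.append(1 if funcs[n - j](*sol, 0) else 0)
    return tuple(sol)

# phi_3 of Example 2.2, with complementation written as (1 - x):
phi3 = lambda x1, x2, x3: ((1 - x1) * x2 * (1 - x3) or x1 * (1 - x2) * x3
                           or (1 - x1) * (1 - x2) or (1 - x1) * x3 or x2 * x3)
```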
Procedure Eliminate(φ)
Input: A Boolean expression φ(x1 , . . . , xn ).
Output: A solution (x1∗ , . . . , xn∗ ) of the equation φ(X) = 0 if the equation is consistent; No otherwise.
begin
φn := φ(x1 , . . . , xn );
{comment: begin successive variable elimination}
for j := n down to 1 do φj −1 (x1 , . . . , xj −1 ) := φj (x1 , . . . , xj −1 , 0) φj (x1 , . . . , xj −1 , 1);
{comment: consistency check}
if φ0 = 1 then return No;
if φ0 = 0 then {comment: the equation is consistent; begin backtracking}
for j := 1 to n do xj∗ := φj (x1∗ , . . . , x∗j−1 , 0);
return (x1∗ , . . . , xn∗ );
end
Figure 2.3 presents a formal statement of the procedure Eliminate(φ) for the
solution of Boolean equations of the form (2.13). The correctness of the procedure
is an immediate consequence of Theorems 2.6 and 2.7. It should be noted, however,
that Eliminate can be implemented in a variety of ways. More precisely, the
meaning of the assignment in the for-loop depends on how the successive functions φj are represented. Note that, by construction of the elimination steps,

φ0 = ⋀_{X∗ ∈ Bn} φ(X∗ ).
φn = x n ψ0 ∨ xn ψ1 ∨ ψ2 , (2.18)
where ψ0 , ψ1 and ψ2 are DNFs involving the variables x1 , . . . , xn−1 , but not xn .
Then,
φn (x1 , . . . , xn−1 , 0) = ψ0 ∨ ψ2
and
φn (x1 , . . . , xn−1 , 1) = ψ1 ∨ ψ2 ,
so that

φn−1 = ψ0 ψ1 ∨ ψ2 . (2.19)

The expression (2.19) can be used to rewrite φn−1 as a DNF. Indeed, by distributivity, the conjunction ψ0 ψ1 has a DNF expression ψ, each term of which is
simply the conjunction of a term of ψ0 with a term of ψ1 . This DNF can be further
simplified by deleting any term that is identically 0 or is absorbed by another term.
These straightforward rules yield a DNF equivalent to φn−1 .
Since a DNF is identically zero if and only if it has no terms, this approach
sometimes allows us to detect consistency early in the elimination procedure,
thus reducing the number of iterations required by Eliminate and speeding up
termination.
Example 2.3. Consider the equation
φ4 = x1 x2 x 4 ∨ x 1 x2 x3 x 4 ∨ x 2 x4 ∨ x 1 x 3 x4 .
By elimination of x4 , we get
φ3 = (x1 x2 ∨ x 1 x2 x3 )(x 2 ∨ x 1 x 3 ).
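A sketch of one DNF-level elimination step, expanding ψ0 ψ1 by distributivity and discarding identically-zero products (the representation is ours). On Example 2.3, every product term of φ3 = (x1x2 ∨ x̄1x2x3)(x̄2 ∨ x̄1x̄3) contains a complementary pair, so the resulting DNF is empty and consistency is detected immediately:

```python
def eliminate_var(phi, v):
    """One DNF-level elimination step: split phi as x̄_v ψ0 ∨ x_v ψ1 ∨ ψ2,
    expand ψ0 ψ1 by distributivity, and drop products containing a
    complementary pair. Terms are dicts {variable: literal value}."""
    psi0 = [{u: w for u, w in t.items() if u != v} for t in phi if t.get(v) is False]
    psi1 = [{u: w for u, w in t.items() if u != v} for t in phi if t.get(v) is True]
    psi2 = [t for t in phi if v not in t]
    products = []
    for c in psi0:
        for d in psi1:
            if all(d[u] == w for u, w in c.items() if u in d):  # no x x̄ pair
                merged = dict(c)
                merged.update(d)
                products.append(merged)
    return products + psi2
```

Absorption can then be applied to the result, as with any rewriting rule.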
φ6 = x 1 x2 x 3 ∨ x1 x 2 x3 ∨ x 1 x 2 x4 ∨ x 1 x3 ∨ x2 x3 x4 ∨ x4 x5 x6 ∨ x 4 x 5 x 6 ∨ x 4 ∨ x3 x5 x 6 .
Applying the unit literal rule, we see that x4 can be fixed to 1. This reduces φ6 to
x 1 x2 x 3 ∨ x1 x 2 x3 ∨ x 1 x 2 ∨ x 1 x3 ∨ x2 x3 ∨ x5 x6 ∨ x3 x5 x 6 .
Davis and Putnam’s original algorithm [261] is in fact a variant of the classi-
cal procedure Eliminate, especially tailored for the solution of DNF (or CNF)
equations. The additional rules proposed by these authors consist in maintaining the
DNF format throughout the procedure and in computing dynamically an effective
variable elimination ordering. Since both the unit literal rules and the monotone
literal rules lead to a simplification of the current DNF φj , it makes sense to apply
them first in the elimination algorithm. When the rules are no longer applicable,
Davis and Putnam [261] suggest proceeding with the elimination of any variable
that appears in a shortest term of φj (recall our discussion of branching rules in
Section 2.5).
Even with these refinements, however, the main computational hurdle of the
elimination method remains: Namely, the number of terms in the equation tends to
explode in the initial phases of the procedure, before it eventually decreases with the
number of variables. As a result, computer implementations of elimination proce-
dures rapidly face memory space problems, similar in nature to those encountered
by other dynamic programming algorithms. In effect, these problems are often
serious enough to prohibit the solution of equations involving many variables.
This difficulty was first noticed by Davis, Logemann, and Loveland [260] and led
them to replace the original form of the Davis-Putnam algorithm by a branching
procedure of the type discussed in Section 2.5. We will see in Sections 2.11.2
and 2.11.3, however, that elimination procedures are well suited for generating all
solutions or for computing parametric solutions of Boolean equations.
φ = xi C ∨ x i D ∨ ψ, (2.20)
of an inference rule (namely, the classical syllogism) that allows us to draw the
conclusion C D from the premises xi C and x i D.
This view of consensus derivation, as an operation producing new elemen-
tary conjunctions from existing ones, leads to a natural extension of the previous
concepts.
Definition 2.6. The elementary conjunction C can be derived by consensus from
a set S of elementary conjunctions if there exists a finite sequence C1 , C2 , . . . , Cp
of elementary conjunctions such that
(1) Cp = C, and
(2) for i = 1, . . . , p, either Ci ∈ S or there exist j < i and k < i such that Ci is
the consensus of Cj and Ck .
We are now ready to state the fundamental result that motivates the consideration
of consensus derivation.
Theorem 2.9. The DNF equation φ = 0 is inconsistent if and only if the (empty)
elementary conjunction 1 can be derived by consensus from the set of terms of φ.
Proof. As mentioned earlier, this theorem can be viewed as an immediate corollary
of the results in Chapter 3 (see Theorem 3.5 and its Corollary 3.4). For the sake of
completeness, we prove it here from first principles.
The “if” part of the statement follows directly from Theorem 2.8. For the “only
if” part, we assume that the DNF equation φ(x1 , x2 , . . . , xn ) = 0 is inconsistent,
and we proceed by induction on the number n of variables. The result is trivial if
n = 1. For n > 1, write φ as
φ = x n ψ0 ∨ xn ψ1 ∨ ψ2 , (2.21)
where ψ0 , ψ1 and ψ2 do not depend on xn . Theorem 2.6 implies that the equation
ψ0 ψ1 ∨ ψ2 = 0 is inconsistent.
Now use distributivity to rewrite ψ0 ψ1 ∨ ψ2 as a DNF of the form ψ = ⋁_{k=1}^{m} Ck , where each term Ck is either a term of φ or the
conjunction of a term of ψ0 with a term of ψ1 , namely, the consensus (on xn ) of
two terms of φ. Since ψ depends on n − 1 variables, we know by induction that the
constant 1 can be derived by consensus from {Ck | k = 1, 2, . . . , m}. This, however,
implies that 1 can be derived by consensus from the set of terms of φ.
A procedure for testing the consistency of DNF equations can now be stated
as in Figure 2.4. The correctness of the procedure is an immediate corollary of
Theorem 2.9 (note that the while-loop eventually terminates, since the number of
elementary conjunctions on n variables is finite).
Example 2.5. Consider the DNF equation φ(x1 , x2 , x3 , x4 ) = 0, where
φ = x 1 x2 x 3 ∨ x1 x 4 ∨ x 1 x 2 ∨ x 1 x3 ∨ x4 .
From the terms x 1 x2 x 3 and x 1 x 2 , we can derive the consensus x 1 x 3 . This new
term together with x 1 x3 yields the consensus x 1 . On the other hand, the term
Procedure Consensus(φ)
Input: A DNF expression φ(x1 , . . . , xn ) = ⋁_{k=1}^{m} Ck .
Output: Yes if the equation φ = 0 is consistent; No otherwise.
begin
S := {Ck | k = 1, 2, . . . , m};
while there exist two terms xi C and x i D in S such that
xi C and x i D have a consensus and CD is not in S do
if CD = 1 then return No
else S := S ∪ {CD};
return Yes;
end
x1 can be derived from x1 x 4 and x4 . Combining now the derived terms x 1 and
x1 , we can produce the constant 1, and we conclude that the equation φ = 0 is
inconsistent.
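The procedure Consensus of Figure 2.4 can be sketched as follows (terms as frozensets of (variable, value) literals, a representation of our own; two terms have a consensus exactly when they clash in a single variable). On Example 2.5 it derives the empty conjunction and answers No:

```python
def consensus_procedure(terms):
    """Saturate the set S under consensus; return 'No' as soon as the
    empty conjunction (the constant 1) is derived, 'Yes' at saturation."""
    S = set(terms)
    while True:
        new = None
        for t1 in S:
            for t2 in S:
                clash = [v for v, val in t1 if (v, not val) in t2]
                if len(clash) == 1:                 # consensus exists
                    v = clash[0]
                    cd = (t1 | t2) - {(v, True), (v, False)}
                    if not cd:
                        return 'No'                 # derived the constant 1
                    if cd not in S:
                        new = cd
                        break
            if new is not None:
                break
        if new is None:
            return 'Yes'                            # saturation, no constant 1
        S.add(new)
```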
Two features of the consensus procedure deserve further attention. First, Con-
sensus does not produce a solution of the DNF equation when there is one. Second,
Consensus is not completely defined, since we did not specify how the terms xi C
and x i D are to be chosen in the while-loop. We now successively tackle these two
points.
Consider first the fact that Consensus only delivers a consistency verdict for
DNF equations, but no solution. This is, from a theoretical viewpoint, no serious
problem. Indeed, as explained in Section 2.4, Consensus can easily be used as a
subroutine to produce a solution of consistent equations.
But the situation is actually even better here. Indeed, we shall prove in Chapter 3
that, when the procedure Consensus(φ) halts and returns the answer Yes, the set
S contains all prime implicants of the function represented by the DNF φ. The
knowledge of these prime implicants is, by itself, sufficient to produce a solution
of the equation φ = 0, as will also be explained in Chapter 3 (see Corollary 3.4).
Let us also notice, as a final remark on this topic, that the consensus procedure
and its various extensions have been mostly used as equation-solving techniques
within the field of automated theorem proving. As previously mentioned, many
applications in this particular field do not require the explicit finding of solu-
tions, since only inconsistent equations are “interesting” (because they correspond
to theorems). On the other hand, what is valuable in this context is an explicit
argument showing why a theorem is true (i.e., a proof of the theorem). A consensus derivation of inconsistency provides such an argument (although sometimes an insufficiently clear one; see [881, 925] for a more detailed discussion).
We now take up the second issue mentioned above: How are the terms xi C and
x i D to be selected in the while-loop of the consensus procedure? This question
is closely related to the question of selecting the next variable to branch upon in
2.8 Mathematical programming approaches 95
branching procedures, and some of the available strategies should be by now very
familiar.
A first strategy is to replace the condition “CD is not in S” in the while-
statement by the stronger condition “CD is not absorbed by any term in S.” The
procedure remains correct under this modification, as easily follows from the proof
of Theorem 2.9.
Another strategy, much in the spirit of the Davis-Putnam unit literal rule, is to
give priority to so-called unit consensus steps, namely, to pairs of terms {xi C, x i D}
such that either xi C or x i D is of degree 1. Note, for instance, that the consensus
of xi and x i D is simply D, which absorbs x i D. Thus, unit consensus steps can be
implemented without increasing the cardinality of the set S. If we restrict the pro-
cedure Consensus to the use of unit consensus steps, then the procedure becomes
extremely fast. But, unfortunately, it can fail to detect inconsistent equations. Nev-
ertheless, equation solving heuristics based on this approach are widely used in
automated reasoning procedures.
Similarly, a substantially accelerated heuristic algorithm is obtained when we
restrict consensus formation to pairs of terms of the form {xi C, x i CD}; indeed,
such a pair produces the term CD, which absorbs x i CD.
If Consensus starts by selecting all pairs of terms having a consensus on xn , as
long as they are available, and proceeds next to pairs of terms having a consensus on
xn−1 , xn−2 , . . . , x1 , then Consensus becomes essentially identical to the elimination
procedure.
Other specialized forms of consensus used in automated reasoning are the so-
called set of support strategy, linear consensus, input refutation, and so on. Some
of these variants will be introduced in subsequent chapters (e.g., in Chapter 6). We
also refer to [186, 571, 925] and to the exercises at the end of this chapter for more
information.
φ(x1 , x2 , . . . , xn ) = ⋁_{k=1}^{m} ( ⋀_{i∈Ak} xi ) ( ⋀_{j∈Bk} x̄j ) = 0 (2.22)

has the same set of solutions as IS(φ), where IS(φ) is the following system of linear inequalities in 0-1 variables:

Σ_{i∈Ak} (1 − xi ) + Σ_{j∈Bk} xj ≥ 1, k = 1, 2, . . . , m;
xi ∈ {0, 1}, i = 1, 2, . . . , n.
minimize z
subject to z + Σ_{i∈Ak} (1 − xi ) + Σ_{j∈Bk} xj ≥ 1, k = 1, 2, . . . , m;
xi ∈ {0, 1}, i = 1, 2, . . . , n;
z ∈ {0, 1}.
Proof. The first claim is just a restatement of Theorem 1.39 (see Section 1.13.6)
and the second one is an immediate corollary.
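The equivalence between the DNF equation (2.22) and the system IS(φ) can be verified by brute force on small instances; a sketch, with each term given by its index sets (Ak, Bk) (a representation of our own):

```python
from itertools import product

def term_value(A, B, X):
    """Value of the term  ∧_{i∈A} x_i ∧_{j∈B} x̄_j  at the 0-1 point X."""
    return int(all(X[i] for i in A) and all(not X[j] for j in B))

def satisfies_IS(terms, X):
    """Check the clause system IS(phi): sum_{i∈A}(1-x_i) + sum_{j∈B} x_j >= 1."""
    return all(sum(1 - X[i] for i in A) + sum(X[j] for j in B) >= 1
               for A, B in terms)

def equivalent_on_all_points(terms, nvars):
    """Brute-force check that phi(X) = 0 and IS(phi) have the same solutions."""
    for bits in product((0, 1), repeat=nvars):
        X = dict(zip(range(1, nvars + 1), bits))
        phi_val = max((term_value(A, B, X) for A, B in terms), default=0)
        if (phi_val == 0) != satisfies_IS(terms, X):
            return False
    return True
```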
minimize z
subject to z + Σ_{i∈Ak} (1 − xi ) + Σ_{j∈Bk} xj ≥ 1, k = 1, 2, . . . , m,
0 ≤ xi ≤ 1, i = 1, 2, . . . , n,
0 ≤ z ≤ 1.
When neither case (b) nor case (c) applies, then one of the variables assuming
a fractional value in X∗ can be selected for branching.
How effective is this particular version of Branch? Let us say that a variable
xi is fixed to the value 0 (respectively, 1) by unit consensus on φ if the term
xi (respectively, x i ) can be derived from the terms of φ by a sequence of unit
consensus steps (i.e., if the linear term xi , respectively x i , arises after iterated
applications of the unit literal rule on φ). Also, let us say that unit consensus
detects that φ = 0 is inconsistent if some variable xi can be fixed both to 0 and to
1 by unit consensus. The next result is due to Blair, Jeroslow, and Lowe [98].
Theorem 2.11. (a) If unit consensus does not detect that φ = 0 is inconsis-
tent, then there is a feasible solution (X ∗ , z∗ ) of LP(φ) in which z∗ = 0 and
xi∗ = 1/2 for each variable xi that is not fixed by unit consensus (i = 1, 2, . . . , n).
(b) For each i = 1, 2, . . . , n, if xi is fixed to the value u ∈ {0, 1} by unit consensus
on φ, then xi∗ = u in all those feasible solutions (X ∗ , z∗ ) of LP(φ) for which
z∗ = 0.
(c) The optimal value of LP(φ) is strictly positive if and only if unit consensus
detects that φ = 0 is inconsistent.
Proof. The theorem follows from the fact that, if all terms of φ have degree at least
2, then setting z∗ = 0 and xj∗ = 1/2 for j = 1, 2, . . . , n defines a feasible solution
of LP(φ). And conversely, if φ contains a term of degree 1, say, the term xi , then
LP(φ) contains the constraint
z + (1 − xi ) ≥ 1,
It follows from Theorem 2.11 that, when applied to problem IP(φ) (or IS(φ)),
a branch-and-bound algorithm does not detect inconsistency faster than the unit
literal rules. One may still hope that, in the course of solving the linear relaxation
LP(φ), integer solutions may be produced by “sheer luck,” thus accelerating the
basic branching procedure in the case of consistent equations. While this is true
to some extent, computational experiments indicate that this approach is rather
inefficient and that special-purpose heuristics tend to outperform this general-
purpose LP-based approach (see [98, 534] and Section 2.5.2).
The integer programming framework, however, also offers insights of a more
theoretical nature into the solution of Boolean equations. Let us first recall some
definitions from [197, 211, 812]. (We denote by ⌈x⌉ the smallest integer not smaller than x.)
Definition 2.7. Let A ∈ Zm×n , b ∈ Zm , and consider the system of linear inequalities I : (Ax ≥ b, x ≥ 0) for x ∈ Rn . A Chvátal cut for I is any inequality of the form cx ≥ δ, where c ∈ Zn and δ ∈ R, such that for some d ∈ R with ⌈d⌉ ≥ δ, the inequality cx ≥ d can be obtained as a nonnegative linear combination of the inequalities in I.
It should be clear that every integral vector x ∈ Zn that satisfies all the inequal-
ities in I also satisfies every Chvátal cut for I. Let us now consider the set of all
the inequalities that can be obtained by iterated computations of Chvátal cuts.
(1) cp = c, dp = d, and
(2) for i = 1, . . . , p, either the inequality ci x ≥ di is in I, or it is a Chvátal cut
for the system of inequalities (cj x ≥ dj : 1 ≤ j < i).
A deep theorem of Chvátal [197] asserts that, if the solution set of I is bounded,
then every linear inequality cx ≥ δ (c ∈ Zn , δ ∈ R) that is satisfied by all integral
solutions of I is in the Chvátal closure of I (see also [812, 211]). In particular,
if the system I has no integral solution, then the inequality 0 ≥ 1 must be in
its Chvátal closure. We are now ready to apply these concepts to the solution of
Boolean equations.
Theorem 2.12. The DNF equation φ(X) = 0 is inconsistent if and only if the
inequality 0 ≥ 1 is in the Chvátal closure of the system
Σ_{i∈Ak} (1 − xi ) + Σ_{j∈Bk} xj ≥ 1, k = 1, 2, . . . , m, (2.23)
0 ≤ xi ≤ 1, i = 1, 2, . . . , n. (2.24)
As observed by Cook, Coullard, and Turán [210], Definition 2.8 and Theo-
rem 2.12 suggest a purely algebraic cutting-plane proof system for establishing
the inconsistency of DNF equations. The next result, proved in Cook, Coullard,
and Turán [210]; Hooker [499, 500] and Williams [911], establishes a connection
between this approach and the consensus method.
Theorem 2.13. Let k ∈ {1, ..., n}; let A1 , A2 , B1 , B2 be subsets of {1, ..., n}\{k} such
that (A1 ∪ A2 ) ∩ (B1 ∪ B2 ) = ∅; and consider the system of inequalities
(1 − xk ) + Σ_{i∈A1} (1 − xi ) + Σ_{j∈B1} xj ≥ 1, (2.25)
xk + Σ_{i∈A2} (1 − xi ) + Σ_{j∈B2} xj ≥ 1, (2.26)
0 ≤ xi ≤ 1, i = 1, 2, . . . , n. (2.27)

Then the inequality

Σ_{i∈A1 ∪A2} (1 − xi ) + Σ_{j∈B1 ∪B2} xj ≥ 1 (2.28)

is a Chvátal cut for this system.
Proof. Take the sum of (2.25) and (2.26). Add (1 − xi ) ≥ 0 to the resulting inequality for each i that appears in exactly one of A1 , A2 , and add xj ≥ 0 for each j that appears in exactly one of B1 , B2 . Divide both sides of the resulting inequality by 2. These operations yield the valid inequality

Σ_{i∈A1 ∪A2} (1 − xi ) + Σ_{j∈B1 ∪B2} xj ≥ 1/2 , (2.29)
(1 − x1 ) + (1 − x2 ) + x3 + x4 ≥ 1,
x1 + (1 − x2 ) + x3 + (1 − x5 ) ≥ 1.
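In our reading of this example, the construction in the proof (sum the two inequalities, pad the singleton contributions, halve, and round the right-hand side up) produces the Chvátal cut (1 − x2) + x3 + x4 + (1 − x5) ≥ 1, the counterpart of the consensus on x1; its validity over 0-1 points can be checked exhaustively:

```python
from itertools import product

def premise1(x):
    return (1 - x[1]) + (1 - x[2]) + x[3] + x[4] >= 1

def premise2(x):
    return x[1] + (1 - x[2]) + x[3] + (1 - x[5]) >= 1

def cut(x):
    # Resolvent on x1, rounded from right-hand side 1/2 up to 1.
    return (1 - x[2]) + x[3] + x[4] + (1 - x[5]) >= 1

def cut_is_valid():
    """Every 0-1 point satisfying both premises must satisfy the cut."""
    for bits in product((0, 1), repeat=5):
        x = dict(zip(range(1, 6), bits))
        if premise1(x) and premise2(x) and not cut(x):
            return False
    return True
```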
φ(x1 , x2 , . . . , xn ) = ⋁_{k=1}^{m} ( ⋀_{i∈Ak} xi ) ( ⋀_{j∈Bk} x̄j ) (2.30)
f (x1 , x2 , . . . , xn ) = Σ_{k=1}^{m} ck ( Π_{i∈Ak} xi ) ( Π_{j∈Bk} (1 − xj ) ), (2.31)
Proof. The equivalence of statements (a) and (b) is obvious. Their equivalence with statement (c) follows from the claim that minX∈[0,1]n f (X) = minX∈{0,1}n f (X)
(as observed by Rosenberg [789], this property actually holds for every multilin-
ear function f ; see also Theorem 13.12 in Section 13.4.3). To see this, consider
an arbitrary point X ∗ ∈ [0, 1]n and assume that one of its components, say, x1∗ ,
is not integral. The restriction of f to xi = xi∗ for i ≥ 2, namely, the function
g(x1 ) = f (x1 , x2∗ , . . . , xn∗ ), is affine. Hence, g(x1 ) attains its minimum at a 0-1 point
x̂1 . This implies in particular that f (x̂1 , x2∗ , . . . , xn∗ ) ≤ f (x1∗ , x2∗ , . . . , xn∗ ). Continu-
ing in this way with any remaining fractional components, we eventually produce
a point X̂ ∈ {0, 1}n such that f (X̂) ≤ f (X ∗ ), which proves the claim and the
theorem.
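The rounding argument in the proof can be executed literally: since f is affine in each coordinate, moving any fractional component to its better 0-1 endpoint never increases f. A sketch for functions of the form (2.31), with terms given as triples (ck, Ak, Bk) (a representation of our own):

```python
def f_value(terms, X):
    """Multilinear function (2.31): sum_k c_k · prod_{i∈A_k} x_i ·
    prod_{j∈B_k} (1 - x_j), evaluated at a point X in [0,1]^n."""
    total = 0.0
    for c, A, B in terms:
        p = c
        for i in A:
            p *= X[i]
        for j in B:
            p *= (1 - X[j])
        total += p
    return total

def round_to_01(terms, X):
    """Coordinate-by-coordinate rounding from the proof of Theorem 2.14:
    each fractional x_i is moved to the 0-1 endpoint that does not
    increase the (affine in x_i) restriction of f."""
    X = dict(X)
    for i in list(X):
        if 0 < X[i] < 1:
            lo, hi = dict(X), dict(X)
            lo[i], hi[i] = 0, 1
            X = lo if f_value(terms, lo) <= f_value(terms, hi) else hi
    return X
```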
Any algorithm for nonlinear 0-1 programming can be used to optimize f (X)
over Bn (see Chapter 13 and the survey [469]). Hammer, Rosenberg, and Rudeanu
[458, 460], for instance, have proposed a variable elimination algorithm (inspired
from Eliminate) for minimizing functions of the form (2.31) over B^n. A stream-
lined version and an efficient implementation of this algorithm are described by
Crama, Hansen, and Jaumard [235], who also observe that this algorithm is appli-
cable to the solution of Boolean equations. The algorithm described in [235] relies
on numerical bounding procedures to control (to a certain extent) the combinatorial
explosion inherent to elimination procedures (see Section 2.6).
The coefficients ck are arbitrary in (2.31), and the performance of any opti-
mization algorithm based on Theorem 2.14 may be influenced by the choice of
these coefficients. Wah and Shang [894] propose a discrete Lagrangian algorithm
for minimizing (2.31), which can be viewed as starting with ck = 1 for all k and
dynamically adapting these values.
Recently, several authors have experimented with semidefinite programming
reformulations of Boolean equations based on extensions of Theorem 2.14; see, for
instance, Anjos [23, 24] and de Klerk, Warners, and van Maaren [266]. Gu [417]
combines various continuous global optimization algorithms with backtracking
techniques to compute the minimum of (2.31), or of closely related functions,
over [0, 1]n . Other nonlinear programming approaches to the solution of Boolean
φ = x̄n φ0 ∨ xn φ1 ∨ φ2, (2.33)
where φ0 , φ1 and φ2 are DNFs which do not involve xn , and suppose that the
first branching takes place on xn . Two subtrees are created, corresponding to the
equations φ0 ∨ φ2 = 0 and φ1 ∨ φ2 = 0. Say these trees have sizes β0 and β1 ,
respectively, where β0 + β1 = β − 1. Since both equations are inconsistent, the
constant 1 can be derived by consensus from each of them in, at most, β0 and β1
steps, respectively. Now, apply the same consensus steps to the terms of φ (note
2.10 More on the complexity of Boolean equations 105
Since solving Boolean equations is NP-hard, one may expect any solution
procedure to take an exponential number of steps on some classes of instances.
Identifying bad instances for any particular method, however, is not an easy task.
The so-called pigeonhole formulae have played an interesting role in this respect.
These formulae express that it is impossible to assign n + 1 pigeons to n holes
without squeezing two pigeons into the same hole. In Boolean terms, this rather
obvious fact of life translates into the inconsistency of the DNF equation
⋁_{i=1}^{n+1} ⋀_{k=1}^{n} x̄_{ik} ∨ ⋁_{i=1}^{n} ⋁_{j=i+1}^{n+1} ⋁_{k=1}^{n} x_{ik} x_{jk} = 0, (2.34)

where variable x_{ik} takes value 1 if the i-th pigeon is assigned to the k-th hole.
In a famous breakthrough result, Haken [433] showed that any consensus proof
of inconsistency has exponential length for the pigeonhole formulae. Other hard
examples for consensus (and hence, for branching and variable elimination) were
later provided by Urquhart [880] (see also Section 2.10.2).
It can be shown, however, that cutting-plane derivations of length O(n^3) are sufficient
to prove the inconsistency of (2.34) (see [210]). Exponential lower bounds
for cutting-plane proofs are provided by Pudlák [761]. Let us also mention that an
extended version of consensus has been introduced by Tseitin [872], and is known
to be at least as strong as cutting-plane proofs [210]. Interestingly, no exponential
lower bound has been established for this extended consensus algorithm. We refer
to Urquhart [882] for a discussion of the complexity of other proof systems.
A number of authors have examined upper bounds on the number of steps
required to prove the inconsistency of a DNF equation φ(x1 , x2 , . . . , xn ) = 0.
Branching procedures trivially require O(2^n) steps. Monien and Speckenmeyer
[690] have improved this bound by proving that a variant of the branching proce-
dure solves DNF equations of degree k in at most O(α_k^n) steps, where α_k is the
largest root of the equation

x^k = 2 x^{k−1} − 1,

for k = 1, 2, . . . , n. One computes: α3 = 1.618, α4 = 1.839, α5 = 1.928, and so on.
Note that α_k < 2 for all k, but that α_k quickly approaches 2 as k goes to infinity. It
is an open question whether DNF equations in n variables can be solved in O(α^n)
steps for some constant α < 2.
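The constants α_k can be reproduced numerically; the bisection sketch below (ours) exploits the fact that, for k ≥ 3, the largest root of x^k = 2x^{k−1} − 1 is the unique root in the interval (1.5, 2).

```python
# g(x) = x^k - 2 x^(k-1) + 1 is negative at x = 1.5 and equals 1 at
# x = 2 (for k >= 3), so bisection on (1.5, 2) converges to alpha_k.

def alpha(k, tol=1e-12):
    g = lambda x: x ** k - 2 * x ** (k - 1) + 1
    lo, hi = 1.5, 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid) > 0:
            hi = mid
        else:
            lo = mid
    return lo

print(round(alpha(3), 3), round(alpha(4), 3), round(alpha(5), 3))
# -> 1.618 1.839 1.928
```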
The above bounds have been subsequently improved by several authors; see, for
instance, Kullmann [588]; Schiermeyer [808]; Paturi, Pudlák, Saks, and Zane [731];
Dantsin et al. [255]. In particular, the algorithm in [255] requires (2 − 2/(k + 1))^n
steps for equations of degree k and O(1.481^n) steps for cubic DNF equations.
Proof. Each term Cj(X) takes value 1 in exactly 2^{n−|Cj|} points of B^n, for
j = 1, 2, . . . , m. So, φ(X) takes value 1 in at most Σ_{j=1}^{m} 2^{n−|Cj|} points of B^n.
If Σ_{j=1}^{m} 2^{−|Cj|} < 1, then φ(X) takes value 1 in fewer than 2^n points, which implies
that φ(X) = 0 is consistent. The second statement is an immediate corollary of the
first one.
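The counting argument just given is easy to confirm by enumeration. In the sketch below (the three-term DNF is our own toy example), terms are tuples of (variable, sign) pairs, with sign 1 for xi and 0 for x̄i.

```python
from itertools import product

# phi = x1 x2 ∨ x̄2 x3 x4 ∨ x1 x̄3 x4 on B^4; here sum_j 2^(-|Cj|) = 1/2 < 1,
# so phi = 0 must have at least 2^4 * (1 - 1/2) = 8 solutions.
terms = [((0, 1), (1, 1)),
         ((1, 0), (2, 1), (3, 1)),
         ((0, 1), (2, 0), (3, 1))]

def term_value(term, x):
    return all(x[i] == s for i, s in term)

bound = sum(2.0 ** -len(t) for t in terms)
solutions = [x for x in product((0, 1), repeat=4)
             if not any(term_value(t, x) for t in terms)]
print(bound, len(solutions))  # -> 0.5 9
```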
Of course, there is nothing really probabilistic about the previous result. In order
to improve the bound on m, however, several researchers have analyzed algorithms
which quickly find a solution of random equations with high probability. Following
previous work by Chao and Franco [187], Chvátal and Reed [202] were able to
show that, when k ≥ 2 and c < 2^k/(4k), random (n, m, k)-equations with m = c n
terms are consistent with probability approaching 1 as n goes to infinity (this is to
be contrasted with the lower bound on c in Theorem 2.16, which grows roughly
like 2^k ln 2).
These results motivate the following conjecture (see [209, 349, 418]).
Threshold conjecture. For each k ≥ 2, there exists a constant c∗ such that random
(n, cn, k)-equations are consistent with probability approaching 1 as n goes to
infinity when c < c∗ and are inconsistent with probability approaching 1 as n goes
to infinity when c > c∗ .
Despite its appeal, the considerable experimental evidence for its validity, and
the existence of similar zero-one laws for other combinatorial structures, the thresh-
old conjecture has only been established when k = 2. In this case, Chvátal and Reed
[202] and Goerdt [390] were able to show that the conjecture holds for the threshold
value c∗ = 1. This result was subsequently sharpened by several researchers; see in
particular the very tight results by Bollobás, Borgs, Chayes, Kim, and Wilson [101].
For k = 3, experiments indicate the existence of a threshold around the value
c∗ = 4.2, but at the time of this writing, the available bounds only imply that, if c∗
exists, then 3.26 < c∗ < 4.506 (see Achlioptas and Sorkin [4]; Dubois, Boufkhad,
and Mandler [282]; Janson, Stamatiou, and Vamvakari [525], etc.). In a remarkable
breakthrough, however, Friedgut [348] proved that a weak form of the threshold
conjecture holds for all k when c∗ is replaced by a function depending on n only.
Achlioptas and Peres [3] established that the conjecture holds asymptotically when
k → +∞ with c∗ = 2^k log 2 − O(k); see also Frieze and Wormald [350].
From an empirical point of view, it has been repeatedly observed that very
long and very short equations are easy for most algorithms, whereas hard nuts
occur in the so-called phase transition region, near the crossover point at which
about half the instances are (in)consistent. These observations clearly have impor-
tant consequences for the design of experiments aimed at assessing the quality of
equation solvers. They have progressively led researchers to focus their computa-
tional experiments on the solution of special subclasses of random equations, or
on structured equations derived from the encoding of hard combinatorial problems
(see, e.g., [57, 239, 505, 687, etc.]).
The concept of random equations has also been used to analyze the efficiency
of solution algorithms. In a far-reaching extension of the results of Haken [433]
and Urquhart [880] (see Section 2.10.1), Chvátal and Szemerédi [203] proved that
for all fixed integers c and k ≥ 3, there exists ε > 0 such that, for large n, almost no
random (n, cn, k)-equations have consensus proofs of inconsistency of length less
than (1 + ε)^n. In view of Theorems 2.15 and 2.16, this result actually implies that
almost all cubic equations with more than 5.191 n terms are hard for branching,
variable elimination, and consensus algorithms.
For more information on the analysis of random equations, we refer to an
extensive survey by Franco [344].
where f_{i1}, f_{i2}, . . . , f_{iq} are functions in the constraint set F, and X^{i1}, X^{i2}, . . . ,
X^{iq} are vectors of Boolean variables of appropriate lengths. So, an instance of
CSP(F) is defined by the list of applications (f_{ij}, X^{ij}), j = 1, 2, . . . , q.
Let us give a few examples of constraint satisfaction problems.
Example 2.7. Consider the constraint set F^{QUAD} = {f1, f2, f3, f4, f5} in which
the constraints are represented by the following expressions:
or, equivalently,
x1 x2 ∨ x2 x3 ∨ x1 x̄3 ∨ x1 x̄4 ∨ x̄1 x4 ∨ x̄3 x4 ∨ x̄2 x̄3 = 0.
g(x1, x2, x3) = x1 x2 x3 ∨ x̄1 x̄2 x̄3.
Note that in any point (x1∗, x2∗, x3∗) ∈ B^3, g(x1∗, x2∗, x3∗) = 1 if and only if x1∗ = x2∗ = x3∗.
Therefore, we call g the "cubic not-all-equal" constraint, a name which is in turn
reflected in the notation F^{3NAE}. The constraint satisfaction problem CSP(F^{3NAE})
is NP-complete (see the exercises at the end of the chapter).
So, depending on the class F, the problem CSP(F) may be either easy or hard,
as illustrated by the classes introduced in the previous example. Schaefer’s theorem
very accurately separates those classes for which CSP(F) is polynomially solvable
from those for which it is NP-complete. Before we can state this result, however,
we need a few more definitions.
Extending Definitions 1.12 and 1.30 in Chapter 1, we say that a Boolean function
is quadratic if it can be represented by a DNF in which each term contains at most
two variables, and that the function is Horn if it can be represented by a DNF in
which each term contains at most one complemented variable. Similarly, we say
that a function is co-Horn if it can be represented by a DNF in which each term
contains at most one noncomplemented variable. It will follow from the results in
Chapter 6 that CSP(F) is polynomially solvable when all the constraints in F are
Horn, or when they are all co-Horn.
Finally, we define a Boolean function f on B^n to be affine if the set of false
points of f is exactly the set of solutions of a system of linear equations over
GF(2), that is, if f can be represented by an expression of the form

f (x1, x2, . . . , xn) = ⋁_{A∈E0} ( ⊕_{i∈A} xi ) ∨ ⋁_{A∈E1} ( 1 ⊕ ⊕_{i∈A} xi ), (2.37)
Proof. Consider the system f (X) = 0, where f has the form (2.37), and assume
that the first term of (2.37) defines the equation
a ⊕ x1 ⊕ x2 ⊕ . . . ⊕ xn = 0,

so that xn can be eliminated from the system by the substitution

xn = a ⊕ x1 ⊕ x2 ⊕ . . . ⊕ xn−1,
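Consistency of an affine system can thus be tested by Gaussian elimination over GF(2); a self-contained sketch (the row format and function name are ours) is given below. Each row stores the n coefficients of x1, . . . , xn followed by the constant bit.

```python
# Gaussian elimination over GF(2): a row [c1, ..., cn, a] encodes the
# equation c1 x1 ⊕ ... ⊕ cn xn ⊕ a = 0 (i.e., the xor of the selected
# variables must equal a).

def gf2_solve(rows, n):
    """Return one solution of the system, or None if it is inconsistent."""
    rows = [r[:] for r in rows]
    pivot_of = {}                      # column -> index of its pivot row
    r = 0
    for c in range(n):
        p = next((i for i in range(r, len(rows)) if rows[i][c]), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        for i in range(len(rows)):     # clear column c everywhere else
            if i != r and rows[i][c]:
                rows[i] = [a ^ b for a, b in zip(rows[i], rows[r])]
        pivot_of[c] = r
        r += 1
    if any(not any(row[:n]) and row[n] for row in rows):
        return None                    # a row reduced to 0 = 1
    x = [0] * n                        # free variables set to 0
    for c, i in pivot_of.items():
        x[c] = rows[i][n]
    return x

# x1 ⊕ x2 = 1, x2 ⊕ x3 = 1, x1 ⊕ x3 = 0:
sol = gf2_solve([[1, 1, 0, 1], [0, 1, 1, 1], [1, 0, 1, 0]], 3)
print(sol)  # -> [0, 1, 0]
```

An inconsistent system (a row reducing to 0 = 1) returns None, which is exactly the polynomial-time consistency test needed for affine constraint sets.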
Theorem 2.19. If F satisfies either one of the following conditions (1)–(6), then
CSP(F) is polynomially solvable; otherwise, it is NP-complete:
(1) every function in F takes value 0 at the point (0, 0, . . . , 0);
(2) every function in F takes value 0 at the point (1, 1, . . . , 1);
(3) every function in F is quadratic;
(4) every function in F is Horn;
(5) every function in F is co-Horn;
(6) every function in F is affine.
Proof. The first half of the theorem is easy: CSP(F) is trivial under conditions (1)
and (2), and we have already discussed conditions (3)–(6). The NP-completeness
2.11 Generalizations of consistency testing 111
Theorem 2.19 underlines the special role played by quadratic, Horn, and affine
functions in Boolean theory. Chapter 5 and Chapter 6 contain a thorough discussion
of quadratic and Horn functions, respectively. Affine functions will not be further
handled in the book. The monograph [243] contains additional facts about these
functions, as well as several extensions and refinements of Theorem 2.19; see also
Creignou and Daudé [241, 242] for probabilistic extensions.
Finally, we note that Boros, Crama, Hammer and Saks [116] established another
theorem separating NP-hard from polynomially solvable instances of Boolean
Equation (see Section 6.10.2). Although the nature of their classification result
is very different from Schaefer’s classification, it also stresses the importance of
quadratic and Horn equations.
Note that we did not assume anything about the expression φ in Theorem 2.20,
and hence, this result can be used to generate all solutions of a general equation
of the form φ(X) = ψ(X). Approaches of the type described in the proof of
Theorem 2.20 have also been used in the machine learning literature; see, for
instance, Angluin [21].
Other approaches rely on ad hoc modifications of the equation-solving pro-
cedures described in previous sections in order to generate all solutions of the
given equation. For instance, straightforward extensions of the branching pro-
cedure Branch can be used to handle the problem. We describe here another
approach, based on an extension of the variable elimination technique and the
following simple observations.
Theorem 2.22. The point (x1∗, x2∗, . . . , xn∗) is a solution of the equation
φ(x1, x2, . . . , xn) = 0 if and only if (x1∗, . . . , xn−1∗) is a solution of the equation

φ(x1, . . . , xn−1, 0) φ(x1, . . . , xn−1, 1) = 0,

and xn∗ is a solution of the one-variable equation

φ(x1∗, . . . , xn−1∗, xn) = 0.
Proof. This is trivial.
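Theorem 2.22 immediately suggests a recursive scheme for generating all solutions; a sketch follows (the callable interface for φ, taking a 0-1 list and returning 0 or 1, is our own convention).

```python
# Enumerate all solutions of phi = 0 by eliminating the last variable:
# recurse on phi(..., 0) AND phi(..., 1), then extend each partial
# solution by every feasible value of the eliminated variable.

def all_solutions(phi, n):
    if n == 0:
        return [[]] if phi([]) == 0 else []
    phi_elim = lambda x: phi(x + [0]) * phi(x + [1])  # conjunction
    out = []
    for head in all_solutions(phi_elim, n - 1):
        for b in (0, 1):
            if phi(head + [b]) == 0:
                out.append(head + [b])
    return out

# phi = x1 x̄2 ∨ x̄1 x2, whose solutions are the points with x1 = x2:
phi = lambda x: x[0] * (1 - x[1]) or (1 - x[0]) * x[1]
print(all_solutions(phi, 2))  # -> [[0, 0], [1, 1]]
```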
Theorem 2.23. Let X∗ = (x1∗, x2∗, . . . , xn∗) be a particular solution of the Boolean
equation φ(X) = 0, and consider the functions σi : B^n → B defined by
σ1 = φ3(p1, p2, p3) ∨ p1 φ̄3(p1, p2, p3),
σ2 = p2 φ̄3(p1, p2, p3),
σ3 = p3 φ̄3(p1, p2, p3).
Some additional manipulations show that this parametric solution can be alterna-
tively represented as (σ1 , σ2 , σ3 ) = (1, p1 p2 p3 , 0), and that it correctly describes
the two solutions of φ3 = 0, namely, the points (1, 0, 0) and (1, 1, 0).
Another type of parametric solution can be derived from the variable elimination
principle. Note first that a parametric solution of a consistent one-variable equation
φ(x) = a x ∨ b x̄ = 0 is given by σ(p) = p ā ∨ p̄ b = p φ̄(1) ∨ p̄ φ(0) (compare
with Theorem 2.21). This observation leads to the following reformulation of
Theorem 2.22.
Theorem 2.22.
Theorem 2.24. The point (x1∗, x2∗, . . . , xn∗) is a solution of the equation
φ(x1, x2, . . . , xn) = 0 if and only if (x1∗, . . . , xn−1∗) is a solution of the equation
φn−1(x1, . . . , xn−1) = 0 and

xn∗ = p φ̄(x1∗, . . . , xn−1∗, 1) ∨ p̄ φ(x1∗, . . . , xn−1∗, 0) for some p ∈ B.
Example 2.9. Let us again return to Example 2.2. Using Theorem 2.24 and the
expression φ1 = x̄1 derived in Example 2.2, we find σ1 = p1 φ̄1(1) ∨ p̄1 φ1(0) =
p1 ∨ p̄1 = 1.
Next, σ2 = p2 φ̄2(σ1, 1) ∨ p̄2 φ2(σ1, 0) = p2 φ̄2(1, 1) ∨ p̄2 φ2(1, 0). In view of
Example 2.2, φ2(1, 0) = φ2(1, 1) = 0, so that σ2 = p2.
Finally, σ3 = p3 φ̄3(σ1, σ2, 1) ∨ p̄3 φ3(σ1, σ2, 0) = p3 φ̄3(1, p2, 1) ∨ p̄3 φ3(1, p2, 0).
From the expression of φ3, we find immediately that σ3 = 0.
Note that this solution (σ1 , σ2 , σ3 ) = (1, p2 , 0) is in triangular form, as opposed
to the solution derived in Example 2.8, and that it is reproductive.
maximizes the total weight of the terms canceled by X∗. In other words, Max Sat
is the optimization problem

maximize Σ_{k=1}^{m} { wk | Ck(X) = 0 } subject to X ∈ B^n.
The name Max Sat refers more properly to a dual version of the problem
in which the objective is to maximize the number of satisfied clauses of a CNF
ψ(X) = ⋀_{k=1}^{m} Dk, where a clause Dk is satisfied if it takes value 1. Clearly, both
versions of the problem are equivalent. To be consistent with the remainder of
the book, we carry on the discussion in terms of DNFs; but the terminology Max
Sat is so deeply entrenched that we prefer to apply it to this DNF version as well,
rather than inventing some neologism like “maximum falsifiability problem.”
Max Sat is a natural generalization of DNF equations, viewed as collections
of logical conditions C1 (X) = 0, C2 (X) = 0, . . . , Cm (X) = 0. When the equation
φ(X) = 0 is inconsistent, we may be happy to find a model X∗ that satisfies as
many of the conditions as possible. Applications are discussed, for instance, in
Hansen and Jaumard [468].
Let us call Max d-Sat the restriction of Max Sat to DNFs of degree d. In view
of Cook’s theorem, Max d-Sat is NP-hard for all d ≥ 3. But a stronger statement
can actually be made (Garey, Johnson, and Stockmeyer [372]).
Theorem 2.25. Max 2-Sat is NP-hard, even when w1 = w2 = . . . = wm = 1.
Proof. The problem of solving the DNF equation φ(x1, x2, . . . , xn) = ⋁_{k=1}^{m} Ck = 0
is NP-complete even when all terms of φ have degree exactly 3 (see [208, 371],
Theorem 2.4 and Exercise 4). With such an equation, we associate an instance of
Max 2-Sat on B n+m , as follows. First, we introduce m new variables y1 , y2 , . . . , ym .
Next, for all k = 1, 2, . . . , m, if Ck = u1 u2 u3 is the kth term of φ, where u1 , u2 , u3
are distinct literals, we create a subformula ψk consisting of 10 terms:
ψk = ū1 ∨ ū2 ∨ ū3 ∨ u1 u2 ∨ u1 u3 ∨ u2 u3 ∨ yk ∨ u1 ȳk ∨ u2 ȳk ∨ u3 ȳk.
Finally, the instance of Max 2-Sat is the DNF ψ = ⋁_{k=1}^{m} ψk, with weight 1 on
each term. We claim that φ = 0 is consistent if and only if the optimal value of this
Max 2-Sat instance is at least 7m.
Indeed, suppose that φ(X ∗ ) = 0 for some point X ∗ ∈ B n , and consider a term
Ck = u1 u2 u3 . Either 1, 2, or 3 of the literals u1 , u2 , u3 take value 0 at X ∗ . If only
one of the literals is 0, then set yk∗ = 1; otherwise, set yk∗ = 0. The resulting point
(X∗ , Y ∗ ) cancels 7 terms of each DNF ψk , for k = 1, 2, . . . , m, and hence it cancels
7m terms of ψ.
Conversely, assume that the point (X ∗ , Y ∗ ) cancels 7m terms of ψ. For
k = 1, 2, . . . , m, it is easy to see that no assignment of values to u1 , u2 , u3 , yk cancels
more than 7 terms of ψk . Moreover, if u1 = u2 = u3 = 1, then at most 6 terms of
ψk can be cancelled. Therefore, (X ∗ , Y ∗ ) must cancel exactly 7 terms of each DNF
ψk , and X ∗ must be a solution of the equation φ(X) = 0.
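The counting claims in this proof can be verified exhaustively; in the sketch below (ours), the ten terms of ψk are written out with the complementations made explicit, and `cancelled` counts the terms taking value 0.

```python
from itertools import product

# Gadget psi_k = ū1 ∨ ū2 ∨ ū3 ∨ u1 u2 ∨ u1 u3 ∨ u2 u3 ∨ y ∨ u1 ȳ ∨ u2 ȳ ∨ u3 ȳ.

def cancelled(u1, u2, u3, y):
    terms = [1 - u1, 1 - u2, 1 - u3,
             u1 * u2, u1 * u3, u2 * u3,
             y,
             u1 * (1 - y), u2 * (1 - y), u3 * (1 - y)]
    return sum(t == 0 for t in terms)

best = {}
for u1, u2, u3 in product((0, 1), repeat=3):
    best[(u1, u2, u3)] = max(cancelled(u1, u2, u3, y) for y in (0, 1))

# No assignment cancels more than 7 terms; 7 is reached exactly when the
# term u1 u2 u3 itself is cancelled (at least one literal equals 0).
print(all(v <= 7 for v in best.values()),
      all(v == 7 for k, v in best.items() if k != (1, 1, 1)),
      best[(1, 1, 1)])
# -> True True 6
```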
The following extension of Theorem 2.10 and Theorem 2.14 will be useful in
the sequel (see also Theorem 13.13 in Section 13.4.3).
Theorem 2.26. If

φ(x1, x2, . . . , xn) = ⋁_{k=1}^{m} ( ⋀_{i∈Ak} xi ) ( ⋀_{j∈Bk} x̄j ), (2.41)

then the optimal value of Max Sat is equal to the optimal value of the 0-1 linear
programming problem

maximize Σ_{k=1}^{m} wk zk (2.42)

subject to Σ_{i∈Ak} (1 − xi) + Σ_{j∈Bk} xj ≥ zk, k = 1, 2, . . . , m; (2.43)
So, Max Sat can be seen as either a linear or a nonlinear optimization problem
in 0-1 variables and can, in principle, be solved by any 0-1 programming algorithm
(see, e.g., [707] and Chapter 13). Rather than diving into the details of specific
implementations, we restrict ourselves here to a few elegant results concerning
the performance of approximation algorithms for Max Sat, as these results tie in
nicely with previous sections of the chapter. We begin with a definition.
Definition 2.13. Let 0 < α ≤ 1. An α-approximation algorithm for Max Sat is
a polynomial-time algorithm which, for every instance of Max Sat, produces a
point X̂ ∈ B^n such that

Σ_{k=1}^{m} { wk | Ck(X̂) = 0 } ≥ α max_{X∈B^n} Σ_{k=1}^{m} { wk | Ck(X) = 0 }.
where W^{MS} is the optimal value of Max Sat. Therefore, starting from the point X^H
and proceeding as in the last part of the proof of Theorem 2.14, we can produce a
point X̂ ∈ {0, 1}^n such that f (X̂) ≥ f (X^H) ≥ (1 − 1/2^d) W^{MS}. This procedure clearly
runs in polynomial time, and hence, the algorithm that returns X̂ is a (1 − 1/2^d)-
approximation algorithm.
Note that the proof actually establishes a little bit more than what we claimed:
Namely, the algorithm always returns an assignment with value at least (1/2) Σ_{k=1}^{m} wk.
This shows in particular that, for any DNF equation φ = 0, there exists a point that
cancels at least half of the terms of φ.
Our proof of Theorem 2.27 is inspired by a probabilistic argument due to
Yannakakis [934]. In this approach, each variable xi is independently set to either
0 or 1 with probability 1/2, and f (1/2, 1/2, . . . , 1/2) is interpreted as the expected objective
value of this random assignment. Then, the method of conditional probabilities is
used to “derandomize” the procedure. The above proof translates this probabilistic
method into a purely deterministic one (but not every probabilistic algorithm can
be so easily derandomized; see, for instance, [689] for a brief introduction to
probabilistic algorithms).
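The method of conditional probabilities admits a compact sketch (the data format and names below are ours): the expected weight of cancelled terms under uniform random bits has a closed form, so each variable can be fixed, in turn, to the value with the larger conditional expectation.

```python
# Derandomized rounding for Max Sat (DNF version: we want terms to take
# value 0). Terms are lists of (variable, sign) pairs, sign 1 for x_i
# and 0 for its complement.

def expected_cancelled(terms, weights, fixed):
    total = 0.0
    for term, w in zip(terms, weights):
        p_term_true = 1.0        # probability that the term evaluates to 1
        for i, s in term:
            if i in fixed:
                p_term_true *= 1.0 if fixed[i] == s else 0.0
            else:
                p_term_true *= 0.5
        total += w * (1.0 - p_term_true)
    return total

def conditional_rounding(terms, weights, n):
    fixed = {}
    for i in range(n):           # keep the value with larger expectation
        e0 = expected_cancelled(terms, weights, {**fixed, i: 0})
        e1 = expected_cancelled(terms, weights, {**fixed, i: 1})
        fixed[i] = 0 if e0 >= e1 else 1
    return fixed

# phi = x1 x2 ∨ x̄1 x3 ∨ x2 x̄3, unit weights:
terms = [[(0, 1), (1, 1)], [(0, 0), (2, 1)], [(1, 1), (2, 0)]]
x = conditional_rounding(terms, [1, 1, 1], 3)
num_cancelled = sum(any(x[i] != s for i, s in t) for t in terms)
print(num_cancelled)  # -> 3 (the a-priori guarantee was only 3 * 3/4)
```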
Theorem 2.27 has been subsequently improved by several authors, but the
first real breakthrough came with a 3/4-approximation algorithm proposed by
Yannakakis [934] (note that Johnson's algorithm has a performance guarantee
equal to 3/4 for DNFs without linear terms). Goemans and Williamson [389] later
proposed another, simpler 3/4-approximation algorithm, which we now describe.
We need some preliminary results.
Define the sequence

βt = 1 − (1 − 1/t)^t, t ∈ N.
Proof. Assume without loss of generality that |A| = n and B = ∅. The arithmetic-
geometric mean inequality yields

( ∏_{i=1}^{n} xi )^{1/n} ≤ ( Σ_{i=1}^{n} xi ) / n,

or equivalently,

∏_{i=1}^{n} xi ≤ ( ( Σ_{i=1}^{n} xi ) / n )^n.

From (2.48), Σ_{i=1}^{n} xi ≤ n − z. Hence,

∏_{i=1}^{n} xi ≤ ( (n − z) / n )^n = (1 − z/n)^n.
As in the proof of Theorem 2.14, we can find in polynomial time a point X̂ ∈ {0, 1}^n
such that f (X̂) ≥ f (X^{LP}) ≥ βd W^{MS}.
The second part of the statement follows from lim_{t→∞} βt = 1 − 1/e.
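The first values of the sequence βt, and its limit, can be checked directly:

```python
import math

# beta_t = 1 - (1 - 1/t)^t decreases from beta_1 = 1 toward 1 - 1/e.
beta = lambda t: 1 - (1 - 1 / t) ** t
print(beta(1), beta(2), round(beta(3), 4), round(1 - 1 / math.e, 4))
# -> 1.0 0.75 0.7037 0.6321
```

In particular β2 = 3/4, which is the guarantee relevant to quadratic terms.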
and no algorithm for Max 3-Sat (and, a fortiori, for Max Sat) can achieve a
better guarantee than 0.875. Hence, the performance guarantee in [549] is best
possible for Max 3-Sat, but a small gap remains between the known upper and
lower bounds for Max 2-Sat. Khot et al. [567] have shown that, if the so-called
“Unique Games Conjecture” holds, then it is NP-hard to approximate Max 2-Sat
to within any factor greater than 0.943, a bound that is extremely close to the
approximation ratio of 0.9401 due to Lewin, Livnat, and Zwick [611].
Escoffier and Paschos [315] analyze the approximability of Max Sat
under a different type of metric, namely the differential approximation ratio.
Creignou [240] and Khanna, Sudan, and Williamson [563] investigate and classify
some generalizations of Max Sat; see also Creignou, Khanna, and Sudan [243]
for a complete overview.
We have concentrated in this section on the approximability of Max Sat. On
the computational side, numerous algorithms have been proposed for the solu-
tion of Max Sat problems. Most of these algorithms rely on generalizations of
techniques described in previous sections, especially in Section 2.8. We do not
discuss these approaches in detail, and we refer instead to early work by Hansen
and Jaumard [468]; to the papers [55, 540, 557, 785] in the volume edited by Du,
Gu, and Pardalos [278]; to the book by Hoos and Stützle [508], and so on. Recent
efficient algorithms are proposed by Ibaraki et al. [515] or Xing and Zhang [926].
De Klerk and Warners [265] examine the computational performance of semidefi-
nite programming algorithms for Max Sat. We also refer to Chapter 13 for a more
general discussion of pseudo-Boolean optimization.
2.12 Exercises
1. Given an undirected graph G and an integer K, write a DNF φ such that the
equation φ = 0 is consistent if and only if G is K-colorable.
2. Prove that Boolean Equation can be solved in polynomial time when
restricted to DNF equations in which every variable appears at most twice.
3. Complete the proof of Theorem 2.18.
4. Prove that every Boolean equation can be transformed in linear time into an
equivalent DNF equation in which all terms have degree exactly equal to 3.
5. Prove that Boolean Equation is NP-complete, even when restricted to
DNF equations in which every variable appears at most three times.
6. Prove that Boolean Equation is NP-complete, even when restricted to
cubic DNF equations in which every term is either positive or negative.
7. Let ψ(X, Y ) be the DNF produced by the procedure Expand∗ when running
on the expression φ(X). Prove that, for all X∗ ∈ B^n, φ(X∗) = 0 if and only if
there exists Y∗ ∈ B^m such that ψ(X∗, Y∗) = 0. Show that Y∗ is not necessarily
unique.
8. Prove that the following problem is NP-hard: Given a DNF φ = ⋁_{k∈A} Tk,
find a largest subset of terms, say, B ⊆ A, such that the "relaxed" DNF
ψ = ⋁_{k∈B} Tk is monotone (see Section 2.5 and [229]).
This chapter is dedicated to two of the most important topics in the theory of
Boolean functions. The first is concerned with the basic building blocks of a
Boolean function, namely, its prime implicants. The set of prime implicants of
a Boolean function not only defines the function, but also provides detailed
information about many of its properties. In this chapter, we discuss various appli-
cations and basic properties of prime implicants and describe several methods for
generating all the prime implicants of a Boolean function.
The second deals with problems related to the representation of a Boolean
function by a DNF; that is, as a disjunction of elementary conjunctions. Since a
Boolean function may have numerous DNF representations, the question of finding
an “optimal” one plays a very important role. Among the most commonly consid-
ered optimality criteria, we discuss in detail the minimization of both the number
of terms and the number of literals in a DNF representation of a given function. We
explain the close relationship between these “logic minimization” problems and
the well-known set covering problem of combinatorial optimization; we describe
several efficient DNF simplification procedures; we establish the computational
complexity of logic minimization problems; and we present a “greedy” procedure
as an efficient and effective approximation algorithm for logic minimization.
124 3 Prime implicants and minimal DNFs
xy ∨ yz ∨ xz.
φ(x, y, z) = x y z ∨ x y z ∨ x z ∨ y z. (3.1)
This DNF is logically equivalent to the prime DNF φ = x̄ y ∨ z, so that the original
rules 1–4 are equivalent to the conjunction of the following two rules:
Rule 5: If y is true then x is true
Rule 6: z is false
The two rule systems are logically equivalent in the sense that any logical deduction
that follows from one system also follows from the other one.
The foregoing example shows how the application of the notion of prime impli-
cants allows the simplification of an arbitrary system of rules. Moreover, any
3.1 Prime implicants 125
implicant of the associated DNF corresponds to a rule that can be deduced from
the rule system, and vice versa. For example, the term x̄ z is an implicant of the
DNF φ(x, y, z), and therefore the rule
Rule 7: If z is true then x is true
can be deduced from the rule system.
Note that Rule 7 is not very interesting, since a more general rule (namely,
Rule 6: “z is false”) can also be deduced. Since z is a prime implicant of φ(x, y, z),
it is impossible to deduce a more general rule than the latter one. It is therefore
natural to consider the so-called irredundant rules, which correspond to the prime
implicants of the associated Boolean function. The complete DNF of the associated
Boolean function will provide all the irredundant rules that can be deduced from
the given rule system. While some of these rules may be present in the original
rule system or can be obtained by generalizing the rules of the original system (i.e.,
by removing some literals from them), some other irredundant rules may bear no
evident similarity to any of the initial rules. Such rules can reveal some possibly
interesting logical implications that are “hidden” in the original system.
Moreover, an elementary conjunction different from the x_i^{α_i}, i = 1, . . . , m, is a prime
implicant of f (x1, . . . , xn) if and only if it is a prime implicant of g(xm+1, . . . , xn).
The decomposition provided by Theorem 3.1 allows the reduction of many
problems involving Boolean functions to the case of Boolean functions without
linear implicants.
Prime implicants of degree 2, also called quadratic prime implicants, define a
partial order among certain literals. Indeed, if x_1^{α_1} x_2^{α_2} is an implicant of a Boolean
function f (x1, . . . , xn), then the inequality

x_1^{α_1} ≤ x̄_2^{α_2},
f = xy ∨ yz ∨ xwz.
Proof. (1) Let x_1^β x_2^γ C be a prime implicant of f. Clearly, γ = β, since otherwise
x_1^β x_2^γ C is absorbed by x1 x̄2 or x̄1 x2. However, in this case x_1^β C is an implicant,
since

x_1^β C = x_1^β x_2^β C ∨ x_1^β x_2^β̄ C ≤ x_1^β x_2^β C ∨ x_1^β x_2^β̄ ≤ f.

Therefore, x_1^β x_2^γ C is not prime, since it is absorbed by x_1^β C. This proves statement 1.
(2) If x_1^α C is an implicant of f, then x_2^α C is an implicant of f, since

x1 x̄2 ∨ x̄1 x2 ∨ x_2^α C = x_1^α x_2^ᾱ ∨ x_1^α x_2^α C ∨ x_1^ᾱ x_2^α ∨ x_1^ᾱ x_2^α C = x_1^ᾱ x_2^α ∨ x_1^α (x_2^ᾱ ∨ x_2^α C)
Prime Implicants
Instance: An arbitrary expression of a Boolean function f .
Output: The complete DNF of f or, equivalently, a list of all prime implicants of f .
This problem has been intensively investigated in the literature since the early
1930s. Its complexity depends very much on the expression of f . We shall suc-
cessively handle the cases in which f is given by a list of its true points, or by an
arbitrary DNF, or by a CNF.
[Y, Z] = ⋀_{i: yi=zi=1} xi ⋀_{j: yj=zj=0} x̄j.
[Y , Z] = [Z, Y ]
and
[Y , Z](Y ) = [Y , Z](Z) = 1.
In fact, the set of true points of [Y , Z] is the smallest subcube covering both Y
and Z.
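The term [Y, Z] is immediate to compute; a small sketch (ours, with literals as (variable, sign) pairs) follows.

```python
# hull(Y, Z) builds the conjunction that fixes x_i = 1 where both points
# are 1 and x_j = 0 where both are 0; its true points form the smallest
# subcube containing Y and Z.

def hull(Y, Z):
    return ([(i, 1) for i, (y, z) in enumerate(zip(Y, Z)) if y == z == 1] +
            [(j, 0) for j, (y, z) in enumerate(zip(Y, Z)) if y == z == 0])

def term_value(term, X):
    return int(all(X[i] == s for i, s in term))

Y, Z = (1, 0, 1, 0), (1, 1, 1, 0)
t = hull(Y, Z)                  # fixes x1 = 1, x3 = 1, x4 = 0
print(t, term_value(t, Y), term_value(t, Z))
# -> [(0, 1), (2, 1), (3, 0)] 1 1
```

Note that hull(Y, Z) = hull(Z, Y), matching the symmetry [Y, Z] = [Z, Y].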
Proof. Let C = ⋀_{i∈P} xi ⋀_{j∈N} x̄j. Since C(Y) = 1, the point Y = (y1, y2, . . . , yn)
is such that yi = 1 for i ∈ P and yj = 0 for j ∈ N. Let us define the point
Z = (z1, z2, . . . , zn) in the following way:

zi = yi, if i ∈ P ∪ N,
zi = ȳi, if i ∉ P ∪ N.
Proof. By Theorem 3.3, every implicant of f is the hull of some pair of (possibly
identical) true points of f , and every pair of true points generates in this way at
most one implicant of f .
C and C′ have the same degree, then their order is the lexicographic order induced
by the linear order of literals, whereby xi is before x̄i, x̄i is before ∗ (meaning
"not present"), and xi is before xj if i < j.
A comparison of two elementary conjunctions according to this order can be
performed in O(n) time, and the set of implicants of f can be linearly ordered in
O(nM log M) time. Ordering the list also allows us to eliminate possible repeti-
tions. Then, using binary search, one can check whether a conjunction is present
in the list by doing at most log M comparisons, that is, in O(n log M) time. For
any implicant C, at most n conjunctions C^j need to be checked. Therefore, all the
nonprime implicants can be eliminated from the list in O(n^2 M log M) time. When
M is sufficiently large, this bound is better than the naive O(nM^2) bound.
The arguments above prove the following statement:
Note that for those Boolean functions whose number of true points is sufficiently
large, the additional expense of reducing the list of all implicants and keeping only
the prime ones is asymptotically negligible compared with the time required to
generate all the implicants.
Procedure Consensus*(φ)
Input: A DNF expression φ(x1, . . . , xn) = ⋁_{k=1}^{m} Ck of a Boolean function f.
Output: The complete DNF of f, that is, the disjunction of all prime implicants of f.
begin
while one of the following conditions applies do
if there exist two terms C and D of φ such that C absorbs D
then remove D from φ: φ := φ \ D;
if there exist two terms xi C and x i D of φ such that xi C and x i D
have a consensus and CD is not absorbed by another term of φ
then add CD to φ: φ := φ ∨ CD ;
end while
return φ;
end
We shall say that a DNF is closed under absorption if it satisfies the first
condition above, and that it is closed under consensus if it satisfies the second
condition.
Note that the consensus procedure always terminates and produces a DNF
closed under consensus and absorption in a finite number of steps: Indeed, the
number of terms in the given variables is finite, and once a term is removed by
absorption, it will never again be added by consensus.
$$\varphi(x_1, x_2, x_3, x_4) = x_1\bar{x}_2x_3 \lor \bar{x}_1\bar{x}_2x_4 \lor x_2x_3x_4.$$
The consensus of the first two terms is $\bar{x}_2x_3x_4$, and adding it to φ yields
$$\varphi'(x_1, x_2, x_3, x_4) = x_1\bar{x}_2x_3 \lor \bar{x}_1\bar{x}_2x_4 \lor x_2x_3x_4 \lor \bar{x}_2x_3x_4.$$
The consensus of the last two terms of φ′ is $x_3x_4$, and adding it yields
$$\varphi''(x_1, x_2, x_3, x_4) = x_1\bar{x}_2x_3 \lor \bar{x}_1\bar{x}_2x_4 \lor x_2x_3x_4 \lor \bar{x}_2x_3x_4 \lor x_3x_4.$$
Now the last term of φ″ absorbs the two previous terms, and φ″ is transformed
into
$$\varphi'''(x_1, x_2, x_3, x_4) = x_1\bar{x}_2x_3 \lor \bar{x}_1\bar{x}_2x_4 \lor x_3x_4.$$
132 3 Prime implicants and minimal DNFs
Here, the consensus procedure stops. Note that the first two terms of φ‴ actually
have a consensus, but it is absorbed by the last term.
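The procedure above is small enough to sketch in executable form. The following is an illustrative Python fragment (not the authors' code), assuming terms are encoded as frozensets of (variable, polarity) pairs, e.g. $x_1\bar{x}_2x_3$ becomes {(1, True), (2, False), (3, True)}; on the example above it returns the complete DNF $x_1\bar{x}_2x_3 \lor \bar{x}_1\bar{x}_2x_4 \lor x_3x_4$.

```python
# Illustrative sketch of the consensus procedure (absorption + consensus closure).
# A term is a frozenset of (variable, polarity) pairs.

def absorbs(c, d):
    """Term c absorbs term d when every literal of c appears in d (C v CD = C)."""
    return c <= d

def consensus(c, d):
    """Consensus of c and d when they conflict in exactly one variable, else None."""
    conflicts = [v for (v, s) in c if (v, not s) in d]
    if len(conflicts) != 1:
        return None
    return frozenset(l for l in c | d if l[0] != conflicts[0])

def consensus_closure(terms):
    """Apply absorption and consensus until neither operation changes the DNF."""
    phi = set(terms)
    changed = True
    while changed:
        changed = False
        # absorption: drop any term absorbed by a distinct term
        for d in list(phi):
            if any(c != d and absorbs(c, d) for c in phi):
                phi.discard(d)
                changed = True
        # consensus: add any consensus not absorbed by an existing term
        for c in list(phi):
            for d in list(phi):
                e = consensus(c, d)
                if e is not None and not any(absorbs(t, e) for t in phi):
                    phi.add(e)
                    changed = True
    return phi

# The example above: phi = x1 ~x2 x3  v  ~x1 ~x2 x4  v  x2 x3 x4
T = lambda *lits: frozenset(lits)
phi = [T((1, True), (2, False), (3, True)),
       T((1, False), (2, False), (4, True)),
       T((2, True), (3, True), (4, True))]
result = consensus_closure(phi)
```

The inner checks mirror the two conditions of the procedure; the loop terminates for the reason given in the text: the set of possible terms is finite, and a term removed by absorption is never re-added, since its absorber (or a term absorbing the absorber) stays in φ.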
$$C \lor CD = C.$$
$$xC \lor \bar{x}D = xC \lor \bar{x}D \lor CD.$$
Before we proceed with the second proof of Theorem 3.5, we first establish
a lemma. Recall from Section 1.9 that, if A and B are two disjoint subsets of
{1, . . . , n}, then the set of all Boolean vectors in $\mathbb{B}^n$ whose coordinates in A are
fixed at 1 and whose coordinates in B are fixed at 0 forms a subcube of $\mathbb{B}^n$. This
subcube is denoted by $T_{A,B}$.
Lemma 3.5. Let φ be a DNF closed under consensus and let TA,B be a subcube.
The equation φ(X) = 0 has a solution in TA,B if and only if no term of φ is
identically 1 on TA,B .
Proof. The “only if ” part of the statement is trivial. Let us prove the “if ” part by
contradiction. Assume that TA∗ ,B ∗ is a subcube such that no term of φ is identically
1 on TA∗ ,B ∗ , and such that no solution of φ(X) = 0 exists in TA∗ ,B ∗ ; moreover,
assume that $A^* \cup B^*$ has maximum cardinality among all subcubes satisfying
these conditions. Clearly, |A∗ ∪ B ∗ | ≤ n − 1, since the statement trivially holds
when |A ∪ B| = n.
Let us select an arbitrary variable $x_i$ such that $i \notin A^* \cup B^*$. Since each of the two
subcubes TA∗ ∪{i},B ∗ and TA∗ ,B ∗ ∪{i} is a subset of TA∗ ,B ∗ , no solution of the equation
φ(X) = 0 exists either in TA∗ ∪{i},B ∗ or in TA∗ ,B ∗ ∪{i} . It follows, then, from the max-
imality of |A∗ ∪ B ∗ | that, on each of the two subcubes TA∗ ∪{i},B ∗ and TA∗ ,B ∗ ∪{i} ,
at least one of the terms of φ is identically 1. Obviously, one of these terms must
involve the literal $x_i$, while the other one must involve $\bar{x}_i$. Let the two terms in
question be $x_i C$ and $\bar{x}_i D$. Clearly, C and D are elementary conjunctions that are
both identically 1 on the subcube $T_{A^*,B^*}$. Therefore, C and D cannot conflict, and
hence the consensus CD of $x_i C$ and $\bar{x}_i D$ exists. Since φ is closed under consensus,
it must contain a term E that absorbs CD. Then this term E must be identically 1
on the subcube TA∗ ,B ∗ , contradicting our assumption.
Lemma 3.6. A DNF is closed under consensus if and only if it contains all prime
implicants of the Boolean function it represents.
Proof. We first prove the “if ” part of the statement. It follows from Lemma 3.2 that
the consensus of any two terms of a DNF φ is an implicant of the Boolean function
f represented by φ. This implies in turn that, if φ contains all prime implicants of
f , then it is closed under consensus.
To prove the “only if” part of the lemma, let us assume that a DNF φ representing
the Boolean function f is closed under consensus, and that the conjunction
$C = \bigwedge_{i \in P} x_i \bigwedge_{i \in N} \bar{x}_i$ is a prime implicant of f not contained in φ. Clearly, the
partial assignment defined by fixing $x_i = 1$ for all $i \in P$ and $x_i = 0$ for all $i \in N$
determines the subcube $T_{P,N}$, on which C, and hence f and φ, are identically 1.
Thus, the equation φ(X) = 0 has no solution in $T_{P,N}$, and Lemma 3.5 implies that
some term E of φ is identically 1 on $T_{P,N}$; that is, every literal of E is a literal of C,
so that E absorbs C. Since C is a prime implicant, the implicant E must coincide
with C, contradicting the assumption that C is not contained in φ.
Proof of Theorem 3.5. Let φ′ be the DNF produced by the consensus procedure
applied to the given DNF φ. This DNF φ′ is closed under absorption and consensus.
By Lemma 3.6, φ′ contains all the prime implicants of f. Since every implicant
of f is absorbed by a prime implicant of f, it follows that φ′ is the complete DNF
of f.
Corollary 3.2. A DNF is closed under consensus and absorption if and only if it
is the complete DNF of the function it represents.
Corollary 3.3. Given a DNF φ, one can check in $O(n\|\varphi\|^3)$ time whether φ is
closed under consensus and absorption.
Proof. Given two terms, one can check in O(n) time whether one is absorbed
by the other. Therefore, one can check in $O(n\|\varphi\|^2)$ time whether the absorption
operation can be applied to φ. Given two terms, checking the existence of their
consensus and producing it can be done in O(n) time. Since there are $\binom{\|\varphi\|}{2}$
pairs of terms of φ, it can be checked in $\binom{\|\varphi\|}{2} \cdot O(n\|\varphi\|) = O(n\|\varphi\|^3)$
time whether every consensus of two terms of φ is absorbed by another term of φ.
The next corollary is essentially due to Robinson [787]. It shows how a solution
of a consistent DNF equation can be efficiently computed once the prime implicants
of the DNF are available.
Corollary 3.4. If a DNF φ is closed under consensus, then one can find a solution
of the equation φ(X) = 0, or prove that the equation is inconsistent, in $O(n^2\|\varphi\|)$
time.
Proof. By Lemma 3.6, the equation φ(X) = 0 is inconsistent if and only if 1 is one
of the terms of φ. If this is not the case, then a solution of the equation is obtained
by a simple “greedy” procedure: Fix successively the variables x1 , x2 , . . . , xn to
either 0 or 1, while avoiding making any term in the DNF identically equal to 1.
Indeed, Lemma 3.5 implies that this procedure is correct, since any DNF that is
closed under consensus will remain closed under consensus after substituting any
Boolean values for any of the variables. The time bound follows from the fact that
substituting a value for a variable in any of the DNFs obtained in the process of
fixing variables can be done in O(n||φ||) time.
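The greedy procedure of Corollary 3.4 can be sketched as follows. This is an illustrative Python fragment (not the book's), assuming terms are encoded as frozensets of (variable, polarity) pairs and, as the corollary requires, that the input DNF is already closed under consensus.

```python
# Illustrative sketch of the greedy solution procedure of Corollary 3.4.
# Assumes the input DNF (a list of frozensets of (variable, polarity) pairs)
# is closed under consensus.

def greedy_solution(terms, n):
    """Return a point X with phi(X) = 0, or None when phi contains the
    constant-1 term (the equation is then inconsistent)."""
    if frozenset() in terms:          # the empty conjunction is the constant 1
        return None
    fixed = {}
    for v in range(1, n + 1):
        for val in (False, True):
            trial = dict(fixed)
            trial[v] = val
            # avoid any choice that makes some term identically 1, i.e. a term
            # all of whose literals are already satisfied by the fixed variables
            if not any(all(u in trial and trial[u] == s for (u, s) in t)
                       for t in terms):
                fixed[v] = val
                break
    return [fixed[v] for v in range(1, n + 1)]

# phi = x1 ~x2 x3  v  ~x1 ~x2 x4  v  x3 x4 (closed under consensus)
phi = [frozenset({(1, True), (2, False), (3, True)}),
       frozenset({(1, False), (2, False), (4, True)}),
       frozenset({(3, True), (4, True)})]
x = greedy_solution(phi, 4)
```

Lemma 3.5 is what guarantees that, for a consensus-closed DNF without the constant-1 term, at least one of the two values always passes the inner test, so the loop never gets stuck.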
Variable depletion
A streamlined version of the consensus procedure called variable depletion was
proposed by Blake [99] and later by Tison [864]. This method organizes the con-
sensus procedure in the following way: First, a starting variable xi1 is chosen,
and all possible consensuses are formed using pairs of terms that conflict in xi1 .
After completing this stage and removing all absorbed terms, another variable xi2
is chosen, and all consensuses on xi2 are produced. The process is repeated in the
same way until all variables have been exhausted.
The surprising fact, perhaps, is that after the stage based on an arbitrary variable
xi is completed, there is no need later on to apply again the consensus operation to
any pair of terms conflicting in xi . Before proving the correctness of this method,
we first establish the following lemma (which extends Theorem 2.6).
Note that the first step of the variable depletion procedure will generate all the
terms of the conjunction $\varphi_0\varphi_1$. Therefore, the DNF φ′ produced after the first step
of variable depletion is
$$\varphi' = x_n\varphi_1 \lor \bar{x}_n\varphi_0 \lor \varphi_2 \lor \varphi_0\varphi_1.$$
Term disengagement
Another interesting variant of the consensus procedure based on term disengage-
ment was introduced by Tison [864], who proved that it works for arbitrary DNFs;
it was subsequently generalized by Pichat [747] to more abstract lattice-theoretic
structures. The Disengagement Consensus procedure is described in Figure 3.2.
It relies on the following principles:
(i) at each stage a term C in the current list L is selected and all possible
consensuses of C with all other terms of L are generated;
begin
  L := (C1, C2, . . . , Cm), the list of terms of φ;
  declare all terms C in L to be engaged;
  while L contains some engaged term do
    select an engaged term C;
    declare C to be disengaged;
    generate all possible consensuses of C and the other terms of L;
    let R be the list of all such consensuses;
    for each D in R do
      if D is not absorbed by another term in L
        then add D to L and declare D to be engaged;
  end while
  return L;
end
(ii) each of the newly generated terms is checked for absorption by any other
(old or new) existing term; if it is not absorbed, then it is added to L;
(iii) the term C can no longer be chosen as a parent (it is “disengaged”) in any
subsequent stages, although it can still absorb some new terms.
We refer to Tison [864] and Pichat [747] for a proof of correctness of the
disengagement procedure.
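As an illustration, the disengagement discipline can be sketched in Python. This is one possible reading of Figure 3.2 (not Tison's or Pichat's code), assuming terms are encoded as frozensets of (variable, polarity) pairs; note that, as in the figure, absorbed old terms are never deleted, so the returned list contains all prime implicants possibly mixed with absorbed terms.

```python
# Illustrative sketch of the Disengagement Consensus procedure (Figure 3.2).
# A term is a frozenset of (variable, polarity) pairs.

def consensus(c, d):
    """Consensus of c and d when they conflict in exactly one variable, else None."""
    conflicts = [v for (v, s) in c if (v, not s) in d]
    if len(conflicts) != 1:
        return None
    return frozenset(l for l in c | d if l[0] != conflicts[0])

def disengagement(terms):
    L = list(dict.fromkeys(terms))     # keep input order, drop duplicates
    engaged = set(L)
    while engaged:
        c = engaged.pop()              # select an engaged term and disengage it
        for d in list(L):
            if d == c:
                continue
            e = consensus(c, d)
            # add the consensus only if no existing term absorbs it
            if e is not None and not any(t <= e for t in L):
                L.append(e)
                engaged.add(e)         # newly added terms are engaged
    return L
```

Removing the absorbed terms from the output afterwards leaves exactly the prime implicants, which is how the test below checks the sketch on the running example.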
Implicant Recognition
Instance: An elementary conjunction C and a Boolean function f in DNF.
Question: Is C an implicant of f?
Theorem 3.7. Implicant Recognition is co-NP-complete.
Proof. Clearly, the problem belongs to the class co-NP, since one can easily check
in polynomial time whether a Boolean point gives value 1 to the elementary
conjunction C and value 0 to the function f.
Theorem 3.7 already suggests that generating the prime implicants of a function
given in DNF cannot be an easy task. Moreover, Theorem 3.17 will show that the
number of prime implicants (that is, the length of the output) may be exponential in
the length of the initial DNF (the input). Therefore, the computational complexity
of prime implicant generation algorithms should be measured in terms of the sizes
of their input and of their output (see Appendix B for a more detailed discussion
of list-generation algorithms).
In fact, if it were possible to design a prime implicant generation algorithm
that runs in polynomial total time (that is, polynomial in the combined sizes of the
input and of the output), then this algorithm could be used to solve DNF equations
in polynomial time, since the only prime implicant of a tautology is the constant
1 (see the proof of Theorem 3.7). This, of course, is not to be expected.
The next theorems will show that the computational complexity of the DNF
equation problem actually is the main stumbling block on the way to the efficient
recognition and generation of prime implicants. In order to state these results, let
us recall that |φ| denotes the length (that is, the number of literals) of a DNF φ,
and let t(L) denote the computational complexity of solving a DNF equation of
length at most L.
Theorem 3.8. For any Boolean function f , any DNF φ representing f , and any
elementary conjunction C, one can check in O(|φ|) + t(|φ|) time whether C is an
implicant of f .
Proof. By definition, an elementary conjunction $C = \bigwedge_{i \in A} x_i \bigwedge_{j \in B} \bar{x}_j$ is an implicant
of f if and only if the restriction of f to the subcube $T_{A,B}$ is a tautology. The
latter property can be checked in O(|φ|) + t(|φ|) time by fixing xi to 1, for all
i ∈ A, and xj to 0, for all j ∈ B, in the DNF φ, and by solving the resulting DNF
equation.
Corollary 3.5. For any Boolean function f , any DNF φ representing f , and any
implicant C of f , a prime implicant absorbing C can be constructed in O(|C|(|φ|+
t(|φ|))) time.
Proof. If C is not prime, then there must exist a literal in C such that the elementary
conjunction obtained from C by removing this literal remains an implicant of f .
By Theorem 3.8, this process can be carried out in O(|φ|) + t(|φ|) time for every
literal in C.
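The literal-deletion loop of Corollary 3.5 can be sketched as follows. This illustrative Python fragment substitutes a brute-force tautology test over the subcube for the DNF-equation oracle (so it runs in exponential rather than O(|C|(|φ| + t(|φ|))) time), with terms encoded as frozensets of (variable, polarity) pairs.

```python
# Illustrative sketch of Corollary 3.5: shrink an implicant to a prime implicant
# by deleting literals one at a time, keeping an implicant at every step.
from itertools import product

def evaluate(terms, point):
    """Value of the DNF at a point given as {variable: bool}."""
    return any(all(point[v] == s for (v, s) in t) for t in terms)

def is_implicant(c, terms, variables):
    """C is an implicant of f iff f is identically 1 on the subcube fixed by C."""
    fixed = dict(c)
    free = [v for v in variables if v not in fixed]
    for bits in product([False, True], repeat=len(free)):
        point = dict(fixed)
        point.update(zip(free, bits))
        if not evaluate(terms, point):
            return False
    return True

def shrink_to_prime(c, terms, variables):
    """Delete literals of the implicant C as long as the result stays an implicant."""
    c = set(c)
    for lit in sorted(c):              # deterministic deletion order
        c.discard(lit)
        if not is_implicant(frozenset(c), terms, variables):
            c.add(lit)
    return frozenset(c)

# f = x1 ~x2  v  ~x1 x2  v  x1 ~x3  v  ~x1 x3;
# the implicant x1 ~x2 x3 shrinks to the prime implicant ~x2 x3
phi = [frozenset({(1, True), (2, False)}), frozenset({(1, False), (2, True)}),
       frozenset({(1, True), (3, False)}), frozenset({(1, False), (3, True)})]
prime = shrink_to_prime(frozenset({(1, True), (2, False), (3, True)}), phi,
                        [1, 2, 3])
```

Replacing `is_implicant` by a call to an efficient equation solver, as in Theorem 3.8, yields the stated polynomial bound.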
Let us now denote by D(f ) the set of prime implicants of a Boolean function f .
Theorem 3.9. For any Boolean function f and any DNF φ representing f ,
the set D(f ) can be generated by an algorithm that solves O(n|D(f )|2 ) DNF
equations of length at most |φ|. If t(L) is the computational complexity of
solving DNF equations of length L, then the running time of this algorithm is
O(n|D(f )|2 (n|D(f )| + |φ| + t(|φ|))).
Proof. By Corollary 3.5, for every elementary conjunction C in φ, one can find a
prime implicant of f absorbing C in O(|C|(|φ| + t(|φ|))) time. In this way, φ can
be reduced to a prime DNF (i.e., a disjunction of prime implicants) φ′. In order
to obtain the complete DNF of f, we are going to add prime implicants to φ′, as
described below.
First, the terms of φ′ are ordered arbitrarily, and the first term is marked. At
each step of the algorithm,
• the first unmarked term is compared to every marked term, and their
consensus – if any – is produced;
• for every consensus produced, a prime implicant of f absorbing it is found
(as in Corollary 3.5), and this prime implicant is added to φ′ if it is not already
present.
After this, the term is marked, and the algorithm continues with the next unmarked
term of φ′. The algorithm stops when all the terms of φ′ are marked.
By construction, the resulting DNF φ′ is closed under absorption and consensus.
Therefore, by Corollary 3.2, the output DNF φ′ is complete.
Since the number of terms of φ′ never exceeds |D(f)|, the number of steps of
the algorithm does not exceed |D(f)|. At each step, an unmarked term is compared
to at most |D(f)| marked terms, and for every consensus produced, a prime
implicant of f absorbing it can be found in O(n(|φ| + t(|φ|))) time. Finally, it
can be checked in O(n|D(f)|) time whether a prime implicant is already present
in φ′. Therefore, the total running time of the algorithm is
$O(n|D(f)|^2(n|D(f)| + |\varphi| + t(|\varphi|)))$.
Note that the complexity of the algorithm in Theorem 3.9 depends not only
on the size of the output (namely, on |D(f)|), but also on the complexity t(|φ|)
of solving the DNF equation φ = 0. For arbitrary DNFs, we expect t(|φ|) to be
exponential in the input size.
It is natural, however, to consider the problem of generating the prime implicants
of a Boolean function represented by a DNF in the special case where the associated
DNF equation can be solved efficiently. More precisely, let us call a class C of DNFs
tractable if the DNF equation φ = 0 can be solved in polynomial time for every
DNF φ in C and for every DNF obtained by fixing variables to either 0 or 1 in
such a DNF. For instance, the class of quadratic DNFs and the class of Horn DNFs are
tractable (see Chapters 5 and 6).
Corollary 3.6. For every tractable class C, there exists a polynomial p(x, y) and
an algorithm that, for every DNF φ ∈ C, generates the set of prime implicants
D(f ) of the function represented by φ in polynomial total time p(|φ|, |D(f )|).
Proof. This statement follows immediately from the definition of a tractable class
and from the proof of Theorem 3.9.
Example 3.4. Consider the Boolean function f(x1, x2, x3) represented by the DNF
$$\varphi_1 = x_1\bar{x}_2 \lor \bar{x}_1x_2 \lor x_1\bar{x}_3 \lor \bar{x}_1x_3.$$
$$\varphi_2 = x_1\bar{x}_2 \lor \bar{x}_1x_3 \lor x_2\bar{x}_3,$$
(T , F ) ||φ||-minimization
Instance: The complete truth table of a Boolean function f .
Output: A prime ||φ||-minimizing DNF of f .
T ||φ||-minimization
Instance: The list of true points of a Boolean function f or, equivalently, the
minterm DNF expression of f .
Output: A prime ||φ||-minimizing DNF of f .
minT ||φ||-minimization
Instance: The list of prime implicants of a Boolean function f or, equivalently,
the complete DNF of f .
Output: A prime ||φ||-minimizing DNF of f .
||φ||-minimization
Instance: An arbitrary DNF expression of a Boolean function f .
Output: A prime ||φ||-minimizing DNF of f .
Clearly, every prime DNF of f corresponds to a vector S for which the inequal-
ity (3.3) holds as an equality and, conversely, every vector S for which (3.3)
becomes an equality defines a prime DNF of f . It follows from (3.3) that to char-
acterize those vectors S that correspond to the prime DNFs of f , it is sufficient to
We now consider the Boolean function π(s1 , s2 , . . . , sk ) that takes value 1 exactly
on those points S = (s1 , s2 , . . . , sk ) for which the system of inequalities (3.5) holds
or, equivalently, on those points S for which (3.2) defines a prime DNF of f .
This function is known in the literature as the Petrick function associated with
f (see [744]). A CNF representation of the Petrick function follows directly
from (3.5):
$$\pi(s_1, \dots, s_k) = \bigwedge_{X \in T(f)} \left( \bigvee_{i=1}^{k} s_i\, C_i(X) \right). \qquad (3.6)$$
This CNF representation clearly shows that the Petrick function is positive, a fact
that can also be easily derived from its definition.
By definition of the Petrick function, there is a one-to-one correspondence
between its positive implicants and the prime DNFs of the function f . Furthermore,
one can easily see that there is a one-to-one correspondence between the prime
implicants of the Petrick function and the prime irredundant DNFs of the function
f . But of course, in general, the computational complexity of generating all the
prime implicants of the Petrick function is prohibitively expensive.
In view of the preceding discussion, the problem of finding a ||φ||-minimizing
DNF can be formulated as the problem of finding a minimum degree prime impli-
cant of the Petrick function. Alternatively, the same problem can be formulated as
the set covering problem
$$\text{minimize} \quad \sum_{i=1}^{k} s_i \qquad (3.7)$$
Example 3.5. Consider again the Boolean function f (x1 , x2 , x3 ) of Example 3.4,
represented this time by its set of true points
T (f ) = {(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 0), (1, 0, 1), (0, 1, 1)}.
Using the algorithm described in Section 3.2.1, we generate all the prime implicants
of this function: $x_1\bar{x}_2$, $\bar{x}_1x_2$, $x_1\bar{x}_3$, $\bar{x}_1x_3$, $x_2\bar{x}_3$, $\bar{x}_2x_3$. Associating with these
prime implicants the binary variables s1 , s2 , . . . , s6 , respectively, we can write the
CNF (3.6) of the Petrick function as
π(s1 , . . . , s6 ) = (s1 ∨ s3 )(s2 ∨ s5 )(s4 ∨ s6 )(s3 ∨ s5 )(s1 ∨ s6 )(s2 ∨ s4 ).
The complete DNF of the Petrick function is obtained by dualization of this CNF
(see Section 3.2.4):
π(s1 , . . . , s6 ) = s1 s4 s5 ∨ s2 s3 s6 ∨ s1 s2 s3 s4 ∨ s1 s2 s5 s6 ∨ s3 s4 s5 s6 .
From this DNF, we conclude that the function f (x1 , x2 , x3 ) has five prime irre-
dundant DNFs, two of them consisting of three prime implicants each, and three
others consisting of four prime implicants each.
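The dualization step of this example is small enough to check by brute force. The sketch below (illustrative Python, not the book's dualization algorithm of Section 3.2.4) enumerates the satisfying points of the CNF and keeps the minimal ones; for a positive function such as π, these correspond exactly to its prime implicants.

```python
# Brute-force dualization check for the Petrick function of Example 3.5.
from itertools import product

# the six clauses of (3.6): (s1 v s3)(s2 v s5)(s4 v s6)(s3 v s5)(s1 v s6)(s2 v s4)
clauses = [(1, 3), (2, 5), (4, 6), (3, 5), (1, 6), (2, 4)]

# all satisfying 0/1 points, each recorded as the set of indices set to 1
covers = [frozenset(i for i in range(1, 7) if s[i - 1])
          for s in product((0, 1), repeat=6)
          if all(any(s[i - 1] for i in c) for c in clauses)]

# minimal satisfying sets = prime implicants of the positive function pi
minimal = {c for c in covers if not any(d < c for d in covers)}
```

The five minimal sets recovered here are exactly the five prime implicants $s_1s_4s_5$, $s_2s_3s_6$, $s_1s_2s_3s_4$, $s_1s_2s_5s_6$, $s_3s_4s_5s_6$ listed in the complete DNF of π above.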
The problem of finding a ||φ||-minimizing DNF of f (without necessarily listing
all its prime irredundant DNFs) can be formulated as the following set covering
problem:
minimize s1 + s2 + s3 + s4 + s5 + s6
subject to s1 + s3 ≥ 1
s2 + s5 ≥ 1
s4 + s6 ≥ 1 (3.11)
s3 + s5 ≥ 1
s1 + s6 ≥ 1
s2 + s4 ≥ 1
si ∈ {0, 1}, i = 1, . . . , 6.
For this small example, it can be easily checked that the optimal solutions of the
set covering problem (3.11) are (1, 0, 0, 1, 1, 0) and (0, 1, 1, 0, 0, 1), corresponding
to the two ||φ||-minimizing DNFs of f :
$$\varphi' = x_1\bar{x}_2 \lor \bar{x}_1x_3 \lor x_2\bar{x}_3,$$
and
$$\varphi'' = \bar{x}_1x_2 \lor x_1\bar{x}_3 \lor \bar{x}_2x_3.$$
Since, in this example, all the prime implicants of f have the same degree, its
||φ||-minimizing DNFs and |φ|-minimizing DNFs coincide.
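For a problem of this size, the optimum of (3.11) can also be confirmed by exhaustive search; an illustrative Python check:

```python
# Exhaustive check of the set covering problem (3.11) of Example 3.5.
from itertools import product

# each pair lists the variables appearing in one covering inequality
constraints = [(1, 3), (2, 5), (4, 6), (3, 5), (1, 6), (2, 4)]

best, minimizers = None, []
for s in product((0, 1), repeat=6):
    if all(any(s[i - 1] for i in c) for c in constraints):
        cost = sum(s)
        if best is None or cost < best:
            best, minimizers = cost, [s]
        elif cost == best:
            minimizers.append(s)
```

The search confirms that the optimal value is 3 and that the only optimal solutions are (1, 0, 0, 1, 1, 0) and (0, 1, 1, 0, 0, 1), matching the two ‖φ‖-minimizing DNFs displayed above.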
It follows from Section 3.2.1 that, given the set of true points of a Boolean
function, the set covering formulation of the logic minimization problem can be
$$\varphi_1 = xy \lor \bar{x}\bar{y} \lor xu \lor uw$$
and
$$\varphi_2 = xy \lor \bar{x}\bar{y} \lor yu \lor uw.$$
Notice that the prime implicants $xy$, $\bar{x}\bar{y}$, and $uw$ appear in all prime and
irredundant DNFs of f, while the prime implicants $xw$ and $yw$ do not appear in
any prime and irredundant DNF of f.
The rows of this matrix correspond to the true points of f and will be denoted
by $a_j$, j = 1, 2, . . . , |T(f)|. The columns of the matrix correspond to the prime
implicants of f and will be denoted by $a^i$, i = 1, 2, . . . , k.
Let us say that a (0, 1)-point S = (s1 , s2 , . . . , sk ) satisfying the system of inequal-
ities (3.5) is a minimal solution of (3.5) if no point obtained by changing any of
the components of S from 1 to 0 also satisfies (3.5). We now discuss three com-
putationally easy transformations which can be used to simplify the system of set
covering inequalities (3.5), while preserving all its minimal solutions (the presen-
tation is ours, but we refer to Gimpel [382], Pyne and McCluskey [764, 765], or
Zhuravlev [937] for early references on this topic).
S1 If the matrix A contains a row $a_{j^*}$ with a single component, say $i^*$, equal to 1
(that is, $a_{j^*i^*} = 1$, and $a_{j^*i} = 0$ for all $i \ne i^*$), then fix $s_{i^*} = 1$ and remove
from the matrix A the column $a^{i^*}$ and all the rows $a_j$ having $a_{ji^*} = 1$.
S2 If the matrix A contains two comparable rows, say $a_j$ and $a_{j'}$, such that
$a_j \le a_{j'}$ (i.e., $a_{ji} \le a_{j'i}$ for every i), then remove the row $a_{j'}$ from A.
S3 If the matrix A contains a column $a^{i^*}$ consisting only of 0 components, then
fix $s_{i^*} = 0$ and remove the column $a^{i^*}$ from A.
It can be seen easily that the three simplifications S1, S2, and S3 preserve the
set of minimal solutions of the set covering inequalities (3.5). Therefore, one can
simplify (3.5) by repeatedly applying S1, S2, and S3 in an arbitrary order, for as
long as possible. Let us denote the resulting matrix by $\tilde{A}$, the set of variables s
that are fixed at 1 by $\tilde{S}_1$, and the set of variables s that are fixed at 0 by $\tilde{S}_0$. One
would expect the matrix $\tilde{A}$ and the sets $\tilde{S}_1$ and $\tilde{S}_0$ to depend on the particular order
in which the simplifications were applied. To avoid ambiguity, let us now specify
an algorithm that first applies S1 for as long as possible, then applies S2 for as
long as possible, and finally applies S3 for as long as possible. We shall call this
algorithm the essential reduction algorithm (ERA). Let us denote the resulting
matrix by $A^*$, and the set of variables that are fixed at 1 (respectively, 0) by $S_1^*$
(respectively, $S_0^*$).
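ERA can be sketched as follows — an illustrative Python fragment (not the book's), assuming the covering matrix is given as a list of rows, each row being the set of column indices that hold a 1:

```python
# Illustrative sketch of the essential reduction algorithm (ERA):
# apply S1 as long as possible, then S2, then S3.

def era(rows, m):
    """rows: iterables of column indices holding a 1; m: number of columns.
    Returns (reduced rows, variables fixed at 1, variables fixed at 0)."""
    rows = [set(r) for r in rows]
    cols = set(range(m))
    s1, s0 = set(), set()
    # S1: a singleton row forces its column into the cover
    singleton = next((r for r in rows if len(r) == 1), None)
    while singleton is not None:
        (i,) = singleton
        s1.add(i)
        cols.discard(i)
        rows = [r for r in rows if i not in r]
        singleton = next((r for r in rows if len(r) == 1), None)
    # S2: drop any row that strictly contains another row (or duplicates one)
    kept = []
    for k, r in enumerate(rows):
        dominated = any((q < r) or (q == r and j < k)
                        for j, q in enumerate(rows) if j != k)
        if not dominated:
            kept.append(r)
    rows = kept
    # S3: a column with no 1 left is fixed at 0
    for i in sorted(cols):
        if not any(i in r for r in rows):
            s0.add(i)
            cols.discard(i)
    return [r & cols for r in rows], s1, s0
```

Note that S2 and S3 never create new singleton rows, which is why a single S1 phase followed by single S2 and S3 phases suffices, as in the definition of ERA above.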
Theorem 3.10. The end result of applying simplifications S1, S2, and S3 as long
as possible does not depend on the order of their application: Every possible order
always yields $\tilde{A} = A^*$, $\tilde{S}_1 = S_1^*$, and $\tilde{S}_0 = S_0^*$.
Proof. The proof follows from three simple observations. First, let us observe that
if an intermediate matrix A contains a row with a single 1 component, then that
row cannot contain more than one 1 in the original matrix A. Indeed, if a column
was removed during the simplification process, then either this column had no 1’s,
and therefore its removal did not affect the number of 1’s in the remaining rows,
or all the rows in which this column had a 1 were also removed at the same step.
Therefore, $\tilde{S}_1 = S_1^*$.
Second, an intermediate matrix A contains two comparable rows if and only
if these two rows are also comparable in the original matrix A. This is a direct
consequence of the fact that none of the simplification steps S1, S2, or S3 affects
the comparability of the remaining rows. It follows, then, from the foregoing two
observations that the sets of rows of $\tilde{A}$ and $A^*$ are exactly the same.
Third, the sets of columns of $\tilde{A}$ and $A^*$ consist exactly of those columns of the
original matrix that have at least one 1 component in the remaining rows. Indeed,
on the one hand, neither of the matrices contains a column consisting only of 0’s.
On the other hand, if a removed column did have some 1’s, then all the rows, in
which it had 1’s, were also removed. In conclusion, the set of remaining columns
is uniquely determined by the set of remaining rows. Since the sets of rows of $\tilde{A}$
and $A^*$ coincide, we have $\tilde{A} = A^*$. It follows that exactly the same sets of variables
were fixed in both procedures, and since we have already concluded that $\tilde{S}_1 = S_1^*$,
we can now conclude that $\tilde{S}_0 = S_0^*$.
Lemma 3.9. For every variable s of the system of set covering inequalities (3.5),
s is not fixed by ERA if and only if there exists a minimal solution of (3.5) in which
s = 1 and a minimal solution of (3.5) in which s = 0.
Proof. The “if ” part follows from the fact that the simplifications S1, S2, and S3
preserve all the minimal solutions.
We now prove the “only if” part. Let si ∗ be a variable that is not fixed by
ERA. On the one hand, since every row of A∗ has at least two 1’s, we can set
si ∗ = 0, and the problem will remain feasible, showing that there must exist a
minimal solution of (3.5) in which si ∗ = 0. On the other hand, since A∗ has no
columns consisting of all 0’s, there must exist a row j ∗ in A∗ such that aj ∗ i ∗ = 1.
Let us now set $s_i = 0$ for every $i \ne i^*$ such that $a_{j^*i} = 1$. Since $A^*$ has no comparable
rows, the set covering system remains feasible, and in every solution
of this reduced set covering system (including the minimal ones), si ∗ must be
equal to 1 because there is no other way to satisfy the inequality corresponding
to row j ∗ .
We have seen that the simplifications S1, S2, and S3, and therefore ERA, pre-
serve the minimal solutions of the system of set covering inequalities (3.5); namely,
they preserve the set of prime and irredundant DNFs. This property allows the
application of S1, S2, and S3, and of ERA, to any type of logic minimization
problem whose objective is to minimize the number of terms, the number of lit-
erals, or any monotonically increasing function of these two DNF complexity
measures.
Let us now turn our attention to another type of simplifying transformation,
which has a more limited scope of application, since it may not preserve all the
minimal solutions of the set covering inequalities (3.5).
S4 If the matrix A contains two comparable columns, say $a^i$ and $a^{i'}$, such that
$a^i \ge a^{i'}$ (i.e., $a_{ji} \ge a_{ji'}$ for every j), then fix $s_{i'} = 0$ and remove the
column $a^{i'}$ from A.
Note that the simplification S3 introduced earlier is a special case of S4. The
simplification S4 is guaranteed to preserve at least one minimum-cardinality solu-
tion of the system (3.5), namely, one optimal solution of the set covering problem
(3.7)–(3.8). Indeed, a single application of S4 reduces the current set of minimal
solutions in such a way that only those minimal solutions in which $s_{i'} = 0$ are
preserved. Further, since $a^i \ge a^{i'}$, if there is a current minimum-cardinality solution
S with $s_{i'} = 1$, then the point $S^*$, which is equal to S in all components, except
that $s^*_{i'} = 0$ and $s^*_i = 1$, is also a minimum-cardinality solution and is preserved by
S4.
It is now clear that S4 can be applied to simplify those logic minimization
problems whose objective is to find at least one minimum solution of the set
covering problem (3.7)–(3.8), that is, to find a ||φ||-minimizing prime DNF.
Note that the simplification process can never start with S4 because, in our
logic minimization problems, the initial set covering matrix A does not contain
comparable columns (because no prime implicant is absorbed by another one).
Nevertheless, S4 may become applicable after several applications of simplifi-
cations S1 or S2. On the other hand, the opposite phenomenon can also happen;
namely, it is possible that neither S1 nor S2 is applicable but S4 is, and after several
applications of S4 it may become possible to apply S1 or S2. Therefore, further
simplifications can be achieved by alternatively applying ERA and S4 as long as
possible.
At first glance, set covering problems arising from logic minimization prob-
lems do display some special features. For example, since each column of the set
covering matrix corresponds to a prime implicant (i.e., a subcube), the number
of 1’s in the column must be a power of 2. Similarly, the number of 1’s in the
intersection of any subset of columns must also be a power of 2.
In view of these special features, not every set covering problem can formally
originate from logic minimization. Therefore, it comes as a surprise that every
(nontrivial) set covering problem is, in fact, an S1-simplified version of a logic
minimization problem. More precisely, given an arbitrary set covering problem
without zero rows or columns, there exists a logic minimization problem, which –
after several applications of the simplification S1 – reduces to it. This subsection
is devoted to a proof of this result and its corollaries.
Let us consider the system of set covering inequalities
$$\sum_{i=1}^{m} a_{ji}\, s_i \ge 1, \qquad j = 1, 2, \dots, n \qquad (3.13)$$
$$C_i = \bigwedge_{j:\, a_{ji} = 0} x_j. \qquad (3.15)$$
Example 3.8. As a small example, let us consider the following set covering
matrix:
$$A = \begin{pmatrix} 1 & 0 & 1 \\ 1 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$
Then, the points associated with its rows are
$$P^1 = (0, 1, 1), \quad P^2 = (1, 0, 1), \quad P^3 = (1, 1, 0),$$
and the associated conjunctions are
$$(C_1, C_2, C_3) = (x_3,\, x_1x_3,\, x_2).$$
Lemma 3.10. For every matrix $A \in \mathbb{B}^{n \times m}$ without zero rows, there holds
(a) for all j = 1, 2, . . . , n and i = 1, 2, . . . , m, $a_{ji} = C_i(P^j)$;
(b) for all j = 1, 2, . . . , n, $P^j$ is a true point of the function represented by the
DNF $\bigvee_{i=1}^{m} C_i$.
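On the matrix of Example 3.8, both claims of the lemma can be verified directly. An illustrative Python check (assuming, as in the example, that the point $P^j$ has a 0 in coordinate j and 1 everywhere else):

```python
# Direct verification of Lemma 3.10 on the matrix of Example 3.8.
A = [[1, 0, 1],
     [1, 1, 0],
     [0, 0, 1]]
n, m = len(A), len(A[0])

# P^j: 0 in coordinate j, 1 elsewhere (the points listed in Example 3.8)
P = [[0 if k == j else 1 for k in range(n)] for j in range(n)]

# C_i as in (3.15): the conjunction of x_j over the rows j with a_ji = 0,
# stored as the list of those row indices
C = [[j for j in range(n) if A[j][i] == 0] for i in range(m)]

def conj_value(ci, point):
    return int(all(point[j] for j in ci))

# Lemma 3.10(a): a_ji = C_i(P^j)
assert all(A[j][i] == conj_value(C[i], P[j]) for j in range(n) for i in range(m))
# Lemma 3.10(b): every P^j is a true point of C_1 v ... v C_m
assert all(any(conj_value(C[i], P[j]) for i in range(m)) for j in range(n))
```

The identity in (a) holds because $C_i(P^j) = 0$ exactly when j belongs to the index set $\{k : a_{ki} = 0\}$ of $C_i$, that is, exactly when $a_{ji} = 0$.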
Lemma 3.10 suggests that A comes close to being the matrix associated with a
logic minimization problem, because it expresses the covering of the true points
$P^j$ by the terms $C_i$. However, as Example 3.8 shows, absorption may possibly
take place among the conjunctions Ci (indeed, x3 absorbs x1 x3 in the example).
Therefore, the construction has to be modified if we want the conjunctions Ci to
represent the prime implicants of some Boolean function.
Let us call a column $a^i$ of A dominating if there exists another column $a^{i'}$ in
A such that $a^i \ge a^{i'}$, and let us redefine the associated conjunctions $C_i$ by
$$C_i := \begin{cases} C_i, & \text{if } a^i \text{ is not dominating,} \\ C_i\, y_i, & \text{if } a^i \text{ is dominating,} \end{cases} \qquad (3.16)$$
where the yi ’s represent additional Boolean variables. Obviously, after this trans-
formation, there will be no absorption among the conjunctions Ci . In order to
complete the construction, we shall extend the associated vectors P j by adding
additional components for each of the additional variables yi , and defining the
value of all these components to be 1. This modification preserves the property
that A expresses the covering of the points P j ’s by the conjunctions Ci ’s.
where the terms Ci are defined by (3.15) and (3.16). Note that ψ represents a
positive Boolean function, say, f , and, since ψ is closed under absorption, it is the
complete DNF of f . The true points of f include all the points P j , j = 1, 2, . . . , n
but can also include many additional points, say, Qt , t = 1, 2, . . . , T .
If we simply extend the set covering problem by adding to A all the rows
corresponding to the additional true points Qt , then the resulting matrix may not
necessarily be reducible to A by using the simplifications S1, S2, and S3. To make
this reduction possible, we introduce two additional variables, z0 and z1 . For any
Boolean point Q, let us denote by [Q] the unique minterm (in the (x, y)-variables)
covering Q, and let us say that Q is “even” (respectively, “odd”) if it has an even
(respectively, odd) number of components equal to 1. We can now define the DNF:
$$\psi^* = z_0 z_1 \bigvee_{i=1}^{m} C_i \;\lor \bigvee_{t:\, Q^t \text{ is even}} z_0\,[Q^t] \;\lor \bigvee_{t:\, Q^t \text{ is odd}} z_1\,[Q^t]. \qquad (3.18)$$
We let f ∗ be the Boolean function represented by ψ ∗ .
Example 3.10. For Example 3.9, there are eight additional true points $Q^t$:
$$Q^1 = (1, 1, 1, 1), \quad Q^2 = (1, 1, 1, 0), \quad Q^3 = (1, 1, 0, 0), \quad Q^4 = (1, 0, 1, 0),$$
$$Q^5 = (0, 1, 1, 0), \quad Q^6 = (0, 1, 0, 1), \quad Q^7 = (0, 0, 1, 1), \quad Q^8 = (0, 1, 0, 0).$$
The associated DNF is
$$\psi^* = x_3y_1z_0z_1 \lor x_1x_3z_0z_1 \lor x_2z_0z_1 \lor x_1x_2x_3y_1z_0 \lor x_1x_2\bar{x}_3\bar{y}_1z_0 \lor x_1\bar{x}_2x_3\bar{y}_1z_0 \lor$$
$$\bar{x}_1x_2x_3\bar{y}_1z_0 \lor \bar{x}_1x_2\bar{x}_3y_1z_0 \lor \bar{x}_1\bar{x}_2x_3y_1z_0 \lor x_1x_2x_3\bar{y}_1z_1 \lor \bar{x}_1x_2\bar{x}_3\bar{y}_1z_1.$$
$$\psi^* = z_0z_1\psi \lor z_0\psi_0 \lor z_1\psi_1.$$
By construction, no two terms of ψ absorb each other. The same holds for the
terms of ψ0 and ψ1 . Moreover, it is obvious that no term of z0 ψ0 can absorb a term
of z1 ψ1 , and vice versa. It is also obvious that no term of z0 z1 ψ can absorb any
term of z0 ψ0 or z1 ψ1 . Since A has no zero columns, no term of ψ is a minterm on
the (x, y)-variables. Then, since every term of ψ0 and of ψ1 is a minterm, no term
of z0 ψ0 or z1 ψ1 can absorb a term of z0 z1 ψ. Thus, ψ ∗ is closed under absorption.
Let us now prove that ψ ∗ is closed under consensus. Obviously, no two terms
of z0 z1 ψ have a consensus because they are all positive. Moreover, any two terms
of ψ0 have at least two conflicting literals, and hence no two terms of z0 ψ0 have
a consensus. For the same reason, no two terms of z1 ψ1 have a consensus.
Let us now assume that a term of $z_0z_1\psi$, say $z_0z_1C$, and a term of $z_0\psi_0$, say
$z_0[Q]$, have a consensus. This can only happen if there is a variable w in C such
that $\bar{w}$ appears in [Q]. Since Q is a true point of f, there exists a prime implicant
of f, say $C'$, that absorbs [Q] and obviously does not contain w. Then, $z_0z_1C'$ is
a term of $z_0z_1\psi$ that absorbs the consensus of $z_0z_1C$ and $z_0[Q]$. Similarly, every
consensus of a term in $z_0z_1\psi$ and a term in $z_1\psi_1$ will be absorbed by a term in
$z_0z_1\psi$.
Let us next assume that a term of $z_0\psi_0$, say $z_0[Q']$, and a term of $z_1\psi_1$, say
$z_1[Q'']$, have a consensus. Without loss of generality, let us assume that $[Q'] = wG$
and $[Q''] = \bar{w}H$. Again, there exists a prime implicant of f, say $C'$, that absorbs
$[Q'']$ and that does not contain w. Then, $z_0z_1C'$ is a term of $z_0z_1\psi$ that absorbs the
consensus of $z_0[Q']$ and $z_1[Q'']$.
Thus, ψ ∗ is closed under consensus and, in view of Corollary 3.2, ψ ∗ is the
complete DNF of f ∗ .
We now discuss the logic minimization problem for f ∗ . By Lemma 3.11, the
columns of the set covering matrix A∗ associated with f ∗ correspond to the terms
of ψ ∗ . The rows of A∗ correspond to the set of true points T (f ∗ ). These true points
are derived from the points P j , j = 1, 2, . . . , n and Qt , t = 1, 2, . . . , T by extending
them with two additional components, corresponding to z0 and z1 , so that T (f ∗ )
consists of the following disjoint subsets:
• The set P of points (P j , 1, 1), j = 1, 2, . . . , n.
• The set Q11 of points (Qt , 1, 1), t = 1, 2, . . . , T .
• The set Q10 of points (Qt , 1, 0), where Qt is even.
• The set Q01 of points (Qt , 0, 1), where Qt is odd.
Let us see what happens when the simplification steps S1 are performed on
$A^*$. Every true point of the form $(Q^t, \sigma, \bar{\sigma})$ ($t = 1, \ldots, T$; $\sigma = 0, 1$) is covered
by a single prime implicant $z_\sigma[Q^t]$. Therefore, every prime implicant $z_\sigma[Q^t]$ is
essential, and the application of the simplification S1 removes the corresponding
columns and all the rows in $Q_{10} \cup Q_{01}$ from the set covering matrix $A^*$. Moreover,
the rows in $Q_{11}$ are also removed by S1, since every prime implicant of the form
$z_\sigma[Q^t]$ covers $(Q^t, 1, 1)$.
So, the application of S1 only leaves in A∗ the rows associated with the true
points (P j , 1, 1), j = 1, 2, . . . , n, and the columns associated with the prime impli-
cants of the form z0 z1 Ci , i = 1, 2, . . . , m. It now follows from Lemma 3.10 that this
reduced set covering matrix coincides with the original matrix A. This completes
the proof of the following result, due to Gimpel [381]:
Theorem 3.13. Given an arbitrary set covering problem without zero rows or
columns, there exists a logic minimization problem whose set covering formu-
lation can be reduced to the given problem after several applications of the
simplification S1.
The foregoing arguments show how to transform an arbitrary set covering prob-
lem of size n × m to an equivalent logic minimization problem having at most
n + m + 1 Boolean variables. Since it is well known that the set covering problem
is NP-hard [371], one may be tempted to interpret this construction as an NP-
hardness proof for the logic minimization problem. Unfortunately, this inference
is incorrect, since the described reduction is not necessarily polynomial because
the number of true points of f ∗ constructed above can be exponentially large in
n and m. However, this difficulty is easy to overcome, and we can establish the
following result (Gimpel [381]):
Theorem 3.14. The logic minimization problem is NP-hard when its input is a
Boolean function given by the set of its true points.
Proof. It is known [371] that the set covering problem remains NP-hard in the
special case in which every column of the set covering matrix contains at most
three 1’s, and the matrix does not contain any pair of comparable columns. In this
case, because of the incomparability of the columns, no variable y is needed in the
construction (3.17) of the DNF ψ. Moreover, the degree of every conjunction Ci
is at least n − 3, hence the number T of the additional true points Qt of f is at most
8m. Therefore, the number of prime implicants of the Boolean function f ∗ is at
most 9m, and the DNF ψ ∗ can be constructed in polynomial time. It follows that,
for this special case of set covering problems, the transformation to an equivalent
logic minimization problem is polynomial, which completes the proof.
equivalent; namely, the length of one representation is not necessarily bounded
by a polynomial function of the length of another one.
Some representations of a Boolean function can be viewed as special cases of
others. For example, the representation of a Boolean function by the set of its true
points can be viewed as a special type of DNF representation. Thus, in particular,
Theorem 3.14 implies that the logic minimization problem is NP-hard for Boolean
functions expressed in DNF.
On the other hand, the representation of a Boolean function by the set of its
true points can be exponentially shorter than its representation by a complete truth
table. It is therefore surprising that the latter, possibly much larger, representation
does not make the logic minimization problem significantly simpler. As a matter of
fact, Masek [674] was able to prove that the logic minimization problem remains
NP-hard when its input is a complete truth table. A more accessible proof (based on
Gimpel’s construction [381]) of the latter result was recently proposed by Allender
et al. [16].
$$\text{minimize} \quad \sum_{i=1}^{k} s_i$$
$$\text{subject to} \quad \sum_{i=1}^{k} a_{ji} s_i \ge 1, \qquad j = 1, 2, \ldots, n,$$
$$(s_1, s_2, \ldots, s_k) \in \mathbb{B}^k,$$
and all those variables $s_i$ that have not been set to 1 in this process are now
set to 0.
Let us now describe this greedy procedure in terms of the set covering matrix
$A = (a_{ji})_{j=1,\ldots,n;\; i=1,\ldots,k}$. Denote by $A(r)$ the reduced set covering matrix at the beginning
of step r of the greedy procedure. Thus, A(1) denotes the original set covering
matrix A. At step r, the greedy procedure
1. calculates the number $|a(r)_i|$ of 1's in every column $a(r)_i$ of the matrix
$A(r)$;
2. chooses a column $a(r)_{i_r}$ having the maximum number of 1's; and
3. reduces $A(r)$ to $A(r+1)$ by removing from $A(r)$ the chosen column $a(r)_{i_r}$
as well as all the rows covered by it, namely, those with $a(r)_{j i_r} = 1$.
The process stops when all rows have been removed from the set covering matrix.
Let $q$ be the number of steps of the greedy procedure. For simplicity, let us
renumber the columns of the set covering matrix in such a way that the removed
columns $i_1, i_2, \ldots, i_q$ become $1, 2, \ldots, q$; the remaining columns are numbered from
$q+1$ to $k$. Let us denote by $w_i^r$ the number $|a(r)_i|$ of 1's in the $i$-th column of
$A(r)$. With this notation, $w_r^r$ is the number of rows removed from the set covering
matrix at step $r$ of the greedy procedure. Note that $w_1^1$ is the maximum number of
1's in the columns of the original set covering matrix $A$ (we assume, without loss
of generality, that $w_1^1 \ge 1$).
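As an illustration, the three steps above can be coded directly on a 0-1 matrix (an assumed implementation, not the book's own code; working on a set of uncovered rows is equivalent to explicitly deleting rows and columns):

```python
def greedy_cover(A):
    """Return the chosen column indices i_1, i_2, ..., i_q of the greedy cover."""
    n = len(A)                       # number of rows (constraints)
    k = len(A[0]) if n else 0        # number of columns (variables s_i)
    uncovered = set(range(n))
    chosen = []
    while uncovered:
        # steps 1-2: pick the column covering the most uncovered rows
        best = max(range(k), key=lambda i: sum(A[j][i] for j in uncovered))
        if sum(A[j][best] for j in uncovered) == 0:
            raise ValueError("matrix has a zero row: no cover exists")
        chosen.append(best)
        # step 3: remove the rows covered by the chosen column
        uncovered -= {j for j in uncovered if A[j][best] == 1}
    return chosen

A = [[1, 0, 1],
     [1, 1, 0],
     [0, 1, 0],
     [0, 0, 1]]
cover = greedy_cover(A)
```

On this small matrix the greedy procedure returns a cover of size 3 (columns 0, 1, 2), while the minimum cover {1, 2} has size 2, illustrating that greedy is good but not optimal.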
Two important observations about the greedy procedure are in order. First, this
procedure is very efficient. Indeed, if n is the number of rows and k is the number
of columns in the set covering matrix, then the number of steps of the procedure
does not exceed min{n, k}, while each step takes O(nk) time; therefore the com-
putational complexity of the greedy procedure is O(min{n, k}nk). Second, despite
its low computational cost, the greedy procedure produces very good solutions.
To quantify this last statement, let us compare the size q of the greedy cover (that
is, the number of variables fixed to 1 by the greedy procedure) with the size m of a
minimum cover. Obviously, m ≤ q. On the other hand, q cannot be “much worse”
than the optimum, in view of the following surprising result:
Theorem 3.15. For any set covering problem, if $m$ is the size of a minimum cover,
then the size $q$ of the greedy cover is bounded by the relation
$$q \le H(w_1^1)\, m,$$
where $w_1^1$ is the maximum number of 1's in a column of the set covering matrix,
and
$$H(d) = \sum_{i=1}^{d} \frac{1}{i} \quad \text{for all positive integers } d.$$
We refer to Chvátal [198], Johnson [536], or Lovász [623], for a proof of this
classical result. It is easy to show (e.g., by induction) that H (d) ≤ 1 + ln d for any
positive integer d. Thus, Theorem 3.15 implies the following corollary (see also
Slavík [837] for a slight improvement).
Corollary 3.7. For any set covering problem, if $m$ is the size of a minimum cover,
then the size $q$ of the greedy cover is bounded by the relation
$$q \le (1 + \ln w_1^1)\, m,$$
where $w_1^1$ is the maximum number of 1's in a column of the set covering matrix; in
particular, $q \le (1 + \ln n)\, m$, where $n$ is the number of its rows.

We observed in Section 3.3.1 that, when the input is the set of true points
of a Boolean function of $n$ variables, the set covering formulation (3.7)–(3.8) of the logic
minimization problem can be constructed in polynomial time. Since every column
of this matrix contains at most $2^{n-1}$ 1's, Corollary 3.7 yields
$$q \le (1 - \ln 2 + n \ln 2)\, m. \qquad (3.21)$$
Hence, the greedy procedure also runs in polynomial time on this input and provides a solution of
the $||\phi||$-minimization problem that approximates its optimal value to within a
factor $O(n)$.
A natural question to be asked now is whether there exists a polynomial time
algorithm having a significantly better approximation ratio. In all likelihood, the
answer to this question is negative. Indeed, Feldman [328] established the follow-
ing result: Even when the input of the logic minimization problem consists of the
complete truth table of a Boolean function f , there exists a constant γ > 0 such
that it is NP-hard to approximate m to within a factor nγ , where m is the number of
terms in a ||φ||-minimizing DNF of f , and n is the number of variables. This result
implies that the approximation factor achieved by the greedy algorithm is at most
polynomially larger than the best ratio that can be achieved in polynomial time,
unless P=NP. (When the input is an arbitrary DNF, Umans [876] proves stronger
inapproximability results.)
Additionally, the following surprising fact was proved by Feige [324]: Under
the assumption that NP-complete problems cannot be solved in $O(l^{O(\log\log l)})$ time
(where $l$ denotes the length of the input), no polynomial
time algorithm for the set covering problem can have an approximation ratio
less than $(1 - o(1))\ln \rho$. Since, by Corollary 3.7, the approximation ratio of the
greedy procedure is $(1 + o(1))\ln \rho$, the only remaining possibility is to improve
the approximation ratio by a lower-order term $o(\ln \rho)$.
Chvátal [198] generalized Theorem 3.15 and Corollary 3.7 for the weighted
version of the set covering problem in which nonnegative weights ci are associated
with the variables si , and the problem is the following:
$$\text{minimize} \quad \sum_{i=1}^{k} c_i s_i$$
$$\text{subject to} \quad \sum_{i=1}^{k} a_{ji} s_i \ge 1, \qquad j = 1, 2, \ldots, n,$$
$$(s_1, s_2, \ldots, s_k) \in \mathbb{B}^k.$$
In this case, the generalized greedy procedure is defined in a similar way, the only
difference being that at each iteration $r$ a column $a(r)_i$ is chosen so as to maximize
the ratio $w_i^r/c_i$ of the number of 1's remaining in the column divided by its weight.
The approximation results of Theorem 3.15 and Corollary 3.7 remain valid for this
weighted set covering problem [198]. Therefore, if $q_l$ is the number of literals in
a prime DNF of an $n$-variable Boolean function $f$, constructed by the generalized
greedy procedure applied to the set covering formulation (3.9)–(3.10), and if $m_l$
is the number of literals in a $|\phi|$-minimizing DNF of $f$, then it follows, similarly
to Corollary 3.8, that
$$q_l \le (1 - \ln 2 + n \ln 2)\, m_l. \qquad (3.22)$$
Theorem 3.16. There is a positive constant $c$ such that, for every $n \ge 3$, there
exists a Boolean function of $n$ variables having at least $c\,\frac{3^n}{n}$ prime implicants.

Proof. The statement holds for the belt function $b_n^{n/3,\,n/3}$. Indeed, it follows from
Lemma 3.12 that the number of prime implicants of a belt function $b_n^{m,k}$ equals
$$\binom{n}{m}\binom{n-m}{n-m-k} = \frac{n!}{m!\,k!\,(n-m-k)!},$$
which, for $m = k = \frac{n}{3}$, equals
$$\frac{n!}{\left(\frac{n}{3}!\right)^3}. \qquad (3.23)$$
Substituting into (3.23) the well-known Stirling formula (see, e.g., [314])
$$n! = \sqrt{2\pi n}\left(\frac{n}{e}\right)^n (1 + o(1)),$$
one can see that there exists a positive constant $c$ such that the number of prime
implicants of $b_n^{n/3,\,n/3}$ is at least $c\,\frac{3^n}{n}$.
The previous statement shows that the number of prime implicants of a Boolean
function can be exponentially large in the number of Boolean variables. From the
algorithmic point of view, it is also important to understand how large the number
of prime implicants can be in terms of the length of an arbitrary DNF or CNF
representation of a Boolean function. Interestingly, the number of prime implicants
can be exponential in the length of a DNF, even for seemingly simple functions,
as the following theorem shows.
Theorem 3.17. For every integer $n \ge 1$, there exists a Boolean function $f$ that
has $2^n + 2n$ prime implicants and can be represented by a DNF having $2n + 1$
terms.

$$\varphi(x_1, \ldots, x_n, y_1, \ldots, y_n) = \bigwedge_{i=1}^{n} x_i \;\vee\; \bigvee_{i=1}^{n} (x_i \bar{y}_i \vee \bar{x}_i y_i),$$
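The count can be checked by brute force for small $n$. The sketch below (our code, not from the book) hard-codes the function of Theorem 3.17 for $n = 2$, enumerates every term over the four variables, and keeps the prime implicants; there are $2^n + 2n = 8$ of them:

```python
from itertools import product

n = 2
VARS = 2 * n  # variables 0..n-1 are x_1..x_n, variables n..2n-1 are y_1..y_n

def f(p):
    x, y = p[:n], p[n:]
    return all(x) or any((x[i] and not y[i]) or (not x[i] and y[i])
                         for i in range(n))

# a term assigns each variable 1 (positive), 0 (negative), or None (absent)
def implies_f(term):
    return all(f(p) for p in product([0, 1], repeat=VARS)
               if all(t is None or p[i] == t for i, t in enumerate(term)))

def is_prime(term):
    if not implies_f(term):
        return False
    # prime = no literal can be deleted while staying an implicant
    return all(not implies_f(term[:i] + (None,) + term[i + 1:])
               for i in range(VARS) if term[i] is not None)

terms = [t for t in product([0, 1, None], repeat=VARS)
         if any(l is not None for l in t)]
primes = [t for t in terms if is_prime(t)]
```

This exhaustive check is exponential in the number of variables, of course; it is meant only to illustrate the statement on a tiny instance.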
Theorem 3.18. For every integer $n \ge 1$, there exists a Boolean function $f$ that
has $3^n$ prime implicants and can be represented by a CNF having $n$ clauses.

$$\psi(x_1, \ldots, x_n, y_1, \ldots, y_n, z_1, \ldots, z_n) = \bigwedge_{i=1}^{n} (x_i \vee y_i \vee z_i).$$
This CNF has $n$ clauses. It is clear from the CNF expression that, in each minimal
true point of $f$, exactly one of the variables $x_i$, $y_i$, and $z_i$ equals 1, for every
$i \in \{1, 2, \ldots, n\}$. Therefore, in view of Theorem 1.26, the complete DNF of $f$ consists
of elementary conjunctions of the form $\bigwedge_{i=1}^{n} u_i$, where each $u_i$ is either $x_i$,
$y_i$, or $z_i$. Hence, the function $f$ has $3^n$ prime implicants.
By the induction hypothesis, $f(x_1, x_2, \ldots, x_{n-1}, 0)$ and $f(x_1, x_2, \ldots, x_{n-1}, 1)$, being
functions of $n - 1$ variables, have DNF representations $\varphi_0$ and $\varphi_1$ such that $||\varphi_0|| \le
2^{n-2}$ and $||\varphi_1|| \le 2^{n-2}$. Then, $f(x_1, x_2, \ldots, x_n)$ can be represented by the expression
$$\bar{x}_n \varphi_0 \vee x_n \varphi_1,$$
which immediately expands into a DNF $\varphi$ such that $||\varphi|| \le ||\varphi_0|| + ||\varphi_1|| \le 2^{n-1}$.

To show that the bound is attained, define the parity function of $n$ variables
to be the Boolean function whose value in the Boolean point $X = (x_1, x_2, \ldots, x_n)$
is 1 if and only if $\sum_{i=1}^{n} x_i$ is odd. Obviously, the number of true points of the
parity function is $2^{n-1}$. Since every two terms in the minterm DNF of the parity
function have degree $n$ and conflict in at least two variables, this DNF is closed
under absorption and consensus, and is therefore the complete DNF of the parity
function. Since the minterm DNF is obviously irredundant, it then follows that the
parity function has a unique DNF representation, and that this representation has
$2^{n-1}$ terms.
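The claim about the parity function is easy to confirm by brute force for a small number of variables (an illustrative sketch of ours, not from the book): every prime implicant has full degree $n$, and there are exactly $2^{n-1}$ of them.

```python
from itertools import product

n = 3

def parity(p):
    return sum(p) % 2 == 1

def implies_parity(term):
    # term: tuple over n variables with entries 1, 0, or None (absent)
    pts = [p for p in product([0, 1], repeat=n)
           if all(t is None or p[i] == t for i, t in enumerate(term))]
    return all(parity(p) for p in pts)

primes = []
for term in product([0, 1, None], repeat=n):
    if all(t is None for t in term) or not implies_parity(term):
        continue
    if all(not implies_parity(term[:i] + (None,) + term[i + 1:])
           for i in range(n) if term[i] is not None):
        primes.append(term)
```

Any term that leaves a variable free covers points of both parities, so only full minterms survive, exactly as the uniqueness argument predicts.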
Theorem 3.20. For almost all Boolean functions $f$ of $n$ variables, the number
$|T(f)|$ of true points of $f$ satisfies the inequalities:
$$2^{n-1} - n\,2^{n/2} \le |T(f)| \le 2^{n-1} + n\,2^{n/2}. \qquad (3.24)$$
Proof. The number of Boolean functions of $n$ variables having exactly $k$ true points
is $\binom{2^n}{k}$, since every Boolean point is either a true point or a false point. Hence, the
total number of Boolean functions with the property that their number of true
points satisfies (3.24) is
$$\sum_{k=2^{n-1}-n2^{n/2}}^{2^{n-1}+n2^{n/2}} \binom{2^n}{k} = 2^{2^n} - 2\sum_{k=0}^{2^{n-1}-n2^{n/2}-1} \binom{2^n}{k} = \left(1 - o(1)\right) 2^{2^n},$$
where the last equality can be obtained by using the formula $\binom{m}{k} = \frac{m!}{k!\,(m-k)!}$, together
with the following refined version of the Stirling formula (see, e.g., [314]):
$$\sqrt{2\pi n}\left(\frac{n}{e}\right)^n e^{\frac{1}{12n+1}} < n! < \sqrt{2\pi n}\left(\frac{n}{e}\right)^n e^{\frac{1}{12n}},$$
and with the limits $(1 + \frac{1}{m})^m \to e$ and $(1 - \frac{1}{m})^m \to \frac{1}{e}$ when $m \to \infty$.
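A quick Monte Carlo experiment illustrates the concentration stated in Theorem 3.20 (our sketch; the choice $n = 10$, the sample size, and the seed are arbitrary):

```python
import random

random.seed(0)
n = 10
lo = 2 ** (n - 1) - n * 2 ** (n // 2)   # lower bound of (3.24)
hi = 2 ** (n - 1) + n * 2 ** (n // 2)   # upper bound of (3.24)

counts = []
for _ in range(100):
    # a uniformly random Boolean function = an independent coin flip per point
    true_points = sum(random.getrandbits(1) for _ in range(2 ** n))
    counts.append(true_points)
```

For $n = 10$ the bounds are $512 \pm 320$; since the standard deviation of the number of true points is only $2^{n/2 - 1} = 16$, every sampled function falls comfortably inside them, in line with the "almost all" statement.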
A simple interpretation of Theorem 3.20 is that, for almost all Boolean functions,
the number of true points is about the same as that of false points, namely, about
2n−1 . After establishing this fact, it is natural to ask in what way these two sets
of true and false points are mixed in the Boolean hypercube. More specifically,
one may wonder whether the set of true points of a typical Boolean function
contains large subcubes. The next theorem states that a typical Boolean function
has only “long” implicants, thus showing that the answer to the previous question
is negative.
Theorem 3.21. For almost all Boolean functions of n variables, the degree of
every implicant is at least n − log2 (3n).
Proof. Before proving the statement, we first calculate the average number of
implicants of a fixed degree $k$ over the set of all Boolean functions of $n$ variables.
Note that the number of different terms of degree $k$ is $\binom{n}{k} 2^k$. Every such term
takes the value 1 in exactly $2^{n-k}$ Boolean vectors. Therefore, every such term is
an implicant of exactly $2^{2^n - 2^{n-k}}$ different Boolean functions of $n$ variables.
Let us consider now a bipartite graph having two disjoint vertex sets A and B,
where the nodes in A correspond to the terms of degree k over n variables, while
The next natural question concerns the number of terms (or literals) in a ||φ||-
minimizing (or |φ|-minimizing) DNF of a typical Boolean function of n variables.
Several important results are known in this area. We shall not present the detailed
proofs of these technical results here, and we give only a brief overview.
An interesting result obtained by Nigmatullin [712] shows that the number of
terms (respectively, literals) in the ||φ||-minimizing (respectively, |φ|-minimizing)
DNFs of almost all Boolean functions of n variables is asymptotically the same.
Let t(n) and l(n) represent “asymptotic estimates” of these two numbers. It follows
from Theorem 3.21 that l(n) behaves like nt(n); thus, it is sufficient to estimate
t(n) only.
Glagolev [385] obtained the following lower bound on t(n):
$$t(n) \ge \frac{2^{n-1}}{(\log_2 n)(\log_2 \log_2 n)}.$$
3.5 Exercises
1. Consider a set of ordered pairs D = {(i, j )}, where i, j ∈ {1, 2, . . . , n}, and call
a Boolean point X = (x1 , x2 , . . . , xn ) D-feasible if, for every pair (i, j ) ∈ D,
the implication “xi = 1 implies xj = 1” holds. Let fD be the Boolean function
that takes the value 1 on D-feasible Boolean points, and the value 0 on all
the other Boolean points. Prove that $f_D$ has no prime implicants of degree 3.
2. Consider the linear inequality
$$\sum_{i=0}^{n} 2^i x_i \le k,$$
This chapter deals with yet another fundamental topic in the theory of Boolean
functions, namely, duality theory. Some of the applications of duality were sketched
in Chapter 1, and the concept has appeared on various occasions in Chapters 2 and 3.
Here, we collect some of the basic properties of the dual of Boolean functions
and then characterize those functions that are comparable to (i.e., either imply,
or are implied by) their dual. A large section of the chapter is then devoted to
algorithmic aspects of dualization, especially for the special and most interesting
case of positive functions expressed in disjunctive normal form. It turns out that
the complexity of the latter problem remains incompletely understood, in spite of
much recent progress on the question.
(c) $(\bar{f})^d = \overline{f^d}$;
(d) (f ∨ g)d = f d g d ;
(e) (f g)d = f d ∨ g d ;
(f) f ≤ g if and only if g d ≤ f d .
Proof. All these properties are trivial consequences of Definition 4.1 (properties
(b)–(e) have already been verified in Theorem 1.2).
In view of the involution property (b), we sometimes say that two functions
f , g are mutually dual when g = f d or, equivalently, when f = g d .
Observe that the properties stated in Theorem 4.1 continue to hold when we
replace dualization by complementation. As a matter of fact, investigating proper-
ties of the dual function $f^d$ is tantamount to investigating properties of the function
$\bar{f}$, namely, the complement of $f$, up to the "change of variables" $X \leftrightarrow \bar{X}$. It turns
out, however, that the duality concept arises quite naturally in several applications.
Therefore, we prefer to place our discussion in this framework.
If f is a function on B n and P , N are disjoint subsets of {1, 2, . . . , n}, then we
denote by f|P ,N the restriction of f obtained by fixing xi = 1 for all i ∈ P and
xj = 0 for all j ∈ N . The next property expresses in a formal way that “the dual
of the restriction of a function is the restriction of the dual of the function to the
complementary values.”
Theorem 4.3. Let f and g be two Boolean functions on Bn . If f and g are mutually
dual, then, for all i ∈ {1, 2, . . . , n}, f|xi =0 and g|xi =1 are mutually dual, and f|xi =1
and g|xi =0 are mutually dual. Conversely, if for some i ∈ {1, 2, . . . , n}, f|xi =0 and
g|xi =1 are mutually dual, and f|xi =1 and g|xi =0 are mutually dual, then f and g are
mutually dual.
$$(P \cap P_i) \cup (N \cap N_i) \ne \emptyset \quad \text{for } i = 1, 2, \ldots, m; \qquad (4.5)$$
(ii) $C_{PN}$ is a prime implicant of $f^d$ if and only if (4.5) holds and, for every $P' \subseteq P$
and $N' \subseteq N$ with $P' \cup N' \ne P \cup N$, there exists an index $i \in \{1, 2, \ldots, m\}$
such that
$$(P' \cap P_i) \cup (N' \cap N_i) = \emptyset.$$
Proof. By definition of dual functions (namely, $f^d(X) = \bar{f}(\bar{X})$), $C_{PN}$ is an implicant
of $f^d$ if and only if $C_{NP} = \bigwedge_{j \in P} \bar{x}_j \bigwedge_{j \in N} x_j$ is an implicant of $\bar{f}$. Since
$f \wedge \bar{f} = 0$, the identity $C_{P_iN_i} \wedge C_{NP} = 0$ must hold for all implicants $C_{P_iN_i}$ of $f$,
which implies (4.5).
Conversely, if (4.5) holds, then $f \wedge C_{NP} = 0$ holds identically, meaning that
$C_{NP}$ is an implicant of $\bar{f}$. This establishes assertion (i).
Assertion (ii) follows from the definition of prime implicants.
Observe that, in conditions (i) and (ii) of Theorem 4.7, the conjunctions CPi Ni
could be taken to be prime implicants, rather than arbitrary implicants of f .
Theorem 4.8. Suppose that the function f has a prime implicant of degree 1.
Then, f is dual-major. Moreover, f is dual-minor (and self-dual) if and only if f
has no other prime implicant.
Of course, the function g in Example 4.3 shows that there also exist nontrivial
examples of dual-comparable functions. The next result is a simple restatement of
Definition 4.2 (compare with Theorem 4.1(a)).
$$(P \cap P') \cup (N \cap N') \ne \emptyset \qquad (4.6)$$
for all pairs of (prime) implicants $C_{PN} = \bigwedge_{j \in P} x_j \bigwedge_{j \in N} \bar{x}_j$ and $C_{P'N'} =
\bigwedge_{j \in P'} x_j \bigwedge_{j \in N'} \bar{x}_j$ of $f$.
We now establish that self-dual functions are maximal among all dual-minor
functions. More precisely, let us say that a dual-minor function f is maximally
dual-minor if there exists no dual-minor function g such that f ≤ g and f = g.
$$g^d \le f^d = f \le g \le g^d.$$
$$g^d = f^d(f \vee x_1) = f^d f \vee f^d x_1 = g$$
The construction used in the proof of Theorem 4.12 can be generalized to yield
a simple, standard way of associating a self-dual function with an arbitrary Boolean
function.
Theorem 4.14. For every Boolean function $f$, the function $f^{SD}$ defined by (4.7)
is self-dual. The mapping $SD : f \mapsto f^{SD}$ is a bijection between the set of Boolean
functions of $n$ variables and the set of self-dual functions of $n + 1$ variables.
$$(f^d(X) \vee \bar{x}_{n+1})(f(X) \vee x_{n+1}) = f^d(X)f(X) \vee f(X)\bar{x}_{n+1} \vee f^d(X)x_{n+1}
= f(X)\bar{x}_{n+1} \vee f^d(X)x_{n+1}.$$
Hence, f SD is self-dual.
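The displayed identity suggests the following small check (our sketch, with an arbitrary example function $f$): build $f^{SD} = f\,\bar{x}_{n+1} \vee f^d\,x_{n+1}$ and verify self-duality point by point.

```python
from itertools import product

def f(x, y):              # an arbitrary example function on n = 2 variables
    return x | y

def dual(fn):
    """f^d(X) = complement of f at the complemented point."""
    return lambda *p: 1 - fn(*(1 - v for v in p))

fd = dual(f)

def fsd(x, y, z):         # z plays the role of x_{n+1}
    return (f(x, y) & (1 - z)) | (fd(x, y) & z)

# fSD coincides with f on z = 0, with f^d on z = 1, and is self-dual:
assert all(fsd(*p) == 1 - fsd(*(1 - v for v in p))
           for p in product([0, 1], repeat=3))
```

With $f = x \vee y$ one gets $f^d = xy$, and $f^{SD}$ is the majority function of the three variables, a familiar self-dual function.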
The mapping SD is injective, since the restriction of $f^{SD}$ to $x_{n+1} = 0$ is exactly
$f$. Moreover, SD has an inverse, defined by $g \mapsto g_{|x_{n+1}=0}$ for every self-dual
function $g$ of $n + 1$ variables. Indeed, if $g$ is self-dual, then
is self-dual.
$$\bar{g}(\bar{X}, \bar{Y}) = \bar{f}_1(\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_n, f_2(\bar{Y}))
= \bar{f}_1(\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_n, \bar{f}_2(Y))
= f_1(x_1, x_2, \ldots, x_n, f_2(Y))
= g(X, Y),$$
4.1.4 Applications
Duality plays a central role in various applications arising in artificial intelli-
gence, computational logic, data mining, reliability theory, game theory, integer
programming, and so on. Some of these applications have already been mentioned
in previous chapters. We have observed several times, for instance, that one way
of solving the Boolean equation φ = 0 is to compute a CNF representation of φ or,
equivalently, a DNF representation of φ d . Actually, if a DNF expression of φ d is
at hand, then all solutions of the equation φ = 0 are readily available (see Section
2.11.2).
Example 4.4. Let $f(x, y, z, u) = \bar{x}y \vee xz\bar{u} \vee x\bar{y}z \vee \bar{y}zu$. It can be checked that
$f^d = \bar{x}z \vee yz \vee xyu \vee \bar{x}\bar{y}\bar{u}$. Hence, the solutions of $f = 0$ have the form $(1, *, 0, *)$,
$(*, 0, 0, *)$, $(0, 0, *, 0)$, or $(1, 1, *, 1)$, where $*$ denotes an arbitrary 0–1 value.
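Such examples are easy to verify mechanically. In the sketch below (our code), the dual is checked against the definition $f^d(X) = \bar{f}(\bar{X})$, and the solutions of $f = 0$ are enumerated:

```python
from itertools import product

def f(x, y, z, u):
    # f = x'y  OR  x z u'  OR  x y' z  OR  y' z u   (primes written as ')
    return (((1 - x) & y) | (x & z & (1 - u))
            | (x & (1 - y) & z) | ((1 - y) & z & u))

def fd(x, y, z, u):
    # claimed dual: x'z  OR  y z  OR  x y u  OR  x' y' u'
    return (((1 - x) & z) | (y & z) | (x & y & u)
            | ((1 - x) & (1 - y) & (1 - u)))

# f^d(X) must equal the complement of f at the complemented point:
assert all(fd(*p) == 1 - f(*(1 - v for v in p))
           for p in product([0, 1], repeat=4))

# the solutions of f = 0, matching the four patterns of the example
solutions = [p for p in product([0, 1], repeat=4) if f(*p) == 0]
```

Each term of the DNF of $f^d$ yields one pattern of solutions: complement its positive literals to 0 and its negative literals to 1, leaving the other variables free.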
We now present a few additional models involving dual functions. Other appli-
cations will be presented in Section 4.2, when we concentrate more specifically
on positive functions.
what went wrong with the system. More precisely, Reiter [783] defines a diagnosis
as a minimal subset $J \subseteq \{1, 2, \ldots, m\}$ such that the equation
$$\bigvee_{\substack{k=1 \\ k \notin J}}^{m} \varphi_k(X^*, Y) = 0$$
is consistent. The idea is that, were it not for the components in $J$, then I would
have been functioning properly. The minimality of J translates what Reiter calls
the “Principle of Parsimony.”
The task of the analyst is now to produce all diagnoses associated with a given
observation X∗ . Let p1 , p2 , . . . , pm be m new Boolean variables, and define the
function
$$f(Y, P) = \bigvee_{k=1}^{m} p_k\, \varphi_k(X^*, Y).$$
Let also
$$\bigwedge_{i \in J_k} p_i \bigwedge_{j \in A_k} y_j \bigwedge_{j \in B_k} \bar{y}_j, \qquad k = 1, 2, \ldots, r,$$
denote the prime implicants of the dual function f d . We leave it to the reader to
check that diagnoses are exactly the minimal members of the collection of sets
{J1 , J2 , . . . , Jr }. Reiter proposes an ad hoc algorithm that produces all the diag-
noses and that uses as a subroutine a simple dualization algorithm for positive
functions (see also the exercises at the end of this chapter).
of degrees deg(f ) and deg(f d ), respectively; let C be any term of φ, and let
(without loss of generality) {x1 , x2 , . . . , xk } be the set of variables occurring in C.
Thus, k ≤ deg(f ). We construct a decision tree for f as shown in Figure 1.5,
branching on the variables in the natural order (x1 , x2 , . . . , xn ). Thus, the root of
the tree is labeled by x1 . More generally, if u is an internal vertex at depth i from
the root ( 0 ≤ i < k), then u is labeled by xi+1 . Now, consider any internal vertex
at depth k − 1 (if there is no such vertex, then the tree has depth at most k − 1 and
the required inequality holds). Let v and w be the children of this vertex. Then, the
subtree hanging from v (respectively, from w) is a decision tree for a function of the
form g = f|P ,N (resp., h = f|P \{k},N ∪{k} ), where (P , N ) is a partition of {1, 2, . . . , k}
and k ∈ P .
We can assume that the subtrees representing g and h both have optimal depth.
In this way, we obtain for f a decision tree with depth max(DT (g), DT (h)) + k.
Assume that max(DT (g), DT (h)) = DT (g) (the other case is similar). By induc-
tion, we can assume that DT (g) ≤ deg(g) deg(g d ). Note that deg(g) ≤ deg(f ),
since g is a restriction of f . Moreover, by Theorem 4.2,
g d = (f|P ,N )d = (f d )|N,P . (4.8)
Since P ∪ N = {1, 2, . . . , k} is the set of indices of the variables in C, Theorem
4.7, together with (4.8), implies that a DNF of g d is obtained by fixing at least one
variable to either 0 or 1 in each term of ψ. Therefore, deg(g d ) ≤ deg(f d ) − 1.
So, we have represented f by a decision tree of depth
$$DT(g) + k \le \deg(g)\deg(g^d) + k \le \deg(f)\,(\deg(f^d) - 1) + \deg(f) = \deg(f)\deg(f^d),$$
which proves the required inequality.
For a positive function f , we denote by minT (f ) the set of minimal true points
of f , and by maxF (f ) the set of its maximal false points. Theorem 1.26 describes
a simple one-to-one correspondence between the prime implicants of f and its
minimal true points: Namely, $X^* \in \min T(f)$ if and only if $C = \bigwedge_{i \in \mathrm{supp}(X^*)} x_i$ is a
prime implicant of $f$, where $\mathrm{supp}(X^*) = \{\, i \in \{1, 2, \ldots, n\} \mid x_i^* = 1 \,\}$.
Theorem 1.27 establishes a similar relationship between the prime implicates
of f and its maximal false points. In duality terms, this result translates as follows:
(a) g = f d .
(b) For every partition of $N = \{1, 2, \ldots, n\}$ into two sets $A$ and $\bar{A}$, there is either
a member of P contained in $A$ or a member of T contained in $\bar{A}$, but not
both.
(c) T is exactly the family of minimal transversals of P.
Proof. The equivalence of (a) and (b) is a restatement of Theorem 4.1(a). Statement
(c) is a corollary of Theorem 4.7.
Otherwise, there are two sets $P', P'' \in \mathcal{P}$ such that $P' \subseteq A$ and $P'' \subseteq \bar{A}$, and thus
$P' \cap P'' = \emptyset$, in contradiction with Theorem 4.20. Therefore, either $(N_1, N_2, \bar{A})$ or
$(A, N_3, \ldots, N_k)$ is a coloring of $H_f$ involving fewer than $k$ classes.
Finally, let us note that the proof of Theorem 4.12 is easily adapted to establish
the next result.
Theorem 4.24. A positive Boolean function f on B n is self-dual if and only if it
is maximal among all positive dual-minor functions or, equivalently, if and only if
{supp(X): f (X) = 1} is a maximal intersecting family of subsets of {1, 2, . . . , n}.
The number of positive self-dual functions on B n is not as easily determined as
the total number of self-dual functions, but asymptotic formulas have been derived
by Korshunov [579] (see also Bioch and Ibaraki [88]; Loeb and Conway [621]).
4.2.3 Applications
Application 4.3. (Combinatorics.) We saw in Section 1.13.5 that positive functions
are in one-to-one correspondence with clutters, by way of the mapping
$$f(x_1, x_2, \ldots, x_n) = \bigvee_{A \in \mathcal{P}} \bigwedge_{j \in A} x_j \;\mapsto\; \mathcal{P}.$$
Let $\varphi = \bigvee_{T \in \mathcal{T}} \big( \bigwedge_{j \in T} x_j \big)$ be the complete DNF of $f^d$. By Theorem 4.19, every
set T in T is a minimal transversal of H. In hypergraph terminology, T is the
transversal clutter or blocker of H (see, e.g., Berge [72]; Eiter and Gottlob [295];
the terminology blocker is due to Edmonds and Fulkerson [288, 353]).
Let T (H ) denote the blocker of an arbitrary clutter H. Many elementary prop-
erties of blockers are probably best viewed in a Boolean context (and, in this
context, can be extended to nonpositive functions). For instance, Lawler [603]
and Edmonds and Fulkerson [288] observed that T (T (H )) = H , a property
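Blockers of small clutters are easy to compute by enumeration. The sketch below (our illustration) finds the minimal transversals by increasing cardinality and checks the Lawler/Edmonds-Fulkerson identity $T(T(H)) = H$ on the edge set of a triangle:

```python
from itertools import combinations

def blocker(H, ground):
    """Return the clutter of minimal transversals of H over the ground set."""
    transversals = []
    for r in range(1, len(ground) + 1):
        for T in combinations(sorted(ground), r):
            T = frozenset(T)
            # T is a transversal if it meets every member of H; it is
            # minimal if no smaller transversal (found earlier) is inside it
            if all(T & A for A in H) and not any(S < T for S in transversals):
                transversals.append(T)
    return set(transversals)

H = {frozenset({1, 2}), frozenset({2, 3}), frozenset({1, 3})}
ground = {1, 2, 3}
T = blocker(H, ground)
```

The triangle is its own blocker, so here $T(H) = H$ and the identity $T(T(H)) = H$ holds trivially; replacing `H` by any other clutter exercises the general case.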
$$(y_1, y_2, \ldots, y_n) \in \mathbb{B}^n, \qquad (4.11)$$
and let P = {A1 , A2 , . . . , Am }. Clearly, the (minimal) feasible solutions of SCP
are the characteristic vectors of the (minimal) transversals of P. Therefore, if we
define a Boolean function f by
$$f = \bigvee_{k=1}^{m} \bigwedge_{i \in A_k} x_i = \bigvee_{P \in \mathcal{P}} \bigwedge_{i \in P} x_i,$$
then the (minimal) feasible solutions of SCP are exactly the (minimal) true points
of f d . In particular, any algorithm that computes the dual of f could be used, in
principle, to solve the set covering problem (see, e.g., Lawler [603] for early work
based on these observations).
More generally, dual blocking pairs (P, T ), where T is the blocker of P, play
a very important role in the theory of combinatorial optimization. A paradigmatic
example of such a pair is provided by the set P of elementary paths joining two
vertices s and t in a directed graph, and by the set T of minimal cuts separating s
from t. Another example consists of the set P of all chains in a partially ordered
set and the set T of all antichains.
We have just seen that the set covering problem SCP is equivalent to the minimization
problem $\min_{T \in \mathcal{T}} \sum_{i \in T} c_i$. If we replace the sum by a max-operator in
the objective function, then we obtain a class of bottleneck optimization problems,
expressed as
$$\min_{T \in \mathcal{T}} \max_{i \in T} c_i.$$
Edmonds and Fulkerson [288] have established that this class of problems displays
a very strong property which, in fact, provides a rather unexpected characterization
of duality for positive Boolean functions.
Theorem 4.25. Let P and T be two nonempty clutters on $\{1, 2, \ldots, n\}$. Then, the
equality
$$\max_{P \in \mathcal{P}} \min_{i \in P} c_i = \min_{T \in \mathcal{T}} \max_{i \in T} c_i \qquad (4.12)$$
holds for all choices of real coefficients $c_1, c_2, \ldots, c_n$ if and only if T is the blocker
of P.
Proof. Assume first that T is the blocker of P and fix the coefficients $c_1, c_2, \ldots, c_n$.
Consider any $P \in \mathcal{P}$ and $T \in \mathcal{T}$. Since $P \cap T \ne \emptyset$, $\min_{i \in P} c_i \le \max_{i \in T} c_i$.
Therefore, the left-hand side of (4.12) is no larger than its right-hand side.
Now, assume without loss of generality that $c_1 \ge c_2 \ge \cdots \ge c_n$, and consider
the smallest index $j$ such that $\{1, 2, \ldots, j\}$ contains a member of P; say $P^* \subseteq
\{1, 2, \ldots, j\}$ and $P^* \in \mathcal{P}$. Then, $\min_{i \in P^*} c_i = c_j$. Note that $\{j+1, j+2, \ldots, n\}$
does not contain any set $T \in \mathcal{T}$ because such a set $T$ would not intersect $P^*$.
On the other hand, $\{j, j+1, \ldots, n\}$ is a transversal of P (since its complement is
stable in P, by choice of $j$), and hence it contains some set $T^* \in \mathcal{T}$. Therefore,
$\max_{i \in T^*} c_i = c_j$, and equality holds in (4.12).
For the converse implication, let us assume that (4.12) holds for all choices
of $c_1, c_2, \ldots, c_n$, and let us establish condition (b) in Theorem 4.19. Let $(A, \bar{A})$ be
a partition of $\{1, 2, \ldots, n\}$ into two sets, and let $c_i = 1$ if $i \in A$, $c_i = 0$ if $i \in \bar{A}$.
By assumption, (4.12) holds for this choice of $c_1, c_2, \ldots, c_n$. If both sides of the
equation are equal to 1, this means that there is a set $P^* \in \mathcal{P}$ such that $P^* \subseteq A$, and
that no set in T is entirely contained in $\bar{A}$. On the other hand, if both sides of (4.12)
are equal to 0, then the reverse conclusion holds. Hence, by Theorem 4.19(b), T
is the blocker of P.
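The equality (4.12) can be illustrated numerically (our sketch; the clutter and the cost vector are arbitrary choices, with the blocker worked out by hand):

```python
P = [{1, 2}, {3}]          # a small clutter on the ground set {1, 2, 3}
T = [{1, 3}, {2, 3}]       # its blocker: the minimal transversals of P

def maxmin(clutter, c):
    return max(min(c[i] for i in S) for S in clutter)

def minmax(clutter, c):
    return min(max(c[i] for i in S) for S in clutter)

costs = {1: 5.0, 2: 2.0, 3: 4.0}
# Theorem 4.25: the two bottleneck values coincide for a blocking pair
assert maxmin(P, costs) == minmax(T, costs)
```

Trying any other real cost vector leaves the two sides equal, whereas replacing `T` by a clutter that is not the blocker of `P` breaks the equality for some costs, as the converse direction of the proof shows.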
Gurvich [421] generalized Theorem 4.25 in order to characterize Nash-
solvable game forms.
Application 4.5. (Reliability theory.) As in Section 1.13.4, let $f_S$ be the (positive)
structure function of a coherent system $S$. We have already seen that each prime
implicant $\bigwedge_{i \in P} x_i$ corresponds to a minimal pathset of $S$, namely, a minimal set
of components $P$ with the property that the whole system $S$ works whenever the
components in $P$ work.
Similarly, every (prime) implicant ⋀_{i ∈ T} x_i of f_S^d is associated with a subset T
of components called a (minimal) cutset of S. A cutset T has the distinguishing
property that, if X∗ describes a state of the components such that xi∗ = 0 for all
i ∈ T , then f_S^d (X̄ ∗ ) = 1, and hence f_S (X ∗ ) = 0. In other words, the system S fails
whenever all components in the cutset fail, irrespective of the operating state of
the other components. Therefore, the dual function f_S^d describes the system S in
terms of failing states.
This duality relationship between minimal pathsets and minimal cutsets is well-
known in the context of reliability theory, as stressed by Ramamurthy [777].
182 4 Duality theory
Application 4.6. (Game theory.) Let v be a simple game on the player set N ,
and let fv be the positive Boolean function associated with v, as explained in
Section 1.13.3. Then, the prime implicants of fv correspond to the minimal winning
coalitions of the game, namely, to those minimal subsets P of players such that
v = 1 whenever all players in P vote “Yes.”
If ⋀_{i ∈ T} x_i is a prime implicant of f_v^d , then, in view of Theorem 4.18, T is
the complement of a maximal losing coalition. In other words, T is a blocking
coalition, that is, a minimal subset of players such that v = 0 if all players in T
vote “No.”
When modeling real-world voting bodies, it often makes sense to consider cer-
tain special classes of games (see, e.g., Ramamurthy [777] or Shapley [828]). A
game v is called proper if two complementary coalitions S and N \ S cannot be
simultaneously winning. It follows from Theorem 4.9(i) (or from Theorem 4.20)
that the game v is proper if and only if the function fv is dual-minor. On the other
hand, in a strong game, two complementary coalitions cannot be simultaneously
losing. By Theorem 4.9(ii) (or Theorem 4.22), a game v is strong if and only if fv is
dual-major. Finally, v is called decisive (or constant-sum) if exactly one of any two
complementary coalitions is winning. So, v is decisive if and only if fv is self-dual.
For obvious reasons, most practical voting rules are proper. For instance, when
all the players carry one vote and decisions are made based on the majority rule
with threshold q > n/2, then the resulting game is proper. If the number of players
is odd and q = (n + 1)/2, then the game is also decisive.
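The last claim is easy to verify by brute force for the majority game of the preceding paragraph (the function and variable names below are our own):

```python
from itertools import product

# Majority game with n = 5 players and threshold q = (n + 1)/2 = 3.
n, q = 5, 3

def f(x):                 # f_v: equals 1 iff the coalition of "Yes" voters wins
    return int(sum(x) >= q)

def dual(h):
    # h^d(X) is the complement of h at the complemented point
    return lambda x: 1 - h(tuple(1 - v for v in x))

fd = dual(f)
# decisive game <=> self-dual function: check all 2^n vote profiles
assert all(f(x) == fd(x) for x in product((0, 1), repeat=n))
print("the simple majority game with n odd and q = (n+1)/2 is self-dual")
```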
The concept of self-dual extension has been studied in the game-theoretic
literature under the name of constant-sum extension.
Unexpectedly, perhaps, Boolean duality also plays an important role in the
investigation of solution concepts for nonsimple games, such as 2-person (or
n-person) positional games; we refer to Gurvich [421, 423, 424, etc.] and to
Chapter 10 for illustrations.
if it can get permission from all the members of a quorum T ∈ C, where each site
is allowed to issue at most one permission at a time. The intersecting property of
coteries guarantees that at most one task can enter the critical section at any time
(meaning, e.g., that conflicting updates cannot be performed concurrently in the
database).
A coterie C is said to dominate another coterie D if, for each quorum T1 ∈ D,
there is a quorum T2 ∈ C satisfying T2 ⊆ T1 (see Garcia-Molina and Bar-
bara [370]). Non-dominated coteries have maximal “efficiency” and are therefore
important in practical applications. Theorem 4.24 shows that nondominated coter-
ies are nothing but self-dual positive functions in disguise. Theorems 4.22 and 4.23
have also been rediscovered in this context (see [370]).
Dual Recognition
Instance: DNF representations of two Boolean functions f and g.
Question: Is g = f d ?
Dualization
Instance: An arbitrary expression of a Boolean function f .
Output: The complete DNF of f d or, equivalently, a list of all prime implicants
of f d .
DNF Dualization
Instance: A DNF representation of a Boolean function f .
Output: The complete DNF of f d or, equivalently, a list of all prime implicants
of f d .
Remark. The reader should note that on some occasions, it may be easier to
generate a shortest DNF of f d , or even an arbitrary DNF of f d , rather than its
complete DNF. Indeed, Theorem 3.17 shows that for some Boolean functions, the
size of the complete DNF may be exponentially larger than the size of certain
appropriately selected DNF representations. (This can only hold for nonmonotone
functions. Indeed, for monotone functions, the complete DNF is necessarily shorter
than any other DNF; see Theorem 1.24.)
It turns out, however, that practically all dualization algorithms generate the
complete DNF of f d , rather than an arbitrary DNF. Moreover, analyzing the com-
plexity of the “incomplete” version of the problem requires special care, since the
output of the problem is not univocally defined, or may not have an efficient char-
acterization (e.g., when the objective is to generate a shortest DNF of f d ). These
reasons explain why we mostly concentrate here on generating the complete DNF
of f d . Exceptions will be found in Theorem 4.29 and, indirectly, in the proof of
Theorem 4.28.
Theorem 4.27. Unless P = NP, there is no polynomial total time algorithm for
Dualization or DNF Dualization, even if their input is restricted to cubic
DNFs.
Proof. Assume that there is a polynomial total time algorithm A for either problem.
Denote by r(L, U ) the running time of A, where r(x, y) is a bivariate polynomial,
L is the input length and U is the output length.
Let φ be a cubic DNF. From Theorem 2.1, we know that, unless P = NP, there
is no polynomial time algorithm for deciding whether the equation φ(X) = 0 is
consistent. Note that φ = 0 is inconsistent exactly when φ d is identically 0, that is,
when φ d has no implicant.
We now consider any of the two dualization problems with the input φ. Run the
algorithm A on φ until either (i) it halts or (ii) the time limit r(|φ|, 0) is exceeded.
In case (i), if A outputs some implicant of f d , then the equation φ(X) = 0 is
consistent; otherwise, it is inconsistent. In case (ii), the equation φ(X) = 0 is con-
sistent. Therefore, in both cases, the equation has been solved in time polynomial
in |φ|, which can only happen if P = NP.
A converse of Theorem 4.27 holds. Indeed, if P = NP, then the following result
implies the existence of a polynomial total time algorithm for Dualization (and
hence, for DNF Dualization):
A result similar to Theorem 4.28 holds for generating the minterm expression
of the dual (remember that the minterm expression of a function is a special type
of DNF representation; see Definition 1.11).
Theorem 4.29. There is an algorithm which, given an arbitrary Boolean expres-
sion φ(x1 , x2 , . . . , xn ) of a function f , produces the minterm expression of f d by
solving q + 1 Boolean equations of size at most |φ| + nq, where q is the number of
minterms of f d . If t(L) is the complexity of solving a Boolean equation with input
length at most L, then the running time of this algorithm is polynomial in |φ|, q
and t(|φ| + nq).
Proof. This is an immediate corollary of Theorem 2.20 and of the fact that X is a
false point of f if and only if X̄ is a true point of f d .
Together with the results obtained in previous chapters, Theorems 4.28 and
4.29 stress once again the close connection among three fundamental problems on
Boolean functions, namely, the solution of Boolean equations, the generation of
prime implicants, and the dualization problem. Essentially, these results show that
an algorithm for any of these three problems can be used as a black box for the
solution of the other two problems. Indeed, assume that A is an algorithm taking
as input an arbitrary Boolean expression φ, and let f be the function represented
by φ:
(i) If A is a dualization algorithm or an algorithm that generates all prime
implicants of f , then A trivially solves the equation φ = 0.
(ii) Conversely, if A is an algorithm for the solution of Boolean equations,
then A can be used to produce all prime implicants of f (see Theorem 3.9)
as well as all prime implicants of f d (see Theorem 4.28).
deduced from any DNF of f . Then, by repeated use of the distributivity law and
of absorption, the available CNF can easily be transformed into a DNF of f d .
More formally, for the input DNF φ = ⋁_{i=1}^{m} C_i , let φ_k = ⋁_{i=1}^{k} C_i and let f_k
denote the function represented by φ_k (k = 1, 2, . . . , m). The k-th iteration of the
algorithm computes all prime implicants of f_k^d , so that the task is complete after
the m-th iteration.
For i = 1, 2, . . . , m, let C_i = ⋀_{j ∈ L_i} ℓ_j , where ℓ_1 , ℓ_2 , . . . are literals. The prime
implicants of f_1^d = C_1^d are exactly the literals ℓ_j (j ∈ L_1 ). For k > 1, suppose that
f_{k−1}^d is expressed by its complete DNF ⋁_{T ∈ T} P_T . Then, by Theorem 4.5,

    f_k^d = ⋀_{i=1}^{k} ⋁_{j ∈ L_i} ℓ_j = ( ⋁_{T ∈ T} P_T ) ∧ ( ⋁_{j ∈ L_k} ℓ_j ),

and, by distributivity,

    f_k^d = ⋁_{T ∈ T} ⋁_{j ∈ L_k} P_T ℓ_j .
Procedure SD-Dualization is part of the folklore of the field and has been
repeatedly proposed by numerous authors, often in the context of the dualization
of positive DNFs; see Fortet [342], Maghout [643], Pyne and McCluskey [765],
Kuntzmann [589], Benzaken [61], Lawler [603], and so on. (Some authors [119,
432] recently called it “Berge multiplication,” in reference to its description in
[71, 72].) Nelson [705] proposed using it as a subroutine in his so-called double
Procedure SD-Dualization
Input: A DNF φ = ⋁_{i=1}^{m} ⋀_{j ∈ L_i} ℓ_j of a Boolean function f .
Output: The set of prime implicants of f d .
begin
T ∗ := {ℓ_j | j ∈ L_1 };
for k = 2 to m do
begin
T := ∅;
for all P ∈ T ∗ and for all j ∈ L_k do T := T ∪ {P ℓ_j };
remove from T every term which is absorbed by another term in T ;
T ∗ := T ;
end
return T ∗ ;
end
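A direct transcription of SD-Dualization for positive DNFs can be sketched as follows (representing terms as sets of variable indices is our own choice):

```python
def sd_dualization(terms):
    """Sequential-distributive dualization of a positive DNF.

    `terms` is a list of sets of variable indices, one set per term.
    Returns the prime implicants of the dual as a set of frozensets."""
    T = {frozenset([j]) for j in terms[0]}       # duals of C_1: its literals
    for Ck in terms[1:]:
        # distribute the clause (OR of the literals of C_k) over T
        candidates = {P | frozenset([j]) for P in T for j in Ck}
        # absorption: keep only the minimal terms
        T = {P for P in candidates if not any(Q < P for Q in candidates)}
    return T

# f = x1 x2 ∨ x3 x4; its dual is (x1 ∨ x2)(x3 ∨ x4) = x1x3 ∨ x1x4 ∨ x2x3 ∨ x2x4
print(sorted(map(sorted, sd_dualization([{1, 2}, {3, 4}]))))
```

As discussed below, the intermediate sets `candidates` may blow up exponentially even when the final output is small.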
has been improved in various ways, for instance, by Bailey, Manoukian, and
Ramamohanarao [40]; Dong and Li [276]; or Kavvadias and Stavropoulos [559].
SD-Dualization does not run in polynomial total time (even on positive
DNFs); that is, its running time may be exponentially large in the combined input
and output size of the problem. In fact, it tends to generate many useless terms in
its intermediate iterations (for k = 2, . . . , m − 1), and it only generates the prime
implicants of f d in its very last iteration (when k = m), after exponentially many
operations may already have been performed. This behavior was described more
accurately by Takata [856], who showed that on some examples, SD-Dualization
may produce a superpolynomial blowup for every possible ordering of the terms
of the input DNF (see also Hagen [432]). By contrast however, Boros, Elbassioni
and Makino [119] proved that SD-Dualization can be implemented to run in
output-subexponential time on positive DNFs, and to run in polynomial total time
on certain special classes of positive DNFs, such as bounded-degree DNFs or
read-once DNFs (see also Exercise 7).
Theorem 4.32. Unless P = NP, there exists no polynomial total time algorithm
for Dualization, even if its input represents a positive function.
Proof. Consider a DNF equation ψ(x_1 , x_2 , . . . , x_n ) = 0, and assume that each of the
literals x_i and x̄_i appears at least once in ψ, for i = 1, 2, . . . , n. Clearly, solving this
type of DNF equation is NP-complete.
Now, let ψ ∗ (x1 , x2 , . . . , xn , xn+1 , xn+2 , . . . , x2n ) be the positive DNF obtained
after replacing each negative literal x̄_i by a new variable x_{n+i} in ψ (i = 1, 2, . . . , n).
Notice that ψ(X) = 0 if and only if ψ ∗ (X, X̄) = 0. Define further the positive
expression:
    φ(x_1 , x_2 , . . . , x_{2n} ) = ψ ∗ ∧ ⋀_{i=1}^{n} (x_i ∨ x_{n+i} ).    (4.14)
Observe that the expression (4.14) is not a disjunctive normal form, so that
Theorem 4.32 does not settle the complexity of Dualization when its input φ is
restricted to positive DNFs: Let us call this problem Positive DNF Dualization.
Clearly, we can assume without loss of generality that the input of Positive DNF
Dualization is the complete DNF of a positive function f , that is, a positive
DNF consisting of all prime implicants of f . Thus, formally, we define Positive
DNF Dualization as follows:
For simplicity, and when no confusion can arise, we often use the same notation
for a positive function f and for its complete DNF φ in the sequel. For instance,
we denote by |f | the size of the complete DNF of f , that is, we let |f | := |φ|.
Positive DNF Dualization is known to be equivalent to many interest-
ing problems encountered in various fields (see Section 4.2 and [295]). Within
Boolean theory alone, several authors – in particular, Bioch and Ibaraki [89];
Eiter and Gottlob [295]; Fredman and Khachiyan [347]; Johnson, Yannakakis,
and Papadimitriou [538] – have observed that this problem is polynomially equiv-
alent to the fundamental problem of recognizing whether two positive functions f
and g are mutually dual, namely, whether f = g d (note that this is just the positive
version of the Dual Recognition problem introduced in Section 4.3.1):
If f and g are not mutually dual, then, by definition of duality, there exists a point
X ∗ ∈ Bn such that f (X ∗ ) = g(X̄ ∗ ). Let us now establish that solving Positive
Dual Recognition indirectly allows us to determine such a point X∗ . (It is
interesting to observe that a similar result holds without the positivity assumptions.)
such that f (X ∗ ) = g(X̄ ∗ ) can be found by solving n instances of Positive Dual
Recognition with size at most |f | + |g|.
(i) There is an algorithm for Positive Dual Recognition which, given the
complete DNFs of two positive functions f and g, decides whether f and g
are mutually dual by solving one instance of Positive DNF Dualization.
If r(|f |, |f d |) is the complexity of solving Positive DNF Dualization on
the input f , then the running time of this algorithm is polynomial in |f |, |g|,
and r(|f |, |g|).
(ii) Conversely, there is an algorithm for Positive DNF Dualization which,
given the complete DNF of a positive function f , produces the complete
DNF of f d by solving O(np) instances of Positive Dual Recognition
of size at most |f | + |f d |, where p is the number of prime implicants of
f d . If t(f1 , f2 ) is the complexity of solving Positive Dual Recognition
on the input (f1 , f2 ), then the running time of this algorithm is polynomial
in |f |, p and t(|f |, |f d |).
Proof. (i) If A is a dualization algorithm with running time r(|f |, |f d |), and (f , g)
is the input to Positive Dual Recognition, then we run A on the input f . If A
does not stop by time r(|f |, |g|), then g ≠ f d . Otherwise, the output
of A can be used to determine whether g = f d and to answer Positive Dual
Recognition.
(ii) Assume that A is an algorithm for Positive Dual Recognition and assume
that, at some stage, the prime implicants PJ (J ∈ G) of f d have already been
produced, where |G| ≤ p. In the next iteration, the algorithm considers the positive
function
    g(X) = ⋁_{J ∈ G} P_J (X).    (4.15)
Lawler, Lenstra, and Rinnooy Kan [605] and several other researchers (see
Garcia-Molina, and Barbara [370]; Johnson, Yannakakis, and Papadimitriou [538];
Bioch and Ibaraki [89]; Eiter and Gottlob [295]) have asked whether Positive DNF
Dualization can be solved in polynomial total time or, equivalently, whether
Positive Dual Recognition can be solved in polynomial time. This central
question of duality theory remains open to this day. A breakthrough result by
Fredman and Khachiyan [347], however, has established the existence of quasi-
polynomial time algorithms for Positive Dual Recognition and for Positive
DNF Dualization. This is in stark contrast with the NP-hardness results obtained
for the general Dual Recognition (Theorem 4.26) and DNF Dualization prob-
lems (Theorem 4.27), since it is widely believed that NP-hard problems have no
quasi-polynomial time algorithm.
The algorithm proceeds to determine whether f and g are mutually dual and, if
they are not, to find a point X ∗ ∈ Bn such that
f (X ∗ ) = g(X̄ ∗ ) = 0.    (4.19)
However, since an efficient procedure is not immediately available for deciding
whether f and g are mutually dual (i.e., for solving Positive Dual Recognition),
4.4 Algorithmic aspects: Positive functions 193
we cannot apply the same recursive approach used in the proof of Theorem 4.33.
Therefore, we introduce here two crucial modifications. First, instead of exactly
solving an instance of Positive Dual Recognition at every step of the recursion
(as in the proof of Theorem 4.33), we rely on an incomplete test based on examining
the quantity
    E(f , g) := Σ_{I ∈ F} 2^{−|I |} + Σ_{J ∈ G} 2^{−|J |}.    (4.20)
Theorem 4.35. Let f and g be two positive functions defined by (4.16) and (4.18).
If E(f , g) < 1, then f and g are not mutually dual, and a point X ∗ satisfying (4.19)
can be computed in polynomial time.
Proof. We use the same approach as in the proofs of Theorems 2.26 and 2.27. Namely,
consider the polynomial
    F (X) = Σ_{I ∈ F} Π_{i ∈ I} x_i + Σ_{J ∈ G} Π_{j ∈ J} (1 − x_j ).
Thus, when E(f , g) < 1, Theorem 4.35 can be used as a substitute for Theo-
rem 4.33. When E(f , g) ≥ 1, however, we cannot draw any immediate conclusion,
and we turn instead to a recursive divide-and-conquer procedure based on The-
orem 4.3. But rather than decomposing f and g on an arbitrary variable, we are
going to show how to choose a “good” variable xi , so that the size of the resulting
subproblems is relatively small. Observe first that when E(f , g) ≥ 1, either f or
g contains a prime implicant of only logarithmic length.
Lemma 4.1. Let f and g be two positive functions defined by (4.16) and (4.18).
If E(f , g) ≥ 1, then either f or g has a prime implicant with degree at most
log(|F| + |G|).
Proof. Let δ = min{|I | | I ∈ F ∪ G} be the degree of a shortest prime implicant of
either f or g. By definition (4.20), (|F| + |G|)2−δ ≥ E(f , g) ≥ 1.
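The quantity (4.20) and the bound of Lemma 4.1 can be illustrated on a small pair of implicant families (the families below are our own invention, and log is read as log base 2):

```python
import math

# Hypothetical prime-implicant supports of two positive functions f and g.
F = [{1, 2}, {2, 3}]
G = [{2}, {1, 3}]

# E(f, g) as in (4.20): each implicant of degree d contributes 2^(-d)
E = sum(2.0 ** -len(I) for I in F) + sum(2.0 ** -len(J) for J in G)
print(E)   # 0.25 + 0.25 + 0.5 + 0.25 = 1.25

# Lemma 4.1: since E(f, g) >= 1, the shortest prime implicant of f or g
# has degree at most log2(|F| + |G|) = 2; here the singleton {2} has degree 1.
delta = min(len(S) for S in F + G)
assert E < 1 or delta <= math.log2(len(F) + len(G))
```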
For M ∈ [0, 1] and i ∈ {1, 2, . . . , n}, we say that variable xi occurs in f with
frequency at least M if
    |{I ∈ F | i ∈ I }| / |F| ≥ M.
We say that xi is a frequent variable for the pair (f , g) if xi occurs with frequency
at least 1/ log(|F| + |G|) either in f or in g.
Theorem 4.36. Let f and g be two positive functions defined by (4.16) and (4.18),
and assume that I ∩ J ≠ ∅ for all I ∈ F and J ∈ G (condition (4.21)).
If E(f , g) ≥ 1 and |F| |G| ≥ 1, then there exists a frequent variable for the pair
(f , g).
Proof. By Lemma 4.1, either f or g has a prime implicant with degree at most
log(|F| + |G|). Let us assume without loss of generality that J ∈ G defines such a
short implicant. Then, in view of (4.21), some variable xi , i ∈ J , must occur in f
with frequency at least 1/|J | ≥ 1/ log(|F| + |G|).
We now have all the necessary ingredients to present the important quasi-
polynomial time algorithm proposed by Fredman and Khachiyan [347] for the
solution of Positive Dual Recognition. A formal description of the algorithm
is given in Figure 4.2.
Theorem 4.37. Procedure Recognize Dual is correct and runs in time
m^{4 log² m + O(1)} , where m = |F| + |G|.
Proof. The correctness of the procedure follows from the above discussion. Theo-
rem 4.35 implies that Step 1 can be executed in time polynomial in the input size
|f | + |g|. It can be checked that, if g = f d , then n ≤ |F||G| ≤ m² (see Exercise 5),
and hence |f | + |g| = O(nm) = O(m³). Step 2 is easily done in O(1) time. Therefore,
up to a polynomial factor m^{O(1)} , the running time of the procedure is bounded
by the number of recursive calls.
Fix m, and let M = 1/ log m. Let v = |F||G| be the volume of the pair (f , g), and
let a(v) be the maximum number of recursive calls of the procedure when running
on a pair with size at most m and volume at most v. We are going to show that
    a(v) ≤ m^{4 log² m}.    (4.22)
Note that the size of each pair involved in a recursive call is smaller than m. So,
the frequent variable xi selected in Step 3 always has frequency at least M either
in f or in g. Suppose, without loss of generality, that xi occurs with frequency
at least M in f .
Then, the number of terms of f|xi =0 is at most (1 − M)|F|, and the number of
terms of g|xi =1 is at most |G|, so that the volume of the pair (f|xi =0 , g|xi =1 ) is at
most (1 − M)v. Also, the number of terms of f|xi =1 is at most |F| and the number
of terms of g|xi =0 is at most |G| − 1, so that the volume of the pair (f|xi =1 , g|xi =0 )
is at most v − 1.
We thus obtain the following recurrence:

    a(v) ≤ 1 + a((1 − M)v) + a(v − 1).

From this recurrence, we obtain a(v) ≤ k + k a((1 − M)v) + a(v − k) for all k ≤ v.
Letting k = ⌈vM⌉ yields a(v) ≤ (3 + 2vM)a((1 − M)v), and hence
The bound (4.22) on a(v) follows from v = |F||G| ≤ (|F| + |G|)²/4 ≤ m²/4 and
M = 1/ log m.
Procedure FK-Dualization
Input: A positive Boolean function f on Bn expressed by its complete DNF.
Output: The complete DNF of f d .
Step 0: g := 0;
Step 1: Call Recognize Dual on the pair (f , g);
if the returned value is “Yes” then halt;
else let X∗ ∈ Bn be the point returned by Recognize Dual;
compute a maximal false point of f , say Y ∗ , such that X ∗ ≤ Y ∗ ;
g := g ∨ ⋀_{j ∉ supp(Y ∗ )} x_j ;
return to Step 1.
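The outer loop of FK-Dualization can be sketched compactly if a brute-force scan of Bn stands in for the Recognize Dual subroutine (so the sketch is only practical for very small n; the representation of implicants as index sets is our own):

```python
from itertools import product

def fk_dualization(F, n):
    """Outer loop of FK-Dualization for a positive function f given by
    its prime implicants F (sets of indices in range(n)).  A brute-force
    point scan replaces the Recognize Dual subroutine."""
    f = lambda x: any(all(x[i] for i in I) for I in F)
    G = []                                  # prime implicants of f^d so far
    g = lambda x: any(all(x[j] for j in J) for J in G)
    while True:
        # look for X* with f(X*) = g(complement of X*) = 0
        witness = next((x for x in product((0, 1), repeat=n)
                        if not f(x) and not g(tuple(1 - v for v in x))),
                       None)
        if witness is None:                 # f and g are now mutually dual
            return G
        # grow X* to a maximal false point Y* of f
        y = list(witness)
        for j in range(n):
            if y[j] == 0:
                y[j] = 1
                if f(tuple(y)):
                    y[j] = 0
        # the zero coordinates of Y* support a new prime implicant of f^d
        G.append({j for j in range(n) if y[j] == 0})

# dual of x0 x1 ∨ x2 is (x0 ∨ x1) x2 = x0 x2 ∨ x1 x2
print(fk_dualization([{0, 1}, {2}], 3))
```

The invariant g ≤ f^d holds throughout, which is why only witnesses of type (4.19) need to be sought.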
[9, 651, 654, 839]. We refer to Eiter, Makino, and Gottlob [302] and to Boros,
Elbassioni, Gurvich, and Makino [118] for surveys of related results. It is also worth
recalling at this point that the sequential-distributive algorithm SD-Dualization
has been recently shown to run in subexponential time on positive DNFs ([119];
see Section 4.3.2).
Identification
Instance: A black-box oracle to evaluate a positive Boolean function f at any
given point.
Output: All prime implicants of f and of f d .
Add-j
Instance: A prime implicant P of (f|xj =...=xn =0 )d , where f is a positive Boolean
function on Bn expressed in DNF.
Output: All prime implicants of (f|xj +1 =...=xn =0 )d that are absorbed by P .
    f d = (x_n ∨ (f|x_n =1 )^d ) (f|x_n =0 )^d
        = x_n (f|x_n =0 )^d ∨ (f|x_n =1 )^d (f|x_n =0 )^d .
Despite its apparent simplicity, the approach sketched in Theorem 4.39 has a
surprisingly broad range of applicability. Several related approaches are mentioned
by Eiter, Makino, and Gottlob [302]; see also Grossi [413].
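The decomposition of the dual on the last variable that underlies this approach can be checked exhaustively on a small positive function (the sample function, majority of three, is an arbitrary choice of ours):

```python
from itertools import product

# Sample positive function: majority of three, f = x0x1 ∨ x1x2 ∨ x0x2.
def f(x):
    return (x[0] & x[1]) | (x[1] & x[2]) | (x[0] & x[2])

def dual(h):
    return lambda x: 1 - h(tuple(1 - v for v in x))   # h^d(X) = complement of h(X complemented)

fd  = dual(f)                          # f^d on three variables
f1d = dual(lambda x: f(x + (1,)))      # (f|x2=1)^d on the first two variables
f0d = dual(lambda x: f(x + (0,)))      # (f|x2=0)^d on the first two variables

# f^d = (x_n ∨ (f|x_n=1)^d) ∧ (f|x_n=0)^d, checked at all 2^3 points
for x in product((0, 1), repeat=2):
    for xn in (0, 1):
        assert fd(x + (xn,)) == (xn | f1d(x)) & f0d(x)
print("decomposition of the dual on the last variable verified")
```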
4.5 Exercises
1. Consider Reiter’s analysis of the diagnosis problem (Application 4.1).
(a) Prove that the characterization of diagnoses is correct.
(b) With the same notations as in Application 4.1, define a conflict set to be
a minimal subset N ⊆ {1, 2, . . . , m} such that
    ⋁_{k ∈ N} φ_k (X ∗ , Y ) = 0

is inconsistent. Show that N ⊆ {1, 2, . . . , m} is a conflict set if and only
if ⋀_{k ∈ N} p_k is a prime implicant of f .
(c) Prove that the diagnoses are exactly the transversals of the conflict sets.
2. Prove that the composition of dual-minor positive functions is dual-minor,
and the composition of dual-major positive functions is dual-major. Show
that these results do not hold without the positivity assumption.
3. Show that, if f (x1 , x2 , . . . , xn ) is a Boolean function, then g(x1 , x2 , . . . , xn , xn+1 ,
xn+2 ) = xn+1 xn+2 ∨ xn+1 f ∨ xn+2 f d is self-dual.
4. Show that there exists a positive function f such that χ (Hf ) ≤ 3, but f is
not dual-minor (compare with Theorem 4.21).
5. Prove that, if f is a positive Boolean function on n variables, then
n ≤ p q, where p (respectively, q) is the number of prime implicants of
f (respectively, f d ).
6. Show that the procedure SD-Dualization presented in Section 4.3.1 does
not run in polynomial total time.
7. Consider a variant of SD-Dualization where the prime implicants of f
are sorted in such a way that, for j = 1, 2, . . . , n, the prime implicants on
{x1 , x2 , . . . , xj } precede any prime implicant containing xj +1 . Prove that this
variant can be implemented to run in polynomial total time on quadratic
positive functions. (Note: this implies that all maximal stable sets of a graph
can be generated in polynomial total time).
8. Prove Theorem 4.31.
9. Let ψ be a DNF of the Boolean function f (x1 , x2 , . . . , xn ). Show that the
complete DNF of f d can be generated by the following procedure: (a) In ψ,
replace every occurrence of the literal x̄_i by a new variable y_i (i = 1, 2, . . . , n),
thus producing a positive DNF φ(x1 , x2 , . . . , xn , y1 , y2 , . . . , yn ); (b) Generate
the complete DNF of φ d , say η(x1 , x2 , . . . , xn , y1 , y2 , . . . , yn ); (c) In η, replace
every occurrence of y_i by x̄_i , and remove the terms which are identically
zero. Is this sufficient to conclude that the problem DNF Dualization is
no more difficult than Positive DNF Dualization?
10. Show that the bounds in Lemma 4.1 and Theorem 4.36 are tight up to a
factor of 2. (Fredman and Khachiyan [347].)
11. Show that Theorem 4.35, Lemma 4.1, and Theorem 4.36 hold for arbitrary,
not necessarily positive functions. (Fredman and Khachiyan [347].)
Special Classes
5
Quadratic functions
Bruno Simeone
A DNF

    φ(x_1 , . . . , x_n ) = ⋁_{i=1}^{m} ⋀_{j ∈ P_i} x_j ⋀_{j ∈ N_i} x̄_j

is called
quadratic if all its terms are quadratic, that is, if they are conjunctions of at most
two literals: |Pi ∪ Ni | ≤ 2 for all i ∈ {1, . . . , m}. A term is called linear or purely
quadratic according to whether it consists of exactly one or exactly two literals.
In a similar fashion, we call a CNF quadratic if all its clauses are disjunctions
of at most two literals.
ψ(X) = 1,
p q̄ = 0,
p̄ q̄ = 0,
p q = 0,
p q̄ ∨ p̄ q = 0,
p q ∨ p̄ q̄ = 0.
In fact, it has been estimated that about 95% of the production rules in
expert systems are of the foregoing types and, hence, can be represented by
quadratic equations (see Jaumard, Simeone, and Ow [531]).
xy, x̄y, x̄ȳ,
where x and y are variables. By forbidding all terms having one or more of these
three forms, one can naturally define meaningful special subclasses of quadratic
DNFs and, accordingly, of quadratic Boolean functions.
Let us now introduce some special classes of general, not necessarily quadratic,
DNFs. We start with the definitions of Horn, co-Horn and polar DNFs, which are
thoroughly studied in Chapters 6 and 11.
Definition 5.6. A Horn DNF is a DNF in which every term contains at most one
complemented variable.
Definition 5.7. A co-Horn DNF is a DNF in which every term contains at most
one uncomplemented variable.
Definition 5.8. A polar DNF is a DNF in which no term contains both a
complemented and an uncomplemented variable.
In Section 5.4, we extensively refer to those quadratic DNFs in which every
quadratic term consists of one complemented and one uncomplemented variable.
Definition 5.9. A mixed DNF is a DNF that is both Horn and co-Horn.
As mentioned above, these important subclasses of DNFs, when restricted to
quadratic DNFs, can be simply characterized by means of forbidden terms (see
Table 5.1).
Any of these types of DNFs defines in a natural way a corresponding class of
Boolean functions. For example, we say that a Boolean function is Horn if it is
representable by a Horn DNF.
Table 5.1. Classes of quadratic DNFs and their forbidden terms

Horn                          x̄ȳ
co-Horn                       xy
polar                         x̄y
mixed                         xy, x̄ȳ
positive (purely quadratic)   x̄y, x̄ȳ
positive undirected
mixed directed
arbitrary bidirected
Note that the prime implicants of f are precisely the terms xi xj of this DNF,
which is also the unique irredundant DNF of f . It follows that the correspondence
between positive purely quadratic Boolean functions and undirected graphs is
one-to-one.
Now let D = (N , A) be a directed graph, with N = {1, 2, . . . , n}. We can associate
with D a quadratic mixed DNF ϕ ≡ ϕ(D) as follows: We associate with every
vertex i ∈ N a variable xi of ϕ, and with every arc (i, j ) ∈ A a quadratic term xi x j
of ϕ. Conversely, given any mixed quadratic DNF ϕ (without linear terms), one
can uniquely reconstruct the directed graph D ≡ D(ϕ) whose associated DNF is
ϕ.
However, this time the correspondence between digraphs and quadratic mixed
Boolean functions is not one-to-one: Indeed, a purely quadratic mixed Boolean
function f may be represented by many irredundant quadratic mixed DNFs. In
order to state this relation more precisely, we need the notion of transitive closure
of a digraph (see also Appendix A): Given a digraph D = (N , A), its transitive
closure is the digraph obtained from D by adding to A all the arcs (u, v) such that
there is a directed path from u to v in D.
Theorem 5.2. Two digraphs correspond to the same quadratic mixed Boolean
function if and only if their transitive closures are identical.
Proof. Two mixed DNFs represent the same quadratic Boolean function if and only
if the two sets of prime implicants that one can obtain from them by the consensus
algorithm are the same. It is easy to see that these implicants are quadratic mixed
terms, and that the digraph associated with their disjunction is transitively closed.
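Theorem 5.2 can be illustrated directly: two digraphs that differ only by an arc already in the transitive closure represent the same mixed quadratic function (a sketch; the arc and index conventions are our own):

```python
from itertools import product

def func_from_arcs(arcs):
    """Mixed quadratic function of a digraph: one term x_i x̄_j per arc (i, j)."""
    return lambda x: any(x[i] and not x[j] for (i, j) in arcs)

def transitive_closure(arcs, n):
    reach = set(arcs)
    for k, i, j in product(range(n), repeat=3):   # Warshall: k varies slowest
        if (i, k) in reach and (k, j) in reach:
            reach.add((i, j))
    return reach

n = 3
D1 = {(0, 1), (1, 2)}
D2 = {(0, 1), (1, 2), (0, 2)}   # adds a "shortcut" arc already in the closure

assert transitive_closure(D1, n) == transitive_closure(D2, n)
f1, f2 = func_from_arcs(D1), func_from_arcs(D2)
assert all(f1(x) == f2(x) for x in product((0, 1), repeat=n))
print("same transitive closure, same quadratic mixed Boolean function")
```

The extra term x0 x̄2 of D2 is exactly the consensus of x0 x̄1 and x1 x̄2, which is why the two DNFs represent one function.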
ϕ = x̄1 ∨ x4 ∨ x1 x2 ∨ x̄1 x̄2 ∨ x̄1 x̄4 ∨ x2 x̄3 ∨ x2 x4 ∨ x3 x̄4    (5.2)
The graph G is said to have the Kőnig-Egerváry (KE) property if equality holds
in (5.3).
Theorem 5.3. The quadratic Boolean equation ϕ = 0 is consistent if and only if
the matched graph Gϕ has the Kőnig-Egerváry property.
Proof. The n null edges form a maximum matching of Gϕ . Therefore, Gϕ has the
KE property if and only if there is a cover C in Gϕ with |C| = n.
Assume first that Gϕ has the KE property, and let C be a cover with |C| = n.
As every null edge has exactly one endpoint in C, we can define Z ∈ Bn by
    z_i = 0 if vertex x_i belongs to C,
    z_i = 1 if vertex x̄_i belongs to C.
A variant of the matched graph in which null edges are absent is introduced in
Section 5.9 as a useful tool for dualization.
They add two dummy vertices x0 (representing the constant 0) and x̄0 (representing
the constant 1) and, for each linear term ξ , two arcs (x0 , ξ ) and (ξ̄ , x̄0 ), again
mirroring each other. The advantages of our representation will become apparent
in Section 5.8, when we discuss the relationship between prime implicants of a
quadratic DNF and transitive closures.
One should notice that, through the implication graph, a quadratic Boolean
equation is represented by an equivalent system of logical implications – a
deductive knowledge base in the terminology of artificial intelligence (see
Nilsson [713]).
The most important property of the implication graph Dϕ relates its strong com-
ponents to the consistency of the quadratic Boolean equation ϕ = 0. According to
definitions in Appendix A, a strongly connected component (or, briefly, a strong
component) of Dϕ = (N, A) is any maximal subset C of vertices with the property
that any two vertices of C lie on some closed directed walk consisting only of
vertices of C. The strong components of Dϕ form a partition of its vertex-set N ,
and they can be computed in O(m) time, where m = |A| (Tarjan [858]). By shrink-
ing each strong component into a single vertex, one obtains an acyclic digraph D̂ϕ ,
the condensed implication graph of ϕ. Notice that, in view of the Mirror Property,
the strong components of Dϕ come in pairs: If C is a strong component, then the
set C̄ consisting of the negations of all literals in C also is a strong component.
Aspvall, Plass, and Tarjan [34] proved:
Theorem 5.4. The quadratic Boolean equation ϕ = 0 is consistent if and only if
no strong component of Dϕ contains both a literal ξ and its complement ξ̄ .
To prove this theorem, let us state a simple, but useful, result.
Lemma 5.3. An assignment of binary values to the vertices of Dϕ corresponds to
a solution of the equation ϕ = 0 if and only if
(i) for all i, vertices xi and x i receive complementary values, and
(ii) no arc (and hence no directed path) goes from a 1-vertex (that is, a vertex
with value 1) to a 0-vertex (that is, a vertex with value 0).
Proof. This equivalence follows directly from the construction of the implication
graph.
ξ ⇒ ξ̄ and ξ̄ ⇒ ξ .
Therefore, the literal ξ must take the values 0 and 1 at the same time, a contradiction.
Hence ϕ = 0 has no solution.
For the converse direction, let us show that if no strong component of Dϕ
contains both a literal and its complement, then ϕ = 0 has a solution. The proof is
by induction on the number s of strong components of Dϕ (which is always even).
If s = 2 and C is a strong component, then the other strong component is C̄.
Since C and C̄ are different strong components, we may assume that all the arcs
between C and C̄ (if any) go from a vertex of C to a vertex of C̄. Now, assign the
value 0 to all literals in C and the value 1 to those in C̄. Properties (i) and (ii) of
Lemma 5.3 are satisfied and thus the assignment defines a solution of ϕ = 0.
Assume now that the statement is true whenever the implication graph has at
most s − 2 strong components (s ≥ 4), and let Dϕ have s strong components.
Consider the acyclic condensed digraph D̂ϕ obtained from Dϕ upon contraction
of the strong components of Dϕ .
Let C be the strong component of Dϕ corresponding to a source in D̂ϕ. Then,
by the Mirror Property, C̄ is a sink of D̂ϕ. By the definitions of source and sink, no
arc of Dϕ goes into C and no arc leaves C̄. Remove both C and C̄ from Dϕ. Let D
be the resulting subdigraph of Dϕ. The digraph D has s − 2 strong components.
Hence the statement of Theorem 5.4 holds for D by the inductive hypothesis, and
therefore there is an assignment of binary values to the vertices of D satisfying
(i) and (ii) of Lemma 5.3. Such an assignment can be extended to Dϕ by assigning
the value 0 to all literals in C and the value 1 to all literals in C̄. It is immediate to
verify that the extended assignment still satisfies (i) and (ii) of Lemma 5.3 in the
digraph Dϕ . Hence, it yields a solution of ϕ = 0.
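Theorem 5.4 translates directly into an efficient consistency test. The sketch below is our illustration, not code from the literature: a literal is encoded as a signed integer (+i for xi, −i for x̄i), the implication graph is built from the quadratic terms, and strong components are computed with Kosaraju's two-pass algorithm.

```python
from collections import defaultdict

def implication_graph(terms, n):
    """Arcs of D_phi for a quadratic DNF given as a list of literal pairs.
    A literal is +i for x_i and -i for its complement (1 <= i <= n)."""
    succ = defaultdict(list)
    for a, b in terms:
        succ[a].append(-b)   # a = 1 forces b = 0
        succ[b].append(-a)   # b = 1 forces a = 0 (Mirror Property)
    return succ

def strong_components(succ, n):
    """Kosaraju's two-pass algorithm on the 2n literal vertices."""
    verts = [v for i in range(1, n + 1) for v in (i, -i)]
    pred = defaultdict(list)
    for u in list(succ):
        for w in succ[u]:
            pred[w].append(u)
    order, seen = [], set()
    for s in verts:                      # first pass: record finishing order
        if s in seen:
            continue
        seen.add(s)
        stack = [(s, iter(succ[s]))]
        while stack:
            v, it = stack[-1]
            for w in it:
                if w not in seen:
                    seen.add(w)
                    stack.append((w, iter(succ[w])))
                    break
            else:
                order.append(v)
                stack.pop()
    comp = {}                            # second pass on the transpose
    for root in reversed(order):
        if root in comp:
            continue
        comp[root] = root
        stack = [root]
        while stack:
            u = stack.pop()
            for w in pred[u]:
                if w not in comp:
                    comp[w] = root
                    stack.append(w)
    return comp

def consistent(terms, n):
    """Theorem 5.4: phi = 0 is consistent iff no strong component
    contains both a literal and its complement."""
    comp = strong_components(implication_graph(terms, n), n)
    return all(comp[i] != comp[-i] for i in range(1, n + 1))
```

Kosaraju's algorithm is used here for brevity; Tarjan's single-pass algorithm [858] achieves the same O(m) bound.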
The implication graph enables us not only to determine the consistency of the
corresponding quadratic Boolean equation but also, in case of consistency, to infer
further properties of its solutions.
We say that a literal ξ is forced to the value α (for α ∈ {0, 1}) if either the
quadratic Boolean equation ϕ = 0 is inconsistent, or if ξ takes the value α in all
its solutions.
Theorem 5.5. Suppose that the equation ϕ = 0 is consistent. Then, the literal ξ is
forced to 0 if and only if there exists a directed path from ξ to ξ̄ in Dϕ.
5.4 Quadratic Boolean functions and graphs 215
Proof. If there is a directed path from ξ to ξ̄ and ξ = 1 in some solution, then this
contradicts part (ii) of Lemma 5.3.
For the converse direction, suppose that there is no directed path from ξ to ξ̄,
and let X be any solution of ϕ = 0. If ξ = 1 in X, then we are done. Else, let us
modify X as follows: Assign to ξ and to all its successors the value 1; assign to ξ̄
and to all its ancestors the value 0. Let X′ be the resulting assignment. First of all,
X′ is well defined: No conflicting values may arise, since no ancestor of ξ̄ can be
a successor of ξ (as this would yield a directed path from ξ to ξ̄).
Let us show that X′ is a solution. If not, by Lemma 5.3 (ii), there is a path
from a 1-vertex α to a 0-vertex β. Since this path did not exist for X, either α is
a successor of ξ or β is an ancestor of ξ̄. By symmetry, it is enough to consider
the former case. But, if α is a successor of ξ, so is β, and hence β should take the
value 1 in X′, which is a contradiction.
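Since Theorem 5.5 reduces the detection of forced literals to plain reachability in Dϕ, a breadth-first search suffices. The sketch below (encoding and names are ours) tests whether the implication graph contains a directed path from a literal to its complement:

```python
from collections import defaultdict, deque

def forced_to_zero(terms, xi):
    """Theorem 5.5: literal xi (signed int: +i for x_i, -i for its
    complement) is forced to 0 iff D_phi has a path from xi to -xi."""
    succ = defaultdict(list)
    for a, b in terms:          # term ab contributes arcs a -> b-bar, b -> a-bar
        succ[a].append(-b)
        succ[b].append(-a)
    seen, queue = {xi}, deque([xi])
    while queue:                # breadth-first search from xi
        u = queue.popleft()
        for w in succ[u]:
            if w == -xi:
                return True
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return False
```

On the equation (5.15) of Section 5.6, for instance, this reports that x2 is forced to 0 (via the path x2 ⇒ x1 ⇒ x4 ⇒ x̄2), while x1 is not forced.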
Theorem 5.6. Let ξ be a literal not forced to 0, and let η be a literal not forced
to 1. The relation ξ ≤ η holds in all solutions of the quadratic Boolean equation
ϕ = 0 if and only if there is a directed path from ξ to η in Dϕ .
Proof. The “if” part is obvious after part (ii) of Lemma 5.3. Let us prove the “only
if” part. Assume there is no directed path from ξ to η, and let us prove that, if there
is a solution at all, then there is also a solution in which ξ = 1 and η = 0.
Consider an arbitrary solution X. By part (ii) of Lemma 5.3 there is no directed
path from any 1-vertex to a 0-vertex. If in X we have ξ = 1 and η = 0, we are
done. Otherwise, let us modify X as follows: Assign the value 1 to ξ and the value
0 to η. Also, assign the value 1 to all successors of ξ and 0 to all ancestors of η.
Taking into account the Mirror Property, assign the value 0 to all ancestors of ξ̄
and the value 1 to all successors of η̄. Leave the remaining values unchanged. We
claim that the assignment of values X′ obtained in this way is also a solution of
ϕ = 0.
First of all, X′ is well-defined: No conflicting values may arise, since no suc-
cessor of ξ may be an ancestor of η (as this would yield a directed path from ξ to
η, against our assumption).
Furthermore, no successor of ξ can also be an ancestor of ξ̄, else there would
be a directed path from ξ to ξ̄, and ξ would be forced to 0. Similarly, no ancestor
of η can be a successor of η̄. Suppose that in X′ there is a directed path from a
1-vertex α to a 0-vertex β. Then α is a successor either of ξ or of η̄. But then, so is
β; hence β should take the value 1 in X′, a contradiction.
Two literals ξ and η are said to be twins if ξ = η in every solution of the quadratic
Boolean equation ϕ = 0.
Corollary 5.1. Suppose that the two literals ξ and η are not forced. Then, they
are twins if and only if they are in the same strong component of the implication
graph Dϕ .
Example 5.4. Figure 5.4 shows a graph G and two of its conflict codes.
they do form a partition when the DNF ϕ is both quadratic and primitive, that is,
when two different terms of ϕ do not involve exactly the same set of variables.
A quadratic graph is called primitive, Horn, or mixed if it admits a primitive,
Horn, or mixed quadratic code, respectively.
The complexity of recognizing quadratic graphs appears to be still an open
question. However, the following negative result was established by Crama and
Hammer [230].
Theorem 5.7. Recognizing quadratic primitive graphs is NP-complete.
Actually, they proved the following stronger result.
Theorem 5.8. Recognizing whether the edge set of a bipartite graph can be col-
ored so that each color class is either a star or a square (that is, a C4), and at
most two colors meet at each vertex, is an NP-complete problem.
Benzaken, Hammer, and Simeone [69] remarked that quadratic primitive mixed
graphs are precisely the adjoints of directed graphs (where the adjoint of a digraph
D is the undirected graph whose vertices are the arcs of D, and where two vertices
u and v are adjacent if and only if the head of v coincides with the tail of u).
Chvátal and Ebenegger [200] proved:
Theorem 5.9. Recognizing quadratic primitive mixed graphs is NP-complete.
On the positive side, Benzaken, Boyd, Hammer, and Simeone [65] obtained
a characterization of quadratic primitive Horn graphs, and Hammer and Sime-
one [461] characterized bistellar graphs. In the statement of Theorem 5.10
hereunder, the word “configuration” refers to a family of digraphs on a given
set S of vertices. A configuration is defined by two disjoint subsets A, B ⊆ S × S.
The meaning is that, in every digraph of the family, the arcs in A must always be
present, the arcs in B must be absent, and all the remaining arcs may be either
present or absent.
Figure 5.5. The ten forbidden configurations C1, . . . , C10 for quadratic primitive Horn graphs.
Continuous arcs must be present; dashed ones must be absent.
The related class of bisplit graphs has been investigated by Brandstädt, Hammer,
Le and Lozin [151]. Their recognition turns out again to be reducible to a quadratic
Boolean equation.
5.5.6 Totally unimodular matrices with two nonzero entries per column
Definition 5.10. A matrix is totally unimodular (TU) if all its square submatrices
have determinant 0, 1 or −1.
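Definition 5.10 can be checked directly on small matrices by enumerating all square submatrices. The brute-force sketch below (ours, not from the literature) is exponential in the matrix size and serves only to verify small examples; Seymour's decomposition [823] is what yields a polynomial-time algorithm.

```python
from itertools import combinations

def is_totally_unimodular(A):
    """Naive check of Definition 5.10: every square submatrix of A
    must have determinant -1, 0, or 1. Exponential; small inputs only."""
    def det(M):
        # Laplace expansion along the first row; fine for tiny matrices
        if len(M) == 1:
            return M[0][0]
        return sum((-1) ** j * M[0][j] *
                   det([row[:j] + row[j + 1:] for row in M[1:]])
                   for j in range(len(M)))
    m, n = len(A), len(A[0])
    for k in range(1, min(m, n) + 1):
        for rows in combinations(range(m), k):
            for cols in combinations(range(n), k):
                M = [[A[i][j] for j in cols] for i in rows]
                if det(M) not in (-1, 0, 1):
                    return False
    return True
```

For example, a 0–1 matrix with the consecutive-ones property in each row (an interval matrix) passes the test, while the vertex–edge incidence matrix of a triangle fails (it contains a 3 × 3 submatrix of determinant −2).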
Clearly, all entries of a TU matrix must be 0, 1, or −1. TU matrices are very
important in integer programming in view of the following classical result of
Hoffman and Kruskal [495].
Theorem 5.12. Let A be an m × n TU matrix, and let b ∈ Zm be an arbitrary
integral m-vector. Then, each extreme point of the polyhedron
P = {x ∈ Rn : Ax ≤ b}
is integral.
Proof. See Hoffman and Kruskal [495].
Theorem 5.12 and the Fundamental Theorem of Linear Programming (see, e.g.,
[199, 812]) imply the following corollary: If A is an m × n TU matrix and b ∈ Zm,
then the linear program

maximize cx
subject to Ax ≤ b, x ∈ Rn        (5.8)

has an integral optimal solution whenever it has an optimal solution.
Hence, the integer linear program obtained from (5.8) by the addition of
integrality constraints on x can be solved by ordinary linear programming.
A complete characterization of TU matrices was obtained by Seymour [823];
a polynomial-time recognition algorithm based on this result can be found in
Schrijver [812].
For the special case of matrices with two nonzero entries per column, how-
ever, Heller and Tompkins [483] gave a more efficient characterization of totally
unimodular matrices.
Theorem 5.13. A necessary and sufficient condition for a (−1, 0, 1)-matrix A with
two nonzero entries per column to be totally unimodular is that its set of rows can
be partitioned into two (possibly empty) subsets R1 and R2 such that, for each
column a j:
(i) if the two nonzero entries of a j are different, then they both belong to R1 ,
or they both belong to R2 ;
(ii) if the two nonzero entries of a j are equal, then one of them belongs to R1 ,
and the other one belongs to R2 .
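Conditions (i)–(ii) of Theorem 5.13 form a system of parity constraints on the rows, that is, a 2-coloring problem, and can therefore be tested in linear time by breadth-first search. A sketch (ours; it assumes exactly two nonzero entries per column, as in the theorem):

```python
from collections import defaultdict, deque

def heller_tompkins_tu(A):
    """Theorem 5.13: a (-1,0,1)-matrix with exactly two nonzero entries
    per column is TU iff its rows split into R1, R2 such that columns
    with two equal entries are separated and columns with two opposite
    entries are kept together. Returns (R1, R2) or None."""
    m = len(A)
    constraints = defaultdict(list)   # row -> (other row, required parity)
    for j in range(len(A[0])):
        rows = [i for i in range(m) if A[i][j] != 0]
        assert len(rows) == 2, "every column must have two nonzero entries"
        r, s = rows
        parity = 1 if A[r][j] == A[s][j] else 0   # 1: different sets
        constraints[r].append((s, parity))
        constraints[s].append((r, parity))
    side = {}
    for start in range(m):            # 2-color each connected component
        if start in side:
            continue
        side[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v, parity in constraints[u]:
                want = side[u] ^ parity
                if v not in side:
                    side[v] = want
                    queue.append(v)
                elif side[v] != want:
                    return None       # odd constraint cycle: not TU
    return ([i for i in side if side[i] == 0],
            [i for i in side if side[i] == 1])
```

On the incidence matrix of a bipartite graph (all entries 0 or 1, two per column), the returned partition is exactly a bipartition of the vertex set; on the incidence matrix of an odd cycle, the procedure correctly reports failure.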
For instance, the quadratic Boolean equation
x̄2 x3 ∨ x2 x̄3 ∨ x1 x3 ∨ x̄1 x̄3 ∨ x1 x̄2 ∨ x̄1 x2 ∨ x1 x3 ∨ x̄1 x̄3 = 0
has no solution (the submatrix formed by the first three columns has
determinant −2).
5.5 Reducibility to quadratic equations 223
x1
x1 x2
x2
x3 x4
The set I = V \C is stable and must include F because all vertices in C are matched.
Hence, if we assign the value 0 or 1 to ξ(v) according to whether v ∈ C or v ∈ I ,
we obtain a solution of the equation ϕ = 0.
Example 5.6. Consider the graph of Figure 5.6, where the matching is represented
by thick edges. The associated Boolean equation is
ϕ ≡ x1 x2 ∨ x̄1 x2 ∨ x̄1 x̄2 ∨ x̄2 x3 ∨ x̄2 x4 ∨ x̄3 ∨ x̄4 = 0.
It is easy to see that this equation is inconsistent, and that the graph does not have
the KE property.
Figure 5.8. Some basic patterns in the single bend wiring problem.
It is easy to see that the relative positions of two given pairs of pins may
induce some constraints on the connections between each pair, and hence, on
the corresponding Boolean variables. Figure 5.8 shows some of the patterns that
may occur. In case (a), no matter whether connections AA′ and BB′ are upper or
lower, they must cross each other, giving rise to an infeasible situation. In case (b),
regardless of whether connection BB′ is upper or lower, connection AA′ is forced to
be upper, else it would cross BB′. In case (c), if connection BB′ is upper, then AA′
must also be upper, else it would cross BB′. In every case, each constraint involves
only two connections; hence it can be represented by quadratic conditions on the
corresponding Boolean variables. Therefore, checking the existence of a feasible
noncrossing wiring can be reduced to the solution of a quadratic Boolean equation;
see Raghavan, Cohoon, and Shani [773] and Garrido, Márquez, Morgana, and
Portillo [373] for details and extensions.
In an interactive computer-aided design (CAD) environment, one usually places
one component at a time and tries to connect it to the others. The addition of such
a component gives rise to new terms in the quadratic equation. This motivates the
investigation of an on-line model; see Section 5.7.4.
A max-quadratic function is a pseudo-Boolean function of the form

f (x1 , x2 , . . . , xn ) = max_{p∈F} gp (xp1 , xp2 ),

where
gp (xp1 , xp2 ) = ap xp1 xp2 + bp xp1 + cp xp2 + dp , p ∈ F,
and all xj ∈ {0, 1}.
Boros et al. [132] report an interesting application of the minimization of max-
quadratic functions to a VLSI design problem. The decision version of this problem
asks whether, for a given threshold value t, the set of inequalities

gp (xp1 , xp2 ) ≤ t, p ∈ F,        (5.10)

has a solution.
For any given pair p = (p1 , p2 ) ∈ F, the set of all solutions to the inequality
gp (xp1 , xp2 ) ≤ t is a subset of the 2-dimensional binary cube B 2 . Every such subset
is itself the set of solutions of a quadratic Boolean equation in two variables. It
follows that the set of solutions of the system of inequalities (5.10) is also the set
of solutions of a quadratic Boolean equation.
Example 5.7. Let
and let t = 5. Then the set of solutions of the inequality g(2,5) (x2 , x5 ) ≤ t consists
of the points (x2 , x5 ) = (0, 1) and (x2 , x5 ) = (1, 0). Hence, the set of solutions of
the inequality g(2,5) (x2 , x5 ) ≤ t coincides with the set of solutions of the quadratic
Boolean equation
x2 x5 ∨ x̄2 x̄5 = 0.
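The translation used in Example 5.7 is mechanical: each point of B² excluded by the inequality gp ≤ t contributes one quadratic term that equals 1 exactly at that point. A sketch (ours; the coefficients of g below are a stand-in of our own, since the book's g(2,5) is not reproduced in this excerpt):

```python
from itertools import product

def quadratic_terms(g, t, i, j):
    """Rewrite the inequality g(x_i, x_j) <= t over {0,1}^2 as quadratic
    Boolean terms: one term per excluded point, equal to 1 exactly there.
    Literals are signed ints: +i for x_i, -i for its complement."""
    terms = []
    for a, b in product((0, 1), repeat=2):
        if g(a, b) > t:                     # point (a, b) must be excluded
            terms.append((i if a else -i, j if b else -j))
    return terms

# An assumed stand-in for g(2,5): penalizes agreement of x2 and x5
g = lambda a, b: 6 * a * b + 6 * (1 - a) * (1 - b)
print(quadratic_terms(g, 5, 2, 5))          # -> [(-2, -5), (2, 5)]
```

With this g and t = 5, the excluded points are (0, 0) and (1, 1), so the emitted terms are x̄2 x̄5 and x2 x5, matching the quadratic equation of Example 5.7.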
For h = 1, . . . , r, the h-th level of the level graph G = (V , E) is the set Lh = {v ∈ V : l(v) = h}, where l(v) denotes the level of vertex v.
Isotony
Instance: r mutually disjoint finite sets L1 , . . . , Lr ; for each h = 1, . . . , r − 1, a
one-to-many mapping ϕh from Lh to Lh+1 .
Question: Are there r linear orders ≼1 , . . . , ≼r on L1 , . . . , Lr , respectively, such that
ϕh is an isotonic mapping from (Lh , ≼h ) into (Lh+1 , ≼h+1 ), for h = 1, . . . , r − 1?
Clearly, the above embedding problem is reducible to Isotony.
Example 5.8. Consider the proper level graph G shown in Figure 5.9(a). The
levels of G are L1 = {a, b}, L2 = {c, d, e}, L3 = {f , g}. A level-planar embedding
of G is shown in Figure 5.9(b). With reference to the corresponding Isotony
formulation, the mappings ϕ1 and ϕ2 are determined by the edges of G, and a
positive answer is provided by the linear orders
b ≺1 a; e ≺2 c ≺2 d; g ≺3 f .
For each h and all i, p ∈ Lh, introduce a Boolean variable z^h_ip taking value 1
if and only if i ≺h p. The requirements on the orders ≼1 , . . . , ≼r can then be
expressed as follows:
(i) For each h = 1, . . . , r − 1; i, p ∈ Lh ; j ∈ ϕh(i) and q ∈ ϕh(p):

i ≺h p ⇒ j ≼h+1 q (isotony)

or, equivalently,

z^h_ip z̄^{h+1}_jq = 0        (5.11)

since each ≼h is a linear order.
(ii) For each h = 1, . . . , r; i, p ∈ Lh , i ≠ p: exactly one of i ≺h p and p ≺h i
holds, or, equivalently,

z^h_ip z^h_pi ∨ z̄^h_ip z̄^h_pi = 0.        (5.12)

(iii) For each h = 1, . . . , r; i, k, p ∈ Lh :

i ≼h k and k ≼h p ⇒ i ≼h p (transitivity)

or, equivalently,

z^h_ik z^h_kp z̄^h_ip = 0.        (5.13)

The table below recapitulates the problems of Section 5.5 that are reducible to
quadratic Boolean equations, together with the complexity of the resulting
algorithms:

Problem                               Complexity
Bipartiteness                         O(m)
Balance in signed graphs              O(m)
Recognition of split graphs           O(n^2)
Forbidden-color graph bipartition     O(m)
Totally unimodular matrices           linear
Kőnig–Egerváry property               O(n^2.5)
Single bend wiring                    quadratic
Max-quadratic functions               linear
Level graph drawing                   quadratic
Summing up, the answer to Isotony is Yes if and only if the cubic Boolean
equation
F (Z) = 0
is consistent, where F is the disjunction of all the left-hand sides of (5.11), (5.12),
and (5.13). Randerath et al. [778] proved the following surprising result.
Theorem 5.17. The cubic constraints (5.13) are redundant. Therefore, Isotony
is polynomially reducible to a quadratic Boolean equation, and it can be answered
in polynomial time.
Proof. The proof is lengthy and must be omitted here. The reader may consult the
paper by Randerath et al. [778].
All four algorithms above are graph theoretic: In the first three algorithms, the
quadratic Boolean expression ϕ is represented by an undirected graph (namely, the
matched graph introduced in Section 5.4.2), whereas the fourth algorithm exploits
a digraph model (namely, the implication graph introduced in Section 5.4.3).
5.6 Graph-theoretic algorithms for quadratic equations 231
Example 5.9. Consider the quadratic Boolean equation
ϕ = x1 x̄3 ∨ x1 x̄4 ∨ x̄1 x2 ∨ x2 x4 ∨ x̄2 x̄4 ∨ x3 x̄4 = 0.        (5.15)
The Labeling algorithm first guesses the value 1 for an arbitrary literal ξ (assigning
to ξ the label 1 and to ξ̄ the label 0), and then propagates labels by repeated
application of the following step.
STEP:
Pick an arbitrary unscanned term ηζ such that η has the label 1, and assign to ζ
and to ζ̄ the labels 0 and 1, respectively, making sure that ζ did not previously
receive the label 1. Declare the term ηζ “scanned.”
If a conflict arises because ζ was previously assigned the label 1, and the
algorithm now tries to assign the label 0 to ζ , then the labeling stops, all labels
are erased, and the alternative guess ξ = 0 is made. The labeling procedure starts
again, and if a new conflict occurs at a later stage, then the algorithm terminates
with the conclusion that the equation has no solution. On the other hand, if all
literals are successfully labeled, then the algorithm concludes that the equation is
consistent and the labeling directly yields a solution. However, a third possibility
may occur: The labeling “gets stuck,” in the sense that no conflict has occurred,
but some literals are still unlabeled. This may happen only when no unscanned
term contains a literal labeled 1, in other words, when each literal appearing in an
unscanned term is either unlabeled or has the label 0. If this situation occurs, then
the labeled variables are fixed according to the current labels, and the labeling
restarts with a new guess on the reduced expression involving only the unlabeled
literals.
Theorem 5.18. The Labeling algorithm is correct and runs in O(mn) time.
Proof. The algorithm makes a guess on the value of some literal, and then it
deduces the values of as many literals as possible. Label propagation (that is,
value assignment) stops in three cases:
Case 1: All literals are labeled, and no conflict has occurred. Then, by construction,
(i) the labels assigned to variable xi and to its complement x̄i are different for
i = 1, . . . , n;
(ii) the label 1 is never simultaneously assigned to two literals appearing in a
same quadratic term;
so that the labeling defines a solution of ϕ = 0.
Case 2: A conflict occurs after both guesses ξ = 1 and ξ = 0 have been tried.
Then, the equation has no solution, since neither value of ξ can be extended to a
solution of the current (reduced) equation.
Case 3: The algorithm “gets stuck,” that is, a proper subset of literals is labeled
and the labeling cannot be extended further.
In this case, let L and U be the sets of labeled and unlabeled literals, respectively.
Let ϕU be the subexpression of ϕ obtained after fixing all the labeled variables to
their current labels; thus, ϕU involves only the unlabeled literals. We claim that
ϕ = 0 is consistent if and only if ϕU = 0 is consistent.
Observe that each term T of ϕU is among the terms of ϕ: Indeed, if this is
not the case, then T must result from some term Ti = ηζ of ϕ by fixation of one
of its literals to 1. But then, the propagation step implies that the other literal of Ti
should have been labeled 0, so that T should not appear in ϕU .
Now, assume that ϕ = 0 is consistent and that ϕ(X∗ ) = 0. Then, in view of
the previous observation, the restriction of X∗ to the variables associated with
unlabeled literals defines a solution of ϕU = 0.
Conversely, assume that ϕU = 0 is consistent, and consider the labeling cor-
responding to an arbitrary solution. Such labeling, together with the one already
obtained for L, defines a complete labeling of the literals of ϕ having the above
properties (i) and (ii), and hence, a solution of ϕ = 0.
It follows that the labels of L can be made permanent, and that the labeling can
restart from U after the process gets stuck. Hence, the algorithm is correct.
The total number of initial guesses made by the algorithm is at most 2n. After
each guess, the corresponding label propagation stage explores at most m terms.
Hence, the worst-case complexity of the Labeling algorithm is O(mn).
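The Labeling algorithm can be sketched as follows. This is our simplified implementation, not the authors' code: the propagation below rescans the whole term list until a fixpoint, which is enough for illustration, although a careful implementation with adjacency lists is needed to attain the O(mn) bound of Theorem 5.18.

```python
def labeling(terms, n):
    """Labeling algorithm, sketched. Literals are signed ints (+i for x_i,
    -i for its complement); a term (a, b) is the conjunction ab.
    Returns a solution as a dict x_i -> 0/1, or None if inconsistent."""
    label = {}                              # permanently fixed labels

    def propagate(trial):
        """Extend `trial` until fixpoint; return False on a conflict."""
        changed = True
        while changed:
            changed = False
            for a, b in terms:
                for u, v in ((a, b), (b, a)):
                    if trial.get(u) == 1:   # term forces v = 0
                        if trial.get(v) == 1:
                            return False    # conflict
                        if v not in trial:
                            trial[v], trial[-v] = 0, 1
                            changed = True
        return True

    while len(label) < 2 * n:
        free = next(i for i in range(1, n + 1) if i not in label)
        for guess in (1, 0):                # guess, then alternative guess
            trial = dict(label)
            trial[free], trial[-free] = guess, 1 - guess
            if propagate(trial):
                break
        else:
            return None                     # two conflicts: inconsistent
        label = trial                       # stuck or complete: fix labels
    return {i: label[i] for i in range(1, n + 1)}
```

On the equation (5.15) this returns a solution; which one depends on the order in which variables are guessed, so it need not coincide with the run recorded in Table 5.5.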
Example 5.9 (continued). The history of the execution of the Labeling algorithm
on the quadratic equation (5.15) is shown in Table 5.5. After Step 10 all literals
have been labeled without conflicts. Hence, the equation ϕ = 0 is consistent, and
a solution is x1∗ = 0, x2∗ = 0, x3∗ = 1, x4∗ = 1.
Table 5.5. History of the Labeling algorithm on the equation (5.15).

Step  Scanned term   Labels assigned      Remark
0     —              x2 = 1, x̄2 = 0      guess
1     x̄1 x2         x̄1 = 0, x1 = 1
2     x2 x4          x4 = 0, x̄4 = 1
3     x1 x̄3         x̄3 = 0, x3 = 1
4     x1 x̄4         x̄4 = 0, x4 = 1      conflict
5     —              x2 = 0, x̄2 = 1      alternative guess
6     x̄2 x̄4        x̄4 = 0, x4 = 1      stuck
7     —              x1 = 0, x̄1 = 1      guess
8     —              —                    stuck
9     —              x3 = 1, x̄3 = 0      guess
10    —              —                    end
The two labelings are then extended to as many literals as possible through the
alternate execution of the following STEP for the red labeling and for the green
one:
STEP:
Pick an arbitrary unscanned term ηζ such that η has the label 1, and assign to ζ
and to ζ̄ the labels 0 and 1, respectively, making sure that ζ did not previously
receive the label 1. Declare the term ηζ “scanned.” (Here, terms like “label,”
“unscanned,” “scanned” are relative to the color currently under consideration.)
If a conflict arises, say, for the red labeling (i.e., some literal that was previously
red-labeled 1 is forced to get the red label 0, or vice versa), the red labeling stops
and the red labels are erased. If, at a later stage, a conflict occurs also for the green
labeling, the algorithm stops and the equation has no solution. It may happen that
one of the labelings, say, the red one, “gets stuck,” meaning that no conflict has
occurred, but that there are still literals having no red label. This is possible only
when, for each red-unscanned term, the literals appearing in that term are either
red-unlabeled or have red label 0. If this situation occurs, then the red labels are
made permanent, and both the red and the green labeling are restarted on the
reduced expression involving only the red-unlabeled literals.
The algorithm can be shown to run in O(m) time (see Gavril [374]).
Example 5.9 (continued). Table 5.6 summarizes a run of the algorithm on the
equation (5.15). After step 4, all literals have been (red-)labeled without conflicts.
Hence the equation is consistent and a solution is given by x1∗ = 0, x2∗ = 0, x3∗ =
0, x4∗ = 1.
can be written as Horn DNFs after switching a subset of variables, that is, after
performing the change of variables that replaces some of the original variables xi by
new variables yi = x i . He provided a 2-Sat characterization of Horn-renamability
(see Section 6.10.1 for details). For quadratic DNFs, a sort of converse relation
holds.
Theorem 5.19. Given a pure quadratic Boolean DNF ϕ, the equation ϕ = 0 is
consistent if and only if ϕ is Horn-renamable.
Proof. The proof is left as an easy exercise.
On the basis of Theorem 5.19, the Switching algorithm tries to transform the
given expression ϕ into a Horn expression, if possible, through a sequence of
switches of variables. The algorithm first identifies an arbitrary negative term,
say x i x r ; if this term is to be transformed into a Horn term, then at least one of
the variables xi , xr needs to switched. The algorithm accordingly picks one of the
variables, say xi , and tries to deduce the consequences of this choice.
In order to describe more formally the algorithm, it is convenient to introduce
some preliminary definitions. (We use the tree terminology of Appendix A.) An
alternating tree rooted at x̄i is a subgraph T (x̄i) of the matched graph Gϕ with
the following properties:
(1) T (x̄i) is a tree, and x̄i is its root.
(2) If xj is a vertex of T (x̄i), then its father in T (x̄i) is x̄j .
(3) If x̄j is a vertex of T (x̄i) and j ≠ i, then its father is a vertex xr of T (x̄i)
such that (xr , x̄j ) is a mixed edge of Gϕ .
(4) If xr is a vertex of T (x̄i) and (xr , x̄j ) is a mixed edge of Gϕ , then x̄j is a
vertex of T (x̄i).
Note that it is easy to “grow” a maximal alternating tree T (x̄i) rooted at a vertex
x̄i of a matched graph. Indeed, suppose that T is any tree which satisfies con-
ditions (1)–(3) (initially, T may contain the isolated vertex x̄i only), and perform
the following steps as long as possible:
(i) If T has a leaf of the form x̄j , then add vertex xj and edge (x̄j , xj ) to T .
(ii) If T has a leaf xr , then add to T all vertices x̄j and edges (xr , x̄j ) such that
(xr , x̄j ) is a mixed edge of Gϕ and x̄j is not already in T .
It is clear that conditions (1)–(3) are maintained by both steps (i) and (ii). Moreover,
when step (ii) no longer applies, then condition (4) is also satisfied; hence, T is an
alternating tree rooted at x̄i .
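The growing procedure (i)–(ii) is easy to implement directly on a term list. The sketch below (encoding and names are ours) returns the father relation of the maximal alternating tree T(x̄i):

```python
def alternating_tree(terms, root_var):
    """Grow the maximal alternating tree T(x-bar_i) of the matched graph.
    `terms` are pairs of signed ints (+j for x_j, -j for its complement);
    the mixed edges (x_r, x-bar_j) are the pairs with opposite signs.
    Returns the father relation, keyed by vertex (the root maps to None)."""
    mixed = [(a, b) if a > 0 else (b, a)
             for a, b in terms if (a > 0) != (b > 0)]   # (x_r, x-bar_j) pairs
    root = -root_var
    father = {root: None}
    frontier = [root]
    while frontier:
        v = frontier.pop()
        if v < 0:
            # step (i): a complemented leaf x-bar_j gets its mate x_j
            if -v not in father:
                father[-v] = v
                frontier.append(-v)
        else:
            # step (ii): an uncomplemented leaf x_r gets every x-bar_j
            # joined to it by a mixed edge
            for r, nj in mixed:
                if r == v and nj not in father:
                    father[nj] = v
                    frontier.append(nj)
    return father
```

On the equation (5.15), growing T(x̄2) reaches all eight literal vertices; both endpoints of the positive edge (x2, x4) then lie in the tree, their join is x2, and Lemma 5.2 forces x2 to 0, in agreement with Example 5.9.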
Let us now record two useful properties of alternating trees.
Lemma 5.1. Let T (x̄i) be an alternating tree of Gϕ rooted at x̄i , let xj be any
vertex of T (x̄i), and let P (i, j ) be the unique path from xi to xj in T (x̄i). If X ∗ is
a solution of the equation ϕ(X) = 0 such that xj∗ = 0, then xk∗ = 0 for all vertices
xk lying on P (i, j ).
Proof. The proof is by induction on the length of the path P (i, j ). If P (i, j ) has
length 0, then i = j and the statement is trivial. Otherwise, observe that x̄j is the
father of xj in T (x̄i), and consider the father of x̄j ; in view of condition (3), this
is a vertex xr such that xr x̄j is a term of ϕ. Since ϕ(X ∗ ) = 0 and xj∗ = 0, we
obtain that xr∗ = 0. Now, the conclusion follows by induction, since the path from
xi to xr is shorter than P (i, j ).
To state the next property, we define the join of two vertices of T (x̄i) to be their
common ancestor that is farthest away from the root x̄i . Note that the join of any
two vertices necessarily corresponds to an uncomplemented variable.
Lemma 5.2. Let T (x̄i) be an alternating tree of Gϕ , let xh and xk be two vertices
of T (x̄i) such that (xh , xk ) is a positive edge of Gϕ , and let xj be the join of xh
and xk . Then, xj∗ = 0 in every solution X ∗ of the equation ϕ(X) = 0.
Proof. In every solution (if any) of ϕ = 0, either xh or xk must take value 0. Since
xj is on the path from xi to xh and on the path from xi to xk , the conclusion follows
from Lemma 5.1.
We are now ready to describe the Switching algorithm. The algorithm works on
the matched graph Gϕ . An endpoint x̄i of a negative edge (x̄i , x̄r ) is selected, and
an alternating tree T (x̄i) is grown, as explained above. As soon as a new vertex xh
of T (x̄i) is generated, one checks whether Gϕ has a positive edge (xh , xk ) linking
xh to a previously generated vertex xk of T (x̄i). If this is the case, the variable xj
corresponding to the join of xh and xk must be forced to 0 by Lemma 5.2.
As a consequence, other variables may be forced in cascade, by propagating the
fixed values through the terms of ϕ (whenever a literal of a quadratic term is fixed
to 1, the other literal of that term is forced to 0).
If a conflict occurs during this process (that is, if some variable is forced both to 0
and to 1), then the algorithm stops and concludes that the equation is inconsistent.
Otherwise, we obtain a reduced equation involving fewer variables, and a new
iteration begins. If the construction of T (x i ) has been completed and no positive
edge between two vertices of T (x i ) has been detected, then a switch is performed
on all the variables corresponding to the vertices of T (x i ). In this way, we produce
an equivalent expression, and a new iteration begins. The procedure is iterated
until either a Horn equation is obtained or all variables are forced. In both cases,
a solution of the original equation ϕ = 0 can be found by inspection of the lists of
the forced variables and of the switched ones.
Example 5.9 (continued). The matched graph Gϕ of Figure 5.10 has a negative
edge (x̄2 , x̄4 ). Hence, the alternating tree T (x̄2) shown in Figure 5.12 is grown,
until the positive edge (x2 , x4 ) is detected.
Figure 5.12. Alternating tree rooted at x̄2 for the matched graph of Figure 5.10.
Theorem 5.20. The Switching algorithm is correct.
Proof. In view of Lemma 5.2, the consistency of the original equation is not
affected when we fix variables as explained in the algorithm. Also, switching a
set of variables does not affect consistency. Therefore, if the algorithm terminates
(either because the equation is proved to be inconsistent or because a solution
has been produced), then it necessarily returns the correct answer. Thus, we only
need to prove that the algorithm always terminates. To see this, let us show that
each vertex x̄i can occur at most once as the root of an alternating tree during
the execution of the algorithm. Consider what can happen when the tree T (x̄i) is
generated.
(a) C is already labeled. Then, the algorithm processes the next strong
component.
(b) C = C̄. Then, the algorithm stops. In view of Theorem 5.4, the equation
ϕ = 0 is inconsistent.
Figure 5.13. The condensed implication graph D̂ϕ for the equation (5.15).
(c) C is unlabeled. Then, the algorithm assigns the label 1 to C and the label 0
to C̄.
It is easy to see that, if C1 and C2 are two strong components, if there exists an
arc from some vertex of C1 to some vertex of C2 in D, and if C1 is labeled 1, then
C2 is necessarily labeled 1 as well. Thus, if we assign to each vertex ξ the label of
the component containing ξ , we get a solution to the equation ϕ = 0 (by virtue of
Lemma 5.3).
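The Strong Components algorithm admits a compact implementation. With the components listed in topological order (sources first), a literal is set to 1 exactly when its component comes after that of its complement, which reproduces the labeling rules (a)–(c) above. The sketch below (encoding and names are ours) uses Kosaraju's algorithm, whose second pass discovers the components in topological order:

```python
from collections import defaultdict

def solve_2sat(terms, n):
    """Strong Components algorithm, sketched. Literals are signed ints
    (+i for x_i, -i for its complement); a term (a, b) means ab = 0.
    Returns a solution as a dict x_i -> 0/1, or None if inconsistent."""
    succ, pred = defaultdict(list), defaultdict(list)
    for a, b in terms:
        for u, v in ((a, -b), (b, -a)):     # arcs a -> b-bar and b -> a-bar
            succ[u].append(v)
            pred[v].append(u)
    verts = [v for i in range(1, n + 1) for v in (i, -i)]
    finish, seen = [], set()
    for s in verts:                         # first pass: finishing order
        if s in seen:
            continue
        seen.add(s)
        stack = [(s, iter(succ[s]))]
        while stack:
            v, it = stack[-1]
            for w in it:
                if w not in seen:
                    seen.add(w)
                    stack.append((w, iter(succ[w])))
                    break
            else:
                finish.append(v)
                stack.pop()
    comp, topo = {}, {}                     # second pass: components appear
    for root in reversed(finish):           # in topological order (sources first)
        if root in comp:
            continue
        topo[root] = len(topo)
        comp[root] = root
        stack = [root]
        while stack:
            u = stack.pop()
            for w in pred[u]:
                if w not in comp:
                    comp[w] = root
                    stack.append(w)
    sol = {}
    for i in range(1, n + 1):
        if comp[i] == comp[-i]:             # Theorem 5.4: inconsistent
            return None
        sol[i] = 1 if topo[comp[i]] > topo[comp[-i]] else 0
    return sol
```

On the equation (5.15), the components of x2 and x̄2 are the pairs {x2, x̄4} and {x̄2, x4} shown in the table below, and the assignment rule yields x2 = 0, as any solution must.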
Aspvall, Plass and Tarjan [34] show that the Strong Components algo-
rithm has complexity O(m). A randomized version of the algorithm, with
expected O(n) time complexity, has been described by Hansen, Jaumard, and
Minoux [470].
For the equation (5.15), the strong components are labeled as follows:

Strong component   Label
{x̄2 , x4 }        1
{x2 , x̄4 }        0
{x3 }              1
{x̄3 }             0
{x1 }              1
{x̄1 }             0
1) The first, and perhaps most important, observation is that quadratic Boolean
equations are indeed easy to solve: Even the slowest algorithm took only
44 milliseconds (on an IBM 3090 – nowadays an archaic computer!) to
solve the largest problem (2000 variables and 8000 terms).
2) In the satisfiable case, the foregoing experiments show a clear-cut ranking
of the four algorithms with respect to running times: L is unquestionably
the fastest one, followed by AL, S, and SC (see Figure 5.14).
3) In the unsatisfiable case, the running times of L, AL, and S are roughly
comparable, whereas the running time of SC is far larger; except for
SC, the running times were much smaller in the unsatisfiable case than in
the satisfiable one (see Figure 5.15, where the SC-graph is oversized and
hence, is not shown).
4) In the satisfiable case, the running times of SC and L grow quite regularly
with the problem size. In fact, they are very well fitted by a straight line:
The authors found that TIME_L = 5.94n and TIME_SC = 21.99n, the squared
correlation coefficients being R²_L = 0.999 and R²_SC = 1, respectively. On
the other hand, the graph of the running times of AL and S as a function of
n is less regular, but it lies between two straight lines corresponding to L
and SC (see Figure 5.14).
In the unsatisfiable case, the behavior of SC is as regular as it is in the
satisfiable case. The other three algorithms, however, behave very irregu-
larly and exhibit frequent nonmonotonicities (see Figure 5.15). At any rate,
5.7 Quadratic equations: Special topics 243
The main point is that L “capitalizes on luck,” whereas AL follows a more “pes-
simistic” approach, and L is more affected by random factors, which may increase
its running time in the worst case but may also decrease it on average. Actually,
for L to reach its O(mn) worst-case complexity, an unlikely combination of events
must take place.
Theorem 5.21. The foregoing construction always produces a median graph, and
all median graphs can be obtained in this way.
This result follows from work of Schaefer [807]; see also Bandelt and Chepoi
[50] and Feder [323]. An interesting “closure” property can be derived from it.
(This property is in fact a restatement of the characterization of quadratic functions
given in Section 5.3.2.)
The number of solutions of a quadratic Boolean equation
ϕ(x1 , x2 , . . . , xn ) = 0        (5.16)
may be exponentially larger than the number of its variables, and generating them
all is generally a prohibitive task. In fact, Valiant [883] proved that even determin-
ing the number of such solutions is #P-complete, and hence, probably very difficult.
It is perhaps worth mentioning here that merely counting the solutions is somewhat
“easier” than generating them; see Dahlöf, Jonsson, and Wahlström [252]; Fürer
and Kasiviswanathan [354].
Feder [322, 323] proposed a generating algorithm, which we now sketch. For
ease of presentation, we assume that the quadratic equation given by (5.16) is pure
and Horn, that is, all its terms are quadratic and either positive (they involve only
uncomplemented variables) or mixed (they involve exactly one complemented and
one uncomplemented variable). This assumption is not restrictive, since, in view
of Theorem 5.19, every consistent purely quadratic Boolean equation can always
be cast into a Horn equation after some of its variables are renamed.
For every pair of Boolean variables xk , xj , the following equivalences hold:
xk x̄j = 0 if and only if xk ≤ xj ,
xk xj = 0 if and only if xk ≤ x̄j .
Therefore, (5.16) can be rewritten (in more than one way) as a system of Boolean
implications of the form

xk ≤ xj for all xj ∈ Dk ,        (5.17)
xk ≤ x̄j for all x̄j ∈ Dk ,        (5.18)

where Dk ⊆ X ∪ X̄ for k = 1, 2, . . . , n.
We also assume, without loss of generality, that there are no forced variables
and no twin literals in the equation, since these can easily be detected and handled
in a preprocessing phase. As a consequence of our assumptions, the implications
(5.17)–(5.18) can be written in such a way that k < j when either xj ∈ Dk or
x̄j ∈ Dk .
Feder [322, 323] observed:
Theorem 5.22. Let X ∗ ∈ B n be a nonzero solution of (5.17)–(5.18), and let ℓ ≤ n
be such that xℓ∗ = 1 and xi∗ = 0 for 1 ≤ i < ℓ. Then, the point Y ∗ obtained after
replacing xℓ∗ by 0 is again a solution of (5.17)–(5.18).
Proof. Because yℓ∗ = 0, the point Y ∗ clearly satisfies all implications of the form
(5.17) for xj ∈ Dℓ , as well as all implications of the form (5.18) for x̄j ∈ Dℓ and for
x̄ℓ ∈ Dk . Moreover, when xℓ ∈ Dk , the implication (5.17) is necessarily satisfied
by Y ∗ because k < ℓ, and hence, yk∗ = 0.
To illustrate, consider the pure Horn quadratic equation
x1 x2 ∨ x1 x3 ∨ x1 x̄5 ∨ x2 x̄7 ∨ x3 x̄4 ∨ x3 x̄8 ∨ x4 x5 ∨ x4 x̄7 ∨ x5 x6 ∨ x6 x7 ∨ x6 x̄8 ∨ x7 x8 = 0.
(5.19)
This equation is equivalent to the system of inequalities
x1 ≤ x̄2 , x1 ≤ x̄3 , x1 ≤ x5 , x2 ≤ x7 , . . . , x7 ≤ x̄8 . (5.20)
We say that Y ∗ is the father of the solution X ∗ if Y ∗ and X ∗ are in the relation
described by Theorem 5.22. Note that every nonzero solution has exactly one
father. Consider now the digraph T = (S, A), where S is the set of solutions of
(5.17)–(5.18) (or, equivalently, of the quadratic equation ϕ = 0), and where an arc
(Y ∗ , X ∗ ) is in A if and only if Y ∗ is the father of X ∗ . Then, T defines an arborescence
rooted at the all-zero solution. Given any solution Y ∗ ∈ S, the children of Y ∗ in T
can easily be generated: If yj∗ is the first nonzero component of Y ∗ , then the children
of Y ∗ are exactly the points of the form X∗ = Y ∗ ∨ ei , i < j , such that X ∗ ∈ S. It
follows that the arborescence T can be generated and traversed efficiently (in fact,
with polynomial delay; see Appendix B.8).
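As an illustration, the traversal of T can be sketched in a few lines of Python. The encoding below is our own (not Feder's data structures): each set Dk is a list of signed literals, and membership in S is tested naively rather than with the low-complexity machinery of [322, 323].

```python
def is_solution(x, D):
    # x: 0/1 vector; D[k]: list of (j, positive) encoding x_k <= x_j or x_k <= !x_j
    for k, lits in D.items():
        if x[k]:
            for j, positive in lits:
                if x[j] != (1 if positive else 0):
                    return False
    return True

def enumerate_solutions(n, D):
    """Depth-first traversal of the arborescence T rooted at the all-zero solution."""
    stack = [[0] * n]
    while stack:
        y = stack.pop()
        yield tuple(y)
        # first nonzero coordinate of y (= n when y is the all-zero point)
        j = next((i for i, v in enumerate(y) if v), n)
        for i in range(j):            # a child of y sets one coordinate below j
            x = list(y)
            x[i] = 1
            if is_solution(x, D):
                stack.append(x)

# x1 <= x2 and x2 <= !x3, i.e., the pure Horn equation x1 !x2 v x2 x3 = 0
D = {0: [(1, True)], 1: [(2, False)]}
print(sorted(enumerate_solutions(3, D)))   # [(0, 0, 0), (0, 0, 1), (0, 1, 0), (1, 1, 0)]
```

Every solution is output exactly once, since every nonzero solution has exactly one father.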
Feder [322, 323] describes a low-complexity implementation of this procedure.
Proof. We refer the reader to Feder [322, 323] for details of the analysis.
The implications (5.17)–(5.18) state that, for k = 1, 2, . . . , n,

xk ≤ µ for all µ ∈ Dk .    (5.22)

With every parameter vector P = (p1 , p2 , . . . , pn ) ∈ B n , associate the point
(g1 (P ), . . . , gn (P )) given by the product-form expressions

gk (P ) = pk ∧ ⋀_{xj ∈ Dk} pj ∧ ⋀_{x̄j ∈ Dk} p̄j    (5.23)

for k = 1, 2, . . . , n. Let

Q = { (g1 (P ), . . . , gn (P )) : P ∈ B n }.    (5.24)
Lemma 5.4. If S is the set of solutions of the system (5.22), and if Q is defined by
(5.23)–(5.24), then S ⊆ Q.
The next proposition states a necessary and sufficient condition under which
equality holds between S and Q. We first introduce some additional notation.
With the system (5.22), we associate the directed graph H = (X ∪ X̄, A), defined
as follows: For all xk in X and µ in X ∪ X̄, the arc (xk , µ) is in A if and only
if µ ∈ Dk . (H is in general a subgraph of the implication graph of the original
equation (5.16).)
Theorem 5.24. If S is the set of solutions of the system (5.22), and if Q is defined
by (5.23)–(5.24), then S = Q if and only if the digraph H is transitive.
Proof. Assume in the first place that H is transitive. By Lemma 5.4, we only have
to prove that every point (g1 (P ), . . . , gn (P )) in Q is a solution of (5.17)–(5.18).
Let x̄j ∈ Dk . If gk (P ) = 1, then pj = 0, and hence, gj (P ) = 0. This shows that
the implications (5.18) are satisfied by (g1 (P ), . . . , gn (P )).
Let xj ∈ Dk . If gj (P ) = 0, then either (i) pj = 0, or (ii) pi = 0 for some i such
that xi ∈ Dj , or (iii) pi = 1 for some i such that x̄i ∈ Dj . In case (ii), xi ∈ Dk by
transitivity of H . Similarly, in case (iii), x̄i ∈ Dk . Hence, in all cases, gk (P ) = 0,
and the implications (5.17) are satisfied by (g1 (P ), . . . , gn (P )).
Conversely, assume that H is not transitive. This means that, for some xk , xj ∈ X
and µ ∈ X ∪ X, (xk , xj ) and (xj , µ) are in A, but (xk , µ) is not in A. Assume for
instance that µ ∈ X, that is, µ = xi for some i ∈ {1, . . . , n} (the proof is similar if
µ ∈ X̄). So, xi ∈ Dj , but xi ∉ Dk . Notice that i ≠ k, by our assumptions on the
system (5.17)–(5.18).
Let P = (p1 , . . . , pn ), where pk = 1, pi = 0, pl = 1 if xl ∈ Dk and pl = 0 other-
wise (this is a valid assignment of values to the parameters). Then, gk (P ) = 1
and gj (P ) = 0. So (g1 (P ), . . . , gn (P )) is not a solution of (5.17)–(5.18) and
S ≠ Q.
The reader can check that all solutions of (5.19) are generated by giving all possible
0–1 values to the parameters p1 , p2 , . . . , p8 .
Note also that, since the system of inequalities (5.20) is not uniquely defined by
(5.19), it is possible to derive from Theorem 5.24 several product-form parametric
representations of the solutions of (5.19).
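To make the construction concrete, the following Python sketch (our own illustration, with hypothetical names) evaluates the product-form expressions gk(P) — the conjunction of pk with pj for each xj ∈ Dk and with the complement of pj for each x̄j ∈ Dk — and verifies the equality S = Q on a small instance whose digraph H is transitive.

```python
from itertools import product

def g(k, p, D):
    # g_k(P) = p_k AND (p_j for x_j in D_k) AND (not p_j for complemented x_j in D_k)
    if not p[k]:
        return 0
    for j, positive in D.get(k, []):
        if p[j] != (1 if positive else 0):
            return 0
    return 1

def is_solution(x, D):
    return all(not x[k] or all(x[j] == (1 if pos else 0) for j, pos in lits)
               for k, lits in D.items())

# x1 <= x2, x1 <= x3, x2 <= x3: the associated digraph H is transitive
D = {0: [(1, True), (2, True)], 1: [(2, True)]}
S = {x for x in product((0, 1), repeat=3) if is_solution(x, D)}
Q = {tuple(g(k, p, D) for k in range(3)) for p in product((0, 1), repeat=3)}
print(S == Q, sorted(S))   # True [(0, 0, 0), (0, 0, 1), (0, 1, 1), (1, 1, 1)]
```

Removing the arc (x1, x3) from the system would destroy transitivity and, by Theorem 5.24, the equality S = Q.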
A natural idea is to proceed incrementally, that is, to update at each step a suitable data structure that keeps track of the work
done so far and allows us to solve the whole sequence of problems with less
computational effort. In this case, the classical worst-case analysis of the cost of
a single operation may not be adequate to analyze the cost of the whole sequence
of operations, and amortized complexity arguments are more appropriate. For a
general discussion of amortized complexity, see Tarjan [859].
For an on-line equation involving n variables and m terms, Jaumard, Marchioro,
Morgana, Petreschi, and Simeone [528] present an algorithm running in (amor-
tized) O(n) time per term, and hence, in overall O(mn) time. For each formula
in the nested sequence, not only does the algorithm check whether the formula is
consistent or not, but it also yields an explicit solution, if any, and detects the sets
of forced and twin (or identical) variables.
One can hardly conceive on-line algorithms with lower complexity, since simply
writing out the solutions to m equations already requires O(mn) time. For details,
we refer to the paper by Jaumard et al. [528].
We now consider the following two problems:
(1) Given a quadratic DNF ϕ of a quadratic Boolean function f , find all prime
implicants of f .
(2) Given a quadratic DNF ϕ of a quadratic Boolean function f , find an
irredundant DNF of f .
Because all the prime implicants of a quadratic Boolean function in n variables are
quadratic, their number is O(n2 ); moreover, as we mentioned in Section 5.6.1, the
consensus method, starting from ϕ, generates all of them in polynomial time (actu-
ally, in time O(n6 )). Similar conclusions follow from Theorem 3.9 and Corollary
3.6 in Chapter 3.
However, much faster algorithms can be obtained on the basis of the close
relationship that exists between the generation of all prime implicants of f and
the generation of the transitive closure of a digraph. As we show in Section 5.8.2,
the prime implicants of f can be easily obtained from the transitive closure of the
implication graph of ϕ.
The disjunction of all the prime implicants of a Boolean function f is, in a
sense, the most detailed and explicit DNF of f : Along with each pair of terms
it explicitly features their consensus (or some term absorbing it); so, all logical
implications derivable from those appearing in the DNF are themselves featured
in the DNF. At the opposite extreme, irredundant DNFs are the most succinct and
implicit DNFs of f : No consensus of pairs of terms appearing in any such DNF is
also present in it, and the logical implications derivable from those appearing in
the DNF are implicitly, rather than explicitly, present.
A polynomial bound can be derived for the complexity of finding an irredundant
DNF of a quadratic Boolean function f , starting from an arbitrary quadratic DNF
of f . This bound can be estimated as follows: Generate in O(n6 ) time, as earlier,
the disjunction ψ of all prime implicants of f . Choose any term T of ψ and check
in O(n2 ) time whether T is an implicant of the DNF ψ′ resulting from the deletion
of T in ψ (as, e.g., in Theorem 3.8 of Chapter 3). If so, then T is redundant and
ψ can be replaced by ψ′ ; otherwise, ψ remains unchanged. At this point, choose
another term T and repeat. The process ends when all terms have been checked
for redundancy, and possibly deleted. Since the number of terms in ψ is O(n2 ),
the overall complexity of the foregoing procedure is O(n6 ) – again a polynomial
bound.
However, much faster algorithms can be obtained for this problem, too. As
mentioned above, the graph-theoretic tool of choice for the generation of all prime
implicants of a quadratic Boolean function f is the transitive closure of the impli-
cation digraph. On the other hand, as we show in Section 5.8.4, the appropriate
notion for the generation of an irredundant quadratic DNF of f is that of transitive
reduction of a digraph – just the converse of the transitive closure.
Munro [697] was apparently the first to point out that, when computing the transitive
closure of a digraph D, one may assume, without loss of generality, that D is
connected (in the sense that its underlying undirected graph is connected) and
acyclic. As a matter of fact, if D is disconnected, its transitive closure D ∗ is the
union of the transitive closures of the connected components of D. If D has cycles,
then one can preliminarily find the strong components of D by the O(m) algorithm
of Tarjan [858], and subsequently generate the acyclic condensation D̂ of D by
shrinking each strong component into a single supervertex. Once D̂ ∗ has been
computed, D ∗ can be obtained as follows:
Let A∗ and Â∗ be the arc sets of D ∗ and D̂ ∗ , respectively. Then,

(x, y) ∈ A∗ ⇔ there exists (u, v) ∈ Â∗ such that x belongs to the strong
component of D represented by u, and y belongs to the strong
component of D represented by v.    (5.25)
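The reduction to the acyclic case is easy to express in code. The sketch below is our own (it uses Kosaraju's method instead of Tarjan's for brevity): it finds the strong components, computes reachability in the acyclic condensation, and then expands the result according to (5.25); since the condensation carries no loops, the pairs lying inside a nontrivial strong component are added directly.

```python
from collections import defaultdict

def transitive_closure(n, arcs):
    adj, radj = defaultdict(list), defaultdict(list)
    for u, v in arcs:
        adj[u].append(v)
        radj[v].append(u)

    # Kosaraju: first pass records finishing order, second pass peels components.
    order, seen = [], [False] * n
    for s in range(n):
        if not seen[s]:
            seen[s] = True
            stack = [(s, iter(adj[s]))]
            while stack:
                u, it = stack[-1]
                advanced = False
                for v in it:
                    if not seen[v]:
                        seen[v] = True
                        stack.append((v, iter(adj[v])))
                        advanced = True
                        break
                if not advanced:
                    order.append(u)
                    stack.pop()
    comp, c = [-1] * n, 0
    for s in reversed(order):          # components appear in topological order
        if comp[s] == -1:
            comp[s], stack = c, [s]
            while stack:
                u = stack.pop()
                for v in radj[u]:
                    if comp[v] == -1:
                        comp[v] = c
                        stack.append(v)
            c += 1

    members = defaultdict(list)
    for v in range(n):
        members[comp[v]].append(v)
    cadj = defaultdict(set)
    for u, v in arcs:
        if comp[u] != comp[v]:
            cadj[comp[u]].add(comp[v])
    reach = [set() for _ in range(c)]
    for u in range(c - 1, -1, -1):     # reverse topological order
        for v in cadj[u]:
            reach[u] |= {v} | reach[v]

    closure = set()
    for x in range(n):
        for cy in reach[comp[x]]:
            closure.update((x, y) for y in members[cy])
        if len(members[comp[x]]) > 1:  # all pairs inside a nontrivial component
            closure.update((x, y) for y in members[comp[x]])
    return closure

# a 3-cycle 0 -> 1 -> 2 -> 0 together with the arc 2 -> 3
tc = transitive_closure(4, [(0, 1), (1, 2), (2, 0), (2, 3)])
print(len(tc), (0, 3) in tc, (3, 0) in tc)   # 12 True False
```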
The procedure can be stated as follows:

Procedure Quadratic Prime Implicants
begin
Step 1: construct the implication graph Dϕ ;
Step 2: run a transitive closure algorithm on the input Dϕ ;
    let H = Dϕ∗ be the (transitive) graph obtained at the end of this step;
Step 3: for each arc (ξ , ξ̄ ) in H , remove all arcs leaving ξ (except (ξ , ξ̄ ))
    and all arcs entering ξ̄ (except (ξ , ξ̄ )); let Q be the resulting digraph;
Step 4: if there is a pair of arcs (ξ , ξ̄ ), (ξ̄ , ξ ) in Q, then the Boolean constant 1n
    is the only prime implicant of ϕ;
    else
        for each arc (ξ , ξ̄ ) in Q, the linear term ξ is a prime implicant of ϕ;
        for each pair of mirror arcs (ξ , η) and (η̄ , ξ̄ ), the quadratic term ξ η̄
        is a prime implicant of ϕ;
Step 5: return the list of prime implicants constructed in Step 4.
end

Theorem 5.25. The algorithm Quadratic Prime Implicants is correct, that is,
it produces all prime implicants of the quadratic Boolean function f represented
by the input DNF ϕ.

Example 5.13. Let f be the quadratic Boolean function represented by the DNF
ϕ = x1 x̄2 ∨ x1 x̄3 ∨ x2 x3 ∨ x̄3 x4 .
The implication graph Dϕ is shown in Figure 5.21; the graphs H and Q are shown
in Figures 5.22 and 5.23, respectively. It follows that the disjunction of all the
prime implicants of f is given by
x1 ∨ x2 x3 ∨ x2 x4 ∨ x̄3 x4 .
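For illustration, here is a direct Python transcription of the procedure on this example. The sketch is our own: plain DFS reachability plays the role of the transitive-closure routine, a literal is encoded as a (variable, sign) pair, and the term ξη contributes the arcs ξ → η̄ and η → ξ̄ to the implication graph.

```python
from collections import defaultdict

def neg(l):
    return (l[0], not l[1])

def quadratic_prime_implicants(n, terms):
    # Step 1: the term u v contributes the implications u -> !v and v -> !u.
    adj = defaultdict(set)
    for u, v in terms:
        adj[u].add(neg(v))
        adj[v].add(neg(u))
    lits = [(i, s) for i in range(n) for s in (False, True)]

    # Step 2: transitive closure, here by brute-force DFS from every literal.
    def reach(a):
        seen, stack = set(), [a]
        while stack:
            for y in adj[stack.pop()]:
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        return seen
    R = {l: reach(l) for l in lits}

    if any(neg(l) in R[l] and l in R[neg(l)] for l in lits):
        return None        # the constant 1 is the only prime implicant
    linear = [l for l in lits if neg(l) in R[l]]

    # Steps 3-4: drop arcs around forced literals; read off the mirror pairs.
    quads = set()
    for a in lits:
        if neg(a) in R[a]:                 # arcs leaving a forced literal
            continue
        for b in R[a]:
            if b in (a, neg(a)) or b in R[neg(b)]:
                continue                   # trivial, or absorbed by a linear term
            quads.add(frozenset((a, neg(b))))
    return linear, quads

# Example 5.13: (i, True) stands for the variable x_{i+1}, (i, False) for its complement
terms = [((0, True), (1, False)), ((0, True), (2, False)),
         ((1, True), (2, True)), ((2, False), (3, True))]
linear, quads = quadratic_prime_implicants(4, terms)
print(linear)        # [(0, True)] -- the linear prime implicant x1
print(len(quads))    # 3 -- the quadratic prime implicants x2 x3, x2 x4, !x3 x4
```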
Conversely, given an arbitrary digraph G, one can associate with G a mixed DNF
ϕ ≡ ϕ(G) by simply reading the double implications (5.26) and (5.27) from left
to right. Two terms of ϕ, say x ȳ and u v̄, have a consensus only in two
cases:
(a) y = u, in which case the consensus of x ȳ and y v̄ is x v̄;
(b) x = v, in which case the consensus of u x̄ and x ȳ is u ȳ.
Thus, the consensus of any two mixed terms is still mixed. A graph-theoretic
interpretation in G of cases (a) and (b) is provided by Figures 5.24 (a) and (b),
respectively. As in the case of implication graphs, here, too, an elementary con-
sensus operation corresponds to a transitive arc addition, and vice versa. Observe
that in the context of mixed DNFs, absorption is trivial. In fact, since linear terms
can never be generated by consensus in this case, a quadratic mixed term can
be absorbed only by itself; that is, it is absorbed only if it is already present in
the current list of terms. Accordingly, any consensus algorithm whose input is a
mixed DNF ϕ can be interpreted as a transitive closure algorithm on the associated
digraph G, and vice versa (recall also Theorem 5.2).
Now we are ready to give a graph-theoretic description of a generic input
disengagement consensus algorithm. We assume that the algorithm directly takes
as input, instead of a mixed DNF, a digraph G = (V , E). As in Section 5.8.2, we
may assume, without loss of generality, that G is a connected directed acyclic
graph (a connected DAG).
begin
let F := E;
declare all arcs of E to be engaged;
while there is some engaged arc do { process arc a }
select the first (with respect to ≺) engaged arc a;
declare arc a to be disengaged;
let a = (h, k);
for each arc (p, h) ∈ F do add arc (p, k) to F (if missing);
for each arc (k, q) ∈ F do add arc (h, q) to F (if missing);
end while
return H = (V , F );
end
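The procedure is easy to simulate. The Python sketch below (our own transcription) processes the arcs of the standard path P5 in the two disengagement orders considered in Example 5.14 below, and reproduces the outcomes described there.

```python
def input_disengagement(arcs, order):
    """Process each arc (h, k) once, in the given disengagement order."""
    F = set(arcs)
    for h, k in order:
        for p, x in list(F):
            if x == h:
                F.add((p, k))          # predecessor arc (p, h) yields (p, k)
        for x, q in list(F):
            if x == k:
                F.add((h, q))          # successor arc (k, q) yields (h, q)
    return F

# the standard path P5, arcs labeled 1..4: arc h goes from v_h to v_{h+1}
arc = {h: (h, h + 1) for h in range(1, 5)}
closure = {(i, j) for i in range(1, 6) for j in range(i + 1, 6)}

bad = input_disengagement(arc.values(), [arc[h] for h in (1, 4, 3, 2)])
good = input_disengagement(arc.values(), [arc[h] for h in (1, 3, 4, 2)])
print(closure - bad)     # {(1, 5)}: the order 1 < 4 < 3 < 2 misses one arc
print(good == closure)   # True: the order 1 < 3 < 4 < 2 is successful
```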
Definition 5.2. A disengagement order ≺ is any strict linear order on the arc set
of G.
Example 5.14. Consider the directed path G = P5 , and label its four arcs as
shown in Figure 5.27. If the disengagement order is 1 ≺ 4 ≺ 3 ≺ 2, then H is a
proper subgraph of G∗ , since arc (v1 , v5 ) is missing (see Figure 5.28; here and
in Figure 5.29, all arcs are assumed to be directed from top to bottom; at each
iteration, the thick arc is the one that is being processed, and the dashed arcs are
the ones that are being added).
On the other hand, if the disengagement order is 1 ≺ 3 ≺ 4 ≺ 2, then H = G∗
(see Figure 5.29). Interestingly, in this case, only three iterations are needed in
order to generate G∗ .
Figure 5.29. H = G∗ .
h ≻ h + 1 for h = 1, . . . , i − 1;
h ≺ h + 1 for h = i, . . . , m − 1;

h ≻ h + 1 for h = 1, . . . , i − 2;
i − 1 ≺ i and i ≻ i + 1;
h ≺ h + 1 for h = i + 1, . . . , m − 1.
Notice that V-orders, N-orders and W-orders exist only when m ≥ 3, m ≥ 4, and
m ≥ 5, respectively. Monotone orders and N-orders are very easy to recognize.
One can recognize both V-orders and W-orders among all strict linear orders in
O(m) time by constructing the m-vector rank, whose components are defined
by rank(h) = r if and only if h is the r-th smallest element with respect to ≺
(h = 1, . . . , m), and comparing each component with the next one.
Clearly, for m ≤ 3, any linear order is a successful disengagement order for the
path Pm+1 . The following theorem yields several characterizations of successful
disengagement orders for m ≥ 4:
Theorem 5.26. Let Pn be the standard dipath on n vertices whose arcs are labeled
1, 2, . . . , m (where m = n − 1), and let ≺ be any disengagement order on the set
{1, 2, . . . , m}. Then, the following statements are equivalent for m ≥ 4:
(a) The disengagement order ≺ is successful for Pn .
(b) There are no i < j < h < k such that i ≺ j and h ≻ k.
(c) There is no i such that
either i ≺ i + 1 and i + 2 ≻ i + 3,
or i ≺ i + 1 and i + 3 ≻ i + 4.
Proof. See Boros et al. [120]. For instance, in the second case of Example 5.14, it
is enough to disengage the arcs 1, 3, 4 in order to generate the transitive closure
of P5 (see Figure 5.29), but this is impossible with one or two arcs.
Proof. We label the arcs of G from 1 to m, as follows: The vertices of G are visited
in reverse topological order and for each vertex i, the arcs going into i are assigned
the highest previously unassigned labels (ties can be broken arbitrarily). This can
be done in time O(m) as in Tarjan [858].
Now, let ≺ be the disengagement order defined by the labels: a ≺ b if and only if the label of a is smaller than the label of b.
Since the arc labels are strictly increasing along each dipath P of G, ≺ induces
a monotone strict linear order on the arcs of each dipath. By Theorem 5.26, the
input disengagement algorithm running on the instance (G, ≺) must generate the
transitive closure of each maximal dipath of G. Since a DAG is transitively closed
if and only if each of its maximal dipaths is such, it follows that the DAG H
produced by the input disengagement algorithm must coincide with G∗ .
The final result of this subsection concerns the complexity of the input
disengagement algorithm.
Proof. Since the algorithm is an input consensus one and since all arcs of G are
disengaged after processing, the algorithm consists of m stages, one for each arc
of G. At each stage, an arc (h, k) of G is processed: All its predecessors (p, h) and
all its successors (k, q) are examined and the arcs (p, k) and (h, q) added to the
current set F , provided that they are not already present. Since the initial digraph
G is acyclic and each transitive arc addition transforms a DAG again into a DAG,
no predecessor p of h can coincide with a successor q of k. Hence, for a fixed arc
(h, k), the number of all such vertices p and q is at most n − 2. Therefore, there
are at most m(n − 2) transitive arc additions, and the thesis follows.
We restrict ourselves to finding prime irredundant DNFs. Recall from Section 1.7
that a prime irredundant DNF of f has the following two properties:
• It is a disjunction of prime implicants of f .
• It does not have any redundant terms, that is, terms whose deletion results in
a shorter DNF representation of f .
Therefore, a natural algorithmic strategy for finding a prime irredundant DNF
is the following:
1) Generate all linear implicants of f from the input DNF ϕ.
2) If there are two linear implicants of the form ξ and ξ̄ , then the constant 1n is
the only prime implicant, and hence, also the only prime irredundant form
of f ; stop.
3) Otherwise, perform all possible absorptions of quadratic terms by linear
ones. The resulting DNF χ is prime.
4) Check whether any term of χ is redundant.
Step 1 can be efficiently implemented as follows: To check whether the linear
term ξ is an implicant of f , assign to ξ the value 1 and deduce the values of as many
literals as possible, exactly as in the Labeling algorithm of Section 5.6.2. Then, ξ
is an implicant of f if and only if a conflict arises. Another efficient alternative
is to work on the implication graph Dϕ using Theorem 5.5 to check whether ξ is
forced to 0. Each of these two approaches takes O(mn) time.
Steps 2 and 3 are easy to implement.
An efficient implementation of Step 4 relies on the notion of transitive reduction.
A transitive reduction of a digraph D = (V , A) is any digraph D′ = (V , A′ ) such
that the transitive closure of D is equal to the transitive closure of D′ , and such
that the cardinality of A′ is minimum with this property. In the case of acyclic
digraphs, the transitive reduction is unique and can be computed in polynomial
time (see Aho, Garey, and Ullman [10]).
Let Dχ be the implication graph of the DNF χ found in Step 3. At this point,
we may assume that no linear term is redundant. Also, we may assume, without
loss of generality, that Dχ is an acyclic digraph.
Lemma 5.5. A quadratic term ξ η is redundant in χ if and only if, in the transitive
reduction of Dχ , the arcs (ξ , η̄ ) and (η, ξ̄ ) are both missing.

Proof. A term ξ η of χ is redundant if and only if it can be obtained from the
remaining terms of χ through a sequence of consensus operations. In view of
the interpretation of consensus as a transitive arc addition, ξ η is redundant if
and only if in Dχ there is a directed path from ξ to η̄ formed by other arcs (and
hence, also one from η to ξ̄ ), that is, if and only if both arcs (ξ , η̄ ) and (η, ξ̄ ) are
missing in the transitive reduction of Dχ .

Step 4 thus amounts to computing the transitive reduction Dχr of Dχ and to deleting
from χ all the quadratic implicants ξ η such that both arcs (ξ , η̄ ) and (η, ξ̄ ) are
missing in Dχr .
The resulting DNF is a prime irredundant DNF of f .
Aho, Garey, and Ullman [10] have shown that the transitive reduction of an
arbitrary DAG can be generated with the same order of complexity as its transitive
closure. In particular, O(mn) algorithms are available. Hence, an irredundant DNF
of f can be obtained within the same complexity.
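In code, the acyclic case can be sketched as follows. This is a brute-force illustration of our own, not the O(mn) method of Aho, Garey, and Ullman: an arc of a DAG is kept exactly when its endpoint cannot be reached by a detour through the other arcs.

```python
from collections import defaultdict

def transitive_reduction(arcs):
    """Unique transitive reduction of a DAG given as a set of arcs."""
    adj = defaultdict(set)
    for u, v in arcs:
        adj[u].add(v)

    def detour(u, v):
        # can v be reached from u without using the arc (u, v) itself?
        seen, stack = set(), [u]
        while stack:
            x = stack.pop()
            for y in adj[x]:
                if (x, y) == (u, v) or y in seen:
                    continue
                if y == v:
                    return True
                seen.add(y)
                stack.append(y)
        return False

    return {(u, v) for u, v in arcs if not detour(u, v)}

# the arc (a, c) is transitive, hence absent from the reduction
print(transitive_reduction({("a", "b"), ("b", "c"), ("a", "c")}))
```

By Lemma 5.5, a quadratic term of χ is then deleted exactly when neither of its two associated arcs survives in the reduction.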
5.9.1 Introduction
Several algorithmic problems related to dualization of Boolean functions were
introduced in Section 4.3. We now consider the following special case (recall that
the complete DNF of a Boolean function consists of the disjunction of all its prime
implicants):
We conclude that the minimal vertex covers of Gf that do not contain both a
vertex and its negation (namely, the prime implicants of f d ) correspond precisely
to the minimal vertex covers of G∗f .
Several algorithms are available in the literature for generating all maximal
stable sets of a graph with polynomial delay (and even in linear space) [538, 605,
873]. Since maximal stable sets are precisely the complements of minimal vertex
covers, we obtain the following result due to Ekin [303, 304].
Theorem 5.30. The problem Quadratic DNF Dualization can be solved with
polynomial delay.
Example 5.15. Let the Boolean function f be given by the quadratic DNF
x1 ∨ x 1 x2 ∨ x 3 x 4 ∨ x4 x5 ∨ x3 x 5 ∨ x4 x 6 ∨ x5 x 7 ∨ x6 x8 .
Then, the complete DNF of f (the disjunction of all its prime implicants) is
ϕ = x1 ∨ x2 ∨ x 3 x 4 ∨ x 3 x5 ∨ x3 x4 ∨ x4 x5 ∨ x3 x 5 ∨ x 4 x 5 ∨ x 3 x 6 ∨ x4 x 6
∨x 5 x 6 ∨ x3 x 7 ∨ x 4 x 7 ∨ x5 x 7 ∨ x 6 x 7 ∨ x6 x8 ∨ x 7 x8
∨x 3 x8 ∨ x4 x8 ∨ x 5 x8 .
5.10 Exercises
1. In a plant, two machines are available for processing n jobs. Each job i has
a fixed start time si and a fixed end time ti , and it must be processed without
interruption by either machine. No job can be processed by both machines,
and neither machine can process more than one job at a time. When a job
ends, the next one can start instantaneously on the same machine. Set up a
quadratic Boolean equation that is consistent if and only if a feasible schedule
exists for the n jobs.
2. Solve the quadratic Boolean equation ϕ = 0, with
ϕ = x1 x 2 ∨ x1 x6 ∨ x 1 x2 ∨ x 1 x 5 ∨ x 1 x 6 ∨ x 1 x 9 ∨ x2 x6 ∨ x 2 x3 ∨ x 2 x 6 ∨ x3 x4
∨x 3 x 4 ∨ x 3 x7 ∨ x 3 x 8 ∨ x4 x5 ∨ x4 x6 ∨ x 4 x7 ∨ x5 x 8 ∨ x 5 x8 ∨ x6 x 7 ∨ x6 x 8
∨x 7 x9 ∨ x8 x9 ,
x1 x 3 ∨ x1 x5 ∨ x 1 x2 ∨ x 2 x4 ∨ x 2 x7 ∨ x3 x 6 ∨ x3 x 8 ∨ x 3 x5 ∨ x 4 x6 ∨ x 4 x8
∨ x 5 x 7 ∨ x 5 x 8 ∨ x 7 x8 = 0
is given by the Fibonacci number Fn+1 and thus grows exponentially with
n.
8. Find all prime implicants of the quadratic Boolean function
x1 x2 ∨ x1 x 4 ∨ x1 x6 ∨ x2 x 3 ∨ x2 x5 ∨ x 2 x 4 ∨ x3 x5 ∨ x3 x 6 ∨ x 3 x 4 ∨ x4 x 5
∨ x4 x6 ∨ x5 x6 .
6
Horn functions
In this chapter, we study the class of Horn functions. The importance of Horn
functions is supported by their basic role in complexity theory (see, e.g., Schaefer
[807]), by the number of applications involving these functions, and, last but not
least, by the beautiful mathematical properties that they exhibit.
Horn expressions and Horn logic were introduced first in formal logic by
McKinsey [638] and Horn [509] and were later recognized as providing a proper
setting for universal algebra by Galvin [367], Malcev [657], and McNulty [640].
Horn logic proved particularly useful and gained prominence in logic programming
[19, 185, 488, 489, 494, 521, 552, 582, 648, 656, 721, 855, 816], artificial intelli-
gence [186, 277, 318, 612, 853], and in database theory through its proximity to
functional dependencies in relational databases [179, 267, 319, 320, 646, 647, 797].
The basic principles of Horn logic have been implemented in several widely
used software products, including the programming language PROLOG and the
query language DATALOG for relational databases [494, 648]. Though many of
the cited papers are about first-order logic, the simplicity, expressive power, and
algorithmic tractability of propositional Horn formulae are at the heart of these
applications.
Definition 6.1. An elementary conjunction

T (x1 , . . . , xn ) = ⋀_{j ∈P} xj ∧ ⋀_{k∈N} x̄k    (6.1)

is called a Horn term if |N| ≤ 1, that is, if T contains at most one complemented
variable. The term T is called pure Horn if |N | = 1, and positive if N = ∅.
Definition 6.2. A DNF

η = ⋁_{i=1}^{m} Ti    (6.2)

is called Horn (pure Horn) if all of its terms are Horn (pure Horn).
Note that the same function may have both Horn and non-Horn DNF
representations.
Example 6.1. The DNF
η1 (x1 , x2 , x3 ) = x1 x 2 ∨ x1 x 3 ∨ x2 x3
is Horn because its first two terms are pure Horn and its last term is positive,
whereas the following DNF of the same (monotone) Boolean function,
η2 (x1 , x2 , x3 ) = x1 x2 ∨ x1 x3 ∨ x1 x 2 x 3 ∨ x2 x3 ,
is not Horn because its third term contains two complemented variables.
Definition 6.3. For a pure Horn term T = x̄k ∧ ⋀_{j ∈P} xj , variable xk is called
the head of T , while variables xj , j ∈ P , are called the subgoals of T .
To simplify subsequent discussions, we further introduce the following nota-
tions. Given a subset P ⊆ {1, 2, . . . , n}, we also use the letter P to denote the
corresponding elementary conjunction as well as the Boolean function defined by
that conjunction:
P = P (x1 , . . . , xn ) = ⋀_{j ∈P} xj ,
whenever this notation does not cause any confusion. Thus, a Horn DNF can be
written as
η = ⋁_{P ∈P0} P ∨ ⋁_{i=1}^{n} ⋁_{P ∈Pi} P x̄i ,    (6.3)
where P0 denotes the set of positive terms, while Pi denotes the family of subgoals
of the terms with head xi , for i = 1, . . . , n. We interpret the families Pi , i = 0, . . . , n,
as hypergraphs over the base set {1, 2, . . . , n}.
Example 6.2. Consider the Boolean expression
η = x 1 ∨ x1 x 2 ∨ x1 x2 x 3 ∨ x2 x3 x 1 . (6.4)
This is a Horn expression, for which P0 = ∅, P1 = {∅, {2, 3}}, P2 = {{1}}, and
P3 = {{1, 2}}. Since P0 = ∅, η is in fact a pure Horn formula.
Recall Definition 2.5: If Ax and Bx̄ are two terms such that AB ≢ 0, then
AB is called their consensus. The term AB is the largest elementary conjunction
satisfying AB ≤ Ax ∨ Bx̄, and thus, whenever both Ax and Bx̄ are implicants of
a same Boolean function f , then AB is an implicant of f , too.
Theorem 6.1. The consensus of two Horn terms is Horn. More precisely, the
consensus of two pure Horn terms is pure Horn, while the consensus of a positive
and a pure Horn term is positive.
Proof. Assume, without any loss of generality, that Ax̄ and Bx are two Horn terms
that have a consensus (at least one of the terms must be pure Horn for their con-
sensus to exist). Then, A must contain only positive literals, and B can contain
at most one negated variable (which cannot belong to A). Hence, their consensus
AB contains at most one negative literal; thus it is Horn. More precisely, AB is
positive (respectively, pure Horn) if Bx is positive (respectively, pure Horn).
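The proof can be checked mechanically. In the Python sketch below (our own encoding, in which a term is a pair (P, N) of index sets of uncomplemented and complemented variables, assumed non-contradictory), the consensus exists exactly when there is a single clashing variable:

```python
def consensus(t1, t2):
    (P1, N1), (P2, N2) = t1, t2
    clash = (P1 & N2) | (P2 & N1)
    if len(clash) != 1:
        return None                    # no consensus (zero or several conflicts)
    x = next(iter(clash))
    return (frozenset((P1 | P2) - {x}), frozenset((N1 | N2) - {x}))

def is_horn(t):
    return len(t[1]) <= 1              # at most one complemented variable

# pure Horn term x1 x2 !x3 and positive term x3 x4: the consensus is x1 x2 x4
t = consensus((frozenset({1, 2}), frozenset({3})), (frozenset({3, 4}), frozenset()))
print(t == (frozenset({1, 2, 4}), frozenset()), is_horn(t))   # True True
```

As Theorem 6.1 predicts, the consensus of a pure Horn term with a positive term is positive.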
Example 6.3. Returning to Example 6.2, we observe that among the terms of η,
only x 1 is a prime implicant of the function h represented by η. All other terms in
(6.4) are nonprime. In fact, h has three prime implicants, namely, x 1 , x 2 and x 3 ,
the disjunction of which is another representation of h.
Note that h(1, 1, . . . , 1) = 0 for all pure Horn functions. Let us further add that we can consider 0n to be both
Horn and pure Horn, by definition, since its only DNF representation is the empty
DNF.
Although pure Horn functions play an important role in parts of this chapter,
they are not fundamentally different from Horn ones. Indeed:
Theorem 6.3. A Boolean function h is pure Horn if and only if (a) h is Horn, and
(b) h(1, 1, . . . , 1) = 0.
Proof. Necessity of (a)–(b) is obvious from the definition of pure Horn functions.
Conversely, if (a) holds then h can be represented by a Horn DNF, and if (b) holds
then this DNF cannot contain any positive term.
When dealing with Horn functions, we usually assume that the function is rep-
resented by one of its Horn DNFs. As a matter of fact, recognizing Horn functions
expressed by arbitrary DNFs turns out to be hard.
Theorem 6.5. Given a DNF φ of a Boolean function f , it is co-NP-complete to
decide (a) whether f is Horn; (b) whether f is pure Horn.
Proof. In statement (a), NP-hardness follows from Theorem 1.30, and membership
in co-NP is an easy consequence of Corollary 6.2 to be proved in Section 6.3.
In statement (b), NP-hardness is implied by statement (a) and Theorem 6.4,
whereas membership in co-NP is implied by Theorem 6.3 and Corollary 6.2.
Finally, note that the number of prime implicants of a Horn function can be
much larger than the number of terms in an arbitrary defining Horn expression
of it.
Example 6.4. The expression given in the proof of Theorem 3.17 is such a Horn
DNF. We can also consider the following, somewhat simpler expression:
η2 = ( ⋁_{i=1}^{k} xi ȳi ) ∨ y1 y2 · · · yk .    (6.5)

This DNF has only k + 1 terms, whereas, for every subset S ⊆ {1, . . . , k}, the term
⋀_{i∈S} xi ∧ ⋀_{i∉S} yi
is a prime implicant of the function it represents; hence this function has at least
2^k prime implicants.
Consider, for instance, a rule base consisting of implications between propositional variables, say
R = {x1 ∧ x3 =⇒ x2 , x1 ∧ x3 =⇒ x4 , · · · }.
In certain situations, some of the values of these propositional variables are known,
and the rule base R is used to derive the values of the other variables (e.g., to choose
which actions to take) so that all rules remain valid. In other cases, we might just
want to check whether a certain chain of events (assignments of truth values to the
propositional variables) obeys all the rules, or not.
We can easily see that such a rule base can equivalently be represented by a
Horn DNF
h = x1 x3 x 2 ∨ x1 x3 x 4 ∨ · · · .
More precisely, a binary assignment X to the propositional variables satisfies all
the rules of R exactly when h(X) = 0 (such an assignment X is called a model of
R). In other words, the models of R are exactly the false points of h.
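This correspondence is immediate to program. In the sketch below (our own, with hypothetical variable names), a rule is a pair (body, head), and h(X) = 0 exactly when X violates no rule:

```python
def h(X, rules):
    """Horn DNF of the rule base: one term (body AND not-head) per rule."""
    return int(any(all(X[v] for v in body) and not X[head]
                   for body, head in rules))

rules = [({"x1", "x3"}, "x2"), ({"x1", "x3"}, "x4")]
print(h({"x1": 1, "x2": 1, "x3": 1, "x4": 0}, rules))  # 1: the second rule fails
print(h({"x1": 1, "x2": 1, "x3": 1, "x4": 1}, rules))  # 0: a model of R
```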
Important problems arising in this context include deciding the consistency of a
given rule base (namely, finding a solution to the Horn equation h = 0, see Section
A     B     C     D
a     b     c     d
a     b′    c     d′
a′    b     c′    d
a′    b′    c′    d′
We can observe that {A} → {C}, {B, C} → {A, D}, and {D} → {B} are a few of the
many functional dependencies in this database. For instance, {D} → {B} means
that, whenever we know the value of attribute D, we “know” the value of B as
well: Whenever D = d in the above database, we also have B = b. In fact, using
{A} → {C} and {D} → {B}, we can uniquely recompute all records of the given
database from its two projections on the attribute sets {A, C} and {D, B}:

A     C           D     B
a     c    and    d     b
a′    c′          d′    b′
With a directed graph G = (V , A), one can associate the pure Horn DNF ηG = ⋁_{(i,j )∈A} xi x̄j .
It is easy to see that all prime implicants of ηG are also quadratic and pure Horn,
and that they are in a one-to-one correspondence with the directed paths in G,
namely, xi x j is a prime implicant of ηG if and only if there exists a directed path
from i to j in G. Algorithms and graph properties for directed graphs naturally
correspond to operations and properties of Horn functions. For instance, strong
components of G correspond in a one-to-one way to logically equivalent variables
of ηG , the transitive closure of G corresponds to the set of prime implicants of ηG ,
etc. (see Chapter 5 for more details, in particular, Sections 5.4 and 5.8).
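Concretely, assuming the convention that each arc (i, j) of G contributes the pure Horn term xi x̄j to ηG, the prime implicants can be listed by plain graph search; the sketch below is our own:

```python
from collections import defaultdict

def eta_prime_implicants(n, arcs):
    """One pure Horn prime implicant x_i !x_j per directed path from i to j."""
    adj = defaultdict(set)
    for u, v in arcs:
        adj[u].add(v)
    pairs = set()
    for s in range(n):
        seen, stack = set(), [s]
        while stack:
            for y in adj[stack.pop()]:
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        pairs |= {(s, t) for t in seen if t != s}
    return pairs

print(sorted(eta_prime_implicants(3, [(0, 1), (1, 2)])))  # [(0, 1), (0, 2), (1, 2)]
```

In graph terms, this is just the transitive closure of G, mirroring the correspondence described above.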
Directed hypergraphs (V , A) provide a natural generalization of directed
graphs. They consist of hyperarcs of the form T → h, where T ⊆ V and h ∈ V .
The set T is called the tail (or source set) of the hyperarc T → h, while h is called
its head (see Ausiello, D’Atri, and Sacca [37] or Gallo et al. [361]). The connec-
tion with Horn expressions is quite obvious, and several algorithmic problems and
procedures of logical inference on Horn systems can naturally be reformulated on
directed hypergraphs (see, e.g., [168, 360, 363, 756]).
The more general notion of Petri nets was introduced for modeling and ana-
lyzing finite state dynamic systems (see Petri [743]). Many important aspects of
Petri nets can equivalently be modeled by associated Horn expressions, providing
efficient algorithmic solutions to some of the basic problems of system design and
analysis (see, e.g., Barkaoui and Minoux [53, 683]).
and are used to model certain facility location problems (see Moon and Chaudhry
[691] or Chaudhry, Moon, and McCormick [189]). A similar model is used by
Salvemini, Simeone, and Succi [800] to model shareholders’ networks and to
determine optimal ownership control.
For another type of connection, let us consider a Horn DNF, say, for instance,
η = x1 x2 x3 x 4 ∨ x3 x 2 ∨ x1 x4 ∨ · · ·
and observe that a binary assignment X is a false point of η if and only if the
corresponding system of linear inequalities
−x1 − x2 − x3 + x4 ≥ −2
x2 − x3 ≥ 0
−x1 − x4 ≥ −1
..
.
is satisfied by X. One characteristic of this system of inequalities is that each row
has at most one positive coefficient. This feature turns out to imply interesting
properties of the set of feasible solutions. Namely, it was proved by Cottle and
Veinott [216] that a nonempty convex polyhedron of the form
P = {x | Ax ≥ b, x ≥ 0} (6.6)
has a least element if each row of the integral matrix A has at most one positive
element. Furthermore, as was shown by Chandrasekaran [180], the polyhedron P
has an integral least element for every integral right-hand side vector b if A has
at most one positive element in each row, and all positive elements in A are equal
to 1. For the special case of 0, ±1 matrices, this property was also observed and
utilized by Jeroslow and Wang [534] and Chandru and Hooker [182].
The property that P has a least element is perfectly analogous to the fact that
Horn functions have a unique minimal false point, and it can in fact be established
analogously to Theorem 6.6. This very useful property implies that for a linear
integer minimization problem over a polytope of the form (6.6), a simple rounding
procedure provides the optimal solution.
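The Boolean side of this analogy is the familiar forward-chaining computation: starting from the all-zero point and firing the rules "body implies head" until stabilization yields the unique minimal false point of a pure Horn DNF. A minimal sketch, with our own encoding of terms as (body, head) pairs:

```python
def minimal_false_point(n, terms):
    """Least fixed point of the rules body -> head of a pure Horn DNF."""
    true_vars, changed = set(), True
    while changed:
        changed = False
        for body, head in terms:
            if body <= true_vars and head not in true_vars:
                true_vars.add(head)
                changed = True
    return tuple(int(i in true_vars) for i in range(n))

# !x1 v x1 !x3 = 0 forces x1 = 1 and then x3 = 1
print(minimal_false_point(3, [(frozenset(), 0), (frozenset({0}), 2)]))  # (1, 0, 1)
```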
For further connections between cutting planes in binary integer programming
and prime implicant generation techniques for Boolean functions and, in particular,
those specialized for Horn DNFs, we refer the reader to the book by Chandru and
Hooker [184] and to the survey by Hooker [503].
The next interesting connection is between (0, ±1) matrices, certain associated
polyhedra, and Horn functions. It is quite natural to associate with an m×n, (0, ±1)
matrix A the DNF
φA = ⋁_{i=1}^{m} ( ⋀_{j: aij = 1} xj ∧ ⋀_{k: aik = −1} x̄k ).
Since Boolean functions can also be defined by their sets of true and/or false
points, and since Horn functions constitute a proper subfamily of all Boolean
functions, not all subsets of B n can appear as sets of false points of Horn functions.
Indeed, the set of false points of a Horn function has a very special property,
observed first by McKinsey [638] and also by Horn [507; Lemma 7].
Theorem 6.6. A Boolean function is Horn if and only if its set of false points is
closed under conjunction.
Proof. Let us consider a Boolean function h on B n , and let T1 , …, Tp denote its
prime implicants.
Assume first that h is Horn or, equivalently, by Theorem 6.2, that all its prime
implicants are Horn. Let us note that F(h) = ⋂_{k=1}^{p} F(Tk), and that the intersection
of conjunction-closed sets is conjunction-closed again. Hence, to prove the first
half of the statement, it is enough to show that the set of false points F (T ) of a
Horn term T is closed under conjunction. Since this is obvious for a positive term,
let us assume that T = (⋀_{j∈P} xj) x̄i, and let us consider binary vectors X, Y, and Z
for which Z = X ∧ Y and T (Z) = 1. Then, we must have zi = 0 and zj = 1 for
all j ∈ P , implying by the definition of conjunction that xj = yj = 1 for all j ∈ P
and xi ∧ yi = 0. Thus, at least one of xi and yi must be equal to 0, say, xi = 0, and
therefore T (X) = 1 follows. This implies that F (T ) is closed under conjunction.
For the reverse direction, let us assume that h is not Horn, and let us consider
a non-Horn prime implicant T = ⋀_{j∈P} xj ∧ ⋀_{k∈N} x̄k of h, where |N| ≥ 2.
According to the Definition 1.18 of prime implicants, deleting any literal from T
yields a non-implicant of h. Thus in particular, for every index i ∈ N there exists
a binary vector X^i ∈ B^n such that x^i_j = 1 for j ∈ P, x^i_k = 0 for k ∈ N \ {i}, and
x^i_i = 1, and for which h(X^i) = 0 holds. Therefore, T(X^i ∧ X^{i′}) = 1 follows for any
two distinct indices i, i′ ∈ N, i ≠ i′, implying h(X^i ∧ X^{i′}) = 1, and thus proving
that F(h) is not closed under conjunction.
This result has several interesting consequences. First, it implies a simple char-
acterization of Horn functions, which can serve as the basis for learning Horn
theories (see, e.g., [139, 264, 650]), and which was generalized to several other
classes of Boolean functions (see, e.g., [305, 303] and Chapter 11 in this book).
Corollary 6.2. A Boolean function f on B n is Horn if and only if
f(X ∧ Y) ≤ f(X) ∨ f(Y)    (6.7)
holds for every X, Y ∈ B^n.
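To make the characterization concrete, here is a small brute-force illustration of our own (not from the text): it checks inequality (6.7) over all pairs of points, for a function given as a Python callable on 0/1 tuples. The identifiers are ours, and the check is exponential in n, so it is only a sanity check for small functions.

```python
from itertools import product

def is_horn(f, n):
    """Test inequality (6.7): f(X ∧ Y) <= f(X) ∨ f(Y) for all X, Y in B^n.

    f maps a 0/1 tuple of length n to 0 or 1.  Equivalently (Theorem 6.6),
    the set of false points of f must be closed under conjunction.
    """
    points = list(product((0, 1), repeat=n))
    for X in points:
        for Y in points:
            Z = tuple(x & y for x, y in zip(X, Y))   # componentwise X ∧ Y
            if f(Z) > (f(X) | f(Y)):                 # violates (6.7)
                return False
    return True

# x1 ∧ x̄2 is a Horn term; x̄1 ∧ x̄2 has two complemented variables
print(is_horn(lambda X: X[0] & (1 - X[1]), 2))        # True
print(is_horn(lambda X: (1 - X[0]) & (1 - X[1]), 2))  # False
```

The second call rejects x̄1 x̄2 because its false points (0,1) and (1,0) have the conjunction (0,0), which is a true point.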
X^h = ⋀_{Y∈F(h)} Y.
Theorem 6.6 also implies that every Boolean function f has a unique maximal
Horn minorant h, that is, a Horn function h such that h ≤ f and the inequalities
h ≤ h′ ≤ f hold for no Horn function h′ ≠ h.
Theorem 6.7. Given a Boolean function f , let h be the function defined by F (h) =
F (f )∧ . Then h is the unique maximal Horn minorant of f .
Proof. Clearly, h is well defined, and since F(h)^∧ = (F(f)^∧)^∧ = F(f)^∧ = F(h),
it is also Horn by Theorem 6.6. It is also clear that F(h) = F(f)^∧ ⊇ F(f), and
hence, h ≤ f. Furthermore, for any Horn minorant h′ ≤ f we have F(h′) ⊇ F(f),
and thus, by Theorem 6.6, F(h′) = F(h′)^∧ ⊇ F(f)^∧ = F(h), which implies
h ≥ h′.
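Theorem 6.7 is itself algorithmic for small n: close F(f) under conjunction and declare the result to be F(h). A hedged Python sketch of this construction (the function names are our own choices, and the enumeration is exponential in n):

```python
from itertools import product

def conjunction_closure(points):
    """Smallest superset of `points` closed under componentwise conjunction."""
    closed = set(points)
    frontier = list(closed)
    while frontier:
        X = frontier.pop()
        for Y in list(closed):
            Z = tuple(a & b for a, b in zip(X, Y))
            if Z not in closed:
                closed.add(Z)
                frontier.append(Z)
    return closed

def maximal_horn_minorant(f, n):
    """Truth table of the Horn function h with F(h) = F(f)^∧ (Theorem 6.7)."""
    false_points = [X for X in product((0, 1), repeat=n) if f(X) == 0]
    F_h = conjunction_closure(false_points)
    return {X: (0 if X in F_h else 1) for X in product((0, 1), repeat=n)}
```

For f = x̄1 x̄2, which is not Horn, every point enters the closure and the minorant is h ≡ 0; for an already Horn f the construction returns f itself.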
6.3.1 Deduction in AI
The false points of Horn functions play a role in artificial intelligence in a
slightly different context, though the characterization by Theorem 6.6 remains
essential. In the artificial intelligence literature, typically, Horn CNFs instead of
Horn DNFs are considered. A Horn CNF is a conjunction of elementary disjunctions,
called clauses, in which at most one literal is positive. Due to De Morgan's laws,
η = ⋁_{i=1}^{m} (⋀_{j∈Pi} xj ∧ ⋀_{k∈Ni} x̄k) is a Horn DNF if and only if
η̄ = ⋀_{i=1}^{m} (⋁_{j∈Pi} x̄j ∨ ⋁_{k∈Ni} xk) is a Horn CNF. Accordingly, the solutions of the
Boolean equation η = 0, that is, the false points of the Boolean function represented by η, are
referred to as the models of η̄ or, more precisely, as the models of the Boolean
function h̄ represented by the CNF η̄.
One of the frequently arising tasks in this context is deduction, that is, the
problem of recognizing whether another logical expression η′ is consistent with
the given knowledge base h represented by η. Here, consistency means that all
280 6 Horn functions
Corollary 6.4. For every nonempty subset S ⊆ Bn , there exists a unique minimal
subset Q(S) ⊆ S such that Q(S)∧ = S ∧ ⊇ S.
Proof. Define
Q(S) = ⋂_{Q⊆S : S⊆Q^∧} Q.    (6.8)
Clearly, Q(S) ⊆ S, and by Theorem 6.8, S ⊆ Q(S)∧ . It follows by Lemma 6.1 that
Q(S)∧ = S ∧ .
To prove that the tautology problem for Horn DNFs can be solved efficiently,
we shall show below that, given a Horn DNF η representing the Horn function h,
the unique minimal false point X^h ∈ F(h) can be found in linear time in the size
|η| of the DNF η. (As before, |η| denotes the number of literals occurring in the
DNF η.) Furthermore, h ≡ 1 can also be recognized with the same effort whenever
F(h) = ∅.
Consider a Horn DNF η of the form (6.2), and denote by h the Horn function
represented by η. We assume, without loss of generality, that |Pi ∪ Ni | > 0 for all
i = 1, . . . , m.
Note first that if Pi ≠ ∅ for all terms i = 1, . . . , m, then the vector 0 =
(0, 0, . . . , 0) ∈ B^n is a solution of the equation η(X) = 0, and clearly, X^h = 0
is the unique minimal false point in this case.
Consider next the case in which Pi = ∅ for some term Ti of η. In this case, Ti
must be a negative linear term of the form Ti = x̄j for some index j. Clearly, for
all solutions of the equation η(X) = 0 (i.e., for all false points X ∈ F(h)), we have
xj = 1, and thus x^h_j = 1 is implied, too.
Based on these observations, a naïve approach to solving the equation η(X) = 0
could proceed as shown in Figure 6.1. We can observe that this procedure is a
restricted version of the so-called Unit Literal Rule employed by most satisfiability
algorithms (see Chapter 2). In this version only negative linear terms are used, and
hence, we call it the Negative Unit Literal Rule procedure (NULR).
282 6 Horn functions
Procedure NULR(η)
Input: A Horn DNF η representing the Horn function h.
Output: A false point of h, or a proof that h ≡ 1.

set η0 := η and k := 0;
repeat
    if there is an empty term in ηk
    then stop {comment: no solution, h ≡ 1}
    else find j such that x̄j is a negative linear term of ηk;
        if there is no such index j
        then set all remaining variables to 0,
             return X = X^h and stop {comment: solution found}
        else set xj := 1, ηk+1 := ηk|xj=1, and k := k + 1.
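A direct Python transcription of NULR may help; it is a sketch under our own encoding (a term is a pair (P, N) of variable-index sets, with |N| ≤ 1 for a Horn DNF), not code from the book:

```python
def nulr(terms, n):
    """Negative Unit Literal Rule on a Horn DNF.

    terms: list of (P, N) pairs of variable-index sets; a pair encodes the
    term (∧_{j∈P} x_j)(∧_{j∈N} x̄_j).  Returns the unique minimal false
    point X^h as a 0/1 tuple, or None when h ≡ 1 (no false point exists).
    """
    terms = [(set(P), set(N)) for P, N in terms]
    ones = set()                                    # variables forced to 1
    while True:
        if any(not P and not N for P, N in terms):  # empty term: h ≡ 1
            return None
        # look for a negative linear term x̄_j
        j = next((min(N) for P, N in terms if not P and len(N) == 1), None)
        if j is None:                               # set the rest to 0
            return tuple(1 if i in ones else 0 for i in range(n))
        ones.add(j)
        # restriction η|x_j=1: terms containing x̄_j vanish (they become 0),
        # and x_j is deleted from the positive part of the remaining terms
        terms = [(P - {j}, N) for P, N in terms if j not in N]

# η = x̄1 ∨ x1 x̄2 forces x1 = x2 = 1
print(nulr([(set(), {0}), ({0}, {1})], 2))   # (1, 1)
```

Each iteration fixes a fresh variable, so the loop runs at most n times; this naive version is O(n|η|), as in Theorem 6.10, rather than the O(n + |η|) of Theorem 6.11.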
Theorem 6.10. Let η be a Horn DNF of the Horn function h on B n . Then, algorithm
NULR(η) runs in O(n|η|) time, and either it detects that h ≡ 1 or it finds the vector
X^h ∈ F(h).
Proof. Let us denote by l the value of index k at termination, and let jk denote the
index of the variable fixed at 1 in step k − 1. Observe that for every k = 1, . . . , l
we have ηk = ηk−1|x_{jk}=1, and hence,
T = ⋀_{i∈S} x_i
follows, implying h ≡ 1.
6.4 Horn equations 283
On the other hand, if NULR terminates with finding a solution, let us denote
this solution by X∗; thus, X∗ is the point defined by
x∗_i = 1 if i ∈ {j1, . . . , jl}, and x∗_i = 0 otherwise.
Then, since ηl has neither an empty term nor a negative linear term, ηl(0, 0, . . . ,
0) = 0 follows. By (6.9), we have 0 = ηl(0, 0, . . . , 0) = η(X∗), implying X∗ ∈ F(h).
Then X∗ ≥ X^h follows by Corollary 6.3. Since we have shown that all the terms
x̄_{jk} for k = 1, . . . , l are negative linear implicants of h, and since h ≢ 1, all these
terms must be negative linear prime implicants of h, implying X∗ ≤ X^h by
the definition of X∗ and by Corollary 6.5. Hence, X∗ = X^h follows, concluding
the proof of correctness.
Finally, we note that all operations of the repeat loop can obviously be carried
out in linear time in the size of the input DNF η, hence the total running time of
NULR can be bounded by O(n|η|).
Theorem 6.11. Procedure NULR can be implemented to run in O(n + |η|) time.
Proof. We leave the proof as an exercise to the reader (see, e.g., Exercise 6 at the
end of this chapter).
A first important consequence of the previous results is that, unlike in the case
of general Boolean functions, we can decide in polynomial time whether or not a
given term is an implicant of a Horn function.
Corollary 6.6. Given a Horn DNF η of a Horn function h, one can decide in
O(n + |η|) time whether a given term T is an implicant of h.
Proof. Follows readily from Theorems 3.8 and 6.11, since the restriction of η at
T = 1 is, again, a Horn DNF.
Recall from Chapter 1 that a DNF η of the Boolean function h is called prime
if all terms of η are prime implicants of h and called irredundant if no terms can
be deleted from η without changing the Boolean function it represents.
Theorem 6.12. Given a Horn DNF, η of a Horn function h, one can construct in
O(|η|(n + |η|)) time an irredundant and prime Horn DNF of h.
Proof. For a term T of η, let η′ denote the DNF obtained from η by deleting the
term T. Clearly, η′ ≡ η if and only if T is an implicant of η′, which we can test
in O(n + |η′|) time in view of Corollary 6.6. Repeating this for all terms of η
one by one, and deleting redundant terms, we can produce in O(m(n + |η|)) time
an irredundant DNF of h.
To achieve primality, let us take a term T of the current Horn DNF, and let T′
denote the term obtained from T by deleting a literal u of T. By definition, if T′
is an implicant of h, then we can replace T by T′. According to Corollary 6.6,
we can test whether T′ is an implicant in O(n + |η|) time. Thus, by repeating this
procedure for all literals of T, replacing T by T′ whenever T′ is proved to be an
implicant, and repeating for all terms of η, we can derive in O(|η|(n + |η|)) time
a prime DNF of h.
Since |η| ≥ m, the claim follows.
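The two phases of the proof translate directly into code. The sketch below is our own rendering: a brute-force exponential implicant test stands in for the O(n + |η|) test of Corollary 6.6, and, to let deduplication fall out of the redundancy pass, it shrinks every term to a prime implicant first and then deletes redundant terms (the opposite order to the proof).

```python
from itertools import product

def eval_dnf(terms, X):
    """Evaluate a DNF given as (P, N) index-set pairs at the 0/1 tuple X."""
    return int(any(all(X[j] for j in P) and not any(X[j] for j in N)
                   for P, N in terms))

def is_implicant(P, N, terms, n):
    """Brute-force test whether (∧_P x_j)(∧_N x̄_j) ≤ h.  Exponential in n;
    Corollary 6.6 achieves O(n + |η|) for Horn DNFs."""
    return all(eval_dnf(terms, X)
               for X in product((0, 1), repeat=n)
               if all(X[j] for j in P) and not any(X[j] for j in N))

def irredundant_prime(terms, n):
    """Sketch of Theorem 6.12: produce an irredundant prime DNF."""
    terms = [(set(P), set(N)) for P, N in terms]
    for P, N in terms:                      # primality: drop literals
        for j in list(P):
            if is_implicant(P - {j}, N, terms, n):
                P.remove(j)
        for j in list(N):
            if is_implicant(P, N - {j}, terms, n):
                N.remove(j)
    kept, pending = [], list(terms)         # irredundancy: drop terms
    while pending:
        P, N = pending.pop()
        if not is_implicant(P, N, pending + kept, n):
            kept.append((P, N))
    return kept

# x1 x̄2 ∨ x1 x2 simplifies to the single prime implicant x1
print(irredundant_prime([({0}, {1}), ({0, 1}, set())], 2))
```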
Lemma 6.2. If η is a pure Horn DNF, then S^η ⊇ S, and (S^η)^η = S^η for all subsets
S ⊆ {1, 2, . . . , n}.
Note that, in fact, the set S^η depends only on the pure Horn function h represented by η, and not on the particular representation η. Hence, we often prefer to
use the notation S^h rather than S^η. Still, although the set S^h does not depend on the
given representation of h, its computation may; hence, the notation S^η will also
be used when necessary to avoid computational ambiguity.
The forward chaining procedure can also be viewed as producing the unique
minimal false point of h within a subcube of B n . Recall from Chapter 1 that for
a subset S ⊆ {1, 2, . . . , n} we denote by eS the characteristic vector of S, and by
T |S,∅ the sub-cube of vectors X ≥ eS : T |S,∅ = {X ∈ Bn | xi = 1 for all i ∈ S}. With
these notations the following statement follows directly from the forward chaining
procedure:
Remark 6.2. Given a pure Horn function h and a subset S ⊆ {1, 2, . . . , n} the point
eS h is the unique minimal point in T |S,∅ ∩ F (h).
Let us add that the simple linear-time forward chaining procedure is also
instrumental in testing if a given term is an implicant of a Horn function.
Lemma 6.3. Given a pure Horn DNF η of the pure Horn function h, a term
T = ⋀_{j∈P} xj ⋀_{j∈N} x̄j is an implicant of h if and only if N ∩ P^η ≠ ∅.
Proof. If T is not an implicant of h, then there must exist a vector X∗ ∈ B^n such that
h(X∗) = 0 and T(X∗) = 1, implying x∗_j = 0 for all j ∈ N. Moreover, for all indices
j ∈ P^η we must have x∗_j = 1 by the definition of P^η. Thus, N ∩ P^η = ∅ follows.
Conversely, if i ∈ N ∩ P^η, then the term T′ = (⋀_{j∈P} xj) x̄i is an implicant of h,
as we observed earlier, and since T ≤ T′, the term T is an implicant, too.
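Forward chaining and the implicant test of Lemma 6.3 are only a few lines of Python; the following sketch uses our own encoding of a pure Horn term (∧_{j∈P} x_j) x̄_y as the implication pair (P, y):

```python
def forward_chaining_closure(S, implications):
    """Compute the closure S^η for a pure Horn DNF η.

    implications: list of (P, y) pairs, each encoding the pure Horn term
    (∧_{j∈P} x_j) x̄_y, i.e. the implication P ⇒ y.
    """
    closed = set(S)
    changed = True
    while changed:                    # naive fixpoint; the linear-time bound
        changed = False               # needs the bookkeeping of Theorem 6.11
        for P, y in implications:
            if y not in closed and set(P) <= closed:
                closed.add(y)
                changed = True
    return closed

def is_implicant_pure_horn(P, N, implications):
    """Lemma 6.3: (∧_{j∈P} x_j)(∧_{j∈N} x̄_j) is an implicant of the pure
    Horn function iff N intersects the forward chaining closure P^η."""
    return bool(set(N) & forward_chaining_closure(P, implications))

# implications x1 ⇒ x2 and x2 ⇒ x3, with 0-based variable indices
rules = [({0}, 1), ({1}, 2)]
print(forward_chaining_closure({0}, rules))        # {0, 1, 2}
print(is_implicant_pure_horn({0}, {2}, rules))     # True: x1 x̄3 is an implicant
```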
initialize P := ∅, L := {T1, . . . , Tm};
repeat while L ≠ ∅
    select a term T ∈ L and set L := L \ {T} and P := P ∪ {T};
    generate all consensuses of T with terms T′ ∈ L, and add the
        produced terms to L;
    substitute each term in L by a corresponding prime implicant
        of f absorbing this term;
    eliminate duplicates from L, as well as those terms which also
        appear in P;
end while
return the list of terms in P
Proof. Let us assume that T is an implicant of f for which T ≤ A does not hold;
hence, A contains a literal u that is not in T. Let us show that T ≤ ψ ∨ η.
Consider any binary point X ∈ B^n for which T(X) = 1. Since T ≤ f, we must
have A(X) ∨ ψ(X) = 1. If ψ(X) = 1, we are done. Otherwise, A(X) = 1, and
in particular, u(X) = 1. Let us denote by Y the binary point obtained from X
by switching the value of the literal u. Since u ∉ T, we have T(Y) = 1, hence
f(Y) = 1. On the other hand, since u ∈ A, we have A(Y) = 0, thus implying
ψ(Y) = 1. This, together with ψ(X) = 0, implies that there is a term B in ψ
involving the literal ū, for which B(Y) = 1. Hence, the terms A and B have
exactly one conflict, and thus their consensus C = (A ∪ B) \ {u, ū} must be a term
of η, implying C(X) = 1 ≤ η(X). This proves the lemma.
Theorem 6.13. The Prime Implicant Depletion procedure generates the com-
plete list of prime implicants of the Boolean function f represented by the prime
DNF η. Furthermore, when η is a Horn DNF, the procedure runs in polynomial
incremental time and each main while loop takes O(n(n + |η|)|L|) time, where L
is the current list of prime implicants at the beginning of the loop.
Proof. Let TL denote the disjunction of prime implicants in the current list L.
We argue by induction on the size of P that at any moment during the procedure,
every prime implicant of f is either explicitly listed in P, or is an implicant of
TL . This is clearly the case at the very beginning of the procedure. According
to Lemma 6.4, this property is not changed when we move a term T (a prime
implicant of f ) from L to P, and then increment L with the consensuses obtained
with T . The property also remains unchanged when we substitute the terms in L
by some absorbing prime implicants, since such a substitution does not change
the function represented by TL . Similarly, the property remains valid when we
eliminate duplicates from L.
Now, when the algorithm stops, L is empty; hence, P contains all prime
implicants of f .
To see the complexity claim, let us observe that the consensus of two terms can
be carried out in O(n) steps, where n is the number of variables in η; hence all
consensuses in a main iteration take O(n|L|) time. This step introduces at most
|L| new terms. For each term, we need to find a prime implicant of f that absorbs
it, which can be done, for instance, by forward chaining in O(n(n + |η|)) time.
Hence, a prime list can be obtained in O(n(n + |η|)|L|) time. Finally, by keeping L
and P in a hash table, the elimination of duplicates can be accomplished in O(|L| log n)
time, proving the claim.
6.5 Prime implicants of Horn functions 289
φ = x1 x 3 x7 ∨ x2 x3 x6 ∨ x1 x 2 x7 ∨ x 1 x 4 x7 ∨ x4 x5 x8 ∨ x 1 x 5 x7 .
initialize P := ∅, L := {T1, . . . , Tm};
repeat while L ≠ ∅
    select a term T ∈ L and set L := L \ {T} and P := P ∪ {T};
    generate all consensuses of T with the input terms T1, . . . , Tm, and
        add the obtained new terms to L;
    absorption: delete from P ∪ L all terms which are absorbed
        by some other terms of P ∪ L;
end while
return the list of terms in P
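The listing above can be turned into executable form. The sketch below is our own rendering (terms as (pos, neg) frozenset pairs, with a `seen` set added so that a term discarded by absorption is never re-generated); for a Horn input it reproduces the prime implicants, per Lemma 6.6.

```python
def consensus(t1, t2):
    """Consensus of two (pos, neg) terms; defined only when they conflict
    in exactly one variable."""
    P1, N1 = t1
    P2, N2 = t2
    conflicts = (P1 & N2) | (P2 & N1)
    if len(conflicts) != 1:
        return None
    v = next(iter(conflicts))
    return (frozenset((P1 | P2) - {v}), frozenset((N1 | N2) - {v}))

def absorbs(t1, t2):
    """t1 absorbs t2 when every literal of t1 also occurs in t2."""
    return t1[0] <= t2[0] and t1[1] <= t2[1]

def input_consensus(input_terms):
    """Input Consensus: consensuses are formed with input terms only."""
    inputs = [(frozenset(P), frozenset(N)) for P, N in input_terms]
    P, L = [], list(dict.fromkeys(inputs))
    seen = set(L)
    while L:
        t = L.pop()
        P.append(t)
        for s in inputs:
            c = consensus(t, s)
            if c is not None and c not in seen:
                seen.add(c)
                L.append(c)
        pool = set(P) | set(L)           # absorption over P ∪ L
        dead = {u for u in pool for w in pool if w != u and absorbs(w, u)}
        P = [u for u in P if u not in dead]
        L = [u for u in L if u not in dead]
    return set(P)

# pure Horn terms a x̄b and b x̄c (a ⇒ b, b ⇒ c) yield the third prime a x̄c
prime = input_consensus([({'a'}, {'b'}), ({'b'}, {'c'})])
print(len(prime))   # 3
```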
For all i = 1, . . . , k, we can also observe that, since x̄_{ji} is a negative linear term
of ηi−1, η must contain a corresponding term of the form
Ti = ⋀_{j∈Si} xj ∧ x̄_{ji}
Using the preceding lemma, we can now prove that the input consensus
algorithm indeed works for Horn DNFs (see Hooker [498]).
Lemma 6.6. Let η be a Horn DNF of the Horn function h, and let T be a prime
implicant of h. Then, T can be obtained from η by a sequence of input consensuses
such that each term of η is used at most once in the sequence.
Proof. Let us consider the DNF η′ = η|T=1 obtained from η by substituting the
value 1 for all literals in T. Then η′ ≡ 1, and hence, there is a subset of its terms,
say, D1, …, Dl, such that the empty term can be obtained from these by a sequence
of consensuses, without repetitions. Since D1, …, Dl are terms of η′, each of them
corresponds to a term Ti of η, for i = 1, . . . , l. Performing exactly the same sequence
of consensuses on T1, …, Tl yields T.
initialize P := ∅, L := {T1, . . . , Tm};
repeat while L ≠ ∅
    select a term T ∈ L and set L := L \ {T} and P := P ∪ {T};
    generate all consensuses of T with the input terms T1, . . . , Tm;
    replace each such consensus by a prime implicant of f absorbing it;
    check if each of these prime implicants is in P ∪ L, and if not,
        add the new ones to L;
end while
return the list of terms in P
Lemma 6.7. Let us assume that P, Q, R are implicants of a function f and that
P is the consensus of Q and R. Let us assume further that R′ is a prime implicant
of f absorbing R. Then, P is absorbed either by R′ or by the consensus of Q
and R′.
Proof. Assume first that Q and R′ do not have a consensus. Since R ≤ R′, this
implies that P ≤ R′. Assume next that Q and R′ have a consensus, say T′. Then,
Q and R′ must have the same conflicting variable as Q and R, and thus, P ≤ T′ is
implied.
Theorem 6.14. When η is a Horn DNF, the Input Prime Consensus procedure
correctly generates all prime implicants of the function represented by η.
Corollary 6.8. The complete list of prime implicants of the Boolean function
represented by a Horn DNF η can be generated with polynomial delay using
procedure Input Prime Consensus(η).
Proof. Let us remark first that the incremental complexity of the previously
described methods for prime implicant generation (namely, Prime Implicant
Depletion and Input Consensus) resulted from the fact that, in each main cycle,
we had to check for absorption, a task requiring time proportional to the length
of the lists P and L. The speedup of Input Prime Consensus is due to the fact
that, instead of absorption, we have to check now for membership in P and L; this
can be done in O(n) time with an appropriate data structure, independently of the
length of those lists.
More precisely, let us assume that we keep both P and L in a hash table. Then
inserting a new member, deleting a member, or checking membership can all be
done in O(n) time. Now, it is easy to see that with every execution of the main
while loop, we add exactly one new element to the output list P. In the while
loop, selecting a term T , deleting it from L and adding it to P takes O(n) time;
generating the consensus of T with T1 , ..., Tm can be done in O(nm) time; replacing
the (at most m) consensuses by prime implicants can be done in O(nm|η|) time;
checking membership in P and L takes O(nm) time; and adding the new terms to
L can be done in O(nm) time. It follows that all prime implicants can be generated
with polynomial delay O(nm|η|) between two successive prime implicants.
further results about the structure of the family of implicants and about Horn DNF
representations of a Horn function. Since these results may be of independent
interest, we present them in this section, before turning to Horn minimization in
Section 6.7.
Definition 6.6. A set T of terms (elementary conjunctions) is said to be
closed under consensus if, for any two terms T, T′ ∈ T, their consensus, when
it exists, also belongs to T.
Let us note the difference between this definition and a similar one introduced
in Section 3.2.2. In Definition 6.6, we consider a set of terms (without absorp-
tions), and not their disjunction. This is an important detail, since our purpose is
to understand the structure of different DNF representations of a given function.
Clearly, the intersection of closed sets of terms is closed again; hence, every
set of terms has a unique smallest closed set containing it.
Definition 6.7. The consensus closure T c of a set of terms T is the smallest closed
set containing T .
Given a Boolean function f , let us denote by If the set of all implicants of f ,
and let Pf denote the set of its prime implicants. Clearly, If is a closed set, and
Pfc is a subset of If (typically, a proper subset).
Definition 6.8. Let T be a closed set of terms. A partition (R, D) of T
(R ∪ D = T and R ∩ D = ∅) is called a recessive-dominant partition (or, in
short, an RD-partition) of T if
• both R and D are closed under consensus, and
• if two terms T1 ∈ R and T2 ∈ D have a consensus T3 , then T3 ∈ D.
To simplify our notations for the rest of this section, and with a slight abuse of
terminology, we shall view a DNF as a set of terms.
Definition 6.9. Given a DNF η representing the Horn function h and a family T
of terms, let us denote by ηT = η ∩ T the DNF formed by those terms of η that
belong to T , and let us call it the T -component of the DNF η. Let us further denote
by hT the Horn function defined by the disjunction of the terms in Ph ∩ T , and let
us call it the T -component of h.
The following result of Čepek [173] implies that the R-component of an arbi-
trary prime DNF representation η of a Horn function h defines the same Boolean
function, namely, the R-component of h, for any RD-partition (R, D) of Phc . As
a consequence, one can start the minimization of h by minimizing first hR , rep-
resented by ηR , and then replacing in η the terms of ηR by the obtained minimal
representation of hR , yielding a new, “shorter” DNF representation of h (in fact,
this scheme works for several different measures of “size”).
Theorem 6.15. Let h be a Horn function represented by a Horn DNF η ⊆ Phc , and
let (R, D) be an RD-partition of Phc . Then, the R-component ηR of η is a Horn
DNF representation of the R-component hR of h.
Proof. Let us first note that (η)c = Phc by Theorem 3.5 and by the properties
of the consensus closure. Since we obviously have (η ∩ R)c ⊆ Rc = R, the
above equality and the definition of an RD-partition by Definition 6.8 imply
(η ∩ R)c = R. Applying this for the particular DNF representation Ph of h,
instead of η, we also get (Ph ∩ R)c = R. Consequently, both ηR and hR have
the same set of prime implicants, namely, Ph ∩ R (since no prime implicant
of h is absorbed by a term of Phc , and since (Ph ∩ R)c ⊆ Phc , obviously),
which proves the statement.
For the special case in which R is the set of pure Horn terms of a Horn function
(as in Example 6.7), this result was established by Hammer and Kogan [446].
Let us remark that the condition η ⊆ Phc in Theorem 6.15 can easily be fulfilled
by requiring η to be prime. Note, however, that this condition cannot be relaxed
completely; for instance, it cannot be simply replaced by the irredundancy of η.
Indeed, irredundant Horn DNFs may contain terms that cannot be obtained from
the prime implicants by consensus; moreover, there may exist an irredundant DNF
representation of a Horn function, which is perfectly disjoint from the consensus
closure of its prime implicants. To illustrate this, let us consider the following
example.
Example 6.8. Consider the DNFs
η = x1 x̄2 ∨ x1 x̄3 ∨ x1 x2 x3 ∨ x̄1 x2 x3,
φ = x1 ∨ x2 x3.
6.6 Properties of the set of prime implicants 295
It is easy to verify that both DNFs are irredundant Horn representations of the
same function h. However, when R is the family of all pure Horn terms in Phc (as
in Example 6.7), we have ηR = x1 x̄2 ∨ x1 x̄3 ∨ x̄1 x2 x3 ≢ 0 = φR. The main reason
for the equality ηR = hR to fail in this case is that none of the terms of η belongs
to Phc.
Let us further remark that a result analogous to Theorem 6.15 does not hold
for D-components: The D-components of different Horn DNF representations of
a same Horn function may represent different Boolean functions, as the following
example shows:
Example 6.9. Consider the following Horn DNFs
η = x1 x2 ∨ x1 x̄3 ∨ x2 x̄4 ∨ x3 x̄1 ∨ x4 x̄2,
φ = x3 x4 ∨ x1 x̄3 ∨ x2 x̄4 ∨ x3 x̄1 ∨ x4 x̄2.
It is easy to verify that η and φ are equivalent irredundant prime Horn DNFs of
the Horn function h having the following prime implicants: Ph = {x1 x2, x1 x4, x2 x3,
x3 x4, x1 x̄3, x2 x̄4, x3 x̄1, x4 x̄2}. If we partition the implicants of h into pure Horn
and positive terms, we obtain an RD-partition (as in Example 6.7). However, the
D-components of the above DNFs, ηD = x1 x2 and φD = x3 x4, are not equivalent,
and neither of them represents the disjunction x1 x2 ∨ x1 x4 ∨ x2 x3 ∨ x3 x4 of all
positive prime implicants of h.
This theorem was proved for the positive terms of an irredundant prime Horn
DNF (see Example 6.7) by Hammer and Kogan [446]. In this case, the statement
implies that the number of positive terms is the same in all irredundant and
prime DNF representations of a Horn function; see Example 6.9 for an illustration
of this.
Let us note again that the conditions η1 , η2 ⊂ Phc cannot be simply disregarded,
since the statement does not remain true, in general, even for irredundant Horn
DNFs, as the following example shows:
Example 6.10. Consider the Horn DNFs of Example 6.8. The DNF η contains
only one positive term, while φ contains two such terms, and, in fact, φ is the
(unique) shortest DNF of the corresponding Horn function. The conclusion of
Theorem 6.16 fails here because η contains implicants that do not belong to Phc . It
is possible to perform consensus operations with these implicants that introduce
extra arcs in the corresponding digraph G, and in effect reduce the number of
source components from 2 to 1 (cf. Exercise 28).
Theorems 6.15 and 6.16 provide the basis for a very useful decomposition
technique of Horn minimization problems. For Horn functions, and especially
for pure Horn functions, there are several different RD-partitions that could be
utilized in such decomposition methods (see, e.g., [108] and Exercise 27). Similar
structural properties of Horn CNFs also play an important role in decomposability
of Horn functions, and in an AI context, in Horn belief revision (see [595]).
As we shall see in the rest of this chapter, the above results alone provide
efficient minimization techniques for several special classes of Horn functions.
We also refer the reader to [109] for a more thorough treatment of this topic.
6.7 Minimization of Horn DNFs 297
Let us recall from Section 6.2 that a Horn function h can also be represented as a
set of implications of the form
P =⇒ for P ∈ P0 , and
P =⇒ xi for P ∈ Pi , i = 1, . . . , n.
V = ⋀_{P∈P} ( P =⇒ ⋀_{j∈R(P)} xj ).    (6.12)
The number σ (V) = |P| is called the number of source sides in such an implication
representation V, and can also be used as a measure of the size of the represen-
tation (see, e.g., [37, 646]). For a Horn function h we define σ (h) = min σ (V),
where the minimization is over all possible implication representations V of h, as
in (6.12).
We also consider for each µ ∈ {λ, τ , σ } the decision variant of the problem of
finding a shortest representation of a given Horn function:
Horn µ-Minimization
Instance: A Horn DNF η of a Horn function h and an integer K.
Output: A (Horn) DNF or implication representation η∗ of the Horn function h
such that µ(η∗ ) ≤ K, if there is one.
Note that we do not have to require the output to be Horn in case of µ ∈ {λ, τ }.
In fact, by substituting the non-Horn terms of η∗ by prime implicants of η (which
can easily be done in polynomial time in the size of η according to Lemma 6.3),
we can always obtain a Horn DNF η∗∗ such that η ≥ η∗∗ ≥ η∗ and µ(η∗∗ ) ≤ µ(η∗ )
for both measures µ ∈ {τ , λ}. It is also easy to see that η∗ ≥ η holds if and only
if η∗∗ ≥ η holds, and the latter can be checked in polynomial time by Lemma 6.3
(see [173]). Thus, we can assume in the sequel, without any loss of generality, that
η∗ is a Horn DNF when µ ∈ {λ, τ }.
D’Atri, and Saccà [37] using a reduction from set covering. We sketch their proof
in the context of pure Horn τ -minimization:
Proof. Let us consider a hypergraph H = (V, E) over the base set V = {1, 2, ..., n}
such that ⋃_{H∈E} H = V. It is well known that, for a given integer k < m = |E|, it
is NP-complete to decide the existence of a subset of hyperedges S ⊆ E that is a
cover of H of cardinality at most k, that is, such that |S| ≤ k and ⋃_{H∈S} H = V
(see, e.g., [371]).
With the hypergraph H and with every subset of hyperedges S ⊆ E, we now
associate pure Horn DNFs V and ηS, depending on the Boolean variables z, xj
for j ∈ V, and yH for H ∈ E, where
V = ⋁_{H∈E} ⋁_{j∈H} x̄j yH ∨ ⋁_{H∈E} ( ⋀_{j=1}^{n} xj ) ȳH,
and
ηS = ⋁_{H∈S} z ȳH ∨ V.
Let us further denote by h the Horn function represented by the pure Horn DNF
ηH . We claim that h has a DNF with no more than k + τ (V) terms if and only if
H has a cover of cardinality no more than k.
To see this, let us observe first that, since ηH does not involve the literal z̄, no
term in Phc contains z̄ (all terms of Phc can be obtained from ηH by consensus).
Let us then define D as the set of those terms in Phc involving the literal z, and
let R = Phc \ D. Any consensus involving a term in D will result in a term also
containing z. Hence, (R, D) forms an RD-partition for h, and thus V represents
hR, the R-component of h. Furthermore, V is a τ-minimal representation of hR.
This is because all quadratic terms in V must appear in all representations of hR,
and all such representations must also contain at least one term including ȳH for
each H ∈ E. Since the only prime implicants in D are z ȳH, H ∈ E, and z x̄j, j ∈ V,
and since a term z x̄j can always be replaced by z ȳH for some H ∈ E such that j ∈ H
without changing the size of the representation, Theorem 6.15 implies that a τ-
minimal prime DNF of h looks like ηS for some subhypergraph S ⊆ E. Since ηS
represents h if and only if S is a cover, our main claim follows.
This result can further be improved, as observed by Boros, Čepek, and
Kučera [110].
Proof. Let us try to repeat the above proof with a small modification in the definition
of V. Namely, let us introduce n − 1 additional variables and replace the high-degree
terms by a chain of cubic and quadratic terms, as follows:
T = ⋁_{H∈E} ⋁_{j∈H} x̄j yH ∨ x1 x2 ū1 ∨ u1 x3 ū2 ∨ · · · ∨ un−2 xn ūn−1 ∨ ⋁_{H∈E} un−1 ȳH,
and set
ηS = ⋁_{H∈S} z ȳH ∨ T.
As in the proof of Theorem 6.17, we denote by h the function represented by the
cubic pure Horn DNF ηH . We can then repeat the preceding proof, with T playing
the role of V.
Note that for quadratic pure Horn DNFs, τ -minimization is equivalent to finding
the transitive reduction of a directed graph (that is, finding the smallest subset of
arcs, the transitive closure of which is the same as that of the original graph), which
is a polynomially solvable problem; see Sections 5.4.1 and 5.8.4.
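For the quadratic pure Horn case just mentioned, each term x_i x̄_j is the arc (i, j) of a digraph, and τ-minimization amounts to transitive reduction. Below is a small sketch of our own for acyclic graphs (for general digraphs one first contracts the strongly connected components, which is not shown here):

```python
def transitive_reduction(adj):
    """Transitive reduction of a DAG given as {node: set of successors}.

    An arc (u, v) is redundant exactly when v is reachable from u by a
    path of length >= 2; for acyclic graphs the reduction is unique.
    """
    def reachable_without(u, v):
        # nodes reachable from u when the direct arc (u, v) is ignored
        stack, seen = list(adj[u] - {v}), set()
        while stack:
            x = stack.pop()
            if x not in seen:
                seen.add(x)
                stack.extend(adj.get(x, ()))
        return seen

    return {u: {v for v in vs if v not in reachable_without(u, v)}
            for u, vs in adj.items()}

# arcs a→b and b→c make the arc a→c redundant
print(transitive_reduction({'a': {'b', 'c'}, 'b': {'c'}, 'c': set()}))
```

Each redundancy test is one graph search, giving an O(|V||E|) sketch; faster methods exist, but this suffices to illustrate the reduction.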
On the positive side, for an arbitrary Horn function h, Hammer and Kogan
[447] proved that τ (h) is approximated within a reasonable factor by the size of
any irredundant prime DNF of h.
Theorem 6.19. If h is a Horn function on Bn and η ⊆ Phc is an irredundant Horn
DNF of h, then τ (η) ≤ (n − 1)τ (h).
Proof. Let us consider the RD-partition R ∪ D = Phc into pure Horn and positive
terms, and let ζ denote a τ-optimal irredundant, prime DNF of h. Then, τ(ηD) =
τ(ζD) holds for the positive components according to Theorem 6.16, and η1 =
ηR ≡ ζ1 = ζR = hR must hold for the pure Horn components by Theorem 6.15.
Let us further divide R into R = P^c_{hR} = R′ ∪ D′, where D′ is the set of linear terms
and R′ is the set of nonlinear pure Horn terms in R. This yields an RD-partition
of the closure of the prime implicants of hR (see Exercise 27), and by the same
theorems, we get that τ(η1^{D′}) = τ(ζ1^{D′}) and η2 = η1^{R′} ≡ ζ2 = ζ1^{R′} = h_{R′}.
Let us consider next a term A ȳ of ζ2. Since η2 ≡ ζ2, this term is an implicant
of η2, and thus, by Lemma 6.3, the variable y must belong to the forward chaining
closure A^{η2} of A. Let A^{η2} \ A = {xi1, xi2, . . . , xik} be indexed according to the order
in which forward chaining adds these variables to A, and let Aij x̄ij be the term of
η2 used in this process when adding xij to A, for j = 1, . . . , k. (We have y = xit for
some t ≤ k.) It is easy to see that, performing consensuses between these terms,
we can derive the prime implicant A ȳ.
Thus we need at most |A^{η2} \ A| terms of η2 to derive a term A ȳ ∈ ζ2. Since
η is irredundant, η2 must also be irredundant (this follows from Theorem 6.15),
and thus, every term of η2 must appear in such a derivation for some term of
ζ2. Therefore, we have τ(η2) ≤ Σ_{A ȳ∈ζ2} |A^{η2} \ A| ≤ (n − 1)τ(ζ2), since ζ2 does not
contain linear pure Horn terms by our construction.
6.7 Minimization of Horn DNFs 301
Let us finally remark that much better polynomial time approximation may
not be achievable, as shown by a recent inapproximability result of Bhattacharya,
DasGupta, Mubayi, and Turán [77]:
Theorem 6.20. For any fixed 0 < ε < 1, one cannot guarantee a 2^{log^{1−ε} n}-
approximation for Horn τ-minimization in polynomial time, unless NP ⊆
DTIME(n^{polylog(n)}).
Proof. Given a hypergraph (V , E), let us consider the cubic Horn DNF ηS ,
defined as in the proof of Theorem 6.18, for any subfamily S ⊆ E. It can be
verified that ηS is not only τ -minimal but also λ-minimal if and only if S is a
minimal cover.
Here again, condition η ⊆ Phc is important because Theorem 6.22 does not hold
for arbitrary irredundant Horn DNFs.
Example 6.12. Consider the DNF η of the Horn function h = x1 , as in Example
6.11. In this case, we have n = 3, λ(η) = 7, while λ(h) = 1.
Lemma 6.8 implies that we can associate a unique pure Horn function h′ in
n + 1 variables with every Horn function h in n variables, so that σ(h) = σ(h′).
Therefore, in the sequel, we shall consider source minimization only for pure Horn
functions.
Recall from Section 6.4 that the forward chaining closure S^η of a subset S of the
variables is uniquely defined for every (pure) Horn DNF η, and that this closure
is the same for every (pure) Horn DNF representing a given function h, so that
we can also denote S^η as S^h. It follows from Lemma 6.3 that a pure Horn term
Ax ∈ Phc is an implicant of a Horn function h if and only if x ∈ A^h.
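Forward chaining itself is a simple fixed-point computation. In the sketch below (our own naming), a pure Horn DNF is given as pairs (A, y), where A is the set of positive variables of a term and y its complemented variable; a careful implementation runs in O(n + |η|) time, while this naive version may rescan the term list several times:

```python
def forward_chaining_closure(terms, S):
    """Closure S^eta of the variable set S: repeatedly add the complemented
    variable y of every term (A, y) whose positive part A is already
    contained in the closure."""
    closure, changed = set(S), True
    while changed:
        changed = False
        for A, y in terms:
            if y not in closure and set(A) <= closure:
                closure.add(y)
                changed = True
    return closure

def is_implicant(terms, A, x):
    """Lemma 6.3: the pure Horn term with positive part A and complemented
    variable x is an implicant of h if and only if x lies in A^h."""
    return x in forward_chaining_closure(terms, A)
```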
Note further that, since we view a DNF as a set of terms, we consider η = xz to be
different from η′ = xz ∨ xyz, even if they represent the same Boolean function; but
η is considered to be the same as η′′ = xz ∨ xz, even if they are written differently.
Definition 6.10. Given an implicant Tx ∈ Phc of a pure Horn function h, the set
of terms I(T) = {Ty | y ∈ T^h \ T} ⊆ Phc is called the h-star of T.
Definition 6.11. For a pure Horn DNF η, we denote by S(η) the family of all
those subsets of variables which appear as sets of positive variables of a term of η.
We call S(η) the family of source sets of η.
With this definition, we have σ (η) = |S(η)| for every pure Horn DNF η.
Definition 6.12. Given a DNF η ⊆ Phc of the pure Horn function h, we associate
to it another DNF defined by η* = ⋁_{T∈S(η)} I(T). We say that η* is the star closure
of η, and we say that η is star closed if η = η*.
The star closure η∗ represents h, and we have S(η) = S(η∗ ) by the preceding
definitions.
Definition 6.13. A star closed pure Horn DNF η representing the pure Horn
function h is called star irredundant if the DNF ⋁_{T∈S} I(T) does not represent
h for any proper subset S ⊊ S(η).
Lemma 6.9. Given a DNF η ⊆ Phc representing a pure Horn function h, a star
closed and star irredundant DNF η̂ representing h can be constructed in O(n|η|²)
time.
Proof. Since T^η can be computed by forward chaining in O(n + |η|) time for an
arbitrary subset T of the variables (see Section 6.4), we can compute the star closure
η* of η, namely, the sets I(T) for T ∈ S(η), in O(|S(η)|(n + |η|)) = O(n|η| + |η|²)
time.
Let us next initialize η̂ = η* and label the sets S(η) = {T1, T2, ..., Tk} (where
k = |S(η)|). Then, repeat the following for j = 1, ..., k: define the DNF
φ_j = ⋁_{Q∈S(η̂)\{Tj}} I(Q), and compute the forward chaining closure T_j^{φ_j} in
O(n + |φ_j|) = O(n|η|) time. Clearly, if T_j^{φ_j} = T_j^{η}, then φ_j also represents h;
in this case we update η̂ := φ_j, thereby removing the source set Tj. After all k
iterations, η̂ is star closed and star irredundant, and the total running time is
O(kn|η|) = O(n|η|²).
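The construction of Lemma 6.9 can be sketched as follows (an illustration with invented helper names, and without the lemma's careful bookkeeping): build the h-stars of all source sets, then try to drop each star in turn, keeping it only if the forward chaining closure of its source set would shrink.

```python
def closure(terms, S):
    """Forward chaining closure of S under a pure Horn DNF given as
    (frozenset A, y) pairs."""
    c, changed = set(S), True
    while changed:
        changed = False
        for A, y in terms:
            if y not in c and A <= c:
                c.add(y)
                changed = True
    return c

def star_closure_irredundant(terms):
    """Build the star closure of the input DNF and then drop every h-star
    I(T) whose removal leaves the closure of its source set T unchanged."""
    sources = {A for A, _ in terms}
    full = {T: closure(terms, T) for T in sources}      # T^h for each source
    kept = set(sources)
    for T in sorted(sources, key=sorted):               # fixed processing order
        rest = [(Q, y) for Q in kept if Q != T for y in full[Q] - Q]
        if closure(rest, T) == full[T]:                 # star I(T) is redundant
            kept.discard(T)
    return [(T, y) for T in kept for y in full[T] - T]
```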
304 6 Horn functions
The main result of this subsection, then, states that any star irredundant and star
closed DNF representation of a pure Horn function is also σ -minimal.
Theorem 6.23. If h is a pure Horn function, and η ⊆ Phc is a star closed, star
irredundant DNF of h, then σ (h) = σ (η).
Before we prove this statement, we need a few more definitions and lemmas.
Observe first that if h is a pure Horn function, and S is a subset of its variables such
that S^h = S, then the partition RS = {Ax ∈ Phc | A ⊆ S} and DS = {Ax ∈ Phc | A ⊄ S}
is an RD-partition of Phc (see Exercise 27). To simplify our notations, we denote
respectively by hS and ηS the RS -components of h and η, when η ⊆ Phc is a DNF
representation of h; we call hS and ηS the S-components of h and η, respectively.
Note that hS could equivalently be defined by the disjunction of all terms Ax ∈ Phc
for which A^h ⊆ S, and that ηS is a DNF representation of hS for every DNF η ⊆ Phc
of h, by Theorem 6.15.
Definition 6.14. For a pure Horn function h and a subset S of its variables such
that S^h = S, we denote by h^S the function defined by the disjunction of all those
terms Tx ∈ Phc such that T^h ⊊ S. Analogously, for a DNF η ⊆ Phc of h, we denote
by η^S the disjunction of all those terms Tx ∈ η such that T^h ⊊ S.
The next lemma is instrumental in our proof of Theorem 6.23, and it leads to the
identification of another type of “subfunction” of pure Horn functions, not implied
by RD-partitions.
Lemma 6.10. Let h be a pure Horn function, let S be a subset of its variables
such that S^h = S, and let η ⊆ Phc be a Horn DNF of h. Then, for every implicant
Ax ≤ h^S, either Ax ≤ η^S or A^h = S.
Proof. Let us consider an arbitrary implicant Ax ≤ h^S for which Ax ≰ η^S. We
claim that A^h ⊇ S, which will imply the lemma, since S ⊇ A and S^h = S by our
assumptions. To see this claim, we consider the partial assignment that sets all
variables in A to 1 and assigns 0 to x. Since Ax ≰ η^S, the Horn function obtained
from η^S by substituting this partial assignment has some false points, and
thus it has a unique minimal false point by Corollary 6.3. Let X* denote this unique
binary assignment, extended with the values assigned to the variables of A and to
x, and let us denote by Q the subset of variables which are assigned value 1 in
X*. It is easy to see by the definition of forward chaining that we have Q ⊆ A^{η^S}
(since x = 0 limits the forward chaining procedure). Since Ax ≤ h^S ≤ h ≡ η, and the
term Ax evaluates to 1 at X* by our construction, there must exist a term By of
η that also evaluates to 1 at X*, that is, for which B ⊆ Q and y ∉ Q. Clearly, this
term of η does not belong to η^S, since all terms of η^S vanish at X*; thus B^h = S
is implied by the definition of η^S. Since we have A^h = A^η ⊇ A^{η^S} ⊇ Q ⊇ B, the
relations A^h = (A^h)^h ⊇ B^h = S follow, concluding the proof of the claim.
Corollary 6.10. Let h be a pure Horn function, let S be a subset of its variables
such that S^h = S, and let η ⊆ Phc be a Horn DNF of h. Then, η^S represents the
function h^S.

Proof. For any term Tx ∈ Phc for which T^h ⊊ S, it follows by Lemma 6.10 that
Tx ≤ η^S, which then implies h^S ≤ η^S by Definition 6.14. For the converse direction,
the terms of η^S are also implicants of h^S by Definition 6.14, since η ⊆ Phc is
assumed.
Proof of Theorem 6.23. Consider two star closed, star irredundant DNFs η ⊆ Phc
and ζ ⊆ Phc of the pure Horn function h, and fix an arbitrary subset S of the
variables for which S^h = S. Clearly, both ηS and ζS represent the S-component hS
of h; thus, they both must be star closed and star irredundant, because both η and
ζ are assumed to be star closed and star irredundant. Let us further denote by
A1, ..., Ak ∈ S(η) and B1, ..., Bℓ ∈ S(ζ) the source sets of η and ζ, respectively,
for which Ai^h = Bj^h = S holds for i = 1, ..., k and j = 1, ..., ℓ.

We claim that k = ℓ. Since every source set of S(η) and S(ζ) corresponds
to exactly one subset S of the variables, satisfying S^h = S, this claim implies
the statement of the theorem, for example, by assuming that ζ is a σ-optimal
representation.
To prove the claim, let us assume indirectly that, for instance, k > ℓ. Note
first that according to Corollary 6.10, both η^S and ζ^S represent the same function
h^S. Furthermore, the star irredundancy of ηS and ζS implies that Ai x ≰ η^S and
Bj y ≰ ζ^S for some variables x ∈ Ai^h \ Ai and y ∈ Bj^h \ Bj, for all i = 1, ..., k and
j = 1, ..., ℓ.

Thus it follows, as in the proof of Lemma 6.10, that for every index i, there
exists a corresponding index j, such that Ai^{h^S} ⊇ Bj, and conversely, for every
index j, there exists a corresponding index i such that Bj^{h^S} ⊇ Ai. Since k > ℓ, we
must have indices i1 ≠ i2 and j for which Ai1^{h^S} ⊇ Bj and Ai2^{h^S} ⊇ Bj. Let us denote
by i3 one of the indices for which Bj^{h^S} ⊇ Ai3 holds. Since i1 ≠ i2, we can assume,
without any loss of generality, that i3 ≠ i1. Thus,

Ai1^{η^S} = Ai1^{h^S} ⊇ Bj^{h^S} ⊇ Ai3.
This last relation implies by Lemma 6.3 that every term of I(Ai1) is an impli-
cant of η^S ∨ I(Ai3), contradicting the fact that η was chosen as a star irredundant
expression. This contradiction proves that k = ℓ, finishing the proof of the claim
and of the theorem.
φ(X) = ⋁_{i=1}^{m} ⋀_{j∈Pi} x_j ⋀_{k∈Ni} x̄_k.

Then, f is the dual of a Horn function if and only if, for any two distinct indices
i ≠ i′,

φ|_{{x_j=1 | j ∈ Pi ∪ Pi′} ∪ {x_k=0 | k ∈ Ni ∩ Ni′}} ≡ 1.   (6.13)
Proof. Recall (see Theorem 4.7) that a nontrivial term

T = ⋀_{j∈P} x_j ⋀_{k∈N} x̄_k,   (6.14)

is an implicant of f^d if and only if, viewed as a set of literals, it intersects every
term of φ, and that T is a prime implicant of f^d if and only if the sets P and N
are minimal with respect to these conditions. In other words, the prime implicants of the dual
f^d, as subsets of literals, are in one-to-one correspondence with those minimal
transversals of the hypergraph on the set of literals formed by the terms of φ
which do not contain complementary pairs of literals. Thus, f is the dual of a
Horn function if and only if all such minimal transversals of the terms of φ contain
at most one negative literal.
To prove the theorem, let us assume first that there exists a non-Horn prime
implicant of f d of the form (6.14) with |N | ≥ 2, and let us prove that, in this case,
condition (6.13) is violated.
Since T is prime, for every ℓ ∈ N there must exist a term i(ℓ) of φ such that
(P ∪ N) ∩ (P_{i(ℓ)} ∪ N_{i(ℓ)}) = {ℓ}. Thus, for any two distinct indices ℓ ≠ ℓ′, ℓ, ℓ′ ∈ N,
we have ℓ ∈ N_{i(ℓ)} \ N_{i(ℓ′)}, ℓ′ ∈ N_{i(ℓ′)} \ N_{i(ℓ)}, and P ∩ (P_{i(ℓ)} ∪ P_{i(ℓ′)}) = ∅. On the other
hand, P ∩ Pi ≠ ∅ must hold for all terms i of φ such that Ni ⊆ (N_{i(ℓ)} ∪ N_{i(ℓ′)}) \ {ℓ, ℓ′},
since N ∩ Ni = ∅ for such terms.
It follows from these observations that the assignment {x_j = 0 | j ∈ P} ∪
{x_k = 1 | k ∈ N} is compatible with the assignment {x_j = 1 | j ∈ P_{i(ℓ)} ∪ P_{i(ℓ′)}} ∪
{x_k = 0 | k ∈ N_{i(ℓ)} ∩ N_{i(ℓ′)}}. However, since T is an implicant of f^d, φ vanishes
when x_j = 0 for j ∈ P and x_k = 1 for k ∈ N, contradicting (6.13).
For the reverse direction, let us assume indirectly that there exist two distinct
indices i and i′ such that

φ|_{{x_j=1 | j ∈ Pi ∪ Pi′} ∪ {x_k=0 | k ∈ Ni ∩ Ni′}} ≢ 1.
Let X be an assignment of the variables x_j, j ∉ Pi ∪ Pi′ ∪ (Ni ∩ Ni′), at which the
left-hand side vanishes, and let us define P = {j | x_j = 0} and N = {k | x_k = 1}.
Then the term T corresponding to these sets P and N is a transversal of the terms
in φ; thus, it contains a minimal transversal. All such minimal transversals, how-
ever, must have a literal from both terms i and i′, which can only come from the
sets Ni \ Ni′ and Ni′ \ Ni, respectively, implying that all such minimal transversals
must contain at least two negative literals.
However, for special classes of DNFs for which tautology is tractable and
remains so after fixing some of the variables, Theorem 6.24 provides a computa-
tionally efficient way of recognizing whether the dual of the input DNF is indeed
Horn. This applies, for instance, when φ itself is a Horn DNF; see also [298] and
Section 6.9.2.
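For illustration, condition (6.13) can be checked by brute force over the unrestricted variables (exponential in n; the point of the theorem is that for tractable classes the tautology tests can be performed efficiently instead). The representation and names below are ours:

```python
from itertools import product

def evaluates(terms, X):
    """Value of the DNF with terms (P, N) at the 0/1 point X (a dict)."""
    return any(all(X[j] for j in P) and all(not X[k] for k in N)
               for P, N in terms)

def is_dual_horn(terms, variables):
    """Check condition (6.13) for every pair of distinct terms by brute
    force over the free variables."""
    for a, (Pa, Na) in enumerate(terms):
        for b, (Pb, Nb) in enumerate(terms):
            if a == b:
                continue
            fixed = {j: 1 for j in Pa | Pb}
            fixed.update({k: 0 for k in Na & Nb})
            free = [v for v in variables if v not in fixed]
            for bits in product([0, 1], repeat=len(free)):
                X = dict(fixed)
                X.update(zip(free, bits))
                if not evaluates(terms, X):
                    return False
    return True

# x1 x2  or  !x1 !x2  is the dual of a Horn function ...
assert is_dual_horn([({1, 2}, set()), (set(), {1, 2})], [1, 2])
# ... while  !x1 or !x2  is not: its dual !x1 !x2 has two negative literals.
assert not is_dual_horn([(set(), {1}), (set(), {2})], [1, 2])
```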
We turn now to the problem of generating a DNF of the dual f d of a Horn
function f . It is clear that this problem is at least as hard as the generation of
the dual of a monotone function, since monotone functions are Horn. It is not
so clear, however, whether “Horn dualization” is strictly harder than “monotone
dualization.” Recall that a prime DNF of the dual of a monotone function can
be generated incrementally efficiently (see Fredman and Khachiyan [347] and
Section 4.4.2 in Chapter 4). We explain next that a similar claim can be made for
Horn dualization, as well.
While it is hard to recognize whether a given conjunction is an implicant of a
function expressed in DNF (see Theorem 3.7), we can show that the same problem
is tractable for the dual function (see also Theorem 4.7).
In contrast, note that checking whether f^d has no prime implicant, that is,
whether φ^d ≡ 0, is co-NP-complete for general DNFs. However, even this case
becomes easy when φ is a Horn DNF (see Theorems 6.10 and 6.11). This implies
that the following special variant of the Dual Recognition problem may be eas-
ier than the general case:
f (X ∨ Y ) ∨ f (X ∧ Y ) ≤ f (X) ∨ f (Y ) (6.15)
Proof. It is easy to verify that for a function f both conditions – namely, being
submodular or being simultaneously Horn and co-Horn – are equivalent to the fact
that F (f ) is closed with respect to both componentwise conjunction and compo-
nentwise disjunction; see Corollary 6.2.
f^d = ⋀_{j=1}^{n} x_j ∨ ⋀_{k=1}^{n} x̄_k.
• If Gf is acyclic, then

f^d = ⋁_{I ∈ I(Gf)} ⋀_{j∉I : j≺i for some i∈I} x_j ⋀_{k∉I : k≻i for some i∈I} x̄_k.
In the general case, when Gf has c strong components (c > 1), we can write
f = f0 ∨ f1 ∨ · · · ∨ fc, where f0 is the disjunction of those prime implicants that
involve variables from different strong components, and fi is the disjunction of
those prime implicants that involve variables only from the i-th strong component
of Gf, for i = 1, ..., c. Then, we have f^d = f0^d ∧ f1^d ∧ · · · ∧ fc^d, where each of these
functions can be determined by Theorem 6.29, since Gf0 is acyclic, and Gfi is
strongly connected for i = 1, ..., c.
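For a strongly connected Gf, the only false points of f are the all-zero and all-one vectors, so f^d = x1 · · · xn ∨ x̄1 · · · x̄n; this is easy to verify by brute force on a small example via f^d(X) = f̄(X̄) (a sketch with our own naming):

```python
from itertools import product

def horn_cycle_dnf(n):
    """Arcs of the directed cycle 1 -> 2 -> ... -> n -> 1; the associated
    quadratic pure Horn DNF has one term x_u !x_v per arc (u, v)."""
    return [(u, u % n + 1) for u in range(1, n + 1)]

def f(X, arcs):
    return any(X[u] and not X[v] for u, v in arcs)

def f_dual(X, arcs):
    """f^d(X) is the complement of f at the complemented point."""
    return not f({v: 1 - X[v] for v in X}, arcs)

# for a strongly connected G_f the dual is x_1...x_n  or  !x_1...!x_n
n, arcs = 3, horn_cycle_dnf(3)
for bits in product([0, 1], repeat=n):
    X = dict(zip(range(1, n + 1), bits))
    assert f_dual(X, arcs) == (all(bits) or not any(bits))
```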
6.9 Special classes 311
Since testing AB ≤ η can be done in linear time when η is Horn (see, e.g.,
Corollary 6.6), the above characterization provides an O(|η|² ‖η‖) algorithm to
test whether a Horn DNF represents a bidual Horn function.
Unfortunately, this positive result does not extend to general DNF representa-
tions. Namely, it was shown in [298] that it is co-NP-complete to recognize whether
an arbitrary DNF represents a bidual Horn function (this is again a corollary of
Theorem 1.30).
Recall from Definition 3.3 in Section 3.3.2 that a prime implicant of a Boolean
function f is essential if it is present in all prime DNF representations of f . An
interesting property of bidual Horn functions is stated next.
Theorem 6.31 ([298]). If f is bidual Horn, then all pure Horn prime implicants
of f are essential.
In light of Theorem 6.16, this implies that every irredundant prime DNF of a
bidual Horn function has the same number of terms; thus, minimizing the number
of terms in a DNF representation of a bidual Horn function given by a Horn DNF
is polynomially solvable, by Theorem 6.12.
The foregoing does not imply that all irredundant prime DNFs of a bidual Horn
function f should involve the same number of literals. Still, finding a repre-
sentation with the minimum number of literals can also be solved efficiently in
O(l(mh² mp + l)) time, where l is the number of literals in a given Horn DNF η of
f, and mh and mp denote respectively the number of Horn and positive terms in
η. Furthermore, the number of positive prime implicants of f cannot be more than
2mh² + mp(mh + 1), and thus the consensus algorithm generates from η all prime
implicants of f in polynomial time (see [298]).
Let us further observe that generating the dual of a bidual Horn function f
represented by a Horn DNF η is not easier than dualizing a monotone DNF, since
bidual DNFs include all monotone DNFs as special cases.
Finally, we remark that the existence of a bidual extension for a given partially
defined Boolean function (T , F ) (see Chapter 12 for definitions) can be checked
in O(n|T ||F |) time, where n is the number of variables. Interestingly, listing all
bidual extensions of (T , F ) is computationally equivalent (i.e., as easy or difficult)
as generating all prime implicants of the dual of a monotone DNF (see [298]). In
particular, deciding whether a given partially defined Boolean function (T , F ) has
a unique bidual extension is equivalent to Dual Recognition (see Chapter 4),
and hence can be solved in quasi-polynomial time (see [347]).
φ = ⋁_{i∈S} (⋀_{k=1}^{i−1} x_{j_k}) x̄_{j_i},
Note that the preceding DNF is an orthogonal expression (i.e., no two of its
terms can take value 1 simultaneously; see Section 1.6 and Chapter 7) which is
short, since it consists of at most n + 1 terms, where n is the number of variables.
In fact, a much stronger statement can be established:
It can also be shown (e.g., by Theorem 6.32) that double Horn functions are
read-once, that is, they can be represented by a Boolean expression in which every
variable appears at most once (see Chapter 10).
Despite the fact that this class of functions is very “small” and quite well charac-
terized, recognizing whether a given DNF φ represents a double Horn function is
still co-NP-complete in view of Theorem 1.30. However, the recognition problem
is polynomially solvable under appropriate conditions on the input DNFs.
Theorem 6.34 ([296]). Let F be a class of formulae that is closed under restric-
tions (i.e., variable fixing) and for which checking ϕ ≡ 1 and ϕ ≡ 0 can both be
done in t(n, |ϕ|) time, where n is the number of variables and |ϕ| denotes the input
length of formula ϕ ∈ F. Then, deciding whether ϕ ∈ F represents a double Horn
function can be performed in O(n2 t(n, |ϕ|)) time.
Theorem 6.36 ([448]). If h is an acyclic pure Horn function, then every prime
implicant of h is either essential or redundant.
This remarkable property of acyclic Horn functions implies that they have
a unique irredundant prime DNF representation. Thus, in light of the preced-
ing results and of Theorem 6.12, we can check whether a given pure Horn
DNF η is acyclic, and if yes, we can find the unique irredundant prime DNF
representing the same acyclic Horn function in O(‖η‖²) time (where, actually,
the majority of the time will be spent on transforming η into an irredun-
dant prime DNF). Clearly, the unique irredundant prime DNF of an acyclic
Horn function minimizes all usual measures of complexity (see Chapter 3 and
Section 6.7).
Further properties and generalizations of acyclic functions based on the
structure of associated graphs can be found in [108, 173, 448, 449].
6.10 Generalizations
6.10.1 Renamable Horn expressions and functions
In most applications of Boolean functions, the meaning of a variable and its nega-
tion are interchangeable, since a particular variable x could equally well denote the
truth value of a logical proposition or its negation. Thus, it is natural to consider
logical expressions obtained from a given expression after replacing some of the
variables by their negations. More formally, given a DNF
φ = ⋁_{i=1}^{m} ⋀_{j∈Pi} x_j ⋀_{k∈Ni} x̄_k   (6.16)
and a subset S ⊆ {1, 2, . . . , n}, we say that the DNF φ S is obtained from φ by
switching (or renaming; see also Chapter 5) the variables in the subset S if
φ^S = ⋁_{i=1}^{m} ⋀_{j∈(Pi\S)∪(Ni∩S)} x_j ⋀_{k∈(Ni\S)∪(Pi∩S)} x̄_k.
We say that the DNF φ is renamable Horn if φ S is a Horn DNF for some subset
S of the variables (as before, we do not distinguish between sets of variables and
sets of indices whenever this does not cause any confusion).
The problem of recognizing whether a given DNF φ is renamable Horn was
considered first by Lewis [612], who provided an elegant proof showing that this
problem is polynomially solvable, namely, that it can be reduced to a quadratic
Boolean equation.
Theorem 6.37 ([612]). Let φ be a DNF given as in (6.16) and let S ⊆ {1, 2, . . . , n}.
Then φ S is Horn if and only if the following implications hold for every i = 1, . . . , m:
Pi ∩ S ≠ ∅ =⇒ Ni ⊆ S and |Pi ∩ S| = 1,
Ni \ S ≠ ∅ =⇒ |Ni \ S| = 1 and Pi ∩ S = ∅.
Proof. If one of the above implications failed for the i-th term of φ, then
after switching the variables in S, this term would have more than one negated
variable. On the other hand, if both implications hold for term i, then
it has at most one negated variable after switching.
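Both the switching operation and the conditions of Theorem 6.37 are mechanical enough to state in code; the sketch below (names ours) verifies on a small example that the two implications hold for a switching set S exactly when φ^S is Horn:

```python
from itertools import combinations

def switch(terms, S):
    """The renaming phi -> phi^S on a DNF given as (P_i, N_i) index sets."""
    return [((P - S) | (N & S), (N - S) | (P & S)) for P, N in terms]

def is_horn(terms):
    return all(len(N) <= 1 for _, N in terms)

def lewis_conditions(terms, S):
    """The two implications of Theorem 6.37, checked term by term."""
    return all(
        (not (P & S) or (N <= S and len(P & S) == 1)) and
        (not (N - S) or (len(N - S) == 1 and not (P & S)))
        for P, N in terms)

# the conditions hold exactly when phi^S is Horn
terms = [(frozenset({1, 2}), frozenset()), (frozenset({3}), frozenset({1}))]
for r in range(4):
    for S in map(frozenset, combinations({1, 2, 3}, r)):
        assert lewis_conditions(terms, S) == is_horn(switch(terms, S))
```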
η = x1 ∨ x2 ∨ x3,
φ = x1x2 ∨ x1x3 ∨ x2x3 ∨ x̄1x̄2x3 ∨ x̄1x2x̄3 ∨ x1x̄2x̄3,
H (α)∩H (β). Since there are only finitely many different subsets H ⊆ {1, 2, . . . , n},
it follows that there exists a vector α ∈ Pφ such that H (α) ⊆ H (β) for all β ∈ Pφ .
Thus the lemma follows by H (α) = H ([α]).
z(φ) = min z
s.t. z ≥ Σ_{j∈Pi} α_j + Σ_{k∈Ni} (1 − α_k) for i = 1, ..., m,
0 ≤ α_j ≤ 1 for j = 1, ..., n.
Clearly, φ is Q-Horn if and only if z(φ) ≤ 1. Boros et al. [116] showed that if
z(φ) ≤ 1 + (c log n)/n, then the Boolean equation φ = 0 can be solved in O(n^c)
time. On the other hand, the tautology problem remains NP-complete for any fixed
ε < 1 when restricted to instances for which z(φ) ≤ 1 + n^{−ε}.
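Two simple consequences of the LP formulation can be checked directly: a Horn DNF achieves z ≤ 1 already with α = 0, and a DNF that becomes Horn after switching a set S achieves z ≤ 1 with the 0-1 vector α = characteristic vector of S, so renamable Horn DNFs are Q-Horn. A sketch (names ours):

```python
def z_value(terms, alpha):
    """Smallest feasible z of the Q-Horn LP for a fixed alpha: the largest
    term weight sum(alpha_j, j in P_i) + sum(1 - alpha_k, k in N_i)."""
    return max(sum(alpha[j] for j in P) + sum(1 - alpha[k] for k in N)
               for P, N in terms)

# a Horn DNF (at most one complemented variable per term) has z <= 1 at alpha = 0
horn = [({1, 2}, {3}), ({2}, set())]
assert z_value(horn, {1: 0, 2: 0, 3: 0}) <= 1

# a DNF that becomes Horn after switching S has z <= 1 at alpha = indicator of S,
# so renamable Horn DNFs are Q-Horn
renamable = [({1}, {2}), (set(), {1, 2})]   # Horn after switching S = {2}
alpha = {1: 0, 2: 1}
assert z_value(renamable, alpha) <= 1
```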
is not empty. Furthermore, repeated application of the unit literal rule allows us
to detect whether φ ≡ 1.
Note that by Theorem 2.10, an arbitrary DNF equation φ = 0 has a solution if and
only if the polyhedron Qφ contains an integral point. The strength of the preceding
statement is that for an extended Horn DNF φ, the integrality requirement can be
disregarded, and hence, the consistency question can be decided in polynomial
time by linear programming.
It was also shown by Schlipf et al. [809] that extended Horn equations (and
many others, including renamable extended Horn equations) can be solved by the
single look-ahead unit literal rule. In this algorithm, variables are assigned binary
values one-by-one, and the unit literal rule is applied right after each assignment
has been made. If a contradiction is found, then the last assignment is reversed;
otherwise, the last assignment is accepted permanently.
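The unit literal rule for a DNF equation φ = 0 is easy to state in code: a one-literal term forces that literal to 0, and an empty term signals that φ ≡ 1 under the current assignment, i.e., a contradiction. The sketch below (representation and names ours) implements just the propagation step that the single look-ahead algorithm applies after each tentative assignment:

```python
def assign(terms, var, value):
    """Substitute var := value in a DNF whose terms are sets of literals
    (var, sign), sign 1 for the variable and 0 for its complement.
    Returns None if some term becomes identically 1 (i.e., phi = 0 fails)."""
    out = []
    for T in terms:
        if (var, 1 - value) in T:       # a literal became 0: the term vanishes
            continue
        T = T - {(var, value)}          # literals that became 1 are dropped
        if not T:
            return None                 # empty term: the DNF is constant 1
        out.append(T)
    return out

def unit_propagate(terms, assignment):
    """Unit literal rule for the equation phi = 0: a one-literal term forces
    that literal to value 0.  Returns the simplified term list, or None on
    contradiction; forced values are recorded in `assignment`."""
    while True:
        unit = next((T for T in terms if len(T) == 1), None)
        if unit is None:
            return terms
        (var, sign), = unit
        assignment[var] = 1 - sign
        terms = assign(terms, var, assignment[var])
        if terms is None:
            return None
```

For example, on φ = x ∨ x̄y the rule first forces x = 0 and then y = 0, solving the equation without any branching.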
The recognition of extended Horn DNFs is strongly related to the so-called
arborescence realization problem (given a hypergraph H on a base set E, find an
arborescence T with arc set E such that all hyperedges of H are directed paths
in T ), and, in fact, a polynomial time recognition algorithm for simple extended
Horn DNFs was derived via arborescence realization by Swaminathan and Wagner
[852]. This was later improved to a linear time algorithm by Benoist and Hebrard
[59]. The problem of recognizing extended Horn DNFs is still open.
D0 ⊂ D1 ⊂ · · · ⊂ Dk ⊂ · · ·
such that (i) the membership φ ∈ Dk can be tested in time polynomial in |φ|k ; (ii)
if k is a fixed constant and φ ∈ Dk , then the Boolean equation φ = 0 can be solved
in polynomial time; and (iii) for every DNF φ, φ ∈ Dk for some integer k.
Several such hierarchies were considered in the literature (see, e.g., [174,
253, 362, 756]), most of them built on Horn expressions or on some of their
generalizations. To describe these, we need to introduce a few more notations.
With a DNF φ given by (6.16), let us associate the hypergraph N (φ) = {Ni |
i = 1, . . . , m} consisting of the index sets of the negated variables of the terms of
φ. Note that N (φ) may not be a clutter; for example, it contains the empty set
whenever φ includes a positive term. For an index j, consider two operations,
defined by N \ {j} = N \ {N ∈ N | N ∋ j} and N ÷ {j} = {N \ {j} | N ∈ N},
respectively called the deletion and the contraction of element j (note the slight
difference with the similar terminology introduced in Section 1.13.5).
One of the earliest polynomial hierarchies N0 ⊂ N1 ⊂ · · · ⊂ Nk ⊂ · · · , where
N0 is the family of Horn expressions, was proposed by Gallo and Scutellà [362].
To describe this hierarchy, first we need to define a hierarchy of hypergraphs
I0 ⊂ I1 ⊂ · · · by
Note that class Ik for k > 0 is initialized by the condition Ik−1 ⊂ Ik . Then, classes
of DNFs Nk , k = 0, 1, . . . are defined by φ ∈ Nk if and only if N (φ) ∈ Ik .
Clearly, N0 is the family of Horn DNFs. The class N1 is the family of so-called
generalized Horn DNFs, introduced earlier by Yamasaki and Doshita [931]. It
was shown in [362] that the membership φ ∈ Nk can be tested in O(|φ|n^k) time.
Furthermore, the membership algorithm in [362] provides the index j appearing
in the recursive definition of Ik. When k is a fixed constant, a polynomial time
algorithm to solve the Boolean equation φ = 0, with φ ∈ Nk, follows easily from
these results. Indeed, branching on the j-th variable results in two subproblems,
one from Nk−1 and one from Nk, both having one variable less than the original
problem. (The same results were obtained by [931] for k = 1.)
The previous hierarchy was somewhat improved by Dalal and Etherington
[253], so that both Horn and quadratic formulae could be included at the lowest
level of the hierarchy. Furthermore, it was shown by Kleine Büning [570] that,
to prove φ ≡ 1 for a DNF φ ∈ Nk , it is enough to use a restricted version of the
consensus algorithm in which the consensus of two terms is computed only if at
least one of the terms is of degree at most k.
Pretolani [756] observed that many other classes of DNFs could be used in
place of N0 , resulting in a similar polynomial hierarchy. Unfortunately, renamable
extensions of otherwise simple classes may not always be included at low levels
of such hierarchies. For instance, Eiter, Kilpeläinen, and Mannila [301] showed
that recognizing renamable generalized Horn DNFs is an NP-complete problem.
Recent work of Čepek and Kučera [174] provides a quite general framework
for constructing such polynomial hierarchies. Let D0 be a class of DNFs, and
• for k > 0, let φ ∈ Dk if and only if there exists a literal u of φ such that
φ|u=0 ∈ Dk−1 and φ|u=1 ∈ Dk .
For example, the class D0 can be chosen to be the family of renamable Horn
DNFs or the family of Q-Horn DNFs, and so on, with each choice resulting in a
different polynomial hierarchy.
6.11 Exercises
1. Let f and g denote arbitrary Horn functions. Decide whether the following
claims are true or false:
• f ∨ g is Horn.
• f ∧ g is Horn.
• f is Horn.
2. Find a Boolean function in n variables for which the number of minimal
Horn majorants is exponential in n.
3. Find a Horn function in n variables for which the number of prime implicants
is polynomial in n, but the number of different Horn DNF representations
is exponential in n.
4. Let f be a Boolean function, and let Pi, i = 1, ..., m, be its Horn prime impli-
cants. Prove that η(X) = ⋁_{i=1}^{m} Pi(X) is the unique maximal Horn minorant
of f. Does this claim remain true if Pi, i = 1, ..., m are the Horn terms of
an arbitrary DNF of f?
5. Let

f = ⋁_{i=1}^{n} ⋁_{P∈Pi} P x̄_i
and

g = ⋁_{i=1}^{n} ⋁_{Q∈Qi} Q x̄_i
• Prove that the family of pure Horn functions with the operations ⊗ and
∨ form a lattice.
• Prove that f ⊗ g is the unique largest Horn minorant of f ∧ g.
• Can you generalize this for the family of Horn functions?
6. Prove that for a nonempty subset S ⊆ Bn and for the characteristic models
Q(S) of this set (see Corollary 6.4), we have

Q(S) = {X ∈ S | X ∉ (S \ {X})^∧}.   (6.19)
S^A = {X | X ≥_A Y for some Y ∈ S}
• Let A denote the set of those n + 1 binary vectors from Bn that contain
at least n − 1 ones. Prove that, for every Horn function h, we have

F(h) = ⋂_{A∈A} F(h)^A.
φ = ⋁_{i=1}^{m} ⋀_{j∈Pi} x_j ∧ ⋀_{j∈Ni} x̄_j

φ^σ = ⋁_{i:Ni=∅} ⋀_{j∈Pi} x_j ∨ ⋁_{i:Ni≠∅} (⋀_{j∈Pi} x_j) ∧ x̄_{σ(i)}.
• Construct a Horn DNF η representing the Horn function h such that, for
every DNF representation η^{(2)} of h^{(2)}, we have |η^{(2)}| > |η| (cf. [172]).
23. Let us call a consensus k-restricted if at least one of the terms involved in
the consensus has degree at most k.
• Prove that all linear prime implicants of a pure Horn DNF η can be
obtained by a sequence of 1-restricted consensuses.
• Generalize this statement for any k ≥ 2 (see [173]).
24. Consider a Horn function h given by a prime DNF η, and let T be an implicant
of h. How difficult is it to decide whether T can be derived from the prime
implicants of h by a sequence of consensuses? How many prime implicants
of h are needed for such a consensus derivation of T when it exists?
25. Let T and Q ⊆ T be two sets of terms, both closed under consensus, and let
(R1 , D1 ) and (R2 , D2 ) be two RD-partitions of T .
• Prove that (R3 , D3 ) is also an RD-partition of T if R3 = R1 ∩ R2 and
D3 = D1 ∪ D2 .
• Prove that (R4 , D4 ) is an RD-partition of Q if R4 = R1 ∩ Q and D4 =
D1 ∩ Q.
26. Prove that the partition in Example 6.7 is an RD-partition.
27. Prove that, if h is a Horn function, then each of the following defines an
RD-partition of Phc :
(a) R = {T ∈ Phc | |T | ≥ 2} and D = Phc \ R.
(b) R = {T ∈ Phc | |T | ≤ 2} and D = Phc \ R.
(c) R = {T ∈ Phc | T (X) = 0} and D = Phc \ R, if X ∈ Bn is a point at which
every prime implicant of h contains at most one literal that evaluates to
zero. (How easy is it to check for the existence of such a binary vector
X ∈ Bn ?)
(d) R = {T ∈ Phc | all variables of T belong to S} and D = Phc \ R, where
S is a subset of the variables that is closed under forward chaining,
namely, S h = S (see Section 6.4).
28. Prove that the minimum number of positive terms in a Horn DNF of a
Horn function h is always at most 1. For which Horn functions is it 0?
How difficult is it to find such an “optimal” Horn DNF, having the minimum
number of positive terms?
29. Consider a pure Horn DNF η of a pure Horn function h, and the associated
directed graph Gη = (V , Aη ) defined in Section 6.9.4. Prove that if Ay is
an implicant of h (not necessarily present in η) and x ∈ A, then there is a
directed path from x to y in Gη .
30. Consider two prime DNFs η1 and η2 of the pure Horn function h. Prove
that the transitive closures of the directed graphs Gη1 and Gη2 are the same,
and that they coincide with the transitive closure of Gh (see Appendix A for
definitions).
31. Let us consider a pure Horn function h, the associated transitively closed
directed graph Gh = (V , Ah ), as defined in the previous exercise, and let us
assume that S ⊆ V is an initial set of the vertices (namely, there is no arc
(x, y) with x ∈ V \ S and y ∈ S). Define
The concept of orthogonal disjunctive normal form (or ODNF, sometimes called
sum of disjoint products) was introduced in Chapter 1. Orthogonal forms are a
classic object of investigation in the theory of Boolean functions, where they were
originally introduced in connection with the solution of Boolean equations (see
Kuntzmann [589], Rudeanu [795]). More recently, they have also been extensively
studied in the reliability literature (see, e.g., Colbourn [205, 206]; Provan [759];
Schneeweiss [811]).
In general, however, orthogonal forms are difficult to compute, and few classes
of disjunctive normal forms are known for which orthogonalization can be effi-
ciently performed. An interesting class with this property, called the class of
shellable DNFs, has been introduced and investigated by Ball and Provan [49, 760].
As these authors established, the DNFs describing several important classes of reli-
ability problems (all-terminal reliability, all-point reachability, k-out-of-n systems,
etc.) are shellable. Moreover, besides its unifying role in reliability theory, shella-
bility also provides a powerful theoretical and algorithmic tool of combinatorial
geometry, where it originally arose in the study of abstract simplicial complexes
(see [96, 97, 205, 206, 254, 569, etc.]; let us simply mention here, without further
details, that an abstract simplicial complex can be viewed as the set of true points
of a positive Boolean function).
In this chapter, we first review some basic facts concerning orthogonal forms
and describe a simple orthogonalization procedure for DNFs. Then, we intro-
duce shellable DNFs and establish some of their most remarkable properties: In
particular, we prove that shellable DNFs can be orthogonalized and dualized in
polynomial time. Finally, we define and investigate a fruitful strengthening of
shellability, namely, the lexico-exchange property.
7.1 Computation of orthogonal DNFs 327
φ = ⋁_{k=1}^{m} ⋀_{i∈Ak} x_i ⋀_{j∈Bk} x̄_j,   (7.1)

or, equivalently,

⋀_{i∈Ak} x_i ⋀_{j∈Bk} x̄_j ⋀_{i∈Aℓ} x_i ⋀_{j∈Bℓ} x̄_j ≡ 0 for all 1 ≤ k < ℓ ≤ m.
ψ = C1 ∨ C̄1C2 ∨ C̄1C̄2C3 ∨ … ∨ C̄1C̄2⋯C̄m−1Cm
is equivalent to φ;
(ii) if ψk is an ODNF of C̄1C̄2⋯C̄k−1Ck for k = 1, 2, …, m, then ⋁_{k=1}^{m} ψk is
an ODNF of φ.
Proof. The expression ψ is clearly equivalent to φ. Let T1 be a term of ψk and T2
be a term of ψj , where T1 ≠ T2 and k ≤ j. If k < j, then T1T2 ≡ 0, since ψk ≤ Ck
and ψj ≤ C̄k , so that ψk ψj ≡ 0.
On the other hand, if k = j, then T1T2 ≡ 0 by orthogonality of ψk .
Theorem 7.1 suggests the recursive procedure described in Figure 7.1 for
computing an ODNF of an arbitrary DNF (see, e.g., Kuntzmann [589]).
7 Orthogonal forms and shellability
Procedure Orthogonalize(φ)
Input: A DNF φ = ⋁_{k=1}^{m} Ck .
Output: An orthogonal DNF ψ equivalent to φ.
begin
for k := 1 to m do
begin
compute a DNF φk of C̄1C̄2⋯C̄k−1Ck ;
ψk := Orthogonalize(φk );
end;
ψ := ⋁_{k=1}^{m} ψk ;
end
There are many ways of implementing this algorithm, thus giving rise to differ-
ent variants of Orthogonalize, such as those proposed by Fratta and Montanari
[346]; Abraham [2]; Aggarwal, Misra, and Gupta [7]; Locks [619]; Bruni [158];
and so on; see also the surveys [206, 776]. (Note that most authors restrict
their attention to positive Boolean functions, although there is no need to be so
restrictive.)
A specific difficulty with Orthogonalize is to work around the recursive
call to the procedure, since orthogonalizing φk may, in general, be as difficult
as orthogonalizing φ itself. One way to resolve this difficulty is to produce φk
directly in orthogonal form, as this suppresses the need for the recursive call. To
achieve this goal, we write Cj = ⋀_{i=1}^{nj} ℓij , where ℓ1j , ℓ2j , …, ℓnj j are literals, for
j = 1, 2, …, m. Then,

C̄1C̄2⋯C̄k−1Ck = ( ⋀_{j=1}^{k−1} ⋁_{i=1}^{nj} ℓ̄ij ) Ck
= ⋀_{j=1}^{k−1} ( ℓ̄1j ∨ ℓ1j ℓ̄2j ∨ ℓ1j ℓ2j ℓ̄3j ∨ … ∨ ℓ1j ℓ2j ⋯ ℓnj−1,j ℓ̄nj ,j ) Ck .
Using distributivity to “multiply out” its k − 1 factors, the latter expression can
easily be transformed into an orthogonal DNF ψk .
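This direct construction can be sketched in Python for positive DNFs; the encoding of terms as sets of variable indices, and all function names, are ours, not from the text:

```python
def complement_expand(P, N, Aj):
    """Multiply the term (P, N) -- conjunction of x_i (i in P) and NOT x_j
    (j in N) -- by the orthogonal expansion of the complement of the positive
    term C_j = AND_{i in Aj} x_i, namely
        NOT x_{i1}  OR  x_{i1} NOT x_{i2}  OR  x_{i1} x_{i2} NOT x_{i3}  OR ...
    (indices of Aj taken in increasing order); the products are orthogonal."""
    prefix = set()
    for i in sorted(Aj):
        if i in N:                      # (P, N) already contains NOT x_i:
            yield (P | prefix, set(N))  # this disjunct survives as is,
            return                      # all later disjuncts contain x_i -> 0
        if i not in P:                  # if i lies in P, the disjunct vanishes
            yield (P | prefix, N | {i})
        prefix = prefix | {i}
    # if Aj is entirely contained in P, the whole product is identically 0

def orthogonalize(terms):
    """ODNF of the positive DNF with terms A_1, ..., A_m, built as in
    (7.2)-(7.3): C_k is multiplied, factor by factor, by the orthogonal
    expansions of the complements of C_1, ..., C_{k-1}."""
    odnf = []
    for k, Ak in enumerate(terms):
        products = [(set(Ak), set())]          # start from C_k alone
        for Aj in terms[:k]:
            products = [t for p in products for t in complement_expand(*p, Aj)]
        odnf.extend(products)
    return odnf
```

For the function ⋁_{i=1}^{n} xi yi of Theorem 7.2 below, this scheme produces exactly 2^n − 1 orthogonal terms.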
Abraham [2] suggested implementing this approach in an iterative fashion,
by successively computing an ODNF expression ϕj of C̄1C̄2⋯C̄jCk for j =
1, 2, …, k − 1, until ϕk−1 = φk = ψk is obtained. Suppose that the ODNF ϕj−1 is
in the form ϕj−1 = ⋁_{t∈T} Pt , where Pt (t ∈ T) are elementary conjunctions. Then,

C̄1C̄2⋯C̄jCk = C̄j ϕj−1 = ( ⋁_{i=1}^{nj} ℓ̄ij ) ⋁_{t∈T} Pt ,  (7.2)

and the right-hand side of (7.2) can be transformed to produce the ODNF

ϕj = ( ⋁_{i=1}^{nj} ℓ̄ij ) ⋁_{t∈T} Pt
= ⋁_{t∈T} ( ℓ̄1j Pt ∨ ℓ1j ℓ̄2j Pt ∨ ℓ1j ℓ2j ℓ̄3j Pt ∨ … ∨ ℓ1j ℓ2j ⋯ ℓnj−1,j ℓ̄nj ,j Pt ).  (7.3)
φ2 = ψ2 = (x̄1 ∨ x1x̄2) x2x̄3 = x̄1x2x̄3 .
ϕ1 = (x̄1 ∨ x1x̄2) x3x4 = x̄1x3x4 ∨ x1x̄2x3x4 .
ϕ2 = φ3 = ψ3 = (x̄2 ∨ x2x3) ϕ1 = x̄1x3x4 ∨ x1x̄2x3x4 ,
ψ = ψ1 ∨ ψ2 ∨ ψ3 = x1x2 ∨ x̄1x2x̄3 ∨ x̄1x3x4 ∨ x1x̄2x3x4 .
Another way to look at the for loop of the procedure in Figure 7.1 relies on the
observation that computing a DNF of C̄1C̄2⋯C̄k−1Ck is essentially equivalent to
dualizing the function fk−1 = ⋁_{i=1}^{k−1} Ci . Indeed, if θk−1(X) is a DNF of f^d_{k−1}(X),
then the required DNF φk is easily derived from the expression θk−1(X̄)Ck(X).
Note, however, that the resulting DNF is usually not orthogonal, so that the
recursive call to Orthogonalize is needed here.
Incidentally, for positive functions, this relation between dualization and
orthogonalization procedures prompts an intriguing conjecture.
Conjecture 7.1. Every positive Boolean function f has an ODNF ψ whose length
is polynomially related to the length of the complete (i.e., prime irredundant) DNFs
of f and f d : More precisely, there exist positive constants α and β such that, if
p, q, and r respectively denote the number of terms of f , f d , and a shortest ODNF
of f , then asymptotically
α (p + q) ≤ r ≤ (p + q)^β .
Weaker forms of the lower bound conjecture have been informally stated by Ball
and Nemhauser [48] and Boros et al. [111] (see also Jukna et al. [541] for related
considerations and negative results in the context of decision trees and branching
programs). Note that if m denotes the number of terms of an arbitrary DNF of
f , then the bound p ≤ m holds in view of the unicity of the prime irredundant
representation of positive functions.
An interesting result concerning the length of ODNFs was established by Ball
and Nemhauser [48]. (The proof of this result involves arguments based on linear
programming duality. It is rather lengthy and we omit it here.)
Theorem 7.2. For all n ≥ 1, the shortest ODNF of f(x1, x2, …, xn, y1, y2, …, yn) =
⋁_{i=1}^{n} xi yi contains 2^n − 1 terms.
Observe that the dual of the function mentioned in Theorem 7.2 has 2^n prime
implicants, in agreement with Conjecture 7.1.
(Section 7.6), when we shall have more tools at hand with which to establish
shellability.
For now, we just provide a couple of small examples showing that shellable
DNFs exist, that some DNFs are not shellable, and that an arbitrary permutation
of the terms of a shellable DNF is not necessarily a shelling.
Example 7.2. Consider again the DNF φ = x1x2 ∨ x2x̄3 ∨ x3x4 , as in Example
7.1. The permutation (x1x2, x2x̄3, x3x4) is not a shelling of its terms, since
(x̄1 ∨ x̄2)(x̄2 ∨ x3) x3x4 = x̄1x3x4 ∨ x̄2x3x4
is not equivalent to an elementary conjunction. However, φ is shellable. Indeed,
when we consider its terms in the order (x2x̄3, x3x4, x1x2), we successively obtain
(x̄2 ∨ x3) x3x4 = x3x4 ,
and
(x̄2 ∨ x3)(x̄3 ∨ x̄4) x1x2 = x1x2x3x̄4 .
Thus, in particular, φ is equivalent to the orthogonal DNF x2x̄3 ∨ x3x4 ∨ x1x2x3x̄4 .
Finally, the positive DNF x1x2 ∨ x3x4 is not shellable, since neither (x̄1 ∨
x̄2) x3x4 nor (x̄3 ∨ x̄4) x1x2 is equivalent to an elementary conjunction.
However, it is absolutely not obvious that the “short” ODNF whose existence is
guaranteed by Theorem 7.3 can always be computed efficiently (say, in polynomial
time) for every shellable DNF. This question actually raises multiple side issues:
How difficult is it to recognize whether a DNF is shellable? How difficult is it
to find a shelling of a shellable DNF? How difficult is it to recognize whether a
given permutation of the terms of a DNF is a shelling? Given a shelling of a DNF,
how difficult is it to compute an equivalent ODNF? and so on. We tackle most of
these questions in forthcoming sections. From here on, however, we restrict our
attention to positive DNFs, since all published results concerning shellability have
been obtained for such DNFs.
Ck = ⋀_{i∈Ak} xi ,  k = 1, 2, …, m,  (7.4)
Lemma 7.2. Let Cℓ = ⋀_{i∈Aℓ} xi for ℓ = 1, 2, …, m, let k ∈ {1, 2, …, m}, and assume
that φk = C̄1C̄2⋯C̄k−1Ck is not identically zero. An arbitrary elementary
conjunction
Uk = ⋀_{i∈Ok} xi ⋀_{j∈Fk} x̄j  (7.6)
satisfies φk ≤ Uk if and only if
7.2 Shellings and shellability
(a) Ok ⊆ Ak ; and
(b) Fk ⊆ S(Ak ).
Example 7.4. Observe that the DNF φ^sh is not necessarily orthogonal. For
instance, when φ = x1x2 ∨ x3x4 , we find S(A1) = S(A2) = ∅, and φ = φ^sh .
C̄1C̄2⋯C̄k−1Ck = ⋀_{i∈Ak} xi ⋀_{j∈S(Ak)} x̄j .  (7.7)
φ^sh = ⋁_{k=1}^{m} ⋀_{i∈Ak} xi ⋀_{j∈S(Ak)} x̄j  (7.8)
is orthogonal.
(d) Aℓ ∩ S(Ak) ≠ ∅ for all 1 ≤ ℓ < k ≤ m.
(e) For all 1 ≤ ℓ < k ≤ m, there exist j ∈ Aℓ and h < k such that Ah \ Ak = {j}.
Proof. (a) ⇐⇒ (b). Statement (b) implies (a), by definition of shellings. Conversely,
assume that (C1, C2, …, Cm) is a shelling of φ. Then, C̄1C̄2⋯C̄k−1Ck must be an
elementary conjunction. But Lemma 7.2 implies that the right-hand side of (7.7)
is the smallest elementary conjunction implied by C̄1C̄2⋯C̄k−1Ck , and hence,
equality must hold in (7.7).
(b) ⇐⇒ (c). If (b) holds, then φ^sh is orthogonal, since the expressions
C̄1C̄2⋯C̄k−1Ck are pairwise orthogonal. Conversely, suppose that φ^sh is orthog-
onal. By Lemma 7.2, we know that

C̄1C̄2⋯C̄k−1Ck ≤ ⋀_{i∈Ak} xi ⋀_{j∈S(Ak)} x̄j  (7.9)
As a corollary of Theorem 7.4, we can now answer some of the questions posed
at the end of Section 7.2.1 (compare with Theorem 7.3).
Theorem 7.5. If φ = ⋁_{k=1}^{m} Ck is a positive DNF on n variables, there is an
O(nm2 )-time algorithm to test whether (C1 , C2 , . . . , Cm ) is a shelling of φ and,
when this is the case, to compute an orthogonal DNF of φ.
Proof. Given a permutation of the terms, it suffices to compute the expression (7.8)
and to test whether it is orthogonal.
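For positive DNFs, this test can be sketched in Python via condition (d) of Theorem 7.4; the encoding of terms as index sets and the function names are ours:

```python
def shadow(terms, k):
    """The set S(A_k): indices j such that A_h \\ A_k = {j} for some h < k."""
    S = set()
    for h in range(k):
        d = set(terms[h]) - set(terms[k])
        if len(d) == 1:
            S |= d
    return S

def is_shelling(terms):
    """Condition (d) of Theorem 7.4: (C_1, ..., C_m) is a shelling iff
    A_l intersects S(A_k) for every pair l < k."""
    return all(set(terms[l]) & shadow(terms, k)
               for k in range(1, len(terms)) for l in range(k))

def shelled_odnf(terms):
    """The DNF phi^sh of (7.8): pairs (A_k, S(A_k)), read as the conjunction
    of x_i (i in A_k) and NOT x_j (j in S(A_k)); it is an orthogonal DNF
    exactly when the given order of the terms is a shelling."""
    return [(set(A), shadow(terms, k)) for k, A in enumerate(terms)]
```

On the non-shellable DNF x1x2 ∨ x3x4 of Example 7.2, `is_shelling` returns False for either order of the two terms.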
f (x1 , . . . , x4 ) = x1 x2 ∨ x3 x4 .
On the other hand, the concept of weak shellability is rather vacuous, since it can be
shown that every positive Boolean function is weakly shellable (Boros et al. [111]).
Theorem 7.6. Every positive Boolean function can be represented by a shellable
DNF.
Proof. Let f be a positive function, let {CI = ⋀_{j∈I} xj | I ∈ I} denote the set of
all implicants (not necessarily prime) of f , and let π be a permutation that orders
the implicants by nonincreasing degree. Then, the DNF
φ = ⋁_{I∈I} ⋀_{j∈I} xj
represents f, and condition (d) in Theorem 7.4 can be used to verify that π
is a shelling of φ. Indeed, if Iℓ, Ik ∈ I and CIℓ precedes CIk in π, then there
is an index j ∈ Iℓ \ Ik . The set I = Ik ∪ {j} is in I and CI precedes CIk in
π. Therefore, j ∈ S(Ik), and we conclude that j ∈ Iℓ ∩ S(Ik), as required by
condition (d).
Since the size of the DNF produced in the proof of Theorem 7.6 can generally
be very large relative to the number of prime implicants of f , let us provide another
construction that uses a smaller subset of the implicants.
We first recall a well-known definition.
Definition 7.4. If I, J are two subsets of N = {1, 2, …, n}, we say that I precedes
J in the lexicographic order, and we write I <L J, if I ≠ J and the smallest element
of the symmetric difference (I \ J) ∪ (J \ I) belongs to I.
Now, for I ∈ I, let h(I ) denote the largest element of the subset I ⊆ {1, 2, . . . , n},
and let H (I ) = I \ {h(I )}. We call leftmost implicant of f any implicant CI of
f for which CH (I ) is not an implicant of f , and we denote by L the family of
leftmost implicants of f. Clearly, all prime implicants of f are in L; therefore, f
is represented by the DNF ψL = ⋁_{I∈L} ⋀_{j∈I} xj . Boros et al. [111] showed that
the lexicographic order <L defines a shelling of ψL . We leave the proof of this
claim as an end-of-chapter exercise and simply illustrate it on an example.
ψL = x1 x2 ∨ x1 x3 x4 ∨ x2 x3 x4 ∨ x3 x4
ψL^sh = x1x2 ∨ x1x̄2x3x4 ∨ x̄1x2x3x4 ∨ x̄1x̄2x3x4
is orthogonal.
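Leftmost implicants are easy to enumerate by brute force for small functions; the following Python sketch (our own encoding: f is a callable on 0/1 tuples, index sets stand for positive conjunctions) reproduces the family L for the function of the example above:

```python
from itertools import combinations

def leftmost_implicants(f, n):
    """Leftmost implicants of a positive function f on x_1, ..., x_n:
    index sets I such that C_I is an implicant of f while C_{H(I)} is not,
    where H(I) = I \\ {h(I)} drops the largest element h(I) of I."""
    def is_implicant(I):
        # for a positive f, C_I <= f iff f = 1 at the characteristic vector of I
        return f(tuple(int(t + 1 in I) for t in range(n))) == 1
    return [I for r in range(1, n + 1)
              for I in map(frozenset, combinations(range(1, n + 1), r))
              if is_implicant(I) and not is_implicant(I - {max(I)})]
```

For f = x1x2 ∨ x3x4 the enumeration returns {1,2}, {3,4}, {1,3,4}, {2,3,4}, i.e., the four terms of ψL.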
Let us finally observe that there exist families of positive functions for which
the smallest shellable DNF representation involves a number of terms that grows
exponentially with the number of its prime implicants:
On the other hand, g = C̄1C̄2⋯C̄m−1 = ⋁_{k=1}^{p} ⋀_{j∈Jk} x̄j , so that

C̄1C̄2⋯C̄m−1Cm = ( ⋁_{k=1}^{p} ⋀_{j∈Jk} x̄j ) ∧ ⋀_{i∈Am} xi
= ⋁_{k=1}^{p} ⋀_{j∈Jk} x̄j ⋀_{i∈Am} xi .  (7.12)

f^d = ⋁_{k : Jk∩Am ≠ ∅} ⋀_{j∈Jk} xj ∨ ⋁_{i∈Am} ⋁_{ℓ : Jℓ∩Am = ∅} ⋀_{j∈Jℓ∪{i}} xj ,  (7.13)
Note that the dualization procedure sketched in the proof of Theorem 7.8
is exactly the classical algorithm SD-Dualization presented in Chapter 4,
Section 4.3.2.
In Theorem 7.6, we established that every positive function can be represented
by a shellable DNF. This result, combined with Theorem 7.8, might raise the
impression that every positive function can be dualized in polynomial time. This
is, of course, a fallacy because, as shown in Theorem 7.7, the shortest shellable
representation of a positive Boolean function may be extremely large.
Finally, we mention that Theorem 7.8 generalizes a sequence of earlier results
on regular functions [74, 225, 735, 736] and on aligned functions [105], since these
are special classes of shellable positive DNFs (see Chapter 8 and the end-of-chapter
exercises). For aligned and regular functions, efficient dualization algorithms with
running time O(n2 m) have been proposed in [74, 105, 225, 736]. None of those
procedures, however, seems to be generalizable to shellable functions.
φ(x1, x2, …, xn) = ⋁_{k=1}^{m} ⋀_{j∈Ak} xj  (7.14)
has the lexico-exchange (LE) property with respect to (x1, x2, …, xn) if, for every
pair ℓ, k ∈ {1, 2, …, m} such that Aℓ <L Ak , there exists h ∈ {1, 2, …, m} such that
Ah <L Ak and Ah \ Ak = {j}, where j = min{i | i ∈ Aℓ \ Ak}.
We say that φ has the LE property with respect to a permutation
(σ (x1 ), σ (x2 ), . . . , σ (xn )) of its variables, or that σ is an LE order for φ, if the
DNF φ σ defined by
φ σ (σ (x1 ), σ (x2 ), . . . , σ (xn )) = φ(x1 , x2 , . . . , xn )
has the LE property with respect to (σ (x1 ), σ (x2 ), . . . , σ (xn )).
Finally, we simply say that φ has the LE property if φ has the LE property with
respect to some permutation of its variables.
Note that these definitions can be extended to positive functions by applying
them to the complete DNF of such functions (as in Section 7.2.3).
7.4 The lexico-exchange property
The LE property was introduced by Ball and Provan in [49] and further inves-
tigated in [111, 760]. Interest in this concept is motivated by the observation that
every DNF with the LE property is also shellable.
Theorem 7.9. If the DNF φ(x1 , x2 , . . . , xn ) given by equation (7.14) has the
LE property with respect to (x1 , x2 , . . . , xn ), then the lexicographic order on
{A1 , A2 , . . . , Am } induces a shelling of the terms of φ.
Proof. This follows by comparing Definition 7.5 and condition (e) in
Theorem 7.4.
In fact, most classes of shellable DNFs investigated in the literature have the
LE property (see [49, 105] and the examples in Section 7.6).
It is interesting to observe that the converse of Theorem 7.9 does not hold:
Namely, the lexicographic order may induce a shelling of the terms of a DNF,
even when this DNF does not have the LE property with respect to (x1 , x2 , . . . , xn ).
This is because Definition 7.5 not only determines the order of the terms of φ but
also imposes the choice of the element j in A- \ Ak .
Example 7.7. The DNF φ = x1x2 ∨ x2x3 ∨ x3x4 is shellable with respect to
the lexicographic order of its terms. However, φ does not have the LE prop-
erty with respect to (x1, x2, x3, x4): With Aℓ = {1, 2} and Ak = {3, 4}, we obtain
j = min{i | i ∈ Aℓ \ Ak} = 1, and there is no h such that Ah \ Ak = {1}. (But the reader
may check that φ has the LE property with respect to the permutation (x2, x3, x1, x4)
of its variables.)
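Definition 7.5 can be checked directly by examining all pairs of terms, as in the following Python sketch (terms encoded as index sets; names are ours); the two assertions below mirror the two statements of Example 7.7, the second one after relabeling the variables according to the permutation (x2, x3, x1, x4):

```python
def lex_lt(I, J):
    """I <L J: the smallest element of the symmetric difference lies in I."""
    diff = (I - J) | (J - I)
    return bool(diff) and min(diff) in I

def has_le_property(terms):
    """Direct check of Definition 7.5 for the identity order of the variables:
    for every pair A_l <L A_k there must exist a term A_h with A_h <L A_k and
    A_h \\ A_k = {j}, where j = min(A_l \\ A_k).  O(m^2) pairs are examined;
    the O(n^2 m) procedure LE-Property below is far more efficient."""
    terms = [frozenset(A) for A in terms]
    for Ak in terms:
        for Al in terms:
            if not lex_lt(Al, Ak):
                continue
            j = min(Al - Ak)
            if not any(lex_lt(Ah, Ak) and Ah - Ak == {j} for Ah in terms):
                return False
    return True
```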
Proof. Let φ = ⋁_{k=1}^{m} Ck = ⋁_{k=1}^{m} ⋀_{j∈Ak} xj .
Necessity. Property (a) is an immediate consequence of Definition 7.5. To estab-
lish property (b), we must show that, if φ involves x1 and Ck is any term of φ0,
then Ck is absorbed by some term of φ1. Let Cℓ be any term of φ1, and observe that
Aℓ <L Ak and min{i | i ∈ Aℓ \ Ak} = 1. Since φ has the LE property, there exists
h ∈ {1, 2, …, m} such that Ah <L Ak and Ah \ Ak = {1}. Then, ⋀_{j∈Ah\{1}} xj is a
term of φ1 which absorbs Ck , as required.
Sufficiency. Suppose that (a) and (b) hold. If φ does not involve x1, then (a)
implies that φ = φ0 has the LE property. So, assume that x1 is a leader for φ, let
Aℓ <L Ak , and let j = min{i | i ∈ Aℓ \ Ak}. If Cℓ and Ck are both in φ1 or both in
φ0, then condition (a) implies that φ has the LE property. Otherwise, it must be
the case that Cℓ is a term of φ1, Ck is a term of φ0, and j = 1. By definition of
leaders, there is a term in φ1, say ⋀_{j∈Ah\{1}} xj , which absorbs Ck . Then, however,
Ah <L Ak and Ah \ Ak = {1}, showing that φ has the LE property.
(a) If φ is identically 0, then T (φ) is empty, that is, T (φ) has no vertices.
(b) If φ is identically 1, then T (φ) has exactly one unlabeled vertex, namely,
its root r(φ).
(c) If φ(x1 , x2 , . . . , xn ) is not identically 1, then let φ = x1 φ1 ∨ φ0 (where φ0
and φ1 do not involve x1 , as usual); build T (φ) by introducing a root r(φ)
labeled by x1 , creating disjoint copies of T (φ0 ) and T (φ1 ), and making
r(φ1 ) (respectively, r(φ0 )) the left son (respectively, the right son) of r(φ).
(If either r(φ1 ) or r(φ0 ) is not defined, i.e., if either φ1 or φ0 is identically
zero, then the corresponding son of r(φ) does not exist.)
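The recursive construction (a)–(c) can be sketched in Python as follows (our encoding: a positive DNF is a collection of index sets; for simplicity, each subtree branches on the smallest variable still occurring, which skips variables absent from φ but does not affect the traversal procedure Implicant, since that procedure tests vertex labels):

```python
from collections import namedtuple

# a vertex of T(phi): leaf when var is None; absent sons are None
Node = namedtuple("Node", "var left right", defaults=(None, None, None))

def build_tree(terms):
    """The binary tree T(phi) of a positive DNF, per rules (a)-(c)."""
    terms = {frozenset(A) for A in terms}
    if not terms:
        return None                       # phi identically 0: empty tree
    if frozenset() in terms:
        return Node()                     # phi identically 1: a single leaf
    v = min(min(A) for A in terms)        # branching variable x_v, the root label
    phi1 = {A - {v} for A in terms if v in A}      # terms containing x_v
    phi0 = {A for A in terms if v not in A}        # terms avoiding x_v
    return Node(v, build_tree(phi1), build_tree(phi0))
```

In accordance with Theorem 7.11 below, the tree built for an absorption-free DNF has exactly one leaf per term.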
It is obvious that T (φ) has height at most n. Except for the leaves, all
vertices of T (φ) are labeled by a variable. Moreover, the leaves themselves cor-
respond in a natural way to the terms of φ. Indeed, for an arbitrary leaf v, let
[Figure 7.2. A binary tree T(φ) with vertices u1, u2, …, u12; the internal vertices are labeled by the variables x1, …, x5.]
r(φ) = u1 , u2 , . . . , uq = v be the vertices of T (φ) lying on the unique path from the
root r(φ) to v. Define
P(v) = { j | for some k ∈ {1, …, q − 1}, uk is labeled by xj and uk+1 is the left son of uk }.  (7.16)
Theorem 7.11. For every positive DNF φ, the mapping v ↦ ⋀_{j∈P(v)} xj defines
a one-to-one correspondence between the leaves of T(φ) and the terms of φ.
Thus, T (φ) has exactly m leaves and at most nm vertices. It is actually easy
to see that T (φ) sorts the terms of φ in lexicographic order, from left to right.
Moreover, T (φ) can be set up in time O(nm).
With the data structure T(φ) at hand, let us now revert to the query: “Is ⋀_{j∈A} xj
absorbed by a term of φ?” Our next goal is to show that, when φ has the LE property
with respect to (x1 , x2 , . . . , xn ), the query is correctly answered by the procedure
Implicant(A) in Figure 7.3, consisting of one traversal of T (φ) along a path from
root to leaf.
Example 7.9. The reader may want to apply the procedure Implicant(A) to the
tree T (φ) displayed in Figure 7.2, with A = {2, 4, 5}, and check that it returns the
answer False.
Procedure Implicant(A)
Input: A subset A of {1, 2, . . . , n}.
Output: True if ⋀_{j∈A} xj is absorbed by a term of the DNF φ represented by T(φ),
False otherwise.
begin
if T (φ) is empty (that is, φ = 0) then return False;
u1 := r(φ);
for k = 1 to n do
begin
if uk is a leaf of T (φ) then return True
else if k ∈ A then
begin
if uk has a left son then uk+1 := leftson(uk ) else uk+1 := rightson(uk )
end
else if k ∉ A then
begin
if uk has a right son then uk+1 := rightson(uk ) else return False
end
end
end
However, even for an arbitrary DNF φ, Implicant works correctly “in half of
the cases”: Namely, it never errs on the answer True.
Theorem 7.12. Let φ(x1, x2, …, xn) be a positive DNF, let T(φ) be the associated
binary tree, and let A ⊆ {1, 2, …, n}. If the procedure Implicant(A) returns the
answer True and terminates at the leaf v of T(φ), then ⋀_{j∈A} xj is absorbed by
the term of φ associated with v.
Proof. Suppose that the procedure eventually reaches the leaf v = uq and returns
the answer True. Let ⋀_{j∈P(v)} xj be the term of φ associated with v by (7.16).
From the description of Implicant, we see that, if k ∉ A, then uk+1 is the right son
of uk , k = 1, 2, …, q. Hence, by construction of P(v), k ∉ P(v). Thus, P(v) ⊆ A,
and ⋀_{j∈A} xj is absorbed by the term ⋀_{j∈P(v)} xj of φ.
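The traversal itself is short; here is a Python sketch operating on tree vertices with attributes var, left, right (a leaf has var = None; the encoding is ours):

```python
def implicant(tree, A):
    """One root-to-leaf traversal of T(phi): is AND_{j in A} x_j absorbed by a
    term of phi?  A True answer is always correct (Theorem 7.12); False is also
    guaranteed correct when phi has the LE property with respect to the
    variable order underlying the tree."""
    u = tree
    while u is not None:
        if u.var is None:                    # reached a leaf: absorption found
            return True
        if u.var in A:                       # the variable is available:
            u = u.left if u.left is not None else u.right   # prefer using it
        elif u.right is not None:            # unavailable: must skip it
            u = u.right
        else:
            return False                     # unavailable and unskippable
    return False                             # empty tree: phi identically 0
```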
More interestingly for our purpose, Provan and Ball [760] proved that Impli-
cant works correctly when φ has the LE property with respect to (x1 , x2 , . . . , xn ).
Note that the DNF φ considered in Example 7.10 does not have the LE property
with respect to (x1 , x2 , x3 , x4 , x5 ) (although it has it with respect to the permutation
(x1 , x4 , x5 , x2 , x3 ) of its variables).
Procedure LE-Property(φ)
Input: A DNF φ(x1, x2, …, xn) = ⋁_{k=1}^{m} ⋀_{j∈Ak} xj , where A1 <L A2 <L ⋯ <L Am .
Output: True if φ has the LE property with respect to (x1 , x2 , . . . , xn ), False
otherwise.
begin
set up the binary tree T (φ);
for k = 2 to m do
begin
find the leaf of T(φ), say vk , associated with the term ⋀_{j∈Ak} xj ;
for each vertex u on the path from r(φ) to vk
if vk is a successor of the right son of u and if u has a left son then
begin
let xi be the label of u;
if Implicant(Ak ∪ {i}) = False then return False;
end
end
return True;
end
We can now state (in Figure 7.4) the efficient procedure proposed by Provan
and Ball [760] to test whether a DNF φ(x1 , x2 , . . . , xn ) has the LE property with
respect to the identity permutation (x1 , x2 , . . . , xn ).
Proof. Assume first that φ has the LE property with respect to (x1, x2, …, xn).
Trivially, for all k ∈ {1, 2, …, m} and all i ∈ {1, 2, …, n}, ⋀_{j∈Ak∪{i}} xj is absorbed by
the term ⋀_{j∈Ak} xj . Hence, by Theorem 7.13, Implicant(Ak ∪ {i}) always returns
the answer True, and LE-Property eventually returns True.
Conversely, assume that LE-Property returns True, and consider two sets
Aℓ, Ak with Aℓ <L Ak and i = min{j | j ∈ Aℓ \ Ak}. Let vk and vℓ be the leaves
of T(φ) associated with Ak and Aℓ, respectively. On the path from r(φ) to vk ,
consider the last vertex u that is an ancestor of vℓ. Then, u is labeled by xi , and
Implicant(Ak ∪ {i}) is called in the innermost for loop of the procedure. When
running on the input Ak ∪ {i}, Implicant traverses T(φ) until vertex u, then visits
the left son of u, and eventually returns the value True (by assumption). By
Theorem 7.12, this means that ⋀_{j∈Ak∪{i}} xj is absorbed by the term Ch = ⋀_{j∈Ah} xj
associated with the leaf reached by Implicant. It follows that Ah <L Ak and
Ah \ Ak = {i}; hence, φ has the LE property.
We have mentioned that T (φ) can be set up in time O(nm). LE-Property
makes at most nm calls on Implicant, and each of these calls can be executed in
time O(n). Hence, the overall running time of LE-Property is O(n2 m).
In contrast with the previous results, Provan and Ball [760] pointed out that
the existence of an efficient procedure to determine whether a DNF has the LE
property with respect to some unknown permutation of its variables, is far from
obvious. Boros et al. [111] settled this question in the negative by proving the
following result:
We omit the (rather technical) proof of this result. It should be noted, however,
that the complexity of this recognition problem remains open for DNFs of degree
3 or 4. The case of quadratic DNFs is the topic of Section 7.5.
that eh \ ek = {j}. The latter condition means that the edge eh shares at least one
vertex (namely, vertex j) with eℓ, and shares exactly one vertex with ek . This
is easily seen to be equivalent to the condition that eℓ ∪ ek does not induce 2K2
in Gk .
We are now ready for our main characterization of shellable graphs [66].
7.6 Applications
We conclude this chapter with a brief presentation of three generic classes of
shellable DNFs arising in reliability theory and in game theory. We refer to Ball
and Nemhauser [48], Ball and Provan [49, 760], or Colbourn [205, 206] for a more
detailed discussion.
Application 7.1. (Undirected all-terminal reliability.) Let G = (N , E) be a con-
nected undirected graph with E = {e1 , e2 , . . . , em }, and let T be the collection of all
spanning trees of G, viewed as subsets of E. Let us associate with every edge ei ,
i = 1, 2, . . . , m, a Boolean variable xi indicating whether the edge is operational
or failed. Then, the DNF φ = ⋁_{T∈T} ⋀_{ei∈T} xi takes value 1 exactly when the graph
formed by the operational edges is connected. In the terminology of Section 1.13.4,
φ represents the structure function of the reliability system whose minimal pathsets
are the spanning trees of G.
We claim that φ satisfies the LE property with respect to (x1, x2, …, xm) (i.e.,
with respect to an arbitrary permutation of its variables). Indeed, let Tℓ, Tk be two
spanning trees with Tℓ <L Tk , and let j = min{i | ei ∈ Tℓ \ Tk}. From elementary
properties of trees, there exists an edge ei ∈ Tk \ Tℓ such that Tk ∪ {ej} \ {ei} is a
spanning tree. Call this spanning tree Th . Then, Th \ Tk = {ej}, and Th <L Tk , as
required by the LE property.
This result implies, in particular, that the all-terminal reliability of a graph can
be computed in time polynomial in the number of spanning trees of the graph (see
Ball and Nemhauser [48] for details).
7.7 Exercises
1. Let C1, C2, …, Ck and U be elementary conjunctions. Prove that it is co-NP-
complete to decide whether C̄1C̄2⋯C̄k−1Ck ≤ U. (Compare with Lemma
7.2.) Hint: Let Ck = y and U = z, where y, z are variables not occurring in
C1, C2, …, Ck−1 .
2. Complete the argument following the proof of Theorem 7.6: Show that the
DNF ψL is shellable, where ψL is the disjunction of all leftmost implicants
of a positive function.
3. A positive DNF φ = ⋁_{k=1}^{m} ⋀_{i∈Ak} xi is aligned if, for every k = 1, 2, …, m
and for every j ∉ Ak such that j < hk = max{i : i ∈ Ak}, there exists Aℓ ⊆
(Ak ∪ {j}) \ {hk}. Prove that every aligned DNF has the LE property (see
Boros [105] and Section 8.9.2).
4. Complete the proof of Theorem 7.16 (see Boros [105]).
5. Let φ(x1, x2, …, xn) = ⋁_{k=1}^{m} ⋀_{i∈Ak} xi be a DNF such that |Ak| = n − 2 for
k = 1, 2, . . . , m. Show that φ is shellable if and only if the graph G = (N , E)
is connected, where N = {1, 2, . . . , n} and E = {N \ Ak | k = 1, 2, . . . , m}.
9. The article [111] states a stronger form of Theorem 7.8, namely, it claims
that:
Claim. If a Boolean function in n variables can be represented by a shellable
positive DNF of m terms, then its dual can be represented by a shellable
DNF of at most nm terms.
Unfortunately, the proof given in [111] is flawed, so that the validity of the
claim (namely, the existence of a short, shellable DNF of the dual) remains
open. Can you prove or disprove it?
8
Regular functions
In this chapter we investigate the main properties of regular Boolean functions. This
class of functions constitutes a natural extension of the class of threshold functions,
and, as such, has repeatedly and independently been “rediscovered” by several
researchers over the last 40 years. It turns out that regular functions display many
of the most interesting properties of threshold functions, and that these properties
are, accordingly, best understood by studying them in the appropriate context of
regularity. From an algorithmic viewpoint, regular functions constitute one of the
most tractable classes of Boolean functions: Indeed, fundamental problems such
as dualization, computation of reliability, or set covering are efficiently solvable
when associated with regular functions. Besides its more obvious implications, this
nice algorithmic behavior will eventually pave the way for the efficient recognition
of threshold functions, which are discussed in the next chapter.
Theorem 8.1. For a Boolean function f (x1 , x2 , . . . , xn ), and for i, j ∈ {1, 2, . . . , n},
xi ≈f xj if and only if
As Example 8.2 illustrates, certain pairs of variables may turn out to be incom-
parable with respect to the strength relation ⪰f ; in other words, ⪰f is generally
not a complete relation. On the other hand, as we prove now, the strength relation
always defines a preorder, that is, a reflexive and transitive relation.
Theorem 8.2. The strength relation is a preorder on the set of variables of every
Boolean function.
Proof. The strength relation is obviously reflexive. To see that it is also transitive,
consider a function f(x1, x2, …, xn), three indices i, j, k such that xi ⪰f xj and
xj ⪰f xk , and a point X∗ ∈ B^n with xi∗ = xk∗ = 0. We must show that f(X∗ ∨ ek) ≤
f(X∗ ∨ ei).
If xj∗ = 0, then f(X∗ ∨ ek) ≤ f(X∗ ∨ ej) ≤ f(X∗ ∨ ei) and we are done. If
xj∗ = 1, then let Y∗ ∈ B^n be the point obtained by switching the j-th component of
X∗ from 1 to 0; thus, yj∗ = 0 and X∗ = Y∗ ∨ ej . Then,

f(X∗ ∨ ek) = f(Y∗ ∨ ej ∨ ek)
≤ f(Y∗ ∨ ei ∨ ek)  (since xi ⪰f xj)
≤ f(Y∗ ∨ ei ∨ ej)  (since xj ⪰f xk)
= f(X∗ ∨ ei),

and the proof is complete.
We pointed out in the introductory paragraphs of this section that the strength
relation associated with a threshold function is always complete. More precisely,
we can state:
Threshold functions are not the only Boolean functions featuring a complete
strength preorder. For instance, the function displayed in Example 8.1 has a com-
plete strength preorder but is not a threshold function, since it is not monotone (the
reader will easily verify that every threshold function is monotone). If we restrict
our attention to monotone functions, then it can be shown that all functions of five
variables for which the strength preorder is complete are threshold functions, but
this implication fails for functions of six variables or more (see Winder [917] and
Exercise 11 at the end of this chapter).
Example 8.3. The function f in Example 8.2 is not regular, since x1 and x4 are
not comparable in the preorder ⪰f .
On the other hand, the function g(x1 , x2 , x3 ) = x1 x2 ∨x1 x3 is regular with respect
to (x1 , x2 , x3 ), and the function h(x1 , x2 , . . . , x5 ) = x1 x2 ∨ x1 x3 ∨ x1 x4 x5 ∨ x2 x3 x4
is regular with respect to (x1 , x2 , . . . , x5 ).
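For small functions, both the strength relation and regularity can be verified by brute force over all points, directly from the definitions; the following Python sketch (our own encoding: f is a callable on 0/1 tuples, indices 0-based) confirms the claims made for g:

```python
from itertools import product

def stronger(f, n, i, j):
    """x_i >=_f x_j for f: B^n -> {0,1}: at every point X with
    x_i = x_j = 0, we must have f(X v e_i) >= f(X v e_j)."""
    for X in product((0, 1), repeat=n):
        if X[i] or X[j]:
            continue
        Xi = tuple(1 if t == i else v for t, v in enumerate(X))
        Xj = tuple(1 if t == j else v for t, v in enumerate(X))
        if f(Xi) < f(Xj):
            return False
    return True

def is_regular(f, n):
    """f is regular w.r.t. (x_1, ..., x_n) when x_1 >=_f x_2 >=_f ... >=_f x_n."""
    return all(stronger(f, n, i, i + 1) for i in range(n - 1))
```

For instance, x1x2 ∨ x3x4 fails the test between x2 and x3, so it is not regular with respect to the identity order.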
Because it is so natural and (as we will see) fruitful, the regularity concept has
been “rediscovered” several times in various fields of applications (see Muroga,
Toda, and Takasu [700]; Paull and McCluskey [732]; Winder [916]; Neumaier
[709]; Golumbic [398]; Ball and Provan [49], etc.). It constitutes our main object
of study in this chapter.
Before diving more deeply into this topic, however, let us first offer the impatient
reader an illustration of how the notion of strength preorder can be used in a game-
theoretical framework. More applications are presented at the end of Section 8.2,
after we have become better acquainted with the elementary properties of the
strength relation.
Application 8.1. (Political science, Game theory.) The legislative body in Boole-
land consists of 45 representatives, 11 senators and a president. In order to be
passed by this legislature, a bill must receive
(1) at least half of the votes in the House of Representatives and in the Senate,
as well as the president’s vote, or
(2) at least two-thirds of the votes in the House of Representatives and in the
Senate.
(The knowledgeable reader will recognize that this lawmaking process is a slightly
simplified version of the system actually in use in the United States.)
As usual, we can model this voting mechanism by a monotone Boolean function
f(r1, …, r45, s1, …, s11, p), where variable ri (respectively, sj , p) takes value 1 if
representative i (respectively, senator j, the president) casts a “Yes” vote, and takes
value 0 otherwise (1 ≤ i ≤ 45 and 1 ≤ j ≤ 11).
the voting patterns described by rules (1) and (2) above.
A more detailed description of f can be obtained as follows: For k, n ≥ 1,
denote by gk (x1 , x2 , . . . , xn ) the “k–majority” function on n variables, that is, the
threshold function defined by
gk(x1, x2, …, xn) = 1 ⟺ Σ_{i=1}^{n} xi ≥ k.
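A Python sketch of the voting rule in terms of k-majority functions follows. The thresholds 23, 6, 30, and 8 are our reading of rules (1) and (2) — "at least half" of 45 (respectively, 11) votes taken as 23 (respectively, 6), and "at least two-thirds" rounded up to 30 (respectively, 8) — and are not stated explicitly in the text:

```python
def g(k, votes):
    """The k-majority threshold function: 1 iff at least k inputs equal 1."""
    return int(sum(votes) >= k)

def f(r, s, p):
    """Booleland's lawmaking rule: rule (1) -- half of each chamber plus the
    president -- or rule (2) -- two-thirds of each chamber.  The vote counts
    23/6/30/8 are our assumed thresholds (see the lead-in above)."""
    return g(23, r) & g(6, s) & p | g(30, r) & g(8, s)
```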
8.2 Basic properties
One can easily verify that, with respect to the strength preorder associated
to f ,
ωi = | {X ∈ T : xi = 1} |
= | {X ∈ T : xi = xj = 1} | + | {X ∈ T : xi = 1, xj = 0} |
≥ | {X ∈ T : xi = xj = 1} | + | {X ∈ T : xi = 0, xj = 1} |
= | {X ∈ T : xj = 1} |
= ωj .
If xi ≻f xj , then there exists at least one point X∗ ∈ B^n such that xi∗ = xj∗ = 0,
f(X∗ ∨ ei) = 1 and f(X∗ ∨ ej) = 0. Thus, the above inequality is strict.
If xi ≈f xj , then f is symmetric in xi , xj , and hence, ωi = ωj .
Having clarified this point, let us now turn to the issue of deciding whether two
variables are comparable with respect to the strength relation. We only deal with
positive functions expressed by their complete (i.e., prime irredundant) DNF, as
the same question turns out to be NP-hard for arbitrary DNFs (see Exercise 2 at
the end of this chapter).
Theorem 8.5. Let f(x1, x2, …, xn) be a positive Boolean function, and let i, j be
distinct indices in {1, 2, …, n}. Write the complete DNF of f in the form α xi xj ∨
β xi ∨ γ xj ∨ δ, where α, β, γ, and δ are positive DNFs that involve neither xi nor
xj . Then
xi ⪰f xj if and only if β ≥ γ .
Proof. Without loss of generality, suppose that i = 1 and j = 2. For X = (0, 0, Y ) ∈
B^n, we get f(X ∨ e1) = β(Y) ∨ δ(Y), and f(X ∨ e2) = γ(Y) ∨ δ(Y). Hence, by
definition of the strength relation, x1 ⪰f x2 if and only if β(Y) ∨ δ(Y) ≥ γ(Y) ∨ δ(Y)
for all Y ∈ B n−2 . To establish the theorem, note that β ≥ γ trivially implies
β ∨ δ ≥ γ ∨ δ. For the converse implication, assume that β ∨ δ ≥ γ ∨ δ, and
let C be a prime implicant of γ . Since C ≤ γ ≤ β ∨ δ, the DNF β ∨ δ contains a
term B which absorbs C. Note that B cannot be a term of δ (hence, of f ), since B
absorbs Cx2 , which is, by assumption, a prime implicant of f . Hence, B must be
a term of β. We conclude that β ≥ γ , and the proof is complete.
β = x3 x4 ∨ x3 x5 > γ = x3 x4 x6 ∨ x3 x4 x7 ∨ x3 x5 x7 ,
β = x2 x3 ∨ x3 x5 and γ = x2 x3 x6 ∨ x2 x3 x7 ∨ x5 x6 .
Since neither β ≥ γ nor β ≤ γ holds, we conclude that x1 and x4 are not compa-
rable with respect to f .
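Theorem 8.5 yields a simple mechanical test when prime implicants are stored as sets of variable indices. A brute-force Python rendering (the representation and names are ours):

```python
def dnf_geq(beta, gamma):
    """beta >= gamma for positive DNFs given as lists of terms (sets of variable
    indices): every term of gamma must contain some term of beta."""
    return all(any(B <= C for B in beta) for C in gamma)

def stronger(terms, i, j):
    """Theorem 8.5: x_i >=_f x_j iff beta >= gamma, where beta (resp. gamma)
    collects the terms of the complete DNF that contain x_i but not x_j
    (resp. x_j but not x_i), with the distinguished variable removed."""
    beta  = [t - {i} for t in terms if i in t and j not in t]
    gamma = [t - {j} for t in terms if j in t and i not in t]
    return dnf_geq(beta, gamma)

# Complete DNF of f = x1 x2 ∨ x1 x3 ∨ x1 x4 x5 ∨ x2 x3 x4 (Example 8.8).
terms = [frozenset(t) for t in ({1, 2}, {1, 3}, {1, 4, 5}, {2, 3, 4})]
assert stronger(terms, 1, 2) and stronger(terms, 2, 3) and stronger(terms, 4, 5)
assert not stronger(terms, 5, 4)   # x5 is strictly weaker than x4
```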
Let us now see how the strength preorder behaves under some fundamen-
tal transformations of Boolean functions, namely, restriction, composition, and
dualization.
Proof. Let h = g(f1 , f2 , . . . , fm ), and let X ∗ be a point of B n with xi∗ = xj∗ = 0. For
k = 1, 2, . . . , m, fk (X ∗ ∨ ei ) ≥ fk (X ∗ ∨ ej ). Hence, by positivity of g, h(X∗ ∨ ei ) ≥
h(X∗ ∨ ej ).
f (0, 1, Z ∗ ) = φ0 (1, Z ∗ ) = φ0 (Y ∗ ) = 1,
f (1, 0, Z ∗ ) = φ1 (0, Z ∗ ) ∨ φ0 (0, Z ∗ ) = φ1 (0, Z ∗ ).
where f is a positive Boolean function (cf. Section 1.13.6 in Chapter 1). If xi and
xj are two variables such that xi ⪰f xj and ci ≤ cj , then one easily verifies that
there exists an optimal solution X∗ of (8.2)–(8.4) such that xi∗ ≤ xj∗ . This fact can
be used in an enumerative approach to the solution of (8.2)–(8.4). Indeed, as soon
as variable xi has been fixed to 1 in a branch of the enumeration tree, then xj
can automatically be fixed to 1. More generally, the conclusion that xi ≤ xj can
also be handled as a logical condition to be satisfied by the optimal solution of
the problem (see Application 2.4 in Section 2.1).
In particular, if c1 ≤ c2 ≤ · · · ≤ cn and if f is regular with x1 ⪰f x2 ⪰f · · · ⪰f xn ,
then (8.2)–(8.4) has an optimal solution X ∗ satisfying x1∗ ≤ x2∗ ≤ · · · ≤ xn∗ . Under
Application 8.3. (Game theory). Since a simple game is nothing but a positive
Boolean function, we can speak of the strength preorder of a simple game (see
Section 1.13.3). What can be said about this preorder in a game-theoretic setting?
As discussed in Application 8.1, the strength preorder can be naturally inter-
preted as providing an ordinal ranking of the players according to their relative
power in the game. On the other hand, we have defined in Section 1.13.3 different
cardinal measures of power, or power indices, associated with a simple game. In
particular, we have observed that the Banzhaf indices are monotone transformations
of the Chow parameters of the associated Boolean function. Hence, it
follows from Theorem 8.4 that these power indices are consistent with the strength
preorder, in the following sense: If variable xi is (strictly) stronger than variable
xj with respect to the strength preorder of the game, then the Banzhaf index of
player i is (strictly) larger than the Banzhaf index of player j .
The notion of strength preorder has been extended by Maschler and Peleg [673]
to cooperative games in characteristic function form (i.e., pseudo-Boolean func-
tions, or real-valued functions of 0-1 variables; see Chapter 13).
fH (x1 , x2 , . . . , xn ) = ⋁_{A∈E} ⋀_{j∈A} xj .
Note that H is a tactical configuration if and only if all terms of fH have the same
degree k and every variable appears in r terms of fH . Then, Neumaier’s result
states: If H is a tactical configuration such that fH is regular, then E = {A ⊆ N :
|A| = k}. To see that this is indeed the case, consider any two variables xi and xj
with xi ⪰f xj and rewrite fH in the form: fH = αxi xj ∨ βxi ∨ γ xj ∨ δ. Theorem
8.7 implies that β ≥ γ . But then, using the definition of a tactical configuration,
φ sh = ⋁_{k=1}^{m} ( ⋀_{j∈Ak} xj ) ( ⋀_{j∈Sk} x̄j )     (8.5)
and
Relf (p1 , p2 , . . . , pn ) = Prob[f (X) = 1] = ∑_{k=1}^{m} ( ∏_{j∈Ak} pj ) ( ∏_{j∈Sk} (1 − pj ) ).     (8.6)
Before proving Theorem 8.13, we illustrate it by means of a small example.
Example 8.8. Let f = x1 x2 ∨ x1 x3 ∨ x1 x4 x5 ∨ x2 x3 x4 . Then, x1 ⪰f x2 ⪰f x3 ⪰f x4 ⪰f x5 . We obtain
f = x1 x2 ∨ x1 x̄2 x3 ∨ x1 x̄2 x̄3 x4 x5 ∨ x̄1 x2 x3 x4 ,
Prob[f (X) = 1]
= p1 p2 + p1 (1 − p2 )p3 + p1 (1 − p2 )(1 − p3 )p4 p5 + (1 − p1 )p2 p3 p4 .
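The two expressions can be checked numerically. In the Python sketch below (our illustration), the closed form read off the shelled DNF is compared with a direct summation over all 32 points:

```python
from itertools import product

# f = x1 x2 ∨ x1 x3 ∨ x1 x4 x5 ∨ x2 x3 x4 (Example 8.8).
f = lambda x: (x[0] & x[1]) | (x[0] & x[2]) | (x[0] & x[3] & x[4]) | (x[1] & x[2] & x[3])

def rel_bruteforce(p):
    """Prob[f(X) = 1] when the x_i are independent with Prob[x_i = 1] = p_i."""
    total = 0.0
    for X in product([0, 1], repeat=5):
        w = 1.0
        for xi, pi in zip(X, p):
            w *= pi if xi else 1 - pi
        total += f(X) * w
    return total

def rel_shelling(p):
    """Closed form (8.6) for Example 8.8, read off the disjoint (shelled) DNF."""
    p1, p2, p3, p4, p5 = p
    return (p1 * p2 + p1 * (1 - p2) * p3
            + p1 * (1 - p2) * (1 - p3) * p4 * p5 + (1 - p1) * p2 * p3 * p4)

p = [0.9, 0.8, 0.7, 0.6, 0.5]
assert abs(rel_bruteforce(p) - rel_shelling(p)) < 1e-12
# With p_i = 1/2, the reliability times 2^5 counts the true points of f.
assert rel_bruteforce([0.5] * 5) * 32 == sum(f(X) for X in product([0, 1], repeat=5))
```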
Proof. Assume, without loss of generality, that the prime implicants of f are listed
in lexicographic order, that is, A1 <L A2 <L . . . <L Am (remember Definition 7.4).
Then, the statement is an immediate corollary of Theorem 7.4 if we can prove that,
for k = 1, 2, . . . , m, the set Sk is the shadow of Ak , that is,
Note that the computation of the expressions (8.5) and (8.6) does not require
explicitly computing the lexicographic order of A1 , A2 , . . . , Am , that is, the shelling
of f . All that is actually needed is the knowledge of the strength (complete)
preorder on the variables of f .
As a corollary of Theorem 8.13, we observe that the number of true points and
the Chow parameters of a regular Boolean function can be efficiently computed.
Indeed, as pointed out in Section 1.13.4, the number of true points of a function
f is equal to 2n times the probability that f takes the value 1 when each variable
takes value 0 or 1 with probability 12 . In view of equation (8.6), this probability is
given by the expression
Relf (1/2, 1/2, . . . , 1/2) = ∑_{k=1}^{m} ( ∏_{j∈Ak} 1/2 ) ( ∏_{j∈Sk} 1/2 ) = ∑_{k=1}^{m} (1/2)^{µk} ,
and
(1, 1, 1) ⪰ (1, 1, 0) ⪰ (1, 0, 1) ⪰ (0, 1, 1) ⪰ (0, 1, 0) ⪰ (0, 0, 1) ⪰ (0, 0, 0),
but the points (1, 0, 0) and (0, 1, 1) are not comparable with respect to ⪰.
Some authors prefer to take condition (b) in Theorem 8.14 as the defining
property of regular functions (up to a permutation of the variables). In particular,
consideration of the “left-shift” relation allows us to introduce in a natural way
some special types of false points and true points that play an interesting role in
computational manipulations of regular and threshold functions (see e.g., Bradley,
Hammer, and Wolsey [148], Muroga [698], and Section 9.4.2).
Definition 8.5. A point X ∗ ∈ B n is a ceiling of the Boolean function
f (x1 , x2 , . . . , xn ) if X∗ is a false point of f and if no other false point of f is a left-shift of X∗ .
such that (0, 1, 0) ⪰ X ∗ must be false points of rA . Moreover, by definition of a ceiling, all left-shifts of (0, 1, 0) are true points of rA . A look at Example 8.9 indicates
that this classification exhausts all points of B3 . Hence, rA is uniquely determined.
One easily verifies that rA (x1 , x2 , x3 ) = x1 ∨ x2 x3 .
Peled and Simeone [735] used Theorem 8.15 to show that, if r(n) is the number
of regular functions on n variables, then log2 r(n) ≥ c n^{−3/2} 2^n for some constant c.
Regularity Recognition
Instance: The complete DNF of a positive Boolean function f .
Output: True if f is regular, False otherwise.
that f is regular if and only if σ coincides with the strength preorder of f . “Quickly”
means here in O(n2 + nm) operations. Next, making use of an appropriate data
structure, we explain how to check in O(n2 m) steps whether σ actually is the
strength preorder of f and, hence, whether f is regular.
(a) if xi ≈f xj then R i = R j ;
(b) if xi ≻f xj then R i >L R j .
Proof. Consider two variables xi , xj , and write the complete DNF of f in the form
α xi xj ∨ β xi ∨ γ xj ∨ δ as in Theorem 8.5. If xi ≈f xj , then β = γ , and hence,
R i = R j . So, assume now that xi ≻f xj . Then β > γ . For d = 0, 1, . . . , n − 1, define
If B(d) = C(d) for all d, then β = γ , which contradicts our assumption. Thus,
there exists a smallest d ∗ such that B(d ∗ ) = C(d ∗ ). We claim that C(d ∗ ) ⊂ B(d ∗ ).
Indeed, let P ∈ C(d ∗ ). Since β > γ , there exists a term of β, say ⋀_{k∈Q} xk , such
that Q ⊆ P . If Q is not equal to P , then |Q| < |P | = d ∗ , and hence, Q ∈ B(d)
for some d < d ∗ . By our choice of d ∗ , this implies that Q ∈ C(d). But, then both
⋀_{k∈P} xk and ⋀_{k∈Q} xk are terms of γ , a contradiction. So, we conclude that Q = P ,
and hence, P ∈ B(d ∗ ) as required.
From the assertions B(d) = C(d) for d < d ∗ and C(d ∗ ) ⊂ B(d ∗ ), one easily
derives rid = rj d for d < d ∗ and rj d ∗ < rid ∗ , which completes the proof.
R = ( 0 2 1 0 0
      0 1 1 0 0
      0 1 1 0 0
      0 0 2 0 0
      0 0 1 0 0 ) .
Since x1 ≻f x2 , the first row of R is lexicographically larger than its second
row. Also, the second and third rows of R are identical, since x2 ≈f x3 .
Note that the strength preorder ⪰f does not coincide perfectly, in general, with
the lexicographic order ≥L on the rows of R (in particular, ≥L completely orders
the rows of R, whereas ⪰f is generally incomplete). When f is regular, however,
we obtain as an immediate corollary of Theorem 8.17:
Theorem 8.18. Let f (x1 , x2 , . . . , xn ) be a regular function and denote by R i the
i-th row of its Winder matrix (i = 1, 2, . . . , n). Then,
R 1 ≥L R 2 ≥L · · · ≥L R n if and only if x1 ⪰f x2 ⪰f · · · ⪰f xn .
Proof. This immediately follows from Theorem 8.17.
Procedure Regular(f )
Input: A positive Boolean function f (x1 , x2 , . . . , xn ) in complete DNF.
Output: True if f is regular, False otherwise.
begin
compute R, the Winder matrix of f ;
order the rows of R lexicographically;
{comment: assume without loss of generality that R 1 ≥L R 2 ≥L · · · ≥L R n }
set up the binary tree T (f );
for i = 1 to n − 1 and
for every prime implicant ⋀_{k∈A} xk of f such that i ∉ A and i + 1 ∈ A do
if Implicant(A ∪ {i} \ {i + 1}) = False then return False;
return True;
end
We are now ready to state an efficient algorithm due to Provan and Ball [760]
for the recognition of regular functions; see Figure 8.1 for a formal statement of
the algorithm.
More precisely, denote by T (f ) the binary tree associated with (the complete
DNF of) a positive function f as on page 341, and consider the procedure
Implicant(A) defined on page 342. Then, we can state:
and ⋀_{k∈(A∪{i})\{i+1}} xk is not an implicant of f . But then, Implicant(A ∪ {i} \
{i + 1}) returns False, by Theorem 8.19. This establishes that the procedure is
correct.
As for the complexity of the procedure, we have already observed that its first
and second steps can be performed in time O(n2 + nm). Setting up the tree T (f )
takes time O(nm) (see Section 7.4). The nested loops require at most nm calls on
the procedure Implicant, and each of these calls can be executed in time O(n).
Hence, the overall running time of Regular is O(n2 m).
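A compact brute-force variant of this procedure is easy to write when f is stored as its list of prime implicants; in the sketch below (ours), the binary tree T (f ) and the Implicant routine are replaced by a direct subset search, so the code illustrates the logic but not the O(n²m) bound:

```python
def is_implicant(terms, A):
    """For a positive function given by its prime implicants, the conjunction of
    the variables in A is an implicant iff A contains some prime implicant."""
    return any(t <= A for t in terms)

def is_regular(terms, n):
    """Brute-force variant of Regular(f): sort the variables by lexicographically
    decreasing rows of the Winder matrix, then test the swap condition."""
    # Winder matrix: entry (i, d) counts prime implicants of degree d containing x_i.
    R = {i: tuple(sum(1 for t in terms if i in t and len(t) == d)
                  for d in range(1, n + 1))
         for i in range(1, n + 1)}
    order = sorted(range(1, n + 1), key=lambda i: R[i], reverse=True)
    for i, j in zip(order, order[1:]):     # j should be no stronger than i
        for t in terms:
            if j in t and i not in t:
                if not is_implicant(terms, (t - {j}) | {i}):
                    return False
    return True

terms = [frozenset(t) for t in ({1, 2}, {1, 3}, {1, 4, 5}, {2, 3, 4})]
assert is_regular(terms, 5)                # the regular function of Example 8.8
terms_c4 = [frozenset(t) for t in ({1, 3}, {1, 4}, {2, 3}, {2, 4})]
assert not is_regular(terms_c4, 4)         # fC4 is not regular
```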
Regular Dualization
Instance: The complete DNF of a regular function f or, equivalently, the list of
all minimal true points of f .
Output: The complete DNF of f d or, equivalently, the list of all maximal false
points of f .
O(n2 m) dualization algorithm and lends itself to an O(nm) algorithm for the
solution of “regular set covering problems” to be discussed in Section 8.6. Later
on, Peled and Simeone [736] proposed yet another O(n2 m) regular dualization
algorithm.
Finally, we note that an O(n2 m) dualization algorithm for regular func-
tions can be obtained as a corollary of Theorem 7.16, since regular functions
have the LE property by Theorem 8.12. This algorithm was first described by
Boros [105], within the framework of his analysis of so-called aligned functions
(see Section 8.9.2).
The presentation hereunder combines ideas from Crama [225] and Bertolazzi
and Sassano [74]. It mostly rests on a key result from Crama [225]:
Theorem 8.22. Assume that f (x1 , x2 , . . . , xn ) is regular with respect to
(x1 , x2 , . . . , xn ) and let X ∗ ∈ B n−1 . Then, (X ∗ , 0) is a maximal false point of f
if and only if (X ∗ , 1) is a minimal true point of f .
Proof. Assume that (X ∗ , 0) is a maximal false point of f . Then, (X∗ , 1) is a true
point of f . To see that (X∗ , 1) actually is a minimal true point of f , consider any
index i < n such that xi∗ = 1. Since xi ⪰f xn , the point (x1∗ , x2∗ , . . . , xi−1∗ , 0, xi+1∗ , . . . , xn−1∗ , 1)
is a false point of f , as required.
Conversely, if (X∗ , 1) is a minimal true point of f , then (X ∗ , 0) is a false
point of f . To see that (X ∗ , 0) is a maximal false point, consider i < n such that
xi∗ = 0. Since xi ⪰f xn , the point (x1∗ , x2∗ , . . . , xi−1∗ , 1, xi+1∗ , . . . , xn−1∗ , 0) is a true point of f , as
required.
Note that, in contrast with Theorem 8.22, Theorem 8.23 is valid for all positive
functions, whether regular or not. Taken together, these theorems immediately
suggest a recursive dualization procedure for regular functions. This procedure,
which we call DualReg0, is described in Figure 8.2.
The procedure is obviously correct in view of Theorem 8.22 and Theorem 8.23.
Moreover, it can actually be implemented recursively, since fn is regular when f
is regular (by Theorem 8.7).
Procedure DualReg0(f )
Input: The list of minimal true points of a regular function f (x1 , x2 , . . . , xn )
such that x1 ⪰f x2 ⪰f · · · ⪰f xn .
Output: All maximal false points of f .
begin
identify all minimal true points of f with last component equal to 1,
say (X1∗ , 1), (X2∗ , 1), . . . , (Xk∗ , 1);
fix xn to 1 in f and determine the minimal true points of fn ;
generate (recursively) all maximal false points of fn ,
say Xk+1∗ , Xk+2∗ , . . . , Xp∗ ;
return (X1∗ , 0), (X2∗ , 0), . . . , (Xk∗ , 0) and (Xk+1∗ , 1), (Xk+2∗ , 1), . . . , (Xp∗ , 1);
end
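Under the stated assumptions, DualReg0 admits a direct recursive transcription. The sketch below (ours) represents points as 0/1 tuples and recovers the minimal true points of fn as the minimal truncated points:

```python
def maximal_false_points(mtp, n):
    """DualReg0: all maximal false points of a positive function f that is regular
    with respect to (x1 >= x2 >= ... >= xn), given its minimal true points."""
    mtp = [tuple(Y) for Y in mtp]
    if not mtp:                      # f == 0: the all-ones point is the only one
        return [(1,) * n]
    if mtp == [(0,) * n]:            # f == 1: no false points at all
        return []
    # Theorem 8.22: (X, 0) is a maximal false point iff (X, 1) is a minimal true point.
    out = [Y[:-1] + (0,) for Y in mtp if Y[-1] == 1]
    # Minimal true points of f_n = f|x_n=1: the minimal truncated points.
    cand = {Y[:-1] for Y in mtp}
    mtp_n = [Z for Z in cand
             if not any(W != Z and all(w <= z for w, z in zip(W, Z)) for W in cand)]
    # Theorem 8.23: the maximal false points of f_n, re-extended with x_n = 1.
    return out + [Z + (1,) for Z in maximal_false_points(mtp_n, n - 1)]

# Minimal true points of the function of Examples 8.14-8.15:
mtp = [(0, 1, 0, 1, 1), (0, 1, 1, 0, 0), (1, 0, 0, 1, 0), (1, 0, 1, 0, 0), (1, 1, 0, 0, 0)]
assert set(maximal_false_points(mtp, 5)) == {
    (0, 1, 0, 1, 0), (0, 1, 0, 0, 1), (1, 0, 0, 0, 1), (0, 0, 1, 1, 1)}
```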
Step 1. The maximal false points of f5 with last component equal to 0 are (0, 1, 0, 0)
and (1, 0, 0, 0) (derived from Z 1 and Z 3 by Theorem 8.22). Thus, f has the maximal
false points X2 = (0, 1, 0, 0, 1) and X3 = (1, 0, 0, 0, 1) (by Theorem 8.23).
Step 2. The restriction of f5 to x4 = 1 is f4 (x1 , x2 , x3 ) = x1 ∨ x2 , with minimal true
points V 1 = (0, 1, 0) and V 2 = (1, 0, 0).
Step 3. We recursively apply DualReg0 to f4 .
Step 1. Using Theorem 8.22, we see that f3 has the maximal false point (0, 0) with
last component equal to 0. Thus, f has the maximal false point X 4 = (0, 0, 1, 1, 1)
(by repeated applications of Theorem 8.23).
Step 2. Fixing x2 = 1 in f3 , we obtain f2 (x1 ) ≡ 1.
Step 3. Since f2 has no maximal false points, the procedure terminates here: all
maximal false points of f have been listed.
DualReg0 requires generating the minimal true points of fn from the mini-
mal true points of f . To carry out this step efficiently, one may rely on the next
observation.
Theorem 8.24. Let f (x1 , x2 , . . . , xn ) be a positive function and let Y ∈ Bn−1 . Then,
Y is a minimal true point of fn if and only if
Procedure DualReg(f )
Input: The list of minimal true points Y 1 , Y 2 , . . . , Y m of a regular function
f (x1 , x2 , . . . , xn ) such that x1 ⪰f x2 ⪰f · · · ⪰f xn .
Output: The list L of all maximal false points of f .
begin
sort the points Y i (i = 1, 2, . . . , m) in lexicographic order;
{comment: assume without loss of generality that Y 1 <L Y 2 <L . . . <L Y m };
ν1 := 0;
for i = 2 to m do νi := min{k : yki−1 < yki };
initialize L := empty list;
for i = 1 to m and for j = 1 to n do
if yji = 1 and νi < j then
begin
for k = 1 to j − 1 do xk∗ := yki ;
xj∗ := 0;
for k = j + 1 to n do xk∗ := 1;
add X∗ to L;
end
return L;
end
in time O(nm) (see for instance Aho, Hopcroft, and Ullman [11]). The parameters
ν1 , ν2 , . . . , νm can be simultaneously computed on the run. Each execution of the
(for i, for j )–loop requires O(n) operations, thus leading to the overall O(n2 m)
time bound.
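The iterative procedure DualReg transcribes almost line by line into Python (0-based indices replace the 1-based ones of the text; the tuple representation is ours). Run on the minimal true points of Example 8.14, it reproduces the four maximal false points listed there:

```python
def dual_reg(mtp):
    """DualReg: maximal false points of a function that is regular with respect
    to (x1 >= x2 >= ... >= xn), from its minimal true points."""
    Y = sorted(tuple(P) for P in mtp)       # tuple comparison = lexicographic order
    n, m = len(Y[0]), len(Y)
    # nu[i] = smallest (1-based) position where Y[i] has a 1 and Y[i-1] a 0.
    nu = [0] + [min(k + 1 for k in range(n) if Y[i - 1][k] < Y[i][k])
                for i in range(1, m)]
    L = []
    for i in range(m):
        for j in range(1, n + 1):           # j is 1-based, as in the procedure
            if Y[i][j - 1] == 1 and nu[i] < j:
                L.append(Y[i][:j - 1] + (0,) + (1,) * (n - j))
    return L

mtp = [(0, 1, 0, 1, 1), (0, 1, 1, 0, 0), (1, 0, 0, 1, 0), (1, 0, 1, 0, 0), (1, 1, 0, 0, 0)]
assert set(dual_reg(mtp)) == {
    (0, 1, 0, 1, 0), (0, 1, 0, 0, 1), (1, 0, 0, 0, 1), (0, 0, 1, 1, 1)}
```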
Theorem 8.29 strengthens the result of Peled and Simeone [735] mentioned in
the introduction of this section. Further refinements of the bound can be found in
[105, 225, 735].
Procedure RegCover0(c, f )
Input: A vector (c1 , c2 , . . . , cn ) of integer coefficients and a regular function
f (x1 , x2 , . . . , xn ) in complete disjunctive normal form.
Output: An optimal solution of the instance of RSCP defined by (c1 , c2 , . . . , cn ) and f .
begin
generate all maximal false points of f ;
evaluate the value of each maximal false point and return the best one;
end
The maximal false points of f have been computed in Example 8.14; they are
X 1 = (0, 1, 0, 1, 0), X 2 = (0, 1, 0, 0, 1), X 3 = (1, 0, 0, 0, 1) and X 4 = (0, 0, 1, 1, 1).
Their respective values are z(X1 ) = 3, z(X 2 ) = 4, z(X 3 ) = 5 and z(X4 ) = 4. So,
X 3 is an optimal solution for this instance of RSCP.
The first polynomial-time algorithm for RSCP was obtained by Peled and
Simeone [735], based on the general approach outlined in procedure DualReg0.
The complexity of their algorithm is O(n3 m), since this is also the complexity
of the dualization algorithm proposed in [735]. The better time bound mentioned
in Theorem 8.30 immediately results from the improvements brought by Crama
[225] or Bertolazzi and Sassano [74] to the efficiency of dualization procedures
for regular functions.
However, as shown by Bertolazzi and Sassano [74], Hammer and Simeone
[462], or Peled and Simeone [736], even faster algorithms (with complexity
O(nm)) can be obtained for RSCP by exploiting a slightly different idea: Namely,
these authors manage to replace the explicit generation of the maximal false points
of f by their implicit generation, and to compute in constant time the value z(X)
of each such point. In Bertolazzi and Sassano [74], this idea is implemented via
a simple adaptation of the dualization algorithm DualReg. This leads to the pro-
cedure RegCover shown in Figure 8.5. In this procedure, the variable best keeps
track of the value of the best point found so far, and i ∗ , j ∗ are the values of i and
j describing this point, as in Theorem 8.27.
The meaning of the computations carried out in RegCover is revealed in the
following proof.
Proof. It is trivial to verify that, at the beginning of an arbitrary (for j )–loop (that
is, just after the counter j has been increased), the value of C is given as
C = ∑_{k<j} ck yki + ∑_{k≥j} ck .
On the other hand, in view of Theorem 8.27 and the comments that follow it, we
know that the maximal false points of f are all points of the form
Procedure RegCover(c, f )
Input: A vector (c1 , c2 , . . . , cn ) of nonnegative integer coefficients and the list of
minimal true points Y 1 , Y 2 , . . . , Y m of a regular function f (x1 , x2 , . . . , xn )
such that x1 ⪰f x2 ⪰f · · · ⪰f xn .
Output: An optimal solution of the instance of RSCP defined by (c1 , c2 , . . . , cn ) and f .
begin
best := −1;
S := ∑_{j=1}^{n} cj ;
sort the points Y i (i = 1, 2, . . . , m) in lexicographic order;
{comment: assume without loss of generality that Y 1 <L Y 2 <L . . . <L Y m };
ν1 := 0;
for i = 2 to m do νi := min{k : yki−1 < yki };
{comment: compute the value of each maximal false point};
for i = 1 to m do
begin
C := S;
for j = 1 to n do
begin
if yji = 0 then C := C − cj ;
if yji = 1 and νi < j and C − cj > best then
begin
best := C − cj ;
i ∗ := i;
j ∗ := j ;
end
end
end
return (y_1^{i∗} , y_2^{i∗} , . . . , y_{j∗−1}^{i∗} , 0, 1, 1, . . . , 1);
end
such that y_j^i = 1 and νi < j . Thus, if X ∗ = (y_1^i , y_2^i , . . . , y_{j−1}^i , 0, 1, . . . , 1) is such a
point, then C − cj is precisely the value of z(X ∗ ). It follows easily that RegCover
returns a maximal false point with maximum value.
The complexity analysis is straightforward.
Example 8.16. Let us consider again the set covering instance given in Exam-
ple 8.15, and let us run RegCover on this instance. The minimal true points of f are
(in lexicographic order): Y 1 = (0, 1, 0, 1, 1), Y 2 = (0, 1, 1, 0, 0), Y 3 = (1, 0, 0, 1, 0),
Y 4 = (1, 0, 1, 0, 0), and Y 5 = (1, 1, 0, 0, 0). So, ν1 = 0, ν2 = 3, ν3 = 1, ν4 = 3,
ν5 = 2. The sum of the objective function coefficients is S = 9, and we initially set
best := −1.
For i = 1 and for j = 1 to 5, we successively obtain
j = 1 : y11 = 0 =⇒ C := 9 − c1 = 6;
j = 2 : y21 = 1 and ν1 < 2 and C − c2 = 4 > best =⇒ best := 4, i ∗ := 1, j ∗ := 2;
j = 3 : y31 = 0 =⇒ C := 6 − c3 = 5;
j = 4 : y41 = 1 and C − c4 = 4 ≤ best =⇒ no update;
j = 5 : y51 = 1 and C − c5 = 3 ≤ best =⇒ no update.
No better solution is found for i = 2, since ν2 ≥ j whenever yj2 = 1.
For i = 3, we get:
j = 1 : y13 = 1 and ν3 ≥ j =⇒ no update;
j = 2 : =⇒ C := 9 − c2 = 7;
j = 3 : =⇒ C := 7 − c3 = 6;
j = 4 : y43 = 1 and ν3 < 4 and C − c4 = 5 > best =⇒ best := 5, i ∗ := 3, j ∗ := 4.
We leave it to the reader to continue the execution of RegCover on this example
and to verify that no further updates of best, i ∗ and j ∗ take place. So, the solution
returned by the algorithm is
(y_1^{i∗} , y_2^{i∗} , . . . , y_{j∗−1}^{i∗} , 0, 1, . . . , 1) = (y_1^3 , y_2^3 , y_3^3 , 0, 1) = (1, 0, 0, 0, 1),
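For reference, here is a Python transcription of RegCover run on this instance; the objective coefficients c = (3, 2, 1, 1, 2) are reconstructed from the arithmetic displayed above, and the indexing is 0-based (the code is our illustration):

```python
def reg_cover(c, mtp):
    """RegCover: an optimal solution of RSCP, scanning the implicitly generated
    maximal false points; assumes f has at least one maximal false point."""
    Y = sorted(tuple(P) for P in mtp)
    n, m = len(Y[0]), len(Y)
    S = sum(c)
    nu = [0] + [min(k + 1 for k in range(n) if Y[i - 1][k] < Y[i][k])
                for i in range(1, m)]
    best, i_star, j_star = -1, None, None
    for i in range(m):
        C = S
        for j in range(1, n + 1):
            if Y[i][j - 1] == 0:
                C -= c[j - 1]                  # C tracks the value of the candidate point
            elif nu[i] < j and C - c[j - 1] > best:
                best, i_star, j_star = C - c[j - 1], i, j
    return best, Y[i_star][:j_star - 1] + (0,) + (1,) * (n - j_star)

c = [3, 2, 1, 1, 2]
mtp = [(0, 1, 0, 1, 1), (0, 1, 1, 0, 0), (1, 0, 0, 1, 0), (1, 0, 1, 0, 0), (1, 1, 0, 0, 0)]
assert reg_cover(c, mtp) == (5, (1, 0, 0, 0, 1))    # the optimum of Example 8.16
```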
Further connections between regular functions and set covering problems can be
found, for instance, in Balas [42], Hammer, Johnson and Peled [443, 444], Laurent
and Sassano [602], Wolsey [922], etc. (see also Section 8.7.3 and Chapter 9).
Theorem 8.32. For every Boolean function f (x1 , x2 , . . . , xn ), there exist two
positive functions f R (x1 , x2 , . . . , xn ) and fR (x1 , x2 , . . . , xn ) such that
Proof. Let us denote by L and U the sets of all positive functions such that (8.13)
and (8.14) are satisfied for all f− ∈ L and f + ∈ U . Observe that L and U are both
nonempty, since 0n ∈ L and 1n ∈ U . Define
fR = ⋁ {f− : f− ∈ L},   f R = ⋀ {f + : f + ∈ U }.
Then, fR and f R trivially satisfy conditions (b) and (c), and Theorem 8.9
implies (a).
(i) fR is the unique function that is regular with respect to (x1 , x2 , . . . , xn ) and
that has the same set of ceilings as f; and
(ii) f R is the unique function that is regular with respect to (x1 , x2 , . . . , xn ) and
that has the same set of floors as f .
Proof. Let A be the set of ceilings of f , and let τA be defined as in Theorem 8.15.
We want to prove that fR = rA , that is, we want to prove that rA satisfies conditions
(a)-(c) in Theorem 8.32. Condition (a) follows from the definition of τA .
The next result shows that fij is the largest positive minorant of f for which xi
is stronger than xj .
(a) fij ≤ f ;
(b) xi is stronger than xj with respect to fij ;
(c) if g(x1 , x2 , . . . , xn ) is a positive function such that g ≤ f and xi ⪰g xj , then
g ≤ fij .
Proof. Assertions (a) and (b) are easily verified. Suppose now that g ≤ f and
xi ⪰g xj . Let Y ∈ Bn . We must show that g(Y ) ≤ fij (Y ).
If yi = 1, then fij (Y ) = f (Y ) by (8.16), and hence, g(Y ) ≤ fij (Y ).
If yi = yj = 0, then
Procedure RegMinor0(f )
Input: A positive Boolean function f (x1 , x2 , . . . , xn ).
Output: fR , the largest regular minorant of f with respect to (x1 , x2 , . . . , xn ).
begin
fR := f ;
while there is a pair of variables xi , xj such that i < j and
xi is not stronger than xj with respect to fR
do fR := fij ;
return fR ;
end
Theorem 8.35. The procedure RegMinor0(f ) is correct, that is, it stops for
every input, and it returns the largest regular minorant of f with respect to
(x1 , x2 , . . . , xn ).
Proof. It follows from Theorem 8.34 that, if xi is not stronger than xj with respect
to f , then fij < f . Thus, the sequence of functions produced in the while loop is
strictly decreasing, and it must terminate. Denote by g the output of the procedure,
and denote by fR the largest regular minorant of f with respect to (x1 , x2 , . . . , xn )
(its existence is guaranteed by Theorem 8.32). We must show that g = fR .
By construction, g is regular with respect to (x1 , x2 , . . . , xn ). Thus, g ≤ fR by
definition of fR .
On the other hand, Theorem 8.34(c) implies (by induction) that fR ≤ f X for
each of the functions f X produced in the course of the procedure. In particular,
fR ≤ g, and this completes the proof.
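On small functions, RegMinor0 can be simulated on the full truth table. In the sketch below (ours), the minorization fij is realized by a coordinate-swap rule; this rule agrees with the part of equation (8.16) quoted above (fij coincides with f whenever yi = 1 or yj = 0), but it is an assumption of ours where (8.16) is not reproduced in full:

```python
from itertools import product

def stronger(F, n, i, j):
    """x_i >=_F x_j on the truth table F (a dict mapping points to 0/1)."""
    return all(F[X[:i] + (1,) + X[i+1:]] >= F[X[:j] + (1,) + X[j+1:]]
               for X in F if X[i] == 0 and X[j] == 0)

def minorize(F, i, j):
    """f_ij: agrees with F except when x_i = 0 and x_j = 1, where it also
    requires the point with the two coordinates swapped to be true."""
    G = dict(F)
    for X in F:
        if X[i] == 0 and X[j] == 1:
            S = list(X); S[i], S[j] = 1, 0
            G[X] = F[X] & F[tuple(S)]
    return G

def reg_minor0(F, n):
    """RegMinor0: minorize until x_i is stronger than x_j for every i < j."""
    while True:
        pair = next(((i, j) for i in range(n) for j in range(i + 1, n)
                     if not stronger(F, n, i, j)), None)
        if pair is None:
            return F
        F = minorize(F, *pair)

# fC4 = x1 x3 ∨ x1 x4 ∨ x2 x3 ∨ x2 x4 is positive but not regular.
F = {X: (X[0] & X[2]) | (X[0] & X[3]) | (X[1] & X[2]) | (X[1] & X[3])
     for X in product([0, 1], repeat=4)}
G = reg_minor0(F, 4)
assert all(G[X] <= F[X] for X in F)        # G is a minorant of f
assert all(stronger(G, 4, i, j) for i in range(4) for j in range(i + 1, 4))
```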
(a) x1 ⪰ x2 ⪰ · · · ⪰ xi−1 ;
(b) For all k ∈ {i, i + 1, . . . , n}, xi−1 ⪰ xk ;
(c) For all k ∈ {i + 1, i + 2, . . . , j − 1}, xi ⪰ xk .
Then, conditions (a), (b), and (c) also hold with respect to fij .
h(Y ∨ ei ) = f (Y ∨ ei )
and
f|xj =0 (Y ∨ ei−1 ∨ ei ) ≥ f (Y ∨ ei )
Procedure RegMinor(f )
Input: A positive Boolean function f (x1 , x2 , . . . , xn ).
Output: fR , the largest regular minorant of f with respect to (x1 , x2 , . . . , xn ).
begin
fR := f ;
for i = 1 to n − 1 do
for j = i + 1 to n do
if xi is not stronger than xj with respect to fR then fR := fij ;
return fR ;
end
Proof. The correctness of the procedure is implied by Theorem 8.36, and the bound
on the number of minorization steps is trivial.
Note that, despite the fact that the number of minorization steps performed by
RegMinor is small, this procedure necessarily runs in exponential (input) time,
in view of Example 8.18. It is not clear, however, whether the procedure runs in
polynomial total time, that is, in time polynomial in |f | + |fR | (see Appendix B).
Proof. Let (x1 , x2 , . . . , xn ) be the given order, and let g d be the largest regular
minorant of f d with respect to (x1 , x2 , . . . , xn ). We must prove that g is the smallest
regular majorant of f with respect to (x1 , x2 , . . . , xn ).
First, g d ≤ f d implies f ≤ g. Next, since g d is regular with respect to
(x1 , x2 , . . . , xn ), g is regular with the same strength preorder (by Theorem 8.10).
Finally, if g is not the smallest regular majorant of f with respect to (x1 , x2 , . . . , xn ),
then there exists another regular function h, with the same strength preorder, such
that f ≤ h < g. But then, g d < hd ≤ f d , contradicting the definition of g d .
Paraphrasing the statement of Theorem 8.34, we obtain the following result due
to Hammer and Mahadev [452].
(a) f ≤ f ij ;
(b) xi is stronger than xj with respect to f ij ;
(c) if g(x1 , x2 , . . . , xn ) is a positive function such that f ≤ g and xi ⪰g xj , then
f ij ≤ g.
Proof. This can be proved either by a duality argument or by adapting the proof of
Theorem 8.34. Details are left to the reader.
where f is a positive Boolean function, and let us define the set covering problem
SCP 12 as follows:
maximize z(x1 , x2 , . . . , xn ) = ∑_{i=1}^{n} ci xi
Theorem 8.40. If c1 > c2 , then SCP and SCP 12 have the same set of optimal
solutions.
Proof. We only need to show that every optimal solution of SCP 12 is feasi-
ble for SCP. Let X∗ = (x1∗ , x2∗ , . . . , xn∗ ) be an optimal solution of SCP 12 . Since
f12 (X ∗ ) = 0, Equation (8.16) implies that f (X∗ ) = 0 or
z(X ∗ ) = z(0, 1, x3∗ , x4∗ , . . . , xn∗ ) < z(1, 0, x3∗ , x4∗ , . . . , xn∗ ) = z(Y ∗ ). But then X ∗ is not
an optimal solution of SCP 12 , and we reach a contradiction.
f12 = x1 x2 ∨ x1 x3 ∨ x1 x4 ∨ x2 x3 ∨ x2 x4 x5 .
As shown in Example 8.14, the maximal false points of f12 are X1 = (0, 1, 0, 1, 0),
X2 = (0, 1, 0, 0, 1), X3 = (1, 0, 0, 0, 1), and X4 = (0, 0, 1, 1, 1). The objective func-
tion value of X1 , X 3 and X4 is 6, and the value of X2 is 5. So, Theorem 8.40
implies that {X1 , X 3 , X 4 } is the set of optimal solutions of the original problem
(8.27)–(8.29). One easily verifies that this is indeed the case, since X 1 , X 3 , and X4
are all the maximal false points of f .
In view of Theorem 8.40 and of the results obtained in Section 8.7, it is now
natural to associate with SCP the set covering problem SCP R , defined as follows:
maximize z(x1 , x2 , . . . , xn ) = ∑_{i=1}^{n} ci xi
subject to fR (x1 , x2 , . . . , xn ) = 0
(x1 , x2 , . . . , xn ) ∈ Bn ,
Theorem 8.41. If c1 > c2 > · · · > cn , then SCP and SCP R have the same set of
optimal solutions.
Proof. The statement easily follows from Theorems 8.35 and 8.40, by induction on
the number of (i, j )–minorization steps that are necessary to derive fR from f .
As usual, we drop the subscript f from the symbol ⪰f when no confusion can
result.
Application 8.6. (Game theory.) The strength relation among subsets of vari-
ables has a clear interpretation in the context of game theory. If f represents a
simple game, and S is stronger than T with respect to f , then a coalition C (dis-
joint from S and T ) can more easily form a winning coalition by joining S than
by joining T (remember Application 8.1). Therefore, Lapidot [597] says that S is
“more desirable” than T when S ⪰f T . This relation among coalitions was used
by Peleg [737, 738] to develop a theory of coalition formation in simple games,
and its game-theoretic properties have been further investigated by Einy [291].
Thus, in particular, monotone functions are precisely those functions such that
S and T are comparable whenever |S ∪ T | ≤ 1. Similarly, a positive function is
regular if and only if S and T are comparable whenever |S ∪ T | = 2.
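Both characterizations can be verified exhaustively for small n. The following Python sketch (ours; disjoint sets of 0-based indices stand for S and T) tests k-monotonicity directly from the definition:

```python
from itertools import product, combinations

def comparable(f, n, S, T):
    """S and T (disjoint sets of 0-based indices) are comparable with respect
    to f: one of the two inequalities holds at every point that is 0 on S ∪ T."""
    ge = le = True
    for Y in product([0, 1], repeat=n):
        if any(Y[i] for i in S | T):
            continue
        fS = f(tuple(1 if i in S else y for i, y in enumerate(Y)))
        fT = f(tuple(1 if i in T else y for i, y in enumerate(Y)))
        ge = ge and fS >= fT
        le = le and fS <= fT
    return ge or le

def k_monotone(f, n, k):
    """f is k-monotone iff all disjoint S, T with |S ∪ T| <= k are comparable."""
    for r in range(1, k + 1):
        for U in combinations(range(n), r):
            for s in range(r + 1):
                for S in combinations(U, s):
                    if not comparable(f, n, set(S), set(U) - set(S)):
                        return False
    return True

maj = lambda x: 1 if sum(x) >= 2 else 0     # 2-majority on three variables (threshold)
assert all(k_monotone(maj, 3, k) for k in (1, 2, 3))
g = lambda x: (x[0] & x[1]) | (x[2] & x[3]) # positive, hence 1-monotone, but not 2-monotone
assert k_monotone(g, 4, 1) and not k_monotone(g, 4, 2)
```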
T h ⊂ CM ⊂ . . . ⊂ Mk+1 ⊂ Mk ⊂ . . . ⊂ M1
and
yi∗ = 0 for all i ∈ S ∪ T , f (Y ∗ ∨ eS ) = 1, f (Y ∗ ∨ eT ) = 0.
We can assume without loss of generality that S and T are disjoint (see the end-
of-chapter exercises).
Let now I = {i : xi∗ = 0, yi∗ = 1}, J = {i : xi∗ = 1, yi∗ = 0}, K = {i : xi∗ = yi∗ , i ∉
S ∪ T }, and define two points W ∗ , Z ∗ as follows:
wi∗ = 1 if i ∈ T ,
= xi∗ if i ∈ K,
= 0 otherwise,
and
zi∗ = 1 if i ∈ S,
= xi∗ if i ∈ K,
= 0 otherwise.
It is trivial to check that X∗ ∨ eS = Z ∗ ∨ eJ , X∗ ∨ eT = W ∗ ∨ eJ , Y ∗ ∨ eS = Z ∗ ∨ eI ,
and Y ∗ ∨ eT = W ∗ ∨ eI . As a consequence, we see that
f (W ∗ ∨ eI ) = 0, f (W ∗ ∨ eJ ) = 1,
and
f (Z ∗ ∨ eI ) = 1, f (Z ∗ ∨ eJ ) = 0.
Hence, I and J are not comparable with respect to f . Since |I ∪J | + |T ∪ S| ≤ n,
we conclude that f is not k-monotone for some k ≤ ⌊n/2⌋.
So, if we restrict our attention to functions of n variables (for fixed n), the
hierarchy of k-monotone functions boils down to
T h ⊆ CM = M⌊n/2⌋ ⊂ M⌊n/2⌋−1 ⊂ . . . ⊂ M1 ,
Proof. Necessity. Assume first that f is k-monotone, and consider two complemen-
tary faces F (I , J ) and F (J , I ), with |I ∪ J | ≤ k. By definition of k-monotonicity,
we can assume without loss of generality that I ⪰f J . But this easily implies that
f|I ,J ≥ f|J ,I .
Sufficiency. To prove the reverse implication, consider two (disjoint) subsets
S, T of {1, 2, . . . , n} such that |S ∪ T | ≤ k. Then, if we assume for instance that
f|S,T ≥ f|T ,S , it is straightforward to check that S ⪰f T .
As a corollary, we obtain:
Proof. This follows from Theorem 8.46 and from Theorem 4.2 in Section 4.1.
Muroga, Toda, and Takasu [700] observed that completely monotone functions
are dual-comparable (see Section 4.1.3 for definitions).
Proof. Assume that f is neither dual-minor nor dual-major. Then, there exist
X ∗ , Y ∗ ∈ Bn such that f (X ∗ ) = 1, f d (X ∗ ) = 0, f (Y ∗ ) = 0, and f d (Y ∗ ) = 1.
Let S = {i : xi∗ = yi∗ = 1} and T = {i : xi∗ = yi∗ = 0}. Define two points W ∗ , Z ∗ ∈ Bn
as follows:
wi∗ = 0 if i ∈ S ∪ T ,
= xi∗ otherwise,
and
zi∗ = 0 if i ∈ S ∪ T ,
= yi∗ otherwise.
f (W ∗ ∨ eS ) = f (X∗ ) = 1, f (W ∗ ∨ eT ) = f (Y ∗ ) = 0,
and
f (Z ∗ ∨ eS ) = f (Y ∗ ) = 0, f (Z ∗ ∨ eT ) = f (X ∗ ) = 1.
Hence, S and T are not comparable with respect to f , and f is not completely
monotone.
Taken together, Theorems 8.45 and 8.48 imply that the restriction of a completely
monotone function to any face of Bn is either dual-minor or dual-major. Ding
[272] established that this property actually characterizes completely monotone
functions.
Theorem 8.49. A Boolean function f on B n is completely monotone if and only
if, for every face F of Bn , f|F is either dual-minor or dual-major.
Proof. Assume that f is not completely monotone. Then, there exist S, T ⊆
{1, 2, . . . , n} and X ∗ , Y ∗ ∈ Bn such that
and
yi∗ = 0 for all i ∈ S ∪ T , f (Y ∗ ∨ eS ) = 1, f (Y ∗ ∨ eT ) = 0.
Moreover, we can again assume, without loss of generality, that S and T are
disjoint.
Let now I = {i : xi∗ = yi∗ = 1}, J = {i ∉ (S ∪ T ) : xi∗ = yi∗ = 0}, F = F (I , J )
and g = f|F . We claim that g is neither dual-minor nor dual-major, that is, there
exist W ∗ , Z ∗ ∈ F such that g(W ∗ ) = 1, g d (W ∗ ) = 0 and g(Z ∗ ) = 0, g d (Z ∗ ) = 1.
We leave it to the reader to verify that W ∗ = X ∗ ∨ eT and Z ∗ = X ∗ ∨ eS are as
required.
(X ∗ ∨ eS ) + (Y ∗ ∨ eT ) = (X∗ ∨ eT ) + (Y ∗ ∨ eS ).
f (U ∗ ∨ eS ) = f (X∗ ) = 0, f (U ∗ ∨ eT ) = f (Y ∗ ) = 1,
f (V ∗ ∨ eS ) = f (Z ∗ ) = 1, f (V ∗ ∨ eT ) = f (W ∗ ) = 0.
Hence, S and T are not comparable with respect to f , and f is not completely
monotone.
Based on this observation, and on the fact that many of the remarkable features
of regular functions actually rest on Theorem 8.22 and Theorem 8.23, Crama [224]
introduced the following class of functions:
Definition 8.11. A positive Boolean function f (x1 , x2 , . . . , xn ) is weakly regular
with respect to (x1 , x2 , . . . , xn ) if f is constant, or if
(a) xj ⪰f xn for all j ∈ {1, 2, . . . , n}, and
(b) f|xn =1 is weakly regular with respect to (x1 , x2 , . . . , xn−1 ).
We simply say that f is weakly regular if f is weakly regular with respect to some
permutation of its variables.
So, when f is weakly regular with respect to (x1 , x2 , . . . , xn ), xi is a “weakest”
variable in the preorder associated with f|xi+1 =···=xn =1 , for all i ∈ {1, 2, . . . , n}.
Clearly, regular functions are weakly regular, but the converse is not necessarily
true.
Comparing this statement with Definition 7.5 in Section 7.4, it is easy to con-
clude that aligned functions have the LE property. As a consequence, aligned
functions can be dualized in O(n² m) time (see Theorem 7.16 in Section 7.4.4).
Example 8.26. To see that the class of aligned functions is distinct from previ-
ously introduced classes, consider fB = x1 x2 ∨ x1 x3 ∨ x1 x4 ∨ x2 x3 ∨ x2 x4 x5 ∨
x3 x4 x5 x6 ∨ x4 x5 x6 x7 . This function is aligned with respect to (x1 , x2 , . . . , x7 ), but it
is not weakly regular, and hence, it is not regular either (Boros [105]). By duality,
f d is weakly regular, but it is not aligned.
On the other hand, the function fC4 = x1 x3 ∨ x1 x4 ∨ x2 x3 ∨ x2 x4 has the LE
property by virtue of Theorem 7.18, but it is not aligned since it does not have a
weakest variable.
The converse of this theorem is false: An ideal function is not necessarily weakly
regular.
Example 8.27. Each of x1 , x2 , x3 , x4 is a last variable of fC4 = x1 x3 ∨ x1 x4 ∨
x2 x3 ∨ x2 x4 , so that the function is ideal. But fC4 is neither weakly regular nor
aligned because it has no weakest variable.
Similarly, the function fP4 = x1 x2 ∨ x1 x3 ∨ x2 x4 is ideal with respect to the
order (x1 , x2 , x3 , x4 ), but fP4 is neither weakly regular nor aligned.
The main motivation for considering ideal functions is that they can be dualized
in
polynomial time. To describe this result, let us introduce the following notation:
If ⋀i∈A xi is a prime implicant of f and if j ∈ A, we let
and
Q(A, j ) = P (A, j ) ∪ {j }.
Theorem 8.55, combined with Theorem 8.23, allows us to generate the dual of
an ideal function in polynomial time. Details are left to the reader.
Example 8.28. Note that the dual of an ideal function is generally not ideal:
For instance, the dual of the function fC4 defined in Example 8.27 is f2K2 =
x1 x2 ∨ x3 x4 , which is not ideal.
[Figure: implication diagram relating the classes of this section — aligned ⇒ LE property, and regular ⇒ weakly regular ⇒ ideal — together with the corresponding dual relations.]
These implications suggest several questions worth investigating (we do not claim
that these are very difficult questions, but simply that their answers do not seem
to appear readily in the literature).
8.10 Exercises
1. Let f (x1 , x2 , . . . , xn ) be a positive Boolean function, let g = f|x1 =1 , let
h = f|x1 =0 , and assume that both g and h are regular with respect to
(x2 , x3 , . . . , xn ). Show that f is not necessarily regular (compare with
Theorem 8.7).
2. Show that, for a function f (x1 , x2 , . . . , xn ) given in DNF, it is co-NP-complete
to decide whether x1 ⪰f x2 .
3. Prove Theorem 8.12 by resorting only to the definitions of regularity and of
the LE property.
4. Prove the validity of the claims in Application 8.2.
5. Prove the validity of the claims in Application 8.4.
6. Prove Theorem 8.24.
7. Show that Theorem 8.25 can be used to produce an O(n² m) implementation
of DualReg0 (Crama [225]).
8. Prove Theorem 8.39.
9. Prove that Theorem 8.41 extends to problem (8.30)–(8.32) if the objective
function g satisfies the “generalized regularity” condition (8.33). (Hammer,
Johnson, and Peled [443]).
10. Show that in Definition 8.9, comparisons can be restricted to pairs of disjoint
subsets: A Boolean function f on Bn is k-monotone (1 ≤ k ≤ n) if and only
if S and T are comparable for all S, T ⊆ {1, 2, . . . , n} such that S ∩ T = ∅ and
|S ∪ T | ≤ k.
11. Show that the function f (x1 , x2 , . . . , x6 ) defined by
f = x1 (x2 ∨ x3 ∨ x4 x5 x6 ) ∨ x2 x3 (x4 x5 ∨ x4 x6 ∨ x5 x6 ) ∨ (x2 ∨ x3 ) x4 x5 x6
is regular, but is not 3-monotone (and, hence, is not threshold; Winder [917]).
12. Prove that
(a) a function f of n variables is completely monotone if and only if
its self-dual extension f SD (x1 , x2 , . . . , xn , xn+1 ) = f x̄n+1 ∨ f d xn+1 is
completely monotone;
(b) a self-dual function of n variables is completely monotone if and only
if it is ⌈n/3⌉-monotone.
(See [698, 918].)
13. Show that, for a positive function f ,
(a) f is 2-summable if and only if there exists a pair of maximal false points
X∗ , W ∗ and a pair of minimal true points Y ∗ , Z ∗ such that Y ∗ + Z ∗ ≤
X∗ + W ∗ ;
(b) in the previous statement, the inequality Y ∗ + Z ∗ ≤ X∗ + W ∗ cannot be
replaced by Y ∗ + Z ∗ = X∗ + W ∗ . (Compare with Definition 8.10.)
14. Prove that a function f is completely monotone if and only if it is k-
monotone, where k is the largest degree of a prime implicant of f . (See
[272].)
15. When S ⪰f T holds, but T ⪰f S does not hold for two subsets S, T ⊆
{1, 2, . . . , n}, we write S ≻f T . Show that the relation ≻f may be cyclic in
general, but is acyclic for threshold functions. (See, e.g., Einy [291], but also
Muroga [698, p. 200] and Winder [917] for additional results along this line.)
16. Complete the proof of Theorem 8.52.
17. Prove Theorem 8.53 and conclude that aligned functions have the LE
property.
18. Prove that, if f (x1 , x2 , . . . , xn ) is ideal, then f|xj =1 is ideal for all j ∈
{1, 2, . . . , n}. Use this result to derive a polynomial-time algorithm for the
recognition of ideal functions. (See Bertolazzi and Sassano [75].)
19. Let G = (V , E) be a graph and let fG (x1 , x2 , . . . , xn ) = ⋁(i,j )∈E xi xj be
the corresponding stability function (see Section 1.13.5 in Chapter 1). For
i ∈ V , denote by N (i) the neighborhood of vertex i, that is, N (i) = {j ∈ V :
(i, j ) ∈ E}.
(a) Prove that xi ⪰ xj with respect to fG if and only if N (j )\{i} ⊆ N (i)\{j },
for all i, j ∈ V .
(b) Prove that fG is regular if and only if G does not contain 2K2 , P4 , or
C4 as induced subgraphs.
(c) Prove that fG is regular if and only if fG is weakly regular.
(See Chvátal and Hammer [201]; Crama [224].)
9.1 Definitions and applications
(remember the discussion in Section 1.1 of Chapter 1). In geometric terms, thresh-
old functions are precisely those functions for which the set of true points can be
separated from the set of false points by a hyperplane (the separator).
Example 9.1. The function f (x, y, z) = x ȳ ∨ z is a threshold function, with
separator {(x, y, z) ∈ R3 : x − y + 2z = 0} and with structure (1, −1, 2, 0). Observe
that f admits many other separators (actually, an infinite number of them): For
instance {(x, y, z) ∈ R3 : α x − α y + 2α z = 0} is a separator for all α > 0, but
so are {(x, y, z) ∈ R3 : x − 2y + 3z = 0}, {(x, y, z) ∈ R3 : 5x − 5y + 10z = 3},
and so on.
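As a quick sanity check, the separators listed above can be compared by brute-force enumeration. The sketch below (in Python, reading the function as x ȳ ∨ z, in line with the weight −1 on y) confirms that all three structures define the same function:

```python
from itertools import product

def threshold_fn(weights, t):
    """Truth table of the threshold function: f(X) = 1 iff sum(w_i x_i) > t."""
    return {X: int(sum(w * x for w, x in zip(weights, X)) > t)
            for X in product((0, 1), repeat=len(weights))}

# f(x, y, z) = x y' ∨ z (the complement on y matches the weight -1 in the structure)
f = {X: int((X[0] and not X[1]) or X[2]) for X in product((0, 1), repeat=3)}

# All three separators from the example define the same function:
assert threshold_fn((1, -1, 2), 0) == f
assert threshold_fn((1, -2, 3), 0) == f
assert threshold_fn((5, -5, 10), 3) == f
```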
instance, to the monograph Wegener [902] and to papers by Anthony [27, 28],
Bruck [157], Krause and Wegener [583] for various aspects of this line of
research.
Application 9.4. (Game theory.) In the framework of game theory (recall Section
1.13.3), a positive threshold Boolean function is called a weighted majority game.
Such games model the familiar situation in which each of n players (or voters) is
assigned a number of votes, say, wi (i = 1, 2, . . . , n), which she can decide to cast –
or not – in favor of the issue at stake. The issue is adopted if the total number of votes
cast in its favor exceeds a predetermined threshold t. In the simplest case (simple
majority rule), every voter carries exactly one vote, and the threshold is equal
to half the number of players. More elaborate voting rules arise, for instance, in
legislatures, where the number of votes of each member is correlated to the size of
the constituency that she represents, or in shareholder meetings, where the number
of votes corresponds to the number of shares held by each member.
Weighted majority procedures constitute the main paradigm in the theory of sim-
ple games and social choice. Many properties of these procedures and of their gen-
eralizations appear in the literature, for instance, in [79, 720, 777, 850, 861, 893],
and so on.
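A weighted majority game is thus nothing but a positive threshold function of the players' indicator variables. The short sketch below illustrates this with purely illustrative vote counts (the numbers are not taken from the text):

```python
from itertools import product

def weighted_majority_game(votes, t):
    """A positive threshold function: coalition X wins iff its vote total exceeds t."""
    return {X: int(sum(v * x for v, x in zip(votes, X)) > t)
            for X in product((0, 1), repeat=len(votes))}

# Three players with illustrative vote counts 50, 30, 20 and threshold 50:
game = weighted_majority_game((50, 30, 20), 50)
winning = sorted(X for X, val in game.items() if val == 1)
# Player 1 belongs to every winning coalition, while {2, 3} alone (50 votes)
# does not strictly exceed the threshold.
```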
(x1 , x2 , . . . , xn ) ∈ Bn ,
We have already observed that a threshold function may have infinitely many
separators (see Example 9.1). In fact, the set of separators can be characterized
more precisely.
Theorem 9.4. The separating structures of a threshold function of n variables
constitute a full-dimensional convex cone in Rn+1 .
Proof. If S and S ′ are two arbitrary separating structures of the threshold function
f , and if α is a positive scalar, then αS and S + S ′ are also separating structures
of f : Thus, the set of separating structures is a convex cone.
To establish full-dimensionality, let S = (w1 , w2 , . . . , wn , t). If f is identically
0, then the claim is easily checked. Otherwise, define
µ = min { ∑_{i=1}^n wi xi : ∑_{i=1}^n wi xi > t, X ∈ Bn },
Let us stress the fact that, as the next example illustrates, a variable can have
nonzero weight in the separating structure of a threshold function even if the
function does not depend on this variable (as a matter of fact, we will show later
that it is NP-hard to determine whether or not a threshold function given by a
separating structure depends on a particular variable; see Theorem 9.26).
Example 9.4. The function f (x, y, z, u) = xy ∨ z is a threshold function with
structure (2, 4, 6, 1, 5). The variable u, which is inessential, has positive weight in
this separating structure.
An easy way of proving that the functions f , g, and h in Example 9.5 are not
threshold is to observe that they are not regular. Indeed:
Theorem 9.7. Every threshold function has a complete strength preorder. More
precisely, if f (x1 , x2 , . . . , xn ) is a threshold function, then,
(1) for every structure (w1 , w2 , . . . , wn , t) of f , and for all i, j = 1, 2, . . . , n, if
wi ≥ wj , then xi ⪰f xj ;
(2) there exists a structure (w1 , w2 , . . . , wn , t) of f such that, for all i, j =
1, 2, . . . , n, wi ≥ wj if and only if xi ⪰f xj .
Proof. The proof of (1) is straightforward (this statement was established as The-
orem 8.3 in Section 8.1). As for statement (2), consider an arbitrary structure
(v1 , v2 , . . . , vn , t) of f . Denote by C an equivalence class of the relation ≈f ,
It can be checked directly that every regular function of five variables or less is
a threshold function, but there exist nonthreshold regular functions of six variables
(see Winder [917] and Exercise 11 in Chapter 8).
We also recall Theorem 8.43 (Section 8.8).
Proof. Straightforward.
Proof. Let t′ = ∑_{i=1}^n wi − t − 1. Since t and w1 , w2 , . . . , wn are integral, the
following equivalences hold for all X ∈ Bn :
f d (X) = 0 if and only if f (X̄) = 1
if and only if ∑_{i=1}^n wi (1 − xi ) > t
if and only if ∑_{i=1}^n wi xi ≤ t′ .
This proves the first part of the statement. For the second and third parts, simply
notice that f d ≤ f if t ≤ t′ , and f ≤ f d if t′ ≤ t.
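The statement can be verified numerically; the sketch below uses the structure (4, 2, 2, 2, 5) of Example 9.7, enumerates Bn , and checks that the weights with the modified threshold ∑ wi − t − 1 indeed separate f d :

```python
from itertools import product

def threshold_fn(weights, t):
    return {X: int(sum(w * x for w, x in zip(weights, X)) > t)
            for X in product((0, 1), repeat=len(weights))}

def dual_fn(f, n):
    """f^d(X) is the complement of f evaluated at the complemented point."""
    return {X: 1 - f[tuple(1 - x for x in X)] for X in product((0, 1), repeat=n)}

# Structure of Example 9.7; Theorem 9.9 gives t' = sum(w) - t - 1 for the dual.
w, t = (4, 2, 2, 2), 5
wd, td = w, sum(w) - t - 1          # -> ((4, 2, 2, 2), 4)
f = threshold_fn(w, t)
fd = dual_fn(f, 4)
assert threshold_fn(wd, td) == fd
assert all(f[X] <= fd[X] for X in f)   # t' <= t here, so f is dual-minor
```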
In view of Theorem 9.5, the requirement that the structure be integral is obvi-
ously not essential in the statement of Theorem 9.9; it merely simplifies its
expression. Note also that the conditions for f to be dual-major or dual-minor
are sufficient, but not necessary, in this statement, as illustrated by the next exam-
ple. (Exercise 8 at the end of the chapter actually suggests that it may be hard to
characterize self-dual threshold functions.)
Example 9.7. The threshold function f (x, y, z, u) = xy ∨ xz ∨ xu ∨ yzu
admits the structure (4, 2, 2, 2, 5). Thus, f d is a threshold function with structure
(4, 2, 2, 2, 4), and f is dual-minor. But another structure of f is (2, 1, 1, 1, 2), which
implies that the same vector (2, 1, 1, 1, 2) is also a structure of f d , and hence, that
f is self-dual: f = f d .
The next property was independently observed in the context of threshold logic
(see for instance [698]) and of game theory (see [291]). It involves the concept of
self-dual extension, which we introduced in Section 4.1.3.
Theorem 9.10. The function f (x1 , x2 , . . . , xn ) is a threshold function if and only if
its self-dual extension f SD (x1 , x2 , . . . , xn , xn+1 ) = f x̄n+1 ∨ f d xn+1 is a threshold
function.
Proof. Assume that f is a threshold function and that (w1 , w2 , . . . , wn , t) is an inte-
gral structure of f . Then, it follows from Theorem 9.9 that (w1 , w2 , . . . , wn , 2t +
1 − ∑_{i=1}^n wi , t) is a structure of f SD . Conversely, if f SD is a threshold
function with structure (w1 , w2 , . . . , wn , wn+1 , t), then (w1 , w2 , . . . , wn , t) is a
structure of f .
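A brute-force check of the first direction, on an arbitrary integral structure (chosen here purely for illustration):

```python
from itertools import product

def threshold_fn(weights, t):
    return {X: int(sum(w * x for w, x in zip(weights, X)) > t)
            for X in product((0, 1), repeat=len(weights))}

def dual_fn(f, n):
    return {X: 1 - f[tuple(1 - x for x in X)] for X in product((0, 1), repeat=n)}

w, t = (5, 4, 3, 2), 8                  # an arbitrary integral structure
f = threshold_fn(w, t)
fd = dual_fn(f, 4)

# Theorem 9.10: append the weight 2t + 1 - sum(w) to obtain a structure of f^SD.
w_sd = w + (2 * t + 1 - sum(w),)        # extra weight: 2*8 + 1 - 14 = 3
f_sd = {X: int((f[X[:4]] and not X[4]) or (fd[X[:4]] and X[4]))
        for X in product((0, 1), repeat=5)}
assert threshold_fn(w_sd, t) == f_sd
```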
We conclude this section by stating some results regarding the number of thresh-
old functions and the size of the weights required in a separating structure. The
number of threshold functions of n variables is known quite precisely.
Theorem 9.11. The number τn of threshold functions of n variables satisfies
n²/2 − n/2 ≤ log2 τn ≤ n² . (9.1)
Theorem 9.12. For every threshold function of n variables, there exists an integral
separating structure (w1 , w2 , . . . , wn , t) such that
Moreover, there are constants k > 0 and c > 1 such that, for n a power of 2, there
is a threshold function f of n variables such that any integral separating structure
representing f involves a weight of magnitude at least k c^{−n} n^{n/2} .
In Theorem 9.12 (that we quote directly from Anthony [25]), the upper bound
is due to Muroga [698] and the lower bound is due to Håstad [477]. Observe that
the dominating factor nn/2 is identical in both bounds. Here, we again omit the
proofs and refer the reader to [25, 27, 477, 698] for additional details; see also
Diakonikolas and Servedio [271] for significant extensions.
has a solution (w1 , w2 , . . . , wn , t). When this is the case, every solution of (TS) is a
separating structure of the function.
Proof. The statement follows directly from Definition 9.1 and Theorems 9.4 and
9.6.
w1 + w3 ≤ t
w1 + w4 ≤ t
w2 + w3 ≤ t
w2 + w4 ≤ t
w3 + w4 ≤ t
w1 + w2 ≥ t +1
w1 + w3 + w4 ≥ t +1
w2 + w3 + w4 ≥ t +1
w1 , w2 , w3 , w4 ≥ 0.
This system admits the solution (w1 , w2 , w3 , w4 , t) = (5, 4, 3, 2, 8). Hence, f is a
threshold function with structure (5, 4, 3, 2, 8).
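The complete DNF behind this system is not reproduced above; the inequalities are consistent with the positive function f = x1 x2 ∨ x1 x3 x4 ∨ x2 x3 x4 , which is the assumption made in the following sketch verifying that the solution of (TS) separates:

```python
from itertools import product

# Solution of (TS) found in the example:
w, t = (5, 4, 3, 2), 8

# A DNF consistent with the listed inequalities (an assumption: the ">=" rows
# correspond to the minimal true points (1,1,0,0), (1,0,1,1), (0,1,1,1)):
def f(x1, x2, x3, x4):
    return int((x1 and x2) or (x1 and x3 and x4) or (x2 and x3 and x4))

# (w, t) separates: f(X) = 1 exactly when 5x1 + 4x2 + 3x3 + 2x4 > 8.
for X in product((0, 1), repeat=4):
    assert f(*X) == int(sum(wi * xi for wi, xi in zip(w, X)) > t)
```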
Theorem 9.13, like Definition 9.1, has a strong numerical flavor. The next result
originated in the efforts devoted by researchers in switching logic to establish
purely combinatorial, rather than numerical, characterizations of threshold func-
tions (remember that the study of regularity and of k-monotonicity also originated
in such attempts; see Chapter 8).
We start with a definition (due to Winder [917]) that extends the notions
of 2-summability and 2-asummability already introduced in Definition 8.10 of
Section 8.8.
Definition 9.2. Let k ∈ N, k ≥ 2. A Boolean function f on B n is k-summable if, for
some r ∈ {2, 3, . . . , k}, there exist r (not necessarily distinct) false points of f , say,
X 1 , X 2 , . . . , X r , and r (not necessarily distinct) true points of f , say, Y 1 , Y 2 , . . . , Y r ,
such that ∑_{i=1}^r X^i = ∑_{i=1}^r Y^i . A function is k-asummable if it is not k-summable,
and it is asummable if it is k-asummable for all k ≥ 2.
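For small n, Definition 9.2 can be tested by exhaustive search. A sketch (the 2-summable example is the one discussed in Example 8.24; the conjunction is an arbitrary threshold, hence asummable, function):

```python
from itertools import product, combinations_with_replacement

def is_k_summable(f, n, k):
    """Definition 9.2 by brute force: look for r <= k (not necessarily distinct)
    false points and r true points with equal componentwise sums."""
    trues = [X for X in product((0, 1), repeat=n) if f[X] == 1]
    falses = [X for X in product((0, 1), repeat=n) if f[X] == 0]
    for r in range(2, k + 1):
        false_sums = {tuple(map(sum, zip(*F)))
                      for F in combinations_with_replacement(falses, r)}
        for T in combinations_with_replacement(trues, r):
            if tuple(map(sum, zip(*T))) in false_sums:
                return True
    return False

# x1 XOR x2 is 2-summable: (0,0) + (1,1) = (1,0) + (0,1)
xor2 = {X: X[0] ^ X[1] for X in product((0, 1), repeat=2)}
assert is_k_summable(xor2, 2, 2)

# The conjunction x1 x2 is threshold, hence 2-asummable:
and2 = {X: X[0] & X[1] for X in product((0, 1), repeat=2)}
assert not is_k_summable(and2, 2, 2)
```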
Example 9.9. We have shown in Example 8.24 that the function f (x1 , x2 ) =
x1 x̄2 ∨ x̄1 x2 is 2-summable. We shall provide an example of a 2-asummable,
3-summable function in the proof of Theorem 9.15.
The following characterization of threshold Boolean functions is due to Chow
[193] and Elgot [310].
Theorem 9.14. A Boolean function is a threshold function if and only if it is
asummable.
W X^i ≤ t < W Y^i ,
and hence, ∑_{i=1}^r X^i ≠ ∑_{i=1}^r Y^i . Therefore, f is asummable.
Conversely, assume that f is not threshold, meaning that the set
{X 1 , X 2 , . . . , X p } of false points of f cannot be separated from the set
{Y 1 , Y 2 , . . . , Y m } of its true points by a hyperplane of Rn . Then, standard separation
theorems (see, e.g., [788]) imply that the convex hulls of {X1 , X 2 , . . . , X p } and of
{Y 1 , Y 2 , . . . , Y m } have nonempty intersection. In other words, the following system
has a feasible solution in the variables ui , i = 1, 2, . . . , p, and vj , j = 1, 2, . . . , m:
∑_{i=1}^p ui X^i = ∑_{j=1}^m vj Y^j (9.4)
∑_{i=1}^p ui = 1 (9.5)
∑_{j=1}^m vj = 1 (9.6)
ui ≥ 0 (i = 1, 2, . . . , p) (9.7)
vj ≥ 0 (j = 1, 2, . . . , m). (9.8)
In this proof, we have stressed the connection of Theorem 9.14 with geometric
separability theorems. Alternatively, this result could also be deduced directly by
applying the strong duality theorem of linear programming to the formulation (TS)
(as in [310, 917]).
Thus, if we denote by T h the set of threshold functions, by Ak the set of
k-asummable functions (k ≥ 2), and by CM the set of completely monotone
functions, we obtain the hierarchy displayed in Figure 9.1 for all k ≥ 2:
T h ⊆ Ak+1 ⊆ Ak ⊆ · · · ⊆ A2 = CM (Figure 9.1).
(Compare with the hierarchy of k-monotone functions pictured in Figure 8.8 of
Section 8.8, and recall that A2 = CM by Theorem 8.50.)
It was once conjectured that this hierarchy may be finite, meaning that there
would exist some possibly large, but fixed value k∗ such that the equality T h =
Ak∗ = Ak holds for all k ≥ k∗ . This conjecture was demolished by Winder [915,
917] who proved that, for every k, there exist k-asummable functions that are not
linearly separable. We do not establish this result here, but simply prove the weaker
statement that the inclusion T h ⊆ A2 is strict.
Theorem 9.15. Some 2-asummable functions are not threshold functions.
Proof. Moore [692] (cited in [698, 917]) first exhibited a 12-variable function
establishing this statement. Gabelman [356] later produced a 9-variable example.
We propose here a variant of Gabelman’s example.
Consider first the vector A = (14, 18, 24, 26, 27, 30, 31, 36, 37). We shall use the
observation that the only points of B9 lying on the hyperplane H = {X ∈ B9 :
∑_{i=1}^9 ai xi = 81} are the six points:
threshold (= asummable) ⇒ k-asummable (k > 2) ⇒ 2-asummable (= completely monotone) ⇒ k-monotone (k ≥ 1)
Figure 9.2. A hierarchy of Boolean functions.
The proof of Theorem 9.15 actually shows that the inclusion A3 ⊂ A2 is strict.
This result was generalized by Taylor and Zwicker [860], who proved that Ak+1 ≠
Ak for all k ≥ 2. Another interesting generalization of Winder's result is provided
by Theorem 11.14 in Chapter 11.
Figure 9.2 summarizes the relations between some of the classes of Boolean
functions studied in this chapter and in the previous one. The one-way implications
displayed in Figure 9.2 cannot be reversed. It may be useful to recall here that 1-
monotone functions are exactly monotone functions, and that 2-monotone positive
functions coincide with regular functions. Figure 9.2 will be enriched with one
more class of functions in Section 9.6 (see Figure 9.5).
Threshold Recognition
Instance: A Boolean function f represented by a Boolean expression.
Output: False if f is not a threshold function; a separating structure of f other-
wise.
This question has been extensively studied in the threshold logic literature under
the name of threshold synthesis problem (see, e.g., Hu [511] or Muroga [698]). It
has stimulated the discovery of properties of threshold functions that we discussed
in Section 9.2 and Section 9.3. As we have seen, all early attempts to derive
a “tractable” characterization of thresholdness were unsuccessful. In particular,
none of the increasingly intricate conjectures linking threshold functions to k-
monotonicity or to k-asummability has resisted a deeper examination. Note also
that the asummability characterization in Theorem 9.14 does not seem to yield a
straightforward, efficient thresholdness test.
In spite of this negative news, we are going to prove in this section that the
threshold recognition problem is polynomially solvable when the input function is
positive and is expressed by its complete (prime irredundant) disjunctive normal
form. In Section 9.4.3, we briefly discuss the extent to which these assumptions
are restrictive.
Like most classical approaches to the threshold recognition problem, the algo-
rithm presented relies on the characterization of threshold functions and on the
system of inequalities (TS) formulated in Theorem 9.13. We know that if a pos-
itive Boolean function is given by its complete DNF, then the list of its minimal
true points is readily available. Thus, in order to generate the system (TS) for
such a function, we only need to enumerate the maximal false points of the func-
tion, or, equivalently, to dualize it. But, as we know from Chapter 4, dualizing an
arbitrary positive Boolean function is in general a difficult task, and the number
of maximal false points may very well be exponential in the size of the input
DNF. These difficulties originally motivated the quest for efficient dualization
algorithms for regular functions, which eventually led to the results presented in
Section 8.5. Indeed, these results are easily exploited to obtain a polynomial-time
implementation of the recognition procedure displayed in Figure 9.3.
We thus obtain a remarkable result due to Peled and Simeone [735].
Theorem 9.16. The procedure Threshold is correct and can be implemented to
run in time O(n⁷ m⁵), where n is the number of variables and m is the number of
prime implicants of the function to be tested.
Procedure Threshold(f )
Input: The complete DNF of a positive Boolean function f (x1 , x2 , . . . , xn ).
Output: False if f is not a threshold function; a separating structure of f otherwise.
begin
if f is not regular then return False
else begin
dualize f ;
set up the system (TS);
solve (TS);
if (TS) has no solution then return False
else return a solution (w1 , w2 , . . . , wn , t) of (TS);
end
end
Proof. Testing whether the input function f is regular can be accomplished in time
O(n² m) by the procedure Regular presented in Section 8.4 (Theorem 8.20). If
f is not regular, then f is not a threshold function (by Theorem 9.7). If f is
regular, then it can be dualized in O(n² m) time by the procedure DualReg (Theorem 8.28),
and the system (TS) can be set up within the same time bound. Now, by
Theorem 9.13, f is a threshold function if and only if the system (TS) is consistent,
and every solution of (TS) is a structure of f . Using a polynomial-time algorithm
for linear programming (see [76, 812]), (TS) can be solved in time O(n⁷ m⁵), since
(TS) has n + 1 variables and O(nm) constraints (by Theorem 8.29).
To describe more precisely what happens in Example 9.11, we first recall two
definitions from Section 8.3.
has a solution (w1 , w2 , . . . , wn , t). When this is the case, every solution of (TS*) is
a separating structure of the function.
points (1, 1, 0, 0) and (0, 1, 1, 1). Thus, the system (TS*) associated with f reads
w1 + w3 ≤ t
w1 + w2 ≥ t +1
w2 + w3 + w4 ≥ t +1
w1 = w2 ≥ w3 = w4 ≥ 0.
This system is equivalent to the system (TS∗ ) in Example 9.11.
Because this system has rational coefficients and involves n+2 equations, standard
results about linear programming problems imply that (9.4)–(9.8) has a rational
solution in which at most n + 2 variables take a nonzero value, and whose size is
polynomially bounded in n (see, e.g., [812]; in geometric terms, this is also a con-
sequence of Carathéodory's theorem; see [199, 788]). Let (U , V ) ∈ Rp+m be such
a solution, with I = {i : ui > 0, i = 1, 2, . . . , p}, J = {j : vj > 0, j = 1, 2, . . . , m},
|I | ≤ (n+2), and |J | ≤ (n+2). Then, the points X i (i ∈ I ) and Y j (j ∈ J ), together
with the coefficients ui (i ∈ I ) and vj (j ∈ J ), constitute a polynomial-size cer-
tificate of nonthresholdness for f . This implies that Threshold Recognition is
in co-NP.
The bound on the degree of the input DNF is sharp in Theorem 9.18: Indeed,
the prime implicants of a quadratic DNF can be generated in polynomial time (see
Section 5.8), so that the generic recognition procedure sketched at the beginning
of this subsection applies. In particular, the case of nonmonotone quadratic DNFs
can be efficiently reduced to the positive case, which we discuss in greater detail
in Section 9.7.
such that (9.13) and (9.14) have the same set of solutions over B n ? To establish
the link between the aggregation problem and the threshold recognition problem,
we rely on some of the concepts that have been introduced in Chapter 1, Section
1.13.6. Remember that the resolvent of the system (9.13) is the Boolean function
f (x1 , x2 , . . . , xn ) whose false points are exactly the 0–1 solutions of (9.13) (see
Section 1.13.6). Then, the aggregation problem is simply asking whether f is a
threshold function, and Theorem 9.18 implies that the aggregation problem is NP-
hard, even for systems of generalized covering inequalities (see Theorem 1.39).
However, when (9.13) happens to be a system of set-covering inequalities, meaning
that aij ∈ {−1, 0} and bi = −1 (i = 1, 2, . . . , m, j = 1, 2, . . . , n), then f is a nega-
tive function and its prime implicants are readily available. Hence, in this special
case, the procedure Threshold provides an efficient solution of the aggregation
problem (Peled and Simeone [735]). See also Application 9.12 in Section 9.7 for
related considerations.
9.5 Prime implicants of threshold functions
We now present an algorithm to generate all prime implicants (or, more pre-
cisely, all minimal true points) of a positive threshold function f (x1 , x2 , . . . , xn )
described by a structure (w1 , w2 , . . . , wn , t). We assume that the variables of f have
been permuted in such a way that w1 ≥ w2 ≥ · · · ≥ wn ≥ 0 and, to rule out the
trivial cases where f is constant on Bn , we also assume that 0 ≤ t < ∑_{i=1}^n wi .
For k = 1, 2, . . . , n, we denote by Tk the set of all points (y1∗ , y2∗ , . . . , yk∗ ) ∈ Bk such
that f has a minimal true point of the form (y1∗ , y2∗ , . . . , yn∗ ) ∈ Bn for an appropriate
choice of (y∗k+1 , y∗k+2 , . . . , yn∗ ) ∈ Bn−k . Thus, Tn contains exactly the minimal true
points of f . We also let T0 = {( )}, where () is the “empty” vector. (Observe that
this convention is coherent with the previous definition, since we have assumed
that Tn is nonempty.)
Now, the prime implicant generation algorithm recursively generates
T1 , T2 , . . . , Tn : The next result explains how Tk+1 can be efficiently generated when
Tk is at hand.
Proof. Let Y ∗ = (y1∗ , y2∗ , . . . , yk∗ ), and consider assertion (1). If (Y ∗ , 1) is in Tk+1 , then
f has a minimal true point of the form (Y ∗ , 1, Z ∗ ) ∈ Bn , and hence, (Y ∗ , 0, . . . , 0) ∈
Bn is a false point of f , meaning that ∑_{i=1}^k wi yi∗ ≤ t.
Conversely, assume now that ∑_{i=1}^k wi yi∗ ≤ t. Since Y ∗ ∈ Tk , the point
(Y ∗ , 1, . . . , 1) ∈ Bn is a true point of f , and hence, ∑_{i=1}^k wi yi∗ + ∑_{i=k+1}^n wi > t.
Let r ≥ k + 1 be the smallest index such that ∑_{i=1}^k wi yi∗ + ∑_{i=k+1}^r wi > t. Define
y∗k+1 = . . . = yr∗ = 1, y∗r+1 = . . . = yn∗ = 0, and X ∗ = (Y ∗ , y∗k+1 , . . . , yn∗ ). Then, X ∗
is a minimal true point of f , and hence, (Y ∗ , 1) ∈ Tk+1 as required.
Consider now assertion (2), and assume that (Y ∗ , 0) is in Tk+1 . Then
(Y ∗ , 0, 1, . . . , 1) ∈ Bn is a true point of f ; hence, ∑_{i=1}^k wi yi∗ + ∑_{i=k+2}^n wi > t.
Conversely, assume that ∑_{i=1}^k wi yi∗ + ∑_{i=k+2}^n wi > t. There are two cases:
• If ∑_{i=1}^k wi yi∗ ≤ t, let r ≥ k + 2 be the smallest index such that ∑_{i=1}^k wi yi∗ +
∑_{i=k+2}^r wi > t. Define y∗k+1 = 0, y∗k+2 = . . . = yr∗ = 1, y∗r+1 = . . . = yn∗ = 0,
and let X ∗ = (Y ∗ , y∗k+1 , . . . , yn∗ ). Then, X ∗ is a minimal true point of f , and
hence, (Y ∗ , 0) ∈ Tk+1 .
• If ∑_{i=1}^k wi yi∗ > t, then X ∗ = (Y ∗ , 0, . . . , 0) ∈ Bn is a true point of f . On the
other hand, since Y ∗ ∈ Tk , there exists a minimal true point of f of the form
Z ∗ = (Y ∗ , y∗k+1 , . . . , yn∗ ). By minimality of Z ∗ , we conclude that X ∗ = Z ∗ ,
and hence, (Y ∗ , 0) ∈ Tk+1 .
Theorem 9.20 leads to the algorithm displayed in Figure 9.4. We illustrate this
algorithm on a small example.
Procedure MinTrue(w1 , w2 , . . . , wn , t)
Input: The separating structure (w1 , w2 , . . . , wn , t) ∈ Qn+1 of a threshold function f ,
with w1 ≥ w2 ≥ · · · ≥ wn ≥ 0.
Output: The set T of minimal true points of f .
begin
if ∑_{i=1}^n wi ≤ t then return T := ∅
else if t < 0 then return T := {(0, . . . , 0)}
else begin
T0 := {( )};
for j := 1 to n do use Theorem 9.20 to generate Tj ;
return T := Tn ;
end
end
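A direct transcription of MinTrue in Python follows. The structure (5, 4, 3, 2, 1, 8) used in the final check is inferred from the numbers appearing in the surrounding example and is therefore an assumption:

```python
def min_true(w, t):
    """Generate the minimal true points of the positive threshold function
    defined by sum(w_i x_i) > t, via the recursion of Theorem 9.20.
    Assumes w_1 >= w_2 >= ... >= w_n >= 0."""
    n = len(w)
    if sum(w) <= t:                      # f is identically 0
        return []
    if t < 0:                            # f is identically 1
        return [(0,) * n]
    T = [()]                             # T_0: the empty prefix
    for k in range(n):                   # build T_{k+1} from T_k
        nxt = []
        for Y in T:
            s = sum(wi * yi for wi, yi in zip(w, Y))
            if s <= t:                           # assertion (1): (Y, 1) in T_{k+1}
                nxt.append(Y + (1,))
            if s + sum(w[k + 1:]) > t:           # assertion (2): (Y, 0) in T_{k+1}
                nxt.append(Y + (0,))
        T = nxt
    return T

# With the assumed structure (5, 4, 3, 2, 1, 8), the recursion reproduces
# T2 = {(1, 0), (1, 1), (0, 1)} and ends with four minimal true points.
assert sorted(min_true((5, 4, 3, 2, 1), 8)) == [
    (0, 1, 1, 1, 0), (1, 0, 1, 0, 1), (1, 0, 1, 1, 0), (1, 1, 0, 0, 0)]
```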
Next, we let k = 1 in Theorem 9.20. When Y ∗ = (1), we have ∑_{i=1}^k wi yi∗ =
5 ≤ 8 and ∑_{i=1}^k wi yi∗ + ∑_{i=k+2}^n wi = 11 > 8. Thus, the points (1, 0) and
(1, 1) are in T2 . On the other hand, when Y ∗ = (0), ∑_{i=1}^k wi yi∗ = 0 ≤ 8 and
∑_{i=1}^k wi yi∗ + ∑_{i=k+2}^n wi = 6 ≤ 8. Hence, (0, 1) is in T2 , and we conclude that
T2 = {(1, 0), (1, 1), (0, 1)}.
Continuing in this way, we successively produce
In the next statement, the term (arithmetic) operations denotes elementary oper-
ations, such as additions, subtractions, multiplications, comparisons, performed on
numbers of size polynomially bounded in the size of the input.
Johnson, and Peled [444], and Wolsey [922], and further developed in numerous
publications (see, e.g., Balas and Zemel [46]; Weismantel [904]; Zemel [935],
etc.). Their practical use in 0–1 programming was first convincingly demonstrated
by Crowder, Johnson, and Padberg [245]. We refer the reader to Nemhauser and
Wolsey [707] or Wolsey [924] for more information on this topic.
We also note that, more recently, a number of researchers have examined effi-
cient procedures to translate knapsack systems of the form (9.15)–(9.16) into
equivalent Boolean DNF equations, possibly involving additional variables. This
line of research, in the spirit of Chapter 2, Section 2.3, opens the possibility
of relying on purely Boolean techniques (such as satisfiability solvers) to han-
dle 0–1 linear optimization problems; see, for instance, Bailleux, Boufkhad, and
Roussel [41]; Eén and Sörensson [290]; or Manquinho and Roussel [667].
0 ≤ xj ≤ 1 (j = 1, 2, 3, 4). (9.21)
It is easily seen that the inequality (9.20) has exactly the same 0–1 solutions as
2x1 + x2 + x3 + x4 ≤ 3 (9.22)
(that is, both inequalities define the same threshold function f (x1 , x2 , x3 , x4 ) =
x1 x2 x3 ∨ x1 x2 x4 ∨ x1 x3 x4 ), but some fractional solutions of (9.20)–(9.21) are cut
off by the inequality (9.22): For instance, (x1∗ , x2∗ , x3∗ , x4∗ ) = ( 23 , 23 , 23 , 23 ) satisfies
(9.20)–(9.21) but violates (9.22).
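Both claims in this example are easy to verify mechanically (inequality (9.20) itself is not reproduced above, so only the equivalence of (9.22) with the stated DNF, and the cut-off fractional point, are checked):

```python
from itertools import product

# f(x1, x2, x3, x4) = x1x2x3 ∨ x1x2x4 ∨ x1x3x4, as in the text:
def f(x1, x2, x3, x4):
    return int((x1 and x2 and x3) or (x1 and x2 and x4) or (x1 and x3 and x4))

# The 0-1 points violating (9.22) are exactly the true points of f:
for X in product((0, 1), repeat=4):
    assert f(*X) == int(2 * X[0] + X[1] + X[2] + X[3] > 3)

# The fractional point (2/3, 2/3, 2/3, 2/3) is cut off by (9.22):
assert 2 * (2 / 3) + 3 * (2 / 3) > 3
```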
Even though it is not true that a reduction of the coefficient sizes always implies
a strengthening of the inequality, this is, nevertheless, often the case. Coefficient
reduction is therefore of interest in branch-and-bound and cutting-plane algo-
rithms for 0–1 linear programming problems (see Bradley, Hammer, and Wolsey
[148]; Nemhauser and Wolsey [707]; Williams [910], etc.). Similar issues also
arise in electrical engineering; see, for instance, Muroga [698].
A possible approach to coefficient reduction problems goes as follows: Given the
initial separating structure (w1 , w2 , . . . , wn , t), generate the maximal false points
X 1 , X 2 , . . . , X p and the minimal true points Y 1 , Y 2 , . . . , Y m of the corresponding
threshold function f (as usual, we assume that f is positive). Then, in view of
Theorem 9.13, a “reduced” structure of f can be found by solving the optimization
problem
wi ≥ 0 (i = 1, 2, . . . , n), (9.26)
ωi (f ) = | {Y ∗ ∈ Bn | f (Y ∗ ) = 1 and yi∗ = 1} |, i = 1, 2, . . . , n.
When no confusion can arise, we sometimes drop the symbol f from the nota-
tion ωi (f ) or ω(f ). Note that (ω1 , ω2 , . . . , ωn ) = ∑_{j=1}^ω Y^j , where Y 1 , Y 2 , . . . , Y ω
are the true points of f . We should also mention that many variants of Definition
9.5 have been used in the literature (see, e.g., [920] and Section 9.6.2). These vari-
ants give rise to different scalings of the Chow parameters, while preserving their
main features.
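Computing Chow parameters by direct enumeration is straightforward for small n; for instance, for the threshold function with structure (5, 4, 3, 2, 8) (a structure chosen purely for illustration):

```python
from itertools import product

def chow_parameters(f, n):
    """omega_i(f) = number of true points with y_i = 1; omega(f) = number of true points."""
    trues = [X for X in product((0, 1), repeat=n) if f[X] == 1]
    return [sum(X[i] for X in trues) for i in range(n)], len(trues)

f = {X: int(sum(w * x for w, x in zip((5, 4, 3, 2), X)) > 8)
     for X in product((0, 1), repeat=4)}
omegas, omega = chow_parameters(f, 4)
# The six true points yield omega = 6 and (omega_1, ..., omega_4) = (5, 5, 4, 4);
# note that omega_1 >= ... >= omega_4 mirrors the ordering w_1 >= ... >= w_4.
```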
With Definition 9.6 at hand, we are now ready to state Chow’s fundamental
result (see Chow [194]; Muroga [698] also credits Tannenbaum [857] for this
result).
Theorem 9.22. Every threshold function is a Chow function.
Proof. Consider a threshold function f on B n , and a function g on Bn having the
same Chow parameters as f . We must show that f = g.
Let us denote by Y^1 , Y^2 , . . . , Y^ω the true points of f , and by X^1 , X^2 , . . . , X^k ,
Y^{k+1} , Y^{k+2} , . . . , Y^ω the true points of g, where ω = ω(f ) = ω(g), 0 ≤ k ≤ ω,
and X^1 , X^2 , . . . , X^k are false points of f . Since f and g have the same Chow
parameters,
parameters,
Σ_{j=1}^{ω} Y^j = Σ_{j=1}^{k} X^j + Σ_{j=k+1}^{ω} Y^j ,
or, equivalently,
Σ_{j=1}^{k} Y^j = Σ_{j=1}^{k} X^j .     (9.27)
Now, if k ≥ 1, then (9.27) contradicts the fact that f is asummable. Hence, we con-
clude that k = 0, meaning that f and g have the same true points, and that f = g.
This result shows that every threshold function is uniquely identified by its
Chow parameters. Chow parameters have therefore been used as convenient iden-
tifiers for cataloging threshold functions; see Muroga [698] for a table of threshold
functions up to five variables; Muroga, Toda, and Kondo [701] or Winder [917]
for functions of six variables; Winder [919] for functions of seven variables; and
Muroga, Tsuboi, and Baugh [702] (cited in [698]) for functions of eight variables.
Observe that all points occurring in equation (9.27) are distinct. This motivates
the introduction of yet another concept.
Definition 9.7. A Boolean function f is weakly asummable if, for all k ≥ 1, there do
not exist k distinct false points of f , say, X^1 , X^2 , . . . , X^k , and k distinct true points
of f , say, Y^1 , Y^2 , . . . , Y^k , such that Σ_{i=1}^{k} X^i = Σ_{i=1}^{k} Y^i .
[Figure 9.5 diagrams implications among the classes of threshold, asummable, k-asummable (k > 2), 2-asummable, weakly asummable, Chow, completely monotone, and k-monotone (k ≥ 1) functions.]
Figure 9.5. A hierarchy of Boolean functions: Enlarged version.
There exist Chow functions that are not threshold, and the function constructed
in the proof of Theorem 9.15 is completely monotone but not a Chow function.
On the other hand, Yajima and Ibaraki [929] showed that Chow functions are
completely monotone (we leave the proof of this assertion as an end-of-chapter
exercise). Thus, we obtain the hierarchy displayed in Figure 9.5 (compare with
Figure 9.2 in Section 9.3).
Chow’s theorem has been more recently revisited by O’Donnell and Serve-
dio [717], who established a “robust” generalization of it: Namely, they
proved that if f is a threshold function, if g is an arbitrary function, and if
(ω1 (f ), ω2 (f ), . . . , ωn (f ), ω(f )) is “close” to (ω1 (g), ω2 (g), . . . , ωn (g), ω(g)) in
some appropriate norm, then the functions f and g are also “close” in the norm
Σ_{X∈Bn} |f (X) − g(X)| (if we replace “close” by “equal” in this statement, then
we obtain exactly Theorem 9.22). Based on this result, O’Donnell and Servedio
proposed a fast algorithmic version of Chow’s theorem, which allows them to effi-
ciently construct an approximate representation of a threshold function given its
Chow parameters (an extension of this problem is mentioned in Application 9.10
hereunder). We refer to [717] for details and applications in learning theory; see
also Matulef et al. [677] for additional far-reaching extensions of Chow’s theorem.
Proof. Let (ω1 , ω2 , . . . , ωn , ω) denote the Chow parameters of f . Fix i ∈ {1, 2, . . . , n},
and let A = {X ∈ B n : f (X) = 1, xi = 1}, B = {X ∈ Bn : f (X) = 1, xi = 0}. So,
|A| = ωi and |B| = ω − ωi . If f is positive in xi , then the mapping m(X) = X ∨ ei
is one-to-one on B, and m(B) ⊆ A. Hence, |B| ≤ |A| and πi ≥ 0. Moreover, if f
depends on xi , then |B| < |A|, and hence πi > 0. This establishes assertion (1);
assertions (2) and (3) are proved in a similar way.
Assertions (4) and (5) are a restatement of Theorem 8.4 in Section 8.2.
By definition of duality, f d (X) = 1 if and only if f (X̄) = 0. It follows directly
that ω(f d ) = 2^n − ω, and hence, that π(f d ) = ω(f d ) − 2^{n−1} = 2^{n−1} − ω = −π .
Similarly, for i = 1, 2, . . . , n, ωi (f d ) = 2^{n−1} − ω + ωi , and hence,
πi (f d ) = 2 ωi (f d ) − ω(f d )
         = 2 (2^{n−1} − ω + ωi ) − (2^n − ω)
         = 2 ωi − ω
         = πi .
This proves assertion (6). As for (7), observe that f d ≤ f implies ω(f d ) ≤ ω, and
hence, π ≥ 0. A similar reasoning yields (8).
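These duality identities are easy to confirm numerically; a small sketch (the sample function is our own choice):

```python
from itertools import product

def modified_chow(f, n):
    """Modified Chow parameters (pi_1, ..., pi_n) and pi, with
    pi_i = 2*omega_i - omega and pi = omega - 2^(n-1)."""
    pts = [X for X in product((0, 1), repeat=n) if f(X)]
    omega = len(pts)
    pis = [2 * sum(X[i] for X in pts) - omega for i in range(n)]
    return pis, omega - 2 ** (n - 1)

def dual(f):
    """f^d(X): the complement of f, evaluated at the complemented point."""
    return lambda X: 1 - f(tuple(1 - x for x in X))

# Threshold function x1 v x2x3: f(X) = 1 iff 2x1 + x2 + x3 >= 2.
f = lambda X: int(2 * X[0] + X[1] + X[2] >= 2)
```

Here modified_chow(dual(f), n) returns the same πi and the negated π, exactly as assertion (6) and the preceding computation state.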
In the case of threshold functions, how much further does the analogy go
between weights and modified Chow parameters? First, it can be informally
stated that, for a threshold function with separating structure (w1 , w2 , . . . , wn , t),
the vectors (w1 , w2 , . . . , wn ) and (π1 , π2 , . . . , πn ) often turn out to be “roughly”
proportional. This, in spite of the fact that, as the following example shows,
proportionality can become quite rough when the separating structure is picked
arbitrarily.
Example 9.17. The threshold function with structure (w1 , w2 , w3 , t) = (1, 1, 1, 1) has
modified Chow parameters (π1 , π2 , π3 ) = (2, 2, 2), so that (w1 , w2 , w3 ) is exactly
proportional to (π1 , π2 , π3 ). But (50, 50, 1, 50) and (50, 33, 18, 50) are two other
structures of the same function, for which proportionality with (π1 , π2 , π3 ) is much
more approximative!
Based on the previous example, one may be tempted to go one step further and
to conjecture that every threshold function admits a separating structure whose
weights are proportional to the modified Chow parameters π1 , π2 , . . . , πn of the
function. Or, in other words, that every such function has a structure of the form
(π1 , π2 , . . . , πn , t), for some suitable choice of t. This conjecture is easily disproved,
however.
Example 9.18. The function f (x1 , x2 , x3 , x4 , x5 ) = x1 x2 ∨ x1 x3 x4 ∨ x1 x3 x5 ∨
x2 x3 x4 x5 is a threshold function with separating structure (4, 3, 2, 1, 1, 6), and
its modified Chow parameters are (π1 , π2 , π3 , π4 , π5 , π ) = (10, 6, 4, 2, 2, −4). But this
function has no structure of the form (10, 6, 4, 2, 2, t), for any t (otherwise, 14 ≤ t,
since (1, 0, 1, 0, 0) is a false point of f , and t < 14, since (0, 1, 1, 1, 1) is a true
point of f ). A similar reasoning also shows that f has no structure of the form
(ω1 , ω2 , ω3 , ω4 , ω5 , t) = (11, 9, 8, 7, 7, t).
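The two-point argument of Example 9.18 can be scripted directly (with the book's convention that X is a true point exactly when its weighted sum exceeds t):

```python
# Example 9.18: the weights (10, 6, 4, 2, 2) admit no threshold t at all.
w = (10, 6, 4, 2, 2)
false_point = (1, 0, 1, 0, 0)   # f = 0 here, forcing  sum <= t
true_point = (0, 1, 1, 1, 1)    # f = 1 here, forcing  sum > t
lo = sum(wi * xi for wi, xi in zip(w, false_point))
hi = sum(wi * xi for wi, xi in zip(w, true_point))
# lo == hi == 14, so the constraints 14 <= t and t < 14 are incompatible.
```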
Notwithstanding this dispiriting news, Dubey and Shapley [279] observed that,
in some sense, the vector of modified Chow parameters actually is proportional to
the “average” of the vector of weights.
Theorem 9.25. Let w1 , w2 , . . . , wn be fixed nonnegative numbers and W =
Σ_{j=1}^{n} wj , let t be a random variable uniformly distributed on [ 0, W ],
and let f (x1 , x2 , . . . , xn ) be the (random) threshold function with structure
(w1 , w2 , . . . , wn , t). Then E(πi (f )) = 2^{n−1} wi /W for i = 1, 2, . . . , n.
Table 9.1. Modified Chow parameters of the threshold functions with structure
(5, 4, 3, 2, 1, t) (a row labeled “a, b” applies to both t = a and t = b)

    t       π1   π2   π3   π4   π5
    0, 14    1    1    1    1    1
    1, 13    2    2    2    2    0
    2, 12    3    3    3    1    1
    3, 11    5    5    3    1    1
    4, 10    7    5    3    3    1
    5, 9     8    6    4    4    2
    6, 8     9    7    5    3    1
    7       10    6    6    2    2
    Total   80   64   48   32   16
The same result holds, with the same proof, if the weights w1 , w2 , . . . , wn
are assumed to be nonnegative integers and if t is uniformly distributed on
{0, 1, . . . , W − 1}. Dubey and Shapley [279] illustrate this point with the following
example.
Example 9.19. Let (w1 , w2 , w3 , w4 , w5 ) = (5, 4, 3, 2, 1) and consider all threshold
functions with separating structures (w1 , w2 , w3 , w4 , w5 , t), where t can take any
value in the set {0, 1, . . . , 14}. The modified Chow parameters of these 15 func-
tions are displayed in Table 9.1. The average value of π1 for this set of functions
is 80/15 = 2^{n−1} w1 /W . Note, however, that there is no single choice of the threshold
t for which the vector of modified Chow parameters is exactly proportional to
(w1 , w2 , . . . , wn ).
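Both Theorem 9.25 (in its integer variant) and the totals of Table 9.1 can be reproduced by brute force; a sketch, again with true points defined by Σ wj xj > t:

```python
from itertools import product

w, n = (5, 4, 3, 2, 1), 5
W = sum(w)   # 15; the threshold t ranges over {0, 1, ..., W - 1}

def pi1(t):
    """Modified Chow parameter pi_1 of the threshold function with structure (w, t)."""
    pts = [X for X in product((0, 1), repeat=n)
           if sum(wi * xi for wi, xi in zip(w, X)) > t]
    return 2 * sum(X[0] for X in pts) - len(pts)

total = sum(pi1(t) for t in range(W))
# total == 80, and total / W == 2^(n-1) * w1 / W, as in Example 9.19.
```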
Additional theoretical results describing the relation between weights and Chow
parameters of a threshold function, as well as algorithms allowing us to reconstruct
Bilbao, Fernández, Jiménez, and López [81]; Laruelle and Widgrén [600]; Leech
[607], and so on.
Even more interestingly, the above mathematical model makes it very clear that
the one man, one vote principle, as embodied in the Court decree and further inter-
preted in terms of Banzhaf indices, cannot always be implemented in real-world
situations. Indeed, for fixed n, the number of possible realizations of the vector
(1/S)(s1 , s2 , . . . , sn ) is infinite, whereas the number of Banzhaf vectors (β1 , β2 , . . . , βn )
is obviously finite (since the number of threshold functions of n variables is finite).
So, for most distributions of population sizes, there exists no allocation of weights
(w1 , w2 , . . . , wn ) that implies a distribution of power equal to (1/S)(s1 , s2 , . . . , sn ). In
such cases, the need arises again to give an operational meaning to the one man,
one vote principle. How can this be achieved?
One (rather intriguing) possibility raised by Papayanopoulos [729] would be
to assign exactly si votes to legislator i, and to let the threshold t vary randomly
between 0 and S (namely, the threshold would be drawn randomly in [0, S] when-
ever the legislature is to vote). By virtue of Theorem 9.25, this would provide
legislator i with an expected share of power proportional to the size of his or her
constituency. Unfortunately, even though this solution may sound quite attractive
to a mathematically inclined political scientist, it is doubtful that it will be adopted
by any real-world legislature in the foreseeable future!
Another, more realistic way out of the dilemma has been actually implemented
by some county supervisorial boards in the State of New York. In these bodies, the
one man, one vote principle has been translated as follows: The voting weights
w1 , w2 , . . . , wn and the threshold t should be specified in such a way that the
Banzhaf vector (β1 , β2 , . . . , βn ) be “as close as possible” to the population distribution
(1/S)(s1 , s2 , . . . , sn ), or, in other words, so as to minimize the distance (in some
appropriate norm) between (β1 , β2 , . . . , βn ) and (1/S)(s1 , s2 , . . . , sn ). This interpretation
of the one man, one vote principle gives rise to an interesting, but hard, combinato-
rial optimization problem; see Alon and Edelman [17]; Aziz, Paterson, and Leech
[39]; Lucas [632]; McLean [639]; O’Donnell and Servedio [717]; Papayanopou-
los [727, 728, 729]; Laruelle and Widgrén [600]; or Leech [607, 608] for more
information and related applications.
Indeed, the following result due to Garey and Johnson [371], in conjunction with
Theorem 9.24, shows that it is already NP-complete to decide whether a modified
Chow parameter vanishes or not (compare with Theorem 1.32).
Theorem 9.26. Deciding whether a threshold function depends on its last variable
is NP-complete when the function is described by a separating structure.
Proof. The problem is obviously in NP. Now, recall that the following Subset Sum
problem is NP-complete [371]: Given n + 1 positive integers (w1 , w2 , . . . , wn , t),
is there a point X∗ ∈ Bn such that Σ_{j=1}^{n} wj xj∗ = t?
With an arbitrary instance (w1 , w2 , . . . , wn , t) of Subset Sum, we associate the
threshold function f (x1 , x2 , . . . , xn+1 ) with structure (w1 , w2 , . . . , wn , 1/2, t). It is clear
that f depends on its last variable xn+1 if and only if (w1 , w2 , . . . , wn , t) is a Yes
instance of Subset Sum.
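To make the reduction concrete, here is a brute-force illustration (exponential time, of course; the point of the theorem is that no efficient test is expected). To stay within integer arithmetic we double every weight, so the extra weight 1/2 becomes 1 and the threshold t becomes 2t:

```python
from itertools import product

def depends_on_last(weights, t):
    """Does the threshold function with structure (w_1, ..., w_n, 1/2, t)
    depend on x_{n+1}?  All weights are doubled to avoid fractions."""
    for X in product((0, 1), repeat=len(weights)):
        s = sum(2 * wi * xi for wi, xi in zip(weights, X))
        # x_{n+1} (doubled weight 1) changes the value exactly when s == 2t,
        # i.e., when the Subset Sum instance (weights, t) has a solution.
        if (s > 2 * t) != (s + 1 > 2 * t):
            return True
    return False
```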
Prasad and Kelly [754] actually proved that, for a threshold function given
by a separating structure, computing Banzhaf indices – or, equivalently, Chow
parameters – is #P-complete; compare with Theorem 1.38. (A similar observation
was already formulated by Garey and Johnson [371] for Shapley-Shubik indices;
see also Deng and Papadimitriou [268] and Matsui and Matsui [676].)
As a remarkable illustration of the occurrence of “dummy” variables in weighted
majority systems, we mention a well-known story among political scientists (see,
e.g., [150, 329]).
Application 9.11. (Political science, game theory.) In 1958, the European Eco-
nomic Community had six member-states, namely, Belgium, France, Germany,
Italy, Luxembourg, and the Netherlands. Its Council of Ministers relied on a
weighted majority decision rule with voting weight 4 for France, Germany and
Italy, weight 2 for Belgium and the Netherlands, and weight 1 for Luxembourg. The
threshold was set to t = 11. With these rules, it is readily seen that Luxembourg
actually had no voting power at all, since the outcome of the vote was always
determined regardless of the decision made by Luxembourg.
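The claim is easily verified by counting “swings,” the coalitions of the other members for which a given member's vote is decisive (a member is a dummy exactly when it has no swings; the script follows the book's convention that a coalition wins when its total weight exceeds t):

```python
from itertools import product

# EEC Council of Ministers, 1958.
weights = [4, 4, 4, 2, 2, 1]  # France, Germany, Italy, Belgium, Netherlands, Luxembourg
t = 11

def swings(i):
    """Number of coalitions of the other members for which member i is decisive."""
    others = [w for j, w in enumerate(weights) if j != i]
    count = 0
    for picks in product((0, 1), repeat=len(others)):
        s = sum(w * x for w, x in zip(others, picks))
        if s <= t < s + weights[i]:   # losing without i, winning with i
            count += 1
    return count

# The other five weights are all even, so no coalition totals exactly 11,
# and Luxembourg (index 5) has no swings: it is a dummy.
```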
Proof. We assume for the sake of simplicity that w1 , w2 , . . . , wn and t are positive
(only minor adaptations are required in the general case). For j = 0, 1, . . . , n and
and Shapley [661] (who credit Cantor), and for Banzhaf indices by Brams and
Affuso [150]; see also Algaba et al. [15]; Bilbao [79]; Bilbao et al. [80, 81];
Fernández, Algaba, Bilbao, Jiménez, Jiménez and López [330]; Leech [607];
Papayanopoulos [729] for related work, extensions, and applications in various
political settings.
Finally, we refer to Crama and Leruth [236]; Crama, Leruth, Renneboog, and
Urbain [237]; Cubbin and Leech [246]; Gambarelli [368]; Leech [606, 608, 609]
for different approaches to the computation of Banzhaf indices in the framework
of corporate finance applications.
Threshold graphs were introduced by Chvátal and Hammer [201] and, inde-
pendently, by Henderson and Zalcstein [487]. Most of this section is based
on [201].
The central question we want to address is: Which graphic functions are thresh-
old, or, equivalently, which graphs are threshold? In view of the correspondence
Figure 9.7. (a): 2K2 , (b): P4 , (c): C4 .
Theorem 9.6 and the arguments in the proof of Theorem 9.4, we can assume that
wi > 0 for all i = 2, 3, . . . , n, and that t > 0.
Now, it is easy to see that (w1 , w2 , . . . , wn , t) is a structure for f , where w1 = 0
if vertex 1 is isolated in Gf , and w1 = t if vertex 1 is dominating in Gf . Hence, f
is a threshold function.
Definition 9.10. Let G be a graph on {1, 2, . . . , n}, and let di = deg(i) be the
degree of vertex i, for i = 1, 2, . . . , n. If (π(1), π(2), . . . , π(n)) is a permutation
of {1, 2, . . . , n} such that dπ(1) ≥ dπ(2) ≥ · · · ≥ dπ(n) , then we say that d(G) =
(dπ(1) , dπ(2) , . . . , dπ(n) ) is the degree sequence of G. A degree sequence is called
threshold if it is the degree sequence of at least one threshold graph.
Theorem 9.31. If G is a threshold graph and H is a graph such that d(H ) = d(G),
then H is isomorphic to G.
Theorem 9.31 is closely related to Theorem 9.22: Indeed, the Chow parameters
of a threshold graphic function can be explicitly expressed as a function of the
corresponding degree sequence (see Exercise 19 at the end of the chapter).
An interesting corollary of this result is that all the information concerning the
“thresholdness” of a graph is embodied in its degree sequence. In other words,
it must be possible to decide whether a graph G is a threshold graph by simply
examining its degree sequence. As a matter of fact, a careful reading of the proof
of Theorem 9.31 indicates how to decide, in O(n2 ) operations, whether a sequence
(d1 , d2 , . . . , dn ) of nonnegative integers is a threshold sequence. This result is not
best possible: Threshold sequences and threshold graphs can be recognized in time
O(n); see Golumbic [398] or Mahadev and Peled [645] for details.
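One standard O(n^2) test, in the spirit of the proof of Theorem 9.31, dismantles the sequence step by step: a sequence is threshold exactly when one can repeatedly delete a vertex that is currently isolated (degree 0) or currently dominating (degree n − 1). A sketch (our own implementation, not the linear-time method cited above):

```python
def is_threshold_sequence(d):
    """Decide whether the nonincreasing rearrangement of d is a threshold sequence."""
    d = sorted(d, reverse=True)
    while d:
        if d[-1] == 0:
            d.pop()                      # delete an isolated vertex
        elif d[0] == len(d) - 1:
            d = [x - 1 for x in d[1:]]   # delete a dominating vertex
        else:
            return False                 # neither exists: not a threshold sequence
    return True
```

On the three forbidden configurations of Figure 9.7 the test fails at once: none of the sequences (1, 1, 1, 1) (2K2), (2, 2, 1, 1) (P4), and (2, 2, 2, 2) (C4) has an isolated or dominating vertex.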
The foregoing observations can also be derived from an analytical characteri-
zation of threshold sequences due to Hammer, Ibaraki, and Simeone [442]. Before
we state this result, we first recall a classical theorem of Erdős and Gallai [313]
(see also [71]).
Example 9.21. The sequence d = (2, 2, 2, 2) is the degree sequence of the cycle
C4 . For this sequence, (9.31) becomes
2r ≤ r(r − 1) + r(4 − r) when r = 1, 2,
and
2r ≤ r(r − 1) + 2(4 − r) when r = 3, 4.
Notice that, in the previous example, all of the Erdős-Gallai inequalities are
satisfied as strict inequalities. This need not be the case in general.
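For a nonincreasing sequence d1 ≥ d2 ≥ · · · ≥ dn of nonnegative integers with even sum, the Erdős–Gallai conditions require Σ_{i=1}^{r} di ≤ r(r − 1) + Σ_{i=r+1}^{n} min(di , r) for r = 1, 2, . . . , n. The slacks of these inequalities are easy to tabulate; a sketch (the helper name is ours):

```python
def erdos_gallai_slack(d):
    """Slack (right-hand side minus left-hand side) of each Erdos-Gallai
    inequality for a nonincreasing degree sequence d."""
    n = len(d)
    return [r * (r - 1) + sum(min(di, r) for di in d[r:]) - sum(d[:r])
            for r in range(1, n + 1)]

# For d = (2, 2, 2, 2), every slack is strictly positive, as noted above.
```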
xi + xj ≤ 1, (i, j ) ∈ E. (9.33)
Now, the 0–1 solutions of (9.33) are precisely the characteristic vectors of
the stable sets of G(A). Hence, by Theorem 9.28, there exists a single linear
inequality having the same 0–1 solutions as (9.32) (or (9.33)) if and only if G(A)
is a threshold graph. In particular, this proves that the aggregation problem is
polynomially solvable for set packing inequalities.
When (9.32) is not equivalent to a single linear inequality, one can push the
investigation a bit further and ask instead: What is the smallest integer δ for which
there exists a system of δ linear inequalities

Σ_{j=1}^{n} wkj xj ≤ tk ,     k = 1, 2, . . . , δ,     (9.34)
such that (9.32) and (9.34) have the same set of 0–1 solutions (Chvátal and Hammer
[201], Neumaier [709])? We denote this value by δ(A) and call it the threshold
dimension of A. Similarly, for a graph G = (V , E), we can define the threshold
can now construct a Guttman scale g as follows. For p ∈ P , let g(p) = a(p). For
s ∈ S, consider the largest value a(p ∗ ) such that w(s) + a(p ∗ ) ≤ t, and define
g(s) = a(p∗ ). It is easy to check that g is a valid Guttman scale.
Conversely, if G is not a threshold graph, then by Theorem 9.30 it has four
vertices, say, 1, 2, 3, 4, such that (1, 2) ∈ E, (3, 4) ∈ E, (1, 3) ∉ E and (2, 4) ∉ E
(cf. Figure 9.7). We can assume, without loss of generality, that vertices 1 and 4
are in P , and vertices 2 and 3 are in S (indeed, 1 and 3 are not both in S, since
they are not linked; hence, we can assume that 1 ∈ P ; then, 2 ∈ S, etc.). Then, if g
is a Guttman scale, there holds
g(4) ≤ g(2) < g(1), since 2 agrees with 1 but 2 does not agree with 4,
g(1) ≤ g(3) < g(4), since 3 agrees with 4 but 3 does not agree with 1,
and we reach a contradiction.
We refer to the paper by Cozzens and Leibowitz [223] for additional information
on the connections between Guttman scales and threshold graphs.
Other connections between threshold functions and graph properties have been
explored in several papers. For instance, Benzaken and Hammer [67] characterized
domishold graphs: A graph G = (V , E) is domishold if there exists a structure
(w1 , w2 , . . . , wn , t) such that, for every S ⊆ V ,
S is dominating in G if and only if Σ_{i∈S} wi ≥ t.
9.8 Exercises
1. A Boolean function f (x1 , x2 , . . . , xn ) is a ball if there exist (w1 , w2 , . . . , wn , r) ∈
R^{n+1} such that, for all (x1 , x2 , . . . , xn ) ∈ Bn ,

f (x1 , x2 , . . . , xn ) = 0 if and only if Σ_{i=1}^{n} (wi − xi )^2 ≤ r^2 .
x1 + x2 + . . . + x10 ≥ 7
and
34x1 + 29x2 + 9x3 + 7x4 + 5x5 + 5x6 + 4x7 + 3x8 + 3x9 + x10 ≥ 50.
12. In 1973, the voting weights of the nine members of the Council of Ministers
of the European Economic Community were 10, 10, 10, 10, 5, 5, 3, 3, and
2, respectively. The threshold was 40 votes. Show that this voting procedure
is equivalent to the procedure defined by the smaller weights 6, 6, 6, 6, 3, 3,
2, 2, 1 with threshold 24. Compute the Banzhaf indices of the nine states.
13. Prove that every Chow function is completely monotone.
14. Let f and g be two functions on B n . Prove that, if f is a positive threshold
function and ωi (f ) = ωi (g) for i = 1, 2, . . . , n, then either f = g or ω(f ) <
ω(g).
15. Prove that a graph G = (V , E) is threshold if and only if there exist (n + 1)
numbers a1 , a2 , . . . , an and q such that, for all i, j in V ,

(i, j ) ∈ E if and only if ai + aj > q.
Prove that
10.1 Introduction
In this chapter, we present the theory and applications of read-once Boolean func-
tions, one of the most interesting special families of Boolean functions. A function
f is called read-once if it can be represented by a Boolean expression using
the operations of conjunction, disjunction, and negation in which every variable
appears exactly once. We call such an expression a read-once expression for f .
For example, the function
f0 (a, b, c, w, x, y, z) = ay ∨ cxy ∨ bw ∨ bz
f1 = ab ∨ bc ∨ cd
and
f2 = ab ∨ bc ∨ ac.
Neither of these is a read-once function; indeed, it is impossible to express them so
that each variable appears only once. (Try to do it.) The functions f1 and f2 illustrate
the two types of forbidden functions that characterize read-once functions, as we
1 The property of normality is sometimes called clique-maximality in the literature. It also appears in the
definition of conformal hypergraphs in Berge [71] and is used in the theory of acyclic hypergraphs.
f3 = abc
has the same co-occurrence graph as f2 , namely, the triangle G(f2 ) = G(f3 ) in
Figure 10.2, yet f3 is clearly read-once and f2 is not read-once. This example
illustrates the motivation for the following definition.
A Boolean function f is called normal if every clique of its co-occurrence graph
is contained in a prime implicant of f .
In our example, f2 fails to be normal, since the triangle {a, b, c} is not contained
in any prime implicant of f2 . This leads to our second necessary property of read-
once functions, namely, that a read-once function must be normal, which we will
also prove in Section 10.3. Moreover, a classical theorem of Gurvich [422, 426]
shows that combining these two properties characterizes read-once functions.
result will be used later in the proof of one of the characterizations of read-once
functions.
f d (X) = f̄(X̄),
and an expression for f d can be obtained from any expression for f by simply
interchanging the operators ∧ and ∨ as well as the constants 0 and 1. In particular,
given a DNF expression for f , this exchange yields a CNF expression for f d . This
shows that the dual of a read-once function is also read-once.
The process of transforming a DNF expression of f into a DNF expression of
f d is called DNF dualization; its complexity for positive Boolean functions is still
unknown, the current best algorithm being quasi-polynomial [347]; see Chapter 4.
Let P be the collection of prime implicants of a positive Boolean function f
over the variables x1 , x2 , . . . , xn , and let D be the collection of prime implicants of
the dual function f d . We assume throughout that all of the variables for f (and
hence for f d ) are essential. We use the term “dual (prime) implicant” of f to
mean a (prime) implicant of f d . For positive functions, the prime implicants of f
correspond precisely to the set of minimal true points minT (f ), and the dual prime
implicants of f correspond precisely to the set of maximal false points maxF (f );
see Sections 1.10.3 and 4.2.1.
Theorem 4.7 states that the implicants and dual implicants of a Boolean function
f , viewed as sets of literals, have pairwise nonempty intersections. In particular,
this holds for the prime implicants and the dual prime implicants. Moreover, the
prime implicants and the dual prime implicants are minimal with this property,
that is, for every proper subset S of a dual prime implicant of f , there is a prime
implicant P such that P ∩ S = ∅.
In terms of hypergraph theory, the prime implicants P form a clutter (namely,
a collection of sets, or hyperedges, such that no set contains another set), as does
the collection of dual prime implicants D.
Finally, we recall the following properties of duality to be used in this chapter
and which can be derived from Theorems 4.1 and 4.19.
(i) g = f d .
(ii) For every partition of {x1 , x2 , . . . , xn } into sets A and Ā, there is either a
member of P contained in A or a member of D contained in Ā, but not
both.
(iii) D is exactly the family of minimal transversals of P.
P0 (T ) = {P ∈ P|P ∩ T = ∅},
not be relevant for our analysis. (We may omit the parameter T from our notation
when it is clear which subset is meant.)
A selection S(T ), with respect to T , consists of one prime implicant Px ∈
Px (T ) for every x ∈ T . A selection is called covering if there is a prime implicant
P0 ∈ P0 (T ) such that P0 ⊆ ⋃_{x∈T} Px . Otherwise, it is called noncovering. (See
Example 10.2.)
We now present the characterization of the dual subimplicants of a positive
Boolean function from [121].
Theorem 10.4. Let f be a positive Boolean function over the variable set
{x1 , x2 , . . . , xn }, and let T be a subset of the variables. Then T is a dual subimplicant
of f if and only if there exists a noncovering selection with respect to T .
Proof. Assume that T is a dual subimplicant of f , and let D ∈ D be a prime
implicant of f d for which T ⊆ D. For any variable x ∈ T , the subset D \ {x} is a
proper subset of D, and therefore, by Remark 10.1 (or trivially, if D = {x}), there
exists a prime implicant Px ∈ P such that Px ∩ (D \ {x}) = ∅. Since Px ∩ D ≠ ∅
by Theorem 10.3, we have {x} = Px ∩ D = Px ∩ T , that is, Px ∈ Px (T ).
If S = {Px | x ∈ T } were a covering selection, then there would exist a prime
implicant P0 ∈ P0 (T ) such that P0 ⊆ ⋃_{x∈T} Px . But this would imply

P0 ∩ D ⊆ (⋃_{x∈T} Px ) ∩ D = ⋃_{x∈T} (Px ∩ D) = T ,
We will often apply Theorem 10.4 in its contrapositive form or in its dual form,
as follows.
Remark 10.2. A subset T is not a dual subimplicant of f if and only if every
selection with respect to T is a covering selection.
Remark 10.3. We may also apply Theorem 10.4 to subimplicants of f and dual
selections, where the roles of P and D are reversed in the obvious manner.
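Theorem 10.4 translates into a straightforward, if exponential, test that enumerates all selections; a sketch (with prime implicants represented as sets of variables):

```python
from itertools import product

def is_dual_subimplicant(primes, T):
    """Test whether T is a dual subimplicant of the positive function whose
    prime implicants are listed in `primes` (criterion of Theorem 10.4)."""
    T = set(T)
    P0 = [P for P in primes if not (P & T)]                    # P_0(T)
    Px = {x: [P for P in primes if P & T == {x}] for x in T}   # P_x(T)
    if any(not Px[x] for x in T):
        return False                                   # no selection exists at all
    for selection in product(*(Px[x] for x in sorted(T))):
        union = set().union(*selection)
        if not any(P <= union for P in P0):
            return True                                # noncovering selection found
    return False

# The function f = abc v bde v ceg of Example 10.3:
primes = [{'a', 'b', 'c'}, {'b', 'd', 'e'}, {'c', 'e', 'g'}]
```

For instance, {a, d} and {b, c} are dual subimplicants of this f while {a, b} is not, matching the edge computations of Example 10.3.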
Example 10.2. Consider the positive Boolean function
The selection S = {bdg, ecg, eh} is noncovering since {a, d, g} ⊄ {b, c, d, e, g, h}
and {a, e, g} ⊄ {b, c, d, e, g, h}; hence, by Theorem 10.4, T is a dual subimplicant.
(ii) Now let T = {a, b, g}. We have
There is only one possible selection S = {adh, bdh, ecg} and S is a cov-
ering selection since {e, h} ⊆ {a, b, c, d, e, g, h}. Hence, by Remark 10.2, T
is not a dual subimplicant.
It can be verified that T is contained in the dual prime implicant abch, and
that, to extend T to a dual implicant, it would be necessary to add either e or h;
however, neither abeg nor abgh are prime (since abe, bgh ∈ D), see Exercise 5
at the end of the chapter.
Example 10.3. Let us calculate G(f d ) for the function f = abc ∨ bde ∨ ceg, as
illustrated in Figure 10.4.
The pair (a, b) is not an edge: Indeed, we have in this case Pa = ∅, so a and
b are not adjacent in G(f d ). Similarly, (a, c), (b, d), (c, g), (d, e), (e, g) are also
nonedges.
The pair (b, c) is an edge: In this case, both Pb and Pc are nonempty, but P0 is
empty, so b and c are adjacent in G(f d ). Similarly, (b, e), (c, e) are also edges.
The pair (a, e) is an edge: In this case, as in the previous one, both Pa ≠ ∅ and
Pe ≠ ∅, but P0 = ∅, so a and e are adjacent in G(f d ). Similarly, (b, g), (c, d) are
also edges.
The pair (a, d) is an edge: In this case, Pa = {abc}, Pd = {bde}, P0 = {ceg}.
Since {c, e, g} ⊄ {a, b, c, d, e}, we conclude that a and d are adjacent in G(f d ).
Similarly, (a, g), (d, g) are also edges.
Notice what happens if we add an additional prime implicant bce to the function
f in this example. Consider the function f ′ = abc ∨ bde ∨ ceg ∨ bce. Then ad is
not a dual subimplicant of f ′ although it was of f . Indeed, there is still only one
selection {abc, bde}, but now it is covering, since the union {a, b, c, d, e} contains
bce. By symmetry, neither ag nor dg are dual subimplicants of f ′ .
We are now ready to present and prove the main characterization theorem of
read-once functions. We describe briefly what will be shown in our Theorem 10.6.
We saw in Theorem 10.2 that for any positive Boolean function f , every prime
implicant P of f and every prime implicant D of its dual f d must have at least
one variable in common. This property is strengthened in the case of read-once
functions, by condition (iv) in Theorem 10.6, which claims that f is read-once if
and only if this common variable is unique. Moreover, this condition immediately
implies (by definition) that the co-occurrence graphs G(f ) and G(f d ) have no
edges in common; otherwise, a pair of variables adjacent in both graphs would
be contained in some prime implicant and in some dual prime implicant. This
is condition (iii) of our theorem and already implies that recognizing read-once
functions has polynomial-time complexity (by Theorem 10.5).
Condition (ii) is a further strengthening of condition (iii). It says that in addition
to being edge-disjoint, the graphs are complementary, that is, every pair of variables
appear together either in some prime implicant or in some dual prime implicant,
but not both.
The remaining condition (v) characterizing read-once functions is the one
mentioned as Theorem 10.1 at the beginning of this chapter, namely, that the
co-occurrence graph G(f ) is P4 -free and the maximal cliques of G(f ) are pre-
cisely the prime implicants of f (normality). It is condition (v) that will be used
in Section 10.5 to obtain an efficient O(n|f |) recognition algorithm for read-once
functions.
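On small examples, condition (v) can be checked by sheer enumeration (this brute-force sketch is ours and is not the efficient algorithm of Section 10.5):

```python
from itertools import combinations

def edges_of(primes):
    """Edge set of the co-occurrence graph G(f), for f given by its prime implicants."""
    return {frozenset(p) for P in primes for p in combinations(sorted(P), 2)}

def is_p4_free(V, E):
    """Three induced edges on four vertices with degrees (2, 2, 1, 1) form a P4."""
    for quad in combinations(sorted(V), 4):
        sub = [frozenset(p) for p in combinations(quad, 2) if frozenset(p) in E]
        deg = sorted(sum(v in e for e in sub) for v in quad)
        if len(sub) == 3 and deg == [1, 1, 2, 2]:
            return False
    return True

def is_normal(primes, V, E):
    """Every clique of G(f) must be contained in some prime implicant."""
    cliques = (S for k in range(1, len(V) + 1) for S in combinations(sorted(V), k)
               if all(frozenset(p) in E for p in combinations(S, 2)))
    return all(any(set(S) <= P for P in primes) for S in cliques)

def is_read_once(primes):
    V, E = set().union(*primes), edges_of(primes)
    return is_normal(primes, V, E) and is_p4_free(V, E)
```

It accepts f3 = abc, and rejects both f2 = ab ∨ bc ∨ ac (not normal) and the function f4 of Example 10.4 (not P4-free).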
Example 10.4. The function
f4 = x1 x2 ∨ x2 x3 ∨ x3 x4 ∨ x4 x5 ∨ x5 x1 ,
whose co-occurrence graph G(f4 ) is the chordless 5-cycle C5 , is normal but G(f4 )
is not P4 -free. Hence, f4 is not a read-once function. Its dual
f4d = x1 x2 x4 ∨ x2 x3 x5 ∨ x3 x4 x1 ∨ x4 x5 x2 ∨ x5 x1 x3 ,
Theorem 10.6. Let f be a positive Boolean function over the variable set
{x1 , x2 , . . . , xn }. Then the following conditions are equivalent:
(i) f is a read-once function.
(ii) The co-occurrence graphs G(f ) and G(f d ) are complementary, that is,
G(f d ) = Ḡ(f ).
(iii) The co-occurrence graphs G(f ) and G(f d ) have no edges in common, that
is, E(G(f )) ∩ E(G(f d )) = ∅.
(iv) For all P ∈ P and D ∈ D, we have |P ∩ D| = 1.
(v) The co-occurrence graph G(f ) is P4 -free and f is normal.
Proof. (i) =⇒ (ii): Assume that f is a read-once function, and let T be the parse
tree of a read-once expression for f . By interchanging the operations ∨ and ∧, we
obtain the parse tree T d of a read-once expression for the dual f d . By Lemma 10.1,
(xi , xj ) is an edge in G(f ) if and only if the lowest common ancestor of xi and
xj in the tree T is labeled ∧ (conjunction). Similarly, (xi , xj ) is an edge in G(f d )
if and only if the lowest common ancestor of xi and xj in the tree T d is labeled
∧ (conjunction). It follows from the foregoing construction that G(f ) and G(f d )
are complementary.
{x1 } ∪ B1 , {x3 } ∪ B3 ∈ D.
By our assumptions,
y1 = A12 ∩ B3 and y3 = A23 ∩ B1 ,
so
D0 ∩ ({x1 , x2 } ∪ A12 ) = {x2 }.
Hence, x2 ∈ D0 .
In a similar manner, we can show that
Hence, x3 ∈ D0 .
Thus, we have shown x2 , x3 ∈ D0 , implying that (x2 , x3 ) is an edge of G(f d ),
a contradiction to condition (iii). This proves Claim 3.
(v) =⇒ (i): Let us assume that f is normal and that G = G(f ) is P4 -free. We
will show how to construct a read-once formula for f recursively. In order to prove
this implication, we will use the following property of P4 -free graphs (cographs)
which we will prove in Section 10.4, Theorem 10.7.
Claim 4. If a graph G is P4 -free, then its complement Ḡ is also P4 -free;
moreover, if G has more than one vertex, precisely one of G and Ḡ is connected.
The function is trivially read-once if there is only one variable. Assume that the
implication (v) ⇒ (i) is true for all functions with fewer than n variables.
By Claim 4, one of G or Ḡ is disconnected. Suppose G is disconnected, with
connected components G1 , . . . , Gr partitioning the variables of f into r disjoint
sets. Then the prime implicants of f are similarly partitioned into r collections
Pi , (i = 1, . . . , r), defining positive functions f1 , . . . , fr , respectively, where Gi =
G(fi ) and f = f1 ∨ · · · ∨ fr . Clearly, G(fi ) is P4 -free because it is an induced
subgraph of G(f ), and each fi is normal for the same reason. Therefore, by
induction, there is a read-once expression Fi for each i, and combining these, we
obtain a read-once expression for f given by F = F1 ∨ · · · ∨ Fr .
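The recursion in this proof is short enough to sketch directly: given the co-occurrence graph of a normal P4-free function, it emits a read-once expression (we write | for disjunction and & for conjunction so that the output can be evaluated as a Python expression; the implementation is our own illustration):

```python
from itertools import combinations

def _components(V, E):
    """Connected components of the graph (V, E); E is a set of frozenset pairs."""
    left, comps = set(V), []
    while left:
        comp, stack = set(), [next(iter(left))]
        while stack:
            v = stack.pop()
            comp.add(v)
            stack.extend(u for u in left - comp if frozenset((u, v)) in E)
        left -= comp
        comps.append(comp)
    return comps

def read_once_expr(V, E):
    """Read-once expression for the function whose co-occurrence graph is the
    cograph (V, E), following the decomposition in the proof of (v) => (i)."""
    if len(V) == 1:
        return next(iter(V))
    comps = _components(V, E)
    if len(comps) > 1:   # G disconnected: disjunction over its components
        return "(" + " | ".join(sorted(read_once_expr(C, E) for C in comps)) + ")"
    Ebar = {frozenset(p) for p in combinations(V, 2)} - E
    comps = _components(V, Ebar)   # the complement must be disconnected
    assert len(comps) > 1, "graph is not a cograph"
    return "(" + " & ".join(sorted(read_once_expr(C, E) for C in comps)) + ")"

# Co-occurrence graph of f0 = ay v cxy v bw v bz:
E0 = {frozenset(p) for p in
      [("a", "y"), ("c", "x"), ("c", "y"), ("x", "y"), ("b", "w"), ("b", "z")]}
expr = read_once_expr(set("abcwxyz"), E0)
```

Up to reordering, expr reproduces the expression [y ∧ (a ∨ [c ∧ x])] ∨ [b ∧ (w ∨ z)] derived below; each variable occurs exactly once.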
f0 = ay ∨ cxy ∨ bw ∨ bz,
whose co-occurrence graph G(f0 ) was shown in Figure 10.1. Clearly, f0 is normal,
and G(f0 ) is P4 -free and has two connected components G1 = G{a,c,x,y} and G2 =
G{b,w,z} . Using the arguments presented after Claim 4 above, we can handle these
components separately, finding a read-once expression for each and taking their
disjunction.
For G1, we note that its complement Ḡ1 is disconnected with two components,
namely, an isolated vertex H1 = {y} and H2 = (Ḡ1){a,c,x} having two edges; we can
handle the components separately and take their conjunction. The complement
H̄2 has an isolate {a} and edge (c, x), which we combine with disjunction. Finally,
complementing (c, x) gives two isolates, which are combined with conjunction.
Therefore, the read-once expression representing G1 will be y ∧ (a ∨ [c ∧ x]).
For G2, we observe that its complement Ḡ2 has an isolate {b} and edge (w, z),
which we combine with conjunction, giving b∧(w ∨z). So the read-once expression
for f0 is
f0 = [y ∧ (a ∨ [c ∧ x])] ∨ [b ∧ (w ∨ z)].
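The decomposition used in this example is easy to automate. The following is a minimal Python sketch (our own code and names, not the GMR algorithm of Section 10.5): it factors a positive function, given as its set of prime implicants, by alternating between the connected components of the co-occurrence graph (which become a disjunction) and the components of its complement (which become a conjunction). It assumes the input function is normal; a full recognizer would also compare the maximal cliques of G(f) with the prime implicants.

```python
from itertools import combinations

def _components(nodes, adj):
    """Connected components of the graph given by an adjacency dict."""
    comps, seen = [], set()
    for v in nodes:
        if v in seen:
            continue
        comp, stack = set(), [v]
        while stack:
            u = stack.pop()
            if u not in comp:
                comp.add(u)
                stack.extend(adj[u] - comp)
        seen |= comp
        comps.append(comp)
    return comps

def read_once(pis):
    """pis: prime implicants as frozensets of variable names (assumed normal).
    Returns a nested ('or', [...]) / ('and', [...]) / variable tree,
    or None if the co-occurrence graph contains an induced P4."""
    variables = set().union(*pis)
    if len(variables) == 1:
        return next(iter(variables))
    adj = {v: set() for v in variables}
    for p in pis:                       # build the co-occurrence graph G(f)
        for a, b in combinations(p, 2):
            adj[a].add(b)
            adj[b].add(a)
    comps = _components(variables, adj)
    if len(comps) > 1:                  # G(f) disconnected: disjunction
        return ('or', [read_once({p for p in pis if p <= c}) for c in comps])
    cadj = {v: variables - adj[v] - {v} for v in variables}
    ccomps = _components(variables, cadj)
    if len(ccomps) > 1:                 # complement disconnected: conjunction
        return ('and', [read_once({p & c for p in pis}) for c in ccomps])
    return None
```

On the prime implicants of f0 = ay ∨ cxy ∨ bw ∨ bz, this returns a tree equivalent to [y ∧ (a ∨ [c ∧ x])] ∨ [b ∧ (w ∨ z)] (the order of children may differ).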
where the join of disjoint graphs G1 , . . . , Gk is the graph G with V (G) = V (G1 ) ∪
· · · ∪ V (Gk ) and E(G) = E(G1 ) ∪ · · · ∪ E(Gk ) ∪ {(x, y) | x ∈ V (Gi ), y ∈ V (Gj ),
for all i ≠ j }. An equivalent definition can be obtained by substituting for (3) the
rule
(3′) the complement of a cograph is a cograph;
see Exercise 15 at the end of the chapter.
The building of a cograph G from these rules can be represented by a rooted
tree T that records its construction, where
(a) the leaves of T are labeled by the vertices of G;
(b) if G is formed from the disjoint cographs G1 , . . . , Gk (k > 1), then the root
r of T has as its children the roots of the trees of G1 , . . . , Gk ; moreover,
(c) the root r is labeled 0 if G is formed by the union rule (2), and labeled 1 if
G is formed by the join rule (3).
Among all such constructions, there is a canonical one whose tree T is called
the cotree and satisfies the additional property that
(d) on every path, the labels of the internal nodes alternate between 0 and 1.
Thus, the root of the cotree is labeled 1 if G is connected and labeled 0 if G is
disconnected; an internal node is labeled 0 if its parent is labeled 1, and vice versa.
A subtree Tu rooted at an internal node u represents the subgraph of G induced by
the labels of its leaves, and vertices x and y of G are adjacent in G if and only if
their least common ancestor in the cotree is labeled 1.
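The least-common-ancestor property can be used directly as an adjacency test. A small Python sketch (our own cotree encoding: nested (label, children) tuples with variable names as leaves):

```python
def lca_label(tree, x, y):
    """Label ('0' or '1') of the least common ancestor of leaves x and y."""
    def leaves(t):
        return {t} if isinstance(t, str) else {v for c in t[1] for v in leaves(c)}
    label, kids = tree
    for c in kids:
        if not isinstance(c, str) and {x, y} <= leaves(c):
            return lca_label(c, x, y)   # both leaves lie under one child
    return label                        # this node separates x from y

def adjacent(tree, x, y):
    """x and y are adjacent in G iff their LCA in the cotree is labeled '1'."""
    return lca_label(tree, x, y) == '1'
```

For example, on the cotree ('1', ['y', ('0', ['a', ('1', ['c', 'x'])])]) of the component G1 discussed above, adjacent reports the edges (y, a), (y, c), (c, x) and the non-edge (a, c).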
Notice that the recursive application of rules (1)–(3) follows a bottom-up view-
point of the construction of G. An alternate top-down viewpoint can also be taken,
as a recursive decomposition of G, where we repeatedly partition the vertices
according to either the connected components of G (union) or the connected
components of its complement (join).
One can recognize whether a graph G is a cograph by repeatedly decomposing
it this way, until the decomposition either fails on some component H (both H and
H̄ are connected) or succeeds, reaching all the vertices. The cotree is thus built
top-down as the decomposition proceeds.2
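A sketch of this top-down decomposition in Python (our own names; a plain quadratic implementation, not one of the linear-time algorithms cited below):

```python
def build_cotree(vertices, edges):
    """Return a cotree for G (nested (label, children) tuples, leaves =
    vertices), or None if some piece and its complement are both connected."""
    adj = {v: set() for v in vertices}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)

    def comps(vs, nbr):
        out, seen = [], set()
        for v in vs:
            if v in seen:
                continue
            comp, stack = set(), [v]
            while stack:
                u = stack.pop()
                if u not in comp:
                    comp.add(u)
                    stack.extend((nbr[u] & vs) - comp)
            seen |= comp
            out.append(comp)
        return out

    def rec(vs):
        if len(vs) == 1:
            return next(iter(vs))
        parts = comps(vs, adj)
        label = '0'                       # union rule: G[vs] is disconnected
        if len(parts) == 1:
            co = {v: vs - adj[v] - {v} for v in vs}
            parts = comps(vs, co)
            label = '1'                   # join rule: complement disconnected
            if len(parts) == 1:
                return None               # both connected: not a cograph
        kids = [rec(p) for p in parts]
        return None if None in kids else (label, kids)

    return rec(set(vertices))
```

In accordance with Theorem 10.7 below, the decomposition fails on P4 itself but succeeds, for instance, on 2K2 (two disjoint edges).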
The next theorem gives several characterizations of cographs.
Theorem 10.7. The following are equivalent for an undirected graph G:
(i) G is a cograph.
(ii) G is P4 -free.
(iii) For every subset X of vertices (|X| > 1), either the induced subgraph GX
is disconnected or its complement ḠX is disconnected.
2 This latter viewpoint is a particular case of modular decomposition [358] that applies to arbitrary
graphs, and any modular decomposition algorithm will produce a cotree when given a cograph,
although such general algorithms [430, 636] are more involved than is necessary for cograph
recognition.
In particular, any graph G for which both G and Ḡ are connected must contain
an induced P4 . This claim appears in Seinsche [817]; independently, it was one of
the problems on the 1971 Russian Mathematics Olympiad, and seven students gave
correct proofs; see [366]. The full version of the theorem was given independently
by Gurvich [422, 423, 425] and by Corneil, Lerchs, and Burlingham [213], where
further results on the theory of cographs were developed. Note that it is impossible
for both a graph G and its complement Ḡ to be disconnected; see Exercise 7.
It is rather straightforward to recognize cographs and build their cotree in O(n³)
time. The first linear O(n + e) time algorithm for recognizing cographs appears in
Corneil, Perl, and Stewart [214]. Subsequently, other linear time algorithms have
appeared in [154, 155, 431]; a fully dynamic algorithm is given in [826], and a
parallel algorithm was proposed in [251].
to a2 . Consider the shortest such path. It consists of exactly two edges of H , say,
(a1 , a3 ), (a2 , a3 ) ∈ E(H ), since H is P4 -free.
By a complementary argument, since H̄ is connected and P4 -free, there is
a shortest path in H̄ from a2 to a3 consisting of exactly two edges of H̄, say,
(a2 , a4 ), (a3 , a4 ) ∈ E(H̄). Now we argue that (a1 , a4 ) ∈ E(H̄), since otherwise, H
would have a P4 .
We continue constructing the ordering in the same manner. Assume we have
a1 , a2 , . . . , a2j ; we will find the next vertices in the ordering.
(Find a2j+1 ). There is a shortest path in H from a2j−1 to a2j consisting of
exactly two edges of H, say, (a2j−1 , a2j+1 ), (a2j , a2j+1 ) ∈ E(H). Note that a2j+1
has not yet been seen in the ordering, since none of the ai is adjacent to a2j . We
argue, for all i < 2j − 1, that (ai , a2j+1 ) ∈ E(H), since otherwise, H would have
a P4 on the vertices {ai , a2j−1 , a2j+1 , a2j }. Thus, we have enlarged our ordering by
one new vertex.
(Find a2j+2 ). There is a shortest path in H̄ from a2j to a2j+1 consisting of
exactly two edges of H̄, say, (a2j , a2j+2 ), (a2j+1 , a2j+2 ) ∈ E(H̄). Now we argue,
for all i < 2j, that (ai , a2j+2 ) ∈ E(H̄), since otherwise, H̄ would have a P4 on
the vertices {ai , a2j , a2j+2 , a2j+1 }. Thus, we have enlarged our ordering by another
new vertex.
Eventually, this process orders all vertices, and the last one an will be either
isolated or universal, giving the promised contradiction.
Read-Once Recognition
Input: A representation of a positive Boolean function f by its list of prime impli-
cants, namely, its complete DNF expression.
Output: A read-once expression for f , or “failure” if there is none.
Chein [190] and Hayes [479] first introduced read-once functions and provided
an exponential-time recognition algorithm for the family. Peer and Pinter [734]
also gave an exponential-time factoring algorithm for read-once functions, whose
nonpolynomial complexity is due to the need for repeated calls to a routine that
converts a DNF representation to a CNF representation, or vice versa. We have
already observed in Section 10.3 that combining Theorem 10.5 with condition (iii)
of Theorem 10.6 implies that recognizing read-once functions has polynomial-time
complexity, although without immediately providing the read-once expression.
In this section, we present the polynomial-time recognition algorithm due
to Golumbic, Mintz, and Rotics [400, 401, 402] and analyze its computational complexity.
Remark 10.4. The reader has no doubt noticed that the cotree of a P4 -free graph
is very similar to the parse tree of a read-once expression. On the one hand, when
a function is read-once, its parse tree is identical to the cotree of its co-occurrence
graph: Just switch the labels {0, 1} to {∨, ∧}. On the other hand, a cotree always
generates a read-once expression that represents “some” Boolean function g. Thus,
the question to be asked is:
Given a function f , although G(f ) may be P4 -free and, thus, has a cotree T , will
the read-once function g represented by T be equal to f or not? (In other words,
G(g) = G(f ) and, by construction, the maximal cliques of G(g) are precisely the
prime implicants of g, so will these also be the prime implicants of f ?)
The function f = ab ∨ bc ∨ ac is a negative example; its graph is a triangle and
g = abc.
The answer to our question lies in testing normality, that is, comparing the prime
implicants of g with those of f , and doing it efficiently.
Theorem 10.8. [400, 401, 402] Given the complete DNF formula of a positive
Boolean function f on n variables, the GMR procedure solves the Read-Once
Recognition problem in time O(n|f |), where |f | denotes the length of the DNF
expression.
Proof. (Step 1.) The first step of the GMR procedure is building the graph G(f ).
If an arbitrary positive function f is given by its DNF expression, that is, as a list
of its prime implicants P = {P1 , . . . , Pm }, then the edge set of G(f ) can be found
in O(|P1 |² + · · · + |Pm |²) time. It is easy to see that this is at most O(n|f |).
(Step 2.) As we saw in Section 10.4, the complexity of testing whether the graph
G(f ) is P4 -free and providing a read-once expression (its cotree T ) is O(n + e),
as first shown in [214]. This is at worst O(n²) and is bounded by O(n|f |). (A
straightforward application of Theorem 10.7 would yield complexity O(n³).)
(Step 3.) Finally, we show that the function f can be tested for normality
in O(n|f |) time by a novel method, due to [400] and described more fully in
[401, 402, 685]3. As in Remark 10.4, we denote by g the function represented by
the cotree T ; we will verify that g = f .
Testing normality
We may assume that G = G(f ) has successfully been tested to be P4 -free, and
that T is its cotree. We construct the set of maximal cliques of G recursively, by
traversing the cotree T from bottom to top, according to Lemma 10.2 below. For
a node x of T , we denote by Tx the subtree of T rooted at x, and we denote by gx
the function represented by Tx . We note that Tx is also the cotree representing the
subgraph GX of G induced by the set X of labels of the leaves of Tx .
First, we introduce some notation. Let X1 , X2 , . . . , Xr be disjoint sets, and let Ci
be a set of subsets of Xi (1 ≤ i ≤ r). We define the Cartesian sum C = C1 ⊗ · · · ⊗ Cr
to be the set whose elements are unions of individual elements from the sets Ci
(one element from each set). In other words,
C = C1 ⊗ · · · ⊗ Cr = {C1 ∪ · · · ∪ Cr | Ci ∈ Ci , 1 ≤ i ≤ r}.
For a cotree T , let C(T ) denote the set of all maximal cliques in the cograph
corresponding to T . From the definitions of cotree and cograph, we obtain:
Lemma 10.2. Let G be a P4 -free graph and let T be the cotree of G. Let h be an
internal node of T and let h1 , . . . , hr be the children of h in T .
(1) If h is labeled with 0, then C(Th ) = C(Th1 ) ∪ · · · ∪ C(Thr ).
(2) If h is labeled with 1, then C(Th ) = C(Th1 ) ⊗ · · · ⊗ C(Thr ).
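Lemma 10.2 translates directly into a short recursion. A Python sketch, using the same cotree encoding as before (nested (label, children) tuples with variable names as leaves; names are ours):

```python
from itertools import product

def cartesian_sum(families):
    """C1 ⊗ · · · ⊗ Cr: all unions formed by picking one member per family."""
    return {frozenset().union(*choice) for choice in product(*families)}

def max_cliques(t):
    """C(T): the maximal cliques of the cograph represented by cotree t."""
    if isinstance(t, str):
        return {frozenset([t])}           # a single vertex
    label, kids = t
    parts = [max_cliques(k) for k in kids]
    if label == '0':                      # union node: collect all cliques
        return set().union(*parts)
    return cartesian_sum(parts)           # join node: Cartesian sum
```

For the cotree ('1', ['y', ('0', ['a', ('1', ['c', 'x'])])]) of the component G1 in the earlier example, this returns the cliques {y, a} and {y, c, x}, that is, exactly the prime implicants ay and cxy.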
The following algorithm calculates, for each node x of the cotree, the set C(Tx )
of all the maximal cliques in the cograph defined by Tx . It proceeds bottom up,
using Lemma 10.2, and also keeps at each node x:
s(Tx ): The number of cliques in C(Tx ). This number is equal to the number of
prime implicants in gx .
L(Tx ): The total length of the list of cliques at Tx , namely, L(Tx ) = ∑ {|C| :
C ∈ C(Tx )}, which represents the total length of the list of prime
implicants of gx .
A global variable L maintains the overall size of the clique lists as they are being
built. (In other words, L is the sum of all L(Tx ) taken over all x on the frontier as
we proceed bottom up.)
3 In [401] only a complexity bound of O(n²k) was claimed, where k is the number of prime implicants;
however, using an efficient data structure and careful analysis, it has been shown in [402], following
[685], that the method can be implemented in O(n|f |). For the general case of a positive Boolean
function given in DNF form, it is possible to check normality in O(n³k) time using the results of
[538]; see Exercise 13 at the end of the chapter.
Figure 10.7 (excerpt). Step 3a: Initialize k to be the number of terms (clauses) in the DNF
representation of f . For every leaf a of T , set C(Ta ) = {a}, s(Ta ) = 1, L(Ta ) = 1, and L = n.
Step 3b: Scan T from bottom to top; at each internal node h reached, let h1 , . . . , hr be the
children of h, and compute C(Th ), s(Th ), and L(Th ) according to Lemma 10.2.
The steps of the normality-checking procedure are given in Figure 10.7. This
procedure correctly tests normality because it tests whether the maximal cliques
of the cograph are precisely the prime implicants of f .
Complexity analysis
The purpose of comparing s(Th ) with k at each step is simply a speedup mechanism
to assure that the number of cliques never exceeds the number of prime implicants.
Similarly, calculating L(Th ), that is, |gh |, and comparing L with |f | at each step
assures that the overall length of the list of cliques will never exceed the sum of
the lengths of the prime implicants. (Note that we precompute L, and test against
|f | before we actually build a new set of cliques.)
For efficiency, we number the variables {x1 , x2 , . . . , xn }, and maintain both the
prime implicants and the cliques as lists of their variables. Then, each collection of
cliques C(Tx ) is maintained as a list of such lists. In this way, constructing C(Th ) in
Step 3b(1) can be done by concatenating the lists C(Th1 ), . . . , C(Thr ), and construct-
ing C(Th ) in Step 3b(2) can be done by creating a new list of cliques by repeatedly
taking r (sub)cliques, one from each set C(Th1 ), . . . , C(Thr ) and concatenating these
r (disjoint) lists of variables.
Thus, the overall calculation of C(Th ) takes at most O(|f |) time. Since the
number of internal nodes of the cotree is less than n, the complexity of Steps 3a
and 3b is O(n|f |).
It remains to compare the list of the prime implicants of f with the list of the
maximal cliques C(Ty ), where y is the root of T . This can be accomplished using
radix sort in O(nk) time. Initialize two k × n bit matrices P and C filled with zeros.
Each prime implicant Pi is traversed (it is a list of variables), and for every xj ∈ Pi
we assign Pi,j ← 1, thus, converting it into its characteristic vector, which will be
in row i of P. Similarly, we traverse each maximal clique Ci and convert it into
its characteristic vector, which will be in row i of C. It is now a straightforward
procedure to lexicographically sort the rows of these two matrices and compare
them in O(nk) time.
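The comparison step can be sketched as follows (Python's sorted is a comparison sort, O(nk log k) rather than the O(nk) radix-sort bound, but the principle is the same; the function name is ours):

```python
def same_families(fam1, fam2, variables):
    """Compare two families of variable sets by sorting their
    characteristic (0/1) vectors, taken with respect to `variables`."""
    def rows(fam):
        return sorted(tuple(int(v in s) for v in variables) for s in fam)
    return rows(fam1) == rows(fam2)
```

This is exactly how the triangle example of Remark 10.4 fails: the single maximal clique abc differs from the prime implicants ab, bc, ac.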
This concludes the proof, since the complexity of each step is bounded by
O(n|f |).
Proof. Let θ be any CNF such that no variable appears more than three times in
θ . It is NP-complete to decide whether θ is satisfiable; see [371, 932].
Now, replace x̄i by yi in θ for all i ∈ {1, 2, . . . , n}, denote by θ′ the resulting
(positive) CNF, and denote by h the Boolean function represented by θ′. It is easy
to see that θ is satisfiable if and only if g0 < g0 ∨ h or, equivalently, if and only if
g0 ∨ h ≠ g0 . Moreover, if the number of clauses of θ is large enough (say, at
least 7), then θ′ has no implicants of degree smaller than three.
Remark 10.5. Recall that verifying the equality of two Boolean functions defined
by positive DNF and CNF expressions, respectively, is exactly Positive DNF
Dualization, which is not co-NP-complete unless every problem of co-NP can
be solved in quasi-polynomial time; see Section 4.4.2. Yet, verifying the similar
identity g0 ∨ h = g0 appears harder.
Lemma 10.4. Let h be a positive function without linear implicants. If the function
f = g0 ∨ h is read-once, then f is quadratic.
Proof. The “if” part is obvious, since function g0 is read-once, while the “only if”
part follows immediately from the previous lemma.
we find
g0 ∨ h = (x1 ∨ y2 )(x2 ∨ y1 ) ∨ (x3 ∨ x4 )(y3 ∨ y4 ) ∨ x5 y5 ,
so that g0 ∨ h is read-once but distinct from g0 .
Now we are ready to prove the desired result.
Theorem 10.9. For a Boolean function f given by a positive ∨-∧ expression, it
is co-NP-complete to decide whether f is read-once.
Proof. NP-hardness immediately follows from Lemmas 10.3 and 10.5. It remains
to show that the decision problem is in co-NP. This will follow from Theorem 10.6:
f is read-once if and only if every prime implicant P of f and prime implicant D of
f d have exactly one variable in common. Hence, to disprove that f is read-once,
it is sufficient (and necessary) to exhibit dual prime implicants P0 and D0 with at
least two common variables. Furthermore, to verify that P0 is a prime implicant
of f , it is sufficient to check that
(i) f is true if all variables of P0 are true, while all others are false.
(ii) f is false if all variables of P0 but one are true, while all others are false.
This can be checked in polynomial time.
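In code, conditions (i) and (ii) amount to 1 + |P0| evaluations of f. A sketch (our own names; the "oracle" is any Python predicate on 0/1 tuples):

```python
def is_prime_implicant(f, n, P0):
    """Check conditions (i) and (ii) for a positive function f on n
    variables; P0 is a set of variable indices."""
    chi = lambda S: tuple(1 if i in S else 0 for i in range(n))
    if not f(chi(P0)):                 # (i) the point of P0 must be true
        return False
    return all(not f(chi(P0 - {j}))    # (ii) dropping any variable gives false
               for j in P0)
```

For instance, with f = x0 x1 ∨ x2, the sets {0, 1} and {2} pass the test, while {0} (not an implicant) and {0, 1, 2} (an implicant, but not prime) fail it.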
Similarly, we can check that D0 is a prime implicant of f d . To do so, it is
enough to dualize the expression of f by swapping ∨ and ∧ (see Theorem 1.3 in
Section 1.3).
Remark 10.8. The recognition problem remains in co-NP when the function f
is given by any polynomially computable representation (or polynomial oracle)
and is guaranteed to be positive. Moreover, Aizenstein et al. [13] showed that the
problem remains in co-NP even without the assumption that f is positive.
Remark 10.9. Interestingly, the same arguments (three lemmas and the theorem)
prove that it is a co-NP-complete problem to recognize whether a positive Boolean
formula, φ0 ∨ θ, defines a quadratic Boolean function. Indeed, the corresponding
Boolean function g0 ∨ h is quadratic if and only if g0 ∨ h = g0 , provided h has no
implicants of degree less than three.
Exercise 27 at the end of the chapter raises some related open questions
regarding the complexity of recognizing a read-once function depending on the
The answer, of course, is yes, 20 questions are enough if the number of variables
is at most 4. Otherwise, the answer is no. If there are n variables, then there will
be 2ⁿ independent points to be queried before you can “know” the function.
Suppose I give you a clue: The function f is a positive Boolean function. Now
can you learn f with fewer queries?
Again the answer is yes. The extra information given by the clue allows you to
ask fewer questions in order to learn the function. For example, in the case n = 4,
first try (1,1,0,0). If the answer is true, then you immediately know that (1,1,1,0),
(1,1,0,1) and (1,1,1,1) are all true. If the answer is false, then (1,0,0,0), (0,1,0,0)
and (0,0,0,0) are all false. Either way, you asked one question and got four answers.
Not bad. Now if you query (0,0,1,1), you will similarly get two or three more free
answers. In the worst case, it could take 10 queries to learn the function (rather
than 16 had you queried each point).
Learning a Boolean function in this manner is sometimes called Exact Learn-
ing with Queries; see Angluin [21]. It receives as input an oracle for a Boolean
function f , that is, a “black box” that can answer a query on the value of f at a
given Boolean point in constant time. It then attempts to learn the value of f at all
2ⁿ points and outputs a Boolean expression that is logically equivalent to f .
If we know something extra about the structure of the function f , then it may
be possible to reduce the number of queries required to learn the function. We saw
this earlier in our example with the clue (that the mystery function was positive).
However, even for positive functions, the number of queries needed to learn the
function remains exponential.
The situation is much better for read-once functions. In this case, the number
of required queries can be reduced to a polynomial number, and the unique read-
once formula can be produced, provided we “know” that the function is read-once.
Thus, the read-once functions constitute a very natural class of functions that can
be learned efficiently, and, for this reason, they have been extensively studied
within the computational learning theory community.
For our purposes, we define the problem as follows:
Remark 10.10. There is a subtle but significant difference between the Exact
Learning problem and the Recognition problem. With recognition, we have
a DNF expression for f and must determine whether it represents a read-once
function. With exact learning, we have an oracle for f whose correct usage relies
upon the a priori assumption that the function to be learned is read-once. So the
input assumptions are different, but the output goal in both cases is a correct
read-once expression for f . Also, when measuring the complexity of recognition,
we count the algorithmic operations; when measuring the complexity of exact
learning, we must count both the operations implemented by the algorithm and the
number of queries to the oracle.
As we saw in Section 10.5, the GMR recognition procedure: (1) uses the DNF
expression to construct the co-occurrence graph G(f ), then (2) tests whether G(f )
is P4 -free and builds a cotree T for it, and (3) uses T and the original DNF formula
to test whether f is normal; if so, T is the read-once expression.
In contrast to this, Angluin, Hellerstein, and Karpinski [22] give the exact
learning algorithm in Figure 10.8.
The main difference between AHK exact learning and GMR recognition that
concerns us will be Step 1, that is, how to construct G(f ) using an oracle. We
outline the solution through a series of exercises at the end of the chapter.
(A) In a greedy manner, we can determine whether a subset U ⊆ X of the
variables contains a prime implicant, and find one when the answer is positive.
Exercise 16 gives such a routine Find-PI-In(U ), which has complexity O(n) plus
|U | queries to the oracle. A similar greedy algorithm Find-DualPI-In(U ) will
find a dual prime implicant contained in U .
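The greedy idea behind such a routine can be sketched as follows (our own code, not necessarily the solution intended by Exercise 16; f is a membership oracle for a positive function on variables 0, . . . , n − 1):

```python
def find_pi_in(f, n, U):
    """Greedy search for a prime implicant of a positive f inside U.
    Uses 1 + |U| oracle queries; returns None if U contains no implicant."""
    chi = lambda S: tuple(1 if i in S else 0 for i in range(n))
    if not f(chi(U)):        # U contains an implicant iff f(X_U) = 1
        return None
    P = set(U)
    for v in sorted(U):      # try to discard each variable in turn
        if f(chi(P - {v})):
            P.remove(v)
    return P                 # minimal true set = prime implicant (f positive)
```

With f = x0 x1 ∨ x2, the call with U = {0, 1, 2} returns {2}, the call with U = {0, 1} returns {0, 1}, and U = {0} yields None.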
(B) An algorithm Find-Essential-Variables is developed in Exercises 17,
18, and 19 that not only finds the set Y of essential variables4 but also, in the
process, for each variable xi in Y , generates a prime implicant P [i] and a dual
4 We have generally assumed throughout this chapter that all of the variables for a Boolean function
f (and hence for f d ) are essential. However, in the exact learning problem, we may wish to drop
this assumption and then need to find the set of essential variables.
prime implicant D[i] containing xi . This algorithm uses Find-PI-In and Find-
DualPI-In and can be implemented to run in O(n²) time using O(n²) queries to
the oracle.
(C) Finally, we construct the co-occurrence graph G(f ) based on the following
Lemma (whose proof is proposed as Exercise 14):
Lemma 10.6. Let f be a nonconstant read-once function over the variables N =
{x1 , x2 , . . . , xn }. Suppose that Di is a dual prime implicant containing xi but not
xj , and that Dj is a dual prime implicant containing xj but not xi . Let Ri,j =
(N \ (Di ∪ Dj )) ∪ {xi , xj }. Then (xi , xj ) is an edge in the co-occurrence graph
G(f ) if and only if Ri,j contains a prime implicant.
We obtain G(f ) using the oracle in the following way: For each pair of essential
variables xi and xj ,
C.1: if xi ∈ D[j ] or xj ∈ D[i], then (xi , xj ) is not an edge of G(f );
C.2: otherwise, construct Ri,j from D[i] and D[j ] and test whether Ri,j contains
a prime implicant using just one query to the oracle, namely, is f (XRi,j ) = 1? If
so, then (xi , xj ) is an edge in G(f ); otherwise, it is not an edge.
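Steps C.1 and C.2 fit in a few lines (a sketch with our own names; Di and Dj play the roles of D[i] and D[j], given as sets of variable indices, f is the membership oracle, and, as in Lemma 10.6, f is assumed to be a nonconstant read-once function):

```python
def cooccurrence_edge(f, n, i, j, Di, Dj):
    """Decide whether (xi, xj) is an edge of G(f), per Lemma 10.6."""
    if j in Di or i in Dj:                       # step C.1
        return False
    R = (set(range(n)) - (Di | Dj)) | {i, j}     # the set Ri,j
    chi = tuple(1 if k in R else 0 for k in range(n))
    return bool(f(chi))                          # step C.2: one oracle query
```

For the read-once function f = (x0 ∨ x1)x2, whose dual is x0 x1 ∨ x2, the routine confirms the edge (x0, x2) and, via step C.1, the non-edge (x0, x1).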
Complexity
The computational complexity of the procedure is determined as follows. Step 0
requires two queries to the oracle. Step 1 constructs the co-occurrence graph G(f )
by first calling the algorithm Find-Essential-Variables (Part B) to generate
P [i] and D[i] for each variable xi in O(n²) time using O(n²) queries; then it
applies Lemma 10.6 (Part C) to determine the edges of the graph. Step C.1 can be
done in the same complexity as Step B; however, Step C.2 uses O(n³) time and
O(n²) queries, since, for each pair i, j , we have O(n) operations and 1 query. Step
2, building the cotree T for G(f ), takes O(n²) time using one of the fast cograph
algorithms of [154, 214, 431], and Step 3 takes no time at all.
To summarize, the overall complexity using the method of Angluin, Hellerstein,
and Karpinski [22] will be O(n³) time and O(n²) queries. However, in an
unpublished manuscript [250], Dahlhaus subsequently reported an alternative to
Step C.2 using only O(n²) time. (Further generalizations by Raghavan and Schach [774]
lead to the same time bound.)
The main result, therefore, is the following:
Theorem 10.10. The Read-Once Exact Learning problem can be solved with
the AHK procedure in O(n²) time, using O(n²) queries to the oracle.
Proof. The correctness of the AHK exact learning procedure follows from
Lemma 10.6, Exercises 17–19, and Remark 10.4.
read-once” assumption is vital for the construction of G(f ). (See the discussion
in Exercise 28 concerning what might happen if such an oracle were to be applied
to a non-read-once function.)
Further topics relating computational learning theory with read-once functions
may be found in [13, 22, 162, 484, 396, 397, 482, 749, 774, 838, 884, etc.].
Definition 10.1. Given three finite sets S^1 = {s^1_1, s^1_2, ..., s^1_{m1}} and
S^2 = {s^2_1, s^2_2, ..., s^2_{m2}}, which are interpreted as the sets of strategies of
the players 1 and 2, and X = {x1, x2, ..., xk}, which is interpreted as the set of
outcomes, a game form (of two players) is a mapping g : S^1 × S^2 → X, which
assigns an outcome x(s^1, s^2) ∈ X to every pair of strategies s^1 ∈ S^1, s^2 ∈ S^2.
Definition 10.2. Two strategies s^i_1 and s^i_2 of player i, where i = 1 or 2, are
called equivalent if for every strategy s^{3−i} of the opponent, we have g(s^i_1, s^{3−i}) =
g(s^i_2, s^{3−i}); in other words, if in the matrix M(g), the rows (i = 1) or the columns
(i = 2) corresponding to the strategies s^i_1 and s^i_2 are equal.
Definition 10.3. Given a read-once function f , we can interpret its parse tree
(or read-once formula) T (f ) as an extensive game form (or game tree) of two
players. The leaves X = {x1 , x2 , ..., xk } of T are the final positions or outcomes.
The internal vertices of T are the internal positions. The game starts at the root of
T and ends in a final position x ∈ X. Each path from the root to a final position
(leaf) is called a play. If an internal node v is labeled by ∨ (respectively, by ∧),
then it is the turn of player 1 (respectively, player 2) to move in v. This player can
choose any vertex that is a child of v in T .
A strategy of a player is a mapping which assigns a move to every position in
which this player has to move. In other words, a strategy is a plan of how to play
in every possible situation.
Any pair of strategies s^1 of player 1 and s^2 of player 2 defines a play p(s^1, s^2) and
an outcome x(s^1, s^2) that would appear if both players implement these strategies.
Two strategies s^i_1 and s^i_2 of player i, where i = 1 or 2, are called equivalent
if for every strategy s^{3−i} of the opponent the outcome is the same, that is, if
x(s^i_1, s^{3−i}) = x(s^i_2, s^{3−i}). By suppressing all but one (arbitrary) strategy from every
class of equivalent strategies, we obtain two reduced sets of strategies, denoted by
S^1 = {s^1_1, s^1_2, ..., s^1_{m1}} and S^2 = {s^2_1, s^2_2, ..., s^2_{m2}}.
The mapping g : S^1 × S^2 → X, which assigns the outcome x(s^1, s^2) ∈ X to
every pair of strategies s^1 ∈ S^1, s^2 ∈ S^2, defines a game form, which we call the
normal form of the corresponding extensive game form.
Note that such a mapping g = g(T ) may not be injective because different pairs
of strategies may generate the same play.
We call a game form g positional if it is the normal form of an extensive game
form, that is, if g = g(T (f )) for a read-once function f .
Example 10.6. In the extensive game form defined by the read-once formula
((x1 ∨ x2 )x3 ∨ x4 )x5 , each player has three strategies, and the corresponding
normal game form is given by the following (3 × 3)-matrix:

        | x1 x3 x5 |
M2 =    | x2 x3 x5 |
        | x4 x4 x5 |
The game form given by the matrix

        | x1 x1 |
M3 =    | x2 x3 |

is also generated by a read-once formula, namely, by x1 ∨ x2 x3 .
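The passage from a read-once formula to its normal form can be sketched mechanically (our own encoding: formulas as nested ('or' | 'and', children) tuples, with 'or' nodes belonging to player 1 and 'and' nodes to player 2; rows and columns are normalized by sorting, so their order may differ from the matrices above):

```python
from itertools import product

def normal_form(tree):
    """Matrix of outcomes: rows = reduced strategies of player 1 ('or'
    nodes), columns = reduced strategies of player 2 ('and' nodes)."""
    def nodes(t, op, path=()):
        if isinstance(t, str):
            return []
        out = [(path, len(t[1]))] if t[0] == op else []
        for idx, child in enumerate(t[1]):
            out += nodes(child, op, path + (idx,))
        return out

    def strategies(op):
        spots = nodes(tree, op)
        return [dict(zip((p for p, _ in spots), choice))
                for choice in product(*(range(k) for _, k in spots))]

    def play(t, s1, s2, path=()):
        if isinstance(t, str):
            return t
        ch = (s1 if t[0] == 'or' else s2)[path]
        return play(t[1][ch], s1, s2, path + (ch,))

    rows = [tuple(play(tree, s1, s2) for s2 in strategies('and'))
            for s1 in strategies('or')]
    rows = sorted(set(rows))               # merge equivalent strategies of 1
    cols = sorted(set(zip(*rows)))         # ... and of player 2
    return [list(r) for r in zip(*cols)]
```

On the formula ((x1 ∨ x2)x3 ∨ x4)x5 this reproduces the matrix M2 above, and on x1 ∨ x2 x3 it reproduces M3.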
Our aim is to characterize the positional game forms.
Definition 10.4. Let us consider a game form g and the corresponding matrix
M = M(g). We associate with M two DNFs, representing two Boolean functions
f1 = f1 (g) = f1 (M) and f2 = f2 (g) = f2 (M), respectively, by first taking the
conjunction of all the variables in each row (respectively, each column) of M,
and then taking the disjunction of all these conjunctions for all rows (respectively,
columns) of M.
We call a game form g (as well as its matrix M) tight if the functions f1 and f2
are mutually dual.
Example 10.7. Matrix M2 of Example 10.6 generates the functions f1 (M2 ) =
x1 x3 x5 ∨ x2 x3 x5 ∨ x4 x5 and f2 (M2 ) = x1 x2 x4 ∨ x3 x4 ∨ x5 . These functions are
mutually dual, thus the game form is tight. Matrix M3 is also tight, because its func-
tions f1 (M3 ) = x1 ∨ x2 x3 and f2 (M3 ) = x1 x2 ∨ x1 x3 are mutually dual. However,
M1 is not tight, because its functions f1 (M1 ) = f2 (M1 ) = x1 x2 are not mutually
dual.
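Tightness can be checked by brute force over all assignments to the variables appearing in the matrix (exponential in general, but immediate at this size; a sketch with our own names):

```python
from itertools import product

def tight(matrix):
    """Is the game form tight, i.e., are f1(M) and f2(M) mutually dual?"""
    variables = sorted({v for row in matrix for v in row})
    rows = [set(r) for r in matrix]
    cols = [set(c) for c in zip(*matrix)]
    def dnf(terms):                       # disjunction of conjunctions
        return lambda a: any(all(a[v] for v in t) for t in terms)
    f1, f2 = dnf(rows), dnf(cols)
    for bits in product([False, True], repeat=len(variables)):
        a = dict(zip(variables, bits))
        neg = {v: not b for v, b in a.items()}
        if f2(a) != (not f1(neg)):        # f2 must equal the dual of f1
            return False
    return True
```

The check confirms that M2 and M3 are tight; a matrix with f1 = f2 = x1 x2 (which is not self-dual) is rejected.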
Remark 10.12. It is proven in [421] that a normal game form (of two players) is
Nash-solvable (that is, for an arbitrary payoff the obtained game has at least one
Nash equilibrium in pure strategies) if and only if this game form is tight.
Theorem 10.12. Let f be a read-once function; T = T (f ), the parse tree of f
interpreted as an extensive game form; g = g(T ), its normal form; M = M(g),
the corresponding matrix; and f1 = f1 (M), f2 = f2 (M), the functions generated
by M. Then, f1 = f and f2 = f d .
Proof. By induction. For a trivial function f the claim is obvious. If f = f ′ ∨ f ″,
then f1 = f1′ ∨ f1″ and f2 = f2′ ∧ f2″. If f = f ′ ∧ f ″, then f1 = f1′ ∧ f1″ and f2 =
f2′ ∨ f2″. The theorem follows directly from the definition of strategies.
Theorem 10.13. A game form g and its corresponding matrix M are rectangular
if and only if every prime implicant of f1 (M) and every prime implicant of f2 (M)
have exactly one variable in common.
Proof. Obviously, any two such prime implicants must have at least one common
variable, because every row and every column in M intersect; that is, row s^1 and
column s^2 always have a common outcome x = g(s^1, s^2). Let us suppose that they
have another common outcome, namely, that there exist strategies s^1_i and s^2_j such
that g(s^1, s^2_j) = g(s^1_i, s^2) = x′ ≠ x. Then, g(s^1, s^2) = x ≠ x′; thus, g is not rectangular.
Conversely, let us assume that g is not rectangular, that is, g(s^1_1, s^2_1) = g(s^1_2, s^2_2) =
x, while g(s^1_1, s^2_2) = x′ ≠ x. Then row s^1_1 and column s^2_2 have at least two outcomes
in common, namely, x and x′.
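Rectangularity, in the sense used in this proof (equal entries at positions (i, j) and (i′, j′) force the same entry at (i, j′)), is likewise a direct check (a sketch; names are ours):

```python
def rectangular(matrix):
    """g(s, t) = g(s', t') = x must imply g(s, t') = x."""
    m, n = len(matrix), len(matrix[0])
    return all(matrix[i][j2] == matrix[i][j]
               for i in range(m) for j in range(n)
               for i2 in range(m) for j2 in range(n)
               if matrix[i][j] == matrix[i2][j2])
```

The matrix M2 above passes the check, while a matrix with equal entries on one diagonal and a different entry off it, such as [[x, y], [y, x]], does not.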
Theorem 10.14. (Gurvich [423, 424]). A normal game form g is positional if and
only if it is tight and rectangular.
Proof. The normal form g corresponding to an extensive game form T (f ) is tight
in view of Theorem 10.12, and g is rectangular in view of Theorem 10.13 and
Theorem 10.6(iv).
Conversely, if g is tight and rectangular, then, by definition, f1 (g) and f2 (g)
are dual. Further, according to Theorem 10.13, every prime implicant of f1 (g) and
every prime implicant of f2 (g) have exactly one variable in common. Hence, by
Theorem 10.6(iv), f1 (g) and f2 (g) are read-once; thus, g is positional.
Remark 10.13. In [423], this theorem is generalized to game forms of n players.
The criterion is the same: A game form is positional if and only if it is tight and
rectangular. The proof is based on the cotree decomposition of P4 -free graphs; see
Sections 10.3, 10.5.
In his 1978 doctoral thesis, Gurvich [423] remarked that the parse tree decom-
position is a must for any minimum ∨-∧ formula for f , in both the monotone
and general cases. However, a bit earlier, Michel Chein's short paper [190], based
on his doctoral thesis of 1967, may be the earliest one mentioning “read-once”
functions. J. Kuntzmann (Chein’s thesis advisor) raised the question a few years
earlier in the first edition (1965) of his book “Algèbre de Boole” [589], mentioning
a problem called “dédoublement de variables,” and in the second edition (1968)
he cites Chein’s work.
What Chein does (using our notation) is to look at the bipartite graph B(f ) =
(P, V , E), where P is the set of prime implicants, V is the set of variables, and
edges represent containment, that is, for all P ∈ P, v ∈ V ,
(P , v) ∈ E ⇐⇒ v ∈ P .
The reader can easily verify that B(f ) is connected if and only if the graph
G(f ) is connected.
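As an illustration of this equivalence, here is a small Python sketch (ours, not from the text) that builds both B(f) and G(f) from a family of prime implicants, each given as a set of variable names, and tests connectivity of each graph:

```python
from itertools import combinations

def cooccurrence_graph(prime_implicants):
    """G(f): vertices are the variables; (u, v) is an edge when u and v
    occur together in some prime implicant."""
    edges = set()
    for p in prime_implicants:
        for u, v in combinations(sorted(p), 2):
            edges.add((u, v))
    vertices = set().union(*prime_implicants)
    return vertices, edges

def bipartite_graph(prime_implicants):
    """B(f) = (P, V, E): one side the prime implicants, the other the
    variables; (P, v) is an edge iff v belongs to P."""
    vertices = {('P', i) for i in range(len(prime_implicants))}
    vertices |= {('V', v) for p in prime_implicants for v in p}
    edges = {(('P', i), ('V', v))
             for i, p in enumerate(prime_implicants) for v in p}
    return vertices, edges

def is_connected(vertices, edges):
    """Standard depth-first search over an undirected graph."""
    if not vertices:
        return True
    adj = {u: set() for u in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, stack = set(), [next(iter(vertices))]
    while stack:
        u = stack.pop()
        if u not in seen:
            seen.add(u)
            stack.extend(adj[u] - seen)
    return seen == vertices

# f = x1x2 ∨ x2x3: both B(f) and G(f) are connected.
pi = [{'x1', 'x2'}, {'x2', 'x3'}]
print(is_connected(*cooccurrence_graph(pi)),
      is_connected(*bipartite_graph(pi)))
```

On a disconnected family, such as [{'x1','x2'}, {'x3','x4'}], both graphs come out disconnected, in line with the equivalence stated above.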
Chein’s method is to check which of B(f ) or B(f d ) is disconnected (failing if
both are connected) and continuing recursively. An exponential price is paid for
dualizing. Peer and Pinter [734] do something quite similar.
By contrast, as the reader also now knows, the polynomial-time algorithm of
Golumbic, Mintz, Rotics similarly acts on G(f ) and G(f d ), but G(f d ) is gotten for
free, without dualizing, thanks to the fact that G(f d ) equals the graph complement
of G(f ) (by Theorem 10.6), paying only an extra low price to check for normality.
Finally, to clarify complexities using our notation: Clearly, building B(f d )
involves dualization of f ; however, building G(f d ) can be done in polynomial
time for any positive Boolean function (i.e., without any dualization). The implica-
tion is that one can compute a unique read-once decomposition for any (positive)
read-once Boolean function in polynomial time; see also Ramamurthy’s book
[777].
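To make one ingredient of this discussion concrete, the following Python fragment (an illustration of ours, not the Golumbic–Mintz–Rotics algorithm itself) builds the co-occurrence graph G(f) of a positive function and tests it for an induced P4 by brute force; a positive function whose co-occurrence graph contains an induced P4 cannot be read-once:

```python
from itertools import combinations, permutations

def cooccurrence_edges(prime_implicants):
    """Edges of G(f), each edge stored as a frozenset of two variables."""
    edges = set()
    for p in prime_implicants:
        for u, v in combinations(sorted(p), 2):
            edges.add(frozenset((u, v)))
    return edges

def has_induced_p4(vertices, edges):
    """Brute-force search for an induced path a-b-c-d on four vertices."""
    for quad in combinations(vertices, 4):
        for a, b, c, d in permutations(quad):
            path = {frozenset(x) for x in ((a, b), (b, c), (c, d))}
            nonedges = {frozenset(x) for x in ((a, c), (a, d), (b, d))}
            if path <= edges and not (nonedges & edges):
                return True
    return False

# x1x2 ∨ x2x3 ∨ x3x4 has G(f) = the path x1-x2-x3-x4, an induced P4,
# so this function is not read-once.
pi = [{'x1', 'x2'}, {'x2', 'x3'}, {'x3', 'x4'}]
vs = set().union(*pi)
print(has_induced_p4(vs, cooccurrence_edges(pi)))  # True
```

A polynomial-time test would of course replace the brute-force search by cograph recognition; the sketch only demonstrates the role played by G(f).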
To summarize, testing read-onceness and obtaining a parse tree decomposition
is just an extreme case of representing f by a minimum length ∨-∧ formula.
The parse tree decomposition implies (A) and has been known since 1958 [592],
whereas (B) has been known since 1977 [422, 423] and been rediscovered inde-
pendently several times thereafter [293, 294, 548, 696]. Dominique de Werra has
described it as “an additional interesting example of rediscovery by people from
the same scientific community. It shows that the problem has kept its importance
and [those involved] have good taste.”
10.9 Exercises
1. Prove that a Boolean function f for which some variable appears in its
positive form x in one prime implicant and in its negative form x in another
prime implicant cannot be a read-once function.
2. Verify Remark 10.1; namely, if T is a proper dual subimplicant of f , then
there exists a prime implicant of f , say, P , such that P ∩ T = ∅.
3. Consider the function
f = x1 x2 ∨ x1 x5 ∨ x2 x3 ∨ x2 x4 ∨ x3 x4 ∨ x4 x5 .
(a) Draw the co-occurrence graph G(f ). Prove that f is not a read-once
function.
(b) Let T = {x1 , x4 }. What are the sets P0 , Px1 , Px4 ? Prove that T is a dual
subimplicant of f by finding a noncovering selection.
(c) Let T = {x3 , x4 , x5 }. What are the sets P0 , Px3 , Px4 , Px5 ? Prove that
T is not a dual subimplicant of f .
4. Consider the function f = ab ∨ bc ∨ cd. Verify that {a, d} is not a dual
subimplicant.
5. Consider the functions
f1 = x1 x3 x5 ∨ x1 x3 x6 ∨ x1 x4 x5 ∨ x1 x4 x6 ∨ x2 x3 x5 ∨ x2 x3 x6 ∨ x2 x4 x5 ∨ x2 x4 x6
and
f2 = x1 x3 x5 ∨ x1 x3 x6 ∨ x1 x4 x5 ∨ x1 x4 x6 ∨ x2 x3 x5 ∨ x2 x3 x6 ∨ x2 x4 x5 .
Verify that they generate the same co-occurrence graph G, which is P4 -free,
and that all prime implicants of f1 and f2 correspond to maximal cliques of
G; yet, f1 is normal, while f2 is not. Find the cotree for G and the read-once
expression for f1 .
23. Prove directly that the normal form of any extensive game form is rectan-
gular. In other words, if two pairs of strategies (s11 , s12 ) and (s21 , s22 ) result in
the same play p, that is, p(s11 , s12 ) = p(s21 , s22 ) = p, then (s11 , s22 ) and (s21 , s12 )
also result in the same play, that is, p(s11 , s22 ) = p(s21 , s12 ) = p.
24. Verify that the following two game forms are tight:

         x1 x2 x1 x2
         x3 x4 x4 x3
   M4 =  x1 x4 x1 x5
         x3 x2 x6 x2

         x1 x1 x2
   M5 =  x1 x1 x3 .
         x2 x4 x2
for the functions f1 and f2 of Section 10.1. Could the oracle generate all
prime implicants? What would be the complexity?
29. The two game forms M4 and M5 in Exercise 24 represent the normal form
of some extensive games on graphs that have no terminal positions, and
their cycles are the outcomes of the game. Find two graphs that generate
M4 and M5 .
11
Characterizations of special classes by functional equations
Lisa Hellerstein
488 11 Characterizations by functional equations
written XY ) and X̄ similarly denote the bitwise conjunction of X and Y and the
bitwise negation of X, respectively.
By Theorem 1.20, a Boolean function f on Bn is positive if and only if f (X) ≤
f (Y ) for all X, Y ∈ B n such that X ≤ Y . The following theorem gives two other
characterizations of positive functions:
Theorem 11.1. A Boolean function f on B n is positive if and only if the following
inequality is satisfied for all X, Y ∈ Bn :
f (X) ≤ f (X ∨ Y ) (11.1)
or, equivalently, if and only if the following inequality is satisfied for all X, Y ∈ Bn :
f (XY ) ≤ f (X). (11.2)
From the foregoing, it is easy to show that the class of negative Boolean
functions is characterized by the inequalities
f (X ∨ Y ) ≤ f (X)
and
f (X) ≤ f (XY ),
which are opposite to the inequalities given for positive functions.
We will show below that similar functional equations and inequalities charac-
terize other interesting classes of Boolean functions.
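For small n, inequality (11.1) can be checked exhaustively. The following short Python sketch (ours, not from the text) does exactly that:

```python
from itertools import product

def is_positive(f, n):
    """Check inequality (11.1): f(X) <= f(X ∨ Y) for all X, Y in B^n."""
    points = list(product((0, 1), repeat=n))
    for X in points:
        for Y in points:
            join = tuple(x | y for x, y in zip(X, Y))  # componentwise X ∨ Y
            if f(X) > f(join):
                return False
    return True

print(is_positive(lambda v: v[0] & v[1], 2))  # x1 x2: positive, True
print(is_positive(lambda v: v[0] ^ v[1], 2))  # x1 ⊕ x2: not positive, False
```

The negative-function inequalities given above can be checked by the same loop with the comparison reversed.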
The value of Cg ((0, 1), (1, 0), (0, 0)) can be computed as follows:
Cg ((0, 1), (1, 0), (0, 0)) = g((0, 1) ∨ (0, 0)) ∨ ḡ((1, 0)) = g(0, 1) ∨ ḡ(1, 0) = 0 ∨ 1 = 1.
We say that a particular Boolean function g on Bn satisfies a functional equation
C = D in the variables Y1 , . . . , Ym and function symbol f , if for all Y1 , . . . , Ym ∈ Bn ,
Cg (Y1 , . . . , Ym ) = Dg (Y1 , . . . , Ym ).
Otherwise, we say that g falsifies the equation.
Example 11.4. Consider the functional equation
Let C denote the left-hand side of this equation and D the right-hand side. Note
that C is the same as in the previous example.
Also as in the previous example, let g be the function on B2 such that g(x1 , x2 ) =
x1 x2 , and let Y1 = (0, 1), Y2 = (1, 0), Y3 = (0, 0). We showed that Cg (Y1 , Y2 , Y3 ) = 1.
For the same values of the Yi ’s,
Proof. Follows directly from the fact that C and D are both Boolean-valued.
f (X ∨ Y ) ≤ f (X) ∨ f (Y ). (11.5)
Definition 11.4. The degree of a Boolean function f is the degree of the complete
DNF of f . Equivalently, it is the maximum degree of any prime implicant of f . A
Boolean function is called linear if its degree is at most 1.
If a Boolean function is representable by a DNF of degree 1, then all of its prime
implicants have degree 1. Therefore, a Boolean function is linear if and only if it
can be represented by a DNF of degree at most 1.
We discuss functions of degree k ≥ 2 in the next section.
Polar functions were defined in Chapter 5, Section 5.3. A Boolean function is
polar if it is representable by a DNF in which no term contains both a complemented
and an uncomplemented variable. Equivalently, a Boolean function f is polar if
f = g ∨ h for some positive function g and some negative function h.
Submodular functions were defined in Chapter 6, Section 6.9 to be the functions
satisfying the inequality
f (X) ∨ f (Y ) ≥ f (X ∨ Y ) ∨ f (XY ). (11.6)
Supermodular functions are defined by reversing the inequality for submodular
functions.
Definition 11.5. A Boolean function is supermodular if it satisfies the inequality
f (X) ∨ f (Y ) ≤ f (X ∨ Y ) ∨ f (XY ). (11.7)
In fact, the class of supermodular functions is identical to the class of polar
functions.
Theorem 11.3. A Boolean function f is polar if and only if it is supermodular.
Proof. Suppose f is a polar function on Bn . Let f = g ∨ h, where g is positive
and h is negative. Suppose X, Y ∈ B n are such that f (X) ∨ f (Y ) = 1. Assume,
without loss of generality, that f (X) = 1. Then, X satisfies either g or h, or both.
If X satisfies g, then X ∨ Y must also satisfy g because g is positive, and hence,
f (X ∨ Y ) = 1. If X satisfies h, then XY must satisfy h because h is negative, and
hence, f (XY ) = 1. Therefore, f is supermodular.
Conversely, suppose f is a supermodular function on Bn . Define the following
sets:
S = {X ∈ Bn | f (X) = 1 and for all Y ∈ Bn , X ≤ Y ⇒ f (Y ) = 1},
T = {X ∈ Bn | f (X) = 1 and for all Y ∈ Bn , Y ≤ X ⇒ f (Y ) = 1}.
Let g be the function on Bn such that g(X) = 1 if and only if X ∈ S, and let
h be the function on Bn such that h(X) = 1 if and only if X ∈ T . Clearly g is
positive, h is negative, and g ∨ h ≤ f . We will show that f = g ∨ h. Suppose not.
Then, there exist points P , Q, R ∈ Bn such that f (Q) = 1, f (P ) = f (R) = 0, and
P ≤ Q ≤ R. Define Z = P ∨ Q̄R. Since P ≤ Q ≤ R, ZQ = P and Z ∨ Q = R.
But then f (Z) ∨ f (Q) = 1 and f (ZQ) ∨ f (Z ∨ Q) = 0, contradicting that f is
supermodular. Therefore f = g ∨ h, and thus, f is polar.
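Theorem 11.3 can be verified on small examples by brute force. The following Python sketch (ours, not from the text) tests inequality (11.7) for a polar and a non-polar function:

```python
from itertools import product

def satisfies(f, n, ineq):
    """Check that the pointwise inequality holds for all X, Y in B^n."""
    pts = list(product((0, 1), repeat=n))
    return all(ineq(f, X, Y) for X in pts for Y in pts)

def supermodular(f, X, Y):
    """Inequality (11.7): f(X) ∨ f(Y) <= f(X ∨ Y) ∨ f(XY)."""
    join = tuple(x | y for x, y in zip(X, Y))
    meet = tuple(x & y for x, y in zip(X, Y))
    return (f(X) | f(Y)) <= (f(join) | f(meet))

# polar: g ∨ h with g = x1 x2 (positive) and h = x̄3 (negative)
polar = lambda v: (v[0] & v[1]) | (1 - v[2])
# not polar: x1 x̄2 cannot be written as g ∨ h with g positive, h negative
not_polar = lambda v: v[0] & (1 - v[1])

print(satisfies(polar, 3, supermodular))      # True
print(satisfies(not_polar, 2, supermodular))  # False
```

For the non-polar example, X = (1, 0) and Y = (0, 1) witness the violation: f(X) ∨ f(Y) = 1, while f(X ∨ Y) ∨ f(XY) = 0.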
11.3 Characterizations of particular classes
P1 P2 P3 ,
where each factor Pi is an elementary conjunction with at least one variable, but
no two of the three factors P1 , P2 , P3 have a common variable. Define elementary
conjunctions
R1 = P1 P3 , R2 = P2 P3 , R3 = P1 P2 .
Since P is a prime implicant, none of these Ri is an implicant of f , namely, there
are points X, Y , Z such that
R1 (X) = R2 (Y ) = R3 (Z) = 1 and
f (X) = f (Y ) = f (Z) = 0.
These points violate (11.8).
f (Y1 ) = . . . = f (Yk+1 ) = 0
Yj ,
i=1 j =i
and hence, so does P . It follows that the left hand side of Equation (11.10) is 0.
11.4 Conditions for characterization
Ri = Pi .
j =i
f (Y1 ) = · · · = f (Yk+1 ) = 0.
These points violate (11.10).
the vector map associated with r. Clearly, for all X ∈ J , f (X) = g(s(X)).
Definition 11.7. Let f be a Boolean function on Bn . Let k > 0. Then, the function
g on B n+k defined by g(x1 , . . . , xn+k ) = f (x1 , . . . , xn ) is said to be produced from
f by addition of inessential variables.
Example 11.6. Let f (x1 , x2 , x3 ) = x1 x2 ∨ x1 x̄3 . Let r : {1, 2, 3} → {1, 2} be such that
r(1) = r(3) = 2 and r(2) = 1. Then g(x1 , x2 ) = x2 x1 ∨ x2 x̄2 = x1 x2 is produced from
f by the identification map r. The function h(x1 , x2 , x3 , x4 ) = x1 x2 can be produced
from g by addition of inessential variables. The function h′ (x1 , x2 , x3 , x4 ) = x1 x4
can be produced from h by identification of variables (in fact, by permutation of
variables).
We can use Theorem 11.7 to prove that certain classes of Boolean functions
cannot be characterized by functional equations.
Proof. We show that each of these classes is not closed under identification of
variables.
Monotone functions: Let f (x1 , x2 , x3 , x4 ) = x1 x̄2 ∨ x̄3 x4 , and apply the identifica-
tion map r : {1, 2, 3, 4} → {1, 2} such that r(1) = 1, r(2) = 2, r(3) = 1, and r(4) = 2,
to yield f ′ (x1 , x2 ) = x1 x̄2 ∨ x̄1 x2 . The function f ′ is neither positive nor negative
in x1 and x2 , and hence, it is not monotone (i.e., not unate).
Functions of degree at most k, for all k ≥ 3: Let f (x1 , . . . , x2k ) = x1 x2 x3 · · · xk
∨ x̄k+1 · · · x̄2k , and apply the identification map r : {1, . . . , 2k} → {1, . . . , 2k − 1}
such that r(i) = i for all i < 2k, and r(2k) = 1. The resulting function has
f (x1 , x2 , x3 , x4 , x5 ) = x3 x5 ∨ x4 x5 ∨ x2 x3 x4 ∨ x1 x2 x4 ∨ x1 x2 x5 .
f (0, 0, 1, 0, 1) = 1
f (0, 1, 0, 0, 1) = 0
f (1, 1, 0, 1, 0) = 1
f (1, 0, 1, 1, 0) = 0,
For i ∈ {1, . . . , t} let hi be the Boolean function on B t such that, for all
(x1 , . . . , xt ) ∈ B t , hi (x1 , . . . , xt ) = xi if the transpose of (x1 , . . . , xt ) is in col(A),
and hi (x1 , . . . , xt ) = 0 otherwise. Similarly, for i ∈ {1, . . . , t} let ht+i be the Boolean
function on Bt such that for all (x1 , . . . , xt ) ∈ B t , ht+i (x1 , . . . , xt ) = xi if the transpose
of (x1 , . . . , xt ) is in col(A), and ht+i (x1 , . . . , xt ) = 1 otherwise. For i ∈ {1, . . . , 2t},
let φi (x1 , . . . , xt ) be a Boolean expression representing hi .
For all n ≥ 0, define the function hin : (Bn )t → Bn as follows: For all
X1 , . . . , Xt ∈ B n , hin (X1 , . . . , Xt ) = (y1 , . . . , yn ) such that, for all j ∈ {1, . . . , n},
yj = hi (X1 [j ], X2 [j ], . . . , Xt [j ]). That is, hin is the function obtained by apply-
ing hi componentwise to X1 , . . . , Xt . Thus, hin is the interpretation of φi (x1 , . . . , xt )
in Bn . Because n may not be equal to 1, we will write φi (X1 , . . . , Xt ) rather than
φi (x1 , . . . , xt ), to emphasize that the variables of φi are vector variables.
Let H = {h1 , . . . , h2t }. Define a partition of H into two sets, H0 and H1 as
follows:
H0 = {hkt+i : i ∈ {1, . . . , t}, k ∈ {0, 1}, and g(Ai ) = 0},
H1 = {hkt+i : i ∈ {1, . . . , t}, k ∈ {0, 1}, and g(Ai ) = 1}.
The desired equation Ig is defined to be
⋁_{hi ∈H0} f̄ (φi (X1 , . . . , Xt )) ∨ ⋁_{hi ∈H1} f (φi (X1 , . . . , Xt )) = 1. (11.11)
{1, . . . , m} be the bijection such that, for all u ∈ {1, . . . , m}, p(u) = ju . Let f be
the function produced from f0 by the identification map p.
Let i ∈ {1, . . . , t}. For index c, let Wic and Aic denote the cth components of Wi
and Ai respectively. Let ρ = Wik1 . Then,
Thus g(Ai ) = f (Ai ) for all i ∈ {1, . . . , t}. Since the rows of A are the t elements
of the domain of g, f = g.
The class K is closed under identification of variables and addition of inessential
variables. If f were in K, then g would be also, since g can be obtained from f
by these operations. Therefore, f is not in K, which is what we wanted to show.
It remains to consider the case q = 0. Let i ∈ {1, . . . , t}. By Equation (11.12),
for ρ ∈ {0, 1}, g(h^m_{ρt+i} (A1 , . . . , At )) = f (h^n_{ρt+i} (W1 , . . . , Wt )). By the definitions of
hi and ht+i , it follows that g(Ai1 , . . . , Ain ) = f (0, . . . , 0) = f (1, . . . , 1). Since this
is true for all i ∈ {1, . . . , t}, g is a constant function. The constant function g can be
produced from f by first applying the identification map r : {1, . . . , n} → {1} such
that r(u) = 1 for all u ∈ {1, . . . , n}, and then adding m − 1 inessential variables. As
in the case q > 0, it follows immediately that f is not in K.
Theorem 11.9 can be used to show that particular classes of functions have a
characterization by functional equations. For example, we can prove the following
result for the class of threshold functions.
Proof. By Theorem 11.9, it suffices to show that the class of threshold functions
is closed under identification of variables and addition of inessential variables.
Closure under addition of inessential variables is obvious.
We show closure under identification of variables. Suppose f (x1 , . . . , xn ) is a
threshold function. Then, for some w1 , . . . , wn and t in R, f (x1 , . . . , xn ) = 0 if and
only if ∑i wi xi ≤ t. If f ′ (x1 , . . . , xm ) is obtained from f using an identification
map r, then f ′ (x1 , . . . , xm ) = 0 if and only if
∑_{i=1}^{m} ( ∑_{1≤j ≤n, r(j )=i} wj ) xi ≤ t.
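This weight-summing argument can be illustrated in a few lines of Python (a sketch of ours; the helper names are not from the text): identifying variables and summing the corresponding weights yield the same function.

```python
from itertools import product

def threshold_fn(weights, t):
    """f(x) = 0 iff sum_i w_i x_i <= t (and 1 otherwise)."""
    return lambda x: int(sum(w * v for w, v in zip(weights, x)) > t)

def identify(f, n, r):
    """Function produced from f by the identification map r: {0..n-1} -> {0..m-1}."""
    return lambda y: f(tuple(y[r[j]] for j in range(n)))

def identified_weights(weights, r, m):
    """The new weight of y_i is the sum of the old weights mapped onto it."""
    new = [0] * m
    for j, w in enumerate(weights):
        new[r[j]] += w
    return new

w, t, r = [2, -1, 3], 1, {0: 0, 1: 1, 2: 0}        # identify x1 and x3
f = threshold_fn(w, t)
g1 = identify(f, 3, r)                              # identification applied to f
g2 = threshold_fn(identified_weights(w, r, 2), t)   # threshold with summed weights
print(all(g1(y) == g2(y) for y in product((0, 1), repeat=2)))  # True
```

Here the identified weights are [5, −1], and the two functions agree on all of B².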
Then for each Boolean function g that is not a threshold function, there exists
a certificate Qg of nonmembership of g in the set of threshold functions, such
that Qg has size at most c. Let Sg = {X ∈ Qg |g(X) = 1}, and let Tg = {X ∈ Qg |
g(X) = 0}.
Consider an arbitrary Boolean function g on Bn that is not a threshold function.
If the convex hull of Sg does not intersect the convex hull of Tg , then, by standard
separation theorems, (see, e.g., [788]), there exists a hyperplane separating the
points in Sg from the points in Tg . In this case, there exists a threshold function f
such that f (X) = g(X) for all X ∈ Qg . This contradicts that Qg is a certificate of
nonmembership of g in the set of threshold functions. Hence, the convex hulls of
Sg and Tg intersect.
Let X1 , . . . , Xt be the elements of Qg . Let Mg be the t × n matrix whose rows
are X1 , . . . , Xt . Let M̂g be the matrix obtained from Mg by deleting all columns
j from Mg such that, for some j ′ < j , column j ′ and column j of Mg are equal.
Let m be the number of columns of M̂g , and let X̂1 , . . . , X̂t be the rows of M̂g
corresponding to rows X1 , . . . , Xt of Mg .
Let Ŝg = {X̂i |Xi ∈ Sg }, and T̂g = {X̂i |Xi ∈ Tg }. Since Sg and Tg are disjoint, so
are Ŝg and T̂g . Also, since the convex hulls of Sg and Tg intersect, the convex hulls
of Ŝg and T̂g intersect.
Since the convex hulls of Ŝg and T̂g intersect, it follows from the proof of
Theorem 9.14 in Chapter 9 that, for some z > 0, there exist z points X̂i1 , . . . , X̂iz in
Ŝg (not necessarily distinct), and z points X̂j1 , . . . , X̂jz in T̂g (not necessarily distinct)
such that
and hence,
in Sg (not necessarily distinct), and z points Xj1 , ..., Xjz in Tg (not necessarily
distinct), such that Equation (11.15) holds. Since zg ≤ α, g is α-summable, a
contradiction.
of their similarity to graph minors, which have been extensively studied in graph
theory.
Hellerstein and Raghavan [486] showed that, for any k, the class of functions
representable by DNFs having at most k terms has constant-sized certificates of
nonmembership, and hence, by the above theorem, it has a characterization by a
finite set of functional equations.
11.6 Exercises
1. Give a functional equation characterizing the class of elementary conjunc-
tions.
2. Prove that the class of read-once functions cannot be characterized by a set
of functional equations.
3. This exercise is based on a result of Bernard Rosell (personal communica-
tion). Let k > 2. Recall that the class Fk consists of functions representable
by DNFs of degree at most k. Let f (x1 , . . . , xn ) be the function whose output
is 1 if and only if at least (k + 1)/2 of its inputs are 1 and at least (k + 1)/2 of its
inputs are 0.
(a) Show that the given function f is not in Fk .
(b) Prove a lower bound on the size of any certificate of nonmembership of
f in Fk . Use this lower bound to show that Fk cannot be characterized
by a finite set of functional equations.
4. In [748], Pippenger defined a Boolean constraint to be a pair (R, S), where
R and S are each a set of binary column vectors of length m, for some m ≥ 0.
If A is an m × n binary matrix, and f (x1 , . . . , xn ) is a Boolean function on n
variables, then let f (A) denote the column vector produced by applying f to
each row of A; namely, f (A) is the length-m column vector whose ith entry
is f (A[i, 1], A[i, 2], . . . , A[i, n]), for i = 1, . . . , m. We write A ≺ R if each
column of A is a member of R. Function f (x1 , . . . , xn ) satisfies constraint
(R, S) if for all m × n binary matrices A, A ≺ R implies that f (A) ∈ S.
A set I of Boolean constraints characterizes a class K of Boolean functions
if K consists precisely of the Boolean functions that satisfy all constraints
in I .
(a) Show that the following constraint (R, S), in which each element is a binary
column vector of length 2, characterizes the class of positive Boolean functions:
R = S = { (0, 1)ᵀ , (1, 1)ᵀ , (0, 0)ᵀ }.
(b) Give a constraint that characterizes the class of Horn functions.
(c) By Theorem 9.14, a Boolean function is a threshold function if and only
if it is k-asummable for every k ≥ 2.
Describe a constraint that characterizes the set of functions that are k-
asummable, for fixed k ≥ 2. Then construct an infinite set of constraints
that characterizes the class of threshold functions.
(d) Show that a class of Boolean functions can be characterized by a set
of functional equations if and only if it can be characterized by a set of
Boolean constraints. (See [748].)
Generalizations
12
Partially defined Boolean functions
Toshihide Ibaraki
12.1 Introduction
Suppose that a set of data points is at hand for a certain phenomenon. A data point
is called a positive example if it describes a case that triggers the phenomenon, and
a negative example otherwise. We consider the situation in which all data points
are binary and have a fixed dimension; namely, they belong to B n .
Given a set of positive examples T ⊆ B n and a set of negative examples F ⊆ B n ,
we call the pair (T , F ) a partially defined Boolean function (pdBf) on B n . For a
pdBf (T , F ) on B n , a Boolean function f : Bn → B satisfying
T (f ) ⊇ T and F (f ) ⊇ F
is called an extension of (T , F ), where
T (f ) = {A ∈ Bn | f (A) = 1}, (12.1)
n
F (f ) = {B ∈ B | f (B) = 0}. (12.2)
If we associate n Boolean variables xj , j = 1, 2, . . . , n, with the components of
points in B n , then extensions are Boolean functions of the variables x1 , x2 , . . . , xn .
As an example of a pdBf (T , F ), let us assume that each point A =
(a1 , a2 , . . . , an ) ∈ T ∪ F indicates the result of physical tests applied to a patient,
where T denotes the set of results for patients diagnosed as positive, and F denotes
the set of negative results. Each component aj of a point A gives the result of the
j -th test; for example, a1 = 1 may indicate that blood pressure is “high,” while
a1 = 0 indicates "low"; a2 = 1 may say that body temperature is "high," while
a2 = 0 says “low,” and so on. An extension f of this pdBf (T , F ) then describes
how the diagnosis of the disease could be formulated for all possible patients. In
other words, this Boolean function f contains all the details of the diagnosis. As
extensions of a given pdBf (T , F ) are not unique, in general, it is interesting and
important to investigate how to build meaningful extensions from given pdBfs.
This line of approach to data analysis has recently received increasing attention
in statistics and in artificial intelligence under various names, such as data mining,
x1 x2 x3 x4 x5 x6 x7 x8
A(1) = 0 1 0 1 0 1 1 0
T A(2) = 1 1 0 1 1 0 0 1
A(3) = 0 1 1 0 1 0 0 1
B (1) = 1 0 1 0 1 0 1 0
F B (2) = 0 0 0 1 1 1 0 0
B (3) = 1 1 0 1 0 1 0 1
B (4) = 0 0 1 0 1 0 1 0
f1 = x̄1 x2 ∨ x2 x5
f2 = x̄1 x̄5 ∨ x3 x̄7 ∨ x1 x5 x̄7
f3 = x5 x8 ∨ x6 x7 .
It can be verified that all these functions are indeed extensions of (T , F ), that is,
fk (A(i) ) = 1 holds for i = 1, 2, 3 and fk (B (i) ) = 0 holds for i = 1, 2, 3, 4. As we
shall see later, this pdBf has many other extensions.
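The verification at the end of Example 12.1 can be mechanized. The following Python sketch (ours, not from the text) encodes the seven data points and the three DNFs and checks them:

```python
# The pdBf (T, F) of Example 12.1, each point as a string of 8 bits.
T = ["01010110", "11011001", "01101001"]
F = ["10101010", "00011100", "11010101", "00101010"]

def x(p, j):            # value of variable x_j (1-indexed) at point p
    return int(p[j - 1])

def nx(p, j):           # complemented literal x̄_j
    return 1 - x(p, j)

f1 = lambda p: nx(p, 1) & x(p, 2) | x(p, 2) & x(p, 5)
f2 = lambda p: (nx(p, 1) & nx(p, 5) | x(p, 3) & nx(p, 7)
                | x(p, 1) & x(p, 5) & nx(p, 7))
f3 = lambda p: x(p, 5) & x(p, 8) | x(p, 6) & x(p, 7)

for f in (f1, f2, f3):
    assert all(f(A) == 1 for A in T) and all(f(B) == 0 for B in F)
print("all three are extensions")
```

Each assertion restates the condition fk(A(i)) = 1 and fk(B(i)) = 0 from the text.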
which may reflect our general belief that the truth is simple and beautiful, or as a
translation of Occam’s razor principle. Simplicity can be measured, for example,
by the sizes of representations such as DNFs, CNFs, and decision trees. The size of
a “support set,” to be discussed in Section 12.2.2, is another measure of simplicity.
Second,
• embodiment in the extensions of structural knowledge concerning the
phenomenon to be modeled.
For example, if high blood pressure is known to favor the appearance of a disease,
then we expect the extension f to depend positively on the variable xj associated
with blood pressure.
In more general mathematical terms, we require the obtained extension to
belong to a specified class of functions C. The selected class C may arise not
only from prior structural information, but also from the application that we have
in mind for the resulting extensions. For example, if an extension f is Horn, then
f can be dealt with by Horn rules; as discussed in Chapter 6, this allows us to
benefit from numerous convenient mathematical properties of Horn rules.
In this chapter, we consider the following classes of functions:
(1) The class of all Boolean functions, FALL .
(2) The class of positive functions, F+ (defined in Sections 1.10 and 1.11).
(3) The class of monotone, or unate functions, FUNATE (defined in Section 1.10).
(4) The class of functions representable by a DNF of degree at most k, Fk
(defined in Sections 1.4 and 1.11).
(5) The class of Horn functions, FHORN (discussed in Chapter 6).
(6) The class of threshold functions, FTh (discussed in Chapter 9).
(7) The class of decomposable functions, FF0 (S0 ,F1 (S1 )) (defined in
Section 12.3.6).
(8) The class of k-convex functions, Fk-CONV (discussed in Section 12.3.7).
For other classes of functions studied in the literature on pdBfs, see [139].
In dealing with real-world data, we should also be aware that the data may
contain errors as well as missing bits. A missing bit is denoted by ∗, meaning that
it can be either 0 or 1. We shall discuss in Sections 12.4 and 12.5 how to deal
with these situations, and we shall introduce various problems associated with the
extensions in such cases.
Problem EXTENSION(C)
Instance: A pdBf (T , F ).
Question: Does (T , F ) have an extension in C?
If a pdBf (T , F ) satisfies T ∩ F = ∅, then it has 2^(2^n −|T |−|F |) extensions. Define
two extensions fmin and fmax by T (fmin ) = T and T (fmax ) = Bn \ F . Then every
extension f of (T , F ) satisfies
fmin ≤ f ≤ fmax ,
that is, fmax maximizes T (f ) and fmin minimizes T (f ) among all extensions f
of (T , F ). Furthermore, all extensions of (T , F ) form a finite lattice with respect
to the operations ∨ and ∧ between functions. The largest element of this lattice is
fmax , and its smallest element is fmin . A remaining question is: Which extensions
in the lattice are appropriate for the purpose of logical analysis of data?
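A minimal Python illustration of fmin and fmax (ours; the tiny pdBf is invented for the example, and we use the natural definitions T(fmin) = T and F(fmax) = F):

```python
from itertools import product

# A small pdBf on B^3.
T = {(0, 1, 1), (1, 1, 0)}
F = {(0, 0, 0)}

fmin = lambda X: int(X in T)          # T(fmin) = T: minimal extension
fmax = lambda X: int(X not in F)      # F(fmax) = F: maximal extension

# Every extension f satisfies fmin <= f <= fmax pointwise.
# Sample extension: f(X) = 1 iff X has at least two ones.
f = lambda X: int(sum(X) >= 2)
ok = all(fmin(X) <= f(X) <= fmax(X) for X in product((0, 1), repeat=3))
print(ok)
```

Any other extension of this pdBf would pass the same pointwise check, reflecting the lattice structure described above.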
Problem MIN-SUPPORT(C)
Instance: A pdBf (T , F ) (where we assume that (T , F ) has an extension in C).
Output: A minimum support set S of (T , F ) for class C.
We first show that this problem for class FALL can be formulated as a set
covering problem. Recall that the set covering problem is the following NP-hard
optimization problem [371]:
and that minimizes ∑_{j=1}^{n} yj , where 1 is the n-dimensional column vector of 1's.
The relation between support sets and the set covering problem has been observed
in various early papers (e.g., Kambayashi [547]; Kuntzmann [589]; Necula [704]).
The preceding description follows the presentation by Crama, Hammer, and
Ibaraki [233].
Example 12.2. The set covering problem (12.7) corresponding to the pdBf in
Example 12.1 is given as follows.
minimize ∑_{j=1}^{8} yj
subject to y1 + y2 + y3 + y4 + y5 + y6 ≥ 1
y2 + y5 + y7 ≥ 1
y1 + y7 + y8 ≥ 1
y2 + y3 + y4 + y5 + y6 ≥ 1
y2 + y3 + y4 + y7 + y8 ≥ 1
y1 + y2 + y6 + y8 ≥ 1
y5 + y6 ≥ 1
y1 + y2 + y3 + y4 + y7 + y8 ≥ 1
y1 + y2 + y7 + y8 ≥ 1
y2 + y3 + y4 + y6 + y8 ≥ 1
y1 + y3 + y4 + y5 + y6 ≥ 1
y2 + y7 + y8 ≥ 1
y1 , y2 , . . . , y8 ∈ {0, 1}.
This set of inequalities contains many redundant inequalities, and can be greatly
simplified. As already observed in Chapter 1, Section 1.13, the constraints of a set
covering problem can be associated with a CNF such that a 0–1 assignment of
values to y satisfies the set covering constraints if and only if it satisfies all clauses
of the CNF. In the current example, we obtain the CNF
It is not difficult to see that the prime implicants of the function represented
by ψ correspond exactly to the minimal support sets of (T , F ) (see Chapter 4,
Section 4.2). Applying this procedure, we conclude that there are eight minimal
support sets for our example, namely,
The first two sets, S1 and S2 , are the only minimum support sets, and the following
DNFs provide two extensions associated with S1 and S2 , respectively.
ϕ1 = x5 x8 ∨ x̄5 x̄8
ϕ2 = x6 x7 ∨ x̄6 x̄7 .
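The covering constraints of Example 12.2 can be generated mechanically from the data of Example 12.1: each pair (A, B) ∈ T × F contributes the constraint that at least one yj with Aj ≠ Bj equals 1. A Python sketch (ours, not from the text):

```python
T = ["01010110", "11011001", "01101001"]
F = ["10101010", "00011100", "11010101", "00101010"]

def covering_constraints(T, F):
    """One constraint per pair (A, B): the set of (1-indexed) positions
    where A and B differ; some y_j in this set must be 1."""
    return [{j + 1 for j, (a, b) in enumerate(zip(A, B)) if a != b}
            for A in T for B in F]

cons = covering_constraints(T, F)
print(cons[0])   # positions where A(1) and B(1) differ

def is_support_set(S, cons):
    """S is a support set iff it hits every constraint."""
    return all(S & c for c in cons)

print(is_support_set({5, 8}, cons), is_support_set({5}, cons))
```

The first constraint comes out as {1, 2, 3, 4, 5, 6}, matching the first inequality above, and S1 = {x5, x8} and S2 = {x6, x7} both hit all twelve constraints.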
T = {Qi | i = 1, 2, . . . , m},
F = {(0, 0, · · · , 0)},
where Qi denotes the i-th row of the 0–1 matrix Q. It is easy to see that the
formulation (12.7) for MIN-SUPPORT(FALL ) is exactly the same as the original
instance of SET COVER. This shows that SET COVER is reducible to MIN-
SUPPORT(FALL ), and proves the theorem.
ϕ1 = x5 x8 ∨ x̄5 x̄8 ϕ71 = x̄1 x̄5 ∨ x3 x̄7 ∨ x1 x5 x̄7 ϕ81 = x̄1 x̄5 ∨ x̄4 x̄7 ∨ x1 x4 x5
ϕ2 = x6 x7 ∨ x̄6 x̄7 ϕ72 = x̄1 x̄5 ∨ x3 x̄7 ∨ x1 x̄3 x5 ϕ82 = x̄1 x̄5 ∨ x̄4 x̄7 ∨ x1 x5 x̄7
ϕ31 = x̄1 x2 ∨ x2 x5 ϕ73 = x3 x̄7 ∨ x̄3 x7 ∨ x1 x5 x̄7 ϕ83 = x4 x7 ∨ x̄4 x̄7 ∨ x1 x4 x5
ϕ32 = x̄1 x̄5 ∨ x2 x5 ϕ74 = x3 x̄7 ∨ x̄3 x7 ∨ x1 x̄3 x5 ϕ84 = x4 x7 ∨ x̄4 x̄7 ∨ x1 x5 x̄7
ϕ4 = x̄1 x2 ∨ x2 x̄6 ϕ75 = x3 x̄7 ∨ x̄5 x7 ∨ x1 x5 x̄7 ϕ85 = x̄4 x̄7 ∨ x̄5 x7 ∨ x1 x4 x5
ϕ51 = x2 x5 ∨ x2 x7 ϕ76 = x3 x̄7 ∨ x̄5 x7 ∨ x1 x̄3 x5 ϕ86 = x̄4 x̄7 ∨ x̄5 x7 ∨ x1 x5 x̄7
ϕ52 = x2 x5 ∨ x̄5 x7
ϕ61 = x2 x̄6 ∨ x2 x̄8
ϕ62 = x2 x̄8 ∨ x̄6 x8
Proof. For every A ∈ T , there is a term Ci(A) , with i(A) ∈ {1, 2, . . . , m}, such that
Ci(A) (A) = 1. Since Ci(A) (B) = 0 for all B ∈ F , we see that Ci(A) is a pattern of
(T , F ). Now, let S = {i(A) | A ∈ T }. Then, ⋁_{i∈S} Ci is a theory of (T , F ) and, since
f itself is not a theory, it must be the case that S is a proper subset of {1, 2, . . . , m}.
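The property used in this proof — a pattern is a term that covers at least one point of T and no point of F — is easy to test directly. A small Python sketch (ours), on the pdBf of Example 12.1:

```python
T = ["01010110", "11011001", "01101001"]
F = ["10101010", "00011100", "11010101", "00101010"]

def covers(term, p):
    """term: {variable index (1-based): required value 0 or 1};
    the term covers p when every listed literal is satisfied."""
    return all(int(p[j - 1]) == v for j, v in term.items())

def is_pattern(term, T, F):
    """A pattern covers at least one positive example and no negative one."""
    return (any(covers(term, A) for A in T)
            and not any(covers(term, B) for B in F))

print(is_pattern({1: 0, 2: 1}, T, F))  # x̄1 x2: a pattern
print(is_pattern({2: 1}, T, F))        # x2 alone covers B(3): not a pattern
```

Dropping a literal from a pattern and re-testing in this way is the natural route to enumerating prime patterns.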
Example 12.4. As all minimal support sets were listed in Example 12.2 for the
pdBf (T , F ) of Example 12.1, we are now able to obtain all basic theories by
enumerating all prime patterns for each support set Sk . Table 12.2 gives all basic
theories thus obtained, where ϕki indicates the i-th basic theory generated from a
minimal support set Sk . (The superscript i is not indicated if Sk has only one basic
theory.)
Note that the pdBf (T , F ) has |B8 \ (T ∪ F )| = 2^8 − 7 = 249 unspecified points,
implying that it has 2^249 extensions in FALL . Table 12.2 shows that only 21 of these
extensions are basic theories.
x1 x2 x3 x4 x5 x6 x7 x8 x9
0 1 0 1 0 1 1 0 1
1 1 0 1 1 0 0 1 1
0 1 1 0 1 0 0 1 1
1 0 1 0 1 0 1 0 0
0 0 0 1 1 1 0 0 0
1 1 0 1 0 1 0 1 0
0 0 1 0 1 0 1 0 0
and x̄5 x8 is a prime copattern that covers B (3) . Therefore ϕ = x5 x̄8 ∨ x̄5 x8 is a prime
cotheory. It is easy to see that this extension ϕ is also a basic cotheory.
In concluding this subsection, we briefly comment upon the history of the fun-
damental concepts of patterns and theories. In the context of LAD, the definitions
of patterns and theories were formulated in Crama, Hammer, and Ibaraki [233], and
their properties have been studied in several subsequent papers; see, for example,
Boros et al. [115]. However, as patterns and prime patterns for pdBfs are natural
generalizations of implicants and prime implicants for Boolean functions, similar
concepts can be found in early references such as Mendelson [680]; Prather [755];
and Roth [793, 879]. For example, in [755, 793], patterns are discussed under the
name of “basic cells,” and prime patterns under the name of “maximal basic cells.”
The concept of theories is also introduced in these references.
There also exists an interesting relation between patterns, as defined in this
section, and association rules, which are a basic concept used in data mining. In
data mining, a data set is usually given as a “flat” list of data points without “output
bit,” rather than as a pair of sets consisting of positive and negative examples. The
data set of Example 12.1, for instance, would be given as Table 12.3, after adding
the attribute x9 that indicates the outcome of each data point.
A property that holds among such data points is called an association rule if it
can be described as an implication of the form: “if x2 = 0 and x4 = 1, then x9 = 1
holds." More formally, an association rule takes the form "if xj1 = aj1 , xj2 = aj2 , . . . ,
xjk = ajk , then xl = al ," where aj1 , aj2 , . . . , ajk and al are each either 0 or 1. It is further required
that at least one data point should satisfy the rule, that no data point should violate
the rule, and that no shorter rule (namely, consisting of a subset of {xj1 = aj1 ,
xj2 = aj2 , . . . , xjk = ajk } in its left-hand side) should exist.
It is not difficult to see that, if we give a special role to the conclusion variable
xl , and if we define the set of data points satisfying xl = 1 (respectively, xl = 0)
as T (respectively, F ), then the above association rule actually asserts that
⋀_{i∈P} xji ⋀_{i∈N} x̄ji is either a prime pattern (in case al = 1) or a prime copattern (in
case al = 0) of the pdBf (T , F ), where P = {i | aji = 1} and N = {i | aji = 0}.
In the discussion of association rules in data mining, data sets are usually sup-
posed to contain errors and missing parts. To cope with such situations, the concepts
of support and confidence are introduced as essential constituents of association
rules. We do not go into details, but refer to Agrawal, Imielinski, and Swami [8];
Fayyad et al. [321]; and Mannila, Toivonen, and Verkamo [665] for further discus-
sion. In this chapter, we shall deal with errors and missing bits of data in Sections
12.4 and 12.5, respectively, from a slightly different viewpoint.
Remark 12.1. There is some confusion in the use of the terms “theory” and
“cotheory” in application areas. In learning theory and data-mining, “theory” is
often used as a synonym of “extension.” But, here we use it to mean a special
extension with certain properties. A cotheory is sometimes referred to as a "negative
theory” to emphasize its role with respect to the set of negative examples F (in
this case, the theory itself is called “positive theory”). We do not follow these
conventions, so as to avoid a potential confusion with the concepts of positive and
negative functions.
Therefore, the functions α and β, when represented by the disjunctions of all prime
patterns and prime copatterns, can be written as:
α = x1 , β = x̄1 ∨ x̄2 x3 .
This implies that
T (α) = {(1, 0, 0), (1, 1, 1), (1, 1, 0), (1, 0, 1)},
T (β) = {(0, 0, 0), (0, 0, 1), (0, 1, 1), (0, 1, 0), (1, 0, 1)}.
Note that the point (1, 0, 1) belongs to both T (α) and T (β).
For a pdBf (T , F ), let us now define
T ∗ = F (β) = {X ∈ Bn | β(T ,F ) (X) = 0}, (12.10)
Proof. We consider only the membership in T ∗ , since the other case is similar. Let
X ∈ Bn be a point not in T ∪ F . Then X ∉ T ∗ if and only if there is a copattern
t of (T , F ) satisfying t(X) = 1. Let this t cover B ∈ F , and let t(X,B) be the term
consisting of all the literals in L(X) ∩ L(B). By definition, t(X,B) ≤ t holds and
t(X,B) is also a copattern of (T , F ). This argument implies that the condition X ∉ T ∗
holds if and only if t(X,B) is a copattern for some B ∈ F . This test can be conducted
in time polynomial in the input length n(|T | + |F |).
Theorem 12.6. If a pdBf has an extension, then it has at least one bi-theory.
For each X ∈ T (f ), let tX be the term consisting of all the literals in L(X)∩L(A)
for some A ∈ T satisfying d(X, A) = d(X, T ). Then define
ϕ = ⋁_{X∈T (f )} tX .
Similarly, for each Y ∈ F (f ), let tY be the term consisting of all the literals in
L(Y ) ∩ L(B) for some B ∈ F satisfying d(Y , B) = d(Y , F ), and define
ψ = ⋁_{Y ∈F (f )} tY .
Procedure Tree-DNF
For each leaf node with value 1, construct the term corresponding to the path from the root to the
leaf node, and take the disjunction of all such terms.
Each leaf node of a decision tree carries the function value 0 or 1 corresponding
to the assignment defined by the unique path from the root (the top node numbered 0)
to the leaf node under consideration. Figure 12.2
shows an example of a decision tree, where intermediate nodes are drawn as circles
and leaves are drawn as squares. For example, the rightmost bottom node (with
assignment 1) indicates that the function value for the assignment x2 = 1, x1 = 1,
and x5 = 1 (along the rightmost path) is 1. To know the function value for a given
data point A = (0, 1, 0, 1, 0, 1, 1, 0), for example, we start from the root and follow
the branch x2 = 1 (since a2 = 1) to the intermediate node 1. Then from node 1 we
follow the branch x1 = 0 (since a1 = 0) to arrive at a leaf node with value 1. This
tells us that f (A) = 1 for the function f represented by this decision tree.
Given a decision tree representing a Boolean function f , the above explanation
entails that a DNF of f can be constructed by the procedure in Figure 12.3.
In this procedure, the term corresponding to a path is defined by including the
literal xj (respectively, x̄j ) if the assignment xj = 1 (respectively, xj = 0) occurs
along the path.
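Procedure Tree-DNF is a direct tree traversal, and can be sketched as follows. The nested-tuple encoding of trees is our own, and the tree shown reproduces Figure 12.2 under the assumption that the 0-branch of node 2 leads to a 0-leaf.

```python
def tree_to_dnf(node, path=()):
    """Return one term per 1-leaf: each term maps a branching variable to
    the value assigned on the path (1 for literal xj, 0 for literal x̄j)."""
    if isinstance(node, int):                 # leaf carrying value 0 or 1
        return [dict(path)] if node == 1 else []
    var, low, high = node                     # internal node branching on var
    return (tree_to_dnf(low, path + ((var, 0),)) +
            tree_to_dnf(high, path + ((var, 1),)))

# The decision tree of Figure 12.2: the root tests x2, node 1 tests x1,
# node 2 tests x5 (its 0-branch is assumed to be a 0-leaf).
figure_12_2 = ("x2", 0, ("x1", 1, ("x5", 0, 1)))
```

Applied to this tree, the procedure returns the two terms of ϕ = x̄1 x2 ∨ x1 x2 x5 from Example 12.7.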
Example 12.7. As the decision tree in Figure 12.2 has two leaf nodes with value 1,
the following DNF is obtained by the procedure Tree-DNF.
ϕ = x̄1 x2 ∨ x1 x2 x5 .
Let us now consider the problem of constructing a decision tree that represents
an extension of a given pdBf (T , F ). Similarly to the case of DNFs, there are
many such decision trees, and it is desirable to obtain a “simple” one. The sim-
plicity of decision trees may be measured by their number of nodes, or by their
height. Exact minimization is, however, intractable; for example, it is known that
finding a decision tree with the minimum number of nodes is NP-hard (Hyafil
and Rivest [514]). Therefore, various heuristic algorithms have been proposed to
obtain approximately minimum decision trees. When applied to a pdBf (T , F ),
most of these heuristics fit in the generic scheme described in Figure 12.4, where
we assume that T ∩ F = ∅.
The procedure pdBf Decision Tree yields a decision tree D = D(T , F ) for
every pdBf (T , F ). The tree D can be viewed as representing a Boolean function
where p0 = |Tk0 |, q0 = |Fk0 |, p1 = |Tk1 |, and q1 = |Fk1 |; that is,

E(xj ) = ((p0 + q0 )/(p + q)) I (p0 , q0 ) + ((p1 + q1 )/(p + q)) I (p1 , q1 ),

where p = p0 + p1 and q = q0 + q1 . This means that the amount of information
gained by the decomposition based on the selection of xj is

gain(xj ) = I (p, q) − E(xj ). (12.12)
In ID3, the variable xj that maximizes gain(xj ) among all the remaining unfixed
variables is selected as the branching variable at node k.
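In code, with binary logarithms (the standard choice in ID3; this is a sketch of the usual statement of the criterion, with the convention I (p, q) = 0 when one class is empty):

```python
from math import log2

def info(p, q):
    """Entropy I(p, q) of a node holding p positive and q negative points."""
    total = p + q
    if total == 0 or p == 0 or q == 0:
        return 0.0
    return -(p / total) * log2(p / total) - (q / total) * log2(q / total)

def expected_info(p0, q0, p1, q1):
    """E(xj): entropy remaining after branching on xj, weighted by the
    fraction of points sent to each child."""
    total = p0 + q0 + p1 + q1
    return ((p0 + q0) / total) * info(p0, q0) + ((p1 + q1) / total) * info(p1, q1)

def gain(p, q, p0, q0, p1, q1):
    """gain(xj) = I(p, q) - E(xj); ID3 branches on the maximizing variable."""
    return info(p, q) - expected_info(p0, q0, p1, q1)
```

For the split on x2 in Example 12.8 below (p0 = 0, q0 = 3, p1 = 3, q1 = 1), expected_info returns E(x2) ≈ 0.46, in agreement with the example.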
Example 12.8. Let us apply the procedure ID3 to the pdBf of Example 12.1. We
first apply rule 3 to the original pdBf (T , F ). In determining the branching variable
that maximizes gain(xj ), we can choose the variable that minimizes E(xj ), since
I (p, q) is constant for all xj in (12.12). In order to illustrate the computation of
E(x1 ), observe that the following pdBfs (T ′ , F ′ ) and (T ′′ , F ′′ ) result when we fix
x1 to 0 and to 1, respectively:

T ′ = {A(1) , A(3) },
F ′ = {B (2) , B (4) },
T ′′ = {A(2) },
F ′′ = {B (1) , B (3) }.

Thus we have p0 = |T ′ | = 2, q0 = |F ′ | = 2, p1 = |T ′′ | = 1, q1 = |F ′′ | = 2, and
hence,
E(x1 ) = (4/7)(−(2/4) log(2/4) − (2/4) log(2/4)) + (3/7)(−(1/3) log(1/3) − (2/3) log(2/3)) = 0.77.
Similarly, we obtain
E(x2 ) = 0.46,
E(x3 ) = E(x4 ) = E(x6 ) = E(x7 ) = E(x8 ) = 0.77,
E(x5 ) = 0.98.
Therefore, at the root, x2 minimizes E(xj ) and is chosen as the branching variable.
Now the two pdBfs (T0 , F0 ) and (T1 , F1 ) that result by fixing x2 to 0 and to 1,
respectively, are given by
T0 = ∅,
F0 = {B (1) , B (2) , B (4) },
T1 = {A(1) , A(2) , A(3) },
F1 = {B (3) }.
As the pdBf (T0 , F0 ) corresponding to x2 = 0 satisfies T0 = ∅, we obtain a leaf node
with value 0 by rule 1 of the branching step in procedure pdBf Decision Tree.
For the pdBf (T1 , F1 ) corresponding to x2 = 1, we again apply rule 3 to select a
branching variable from among x1 , x3 , x4 , . . . , x8 ; in this case, x1 is selected.
Repeating this procedure, we eventually obtain the decision tree of Figure 12.2,
in which the pdBf (T2 , F2 ) of node 2 does not depend on x1 , x2 , and is given by
T2 = {A(2) }
F2 = {B (3) }.
Other types of selection rules for branching variables have also been proposed.
The successful software C4.5 and its successor C5.0 by Quinlan [771, 772], for
example, use a rule based on the gain-ratio in place of the above gain crite-
rion. Another important addition included in these algorithms is the operation of
“pruning,” which is applied after a decision tree is constructed. This operation is
performed on each intermediate node in order to test whether it is more beneficial
12.3 Extensions within given function classes 531
to retain the node or to prune it into a leaf node, according to some statistical
criterion. The resulting decision tree usually features a more robust behavior on
new input samples.
Before closing this section, we briefly compare two representations of exten-
sions of pdBfs, by DNFs and by decision trees, respectively. Generally speaking,
if an extension f has a small decision tree, it tends to have a small DNF, and vice
versa, since both representations are closely related as explained earlier in this
section. A decision tree is visually appealing, while a DNF may be more conve-
nient for the purpose of understanding the logical content of f . For certain function
classes, such as F+ , FHORN , and Fk , it is easier to check whether a function belongs
to the class when it is represented by a DNF.
The size of a support set is also positively correlated with that of a decision
tree, but not always exactly. Recall that a support set is a set of variables which
is required to represent an extension. On the other hand, heuristic minimization
of a decision tree, such as performed by ID3, is based on choosing an appropriate
branching variable at each node, independently of the choices at other nodes. As a
result of this difference, minimization of a support set does not generally coincide
with minimization of a decision tree.
It is clear that fmin^+ is a positive function. Furthermore, T (fmin^+ ) ∩ F = ∅ holds by
the assumption on T and F . Therefore, fmin^+ is a positive extension of (T , F ).
Finally, the condition in the theorem statement can be checked by directly
comparing all pairs (A, B) with A ∈ T and B ∈ F . This can be done in O(n|T ||F |)
time, which is polynomial in the input length n(|T | + |F |).
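Both the pairwise test and the minimum positive extension are immediate to implement; a minimal sketch, with points encoded as 0/1 tuples:

```python
def leq(A, B):
    """Componentwise comparison A <= B of two 0/1 tuples."""
    return all(a <= b for a, b in zip(A, B))

def has_positive_extension(T, F):
    """(T, F) has a positive extension iff no A in T is dominated by some
    B in F; the scan takes O(n|T||F|) time."""
    return not any(leq(A, B) for A in T for B in F)

def f_min_plus(T):
    """The minimum positive extension: true exactly on the points that
    dominate some A in T."""
    return lambda X: any(leq(A, X) for A in T)
```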
The positive extension fmin^+ defined in the proof minimizes the set T (f ) among
all positive extensions f of the pdBf (T , F ). It is not difficult to show that fmin^+ is
in fact the unique minimum positive extension of (T , F ). We can also define fmax^+
dually:

F (fmax^+ ) = {C ∈ Bn | C ≤ B holds for some B ∈ F },
T (fmax^+ ) = Bn \ F (fmax^+ ).

This function fmax^+ is the unique extension that maximizes the set T (f ) among all
positive extensions f . Any positive extension f of a pdBf (T , F ) satisfies

fmin^+ ≤ f ≤ fmax^+ ,

and all positive extensions form a lattice under the operations ∨ and ∧ between
functions. This is a sublattice of the lattice of all extensions of a pdBf (T , F )
introduced in Section 12.2.1.
Assume now that a pdBf (T , F ) has positive extensions. We say that a set
S + ⊆ {1, 2, . . . , n} is a positive support set for (T , F ) if (T |S + , F |S + ) has a positive
extension, and we define
J+ (A, B) = {j ∈ {1, 2, . . . , n} | aj = 1, bj = 0}
(compare with J(A, B) of (12.5) in Section 12.2.2). Then the problem of finding
a minimum positive support set can be formulated as the following set covering
problem:
minimize ∑_{j=1}^{n} yj

subject to ∑_{j∈J+(A,B)} yj ≥ 1, A ∈ T , B ∈ F ,

yj + zj ≤ 1, j ∈ {1, 2, . . . , n}, (12.14)

yj ∈ {0, 1}, zj ∈ {0, 1}, j ∈ {1, 2, . . . , n}. (12.15)
Furthermore, since yj = 1 or zj = 1 implies that j is used in the resulting support
set of f ∈ FUNATE , problem MIN-SUPPORT(FUNATE ) can be formulated as the
0–1 programming problem obtained by considering the objective function
minimize ∑_{j=1}^{n} yj + ∑_{j=1}^{n} zj
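Set covering is NP-hard in general, so formulations like the one above are often attacked heuristically in practice. The following is our illustration of the classical greedy heuristic for the positive case (the unate variant with the zj variables is handled analogously); it returns a positive support set, not necessarily a minimum one.

```python
def j_plus(A, B):
    """J+(A, B): indices where A has a 1 and B has a 0."""
    return {j for j, (a, b) in enumerate(zip(A, B)) if a == 1 and b == 0}

def greedy_positive_support(T, F):
    """Greedy covering: repeatedly pick the index that covers the most
    still-uncovered pairs (A, B).  Returns None if some J+(A, B) is empty,
    in which case (T, F) has no positive extension at all."""
    uncovered = [j_plus(A, B) for A in T for B in F]
    if any(not J for J in uncovered):
        return None
    S = set()
    while uncovered:
        n = max(max(J) for J in uncovered) + 1
        j = max(range(n), key=lambda j: sum(j in J for J in uncovered))
        S.add(j)
        uncovered = [J for J in uncovered if j not in J]
    return S
```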
DNF Equation
Instance: A DNF expression φ(X) on the variables X = (x1 , x2 , . . . , xn ).
Question: Is the equation φ(X) = 0 consistent?
(see Chapter 2 and Appendix B). Given an instance φ(X) of DNF Equation,
we construct a pdBf (T , F ) such that the corresponding 0–1 problem (12.13)–
(12.14) has a feasible solution if and only if the equation φ(X) = 0 is consistent.
For this purpose, let t1 , t2 , . . . , tm denote the terms of the DNF φ, and let Ci ⊆
{x1 , x̄1 , . . . , xn , x̄n } be the set of literals that appear in ti . We define
For the pdBf (T , F ), we obtain the following set of inequalities from (12.13)–
(12.14):
From (ii) and (iii), we see that yn+i = 1 and zn+i = 0 must hold for all i ∈
{1, 2, . . . , m}. This implies that the inequalities in (iv) are all redundant since any
inequality in (iv) contains at least one variable yn+i (i = 1, 2, . . . , m) in its left-
hand side. Therefore, our problem EXTENSION(FUNATE ) becomes equivalent to
deciding whether the constraints (i) and (iii) have a feasible 0–1 solution. It is
now obvious that such a solution (Y , Z) exists if and only if the original Boolean
equation has a solution X defined by xj = 1 if yj = 1, xj = 0 if zj = 1, and xj
arbitrary if yj = zj = 0.
and generate all terms t consisting of at most k literals chosen from the n literals in
tA∗ . If at least one of these terms t satisfies T (t) ∩ F = ∅, then t is the required pattern;
otherwise, A is not covered by any pattern of degree k. A naive implementation of
this procedure requires O(n^k × n|F |) time, which is polynomial when k is viewed
as a constant.
Thus, we obtain:
Theorem 12.11. The problem EXTENSION(Fk ) can be solved in polynomial
time when k is fixed. Similarly, EXTENSION(Fk+ ) can be solved in polynomial
time for every fixed k.
Proof. The first part of the theorem follows from the above discussion. The
statement about positive extensions can be shown similarly, by starting from
tA^+ = ⋀_{j : aj =1} xj instead of tA∗ .
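The enumeration in this proof is easy to sketch directly; terms are encoded below as sets of (index, value) literals, an encoding of our own choosing.

```python
from itertools import combinations

def covers(term, X):
    """term: set of (index, value) literals; X: a 0/1 tuple."""
    return all(X[j] == v for j, v in term)

def pattern_of_degree_k(A, F, k):
    """Search for a pattern of degree at most k covering A: pick up to k
    literals from the minterm of A and keep the first term excluding every
    point of F.  O(n^k * n|F|) for each fixed k."""
    literals = list(enumerate(A))           # the literals of the minterm of A
    for r in range(k + 1):
        for term in combinations(literals, r):
            if not any(covers(term, B) for B in F):
                return set(term)
    return None                             # no pattern of degree k covers A
```

For A = (1, 1, 0) and F = {(1, 0, 0), (0, 1, 0)}, no single literal separates A from F, but the degree-2 pattern x1 x2 does.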
Table 12.4.

           x1 x2 x3 x4 x5 x6 x7 x8 x9
    A(1) =  1  1  1  1  0  0  1  0  0
    A(2) =  1  1  1  0  1  0  1  0  0
    A(3) =  1  1  1  0  0  1  0  1  0
T   A(4) =  0  0  1  0  0  0  1  0  0
    A(5) =  1  0  0  0  0  0  1  0  0
    A(6) =  0  1  1  0  0  0  0  0  1
    A(7) =  1  1  0  0  0  0  0  0  1
    A(8) =  1  1  1  1  1  1  0  0  0

    B(1) =  1  1  1  1  0  0  1  1  0
F   B(2) =  1  1  1  0  1  0  1  1  1
    B(3) =  1  1  1  0  0  1  1  1  0
    B(4) =  1  1  1  0  0  0  1  0  1
For A ∈ Bn , define
F≥A = {B ∈ F | B ≥ A}.
For every F ′ ⊆ F , the condition ⋀_{B∈F ′} B = A implies that B ≥ A for all B ∈ F ′
(i.e., F ′ ⊆ F≥A ), and hence, that ⋀_{B∈F≥A} B = A also holds. Therefore, the condition
F ∧ ∩ T = ∅ is equivalent to

⋀_{B∈F≥A} B ≠ A for every A ∈ T with F≥A ≠ ∅, (12.16)

which can be checked in O(n|T ||F |) time by scanning all B ∈ F for each A ∈ T .
Example 12.9. Consider the pdBf (T , F ) defined in Table 12.4. It is easily checked
that
F≥A(1) = {B (1) },
F≥A(2) = {B (2) },
F≥A(3) = {B (3) },
F≥A(4) = F≥A(5) = {B (1) , B (2) , B (3) , B (4) },
F≥A(6) = F≥A(7) = {B (2) , B (4) },
F≥A(8) = ∅,
and condition (12.16) holds for all A(i) ∈ T . Therefore, this pdBf has a Horn exten-
sion by Theorem 12.12.
In general, a pdBf (T , F ) may have many Horn extensions. Let fmax^HORN denote
the Horn extension that maximizes T (f ) among all Horn extensions f . Then, it
follows from the discussion before Theorem 12.12 that fmax^HORN is given by

F (fmax^HORN ) = F ∧ ,
T (fmax^HORN ) = Bn \ F ∧ ,
and it is unique. On the other hand, there are generally many minimal Horn
extensions, that is, Horn extensions f with minimal true set T (f ).
As observed in Chapter 6, DNFs of Horn functions have numerous special
properties. Some of them can be generalized to Horn extensions of pdBfs. In
particular, there are pdBfs (T , F ) for which the number of prime implicants of
fmax^HORN is exponential in the input length n(|T | + |F |). There are algorithms for
generating all prime implicants of fmax^HORN , but none of them runs in polynomial time
in its input and output length (Kautz, Kearns, and Selman [554]; Khardon [564]).
It is known that this problem has a polynomial time algorithm if and only if there
is a polynomial time algorithm (in its input and output length) to generate all
prime implicants of the dual of a positive function (Kavvadias, Papadimitriou, and
Sideri [558]). As discussed in Section 4.4, the complexity of the latter problem is
still open. Observe that just finding any Horn DNF of fmax^HORN is not easier than
finding all its prime implicants, since, from such a DNF, all prime implicants can
be generated in polynomial total time (see Section 6.5). The complexity of this
problem and other related problems, such as finding an irredundant DNF of fmax^HORN
and finding a shortest DNF of fmax^HORN , still remain to be studied.
On the other hand, DNFs of minimal (in the sense of T (f )) Horn extensions
can be described in a canonical form, each of which is of polynomial length in the
input length n(|T | + |F |). To see this, let us introduce some notations. For a pdBf
(T , F ) with T ∩ F = ∅, and for each A ∈ T ,
I (A) = {j ∈ {1, 2, . . . , n} | aj = 0 and bj = 1 for all B ∈ F≥A },

R(A) = {x1 x2 · · · xn }                          if A = (1, 1, . . . , 1),
R(A) = {(⋀_{j : aj =1} xj ) x̄l | l ∈ I (A)}      if A ≠ (1, 1, . . . , 1) and I (A) ≠ ∅,
R(A) = ∅                                         if A ≠ (1, 1, . . . , 1) and I (A) = ∅.

Note that R(A) is empty only if A ≠ (1, 1, . . . , 1) and I (A) = ∅, in which case
condition (12.16) is violated, so that (T , F ) has no Horn extension.
Now, when R(A) is nonempty for all A ∈ T , we define a canonical Horn DNF
for the pdBf (T , F ) to be any DNF of the form
ϕ = ⋁_{A∈T} tA , where tA ∈ R(A).
In words, a canonical Horn DNF is obtained by choosing one term tA from each
set R(A) and taking the disjunction of these terms over all A ∈ T . Note that
each term tA ∈ R(A) satisfies tA (A) = 1 and tA (B) = 0 for all B ∈ F . Therefore,
every canonical Horn DNF represents a Horn extension of (T , F ), and its length
is O(n|T |).
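The sets F≥A and I (A), the condition of Theorem 12.12, and the sizes |R(A)| can all be computed by direct scanning. The sketch below does so; run on our transcription of Table 12.4, it confirms the count of 96 canonical Horn DNFs claimed in Example 12.10.

```python
def horn_analysis(T, F):
    """For each A in T: compute F>=A, test the Horn-extension condition
    (the meet of F>=A must differ from A whenever F>=A is nonempty), and
    report |R(A)|, the number of available canonical terms for A."""
    n = len(T[0])
    has_horn, sizes = True, []
    for A in T:
        F_geq = [B for B in F if all(b >= a for a, b in zip(A, B))]
        if F_geq and tuple(min(c) for c in zip(*F_geq)) == A:
            has_horn = False            # meet of F>=A equals A: no Horn extension
        I_A = [j for j in range(n)
               if A[j] == 0 and all(B[j] == 1 for B in F_geq)]
        sizes.append(1 if A == (1,) * n else len(I_A))
    return has_horn, sizes
```

On Table 12.4 this yields |R(A(1))|, . . . , |R(A(8))| = 1, 2, 1, 2, 2, 2, 2, 3, whose product is the 96 canonical Horn DNFs of Example 12.10.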
Example 12.10. Let us obtain I (A) and R(A) for all A ∈ T of Example 12.9.
where we employ a shorthand notation for terms; for example, 123478̄ stands for
x1 x2 x3 x4 x7 x̄8 , and so on. Consequently, there are 1×2×1×2×2×2×2×3 = 96
canonical Horn DNFs, among which we find, for example,
It can be proved that every minimal Horn extension has a canonical Horn DNF,
but the converse is not always true; we refer the reader to Makino, Hatanaka,
and Ibaraki [650] for details. In the above Example 12.10, ϕ (2) represents a
minimal Horn extension, but ϕ (1) does not. It can be checked in polynomial
time whether a canonical Horn DNF represents a minimal Horn extension or
not [650]. Further properties of Horn extensions can be found in Ibaraki, Kogan,
and Makino [518].
Theorem 12.13. The pdBf (T , F ) has a threshold extension if and only if the
system of inequalities
∑_{j=1}^{n} wj xj ≤ t for all X ∈ F , (12.17)

∑_{j=1}^{n} wj xj ≥ t + 1 for all X ∈ T , (12.18)

has a real solution w1 , w2 , . . . , wn , t.
Note however that, even when there exists a threshold extension, the solution of
the system (12.17)–(12.18) does not immediately produce a DNF of the extension,
but only a linear separating structure (w1 , w2 , . . . , wn , t). Also, the existence of a
threshold extension does not guarantee the existence of a threshold prime theory
or of a threshold theory defined over a minimum cardinality support set.
Example 12.11. Consider the pdBf given by
T = {(1, 0, 1, 1), (1, 1, 0, 0), (1, 1, 0, 1), (1, 1, 1, 0), (1, 1, 1, 1)},
F = {(0, 0, 0, 0), (0, 0, 0, 1), (0, 0, 1, 0), (0, 1, 0, 0), (0, 1, 0, 1),
(0, 1, 1, 0), (1, 0, 0, 0), (1, 0, 0, 1), (1, 0, 1, 0)}.
This pdBf has four extensions, namely,
ψ1 = x1 x2 ∨ x1 x3 x4 ,
ψ2 = x1 x2 ∨ x1 x3 x4 ∨ x2 x3 x4 ,
ψ3 = x1 x2 ∨ x1 x3 x4 ∨ x̄2 x3 x4 ,
ψ4 = x1 x2 ∨ x3 x4 .
Of these four extensions, only ψ1 and ψ2 are threshold. The unique prime theory
of (T , F ) is ψ4 , which is not threshold.
Similarly, the pdBf of Example 12.1 has several threshold extensions, but the
extensions defined over the minimum cardinality support sets S1 = {5, 8} and
S2 = {6, 7} (namely, ϕ1 and ϕ2 in Table 12.2) are not threshold.
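System (12.17)–(12.18) is a linear feasibility problem, normally handed to an LP solver; for a toy instance like Example 12.11, a brute-force search over small integer weights suffices. The bound wmax = 3 below is an assumption of ours that happens to work for this example.

```python
from itertools import product

def threshold_extension(T, F, wmax=3):
    """Search integer weights w and a threshold t with w.X <= t on F (12.17)
    and w.X >= t + 1 on T (12.18).  Returns (w, t), or None on failure."""
    n = len(T[0])
    for w in product(range(wmax + 1), repeat=n):
        t = max(sum(wi * xi for wi, xi in zip(w, B)) for B in F)
        if min(sum(wi * xi for wi, xi in zip(w, A)) for A in T) >= t + 1:
            return w, t
    return None
```

On the pdBf of Example 12.11 the search succeeds, confirming that a threshold extension exists.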
For a pdBf (T , F ), the structure graph G(T ,F ) = (V , E) is defined by

V = V0 ∪ V1 ,
E = EF ∪ ET ,

where

Vi = {X|Si | X ∈ T ∪ F }, i = 0, 1,
ET = {(A|S0 , A|S1 ) | A ∈ T },
EF = {(B|S0 , B|S1 ) | B ∈ F }.
When displaying the graph G(T ,F ) , we draw the edges in ET as solid lines, and the
edges in EF as broken lines.
Example 12.12. Consider the pdBf in Table 12.5 with S0 = {1, 2, 3} and S1 =
{4, 5, 6} (ignore the column h1 for the time being). The corresponding structure
graph is shown in Figure 12.5.
In view of Theorem 12.1, the pdBf (T , F ) has an extension f ∈ FF0 (S0 ,F1 (S1 )) if
and only if there exists a function h1 : V1 → B such that T ′ ∩ F ′ = ∅, where

T ′ = {(A|S0 , h1 (A|S1 )) | A ∈ T },
F ′ = {(B|S0 , h1 (B|S1 )) | B ∈ F }.
Table 12.5.

        S0    S1    h1
    T   100   101   1
        011   110   0
        011   010   1
        110   101   1
    F   100   110   0
        000   110   0
        000   010   1
Figure 12.5. The structure graph G(T ,F ) of the pdBf in Example 12.12.
Example 12.13. For the pdBf of Example 12.12, possible values of h1 (X) (X ∈ V1 )
are indicated in Table 12.5 and also beside the vertices in V1 , in Figure 12.5. It
is easy to see that these values h1 (X) satisfy the condition in Lemma 12.4, thus
implying that the pdBf of Example 12.12 has an extension in FF0 (S0 ,F1 (S1 )) .
In order to verify if there exists a function h1 satisfying the condition of Lemma
12.4, let us construct the auxiliary graph G∗(T ,F ) = (V ∗ , E ∗ ) as follows:
V ∗ = V1 ,
E ∗ = {(X1 , X1′ ) | there is a vertex X0 ∈ V0 in G(T ,F )
such that (X0 , X1 ) ∈ ET and (X0 , X1′ ) ∈ EF }.
Figure 12.6. The graph G∗(T ,F ) and its two-coloring for the pdBf in Example 12.12.
Theorem 12.14. The pdBf (T , F ) has an extension f ∈ FF0 (S0 ,F1 (S1 )) if and only
if G∗(T ,F ) is a bipartite graph. In particular, for given sets S0 and S1 , the problem
EXTENSION(FF0 (S0 ,F1 (S1 )) ) can be solved in polynomial time.
Proof. It is easy to see that there exists a function h1 as described in Lemma 12.4
if and only if each vertex of G∗(T ,F ) can be assigned one of two colors, either 0 or
1, so that no two adjacent vertices receive the same color. This condition means
that G∗(T ,F ) must be bipartite, and it can be checked in polynomial time.
Example 12.14. The auxiliary graph G∗(T ,F ) for the pdBf (T , F ) of Example 12.12
is displayed in Figure 12.6. The colors satisfying the above condition are indi-
cated beside the vertices. This construction illustrates how the h1 -values shown in
Figure 12.5 were obtained.
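The two-coloring argument in the proof of Theorem 12.14 translates directly into code. The sketch below builds G∗(T ,F ) and attempts a proper 2-coloring by breadth-first search; the 0-based index conventions for S0 and S1 are ours.

```python
from collections import deque

def decomposable_extension_exists(T, F, S0, S1):
    """Build the auxiliary graph G* on V1 and try to 2-color it.
    Returns a proper coloring h1 : V1 -> {0, 1}, or None if G* has an
    odd cycle (i.e., is not bipartite)."""
    proj = lambda X, S: tuple(X[i] for i in S)
    ET = {(proj(A, S0), proj(A, S1)) for A in T}
    EF = {(proj(B, S0), proj(B, S1)) for B in F}
    adj = {proj(X, S1): set() for X in list(T) + list(F)}
    for X0, X1 in ET:
        for Y0, Y1 in EF:
            if X0 == Y0:                # shared V0-endpoint: edge (X1, Y1) in G*
                adj[X1].add(Y1)
                adj[Y1].add(X1)
    color = {}
    for v in adj:                       # BFS 2-coloring, component by component
        if v in color:
            continue
        color[v] = 1
        queue = deque([v])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in color:
                    color[w] = 1 - color[u]
                    queue.append(w)
                elif color[w] == color[u]:
                    return None         # adjacent vertices forced to one color
    return color
```

On the pdBf of Example 12.12, G∗ consists of the single edge {(1,0,1), (1,1,0)}, so a proper coloring exists and yields an h1 as in Figure 12.5.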
We next turn to the class of positively decomposable functions FF+0 (S0 ,F1 (S1 )) . In
this case, we have to rely on Theorem 12.8 rather than on Theorem 12.1. Thus, let
us define the positive structure graph G+ (T ,F ) = (V0 ∪ V1 , EF ∪ ET ∪ H0 ∪ H1 ) for
a given pdBf (T , F ), by adding the following sets of directed arcs to the structure
graph G(T ,F ) = (V0 ∪ V1 , EF ∪ ET ):
Hi = {(X, X ′ ) | X, X ′ ∈ Vi and X ≤ X ′ }, i = 0, 1.
Example 12.15. For the pdBf (T , F ) of Table 12.6, assume that S0 = {1, 2} and
S1 = {3, 4, 5} are given (ignore the column h1 temporarily). The positive structure
graph G+ (T ,F ) is shown in Figure 12.7.
Table 12.6.

        S0   S1    h1
    T   11   011   0
        01   101   1
        01   110   1
    F   01   010   0
        00   101   1
        10   110   1
Figure 12.8. Illustration of a pair (A, B), A ∈ T ∗ and B ∈ F ∗ , such that A|S1 ≤ B|S1 .
Note that some vertices (e.g., those connected by the arcs in Hi ) may be contracted
when we consider the subgraph of Figure 12.8.
Necessity. Assume that there are points A ∈ T ∗ , B ∈ F ∗ , A′ ∈ T , B ′ ∈ F sat-
isfying the condition of Figure 12.8, for which A|S1 ≤ B|S1 , A|S0 ≤ B ′ |S0 and
A′ |S0 ≤ B|S0 hold. This means h1 (A|S1 ) = 1, because h1 (A|S1 ) = 0 implies
(A|S0 , h1 (A|S1 )) ≤ (B ′ |S0 , h1 (B ′ |S1 )), which contradicts the condition (ii) on h1
stated before this lemma. However, h1 (A|S1 ) = 1 implies h1 (B|S1 ) = 1 by con-
dition (i) on h1 , and hence, (A′ |S0 , h1 (A′ |S1 )) ≤ (B|S0 , h1 (B|S1 )), contradicting
condition (ii) on h1 .
Sufficiency. If the subgraph of Figure 12.8 is not contained in G+ (T ,F ) , then a
positive function h1 : V1 → {0, 1} can be defined as follows:
h1 (X) = 1 if some A ∈ T ∗ satisfies A|S1 ≤ X, and h1 (X) = 0 otherwise. (12.20)
It is straightforward to show that this function h1 satisfies the above conditions (i)
and (ii).
Example 12.16. It can be checked directly that the positive structure graph G+ (T ,F )
of Figure 12.7 for Example 12.15 does not contain the subgraph of Figure 12.8.
The values of h1 , indicated in Figure 12.7 beside the vertices of V1 , are determined
by (12.20). This assignment h1 satisfies conditions (i) and (ii), as easily seen from
Table 12.6, and we can conclude that the pdBf (T , F ) has a positive extension
f = g(X|S0 , h1 (X|S1 )) ∈ FF+0 (S0 ,F1 (S1 )) .
                                                    FALL    F+
F0 (S0 , F1 (S1 ))                                  P       P
F0 (F1 (S1 ), F2 (S2 ))                             P       P
F0 (S0 , F1 (S1 ), F2 (S2 ))                        NPC     P
F0 (F1 (S1 ), F2 (S2 ), F3 (S3 ))                   NPC     P
F0 (S0 , F1 (S1 ), . . . , Fk (Sk )), k ≥ 3         NPC     NPC
F0 (F1 (S1 ), F2 (S2 ), . . . , Fk (Sk )), k ≥ 4    NPC     NPC
Theorem 12.15. For given sets S0 and S1 , problem EXTENSION(FF+0 (S0 ,F1 (S1 )) )
can be solved in polynomial time.
Since the two prime implicants of f conflict in three literals, this function is 2-
convex. In other words, T (f ) consists of two clusters represented by the two prime
implicants, and any two points belonging to different clusters are at Hamming dis-
tance at least 3.
For a function f (which may not be k-convex), define the k-convex envelope
of f to be the smallest k-convex majorant of f . The k-convex envelope of f is
denoted by [f ]k . Thus, [f ]k ∈ Fk-CONV , and [f ]k ≤ g for all g ∈ Fk-CONV such
that f ≤ g.
Ekin, Hammer and Kogan [306] introduced the k-convex envelope and proved
that it always exists. In order to describe an algorithm to compute the k-convex
envelope, we define as follows the convex hull of two terms s and t. Let yj ,
j = 1, 2, . . . , n, denote (positive or negative) literals, and assume that s and t are
written as
s = ⋀_{j∈S1} yj ⋀_{j∈S2} yj ⋀_{j∈S3} yj ,
t = ⋀_{j∈S1} ȳj ⋀_{j∈S2} yj ⋀_{j∈S4} yj ,
where S1 denotes the set of indices of conflicting literals, S2 the set of indices of
common literals, and S3 and S4 (satisfying S3 ∩ S4 = ∅) the sets of indices of
literals that appear only in s and only in t, respectively. The convex hull [s, t] is defined as the
appear only in s and only in t, respectively. The convex hull [s, t] is defined as the
conjunction of the common literals in s and t:
[s, t] = ⋀_{j∈S2} yj .
This algorithm terminates in polynomial time in the length of the DNF ϕ0 , since
the number of terms decreases by one at each iteration.
Example 12.18. Let us compute the 2-convex envelope of the following function:
f = x1 x2 x3 x4 x5 ∨ x1 x2 x3 x4 x6 ∨ x̄1 x̄2 x̄3 x5 x6 ∨ x̄1 x̄2 x̄3 x4 x̄5 x6 ∨ x̄1 x̄2 x̄3 x̄4 x̄5 x̄6 .
12.4 Best-fit extensions of pdBfs containing errors 547
Taking the convex hull of the first two terms, which have no conflicting literal, we
obtain
[x1 x2 x3 x4 x5 , x1 x2 x3 x4 x6 ] = x1 x2 x3 x4 .
Similarly, from the third and fourth terms having one conflicting literal x5 , we
obtain
[x̄1 x̄2 x̄3 x5 x6 , x̄1 x̄2 x̄3 x4 x̄5 x6 ] = x̄1 x̄2 x̄3 x6 .
Finally, from this new term and the fifth term of f having one conflicting literal
x6 , we obtain
[x̄1 x̄2 x̄3 x6 , x̄1 x̄2 x̄3 x̄4 x̄5 x̄6 ] = x̄1 x̄2 x̄3 .
The resulting two terms conflict in three (= k + 1) literals, and thus we have
obtained the 2-convex envelope of f :
[f ]2 = x1 x2 x3 x4 ∨ x̄1 x̄2 x̄3 .
This is indeed a 2-convex function as already discussed in Example 12.17.
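The merging procedure behind Examples 12.17–12.18 can be sketched as follows; terms are encoded as dictionaries from variable index to literal sign (our encoding), and the loop mirrors "merge while some pair of terms conflicts in at most k literals."

```python
def conflicts(s, t):
    """Number of variables on which terms s and t carry opposite literals."""
    return sum(1 for v in s if v in t and s[v] != t[v])

def hull(s, t):
    """Convex hull [s, t]: conjunction of the common literals of s and t."""
    return {v: s[v] for v in s if t.get(v) == s[v]}

def k_convex_envelope(terms, k):
    """Repeatedly replace two terms conflicting in at most k literals by
    their convex hull; each merge reduces the term count by one, so the
    loop terminates with terms pairwise conflicting in >= k + 1 literals."""
    terms = [dict(t) for t in terms]
    merged = True
    while merged:
        merged = False
        for i in range(len(terms)):
            for j in range(i + 1, len(terms)):
                if conflicts(terms[i], terms[j]) <= k:
                    h = hull(terms[i], terms[j])
                    terms = [t for idx, t in enumerate(terms) if idx not in (i, j)]
                    terms.append(h)
                    merged = True
                    break
            if merged:
                break
    return terms
```

Applied to the five terms of Example 12.18 with k = 2, this reproduces [f ]2 = x1 x2 x3 x4 ∨ x̄1 x̄2 x̄3.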
Now let (T , F ) be a pdBf, and suppose we want to know whether (T , F ) admits
a k-convex extension. Let ϕT be the DNF consisting of all the minterms associated
with the true points in T (ϕT is the minterm expression of fmin ; see (12.3)). Then
from the preceding argument, the following theorem easily follows:
Theorem 12.16. A pdBf (T , F ) has an extension f ∈ Fk-CONV if and only if the k-
envelope of ϕT satisfies [ϕT ]k (B) = 0 for all B ∈ F . This condition can be checked
in polynomial time.
Proof. Suppose that g is a k-convex extension of (T , F ). Since ϕT ≤ g, the
definition of the k-convex envelope implies [ϕT ]k ≤ g. Now, for all B ∈ F , g(B) =
0, and hence, [ϕT ]k (B) = 0.
The converse implication and the complexity statement are straightforward.
Remark 12.2. The problem EXTENSION(C) has also been extensively studied in
computational learning theory, where it is usually called the consistency problem.
This interest is motivated by the fact that a class C is not PAC learnable and not
polynomially exact learnable with equivalence queries, if the consistency problem
for C is NP-complete (provided, of course, P ≠ NP); see, for example, Anthony
[26] for details. For example, the consistency problem for the class of h-term DNF
functions (namely, functions representable by a disjunction of at most h terms)
was shown to be NP-complete by Pitt and Valiant [749]. For related topics, the
reader is referred to Aizenstein et al. [13]; Angluin [21]; Bshouty [161]; Kearns,
Li, and Valiant [560]; and Valiant [884], and so on.
classified, and some attributes not included in the current data set may render
it inconsistent. In this section, in order to cope with such situations, we allow an
extension f “to make errors” in the sense that some points A ∈ T may be classified
in F (f ) (f (A) = 0), and some points B ∈ F may be classified in T (f ) (f (B) = 1).
However, we obviously want to minimize the magnitude of such errors. In order
to state more precisely the resulting questions, let

w : T ∪ F → R+

be a weighting function that associates a positive weight w(X) with each point X ∈ T ∪ F .
Boros, Ibaraki, and Makino [139] introduced the following problem (see also
Boros, Hammer, and Hooker [128]):
Problem BEST-FIT(C)
Instance: A pdBf (T , F ) and a weighting function w on T ∪ F .
Output: A pdBf (T ∗ , F ∗ ) (and an extension f ∈ C of (T ∗ , F ∗ )) with the following
properties:
1. T ∗ ∩ F ∗ = ∅ and T ∗ ∪ F ∗ = T ∪ F .
2. (T ∗ , F ∗ ) has an extension in C.
3. w(T ∩ F ∗ ) + w(F ∩ T ∗ ) is minimized.
The conditions in this problem express that if we consider the points in T ∩
F ∗ and F ∩ T ∗ as erroneously classified, and if we change their classification
accordingly, then the resulting pdBf (T ∗ , F ∗ ) has an extension in the designated
class C. In case the weighting function w satisfies w(A) = 1 for all A ∈ T ∪ F , the
problem asks to minimize the number of erroneously classified points in T ∪ F .
Clearly, problem BEST-FIT(C) contains problem EXTENSION(C) as a spe-
cial case. Therefore, if EXTENSION(C) is NP-complete, then BEST-FIT(C) is
NP-hard. Conversely, if BEST-FIT(C) is solvable in polynomial time, then so is
EXTENSION(C). The next theorem indicates that BEST-FIT(C) is quite hard and
is polynomially solvable only for very restrictive classes C (see [139] for additional
results).
Proof. We prove the polynomiality of BEST-FIT(C) for FALL and F+ . Its NP-
hardness for FUNATE follows from Theorem 12.10. The results for other classes
are omitted (see Boros, Ibaraki, and Makino [139]).
C = FALL : By Theorem 12.1, if (T , F ) does not have an extension in FALL , then
T ∩ F = ∅. The optimal pdBf (T ∗ , F ∗ ) is obtained by reclassifying every point
X ∈ T ∩ F either into T ∗ or into F ∗ . Since both decisions carry the same weight
w(X), we can minimize w(T ∗ ∩ F ) + w(F ∗ ∩ T ) by letting, for example,
T ∗ = T \F, F∗ = F. (12.21)
C = F+ : We construct the bipartite graph H(T ,F ) with vertex set T ∪ F and edge set

E = {(A, B) | A ≤ B, A ∈ T , B ∈ F }.

This graph H(T ,F ) can be constructed from (T , F ) in O(n|T ||F |) time. A minimum
vertex cover of H(T ,F ) is a subset of vertices U ⊆ T ∪ F such that
(1) U is a vertex cover of H(T ,F ) , that is, every edge (A, B) ∈ E satisfies either
A ∈ U or B ∈ U , and
(2) w(U ) is minimum among all vertex covers.
Although the problem of finding a minimum vertex cover is NP-hard for general
graphs, it is solvable in O((|T | + |F |)3 ) time for bipartite graphs (e.g., Ford and
Fulkerson [341]; Kuhn [587]).
Let U be a minimum vertex cover of H(T ,F ) . We can assume without loss of
generality that U is a minimal cover, meaning that no proper subset of U is a vertex
cover (this is certainly true if all weights w are strictly positive; otherwise, simply
remove all redundant vertices from U ).
Observe that for every positive Boolean function f , the set

W = (T ∩ F (f )) ∪ (F ∩ T (f ))

of points misclassified by f is a vertex cover of H(T ,F ) : indeed, if A ≤ B with
A ∈ T and B ∈ F , then f (A) = 1 implies f (B) = 1, so that either A ∈ F (f ) or
B ∈ T (f ). Now define

T ∗ = (T \ U ) ∪ (F ∩ U ), (12.23)
F ∗ = (T ∩ U ) ∪ (F \ U ). (12.24)

We claim that the pdBf (T ∗ , F ∗ ) has a positive extension,
and this, together with (12.22) implies that (T ∗ , F ∗ ) provides an optimal solution
of BEST-FIT(F+ ). The total time required for the entire computation of (T ∗ , F ∗ )
is O(n|T ||F | + (|T | + |F |)3 ).
In order to prove the claim, assume that (T ∗ , F ∗ ) does not have a pos-
itive extension. This means that there exist A ∈ T ∗ and B ∈ F ∗ such that
A ≤ B. We distinguish three cases, according to the definition (12.23)–(12.24)
of (T ∗ , F ∗ ):
Example 12.19. Consider the pdBf (T , F ) given by

T = {(0, 1, 1, 0, 0), (0, 1, 0, 1, 0), (0, 0, 1, 1, 0), (0, 0, 1, 0, 1), (0, 0, 1, 1, 1)},
F = {(0, 1, 0, 1, 1), (1, 1, 0, 1, 0), (0, 1, 1, 1, 0), (0, 0, 1, 1, 1)}.
For BEST-FIT(FALL ), since T ∩ F = {(0, 0, 1, 1, 1)}, we obtain

T ∗ = T \ {(0, 0, 1, 1, 1)},
F ∗ = F,

in view of (12.21).
To solve BEST-FIT(F+ ), we then construct the bipartite graph H(T ,F ) of
Figure 12.9. This graph has a minimum vertex cover

U = {(0, 1, 0, 1, 0), (0, 1, 1, 1, 0), (0, 0, 1, 1, 1)},

and we obtain

T ∗ = (T \ U ) ∪ (F ∩ U )
    = {(0, 1, 1, 0, 0), (0, 0, 1, 1, 0), (0, 0, 1, 0, 1), (0, 0, 1, 1, 1), (0, 1, 1, 1, 0)},
F ∗ = (T ∩ U ) ∪ (F \ U )
    = {(0, 1, 0, 1, 0), (0, 1, 0, 1, 1), (1, 1, 0, 1, 0)}.
12.5 Extensions of pdBfs with missing bits 551
Figure 12.9. Bipartite graph H(T ,F ) for the pdBf in Example 12.19 (dark circles denote the
vertices in a minimum vertex cover U ).
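For unit weights, the computation illustrated in Example 12.19 and Figure 12.9 can be reproduced end to end: a maximum matching by augmenting paths, a minimum vertex cover via König's theorem, and the reclassification (12.23)–(12.24). The implementation below is our own sketch; the weighted case would instead need a flow-based cover computation.

```python
def best_fit_positive(T, F):
    """Unweighted BEST-FIT(F+): maximum matching in H_(T,F), minimum
    vertex cover U by Koenig's theorem, then reclassification."""
    leq = lambda A, B: all(a <= b for a, b in zip(A, B))
    adj = {i: [j for j, B in enumerate(F) if leq(A, B)] for i, A in enumerate(T)}
    match = {}                              # F-index -> matched T-index

    def augment(i, seen):
        for j in adj[i]:
            if j not in seen:
                seen.add(j)
                if j not in match or augment(match[j], seen):
                    match[j] = i
                    return True
        return False

    for i in adj:
        augment(i, set())
    # Koenig: alternating reachability from the unmatched T-vertices.
    reach_T = {i for i in adj if i not in set(match.values())}
    reach_F, stack = set(), list(reach_T)
    while stack:
        i = stack.pop()
        for j in adj[i]:
            if j not in reach_F:
                reach_F.add(j)
                if j in match and match[j] not in reach_T:
                    reach_T.add(match[j])
                    stack.append(match[j])
    U_T = {T[i] for i in adj if i not in reach_T}   # cover = (T \ Z) U (F n Z)
    U_F = {F[j] for j in reach_F}
    T_star = (set(T) - U_T) | U_F                   # (12.23)
    F_star = U_T | (set(F) - U_F)                   # (12.24)
    return T_star, F_star, U_T | U_F
```

On the pdBf of Example 12.19, this recovers the cover U of weight 3 and the pdBf (T ∗, F ∗) computed above.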
The problem BEST-FIT was extensively studied by Boros, Ibaraki, and Makino
[139]. As the problem plays an important role in analyzing real-world data, efficient
heuristic algorithms are necessary to deal with those classes C for which BEST-
FIT(C) is NP-hard. Some attempts in this direction have been made, for instance,
in Boros et al. [131].
M = {0, 1, ∗}.
We call partially defined Boolean function with missing bits, abbreviated as pBmb,
any pair (T̃ , F̃ ) consisting of a set of positive examples T̃ ⊆ Mn and of a set of
negative examples F̃ ⊆ Mn . Following the line of Boros, Ibaraki, and Makino
[140], we introduce in the next subsection various types of extensions which are
meaningful for pBmbs. Related complexity results are then discussed in Section
12.5.2.
1. We consider that each missing bit can take value either 0 or 1, and the value
of the extension should be identical in both cases.
2. We consider that each missing bit should be fixed to one of the two values
0 and 1, and an extension should exist for these fixed values. (Here, it is
important to fix appropriately the value of the missing bits.)
If we take the first point of view, then we can define a fully robust extension
of a pBmb (T̃ , F̃ ) to be a Boolean function f such that f (A) = 1 (respectively,
f (B) = 0) for all A ∈ Bn (respectively, B ∈ B n ) obtainable from a point à ∈ T̃
(respectively, B̃ ∈ F̃ ) by fixing each missing bit to either 0 or 1. From the second
point of view, we can define a consistent extension of (T̃ , F̃ ) to be an extension of
some pdBf (T , F ) obtained from (T̃ , F̃ ) by fixing all missing bits appropriately.
When a pBmb (T̃ , F̃ ) admits a consistent extension, but no fully robust exten-
sion, then we may also take an intermediate view whereby we should fix a smallest
possible number of missing bits so that the resulting pBmb has a fully robust
extension. Such an extension is called a most robust extension.
To describe more precisely the above three problems, let us introduce some
notations. For a set of points S̃ ⊆ Mn , let
AS(S̃) = {(X, j ) | X ∈ S̃, xj = ∗}
be the set of (point, coordinate) pairs corresponding to the missing bits of the points in S̃.
Example 12.20. If
then
AS(S̃) = {(X, 2), (Y , 3), (Y , 4), (Z, 3)}.
If we consider Q = {(X, 2), (Y , 4)} and the assignment (α(X, 2), α(Y , 4)) = (1, 0) ∈
BQ , then we obtain
We also use the following shorthand notations: For a given pBmb (T̃ , F̃ ), we
let
AS = AS(T̃ ∪ F̃ ),
and if S̃ is a singleton {X}, then we simply write AS(X) for AS(S̃). Note that an
assignment α ∈ BAS fixes all missing bits of the points in T̃ ∪ F̃ .
Based on these definitions, we say that a Boolean function f is a fully robust
extension of the pBmb (T̃ , F̃ ) if the conditions
Remark 12.3. There is yet another type of extension of a pBmb called fully
consistent extension: A pBmb (T̃ , F̃ ) is fully consistent in class C if for every
assignment α ∈ BAS there is an extension f ∈ C of the pdBf (T̃ α , F̃ α ). Note that
the extensions may be different for different assignments α ∈ BAS . Clearly, a
pBmb (T̃ , F̃ ) is fully consistent in class C if it has a fully robust extension in C,
but the converse may not be true. This type of extension was studied in Boros et
al. [141, 142].
Theorem 12.18. A pBmb (T̃ , F̃ ) has a fully robust extension if and only if there
exists no pair (A, B) with A ∈ T̃ and B ∈ F̃ such that A ≈ B. Hence, FRE(FALL )
can be solved in polynomial time.
Proof. The necessary and sufficient condition is obvious from the definition of a
fully robust extension. The condition A ̸≈ B is equivalent to the existence of an
index j such that aj ≠ bj , aj , bj ∈ {0, 1}. This can be checked in O(n|T̃ ||F̃ |) time
by direct comparison of all points A ∈ T̃ and B ∈ F̃ .
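In programming terms, the pairwise test used in this proof is immediate to implement. The following sketch (Python; the encoding of missing bits as the character '*' and all names are our own illustration, not notation from the text) decides whether a fully robust extension exists in FALL :

```python
# Decide FRE(F_ALL) by the O(n|T||F|) pairwise check of Theorem 12.18.
# A point of M^n is a string over {'0', '1', '*'}, '*' marking a missing bit.

def conflicts(a, b):
    """True iff some index j has a_j, b_j in {0,1} and a_j != b_j."""
    return any(x != '*' and y != '*' and x != y for x, y in zip(a, b))

def has_fully_robust_extension(T, F):
    """A fully robust extension in F_ALL exists iff every pair (A, B) conflicts."""
    return all(conflicts(a, b) for a in T for b in F)

print(has_fully_robust_extension(["01**0", "*1011"], ["00*1*", "1*00*"]))  # True
print(has_fully_robust_extension(["0*1"], ["0*1"]))                        # False
```

In the second call, some completion of the positive example coincides with a completion of the negative one, so no Boolean function can classify all completions correctly.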
The next lemma holds for the class F+ and for any subclass of F+ .
Lemma 12.7. A pBmb (T̃ , F̃ ) has a fully robust extension in the class C ⊆ F+ if
and only if the pdBf (T − , F + ) defined by
T − = {A0 | A ∈ T̃ }, F + = {B 1 | B ∈ F̃ },
has an extension in C.
It is easily checked that A ≈ B does not hold for any A ∈ T̃ , B ∈ F̃ , and hence,
there is a fully robust extension of (T̃ , F̃ ) in FALL . Such a fully robust extension f
is for example given by
T (f ) = {(0, 1, 0, 0, 0), (0, 1, 0, 1, 0), (0, 1, 1, 0, 0), (0, 1, 1, 1, 0), (0, 1, 0, 1, 1),
(1, 1, 0, 1, 1)}
F (f ) = B 5 \ T (f ).
T + = {A1 | A ∈ T̃ }, F − = {B 0 | B ∈ F̃ }
has an extension in C.
Proof. Assume first that there is a consistent extension f ∈ C of (T̃ , F̃ ). That is,
f is an extension of the pdBf (T̃ β , F̃ β ) for some assignment β ∈ B AS . Since f
is positive and Aβ ≤ A1 , we see that f (A1 ) = 1 holds for all A ∈ T̃ . Similarly,
f (B 0 ) = 0 for all B ∈ F̃ . Therefore, f is an extension of (T + , F − ).
The converse direction is obvious since (T + , F − ) is obtained from (T̃ , F̃ ) by
an assignment.
yj ≤ wj , yj ≤ 0 j = 1, 2, . . . , n, (12.29)
zj ≥ wj , zj ≥ 0 j = 1, 2, . . . , n. (12.30)
We claim that this system has a feasible solution if and only if (T̃ , F̃ ) has a fully
robust threshold extension.
Let us assume first that (12.27)–(12.30) has a feasible solution (W , Y , Z, t).
Then, for all A ∈ T̃ and for all α ∈ AS(A), we obtain from (12.27) and (12.29):
t + 1 ≤ ∑_{j : aj =1} wj + ∑_{j : aj =∗} yj ≤ ∑_{j : ajα =1} wj = ∑_{j =1}^{n} wj ajα . (12.31)
and (12.27) is satisfied. The same reasoning holds for (12.28), and hence,
(W , Y , Z, t) is a feasible solution of (12.27)–(12.30).
This claim will prove the theorem, since condition (12.33) can be checked for all
A ∈ T̃ in O(n|T̃ ||F̃ |) time.
To prove the claim, assume first that condition (12.33) holds. Consider a point
A ∈ T̃ and the corresponding index j satisfying condition (12.33). Then, for all
assignments α ∈ BAS and all B ∈ F̃=A , we have ajα = 0 and bjα = 1. Therefore, the
Horn term
tA = ( ⋀_{i : ai =1} xi ) x̄j
satisfies tA (Aα ) = 1 and tA (B α ) = 0 for all α ∈ B AS and all B ∈ F̃=A . This term
tA also satisfies tA (B α ) = 0 for all B ∈ F̃ \ F̃=A and all α ∈ B AS ; indeed, for all
such B, there is some i such that ai = 1 and bi = 0, by the assumption that B ∉ F̃=A .
We conclude that the following Horn DNF represents a fully robust extension of
(T̃ , F̃ ):
ϕ = ⋁_{A∈T̃} tA . (12.34)
Conversely, if condition (12.33) does not hold for some A ∈ T̃ with F̃=A ≠ ∅,
then define an assignment α ∈ BAS({A}∪F̃=A ) as follows: For all (A, i) ∈ AS(A),
α(A, i) = ⋀_{B∈F̃=A : bi ≠∗} bi if there is a point B ∈ F̃=A such that bi ≠ ∗,
and α(A, i) = 1 otherwise. One can then verify that
Aα = ⋀_{B α ∈(F̃ α )≥Aα } B α .
By condition (12.16) in the proof of Theorem 12.12, this implies that the pdBf
(T̃ α , F̃ α ) does not have a Horn extension, and consequently, the pBmb (T̃ , F̃ )
does not have a fully robust extension in FHORN .
Example 12.22. Let us consider the pBmb (T̃ , F̃ ) of Example 12.21. For the two
points in T̃ , we obtain
The condition (12.33) holds with j = 5 for A = (0, 1, ∗, ∗, 0), and with j = 3 for
A = (∗, 1, 0, 1, 1). Therefore, the DNF ϕ defined by (12.34), namely,
ϕ = x2 x̄5 ∨ x2 x4 x5 x̄3 ,
In contrast with the previous positive results, Boros, Ibaraki, and Makino [140]
also proved that, except for the special cases discussed in Theorems 12.18, 12.19,
12.20, and 12.21, all other variants of the problems FRE, CE, MRE are either NP-
complete or NP-hard for the classes FALL , F+ , FUNATE , FTh , FHORN , FF0 (S0 ,F1 (S1 )) ,
and Fk . We refer the reader to [140] for details and additional results.
the output obtained for such input values when designing the logic circuit under
consideration. This provides some freedom, which can be exploited in the design
process. Since this aspect was not mentioned in Chapter 3, Section 3.3, we discuss
it here very briefly (for more details, we refer the reader to the specialized litera-
ture; see, e.g., Umans, Villa and Sangiovanni-Vincentelli [877] or Villa, Brayton,
and Sangiovanni-Vincentelli [891]).
In the terminology of this chapter, we can restate the logic synthesis problem
as the problem of realizing some extension of a pdBf (T , F ) (instead of a Boolean
function) by a DNF φ, where all points in B n \ (T ∪ F ) are interpreted as don’t
cares. Keeping in mind the difference between a Boolean function and a pdBf,
we can accordingly adapt the discussion of logic minimization in Section 3.3.
Namely, we want now to find an extension f of (T , F ) having a shortest DNF φ,
as measured either by |φ| (the number of literals) or by ||φ|| (the number of terms).
A main difference with the discussion in Chapter 3 is that here, we do not know
the function f beforehand, but we have to select it among the extensions of (T , F ).
Note that, since our objective is to minimize the size of the DNF representation,
Lemma 12.3 implies that there is no loss of generality in restricting our attention
to prime irredundant theories (defined in Section 12.2.3). Therefore, we use prime
patterns of (T , F ) (instead of prime implicants of f in Chapter 3), and we aim
to find a set of prime patterns that together cover T . The DNF φ defined as the
disjunction of such prime patterns is a prime theory. If this prime theory mini-
mizes |φ| (or ||φ||), then we deem it desirable from the point of view of circuit
design.
In order to select an appropriate prime theory, the usual procedures require first
to generate all prime patterns of the given pdBf (T , F ). Since the prime patterns
of (T , F ) are among the prime implicants of the function fmax defined by (12.4)
(see Lemma 12.1 in Section 12.2.3), we can proceed as described in Figure 12.11.
Note that ψfmax is explicitly available from F . The prime implicants of fmax can
be generated, for instance, by (the dual version of) the procedure SD-Dualization
of Section 4.3.2.
Example 12.23. Let us consider the pdBf in Table 12.1 of Section 12.1. First we
construct ψfmax from F = {B (1) , B (2) , B (3) , B (4) } as follows (we use the shorthand
Expanding this CNF into a DNF, and then manipulating it as discussed in Section
4.3.2, we obtain the next DNF consisting of all prime implicants of fmax :
ϕ = 1̄2∗ ∨ 1̄5̄∗ ∨ 1̄8∗ ∨ 23∗ ∨ 24̄∗ ∨ 25∗ ∨ 2̄5̄ ∨ 26̄∗ ∨ 27∗ ∨ 28̄∗
∨ 2̄8 ∨ 34 ∨ 3̄4̄ ∨ 35̄ ∨ 36 ∨ 3̄6̄∗ ∨ 3̄7∗ ∨ 37̄∗ ∨ 38∗ ∨ 4̄5̄
∨ 46̄∗ ∨ 4̄6 ∨ 47∗ ∨ 4̄7̄∗ ∨ 4̄8∗ ∨ 5̄6̄ ∨ 5̄7∗ ∨ 58∗ ∨ 5̄8̄∗ ∨ 67∗
∨ 6̄7̄∗ ∨ 6̄8∗ ∨ 78 ∨ 12̄3̄ ∨ 12̄4 ∨ 12̄6 ∨ 12̄7̄ ∨ 13̄5∗ ∨ 13̄8̄
∨ 145∗ ∨ 148̄ ∨ 156 ∨ 157̄∗ ∨ 168̄ ∨ 17̄8̄ ∨ 268̄∗ .
In this DNF, the terms marked with ∗ are the prime patterns of (T , F ), while the
unmarked terms are not prime patterns. This example shows that the procedure
may generate many prime implicants which are not prime patterns of (T , F ).
Following the line of Section 3.3, the next step of logic minimization is to find
a set of prime patterns that together cover the set T of true points. For the purpose
of computing a theory φ which minimizes |φ| or ||φ||, the methods described in
Section 3.3 (Quine-McCluskey method and its extensions) can be readily applied,
if we simply replace the words “prime implicant” by “prime pattern.” We illustrate
this by continuing the foregoing example.
Example 12.24. From the list of prime patterns marked in the DNF ϕ for Example
12.23, we choose a set of prime patterns which cover the set T = {A(1) , A(2) , A(3) }
given in Table 12.1. It is easy to see that a single prime pattern cannot do
this, and thus we must select at least two prime patterns. Even if we restrict
ourselves to short prime implicants, there are many such sets, for example,
{1̄2, 25}, {1̄2, 26̄}, . . . , {67, 6̄7̄}. The corresponding prime theories φ contain exactly
two prime patterns of degree two and are minimum with respect to both norms |φ|
and ||φ||. The extensions f1 and f3 given in Example 12.1 are two such minimum
realizations of (T , F ).
The preceding method based on generating all prime patterns seems reasonably
efficient for those pdBfs (T , F ) such that T ∪F is not much smaller than B n (which
is often the case when don’t cares are considered). But if the set T ∪ F is small,
other methods that construct prime patterns directly from T may be more efficient.
For example, Boros et al. [131] propose a naive method that first generates all terms
of degree 1 and picks up prime patterns from them, and then repeats the same
for all terms of degree 2, and so on. This method can be used to obtain short prime
patterns. Another approach is to apply, for each A ∈ T , a method to generate all
prime patterns that cover A, by relying on the set covering characterization of
patterns (see Exercise 2 of this chapter). This can be elaborated into an algorithm
that runs with polynomial delay for the generation of all prime patterns, as discussed
in Boros et al. [117].
12.7 Conclusion
In this chapter, we introduced partially defined Boolean functions (pdBfs) as fun-
damental models arising in various fields of applications, in particular, in logical
analysis of data. We defined various problems and classified their computational
complexity, with an emphasis on questions related to extensions of pdBfs. We sum-
marize in Table 12.8 the main complexity results mentioned in this chapter. In this
table, a letter P indicates that the corresponding problem is solvable in polynomial
time, while NPH or NPC indicate that it is NP-hard or NP-complete, respectively.
Also, EXT stands for EXTENSION and MIN-SUPT for MIN-SUPPORT.
As mentioned in the introduction of this chapter, acquiring or discovering
meaningful information (or knowledge) from available data has recently received
increased attention. The approach in this chapter may be regarded as a logical
approach, since it is based solely on the consideration of pdBfs and of their
extensions, viewed as Boolean functions having simple Boolean expressions. The
performance of different approaches may be compared from several viewpoints,
such as:
12.8 Exercises
1. Prove Theorem 12.9, that is, prove that problem MIN-SUPPORT(F+ ) is
NP-hard.
2. Given a pdBf (T , F ) on B n and a point A ∈ T , let tA = z1 z2 · · · zn be the
minterm of A (that is, zj = xj if aj = 1 and zj = x̄j otherwise). Define an
|F | × n matrix Q by
Qij = 1 if bj(i) ≠ aj , and Qij = 0 otherwise,
where B (i) is the i-th point in F . Then consider the following set covering
constraints:
Qy ≥ 1 (12.35)
y ∈ {0, 1}n . (12.36)
V = V1 ∪ V2 ,
E = EF ∪ ET
Vi = {X|Si | X ∈ T ∪ F }, i = 1, 2
ET = {(X|S1 , X|S2 ) | X ∈ T },
EF = {(X|S1 , X|S2 ) | X ∈ F }.
and derive a necessary and sufficient condition for this type of decompos-
ability.
5. Prove the second half of Theorem 12.11, that is, prove that problem
EXTENSION(Fk+ ) can be solved in polynomial time.
6. For a given graph G = (V , E) with V = {1, 2, . . . , n}, define the points A(i,j ) ,
(i, j ) ∈ E, and B (i) , i ∈ V , as follows:
• ak(i,j ) = 1 for k ∉ {i, j } and ak(i,j ) = 0 for k ∈ {i, j },
• bk(i) = 1 for k = i and bk(i) = 0 for k ≠ i.
Then define a pdBf (T , F ) in Bn by
Show that
min(|T ∩ F ∗ | + |F ∩ T ∗ |) = τ (G)
holds, where the minimum is taken over all pdBfs (T ∗ , F ∗ ) having an exten-
sion f ∈ FHORN , and τ (G) is the size of a minimum vertex cover in G.
Knowing that the minimum vertex cover problem is NP-hard, prove that
BEST-FIT(FHORN ) is NP-hard.
7. For each of the following conditions, construct a pBmb (T̃ , F̃ ) satisfying
it.
a. (T̃ , F̃ ) has a consistent extension in FALL , but does not have a fully
robust extension in FALL .
b. (T̃ , F̃ ) has a fully robust extension in FALL , but does not have a fully
robust extension in F+ .
c. (T̃ , F̃ ) has a consistent extension in F+ , but does not have a fully robust
extension in F+ .
8. Consider the consistent extension problem CE(FALL ) for a pBmb (T̃ , F̃ )
such that each A ∈ T̃ ∪ F̃ has at most one missing bit. Recall that (T̃ , F̃ )
has a consistent extension in FALL if and only if there is an assignment
α such that Aα ≠ B α holds for all pairs of A ∈ T̃ and B ∈ F̃ . Show that
the question of the existence of such an assignment can be formulated as a
quadratic Boolean equation (or 2-sat problem). Since quadratic equations
are solvable in polynomial time, this proves that CE(FALL ) is also solvable
in polynomial time under the stated restriction.
13
Pseudo-Boolean functions
13.1 Definitions and examples 565
topic and to briefly indicate some of the main research directions and techniques
encountered in the field.
We now proceed with a description of a few representative problems arising in
mathematics, computer science, and operations research, where pseudo-Boolean
functions appear naturally and contribute to the analysis and the solution of area-
specific problems.
Mathematics
Application 13.1. (Graph theory.) As observed by Hammer and Rudeanu [460],
many graph theoretic concepts can be easily formulated in the pseudo-Boolean
language. We only give here a few examples.
Let N = {1, 2, . . . , n}, and consider a graph G = (N , E) with nonnegative
weights w : N → R+ on its vertices, and capacities c : E → R+ on its (undirected)
edges. For every S ⊆ N , the cut (S, N \ S) is the set of edges having exactly
one endpoint in S; the capacity of this cut is defined as ∑_{(i,j )∈(S,N \S)} c(i, j ). The
max-cut problem is to find a cut of maximum capacity in G. If (x1 , x2 , . . . , xn ) is
interpreted as the characteristic vector of S, then the edge (i, j ) has i ∈ S and
j ∈ N \ S if and only if xi x̄j = 1. Therefore, the max-cut problem is equivalent to the
maximization of the quadratic pseudo-Boolean function
f (x1 , x2 , . . . , xn ) = c(i, j )(xi x j + x i xj ). (13.1)
(i,j )∈E
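For a small graph, the maximization of (13.1) can be carried out by brute force; the sketch below (Python; the dictionary encoding of capacities is our own choice) evaluates the quadratic objective on every 0–1 point of a weighted triangle:

```python
# Max-cut via the quadratic pseudo-Boolean objective (13.1):
# x_i = 1 iff vertex i belongs to S; a term contributes c(i,j) iff edge (i,j) is cut.

def cut_value(x, cap):
    return sum(c * (x[i] * (1 - x[j]) + (1 - x[i]) * x[j])
               for (i, j), c in cap.items())

cap = {(0, 1): 3.0, (1, 2): 2.0, (0, 2): 1.0}        # a weighted triangle
best = max((cut_value((a, b, c), cap), (a, b, c))
           for a in (0, 1) for b in (0, 1) for c in (0, 1))
print(best)  # (5.0, (1, 0, 1)): put vertex 1 alone on one side
```

Each cut edge contributes its capacity exactly once, since exactly one of xi x̄j and x̄i xj equals 1 when the endpoints lie on different sides.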
for a sufficiently large value of the penalty M (say, M > max1≤i≤n w(i)).
Let us now assume that w(i) = 1 for i = 1, 2, . . . , n. For every A ⊆ N , we denote
by αG (A) the stability number of the subgraph of G induced by A, that is, the size of
a largest stable set of G contained in A. We can associate with G a pseudo-Boolean
function fαG defined as follows: For each X = (x1 , x2 , . . . , xn ) ∈ Bn ,
Application 13.2. (Linear algebra.) Let V be a finite set of vectors over an arbi-
trary field and consider the set function f : P(V ) → R, where f (T ), T ⊆ V , is the
rank of the matrix whose rows are the members of T . This rank function has two
interesting properties that are further examined in Section 13.6. First, the function
is monotone nondecreasing, that is,
f (S) ≤ f (T ) whenever S ⊆ T .
Second, the function is submodular, meaning that
f (S ∪ T ) + f (S ∩ T ) ≤ f (S) + f (T ) for all S, T ⊆ V .
It is interesting to remark that both of these properties continue to hold for rank
functions defined on subsets of elements of a matroid (see for instance Welsh
[905]).
function
f = ∑_{k=1}^{m} wk ∏_{i∈Ak} xi ∏_{j ∈Bk} x̄j .
We refer the reader to Chapter 2, Section 2.11.4, for a more complete discussion
of this well-known generalization of the Boolean satisfiability problem.
Ci+ (X) = 1 ⇒ X ∈ Ω+ (i = 1, . . . , k)
Cj− (X) = 1 ⇒ X ∈ Ω− (j = 1, . . . , h)
(see Chapter 12 for details). In Boros et al. [131], patterns have been used to
define a family of discriminants, namely, pseudo-Boolean functions of the form
d(X) = ∑_{i=1}^{k} αi Ci+ (X) − ∑_{j =1}^{h} βj Cj− (X),
where the αi ’s and the βj ’s are nonnegative reals, and ∑_{i=1}^{k} αi = ∑_{j =1}^{h} βj = 1.
An appropriate choice of the parameters (αi , βj ) allows the construction of dis-
criminants which take “high” values in positive observations, and “low” values
in negative ones. We refer to [131] for details. See also Genkin, Kulikowski, and
Muchnik [376] for other pseudo-Boolean models in data mining.
Operations research
Application 13.6. (0–1 linear programming.) Consider the 0–1 linear program-
ming problem
maximize z(x1 , x2 , . . . , xn ) = ∑_{j =1}^{n} cj xj (13.4)
subject to ∑_{j =1}^{n} aij xj = bi , i = 1, 2, . . . , m (13.5)
where the product ∏_{j ∈T (i)} xj takes value 1 only if part i can be processed by the
selected tools. Different formulations and detailed discussions of this problem can
be found, for instance, in [228, 238, 845].
For a second example, consider the classical simple facility location problem:
Here, we must select an optimal subset of locations for some facilities (such as
plants, warehouses, emergency facilities) in order to serve the needs of a set of
users. Opening a facility in a given location i requires a fixed cost ci , and deliv-
ering the service to user j from location i carries a cost dj i (j = 1, 2, . . . , m,
i = 1, 2, . . . , n).
Let us introduce a 0–1 variable xi which indicates whether a facility is to be
opened in location i (i = 1, . . . , n). Two pseudo-Boolean functions can be defined:
A function c(X) to indicate the total fixed cost required to open a configuration
X = (x1 , x2 , . . . , xn ), and a function d(X) to indicate the optimal cost of serving
the set of users from the corresponding locations. The optimal location problem
(essentially) consists now in finding the minimum of the pseudo-Boolean function
c(X) + d(X). Detailed expressions of this function were first proposed by Hammer
[434] and further examined in [70, 263, 394, 395, 900, etc.]. If we denote by π(j ) =
(i1 (j ), i2 (j ), . . . , in (j )) a permutation of locations such that dj i1 (j ) ≤ dj i2 (j ) ≤ . . . ≤
dj in (j ) , then the function to be minimized can be written as
f (X) = ∑_{i=1}^{n} ci xi + ∑_{j =1}^{m} ∑_{k=1}^{n} dj ik (j ) xik (j ) ∏_{ℓ<k} x̄iℓ (j ) + M ∏_{i=1}^{n} x̄i .
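Although the closed form above is cumbersome, its value at a fixed configuration X is easy to compute: the inner sum charges user j the cost of the cheapest open location, and the last term adds the penalty M when no location is open. A sketch (Python; all data values are invented for illustration):

```python
# Evaluate the facility location objective for a 0-1 configuration X.
def location_cost(X, c, d, M):
    total = sum(ci * xi for ci, xi in zip(c, X))       # fixed opening costs
    for dj in d:                                       # dj[i]: cost of serving user j from i
        open_costs = [dj[i] for i in range(len(X)) if X[i] == 1]
        total += min(open_costs) if open_costs else M  # cheapest open location, or penalty
    return total

c = [10, 12, 7]                         # fixed costs of the three locations
d = [[4, 2, 9], [1, 8, 5]]              # two users
M = 1000
print(location_cost((1, 0, 1), c, d, M))   # 10 + 7 + 4 + 1 = 22
```

The k-th term of the displayed inner sum contributes dj ik (j ) only when location ik (j ) is open and all cheaper ones are closed, which is exactly the minimum computed here.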
13.2 Representations
Different application areas may rely on different descriptions of pseudo-Boolean
functions. For instance, in game theory, the payoff of a coalition of players may
be computed as the optimal value of an associated combinatorial optimization
problem (see Bilbao [79]). In other models, the values assumed by a pseudo-
Boolean function may be listed in a table, or computed by a black-box oracle.
One of the main impacts of the pseudo-Boolean viewpoint on the theory of set
functions, however, is due to the existence of various algebraic representations
of these functions. The properties of such algebraic representations are the main
topic of this section.
takes value f (X ∗ ) in the point X ∗ , and the value 0 in every other point of B n .
Therefore,
f (x1 , x2 , . . . , xn ) = ∑_{X ∗ ∈Bn} f (X ∗ ) ∏_{i|xi∗ =1} xi ∏_{j |xj∗ =0} x̄j . (13.14)
Note that the polynomial (13.12) is linear in each of its variables: We say that
it is multilinear.
Definition 13.1. The expression in the right-hand side of (13.12) is the (multilin-
ear) polynomial expression of f . The degree of f is the degree of this polynomial,
namely, degree(f ) = max{|A| : c(A) ≠ 0}. We say that a pseudo-Boolean func-
tion is either linear, or quadratic, or cubic if its degree is at most 1, or 2, or 3,
respectively.
The set function c: P(N ) → R is sometimes called the Möbius transform or the
mass function associated with f (see for instance [407, 824]). In fact, it follows
from the elementary theory of Möbius inversion for ordered sets that c can be
computed as
c(A) = ∑_{S⊆A} (−1)|A|−|S| f (eS ) for all A ∈ P(N ),
where eS denotes as usual the characteristic vector of S (see Aigner [12]). The
bijective correspondence linking the functions f and c has been investigated in
a broader context by various authors; see for instance Grabisch, Marichal, and
Roubens [407].
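The Möbius inversion formula translates directly into code. The following sketch (Python; all names are ours) recovers the coefficients c(A) of the multilinear polynomial from the values of f on the 0–1 points; the sample function is the one tabulated in the posiform example that follows:

```python
# Compute c(A) = sum_{S subseteq A} (-1)^(|A|-|S|) f(e_S) for every A.
from itertools import chain, combinations

def subsets(A):
    return chain.from_iterable(combinations(A, r) for r in range(len(A) + 1))

def mobius_coefficients(f, n):
    def e(S):                                   # characteristic vector of S
        return tuple(1 if i in S else 0 for i in range(n))
    return {frozenset(A): sum((-1) ** (len(A) - len(S)) * f(e(set(S)))
                              for S in subsets(A))
            for A in subsets(range(n))}

# f(x, y, z) = 3 + x - 3y - 2z - 6xy + 13xyz
f = lambda X: 3 + X[0] - 3*X[1] - 2*X[2] - 6*X[0]*X[1] + 13*X[0]*X[1]*X[2]
c = mobius_coefficients(f, 3)
print(c[frozenset()], c[frozenset({0, 1})], c[frozenset({0, 1, 2})])  # 3 -6 13
```

The computed dictionary reproduces every coefficient of the polynomial, including the zero coefficients of the absent terms xz and yz.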
Definition 13.3. The PBNF (13.15) is called a posiform if bk > 0 for all k =
1, . . . , m.
Note that the sign of the free coefficient b0 is unrestricted in a posiform. Hammer
and Rosenberg [457] introduced posiforms and observed the following property:
x y z f (x, y, z)
0 0 0 3
0 0 1 1
0 1 0 0
0 1 1 −2
1 0 0 4
1 0 1 2
1 1 0 −5
1 1 1 6
ψ1 = −8 + x + 6 x̄ + 3 ȳ + 2 z̄ + 6 x ȳ + 13 xyz
and
ψ2 = −8 + x + 3 ȳ + 6 ȳ + 2 z̄ + 6 x̄ y + 13 xyz,
which can be further simplified to
ψ1 = −7 + 5 x̄ + 3 ȳ + 2 z̄ + 6 x ȳ + 13 xyz
and
ψ2 = −8 + x + 9 ȳ + 2 z̄ + 6 x̄ y + 13 xyz,
respectively.
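Such identities are easy to check mechanically: each simplified posiform, with every complemented variable x̄ written out as 1 − x, must agree with the tabulated f at all eight 0–1 points. A sketch (Python):

```python
# Verify that the two (simplified) posiforms represent the tabulated f(x, y, z).
from itertools import product

f = {(0, 0, 0): 3, (0, 0, 1): 1, (0, 1, 0): 0, (0, 1, 1): -2,
     (1, 0, 0): 4, (1, 0, 1): 2, (1, 1, 0): -5, (1, 1, 1): 6}

psi1 = lambda x, y, z: -7 + 5*(1-x) + 3*(1-y) + 2*(1-z) + 6*x*(1-y) + 13*x*y*z
psi2 = lambda x, y, z: -8 + x + 9*(1-y) + 2*(1-z) + 6*(1-x)*y + 13*x*y*z

print(all(psi1(*p) == psi2(*p) == f[p] for p in product((0, 1), repeat=3)))  # True
```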
Clearly, if the functions f1 and f2 are Boolean, then disjunction and conjunc-
tion are simply the usual Boolean operators (and we sometimes omit to write the
operator ∧).
13.2 Representations 575
f (x, y) = 6 + 3x − xy
attains its minimum value (min f (X) = 6) when (x, y) = (0, 0) or when (x, y) =
(0, 1). Hence, using the construction (13.21), f can be expressed as
f (x, y) = (6 + 2 xy) ∨ (6 + 3 x ȳ),
or as
f (x, y) = (5 + 3 xy) ∨ (5 + x̄ y) ∨ (5 + 4 x ȳ) ∨ (5 + x̄ ȳ),
or as
f (x, y) = (8 xy) ∨ (6 x̄ y) ∨ (9 x ȳ) ∨ (6 x̄ ȳ),
and so on.
To prove Claim (ii), assume first that b = 0 and that a < fmin (or, equivalently
in this case, a + b < fp ). Since p′ (X) = fmin is an implicant of f and since
p(X) = a < fmin = p′ (X), we conclude that p(X) is not prime.
So, let us assume from now on that b > 0. If a < fmin , let 0 < M ≤ min(b, fmin − a),
and define p′ (X) = (a + M) + (b − M) ∏_{i∈A} xi ∏_{j ∈B} x̄j . There holds
p(X) ≤ p′ (X) ≤ f (X) for all X ∈ Bn , p(X) ≠ p′ (X), and we conclude that p(X)
is not prime.
Finally, assume that a + b < fp , let 0 < η ≤ fp − (a + b), and define
p′ (X) = a + (b + η) ∏_{i∈A} xi ∏_{j ∈B} x̄j . Here again, p(X) ≤ p′ (X) ≤ f (X)
for all X ∈ Bn , and we conclude that p(X) is not prime.
define a canonical DNF for each pseudo-Boolean function, thus extending the cor-
responding representation theory of Boolean functions. A similar situation arises
for CNFs.
Foldes and Hammer [336] propose an algorithm which produces all prime
implicants of an arbitrary function expressed in DNF. Their algorithm is a gener-
alization of the Boolean consensus method (see Chapter 3). It is also analogous
to the consensus procedure for discrete functions described by Davio, Deschamps
and Thayse [259].
Remark. The term “extension” was used with a different meaning in Chapter 12,
where it applied to partially defined Boolean functions. On the other hand, the
qualifier “continuous” is somewhat ambiguous in Definition 13.6, since this def-
inition does not require that extensions be continuous in the standard sense for
functions of real variables (namely, with respect to the Euclidean topology of Rn );
the word “continuous” only reminds us here that extensions are defined over a
nondiscrete domain. Therefore, we generally use the short terminology “exten-
sion” in this chapter; this should hopefully cause no confusion.
and
g(X) = ∑_{i=1}^{n} wi xi for all X ∈ U n ,
= f pol (p1 , p2 , . . . , pn ).
Example 13.4. In Example 13.3, if each variable takes value 0 or 1 with proba-
bility 1/2, then the expected value of f is f pol (1/2, 1/2, 1/2) = 9/8.
In the special case where f is a Boolean function, Proposition 13.8 has already
been anticipated in our discussion of reliability theory, in Section 1.13.4 of
Chapter 1. In this framework, the polynomial extension f pol corresponds to the so-
called reliability polynomial; see for instance Colbourn [205, 206], Ramamurthy
[777].
13.3 Extensions of pseudo-Boolean functions 581
bk > 0 (respectively, for bk < 0). In [227], the function ψ std is called the standard
extension of f associated with the PBNF ψ.
The following facts will be useful:
Lemma 13.2. Consider the PBNF ψ in (13.28). For k = 1, 2, . . . , m, let Hk denote
the polyhedron
Hk = { (X, y) ∈ U n+1 | y ≤ gk (X) } if bk > 0
and
Hk = { (X, y) ∈ U n+1 | y ≥ gk (X) } if bk < 0.
All vertices of Hk are in B n+1 , that is, they only have 0–1 components.
Proof. The claim follows from the fact that, in both cases, the system of inequalities
defining Hk is totally unimodular; this follows from Theorem 5.13 in Chapter 5
for the case where bk is positive (see [47, 474, 786]); the other case is easily estab-
lished by direct arguments.
The next lemma is found in Crama [227] (see also Hammer and Kalantari [445]
and Hammer and Simeone [463]).
Lemma 13.3. Consider the PBNF ψ in (13.28). If ψ consists of a single non-
constant term, that is, if ψ = b1 (∏_{i∈A1} xi ) (∏_{j ∈B1} x̄j ), then its standard extension
ψ std and its concave envelope ψ env coincide on U n :
ψ env (X) = ψ std (X) = b1 g1 (X) for all X ∈ U n .
Proof. Since ψ std is concave, ψ env ≤ ψ std on U n . To establish the reverse inequal-
ity, let X ∗ ∈ U n . Since the point (X∗ , g1 (X ∗ )) is in H1 , it is a convex combination
of vertices of H1 : That is, there exists a collection of 0–1 points (X r , y r ) ∈ H1
and of positive scalars λr (r ∈ R) such that (X ∗ , g1 (X ∗ )) = ∑_{r∈R} λr (X r , y r ) and
∑_{r∈R} λr = 1. Hence,
ψ std (X ∗ ) = b1 g1 (X ∗ )
= ∑_{r∈R} λr b1 y r
≤ ∑_{r∈R} λr b1 g1 (X r ) (since (X r , y r ) ∈ H1 )
= ∑_{r∈R} λr ψ(X r ) (since X r ∈ Bn by Lemma 13.2)
= ∑_{r∈R} λr ψ env (X r ) (since X r ∈ Bn )
≤ ψ env (∑_{r∈R} λr X r ) (by concavity of ψ env )
= ψ env (X ∗ ).
Our next result shows that, in spite of their very different definitions, ψ std and
ψ pup turn out to be identical.
Theorem 13.9. The standard extension ψ std and the paved upper-plane extension
ψ pup associated with a same PBNF ψ coincide on U n .
Proof. Let p(X) be a paved upper-plane of ψ given by (13.30). Since each term pk
(k = 1, 2, . . . , m) is a concave majorant of the corresponding term of ψ, it follows
from Lemma 13.3 that bk gk (X) ≤ pk (X), and hence, ψ std (X) ≤ p(X) for all
X ∈ U n . So, ψ std ≤ ψ pup on U n .
To see that ψ pup ≤ ψ std on U n , fix X∗ ∈ U n and consider the paved upper-plane
p(X) given by (13.30), where for each k = 1, 2, . . . , m:
(a) pk = bk xi if bk > 0 and gk (X ∗ ) = xi∗ , i ∈ Ak ;
(b) pk = bk (1 − xj ) if bk > 0 and gk (X ∗ ) = 1 − xj∗ , j ∈ Bk ;
(c) pk = 0 if bk < 0 and gk (X ∗ ) = 0;
(d) pk = bk (1 − |Ak | + ∑_{i∈Ak} xi − ∑_{j ∈Bk} xj ) if bk < 0 and gk (X ∗ ) > 0.
(Apply an arbitrary tie-breaking rule to select the indices i and j if either (a) or
(b) are ambiguous.) This construction is such that p(X ∗ ) = ψ std (X ∗ ), and hence,
ψ pup (X ∗ ) ≤ ψ std (X ∗ ).
This extension was introduced by Lovász in [624]; see also [625]. Observe
that, if c(A) ≥ 0 for all A ⊆ {1, 2, . . . , n} such that |A| ≥ 2, then f L coincides with
the standard extension associated with the polynomial representation of f , and
it is concave. In general, however, f L is neither concave nor convex on U n , as
illustrated by the next example.
Example 13.5. The Lovász extension of f (x, y, z) = xy − xz is the function f L =
min(x, y) − min(x, z), which is neither concave nor convex on U 3 since
1/2 = (1/2) f L (1, 1, 0) + (1/2) f L (0, 1, 1) > f L (1/2, 1, 1/2) = 0,
and
−1/2 = (1/2) f L (1, 0, 1) + (1/2) f L (0, 1, 1) < f L (1/2, 1/2, 1) = 0.
The following discussion provides a different perspective on the Lovász exten-
sion. For a set A ⊆ {1, 2, . . . , n}, A = ∅, denote by m(A) the smallest element in A:
m(A) = min{i | i ∈ A}. Let S = {X ∈ U n | x1 ≤ x2 ≤ . . . ≤ xn } and observe that
S is a simplex, that is, S is a full-dimensional convex bounded polyhedron with
n + 1 vertices. Its vertices are exactly the points (0, 0, . . . , 0, 0, 0), (0, 0, . . . , 0, 0, 1),
(0, 0, . . . , 0, 1, 1), . . ., (1, 1, . . . , 1, 1, 1).
Consider now the restriction of f L to the simplex S. This function, that we
denote by fSL , is linear on S: Indeed, for all X ∈ S, Definition 13.8 yields
f L (X) = fSL (X) = ∑_{A∈P(N )} c(A) xm(A) .
Even more, since fSL coincides with f at the n + 1 vertices of S, it follows that
fSL actually is the unique linear extension of f on S.
This reasoning is easily generalized. For an arbitrary permutation π of
{1, 2, . . . , n}, let S(π ) be the simplex S(π ) = {X ∈ U n | xπ(1) ≤ xπ(2) ≤ . . . ≤ xπ(n) },
and let fS(π)L be the restriction of f L to S(π ). Then, fS(π)L is the unique linear
extension of f on S(π ). Moreover, since the cube U n is covered by the family of
simplices
S = {S(π ) | π is a permutation of {1, 2, . . . , n}},
it follows that f L is the unique extension of f that is linear on every member of S.
In order to obtain an analytical expression of the function fS(π)L , let us introduce
the following notation: For 1 ≤ k ≤ n, let
E π ,k = eπ(k) + eπ(k+1) + . . . + eπ(n) .
We also let E π,n+1 = (0, . . . , 0), so that E π ,1 , E π ,2 , ..., E π ,n+1 are exactly the vertices
of the simplex S(π ).
Theorem 13.10. For every permutation π of {1, 2, . . . , n} and for every X ∈ S(π ),
fS(π)L (X) = ∑_{k=1}^{n+1} (xπ(k) − xπ(k−1) ) f (E π ,k ), (13.34)
where, by convention, xπ(0) = 0 and xπ(n+1) = 1.
Proof. Since the right-hand side of (13.34) defines a linear function, it suffices to
verify that this function coincides with f at every vertex of S(π ), which is true by
construction.
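This gives a simple O(n log n) evaluation procedure for the Lovász extension: sort the coordinates, then accumulate the values of f at the vertices E π,k with weights given by consecutive coordinate differences. A sketch in the spirit of formula (13.34) (Python; names are ours), checked against the closed form min(x, y) − min(x, z) of Example 13.5:

```python
# Evaluate the Lovasz extension f^L at a point X of U^n by sorting.
def lovasz_extension(f, X):
    n = len(X)
    pi = sorted(range(n), key=lambda i: X[i])        # x_{pi(1)} <= ... <= x_{pi(n)}
    xs = [0.0] + [X[i] for i in pi] + [1.0]          # conventions x_{pi(0)}=0, x_{pi(n+1)}=1
    total = 0.0
    for k in range(1, n + 2):                        # vertices E^{pi,1}, ..., E^{pi,n+1}
        E = tuple(1 if i in pi[k - 1:] else 0 for i in range(n))
        total += (xs[k] - xs[k - 1]) * f(E)
    return total

f = lambda X: X[0] * X[1] - X[0] * X[2]              # f(x, y, z) = xy - xz
print(lovasz_extension(f, (1.0, 1.0, 0.0)))          # 1.0 = min(1,1) - min(1,0)
print(lovasz_extension(f, (0.5, 1.0, 0.5)))          # 0.0 = min(.5,1) - min(.5,.5)
```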
Remark. Some authors have recently started to use the term “pseudo-Boolean
optimization problems” to designate 0–1 linear programming problems of the
form (13.4)–(13.6), possibly subject to inequality constraints; see Eén and Sörens-
son [290]; Manquinho and Roussel [667], and so on. This usage is likely to create
confusion with the classically accepted definition of pseudo-Boolean optimization
problems, and we do not encourage it.
where the polynomials g and h do not depend on xi , then g is the (unique) poly-
nomial expression of Ji f . In other words, the polynomial expression of Ji f is
obtained by writing the partial derivative ∂f/∂xi of the polynomial expression of f
with respect to xi .
Fortet [343] and Hammer and Rudeanu [460] observed that the local maxima of
a function are characterized by a system of implications involving its derivatives
(compare with (13.36)).
Let now Mi be an arbitrary upper bound on |Ji f | (for instance, the sum of the
absolute values of all coefficients in the polynomial representation of Ji f ). Then,
it is easily seen that an equivalent characterization of the local maxima of f is
given by the system of inequalities
and Zabih [147]; Davoine, Hammer, and Vizvári [262]; Hansen and Jaumard [468];
Hvattum, Løkketangen, and Glover [513]; Lodi, Allemand, and Liebling [620],
Merz and Freisleben [681], and so on.
From a theoretical perspective, however, things are not so nice. Indeed, it can
be shown that in order to find a local maximum of a pseudo-Boolean function
of n variables, such local search procedures may require a number of steps that
grows exponentially with n (see Emamy-K. [311]; Hammer, Simeone, Liebling,
and de Werra [464]; Hoke [496]; Tovey [866, 867, 868] for related investigations)
or with the encoding size of the polynomial expression of f (see Schäffer and Yan-
nakakis [806]). Moreover, Schäffer and Yannakakis [806] proved that computing
a local maximum of a quadratic pseudo-Boolean function belongs to a class of
hard (so-called PLS-complete), and likely intractable, local search problems (see
also Pardalos and Jha [730]).
Finally, it should be observed that the value of f may be arbitrarily worse in a
local maximum of f than in its global maximum (see Exercise 5 at the end of the
chapter).
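The single-flip local search procedure discussed above can be sketched as follows (function names and the example are ours). The second call illustrates how a local maximum can be strictly worse than the global one:

```python
def local_search_maximize(f, n, start):
    """Flip one coordinate at a time as long as this strictly increases f;
    the returned point is a local maximum of f (not necessarily global)."""
    X = list(start)
    improved = True
    while improved:
        improved = False
        for i in range(n):
            Y = X[:]
            Y[i] = 1 - Y[i]          # flip bit i
            if f(tuple(Y)) > f(tuple(X)):
                X, improved = Y, True
    return tuple(X)

# A local maximum can be much worse than the global one (cf. Exercise 5):
h = lambda X: 10 * X[0] * X[1] - X[0] - X[1]
assert local_search_maximize(h, 2, (0, 0)) == (0, 0)   # local max, h = 0
assert local_search_maximize(h, 2, (1, 1)) == (1, 1)   # global max, h = 8
```

Each flip strictly increases f, so the procedure terminates; the exponential lower bounds cited above show that the number of flips cannot be polynomially bounded in general.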
Then, setting f1 = t1 +h, we have reduced the maximization of the original function
f0 in n variables to the maximization of f1 , which only depends on n − 1 variables:
Indeed, if (x_2^∗, x_3^∗, . . . , x_n^∗) is a maximum of f_1, then setting x_1^∗ to either 0 or 1 according to rules (13.41) yields a maximum of f_0.
Repeating this elimination process n times produces a sequence of pseudo-Boolean functions f_0, f_1, . . . , f_n, where f_i depends on n − i variables, and eventually allows us to determine a (global) maximum of f_0 by backtracking. (Note the analogy with the elimination techniques for the solution of Boolean equations in Section 2.11.4.) This elimination approach was put to systematic use in Boros and Hammer [127], Boros and Prékopa [145], and so on.
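A minimal table-based sketch of one elimination step, and of the backtracking that recovers an optimal point, might look as follows (the helper names and the small cubic example are ours; a practical implementation would of course work symbolically rather than by enumeration):

```python
from itertools import product

def eliminate_first_variable(f, n):
    """Return f1 with f1(x2,...,xn) = max(f(0,x2,...,xn), f(1,x2,...,xn)),
    together with the maximizing choice of x1 for each sub-point."""
    table, choice = {}, {}
    for Y in product([0, 1], repeat=n - 1):
        v0, v1 = f((0,) + Y), f((1,) + Y)
        table[Y] = max(v0, v1)
        choice[Y] = 0 if v0 >= v1 else 1
    return (lambda Y: table[Y]), choice

# Eliminate x1 and x2, maximize over x3, then backtrack to recover x2, x1:
f0 = lambda X: 2 * X[0] - 3 * X[0] * X[1] + X[1] * X[2] + X[2]
f1, choice1 = eliminate_first_variable(f0, 3)
f2, choice2 = eliminate_first_variable(f1, 2)
x3 = max([0, 1], key=lambda v: f2((v,)))
x2 = choice2[(x3,)]
x1 = choice1[(x2, x3)]
assert f0((x1, x2, x3)) == max(f0(X) for X in product([0, 1], repeat=3))
```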
Theorem 13.12 also suggests that continuous global optimization techniques
can be applied to f pol to compute the maximum of f . This approach has not
proved computationally efficient in past experiments, but it remains conceptually
valuable.
This 0–1 linear model can be handled, in principle, by any algorithm for the solution of integer programming problems. The analysis of its facial structure has been initiated by Balas and Mazzola [44, 45]. Its continuous relaxation, meaning the linear programming problem obtained after replacing the integrality requirements (13.49) and (13.50) by the weaker constraints 0 ≤ x_i ≤ 1 (i = 1, 2, . . . , n) and 0 ≤ y_k ≤ 1 (k = 1, 2, . . . , m), yields an easily computable upper bound W^{std}
Proof. This follows from Theorem 13.10, which shows that the Lovász extension is linear on every simplex S(π); hence, its maximum is necessarily attained at a vertex of B^n.
Proof. For any point X^∗ ∈ B^n, let us observe first that the set
S(X^∗) = {k ∈ {1, 2, . . . , m} | T_k(X^∗) = 1}
is a stable set of the graph G(ψ). Indeed, no two terms in S(X^∗) can conflict, since otherwise, at least one of them would vanish at the point X^∗. Hence,
ψ(X^∗) = Σ_{k∈S(X^∗)} b_k ≤ α(ψ) for all X^∗ ∈ B^n.
weighted stable set problem to posiform maximization can also be inferred from
the following observations:
First, for a posiform ψ on B^n given by (13.51), consider an arbitrary variable x_i
and define the sets
Ak = {i ∈ {1, 2, . . . , n} | k ∈ Pi },
Bk = {i ∈ {1, 2, . . . , n} | k ∈ Ni },
bk = w(k).
13.5 Approximations
In this section, we briefly discuss the problem of approximating a pseudo-Boolean function f on B^n by a “simpler” function. Hammer and Holzman [441] considered the specific version of this problem in which the objective is to find a function g of degree k, for a predetermined value of k, which minimizes the L_2-norm
Σ_{X∈B^n} [f(X) − g(X)]^2.   (13.52)
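For small n, the minimizer of (13.52) can be computed explicitly. One convenient route (our own illustration, not the method of [441]) uses the fact that the parity functions χ_S(X) = Π_{i∈S}(1 − 2x_i) form an orthonormal basis under the uniform measure on B^n, so that the L_2-best approximation of degree at most k is simply the truncation of the expansion of f in this basis:

```python
from itertools import combinations, product

def chi(S, X):
    """Parity function chi_S(X) = prod_{i in S} (1 - 2*x_i); chi_() = 1."""
    p = 1
    for i in S:
        p *= 1 - 2 * X[i]
    return p

def best_degree_k_approx(f, n, k):
    """L2-best approximation of f by a function of degree <= k: truncate
    the expansion of f in the orthonormal parity basis to levels |S| <= k."""
    points = list(product([0, 1], repeat=n))
    coeff = {S: sum(f(X) * chi(S, X) for X in points) / len(points)
             for d in range(k + 1) for S in combinations(range(n), d)}
    return lambda X: sum(c * chi(S, X) for S, c in coeff.items())

# The best linear (k = 1) approximation of x1*x2 is x1/2 + x2/2 - 1/4:
g = best_degree_k_approx(lambda X: X[0] * X[1], 2, 1)
assert all(abs(g(X) - (X[0] / 2 + X[1] / 2 - 0.25)) < 1e-9
           for X in product([0, 1], repeat=2))
```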
assume that |A_1| ≥ 2, and select j, ℓ ∈ A_1. Let y be a new 0–1 variable, different from x_1, x_2, . . . , x_n, let M be a positive constant, and define
g(x_1, x_2, . . . , x_n, y) = c_1 y Π_{i∈A_1\{j,ℓ}} x_i + Σ_{k=2}^{m} c_k Π_{i∈A_k} x_i − M (x_j x_ℓ − 2 x_j y − 2 x_ℓ y + 3 y).
If M is large enough, then the maximum value of f over B^n is equal to the maximum value of g over B^{n+1}.
Proof. Consider any point (X^∗, y^∗) ∈ B^{n+1}. It is easy to check that the expression x_j^∗ x_ℓ^∗ − 2 x_j^∗ y^∗ − 2 x_ℓ^∗ y^∗ + 3 y^∗ is equal to 0 when y^∗ = x_j^∗ x_ℓ^∗, and is strictly positive otherwise.
Assume now that M is large (say, M > |c_1|). Then, f(X^∗) = g(X^∗, y^∗) for all (X^∗, y^∗) ∈ B^{n+1} such that y^∗ = x_j^∗ x_ℓ^∗, and g(X^∗, y^∗) < f(X^∗) for all other points in B^{n+1}. The claim follows directly.
Note that, after applying the transformation described in Theorem 13.17, the degree of the first term of g is equal to |A_1| − 1. Thus, repeatedly applying this transformation eventually yields a function of degree 2 that has the same maximum value as f.
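The substitution can be checked by brute force on a small instance (the example, the choice M = 6, and the variable layout are ours; the penalty term is written as in the proof above):

```python
from itertools import product

# f(x1,x2,x3) = 5*x1*x2*x3: a single cubic term to be reduced.
f = lambda X: 5 * X[0] * X[1] * X[2]

# Substitute y = X[3] for x1*x2 and subtract the penalty
# M*(x1*x2 - 2*x1*y - 2*x2*y + 3*y), which is 0 when y = x1*x2
# and >= 1 otherwise.
M = 6  # any M > |c1| = 5 works here
g = lambda X: (5 * X[3] * X[2]
               - M * (X[0] * X[1] - 2 * X[0] * X[3] - 2 * X[1] * X[3] + 3 * X[3]))

max_f = max(f(X) for X in product([0, 1], repeat=3))
max_g = max(g(X) for X in product([0, 1], repeat=4))
assert max_f == max_g == 5
```

At every point with y = x1*x2, the penalty vanishes and g reproduces f exactly; at all other points, the penalty of at least M pushes g strictly below the maximum.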
called the roof dual of f , that majorizes f (x1 , x2 , . . . , xn ) in every binary point
and that has the following property of strong persistency: If lj is strictly positive
(respectively, negative), then xj is equal to 1 (respectively, 0) in every maximizer
of f . Thus, in some cases, strong persistency allows the determination of the
optimal values of a subset of variables.
Note that the maximum of l(X) over B^n is simply equal to ρ(f) = l_0 + Σ_{j=1}^{n} max(l_j, 0), and ρ(f) provides an upper bound on the maximum of f over B^n. Hammer, Hansen, and Simeone [440] proved that ρ(f) is exactly the optimal value W^{std} of the continuous relaxation of the 0–1 linear programming model (13.45)–(13.50) associated with the polynomial expression of f or with any posiform of f.
Moreover, the equality ρ(f ) = maxX∈Bn f (X) holds if and only if an associated
quadratic Boolean function is consistent; therefore, the optimality of ρ(f ) can be
tested in polynomial time (see Exercise 10 in Chapter 5).
The determination of the roof dual l(X) was derived in [440] from the solution
of the continuous relaxation of the model (13.45)–(13.50); Boros, Hammer, and Sun
[125, 134] showed that the computation of the roof dual can be efficiently reduced
to a maximum flow problem. We refer again to the survey by Boros and Hammer
[127] for additional details, as well as to Boros, Crama, and Hammer [113, 114]
or Boros, Lari, and Simeone [143] for extensions of roof duality theory.
The convex hull of the set of 0–1 solutions of (13.46)–(13.50) is called the quadric
polytope, or correlation polytope. Its facial structure was investigated by Padberg
[723] and by several other authors; see also Deza and Laurent [270] and Laurent
and Rendl [601].
There is a huge number of papers discussing exact or heuristic optimization
algorithms for quadratic pseudo-Boolean functions, and it is impossible to cite
them all here. Among recent ones, let us only mention a variety of approaches by
Billionnet and Elloumi [85]; Boros, Hammer, and Tavares [136]; Glover and Hao
[386]; Gueye and Michelon [420]; Hansen and Meyer [472]; Lodi, Allemand, and
Liebling [620]; Merz and Freisleben [681]; Palubeckis [724], and so on, as well
as efficient implementations of the roof duality computations in the framework
of computer vision applications by Kolmogorov and Rother [576] and Rother,
Kolmogorov, Lempitsky, and Szummer [794].
where C is a large enough constant (say, C = Σ_{j=1}^{n} w_j + t). One easily verifies that J_i f ≥ 0 for i = 1, 2, . . . , n, and that J_{n+1} f = r^2(x_1, x_2, . . . , x_n) − 1. Hence, f
is not monotone if and only if there exists X∗ ∈ Bn such that r(X ∗ ) = 0, that is, if
and only if the Subset Sum problem has a “Yes” answer.
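The monotonicity criterion used in this proof, namely J_i f(X) ≥ 0 for every i and every X, can of course be checked naively by enumeration; the check is exponential in n, in line with the hardness result above (function names and examples are ours):

```python
from itertools import product

def is_nondecreasing(f, n):
    """Check that J_i f(X) >= 0 for every index i and every point X,
    by exhaustive enumeration of B^n."""
    for X in product([0, 1], repeat=n):
        for i in range(n):
            X1 = X[:i] + (1,) + X[i + 1:]
            X0 = X[:i] + (0,) + X[i + 1:]
            if f(X1) - f(X0) < 0:    # this difference is J_i f(X)
                return False
    return True

assert is_nondecreasing(lambda X: 2 * X[0] + X[0] * X[1], 2)
assert not is_nondecreasing(lambda X: X[0] - 3 * X[0] * X[1], 2)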
where bk ≥ 0 for k = 1, 2, . . . , m.
Suppose that some term of ψ contains at least one complemented variable, say, B_1 ≠ ∅, and let
q_1(X) = b_1 Π_{i∈A_1} x_i.
We claim that q1 (X) ≤ f (X) for all X ∈ Bn . Indeed, assume that q1 (X ∗ ) > f (X ∗ )
for some point X^∗ ∈ B^n. Then, we define another point Y^∗ ∈ B^n as follows: y_j^∗ = x_j^∗ for all j ∉ B_1, and y_j^∗ = 0 for all j ∈ B_1. For this point Y^∗,
p1 (Y ∗ ) = q1 (X ∗ ) > f (X ∗ ) ≥ f (Y ∗ )
(the last inequality holds because f is nondecreasing). But the conclusion p1 (Y ∗ ) >
f (Y ∗ ) is in contradiction with the definition of the DNF expression of f .
Thus, there holds q_1(X) ≤ f(X) for all X, and (13.53) leads to
f(X) = q_1(X) ∨ ⋁_{k=2}^{m} p_k(X).
Repeating this procedure for each term of the DNF ψ, we eventually conclude that
the expression obtained by dropping all complemented literals from ψ is again a
DNF of f (compare with Theorem 1.24 in Chapter 1).
Proof. We focus on the first statement, since the second one follows immediately
by sign reversal.
Suppose first that f is supermodular and consider two indices i < j (note that
Ji Ji f (X) ≡ 0). In view of Definition 13.10,
and
Y ∗ = (x1 , . . . , xi−1 , 0, xi+1 , . . . , xj −1 , 1, xj +1 , . . . , xn ),
we see that Ji Jj f ≥ 0 holds as a consequence of (13.54).
Conversely, assume that (13.56) holds, and let X0 , Y 0 ∈ Bn . We are going to estab-
lish that (13.54) holds for X0 , Y 0 by induction on the Hamming distance d(X0 , Y 0 )
between X0 and Y 0 , where
d(X, Y) = Σ_{i=1}^{n} |x_i − y_i|.
Moreover,
The sequence of Theorems 13.18 and 13.21 has been extended in Crama, Hammer, and
Holzman [232] and Foldes and Hammer [339] to the characterization of functions
with nonnegative derivatives of higher order (see also Choquet [192]).
Theorem 13.21 has several corollaries for a pseudo-Boolean function f given by
its polynomial expression.
First, notice that f is linear if and only if all its second-order derivatives
are identically zero. This implies that linear functions are exactly those pseudo-
Boolean functions that are simultaneously supermodular and submodular; they are
sometimes called “modular” in the literature.
Example 13.9. A prime example of a linear pseudo-Boolean function is provided by a probability measure on a finite set. Linearity is due to the defining identity
Prob(A) = Σ_{j∈A} Prob({j}) for all A ⊆ {1, 2, . . . , n},
Consider now the quadratic case. It follows from Theorem 13.21 that a quadratic function f is supermodular if and only if all its quadratic terms have nonnegative coefficients (Nemhauser, Wolsey, and Fisher [708]). This property can easily be
checked in polynomial time.
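The equivalence can be illustrated by checking the defining inequality f(X ∨ Y) + f(X ∧ Y) ≥ f(X) + f(Y) exhaustively on small instances (the brute-force checker and the examples are ours; for quadratic functions, the sign test on the quadratic coefficients gives the same answer):

```python
from itertools import product

def is_supermodular(f, n):
    """Brute-force check of f(X v Y) + f(X ^ Y) >= f(X) + f(Y)
    for all pairs of points of B^n."""
    for X in product([0, 1], repeat=n):
        for Y in product([0, 1], repeat=n):
            join = tuple(max(a, b) for a, b in zip(X, Y))
            meet = tuple(min(a, b) for a, b in zip(X, Y))
            if f(join) + f(meet) < f(X) + f(Y) - 1e-9:
                return False
    return True

# Quadratic case: supermodular iff every quadratic coefficient is >= 0,
# regardless of the signs of the linear terms.
assert is_supermodular(lambda X: -5 * X[0] + 2 * X[0] * X[1] + X[1] * X[2], 3)
assert not is_supermodular(lambda X: 4 * X[0] - X[0] * X[1], 2)
```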
The second-order derivatives of cubic functions are linear functions. Hence,
the minimum and maximum of these derivatives can be efficiently computed.
This implies in turn that supermodular and submodular cubic functions can also
be recognized in polynomial time. On the other hand, the following result was
independently established by Crama [226] and by Gallo and Simeone [364].
Theorem 13.22. It is co-NP-complete to decide whether a pseudo-Boolean func-
tion expressed in polynomial form is supermodular (or submodular), even when
the input is restricted to polynomials of degree 4.
Proof. The proof is similar to the proof of Theorem 13.19. We leave it as an end-of-
chapter exercise to the reader.
(If) Assume that f L is concave and let X, Y ∈ B n . Observe that the points
X ∧ Y and X ∨ Y are in a same simplex S(π ) ∈ S since X ∧ Y ≤ X ∨ Y . Thus, we
successively derive:
(1/2) f(X) + (1/2) f(Y) = (1/2) f^L(X) + (1/2) f^L(Y)   (since f^L is an extension of f)
≤ f^L((1/2)(X + Y))   (by concavity of f^L)
= f^L((1/2)(X ∨ Y) + (1/2)(X ∧ Y))
= (1/2) f^L(X ∨ Y) + (1/2) f^L(X ∧ Y)   (by linearity of f^L on S(π))
= (1/2) f(X ∨ Y) + (1/2) f(X ∧ Y)   (since f^L is an extension of f).
(Only if) Assume that f is supermodular. Recall that, for an arbitrary permutation π of {1, 2, . . . , n}, f^L_{S(π)} denotes the unique linear extension of f on S(π) and that it can be expressed by Equation (13.34). By a slight abuse of notation, we look at f^L_{S(π)} as being defined on R^n, rather than on S(π) only.
Consider now an arbitrary point X ∈ U^n, and assume that x_{π∗(1)} ≤ x_{π∗(2)} ≤ . . . ≤ x_{π∗(n)}, meaning that X is in the simplex S(π∗) and f^L(X) = f^L_{S(π∗)}(X). We are going to prove that, for every other permutation π,
f^L_{S(π∗)}(X) ≤ f^L_{S(π)}(X).   (13.60)
It will then follow that
f^L(X) = f^L_{S(π∗)}(X) = min_{S(π)∈S} f^L_{S(π)}(X),   (13.61)
ρ(j) = π(j + 1), ρ(j + 1) = π(j), and ρ(i) = π(i) for all i ≠ j, j + 1.
f^L_{S(π∗)}(X) = f^L_{S(ρ∗)}(X) ≤ f^L_{S(π)}(X).
The proof of Theorem 13.23, in particular, Equation (13.61), shows that every supermodular function can be represented as the lower envelope of linear (pseudo-Boolean) functions. Interestingly, supermodular functions can also be shown to be upper envelopes of linear functions; this result is discussed in Rosenmüller [791], where it is used to characterize extreme rays of the cone of nonnegative supermodular functions.
Let us now turn to the problem of optimizing supermodular functions.
Grötschel, Lovász, and Schrijver [414] were the first to prove that supermodular functions can be maximized in polynomial time, even when the function can only be
accessed via an oracle (that is, a black-box algorithm which returns the value f (X)
for every input X ∈ B n ). Another proof of this result was provided by Lovász
[624], as a direct consequence of Theorem 13.23, of the fact that concave functions
can be maximized over convex sets in polynomial time, and of the observation
that maxX∈U n f L (X) = maxX∈Bn f (X) (Theorem 13.14).
Strongly polynomial combinatorial algorithms for the maximization of super-
modular functions were subsequently proposed by Iwata, Fleischer, and Fujishige
[524] and Schrijver [813]; see also Fujishige [351] and Schrijver [814], as well as
the surveys by Iwata [523] and McCormick [637].
When a supermodular function is given by its polynomial expression and is
either quadratic or cubic, then its maximization can be reduced to a max-flow min-
cut problem in an associated network (compare with Equation (13.1) in Section
13.1; see for instance Balinski [47], Billionnet and Minoux [84], Hansen and
Simeone [474], Kolmogorov and Zabih [577], Picard and Ratliff [746], Rhys
[786], Živný, Cohen, and Jeavons [938], and Section 13.6.4 hereunder for related
considerations).
Finally, let us remark that even though the maximum of a supermodular (or the minimum of a submodular) function can be computed in polynomial time, the opposite optimization problems, namely, the maximization of a submodular function and the minimization of a supermodular one, are NP-hard; this follows easily, for instance, from the NP-hardness of the max-cut problem and of the weighted stability problem in graphs; see Application 13.1. However, a standard greedy procedure for the maximization of a submodular set function provides a (1 − 1/e)-approximation of the maximum; see Fisher, Nemhauser, and Wolsey [332, 708]; Fujito [352]; Nemhauser and Wolsey [706]; Wolsey [923], and so on. Goldengorin
[393] reviews theoretical results about the structure of local and global maxima of
submodular functions, and discusses specialized maximization algorithms.
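A minimal sketch of this greedy procedure, in the classical Nemhauser-Wolsey-Fisher setting of a nondecreasing submodular set function maximized under a cardinality constraint, might look as follows (function names and the coverage example are ours):

```python
def greedy_max(f, ground, k):
    """Greedy (1 - 1/e)-approximation for maximizing a nondecreasing
    submodular set function f subject to |S| <= k (assumes k <= |ground|)."""
    S = set()
    for _ in range(k):
        # Pick the element with the largest marginal gain f(S + e) - f(S).
        best = max(ground - S, key=lambda e: f(S | {e}) - f(S))
        if f(S | {best}) - f(S) <= 0:
            break
        S.add(best)
    return S

# Coverage functions are submodular: f(S) = size of the union of the sets.
sets = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c", "d", "e"}, 4: {"a"}}
f = lambda S: len(set().union(*(sets[i] for i in S))) if S else 0
S = greedy_max(f, set(sets), 2)
assert f(S) == 5   # greedy picks set 3, then set 1, covering all five elements
```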
[Figures 13.1, 13.2, and 13.3: diagrams of the implications holding among the classes of unate, almost-positive, polar, unimodular, supermodular, and supermodular-after-switching functions.]
So, for each value of r, there exists a hyperplane which separates the vertices
of B n where f takes value at most r from those where it takes value larger than r.
Definition 13.18. A pseudo-Boolean function f on B n is unimax if it has a unique
local maximum in B n . It is completely unimodal if, for each face F of Bn , the
restriction of f to F is unimax.
This terminology is due to Hammer et al. [464], who proved that threshold
functions are completely unimodal. Completely unimodal functions were also
examined by Emamy-K. [311], Hoke [496, 497], and Wiedemann [907], and
unimax functions by Tovey [866, 867].
The main motivation for considering unimax functions is that local maximiza-
tion algorithms could be expected to perform well for such functions. Indeed, if
f is a unimax function, then the decision version of the maximization problem
is in NP ∩ co-NP, since the global maximum of f is “well-characterized” [866];
based on this observation, it has been conjectured that unimax functions can be
maximized in polynomial time. (Pardalos and Jha [730] proved that it is NP-hard
to find the global maximum of a quadratic pseudo-Boolean function even when
this global maximum is unique; however, this does not seem to have immediate
consequences for unimax functions.)
When f is completely unimodal, Hammer et al. [464] proved that there always
exists an increasing path of length at most n from any point X ∈ Bn to the maximum
of f . However, rather surprisingly, it has also been shown that simple local search
procedures may perform an exponential number of steps before they reach a local
(and global) maximum of a completely unimodal function; we refer to the above-
mentioned references or to papers by Björklund, Sandberg, and Vorobyov [93, 94,
95] for related investigations and for applications in game theory and computer-
aided verification.
Crama [226] proved that the recognition problem is NP-hard for threshold,
completely unimodal, and unimax functions expressed in polynomial form. The
question remains open, however, for quadratic unimax functions.
13.7 Exercises
1. Prove that the conjunction of two pseudo-Boolean elementary conjunctions
is an elementary conjunction.
2. Show that condition (ii) in Lemma 13.1 does not completely characterize
the prime implicants of a pseudo-Boolean function.
3. Consider the (simple) game associated with a Boolean function f , and
let βi denote the Banzhaf index of player i, as in Section 1.13.3.
Show that (β_1, β_2, . . . , β_n) is proportional to the vector of first derivatives (J_1 f^{pol}(C), J_2 f^{pol}(C), . . . , J_n f^{pol}(C)) evaluated at the point C = (1/2, 1/2, . . . , 1/2). (See Owen [720].)
4. (a) Show that every point X ∈ U^n can be written in a unique way as a linear combination of the form
X = Σ_{k=1}^{K} λ_k X^k,   (13.62)
f^L(X) = Σ_{k=1}^{K} λ_k f(X^k) subject to (13.62).
(1/2) Σ_{1≤i<j≤n} c(i, j).
Conclude that every graph contains a cut of capacity at least equal to half
the sum of the edge capacities, and that such a cut can be found efficiently.
(See Erdős [312] and Sahni and Gonzalez [798].)
7. Prove that every pseudo-Boolean function f on B^n has a posiform ψ of the form (13.51) such that b_0 = min_{X∈B^n} f(X).
8. Prove that the optimal value of the linear relaxation of (13.45)–(13.50) is exactly
the maximum of the concave standard extension ψ^{std} (13.29).
9. Show that the hyperbolic or fractional programming problem
max_{X∈B^n} f(X) = (a_0 + Σ_{j=1}^{n} a_j x_j) / (b_0 + Σ_{j=1}^{n} b_j x_j)
can be solved in polynomial time when b_0 + Σ_{j=1}^{n} b_j x_j > 0 for all X ∈ B^n, but is NP-hard when this condition does not hold. (See Boros and Hammer [127]; Hammer and Rudeanu [460]; Hansen, Poggi de Aragão, and Ribeiro [473].)
10. Prove that it is co-NP-complete to decide whether a pseudo-Boolean function
expressed in DNF is monotone.
11. Prove Theorem 13.22.
12. Prove that the concave envelope of a supermodular pseudo-Boolean function
is its Lovász extension.
13. Show that the classes of almost-positive, supermodular, and polar functions
are not closed under switching.
14. Establish all the implications displayed in Figures 13.1, 13.2, and 13.3, and show that they cannot be reversed.
15. Prove that threshold pseudo-Boolean functions are completely unimodal.
16. If f is a completely unimodal function on Bn , prove that there always exists
an increasing path of length at most n from any point X ∈ B n to the maximum
of f .
17. Prove that
(a) it is NP-hard to decide whether a quadratic pseudo-Boolean function has a unique global maximum;
(b) it is NP-hard to find the maximum of a quadratic pseudo-Boolean function even if we know that the global maximum is unique. (Pardalos and Jha [730].)
This appendix proposes a short primer on graph and hypergraph theory. It sums up
the basic concepts and terminology used in the remainder of the monograph. For
(much) more information, we refer the reader to numerous excellent books dealing
in-depth with this topic, such as Bang-Jensen and Gutin [51]; Berge [71, 72];
Brandstädt, Le, and Spinrad [152]; Golumbic [398]; Mahadev and Peled [645]; or
Schrijver [814].
[Figure A.1. Representation of a small graph on the vertex set {1, 2, 3, 4, 5, 6}.]
A.1.1 Subgraphs
Let G = (V , E) be a graph. A graph H = (W , A) is a subgraph of G if W ⊆ V and
A ⊆ E. We say that H is the subgraph of G induced by W if A is exactly the set of
edges of G that have both of their endpoints in W ; namely, if A = {e ∈ E : e ⊆ W }.
We sometimes denote by GW the subgraph of G induced by W .
A subset of vertices S ⊆ V is said to be a stable set (or an independent set) of
G if S does not contain any edge of G. The subset S is a clique of G if every pair
of vertices of S is an edge. It is a transversal, or a vertex cover, if every edge in E
intersects S.
We denote by α(G) the maximum size of a stable set of G; by ω(G), the
maximum size of a clique of G; and by τ (G), the minimum size of a vertex cover.
A subset of edges M ⊆ E is called a matching of G if the edges in M are
pairwise disjoint. A matching is perfect if it contains 12 |V | edges, that is, if every
vertex of G is incident to an edge of the matching.
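For small graphs, the invariants α(G), ω(G), and τ(G) can be computed directly from these definitions by enumerating subsets (function names and the example are ours). Note in passing the identity τ(G) = |V| − α(G): the complement of a stable set is a vertex cover, and vice versa.

```python
from itertools import combinations

def graph_invariants(V, E):
    """Brute-force alpha(G), omega(G), tau(G) for a small graph G = (V, E),
    where E is a list of 2-element sets."""
    def stable(S):
        return not any(set(e) <= S for e in E)   # S contains no edge
    def clique(S):
        return all({u, v} in E for u, v in combinations(S, 2))
    def cover(S):
        return all(set(e) & S for e in E)        # S meets every edge
    subsets = [set(S) for r in range(len(V) + 1) for S in combinations(V, r)]
    alpha = max(len(S) for S in subsets if stable(S))
    omega = max(len(S) for S in subsets if clique(S))
    tau = min(len(S) for S in subsets if cover(S))
    return alpha, omega, tau

# A 4-cycle: alpha = 2, omega = 2, tau = 2 = |V| - alpha.
E = [{1, 2}, {2, 3}, {3, 4}, {4, 1}]
assert graph_invariants({1, 2, 3, 4}, E) == (2, 2, 2)
```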
The walk (A.1) is a path if all its vertices (and hence, all its edges) are distinct: v_i ≠ v_j for 1 ≤ i < j ≤ k + 1. The walk (A.1) is a circuit if it is closed (v_1 = v_{k+1}), if v_1, v_2, . . . , v_k are all distinct, and if e_1, e_2, . . . , e_k are all distinct.
A connected component of G = (V , E) is a maximal subset S ⊆ V such that,
for all u, v ∈ S, u and v are the endpoints of a path in G. So, connected components
are the equivalence classes of the equivalence relation “u and v are connected by
a path.” A graph is connected if it has a unique connected component.
[Figure A.2. (a): P4, (b): C4, (c): K4.]
[Figure A.3. A tree rooted at r.]
[Figure A.4. An arborescence rooted at r.]
such that the cardinality of A is minimum with this property. If D is acyclic, then
D has a unique transitive reduction.
A.3 Hypergraphs
A hypergraph, or set system, is a pair of sets H = (V , E), where V is the set of
vertices of H, and the elements of E are subsets of V called edges (or hyperedges)
of the hypergraph.
Hypergraphs constitute a natural generalization of (undirected) graphs: Indeed,
a graph is nothing but a hypergraph with edges of cardinality 2. As such, many of
the concepts introduced for graphs can be extended (often in more than one way)
to hypergraphs.
For instance, a subset of vertices is said to be stable in H if it does not contain
any edge of H, and it is a transversal of H if it intersects every edge of H. A
matching is a set of pairwise disjoint edges of H.
A clutter (or Sperner family, or simple hypergraph) is a hypergraph H = (V, E) with the property that no edge is a subset of another edge: If A ∈ E, B ∈ E, and A ≠ B, then A ⊄ B.
Appendix B
Algorithmic complexity
By and large, we assume that the readers of this book have at least some intu-
itive knowledge about algorithms and complexity. For the sake of completeness,
however, we provide in this appendix an informal introduction to fundamental
concepts of computational complexity: problems, algorithms, running time, easy
and hard problems, etc. For a more thorough and rigorous introduction to this
topic, we refer the reader to the classical monograph by Garey and Johnson [371],
or to other specialized books like Aho, Hopcroft and Ullman [11], Papadim-
itriou [725], or Papadimitriou and Steiglitz [726]. Note that Cook et al. [211]
and Schrijver [814] also provide gentle introductions to the topic, much in the
spirit of this appendix.
In Section B.8, we propose a short primer on the complexity of list-generating
algorithms; such algorithms are usually not discussed in basic textbooks on
complexity theory, but they arise naturally in several chapters of our book.
Problem D
Instance: A word W ∈ I ∗ .
Question: Is W contained in D?
Quadratic equations
Instance: Three integers a, b, c ∈ N.
Question: Does the equation ax^2 + bx + c = 0 have a solution in R?
Tree
Instance: A graph G = (V , E).
Question: Is G a tree?
Hamiltonian graph
Instance: A graph G = (V , E).
Question: Does G contain a Hamiltonian circuit, that is, does G contain a circuit
that visits every vertex exactly once?
DNF Equation
Instance: A DNF expression φ(X).
Question: Is the equation φ(X) = 0 consistent?
B.2 Algorithms
In order to solve a problem, we like to rely on an algorithm, that is, on a step-
by-step procedure that describes how to compute a solution for each instance of
the problem. Thus, for the problem Quadratic equations, the algorithm may
consist in computing the resolvent ρ = b^2 − 4ac, in testing whether ρ ≥ 0, and in
returning the answer either “Yes” or “No” depending on the outcome of the test.
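The algorithm just described is short enough to be written out in full (the function name is ours; since the instance allows a = 0, the degenerate linear case is handled separately, a detail the discriminant test alone does not cover):

```python
def has_real_root(a, b, c):
    """Decide the problem Quadratic equations: does a*x^2 + b*x + c = 0
    have a solution in R? For a != 0 this is exactly the test
    b^2 - 4ac >= 0; the case a = 0 reduces to a linear equation."""
    if a == 0:
        return b != 0 or c == 0   # linear equation bx + c = 0, or 0 = 0
    return b * b - 4 * a * c >= 0

assert has_real_root(1, -3, 2)       # roots x = 1 and x = 2
assert not has_real_root(1, 0, 1)    # x^2 + 1 = 0 has no real solution
assert has_real_root(0, 2, -4)       # linear case: x = 2
```

Each step (a multiplication, a comparison) takes constant time on fixed-size numbers, which is what makes this a transparent example of an efficient algorithm.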
More formally, algorithms (and computers) can be modelled in many different
ways, such as Turing machines or random access machines (RAMs). A sketchy
description of Turing machines [11, 371, 725] will suffice for our purpose. (In
fact, we only need this description for the proof of Cook’s theorem, in Section B.7
hereunder. So, the reader may choose to skip the following definitions in a first
reading and to return to them later if necessary.)
A one-tape Turing machine A consists of
where m ∈ {−1, +1}. Then, the processor changes its state from q to q′, the RW head replaces the symbol σ by σ′ in cell k, and the head moves to cell k + m (that is, it moves either one step to the left or one step to the right). Iteration i + 1 can
begin.
We say that the Turing machine A accepts the word W ∈ I ∗ if it halts in state
qY when applied to W . The set of words (that is, the language) accepted by A is, by
definition, a decision problem DA . Note that when A is applied to an input word
that does not belong to DA , then A may either halt in the state qN , or it may go on
computing forever. Since we are not fond of endless computations, we introduce
one more concept: Namely, we say that the Turing machine A solves the decision
problem D if D = DA and if A halts for all inputs W ∈ I ∗ . Thus, A returns the
answer “Yes” when W ∈ D, and it returns “No” otherwise.
We also note, for the record, that if a Turing machine A halts for all inputs
W ∈ I ∗ , then it can be used to compute a function fA : I ∗ → I ∗ , where fA (W )
is the word written on the tape when A halts, disregarding all blank symbols.
Despite its apparent simplicity, the Turing machine model is surprisingly power-
ful and can be used to simulate complex computations, such as those performed by
real-world computers. Therefore, in the remainder of this appendix and throughout
most of the book, we do not distinguish between “algorithms” and Turing machines
unless the distinction is absolutely required. We refer again to the literature cited
earlier for a discussion of the relation between Turing machines and other mod-
els of computation. Roughly speaking, however, the basic idea is that all these
models are “essentially equivalent” from the point of view of their computational
efficiency. Which brings us to our next topic...
does not increase too fast with the size of the instances that it solves: We consider
such an algorithm to be efficient.
The complexity class P contains the set of all problems that can be solved by a
polynomial-time algorithm (or Turing machine):
φ = x̄_1 x_2 x̄_3 ∨ x_1 x̄_2 x_3 ∨ x̄_1 x̄_2 x_4 ∨ x̄_1 x_3 ∨ x_2 x_3 x_4 ∨ x_4 x_5 x_6 ∨ x̄_4 x̄_5 x̄_6 ∨ x_1 x_3 x̄_4 ∨ x_3 x_5 x̄_6.
It may not be easy for you to decide whether the equation φ = 0 is consistent or
not. (Try!) But since we are nice people, we can provide some help: In fact, we
can assure you of the existence of a solution, and we can even convince you easily
that we are not lying. Indeed, X ∗ = (1, 0, 0, 1, 0, 0) is a solution.
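Verifying such a certificate is indeed a mechanical matter. In the sketch below (our own encoding of the DNF φ above), each term is a list of signed literals, with +i standing for x_i and −i for its complement:

```python
# Terms as lists of signed literals: +i stands for x_i, -i for its complement.
phi = [[-1, 2, -3], [1, -2, 3], [-1, -2, 4], [-1, 3], [2, 3, 4],
       [4, 5, 6], [-4, -5, -6], [1, 3, -4], [3, 5, -6]]

def dnf_value(phi, X):
    """Evaluate a DNF at the 0-1 point X, where X[i-1] is the value of x_i."""
    lit = lambda l: X[l - 1] if l > 0 else 1 - X[-l - 1]
    return max(min(lit(l) for l in term) for term in phi)

# Checking the certificate is easy, even if finding it was not:
assert dnf_value(phi, (1, 0, 0, 1, 0, 0)) == 0
```

The verification runs in time linear in the size of φ, which is exactly the kind of polynomial-time check that the definition of NP formalizes.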
Now, a crucial point in this example is that you do not need to know how we
have found the solution in order to convince yourself of its correctness: We may
have stumbled upon it by chance (nondeterministically), or guessed it otherwise.
What matters is that, once you hold the candidate X ∗ , it is easy to check that the
equation φ = 0 is indeed consistent. (This situation is not as strange as it may
initially appear; mathematicians, in particular, do not usually have to explain how
they came up with the proof of a new theorem: Their professional community only
requires that they be able to verify the validity of the alleged proof.)
Let us now generalize this idea. We say that a decision problem D ⊆ I^∗ is in the class NP if there exists a problem D′ ∈ P and a polynomial p′(n) such that, for every word W ∈ I^∗, the following statements are equivalent:
(a) W ∈ D, that is, W is a Yes-instance of D.
(b) There exists a certificate V ∈ I^∗ such that |V| ≤ p′(|W|) and such that (V, W) ∈ D′.
To relate this formal definition to the previous discussion, note that for every Yes-instance W ∈ D, there must exist a certificate V (in our previous example, a candidate solution X^∗) that is reasonably short relative to W (this is ensured by the condition |V| ≤ p′(|W|)), and such that checking the condition W ∈ D boils down to verifying that (V, W) ∈ D′ (in our example, verifying that φ(X^∗) = 0). Moreover, the condition (V, W) ∈ D′ must be testable in polynomial time; this is ensured by the assumption that D′ ∈ P.
It is easy to see that P ⊆ NP, meaning that every polynomially solvable problem is in NP. Indeed, if D ∈ P, then it suffices to choose D′ = D and p′(n) ≡ 0 in the definition of NP (with V the empty string).
It is also quite obvious that the problem DNF Equation is in NP, just like
Hamiltonian graph and numerous other combinatorial problems (for example,
any Hamiltonian circuit can be used to certify that a graph is Hamiltonian). To
date, however, nobody has been able to devise a polynomial-time algorithm for
DNF Equation or for Hamiltonian graph; that is, nobody knows whether these
problems are in P or in NP\P.
The vast majority of mathematicians and computer scientists actually believe that P ≠ NP, but this famous conjecture has resisted all proof attempts (and there have been many) since the early 70s. To better appreciate this conjecture, it is
useful to introduce the concepts of polynomial-time reductions and of NP-complete
problems.
W ∈ D if and only if W ∈ D .
Equation [208] (see also Levin [610]). We sketch a proof of this seminal result
later, in Section B.7.
Once we get hold of a first NP-complete problem D, it becomes easier to establish that another problem D′ is also NP-complete: Indeed, to reach this conclusion, it suffices to prove that D′ is at least as hard as D, or, more precisely, that D is reducible to D′. This type of reduction has been provided for thousands
of decision problems, starting with the work of Cook [208] and Karp [550]; see
also Ausiello et al. [36], Crescenzi and Kann [244], or Garey and Johnson [371].
Several examples of NP-completeness proofs are given in the book.
Note also that the existence of NP-complete problems has interesting consequences for the “P vs. NP” question stated above: Namely, to validate the conjecture that P ≠ NP, it is sufficient to prove that at least one NP problem cannot be solved in polynomial time, and NP-complete problems are most natural candidates for
this purpose. Moreover, the equality P = NP holds if and only if at least one
NP-complete problem happens to be polynomially solvable.
A decision problem is called NP-hard if it is at least as hard as every problem in
NP; in particular, if an NP-hard problem can be solved in polynomial time, then so
can every NP-complete problem. NP-complete and co-NP-complete problems are
NP-hard, as are certain problems that are not known to be either in NP or in co-NP.
Proof. We only sketch the main arguments of the proof, leaving aside some of the
technical fine points, and we refer the reader to the specialized literature for details
(see Cook’s original paper or Garey and Johnson [371]).
The proof of the theorem heavily relies on the observation that the computa-
tions performed by a Turing machine can be “encoded” by the solution of a DNF
equation, much in the same way that the output of a combinational circuit can be
implicitly represented by the solution of a DNF equation (see Section 1.13.2). So,
we start with a demonstration of this fact.
Consider an arbitrary decision problem D, and suppose that D is solved in
polynomial time by a Turing machine A. The complexity of A is bounded by a
polynomial p(n) for every instance of size n ∈ N.
For simplicity, and without loss of generality, we assume that A works on the
encoding alphabet B ∪ {Z}, so that an input word of size n can be viewed as a
point in B^n.
We make the following claim:
Claim. For every n ∈ N, there is an integer m = O(p(n)^2) and a Boolean DNF
φ(X, Y, z) (where X ∈ B^n, Y ∈ B^m, and z ∈ B) with the property that, for every
point X* ∈ B^n, the machine A accepts X* if and only if the equation
φ(X*, Y, 1) = 0 is consistent.
To establish the claim, we describe the computation of A on X* by three groups
of Boolean variables:
• variables y^Q_{q,t}: their intended meaning is that y^Q_{q,t} = 1 if A is in state q at
iteration t;
• variables y^H_{k,t}, where y^H_{k,t} = 1 if the RW head scans cell k at iteration t;
• variables y^C_{σ,k,t}, where y^C_{σ,k,t} = 1 if σ is the symbol contained in cell k at
iteration t.
For the variables (y^Q_{q,t}, y^H_{k,t}, y^C_{σ,k,t}) to correctly describe the (uniquely defined)
configuration of the Turing machine at every iteration t, there must hold
(a) y^Q_{q_0,1} = 1 (the machine is initially in state q_0) and, for all q ∈ Q \ {q_0},
y^Q_{q,1} = 0;
(b) y^H_{1,1} = 1 (the RW head initially scans cell 1) and, for all k ≠ 1, y^H_{k,1} = 0;
(c) if k ∈ {1, 2, . . . , n} and σ = x_k, then y^C_{σ,k,1} = 1; if k ∈ K \ {1, 2, . . . , n} and
σ = Z, then y^C_{σ,k,1} = 1; for all other pairs (σ, k), y^C_{σ,k,1} = 0.
At every iteration, the variables must also describe a valid configuration resulting
from a correct transition from the previous configuration, meaning that
(d) for all t ∈ {1, . . . , p(n)}, for all k ∈ K, for all q ∈ Q \ {qY, qN}, for all σ, with
(q′, σ′, m) = T(q, σ), for all q″ ≠ q′, for all k′ ≠ k + m, for all k″ ≠ k, for
all σ″:
if y^Q_{q,t} = 1 and y^H_{k,t} = 1 and y^C_{σ,k,t} = 1, then
y^Q_{q′,t+1} = 1, y^Q_{q″,t+1} = 0 (the machine enters state q′ at iteration t + 1),
y^H_{k+m,t+1} = 1, y^H_{k′,t+1} = 0 (the RW head scans cell k + m at iteration t + 1),
y^C_{σ′,k,t+1} = 1, y^C_{σ″,k,t+1} = 0 (cell k contains the symbol σ′ at iteration t + 1),
y^C_{σ″,k″,t+1} = y^C_{σ″,k″,t} (all cells other than cell k remain unchanged);
(e) if A is in a halting state at iteration t (that is, y^Q_{q,t} = 1 for some q ∈ {qY, qN}),
then the configuration remains unchanged at iteration t + 1: in particular,
y^H_{k,t+1} = y^H_{k,t} for all k ∈ K, and similarly for the state and cell variables;
(f) the variable z records the outcome of the computation: z = 1 if and only if
y^Q_{qY,p(n)+1} = 1.
The conditions (a)–(f) are easily translated into a DNF equation φ(X, Y, z) = 0.
For instance, condition (c) can be written as

⋁_{k=1}^{n} ( y^C_{0,k,1} x_k ∨ ȳ^C_{0,k,1} x̄_k ∨ y^C_{1,k,1} x̄_k ∨ ȳ^C_{1,k,1} x_k ∨ y^C_{Z,k,1} ) ∨ ⋁_{k ∈ K\{1,...,n}} ( y^C_{0,k,1} ∨ y^C_{1,k,1} ∨ ȳ^C_{Z,k,1} ) = 0.
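The terms of condition (c) can be generated mechanically. The following sketch does so in Python; the variable-naming scheme (strings of the form yC_σ,k,1) and the explicit cell range are our own choices, not taken from the text:

```python
# Generating the terms of condition (c) of the Cook encoding (a sketch;
# variable names and the cell range are our assumptions, not the book's).

def condition_c_terms(n, cells):
    """Terms of a DNF that must equal 0: cell k initially holds x_k for
    k <= n and the blank symbol Z elsewhere. Literals are (name, sign)."""
    y = lambda sym, k: f"yC_{sym},{k},1"
    x = lambda k: f"x{k}"
    terms = []
    for k in range(1, n + 1):
        terms += [
            [(y(0, k), True), (x(k), True)],    # cell holds 0 although x_k = 1
            [(y(0, k), False), (x(k), False)],  # cell fails to hold 0 when x_k = 0
            [(y(1, k), True), (x(k), False)],
            [(y(1, k), False), (x(k), True)],
            [(y("Z", k), True)],                # input cells are never blank
        ]
    for k in cells:
        if k > n:
            terms += [[(y(0, k), True)], [(y(1, k), True)], [(y("Z", k), False)]]
    return terms

print(len(condition_c_terms(2, range(1, 5))))  # 5*2 + 3*2 = 16
```

Every term must evaluate to 0, which forces the tape variables at iteration 1 to match the input word, exactly as condition (c) requires.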
φ(V, W*, Y, 1) = 0
is consistent. Observe that when the equation has a solution (V*, W*, Y*, 1), the
point V* describes the certificate associated to W*, and the point Y* describes the
steps of the verification of the certificate by A. This completes the proof of Cook’s
theorem.
If D expresses the consistency of a DNF equation, then the counting problem asks
for the number of solutions of the equation, and the list-generation problem consists
in generating all solutions of the equation (these problems are considered in
Sections 2.11.1 and 2.11.2, respectively). Similarly, if D expresses
the property “C is a prime implicant of the function f ”, and if the input W encodes
f (in some predetermined format), then the counting problem asks for the number
of prime implicants of f , and the list-generation problem requires the production
of all prime implicants of f (see Chapter 3).
We do not discuss the complexity of counting problems in detail here, as we
encounter very few of them in this book. Let us simply say that we call #P-complete
those counting problems that are “hardest” among a natural class of
counting problems (essentially, among those counting problems such that the prop-
erty (V , W ) ∈ D can be verified in polynomial time). We refer to [371, 725, 883]
for details.
By contrast, we find it necessary to discuss more formally the complexity
of list-generation algorithms. The main difficulty here is that the number of
solutions V satisfying the property (V , W ) ∈ D may be much larger than the
size |W | of the input; to put it another way, the size of the output of a list-
generation problem may be exponentially large in the size of its input, and hence,
no polynomial-time algorithm can possibly exist for such a problem. There-
fore, it makes sense to measure the complexity of list-generation algorithms as
a function of their input size and of their output size. This notion has been formal-
ized and used by many authors; early references include Read and Tarjan [781];
Valiant [883]; Lawler, Lenstra, and Rinnooy Kan [605]; and Johnson, Yannakakis,
and Papadimitriou [538].
Consider a binary relation D ⊆ I* × I* and the associated list-generation
problem L_D. Let A be a list-generation algorithm for L_D, and suppose that, when
running on the input W, A outputs the list V_1, V_2, . . . , V_m, in that order. Note that
the value of m depends on W but is independent of A. We take it as a measure
of the output size of L_D for the instance W ∈ I* (remember that the size of each
solution V_1, V_2, . . . , V_m has been assumed to be polynomially bounded in the size
of W).
For k = 1, . . . , m, we denote by τ (k) the running time required by A to
output the first k elements of the list, that is, to generate V1 , V2 , . . . , Vk . So,
τ (m) is the total running time of A on W , and if we let τ (0) = 0, then
τ (k) − τ (k − 1) is the time elapsed between the (k − 1)-st and the k-th outputs, for
k = 1, 2, . . . , m.
Following the terminology of Johnson, Yannakakis, and Papadimitriou [538],
we say that
• A runs in polynomial total time if τ(m) is bounded by a polynomial in |W| and m;
• A runs in polynomial incremental time if, for k = 1, 2, . . . , m, τ(k) is bounded
by a polynomial in |W| and k;
• A runs with polynomial delay if, for k = 1, 2, . . . , m, τ(k) − τ(k − 1) is bounded
by a polynomial in |W|.
Polynomial total time is, in a sense, the weakest notion of polynomiality that
can be applied to L_D, since the running time of any algorithm for L_D must grow
at least linearly with m.
Polynomial incremental time captures the idea that the algorithm A outputs
the solutions of L_D sequentially and does not spend “too much time” between
two successive outputs. Indeed, the definition implies that τ(k) − τ(k − 1) is
polynomially bounded in |W| and k, for all k. When generating the next element
in the list, however, the algorithm may need to look at all previous outputs, and
therefore, we allow τ(k) to depend on k as well as on the input size |W|.
Finally, an algorithm runs with polynomial delay when the time elapsed between
two successive outputs is polynomial in the input size of the problem. This is a
rather strong requirement, the strongest, in fact, among those discussed by Johnson,
Yannakakis, and Papadimitriou [538].
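Polynomial delay can be illustrated on a concrete family. For a positive (monotone) DNF φ, the points with φ(X) = 0 can be generated by backtracking, because a partial assignment extends to such a point exactly when completing it with zeros already gives φ = 0. The sketch below is our own example, not one taken from the text, and the representation of terms as sets of variable indices is our assumption:

```python
# Polynomial-delay generation of all false points of a positive DNF
# (a backtracking sketch; each term is a frozenset of variable indices).

def false_points(terms, n):
    """Yield all X in {0,1}^n with phi(X) = 0, where phi = OR of ANDs of
    positive variables. Extendability test: fill the free variables with 0;
    by positivity this is the best chance of keeping phi = 0."""
    def is_zero(point):
        return all(any(point[v] == 0 for v in t) for t in terms)

    def extend(prefix):
        if len(prefix) == n:
            yield tuple(prefix)
            return
        for bit in (0, 1):
            candidate = prefix + [bit] + [0] * (n - len(prefix) - 1)
            if is_zero(candidate):              # polynomial test per branch
                yield from extend(prefix + [bit])

    yield from extend([])

# phi = x0 x1 ∨ x2 : the false points avoid {x0 = x1 = 1} and have x2 = 0
for p in false_points([frozenset({0, 1}), frozenset({2})], 3):
    print(p)
```

Between two consecutive outputs, the search descends at most n levels and performs at most two polynomial extendability tests per level, so the delay is polynomial in the input size.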
In order to better understand the complexity of the list-generation problem L_D,
it is also useful to grasp its relation with the following problem:
NEXT-GEN_D
Instance: A word W ∈ I*, and a set K of words such that (V, W) ∈ D for all
V ∈ K.
Output: Either find a word V ∉ K such that (V, W) ∈ D, or prove that no such
word exists.
Clearly, if problem NEXT-GEN_D can be solved in polynomial time (meaning, in
time polynomial in |W| and |K|), then L_D can be solved in polynomial incremental
time: Indeed, starting from the empty list K = ∅, one can iteratively generate
solutions of L_D by solving a sequence of instances of NEXT-GEN_D, until we can
conclude that all solutions of L_D have been generated.
Boros et al. [117] pointed out that, somewhat surprisingly, the converse relation
also holds (see also Lawler et al. [605]). Namely, if algorithm A solves the list-
generation problem L_D in polynomial incremental time, then NEXT-GEN_D can
be solved in polynomial time for every input (W, K) by a single run of A on the
input W, which can be aborted after the generation of the first |K| + 1 solutions.
Thus, investigating the complexity of NEXT-GEN_D provides valuable insights
into the complexity of the list-generation problem L_D.
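The iterative scheme described above can be summarized in a short sketch: a generic wrapper that solves L_D by repeated calls to a NEXT-GEN_D routine. The oracle used in the example (`divisors_next_gen`, which lists the divisors of an integer) is a toy illustration of ours, not a real NEXT-GEN_D solver:

```python
# Sketch of solving the list-generation problem L_D via NEXT-GEN_D.

def generate_all(W, next_gen):
    """List all V with (V, W) in D, given next_gen(W, K) that returns a
    solution outside K, or None when K already contains them all."""
    K = []
    while True:
        V = next_gen(W, set(K))
        if V is None:
            return K          # every solution has been generated
        K.append(V)

# Toy oracle: the "solutions" for input W are its divisors (illustration only).
def divisors_next_gen(W, K):
    return next((d for d in range(1, W + 1) if W % d == 0 and d not in K), None)

print(generate_all(12, divisors_next_gen))  # [1, 2, 3, 4, 6, 12]
```

If each call to `next_gen` runs in time polynomial in |W| and |K|, the wrapper runs in polynomial incremental time, which is exactly the relation stated in the text.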
Appendix C
C.1 Introduction
JBool is an application designed for teaching and illustrative purposes. It allows
users to work with Boolean functions in disjunctive normal form (DNF) or in
conjunctive normal form (CNF), and to easily manipulate the concepts described
in this book or test conjectures on small-size examples. It is not an industrial
software package, and it is not optimized to tackle large problems.
JBool can be downloaded freely from http://hdl.handle.net/2268/
72714. The user interface is written in Java and the core engine for Boolean
functions is written in ANSI C. The Java application requires a Java Runtime
Environment (JRE) 1.3 or later, and binaries for the engine are available for the
following platforms:
Source code is available, so the engine can be compiled for other platforms as
well.
This appendix is organized as follows. First, the basic interface of the software
is presented in Section C.2. The tools available to create, load, or save a function
are described in Section C.3. The main functionalities of the software are then
successively examined: modifying the elements of the function being edited
(Section C.4), creating several representations of the same function (Section C.5.1),
applying various operators to the current function (Section C.5.2), and performing
operations on several functions and testing properties of the current function (Section C.5). More
details on all these functionalities can be found in the on-line help of the
software.
• The [File] menu gives access to standard functionalities like New, Open,
Save, and so on.
• The [Edit] menu contains classical commands like Cut, Copy, Paste, as well
as some functionalities that change the function form.
• The [Presentation] menu contains items that produce an equivalent Boolean
expression of the current function, like a dual form or an orthogo-
nal form. In each case, a new name is created with structure <Item
name>(<function name>).
A Boolean function is entered in the text zone as a list of words separated by the
operation sign (“+” or “&”). Each word starts with the alphabetical list of positive
literals, followed by the sign “-” and by the alphabetical list of negative
(complemented) literals. Empty words are allowed. For instance, the DNF
(a ∧ b̄ ∧ f̄) ∨ (c̄) ∨ (d̄ ∧ e) is written as a-bf + -c + e-d. Similarly, the CNF
(a ∨ b̄ ∨ f̄) ∧ (c̄) ∧ (d̄ ∨ e) is written as a-bf & -c & e-d.
An empty list of words (m_F = 0) represents a constant function (0 for a DNF, and 1
for a CNF) and is displayed as “F” (False) for a DNF and as “T” (True) for a CNF.
(One may also simply type “T” or “F” in the text zone.)
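The word syntax just described is simple to parse. The following is an illustrative sketch of ours; JBool's actual parser may differ on details such as whitespace handling:

```python
# A small parser for the JBool word syntax described above (a sketch).

def parse(expr):
    """Parse 'a-bf + -c + e-d' into a list of (positives, negatives) pairs;
    returns ('const', 'T' or 'F') for the constant expressions."""
    expr = expr.strip()
    if expr in ("T", "F"):
        return ("const", expr)
    # "+" separates the words of a DNF, "&" those of a CNF
    words = [w.strip() for w in expr.replace("&", "+").split("+")]
    result = []
    for w in words:
        pos, _, neg = w.partition("-")  # literals before "-" are positive
        result.append((sorted(pos), sorted(neg)))
    return result

print(parse("a-bf + -c + e-d"))
# [(['a'], ['b', 'f']), ([], ['c']), (['e'], ['d'])]
```

Note that an empty word parses to the pair `([], [])`, matching the convention that empty words are allowed.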
All Boolean expressions are automatically simplified according to the absorp-
tion laws
x ∧ (x ∨ y) = x, x ∨ (x ∧ y) = x.
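Absorption can be applied to a list of terms by deleting every term that contains another one as a subset, since x ∨ (x ∧ y) = x (and dually for clauses). A sketch of our own, with terms represented as sets of signed literals:

```python
# Absorption on a list of terms, as applied automatically by JBool (a
# sketch; terms are frozensets of signed literals, e.g. ('a', True)).

def absorb(terms):
    """Drop every term that strictly contains another term as a subset:
    x ∨ (x ∧ y) = x, and dually for CNF clauses."""
    terms = [frozenset(t) for t in terms]
    return [t for t in terms
            if not any(u < t for u in terms)]  # strict subset absorbs t

dnf = [{("x", True)}, {("x", True), ("y", False)}, {("y", True)}]
print(absorb(dnf))  # the term x ∧ ȳ is absorbed by x
```

Using a strict subset test keeps duplicate terms intact; deduplication, if desired, is a separate step.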
Example.txt
a + b + c-d
The [Rename function] item in the [File] menu opens a dialog for entering a
new name for the current Boolean function. The new name appears in the title bar
of the corresponding window.
The [Compact...] item in the [Edit] menu deletes all dummy variables in the
variable set (Varset) and replaces the rank of all the variables by the smallest
possible rank. For instance, the Boolean function e ∨ j ∨ hu becomes a ∨ c ∨ bd.
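The effect of [Compact...] can be mimicked as follows. This is our own sketch, restricted to positive literals for simplicity; JBool's implementation may differ:

```python
# A sketch of the [Compact...] operation described above: the variables
# actually used are renamed to the smallest possible ranks a, b, c, ...

import string

def compact(terms):
    """terms: list of sets of variable names; returns the renamed terms."""
    used = sorted(set().union(*terms))            # dummy variables disappear
    rename = dict(zip(used, string.ascii_lowercase))
    return [{rename[v] for v in t} for t in terms]

# e ∨ j ∨ hu becomes a ∨ c ∨ bd, as in the example above
print(compact([{"e"}, {"j"}, {"h", "u"}]))
```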
C.5.2 Constructions
The [Construction] menu allows various constructions of new functions from the
current one, for instance, by duplication, dualization, or complementation of the
current function; by assignment of values to subsets of literals; or by merging of
variables. The new function can also be obtained by extracting terms of a given
degree or by switching variables. In each case, a new function is created whose
name is <Item Name>(<Function name>).
[1] P.A. Abdulla, P. Bjesse and N. Eén, Symbolic reachability analysis based on SAT-solvers,
in: S. Graf and M. Schwartzbach, eds., Tools and Algorithms for the Construction and
Analysis of Systems, Lecture Notes in Computer Science, Vol. 1785, Springer-Verlag,
Berlin Heidelberg, 2000, pp. 411–425.
[2] J.A. Abraham, An improved algorithm for network reliability, IEEE Transactions on
Reliability R-28 (1979) 58–61.
[3] D. Achlioptas and Y. Peres, The threshold for random k-SAT is 2^k log 2 − O(k), Journal
of the American Mathematical Society 17 (2004) 947–973.
[4] D. Achlioptas and G.B. Sorkin, Optimal myopic algorithms for random 3-SAT, Proceed-
ings of the 41st Annual IEEE Symposium on the Foundations of Computer Science, IEEE,
2000, pp. 590–600.
[5] A. Adam, Truth Functions and the Problem of Their Realization by Two-Terminal Graphs,
Akademiai Kiado, Budapest, 1968.
[6] W.P. Adams and P.M. Dearing, On the equivalence between roof duality and Lagrangian
duality for unconstrained 0–1 quadratic programming problems, Discrete Applied
Mathematics 48 (1994) 1–20.
[7] K.K. Aggarwal, K.B. Misra and J.S. Gupta, A fast algorithm for reliability evaluation,
IEEE Transactions on Reliability R-24 (1975) 83–85.
[8] R. Agrawal, T. Imielinski and A. Swami, Mining association rules between sets of items in
large databases, International Conference on Management of Data (SIGMOD 93), 1993,
pp. 207–216.
[9] R. Agrawal, H. Mannila, R. Srikant, H. Toivonen and A.I. Verkamo, Fast discovery of
association rules, in: U.M. Fayyad et al., eds., Advances in Knowledge Discovery and
Data Mining, AAAI Press, Menlo Park, California, 1996, pp. 307–328.
[10] A.V. Aho, M.R. Garey and J.D. Ullman, The transitive reduction of a directed graph, SIAM
Journal on Computing 1 (1972) 131–137.
[11] A.V. Aho, J.E. Hopcroft and J.D. Ullman, The Design and Analysis of Computer
Algorithms, Addison-Wesley Publishing Company, Reading, MA, 1974.
[12] M. Aigner, Combinatorial Theory, Springer-Verlag, Berlin, Heidelberg, New York,
1979.
[13] H. Aizenstein, T. Hegedűs, L. Hellerstein and L. Pitt, Complexity theoretic hardness
results for query learning, Computational Complexity 7 (1998) 19–53.
[14] G. Alexe, P.L. Hammer, V. Lozin and D. de Werra, Struction revisited, Discrete Applied
Mathematics 132 (2003) 27–46.
[15] E. Algaba, J.M. Bilbao, J.R. Fernández Garcia and J.J. López, Computing power
indices in weighted multiple majority games, Mathematical Social Sciences 46 (2003)
63–80.
[16] E. Allender, L. Hellerstein, P. McCabe, T. Pitassi and M.E. Saks, Minimizing disjunctive
normal form formulas and AC0 circuits given a truth table, SIAM Journal on Computing
38 (2008) 63–84.
[17] N. Alon and P.H. Edelman, The inverse Banzhaf problem, Social Choice and Welfare 34
(2010) 371–377.
[18] M. Alonso-Meijide, B. Casas-Méndez, M.J. Holler and S. Lorenzo-Freire, Computing
power indices: Multilinear extensions and new characterizations, European Journal of
Operational Research 188 (2008) 540–554.
[19] H. Andreka and I. Nemeti, The generalized completeness of Horn predicate-logic as a
programming language, Research Report of the Department of Artificial Intelligence 21,
University of Edinburgh, 1976.
[20] D. Angluin, Learning propositional Horn sentences with hints, Research Report of the
Department of Computer Science 590, Yale University, 1987.
[21] D. Angluin, Queries and concept learning, Machine Learning 2 (1988) 319–342.
[22] D. Angluin, L. Hellerstein and M. Karpinski, Learning read-once formulas with queries,
Journal of the ACM 40 (1993) 185–210.
[23] M.F. Anjos, An improved semidefinite programming relaxation for the satisfiability
problem, Mathematical Programming 102 (2005) 589–608.
[24] M.F. Anjos, Semidefinite optimization approaches for satisfiability and maximum-
satisfiability problem, Journal on Satisfiability, Boolean Modeling and Computation
1 (2005) 1–47.
[25] M. Anthony, Discrete Mathematics of Neural Networks: Selected Topics, SIAM Mono-
graphs on Discrete Mathematics and Applications, SIAM, Philadelphia, 2001.
[26] M. Anthony, Probabilistic learning and Boolean functions, in Y. Crama and P.L. Hammer,
eds., Boolean Models and Methods in Mathematics, Computer Science, and Engineering,
Cambridge University Press, Cambridge, 2010, pp. 197–220.
[27] M. Anthony, Neural networks and Boolean functions, in Y. Crama and P.L. Hammer,
eds., Boolean Models and Methods in Mathematics, Computer Science, and Engineering,
Cambridge University Press, Cambridge, 2010, pp. 554–576.
[28] M. Anthony, Decision lists and related classes of Boolean functions, in Y. Crama and P.L.
Hammer, eds., Boolean Models and Methods in Mathematics, Computer Science, and
Engineering, Cambridge University Press, Cambridge, 2010, pp. 577–595.
[29] M. Anthony and N. Biggs, Computational Learning Theory, Cambridge University Press,
Cambridge, 1992.
[30] W.W. Armstrong, Dependency structures of database relationships, in: IFIP-74, North-
Holland, Amsterdam, 1974, pp. 580–583.
[31] T. Asano and D.P. Williamson, Improved approximation algorithms for MAX SAT, Work-
ing paper, IBM Almaden Research Center, 2000. Preliminary version in the Proceedings
of the 11th ACM-SIAM Symposium on Discrete Algorithms, 2000, pp. 96–105.
[32] R.L. Ashenhurst, The decomposition of switching functions, in: Proceedings of the
International Symposium on the Theory of Switching, Part I, Harvard University Press,
Cambridge, MA, 1959, pp. 75–116.
[33] B. Aspvall, Recognizing disguised NR(1) instances of the satisfiability problem, Journal
of Algorithms 1 (1980) 97–103.
[34] B. Aspvall, M.F. Plass and R.E. Tarjan, A linear-time algorithm for testing the truth of
certain quantified Boolean formulas, Information Processing Letters 8 (1979) 121–123.
[35] J. Astola and R.S. Stanković, Fundamentals of Switching Theory and Logic Design: A
Hands on Approach, Springer, Dordrecht, The Netherlands, 2006.
[36] G. Ausiello, P. Crescenzi, G. Gambosi, V. Kann, A. Marchetti-Spaccamela and M. Protasi,
Complexity and Approximation, Springer-Verlag, Berlin, 1999.
[37] G. Ausiello, A. D’Atri and D. Saccà, Minimal representation of directed hypergraphs,
SIAM Journal on Computing 15 (1986) 418–431.
[38] A. Avidor, I. Berkovitch and U. Zwick, Improved approximation algorithms for MAX
NAE-SAT and MAX SAT, in: T. Erlebach and G. Persiano, eds., Approximation and
[59] E. Benoist and J-J. Hebrard, Recognition of simple enlarged Horn formulas and simple
extended Horn formulas, Annals of Mathematics and Artificial Intelligence, 37 (2003)
251–272.
[60] M. Ben-Or and N. Linial, Collective coin flipping, in: S. Micali, ed., Randomness and
Computation, Academic Press, New York, 1990, pp. 91–115.
[61] C. Benzaken, Algorithmes de dualisation d’une fonction booléenne, R.F.T.I.-Chiffres 9
(1966) 119–128.
[62] C. Benzaken, Post’s closed systems and the weak chromatic number of hypergraphs,
Discrete Mathematics 23 (1978) 77–84.
[63] C. Benzaken, Critical hypergraphs for the weak chromatic number, Journal of Combina-
torial Theory B 29 (1980) 328–338.
[64] C. Benzaken, From logical gates synthesis to chromatic bicritical clutters, Discrete Applied
Mathematics 96–97 (1999) 259–305.
[65] C. Benzaken, S. Boyd, P.L. Hammer and B. Simeone, Adjoints of pure bidirected graphs,
Congressus Numerantium 39 (1983) 123–144.
[66] C. Benzaken, Y. Crama, P. Duchet, P.L. Hammer and F. Maffray, More characterizations
of triangulated graphs, Journal of Graph Theory 14 (1990) 413–422.
[67] C. Benzaken and P.L. Hammer, Linear separation of dominating sets in graphs, Annals of
Discrete Mathematics 3 (1978) 1–10.
[68] C. Benzaken, P.L. Hammer and B. Simeone, Graphes de conflit des fonctions pseudo-
booléennes quadratiques, in: P. Hansen and D. de Werra, eds., Regards sur la Théorie des
Graphes, Presses Polytechniques Romandes, Lausanne, 1980, pp. 165–170.
[69] C. Benzaken, P.L. Hammer and B. Simeone, Some remarks on conflict graphs of quadratic
pseudo-Boolean functions, International Series of Numerical Mathematics 55 (1980)
9–30.
[70] V.L. Beresnev, On a problem of mathematical standardization theory, Upravliajemyje
Sistemy 11 (1973) 43–54 (in Russian).
[71] C. Berge, Graphes et Hypergraphes, Dunod, Paris, 1970. (Graphs and Hypergraphs,
North-Holland, Amsterdam, 1973, revised translation.)
[72] C. Berge, Hypergraphs, North-Holland, Amsterdam, 1989.
[73] J. Berman and P. Köhler, Cardinalities of finite distributive lattices, Mitteilungen aus dem
Mathematischen Seminar Giessen 121 (1976) 103–124.
[74] P. Bertolazzi and A. Sassano, An O(mn) algorithm for regular set-covering problems,
Theoretical Computer Science 54 (1987) 237–247.
[75] P. Bertolazzi and A. Sassano, A class of polynomially solvable set-covering problems,
SIAM Journal on Discrete Mathematics 1 (1988) 306–316.
[76] D. Bertsimas and J. Tsitsiklis, Introduction to Linear Optimization, Athena Scientific,
Paris, 1997.
[77] A. Bhattacharya, B. DasGupta, D. Mubayi and G. Turán, On approximate Horn
minimization, manuscript, 2009.
[78] W. Bibel and E. Eder, Methods and calculi for deduction, in: D.M. Gabbay, C.J. Hogger and
J.A. Robinson, eds., Handbook of Logic in Artificial Intelligence and Logic Programming,
Vol. 1, Logical Foundations, Oxford Science Publications, Clarendon Press, Oxford, 1993,
pp. 67–182.
[79] J.M. Bilbao, Cooperative Games on Combinatorial Structures, Kluwer Academic
Publishers, Dordrecht, 2000.
[80] J.M. Bilbao, J.R. Fernández, A. Jiménez Losada and J.J. López, Generating functions for
computing power indices efficiently, Sociedad de Estadística e Investigación Operativa
Top 8 (2000) 191–213.
[81] J.M. Bilbao, J.R. Fernández, N. Jiménez and J.J. López, Voting power in the Euro-
pean Union enlargement, European Journal of Operational Research 143 (2002)
181–196.
[82] L.J. Billera, Clutter decomposition and monotonic Boolean functions, Annals of the New
York Academy of Sciences 175 (1970) 41–48.
[83] L.J. Billera, On the composition and decomposition of clutters, Journal of Combinatorial
Theory 11 (1971) 234–245.
[84] A. Billionnet and M. Minoux, Maximizing a supermodular pseudoboolean function:
A polynomial algorithm for supermodular cubic functions, Discrete Applied Mathematics
12 (1985) 1–11.
[85] A. Billionnet and S. Elloumi, Using a mixed integer quadratic programming solver for the
unconstrained quadratic 0–1 problem, Mathematical Programming 109 (2007) 55–68.
[86] J.C. Bioch, Dualization, decision lists and identification of monotone discrete functions,
Annals of Mathematics and Artificial Intelligence 24 (1998) 69–91.
[87] J.C. Bioch, Decomposition of Boolean functions, in: Y. Crama and P.L. Hammer, eds.,
Boolean Models and Methods in Mathematics, Computer Science, and Engineering,
Cambridge University Press, Cambridge, 2010, pp. 39–75.
[88] J.C. Bioch and T. Ibaraki, Generating and approximating non-dominated coteries, IEEE
Transactions on Parallel and Distributed Systems 6 (1995) 905–914.
[89] J.C. Bioch and T. Ibaraki, Complexity of identification and dualization of positive Boolean
functions, Information and Computation 123 (1995) 50–63.
[90] E. Birnbaum and E.L. Lozinskii, The good old Davis-Putnam procedure helps counting
models, Journal of Artificial Intelligence Research 10 (1999) 455–477.
[91] Z.W. Birnbaum, On the importance of different components in a multicomponent system,
in: P.R. Krishnaiah, ed., Multivariate Analysis-II, Academic Press, New York, 1969.
[92] Z.W. Birnbaum, J.D. Esary and S.C. Saunders, Multi-component systems and structures
and their reliability, Technometrics 3 (1961) 55–77.
[93] H. Björklund, S. Sandberg and S. Vorobyov, Optimization on completely unimodal hyper-
cubes, Technical Report TR-2002-018, Department of Information Technology, Uppsala
University, Sweden May 2002.
[94] H. Björklund, S. Sandberg and S. Vorobyov, Complexity of model checking by iterative
improvement: The pseudo-Boolean framework, in: M. Broy and A.V. Zamulin, eds.,
Perspectives of System Informatics 2003, Lecture Notes in Computer Science, Vol. 2890,
Springer-Verlag, Berlin-Heidelberg, 2003, pp. 381–394.
[95] H. Björklund and S. Vorobyov, Combinatorial structure and randomized subexponential
algorithms for infinite games, Theoretical Computer Science 349 (2005) 347–360.
[96] A. Björner, Homology and shellability of matroids and geometric lattices, in: N. White,
ed., Matroid Applications, Cambridge University Press, Cambridge, 1992, pp. 226–283.
[97] A. Björner, Topological methods, in: R. Graham, M. Grötschel and L. Lovász, eds.,
Handbook of Combinatorics, Elsevier, Amsterdam, 1995, pp. 1819–1872.
[98] C.E. Blair, R.G. Jeroslow and J.K. Lowe, Some results and experiments in program-
ming techniques for propositional logic, Computers and Operations Research 13 (1986)
633–645.
[99] A. Blake, Canonical Expressions in Boolean Algebras, Dissertation, Department of Math-
ematics, University of Chicago, 1937. Published by University of Chicago Libraries,
1938.
[100] B. Bollig, M. Sauerhoff, D. Sieling and I. Wegener, Binary decision diagrams, in:
Y. Crama and P.L. Hammer, eds., Boolean Models and Methods in Mathematics, Computer
Science, and Engineering, Cambridge University Press, Cambridge, 2010, pp. 473–505.
[101] B. Bollobás, C. Borgs, J. Chayes, J.H. Kim and D.B. Wilson, The scaling window of the
2-SAT transition, Random Structures and Algorithms 18 (2001) 201–256.
[102] T. Bonates and P.L. Hammer, Logical Analysis of Data: From combinatorial optimization
to medical applications, Annals of Operations Research 148 (2006) 203–225.
[103] G. Boole, An Investigation of the Laws of Thought, Walton, London, 1854. (Reprinted by
Dover Books, New York, 1954.)
[104] K.S. Booth, Boolean matrix multiplication using only O(n^(log_2 7) log n) bit operations,
SIGACT News 9 (Fall 1977) p. 23.
[105] E. Boros, Dualization of aligned Boolean functions, RUTCOR Research Report RRR
9-94, Rutgers University, Piscataway, NJ, 1994.
[106] E. Boros, Maximum renamable Horn sub-CNFs, Discrete Applied Mathematics 96–97
(1999) 29–40.
[107] E. Boros and O. Čepek, Perfect 0, ±1 matrices. Discrete Mathematics 165–166 (1997)
81–100.
[108] E. Boros, O. Čepek and A. Kogan, Horn minimization by iterative decomposition, Annals
of Mathematics and Artificial Intelligence 23 (1998) 321–343.
[109] E. Boros, O. Čepek, A. Kogan and P. Kučera, Exclusive and essential sets of implicates of
Boolean functions, RUTCOR Research Report 10-2008, Rutgers University, Piscataway,
NJ, 2008.
[110] E. Boros, O. Čepek, and P. Kučera, Complexity of minimizing the number of clauses and
literals in a Horn CNF, manuscript, 2010.
[111] E. Boros, Y. Crama, O. Ekin, P.L. Hammer, T. Ibaraki and A. Kogan, Boolean normal
forms, shellability and reliability computations, SIAM Journal on Discrete Mathematics
13 (2000) 212–226.
[112] E. Boros, Y. Crama and P.L. Hammer, Polynomial-time inference of all valid implications
for Horn and related formulae, Annals of Mathematics and Artificial Intelligence 1 (1990)
21–32.
[113] E. Boros, Y. Crama and P.L. Hammer, Upper bounds for quadratic 0–1 maximization,
Operations Research Letters 9 (1990) 73–79.
[114] E. Boros, Y. Crama and P.L. Hammer, Chvátal cuts and odd cycle inequalities in quadratic
0-1 optimization, SIAM Journal on Discrete Mathematics 5 (1992) 163–177.
[115] E. Boros, Y. Crama, P.L. Hammer, T. Ibaraki, A. Kogan and K. Makino, Logical Analysis
of Data: Classification with justification, Annals of Operations Research (2011), to appear.
[116] E. Boros, Y. Crama, P.L. Hammer and M. Saks, A complexity index for satisfiability
problems, SIAM Journal on Computing 23 (1994) 45–49.
[117] E. Boros, K.M. Elbassioni, V. Gurvich, L. Khachiyan and K. Makino, Dual-bounded
generating problems: All minimal integer solutions for a monotone system of linear
inequalities, SIAM Journal on Computing 31 (2002) 1624–1643.
[118] E. Boros, K.M. Elbassioni, V. Gurvich and K. Makino, Generating vertices of polyhe-
dra and related monotone generation problems, in: D. Avis, D. Bremner and A. Deza,
eds., Polyhedral Computations, CRM Proceedings and Lecture Notes, Vol. 48, Centre de
Recherches Mathématiques and AMS (2009) pp. 15–44.
[119] E. Boros, K.M. Elbassioni and K. Makino, On Berge multiplication for monotone Boolean
dualization, in: A. Luca et al., eds., Proceedings of the 35th International Colloquium on
Automata, Languages and Programming (ICALP), Lecture Notes in Computer Science,
Vol. 5125, Springer-Verlag, Berlin Heidelberg, 2008, pp. 48–59.
[120] E. Boros, S. Foldes, P.L. Hammer and B. Simeone, A restricted consensus algorithm for
the transitive closure of a digraph, manuscript, in preparation, 2008.
[121] E. Boros, V. Gurvich and P.L. Hammer, Dual subimplicants of positive Boolean functions,
Optimization Methods and Software 10 (1998) 147–156.
[122] E. Boros, V. Gurvich, P.L. Hammer, T. Ibaraki and A. Kogan, Decompositions of partially
defined Boolean functions, Discrete Applied Mathematics 62 (1995) 51–75.
[123] E. Boros, V. Gurvich, L. Khachiyan and K. Makino, Dual-bounded generating problems:
Partial and multiple transversals of a hypergraph, SIAM Journal on Computing 30 (2000)
2036–2050.
[124] E. Boros, V. Gurvich, L. Khachiyan and K. Makino, Dual-bounded generating problems:
Weighted transversals of a hypergraph, Discrete Applied Mathematics 142 (2004) 1–15.
[125] E. Boros and P.L. Hammer, A max-flow approach to improved roof duality in quadratic
0–1 minimization, RUTCOR Research Report RRR 15-1989, Rutgers University, 1989.
[126] E. Boros and P.L. Hammer, A generalization of the pure literal rule for satisfiability
problems, RUTCOR Research Report 20-92, Rutgers University, 1992.
[127] E. Boros and P.L. Hammer, Pseudo-Boolean optimization, Discrete Applied Mathematics
123 (2002) 155–225.
[128] E. Boros, P.L. Hammer and J.N. Hooker, Predicting cause-effect relationships from
incomplete discrete observations, SIAM Journal on Discrete Mathematics 7 (1994)
531–543.
[129] E. Boros, P.L. Hammer, T. Ibaraki and K. Kawakami, Polynomial time recognition of
2-monotonic positive Boolean functions given by an oracle, SIAM Journal on Computing
26 (1997) 93–109.
[130] E. Boros, P.L. Hammer, T. Ibaraki and A. Kogan, Logical analysis of numerical data,
Mathematical Programming 79 (1997) 163–190.
[131] E. Boros, P.L. Hammer, T. Ibaraki, A. Kogan, E. Mayoraz and I. Muchnik, An implemen-
tation of logical analysis of data, IEEE Transactions on Knowledge and Data Engineering
12 (2000) 292–306.
[132] E. Boros, P.L. Hammer, M. Minoux and D.J. Rader Jr., Optimal cell flipping to mini-
mize channel density in VLSI design and pseudo-Boolean optimization, Discrete Applied
Mathematics 90 (1999) 69–88.
[133] E. Boros, P.L. Hammer and X. Sun, The DDT method for quadratic 0–1 minimization,
RUTCOR Research Report 39-89, Rutgers University, 1989.
[134] E. Boros, P.L. Hammer and X. Sun, Network flows and minimization of quadratic pseudo-
Boolean functions, RUTCOR Research Report 17-91, Rutgers University, 1991.
[135] E. Boros, P.L. Hammer and X. Sun, Recognition of q-Horn formulae in linear time,
Discrete Applied Mathematics 55 (1994) 1–13.
[136] E. Boros, P.L. Hammer and G. Tavares, Local search heuristics for quadratic unconstrained
binary optimization, Journal of Heuristics 13 (2007) 99–132.
[137] E. Boros, T. Horiyama, T. Ibaraki, K. Makino and M. Yagiura, Finding essential attributes
from binary data, Annals of Mathematics and Artificial Intelligence 39 (2003) 223–257.
[138] E. Boros, T. Ibaraki and K. Makino, Boolean analysis of incomplete examples, in: R.
Karlsson and A. Lingas, eds., Algorithm Theory – SWAT’96, Lecture Notes in Computer
Science, Vol. 1097, Springer-Verlag, Berlin, 1996, pp. 440–451.
[139] E. Boros, T. Ibaraki and K. Makino, Error-free and best-fit extensions of partially defined
Boolean functions, Information and Computation 140 (1998) 254–283.
[140] E. Boros, T. Ibaraki and K. Makino, Logical analysis of binary data with missing bits,
Artificial Intelligence 107 (1999) 219–264.
[141] E. Boros, T. Ibaraki and K. Makino, Fully consistent extensions of partially defined
Boolean functions, in: J. van Leeuwen, O. Watanabe, M. Hagiya, P.D. Mosses and T. Ito,
eds., Theoretical Computer Science - International Conference IFIP TCS 2000, Lecture
Notes in Computer Science, Vol. 1872, Springer, Berlin, 2000, pp. 257–272.
[142] E. Boros, T. Ibaraki and K. Makino, Variations on extending partially defined Boolean
functions with missing bits, Information and Computation 180 (2003) 53–70.
[143] E. Boros, I. Lari and B. Simeone, Block linear majorants in quadratic 0–1 optimization,
Discrete Applied Mathematics 145 (2004) 52–71.
[144] E. Boros and A. Prékopa, Closed form two-sided bounds for probabilities that at least r or
exactly r out of n events occur, Mathematics of Operations Research 14 (1989) 317–342.
[145] E. Boros and A. Prékopa, Probabilistic bounds and algorithms for the maximum
satisfiability problem, Annals of Operations Research 21 (1989) 109–126.
[146] J.-M. Bourjolly, P.L. Hammer, W.R. Pulleyblank and B. Simeone, Boolean-combinatorial
bounding of maximum 2-satisfiability, in: O. Balci, R. Sharda, S. Zenios, eds., Computer
Science and Operations Research: New Developments in their Interfaces, Pergamon Press,
1992, 23–42.
[147] Y. Boykov, O. Veksler and R. Zabih, Fast approximate energy minimization via graph cuts,
IEEE Transactions on Pattern Analysis and Machine Intelligence 23 (2001) 1222–1239.
[148] G.H. Bradley, P.L. Hammer and L.A. Wolsey, Coefficient reduction for inequalities in
0–1 variables, Mathematical Programming 7 (1974) 263–282.
[149] P.S. Bradley, U.M. Fayyad and O.L. Mangasarian, Mathematical programming for data
mining: Formulations and challenges, INFORMS Journal on Computing 11 (1999)
217–238.
642 Bibliography
[150] S.J. Brams and P.J. Affuso, Power and size: A new paradox, Theory and Decision 7 (1976)
29–56.
[151] A. Brandstädt, P.L. Hammer, V.B. Le and V.V. Lozin, Bisplit graphs, Discrete Mathematics
299 (2005) 11–32.
[152] A. Brandstädt, V.B. Le and J.P. Spinrad, Graph Classes: A Survey, SIAM Monographs on
Discrete Mathematics and Applications, SIAM, Philadelphia, 1999.
[153] R.K. Brayton, G.D. Hachtel, C.T. McMullen, A.L. Sangiovanni-Vincentelli, Logic
Minimization Algorithms for VLSI Synthesis, Kluwer Academic Publishers, Boston, 1984.
[154] A. Bretscher, D.G. Corneil, M. Habib and C. Paul, A simple linear time LexBFS cograph
recognition algorithm (extended abstract), in: Proceedings of the 29th International Work-
shop on Graph-Theoretic Concepts in Computer Science, WG2003, Lecture Notes in
Computer Science, Vol. 2880, Springer-Verlag, Berlin Heidelberg, 2003, pp. 119–130.
[155] A. Bretscher, D.G. Corneil, M. Habib and C. Paul, A simple linear time LexBFS cograph
recognition algorithm, SIAM Journal on Discrete Mathematics 22 (2008) 1277–1296.
[156] F.M. Brown, Boolean Reasoning: The Logic of Boolean Equations, Kluwer Academic
Publishers, Boston - Dordrecht - London, 1990.
[157] J. Bruck, Fourier transforms and threshold circuit complexity, in: Y. Crama and
P.L. Hammer, eds., Boolean Models and Methods in Mathematics, Computer Science,
and Engineering, Cambridge University Press, Cambridge, 2010, pp. 531–553.
[158] R. Bruni, On the orthogonalization of arbitrary Boolean formulae, Journal of Applied
Mathematics and Decision Sciences 2 (2005) 61–74.
[159] R. Bruni and A. Sassano, A complete adaptive solver for propositional satisfiability,
Discrete Applied Mathematics 127 (2003) 523–534.
[160] R.E. Bryant, Graph-based algorithms for Boolean function manipulation, IEEE Transac-
tions on Computers 35 (1986) 677–691.
[161] N.H. Bshouty, Exact learning Boolean functions via the monotone theory, Information
and Computation 123 (1995) 146–153.
[162] N. Bshouty, T.R. Hancock and L. Hellerstein, Learning Boolean read-once formulas with
arbitrary symmetric and constant fan-in gates, Journal of Computer and System Sciences
50 (1995) 521–542.
[163] N. Bshouty and C. Tamon, On the Fourier spectrum of monotone functions, Journal of
the Association for Computing Machinery 43 (1996) 747–770.
[164] C. Buchheim and G. Rinaldi, Efficient reduction of polynomial zero-one optimization to
the quadratic case, SIAM Journal on Optimization 18 (2007) 1398–1413.
[165] C. Buchheim and G. Rinaldi, Terse integer linear programs for Boolean optimization,
Journal on Satisfiability, Boolean Modeling and Computation 6 (2009) 121–139.
[166] M. Buro and H. Kleine Büning, Report on a SAT competition, Report Nr. 110,
Mathematik/Informatik, Universität Paderborn, 1992.
[167] W. Büttner and H. Simonis, Embedding Boolean expressions into logic programming,
Journal of Symbolic Computation 4 (1987) 191–205.
[168] R. Cambini, G. Gallo and M.G. Scutellà, Flows on hypergraphs, Mathematical Program-
ming 78 (1997) 195–217.
[169] A. Caprara, M. Fischetti and P. Toth, A heuristic method for the set covering problem,
Operations Research 47 (1999) 730–743.
[170] C. Carlet, Boolean functions for cryptography and error-correcting codes, in: Y. Crama
and P.L. Hammer, eds., Boolean Models and Methods in Mathematics, Computer Science,
and Engineering, Cambridge University Press, Cambridge, 2010, pp. 257–397.
[171] C. Carlet, Vectorial Boolean functions for cryptography, in: Y. Crama and P.L. Hammer,
eds., Boolean Models and Methods in Mathematics, Computer Science, and Engineering,
Cambridge University Press, Cambridge, 2010, pp. 398–469.
[172] O. Čepek, Restricted consensus method and quadratic implicates of pure Horn functions,
RUTCOR Research Report 31, Rutgers University, Piscataway, NJ, September 1994.
[173] O. Čepek, Structural properties and minimization of Horn Boolean functions, Ph.D. thesis,
RUTCOR, Rutgers University, Piscataway, NJ, October 1995.
[174] O. Čepek and P. Kučera, Known and new classes of generalized Horn formulae with
polynomial recognition and SAT testing, Discrete Applied Mathematics 149 (2005) 14–52.
[175] O. Čepek and P. Kučera, On the complexity of minimizing the number of literals in Horn
formulae, RUTCOR Research Report 11-2008, Rutgers University, Piscataway, NJ, 2008.
[176] S. Ceri, G. Gottlob and L. Tanca, Logic Programming and Databases, Springer-Verlag,
Berlin Heidelberg, 1990.
[177] D. Chai and A. Kuehlmann, A fast pseudo-Boolean constraint solver, IEEE Transactions
on Computer-Aided Design of Integrated Circuits and Systems 24 (2005) 305–317.
[178] S.T. Chakradhar, V.D. Agrawal and M.L. Bushnell, Neural Models and Algorithms for
Digital Testing, Kluwer Academic Publishers, Boston - Dordrecht - London, 1991.
[179] A.K. Chandra, H.R. Lewis and J.A. Makowsky, Embedded implicational dependencies
and their inference problem, in: Proceedings of the 13th Annual ACM Symposium on the
Theory of Computation, ACM Press, New York, 1981, pp. 342–354.
[180] R. Chandrasekaran, Integer programming problems for which a simple rounding type
algorithm works, in: W.R. Pulleyblank, ed., Progress in Combinatorial Optimization,
Academic Press Canada, Toronto, 1984, pp. 101–106.
[181] V. Chandru, C.R. Coullard, P.L. Hammer, M. Montanez, and X. Sun. On renamable
Horn and generalized Horn functions, Annals of Mathematics and Artificial Intelligence
1 (1990) 33–47.
[182] V. Chandru and J.N. Hooker, Extended Horn sets in propositional logic, Journal of the
ACM 38 (1991) 205–221.
[183] V. Chandru and J.N. Hooker, Detecting embedded Horn structure in propositional logic,
Information Processing Letters 42 (1992) 109–111.
[184] V. Chandru and J.N. Hooker, Optimization Methods for Logical Inference, John Wiley &
Sons, New York etc., 1999.
[185] C.L. Chang, The unit proof and the input proof in theorem proving, Journal of the ACM
14 (1970) 698–707.
[186] C.-L. Chang and R.C. Lee, Symbolic Logic and Mechanical Theorem Proving, Academic
Press, New York - San Francisco - London, 1973.
[187] M.T. Chao and J. Franco, Probabilistic analysis of a generalization of the unit-clause
literal selection heuristic for the k-satisfiability problem, Information Sciences 51 (1990)
289–314.
[188] A. Chateauneuf and J.Y. Jaffray, Some characterizations of lower probabilities and other
monotone capacities through the use of Möbius inversion, Mathematical Social Sciences
17 (1989) 263–283.
[189] S.S. Chaudhry, I.D. Moon and S.T. McCormick, Conditional covering: Greedy heuristics
and computational results, Computers and Operations Research 14 (1987) 11–18.
[190] M. Chein, Algorithmes d’écriture de fonctions Booléennes croissantes en sommes et
produits, Revue Française d’Informatique et de Recherche Opérationnelle 1 (1967)
97–105.
[191] Y. Chen and D. Cooke, On the transitive closure representation and adjustable compres-
sion, in: SAC06 – Proceedings of the 21st Annual ACM Symposium on Applied Computing,
Dijon, France, 2006, pp. 450–455.
[192] G. Choquet, Theory of capacities, Annales de l’Institut Fourier 5 (1954) 131–295.
[193] C.K. Chow, Boolean functions realizable with single threshold devices, Proceedings of
the IRE 49 (1961) 370–371.
[194] C.K. Chow, On the characterization of threshold functions, in: IEEE Symposium on
Switching Circuit Theory and Logical Design, 1961, pp. 34–48.
[195] F.R.K. Chung, R.L. Graham and M.E. Saks, A dynamic location problem for graphs,
Combinatorica 9 (1989) 111–132.
[196] R. Church, Enumeration by rank of the elements of the free distributive lattice with 7
generators, Notices of the American Mathematical Society 12 (1965) 724.
[197] V. Chvátal, Edmonds polytopes and a hierarchy of combinatorial problems, Discrete
Mathematics 4 (1973) 305–337.
[198] V. Chvátal, A greedy heuristic for the set-covering problem, Mathematics of Operations
Research, 4 (1979) 233–235.
[199] V. Chvátal, Linear Programming, W.H. Freeman and Co., New York, 1983.
[200] V. Chvátal and C. Ebenegger, A note on line digraphs and the directed max-cut problem,
Discrete Applied Mathematics 29 (1990) 165–170.
[201] V. Chvátal and P.L. Hammer, Aggregation of inequalities in integer programming, Annals
of Discrete Mathematics 1 (1977) 145–162.
[202] V. Chvátal and B. Reed, Mick gets some (the odds are on his side), in: Proceedings of
the 33rd Annual IEEE Symposium on the Foundations of Computer Science, IEEE, 1992,
pp. 620–627.
[203] V. Chvátal and E. Szemerédi, Many hard examples for resolution, Journal of the
Association for Computing Machinery 35 (1988) 759–788.
[204] E. Clarke, A. Biere, R. Raimi and Y. Zhu, Bounded model checking using satisfiability
solving, Formal Methods in System Design 19 (2001) 7–34.
[205] C.J. Colbourn, The Combinatorics of Network Reliability, Oxford University Press,
New York, 1987.
[206] C.J. Colbourn, Boolean aspects of network reliability, in: Y. Crama and P.L. Hammer,
eds., Boolean Models and Methods in Mathematics, Computer Science, and Engineering,
Cambridge University Press, Cambridge, 2010, pp. 723–759.
[207] M. Conforti, G. Cornuéjols and C. de Francesco, Perfect 0, ±1 matrices, Linear Algebra
and its Applications 253 (1997) 299–309.
[208] S.A. Cook, The complexity of theorem-proving procedures, in: Proceedings of the Third
ACM Symposium on the Theory of Computing, 1971, pp. 151–158.
[209] S.A. Cook and D.G. Mitchell, Finding hard instances for the satisfiability problem: A
survey, in: D. Du, J. Gu and P.M. Pardalos, eds., Satisfiability Problem: Theory and Appli-
cations, DIMACS series in Discrete Mathematics and Theoretical Computer Science,
Vol. 35, American Mathematical Society, 1997, pp. 1–17.
[210] W.J. Cook, C.R. Coullard and Gy. Turán, On the complexity of cutting-plane proofs,
Discrete Applied Mathematics 18 (1987) 25–38.
[211] W.J. Cook, W.H. Cunningham, W.R. Pulleyblank and A. Schrijver, Combinatorial
Optimization, Wiley-Interscience, New York, 1998.
[212] D. Coppersmith and S. Winograd, On the asymptotic complexity of matrix multiplication,
SIAM Journal on Computing 11 (1982) 472–492.
[213] D. Corneil, H. Lerchs and L. Burlingham, Complement reducible graphs, Discrete Applied
Mathematics 3 (1981) 163–174.
[214] D. Corneil, Y. Perl and L. Stewart, A linear recognition algorithm for cographs, SIAM
Journal on Computing 14 (1985) 926–934.
[215] G. Cornuéjols, Combinatorial Optimization, SIAM, Philadelphia, 2001.
[216] R.W. Cottle and A.F. Veinott, Polyhedral sets having a least element, Mathematical
Programming 3 (1972) 238–249.
[217] M. Couceiro and S. Foldes, Definability of Boolean function classes by linear equations
over GF(2), Discrete Applied Mathematics 142 (2004) 29–34.
[218] M. Couceiro and S. Foldes, On closed sets of relational constraints and classes of functions
closed under variable substitutions, Algebra Universalis 54 (2005) 149–165.
[219] M. Couceiro and S. Foldes, Functional equations, constraints, definability of function
classes, and functions of Boolean variables, Acta Cybernetica 18 (2007) 61–75.
[220] M. Couceiro and M. Pouzet, On a quasi-ordering on Boolean functions, Theoretical
Computer Science 396 (2008) 71–87.
[221] O. Coudert, Two-level logic minimization: An overview, Integration: The VLSI Journal
17 (1994) 97–140.
[222] O. Coudert and T. Sasao, Two-level logic minimization, in: Logic Synthesis and Verifica-
tion, S. Hassoun and T. Sasao, eds., Kluwer Academic Publishers, Norwell, MA, 2002,
pp. 1–27.
[223] M.B. Cozzens and R. Leibowitz, Multidimensional scaling and threshold graphs, Journal
of Mathematical Psychology 31 (1987) 179–191.
[224] Y. Crama, Recognition and Solution of Structured Discrete Optimization Problems, Ph.D.
thesis, Rutgers University, Piscataway, NJ, 1987.
[225] Y. Crama, Dualization of regular Boolean functions, Discrete Applied Mathematics 16
(1987) 79–85.
[226] Y. Crama, Recognition problems for special classes of polynomials in 0–1 variables,
Mathematical Programming 44 (1989) 139–155.
[227] Y. Crama, Concave extensions for nonlinear 0–1 maximization problems, Mathematical
Programming 61 (1993) 53–60.
[228] Y. Crama, Combinatorial optimization models for production scheduling in automated
manufacturing systems, European Journal of Operational Research 99 (1997) 136–153.
[229] Y. Crama, O. Ekin and P.L. Hammer, Variable and term removal from Boolean formulae,
Discrete Applied Mathematics 75 (1997) 217–230.
[230] Y. Crama and P.L. Hammer, Recognition of quadratic graphs and adjoints of bidirected
graphs, in: G.S. Bloom, R.L. Graham and J. Malkevitch, eds., Combinatorial Mathemat-
ics: Proceedings of the Third International Conference, Annals of the New York Academy
of Sciences, Vol. 555, 1989, pp. 140–149.
[231] Y. Crama and P.L. Hammer, eds., Boolean Models and Methods in Mathematics, Computer
Science, and Engineering, Cambridge University Press, Cambridge, 2010.
[232] Y. Crama, P.L. Hammer and R. Holzman, A characterization of a cone of pseudo-Boolean
functions via supermodularity-type inequalities, in: P. Kall, J. Kohlas, W. Popp and
C.A. Zehnder, eds., Quantitative Methoden in den Wirtschaftswissenschaften, Springer-
Verlag, Berlin-Heidelberg, 1989, pp. 53–55.
[233] Y. Crama, P.L. Hammer and T. Ibaraki, Cause-effect relationships and partially defined
Boolean functions, Annals of Operations Research 16 (1988) 299–326.
[234] Y. Crama, P.L. Hammer, B. Jaumard and B. Simeone, Product form parametric represen-
tation of the solutions to a quadratic Boolean equation, RAIRO - Operations Research 21
(1987) 287–306.
[235] Y. Crama, P. Hansen and B. Jaumard, The basic algorithm for pseudo-Boolean program-
ming revisited, Discrete Applied Mathematics 29 (1990) 171–185.
[236] Y. Crama and L. Leruth, Control and voting power in corporate networks: Concepts and
computational aspects, European Journal of Operational Research 178 (2007) 879–893.
[237] Y. Crama, L. Leruth, L. Renneboog and J.-P. Urbain, Corporate control concentration
measurement and firm performance, in: J.A. Batten and T.A. Fetherston, eds., Social
Responsibility: Corporate Governance Issues, Research in International Business and
Finance (Volume 17), Elsevier, Amsterdam, 2003, pp. 123–149.
[238] Y. Crama and J.B. Mazzola, Valid inequalities and facets for a hypergraph model of the
nonlinear knapsack and FMS part-selection problems, Annals of Operations Research 58
(1995) 99–128.
[239] J.M. Crawford and L.D. Auton, Experimental results on the crossover point in random
3-SAT, Artificial Intelligence 81 (1996) 31–57.
[240] N. Creignou, A dichotomy theorem for maximum generalized satisfiability problems,
Journal of Computer and System Sciences 51 (1995) 511–522.
[241] N. Creignou and H. Daudé, Generalized satisfiability problems: Minimal elements and
phase transitions, Theoretical Computer Science 302 (2003) 417–430.
[242] N. Creignou and H. Daudé, The SAT–UNSAT transition for random constraint satisfaction
problems, Discrete Mathematics 309 (2009) 2085–2099.
[243] N. Creignou, S. Khanna and M. Sudan, Complexity Classifications of Boolean Constraint
Satisfaction Problems, SIAM Monographs on Discrete Mathematics and Applications,
SIAM, Philadelphia, 2001.
[244] P. Crescenzi and V. Kann, eds., A compendium of NP optimization problems, published
electronically at http://www.nada.kth.se/~viggo/wwwcompendium/ (2005).
[245] H.P. Crowder, E.L. Johnson and M.W. Padberg, Solving large-scale zero–one linear
programming problems, Operations Research 31 (1983) 803–834.
[246] J. Cubbin and D. Leech, The effect of shareholding dispersion on the degree of control in
British companies: Theory and measurement, The Economic Journal 93 (1983) 351–369.
[247] R.A. Cuninghame-Green, Minimax Algebra, Lecture Notes in Economics and
Mathematical Systems, Vol. 166, Springer, Berlin, 1979.
[248] H.A. Curtis, A New Approach to the Design of Switching Circuits, D. Van Nostrand,
Princeton, NJ, 1962.
[249] S.L.A. Czort, The Complexity of Minimizing Disjunctive Normal Form Formulas,
Master’s thesis, University of Aarhus, 1999.
[250] E. Dahlhaus, Learning monotone read-once formulas in quadratic time, Unpublished
manuscript, Department of Computer Science, University of Sydney, 1990.
[251] E. Dahlhaus, Efficient parallel recognition algorithms of cographs and distance hereditary
graphs, Discrete Applied Mathematics 57 (1995) 29–44.
[252] V. Dahllöf, P. Jonsson and M. Wahlström, Counting models for 2SAT and 3SAT formulae,
Theoretical Computer Science 332 (2005) 265–291.
[253] M. Dalal and D.W. Etherington, A hierarchy of tractable satisfiability problems,
Information Processing Letters 44 (1992) 173–180.
[254] G. Danaraj and V. Klee, Which spheres are shellable? Annals of Discrete Mathematics 2
(1978) 33–52.
[255] E. Dantsin, A. Goerdt, E.A. Hirsch, R. Kannan, J. Kleinberg, Ch. Papadimitriou, P. Ragha-
van and U. Schöning, A deterministic (2 − 2/(k + 1))n algorithm for k-SAT based on local
search, Theoretical Computer Science 289 (2002) 69–83.
[256] G.B. Dantzig, On the significance of solving linear programming problems with some
integer variables, Econometrica 28 (1960) 30–44.
[257] A. Darwiche, New advances in compiling CNF to decomposable negation normal form,
in: Proceedings of the 16th European Conference on Artificial Intelligence, Valencia,
Spain, 2004, pp. 328–332.
[258] S.B. Davidson, H. Garcia-Molina and D. Skeen, Consistency in partitioned networks,
ACM Computing Surveys 17 (1985) 341–370.
[259] M. Davio, J.-P. Deschamps and A. Thayse, Discrete and Switching Functions,
McGraw-Hill, New York, 1978.
[260] M. Davis, G. Logemann and D. Loveland, A machine program for theorem-proving,
Communications of the ACM 5 (1962) 394–397.
[261] M. Davis and H. Putnam, A computing procedure for quantification theory, Journal of
the Association for Computing Machinery 7 (1960) 201–215.
[262] T. Davoine, P.L. Hammer and B. Vizvári, A heuristic for Boolean optimization problems,
Journal of Heuristics 9 (2003) 229–247.
[263] P.M. Dearing, P.L. Hammer and B. Simeone, Boolean and graph theoretic formulations
of the simple plant location problem, Transportation Science 26 (1992) 138–148.
[264] R. Dechter and J. Pearl, Structure identification in relational data, Artificial Intelligence
58 (1992) 237–270.
[265] E. de Klerk and J.P. Warners, Semidefinite programming relaxations for MAX 2-SAT and
3-SAT: Computational perspectives, in: P.M. Pardalos, A. Migdalas and R.E. Burkard,
eds., Combinatorial and Global Optimization, Series on Applied Optimization, Volume
14, World Scientific Publishers, River Edge, NJ, 2002, pp. 161–176.
[266] E. de Klerk, J.P. Warners and H. van Maaren, Relaxations of the satisfiability problem
using semidefinite programming, Journal of Automated Reasoning 24 (2000) 37–65.
[267] C. Delobel and R.G. Casey, Decomposition of a database and the theory of Boolean
switching functions, IBM Journal of Research and Development 17 (1973) 374–386.
[268] X. Deng and C.H. Papadimitriou, On the complexity of cooperative solution concepts,
Mathematics of Operations Research 19 (1994) 257–266.
[269] M.L. Dertouzos, Threshold Logic: A Synthesis Approach, M.I.T. Press, Cambridge, MA,
1965.
[270] M.M. Deza and M. Laurent, Geometry of Cuts and Metrics, Springer-Verlag, Berlin, 1997.
[271] I. Diakonikolas and R.A. Servedio, Improved approximation of linear threshold functions,
in: Proceedings of the 24th Annual IEEE Conference on Computational Complexity, IEEE
Computer Society, Los Alamitos, CA, 2009, pp. 161–172.
[272] G. Ding, Monotone clutters, Discrete Mathematics 119 (1993) 67–77.
[273] G. Ding, R.F. Lax, J. Chen and P.P. Chen, Formulas for approximating pseudo-Boolean
random variables, Discrete Applied Mathematics 156 (2008) 1581–1597.
[274] G. Ding, R.F. Lax, J. Chen, P.P. Chen and B.D. Marx, Transforms of pseudo-Boolean
random variables, Discrete Applied Mathematics 158 (2010) 13–24.
[275] C. Domingo, N. Mishra and L. Pitt, Efficient read-restricted monotone CNF/DNF
dualization by learning with membership queries, Machine Learning 37 (1999) 89–110.
[276] G. Dong and J. Li, Mining border descriptions of emerging patterns from dataset pairs,
Knowledge Information Systems 8 (2005) 178–202.
[277] W.F. Dowling and J.H. Gallier, Linear time algorithms for testing the satisfiability of
propositional Horn formulae, Journal of Logic Programming 3 (1984) 267–284.
[278] D. Du, J. Gu and P.M. Pardalos, eds., Satisfiability Problem: Theory and Applications,
DIMACS Series in Discrete Mathematics and Theoretical Computer Science, Vol. 35,
American Mathematical Society, 1997.
[279] P. Dubey and L.S. Shapley, Mathematical properties of the Banzhaf power index,
Mathematics of Operations Research 4 (1979) 99–131.
[280] O. Dubois, Counting the number of solutions for instances of satisfiability problems,
Theoretical Computer Science 81 (1991) 49–64.
[281] O. Dubois, P. André, Y. Boufkhad and J. Carlier, SAT versus UNSAT, in: D.S. Johnson
and M.A. Trick, eds., Cliques, Coloring, and Satisfiability, DIMACS Series in Discrete
Mathematics and Theoretical Computer Science, Vol. 26, American Mathematical Society,
1996, pp. 415–436.
[282] O. Dubois, Y. Boufkhad and J. Mandler, Typical random 3-SAT formulae and the sat-
isfiability threshold, in: Proceedings of the Eleventh Annual ACM-SIAM Symposium on
Discrete Algorithms, 2000, pp. 126–127.
[283] O. Dubois and G. Dequen, A backbone-search heuristic for efficient solving of hard 3-
SAT formulae, in: Proceedings of the 17th International Joint Conference on Artificial
Intelligence (IJCAI’01), Seattle, Washington, 2001, pp. 248–253.
[284] P. Duchet, Classical perfect graphs, in: Topics on Perfect Graphs, North-Holland,
Amsterdam, 1984, pp. 67–96.
[285] Ch. Ebenegger, P.L. Hammer and D. de Werra, Pseudo-Boolean functions and stability
of graphs, Annals of Discrete Mathematics 19 (1984) 83–97.
[286] J. Ebert, A sensitive transitive closure algorithm, Information Processing Letters 12
(1981) 255–258.
[287] J. Edmonds, Submodular functions, matroids, and certain polyhedra, in: R. Guy,
H. Hanani, N. Sauer and J. Schönheim, eds., Combinatorial Structures and Their
Applications, Gordon and Breach, New York, 1970, pp. 69–87.
[288] J. Edmonds and D.R. Fulkerson, Bottleneck extrema, Journal of Combinatorial Theory
8 (1970) 299–306.
[289] N. Eén and N. Sörensson, An extensible SAT-solver, in: Proceedings of the 6th
International Conference on Theory and Applications of Satisfiability Testing, 2003.
[290] N. Eén and N. Sörensson, Translating pseudo-Boolean constraints into SAT, Journal on
Satisfiability, Boolean Modeling and Computation 2 (2006) 1–26.
[291] E. Einy, The desirability relation of simple games, Mathematical Social Sciences 10 (1985)
155–168.
[292] E. Einy and E. Lehrer, Regular simple games, International Journal of Game Theory 18
(1989) 195–207.
[293] T. Eiter, Exact transversal hypergraphs and application to Boolean µ-functions, Journal
of Symbolic Computation 17 (1994) 215–225.
[294] T. Eiter, Generating Boolean µ-expressions, Acta Informatica 32 (1995) 171–187.
[295] T. Eiter and G. Gottlob, Identifying the minimal transversals of a hypergraph and related
problems, SIAM Journal on Computing 24 (1995) 1278–1304.
[296] T. Eiter, T. Ibaraki and K. Makino, Double Horn functions, Information and Computation
144 (1998) 155–190.
[297] T. Eiter, T. Ibaraki and K. Makino, Computing intersections of Horn theories for reasoning
with models, Artificial Intelligence 110 (1999) 57–101.
[298] T. Eiter, T. Ibaraki and K. Makino, Bidual Horn functions and extensions, Discrete Applied
Mathematics 96 (1999) 55–88.
[299] T. Eiter, T. Ibaraki and K. Makino. On the difference of Horn theories, Journal of Computer
and System Sciences 61 (2000) 478–507.
[300] T. Eiter, T. Ibaraki and K. Makino, Disjunction of Horn theories and their cores, SIAM
Journal on Computing 31 (2001) 269–288.
[301] T. Eiter, P. Kilpelainen and H. Mannila, Recognizing renamable generalized propositional
Horn formulas is NP-complete, Discrete Applied Mathematics 59 (1995) 23–31.
[302] T. Eiter, K. Makino and G. Gottlob, Computational aspects of monotone dualization: A
brief survey, Discrete Applied Mathematics 156 (2008) 2035–2049.
[303] O. Ekin, Special Classes of Boolean Functions, Ph.D. Thesis, Rutgers University,
Piscataway, NJ, 1997.
[304] O. Ekin Karaşan, Dualization of quadratic Boolean functions, Annals of Operations
Research (2011), to appear.
[305] O. Ekin, S. Foldes, P.L. Hammer and L. Hellerstein, Equational characterizations of
Boolean function classes, Discrete Mathematics 211 (2000) 27–51.
[306] O. Ekin, P.L. Hammer and A. Kogan, On connected Boolean functions, Discrete Applied
Mathematics 96/97 (1999) 337–362.
[307] O. Ekin, P.L. Hammer and A. Kogan, Convexity and logical analysis of data, Theoretical
Computer Science 244 (2000) 95–116.
[308] O. Ekin, P.L. Hammer and U.N. Peled, Horn functions and submodular Boolean functions,
Theoretical Computer Science 175 (1997) 257–270.
[309] K.M. Elbassioni, On the complexity of monotone dualization and generating minimal
hypergraph transversals, Discrete Applied Mathematics 156 (2008) 2109–2123.
[310] C.C. Elgot, Truth functions realizable by single threshold organs, in: IEEE Symposium
on Switching Circuit Theory and Logical Design, 1961, pp. 225–245.
[311] M.R. Emamy-K., The worst case behavior of a greedy algorithm for a class of pseudo-
Boolean functions, Discrete Applied Mathematics 23 (1989) 285–287.
[312] P. Erdős, On some extremal problems in graph theory, Israel Journal of Mathematics 3
(1965) 113–116.
[313] P. Erdős and T. Gallai, Graphen mit Punkten vorgeschriebenen Graden, Mat. Lapok 11
(1960) 264–274.
[314] P. Erdős and J. Spencer, Probabilistic Methods in Combinatorics, Akadémiai Kiadó,
Budapest, 1974.
[315] B. Escoffier and V.Th. Paschos, Differential approximation of MIN SAT, MAX SAT and
related problems, European Journal of Operational Research 181 (2007) 620–633.
[316] E. Eskin, E. Halperin and R.M. Karp, Efficient reconstruction of haplotype structure via
perfect phylogeny, Journal of Bioinformatics and Computational Biology 1 (2003) 1–20.
[317] R. Euler, Regular (2,2)-systems, Mathematical Programming 24 (1982) 269–283.
[318] S. Even, A. Itai and A. Shamir, On the complexity of timetable and multicommodity flow
problems, SIAM Journal on Computing 5 (1976) 691–703.
[319] R. Fagin, Functional dependencies in a relational database and propositional logic, IBM
Journal of Research and Development 21 (1977) 534–544.
[320] R. Fagin, Horn clauses and database dependencies, Journal of the ACM 29 (1982)
952–985.
[321] U.M. Fayyad, G. Piatetsky-Shapiro, P. Smyth and R. Uthurusamy, Advances in Knowledge
Discovery and Data Mining, The MIT Press, Cambridge, MA, 1996.
[322] T. Feder, Network flow and 2-satisfiability, Algorithmica 11 (1994) 291–319.
[323] T. Feder, Stable Networks and Product Graphs, Memoirs of the American Mathematical
Society, Vol. 116, No. 555, Providence, RI, 1995.
[324] U. Feige, A threshold of ln n for approximating set cover, Journal of the Association for
Computing Machinery 45 (1998) 634–652.
[325] U. Feige and M.X. Goemans, Approximating the value of two prover proof sys-
tems, with applications to MAX SAT and MAX DICUT, in: Proceedings of the
Third Israel Symposium on Theory of Computing and Systems, Tel Aviv, Israel, 1995,
pp. 182–189.
[326] J. Feldman, Minimization of Boolean complexity in human concept learning, Nature 407
(2000) 630–633.
[327] J. Feldman, An algebra of human concept learning, Journal of Mathematical Psychology
50 (2006) 339–368.
[328] V. Feldman, Hardness of approximate two-level logic minimization and PAC learning
with membership queries, in: Proceedings of the 38th ACM Symposium on Theory of
Computing (STOC) 2006, pp. 363–372.
[329] D.S. Felsenthal and M. Machover, The Measurement of Voting Power: Theory and
Practice, Problems and Paradoxes, Edward Elgar, Cheltenham, UK, 1998.
[330] J.R. Fernández, E. Algaba, J.M. Bilbao, A. Jiménez, N. Jiménez and J.J. López, Generating
functions for computing the Myerson value, Annals of Operations Research 109 (2002)
143–158.
[331] M.J. Fischer and A.R. Meyer, Boolean matrix multiplication and transitive closure, in:
Proceedings of the 12th Annual IEEE Symposium on the Foundations of Computer
Science, IEEE, 1971, pp. 129–131.
[332] M.L. Fisher, G.L. Nemhauser and L.A. Wolsey, An analysis of approximations for
maximizing submodular set functions - II, Mathematical Programming Study 8 (1978)
73–87.
[333] C. Flament, L’analyse booléenne de questionnaires, Mathématiques et Sciences Humaines
12 (1966) 3–10.
[334] S. Foldes, Equational classes of Boolean functions via the HSP Theorem, Algebra
Universalis 44 (2000) 309–324.
[335] S. Foldes and P.L. Hammer, Split graphs, Congressus Numerantium 19 (1977)
311–315.
[336] S. Foldes and P.L. Hammer, Disjunctive and conjunctive normal forms of pseudo-Boolean
functions, Discrete Applied Mathematics 107 (2000) 1–26.
[337] S. Foldes and P.L. Hammer, Monotone, Horn and quadratic pseudo-Boolean functions,
Journal of Universal Computer Science 6 (2000) 97–104.
[338] S. Foldes and P.L. Hammer, Disjunctive analogues of submodular and supermodular
pseudo-Boolean functions, Discrete Applied Mathematics 142 (2004) 53–65.
[339] S. Foldes and P.L. Hammer, Submodularity, supermodularity, and higher-order mono-
tonicities of pseudo-Boolean functions, Mathematics of Operations Research 30 (2005)
453–461.
[340] S. Foldes and G.R. Pogosyan, Post classes characterized by functional terms, Discrete
Applied Mathematics 142 (2004) 35–51.
[341] L.R. Ford and D.R. Fulkerson, Flows in Networks, Princeton University Press, Princeton,
NJ 1962.
[342] R. Fortet, L’algèbre de Boole et ses applications en recherche opérationnelle, Cahiers du
Centre d’Etudes de Recherche Opérationnelle 1 (1959) 5–36.
[343] R. Fortet, Applications de l’algèbre de Boole en recherche opérationnelle, Revue
Française de Recherche Opérationnelle 4 (1960) 17–26.
[344] J. Franco, Probabilistic analysis of satisfiability algorithms, in: Y. Crama and P.L. Hammer,
eds., Boolean Models and Methods in Mathematics, Computer Science, and Engineering,
Cambridge University Press, Cambridge, 2010, pp. 99–159.
[345] J. Franco and M. Paull, Probabilistic analysis of the Davis-Putnam procedure for solving
the satisfiability problem, Discrete Applied Mathematics 5 (1983) 77–87.
650 Bibliography
[346] L. Fratta and U.G. Montanari, A Boolean algebra method for computing the terminal
reliability in a communication network, IEEE Transactions on Circuit Theory CT-20
(1973) 203–211.
[347] M. Fredman and L. Khachiyan, On the complexity of dualization of monotone disjunctive
normal forms, Journal of Algorithms 21 (1996) 618–628.
[348] E. Friedgut, Sharp threshold of graph properties, and the k-SAT problem, Journal of the
American Mathematical Society 12 (1999) 1017–1054 (with an appendix by J. Bourgain).
[349] A.M. Frieze and B. Reed, Probabilistic analysis of algorithms, in: M. Habib, C. McDiarmid, J. Ramirez-Alfonsin and B. Reed, eds., Probabilistic Methods for Algorithmic Discrete Mathematics, Springer, Berlin, 1998, pp. 36–92.
[350] A. Frieze and N.C. Wormald, Random k-SAT: A tight threshold for moderately growing k,
Combinatorica 25 (2005) 297–305.
[351] S. Fujishige, Submodular Functions and Optimization, Annals of Discrete Mathematics
Vol. 58, Elsevier, Amsterdam, 2005.
[352] T. Fujito, On approximation of the submodular set cover problem, Operations Research
Letters 25 (1999) 169–174.
[353] D.R. Fulkerson, Networks, frames, blocking systems, in: G.B. Dantzig and A.F. Veinott
Jr., eds., Mathematics of the Decision Sciences - Part I, American Mathematical Society,
Providence, RI, 1968, pp. 303–334.
[354] M. Fürer and S.P. Kasiviswanathan, Algorithms for counting 2-SAT solutions and colorings with applications, Algorithmic Aspects in Information and Management, Lecture Notes in Computer Science, Vol. 4508, Springer-Verlag, Berlin, 2007, pp. 47–57.
[355] M.E. Furman, Application of a method of fast multiplication to the problem of finding
the transitive closure of a graph, Soviet Mathematics Doklady 22 (1970) 1252.
[356] I.J. Gabelman, The Functional Behavior of Majority (Threshold) Elements, Ph.D.
Dissertation, Department of Electrical Engineering, Syracuse University, NY, 1961.
[357] H.N. Gabow and R.E. Tarjan, A linear-time algorithm for a special case of disjoint set
union, Journal of Computer and System Sciences 30 (1985) 209–221.
[358] T. Gallai, Transitiv orientierbare Graphen, Acta Mathematica Academiae Scientiarum
Hungaricae 18 (1967) 25–66.
[359] H. Gallaire and J. Minker, eds., Logic and Data Bases, Plenum, New York, 1978.
[360] G. Gallo, C. Gentile, D. Pretolani and G. Rago, Max Horn SAT and the minimum cut
problem in directed hypergraphs, Mathematical Programming 80 (1998) 213–237.
[361] G. Gallo, G. Longo, S. Nguyen and S. Pallottino, Directed hypergraphs and applications,
Discrete Applied Mathematics 42 (1993) 177–201.
[362] G. Gallo and M.G. Scutellà, Polynomially solvable satisfiability problems, Information
Processing Letters 29 (1988) 221–227.
[363] G. Gallo and M.G. Scutellà, Directed hypergraphs as a modelling paradigm, Rivista
AMASES 21 (1998) 97–123.
[364] G. Gallo and B. Simeone, On the supermodular knapsack problem, Mathematical Programming 45 (1989) 295–309.
[365] G. Gallo and G. Urbani, Algorithms for testing the satisfiability of propositional formulae,
Journal of Logic Programming 7 (1989) 45–61.
[366] G. Galperin and A. Tolpygo, Moscow Mathematical Olympiads, in: A. Kolmogorov, ed.,
Prosveschenie (Education), Moscow, USSR, 1986, Problem 72 (in Russian).
[367] F. Galvin, Horn sentences, Annals of Mathematical Logic 1 (1970) 389–422.
[368] G. Gambarelli, Power indices for political and financial decision making, Annals of
Operations Research 51 (1994) 165–173.
[369] B. Ganter and R. Wille, Formal Concept Analysis - Mathematical Foundations, Springer-
Verlag, Berlin, 1999.
[370] H. Garcia-Molina and D. Barbara, How to assign votes in a distributed system, Journal of the Association for Computing Machinery 32 (1985) 841–860.
[371] M.R. Garey and D.S. Johnson, Computers and Intractability: A Guide to the Theory of
NP-Completeness, W.H. Freeman, New York, 1979.
[372] M.R. Garey, D.S. Johnson and L. Stockmeyer, Some simplified NP-complete graph
problems, Theoretical Computer Science 1 (1976) 237–267.
[373] M.A. Garrido, A. Márquez, A. Morgana and J.R. Portillo, Single bend wiring on surfaces,
Discrete Applied Mathematics 117 (2002) 27–40.
[374] F. Gavril, Testing for equality between maximum matching and minimum node covering,
Information Processing Letters 6 (1977) 199–202.
[375] F. Gavril, An efficiently solvable graph partition problem to which many problems are
reducible, Information Processing Letters 45 (1993) 285–290.
[376] A. Genkin, C.A. Kulikowski and I.B. Muchnik, Set covering submodular maximization:
An optimal algorithm for data mining in bioinformatics and medical informatics, Journal
of Intelligent and Fuzzy Systems 12 (2002) 5–17.
[377] I. Gent, H. van Maaren and T. Walsh, eds., SAT2000: Highlights of Satisfiability Research
in the Year 2000, IOS Press, Amsterdam, 2000.
[378] F. Giannessi and F. Niccolucci, Connections between nonlinear and integer programming
problems, Symposia Mathematica XIX (1976) 161–176.
[379] R. Giles and R. Kannan, A characterization of threshold matroids, Discrete Mathematics
30 (1980) 181–184.
[380] P.C. Gilmore, A proof method for quantification theory: Its justification and realization,
IBM Journal of Research and Development 4 (1960) 28–35.
[381] J.F. Gimpel, A method of producing a Boolean function having an arbitrarily prescribed prime implicant table, IEEE Transactions on Electronic Computers EC-14 (1965) 485–488.
[382] J.F. Gimpel, A reduction technique for prime implicant tables, IEEE Transactions on
Electronic Computers EC-14 (1965) 535–541.
[383] A. Ginsberg, Knowledge-base reduction: A new approach to checking knowledge bases
for inconsistency and redundancy, in: Proceedings of the Seventh National Conference
on Artificial Intelligence, 1988, pp. 585–589.
[384] E. Giunchiglia, F. Giunchiglia and A. Tacchella, SAT-based decision procedures for classical modal logics, in: I. Gent, H. van Maaren and T. Walsh, eds., SAT2000: Highlights of Satisfiability Research in the Year 2000, IOS Press, Amsterdam, 2000, pp. 403–426.
[385] V.V. Glagolev, Some estimates of disjunctive normal forms of functions in the algebra of
logic, in: Problems of Cybernetics, Vol. 19, Nauka, Moscow, 1967, pp. 75–94 (in Russian).
[386] F. Glover and J-K. Hao, Efficient evaluations for solving large 0–1 unconstrained quadratic
optimisation problems, International Journal of Metaheuristics 1 (2010) 3–10.
[387] F. Glover and E. Woolsey, Converting the 0–1 polynomial programming problem to a 0–1 linear program, Operations Research 22 (1974) 180–182.
[388] R. Gnanadesikan, Methods for Statistical Data Analysis of Multivariate Observations,
Wiley-Interscience, New York, 1977.
[389] M.X. Goemans and D.P. Williamson, New 3/4-approximation algorithms for the maximum
satisfiability problem, SIAM Journal on Discrete Mathematics 7 (1994) 656–666.
[390] A. Goerdt, A threshold for unsatisfiability, in: I.M. Havel and V. Koubek, eds., Proceedings
of the 17th International Symposium on Mathematical Foundations of Computer Science,
Lecture Notes in Computer Science, Vol. 629, Springer-Verlag, Berlin, 1992, pp. 264–274.
[391] G. Gogic, C. Papadimitriou and M. Sideri, Incremental recompilation of knowledge,
Journal of Artificial Intelligence Research 8 (1998) 23–37.
[392] E. Goldberg and Y. Novikov, BerkMin: A fast and robust SAT solver, Discrete Applied
Mathematics 155 (2007) 1549–1561.
[393] B. Goldengorin, Maximization of submodular functions: Theory and enumeration
algorithms, European Journal of Operational Research 198 (2009) 102–112.
[394] B. Goldengorin, D. Ghosh and G. Sierksma, Equivalent instances of the simple plant
location problem, SOM Research Report No. 00A54, University of Groningen, The
Netherlands, 2000.
[395] B. Goldengorin, D. Ghosh and G. Sierksma, Branch and peg algorithms for the simple
plant location problem, Computers and Operations Research 31 (2004) 241–255.
[396] S.A. Goldman, M.J. Kearns and R.E. Schapire, Exact identification of read-once formulas
using fixed points of amplification functions, SIAM Journal on Computing 22 (1993)
705–726.
[397] J. Goldsmith, R.H. Sloan, B. Szorenyi and G. Turán, Theory revision with queries: Horn,
read-once, and parity formulas, Artificial Intelligence 156 (2004) 139–176.
[398] M.C. Golumbic, Algorithmic Graph Theory and Perfect Graphs, Academic Press,
New York, 1980. Second edition: Annals of Discrete Mathematics, Vol. 57, Elsevier,
Amsterdam, 2004.
[399] M.C. Golumbic and A. Mintz, Factoring logic functions using graph partitioning, in:
Proceedings of the IEEE/ACM International Conference on Computer Aided Design,
November 1999, pp. 195–198.
[400] M.C. Golumbic, A. Mintz and U. Rotics, Factoring and recognition of read-once functions
using cographs and normality, in: Proceedings of the 38th Design Automation Conference,
June 2001, pp. 109–114.
[401] M.C. Golumbic, A. Mintz and U. Rotics, Factoring and recognition of read-once functions
using cographs and normality and the readability of functions associated with partial
k-trees, Discrete Applied Mathematics 154 (2006) 1465–1477.
[402] M.C. Golumbic, A. Mintz and U. Rotics, An improvement on the complexity of factoring
read-once Boolean functions, Discrete Applied Mathematics 156 (2008) 1633–1636.
[403] C.P. Gomes, B. Selman, N. Crato and H. Kautz, Heavy-tailed phenomena in satisfiability
and constraint satisfaction problems, in: I. Gent, H. van Maaren and T. Walsh, eds.,
SAT2000: Highlights of Satisfiability Research in the Year 2000, IOS Press, Amsterdam,
2000, pp. 15–41.
[404] A. Goralcikova and V. Koubek, A reduct and closure algorithm for graphs, in: Proceedings of the 8th Symposium on Mathematical Foundations of Computer Science (MFCS’79), Lecture Notes in Computer Science, Vol. 74, Springer-Verlag, Berlin, 1979, pp. 301–307.
[405] M. Grabisch, The application of fuzzy integrals in multicriteria decision making, European
Journal of Operational Research 89 (1996) 445–456.
[406] M. Grabisch, J.-L. Marichal, R. Mesiar and E. Pap, Aggregation Functions, Cambridge
University Press, Cambridge, 2009.
[407] M. Grabisch, J.-L. Marichal and M. Roubens, Equivalent representations of set functions,
Mathematics of Operations Research 25 (2) (2000) 157–178.
[408] D. Granot and F. Granot, Generalized covering relaxations for 0–1 programs, Operations
Research 28 (1980) 1442–1450.
[409] D. Granot, F. Granot and J. Kallberg, Covering relaxation for positive 0–1 polynomial
programs, Management Science 25 (1979) 264–273.
[410] F. Granot and P.L. Hammer, On the use of Boolean functions in 0–1 programming,
Methods of Operations Research 12 (1972) 154–184.
[411] F. Granot and P.L. Hammer, On the role of generalized covering problems, Cahiers du
Centre d’Etudes de Recherche Opérationnelle 16 (1974) 277–289.
[412] J.F. Groote and J.P. Warners, The propositional formula checker HeerHugo, in: I. Gent,
H. van Maaren and T. Walsh, eds., SAT2000: Highlights of Satisfiability Research in the
Year 2000, IOS Press, Amsterdam, 2000, pp. 261–282.
[413] A. Grossi, Algorithme à séparation de variables pour la dualisation d’une fonction
booléenne, R.A.I.R.O. 8 (B-1) (1974) 41–55.
[414] M. Grötschel, L. Lovász and A. Schrijver, The ellipsoid method and its consequences in
combinatorial optimization, Combinatorica 1 (1981) 169–197.
[415] J. Gu, Efficient local search for very large-scale satisfiability problems, SIGART Bulletin
3 (1992) 8–12.
[416] J. Gu, Local search for satisfiability (SAT) problems, IEEE Transactions on Systems, Man
and Cybernetics 23 (1993) 1108–1129.
[417] J. Gu, Global optimization for satisfiability (SAT) problems, IEEE Transactions on
Knowledge and Data Engineering 6 (1994) 361–381.
[418] J. Gu, P.W. Purdom, J. Franco and B.W. Wah, Algorithms for the satisfiability (SAT)
problem: A survey, in: D. Du, J. Gu and P.M. Pardalos, eds., Satisfiability Problem: Theory
and Applications, DIMACS series in Discrete Mathematics and Theoretical Computer
Science, Vol. 35, American Mathematical Society, 1997, pp. 19–151.
[419] B. Guenin, Perfect and ideal 0, ±1 matrices, Mathematics of Operations Research 23
(1998) 322–338.
[420] S. Gueye and P. Michelon, A linearization framework for unconstrained quadratic (0–1)
problems, Discrete Applied Mathematics 157 (2009) 1255–1266.
[421] V. Gurvich, Nash-solvability of positional games in pure strategies, USSR Computational Mathematics and Mathematical Physics 15(2) (1975) 74–87.
[422] V. Gurvich, On repetition-free Boolean functions, Uspekhi Mat. Nauk. 32 (1977) 183–
184, (in Russian); translated as: On read-once Boolean functions, Russian Mathematical
Surveys 32 (1977) 183–184.
[423] V. Gurvich, Applications of Boolean Functions and Networks in Game Theory, Ph.D.
thesis, Moscow Institute of Physics and Technology, Moscow, USSR, 1978 (in Russian).
[424] V. Gurvich, On the normal form of positional games, Soviet Mathematics Doklady 25(3)
(1982) 572–575.
[425] V. Gurvich, Some properties and applications of complete edge-chromatic graphs and
hypergraphs, Soviet Mathematics Doklady 30(3) (1984) 803–807.
[426] V. Gurvich, Criteria for repetition-freeness of functions in the algebra of logic, Soviet
Mathematics Doklady 43(3) (1991) 721–726.
[427] V. Gurvich, Positional game forms and edge-chromatic graphs, Soviet Mathematics
Doklady 45(1) (1992) 168–172.
[428] V. Gurvich and L. Khachiyan, On the frequency of the most frequently occurring variable
in dual DNFs, Discrete Mathematics 169 (1997) 245–248.
[429] V. Gurvich and L. Khachiyan, On generating the irredundant conjunctive and disjunctive
normal forms of monotone Boolean functions, Discrete Applied Mathematics 96 (1999)
363–373.
[430] M. Habib, F. de Montgolfier and C. Paul, A simple linear-time modular decomposition
algorithm, in: Proceedings of the 9th Scandinavian Workshop on Algorithm Theory -
SWAT 2004, Lecture Notes in Computer Science, Vol. 3111, Springer-Verlag, Berlin,
2004, pp. 187–198.
[431] M. Habib and C. Paul, A simple linear time algorithm for cograph recognition, Discrete
Applied Mathematics 145 (2005) 183–197.
[432] M. Hagen, Algorithmic and Computational Complexity Issues of MONET, Ph.D. thesis,
Friedrich-Schiller-Universität Jena, Germany, 2009.
[433] A. Haken, The intractability of resolution, Theoretical Computer Science 39 (1985)
297–308.
[434] P.L. Hammer, Plant location: A pseudo-Boolean approach, Israel Journal of Technology
6 (1968) 330–332.
[435] P.L. Hammer, A note on the monotonicity of pseudo-Boolean functions, Zeitschrift für
Operations Research 18 (1974) 47–50.
[436] P.L. Hammer, Pseudo-Boolean remarks on balanced graphs, International Series of
Numerical Mathematics 36 (1977) 69–78.
[437] P.L. Hammer, The conflict graph of a pseudo-Boolean function, Bell Laboratories,
Technical Report, August 1978.
[438] P.L. Hammer, Boolean elements in combinatorial optimization, in: P.L. Hammer, E.L.
Johnson and B. Korte, eds., Discrete Optimization, Annals of Discrete Mathematics Vol. 4,
Elsevier, Amsterdam, 1979, pp. 51–71.
[439] P.L. Hammer and P. Hansen, Logical relations in quadratic 0–1 programming, Revue
Roumaine de Mathématiques Pures et Appliquées 26 (1981) 421–429.
[440] P.L. Hammer, P. Hansen and B. Simeone, Roof duality, complementation and persistency
in quadratic 0–1 optimization, Mathematical Programming 28 (1984) 121–155.
[463] P.L. Hammer and B. Simeone, Quadratic functions of binary variables, in: B. Simeone,
ed., Combinatorial Optimization, Lecture Notes in Mathematics, Vol. 1403, Springer,
Berlin, 1989, pp. 1–56.
[464] P.L. Hammer, B. Simeone, T. Liebling and D. de Werra, From linear separability
to unimodality: A hierarchy of pseudo-Boolean functions, SIAM Journal on Discrete
Mathematics 1 (1988) 174–184.
[465] A. Hamor (alias P.L. Hammer), Stories of the one-zero-zero-one nights: Abu Boul in
Graphistan, in: P. Hansen and D. de Werra, eds., Regards sur la Théorie des Graphes,
Presses Polytechniques Romandes, Lausanne, 1980.
[466] D.J. Hand, Construction and Assessment of Classification Rules, Wiley, Chichester, 1997.
[467] P. Hansen and B. Jaumard, Minimum sum of diameters clustering, Journal of Classification 4 (1987) 215–226.
[468] P. Hansen and B. Jaumard, Algorithms for the maximum satisfiability problem, Computing
44 (1990) 279–303.
[469] P. Hansen, B. Jaumard and V. Mathon, Constrained nonlinear 0–1 programming, ORSA
Journal on Computing 5 (1993) 97–119.
[470] P. Hansen, B. Jaumard and M. Minoux, A linear expected-time algorithm for deriving all
logical conclusions implied by a set of Boolean inequalities, Mathematical Programming
34 (1986) 223–231.
[471] P. Hansen, S.H. Lu and B. Simeone, On the equivalence of paved-duality and standard
linearization in nonlinear 0–1 optimization, Discrete Applied Mathematics 29 (1990)
187–193.
[472] P. Hansen and C. Meyer, Improved compact linearizations for the unconstrained quadratic
0–1 minimization problem, Discrete Applied Mathematics 157 (2009) 1267–1290.
[473] P. Hansen, M.V. Poggi de Aragão and C.C. Ribeiro, Boolean query optimization and the
0–1 hyperbolic sum problem, Annals of Mathematics and Artificial Intelligence 1 (1990)
97–109.
[474] P. Hansen and B. Simeone, Unimodular functions, Discrete Applied Mathematics 14
(1986) 269–281.
[475] F. Harary, On the notion of balance of a signed graph, Michigan Mathematics Journal 2
(1954) 143–146.
[476] F. Harche, J.N. Hooker and G.L. Thompson, A computational study of satisfiability
algorithms for propositional logic, ORSA Journal on Computing 6 (1994) 423–435.
[477] J. Håstad, On the size of weights for threshold gates, SIAM Journal on Discrete
Mathematics 7 (1994) 484–492.
[478] J. Håstad, Some optimal inapproximability results, Journal of the Association for
Computing Machinery 48 (2001) 798–859.
[479] J.P. Hayes, The fanout structure of switching functions, Journal of the ACM 22 (1975)
551–571.
[480] J.-J. Hebrard, Unique Horn renaming and unique 2-satisfiability, Information Processing
Letters 54 (1995) 235–239.
[481] T. Hegedűs and N. Megiddo, On the geometric separability of Boolean functions, Discrete
Applied Mathematics 66 (1996) 205–218.
[482] R. Heiman and A. Wigderson, Randomized vs. deterministic decision tree complexity for
read-once Boolean functions, Computational Complexity 1 (1991) 311–329.
[483] I. Heller and C.B. Tompkins, An extension of a theorem of Dantzig, in: H.W. Kuhn and
A.W. Tucker, eds., Linear Inequalities and Related Systems, Princeton University Press,
Princeton, N.J., 1956, pp. 247–254.
[484] L. Hellerstein, Functions that are read-once on a subset of their variables, Discrete Applied
Mathematics 46 (1993) 235–251.
[485] L. Hellerstein, On generalized constraints and certificates, Discrete Mathematics 226
(2001) 211–232.
[486] L. Hellerstein and V. Raghavan, Exact learning of DNF formulas using DNF hypothesis,
Journal of Computer and System Sciences 70 (2005) 435–470.
[487] P.B. Henderson and Y. Zalcstein, A graph-theoretic characterization of the PV chunk class
of synchronizing primitives, SIAM Journal on Computing 6 (1977) 88–108.
[488] L.J. Henschen, Semantic resolution for Horn sets, IEEE Transactions on Computers 25
(1976) 816–822.
[489] L.J. Henschen and L. Wos, Unit refutations and Horn sets, Journal of the ACM 21 (1974)
590–605.
[490] M. Herbstritt, Satisfiability and Verification: From Core Algorithms to Novel Application
Domains, Suedwestdeutscher Verlag für Hochschulschriften, 2009.
[491] A. Hertz, On the use of Boolean methods for the computation of the stability number,
Discrete Applied Mathematics 76 (1997) 183–203.
[492] E.A. Hirsch, New worst-case upper bounds for SAT, Journal of Automated Reasoning 24
(2000) 397–420.
[493] W. Hodges, Reducing first order logic to Horn logic, School of Mathematical Sciences,
Queen Mary and Westfield College, London, 1985.
[494] W. Hodges, Logical features of Horn clauses, in: Handbook of Logic in Artificial Intelligence and Logic Programming, Vol. 1, Oxford University Press, 1993, pp. 449–503.
[495] A.J. Hoffman and J.B. Kruskal, Integral boundary points of convex polyhedra, in:
H.W. Kuhn and A.W. Tucker, eds., Linear Inequalities and Related Systems, Princeton
University Press, Princeton, N.J., 1956, 223–246.
[496] K. Williamson Hoke, Completely unimodal numberings of a simple polytope, Discrete
Applied Mathematics 20 (1988) 69–81.
[497] K. Hoke, Extending shelling orders and a hierarchy of functions of unimodal simple
polytopes, Discrete Applied Mathematics 60 (1995) 211–217.
[498] J.N. Hooker, A quantitative approach to logical inference, Decision Support Systems 4
(1988) 45–69.
[499] J.N. Hooker, Generalized resolution and cutting planes, Annals of Operations Research
12 (1988) 217–239.
[500] J.N. Hooker, Resolution vs. cutting plane solution of inference problems: Some computational experience, Operations Research Letters 7 (1988) 1–7.
[501] J.N. Hooker, Resolution and the integrality of satisfiability problems, Mathematical
Programming 74 (1996) 1–10.
[502] J.N. Hooker, Logic-Based Methods for Optimization: Combining Optimization and
Constraint Satisfaction, John Wiley & Sons, New York, 2000.
[503] J.N. Hooker, Optimization methods in logic, in: Y. Crama and P.L. Hammer, eds., Boolean
Models and Methods in Mathematics, Computer Science, and Engineering, Cambridge
University Press, Cambridge, 2010, pp. 160–194.
[504] J.N. Hooker and V. Vinay, Branching rules for satisfiability, Journal of Automated
Reasoning 15 (1995) 359–383.
[505] H.H. Hoos and T. Stützle, Towards a characterisation of the behaviour of stochastic local
search algorithms for SAT, Artificial Intelligence 112 (1999) 213–232.
[506] H.H. Hoos and T. Stützle, SATLIB: An online resource for research on SAT, in:
I. Gent, H. van Maaren and T. Walsh, eds., SAT2000: Highlights of Satisfiability Research
in the Year 2000, IOS Press, Amsterdam, 2000, pp. 283–292.
[507] H.H. Hoos and T. Stützle, Local search algorithms for SAT: An empirical evaluation,
Journal of Automated Reasoning 24 (2000) 421–481.
[508] H.H. Hoos and T. Stützle, Stochastic Local Search: Foundations and Applications,
Morgan Kaufmann Publishers, San Francisco, CA, 2005.
[509] A. Horn, On sentences which are true of direct unions of algebras, Journal of Symbolic
Logic 16 (1951) 14–21.
[510] I. Horrocks and P.F. Patel-Schneider, Evaluating optimized decision procedures for propositional modal K(m) satisfiability, in: I. Gent, H. van Maaren and T. Walsh, eds., SAT2000: Highlights of Satisfiability Research in the Year 2000, IOS Press, Amsterdam, 2000, pp. 427–458.
[511] S.-T. Hu, Threshold Logic, University of California Press, Berkeley - Los Angeles,
1965.
[512] S.-T. Hu, Mathematical Theory of Switching Circuits and Automata, University of
California Press, Berkeley - Los Angeles, 1968.
[513] L.M. Hvattum, A. Løkketangen and F. Glover, Adaptive memory search for Boolean
optimization problems, Discrete Applied Mathematics 142 (2004) 99–109.
[514] L. Hyafil and R.L. Rivest, Constructing optimal binary decision trees is NP-complete,
Information Processing Letters 5 (1976) 15–17.
[515] T. Ibaraki, T. Imamichi, Y. Koga, H. Nagamochi, K. Nonobe and M. Yagiura, Efficient
branch-and-bound algorithms for weighted MAX-2-SAT, Technical Report 2007-011,
Department of Applied Mathematics and Physics, Graduate School of Informatics, Kyoto
University, May 2007.
[516] T. Ibaraki and T. Kameda, A theory of coteries: Mutual exclusion in distributed systems,
IEEE Transactions on Parallel and Distributed Systems 4 (1993) 779–794.
[517] T. Ibaraki, A. Kogan and K. Makino, Functional dependencies in Horn theories, Artificial
Intelligence 108 (1999) 1–30.
[518] T. Ibaraki, A. Kogan and K. Makino, Inferring minimal functional dependencies in
Horn and q-Horn theories, Annals of Mathematics and Artificial Intelligence, 38 (2003)
233–255.
[519] J.P. Ignizio, Introduction to Expert Systems: The Development and Implementation of
Rule-Based Expert Systems, McGraw-Hill, New York, 1991.
[520] J.R. Isbell, A class of simple games, Duke Mathematical Journal 25 (1958) 423–439.
[521] A. Itai and J.A. Makowsky, Unification as a complexity measure for logic programming,
Journal of Logic Programming 4 (1987) 105–117.
[522] K. Iwama, CNF satisfiability test by counting and polynomial average time, SIAM Journal
on Computing 18 (1989) 385–391.
[523] S. Iwata, Submodular function minimization, Mathematical Programming Ser. B 112
(2008) 45–64.
[524] S. Iwata, L. Fleischer and S. Fujishige, A combinatorial, strongly polynomial-time algorithm for minimizing submodular functions, in: Proceedings of the 32nd ACM Symposium on Theory of Computing, 2000, pp. 97–106.
[525] S. Janson, Y.C. Stamatiou and M. Vamvakari, Bounding the unsatisfiability threshold of
random 3-SAT, Random Structures and Algorithms 17 (2000) 103–116.
[526] B. Jaumard, Extraction et Utilisation de Relations Booléennes pour la Résolution des
Programmes Linéaires en Variables 0-1, Thèse de doctorat, Ecole Nationale Supérieure
des Télécommunications, Paris, France, 1986.
[527] B. Jaumard, P. Marchioro, A. Morgana, R. Petreschi and B. Simeone, An O(n³) on-line algorithm for 2-satisfiability, Atti Giornate di Lavoro AIRO, Pisa, 1988, pp. 391–399.
[528] B. Jaumard, P. Marchioro, A. Morgana, R. Petreschi and B. Simeone, On-line 2-satisfiability, Annals of Mathematics and Artificial Intelligence 1 (1990) 155–165.
[529] B. Jaumard and M. Minoux, An efficient algorithm for the transitive closure and a linear
worst-case complexity result for a class of sparse graphs, Information Processing Letters
22 (1986) 163–169.
[530] B. Jaumard and B. Simeone, On the complexity of the maximum satisfiability problem
for Horn formulas, Information Processing Letters 26 (1987) 1–4.
[531] B. Jaumard, B. Simeone and P.S. Ow, A selected Artificial Intelligence bibliography for
Operations Researchers, Annals of Operations Research 12 (1988) 1–50.
[532] B. Jaumard, M. Stan and J. Desrosiers, Tabu search and a quadratic relaxation for
the satisfiability problem, in: D.S. Johnson and M.A. Trick, eds., Cliques, Coloring,
and Satisfiability, DIMACS Series in Discrete Mathematics and Theoretical Computer
Science, Vol. 26, American Mathematical Society, 1996, pp. 457–477.
[533] R.G. Jeroslow, Logic-Based Decision Support - Mixed Integer Model Formulation,
North-Holland, Amsterdam, 1989.
[534] R.G. Jeroslow and J. Wang, Solving propositional satisfiability problems, Annals of
Mathematics and Artificial Intelligence 1 (1990) 167–187.
[535] J.H.R. Jiang and T. Villa, Hardware equivalence checking, in: Y. Crama and P.L. Hammer,
eds., Boolean Models and Methods in Mathematics, Computer Science, and Engineering,
Cambridge University Press, Cambridge, 2010, pp. 599–674.
[536] D.S. Johnson, Approximation algorithms for combinatorial problems, Journal of Computer and System Sciences 9 (1974) 256–278.
[537] D.S. Johnson and M.A. Trick, eds., Cliques, Coloring, and Satisfiability, DIMACS
Series in Discrete Mathematics and Theoretical Computer Science, Vol. 26, American
Mathematical Society, 1996.
[538] D.S. Johnson, M. Yannakakis and C.H. Papadimitriou, On generating all maximal
independent sets, Information Processing Letters 27 (1988) 119–123.
[539] N.D. Jones and W.T. Laaser, Complete problems for deterministic polynomial time,
Theoretical Computer Science 3 (1976) 105–117.
[540] S. Joy, J. Mitchell and B. Borchers, A branch and cut algorithm for MAX-SAT and
weighted MAX-SAT, in: D. Du, J. Gu and P.M. Pardalos, eds., Satisfiability Problem:
Theory and Applications, DIMACS series in Discrete Mathematics and Theoretical
Computer Science, Vol. 35, American Mathematical Society, 1997, pp. 519–536.
[541] S. Jukna, A. Razborov, P. Savický and I. Wegener, On P versus NP ∩ co-NP for decision
trees and read-once branching programs, in: I. Privara and P. Ruzicka, eds., Mathematical
Foundations of Computer Science 1997, Lecture Notes in Computer Science, Vol. 1295,
Springer-Verlag, Berlin-New York, 1997, pp. 319–326.
[542] J. Kahn, Entropy, independent sets and antichains: A new approach to Dedekind’s problem, Proceedings of the American Mathematical Society 130 (2002) 371–378.
[543] J. Kahn, G. Kalai and N. Linial, The influence of variables on Boolean functions, in: Proceedings of the 29th Annual IEEE Symposium on the Foundations of Computer Science, IEEE, White Plains, NY, 1988, pp. 68–80.
[544] B. Kalantari and J.B. Rosen, Penalty formulation for zero-one nonlinear programming,
Discrete Applied Mathematics 16 (1987) 179–182.
[545] A.P. Kamath, N.K. Karmarkar, K.G. Ramakrishnan and M.G.C. Resende, Computational experience with an interior point algorithm on the satisfiability problem, Annals of Operations Research 25 (1990) 43–58.
[546] A.P. Kamath, N.K. Karmarkar, K.G. Ramakrishnan and M.G.C. Resende, A continuous
approach to inductive inference, Mathematical Programming 57 (1992) 215–238.
[547] Y. Kambayashi, Logic design of programmable logic arrays, IEEE Transactions on
Computers C-28 (1979) 609–617.
[548] M. Karchmer, N. Linial, I. Newman, M. Saks and A. Wigderson, Combinatorial characterization of read-once formulae, Discrete Mathematics 114 (1993) 275–282.
[549] H. Karloff and U. Zwick, A 7/8-approximation algorithm for MAX 3SAT?, in: Proceedings of the 38th Annual IEEE Symposium on the Foundations of Computer Science, IEEE, 1997, pp. 406–415.
[550] R.M. Karp, Reducibility among combinatorial problems, in: R.E. Miller and
J.W. Thatcher, eds., Complexity of Computer Computations, Plenum Press, New York,
1972, pp. 85–103.
[551] R.M. Karp, M. Luby and N. Madras, Monte-Carlo approximation algorithms for
enumeration problems, Journal of Algorithms 10 (1989) 429–448.
[552] M. Karpinski, H. Kleine Büning and P.H. Schmitt, On the computational complexity of
quantified Horn clauses, in: E. Börger, H. Kleine Büning and M.M. Richter, eds., CSL’87,
First Workshop on Computer Science Logic, Lecture Notes in Computer Science, Vol. 329,
Springer-Verlag, Berlin, 1988, pp. 129–137.
[553] S.A. Kauffman, The Origins of Order: Self-Organization and Selection in Evolution,
Oxford University Press, New York, 1993.
[554] H.A. Kautz, M.J. Kearns and B. Selman, Horn approximations of empirical data, Artificial
Intelligence 74 (1995) 129–145.
[555] H. Kautz and B. Selman, Knowledge compilation and theory of approximation, Journal
of the ACM 43 (1996) 193–224.
[556] H. Kautz and B. Selman, Pushing the envelope: Planning, propositional logic, and stochastic search, in: Proceedings of the 13th National Conference on Artificial Intelligence, Portland, OR, 1996, pp. 1188–1194.
[557] H. Kautz, B. Selman and Y. Jiang, A general stochastic approach to solving problems with hard and soft constraints, in: D. Du, J. Gu and P.M. Pardalos, eds., Satisfiability Problem: Theory and Applications, DIMACS series in Discrete Mathematics and Theoretical Computer Science, Vol. 35, American Mathematical Society, 1997, pp. 573–586.
[558] D.J. Kavvadias, C.H. Papadimitriou and M. Sideri, On Horn envelopes and hypergraph
transversals, in: K.W. Ng et al., eds., Algorithms and Computation – ISAAC’93, Lecture
Notes in Computer Science, Vol. 762, Springer-Verlag, Berlin, 1993, pp. 399–405.
[559] D.J. Kavvadias and E.C. Stavropoulos, An efficient algorithm for the transversal
hypergraph generation, Journal of Graph Algorithms and Applications 9 (2005) 239–264.
[560] M. Kearns, M. Li and L. Valiant, Learning Boolean functions, Journal of the Association
for Computing Machinery 41 (1994) 1298–1328.
[561] H. Kellerer, U. Pferschy and D. Pisinger, Knapsack Problems, Springer-Verlag, Berlin-Heidelberg-New York, 2004.
[562] L. Khachiyan, E. Boros, K. Elbassioni and V. Gurvich, Generating all minimal integral
solutions to AND-OR systems of monotone inequalities: Conjunctions are simpler than
disjunctions, Discrete Applied Mathematics 156 (2008) 2020–2034.
[563] S. Khanna, M. Sudan and D.P. Williamson, A complete classification of the approximability of maximization problems derived from Boolean constraint satisfaction, in: Proceedings of the 29th Annual ACM Symposium on the Theory of Computing, 1997, pp. 11–20.
[564] R. Khardon, Translating between Horn representations and their characteristic models,
Journal of Artificial Intelligence Research 3 (1995) 349–372.
[565] R. Khardon, H. Mannila and D. Roth, Reasoning with examples: Propositional formulae
and database dependencies, Acta Informatica 36 (1999) 267–286.
[566] R. Khardon and D. Roth, Reasoning with models, Artificial Intelligence 87 (1996)
187–213.
[567] S. Khot, G. Kindler, E. Mossel and R. O’Donnell, Optimal inapproximability results for
MAX-CUT and other 2-variable CSPs?, SIAM Journal on Computing 37 (2007) 319–357.
[568] P. Kilby, J.K. Slaney, S. Thiébaux and T. Walsh, Backbones and backdoors in satisfiability,
AAAI Proceedings, 2005, pp. 1368–1373.
[569] V. Klee and P. Kleinschmidt, Convex polytopes and related complexes, in: R. Graham, M.
Grötschel and L. Lovász, eds., Handbook of Combinatorics, Elsevier, Amsterdam, 1995,
pp. 875–917.
[570] H. Kleine Büning, On generalized Horn formulas and k-resolution, Theoretical Computer
Science 116 (1993) 405–413.
[571] H. Kleine Büning and T. Lettmann, Propositional Logic: Deduction and Algorithms,
Cambridge University Press, Cambridge, 1999.
[572] D. Kleitman, On Dedekind’s problem: The number of monotone Boolean functions,
Proceedings of the American Mathematical Society 21 (1969) 677–682.
[573] D. Kleitman and G. Markowsky, On Dedekind’s problem: The number of isotone Boolean
functions. II, Transactions of the American Mathematical Society 213 (1975) 373–390.
[574] B. Klinz and G.J. Woeginger, Faster algorithms for computing power indices in weighted
voting games, Mathematical Social Sciences 49 (2005) 111–116.
[575] D.E. Knuth, The Art of Computer Programming, Volume 4, Fascicle 0, Introduction to
Combinatorial Algorithms and Boolean Functions, Stanford University, Stanford, CA,
2008. http://www-cs-faculty.stanford.edu/~knuth/taocp.html
[576] V. Kolmogorov and C. Rother, Minimizing nonsubmodular functions with graph cuts -
A review, IEEE Transactions on Pattern Analysis and Machine Intelligence 29 (2007)
1274–1279.
[577] V. Kolmogorov and R. Zabih, What energy functions can be minimized via graph cuts?,
IEEE Transactions on Pattern Analysis and Machine Intelligence 26 (2004) 147–159.
[578] A.D. Korshunov, The number of monotone Boolean functions, Problemy Kibernetiki 38
(1981) 5–108 (in Russian).
[579] A.D. Korshunov, Families of subsets of a finite set and closed classes of Boolean functions,
in: P. Frankl et al., eds., Extremal Problems for Finite Sets, János Bolyai Mathematical
Society, Budapest, Hungary, 1994, pp. 375–396.
[580] A.D. Korshunov, Monotone Boolean functions, Russian Mathematical Surveys 58 (2003)
929–1001.
[581] S. Kottler, M. Kaufmann and C. Sinz, Computation of renameable Horn backdoors, in:
Proceedings of the 11th International Conference on Theory and Applications of Satisfia-
bility Testing (SAT 2008), Lecture Notes in Computer Science, Vol. 4996, Springer-Verlag,
Berlin, 2008, pp. 154–160.
[582] R. Kowalski, Logic for Problem Solving, North-Holland, Amsterdam-New York, 1979.
[583] M. Krause and I. Wegener, Circuit complexity, in: Y. Crama and P.L. Hammer, eds.,
Boolean Models and Methods in Mathematics, Computer Science, and Engineering,
Cambridge University Press, Cambridge, 2010, pp. 506–530.
[584] L. Kroc, A. Sabharwal and B. Selman, Leveraging belief propagation, backtrack search,
and statistics for model counting, in: L. Perron and M.A. Trick, eds., Integration of AI and
OR Techniques in Constraint Programming for Combinatorial Optimization Problems,
Lecture Notes in Computer Science Vol. 5015, Springer-Verlag, Berlin Heidelberg, 2008,
pp. 127–141.
[585] P. Kučera, On the size of maximum renamable Horn sub-CNF, Discrete Applied
Mathematics 149 (2005) 126–130.
[586] W. Küchlin and C. Sinz, Proving consistency assertions for automotive product data
management, in: I. Gent, H. van Maaren and T. Walsh, eds., SAT2000: Highlights of
Satisfiability Research in the Year 2000, IOS Press, Amsterdam, 2000, pp. 327–342.
[587] H.W. Kuhn, The Hungarian method for solving the assignment problem, Naval Research
Logistics Quarterly 2 (1955) 83–97.
[588] O. Kullmann, New methods for 3-SAT decision and worst-case analysis, Theoretical
Computer Science 223 (1999) 1–72.
[589] J. Kuntzmann, Algèbre de Boole, Dunod, Paris, 1965. English translation: Fundamental
Boolean Algebra, Blackie and Son Limited, London and Glasgow, 1967.
[590] W. Kunz and D. Stoffel, Reasoning in Boolean Networks, Kluwer Academic Publishers,
Boston - Dordrecht - London, 1997.
[591] Z.A. Kuzicheva, Mathematical logic, in: A.N. Kolmogorov and A.P. Yushkevich, eds.,
Mathematics of the 19th Century, Volume 1, 2nd revised edition, Birkhäuser Verlag,
Basel, 2001, pp. 1–34.
[592] A.V. Kuznetsov, Non-repeating contact schemes and non-repeating superpositions of
functions of algebra of logic, in: Collection of Articles on Mathematical Logic and its
Applications to Some Questions of Cybernetics, Proceedings of the Steklov Institute of
Mathematics, Vol. 51, Academy of Sciences of USSR, Moscow, 1958, pp. 862–25.
[593] L. Lamport, The implementation of reliable distributed multiprocess systems, Computer
Networks 2 (1978) 95–114.
[594] M. Langlois, D. Mubayi, R.H. Sloan and G. Turán, Combinatorial problems for Horn
clauses, manuscript, 2008.
[595] M. Langlois, R.H. Sloan, B. Szörényi and G. Turán, Horn complements: Towards Horn-
to-Horn belief revision, in: D. Fox and C.P. Gomes, eds., Proceedings of the Twenty-Third
AAAI Conference on Artificial Intelligence, AAAI 2008, Chicago, Illinois, USA, 2008,
pp. 466–471.
[596] M. Langlois, R.H. Sloan and G. Turán, Horn upper bounds and renaming, in: J. Marques-
Silva and K.A. Sakallah, eds., Proceedings of the 10th International Conference on Theory
and Applications of Satisfiability Testing – SAT 2007, Lisbon, Portugal, 2007, pp. 80–93.
[597] E. Lapidot, Weighted majority games and symmetry groups of games, M.Sc. thesis (in
Hebrew), Technion, Haifa, Israel, 1968.
[598] E. Lapidot, The counting vector of a simple game, Proceedings of the American
Mathematical Society 31 (1972) 228–231.
[599] T. Larrabee, Test pattern generation using Boolean satisfiability, IEEE Transactions on
Computer-Aided Design 11 (1992) 4–15.
[600] A. Laruelle and M. Widgrén, Is the allocation of voting power among EU states fair?,
Public Choice 94 (1998) 317–339.
[601] M. Laurent and F. Rendl, Semidefinite programming and integer programming, in:
K. Aardal, G. Nemhauser and R. Weismantel, eds., Discrete Optimization, Elsevier,
Amsterdam, 2005, pp. 393–514.
[602] M. Laurent and A. Sassano, A characterization of knapsacks with the max-flow-min-cut
property, Operations Research Letters 11 (1992) 105–110.
[603] E.L. Lawler, Covering problems: Duality relations and a new method of solution, SIAM
Journal on Applied Mathematics 14 (1966) 1115–1132.
[604] E.L. Lawler, Combinatorial Optimization: Networks and Matroids, Holt, Rinehart and
Winston, New York, 1976.
[605] E.L. Lawler, J.K. Lenstra and A.H.G. Rinnooy Kan, Generating all maximal independent
sets: NP-hardness and polynomial-time algorithms, SIAM Journal on Computing 9 (1980)
558–565.
[606] D. Leech, The relationship between shareholding concentration and shareholder voting
power in British companies: A study of the application of power indices for simple games,
Management Science 34 (1988) 509–528.
[607] D. Leech, Designing the voting system for the Council of the European Union, Public
Choice 113 (2002) 437–464.
[608] D. Leech, Voting power in the governance of the International Monetary Fund, Annals of
Operations Research 109 (2002) 375–397.
[609] D. Leech, Computation of power indices, Warwick Economic Research Papers, Number
644, The University of Warwick, 2002.
[610] L.A. Levin, Universal’nye zadachi perebora, Problemy Peredachi Informatsii 9 (1973)
115–116 (in Russian); translated as: Universal sequential search problems, Problems of
Information Transmission 9 (1974) 265–266.
[611] M. Lewin, D. Livnat and U. Zwick, Improved rounding techniques for the MAX 2-SAT
and MAX DI-CUT problems, in: Integer Programming and Combinatorial Optimiza-
tion (IPCO), Lecture Notes in Computer Science, Vol. 2337, Springer-Verlag, Berlin
Heidelberg New York, 2002, pp. 67–82.
[612] H.R. Lewis, Renaming a set of clauses as a Horn set, Journal of the ACM 25 (1978)
134–135.
[613] C.M. Li and Anbulagan, Heuristics based on unit propagation for satisfiability problems,
Proceedings of the Fifteenth International Joint Conference on Artificial Intelligence,
Morgan Kaufmann, 1997, pp. 366–371.
[614] N. Linial and N. Nisan, Approximate inclusion-exclusion, Combinatorica 10 (1990)
349–365.
[615] N. Linial and M. Tarsi, Deciding hypergraph 2-colourability by H-resolution, Theoretical
Computer Science 38 (1985) 343–347.
[616] M.O. Locks, Inverting and minimalizing path sets and cut sets, IEEE Transactions on
Reliability R-27 (1978) 107–109.
[617] M.O. Locks, Inverting and minimizing Boolean functions, minimal paths and mini-
mal cuts: Noncoherent system analysis, IEEE Transactions on Reliability R-28 (1979)
373–375.
[618] M.O. Locks, Recursive disjoint products, inclusion-exclusion, and min-cut approxima-
tions, IEEE Transactions on Reliability R-29 (1980) 368–371.
[619] M.O. Locks, Recursive disjoint products: A review of three algorithms, IEEE Transactions
on Reliability R-31 (1982) 33–35.
[620] A. Lodi, K. Allemand and T.M. Liebling, An evolutionary heuristic for quadratic 0–1
programming, European Journal of Operational Research 119 (1999) 662–670.
[621] D.E. Loeb and A.R. Conway, Voting fairly: Transitive maximal intersecting families of
sets, Journal of Combinatorial Theory A 91 (2000) 386–410.
[622] L. Lovász, Normal hypergraphs and the perfect graph conjecture, Discrete Mathematics
2 (1972) 253–267.
[623] L. Lovász, On the ratio of optimal and integral and fractional covers, Discrete Mathematics
13 (1975) 383–390.
[624] L. Lovász, Submodular functions and convexity, in: A. Bachem, M. Grötschel and
B. Korte, eds., Mathematical Programming – The State of the Art, Springer-Verlag, Berlin,
1983, pp. 235–257.
[625] L. Lovász, An Algorithmic Theory of Numbers, Graphs and Convexity, Society for
Industrial and Applied Mathematics, Philadelphia, 1986.
[626] L. Lovász, Lecture Notes on Evasiveness of Graph Properties, Notes by Neal Young,
Computer Science Department, Princeton University, January 1994.
[627] D.W. Loveland, Automated Theorem-Proving: A Logical Basis, North-Holland, Amster-
dam, 1978.
[628] L. Löwenheim, Über das Auflösungsproblem im logischen Klassenkalkul, Sitzungs-
berichte der Berliner Mathematischen Gesellschaft 7 (1908) 89–94.
[629] L. Löwenheim, Über die Auflösung von Gleichungen im logischen Gebietkalkul,
Mathematische Annalen 68 (1910) 169–207.
[630] E. Lozinskii, Counting propositional models, Information Processing Letters 41 (1992)
327–332.
[631] W.F. Lucas, Measuring power in weighted voting systems, in: Case Studies in Applied
Mathematics, Mathematical Association of America, 1976, pp. 42–106. Also Chapter 9
in: S.J. Brams, W.F. Lucas and P.D. Straffin, Jr., eds., Political and Related Models,
Springer-Verlag, Berlin Heidelberg New York, 1983.
[632] W.F. Lucas, The apportionment problem, Chapter 14 in: S.J. Brams, W.F. Lucas and P.D.
Straffin, Jr., eds., Political and Related Models, Springer-Verlag, Berlin Heidelberg New
York, 1983.
[633] E.J. McCluskey, Minimization of Boolean functions, Bell System Technical Journal 35
(1956) 1417–1444.
[634] E.J. McCluskey, Introduction to the Theory of Switching Circuits, McGraw-Hill,
New York, 1965.
[635] E.J. McCluskey, Logic Design Principles, Prentice-Hall, Englewood Cliffs, New Jersey,
1986.
[636] R.M. McConnell and J.P. Spinrad, Modular decomposition and transitive orientation,
Discrete Mathematics 201 (1999) 189–241.
[637] S.T. McCormick, Submodular function minimization, in: K. Aardal, G.L. Nemhauser,
R. Weismantel, eds., Discrete Optimization, Handbooks in Operations Research and
Management Science, Vol. 12, Elsevier, Amsterdam, 2005, pp. 321–391.
[638] J.C.C. McKinsey, The decision problem for some classes of sentences without quantifiers,
Journal of Symbolic Logic 8 (1943) 61–76.
[639] I. McLean, Don’t let the lawyers do the math: Some problems of legislative districting in
the UK and the USA, Mathematical and Computer Modelling 48 (2008) 1446–1454.
[640] G.F. McNulty, Fragments of first order logic, I: Universal Horn logic, Journal of Symbolic
Logic 42 (1977) 221–237.
[641] T.-H. Ma, On the threshold dimension 2 graphs, Technical report, Institute of Information
Sciences, Academia Sinica, Taipei, Republic of China, 1993.
[642] F.J. MacWilliams and N.J.A. Sloane, The Theory of Error-Correcting Codes, North-
Holland, Amsterdam, The Netherlands, 1977.
[643] K. Maghout, Sur la détermination des nombres de stabilité et du nombre chroma-
tique d’un graphe, Comptes Rendus de l’Académie des Sciences de Paris 248 (1959)
3522–3523.
[644] K. Maghout, Applications de l’algèbre de Boole à la théorie des graphes et aux programmes
linéaires et quadratiques, Cahiers du Centre d’Etudes de Recherche Opérationnelle 5
(1963) 21–99.
[645] N.V.R. Mahadev and U. Peled, Threshold Graphs and Related Topics, Annals of Discrete
Mathematics Vol. 56, North-Holland, Amsterdam, The Netherlands, 1995.
[646] D. Maier, Minimal covers in the relational database model, Journal of the ACM 27 (1980)
664–674.
[647] D. Maier, The Theory of Relational Databases, Computer Science Press, Rockville, MD,
1983.
[648] D. Maier and D.S. Warren, Computing with Logic: Logic Programming with PROLOG,
Benjamin/Cummings Publishing Co., Menlo Park, CA, 1988.
[649] K. Makino, A linear time algorithm for recognizing regular Boolean functions, Journal of
Algorithms 43 (2002) 155–176.
[650] K. Makino, K. Hatanaka and T. Ibaraki, Horn extensions of a partially defined Boolean
function, SIAM Journal on Computing 28 (1999) 2168–2186.
[651] K. Makino and T. Ibaraki, Interior and exterior functions of Boolean functions, Discrete
Applied Mathematics 69 (1996) 209–231.
[652] K. Makino and T. Ibaraki, The maximum latency and identification of positive Boolean
functions, SIAM Journal on Computing 26 (1997) 1363–1383.
[653] K. Makino and T. Ibaraki, A fast and simple algorithm for identifying 2-monotonic positive
Boolean functions, Journal of Algorithms 26 (1998) 291–305.
[654] K. Makino and T. Ibaraki, Inner-core and outer-core functions of partially defined Boolean
functions, Discrete Applied Mathematics 96–97 (1999) 307–326.
[655] K. Makino, K. Yano and T. Ibaraki, Positive and Horn decomposability of partially defined
Boolean functions, Discrete Applied Mathematics 74 (1997) 251–274.
[656] J.A. Makowsky, Why Horn formulas matter in computer science: Initial structures and
generic examples, Journal of Computer and System Sciences 34 (1987) 266–292.
[657] A.I. Malcev, The Metamathematics of Algebraic Systems, Collected Papers: 1936–1967,
North Holland, Amsterdam, 1971.
[658] O.L. Mangasarian, Linear and nonlinear separation of patterns by linear programming,
Operations Research 13 (1965) 444–452.
[659] O.L. Mangasarian, Mathematical programming in neural networks, ORSA Journal on
Computing 5 (1993) 349–360.
[660] O.L. Mangasarian, R. Setiono and W.H. Wolberg, Pattern recognition via linear
programming: Theory and applications to medical diagnosis, in: T.F. Coleman and Y.
Li, eds., Large-Scale Numerical Optimization, SIAM Publications, Philadelphia, 1990,
pp. 22–30.
[661] I. Mann and L.S. Shapley, Values of large games VI: Evaluating the Electoral College
exactly, RM-3158, The Rand Corporation, Santa Monica, CA, 1962.
[662] H. Mannila and K. Mehlhorn, A fast algorithm for renaming a set of clauses as a Horn
set, Information Processing Letters 21 (1985) 261–272.
[663] H.K. Mannila and J. Räihä, Design of Relational Databases, Addison-Wesley, Woking-
ham, 1992.
[664] H.K. Mannila and J. Räihä, Algorithms for inferring functional dependencies, Data and
Knowledge Engineering 12 (1994) 83–99.
[665] H.K. Mannila, H. Toivonen and A.I. Verkamo, Efficient algorithms for discovering associ-
ation rules, in: U.M. Fayyad and R. Uthurusamy, eds., AAAI Workshop on Knowledge
Discovery in Databases, 1994, pp. 181–192.
[715] N. Nishimura, P. Ragde and S. Szeider, Detecting backdoor sets with respect to Horn
and binary clauses, Seventh International Conference on Theory and Applications of
Satisfiability Testing – SAT04, 2004, Vancouver, Canada.
[716] R. O’Donnell, Some topics in analysis of Boolean functions, in: Proceedings of the 40th
ACM Annual Symposium on Theory of Computing (STOC), 2008, pp. 569–578.
[717] R. O’Donnell and R.A. Servedio, The Chow parameters problem, in: Proceedings of the
40th ACM Annual Symposium on Theory of Computing (STOC), 2008, pp. 517–526.
[718] H. Ono, K. Makino and T. Ibaraki, Logical analysis of data with decomposable structures,
Theoretical Computer Science 289 (2002) 977–995.
[719] G. Owen, Multilinear extensions of games, Management Science 18 (1972) 64–79.
[720] G. Owen, Game Theory, Academic Press, San Diego, 1995.
[721] P. Padawitz, Computing in Horn Clause Theories, Springer-Verlag, Berlin, 1988.
[722] M.W. Padberg, Perfect zero-one matrices, Mathematical Programming 6 (1974) 180–196.
[723] M.W. Padberg, The Boolean quadric polytope: Some characteristics, facets and relatives,
Mathematical Programming 45 (1989) 139–172.
[724] G. Palubeckis, Iterated tabu search for the unconstrained binary quadratic optimization
problem, Informatica 17 (2006) 279–296.
[725] C.H. Papadimitriou, Computational Complexity, Addison Wesley Publishing Co.,
Reading, MA, 1994.
[726] C.H. Papadimitriou and K. Steiglitz, Combinatorial Optimization: Algorithms and
Complexity, Prentice Hall, Englewood Cliffs, NJ, 1982.
[727] L. Papayanopoulos, Computerized weighted voting reapportionment, in: AFIPS Proceed-
ings, Vol. 50, 1981, pp. 623–629.
[728] L. Papayanopoulos, On the partial construction of the semi-infinite Banzhaf polyhedron,
in: A.V. Fiacco and K.O. Kortanek, eds., Semi-Infinite Programming and Applications,
Lecture Notes in Economics and Mathematical Systems, Vol. 215, Springer-Verlag,
Berlin-Heidelberg-New York, 1983, pp. 208–218.
[729] L. Papayanopoulos, DD analysis: Variational and computational properties of power
indices, Research Report 83-18, Graduate School of Management, Rutgers University,
NJ, 1983.
[730] P.M. Pardalos and S. Jha, Complexity of uniqueness and local search in quadratic 0–1
programming, Operations Research Letters 11 (1992) 119–123.
[731] R. Paturi, P. Pudlák, M.E. Saks and F. Zane, An improved exponential-time algorithm
for k-SAT, in: Proceedings of the 39th Annual IEEE Symposium on the Foundations of
Computer Science, IEEE, 1998, pp. 628–637.
[732] M.C. Paull and E.J. McCluskey, Jr., Boolean functions realizable with single threshold
devices, Proceedings of the IRE 48 (1960) 1335–1337.
[733] M.C. Paull and S.H. Unger, Minimizing the number of states in incompletely specified
sequential switching functions, IRE Transactions on Electronic Computers EC-8 (1959)
356–367.
[734] J. Peer and R. Pinter, Minimal decomposition of Boolean functions using non-repeating
literal trees, in: Proceedings of the International Workshop on Logic and Architecture
Synthesis, IFIP TC10 WD10.5, Grenoble, 1995, pp. 129–139.
[735] U.N. Peled and B. Simeone, Polynomial-time algorithms for regular set-covering and
threshold synthesis, Discrete Applied Mathematics 12 (1985) 57–69.
[736] U.N. Peled and B. Simeone, An O(nm)-time algorithm for computing the dual of a regular
Boolean function, Discrete Applied Mathematics 49 (1994) 309–323.
[737] B. Peleg, A theory of coalition formation in committees, Journal of Mathematical
Economics 7 (1980) 115–134.
[738] B. Peleg, Coalition formation in simple games with dominant players, International
Journal of Game Theory 10 (1981) 11–33.
[739] L.S. Penrose, The elementary statistics of majority voting, Journal of the Royal Statistical
Society 109 (1946) 53–57.
[740] G. Pesant and C.-G. Quimper, Counting solutions of knapsack constraints, in: L. Perron
and M.A. Trick, eds., Integration of AI and OR Techniques in Constraint Programming for
Combinatorial Optimization Problems, Lecture Notes in Computer Science, Vol. 5015,
Springer-Verlag, Berlin-Heidelberg, 2008, pp. 203–217.
[741] R. Petreschi and B. Simeone, A switching algorithm for the solution of quadratic Boolean
equations, Information Processing Letters 11 (1980) 193–198.
[742] R. Petreschi and B. Simeone, Experimental comparison of 2-satisfiability algorithms,
RAIRO Recherche Opérationnelle 25 (1991) 241–264.
[743] C.A. Petri, Introduction to General Net Theory of Processes and Systems, Springer-Verlag,
Berlin, 1980.
[744] S.R. Petrick, A direct determination of the irredundant forms of a Boolean function from
the set of prime implicants, Technical Report AFCRC-TR-56-110, Air Force Cambridge
Research Center, Cambridge, MA, April 1956.
[745] J.-C. Picard and M. Queyranne, A network flow solution to some nonlinear
0–1 programming problems, with applications to graph theory, Networks 12 (1982)
141–159.
[746] J.-C. Picard and H.D. Ratliff, Minimum cuts and related problems, Networks 5 (1975)
357–370.
[747] E. Pichat, The disengagement algorithm or a new generalization of the exclusion
algorithm, Discrete Mathematics 17 (1977) 95–106.
[748] N. Pippenger, Galois theory for minors of finite functions, Discrete Mathematics 254
(2002) 405–419.
[749] L. Pitt and L.G. Valiant, Computational limitations on learning from examples, Journal
of the Association for Computing Machinery 35 (1988) 965–984.
[750] D. Plaisted and S. Greenbaum, A structure-preserving clause form translation, Journal of
Symbolic Computation 2 (1986) 293–304.
[751] G.R. Pogosyan, Classes of Boolean functions defined by functional terms, Multiple Valued
Logic 7 (2002) 417–448.
[752] R. Pöschel and I. Rosenberg, Compositions and clones of Boolean functions, in: Y. Crama
and P.L. Hammer, eds., Boolean Models and Methods in Mathematics, Computer Science,
and Engineering, Cambridge University Press, Cambridge, 2010, pp. 3–38.
[753] E.L. Post, The Two-Valued Iterative Systems of Mathematical Logic, Annals of Mathe-
matics Studies Vol. 5, Princeton University Press, Princeton, NJ, 1941.
[754] K. Prasad and J.S. Kelly, NP-completeness of some problems concerning voting games,
International Journal of Game Theory 19 (1990) 1–9.
[755] R.E. Prather, Introduction to Switching Theory: A Mathematical Approach, Allyn and
Bacon, Inc., Boston, MA, 1967.
[756] D. Pretolani, Satisfiability and Hypergraphs, Ph.D. thesis, University of Pisa, Pisa, Italy,
1992.
[757] D. Pretolani, A linear time algorithm for unique Horn satisfiability, Information Processing
Letters 48 (1993) 61–66.
[758] D. Pretolani, Efficiency and stability of hypergraph SAT algorithms, in: D.S. Johnson
and M.A. Trick, eds., Cliques, Coloring, and Satisfiability, DIMACS Series in Discrete
Mathematics and Theoretical Computer Science, Vol. 26, American Mathematical Society,
1996, pp. 479–498.
[759] J.S. Provan, Boolean decomposition schemes and the complexity of reliability computa-
tions, DIMACS Series in Discrete Mathematics Vol. 5, American Mathematical Society,
1991, pp. 213–228.
[760] J.S. Provan and M.O. Ball, Efficient recognition of matroid and 2-monotonic systems,
in: R.D. Ringeisen and F.S. Roberts, eds., Applications of Discrete Mathematics, SIAM,
Philadelphia, 1988, pp. 122–134.
[761] P. Pudlák, Lower bounds for resolution and cutting planes proofs and monotone
computations, Journal of Symbolic Logic 62 (1997) 981–998.
[762] P.W. Purdom, Solving satisfiability with less searching, IEEE Transactions on Pattern
Analysis and Machine Intelligence 6(4) (1984) 510–513.
[763] P.W. Purdom, A survey of average time analyses of satisfiability algorithms, Journal of
Information Processing 13 (1990) 449–455.
[764] I.B. Pyne and E.J. McCluskey, Jr., An essay on prime implicant tables, Journal of the
Society for Industrial and Applied Mathematics 9 (1961) 604–631.
[765] I.B. Pyne and E.J. McCluskey, Jr., The reduction of redundancy in solving prime implicant
tables, IRE Transactions on Electronic Computers EC-11 (1962) 473–482.
[766] W.V. Quine, The problem of simplifying truth functions, American Mathematical Monthly
59 (1952) 521–531.
[767] W.V. Quine, Two theorems about truth functions, Boletin de la Sociedad Matemática
Mexicana 10 (1953) 64–70.
[768] W.V. Quine, A way to simplify truth functions, American Mathematical Monthly 62 (1955)
627–631.
[769] W.V. Quine, On cores and prime implicants of truth functions, American Mathematical
Monthly (1959) 755–760.
[770] J.R. Quinlan, Induction of decision trees, Machine Learning 1 (1986) 81–106.
[771] J.R. Quinlan, C4.5: Programs for Machine Learning, Morgan Kaufmann Publishers,
1993.
[772] J.R. Quinlan, Data mining tools See5 and C5.0, published electronically at
http://www.rulequest.com/see5-info.html/ (2000).
[773] R. Raghavan, J. Cohoon and S. Sahni, Single bend wiring, Journal of Algorithms 7 (1986)
232–257.
[774] V. Raghavan and S. Schach, Learning switch configurations, in: Proceedings of the Third
Annual Workshop on Computational Learning Theory, Morgan Kaufmann Publishers,
San Francisco, CA, 1990, pp. 38–51.
[775] C.C. Ragin, The Comparative Method: Moving Beyond Qualitative and Quantitative
Strategies, University of California Press, Berkeley-Los Angeles-London, 1987.
[776] S. Rai, M. Veeraraghavan and K.S. Trivedi, A survey of efficient reliability computation
using disjoint products approach, Networks 25 (1995) 147–163.
[777] K.G. Ramamurthy, Coherent Structures and Simple Games, Kluwer Academic Publishers,
Dordrecht, 1990.
[778] B. Randerath, E. Speckenmeyer, E. Boros, O. Čepek, P.L. Hammer, A. Kogan, K. Makino
and B. Simeone, Satisfiability formulation of problems on level graphs, in: H. Kautz and
B. Selman, eds., Proceedings of the LICS 2001 Workshop on Theory and Applications of
Satisfiability Testing (SAT 2001), Boston, MA, Electronic Notes in Discrete Mathematics
9 (2001) pp. 1–9.
[779] T. Raschle and K. Simon, Recognition of graphs with threshold dimension two, Proceed-
ings of the 27th Annual ACM Symposium on the Theory of Computing, Las Vegas, NE,
1995, pp. 650–661.
[780] C. Ré and D. Suciu, Approximate lineage for probabilistic databases, Proceedings of the
Very Large Database Endowment 1 (2008) 797–808.
[781] R.C. Read and R.E. Tarjan, Bounds on backtrack algorithms for listing cycles, paths, and
spanning trees, Networks 5 (1975) 237–252.
[782] I.S. Reed, A class of multiple error-correcting codes and the decoding scheme, IRE
Transactions on Information Theory IT-4 (1954) 38–49.
[783] R. Reiter, A theory of diagnosis from first principles, Artificial Intelligence 32 (1987)
57–95.
[784] J. Reiterman, V. Rödl, E. Šiňajová and M. Tůma, Threshold hypergraphs, Discrete
Mathematics 54 (1985) 193–200.
[785] M.G. Resende, L.S. Pitsoulis and P.M. Pardalos, Approximate solution of weighted
MAX-SAT problems using GRASP, in: D. Du, J. Gu and P.M. Pardalos, eds., Satis-
fiability Problem: Theory and Applications, DIMACS Series in Discrete Mathematics
and Theoretical Computer Science, Vol. 35, American Mathematical Society, 1997,
pp. 393–405.
[786] J.M.W. Rhys, A selection problem of shared fixed costs and network flows, Management
Science 17 (1970) 200–207.
[787] J.A. Robinson, A machine oriented logic based on the resolution principle, Journal of the
Association for Computing Machinery 12 (1965) 23–41.
[788] R.T. Rockafellar, Convex Analysis, Princeton University Press, Princeton, 1970.
[789] I.G. Rosenberg, 0–1 optimization and non-linear programming, Revue Française d’Auto-
matique, d’Informatique et de Recherche Opérationnelle (Série Bleue) 2 (1972) 95–97.
[790] I.G. Rosenberg, Reduction of bivalent maximization to the quadratic case, Cahiers du
Centre d’Etudes de Recherche Opérationnelle 17 (1975), 71–74.
[791] J. Rosenmüller, Nondegeneracy problems in cooperative game theory, in: A. Bachem,
M. Grötschel and B. Korte, eds., Mathematical Programming – The State of the Art,
Springer-Verlag, 1983, pp. 391–416.
[792] D. Roth, On the hardness of approximate reasoning, Artificial Intelligence 82 (1996)
273–302.
[793] J.P. Roth, Algebraic topological methods for the synthesis of switching systems,
Transactions of the American Mathematical Society 88 (1958) 301–326.
[794] C. Rother, V. Kolmogorov, V. Lempitsky and M. Szummer, Optimizing binary MRFs via
extended roof duality, in: IEEE Conference on Computer Vision and Pattern Recognition,
June 2007.
[795] S. Rudeanu, Boolean Functions and Equations, North-Holland, Amsterdam, 1974.
[796] S. Rudeanu, Lattice Functions and Equations, Springer-Verlag, Heidelberg, 2001.
[797] Y. Sagiv, C. Delobel, D.S. Parker and R. Fagin, An equivalence between relational database
dependencies and a fragment of propositional logic, Journal of the ACM 28 (1981)
435–453.
[798] S. Sahni and T. Gonzalez, P-complete approximation problems, Journal of the ACM 23
(1976) 555–565.
[799] M. Saks, Slicing the hypercube, in: K. Walker, ed., Surveys in Combinatorics, Cambridge
University Press, Cambridge, 1993, pp. 211–255.
[800] M.T. Salvemini, B. Simeone and R. Succi, Analisi del possesso integrato nei gruppi di
imprese mediante grafi, L’Industria XVI(4) (1995) 641–662.
[801] E.W. Samson and B.E. Mills, Circuit minimization: Algebra and algorithms for new
Boolean canonical expressions, Air Force Cambridge Research Center, Technical Report
TR 54-21, 1954.
[802] T. Sang, F. Bacchus, P. Beame, H.A. Kautz and T. Pitassi, Combining component caching
and clause learning for effective model counting, in: SAT 2004 - The Seventh International
Conference on Theory and Applications of Satisfiability Testing, Vancouver, Canada,
2004, pp. 20–28.
[803] A.A. Sapozhenko, On the complexity of disjunctive normal forms obtained by the use
of the gradient algorithm, in: Discrete Analysis, Vol. 21, Novosibirsk, 1972, pp. 62–71
(in Russian).
[804] T. Sasao, Switching Theory for Logic Synthesis, Kluwer Academic Publishers, Norwell,
Massachusetts, 1999.
[805] M. Sauerhoff, I. Wegener and R. Werchner, Optimal ordered binary decision diagrams
for read-once formulas, Discrete Applied Mathematics 46 (1993) 235–251.
[806] A.A. Schäffer and M. Yannakakis, Simple local search problems that are hard to solve,
SIAM Journal on Computing 20 (1991) 56–87.
[807] T.J. Schaefer, The complexity of satisfiability problems, in: Proceedings of the
10th Annual ACM Symposium on the Theory of Computing, San Diego, CA, 1978,
pp. 216–226.
[808] I. Schiermeyer, Pure literal lookahead: An O(1.497^n) 3-satisfiability algorithm, in:
Proceedings of the Workshop on Satisfiability, Siena, Italy, 1996, pp. 63–72.
[809] J.S. Schlipf, F.S. Annexstein, J.V. Franco and R.P. Swaminathan, On finding solutions for
extended Horn formulas, Information Processing Letters 54 (1995) 133–137.
[810] L. Schmitz, An improved transitive closure algorithm, Computing 30 (1983) 359–371.
[811] W.G. Schneeweiss, Boolean Functions with Engineering Applications and Computer
Programs, Springer-Verlag, Berlin, New York, 1989.
[812] A. Schrijver, Theory of Linear and Integer Programming, Wiley-Interscience Series in
Discrete Mathematics and Optimization, John Wiley & Sons, Chichester, 1986.
[813] A. Schrijver, A combinatorial algorithm minimizing submodular functions in strongly
polynomial time, Journal of Combinatorial Theory B 80 (2000) 346–355.
[814] A. Schrijver, Combinatorial Optimization: Polyhedra and Efficiency, Springer, Berlin,
2003.
[815] M.G. Scutellà, A note on Dowling and Gallier’s top-down algorithm for propositional
Horn satisfiability, Journal of Logic Programming 8 (1990) 265–273.
[816] J. Sebelik and P. Stepanek, Horn clause programs for recursive functions, in: K.L. Clark
and S.-A. Tarnlund, eds., Logic Programming, Academic Press, 1982, pp. 325–340.
[817] D. Seinsche, On a property of the class of n-colorable graphs, Journal of Combinatorial
Theory B 16 (1974) 191–193.
[818] B. Selman, H. Kautz and B. Cohen, Noise strategies for improving local search, in:
Proceedings of the Twelfth National Conference on Artificial Intelligence, Seattle, WA,
1994, pp. 337–343.
[819] B. Selman, H. Kautz and B. Cohen, Local search strategies for satisfiability testing,
in: D.S. Johnson and M.A. Trick, eds., Cliques, Coloring, and Satisfiability, DIMACS
Series in Discrete Mathematics and Theoretical Computer Science, Vol. 26, American
Mathematical Society, 1996, pp. 521–531.
[820] B. Selman, H. Levesque and D. Mitchell, A new method for solving hard satisfiabil-
ity problems, in: AAAI’92, Proceedings of the Tenth National Conference on Artificial
Intelligence, San Jose, CA, 1992, pp. 440–446.
[821] P.D. Seymour, The forbidden minors of binary matroids, Journal of the London
Mathematical Society Ser. 2, 12 (1976) 356–360.
[822] P.D. Seymour, The matroids with the max-flow min-cut property, Journal of Combinato-
rial Theory B 23 (1977) 189–222.
[823] P.D. Seymour, Decomposition of regular matroids, Journal of Combinatorial Theory B
28 (1980) 305–359.
[824] G. Shafer, A Mathematical Theory of Evidence, Princeton University Press, Princeton,
1976.
[825] G. Shafer, Perspectives on the theory and practice of belief functions, International
Journal of Approximate Reasoning 4 (1990) 323–362.
[826] R. Shamir and R. Sharan, A fully dynamic algorithm for modular decomposition and
recognition of cographs, Discrete Applied Mathematics 136 (2004) 329–340.
[827] C.E. Shannon, The synthesis of two-terminal switching circuits, Bell System Technical
Journal 28 (1949) 59–98.
[828] L.S. Shapley, Simple games: An outline of the descriptive theory, Behavioral Science 7
(1962) 59–66.
[829] L.S. Shapley, Cores of convex games, International Journal of Game Theory 1 (1971)
11–26.
[830] D.R. Shier and D.E. Whited, Algorithms for generating minimal cutsets by inversion,
IEEE Transactions on Reliability R-34 (1985) 314–318.
[831] I. Shmulevich, E.R. Dougherty and W. Zhang, From Boolean to probabilistic Boolean
networks as models of genetic regulatory networks, Proceedings of the IEEE 90 (2002)
1778–1792.
[832] I. Shmulevich and W. Zhang, Binary analysis and optimization-based normalization of
gene expression data, Bioinformatics 18 (2002) 555–565.
[833] B. Simeone, Quadratic 0–1 Programming, Boolean Functions and Graphs, Ph.D. thesis,
University of Waterloo, Ontario, Canada, 1979.
[834] B. Simeone, Consistency of quadratic Boolean equations and the Kőnig-Egerváry property
for graphs, Annals of Discrete Mathematics 25 (1985) 281–290.
[835] B. Simeone, D. de Werra and M. Cochand, Recognition of a class of unimodular functions,
Discrete Applied Mathematics 29 (1990) 243–250.
[836] I. Singer, Extensions of functions of 0–1 variables and applications to combinatorial
optimization, Numerical Functional Analysis and Optimization 7 (1984-85) 23–62.
[837] P. Slavík, A tight analysis of the greedy algorithm for set cover, Journal of Algorithms 25
(1997) 237–254.
[838] R.H. Sloan, B. Szörényi and G. Turán, Learning Boolean functions with queries, in:
Y. Crama and P.L. Hammer, eds., Boolean Models and Methods in Mathematics, Computer
Science, and Engineering, Cambridge University Press, Cambridge, 2010, pp. 221–256.
[839] R.H. Sloan, K. Takata and G. Turán, On frequent sets of Boolean matrices, Annals of
Mathematics and Artificial Intelligence 24 (1998) 1–4.
[840] N.J.A. Sloane, The On-Line Encyclopedia of Integer Sequences, published electronically
at http://www.research.att.com/∼njas/sequences/ (2006).
[841] J.-G. Smaus, On Boolean functions encodable as a single linear pseudo-Boolean con-
straint, in: P. Van Hentenryck and L.A. Wolsey, eds., Proceedings of the 4th International
Conference on Integration of AI and OR Techniques in Constraint Programming for Com-
binatorial Optimization Problems (CPAIOR 2007), Lecture Notes in Computer Science,
Vol. 4510, Springer-Verlag, Berlin-Heidelberg, 2007, pp. 288–302. Full version available
as: Technical Report 230, Institut für Informatik, Universität Freiburg, Germany, 2007.
[842] D.R. Smith, Bounds on the number of threshold functions, IEEE Transactions on
Electronic Computers EC-15 (1966) 368–369.
[843] J.D. Smith, M.J. Murray, Jr. and J.P. Minda, Straight talk about linear separability,
Journal of Experimental Psychology: Learning, Memory, and Cognition 23 (1997)
659–680.
[844] Z. Stachniak, Going non-clausal, in: Fifth International Symposium on the Theory and
Applications of Satisfiability Testing, SAT 2002, Cincinnati, Ohio, 2002, pp. 316–322.
[845] K.E. Stecke, Formulation and solution of nonlinear integer production planning problems
for flexible manufacturing systems, Management Science 29 (1983) 273–288.
[846] P.R. Stephan, R.K. Brayton and A.L. Sangiovanni-Vincentelli, Combinational test gen-
eration using satisfiability, IEEE Transactions on Computer-Aided Design of Integrated
Circuits and Systems 15 (1996) 1167–1176.
[847] A. Sterbini and T. Raschle, An O(n^3) time algorithm for recognizing threshold dimension
2 graphs, Information Processing Letters 67 (1998) 255–259.
[848] R.R. Stoll, Set Theory and Logic, Dover Publications, New York, 1979.
[849] H. Störmer, Binary Functions and their Applications, Lecture Notes in Economics and
Mathematical Systems, Vol. 348, Springer, Berlin, 1990.
[850] P.D. Straffin, Game Theory and Strategy, The Mathematical Association of America,
Washington, 1993.
[851] M. Sugeno, Fuzzy measures and fuzzy integrals: a survey, in: M.M. Gupta, G.N. Saridis
and B.R. Gaines, eds., Fuzzy Automata and Decision Processes, North-Holland, Amster-
dam, 1977, pp. 89–102.
[852] R. Swaminathan and D.K. Wagner, The arborescence realization problem, Discrete
Applied Mathematics 59 (1995) 267–283.
[853] O. Sykora, An optimal algorithm for renaming a set of clauses into the Horn set, Computers
and Artificial Intelligence 4 (1985) 37–43.
[854] S. Szeider, Backdoor sets for DLL subsolvers, Journal of Automated Reasoning 35 (2005)
73–88.
[855] W. Szwast, On Horn spectra, Theoretical Computer Science 82 (1991) 329–339.
[856] K. Takata, A worst-case analysis of the sequential method to list the minimal hitting sets
of a hypergraph, SIAM Journal on Discrete Mathematics 21 (2007) 936–946.
[857] M. Tannenbaum, The establishment of a unique representation for a linearly separable
function, Lockheed, Technical Note No. 20, 1961.
[858] R.E. Tarjan, Depth first search and linear graph algorithms, SIAM Journal on Computing
1 (1972) 146–160.
[859] R.E. Tarjan, Amortized computational complexity, SIAM Journal on Algebraic and
Discrete Methods 6 (1985) 306–318.
[860] A.D. Taylor and W.S. Zwicker, Simple games and magic squares, Journal of Combinato-
rial Theory A 71 (1995) 67–88.
[861] A.D. Taylor and W.S. Zwicker, Simple Games: Desirability Relations, Trading, Pseu-
doweightings, Princeton University Press, Princeton, NJ, 1999.
[862] A. Thayse, Boolean Calculus of Differences, Lecture Notes in Computer Science, Vol. 101,
Springer-Verlag, Berlin-Heidelberg-New York, 1981.
[863] A. Thayse, From Standard Logic to Logic Programming, John Wiley & Sons, Chichester
etc., 1988.
[864] P. Tison, Generalization of consensus theory and application to the minimization of
Boolean functions, IEEE Transactions on Electronic Computers EC-16, No. 4 (1967)
446–456.
[865] D.M. Topkis, Supermodularity and Complementarity, Princeton University Press, Prince-
ton, NJ, 1998.
[866] C.A. Tovey, Hill climbing with multiple local optima, SIAM Journal on Algebraic and
Discrete Methods 6 (1985) 384–393.
[867] C.A. Tovey, Low order polynomial bounds on the expected performance of local
improvement algorithms, Mathematical Programming 35 (1986) 193–224.
[868] C.A. Tovey, Local improvement on discrete structures, in: E. Aarts and J.K. Lenstra,
eds., Local Search in Combinatorial Optimization, John Wiley & Sons, Chichester, 1997,
pp. 57–89.
[869] M.A. Trick, A dynamic programming approach for consistency and propagation for
knapsack constraints, Annals of Operations Research 118 (2003) 73–84.
[870] K. Truemper, Monotone decomposition of matrices, Technical Report UTDCS-1-94,
1994.
[871] K. Truemper, Effective Logic Computation, Wiley-Interscience, New York, 1998.
[872] G.S. Tseitin, On the complexity of derivations in propositional calculus, in: A.O. Slisenko,
ed., Studies in Constructive Mathematics and Mathematical Logic, Part II, Consultants
Bureau, New York, 1970, pp. 115–125. (Translated from the Russian).
[873] S. Tsukiyama, M. Ide, H. Ariyoshi and I. Shirakawa, A new algorithm for generating all
the maximal independent sets, SIAM Journal on Computing 6 (1977) 505–517.
[874] J.D. Ullman, Principles of Database and Knowledge-Base Systems, Vol. I: Classical
Database Systems, Computer Science Press, New York, 1988.
[875] J.D. Ullman, Principles of Database and Knowledge-Base Systems, Vol. II: The New
Technologies, Computer Science Press, New York, 1989.
[876] C. Umans, The minimum equivalent DNF problem and shortest implicants, Journal of
Computer and System Sciences 63 (2001) 597–611.
[877] C. Umans, T. Villa and A.L. Sangiovanni-Vincentelli, Complexity of two-level logic
minimization, IEEE Transactions on Computer-Aided Design of Integrated Circuits and
Systems 25 (2006) 1230–1246.
[878] T. Uno, Efficient computation of power indices for weighted majority games, NII
Technical Report NII-2003-006E, National Institute of Informatics, Japan, 2003.
[879] R.H. Urbano and R.K. Mueller, A topological method for the determination of the minimal
forms of a Boolean function, IRE Transactions on Electronic Computers EC-5 (1956)
126–132.
[880] A. Urquhart, Hard examples for resolution, Journal of the Association for Computing
Machinery 34 (1987) 209–219.
[881] A. Urquhart, The complexity of propositional proofs, Bulletin of Symbolic Logic 1 (1995)
425–467.
[882] A. Urquhart, Proof theory, in: Y. Crama and P.L. Hammer, eds., Boolean Models and
Methods in Mathematics, Computer Science, and Engineering, Cambridge University
Press, Cambridge, 2010, pp. 79–98.
[883] L.G. Valiant, The complexity of enumeration and reliability problems, SIAM Journal on
Computing 8 (1979) 410–421.
[884] L.G. Valiant, A theory of the learnable, Communications of the ACM 27 (1984)
1134–1142.
[885] A. Van Gelder, A satisfiability tester for non-clausal propositional calculus, Information
and Computation 79 (1988) 1–21.
[886] A. Van Gelder and Y.K. Tsuji, Satisfiability testing with more reasoning and less guessing,
in: D.S. Johnson and M.A. Trick, eds., Cliques, Coloring, and Satisfiability, DIMACS
Series in Discrete Mathematics and Theoretical Computer Science, Vol. 26, American
Mathematical Society, 1996, pp. 559–586.
[887] J. van Leeuwen, Graph algorithms, in: J. van Leeuwen, ed., Handbook of Theoretical
Computer Science: Algorithms and Complexity, Volume A, The MIT Press, Cambridge,
MA, 1990, pp. 525–631.
[888] H. Vantilborgh and A. van Lamsweerde, On an extension of Dijkstra’s semaphore
primitives, Information Processing Letters 1 (1972) 181–186.
[889] Yu.L. Vasiliev, On the comparison of the complexity of prime irredundant and mini-
mal DNFs, in: Problems of Cybernetics, Vol. 10, PhysMatGIz, Moscow, 1963, pp. 5–61
(in Russian).
[890] Yu.L. Vasiliev, The difficulties of minimizing Boolean functions using universal
approaches, Doklady Akademii Nauk SSSR, Vol. 171, No. 1, 1966, pp. 13–16
(in Russian).
[891] T. Villa, R.K. Brayton and A.L. Sangiovanni-Vincentelli, Synthesis of multi-level Boolean
networks, in: Y. Crama and P.L. Hammer, eds., Boolean Models and Methods in Math-
ematics, Computer Science, and Engineering, Cambridge University Press, Cambridge,
2010, pp. 675–722.
[892] H. Vollmer, Introduction to Circuit Complexity: A Uniform Approach, Springer, Berlin -
New York, 1999.
[893] J. von Neumann and O. Morgenstern, Theory of Games and Economic Behavior, Princeton
University Press, Princeton, NJ, 1944.
[894] B.W. Wah and Y. Shang, A discrete Lagrangian-based global-search method for solving
satisfiability problems, in: D. Du, J. Gu and P.M. Pardalos, eds., Satisfiability Prob-
lem: Theory and Applications, DIMACS series in Discrete Mathematics and Theoretical
Computer Science, Vol. 35, American Mathematical Society, 1997, pp. 365–392.
[895] D. Waltz, Understanding line drawings of scenes with shadows, in: P.H. Winston, ed., The
Psychology of Computer Vision, McGraw-Hill, New York, 1975.
[896] C. Wang, Boolean minors, Discrete Mathematics 141 (1995) 237–258.
[897] C. Wang and A.C. Williams, The threshold order of a Boolean function, Discrete Applied
Mathematics 31 (1991) 51–69.
[898] H. Wang, H. Xie, Y.R. Yang, L.E. Li, Y. Liu and A. Silberschatz, Stable egress route
selection for interdomain traffic engineering: Model and analysis, in: Proceedings of
Thirteenth IEEE Conference on Network Protocols (ICNP ’05), Boston, 2005, pp. 16–29.
[899] S. Warshall, A theorem on Boolean matrices, Journal of the ACM 9 (1962) 11–12.
[900] A. Warszawski, Pseudo-Boolean solutions to multidimensional location problems,
Operations Research 22 (1974) 1081–1096.
[901] W.D. Wattenmaker, G.I. Dewey, T.D. Murphy and D.L. Medin, Linear separability
and concept learning: Context, relational properties, and concept naturalness, Cognitive
Psychology 18 (1986) 158–194.
[902] I. Wegener, The Complexity of Boolean Functions, Wiley-Teubner Series in Computer
Science, John Wiley & Sons, Chichester etc., 1987.
[903] I. Wegener, Branching Programs and Binary Decision Diagrams: Theory and Applica-
tions, SIAM Monographs on Discrete Mathematics and Applications, SIAM, Philadel-
phia, PA, 2000.
[904] R. Weismantel, On the 0–1 knapsack polytope, Mathematical Programming 77 (1997)
49–68.
[905] D.J.A. Welsh, Matroid Theory, London Mathematical Society Monographs, Vol. 8,
Academic Press, New York, 1976.
[906] D.J.A. Welsh, Matroids: Fundamental concepts, in: R. Graham, M. Grötschel
and L. Lovász, eds., Handbook of Combinatorics, Elsevier, Amsterdam, 1995,
pp. 481–526.
[907] D. Wiedemann, Unimodal set-functions, Congressus Numerantium 50 (1985) 165–169.
[908] D. Wiedemann, A computation of the eighth Dedekind number, Order 8 (1991) 5–6.
[909] D.J. Wilde and J.M. Sanchez-Anton, Discrete optimization on a multivariable Boolean
lattice, Mathematical Programming 1 (1971) 301–306.
[910] H.P. Williams, Experiments in the formulation of integer programming problems,
Mathematical Programming Studies 2 (1974) 180–197.
[911] H.P. Williams, Linear and integer programming applied to the propositional calculus,
Systems Research and Information Sciences 2 (1987) 81–100.
[912] H.P. Williams, Logic applied to integer programming and integer programming applied
to logic, European Journal of Operational Research 81 (1995) 605–616.
[913] R. Williams, C. Gomes and B. Selman, Backdoors to typical case complexity, in: Pro-
ceedings of the International Joint Conference on Artificial Intelligence (IJCAI) 2003,
pp. 1173–1178.
[914] J.M. Wilson, Compact normal forms in propositional logic and integer programming
formulations, Computers and Operations Research 17 (1990) 309–314.
[915] R.O. Winder, More about threshold logic, in: IEEE Symposium on Switching Circuit
Theory and Logical Design, 1961, pp. 55–64.
[916] R.O. Winder, Single stage threshold logic, in: IEEE Symposium on Switching Circuit
Theory and Logical Design, 1961, pp. 321–332.
[917] R.O. Winder, Threshold Logic, Ph.D. Dissertation, Department of Mathematics, Princeton
University, Princeton, NJ, 1962.
[918] R.O. Winder, Properties of threshold functions, IEEE Transactions on Electronic
Computers EC-14 (1965) 252–254.
[919] R.O. Winder, Enumeration of seven-argument threshold functions, IEEE Transactions
on Electronic Computers EC-14 (1965) 315–325.
[920] R.O. Winder, Chow parameters in threshold logic, Journal of the Association for
Computing Machinery 18 (1971) 265–289.
[921] P.H. Winston, Artificial Intelligence, Addison-Wesley, Reading, MA, 1984.
[922] L.A. Wolsey, Faces for a linear inequality in 0–1 variables, Mathematical Programming
8 (1975) 165–178.
[923] L.A. Wolsey, An analysis of the greedy algorithm for the submodular set covering problem,
Combinatorica 2 (1982) 385–393.
[924] L.A. Wolsey, Integer Programming, Wiley-Interscience Series in Discrete Mathematics
and Optimization, John Wiley & Sons, New York, 1998.
[925] L. Wos, R. Overbeek, E. Lusk and J. Boyle, Automated Reasoning: Introduction and
Applications, Prentice-Hall, Englewood Cliffs, NJ, 1984.
[926] Z. Xing and W. Zhang, MaxSolver: An efficient exact algorithm for (weighted) maximum
satisfiability, Artificial Intelligence 164 (2005) 47–80.
[927] M. Yagiura, M. Kishida and T. Ibaraki, A 3-flip neighborhood local search for the set
covering problem, European Journal of Operational Research 172 (2006) 472–499.
[928] S. Yajima and T. Ibaraki, A lower bound on the number of threshold functions, IEEE
Transactions on Electronic Computers EC-14 (1965) 926–929.
[929] S. Yajima and T. Ibaraki, On relations between a logic function and its characteristic
vector, Journal of the Institute of Electronic and Communication Engineers of Japan 50
(1967) 377–384 (in Japanese).
[930] M. Yamamoto, An improved Õ(1.234^m)-time deterministic algorithm for SAT, in: X.
Deng and D. Du, eds., Algorithms and Computation - ISAAC 2005, Lecture Notes in
Computer Science, Vol. 3827, Springer-Verlag, Berlin-Heidelberg, 2005, pp. 644–653.
[931] S. Yamasaki and S. Doshita, The satisfiability problem for a class consisting of Horn
sentences and some non-Horn sentences in propositional logic, Information and Control
59 (1983) 1–12.
[932] M. Yannakakis, Node- and edge-deletion NP-complete problems, in: Proceedings of the
10th Annual ACM Symposium on Theory of Computing (STOC) 1978, ACM, NY, USA,
pp. 253–264.
[933] M. Yannakakis, The complexity of the partial order dimension problem, SIAM Journal
on Algebraic and Discrete Methods 3 (1982) 351–358.
[934] M. Yannakakis, On the approximation of maximum satisfiability, Journal of Algorithms
17 (1994) 475–502.
[935] E. Zemel, Easily computable facets of the knapsack polytope, Mathematics of Operations
Research 14 (1989) 760–764.
[936] H. Zhang and J.E. Rowe, Best approximations of fitness functions of binary strings,
Natural Computing 3 (2004) 113–124.
[937] Yu.I. Zhuravlev, Set-theoretical methods in Boolean algebra, Problems of Cybernetics 8
(1962) 5–44 (in Russian).
[938] S. Živný, D.A. Cohen and P.G. Jeavons, The expressive power of binary submodular
functions, Discrete Applied Mathematics 157 (2009) 3347–3358.
[939] Yu.A. Zuev, Approximation of a partial Boolean function by a monotonic Boolean
function, U.S.S.R. Computational Mathematics and Mathematical Physics 18 (1979)
212–218.
[940] Yu.A. Zuev, Asymptotics of the logarithm of the number of threshold functions of the
algebra of logic, Soviet Mathematics Doklady 39 (1989) 512–513.
[941] U. Zwick, Approximation algorithms for constraint satisfaction problems involving at
most three variables per constraint, in: SODA ’98: Proceedings of the 9th Annual ACM-
SIAM Symposium on Discrete Algorithms, 1998, SIAM, Philadelphia, PA, pp. 201–210.
Index
2-Sat problem, see quadratic equation
3-Sat problem, see DNF equation, degree 3
absorption, 9, 26
   closure, 131
affine Boolean function, 110
algebraic normal form, see representation over GF(2)
aligned function, 398–399
almost-positive pseudo-Boolean function, 604
apportionment problem, 434
approximation algorithm, 117
arborescence, 613
artificial intelligence, 50–52, 68, 124, 174, 273–274, 279–280, 511, 566, 569, 599
association rule, 521–522
asummable function, 414–417
   k-asummable
      definition, 414
   2-asummable
      complete monotonicity, 396
      definition, 395
      vs. threshold function, 416
   threshold function, 414
   weakly asummable function, 429
   Chow function, 430
backdoor set, 83
Banzhaf index, 57–58
   and pseudo-Boolean approximations, 593
   and strength preorder, 360
   definition, 57
   in reliability, 59
   of threshold functions, 349, 434, 436, 437
   raw, 57
BDD, see binary decision diagram (BDD)
belief function, 569
belt function, 159
bidirected graph, 206, 210
binary decision diagram (BDD), 46–49
   and orthogonal DNF, 48
   ordered (OBDD), 47
bipartite graph, 542, 549, 611
   and conflict codes, 216, 592
   and Guttman scale, 443
   and posiforms, 604
   complete, 611
   recognition, 219
black box oracle, see oracle algorithm
blocker, 179
Boolean equation
   complexity, 72–74, 104–111
   consistent, 67
   definition, 67, 73
   DNF, see DNF equation
   generating all solutions, 112
   inconsistent, 67
   parametric solutions, 113–115
Boolean expression
   definition, 10
   dual, 14
   equivalent, 12
   length, size, 12
   of a function, 10–13
   read-once, 448
   satisfiable, 68
   tautology, 68
   valid, 68
Boolean function, 3
   expression, representation, 10–13
   normal form representations, 15–19