Binary Decision Diagrams and Applications for VLSI CAD
THE KLUWER INTERNATIONAL SERIES
IN ENGINEERING AND COMPUTER SCIENCE
by
Shin-ichi Minato
FOREWORD
PREFACE
1 INTRODUCTION
1.1 Background
1.2 Outline of the Book
6 ZERO-SUPPRESSED BDDS
6.1 BDDs for Sets of Combinations
6.2 Zero-Suppressed BDDs
6.3 Manipulation of ZBDDs
6.4 Unate Cube Set Algebra
6.5 Implementation and Applications
6.6 Conclusion
10 CONCLUSIONS
REFERENCES
INDEX
FOREWORD
Symbolic Boolean manipulation using Binary Decision Diagrams (BDDs) has been
applied successfully to a wide variety of tasks, particularly in very large scale
integration (VLSI) computer-aided design (CAD). The concept of decision graphs
as an abstract representation of Boolean functions dates back to early work by Lee
and Akers. In the last ten years, BDDs have found widespread use as a concrete
data structure for symbolic Boolean manipulation. With BDDs, functions can be
constructed, manipulated, and compared by simple and efficient graph algorithms.
Since Boolean functions can represent not just digital circuit functions, but also such
mathematical domains as sets and relations, a wide variety of CAD problems can be
solved using BDDs.
Although I can claim some credit in initiating the use of BDDs for symbolic Boolean
manipulation, the state of the art in the field has been advanced by researchers around
the world. In particular, the group headed by Prof. Shuzo Yajima at Kyoto University
has been the source of many important research results as well as the spawning
ground for some of the most productive researchers. Shin-Ichi Minato is a prime
example of this successful research environment. While a Master's student at Kyoto
University, he and his colleagues developed important refinements to the BDD data
structure, including a shared representation, attributed edges, and improved variable
ordering techniques. Since joining the NTT LSI Laboratories, Minato has continued
to make important research contributions. Perhaps most significant among these is
the use of a zero-suppressed reduction rule when using BDDs to represent sparse sets.
These "ZBDDs" have proved effective for such tasks as the cube-set manipulations in
two-level logic optimization, as well as for representing polynomial expressions.
This book is based on Minato's PhD dissertation and hence focuses on his particular
contributions to the field of BDDs. The book provides valuable information for
both those who are new to BDDs and long-time aficionados. Chapters
1-4 provide a thorough and self-contained background to the area, in part because
Minato's contributions have become part of the mainstream practice. Chapters 5-7
present an area for which Minato's contributions have been of central importance,
namely the encoding of combinatorial problems as sparse sets and their solution using
ZBDDs. Chapters 8-9 carry these ideas beyond Boolean functions, using BDDs to represent arithmetic formulas and multi-valued functions.
Randal E. Bryant
Carnegie Mellon University
PREFACE
Over the past ten years, since the appearance of Bryant's paper in 1986, BDDs have
attracted the attention of many researchers because of their suitability for representing
Boolean functions. They are now widely used in many practical VLSI CAD systems. I
hope that this book can serve as an introduction to BDD techniques and that it presents
several new ideas on BDDs and their applications. I expect many computer scientists
and engineers will be interested in this book since Boolean function manipulation is a
fundamental technique not only in digital system design but also in exploring various
problems in computer science.
Chapter 9 can be read independently of the other chapters. Here we apply BDD techniques to solve various problems in computer science, such as the 8-queens problem, the traveling salesman problem, and a class of scheduling problems. We
developed an arithmetic Boolean expression manipulator, which is a helpful tool for
the research and education related to discrete functions. This program, which runs on a SPARCstation, is open to the public. Anyone interested in it can get it from an anonymous FTP server authorized by the Information Processing Society of Japan:
eda.kuee.kyoto-u.ac.jp (130.54.29.134) /pub/cad/Bemll.
Throughout I have assumed that the reader of this book is a researcher, an engineer,
or a student who is familiar with switching theory. Since BDD is a graph-based
representation, I have endeavored to provide plentiful illustrations to help the reader's
understanding, and I have shown experimental results whenever possible, in order to
demonstrate the practical side of my work.
As always, this work would not have been possible without the help of many other
people. First I would like to express my sincere appreciation to Professor Shuzo
Yajima of Kyoto University for his continuous guidance, interesting suggestions, and
encouragement during this research. I would also like to express my thanks to Dr.
Nagisa Ishiura of Osaka University, who introduced me to the research field of VLSI
CAD and BDDs, and has been giving me invaluable suggestions, accurate criticism,
and encouragement throughout this research.
I also acknowledge interesting comments that I have received from Professor Hiromi
Hiraishi of Kyoto Sangyo University, Professor Hiroto Yasuura of Kyushu University,
and Associate Professor Naofumi Takagi of Nagoya University. I would like to thank
Dr. Kiyoharu Hamaguchi, Mr. Koichi Yasuoka, and Mr. Shoichi Hirose of Kyoto
University, Associate Professor Hiroyuki Ochi of Hiroshima City University, Mr.
Yasuhito Koumura of Sanyo Corporation, and Mr. Kazuya Ioki of IBM Japan Corporation
for their interesting discussions and for their collaboration in implementing the BDD
program package.
Special thanks are also due to Professor Saburo Muroga of the University of Illinois at
Urbana-Champaign and Professor Tsutomu Sasao of Kyushu Institute of Technology
for their instructive comments and advice on the publication of this book. I am
grateful to Dr. Masahiro Fujita and Mr. Yusuke Matsunaga of Fujitsu Corporation,
who started to work on BDDs early. I referred to their papers many times, and had
fruitful discussions with them.
I also wish to thank Dr. Olivier Coudert of Synopsys corporation for his discussions
on the implicit set representation, which led to the idea of ZBDDs. I would like
to thank Professor Gary D. Hachtel and Associate Professor Fabio Somenzi of the
University of Colorado at Boulder for their fruitful discussions on the application of
BDDs to combinatorial problems. I am also grateful to Professor Robert K. Brayton
of the University of California at Berkeley, Dr. Patrick McGeer of Cadence Berkeley
Laboratories, Dr. Yoshinori Watanabe of Digital Equipment Corporation, and Mr. Yuji
Kukimoto of the University of California at Berkeley for their interesting discussions
and suggestions on the techniques of BDD-based logic synthesis.
I wish to express my special gratitude to Mr. Yasuyoshi Sakai, Dr. Osamu Karatsu,
Mr. Tamio Hoshino, Mr. Makoto Endo, and all the other members of NTT Advanced
LSI Laboratory for giving me an opportunity to write this book and for their advice
and encouragement. Thanks are due to all the members of Professor Yajima's
Laboratory for their discussions and helpful support throughout this research.
Lastly, I thank my parents and my wife for their patience, support, and encouragement.
1
INTRODUCTION
1.1 BACKGROUND
Boolean function manipulation is one of the fundamental techniques in VLSI CAD. Typical tasks include the following:

• Generating Boolean function data, which is the result of a logic operation (such
as AND, OR, NOT, and EXOR), for given Boolean functions.
• Checking the tautology or satisfiability of a given Boolean function.
• Finding an assignment of input variables such that a given Boolean function
becomes 1, or counting the number of such assignments.
Various methods for representing and manipulating Boolean functions have been
developed, and some of the classical methods are truth tables, parse trees, and cube
sets.
Truth tables are suitable for manipulation on computers, especially on recent high-speed vector processors[IYY87] or parallel machines, but they need 2^n bits of memory to represent an n-input function - even a very simple one. A 100-input tautology function, for example, requires 2^100 bits of truth table. Since an exponential memory requirement leads to an exponential computation time, truth tables are impractical for manipulating Boolean functions with many input variables.
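The exponential growth is easy to demonstrate. The following sketch (ours, not from the text) stores the truth table of an n-input function as a 2^n-bit Python integer; each logic operation is then a single bitwise operation, but the representation doubles with every added input:

```python
# Truth tables as 2^n-bit integers: bit k of t holds f(assignment k).
# Logic operations are single bitwise operations, but the table itself
# needs 2^n bits - even for trivial functions such as a tautology.

def var_table(i, n):
    """Truth table of input variable x_i (0 <= i < n) over n inputs."""
    t = 0
    for k in range(2 ** n):
        if (k >> i) & 1:
            t |= 1 << k
    return t

def t_not(t, n):
    """NOT: flip all 2^n bits of the table."""
    return t ^ ((1 << 2 ** n) - 1)

def is_tautology(t, n):
    """Tautology check: every one of the 2^n bits must be 1."""
    return t == (1 << 2 ** n) - 1

n = 3
x0, x1, x2 = (var_table(i, n) for i in range(n))
f = (x0 & x1) | x2                       # table of (x0 AND x1) OR x2
print(is_tautology(f | t_not(f, n), n))  # f OR NOT f: prints True
```

For n = 100 inputs the integer would need 2^100 bits, which is exactly the impracticality noted above.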
Parse trees for Boolean expressions sometimes give compact representations for
functions that have many input variables, and that cannot be represented compactly
using truth tables. A drawback to the use of parse trees, however, is that there are
many different expressions for a given function. Checking the equivalence of two
expressions is very hard because it is a coNP-complete problem, although rule-based methods for transforming Boolean expressions have been developed[LCM89].
Cube sets (also called sum-of-products, PLA forms, covers, or two-level logics)
are regarded as a special form of Boolean expressions with a two-level AND-OR structure. They have been extensively studied for many years and have been
used to represent Boolean functions on computers. Cube sets sometimes give more
compact representation than truth tables, but redundant cubes may appear in logic
operations and they have to be reduced in order to check tautology or equivalency.
This reduction process is time-consuming. Other drawbacks are that the NOT operation cannot be performed easily and that the size of cube sets for parity functions becomes exponential.
Because the classical methods are impractical for large-scale problems, it has been necessary to develop an efficient method for representing practical Boolean functions. The basic concept of Binary Decision Diagrams (BDDs) - which are graph representations of Boolean functions - was introduced by Akers in 1978[Ake78], and efficient methods for manipulating BDDs were developed by Bryant in 1986[Bry86]. BDDs have since attracted the attention of many researchers because of their suitability for representing Boolean functions. We can easily check the equivalence of two functions because a BDD gives a canonical form for a Boolean function. Although a BDD may in the worst case become exponential in size with respect to the number of inputs, its size varies with the kind of function (unlike a truth table, which always requires 2^n bits of memory). One attractive feature of BDDs is that many practical functions can be represented by BDDs of feasible size.
There have been a number of attempts to improve the BDD technique in terms of execution time and memory space. One of them is the technique of shared BDDs (SBDDs)[MIY90], or multi-rooted BDDs, which manages a set of BDDs by joining them into a single graph. This method reduces the amount of memory required and makes it easy to check the equivalence of two BDDs. Another improvement is the use of attributed edges, which is described in Chapter 2.
Boolean function manipulators using BDDs with these improved methods are implemented on workstations and are widely distributed as BDD packages[Bry86, MIY90, MB88, BRB90] that have been utilized in various applications - especially in VLSI CAD systems - such as formal verification[FFK88, MB88, BCMD90], logic synthesis[CMF93, MSB93], and testing[CHJ+90, TIY91].
Despite the advantages of using BDDs to manipulate Boolean functions, there are
some problems to be considered in practical applications. One of the problems is
variable ordering. Conventional BDDs require the order of input variables to be fixed,
and the size of BDDs greatly depends on the order. It is hard to find the best order that
minimizes the size, and the variable ordering algorithm is one of the most important
issues in BDD utilization. Another problem occurs because we sometimes manipulate
ternary-valued functions containing don't care to mask unnecessary information. In
such cases, we have to devise a way of representing don't cares because the usual
BDDs deal only with binary logic. This issue can be generalized to the representation of multi-valued logics or integer functions. It is also necessary to efficiently transform BDD representations into other kinds of data structures, such as cube sets or Boolean
expressions. This is important because in practical applications it is of course
necessary to output the result of BDD manipulation.
As our understanding of BDDs has deepened, their range of applications has broadened.
Besides having to manipulate Boolean functions, we are often faced with manipulating
sets of combinations in many problems. One proposal is for multiple fault simulation
by representing sets of fault combinations with BDDs[TIY91]. Two others are
verification of sequential machines using BDD representation for state sets[BCMD90],
and computation of prime implicants using Meta products[CMF93], which represent
cube sets using BDDs. There is also a general method for solving binate covering
problems using BDDs[LS90]. By mapping a set of combinations into the Boolean
space, we can represent it as a characteristic function using a BDD. This method
enables us to manipulate a huge number of combinations implicitly, something that
had never been practical before. But because this BDD-based set representation
does not completely match the properties of BDDs, the BDDs sometimes grow large
because the reduction rules are not effective. There is room to improve the data
structure for representing sets of combinations.
This book discusses techniques related to BDDs and their applications for VLSI CAD systems. The remainder of this book is organized as follows.

In Chapter 2, we start by describing the basic concept of BDDs and shared BDDs. We then present the algorithms of Boolean function manipulation using BDDs. In implementing BDD manipulators on computers, memory management is an important issue for system performance. We show implementation techniques that make BDD manipulators applicable to practical problems. As an improvement of BDDs, we propose the use of attributed edges, which are edges carrying one of several sorts of attributes representing a certain operation. Using these techniques, we developed a BDD subroutine package for Boolean function manipulation. We present the results of some experiments evaluating the applicability of this BDD package to practical problems.
In Chapter 3, we discuss variable ordering for BDDs. This is important because the size of BDDs greatly depends on the order of the input variables. It is difficult to derive a method that always yields the best order to minimize BDDs, but with some heuristic methods, we can find a fairly good order in many cases. We first consider the general properties of variable ordering for BDDs, and then we propose two heuristic methods of variable ordering: the dynamic weight assignment method and the minimum-width method. Our experimental results show that these methods are effective in reducing BDD size in many cases, and are useful for practical applications.
Chapter 5 presents a fast method for generating prime-irredundant cube sets from given BDDs. Prime-irredundant denotes a form in which each cube is a prime implicant and no cube can be eliminated. Our algorithm generates compact cube sets directly from BDDs, in contrast to the conventional cube set reduction algorithms,
which commonly manipulate redundant cube sets or truth tables. Experimental results
demonstrate that our method is very efficient in terms of time and space.
In Chapter 8, we present another good application of the ZBDD technique, one that
manipulates arithmetic polynomial formulas containing higher-degree variables and
integer coefficients. This method enables us to represent large-scale polynomials
compactly and uniquely and to manipulate them in a practical time. Constructing
canonical forms of polynomials immediately leads to equivalence checking of arithmetic expressions. Polynomial calculus is a basic part of mathematics, so it is useful
for various problems.
In Chapter 9, we present a helpful tool for research on computer science. Our product, called BEM-II, calculates not only binary logic operations but also arithmetic operations
on multi-valued logics (such as addition, subtraction, multiplication, division, equality
and inequality). Such arithmetic operations provide simple descriptions for various
problems. We show the data structure and algorithms for the arithmetic operations.
We then describe the specification of BEM-II and some application examples.
2
TECHNIQUES OF BDD MANIPULATION

This chapter introduces the basic concept of BDDs, which are now commonly used
for Boolean function representation. We then discuss methods for manipulating BDDs
on computers and describe techniques for reducing computation time and memory
requirements.
A BDD is a directed acyclic graph with two terminal nodes, which we call the 0-terminal node and 1-terminal node. Each non-terminal node has an index to identify an input variable of the Boolean function and has two outgoing edges, called the 0-edge and 1-edge.
An Ordered BDD (OBDD) is a BDD where input variables appear in a fixed order in
all the paths of the graph and no variable appears more than once in a path. In this
book, we use natural numbers 1, 2, ... for the indices of the input variables, and every
non-terminal node has an index greater than those of its descendant nodes. 1
A compact OBDD is derived by reducing a binary tree graph, as shown in Fig. 2.1(b). In the binary tree, 0-terminals and 1-terminals represent logic values (0/1), and each non-terminal node represents a Shannon expansion with respect to its input variable.
1 Although this numbering is the reverse of that in Bryant's paper[Bry86], it is convenient because the variable index at the root node gives the number of variables for the function.
(a) A BDD for (x3 · x2 ∨ x1). (b) A binary decision tree.
f = ¬xi · f0 ∨ xi · f1,
where i is the index of the node and where f0 and f1 are the functions of the nodes pointed to by the 0-edge and the 1-edge.
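This expansion also tells us how to evaluate a BDD: starting from the root, follow the 0-edge or the 1-edge according to the value of the variable at each node until a terminal is reached. A minimal sketch, assuming a simple tuple encoding (index, 0-edge, 1-edge) of our own rather than the node-table representation described later:

```python
# Terminals are the integers 0 and 1; a non-terminal node is a tuple
# (index, low, high), where low/high are reached by the 0-/1-edge.
# Per the book's convention, the root carries the largest index.

def evaluate(node, assignment):
    """Descend 0-/1-edges to a terminal; assignment maps index -> 0/1."""
    while node not in (0, 1):
        index, low, high = node
        node = high if assignment[index] else low
    return node

# BDD of f = x3*x2 + x1 (cf. Fig. 2.1(a)): x3 is tested first.
n1 = (1, 0, 1)            # the function x1
f = (3, n1, (2, n1, 1))   # x3=0 -> x1;  x3=1 -> (x2=1 -> 1, else x1)
print(evaluate(f, {1: 0, 2: 1, 3: 1}))  # prints 1
```

Note that a path may skip a variable entirely (after the reduction described next); evaluation simply never consults it.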
1. Eliminate all the redundant nodes whose two edges point to the same node. (Fig. 2.2(a))

2. Share all the equivalent nodes that have the same index and the same pair of 0- and 1-edges. (Fig. 2.2(b))
ROBDDs give canonical forms for Boolean functions when the order of variables is
fixed. This property is very important for practical applications, since we can easily
check the equivalence of two Boolean functions simply by checking the isomorphism of their ROBDDs. Most work relating to BDDs is based on ROBDD techniques, so for simplicity in this book we refer to ROBDDs as BDDs.
Since there are 2^(2^n) kinds of n-input Boolean functions, the representation requires at least 2^n bits of memory in the worst case. It is known that a BDD for an n-input function includes O(2^n / n) nodes in general[LL92]. As each node consumes about O(n) bits (to distinguish its two child nodes from among the O(2^n / n) nodes), the total storage
space exceeds 2^n bits. The size of BDDs, however, varies with the kind of function, whereas truth tables always require 2^n bits of memory. There is a class of Boolean functions that can be represented by BDDs of polynomial size, and many practical functions fall into this class[IY90]. This is an attractive feature of BDDs.
A set of BDDs representing multiple functions can be united into a single graph consisting of BDDs sharing their subgraphs with each other, as shown in Fig. 2.3. This sharing saves the time and space needed to keep duplicate BDDs. When the isomorphic subgraphs are completely shared, two equivalent nodes never coexist. We call such graphs Shared BDDs (SBDDs)[MIY90], or multi-rooted BDDs.
• We can save the time and space needed for duplicate BDDs by only copying a pointer to the root node.
In a typical implementation of a BDD manipulator, all the nodes are stored in a node table in the main memory of the computer. Figure 2.4 shows, as an example, a node table representing the BDDs shown in Fig. 2.3. Each node has three basic attributes: an index of the input variable and the two pointers of the 0- and 1-edges. Some additional pointers and counters for maintaining the node table are attached to the node data. The 0- and 1-terminal nodes are at first allocated in the table as special nodes, and the other (non-terminal) nodes are gradually generated as the results of logic operations. By referring to the address of a node, we can immediately determine whether or not the node is a terminal.
In the shared BDD environment, all isomorphic subgraphs are shared; that is, two equivalent nodes should never coexist. To ensure this, before creating a new node we
always check the reduction rules shown in Fig. 2.2. If the 0- and 1-edges have the same destination or if an equivalent node already exists, we do not create a new node but simply copy the pointer to the existing node. We use a hash table that indexes all the nodes, so that an equivalent node can be found in constant time as long as the hash table works well. The effectiveness of the hash table is important since it is referred to frequently during BDD manipulation. With this technique, the uniqueness of the nodes is maintained, and every Boolean function can be identified by the one-word address of its root node.
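The unique-table discipline described above can be sketched as follows. The names (Node, mk_node, unique_table) are ours, and a real package would keep the nodes in one table with additional maintenance fields, but the two checks are exactly the reduction rules of Fig. 2.2:

```python
# Unique table: a dictionary keyed by (index, 0-edge, 1-edge), so that an
# equivalent node is never created twice and equivalence of functions
# reduces to pointer (identity) comparison.

class Node:
    __slots__ = ("index", "low", "high")
    def __init__(self, index, low, high):
        self.index, self.low, self.high = index, low, high

ZERO, ONE = 0, 1          # terminal "nodes"
unique_table = {}

def mk_node(index, low, high):
    """Create (or reuse) a node, applying both reduction rules."""
    if low is high:                    # rule 1: redundant node, skip it
        return low
    key = (index, id(low), id(high))   # children are already unique
    node = unique_table.get(key)
    if node is None:                   # rule 2: share equivalent nodes
        node = unique_table[key] = Node(index, low, high)
    return node

a = mk_node(1, ZERO, ONE)     # the function x1
b = mk_node(1, ZERO, ONE)     # the same function, requested again
print(a is b)                 # shared: prints True
print(mk_node(2, a, a) is a)  # redundant test on x2 eliminated: True
```

The uniqueness check is a constant-time hash lookup, which is why the behaviour of the hash table dominates performance, as noted above.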
In many cases, BDDs are generated as the results of logic operations. Figure 2.5 shows an example for (x1 · x2) ∨ x3. First, trivial BDDs for x1, x2, x3 are created. Then, by applying the AND operation between x1 and x2, the BDD for (x1 · x2) is generated. The final BDD for the entire expression is obtained as the result of the OR operation between (x1 · x2) and x3.
f ◇ g = ¬v · (f(v=0) ◇ g(v=0)) ∨ v · (f(v=1) ◇ g(v=1)),
where ◇ denotes a binary logic operation and v is the highest ordered variable in f and g. This expansion creates a new node with the variable v having two subgraphs generated by the suboperations (f(v=0) ◇ g(v=0)) and (f(v=1) ◇ g(v=1)). Repeating this expansion recursively for all the input variables, a trivial operation eventually appears and the result is obtained. (For instance, f · 1 = f, f ⊕ f = 0, etc.)
As mentioned in the previous section, we check the reduction rules and the hash table before creating a new node, so as to avoid duplicating any node.
Figure 2.6(a) shows an example. When we perform the operation between the nodes (1) and (5), the procedure is broken down into the binary tree shown in Fig. 2.6(b). We usually compute this tree in a depth-first manner. It might seem that this algorithm always takes time exponential in the number of inputs, since it traverses binary trees. However, some of these subtrees are redundant; for instance, the operations (3)-(4) and (4)-(7) appear more than once. We can accelerate the procedure by using a hash-based cache that memorizes the results of recent operations. By referring to the cache before every recursive call, we can avoid duplicate executions of equivalent suboperations. This technique enables binary logic operations to be executed in time almost proportional to the size of the BDDs.
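The whole procedure - Shannon expansion on the top variable, reduction rules on the way back up, and a cache of recent operations - can be sketched as follows. The encoding (interned tuples for nodes, a dictionary for the operation cache) is our own illustration, not the book's implementation:

```python
# The APPLY algorithm: expand f <op> g on the highest-index variable,
# memoize suboperations in a cache, and build result nodes through a
# unique table so the reduction rules are applied on the way back up.
unique = {}

def mk_node(index, low, high):
    if low == high:                          # reduction rule 1
        return low
    return unique.setdefault((index, low, high), (index, low, high))

def top_index(f):
    return 0 if f in (0, 1) else f[0]

def cofactors(f, v):
    """Restrictions f(v=0), f(v=1); trivial if f does not test v."""
    return (f[1], f[2]) if top_index(f) == v else (f, f)

cache = {}                                   # (op, f, g) -> result

def apply_op(op, f, g):
    if f in (0, 1) and g in (0, 1):
        return op(f, g)                      # trivial terminal case
    key = (op, f, g)
    if key not in cache:
        v = max(top_index(f), top_index(g))
        f0, f1 = cofactors(f, v)
        g0, g1 = cofactors(g, v)
        cache[key] = mk_node(v, apply_op(op, f0, g0),
                                apply_op(op, f1, g1))
    return cache[key]

AND = lambda a, b: a & b
OR = lambda a, b: a | b

x1, x2, x3 = (mk_node(i, 0, 1) for i in (1, 2, 3))
f = apply_op(OR, apply_op(AND, x1, x2), x3)          # (x1 AND x2) OR x3
print(f is apply_op(OR, x3, apply_op(AND, x2, x1)))  # canonical: True
```

Because results are interned through the unique table, building the same function along a different route yields the very same object, which is the canonicity property exploited throughout the chapter.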
The cache size is important for the performance of BDD manipulation. If the cache size is insufficient, the execution time grows rapidly. Usually we fix the cache size empirically; in many cases it is within a small factor, greater or smaller, of the number of nodes.
2.2.2 Negation
A BDD for ¬f, the complement of f, has a form similar to that of the BDD for f: just the 0-terminal and the 1-terminal are exchanged. Complementary BDDs
[Figure 2.6: (a) An example. (b) Structure of procedure calls.]
contain the same number of nodes, in contrast to the cube set representation, which sometimes suffers a great increase in data size. By using the binary operation (f ⊕ 1), we can compute ¬f in time linear in the size of the BDD. The operation can be improved to constant time, however, by using the negative edge. Negative edges are a kind of attributed edge, discussed in Section 2.4. This technique is now commonly used in many implementations.
In general, there are many solutions satisfying a function. If we define a cost for assigning "1" to each input variable, we can search for an assignment that minimizes the total cost[LS90], defined as Σ_{i=1}^{n} (ci · xi), where ci is a non-negative cost for the input variable xi (∈ {0, 1}). Many NP-complete problems can be described in this format.
With this method, we can immediately solve the problem if the BDD for the constraint
function can be generated in the main memory of the computer. There are many
practical examples where the BDD becomes compact. Of course, the problem remains intractable in general, so in the worst case the BDD requires an exponential number of nodes and overflows the memory.
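A sketch of this minimum-cost search, assuming the same illustrative tuple encoding of nodes as earlier: each node needs only the minimum over its two edges, where taking the 1-edge of a node with index i adds the cost ci, so one memoized bottom-up pass suffices:

```python
# Minimum-cost satisfying assignment in one memoized pass over a BDD.
# Terminals are 0/1; a non-terminal is (index, low, high).  cost[i] is
# the non-negative cost of setting x_i = 1; x_i = 0 costs nothing, so a
# variable skipped along a path is simply left at 0.
INF = float("inf")

def min_cost(node, cost, memo=None):
    if memo is None:
        memo = {}
    if node == 0:
        return INF              # no satisfying assignment below here
    if node == 1:
        return 0
    if node not in memo:
        i, low, high = node
        memo[node] = min(min_cost(low, cost, memo),
                         cost[i] + min_cost(high, cost, memo))
    return memo[node]

# f = x3*x2 + x1 with costs c1=5, c2=1, c3=1: taking x3=x2=1 costs 2,
# cheaper than the single-variable solution x1=1 at cost 5.
x1 = (1, 0, 1)
f = (3, x1, (2, x1, 1))
print(min_cost(f, {1: 5, 2: 1, 3: 1}))  # prints 2
```

The pass visits each node once, so the search is linear in the size of the BDD once the BDD itself fits in memory, as the text observes.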
We can also efficiently count the number of solutions satisfying f = 1. At the root node of f, the number of solutions is computed as the sum of the solutions of the two subfunctions f0 and f1. By using the cache technique to save the result at each node, we can compute the number of solutions in time proportional to the number of nodes in the BDD.
In a similar way, we can compute the truth table density for a given Boolean function represented by a BDD. The truth table density is the rate of 1's in the truth table, and it indicates the probability of satisfying f = 1 for an arbitrary assignment to the input variables. Using BDDs, we can compute it as the average of the densities of the two subfunctions at each node.
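Both computations can be sketched in a few lines, again with an illustrative tuple encoding of our own. The only subtlety is that a variable skipped on a path is free, contributing a factor of two to the count; the density is then the count divided by 2^n:

```python
# Counting satisfying assignments, and the truth table density, in one
# pass.  Terminals are 0/1; a non-terminal is (index, low, high), and
# indices grow toward the root (the book's convention), so a node with
# index idx reached while i variables remain has i - idx "free" skipped
# variables, each doubling the count.

def count_solutions(node, nvars):
    """Number of assignments of x1..x_nvars with f = 1."""
    memo = {}
    def count(node, i):           # top index of node is at most i
        if node == 0:
            return 0
        if node == 1:
            return 2 ** i
        idx, low, high = node
        if node not in memo:
            memo[node] = count(low, idx - 1) + count(high, idx - 1)
        return memo[node] * 2 ** (i - idx)
    return count(node, nvars)

def density(node, nvars):
    """Rate of 1's in the truth table of f."""
    return count_solutions(node, nvars) / 2 ** nvars

# f = x3*x2 + x1: five 1's among the 8 rows of the truth table.
x1 = (1, 0, 1)
f = (3, x1, (2, x1, 1))
print(count_solutions(f, 3), density(f, 3))  # prints 5 0.625
```

The memo dictionary plays the role of the per-node cache described above, so the cost is proportional to the number of nodes rather than to 2^n.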
When generating BDDs for Boolean expressions, many intermediate BDDs are temporarily generated. Deleting such already-used BDDs is important for memory efficiency. If a used BDD shares subgraphs with other BDDs, we cannot delete the shared subgraphs. In order to determine whether a node is still needed, a reference counter that records the number of incoming edges is attached to each node. When a BDD becomes unnecessary, we decrease the reference counter of its root node, and if it becomes zero, the node can be eliminated. When a node is actually deleted, we recursively execute this procedure on its descendant nodes. Conversely, when copying an edge to a BDD, the reference counter of the root node is increased.
Deletion of used nodes saves memory space, but there may be a time cost because we might eliminate nodes that later become necessary again. Moreover, deleting nodes breaks the consistency of the cache that memorizes recent operations, so we have to repair (or clear) the cache. To avoid these problems, we do not eliminate nodes whose reference counters have become zero until the memory becomes full; they are then deleted all at once, as a garbage collection.
The BDD manipulator is based on the hash table technique, so we have to allocate a hash table of fixed size when initializing the program. At that time it is difficult to estimate the final size of the BDDs to be generated, but allocating too much memory is inefficient because other application programs cannot use the space. In our implementation, a small amount of space is allocated initially, and if the table becomes full during BDD manipulation, it is re-allocated at two or four times its size (unless the memory overflows). When the table size reaches the limit of extension, the garbage collection process is invoked.
The attributed edge is a technique for reducing the computation time and memory required for BDDs by using edges carrying an attribute that represents a certain operation. Several kinds of attributed edges have been proposed[MIY90]. In this section, we describe three kinds: negative edges, input inverters, and variable shifters. The negative edge is particularly effective and is now widely implemented in BDD manipulators.
The negative edge is an attribute indicating that the function of the subgraph pointed to by the edge should be complemented, as shown in Fig. 2.7(a). This idea was introduced as Akers's inverter[Ake78] and as Madre and Billon's typed edge[MB88]. The use of negative edges results in some remarkable improvements:

• Logic operations can be accelerated by applying rules such as f · ¬f = 0, f ∨ ¬f = 1, and f ⊕ ¬f = 1.
• Using this quick negation, we can transform operations such as OR, NOR, and NAND into AND by applying De Morgan's theorem, thus raising the hit rate of the cache of recent operations.
The use of negative edges may break the uniqueness of BDDs. To preserve the uniqueness, we have to impose two constraints on where the negative attribute may appear (see Fig. 2.7).
If necessary, the negative edges can be carried over as shown in Fig. 2.7(b). These constraints are basically the same as those in Madre and Billon's work[MB88].
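The constant-time negation can be sketched by attaching a negation flag to every edge. The particular constraint enforced below (a single terminal, and no negative attribute on a 1-edge, pushing a violating mark up to the parent edge) is one common convention and stands in for the book's two constraints, which appear in Fig. 2.7:

```python
# Edges with a negative attribute: an edge is a pair (node, neg); neg =
# True means "complement the function below".  The convention enforced
# here - only a 0-terminal exists, and a 1-edge never carries the
# attribute - is one common choice that keeps the representation
# canonical, so equal functions still compare equal.
unique = {}
ZERO = ("0",)                            # the single terminal node
FALSE, TRUE = (ZERO, False), (ZERO, True)

def mk_edge(index, low, high):
    """Edge to a node computing (x_index ? high : low), canonically."""
    if low == high:                      # reduction rule 1
        return low
    flip = high[1]
    if flip:                             # complement both children and
        low = (low[0], not low[1])       # carry the mark to the parent
        high = (high[0], False)
    node = unique.setdefault((index, low, high), (index, low, high))
    return (node, flip)

def negate(edge):                        # constant-time NOT
    return (edge[0], not edge[1])

x1 = mk_edge(1, FALSE, TRUE)             # the function x1
not_x1 = mk_edge(1, TRUE, FALSE)         # NOT x1, built directly
print(negate(x1) == not_x1)              # same node, flipped mark: True
```

A function and its complement thus share every node and differ only in the mark on the incoming edge, which is the near-halving of graph size mentioned above.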
We propose another attribute indicating that the 0- and 1-edges at the next node should be exchanged (Fig. 2.8). Since this can be regarded as complementing an input variable of the node, we call it an input inverter. Using input inverters, we can reduce the size of BDDs by as much as half. There are cases where input inverters are very effective while negative edges are not (Fig. 2.9).
Since careless use of input inverters can also break the uniqueness of BDDs, we place a constraint on their use as well. We use input inverters so that the two nodes f0 and f1 pointed to by the 0- and 1-edges satisfy the constraint f0 < f1, where '<' represents an arbitrary total ordering of all the nodes. In our implementation, each edge identifies its destination by an address in the node table, so we define the order as the value of the address. Under this constraint, the BDD forms may vary with the addressing manner, but uniqueness is assured in the shared BDD environment.
The variable shifter is an attribute attached to an edge to indicate that a number should be added to the indices of all its descendant nodes. We do not record variable index information on each node itself, because the variable shifters give the distance between each pair of nodes. We place the following rules on the use of variable shifters:
2. On the edge pointing to the root node of a BDD, the variable shifter indicates the
absolute index number of the node.
3. Otherwise, a variable shifter indicates the difference between the index numbers
of the start node and the end node of the edge.
As shown in Fig. 2.10, for example, the graphs representing (x1 ∨ (x2 · x3)), (x2 ∨ (x3 · x4)), ..., (xk ∨ (xk+1 · xk+2)) can be joined into the same graph.
Using variable shifters, we can reduce the size of BDDs - especially when manipulating a number of regular functions, such as arithmetic systems. Another advantage
of using variable shifters is that we can raise the hit rate of the cache of operations by
applying the rule:
(f ◇ g = h) ⟺ (f(k) ◇ g(k) = h(k)),
where f(k) is the function whose indices are shifted by k from f and where ◇ is a binary logic operation such as AND, OR, or EXOR.
2. Define a mapping F : (S → S), such that for any f ∈ S1 there is a unique f0 ∈ S0 satisfying f = F(f0).
The above argument explains both negative edges and input inverters. Variable shifters can also be explained if we extend the argument to more than two partitions, as follows:
2. For any k ≥ 1, define a mapping Fk : (S → S), such that for any fk ∈ Sk there is a unique f0 ∈ S0 satisfying fk = Fk(f0).
In this section, we describe the implementation of a BDD package and present the
results of experiments evaluating its performance and the effect of attributed edges.
• Giving the trivial functions, such as 1 (tautology), 0 (inconsistency), and xk for
a given index k.
• Generating BDDs by applying logic operations such as NOT, AND, OR, EXOR,
and restriction.
The results are listed in Table 2.1. The circuit "sel8" is an 8-bit data selector, and
"enc8" is an 8-bit encoder. The circuits "add8" and "add16" are 8-bit and 16-bit adders,
and "mult4" and "mult8" are 4-bit and 8-bit multipliers. The rest are selected from
the benchmarks in ISCAS'85[BF85]. The column "BDD nodes" lists the number of
the nodes in the set of BDDs, and the total time for loading the circuit data, ordering
the input variables, and generating the BDDs is listed in column "Time (sec)."
The results show that we can represent the functions of these practical circuits quickly
and compactly. It took less than a minute to represent circuits with dozens of inputs
and hundreds of nets. We can see that the CPU time is almost proportional to the
number of nodes.
To evaluate the effect of the attributed edges, we conducted similar experiments
by incrementally applying the three kinds of attributed edges. The results are shown
in Table 2.2. Column (A) lists the results of the experiments using original BDDs
without any kind of attributed edges, and column (B) lists the results obtained when
using only negative edges. Comparing the results, we can see that the use of negative
edges reduces the number of nodes by as much as 40% and speeds up the computation
markedly.
Column (C) lists the results obtained when using both input inverters and negative
edges. The number of nodes is reduced owing to the use of input inverters, but there
are no remarkable differences in CPU time. Input inverters were especially effective
for the circuits "c499," "c1355," and "c1908," for which we can see up to 45%
reduction in size.
As we can see from the results listed in column (D), the additional use of the variable
shifters reduced graph size still more without producing remarkable differences in
the CPU time. Variable shifters are especially effective for circuits with regular
structures, such as arithmetic logic, and we can also see some effectiveness for other
circuits.
These experimental results show that the combination of the three attributed edges is
very effective in many cases, though none of them alone is effective for all
types of circuits.
2.6 CONCLUSION
BDDs give canonical forms of Boolean functions provided that the order of input
variables is fixed, but the BDD for a given function can have many different forms
depending on the permutation of the variables, and sometimes the size of BDDs
varies greatly with the order. The size of BDDs determines not only the memory
requirement but also the amount of execution time for their manipulation. The variable
ordering algorithm is thus one of the most important issues in the application of BDDs.
The effect of variable ordering depends on the kind of function to be handled. There
are very sensitive functions whose BDD size varies extremely (exponentially in the
number of inputs) when the order is merely reversed. Such functions often appear in
practical digital system designs. On the other hand, there are some kinds of functions
for which the variable ordering is ineffective. The symmetric functions, for example,
obviously have the same form for any variable order. It is known that the functions of
multipliers[Bry91] cannot be represented by a polynomial-sized BDD in any order.
There are some works on variable ordering. Concerning methods to find exactly
the best order, Friedman et al. presented an algorithm[FS87] of O(n²·3ⁿ) time based
on dynamic programming, where n is the number of inputs. For functions with
many inputs, it is still difficult to find the best order in a practical time, although this
algorithm has been improved to the point that the best order for some functions with
17 inputs can be found[ISY91].
From the practical viewpoint, heuristic methods have been studied extensively. Malik et
al.[MWBSV88] and Fujita et al.[FFK88] reported heuristic methods based on the
topological information of logic circuits. Butler et al.[BRKM91] use a testability
measure for the heuristics, which reflects not only topological but also logical information
of the circuit. These methods can find a (possibly) good order before generating BDDs.
26 CHAPTER 3
Figure 3.1 (a) Circuit. (b) In the best order. (c) In the worst order.
These methods have been applied to practical benchmark circuits, and in many cases
they compute a good order.
Fujita et al.[FMK91] showed another approach that improves the order for a given
BDD by repeatedly exchanging variables. It can give results better than the
initial BDDs, but it is sometimes trapped in a local optimum.
In this chapter, we discuss the properties of the variable ordering for BDDs and show
two heuristic methods we have developed for variable ordering.
1. (Local computability)
The inputs in a group with local computability should be near in the order.
Namely, inputs that are closely related should be kept near to each other.
Consider, for example, the BDD representing the function of the AND-OR 2-level
logic circuit with 2n inputs shown in Fig. 3.1(a). It takes 2n nodes under the
Variable Ordering for BDDs 27
ordering x1·x2 ∨ x3·x4 ∨ ... ∨ x2n−1·x2n (Fig. 3.1(b)), but under the ordering
x1·xn+1 ∨ x2·xn+2 ∨ ... ∨ xn·x2n it takes (2·2ⁿ − 2) nodes (Fig. 3.1(c)).
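For small n, this size gap can be checked by brute force: at each level of the order, a reduced BDD has one node per distinct subfunction that still depends on that level's variable. A minimal counting sketch (all names are illustrative; the function is the AND-OR example above with n = 3):

```python
from itertools import product

def bdd_size(f, n, order):
    """Count reduced-BDD nodes of the n-input 0/1 function f under the
    variable order given as a permutation of range(n)."""
    def subtable(prefix):
        # truth table of f after fixing the first len(prefix) ordered variables
        rows = []
        for rest in product((0, 1), repeat=n - len(prefix)):
            x = [0] * n
            for pos, bit in enumerate(prefix + rest):
                x[order[pos]] = bit
            rows.append(f(x))
        return tuple(rows)

    nodes = 0
    for k in range(n):
        subs = {subtable(p) for p in product((0, 1), repeat=k)}
        # one node per distinct subfunction that depends on variable order[k]
        nodes += sum(1 for t in subs if t[:len(t) // 2] != t[len(t) // 2:])
    return nodes

# f = x1·x2 ∨ x3·x4 ∨ x5·x6, i.e. the 2n-input AND-OR circuit with n = 3
f = lambda x: (x[0] & x[1]) | (x[2] & x[3]) | (x[4] & x[5])
assert bdd_size(f, 6, [0, 1, 2, 3, 4, 5]) == 2 * 3           # pairs adjacent: 2n
assert bdd_size(f, 6, [0, 2, 4, 1, 3, 5]) == 2 * 2 ** 3 - 2  # pairs split: 2·2^n − 2
```

The exponential enumeration limits this check to small n, but it reproduces the 2n versus 2·2ⁿ − 2 contrast exactly.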
If we could find a variable order that satisfies those two properties, the BDDs would
be compact. However, the two properties are mixed ambiguously, and sometimes they
require conflicting orders. Another problem is that when multiple
functions are represented together, those functions may require different orders. It is
difficult to find a point of compromise. For large-scale functions, automatic methods
giving an appropriate solution are desired.
• To find an appropriate order before generating BDDs by using the logic circuit
information that is the source of the Boolean function to be represented.
• To reduce BDD size by permuting the variables of a given BDD, starting with an
initial variable order.
The former approach refers only to the circuit information and does not use the detailed
logical data, so it can carry out the ordering in a short time. Although it sometimes gives a
poor result depending on the structure of the circuits, it is still one of the most effective
ways to deal with large-scale problems. The latter approach, on the other hand, can
find a fairly good order by fully using the logical information of the BDDs. It is useful
when the former method is not available or ineffective. A drawback of this approach
is that it cannot start if we fail to make an initial BDD of reasonable size.
We first propose the Dynamic Weight Assignment (DWA) method, which belongs to the
former approach, and show its experimental results. We then present the minimum-
width method, which is a reordering method we developed. The two methods can be
used in combination.
Considering the properties of the variable ordering of BDDs, we have developed the
Dynamic Weight Assignment (DWA) method[MIY90], in which the order is computed from
the topological information of a given combinational circuit.
3.2.1 Algorithm
We first assign a weight 1.0 to one of the primary outputs of the circuit, and then that
weight is propagated toward the primary inputs in the following manner:
1. At each gate, the weight on the output is divided equally and distributed to the
inputs.
2. At each fan-out point, the weights of the fan-out branches are accumulated into
the fan-out stem.
After this propagation, we give the highest order to the primary input with the largest
weight. Since the weights reflect the contribution to the primary output in a topological
sense, primary inputs with a large weight are expected to have a large influence on the
output function.
Next, after choosing the highest input, we delete the part of the circuit that can be
reached only from the primary input already chosen, and we re-assign the weights
from the beginning to choose the next primary input. By repeating this assignment
and deletion, we obtain the order of the input variables. An example of this procedure
is illustrated in Fig. 3.3(a) and (b). When we delete the subcircuit, the largest weight
in the prior assignment is distributed to the neighboring inputs in the re-assignment,
and their new weights are increased. Thereby, the neighboring inputs tend to be close
to the prior ones in the order.
When the circuit has multiple outputs, we have to choose one in order to start the
weight assignment. We start from the output with the largest logical depth from the
primary inputs. If some inputs remain unordered afterward, the output
with the next largest depth is selected to order the rest of the inputs.
This algorithm gives a good order in many cases, and the time complexity of this
method is O(m·n), where m is the number of gates and n is the number of the
primary inputs. This time complexity is tolerable for many practical uses.
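One weight-assignment pass of the procedure can be sketched as follows. The netlist format and example circuit are illustrative, not from the book; fan-out accumulation happens here by summing at each revisit of a primary input:

```python
def assign_weights(netlist, output):
    """One weight-assignment pass of the DWA idea (sketch).
    netlist maps each gate to the list of its fanin signals; any name
    absent from the dict is a primary input. Starts `output` at 1.0."""
    weights = {}

    def propagate(sig, w):
        if sig not in netlist:
            # primary input: weights arriving over fanout branches accumulate
            weights[sig] = weights.get(sig, 0.0) + w
        else:
            # gate: the weight on the output is divided equally over the inputs
            for fanin in netlist[sig]:
                propagate(fanin, w / len(netlist[sig]))

    propagate(output, 1.0)
    return weights

# f = (a AND b) OR (c AND d AND e): a, b receive 1/4 each; c, d, e 1/6 each
net = {"f": ["g1", "g2"], "g1": ["a", "b"], "g2": ["c", "d", "e"]}
w = assign_weights(net, "f")
assert abs(w["a"] - 0.25) < 1e-12 and abs(w["e"] - 1 / 6) < 1e-12
```

In the full method, the input with the largest weight is then fixed at the highest position, the subcircuit reachable only from it is deleted, and the pass is repeated on the remaining circuit.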
The ordering method is very effective except in a few cases which are insensitive to
the order. The random ordering is quite impractical. The use of the original order of
the circuit data sometimes gives a good result, but it is a passive way and it cannot
always be good. We can conclude that our ordering method is useful and essential for
many practical applications.
In this section, we describe a heuristic method based on reordering the variables for a
given BDD with an initial order[Min92b]. In the following, n denotes the number of
the input variables.
Several ordering methods based on the reordering approach have been proposed.
Fujita et al.[FMK91] presented an incremental algorithm based on the exchange of a
pair of variables (xi, xi+1), and Ishiura et al.[ISY91] presented a simulated annealing
method with the random exchange of two variables. A drawback of these incremental
search methods, though, is that their performance depends very much on the initial
order. If it is far from the best, many exchanges are needed. This takes a longer time
and increases the risk of being trapped in a bad local minimum.
We therefore propose another strategy. We first choose one variable based on a certain
cost function, fix it at the highest position (xn), and then choose another variable and
fix it at the second highest position (xn−1). In this manner, all the variables are chosen
one by one, and they are fixed from the highest to the lowest. This algorithm has no
backtracking and is robust against a bad initial order. In our method, we define the width
of BDDs as the cost function.
When choosing xk (1 ≤ k ≤ n), the variables with indices higher than k have already
been fixed and the form of the higher part of the graph never varies. Thus, the choice
of xk affects only the part of the graph lower than xk. The goal in each step is to
choose the xk that minimizes the size of the lower part of the graph. The cost function
should give a good estimation of the minimum size of the graph for each choice of xk,
and should be computable within a feasible time. As a cost function satisfying those
requirements, we define the width of BDDs here.
Definition 3.1 The width of a BDD at height k, denoted width_k, is the number of
edges crossing the section of the graph between xk and xk+1, where the edges pointing
to the same node are counted as one. The width between x1 and the bottom of the graph
is denoted width_0. □
An example is shown in Fig. 3.4. The width_k is sometimes greater than the number of
the nodes of xk because the width also includes edges that skip the nodes of xk.
Theorem 3.1 The width_k is constant for any permutation of {x1, x2, ..., xk} and
any permutation of {xk+1, xk+2, ..., xn}.
(Proof) width_k represents the total number of subfunctions obtained by assigning all
the combinations of Boolean values {0, 1}^(n−k) to the variables {xk+1, xk+2, ..., xn}.
Because the total number of subfunctions is independent of the order of the assignment,
the width_k is constant for any permutation of {xk+1, xk+2, ..., xn}.
All the subfunctions obtained by assigning all the combinations of Boolean values
{0, 1}^(n−k) to the variables {xk+1, xk+2, ..., xn} are uniquely represented by BDDs
with a uniform variable order. For any permutation of {x1, x2, ..., xk}, the number
of the subfunctions does not change because they are still represented uniquely with
a different but uniform variable order. Therefore, width_k never varies for any
permutation of {x1, x2, ..., xk}. □
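Theorem 3.1 can be checked by brute force for small functions, computing width_k directly as the number of distinct subfunctions. This is an illustrative sketch, not the book's BDD-based implementation:

```python
from itertools import permutations, product

def width(f, n, order, k):
    """width_k of the n-input function f under the given variable order:
    the number of distinct subfunctions remaining after assigning the
    n - k variables above height k in all 2**(n - k) ways."""
    subs = set()
    for prefix in product((0, 1), repeat=n - k):
        rows = []
        for rest in product((0, 1), repeat=k):
            x = [0] * n
            for pos, bit in enumerate(prefix + rest):
                x[order[pos]] = bit
            rows.append(f(x))
        subs.add(tuple(rows))
    return len(subs)

# Theorem 3.1: width_2 stays constant when we permute within the upper
# group {order[0], order[1]} or within the lower group {order[2], order[3]}
f = lambda x: (x[0] & x[1]) | (x[2] ^ x[3])
base = width(f, 4, [0, 1, 2, 3], 2)
for hi in permutations([0, 1]):
    for lo in permutations([2, 3]):
        assert width(f, 4, list(hi) + list(lo), 2) == base
```

Swapping a variable across the height-2 boundary, by contrast, generally changes the width.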
3.3.2 Algorithm
Our method uses the width of BDDs to estimate the complexity of the graph, and the
variables are chosen one by one from the highest to the lowest according to the cost
function. As shown in Fig. 3.5, we choose the xk that gives the minimum width_{k−1}. We
call this algorithm the minimum-width method. If there are two candidates with the
same width_{k−1}, we choose the one at the higher position in the initial order.
• We had better avoid making width_{k−1} large because width_{k−1} is a lower bound
on the number of the nodes below xk.
f = (x̄i · x̄j · f00) ∨ (x̄i · xj · f01) ∨ (xi · x̄j · f10) ∨ (xi · xj · f11),
where f00, f01, f10, and f11 are the subfunctions obtained by assigning 0/1 to
the two variables xi and xj.
Roughly speaking, the time complexity of our method is O(n²·G), where G is the
average size of the BDDs during the ordering process. This complexity is considerably
less than that of conventional algorithms seeking exactly the best order.
In our experiments, we generated initial BDDs for given logic circuits in a certain
order of variables and applied our ordering method to the initial BDDs. We used
negative edges.
The results for some examples are summarized in Table 3.2. In this table, "sel8,"
"enc8," "add8," and "mult6" are the same ones used in Section 2.5. The other items
were chosen from the benchmark circuits in DAC'86[dG86]. In general, these circuits
have multiple outputs. Our program handles multiple-output functions by using the
shared BDD technique. The performance of the reordering method greatly depends
on the initial order. We generated 10 initial BDDs in random orders, and applied our
ordering method in each case. The table lists the maximum, minimum, and average
numbers of the nodes before and after ordering. "Time (sec)" is the average ordering
time for the 10 BDDs (including the time needed for generating initial BDDs).
The results show that our method can reduce the size of BDDs remarkably for most
of the examples, except for "9sym," which is a symmetric function. Note that our
method consistently gives good results despite the various initial orders.
Similar experiments were then done using larger examples. The functions were chosen
from the benchmark circuits in ISCAS'85[BF85]. These circuits are too large and too
complicated to generate initial BDDs in a random order, so to obtain a good initial
order, we used the DWA method, which is described in Section 3.2.
As shown in Table 3.3, our method is also effective for large-scale functions. It takes
a longer time, but it is still faster than methods that seek exactly the best order. The sizes
of the BDDs after reordering are almost equal to those obtained by using the heuristic
methods based on the circuit information[MWBSV88, FFK88, MIY90, BRKM91].
Our results are useful for evaluating other heuristic methods of variable ordering.
Weak points of our method are that it takes more time than the heuristic methods using
the circuit information and that it requires a BDD with a certain initial order.
These results led us to conclude that it is effective to apply the minimum-width method
at first, and then apply the incremental search for the final optimization. The results
of our experiments with such a combination of methods are summarized in Table 3.4,
and show that the combination is more effective than either of the methods alone.
3.4 CONCLUSION
We have discussed properties of variable ordering, and have shown two heuristic
methods: the DWA method and the minimum-width method. The former finds an
appropriate order from the topological information of the circuit, and the latter reduces
the size of a given BDD by reordering its variables.
This evaluation of the variable ordering indicates that it is effective to apply these
heuristic methods in the following manner:
This sequence gives fairly good results for many practical problems, but because these
methods are only heuristics, there are cases in which they give poor results. Tani
et al.[THY93] have proved that the problem of finding the best order is NP-complete.
This implies that it is almost impossible to have a method of variable
ordering that always finds the best order in a practical time. We will therefore make do
with heuristic methods selected according to the applications.
The techniques of variable ordering are still being studied intensively, and one noteworthy
method is dynamic variable ordering, recently presented by Rudell[Rud93].
It is based on having the BDD package itself determine and maintain the variable
order. Every time the BDDs grow to a specified size, the reordering process is invoked
automatically, just like garbage collection. This method is very effective in reducing
BDD size, although it sometimes takes a long computation time.
4
REPRESENTATION OF
MULTI-VALUED FUNCTIONS
A Boolean function with don't care is regarded as a function from a Boolean vector
input to a ternary-valued output, denoted as:
f : {0, 1}ⁿ → {0, 1, d}.
Ternary-valued functions are manipulated by the extended logic operations, and the
rules of the logic operations between two ternary-valued functions are defined as
follows:
40 CHAPTER 4
We also define two special unary operations ⌈f⌉ and ⌊f⌋ that are important since they
are used for abstracting Boolean functions from ternary-valued functions:

f | ⌈f⌉ | ⌊f⌋
0 |  0  |  0
1 |  1  |  1
d |  1  |  0
There are two ways to represent ternary-valued functions by using BDDs. The first
one is to introduce ternary values '0', '1' and 'd' at the terminal nodes of BDDs, as
shown in Fig. 4.1. We call this BDD a ternary-valued BDD. This method is natural and
easy to understand, but it has the disadvantage that the operations ⌈f⌉ and ⌊f⌋ are not
easy and that we have to develop a new BDD package for ternary-valued functions.
Matsunaga et al. reported work[MF89] that uses such a BDD package.
The second way is to encode the ternary value into a pair of Boolean values, and
represent a ternary-valued function by a pair of Boolean functions, denoted as:
f : [f0, f1].
Representation of Multi-Valued Functions 41
0: [0, 0]
1: [1, 1]
d: [0, 1],
which differs from Bryant's code. The choice of encoding is important for the
efficiency of the operations. In this encoding, f0 and f1 express the functions ⌊f⌋ and
⌈f⌉, respectively. Under this encoding, the constant functions 0 and 1 are respectively
expressed as [0, 0] and [1, 1]. The chaos function is represented as [0, 1], and the logic
operations can be computed directly on the pairs.
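The operation rules themselves did not survive in this copy of the text; they follow directly from the encoding, as the following sketch illustrates on single ternary values (the book applies the same rules to whole BDDs; the names are ours):

```python
# Ternary values encoded as the pair [f0, f1] = [⌊f⌋, ⌈f⌉]:
#   0 -> (0, 0),  1 -> (1, 1),  d (don't care) -> (0, 1)
ZERO, ONE, DC = (0, 0), (1, 1), (0, 1)

def t_and(f, g):          # AND is componentwise on the pair
    return (f[0] & g[0], f[1] & g[1])

def t_or(f, g):           # OR is componentwise on the pair
    return (f[0] | g[0], f[1] | g[1])

def t_not(f):             # NOT swaps and complements the two bounds
    return (1 - f[1], 1 - f[0])

assert t_and(DC, ZERO) == ZERO and t_and(DC, ONE) == DC
assert t_or(DC, ONE) == ONE and t_or(DC, ZERO) == DC
assert t_not(DC) == DC and t_not(ONE) == ZERO
```

Because AND and OR act componentwise, a pair of ordinary BDD operations suffices, with no new package needed.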
To find out which method is more efficient, the ternary-valued BDDs or the BDD
pairs, we compare them by introducing the D-variable.
4.1.2 D-variable
We propose to use a special variable, which we call the D-variable, for representing
ternary-valued functions. As shown in Fig. 4.2(a), a pair of BDDs f : [f0, f1] can be
joined into a single BDD using the D-variable on the root node, whose 0- and 1-edges
point to f0 and f1, respectively. This BDD has the following elegant properties:
• The constant functions 0 and 1 are represented by the 0- and 1-terminal nodes,
respectively. The chaos function is represented by a BDD which consists of only
one non-terminal node with the D-variable.
• When the function f returns 0 or 1 for any input, that is, when f does not contain
don't care, f0 and f1 become the same function and the subgraphs are completely
shared. In such cases, the D-variable is redundant and is removed automatically.
Consequently, if f contains no don't care, its form becomes the same as that of a
usual BDD.
Using the D-variable, we can discuss the relation of the two representations: the ternary-
valued BDDs and the BDD pairs. In Fig. 4.2(a), the D-variable is ordered at the
highest position in the BDD. When the D-variable is reordered to the lowest position,
the form of the BDD changes as shown in Fig. 4.2(b). In this BDD, each path from
the root node to the 1-terminal node through the D-variable node represents an input
assignment that makes f0 = 0 and f1 = 1, namely f = d (don't care). The other paths
not through the D-variable node represent the assignments such that f = 0 or f = 1.
(Notice that there are no assignments to have f0 = 1 and f1 = 0.) Therefore, if we
regard the D-variable node as a terminal, this BDD corresponds to the ternary-valued
BDD.
Consequently, we can say that both the ternary-valued BDDs and the BDD pairs are
special forms of the BDDs using the D-variable, and we can compare the efficiency
of the two methods by considering the properties of variable ordering. From the
discussion in the previous chapter, we can conclude that the D-variable should be ordered
at the higher position when the D-variable greatly affects the function.
f : {0, 1}ⁿ → I.
MTBDDs are extended BDDs with multiple terminal nodes, each of which has an
integer value (Fig. 4.3). This method is natural and easy to understand, but we need to
develop a new BDD package to manipulate multi-terminals. Hachtel, Somenzi et
al. have reported several investigations of MTBDDs[BFG+93, HMPS94]. They also
call MTBDDs Algebraic Decision Diagrams (ADDs).
Using BDD vectors is a way to represent B-to-I functions by a number of usual BDDs.
By encoding the integer numbers into n-bit binary codes, we can decompose a B-to-I
function into n Boolean functions, each of which represents a digit of the
binary-coded number. These Boolean functions can then be represented with shared
BDDs (Fig. 4.4). This method was mentioned in the work[CMZ+93].
Here we discuss which representation is more efficient in terms of size. We show two
extreme examples as follows:
B-to-I function:

x1 x2 | f (f2 f1 f0)
 0  0 | 0 (0 0 0)
 0  1 | 1 (0 0 1)
 1  0 | 3 (0 1 1)
 1  1 | 4 (1 0 0)
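The decomposition of such a table into a vector of Boolean bit-functions can be sketched as follows (illustrative code, using the 4-row example above):

```python
def to_bit_functions(f, n_bits):
    """Decompose the integer-valued function f into n_bits Boolean
    functions; bit j of f(x) equals funcs[j](x)."""
    return [lambda x, j=j: (f(x) >> j) & 1 for j in range(n_bits)]

# the 4-row example: f(x1, x2) takes the values 0, 1, 3, 4, coded in 3 bits
table = {(0, 0): 0, (0, 1): 1, (1, 0): 3, (1, 1): 4}
bits = to_bit_functions(lambda x: table[x], 3)

# the integer value is recovered as the binary-weighted sum of the bits
assert all(sum(bits[j](x) << j for j in range(3)) == v
           for x, v in table.items())
```

Each of the three Boolean bit-functions would then be stored as one root in a shared BDD, as in Fig. 4.4.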
Similar to the way we did for the ternary-valued functions, we show that the
comparison between multi-terminal BDDs and BDD vectors can be reduced to the
variable-ordering problem. Assume the BDD shown in Fig. 4.7(a), which was
obtained by combining the BDD vector shown in Fig. 4.4 with what we call bit-
selection variables. If we change the variable order to move the bit-selection variables
from higher to lower positions, the BDD becomes one like that shown in Fig. 4.7(b).
In this BDD, the subgraphs with bit-selection variables correspond to the terminal
nodes in the MTBDD. That is, MTBDDs and BDD vectors can be transformed into
each other by changing the variable order by introducing the bit-selection variables.
The efficiency of the two representations thus depends on the nature of the objective
functions, and we therefore cannot determine which representation is generally more
efficient.
In addition, we extended the arguments to B-to-I functions, and presented two methods:
MTBDDs and BDD vectors. These methods can be compared by introducing the
bit-selection variables, similarly to the D-variable for the ternary-valued functions.
Based on the techniques for manipulating B-to-I functions, we developed an arithmetic
Boolean expression manipulator, which is presented in Chapter 9.
Several variants of BDDs have recently been devised in order to represent multi-valued
logic functions. The two most notable works are the Edge-Valued BDDs (EVBDDs)
presented by Lai et al.[LPV93] and the Binary Moment Diagrams (BMDs) developed
by Bryant[BC95]. EVBDDs can be regarded as MTBDDs with attributed edges. The
attributed edges indicate that we should add a number to the output values of the
functions. Figure 4.8 shows an example of an EVBDD representing the B-to-I function
(x1 + 2·x2 + 4·x3). This technique is sometimes effective in reducing the memory
requirement, especially when representing B-to-I functions for linear expressions.
In BMDs, based on the moment decomposition, each edge can have a coefficient of
the product term. For example, the B-to-I function (x1 + 2·x2 + 4·x3) becomes a
binary-tree-shaped MTBDD, but the algebraic expression contains only three terms,
and it can be represented by a BMD as shown in Fig. 4.9.
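The additive-edge idea behind EVBDDs can be illustrated with a toy evaluator. The node layout and names below are assumptions for illustration, not the actual EVBDD data structure:

```python
# EVBDD-style evaluation: every edge carries an additive offset, so a linear
# function such as x1 + 2·x2 + 4·x3 needs only one node per variable.
# node = (var, low_edge, high_edge); edge = (offset, child); child None = terminal.
def eval_evbdd(edge, assignment):
    offset, node = edge
    if node is None:
        return offset
    var, low, high = node
    return offset + eval_evbdd(high if assignment[var] else low, assignment)

n3 = (2, (0, None), (4, None))   # contributes 4·x3
n2 = (1, (0, n3), (2, n3))       # contributes 2·x2, shares n3
n1 = (0, (0, n2), (1, n2))       # contributes 1·x1, shares n2
root = (0, n1)

assert eval_evbdd(root, [1, 0, 1]) == 5   # 1 + 0 + 4
assert eval_evbdd(root, [1, 1, 1]) == 7   # 1 + 2 + 4
```

An MTBDD for the same function would need a terminal for each of the eight output values; the edge offsets let all paths share the same three nodes.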
In many problems on digital system design, cube sets (also called covers, PLA forms,
sum-of-products forms, or two-level logics) are used to represent Boolean functions.
They have been extensively studied for many years, and their manipulation algorithms
are important in LSI CAD systems. In general, it is not so difficult to generate BDDs
from cube sets, but there are no efficient methods for generating compact cube sets
from BDDs.
In this chapter, we present a fast method for generating prime-irredundant cube sets
from BDDs[Min92a, Min93b]. Prime-irredundant means that each cube is a prime
implicant and no cube can be eliminated.
The minimization or optimization of cube sets has received much attention, and a
number of efficient algorithms, such as MINI[HCO74] and ESPRESSO[BHMSV84],
have been developed. Since these methods are based on cube set manipulation, they
cannot be applied to BDD operations directly. Our method is based on the idea of
the recursive operator proposed by Morreale[Mor70]. We found that Morreale's
algorithm can be improved and efficiently adapted for BDD operations.
• It generates cube sets from BDDs directly, without temporarily generating redundant
cube sets in the process.
50 CHAPTER 5
In the remainder of this chapter, we first survey a conventional method for generating
cube sets from BDDs, and next we present our algorithm to generate prime-irredundant
cube sets. We then show experimental results of our method, followed by a conclusion.
Akers[Ake78] presented a simple method for generating cube sets from BDDs by
enumerating the 1-paths. This method enumerates all the paths from the root node to
the 1-terminal node, and lists the cubes which correspond to the assignments of the
input variables that activate such paths. In the example shown in Fig. 5.1, we can find
the three paths that lead to the cube set:
In reduced BDDs, all the redundant nodes are eliminated, so the literals of the
eliminated nodes never appear in the cubes. In the above example, the first cube
contains neither x1 nor x̄1. All of the cubes generated in this method are disjoint
because no two paths can be activated simultaneously. But although this method
can generate disjoint cube sets, it does not necessarily give the minimum ones. For
example, the literal of the root node appears in every cube, but some of them may be
unnecessary. Considerable redundancy, in terms of the number of cubes or literals,
remains in general.
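Akers' enumeration is a simple depth-first traversal; here is a sketch on a tuple-encoded BDD (the example function and the encoding are illustrative):

```python
# node = (var, low, high); the terminals are the integers 0 and 1
def one_paths(node, cube=()):
    """Return one cube (a dict var -> 0/1) per path to the 1-terminal."""
    if node == 0:
        return []
    if node == 1:
        return [dict(cube)]
    var, low, high = node
    return (one_paths(low, cube + ((var, 0),)) +
            one_paths(high, cube + ((var, 1),)))

# BDD for f = x1 ∨ (x2 · x3), variables numbered 0, 1, 2 from the top
n3 = (2, 0, 1)
n2 = (1, 0, n3)
bdd = (0, n2, 1)
assert one_paths(bdd) == [{0: 0, 1: 1, 2: 1}, {0: 1}]
```

The two cubes are disjoint, as the text notes, but the literal of the root variable appears in every cube even where it is unnecessary (x̄1 in the first cube), which is exactly the redundancy the next method removes.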
Generation of Cube Sets from BDDs 51
Jacobi and Trullemans[JT92] recently presented a method for removing such redundancy.
It generates a prime-irredundant cube set from a BDD in a divide-and-conquer
manner. At each node of the BDD, it generates two cube sets for the two subgraphs
of the node, and then it combines the two by eliminating redundant literals and cubes.
In this method, a cube set is represented with a list of BDDs, each of which represents
a cube. The redundancy of each cube is determined by applying BDD operations.
Although this method can generate compact cube sets, it temporarily generates lists
of redundant cubes during the procedure, and the manipulation of such lists sometimes
requires a lot of computation time and space.
• Each cube is a prime implicant; that is, no literal can be eliminated without
changing the function.
• There are no redundant cubes; that is, no cube can be eliminated without changing
the function.
The expression x·y·z ∨ x̄·y, for example, is not prime-irredundant because we can
eliminate a literal without changing the function (it equals y·z ∨ x̄·y), whereas the
expression x·z ∨ x̄·y is a prime-irredundant cube set.
Prime-irredundant cube sets are very compact in general, but they are not always the
minimum form. The following three expressions represent the same function and all
of them are prime-irredundant:
We can thus see that prime-irredundant cube sets do not provide unique forms and that
the number of cubes and literals can be different. Empirically, however, they are not
much larger than the minimum form.
Prime-irredundant cube sets are useful for many applications including logic synthesis,
fault testable design, and combinatorial problems.
Unfortunately, Morreale's method is not efficient for large-scale functions because the
algorithm is based on cube set representation and it takes a long time to manipulate
cube sets for tautology checking, inverting, and other logic operations. However, the
basic idea of "recursive expansion" is well suited to BDD manipulation, which is what
motivated us to improve and adapt Morreale's method for BDD representation.
isop0 ← ISOP(f0′) ;
/* recursively generates cubes including v̄ */
isop1 ← ISOP(f1′) ;
/* recursively generates cubes including v */
Let g0, g1 be the covers of isop0, isop1, respectively;
Compute f0″ and f1″ by the following rule: the minterms of f0 (resp. f1)
already covered by g0 (resp. g1) become don't care;
Compute fd from f0″ and f1″;
isopd ← ISOP(fd) ;
/* recursively generates cubes excluding v, v̄ */
isop ← (v̄ · isop0) ∨ (v · isop1) ∨ isopd ;
return isop ;
}
already covered by isop0 or isop1, and in Fig. 5.3(d) fd is computed from f0″ and
f1″. fd represents the minterms to be covered by the cubes excluding v̄ and v. We
thereby generate its prime-irredundant cube set isopd. Finally, the result isop can
be obtained as the union set of v̄ · isop0, v · isop1, and isopd.
Note that although here we use Karnaugh maps for purposes of illustration, in practice
the functions are represented and manipulated using BDDs.
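For concreteness, the recursion can be prototyped on plain truth tables in place of BDDs. The following sketch implements the expansion described above, carrying the ternary function as a pair (L, U), where L is the on-set and U the on-set plus don't cares; all names are illustrative, and the book's actual implementation manipulates BDDs and dumps cubes to a file:

```python
def cover(cubes, nbits):
    """Truth table (tuple of 0/1, length 2**nbits) of the union of cubes;
    a cube is a dict {var: 0 or 1}, variable 0 being the top variable."""
    tbl = []
    for i in range(2 ** nbits):
        bits = [(i >> (nbits - 1 - v)) & 1 for v in range(nbits)]
        tbl.append(int(any(all(bits[v] == b for v, b in c.items())
                           for c in cubes)))
    return tuple(tbl)

def isop(L, U):
    """Prime-irredundant cover of any f with L ≤ f ≤ U (truth tables,
    top variable first)."""
    if not any(L):
        return []            # nothing left to cover
    if all(U):
        return [{}]          # the constant cube 1 is prime and sufficient
    h = len(L) // 2
    L0, L1, U0, U1 = L[:h], L[h:], U[:h], U[h:]
    nb = h.bit_length() - 1
    # minterms coverable only with literal v̄ (U1 = 0), resp. only with v
    isop0 = isop(tuple(a & (1 - b) for a, b in zip(L0, U1)), U0)
    isop1 = isop(tuple(a & (1 - b) for a, b in zip(L1, U0)), U1)
    g0, g1 = cover(isop0, nb), cover(isop1, nb)
    # minterms not yet covered must be covered by cubes without v
    Ld = tuple((a & (1 - c)) | (b & (1 - d))
               for a, b, c, d in zip(L0, L1, g0, g1))
    Ud = tuple(a & b for a, b in zip(U0, U1))
    isopd = isop(Ld, Ud)
    shift = lambda c: {v + 1: bit for v, bit in c.items()}
    return ([{0: 0, **shift(c)} for c in isop0] +
            [{0: 1, **shift(c)} for c in isop1] +
            [shift(c) for c in isopd])

# f = x ∨ y (no don't cares) yields the two prime cubes x and y
assert isop((0, 1, 1, 1), (0, 1, 1, 1)) == [{0: 1}, {1: 1}]
# on-set x·y with don't care x·ȳ collapses to the single cube x
assert isop((0, 0, 0, 1), (0, 0, 1, 1)) == [{0: 1}]
```

The second assertion shows the don't-care handling: the don't-care minterm lets the cube x·y expand to the prime implicant x.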
Figure 5.3 Karnaugh-map example of the procedure: (a) isop0, (b) isop1, (c) f0″ and f1″, (d) fd and isopd.
If the order of the input variables is fixed, the ISOP algorithm generates a unique form
for each function. In other words, it gives a unique cube set for a given BDD.
Another feature of this algorithm is that it can be applied to functions with don't
cares.
can be written as ⌈f⌉ ≡ 1. The special operations for ternary-valued functions can
be computed as combinations of ordinary logic operations on ⌊f⌋ and ⌈f⌉. For
example, the ternary-valued operation that computes f0′ from f0 and f1
can be written as:
[⌊f0′⌋, ⌈f0′⌉] = [⌊f0⌋ · ¬⌈f1⌉, ⌈f0⌉].
We noted earlier that isop is obtained as the union set of the three parts, as shown in
Fig. 5.2. To avoid cube set manipulation, we implemented the method in such a way
that the resulting cubes are directly dumped out to a file. On each recursive call, we
push the processing literal onto a stack, which we call a cube stack. When a tautology
function is detected, the current content of the cube stack is appended to the output
file as a cube. This approach is efficient because we manipulate only BDDs, no matter
how large the resulting cube set becomes.
Our method can be extended to manage multiple output functions. By sharing the
common cubes among different outputs, we obtain a representation more compact
than we would if each output were processed separately. In our implementation, the
cube sets of all the outputs are generated concurrently; that is, we extend f to be an
array of BDDs in order to represent a multiple output function. Repeating recursive
calls in the same manner as for a single output function eventuates in the detection of
a multiple output constant consisting of 0's and 1's. The 1's mean that the corresponding
output functions include the cube that is currently kept in the cube stack.
We implemented the method described in the foregoing section, and conducted some
experiments to evaluate its performance. We used a SPARC Station 2 (SunOS 4.1.1,
32 MByte). The program is written in C and C++.
We first generated initial BDDs for the output functions of practical combinational
circuits which may be multi-level or multiple output circuits. We then generated
prime-irredundant cube sets from the BDDs and counted the numbers of the cubes
and literals. We used the DWA method, described in Section 3.2, to find a proper
order of the input variables for the initial BDD. The computation time includes the
time to determine the ordering, the time to generate the initial BDDs, and the time to
generate prime-irredundant cube sets. We compared our results with a conventional
cube-based method: we flattened the given circuits into cube sets with the system MIS-
II[BSVW87], and then optimized the cube sets by using ESPRESSO[BHMSV84].
The results are listed in Table 5.1. The circuits were an 8-bit data selector "sel8," an
8-bit priority encoder "enc8," a (4+4)-bit adder "add4," an (8+8)-bit adder "add8,"
a (2x2)-bit multiplier "mult4," a (3x3)-bit multiplier "mult6," a 24-input Achilles'
heel function[BHMSV84] "achil8p," and its complement "achil8n." Other items were
chosen from benchmarks at MCNC'90.
The table shows that our method is much faster than ESPRESSO; for large-scale
circuits, more than 10 times faster. The speed-up was most impressive for "c432"
and "c880," where we generated prime-irredundant cube sets consisting of more than
100,000 cubes and 1,000,000 literals within a reasonable time. We could not apply
ESPRESSO to these circuits because we were unable to flatten them into cube sets
even after ten hours. In another example, ESPRESSO performed poorly for "achil8n"
because the Achilles' heel function requires a great many cubes when we invert it.
Our method nonetheless performed well for "achil8n" because the complementary
function can be represented by a BDD of the same size as the original one. In general,
our method may give somewhat more cubes and literals than ESPRESSO does. In
most cases, the differences ranged between 0% and 20%.
As shown in Table 5.2, the numbers of cubes and literals are almost the same for both
orders, but the size of the BDDs varied greatly. The results demonstrate that our method
is robust against variations in the order, although variable ordering is still important
because it affects the execution time and memory requirement.
As shown by the results listed in Table 5.3, the numbers of both BDD nodes and cubes
grow exponentially when the number of inputs is increased. It is known that the
maximum BDD size is theoretically O(2^n/n) (where n is the number of inputs)[Ake78],
and our statistical experiment produced similar results. In terms of the number of cubes,
we observe about O(2^n), and the ratio of literals to cubes (the number of literals per
cube) is almost proportional to n.
Table 5.4 shows the results obtained when varying the number of outputs while the
number of inputs is fixed. Both BDDs and cube sets grow slightly less than proportionally,
reflecting the effect of sharing subgraphs or cubes. We expect such data
sharing to be more effective for practical circuits, where the multiple output functions
are closely related to each other. The ratio of literals to cubes is almost constant, since
the number of inputs is fixed.
We also investigated the relation between this method's performance and the truth
table density, which is the rate of 1's in the truth table. We applied our method to
the weighted random functions with 10 inputs ranging from 0% to 100% in density.
Figure 5.4 shows that the BDD size is symmetric about a center line at 50%, which is
like the entropy of information. The number of cubes is not symmetric and peaks at
about 60%; the number of literals, however, is symmetric. This result suggests
that the number of literals is a better measure of the complexity of Boolean functions
than the number of cubes.
5.4 CONCLUSION
We have described the ISOP algorithm for generating prime-irredundant cube sets
directly from given BDDs. The experimental results show that our method is much
faster than conventional methods. It enables us to generate compact cube sets from
Figure 5.4 Numbers of BDD nodes, cubes, and literals versus truth-table density (0-100%).
large-scale circuits, some of which have not been flattened into cube sets by using
the conventional methods. In terms of size of the result, the ISOP algorithm may
give somewhat larger results than ESPRESSO, but there are many applications in
which such an increase is tolerable. Our method can be used to transform BDDs into
compact cube sets or to flatten multi-level circuits into two-level circuits.
Recently, BDDs have attracted much attention because they enable us to manipulate
Boolean functions efficiently in terms of time and space. There are many cases in which
algorithms based on conventional data structures can be significantly improved by
using BDDs[MF89, BCMD90].
As our understanding of BDDs has deepened, their range of applications has broadened.
In VLSI CAD problems, we are often faced with manipulating not only Boolean
functions but also sets of combinations. By mapping a set of combinations into the
Boolean space, they can be represented as a characteristic function by using a BDD.
This method enables us to implicitly manipulate a huge number of combinations,
which has never been practical before. Two-level logic minimization methods based
on implicit set representation have been developed recently[CMF93], and those
techniques for manipulating sets of combinations are also used to solve general
covering problems[LS90]. Although BDD-based set representation is generally more
efficient than the conventional methods, it can be inefficient at times because BDDs
were originally designed to represent Boolean functions.
In this chapter, we propose a new type of BDD that has been adapted for set
representation[Min93d]. This idea, called a Zero-suppressed BDD (ZBDD), enables
us to represent sets of combinations more efficiently than using conventional BDDs.
We also discuss unate cube set algebra[Min94], which is convenient for describing
ZBDD algorithms and procedures. We present efficient methods for computing unate
cube set operations, and show some practical applications of these methods.
Here we examine the reduction rules of BDDs when applying them to represent sets
of combinations. We then show a problem which motivates us to develop a new type
of BDDs.

1. Eliminate all the redundant nodes whose two edges point to the same node
(Fig. 6.1(a)).
2. Share all equivalent subgraphs (Fig. 6.1(b)).
BDDs give canonical forms for Boolean functions when the variable ordering is fixed,
and most uses of BDDs are based on the above reduction rules.
It is important to see how BDDs are shrunk by the reduction rules. One recent paper[LL92]
shows that, for general (or random) Boolean functions, node sharing makes a much
more significant contribution to storage saving than node elimination. For practical
functions, however, node elimination is also important. For example, as shown in
Fig. 6.2, the form of a BDD does not depend on the number of input variables as long
as the expressions of the functions are the same. When we use BDDs, the irrelevant
variables are suppressed automatically and we do not have to consider them. This is
an advantage of the BDD representation.
Figure 6.3 BDDs for a set of combinations under different numbers of input variables: (abcd): {1000, 0100} and (abc): {100, 010}.
A set of combinations can be mapped into the Boolean space, where it is represented
by a Boolean function that evaluates to 1 exactly for the combinations in the
set. Such Boolean functions are called characteristic functions. The set operations
such as union, intersection, and difference can be executed by logic operations on the
characteristic functions.
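As a small illustration (with explicit sets standing in for BDDs, and helper names of our own), the union of two sets is exactly the OR of their characteristic functions:

```python
from itertools import product

# Two sets of combinations over the variables (a, b, c), as bit tuples.
S1 = {(1, 0, 0)}
S2 = {(0, 1, 0)}

chi1 = lambda m: m in S1           # characteristic function of S1
chi2 = lambda m: m in S2           # characteristic function of S2
union_chi = lambda m: chi1(m) or chi2(m)   # set union = logical OR

# Recover the union as the onset of the combined characteristic function.
members = {m for m in product((0, 1), repeat=3) if union_chi(m)}
# members == S1 | S2
```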
Despite the efficiency of manipulating sets by using BDDs, there is one inconvenience:
as shown in Fig. 6.3, the form of the BDDs depends on the number of input variables.
We therefore have to fix the number of input variables before generating BDDs. This
inconvenience comes from the difference in the model of default variables. In sets of
combinations, irrelevant objects never appear in any combination, so default variables
are regarded as zero when the characteristic function is true. Unfortunately, such
variables cannot be suppressed in the BDD representation. We therefore have to
generate many useless nodes for irrelevant variables when we manipulate very sparse
combinations. In such cases, the node elimination does not work well in reducing the
graphs.
In the following section, we describe a method that solves this problem by using
BDDs based on new reduction rules.
1. Eliminate all the nodes whose 1-edge points to the 0-terminal node, and use the
subgraph of the 0-edge, as shown in Fig. 6.4.
2. Share all equivalent subgraphs in the same way as for ordinary BDDs.

Notice that, contrary to the rules for ordinary BDDs, we do not eliminate the nodes
whose two edges point to the same node. This reduction rule is asymmetric for the
two edges: we do not eliminate the nodes whose 0-edge points to the 0-terminal node.
We call BDDs based on the above rules Zero-suppressed BDDs (ZBDDs). If the
number and the order of the input variables are fixed, a ZBDD represents a Boolean
function uniquely. This is obvious because a non-reduced binary tree can be
reconstructed from a ZBDD by applying the reduction rules in reverse.
Figure 6.5 shows ZBDDs representing the same sets of combinations shown in Fig. 6.3.
A feature of ZBDDs is that the form is independent of the number of inputs as long
Figure 6.5 ZBDDs for the same sets of combinations as in Fig. 6.3: (abcd): {1000, 0100} and (abc): {100, 010}.
as the sets of combinations are the same. Thus we do not have to fix the number of
input variables before generating graphs. ZBDDs automatically suppress the variables
which never appear in any combination. This is very efficient when we manipulate very
sparse combinations.
Another advantage of ZBDDs is that the number of 1-paths in the graph is exactly
equal to the number of combinations in the set. In conventional BDDs, the node
elimination breaks this property, so we conclude that ZBDDs are more suitable for
representing sets of combinations than conventional BDDs are.
On the other hand, it would be better to use conventional BDDs when representing
ordinary Boolean functions, as shown in Fig. 6.2. The difference lies in the models
of default variables: "fixed to zero" in sets of combinations, and "both the same" in
Boolean functions. We can choose one of the two types of BDDs according to the
features of the application.
Figure 6.6 Sizes of BDDs and ZBDDs versus the number of 1's in a combination.
In this section, we show that ZBDDs can be manipulated as efficiently as conventional
BDDs.
Figure 6.7 Generation of ZBDDs (e.g., (ab): {01} and (ab): {00, 10}, and their union (ab): {00, 10, 01}).
Figure 6.7 shows examples of ZBDDs generated with those operations. Empty( )
returns the 0-terminal node, and Base( ) returns the 1-terminal node. Any combination
can be generated with a Base( ) operation followed by Change( ) operations for all the
variables that appear in the combination. Using the Intsec( ) operation, we can check
whether a combination is contained in a set.
6.3.2 Algorithms
We show here that the basic operations for ZBDDs can be executed recursively, like
the ones for conventional BDDs.
First, to describe the algorithms simply, we define the procedure Getnode(top, P0, P1),
which generates (or copies) a node for a variable top and two subgraphs P0 and P1. In
the procedure, we use a hash table, called the uniq-table, to keep each node unique. Node
elimination and sharing are managed only by Getnode( ).
Using Getnode( ), we can describe the operations for ZBDDs as follows. Here P.top
means the variable with the highest order, and P0, P1 are the two subgraphs.
Subset0(P, var) {
  if (P.top < var) return P ;
  if (P.top == var) return P0 ;
  if (P.top > var)
    return Getnode(P.top, Subset0(P0, var), Subset0(P1, var)) ;
}

Change(P, var) {
  if (P.top < var) return Getnode(var, ∅, P) ;
  if (P.top == var) return Getnode(var, P1, P0) ;
  if (P.top > var)
    return Getnode(P.top, Change(P0, var), Change(P1, var)) ;
}
Union(P, Q) {
  if (P == ∅) return Q ;
  if (Q == ∅) return P ;
  if (P == Q) return P ;
  if (P.top > Q.top) return Getnode(P.top, Union(P0, Q), P1) ;
  if (P.top < Q.top) return Getnode(Q.top, Union(P, Q0), Q1) ;
  if (P.top == Q.top)
    return Getnode(P.top, Union(P0, Q0), Union(P1, Q1)) ;
}

Intsec(P, Q) {
  if (P == ∅) return ∅ ;
  if (Q == ∅) return ∅ ;
  if (P == Q) return P ;
  if (P.top > Q.top) return Intsec(P0, Q) ;
  if (P.top < Q.top) return Intsec(P, Q0) ;
  if (P.top == Q.top)
    return Getnode(P.top, Intsec(P0, Q0), Intsec(P1, Q1)) ;
}

Diff(P, Q) {
  if (P == ∅) return ∅ ;
  if (Q == ∅) return P ;
  if (P == Q) return ∅ ;
  if (P.top > Q.top) return Getnode(P.top, Diff(P0, Q), P1) ;
  if (P.top < Q.top) return Diff(P, Q0) ;
  if (P.top == Q.top)
    return Getnode(P.top, Diff(P0, Q0), Diff(P1, Q1)) ;
}

Count(P) {
  if (P == ∅) return 0 ;
  if (P == {∅}) return 1 ;
  return Count(P0) + Count(P1) ;
}
In the worst case, these algorithms take a time exponential in the number of
variables, but we can accelerate them by using a cache which memorizes the results
of recent operations, in the same manner as in conventional BDDs. By
referring to the cache before every recursive call, we can avoid duplicate executions
for equivalent subgraphs. This enables us to execute these operations in a time that is
roughly proportional to the size of the graphs.
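The operations above, together with Getnode( ) and the uniq-table, can be condensed into a runnable sketch (the tuple node layout, integer variable numbering, and helper names are our assumptions):

```python
from functools import lru_cache

EMPTY, BASE = "0-terminal", "1-terminal"    # ∅ and {∅}
_uniq = {}                                  # the uniq-table

def getnode(var, p0, p1):
    if p1 is EMPTY:                         # zero-suppression rule
        return p0
    return _uniq.setdefault((var, p0, p1), (var, p0, p1))

def top(p):                                 # terminals rank below every variable
    return p[0] if isinstance(p, tuple) else 0

@lru_cache(maxsize=None)                    # the operation cache (memoization)
def union(p, q):
    if p is EMPTY: return q
    if q is EMPTY: return p
    if p == q: return p
    if top(p) > top(q): return getnode(top(p), union(p[1], q), p[2])
    if top(p) < top(q): return getnode(top(q), union(p, q[1]), q[2])
    return getnode(top(p), union(p[1], q[1]), union(p[2], q[2]))

def count(p):                               # one 1-path per combination
    if p is EMPTY: return 0
    if p is BASE: return 1
    return count(p[1]) + count(p[2])

def single(*variables):                     # the set {one combination}
    p = BASE
    for v in sorted(variables):             # build from the lowest variable up
        p = getnode(v, EMPTY, p)
    return p
```

A real package adds Subset0( ), Change( ), and the rest in the same style.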
In conventional BDDs, we can reduce the execution time and memory requirement by
using attributed edges[MIY90] to indicate certain operations such as inverting.
ZBDDs also have a kind of attributed edge, but the functions of the attributes are
different from the conventional ones.
Here we present an attributed edge for ZBDDs. This attributed edge, called the 0-element
edge, indicates that the subgraph pointed to has a 1-path that consists of 0-edges only.
In other words, a 0-element edge means that the set includes the null combination.
We use an annotated pointer P to express the 0-element edge pointing to P.
As with other attributed edges, we have to place a couple of constraints on the location
of 0-element edges in order to keep the uniqueness of the graphs.
If necessary, 0-element edges can be carried over as shown in Fig. 6.8. The constraint
rules can be implemented in Getnode( ).
In this section, we discuss unate cube set algebra for manipulating sets of combinations.
A cube set consists of a number of cubes, each of which is a combination of literals.
Unate cube sets allow us to use only positive literals, not the negative ones. Each
cube represents one combination, and each literal represents an item chosen in the
combination.
We sometimes use cube sets to represent Boolean functions, but these are usually
binate cube sets containing both positive and negative literals. Binate cube sets have
different semantics from unate cube sets. In binate cube sets, the literals x and x̄ represent
x = 1 and x = 0, respectively, and the absence of a literal means don't care; that is,
x = 1 or 0, both OK. In unate cube sets, on the other hand, the literal x means x = 1 and
the absence of a literal means x = 0. For example, the cube set expression (a + bc)
represents (abc): {111, 110, 101, 100, 011} under the semantics of binate cube sets,
but (abc): {100, 011} under unate cube set semantics.
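This example can be checked mechanically; the sketch below expands the cube set (a + bc) over (a, b, c) under both semantics (the helper names are ours):

```python
from itertools import product

VARS = "abc"
CUBES = [{"a": 1}, {"b": 1, "c": 1}]      # the cube set (a + bc)

def binate_minterms(cubes):
    # Binate semantics: an absent literal means don't care.
    return {m for m in product((0, 1), repeat=len(VARS))
            if any(all(m[VARS.index(v)] == val for v, val in c.items())
                   for c in cubes)}

def unate_minterms(cubes):
    # Unate semantics: an absent literal means the variable is 0.
    return {tuple(c.get(v, 0) for v in VARS) for c in cubes}

# binate_minterms(CUBES): {111, 110, 101, 100, 011} as bit tuples
# unate_minterms(CUBES): {100, 011} as bit tuples
```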
The unit set "1" includes only one cube that contains no literals. This set becomes the
unit element of the product operation. A single literal set xk includes only one cube
that consists of only one literal. In the rest of this section, a lowercase letter denotes a
literal, and an uppercase letter denotes an expression.
& (intersection),
+ (union),
- (difference),
* (product),
/ (quotient of division),
% (remainder of division).
(We may use a comma "," instead of "+," and we sometimes omit "*.") The
operation "*" generates all possible concatenations of two cubes in the respective cube
sets. For example:

{ab, b, c} & {ab, 1} = {ab}
{ab, b, c} + {ab, 1} = {ab, b, c, 1}
{ab, b, c} - {ab, 1} = {b, c}
{ab, b, c} * {ab, 1} = (ab * ab) + (ab * 1) + (b * ab)
                       + (b * 1) + (c * ab) + (c * 1)
                     = {ab, abc, b, c}
P + P = P
a * a = a (P * P ≠ P in general)
(P - Q) = (Q - P) ⟺ (P = Q)
P * (Q + R) = (P * Q) + (P * R)
Dividing P by Q acts to seek out the two cube sets P/Q (quotient) and P%Q
(remainder) under the equality P = Q * (P/Q) + (P%Q). In general this solution is
not unique. Here we apply the following rules to fix the solution, with reference to the
weak-division method[BSVW87].
These three trivial sets and six basic operators are used to represent and manipulate sets
of combinations. In Section 6.3, we defined three other basic operations - Subset1(
), Subset0( ), and Change( ) - for assigning a value to a literal, but we do not have
to use them, since the weak-division operation can be used as a generalized cofactor
for ZBDDs. For example, Subset1(P, xk) can be described as (P/xk) * xk, and
Subset0(P, xk) becomes (P%xk). The Change( ) operation can be described by
using some multiplication and division operators. Using unate cube set expressions,
we can elegantly express the algorithms and procedures for manipulating sets of
combinations.
6.4.2 Algorithms
We show here that the basic operations of unate cube set algebra can be efficiently
executed using ZBDD techniques. The three trivial cube sets are represented by
simple ZBDDs. The empty set "0" becomes the 0-terminal node, and the unit set "1" is
the 1-terminal node. A single literal set xk corresponds to the single-node graph whose
edges point directly to the 0- and 1-terminal nodes. The intersection, union, and difference
operations are the same as the basic ZBDD operations shown in Section 6.3. The
other three operations - product, quotient, and remainder - are not included in the
basic ones, so we have developed algorithms for computing them.
Let x be the highest variable in P and Q; then we can factor

P = x * P1 + P0, Q = x * Q1 + Q0.

The product (P * Q) can be written as:

(P * Q) = x * (P1 * Q1 + P1 * Q0 + P0 * Q1) + P0 * Q0.
Each sub-product term can be computed recursively. The expressions are eventually
broken down into trivial ones and the results are obtained. In the worst case, this
algorithm would require a number of recursive calls exponential in the number of
literals, but we can accelerate it by using a hash-based cache that memorizes the results
of recent operations. By referring to the cache before every recursive call, we can
avoid duplicate executions for equivalent subsets. Consequently, the execution time
depends on the size of the ZBDDs rather than on the number of cubes and literals. This
algorithm is shown in detail in Fig. 6.9.
procedure(P * Q)
{  if (P.top < Q.top) return (Q * P) ;
   if (Q = ∅) return ∅ ;
   if (Q = 1) return P ;
   R ← cache("P * Q") ; if (R exists) return R ;
   x ← P.top ; /* the highest variable in P */
   (P0, P1) ← factors of P by x ;
   (Q0, Q1) ← factors of Q by x ;
   R ← x (P1 * Q1 + P1 * Q0 + P0 * Q1) + P0 * Q0 ;
   cache("P * Q") ← R ;
   return R ;
}
procedure(P / Q)
{  if (Q = 1) return P ;
   if (P = ∅ or P = 1) return ∅ ;
   if (P = Q) return 1 ;
   R ← cache("P / Q") ; if (R exists) return R ;
   x ← Q.top ; /* the highest variable in Q */
   (P0, P1) ← factors of P by x ;
   (Q0, Q1) ← factors of Q by x ; /* (Q1 ≠ ∅) */
   R ← P1 / Q1 ;
   if (R ≠ ∅) if (Q0 ≠ ∅) R ← R & (P0 / Q0) ;
   cache("P / Q") ← R ;
   return R ;
}
Division is computed in the same recursive manner. Suppose that x is the literal at the
root node of Q, and that P0, P1, Q0, and Q1 are the subsets of cubes factored by x.
(Notice that Q1 ≠ ∅, since x appears in Q.) The quotient (P/Q) can be described as
(P1/Q1) & (P0/Q0) when Q0 ≠ ∅, and as P1/Q1 otherwise.
Each sub-quotient term can be computed recursively. Whenever we find that one
of the sub-quotients (P1/Q1) or (P0/Q0) results in ∅, (P/Q) = ∅ becomes obvious
and we no longer need to compute it. Using the cache technique avoids duplicate
a b c, a b d, a b e, c d e
ucc> print G / (a b)
c, d, e
ucc> print G % (a b)
a c, a d, a e, c d e
ucc> print .mincost G
a c (4)
ucc> exit
executions for equivalent subsets. This algorithm is illustrated in Fig. 6.10. The
remainder (P%Q) can be determined by calculating P - Q * (P/Q).
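On explicit cube sets, with cubes as frozensets of positive literals, the product, quotient, and remainder can be sketched as follows (helper names are ours; in the text the same operations run implicitly on ZBDDs):

```python
# A cube set is a set of cubes; the empty frozenset() is the unit cube "1".
def product(P, Q):
    return {p | q for p in P for q in Q}

def quotient(P, Q):
    # Weak division: cubes r such that q * r appears in P for every q in Q.
    if not Q:
        return set()
    parts = [{p - q for p in P if q <= p} for q in Q]
    result = parts[0]
    for part in parts[1:]:
        result &= part
    return result

def remainder(P, Q):
    return P - product(Q, quotient(P, Q))

cube = lambda s: frozenset(s.split())
P = {cube("a b d"), cube("a b e"), cube("a b g"),
     cube("c d"), cube("c e"), cube("c h")}
Q = {cube("a b"), cube("c")}
# quotient(P, Q) == {cube("d"), cube("e")}
# remainder(P, Q) == {cube("a b g"), cube("c h")}
```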
Based on the above techniques, we developed a Unate Cube set Calculator (UCC),
which is an interpreter with a lexical and syntax parser for calculating unate cube set
expressions.
Using UCC, we can compute the minimum-cost cube under a definition of the cost for
each literal. After the ZBDDs are constructed, the minimum-cost cube can be found in
a time proportional to the number of nodes in the graph, as when using conventional
BDDs[LS90].
Because the unate cube set calculator can generate huge ZBDDs with millions
of nodes, limited only by memory capacity, we can manipulate large-scale and
complicated expressions. Here we show several applications of the unate cube set
calculator.
The 8-queens problem is an example in which using unate cube set calculation is more
efficient than using ordinary Boolean expressions.
By unate cube set calculation, we can solve the 8-queens problem efficiently. The
algorithm can be written as

S8: Search all the choices to put the eighth queen, considering the other
queens' locations.
Calculating these expressions with ZBDDs provides the set of solutions to the 8-
queens problem. Okuno[Oku94] reported experimental results for N-queens problems
to compare ZBDDs and conventional BDDs. In Table 6.1, the column "BDD nodes"
shows the size of BDDs using Boolean algebra, and "ZBDD nodes" shows the size of
ZBDDs using unate cube set algebra. We can see that the ZBDDs are about N times
smaller than the conventional BDDs. We can represent all the solutions at once within a
storage space roughly proportional to the number of solutions.
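The row-by-row construction can be imitated with explicit Python sets standing in for ZBDDs (all names here are ours); each partial solution is one combination of queen placements:

```python
def queens(n):
    """Build the set of all solutions row by row; each partial solution is a
    frozenset of (row, column) placements, i.e., one combination."""
    sols = [frozenset()]                   # the unit set: one empty placement
    for r in range(n):
        nxt = []
        for s in sols:
            for c in range(n):             # all choices for row r's queen
                if all(c != c2 and abs(c - c2) != r - r2 for (r2, c2) in s):
                    nxt.append(s | {(r, c)})
        sols = nxt
    return sols

# len(queens(8)) == 92, the well-known number of 8-queens solutions
```

A ZBDD shares the common sub-placements among these explicit combinations, which is where the N-fold size advantage reported above comes from.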
Figure 6.12 (a) The stuck-at faults x0 (s-a-0) and x1 (s-a-1) on a net x; (b) a two-input NAND gate with inputs x, y and output z.
sets of multiple stuck-at faults. It propagates the fault sets from primary inputs
to primary outputs, and eventually obtains the detectable faults at primary outputs.
Takahashi et al. used conventional BDDs, but we can compute the fault simulation
more simply by using ZBDDs based on unate cube set algebra.
First, we generate the whole set of multiple faults that is assumed in the simulation.
The set F1 of all the single stuck-at faults is expressed as

F1 = (a0 + a1 + b0 + b1 + c0 + c1 + ...),

where a0 and a1 represent the stuck-at-0 and stuck-at-1 faults, respectively, at the net a.
The other literals are defined similarly. We can represent the set F2 of double and single
faults as (F1 * F1). Furthermore, (F2 * F1) gives the set of three or fewer multiple faults.
If we assume exactly double (not including single) faults, we can calculate (F2 - F1).
In this way, the whole fault set U can easily be described with unate cube set expressions.
After computing the whole set U, we then propagate the detectable fault set from the
primary inputs to the primary outputs. As illustrated in Fig. 6.12(a), two faults x0 and
x1 are assumed at a net x. Let X and X' be the detectable fault sets at the source and
sink, respectively, of the net x. We can calculate X' from X with the following unate
cube set expressions:

X' = (X + (U/x1) * x1) % x0, when x = 0 in the good circuit.
X' = (X + (U/x0) * x0) % x1, when x = 1 in the good circuit.
At each gate, we calculate the fault set at the output of the gate from the fault sets at
its inputs. Let us consider a NAND gate with two inputs x and y and one output z, as
shown in Fig. 6.12(b). Let X, Y, and Z be the fault sets at x, y, and z. We can calculate
Z from X and Y by simple unate cube set operations as follows:

Z = X & Y, when x = 0, y = 0, z = 1 in the good circuit.
6.6 CONCLUSION
We have proposed ZBDDs, which are BDDs based on a new reduction rule, and have
presented their manipulation algorithms and applications. ZBDDs can represent sets
of combinations uniquely and more compactly than conventional BDDs. The effect
of ZBDDs is remarkable especially when we are manipulating sparse combinations.
On the basis of the ZBDD techniques, we have discussed the method for calculating
unate cube set algebra, and we have developed a unate cube set calculator, which can
be applied to many practical problems.
Unate cube sets have semantics different from those of binate cube sets, but there is
a way to simulate binate cube sets by using unate ones: we use two unate literals
x1 and x0 for one binate literal x. For example, the binate cube set (a b̄ + c) is
expressed as the unate cube set (a1 b0 + c1). In this way, we can easily simulate the
cube-based algorithms implemented in logic design systems such as ESPRESSO and
MIS[BSVW87]. Using this technique, we have developed a practical multi-level logic
optimizer (detailed in the next chapter).
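A minimal sketch of this x1/x0 encoding (the helper name and input format are assumptions of ours):

```python
def to_unate(binate_cube):
    """Encode a binate cube, given as a dict mapping a variable name to 1
    (positive literal) or 0 (negative literal), as a unate cube over the
    paired literals x1 / x0."""
    return frozenset(f"{v}{pol}" for v, pol in binate_cube.items())

# (a b̄) + (c)  →  the unate cube set {a1 b0, c1}
cover = {to_unate({"a": 1, "b": 0}), to_unate({"c": 1})}
# cover == {frozenset({"a1", "b0"}), frozenset({"c1"})}
```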
Unate cube set expressions are suitable for representing sets of combinations, and they
can be efficiently manipulated using ZBDDs. For solving some types of combinatorial
problems, our methods are more useful than those using conventional BDDs. We
expect the unate cube set calculator to be a helpful tool in developing VLSI CAD
systems and in various other applications.
7
MULTI-LEVEL LOGIC SYNTHESIS USING ZBDDS
Logic synthesis and optimization techniques have been used successfully for the practical
design of VLSI circuits in recent years. Multi-level logic optimization is important in
logic synthesis systems, and a lot of research in this field has been undertaken[MKLC87,
MF89, Ish92]. In particular, the algebraic logic minimization method, as in
MIS[BSVW87], is the most successful and prevalent way to attain this optimization.
It is based on cube set (or two-level logic) minimization and generates multi-level
logic from cube sets by applying a weak-division method. This approach is efficient
for functions that can be expressed as cube sets of feasible size, but we are sometimes
faced with functions whose cube set representations grow exponentially with the
number of inputs. Parity functions and full-adders are examples of such functions.
This is a problem of the cube-based logic synthesis methods.
The use of BDDs provided a breakthrough for that problem. By mapping a cube set
into the Boolean space, a cube set can be represented as a Boolean function using a
BDD. Using this method, we can represent a huge number of cubes implicitly in a small
storage space. This enables us to manipulate very large cube sets whose manipulation
has not been practicable before. Based on this BDD-based cube set representation,
new cube set minimization methods have been developed [CMF93, MSB93].
In the following sections, we first discuss the implicit cube set representation based on
ZBDDs. We then present the implicit weak-division method and show experimental
results.
By using ZBDDs, we can represent any cube set simply, efficiently, and uniquely.
Figure 7.1 illustrates a cube set seen as a set of combinations, using two
variables for the literals xk and x̄k. Since xk and x̄k never appear together in the same cube,
at least one of the two variables is 0 in every combination. The 0's are conveniently
suppressed in ZBDDs. The number of cubes exactly equals the number of 1-paths in
the graph, and the total number of literals can be counted in a time proportional to the
size of the graph.
The basic operations for the cube set representation based on ZBDDs are the following:

"0" returns ∅. (no cube)
"1" returns 1. (the tautology cube)
And0(P, var) returns (var̄ · P).
And1(P, var) returns (var · P).
Factor0(P, var) returns the factor of P by var̄.
Factor1(P, var) returns the factor of P by var.
FactorX(P, var) returns the cubes in P excluding var and var̄.
Union(P, Q) returns (P + Q).
Intsec(P, Q) returns (P ∩ Q).
Diff(P, Q) returns (P - Q).
CountCubes(P) returns the number of cubes.
CountLits(P) returns the number of literals.

"0" corresponds to the 0-terminal node of a ZBDD, and "1" corresponds to the
1-terminal node. Any cube can be generated by applying a number of And0( ) and
And1( ) operations to "1." The three Factor operations satisfy

P = (var̄ · Factor0) + (var · Factor1) + FactorX.
Intsec( ) is different from the logical AND operation; it extracts only the common cubes
of the two cube sets. These operations are simply composed of ZBDD operations, and
their execution time is roughly proportional to the size of the graphs.
Using this new cube set representation, we have developed a program for generating
prime-irredundant cube sets based on the ISOP algorithm, described in Chapter 5.
Our program converts a conventional BDD representing a given Boolean function into
a ZBDD representing a prime-irredundant cube set.
We found that the ISOP algorithm can be accelerated through the use of the new cube
set representation based on ZBDDs. We prepared a hash-based cache to store the
results of each recursive call. Each entry in the cache is formed as a pair (f, s), where
f is the pointer to a given BDD and s is the pointer to the resulting ZBDD. On
each recursive call, we check the cache to see whether the same subfunction f has
already appeared, and if so, we can avoid duplicate processing and return the result s
directly. By using this technique, we can execute the ISOP algorithm in a time roughly
proportional to the size of the graph, independent of the number of cubes and literals.
We implemented the program on a SPARC Station 2 (128 MB) and generated prime-
irredundant cube sets from the functions of large-scale combinational circuits. The
results are listed in Table 7.1. The following circuits were used for this experiment: an
(8+8)-bit adder "add8," a (16+16)-bit adder "add16," a (6x6)-bit multiplier "mult6,"
an (8x8)-bit multiplier "mult8," and a 24-input Achilles' heel function[BHMSV84]
"achil8p" and its complement "achil8n." Other circuits were chosen from the
benchmarks of MCNC'90.
The columns "In." and "Out." show the scale of the circuits. The column "BDD
nodes for func." lists the number of nodes in conventional BDDs representing the
Boolean functions. "Cubes" and "Literals" show the scale of the generated prime-
irredundant cube sets; they indicate the memory requirement when we use a classical
representation, such as a linear linked list. The actual memory requirements of the
implicit cube set representation are listed in the column "ZBDD nodes for cubes." As
we can see, extremely large prime-irredundant cube sets containing billions of literals
can easily be generated in a practical computation time. Such cube sets have never
been generated before. Our experimental results thus show that the implicit cube set
representation reduces the memory requirement dramatically.
We also evaluated the effect of using ZBDDs. In Table 7.2, the column "BDD nodes
for cubes" lists the number of nodes needed when we use conventional BDDs for
the implicit cube set representation. "Z/B" is the ratio of the size of the ZBDDs to that of
the conventional BDDs. "Density" shows how many literals appear in a cube on
average. This result confirms that ZBDDs are effective for representing
sparse combinations. Notice that xk and x̄k never appear in the same cube, so the density
does not exceed 50%. In general, reduced cube sets consist of sparse cubes, and the use
of ZBDDs is effective. When we manipulate cube sets using intermediate variables to
represent multi-level logic, the cube sets are sparser still, and ZBDDs are even more effective.
In this section, we present a fast algorithm for factorizing the implicit cube set
representation using ZBDDs, and we demonstrate the results of our new multi-level
logic optimizer based on the algorithm.
Weak-division (or algebraic division) is the most successful and prevalent method for
generating multi-level logic from cube sets. For example, the cube set expression

f = abd + abe + abg + cd + ce + ch

can be divided by (ab + c). By using an intermediate variable p, we can rewrite the
expression as

f = pd + pe + abg + ch,   p = ab + c.
In the next step, f will be divided by (d + e) in a similar manner.
Weak-division does not exploit all of the Boolean properties of the expression; it is
only an algebraic method. In terms of result quality, it is not as effective as other
stronger optimizing methods, such as the transduction method[MKLC87]. However,
weak-division is still important because it is used for generating initial logic circuits
for other strong optimizers, and is applied to large-scale logics that cannot be handled
by strong optimizers.
(f/p) = (f/(ab)) ∩ (f/c)
      = (d+e+g) ∩ (d+e+h)
      = d + e.
(f%p) = f - p·(f/p)
      = abg + ch.
f = v̄·f0 + v·f1 + fd,
p = v̄·p0 + v·p1 + pd.

The quotient (f/p) can then be written as

(f/p) = (f0/p0) ∩ (f1/p1) ∩ (fd/pd),

where any factor whose pi is 0 is omitted from the intersection.
procedure(f / p)
{ if (p = 1) return f ;
  if (f = 0 or f = 1) return 0 ;
  if (f = p) return 1 ;
  q ← cache("f / p") ; if (q exists) return q ;
  v ← p.top ; /* the highest variable in p */
  (f0, f1, fd) ← factors of f by v ;
  (p0, p1, pd) ← factors of p by v ;
  q ← p ;
  if (p0 ≠ 0) q ← f0/p0 ;
  if (q = 0) return 0 ;
  if (p1 ≠ 0)
    if (q = p) q ← f1/p1 ;
    else q ← q ∩ (f1/p1) ;
  if (q = 0) return 0 ;
  if (pd ≠ 0)
    if (q = p) q ← fd/pd ;
    else q ← q ∩ (fd/pd) ;
  cache("f / p") ← q ;
  return q ;
}
Figure 7.3 Implicit weak-division algorithm.
If one of (f0/p0), (f1/p1), and (fd/pd) becomes zero, (f/p) = 0 becomes obvious and we no longer
need to continue the calculation.
This algorithm computes the example shown in the previous section as follows:
(abd+abe+abg+cd+ce+ch)/(ab+c)
= (bd+be+bg)/b ∩ (cd+ce+ch)/c
= (d+e+g)/1 ∩ (d+e+h)/1
= (d+e+g) ∩ (d+e+h)
= d + e.
In the same way as for the ISOP algorithm, we prepared a hash-based cache to
store the result of each recursive call and avoid duplicate executions. Using this cache
technique, we can execute the algorithm in a time nearly proportional to the size of
the graph, regardless of the number of cubes and literals.
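The behavior of weak-division is easy to prototype on an explicit cube-set representation (cubes as sets of literals) before moving to the implicit form; the following Python sketch is illustrative only, with hypothetical names, and does not reproduce the ZBDD-based procedure of Fig. 7.3:

```python
def weak_divide(f, p):
    """Weak (algebraic) division of cube set f by divisor p.

    Cubes are frozensets of literal names.  The quotient is the
    intersection, over all cubes c of p, of {q : q union c is in f}.
    """
    quotient = None
    for c in p:
        q = {cube - c for cube in f if c <= cube}
        quotient = q if quotient is None else quotient & q
    return quotient or set()

# f = abd + abe + abg + cd + ce + ch,  p = ab + c
f = {frozenset(s) for s in ("abd", "abe", "abg", "cd", "ce", "ch")}
p = {frozenset("ab"), frozenset("c")}
# weak_divide(f, p) yields the quotient d + e
```

On an explicit cube list this loop costs time proportional to the number of cubes; the implicit algorithm of Fig. 7.3 replaces the loop by a recursion over graph nodes, which is why its cost tracks the graph size instead.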
procedure(f · g)
{ if (f.top < g.top) return (g · f) ;
  if (g = 0) return 0 ;
  if (g = 1) return f ;
  h ← cache("f · g") ; if (h exists) return h ;
  v ← f.top ; /* the highest variable in f */
  (f0, f1, fd) ← factors of f by v ;
  (g0, g1, gd) ← factors of g by v ;
  h ← v̄·(f0·g0 + f0·gd + fd·g0)
    + v·(f1·g1 + f1·gd + fd·g1) + fd·gd ;
  cache("f · g") ← h ;
  return h ;
}
For multi-level logic synthesis based on the weak-division method, the quality of the
results greatly depends on the choice of divisors. Kernel extraction[BSVW87] is
the most common and sophisticated method for obtaining divisors. It extracts good
divisors and has been used successfully in practical systems such as MIS-II. For very
large cube sets, however, this method is complicated and time consuming.
Here we present a simple and fast method for finding divisors in implicit cube sets.
The basic algorithm is described as follows:
Divisor(f)
{ v ← a literal that appears more than once in f ;
  if (v exists) return Divisor(f/v) ;
  else return f ;
}
If there is a literal that appears more than once in a cube set, we compute the factor
for the literal. Repeating this recursively, we eventually obtain a divisor, which is
the same as the one called a Level-O kernel in the kernel extraction method used in
MIS-II. With this method, factors for a literal can be computed quickly in the implicit
representation. Whether a literal appears more than once can be checked efficiently
by looking at the branches of the graph.
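On the same explicit cube-set sketch, the divisor search can be written as follows (illustrative Python; the helper names are ours, and a real implementation would check literal occurrence on the graph branches as described above):

```python
from collections import Counter

def cofactor(f, lit):
    """Cube set f divided by a single literal."""
    return {cube - {lit} for cube in f if lit in cube}

def divisor(f):
    """Repeatedly divide by a literal that appears in two or more
    cubes; the result corresponds to a level-0 kernel."""
    counts = Counter(lit for cube in f for lit in cube)
    for lit, n in counts.items():
        if n >= 2:
            return divisor(cofactor(f, lit))
    return f

f = {frozenset(s) for s in ("abd", "abe", "abg", "cd", "ce", "ch")}
# one possible run: divide by a, then by b, giving the kernel d + e + g
```

Which kernel comes out depends on the order in which the repeated literals are tried, matching the text's remark that a different order of factoring literals may give a different divisor.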
A different divisor may be obtained for a different order of factoring literals. When
two or more candidate literals are found, we choose the one defined later, so that the
extracted divisor has variables nearer to the primary inputs. This rule allows us
to maintain a shallow circuit depth.
The use of a common divisor for multiple cube sets may yield better results, but
locating common divisors is complicated and time consuming for large cube sets. So
far, we have been able to extract only single output divisors and apply them to all the
other cube sets. When there is a cube set providing a non-zero quotient for the divisor,
we factorize the cube set. At least one cube set and sometimes more can be divided
by a common cube.
Using the complement function of the divisor, we can sometimes attain more compact
expressions. For example,

f = ac + bc + āb̄c̄

can be factorized using a complement divisor as:

f = pc + p̄c̄,
p = a + b.
It is not easy to compute the complement function in the cube set representation.
We therefore transform the cube set into a conventional BDD for the Boolean function
of the divisor, and complement the BDD. We then regenerate a cube set from the
inverted BDD by using the ISOP algorithm. This strategy seems as if it would require
a large computation time; however, the actual execution time is comparatively small
within the entire process, because the divisors are always small cube sets.
Based on the above method, we implemented a multi-level logic synthesizer. The basic
flow of the program is illustrated in Fig. 7.5. Starting with non-optimized multi-level
logics, we first generate BDDs for the Boolean functions of primary outputs under
a heuristic ordering of input variables[MIY90]. We then transform the BDDs into
prime-irredundant implicit cube sets by using the ISOP algorithm. The cube sets
are then factorized into optimized multi-level logics by using the fast weak-division
method. We wrote the program in C++ on a SPARCstation 2 (128 MB).
Figure 7.5 Basic flow of the program: (non-optimized) multi-level circuit → variable ordering → fast weak-division → (optimized) multi-level circuit.
We compared our experimental results with those of MIS-II using a conventional
cube-based method. The results are listed in Table 7.3. The circuits were the 8-bit and
16-bit parity functions "xor8" and "xor16," a (16+16)-bit adder "add16," a (6×6)-bit
multiplier "mult6," and other circuits chosen from the MCNC'90 benchmarks. The
column "Lit." lists the total number of literals in the multi-level logic, and the column
"Time(s)" is the total CPU time for optimization. The heading "MIS-II(flatten)" labels
the results obtained when we forced MIS-II to flatten the multi-level logics into cube
sets and then factorize them using weak-division. "Our method" uses the implicit
cube set representation.
The experimental results show that we can now quickly flatten and factorize circuits,
even parity functions and adders, which have never been flattened with other methods.
Our method is much faster than MIS-II, and the difference is remarkable when large
cube sets are factorized. This result demonstrates the effect of our implicit cube set
representation.
When the flattened cube sets become too large, an incremental optimization method
is sometimes effective. MIS-II provides an option for incremental optimization without
flattening. We compared our results with such incremental methods. In Table 7.4,
"MIS-II(algebra)" denotes the results of incremental optimization using only algebraic
rules, and "MIS-II(Bool)" denotes the results obtained by using Boolean optimization.
7.4 CONCLUSION
We have developed a fast weak-division method for implicit cube sets based on ZBDDs.
Computation time for this method is nearly proportional to the size of ZBDDs, and
is independent of the number of cubes and literals in cube sets. Experimental results
indicate that we can quickly flatten and factorize circuits - even parity functions
and adders - which has never been practicable before. Our method greatly accelerates
logic synthesis systems and enlarges the scale of the circuits to which these systems
are applicable.
There is still some room to improve the results. We have used a simple strategy for
choosing divisors, but more sophisticated strategies might be possible. Moreover, a
Boolean division method for implicit cube sets is worth investigating to improve the
optimization quality.
8
IMPLICIT MANIPULATION OF
POLYNOMIALS
BASED ON ZBDDS
Polynomials can be manipulated in a way similar to the cube set manipulation, except
that polynomials differ from cube sets in the following two points: variables may have
degrees (exponents), and terms may have integer coefficients.
First, we show a way to deal with degrees. Here, we consider only degrees that are
positive integers. The basic idea is that an integer can be written as a sum of powers
of two by using binary coding. That is, a variable x^k can be broken down as follows:

x^k = x^k1 · x^k2 · · · x^km,

where k1, k2, ..., km are distinct powers of two. Thus we can represent
x^k as a combination of the n items x^1, x^2, x^4, x^8, ..., x^(2^(n-1)) (0 < k < 2^n), and we can
efficiently deal with such combinations by using ZBDDs. A polynomial x^20 + x^10 +
x^5 + x, for example, can be written as x^4·x^16 + x^2·x^8 + x^1·x^4 + x^1. It can be regarded
as a set of combinations based on the five items x^1, x^2, x^4, x^8, and x^16. The formula
can then be represented by using a ZBDD, as shown in Fig. 8.1(a). In this example, we
ordered x^1 at the top and placed the higher degrees lower in the graph. This ordering is
convenient for calculating arithmetic operations, as described in Section 8.2.
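The binary decomposition of degrees can be sketched in a few lines of Python (illustrative only; the item values stand for the exponents of x^1, x^2, x^4, ...):

```python
def degree_items(k):
    """Split x^k into the power-of-two items of its binary coding."""
    assert k > 0
    return [1 << i for i in range(k.bit_length()) if (k >> i) & 1]

def poly_to_combinations(exponents):
    """A sum of x^k terms as a set of item combinations."""
    return {frozenset(degree_items(k)) for k in exponents}

# x^20 + x^10 + x^5 + x  ->  { {x^4, x^16}, {x^2, x^8}, {x^1, x^4}, {x^1} }
combs = poly_to_combinations([20, 10, 5, 1])
```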
When dealing with more than one sort of variable - such as x^i, y^j, and z^k - we
decompose them as x^1, x^2, x^4, ..., y^1, y^2, y^4, ..., and z^1, z^2, z^4, .... Figure 8.1(b)
shows an example with two sorts of variables. Since our BDD package allows 65,535
variables, we can use more than 8,000 sorts of variables when using 8-bit coding (max
255) for degrees.
One feature of our method is that it gives a canonical form for polynomials, since
the degrees are uniquely decomposed into combinations based on binary coding,
and ZBDDs represent sets of combinations uniquely. In addition, ZBDDs clearly
exhibit their efficiency here. For example, x^1·x^2 (= x^3) is represented as a combination of
only x^1 and x^2; x^4, x^8, x^16, ... are not included. In ZBDDs, the nodes for the
irrelevant items x^4, x^8, x^16, ... are automatically eliminated. In many cases, variables
with lower degrees appear more often than those with higher degrees, so most
of the combinations are sparse and ZBDDs are effective. In addition, when dealing
with many sorts of variables, we should consider that the combination x^1·x^2 does not
include other sorts of variables, such as y^1, y^2, y^4, ..., or z^1, z^2, z^4, .... In this case,
the combinations become very sparse and ZBDDs are especially effective.
A constant number c can be written as

c = 2^c1 + 2^c2 + ... + 2^cm,

where c1, c2, ..., cm are distinct non-negative integers. If we regard "2"
as a symbol, just like x, y, z, etc., we can represent it as a polynomial of vari-
ables with degrees, which has already been discussed. Consequently, we can
represent a constant number c using ZBDDs as a set of combinations from the
n items 2^1, 2^2, 2^4, 2^8, ..., 2^(2^(n-1)) (0 < c < 2^(2^n)). For example, the constant
300 = 2^8 + 2^5 + 2^3 + 2^2 can be written as 2^8 + 2^1·2^4 + 2^1·2^2 + 2^2. It can be
regarded as a set of combinations based on the four items 2^1, 2^2, 2^4, and 2^8, and then
represented by a ZBDD, as shown in Fig. 8.2(a).
When a constant number is used as a coefficient with other variables, we can regard
the symbol "2" as just another sort of variable in the formula. Figure 8.2(b) shows an
example for 5x^2 + 2xy, which is decomposed into x^2 + 2^2·x^2 + 2^1·x^1·y^1.
When dealing with negative coefficients, we have to consider the coding of negative
values. There are two well-known methods: one uses the 2's complement representation,
and the other uses the absolute value with a sign. Both methods, however, have
drawbacks. The 2's complement coding yields many non-zero bits for small
negative numbers (typically, -1 is "all ones"), and the ZBDD reduction rule is not
effective for those non-zero bits. With the absolute-value coding, the
operation of addition becomes complicated, since we have to switch the addition into
subtraction for some product terms in the same formula.
To solve these problems, we use another binary coding based on (-2)^k: the bits
represent 1, -2, 4, -8, 16, .... For example, -12 can be decomposed into
(-2)^5 + (-2)^4 + (-2)^2 = (-2)^1·(-2)^4 + (-2)^4 + (-2)^2, and can be represented by a ZBDD
as shown in Fig. 8.2(c). In this way, we can avoid producing many non-zero bits for
small negative numbers.
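The (-2)^k coding can be computed with ordinary integer arithmetic; the following sketch (illustrative, not from the book) returns the exponents of the negabinary decomposition:

```python
def minus_two_exponents(n):
    """Exponents k such that n equals the sum of (-2)^k over the list."""
    exps, i = [], 0
    while n != 0:
        if n & 1:           # digit set at weight (-2)^i
            exps.append(i)
            n -= 1          # remove the digit; n is now even
        n //= -2            # move to the next (-2) digit
        i += 1
    return exps

# -12 -> [2, 4, 5], i.e. (-2)^2 + (-2)^4 + (-2)^5 = 4 + 16 - 32
```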
Two polynomials are equivalent if and only if they have the same coefficients for
all corresponding terms. Since our new representation method maintains uniqueness,
we can immediately check the equivalence between two polynomials after generating
ZBDDs.
To generate a graph for the formula x^2 + 4xy from the arithmetic expression
x × (x + 4y), for example, we first
generate graphs for "x," "y," and "4," and then apply some arithmetic operations.
After generating ZBDDs for polynomials, we can immediately check the equivalence
between two polynomials. We can also easily evaluate the polynomials in tenns of
the length, degrees, coefficients, etc.
By multiplying by a special symbol 2^k (or (-2)^k), we can perform a "shift" operation
on the coefficients in a formula.
8.2.2 Addition
We can explain the action of the algorithm using an example: the addition of
F = x + z and G = 3x + y (= 2^1·x + x + y). In the first execution, C ← x and
S ← 2^1·x + y + z. Since C ≠ 0, we repeat the procedure with F = 2^1·x + y + z (= S)
and G = 2^1·x (= C × 2). In the second execution, C ← 2^1·x and S ← y + z, and
we repeat with F = y + z and G = 2^2·x. In the third execution, C = 0 and the result
2^2·x + y + z is obtained.
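The same carry loop can be demonstrated on plain integers viewed as sets of set-bit positions, which play the role of the power-of-two items (an illustrative analogue, not the ZBDD procedure itself):

```python
def add_as_sets(F, G):
    """Add two numbers given as sets of bit positions."""
    while G:
        S = F ^ G                     # carry-free sum (symmetric difference)
        C = F & G                     # carry (intersection)
        F, G = S, {b + 1 for b in C}  # shifting C plays the role of C x 2
    return F

# 5 + 3: {0, 2} + {0, 1} -> {3}, i.e. 8
```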
When the coding based on (-2)^k is used, the addition and subtraction can be performed
as follows:

(F + G) = S - (C × (-2)),
(F - G) = D + (B × (-2)),

where D = F ⊕ G and B = F̄ ∩ G.
In this procedure, the addition and subtraction are called alternately.
Using the two operations described in the previous section, we can compose an algorithm
for multiplying two polynomials. This algorithm is based on the divide-and-conquer
idea. Suppose that v is the variable of the root node in the ZBDD. We first factor F
and G into two parts:

F = v·F1 ∪ F0,   G = v·G1 ∪ G0.
8.2.4 Division
In polynomial calculus, the division operation yields a quotient (FIG) and a remainder
(F%G). This algorithm can also be described by a recursive formula. Suppose;1: is
the variable of the root node in ZBDD, and that oS and t are the highest degrees of;1: in
F and G, respectively. They are then factored into two parts:
(F/G)
procedure(F × G)
{ if (F.top < G.top) return (G × F) ;
  if (G = 0) return 0 ;
  if (G = 1) return F ;
  H ← cache("F × G") ; if (H exists) return H ;
  v ← F.top ; /* the highest variable in F */
  (F0, F1) ← factors of F by v ;
  (G0, G1) ← factors of G by v ;
  H ← (F1 × G1) × v^2 + (F1 × G0 + F0 × G1) × v + (F0 × G0) ;
  cache("F × G") ← H ;
  return H ;
}
trivial ones. (For example, F/G = 0 when s < t.) A hash-based cache memory is
also used to avoid duplicate executions for equivalent subformulas. The remainder
(F%G) can be obtained by F - (F/G) × G.
When dealing with more than one sort of variable, this algorithm may give different
quotients depending on the variable ordering of the ZBDDs. However, the quotient is
computed uniquely for any variable ordering when the remainder is zero. Using this
division algorithm, we can easily check whether a polynomial is a factor of another
polynomial.
8.2.5 Substitution
Using the division operation, we can compute the substitution operation F[x = G] for
given polynomials F and G, and for a variable x contained in F. The algorithm can
be written as:

F[x = G] = (F/x)[x = G] × G + (F%x).

(F/x)[x = G] is computed recursively. If no x appears in F, then F[x = G] = F.
We can use this algorithm to perform various substitutions, such as F[x = x+1],
F[x = y^2], and F[x = 5]. This operation is very useful for practical applications.
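For a single variable, the recursion can be sketched with dense coefficient dictionaries; poly_add and poly_mul are ordinary polynomial addition and multiplication, and all names here are illustrative rather than the book's ZBDD operators:

```python
def poly_add(A, B):
    out = dict(A)
    for e, c in B.items():
        out[e] = out.get(e, 0) + c
    return {e: c for e, c in out.items() if c}

def poly_mul(A, B):
    out = {}
    for ea, ca in A.items():
        for eb, cb in B.items():
            out[ea + eb] = out.get(ea + eb, 0) + ca * cb
    return {e: c for e, c in out.items() if c}

def subst(F, G):
    """F[x = G] = (F/x)[x = G] * G + (F % x); polynomials as {exp: coeff}."""
    if set(F) <= {0}:                                # no x left in F
        return dict(F)
    Fq = {e - 1: c for e, c in F.items() if e > 0}   # quotient F/x
    Fr = {0: F[0]} if 0 in F else {}                 # remainder F%x
    return poly_add(poly_mul(subst(Fq, G), G), Fr)

# (x^2 + 1)[x = x + 1] = x^2 + 2x + 2
```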
 n    ZBDD nodes    Time (sec)
10         8           0.1
20        22           0.3
30        32           0.9
40        43           1.7
50        64           2.8
56        62           3.2
Our method greatly accelerates the computation of polynomials and expands the scale
of their applicability. It is especially effective when dealing with many sorts of
variables, a feat that has been difficult for conventional methods.
We have proposed an elegant way to represent polynomials by using ZBDDs, and have
shown efficient algorithms for manipulating those polynomials. Our experimental
results indicate that we can manipulate large-scale polynomials implicitly within a
feasible time and space. An important feature of our representation is that it is the
canonical form of a polynomial under a fixed variable order. Because the polynomial
calculus is a basic part of mathematics, our method is very useful in various areas.
In the future, we will try to implement other interesting operations on polynomials, such
as differentiation, finding max/min values, solving equations, and approximation.
9
ARITHMETIC BOOLEAN
EXPRESSIONS
9.1 INTRODUCTION
Boolean expressions are sometimes used in the research and development of digital
systems, but calculating Boolean expressions by hand is a cumbersome job, even
when they have only a few variables. For example, the Boolean expressions
(ā∧b) ∨ (a∧c̄) ∨ (b̄∧c), (ā∧b) ∨ (ā∧c) ∨ (a∧b̄) ∨ (a∧c̄), and (ā∧b) ∨ (a∧b̄) ∨ (b̄∧c) ∨ (b∧c̄)
represent the same function, but it is hard to verify their equivalence by hand. If
they have more than five or six variables, we might as well give up. This problem
motivated us to develop a Boolean expression manipulator (BEM)[MIY89], which is
an interpreter that uses BDDs to calculate Boolean expressions. It enables us to check
the equivalence and implications of Boolean expressions easily, and it helped us in
developing VLSI design systems and solving combinatorial problems.
Most discrete problems can be described by logic expressions; however, arithmetic
operators such as addition, subtraction, multiplication, and comparison are convenient
for describing many practical problems. Such expressions can be rewritten using logic
operators only, but this can result in expressions that are complicated and hard to read.
In many cases, arithmetic operators provide simple descriptions of problems.
In this chapter, we present a new Boolean expression manipulator that allows the use
of arithmetic operators[Min93a]. This manipulator, called BEM-II, can directly solve
problems represented by a set of equalities and inequalities, which are dealt with in 0-1
linear programming. Of course, it can also manipulate ordinary Boolean expressions.
We developed several output formats for displaying expressions containing arithmetic
operators.
In the following sections, we first show a method for manipulating Boolean expressions
with arithmetic operations. We then present the implementation of BEM-II and its
applications.
Most discrete problems can be described by using logic operations; however, arithmetic
operators are useful for describing many practical problems. For example, a majority
function with five inputs can be expressed concisely by using arithmetic operators:

(x1 + x2 + x3 + x4 + x5 ≥ 3).
9.2.1 Definitions
For example, (3 × x1 + x2) is an arithmetic Boolean expression with respect to the
variables x1, x2 ∈ {0, 1}. (3 × x1 + x2 < 4) is also an example.
x1                0   0   1   1
x2                0   1   0   1
3 × x1            0   0   3   3
3 × x1 + x2       0   1   3   4
3 × x1 + x2 < 4   1   1   1   0

Figure 9.1 Computation of arithmetic Boolean expressions.
When ordinary logic operations are applied to integer values other than 0 and 1, we
define them as bit-wise logic operations on binary-coded numbers, as in many
programming languages. For example, (3 ∨ 5) returns 7. Under this modeling
scheme, conventional Boolean expressions become special cases of arithmetic Boolean
expressions.
The value of the expression (3 × x1 + x2) becomes 0 when x1 = x2 = 0, and it
becomes 4 when x1 = x2 = 1. We can see that an arithmetic Boolean expression
represents a function from a binary vector to an integer: (B^n → I). We call this
a Boolean-to-integer (B-to-I) function, which has been discussed in Section 4.2.
The operators in arithmetic Boolean expressions are defined as operations on B-to-I
functions.
The procedure for obtaining the B-to-I function for the arithmetic Boolean expression
(3 × x1 + x2 < 4) is shown in Fig. 9.1. First, multiply the constant function 3 by
the input function x1 to obtain the B-to-I function for (3 × x1). Then add x2 to
obtain the function for (3 × x1 + x2). Finally, we can get a B-to-I function for the
entire expression (3 × x1 + x2 < 4) by applying the comparison operator (<) with the
constant function 4. We find that this arithmetic Boolean expression is equivalent to
the expression (x̄1 ∨ x̄2).
As discussed in Section 4.2, there are two ways to represent B-to-I functions: multi-
terminal BDDs (MTBDDs) and BDD vectors. MTBDDs are extended BDDs with
multiple terminals, each of which has an integer value (Fig. 9.2(a)), and BDD vectors
represent B-to-I functions with a number of BDDs by encoding the
integer numbers into binary vectors (Fig. 9.2(b)).
The efficiency of the two representations depends on the nature of the objective
functions. In manipulating arithmetic Boolean expressions, we often generate B-to-I
functions from Boolean functions, as when we calculate F × 2, F × 5, or F × 100 from
a certain Boolean function F. In such cases, the BDD vectors can be conveniently
shared with each other (Fig. 9.3). Multi-terminal BDDs, however, cannot be shared
(Fig. 9.4), so we use BDD vectors for manipulating arithmetic Boolean expressions.
There are two kinds of inversion: bit-wise inversion and logical inversion. Logical
inversion returns 1 for 0, and returns 0 for all other numbers.
Arithmetic addition can be composed using logic operations on BDDs by simulating
a conventional hardware algorithm of full-adders designed as combinational circuits.
We use a simple algorithm for a ripple carry adder, which computes from the lower
bit to the higher bit, propagating carries. Other arithmetic operations like subtraction,
multiplication, division, and shifting can be composed in the same way. Exception
handling should be considered for overflow and division by zero.
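The ripple-carry construction can be sketched with Boolean functions encoded as truth-table bitmask integers instead of BDDs; the logic operations then become plain integer bit operations (illustrative code, not the book's BDD package):

```python
def vec_add(F, G):
    """Ripple-carry addition of two bit-function vectors (LSB first).

    Each entry is a truth-table mask: bit j of the mask is the
    function's value under input assignment number j.
    """
    width = max(len(F), len(G)) + 1   # room for the final carry
    carry, out = 0, []
    for i in range(width):
        f = F[i] if i < len(F) else 0
        g = G[i] if i < len(G) else 0
        out.append(f ^ g ^ carry)              # full-adder sum
        carry = (f & g) | (carry & (f ^ g))    # full-adder carry
    return out

# variables a, b over the four assignments (a,b) = 00, 01, 10, 11:
a, b = 0b1100, 0b1010
S = vec_add([a], [b])   # S encodes the B-to-I function a + b
```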
It is often useful to find the upper (or lower) bound value of a B-to-I function over all
possible combinations of input values. This can be done efficiently by using a binary
search. To find the upper bound, we first check whether the function can ever exceed
2^n. If there is a case in which it does, we then compare it with 2^n + 2^(n-1); otherwise,
only with 2^(n-1). In this way, all the bits can be determined from the highest to the
lowest, and eventually the upper bound is obtained. The lower bound is found in a
similar way.
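With the same truth-table-mask encoding as before, the bit-by-bit search can be sketched as follows (illustrative; a real implementation would test BDDs for non-emptiness instead of integer masks):

```python
def upper_bound(F, n_inputs):
    """Upper bound of an unsigned B-to-I function given as masks (LSB first)."""
    candidates = (1 << (1 << n_inputs)) - 1   # all input assignments
    bound = 0
    for i in reversed(range(len(F))):
        hit = candidates & F[i]
        if hit:               # some remaining assignment sets this bit
            bound |= 1 << i
            candidates = hit  # keep only the assignments achieving it
    return bound

# F encodes 2a + b over (a,b) = 00, 01, 10, 11: bit0 = b, bit1 = a
F = [0b1010, 0b1100]
# the bits are fixed from the highest down, giving the bound 3
```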
Computing the upper (or lower) bound is a unary operation for B-to-I functions; it
returns a constant B-to-I function and can be used conveniently in arithmetic Boolean
expressions. For example, the expression (F == UpperBound(F))
gives a function that returns 1 for the inputs that maximize F; otherwise it returns 0.
That is, it computes the condition for maximizing F.
Bitwise Expressions
Figure 9.5 Generation of BDD vectors for arithmetic Boolean expressions.
When the objective function is too complicated for an integer Karnaugh map, the
function can be displayed by listing Boolean expressions for the respective bits of the
BDD vector in the sum-of-products format. Figure 9.7 shows a bitwise expression
for the same function shown in Fig. 9.6. We used the ISOP algorithm (described in
Chapter 5) to generate a compact sum-of-products format for each bit.
A bitwise expression is not very helpful for showing the behavior of B-to-I functions,
but it does allow us to observe the appearance frequency of an input variable and to
estimate the complexity of the functions.
F = 2a + 3b - 4c + d

        cd
  ab    00   01   11   10
  00     0    1   -3   -4
  01     3    4    0   -1
  11     5    6    2    1
  10     2    3   -1   -2

Figure 9.6 Integer Karnaugh map format.
If a function never has negative values, we can suppress the expression for the sign
bit. If some higher bits are always zero, we can omit showing them. With this zero
suppression, a bitwise expression becomes a simple Boolean expression if the function
returns only 1 or O.
Case Enumeration
Using case enumeration, we can list all possible values of a function and display the
condition for each case using a sum-of-products format (Fig. 9.8). This format is
effective when there are many input variables but the range of output values is limited.
F = 2 × a + 3 × b - 4 × c + d
 ± : (ā∧c∧d̄) ∨ (b̄∧c)
F2 : (a∧b∧c̄) ∨ (ā∧c∧d̄) ∨ (b∧c̄∧d) ∨ (b̄∧c)
F1 : (a∧b̄) ∨ (a∧d) ∨ (ā∧b∧d̄)
F0 : (b∧d̄) ∨ (b̄∧d)
Figure 9.7 Bitwise expression format.
 6 : a∧b∧c̄∧d
 5 : a∧b∧c̄∧d̄
 4 : ā∧b∧c̄∧d
 3 : (ā∧b∧c̄∧d̄) ∨ (a∧b̄∧c̄∧d)
 2 : (a∧b̄∧c̄∧d̄) ∨ (a∧b∧c∧d)
 1 : (a∧b∧c∧d̄) ∨ (ā∧b̄∧c̄∧d)
 0 : (ā∧b̄∧c̄∧d̄) ∨ (ā∧b∧c∧d)
-1 : (ā∧b∧c∧d̄) ∨ (a∧b̄∧c∧d)
-2 : a∧b̄∧c∧d̄
-3 : ā∧b̄∧c∧d
-4 : ā∧b̄∧c∧d̄
Figure 9.8 Case enumeration format.
This expression seems much more complicated than the original one, which has a
linear form. Here we propose a method for eliminating the negative literals from
the above expression and making an arithmetic sum-of-products expression, which
consists only of arithmetic addition, subtraction, and multiplication operators. Our
method is based on the following expansion:

F = x × F1 + x̄ × F0
  = x × (F1 - F0) + F0,

where F is the objective function, and F0 and F1 are the subfunctions obtained by
assigning 0 and 1 to the input variable x. By recursively applying this expansion to
all the input variables, we can generate an arithmetic sum-of-products expression
containing no negative literals. We can thereby extract a linear expression from a
B-to-I function represented by a BDD vector.
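The expansion can be prototyped over an explicit value table (illustrative Python); for a linear function such as 2a + 3b - 4c + d it recovers exactly the linear coefficients:

```python
def arith_sop(table, names):
    """Expand a value table into {monomial: coefficient} using
    F = x * (F1 - F0) + F0 recursively; names[0] is the variable
    of the most significant index bit of the table."""
    if not names:
        return {(): table[0]} if table[0] else {}
    half = len(table) // 2
    F0 = arith_sop(table[:half], names[1:])
    F1 = arith_sop(table[half:], names[1:])
    res = dict(F0)
    for mono in set(F0) | set(F1):
        d = F1.get(mono, 0) - F0.get(mono, 0)
        if d:
            res[(names[0],) + mono] = d
    return {m: c for m, c in res.items() if c}

# tabulate F = 2a + 3b - 4c + d over all 16 assignments of (a, b, c, d)
table = [2*a + 3*b - 4*c + d
         for a in (0, 1) for b in (0, 1) for c in (0, 1) for d in (0, 1)]
coeffs = arith_sop(table, ["a", "b", "c", "d"])
```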
9.3 APPLICATIONS
In BEM-II scripts, we can use two kinds of variables: input variables and register
variables. Input variables, denoted by strings starting with a lowercase letter, represent
the inputs of the functions to be computed. They are assumed to have a value of either
1 or O. Register variables, denoted by strings starting with an uppercase letter, are
used to identify the memory to which a B-to-I function is to be saved temporarily. We
can describe multi-level expressions using these two types of variables; for example,
F = a + b ; G = F + c.
Calculation results are displayed as expressions with input variables only, without
using register variables. BEM-II allows 65,535 different input variables to be used,
and there is no limit on the number of register variables.
BEM-II supports not only logical operators such as AND, OR, EXOR, and NOT,
but also arithmetic operators such as addition, subtraction, multiplication, division,
shifting, equality, inequality, and upper/lower bound. The syntax for expressions
1 This program, which runs on a SPARCstation, is public-domain software.
FTP server address: eda.kuec.kyoto-u.ac.jp (130.54.29.134), /pub/cad/BemII
BEM-II generates BDD vectors of B-to-I functions for given arithmetic Boolean
expressions. Since BEM-II can generate huge BDDs with millions of nodes, limited
only by memory size, we can manipulate large-scale and complicated expressions. It
can, of course, calculate any expressions that used to be manipulated by hand. The
results can be displayed in the various formats presented in earlier sections.
The input variables are assumed to have values that are either 1 or 0, but
multi-valued variables are sometimes used in real problems. In such cases, we
can use register variables to deal with multi-valued variables. For example,
x = x0 + 2*x1 + 4*x2 + 8*x3 represents a variable ranging from 0 to 15.
Alternatively, x = x1 + 2*x2 + 3*x3 + 4*x4 represents a variable
ranging from 1 to 4 under the one-hot constraint (x1 + x2 + x3 + x4 == 1).
BEM-II can be used for solving many kinds of combinatorial problems. Using BEM-II,
we can generate BDDs for the constraint functions of combinatorial problems specified
by arithmetic Boolean expressions. This enables us to solve 0-1 linear programming
problems by handling equalities and inequalities directly, without coding complicated
procedures in a programming language. BEM-II can also solve problems which are
expressed by non-linear expressions. A key feature of BEM-II is its customizability: we can
% bemII
***** Arithmetic Boolean Expression Manipulator (Ver. 4.2) *****
> symbol a b c d
> F = 2*a + 3*b - 4*c + d
> print /map F
a b c d
00 01 11 10
00 0 1 -3 -4
01 3 4 0 -1
11 5 6 2 1
10 2 3 -1 -2
> print /bit F
+-: !a & c & !d | !b & c
 2: a & b & !c | !a & c & !d | b & !c & d | !b & c
 1: a & !b | a & d | !a & b & !d
 0: b & !d | !b & d
> print F > 0
a & b | a & !c | b & !c | !c & d
> M = UpperBound(F)
> print M
6
> print F == M
a & b & !c & d
> C = (F >= -1) & (F < 4)
> print C
a & c & d | !a & !c & !d | b & c | !b & !c
> print /map C
a b c d
00 01 11 10
00 1 1 0 0
01 1 0 1 1
11 0 0 1 1
10 1 1 1 0
> quit
%
compose scripts for various applications much more easily than we can by developing
and tuning a specific program.
The subset-sum problem is one example of a combinatorial problem that can easily be
described by arithmetic Boolean expressions and solved by BEM-II. This problem is
to find a subset of a given set of positive integers {a1, a2, ..., an} whose sum is
a given number b. It is a fundamental problem for many applications, including VLSI
CAD systems.
In a BEM-II script, we use n input variables to represent whether or not the i-th
number is chosen, and the constraint on these input variables can be described with
simple arithmetic Boolean expressions. The following is an example of a BEM-II script
for a subset-sum problem:
symbol x1 x2 x3 x4 x5
S = 2*x1 + 3*x2 + 3*x3 + 4*x4 + 5*x5
C = (S == 12)
The C represents the assignments of the input variables that satisfy the constraint.
This expression is almost the same as the definition of the problem, so we can easily
write and understand this script.
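For this small instance, the satisfying assignments can be cross-checked by exhaustive enumeration in plain Python, independent of BEM-II:

```python
from itertools import product

# subset-sum instance: weights 2, 3, 3, 4, 5, target 12
weights, target = [2, 3, 3, 4, 5], 12
solutions = [x for x in product((0, 1), repeat=5)
             if sum(w * xi for w, xi in zip(weights, x)) == target]
# e.g. (1, 1, 1, 1, 0) chooses 2 + 3 + 3 + 4 = 12
```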
BEM-II is convenient not only for solving the problem but also for analyzing the
nature of the problem. We can analyze the behavior of the constraint functions by
displaying them in various formats. An example of BEM-II execution for a subset-sum
problem is shown in Fig. 9.10.
These constraints can be described with simple arithmetic Boolean expressions as:
% bemII
***** Arithmetic Boolean Expression Manipulator (Ver. 4.2) *****
> symbol x1 x2 x3 x4 x5
> S = 2*x1 + 3*x2 + 3*x3 + 4*x4 + 5*x5
> print /map S
 x1 x2 x3 x4 x5
    | 000 001 011 010 110 111 101 100
 00 |   0   5   9   4   7  12   8   3
 01 |   3   8  12   7  10  15  11   6
 11 |   5  10  14   9  12  17  13   8
 10 |   2   7  11   6   9  14  10   5
> C = (S == 12)
> print C
x1 & x2 & x3 & x4 & !x5 | !x1 & x2 & !x3 & x4 & x5 |
!x1 & !x2 & x3 & x4 & x5
> print /map C ? S 0
 x1 x2 x3 x4 x5
    | 000 001 011 010 110 111 101 100
 00 |   0   0   0   0   0  12   0   0
 01 |   0   0  12   0   0   0   0   0
 11 |   0   0   0   0  12   0   0   0
 10 |   0   0   0   0   0   0   0   0
> print /map (S >= 12) ? S 0
 x1 x2 x3 x4 x5
    | 000 001 011 010 110 111 101 100
 00 |   0   0   0   0   0  12   0   0
 01 |   0   0  12   0   0  15   0   0
 11 |   0   0  14   0  12  17  13   0
 10 |   0   0   0   0   0  14   0   0
> quit
%
BEM-II analyzes the above expressions directly. This is much easier than creating a
specific program in a programming language. The script for the 8-queens problem
took only ten minutes to create.
Table 9.2 shows the results when we applied this method to N-queens problems. In our
experiments, we solved the problem up to N = 11. When seeking only one solution,
we can solve the problem for larger values of N by using a conventional algorithm
based on backtracking. However, the conventional method neither enumerates all the
solutions nor counts the number of solutions for larger values of N. The BDD-based
method generates all the solutions simultaneously and keeps them in a BDD. Therefore,
if an additional constraint is appended later, we can revise the script easily without
rewriting the program from the beginning. This customizability makes BEM-II very
efficient in terms of the total time for programming and execution.
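For contrast with the BDD-based enumeration, the conventional backtracking approach mentioned above can be sketched as an ordinary counting program (this is plain Python, not part of BEM-II):

```python
# Conventional backtracking counter for the N-queens problem: place one
# queen per row, pruning columns and both diagonals already attacked.
def count_queens(n):
    def place(row, cols, diag1, diag2):
        if row == n:
            return 1
        total = 0
        for col in range(n):
            if col in cols or (row - col) in diag1 or (row + col) in diag2:
                continue
            total += place(row + 1, cols | {col},
                           diag1 | {row - col}, diag2 | {row + col})
        return total
    return place(0, frozenset(), frozenset(), frozenset())

print(count_queens(8))   # the 8-queens problem has 92 solutions
```

Note that this counter visits the solutions one at a time, while the BDD-based method holds the entire solution set in one shared structure, so later constraints can be conjoined without restarting the search.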
The traveling salesman problem (TSP) can also be solved by using BEM-II. The problem
is to find the minimum-cost tour that visits each of the given cities exactly once and
returns to the starting city.
Step 2: Let F1 = x12 F2 ∨ x13 F3 ∨ ... ∨ x1n Fn.
The logical product of all the above constraint expressions becomes the solution.
BEM-II feeds these expressions directly, and generates BDDs representing all the
possible paths visiting the n cities. As mentioned in the previous section, we can
specify the cost (distance) of each path, and find an optimal solution to the problem
after generating the BDDs. Experimental results are listed in Table 9.3. We can solve
the problem for n up to 10. This may seem poor, since conventional methods can handle
instances with n greater than 1000; however, our method computes all the solutions
at once, and additional constraints can be specified flexibly.
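As a small illustration of computing all the solutions at once, the following brute-force sketch enumerates every tour and its cost before picking the minimum; the 4-city distance matrix is made up for the example, not taken from the book:

```python
# Brute-force TSP sketch: enumerate every tour starting and ending at
# city 0 over a small hypothetical symmetric distance matrix.
from itertools import permutations

dist = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]
n = len(dist)

tours = {}   # tour -> cost, for all (n-1)! tours
for perm in permutations(range(1, n)):
    tour = (0,) + perm + (0,)
    tours[tour] = sum(dist[a][b] for a, b in zip(tour, tour[1:]))

best = min(tours, key=tours.get)
print(best, tours[best])
```

Because every tour and its cost are kept, an extra constraint (e.g. "city 2 must be visited before city 3") can be applied afterwards by filtering `tours`, mirroring the flexibility described for the BDD-based formulation.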
In designing high-speed digital systems, timing analysis for logic circuits is important.
The orthodox approach is to traverse the circuit to find the active path with the
topologically maximum length. Takahara[Tak93] proposed a new method using
Arithmetic Boolean Expressions 125
BEM-II. This method calculates B-to-I functions representing the delay time with
respect to the values of the primary inputs. Using this method, we can completely
analyze the timing behavior of a circuit for any combination of input values.
The B-to-I functions for the delay time can be described by a number of arithmetic
Boolean expressions, each of which specifies the signal propagation at each gate. For
a two-input AND gate with delay D, for example, if Ta and Tb are the signal arrival
times at the two input pins, and Va and Vb are their final logic values, the signal
arrival time Tc at the output pin is expressed as:
Tc = Tb + D when (Ta ≤ Tb) and (Va = 1),
Tc = Ta + D when (Ta ≤ Tb) and (Va = 0),
Tc = Ta + D when (Ta > Tb) and (Vb = 1),
Tc = Tb + D when (Ta > Tb) and (Vb = 0).
These rules can be described by the following arithmetic Boolean expression:
Tc = D + ((Ta > Tb) ? (Vb ? Ta : Tb) : (Va ? Tb : Ta)).
By calculating such expressions for all the gates in the circuit, we can generate BDD
vectors for the B-to-I functions of the delay time. Experimental results for practical
benchmark circuits are listed in Table 9.4 [Tak93]. The size of the BDDs for the delay
time is about 20 times greater than that of the BDDs for the Boolean functions of the
circuits.
The generated BODs maintain the timing information for all of the internal nets in the
circuit, and we can use BEM-II to analyze the circuits in various ways. For example,
we can easily compare the delay times of two nets in the circuit.
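The arrival-time rule above can be sketched as an ordinary function; this is a scalar illustration only, whereas BEM-II evaluates the same expression symbolically, as a B-to-I function of the primary inputs:

```python
# Scalar sketch of the AND-gate arrival-time rule:
# Tc = D + ((Ta > Tb) ? (Vb ? Ta : Tb) : (Va ? Tb : Ta)).
def and_gate_arrival(Ta, Va, Tb, Vb, D):
    """Arrival time at the output of a two-input AND gate with delay D."""
    if Ta > Tb:
        # Input b arrives first: if b settles to 1 we wait for a,
        # if b settles to 0 the output is decided when b arrives.
        return D + (Ta if Vb else Tb)
    else:
        # Input a arrives first (or simultaneously): symmetric rule on Va.
        return D + (Tb if Va else Ta)

# If b settles to 0 early (Tb < Ta, Vb = 0), the output is decided as
# soon as b arrives, regardless of the slower input a.
print(and_gate_arrival(Ta=8, Va=1, Tb=3, Vb=0, D=2))   # → 5
```

This value-dependent rule is what distinguishes the method from purely topological analysis, which would always report the longest path.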
[Fig. 9.11: scheduling table of operation numbers (c2, c3, c4) versus clock-cycle numbers]
Scheduling is one of the most important subtasks that must be solved to perform data
path synthesis. Miyazaki [Miy92] proposed a method for solving scheduling problems
by using BEM-II. The problem is to find the minimum-cost scheduling for a procedure
specified by a data-flow graph under such constraints as the number of operation units
and the maximum number of clock cycles (Fig. 9.11). This problem can be solved by
using linear programming, but BEM-II can also be used.
When m is the total number of operations that appear in the data-flow graph and n is
the maximum number of clock cycles, we allocate m × n input variables, from x11 to
xmn, where xij represents the i-th operation being executed in the j-th clock cycle. Using
this variable coding, we can represent the constraints of the scheduling problem as
follows.
3. If two operations have a dependency in the data-flow graph, the operation in the
upper stream has to be executed before the one in the lower stream.
Let C1 = 1 × x11 + 2 × x12 + ... + n × x1n.
Let C2 = 1 × x21 + 2 × x22 + ... + n × x2n.
The logical product of all these constraint expressions becomes the solution to the
scheduling problem. Using BEM-II, we can easily specify the cost of operation and
the other constraints. BEM-II analyzes the above expressions and tries to generate
BDDs that represent the solutions. If it succeeds in generating BDDs in main
memory, we can immediately find a solution to the problem and count the number of
solutions. Table 9.5 [Miy92] lists the experimental results for benchmark data from
the High-Level Synthesis Workshop (HLSW). The BDDs for the constraint functions can
be generated within a feasible amount of time and memory.
10 CONCLUSIONS
In this book, we have discussed techniques related to BDDs and their applications for
VLSI CAD systems.
The variable ordering method is one of the most important issues in the application of
BDDs. In Chapter 3, we discussed the properties of variable ordering and showed two
heuristic methods: the DWA method and the minimum-width method. The DWA method finds
an appropriate order before generating BDDs. It refers to topological information
which specifies the sequence of logic operations in the Boolean expression or logic
circuit. Experimental results show that for many practical circuits the DWA method
finds a tolerable order in a short computation time. The minimum-width method, on
the other hand, finds an appropriate order for a given BDD without using additional
information. It seeks a good order from a global viewpoint, rather than by doing an
incremental search. In many cases, this method gives a better order than the DWA
method does in a longer but still reasonable computation time. The techniques of
variable ordering are still being studied intensively, but it is almost impossible to
have a method that always finds the best order within a practical time. We will
therefore have to use heuristic methods selected according to the application.
As our understanding of BDDs has deepened, their range of applications has broadened.
In VLSI CAD problems, we are often faced with manipulating not only Boolean
functions but also sets of combinations. In Chapter 6, we proposed Zero-Suppressed
BDDs (ZBDDs), which are BDDs based on a new reduction rule. ZBDDs enable us to
manipulate sets of combinations more efficiently than we can when using conventional
BDDs. The effect of ZBDDs is especially remarkable when we are manipulating
sparse combinations. We discussed the properties of ZBDDs and their efficiency
as revealed by a statistical experiment. We then presented the basic operators for
ZBDDs. These operators are defined as operations on sets of combinations, and
differ slightly from the operations on Boolean functions based on conventional BDDs.
On the basis of these ZBDD techniques, we discussed the calculation of unate cube set
algebra. We developed efficient algorithms for computing unate cube set operations
including multiplication and division, and we showed some practical applications. For
solving some types of combinatorial problems, unate cube set algebra is more useful
than using conventional logic operations. We expect the unate cube set calculator to
be a helpful tool in developing VLSI CAD systems and in various other applications.
In Chapter 9, we described a useful tool for research in computer science. Using
the BDD techniques, we have developed an arithmetic Boolean expression manipulator
(BEM-II) that can easily solve many kinds of combinatorial problems. It calculates
not only binary logic operations but also arithmetic operations on multi-valued logic,
such as addition, subtraction, multiplication, division, equality, and inequality. Such
arithmetic operations provide simple descriptions for various problems. BEM-II feeds
and computes problems described as equalities and inequalities, as they are dealt
with in 0-1 linear programming. It is therefore not necessary to write a specific
program in a programming language for solving a problem. In this chapter, we
discussed the data structure and algorithms for the arithmetic operations. We then
presented the specification of BEM-II and some application examples. Experimental
results indicate that the customizability of BEM-II makes it very efficient in terms of
total time for programming and execution. We therefore expect it to be a useful tool
in digital systems research and development.
A number of works based on the BDD techniques are in progress. The novelty of
BDD manipulation can be summarized as follows:
2. Eliminates redundancy by ensuring that there are no duplicate nodes and that
equivalent computation is not repeated.
The BDD algorithms are based on fast searches of hash tables and on linked-list
data structures, both of which greatly benefit from the random-access machine
model, in which any data in main memory can be accessed in constant time.
Because most computers are designed according to this model, we conclude that the
BDD manipulation algorithms are well adapted to the conventional computer model.
BDDs and their improvements will become key techniques in VLSI CAD systems and
other areas of computer science.
REFERENCES
[CB89] K. Cho and R. E. Bryant. Test pattern generation for sequential MOS
circuits by symbolic fault simulation. In Proc. of 26th ACM/IEEE
Design Automation Conference (DAC'89), pp. 418-423, June 1989.
[Rud93] R. Rudell. Dynamic variable ordering for ordered binary decision
diagrams. In Proc. of IEEE/ACM International Conference on Computer-
Aided Design (ICCAD-93), pp. 42-47, November 1993.