On Bipartite and Multipartite Clique Problems
Milind Dawande
Pinar Keskinocak
Jayashankar M. Swaminathan
and
Sridhar Tayur
0196-6774/01 $35.00
2001 Elsevier Science
All rights reserved.
1. INTRODUCTION
¹ Note the difference between multipartite graphs and the well-known class of k-partite
graphs. A graph $G = (V, E)$ is k-partite if $V$ can be partitioned into $k$ subsets $V_1, \ldots, V_k$ such
that for every edge $(u, v) \in E$, $u$ and $v$ belong to different vertex sets of the partition. The
class of multipartite graphs is contained in the class of k-partite graphs.
perceived by the customers for these products is to reduce the final assem-
bly time, where such a reduction can be obtained by creating subassemblies
(or vanilla boxes) in advance (see [11] for details). A vanilla box $U_1$ containing
parts $i_1, \ldots, i_k$ can be used only in products which contain all of these
parts. In other words, the set of products $U_2 = \{ j : (i_l, j) \in E,\ l = 1, \ldots, k \}$
can use the vanilla box $U_1$. Let $t_{ij}$ be the assembly time of component $i$ in
product j. If the total assembly time of the components in vanilla box U1
is T , then we can obtain a reduction of T in the lead times of all the prod-
ucts in U2 by having enough inventory of these vanilla boxes. On the other
hand, to obtain a large T , we have to include many parts in the vanilla
box, which will usually decrease the number of products which can use the
vanilla box (size of U2 ). Then, there is a trade-off between constructing a
large vanilla box and using it in many products. The problem of finding
a “good” vanilla box can be modeled by finding a maximum edge weight
biclique in the bipartite graph B. If all the parts have (approximately) the
same assembly time the problem reduces to the maximum edge cardinality
biclique problem (MBP). A natural generalization of the bipartite clique
problem is the multipartite clique problem. In such a case, each multipar-
tite clique in the graph represents a possible storage of vanilla boxes at
different levels in the assembly process such that a vanilla box in a later
level in assembly is itself assembled in part from another vanilla box from
the previous level (because a biclique between any two levels i and i + 1
acts as vanilla box for that level).
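The trade-off described above can be made concrete with a small brute-force sketch. It is only an illustration under stated assumptions: the function names, the toy parts and products, and the unit assembly times are all hypothetical, and the enumeration is exponential in the number of parts, so it is not a practical method.

```python
from itertools import combinations

def products_using_box(box, edges, products):
    """U2: the products that contain every part in the vanilla box."""
    return {j for j in products if all((i, j) in edges for i in box)}

def best_vanilla_box(parts, products, edges, t):
    """Brute-force maximum edge weight biclique (exponential in len(parts))."""
    best = (0, frozenset(), frozenset())
    for size in range(1, len(parts) + 1):
        for box in combinations(sorted(parts), size):
            users = products_using_box(box, edges, products)
            # Edge weight of the biclique box x users.
            weight = sum(t[(i, j)] for i in box for j in users)
            if weight > best[0]:
                best = (weight, frozenset(box), frozenset(users))
    return best

# Hypothetical toy instance: edge (part, product) means the product uses the part.
PARTS = {"cpu", "ram", "disk"}
PRODUCTS = {"A", "B", "C"}
EDGES = {("cpu", "A"), ("cpu", "B"), ("cpu", "C"),
         ("ram", "A"), ("ram", "B"),
         ("disk", "A")}
TIMES = {e: 1 for e in EDGES}  # equal assembly times, i.e. the MBP special case

BEST = best_vanilla_box(PARTS, PRODUCTS, EDGES, TIMES)
```

In this toy instance the largest box (all three parts) serves only one product, while the single part used by every product gives a small box; the best biclique lies in between.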
Bicliques have also been studied in the area of formal concept analysis
[3, 4]. Consider two sets $V_1$ and $V_2$ (the set of “attributes” and the set of
“objects”) and a relation $R$ between $V_1$ and $V_2$ ($(i, j) \in R$ if object $j$ has
attribute $i$). For subsets $P \subset V_1$ and $Q \subset V_2$, let
$P'$ = the set of all objects which have all the attributes in $P$, and
$Q'$ = the set of all attributes which all the objects of $Q$ have.
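The two derived sets can be computed directly from the relation. The following is a minimal sketch, assuming the relation is stored as a set of (attribute, object) pairs; the function names and the toy data are illustrative and not taken from [3, 4].

```python
def objects_with_all(P, R, objects):
    """Objects that have every attribute in P."""
    return {j for j in objects if all((i, j) in R for i in P)}

def attributes_of_all(Q, R, attributes):
    """Attributes shared by every object in Q."""
    return {i for i in attributes if all((i, j) in R for j in Q)}

# Hypothetical toy relation: (attribute, object) pairs.
ATTRIBUTES = {"wings", "flies", "swims"}
OBJECTS = {"duck", "penguin", "sparrow"}
R = {("wings", "duck"), ("wings", "penguin"), ("wings", "sparrow"),
     ("flies", "duck"), ("flies", "sparrow"),
     ("swims", "duck"), ("swims", "penguin")}

P = {"wings", "flies"}
Q = objects_with_all(P, R, OBJECTS)        # objects having all of P
P_back = attributes_of_all(Q, R, ATTRIBUTES)  # attributes common to those objects
```

Every attribute in `P` is related to every object in `Q`, so the pair induces a biclique in the bipartite graph of the relation.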
TABLE 1
Variants of Biclique Problems
Abbreviation Problem
In this section, we first present the formulation for the biclique problem
and discuss known results. Then we show that MWBP and four variants
of MBP are NP-complete. Note that since the complement of a bipartite
graph is not bipartite in general, the polynomial-time solvability of the inde-
pendent set problem on bipartite graphs does not imply a polynomial time
algorithm for MBP. Finally, we compare the sizes of maximum balanced
bicliques and maximum edge cardinality bicliques in random graphs.
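In a node weighted bipartite graph $B = (V_1 \cup V_2, E)$, there is a weight $w_v$
associated with each node $v$. The maximum node weight biclique problem
(MNWBP) can be formulated as a 0–1 integer program as
\[
\begin{aligned}
\max\ & \sum_{u \in V_1} w_u x_u + \sum_{v \in V_2} w_v x_v \\
\text{subject to}\ & x_u + x_v \le 1, \quad u \in V_1,\ v \in V_2,\ (u, v) \notin E, && (1) \\
& x_v \in \{0, 1\} \quad \text{for all } v \in V_1 \cup V_2, && (2)
\end{aligned}
\]
where
\[
x_v = \begin{cases} 1 & \text{if node } v \text{ is in the biclique,} \\ 0 & \text{otherwise.} \end{cases}
\]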
If we relax this integer program by replacing the integrality constraints (2)
with 0 ≤ xv ≤ 1 for all v ∈ V1 ∪ V2 , we obtain a linear program. Note that
the matrix defining the constraint set (1) is the node-edge incidence matrix
of a bipartite graph, which is totally unimodular, and hence the solution to
the linear programming relaxation will be integer [9, p. 544, Corollary 2.9].
Therefore, the maximum node weight biclique problem is polynomially solv-
able [5]. It follows that the maximum node cardinality biclique problem is
also polynomially solvable. A restricted version of these problems, where
there is an additional requirement that $|U_1| = |U_2|$, is called the maximum
balanced node cardinality biclique problem (MBBP), which is NP-complete
[5]. (This problem is referred to as the balanced complete bipartite sub-
graph problem in [5, p. 196].) Note that for the same bipartite graph, solu-
tions to MBBP and MBP may be quite different from each other. Hence,
node-cardinality biclique problems do not provide good approximations for
MBP in general. In Section 2.2, we quantify the difference between the
solutions to MBP and MBBP in random graphs.
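To make the 0–1 formulation of MNWBP concrete, the sketch below enumerates all 0–1 vectors and keeps the best one satisfying constraint (1). This is deliberately not the polynomial-time method discussed above (which solves the linear programming relaxation); it is a brute-force illustration on a hypothetical four-node instance, and all names are assumptions.

```python
from itertools import product

def max_node_weight_biclique(V1, V2, E, w):
    """Brute-force solution of the 0-1 program; exponential, for tiny graphs only."""
    nodes = list(V1) + list(V2)
    # Constraint (1) has one row per non-adjacent pair (u, v).
    non_edges = [(u, v) for u in V1 for v in V2 if (u, v) not in E]
    best_value, best_nodes = 0, set()
    for bits in product((0, 1), repeat=len(nodes)):
        x = dict(zip(nodes, bits))
        if all(x[u] + x[v] <= 1 for u, v in non_edges):
            value = sum(w[v] * x[v] for v in nodes)
            if value > best_value:
                best_value, best_nodes = value, {v for v in nodes if x[v]}
    return best_value, best_nodes

# Hypothetical instance: the only missing edge is (u2, v2).
V1, V2 = ["u1", "u2"], ["v1", "v2"]
E = {("u1", "v1"), ("u1", "v2"), ("u2", "v1")}
W = {"u1": 2, "u2": 1, "v1": 2, "v2": 3}

RESULT = max_node_weight_biclique(V1, V2, E, W)
```

Because `u2` and `v2` are not adjacent, at most one of them can be selected, and the heavier `v2` is chosen.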
Hochbaum [7] considers a related problem to MWBP and MNWBP,
where the objective is to minimize the total weight of the nodes or edges
deleted so that the remaining subgraph is a biclique. She provides a
2-approximation for the edge deletion version for general and bipartite
graphs and a 2-approximation for the node deletion version for general
graphs.
Note that the reduction in Theorem 1 does not imply the NP-hardness
of MBP, since we used a weighted bipartite graph in the reduction in which
some edge weights were zero. An NP-completeness proof for MBP has been
recently provided in [10].
Next, we consider three decision problems and an optimization problem,
which are related to MBP:
• Exact balanced node cardinality decision problem (EBNCD): Given
a bipartite graph $G = (V_1 \cup V_2, E)$ and a positive integer $a \in Z^+$, does there
exist a biclique $C = U_1 \cup U_2$ with $|U_1| = |U_2| = a$?
• Exact node cardinality decision problem (ENCD): Given a bipartite
graph $G = (V_1 \cup V_2, E)$ and two positive integers $a, b \in Z^+$, does there
exist a biclique $C = U_1 \cup U_2$ with $|U_1| = a$ and $|U_2| = b$?
• Exact edge cardinality decision problem (EECD): Given a bipartite
graph $G = (V_1 \cup V_2, E)$ and a positive integer $k \in Z^+$, does there exist a
biclique with exactly $k$ edges?
• Maximum one-sided edge cardinality problem (MOFCP): Given a
bipartite graph $G = (V_1 \cup V_2, E)$ and a positive integer $k \in Z^+$, find a maximum
cardinality biclique with exactly $k$ nodes on one side of the bipartition.
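The three decision problems listed above can be stated as small brute-force checks, which helps to see how they relate: EBNCD is ENCD with equal side sizes, and EECD asks for some factorization of $k$ into side sizes. The sketch below is exponential and only meant for tiny graphs; the function names and the example graph are hypothetical.

```python
from itertools import combinations

def is_biclique(U1, U2, E):
    """True if every node of U1 is adjacent to every node of U2."""
    return all((u, v) in E for u in U1 for v in U2)

def encd(V1, V2, E, a, b):
    """Is there a biclique with exactly a nodes in V1 and b nodes in V2?"""
    return any(is_biclique(U1, U2, E)
               for U1 in combinations(V1, a)
               for U2 in combinations(V2, b))

def ebncd(V1, V2, E, a):
    """Balanced case: a nodes on each side."""
    return encd(V1, V2, E, a, a)

def eecd(V1, V2, E, k):
    """Is there a biclique with exactly k edges? Try every a * b = k."""
    return any(encd(V1, V2, E, a, k // a)
               for a in range(1, len(V1) + 1) if k % a == 0)

# Hypothetical example graph.
V1, V2 = [1, 2], ["x", "y", "z"]
E = {(1, "x"), (1, "y"), (1, "z"), (2, "x"), (2, "y")}
```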
Lemma 2.1. EBNCD and ENCD are NP-complete.
Proof. It is known that the maximum balanced node cardinality biclique
problem (MBBP) is NP-complete [5]. Then, it follows that EBNCD is
NP-complete, since MBBP can be solved using a polynomial number of
instances of EBNCD. Note that EBNCD is just a special case of ENCD and
hence ENCD is also NP-complete. Note that the reductions for EBNCD
and ENCD are Turing reductions rather than Karp reductions [5].
Theorem 2.2. EECD is NP-complete.
To prove this theorem, first we define the following decision problem and
show that it is NP-complete:
Exact balanced prime node cardinality decision problem (EBPNCD):
Given a bipartite graph $G = (V_1 \cup V_2, E)$ and a prime number $p$, such
that the maximum degree in $G$ is less than $p^2$, does there exist a biclique
$C = U_1 \cup U_2$, with $|U_1| = |U_2| = p$?
Lemma 2.3. EBPNCD is NP-complete.
Proof. Given an instance of EBNCD, let $l = \max(|V_1|, |V_2|) + 1$ and
let $p$ be any prime number such that $l \le p \le 2l$. Such a prime number
is guaranteed by Bertrand’s theorem [6]. Let $a < p$ be a positive integer,
where $a$ is the specification for EBNCD. Add $p - a$ nodes on both sides
of the bipartition and connect each of these additional nodes to all the
nodes on the opposite side of the bipartition. The maximum degree of any
node in this graph is $p - a + \max(|V_1|, |V_2|) \le 3l$. Since $p^2 \ge l^2$ it follows
that $p^2 > 3l$ and the maximum degree is less than $p^2$, for $l > 3$. Then,
EBNCD has a yes (no) answer if and only if EBPNCD has a yes (no)
answer, implying that EBPNCD is NP-complete.
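The padding construction in this proof can be sketched in a few lines. The code below is an illustration under assumptions: the labels `("new1", t)` and `("new2", t)` for the added nodes are hypothetical and are assumed not to clash with existing node names, and the prime is found by trial division, which is adequate only for small inputs.

```python
def is_prime(m):
    """Trial-division primality test, sufficient for small m."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

def pad_to_prime_instance(V1, V2, E, a):
    """Build the EBPNCD instance (W1, W2, F, p) from an EBNCD instance."""
    l = max(len(V1), len(V2)) + 1
    # Bertrand's theorem guarantees a prime in the range l..2l.
    p = next(q for q in range(l, 2 * l + 1) if is_prime(q))
    extra1 = [("new1", t) for t in range(p - a)]
    extra2 = [("new2", t) for t in range(p - a)]
    W1, W2 = list(V1) + extra1, list(V2) + extra2
    F = set(E)
    F |= {(u, v) for u in extra1 for v in W2}  # new left nodes see all of W2
    F |= {(u, v) for u in W1 for v in extra2}  # new right nodes see all of W1
    return W1, W2, F, p

def max_degree(W1, W2, F):
    left = [sum(1 for (u, _) in F if u == w) for w in W1]
    right = [sum(1 for (_, v) in F if v == w) for w in W2]
    return max(left + right)

# Hypothetical EBNCD instance with a = 2; here l = 4, so the bound l > 3 holds.
W1, W2, F, P = pad_to_prime_instance([1, 2], ["x", "y", "z"], {(1, "x"), (2, "y")}, 2)
```

An $a \times a$ biclique in the original graph together with the $p - a$ added nodes on each side forms a $p \times p$ biclique in the padded graph, which is the equivalence the proof relies on.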
Now we prove Theorem 2.2.
Proof of Theorem 2.2. Consider a bipartite graph $G = (V_1 \cup V_2, E)$ and
an instance of EBPNCD. Because $p$ is prime, $p^2$ factors only as $1 \cdot p^2$ or
$p \cdot p$, so a biclique of edge cardinality $p^2$ in $G$ is
possible only in two ways: (1) one node on one side of the bipartition and
$p^2$ nodes on the other side and (2) exactly $p$ nodes on both sides of the
bipartition. Since the maximum degree in $G$ is strictly less than $p^2$, the first
case is not possible. Thus, EBPNCD has a yes (no) answer if and only if an
instance of EECD with $k = p^2$ has a yes (no) answer. Then it follows that
EECD is NP-complete since EBPNCD is.
Our next result is about the complexity of the optimization problem
MOFCP:
Theorem 2.4. MOFCP is NP-complete.
To prove Theorem 2.4, we define the following decision problem:
Maximum fixed intersection problem (MFIP): Given $k \in Z^+$, a
ground set $V$, and a set system $\mathcal{S} = \{S_1, \ldots, S_n\}$, where the $S_i$’s are subsets
of $V$, find $k$ subsets from $\mathcal{S}$ such that their intersection has maximum
cardinality.
Lemma 2.5. MFIP is NP-hard.
Proof. It is well known that the decision problem CLIQUE, “Given a
graph $G = (V, E)$ and a positive integer $k$, does there exist a clique of size
$k$ in $G$?,” is NP-complete [5]. Given an instance of CLIQUE, construct the
following set system on the ground set $V$. For each edge $e = (u, v) \in E$,
construct one set $S_e = V \setminus \{u, v\}$. Let $\mathcal{S} = \{S_e : e \in E\}$. There exists a clique
on $k$ nodes in $G$ if and only if there exist $p = k(k-1)/2$ subsets in $\mathcal{S}$ whose
intersection has cardinality at least $|V| - k$. Thus, there exists a clique of
size $k$ in $G$ if and only if the cardinality of the maximum intersection in the
optimal solution to MFIP of $p$ sets is $|V| - k$.
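The reduction in this proof can be checked on small graphs. The sketch below builds one set per edge (the vertex set minus the two endpoints) and answers the clique question by brute force over all choices of $k(k-1)/2$ sets. Function names are hypothetical; the code assumes $k \ge 2$, no self-loops, and no repeated edges.

```python
from itertools import combinations

def edge_complement_sets(V, E):
    """One set per edge e = (u, v): all vertices except u and v."""
    return [set(V) - {u, v} for (u, v) in E]

def max_intersection_size(family, count):
    """Largest intersection over all choices of `count` sets (brute force)."""
    return max(len(set.intersection(*chosen))
               for chosen in combinations(family, count))

def has_clique(V, E, k):
    """Decide CLIQUE (k >= 2) through the maximum-intersection question."""
    p = k * (k - 1) // 2
    family = edge_complement_sets(V, E)
    if p > len(family):
        return False  # fewer than k(k-1)/2 edges, so no k-clique
    return max_intersection_size(family, p) == len(V) - k

V = [1, 2, 3, 4]
TRIANGLE_PLUS_TAIL = [(1, 2), (2, 3), (1, 3), (3, 4)]
PATH = [(1, 2), (2, 3), (3, 4)]
```

The check works because $k(k-1)/2$ distinct edges touch at least $k$ vertices, and exactly $k$ only when they form a clique.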
Proof of Theorem 2.4. Consider an instance of MFIP. Construct a bipartite
graph $G = (V_1 \cup V_2, E)$ as follows: For each set $S_i$ in $\mathcal{S}$, create a node
$i$ in $V_1$; for each element $j$ of the base set $V$, create a node $j$ in $V_2$. For
every element $j \in S_i$, include an edge $e = (i, j)$. Note that the maximum
edge cardinality biclique with exactly k nodes in V1 solves MFIP.
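A minimal sketch of this construction follows, assuming the set system is given as a dictionary from set names to sets of elements. The names `build_bipartite` and `best_k_sided_biclique` and the toy sets are hypothetical, and the search is a brute-force enumeration over all choices of $k$ nodes in $V_1$.

```python
from itertools import combinations

def build_bipartite(sets, ground):
    """One V1 node per set, one V2 node per element, edge (i, j) iff j is in S_i."""
    V1 = sorted(sets)
    V2 = sorted(ground)
    E = {(i, j) for i in V1 for j in sets[i]}
    return V1, V2, E

def best_k_sided_biclique(V1, V2, E, k):
    """Biclique with exactly k nodes in V1 and as many V2 nodes as possible."""
    best_U1, best_U2 = (), ()
    for U1 in combinations(V1, k):
        # The common neighbourhood of U1 is the intersection of the chosen sets.
        U2 = tuple(j for j in V2 if all((i, j) in E for i in U1))
        if len(U2) > len(best_U2):
            best_U1, best_U2 = U1, U2
    return best_U1, best_U2

# Hypothetical MFIP instance with k = 2.
SETS = {"S1": {1, 2, 3}, "S2": {2, 3, 4}, "S3": {4, 5}}
GROUND = {1, 2, 3, 4, 5}
V1, V2, E = build_bipartite(SETS, GROUND)
CHOSEN, COMMON = best_k_sided_biclique(V1, V2, E, 2)
```

With $k$ fixed, the number of edges of the biclique is $k$ times the size of the common neighbourhood, so maximizing edges and maximizing the intersection coincide. If every choice has an empty intersection, the sketch returns two empty tuples.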
subset $A \subseteq V_1$ or $Q \subseteq V_2$ of size $a$. Hence, the number of $a \times a$ subgraphs
is $\binom{n}{a}\binom{n}{a}$ and the expected number of $a \times a$ bicliques is
\[
E[Z_a] = \binom{n}{a}\binom{n}{a}\, p^{a^2}.
\]
Note that for $a \ge 2 \log n / \log(1/p)$, $p^{a^2} = (p^a)^a \le (p^{\log_p n^{-2}})^a = n^{-2a}$ and
\[
\mathrm{Prob}(Z_a \ge 1) \to 0 \quad \text{as } n \to \infty. \tag{3}
\]
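The first-moment bound can be evaluated numerically. The sketch below computes the expected number of $a \times a$ bicliques, $\binom{n}{a}^2 p^{a^2}$, and the threshold $2 \log n / \log(1/p)$ rounded up; by Markov's inequality the probability of at least one such biclique is at most this expectation. The function names and the sample values $n = 100$, $p = 0.5$ are illustrative assumptions.

```python
from math import ceil, comb, log

def expected_bicliques(n, a, p):
    """Expected number of a x a bicliques when each edge appears with probability p."""
    return comb(n, a) ** 2 * p ** (a * a)

def first_moment_threshold(n, p):
    """Smallest integer a with a >= 2 log n / log(1/p)."""
    return ceil(2 * log(n) / log(1 / p))

A_BIG = first_moment_threshold(100, 0.5)
E_BIG = expected_bicliques(100, A_BIG, 0.5)   # far below 1
E_SMALL = expected_bicliques(100, 3, 0.5)     # far above 1
```

At the threshold the expectation is already tiny, while for small $a$ it is very large, which is why a second-moment argument is needed to pin down the size of the largest balanced biclique.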
Now, we need to show that there is a balanced biclique of size $a(n) \times a(n)$
in $B$ with high probability; i.e., $\mathrm{Prob}(Z_a = 0)$, $a = a(n)$, is very small. From
the second moment method [2], we have $\mathrm{Prob}(Z_a = 0) \le \mathrm{Var}(Z_a) / E[Z_a]^2$.
Let $X_{A,Q}$ be an indicator variable which assumes value 1, if the nodes in
$A \subseteq V_1$ and $Q \subseteq V_2$ form a biclique, and zero otherwise:
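\[
\begin{aligned}
E[Z_a^2] &= \sum_{A,Q} \sum_{A',Q'} \mathrm{Prob}(X_{A,Q} = 1,\ X_{A',Q'} = 1) \\
&= \sum_{A,Q} \sum_{A',Q'} \mathrm{Prob}(X_{A',Q'} = 1 \mid X_{A,Q} = 1)\, \mathrm{Prob}(X_{A,Q} = 1).
\end{aligned}
\]
Since all the $(A, Q)$ look alike, fix $(A, Q)$ as $(\bar{A}, \bar{Q})$:
\[
\begin{aligned}
E[Z_a^2] &= \sum_{A,Q} \sum_{A',Q'} \mathrm{Prob}(X_{A',Q'} = 1 \mid X_{\bar{A},\bar{Q}} = 1)\, \mathrm{Prob}(X_{A,Q} = 1) \\
&= \sum_{A,Q} \mathrm{Prob}(X_{A,Q} = 1) \sum_{A',Q'} \mathrm{Prob}(X_{A',Q'} = 1 \mid X_{\bar{A},\bar{Q}} = 1) \\
&= \sum_{A,Q} \mathrm{Prob}(X_{A,Q} = 1) \sum_{i=0}^{a} \sum_{j=0}^{a} \sum_{\substack{|A' \cap \bar{A}| = i \\ |Q' \cap \bar{Q}| = j}} \mathrm{Prob}(X_{A',Q'} = 1 \mid X_{\bar{A},\bar{Q}} = 1) \\
&= \sum_{A,Q} \mathrm{Prob}(X_{A,Q} = 1) \sum_{i=0}^{a} \sum_{j=0}^{a} \binom{a}{i}\binom{n-a}{a-i}\binom{a}{j}\binom{n-a}{a-j}\, p^{a^2 - ij}.
\end{aligned}
\]
Letting $\sum_{A',Q'} \mathrm{Prob}(X_{A',Q'} = 1 \mid X_{\bar{A},\bar{Q}} = 1) = \Delta$, we get
\[
\mathrm{Prob}(Z_a = 0) \le \frac{\mathrm{Var}(Z_a)}{E[Z_a]^2} = \frac{\Delta}{E[Z_a]} - 1.
\]
We can write
\[
\frac{\Delta}{E[Z_a]} = \sum_{i=0}^{a} \sum_{j=0}^{a} T_{ij},
\]
where
\[
T_{ij} = \frac{\binom{a}{i}\binom{n-a}{a-i}\binom{a}{j}\binom{n-a}{a-j}}{\binom{n}{a}\binom{n}{a}}\, p^{-ij}.
\]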
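The terms $T_{ij}$ and their double sum are easy to evaluate numerically, which gives a quick sanity check on the formula. The sketch below is an illustration with hypothetical function names. Setting $p = 1$ removes the factor $p^{-ij}$, and the double sum then equals 1 by Vandermonde's identity; for $p < 1$ every term can only grow, so the sum is at least 1, consistent with a nonnegative variance.

```python
from math import comb

def T(i, j, n, a, p):
    """Term T_ij: overlap counts divided by C(n, a)^2, times p^(-ij)."""
    ways = comb(a, i) * comb(n - a, a - i) * comb(a, j) * comb(n - a, a - j)
    return ways / comb(n, a) ** 2 * p ** (-i * j)

def second_moment_ratio(n, a, p):
    """Double sum of T_ij over 0 <= i, j <= a."""
    return sum(T(i, j, n, a, p) for i in range(a + 1) for j in range(a + 1))

RATIO_P1 = second_moment_ratio(20, 3, 1.0)
RATIO_HALF = second_moment_ratio(20, 3, 0.5)
```

Subtracting 1 from `second_moment_ratio(n, a, p)` gives the upper bound on the probability that no $a \times a$ biclique exists, for the given $n$, $a$, and $p$.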
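\[
T_{10} = \frac{a \binom{n-a}{a-1}\binom{n-a}{a}}{\binom{n}{a}\binom{n}{a}} = \frac{a^2}{n - 2a + 1}\, T_{00}.
\]
The second equality in $T_{10}$ follows, since $\binom{n-a}{a-1} = \frac{a}{n-2a+1}\binom{n-a}{a}$. Similarly,
\[
T_{01} = \frac{a^2}{n - 2a + 1}\, T_{00}.
\]
Adding up the first three terms, we obtain
\[
\begin{aligned}
T_{00} + T_{01} + T_{10} &= T_{00}\left(1 + \frac{2a^2}{n - 2a + 1}\right) \\
&= \left(1 - \frac{a^2}{n} + o(n^{-3/2})\right)^{2} \left(1 + \frac{2a^2}{n - 2a + 1}\right) \\
&= 1 + o(n^{-3/2}) \quad \text{for } a = \frac{\log n}{\log(1/p)}.
\end{aligned}
\]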
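Now, we want to show that the remaining part of the summation is also
small. To be able to do that, first we will bound the terms $T_{ij}$, $i, j \ge 1$, in
terms of $T_{11}$:
\[
\frac{T_{ij}}{T_{11}} = \frac{\binom{a}{i}\binom{n-a}{a-i}}{\binom{a}{1}\binom{n-a}{a-1}} \cdot \frac{\binom{a}{j}\binom{n-a}{a-j}}{\binom{a}{1}\binom{n-a}{a-1}}\, p^{-ij+1}.
\]
Since
\[
\frac{\binom{a}{i}\binom{n-a}{a-i}}{\binom{a}{1}\binom{n-a}{a-1}} = \frac{(n - 2a + 1)!\, ((a - 1)!)^2}{(n - 2a + i)!\; i!\; ((a - i)!)^2},
\]
we obtain
\[
\frac{T_{ij}}{T_{11}} \le \left(\frac{a^2}{n - 2a}\right)^{i-1} \left(\frac{a^2}{n - 2a}\right)^{j-1} p^{-ij+1}.
\]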
First, note that $T_{12}/T_{11} = (a - 1)^2 / (2(n - 2a + 2)p) \le 1$, for sufficiently large
$n$. Similarly, $T_{21}/T_{11} \le 1$, for sufficiently large $n$.
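For $i \ge 2$,
\[
-(i - 1)j \le \frac{-ij + 1}{2}.
\]
Similarly, for $j \ge 2$,
\[
-(j - 1)i \le \frac{-ij + 1}{2}.
\]
Thus,
\[
\begin{aligned}
\frac{T_{ij}}{T_{11}} &\le \left(\frac{a^2}{n - 2a}\right)^{i-1} p^{\frac{-ij+1}{2}} \left(\frac{a^2}{n - 2a}\right)^{j-1} p^{\frac{-ij+1}{2}} \\
&\le \left(\frac{a^2}{n - 2a}\, p^{-j}\right)^{i-1} \left(\frac{a^2}{n - 2a}\, p^{-i}\right)^{j-1}.
\end{aligned}
\]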
For the choice of $a = a^*(n) = (1 - \epsilon) \log n / \log(1/p)$, we get $T_{ij}/T_{11} \le 1$ for
sufficiently large $n$.
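Noting that
\[
T_{11} = \frac{a^4}{(n - 2a + 1)^2} \cdot \frac{\binom{n-a}{a}^2}{\binom{n}{a}^2\, p}
\]
and $\sum_{i=1}^{a} \sum_{j=1}^{a} T_{ij} \le \sum_{i=1}^{a} \sum_{j=1}^{a} T_{11} \to 0$ as $n \to \infty$ for $a = a^*(n)$, we get
\[
\frac{\Delta}{E[Z_a]} = 1 + o(n^{-3/2}).
\]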
Hence,
Our use of the probabilistic method in Theorem 2.6 was inspired by the
work presented in [2]. The fundamentals of the method and similar results,
on random graphs, for combinatorial quantities such as the clique number
and the chromatic number are presented in [2].
3.1. Formulations
Let $G^{i,i+1} = G(V_i, V_{i+1}, E^{i,i+1})$ be the bipartite graph induced by node
sets $V_i$ and $V_{i+1}$. Define the variable $x_e^{i,i+1}$ to be 1 if edge $e$ of $E^{i,i+1}$
is not in the multipartite clique; 0, otherwise. For an edge $e$ in $E^{i,i+1}$, let
$A_e$ be the edges in $E^{i+1,i+2}$ which are adjacent to $e$ and let $B_e$ be
the edges in $E^{i-1,i}$ which are adjacent to $e$. $w_e^{i,i+1}$ is the weight of edge
$e$ in $E^{i,i+1}$. We assume that there are $n$ levels in the graph and they are
numbered $1, \ldots, n$ where 1 represents the first level.

[Figure: (a) A multi-partite clique (MPC) spanning all levels of the graph. (b) Difference between MPCP, MPCF, and MPCS.]
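\[
W^* = \min \sum_{i=1}^{n-1} \sum_{k \in E^{i,i+1}} w_k^{i,i+1} x_k^{i,i+1}
\]
subject to
\[
\begin{aligned}
& x_k^{i,i+1} + x_l^{i,i+1} \ge 1 && \text{if edges } k \text{ and } l \text{ in } E^{i,i+1} \text{ cannot be in the same biclique, } \forall \text{ pair } k, l \in E^{i,i+1}, \\
& \sum_{e \in A_p} x_e^{i,i+1} \le |A_p| + x_p^{i-1,i} - 1 && \forall\, p \in E^{i-1,i},\ 2 \le i \le n - 1, \\
& \sum_{e \in B_p} x_e^{i-1,i} \le |B_p| + x_p^{i,i+1} - 1 && \forall\, p \in E^{i,i+1},\ 2 \le i \le n - 1, \\
& \sum_{e \in E^{1,2}} x_e^{1,2} \le |E^{1,2}| - 1, \\
& x_e^{i,i+1} \in \{0, 1\} && \forall\, e \in E^{i,i+1},\ 1 \le i \le n - 1.
\end{aligned}
\]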
For each bipartite subgraph $G^{i,i+1}$, due to the first set of constraints, the
variables $x_e^{i,i+1}$ having value 0 form a biclique. The second set of constraints
“links” these bicliques together. That is, if variable $x_e^{i-1,i}$ is 0 (i.e.,
edge $e$ is in the biclique of $G^{i-1,i}$), then at least one edge adjacent to $e$
in $G^{i,i+1}$ should be in the MPC. Note that the second set of constraints
is required only for levels 2 through $n - 1$. Similarly, the third constraint
makes sure that if a variable $x_e^{i,i+1}$ is 0 then at least one edge adjacent to
$e$ in $G^{i-1,i}$ should be in the MPC. The fourth constraint makes sure that at
least one edge from the first level is included in the MPC.
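The four constraint sets can be checked mechanically for a given 0–1 assignment. The sketch below is one possible reading, not the authors' code, and rests on two labeled assumptions: two edges $(u_1, v_1)$ and $(u_2, v_2)$ of the same layer "can be in the same biclique" exactly when $(u_1, v_2)$ and $(u_2, v_1)$ are also edges, and an edge of the next (previous) layer is "adjacent" to $p$ when it shares the node of $p$ on the common level. Layer `i` in the code (0-based) holds the edges between consecutive levels; `x[i][e] = 1` means edge `e` is not selected.

```python
def compatible(k, l, E):
    """Assumption: k and l fit in one biclique iff the two cross edges exist."""
    (u1, v1), (u2, v2) = k, l
    return (u1, v2) in E and (u2, v1) in E

def satisfies_mpcp(layers, x):
    """Check the four MPCP constraint sets for a 0-1 assignment x."""
    for i, E in enumerate(layers):
        for k in E:                       # first set: pairwise conflicts
            for l in E:
                if not compatible(k, l, E) and x[i][k] + x[i][l] < 1:
                    return False
    for i in range(1, len(layers)):
        prev, cur = layers[i - 1], layers[i]
        for p in prev:                    # second set: link forward
            A = [e for e in cur if e[0] == p[1]]
            if sum(x[i][e] for e in A) > len(A) + x[i - 1][p] - 1:
                return False
        for p in cur:                     # third set: link backward
            B = [e for e in prev if e[1] == p[0]]
            if sum(x[i - 1][e] for e in B) > len(B) + x[i][p] - 1:
                return False
    # fourth set: at least one edge of the first layer is selected
    return sum(x[0].values()) <= len(layers[0]) - 1

# Hypothetical 3-level graph: level 1 = {a}, level 2 = {b, c}, level 3 = {d}.
LAYERS = [{("a", "b"), ("a", "c")}, {("b", "d"), ("c", "d")}]
ALL_SELECTED = [{e: 0 for e in LAYERS[0]}, {e: 0 for e in LAYERS[1]}]
DANGLING = [{("a", "b"): 0, ("a", "c"): 1}, {e: 1 for e in LAYERS[1]}]
NOTHING = [{e: 1 for e in LAYERS[0]}, {e: 1 for e in LAYERS[1]}]
```

Selecting every edge yields a feasible multipartite clique; selecting only edge `("a", "b")` violates the linking constraint because no adjacent edge in the next layer is selected; selecting nothing violates the fourth constraint.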
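\[
W^* = \min \sum_{i=1}^{n-1} \sum_{k \in E^{i,i+1}} w_k^{i,i+1} x_k^{i,i+1}
\]
subject to
\[
\begin{aligned}
& x_k^{i,i+1} + x_l^{i,i+1} \ge 1 && \text{if edges } k \text{ and } l \text{ in } E^{i,i+1} \text{ cannot be in the same biclique, } \forall \text{ pair } k, l \in E^{i,i+1}, \\
& \sum_{e \in A_p} x_e^{i,i+1} \le |A_p| + x_p^{i-1,i} - z_i && \forall\, p \in E^{i-1,i},\ 2 \le i \le n - 1, \\
& \sum_{e \in B_p} x_e^{i-1,i} \le |B_p| + x_p^{i,i+1} - z_i && \forall\, p \in E^{i,i+1},\ 2 \le i \le n - 1, \\
& z_{i+1} \le z_i && 2 \le i \le n - 2, \\
& \sum_{e \in E^{i,i+1}} x_e^{i,i+1} \ge |E^{i,i+1}|\,(1 - z_i) && 2 \le i \le n - 1, \\
& \sum_{e \in E^{1,2}} x_e^{1,2} \le |E^{1,2}| - 1, && \text{(I)} \\
& x_e^{i,i+1} \in \{0, 1\} && \forall\, e \in E^{i,i+1},\ 2 \le i \le n - 1, \\
& z_i \in \{0, 1\} && 1 \le i \le n - 1.
\end{aligned}
\]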
In this case, $z_i = 0$ indicates that no edges from levels $i$ and above can
be in the MPC. Notice that if $z_i = 0$, then none of the edges in $E^{i,i+1}$
can be in the MPC due to the fifth set of constraints and $z_j = 0$ $\forall\, j =
i + 1, \ldots, n - 1$ due to the fourth set of constraints. When $z_i = 1$, the
second set of constraints “links” level $i$ with level $i + 1$ in the MPC (similar
to MPCP) and when $z_i = 0$, they are redundant.
MPCS: Multipartite Clique Problem which Includes Nodes from Some Levels
Problem MPCS can be formulated in a way similar to that of MPCF by
removing constraint (I) and using variables δi in addition to variables zi .
In the formulation for MPCF, zi = 0 indicates that no edges from levels
i and above can be in the MPC. Problem MPCS will have an additional
set of similar constraints involving variables δi where δi = 0 indicates that
no edges from levels i and below can be in the MPC. We avoid giving the
entire formulation for MPCS since the basic idea of the formulation is the
same as that of MPCF.
Since the biclique problem is a special case of MPCP, the complexity of
several optimization and decision problems regarding the MPCP follows
directly from the results for the biclique problem proved in Section 2. We
list these results in Lemma 3.1.
is a MPC (provided that all the sets in the above product are nonempty).
Consider an optimal MPC (say $M^*$) which satisfies the hypothesis. Thus, for
every level $i$ ($i = 1, 2, \ldots, n - 1$), $M^*$ has a node (say $v_i$) in the optimum
solution such that all neighbors of $v_i$ in $G^{i,i+1}$ are also in the optimum solution.
Then, it can easily be verified that $S_{v_1} = M^*$. Hence, the polynomial
time procedure which considers every node $u$ from level 1 and constructs
the set $S_u$ will find $M^*$.
The above conditions on the multipartite clique may be true in certain
real environments. Many manufacturers in the computer industry offer a
base model (a complete product) as a shell and offer several options on
the base model to define other products in the product line. They store
inventory of the shell and use it as a vanilla box while customizing other
products with options. If the supplier (or supplying plant) of at least one key
component to the shell also follows a similar strategy, then the multipartite
cliques of interest are such that they require the conditions in Theorem 3.2
to be satisfied.
As with the biclique problem, the node cardinality and node weighted
counterparts of multipartite clique problems can be considered. To the best
4. CONCLUSIONS
REFERENCES