
(1997) Normal-Boundary Intersection A New Method For Generating The Pareto Surface in Nonlinear Multicriteria Optimization Problems

SIAM J. OPTIM. © 1998 Society for Industrial and Applied Mathematics
Vol. 8, No. 3, pp. 631–657, August 1998

Downloaded 01/02/13 to 132.206.27.25. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php

NORMAL-BOUNDARY INTERSECTION: A NEW METHOD FOR
GENERATING THE PARETO SURFACE IN NONLINEAR
MULTICRITERIA OPTIMIZATION PROBLEMS∗


INDRANEEL DAS† AND J. E. DENNIS‡
Abstract. This paper proposes an alternate method for finding several Pareto optimal points for
a general nonlinear multicriteria optimization problem. Such points collectively capture the trade-off
among the various conflicting objectives. It is proved that this method is independent of the relative
scales of the functions and is successful in producing an evenly distributed set of points in the Pareto
set given an evenly distributed set of parameters, a property which the popular method of minimizing
weighted combinations of objective functions lacks. Further, this method can handle more than two
objectives while retaining the computational efficiency of continuation-type algorithms. This is an
improvement over continuation techniques for tracing the trade-off curve since continuation strategies
cannot easily be extended to handle more than two objectives.

Key words. Pareto set, multicriteria optimization, multiobjective optimization, trade-off curve

AMS subject classification. 90C29

PII. S1052623496307510

1. Introduction. A wide variety of problems arising in design optimization of
engineering systems inherently involve optimizing multiple performance criteria (see,
for example, Eschenauer, Koski, and Osyczka [3] and Statnikov and Matusov [4]). For
example, a typical bridge-construction design might involve simultaneously minimiz-
ing the total mass of the structure and maximizing its stiffness. However, it is highly
improbable that these conflicting objectives would both be “extremized” by the same
design. Hence the designer makes some trade-offs among the conflicting objectives in
choosing the final design.
In mathematical notation a multicriteria optimization problem can be loosely
posed as

\[
\text{``min''}_{x \in C}\; F(x) =
\begin{bmatrix} f_1(x) \\ f_2(x) \\ \vdots \\ f_n(x) \end{bmatrix},
\quad n \ge 2, \qquad \text{(MOP)}
\]

where

\[
C = \{x : h(x) = 0,\ g(x) \le 0,\ a \le x \le b\},
\]

$F : \mathbb{R}^N \mapsto \mathbb{R}^n$, $h : \mathbb{R}^N \mapsto \mathbb{R}^{n_e}$, and $g : \mathbb{R}^N \mapsto \mathbb{R}^{n_i}$ are twice continuously differentiable
mappings, and $a \in (\mathbb{R} \cup \{-\infty\})^N$, $b \in (\mathbb{R} \cup \{\infty\})^N$, $N$ being the number of variables,
$n$ the number of objectives, and $n_e$ and $n_i$ the number of equality and inequality
constraints.

∗ Received by the editors July 22, 1996; accepted for publication (in revised form) April 1, 1997;
published electronically June 3, 1998. This research was partially supported by Department of
Energy, DOE grant DE-FG03-95ER25257, Air Force grant F49620-95-1-0210, and by the National
Aeronautics and Space Administration contract NAS1-19480 while the first author was in residence
at the Institute for Computer Applications in Science and Engineering (ICASE), NASA Langley
Research Center, Hampton, VA 23681-0001.
http://www.siam.org/journals/siopt/8-3/30751.html
† Department of Computational & Applied Mathematics, Rice University, Houston, TX 77251-1892.
Current address: IBM, Hopewell Junction, NY 12533 ([email protected]).
‡ Noah Harding Professor of Computational & Applied Mathematics, Rice University, Houston,
TX 77251-1892 ([email protected].).
Since no single x∗ would generally minimize every fi simultaneously, a concept of
optimality which is useful in the multiobjective framework is that of Pareto optimality,
as explained below.
Definition. The vector F (x̂) is said to dominate another vector F (x̄), de-
noted F (x̂) ≺ F (x̄), if and only if fi (x̂) ≤ fi (x̄) for all i ∈ {1, 2, . . . , n} and
fj (x̂) < fj (x̄) for some j ∈ {1, 2, . . . , n}. A point x∗ ∈ C is said to be globally
Pareto optimal or a globally efficient point for (MOP) if and only if there does not
exist x ∈ C satisfying F (x) ≺ F (x∗ ). F (x∗ ) is then called globally nondominated or
noninferior.
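The dominance relation above can be checked mechanically on a finite set of objective vectors. The following Python sketch is only an illustration: the function names `dominates` and `nondominated` and the sample data are assumptions, and filtering a finite sample says nothing about Pareto optimality over the whole feasible set C.

```python
from typing import List, Sequence


def dominates(fa: Sequence[float], fb: Sequence[float]) -> bool:
    """True if fa dominates fb: no component worse, at least one strictly better."""
    no_worse = all(a <= b for a, b in zip(fa, fb))
    strictly_better = any(a < b for a, b in zip(fa, fb))
    return no_worse and strictly_better


def nondominated(points: List[Sequence[float]]) -> List[Sequence[float]]:
    """Keep the vectors that no other vector in the finite list dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical sample of objective vectors (f1, f2), both to be minimized.
samples = [(1.0, 4.0), (2.0, 2.0), (3.0, 3.0), (4.0, 1.0), (2.0, 5.0)]
front = nondominated(samples)
```

Here (3, 3) is dominated by (2, 2) and (2, 5) by (1, 4), so three vectors remain in `front`.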
Computational methods for general nonlinear multicriteria optimization, includ-
ing the one described in this paper, can at best guarantee local Pareto optimality of
the obtained solution. The definition of local Pareto optimality is very similar to its
global counterpart: a point x∗ ∈ C is said to be locally Pareto optimal or a locally
efficient point for (MOP) if and only if there exists an open neighborhood of x∗, B(x∗),
such that there does not exist x ∈ B(x∗) ∩ C satisfying F (x) ≺ F (x∗ ).
Pareto optimality will henceforth refer to local Pareto optimality unless qualified
explicitly.
The shadow minimum or utopia point, F ∗ , is defined as the vector containing the
individual global minima, fi∗ , of the objectives, i.e.,

\[
F^* = \begin{bmatrix} f_1^* \\ f_2^* \\ \vdots \\ f_n^* \end{bmatrix}.
\]

We assume here and henceforth the existence of a minimizer for each of our
objectives. The shadow minimum could thus be attained only in the rare case when
a single x minimizes all the objective functions. However, in practical situations, the
best we can hope for is to get close to the shadow minimum and assure that there is
an agreeable trade-off among the multiple objectives.
Very often in engineering applications the desired result helpful in facilitating
design is a whole collection of Pareto optimal points, representative of the entire
spectrum of efficient solutions. Thus, ideally, the desired solution is the entire Pareto
optimal set, which can be obtained for some small problems that allow themselves
to be treated parametrically, resulting in closed-form expressions for the Pareto set
(see Lin [5]). More recently, attempts have been made to approximate the entire
curve of Pareto optimal solutions in biobjective problems using techniques that trace
the curve of parametrized optima (see Rakowska, Haftka, and Watson [6]; Rao and
Papalambros [7]; Lundberg and Poore [8]).
Another alternative acceptable in most applications is a discrete set of Pareto
optimal points obtained by combining the multiple objectives into a single objective
function and minimizing the single objective over various values of the parameters
used to combine the objectives. For example, it is possible to generate a set of Pareto
optimal points by minimizing a convex combination of the objectives, $w^T F(x)$, over
$x \in C$, where $w \ge 0$ (componentwise) and $\sum_{i=1}^{n} w_i = 1$, and by performing the
minimization for different choices of w (see, among many others, Koski [14]). In this
article, we propose a new method for generating Pareto optimal points which is at
least as efficient as these methods and, unlike the techniques for tracing the curve of
Pareto optimal solutions, can be applied to problems with more than two objectives.

2. Preliminaries. First let us introduce some terminology.


Convex hull of individual minima (CHIM). Let $x_i^*$ be the respective global
minimizers of $f_i(x)$, $i = 1, \ldots, n$, over $x \in C$. Let $F_i^* = F(x_i^*)$, $i = 1, \ldots, n$. Let $\Phi$
be the $n \times n$ matrix whose ith column is $F_i^* - F^*$, sometimes known as the pay-off
matrix. Then the set of points in $\mathbb{R}^n$ that are convex combinations of $F_i^* - F^*$, i.e.,
$\{\Phi\beta : \beta \in \mathbb{R}^n,\ \sum_{i=1}^{n} \beta_i = 1,\ \beta_i \ge 0\}$, is referred to as the CHIM.
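To make the definition concrete, the sketch below builds the shadow minimum and the pay-off matrix Φ from the objective vectors at the individual minimizers, and evaluates one CHIM point Φβ. The helper names `payoff_matrix` and `chim_point` and the numerical data are assumptions for illustration; the sketch also assumes that each supplied minimizer really attains the minimum of its own objective.

```python
from typing import List, Sequence, Tuple


def payoff_matrix(
    minima: Sequence[Sequence[float]],
) -> Tuple[List[float], List[List[float]]]:
    """Return (F_star, Phi), where minima[i] is the vector F(x_i*).

    Column i of Phi is F(x_i*) - F_star, so Phi[j][i] = f_j(x_i*) - f_j*.
    """
    n = len(minima)
    f_star = [minima[i][i] for i in range(n)]  # f_i* = f_i(x_i*)
    phi = [[minima[i][j] - f_star[j] for i in range(n)] for j in range(n)]
    return f_star, phi


def chim_point(
    phi: Sequence[Sequence[float]], beta: Sequence[float], tol: float = 1e-12
) -> List[float]:
    """Return Phi @ beta for barycentric coordinates beta (beta >= 0, sum = 1)."""
    if min(beta) < 0.0 or abs(sum(beta) - 1.0) > tol:
        raise ValueError("beta must be nonnegative and sum to 1")
    return [sum(p * b for p, b in zip(row, beta)) for row in phi]


# Hypothetical biobjective data: F(x_1*) = (1, 5) and F(x_2*) = (4, 2).
minima = [(1.0, 5.0), (4.0, 2.0)]
f_star, phi = payoff_matrix(minima)
midpoint = chim_point(phi, [0.5, 0.5])
```

With these data the diagonal of Φ is zero and the CHIM is the segment joining (0, 3) and (3, 0) in the shifted objective space.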
The set of attainable objective vectors $\{F(x) : x \in C\}$ is denoted by $\mathcal{F}$, so
$F : C \mapsto \mathcal{F}$; i.e., C is mapped by F onto $\mathcal{F}$. The space $\mathbb{R}^n$ which contains $\mathcal{F}$ is
usually referred to as the objective space. The map of C under F in the objective
space is often called the multiloss map (biloss map, if n = 2). We shall denote the
boundary of $\mathcal{F}$ by $\partial\mathcal{F}$. The set of all Pareto optimal points is usually denoted by $\mathcal{P}$.
The complete curve/surface of Pareto minima (continuous or not) is often referred to
as the trade-off function (see p. 9, Haimes, Hall, and Freedman [15]).
CHIM+: Let CHIM∞ be the affine subspace of lowest dimension that contains
the CHIM, i.e., the set $\{\Phi\beta : \beta \in \mathbb{R}^n,\ \sum_{i=1}^{n} \beta_i = 1\}$. Then CHIM+ is defined as
the convex hull of the points in the intersection of $\mathcal{F}$ and CHIM∞. More informally,
consider extending (or withdrawing) the boundary of the CHIM simplex to touch
$\partial\mathcal{F}$; the “extension” of CHIM thus obtained is defined as CHIM+.
Henceforth, it shall be assumed that the objective functions have been defined
with the shadow minimum shifted to the origin, so that all the objective functions are
nonnegative, i.e., F (x) is redefined as

F (x) ← F (x) − F ∗ .

We observe that in Fig. 1, which shows the set F in the objective space, the point A is
F1∗ , B is F2∗ , O is the shadow minimum (and the origin), the broken line segment AB
is the CHIM , while the “arc” ACB is the set of all Pareto minima in the objective
space, alternately, the trade-off curve. In this (and any) problem with n = 2 (i.e.,
biobjective), CHIM = CHIM+ . For n > 2 CHIM may not equal CHIM+ as in
the case shown in Fig. 3.
3. Central idea. Normal-boundary intersection (NBI) is a technique intended
to find the portion of ∂F which contains the Pareto optimal points. In order to facil-
itate the introduction of the preliminary idea behind NBI the discussion will assume
that the vector of global minima of the objectives, F ∗ , is available. In section 4.2 it
will be argued how not having global minima usually renders very little injury to the
technique.
The algebraic idea behind our approach will be motivated by means of a simple
and obvious idea: the intersection point between the boundary ∂F and the normal
pointing toward the origin emanating from any point in the CHIM is a point on
the portion of ∂F containing the efficient points. This point is also a Pareto optimal
point unless it happens to lie in a “sufficiently concave” part of the boundary as
shown in Fig. 2. It certainly is a Pareto optimal point when the trade-off surface in
the objective space is convex, which happens in almost every application found in
the literature. If the trade-off surface is not convex, points in the concave part will
still be obtained using NBI. If these points in the concave part are Pareto optimal
this particular trait can be thought of as a merit of NBI over minimizing convex
combinations of objectives which fails to obtain points in the nonconvex parts of the
Pareto set (see Das and Dennis [2]). If they are not Pareto optimal this might be
characterized as a disadvantage. Nevertheless these points are useful even though they
are not Pareto optimal, since they help in constructing a smoother approximation of
the Pareto boundary.
It should be noted that the goal attainment method described in Gembicki [9],
or a very similar method in Schy and Giesy [10], [11], [13] and Schy, Giesy, and
Johnson [12] can also be interpreted in terms of the geometrical idea described above.

Fig. 1. A typical biloss map (axes f1(x) and f2(x)).

Fig. 2. Boundary point obtained by NBI is not Pareto optimal.

Now let us illustrate algebraically how any such boundary point can be found by
solving an optimization problem. Given barycentric coordinates β, Φβ represents a
point in the CHIM. Let n̂ denote the unit normal to the CHIM simplex pointing
toward the origin; then $\Phi\beta + t\hat{n}$, $t \in \mathbb{R}$, represents the set of points on that normal.
The point of intersection of the normal and the boundary of $\mathcal{F}$ closest to the origin
is the global solution of the following subproblem:

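\[
\begin{aligned}
\max_{x,\,t}\quad & t \\
\text{s.t.}\quad & \Phi\beta + t\hat{n} = F(x), \qquad (1) \\
& h(x) = 0, \\
& g(x) \le 0, \\
& a \le x \le b.
\end{aligned}
\qquad (\mathrm{NBI}_\beta)
\]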

The vector constraint $\Phi\beta + t\hat{n} = F(x)$ ensures that the point x is actually mapped
by F to a point on the normal, while the remaining constraints ensure feasibility of x
with respect to the original problem (MOP). Observe that if the origin is not shifted
to F∗ the first set of constraints should read $\Phi\beta + t\hat{n} = F(x) - F^*$.
The subproblem above shall be referred to as the NBI subproblem and be written
as $\mathrm{NBI}_\beta$ since β is the characterizing parameter of the subproblem. Solutions of these
subproblems will be referred to as NBI points. The idea is to solve $\mathrm{NBI}_\beta$ for various
β and find several points on the boundary of $\mathcal{F}$, effectively constructing a pointwise
approximation of the efficient frontier.
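A small worked instance may help fix ideas. The Python sketch below solves the NBI subproblem for a hypothetical biobjective problem, F(x) = (x², (x − 1)²) with 0 ≤ x ≤ 1, which is not taken from the paper. Because x is a scalar, the two equality constraints in (1) determine (x, t) uniquely, so the feasible set of the subproblem is a single point and a bisection search replaces the nonlinear programming solver that a general problem would require. The function name `solve_nbi_toy` is an assumption.

```python
import math


def f1(x: float) -> float:
    return x * x


def f2(x: float) -> float:
    return (x - 1.0) ** 2


def solve_nbi_toy(beta1: float, tol: float = 1e-12):
    """Solve NBI_beta for F(x) = (x^2, (x - 1)^2) on 0 <= x <= 1.

    Returns (x, t, F(x)). The toy problem and the bisection search are
    assumptions of this sketch, not the authors' implementation.
    """
    beta2 = 1.0 - beta1
    # x_1* = 0 and x_2* = 1 give F* = (0, 0) and Phi = [[0, 1], [1, 0]],
    # so the CHIM point is Phi @ beta = (beta2, beta1).
    target = (beta2, beta1)
    s = 1.0 / math.sqrt(2.0)
    n_hat = (-s, -s)  # unit normal to the CHIM, pointing toward the origin

    def off_line(x: float) -> float:
        # 2-D cross product of (F(x) - target) with n_hat; it vanishes
        # exactly when F(x) lies on the line target + t * n_hat.
        d1, d2 = f1(x) - target[0], f2(x) - target[1]
        return d1 * n_hat[1] - d2 * n_hat[0]

    lo, hi = 0.0, 1.0  # off_line(lo) >= 0 >= off_line(hi) for this problem
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if off_line(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    x = 0.5 * (lo + hi)
    d1, d2 = f1(x) - target[0], f2(x) - target[1]
    t = d1 * n_hat[0] + d2 * n_hat[1]  # signed distance along n_hat
    return x, t, (f1(x), f2(x))


x, t, point = solve_nbi_toy(beta1=0.5)
```

For this toy problem the solution can be verified by hand: x equals β2 and t equals √2·β1·β2, so evenly spaced β values yield evenly spaced points along the CHIM from which the normals are drawn.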
The goal attainment approach of Gembicki [9] or Schy and Giesy [10], [11], [13]
results in a similar subproblem where the equality constraints (1) in the NBI sub-
problem get replaced by inequalities (F (x) ≤ u + tv). However the work of Schy and
Giesy was mainly concerned with finding one Pareto optimal point, so the concept of
parametrizing the subproblem to generate many Pareto points was not studied. In
their work, both the normal vector (v) and the point of origin of the normal (u) are
user-defined quantities; once they are set, one Pareto point can be generated. On the
other hand, NBI chooses a particular parametrization of the point of origin of the
normal in terms of the barycentric coordinates β and keeps the normal direction n̂
fixed. This particular parametrization plays a key role in generating the even spreads
of Pareto points demonstrated later. Observe that unlike an NBI point, the solution
of a goal attainment problem is not constrained to lie on the normal.
As indicated earlier, not all NBI points are Pareto optimal points. In biobjective
problems, for every Pareto optimal point there exists a corresponding NBI subproblem
of which it is the solution. The same is true for n ≥ 3, with one difference: the
coordinates of the parameter vector β for $\mathrm{NBI}_\beta$ may not be all nonnegative. As a
simple example, suppose $\mathcal{F}$ is a sphere in $\mathbb{R}^3$ touching the coordinate axes. Then the
CHIM simplex is the triangle formed by joining the three points where the sphere
touches the axes. Quite clearly CHIM ≠ CHIM+, so that there exist points in
CHIM+ \ CHIM underneath which there are Pareto optimal points on the sphere.
However, since these points are not in CHIM, they do not satisfy βi ≥ 0 ∀i. Thus,
by solving $\mathrm{NBI}_\beta$ for $\sum_{i=1}^{n} \beta_i = 1$, βi ≥ 0 ∀i, a portion of the Pareto set might be
overlooked for problems with n > 2. However, these overlooked points are likely to
be “extremal” Pareto points lying near the periphery of the Pareto surface and are
not interesting from the trade-off standpoint, which is our primary goal. Figure 3
illustrates a similar situation. The reader interested in how these peripheral Pareto
points can be obtained can look in Das [1] for such a technique interesting at least
from a theoretical standpoint.

Fig. 3. There exist Pareto optimal points not obtainable using NBI. (Axes f1, f2, f3; marked
points F(x1*), F(x2*), F(x3*), and F*.)

4. Some details.
4.1. Structure of Φ. The ith column of Φ is described by

Φ(:, i) = F (x∗i ) − F ∗ .

Since fi (x∗i ) = fi∗ , clearly,

Φ(i, i) = 0.

Moreover, since x∗i is the minimizer of fi (x) over x∗j , j = 1, . . . , n,

Φ(j, i) ≥ 0, j ≠ i.

Thus a negative element in position (k, j) of Φ signifies that x∗k is not the global
minimizer of fk (x), and fk (x∗j ) < fk (x∗k ); i.e., x∗j improves on the current local
minimum of fk (x). This fortunate occurrence provides a better starting point x∗j for
minimizing fk (x) and hence leads to a better local minimum fk∗ , found just by
examining Φ.
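This check is easy to automate. With the columns defined as Φ(:, i) = F(x∗i) − F∗, the entry in row k and column j is fk(x∗j) − fk∗, so a negative value there means that x∗j gives a lower value of fk than the current fk∗. The sketch below scans Φ for such entries; the function name `improved_minima`, the zero-based indexing, and the sample matrix are assumptions for illustration.

```python
from typing import List, Sequence, Tuple


def improved_minima(
    phi: Sequence[Sequence[float]], tol: float = 0.0
) -> List[Tuple[int, int]]:
    """Return zero-based pairs (k, j) with Phi[k][j] < -tol.

    Each pair says that x_j* attains a lower value of f_k than the current
    f_k*, so x_j* is a promising starting point for re-minimizing f_k.
    """
    return [
        (k, j)
        for k, row in enumerate(phi)
        for j, value in enumerate(row)
        if value < -tol
    ]


# Hypothetical pay-off matrix in which x_3* improves on the stored f_1*.
phi = [
    [0.0, 2.0, -0.5],
    [1.0, 0.0, 3.0],
    [4.0, 1.5, 0.0],
]
restarts = improved_minima(phi)
```

An empty result does not certify that the stored minima are global; it only means that none of the other individual minimizers improves on them.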
4.2. Local versus global. As indicated earlier, most NBI points are guaranteed
to be only locally Pareto optimal points. However, the components of the shadow
minimum F ∗ being global minima of the objectives and the Pareto surface being
convex is a sufficient, although far from necessary, condition for the NBI points to be
globally Pareto optimal. In situations like the one shown in Fig. 4, where the relevant
part of ∂F is “folded,” the NBI point obtained may not be the one furthest out on the
boundary along that normal because the solution of the nonlinear NBI subproblem
is only guaranteed to be locally optimal. Thus the NBI point is not globally Pareto
optimal.

Fig. 4. NBI started at Q converges to P (locally Pareto optimal), whereas the corresponding
globally efficient point would have been P ∗ .

Not being able to find globally Pareto optimal points is a drawback inherent in
every method which finds a large number of efficient points of MOP. In homotopy
methods, it would involve finding the global minimum of one of the two objectives
in the very beginning. In methods which find efficient points by minimizing a single
objective, only a global minimum of the scalarized objective would correspond to a
globally efficient point.
Another important issue we had promised to deal with is the case when one
or more components of the shadow minimum F ∗ are local but not global
function minima. Such a case results in a different matrix Φ and hence different
goals Φβ for the NBI subproblems to improve on. These goals may be conservative or
ambitious depending on the orientation of the incorrect CHIM relative to the CHIM
formed using the true global minimizers. However, having the “incorrect Φβ” may
not preclude the NBI point from being a point on the efficient frontier, as in the
case of Fig. 5. Once the globally efficient point P in Fig. 5 has been found, a trivial
examination of its components reveals that the current x∗1 is not the global minimizer
of f1 and provides a starting point, viz., P, for restarting the nonlinear programming
(NLP) to obtain a better local minimum of f1 . Then NBI can be restarted with this
improved estimate of F ∗ . Some (if not all) globally Pareto optimal points will be
obtained in most problems even if NBI is not restarted. Some points that are not
Pareto optimal may be obtained if the targets Φβ are conservative as in Fig. 5. In
cases such as the one in Fig. 6, it is possible that not all globally Pareto optimal points
will be found using NBI, and no indication regarding the local optimality of the
function minima may be obtained.

Fig. 5. NBI started at Q converges to globally Pareto optimal point P even though all the
function minima in the components of F ∗ are not global minima.

Fig. 6. Local minima in the components of F ∗ might prevent NBI from obtaining Pareto points
from every part of the efficient frontier.

However, in situations like the ones in Fig. 7, owing to the fact that the individual
function minima are only local, all the NBI points obtained are only locally Pareto
optimal.
Computational experience (on more than just the problems mentioned here) shows
that in cases where the global minima of the functions are not available at the outset,
as NBI proceeds, either some component of Φ turns out to be negative or a function
value of a particular objective is found that improves on its current local minimum
value. This is not unusual given that the entire NBI procedure samples a large number
of function values in the objective space.
Fig. 7. In the first case $\mathcal{F} = \partial\mathcal{F}$. Here, having local function minima in the components of F ∗
can cause NBI to find only locally Pareto optimal points.

To conclude this discussion and provide a general abstraction, it should be mentioned
that whatever the components of F ∗ may be, NBI obtains at least the (local)
boundary points dominated by F ∗ unless F ∗ is attainable, i.e., $F^* \in \mathcal{F}$. If $F^* \in \mathcal{F}$,
Φ has a column of zeros and/or NBI obtains some (local) boundary point which
dominates F ∗ , providing reason to refine F ∗ and start NBI all over again.
4.3. Quasi-normal instead of normal direction. The idea of a family of
normals intersecting the boundary is valid even if we do not have the exact normal
direction to the CHIM simplex, but we have some quasi-normal direction n̂ which
has negative components; i.e., it points toward the origin. “Shooting” a family of
quasi-normal rays toward the boundary also gets us our desired boundary points.
In practice we choose our quasi-normal direction to be an equally weighted linear
combination of the columns of Φ, multiplied by −1 to ensure that it points toward
the origin. Explicitly,

n̂ = −Φe,

where e is the column vector of all ones.


The quasi-normal component defined as above has the property that the NBI
point found for a certain β is completely independent of the scales of the objective
functions. In other words, if $\mathrm{NBI}_\beta$ is re-solved with the objective functions rescaled
by arbitrary factors, the NBI point found remains unchanged. This fact will be proved
later.
Given that Φ has nonnegative components as discussed in the previous subsection,
it is clear that all components of Φe are nonnegative.
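The scale independence can be illustrated numerically (this is not the proof promised in the text). Rescaling objective j by a positive factor s_j multiplies row j of Φ by s_j, and with n̂ = −Φe every point Φβ + tn̂ on the quasi-normal ray is rescaled by the same factors, so the same pair (x, t) satisfies the constraint before and after rescaling. The helper names, the sample Φ, and the scale factors below are assumptions.

```python
from typing import List, Sequence


def quasi_normal(phi: Sequence[Sequence[float]]) -> List[float]:
    """Return n_hat = -Phi @ e, the negated row sums of Phi."""
    return [-sum(row) for row in phi]


def point_on_ray(
    phi: Sequence[Sequence[float]], beta: Sequence[float], t: float
) -> List[float]:
    """Return Phi @ beta + t * n_hat with the quasi-normal n_hat = -Phi @ e."""
    n_hat = quasi_normal(phi)
    return [
        sum(p * b for p, b in zip(row, beta)) + t * n_j
        for row, n_j in zip(phi, n_hat)
    ]


# Hypothetical pay-off matrix and positive scale factors for the objectives.
phi = [[0.0, 3.0], [2.0, 0.0]]
scales = [10.0, 0.5]
phi_scaled = [[s * v for v in row] for s, row in zip(scales, phi)]

beta, t = [0.3, 0.7], 0.4
p = point_on_ray(phi, beta, t)
p_scaled = point_on_ray(phi_scaled, beta, t)
# Component j of p_scaled equals scales[j] * p[j] up to rounding.
```

A true unit normal to the CHIM simplex does not share this property in general, since rescaling the axes changes which direction is perpendicular to the simplex.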
Even though a quasi-normal direction will be used in our computations, we prefer
to retain the name “NBI,” rather than change it to something like “QNBI” hoping
this misnomer will not be judged too harshly.
4.4. Further insight: NBI and goal programming. Since t is being max-
imized in the NBI subproblem and Φβ + tn̂ = F (x), x ∈ C, this maximization
subproblem attempts to find a feasible point x as far from a “target” point Φβ as
possible, with n̂ ≤ 0 (componentwise) guaranteeing nonincrease in the components of
F (x) relative to the components of Φβ if the optimal value of t is nonnegative.
This is similar to goal programming. If we take the Pareto surface to be convex
in the objective space, “equality goal programming”¹ can be thought of as NBI where
the direction n̂ is the negative of one of the canonical basis vectors ei (i.e., with 1 in
the ith position and 0 in the rest). To be precise, the subproblem $\mathrm{NBI}_\beta$ with n̂ = −ei
has the same solution as the following goal programming problem:

\[
\begin{aligned}
\min_{x}\quad & f_i(x) \\
\text{s.t.}\quad & f_j(x) = (\Phi\beta)(j), \quad j = 1, \ldots, n,\ j \ne i, \\
& x \in C,
\end{aligned}
\]

where (Φβ)(j) denotes the jth component of the vector Φβ.


Although posing the goals as equalities is nontraditional, this equality constrained
goal programming problem for obtaining a Pareto optimal point is discussed in Lin [5]
and [16].
In a future section NBI will be related to the traditional goal programming prob-
lem using Lagrange multiplier theory without assuming that the Pareto surface is
convex.
4.5. Efficiently solving the subproblems. The following simple observation
plays a key role in lowering the computational expense involved in solving the NBI
subproblems.
Consider parameter vectors β and β̄ such that β is “close to” β̄; i.e., $\|\beta - \bar{\beta}\|$
is “small” in some norm. Then it is reasonable to expect that the solution (x∗ , t∗ )
of $\mathrm{NBI}_\beta$ and the solution (x̄∗ , t̄∗ ) of $\mathrm{NBI}_{\bar{\beta}}$ are “close to each other.” Assume that
we have solved $\mathrm{NBI}_{\bar{\beta}}$ first and already have the point (x̄∗ , t̄∗ ). Then with (x̄∗ , t̄∗ ) as
the starting point for solving $\mathrm{NBI}_\beta$ , the NBI subproblem solver can be expected to
converge in relatively few iterations. It is this aspect of our algorithm that gives it
the flavor of a continuation-type method.
Since we already have the individual minima of the functions, i.e., the vertices
of the CHIM simplex, we start at x∗1 and solve a “nearby subproblem,” and then a
subproblem close to the one just solved, and so on.
Of course “ordering the subproblems” may not be obvious for problems with
more than two objective functions, but it can still be achieved, as described in the
next section.
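The warm-starting idea of this subsection amounts to a simple loop. The sketch below assumes a user-supplied callback `solve_nbi(beta, start)` that solves one NBI subproblem from a given starting point and returns its solution (x, t); both the callback and the function name `sweep` are hypothetical, and any nonlinear programming code could play the role of the callback.

```python
from typing import Callable, List, Sequence, Tuple

Solution = Tuple[Sequence[float], float]  # (x, t)
Solver = Callable[[Sequence[float], Solution], Solution]


def sweep(
    betas: Sequence[Sequence[float]], solve_nbi: Solver, start: Solution
) -> List[Solution]:
    """Solve the NBI subproblems in the given order with warm starts.

    Each subproblem is started from the solution of the one solved just
    before it, so `betas` should list neighbouring parameters consecutively.
    """
    results: List[Solution] = []
    current = start
    for beta in betas:
        current = solve_nbi(beta, current)
        results.append(current)
    return results
```

The benefit depends entirely on the ordering of `betas`: consecutive entries should be close to each other, starting next to a vertex of the CHIM whose solution is already known.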
5. Generating β and ordering the subproblems for more than two objectives.
In this section we shall describe a (data) structure which simultaneously
enables the generation of weights β and the ordering of the subproblems in a manner
amenable not only to efficient solution but also to parallelization.
5.1. Generating β. Let us assume that for an n-objective problem, δj > 0 is
the uniform spacing between two consecutive βj values (i.e., the “stepsize” on the jth
component of β) for j = 1, . . . , n − 1. For simplicity, let us also assume that $1/\delta_1$ is an
integer.
The possible values that can be assumed by β1 are

[0, δ1 , 2δ1 , . . . , 1].

¹ Referring to goal programming where the goal constraints are equalities instead of inequalities.
Given a particular value of β1 , define $m_1 = \beta_1/\delta_1$. Then the possible values of β2
corresponding to that value of β1 (i.e., β1 = m1 δ1 ; all the βi ’s must add up to 1) are

[0, δ2 , 2δ2 , . . . , k2 δ2 ],

where $k_2 = I\left[\frac{1-\beta_1}{\delta_2}\right] = I\left[\frac{1-m_1\delta_1}{\delta_2}\right]$.
Now define $m_2 = \beta_2/\delta_2$. Then the possible values of β3 corresponding to β1 = m1 δ1
and β2 = m2 δ2 are

[0, δ3 , 2δ3 , . . . , k3 δ3 ],

where $k_3 = I\left[\frac{1-\beta_1-\beta_2}{\delta_3}\right] = I\left[\frac{1-m_1\delta_1-m_2\delta_2}{\delta_3}\right]$.


Thus, corresponding to βi = mi δi , i = 1, . . . , j − 1, the possible values of βj for
j = 2, . . . , n − 1 are

[0, δj , 2δj , . . . , kj δj ],

where

\[
k_j = I\left[\frac{1 - \sum_{i=1}^{j-1} m_i \delta_i}{\delta_j}\right].
\]

Finally the last component of β is defined as

\[
\beta_n = 1 - \sum_{i=1}^{n-1} \beta_i .
\]

The entire data structure above can be thought of as a tree where the number
of children varies with the node and generation. Each generation or level represents
a component of β, and each path from the root to the leaf represents a possible β
vector. However, a tree structure is unnecessary for implementation; all that require
storage are the numbers δj . Nevertheless the tree is useful as a conceptual aid.
Of the subproblems generated by the weights in the above tree, n subproblems
(with β = ei ) have already been solved in the course of finding F ∗ . It should also
be noted that since $\delta_j/\delta_i$ is not necessarily an integer ∀i < j, the spacings between the
“last two” values of βn may not be uniform.
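The construction of this subsection can be sketched as a short recursive generator. The code below assumes that I[·] denotes the integer part, and it uses exact rational arithmetic (`fractions.Fraction`) so that the integer part is not corrupted by floating-point rounding; the function name `generate_betas` and the example stepsizes are assumptions.

```python
from fractions import Fraction
from typing import Iterator, List, Sequence


def generate_betas(deltas: Sequence[Fraction]) -> Iterator[List[Fraction]]:
    """Yield every beta for stepsizes deltas = (delta_1, ..., delta_{n-1}).

    Only the stepsizes are stored; the tree of beta values is traversed
    implicitly by the recursion, one level per component of beta.
    """

    def extend(prefix: List[Fraction], remaining: Fraction):
        j = len(prefix)
        if j == len(deltas):
            yield prefix + [remaining]  # beta_n = 1 - sum of the others
            return
        k_j = int(remaining // deltas[j])  # integer part of remaining / delta_j
        for m_j in range(k_j + 1):
            beta_j = m_j * deltas[j]
            yield from extend(prefix + [beta_j], remaining - beta_j)

    yield from extend([], Fraction(1))


# Hypothetical stepsizes for n = 3: delta_1 = 1/2 and delta_2 = 1/3.
betas = list(generate_betas([Fraction(1, 2), Fraction(1, 3)]))
```

With these stepsizes the branch β1 = 1/2 yields β3 = 1/2 and β3 = 1/6, after which no further value of β2 fits, which is an instance of the nonuniform spacing in βn noted above.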
5.2. Special case. For equal stepsizes on all βi ,

let δi = δ, i = 1, . . . , n − 1.

Also assume that $1/\delta = p$ is an integer.
As before, the possible values of β1 are

[0, δ, 2δ, . . . , 1].

Then the possible values of βj corresponding to βi = mi δi , i = 1, . . . , j − 1, for
j = 2, . . . , n − 1 are

\[
\left[\, 0,\ \delta,\ 2\delta,\ \ldots,\ \Bigl(p - \sum_{i=1}^{j-1} m_i\Bigr)\delta \,\right].
\]

Fig. 8. Generating β for n = 4, δ = 0.2. (Tree with four levels, one per component of β; the
node values are multiples of 0.2.)

As before, $\beta_n = 1 - \sum_{i=1}^{n-1} \beta_i$, and now all the βn values are uniformly spaced.
Figure 8 shows part of the tree of β values for n = 4 and δ = 0.2.
Number of NBI subproblems. In general, the number of NBI subproblems
for a given n and a given $p = 1/\delta$ is given by

\[
\binom{n+p-1}{p}.
\]
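The count can be confirmed by direct enumeration in the equal-stepsize case. In the sketch below each β is represented by integer multiples (m_1, . . . , m_n) of δ = 1/p that sum to p, which avoids rounding issues; the function name `uniform_betas` is an assumption, and `math.comb` requires Python 3.8 or later.

```python
from math import comb
from typing import Iterator, Tuple


def uniform_betas(n: int, p: int) -> Iterator[Tuple[int, ...]]:
    """Yield (m_1, ..., m_n) with m_i >= 0 and sum m_i = p; beta_i = m_i / p."""

    def extend(prefix: Tuple[int, ...], remaining: int):
        if len(prefix) == n - 1:
            yield prefix + (remaining,)  # last component takes what is left
            return
        for m in range(remaining + 1):
            yield from extend(prefix + (m,), remaining - m)

    yield from extend((), p)


n, p = 4, 5  # the setting of Fig. 8: n = 4 objectives, delta = 0.2
multiples = list(uniform_betas(n, p))
betas = [tuple(m / p for m in ms) for ms in multiples]
expected = comb(n + p - 1, p)
```

For n = 4 and p = 5 the enumeration produces 56 parameter vectors, in agreement with the binomial coefficient.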

In spite of the fact that we only intend to solve “nearby subproblems,” the com-
putational cost of solving a huge number of NLP problems can be quite daunting.
This motivates the need for parallelization, as will be mentioned in the next section.
5.3. Ordering the subproblems. Each path from the root of the tree (the
topmost node) to a leaf (a member of the bottommost generation) represents a unique
weight β. It should also be observed that the β vectors are already ordered on the
basis of “nearness” as one traverses the tree breadthwise. Thus a strategy for picking
the order of the subproblems could be to start with the leftmost one (which has
β = en and is already solved) and solve the next one in the βn−1 generation (which
is βn−1 = δn−1 , βn = 1 − δn−1 ), then the next one in the βn−1 generation (βn−1 =
2δn−1 , βn = 1 − 2δn−1 ), and so on until all the subproblems for βi = 0, i = 1, . . . , n − 2
have been solved. Then we move to the next node in the βn−2 generation (i.e., with
βi = 0, i = 1, . . . , n − 3, βn−2 = δn−2 ) and visit all the children of this node, with the
starting points of the NBI subproblems chosen as the corresponding NBI subproblem
solutions at the previous node.
This is where the scope for parallelization comes in. The solution of the first
subproblem at the second node in the βn−2 generation did not have to wait until all
the subproblems in the first node were solved. The first subproblem in the second
node of the βn−2 generation with βn−2 = δn−2 , βn−1 = δn−1 , βn = 1 − δn−2 − δn−1
can be solved immediately after solving the first subproblem in the first node with
βn−2 = 0, βn−1 = δn−1 , βn = 1 − δn−1 . Thus the first subproblem in the second
node can be solved in parallel with the second subproblem in the first node, . . . , and
the kth subproblem in the second node can be solved in parallel with the (k + 1)th
subproblem of the first node. Further, the kth subproblem in the third node can be
solved in parallel with the (k + 1)th subproblem of the second node, with the solution
of the kth subproblem of the second node as the starting point, and so on. This entire
process of efficient parallelization is one of the topics of future research.
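One way to picture the parallel schedule described above is as a wavefront: if nodes of one generation and the subproblems within each node are numbered from zero, subproblem k of node j can be placed in stage j + k, so that it runs alongside subproblem k + 1 of node j − 1. The sketch below only groups the subproblems into such stages under that reading of the text; it is not the authors' parallel implementation, and the function name `wavefront` and the example sizes are assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def wavefront(children_per_node: Sequence[int]) -> List[List[Tuple[int, int]]]:
    """Group subproblems (node j, index k) into stages that may run together.

    children_per_node[j] is the number of subproblems at node j. Subproblem
    (j, k) is placed in stage j + k, after (j - 1, k), whose solution serves
    as its starting point, and after (j, k - 1).
    """
    stages: Dict[int, List[Tuple[int, int]]] = defaultdict(list)
    for j, count in enumerate(children_per_node):
        for k in range(count):
            stages[j + k].append((j, k))
    return [stages[s] for s in sorted(stages)]


# Hypothetical generation with three nodes holding 3, 2, and 1 subproblems.
schedule = wavefront([3, 2, 1])
```

Each stage depends only on results from earlier stages, so the subproblems within one stage can be handed to separate workers.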
6. Relationship between the NBI subproblem and minimizing a convex
combination of the objectives. In this section we illustrate how the NBI
subproblem is related to the popular method of minimizing a convex combination of
the objectives. This demonstrates how to go back and forth between the NBI parameter
β and the convex combinations weight vector w for a particular Pareto point. The
following discussion also demonstrates that corresponding to every w there exists a β
such that $\mathrm{NBI}_\beta$ has the same solution as $\mathrm{LC}_w$, but the converse is not true. In other
words, this proves that there might be points obtainable using NBI not obtainable by
minimizing convex combinations.
Given a Pareto point $x^*$, the problem can be thought of as being constrained only
by the vector of equalities and binding inequalities and bounds at $x^*$. Let us denote
this augmented vector of equalities by $\bar{h}(x)$. Let $w \in (\mathbb{R}_+ \cup \{0\})^n$, $\sum_{i=1}^n w_i = 1$,
denote a nonnegative convex weighting of the objectives. The weighted linear combination
problem for obtaining a Pareto optimal point is then written as

\[
\begin{aligned}
\min_{x}\quad & w^T F(x) \\
\text{s.t.}\quad & \bar{h}(x) = 0.
\end{aligned}
\tag{2}
\]

The solution of the problem above will be referred to as an LC point, and the
problem will be denoted by LCw . Part of the first-order necessary or KKT condition
for optimality of (x∗ , λ∗ ) for problem (2) is

(3) ∇x F (x∗ )w + ∇x h̄(x∗ )λ∗ = 0 .

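Similarly, if β denotes the vector of parameters in $\mathrm{NBI}_\beta$, the NBI subproblem
can be written as

\[
\begin{aligned}
\min_{x,\,t}\quad & -t \\
\text{s.t.}\quad & F(x) - \Phi\beta - t\hat{n} = 0, \\
& \bar{h}(x) = 0.
\end{aligned}
\tag{4}
\]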

Part of the KKT condition for optimality of $(x^*, t^*, \lambda^{(1)*}, \lambda^{(2)*})$ is

\[
\nabla_x F(x^*)\lambda^{(1)*} + \nabla_x \bar{h}(x^*)\lambda^{(2)*} = 0,
\tag{5}
\]
\[
-1 + \hat{n}^T \lambda^{(1)*} = 0,
\]

where $\lambda^{(1)} \in \mathbb{R}^n$ represents the vector of multipliers corresponding to the constraints
$\Phi\beta + t\hat{n} - F(x) = 0$, and $\lambda^{(2)} \in \mathbb{R}^{n_e}$ denotes the multipliers of the equality constraints
$\bar{h}(x) = 0$.
Claim. Suppose $(x^*, t^*, \lambda^{(1)*}, \lambda^{(2)*})$ is the solution of $\mathrm{NBI}_\beta$ and $\sum_{i=1}^n \lambda_i^{(1)*} \neq 0$.
Now define the components of the vector w as

\[
w_i = \frac{\lambda_i^{(1)*}}{\sum_{j=1}^n \lambda_j^{(1)*}}.
\]

Then problem (2) with the above convex weighting vector w has the solution

\[
\left[\, x^*,\; \lambda^* = \frac{1}{\sum_{i=1}^n \lambda_i^{(1)*}}\, \lambda^{(2)*} \right].
\]

Proof. Dividing both sides of (5) by the scalar $\sum_{i=1}^n \lambda_i^{(1)*}$ and observing that
$\bar{h}(x^*) = 0$, the equivalence between (3) and (5) becomes obvious.
However, quite clearly, if, for some i, the sign of $\lambda_i^{(1)*}$ is opposite to that of
$\sum_{j=1}^n \lambda_j^{(1)*}$, then the vector w has a negative component and does not qualify as a
weight for problem (2). In such a case, either the Pareto optimality of the NBI point
$(x^*, t^*, \lambda^{(1)*}, \lambda^{(2)*})$ is questionable or the Pareto point lies in a nonconvex part of the
Pareto set (Pareto points in nonconvex parts of the Pareto set cannot be obtained by
minimizing a linear combination of the objectives).
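The conversion from NBI multipliers to weights, together with the sign check just described, can be sketched as follows. This is an illustration only: the function name and the multiplier values are hypothetical and do not come from the paper's experiments.

```python
def weights_from_nbi_multipliers(lam1):
    """Return (w, is_convex) with w_i = lam1_i / sum(lam1).

    is_convex is False when some component of w is negative, i.e. when w
    does not qualify as a convex weighting for the linear combination problem.
    """
    total = sum(lam1)
    if total == 0:
        raise ValueError("sum of multipliers is zero; w is undefined")
    w = [l / total for l in lam1]
    return w, all(wi >= 0 for wi in w)

# Hypothetical multiplier values, chosen only for illustration.
w, ok = weights_from_nbi_multipliers([-0.03, -0.09])
print(w, ok)

# Multipliers of mixed sign give a weight vector with a negative component.
w_bad, ok_bad = weights_from_nbi_multipliers([0.02, -0.10])
print(w_bad, ok_bad)
```

A `False` flag corresponds to the situation in the text where the NBI point is either not Pareto optimal or lies in a nonconvex part of the Pareto set.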
Just as the above analysis gives a method for obtaining w for problem LCw given
the corresponding solution of N BIβ , one can also obtain the NBI point corresponding
to a given solution of problem LCw with very little effort.
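Claim. Suppose $(x^*, \lambda^*)$ solves problem $\mathrm{LC}_w$. Let $(\bar{\beta}, t^*)$ be the solution of the
$(n + 1) \times (n + 1)$ linear system

\[
\Phi\beta + t\hat{n} = F(x^*), \qquad \sum_{i=1}^n \beta_i = 1.
\]

Then $(x^*, \lambda^*)$ corresponds to the solution of $\mathrm{NBI}_\beta$ with $\beta = \bar{\beta}$; i.e., the solution of
$\mathrm{NBI}_{\bar{\beta}}$ is

\[
\left[\, x^*,\; t^*,\; \lambda^{(1)*} = \frac{w}{w^T \hat{n}},\; \lambda^{(2)*} = \frac{\lambda^*}{w^T \hat{n}} \right].
\]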

Proof. Let us divide (3) on both sides by $w^T \hat{n}$. This can always be done because w
has nonnegative components (not all zero) and $\hat{n}$ has negative components, so that $w^T \hat{n} < 0$.
Observing that $\lambda^{(1)*}$ defined above satisfies $\hat{n}^T \lambda^{(1)*} = 1$, we can see that the first
part of the KKT conditions for $\mathrm{NBI}_{\bar{\beta}}$ holds. Further observing that $\bar{h}(x^*) = 0$ and
$\Phi\bar{\beta} + t^*\hat{n} = F(x^*)$, the required equivalence between $\mathrm{LC}_w$ and $\mathrm{NBI}_{\bar{\beta}}$ follows.
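The linear system in the claim is small enough to solve directly. The sketch below is a hedged illustration rather than the authors' code: it assumes that column i of Φ is $F(x_i^*) - F^*$ and that F is measured relative to $F^*$ (the convention of the earlier sections), it takes $\hat{n} = -\Phi e$, and it uses objective values read off Table 1. The helper names `solve_linear` and `nbi_parameters` are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def nbi_parameters(phi, f_shifted):
    """Return (beta, t) solving phi beta + t n_hat = f_shifted, sum(beta) = 1.

    phi is the n x n pay-off matrix, n_hat = -phi e is the quasi-normal, and
    f_shifted is F(x*) minus the vector of individual minima F* (assumed).
    """
    n = len(phi)
    n_hat = [-sum(row) for row in phi]
    a = [phi[i][:] + [n_hat[i]] for i in range(n)]
    a.append([1.0] * n + [0.0])
    sol = solve_linear(a, list(f_shifted) + [1.0])
    return sol[:n], sol[n]

# Data read off Table 1: F* = (0.5551, -4.0111); column i of phi is F(x_i*) - F*.
phi = [[0.0, 10.0000 - 0.5551],
       [2.1306 + 4.0111, 0.0]]
# Table 1 row for beta = (0.50, 0.50), shifted by F*.
f_shifted = [4.4866 - 0.5551, -1.4546 + 4.0111]
beta, t = nbi_parameters(phi, f_shifted)
print(beta, t)
```

Up to the rounding of the tabulated values, the recovered β should agree with the parameter that generated the point.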
7. Relationship between the NBI subproblem and goal programming
using multipliers. A solution to an NBI subproblem is also a solution to a goal
programming problem given that some assumptions hold. This is elaborated on below,
using the same type of multiplier argument used to relate N BIβ to LCw .
Claim. Suppose $(x^*, t^*, \lambda^{(1)*}, \lambda^{(2)*})$ is the solution of $\mathrm{NBI}_\beta$. Suppose that the
components of $\lambda^{(1)*}$ are all of the same sign with at least one nonzero component. If
$\lambda_k^{(1)*}$ is any such nonzero component, then $x^*$ solves the following goal programming
problem:

\[
\begin{aligned}
\min_{x}\quad & f_k(x) \\
\text{s.t.}\quad & f_i(x) \le \gamma_i, \quad \forall i \neq k, \\
& \bar{h}(x) = 0
\end{aligned}
\tag{6}
\]

with goals $\gamma_i$ given by

\[
\gamma_i =
\begin{cases}
f_i(x^*) & \text{if } \lambda_i^{(1)*} \neq 0, \\
\text{any finite number} \ge f_i(x^*) & \text{if } \lambda_i^{(1)*} = 0
\end{cases}
\qquad \forall i \in \{1, 2, \ldots, n\} \setminus \{k\}.
\]
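Proof. Since $(x^*, t^*, \lambda^{(1)*}, \lambda^{(2)*})$ solves the $\mathrm{NBI}_\beta$ subproblem, it must satisfy (5).
Given that $\lambda_k^{(1)*} \neq 0$, we can divide both sides of (5) by $\lambda_k^{(1)*}$ and get

\[
\nabla_x f_k(x^*) + \sum_{i=1,\, i \neq k}^{n} \nabla_x f_i(x^*) \frac{\lambda_i^{(1)*}}{\lambda_k^{(1)*}} + \nabla_x \bar{h}(x^*) \frac{\lambda^{(2)*}}{\lambda_k^{(1)*}} = 0.
\]

Now $\lambda_i^{(1)*} / \lambda_k^{(1)*} \ge 0$ because $\lambda_i^{(1)*}$ and $\lambda_k^{(1)*}$ are of the same sign. Then with $\lambda_i^{(1)*} / \lambda_k^{(1)*}$ as
the multipliers of the n − 1 inequality constraints in (6), the goals $\gamma_i$ satisfy
complementarity by definition, since

\[
\gamma_i = f_i(x^*) \text{ whenever } \lambda_i^{(1)*} \neq 0
\;\Rightarrow\;
\frac{\lambda_i^{(1)*}}{\lambda_k^{(1)*}} \left( f_i(x^*) - \gamma_i \right) = 0 \quad \forall i \neq k.
\]

Moreover, since $x^*$ is clearly feasible for (6), $\left( x^*, \lambda_i^{(1)*} / \lambda_k^{(1)*}, \lambda^{(2)*} / \lambda_k^{(1)*} \right)$ solves (satisfies
first-order necessary conditions for minimizer for) problem (6).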
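The case definition of the goals can be written out as a short routine. This is only a sketch under stated assumptions: indices are 0-based, the name `goals_from_multipliers` and its `slack` parameter are hypothetical, and the sample values are invented for illustration.

```python
def goals_from_multipliers(f_at_xstar, lam1, k, slack=1.0):
    """Return {i: gamma_i} for all i != k, following the case definition above.

    gamma_i = f_i(x*) when lam1[i] != 0; otherwise any finite number
    >= f_i(x*) is allowed, and f_i(x*) + slack is used here (slack >= 0 is
    an arbitrary illustrative choice).
    """
    nonzero = [l for l in lam1 if l != 0]
    if not nonzero or lam1[k] == 0:
        raise ValueError("lam1[k] must be a nonzero component")
    if not (all(l > 0 for l in nonzero) or all(l < 0 for l in nonzero)):
        raise ValueError("multipliers must all share one sign")
    return {i: (f if lam1[i] != 0 else f + slack)
            for i, f in enumerate(f_at_xstar) if i != k}

# Hypothetical objective values and multipliers for a three-objective problem.
gamma = goals_from_multipliers([2.0, 5.0, 7.0], [-0.4, 0.0, -0.1], k=0)
print(gamma)
```

The second objective has a zero multiplier, so its goal is relaxed above $f_i(x^*)$, while the third is pinned to its value at $x^*$.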
8. Proof of independence with respect to function scales using the
quasi-normal. In this section we shall prove that the NBI point found using the
quasi-normal n̂ and a particular β is independent of how the individual functions are
scaled.
Let the objective functions be scaled by positive scalars si as

fi (x) ← si fi (x), i = 1, . . . , n.

In other words, if s is the vector with components si and S = diag(s), then

F (x) ← SF (x).
Consequently

\[
\nabla_x F(x) \leftarrow \nabla_x F(x)\,S, \qquad \Phi \leftarrow S\Phi.
\]

The quasi-normal direction $\hat{n} = -\Phi e$ after scaling becomes $\hat{n} = -S\Phi e$.


Claim. If $(x^*, t^*, \lambda^{(1)*}, \lambda^{(2)*})$ solves the unscaled $\mathrm{NBI}_\beta$ (i.e., with $S = I_n$), then
$(x^*, t^*, S^{-1}\lambda^{(1)*}, \lambda^{(2)*})$ solves$^2$ $\mathrm{NBI}_\beta$ with the ith function scaled by $s_i$ as above.

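Proof. Since $(x^*, t^*, \lambda^{(1)*}, \lambda^{(2)*})$ solves the unscaled $\mathrm{NBI}_\beta$ (still with only equality
constraints as in the previous section),

\[
\begin{aligned}
\nabla_x F(x^*)\lambda^{(1)*} + \nabla_x \bar{h}(x^*)\lambda^{(2)*} &= 0, \\
\hat{n}^T \lambda^{(1)*} &= 1, \\
\Phi\beta + t^*\hat{n} &= F(x^*), \\
\bar{h}(x^*) &= 0.
\end{aligned}
\]

The first equation can be rewritten to state that the following holds:

\[
(\nabla_x F(x^*)S)(S^{-1}\lambda^{(1)*}) + \nabla_x \bar{h}(x^*)\lambda^{(2)*} = 0.
\tag{7}
\]

The second equation, with $\hat{n} = -\Phi e$, implies

\[
-e^T \Phi^T \lambda^{(1)*} = 1 \;\equiv\; -e^T \Phi^T S S^{-1} \lambda^{(1)*} = 1.
\]

Since $S = S^T$, the above is the same as

\[
-(e^T (S\Phi)^T)(S^{-1}\lambda^{(1)*}) = 1.
\tag{8}
\]

The third equation can be rewritten as

\[
\Phi\beta - t^*\Phi e = F(x^*) \;\equiv\; S\Phi\beta - t^* S\Phi e = SF(x^*).
\tag{9}
\]

Clearly, equations (7), (8), and (9) imply that $(x^*, t^*, S^{-1}\lambda^{(1)*}, \lambda^{(2)*})$ solves $\mathrm{NBI}_\beta$
with the functions scaled by S.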
The above result does not depend on e being the vector of all ones and conse-
quently holds if n̂ is scaled by a factor, say, a normalization constant.
The above result suggests that no matter how disparately the different functions
might be scaled, NBI with the quasi-normal finds a set of points as if the functions
were all scaled to the same order of magnitude.
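The two scaling identities used in the proof can be checked numerically. The sketch below is illustrative only: the pay-off matrix, scales, and multipliers are hypothetical values, the helper names are not from the paper, and $\hat{n} = -\Phi e$ is taken as the quasi-normal.

```python
def quasi_normal(phi):
    """n_hat = -phi e, with e the vector of all ones."""
    return [-sum(row) for row in phi]

def scale_rows(phi, s):
    """Return S phi with S = diag(s)."""
    return [[si * v for v in row] for si, row in zip(s, phi)]

def nbi_point(phi, beta, t):
    """Return phi beta + t n_hat."""
    n_hat = quasi_normal(phi)
    return [sum(p * b for p, b in zip(row, beta)) + t * nh
            for row, nh in zip(phi, n_hat)]

# Hypothetical data: an antidiagonal pay-off matrix, scales, and multipliers.
phi = [[0.0, 9.0], [6.0, 0.0]]
s = [5.0, 0.5]
lam1 = [-0.05, -0.55 / 6.0]      # chosen so that n_hat^T lam1 = 1
beta, t = [0.3, 0.7], 0.1

phi_s = scale_rows(phi, s)
lam1_s = [l / si for l, si in zip(lam1, s)]   # S^{-1} lam1

# The normalization n_hat^T lam1 is unchanged by the scaling.
unscaled_dot = sum(a * b for a, b in zip(quasi_normal(phi), lam1))
scaled_dot = sum(a * b for a, b in zip(quasi_normal(phi_s), lam1_s))
print(unscaled_dot, scaled_dot)

# The point on the scaled quasi-normal is S times the unscaled point.
point = nbi_point(phi, beta, t)
point_s = nbi_point(phi_s, beta, t)
print(point, point_s)
```

Both checks use the same β and t before and after scaling, which is the content of the claim.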
2 Here “solves” means “finds a stationary point of the NLP problem.”
9. Advantages of NBI.
• Finds a uniform spread of Pareto points. Consider any method that
attempts to capture the shape of the Pareto surface by generating many points on
the surface. An important property that would make such a method desirable is that
it should generate an even spread of Pareto points, representative of all parts of the
Pareto set, and not clusters of points in certain parts which fail to provide a good
idea of the entire shape. Given that we can only solve a limited number of NLP
subproblems and hence generate only a limited number of Pareto points, it becomes
crucial to have the points spread as evenly as possible, so that a good approximation
of the Pareto surface is obtained by solving as few subproblems as possible.
In implementing NBI, various settings of the parameter β are chosen such that
the points Φβ form a uniformly spaced grid on the CHIM simplex (this is achieved
by generating β as in section 5.2). Since the NBI points are restricted to lie on a set
of parallel normals emanating from these “uniformly spread” points, the projections
of the areas between neighboring NBI points on the CHIM are uniformly spread.
Thus NBI can yield a good approximation of the Pareto surface by solving fewer
NLP problems than weighted convex combinations. It is very difficult to guess the
parameter settings for which weighted convex combinations yield a uniform spread of
Pareto points because the weights that correspond to an even spread depend on the
shape of the Pareto surface, as shown in Das and Dennis [2].
The interrelationship between the linear combinations subproblem and the NBI
subproblem provides more insight into why the linear combinations technique fails
to give a uniformly distributed set of Pareto optima. By fixing the weights w in
subproblem LCw , in effect the multipliers of the corresponding NBI subproblem get
fixed, thus partly restricting the solution of the resultant subproblem. Even if the
Pareto optima are uniformly distributed in the Pareto set, there is no reason why
the corresponding multipliers should be uniformly distributed. More insight into the
failures of convex combinations can be found in Das and Dennis [2].
However, the weights in the linear combinations approach are often very desirable
because they give an idea of the relative importance of the objectives. Thus obtaining
the NBI points, which are uniformly distributed, and then finding the corresponding
weights w for the NBI points can be quite insightful.
• Improves over homotopy techniques. NBI improves over homotopy/continuation
techniques for tracing the curve of Pareto optimal solutions like the one discussed in
Rakowska, Haftka, and Watson [6] in the following respects.
– Applicable for more than two objectives. NBI is formulated to handle an
arbitrary number of objectives. On the other hand, for a multiobjective
problem with more than two objectives the homotopy parameter is not a
scalar and the associated differential equation is a system of nonlinear partial
differential equations with not readily available boundary conditions, rather
than an ordinary initial value problem, as in the case of two objectives.
– Does not require exact Hessian. Even for a biobjective problem, solving the
homotopy initial value problem requires exact second derivative information
(i.e., the Hessian of the Lagrangian), whereas the NBI subproblem solver can
use any NLP technique. Even if the NLP technique for the NBI subproblem
requires gradient information, secant methods for NLPs make exact Hessians
unnecessary.
– Bypasses tracking active sets. For problems with inequality constraints or
explicit bounds on variables, homotopy techniques need to keep track of the
changes in active sets of the inequality constraints or bounds meticulously in
the course of the numerical integration, which can present difficulties if the
number of inequalities or bounds is large. On the other hand, an interior point
NLP solver used as the NBI subproblem solver would handle this situation
quite efficiently and would not have a problem with frequent changes in the
active set.
It must be noted though that points where the active set changes provide
important information to the designer. However, homotopy needs to keep
track of changes in active sets even in the uninteresting parts of the Pareto
set, whereas once the NBI points are found it is not difficult to trace how
the active set changes along the Pareto surface by examining the binding
inequalities at the Pareto points.
– Does not assume connectedness or smoothness of the Pareto set. The homo-
topy technique assumes that the Pareto curve is continuous and differentiable
and is also connected to be able to integrate along the curve. This is not the
case with NBI, although it might end up reporting some subproblems as
infeasible if the Pareto set is disconnected.
• NBI improves on other traditional methods like goal programming in the sense
that it never requires any prior knowledge of “feasible goals.” It improves on mul-
tilevel optimization techniques from the trade-off standpoint, since multilevel tech-
niques usually can improve only a few of the “most important” objectives, leaving no
compromise for the rest.
10. A numerical example. Below is a brief account of applying NBI to the
following small biobjective problem:

\[
\min_{x}
\begin{bmatrix}
f_1(x) = x_1^2 + x_2^2 + x_3^2 + x_4^2 + x_5^2 \\[2pt]
f_2(x) = 3x_1 + 2x_2 - \dfrac{x_3}{3} + 0.01\,(x_4 - x_5)^3
\end{bmatrix}
\]
\[
\begin{aligned}
\text{s.t.}\quad & x_1 + 2x_2 - x_3 - 0.5x_4 + x_5 = 2, \\
& 4x_1 - 2x_2 + 0.8x_3 + 0.6x_4 + 0.5x_5^2 = 0, \\
& x_1^2 + x_2^2 + x_3^2 + x_4^2 + x_5^2 \le 10.
\end{aligned}
\]

NBI using the quasi-normal was run on the above problem with the evenly spread
values of β with δ = 0.05. The Pareto points thus obtained are tabulated below in
Table 1 and are plotted in Fig. 9.
The method of convex combinations was run thrice on the same problem, with
the weight vectors w assuming the same 21 uniformly spread values as the β vector
above. The efficient solution scheme, i.e., starting the solution of a subproblem from
the optimal point of the “nearest subproblem,” was used here as well.
When run on the original problem the minimizer of f2 (x) was found six times for
six different w, and there was a considerable gap “in the middle” of the Pareto set
(see Fig. 9).
With f1 scaled by 5, the point found six times earlier was found only twice (i.e.,
heavily weighting the first objective made the minimizer move away from x∗2 ), but the
Pareto optimal vectors obtained were concentrated at the F (x∗1 ) end and no “middle
ground for compromise” was captured.
Table 1
β (β1, β2)      Objective values (f1, f2)
0.00, 1.00      10.0000, −4.0111
0.05, 0.95       9.4254, −3.7706
0.10, 0.90       8.8546, −3.5276
0.15, 0.85       8.2882, −3.2818
0.20, 0.80       7.7264, −3.0329
0.25, 0.75       7.1698, −2.7807
0.30, 0.70       6.6189, −2.5247
0.35, 0.65       6.0743, −2.2647
0.40, 0.60       5.5368, −2.0000
0.45, 0.55       5.0072, −1.7302
0.50, 0.50       4.4866, −1.4546
0.55, 0.45       3.9764, −1.1722
0.60, 0.40       3.4781, −0.8820
0.65, 0.35       2.9939, −0.5827
0.70, 0.30       2.5266, −0.2724
0.75, 0.25       2.0801,  0.0514
0.80, 0.20       1.6597,  0.3922
0.85, 0.15       1.2740,  0.7556
0.90, 0.10       0.9370,  1.1506
0.95, 0.05       0.6754,  1.5947
1.00, 0.00       0.5551,  2.1306

With f1 scaled by 10, the point repeated earlier was found only once, although
the clustering at the F (x∗1 ) end increased (see Fig. 10).
The Pareto optimal vectors obtained using linear combinations are shown in Ta-
ble 2.
Clearly the inability of the method of convex combinations to adequately capture
the shape of the Pareto surface renders it fairly useless as a means of studying the
trade-off between the conflicting objectives.
11. A truss optimization problem. Now we shall apply NBI to a truss op-
timization problem, a version of which has been studied in Koski [14]. The problem
involves optimizing the design of a pin-jointed linear truss structure as shown in
Fig. 11.
The problem is to find the optimal position of the vertical bar of fixed length L
(the bars on the edge get fixed and their lengths decided accordingly) between 1/4
and 3/4 of the entire distance D and the optimal bar cross-sectional areas. The angles
θ and α clearly depend on the chosen location x. Other optimization variables are
the cross-sectional areas of the bars, $a_1$, $a_2$, $a_3$, allowed to vary between 0.8 in² and
3.0 in².
The objectives to be considered for minimization are the total volume of the
structure, the displacement of the node, and the absolute value of the stress in each
of the three bars. In the structure considered by Koski in [14] the location of the
middle bar was fixed, so that θ and α were also fixed. Also, his total volume was a
linear function of the design variables, unlike in our formulation where total volume
is expressed as $a_1 \frac{L}{\sin\theta} + a_2 L + a_3 \frac{L}{\sin\alpha}$.
Without going into further details of the problem and the data involved, which
can be found in Das [1], we present some Pareto plots for subsets of the five objectives
mentioned here. Figure 12 shows the Pareto curve for minimizing the square of nodal
displacement and the total volume with constraints on the absolute stresses in the
three bars. The apparently unexpected gaps in the Pareto curve using NBI are points
corresponding to which the NBI subproblems were infeasible owing to discontinuities
Fig. 9. Pareto optimal vectors in the objective space using NBI and the method of convex
combinations, respectively. (Two panels; horizontal axis F(1), vertical axis F(2).)

in the Pareto set introduced by stringent stress inequalities.


Figure 13 shows the Pareto curve for minimizing the stress in the right bar (the
minimum value of the stress was positive and hence the absolute sign was dropped)
and the total volume with constraints on the absolute stresses in the middle and left
bars.
Given the individual minima and minimizers of the objectives at the outset, the
number of floating point operations (flops) required in solving the subproblems for
minimizing the stress in the right bar and the total volume using NBI and convex
combinations for 21 parameter settings for each are shown in Table 3.
Table 3 shows that NBI takes about twice as many flops but finds about twice
as many distinct points, so that the number of flops per Pareto point is almost the
same for the two methods (convex combinations wins marginally). But NBI yields a
Fig. 10. Pareto optimal vectors in the objective space using the method of linear combinations
on the problem with f1 scaled by 5 and 10, respectively. (Two panels, each titled "Efficient points
obtained by minimizing convex combinations of objectives"; horizontal axis F(1), vertical axis F(2).)

uniform spread of points representative of all parts of the Pareto set and hence yields
a better model of the trade-off curve for the same effective computational cost.
Finally, Fig. 14 shows the Pareto surface obtained using NBI with stress in the left
bar and total volume and stress in the right bar as objectives. The uniform stepsize
δ on each component of β was chosen to be 0.1, and 66 NBI subproblems were solved
of which nine failed to converge owing to infeasibility. The whole process took about
11.4 million flops.
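The subproblem count quoted above follows from the number of β vectors whose components are multiples of δ and sum to one. The sketch below enumerates them; it is an illustration with a hypothetical function name, and the enumeration order is not claimed to match the scheme of section 5.2.

```python
from itertools import combinations

def beta_grid(n, steps):
    """Yield every beta with n components in {0, 1/steps, ..., 1} summing to 1.

    Uses the stars-and-bars correspondence: each choice of n-1 bar positions
    among steps+n-1 slots gives one vector of nonnegative integer counts
    that add up to steps.
    """
    for bars in combinations(range(steps + n - 1), n - 1):
        edges = (-1,) + bars + (steps + n - 1,)
        counts = [edges[i + 1] - edges[i] - 1 for i in range(n)]
        yield tuple(c / steps for c in counts)

print(len(list(beta_grid(3, 10))))  # delta = 0.1, three objectives
print(len(list(beta_grid(2, 20))))  # delta = 0.05, two objectives
```

With three objectives and δ = 0.1 the grid has 66 points, matching the number of subproblems reported here; with two objectives and δ = 0.05 it has the 21 points used in the numerical example.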
A more detailed engineering-oriented treatment of this problem with trade-off
studies for more than the groups of objectives mentioned here can be found in Das [1].
12. Function scaling implicit in NBI. NBI using the quasi-normal com-
ponent is unaffected by the function scales. However, as the functions get more
disparately scaled, the Pareto set gets more “stretched,” and consequently the NBI
Table 2
Weights       Objective values     Objective values     Objective values
(w1, w2)      (original scale)     (f1 scaled by 5)     (f1 scaled by 10)
0.00, 1.00    10.0000, −4.0111     10.0000, −4.0111     10.0000, −4.0111
0.05, 0.95    10.0000, −4.0111     10.0000, −4.0111      4.8211, −1.6330
0.10, 0.90    10.0000, −4.0111      4.1857, −1.2896      1.1634,  0.8741
0.15, 0.85    10.0000, −4.0111      1.6131,  0.4330      0.7689,  1.4083
0.20, 0.80    10.0000, −4.0111      1.0180,  1.0451      0.6559,  1.6416
0.25, 0.75    10.0000, −4.0111      0.7975,  1.3592      0.6100,  1.7724
0.30, 0.70     8.9403, −3.5644      0.6953,  1.5506      0.5876,  1.8563
0.35, 0.65     4.5379, −1.4822      0.6412,  1.6796      0.5754,  1.9146
0.40, 0.60     2.7307, −0.4109      0.6100,  1.7725      0.5682,  1.9576
0.45, 0.55     1.8319,  0.2473      0.5909,  1.8425      0.5637,  1.9905
0.50, 0.50     1.3357,  0.6928      0.5788,  1.8973      0.5608,  2.0165
0.55, 0.45     1.0425,  1.0147      0.5707,  1.9413      0.5589,  2.0376
0.60, 0.40     0.8615,  1.2583      0.5654,  1.9773      0.5576,  2.0551
0.65, 0.35     0.7463,  1.4492      0.5618,  2.0075      0.5567,  2.0698
0.70, 0.30     0.6719,  1.6029      0.5593,  2.0331      0.5561,  2.0823
0.75, 0.25     0.6236,  1.7295      0.5576,  2.0551      0.5557,  2.0931
0.80, 0.20     0.5926,  1.8356      0.5565,  2.0741      0.5554,  2.1025
0.85, 0.15     0.5734,  1.9258      0.5558,  2.0909      0.5553,  2.1108
0.90, 0.10     0.5622,  2.0035      0.5554,  2.1057      0.5552,  2.1181
0.95, 0.05     0.5567,  2.0711      0.5551,  2.1188      0.5551,  2.1247
1.00, 0.00     0.5551,  2.1306      0.5551,  2.1306      0.5551,  2.1306

Fig. 11. A truss structure under a suspended load and a wind load. (Diagram labels: D, x, L, P,
wind load, suspended load.)

points get farther apart from each other. Consequently solving an NBI subproblem
starting from the solution of the same nearby subproblem takes more iterations to
converge. This was observed in the numerical example above and motivates the need
to scale the functions properly to remove this disparity in scales.
Geometrically it can be perceived that if the vertices of the CHIM simplex are
almost equidistant from the origin, i.e., the quantities

\[
\| F(x_j^*) - F^* \|, \quad j = 1, \ldots, n,
\]
Fig. 12. Pareto curves for minimizing nodal displacement and total volume (cu. ft.) using
NBI and convex combinations. (Two panels, "Points obtained using NBI" and "Efficient points
obtained by minimizing convex combinations of objectives"; horizontal axis: nodal displacement
in ft. squared; vertical axis: total volume (cu. ft.).)

Fig. 13. NBI points for minimizing stress in right bar and total volume (cu. ft.). (Two panels,
"Points obtained using NBI" and "Efficient points obtained by minimizing convex combinations
of objectives"; horizontal axis: stress in right bar (ksi); vertical axis: total volume (cu. ft.).)


are almost equal, then the quasi-normal direction n̂ is almost normal to the CHIM
simplex. This would achieve the “minimally stretched” Pareto set we want and could
also be a good scaling for the problem in the sense that all the functions would be
about the same order of magnitude and thus would reduce possible ill conditioning.
For the biobjective problem, Φ is antidiagonal; thus a scaling that would achieve
the above is obvious:
\[
f_1 \leftarrow \frac{f_1}{f_1(x_2^*)}, \qquad f_2 \leftarrow \frac{f_2}{f_2(x_1^*)},
\]
which makes each vertex of CHIM be unit distance from the origin.
Table 3
                             NBI           Convex combinations
Number of distinct points    21            12
Number of flops              2,962,068     1,650,079
Flops per distinct point     141,050.86    137,506.58

Fig. 14. NBI points for minimizing stresses in left and right bars and total volume. (Axes:
stress in left bar (ksi), total volume (cu. ft.), stress in right bar (ksi).)

However, the solution may not be so transparent for more than two objectives,
and it may not be possible to get all the vertices exactly equidistant from the origin.
So now we shall attempt to find function scalings $d_i > 0$ such that the functions scaled
as

\[
f_i \leftarrow \sqrt{d_i}\, f_i
\]

will have the property that the variance among the scaled distances of the vertices
from the origin, i.e.,

\[
\| \sqrt{D}\,(F(x_j^*) - F^*) \|^2, \quad j = 1, \ldots, n,
\]

will be minimized ($D = \mathrm{diag}(d)$; d represents the vector with components $d_i$).
Let $v_j = \| \sqrt{D}\,(F(x_j^*) - F^*) \|^2$, i.e.,

\[
v_j = \sum_{i=1}^{n} d_i\, \phi_{i,j}^2,
\]

where $\phi_{i,j}$ is the ith row jth column entry of the matrix Φ.
The mean square distance of the vertices is defined as

\[
\bar{v} = \frac{1}{n} \sum_{j=1}^{n} v_j = \frac{1}{n} \sum_{i=1}^{n} d_i \left( \sum_{j=1}^{n} \phi_{i,j}^2 \right).
\]
The variance quantity to be minimized is given by

\[
V(d) = \sum_{j=1}^{n} (v_j - \bar{v})^2;
\]

i.e.,

\[
V(d) = \sum_{j=1}^{n} \left( \sum_{i=1}^{n} d_i\, \phi_{i,j}^2 - \frac{1}{n} \sum_{i=1}^{n} d_i \left( \sum_{k=1}^{n} \phi_{i,k}^2 \right) \right)^2.
\]

Let A be the matrix with components $a_{i,j}$ given by

\[
a_{i,j} = \phi_{i,j}^2 - \frac{1}{n} \sum_{k=1}^{n} \phi_{i,k}^2.
\]

Then

\[
V(d) = \sum_{j=1}^{n} \left( \sum_{i=1}^{n} d_i\, a_{i,j} \right)^2;
\]

i.e.,

\[
V(d) = d^T A A^T d = \| A^T d \|^2.
\]

This quadratic function is convex in d and has an unconstrained minimizer at
d = 0. Thus we shall demand a specific value of $\bar{v}$, which represents an average distance
of the CHIM simplex from the origin and is roughly the same order of magnitude as
a typical function value of any objective encountered in the computation.
Suppose we want a typical objective value to be τ, which could be something like
10. Then we would enforce

\[
\bar{v} = \frac{1}{n} \sum_{i=1}^{n} d_i \left( \sum_{j=1}^{n} \phi_{i,j}^2 \right) = \tau
\]

along with a small lower bound on di . Thus the optimization problem to be solved
to obtain our “optimal” scales is

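\[
\begin{aligned}
\min_{d}\quad & V(d) = d^T A A^T d \\
\text{s.t.}\quad & \sum_{i=1}^{n} d_i \left( \sum_{j=1}^{n} \phi_{i,j}^2 \right) = n\tau, \\
& d_i \ge 10^{-8}, \quad i = 1, \ldots, n.
\end{aligned}
\]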

Thus we can see how the matrix Φ suggests an “improved scaling” of the objective
functions, which is a bonus in the NBI approach.
It is worth observing that using the mean distance as opposed to the mean square
distance in the last constraint would result in loss of convexity; hence the latter is
preferred.
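The identity $V(d) = \|A^T d\|^2$ and the biobjective scaling can be checked with a few lines of code. The sketch below is illustrative only: the function names are hypothetical, the three-objective matrix and trial scales are invented, and the biobjective ranges are read off Table 1 under the assumption that column i of Φ is $F(x_i^*) - F^*$.

```python
def vertex_sq_distances(phi, d):
    """v_j = sum_i d_i * phi[i][j]**2 for each column j."""
    n = len(phi)
    return [sum(d[i] * phi[i][j] ** 2 for i in range(n)) for j in range(n)]

def variance_direct(phi, d):
    """V(d) = sum_j (v_j - v_bar)**2."""
    v = vertex_sq_distances(phi, d)
    v_bar = sum(v) / len(v)
    return sum((vj - v_bar) ** 2 for vj in v)

def variance_quadratic(phi, d):
    """V(d) = ||A^T d||^2 with a_ij = phi_ij^2 - (1/n) sum_k phi_ik^2."""
    n = len(phi)
    a = [[phi[i][j] ** 2 - sum(p ** 2 for p in phi[i]) / n for j in range(n)]
         for i in range(n)]
    at_d = [sum(d[i] * a[i][j] for i in range(n)) for j in range(n)]
    return sum(x * x for x in at_d)

# Hypothetical three-objective pay-off matrix and trial scales.
phi = [[0.0, 4.0, 2.0], [3.0, 0.0, 1.0], [5.0, 2.0, 0.0]]
d = [0.5, 2.0, 0.1]
print(variance_direct(phi, d), variance_quadratic(phi, d))

# Biobjective case with ranges from Table 1: d_i = 1 / phi^2 equalizes the vertices.
phi2 = [[0.0, 9.4449], [6.1417, 0.0]]
d2 = [1.0 / 9.4449 ** 2, 1.0 / 6.1417 ** 2]
print(variance_quadratic(phi2, d2))
```

The two variance formulas agree up to rounding, and in the biobjective case the scaling $f_i \leftarrow f_i / f_i(x_j^*)$ corresponds to $d_i = 1/\phi_{i,j}^2$, which drives the variance to zero.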
13. Conclusion. A technique was presented for finding Pareto optimal points
of any smooth, constrained multiobjective problem with any number of objectives,
perhaps restricted only by considerations of computational expense. The technique is
efficient and has several useful properties, including that of obtaining an even spread of
Pareto optimal points and invariance with respect to function scaling. This technique
should be regarded as a tool for generating points from which the user can select the
final design and not one that actually helps the user make that selection.
Further research is in progress regarding the implementational issues of paralleliz-
ing the solution of the NBI subproblem. Customized NLP techniques for solving the
NBI subproblem will also be investigated.
A public domain Matlab 4.2 implementation of NBI is available free of charge
at http://www.owlnet.rice.edu/~indra/NBIhomepage.html.

Acknowledgments. First, the authors would like to thank the referees for point-
ing out various improvements in the presentation of the material and presenting sev-
eral insightful comments. The authors would also like to thank Paul Uhlig, De-
partment of Mathematics, Rice University, for several insightful discussions; Dr. Ja-
gannatha Rao, Department of Mechanical Engineering, University of Houston, for
providing motivation and helpful comments; Dan P. Giesy, NASA-Langley, for an
interesting conversation on his earlier work; Jeffrey Hittinger, University of Michi-
gan, Ann Arbor, for a helpful discussion on data structures; and Sanjeeb Dash, Rice
University, for his algorithmic insights.

REFERENCES

[1] I. Das, Nonlinear Multicriteria Optimization and Robust Optimality, Ph.D. Thesis, Dept. of
Computational and Applied Mathematics, Rice University, Houston, TX, 1997.
[2] I. Das and J. E. Dennis, A closer look at drawbacks of minimizing weighted sums of objectives
for Pareto set generation in multicriteria optimization problems, Structural Optim., 14
(1997), pp. 63–69.
[3] H. Eschenauer, J. Koski, and A. Osyczka, Multicriteria Design Optimization, Springer-
Verlag, Berlin, 1990.
[4] R. B. Statnikov and J. B. Matusov, Multicriteria Optimization and Engineering, Chapman
and Hall, New York, 1995.
[5] J. G. Lin, Three methods for determining Pareto-optimal solutions of multiple-objective prob-
lems, in Directions in Large-Scale Systems, Y. C. Ho and S. K. Mitter, eds., Plenum Press,
New York, 1975, pp. 117–138.
[6] J. Rakowska, R. T. Haftka, and L. T. Watson, Tracing the efficient curve for multi-
objective control-structure optimization, Comput. Systems Engrg., 2 (1991), pp. 461–471.
[7] J. R. Rao and P. Y. Papalambros, A non-linear programming continuation strategy for
one parameter design optimization problems, in Proceedings of ASME Design Automation
Conference, Montreal, Quebec, Canada, Sept. 17–20, 1989, pp. 77–89.
[8] B. N. Lundberg and A. B. Poore, Bifurcations and sensitivity in parametric programming,
in Proceedings of Third Air Force/NASA Symposium on Recent Advances in Multidisci-
plinary Analysis and Optimization, San Francisco, CA, Sept. 24–26, 1990, pp. 50–55.
[9] F. W. Gembicki, Performance and Sensitivity Optimization: A Vector Index Approach, Ph.D.
Thesis, Dept. of Systems Engineering, Case Western Reserve University, Cleveland, OH,
1974.
[10] A. A. Schy and D. P. Giesy, Tradeoff Studies in Multiobjective Insensitive Design of Airplane
Control Systems, in Proc. of the AIAA Guidance and Control Conference, Gatlinburg, TN,
Aug. 15–17, 1983.
[11] A. A. Schy and D. P. Giesy, Multiobjective Insensitive Design of Airplane Control Systems
with Uncertain Parameters, in Proc. of the AIAA Guidance and Control Conference, Al-
buquerque, NM, Aug. 19–21, 1981.
[12] A. A. Schy, D. P. Giesy, and K. J. Johnson, Pareto-optimal multi-objective design of airplane
control systems, in Proceedings of the 1980 Joint Automatic Control Conference, American
Automatic Control Council, ASME, Vol. 1, New York, 1980.
[13] A. A. Schy and D. P. Giesy, Multicriteria optimization methods for design of aircraft control
systems, in Multicriteria Optimization in Engineering and in the Sciences, W. Stadler, ed.,
Plenum Press, New York, 1988.
[14] J. Koski, Multicriteria truss optimization, in Multicriteria Optimization in Engineering and in
the Sciences, W. Stadler, ed., Plenum Press, New York, 1988.
[15] Y. Haimes, W. Hall, and H. Freedman, Multiobjective Optimization in Water Resources
Systems, Elsevier Scientific Publishing Co., Amsterdam, 1975.
[16] J. G. Lin, Multiple-objective problems: Pareto-optimal solutions by method of proper equality
constraints, IEEE Trans. Automat. Control, AC-21 (1976), pp. 641–650.
[17] G. A. Katopis and J. G. Lin, Non-inferiority of controls under double performance objectives:
Minimal time and minimal energy, in Proceedings of 7th Hawaii Int. Conf. Syst. Sci.,
Honolulu, HI, Jan. 1974, pp. 129–131.
[18] J. G. Lin, Circuit design under multiple performance objectives, in Proceedings of the 1974
IEEE Int. Symp. Circuits and Systems, San Francisco, CA, Apr. 1974, pp. 549–552.
