
Athens University of Economics and Business
Department of Informatics

Convex Optimization

Assignment 1

Salis Charalampos
1 Exercise 2.1

We need to show that if x1, ..., xk ∈ C, θi ≥ 0 ∀i and ∑_{i=1}^{k} θi = 1, then ∑_{i=1}^{k} θi xi ∈ C for an arbitrary k. We will use induction:

• For k = 2, we have the trivial case, since if C is convex, then from the definition
of convexity we get:

θx + (1 − θ)y ∈ C, ∀x, y ∈ C and θ ∈ [0, 1]

• (Induction Hypothesis) Suppose that for an arbitrary k, if x1, ..., xk ∈ C and ∑_{i=1}^{k} θi = 1, with θi ≥ 0 ∀i, then ∑_{i=1}^{k} θi xi ∈ C.

• Using the induction hypothesis, we will prove that if x1, ..., xk+1 ∈ C and ∑_{i=1}^{k+1} θi = 1, with θi ≥ 0 ∀i, then ∑_{i=1}^{k+1} θi xi ∈ C. We will try to "down-scale" the combination to k points. If θ1 = 1, the claim is trivial, so assume θ1 < 1. We normalize each coefficient for i ≥ 2 by the term 1 − θ1, i.e. we construct the new coefficients θ̃i = θi / (1 − θ1). Then, we construct the new point ∑_{i=2}^{k+1} θ̃i xi, which is a convex combination of the k points x2, ..., xk+1 ∈ C, since θ̃i ≥ 0 and

∑_{i=2}^{k+1} θ̃i = (∑_{i=2}^{k+1} θi) / (1 − θ1) = (1 − θ1) / (1 − θ1) = 1

Its membership in C is guaranteed by the induction hypothesis. Then, from the definition of convexity, since x1 ∈ C and ∑_{i=2}^{k+1} θ̃i xi ∈ C, we have that the point:

θ1 x1 + (1 − θ1) ∑_{i=2}^{k+1} θ̃i xi = ∑_{i=1}^{k+1} θi xi ∈ C
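As a small numerical sanity check of this result (a sketch only; the unit disc and the random weights below are arbitrary example choices, not part of the exercise), we can verify that a convex combination of points of a convex set stays in the set:

import numpy as np

rng = np.random.default_rng(0)

# Take C to be the unit disc in R^2 (a convex set): draw k points of C and
# a random nonnegative weight vector theta that sums to 1.
k = 5
points = rng.normal(size=(k, 2))
points /= np.maximum(1.0, np.linalg.norm(points, axis=1, keepdims=True))
theta = rng.dirichlet(np.ones(k))

combination = theta @ points
print(np.linalg.norm(combination) <= 1.0)   # True: the combination stays in C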

2 Exercise 2.2

2.1 Convexity

· 1st Part: Assume that the set C is convex and L is an arbitrary line. Take any x, y ∈ C ∩ L and θ ∈ [0, 1]. From the definition of convexity, θx + (1 − θ)y ∈ C. Moreover, θx + (1 − θ)y lies on the line segment between x and y, which is contained in L, so θx + (1 − θ)y ∈ L. Hence θx + (1 − θ)y ∈ C ∩ L, i.e. C ∩ L is convex.

· 2nd Part: Assume that C ∩ L is convex for every line L, but that C is not convex. Then, from the definition of convexity, there are x, y ∈ C and θ ∈ [0, 1] such that θx + (1 − θ)y ∉ C. Consider the line L passing through x and y. Then x, y ∈ C ∩ L, and since C ∩ L is convex by assumption, θx + (1 − θ)y ∈ C ∩ L ⊆ C, which contradicts θx + (1 − θ)y ∉ C. Hence, if C ∩ L is convex for every line L, then C must also be convex.

2.2 Affineness

· 1st Part: Assume that the set C is affine and L is an arbitrary line. Take any x, y ∈ C ∩ L and θ ∈ R. From the definition of affineness, θx + (1 − θ)y ∈ C. Also, the point θx + (1 − θ)y lies on the line determined by x and y, which coincides with L, so θx + (1 − θ)y ∈ L. Hence C ∩ L is affine.

· 2nd Part: Assume that C ∩ L is affine for every line L, but that C is not affine. Then there are x, y ∈ C and θ ∈ R such that θx + (1 − θ)y ∉ C. Consider the line L passing through x and y. Then x, y ∈ C ∩ L, and since C ∩ L is affine by assumption, θx + (1 − θ)y ∈ C ∩ L ⊆ C, a contradiction. Hence, if C ∩ L is affine for every line L, then C must also be affine.

3 Exercise 2.3

We want to show that ∀x, y ∈ C and θ ∈ [0, 1], it holds that θx + (1 − θ)y ∈ C. Because θ ∈ [0, 1], it has a binary expansion, i.e. the representation θ = ∑_{k=1}^{∞} ak(θ) 2^{-k}, where ak(θ) ∈ {0, 1}. For any finite n, let θn = ∑_{k=1}^{n} ak(θ) 2^{-k} denote the n-th partial sum. Since θn is a dyadic rational in [0, 1], by applying midpoint convexity n times we get that:

θn x + (1 − θn) y ∈ C

From the closedness of C, we can also take the limit as n → ∞ and stay in C, which covers any real θ ∈ [0, 1]:

lim_{n→∞} [θn x + (1 − θn) y] = (∑_{k=1}^{∞} ak(θ) 2^{-k}) x + (1 − ∑_{k=1}^{∞} ak(θ) 2^{-k}) y = θx + (1 − θ)y ∈ C
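The construction above can also be carried out numerically. The sketch below (an illustration only; the points x, y and the value of θ are arbitrary example choices) reproduces θn x + (1 − θn) y using nothing but midpoint operations driven by the binary digits of θ, so the approximation error decays like 2^{-n}:

import numpy as np

def dyadic_midpoint_combination(x, y, theta, n_bits=30):
    # First n_bits binary digits a_k(theta) of theta in [0, 1].
    bits, t = [], theta
    for _ in range(n_bits):
        t *= 2
        bit = int(t >= 1)
        bits.append(bit)
        t -= bit
    # Build theta_n*x + (1 - theta_n)*y from the least to the most significant
    # bit, using only midpoints; each step stays in C by midpoint convexity.
    z = y.copy()
    for bit in reversed(bits):
        z = 0.5 * ((x if bit == 1 else y) + z)
    return z

x, y, theta = np.array([1.0, 0.0]), np.array([0.0, 1.0]), 0.3
approx = dyadic_midpoint_combination(x, y, theta)
print(np.abs(approx - (theta * x + (1 - theta) * y)).max())   # ~1e-9 for 30 bits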

4 Exercise 2.4

Let C be the intersection of all convex sets that contain S. Then, we have to show
that: conv S = C. We will show that both conv S ⊆ C and conv S ⊇ C hold.

· 1st Part: conv S ⊆ C: Take an arbitrary element of conv S, x ∈ conv S. Then, x is a convex combination of elements of S, i.e. x = ∑_{i=1}^{k} θi xi, where xi ∈ S, θi ≥ 0 and ∑_{i=1}^{k} θi = 1. Every convex set that contains S also contains all convex combinations of its elements (Exercise 2.1), so x belongs to every convex set that contains S. Since C is the intersection of all these sets, x ∈ C.

· 2nd Part: conv S ⊇ C: By its definition, conv S is a convex set and it contains S (every x ∈ S is the trivial convex combination 1 · x). Thus, conv S is one of the convex sets over which the intersection C is taken, and an intersection is contained in each of its members. Hence conv S ⊇ C always holds.

Thus, conv S = C.

5 Exercise 2.5

First, we need the Euclidean projection of a point x0 ∈ Rn onto a hyperplane H = {x | aT x = b}. The projection can be found as the solution of the optimization problem:

min (1/2) ||x − x0||²
s.t. aT x = b

Using Lagrange multipliers, we obtain the solution:

x* = x0 + (b − aT x0) a / ||a||²

Now, given two parallel hyperplanes H1 = {x | aT x = b1} and H2 = {x | aT x = b2}, we can calculate their distance by projecting any point of H1 onto H2 (or conversely). Without loss of generality, we consider an arbitrary point x0 ∈ H1 (so aT x0 = b1) and project it onto H2:

x1 = x0 + (b2 − aT x0) a / ||a||² = x0 + (b2 − b1) a / ||a||²

Thus, the distance between the hyperplanes is:

||x1 − x0|| = ||(b2 − b1) a / ||a||²|| = |b2 − b1| ||a|| / ||a||² = |b2 − b1| / ||a||
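As a quick numerical check of the formula (a sketch; the normal vector a and the offsets b1, b2 below are arbitrary example values), we can compare it against an explicit projection:

import numpy as np

a = np.array([1.0, 2.0, -1.0])   # common normal vector of the two hyperplanes
b1, b2 = 3.0, 7.5                # offsets: H1 = {x | a^T x = b1}, H2 = {x | a^T x = b2}

# Closed-form distance derived above.
dist_formula = abs(b2 - b1) / np.linalg.norm(a)

# Check: take a point x0 on H1 and project it onto H2.
x0 = (b1 / (a @ a)) * a                       # satisfies a^T x0 = b1
x1 = x0 + (b2 - a @ x0) * a / (a @ a)         # projection of x0 onto H2
print(dist_formula, np.linalg.norm(x1 - x0))  # both equal |b2 - b1| / ||a||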

6 Exercise 2.7

We want to show that all points x ∈ Rn that are closer to a than to b (with a ≠ b) form a set of the form {x | cT x ≤ d}, where c ≠ 0. We consider any point x ∈ Rn such that:

||x − a||₂ ≤ ||x − b||₂ ⇒ √((x − a)T (x − a)) ≤ √((x − b)T (x − b))
⇒ (x − a)T (x − a) ≤ (x − b)T (x − b)
⇒ (x − a)T x − (x − a)T a ≤ (x − b)T x − (x − b)T b
⇒ xT x − 2aT x + aT a ≤ xT x − 2bT x + bT b
⇒ 2(bT − aT) x ≤ bT b − aT a
⇒ 2(b − a)T x ≤ bT b − aT a

Thus, these points have the form: {x|cT x ≤ d}, where c = 2(b − a) and
d = bT b − aT a.
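A small numerical check of this characterization (a sketch; the points a, b and the random test points below are arbitrary example values):

import numpy as np

rng = np.random.default_rng(0)
a, b = np.array([0.2, 0.3]), np.array([0.9, 0.7])
c, d = 2 * (b - a), b @ b - a @ a

# For random test points, "closer to a than to b" must agree with c^T x <= d.
for x in rng.uniform(-2, 2, size=(1000, 2)):
    assert (np.linalg.norm(x - a) <= np.linalg.norm(x - b)) == (c @ x <= d)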

6.1 Graphical Representation

The code used to produce the plot is located in the Appendix. Given two points a and b, the blue line depicts the hyperplane {x | 2(b − a)T x = bT b − aT a}. The region on the side of this hyperplane that contains a is the halfspace calculated in the exercise, i.e. the set of points closer to a than to b.

7 Exercise 2.11

7.1 2-D Case

Consider two points x and x̃ of the hyperbolic set H = {x ∈ R² | x1 x2 ≥ 1, x1, x2 ≥ 0}. We want to show that θx + (1 − θ)x̃ ∈ H for θ ∈ [0, 1]. We have that:

θx + (1 − θ)x̃ = θ(x1, x2)T + (1 − θ)(x̃1, x̃2)T = (θx1 + (1 − θ)x̃1, θx2 + (1 − θ)x̃2)T

For this element to belong to the hyperbolic set H, it must hold:

[θx1 + (1 − θ)x̃1] · [θx2 + (1 − θ)x̃2] ≥ 1

Expanding the product, we have:

[θx1 + (1 − θ)x̃1] · [θx2 + (1 − θ)x̃2] = θ²x1x2 + (1 − θ)²x̃1x̃2 + θ(1 − θ)x1x̃2 + θ(1 − θ)x̃1x2
= θ²x1x2 + (1 − θ)²x̃1x̃2 + θ(1 − θ)(x1x̃2 + x̃1x2)

Since x1, x2, x̃1, x̃2 ≥ 0 with x1 x2 ≥ 1 and x̃1 x̃2 ≥ 1, the AM-GM inequality applied to the cross term gives:

x1 x̃2 + x̃1 x2 ≥ 2√(x1 x2 x̃1 x̃2) ≥ 2

Therefore:

θ²x1x2 + (1 − θ)²x̃1x̃2 + θ(1 − θ)(x1x̃2 + x̃1x2) ≥ θ² + (1 − θ)² + 2θ(1 − θ) = [θ + (1 − θ)]² = 1

because θ ∈ [0, 1] ⇒ (1 − θ) ∈ [0, 1], which implies that θ(1 − θ) ≥ 0. Hence θx + (1 − θ)x̃ ∈ H.

7.2 General Case

We now consider again two arbitrary points x and x̃ of H = {x ∈ Rn | ∏_{i=1}^{n} xi ≥ 1, xi ≥ 0 ∀i}. For the convex combination θx + (1 − θ)x̃ to belong to H, it must hold that:

∏_{i=1}^{n} [θxi + (1 − θ)x̃i] ≥ 1

By applying the inequality provided in the problem statement (a^θ b^{1−θ} ≤ θa + (1 − θ)b for a, b ≥ 0 and θ ∈ [0, 1]) to each factor, we obtain that:

∏_{i=1}^{n} [θxi + (1 − θ)x̃i] ≥ ∏_{i=1}^{n} xi^θ x̃i^{1−θ} = (∏_{i=1}^{n} xi)^θ (∏_{i=1}^{n} x̃i)^{1−θ} ≥ 1^θ · 1^{1−θ} = 1
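A numerical sanity check of the general case (a sketch; the dimension and the sampling scheme below are arbitrary choices):

import numpy as np

rng = np.random.default_rng(1)
n = 4

def sample_H(n):
    # Draw a positive vector and rescale it so that the product of its entries is >= 1.
    x = rng.uniform(0.2, 3.0, n)
    return x / np.prod(x) ** (1 / n) * rng.uniform(1.0, 1.5)

for _ in range(1000):
    x, y = sample_H(n), sample_H(n)
    theta = rng.uniform()
    z = theta * x + (1 - theta) * y
    assert np.prod(z) >= 1 - 1e-9    # the convex combination stays in H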

8 Exercise 2.12

(a) Slab: It is convex, because it is the intersection of the two halfspaces H1 = {x | aT x ≤ β} and H2 = {x | −aT x ≤ −α}, and the intersection of two convex sets is convex.

(b) Rectangle: Following the same rationale as in (a), we can think of a rectangle as the intersection of finitely many halfspaces (two per coordinate), i.e. of finitely many convex sets. Hence, it is also convex.

(c) Wedge: A wedge is the intersection of the two halfspaces H1 = {x | a1T x ≤ b1} and H2 = {x | a2T x ≤ b2}. Thus, it is convex.

(d) In Exercise 2.7 we derived the Voronoi description of a halfspace. In this case, we have to consider every point y of the set S instead of just one. For any fixed y ∈ S, the set {x | ||x − x0||₂ ≤ ||x − y||₂} is a halfspace; the set of points closer to x0 than to S is the intersection of all these halfspaces over y ∈ S, which is convex.

(e) If S and T each consist of only one point, then the set is convex (Exercise 2.7). In the general case, the set is not convex, and we will prove it using a counter-example. Consider the 1-dimensional case, with T = {0} and S consisting of two points symmetric about T: S = {−x0, x0}, where x0 ∈ R+. Then, ∀x ∈ R, we have:

dist(x, S) = min(|x − x0|, |x + x0|)
dist(x, T) = |x|

Hence, the points closer to S than to T form the set:

{x | dist(x, S) ≤ dist(x, T)} = {x | |x| ≥ x0/2} = (−∞, −x0/2] ∪ [x0/2, +∞)

This is the union of two disjoint rays: it contains the points −x0 and x0, but not their midpoint 0, so it is not convex. (Recall that the union of two convex sets is not necessarily convex.)
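Numerically (a sketch with the arbitrary choice x0 = 1):

import numpy as np

x0 = 1.0
S, T = np.array([-x0, x0]), np.array([0.0])

def dist(x, A):
    return np.min(np.abs(x - A))

# x0 and -x0 are closer to S than to T, but their midpoint 0 is not.
for x in (-x0, 0.0, x0):
    print(x, dist(x, S) <= dist(x, T))    # True, False, True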

(f) From the element-wise meaning of set addition, x + S2 ⊆ S1 means that x + x0 ∈ S1 for every x0 ∈ S2. Hence:

{x | x + S2 ⊆ S1} = ∩_{x0 ∈ S2} {x | x + x0 ∈ S1} = ∩_{x0 ∈ S2} (S1 − x0)

Since S1 is convex (as given in the problem statement), each translate S1 − x0 is convex, and the intersection of convex sets is convex. Hence {x | x + S2 ⊆ S1} is convex.

(g) Following the same rationale as in Exercise 2.7, for 0 ≤ θ ≤ 1 and a ≠ b:

||x − a||₂ ≤ θ||x − b||₂ ⇒ (x − a)T (x − a) ≤ θ² (x − b)T (x − b)
⇒ xT x − 2aT x + aT a ≤ θ² xT x − 2θ² bT x + θ² bT b
⇒ (1 − θ²) xT x − 2(a − θ² b)T x + aT a − θ² bT b ≤ 0

If θ = 1, the quadratic terms cancel and this is obviously a halfspace, which is convex. If θ < 1, we can divide by 1 − θ² > 0 and complete the square:

(1 − θ²) xT x − 2(a − θ² b)T x + aT a − θ² bT b ≤ 0

⇒ xT x − 2 (a − θ² b)T x / (1 − θ²) ≤ (θ² bT b − aT a) / (1 − θ²)

⇒ || x − (a − θ² b)/(1 − θ²) ||² ≤ (θ² bT b − aT a) / (1 − θ²) + || (a − θ² b)/(1 − θ²) ||²

which describes a ball with center (a − θ² b)/(1 − θ²) and, thus, a convex set.
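A numerical check that the original condition and the ball description agree (a sketch; the vectors a, b, the value θ = 0.5 and the random test points are arbitrary example choices):

import numpy as np

rng = np.random.default_rng(2)
a, b, theta = np.array([1.0, 0.0]), np.array([-1.0, 2.0]), 0.5

# Ball parameters read off from the completed square above.
center = (a - theta**2 * b) / (1 - theta**2)
radius_sq = (theta**2 * (b @ b) - a @ a) / (1 - theta**2) + center @ center

for x in rng.uniform(-4.0, 4.0, size=(1000, 2)):
    in_set = np.linalg.norm(x - a) <= theta * np.linalg.norm(x - b)
    in_ball = (x - center) @ (x - center) <= radius_sq
    assert in_set == in_ball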

9 Exercise 2.15

(a) E[f(x)] = ∑_{i=1}^{n} pi f(ai). Thus, α ≤ E[f(x)] ≤ β becomes:

α ≤ ∑_{i=1}^{n} pi f(ai) ≤ β

which can be considered as the intersection of two linear inequalities in p and, thus, defines a convex set.
(b) prob(x > α) ≤ β ⇒ ∑_{i: ai > α} pi ≤ β, which is also a linear inequality in p and, thus, defines a convex set.

(c) E(|x|³) ≤ α E(|x|) ⇒ E(|x|³) − α E(|x|) ≤ 0. From the linearity of expectation:

E(|x|³) − α E(|x|) ≤ 0 ⇒ ∑_{i=1}^{n} pi (|ai|³ − α|ai|) ≤ 0

which also defines a linear inequality in terms of p.


(d) E(x²) ≤ α ⇒ ∑_{i=1}^{n} pi ai² ≤ α

which also defines a linear inequality in p.

(e) E(x²) ≥ α ⇒ ∑_{i=1}^{n} pi ai² ≥ α

which also defines a linear inequality in p.


(f) Var(x) ≤ α ⇒ E(x²) − [E(x)]² = ∑_{i=1}^{n} pi ai² − (∑_{i=1}^{n} pi ai)² ≤ α.

The squared term makes this constraint non-convex in p in general, as the following counter-example shows. Consider the variance of a random variable with only two possible outcomes, (a1, a2)T = (1, 0)T, and take α = 0. For p = (1, 0)T we have Var(x) = 0, and for p = (0, 1)T we also have Var(x) = 0, so Var(x) ≤ 0 holds for both. However, for their convex combination p = (1/2, 1/2)T we get Var(x) = 1/2 − 1/4 = 1/4 > 0. Hence the set of distributions with Var(x) ≤ α is not convex.
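The counter-example can be verified directly (a sketch reproducing the numbers above):

import numpy as np

a = np.array([1.0, 0.0])                  # the two outcomes a1, a2

def var(p):
    return p @ a**2 - (p @ a) ** 2

p1, p2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
print(var(p1), var(p2), var(0.5 * p1 + 0.5 * p2))   # 0.0 0.0 0.25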
(g) Var(x) ≥ α ⇒ ∑_{i=1}^{n} pi ai² − (∑_{i=1}^{n} pi ai)² ≥ α ⇒ −∑_{i=1}^{n} pi ai² + (∑_{i=1}^{n} pi ai)² ≤ −α

From linear algebra, we know that:

−∑_{i=1}^{n} pi ai² + (∑_{i=1}^{n} pi ai)² = −bT p + pT A p

where b = (a1², ..., an²)T and the matrix A is the outer product A = a aT. A is positive semidefinite, because for every vector u ∈ Rn it holds that:

(aT u)² ≥ 0 ⇒ uT a aT u ≥ 0

Then:

Var(x) ≥ α ⇒ −bT p + pT A p ≤ −α

Because −bT p is linear and pT A p is a convex quadratic function of p (since A is positive semidefinite), the function −bT p + pT A p is convex. Hence −bT p + pT A p ≤ −α describes a sublevel set of a convex function, which is a convex set.
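An empirical check that Var(x) ≥ α indeed defines a convex set of distributions (a sketch; the outcomes a, the threshold α and the Dirichlet sampling are arbitrary choices):

import numpy as np

rng = np.random.default_rng(3)
a = np.array([0.0, 1.0, 3.0])
alpha = 0.5

def var(p):
    return p @ a**2 - (p @ a) ** 2

# Sample probability vectors with Var(x) >= alpha and check that their
# convex combinations also satisfy Var(x) >= alpha.
feasible = [p for p in rng.dirichlet(np.ones(3), 5000) if var(p) >= alpha]
for _ in range(1000):
    p, q = feasible[rng.integers(len(feasible))], feasible[rng.integers(len(feasible))]
    t = rng.uniform()
    assert var(t * p + (1 - t) * q) >= alpha - 1e-12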

(h) By definition, quartile(x) = inf{β | prob(x ≤ β) ≥ 0.25}. The condition quartile(x) ≥ α holds if and only if prob(x ≤ β) < 0.25 for every β < α. Since x takes values in {a1, ..., an} with a1 < a2 < · · · < an, this is equivalent to:

prob(x < α) = ∑_{i: ai < α} pi < 0.25

which is a (strict) linear inequality in p and, thus, defines a convex set.

(i) Following the same rationale as in (h), the condition quartile(x) ≤ α holds if and only if prob(x ≤ α) ≥ 0.25, i.e.:

∑_{i: ai ≤ α} pi ≥ 0.25

which is also a linear inequality in terms of p and, thus, defines a convex set.

10 Exercise 2.16

We consider two points of S, (x, y1 + y2) and (x̃, ỹ1 + ỹ2), where (x, y1) ∈ S1, (x, y2) ∈ S2, (x̃, ỹ1) ∈ S1 and (x̃, ỹ2) ∈ S2. Then, for θ ∈ [0, 1]:

θ(x, y1 + y2) + (1 − θ)(x̃, ỹ1 + ỹ2) = (θx + (1 − θ)x̃, [θy1 + (1 − θ)ỹ1] + [θy2 + (1 − θ)ỹ2]) ∈ S

because (θx + (1 − θ)x̃, θy1 + (1 − θ)ỹ1) ∈ S1 and (θx + (1 − θ)x̃, θy2 + (1 − θ)ỹ2) ∈ S2, which follows from the convexity of S1 and S2 respectively. Hence S is convex.
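A concrete numerical illustration (a sketch; taking S1 and S2 to be the unit discs centred at (0, 1) and (0, −1) is an arbitrary example choice, for which the partial sum S can be described explicitly):

import numpy as np

rng = np.random.default_rng(4)

# For these discs and a given x with |x| <= 1, the attainable sums y1 + y2 fill
# the interval [-2*sqrt(1 - x^2), 2*sqrt(1 - x^2)], so S is the ellipse
# x^2 + y^2/4 <= 1.
def in_S(p, tol=1e-9):
    x, y = p
    return x**2 + (y / 2)**2 <= 1 + tol

def sample_S():
    x = rng.uniform(-1, 1)
    return np.array([x, rng.uniform(-1, 1) * 2 * np.sqrt(1 - x**2)])

for _ in range(1000):
    p, q, t = sample_S(), sample_S(), rng.uniform()
    assert in_S(t * p + (1 - t) * q)     # the convex combination stays in S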

11 Appendix

import numpy as np
import matplotlib.pyplot as plt

a = np.array([0.2, 0.3])
b = np.array([0.9, 0.7])

plt.xlim(0, 1.5)
plt.ylim(0, 1)
plt.scatter(a[0], a[1], color="red", label="a")
plt.scatter(b[0], b[1], color="orange", label="b")

def euclidean_norm(x):
    return np.sqrt(x[0]**2 + x[1]**2)

x_grid = np.arange(0.0, 1.4, 0.01)
y_grid = np.arange(0.0, 1.5, 0.01)

# Collect the grid points that are (approximately) equidistant from a and b;
# these lie on the separating hyperplane {x | 2(b - a)^T x = b^T b - a^T a}.
hyperplane_points = []
epsilon = 0.005

for x_value in x_grid:
    for y_value in y_grid:
        point = np.array([x_value, y_value])
        if np.abs(euclidean_norm(a - point) - euclidean_norm(b - point)) < epsilon:
            hyperplane_points.append(point)

# Draw the line segment through the first and last equidistant points found.
p_first = hyperplane_points[0]
p_last = hyperplane_points[-1]
plt.plot([p_first[0], p_last[0]], [p_first[1], p_last[1]], color="blue", label="Hyperplane")

plt.legend()
plt.show()
