
EE364b Prof. S. Boyd

EE364b Homework 2
1. Subgradient optimality conditions for nondifferentiable inequality constrained optimiza-
tion. Consider the problem
minimize f0 (x)
subject to fi (x) ≤ 0, i = 1, . . . , m,
with variable x ∈ Rⁿ. We do not assume that f0, . . . , fm are convex. Suppose that x̃
and λ̃ ⪰ 0 satisfy primal feasibility,
fi(x̃) ≤ 0, i = 1, . . . , m,
dual feasibility,
0 ∈ ∂f0(x̃) + ∑_{i=1}^m λ̃i ∂fi(x̃),
and the complementarity condition
λ̃i fi(x̃) = 0, i = 1, . . . , m.
Show that x̃ is optimal, using only a simple argument and the definition of subgradient.
Recall that we do not assume the functions f0 , . . . , fm are convex.
Solution. Let g be defined by g(x) = f0(x) + ∑_{i=1}^m λ̃i fi(x). Then 0 ∈ ∂g(x̃). By
definition of subgradient, this means that for any y,
g(y) ≥ g(x̃) + 0ᵀ(y − x̃).
Thus, for any y,
f0(y) ≥ f0(x̃) − ∑_{i=1}^m λ̃i (fi(y) − fi(x̃)).

For each i, complementarity implies that either λ̃i = 0 or fi(x̃) = 0. Hence, for any
feasible y (for which fi (y) ≤ 0), each λ̃i (fi (y) − fi (x̃)) term is either zero or negative.
Therefore, any feasible y also satisfies f0 (y) ≥ f0 (x̃), and x̃ is optimal.
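The argument can be sanity-checked numerically on a small instance of our own (not part of the assignment): minimize f0(x) = |x| subject to f1(x) = 1 − x ≤ 0, with candidate point x̃ = 1 and multiplier λ̃ = 1.

```python
import numpy as np

# A tiny instance for illustration: minimize f0(x) = |x|
# subject to f1(x) = 1 - x <= 0.  The optimum is x~ = 1.
f0 = lambda x: abs(x)
f1 = lambda x: 1.0 - x

xt, lam = 1.0, 1.0          # candidate point x~ and multiplier lambda~

# At x~ = 1: df0(x~) = {1} and df1(x~) = {-1}, so with lam = 1:
assert f1(xt) <= 0                    # primal feasibility
assert 1.0 + lam * (-1.0) == 0.0      # dual feasibility: 0 in df0 + lam*df1
assert lam * f1(xt) == 0.0            # complementarity

# The conclusion of the proof: f0(y) >= f0(x~) for every feasible y (y >= 1).
ys = np.linspace(1.0, 10.0, 1000)
assert min(f0(y) for y in ys) >= f0(xt)
```

Note that neither f0 nor the argument uses convexity of the constraint set beyond the three conditions themselves.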
2. Optimality conditions and coordinate-wise descent for ℓ1 -regularized minimization. We
consider the problem of minimizing
φ(x) = f(x) + λ‖x‖₁,
where f : Rⁿ → R is convex and differentiable, and λ ≥ 0. The number λ is the
regularization parameter, and is used to control the trade-off between small f and
small ‖x‖₁. When ℓ1-regularization is used as a heuristic for finding a sparse x for
which f(x) is small, λ controls (roughly) the trade-off between f(x) and the cardinality
(number of nonzero elements) of x.

(a) Show that x = 0 is optimal for this problem (i.e., minimizes φ) if and only if
‖∇f(0)‖∞ ≤ λ. In particular, for λ ≥ λmax = ‖∇f(0)‖∞, ℓ1 regularization yields
the sparsest possible x, the zero vector.
Remark. The value λmax gives a good reference point for choosing a value of the
penalty parameter λ in ℓ1 -regularized minimization. A common choice is to start
with λ = λmax /2, and then adjust λ to achieve the desired sparsity/fit trade-off.
Solution. A necessary and sufficient condition for optimality of x = 0 is that
0 ∈ ∂φ(0). Now ∂φ(0) = ∇f(0) + λ∂‖0‖₁ = ∇f(0) + λ[−1, 1]ⁿ. In other words,
x = 0 is optimal if and only if −∇f(0) ∈ [−λ, λ]ⁿ, which is equivalent to ‖∇f(0)‖∞ ≤ λ.
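For a concrete check (our own example, not part of the assignment): with f(x) = ‖Ax − b‖₂² we have ∇f(0) = −2Aᵀb, so λmax = 2‖Aᵀb‖∞. A proximal-gradient (ISTA) solver should then return x = 0 exactly whenever λ ≥ λmax. A NumPy sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((40, 20))
b = rng.standard_normal(40)

# f(x) = ||Ax - b||_2^2, so grad f(0) = -2 A^T b and lambda_max = 2 ||A^T b||_inf.
lam_max = 2 * np.abs(A.T @ b).max()

def ista(lam, steps=2000):
    """Proximal gradient (soft-thresholding) for ||Ax - b||^2 + lam*||x||_1."""
    L = 2 * np.linalg.norm(A, 2) ** 2      # Lipschitz constant of grad f
    x = np.zeros(20)
    for _ in range(steps):
        g = 2 * A.T @ (A @ x - b)
        z = x - g / L
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0)
    return x

assert np.allclose(ista(1.01 * lam_max), 0)   # lam >= lam_max: x = 0 is optimal
assert np.any(ista(0.5 * lam_max) != 0)       # below lam_max: entries activate
```

Starting from x = 0, the very first soft-threshold step already keeps x at zero when λ ≥ λmax, which is exactly the subgradient condition above.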
(b) Coordinate-wise descent. In the coordinate-wise descent method for minimizing
a convex function g, we first minimize over x1 , keeping all other variables fixed;
then we minimize over x2 , keeping all other variables fixed, and so on. After
minimizing over xn , we go back to x1 and repeat the whole process, repeatedly
cycling over all n variables.
Show that coordinate-wise descent fails for the function

g(x) = |x1 − x2 | + 0.1(x1 + x2 ).

(In particular, verify that the algorithm terminates after one step at the point
(x2^(0), x2^(0)), while inf_x g(x) = −∞.) Thus, coordinate-wise descent need not work,
for general convex functions.
Solution. We first minimize over x1, with x2 fixed at x2^(0). The optimal choice is
x1 = x2^(0), since the derivative on the left is −0.9, and on the right it is 1.1. We
then arrive at the point (x2^(0), x2^(0)). We now optimize over x2. But it is already
optimal, with the same left and right derivatives, so x is unchanged. We're now at a
fixed point of the coordinate-descent algorithm.
On the other hand, taking x = (−t, −t) and letting t → ∞, we see that g(x) =
−0.2t → −∞.
It’s good to visualize coordinate-wise descent for this function, to see why x gets
stuck at the crease along x1 = x2 . The graph looks like a folded piece of paper,
with the crease along the line x1 = x2 . The bottom of the crease has a small
tilt in the direction (−1, −1), so the function is unbounded below. Moving along
either axis increases g, so coordinate-wise descent is stuck. But moving in the
direction (−1, −1), for example, decreases the function.
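The stuck iterate is easy to reproduce. The sketch below (ours, not part of the solution) performs exact coordinate minimization of g by scanning a fine grid, and shows the iterate pinned to the crease even though g is unbounded below along (−1, −1):

```python
import numpy as np

def g(x):
    return abs(x[0] - x[1]) + 0.1 * (x[0] + x[1])

def coord_min(x, i, grid):
    """Exactly minimize g over coordinate i by scanning a fine grid."""
    best = min(grid, key=lambda t: g([t if j == i else x[j] for j in range(2)]))
    x = x.copy()
    x[i] = best
    return x

x = np.array([3.0, 1.0])             # start with x2^(0) = 1
grid = np.linspace(-10, 10, 20001)   # fine enough to include the kink at x1 = x2
x = coord_min(x, 0, grid)            # minimize over x1: lands on the crease x1 = x2
x = coord_min(x, 1, grid)            # minimize over x2: no improvement
assert np.allclose(x, [1.0, 1.0], atol=1e-6)   # stuck at (1, 1)

# Yet g is unbounded below in the direction (-1, -1):
assert g([-100, -100]) < g(x)
```

Each one-dimensional scan finds the crease point exactly as the derivation predicts (left derivative −0.9, right derivative 1.1), so no single-coordinate move can escape.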
(c) Now consider coordinate-wise descent for minimizing the specific function φ de-
fined above. Assuming f is strongly convex (say) it can be shown that the iterates
converge to a fixed point x̃. Show that x̃ is optimal, i.e., minimizes φ.
Thus, coordinate-wise descent works for ℓ1 -regularized minimization of a differ-
entiable function.
Solution. For each i, x̃i minimizes the function φ over xi, with all other variables kept
fixed. It follows that
0 ∈ ∂xi φ(x̃) = ∂f/∂xi(x̃) + λIi, i = 1, . . . , n,
where Ii is the subdifferential of | · | at x̃i: Ii = {−1} if x̃i < 0, Ii = {+1} if
x̃i > 0, and Ii = [−1, 1] if x̃i = 0.
But this is the same as saying 0 ∈ ∇f(x̃) + λ∂‖x̃‖₁, which means that x̃ minimizes
φ.
The subtlety here lies in the general formula that relates the subdifferential of
a function to its partial subdifferentials with respect to its components. For a
separable function h : R² → R, we have
∂h(x) = ∂x1 h(x) × ∂x2 h(x),
but this is false in general.
(d) Work out an explicit form for coordinate-wise descent for ℓ1 -regularized least-
squares, i.e., for minimizing the function
‖Ax − b‖₂² + λ‖x‖₁.
You might find the deadzone function
ψ(u) = u − 1 if u > 1,  0 if |u| ≤ 1,  u + 1 if u < −1
useful. Generate some data and try out the coordinate-wise descent method.
Check the result against the solution found using CVX, and produce a graph
showing convergence of your coordinate-wise method.
Solution. At each step we choose an index i, and minimize ‖Ax − b‖₂² + λ‖x‖₁
over xi, while holding all other xj, with j ≠ i, constant.
Selecting the optimal xi for this problem is equivalent to selecting the optimal xi
in the problem
minimize a xi² + c xi + |xi|,
where a = (AᵀA)ii/λ and c = (2/λ)(∑_{j≠i} (AᵀA)ij xj − (Aᵀb)i). Using the theory
discussed above, any minimizer xi will satisfy 0 ∈ 2a xi + c + ∂|xi|. Now we note
that a is positive, so the minimizer of the above problem will have sign opposite
to that of c. From there we deduce that the (unique) minimizer x⋆i will be
x⋆i = 0 if c ∈ [−1, 1], and x⋆i = (1/(2a))(−c + sign(c)) otherwise,
where sign(u) = −1 for u < 0, sign(0) = 0, and sign(u) = 1 for u > 0.

Finally, we make use of the deadzone function ψ defined above and write
x⋆i = −ψ(c) / ((2/λ)(AᵀA)ii),
with c = (2/λ)(∑_{j≠i} (AᵀA)ij xj − (Aᵀb)i) as above.
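The scalar update can be sanity-checked against brute-force grid minimization. A NumPy sketch of our own (not part of the original solution), verifying both the case formula and its deadzone form:

```python
import numpy as np

def psi(u):
    """Deadzone function: u-1 for u > 1, 0 for |u| <= 1, u+1 for u < -1."""
    return np.sign(u) * np.maximum(np.abs(u) - 1, 0)

def xi_star(a, c):
    """Closed-form minimizer of a*x^2 + c*x + |x|, a > 0 (equals -psi(c)/(2a))."""
    return 0.0 if -1 <= c <= 1 else (-c + np.sign(c)) / (2 * a)

grid = np.linspace(-20, 20, 400001)
for a, c in [(0.5, 3.0), (2.0, -0.4), (1.3, -5.2), (0.7, 1.0)]:
    brute = grid[np.argmin(a * grid**2 + c * grid + np.abs(grid))]
    assert abs(brute - xi_star(a, c)) < 1e-3                   # matches brute force
    assert abs(xi_star(a, c) - (-psi(c) / (2 * a))) < 1e-12    # deadzone form agrees
```

The (a, c) pairs exercise all three regimes: c > 1, c in the deadzone, and c < −1.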

Coordinate descent was implemented in Matlab for a random problem instance
with A ∈ R^(400×200). When solving to within 0.1% accuracy, the iterative method
required only a third of the time CVX took. Sample code appears below, followed by
a graph showing the coordinate-wise descent method's function value converging
to the CVX function value.

% Generate a random problem instance.
randn('state', 10239); m = 400; n = 200;
A = randn(m, n); ATA = A'*A;
b = randn(m, 1);
l = 0.1;
TOL = 0.001;
xcoord = zeros(n, 1);

% Solve in CVX as a benchmark.
cvx_begin
    variable xcvx(n);
    minimize(sum_square(A*xcvx - b) + l*norm(xcvx, 1));
cvx_end

% Solve using coordinate-wise descent, sweeping over the coordinates until
% the objective is within TOL of the CVX optimal value.
while abs(cvx_optval - (sum_square(A*xcoord - b) + ...
        l*norm(xcoord, 1)))/cvx_optval > TOL
    for i = 1:n
        % With xcoord(i) zeroed out, c is the scalar from the derivation above.
        xcoord(i) = 0; ei = zeros(n,1); ei(i) = 1;
        c = 2/l*ei'*(ATA*xcoord - A'*b);
        % Closed-form coordinate minimizer: -psi(c) / ((2/l)*ATA(i,i)).
        xcoord(i) = -sign(c)*pos(abs(c) - 1)/(2*ATA(i,i)/l);
    end
end
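For readers without Matlab or CVX, the same sweep translates directly to NumPy. The sketch below (ours; a smaller instance, with a convergence history in place of the CVX benchmark) mirrors the variable names above:

```python
import numpy as np

rng = np.random.default_rng(10239)
m, n, lam = 40, 20, 0.1              # smaller than the Matlab instance
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
ATA = A.T @ A
ATb = A.T @ b

def obj(x):
    """phi(x) = ||Ax - b||_2^2 + lam * ||x||_1."""
    return np.sum((A @ x - b) ** 2) + lam * np.abs(x).sum()

x = np.zeros(n)
history = [obj(x)]
for sweep in range(50):
    for i in range(n):
        x[i] = 0.0                                   # so ATA[i] @ x sums over j != i
        c = (2 / lam) * (ATA[i] @ x - ATb[i])
        # Closed-form coordinate minimizer: -psi(c) / ((2/lam) * (A^T A)_ii).
        x[i] = -np.sign(c) * max(abs(c) - 1, 0) / ((2 / lam) * ATA[i, i])
    history.append(obj(x))

# Exact coordinate minimization never increases the objective.
assert all(h2 <= h1 + 1e-9 for h1, h2 in zip(history, history[1:]))
```

Because each inner step is an exact minimization over one coordinate, the objective is nonincreasing sweep by sweep, and by part (c) the fixed point it settles at is a global minimizer.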

[Figure: convergence of the coordinate-wise descent function value to the CVX value;
vertical axis from 10^1 down to 10^-7 (log scale), horizontal axis 0 to 30 iterations.]
