Generalized Nonconvex Nonsmooth Low-Rank Minimization
Abstract

As surrogate functions of the L0-norm, many nonconvex penalty functions have been proposed to enhance sparse vector recovery, and it is natural to extend these nonconvex penalty functions to the singular values of a matrix to enhance low-rank matrix recovery. However, different from convex optimization, solving the nonconvex low-rank minimization problem is much more challenging than the nonconvex sparse minimization problem. We observe that all the existing nonconvex penalty functions are concave and monotonically increasing on [0, ∞). Thus their gradients are decreasing functions. Based on this property, we propose an Iteratively Reweighted Nuclear Norm (IRNN) algorithm to solve the nonconvex nonsmooth low-rank minimization problem. IRNN iteratively solves a Weighted Singular Value Thresholding (WSVT) problem. By setting the weight vector as the gradient of the concave penalty function, the WSVT problem has a closed form solution. In theory, we prove that IRNN decreases the objective function value monotonically, and any limit point is a stationary point. Extensive experiments on both synthetic data and real images demonstrate that IRNN enhances the low-rank matrix recovery compared with state-of-the-art convex algorithms.

Figure 1: Plots of the penalty gλ(θ) and its supergradient ∂gλ(θ) for (a) Lp Penalty [11], (b) SCAD Penalty [10], (c) Logarithm Penalty [12], (d) MCP Penalty [23], (e) Capped L1 Penalty [24], (f) ETP Penalty [13], (g) Geman Penalty [15], (h) Laplace Penalty [21].
1. Introduction

In this work, we consider the following nonconvex nonsmooth low-rank minimization problem

    min_{X ∈ R^{m×n}} F(X) = Σ_{i=1}^m gλ(σi(X)) + f(X),    (1)

where σi(X) denotes the i-th singular value of X ∈ R^{m×n} (we assume m ≤ n in this work). The penalty function gλ and the loss function f satisfy the following assumptions:

A1 gλ : R → R+ is continuous, concave and monotonically increasing on [0, ∞). It is possibly nonsmooth.

A2 f : R^{m×n} → R+ is a smooth function of type C^{1,1}, i.e., its gradient is Lipschitz continuous,

    ||∇f(X) − ∇f(Y)||_F ≤ L(f)||X − Y||_F,    (2)

for any X, Y ∈ R^{m×n}, where L(f) > 0 is called the Lipschitz constant of ∇f. f(X) is possibly nonconvex.

A3 F(X) → ∞ iff ||X||_F → ∞.
Table 1: Popular nonconvex surrogate functions of ||θ||_0 and their supergradients.

Penalty | Formula gλ(θ), θ ≥ 0, λ > 0 | Supergradient ∂gλ(θ)
Lp [11] | λθ^p | ∞ if θ = 0; λpθ^{p−1} if θ > 0
SCAD [10] | λθ if θ ≤ λ; (−θ² + 2γλθ − λ²)/(2(γ−1)) if λ < θ ≤ γλ; λ²(γ+1)/2 if θ > γλ | λ if θ ≤ λ; (γλ − θ)/(γ−1) if λ < θ ≤ γλ; 0 if θ > γλ
Logarithm [12] | λ/log(γ+1) · log(γθ + 1) | γλ/((γθ + 1) log(γ+1))
MCP [23] | λθ − θ²/(2γ) if θ < γλ; γλ²/2 if θ ≥ γλ | λ − θ/γ if θ < γλ; 0 if θ ≥ γλ
Capped L1 [24] | λθ if θ < γ; λγ if θ ≥ γ | λ if θ < γ; [0, λ] if θ = γ; 0 if θ > γ
ETP [13] | λ/(1 − exp(−γ)) · (1 − exp(−γθ)) | λγ/(1 − exp(−γ)) · exp(−γθ)
Geman [15] | λθ/(θ + γ) | λγ/(θ + γ)²
Laplace [21] | λ(1 − exp(−θ/γ)) | (λ/γ) exp(−θ/γ)
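For concreteness, the following is a minimal NumPy sketch of a few of these penalties and their supergradients, written directly from the formulas in Table 1 (the function names are ours, not taken from the authors' released code):

```python
import numpy as np

def g_lp(theta, lam, p):
    """Lp penalty: g(theta) = lam * theta**p for theta >= 0."""
    return lam * np.asarray(theta, dtype=float) ** p

def supergrad_lp(theta, lam, p):
    """Supergradient of the Lp penalty: lam*p*theta**(p-1) for theta > 0, +inf at theta = 0."""
    theta = np.asarray(theta, dtype=float)
    out = np.full_like(theta, np.inf)
    pos = theta > 0
    out[pos] = lam * p * theta[pos] ** (p - 1.0)
    return out

def g_mcp(theta, lam, gamma):
    """MCP penalty: lam*theta - theta^2/(2*gamma) if theta < gamma*lam, else gamma*lam^2/2."""
    theta = np.asarray(theta, dtype=float)
    return np.where(theta < gamma * lam,
                    lam * theta - theta ** 2 / (2.0 * gamma),
                    0.5 * gamma * lam ** 2)

def supergrad_mcp(theta, lam, gamma):
    """Supergradient of MCP: lam - theta/gamma if theta < gamma*lam, else 0."""
    theta = np.asarray(theta, dtype=float)
    return np.where(theta < gamma * lam, lam - theta / gamma, 0.0)

def g_scad(theta, lam, gamma):
    """SCAD penalty (gamma > 2), piecewise as in Table 1."""
    theta = np.asarray(theta, dtype=float)
    mid = (-theta ** 2 + 2 * gamma * lam * theta - lam ** 2) / (2.0 * (gamma - 1))
    return np.where(theta <= lam, lam * theta,
                    np.where(theta <= gamma * lam, mid, lam ** 2 * (gamma + 1) / 2.0))

def supergrad_scad(theta, lam, gamma):
    """Supergradient of SCAD: lam, then (gamma*lam - theta)/(gamma - 1), then 0."""
    theta = np.asarray(theta, dtype=float)
    return np.where(theta <= lam, lam,
                    np.where(theta <= gamma * lam, (gamma * lam - theta) / (gamma - 1.0), 0.0))
```

The Logarithm, Capped L1, ETP, Geman, and Laplace rows translate in the same way.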
Many optimization problems in machine learning and computer vision fall into the formulation in (1). As for the choice of f, the squared loss f(X) = (1/2)||A(X) − b||²_F, with a linear mapping A, is widely used. In this case, the Lipschitz constant of ∇f is the spectral radius of A*A, i.e., L(f) = ρ(A*A), where A* is the adjoint operator of A. By choosing gλ(x) = λx, Σ_{i=1}^m gλ(σi(X)) is exactly the nuclear norm λ Σ_{i=1}^m σi(X) = λ||X||_*, and problem (1) reduces to the well-known nuclear norm regularized problem

    min_X λ||X||_* + f(X).    (3)

If f(X) is convex, this is the most widely used convex relaxation of the rank minimization problem

    min_X λ rank(X) + f(X).    (4)

The above low-rank minimization problem arises in many machine learning tasks such as multiple category classification [1], matrix completion [20], multi-task learning [2], and low-rank representation with squared loss for subspace segmentation [18]. However, solving problem (4) is usually difficult, or even NP-hard, so most previous works solve the convex problem (3) instead. It has been proved that, under certain incoherence assumptions on the singular values of the matrix, solving the convex nuclear norm regularized problem leads to a near-optimal low-rank solution [6]. However, such assumptions may be violated in real applications, and the solution obtained by using the nuclear norm may be suboptimal, since the nuclear norm is not a perfect approximation of the rank function. A similar phenomenon has been observed between the convex L1-norm and the nonconvex L0-norm for sparse vector recovery [7].

In order to achieve a better approximation of the L0-norm, many nonconvex surrogate functions of the L0-norm have been proposed, including the Lp-norm [11], Smoothly Clipped Absolute Deviation (SCAD) [10], Logarithm [12], Minimax Concave Penalty (MCP) [23], Capped L1 [24], Exponential-Type Penalty (ETP) [13], Geman [15], and Laplace [21]. Table 1 tabulates these penalty functions and Figure 1 visualizes them. One may refer to [14] for more properties of these penalty functions. Some of these nonconvex penalties have been extended to approximate the rank function, e.g., the Schatten-p norm [19]. Another nonconvex surrogate of the rank function is the truncated nuclear norm [16].

For nonconvex sparse minimization, several algorithms have been proposed to solve problems with a nonconvex regularizer. A common method is DC (Difference of Convex functions) programming [14]. It minimizes the nonconvex function f(x) − (−gλ(x)) based on the assumption that both f and −gλ are convex. In each iteration, DC programming linearizes −gλ(x) at x = x^k and minimizes the relaxed function

    x^{k+1} = arg min_x f(x) − (−gλ(x^k)) − ⟨v^k, x − x^k⟩,    (5)

where v^k is a subgradient of −gλ(x) at x = x^k. DC programming may not be very efficient, since it requires some other iterative algorithm to solve (5). Note that the updating rule (5) of DC programming cannot be extended to solve the low-rank problem (1): for concave gλ, −Σ_{i=1}^m gλ(σi(X)) is not guaranteed to be convex w.r.t. X. DC programming also fails when f is nonconvex in problem (1).

Another solver is the proximal gradient algorithm, which is originally designed for convex problems [3]. It requires computing the proximal operator of gλ,

    P_gλ(y) = arg min_x gλ(x) + (1/2)(x − y)²,    (6)

in each iteration. However, for nonconvex gλ, there may not exist a general solver for (6). Even if (6) is solvable, different from convex optimization, (P_gλ(y1) − P_gλ(y2))(y1 − y2) ≥ 0 does not always hold.
Thus we cannot perform P_gλ(·) on the singular values of Y directly for solving

    P_gλ(Y) = arg min_X Σ_{i=1}^m gλ(σi(X)) + ||X − Y||²_F.    (7)

The nonconvexity of gλ makes the nonconvex low-rank minimization problem much more challenging than the nonconvex sparse minimization problem.

Another related work is the Iteratively Reweighted Least Squares (IRLS) algorithm. It has recently been extended to handle the nonconvex Schatten-p norm penalty [19]. Actually it solves a relaxed smooth problem, which may require many iterations to achieve a low-rank solution, and it cannot solve the general nonsmooth problem (1). The alternating updating algorithm in [16] minimizes the truncated nuclear norm by using a special property of this penalty. It contains two loops, both of which require computing SVDs, so it is not very efficient. It cannot be extended to solve the general problem (1) either.

In this work, all the existing nonconvex surrogate functions of the L0-norm are extended to the singular values of a matrix to enhance low-rank recovery. In problem (1), gλ can be any existing nonconvex penalty function shown in Table 1, or any other function which satisfies the assumption (A1). We observe that all the existing nonconvex surrogate functions are concave and monotonically increasing on [0, ∞). Thus their gradients (or supergradients at the nonsmooth points) are nonnegative and monotonically decreasing. Based on this key fact, we propose an Iteratively Reweighted Nuclear Norm (IRNN) algorithm to solve problem (1). IRNN computes the proximal operator of the weighted nuclear norm, which has a closed form solution due to the nonnegative and monotonically decreasing supergradients. In theory, we prove that IRNN monotonically decreases the objective function value, and that any limit point is a stationary point. To the best of our knowledge, IRNN is the first work which is able to solve the general problem (1) with a convergence guarantee. Note that for nonconvex optimization, it is usually very difficult to prove that an algorithm converges to stationary points. At last, we test our algorithm with several nonconvex penalty functions on both synthetic data and real image data to show the effectiveness of the proposed algorithm.

2. Nonconvex Nonsmooth Low-Rank Minimization

In this section, we present a general algorithm to solve problem (1). To handle the case that gλ is nonsmooth, e.g., the Capped L1 penalty, we need the concept of the supergradient of a concave function.

2.1. Supergradient of a Concave Function

The subgradient of a convex function extends the gradient at a nonsmooth point. Similarly, the supergradient extends the gradient of a concave function at a nonsmooth point. If g(x) is concave and differentiable at x, it is known that

    g(x) + ⟨∇g(x), y − x⟩ ≥ g(y).    (8)

If g(x) is nonsmooth at x, the supergradient extends the gradient at x inspired by (8) [5].

Definition 1 Let g : R^n → R be concave. A vector v is a supergradient of g at the point x ∈ R^n if for every y ∈ R^n the following inequality holds:

    g(x) + ⟨v, y − x⟩ ≥ g(y).    (9)

All supergradients of g at x are called the superdifferential of g at x, denoted as ∂g(x). If g is differentiable at x, ∇g(x) is also a supergradient, i.e., ∂g(x) = {∇g(x)}. Figure 2 illustrates the supergradients of a concave function at both differentiable and nondifferentiable points.

Figure 2: Supergradients of a concave function. v1 is a supergradient at x1, and v2 and v3 are supergradients at x2.

For concave g, −g is convex, and vice versa. From this fact, we have the following relationship between the supergradient of g and the subgradient of −g.

Lemma 1 Let g(x) be concave and h(x) = −g(x). For any v ∈ ∂g(x), u = −v ∈ ∂h(x), and vice versa.

The relationship between the supergradient and the subgradient shown in Lemma 1 is useful for exploring some properties of the supergradient. It is known that the subdifferential of a convex function h is a monotone operator, i.e.,

    ⟨u − v, x − y⟩ ≥ 0,    (10)

for any u ∈ ∂h(x), v ∈ ∂h(y). The superdifferential of a concave function has a similar property, which we call an antimonotone operator in this work.

Lemma 2 The superdifferential of a concave function g is an antimonotone operator, i.e.,

    ⟨u − v, x − y⟩ ≤ 0,    (11)

for any u ∈ ∂g(x), v ∈ ∂g(y).
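The following short NumPy check illustrates Definition 1 and the antimonotone property of Lemma 2 for the Laplace penalty from Table 1 (a sketch; the variable names are ours):

```python
import numpy as np

lam, gamma = 1.0, 2.0
g  = lambda t: lam * (1.0 - np.exp(-t / gamma))    # Laplace penalty, concave on [0, inf)
dg = lambda t: (lam / gamma) * np.exp(-t / gamma)  # its supergradient (here an ordinary gradient)

x = np.linspace(0.0, 6.0, 61)
X, Y = np.meshgrid(x, x)

# Definition 1: g(x) + <v, y - x> >= g(y) with v = dg(x), for every pair (x, y) on the grid.
assert np.all(g(X) + dg(X) * (Y - X) >= g(Y) - 1e-12)

# Lemma 2: <u - v, x - y> <= 0 with u = dg(x), v = dg(y), i.e. the supergradient is decreasing.
assert np.all((dg(X) - dg(Y)) * (X - Y) <= 1e-12)

print("supergradient inequality (9) and antimonotonicity (11) hold on the grid")
```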
This can be easily proved by Lemma 1 and (10).

Lemma 2 is a key lemma in this work. Supposing that the assumption (A1) holds for g(x), (11) indicates that

    u ≥ v, for any u ∈ ∂g(x) and v ∈ ∂g(y),    (12)

when x ≤ y. That is to say, the supergradient of g is monotonically decreasing on [0, ∞). Table 1 lists these usual concave functions and their supergradients, and we visualize them in Figure 1. It can be seen that they all satisfy the assumption (A1). Note that for the Lp penalty, we further define ∂g(0) = {∞}. This does not affect our algorithm or the convergence analysis, as shown later. The Capped L1 penalty is nonsmooth at θ = γ, with the superdifferential ∂gλ(γ) = [0, λ].

2.2. Iteratively Reweighted Nuclear Norm

In this subsection, we show how to solve the general nonconvex and possibly nonsmooth problem (1) based on the assumptions (A1)-(A2). For simplicity of notation, we denote σi = σi(X) and σi^k = σi(X^k).

Since gλ is concave on [0, ∞), by the definition of the supergradient, we have

    gλ(σi) ≤ gλ(σi^k) + wi^k (σi − σi^k),    (13)

where

    wi^k ∈ ∂gλ(σi^k).    (14)

Since σ1^k ≥ σ2^k ≥ · · · ≥ σm^k ≥ 0, by the antimonotone property of the supergradient (12), we have

    0 ≤ w1^k ≤ w2^k ≤ · · · ≤ wm^k.    (15)

This property is important in our algorithm, as shown later. Inequality (13) motivates us to minimize its right hand side instead of gλ(σi). Thus we may solve the following relaxed problem

    X^{k+1} = arg min_X Σ_{i=1}^m [gλ(σi^k) + wi^k (σi − σi^k)] + f(X)
            = arg min_X Σ_{i=1}^m wi^k σi + f(X).    (16)

It seems that updating X^{k+1} by solving the above weighted nuclear norm problem (16) is an extension of the weighted L1-norm problem in the IRL1 algorithm [7] (IRL1 is a special DC programming algorithm). However, the weighted nuclear norm in (16) is nonconvex (it is convex if and only if w1^k ≥ w2^k ≥ · · · ≥ wm^k ≥ 0 [8]), while the weighted L1-norm is convex. Solving the nonconvex problem (16) is much more challenging than the convex weighted L1-norm problem. In fact, it is not easier than solving the original problem (1).

Algorithm 1 Solving problem (1) by IRNN
  Input: μ > L(f), where L(f) is a Lipschitz constant of ∇f(X).
  Initialize: k = 0, X^k, and wi^k, i = 1, · · · , m.
  Output: X^*.
  while not converged do
    1. Update X^{k+1} by solving problem (18).
    2. Update the weights wi^{k+1}, i = 1, · · · , m, by
           wi^{k+1} ∈ ∂gλ(σi(X^{k+1})).    (17)
  end while

Instead of updating X^{k+1} by solving (16), we linearize f(X) at X^k and add a proximal term:

    f(X) ≈ f(X^k) + ⟨∇f(X^k), X − X^k⟩ + (μ/2)||X − X^k||²_F,

where μ > L(f). Such a choice of μ guarantees the convergence of our algorithm, as shown later. Then we update X^{k+1} by solving

    X^{k+1} = arg min_X Σ_{i=1}^m wi^k σi + f(X^k) + ⟨∇f(X^k), X − X^k⟩ + (μ/2)||X − X^k||²_F
            = arg min_X Σ_{i=1}^m wi^k σi + (μ/2)||X − (X^k − (1/μ)∇f(X^k))||²_F.    (18)

Problem (18) is still nonconvex. Fortunately, it has a closed form solution due to (15).

Lemma 3 [8, Theorem 2.3] For any λ > 0, Y ∈ R^{m×n} and 0 ≤ w1 ≤ w2 ≤ · · · ≤ ws (s = min(m, n)), a globally optimal solution to the following problem

    min_X λ Σ_{i=1}^s wi σi(X) + (1/2)||X − Y||²_F    (19)

is given by the weighted singular value thresholding

    X^* = U S_{λw}(Σ) V^T,    (20)

where Y = U Σ V^T is the SVD of Y, and S_{λw}(Σ) = Diag{(Σii − λwi)_+}.

It is worth mentioning that for the Lp penalty, if σi^k = 0, then wi^k ∈ ∂gλ(σi^k) = {∞}. By the updating rule of X^{k+1} in (18), we then have σi^{k+1} = 0. This guarantees that the rank of the sequence {X^k} is nonincreasing.
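A minimal NumPy sketch of the weighted singular value thresholding step (20), which gives the closed form solution of (18) and (19), might look as follows (the function name wsvt is ours; λ is assumed to be absorbed into the weights):

```python
import numpy as np

def wsvt(Y, w):
    """Weighted singular value thresholding (Lemma 3).

    Returns the minimizer of sum_i w_i * sigma_i(X) + 0.5 * ||X - Y||_F^2
    for nonnegative, nondecreasing weights w_1 <= w_2 <= ... <= w_s.
    """
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)  # singular values in decreasing order
    s_shrunk = np.maximum(s - w, 0.0)                 # (sigma_i - w_i)_+ as in (20)
    return (U * s_shrunk) @ Vt                        # U @ diag(s_shrunk) @ Vt
```

Because the weights from (15) are ordered increasingly while the singular values are ordered decreasingly, Lemma 3 applies directly.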
Iteratively updating wi^k, i = 1, · · · , m, by (14) and X^{k+1} by (18) leads to the proposed Iteratively Reweighted Nuclear Norm (IRNN) algorithm. The whole procedure of IRNN is shown in Algorithm 1. If the Lipschitz constant L(f) is not known or not computable, a backtracking rule can be used to estimate μ in each iteration [3].
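Combining the weight update (17) with the weighted singular value thresholding solution of (18), one possible rendering of Algorithm 1 is sketched below (a sketch under stated assumptions, not the authors' released code; grad_f and supergrad are user-supplied callables, mu > L(f), and wsvt is the routine sketched above):

```python
import numpy as np

def irnn(X0, grad_f, supergrad, mu, n_iter=100, tol=1e-5):
    """Iteratively Reweighted Nuclear Norm (a sketch of Algorithm 1).

    X0        : initial m x n matrix.
    grad_f    : callable returning the gradient of the smooth loss f at X.
    supergrad : callable mapping singular values to supergradients of g_lambda,
                e.g. lambda s: supergrad_mcp(s, lam, gamma).
    mu        : proximal parameter with mu > L(f).
    """
    X = X0.copy()
    for _ in range(n_iter):
        # Weights from the supergradient of g_lambda at the current singular values
        # (equations (14)/(17)); they are nondecreasing by (15) because the singular
        # values come back sorted in decreasing order.
        w = supergrad(np.linalg.svd(X, compute_uv=False))

        # Solve (18): a gradient step on f followed by weighted singular value
        # thresholding, with the weights scaled by 1/mu.
        X_new = wsvt(X - grad_f(X) / mu, w / mu)

        if np.linalg.norm(X_new - X) <= tol * max(1.0, np.linalg.norm(X)):
            X = X_new
            break
        X = X_new
    return X
```

The relative-change stopping test is our own simplification of the "while not converged" condition in Algorithm 1.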
3. Convergence Analysis

Third, since wi^k ∈ ∂gλ(σi^k), by the definition of the supergradient, we have

    gλ(σi^k) − gλ(σi^{k+1}) ≥ wi^k (σi^k − σi^{k+1}).    (24)

Now, summing (22), (23) and (24) for i = 1, · · · , m together, we obtain
4. Extension to Other Problems

Our proposed IRNN algorithm can solve a more general low-rank minimization problem,

    min_X Σ_{i=1}^m gi(σi(X)) + f(X),

where each gi is concave on [0, ∞) and the supergradients satisfy 0 ≤ v1 ≤ v2 ≤ · · · ≤ vm for any vi ∈ ∂gi(σi(X)), i = 1, · · · , m. The truncated nuclear norm ||X||_r = Σ_{i=r+1}^m σi(X) [16] satisfies the above assumption. Indeed, ||X||_r = Σ_{i=1}^m gi(σi(X)) by letting

    gi(x) = 0 for i = 1, · · · , r, and gi(x) = x for i = r + 1, · · · , m.    (31)

Their supergradients are

    ∂gi(x) = 0 for i = 1, · · · , r, and ∂gi(x) = 1 for i = r + 1, · · · , m.    (32)

The convergence results in Theorems 1 and 2 also hold, since (24) holds for each gi. Compared with the alternating updating algorithm in [16], which requires double loops, our IRNN algorithm is more efficient and has a stronger convergence guarantee.
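For example, the weight rule (32) for the truncated nuclear norm can be plugged directly into the IRNN sketch above (a hypothetical helper of ours; r is the number of leading singular values left unpenalized):

```python
import numpy as np

def supergrad_truncated(sigma, r):
    """Weights (32) for the truncated nuclear norm ||X||_r: zero for the r largest
    singular values, one for the remaining ones (sigma is sorted decreasingly)."""
    w = np.ones_like(np.asarray(sigma, dtype=float))
    w[:r] = 0.0
    return w

# e.g. X = irnn(X0, grad_f, lambda s: supergrad_truncated(s, r=5), mu)
```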
More generally, IRNN can solve the following problem

    min_X Σ_{i=1}^m g(h(σi(X))) + f(X),    (33)

when g(y) is concave and the following problem

    min_X Σ_{i=1}^m wi h(σi(X)) + ||X − Y||²_F    (34)

can be cheaply solved. An interesting application of (33) is to extend group sparsity to the singular values. By dividing the singular values into k groups, i.e., G1 = {1, · · · , r1}, G2 = {r1 + 1, · · · , r1 + r2}, · · · , Gk = {Σ_{i=1}^{k−1} ri + 1, · · · , m}, where Σ_i ri = m, we can define the group sparsity on the singular values as ||X||_{2,g} = Σ_{i=1}^k g(||σ_{Gi}||_2). This is exactly the first term in (33) by letting h be the L2-norm of a vector. Here g can be a nonconvex function satisfying the assumption (A1) or, specially, the convex absolute value function.

5. Experiments

In this section, we present several experiments on both synthetic data and real images to validate the effectiveness of the IRNN algorithm. We test our algorithm on the matrix completion problem

    min_X Σ_{i=1}^m gλ(σi(X)) + (1/2)||PΩ(X − M)||²_F,    (35)

where Ω is the set of indices of the observed entries, and PΩ : R^{m×n} → R^{m×n} is a linear operator that keeps the entries in Ω unchanged and sets those outside Ω to zero. The gradient of the squared loss function in (35) is Lipschitz continuous with Lipschitz constant L(f) = 1. We set μ = 1.1 in Algorithm 1. For the choice of gλ, we test all the penalty functions listed in Table 1 except Capped L1 and Geman, since we find that their recovery performance is sensitive to the choices of γ and λ in different cases. For the choice of λ in IRNN, we use a continuation technique to enhance the low-rank matrix recovery: the initial value of λ is set to a larger value λ0 and dynamically decreased by λ = η^k λ0 with η < 1, until a predefined target λt is reached. X is initialized as a zero matrix. For the parameters (e.g., p and γ) of the nonconvex penalty functions, we search over a candidate set and use the value which obtains good performance in most cases.¹

¹ Code of IRNN: https://sites.google.com/site/canyilu/.

Figure 3: Comparison of matrix recovery on (a) random data without noise, and (b) random data with noise.
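A sketch of the ingredients of problem (35) as we use them (names are ours; Omega is a boolean mask of observed entries): the loss f(X) = (1/2)||PΩ(X − M)||²_F has gradient PΩ(X − M) and Lipschitz constant 1, so μ = 1.1 is a valid choice.

```python
import numpy as np

def P_Omega(X, Omega):
    """Keep the entries where Omega is True and zero out the rest."""
    return np.where(Omega, X, 0.0)

def make_grad_f(M, Omega):
    """Gradient of f(X) = 0.5 * ||P_Omega(X - M)||_F^2; here L(f) = 1."""
    return lambda X: P_Omega(X - M, Omega)

def lambda_schedule(lambda_0, lambda_t, eta):
    """Continuation on lambda: start at lambda_0 and shrink by eta < 1 until lambda_t."""
    lam = lambda_0
    while lam > lambda_t:
        yield lam
        lam *= eta
```

Each λ in the schedule parameterizes gλ (and hence the supergrad callable passed to irnn), with the solver warm-started from the previous solution.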
5.1. Low-Rank Matrix Recovery

We first compare our nonconvex IRNN algorithm with state-of-the-art convex algorithms on synthetic data. We conduct two experiments: one for an observed matrix M without noise, and one for M with noise.

For the noise-free case, we generate the rank-r matrix M as ML MR, where ML ∈ R^{150×r} and MR ∈ R^{r×150} are generated by the Matlab command randn. 50% of the elements of M are missing uniformly at random. We compare our algorithm with the Augmented Lagrange Multiplier (ALM)² method [17], which solves the noise-free problem

    min_X ||X||_*  s.t.  PΩ(X) = PΩ(M).    (36)

For this task, we set λ0 = ||PΩ(M)||_∞, λt = 10^{−5} λ0, and η = 0.7 in IRNN, and stop the algorithm when ||PΩ(X − M)||_F ≤ 10^{−5}. For ALM, we use the default parameters in the released code. We evaluate the recovery performance by the Relative Error, defined as ||X̂ − M||_F / ||M||_F, where X̂ is the recovered solution of a given algorithm. If the Relative Error is smaller than 10^{−3}, X̂ is regarded as a successful recovery of M.

² Code: http://perception.csl.illinois.edu/matrix-rank/sample_code.html.
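The noise-free protocol just described might be scripted as follows (a sketch reusing the hypothetical helpers irnn, wsvt, make_grad_f, supergrad_mcp, P_Omega, and lambda_schedule from the earlier sketches; the MCP parameter gamma is an illustrative choice and the stopping rule is simplified):

```python
import numpy as np

rng = np.random.default_rng(0)
m = n = 150
r = 25                                        # underlying rank (varied from 20 to 33 in the text)
M = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))
Omega = rng.random((m, n)) < 0.5              # 50% of the entries are observed

lam0 = np.max(np.abs(P_Omega(M, Omega)))      # lambda_0 = ||P_Omega(M)||_inf
X = np.zeros((m, n))
for lam in lambda_schedule(lam0, 1e-5 * lam0, eta=0.7):
    X = irnn(X, make_grad_f(M, Omega),
             lambda s: supergrad_mcp(s, lam, gamma=10.0), mu=1.1, n_iter=50)

rel_err = np.linalg.norm(X - M) / np.linalg.norm(M)
print("Relative Error:", rel_err, "success:", rel_err < 1e-3)
```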
We repeat the experiments 100 times, with the underlying rank r varying from 20 to 33, for each algorithm. The frequency of success is plotted in Figure 3a. The legend IRNN-Lp in Figure 3a denotes the Lp penalty function used in problem (1) and solved by our proposed IRNN algorithm. It can be seen that IRNN with all the nonconvex penalty functions achieves much better recovery performance than the convex ALM algorithm. This is because the nonconvex penalty functions approximate the rank function better than the convex nuclear norm.

For the noisy case, the data are generated by PΩ(M) = PΩ(ML MR) + 0.1 × randn. We compare our algorithm with the convex Accelerated Proximal Gradient with Line search (APGL)³ [20], which solves the noisy problem

    min_X λ||X||_* + (1/2)||PΩ(X) − PΩ(M)||²_F.    (37)

For this task, we set λ0 = 10||PΩ(M)||_∞ and λt = 0.1λ0 in IRNN. All the chosen algorithms are run 100 times with the underlying rank r lying between 15 and 35. The relative errors vary from test to test, and the mean errors of the different methods are plotted in Figure 3b. It can be seen that IRNN with the nonconvex penalties outperforms the convex APGL in the noisy case. Note that we cannot conclude from Figure 3 that IRNN with the Lp, Logarithm and ETP penalty functions always performs better than with SCAD and MCP, since the obtained solutions are not globally optimal.

Figure 4: Comparison of image recovery by using different matrix completion algorithms. (a) Original image. (b) Image with Gaussian noise and text. (c)-(g) Recovered images by APGL, LMaFit, TNNR-ADMM, IRNN-Lp, and IRNN-SCAD, respectively. Best viewed in ×2 sized color pdf file.

5.2. Application to Image Recovery

In this section, we apply matrix completion to image recovery. As shown in Figure 4, a real image may be corrupted by different types of noise, e.g., Gaussian noise or unrelated text. Usually real images are not of low rank, but the top singular values dominate the main information [16]. Thus the corrupted image can be recovered by a low-rank approximation. For color images, which have three channels, we simply apply matrix completion to each channel independently. The well known Peak Signal-to-Noise Ratio (PSNR) is employed to evaluate the recovery performance. We compare IRNN with some other matrix completion algorithms which have been applied to this task, including APGL, Low-Rank Matrix Fitting (LMaFit)⁴ [22] and Truncated Nuclear Norm Regularization (TNNR) [16]. For TNNR we use the ADMM-based solver for its subproblem in the released code (denoted TNNR-ADMM)⁵. We tune the parameters of the chosen algorithms to be as good as possible and report the best results.

In our test, we consider two types of noise on the real images. The first one replaces 50% of the pixels with random values (sample image (1) in Figure 4(b)). The other one adds some unrelated text to the image (sample image (2) in Figure 4(b)). Figure 4(c)-(g) shows the images recovered by the different methods. It can be observed that our IRNN method with different penalty functions achieves much better recovery performance than APGL and LMaFit. Only the results of IRNN-Lp and IRNN-SCAD are shown due to the limit of space. We further test on more images and show the results in Figure 5. Figure 6 shows the PSNR values of the different methods on all the test images. IRNN with all the evaluated nonconvex functions achieves higher PSNR values, which verifies that the nonconvex penalty functions are effective in this situation. The nonconvex truncated nuclear norm is close to our methods, but its running time is 3∼5 times that of ours.

³ Code: http://www.math.nus.edu.sg/~mattohkc/NNLS.html.
⁴ Code: http://lmafit.blogs.rice.edu/.
⁵ Code: https://sites.google.com/site/zjuyaohu/.
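The per-channel completion and PSNR evaluation just described could be sketched as follows (names are ours; img is an H×W×3 array with values in [0, 255], Omega marks the uncorrupted pixels, and irnn, make_grad_f, and supergrad_mcp are the hypothetical helpers from the earlier sketches):

```python
import numpy as np

def complete_color_image(img, Omega, lam, gamma, mu=1.1):
    """Run low-rank matrix completion independently on each color channel."""
    out = np.empty(img.shape, dtype=float)
    for c in range(img.shape[2]):
        channel = img[:, :, c].astype(float)
        out[:, :, c] = irnn(np.zeros(channel.shape), make_grad_f(channel, Omega),
                            lambda s: supergrad_mcp(s, lam, gamma), mu)
    return out

def psnr(recovered, original, peak=255.0):
    """Peak Signal-to-Noise Ratio between the recovered and the original image."""
    mse = np.mean((np.asarray(recovered, float) - np.asarray(original, float)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)
```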
Figure 6: PSNR values of APGL, LMaFit, TNNR-ADMM, IRNN-Lp, IRNN-SCAD, IRNN-Logarithm, IRNN-MCP, and IRNN-ETP on Image (1) through Image (6).

Figure 5: Comparison of image recovery on more images. (a) Original images. (b) Images with noise. Recovered images by (c) APGL and (d) IRNN-Lp. Best viewed in ×2 sized color pdf file.

6. Conclusions and Future Work

In this work, the nonconvex surrogate functions of the L0-norm are extended to the singular values to approximate the rank function. It is observed that all the existing nonconvex surrogate functions are concave and monotonically increasing on [0, ∞). Then a general solver, IRNN, is proposed to solve problem (1) with such penalties. IRNN is the first algorithm which is able to solve the general nonconvex low-rank minimization problem (1) with a convergence guarantee. The nonconvex penalty can be nonsmooth, by using the supergradient at the nonsmooth points. In theory, we proved that any limit point is a stationary point. Experiments on both synthetic data and real images demonstrated that IRNN usually outperforms the state-of-the-art convex algorithms. An interesting future work is to solve the nonconvex low-rank minimization problem with affine constraints. A possible way is to combine IRNN with the Alternating Direction Method of Multipliers (ADMM).

Acknowledgements

This research is supported by the Singapore National Research Foundation under its International Research Centre @Singapore Funding Initiative and administered by the IDM Programme Office. Z. Lin is supported by NSF of China (Grant nos. 61272341, 61231002, and 61121002) and MSRA.

References

[1] Y. Amit, M. Fink, N. Srebro, and S. Ullman. Uncovering shared structures in multiclass classification. In ICML, 2007.
[2] A. Argyriou, T. Evgeniou, and M. Pontil. Convex multi-task feature learning. Machine Learning, 2008.
[3] A. Beck and M. Teboulle. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM Journal on Imaging Sciences, 2009.
[4] D. P. Bertsekas. Nonlinear Programming. Athena Scientific, 2nd edition, 1999.
[5] K. Border. The supergradient of a concave function. http://www.hss.caltech.edu/~kcb/Notes/Supergrad.pdf, 2001. [Online].
[6] E. Candès and T. Tao. The power of convex relaxation: Near-optimal matrix completion. IEEE Transactions on Information Theory, 2010.
[7] E. Candès, M. Wakin, and S. Boyd. Enhancing sparsity by reweighted L1 minimization. Journal of Fourier Analysis and Applications, 2008.
[8] K. Chen, H. Dong, and K. Chan. Reduced rank regression via adaptive nuclear norm penalization. Biometrika, 2013.
[9] F. Clarke. Nonsmooth analysis and optimization. In Proceedings of the International Congress of Mathematicians, 1983.
[10] J. Fan and R. Li. Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 2001.
[11] L. Frank and J. Friedman. A statistical view of some chemometrics regression tools. Technometrics, 1993.
[12] J. Friedman. Fast sparse regression and classification. International Journal of Forecasting, 2012.
[13] C. Gao, N. Wang, Q. Yu, and Z. Zhang. A feasible nonconvex relaxation approach to feature selection. In AAAI, 2011.
[14] G. Gasso, A. Rakotomamonjy, and S. Canu. Recovering sparse signals with a certain family of nonconvex penalties and DC programming. IEEE Transactions on Signal Processing, 2009.
[15] D. Geman and C. Yang. Nonlinear image recovery with half-quadratic regularization. TIP, 1995.
[16] Y. Hu, D. Zhang, J. Ye, X. Li, and X. He. Fast and accurate matrix completion via truncated nuclear norm regularization. TPAMI, 2013.
[17] Z. Lin, M. Chen, L. Wu, and Y. Ma. The augmented Lagrange multiplier method for exact recovery of corrupted low-rank matrices. UIUC Technical Report UILU-ENG-09-2215, 2009.
[18] G. Liu, Z. Lin, S. Yan, J. Sun, Y. Yu, and Y. Ma. Robust recovery of subspace structures by low-rank representation. TPAMI, 2013.
[19] K. Mohan and M. Fazel. Iterative reweighted algorithms for matrix rank minimization. JMLR, 2012.
[20] K. Toh and S. Yun. An accelerated proximal gradient algorithm for nuclear norm regularized linear least squares problems. Pacific Journal of Optimization, 2010.
[21] J. Trzasko and A. Manduca. Highly undersampled magnetic resonance image reconstruction via homotopic L0-minimization. TMI, 2009.
[22] Z. Wen, W. Yin, and Y. Zhang. Solving a low-rank factorization model for matrix completion by a nonlinear successive over-relaxation algorithm. Mathematical Programming Computation, 2012.
[23] C. Zhang. Nearly unbiased variable selection under minimax concave penalty. The Annals of Statistics, 2010.
[24] T. Zhang. Analysis of multi-stage convex relaxation for sparse regularization. JMLR, 2010.