Multispectral Joint Image Restoration Via Optimizing A Scale Map
VOL. 37,
NO. 12,
DECEMBER 2015
INTRODUCTION
X. Shen and J. Jia are with the Department of Computer Science and Engineering, The Chinese University of Hong Kong. Shatin, Hong Kong.
E-mail: {xyshen, leojia}@cse.cuhk.edu.hk.
L. Ma is with the Department of Computer Science and Engineering,
Shanghai Jiao Tong University, Shanghai, China.
E-mail: [email protected].
Q. Yan and L. Xu are with Lenovo R&T.
Manuscript received 3 Mar. 2014; revised 19 Dec. 2014; accepted 10 Mar.
2015. Date of publication 7 Apr. 2015; date of current version 6 Nov. 2015.
Recommended for acceptance by K. Lee.
For information on obtaining reprints of this article, please send e-mail to:
[email protected], and reference the Digital Object Identifier below.
Digital Object Identifier no. 10.1109/TPAMI.2015.2417569
Haar wavelets. In [32] and [23], the detail layer was manipulated differently for RGB and haze image enhancement.
Other image fusion applications that have been explored include
two-image deblurring [30], matting [24], tone mapping [8],
upsampling [14], context enhancement [21], and relighting [2],
to name a few. Bhat et al. [3] proposed GradientShop to edit
gradients, which can also be used to enhance images.
We note existing methods work well for their respective
applications by handling different detail layers or gradients
from multiple images. But in terms of two-image high-quality image restoration, there remain a few major and fundamental issues. We take the RGB-NIR images shown in Fig. 1
as an example to reveal issues generally existing in multispectral images. In this example, an NIR image differs from
the corresponding RGB one in detail distribution and intensity formation. Structure inconsistency on many pixels can
be categorized as follows.
0162-8828 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
SHEN ET AL.: MULTISPECTRAL JOINT IMAGE RESTORATION VIA OPTIMIZING A SCALE MAP
$$\min \|\nabla I^* - s \cdot \nabla G\|. \quad (1)$$
Here $\nabla$ is an operator forming a vector with x- and y-direction gradients. Each element $s_i$ in map $s$, where $i$ indexes
pixels, is a scalar measuring the robust difference between corresponding gradients in the two images. Simply put, $s$ is a
scale or ratio map between the guidance and ground truth.
The optimal $s$ corresponding to the multispectral example
in Fig. 1 is shown in Fig. 2, visualized as a color image after
pixel-wise value normalization to [0,1].
We analyze the unique properties of $s$ with regard to
structure discrepancy between $\nabla G$ and $\nabla I^*$, and present
them as follows with the help of the illustration in Fig. 3.
Property of s. First, the sign of each $s_i$ can be either positive or
negative. A negative $s_i$ means the ground truth edge also exists
in the guidance, but the edge direction is reversed, as demonstrated in Fig. 3c. Second, when the guidance image $G$
contains extra shadows and highlights caused by flash, which
do not exist in $\nabla I^*$, $s_i$ with value 0 can help ignore them.
Finally, $s_i$ can be any value when $\nabla G_i = 0$, that is, when the guidance
edge does not exist (the red letters in Fig. 3a). Under local
smoothness, making $s_i = 0$ is a good choice.
In short, if the $s$ map is constructed optimally, all these structure discrepancy problems can be addressed. We will discuss it more in Section 4.2. This map is the first of its kind for
solving this set of multispectral restoration problems.
An additional and notable benefit is that $s$ acts as a set of latent variables in developing an efficient optimization procedure.
More of the function. Denoting by $I$ our estimate of
$I^*$, Eq. (1) is updated to
$$\min \|\nabla I - s \cdot \nabla G\|. \quad (2)$$
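To make the three properties of $s$ concrete, here is a minimal sketch on 1-D toy signals. The helper names (`forward_diff`, `naive_scale_map`) and the sample values are illustrative assumptions. A direct per-sample ratio stands in for the scale map here, whereas the paper obtains $s$ by optimizing a regularized energy.

```python
def forward_diff(signal):
    """Forward differences of a 1-D signal (a stand-in for the gradient operator)."""
    return [b - a for a, b in zip(signal, signal[1:])]


def naive_scale_map(grad_target, grad_guide, eps=1e-6):
    """Per-sample ratio s_i such that s_i * grad_guide[i] matches grad_target[i].

    Where the guidance gradient vanishes the ratio is undefined, so 0 is used
    (an assumption that mirrors the "s_i = 0 is a good choice" remark).
    """
    return [gt / gg if abs(gg) > eps else 0.0
            for gt, gg in zip(grad_target, grad_guide)]


# Hypothetical toy data: the guidance has a reversed edge between samples 1 and 2
# and an extra edge (e.g., a flash shadow) between samples 3 and 4.
target = [0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
guidance = [2.0, 2.0, 0.0, 0.0, 1.0, 1.0]

grad_target = forward_diff(target)
grad_guide = forward_diff(guidance)
s = naive_scale_map(grad_target, grad_guide)

# Residual of the fit grad_target ~= s * grad_guide, as in Eq. (2).
residual = [gt - si * gg for gt, si, gg in zip(grad_target, s, grad_guide)]
```

In this toy case the reversed edge yields a negative scale, and the extra guidance edge is suppressed by a zero scale.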
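$$E_1(s, I) = \sum_i \left( \rho(|s_i - p_{i,x} \nabla_x I_i|) + \rho(|s_i - p_{i,y} \nabla_y I_i|) \right), \quad (4)$$
(5)
$$p_{i,k} = \frac{1}{\operatorname{sign}(\nabla_k G_i) \cdot \max(|\nabla_k G_i|, \varepsilon)}, \quad (6)$$
$$E_2(I) = \sum_i \rho(|I_i - I_{0,i}|), \quad (7)$$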
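The data terms above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: the robust function is taken as $\rho(x) = |x|^\alpha$ (the excerpt does not define it), `sign(0)` is taken as +1 so that $p_{i,k}$ stays finite, and the values of `eps` and `alpha` are hypothetical.

```python
def sign(x):
    """Sign with sign(0) taken as +1 (an assumption, to keep p finite)."""
    return 1.0 if x >= 0 else -1.0


def p_weight(grad_g, eps=1e-3):
    """p_{i,k} = 1 / (sign(grad_g) * max(|grad_g|, eps)); eps is a hypothetical value."""
    return 1.0 / (sign(grad_g) * max(abs(grad_g), eps))


def rho(x, alpha=0.9):
    """Assumed robust penalty |x|^alpha."""
    return abs(x) ** alpha


def data_term(s, grad_I, grad_G):
    """E1: s is a list of scales; grad_I and grad_G are lists of (x, y) gradient pairs."""
    total = 0.0
    for s_i, (ix, iy), (gx, gy) in zip(s, grad_I, grad_G):
        total += rho(s_i - p_weight(gx) * ix) + rho(s_i - p_weight(gy) * iy)
    return total


def fidelity_term(I, I0):
    """E2: robust distance between the estimate I and the noisy input I0."""
    return sum(rho(a - b) for a, b in zip(I, I0))


# Toy check: if grad_I equals s * grad_G exactly, the data term vanishes.
grad_G = [(2.0, -4.0), (0.5, 1.0)]
s_true = [-0.5, 2.0]
grad_I = [(si * gx, si * gy) for si, (gx, gy) in zip(s_true, grad_G)]
cost_true = data_term(s_true, grad_I, grad_G)
cost_zero = data_term([0.0, 0.0], grad_I, grad_G)
```

Dividing by the guidance gradient, rather than multiplying by it, puts $s_i$ itself inside the penalty, so every pixel contributes on a comparable scale regardless of edge strength.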
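$$D(\nabla G_i) = \frac{1}{(\nabla G_i)^2 + 2\eta^2} \left( \nabla G_i^{\perp} (\nabla G_i^{\perp})^T + \eta^2 \mathbf{1} \right), \quad (8)$$
where $\nabla G_i^{\perp} = (\nabla_y G_i, -\nabla_x G_i)^T$ is a vector perpendicular
to $\nabla G_i$, $\mathbf{1}$ is an identity matrix and scalar $\eta$ controls the
isotropic smoothness. When $\nabla G_i$ is much smaller than $\eta$,
Eq. (8) reduces to $0.5 \cdot \mathbf{1}$ and the structure tensor is therefore isotropic. In all our experiments, $\eta$ is set to 0.1
empirically.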
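As a quick numerical check of Eq. (8), the following sketch builds the 2x2 tensor for a single pixel. The function name `anisotropic_tensor` is an illustrative assumption; $(\nabla G_i)^2$ is interpreted as the squared gradient magnitude.

```python
def anisotropic_tensor(gx, gy, eta=0.1):
    """2x2 tensor D for one pixel with guidance gradient (gx, gy), following Eq. (8)."""
    px, py = gy, -gx                      # vector perpendicular to the gradient
    denom = gx * gx + gy * gy + 2 * eta * eta
    return [
        [(px * px + eta * eta) / denom, (px * py) / denom],
        [(px * py) / denom, (py * py + eta * eta) / denom],
    ]


# Flat region: the tensor reduces to 0.5 * identity (isotropic smoothing).
D_flat = anisotropic_tensor(0.0, 0.0)

# Strong vertical edge (gradient along x): variation along x is penalized
# weakly, variation along y (along the edge) is penalized strongly.
D_edge = anisotropic_tensor(1.0, 0.0)
```

With `eta = 0.1`, `D_edge` is close to `[[0.0098, 0], [0, 0.9902]]`, which shows how the tensor relaxes smoothing across a guidance edge while keeping it along the edge.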
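Generally, the two orthogonal eigenvectors of $D(\nabla G_i)$
are
$$v_{i,1} = \frac{\nabla G_i}{|\nabla G_i|}, \qquad v_{i,2} = \frac{\nabla G_i^{\perp}}{|\nabla G_i|}, \quad (9)$$
with corresponding eigenvalues
$$\mu_{i,1} = \frac{\eta^2}{(\nabla G_i)^2 + 2\eta^2}, \qquad \mu_{i,2} = \frac{(\nabla G_i)^2 + \eta^2}{(\nabla G_i)^2 + 2\eta^2}. \quad (10)$$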
(14)
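$$D(\nabla G_i) = \begin{pmatrix} v_{i,1} & v_{i,2} \end{pmatrix} \begin{pmatrix} \mu_{i,1} & 0 \\ 0 & \mu_{i,2} \end{pmatrix} \begin{pmatrix} v_{i,1}^T \\ v_{i,2}^T \end{pmatrix}. \quad (11)$$
(12)
2.4
$$E_3(s) = \sum_i \left( \mu_{i,1} (v_{i,1}^T \nabla s_i)^2 + \mu_{i,2} (v_{i,2}^T \nabla s_i)^2 \right). \quad (13)$$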
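The smoothness cost in Eq. (13) can be evaluated directly from the eigenvectors and eigenvalues in Eqs. (9) and (10). The sketch below is illustrative only: the function names are assumptions, gradients are passed as per-pixel (x, y) pairs, and a non-zero guidance gradient is assumed so that the eigenvectors are defined.

```python
import math


def tensor_eig(gx, gy, eta=0.1):
    """Eigenvectors and eigenvalues of D per Eqs. (9)-(10); assumes (gx, gy) != 0."""
    norm = math.hypot(gx, gy)
    v1 = (gx / norm, gy / norm)           # across the guidance edge
    v2 = (gy / norm, -gx / norm)          # along the guidance edge
    denom = norm ** 2 + 2 * eta ** 2
    mu1 = eta ** 2 / denom
    mu2 = (norm ** 2 + eta ** 2) / denom
    return v1, v2, mu1, mu2


def smoothness_cost(grad_s, grad_G, eta=0.1):
    """E3 as the sum in Eq. (13) over per-pixel (x, y) gradient pairs."""
    total = 0.0
    for (sx, sy), (gx, gy) in zip(grad_s, grad_G):
        v1, v2, mu1, mu2 = tensor_eig(gx, gy, eta)
        total += mu1 * (v1[0] * sx + v1[1] * sy) ** 2
        total += mu2 * (v2[0] * sx + v2[1] * sy) ** 2
    return total


# Guidance edge with gradient along x: s may change across it cheaply,
# while the same change along the edge is penalized strongly.
across = smoothness_cost([(1.0, 0.0)], [(1.0, 0.0)])
along = smoothness_cost([(0.0, 1.0)], [(1.0, 0.0)])
constant = smoothness_cost([(0.0, 0.0)], [(1.0, 0.0)])
```

A constant scale map has zero cost, which is the reason an all-ones initialization of $s$ is a convenient starting point.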
Fig. 6. Multispectral shadow detection. Given noisy RGB image (a) and NIR image (b), our algorithm detects shadow regions in (b). (c) is the denoising result of (a) with the help of (b). (d) is the shadow-removed and smoothed result of (b). (e) is the rough shadow map while (f) is obtained by applying labeling to (e). (g) shows the extracted small structures of $I_0$ and (h) shows the final shadow detection result.
(18)
like bilateral filter [4]. It helps keep edges and remove noise.
$\sigma_G$ and $\sigma_x$ are two parameters to control spatial and range
influence.
This function slightly smooths $I_0$ while respecting its inherent
structure, so it does not strongly damage edges. The nonlocal
image smoothing can be efficiently solved iteratively by fast
bilateral filtering [18]. In our experiments, we set 1 to 0.01.
$\sigma_G$ and $\sigma_x$ are set to 0.01 and 6 respectively. The size of $N(p)$ is
set to $(2\sigma_x + 1) \times (2\sigma_x + 1)$. One initial denoising result is
shown in Fig. 6c.
q2Np
q2N 0 p
$\omega'_{p,q}$ is the affinity, again taking the denoised $I'$ as the guidance. It is expressed as
$$\omega'_{p,q} = \exp\left( -\frac{\|I'_p - I'_q\|^2}{\sigma_I^2} - \frac{\|p - q\|^2}{\sigma_x'^2} \right). \quad (17)$$
Obviously, if shadow is not present at a pixel in $I'$,
smoothing will not stop at it in the NIR image. This effect
results in automatic shadow removal in the filtered $G'$. The setting of $\sigma'_x$ relates to the shadow region size, and the result is not sensitive to it. The size of $N'(p)$ is twice $\sigma'_x$. Both $\sigma_I$ and 2 are set
to 0.01 in our experiments.
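The affinity in Eq. (17) is a product of a range Gaussian on the denoised image and a spatial Gaussian on pixel positions. The following sketch evaluates it for a pair of pixels. The function name and the value passed for `sigma_x` are hypothetical, and scalar intensities are assumed for simplicity.

```python
import math


def affinity(I_p, I_q, p, q, sigma_I=0.01, sigma_x=10.0):
    """Affinity between pixels p and q (row, col) with denoised intensities I_p, I_q.

    sigma_I follows the 0.01 setting quoted in the text; sigma_x is a
    hypothetical value, since the text only says it relates to shadow size.
    """
    range_term = (I_p - I_q) ** 2 / sigma_I ** 2
    spatial_term = ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) / sigma_x ** 2
    return math.exp(-(range_term + spatial_term))


# Same intensity, neighbouring pixels: affinity stays close to 1, so
# smoothing of the NIR image continues (this is what removes a shadow
# that is present only in the NIR image).
w_same = affinity(0.50, 0.50, (5, 5), (5, 6))

# Large intensity jump in the denoised image: affinity collapses to ~0,
# so smoothing stops at that edge.
w_edge = affinity(0.50, 0.70, (5, 5), (5, 6))
```

Because the range term is computed on $I'$ rather than on the NIR image, an edge that appears only in the NIR image does not reduce the affinity.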
i2V
i;j2E
(20)
when comparing G with its smoothed version even if pixels are not shadow ones.
To redress this problem, we explicitly extract small structures in $I_0$ by Difference of Gaussians, as illustrated in
Fig. 6g. This simple operation quickly separates out most
small structures. Then we take the relative complement of
(g) in (f), which keeps the non-zero pixels in (f) only where they are zero
in (g). This process can remove more than 98 percent of the
errors in our experiments since nearly all small regions not
belonging to shadow are eliminated. Note that small shadow
regions could also be discarded. This is acceptable in all our
experiments because small shadows do not much influence
the quality of the final restored images.
Median filtering is applied afterwards to remove isolated
noise. We denote the final shadow detection result as $S_r$,
shown in Fig. 6h.
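The last two steps (relative complement, then median filtering) can be sketched on small binary masks. This is an illustration under assumptions: the masks are hypothetical toy data, the small-structure mask is taken as given rather than computed by Difference of Gaussians, and a 3x3 window with border replication is assumed for the median filter.

```python
from statistics import median


def relative_complement(shadow_mask, small_structure_mask):
    """Keep pixels that are set in shadow_mask and not set in small_structure_mask."""
    return [[1 if f and not g else 0 for f, g in zip(row_f, row_g)]
            for row_f, row_g in zip(shadow_mask, small_structure_mask)]


def median_filter3(mask):
    """3x3 median filter with replicated borders (window size is an assumption)."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [mask[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = int(median(window))
    return out


# Hypothetical rough shadow labeling: a 3x3 shadow block, one false positive
# at (1, 5) caused by a small structure, and one isolated noise pixel at (4, 5).
rough_shadow = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 1],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 1],
]
small_structures = [[0] * 6 for _ in range(5)]
small_structures[1][5] = 1

cleaned = relative_complement(rough_shadow, small_structures)
final_mask = median_filter3(cleaned)
```

The complement removes the false positive that coincides with a small structure, and the median filter removes the remaining isolated pixel. On a region as small as this toy block the median filter also trims the block's corners, which is in line with the remark that small shadow regions may be discarded.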
(23)
NUMERICAL SOLUTION
1
jxj2a
(25)
(21)
(24)
I I0 T BI I0 bsT Ls
0
gI I T DI I ;
(26)
(27)
$$(V_x)_{ii} = \nabla_x G_i / \max(|\nabla G_i|, \varepsilon),$$
$$(V_y)_{ii} = \nabla_y G_i / \max(|\nabla G_i|, \varepsilon).$$
4.1 Solver
We propose an alternating minimization algorithm to solve
for $s$ and $I$ based on the above derivations. Results of $s$ and $I$ in
each iteration $t$ are denoted as $s^{(t)}$ and $I^{(t)}$. Initially, we set
$s^{(0)} = \mathbf{1}$, whose elements are all 1, and $I^{(0)} = I_0$.
By setting all initial $s_i$ to 1, total smoothness is obtained.
It yields zero cost for $E_3(s)$, a nice starting point for optimization. This initialization also makes the starting $\nabla I$ the same
as $\nabla G$ with many details. Then at iteration $t + 1$, we solve
two subproblems sequentially:
Given $s^{(t)}$ and $I^{(t)}$, minimize $E(s, I^{(t)})$ to get $s^{(t+1)}$.
Given $s^{(t+1)}$ and $I^{(t)}$, minimize $E(s^{(t+1)}, I)$ to update
$I^{(t+1)}$.
The procedure is repeated until $s$ and $I$ no longer change
much. Usually, a small number of iterations (4-6) is
enough to generate visually compelling results. The algorithm is depicted in Algorithm 2, with the solvers elaborated on below.
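The alternating scheme described above can be sketched as a generic loop. This is not Algorithm 2 itself: the function name, the stopping tolerance, and the toy energy are all assumptions made for illustration. The toy energy is a per-sample quadratic with closed-form subproblem solutions, and its $\beta (s_i - 1)^2$ term is a simple stand-in for the paper's anisotropic smoothness term.

```python
def alternating_minimization(solve_s, solve_I, s_init, I_init, max_iters=6, tol=1e-4):
    """Alternate two subproblem solvers until s and I stop changing much."""
    s, I = list(s_init), list(I_init)
    for _ in range(max_iters):
        s_new = solve_s(s, I)        # minimize E(s, I^(t)) over s
        I_new = solve_I(s_new, I)    # minimize E(s^(t+1), I) over I
        change = max(
            max(abs(a - b) for a, b in zip(s_new, s)),
            max(abs(a - b) for a, b in zip(I_new, I)),
        )
        s, I = s_new, I_new
        if change < tol:
            break
    return s, I


# Toy energy (hypothetical): sum_i (I_i - s_i g_i)^2 + lam (I_i - I0_i)^2 + beta (s_i - 1)^2
g = [2.0, -1.0, 0.5]
I0 = [1.0, 0.3, 0.2]
lam, beta = 1.0, 0.5


def energy(s, I):
    return sum((Ii - si * gi) ** 2 + lam * (Ii - I0i) ** 2 + beta * (si - 1.0) ** 2
               for si, Ii, gi, I0i in zip(s, I, g, I0))


def solve_s(s, I):
    # Closed-form minimizer over s_i with I fixed (the previous s is unused here).
    return [(gi * Ii + beta) / (gi * gi + beta) for gi, Ii in zip(g, I)]


def solve_I(s, I):
    # Closed-form minimizer over I_i with s fixed (the previous I is unused here).
    return [(si * gi + lam * I0i) / (1.0 + lam) for si, gi, I0i in zip(s, g, I0)]


s_init = [1.0] * len(g)      # all-ones initialization of the scale map
I_init = list(I0)            # start from the noisy input
s_opt, I_opt = alternating_minimization(solve_s, solve_I, s_init, I_init)
```

Each subproblem is solved exactly in this toy setting, so the energy cannot increase from one iteration to the next. The solvers receive both current iterates because a reweighted solver would need them to recompute its weights.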
Solve for $s^{(t+1)}$. The energy function with respect to $s$ can
be expressed as
$$E(s) = (s - P_x C_x I)^T A_x (s - P_x C_x I) + (s - P_y C_y I)^T A_y (s - P_y C_y I) + \beta s^T L s. \quad (28)$$
(29)
$$(A_x^{t,t} + A_y^{t,t} + \beta L)\, s = A_x^{t,t} P_x C_x I^{(t)} + A_y^{t,t} P_y C_y I^{(t)}. \quad (30)$$
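As a reading aid, the step from the quadratic energy in Eq. (28) to the linear system in Eq. (30) can be sketched as follows. It assumes, beyond what this excerpt states, that $A_x$, $A_y$ and $L$ are symmetric and are held fixed at their iteration-$t$ values while differentiating with respect to $s$.

```latex
\begin{aligned}
\frac{\partial E(s)}{\partial s}
  &= 2 A_x \,(s - P_x C_x I) + 2 A_y \,(s - P_y C_y I) + 2 \beta L s = 0 \\
\Longrightarrow \quad
(A_x + A_y + \beta L)\, s &= A_x P_x C_x I + A_y P_y C_y I .
\end{aligned}
```

Substituting the iteration-$t$ quantities for $A_x$, $A_y$ and $I$ gives the form of Eq. (30), a sparse linear system in $s$.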
(31)
Fig. 7. s map estimation in iterations. Given image pairs in (a) and (b), our method can get the high-quality restoration result in (c). The s maps in different iterations are shown in (e)-(h).
Fig. 8. Comparison with guided image filter [12]. (c) is the guided image
filter result with $r = 6$ and $\varepsilon = 0.03^2$. (d) is our result computed with
$\lambda = 2.5$ and $\beta = 0.5$.
gradients. One example is shown in Fig. 8. The difference between the results is obvious. Note that our $s_i$ is defined for each pixel with the
robust data and regression terms optimized in a global function, which leads to a less noisy result.
Our method is also distinct from edge-aware filtering [1],
[10], [25], which employs weighted mean. The new scale
map is developed in the gradient domain and considers the possible relationship between $\nabla I$ and $\nabla G$. We show the comparisons in Section 5.
Fig. 10. Shadow detection in multispectral restoration. Our shadow detection results in (d) are more accurate than those of [22] (shown in (c)). (e) are
results of our method after shadow detection.
1. http://www.cse.cuhk.edu.hk/leojia/projects/crossfield
Fig. 11. Tea-bag example. The input RGB and NIR images are shown in Figs. 6a and 6b respectively. (a) is the result of guided non-local means [10].
(b) is denoised by the method of [12] with the NIR image as guidance ($r = 4$ and $\varepsilon = 0.01^2$). (c) is the BM3D result while (d) is enhanced by the
method of [32] ($\sigma = 35.0$). (e) is our result ($\lambda = 10.0$ and $\beta = 1.5$) and (f) shows the corresponding scale map.
We have presented a complete system, showing a principled way for multispectral joint image restoration. Unlike
transferring details or applying joint filtering, we explicitly
take the possible structural discrepancy between input
images into consideration. It is encoded in a scale map s
involving all challenging cases to deal with. Our objective
functions and optimization process are tailor-made to use
Fig. 14. Image restoration from flash/no-flash image pairs. The input
images and result of [20] are obtained from the original paper. $\lambda = 6.0$
and $\beta = 0.6$ are used in our method.
Fig. 17. Image restoration from haze image. Close-ups shown in (d) are
from (a-c). The left two are NIR and haze images. The right three are
results of [23], [13] and ours.
Fig. 15. Depth map restoration with RGB image as guidance. The
parameter settings are $\lambda = 5.0$ and $\beta = 0.5$.
Fig. 18. Day and night image pair enhancement. (c) is from (b) by
improving the color contrast in Photoshop. (d) is our result obtained from (c) in
our framework using (a) as guidance. The parameters used to compute
our result are $\lambda = 3.0$ and $\beta = 0.5$.
Fig. 16. Texture smoothing example. (b) is the result computed by [28]
and (c) is our result computed with $\lambda = 5.0$ and $\beta = 1.0$.
the guidance from other domains and preserve only necessary details and edges.
The limitation of our current method lies in the situation
where the guidance does not exist, corresponding to pixels with zero $\nabla G$
and non-zero $\nabla I^*$. One example is shown in Fig. 19.
Because the guidance does not exist, image restoration naturally degrades to single-image denoising.
ACKNOWLEDGMENTS
The work described in this paper was supported by a grant
from the Research Grants Council of the Hong Kong Special
Administrative Region (Project No. 412911).
REFERENCES
[1] A. K. Agrawal, R. Raskar, S. K. Nayar, and Y. Li, "Removing photography artifacts using gradient projection and flash-exposure sampling," ACM Trans. Graph., vol. 24, no. 3, pp. 828-835, 2005.
[2] D. Akers, F. Losasso, J. Klingner, M. Agrawala, J. Rick, and P. Hanrahan, "Conveying shape and features with image-based relighting," in Proc. IEEE Vis., 2003, pp. 349-354.
[3] P. Bhat, C. L. Zitnick, M. F. Cohen, and B. Curless, "Gradientshop: A gradient-domain optimization framework for image and video filtering," ACM Trans. Graph., vol. 29, no. 2, p. 10, 2010.
[4] M. J. Black, G. Sapiro, D. H. Marimont, and D. Heeger, "Robust anisotropic diffusion," IEEE Trans. Image Process., vol. 7, no. 3, pp. 421-432, Mar. 1998.
[5] A. Buades, B. Coll, and J.-M. Morel, "A non-local algorithm for image denoising," in Proc. IEEE Conf. Comput. Vis. Pattern Recog., 2005, pp. 60-65.
[6] K. Dabov, A. Foi, V. Katkovnik, and K. Egiazarian, "Image denoising by sparse 3-d transform-domain collaborative filtering," IEEE Trans. Image Process., vol. 16, no. 8, pp. 2080-2095, Aug. 2007.
[7] E. Eisemann and F. Durand, "Flash photography enhancement via intrinsic relighting," ACM Trans. Graph., vol. 23, no. 3, pp. 673-678, 2004.
[8] R. Fattal, D. Lischinski, and M. Werman, "Gradient domain high dynamic range compression," ACM Trans. Graph., vol. 21, pp. 249-256, 2002.
[9] G. D. Finlayson, S. D. Hordley, and M. S. Drew, "Removing shadows from images," in Proc. 7th Eur. Conf. Comput. Vis., 2002, pp. 823-836.
[10] E. S. L. Gastal and M. M. Oliveira, "Adaptive manifolds for real-time high-dimensional filtering," ACM Trans. Graph., vol. 31, no. 4, pp. 33:1-33:13, 2012.
[11] R. Guo, Q. Dai, and D. Hoiem, "Single-image shadow detection and removal using paired regions," in Proc. IEEE Conf. Comput. Vis. Pattern Recog., 2011, pp. 2033-2040.
[12] K. He, J. Sun, and X. Tang, "Guided image filtering," in Proc. Eur. Conf. Comput. Vis., 2010, pp. 1-14.
[13] K. He, J. Sun, and X. Tang, "Single image haze removal using dark channel prior," IEEE Trans. Pattern Anal. Mach. Intell., vol. 33, no. 12, pp. 2341-2353, Dec. 2011.
[14] J. Kopf, M. F. Cohen, D. Lischinski, and M. Uyttendaele, "Joint bilateral upsampling," ACM Trans. Graph., vol. 26, no. 3, p. 96, 2007.
[15] D. Krishnan and R. Fergus, "Dark flash photography," ACM Trans. Graph., vol. 28, no. 3, p. 96, 2009.
[16] A. Levin and Y. Weiss, "User assisted separation of reflections from a single image using a sparsity prior," in Proc. Eur. Conf. Comput. Vis., 2004, pp. 602-613.
[17] M. D. Levine and J. Bhattacharyya, "Removing shadows," Pattern Recognit. Lett., vol. 26, no. 3, pp. 251-265, 2005.
[18] S. Paris and F. Durand, "A fast approximation of the bilateral filter using a signal processing approach," in Proc. Eur. Conf. Comput. Vis., 2006, pp. 568-580.
[19] P. Perona and J. Malik, "Scale-space and edge detection using anisotropic diffusion," IEEE Trans. Pattern Anal. Mach. Intell., vol. 12, no. 7, pp. 629-639, Jul. 1990.
[20] G. Petschnigg, R. Szeliski, M. Agrawala, M. F. Cohen, H. Hoppe, and K. Toyama, "Digital photography with flash and no-flash image pairs," ACM Trans. Graph., vol. 23, no. 3, pp. 664-672, 2004.
[21] R. Raskar, A. Ilie, and J. Yu, "Image fusion for context enhancement and video surrealism," in Proc. 3rd Int. Symp. Non-Photorealistic Animation Rendering, 2004, pp. 85-152.
[22] D. Rüfenacht, C. Fredembach, and S. Süsstrunk, "Automatic and accurate shadow detection using near-infrared information," IEEE Trans. Pattern Anal. Mach. Intell., vol. 36, no. 8, pp. 1672-1678, Aug. 2014.
Li Xu received the BS and MS degrees in computer science and engineering from Shanghai
Jiao Tong University (SJTU) in 2004 and 2007,
respectively, and the PhD degree in computer
science and engineering from the Chinese University of Hong Kong (CUHK) in 2010. He joined
Lenovo R&T Hong Kong in August 2013, where
he leads the imaging & sensing group in the
Image & Visual Computing (IVC) Lab. He
received the Microsoft Research Asia Fellowship
Award in 2008 and the best paper award of
NPAR 2012. His major research areas include motion estimation, motion
deblurring, image/video analysis, and enhancement. He is a member of
the IEEE.