

Correspondence
Removing Motion Blur With Space–Time Processing

Hiroyuki Takeda, Member, IEEE, and Peyman Milanfar, Fellow, IEEE

Abstract—Although spatial deblurring is relatively well understood by assuming that the blur kernel is shift invariant, motion blur is not so well understood when we attempt to deconvolve on a frame-by-frame basis: this is because, in general, videos include complex, multilayer transitions. Indeed, we face an exceedingly difficult problem in motion deblurring of a single frame when the scene contains motion occlusions. Instead of deblurring video frames individually, a fully 3-D deblurring method is proposed in this paper to reduce motion blur from a single motion-blurred video and produce a video of high resolution in both space and time. Unlike other existing approaches, the proposed deblurring kernel is free from knowledge of the local motions. Most importantly, due to its inherent locally adaptive nature, the 3-D deblurring is capable of automatically deblurring the portions of the sequence that are motion blurred, without segmentation, and without adversely affecting the rest of the spatiotemporal domain, where such blur is not present. Our method is a two-step approach: first we upscale the input video in space and time without explicit estimates of local motions, and then we perform 3-D deblurring to obtain the restored sequence.

Index Terms—Inverse filtering, sharpening and deblurring.

Manuscript received August 13, 2010; revised December 28, 2010 and March 03, 2011; accepted March 04, 2011. Date of publication March 24, 2011; date of current version September 16, 2011. This work was supported in part by the U.S. Air Force under Grant FA9550-07-1-0365 and the National Science Foundation under Grant CCF-1016018. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. James E. Fowler.

H. Takeda was with the Electrical Engineering Department, University of California, Santa Cruz, CA 95064 USA. He is now with the University of Michigan, Ann Arbor, MI 48109 USA (e-mail: [email protected]). P. Milanfar is with the Electrical Engineering Department, University of California, Santa Cruz, CA 95064 USA (e-mail: [email protected]).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TIP.2011.2131666

I. INTRODUCTION

EARLIER, in [1], we proposed a space–time data-adaptive video upscaling method that does not require explicit subpixel estimates of motion. We named this method 3-D steering kernel regression (3-D SKR). Unlike other video upscaling methods, e.g., [2], it is capable of finding an unknown pixel at an arbitrary position not only in space but also in time by filtering the neighboring pixels along the local 3-D orientations, which comprise spatial orientations and motion trajectories. After upscaling the input video, one usually performs a frame-by-frame deblurring process with a shift-invariant spatial (2-D) point spread function (PSF) in order to recover high-frequency components. However, since any motion blurs present are typically shift variant, due to motion occlusions or nonuniform motions in the scene, they remain untreated, and hence motion deblurring remains a challenging problem. The main focus of this paper is to illustrate one important but so far unnoticed fact: any successful space–time interpolator enables us to remove motion blur by deblurring with a shift-invariant space–time (3-D) PSF, without any object segmentation or motion information. In this presentation, we use our 3-D SKR method as such a space–time interpolator.

The practical solutions to blind motion deblurring available so far largely treat only the case where the blur is a result of global motion due to camera displacement [3], [4], rather than motion of the objects in the scene. When the motion blur is not global, it would seem that segmentation information is needed in order to identify what part of the image suffers from motion blur (typically due to fast-moving objects). Consequently, the problem of deblurring moving objects in the scene is quite complex because it requires 1) segmentation of moving objects from the background, 2) estimation of a spatial motion PSF for each moving object, 3) deconvolution of the moving objects one by one with the corresponding PSFs, and finally 4) putting the deblurred objects back together into a coherent and artifact-free image or sequence [5]–[8]. In order to perform the first two steps (segmentation and PSF estimation), one would need to carry out global/local motion estimation [9]–[12]. Thus, the deblurring performance depends strongly on the accuracy of motion estimation and of the segmentation of moving objects. However, errors in both are in general unavoidable, particularly in the presence of multiple motions, occlusion, or nonrigid motions, i.e., whenever the motion violates parametric models or the standard optical-flow brightness constancy constraint.

In this paper, we present a motion deblurring approach for videos that is free of both explicit motion estimation and segmentation. Briefly speaking, we point out and exploit what in hindsight seems obvious, though apparently not exploited so far in the literature: motion blur is by nature a temporal blur, caused by relative displacements of the camera and the objects in the scene while the camera shutter is open. Therefore, a temporal blur degradation model is more appropriate and physically meaningful for the general motion deblurring problem than the usual spatial blur model. An important advantage of the temporal blur model is that, regardless of whether the motion blur is global (camera induced) or local (object induced) in nature, the temporal PSF stays shift invariant,1 whereas the spatial PSF must be considered shift variant in essentially all state-of-the-art frame-by-frame (or 2-D, spatial) motion deblurring approaches [5]–[8].

The examples in Figs. 1 and 2 illustrate the advantage of our space–time (3-D) approach as compared to the blind motion deblurring methods in the spatial domain proposed by Fergus et al. [3] and Shan et al. [4]. For the first example, the ground truth, a motion-blurred frame, and the restored images by Fergus' method, Shan's method, and our approach are shown in Fig. 1(a)–(e), including some detailed regions in Fig. 1(f)–(j), respectively. As can be seen from this example, their methods [3], [4] deblur the background, while, in fact, we wish to restore the details of the mug. This is because both blind methods are designed for the removal of the global blur effect caused by translational displacements of the camera (i.e., ego-motion); segmentation of the moving objects would be necessary to deblur the segments one by one with different motion PSFs. On the other hand, the second example in Fig. 2 is a case where spatial segmentation of motion regions is simply not practical. The pepper image shown in Fig. 2(b) is blurred by another type of motion, namely, the rotation of the camera. When the camera rotates about its optical axis while capturing an image, the middle portion of the image is less blurry than the outer regions because the pixels in the middle move relatively little. Similar to the previous example, the restored images by Fergus', Shan's, and our approaches are shown in Fig. 2(c)–(e).

1 We assume that the exposure time stays constant.



Fig. 1. Motion (temporal) deblurring example of the Cup sequence (130 × 165, 16 frames) in which a cup moves upward. (a) Two frames of the ground truth at times t = 6 and 7. (b) Blurred video frames generated by taking the average of five consecutive frames (the corresponding PSF is 1 × 1 × 5 uniform) [PSNR: 23.76 dB (top), 23.68 dB (bottom); structural similarity (SSIM): 0.76 (top), 0.75 (bottom)]. (c)–(e) Deblurred frames by Fergus' method [3] [PSNR: 22.58 dB (top), 22.44 dB (bottom); SSIM: 0.69 (top), 0.68 (bottom)], Shan's method [4] [PSNR: 18.51 dB (top), 10.75 dB (bottom); SSIM: 0.57 (top), 0.16 (bottom)], and the proposed 3-D total variation (TV) method (13) [PSNR: 32.57 dB (top), 31.55 dB (bottom); SSIM: 0.98 (top), 0.97 (bottom)], respectively. (f)–(j) Selected regions of the video frames (a)–(e) at time t = 6, respectively. (a) Ground truth. (b) Blurred frames. (c) Fergus et al. [3]. (d) Shan et al. [4]. (e) Proposed method (13). (f) Ground truth. (g) Blurred frames. (h) Fergus et al. [3]. (i) Shan et al. [4]. (j) Proposed method (13).

We will discuss this example in more detail in Section III, and a few more examples are also available at our website.2 Although the blind methods are capable of estimating complex blur kernels, when the blur is spatially nonuniform they no longer work. We briefly summarize some existing methods for the motion deblurring problem in the next section.

II. MOTION DEBLURRING IN 2-D AND 3-D

A. Existing Methods

Ben-Ezra and Nayar [5], Tai et al. [6], and Cho et al. [7] proposed deblurring methods where the spatial motion PSF is obtained from estimated motions. Ben-Ezra and Nayar [5] and Tai et al. [6] used two different cameras, a low-speed high-resolution camera and a high-speed low-resolution camera, to capture two videos of the same scene at the same time. They then estimate motions using the high-speed low-resolution video so that detailed local motion trajectories can be estimated, and the estimated local motions yield a spatial motion PSF for each moving object. On the other hand, Cho et al. [7] took a pair of images by one camera with some time delay, or by two cameras with no time delay but some spatial displacement. The image pair enables the separation of the moving objects and the foreground from the background. Each part of the images is often blurred with a different PSF. The separation is helpful in estimating the different PSFs individually, and the estimation process of the PSFs becomes more stable.

Whereas the deblurring methods in [5]–[7] obtain the spatial motion PSF from global/local motion information, Fergus et al. proposed a blind motion deblurring method using a relationship between the distribution of gradients and the degree of blur [3]. With this in hand, the method estimates a spatial motion PSF for each segmented object.

Later, inspired by Fergus' blind motion deblurring method, Levin [8] and Shan et al. [4] proposed blind deblurring methods for a single blurred image caused by a shaking camera. Although their methods are limited to global motion blur, using the relationship between the distribution of derivatives and the degree of blur proposed by Fergus et al., they estimated a shift-invariant PSF without parametrization.

Ji and Liu [13] and Dai and Wu [14] also proposed derivative-based methods. Ji and Liu estimated the spatial motion PSF by a spectral analysis of the image gradients, and Dai and Wu obtained the PSF by studying how blurry the local edges are, as indicated by local gradients. Recently, another blind motion deblurring method was proposed by Chen et al. [15] for the reduction of global motion blur. They claimed that the PSF estimation is more stable with two images of the same scene degraded by different PSFs, and also used a robust estimation technique to stabilize the PSF estimation process further.

With the advancement of computational algorithms, as mentioned earlier, the data-acquisition process has also been studied. Using multiple cameras [5]–[7] is one simple way to make the identification of the underlying motion-blur kernel easier. Another technique, called coded exposure, improves the estimation of both blur kernels and images [16]. The idea of coded exposure is to preserve some high-frequency components by repeatedly opening and closing the shutter while the camera is capturing a single image. Although it makes the SNR worse, the high-frequency components are helpful not only in finding the blur kernel, but also in estimating the underlying image with higher quality. When the blur is spatially variant, scene segmentation is necessary [17].

2 http://users.soe.ucsc.edu/~htakeda/VideoDeblurring/VideoDeblurring.htm


Fig. 2. Motion deblurring example of a rotating pepper sequence (179 × 179, 90 frames). (a) One of the frames from a simulated sequence, which we generate by rotating the pepper image counterclockwise 1° per frame. (b) Blurred frame generated by taking the average of eight consecutive frames (the corresponding PSF is a 1 × 1 × 8 shift-invariant uniform PSF) and adding white Gaussian noise with standard deviation σ = 2 (PSNR = 27.10 dB; SSIM = 0.82). (c) and (d) Deblurred frames by Fergus' method [3] (PSNR = 23.23 dB; SSIM = 0.61) and Shan's method [4] (PSNR = 25.12 dB; SSIM = 0.81), respectively. (e) Deblurred frame by the proposed method (PSNR = 33.12 dB; SSIM = 0.90). The images in the second column show magnifications of the upper right portions of the images in the first column. (a) Ground truth. (b) Blurred frame. (c) Fergus et al. [3]. (d) Shan et al. [4]. (e) Proposed method (13).

Fig. 3. Schematic representation of the exposure time τe and the frame interval τf. (a) Standard camera. (b) Multiple videos taken by multiple cameras with a slight time delay are fused to produce a high-frame-rate video. (c) Original frames with estimated intermediate frames (frame-rate upconversion). (d) Temporally deblurred output frames.

B. Path Ahead

All the methods mentioned earlier are similar in that they aim at removing motion blur by spatial (2-D) processing. In the presence of multiple motions, the existing methods would have to estimate a shift-variant PSF and segment the blurred images by local motions (or depth maps). However, occlusions make the deblurring problem more difficult because pixel values around motion occlusions are a mixture of multiple objects moving in independent directions. In this paper, we reduce the motion blur effect in videos by introducing a space–time (3-D) deblurring model. Since the data model is more reflective of the actual data-acquisition process, even in the presence of motion occlusions, deblurring with a 3-D blur kernel can effectively remove both global and local motion blur without segmentation or reliance on explicit motion information.

Practically speaking, for videos, it is not always preferable to remove all the motion blur from the video frames. In particular, for videos with relatively low frame rate (e.g., 10–20 frames per second), motion blur (temporal blur) is often intentionally added in order to show a smooth trajectory of moving objects. Thus, when removing (or more precisely "reducing") the motion blur from videos, we would need to increase the temporal resolution of the video. This operation can be thought of as the familiar frame-rate up-conversion, with the following caveat: in our context, the intermediate frames are not the end results of interest, but rather, as we will explain shortly, a means to obtain a deblurred sequence, at possibly the original frame rate. It is worth noting that the temporal blur reduction is equivalent to shortening the exposure time of the video frames. Typically, the exposure time τe is less than the time interval between frames τf (i.e., τe ≤ τf), as shown in Fig. 3(a). Many commercial cameras set τe to less than 0.5 τf (see for instance [18]). Borissoff in [18] pointed out that τe should ideally depend on the speed of moving objects. Specifically, the exposure time should be half the time it takes for a moving object to run through the scene width, or else temporal aliasing would be visible. In [19], Shechtman et al. presented a space–time super-resolution (SR) algorithm, where multiple cameras capture the same scene at once with slight spatial and temporal displacements. Multiple videos of low resolution in space and time are then fused to obtain a spatiotemporally super-resolved sequence. As a postprocessing step, they spatiotemporally deblur the super-resolved video so that the exposure time τe nearly equals the frame interval τf. Recently, Agrawal et al. proposed a temporal coded sampling technique for temporal video SR in [20], where multiple cameras simultaneously capture the same scene with different frame rates, exposure times, and temporal sampling positions. Their proposed method carefully optimizes those frame sampling conditions so that the space–time SR can achieve higher quality results. By contrast, in this paper, we demonstrate that the problem of motion blur restoration can be solved using a single, possibly low frame rate, video sequence.


Fig. 4. Forward model addressed in this paper. We estimate the desired video u by a two-step approach: 1) space–time upscaling, and 2) space–time deblurring.

To summarize, frame-rate up-conversion is necessary in order to avoid temporal aliasing. Furthermore, unlike motion deblurring algorithms that address the problem purely in the spatial domain [3]–[7], [13]–[15], we deblur with a shift-invariant 3-D PSF, which is effective for any type of motion blur. Examples were illustrated in Figs. 1 and 2, and more will be shown later in Section III. The following are the assumptions and the limitations of our 3-D deblurring approach.

Assumptions

1) The camera settings are fixed: The aperture size, the focal length, the exposure time, and the frame interval are all fixed. The photosensitivity of the image sensor array is uniform and unchanged.

2) One camera captures one frame at a time: In our approach, only one video is available, and the video is shot by a single camera, which captures one frame at a time. Also, all the pixels of one frame are sampled at the same time (without time delay).

3) The aperture size is small: We currently assume that the aperture size is so small that the out-of-focus blur is almost homogeneous.

4) The spatial and temporal PSFs are known: In the current presentation, our primary focus is to show that a simple deblurring with the space–time (3-D) shift-invariant PSF can effectively reduce the complicated, nonuniform motion blur effects of a sequence of images.

Limitations

1) The performance of our motion deblurring depends on the performance of the space–time interpolator: The space–time interpolator needs to generate the missing intermediate blurry frames while preserving the spatial and temporal blur effects.

2) The temporal upscaling factor affects our motion deblurring: To remove the motion blur completely, the temporal upscaling factor of the space–time interpolator must be set so large that the motion speed slows down to less than 1 pixel per frame. For instance, when the temporal upscaling factor is not large enough and an object in the upscaled video moves 3 pixels per frame, the moving object would still be blurry along its motion trajectory in a 3-pixel-wide window even after we deblur. However, as discussed in this section, motion blur is sometimes necessary for very fast moving objects in order to preserve a smooth motion trajectory.

C. Video Deblurring in 3-D

Next, we extend the single-image (2-D) deblurring technique with total variation (TV) regularization to space–time (3-D) motion deblurring for videos. Ringing suppression is of importance because the ringing effect in time creates significant visual distortion in the output videos.

1) Data Model: The exposure time τe of videos taken with a standard camera is always shorter than the frame interval τf, as illustrated in Fig. 3(a). It is generally not possible to reduce motion blur by temporal deblurring when τe < τf (i.e., when the temporal support of the PSF is shorter than the frame interval τf). This is because the standard camera captures one frame at a time: the camera reads a frame out of the photosensitive array, and the array is reset to capture the next frame.3 Unlike the spatial sampling rate, the temporal sampling rate is always below the Nyquist rate. This is an electromechanical limitation of the standard video camera. One way to obtain a high-speed video with τe > τf is to fuse multiple videos captured by multiple cameras at the same time with a slight time delay, as shown in Fig. 3(b). As we mentioned earlier, this technique is referred to as space–time SR [19] or high-speed videography [21]. After the fusion of multiple videos into a high-speed video, the frame interval becomes shorter than the exposure time, and we can carry out the temporal deblurring to reduce the motion blur effect.

3 Most commercial charge-coupled device (CCD) cameras nowadays use the interline CCD technique, where the charged electrons of the frame are first transferred from the photosensitive sensor array to the temporary storage array and the photosensitive array is reset. Then, the camera reads the frame out of the temporary storage array while the photosensitive array is capturing the next frame.

An alternative to using multiple cameras is to generate intermediate frames, which may be obtained by frame interpolation (e.g., [22] and [1]), so that the new frame interval τ̃f is now smaller than τe, as illustrated in Fig. 3(c). Once we have the video sequence with τe > τ̃f, the temporal deblurring reduces τe to be nearly equal to τ̃f, and the video shown in Fig. 3(d) is our desired output.
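To make these timing relations concrete, consider a small worked example; the numbers are purely illustrative and are not taken from the paper. Suppose a camera records at 15 frames per second with an exposure time of half the frame interval:

  τf = 1/15 s,  τe = 1/30 s,  so τe < τf and temporal deblurring of the raw frames alone cannot shorten the exposure.

Upconverting the frame rate by rt = 4 gives a new frame interval τ̃f = τf / rt = 1/60 s < τe. The exposure now spans τe / τ̃f = 2 of the new frame intervals, so a temporal deblurring with a PSF of support 2 (measured in upscaled frames) can reduce the effective exposure time to roughly τ̃f.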


It is worth noting that, in the most general setting, generation/interpolation of temporally intermediate frames is indeed a very challenging problem. However, since our interest lies mainly in the removal of motion blur, the temporal interpolation problem is not quite as complex as the general setting. In the most general case, the space–time SR method [19] employing multiple cameras may be the only practical solution. Of course, it is possible to apply frame interpolation to the space–time super-resolved video to generate an even higher speed video. However, in this paper, we focus on the case where only a single video is available and show that our frame interpolation method (3-D SKR [1]) enables motion deblurring. We note that the performance of the motion deblurring therefore depends on how well we interpolate intermediate frames. As long as the interpolator successfully generates intermediate (upscaled) frames, the 3-D deblurring can reduce the motion blur effects. Since the exposure time of the frames is typically relatively short even at low frame rates (10–20 frames per second), we assume that local motion trajectories between frames are smooth enough that the 3-D SKR method interpolates the trajectories. When multilayered large (fast) motions are present, it is hard to generate intermediate frames using only a single video input due to severe occlusions. Consequently, a video with a higher frame rate is necessary.

Fig. 4 illustrates the idealized forward model that we adopt in this paper. Specifically, the camera captures the first frame by temporally integrating the first few frames (say the first, second, and third frames) of the desired video u, and the second frame by integrating, for example, the fifth frame and the following two frames.4 Next, the frames are spatially downsampled due to the limited number of pixels on the image sensor. We can regard the spatial and temporal sampling mechanisms of the camera altogether as a space–time downsampling effect, as shown in Fig. 4.

4 Perhaps a more concise description is that the motion blur effect can always be modeled as a single 1-D shift-invariant PSF in the direction of the time axis. This is simply because the blur results from multiple exposures of the same fast-moving object in space during the exposure time. The skips of the temporal sampling positions can be regarded as temporal downsampling.

In our paper, we assume that all the frames in a video are taken by a camera with the same settings (focus, zoom, aperture size, exposure time, frame rate, etc.). Under such conditions, the spatial PSF, caused by the physical size of one pixel on the image sensor, and the temporal PSF, whose support size is given by the exposure time, also remain unchanged, no matter how the camera moves and no matter what scene we shoot. Therefore, the 3-D PSF, given by the convolution of the 2-D spatial PSF and the 1-D temporal PSF, as depicted in Fig. 5, is shift invariant.

Fig. 5. The overall PSF kernel in video (3-D) is given by the convolution of the spatial and temporal PSF kernels.

Under these assumptions, we estimate the desired output u by a two-step approach: 1) space–time upscaling, and 2) space–time deblurring. In our earlier study, we proposed a space–time upscaling method in [1], where we left the motion (temporal) blur effect untreated and removed only the spatial blur with a shift-invariant (2-D) PSF with TV regularization. In this paper, we study the reduction of the spatial and temporal blur effects simultaneously with a shift-invariant (3-D) PSF. A 3-D PSF is effective because the spatial blur and the temporal blur (frame accumulation) are both shift invariant. The PSF becomes shift variant when we convert the 3-D PSF into 2-D temporal slices, which yield the spatial PSFs due to the moving objects for frame-by-frame deblurring. Again, unlike the existing methods [3]–[7], [13]–[15], after the space–time upscaling, no motion estimation or scene segmentation is required for the space–time deblurring.

Having graphically introduced our data model in Fig. 4, we define the mathematical model between the blurred data, denoted y, and the desired signal u with a 3-D PSF g as follows:

  y(x) = z(x) + ε = (g ∗ u)(x) + ε        (1)

where ε is the independent and identically distributed zero-mean noise value (with otherwise no particular statistical distribution assumed), x = [x1, x2, t]ᵀ is the 3-D (space–time) coordinate in vector form, ∗ is the convolution operator, and g is the combination of the spatial blur gs and the temporal blur gτ:

  g(x) = gs(x1, x2) ∗ gτ(t).        (2)

If the sizes of the spatial and temporal PSF kernels are N × N × 1 and 1 × 1 × τ, respectively, then the overall PSF kernel has size N × N × τ, as illustrated in Fig. 5. We will discuss how to select the 3-D PSF for deblurring later in Section II-C. While the data model (1) resembles the one introduced by Irani and Peleg [23], we note that ours is a 3-D data model. More specifically, we consider an image sequence (a video) as one data set and consider the case where only a single video is available. The PSF and the downsampling operations are also all in 3-D.

In this paper, we split the data model (1) into

  Spatiotemporal (3-D) upsampling problem:  y_i = z(x_i) + ε_i        (3)
  Spatiotemporal (3-D) deblurring problem:  z(x_j) = (g ∗ u)(x_j)        (4)

where x_i = [x1i, x2i, ti]ᵀ is the pixel sampling position of the low-resolution video with index i, x_j = [x1j, x2j, tj]ᵀ is the pixel sampling position of the high-resolution video with index j, and y_i is the ith sample of the low-resolution video (y_i = y(x_i)). We estimate u(x_j) for all j by a two-step approach:

  Step 1. upscaling of y_i to obtain the motion-blurred high-resolution video z(x_j);
  Step 2. deblurring of z(x_j) to obtain the motion-deblurred high-resolution video u(x_j).

For the upscaling problem, we first upsample the low-resolution video (y_i) and register it onto the grid of the desired high-resolution video, as illustrated in Fig. 6. Since the sampling density of the low-resolution video (x_i) is lower than the density of the high-resolution video (x_j), there are missing pixels. In Fig. 6, a blank pixel lattice indicates that a pixel value is missing, and we need to fill those missing pixels. We use our 3-D SKR [1] (reviewed in Section II-C2) to estimate the missing pixels. For the deblurring problem, since each blurry pixel z(x_j) is coupled with its space–time neighbors through the space–time blurring operation, it is preferable to rewrite the data model (4) in matrix form as follows:

  Spatiotemporal (3-D) deblurring problem:  z = Gu        (5)

where z = [. . . , z(x_j), . . .]ᵀ and u = [. . . , u(x_j), . . .]ᵀ. For example, let us say that the low-resolution video (y_i) is of size (L/rs) × (M/rs) with (T/rt) frames, where rs and rt are the spatial and temporal upsampling factors, respectively. Then, the blurred version of the high-resolution video z, which is available after the space–time upscaling, and the video of interest u are of size L × M × T, and the blurring operator G is of dimension LMT × LMT. The underscored matrices are lexicographically ordered into column-stacked vector form (e.g., z ∈ R^(LMT×1)).
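To illustrate the data model (1)–(5) in code, the following is a minimal NumPy sketch (not the authors' implementation) that synthesizes a degraded observation y from a high-resolution space–time volume u using a separable uniform 3-D PSF and space–time downsampling. The function name, array shapes, blur sizes, and noise level are assumptions made only for this illustration.

import numpy as np
from scipy.ndimage import uniform_filter1d

def degrade(u, rs=2, rt=3, N=3, tau=3, sigma=1.0, seed=0):
    # Sketch of y = D(g * u) + eps for a video u of shape (T, H, W):
    # g is the separable 3-D PSF of (2), an N x N uniform spatial box
    # convolved with a 1 x 1 x tau uniform temporal box; D keeps every
    # rt-th frame and every rs-th pixel (space-time downsampling).
    z = uniform_filter1d(u, size=tau, axis=0)        # temporal blur (exposure)
    z = uniform_filter1d(z, size=N, axis=1)          # spatial blur, vertical
    z = uniform_filter1d(z, size=N, axis=2)          # spatial blur, horizontal
    y = z[::rt, ::rs, ::rs]                          # space-time downsampling
    rng = np.random.default_rng(seed)
    return y + sigma * rng.standard_normal(y.shape)  # additive noise eps

# Example: a 24-frame, 64 x 64 test volume gives an 8-frame, 32 x 32 observation.
u = np.random.default_rng(1).random((24, 64, 64))
print(degrade(u).shape)  # (8, 32, 32)

The two-step approach below effectively inverts the two factors of this model separately: the upscaling step undoes the downsampling D, and the deblurring step undoes the shift-invariant 3-D blur g.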
upscaling, and the video of interest u are of size L 2 M 2 T , and
the mathematical model between the blurred data denoted y and the
the blurring operator G is of dimension LT M 2 LT M . The ma-
desired signal u with a 3-D PSF g as follows:
trices with underscore represent that they are lexicographically ordered
y (x) = z (x) + " = (g 3 u)(x) + " (1) into column-stacked vector form (e.g., z 2 RLMT 21 ). Using (5), we
present our 3-D deblurring in Section II-C3. But first, we describe the
upscaling method.
where " is the independent and identically distributed zero mean noise
2) Space–Time (3-D) Upscaling: The first step of our two-step ap-
value (with otherwise no particular statistical distribution assumed),
proach is upscaling. Given the spatial and temporal upsampling factors
x = [x1 ; x2 ; t] is the 3-D (space–time) coordinate in vector form, 3
is the convolution operator, and g is the combination of spatial blur gs
rs and rt , we spatiotemporally upsample the low-resolution video and
then register all the pixels (yi ) of the low-resolution video onto a grid
and the temporal blur g
of the high-resolution video, where the pixel positions in the high-res-
g(x) = gs (x1 ; x2 ) 3 g (t): (2) olution grid are labeled by xj , as illustrated in Fig. 6. Due to the lower
sampling density of the low-resolution video, there are missing pixels
in the high-resolution grid, and our task is to estimate the samples z (xj )
4Perhaps, a more concise description is that motion blur effect can always

for all j from the measured samples yi for i = 1; . . . ; (LMT =rs2 rt ).


be modeled as a single 1-D shift-invariant PSF in the direction of the time
axis. This is simply because the blur results from multiple exposure of the
same fast-moving object in space during the exposure time. The skips of the Assuming that the underlying blurred function z (x) is locally smooth
temporal sampling positions can be regarded as temporal downsampling. and it is N -times differentiable, we can write the relationship between


Fig. 6. Schematic representation of the registration of the low-resolution video onto a high-resolution grid. In the illustration, a low-resolution video (3 × 3, 3 frames) is upsampled with a spatial upsampling factor of rs = 2 and a temporal upsampling factor of rt = 3.

Assuming that the underlying blurred function z(x) is locally smooth and N-times differentiable, we can write the relationship between the unknown pixel value z(x_j) and its neighboring sample y_i by a Taylor series as follows:

  y_i = z(x_i) + ε_i
      = z(x_j) + {∇z(x_j)}ᵀ(x_i − x_j) + ½ (x_i − x_j)ᵀ{Hz(x_j)}(x_i − x_j) + ··· + ε_i
      = β₀ + β₁ᵀ(x_i − x_j) + β₂ᵀ vech{(x_i − x_j)(x_i − x_j)ᵀ} + ··· + ε_i        (6)

where ∇ and H are the gradient (3 × 1) and Hessian (3 × 3) operators, respectively, and vech{·} is the half-vectorization operator that lexicographically orders the lower triangular portion of a symmetric matrix into a column-stacked vector. Furthermore, β₀ is z(x_j), which is the signal (or pixel) value of interest, and the vectors β₁ and β₂ are

  β₁ = [ ∂z(x)/∂x₁,  ∂z(x)/∂x₂,  ∂z(x)/∂t ]ᵀ |_(x = x_j)
  β₂ = ½ [ ∂²z(x)/∂x₁²,  2∂²z(x)/∂x₁∂x₂,  2∂²z(x)/∂x₁∂t,  ∂²z(x)/∂x₂²,  2∂²z(x)/∂x₂∂t,  ∂²z(x)/∂t² ]ᵀ |_(x = x_j).        (7)

Since this approach is based on local signal representations, a logical step is to estimate the parameters {β_n}, n = 0, . . . , N, using the neighboring samples y_i in a local analysis cubicle ω_j around the position of interest x_j, while giving the nearby samples higher weights than samples farther away. A weighted least-squares formulation of the fitting problem capturing this idea is

  min over {β_n} of  Σ_(i ∈ ω_j) [ y_i − β₀ − β₁ᵀ(x_i − x_j) − β₂ᵀ vech{(x_i − x_j)(x_i − x_j)ᵀ} − ··· ]² K(x_i − x_j)        (8)

with the Gaussian kernel (weight) function

  K(x_i − x_j) = √det(C_i) · exp( −(x_i − x_j)ᵀ C_i (x_i − x_j) / (2h²) )        (9)

where h is the global smoothing parameter. This is the formulation of kernel regression [24] in 3-D. We set h = 0.7 for all the experiments, and C_i is the smoothing (3 × 3) matrix for the sample y_i, which dictates the "footprint" of the kernel function; we explain shortly how we obtain it. The minimization (8) yields a pointwise estimator of the blurry signal z(x_j) with order of local signal representation N:

  ẑ(x_j) = β̂₀ = Σ_(i ∈ ω_j) W_i( K(x_i − x_j), N ) y_i        (10)

where the weights W_i are given by the choice of C_i and N. For example, choosing N = 0 (i.e., keeping only β₀ in (8) and ignoring all the higher order terms), the estimator (10) becomes

  ẑ(x_j) = Σ_i K(x_i − x_j) y_i / Σ_i K(x_i − x_j).        (11)

In this paper we set N = 2, as in [24], and the size of the cubicle is 5 × 5 × 5 in the grid of the low-resolution video. Since the pixel value of interest z(x_j) is a local combination of the neighboring samples, the performance of the estimator strongly depends on the choice of the kernel function, or, more specifically, the choice of the smoothing matrix C_i. In our previous study [1], we obtain C_i from the local gradient vectors in a local analysis cubicle ω_i whose center is located at the position of y_i:

  C_i = J_iᵀ J_i,   with   J_i = [ ⋮   ⋮   ⋮ ; z_x₁(x_p)  z_x₂(x_p)  z_t(x_p) ; ⋮   ⋮   ⋮ ],   p ∈ ω_i        (12)

where p is the index of the sample positions around the ith sample y_i in the local analysis cubicle ω_i, and z_x₁(x_p), z_x₂(x_p), and z_t(x_p) are the gradients along the vertical (x₁), horizontal (x₂), and time (t) axes, respectively.


In this paper, we first estimate the gradients (β̂₁ = [z_x₁(x_p), z_x₂(x_p), z_t(x_p)]ᵀ) using (8) with C_i = I, setting ω_i to a 5 × 5 × 5 cubicle in the grid of the low-resolution video y, and then, plugging the estimated gradients into (12), we obtain the locally adaptive smoothing matrix C_i for each y_i. With C_i given by (12), the kernel function faithfully reflects the local signal structure in space–time (we call it the steering kernel function); i.e., when we estimate a pixel on an edge, the kernel function gives larger weights to the samples y_i located on the same edge. On the other hand, if there is no local structure, all the nearby samples have similar weights. Hence, the estimator (10) preserves local object structures while suppressing the noise effects in flat regions. We refer the interested reader to [24] for further details. Once all the pixels of interest have been estimated using (10), we fill them into the matrix z of (5) and deblur the resulting 3-D data set at once, as explained in the following section.

3) Space–Time (3-D) Deblurring: Assuming that noise is effectively suppressed at the space–time upscaling stage [1], the important issue that we need to treat carefully in the deblurring stage is the suppression of ringing artifacts, particularly across time. The ringing effect in time may cause undesirable flicker when we play the output video. Therefore, the deblurring approach should smooth the output pixels not only across space, but also across time. To this end, using the data model (5), we propose a 3-D deblurring method with the 3-D version of TV to recover the pixels across space and time

  û = arg min over u of  ‖z − Gu‖₂² + λ‖Γu‖₁        (13)

where λ is the regularization parameter and Γ is a high-pass filter. The joint use of L2- and L1-norms is fairly standard [25]–[27]: the first term (L2-norm) enforces the fidelity of the reconstruction to the data (in a mean-squared sense), and the second term (L1-norm) promotes sparsity in the gradient domain, leading to sharp edges in space and time and avoiding ringing artifacts. Specifically, we implement the TV regularization as follows:

  ‖Γu‖₁ ⇒ Σ_(l=−1..1) Σ_(m=−1..1) Σ_(t=−1..1) ‖ u − S_x₁^l S_x₂^m S_t^t u ‖₁        (14)

where S_x₁^l, S_x₂^m, and S_t^t are the shift operators that shift the video u in the x₁-, x₂-, and t-directions by l, m, and t pixels, respectively. We iteratively minimize the cost C(u) = ‖z − Gu‖₂² + λ‖Γu‖₁ in (13) with (14) to find the deblurred sequence û using the steepest descent method

  û^(ℓ+1) = û^(ℓ) − μ ∂C(u)/∂u |_(u = û^(ℓ))        (15)

where μ is the step size, and

  ∂C(u)/∂u = −Gᵀ(z − Gu) + λ Σ_(l=−1..1) Σ_(m=−1..1) Σ_(t=−1..1) ( I − S_x₁^(−l) S_x₂^(−m) S_t^(−t) ) sign( u − S_x₁^l S_x₂^m S_t^t u ).        (16)

We initialize û^(ℓ) with the output of the space–time upscaling (i.e., û^(0) = z), and manually select a reasonable 3-D PSF G for the experiments with real blurry sequences.

In this paper, we select the 3-D PSF based on the exposure time τe and the frame interval τf of the input videos (which are generally available from the camera settings), and the user-defined spatial and temporal upscaling factors rs and rt. Specifically, we select the spatial PSF as an rs × rs uniform PSF. Currently, we ignore the out-of-focus blur, and we obtain the temporal support size τ of the temporal PSF by

  τ = (τe / τf) · rt        (17)

where rt is the user-defined temporal upscaling factor. Convolving the spatial PSF and the temporal PSF as shown in Fig. 5, we have a 3-D (rs × rs × τ) PSF for the deblurring (13). Our deblurring method with the rs × rs × τ PSF reduces the effective exposure time of the upscaled video. Specifically, after the deblurring, the effective exposure time of the output video is given by

  τ̃e = τe / τ = τf / rt.        (18)

Therefore, when the temporal upscaling factor rt is not high, the exposure time τ̃e is not shortened by very much, and some motion blur effects may be seen in the output video. For example, if an object moves 3 pixels per frame in the spatiotemporally upscaled video, the moving object would still be blurry along its motion trajectory in a 3-pixel-wide window even after we deblur.
III. EXPERIMENTS

We illustrate the performance of our proposed technique on both real and simulated sequences. To begin, we first illustrate motion deblurring performance on the Cup sequence, with simulated motion blur.5 The Cup example is the one we briefly showed in Section I. This sequence contains relatively simple transitions, i.e., the cup moves upward. Fig. 1(a) shows the ground-truth frames, and Fig. 1(b) shows the motion-blurred frames generated by taking the average of five consecutive frames, i.e., the corresponding PSF in 3-D is 1 × 1 × 5 uniform. The deblurred images of the Cup sequence by Fergus' method [3], Shan's method6 [4], and our approach (13) with (μ, λ) = (0.75, 0.04) are shown in Fig. 1(c)–(e), respectively. Fig. 1(f)–(j) shows the selected regions of the video frames of Fig. 1(a)–(e) at time t = 6, respectively. The corresponding PSNR7 and SSIM8 values are indicated in the figure captions. It is worth noting here again that, although motion occlusions are present in the sequence, the proposed 3-D deblurring requires neither segmentation nor motion estimation. We also note that, in a sense, one could regard a 1 × 1 × τ PSF as a 1-D PSF. However, in our paper, a 1 × N × 1 PSF and a 1 × 1 × N PSF are, for example, completely different: the 1 × N × 1 PSF blurs along the horizontal (x₂) axis, while the 1 × 1 × N PSF blurs along the time axis.

The second example, in Fig. 2, is also a simulated motion deblurring. In this example, the motion blur is caused by camera rotation about the optical axis. We generated a video by rotating the pepper image counterclockwise 1° per frame for 90 frames. This is equivalent to rotating the camera clockwise 1° per frame. The sequence of rotated pepper images is the ground-truth video in this example. We then blurred the video with a 1 × 1 × 8 uniform PSF (this is equivalent to taking the average of eight consecutive frames) and added white Gaussian noise (standard deviation σ = 2). Fig. 2(a) and (b) shows one frame from the ground-truth video and the noisy blurred video.

5 In order to examine how well the motion blur can be removed, we do not take the spatial blur into account in the experiments.

6 The software is available at http://w1.cse.cuhk.edu.hk/~leojia/programs/deblurring/deblurring.htm. We set the parameter "noiseStr" to 0.05 and used the default settings for the other parameters in all the examples.

7 PSNR = 10 log10(255²/mean squared error) (in decibels).

8 The software for the Structural SIMilarity (SSIM) index is available at http://www.ece.uwaterloo.ca/~z70wang/research/ssim/.


Fig. 7. Motion (temporal) deblurring example of the Book sequence (380 × 510, 10 frames) with real motion blur. (a) Frame of the ground truth at time t = 6. (b) and (c) Deblurred frames by Fergus' [3] and Shan's methods [4]. (d) and (f) Deblurred frames at t = 6 and 6.5 by the proposed 3-D TV method (13) using a 1 × 1 × 8 uniform PSF. (e) One of the estimated intermediate frames, at t = 6.5, by the 3-D SKR (10).

When the camera rotates, the pixels rotate at different speeds in proportion to the distance from the center of the rotation. Consequently, the motion blur is spatially variant. Even though the (temporal) PSF is independent of the scene contents or the camera motion, the shift-invariant 3-D PSF causes spatially variant motion blur effects. Using the blurred video as the output of a space–time interpolator, we deblurred the blurred video by Fergus' and Shan's blind methods. One deblurred frame by each blind method is shown in Fig. 2(c) and (d), respectively. Our deblurring result is shown in Fig. 2(e). We used the 1 × 1 × 8 shift-invariant PSF for our deblurring (13) with (μ, λ) = (0.5, 0.15).

The next experiment, shown in Fig. 7, is a realistic example, where we deblur a low temporal resolution sequence degraded by real motion blur. The cropped sequence consists of ten frames, and the sixth frame (at time t = 6) is shown in Fig. 7(a). Motion blur can be seen in the foreground (the book in front moves toward the right about 8 pixels per frame). Similar to the previous experiment, we first deblurred those frames individually by Fergus' and Shan's methods [3], [4]. Their deblurred results are in Fig. 7(b) and (c), respectively.


Fig. 8. 3-D (spatiotemporal) deblurring example of the Foreman sequence in CIF format. (a) Cropped frame at time t = 6. (b) and (c) Deblurred results of the upscaled frame shown in (e) by Fergus' [3] and Shan's methods [4]. (d) Deblurred frames by the proposed 3-D TV method (13) using a 2 × 2 × 2 uniform PSF. (e) Upscaled frames by 3-D SKR [1] at times t = 6 and 6.5 in both space and time, with spatial and temporal upscaling factors of rs = 2 and rt = 8, respectively. (f)–(i) and (j)–(n) Selected regions of the frames shown in (a)–(e) at t = 6 and 6.5.

For our method, temporal upscaling is necessary before deblurring. Here, it is indeed the case that the exposure time is shorter than the frame interval (τe < τf), as shown in Fig. 3(a). Using the 3-D SKR method (10), we upscaled the sequence with the upscaling factors rs = 1 and rt = 8 in order to generate intermediate frames and obtain a sequence like that illustrated in Fig. 3(c). We chose rt = 8 to slow the motion speed of the book down to about 1 pixel per frame so that the motion blur of the book would be almost completely removed. One of the estimated intermediate frames, at t = 6.5, is shown in Fig. 7(e). Then, we deblurred the upscaled video with a 1 × 1 × 8 uniform PSF by the proposed method (13) with (μ, λ) = (0.75, 0.06). We took the book video in dim light, and the exposure time is nearly equal to the frame interval. Selected deblurred frames9 are shown in Fig. 7(d) and (f).

The last example is another real example. This time we used the Foreman sequence in CIF format.

9 We must note that, in case severe occlusions are present in the scene, the deblurred results for the interpolated frames contain most of the errors/artifacts, and this issue is one of our important future works.


Fig. 9. Deblurring performance comparisons using absolute residuals (the absolute difference between the deblurred frames shown in Fig. 8(b)–(d) and the estimated frames shown in Fig. 8(e)). (a) Fergus' method [3]. (b) Shan's method [4]. (c) Our proposed method (13).

Fig. 8(a) shows one frame of the cropped input sequence (170 × 230, 10 frames) at time t = 6. In this example, we upscaled the Foreman sequence using 3-D SKR (10) with spatial and temporal upscaling factors of rs = 2 and rt = 8, respectively, and Fig. 8(e) shows the estimated intermediate frame at time t = 5.5 and the estimated frame at t = 6. We note that these frames are the intermediate results of our two-step deblurring approach. We also note that our 3-D SKR successfully estimated the blurred intermediate frames, as seen in the figures, and that the motion blur is spatially variant: the man's face is blurred as a result of the out-of-plane rotation of his head. This time, we deblur the upscaled frames using Fergus' and Shan's methods [3], [4], and the proposed 3-D deblurring method using a 2 × 2 × 2 uniform PSF. The exposure time of the Foreman sequence is unavailable, and we manually chose the temporal support size of the PSF to produce reasonable deblurred results. The deblurred frames are in Fig. 8(b)–(d), respectively, and Fig. 8(f)–(i) and (j)–(n) are the selected regions of the frames shown in (a)–(e) at t = 5.5 and 6, respectively. In addition, in order to compare the performance of our proposed method to Fergus' and Shan's methods, in Fig. 9 we compute the absolute residuals (in this case, the absolute difference between the deblurred frames shown in Fig. 8(b)–(d) and the estimated frames shown in Fig. 8(e)). The results illustrate that our 3-D deblurring approach successfully recovers more details of the scene, such as the man's eye pupils and the outlines of the face and nose, even without scene segmentation.

IV. CONCLUSION AND FUTURE WORKS

In this paper, instead of removing the motion blur as a spatial blur, we proposed deblurring with a 3-D space–time invariant PSF. The results showed that we could avoid segmenting video frames based on the local motions, and that temporal deblurring effectively removed motion blur even in the presence of motion occlusions.

For all the experiments in Section III, we assumed that the exposure time was known. In our future work, we plan on extending the proposed method to the case where the exposure time is also unknown.

REFERENCES

[1] H. Takeda, P. Milanfar, M. Protter, and M. Elad, "Super-resolution without explicit subpixel motion estimation," IEEE Trans. Image Process., vol. 18, no. 9, pp. 1958–1975, Sep. 2009.
[2] Q. Shan, Z. Li, J. Jia, and C. Tang, "Fast image/video upsampling," presented at the ACM Trans. Graph. (SIGGRAPH ASIA), Singapore, 2008.
[3] R. Fergus, B. Singh, A. Hertzmann, S. T. Roweis, and W. Freeman, "Removing camera shake from a single photograph," ACM Trans. Graph., vol. 25, pp. 787–794, 2006.
[4] Q. Shan, J. Jia, and A. Agarwala, "High-quality motion deblurring from a single image," ACM Trans. Graph., vol. 27, pp. 73:1–73:10, 2008.
[5] M. Ben-Ezra and S. K. Nayar, "Motion-based motion deblurring," IEEE Trans. Pattern Anal. Mach. Intell., vol. 26, no. 6, pp. 689–698, Jun. 2004.
[6] Y. Tai, H. Du, M. S. Brown, and S. Lin, "Image/video deblurring using a hybrid camera," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Anchorage, AK, Jun. 2008, pp. 1–8.
[7] S. Cho, Y. Matsushita, and S. Lee, "Removing non-uniform motion blur from images," in Proc. IEEE 11th Int. Conf. Comput. Vis., Rio de Janeiro, Brazil, Oct. 2007, pp. 1–8.
[8] A. Levin, "Blind motion deblurring using image statistics," presented at the Conf. Neural Inf. Process. Syst., Vancouver, BC, 2006.
[9] P. Milanfar, "Projection-based, frequency-domain estimation of superimposed translational motions," J. Opt. Soc. Amer. A, Opt. Image Sci., vol. 13, no. 11, pp. 2151–2162, Nov. 1996.
[10] P. Milanfar, "Two dimensional matched filtering for motion estimation," IEEE Trans. Image Process., vol. 8, no. 3, pp. 438–444, Mar. 1999.
[11] D. Robinson and P. Milanfar, "Fast local and global projection-based methods for affine motion estimation," J. Math. Imag. Vis. (Invited Paper), vol. 18, pp. 35–54, Jan. 2003.
[12] D. Robinson and P. Milanfar, "Fundamental performance limits in image registration," IEEE Trans. Image Process., vol. 13, no. 9, pp. 1185–1199, Sep. 2004.
[13] H. Ji and C. Liu, "Motion blur identification from image gradients," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Anchorage, AK, Jun. 2008, pp. 1–8.
[14] S. Dai and Y. Wu, "Motion from blur," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Anchorage, AK, Jun. 2008, pp. 1–8.
[15] J. Chen, L. Yuan, C. Tang, and L. Quan, "Robust dual motion deblurring," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Anchorage, AK, Jun. 2008, pp. 1–8.
[16] A. Agrawal and R. Raskar, "Resolving objects at higher resolution from a single motion-blurred image," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Minneapolis, MN, Jun. 2007, pp. 1–8.


[17] Y. Tai, N. Kong, S. Lin, and S. Shin, "Coded exposure imaging for projective motion deblurring," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., San Francisco, CA, Jun. 2010, pp. 2408–2415.
[18] E. Borissoff, "Optimal temporal sampling aperture for HDTV varispeed acquisition," SMPTE Motion Imag. J., vol. 113, no. 4, pp. 104–109, 2004.
[19] E. Shechtman, Y. Caspi, and M. Irani, "Space-time super-resolution," IEEE Trans. Pattern Anal. Mach. Intell., vol. 27, no. 4, pp. 531–545, Apr. 2005.
[20] A. Agrawal, M. Gupta, A. Veeraraghavan, and S. G. Narasimhan, "Optimal coded sampling for temporal super-resolution," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., San Francisco, CA, 2010, pp. 599–606.
[21] B. Wilburn, N. Joshi, V. Vaish, M. Levoy, and M. Horowitz, "High-speed videography using a dense camera array," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Washington, DC, 2004, pp. 294–301.
[22] A. Huang and T. Nguyen, "Correlation-based motion vector processing with adaptive interpolation scheme for motion-compensated frame interpolation," IEEE Trans. Image Process., vol. 18, no. 4, pp. 740–752, Apr. 2009.
[23] M. Irani and S. Peleg, "Improving resolution by image registration," CVGIP: Graph. Models Image Process., vol. 53, no. 3, pp. 231–239, May 1991.
[24] H. Takeda, S. Farsiu, and P. Milanfar, "Kernel regression for image processing and reconstruction," IEEE Trans. Image Process., vol. 16, no. 2, pp. 349–366, Feb. 2007.
[25] L. Rudin, S. Osher, and E. Fatemi, "Nonlinear total variation based noise removal algorithms," Physica D, vol. 60, pp. 259–268, Nov. 1992.
[26] C. Vogel and M. Oman, "Iterative methods for total variation denoising," SIAM J. Sci. Comput., vol. 17, pp. 227–238, 1996.
[27] S. Osher, M. Burger, D. Goldfarb, J. Xu, and W. Yin, "An iterative regularization method for total variation-based image restoration," SIAM J. Multiscale Model. Simul., vol. 4, pp. 460–489, 2005.

