IFAN

The document describes an iterative filter adaptive network for single image defocus deblurring. The network predicts and applies per-pixel deblurring filters to handle spatially variant blur. It uses iterative adaptive convolution to efficiently establish a large receptive field for large blur. The network also estimates a disparity map to help predict filters and is trained on a dataset containing defocused, dual-pixel, and ground truth images.


Iterative Filter Adaptive Network for Single Image Defocus Deblurring

2021.01.05
Junyong Lee

Computer Graphics Lab



Contents
Defocus Deblurring
Related work & Motivation
Our Approach
◾ Iterative Filter Adaptive Network
◾ Training
Experimental Results
Conclusion & Discussion

Defocus Deblurring
Restoration of an all-in-focus image from a defocused image
◾ Highly demanded by daily photographers to remove unwanted blur
◾ Can facilitate high-level vision tasks, such as
  ▪ semantic segmentation
  ▪ object detection

Qualitative comparison of our method with the previous state-of-the-art method, DPDNet¹

¹ A. Abuolaim and M. S. Brown. Defocus deblurring using dual-pixel data. In Proc. ECCV, 2020.

Defocus Deblurring
However, it is challenging due to the spatially varying nature of blur

[Figure: examples of defocus blur with varying size and varying shape]



Related Work
Conventional strategy
◾ Defocus map estimation → non-blind deconvolution
  ▪ The defocus map contains the per-pixel blur amount
◾ Often fails due to a restrictive blur model, which
  ▪ disregards the non-linearity of real-world blur
  ▪ constrains blur to specific shapes such as a disc or a Gaussian

[Figure: input, defocus map, and deblurred result]
Defocus deblurring result of DMENet¹
¹ J. Lee, S. Lee, S. Cho, and S. Lee. Deep defocus map estimation using domain adaptation. In Proc. CVPR, 2019.
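
To make the "restrictive blur model" above concrete, the following is a minimal sketch of the two parametric kernel shapes the slide names, a disc and a Gaussian. The function names and the 3-sigma truncation are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def disc_kernel(radius: float) -> np.ndarray:
    """Uniform disc (pillbox) kernel, normalized to sum to 1."""
    r = int(np.ceil(radius))
    y, x = np.mgrid[-r:r + 1, -r:r + 1]
    k = (x ** 2 + y ** 2 <= radius ** 2).astype(np.float64)
    return k / k.sum()

def gaussian_kernel(sigma: float) -> np.ndarray:
    """Isotropic Gaussian kernel truncated at 3 sigma (an assumption)."""
    r = int(np.ceil(3 * sigma))
    y, x = np.mgrid[-r:r + 1, -r:r + 1]
    k = np.exp(-(x ** 2 + y ** 2) / (2 * sigma ** 2))
    return k / k.sum()
```

Both kernels are fully described by a single size parameter, which is why a per-pixel defocus map is enough to drive them, and also why they cannot represent irregular real-world blur shapes.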

Related Work
Dual-pixel Defocus Deblurring Network (DPDNet)¹
◾ The first end-to-end deep learning-based network for defocus deblurring
◾ Dual-Pixel Defocus Deblurring (DPDD) dataset
  ▪ Dual-pixel stereo defocused images
  ▪ Merged defocused images
  ▪ Ground-truth all-in-focus images

[Figure: architecture of DPDNet]
[Figure: sample images in the DPDD dataset (dual-pixel stereo defocused images, merged defocused image, ground-truth all-in-focus image); annotation: defocus disparity ∝ blur size]

¹ A. Abuolaim and M. S. Brown. Defocus deblurring using dual-pixel data. In Proc. ECCV, 2020.

Motivation
Shortcomings of DPDNet
◾ Tends to produce ringing-like artifacts
  ▪ due to its naïve U-Net¹ architecture, which is not suited to handling spatially variant blur²
◾ At test time, DPDNet requires dual-pixel images, which are not easily obtainable
  ▪ Single image-based DPDNet performs even worse

Quantitative comparison of our method with DPDNet. DPDNet_dual requires dual-pixel stereo images for both training and testing, while DPDNet_single and our method require only a single image at test time.

¹ O. Ronneberger, P. Fischer, and T. Brox. U-net: Convolutional networks for biomedical image segmentation. In Proc. MICCAI, 2015.
² S. Zhou, J. Zhang, J. Pan, H. Xie, W. Zuo, and J. Ren. Spatio-temporal filter adaptive network for video deblurring. In Proc. ICCV, 2019.

Problem Definition
◾ “Design an end-to-end network that effectively handles spatially variant and large blur for single image defocus deblurring.”

Our Approach
Key Contributions
1. Iterative filter adaptive network (IFAN) [Spatially Variant]
  ▪ predicts and applies per-pixel deblurring filters
  ▪ adaptively handles spatially variant blur
2. Iterative adaptive convolution (IAC) [Large]
  ▪ applies the stacked deblurring filters to defocused features
  ▪ efficiently establishes a large receptive field for large blur
3. Disparity map estimation & reblurring [Single]
  ▪ defocus deblurring-specific tasks
  ▪ make maximum use of defocus blur cues in a single image

Filter Adaptive Network


Predicts per-pixel convolution filters $\mathbf{F}$
Applies the predicted filters to defocused features by Filter Adaptive Convolution¹

[Figure: Filter Adaptive Convolution]

For $\mathbf{F}$ to establish a large receptive field, a large filter size $k$ is required, which is, however, computationally inefficient.
¹ S. Zhou, J. Zhang, J. Pan, H. Xie, W. Zuo, and J. Ren. Spatio-temporal filter adaptive network for video deblurring. In Proc. ICCV, 2019.
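
The idea of applying a different filter at every pixel can be sketched as follows. This is a simplified single-channel NumPy illustration, not the cited implementation: the function name, the array layout `(H, W, k, k)`, and the edge padding are assumptions made for the example.

```python
import numpy as np

def filter_adaptive_conv(feat: np.ndarray, filters: np.ndarray) -> np.ndarray:
    """Apply a separate k x k filter at every pixel.

    feat:    (H, W) single-channel feature map
    filters: (H, W, k, k) one filter per pixel, k odd (hypothetical layout)
    """
    h, w = feat.shape
    k = filters.shape[2]
    pad = k // 2
    padded = np.pad(feat, pad, mode="edge")
    out = np.zeros((h, w), dtype=np.float64)
    for dy in range(k):
        for dx in range(k):
            # Shifted copy of the input, weighted by the matching filter tap.
            out += filters[:, :, dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out
```

Each output pixel needs k × k predicted weights, so the cost of both predicting and applying the filters grows quadratically with the receptive field, which is the inefficiency the slide points out.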

Iterative Filter Adaptive Network (IFAN)


How can a large receptive field be ensured with lighter filters?
→ Iterative application of separable filters
Predicts per-pixel stacked separable filters $\mathbf{F}$
Iteratively applies the predicted filters to defocused features by Iterative Adaptive Convolution

[Figure: Iterative Adaptive Convolution]
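
A minimal sketch of the iterative idea, assuming each of the N stacked filters is a pair of per-pixel 1D filters (one horizontal, one vertical) applied in sequence. It is single-channel and omits any bias or nonlinearity between iterations; the function names and array layouts are illustrative assumptions, not the slides' implementation.

```python
import numpy as np

def adaptive_conv_1d(feat, taps, axis):
    """Per-pixel 1D filtering along `axis` (0 = vertical, 1 = horizontal).

    feat: (H, W); taps: (H, W, k) with k odd.
    """
    h, w = feat.shape
    k = taps.shape[2]
    pad = k // 2
    pad_width = [(0, 0), (0, 0)]
    pad_width[axis] = (pad, pad)
    padded = np.pad(feat, pad_width, mode="edge")
    out = np.zeros((h, w), dtype=np.float64)
    for i in range(k):
        if axis == 0:
            out += taps[:, :, i] * padded[i:i + h, :]
        else:
            out += taps[:, :, i] * padded[:, i:i + w]
    return out

def iterative_adaptive_conv(feat, horiz, vert):
    """Apply N separable per-pixel filters one after another.

    horiz, vert: (N, H, W, k) stacked 1D filters (hypothetical layout).
    """
    out = np.asarray(feat, dtype=np.float64)
    for f_h, f_v in zip(horiz, vert):
        out = adaptive_conv_1d(out, f_h, axis=1)
        out = adaptive_conv_1d(out, f_v, axis=0)
    return out
```

Every iteration widens the support by k - 1 pixels per axis, so N small filters reach a receptive field of 1 + N(k - 1) while each pixel only needs 2k weights per iteration.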



Network Architecture
Feature Extractor → IFAN → Reconstructor
◾ $I_B$: defocused image
◾ $e_{\ldots}$: intermediate features
◾ $I_{BS}$: all-in-focus image
◾ $\mathbf{F}_{deblur}$: adaptive deblurring filters

Network Architecture
Disparity map estimator (DME) in IFAN
◾ is trained to estimate a disparity map $d^{r \to l}$, which contains the per-pixel blur size of $I_B$
→ enables more accurate prediction of the deblurring filters $\mathbf{F}_{deblur}$

Training
Reblurring Network
◾ is attached after IFAN only during training
◾ is trained to invert $\mathbf{F}_{deblur}$ into reblurring filters $\mathbf{F}_{reblur}$
◾ induces IFAN to predict $\mathbf{F}_{deblur}$ containing valid information about the blur shape and size

[Figure: the reblurring network]



Training
[Figure: training pipeline annotated with the losses $L_{deblur}$, $L_{reblur}$, and $L_{disp}$; a callout reads "Detached for testing!"]

DPDD Dataset
◾ Defocused image $I_B$, dual-pixel stereo images ($I_B^r$, $I_B^l$), all-in-focus image $I_{GT}$
◾ Stereo images are used only during training!
Loss Functions
◾ $L_{deblur} = \mathrm{MSE}(I_{BS}, I_{GT})$
◾ $L_{disp} = \mathrm{MSE}(I_{B\downarrow}^{r \to l}, I_{B\downarrow}^{l})$, where $I_{B\downarrow}^{r \to l} = \mathrm{warp}(I_{B\downarrow}^{r}, d^{r \to l})$
◾ $L_{reblur} = \mathrm{MSE}(I_{SB\downarrow}, I_{B\downarrow})$
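
The three loss terms can be sketched in NumPy as below. This is only an illustration under stated assumptions: images are single-channel arrays that have already been resized as needed, the warp is a horizontal resampling with linear interpolation, and the sign convention of the disparity is chosen arbitrarily. The function names are hypothetical, and how the terms are weighted and summed is not given on the slide, so they are returned separately.

```python
import numpy as np

def mse(a, b):
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return float(np.mean((a - b) ** 2))

def warp_horizontal(img, disparity):
    """Resample each row of `img` at x + disparity[y, x] (assumed convention).

    img, disparity: (H, W) arrays; samples are clamped to the image border.
    """
    h, w = img.shape
    xs = np.arange(w, dtype=np.float64)
    out = np.empty((h, w), dtype=np.float64)
    for y in range(h):
        sample_x = np.clip(xs + disparity[y], 0, w - 1)
        out[y] = np.interp(sample_x, xs, img[y])
    return out

def training_losses(i_bs, i_gt, i_b_right, i_b_left, d_r2l, i_sb, i_b):
    """Return (L_deblur, L_disp, L_reblur) as separate scalars."""
    l_deblur = mse(i_bs, i_gt)                      # deblurred vs. ground truth
    l_disp = mse(warp_horizontal(i_b_right, d_r2l), i_b_left)  # warped right vs. left
    l_reblur = mse(i_sb, i_b)                       # reblurred vs. defocused input
    return l_deblur, l_disp, l_reblur
```

Only the disparity term touches the dual-pixel stereo pair, which is consistent with the note that stereo images are needed only during training.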
Experimental Results

Ablation Study

Confirms the advantage of a filter adaptive network in handling spatially varying blur

Validates the potential of a filter adaptive network to absorb extra blur-specific supervision
Disparity map estimation (DME): contributes to recovering structural information (higher PSNR)
Reblurring network (RBN): contributes to recovering detail information (lower LPIPS)
DME & RBN: give the best results, as the two have a synergistic relationship
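
For reference, PSNR as used in comparisons like the one above follows the standard definition, 10 log10(MAX² / MSE). The sketch below assumes images scaled to [0, 1]; the function name and the `max_val` default are illustrative. LPIPS depends on a pretrained network and is not sketched here.

```python
import numpy as np

def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio in dB; higher means closer to the target."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    err = np.mean((pred - target) ** 2)
    if err == 0:
        return float("inf")
    return float(10.0 * np.log10(max_val ** 2 / err))
```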

Quantitative Comparison

²⁸ J. Shi, L. Xu, and J. Jia. Just noticeable defocus blur detection and estimation. In Proc. CVPR, 2015.
¹² A. Karaali and C. Jung. Edge-based defocus blur estimation with adaptive scale selection. IEEE Trans. Image Processing (TIP), 27(3):1126–1137, 2018.
¹⁵ J. Lee, S. Lee, S. Cho, and S. Lee. Deep defocus map estimation using domain adaptation. In Proc. CVPR, 2019.

Qualitative Comparison

Generalization Ability
RealDOF test set (Ours)
◾ 50 pairs of defocused and all-in-focus images
◾ captured concurrently with a dual-camera system
◾ geometrically and photometrically aligned as in RealBlur¹
→ Quantitative evaluation is available

¹ J. Rim, H. Lee, J. Won, and S. Cho. Real-world blur dataset for learning and benchmarking deblurring algorithms. In Proc. ECCV, 2020.

Generalization Ability
CUHK blur detection dataset¹
◾ 704 defocused images randomly crawled from the internet
◾ Qualitative evaluation

¹ J. Shi, L. Xu, and J. Jia. Discriminative blur detection features. In Proc. CVPR, 2014.

Generalization Ability
Pixel dual-pixel test set¹
◾ 13 pairs of dual-pixel stereo images
◾ Qualitative evaluation

¹ A. Abuolaim and M. S. Brown. Defocus deblurring using dual-pixel data. In Proc. ECCV, 2020.

Number of Deblurring Filters $N$ in IFAN

We choose $N = 17$ for our final model

FAC¹ vs IAC
Receptive field
◾ FAC: 11 × 11
◾ IAC: 35 × 35

¹ S. Zhou, J. Zhang, J. Pan, H. Xie, W. Zuo, and J. Ren. Spatio-temporal filter adaptive network for video deblurring. In Proc. ICCV, 2019.
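
The two receptive-field figures can be checked with simple arithmetic. The sketch below assumes that each IAC iteration uses a pair of 1D filters with k = 3 taps; that tap count is not stated on the slides, but together with N = 17 it reproduces the reported 35 × 35. The weight counts ignore any bias terms, and all function names are hypothetical.

```python
def fac_receptive_field(k: int) -> int:
    """A single k x k filter sees k pixels per axis."""
    return k

def iac_receptive_field(n: int, k: int) -> int:
    """n stacked separable filters: each adds (k - 1) pixels per axis."""
    return 1 + n * (k - 1)

def fac_weights_per_pixel(k: int) -> int:
    return k * k

def iac_weights_per_pixel(n: int, k: int) -> int:
    # One horizontal and one vertical 1D filter per iteration.
    return n * 2 * k

print(fac_receptive_field(11), fac_weights_per_pixel(11))      # 11 121
print(iac_receptive_field(17, 3), iac_weights_per_pixel(17, 3))  # 35 102
```

Under these assumptions the iterative scheme covers roughly ten times the area of the 11 × 11 filter with a comparable number of predicted weights per pixel.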

Discussions

Limitation
◾ The proposed network is still limited in handling large blur
◾ The network works best with typical isotropic defocus blur but may not properly handle defocus blur with an irregular shape (e.g., swirly bokeh) or strong highlights (e.g., glitter bokeh)
For future work, we plan
◾ to build a defocus deblurring dataset that contains image pairs with diverse blur types, captured with various cameras and lenses

Supplementary

Training Details
◾ Random augmentation: gray-scale conversion, noise, scale
◾ Randomly cropped patches of size 256 × 256
◾ Batch size: 8
◾ 600k iterations
◾ Learning rate: 1e-4
  ▪ step-decayed to half at 500k and 550k iterations
◾ Rectified Adam optimizer
  ▪ $\beta_1 = 0.99$, $\beta_2 = 0.999$
◾ For the stacked filters $\mathbf{F}$, we set $N = 17$
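
The learning-rate schedule listed above can be written as a small function. This sketch assumes the halving takes effect from the milestone iteration onward (iteration >= milestone); the function and parameter names are illustrative.

```python
def learning_rate(iteration: int,
                  base_lr: float = 1e-4,
                  milestones=(500_000, 550_000),
                  gamma: float = 0.5) -> float:
    """Step decay: multiply the rate by `gamma` at each milestone reached."""
    lr = base_lr
    for milestone in milestones:
        if iteration >= milestone:
            lr *= gamma
    return lr
```

With these defaults the rate stays at 1e-4 for the first 500k iterations, drops to 5e-5 until 550k, and finishes the 600k-iteration run at 2.5e-5.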

Detailed Net. Architecture



Ablation Models

Experimental results on 16-bit & dual-pixel images



Iterative Filter Adaptive Network (IFAN)


Separable filters for a deblurring kernel?
◾ The inverse of a spatially smooth kernel can be decomposed into 1D kernels for deconvolution

[Figure: an inverse deblurring kernel (a) can be expressed as a linear combination of kernels (b-e), each formed from a series of separable filters¹]

¹ L. Xu, X. Tao, and J. Jia. Inverse kernels for fast spatial deconvolution. In Proc. ECCV, 2014.
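
One standard way to write a 2D kernel as a sum of separable (1D × 1D) terms is the singular value decomposition. The sketch below illustrates that idea with NumPy; whether the slide's figure was produced exactly this way is an assumption, and the function names are hypothetical.

```python
import numpy as np

def separable_terms(kernel: np.ndarray, rank: int):
    """Return `rank` (column, row) 1D filter pairs whose outer products
    sum to the best rank-`rank` approximation of `kernel`."""
    u, s, vt = np.linalg.svd(kernel)
    return [(np.sqrt(s[i]) * u[:, i], np.sqrt(s[i]) * vt[i, :])
            for i in range(rank)]

def reconstruct(terms) -> np.ndarray:
    """Sum the outer products of the 1D filter pairs."""
    return sum(np.outer(col, row) for col, row in terms)
```

A kernel that is already separable, such as a 2D Gaussian, is recovered exactly with a single term; less regular kernels need more terms, and the singular values indicate how quickly the approximation error falls.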
