IFAN
2021.01.05
Junyong Lee
Contents
Defocus Deblurring
Related work & Motivation
Our Approach
◾ Iterative Filter Adaptive Network
◾ Training
Experimental Results
Conclusion & Discussion
Defocus Deblurring
Restoration of an all-in-focus image from a defocused image
◾ In high demand among everyday photographers who want to remove unwanted blur
◾ Can facilitate high-level vision tasks, such as
semantic segmentation
object detection
1A. Abuolaim and M.S. Brown. Defocus deblurring using dual-pixel data. In Proc. ECCV, 2020.
Defocus Deblurring
However, defocus deblurring is challenging because the blur varies spatially across the image
Related Work
Conventional strategy
◾ Defocus map estimation followed by non-blind deconvolution
The defocus map contains the per-pixel blur amount
◾ Often fails because of a restrictive blur model, which
disregards the non-linearity of real-world blur
constrains blur to specific shapes such as a disc or Gaussian
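The restrictive blur model described above can be made concrete with a small sketch. It assumes a Gaussian kernel whose standard deviation is read from the defocus map at each pixel; the function names, the `radius` parameter, and the toy image are illustrative assumptions and are not taken from DMENet or any cited method.

```python
import numpy as np

def gaussian_kernel(sigma, radius):
    """Normalized 2D Gaussian of size (2*radius+1)^2; sigma near 0 gives a delta."""
    ax = np.arange(-radius, radius + 1)
    xx, yy = np.meshgrid(ax, ax)
    if sigma <= 1e-6:
        k = ((xx == 0) & (yy == 0)).astype(float)
    else:
        k = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def spatially_varying_blur(image, defocus_map, radius=4):
    """Blur every output pixel with a Gaussian whose sigma comes from defocus_map."""
    h, w = image.shape
    padded = np.pad(image, radius, mode="edge")
    out = np.empty((h, w), dtype=float)
    for y in range(h):
        for x in range(w):
            k = gaussian_kernel(defocus_map[y, x], radius)
            patch = padded[y:y + 2 * radius + 1, x:x + 2 * radius + 1]
            out[y, x] = (patch * k).sum()
    return out

# Toy input: a vertical step edge.
image = np.zeros((16, 16))
image[:, 8:] = 1.0

# Toy defocus map: top half in focus (sigma 0), bottom half defocused (sigma 2).
defocus_map = np.zeros((16, 16))
defocus_map[8:, :] = 2.0

blurred = spatially_varying_blur(image, defocus_map)
```

Every pixel receives a kernel of the same fixed shape and only its size changes, which is the constraint that real-world defocus blur tends to violate.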
[Figure: Defocus deblurring result of DMENet1. Panels: input, defocus map, deblurred result]
1J. Lee, S. Lee, S. Cho, and S. Lee. Deep defocus map estimation using domain adaptation. In Proc. CVPR, 2019.
Related Work
Dual-pixel Defocus Deblurring Network (DPDNet)1
◾ The first end-to-end deep learning-based network for defocus deblurring
◾ Dual-Pixel Defocus Deblurring (DPDD) dataset
Dual-pixel stereo defocused images
Merged defocused images
Ground-truth all-in-focus images
Architecture of DPDNet
defocus disparity ∝ blur size
[Figure: example dual-pixel stereo defocused images, merged defocused image, and ground-truth all-in-focus image]
1A. Abuolaim and M.S. Brown. Defocus deblurring using dual-pixel data. In Proc. ECCV, 2020.
Motivation
Shortcomings of DPDNet
◾ Tends to produce ringing-like artifacts
because its naïve UNet1 architecture is not suited to handling spatially variant blur2
◾ At test time, DPDNet requires dual-pixel images, which are not easily obtainable
The single-image variant of DPDNet performs even worse
Quantitative comparison of our method with DPDNet. DPDNet_dual requires dual-pixel stereo images for both training and testing, while DPDNet_single and our method require only a single image at test time.
1O. Ronneberger, P. Fischer, and T. Brox. U-net: Convolutional networks for biomedical image segmentation. In Proc. MICCAI, 2015.
2S. Zhou, J. Zhang, J. Pan, H. Xie, W. Zuo, and J. Ren. Spatio-temporal filter adaptive network for video deblurring. In Proc. ICCV, 2019.
Problem Definition
“Design an end-to-end network that effectively handles spatially variant and large blur for single-image defocus deblurring.”
Our Approach
Key Contributions
1. Iterative filter adaptive network (IFAN)
predicts & applies per-pixel deblurring filters
adaptively handles spatially variant blur
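As a rough illustration of what applying per-pixel filters means, the NumPy sketch below filters a single-channel feature map with a different k × k kernel at every location. It is a simplified stand-in under stated assumptions: in the network the filters would be predicted from the input rather than set by hand, and the function name and array layout here are hypothetical.

```python
import numpy as np

def apply_per_pixel_filters(feat, filters):
    """
    feat:    (H, W) feature map
    filters: (H, W, k, k), one k x k kernel per output pixel (assumed layout)
    Returns the (H, W) map in which each pixel is the weighted sum of its
    k x k neighbourhood, using that pixel's own kernel.
    """
    h, w = feat.shape
    k = filters.shape[-1]
    r = k // 2
    padded = np.pad(feat, r, mode="constant")
    out = np.zeros((h, w), dtype=float)
    # Shift-and-accumulate: one pass per kernel tap instead of one per pixel.
    for dy in range(k):
        for dx in range(k):
            out += filters[:, :, dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

rng = np.random.default_rng(0)
feat = rng.random((8, 8))

# Left half: identity kernels (leave the pixel unchanged).
# Right half: 3 x 3 box kernels (local averaging).
filters = np.zeros((8, 8, 3, 3))
filters[:, :4, 1, 1] = 1.0
filters[:, 4:, :, :] = 1.0 / 9.0

filtered = apply_per_pixel_filters(feat, filters)
```

Because the kernel is indexed by pixel position, different image regions can be processed differently in a single pass, which is what lets such a layer adapt to spatially variant blur.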
1A. Abuolaim and M.S. Brown. Defocus deblurring using dual-pixel data. In Proc. ECCV, 2020.
Our Approach
Network Architecture
Feature Extractor IFAN Reconstructor
◾ $I_B$: defocused image
◾ $e_{\dots}$: intermediate features
◾ $I_{BS}$: all-in-focus image
◾ $\mathbf{F}_{deblur}$: adaptive deblurring filters
Our Approach
Network Architecture
Disparity map estimator (DME) in IFAN
Disparity map, $d^{r \to l}$
◾ is trained to estimate a disparity map that contains the per-pixel blur size of $I_B$
leading to more accurate prediction of the deblurring filters $\mathbf{F}_{deblur}$
Our Approach
Training
Reblurring Network
◾ is attached after IFAN only during training
◾ is trained to invert $\mathbf{F}_{deblur}$ into reblurring filters $\mathbf{F}_{reblur}$
◾ induces IFAN to predict $\mathbf{F}_{deblur}$ that contains valid information about the blur shape and size
Training
Loss terms: $L_{deblur}$, $L_{reblur}$, $L_{disp}$
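The slide names three loss terms without showing how they are combined. One plausible reading, given as a sketch only, is a weighted sum; the symbol $L_{total}$ and the weights $\lambda_1$ and $\lambda_2$ are assumptions introduced here for illustration and do not appear in the slides.

```latex
\[
  L_{total} \;=\; L_{deblur} \;+\; \lambda_1 \, L_{reblur} \;+\; \lambda_2 \, L_{disp}
\]
```

Under this reading, $L_{deblur}$ would supervise the restored image, $L_{reblur}$ the reblurring network that is attached only during training, and $L_{disp}$ the disparity map estimator.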
Ablation Study
Quantitative Comparison
28J. Shi, L. Xu, and J. Jia. Just noticeable defocus blur detection and estimation. In Proc. CVPR, 2015.
12A. Karaali and C. Jung. Edge-based defocus blur estimation with adaptive scale selection. IEEE Trans. Image Processing (TIP), 27(3):1126–1137, 2018.
15J. Lee, S. Lee, S. Cho, and S. Lee. Deep defocus map estimation using domain adaptation. In Proc. CVPR, 2019.
Experimental Results
Qualitative Comparison
Experimental Results
Generalization Ability
RealDOF test set (Ours)
◾ 50 pairs of defocused and all-in-focus images
◾ captured concurrently with a dual-camera system
◾ geometrically and photometrically aligned, as in RealBlur1
◾ Quantitative evaluation is possible
1J. Rim, H. Lee, J. Won, and S. Cho. Real-world blur dataset for learning and benchmarking deblurring algorithms. In Proc. ECCV, 2020.
Experimental Results
Generalization Ability
CUHK blur detection dataset1
◾ 704 defocused images randomly crawled from the internet
◾ Qualitative evaluation
1J. Shi, L. Xu, and J. Jia. Discriminative blur detection features. In Proc. CVPR, 2014.
Experimental Results
Generalization Ability
Pixel dual-pixel test set1
◾ 13 pairs of dual-pixel stereo images
◾ Qualitative evaluation
1A. Abuolaim and M.S. Brown. Defocus deblurring using dual-pixel data. In Proc. ECCV, 2020.
Experimental Results
FAC1 vs IAC
Receptive field
◾ FAC (filter adaptive convolution): 11 × 11
◾ IAC (iterative adaptive convolution): 35 × 35
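The gap between the two receptive fields can be reproduced with a short sketch. It assumes that each IAC iteration applies one 1 × 3 and one 3 × 1 per-pixel filter and that N = 17 iterations are used (N = 17 is given in the training details; the 3-tap separable form is an assumption that happens to be consistent with the 35 × 35 figure). All function names are hypothetical.

```python
import numpy as np

def receptive_field(n_iters, k=3):
    """Side length covered after n_iters iterations of a (1 x k, k x 1) pair."""
    return 1 + n_iters * (k - 1)

def iterative_separable_filtering(feat, horiz, vert):
    """
    feat:  (H, W) feature map
    horiz: (N, H, W, k) per-pixel 1 x k filters, one set per iteration
    vert:  (N, H, W, k) per-pixel k x 1 filters, one set per iteration
    """
    n, h, w, k = horiz.shape
    r = k // 2
    out = feat.astype(float)
    for i in range(n):
        # Horizontal pass with zero padding on the left and right.
        p = np.pad(out, ((0, 0), (r, r)))
        out = sum(horiz[i, :, :, t] * p[:, t:t + w] for t in range(k))
        # Vertical pass with zero padding on the top and bottom.
        p = np.pad(out, ((r, r), (0, 0)))
        out = sum(vert[i, :, :, t] * p[t:t + h, :] for t in range(k))
    return out

# Measure the footprint empirically with an impulse and all-ones filters.
n, size = 17, 41
impulse = np.zeros((size, size))
impulse[size // 2, size // 2] = 1.0
ones = np.ones((n, size, size, 3))
response = iterative_separable_filtering(impulse, ones, ones)
footprint = np.count_nonzero(response[size // 2])

print(receptive_field(17), footprint)  # 35 35
```

Under this simplified single-channel assumption, 17 separable pairs need 17 × 2 × 3 = 102 weights per pixel, whereas a single dense 35 × 35 kernel would need 1,225, which suggests why iterating small filters is attractive for large blur.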
1S. Zhou, J. Zhang, J. Pan, H. Xie, W. Zuo, and J. Ren. Spatio-temporal filter adaptive network for video deblurring. In Proc. ICCV, 2019.
Discussions
Limitation
◾ The proposed network still has difficulty handling large blur
◾ The network works best with typical isotropic defocus blur but may not
properly handle defocus blur with irregular shapes (e.g., swirly bokeh) or
strong highlights (e.g., glitter bokeh)
For future work, we plan
◾ to build a defocus deblurring dataset that contains image pairs of diverse blur
types captured with various cameras and lenses
Supplementary
Our Approach
Training Details
◾ Random augmentation: gray-scale conversion, noise, scaling
◾ Randomly cropped 256 × 256 patches
◾ Batch size: 8
◾ 600k iterations
◾ Learning rate: 1e-4
halved at 500k and 550k iterations
◾ Rectified Adam optimizer
$\beta_1 = 0.99$, $\beta_2 = 0.999$
◾ For the stacked filters $\mathbf{F}$, we set $N = 17$
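The learning-rate schedule listed above can be written as a small function. This is a sketch under one reading of the slide, namely that the rate is halved once at each of the two milestones; the function name and the `gamma` parameter are illustrative and not taken from the authors' code.

```python
import math

def learning_rate(iteration, base_lr=1e-4,
                  milestones=(500_000, 550_000), gamma=0.5):
    """Step decay: multiply base_lr by gamma once per milestone already reached."""
    passed = sum(1 for m in milestones if iteration >= m)
    return base_lr * gamma ** passed

# Sample the schedule at a few points of the 600k-iteration run.
schedule = {it: learning_rate(it)
            for it in (0, 499_999, 500_000, 549_999, 550_000, 599_999)}
```

With these settings the rate stays at 1e-4 for the first 500k iterations, drops to 5e-5 until 550k, and finishes the run at 2.5e-5.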
Ablation Models
1L. Xu, X. Tao, and J. Jia. Inverse kernels for fast spatial deconvolution. In Proc. ECCV, 2014.