
Computer Vision

CS-E4850, 5 study credits

Juho Kannala
Aalto University
Lecture 2: Image Processing

• Lecture concentrates on image filtering
• Relevant reading: Chapter 3 of Szeliski’s book

Acknowledgement: many slides from James Hays, Derek Hoiem, Svetlana Lazebnik, Esa Rahtu, Steve Seitz, David Lowe, Kristen Grauman, Alexei Efros, and others.

Three views of filtering

• Image filters in the spatial domain
– A filter is a mathematical operation on a grid of numbers
– Smoothing, sharpening, edge detection

• Image filters in the frequency domain
– Filtering is a way to modify the frequencies of images
– Hybrid images, sampling, image resizing

• Templates and image pyramids
– Filtering is a way to match a template to the image
– Detection, coarse-to-fine registration

Source: J. Hays
Image filtering

• Image filtering: compute a function of the local neighborhood at each position
• Really important in practice!
– Enhance images (denoise, resize, increase contrast, etc.)
– Extract information from images (texture, edges, distinctive points, etc.)
– Detect patterns (template matching)
– Deep convolutional networks (sequence of filters and non-linear functions)
Motivation: Image denoising

• How can we reduce noise in a photograph?

Source: Lazebnik
Moving average

• Let’s replace each pixel with a weighted average of its neighborhood
• The weights are called the filter kernel
• The weights for the average of a 3x3 neighborhood (each entry is scaled by 1/9 so that the weights sum to 1):

1 1 1
1 1 1
1 1 1

“box filter”
Source: D. Lowe
Image filtering

Kernel g[·,·] (3x3 box filter):

       1 1 1
1/9 ×  1 1 1
       1 1 1

Input image f[·,·] (10x10):

0 0 0  0  0  0  0  0 0 0
0 0 0  0  0  0  0  0 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90  0 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0  0  0  0  0  0 0 0
0 0 90 0  0  0  0  0 0 0
0 0 0  0  0  0  0  0 0 0

Output h[·,·] (the slides fill it in one pixel at a time: 0, 10, 20, 30, 30, ...; shown here for the 8x8 interior where the 3x3 window fits inside f):

 0 10 20 30 30 30 20 10
 0 20 40 60 60 60 40 20
 0 30 60 90 90 90 60 30
 0 30 50 80 80 90 60 30
 0 30 50 80 80 90 60 30
 0 20 30 50 50 60 40 20
10 20 30 30 30 30 20 10
10 10 10  0  0  0  0  0

$h[m,n] = \sum_{k,l} g[k,l]\, f[m+k,\, n+l]$

Credit: S. Seitz
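The box-filter example on these slides can be checked numerically. The following is a minimal sketch in plain Python (not the MATLAB used elsewhere in the lecture); the function name `correlate2d_valid` and the top-left indexing of the kernel are assumptions made for illustration, and the output is computed only where the 3x3 window fits inside the image.

```python
def correlate2d_valid(image, kernel):
    """h[m][n] = sum over k, l of kernel[k][l] * image[m + k][n + l].

    Indices k, l start at the kernel's top-left corner, so for a 3x3 kernel
    h[0][0] corresponds to the window centred on image[1][1].
    """
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for m in range(len(image) - kh + 1):
        row = []
        for n in range(len(image[0]) - kw + 1):
            acc = 0.0
            for k in range(kh):
                for l in range(kw):
                    acc += kernel[k][l] * image[m + k][n + l]
            row.append(acc)
        out.append(row)
    return out


# Input image f from the slide (10x10).
f = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 90, 90, 90, 90, 90, 0, 0],
    [0, 0, 0, 90, 90, 90, 90, 90, 0, 0],
    [0, 0, 0, 90, 90, 90, 90, 90, 0, 0],
    [0, 0, 0, 90, 0, 90, 90, 90, 0, 0],
    [0, 0, 0, 90, 90, 90, 90, 90, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 90, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
]
box = [[1 / 9] * 3 for _ in range(3)]  # 3x3 box filter, weights sum to 1

# Round to whole numbers, as displayed on the slide.
h = [[round(v) for v in row] for row in correlate2d_valid(f, box)]
for row in h:
    print(row)
```

The printed rows should match the 8x8 output grid of the slide, including the dip to 50 and 80 around the single missing pixel and the faint 10s produced by the isolated 90.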
Box filter - what does it do?

• Replaces each pixel with an average of its neighborhood
• Achieves smoothing effect (removes sharp features)

g[·,·]:

       1 1 1
1/9 ×  1 1 1
       1 1 1

Source: D. Lowe
Smoothing with box filter
Practice with linear filters

0 0 0
0 1 0
0 0 0

Original → ?

Source: D. Lowe
Practice with linear filters

0 0 0
0 1 0
0 0 0

Original → Filtered (no change)

Source: D. Lowe
Practice with linear filters

0 0 0
0 0 1
0 0 0

Original → ?

Source: D. Lowe
Practice with linear filters

0 0 0
0 0 1
0 0 0

Original → Shifted left by 1 pixel

Source: D. Lowe
Practice with linear filters

0 0 0          1 1 1
0 2 0  - 1/9 × 1 1 1
0 0 0          1 1 1

(Note that filter sums to 1)

Original → ?

Source: D. Lowe
Practice with linear filters

0 0 0          1 1 1
0 2 0  - 1/9 × 1 1 1
0 0 0          1 1 1

Original → Sharpening filter
- Accentuates differences with local average

Source: D. Lowe
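The three practice kernels can be tried on small arrays. This is a sketch under stated assumptions: the helper `correlate2d_same` is a hypothetical name, it uses the filtering formula h[m,n] = Σ g[k,l] f[m+k, n+l] with the kernel indexed from its centre, and pixels outside the image are treated as zero.

```python
def correlate2d_same(image, kernel):
    """Cross-correlation with zero padding; output has the size of image.

    The kernel is assumed to be square with an odd side length.
    """
    rows, cols = len(image), len(image[0])
    r = len(kernel) // 2
    out = [[0.0] * cols for _ in range(rows)]
    for m in range(rows):
        for n in range(cols):
            acc = 0.0
            for k in range(-r, r + 1):
                for l in range(-r, r + 1):
                    if 0 <= m + k < rows and 0 <= n + l < cols:
                        acc += kernel[k + r][l + r] * image[m + k][n + l]
            out[m][n] = acc
    return out


identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
shift = [[0, 0, 0], [0, 0, 1], [0, 0, 0]]
# Sharpening kernel: 2 * impulse minus the 3x3 box filter.
sharpen = [[2 * identity[i][j] - 1 / 9 for j in range(3)] for i in range(3)]
sharpen_sum = sum(sum(row) for row in sharpen)

image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12]]
unchanged = correlate2d_same(image, identity)  # same as the input
shifted = correlate2d_same(image, shift)       # content moves one pixel left

# A vertical step edge: sharpening undershoots on the dark side and
# overshoots on the bright side, accentuating the difference.
step = [[10, 10, 10, 20, 20, 20] for _ in range(5)]
sharpened = correlate2d_same(step, sharpen)
print([round(v, 2) for v in sharpened[2]])
```

In the middle row of `sharpened`, the two pixels next to the edge move from 10 and 20 to about 6.67 and 23.33, which is the "accentuates differences with local average" behaviour named on the slide. The values at the image border are affected by the zero padding and should not be read as part of the effect.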
Sharpening

Source: D. Lowe
Other filters

1 0 -1
2 0 -2
1 0 -1
Sobel

Vertical Edge
(absolute value)
Other filters

 1  2  1
 0  0  0
-1 -2 -1
Sobel

Horizontal Edge
(absolute value)
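To see why the first Sobel kernel responds to vertical edges and the second to horizontal ones, both can be applied to a synthetic image containing a single vertical step. This is a minimal sketch; `correlate2d_valid` is a hypothetical helper that evaluates the filter only where the window fits inside the image.

```python
def correlate2d_valid(image, kernel):
    """h[m][n] = sum over k, l of kernel[k][l] * image[m + k][n + l]."""
    kh, kw = len(kernel), len(kernel[0])
    return [
        [
            sum(kernel[k][l] * image[m + k][n + l]
                for k in range(kh) for l in range(kw))
            for n in range(len(image[0]) - kw + 1)
        ]
        for m in range(len(image) - kh + 1)
    ]


sobel_vertical_edge = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
sobel_horizontal_edge = [[1, 2, 1], [0, 0, 0], [-1, -2, -1]]

# Dark left half, bright right half: one vertical edge, no horizontal edge.
image = [[0, 0, 0, 10, 10, 10] for _ in range(5)]

vertical = [[abs(v) for v in row]
            for row in correlate2d_valid(image, sobel_vertical_edge)]
horizontal = [[abs(v) for v in row]
              for row in correlate2d_valid(image, sobel_horizontal_edge)]
print(vertical[0])
print(horizontal[0])
```

The first kernel differentiates along the rows and smooths along the columns, so its absolute response is large only in the columns that straddle the step; the second kernel sees no change from row to row and returns zero everywhere.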
Key properties

• Linearity:
filter(f1 + f2) = filter(f1) + filter(f2)

• Shift invariance:
filter(shift(f)) = shift(filter(f))
-> same behavior regardless of pixel location

• Theoretical result: any linear shift-invariant operator can be represented as a convolution

Source: S. Lazebnik
Properties in more detail

• Commutative: a * b = b * a
– Conceptually no difference between filter and signal
• Associative: a * (b * c) = (a * b) * c
– Often apply several filters one after another: (((a * b1) * b2) * b3)
– This is equivalent to applying one filter: a * (b1 * b2 * b3)
• Distributes over addition: a * (b + c) = (a * b) + (a * c)
• Scalars factor out: ka * b = a * kb = k (a * b)
• Identity: unit impulse e = […, 0, 0, 1, 0, 0, …], a * e = a

Source: S. Lazebnik
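The algebraic properties listed above are easy to confirm on short integer sequences. The sketch below uses a hypothetical helper `conv1d` that computes the full 1D convolution (output length len(a) + len(b) - 1); the specific sequences are arbitrary examples, not taken from the lecture.

```python
def conv1d(a, b):
    """Full 1D convolution: out[i + j] accumulates a[i] * b[j]."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


a = [1, 2, 3, 4]
b = [1, 0, -1]
c = [0, 2, 1]
k = 3

commutative = conv1d(a, b) == conv1d(b, a)
associative = conv1d(a, conv1d(b, c)) == conv1d(conv1d(a, b), c)
distributive = (conv1d(a, [x + y for x, y in zip(b, c)])
                == [p + q for p, q in zip(conv1d(a, b), conv1d(a, c))])
scalars_factor_out = (conv1d([k * x for x in a], b)
                      == [k * v for v in conv1d(a, b)])
identity = conv1d(a, [1]) == a  # unit impulse leaves the signal unchanged

print(commutative, associative, distributive, scalars_factor_out, identity)
```

Associativity is the property that matters most in practice: a chain of small filters can be collapsed into one combined filter before touching the image.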
Filtering vs. Convolution

I = image, f = filter

• 2D filtering:
h=filter2(f,I); or h=imfilter(I,f);

$h[m,n] = \sum_{k,l} f[k,l]\, I[m+k,\, n+l]$

• 2D convolution:
h=conv2(f,I);

$h[m,n] = \sum_{k,l} f[k,l]\, I[m-k,\, n-l]$
Definition of convolution

• Let f be the image and g be the kernel. The output of convolving f with g is denoted f * g

$(f * g)[m,n] = \sum_{k,l} f[m-k,\, n-l]\, g[k,l]$

Convention: kernel is “flipped”

• See MATLAB functions: conv2, filter2, imfilter (the latter two don’t flip the kernel)
Source: F. Durand
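The difference between filtering (correlation) and convolution is only the flip of the kernel, which becomes visible with an asymmetric kernel. The sketch below is an illustration in plain Python; `filter2d_same` and `flipped` are hypothetical helper names, the kernel is indexed from its centre, and pixels outside the image are treated as zero.

```python
def filter2d_same(image, kernel, flip):
    """Zero-padded filtering with a centred, odd-sized square kernel.

    flip=False: correlation, h[m][n] = sum kernel[k][l] * image[m + k][n + l]
    flip=True:  convolution, h[m][n] = sum kernel[k][l] * image[m - k][n - l]
    """
    rows, cols = len(image), len(image[0])
    r = len(kernel) // 2
    sign = -1 if flip else 1
    out = [[0] * cols for _ in range(rows)]
    for m in range(rows):
        for n in range(cols):
            for k in range(-r, r + 1):
                for l in range(-r, r + 1):
                    i, j = m + sign * k, n + sign * l
                    if 0 <= i < rows and 0 <= j < cols:
                        out[m][n] += kernel[k + r][l + r] * image[i][j]
    return out


def flipped(kernel):
    """Rotate the kernel by 180 degrees (flip rows and columns)."""
    return [row[::-1] for row in kernel[::-1]]


image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12]]
shift = [[0, 0, 0], [0, 0, 1], [0, 0, 0]]

correlated = filter2d_same(image, shift, flip=False)  # shifts content left
convolved = filter2d_same(image, shift, flip=True)    # shifts content right
same_as_flipped = convolved == filter2d_same(image, flipped(shift), flip=False)
print(correlated[0], convolved[0], same_as_flipped)
```

For symmetric kernels such as the box or Gaussian filter the flip has no effect, which is why the two operations are often used interchangeably for smoothing.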
Important filter - Gaussian

• Spatially-weighted average

0.003 0.013 0.022 0.013 0.003
0.013 0.059 0.097 0.059 0.013
0.022 0.097 0.159 0.097 0.022
0.013 0.059 0.097 0.059 0.013
0.003 0.013 0.022 0.013 0.003

5 x 5, σ = 1

Credit: C. Rasmussen
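The numbers in the 5 x 5 table are consistent with sampling the 2D Gaussian density exp(-(x² + y²) / (2σ²)) / (2πσ²) at integer offsets with σ = 1. The sketch below reproduces them; the function name `gaussian_kernel_2d` is an assumption for illustration and is not a library call.

```python
import math


def gaussian_kernel_2d(size, sigma):
    """Sample the 2D Gaussian density on a size x size integer grid."""
    r = size // 2
    norm = 2 * math.pi * sigma * sigma
    return [
        [math.exp(-(x * x + y * y) / (2 * sigma * sigma)) / norm
         for x in range(-r, r + 1)]
        for y in range(-r, r + 1)
    ]


kernel = gaussian_kernel_2d(5, 1.0)
rounded = [[round(v, 3) for v in row] for row in kernel]
total = sum(sum(row) for row in kernel)

for row in rounded:
    print(row)
print(round(total, 3))
```

Because the grid is truncated at ±2, the sampled weights sum to slightly less than 1 (about 0.98). Implementations commonly divide by this sum so that filtering does not darken the image; that renormalisation step is not shown in the table above.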
Smoothing with Gaussian filter
Smoothing with box filter
Gaussian filters

• Remove “high-frequency” components from the image (low-pass filter)
– Images become more smooth
• Convolution with self is another Gaussian
– So can smooth with small-width kernel, repeat, and get same result as larger-width kernel would have
– Convolving two times with Gaussian kernel of width σ is same as convolving once with kernel of width σ√2
• Separable kernel
– Factors into product of two 1D Gaussians

Source: K. Grauman
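The σ√2 rule can be checked numerically in 1D. This is a sketch under stated assumptions: the kernels are sampled Gaussians truncated at six standard deviations or more and normalised to sum to 1, and `gaussian_1d` and `conv1d` are hypothetical helper names.

```python
import math


def gaussian_1d(sigma, radius):
    """Sampled 1D Gaussian on [-radius, radius], normalised to sum to 1."""
    weights = [math.exp(-(x * x) / (2 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def conv1d(a, b):
    """Full 1D convolution."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


sigma = 2.0
narrow = gaussian_1d(sigma, 12)                # 25 taps
twice = conv1d(narrow, narrow)                 # 49 taps, radius 24
wide = gaussian_1d(sigma * math.sqrt(2), 24)   # single kernel of width sigma*sqrt(2)

max_difference = max(abs(p - q) for p, q in zip(twice, wide))
print(max_difference)
```

The two kernels agree to within a very small tolerance; the remaining difference comes from truncating and sampling the Gaussian. Variances add under convolution, which is where the factor √2 comes from: σ² + σ² = (σ√2)².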
Separability of the Gaussian filter
Separability example

[Figure: 2D filtering (center location only). The filter factors into a product of 1D filters: perform filtering along rows, followed by filtering along the remaining column.]

Source: K. Grauman
Separability

• Why is separability useful in practice?
• Filter of size k*k requires k² operations per pixel
• Only 2k operations for separable kernels
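The saving can be illustrated by filtering once with a 2D kernel and once with its two 1D factors. The sketch below uses the integer kernel [1, 2, 1] and its outer product as a stand-in for a Gaussian (an assumption chosen so that the comparison is exact); the helper names are hypothetical.

```python
def correlate_rows(image, weights):
    """Filter along each row, keeping only fully covered positions."""
    w = len(weights)
    return [[sum(weights[l] * row[n + l] for l in range(w))
             for n in range(len(row) - w + 1)] for row in image]


def correlate_cols(image, weights):
    """Filter along each column, keeping only fully covered positions."""
    h = len(weights)
    return [[sum(weights[k] * image[m + k][n] for k in range(h))
             for n in range(len(image[0]))]
            for m in range(len(image) - h + 1)]


def correlate2d_valid(image, kernel):
    """Direct 2D filtering where the window fits inside the image."""
    kh, kw = len(kernel), len(kernel[0])
    return [[sum(kernel[k][l] * image[m + k][n + l]
                 for k in range(kh) for l in range(kw))
             for n in range(len(image[0]) - kw + 1)]
            for m in range(len(image) - kh + 1)]


weights = [1, 2, 1]
kernel_2d = [[a * b for b in weights] for a in weights]  # outer product
image = [[(3 * i + 5 * j) % 7 for j in range(6)] for i in range(5)]

direct = correlate2d_valid(image, kernel_2d)
two_pass = correlate_cols(correlate_rows(image, weights), weights)

k = len(weights)
multiplies_direct = k * k   # per output pixel
multiplies_separable = 2 * k
print(direct == two_pass, multiplies_direct, multiplies_separable)
```

For k = 3 the saving is modest (9 versus 6 multiplications per pixel), but it grows linearly with k: a 15 x 15 kernel needs 225 multiplications directly and only 30 when separated.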


Practical matters – what happens near the edge?

• The filter window falls off the edge of the image
• Need to extrapolate:
– clip filter (black)
Matlab: imfilter(f, g, 0)
– wrap around
Matlab: imfilter(f, g, 'circular')
– copy edge
Matlab: imfilter(f, g, 'replicate')
– reflect across edge
Matlab: imfilter(f, g, 'symmetric')
Source: S. Marschner
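The four extrapolation rules are easiest to compare on a single image row. The sketch below is a plain-Python illustration, not MATLAB's implementation; `pad_row` and its mode names are assumptions chosen to mirror the options listed above.

```python
def pad_row(row, pad, mode):
    """Extend a row by `pad` samples on each side using the given rule."""
    n = len(row)
    out = []
    for i in range(-pad, n + pad):
        if 0 <= i < n:
            out.append(row[i])
        elif mode == "zero":         # clip filter (black)
            out.append(0)
        elif mode == "circular":     # wrap around
            out.append(row[i % n])
        elif mode == "replicate":    # copy edge
            out.append(row[min(max(i, 0), n - 1)])
        elif mode == "symmetric":    # reflect across edge (edge pixel repeated)
            j = i % (2 * n)
            out.append(row[j] if j < n else row[2 * n - 1 - j])
        else:
            raise ValueError("unknown padding mode: " + mode)
    return out


row = [1, 2, 3, 4]
for mode in ("zero", "circular", "replicate", "symmetric"):
    print(mode, pad_row(row, 2, mode))
```

Zero padding darkens the border after smoothing, while the other three rules keep the border intensity closer to the image content; which one is appropriate depends on the image and the task.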
Practical matters

• What is the size of the output?
• Matlab: filter2(g, f, shape)
– shape = 'full': output size is sum of sizes of f and g, minus 1 in each dimension
– shape = 'same': output size is same as f
– shape = 'valid': output size is difference of sizes of f and g, plus 1 in each dimension

[Figure: positions of the kernel g relative to the image f for the full, same and valid output shapes]
Source: S. Lazebnik
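The three output sizes follow directly from counting the positions at which the kernel overlaps the image. A small sketch, with the hypothetical helper `output_size` taking (rows, columns) tuples:

```python
def output_size(image_size, kernel_size, shape):
    """Output size of 2D filtering for the 'full', 'same' and 'valid' shapes."""
    rows, cols = image_size
    krows, kcols = kernel_size
    if shape == "full":    # every position with at least one pixel of overlap
        return (rows + krows - 1, cols + kcols - 1)
    if shape == "same":    # kernel centre visits every image pixel
        return (rows, cols)
    if shape == "valid":   # kernel lies entirely inside the image
        return (max(rows - krows + 1, 0), max(cols - kcols + 1, 0))
    raise ValueError("unknown shape: " + shape)


for shape in ("full", "same", "valid"):
    print(shape, output_size((10, 10), (3, 3), shape))
```

For a 10x10 image and a 3x3 kernel this gives 12x12, 10x10 and 8x8; only the 'valid' output is free of any boundary extrapolation.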
Why does the Gaussian give a smoother output than the box filter?

[Figure panels: Gaussian, Box filter]

Source: D. Hoiem
Why does a lower-resolution image still make sense? What is lost?

Source: D. Hoiem
Thinking in terms of frequency
Jean Baptiste Joseph Fourier (1768-1830)

• He had a crazy idea in 1807: Any univariate function can be rewritten as a weighted sum of sines and cosines of different frequencies.
• Don’t believe it?
– Neither did Lagrange, Laplace, Poisson and other big wigs
– Not translated into English until 1878!
• But it’s (mostly) true!
– Called Fourier Series
– There are some subtle restrictions

“...the manner in which the author arrives at these equations is not exempt of difficulties and...his analysis to integrate them still leaves something to be desired on the score of generality and even rigour.”

[Portraits: Laplace, Legendre, Lagrange]

Slides: Efros
A sum of sines

• Our building block:
$A \sin(\omega x + \phi)$
• Add enough of them to get any signal f(x) you want!
A sum of sines

• Example:
g(t) = sin(2πf t) + (1/3) sin(2π(3f) t)

[Figure: g(t) shown as the sum of its two sine components]
Example: Music

• We think of music in terms of frequencies at different magnitudes

Source: D. Hoiem
2D signals

• We can also think of all kinds of other signals the same way

Source: D. Hoiem
Other signals

• We can also think of all kinds of other signals the same way

Source: D. Hoiem
Fourier analysis in images

• In the 2D case we have a two-dimensional frequency (which also encodes the 2D orientation of the sine wave)

[Figure panels: Intensity Image, Fourier Image]

Slide adapted from D. Hoiem, http://sharp.bu.edu/~slehar/fourier/fourier.html#filtering
Signals can be composed

[Figure: two images added to form a third]

http://sharp.bu.edu/~slehar/fourier/fourier.html#filtering
Source: D. Hoiem More: http://www.cs.unm.edu/~brayer/vision/fourier.html
Fourier Bases

[Figure: log-magnitude Fourier image with annotated regions: Low Frequencies; Strong Vertical Frequency (Sharp Horizontal Edge); Strong Horz. Frequency (Sharp Vert. Edge); Diagonal Frequencies]

This change of basis is the Fourier Transform
Source: Hays, Hoiem


Fourier Transform

• Fourier transform stores the magnitude and phase at each frequency
– Magnitude encodes how much signal there is at a particular frequency
– Phase encodes spatial information (indirectly)
– For mathematical convenience, this is often notated in terms of complex numbers, with real part R(ω) and imaginary part I(ω)

Amplitude: $A = \pm\sqrt{R(\omega)^2 + I(\omega)^2}$    Phase: $\phi = \tan^{-1}\frac{I(\omega)}{R(\omega)}$

Euler’s formula: $e^{i\theta} = \cos\theta + i\sin\theta$

Source: D. Hoiem
Source: L. Xie
Computing 2D-DFT

[Equations: DFT and IDFT]

• Discrete, 2-D Fourier & inverse Fourier transforms are implemented in fft2 and ifft2, respectively
• fftshift: Move origin (DC component) to image center for display
• Example:
>> I = imread('test.png'); % Load grayscale image
>> F = fftshift(fft2(I)); % Shifted transform
>> imshow(log(abs(F)),[]); % Show log magnitude
>> imshow(angle(F),[]); % Show phase angle
The Convolution Theorem

• The Fourier transform of the convolution of two functions is the product of their Fourier transforms

$F[g * h] = F[g]\, F[h]$

• The inverse Fourier transform of the product of two Fourier transforms is the convolution of the two inverse Fourier transforms

$F^{-1}[g h] = F^{-1}[g] * F^{-1}[h]$

• Convolution in spatial domain is equivalent to multiplication in frequency domain!

Source: D. Hoiem
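The theorem can be verified on short 1D signals with a naive discrete Fourier transform. This is a sketch under stated assumptions: `dft`, `idft` and `circular_convolve` are hypothetical helpers written for clarity rather than speed, and the discrete form of the theorem applies to circular convolution, so both signals are zero-padded to a common length to make circular and ordinary convolution coincide.

```python
import cmath


def dft(x):
    """Naive O(N^2) discrete Fourier transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(X):
    """Naive inverse discrete Fourier transform."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def circular_convolve(g, h):
    """Circular convolution of two equal-length sequences."""
    n = len(g)
    return [sum(g[k] * h[(t - k) % n] for k in range(n)) for t in range(n)]


# Zero-padded so that the wrap-around never mixes samples.
g = [1, 2, 3, 4, 0, 0, 0, 0]
h = [1, -1, 0, 0, 0, 0, 0, 0]

direct = circular_convolve(g, h)
product = [a * b for a, b in zip(dft(g), dft(h))]
via_fourier = [v.real for v in idft(product)]

max_error = max(abs(d - v) for d, v in zip(direct, via_fourier))
print(direct, max_error)
```

With a fast Fourier transform in place of the naive `dft`, the frequency-domain route costs on the order of N log N operations, which is why it can be faster than direct filtering when the kernel is large.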
Properties of Fourier Transforms

• Linearity

• Fourier transform of a real signal is conjugate-symmetric about the origin (so its magnitude is symmetric)

• The energy of the signal is the same as the energy of its Fourier transform

See Szeliski Book (3.4)


Questions

• Which has more information, the phase or the magnitude?
• What happens if you take the phase from one image and combine it with the magnitude from another image?
Example: amplitude vs. phase

A = “Aron”               P = “Phyllis”
FA = fft2(A)             FP = fft2(P)

log(abs(FA))             log(abs(FP))

angle(FA)                angle(FP)

ifft2(abs(FA) .* exp(1i*angle(FP)))      ifft2(abs(FP) .* exp(1i*angle(FA)))

Source: L. Xie
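A 1D toy version of this experiment shows why the phase dominates the perceived structure. In the sketch below, two signals contain the same impulse at different positions (a deliberately simple choice, not the images from the slide): their magnitude spectra are identical, so the position is carried entirely by the phase. The helpers `dft` and `idft` are hypothetical naive implementations.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(X):
    """Naive inverse discrete Fourier transform."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


a = [0, 0, 1, 0, 0, 0, 0, 0]  # impulse at position 2
p = [0, 0, 0, 0, 0, 1, 0, 0]  # impulse at position 5
FA, FP = dft(a), dft(p)

# Shifting a signal changes only the phase of its transform.
magnitudes_equal = all(abs(abs(u) - abs(v)) < 1e-9 for u, v in zip(FA, FP))

# Magnitude of A combined with the phase of P.
mixed = idft([cmath.rect(abs(u), cmath.phase(v)) for u, v in zip(FA, FP)])
mixed_real = [round(v.real, 6) for v in mixed]
print(magnitudes_equal, mixed_real)
```

The mixed signal places the impulse where P had it, which mirrors the observation on the slide that the reconstruction tends to resemble the image that contributed the phase.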
What does all this have to do with filtering?
Filtering in spatial domain

1 0 -1
2 0 -2
1 0 -1

[Figure: image * filter = filtered image]

Source: D. Hoiem
Filtering in frequency domain

[Figure: FFT of the image and FFT of the filter are combined; the inverse FFT of the result gives the filtered image]

Source: D. Hoiem
Why does the Gaussian give a smoother output than the box filter?

[Figure panels: Gaussian, Box filter]

Source: D. Hoiem
Gaussian filter
Box filter
Why does a lower-resolution image still make sense? What is lost?

Source: D. Hoiem
Subsampling by a factor of two

Throw away every other row and column to create a ½ size image
Problem: Aliasing

• One-dimensional example (sinewave):

[Figure: sampled sine wave]

Aliasing in graphics

• Characteristic errors may appear: “checkerboards disintegrate”, “striped shirts look funny”, …
Nyquist-Shannon sampling theorem

• When sampling a signal at discrete intervals, the sampling frequency must be ≥ 2 × fmax, where fmax is the highest frequency present in the signal
• This allows the original to be reconstructed perfectly from the sampled version

[Figure panels: good, bad]
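Aliasing can be demonstrated by sampling a sine wave below the rate required by the theorem. In the sketch below the frequencies (9 Hz and 1 Hz, sampled at 10 Hz) are arbitrary illustrative choices and `sample_sine` is a hypothetical helper.

```python
import math


def sample_sine(frequency_hz, sampling_rate_hz, count):
    """Return `count` samples of sin(2*pi*f*t) taken at t = n / sampling rate."""
    return [math.sin(2 * math.pi * frequency_hz * n / sampling_rate_hz)
            for n in range(count)]


sampling_rate = 10.0                      # Hz, so only frequencies up to 5 Hz are safe
high = sample_sine(9.0, sampling_rate, 20)   # 9 Hz: undersampled
low = sample_sine(1.0, sampling_rate, 20)    # 1 Hz: adequately sampled

# The 9 Hz samples coincide with those of a sign-flipped 1 Hz sine.
max_difference = max(abs(h + l) for h, l in zip(high, low))
print(max_difference)
```

Once sampled, the 9 Hz signal is indistinguishable from a 1 Hz signal, so no later processing can recover the original frequency. This is why the high frequencies have to be removed before sampling rather than after.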
Solution: Anti-aliasing

• Option 1: Sample more often
• Option 2: Get rid of frequencies greater than half the new sampling frequency (i.e. filter)
-> Loss of information, but still better than aliasing

• Example algorithm for downsampling by factor 2 (Matlab):
1. Apply low-pass filter
im_blur = imfilter(image,fspecial('gaussian',7,1));
2. Sample every other pixel
im_small = im_blur(1:2:end , 1:2:end);
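The two-step recipe above can be sketched in plain Python on a checkerboard, the classic worst case for subsampling. The assumptions here are loud ones: a 3-tap binomial kernel [0.25, 0.5, 0.25] stands in for the 7x7 Gaussian of the MATLAB snippet, the boundary is handled by wrap-around, and `blur_circular` and `subsample` are hypothetical helper names.

```python
def blur_circular(image, weights):
    """Separable low-pass filter with wrap-around boundary handling."""
    rows, cols = len(image), len(image[0])
    r = len(weights) // 2
    horizontal = [
        [sum(w * image[m][(n + d - r) % cols] for d, w in enumerate(weights))
         for n in range(cols)]
        for m in range(rows)
    ]
    return [
        [sum(w * horizontal[(m + d - r) % rows][n] for d, w in enumerate(weights))
         for n in range(cols)]
        for m in range(rows)
    ]


def subsample(image):
    """Keep every other row and column (MATLAB: im(1:2:end, 1:2:end))."""
    return [row[::2] for row in image[::2]]


# 8x8 checkerboard of 0s and 1s: the highest frequency the grid can hold.
checkerboard = [[(m + n) % 2 for n in range(8)] for m in range(8)]

naive = subsample(checkerboard)
prefiltered = subsample(blur_circular(checkerboard, [0.25, 0.5, 0.25]))
print(naive[0], prefiltered[0])
```

Without pre-filtering, the subsampled image is uniformly black, which misrepresents a pattern whose average intensity is 0.5. With pre-filtering, the result is a uniform 0.5: the fine pattern is lost, but the remaining information is correct.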
Subsampling without pre-filtering

[Figure panels: 1/2, 1/4 (2x zoom), 1/8 (4x zoom)]

Credit: S. Seitz
Subsampling with pre-filtering

[Figure panels: Gaussian 1/2, G 1/4, G 1/8]

Credit: S. Seitz
Why does a lower-resolution image still make sense? What is lost?

Source: D. Hoiem
Hybrid Images

A. Oliva, A. Torralba, P.G. Schyns, “Hybrid Images,” SIGGRAPH 2006

Source: D. Hoiem
Why do we get distance-dependent interpretation of a hybrid image?

Adapted from a slide by D. Hoiem
Clues from Human Perception

• Early processing in humans filters for various orientations and scales of frequency
• Perceptual cues in the mid-high frequencies dominate perception
• When we see an image from far away, we are effectively subsampling it (and low pass filtering)

[Figure: Early Visual Processing: Multi-scale edge and blob filters]

Source: D. Hoiem
Hybrid Image in FFT

[Figure panels: Hybrid Image, Low-passed Image, High-passed Image]

Source: D. Hoiem
Thus, we get distance-dependent interpretation of a hybrid image

Adapted from a slide by D. Hoiem


Template matching using filtering
Template matching

• Goal: find the template (here, an eye patch) in the image
• Approach: Filter image using the template
• What is a good filter function (i.e. similarity measure) between two patches?

Source: D. Hoiem
Matching with filters

• Goal: find the template (an eye patch) in the image
• Method 1: filter the image with eye patch

$h[m,n] = \sum_{k,l} g[k,l]\, f[m+k,\, n+l]$

f = image
g = filter

What went wrong?

[Figure panels: Input, Filtered Image]
Source: D. Hoiem


Matching with filters

• Goal: find the template (an eye patch) in the image
• Method 2: filter with zero-mean eye

$h[m,n] = \sum_{k,l} (g[k,l] - \bar{g})\, f[m+k,\, n+l]$, where $\bar{g}$ is the mean of template g

[Figure panels: Input, Filtered Image (scaled), Thresholded Image, with true detections and false detections marked]


Matching with filters

• Goal: find the template (an eye patch) in the image
• Method 3: Normalized cross-correlation

$h[m,n] = \dfrac{\sum_{k,l} (g[k,l] - \bar{g})\,(f[m+k,\, n+l] - \bar{f}_{m,n})}{\left( \sum_{k,l} (g[k,l] - \bar{g})^2 \; \sum_{k,l} (f[m+k,\, n+l] - \bar{f}_{m,n})^2 \right)^{0.5}}$

where $\bar{g}$ is the mean of the template and $\bar{f}_{m,n}$ is the mean of the image patch

Matlab: normxcorr2(template, im)

Source: D. Hoiem
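The difference between Method 1 and Method 3 can be reproduced on a tiny synthetic image. The sketch below is an illustration in plain Python, not the implementation behind normxcorr2: the function names are hypothetical, responses are computed only where the template fits inside the image, and patches with zero variance are given a score of 0 by assumption. The image contains a brighter, higher-contrast copy of the template plus a uniformly bright distractor block.

```python
import math


def patches(image, rows, cols):
    """Yield (m, n, flattened patch) for every window that fits in the image."""
    for m in range(len(image) - rows + 1):
        for n in range(len(image[0]) - cols + 1):
            yield m, n, [image[m + k][n + l]
                         for k in range(rows) for l in range(cols)]


def plain_correlation(image, template):
    """Method 1: h[m,n] = sum g[k,l] * f[m+k, n+l]."""
    g = [v for row in template for v in row]
    return {(m, n): sum(a * b for a, b in zip(g, patch))
            for m, n, patch in patches(image, len(template), len(template[0]))}


def normalized_cross_correlation(image, template):
    """Method 3: correlation of mean-subtracted, energy-normalised patches."""
    g = [v for row in template for v in row]
    g_mean = sum(g) / len(g)
    g_centred = [v - g_mean for v in g]
    g_energy = sum(v * v for v in g_centred)
    scores = {}
    for m, n, patch in patches(image, len(template), len(template[0])):
        f_mean = sum(patch) / len(patch)
        f_centred = [v - f_mean for v in patch]
        f_energy = sum(v * v for v in f_centred)
        denominator = math.sqrt(g_energy * f_energy)
        numerator = sum(a * b for a, b in zip(g_centred, f_centred))
        # Flat patches have no structure to match; score them as 0.
        scores[(m, n)] = numerator / denominator if denominator > 0 else 0.0
    return scores


template = [[1, 2], [3, 5]]
image = [[1] * 7 for _ in range(5)]
for k in range(2):
    for l in range(2):
        image[1 + k][1 + l] = 2 * template[k][l] + 3  # rescaled copy at (1, 1)
for m in range(2, 5):
    for n in range(4, 7):
        image[m][n] = 50                              # bright uniform distractor

plain = plain_correlation(image, template)
ncc = normalized_cross_correlation(image, template)
plain_peak = max(plain, key=plain.get)
ncc_peak = max(ncc, key=ncc.get)
print(plain_peak, ncc_peak, round(ncc[ncc_peak], 6))
```

Plain filtering responds most strongly to the bright block, which is what "went wrong" in Method 1: brightness is rewarded regardless of structure. Normalized cross-correlation peaks at the rescaled copy with a score of 1, illustrating its invariance to local average intensity and contrast.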
Matching with filters

• Goal: find the template (an eye patch) in the image
• Method 3: Normalized cross-correlation

[Figure panels: Input, Normalized X-Correlation, Thresholded Image, with true detections marked]


Q: What is the best method to use?

A: Depends
• Zero-mean filter: fast but not a great matcher
• Normalized cross-correlation: slow but invariant to local average intensity and contrast

Source: D. Hoiem
Q: What if we want to find larger or smaller eyes?

A: Image pyramids: multiresolution image representations

• Repeated decimation with a Gaussian low-pass filter gives Gaussian pyramid
Template Matching with Image Pyramids

Input: Image, Template

1. Match template at current scale
2. Downsample image
• In practice, scale step of 1.1 to 1.2
3. Repeat 1-2 until image is very small
4. Take responses above some threshold, perhaps with non-maxima suppression

[Figure: Image → Gaussian Filter → Low-Pass Filtered Image → Sample → Low-Res Image]
Source: D. Hoiem
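The filter-then-sample loop in the diagram can be sketched as follows. Several choices here are assumptions made to keep the example short: each level is downsampled by a factor of 2 rather than the 1.1 to 1.2 step mentioned above, a 5-tap binomial kernel approximates the Gaussian, borders are handled by copying the edge pixel, and the function names are hypothetical. Template matching at each level is omitted.

```python
def blur_1d(values, weights):
    """Filter a sequence with edge replication at both ends."""
    r = len(weights) // 2
    n = len(values)
    return [sum(w * values[min(max(i + d - r, 0), n - 1)]
                for d, w in enumerate(weights))
            for i in range(n)]


def blur(image, weights):
    """Separable low-pass filter: rows first, then columns."""
    rows = [blur_1d(row, weights) for row in image]
    cols = [blur_1d(list(col), weights) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def gaussian_pyramid(image, min_size=2):
    """Repeatedly low-pass filter and keep every other row and column."""
    weights = [1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16]  # binomial approximation
    levels = [image]
    while (len(levels[-1]) // 2 >= min_size
           and len(levels[-1][0]) // 2 >= min_size):
        smooth = blur(levels[-1], weights)
        levels.append([row[::2] for row in smooth[::2]])
    return levels


image = [[8.0] * 16 for _ in range(16)]  # flat test image
pyramid = gaussian_pyramid(image)
sizes = [(len(level), len(level[0])) for level in pyramid]
print(sizes)
```

Each level would be searched with the same fixed-size template, so a match at a coarse level corresponds to a larger object in the original image. A flat image stays flat at every level because the filter weights sum to 1.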
Laplacian pyramid

• Contains the difference images between two successive Gaussian pyramid levels

[Figure: Showing, at full resolution, the information captured at each level of a Gaussian (top) and Laplacian (bottom) pyramid.]
Major uses of image pyramids

• Compression
• Object detection
– Scale search
– Features
• Detecting stable interest points
• Registration
– Coarse-to-fine

Source: D. Hoiem
Things to Remember

• Image filtering: compute a function of the local neighborhood at each position
• Sometimes it makes sense to think of images and filtering in the frequency domain
• Can be faster to filter using FFT for large images (N log N vs. N² for auto-correlation)
• Template matching: localize given template in image
• Image pyramid: multiresolution representation of image (remember to low-pass filter before sub-sampling)
