CS 664 Slides #7 Visual Motion: Prof. Dan Huttenlocher Fall 2003

This document summarizes techniques for estimating visual motion from image sequences. Direct methods estimate dense motion fields from pixel intensity variations across frames based on the brightness constancy assumption. The gradient constraint provides an underconstrained linear equation relating motion to image gradients at each pixel. Combining constraints over image patches allows estimating translational motion. Robust iterative methods can handle multiple motions. Global models like affine motion describe parameterized motion fields. Coarse-to-fine pyramid processing enables estimating larger displacements.


CS 664 Slides #7

Visual Motion

Prof. Dan Huttenlocher


Fall 2003

Visual Motion
Over sequence of images can determine
which pixels move where
Differs from motion in the world
Camera motion
Pan, tilt, zoom

Motion parallax
Information about depth from camera motion

Scene motion
Reveals independent objects and behaviors

Un-detectable motion
No/low intensity variation

Some Uses of Visual Motion


Human-machine interaction
Animation, gestures, facial expressions

Surveillance and monitoring


Tracking and analyzing behaviors
Collision detection and avoidance

Camera stabilization
Remove jitter

Autonomous navigation
Path finding and depth from parallax

Constructing panoramic mosaics



Motion Analysis in Video


Video insertion
Compute motion in one image sequence
Use to transform frames of another sequence
and superimpose
Today used to insert signs and markings into
sporting events

Panoramic mosaics
Synthesized views from video sequence

Estimating Visual Motion


Historically two different approaches
Direct methods, based on local image
derivatives at each pixel
Feature based methods, sparse
correspondence

We will focus on direct methods


Used most in practice
Recover image motion from spatio-temporal
variations in brightness
Dense estimates but can be sensitive to
variations in appearance

Direct Motion Estimation Methods


Based on the following assumptions
Every pixel in image I goes to some location in
subsequent image J
Overall brightness of images I,J does not
change (much)

Called brightness constancy equation


$I(x,y) \approx J(x+u(x,y),\; y+v(x,y))$

[Figure: example pixel values illustrating the displacement (u, v)]

Using Brightness Constancy


Minimization formulation
Seek (u(x,y), v(x,y)) minimizing error
$(I(x,y) - J(x+u(x,y),\; y+v(x,y)))^2$
Not practical to search explicitly!

Linearization
Relate motion to image derivatives
Gradient constraint

Assuming small u,v (on order of a pixel)


First order term of Taylor series expansion of
brightness constancy

Gradient Constraint
One-dimensional example linearization
Estimate displacement d using derivative
Two functions f(x) and g(x)=f(x-d)

Taylor series expansion


$f(x-d) = f(x) - d\,f'(x) + E$

Where $f'$ denotes the derivative and $E$ the higher order terms

Now write difference as

$f(x) - g(x) = d\,f'(x) - E$
Neglecting higher order terms
$d \approx (f(x) - g(x)) / f'(x)$
Note: valid only for small d
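As a rough illustration of the one-dimensional linearization, the sketch below estimates d from sampled functions. Pooling all samples in a least squares sum, rather than dividing sample by sample, is an assumption made here to avoid dividing by f'(x) where it is near zero. The name `estimate_shift_1d` and the sine test signal are hypothetical.

```python
import numpy as np

def estimate_shift_1d(f, g, dx=1.0):
    """Estimate d in g(x) = f(x - d) from the first-order relation f - g ≈ d f'."""
    fp = np.gradient(f, dx)                 # numerical derivative f'(x)
    diff = f - g                            # f(x) - g(x)
    # Least squares pooling of d ≈ (f - g) / f' over all samples (assumption).
    return float(np.sum(fp * diff) / np.sum(fp * fp))

# Hypothetical test signal: a sine shifted by a small amount.
x = np.linspace(0.0, 2.0 * np.pi, 400)
true_d = 0.05
f = np.sin(x)
g = np.sin(x - true_d)
d_hat = estimate_shift_1d(f, g, dx=x[1] - x[0])
print(d_hat)
```

The estimate degrades as d grows relative to the scale of variation in f, which is the small-d restriction noted above.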

Gradient Constraint
(or Optical Flow Constraint)
Same approach extends naturally to 2D

$I(x,y) \approx J(x+u,\, y+v)$, u=u(x,y), v=v(x,y)

Assume time-varying image intensity well
approximated by first order Taylor series
$J(x+u,\, y+v) \approx I(x,y) + I_x(x,y)\,u + I_y(x,y)\,v + I_t$
Substituting
$I_x(x,y)\,u + I_y(x,y)\,v \approx -I_t$
Using gradient notation
$\nabla I \cdot (u,v) \approx -I_t$
Linear constraint on motion (u,v) at each pixel
Can only estimate motion in gradient direction

Aperture Problem (Normal Flow)


Can only measure motion in direction
normal to edge (along gradient)


Aperture Problem (Normal Flow)


Gradient constraint defines line in (u,v)
space
$\nabla I \cdot (u,v) \approx -I_t$
Methods based solely on per pixel
estimates don't work well

[Figure: constraint line in the (u,v) plane]
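A minimal sketch of the per-pixel estimate that the gradient constraint does allow: the point on the constraint line closest to the origin, i.e. the flow component along the gradient (normal flow). The function name `normal_flow`, the use of `np.gradient` for spatial derivatives, and the frame difference J - I for the temporal derivative are assumptions of this sketch.

```python
import numpy as np

def normal_flow(I, J, eps=1e-8):
    """Per-pixel flow component along the image gradient."""
    I = I.astype(float)
    J = J.astype(float)
    Iy, Ix = np.gradient(I)          # axis 0 is y (rows), axis 1 is x (columns)
    It = J - I                       # temporal derivative as a frame difference
    mag2 = Ix ** 2 + Iy ** 2
    # Closest point to the origin on the line Ix*u + Iy*v = -It.
    scale = np.where(mag2 > eps, -It / np.maximum(mag2, eps), 0.0)
    return scale * Ix, scale * Iy

# Diagonal intensity ramp moved by (u, v) = (1, 0), so J(x, y) = I(x - 1, y).
ys, xs = np.mgrid[0:8, 0:8]
I = (xs + ys).astype(float)
J = (xs - 1 + ys).astype(float)
un, vn = normal_flow(I, J)
print(un[4, 4], vn[4, 4])            # only the part of (1, 0) along the gradient
```

In this example the true motion is (1, 0) but the recovered vector is (0.5, 0.5), the projection of the motion onto the gradient direction, which is the aperture problem in miniature.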

Combining Local Constraints


Each pixel defines linear constraint on
possible (u,v) displacement
For set of pixels with same displacement
combine constraints to get estimate
For pixels with different displacements,
somehow identify that this is the case

[Figure: constraint lines in the (u,v) plane]

Translational Motion
Assume single displacement (u,v) for all
pixels within some region of image
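Over-constrained system of linear
equations $I_x(x,y)\,u + I_y(x,y)\,v = -I_t$
Find least squares solution
In matrix form: $\min_z \|Dz - t\|^2$ with $z = (u,v)^T$

where $D = \begin{bmatrix} I_x(x_1,y_1) & I_y(x_1,y_1) \\ \vdots & \vdots \\ I_x(x_n,y_n) & I_y(x_n,y_n) \end{bmatrix}$

and $t = -[\, I_t(x_1,y_1) \;\cdots\; I_t(x_n,y_n) \,]^T$

Least Squares Solution

$z^* = (D^T D)^{-1} D^T t$
Method of normal equations, can derive from
setting partial derivatives to zero

$D^T D = \begin{bmatrix} \sum I_x^2 & \sum I_x I_y \\ \sum I_x I_y & \sum I_y^2 \end{bmatrix}$

$D^T t = -\begin{bmatrix} \sum I_x I_t \\ \sum I_y I_t \end{bmatrix}$

Inverse of 2x2 closed form

$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$, $A^{-1} = \frac{1}{ad-bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}$

Where det(A) = ad-bc not (near) zero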
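The normal equations above can be sketched for a single region as follows. This is a minimal illustration, not a tuned implementation: the right-hand side is taken as -It so that the system matches Ix u + Iy v = -It, derivatives come from `np.gradient` and a frame difference, and the name `patch_translation` and the threshold `det_tol` are hypothetical.

```python
import numpy as np

def patch_translation(I, J, det_tol=1e-6):
    """Least squares (u, v) for one region using the closed-form 2x2 inverse."""
    I = I.astype(float)
    J = J.astype(float)
    Iy, Ix = np.gradient(I)
    It = J - I
    a = np.sum(Ix * Ix)              # entries of D^T D
    b = np.sum(Ix * Iy)
    d = np.sum(Iy * Iy)
    rx = -np.sum(Ix * It)            # entries of D^T t with t = -It
    ry = -np.sum(Iy * It)
    det = a * d - b * b
    if abs(det) < det_tol:           # (near) singular: no reliable estimate
        return None
    u = (d * rx - b * ry) / det
    v = (a * ry - b * rx) / det
    return u, v

# Hypothetical test region: I(x, y) = x*y moved by half a pixel in x.
ys, xs = np.mgrid[0:16, 0:16].astype(float)
I = xs * ys
J = (xs - 0.5) * ys                  # J(x, y) = I(x - 0.5, y)
print(patch_translation(I, J))
print(patch_translation(xs, xs - 0.5))   # pure ramp: singular system
```

The second call shows the degenerate case: an intensity ramp constrains only one direction, so the determinant vanishes and the sketch declines to return an estimate.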

Translational Motion
Can estimate small translation over local
patch around each pixel

Fast using box sums


Note relation to corner detection
Poor estimate if $A = D^T D$ is nearly singular
Also poor if patch contains more than one
underlying motion

Better handling of multiple motions


Robust statistical techniques

Handling larger translations


Pyramid method
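One way to read "fast using box sums": the five sums that make up the 2x2 system can each be computed for every pixel at once with an integral image, so the cost per pixel does not depend on the patch size. The sketch below assumes zero padding at the image border and a hypothetical determinant threshold `det_tol`; the function names are illustrative only.

```python
import numpy as np

def box_sum(a, r):
    """Sum of a over the (2r+1) x (2r+1) window around each pixel (zero padded)."""
    k = 2 * r + 1
    p = np.pad(a, r, mode="constant")
    s = np.zeros((p.shape[0] + 1, p.shape[1] + 1))
    s[1:, 1:] = p.cumsum(axis=0).cumsum(axis=1)      # integral image
    return s[k:, k:] - s[:-k, k:] - s[k:, :-k] + s[:-k, :-k]

def dense_translation(I, J, r=2, det_tol=1e-6):
    """Per-pixel (u, v) from the 2x2 normal equations over a local patch."""
    I = I.astype(float)
    J = J.astype(float)
    Iy, Ix = np.gradient(I)
    It = J - I
    a = box_sum(Ix * Ix, r)
    b = box_sum(Ix * Iy, r)
    d = box_sum(Iy * Iy, r)
    rx = -box_sum(Ix * It, r)
    ry = -box_sum(Iy * It, r)
    det = a * d - b * b
    ok = np.abs(det) > det_tol                        # skip near-singular patches
    safe = np.where(ok, det, 1.0)
    u = np.where(ok, (d * rx - b * ry) / safe, np.nan)
    v = np.where(ok, (a * ry - b * rx) / safe, np.nan)
    return u, v

ys, xs = np.mgrid[0:16, 0:16].astype(float)
I = xs * ys
J = (xs - 0.5) * ys                                   # motion (0.5, 0) everywhere
u, v = dense_translation(I, J, r=2)
```

Pixels whose patch gives a near-singular matrix are marked NaN here; the determinant test is also where the relation to corner detection shows up, since both look at the same 2x2 matrix of gradient products.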

Multiple Motions
Robust statistical techniques for finding
predominant motion in a region
Consider approach of iteratively
reweighted least squares (IRLS)
As illustration of robust methods

Generalize minimization problem to

$\min_z \|W(Dz - t)\|^2$
Weight matrix W is diagonal
Lessen importance of pixels that don't match
Iterate to find good weights
Note in unweighted case W is identity matrix

Finding Predominant Motion


Minimization generalizes in obvious way
$z^* = (D^T W^2 D)^{-1} D^T W^2 t$

Determining good weights to use
Start by computing least squares solution, $z_0$
Iteratively compute better solutions
Compute error for each pixel based on previous
solution $z_{k-1}$ and use that to set weight per pixel

Depends on initial solution being good enough to allow bad pixels to have largest error
Have to measure error based on image intensity
matches, it's the only thing we can measure

Updating Weights
To solve for $z_k$ given $z_{k-1}$
Create weights $W_k = \mathrm{diag}(w_1^k, \ldots, w_n^k)$ where

$w_i^k = \begin{cases} 1 & \text{if } r_i^{k-1} \le c \\ c / r_i^{k-1} & \text{otherwise} \end{cases}$

Where $r_i^{k-1}$ is measure of error at i-th pixel with motion estimate from iteration k-1
Compare i-th pixel value to matching pixel of
other image (using $z_{k-1}$ for correspondence)

And c is set based on robust measure of good versus bad data, such as median
Common value is $c = \frac{1}{0.6745}\,\mathrm{median}(r_i^{k-1})$
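The IRLS iteration can be sketched as below. One simplification is assumed and should be read as such: the per-pixel error is taken to be the residual of the linearized system, |Dz - t|, rather than an intensity comparison after warping with the previous estimate as the slides describe. The name `irls_motion` and the toy data are hypothetical.

```python
import numpy as np

def irls_motion(D, t, iters=10):
    """Robust estimate of z in D z ≈ t by iteratively reweighted least squares."""
    z, *_ = np.linalg.lstsq(D, t, rcond=None)     # z_0: unweighted least squares
    for _ in range(iters):
        r = np.abs(D @ z - t)                     # error per row under z_{k-1}
        c = np.median(r) / 0.6745                 # robust scale from the median
        w = np.ones_like(r)
        outlier = r > c
        w[outlier] = c / r[outlier]               # down-weight large errors
        w2 = w ** 2
        A = D.T @ (w2[:, None] * D)               # D^T W^2 D
        b = D.T @ (w2 * t)                        # D^T W^2 t
        z = np.linalg.solve(A, b)
    return z

# Toy system: ten rows constrain u and ten constrain v; two u rows are outliers.
D = np.array([[1.0, 0.0], [0.0, 1.0]] * 10)
t = np.array([1.0, -0.5] * 10)
t[[0, 2]] = 5.0
z_ls, *_ = np.linalg.lstsq(D, t, rcond=None)
z_robust = irls_motion(D, t)
print(z_ls, z_robust)
```

In this toy case plain least squares is pulled toward the outliers, while the reweighted solution settles near the motion shared by the majority of rows.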

Weights Example
[Figure: pixels of image I compared with matching pixels of image J under $z_{k-1}$]

$r_i^{k-1}$: 0, 0, 1, 0, 1, 1, 6, 5, 6
median = 1
$c \approx 1.48$
$w_i^k$: 1, 1, 1, 1, 1, 1, .24, .29, .24
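The numbers in this example can be checked with a few lines of NumPy. The helper name `irls_weights` is hypothetical; the rule it applies is the one stated on the previous slide, with c = median / 0.6745.

```python
import numpy as np

def irls_weights(r):
    """Weights 1 for errors up to c, and c / r beyond it, with c from the median."""
    r = np.asarray(r, dtype=float)
    c = np.median(r) / 0.6745
    w = np.ones_like(r)
    big = r > c
    w[big] = c / r[big]
    return c, w

c, w = irls_weights([0, 0, 1, 0, 1, 1, 6, 5, 6])
print(round(c, 4), np.round(w, 4))
```

The three down-weighted values come out near 0.247 and 0.297, which match the listed .24 and .29 when truncated to two decimals.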

Global Motion Estimation


Estimate motion vectors that are
parameterized over some region
Each vector fits some low-order model of how
vectors change

Affine motion model is commonly used


$u(x,y) = a_1 + a_2 x + a_3 y$
$v(x,y) = a_4 + a_5 x + a_6 y$

Substituting into gradient constraint equation

$I_x (a_1 + a_2 x + a_3 y) + I_y (a_4 + a_5 x + a_6 y) \approx -I_t$
Each pixel provides a linear constraint in six
unknowns
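A minimal sketch of the six-unknown system: each pixel contributes one row [Ix, Ix x, Ix y, Iy, Iy x, Iy y] and one right-hand-side entry -It, and the stacked system is solved jointly by least squares. The function names and the synthetic derivative arrays are hypothetical. Random derivatives are used only so that the known parameters can be recovered exactly.

```python
import numpy as np

def affine_from_derivatives(Ix, Iy, It):
    """Least squares affine parameters (a1..a6) from per-pixel derivatives."""
    h, w = Ix.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    cols = [Ix, Ix * xs, Ix * ys, Iy, Iy * xs, Iy * ys]
    D = np.stack([c.ravel() for c in cols], axis=1)   # n x 6 system matrix
    t = -It.ravel()
    a, *_ = np.linalg.lstsq(D, t, rcond=None)
    return a

def affine_flow(I, J):
    """Same estimate starting from two images (derivatives via np.gradient)."""
    I = I.astype(float)
    J = J.astype(float)
    Iy, Ix = np.gradient(I)
    return affine_from_derivatives(Ix, Iy, J - I)

# Synthetic check: build It so the constraint holds exactly for known parameters.
rng = np.random.default_rng(0)
Ix = rng.standard_normal((12, 12))
Iy = rng.standard_normal((12, 12))
ys, xs = np.mgrid[0:12, 0:12].astype(float)
a_true = np.array([0.5, 0.01, -0.02, -0.3, 0.03, 0.015])
u = a_true[0] + a_true[1] * xs + a_true[2] * ys
v = a_true[3] + a_true[4] * xs + a_true[5] * ys
It = -(Ix * u + Iy * v)
a_hat = affine_from_derivatives(Ix, Iy, It)
```

On real images the system can be poorly conditioned when the gradients carry little texture, in the same way the 2x2 translational system can be.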

Affine Transformations
Consider points (x,y) in plane rather than
vectors for the moment
Linear transformation and translation
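$x' = a_1 + a_2 x + a_3 y$
$y' = a_4 + a_5 x + a_6 y$
In matrix form $A(z) = Lz + b$

$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} a_2 & a_3 \\ a_5 & a_6 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} a_1 \\ a_4 \end{bmatrix}$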

Maps any triangle to any triangle


Defined by three corresponding pairs of points
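Since three point pairs give six equations in the six parameters, the transformation can be recovered by solving two 3x3 linear systems, one for each output coordinate. The sketch below assumes the three source points are not collinear; the function name and the sample points are hypothetical.

```python
import numpy as np

def affine_from_three_points(src, dst):
    """Parameters (a1..a6) of x' = a1 + a2 x + a3 y, y' = a4 + a5 x + a6 y."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    M = np.column_stack([np.ones(3), src])   # rows [1, x, y]
    ax = np.linalg.solve(M, dst[:, 0])       # a1, a2, a3
    ay = np.linalg.solve(M, dst[:, 1])       # a4, a5, a6
    return np.concatenate([ax, ay])

src = [(0, 0), (1, 0), (0, 1)]
dst = [(2, 3), (4, 3), (2, 6)]
a = affine_from_three_points(src, dst)
print(a)
```

For these points the result is a translation by (2, 3) combined with a scaling of 2 in x and 3 in y.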

Why Affine Transformations


Simple (and often inaccurate) model
of projection
Point (x,y,z) in space maps to (x,y) in
image
Orthographic or parallel projection

Somewhat reasonable model for telephoto lens
Yields affine transformation of plane
for viewing flat objects
3D rotation, translation followed by
orthographic projection and scaling

Affine Motion Estimation


Minimization problem becomes that of
estimating the parameters $a_1, \ldots, a_6$
Rather than just two parameters u,v

Still (over-constrained) linear system but in more unknowns
Again use least squares to solve

Separable into two independent 3 variable problems
a1, a2, a3 reflect only u-component of motion
a4, a5, a6 reflect only v-component of motion

Affine Motion Equations


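Again compute $(D^T D)^{-1} D^T t$
Or (re)weighted version for IRLS

Now two 3x3 problems, one for Ix and one for Iy, as opposed to single 2x2 problem
Problem for Ix and u motion (Iy analogous)
t remains same, D changes

$D = \begin{bmatrix} I_{x,1} & I_{x,1}\,x_1 & I_{x,1}\,y_1 \\ \vdots & \vdots & \vdots \\ I_{x,n} & I_{x,n}\,x_n & I_{x,n}\,y_n \end{bmatrix}$, where $I_{x,i} = I_x(x_i, y_i)$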

Multiple (Layered) Motions


Combining global parametric motion
estimation with robust estimation
Calculate predominant parameterized motion
over entire image (e.g., affine)
Corresponds to largest planar surface in scene
under orthographic projection
If it doesn't occupy majority of pixels, robust
estimator will probably fail to recover its motion

Outlier pixels (low weights in IRLS) are not part of this surface
Recursively try estimating their motion
If no good estimate, then remain outliers

Other Global Motion Models


The affine model is simple but not that
accurate in some imaging situations
For instance pinhole rather than parallel
camera model for closer objects
Non-planar surfaces
Explicit modeling of motion parallax

Projective planar case


$x' = (h_1 + h_2 x + h_3 y) / (h_7 + h_8 x + h_9 y)$
$y' = (h_4 + h_5 x + h_6 y) / (h_7 + h_8 x + h_9 y)$
and $u = x' - x$, $v = y' - y$

3D models such as residual planar parallax
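For the projective planar case, the flow at a pixel follows directly from the two ratios above. A small sketch, with the parameter ordering h1..h9 taken from the slide and a hypothetical function name:

```python
import numpy as np

def projective_flow(h, x, y):
    """Flow (u, v) at (x, y) under the projective planar model with h = (h1..h9)."""
    h1, h2, h3, h4, h5, h6, h7, h8, h9 = h
    den = h7 + h8 * x + h9 * y
    xp = (h1 + h2 * x + h3 * y) / den
    yp = (h4 + h5 * x + h6 * y) / den
    return xp - x, yp - y

# Pure translation by (2, -1): the denominator is 1 everywhere.
u_t, v_t = projective_flow((2, 1, 0, -1, 0, 1, 1, 0, 0), 10.0, 5.0)
# A perspective term h8 makes the flow depend on position.
u_p, v_p = projective_flow((0, 1, 0, 0, 0, 1, 1, 0.1, 0), 10.0, 5.0)
print(u_t, v_t, u_p, v_p)
```

With h8 = h9 = 0 and h7 = 1 the model reduces to the affine case, which is why the affine model works well when perspective effects are weak.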



Handling Larger Motions


Methods based on image gradients are
restricted to small displacements
Two different approaches
Abandon gradient method and explicitly search
over possible translations
Computationally expensive to do for every pixel
Consider shifts and products of image patch

Block motion provides estimates just for certain pixels, used in compression (e.g., MPEG)

Pyramid to guarantee small motions
At top level small motion
At each level small deviation from one above

Coarse to Fine Motion Estimation


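Estimate residual motion at each level of
Gaussian pyramid

[Figure: Gaussian pyramids of image I and image J, from the original images (I0, J0) through (I1, J1) up to (Ik, Jk) at 1/2^k resolution, with residual motion estimated at each level]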

Coarse to Fine Estimation


Compute $M_k$, estimate of motion at level k
Can be local motion estimate $(u_k, v_k)$
Vector field with motion of patch at each pixel

Can be global motion estimate
Parametric model (e.g., affine) of dominant
motion for entire image

Choose max k such that motion about one pixel

Apply $M_k$ at level k-1 and estimate remaining motion at that level, iterate
Local estimates: shift $I_{k-1}$ by $2(u_k, v_k)$
Global estimates: apply inverse transform to $J_{k-1}$
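The coarse-to-fine loop can be sketched for the simplest case, a single translation for the whole image. Several simplifications are assumed here and are not from the slides: 2x2 block averaging stands in for Gaussian smoothing, the estimate from the coarser level is applied as an integer shift with wrap-around at the borders, and image sides are divisible by 2 to the power `levels`. All function names are hypothetical.

```python
import numpy as np

def downsample(a):
    """Halve resolution by averaging 2x2 blocks (crude stand-in for a Gaussian pyramid)."""
    return 0.25 * (a[0::2, 0::2] + a[1::2, 0::2] + a[0::2, 1::2] + a[1::2, 1::2])

def residual_translation(I, J):
    """Gradient-based least squares estimate of a small translation (u, v)."""
    Iy, Ix = np.gradient(I)
    It = J - I
    A = np.array([[np.sum(Ix * Ix), np.sum(Ix * Iy)],
                  [np.sum(Ix * Iy), np.sum(Iy * Iy)]])
    b = -np.array([np.sum(Ix * It), np.sum(Iy * It)])
    return np.linalg.solve(A, b)

def coarse_to_fine_translation(I, J, levels=3):
    pyramid = [(I.astype(float), J.astype(float))]
    for _ in range(levels):
        Ik, Jk = pyramid[-1]
        pyramid.append((downsample(Ik), downsample(Jk)))
    est = np.zeros(2)                                  # (u, v), coarsest level first
    for Ik, Jk in reversed(pyramid):
        base = np.round(2.0 * est)                     # motion doubles one level down
        shifted = np.roll(Ik, (int(base[1]), int(base[0])), axis=(0, 1))
        est = base + residual_translation(shifted, Jk) # add residual at this level
    return est

# Smooth periodic test image moved by (u, v) = (6, -4) pixels.
ys, xs = np.mgrid[0:64, 0:64]
I = np.sin(2 * np.pi * xs / 64) + np.sin(2 * np.pi * ys / 64)
J = np.roll(I, (-4, 6), axis=(0, 1))
est = coarse_to_fine_translation(I, J)
print(est)
```

With three levels the 6-pixel motion is under one pixel at the top of the pyramid, which is the condition for choosing the maximum k. A direct gradient-based estimate at full resolution would be outside its small-displacement range for the same pair.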

Global Motion Coarse to Fine


Compute transformation $T_k$ mapping
pixels of $I_k$ to $J_k$
Warp image $J_{k-1}$ using $T_k$
Apply inverse of $T_k$
Double resolution of $T_k$ (translations double)

Compute transformation $T_{k-1}$ mapping pixels of $I_{k-1}$ to warped $J_{k-1}$
Estimate of residual motion at this level
Total estimate of motion at this level is
composition of $T_{k-1}$ and resolution doubled $T_k$
In case of translation just add them
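For an affine transformation written as x' = Lx + b, doubling the resolution leaves L unchanged and doubles b, and the total motion is a composition of two affine maps. The sketch below assumes one particular convention, that the residual is applied first and the resolution-doubled coarse transformation second; the slides do not state the order. Function names and numbers are hypothetical.

```python
import numpy as np

def double_resolution(L, b):
    """Re-express x' = L x + b one pyramid level finer: only the translation doubles."""
    return L.copy(), 2.0 * b

def compose(L_outer, b_outer, L_inner, b_inner):
    """Affine map x -> outer(inner(x))."""
    return L_outer @ L_inner, L_outer @ b_inner + b_outer

# Hypothetical coarse estimate T_k and residual estimate T_{k-1}.
Lk, bk = np.array([[1.0, 0.1], [0.0, 1.0]]), np.array([1.0, -0.5])
Lr, br = np.eye(2), np.array([0.25, 0.5])
Lk2, bk2 = double_resolution(Lk, bk)
L_total, b_total = compose(Lk2, bk2, Lr, br)

# Pure translations: composition reduces to adding the vectors.
_, b_sum = compose(np.eye(2), np.array([2.0, -1.0]), np.eye(2), br)
print(b_total, b_sum)
```

When both linear parts are the identity, the order of composition no longer matters, which is the "just add them" case.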

Affine Mosaic Example


Coarse-to-fine affine motion
Pan tilt camera sweeping repeatedly over scene

Moving objects removed from background


Outliers in motion estimate, use other scans


SSD
An alternative to gradient based methods
is template matching
Treat a rectangle around each pixel as a
template to find best match in other image
Search over possible translations minimizing
some error criterion (or maximizing quality)
Generally use sum squared difference (SSD)
$\sum_{(x,y)} (I(x,y) - J(x+u,\, y+v))^2$
Sometimes compute cross correlation
Compute over local neighborhood
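A minimal sketch of SSD template matching over integer translations. It assumes the template window lies fully inside I and skips candidate positions that would fall outside J; the function name, window half-size `half`, and search radius `search` are hypothetical parameters.

```python
import numpy as np

def ssd_match(I, J, x, y, half=2, search=3):
    """Integer (u, v) minimizing SSD between the patch of I at (x, y) and J."""
    size = 2 * half + 1
    tpl = I[y - half:y + half + 1, x - half:x + half + 1].astype(float)
    best, best_uv = np.inf, (0, 0)
    for v in range(-search, search + 1):
        for u in range(-search, search + 1):
            y0, x0 = y + v - half, x + u - half
            if y0 < 0 or x0 < 0 or y0 + size > J.shape[0] or x0 + size > J.shape[1]:
                continue                       # candidate window leaves the image
            win = J[y0:y0 + size, x0:x0 + size].astype(float)
            ssd = np.sum((tpl - win) ** 2)
            if ssd < best:
                best, best_uv = ssd, (u, v)
    return best_uv, best

rng = np.random.default_rng(0)
I = rng.random((20, 20))
J = np.roll(I, (1, -2), axis=(0, 1))           # content moves by (u, v) = (-2, 1)
uv, err = ssd_match(I, J, 10, 10)
print(uv, err)
```

The cost grows with the square of the search radius for every pixel, which is the computational expense noted earlier for explicit search.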
