Digital Video

EE6310: Image and Video Processing, Spring 2024

March 5, 2024
Digital Video

I Generic Video Codec

I Block Motion Estimation
I Interframe Coding and Motion Estimation
I Video Compression Standards: MPEG-x/H.264
I Optical Flow
Induced Motion Effects
Generic Video Encoder Diagram
Generic Video Decoder Diagram
Lossless Coding

I Lossless techniques achieve compression with no loss of

information
I The true image can be reconstructed exactly from the coded
image
I Lossless coding doesn’t usually achieve high compression but
has applications such as:
I In combination with lossy compression, multiply gains
I In applications where information loss is unacceptable
I Lossless compression ratios usually in the range:
2 : 1 ≤ CR ≤ 3 : 1 but can vary from image to image
Methods for Lossless Coding

I Basically amounts to clever rearrangement of data

I This can be done in many ways and many domains (DFT,
DCT, wavelet etc)
I The most popular methods use variable length coding
(VLC)
Variable Length Coding (VLC)

I Idea: use variable length codewords to encode gray levels

I Assign short lengths to gray levels that occur frequently
I Assign long lengths to gray levels that occur infrequently
I On average, the bits per pixel (BPP) will be reduced
Image Histogram and BPP

I Recall the image histogram $H_I$
I If B(k) is the number of bits used to code gray-level k, then
$$BPP(I) = \frac{1}{NM} \sum_{k=0}^{K-1} B(k) H_I(k)$$
I BPP is the common measure for VLC
Image Entropy

I Recall the normalized histogram values
$$p_I(k) = \frac{1}{NM} H_I(k); \quad k = 0, \ldots, K-1$$
I $p_I(k)$ is the probability of gray level k
I The entropy of image I is then:
$$E[I] = -\sum_{k=0}^{K-1} p_I(k) \log_2 p_I(k)$$
I Entropy is a measure of information
I Provides a lower bound on VLCs:
$$BPP(I) \geq E[I]$$
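The two quantities above can be compared numerically. The following is a minimal sketch, assuming a tiny hypothetical 8-pixel image with K = 4 gray levels and two hand-picked codeword-length tables B(k); the function names are illustrative only.

```python
import math

def normalized_histogram(pixels, K):
    """p_I(k): fraction of pixels that take gray level k."""
    counts = [0] * K
    for g in pixels:
        counts[g] += 1
    return [c / len(pixels) for c in counts]

def entropy(p):
    """E[I] = -sum_k p_I(k) log2 p_I(k); empty levels contribute nothing."""
    return -sum(pk * math.log2(pk) for pk in p if pk > 0)

def bits_per_pixel(p, B):
    """BPP(I) = sum_k p_I(k) B(k) for codeword lengths B(k)."""
    return sum(pk * bk for pk, bk in zip(p, B))

# Hypothetical image: level 0 is common, levels 2 and 3 are rare.
pixels = [0, 0, 0, 0, 1, 1, 2, 3]
p = normalized_histogram(pixels, K=4)        # [0.5, 0.25, 0.125, 0.125]
fixed_bpp = bits_per_pixel(p, [2, 2, 2, 2])  # fixed-length code
vlc_bpp = bits_per_pixel(p, [1, 2, 3, 3])    # short codewords for frequent levels
print(entropy(p), fixed_bpp, vlc_bpp)
```

Here the variable-length table reaches the entropy bound of 1.75 bits per pixel, while the fixed-length code spends 2 bits per pixel.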
Optimal VLC

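I Recall
$$BPP(I) = \frac{1}{NM} \sum_{k=0}^{K-1} H_I(k) B(k) = \sum_{k=0}^{K-1} p_I(k) B(k)$$
$$E[I] = -\sum_{k=0}^{K-1} p_I(k) \log_2 p_I(k)$$
I Comparing the two equations: if $B(k) = -\log_2 p_I(k)$, the optimum code is found and the lower bound is attained. Since B(k) must be an integer, this is exactly achievable only when every nonzero $p_I(k)$ is a power of 1/2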
The Huffman Code

I The Huffman algorithm yields an optimum code

I For a set of gray levels {0, ..., K − 1} it gives a set of codewords c(k) such that
$$BPP(I) = \sum_{k=0}^{K-1} p_I(k) L(c(k))$$
is the smallest possible, where L(c(k)) is the length of codeword c(k) in bits
The Huffman Algorithm

I Form a binary tree with branches labeled by the gray-levels $k_m$ and their probabilities $p_I(k_m)$:
1. Eliminate any $k_m$ where $p_I(k_m) = 0$
2. Find the 2 smallest probabilities $p_m = p_I(k_m)$, $p_n = p_I(k_n)$
3. Replace them by $p_{mn} = p_m + p_n$ to form a node; this reduces the list by 1
4. Label the branch for $k_m$ with (e.g.) '1' and for $k_n$ with '0'
5. Until the list has only 1 element (root reached), return to (2)
I In step (4), values '1' and '0' are assigned to element pairs $(k_m, k_n)$, element triples, etc. as the process progresses
Huffman Example

I There are K = 8 values 0, ..., 7 to be assigned codewords:
$p_I(0) = 0.5$, $p_I(1) = p_I(2) = p_I(3) = 0.125$, $p_I(4) = 0.0625$, $p_I(5) = p_I(6) = 0.03125$, $p_I(7) = 0$
I The process creates a binary tree with values ’1’ and ’0’
placed on the top and bottom branches at each stage
I Solution on board - make note
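The Huffman steps can be sketched in code and run on the example probabilities. This is a minimal sketch, not the board solution: it assumes a heap-based merge and an arbitrary tie-breaking rule, so the individual bit patterns may differ from a hand-drawn tree, although the codeword lengths for this example do not depend on how ties are broken.

```python
import heapq
from itertools import count

def huffman_code(probs):
    """Return {gray_level: bit string} for levels with nonzero probability."""
    tiebreak = count()  # unique id so the heap never compares the dict payloads
    # Step 1: drop zero-probability levels; each entry is a partial code table.
    heap = [(p, next(tiebreak), {k: ""}) for k, p in enumerate(probs) if p > 0]
    heapq.heapify(heap)
    while len(heap) > 1:                        # Step 5: stop at the root
        p_m, _, codes_m = heapq.heappop(heap)   # Step 2: two smallest
        p_n, _, codes_n = heapq.heappop(heap)
        # Step 4: prepend '1' on the k_m branch and '0' on the k_n branch.
        merged = {k: "1" + c for k, c in codes_m.items()}
        merged.update({k: "0" + c for k, c in codes_n.items()})
        # Step 3: replace the pair by a single node with p_m + p_n.
        heapq.heappush(heap, (p_m + p_n, next(tiebreak), merged))
    return heap[0][2]

probs = [0.5, 0.125, 0.125, 0.125, 0.0625, 0.03125, 0.03125, 0.0]
code = huffman_code(probs)
lengths = {k: len(c) for k, c in code.items()}
bpp = sum(probs[k] * n for k, n in lengths.items())
print(lengths, bpp)
```

Because every nonzero probability here is a power of 1/2, each codeword length equals $-\log_2 p_I(k)$ and the resulting 2.1875 bits per pixel equals the entropy.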
Huffman Decoding

I The Huffman code is a uniquely decodable code. There is

only one interpretation for a series of codewords (bits)
I Decoding progresses as follows:
I Starting at tree root, traverse tree using coded bits until a leaf
is found. The symbol at the leaf is output
I Return to tree root and repeat above step until all bits are
exhausted
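The decoding loop can be sketched as a walk over a tree of nested dictionaries. The codebook below is a hypothetical prefix code chosen for illustration, not the code from the example above, and the symbols are assumed to be integers so that leaves can be told apart from internal nodes.

```python
def build_tree(codebook):
    """Turn {symbol: bit string} into nested dicts keyed by '0' and '1'."""
    root = {}
    for symbol, bits in codebook.items():
        node = root
        for b in bits[:-1]:
            node = node.setdefault(b, {})
        node[bits[-1]] = symbol          # the last bit leads to a leaf
    return root

def huffman_decode(bits, codebook):
    root = build_tree(codebook)
    node, out = root, []
    for b in bits:
        node = node[b]                   # follow one branch per coded bit
        if not isinstance(node, dict):   # leaf reached: output the symbol
            out.append(node)
            node = root                  # return to the root
    return out

codebook = {0: "1", 1: "011", 2: "010", 3: "001"}   # hypothetical prefix code
decoded = huffman_decode("1" "011" "001" "1" "010", codebook)
print(decoded)
```

No separators are needed between codewords: since no codeword is a prefix of another, the first leaf reached is always the right one.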
Arithmetic Coding

I Assigns a single arithmetic code word to an entire sequence

of source symbols – creates a mapping between source
symbol sequence and real numbers in the interval [0, 1)
I Achieves higher compression efficiency than Huffman codes
- no need to map source symbols to integral number of code
symbols
I Achieves Shannon’s noiseless source coding bound
Arithmetic Coding Algorithm

I Divide the interval [0, 1) according to the PMF
I E.g., if p(a) = p(e) = 0.25, p(i) = p(o) = 0.2, p(u) = 0.1:

Symbol  a          e            i           o           u
Range   [0, 0.25)  [0.25, 0.5)  [0.5, 0.7)  [0.7, 0.9)  [0.9, 1)

I Initialize h = 1, l = 0
I Loop over all source symbols s, where $[l_s, h_s)$ is the range of symbol s:
I $r = h - l$
I $h = l + r \cdot h_s$
I $l = l + r \cdot l_s$
I Final output is any real number in the interval [l, h)
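The encoder loop can be traced with the vowel table above. This is a minimal sketch that assumes exact rational arithmetic (`fractions.Fraction`) to sidestep the finite-precision issues a practical coder has to handle; the message "eai" is an arbitrary illustration.

```python
from fractions import Fraction as F

# Symbol ranges [l_s, h_s) from the table above.
RANGES = {
    "a": (F(0), F(1, 4)),
    "e": (F(1, 4), F(1, 2)),
    "i": (F(1, 2), F(7, 10)),
    "o": (F(7, 10), F(9, 10)),
    "u": (F(9, 10), F(1)),
}

def arithmetic_encode(symbols, ranges=RANGES):
    """Return the final interval [low, high) for the symbol sequence."""
    low, high = F(0), F(1)
    for s in symbols:
        l_s, h_s = ranges[s]
        r = high - low
        high = low + r * h_s   # uses the old low, so update high first
        low = low + r * l_s
    return low, high

low, high = arithmetic_encode("eai")
print(float(low), float(high))
```

For "eai" the interval narrows from [0.25, 0.5) to [0.25, 0.3125) and finally to [0.28125, 0.29375); any number in that last interval identifies the sequence, given its length.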
Arithmetic Decoding Algorithm

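I Initialize r to the encoded number
I Loop over the sequence length or until an EOF symbol is decoded
I Find the interval $[l_s, h_s)$ where r lands and output the corresponding source symbol s
I Rescale r to the unit interval:
$$r = \frac{r - l_s}{h_s - l_s} = \frac{r - l_s}{p(s)}$$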

I Example on board . . .
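A matching decoder sketch is given below. It assumes the same vowel table, exact rational arithmetic, and that the sequence length is known to the decoder (no EOF symbol); the coded value 23/80 = 0.2875 is an arbitrary number chosen for illustration.

```python
from fractions import Fraction as F

RANGES = {
    "a": (F(0), F(1, 4)),
    "e": (F(1, 4), F(1, 2)),
    "i": (F(1, 2), F(7, 10)),
    "o": (F(7, 10), F(9, 10)),
    "u": (F(9, 10), F(1)),
}

def arithmetic_decode(value, n_symbols, ranges=RANGES):
    """Recover n_symbols symbols from a number in [0, 1)."""
    r, out = value, []
    for _ in range(n_symbols):
        for s, (l_s, h_s) in ranges.items():
            if l_s <= r < h_s:                 # interval where r lands
                out.append(s)
                r = (r - l_s) / (h_s - l_s)    # rescale to [0, 1)
                break
    return "".join(out)

message = arithmetic_decode(F(23, 80), 3)
print(message)
```

The trace is 0.2875 (symbol e), then 0.15 (symbol a), then 0.6 (symbol i).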
Lossy Coding – Goals

To optimize and balance the following:

I Compression achieved by coding
I Computation required to code and decode
I Quality of the decompressed data
Lossy Coding – Broad Methodology

I Many methods proposed

I The successful methods broadly and loosely follow:
I Transform the image to another domain and/or extract
features
I Quantize the transformed image or those features
I Efficiently organize and/or entropy code the quantized data
Lossy Coding – Block Coding

I Most lossy methods begin by partitioning the image/frame

into sub-blocks that are individually coded
I Wavelet methods are an exception
Lossy Coding – Why Block Coding?

I Reason: Images are highly non-stationary: different areas

of an image may have different properties
I E.g., more high or low frequencies, more or less detail, etc.
I Local coding is thus more efficient
I Wavelet methods provide localization without blocks
I Typical block sizes: 4 × 4, 8 × 8, 16 × 16
Lossy Coding – Karhunen-Loeve Expansion
I Thm 8.5-1 in Stark and Woods, stated here for completeness
I Optimal decorrelating transform in a probabilistic framework
I Theorem: Let X(t) be a zero-mean, second-order random process defined over [−T/2, T/2] with continuous covariance function $K_{XX}(t_1, t_2) = R_{XX}(t_1, t_2)$, then
$$X(t) = \sum_{n=1}^{\infty} X_n \phi_n(t) \quad \text{for } |t| \leq T/2$$
I The coefficients are
$$X_n \equiv \int_{-T/2}^{+T/2} X(t) \phi_n^*(t)\, dt$$
I The set of functions $\{\phi_n(t)\}$ is a complete orthonormal set of solutions to the integral equation
$$\int_{-T/2}^{+T/2} K_{XX}(t_1, t_2) \phi_n(t_2)\, dt_2 = \lambda_n \phi_n(t_1) \quad \text{for } |t_1| \leq T/2$$
I The coefficients $X_n$ are statistically orthogonal
$$E[X_n X_m^*] = \lambda_n \delta_{mn}$$
Lossy Coding – Principal Components Analysis (PCA)

I Optimal decorrelating transform in a deterministic

framework
I Given an m × n matrix X of observations of an unknown
system or a physical process
I Find a linear transform matrix P of size m × m such that
Y = PX
I The transform should be such that:
I The off-diagonal elements of the covariance matrix $C_Y$ must be 0
I Successive dimensions in Y must be rank-ordered according to variance
I Note: $C_X \equiv \frac{1}{n} X X^T$
I Proof outline on board . . .
Lossy Coding – Discrete Cosine Transform

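I The DCT of an N × M image or sub-image is defined as:
$$\tilde{I}(u, v) = \frac{4 C_N(u) C_M(v)}{NM} \sum_{i=0}^{N-1} \sum_{j=0}^{M-1} I(i, j) \cos\left[\frac{(2i+1)u\pi}{2N}\right] \cos\left[\frac{(2j+1)v\pi}{2M}\right]$$
I The inverse DCT is defined as:
$$I(i, j) = \sum_{u=0}^{N-1} \sum_{v=0}^{M-1} C_N(u) C_M(v) \tilde{I}(u, v) \cos\left[\frac{(2i+1)u\pi}{2N}\right] \cos\left[\frac{(2j+1)v\pi}{2M}\right]$$
I where
$$C_N(u) = \begin{cases} \frac{1}{\sqrt{2}}, & u = 0 \\ 1, & u = 1, \ldots, N-1 \end{cases}$$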
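The forward and inverse definitions can be checked against each other with a direct, unoptimized implementation. This is a minimal sketch written straight from the double sums, so it costs O(N²M²) per block and is only meant for small blocks; the function names and the test block are illustrative.

```python
import math

def c(u):
    """Normalization C(u): 1/sqrt(2) for the DC index, 1 otherwise."""
    return 1 / math.sqrt(2) if u == 0 else 1.0

def basis(i, u, n):
    return math.cos((2 * i + 1) * u * math.pi / (2 * n))

def dct2(block):
    N, M = len(block), len(block[0])
    return [[4 * c(u) * c(v) / (N * M)
             * sum(block[i][j] * basis(i, u, N) * basis(j, v, M)
                   for i in range(N) for j in range(M))
             for v in range(M)] for u in range(N)]

def idct2(coeffs):
    N, M = len(coeffs), len(coeffs[0])
    return [[sum(c(u) * c(v) * coeffs[u][v] * basis(i, u, N) * basis(j, v, M)
                 for u in range(N) for v in range(M))
             for j in range(M)] for i in range(N)]

block = [[52.0, 55.0, 61.0], [63.0, 59.0, 55.0]]   # hypothetical 2 x 3 block
restored = idct2(dct2(block))
flat = dct2([[5.0, 5.0], [5.0, 5.0]])               # constant block
print(restored, flat)
```

With this normalization a constant block of value c has DC coefficient 2c and all other coefficients zero, which is the energy compaction the DCT is valued for.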
Lossy Coding – Discrete Cosine Transform

I Good decorrelating properties

I Non-adaptive orthonormal basis
I Separable, fast implementation
I Adopted by JPEG and MPEG standards
Lossy Coding – Overview of JPEG

I The commercial industry standard, formulated by the ISO/CCITT Joint Photographic Experts Group (JPEG)
I Uses DCT as the central transform
I Standard is quite complex - will only discuss outline here
Lossy Coding – JPEG Block Diagram
Lossy Coding – JPEG Baseline Algorithm

I Partition the image into 8 × 8 blocks and apply the DCT to each block to get $\tilde{I}_k(u, v)$
I Pointwise divide each block by an 8 × 8 user-defined normalization array Q(u, v)
I Q(u, v) is designed using sensitivity properties of human vision
I Uniformly quantize the result by rounding to the nearest integer:
$$\tilde{I}_k(u, v) \leftarrow \mathrm{INT}\left[\frac{\tilde{I}_k(u, v)}{Q(u, v)} + 0.5\right]$$
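The normalize-and-round step, together with the multiplication that undoes it at the decoder, can be sketched as follows. The 2 × 2 coefficient block and normalization values are hypothetical (a real block is 8 × 8), and INT[x + 0.5] is assumed to mean floor(x + 0.5), i.e. rounding to the nearest integer.

```python
import math

def quantize(dct_block, Q):
    """Pointwise INT[coefficient / Q + 0.5]."""
    return [[math.floor(coef / q + 0.5) for coef, q in zip(row_c, row_q)]
            for row_c, row_q in zip(dct_block, Q)]

def dequantize(levels, Q):
    """Pointwise product with Q; the rounding error is not recovered."""
    return [[level * q for level, q in zip(row_l, row_q)]
            for row_l, row_q in zip(levels, Q)]

dct_block = [[1260.0, -23.0], [-11.0, 4.0]]   # hypothetical DCT coefficients
Q = [[16, 11], [12, 12]]                      # hypothetical normalization array
levels = quantize(dct_block, Q)
lossy = dequantize(levels, Q)
print(levels, lossy)
```

The small coefficient 4 is quantized to zero and cannot be restored; this rounding is where JPEG loses information.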
Lossy Coding – JPEG Baseline Example

I A block DCT (integer only algorithm) I˜k =

I JPEG Normalization Array Q =

Lossy Coding – JPEG Baseline Example

I A block DCT (integer only algorithm) I˜k =

I Notice all the zeros - due to DCT’s good energy

compaction
Lossy Coding – JPEG Data Rearrangement

I Rearrange quantized AC coefficients

I This array contains mostly zeros, especially at high
frequencies
I So, rearrange into a 1-D array using zig-zag ordering

Reordered quantized block is:

[79, 0, −2, −1, −1, −1, 0, 0, −1, (55 0's)]
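Zig-zag ordering can be sketched by sorting the block coordinates along anti-diagonals. This is a minimal sketch that assumes the conventional JPEG scan (start at the DC term, move right first, alternate direction on each diagonal); the 3 × 3 block is a hypothetical stand-in for an 8 × 8 block, filled so that the scan visits the values in increasing order.

```python
def zigzag(block):
    """Read a square block along anti-diagonals, alternating direction."""
    n = len(block)

    def key(ij):
        i, j = ij
        s = i + j                        # index of the anti-diagonal
        # Odd diagonals run downward (i increasing), even ones upward.
        return (s, i if s % 2 else -i)

    order = sorted(((i, j) for i in range(n) for j in range(n)), key=key)
    return [block[i][j] for i, j in order]

block = [[0, 1, 5],
         [2, 4, 6],
         [3, 7, 8]]
scan = zigzag(block)
print(scan)
```

Low-frequency coefficients come first in the scan, so the zeros at high frequencies end up in one long run at the end of the 1-D array.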
Lossy Coding – JPEG DC Coefficient Handling

I Simple DPCM applied to DC values $\tilde{I}_k(0, 0)$ between adjacent blocks to reduce entropy
I Difference between current block and left-adjacent block is found:
$$e(k) = \tilde{I}_k(0, 0) - \tilde{I}_{k-1}(0, 0)$$
I e(k) losslessly encoded with Huffman coder
I First column of DC values retained to allow reconstruction
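The DC differencing can be sketched for a single row of blocks. The DC values are hypothetical, and the first value of the row is kept unchanged so that the decoder can rebuild the rest by accumulation.

```python
def dpcm_encode(dc_values):
    """Keep the first DC value, then store e(k) = DC_k - DC_{k-1}."""
    return [dc_values[0]] + [dc_values[k] - dc_values[k - 1]
                             for k in range(1, len(dc_values))]

def dpcm_decode(encoded):
    """Rebuild the DC values with a running sum of the differences."""
    dc_values = [encoded[0]]
    for e in encoded[1:]:
        dc_values.append(dc_values[-1] + e)
    return dc_values

row_dc = [79, 81, 80, 80, 76]        # hypothetical quantized DC values
residuals = dpcm_encode(row_dc)
print(residuals, dpcm_decode(residuals))
```

Neighbouring blocks tend to have similar mean intensity, so the differences cluster near zero and are cheaper to Huffman code than the raw DC values.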
Lossy Coding – JPEG AC Coefficient Handling

I AC vector contains many zeros

I Using Run Length Coding (RLC) results in considerable
compression
I The AC vector is converted into 2-tuples (skip, value) where
I Skip = number of zeros preceding a non-zero value
I Value = the following non-zero value
I The AC pairs are then Huffman coded
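The (skip, value) conversion can be sketched on the AC part of the reordered block [79, 0, −2, −1, −1, −1, 0, 0, −1, (55 0's)]. The (0, 0) end-of-block marker used for the trailing zeros is an assumption borrowed from common JPEG practice; it is not stated above.

```python
EOB = (0, 0)   # assumed end-of-block marker for the trailing run of zeros

def run_length_encode(ac):
    pairs, skip = [], 0
    for value in ac:
        if value == 0:
            skip += 1                    # count zeros before the next value
        else:
            pairs.append((skip, value))
            skip = 0
    pairs.append(EOB)
    return pairs

def run_length_decode(pairs, length=63):
    ac = []
    for skip, value in pairs:
        if (skip, value) == EOB:
            break
        ac.extend([0] * skip + [value])
    return ac + [0] * (length - len(ac))   # restore the trailing zeros

ac = [0, -2, -1, -1, -1, 0, 0, -1] + [0] * 55   # 63 AC coefficients
pairs = run_length_encode(ac)
print(pairs)
```

The 63 AC coefficients collapse to five pairs plus the marker, and these few symbols are what the Huffman coder sees.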
Lossy Coding – JPEG Decoding

I Decoding is achieved by reversing Huffman coding, RLC and DPCM to recreate $\tilde{I}_k$
I Then multiply pointwise by the normalization array to create the lossy DCT
$$\tilde{I}_k^{lossy} = Q \otimes \tilde{I}_k$$
I The decoded image block is the IDCT of the result
$$I_k^{lossy} = \mathrm{IDCT}[\tilde{I}_k^{lossy}]$$
I The overall decoded image is recreated by putting together the decoded 8 × 8 pieces:
$$I^{lossy} = [I_k^{lossy}]$$
Lossy Coding – JPEG Example

Figure: JPEG compression results: (a) Original, (b) 16:1, (c) 32:1, (d) 64:1

Block Motion Estimation

I Image motion estimation, as in optical flow, is important

for many video applications including video filtering, motion
compensation, and video compression
I Practical systems usually use block motion estimation for
ease of implementation
I This approach assumes that the video consists of images
containing moving blocks
I The blocks are assumed to have simple translational motion.
We can again use windows and windowed sets to express
this
Video Notation

I A video sequence I is a 3-D array or signal:
$$I = [I(i, j, k);\ 0 \leq i \leq N-1,\ 0 \leq j \leq M-1,\ 0 \leq k \leq K-1]$$
I Here, we will regard still images as single images taken from a video. Thus a video consists of a sequence of still images:
$$I = [\ldots\ I_{k-1}\ I_k\ I_{k+1}\ \ldots]$$
Video Windows

I A window B is a set of 2-D coordinate shifts $B_i = (m_i, n_i)$:
$$B = \{B_1, \ldots, B_{2M+1}\} = \{(m_1, n_1), \ldots, (m_{2M+1}, n_{2M+1})\}$$
I Given an image $I_k$ and a window B, the windowed set at (i, j, k) is
$$B\,I(i, j, k) = B\,I_k(i, j) = \{I(i - m, j - n, k);\ (m, n) \in B\}$$
the set of image pixels covered by B at coordinate (i, j) at time k
I The windows used are square and non-overlapping for ease of implementation
I Standards typically use 16 × 16
Translational Block Motion

I This assumes that at some later time k + r , each block or

windowed set at time k has translated in the i and j
directions.
B I(i, j, k) = B I(i + d1 , j + d2 , k + r)
for integer displacement (d1 , d2 ) and time shift r , as depicted.
Translational Block Motion

I Advantages of translational block models:

I Only one motion vector needed per block
I Ease of hardware implementation
I Disadvantages of translational block models:
I Inaccurate for other motion types: zoom, rotation, bending
etc.
I Leads to visual “blocking artifacts” at low bitrates
I It is possible to estimate other motion types, but at much
higher cost. Standards use the simple model
Block Matching

I Goal: Estimate (d1 , d2 ) by block matching - the simplest

method for estimating block motion
I Involves a simple search to find the best-fitting translational
motion for each block
I Method: for each block B I(i, j, k) in video signal I at time
k, search for the best-fitting block of the same size at
time k + 1
I The blocks that are found at time k + 1 may overlap
Search Space

The search is conducted over a neighborhood centered around the

location (i, j) of the original block:
Block Match Measures

I Goal: Find the block with the minimum error with respect
to the original block:
$$\text{FIND: } \min_{(d_1, d_2)} \| B\,I(i, j, k) - B\,I(i + d_1, j + d_2, k + 1) \|$$
where ||.|| is an error metric such as (assume P × Q blocks):
$$MSE(d_1, d_2) = \frac{1}{PQ} \sum_{(m,n) \in B} [I(i - m, j - n, k) - I(i - m + d_1, j - n + d_2, k + 1)]^2$$
$$MAD(d_1, d_2) = \frac{1}{PQ} \sum_{(m,n) \in B} |I(i - m, j - n, k) - I(i - m + d_1, j - n + d_2, k + 1)|$$
I MAD is commonly used in practice since it requires no computation of squares:
$$(d_1^*, d_2^*) = \arg\min_{(d_1, d_2)} MAD(d_1, d_2)$$
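An exhaustive MAD search over a limited displacement range can be sketched as follows. For simplicity the sketch indexes each block by its top-left corner (i, j) rather than by the window shifts (m, n) used above, and the frames are small hypothetical lists of lists containing a single bright patch that moves by (1, 2).

```python
def mad(f0, f1, i, j, d1, d2, P, Q):
    """Mean absolute difference between a block in f0 and a displaced block in f1."""
    total = 0
    for m in range(P):
        for n in range(Q):
            total += abs(f0[i + m][j + n] - f1[i + m + d1][j + n + d2])
    return total / (P * Q)

def block_match(f0, f1, i, j, P, Q, M):
    """Exhaustive search over -M <= d1, d2 <= M; returns (d1*, d2*)."""
    rows, cols = len(f1), len(f1[0])
    best = None
    for d1 in range(-M, M + 1):
        for d2 in range(-M, M + 1):
            # Skip candidates whose block would leave the frame.
            if not (0 <= i + d1 and i + d1 + P <= rows
                    and 0 <= j + d2 and j + d2 + Q <= cols):
                continue
            err = mad(f0, f1, i, j, d1, d2, P, Q)
            if best is None or err < best[0]:
                best = (err, d1, d2)
    return best[1], best[2]

def frame_with_patch(top, left):
    frame = [[0] * 8 for _ in range(8)]
    for m, row in enumerate([[10, 20], [30, 40]]):
        for n, value in enumerate(row):
            frame[top + m][left + n] = value
    return frame

frame_k = frame_with_patch(3, 3)
frame_k1 = frame_with_patch(4, 5)    # same patch moved by (1, 2)
motion = block_match(frame_k, frame_k1, 3, 3, 2, 2, M=2)
print(motion)
```

The cost is (2M + 1)² MAD evaluations per block, which is what motivates the reduced searches discussed next.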
Block Searching

I In practice it is far too time consuming to check for all

possible matches. Instead, a subset of matches is checked
I First, the amount of translation is always limited:
$$-M \leq d_1, d_2 \leq M$$
I Three-step search is a typical strategy that is used. It involves narrowing down the best location using a directed search
I However, it is sub-optimal: the directed search can miss the best match
Three-Step Search
I Step 1: Compute the error at 9 equally-spaced candidate displacements centered at d1 = d2 = 0:

I Step 2: Localize the search near the best match from Step 1:
Three-Step Search
I Step 3: Localize the search near the best match from Step 2:

I The motion estimate is then simply the displacement

between the current block (time k) and the best match (time
k + 1)
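The three steps can be sketched as a loop that halves the step size. This is a minimal sketch under several assumptions: step sizes 4, 2, 1 (a ±7 pixel range), blocks indexed by their top-left corner, no frame-boundary checks (the caller must keep every candidate block inside the frame), and a synthetic test pair whose true displacement (4, −4) lies on the first search grid.

```python
def mad(f0, f1, i, j, d1, d2, P, Q):
    total = 0
    for m in range(P):
        for n in range(Q):
            total += abs(f0[i + m][j + n] - f1[i + m + d1][j + n + d2])
    return total / (P * Q)

def three_step_search(f0, f1, i, j, P, Q, step=4):
    c1, c2 = 0, 0                            # current search centre
    while step >= 1:
        best = (mad(f0, f1, i, j, c1, c2, P, Q), c1, c2)
        for s1 in (-step, 0, step):          # 9 equally spaced candidates
            for s2 in (-step, 0, step):
                d1, d2 = c1 + s1, c2 + s2
                err = mad(f0, f1, i, j, d1, d2, P, Q)
                if err < best[0]:
                    best = (err, d1, d2)
        _, c1, c2 = best                     # re-centre on the best match
        step //= 2                           # 4 -> 2 -> 1 -> stop
    return c1, c2

# Synthetic frames: frame_k1 is frame_k displaced by (4, -4).
frame_k = [[37 * r + c for c in range(32)] for r in range(32)]
frame_k1 = [[37 * (r - 4) + (c + 4) for c in range(32)] for r in range(32)]
motion = three_step_search(frame_k, frame_k1, 10, 10, 4, 4)
print(motion)
```

Each step evaluates at most 9 candidates, far fewer than the 225 of an exhaustive ±7 search. The price is the sub-optimality noted earlier: for displacements that lie off the coarse grid, the search can settle on a neighbouring candidate instead of the true minimum.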
Comments on Match Search

I Discussed forward motion estimation between current frame

k and next frame k + 1
I Backward motion estimation between current frame k and
previous frame k − 1 common
I Block-based motion estimation is really a form of optical flow
I Current (H.264, H.265) video coding standards use
block-based motion estimation
Motion-Compensated Transform Coding

I Compute block motion displacement vectors using loopback

from frame Ik to frame Ik−1 . Usually 16×16 blocks. Blocks
are non-overlapping in frame Ik . This is referred to as
interframe coding
I Compute motion-compensated difference image Dk by
differencing each 16×16 block in Ik with its corresponding
displaced block in Ik−1
I Subdivide difference image Dk into sub-blocks (usually 8×8)
and code using JPEG-like algorithm
I The first frame is coded like an image. This is referred to as
intraframe coding
Comments on Motion Compensation

I Motion compensation (MC) is highly effective for:

I Increasing compression efficiency
I Reducing “ghosting” artifacts in compressed video. Very fast
movements result in large differences between blocks. Without
MC, leads to “motion ghosts”
I Accommodating compression to temporal aliasing
I Current standards (H.264, H.265) use MC
Practical Video Codec – The Basics

I Earlier, defined video as time-indexed images

I Can sample in all three dimensions, yielding discrete video.
This is always quantized, which is digital video
I In principle, analog video is continuous in all three dimensions
I In practice, analog video is sampled along one spatial
dimension and along time dimension
Practical Video Codec – Analog Video

I An optical analog video signal is a function Ic (x, y , t) of

space and time
I Practical video systems, such as television and monitors,
represent analog video as a one-dimensional electrical
signal V (t)
I A 1-D signal V (t) is obtained by sampling Ic (x, y , t) along
the vertical (y ) direction and along time (t) direction. This is
called scanning and the result is a series of scan lines
Analog Video Sampling

I Progressive Analog Video involves sampling row after row

at intervals ∆y and each frame at intervals ∆t

I Interlaced Analog Video involves sampling even and odd

rows alternately
Digital Video Sampling

I Digital video is obtained either by sampling an analog video

signal or by directly sampling the 3-D intensity distribution
I If progressive analog video is sampled, or if digital video is
directly sampled, then the sampling is rectangular and
properly indexed
I If interlaced analog video is sampled, then the digital video is also interlaced and must be re-indexed
Practical Video Codec – TV Standards

I NTSC (National Television System Committee)

I 2:1 interlaced
I 525 lines per frame (262.5 / refresh) - 485 active, 40 blank
I 60 refreshes / second
I Used heavily in Japan and North America
I PAL (Phase Alternation Line)
I 2:1 interlaced
I 625 lines per frame
I 50 refreshes / second
I Used heavily in Europe and Asia (including India)
I Older tube TVs used this format
Practical Video Codec – Aspect Ratio

Aspect ratio: The ratio of the width of a video frame to its height

Figure: Analog formats used 4:3 aspect ratio

Practical Video Codec – Color Basics
I Any color can be represented as a combination of Red (R),
Blue (B), and Green (G)
I RGB representation codes a color video as three separate
signals
I The YIQ representation combines the information according
to perceptual criteria:
Y = 0.299R + 0.587G + 0.114B (luminance)
I = 0.596R − 0.275G − 0.321B (chrominance)
Q = 0.212R − 0.523G + 0.311B (chrominance)
I Alternately, YCbCr chrominance representation:
Cr = R − Y (chrominance)
Cb = B − Y (chrominance)
I Why bother? Compression! Chrominance information can be sent using a fraction of the bandwidth required for luminance information
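The YIQ conversion can be sketched as a direct evaluation of the three weighted sums. The sketch takes the green coefficient of I as negative (−0.275), as in the standard NTSC matrix, so that both chrominance rows sum to zero and a gray pixel carries no chrominance; RGB values are assumed to lie in [0, 1].

```python
def rgb_to_yiq(r, g, b):
    """Luminance Y and chrominance I, Q from RGB components in [0, 1]."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    i = 0.596 * r - 0.275 * g - 0.321 * b
    q = 0.212 * r - 0.523 * g + 0.311 * b
    return y, i, q

white = rgb_to_yiq(1.0, 1.0, 1.0)
gray = rgb_to_yiq(0.5, 0.5, 0.5)
red = rgb_to_yiq(1.0, 0.0, 0.0)
print(white, gray, red)
```

White and gray map to (Y, 0, 0) up to rounding error, so all color information sits in I and Q, which is what allows the chrominance channels to be coded more coarsely than luminance.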
Practical Video Codec – Resolution

Figure: Typical modern video resolutions

Practical Video Codec – HDTV Resolutions

The HDTV format:

I Has interlaced and progressive modes
I 720p: progressive, 1280×720 pixels, 60 frames per second
(fps). Raw BW (24 bits/pixel): 1.3 Gbps
I 1080i: interlaced, 1920×1080 pixels, 50 fields (25 frames) per
second (fps). Raw BW (24 bits/pixel): 1.2 Gbps
I 1080p: progressive, 1920×1080 pixels, 59.94 fps. Raw BW
(24 bits/pixel): 2.98 Gbps
I Aspect ratio: 16:9
I Typically compressed: MPEG-2 or H.264
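The raw bandwidth figures follow from multiplying pixels per frame, bits per pixel, and frames per second. A minimal sketch, with Gbps taken as 10⁹ bits per second:

```python
def raw_bandwidth_gbps(width, height, frames_per_second, bits_per_pixel=24):
    """Uncompressed bit rate in Gbps (1 Gbps = 1e9 bits/s)."""
    return width * height * bits_per_pixel * frames_per_second / 1e9

hd_720p = raw_bandwidth_gbps(1280, 720, 60)
hd_1080i = raw_bandwidth_gbps(1920, 1080, 25)      # 50 fields = 25 full frames
hd_1080p = raw_bandwidth_gbps(1920, 1080, 59.94)
print(round(hd_720p, 2), round(hd_1080i, 2), round(hd_1080p, 2))
```

The results, about 1.33, 1.24 and 2.98 Gbps, agree with the rounded figures quoted above and show why these formats are compressed before transmission.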
Practical Video Codec – Group of Pictures (GOP)

I Specifies the order in which Intra (I) and Inter (P, B) frames
are arranged in a video sequence

Figure: Traditional GOP structure

Video Compression Standard - H.264

I Standardized in 2003
I A large, complex video standard
I High-level overview here
I Good reference: “The H.264 Advanced Video Compression
Standard” by Iain E. Richardson, Wiley, 2010
H.264 - The Highlights

I Variable macroblock sizes - better motion estimation

I In-loop filtering - reduces blocking artifacts
I Integer transform - efficient implementation
I Improved lossless coding - CABAC
I Network Abstraction Layer (NAL) units - facilitates network
transport
Optical Flow

I Fundamental to the concept of motion, but nevertheless

different, is optical flow
I Optical flow is the instantaneous motion of image
intensities. This is not the same as the motion of the objects
being imaged: image motion is not object motion
I Examples:
I An “off-camera” variable light source illuminating a stationary
object. A case of image motion without object motion
I A mirrored sphere that is spinning. A case of object motion
without image motion
I Still, optical flow is all the motion information that the image
supplies! So, most methods of motion estimation, motion
compensation, etc. depend on it
Optical Flow – Continuous Formulation

I The image intensity at a point in space and time is I (x, y , t)

I After a sufficiently small time interval ∆t, the intensity at
(x, y ) will move to a point (x + ∆x, y + ∆y ). In other words:
I (x + ∆x, y + ∆y , t + ∆t) = I (x, y , t)
I Illustration on board . . .
I This assumes that the intensity does not change, just its
position
Optical Flow – Taylor Expansion

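I Expanding the LHS in a Taylor series:
$$I(x + \Delta x, y + \Delta y, t + \Delta t) = I(x, y, t) + \Delta x \frac{\partial I}{\partial x} + \Delta y \frac{\partial I}{\partial y} + \Delta t \frac{\partial I}{\partial t} + \text{higher order terms}$$
I So that
$$I(x, y, t) + \Delta x \frac{\partial I}{\partial x} + \Delta y \frac{\partial I}{\partial y} + \Delta t \frac{\partial I}{\partial t} + \text{higher order terms} = I(x, y, t)$$
I Letting the higher order terms go to 0 (assuming small time and motion), cancelling I(x, y, t) and dividing by ∆t:
$$\frac{\Delta x}{\Delta t} \frac{\partial I}{\partial x} + \frac{\Delta y}{\Delta t} \frac{\partial I}{\partial y} + \frac{\partial I}{\partial t} = 0$$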
Optical Flow Constraint Equation

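I Taking the limit ∆t → 0 yields:
$$\frac{\partial x}{\partial t} \frac{\partial I}{\partial x} + \frac{\partial y}{\partial t} \frac{\partial I}{\partial y} + \frac{\partial I}{\partial t} = 0$$
I The optical flow components are:
$$u(x, y, t) = \frac{\partial x}{\partial t}(x, y, t), \quad v(x, y, t) = \frac{\partial y}{\partial t}(x, y, t)$$
I Putting this together gives the optical flow constraint equation or OFCE:
$$I_x u + I_y v + I_t = 0$$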
I So-called since it does not solve for optical flow - only
constrains the optical flow vector (u, v ) to lie on a line
Optical Flow – The Aperture Problem

I Even knowing $I_x, I_y, I_t$ does not solve optical flow
I This is the aperture problem
I Imagine being able to view only a small region of the image that is in motion:

Figure: Aperture problem

I If the edge is sensed to be moving “up”, the true motion

could actually be in any of the shown directions
I In order to solve for optical flow, some other physically
meaningful constraints must be found or assumed
Smooth Optical Flow

I The assumption that is usually made is that optical flow is

smooth. By smooth is meant that the derivatives of u and v
have small magnitudes
I Solution involves minimizing the overall departure from smoothness:
$$E_{smooth} = E_s = \iint_{image} [u_x^2 + u_y^2 + v_x^2 + v_y^2]\, dx\, dy$$
I We also want the overall OFCE error to be small:
$$E_c = \iint_{image} [I_x u + I_y v + I_t]^2\, dx\, dy$$
A Minimization Problem

I Minimize the weighted sum:

E = Es + λEc
I A solution will always exist and be unique
I For larger λ, the solution will track OFCE closely
I For smaller λ, the solution will be forced smoother
I Picking λ is a hard problem – not discussed here
I We will not try to solve the continuous problem
Discrete Optical Flow

I Approximations of derivatives of flow:
$$u_x \approx [u(i+1, j) - u(i, j)]/2, \quad u_y \approx [u(i, j+1) - u(i, j)]/2$$
$$v_x \approx [v(i+1, j) - v(i, j)]/2, \quad v_y \approx [v(i, j+1) - v(i, j)]/2$$
I Then (the factor 1/4 comes from squaring the 1/2 in the approximations above)
$$E_s = \frac{1}{4} \sum_{i=0}^{N-1} \sum_{j=0}^{M-1} \{[u(i+1, j) - u(i, j)]^2 + [u(i, j+1) - u(i, j)]^2 + [v(i+1, j) - v(i, j)]^2 + [v(i, j+1) - v(i, j)]^2\}$$
$$E_c = \sum_{i=0}^{N-1} \sum_{j=0}^{M-1} [I_x(i, j) u(i, j) + I_y(i, j) v(i, j) + I_t(i, j)]^2$$
I The estimates $I_x(i, j), I_y(i, j), I_t(i, j)$ will be discussed soon . . .
Discrete Optimization

I The goal is to minimize: E = Es + λEc

I Take derivatives w.r.t. u(i, j), v(i, j) for 0 ≤ i ≤ N − 1, 0 ≤ j ≤ M − 1:
$$\frac{\partial E}{\partial u(i, j)} = 2[u(i, j) - u_{ave}(i, j)] + 2\lambda [I_x u(i, j) + I_y v(i, j) + I_t] I_x$$
$$\frac{\partial E}{\partial v(i, j)} = 2[v(i, j) - v_{ave}(i, j)] + 2\lambda [I_x u(i, j) + I_y v(i, j) + I_t] I_y$$
I The local 4-averages are:
$$u_{ave}(i, j) = \frac{1}{4}[u(i+1, j) + u(i-1, j) + u(i, j+1) + u(i, j-1)]$$
$$v_{ave}(i, j) = \frac{1}{4}[v(i+1, j) + v(i-1, j) + v(i, j+1) + v(i, j-1)]$$
Discrete Solution

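I The minima occur when the derivatives are zero:
$$\frac{\partial E}{\partial u(i, j)} = \frac{\partial E}{\partial v(i, j)} = 0$$
I This results in:
$$(1 + \lambda I_x^2) u(i, j) + \lambda I_x I_y v(i, j) = u_{ave}(i, j) - \lambda I_x I_t$$
$$(1 + \lambda I_y^2) v(i, j) + \lambda I_x I_y u(i, j) = v_{ave}(i, j) - \lambda I_y I_t$$
I Solving for u(i, j), v(i, j) yields:
$$u(i, j) = u_{ave}(i, j) - \lambda \frac{I_x u_{ave}(i, j) + I_y v_{ave}(i, j) + I_t}{1 + \lambda (I_x^2 + I_y^2)} I_x$$
$$v(i, j) = v_{ave}(i, j) - \lambda \frac{I_x u_{ave}(i, j) + I_y v_{ave}(i, j) + I_t}{1 + \lambda (I_x^2 + I_y^2)} I_y$$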
Intensity Gradient Estimation

I The derivatives $I_x, I_y, I_t$ can also be estimated as differences-of-averages across a 2 × 2 × 2 data cube:
$$I_x \approx \frac{1}{4}\{[I(i+1, j, k) + I(i+1, j, k+1) + I(i+1, j+1, k) + I(i+1, j+1, k+1)] - [I(i, j, k) + I(i, j, k+1) + I(i, j+1, k) + I(i, j+1, k+1)]\}$$
$$I_y \approx \frac{1}{4}\{[I(i, j+1, k) + I(i, j+1, k+1) + I(i+1, j+1, k) + I(i+1, j+1, k+1)] - [I(i, j, k) + I(i, j, k+1) + I(i+1, j, k) + I(i+1, j, k+1)]\}$$
$$I_t \approx \frac{1}{4}\{[I(i, j, k+1) + I(i, j+1, k+1) + I(i+1, j, k+1) + I(i+1, j+1, k+1)] - [I(i, j, k) + I(i, j+1, k) + I(i+1, j, k) + I(i+1, j+1, k)]\}$$
Iterative Solution

I The solution for u(i, j) and v(i, j) suggests a numerical algorithm for actually computing them. The relaxation algorithm is:
$$u^{(p+1)}(i, j) = u_{ave}^{(p)}(i, j) - \lambda \frac{I_x u_{ave}^{(p)}(i, j) + I_y v_{ave}^{(p)}(i, j) + I_t}{1 + \lambda (I_x^2 + I_y^2)} I_x$$
$$v^{(p+1)}(i, j) = v_{ave}^{(p)}(i, j) - \lambda \frac{I_x u_{ave}^{(p)}(i, j) + I_y v_{ave}^{(p)}(i, j) + I_t}{1 + \lambda (I_x^2 + I_y^2)} I_y$$
I This technique of computing a “new” estimate from “old” estimates is a common technique in numerical analysis called successive refinement
Initial Estimates

I The initial estimates $u^{(0)}(i, j), v^{(0)}(i, j)$ might be taken from some independent estimate of u, v or simply by taking $u^{(0)}(i, j) = v^{(0)}(i, j) = 0$ which gives
$$u^{(1)}(i, j) = -\lambda \frac{I_t I_x}{1 + \lambda (I_x^2 + I_y^2)}$$
$$v^{(1)}(i, j) = -\lambda \frac{I_t I_y}{1 + \lambda (I_x^2 + I_y^2)}$$
Iteration Limit

I The iterations are continued either:
I for a prescribed number P of iterations
I until iterating doesn’t change the solution much, e.g., $\max_{(i,j)} |u^{(p+1)}(i, j) - u^{(p)}(i, j)| < \epsilon$ where $\epsilon$ is a tolerance threshold
I Although in principle it could take N iterations for the constraints to propagate across the image domain, in practice it takes just a few iterations due to the localness of image motion
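The relaxation iteration and both stopping rules can be sketched in a few lines. This is a minimal sketch under stated assumptions: the gradients $I_x, I_y, I_t$ are supplied as lists of lists, border pixels reuse their own value for missing neighbours in the 4-average, the iteration starts from zero flow, and the test gradients describe a hypothetical pattern moving at one pixel per frame in the first coordinate direction.

```python
def horn_schunck(Ix, Iy, It, lam=1.0, max_iter=100, tol=1e-6):
    N, M = len(Ix), len(Ix[0])
    u = [[0.0] * M for _ in range(N)]    # u^(0) = 0
    v = [[0.0] * M for _ in range(N)]    # v^(0) = 0

    def ave(f, i, j):
        # Local 4-average; indices are clamped at the border (an assumption).
        return 0.25 * (f[min(i + 1, N - 1)][j] + f[max(i - 1, 0)][j]
                       + f[i][min(j + 1, M - 1)] + f[i][max(j - 1, 0)])

    for _ in range(max_iter):            # prescribed number of iterations
        u_new = [[0.0] * M for _ in range(N)]
        v_new = [[0.0] * M for _ in range(N)]
        change = 0.0
        for i in range(N):
            for j in range(M):
                ua, va = ave(u, i, j), ave(v, i, j)
                ix, iy, it = Ix[i][j], Iy[i][j], It[i][j]
                g = lam * (ix * ua + iy * va + it) / (1 + lam * (ix * ix + iy * iy))
                u_new[i][j] = ua - g * ix
                v_new[i][j] = va - g * iy
                change = max(change, abs(u_new[i][j] - u[i][j]))
        u, v = u_new, v_new
        if change < tol:                 # solution no longer changes much
            break
    return u, v

# Hypothetical gradients: Ix = 1, Iy = 0, It = -1, so the OFCE gives u = 1.
Ix = [[1.0] * 4 for _ in range(4)]
Iy = [[0.0] * 4 for _ in range(4)]
It = [[-1.0] * 4 for _ in range(4)]
u1, v1 = horn_schunck(Ix, Iy, It, max_iter=1)
u, v = horn_schunck(Ix, Iy, It)
print(u1[1][1], u[1][1], v[1][1])
```

With λ = 1 the first iterate is 0.5, matching $u^{(1)} = -\lambda I_t I_x / (1 + \lambda (I_x^2 + I_y^2))$, and the estimate then approaches u = 1 with the error halving at each pass.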
Optical Flow Example

Figure: (a) Original flow, (b) Estimated flow. Needle diagram: arrow direction indicates flow direction and its length indicates flow magnitude
Optical Flow Example

I Results are accurate in most places but errors occur near the sphere boundary
I Errors occur near flow discontinuities, where the smoothness condition is inaccurate
I This is the Horn-Schunck algorithm, the first and still classic approach
I Many sophisticated techniques exist, e.g., attempting to find flow discontinuities, then disabling the smoothness constraint there
