Dynamic 2D/3D Registration

Sofien Bouaziz Andrea Tagliasacchi Mark Pauly

École Polytechnique Fédérale de Lausanne

Abstract

Image and geometry registration algorithms are an essential component of many
computer graphics and computer vision systems. With recent technological advances
in RGB-D sensors, such as the Microsoft Kinect or Asus Xtion Live, robust
algorithms that combine 2D image and 3D geometry registration have become an
active area of research. The goal of this course is to introduce the basics of
2D/3D registration algorithms and to provide theoretical explanations and
practical tools to design computer vision and computer graphics systems based on
RGB-D devices. To illustrate the theory and demonstrate practical relevance, we
briefly discuss three applications: rigid scanning, non-rigid modeling, and realtime
face tracking. Our course targets researchers and computer graphics practitioners
with a background in computer graphics and/or computer vision. An up-to-date
version of the course notes as well as slides and source code can be found at
http://lgg.epfl.ch/2d3dRegistration.

About the lecturers

Sofien Bouaziz is a PhD student in the Computer Graphics and Geometry Laboratory
at the École Polytechnique Fédérale de Lausanne (EPFL) under the supervision of Prof.
Mark Pauly. He received his MSc degree in Computer Science from EPFL in 2009. His
research interests include computer graphics, computer vision, and machine learning.
Sofien co-developed the facial motion capture software faceshift studio.

e-mail: [email protected]

website: http://lgg.epfl.ch/~bouaziz

Andrea Tagliasacchi is a post-doctoral scholar in the Computer Graphics and Geometry
Laboratory at the École Polytechnique Fédérale de Lausanne (EPFL). He received his
MSc from Politecnico di Milano and a PhD from Simon Fraser University (SFU) under
the joint supervision of Prof. Richard Zhang and Prof. Daniel Cohen-Or. His research
interests include computer graphics, geometry processing and computer vision with a
focus on geometry tracking.

e-mail: [email protected]

website: http://drtaglia.github.io

Mark Pauly is an associate professor of computer science at EPFL in Lausanne, Switzerland,
where he directs the Computer Graphics and Geometry Laboratory. Prior to joining
EPFL he was an assistant professor at ETH Zurich and a postdoctoral scholar at Stan-
ford University. He received his Ph.D. degree in 2003 from ETH Zurich. His research
interests include computer graphics and animation, shape analysis, geometry processing,
and architectural design.

e-mail: [email protected]

website: http://lgg.epfl.ch

Sofien and Mark are co-founders of faceshift AG (www.faceshift.com), an EPFL spin-off
that brings high-quality markerless facial motion capture to the consumer market.

1 Introduction

Recent technological advances in RGB-D sensing devices, such as the Microsoft Kinect,
facilitate numerous new and exciting applications, for example in 3D scanning [24] and
human motion tracking [26, 19, 6]. While affordable and accessible, consumer-level RGB-
D devices typically exhibit high noise levels in the acquired data. Moreover, difficult
lighting situations and geometric occlusions commonly occur in many application set-
tings, potentially leading to a severe degradation in data quality. This necessitates a
particular emphasis on the robustness of image and geometry processing algorithms.
The combination of 2D and 3D registration is one important aspect in the design of ro-
bust applications based on RGB-D devices. This lecture introduces the main concepts
of 2D and 3D registration and explains how to combine them efficiently. An up-to-date
version of these course notes as well as slides and source code can be found at
http://lgg.epfl.ch/2d3dRegistration.

2 2D/3D Registration

In the first part of the course we introduce the theory of 2D/3D registration algorithms
suitable for processing RGB-D data. We focus on pairwise registration to compute the
alignment of a source model onto a target model. This alignment can be rigid or
non-rigid, depending on the type of object being scanned. We formulate the registration
as the minimization of an energy

$$E_{\mathrm{reg}} = E_{\mathrm{match}} + E_{\mathrm{prior}}. \tag{1}$$

The matching energy $E_{\mathrm{match}}$ measures how close the source is to the target.
The prior energy $E_{\mathrm{prior}}$ quantifies the deviation from the type of
transformation or deformation that the source is allowed to undergo during the
registration, for example, a rigid motion or an elastic deformation. The goal of
registration is to find a transformation of the source model that minimizes
$E_{\mathrm{reg}}$ and thereby brings the source into alignment with the target.
For data acquired with RGB-D devices, registration can utilize both the geometric
information encoded in the 3D depth map and the color information provided by the
recorded 2D images. We show that Equation 1 provides a unified way to formulate both
2D and 3D registration, which simplifies their integration.

2.1 3D Registration

In 3D registration we want to align a source surface $\mathcal{X}$ embedded in
$\mathbb{R}^3$ to a target surface $\mathcal{Y}$ in $\mathbb{R}^3$. To formalize this
problem, we introduce a surface $\mathcal{Z}$ that is a transformed or deformed version
of $\mathcal{X}$ that eventually aligns with $\mathcal{Y}$. To solve the registration
problem numerically, we represent the continuous surface $\mathcal{X}$ by a set of points
$X = \{\mathbf{x}_i \in \mathcal{X},\ i = 1 \dots n\}$ and define their corresponding
points on the deformed surface $\mathcal{Z}$ as
$Z = \{\mathbf{z}_i \in \mathcal{Z},\ i = 1 \dots n\}$. Different sampling strategies
have been presented by Rusinkiewicz and Levoy [21].

2.1.1 Matching energy

The matching energy measures how close the surface $\mathcal{Z}$ is to the surface
$\mathcal{Y}$ and is defined as

$$E_{\mathrm{match}}(\mathcal{Z}) = \int_{\mathcal{Z}} \varphi(\mathbf{z}, \mathcal{Y})\, d\mathbf{z}, \tag{2}$$

where $\mathbf{z} \in \mathbb{R}^3$ is a point on $\mathcal{Z}$. The accuracy of the
registration is evaluated by the metric $\varphi$ that measures the distance to
$\mathcal{Y}$. For simplicity, we will first use the squared Euclidean distance as
metric. Robust metrics [17] could be used instead to increase the robustness of the
registration to noise and outliers; they are presented later on. Using the set of
points $Z$, we can discretize the matching energy as

$$E_{\mathrm{match}}(Z) = \sum_{i=1}^{n} \|\mathbf{z}_i - P_{\mathcal{Y}}(\mathbf{z}_i)\|_2^2, \tag{3}$$

where $P_{\mathcal{Y}}(\mathbf{z}_i) : \mathbb{R}^3 \to \mathbb{R}^3$ returns the
closest point (using Euclidean distance) on the surface $\mathcal{Y}$ from
$\mathbf{z}_i$. $P_{\mathcal{Y}}(\mathbf{z}_i)$ can also be seen as the orthogonal
projection of $\mathbf{z}_i$ onto $\mathcal{Y}$.
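As an illustration of the discrete matching energy in Equation 3, the following minimal sketch evaluates it with a brute-force closest-point search. It assumes the target surface is only available as a set of samples, so the closest sample stands in for the true orthogonal projection; the function names `closest_points` and `matching_energy` are hypothetical, and a spatial data structure such as a k-d tree would normally replace the quadratic search.

```python
import numpy as np

def closest_points(Z, Y):
    """For each row of Z, return the closest row of Y (brute force)."""
    # Pairwise squared distances, shape (n, m).
    d2 = ((Z[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)
    return Y[d2.argmin(axis=1)]

def matching_energy(Z, Y):
    """Discrete matching energy of Eq. 3, with Y given as target samples."""
    P = closest_points(Z, Y)
    return float(((Z - P) ** 2).sum())

# Toy data: the target is a 5x5 grid of samples in the plane z = 0,
# and the three source points hover 0.1 above three of those samples.
gx, gy = np.meshgrid(np.arange(5.0), np.arange(5.0))
Y = np.stack([gx.ravel(), gy.ravel(), np.zeros(25)], axis=1)
Z = np.array([[0.0, 0.0, 0.1], [2.0, 3.0, 0.1], [4.0, 4.0, 0.1]])

E = matching_energy(Z, Y)  # three residuals of length 0.1 each
```

Each source point contributes its squared distance to the nearest target sample, so the energy here is three times 0.1 squared.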

2.1.2 Prior energy

In this section we present several prior energies that can be used for registration. These
energies can also be combined to build more sophisticated priors. Priors encode properties
of the scanned objects. For example, when scanning rigid objects, a global rigidity
prior can be used to limit the allowed transformations to rotations and translations. For
deforming objects, for example a human body, geometric priors are often employed that
try to mimic physical behavior such as an elastic deformation. We describe a simple
local rigidity prior that approximates elastic deformations and facilitates efficient
implementations. More complex deformation behavior can be captured using a data-driven
approach. One popular method is based on a collection of sample shapes that represents
the space of allowed deformations. Using dimensionality reduction, for example
principal component analysis, efficient linear models can be derived that are suitable for
realtime registration algorithms.

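Global rigidity. The global rigidity of the 3D registration can be measured as

$$E_{\mathrm{rigid}}(Z, \mathbf{R}, \mathbf{t}) = \sum_{i=1}^{n} \|\mathbf{z}_i - (\mathbf{R}\mathbf{x}_i + \mathbf{t})\|_2^2, \tag{4}$$

where $\mathbf{R} \in \mathbb{R}^{3 \times 3}$ is a rotation matrix and
$\mathbf{t} \in \mathbb{R}^3$ a translation vector. In this case, the deformed surface
$\mathcal{Z}$ tries to follow a rigid transformation of the original surface $\mathcal{X}$.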

Local rigidity. The local rigidity energy, following [22, 4], can be expressed as

$$E_{\mathrm{arap}}(Z, \{\mathbf{R}_i\}_{i=1}^{n}) = \sum_{i=1}^{n} \sum_{j \in N_i} \|(\mathbf{z}_j - \mathbf{z}_i) - \mathbf{R}_i(\mathbf{x}_j - \mathbf{x}_i)\|_2^2, \tag{5}$$

where the $\mathbf{R}_i \in \mathbb{R}^{3 \times 3}$ are rotation matrices and $N_i$ is
the set of indices of the neighboring points of $\mathbf{x}_i$. In this case, each local
neighborhood on the surface $\mathcal{Z}$ tries to follow a rigid transformation of its
corresponding local neighborhood on the surface $\mathcal{X}$. Other local rigidity
energies can also be used as prior, see for example [3, 23].
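To make the local rigidity energy of Equation 5 concrete, the sketch below evaluates it on a tiny chain of points. The helper names (`arap_energy`, `rot_z`) and the toy neighborhood structure are assumptions made for illustration only. It shows that a global rigid motion has zero energy when every local rotation equals the global one, whereas a stretch is penalized.

```python
import numpy as np

def arap_energy(Z, X, rotations, neighbors):
    """Eq. 5: sum_i sum_{j in N_i} ||(z_j - z_i) - R_i (x_j - x_i)||^2."""
    total = 0.0
    for i, N_i in enumerate(neighbors):
        for j in N_i:
            diff = (Z[j] - Z[i]) - rotations[i] @ (X[j] - X[i])
            total += float(diff @ diff)
    return total

def rot_z(angle):
    """Rotation about the z axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Four points on a line, each connected to its immediate neighbours.
X = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0],
              [2.0, 0.0, 0.0], [3.0, 0.0, 0.0]])
neighbors = [[1], [0, 2], [1, 3], [2]]

R = rot_z(0.7)
t = np.array([0.5, -1.0, 2.0])
Z_rigid = X @ R.T + t                         # rigid motion of X
Z_stretch = X * np.array([1.5, 1.0, 1.0])     # stretch along x by 50%

E_rigid_motion = arap_energy(Z_rigid, X, [R] * 4, neighbors)
E_stretch = arap_energy(Z_stretch, X, [np.eye(3)] * 4, neighbors)
```

In the stretched configuration each of the six directed edges is 0.5 too long, which gives an energy of 6 x 0.25 = 1.5 for identity rotations.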

Linear model. A 3D linear shape model can be defined using a matrix $\mathbf{P}$
containing the shape model basis, and a mean shape vector $\mathbf{m}$ [10]. A new shape
$\mathbf{s}$ can be defined as

$$\mathbf{s} = \mathbf{P}\mathbf{d} + \mathbf{m}, \tag{6}$$

where $\mathbf{d}$ is a vector containing the basis coefficients. A linear model prior
energy can be formulated as the deviation of the vertices from the linear model

$$E_{\mathrm{prior}}(Z, \mathbf{d}) = \sum_{i=1}^{n} \|\mathbf{z}_i - (\mathbf{P}_i\mathbf{d} + \mathbf{m}_i)\|_2^2, \tag{7}$$

where $\mathbf{P}_i$ and $\mathbf{m}_i$ are the parts of $\mathbf{P}$ and $\mathbf{m}$
corresponding to the vertex $\mathbf{z}_i$.
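For fixed vertex positions, Equation 7 is an ordinary linear least-squares problem in the coefficients. The sketch below, which uses a randomly generated basis and mean as stand-ins for a real shape model (both are assumptions for illustration), synthesizes a shape with Equation 6 and recovers its coefficients.

```python
import numpy as np

rng = np.random.default_rng(0)
n_vertices, n_basis = 6, 2

# Hypothetical shape model: mean shape m and basis P, stacked as 3n vectors.
m = rng.normal(size=3 * n_vertices)
P = rng.normal(size=(3 * n_vertices, n_basis))

# Synthesize a shape from known coefficients (Eq. 6).
d_true = np.array([0.8, -0.3])
z = P @ d_true + m

# For fixed Z, Eq. 7 reads min_d ||P d - (z - m)||^2.
d_fit, *_ = np.linalg.lstsq(P, z - m, rcond=None)
E_prior = float(np.sum((z - (P @ d_fit + m)) ** 2))
```

Because the synthesized shape lies exactly in the span of the model, the fitted coefficients match the true ones and the prior energy vanishes up to rounding.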

2.1.3 Optimization

How to best optimize the registration energy depends on the prior energy. In this section
we show, as an example, how to optimize a registration energy for two applications: rigid
scanning and non-rigid modeling.

In-hand rigid scanning. Since single depth maps acquired with the RGB-D sensor
exhibit high noise levels and do not cover the whole surface of the 3D object, an aggrega-
tion procedure is typically applied to obtain a complete model with reduced noise level.
In order to aggregate multiple scans over time, different methods can be used [28, 29, 18].
The classical approach is to perform a 3D rigid registration of the currently acquired scan
of the object with the already accumulated 3D data. The pairwise 3D alignment can be
formulated as

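$$E(Z, \mathbf{R}, \mathbf{t}) = w_1 E_{\mathrm{match}} + w_2 E_{\mathrm{rigid}}, \tag{8}$$

$$E_{\mathrm{match}} = \sum_{i=1}^{n} \|\mathbf{z}_i - P_{\mathcal{Y}}(\mathbf{z}_i)\|_2^2,$$

$$E_{\mathrm{rigid}} = \sum_{i=1}^{n} \|\mathbf{z}_i - (\mathbf{R}\mathbf{x}_i + \mathbf{t})\|_2^2,$$

where the matching energy is combined with a global rigidity prior. To optimize
$E(Z, \mathbf{R}, \mathbf{t})$ we linearize the rotation matrix [20], approximating
$\cos\theta$ by $1$ and $\sin\theta$ by $\theta$:

$$\mathbf{R} \approx \tilde{\mathbf{R}} = \begin{bmatrix} 1 & -\gamma & \beta \\ \gamma & 1 & -\alpha \\ -\beta & \alpha & 1 \end{bmatrix}. \tag{9}$$

The alignment is computed by solving iteratively

$$\arg\min_{Z^{t+1},\, \tilde{\mathbf{R}},\, \tilde{\mathbf{t}}} \sum_{i=1}^{n} w_1 \|\mathbf{z}_i^{t+1} - P_{\mathcal{Y}}(\mathbf{z}_i^{t})\|_2^2 + w_2 \|\mathbf{z}_i^{t+1} - (\tilde{\mathbf{R}}(\mathbf{R}^{t}\mathbf{x}_i + \mathbf{t}^{t}) + \tilde{\mathbf{t}})\|_2^2, \tag{10}$$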

where $t$ is the iteration number and $\mathbf{z}_i^0 = \mathbf{x}_i$. As
$P_{\mathcal{Y}}(\cdot)$ is a nonlinear function that is difficult to optimize over, we
use the previous estimate $P_{\mathcal{Y}}(\mathbf{z}_i^{t})$ in the optimization. This
corresponds to the point-to-point matching error [1]. To speed up the convergence of the
optimization one can linearize $\|\mathbf{z}_i^{t+1} - P_{\mathcal{Y}}(\mathbf{z}_i^{t})\|_2$
at $P_{\mathcal{Y}}(\mathbf{z}_i^{t})$, which gives
$\mathbf{n}_i^T(\mathbf{z}_i^{t+1} - P_{\mathcal{Y}}(\mathbf{z}_i^{t}))$, where
$\mathbf{n}_i$ is the normal of the surface $\mathcal{Y}$ at
$P_{\mathcal{Y}}(\mathbf{z}_i^{t})$. This leads to the point-to-plane matching error [8].
The optimization can be reformulated as

$$\arg\min_{Z^{t+1},\, \tilde{\mathbf{R}},\, \tilde{\mathbf{t}}} \sum_{i=1}^{n} w_1 \left(\mathbf{n}_i^T(\mathbf{z}_i^{t+1} - P_{\mathcal{Y}}(\mathbf{z}_i^{t}))\right)^2 + w_2 \|\mathbf{z}_i^{t+1} - (\tilde{\mathbf{R}}(\mathbf{R}^{t}\mathbf{x}_i + \mathbf{t}^{t}) + \tilde{\mathbf{t}})\|_2^2. \tag{11}$$

Both Equation 10 and Equation 11 are quadratic and can therefore be minimized by
setting the partial derivatives to zero, which amounts to solving a linear system.
During the optimization, it can be advantageous to apply a Tikhonov regularization to
the parameters of the rigid motion, as linearizing the rotation matrix assumes that the
angles are small.
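The small-angle assumption behind Equation 9 can be checked numerically. The sketch below compares the linearized matrix with an exact rotation; the composition order Rz Ry Rx is an assumption chosen here for illustration, since any order agrees with Equation 9 to first order.

```python
import numpy as np

def rotation_xyz(alpha, beta, gamma):
    """Exact rotation R = Rz(gamma) @ Ry(beta) @ Rx(alpha)."""
    ca, sa = np.cos(alpha), np.sin(alpha)
    cb, sb = np.cos(beta), np.sin(beta)
    cg, sg = np.cos(gamma), np.sin(gamma)
    Rx = np.array([[1.0, 0.0, 0.0], [0.0, ca, -sa], [0.0, sa, ca]])
    Ry = np.array([[cb, 0.0, sb], [0.0, 1.0, 0.0], [-sb, 0.0, cb]])
    Rz = np.array([[cg, -sg, 0.0], [sg, cg, 0.0], [0.0, 0.0, 1.0]])
    return Rz @ Ry @ Rx

def linearized_rotation(alpha, beta, gamma):
    """First-order approximation of Eq. 9."""
    return np.array([[1.0, -gamma, beta],
                     [gamma, 1.0, -alpha],
                     [-beta, alpha, 1.0]])

def approximation_error(angle):
    """Largest entry-wise gap between the exact and the linearized matrix."""
    R = rotation_xyz(angle, angle, angle)
    R_lin = linearized_rotation(angle, angle, angle)
    return float(np.abs(R - R_lin).max())

err_small = approximation_error(0.01)  # roughly half a degree per axis
err_large = approximation_error(0.5)   # roughly 29 degrees per axis
```

The error grows quadratically with the angle, which is why damping the rigid motion parameters, so that each iteration only takes a small step, keeps the linearization trustworthy.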

Rigidity as a hard constraint. It is interesting to note that when $w_2 = +\infty$,
$\mathbf{z}_i$ can be replaced by $\mathbf{R}\mathbf{x}_i + \mathbf{t}$ in the matching
energy, leading to the registration energy

$$E(\mathbf{R}, \mathbf{t}) = \sum_{i=1}^{n} \|(\mathbf{R}\mathbf{x}_i + \mathbf{t}) - P_{\mathcal{Y}}(\mathbf{R}\mathbf{x}_i + \mathbf{t})\|_2^2. \tag{12}$$

This energy can be minimized in a similar spirit by linearizing the rotation matrix and
iteratively solving a linear system. Other approaches can be found in [11].
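The following sketch shows one way to minimize Equation 12 with the point-to-plane linearization and the linearized rotation of Equation 9. It is an illustration under several assumptions, not the course's reference code: the target is the union of the three coordinate planes so that the projection and normals are available in closed form, each incremental rotation is projected back onto a proper rotation with an SVD, and the names `project_to_target` and `icp_point_to_plane` as well as the damping value are hypothetical.

```python
import numpy as np

def rotation_xyz(alpha, beta, gamma):
    """Exact rotation Rz(gamma) @ Ry(beta) @ Rx(alpha), used for test data."""
    ca, sa = np.cos(alpha), np.sin(alpha)
    cb, sb = np.cos(beta), np.sin(beta)
    cg, sg = np.cos(gamma), np.sin(gamma)
    Rx = np.array([[1.0, 0.0, 0.0], [0.0, ca, -sa], [0.0, sa, ca]])
    Ry = np.array([[cb, 0.0, sb], [0.0, 1.0, 0.0], [-sb, 0.0, cb]])
    Rz = np.array([[cg, -sg, 0.0], [sg, cg, 0.0], [0.0, 0.0, 1.0]])
    return Rz @ Ry @ Rx

def project_to_target(P):
    """Closest points and normals on the union of the three coordinate planes."""
    k = np.abs(P).argmin(axis=1)      # index of the nearest plane per point
    rows = np.arange(len(P))
    Q = P.copy()
    Q[rows, k] = 0.0                  # orthogonal projection onto that plane
    N = np.zeros_like(P)
    N[rows, k] = 1.0                  # its unit normal
    return Q, N

def icp_point_to_plane(X, iterations=10, damping=1e-9):
    R, t = np.eye(3), np.zeros(3)
    for _ in range(iterations):
        P = X @ R.T + t
        Q, N = project_to_target(P)
        # Residual n^T((I + [w]x) p + t~ - q) is linear in (w, t~).
        A = np.hstack([np.cross(P, N), N])
        b = -np.sum(N * (P - Q), axis=1)
        # Tikhonov-damped normal equations.
        x = np.linalg.solve(A.T @ A + damping * np.eye(6), A.T @ b)
        al, be, ga = x[:3]
        R_lin = np.array([[1.0, -ga, be], [ga, 1.0, -al], [-be, al, 1.0]])
        U, _, Vt = np.linalg.svd(R_lin)
        R_delta = U @ Vt              # nearest rotation to the linearized one
        R, t = R_delta @ R, R_delta @ t + x[3:]
    return R, t

# Four points on each coordinate plane, moved by a known rigid motion.
target_pts = np.array([[1, 1, 0], [2, 1, 0], [1, 2, 0], [2, 2, 0],
                       [1, 0, 1], [2, 0, 1], [1, 0, 2], [2, 0, 2],
                       [0, 1, 1], [0, 2, 1], [0, 1, 2], [0, 2, 2]], dtype=float)
R_true = rotation_xyz(0.05, -0.03, 0.04)
t_true = np.array([0.05, -0.03, 0.04])
X = (target_pts - t_true) @ R_true    # so that R_true x_i + t_true lies on the target

R_est, t_est = icp_point_to_plane(X)
```

Each iteration composes the incremental motion with the current estimate, mirroring the term $\tilde{\mathbf{R}}(\mathbf{R}^{t}\mathbf{x}_i + \mathbf{t}^{t}) + \tilde{\mathbf{t}}$ in Equations 10 and 11.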

[Figure 1 image omitted. Panel labels: Accumulated Scans, Shape Model Fitting, 3D Mesh.]

Figure 1: Registration of a morphable model towards the scanned face.

Non-rigid registration. Registering a shape template towards a scanned 3D object
makes it possible to obtain a complete and clean 3D mesh [15]. An example is given in
Figure 1 in the context of face modeling. In this case, the morphable model of Blanz and
Vetter [2], which represents the variations of different human faces in neutral
expression, is registered to a scan of a face. Non-rigid modeling using a morphable
model can be formulated as

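$$E(Z, \mathbf{d}, \{\mathbf{R}_i\}_{i=1}^{n}, \mathbf{R}, \mathbf{t}) = w_1 E_{\mathrm{match}} + w_2 E_{\mathrm{rigid}} + w_3 E_{\mathrm{model}} + w_4 E_{\mathrm{arap}}, \tag{13}$$

$$\begin{aligned}
E_{\mathrm{match}} &= \sum_{i=1}^{n} \|\mathbf{z}_i - P_{\mathcal{Y}}(\mathbf{z}_i)\|_2^2, \\
E_{\mathrm{rigid}} &= \sum_{i=1}^{n} \|\mathbf{z}_i - (\mathbf{R}\mathbf{x}_i + \mathbf{t})\|_2^2, \\
E_{\mathrm{model}} &= \sum_{i=1}^{n} \|\mathbf{z}_i - (\mathbf{P}_i\mathbf{d} + \mathbf{m}_i)\|_2^2, \\
E_{\mathrm{arap}} &= \sum_{i=1}^{n} \sum_{j \in N_i} \|(\mathbf{z}_j - \mathbf{z}_i) - \mathbf{R}_i(\mathbf{x}_j - \mathbf{x}_i)\|_2^2.
\end{aligned} \tag{14}$$

A local rigidity energy is added to the optimization in order to get an accurate result, as
the morphable model represents the large-scale variability but might not capture
small-scale details. As previously, we solve iteratively

$$\begin{aligned}
\arg\min_{Z^{t+1},\, \mathbf{d},\, \{\tilde{\mathbf{R}}_i\}_{i=1}^{n},\, \tilde{\mathbf{R}},\, \tilde{\mathbf{t}}} \sum_{i=1}^{n} \Big[\, & w_1 \left(\mathbf{n}_i^T(\mathbf{z}_i^{t+1} - P_{\mathcal{Y}}(\mathbf{z}_i^{t}))\right)^2 + w_2 \|\mathbf{z}_i^{t+1} - (\tilde{\mathbf{R}}(\mathbf{R}^{t}\mathbf{x}_i + \mathbf{t}^{t}) + \tilde{\mathbf{t}})\|_2^2 \\
& + w_3 \|\mathbf{z}_i^{t+1} - (\mathbf{P}_i\mathbf{d} + \mathbf{m}_i)\|_2^2 + w_4 \sum_{j \in N_i} \|(\mathbf{z}_j^{t+1} - \mathbf{z}_i^{t+1}) - \tilde{\mathbf{R}}_i\mathbf{R}_i^{t}(\mathbf{x}_j - \mathbf{x}_i)\|_2^2 \,\Big],
\end{aligned} \tag{15}$$

which corresponds to solving a linear system.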

2.2 2D Registration

In 2D registration we want to register a source image $I$ to a target image $J$. During the
registration process, the 2D pixel grid of the source image
$X = \{\mathbf{x}_i \in \mathbb{R}^2,\ i = 1 \dots n\}$ is deformed to
$Z = \{\mathbf{z}_i \in \mathbb{R}^2,\ i = 1 \dots n\}$ to match the target image.

2.2.1 Matching energy

We define $I(\mathbf{x})$ as the pixel value of the image $I$ located at the position
$\mathbf{x}$. The matching energy measures the color similarity between the source image
and the target image sampled on the deformed grid $Z$:

$$E_{\mathrm{match}}(Z) = \sum_{i=1}^{n} \|I(\mathbf{x}_i) - J(\mathbf{z}_i)\|_2^2. \tag{16}$$

2.2.2 Prior energy

Similarly to 3D geometry registration, we can use different prior energies that can be
combined to build more complex priors.

Lucas-Kanade. In the Lucas-Kanade algorithm [16] the deformation is assumed to be
constant within a patch around each pixel. This corresponds to the prior energy

$$E_{\mathrm{LK}}(Z) = \sum_{i=1}^{n} \sum_{j \in N_i} \|(\mathbf{z}_j - \mathbf{x}_j) - (\mathbf{z}_i - \mathbf{x}_i)\|_2^2, \tag{17}$$

where $N_i$ is the set of indices of the neighbors of $\mathbf{x}_i$.

Horn-Schunck. In the Horn-Schunck algorithm [14] the smoothness of the flow is defined
using a Laplacian operator

$$E_{\mathrm{HK}}(Z) = \sum_{i=1}^{n} \Big\|(\mathbf{z}_i - \mathbf{x}_i) - |N_i|^{-1} \sum_{j \in N_i} (\mathbf{z}_j - \mathbf{x}_j)\Big\|_2^2, \tag{18}$$

where $|N_i|$ is the cardinality of $N_i$. This energy measures for each grid vertex the
deviation of its deformation from the mean deformation of its neighbors.

2.2.3 Optimization

In this section we show, as an example, how to optimize the matching energy combined
with the Laplacian smoothness energy. This is similar to the method presented in [14].
Our optimization energy is

$$E(Z) = w_1 E_{\mathrm{match}} + w_2 E_{\mathrm{HK}}, \tag{19}$$

$$E_{\mathrm{match}} = \sum_{i=1}^{n} \|I(\mathbf{x}_i) - J(\mathbf{z}_i)\|_2^2,$$

$$E_{\mathrm{HK}} = \sum_{i=1}^{n} \Big\|(\mathbf{z}_i - \mathbf{x}_i) - |N_i|^{-1} \sum_{j \in N_i} (\mathbf{z}_j - \mathbf{x}_j)\Big\|_2^2.$$

To solve this optimization we linearize $J(\cdot)$ at the current estimate and solve iteratively

$$\arg\min_{Z^{t+1}} \sum_{i=1}^{n} w_1 \|I(\mathbf{x}_i) - J(\mathbf{z}_i^{t}) - \nabla J(\mathbf{z}_i^{t})^T(\mathbf{z}_i^{t+1} - \mathbf{z}_i^{t})\|_2^2 + w_2 \Big\|(\mathbf{z}_i^{t+1} - \mathbf{x}_i) - |N_i|^{-1} \sum_{j \in N_i} (\mathbf{z}_j^{t+1} - \mathbf{x}_j)\Big\|_2^2, \tag{20}$$

where $\nabla J = [\nabla J_x \ \ \nabla J_y]^T$ is the image gradient, with $\nabla J_x$
the image gradient in the $x$ direction and $\nabla J_y$ the image gradient in the $y$
direction. As previously, the minimization can be computed by setting the partial
derivatives to zero, which corresponds to solving a linear system.
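The iteration of Equation 20 can be sketched on a toy problem as follows. Several simplifications are assumptions made for illustration: the target image is an analytic function rather than a pixel array (so no interpolation is needed), the grid is 5x5 with 4-neighborhoods, and each linear system is solved in the least-squares sense with a dense solver instead of a sparse one.

```python
import numpy as np

def J(p):
    """Synthetic target image, an analytic stand-in for pixel lookups."""
    return p[:, 0] ** 2 + p[:, 1] ** 2

def grad_J(p):
    """Gradient of the synthetic image at each position."""
    return 2.0 * p

# 5x5 pixel grid centred at the origin, with 4-neighbourhoods.
side = 5
coords = np.arange(side) - side // 2
X = np.array([[x, y] for y in coords for x in coords], dtype=float)
n = len(X)
neighbors = []
for iy in range(side):
    for ix in range(side):
        cand = [(ix - 1, iy), (ix + 1, iy), (ix, iy - 1), (ix, iy + 1)]
        neighbors.append([b * side + a for a, b in cand
                          if a in range(side) and b in range(side)])

d_true = np.array([0.3, -0.2])
I_vals = J(X + d_true)            # source values I(x_i) for a constant shift

w1, w2 = 1.0, 1.0
Z = X.copy()                      # z_i^0 = x_i
for _ in range(10):
    A = np.zeros((3 * n, 2 * n))
    b = np.zeros(3 * n)
    g = grad_J(Z)
    r = I_vals - J(Z)
    for i in range(n):
        # Data row: grad_J(z_i^t)^T (z_i^{t+1} - z_i^t) = I(x_i) - J(z_i^t)
        A[i, 2 * i:2 * i + 2] = np.sqrt(w1) * g[i]
        b[i] = np.sqrt(w1) * (r[i] + g[i] @ Z[i])
        # Smoothness rows: (z_i - x_i) - mean_j (z_j - x_j) = 0, per coordinate
        for c in range(2):
            row = n + 2 * i + c
            A[row, 2 * i + c] = np.sqrt(w2)
            for j in neighbors[i]:
                A[row, 2 * j + c] -= np.sqrt(w2) / len(neighbors[i])
            x_mean = np.mean([X[j, c] for j in neighbors[i]])
            b[row] = np.sqrt(w2) * (X[i, c] - x_mean)
    Z = np.linalg.lstsq(A, b, rcond=None)[0].reshape(n, 2)

flow = Z - X                      # estimated displacement per pixel
```

Since the true deformation is a constant shift, both energy terms vanish at the solution and the recovered flow should approach `d_true` at every pixel.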

2.3 2D/3D Registration

We show how to combine 2D image registration and 3D geometry registration to best
utilize the data provided by the RGB-D sensor. More specifically, we want to register a
surface $\mathcal{X} \subset \mathbb{R}^3$ with color information $I$, i.e. a texture
mapped surface, to a 3D surface $\mathcal{Y}$ with corresponding color image $J$. As
previously, the source $\mathcal{X}$ is deformed to a surface $\mathcal{Z}$. We sample
the continuous surface $\mathcal{X}$ to obtain a set of points
$X = \{\mathbf{x}_i \in \mathcal{X},\ i = 1 \dots n\}$. We define their corresponding
points on the deformed surface $\mathcal{Z}$ as
$Z = \{\mathbf{z}_i \in \mathcal{Z},\ i = 1 \dots n\}$. The color information of sample
point $\mathbf{x}_i$ is given by $I(\mathbf{x}_i)$.

2.3.1 Matching energy

We formulate the energy measuring the quality of the 2D and 3D alignment as follows:

$$E_{\mathrm{match}}(Z) = \sum_{i=1}^{n} w_1 \|\mathbf{z}_i - P_{\mathcal{Y}}(\mathbf{z}_i)\|_2^2 + w_2 \|I(\mathbf{x}_i) - J(f(\mathbf{z}_i))\|_2^2. \tag{21}$$

The first term is the matching energy presented in Section 2.1. The second term is similar
to the 2D matching energy presented in Section 2.2. The only difference is the additional
function $f : \mathbb{R}^3 \to \mathbb{R}^2$ that projects a 3D point $\mathbf{z}_i$ to
the 2D image $J$. For example, this function could be a perspective projection of the form
$f(\mathbf{z}_i) = \left[\frac{f\, z_{i,x}}{z_{i,z}} \ \ \frac{f\, z_{i,y}}{z_{i,z}}\right]^T$,
where the scalar $f$ on the right-hand side is the focal length.

2.3.2 Optimization

We illustrate 2D/3D registration in the context of a face tracking system that combines
the 2D/3D matching energy with a 3D blendshape prior. A blendshape representation
is a linear model defined as a set of blendshape meshes
$\mathcal{B} = [\mathbf{b}_0, \dots, \mathbf{b}_n]$, where $\mathbf{b}_0$ is the rest
pose and $\mathbf{b}_i$, $i > 0$, are different expressions. A new expression can be
generated as $T = \mathbf{b}_0 + \mathbf{B}\mathbf{d}$, where
$\mathbf{B} = [\mathbf{b}_1 - \mathbf{b}_0, \dots, \mathbf{b}_n - \mathbf{b}_0]$. The
blendshape model shown in Figure 2 is inspired by Ekman's Facial Action Coding System
[12]. Realtime face tracking using an RGB-D device can be formulated as a 2D/3D
registration of the blendshape model to the 2D and 3D data [27].

[Figure 2 image omitted. Panel label: Neutral.]

Figure 2: A blendshape model composed of 48 expressions.

The registration energy can be formulated as

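$$E(Z, \mathbf{d}, \mathbf{R}, \mathbf{t}) = w_1 E_{\mathrm{match\ geometry}} + w_2 E_{\mathrm{match\ color}} + w_3 E_{\mathrm{model+rigid}}, \tag{22}$$

$$\begin{aligned}
E_{\mathrm{match\ geometry}} &= \sum_{i=1}^{n} \|\mathbf{z}_i - P_{\mathcal{Y}}(\mathbf{z}_i)\|_2^2, \\
E_{\mathrm{match\ color}} &= \sum_{i=1}^{n} \|I(\mathbf{x}_i) - J(f(\mathbf{z}_i))\|_2^2, \\
E_{\mathrm{model+rigid}} &= \sum_{i=1}^{n} \|\mathbf{z}_i - (\mathbf{R}(\mathbf{B}_i\mathbf{d} + \mathbf{b}_{0i}) + \mathbf{t})\|_2^2.
\end{aligned}$$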

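To solve this optimization we linearize $J(f(\cdot))$ at the current estimate

$$\sum_{i=1}^{n} \|I(\mathbf{x}_i) - J(f(\mathbf{z}_i^{t+1}))\|_2^2 \approx \sum_{i=1}^{n} \Big\|I(\mathbf{x}_i) - J(f(\mathbf{z}_i^{t})) - \nabla J(f(\mathbf{z}_i^{t}))^T \frac{\partial f(\mathbf{z}_i^{t})}{\partial \mathbf{z}_i} (\mathbf{z}_i^{t+1} - \mathbf{z}_i^{t})\Big\|_2^2. \tag{23}$$

For a perspective projection
$f(\mathbf{z}_i) = \left[\frac{f\, z_{i,x}}{z_{i,z}} \ \ \frac{f\, z_{i,y}}{z_{i,z}}\right]^T$
we have

$$\frac{\partial f(\mathbf{z}_i)}{\partial \mathbf{z}_i} = \begin{bmatrix} \dfrac{f}{z_{i,z}} & 0 & -\dfrac{f\, z_{i,x}}{z_{i,z}^2} \\[2ex] 0 & \dfrac{f}{z_{i,z}} & -\dfrac{f\, z_{i,y}}{z_{i,z}^2} \end{bmatrix}. \tag{24}$$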
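The Jacobian in Equation 24 is easy to get wrong by a sign or an index, so a finite-difference check is a useful habit. The sketch below implements the perspective projection and its Jacobian and compares the latter against central differences. The focal length value and the function names are assumptions for illustration.

```python
import numpy as np

def project(z, focal):
    """Perspective projection f(z) = [focal * z_x / z_z, focal * z_y / z_z]."""
    return np.array([focal * z[0] / z[2], focal * z[1] / z[2]])

def project_jacobian(z, focal):
    """2x3 Jacobian of the projection, as in Eq. 24."""
    x, y, d = z
    return np.array([[focal / d, 0.0, -focal * x / d ** 2],
                     [0.0, focal / d, -focal * y / d ** 2]])

z = np.array([0.2, -0.1, 1.5])
focal = 525.0   # hypothetical focal length in pixels

# Central finite differences as an independent check.
h = 1e-6
numeric = np.zeros((2, 3))
for k in range(3):
    e = np.zeros(3)
    e[k] = h
    numeric[:, k] = (project(z + e, focal) - project(z - e, focal)) / (2 * h)

max_abs_diff = float(np.abs(project_jacobian(z, focal) - numeric).max())
```

In the linearization of Equation 23 this 2x3 matrix is multiplied from the left by the transposed image gradient, giving one row of three coefficients per sample point.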

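In [27], the global rigidity is decoupled, leading to a two-step optimization procedure.
In a first step, a 2D/3D alignment of the blendshape model is computed

$$\begin{aligned}
\arg\min_{Z^{t+1},\, \mathbf{d}^{t+1}} \sum_{i=1}^{n} \ & w_1 \left(\mathbf{n}_i^T(\mathbf{z}_i^{t+1} - P_{\mathcal{Y}}(\mathbf{z}_i^{t}))\right)^2 \\
& + w_2 \Big\|I(\mathbf{x}_i) - J(f(\mathbf{z}_i^{t})) - \nabla J(f(\mathbf{z}_i^{t}))^T \frac{\partial f(\mathbf{z}_i^{t})}{\partial \mathbf{z}_i} (\mathbf{z}_i^{t+1} - \mathbf{z}_i^{t})\Big\|_2^2 \\
& + w_3 \|\mathbf{z}_i^{t+1} - (\mathbf{R}^{t}(\mathbf{B}_i\mathbf{d}^{t+1} + \mathbf{b}_{0i}) + \mathbf{t}^{t})\|_2^2,
\end{aligned} \tag{25}$$

and in a second step, a 3D rigid alignment is performed

$$\arg\min_{\mathbf{R}^{t+1},\, \mathbf{t}^{t+1}} \sum_{i=1}^{n} \|\mathbf{z}_i^{t+1} - (\mathbf{R}^{t+1}(\mathbf{B}_i\mathbf{d}^{t+1} + \mathbf{b}_{0i}) + \mathbf{t}^{t+1})\|_2^2. \tag{26}$$

These two steps are repeated alternately until convergence. The first step can be
computed by solving a linear system. The second step can be solved using [11] or by
linearizing the rotation matrix. For tracking, another 2D matching energy can be added
to the system:

$$E_{\mathrm{match}}(Z^{t+1}) = \sum_{i=1}^{n} \|J_t(f(\mathbf{z}_i^{t})) - J_{t+1}(f(\mathbf{z}_i^{t+1}))\|_2^2. \tag{27}$$

This optical flow energy enforces color consistency over time by measuring the variation
of color from the previous image frame $J_t$ to the current frame $J_{t+1}$ for each
$\mathbf{z}_i$.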
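The rigid alignment step of Equation 26 is a least-squares fit between two point sets with known correspondences, for which closed-form solutions exist. The sketch below uses the well-known SVD-based construction; whether this particular variant is the one a given system uses is an assumption. Here `model` stands in for the posed blendshape points $\mathbf{B}_i\mathbf{d}^{t+1} + \mathbf{b}_{0i}$ and `observed` for the points $\mathbf{z}_i^{t+1}$; both are synthetic.

```python
import numpy as np

def rigid_align(A, B):
    """Least-squares R, t such that R @ a_i + t approximates b_i (SVD-based)."""
    ca, cb = A.mean(axis=0), B.mean(axis=0)
    H = (A - ca).T @ (B - cb)                 # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    sign = np.sign(np.linalg.det(Vt.T @ U.T))
    D = np.diag([1.0, 1.0, sign])             # guards against reflections
    R = Vt.T @ D @ U.T
    t = cb - R @ ca
    return R, t

rng = np.random.default_rng(1)
model = rng.normal(size=(10, 3))

# Random proper rotation for the synthetic ground truth.
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
if np.linalg.det(Q) < 0:
    Q[:, 0] *= -1.0
R_true, t_true = Q, np.array([0.1, -0.2, 0.3])

observed = model @ R_true.T + t_true
R_est, t_est = rigid_align(model, observed)
```

Unlike the linearized rotation of Equation 9, this closed form does not require small angles, at the cost of one 3x3 SVD per alignment.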

3 Robust Registration

In registration, outliers are not only introduced by corrupted sensor measurements, but
also by partial overlaps: many samples on the source simply do not have an ideal
corresponding point on the target shape. To address this problem, various techniques rely
on a set of heuristics to either prune or downweight low-quality correspondences. Typical
criteria include discarding correspondences that are too far from each other, have
dissimilar normals, or involve points on the boundary of the geometry; see [21] for
details. As we will see next, these heuristics are related to the optimization of robust
functions. In this section we consider robust functions as alternatives to the Euclidean
metric and introduce a suitable optimization technique to use them efficiently.

In previous sections, we always considered an energy composed of terms like
$\varphi(\epsilon(\mathbf{p}))$, where $\varphi(\epsilon) = \epsilon^2$ and
$\epsilon(\mathbf{p})$ is the Euclidean norm of the residual vector with parameters
$\mathbf{p}$. This squared Euclidean distance metric is ideal for data corrupted by
Gaussian noise, as it yields the maximum-likelihood solution of the problem
[7, Sec. 7.1.1]. However, it is not robust to outliers, which are common in real-world
data acquired by RGB-D devices.
[Figure 3 plots omitted. Top row, penalty functions: $\frac{1}{2}x^2$; the truncated
quadratic $\frac{1}{2}x^2$ if $|x| \le \tau$ and $\frac{1}{2}\tau^2$ otherwise, for
$\tau \in \{0.32, 0.48, 0.64, 0.80\}$; and $|x|^p$ for $p \in \{0.3, 0.5, 0.7, 0.9\}$.
Bottom row, weight functions: $1$; $1$ if $|x| \le \tau$ and $0$ otherwise; and
$p|x|^{p-2}$.]

Figure 3: (top) The robust norms $\varphi$. (bottom) The associated weight functions $w$.

In registration, robustness can be obtained by exploiting robust functions [17]. In this
framework, $\varphi(\epsilon)$ acts as a "penalty" function, that is, a function
measuring the influence that a certain residual has in the optimization. Given one of
these functions, our robust optimization can be expressed as

$$\arg\min_{\mathbf{p}} \sum_{i=1}^{n} \varphi(\epsilon_i(\mathbf{p})). \tag{28}$$

In Fig. 3 we show a few commonly used penalty functions; note that they all possess
properties like radial monotonicity and symmetry [13]. The optimization problem in
Equation 28 can be solved using Iteratively Re-Weighted Least Squares (IRLS) by solving
a sequence of problems of the form

$$\arg\min_{\mathbf{p}} \sum_{i=1}^{n} \alpha_i\, \epsilon_i(\mathbf{p})^2. \tag{29}$$

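To understand how to compute the weights $\alpha_i$, first notice that the optima of
Eq. 28 can be obtained by setting its gradient to zero. The gradient can be computed by
a simple application of the chain rule (note that we only look at one element of the sum):

$$\frac{\partial \varphi(\epsilon(\mathbf{p}))}{\partial \mathbf{p}} = \psi(\epsilon(\mathbf{p})) \frac{\partial \epsilon(\mathbf{p})}{\partial \mathbf{p}} = w(\epsilon(\mathbf{p}))\, \epsilon(\mathbf{p}) \frac{\partial \epsilon(\mathbf{p})}{\partial \mathbf{p}}, \tag{30}$$

where $\psi(x) = \partial \varphi(x) / \partial x$ for compactness of notation and
$w(x) = \psi(x)/x$ is the so-called weighting function. Interestingly, the gradient of
Eq. 29 is

$$\frac{\partial\, \alpha\, \epsilon(\mathbf{p})^2}{\partial \mathbf{p}} = \alpha\, \epsilon(\mathbf{p}) \frac{\partial \epsilon(\mathbf{p})}{\partial \mathbf{p}}, \tag{31}$$

where the constant factor 2 has been dropped, since it does not change the minimizer.
We can now see that by setting $\alpha = w(\epsilon(\mathbf{p}))$ the two gradients
become equal. However, as the optimal weights
$\alpha_i^* = w(\epsilon_i(\mathbf{p}^*))$ are not available, we use an iterative
approach where at each iteration the weights are computed using the previous iteration:

$$\arg\min_{\mathbf{p}^{t+1}} \sum_{i=1}^{n} w(\epsilon_i(\mathbf{p}^{t}))\, \epsilon_i(\mathbf{p}^{t+1})^2. \tag{32}$$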

This scheme is known as Iteratively Re-Weighted Least Squares (IRLS) and is related to
majorization-minimization. The basic idea of majorization-minimization is to iteratively
minimize a function that is always greater than or equal to the objective function and
shares at least one point with it. If these requirements are fulfilled, the algorithm
converges to a minimum [25].
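The IRLS iteration of Equation 32 can be illustrated on the simplest possible model, a one-dimensional location estimate with residuals $\epsilon_i = |y_i - \mu|$, for which each weighted least-squares problem is solved by a weighted mean. The choice of the penalty $|\epsilon|^p$ with $p = 0.5$, the clamping constant `eps`, and all names are assumptions for illustration.

```python
def lp_weight(residual, p_exp, eps=1e-8):
    """w(x) = psi(x)/x = p |x|^(p-2) for phi(x) = |x|^p, clamped near zero."""
    return p_exp * max(abs(residual), eps) ** (p_exp - 2.0)

def irls_location(samples, p_exp=0.5, iterations=50):
    """Robust location estimate by iteratively re-weighted least squares."""
    mu = sum(samples) / len(samples)          # plain least-squares start
    for _ in range(iterations):
        # Weights come from the previous iterate, as in Eq. 32.
        weights = [lp_weight(y - mu, p_exp) for y in samples]
        # The weighted least-squares problem has the weighted mean as solution.
        mu = sum(w * y for w, y in zip(weights, samples)) / sum(weights)
    return mu

# Six inliers around 1.0 and two gross outliers.
samples = [0.9, 1.0, 1.1, 1.0, 0.95, 1.05, 10.0, 12.0]
mean = sum(samples) / len(samples)
robust = irls_location(samples)
```

The plain mean is dragged to 3.5 by the two outliers, while the re-weighted estimate settles inside the inlier cluster. The clamp on small residuals avoids a division by zero when the estimate coincides with a sample.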

Trimmed Metrics. Discarding unreliable correspondences is undoubtedly the simplest
and most common way of dealing with outliers [21]. This can also be formulated by
Eq. 28, as it corresponds to a weight function like the one in Fig. 3 (bottom-middle),
whose corresponding penalty function is a truncated squared Euclidean norm, Fig. 3
(top-middle). Even though this is trivial to implement, the local support of the weight
function is problematic: if the source surface is too far from the target surface, the
registration process will not proceed, as all the weights would be zero. A possible
solution is to dynamically adapt the threshold value by analyzing the distribution of
residuals. For example, when the ratio of outliers versus inliers is known a priori,
then the threshold can be readily estimated [9].
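A simple way to adapt the threshold when the inlier ratio is known is to take the corresponding quantile of the absolute residuals. The sketch below illustrates this idea; it is a plain quantile rule in the spirit of trimming, not the specific algorithm of [9], and the function names are hypothetical.

```python
import math

def trimming_threshold(residuals, inlier_ratio):
    """Smallest tau such that the given fraction of residuals has |r| <= tau."""
    ordered = sorted(abs(r) for r in residuals)
    keep = max(1, math.ceil(inlier_ratio * len(ordered)))
    return ordered[keep - 1]

def trimmed_weights(residuals, tau):
    """Weight function of the truncated squared norm: 1 inside, 0 outside."""
    return [1.0 if abs(r) <= tau else 0.0 for r in residuals]

# Eight residuals, of which six are assumed to be inliers.
residuals = [0.10, -0.20, 0.15, 5.0, 0.12, -7.0, 0.05, -0.08]
tau = trimming_threshold(residuals, inlier_ratio=0.75)
weights = trimmed_weights(residuals, tau)
```

Because the threshold follows the residual distribution, some correspondences always keep a nonzero weight, even when source and target start far apart.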

Sparse Metrics. The shortcomings of trimmed metrics can be overcome by considering
sparse metrics. The penalty functions for sparse metrics take the form
$\varphi(\epsilon) = |\epsilon|^p$, see Fig. 3 (top-right). An important observation is
that the weight functions of $p$-norms, see Fig. 3 (bottom-right), tend to infinity as
we approach zero, giving a very large reward to inliers. Moreover, contrary to trimmed
metrics, $p$-norms weakly penalize outliers, leading to a more stable approach when
target and source are far apart. This metric has been used successfully in [5].
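For reference, the weight function of the sparse metric follows directly from the definitions of $\psi$ and $w$ given with Eq. 30:

```latex
\varphi(\epsilon) = |\epsilon|^{p}, \qquad
\psi(\epsilon) = \frac{\partial \varphi(\epsilon)}{\partial \epsilon}
             = p\,|\epsilon|^{p-1}\operatorname{sign}(\epsilon), \qquad
w(\epsilon) = \frac{\psi(\epsilon)}{\epsilon} = p\,|\epsilon|^{p-2}.
```

For any $p < 2$ the exponent $p - 2$ is negative, so $w(\epsilon)$ grows without bound as $\epsilon \to 0$ and decays slowly, without ever reaching zero, for large residuals.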

4 Conclusion

In this course, we introduced 2D/3D registration algorithms and showed their applications
for data captured with RGB-D devices, such as the Microsoft Kinect or Asus Xtion Live.
Image and geometry registration algorithms are an essential component of many computer
graphics and computer vision systems. With recent technological advances in RGB-D
sensors, robust algorithms that combine 2D image and 3D geometry registration have
become an active area of research. The goal of this course was to introduce the basics
of 2D/3D registration algorithms and to provide theoretical explanations and practical
tools to design robust computer vision and computer graphics systems based on RGB-D
devices. We have shown that 2D and 3D registration can be expressed and combined in
a common framework. Numerous applications based on RGB-D devices can benefit from
this formulation, which allows different priors to be combined easily. To illustrate
the theory and demonstrate practical relevance, we briefly discussed three applications:
rigid scanning, non-rigid modeling, and realtime face tracking.

References
[1] P. Besl and H. McKay. A method for registration of 3d shapes. PAMI, 1992.

[2] V. Blanz and T. Vetter. A morphable model for the synthesis of 3d faces. Proc. of
ACM SIGGRAPH, 1999.

[3] M. Botsch, M. Pauly, M. Gross, and L. Kobbelt. Primo: coupled prisms for intuitive
surface modeling. SGP, 2006.

[4] S. Bouaziz, M. Deuss, Y. Schwartzburg, T. Weise, and M. Pauly. Shape-up: Shaping
discrete geometry with projections. Comput. Graph. Forum, 2012.

[5] S. Bouaziz, A. Tagliasacchi, and M. Pauly. Sparse iterative closest point. SGP, 2013.

[6] S. Bouaziz, Y. Wang, and M. Pauly. Online modeling for realtime facial animation.
ACM Trans. Graph., 2013.

[7] S. Boyd and L. Vandenberghe. Convex optimization. Cambridge University Press,
2004.

[8] Y. Chen and G. Medioni. Object modeling by registration of multiple range images.
In ICRA, 1991.

[9] D. Chetverikov, D. Svirko, D. Stepanov, and P. Krsek. The trimmed iterative clos-
est point algorithm. In Pattern Recognition, 2002. Proceedings. 16th International
Conference on, volume 3, pages 545–548. IEEE, 2002.

[10] T. Cootes and C. Taylor. Statistical models of appearance for computer vision, 2000.

[11] D. W. Eggert, A. Lorusso, and R. B. Fisher. Estimating 3-d rigid body transformations:
a comparison of four major algorithms. Machine Vision and Applications,
1997.

[12] P. Ekman and W. Friesen. Facial Action Coding System: A Technique for the
Measurement of Facial Movement. Consulting Psychologists Press, 1978.

[13] J. Fox. An R and S-Plus companion to applied regression. Sage, 2002.
http://cran.r-project.org/doc/contrib/Fox-Companion/appendix-robust-regression.pdf.

[14] B. K. P. Horn and B. G. Schunck. Determining optical flow. Artif. Intell., 1981.

[15] H. Li, B. Adams, L. J. Guibas, and M. Pauly. Robust single-view geometry and
motion reconstruction. ACM Trans. Graph., 2009.

[16] B. D. Lucas and T. Kanade. An iterative image registration technique with an
application to stereo vision. IJCAI, 1981.

[17] M. Mirza and K. Boyer. Performance evaluation of a class of m-estimators for
surface parameter estimation in noisy range data. IEEE Transactions on Robotics
and Automation, 9:75–85, 1993.

[18] R. A. Newcombe, S. Izadi, O. Hilliges, D. Molyneaux, D. Kim, A. J. Davison,
P. Kohli, J. Shotton, S. Hodges, and A. Fitzgibbon. Kinectfusion: Real-time dense
surface mapping and tracking. ISMAR, 2011.

[19] I. Oikonomidis, N. Kyriazis, and A. Argyros. Tracking the articulated motion of two
strongly interacting hands. CVPR, 2012.

[20] S. Rusinkiewicz. Derivation of point to plane minimization, 2013.
http://www.cs.princeton.edu/~smr/papers/icpstability.pdf.

[21] S. Rusinkiewicz and M. Levoy. Efficient variants of the icp algorithm. 3DIM, 2001.

[22] O. Sorkine and M. Alexa. As-rigid-as-possible surface modeling. SGP, 2007.

[23] R. W. Sumner, J. Schmid, and M. Pauly. Embedded deformation for shape manip-
ulation. ACM Trans. Graph., 2007.

[24] J. Tong, J. Zhou, L. Liu, Z. Pan, and H. Yan. Scanning 3d full human bodies using
kinects. TVCG, 2012.

[25] P. Verboon. Majorization with iteratively reweighted least squares: A general
approach to optimize a class of resistant loss functions.

[26] X. Wei, P. Zhang, and J. Chai. Accurate realtime full-body motion capture using a
single depth camera. ACM Trans. Graph., 2012.

[27] T. Weise, S. Bouaziz, H. Li, and M. Pauly. Realtime performance-based facial
animation. ACM Trans. Graph., 2011.

[28] T. Weise, B. Leibe, and L. V. Gool. Accurate and robust registration for in-hand
modeling. CVPR, 2008.

[29] T. Weise, T. Wismer, B. Leibe, and L. Van Gool. In-hand scanning with online loop
closure. 3DIM, 2009.

No ratings yet
Surface Profile Guided Scan Method For Autonomous 3D Reconstruction of Unknown Objects Using An Industrial Robot
25 pages
Lecture4-CS294-2022
No ratings yet
Lecture4-CS294-2022
36 pages
Computer Graphics Project Report: Usama Mehmood - 110614650 Stony Brook University
No ratings yet
Computer Graphics Project Report: Usama Mehmood - 110614650 Stony Brook University
6 pages
Directional TSDF
No ratings yet
Directional TSDF
8 pages
Meagher 1982
No ratings yet
Meagher 1982
19 pages
Range Image Segmentation For 3-D Object Recognition
No ratings yet
Range Image Segmentation For 3-D Object Recognition
157 pages
An Unsupervised Learning Model For Deformable Medical Image Registration
No ratings yet
An Unsupervised Learning Model For Deformable Medical Image Registration
9 pages
5. GPU Ray Marching
No ratings yet
5. GPU Ray Marching
23 pages
Pauly 2005 EBS
No ratings yet
Pauly 2005 EBS
10 pages
1251779679-MIT
No ratings yet
1251779679-MIT
117 pages
Real-Time Camera Tracking and 3D Reconstruction Using Signed Distance Functions
No ratings yet
Real-Time Camera Tracking and 3D Reconstruction Using Signed Distance Functions
8 pages
UNIT 4
No ratings yet
UNIT 4
13 pages
A Recognition System For 3D Embossed Digits On Non-Smooth Metallic Surface
No ratings yet
A Recognition System For 3D Embossed Digits On Non-Smooth Metallic Surface
6 pages
Lecture1 PDF
No ratings yet
Lecture1 PDF
95 pages
Park DeepSDF Learning Continuous Signed Distance Functions For Shape Representation CVPR 2019 Paper
No ratings yet
Park DeepSDF Learning Continuous Signed Distance Functions For Shape Representation CVPR 2019 Paper
10 pages
Course Slam
No ratings yet
Course Slam
72 pages
PSDF Fusion: Probabilistic Signed Distance Function For On-The-Fly 3D Data Fusion and Scene Reconstruction
No ratings yet
PSDF Fusion: Probabilistic Signed Distance Function For On-The-Fly 3D Data Fusion and Scene Reconstruction
17 pages
Radiosity Computer Graphics: Advancing Visualization through Radiosity in Computer Vision
From Everand
Radiosity Computer Graphics: Advancing Visualization through Radiosity in Computer Vision
Fouad Sabry
No ratings yet
Two Dimensional Computer Graphics: Exploring the Visual Realm: Two Dimensional Computer Graphics in Computer Vision
From Everand
Two Dimensional Computer Graphics: Exploring the Visual Realm: Two Dimensional Computer Graphics in Computer Vision
Fouad Sabry
No ratings yet
Image Based Modeling and Rendering: Exploring Visual Realism: Techniques in Computer Vision
From Everand
Image Based Modeling and Rendering: Exploring Visual Realism: Techniques in Computer Vision
Fouad Sabry
No ratings yet
Procedural Surface: Exploring Texture Generation and Analysis in Computer Vision
From Everand
Procedural Surface: Exploring Texture Generation and Analysis in Computer Vision
Fouad Sabry
No ratings yet
01-TEMPLATE PARAMETER PLAXIS 2023 On Ramp
No ratings yet
01-TEMPLATE PARAMETER PLAXIS 2023 On Ramp
42 pages
Operator Training Use of Chart Recorders Rev 3
No ratings yet
Operator Training Use of Chart Recorders Rev 3
20 pages
Circular Motion Instructions
No ratings yet
Circular Motion Instructions
5 pages
MUET Physics # 16
No ratings yet
MUET Physics # 16
4 pages
Atomic Emission Spectrometry
No ratings yet
Atomic Emission Spectrometry
21 pages
Electrostatic Assignment
No ratings yet
Electrostatic Assignment
3 pages
DIAC
86% (7)
DIAC
12 pages
Applied Mechanics Civil Booster Updated
No ratings yet
Applied Mechanics Civil Booster Updated
33 pages
Chapter 4 - Span Morphing Concept An Overvie - 2018 - Morphing Wing Technologie
No ratings yet
Chapter 4 - Span Morphing Concept An Overvie - 2018 - Morphing Wing Technologie
20 pages
Applied Mechanics PDF
No ratings yet
Applied Mechanics PDF
18 pages
Test 1
No ratings yet
Test 1
3 pages
Inlet Entropy S Outlet Entropy S'
No ratings yet
Inlet Entropy S Outlet Entropy S'
9 pages
AE3491
No ratings yet
AE3491
2 pages
2-Laplace Transform
No ratings yet
2-Laplace Transform
11 pages
Jfet's & Mosfet's
100% (1)
Jfet's & Mosfet's
62 pages
STATISTICS. DSP
No ratings yet
STATISTICS. DSP
16 pages
Chapter 04 Homework
81% (16)
Chapter 04 Homework
32 pages
UFO Propulsion: The Intelligent and Practical Physics Of..
100% (1)
UFO Propulsion: The Intelligent and Practical Physics Of..
4 pages
Embankment and Base Waqtc AASHTO T 99/T 180 In-Place Density
No ratings yet
Embankment and Base Waqtc AASHTO T 99/T 180 In-Place Density
12 pages
Parametric Design Yoshimura Origami Pattern
No ratings yet
Parametric Design Yoshimura Origami Pattern
14 pages
IFS Academy Career Program in ANSYS MAPDL Workbench
No ratings yet
IFS Academy Career Program in ANSYS MAPDL Workbench
3 pages
Bubble Power
No ratings yet
Bubble Power
14 pages
Gravity and Friction-Lp
No ratings yet
Gravity and Friction-Lp
7 pages
Earth Sciences and Engineering: 4 International Conference On (ICEE-2017)
No ratings yet
Earth Sciences and Engineering: 4 International Conference On (ICEE-2017)
2 pages
Sizinghighly Orderedbuckyball Shaped
No ratings yet
Sizinghighly Orderedbuckyball Shaped
10 pages
(Ebook) Thin film materials: stress, defect formation and surface evolution by L. B. Freund, S. Suresh ISBN 9780511165658, 9780521822817, 051116565X, 0521822815 instant download
100% (1)
(Ebook) Thin film materials: stress, defect formation and surface evolution by L. B. Freund, S. Suresh ISBN 9780511165658, 9780521822817, 051116565X, 0521822815 instant download
47 pages
Class 12-A (2024-2025) Physics Project Allotment List
No ratings yet
Class 12-A (2024-2025) Physics Project Allotment List
1 page



Andrea Tagliasacchi is a post-doctoral scholar in the Computer Graphics and Geome-

Mark Pauly is an associate professor of computer science at EPFL in Lausanne, Switzer-

Sofien and Mark are co-founders of faceshift AG (www.faceshift.com), an EPFL spin-off

In 3D registration we want to align a source surface $X$ embedded in $\mathbb{R}^3$ to a target surface

where $z \in \mathbb{R}^3$ is a point on $Z$. The accuracy of the registration is evaluated by the metric

2.1.2 Prior energy

Global rigidity. The global rigidity of the 3D registration can be measured as

where $P_i$ and $m_i$ are the parts of $P$ and $m$ corresponding to the vertex $z_i$.

$E(Z, R, t) = w_1 E_{\text{match}} + w_2 E_{\text{rigid}}$ (8)

The alignment is computed by solving iteratively

Rigidity as a hard constraint. It is interesting to note that when $w_2 = +\infty$ then $z_i$

[Figure panels: Accumulated Scans, 3D Mesh]

Figure 1: Registration of a morphable model towards the scanned face.

Non-rigid registration. Registering a shape template towards a scanned 3D object

$E(Z, d, \{R_i\}_{i=1}^{n}, R, t) = w_1 E_{\text{match}} + w_2 E_{\text{rigid}} + w_3 E_{\text{model}} + w_4 E_{\text{arap}}$ (13)

which corresponds to solving a linear system.

In 2D registration we want to register a source image I to a target image J. During the

2.2.1 Matching energy

2.2.2 Prior energy

Lucas-Kanade. In the Lucas-Kanade algorithm [16] the deformation is assumed to be

where $N_i$ is the set of indices of the neighbors of $x_i$.

$E(Z) = w_1 E_{\text{match}} + w_2 E_{\text{HK}}$ (19)

2.3 2D/3D Registration

We show how to combine 2D image registration and 3D geometry registration to best

2.3.1 Matching energy

Figure 2: A blendshape model composed of 48 expressions.

an RGB-D device can be formulated as a 2D/3D registration of the blendshape model to

$E(Z, d, R, t) = w_1 E_{\text{match geometry}} + w_2 E_{\text{match color}} + w_3 E_{\text{model+rigid}}$ (22)

in a second step, a 3D rigid alignment is performed

In previous sections, we always considered an energy composed by terms like ϕ((p)),


In registration, robustness can be obtained by exploiting robust functions [17]. In this

Trimmed Metrics. Discarding unreliable correspondences is undoubtedly the simplest

Sparse Metrics. The shortcomings of trimmed metrics can be overcome by considering

[4] S. Bouaziz, M. Deuss, Y. Schwartzburg, T. Weise, and M. Pauly. Shape-up: Shaping

[7] S. Boyd and L. Vandenberghe. Convex optimization. Cambridge University Press,

[16] B. D. Lucas and T. Kanade. An iterative image registration technique with an

[17] M. Mirza and K. Boyer. Performance evaluation of a class of m-estimators for

[20] S. Rusinkiewicz. Derivation of point to plane minimization, 2013. http://www.cs.

[22] O. Sorkine and M. Alexa. As-rigid-as-possible surface modeling. SGP, 2007.

[27] T. Weise, S. Bouaziz, H. Li, and M. Pauly. Realtime performance-based facial
