Expressive Expression Mapping With Ratio Images

The document presents a technique called expression ratio images (ERI) to capture illumination changes from facial expressions. An ERI extracted from one person can be applied to another face using geometric warping to generate more expressive expressions by including subtle details like wrinkles and creases.

Uploaded by Zhengyou Zhang

See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/220720997

Expressive expression mapping with ratio images

Conference Paper · January 2001
DOI: 10.1145/383259.383289 · Source: DBLP

CITATIONS: 225 · READS: 230

3 authors:

Zicheng Liu (Macau University of Science and Technology): 145 publications, 6,702 citations
Ying Shan (Chinese Academy of Sciences): 102 publications, 4,542 citations
Zhengyou Zhang (Microsoft): 371 publications, 36,243 citations

Related projects: ViiBoard; Human body modeling

All content following this page was uploaded by Zhengyou Zhang on 04 January 2016.



Expressive Expression Mapping with Ratio Images
Zicheng Liu Ying Shan Zhengyou Zhang

Microsoft Research∗

Abstract

Facial expressions exhibit not only facial feature motions, but also subtle changes in illumination and appearance (e.g., facial creases and wrinkles). These details are important visual cues, but they are difficult to synthesize. Traditional expression mapping techniques consider feature motions while the details in illumination changes are ignored. In this paper, we present a novel technique for facial expression mapping. We capture the illumination change of one person's expression in what we call an expression ratio image (ERI). Together with geometric warping, we map an ERI to any other person's face image to generate more expressive facial expressions.

Keywords: Facial animation, Morphing, Animation

1 Introduction

Facial expressions exhibit not only facial feature motions, but also subtle changes in illumination and appearance (e.g., facial creases and wrinkles). These details are important visual cues, but they are difficult to synthesize.

One class of methods to generate facial expressions with details is the morph-based approaches and their extensions [2, 14, 16, 3]. The main limitation is that these approaches can only generate expressions in between the given expressions through interpolation. If we only have someone's neutral face, we cannot generate this person's facial expressions using morph-based methods.

Another popular class of techniques, known as expression mapping (performance-driven animation) [4, 10, 20, 13], does not have this limitation. It can be used to animate 2D drawings and images, as well as textured or non-textured 3D face models. The method is very simple. Given an image of a person's neutral face and another image of the same person's face with an expression, the positions of the face features (eyes, eyebrows, mouth, etc.) on both images are located either manually or through some automatic method. The difference vector is then added to a new face's feature positions to generate the new expression for that face through geometry-controlled image warping [21, 2, 10]. One problem with such a geometric-warping-based approach is that it only captures the face features' geometry changes, completely ignoring illumination changes. The resulting expressions do not have expression details such as wrinkles. These details are actually very important visual cues, and without them the expressions are less expressive and convincing.

Figure 1: Expression mapping with expression details. Left: neutral face. Middle: result from geometric warping. Right: result from our method.

As an example, Figure 1 shows a comparison of an expression with and without the expression details. The left image is the original neutral face. The one in the middle is the expression generated using the traditional expression mapping method. The image on the right is the expression generated using the method to be described in this paper. The feature locations on the right image are exactly the same as those on the middle image, but because there are expression details, the right image looks much more convincing than the middle one.

In this paper, we present a novel technique to capture the illumination change of one person's expression and map it to any different person to generate more expressive facial expressions. The critical observation is that the illumination change resulting from the change of surface normal can be extracted in a skin-color-independent manner by using what we call an expression ratio image (ERI). This ERI can then be applied to any different person to generate correct illumination changes resulting from the geometric deformation of that person's face.

∗ [email protected], [email protected], [email protected]

The remainder of this paper is organized as follows. We describe related work in the next section. We then introduce the notion of the expression ratio image (ERI) in Section 3. In Section 4, we describe techniques to filter an ERI to remove the noise caused by pixel misalignment. Some experimental results are shown in Section 5. We conclude the paper with a discussion of the limitations of our approach and a proposal for future research directions.
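The difference-vector transfer at the heart of traditional expression mapping can be sketched in a few lines of NumPy. This is only an illustrative sketch, not code from the paper: the function name and the (n, 2) landmark format are assumptions, and a real system would follow this step with geometry-controlled image warping [21, 2, 10].

```python
import numpy as np

def map_expression_features(feat_a_neutral, feat_a_expr, feat_b_neutral):
    """Geometric expression mapping: add A's feature-motion difference
    vectors to B's neutral feature positions. All arguments are (n, 2)
    landmark arrays in corresponding order (an assumption of this sketch)."""
    feat_a_neutral = np.asarray(feat_a_neutral, dtype=float)
    feat_a_expr = np.asarray(feat_a_expr, dtype=float)
    feat_b_neutral = np.asarray(feat_b_neutral, dtype=float)
    # Each of A's feature displacements is applied to B's corresponding
    # feature; warping B's image to these new positions would follow.
    return feat_b_neutral + (feat_a_expr - feat_a_neutral)

# Toy example: A raises an eyebrow landmark by 3 pixels, so B's
# corresponding landmark moves up by the same 3 pixels.
b_expr = map_expression_features([[100.0, 80.0]], [[100.0, 77.0]],
                                 [[110.0, 90.0]])
print(b_expr)  # landmark moves to (110, 87)
```

As the introduction notes, this transfer moves feature positions only; it carries no information about the illumination changes that the ERI is designed to capture.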
2 Related work

Besides the work mentioned in the introduction, there has been a lot of other work on facial expression synthesis. For an excellent overview, see [13]. The physically-based approach is an alternative to expression mapping. Badler and Platt [1] used a mass-and-spring model to simulate the skin and muscles. Extensions and improvements to this technique have been reported in [19, 18, 9].

Marschner et al. [11] used the color ratio between rendered image pairs under old and new lighting conditions to modify photographs taken under the old lighting condition into photographs under the new lighting condition. In a similar spirit, Debevec [5] used the color difference (instead of the ratio) between synthesized image pairs, with and without a synthetic object added, to modify the original photograph.

Given a face under two different lighting conditions and another face under the first lighting condition, Riklin-Raviv and Shashua [15] used a color ratio (called the quotient image) to generate an image of the second face under the second lighting condition. Stoschek [17] combined this technique with image morphing to re-render a face under continuous changes of pose or lighting direction.

The work reported in this paper handles geometric deformations under constant lighting, rather than changing lighting with fixed geometry as in the previous works. As far as we know, this is the first technique capable of mapping one person's facial expression details to a different person's face.

3 Expression ratio image

For any point p on a surface Π, let n denote its normal. Assume there are m point light sources. Let l_i, 1 ≤ i ≤ m, denote the light direction from p to the i-th light source, and I_i its intensity. Suppose the surface is diffuse, and let ρ be its reflectance coefficient at p. Under the Lambertian model, the intensity at p is

    I = \rho \sum_{i=1}^{m} I_i (n \cdot l_i)    (1)

After the surface is deformed, the intensity at p becomes

    I' = \rho \sum_{i=1}^{m} I_i (n' \cdot l_i')    (2)

where n' is the normal at p after deformation, and l_i' is the light direction after deformation. From Equations (1) and (2), we have

    \frac{I'}{I} = \frac{\sum_{i=1}^{m} I_i (n' \cdot l_i')}{\sum_{i=1}^{m} I_i (n \cdot l_i)}    (3)

We denote

    \varepsilon \equiv \frac{\sum_{i=1}^{m} I_i (n' \cdot l_i')}{\sum_{i=1}^{m} I_i (n \cdot l_i)}.    (4)

ε, which is a real function defined over Π, is called the expression ratio image (ERI) of Π. From (3), we have

    I' = \varepsilon I    (5)

for each point on the surface Π.

Notice that ε is independent of the reflectance coefficients: Equation (5) holds for any reflectance function of the surface Π. So for any unknown reflectance function, if we know the illumination before deformation, we can obtain the illumination after deformation by simply multiplying the illumination before deformation by the expression ratio image ε.

Let us now consider how to map one person's expression to another. Given two people's faces A and B, assume that for every point on A there is a corresponding point on B with the same meaning (eye corners, mouth corners, nose tip, etc.). By applying Equation (3) to A and B at each point, we have

    \frac{I_a'}{I_a} = \frac{\sum_{i=1}^{m} I_i (n_a' \cdot l_{ia}')}{\sum_{i=1}^{m} I_i (n_a \cdot l_{ia})}    (6)

and

    \frac{I_b'}{I_b} = \frac{\sum_{i=1}^{m} I_i (n_b' \cdot l_{ib}')}{\sum_{i=1}^{m} I_i (n_b \cdot l_{ib})}    (7)

Since human faces have approximately the same geometrical shape, if the two faces are in the same pose, their surface normals at corresponding positions are roughly the same, that is, n_a ≈ n_b and n_a' ≈ n_b', and the lighting direction vectors are also roughly the same, that is, l_{ia} ≈ l_{ib} and l_{ia}' ≈ l_{ib}'. Under this assumption, we have

    \frac{I_a'}{I_a} \approx \frac{I_b'}{I_b}    (8)

Of course, if A and B have exactly the same shape, the above equation is exact, not approximate. The approximation error increases with the shape difference between the two faces and with the pose difference when the images are taken.

Let A and A' denote the images of A's neutral face and expression face, respectively. Let B denote the image of person B's neutral face, and B' the unknown image of his/her face with the same expression as A'. Furthermore, if we assume these images have been aligned, then by (8) we have

    \frac{B'(u,v)}{B(u,v)} = \frac{A'(u,v)}{A(u,v)}    (9)

where (u, v) are the coordinates of a pixel in the images. Therefore, we have

    B'(u,v) = \frac{A'(u,v)}{A(u,v)} B(u,v)    (10)

More realistically, the images are usually taken in different poses, possibly with different cameras, and so are usually not aligned. In order to apply the above equations, we have to align them first. In summary, given images Ã, Ã', B̃ which have not been aligned, we have the following algorithm for facial expression mapping which captures illumination changes.

Step 1. Find the face features of Ã, Ã' and B̃ (either manually or using some automatic method).

Step 2. Compute the difference vector between the feature positions of Ã' and Ã. Move the features of B̃ along the difference vector, and warp the image accordingly. Let B_g be the warped image. This is the traditional expression mapping based on geometric warping.

Step 3. Align à and Ã' with B_g through image warping, and denote the warped images by A and A'.

Step 4. Compute the ratio image \varepsilon(u, v) = A'(u, v) / A(u, v).

Step 5. Set B'(u, v) = \varepsilon(u, v) B_g(u, v) for every pixel.

This algorithm requires three warping operations for each input image B̃. When applying the same expression to many people, we can save computation by pre-computing the ratio image with respect to A or A'. During expression mapping for a given image B̃, we first warp that ratio image to B_g and then multiply the warped ratio image with B_g. In this way, we only perform warping twice, instead of three times, for every input image.

3.1 Colored lights

In the above discussion, we only considered monochromatic lights. For colored lights, we apply exactly the same equations to each of the R, G and B components of the color images, and compute one ERI per component. During expression mapping, each ERI is independently applied to the corresponding color component of the input image.

If all the light sources have the same color and differ only in magnitude, it is not difficult to see that the three ERIs are equal to each other. To save computation, we only need to compute one ERI in this case.

3.2 Comments on different lighting conditions

If the lighting conditions for A and B are different, Equation (10) does not necessarily hold. If, however, there is only an intensity scaling while the light directions are the same, the equation is still correct. This probably explains why our method works reasonably well for some of the images taken under different lighting environments.

In other cases, we found that performing color histogram matching [8] before expression mapping helps reduce some of the artifacts due to different lighting conditions. In addition, we found that the result is noisy if we directly apply the three color ERIs. A better solution, which we use, is to first convert the RGB images into the YUV space [7], compute the ratio image only for the Y component, map it to the Y component of the input image, and finally convert the resulting image back into the RGB space.

An even better solution would be to use sophisticated relighting techniques such as those reported in [6, 12, 15].

4 Filtering

Since image alignment is based on image warping controlled by a coarse set of feature points, misalignment between A and A' is unavoidable, resulting in a noisy expression ratio image. So we need to filter the ERI to clean up the noise while not smoothing out the wrinkles. The idea is to use an adaptive smoothing filter with little smoothing in expressional areas and strong smoothing in the remaining areas.

Since A and A' have been roughly aligned, their intensities in the non-expressional areas should be very close, i.e., the correlation is high, while their intensities in the expressional areas are very different. So for each pixel, we compute a normalized cross correlation c between A and A', and use 1 - c as its weight.

After the weight map is computed, we run an adaptive Gaussian filter on the ERI. For pixels with a large weight, we use a small-window Gaussian filter so that we do not smooth out the expressional details. For pixels with a small weight, we use a large window to smooth out the noise in the ERI. We could discretize the weights into many levels and assign a different window size to each level, but in practice we found that two levels are enough.

5 Results

In this section, we show some test results. For each image, we manually put mark points on the face features. Figure 2 shows an example of the mark points. Currently only the points are used as features for image warping, while the line segments in the figure are for display purposes only. We use the texture mapping hardware to warp an image from one set of markers to another by simply applying Delaunay triangulation to the mark points. This method is fast, but the resulting image quality is not as good as with more advanced image warping techniques [2, 10].

Figure 2: The mark points.

For the first example, we map the thinking expression of the middle image in Figure 3 to a different person's neutral face, which is the left image of Figure 4. The middle image of Figure 4 is the result from the traditional geometric warping. The right image is the result of our method. We can see that the wrinkles due to the skin deformation between the eyebrows are nicely captured by our method. As a result, the generated expression is more expressive and convincing than the middle one obtained with geometric warping.

Figure 3: An expression used to map to other people's faces. The image on the right is its expression ratio image. The ratios of the RGB components are converted to colors for display purposes.

Figure 4: Mapping a thinking expression. Left: neutral face. Middle: result from geometric warping. Right: result from ERI.

Figure 1 in Section 1 shows an example of mapping the expression displayed in Figure 5(a). The left image is her neutral face. Our result (the right image of Figure 1) contains the visual effects of skin deformations around the nose and mouth region. It is clearly a more convincing smile than the middle image, which is generated by geometric warping.

Figure 6: Mapping of a sad expression. Left: neutral face. Middle: result from geometric warping. Right: result from ERI.

Figure 7: Mapping of a raising-eyebrow expression. Left: neutral face. Middle: result from geometric warping. Right: result from ERI.
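The core computation (Steps 4 and 5 of the algorithm in Section 3, combined with the two-level adaptive filtering of Section 4) can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions, not the paper's implementation: the helper names, window sizes, and the 0.5 weight threshold are invented here, and a simple box filter stands in for the paper's small-window/large-window Gaussian filter.

```python
import numpy as np

def box_blur(img, size):
    """Mean filter with an odd window `size` (a simple stand-in for a
    Gaussian filter of comparable support)."""
    pad = size // 2
    p = np.pad(img.astype(float), pad, mode="edge")
    h, w = img.shape
    out = np.zeros((h, w))
    for dy in range(size):
        for dx in range(size):
            out += p[dy:dy + h, dx:dx + w]
    return out / (size * size)

def local_ncc(a, b, size=7):
    """Per-pixel normalized cross correlation over a local window."""
    mu_a, mu_b = box_blur(a, size), box_blur(b, size)
    var_a = box_blur(a * a, size) - mu_a ** 2
    var_b = box_blur(b * b, size) - mu_b ** 2
    cov = box_blur(a * b, size) - mu_a * mu_b
    return cov / np.sqrt(np.maximum(var_a * var_b, 1e-8))

def expression_ratio_map(A, A_expr, Bg, thresh=0.5, small=3, large=9):
    """Steps 4-5 plus two-level adaptive filtering, for single-channel
    images already aligned to Bg. Window sizes and threshold are
    illustrative assumptions, not values from the paper."""
    A, A_expr = A.astype(float), A_expr.astype(float)
    eri = A_expr / np.maximum(A, 1.0)        # Step 4: ratio image
    weight = 1.0 - local_ncc(A, A_expr)      # high where expression changed
    # Little smoothing where the expression lives, strong elsewhere.
    eri = np.where(weight > thresh,
                   box_blur(eri, small), box_blur(eri, large))
    return np.clip(eri * Bg.astype(float), 0.0, 255.0)   # Step 5

rng = np.random.default_rng(0)
A = rng.uniform(50.0, 200.0, (64, 64))
out = expression_ratio_map(A, 1.2 * A, A)    # uniform 1.2x brightening
```

Per Section 3.2, a color input would be handled by converting to YUV and applying this computation to the Y channel only before converting back to RGB.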

Figure 5: Expressions used to map to other people's faces, (a)-(f).

Figure 6 shows the result of mapping the sad expression in Figure 5(c). The right image in Figure 6 is the result generated using our method, while the result from geometric warping is shown in the middle. The right image clearly shows a sad/bitter expression, but we can hardly see any sadness in the middle image.

Figure 7 shows the result of mapping the raising-eyebrow expression in Figure 5(d). We can see that the wrinkles on the forehead are mapped nicely to the female's face.

Figure 8 shows the result of mapping the frown expression in Figure 5(e) to an already smiling face (the left image in Figure 8). Because the frown expression is in a separate face region from the existing smile expression, the mapping works quite well, and the resulting expression is basically the sum of the two different expressions.

Figure 8: Mapping a frown expression to a smile expression. Because the two expressions are in separate face regions, the mapping is almost equivalent to the sum of the two expressions. Left: the existing expression. Middle: result from geometric warping. Right: result from ERI.

Figure 9 shows an example of mapping expressions under different lighting conditions. The thinking expression in Figure 5(f) is mapped to the neutral face in Figure 9. These two images were taken in different lighting environments. Again, the image on the right is the result using our method. We can see that the wrinkles between and above the two eyebrows are mapped quite well to the target face. The resulting expression clearly exhibits the visual effects of eyebrow crunching.

Figure 9: Expression mapping with different lighting conditions. Left: neutral face. Middle: result from geometric warping. Right: result from ERI.

In Figures 10, 11, and 12, we show the results of mapping the smile expression in Figure 5(b) to different faces. Figure 10 shows this smile expression mapped to a male's face. The left image is the neutral face. The middle image is generated using geometric warping, and we can see that the mouth-stretching does not look natural. The image on the right is generated using our method. The illumination changes on his two cheekbones and the wrinkles around his mouth create the visual effect of skin bulging, and the result is a more natural and convincing smile.

Figure 10: Mapping a smile. Left: neutral face. Middle: result from geometric warping. Right: result from ERI.

Figure 11 shows the result of mapping the same smile expression in Figure 5(b) to Mona Lisa. The left image in Figure 11 is the image generated by Seitz and Dyer [16] using their view morphing technique (we scanned the picture from their paper). The image on the right is the result generated using our method. The wrinkles around her two mouth corners make her smile look more natural and convincing than the one in the middle, which is generated using geometric warping.

Figure 11: Mapping a smile to Mona Lisa's face. Left: "neutral" face. Middle: result from geometric warping. Right: result from ERI.

Figure 12 shows the results of mapping the smile expression of Figure 5(b) to two statues. The images of both statues were downloaded from the web. The wrinkle around her left mouth corner and the illumination changes on the left cheek are mapped nicely to both statues. The more subtle wrinkle around her right mouth corner is mapped to (b) as well. However, it does not get mapped to (a) because of the shadow on this statue's right face.

Figure 12: Mapping expressions to statues. (a) Left: original statue. (a) Right: result from ERI. (b) Left: another statue. (b) Right: result from ERI.

When the poses of the two faces are different, the mapping may fail. Figure 13 shows such an example. (a), (b) and (c) are the same neutral face with different poses. (e) and (f) are the results of mapping expression (d) to (b) and (c), respectively. Notice that the original expression in (d) has a dimple on his right face. Because the pose of (b) is different from (a), the dimple in (e) is not as clear as in the original expression. The difference between the poses of (c) and (a) is even larger, and the dimple does not get mapped at all.

Figure 13: Expression mapping may fail when the poses are too far apart. (a), (b), and (c): neutral faces with different poses. (e): result of mapping (d) to (b). (f): result of mapping (d) to (c).

6 Conclusions

We have shown that the expression ratio image is an effective technique to enhance facial expression mapping with illumination changes. An expression ratio image can capture
subtle but visually important details of facial expressions. The resulting facial expressions are significantly more expressive and convincing than those from the traditional expression mapping based on geometric warping.

The proposed approach can be extended to facial expression mapping for 3D textured face models. In this case, we only need to apply ERIs to the texture image, and we can obtain more expressive facial expressions for 3D facial animation.

This technique also applies to shaded 3D face models. One could map expressions between synthetic face models, as well as between real face expressions and synthetic face models.

7 Limitations and future directions

One limitation of this method is in dealing with different lighting conditions. Even though we have had some success applying ERIs to expressions under different lighting environments with the help of histogram matching, a more general solution is to use advanced relighting techniques such as [6, 12, 15].

Currently the image marks are very sparse. It is desirable to add line and curve features for better facial feature correspondences. We are planning to implement better image warping techniques such as those reported in [2, 10]. Better image warping should reduce the artifacts of the triangulation-based warping method that we currently use. We also hope that better image warping techniques, together with line and curve features, will improve pixel correspondences. High-quality pixel correspondences could reduce the need for ratio-image filtering, thus allowing more expression details to be captured in an ERI.

Acknowledgements

We would like to thank Ko Nishino for helpful discussions and much help with image acquisition. We would like to thank Conal Elliott for carefully reading our manuscripts and providing valuable comments. We would like to thank Alex Colburn, Steve Harris, Chuck Jacobs, Brian Meyer, Sing Bing Kang, Sashi Raghupathy, and Emiko Unno for their help with image acquisition. We would like to thank Michael Cohen for his support.

References

[1] N. Badler and S. Platt. Animating facial expressions. In Computer Graphics, pages 245-252. Siggraph, August 1981.
[2] T. Beier and S. Neely. Feature-based image metamorphosis. In Computer Graphics, pages 35-42. Siggraph, July 1992.
[3] C. Bregler, M. Covell, and M. Slaney. Video rewrite: Driving visual speech with audio. In Computer Graphics, pages 353-360. Siggraph, August 1997.
[4] S. E. Brennan. Caricature Generator. M.S. Visual Studies, Dept. of Architecture, Massachusetts Institute of Technology, Cambridge, MA, Sept. 1982.
[5] P. E. Debevec. Rendering synthetic objects into real scenes: Bridging traditional and image-based graphics with global illumination and high dynamic range photography. In Computer Graphics, Annual Conference Series, pages 189-198. Siggraph, July 1998.
[6] P. E. Debevec, T. Hawkins, C. Tchou, H.-P. Duiker, W. Sarokin, and M. Sagar. Acquiring the reflectance field of a human face. In Computer Graphics, Annual Conference Series, pages 145-156. Siggraph, July 2000.
[7] J. Foley, A. van Dam, S. Feiner, and J. Hughes. Computer Graphics: Principles and Practice. Addison-Wesley, 1992.
[8] R. Gonzalez and R. Woods. Digital Image Processing. Addison-Wesley, 1992.
[9] Y. Lee, D. Terzopoulos, and K. Waters. Realistic modeling for facial animation. In Computer Graphics, pages 55-62. Siggraph, August 1995.
[10] P. Litwinowicz and L. Williams. Animating images with drawings. In Computer Graphics, pages 235-242. Siggraph, August 1990.
[11] S. R. Marschner and D. P. Greenberg. Inverse lighting for photography. In IS&T/SID Fifth Color Imaging Conference, November 1997.
[12] S. R. Marschner, B. Guenter, and S. Raghupathy. Modeling and rendering for realistic facial animation. In Rendering Techniques, pages 231-242. Springer Wien New York, 2000.
[13] F. I. Parke and K. Waters. Computer Facial Animation. A K Peters, Wellesley, Massachusetts, 1996.
[14] F. Pighin, J. Hecker, D. Lischinski, R. Szeliski, and D. H. Salesin. Synthesizing realistic facial expressions from photographs. In Computer Graphics, Annual Conference Series, pages 75-84. Siggraph, July 1998.
[15] T. Riklin-Raviv and A. Shashua. The quotient image: Class based re-rendering and recognition with varying illuminations. In IEEE Conference on Computer Vision and Pattern Recognition, pages 566-571, June 1999.
[16] S. M. Seitz and C. R. Dyer. View morphing. In Computer Graphics, pages 21-30. Siggraph, August 1996.
[17] A. Stoschek. Image-based re-rendering of faces for continuous pose and illumination directions. In IEEE Conference on Computer Vision and Pattern Recognition, pages 582-587, 2000.
[18] D. Terzopoulos and K. Waters. Physically-based facial modeling and animation. Journal of Visualization and Computer Animation, 1(4):73-80, March 1990.
[19] K. Waters. A muscle model for animating three-dimensional facial expression. Computer Graphics, 21(4):17-24, 1987.
[20] L. Williams. Performance-driven facial animation. In Computer Graphics, pages 235-242. Siggraph, August 1990.
[21] G. Wolberg. Digital Image Warping. IEEE Computer Society Press, 1990.
