Arcimboldo-like Collage Using Internet Images

Hua Huang

Lei Zhang Hong-Chao Zhang


Xi'an Jiaotong University, China
Figure 1: From an input image (left, Transformers © 2011 Hasbro), our algorithm creates a fascinating Arcimboldo-like collage (right) by using a collection of Internet images (middle). The Internet images are retrieved with the keywords "musical instrument".
Abstract
Collage is a composite artwork made by assembling different material forms. In this work, we present a novel approach for creating a fantastic collage art form, namely the Arcimboldo-like collage, which represents an input image with multiple thematically related cutouts from filtered Internet images. Because of the massive volume of Internet images, competent image cutouts can almost always be discovered to match the segmented components of the input image. The selected cutouts are purposefully arranged such that, as a whole assembly, they represent the input image with disguise in both shape and color, while each individual cutout remains recognizable in its own right. Experimental results and a user study show that our algorithm can effectively produce entertaining Arcimboldo-like collages.
CR Categories: I.3.3 [Computer Graphics]: Picture/Image Generation: Display algorithms; I.3.6 [Computer Graphics]: Methodology and Techniques: Interaction techniques; I.4.6 [Image Processing and Computer Vision]: Segmentation: Pixel classification
Keywords: Arcimboldo collage, internet image, segmentation,
layer ordering
1 Introduction
"Our body is the overcoat of the soul inside" (Bhagwad Gita). Psychologists and artists have long discussed essence and its physical representation. An essence can remain recognizable even under marked changes in appearance, while conveying more information in a compelling way (see Figure 1). This principle is artistically realized in the paintings of Giuseppe Arcimboldo, an Italian painter of the Renaissance who is honored as a vanguard of Surrealism. In his paintings, portrait heads are represented by a variety of symbolic elements such as fruits and books (see Figure 2). Although this sort of hodgepodge image looks rather eccentric, it provides a fascinating form for displaying content in an extravagant yet lucid style [Maiorino 1983], which remains fashionable today. This paper focuses on the creation of collages in this Arcimboldo-like style.
Since their origin in China around 200 BC, collages have gained growing popularity as an artistic and collective expression of photo assemblage [Wikipedia 2011]. However, manually creating a collage is labor intensive and time consuming: it requires delicate cutouts as materials and careful mutual adjustment during assembly. Thus, many approaches work on automating collage construction [Rother et al. 2006; Wang et al. 2006; Goferman et al. 2010]. These approaches often produce commendable collages, but confine themselves to a regular canvas, which precludes the Arcimboldo-like effect shown in Figure 2. In [Gal et al. 2007], 3D models instead of image cutouts are assembled to approximate a 3D shape. Although it exhibits the Arcimboldo-like style, such a collage needs a 3D model as the template, which prevents the stylization of 2D images that have no 3D representation.
The visual mechanism by which we recognize an Arcimboldo-like collage is the so-called apophenia, in which perceived meaningfulness arises from specific connections within a coherent representation [Maiorino 1983]. Hence, there are two challenges in creating effective Arcimboldo-like collages: selecting competent image cutouts and assembling them consistently for resemblance. The Internet provides large, easy-to-access image databases such as Flickr or Picasa Web Albums, where a wide variety of image cutouts can be discovered. In this paper, we exploit appropriate Internet image cutouts as the elements for producing Arcimboldo-like collages.

Figure 2: Expressive Arcimboldo-like collages: Summer (1563,
left) and Librarian (1566, middle) painted by Giuseppe Arcim-
boldo; The Pirate (right) from a textbook.
The main contribution of our paper is an Internet-image-based algorithm for creating Arcimboldo-like collages from input images. As the first attempt to create such 2D collages with Internet images, our algorithm is effective in producing plausible Arcimboldo-like expressions of the input images.
2 Related Work
Much research has gone into constructing collages that turn a photo album into an informative assembled photo. The cutout region in each photo can be interactively specified [Agarwala et al. 2004] or automatically detected [Rother et al. 2006; Wang et al. 2006], and the regions are subsequently stitched together with seamless transitions. These approaches employ rectangular cutouts as primitives, while in [Goferman et al. 2010] cutouts of arbitrary shapes are used to create collages with a puzzle-like style. However, all of the resulting collages above are assembled in a rectangle to form a new photo. ShapeCollage [Cheung 2011] can arrange photos into an arbitrary shape, but requires regular cutouts as primitives. Our approach assembles cutouts of arbitrary shapes into a target image of arbitrary shape.
Image mosaic is another stylization similar to collage, which ar-
ranges a set of icon-sized tile images in a container image. The
tiles are regular squares [Hausner 2001] or arbitrarily-shaped im-
ages [Kim and Pellacini 2002], which can be seen as tiny ele-
ments. However, such stylized images mainly rest on the tile ori-
entation along feature edges to represent the target shape, ignoring
the shape consistency between individual tiles and their occupied
regions. Hence, image mosaic usually needs a large number of tile
images to fabricate the desirable representation. Our approach pur-
sues Arcimboldo-like effect by using consistent cutouts to represent
the meaningful components of the target image.
Gal et al. [2007] present a method that produces collages in a style like ours, but using 3D models from a database. Their method needs a 3D shape as the template, so it is not suitable for the many 2D images that have no 3D representation. Besides, 3D databases hold far less data than Internet image databases, which reduces the potential range of artistic expression. Also using 3D models, Mitra et al. [2009] design a system that assembles texture splats instead of 3D geometry to produce so-called emerging images. Emerging images are perceived as a whole, like Arcimboldo-like collages, but do not locally present any meaningful components. Chu et al. [2010] immerse and hide foreground images in a background to produce camouflage images. Although their visual style is similar to that of collage, camouflage images are less concerned with shape and color consistency between the inset foreground and the background.
Internet-image-based approaches have been gaining ground as the method of choice for a series of classical image processing problems, such as image-based 3D touring [Snavely et al. 2006] and image completion [Hays and Efros 2007]. Chen et al. [2009] make realistic photos by blending cutouts from appropriate Internet images. They propose a set of filters to retrieve desirable online images, as well as cutouts that serve as scene items for photo synthesis. In this paper, we present a novel use of Internet images: creating the art form of Arcimboldo-like collages.
3 Competent Cutouts Selection
The essence of an Arcimboldo-like collage is a metaphor: a composite representation of the input image. The cutouts should therefore be diversified yet consistent in form, so that both the cutouts themselves and the collage remain recognizable after assembly.
We use Internet images to enable diversity of cutouts. Since the space of Internet images is effectively infinite, searching all online images to retrieve candidate cutouts is impossible. Additionally, an assembly of cutouts belonging to closely related themes is more recognizable than one drawn from contrasting themes [Maiorino 1983], so we would like to select cutouts with related themes. We use text-based image search and let the user input descriptive keywords to collect relevant Internet images belonging to the theme. Typically, we suggest collective or material nouns such as "fruit" or "vegetable", or a combination of such words, as keywords. Then, the saliency and content consistency image filters [Chen et al. 2009] are applied to classify the retrieved images and to obtain the cutouts by GrabCut segmentation [Rother et al. 2004]; the resulting set of cutouts is denoted by $\mathcal{C} = \{C_i\}$ (see Figure 3 (a)). As stated in their report, these image filters prefer images with a salient foreground and a simple background, and may generate inaccurate cutouts for complex images, in which case offline manual effort is needed to improve the segmentation quality for a better cutout database. However, owing to the large number of online images, sufficient data can always be found to feed the image filters and yield usable cutouts for collage construction.

Figure 3: (a) Cutouts from the relevant Internet images. (b) Each cutout is encoded with a color-shape descriptor.
3.1 Cutouts Encoding
To establish the database, we mark the cutouts with descriptive tags that are used to select competent ones in the following sections. For each cutout $C_i \in \mathcal{C}$, we assign a descriptor

$$\Phi(C_i) = \langle H_i, G_i \rangle \tag{1}$$

where $H_i = [h_{i1}, \ldots, h_{iN}]_{C_i}$ denotes the $N$-bin histogram in YUV color space, and $G_i = [g_{i1}, g_{i2}, \ldots]_{C_i}$ are the affine moment invariants (AMIs) [Flusser and Suk 1994]. This descriptor encodes the cutout with both its color distribution and shape features that are invariant under affine transformation, which approximates the imaging mechanism in photography (see Section 3.2). For two cutouts $C_i$ and $C_j$, we define their color and shape disparities as

$$d_c(C_i, C_j) = \sum_{k=1}^{N} \frac{(h_{ik} - h_{jk})^2}{h_{ik} + h_{jk}} \tag{2}$$

$$d_s(C_i, C_j) = \sum_{k} |g_{ik} - g_{jk}| \tag{3}$$

For efficient computation, we set $N = 12^3$ and use only the AMI of the lowest order, i.e., $G_i = g_{i1} = (\mu_{20}\mu_{02} - \mu_{11}^2)/\mu_{00}^4$, where $\mu_{pq}$ is the shape moment defined as

$$\mu_{pq} = \iint_{\Omega} (x - x_t)^p (y - y_t)^q \, dx \, dy \tag{4}$$

with $(x_t, y_t)$ the center of the binary image $\Omega$ of the cutout (see Figure 3 (b)). Then, the K-means method is used to classify $\mathcal{C}$ into clusters by color and by shape descriptors, respectively. K-means is initialized with 40 clusters in our experiments. The set of color cluster centers is denoted by $\{\mathcal{K}^c_k\}$, and the set of shape cluster centers by $\{\mathcal{K}^s_k\}$. Next, we proceed to the selection of competent cutouts.
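The descriptor of Section 3.1 can be illustrated with a minimal Python sketch. It computes a $12^3$-bin YUV histogram, the histogram distance of Equation (2), and the lowest-order AMI of a binary mask. The function names, the BT.601 RGB-to-YUV weights, and the use of discrete pixel sums in place of the integral in Equation (4) are assumptions made for this example, not details taken from the system described here.

```python
import numpy as np

def rgb_to_yuv(rgb):
    """Convert an (..., 3) float RGB array in [0, 1] to YUV (BT.601 weights, assumed)."""
    m = np.array([[0.299, 0.587, 0.114],
                  [-0.14713, -0.28886, 0.436],
                  [0.615, -0.51499, -0.10001]])
    return rgb @ m.T

def color_histogram(rgb, mask, bins=12):
    """Normalized bins**3 YUV histogram over the pixels where mask is True."""
    yuv = rgb_to_yuv(rgb[mask])                      # (n, 3) samples inside the cutout
    ranges = [(0.0, 1.0), (-0.436, 0.436), (-0.615, 0.615)]
    hist, _ = np.histogramdd(yuv, bins=bins, range=ranges)
    hist = hist.ravel()
    return hist / max(hist.sum(), 1.0)

def chi_square(h1, h2):
    """Histogram distance of Equation (2); bins that are empty in both are skipped."""
    denom = h1 + h2
    nz = denom > 0
    return float(np.sum((h1[nz] - h2[nz]) ** 2 / denom[nz]))

def affine_moment_invariant(mask):
    """Lowest-order AMI (mu20 * mu02 - mu11**2) / mu00**4 from discrete central moments."""
    ys, xs = np.nonzero(mask)
    mu00 = float(xs.size)
    dx = xs - xs.mean()
    dy = ys - ys.mean()
    mu20, mu02, mu11 = np.sum(dx * dx), np.sum(dy * dy), np.sum(dx * dy)
    return (mu20 * mu02 - mu11 ** 2) / mu00 ** 4
```

For a filled rectangle the invariant approaches 1/144 as the resolution grows, whatever the aspect ratio, which is a quick sanity check of the affine invariance.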
3.2 Component-aware Cutouts Matching
The distinctive feature of our Arcimboldo-like collage is that it disguises an image of arbitrary shape with multiple cutouts of arbitrary shapes in a visually consistent manner. This consistency has two aspects: the collage of cutouts must resemble the input image, while each individual cutout must remain recognizable in the collage. This demands that the cutouts match the structural components of the input image, thus forming a plausible Arcimboldo-like representation.

Assuming the input image is segmented into a set of components $\mathcal{S} = \{S_i\}$ (see Section 3.3), the matching assigns each component a label $L$ indicating its cutout. The matching energy for a competent cutout $L(S_i) \in \mathcal{C}$ comprises three terms:

$$E(S_i; L, T) = E_{cms}(L, T) + w_{col} E_{col}(L) + w_{dev} E_{dev}(T) \tag{5}$$

where $T$ is the transformation induced for consistent matching. The first term $E_{cms}$ tends to select the cutout that most resembles the component in shape, possibly after an appropriate transformation. The $E_{col}$ term measures the color similarity between the component and its cutout. Finally, $E_{dev}$ accounts for the identifiability of individual cutouts in the collage by penalizing severe shape change under the transformation. Each of these energy terms, as well as the parameter settings, is discussed in detail below.
Shape Matching The chosen cutout is required to match the shape of the corresponding component well. However, a photographed image with exactly the same shape as the component is rare, even on the Internet. Thus, shape deformation is inevitable, and $E_{cms}$ should favor shape consistency under such deformation. Formally, this term is defined as

$$E_{cms}(L, T) = E_{cms}(L, A_i) = \frac{\mathrm{area}(S_i \,\triangle\, A_i L(S_i))}{\mathrm{area}(S_i)} \tag{6}$$

where $\mathrm{area}(\cdot)$ is the region area, $\triangle$ is the symmetric difference, i.e., $X \triangle Y = (X \cup Y) \setminus (X \cap Y)$, and $A_i$ is the best affine transformation matrix for shape matching, as described below.

There are many published approaches to shape matching under a prescribed transformation. Here, we set $T$ to be an affine transformation, following the projective photography model of the pinhole camera. The implied projection can be well approximated by an affine transformation, which makes the deformed cutout change projectively with less distortion (see Figure 4), so that it remains recognizable in the collage. Hence, optimizing Equation (6) amounts to finding the cutout that best matches the component under affine transformation. We use the affine registration approach [Ho et al. 2009] to compute the matching transformation $A_i$ between the outlines of $L(S_i)$ and $S_i$. This approach does not need explicit pairwise correspondences and is computationally fast.
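To make the shape term concrete, the sketch below evaluates the symmetric-difference ratio of Equation (6) on boolean masks, given an affine map that is already known. It does not implement the registration of [Ho et al. 2009]; the helper names, the nearest-neighbour warp, and the split of the affine map into a 2x2 matrix `A` plus a translation `t` are assumptions made for illustration.

```python
import numpy as np

def warp_mask(mask, A, t, out_shape):
    """Nearest-neighbour warp of a boolean mask by x' = A x + t, with x = (col, row)."""
    rows, cols = np.indices(out_shape)
    pts = np.stack([cols.ravel(), rows.ravel()]).astype(float)   # 2 x n target coordinates
    # Inverse mapping: for every target pixel, look up the source pixel it comes from.
    src = np.linalg.inv(np.asarray(A, float)) @ (pts - np.asarray(t, float).reshape(2, 1))
    sc = np.rint(src[0]).astype(int)
    sr = np.rint(src[1]).astype(int)
    inside = (sr >= 0) & (sr < mask.shape[0]) & (sc >= 0) & (sc < mask.shape[1])
    out = np.zeros(pts.shape[1], dtype=bool)
    out[inside] = mask[sr[inside], sc[inside]]
    return out.reshape(out_shape)

def shape_energy(component, warped_cutout):
    """Area of the symmetric difference divided by the component area (Equation (6))."""
    sym_diff = np.logical_xor(component, warped_cutout)
    return sym_diff.sum() / component.sum()
```

A cutout that covers exactly the component gives an energy of 0, and one that covers only half of it (without spilling outside) gives 0.5.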
Color Matching This term enables color imitation when the cutout $L(S_i)$ is used to represent the component $S_i$, and is defined as

$$E_{col}(L) = d_c(S_i, L(S_i)) \tag{7}$$

where $d_c$ is the color difference given by the histogram distance in Equation (2).
Matching Deviation To keep the cutouts recognizable, we should prevent them from deviating severely when they are matched to components (see Figure 4). Since the deviation comes from the affine transformation in the shape matching, this term is defined as

$$E_{dev}(T = A_i) = |\sigma_{i1}/\sigma_{i2} - 1| \tag{8}$$

where $\sigma_{i1}$ and $\sigma_{i2}$ are the two singular values of $A_i$. The ratio $\sigma_{i1}/\sigma_{i2}$ measures how far $A_i$ departs from a conformal mapping, which admits only locally isotropic transformation.
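The deviation term is straightforward to compute from the linear part of the affine map. The following sketch is an illustration only; the function name is hypothetical, and it assumes the matrix is passed as a 2x2 (or larger, homogeneous) array whose upper-left 2x2 block is the linear part.

```python
import numpy as np

def deviation_energy(A):
    """|sigma_1 / sigma_2 - 1| for the 2x2 linear part of an affine map (Equation (8))."""
    linear = np.asarray(A, dtype=float)[:2, :2]
    sigma = np.linalg.svd(linear, compute_uv=False)   # singular values, largest first
    return abs(sigma[0] / sigma[1] - 1.0)
```

Rotations and uniform scalings give an energy of 0, while stretching one axis by a factor of two gives 1.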
Parameters The parameters $w_{col}$ and $w_{dev}$ tune the fidelity of color and shape in the Arcimboldo-like representation. Sometimes the cutout database is in biased supply and cannot satisfy the color and shape demands of the components, e.g., there are few blue instruments, or few fruits with a rectangular shape. Hence, instead of constant values, we use an adaptive setting to define the weighting parameters:

$$w_{col}(S_i) = \exp(-\alpha \, d_c(S_i, \mathcal{C})) \tag{9}$$

$$w_{dev}(S_i) = \exp(-\beta \, d_s(S_i, \mathcal{C})) \tag{10}$$

where $\alpha$ and $\beta$ are constant coefficients for scale adjustment, set as $\alpha = \beta = 10$ in our experiments. The disparities $d_c(S_i, \mathcal{C}) = \min_k \{d_c(S_i, \mathcal{K}^c_k)\}$ and $d_s(S_i, \mathcal{C}) = \min_k \{d_s(S_i, \mathcal{K}^s_k)\}$ measure the color and shape similarity of the component to the database.

Figure 4: For each component, a cutout is selected by measuring the consistency of color (histogram), shape under affine transformation, and the matching deviation (shown by the circle distortion).

Figure 5: (a) Segmentation by the mean-shift clustering method. (b) Distribution of cutouts (diamonds) and segmented patches (dots) embedded in 2D space based on the metric $d_s$. (c) Unqualified patches. (d) Patch merging and splitting. (e) Segmentation result after merging and splitting all the unqualified patches in (c) in the first iteration. (f) Final segmented components.

Energy minimization of Equation (5) can be performed by sequentially evaluating the energy of each cutout in $\mathcal{C}$ and keeping the one with the minimum value as the competent cutout for the corresponding component. Since the optimization is independent across components, it can be efficiently implemented in parallel.
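The exhaustive minimization of Equation (5) for one component can be sketched as below. The sketch assumes that the three energy terms of every candidate cutout have already been computed, and that the adaptive weights decay exponentially with the component's disparity to the database, with both coefficients equal to 10. The function names and the tuple layout of `candidates` are hypothetical.

```python
import math

def adaptive_weight(disparity_to_database, scale=10.0):
    """Weight that shrinks as the component gets farther from every database cluster."""
    return math.exp(-scale * disparity_to_database)

def select_cutout(candidates, dc_to_db, ds_to_db, alpha=10.0, beta=10.0):
    """Return (label, energy) of the candidate that minimizes Equation (5).

    candidates: iterable of (label, e_cms, e_col, e_dev) tuples for one component.
    dc_to_db, ds_to_db: color and shape disparity of the component to the database.
    """
    w_col = adaptive_weight(dc_to_db, alpha)
    w_dev = adaptive_weight(ds_to_db, beta)
    best_label, best_energy = None, math.inf
    for label, e_cms, e_col, e_dev in candidates:
        energy = e_cms + w_col * e_col + w_dev * e_dev
        if energy < best_energy:
            best_label, best_energy = label, energy
    return best_label, best_energy
```

Because each component is handled by an independent call, the calls can be distributed over worker processes without any shared state.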
3.3 Cutouts Guided Component Segmentation
Obviously, the selection of competent cutouts by component-aware matching depends on the segmentation $\{S_i\}$. However, unsupervised segmentation of semantic components remains a great challenge in image processing, and we do not solve this general problem in our system. Since components are represented by cutouts, it is favorable to obtain a segmentation whose shapes are close to those of the cutouts. We therefore use the database $\mathcal{C}$ as a reference to iteratively refine an automatic segmentation result into desirable components.

Let $\mathcal{P} = \{P_i\}$ be the patches automatically segmented by the mean-shift clustering approach [Comaniciu and Meer 2002] (see Figure 5 (a)), with a uniform kernel of radius 10 in our experiments, and let $\mathcal{U} = \{P_i : d_s(P_i, \mathcal{C}) > \delta\}$ be the unqualified patches, whose shapes are far from the database (see Figure 5 (b-c)). Then, for each patch $P_i \in \mathcal{U}$, we apply either a merging or a splitting step to improve its segmentation, as follows.

Merging. Given the patch $P_i$, we find the patch $Q_i$ such that $Q_i = \arg\min_{Q_j} \{d_s(P_i \cup Q_j, \mathcal{C}) < d_s(P_i, \mathcal{C}) : Q_j \in \mathcal{N}(P_i)\}$, where $\mathcal{N}(\cdot)$ is the neighborhood, and then merge $Q_i$ into $P_i$ to increase the shape similarity of $P_i$ (see top of Figure 5 (d)). The merging step reduces the number of patches by combining neighboring patches for better shape matching. If no such $Q_i$ exists, we turn to the splitting step.

Splitting. Let $(u_j, v_j) \in P_i$ denote a pair of points on the outline of the patch $P_i$, and let $\mathcal{T} = \{(u_j, v_j) : |u_j v_j| / |\widehat{u_j v_j}| < \epsilon\}$ be the constraint set of point pairs, where $|u_j v_j|$ is the inner distance measured within the interior of $P_i$, and $|\widehat{u_j v_j}|$ is the longer of the two path lengths between the points along the outline (see bottom of Figure 5 (d)). Each edge link $(u_j, v_j)$ partitions $P_i$ into several patches $\{E_{jk} \mid k = 1, \ldots, K\}$. Intuitively, the edges tend to split the patch at narrow band regions. We choose the pair $(u_i, v_i)$ with minimum $(\sum_{k=1}^{K} d_s(E_{ik}, \mathcal{C}))/K$ and split $P_i$ into $K$ new patches, whose shapes are closer to the database on average. Although the splitting step increases the number of patches, it provides a segmentation that conforms better to the cutout database, which leads to consistent cutout matching.

Merging and splitting are performed on all patches in $\mathcal{U}$, after which the segmentation $\mathcal{P}$ and the set $\mathcal{U}$ are updated. This procedure is iterated until $\mathcal{U} = \emptyset$. In our experiments, we set $\delta = 0.1 \max \{d_s(\mathcal{K}^s_i, \mathcal{K}^s_j)\}$ and $\epsilon = 0.3$. Finally, we obtain components $\{S_i\}$ that are more suitable for cutout matching (see Figure 5 (f)). However, this guided segmentation relies on the database $\mathcal{C}$ and cannot always generate accurate semantic components. We therefore also provide interactive merging and splitting tools to further improve the segmentation (see Section 5).
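The merging rule can be sketched on boolean patch masks as follows. This is a simplified illustration rather than the system's code: it assumes the shape descriptor is the single lowest-order AMI, that the database is summarized by a list of scalar shape cluster centers, and that the caller supplies the neighboring patches. All function names are hypothetical, and the splitting step is not covered.

```python
import numpy as np

def first_ami(mask):
    """Lowest-order affine moment invariant of a boolean mask (discrete moments)."""
    ys, xs = np.nonzero(mask)
    dx = xs - xs.mean()
    dy = ys - ys.mean()
    mu20, mu02, mu11 = np.sum(dx * dx), np.sum(dy * dy), np.sum(dx * dy)
    return (mu20 * mu02 - mu11 ** 2) / float(xs.size) ** 4

def shape_disparity(mask, shape_centers):
    """d_s of a patch to the database: distance to the nearest shape cluster center."""
    value = first_ami(mask)
    return min(abs(value - center) for center in shape_centers)

def best_merge(patch, neighbours, shape_centers):
    """Index of the neighbour whose union with the patch is closest to the database.

    Returns None when no union improves on the patch itself, in which case the
    splitting step would be tried instead.
    """
    best_index = None
    best_value = shape_disparity(patch, shape_centers)
    for index, neighbour in enumerate(neighbours):
        value = shape_disparity(patch | neighbour, shape_centers)
        if value < best_value:
            best_index, best_value = index, value
    return best_index
```

In this toy setting, an L-shaped patch next to the square that completes it into a rectangle is merged with that square when the database contains a rectangular shape cluster.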
4 Collage Assembly
Since the goal is to represent the components with cutouts in a shape- and color-consistent way, we use the affine transformation $A_i$ associated with $L(S_i)$ in Equation (6) to assemble the cutouts. Thus, each component $S_i$ is replaced with the transformed cutout $\tilde{S}_i = A_i L(S_i)$ (Figure 6). However, there is no evident order in which to stack the transformed cutouts in the assembly, so we must infer their layer ordering from the input image.

Figure 6: Selected cutouts are assembled according to the affine transformations, in unorganized layer ordering (gray image).
4.1 Layer Ordering of Cutouts
Recovering depth information from a single image is almost impossible because of the lack of sufficient spatial cues. Here, the image has already been represented by its segmented components, so the problem can be cast as estimating a reasonable layer ordering of the components. However, even reasoning about the layer ordering of components is severely ill-posed. Inspired by work on occlusion recovery from a single image [Hoiem et al. 2007], we employ perceptually motivated cues to infer the layer ordering. In the following, we use inequalities to describe the order of components, e.g., $S_i < S_j$ implies that $S_i$ lies in front of $S_j$. Each cue casts inequality votes on the ordering of components, as discussed below.
Region coverage If the boundary of component $S_a$ is completely surrounded by component $S_b$, we have the ordering $S_a < S_b$ (see Figure 7 (a)). Hence, complete components are more likely to be displayed in front. This is a rather intuitive cue arising from the common physical interpretation of a scene.

T-junction area A T-junction occurs where one boundary ends on another, so that more than two boundaries meet (see Figure 7 (b)). Assuming components $\{S_k \mid k = 1, \ldots, m\}$ touch at T-junction $t$, and $B_r$ is a disc centered at $t$ with radius $r$, the ordering is determined by the local areas near $t$, i.e., $S_l < S_k$ if $\mathrm{area}(S_k \cap B_r) < \mathrm{area}(S_l \cap B_r)$. T-junctions have long been used as evidence for occlusion detection [Hoiem et al. 2007]. To reduce the ambiguity caused by noisy segmentation, we compute the local areas with discs over a range of radii, e.g., $r \in \{\frac{1}{n} D(S_k), \ldots, \frac{z}{n} D(S_k) \mid z < n\}$, where $D(S_k)$ is the diameter of the union of the components intersecting at $t$. For each disc, we then obtain a layer ordering of the components.
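The T-junction cue can be sketched on a label image as follows. The sketch assumes that each pixel stores the id of its component, that the junction position and the diameter $D$ are given, and that every disc contributes $1/z$ to each inequality it supports. The function name, the argument layout, and the default values of `n` and `z` are hypothetical choices for illustration.

```python
import numpy as np

def t_junction_votes(labels, junction, component_ids, diameter, n=10, z=3):
    """Accumulate votes[(front, back)] from z discs of radius (k / n) * diameter.

    labels: 2D integer array of component ids; junction: (row, col), may be fractional.
    Within each disc, a component with a larger local area is voted in front of
    every component with a smaller local area.
    """
    rows, cols = np.indices(labels.shape)
    dist2 = (rows - junction[0]) ** 2 + (cols - junction[1]) ** 2
    votes = {}
    for step in range(1, z + 1):
        radius = step * diameter / n
        disc = dist2 <= radius * radius
        areas = {c: int(np.sum(disc & (labels == c))) for c in component_ids}
        for front in component_ids:
            for back in component_ids:
                if areas[back] < areas[front]:
                    votes[(front, back)] = votes.get((front, back), 0.0) + 1.0 / z
    return votes
```

For an ideal T-junction, the component occupying the half-plane collects a full vote over each of the two components that share the other half, and those two receive no vote against each other.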
Because the cues are heuristic, they may produce conflicting inequalities (Figure 7 (b)). To make a fair inference, we keep a voting table $V_T$ that records the inequalities proposed by all cues (Figure 7 (c)). $V_T$ is a square matrix whose dimension is the number of components, and each entry $n_{ij}$ is the number of votes for $S_i < S_j$. For the region coverage cue, each inequality counts as one vote, while for the T-junction area cue, each inequality contributes $1/z$, where $z$ is the number of discs. All votes are then summed in $V_T$. After a majority election over all cues, we obtain the inequality set $V$, which can be represented by a directed graph with edge weights $n_{ij}$. We then apply topological sorting to $V$ and remove loops in the graph by deleting the edges with minimum weight [Kahn 1962], which results in a consistent layer ordering of all components. Accordingly, we reorder the cutouts in the assembly (Figure 7 (d)).
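The voting and sorting procedure can be sketched in plain Python. The sketch keeps the majority direction for every pair of components and then peels off components that have nothing left in front of them, in the manner of Kahn's algorithm. As a simplifying assumption, whenever no such component exists the globally weakest remaining edge is deleted; this always breaks the deadlock but may delete an edge that is not itself on a loop. The function names are hypothetical.

```python
def majority_edges(votes):
    """Keep, for each pair, the direction with more votes; votes[(i, j)] means 'i in front of j'."""
    edges = {}
    for (i, j), n_ij in votes.items():
        if n_ij > votes.get((j, i), 0.0):
            edges[(i, j)] = n_ij
    return edges

def layer_order(nodes, votes):
    """Return the components from front to back, consistent with the surviving inequalities."""
    edges = majority_edges(votes)
    remaining = list(nodes)
    order = []
    while remaining:
        indegree = {n: 0 for n in remaining}
        for (_, back) in edges:
            indegree[back] += 1
        free = [n for n in remaining if indegree[n] == 0]
        if free:
            front = free[0]
            order.append(front)
            remaining.remove(front)
            # Drop every inequality that mentions the component just placed.
            edges = {e: w for e, w in edges.items() if front not in e}
        else:
            # Every remaining component has something in front of it: a loop exists.
            weakest = min(edges, key=edges.get)
            del edges[weakest]
    return order
```

With one region coverage vote for a in front of b, and T-junction votes that favor b over k by 2/3 to 1/3, the sketch places a first and b before k.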

Figure 7: Layer ordering. In the voting table (c), the red numbers come from the inequality votes of (a), and the green numbers from (b).
Although the heuristic cues provide no theoretical guarantee on the correctness of the layer ordering, they quickly yield a reasonable sequence of the cutouts. If necessary, this sequence can serve as the initialization for interactive adjustment of the order (see Section 5).
4.2 Collage Enhancement
To improve the resemblance and aesthetic effect of the Arcimboldo-like collage, further salient properties should be inherited from the input image. Symmetry is ubiquitous in images and often attracts subconscious attention in the human recognition system. We apply symmetry detection [Loy and Eklundh 2006] to the input image from outside to inside, and represent symmetric components with the same cutouts (see Figure 8 (a)).

Inconsistent illumination might occur between the cutouts and the input image. For a coherent lighting effect, we adjust the luminance by histogram matching of the Y channel in YUV color space between each cutout $C_i$ and its corresponding component $S_i$ (see Figure 8 (b)). Luminance coherence is not always necessary in Arcimboldo-like collages, e.g., for cartoons, which have no realistic lighting.
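Histogram matching of the luminance channel can be sketched with NumPy as below. This is a standard CDF-based matching, shown under the assumption that the Y values of the cutout and of the component have already been extracted as arrays; the function name is hypothetical and the exact matching variant used in the system is not specified in the text.

```python
import numpy as np

def match_luminance(source_y, reference_y):
    """Remap source luminance so that its cumulative distribution follows the reference."""
    src = np.asarray(source_y, dtype=float)
    ref = np.asarray(reference_y, dtype=float).ravel()
    s_vals, s_inverse, s_counts = np.unique(src.ravel(), return_inverse=True,
                                            return_counts=True)
    r_vals, r_counts = np.unique(ref, return_counts=True)
    s_cdf = np.cumsum(s_counts) / src.size      # quantile of each distinct source value
    r_cdf = np.cumsum(r_counts) / ref.size      # quantile of each distinct reference value
    mapped = np.interp(s_cdf, r_cdf, r_vals)    # reference value at the same quantile
    return mapped[s_inverse].reshape(src.shape)
```

Only the Y channel is remapped; the U and V channels of the cutout would be left untouched so that its colors stay recognizable.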

Figure 8: (a) "Fruit" panda: symmetric components are replaced by the same cutouts. (b) "Vegetable" elephant, from left to right: input image, result without luminance coherence, result with luminance coherence.
5 User Interaction
Art is always produced through human activity [Cong et al. 2011]. We therefore provide two interactive tools, one in the segmentation stage and one in the layer ordering stage, to steer the process toward the intended result (see Figure 9).

Merging & splitting brush Given the automatically segmented patches, the user can simply draw a stroke with the merging brush to merge the patches covered by the stroke into one new patch. To separate a patch, the user draws a stroke across the selected patch with the splitting brush; the patch is then segmented into two new patches along the boundary computed by graph-cut optimization [Rother et al. 2006] in the region covered by the stroke.

Interactive layering arrow The user can progressively adjust the order of two cutouts $S_i$ and $S_j$ by drawing an arrow between them as $S_i \to S_j$. We then obtain a new inequality $S_i < S_j$ and add it to the inequality set $V$ (if $S_i > S_j$ exists, we first delete it from $V$). By applying topological sorting to the updated inequalities, we obtain a new layer ordering that satisfies the user's intention.
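The effect of a layering arrow on the inequality set can be sketched with the standard-library `graphlib` module (Python 3.9 or later). The function names are hypothetical, and the sketch assumes that the inequalities are stored as (front, back) pairs. It does not handle the case where the new arrow closes a loop with other inequalities, in which case `TopologicalSorter` raises `CycleError`.

```python
from graphlib import TopologicalSorter

def apply_arrow(inequalities, front, back):
    """Add the user inequality 'front < back', dropping the opposite one if present."""
    updated = set(inequalities)
    updated.discard((back, front))
    updated.add((front, back))
    return updated

def order_from(inequalities, nodes):
    """Front-to-back order of the nodes that satisfies every (front, back) inequality."""
    predecessors = {n: set() for n in nodes}
    for front, back in inequalities:
        predecessors[back].add(front)
    return list(TopologicalSorter(predecessors).static_order())
```

Drawing an arrow from c to b on the chain a < b < c, for example, removes b < c and yields an order in which both a and c precede b.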
6 Experiments and Results
We have tested our algorithm on many examples to create a variety of Arcimboldo-like collages. In the experiments, our algorithm automatically downloads 5,000 images for each keyword, such as "fruit" or "instrument", and retains about 800 images after image filtering. It takes about 30 minutes to process the Internet images (including downloading, filtering and clustering). Then, the input image is segmented and competent cutouts are selected for all segmented components. Meanwhile, layer ordering and symmetry detection are performed to guide the cutout assembly, and luminance coherence is optionally applied. Some results are shown in Figures 1, 8 and 9, where each image is represented by a collage on a different theme. Figure 10 shows results using two different cutout databases for each example. Our system is flexible enough to produce a wide range of Arcimboldo-like collages. The running time of our algorithm depends on the number of cutouts in the database, the number of segmented components, and the amount of user interaction. Table 1 shows the timing statistics (in seconds) for cutout selection (Section 3) and automatic layer ordering (Section 4.1). Our experiments were run on a PC with a 2.66 GHz CPU and 4 GB of RAM.
To evaluate the performance, we conduct a user study on both system usability and collage quality. We are not aware of any comparable tool for generating Arcimboldo-like collages; the closest one is Photoshop. Thus, we recruited 20 participants (with Photoshop training) to evaluate both our system and Photoshop.

Figure 9: For each example, from left to right: cutouts guided segmentation, user interaction (red strokes for the merging brush, green strokes for the splitting brush and blue arrows for layer ordering), final segmented components, and Arcimboldo-like collage using the keyword in quotation marks. Examples: "vegetable" kangaroo, "conch" parrot, "marine animal" cock, "vegetable" Popeye, "handbag" Kiss, "fruit" Shin-chan. (Popeye © 2011 King Features Syndicate, Inc.; Shin-chan © Yoshito Usui)

Figure 10: Top: excavator collage using "electric appliance" and "instrument" cutouts; bottom: Calabash (© Shanghai Animation Film Studio) collage using "tableware" and "fruit" cutouts.
User study I: usability. At the start, we gave detailed instructions on how to use our system, and then asked the participants to produce the collages in Table 1 using our system and Photoshop, respectively. To make a fair comparison, we provided the same cutout database to the Photoshop users, which greatly lessens the effort of segmenting the foreground. After each example was completed, we recorded the participants' preference between the tools. The statistics in Figure 11 (a) show that more than 50% of the participants chose our system for each example, especially when the input images became complicated, with many components. We observed that Photoshop users spent a lot of time picking appropriate cutouts for the components and adjusting their layer ordering, while our system automates the process with much less interaction.
User study II: visual quality. To evaluate the collage quality, we asked the participants to score the collages generated by our algorithm and those made by an artist (such as in Figure 11 (c)). The artist created the collages using Photoshop and was given enough time for refinement. The participants then rated the collages from excellent (10) to poor (0) for each example. The scoring criteria are the recognizability of the whole collage and the identity of the individual cutouts in the collage. As shown in Figure 11 (b), our algorithm produces appealing collages comparable to the artworks by the artist, while the latter usually took 45 minutes per example.
Image                 #M/#S/#L   #Component   Selection (s)   Ordering (s)
Prime (Fig. 1)        0/0/4      92           81.2            1.82
Panda (Fig. 8)        0/3/0      12           14.1            0.28
Elephant (Fig. 8)     8/7/2      12           16.4            0.27
Kangaroo (Fig. 9)     8/5/1      10            9.6            0.25
Cock (Fig. 9)         0/3/3      62           60.4            1.23
Popeye (Fig. 9)       0/5/2      27           25.5            0.35
Parrot (Fig. 9)       4/0/3      39           41.3            0.74
Shin-chan (Fig. 9)    0/5/3      21           18.3            0.31
Kiss (Fig. 9)         0/5/2      28           28.7            0.41
Excavator (Fig. 10)   3/0/1      23           22.1            0.58
Calabash (Fig. 10)    3/4/1      30           39.4            0.62
Table 1: Interaction and timing statistics. #M/#S/#L denote the numbers of interactive merging, splitting and layer ordering operations.
Limitations The automatic selection of cutouts for components does not consider the object context, so it may generate semantically inconsistent forms. For example, in Figure 8 (a), the ear of the panda is represented by a single grape that appears as large as the bunch of grapes used for the eye. In Figure 8 (b), the legs of the elephant are substituted with different forms, an unnatural composition by common sense. On the other hand, these results express the images in an exaggerated way, which is also part of the style of Arcimboldo-like collages. Additionally, there may be inconsistent perspective effects between the assembled cutouts and the input image, e.g., the excavator in Figure 10, which degrades the visual quality of the collages.

Figure 11: Statistics of the user study and collages created by an artist. (a) Number of participants (0 to 20) preferring our method or Photoshop for each example. (b) Scores (0 to 10) for each example. (c) Collages created by an artist. Examples in (a) and (b): Prime, Panda, Elephant, Kangaroo, Cock, Popeye, Parrot, Shin-chan, Kiss, Excavator, Calabash.
7 Conclusion
We have presented an efficient algorithm for creating Arcimboldo-like
collages, which represent an input image with thematically related
cutouts from Internet images. The user study shows that our system
can produce collages that render the input images in a plausible
Arcimboldo-like style while keeping the cutouts recognizable.
Despite minor limitations, we hope this paper opens a new direction
in computational aesthetics based on Internet images. We believe
that the massive database of Internet images is a valuable resource
for a wide range of image processing tasks. As future work, we plan
to apply more stylization techniques [Wang et al. 2010] to enhance
the collages. Another promising scenario is to combine
Arcimboldo-like collage and mosaic in a unified Internet image
framework. By controlling the element size, we would like to produce
a spectrum of artworks with varied assembly styles.
Acknowledgements
We would like to thank the anonymous reviewers for their helpful
comments. We are also grateful to Hasbro International Inc., King
Features Syn., Animation International Ltd., and Shanghai Animation
Film Studio for granting permission to use the pictures of Prime,
Popeye, Shin-chan and Calabash. We thank Nerina Patane and Meng Ding
for sharing their artworks in Figures 2 and 11, and Yu Zang and Hong
Liu for helping us with the figures and video. This work was partly
supported by the Program for New Century Excellent Talents in
University (No. NCET-09-0635) and the National Natural Science
Foundation of China (No. 61133008, 61103159).
References
AGARWALA, A., DONTCHEVA, M., AGRAWALA, M., DRUCKER,
S., COLBURN, A., CURLESS, B., SALESIN, D., AND COHEN,
M. 2004. Interactive digital photomontage. ACM Trans. Graph.
23 (August), 294–302.
CHEN, T., CHENG, M.-M., TAN, P., SHAMIR, A., AND HU, S.-M.
2009. Sketch2photo: Internet image montage. ACM Trans.
Graph. 28 (December), 124:1–124:10.
CHEUNG, V., 2011. Shape collage. http://www.shapecollage.com.
CHU, H.-K., HSU, W.-H., MITRA, N. J., COHEN-OR, D.,
WONG, T.-T., AND LEE, T.-Y. 2010. Camouflage images.
ACM Trans. Graph. 29 (July), 51:1–51:8.
COMANICIU, D., AND MEER, P. 2002. Mean shift: A robust
approach toward feature space analysis. IEEE Trans. Pattern
Anal. Mach. Intell. 24 (May), 603–619.
CONG, L., TONG, R., AND DONG, J. 2011. Selective image
abstraction. The Visual Computer 27 (March), 187–198.
FLUSSER, J., AND SUK, T. 1994. Affine moment invariants: A
new tool for character recognition. Pattern Recognition Letters
15 (April), 433–436.
GAL, R., SORKINE, O., POPA, T., SHEFFER, A., AND COHEN-OR,
D. 2007. 3D collage: Expressive non-realistic modeling. In
Proc. NPAR, 7–14.
GOFERMAN, S., TAL, A., AND ZELNIK-MANOR, L. 2010.
Puzzle-like collage. Comput. Graph. Forum 29 (May), 459–468.
HAUSNER, A. 2001. Simulating decorative mosaics. In ACM
SIGGRAPH 2001, 573–580.
HAYS, J., AND EFROS, A. A. 2007. Scene completion using
millions of photographs. ACM Trans. Graph. 26 (July).
HO, J., PETER, A., RANGARAJAN, A., AND YANG, M.-H. 2009.
An algebraic approach to affine registration of point sets. In
Proc. ICCV, 1335–1340.
HOIEM, D., EFROS, A. A., AND HEBERT, M. 2007. Recovering
occlusion boundaries from a single image. In Proc. ICCV, 1–8.
KAHN, A. B. 1962. Topological sorting of large networks.
Commun. ACM 5 (November), 558–562.
KIM, J., AND PELLACINI, F. 2002. Jigsaw image mosaics. ACM
Trans. Graph. 21 (July), 657–664.
LOY, G., AND EKLUNDH, J.-O. 2006. Detecting symmetry and
symmetric constellations of features. In Proc. ECCV, 508–521.
MAIORINO, G. 1983. The Portrait of Eccentricity: Arcimboldo and
the Mannerist Grotesque. Pennsylvania State University Press.
MITRA, N. J., CHU, H.-K., LEE, T.-Y., WOLF, L., YESHURUN,
H., AND COHEN-OR, D. 2009. Emerging images. ACM Trans.
Graph. 28 (December), 163:1–163:8.
ROTHER, C., KOLMOGOROV, V., AND BLAKE, A. 2004.
GrabCut: Interactive foreground extraction using iterated graph cuts.
ACM Trans. Graph. 23 (August), 309–314.
ROTHER, C., BORDEAUX, L., HAMADI, Y., AND BLAKE, A.
2006. AutoCollage. ACM Trans. Graph. 25 (July), 847–852.
SNAVELY, N., SEITZ, S. M., AND SZELISKI, R. 2006. Photo
tourism: Exploring photo collections in 3D. ACM Trans. Graph.
25 (July), 835–846.
WANG, J., SUN, J., QUAN, L., TANG, X., AND SHUM, H. 2006.
Picture collage. In Proc. CVPR, 347–354.
WANG, S., CAI, K., LU, J., LIU, X., AND WU, E. 2010.
Real-time coherent stylization for augmented reality. The Visual
Computer 26 (June), 445–455.
WIKIPEDIA, 2011. Collage. http://en.wikipedia.org/wiki/Collage.
