
Wavelet Transform Based Texture Features For Content Based Image Retrieval
Manesh Kokare, B.N. Chatterji and P.K. Biswas
Electronics and Electrical Communication Engineering Department,
Indian Institute of Technology,
Kharagpur PIN 721 302, India
{mbk, bnc, pkb}@ece.iitkgp.ernet.in
ABSTRACT
The rapid expansion of the Internet and the wide use of digital data have increased the need for efficient image database creation and retrieval procedures. The challenge in image retrieval is to develop methods that can capture the important characteristics that make an image unique and allow its accurate identification. The focus of this paper is on the image processing aspects, in particular the use of texture information for retrieval. We present wavelet transform based texture features for content-based image retrieval that are comparable with standard existing methods. We propose the use of pyramidal and tree structured wavelet features computed with 8-tap Daubechies coefficients for texture analysis and provide extensive experimental evaluation. Comparison of various features on the Brodatz texture database indicates that the combination of energy and standard deviation of the wavelet features provides good pattern retrieval accuracy for tree structured wavelet decomposition, while standard deviation alone gives better results for pyramidal wavelet decomposition. Most existing retrieval methods use the Euclidean distance function as the measure of dissimilarity between feature vectors, but we observed that it is not always the best metric. We compared results using the Euclidean distance and the Manhattan distance for both the tree structured and pyramidal wavelet decomposition methods and found that the Manhattan distance gives better results than the Euclidean distance metric.
Keywords: Content-based image retrieval
(CBIR), image database, feature database,
similarity, query image, pyramidal wavelet
transform, tree structured wavelet transform.
1. INTRODUCTION
Worldwide networking allows us to communicate, share, and learn information in a global manner. Digital libraries and multimedia databases are growing rapidly, so efficient search algorithms need to be developed. Retrieval of image data has traditionally been based on the human insertion of some text describing the scene, which can then be searched using keyword based methods. However, in many situations this traditional approach is inadequate. In fact, it is quite rare for complete text annotation to be available, because it would entail describing every color, texture, shape, and object within the image. Another difficulty with text annotation is the vast amount of labor required to annotate a huge image collection manually. An image, as the saying goes, is worth a thousand words. So, instead of being manually annotated with text-based keywords, images could be indexed by their own visual content, such as color, texture and shape, and researchers have therefore turned their attention to content-based retrieval methods [1]. However, research on content-based image retrieval is still in its very early stages.
Search techniques can be based on many features such as colour, shape, and texture, but in this paper we concentrate only on the problem of finding good texture features. The main texture features currently used are derived from either Gabor wavelets or the conventional discrete wavelet transform. Extensive experiments on a large set of textured images show that retrieval performance is better using Gabor filters than conventional orthogonal wavelets [2]. However, the Gabor wavelet transform suffers from three main disadvantages:
1. It requires a large amount of storage, because Gabor functions do not form an orthogonal basis set and hence the representation is not compact.
2. No efficient algorithms exist for computing the forward and inverse transformations, which is important in a content-based image retrieval context.
3. The computational time required for feature extraction is quite high, which limits the retrieval speed; data retrieval should be fast and should perform well in real time.
In sharp contrast, orthogonal wavelets require less storage space, since the number of transform coefficients is exactly the same as the number of pixels in the original image; fast algorithms exist for computing the forward and inverse wavelet transforms; and the computational time required for feature extraction is very low, which increases the retrieval speed.
Manjunath and Ma [2] have compared the Gabor wavelet, the discrete wavelet transform (DWT) using 16-tap Daubechies coefficients, and the Multi-Resolution Simultaneous Autoregressive Model (MR-SAR), considering the mean and standard deviation as feature parameters.
The main contributions of this paper are summarized as follows. Wavelet based texture features for content-based image retrieval are proposed. Daubechies 8-tap coefficients are used for both tree structured wavelet decomposition and pyramidal wavelet decomposition. A large texture database of 1,856 images is used to check the retrieval performance. A detailed comparison of the performance of features such as mean, standard deviation, energy, and all possible combinations of these, using the Manhattan and Euclidean distance functions, is presented. Amongst all these feature parameters we found that the combination of standard deviation and energy gives the best retrieval results, while the mean performs very poorly. Further, although the Euclidean (L2) distance is probably the most commonly used distance function or measure of dissimilarity between feature vectors, we observed that it is not always the best metric. Because the distances in each dimension are squared before summation, great emphasis is placed on those features for which the dissimilarity is large. A more moderate approach is to use the sum of the absolute differences in each feature, rather than their squares, as the overall measure of dissimilarity. This sum of absolute distances in each dimension is sometimes called the L1 distance or the Manhattan distance.
The paper is organized as follows. In section 2, the pyramidal and tree structured wavelet transforms are discussed in brief. The image retrieval procedure is described in section 3. Experimental results are provided in section 4, followed by the conclusions.

2. WAVELET TRANSFORM
In this section we will discuss pyramidal and
tree structured wavelet transform in brief.
2.1 Pyramidal wavelet transform
The images in a database are likely to be stored
in a compressed form. Superior indexing
performance can therefore be obtained if the
properties of the coding can be exploited in the
indexing technique. Recently, the discrete wavelet transform has become popular in image coding applications [3, 4]. Wavelets provide
multiresolution capability, good energy
compaction and adaptability to human visual
characteristics.
The wavelet transform represents a function as a superposition of a family of basis functions called wavelets. A set of basis functions can be generated by translating and dilating the mother wavelet corresponding to a particular basis. The signal is passed through a low pass and a high pass filter, and the output of each filter is decimated by two. Thus, the wavelet transform extracts information from the signal at different scales. For reconstruction, the coefficients are upsampled and passed through another set of low pass and high pass filters.
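The filter-and-decimate analysis step and its synthesis inverse can be sketched as follows. This is a minimal illustration using the 2-tap Haar filter pair for brevity (the paper itself uses 8-tap Daubechies filters); the function names are hypothetical.

```python
import numpy as np

def dwt1d_haar(x):
    """One level of the 1-D DWT with the 2-tap Haar filter pair.

    The signal is passed through a low pass (averaging) and a high pass
    (differencing) filter, and each output is decimated by two.
    """
    x = np.asarray(x, dtype=float)
    low = (x[0::2] + x[1::2]) / np.sqrt(2.0)   # approximation coefficients
    high = (x[0::2] - x[1::2]) / np.sqrt(2.0)  # detail coefficients
    return low, high

def idwt1d_haar(low, high):
    """Inverse step: upsample and combine through the synthesis filters."""
    x = np.empty(2 * len(low))
    x[0::2] = (low + high) / np.sqrt(2.0)
    x[1::2] = (low - high) / np.sqrt(2.0)
    return x

signal = np.arange(8, dtype=float)
low, high = dwt1d_haar(signal)
print(np.allclose(idwt1d_haar(low, high), signal))  # perfect reconstruction
```

Because the filters are orthogonal, the number of coefficients equals the number of samples and reconstruction is exact, which is the storage advantage over Gabor features noted above.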
The 2-D DWT is generally calculated using a separable approach. Fig. 1 shows a three level pyramidal wavelet decomposition of an image S1 of size a×b pixels. In the first level of decomposition, one low pass subimage (S2) and three orientation selective high pass subimages (W2H, W2V, W2D) are created. In the second level of decomposition, the low pass subimage is further decomposed into one low pass and three high pass subimages (W4H, W4V, W4D). The process is repeated on the low pass subimage to form higher levels of the wavelet decomposition. In other words, the DWT decomposes an image into a pyramid structure of subimages with various resolutions corresponding to the different scales. We note that a three-stage decomposition creates three low pass subimages and nine high pass directional subimages (three each in the horizontal, vertical, and diagonal directions). The low pass subimages are low-resolution versions of the original image at different scales. The horizontal, vertical and diagonal subimages provide information about the changes in the corresponding directions.

Figure 1. Pyramidal wavelet transform: three-level decomposition of the original a×b image into low pass subimages S2, S4, S8 (of sizes a/2×b/2, a/4×b/4, a/8×b/8) and directional high pass subimages W2H, W2V, W2D through W8H, W8V, W8D.
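The recursive decomposition of the low pass subimage can be sketched as below. This is an illustrative sketch using the separable Haar step rather than the paper's 8-tap Daubechies filters, and the helper names are hypothetical.

```python
import numpy as np

def dwt2d_haar(img):
    """One separable 2-D Haar step: filter and decimate along rows, then
    columns, giving the low pass subimage S and the horizontal, vertical
    and diagonal detail subimages (naming convention assumed)."""
    lo = (img[:, 0::2] + img[:, 1::2]) / np.sqrt(2.0)  # rows, low pass
    hi = (img[:, 0::2] - img[:, 1::2]) / np.sqrt(2.0)  # rows, high pass
    s  = (lo[0::2] + lo[1::2]) / np.sqrt(2.0)  # low-low
    wh = (hi[0::2] + hi[1::2]) / np.sqrt(2.0)  # horizontal detail
    wv = (lo[0::2] - lo[1::2]) / np.sqrt(2.0)  # vertical detail
    wd = (hi[0::2] - hi[1::2]) / np.sqrt(2.0)  # diagonal detail
    return s, wh, wv, wd

def pyramidal_decompose(img, levels=3):
    """Recursively decompose only the low pass subimage, keeping the low
    pass band at every level, so three levels yield 4 x 3 = 12 subbands."""
    subbands, s = [], img
    for _ in range(levels):
        s, wh, wv, wd = dwt2d_haar(s)
        subbands += [s, wh, wv, wd]
    return subbands

bands = pyramidal_decompose(np.random.rand(128, 128), levels=3)
print(len(bands))  # 12 subbands for a three-level decomposition
```

Each level halves the subimage side, so a 128×128 pattern yields 64×64, 32×32 and 16×16 subbands, matching the pyramid of Fig. 1.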


2.2 Tree structured wavelet transform
The traditional pyramid-type wavelet transform recursively decomposes subsignals in the low frequency channels. However, since the most significant information of a texture often appears in the middle frequency channels, further decomposition of the lower frequency region alone, as the conventional pyramidal wavelet transform does, may not help much for the purpose of classification [5]. Thus, an appropriate way to perform the wavelet transform for textures is to select the significant frequency channels and then decompose them further. The key difference between this algorithm and the traditional pyramid algorithm is that the decomposition is no longer applied only to the low frequency subsignals recursively. Instead, it can be applied to the output of any of the filters hLL, hLH, hHL, or hHH. It is usually unnecessary and expensive to decompose all subsignals at each scale to achieve a full decomposition. Chang et al. [5] used a criterion of measuring the energy of image subbands at each level to decide whether to decompose a band further and thus avoid a full decomposition: if the energy of a subimage is significantly smaller than that of the others, that region is not decomposed further, since it contains less information. They used a second stopping criterion, namely that the size of the smallest subimage should not be less than 16×16.

3. IMAGE RETRIEVAL PROCEDURE
In this section we discuss the texture image database, feature database creation, and the image retrieval method. The general architecture of CBIR can be found in [6].

3.1 Texture image database
The texture database used in the experimentation consists of 116 different texture classes. We have used 108 textures from the Brodatz album [7], seven textures from the USC database, and one artificial texture. Each texture is of size 512×512, and each 512×512 image is divided into sixteen 128×128 non-overlapping subimages, creating a database of 1,856 patterns.
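The construction of the pattern database from each texture image can be sketched as follows; `tile_image` is a hypothetical helper, not a name from the paper.

```python
import numpy as np

def tile_image(img, tile=128):
    """Split an image into non-overlapping tile x tile subimages, as done
    to turn each 512x512 texture into sixteen 128x128 patterns
    (116 classes x 16 = 1,856 patterns in total)."""
    h, w = img.shape
    return [img[r:r + tile, c:c + tile]
            for r in range(0, h, tile)
            for c in range(0, w, tile)]

texture = np.zeros((512, 512))  # stand-in for one Brodatz texture
patterns = tile_image(texture)
print(len(patterns), patterns[0].shape)  # 16 (128, 128)
```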

3.2 Feature database creation
Daubechies orthogonal wavelet 8-tap filter coefficients [8] are used for computing the pyramidal wavelet transform (PWT) and the tree structured wavelet transform (TWT). Each 128×128 image pattern is decomposed into three levels. In the PWT, three levels give 4×3 = 12 wavelet transform bands. In the experimentation we have used the feature parameters mean, standard deviation and energy, and all possible combinations of these parameters. These feature parameters, computed for each of the subbands at each decomposition level, are used to construct a feature vector. The length of the feature vector is therefore (12 × number of feature parameters used in the combination) elements.
In the TWT, Chang et al. [5] considered the energy of the image subbands at each level and a minimum subband size of 16×16 as the stopping criteria for further decomposition of a subband, which results in a different structure for different patterns. For content-based image retrieval applications it is convenient to have a fixed structure. A fixed decomposition tree can be obtained by sequentially decomposing the LL, LH, HL and HH subbands. A tree-structured wavelet transform with a three level decomposition then results in 4×(1+4+16) = 84 subbands. A feature vector is constructed as in the PWT; its length is (84 × number of feature parameters used in the combination) elements. To create the feature database, the above procedure is repeated for all images in the image database.
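The fixed three-level tree decomposition and the feature vector construction can be sketched as below. This illustrative sketch uses Haar filters rather than the paper's 8-tap Daubechies filters, and the energy definition (mean absolute coefficient value, as used by Manjunath and Ma [2]) is an assumption, since the paper does not state one.

```python
import numpy as np

def dwt2d_haar(img):
    # one separable 2-D Haar step -> 4 subbands (LL, LH, HL, HH)
    lo = (img[:, 0::2] + img[:, 1::2]) / np.sqrt(2.0)
    hi = (img[:, 0::2] - img[:, 1::2]) / np.sqrt(2.0)
    return [(lo[0::2] + lo[1::2]) / np.sqrt(2.0),
            (lo[0::2] - lo[1::2]) / np.sqrt(2.0),
            (hi[0::2] + hi[1::2]) / np.sqrt(2.0),
            (hi[0::2] - hi[1::2]) / np.sqrt(2.0)]

def fixed_twt(img, levels=3):
    """Fixed tree-structured decomposition: every subband at every level
    is decomposed further, and all intermediate subbands are kept
    (4 + 16 + 64 = 84 subbands for three levels)."""
    current, kept = [img], []
    for _ in range(levels):
        nxt = []
        for band in current:
            nxt += dwt2d_haar(band)
        kept += nxt
        current = nxt
    return kept

def features(subbands, use=("std", "energy")):
    """Feature vector: the selected parameters for each subband, giving
    (number of subbands) x (number of parameters) elements."""
    stats = {"mean": np.mean, "std": np.std,
             "energy": lambda b: np.mean(np.abs(b))}  # assumed definition
    return np.array([stats[p](b) for b in subbands for p in use])

bands = fixed_twt(np.random.rand(128, 128), levels=3)
fv = features(bands, use=("std", "energy"))
print(len(bands), fv.shape)  # 84 (168,)
```

The same `features` helper applied to the 12 PWT subbands would give a 12 × 2 = 24 element vector for the standard deviation plus energy combination.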
3.3 Image Retrieval Method
A query pattern is any one of the 1,856 patterns from the image database. The pattern is processed to compute its feature vector as in section 3.2. The Manhattan (L1) and Euclidean (L2) distance metrics are then used to compute the similarity for a given pair of images, and are given by:

D_qi^M = \sum_{j=1}^{n} | f_{qj} - f_{ij} |        (1)

D_qi^E = \sqrt{ \sum_{j=1}^{n} ( f_{qj} - f_{ij} )^2 }        (2)

where f_{qj} and f_{ij} are the feature vectors of the query and database image respectively, and n is the length of the feature vector. It is obvious that the distance of an image from itself is zero. The distances are sorted in increasing order and the closest sets of patterns are retrieved. In the ideal case, all of the top 16 retrievals are from the same large image. Performance is measured in terms of the average retrieval rate, defined as the average percentage of patterns belonging to the same image as the query pattern among the top 16 matches.
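The two distance metrics of Eqs. (1) and (2) and the ranking step can be sketched as follows; the helper names are hypothetical.

```python
import numpy as np

def manhattan(q, f):
    """L1 distance of Eq. (1): sum of absolute feature differences."""
    return np.sum(np.abs(q - f))

def euclidean(q, f):
    """L2 distance of Eq. (2): root of the sum of squared differences."""
    return np.sqrt(np.sum((q - f) ** 2))

def retrieve(query, database, dist=manhattan, top=16):
    """Rank all database feature vectors by increasing distance to the
    query and return the indices of the closest `top` patterns."""
    d = [dist(query, f) for f in database]
    return np.argsort(d)[:top]

rng = np.random.default_rng(0)
db = rng.random((100, 24))          # 100 feature vectors of length 24
ranks = retrieve(db[7], db, dist=manhattan, top=16)
print(ranks[0])  # the query itself is at distance zero -> index 7
```

Since ranking only compares distances, dropping the square root in Eq. (2) would give the same retrieval order; the choice between L1 and L2 matters because of how each weights large per-feature differences.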
4. EXPERIMENTAL RESULTS
Retrieval results for a sample query image are shown in Fig. 2. Table 1 provides a detailed comparison of the performance of the different features used in CBIR. We observe that the combination of standard deviation and energy as feature parameters improves the retrieval performance considerably for the TWT, while standard deviation alone gives the best performance in the case of the PWT. From Table 1 it is also clear that the Manhattan distance gives better results than the Euclidean distance function. With these best feature parameters, the average retrieval performance over the top 16 retrieved images is 69.29% using tree structured wavelet decomposition and 56.79% using pyramidal wavelet decomposition. Retrieval performance could be improved further by using a smoother wavelet function such as a cosine-modulated wavelet.
Fig. 3 shows the retrieval performance as a function of the number of top matches considered. The performance increases up to 91.65% if the top 116 retrievals (6% of the database) are considered.

Figure 2. Query image and the top twenty images retrieved from the database of 1,856 images.
TABLE 1
Comparison of the performance of different features for content based image retrieval.

                                      Tree Structured Wavelet    Pyramidal Wavelet
                                           Decomposition           Decomposition
Feature                               Manhattan    Euclidean   Manhattan    Euclidean
                                      Distance     Distance    Distance     Distance
Mean                                   13.74%       13.36%      14.22%       14.22%
Standard deviation                     68.97%       63.20%      49.95%       56.79%
Energy                                 65.63%       53.18%      45.80%       41.76%
Mean + Standard Deviation              45.69%       43.64%      45.69%       43.46%
Mean + Energy                          57.87%       46.55%      35.78%       35.88%
Standard Deviation + Energy            60.78%       53.50%      48.55%       69.29%
Mean + Standard Deviation + Energy     66.27%       56.30%      46.77%       43.91%

Figure 3. Retrieval performance as a function of the number of top matches considered.

5. CONCLUSIONS
Content-based image retrieval using wavelet based texture features has been proposed, using both tree structured wavelet decomposition and pyramidal wavelet decomposition. A large texture database of 1,856 images was used to check the retrieval performance. A detailed comparison of the performance of features such as mean, standard deviation, energy, and all possible combinations of these, using the Manhattan and Euclidean distance functions, was presented. Amongst all these feature parameters we found that, for tree structured wavelet decomposition, the combination of standard deviation and energy gives the best retrieval results, while the mean, alone or in combination with other features, performs poorly. In the case of pyramidal wavelet decomposition, standard deviation alone gives the best result. Retrieval results using tree structured wavelet decomposition are better than those using pyramidal wavelet decomposition; its only disadvantage is that it increases the feature vector length, which increases retrieval time.
Although the Euclidean (L2) distance is probably the most commonly used distance function or measure of dissimilarity between feature vectors, we observed that it is not always the best metric. Because the distances in each dimension are squared before summation, great emphasis is placed on those features for which the dissimilarity is large. Compared to the L2 metric, the Manhattan (L1) distance, which takes the sum of the absolute differences in each feature rather than their squares, is a more moderate approach. The retrieval performance increases up to 91.65% if the top 116 images (6% of the database) are considered.
6. REFERENCES
[1] V.N. Gudivada and V.V. Raghavan, "Finding the Right Image, Content Based Image Retrieval Systems," Computer, IEEE Computer Society, pp. 18-62, Sept. 1995.
[2] B.S. Manjunath and W.Y. Ma, "Texture Features for Browsing and Retrieval of Image Data," IEEE Trans. PAMI, Vol. 18, No. 8, pp. 837-842, Aug. 1996.
[3] J. Woods, Subband Image Coding, Kluwer Academic Publishers, 1991.
[4] M. Antonini et al., "Image Coding Using Wavelet Transform," IEEE Trans. Image Processing, Vol. 1, No. 2, April 1992.
[5] T. Chang and C.C. Kuo, "Texture Analysis and Classification with Tree-Structured Wavelet Transform," IEEE Trans. Image Processing, Vol. 2, No. 4, pp. 429-441, 1993.
[6] Manesh Kokare, B.N. Chatterji and P.K. Biswas, "A Survey on Current Content Based Image Retrieval Methods," IETE Journal of Research, Vol. 48, No. 3&4, pp. 261-271, May-Aug. 2002.
[7] P. Brodatz, Textures: A Photographic Album for Artists & Designers, New York: Dover, 1966.
[8] I. Daubechies, "The Wavelet Transform, Time-Frequency Localization and Signal Analysis," IEEE Trans. Information Theory, Vol. 36, pp. 961-1005, Sept. 1990.
