Digital Watermarking of Text, Image, and Video Documents: Graphics In/for Digital Libraries
$\sqrt{E/N}$. Hence, the transmitted signal is $s[n] = \sqrt{E/N}\, m[n]\, c[n]$; this signal is embedded as a watermark. Figure 2 gives a block diagram of the embedding process.
For watermarking, $s[n]$ is transmitted (embedded) by mapping $s[n]$ into changes in the original document. For example, in text watermarking, $s[n]$ could be used to perturb line spacings. In image watermarking, $s[n]$ might be added directly to pixel values or transform coefficients.
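The additive image case can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: each bit $b_i$ is assumed to be held constant over $N$ consecutive samples of $m[n]$, the host is a flat list of 8-bit pixel values, and a seeded `random.Random` stands in for a proper pseudo-noise generator. The names `spread` and `embed_additive` are hypothetical.

```python
import math
import random


def spread(bits, c, energy):
    """Return s[n] = sqrt(E/N) * m[n] * c[n], one period of c per bit.

    Assumption: m[n] equals bit b_i (+1 or -1) over the i-th run of N samples.
    """
    n_chips = len(c)
    amplitude = math.sqrt(energy / n_chips)
    return [amplitude * b * chip for b in bits for chip in c]


def embed_additive(pixels, s):
    """Add s[n] to the first len(s) pixels, rounding and clipping to 0..255."""
    if len(s) > len(pixels):
        raise ValueError("host has too few samples for this watermark")
    marked = list(pixels)
    for n, delta in enumerate(s):
        marked[n] = min(255, max(0, round(pixels[n] + delta)))
    return marked


# Illustrative stand-in for a pseudo-noise sequence (not cryptographically secure).
rng = random.Random(1234)
c = [rng.choice((-1, 1)) for _ in range(64)]

bits = [+1, -1, +1]
s = spread(bits, c, energy=256.0)   # amplitude sqrt(256 / 64) = 2
host = [128] * 256                  # hypothetical flat gray host
marked = embed_additive(host, s)
```

With these hypothetical values every marked pixel differs from the host by exactly 2 gray levels, while the energy carried by each bit is 256.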
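The Spreading Sequence. The carrier $c[n]$ is called the spreading sequence, and it has several special properties:

$$c[n] \in \{-1, 1\}, \quad \text{for any } n; \tag{1}$$
$$\sum_{n=0}^{N-1} c[n]\, c[n - Ni] = N, \quad \text{for any } i; \tag{2}$$
$$\frac{1}{N} \sum_{n=0}^{N-1} c[n] = 0; \tag{3}$$
$$\frac{1}{N} \sum_{n=0}^{N-1} c[n]\, c[n - k] = \delta[k], \quad \text{for } 0 \le k \le N - 1. \tag{4}$$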
In practice, these properties can be closely approximated, so we assume equality holds. Equation (1) means that $c[n]$ is binary-valued, although non-binary (e.g., Gaussian-distributed) sequences are also possible. Equations (2), (3), and (4) are referred to as the periodicity, zero-mean, and (periodic) autocorrelation properties, respectively.
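As a concrete check of properties (1) to (4), the following sketch builds a length-31 maximal-length sequence (an m-sequence) and evaluates the sums. The choice of an m-sequence, the recurrence $a[n] = a[n-3] \oplus a[n-5]$, and the function names are assumptions made for illustration; the text does not say which generator is used. For this sequence the properties hold only approximately: the sum over one period is -1 rather than 0, and the off-peak periodic autocorrelation sums are -1 rather than 0.

```python
def m_sequence(length=31, seed=(1, 0, 0, 0, 0)):
    """One period of a binary m-sequence mapped to {-1, +1}.

    Uses the recurrence a[n] = a[n-3] XOR a[n-5] (period 31 for any
    nonzero seed), then maps bit 0 -> +1 and bit 1 -> -1.
    """
    a = list(seed)
    while len(a) < length:
        a.append(a[-3] ^ a[-5])
    return [1 - 2 * bit for bit in a[:length]]


def periodic_autocorr(c, k):
    """Sum over one period of c[n] * c[n - k], with indices taken modulo N."""
    n_chips = len(c)
    return sum(c[n] * c[(n - k) % n_chips] for n in range(n_chips))


c = m_sequence()
N = len(c)

binary_valued = all(chip in (-1, 1) for chip in c)           # property (1)
full_period_shift = periodic_autocorr(c, N)                  # property (2), i = 1
mean = sum(c) / N                                            # property (3), approx. 0
peak = periodic_autocorr(c, 0) / N                           # property (4), k = 0
off_peak = [periodic_autocorr(c, k) / N for k in range(1, N)]  # property (4), k != 0
```

Property (2) holds by construction here because the sequence is extended periodically. The mean and the off-peak values all equal $-1/N$, which approaches zero as $N$ grows.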
The statistical behavior of $c[n]$ is similar to that of noise, although $c[n]$ is not a random process. For security reasons, $c[n]$ should be easy to generate with the proper key, but without the key, it should be difficult to reconstruct the complete signal $c[n]$ from only a short segment of it. Signals with these properties are known as pseudo-noise signals.
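One way to picture key-dependent generation is to expand a secret key with a keyed hash. The sketch below is an assumption for illustration only, not the generator used in the text: it applies HMAC-SHA256 to a running counter and turns each output bit into one chip. The function name `keyed_sequence` and the example keys are hypothetical, and no claim is made that the result satisfies properties (2) to (4) exactly.

```python
import hashlib
import hmac


def keyed_sequence(key: bytes, length: int):
    """Derive a {-1, +1} sequence of the given length from a secret key."""
    chips = []
    counter = 0
    while len(chips) < length:
        # One 32-byte block per counter value; each bit becomes one chip.
        block = hmac.new(key, counter.to_bytes(8, "big"), hashlib.sha256).digest()
        for byte in block:
            for shift in range(8):
                chips.append(1 if (byte >> shift) & 1 else -1)
        counter += 1
    return chips[:length]


c_first = keyed_sequence(b"secret key A", 1024)
c_again = keyed_sequence(b"secret key A", 1024)   # same key, same sequence
c_other = keyed_sequence(b"secret key B", 1024)   # different key
```

The same key always reproduces the same sequence, which is what lets the receiver hold its own copy of $c[n]$.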
The Watermark Channel. After transmission (embedding), $s[n]$ passes through a channel, which introduces various types of interference. The interference is typically modeled as additive white Gaussian noise (AWGN) $v[n]$ with variance $\sigma_v^2/N$. The received signal is thus $r[n] = s[n] + v[n]$. The channel and receiver are diagrammed in Fig. 3.
In watermarking, $v[n]$ includes interference from the original document, as well as attacks. An example of a text-document attack is printing and photocopying. Image and video watermarks could be subject to attacks such as compression with JPEG or MPEG, respectively. Although images do not conform to the AWGN model, the receiver can use pre-filtering to remove most of the correlated noise introduced by the original image [18]. For simplicity, we use the AWGN model in this discussion, although more sophisticated attacks are possible [10, 19].

Fig. 2. Spread spectrum watermark embedding example
Fig. 3. Spread spectrum watermark recovery example
Reception (Watermark Recovery). Given $r[n]$, a receiver (see Fig. 3) attempts to determine the message that was transmitted (i.e., recover the watermark). The receiver is assumed to have its own copy of the spreading sequence $c[n]$ and to be synchronized with the transmitter. The receiver uses a correlation detector, which computes $r_i = \sqrt{E/N} \sum_{n=0}^{N-1} r[n - Ni]\, c[n]$ for each $i$. From the properties of $c[n]$, we find that $r_i = E b_i + \sqrt{E/N} \sum_{n=0}^{N-1} v[n]\, c[n]$. If $r_i \ge 0$, the receiver decides $\hat{b}_i = +1$; otherwise, the receiver decides $\hat{b}_i = -1$.
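The correlation detector can be sketched end to end as follows. This is a minimal illustration, not the authors' implementation. It assumes that bit $b_i$ occupies the $i$-th run of $N$ consecutive received samples (the text's own index convention may differ), that a seeded `random.Random` stands in for the spreading-sequence generator, and that the per-sample noise standard deviation `sigma` is a free parameter. All function names are hypothetical.

```python
import math
import random


def make_signal(bits, c, energy):
    """s[n] = sqrt(E/N) * b_i * c[n mod N] over the i-th block of N samples."""
    amp = math.sqrt(energy / len(c))
    return [amp * b * chip for b in bits for chip in c]


def awgn(signal, sigma, rng):
    """Channel model r[n] = s[n] + v[n], with v[n] drawn from N(0, sigma^2)."""
    return [x + rng.gauss(0.0, sigma) for x in signal]


def correlate_and_decide(received, c, energy):
    """Correlate each block with c, scale by sqrt(E/N), and threshold at 0."""
    n_chips = len(c)
    amp = math.sqrt(energy / n_chips)
    stats, decisions = [], []
    for start in range(0, len(received) - n_chips + 1, n_chips):
        block = received[start:start + n_chips]
        r_i = amp * sum(r * chip for r, chip in zip(block, c))
        stats.append(r_i)
        decisions.append(+1 if r_i >= 0 else -1)
    return stats, decisions


rng = random.Random(7)
c = [rng.choice((-1, 1)) for _ in range(256)]   # stand-in spreading sequence
bits = [+1, -1, -1, +1, -1]
E = 4.0

s = make_signal(bits, c, E)
stats_clean, decoded_clean = correlate_and_decide(s, c, E)             # no noise
stats_noisy, decoded_noisy = correlate_and_decide(awgn(s, 0.5, rng), c, E)
```

In the noise-free case each statistic equals $E b_i$, which is the first term of the expression for $r_i$; with noise, the statistics scatter around those values and the sign test decides each bit.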
One measure of performance is the signal-to-noise ratio (SNR), which in this case is $\mathrm{SNR}_{\mathrm{std}} = E/\sigma_v^2$. Another measure is the probability of error ($P_E$), which is the probability that $b_i$ is incorrectly received ($\hat{b}_i \ne b_i$). For this scenario, $P_{E,\mathrm{std}} = Q\left(\sqrt{E/\sigma_v^2}\right)$, where $Q(x) = \frac{1}{\sqrt{2\pi}} \int_x^{\infty} e^{-y^2/2}\, dy$.
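The $Q$ function has no closed form, but it can be evaluated through the complementary error function using the identity $Q(x) = \tfrac{1}{2}\,\mathrm{erfc}(x/\sqrt{2})$. The sketch below uses Python's standard `math.erfc`; the function names and the example values of $E$ and $\sigma_v^2$ are hypothetical.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def error_probability_std(energy, noise_var):
    """P_E,std = Q(sqrt(E / sigma_v^2)), following the expression in the text."""
    return q_function(math.sqrt(energy / noise_var))


p_low_snr = error_probability_std(1.0, 1.0)    # SNR_std = 1, Q(1) is about 0.159
p_high_snr = error_probability_std(9.0, 1.0)   # SNR_std = 9, Q(3) is about 0.00135
```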
4. BENEFITS OF SPREAD SPECTRUM
For AWGN with unlimited power, SS performs no better than other modulation schemes. However, when the AWGN power $\sigma_v^2/N$ is limited, SS has several advantages, discussed below.
Imperceptibility. Note that $|s[n]| = \sqrt{E/N}$ for all $n$. By choosing $N$ sufficiently large, $\sqrt{E/N}$ can be made as small as desired, but the total power over $N$ samples remains $E$. The watermark can thus be transmitted (embedded) with a large total power $E$ via many low-amplitude changes.
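As a worked illustration with hypothetical numbers (not taken from the text), fix $E = 100$ and compare two spreading lengths. The per-sample amplitude drops by a factor of ten, while the total over $N$ samples is unchanged:

```latex
\begin{align*}
N = 100:\quad & |s[n]| = \sqrt{100/100} = 1, & \sum_{n=0}^{N-1} |s[n]|^2 &= 100 \cdot 1^2 = 100 = E, \\
N = 10^4:\quad & |s[n]| = \sqrt{100/10^4} = 0.1, & \sum_{n=0}^{N-1} |s[n]|^2 &= 10^4 \cdot 0.1^2 = 100 = E.
\end{align*}
```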
It can be shown that the power spectrum of $s[n]$ is $\Phi_{ss}(e^{j\omega}) = E$, which means that $s[n]$ behaves like white noise. There is no peak in the spectral domain to indicate to an unauthorized observer that transmission (embedding) has taken place. These properties allow a watermark to be imperceptible.
Security. Without the spreading sequence $c[n]$, it is impossible to recover the embedded message $m[n]$. Because $s[n]$ behaves like white noise and has low amplitude, an attacker will have great difficulty estimating $c[n]$ from the marked document. Even if the attacker can estimate a portion of $c[n]$, its pseudo-noise properties make it difficult to determine the entire spreading sequence $c[n]$. Therefore, the watermark is secure.
Robustness. An attacker does not know which elements of a marked document were altered by $s[n]$, nor does the attacker know the values of $s[n]$. Therefore, to jam the watermark, the attacker must alter every element of the marked document. However, the attacker cannot alter the marked document excessively; otherwise, the attacked document will no longer be valuable.
Knowledge of $c[n]$ gives an SS receiver a power advantage against limited-power jamming. This advantage is called the processing gain, $G_p = N$. With appropriate processing, the receiver has an effective SNR of $\mathrm{SNR}_{\mathrm{proc}} = G_p\, \mathrm{SNR}_{\mathrm{std}}$, or an effective $P_E$ of $P_{E,\mathrm{proc}} = Q\left(\sqrt{G_p E/\sigma_v^2}\right)$. For sufficiently large values of $N$, the watermark is robust against attacks. Additional robustness can be achieved by using error control coding (ECC) rather than directly modulating the original message $b_0 b_1 b_2 \ldots$.
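To see how quickly the processing gain drives down the error probability, the expression for $P_{E,\mathrm{proc}}$ can be tabulated for a few values of $N$. The sketch below is illustrative only: the value $E/\sigma_v^2 = 0.25$ and the function names are hypothetical, and $Q$ is evaluated with the standard identity $Q(x) = \tfrac{1}{2}\,\mathrm{erfc}(x/\sqrt{2})$.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def error_probability_proc(energy, noise_var, n_chips):
    """P_E,proc = Q(sqrt(G_p * E / sigma_v^2)) with processing gain G_p = N."""
    return q_function(math.sqrt(n_chips * energy / noise_var))


energy, noise_var = 0.25, 1.0    # hypothetical: SNR_std = 0.25
table = [(n, error_probability_proc(energy, noise_var, n)) for n in (1, 4, 16, 64)]

for n_chips, p_err in table:
    print(f"N = {n_chips:3d}   P_E,proc = {p_err:.3e}")
```

Setting $N = 1$ reproduces $P_{E,\mathrm{std}}$; each fourfold increase in $N$ doubles the argument of $Q$, so the error probability falls from roughly 0.31 at $N = 1$ to roughly $3 \times 10^{-5}$ at $N = 64$.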
Multiple Watermarks. It is possible to extend the example to consider multiple messages $m_k[n]$ and pseudo-noise sequences $c_k[n]$. The sequences can be designed to be mutually orthogonal, i.e.,
N1
n=0
c
k
[n] c