Data Hiding in Audio Based on Audio-to-Image Wavelet Transform and Vector Quantization
Abstract-This paper presents a new approach for hiding information in digital audio based on the audio-to-image wavelet transform (A2IWT) and vector quantization (VQ). In our scheme, the cover audio signal is first transformed into an image by applying the wavelet transform and re-sampling the coefficients; the secret data are then embedded in the obtained image using a VQ-based image steganographic scheme; finally, the image together with the remaining wavelet coefficients is inversely transformed back to a stego audio signal. Experimental results show that the proposed scheme is effective and efficient.

Keywords- audio steganography; vector quantization; wavelet transform; audio-to-image data hiding

I. INTRODUCTION

Nowadays, with the rapid development of the Internet and communication technologies, more and more personal information is transmitted over the Internet or other digital transmission media, and information security has thus become one of the most important issues of the digital age. Traditionally, cryptographic approaches are used to prevent sensitive information from being accessed by unauthorized persons. However, cryptography is sometimes not strong enough and not flexible enough for all application environments. Once the cipher code is stolen or cracked, the sensitive private information is no longer protected at all. It is therefore necessary to use alternative methods or to introduce effective supplements to the cryptographic approaches. Among these schemes, steganography has drawn more and more attention as a powerful tool for increasing security in data transfer and archiving. The main idea of steganography is to hide the existence of the information by embedding it into another digital medium such as an image, audio, video or binary document [1]. It differs from cryptography, which is designed to scramble the content of the sensitive information.

An effective and efficient data hiding algorithm should satisfy the following requirements [2-4]: (1) Secrecy: the hidden information should not be extractable from the host signal without the proper secret key, and the existence of the secret message should not arouse suspicion under any circumstance. (2) Imperceptibility: the stego signal carrying the secret message should be indistinguishable from the original one, so that it does not arouse suspicion about the existence of the secret message within the cover signal. (3) High capacity: the maximum length of the secret message that can be embedded should be as long as possible. (4) Robustness: the secret data should survive when the host medium is subjected to normal manipulations, for example, lossy compression. However, because the main aim of steganography is to transmit data effectively and secretly, this requirement is not as important as the others. (5) Accurate extraction: the extraction of the secret data from the host media should be as accurate and reliable as possible.

In a steganographic application, the secret information is embedded within another object called the "cover-object" to form the final object called the "stego-object", which can then be transmitted or stored. This procedure hides the existence of the secret data and even the fact of its transmission. Recently, various steganographic techniques for digital audio have been developed for various purposes [2-10]. Most of them exploit the characteristics of the human auditory system (HAS). In [2-4], audio steganographic algorithms are proposed that modify the least significant bits (LSB) of the audio samples in the temporal domain or of the coefficients in the transform domain. These algorithms use the LSB technique and can also be combined with other techniques such as error diffusion [2], minimum error replacement [3] and the temporal masking effect [4]. Other methods embed secret messages in the LSB of the coefficients of wavelet [5] or integer wavelet transforms [4]. The general aims of the LSB-based techniques are maximum payload and minimal degradation of the host audio signals. In [6], the LSB-based technique is improved to enhance the robustness against additive noise attacks. Because of the intuitive and redundant nature of images, many steganography methods for images have been developed, and they are more mature than those for other media. In [7, 8], steganography methods for audio that use image steganography techniques are proposed. The main aim of these methods is to increase the robustness against MPEG compression. In [9], spread spectrum-based and phase shifting-based techniques are combined to increase the robustness against additive noise for aerial data transmission. In [10] perceptually
that based on this characteristic, secret data can be hidden in the compression codes without inducing additional coding distortion. Specifically, the receiver determines whether each bit of the secret data is '0' or '1' based on whether the received compression code is an SOC or OIV code. In this scheme, four types of code translations have to be taken into consideration. Two of the four types add additional bits to the compression codes. This restricts the hiding capacity for secret data, since each block of the cover image has a maximum hiding capacity of only one bit.

The parameters of the proposed scheme, along with the secret key for permuting the secret data, can serve as secret keys for the scheme. These parameters include the specific wavelet transform used for audio signal decomposition, the set of detail coefficients chosen to embed the secret data in, and the wavelet coefficient sampling interval Δ. Moreover, the parameters of the VQ-based image steganographic scheme employed can serve as additional keys to increase the secrecy of the final audio steganographic scheme. Without the proper keys, an unauthorized person will not be able to extract the secret data from the stego audio signal.

The wavelet coefficients may also be sampled at irregular intervals by using a pseudo-random number generator. This generator can be the same as the one used to permute the secret data. The same pseudo-random number generator needs to be used for extraction. This method also increases the secrecy of the steganographic scheme.

V. EXPERIMENTAL RESULTS

In this section, some experimental results of the proposed scheme are presented.

The performance of the proposed scheme is influenced by a number of parameters, as discussed in Section IV. The choice of the set of detail coefficients cd_x used for information hiding affects the perceptibility, the embedding capacity and the robustness against MP3 compression. Experimental results show that the set of detail coefficients at resolution level 1 offers a good compromise between imperceptibility and robustness against MP3 compression and also has the largest embedding capacity (partly because of the number of coefficients in each level). The number of duplications n, the codeword dimension of the VQ-based steganographic method and the wavelet coefficient sampling interval Δ affect the embedding capacity. The parameters of the VQ-based image steganographic scheme affect imperceptibility and robustness against MP3 compression. An important parameter of the VQ-based image steganographic scheme is the dimension of the codeword (or the size of the image blocks). Larger image blocks lead to lower capacity and higher resistance to MP3 compression, but lower imperceptibility.

Because the capacity of the proposed scheme can be accurately calculated, experiments are performed to measure the imperceptibility and the robustness of the scheme against MP3 compression. The proposed scheme is applied to 10 sample audio signals, each a 50-second WAV file sampled at 44.1 kHz and quantized to 16 bits per sample. The experiments use the Daubechies-5 wavelet to decompose the audio signals. The set of detail coefficients at resolution level 1 is chosen to be transformed into an image. A random sequence of 256 bits is embedded in each audio signal. The number of duplications n is set to 50 and the wavelet coefficient sampling interval is set to 1.

TABLE I
SNR AND BIT ERROR UNDER PARAMETER SETTING 1 (DIMENSION=256)

                          Number of Bit Errors
Audio Signal   SNR (dB)   128 kbps   160 kbps   192 kbps   No compression
Classic1       23.21      12         0          0          0
Classic2       39.80      103        3          0          0
Folk1          28.43      15         0          0          0
Folk2          29.56      29         0          0          0
Jazz1          20.26      0          0          0          0
Jazz2          22.87      0          0          0          0
Pop1           32.72      56         0          0          0
Pop2           38.50      102        5          2          0
Rock1          29.46      0          0          0          0
Rock2          28.30      117        0          0          0

TABLE II
SNR AND BIT ERROR UNDER PARAMETER SETTING 2 (DIMENSION=64)

                          Number of Bit Errors
Audio Signal   SNR (dB)   128 kbps   160 kbps   192 kbps   No compression
Classic1       27.45      38         13         0          0
Classic2       48.18      127        106        96         0
Folk1          36.23      32         20         0          0
Folk2          34.15      27         6          0          0
Jazz1          23.36      0          0          0          0
Jazz2          25.43      0          0          0          0
Pop1           43.68      89         67         38         0
Pop2           46.52      116        102        121        0
Rock1          36.39      0          0          0          0
Rock2          32.12      3          0          0          0

The performance of the proposed scheme is measured under two different parameter settings. The SNR and the number of bit errors under different MP3 compression levels for parameter settings 1 and 2 are shown in Table I and Table II, respectively. Under parameter setting 1, the codeword dimension (or the size of image blocks) of the VQ-based image steganographic scheme is set to a higher value, while under parameter setting 2, the dimension is set to a lower value. From the two tables, it can be seen that a higher dimension generally leads to fewer bit errors under the different MP3 compression levels. The SNR values, however, are also lower, implying that more perceptible modifications have been made. The difference in perceptibility is also confirmed by subjective listening. The
modifications made by the scheme under parameter setting 1 are audible for some audio signal samples. In contrast, under parameter setting 2, the modifications are almost imperceptible for all audio signal samples.

The experimental results show that the proposed scheme exhibits some resistance to MP3 compression. This resistance, however, also depends on the nature of the audio signals themselves. As the tables show, the secret data can be extracted from some audio signals with zero bit errors, even after the audio signals have been compressed down to 128 kbps. For some other audio signals, the extracted data exhibit a relatively higher bit error rate.

The embedding capacity depends mainly on the number of duplications n, the codeword dimension and the wavelet coefficient sampling interval Δ. A higher number of duplications n leads to lower embedding capacity but also lowers the bit error rate. A larger codeword dimension causes lower embedding capacity and lower stego image quality but improves the robustness against MP3 compression. A larger sampling interval Δ causes lower embedding capacity but improves imperceptibility, since fewer wavelet coefficients have been modified. For a number of duplications n=1, a codeword dimension of 64 and a sampling interval Δ=1, about 28000 bits can be embedded within a 50-second audio signal sampled at 44.1 kHz and quantized to 16 bits per sample. This is equivalent to 560 bps. Although this number looks quite large, embedding secret data with n=1 will certainly exhibit low resistance against MP3 compression. For n and Δ values other than 1, under the condition that the codeword dimension equals 64, the embedding capacity can be evaluated by the following formula.

\[ \text{Capacity} \approx \frac{560}{n \times \Delta}\ \text{bps} \qquad (3) \]

VI. CONCLUSIONS

In this paper, we proposed an audio steganographic scheme based on the audio-to-image wavelet transform. The proposed scheme employs an existing VQ-based image steganographic scheme to embed secret data within audio signals. The experimental results show that the proposed scheme exhibits some resistance to MP3 compression.

ACKNOWLEDGMENT

This work is supported by the National Natural Science Foundation of China under Grant No. 61171150 and No. 61003255.

REFERENCES

[1] N. Provos, P. Honeyman, “Hide and seek: an introduction to steganography,” Security & Privacy, IEEE Magazine, Vol. 1, Issue 3, pp. 32-44, May-June 2003.
[2] N. Cvejic, T. Seppanen, “Increasing robustness of LSB audio steganography using a novel embedding method,” In Proc. IEEE Int. Conf. Info. Tech.: Coding and Computing, Vol. 2, pp. 533-537, April 2004.
[3] N. Cvejic, T. Seppanen, “Increasing the capacity of LSB-based audio steganography,” IEEE Workshop on Multimedia Signal Processing, pp. 336-338, 2002.
[4] S.S. Agaian, D. Akopian, O. Caglayan, S. A. D Souza, “Lossless Adaptive Digital Audio Steganography,” In Proc. IEEE Int. Conf. Signals, Systems and Computers, pp. 903-906, November 2005.
[5] N. Cvejic, T. Seppanen, “A wavelet domain LSB insertion algorithm for high capacity audio steganography,” In Proc. IEEE Digital Signal Processing Workshop, Callaway Gardens, GA, pp. 53-55, October 2002.
[6] K. Gopalan, “Audio steganography using bit modification,” Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing, Vol. 2, pp. 421-424, April 2003.
[7] P. Bao and X. Ma, “MP3-Resistant Music Steganography based on Dynamic Range Transform,” IEEE Int. Sym. Intelligent Signal Processing and Communication Systems, pp. 266-271, Nov. 18-19, 2004, Seoul, Korea.
[8] R. A. Santosa, P. Bao, “Audio-to-image wavelet transform based audio steganography,” IEEE Int. Symp., pp. 209-212, June 2005, Zadar, Croatia.
[9] H. Matsuka, “Spread Spectrum Audio Steganography using Sub-band Phase Shifting,” IEEE Int. Conf. Intelligent Information Hiding and Multimedia Signal Processing (IIHMSP '06), pp. 3-6, Dec. 2006, Pasadena, CA, USA.
[10] K. Gopalan, “Audio steganography by cepstrum modification,” In Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing, Vol. 5, pp. 481-484, March 2005.
[11] Chin-Chen Chang, Guei-Mei Chen and Min-Hui Lin, “Information hiding based on search-order coding for VQ indices,” Pattern Recognition Letters, Vol. 25, pp. 1253-1261, Mar. 2004.
[12] Shih-Chieh Shie and Shinfeng D. Lin, “Data hiding based on compressed VQ indices of images,” Computer Standards & Interfaces (2008), doi:10.1016/j.csi.2008.12.003.
[13] Chin-Chen Chang and Tzu-Chuen Lu, “Reversible index-domain information hiding scheme based on side-match vector quantization,” The Journal of Systems and Software, Vol. 79, pp. 1120-1129, 2006.
[14] Yi-Pei Hsieh, Chin-Chen Chang and Li-Jen Liu, “A two-codebook combination and three-phase block matching based image-hiding scheme with high embedding capacity,” Pattern Recognition, Vol. 41, pp. 3104-3113, Mar. 2008.
[15] Wen-Yuan Chen and Chin-Hsing Chen, “Public-key image steganography using discrete cosine transform and quadtree partition vector quantization coding,” Optical Engineering, Vol. 42, pp. 2886-2892, Oct. 2003.
[16] Yung-Kuei Chiang and Piyu Tsai, “Steganography using overlapping codebook partition,” Signal Processing, Vol. 88, pp. 1203-1215, 2008.
[17] C. H. Hsieh and J. C. Tsai, “Lossless compression of VQ index with search-order coding,” IEEE Trans. Image Processing, Vol. 5, No. 11, pp. 1579-1582, 1996.
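
As an illustrative check of the capacity formula (3) in Section V, the following minimal Python sketch evaluates the approximate capacity for a given number of duplications n and sampling interval Δ. The function names and the default of 560 bps are assumptions made for this illustration; the 560 bps figure is the value reported in Section V for a codeword dimension of 64, and the sketch does not apply to other dimensions.

```python
def capacity_bps(n: int, delta: int, base_bps: float = 560.0) -> float:
    """Approximate embedding capacity in bits per second, following Eq. (3).

    base_bps is the capacity reported for n = 1 and delta = 1 with a
    codeword dimension of 64 (assumed fixed in this sketch).
    """
    if n < 1 or delta < 1:
        raise ValueError("n and delta must be positive integers")
    return base_bps / (n * delta)


def capacity_bits(n: int, delta: int, duration_s: float,
                  base_bps: float = 560.0) -> float:
    """Approximate number of bits that fit in a signal of duration_s seconds."""
    if n < 1 or delta < 1:
        raise ValueError("n and delta must be positive integers")
    return base_bps * duration_s / (n * delta)
```

Under these assumptions, capacity_bits(1, 1, 50) reproduces the roughly 28000 bits quoted for a 50-second signal, and capacity_bits(50, 1, 50) gives about 560 bits, which is enough room for the 256-bit random sequence used in the experiments with n=50 and Δ=1.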
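
The keyed permutation of the secret data and the pseudo-random coefficient sampling described at the end of Section IV could be sketched as follows. This is only a sketch under stated assumptions: the paper does not name a generator, so Python's standard `random.Random` stands in for it (it is not cryptographically secure), and the helper names `permute_bits`, `unpermute_bits` and `sample_positions` are hypothetical.

```python
import random


def permute_bits(bits, key):
    """Reorder the secret bits using a permutation derived from the key."""
    order = list(range(len(bits)))
    random.Random(key).shuffle(order)  # the same key yields the same order
    return [bits[i] for i in order]


def unpermute_bits(permuted, key):
    """Invert permute_bits by regenerating the same permutation from the key."""
    order = list(range(len(permuted)))
    random.Random(key).shuffle(order)
    original = [0] * len(permuted)
    for pos, src in enumerate(order):
        original[src] = permuted[pos]
    return original


def sample_positions(num_coeffs, count, key):
    """Pick `count` irregularly spaced coefficient indices (assumed scheme)."""
    return sorted(random.Random(key).sample(range(num_coeffs), count))
```

Because both sides seed the generator with the same key, the receiver regenerates exactly the same permutation and sampling positions, which is why the same pseudo-random number generator has to be used for extraction.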