59-65-Speech Signal Enhancement Using Wavelet Threshold Methods
T. Sekhar, Asst. Professor, ECE, Pragati Engineering College, A.P., India, [email protected]
parameter extraction process used in low bit rate vocoders, and hence they are becoming an integral part of low bit rate speech coding systems. Over the last three decades, many kinds of speech enhancement techniques have been proposed, including transform-based techniques, adaptive filtering, and model-based methods. Transform-based techniques transform the time-domain signal into another domain, suppress the noise components, and apply the corresponding inverse transform to reconstruct the enhanced speech signal. The discrete Fourier transform (DFT), discrete cosine transform (DCT), Karhunen-Loeve transform (KLT), and wavelet transform (WT) are widely known transform methods. DFT-based techniques have been intensively investigated based on short-time spectral amplitudes (STSA). The KLT-based technique, called the signal subspace-based method, decomposes the space into signal (or speech) and noise subspaces by means of eigendecomposition, and then suppresses the noise component in the eigenvalues. DCT-based techniques have lower computational complexity and higher frequency resolution than DFT-based methods. It is also possible to use WT-based methods in order to exploit the time and frequency characteristics of noisy speech signals simultaneously. Adaptive filtering, on the other hand, cancels the noise using adaptive filters such as the Kalman filter. A Kalman filter models noisy speech signals in terms of state-space and observation equations, which represent the speech production process and the noise addition model together with channel distortion, respectively. Kalman filters normally assume a white Gaussian noise distribution; however, Gibson et al. proposed a generalization of Kalman filtering to coloured noise signals. Finally, model-based techniques classify the noisy signals using an a priori speech model, such as hidden Markov and voiced/unvoiced models, and then conduct the enhancement depending on the classified speech model.
This method can be useful for improving the noise reduction performance for various kinds of speech signals. However, it requires extra training to build the model, with intensive computation. In addition, it may exhibit model selection errors, which cause significant speech quality degradation. Fundamentally, it is not easy to handle complicated speech signals with a finite number of speech models. This paper presents an investigation of the use of wavelet threshold methods for speech signal enhancement. Unlike Fourier
I. INTRODUCTION
In voice communications, speech signals can be contaminated by environmental noise; as a result, the communication quality can be affected, making the speech less intelligible. Furthermore, compression of noisy speech with a low bit rate vocoder may result in considerable quality degradation due to frequent estimation errors in the speech production model parameters required by the vocoder. This problem can be reduced significantly by speech enhancement (or noise cancellation), which may enable more pleasant voice communication by suppressing the noise components in the input signals. Generally, it is assumed that the noisy speech signal is formed additively by speech and noise signals, in which the noise is generated by environmental sources such as vehicles, street noise, babble, etc. In real environments, complete noise cancellation is therefore not feasible, as it is not possible to completely track noise types and characteristics that change with time. However, by assuming that the noise characteristics change slowly, the background noise level can be reduced, producing more pleasant and intelligible speech. Speech enhancement techniques can help the speech model
Fig. 2 shows the main steps in a digital audio processing system based on Simulink software.
C. Basic noise theory:
Noise is defined as an unwanted signal that interferes with the communication or measurement of another signal. Noise is itself an information-bearing signal that conveys information regarding the sources of the noise and the environment in which it propagates. There are many types and sources of noise or distortion, and they include:
1) Electronic noise such as thermal noise and shot noise
2) Acoustic noise emanating from moving, vibrating or colliding sources such as revolving machines, moving vehicles, keyboard clicks, wind and rain.
This paper is organized as follows: Section II gives a brief introduction to the wavelet transform, Section III presents the denoising scheme, Section IV gives the experiments and results, and Section V gives the conclusion, followed by the references.

II. DISCRETE WAVELET TRANSFORM
Assume the observed signal
$y(t) = s(t) + n(t)$
contains the original signal $s(t)$ with additive noise $n(t)$, as functions of time $t$ to be sampled. Let $W(\cdot)$ and $W^{-1}(\cdot)$ denote the forward and inverse wavelet transforms, and let $D(\cdot, \lambda)$ denote the enhancement operator with threshold $\lambda$. We intend to wavelet-denoise $y(t)$ in order to recover $\hat{s}(t)$ as an estimate of $s(t)$:
$Y = W(y)$
$Z = D(Y, \lambda)$
$\hat{s} = W^{-1}(Z)$
Similar to the Fourier series expansion, the DWT maps a function of a continuous variable, $s(t)$, into a sequence of coefficients; the resulting coefficients are called the discrete wavelet transform of $s(t)$. Its representation involves the decomposition of the signal in terms of wavelet basis functions derived from $\psi(t)$, given by
$\psi_{a,b}(t) = \frac{1}{\sqrt{|a|}} \, \psi\left(\frac{t-b}{a}\right), \quad a, b \in \mathbb{R}, \; a \neq 0$    (1)
where $a$ and $b$ are called the scale and position parameters, respectively. The multiresolution analysis is given by S. Mallat and Meyer, who prove that any conjugate mirror filter characterizes a wavelet. The wavelet decomposition of a signal $x(t)$ based on multiresolution theory can be obtained using filters [3]; the filter-based wavelet decomposition is shown in Fig. 3. The signal $s(t)$ can be decomposed into several levels. A three-level wavelet decomposition tree is shown in Fig. 5.

III. WAVELET ENHANCEMENT SCHEME
Let us assume the signal $s(t)$ is corrupted by noise $n(t)$ as
$y(t) = s(t) + n(t)$
where $n(t)$ is white Gaussian noise. The wavelet-based denoising scheme is shown in Fig. 6.
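The scheme of forward transform, thresholding, and inverse transform can be illustrated with a minimal sketch. It assumes the Haar wavelet implemented as a two-channel filter bank in pure Python and a simple hard-threshold rule applied to the detail coefficients only; the function names (`haar_dwt`, `wavedec`, `waverec`, `denoise`) are hypothetical and are not taken from the paper, which does not state its implementation.

```python
import math

def haar_dwt(x):
    """One level of the orthonormal Haar DWT (len(x) must be even)."""
    r = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / r for i in range(0, len(x), 2)]  # low-pass branch
    detail = [(x[i] - x[i + 1]) / r for i in range(0, len(x), 2)]  # high-pass branch
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt: interleave the reconstructed sample pairs."""
    r = math.sqrt(2.0)
    x = []
    for a, d in zip(approx, detail):
        x.extend(((a + d) / r, (a - d) / r))
    return x

def wavedec(x, levels):
    """Forward transform W(.): returns [cA_L, cD_L, ..., cD_1]."""
    if len(x) % (2 ** levels) != 0:
        raise ValueError("signal length must be a multiple of 2**levels")
    coeffs = []
    approx = list(x)
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        coeffs.insert(0, detail)   # coarser details go in front
    coeffs.insert(0, approx)
    return coeffs

def waverec(coeffs):
    """Inverse transform W^-1(.) for the layout produced by wavedec."""
    approx = coeffs[0]
    for detail in coeffs[1:]:
        approx = haar_idwt(approx, detail)
    return approx

def denoise(y, levels, lam):
    """Y = W(y); Z = D(Y, lam) on detail bands only; s_hat = W^-1(Z)."""
    Y = wavedec(y, levels)
    Z = [Y[0]] + [[c if abs(c) > lam else 0.0 for c in band] for band in Y[1:]]
    return waverec(Z)

# Illustrative 8-sample input (made-up values, not data from the paper).
y = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
s_hat = denoise(y, levels=3, lam=1.5)
```

With a threshold of zero the sketch reconstructs the input exactly, and with a very large threshold only the approximation band survives, so the output collapses to the signal mean. Leaving the approximation band untouched is a common convention, assumed here rather than taken from the text.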
Hard and soft thresholding techniques are used for the denoising process. The hard and soft threshold operations with threshold $\lambda$ are defined by (2) and (3).
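As a rough illustration of the two operations, the following sketch uses the standard textbook definitions: hard thresholding keeps a coefficient unchanged when its magnitude exceeds the threshold and zeroes it otherwise, while soft thresholding additionally shrinks the surviving coefficients toward zero by the threshold. That these match the paper's equations (2) and (3) exactly, including the handling of a coefficient whose magnitude equals the threshold, is an assumption.

```python
import math

def hard_threshold(w, lam):
    """Keep coefficients with |c| > lam unchanged, set the rest to zero."""
    return [c if abs(c) > lam else 0.0 for c in w]

def soft_threshold(w, lam):
    """Zero coefficients with |c| <= lam and shrink the others by lam."""
    out = []
    for c in w:
        mag = abs(c) - lam
        out.append(math.copysign(mag, c) if mag > 0 else 0.0)
    return out

# Illustrative coefficient vector (made-up values).
w = [-3.0, -0.5, 0.2, 1.0, 2.5]
print(hard_threshold(w, 1.0))  # [-3.0, 0.0, 0.0, 0.0, 2.5]
print(soft_threshold(w, 1.0))  # [-2.0, 0.0, 0.0, 0.0, 1.5]
```

Hard thresholding preserves the amplitude of the retained coefficients but introduces discontinuities at the threshold; soft thresholding is continuous at the cost of biasing large coefficients.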
wavelet coefficient vector and $s$ is the sum of squared wavelet coefficients, given as
$s = \sum_{i} w_i^2$    (9)
with $w_i$ denoting the $i$-th wavelet coefficient. Threshold determination is an important problem. A small threshold may yield a result that is still noisy, while a large threshold can cut a significant part of the signal, thus losing important details of the signal.

IV. RESULTS AND DISCUSSIONS
This paper compares the performance of wavelet threshold methods applied to an English speech signal taken from a male speaker at a sampling frequency of 25 kHz. All the threshold methods are tested for white Gaussian noise. For performance comparison and measurement of denoising quality, the PSNR between the speech signal $s(t)$ and the denoised speech signal $s_d(t)$ is calculated as
$\mathrm{PSNR} = 10 \log_{10}\left( s_{\max}^2 / \mathrm{MSE} \right)$    (10)
where $s_{\max}$ is the maximum value of the signal, given by
$s_{\max} = \max\left( \max(s(t)), \max(s_d(t)) \right)$
and MSE is the mean square error, given by
$\mathrm{MSE} = \frac{1}{N} \sum_{t=1}^{N} \left[ s_d(t) - s(t) \right]^2$    (11)
with $N$ the length of the signal vector.
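The quality measure in (10) and (11) can be sketched in a few lines of Python. The function names `mse` and `psnr` are hypothetical, and the averaging over the signal length in the MSE is the usual definition, assumed here.

```python
import math

def mse(s, s_d):
    """Mean square error between the clean signal s and the denoised signal s_d."""
    if len(s) != len(s_d):
        raise ValueError("signals must have the same length")
    return sum((b - a) ** 2 for a, b in zip(s, s_d)) / len(s)

def psnr(s, s_d):
    """PSNR in dB, with the peak taken as the larger of the two signal maxima."""
    s_max = max(max(s), max(s_d))
    err = mse(s, s_d)
    if err == 0.0:
        return math.inf  # identical signals
    return 10.0 * math.log10(s_max ** 2 / err)

# Illustrative values (made up, not taken from the paper's experiments).
clean = [0.0, 1.0, 0.0, -1.0]
denoised = [0.0, 0.9, 0.1, -1.0]
quality_db = psnr(clean, denoised)
```

For this toy input the MSE is 0.005 and the peak is 1.0, so the PSNR is 10 log10(200), roughly 23 dB.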
where the noise level $\sigma$ is obtained from the median of the wavelet coefficient vector at unit scale and $N$ is the length of the signal vector.
2) Sqtwolog criterion: the threshold value $\lambda$ is calculated by the universal threshold (square-root-log) method, given by
$\lambda = \sigma \sqrt{2 \ln N}$
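A minimal sketch of the universal (square-root-log) threshold follows. It assumes the common median-absolute-deviation noise estimate, in which the median of the coefficient magnitudes is divided by 0.6745; this constant and the function names are assumptions, since the corresponding expression is not legible in the text.

```python
import math
import statistics

def estimate_sigma(detail):
    """Noise level from finest-scale detail coefficients (assumed estimator: median(|w|) / 0.6745)."""
    return statistics.median([abs(c) for c in detail]) / 0.6745

def universal_threshold(detail, n):
    """Universal threshold sigma * sqrt(2 ln n) for a signal of length n."""
    return estimate_sigma(detail) * math.sqrt(2.0 * math.log(n))

# Illustrative finest-scale coefficients (made-up values) and signal length.
detail = [0.6745, -0.6745, 1.349, -0.1, 0.2]
lam = universal_threshold(detail, n=1024)
```

Because the threshold grows only with the square root of the logarithm of the signal length, it changes slowly with the number of samples, while it scales linearly with the estimated noise level.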
Fig. 9: Comparison of PSNR (dB) at level-1 decomposition

TABLE II
COMPARISON OF PSNR (dB) AT LEVEL-2 DECOMPOSITION
Method     Minimaxi  Rigrsure  Heursure  Sqtwolog
Haar       25.444    22.?44    24.?05    24.?02
Db10       2?.???    2?.?43    2?.???    30.0?5
Coif5      2?.?1     2?.2?     2?.?1     2?.02
Bior ?.?   2?.51?    24.45?    25.?02    24.?0?
Sym5       2?.21?    2?.02?    2?.52?    25.5?2
Fig. 10: Comparison of PSNR (dB) at level-2 decomposition

TABLE III
COMPARISON OF PSNR (dB) AT LEVEL-3 DECOMPOSITION
Method     Minimaxi  Rigrsure  Heursure  Sqtwolog
Haar       22.4?2    1?.??4    25.?05    2?.?02
Db10       2?.0?2    23.502    2?.034    2?.2?4
Coif5      25.??     24.??     2?.?2     2?.?1
Bior ?.?   24.514    21.5?2    22.???    23.???
Sym5       25.3??    23.412    25.02?    24.12?
Method     Minimaxi  Rigrsure  Heursure  Sqtwolog
Haar       2?.?31    23.?4     2?.?24    2?.?2
Db10       30.??1    32.53     33.3?2    30.3?2
Coif5      30.??     2?.?2     2?.?1     2?.?3
Fig. 11: Comparison of PSNR (dB) at level-3 decomposition

TABLE IV
COMPARISON OF PSNR (dB) AT LEVEL-4 DECOMPOSITION
Method     Minimaxi  Rigrsure  Heursure  Sqtwolog
Haar       21.??2    1?.??2    24.502    24.142
Db10       24.51?    20.?0?    2?.0?2    24.20?
Coif5      24.1?     21.1?     25.?2     23.?1
Bior ?.?   21.523    1?.5?3    25.??4    23.?24
Sym5       23.??2    22.3?     23.40?    23.424
Fig. 13: Comparative analysis of threshold methods

TABLE VI
COMPARATIVE ANALYSIS BETWEEN DIFFERENT WAVELET FAMILIES BY USING SQTWOLOG METHOD IN TERMS OF PSNR
S.No  Noise level  PSNR of noisy signal  Haar  Db10  Coif5  Sym5
1     0 dB
2     5 dB
3     10 dB
4     15 dB
5     20 dB
Fig. 12: Comparison of PSNR (dB) at level-4 decomposition
Haar: Haar wavelet, Db: Daubechies wavelet, Coif: Coiflet wavelet, Sym: Symlet wavelet

TABLE V
COMPARATIVE ANALYSIS OF THRESHOLD METHODS
S.No  Noise level  PSNR of noisy signal  PSNR of denoised signal using different algorithms
                                         Minimaxi  Rigrsure  Heursure  Sqtwolog
Fig. 14: Comparative analysis between different wavelet families by using sqtwolog

In addition, the sqtwolog method was studied comprehensively with Daubechies wavelets (Db2 to Db10), the Haar wavelet, Coiflet wavelets (Coif1 to Coif5), and Symlet wavelets (Sym2 to Sym5), but only the results of the Sym5 wavelet are given here. A comparative analysis has been performed between Minimaxi, Rigrsure, Heursure, and Sqtwolog, and the results in terms of PSNR are given in Tables I, II, III, and IV for five different wavelets. It is observed that, as the level of decomposition is increased from level 1 to level 4, the PSNR values of the noisy signals keep decreasing, while the denoised signals improve in terms of PSNR. When Db10 is applied at all the decomposition levels, a significant increase in PSNR is obtained in comparison
1     0 dB
2     5 dB
3     10 dB
4     15 dB
5     20 dB
Electronics and Communication Engineering from Chaitanya Institute of Science & Technology, Madhavapatnam, Kakinada in 2010 and M.Tech in Embedded Systems (ES) from Pragati Engineering College, Surampalem in 2013. He is currently working as an Asst. Professor in the Dept. of Electronics and Communication Engineering at Pragati Engineering College, Andhra Pradesh, India. His research interests include Embedded Systems, Signal Processing, and Image Processing.