A Novel Approach For Blind Estimation of Reverberation Time Using Gamma Distribution Model
Abstract – In this paper we propose an unsupervised algorithm to estimate the reverberation time (RT) directly from the reverberant speech signal. For the estimation we use maximum likelihood estimation (MLE), a well-known and widely used estimation method in signal processing. All existing RT estimation methods are based on the decay rate distribution. The decay rate can be obtained either from the energy envelope decay curve of a noise source when it is switched off or from the decay curve of the impulse response of an enclosure. The proposed method builds on an analysis of an existing RT estimation method. In one state-of-the-art method, the reverberation decay is modelled as a Laplacian distribution. In this paper, the proposed method models the reverberation decay as a Gamma distribution and integrates an effective technique for spotting free decays in reverberant speech. MLE is then used to estimate the RT from the free decays. The method was motivated by our observation that when the RT of a reverberant signal falls in a specific range, the decay rate of the signal follows a Gamma distribution. Experiments are carried out on different reverberant speech signals to measure the accuracy of the proposed method. The experimental results show that the proposed method achieves higher accuracy than the state-of-the-art method.
Copyright ⓒ The Korean Institute of Electrical Engineers
This is an Open-Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/
licenses/by-nc/3.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
band noise burst to radiate in the enclosed environment. The decay curve is obtained by switching off the noise source after it reaches its steady state, and the slope of the decay curve is used to estimate the RT. Fluctuations in the noise source produce different decay curves from trial to trial, so a large number of decay curves are averaged to obtain a reliable RT value. In another method, a segmentation process is used to detect gaps in the reverberant speech [5, 6], which allows the sound decay curve to be tracked. In 1965 Schroeder addressed this problem by developing a method based on the integrated impulse response [7]. In this method the narrow-band or broad-band excitation signal is replaced by a brief pulse. Schroeder reported an integral relationship between the ensemble average of the decay curves of the interrupted noise method and the impulse response of the room, so repeated trials became unnecessary.

Schroeder's method dominated the area of voice signal processing for the past few decades; however, there remained a need for a blind RT estimation technique that can estimate the RT from the available reverberant signals without information about the enclosure and without a test sound signal. A method based on direct analysis of the sound signal would be very helpful in hands-free telephony devices and hearing aids. Recently, many methods have been proposed for blind RT estimation [2, 8, 9, 10-12] that use the reverberant (recorded) sound directly. In all these methods, statistical modelling is the core, followed by MLE to determine the optimal RT. Some semi-blind methods have also been developed [13-16], in which the enclosure characteristics are learned using neural network approaches: the network is first trained on a set of data, then tested, and after that used for estimation. Estimating reverberation characteristics using only the microphone (reverberant) signal is a good tool for the study of reverberation [2].

Ratnam et al. [2] modelled the room reverberation characteristics with a noise decay curve approach in order to develop a blind RT estimation algorithm that is based entirely on the reverberant sound. The sound signal in the test room is processed continuously to obtain a running RT estimate using ML parameter estimation. An order-statistics filter is used in the decision-making step to select the most probable RT from the pool of RT estimates collected over a period of time. In [2] sound decays are detected iteratively, which makes the algorithm computationally expensive. Ratnam et al. modified the algorithm of [2] to improve its computational efficiency and presented the new algorithm in [10]. Lollmann et al. [8] recently developed an algorithm, based on the statistical sound decay model of [2], for blind RT estimation directly from reverberant speech. The main advantage of the Lollmann et al. method is the detection of possible sound decays via a pre-selection step, which improves both the robustness of the estimate and the computational efficiency. In this paper we propose a method for blind RT estimation that builds on the Lollmann et al. method and its pre-selection mechanism for detecting possible decays, and uses the Gamma distribution to model the decay rate.

The rest of the paper is organized as follows. The ML estimation procedure and the sound decay model used in the proposed method are discussed in Section 2. Section 3 discusses the RT estimation procedure and its efficiency. Experimental results and the performance of the proposed algorithm are evaluated in Section 4, followed by the conclusion in Section 5.

2. Proposed Model and MLE

In this work, the statistical model used for the energy decay of the reverberant signal is based on the Gamma distribution. When the RT of a reverberant signal falls in a certain range, the Gamma distribution models the amplitude distribution of the signal better [17]. The proposed work is motivated by the method in [17]; Fig. 1, derived using that method, shows the main motivation for this work.

Fig. 1. Reverberant speech distribution and theoretical PDFs of different distributions (Laplace distribution: blue line, Gamma distribution: black line, Gaussian distribution: pink line) with zero mean.

When the reverberation time lies in the lower range (0-150 ms), the distribution followed by the signal is Gamma. In the current work our main focus is on small chambers such as a telephone booth, an aeroplane cockpit, or an ATM booth, where the reverberation time must be estimated properly so that an enhanced speech signal can eventually be obtained. In such small chambers the RT generally lies in the lower range. Therefore reverberant speech decay
curve is modelled in this work using the Gamma distribution.

Hence, the decaying reverberant sound tail is modelled using the Gamma distribution G(x; k, θ) together with a sequence of random variables. In G(x; k, θ), k is the shape factor and is equal to one for zero mean, θ is the variance of the Gamma distribution, and x is the random variable. The general expression of the Gamma probability density is [18]

$$G(x; k, \theta) = \frac{x^{k-1}\, e^{-x/\theta}}{\theta^{k}\, \Gamma(k)} \tag{2}$$

where Γ(k) is the gamma function of k. For zero mean we put k = 1; after simplification and evaluating Γ(1) = 1, the density reduces to [18]

$$G(x; 1, \theta) = \frac{e^{-x/\theta}}{\theta} \tag{4}$$

In developing the model an assumption is made: the decaying reverberant sound tail is represented by y, which is the product of x and a, where x is a random process representing the fine structure and a represents the deterministic envelope. Furthermore, x(n) is assumed to be an independent and identically distributed (i.i.d.) random sequence for n ≥ 0, having a Gamma distribution with zero mean (k = 1) and variance θ, G(1, θ). Likewise, for every value of n a deterministic sequence a(n) > 0 is defined in order to build the room decay model y [2]. The decay model is given by

$$y(n) = a(n) \times x(n) \tag{5}$$

$$x(n) = \frac{y(n)}{a(n)} \tag{6}$$

The y(n) are independent, but because the envelope a(n) varies with time they are not identically distributed; y(n) has the probability density function G(1, θ a(n)). A finite observation sequence, n = 0, 1, 2, ..., N − 1, is selected in order to estimate the decay rate of the reverberant signal. The likelihood function of the observation y is

$$L(y; a, \theta) = \frac{1}{\theta^{N}} \cdot \frac{1}{a(0)\, a(1) \cdots a(N-1)} \times \exp\!\left(-\frac{1}{\theta}\sum_{n=0}^{N-1} \frac{y(n)}{a(n)}\right) \tag{7}$$

where a and θ are the (N + 1) unknown parameters that must be estimated from the observation y. The main objective here is to model the sound decay in an enclosure, which allows the likelihood function in Eq. (7) to be simplified. It is assumed that the damping of the energy envelope of the reverberant sound is described by a single decay rate σ in the intermediate free decay regions (i.e., the regions that follow a steep speech offset), rather than in regions of ongoing sound, speech onsets, or gradually declining speech offsets. This leads to the following expression for a(n) [2]:

$$a(n) = \exp\!\left(-\frac{n}{\sigma}\right) \tag{8}$$

$$a = \exp\!\left(-\frac{1}{\sigma}\right) \tag{9}$$

Substituting Eq. (9) into Eq. (8) gives

$$a(n) = a^{n} \tag{10}$$

After incorporating Eq. (10), Eq. (7) becomes

$$L(y; a, \theta) = \frac{1}{\theta^{N}\, a^{0} a^{1} \cdots a^{N-1}} \times \exp\!\left(-\frac{1}{\theta}\sum_{n=0}^{N-1} \frac{y(n)}{a^{n}}\right) \tag{12}$$

The exponents 0, 1, ..., N − 1 of a in the denominator of Eq. (12) are consecutive integers whose sum has the closed form (provable by mathematical induction)

$$0 + 1 + 2 + 3 + \cdots + (N-1) = \frac{N(N-1)}{2} \tag{13}$$

Using Eq. (13) and rearranging terms, Eq. (12) becomes

$$L(y; a, \theta) = \frac{1}{\theta^{N}\, a^{N(N-1)/2}} \times \exp\!\left(-\frac{1}{\theta}\sum_{n=0}^{N-1} \frac{y(n)}{a^{n}}\right) \tag{14}$$

MLE is used to estimate the unknown parameters (a, θ). The first step is to obtain the log-likelihood function by taking the logarithm of both sides of Eq. (14):

$$\ln L(y; a, \theta) = -\frac{N(N-1)}{2}\ln(a) - N\ln(\theta) - \frac{1}{\theta}\sum_{n=0}^{N-1} \frac{y(n)}{a^{n}} \tag{16}$$
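As a minimal sketch (not the authors' implementation), the closed-form log-likelihood of Eq. (16) can be checked numerically against the sum of per-sample log densities of the model y(n) = a^n x(n), where x(n) is drawn from a Gamma distribution with k = 1 and scale θ (an exponential distribution). The function names, the synthetic data, and the parameter values below are illustrative assumptions.

```python
import numpy as np

def log_likelihood_closed_form(y, a, theta):
    """Eq. (16): -(N(N-1)/2) ln(a) - N ln(theta) - (1/theta) * sum y(n) a^(-n)."""
    y = np.asarray(y, dtype=float)
    N = y.size
    n = np.arange(N)
    return (-(N * (N - 1) / 2.0) * np.log(a)
            - N * np.log(theta)
            - np.sum(y * a ** (-n)) / theta)

def log_likelihood_per_sample(y, a, theta):
    """Sum of log densities of y(n), each Gamma(k=1) with scale theta * a^n."""
    y = np.asarray(y, dtype=float)
    n = np.arange(y.size)
    scale = theta * a ** n
    return np.sum(-np.log(scale) - y / scale)

# Synthetic free decay; all values here are assumed for illustration only.
rng = np.random.default_rng(0)
a_true, theta_true, N = 0.99, 2.0, 200
y = a_true ** np.arange(N) * rng.exponential(theta_true, size=N)

ll_closed = log_likelihood_closed_form(y, a_true, theta_true)
ll_direct = log_likelihood_per_sample(y, a_true, theta_true)
print(ll_closed, ll_direct)  # the two values should agree up to rounding
```

The two functions compute the same quantity in different ways, so agreement between them is a quick check that the algebra leading from Eq. (7) to Eq. (16) has been transcribed correctly.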
http://www.jeet.or.kr │ 531
Eq. (16) can be rewritten as a sum over n:

$$\ln L(y; a, \theta) = -\sum_{n=0}^{N-1} \ln\!\left(a^{n}\theta\right) - \frac{1}{\theta}\sum_{n=0}^{N-1} y(n)\, a^{-n} \tag{17}$$

To find the maximum of ln L(y; a, θ), we differentiate Eq. (17) with respect to a to obtain the score function S_a [19]:

$$S_{a}(a; y, \theta) = \frac{\partial \ln L(y; a, \theta)}{\partial a} = -\frac{1}{a}\sum_{n=0}^{N-1} n + \frac{1}{\theta}\sum_{n=0}^{N-1} n\, y(n)\, a^{-n-1} \tag{18}$$

To find the extremum of the log-likelihood function, we set the derivative in Eq. (18) equal to zero:

$$-\frac{1}{a}\sum_{n=0}^{N-1} n + \frac{1}{\theta}\sum_{n=0}^{N-1} n\, y(n)\, a^{-n-1} = 0 \tag{19}$$

The estimated value of a, in other words the zero of the score function, is denoted by â and must satisfy Eq. (19). To verify that â maximizes the log-likelihood function, the second derivative test

$$\left.\frac{\partial^{2} \ln L(y; a, \theta)}{\partial a^{2}}\right|_{a=\hat{a}} < 0$$

can be performed.

Similarly, the score function for θ is obtained by following the same procedure as for a. The log-likelihood function in Eq. (17) is partially differentiated with respect to θ:

$$S_{\theta}(\theta; y, a) = \frac{\partial \ln L(y; a, \theta)}{\partial \theta} = -\frac{N}{\theta} + \frac{1}{\theta^{2}}\sum_{n=0}^{N-1} y(n)\, a^{-n} \tag{20}$$

The score function S_θ(θ; y, a) is then set equal to zero to obtain an expression for θ:

$$-\frac{N}{\theta} + \frac{1}{\theta^{2}}\sum_{n=0}^{N-1} y(n)\, a^{-n} = 0 \tag{21a}$$

$$\Rightarrow \quad \hat{\theta} = \frac{1}{N}\sum_{n=0}^{N-1} y(n)\, a^{-n} \tag{21b}$$

The estimated value of θ is denoted by θ̂ and must satisfy Eq. (21a). For the value of θ given in Eq. (21b) the log-likelihood function attains an extremum; to verify that θ̂ maximizes the log-likelihood function, the second derivative test

$$\left.\frac{\partial^{2} \ln L(y; a, \theta)}{\partial \theta^{2}}\right|_{\theta=\hat{\theta}} < 0$$

can be performed.

Comparing the estimation expressions for a and θ in Eq. (19) and Eq. (21b), it is clear that Eq. (19) is an implicit expression with no explicit solution, whereas Eq. (21b) is an explicit expression whose solution exists if a is known. As already defined in Eq. (9), σ is the required time constant that is to be estimated.

Examining the mapping from a to σ shows that a ∈ [0, 1) maps one-to-one onto σ ∈ [0, ∞). The method used here is similar to the methods of [2] and [10] in that the estimation of a is based on quantization. To form the histogram of a, bins are first created by quantizing the given range of a. The likelihood is used to assign values to these bins: once the likelihood values have been calculated, the maximum likelihood value is assigned to the corresponding bin of the histogram.

Let X values be obtained by quantizing a over the range a ∈ [0, 1). The quantized values of a are denoted by a_w, where w = 1, 2, 3, ..., X, and for each a_w the log-likelihood is calculated using Eq. (17) as

$$\ln L(y; a_{w}) = -\sum_{n=0}^{N-1} \ln\!\left(a_{w}^{n}\,\theta\right) - \frac{1}{\theta}\sum_{n=0}^{N-1} y(n)\, a_{w}^{-n} \tag{22}$$

and â is then chosen as

$$\hat{a} = \arg\max_{a_{w}} \left\{ \ln L(y; a_{w}, \hat{\theta}) \right\} \tag{23}$$

The result obtained from Eq. (23) is then substituted into Eq. (9) to obtain the estimated decay rate σ̂. Finally, the RT estimate is calculated using the formula from [11]:

$$\widehat{RT} = 6.908 \times \hat{\sigma} \tag{24}$$

3. Effective Estimation of RT

The parent method proposed in [2] uses a looping approach to estimate the decay rate of the sound, but its computational requirements are significantly high. The algorithm introduced in [10] improves the computational efficiency of the parent technique by dividing the recorded signal into frames and processing them, instead of processing the whole signal, to find the free sound decay regions for ML estimation of the reverberant sound decay rate. The computational efficiency can be enhanced further by first capturing the free sound decay regions of the reverberant signal and then using only those detected regions for the ML estimation of the decay rate, so that only a small portion of the reverberant signal is processed. This can be achieved with the estimation procedure proposed by Lollmann et al. [8], which has the additional benefit of reducing the effect of outliers on the estimated RT value. We use this procedure to enhance the ML estimation of the Gamma parameters in our proposed method.

The reverberant speech signal is the input to the algorithm and is represented by g(n). The g(n) is a discrete
time signal and n represents its index. The signal g(n) is processed frame-wise. The sample sequence is divided into frames of D samples each, shifted by ΔD samples [8]. The purpose of this division is to detect the free sound decay regions; the frames are given by

$$G(\Upsilon, d) = g(\Upsilon \cdot \Delta D + d), \quad d = 0, 1, \ldots, D-1 \tag{25}$$

where ϒ ∈ N. To spot the potential sound decays, a pre-selection is carried out in the first step. This is achieved by dividing G(ϒ, d) into T = D/Q ∈ N sub-frames:

$$G_{sub}(\Upsilon, t, q) = G(\Upsilon, tQ + q) \tag{26}$$

where q = 0, 1, 2, ..., Q − 1 and t = 0, 1, 2, ..., T − 1 are the sub-frame indices. In the next step we examine the energy, maximum, and minimum of each sub-frame to find whether they deviate from the values of the next sub-frame, as done in [9]:

$$\sum_{q=0}^{Q-1} G_{sub}^{2}(\Upsilon, t, q) > \tau \cdot \sum_{q=0}^{Q-1} G_{sub}^{2}(\Upsilon, t+1, q) \tag{27a}$$

$$\max_{q}\{G_{sub}(\Upsilon, t, q)\} > \tau \cdot \max_{q}\{G_{sub}(\Upsilon, t+1, q)\} \tag{27b}$$

$$\min_{q}\{G_{sub}(\Upsilon, t, q)\} < \tau \cdot \min_{q}\{G_{sub}(\Upsilon, t+1, q)\} \tag{27c}$$

where τ is a weight in the range 0 ≤ τ ≤ 1. For some frames the conditions (27a)-(27c) may not be satisfied by the time the counter t reaches its minimum value t_min, where 1 < t_min < T − 2; in that case the inequality check is stopped and the next signal frame G(ϒ + 1, d) is processed. On the other hand, a sub-frame sequence satisfying Eqs. (27a), (27b), and (27c) indicates a potential sound decay. The RT of the spotted frame is calculated for a finite band of RT values using Eqs. (22), (23), (9), and (24). The estimated RT values are then assigned to bins in order to generate the histogram, which is updated every time a new value is generated. The bin size of the histogram is kept at 10 in order to reduce the computational complexity and enhance the estimation accuracy.

Instead of the first peak, the maximum of the histogram is taken as the current RT estimate, because the frame pre-selection process reduces the number of outliers. The variance of the estimated RT is further reduced by recursive smoothing, which modifies the estimation expression to

$$\overline{RT}(\Upsilon) = \beta \cdot \overline{RT}(\Upsilon - 1) + (1 - \beta) \cdot \widehat{RT}(\Upsilon) \tag{28}$$

where 0.9 < β < 1. The RT value is finally estimated by
= ( (ϒ )) (29)

Table 1 summarizes the blind RT estimation technique that uses the Gamma distribution based statistical model for the decay of the reverberant sound signal.

4. Experimental Result and Discussion

Matlab simulations are carried out to evaluate the performance of the proposed method for the blind estimation of RT. For the simulations our first goal was to generate 10 different reverberant speech signals. For this we convolved 10 anechoic speech signals with RIRs to obtain the desired reverberant signals. These 10 anechoic speech signals are randomly selected from the TIMIT database. Five of them are uttered by male speakers and five by female speakers, sampled at 16 kHz. The RIRs are obtained from the AIR
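As a rough, self-contained illustration of the estimator being evaluated (a sketch under stated assumptions, not the Matlab code used in the experiments), the grid search of Eqs. (21b)-(24) can be run on a synthetic free decay drawn directly from the model y(n) = a^n x(n). The function names, the RT grid, the random seed, and the conversion of σ from samples to seconds through the sampling rate are assumptions made for this example.

```python
import numpy as np

FS = 16000  # sampling rate in Hz (the speech in Section 4 is sampled at 16 kHz)

def rt_to_a(rt, fs=FS):
    """Map an RT in seconds to the per-sample decay factor a (Eqs. (24) and (9)).

    Assumes sigma is measured in samples, so RT[s] = 6.908 * sigma / fs.
    """
    sigma = rt * fs / 6.908
    return np.exp(-1.0 / sigma)

def profile_log_likelihood(y, a):
    """Log-likelihood of Eq. (22) with theta replaced by its estimate, Eq. (21b)."""
    N = y.size
    n = np.arange(N)
    theta_hat = np.mean(y * a ** (-n))  # Eq. (21b)
    # With theta = theta_hat, the last term of Eq. (16) equals N.
    return -(N * (N - 1) / 2.0) * np.log(a) - N * np.log(theta_hat) - N

def estimate_rt(y, rt_grid):
    """Pick the candidate RT whose decay factor maximizes the likelihood (Eq. (23))."""
    scores = [profile_log_likelihood(y, rt_to_a(rt)) for rt in rt_grid]
    return rt_grid[int(np.argmax(scores))]

# Synthetic free decay with a known RT (illustrative values only).
rng = np.random.default_rng(1)
rt_true = 0.1                               # seconds, inside the 0-150 ms range
N = 2000                                    # number of samples in the decay segment
y = rt_to_a(rt_true) ** np.arange(N) * rng.exponential(1.0, size=N)

rt_grid = np.arange(0.02, 0.3001, 0.005)    # finite band of candidate RT values
rt_hat = estimate_rt(y, rt_grid)
print(f"true RT = {rt_true:.3f} s, estimated RT = {rt_hat:.3f} s")
```

Searching over a band of candidate RT values rather than over a uniform grid of a keeps the candidates evenly spaced in the quantity of interest; a uniform grid on a ∈ [0, 1) would be very coarse near a = 1, where realistic decay factors lie.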
estimating the RT values over a wide range, we will add some modifications to the algorithm in the future to make this possible.

References

[1] H. Kuttruff, Room Acoustics, 3rd ed., Elsevier Science Publishers Ltd., London, 1991.
[2] R. Ratnam, D. L. Jones, B. C. Wheeler, W. D. O'Brien, C. R. Lansing and A. S. Feng, “Blind estimation of reverberation time,” J. Acoust. Soc. Am., vol. 114, pp. 2877-2892, Nov. 2003.
[3] W. C. Sabine, Collected Papers on Acoustics, 1922.
[4] International Organization for Standardization (ISO), Geneva, Acoustics - Measurement of the Reverberation Time of Rooms with Reference to Other Acoustical Parameters, 1997.
[5] K. Lebart, J. Boucher, and P. Denbigh, “A new method based on spectral subtraction for speech dereverberation,” Acta Acustica, vol. 87, pp. 359-366, 2001.
[6] S. Vesa and A. Harma, “Automatic estimation of reverberation time from binaural signals,” in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., vol. 3, 2005, pp. 281-284.
[7] M. R. Schroeder, “New method of measuring reverberation time,” J. Acoust. Soc. Am., pp. 409-412, 1965.
[8] H. W. Lollmann, E. Yilmaz, M. Jeub and P. Vary, “An improved algorithm for blind reverberation time estimation,” Proc. Int. Workshop Acoust. Echo and Noise Control (IWAENC), Aug. 2010, Tel Aviv, Israel.
[9] H. W. Lollmann and P. Vary, “Estimation of the reverberation time in noisy environments,” Proc. Int. Workshop Acoust. Echo and Noise Control (IWAENC), Sep. 2008, Washington, USA.
[10] R. Ratnam, D. L. Jones and W. D. O'Brien, “Fast algorithms for blind estimation of reverberation time,” IEEE Signal Process. Letters, vol. 11, pp. 537-540, Jun. 2004.
[11] J. Y. C. Wen, E. A. P. Habets, and P. A. Naylor, “Blind estimation of reverberation time based on the distribution of signal decay rates,” Proc. IEEE Int. Conf. Acoust., Speech, and Signal Process., pp. 329-332, 2008.
[12] R. Scharrer and M. Vorländer, “Blind reverberation time estimation,” Proceedings of the International Conference on Acoustics, Sydney, Australia, 2010.
[13] T. J. Cox, F. Li and P. Darlington, “Extracting room reverberation time from speech using artificial neural networks,” J. Audio Engineering Soc., pp. 219-230, 2001.
[14] J. Nannariello and F. Fricke, “The prediction of reverberation time using neural network analysis,” Applied Acoust., vol. 58, pp. 305-325, 1999.
[15] Y. Tahara and T. Miyajima, “A new approach to optimum reverberation time characteristics,” Applied Acoust., vol. 54, pp. 113-129, 1998.
[16] M. Aliabadi, R. Golmohammadi, A. Ohadi, Z. Mansoorizadeh, H. Khotanlou and M. S. Sarrafzadeh, “Development of an empirical acoustic model for predicting reverberation time in typical industrial workrooms using artificial neural networks,” Acta Acustica united with Acustica, vol. 100, no. 6, pp. 1090-1097, 2014.
[17] T. Petsatodis, C. Boukis, F. Talantzis, Z. Tan and R. Prasad, “Convex combination of multiple statistical models with application to VAD,” IEEE Trans. Audio, Speech, and Lang. Process., pp. 2314-2327, 2011.
[18] M. A. Bean, Probability: The Science of Uncertainty with Application to Investments, Insurance and Engineering, 2001. [Online]. Available: books.google.com.pk/books?isbn=0821847929
[19] V. Poor, An Introduction to Signal Detection and Estimation, Springer-Verlag, New York, 1994.
[20] M. Jeub, M. Schafer and P. Vary, “A binaural room impulse response database for the evaluation of dereverberation algorithms,” Proc. Int. Conf. Digital Signal Process. (DSP), 2009, Santorini, Greece.
[21] T. Jan and W. Wang, “Blind reverberation time estimation based on Laplace distribution,” 20th EUSIPCO, 2012.

Amad Hamza received his BSc degree in Electrical (Electronics) Engineering from Air University Islamabad, Pakistan in 2011 and his MS degree in Communication and Electronics from Univ. of Eng. & Tech. Peshawar, Pakistan in 2015. His research interests are audio/video processing, ANN and CGP.

Tariqullah Jan received his B.Sc. degree in Elect. Eng. from Pakistan in 2002 and his PhD in the field of Electronic Engineering from UK in 2012. His research interests include blind signal processing, machine learning, blind reverberation time estimation, multimodal based approaches for blind source separation, compressed sensing, and non-negative matrix/tensor factorization for blind source separation.