
Journal of ELECTRICAL ENGINEERING, VOL. 66, NO. 3, 2015, 169–173

COMMUNICATIONS

TEXT-INDEPENDENT SPEAKER RECOGNITION USING TWO-DIMENSIONAL INFORMATION ENTROPY

Boško Božilović* — Branislav M. Todorović** — Miroslav Obradović*

Speaker recognition is the process of automatically recognizing who is speaking on the basis of speaker-specific characteristics included in the speech signal. These speaker-specific characteristics are called features. Over the past decades, extensive research has been carried out on various possible speech signal features obtained from the signal in the time or frequency domain. The objective of this paper is to introduce two-dimensional information entropy as a new text-independent speaker recognition feature. Computations are performed in the time domain with real numbers exclusively. Experimental results show that the two-dimensional information entropy is a speaker-specific characteristic, useful for speaker recognition.

Keywords: biometrics, speech, speaker recognition, feature extraction, information entropy

1 INTRODUCTION

Biometric recognition systems are increasingly being deployed as a means for the recognition of people [1]. One of the most widely used biometric modalities is the human voice. Speaker recognition systems are technologies used to recognize a person from his/her speech signal by exploiting speaker-specific characteristics [2].

Speaker-specific characteristics are the result of a combination of anatomical differences inherent in the vocal tract and the learned speaking habits of different individuals. In speaker recognition systems, all these speaker-specific characteristics can be used to discriminate between speakers [3]. These speaker-specific characteristics are called features. The most important properties of a feature are large between-speaker variability and small within-speaker variability [4].

The speech signal is a complex time-varying signal which can be represented by many different features. There are different ways to categorize the features. From the viewpoint of their physical interpretation, we can divide them into spectral features [5, 6], phonetic features [7, 8] and prosodic features [9].

Spectral features are computed from short frames of about 20–30 ms in duration. Within this interval, the speech signal is assumed to remain stationary. Spectral features represent the most common way to characterize the speech signal. Fourier analysis provides a usual way of analyzing the spectral properties of a given signal in the frequency domain. In speech analysis, the phase spectrum is usually neglected, since it is generally believed that it has little effect on the perception of speech [10]. The simplest way of analyzing the spectral properties of a signal is by using filter banks. This approach to spectral feature extraction is so-called subband filtering, where the subband outputs are considered directly as the features [11]. The most frequently used spectral features for speaker recognition are mel-frequency cepstral coefficients [12], which are based on mel-scale filter banks. Linear prediction [13, 14] is an alternative spectrum estimation method.

Phonetic features depend on the speech content [15]. In order to extract phonetic features it is necessary to segment the speech signal into phonemes. Some broad phonetic classes are more speaker-specific than others. For example, using only vowels it is possible to obtain a very high recognition rate [16].

Prosodic features are related to non-segmental aspects of speech. They reflect differences in speaking style, language background, sentence type and emotions [17]. The most important prosodic parameter is the fundamental frequency [18]. Other prosodic features for speaker recognition include speaking rate, pause statistics and intonation patterns [19].

Depending on the algorithm used, the process of speaker recognition can be categorized as text-dependent or text-independent. Text-independent recognition is the much more challenging of the two tasks, since in text-independent systems there are no constraints on the words which the speakers are allowed to use.
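The subband-filtering idea described above (short quasi-stationary frames whose band energies are used directly as features) can be sketched as follows. This is a minimal illustration, not the mel-scale systems of [11, 12]: the frame lengths, the number of equal-width bands and the function names are illustrative assumptions.

```python
import numpy as np

def frame_signal(x, fs, frame_ms=25, hop_ms=10):
    """Split a signal into short frames (about 20-30 ms) over which
    speech is assumed to remain stationary."""
    frame_len = int(fs * frame_ms / 1000)
    hop = int(fs * hop_ms / 1000)
    n_frames = 1 + (len(x) - frame_len) // hop
    return np.stack([x[i * hop: i * hop + frame_len] for i in range(n_frames)])

def subband_energies(frames, n_bands=8):
    """Crude subband filtering: split each frame's power spectrum into
    equal-width bands and use the band energies directly as features."""
    spec = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    bands = np.array_split(spec, n_bands, axis=1)
    return np.stack([b.sum(axis=1) for b in bands], axis=1)

fs = 8000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 440 * t)               # 1 s synthetic tone in place of speech
feats = subband_energies(frame_signal(x, fs))  # one feature vector per frame
```

A practical system would replace the equal-width bands with mel-scale filters, but the framing and per-band energy computation carry over unchanged.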

* VLATACOM, R&D Center, Milutina Milankovića 5, 11070 Belgrade, Serbia, {Bosko; Miroslav.Obradovic}@vlatacom.com; ** RT-RK, Institute for Computer Based Systems, Narodnog Fronta 23A, 21000 Novi Sad, Serbia, [email protected]

© 2015 FEI STU

DOI: 10.2478/jee-2015-0027, Print ISSN 1335-3632, On-line ISSN 1339-309X

In general, phonetic variability represents an adverse factor to accuracy in text-independent speaker recognition. Another adverse factor in text-independent speaker recognition is modeling the different levels of prosodic information (instantaneous, long-term) to capture speaker-specific differences [19]. Besides those, adverse factors in speaker recognition include differences in recording and transmission conditions, the influence of a noisy environment [20], the effect of orthodontic appliances on spectral properties [21], etc.

The speaker recognition process is realized in several steps. The first step is speech signal capture by a microphone. The second step is the extraction of speech segments by removing the silence from the captured speech signal; this step is performed by a voice activity detector. The next step is the choice of features that will represent the speech signal. The step which follows is the feature extraction process, aiming to compute discriminative speech features suitable for speaker recognition. Furthermore, speaker recognition follows a standard procedure which includes two different tasks: speaker identification and speaker verification. In the speaker identification task, an unknown speaker's features are compared against a database of known speakers, and the best matching speaker is identified. In the speaker verification task, an identity claim is given, and the speaker's voice sample is compared against the claimed speaker's voice template. If the similarity degree between the voice sample and the template exceeds a predefined decision threshold, the speaker is recognized, and otherwise rejected [19].

State-of-the-art speaker recognition systems use a number of features in parallel, attempting to cover these different aspects and employing them in a complementary way to achieve more accurate recognition [22].

Information entropy can be a useful feature for speaker recognition. In information theory, entropy is defined as a measure of the randomness (uncertainty, information content) of a process. The calculation of the entropy of speech is complex, as speech signals simultaneously carry various forms of information: phonemes, topic, intonation signals, accent, speaker voice and speaker stylistics. One can consider the entropy of a speech signal at several levels: the entropy of words contained in a sequence of speech, the entropy of intonation and the entropy of speech signal features [23].

Information entropy has already been used for speaker recognition. Empirical entropy was proposed in [24], while approximated cross entropy was analyzed in [25].

In this paper, we propose and analyze so-called two-dimensional information entropy as a new feature domain for text-independent speaker recognition. An algorithm for extraction of the two-dimensional information entropy from the speech signal is described. Experimental results show that the proposed feature domain can be useful to discriminate between speakers.

2 DESCRIPTION OF TWO-DIMENSIONAL INFORMATION ENTROPY

Speech is made up of about 40 basic acoustic symbols, known as phonemes, which are used to construct words, sentences, etc. Speech is an information-rich signal that can be represented in the frequency or time domain. All this information is conveyed primarily within the traditional telephone bandwidth of 4 kHz [23].

As a speaker-specific characteristic of the speech signal we use its amplitude-time trajectory. In order to quantify the information content of the speech signal in the time domain, we define the two-dimensional information entropy of the amplitude-time trajectory.

Let us consider the analog speech signal s(t) presented in Fig. 1. The maximum value of the signal is denoted by S_max, while the minimum value is denoted by S_min. One can notice local maxima and local minima of the signal amplitude, ie the time points (..., t_{i-1}, t_i, t_{i+1}, ...) where the first derivative of the signal is equal to zero.

[Figure omitted: a speech waveform between S_min and S_max, with local extrema at t_{i-1}, t_i, t_{i+1}, the amplitude difference ΔS_i, the time difference Δt_i and the sampling interval T_s marked.]

Fig. 1. Speech signal

Let us denote by t_{i-1} a time point where the signal has a local minimum, by t_i the subsequent time point where the signal has a local maximum, and by t_{i+1} the subsequent time point where the signal has a local minimum. Furthermore, let us denote by Δs_i the amplitude difference between the local maximum at the time point t_i and the previous local minimum at the time point t_{i-1}. Let us denote by Δt_i the time difference between the time point t_i and the time point t_{i-1}. Similarly, we can define Δs_{i+1} and Δt_{i+1} as the amplitude and time differences between the time point t_{i+1} and the previous time point t_i.

The speech signal is sampled with sampling interval T_s and quantized into q levels. The quantization step is Δq = (S_max − S_min)/q. It should be noted that Δs_i = mΔq and Δt_i = nT_s, where m, n are integers and m ≤ q.

We propose two-dimensional information entropy as a measure to quantify the randomness of Δs_i and Δt_i. The two-dimensional information entropy is actually made up of two marginal entropies, H(Δs_i) and H(Δt_i), assuming independence of the random variables Δs_i and Δt_i.

Firstly, we calculate histograms of the discrete random variables Δs_i and Δt_i within a certain time interval, which is called the frame duration and denoted by t_0. Secondly, we calculate the information entropies H(Δs_i) and H(Δt_i)

$$H(\Delta s_i) = \sum_{i=1}^{I} P(\Delta s_i)\,\mathrm{ld}\,\frac{1}{P(\Delta s_i)}\,, \tag{1}$$

$$H(\Delta t_i) = \sum_{i=1}^{I} P(\Delta t_i)\,\mathrm{ld}\,\frac{1}{P(\Delta t_i)}\,, \tag{2}$$

where ld denotes the binary (base-2) logarithm and I denotes the number of intervals Δt_i within a frame, ie

$$t_0 = \sum_{i=1}^{I} \Delta t_i\,. \tag{3}$$

It will be shown that the proposed two-dimensional information entropy is a useful feature domain which represents a speaker-specific characteristic suitable for text-independent speaker recognition.

Description of experimental testbed

The experimental testbed consists of a voice activity detector, an A/D converter and a two-dimensional information entropy extractor, as shown in Fig. 2.

[Figure omitted: block diagram of the chain voice activity detector → A/D converter → two-dimensional information entropy extractor.]

Fig. 2. Experimental testbed

The function of the voice activity detector is to extract speech segments from the speech signal. A simple method [26], based on two audio features (signal energy and spectral centroid), for extraction of speech segments by removing the silence is used in the testbed.

Once the speech segments have been extracted, the speech signal is sampled at the f_s = 1/T_s = 8 kHz sampling rate with an 8-bit A/D converter, ie each sample is quantized into one of q = 256 levels.

The most important step in the speaker recognition process is to extract features from the analyzed signal. In the two-dimensional information entropy extractor, the speech signal is windowed into frames and processed sequentially. Calculations are performed according to relations (1) and (2).

3 NUMERICAL RESULTS

The speech signal database is formed of the six most frequent speakers from the Serbian parliament; three of them are male (denoted M1, M2 and M3), while the remaining three are female (denoted F1, F2 and F3). The duration of the speech signal of each of them is slightly below 4 min.

Histograms of the discrete random variables Δs_i and Δt_i are calculated for all six speakers. Using these histograms, the information entropies H(Δs_i) and H(Δt_i) are calculated according to relations (1) and (2). H(Δs_i) and H(Δt_i) represent coordinates in the two-dimensional information entropy domain. Different frame durations for calculating the histograms and the information entropies H(Δs_i) and H(Δt_i) are considered: t_0 = 10 s, 20 s and 30 s.

The results obtained for each specific frame can be represented by a point defined by the ordered pair (H(Δs_i), H(Δt_i)) in the two-dimensional information entropy domain. Numerical results in the (H(Δs_i), H(Δt_i)) plane are presented in Fig. 3, subfigures (a), (b) and (c), for t_0 = 10 s, 20 s and 30 s, respectively. From this figure one can conclude that the two-dimensional information entropy points obtained for one speaker are clustered. In other words, the within-speaker variability from frame to frame is significantly smaller relative to the between-speaker variability. Following the terminology from the vector quantization (VQ) based approach [27, 28], the ordered pair (H(Δs_i), H(Δt_i)) is called the speaker's feature vector, and the speaker's model is formed by clustering the speaker's feature vectors. In the VQ-based approach, the speakers' models are formed by clustering the K speakers' feature vectors into K non-overlapping clusters.

The coordinates of the centre of the cluster are calculated as

$$\overline{H(\Delta s_i)} = \frac{1}{N}\sum_{i=1}^{N} H(\Delta s_i)\,, \tag{4}$$

$$\overline{H(\Delta t_i)} = \frac{1}{N}\sum_{i=1}^{N} H(\Delta t_i)\,, \tag{5}$$

where N denotes the number of points in the cluster. For t_0 = 10 s, N ≅ 20; for t_0 = 20 s, N ≅ 10; while for t_0 = 30 s, N ≅ 6. According to the terminology from [27, 28], each cluster is represented by a code vector which is the centroid (average vector) of the cluster. The VQ model, also known as the centroid model, is one of the simplest text-independent speaker models.

The radius of each cluster, presented in Fig. 3, is calculated as the standard deviation of the distances between the points and the centre of the cluster

$$\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\Big\{\big[H(\Delta s_i), H(\Delta t_i)\big] - \big[\overline{H(\Delta s_i)}, \overline{H(\Delta t_i)}\big]\Big\}^2}\,. \tag{6}$$

From Fig. 3(a), obtained for the frame duration of 10 s, one can see that the clusters are overlapping. The Gaussian Mixture Model (GMM) can be considered as an extension of the VQ model in which the clusters are overlapping [29]. A GMM is composed of a finite mixture of multivariate Gaussian components. Hence, a feature vector is not assigned to the nearest cluster as in the VQ model, but has a nonzero probability of originating from each cluster.
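Relations (1)-(2) and (4)-(6) can be sketched as follows. This is a hedged illustration rather than the authors' exact implementation: the histogram bin count, the synthetic point cloud and all names are assumptions, and ld is taken as the base-2 logarithm.

```python
import numpy as np

def entropy_bits(values, bins=32):
    """Eqs. (1)/(2): H = sum of P * ld(1/P) over a histogram of the values,
    with ld taken as log base 2 (entropy in bits)."""
    counts, _ = np.histogram(values, bins=bins)
    p = counts[counts > 0] / counts.sum()
    return float(np.sum(p * np.log2(1.0 / p)))

def centroid_and_radius(points):
    """Eqs. (4)-(6): cluster centre as the mean feature vector, and the
    cluster radius as the root-mean-square distance of the points from
    that centre."""
    points = np.asarray(points)        # shape (N, 2): rows (H(ds), H(dt))
    centre = points.mean(axis=0)       # eqs. (4) and (5)
    dists = np.linalg.norm(points - centre, axis=1)
    radius = float(np.sqrt(np.mean(dists ** 2)))   # eq. (6)
    return centre, radius

rng = np.random.default_rng(0)
# one synthetic "speaker": a cloud of per-frame (H(ds), H(dt)) points,
# roughly in the value range shown in Fig. 3
pts = rng.normal(loc=[0.35, 0.55], scale=0.01, size=(20, 2))
centre, radius = centroid_and_radius(pts)
```

The tight cloud here mimics the clustering reported in the paper: the centre plays the role of the VQ code vector and the radius the plotted cluster radius.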

[Figure omitted: three scatter plots of the speakers' feature points in the (H(Δt_i), H(Δs_i)) plane, with H(Δt_i) spanning about 0.45–0.65 and H(Δs_i) about 0.15–0.40, one cluster per speaker.]

Fig. 3. Two-dimensional information entropy for six speakers; males are denoted M1, M2 and M3, females are denoted F1, F2 and F3: (a) frame duration 10 s, (b) frame duration 20 s, (c) frame duration 30 s

From Fig. 3, one can see that the standard deviation of the two-dimensional information entropy of a speaker is reduced as the frame duration is increased. In addition, the standard deviation of the two-dimensional information entropy is higher for females than for males.

Although the frame duration of 10–30 s seems long, it is comparable with actual systems. Recently, it was announced that Barclays Wealth was to use speaker recognition to verify the identity of telephone customers within 30 seconds of normal conversation [30].

4 CONCLUSION

Two-dimensional information entropy is a useful feature domain for text-independent speaker recognition. Although the validation is performed using a small dataset, the obtained results clearly show that this feature can be used to discriminate between speakers. Two-dimensional information entropy is very accurate in gender identification.

The most significant factor affecting automatic speaker recognition performance is the variability of signal characteristics from trial to trial, ie between-trial variability. Variations arise from the speaker him/herself, from differences in recording and transmission conditions, and from different noise environments. These topics are the subject of further research.

References

[1] PATO, J.—MILLETT, L. I. (eds): Biometric Recognition: Challenges and Opportunities, National Academies Press, Washington, 2010.
[2] TOGNERI, R.—PULLELLA, D.: An Overview of Speaker Identification: Accuracy and Robustness Issues, IEEE Circuits and Systems Magazine, Second quarter (2011), 23–61.
[3] CAMPBELL, J. P.: Speaker Recognition: A Tutorial, Proc. of the IEEE 85 No. 9 (1997), 1437–1462.
[4] ROSE, P.: Forensic Speaker Identification, Taylor & Francis, London, 2002.
[5] KINNUNEN, T.: Spectral Features for Automatic Text-Independent Speaker Recognition, Licentiate's thesis, University of Joensuu, Joensuu, Finland, 2003.
[6] KINNUNEN, T.: Optimizing Spectral Feature Based Text-Independent Speaker Recognition, PhD thesis, University of Joensuu, Joensuu, Finland, 2005.
[7] JIN, Q.—SCHULTZ, T.—WAIBEL, A.: Phonetic Speaker Identification, Proc. of the Int. Conference of Spoken Language Processing (ICSLP 2002), Denver, CO, Sep 2002, pp. 1345–1348.
[8] BACHOROWSKI, J. A.—OWREN, M. J.: Acoustic Correlates of Talker Sex and Individual Talker Identity are Present in a Short Vowel Segment Produced in Running Speech, Journal of Acoust. Soc. America 106 No. 2 (1999), 1054–1063.
[9] ADAMI, A. G.: Modeling Prosodic Differences for Speaker Recognition, Speech Communication 49 No. 4 (2007), 277–291.
[10] FURUI, S.: Digital Speech Processing, Synthesis, and Recognition, 2nd ed., Marcel Dekker, New York, 2001.
[11] SIVAKUMARAN, P.—ARIYAEEINIA, A.—LOOMES, M.: Sub-Band Based Text-Dependent Speaker Verification, Speech Communication 41 No. 2-3 (2003), 485–509.
[12] DAVIS, S.—MERMELSTEIN, P.: Comparison of Parametric Representations for Monosyllabic Word Recognition in Continuously Spoken Sentences, IEEE Trans. Acoustics, Speech, Signal Processing 28 No. 4 (1980), 357–366.
[13] HERMANSKY, H.: Perceptual Linear Predictive (PLP) Analysis of Speech, Journal Acoust. Soc. America 87 No. 4 (1990), 1738–1752.
[14] MAMMONE, R.—ZHANG, X.—RAMACHANDRAN, R.: Robust Speaker Recognition: a Feature Based Approach, IEEE Signal Processing Magazine 13 No. 5 (1996), 58–71.
[15] NOLAN, F.: The Phonetic Bases of Speaker Recognition, Cambridge, 1983.
[16] ANTAL, M.: Phonetic Speaker Recognition, Proc. of 7th Int. Conference Communications, Bucharest, Romania, June 2008, pp. 67–72.
[17] DEHAK, N.—KENNY, P.—DUMOUCHEL, P.: Modeling Prosodic Features with Joint Factor Analysis for Speaker Verification, IEEE Trans. Audio, Speech and Language Processing 15 No. 7 (2007), 2095–2103.
[18] MILIVOJEVIĆ, Z. N.—BRODIĆ, D.: Estimation of the Fundamental Frequency of the Speech Signal Compressed by G.723.1 Algorithm Applying PCC Interpolation, Journal of Electrical Engineering 62 No. 4 (2011), 181–189.
[19] KINNUNEN, T.—LI, H.: An Overview of Text-Independent Speaker Recognition: From Features to Supervectors, Speech Communication 52 No. 1 (2010), 12–40.

[20] SEDLAK, V.—DURACKOVA, D.—ZALUSKY—KOVACIK, T.: Intelligibility Assessment of Ideal Binary-Masked Noisy Speech with Acceptance of Room Acoustic, Journal of Electrical Engineering 65 No. 6 (2014), 325–332.
[21] PRIBIL, J.—PRIBILOVA, A.—DURACKOVA, D.: An Experiment with Spectral Analysis of Emotional Speech Affected by Orthodontic Appliances, Journal of Electrical Engineering 63 No. 5 (2012), 296–302.
[22] O'SHAUGHNESSY, D.: Automatic Speech Recognition: History, Methods and Challenges, Pattern Recognition 41 (2008), 2965–2979.
[23] VASEGHI, S. V.: Multimedia Signal Processing: Theory and Applications in Speech, Music and Communications, John Wiley & Sons, 2007.
[24] BRUMMER, N.—du PREEZ, J.: Application Independent Evaluation of Speaker Detection, Computer Speech and Language 20 No. 2-3 (2006), 230–275.
[25] ARONOWITZ, H.—BURSHTEIN, D.: Efficient Speaker Recognition using Approximated Cross Entropy (ACE), IEEE Transactions on Audio, Speech and Language Processing 15 No. 7 (Sep 2007), 2033–2043.
[26] GIANNAKOPOULOS, T.: Silence Removal in Speech Signals, MATLAB Central, March 2014, available from: http://www.mathworks.com/matlabcentral/fileexchange/28826-silence-removal-in-speech-signals, accessed on December 14, 2014.
[27] SOONG, F. K.—ROSENBERG, A. E.—JUANG, B. H.—RABINER, L. R.: A Vector Quantization Approach to Speaker Recognition, AT&T Technical Journal 66 No. 2 (Mar-Apr 1987), 14–26.
[28] LINDE, Y.—BUZO, A.—GRAY, R.: An Algorithm for Vector Quantizer Design, IEEE Trans. on Communications 28 No. 1 (1980), 84–95.
[29] REYNOLDS, D. A.—ROSE, R. C.: Robust Text Independent Speaker Identification using Gaussian Mixture Speaker Models, IEEE Trans. on Speech and Audio Processing 3 No. 1 (1995), 72–83.
[30] Barclays International Banking, available from: https://wealth.barclays.com/en gb/internationalwealth/manage-your-money/banking-on-the-power-of-speech.html, accessed on December 14, 2014.

Received 26 February 2015

Boško Božilović was born in Belgrade, Serbia, in 1978. He received Dipl Eng and MSc degrees from the Faculty of Electrical Engineering, University of Belgrade, in 2003 and 2012, respectively. He is a Director of ICT at VLATACOM, R&D Center. His research interests are in the areas of biometrics, forensics and digital security. He has authored or co-authored several peer-reviewed journal and conference papers and holds one patent. Currently he is working towards his PhD degree.

Branislav M. Todorović was born in Belgrade, Serbia, in 1959. He received Dipl Eng and MSc degrees from the Faculty of Electrical Engineering, University of Belgrade, and a PhD degree from the Faculty of Technical Sciences, University of Novi Sad, in 1983, 1988 and 1997, respectively. He is a Senior Research Fellow at the RT-RK, Institute for Computer Based Systems, and a Full Professor at the Military Academy, University of Defence, Belgrade. He is also with VLATACOM, R&D Center, Belgrade. Prior to joining RT-RK, he was with the Institute of Microwave Techniques and Electronics IMTEL-Komunikacije, Centre for Multidisciplinary Research, and the Military Technical Institute (VTI, Institute of Electrical Engineering) in Belgrade. His research interests are in the wide area of radio telecommunications and digital signal processing. He has authored or co-authored more than 100 peer-reviewed journal and conference papers and three books.

Miroslav Obradović was born in Belgrade, Serbia, in 1978. He is a Senior Software Developer at VLATACOM, R&D Center, Belgrade.
