
International Journal of Applied Engineering Research ISSN 0973-4562 Volume 13, Number 10 (2018) pp. 8126-8130
© Research India Publications. http://www.ripublication.com

Human Recognition using Voice Print in LabVIEW

Dr. S. Selva Nidhyananthan¹, K. Muthugeetha², V. Vallimayil³

¹Associate Professor, ²,³UG Students
¹⁻³Mepco Schlenk Engineering College, Sivakasi, India

Abstract
This paper describes speaker recognition using LabVIEW software. Speaker recognition consists of speaker verification and speaker identification. The project collects voice samples from a small group of people over a period of time and checks new voice samples against the stored data. Silence removal, pre-processing and feature extraction are performed. For feature extraction, Mel Frequency Cepstral Coefficients (MFCC) are used. Moment features of the speech are then computed for speaker identification.

Keywords: LabVIEW software, computer, MFCC feature extraction

INTRODUCTION
This paper introduces a speaker recognition system built with LabVIEW software. Speaker recognition is an identity authentication process that automatically identifies individuals from the intrinsic characteristics conveyed by their voice. It is also called voice recognition; the term "voice recognition" is used for both speaker verification and speaker identification. Recognizing the speaker can simplify the task of converting speech in systems that have been trained on a specific person's voice, or it can be used to prove and verify the identity of a speaker as part of a security process. In this paper, Mel frequency cepstral coefficients have been used for formant detection. A database of fifty persons, male and female, with two samples per person has been created for analysis of the results. These systems operate with the user's knowledge and typically require the user's co-operation. The developed system uses the LabVIEW (Laboratory Virtual Instrument Engineering Workbench) 2014 platform.

METHODOLOGY
First, the voice of each person is recorded and collected. Fifty persons were chosen and two samples from each person were taken in .wav format. After recording, a low-pass filter is used for silence and noise removal, the voice sample is divided into frames, every frame is windowed using a Hamming window, and features are extracted from the voice samples using the MFCC method.

A. To get input signal
The input signal is taken from any audio signal or human voice. The input is in WAV format. The sampling frequency (sampling rate) is determined using the Audacity software.

Figure 1. Reading the input signal in LabVIEW

Figure 1 shows the sound file being read in LabVIEW. The file path points to the already recorded voice samples.

B. Pre-processing
After the input is read, silence and noise are removed using a low-pass filter. A low-pass filter passes signal components with frequencies below a certain cut-off frequency and blocks components with frequencies above it. We assume a cut-off frequency of 1 kHz.
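The pre-processing filter can be sketched in Python as follows. This is an illustrative sketch, not the LabVIEW implementation: the Butterworth type and the 1 kHz cut-off come from the paper, while the second-order design, the 8 kHz sampling rate and all function names are assumptions made for the example.

```python
import math

def butterworth_lowpass_coefficients(cutoff_hz, sample_rate_hz):
    """Return (b, a) for a 2nd-order Butterworth low-pass section (bilinear transform)."""
    w0 = 2.0 * math.pi * cutoff_hz / sample_rate_hz
    q = 1.0 / math.sqrt(2.0)              # Q of a 2nd-order Butterworth response
    alpha = math.sin(w0) / (2.0 * q)
    cos_w0 = math.cos(w0)
    a0 = 1.0 + alpha                      # used to normalise all coefficients
    b = [(1.0 - cos_w0) / (2.0 * a0),
         (1.0 - cos_w0) / a0,
         (1.0 - cos_w0) / (2.0 * a0)]
    a = [1.0, -2.0 * cos_w0 / a0, (1.0 - alpha) / a0]
    return b, a

def lowpass(samples, cutoff_hz=1000.0, sample_rate_hz=8000.0):
    """Filter a sequence of samples; the default sampling rate is an assumed value."""
    b, a = butterworth_lowpass_coefficients(cutoff_hz, sample_rate_hz)
    x1 = x2 = y1 = y2 = 0.0               # previous inputs and outputs
    filtered = []
    for x0 in samples:
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, x0
        y2, y1 = y1, y0
        filtered.append(y0)
    return filtered

def rms(values):
    """Root-mean-square level, used here only to compare signal strength."""
    return math.sqrt(sum(v * v for v in values) / len(values))
```

With these assumed settings, a 100 Hz tone passes almost unchanged while a 3 kHz tone is strongly attenuated, which is the behaviour the pre-processing step relies on.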

[Figure 2 (block diagram). Main chain: INPUT SPEECH -> FILTER PROCESS (Butterworth low pass filter) -> FEATURE EXTRACTION (MFCC) -> SIMILARITY -> DECISION -> IDENTIFICATION RESULT (SPEAKER ID). Side blocks: DATABASE (trained samples), REFERENCE MODEL, THRESHOLD.]

Figure 2. Block diagram of speaker recognition

FEATURE EXTRACTION
Mel Frequency Cepstral Coefficients (MFCC) are used as the feature extraction method. The MFCC method contains six steps: audio signal input, framing, windowing, Fast Fourier Transform (FFT), Mel filter bank and Discrete Cosine Transform (DCT).

[Figure 3 (flow diagram): Audio signal -> Framing -> Windowing (Hamming window) -> FFT -> |FFT|^2 -> Mel Filter Bank -> DCT]

Figure 3. Feature Extraction method

A. Audio signal
The filtered signal is given as the input signal, following the block diagram. The input signal may be any audio signal or human voice.

B. Framing
Framing is an important step in signal processing. The recorded discrete signal has a finite length, but it is usually not processed whole. The pre-processed signals are framed: speech or voice signals are blocked into frames of N samples. Adjacent frames are separated by M samples, with M less than N. The first frame consists of the first N samples; the second frame begins M samples after the first frame and overlaps it by N-M samples, and so on. Let N = 160 samples and M = 96 samples, so the overlap is N-M = 64 samples (as per our speech signal). The purpose of overlapping is to increase the precision of the recognition process. The frame length is normally increased before the FFT, typically to a power of two; in our case it would be 416 samples. Voice signals change their characteristics from time to time.

C. Windowing
A Hamming window is used in our project. The Hamming window is given by the following equation:

$$w(n) = 0.54 - 0.46\cos\left(\frac{2\pi n}{N-1}\right), \qquad 0 \le n \le N-1$$

where N = frame length and w(n) = window function.

1) Why windowing: When the frequency content of a signal is computed, errors can arise because we take a limited-duration snapshot of a signal that actually lasts for a longer time. Windowing is a way to reduce these errors.

2) Advantages of windowing: Windowing suppresses the discontinuity, and the resulting spurious high frequencies in the frequency analysis, by tapering the recorded signal smoothly towards zero at the start and end of the recording period.

3) Why we use a Hamming window: The Hamming window reduces the side lobe levels in the signal. It also provides a relatively narrow main lobe and a sharp transition band. The Hamming window has better selectivity for large signals.

Figure 4. Framing and Windowing Process in Lab VIEW
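The framing and windowing steps can be sketched in Python as follows. The values N = 160 and M = 96 and the Hamming formula are taken from the text; the function names and the choice to drop trailing samples that do not fill a whole frame are assumptions made for this example.

```python
import math

def frame_signal(samples, frame_length=160, frame_shift=96):
    """Split samples into frames of N samples that start every M samples.

    Trailing samples that cannot fill a complete frame are dropped (assumption).
    """
    frames = []
    start = 0
    while start + frame_length <= len(samples):
        frames.append(list(samples[start:start + frame_length]))
        start += frame_shift
    return frames

def hamming_window(length):
    """w(n) = 0.54 - 0.46*cos(2*pi*n/(N-1)) for 0 <= n <= N-1."""
    if length == 1:
        return [1.0]                      # avoid division by zero for N = 1
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (length - 1))
            for n in range(length)]

def window_frames(frames):
    """Multiply every frame sample by the matching Hamming window weight."""
    if not frames:
        return []
    window = hamming_window(len(frames[0]))
    return [[s * w for s, w in zip(frame, window)] for frame in frames]
```

With N = 160 and M = 96, consecutive frames share 64 samples: the last 64 samples of one frame are the first 64 samples of the next.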

D. FFT (Fast Fourier Transform)
The FFT converts a signal from its original domain to a representation in the frequency domain. It is also used to characterize the magnitude and phase of a signal, and the absolute value of the FFT can be taken.

Figure 5. FFT in LabVIEW

In Figure 5, X is the input after windowing. The FFT size is selected manually. The FFT converts the time-domain signal to the frequency domain.

E. Mel Filter Bank
A Mel filter bank of 20 filters is applied to the output of the Fast Fourier Transform. The number of filters is a design choice; we assumed 20. The Mel scale is applied to the power spectrum to extract the frequency bands. The aim of the Mel scale is to mimic the non-linear human ear perception of sound. It acts as a time-frequency distributor. Figure 6 shows the Mel filter bank for 20 filters, for which we created a subVI.

Figure 6. Block diagram of Mel Filter bank

F. DCT (Discrete Cosine Transform)
The discrete cosine transform (DCT) is used to decorrelate the filter bank coefficients, a process referred to as whitening. In our project it is applied to the 20 coefficient values. The DCT function converts a time-domain signal into a frequency-domain signal expressed as a sum of cosine functions. Figure 7 shows the DCT in LabVIEW.
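The chain from power spectrum to Mel filter bank to DCT can be sketched in Python as follows. Only the 20 filters and the |FFT|^2 -> Mel filter bank -> DCT order come from the paper. The Hz-to-Mel formula, the triangular filter shape, the logarithm before the DCT and all function names are common MFCC conventions assumed here, and a naive DFT stands in for the FFT (it gives the same spectrum, only more slowly).

```python
import cmath
import math

def power_spectrum(frame):
    """|DFT|^2 for bins 0..N/2, computed with a naive DFT instead of an FFT."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))) ** 2
            for k in range(n // 2 + 1)]

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)   # assumed Mel-scale formula

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filter_bank(num_filters, frame_length, sample_rate):
    """Triangular filters spaced evenly on the Mel scale; one row per filter."""
    max_mel = hz_to_mel(sample_rate / 2.0)
    edges = [mel_to_hz(max_mel * i / (num_filters + 1))
             for i in range(num_filters + 2)]
    bin_hz = [k * sample_rate / frame_length
              for k in range(frame_length // 2 + 1)]
    bank = []
    for m in range(1, num_filters + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for f in bin_hz:
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))     # rising edge
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))     # falling edge
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def dct_ii(values):
    """Unnormalised DCT-II: a sum of cosine terms for each output index."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n)
                for i, v in enumerate(values))
            for k in range(n)]

def mfcc(frame, sample_rate, num_filters=20):
    """Power spectrum -> Mel filter bank -> log (assumed step) -> DCT."""
    spectrum = power_spectrum(frame)
    bank = mel_filter_bank(num_filters, len(frame), sample_rate)
    energies = [sum(w * p for w, p in zip(row, spectrum)) for row in bank]
    log_energies = [math.log(e + 1e-12) for e in energies]  # guard against log(0)
    return dct_ii(log_energies)
```

Calling `mfcc(frame, sample_rate)` on one windowed frame returns 20 coefficients, one per filter, which matches the 20 coefficient values mentioned in the DCT section.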

Figure 7. DCT in LabVIEW

FEATURES FOR DATABASE
The DCT coefficient values for the 20 filters are obtained, and a database of features is collected from them: mean, variance, skewness, kurtosis, 5th-order moment and 6th-order moment.

The nth-order moment formula has been used:

$$\mu_n = E\left[(X-\mu)^n\right] = \int_{-\infty}^{\infty} (x-\mu)^n f(x)\,dx$$

where \mu is the mean of X and f(x) is its probability density function.

In LabVIEW, we use a MathScript node to calculate the features for the database.

Table 1. Moment features for Database.

Moment features   Speaker1   Speaker2   Speaker3   Speaker4
Mean              0.00318    0.00608    0.00364    0.00234
Variance          0.20453    0.20005    0.20456    0.20653
Skewness          0.10632    0.02226    0.58008    0.02507
Kurtosis          8.01769    5.00562    9.44689    4.34228
5th order         300.564    239.626    254.6      307.129
6th order         894924     720329     657032     820714

Table 1 lists the speech features (mean, variance, skewness, kurtosis and the 5th- and 6th-order moments) for several speakers. The moment features are used to find the minimum distance between the trained samples and the test sample in order to identify the speaker. To find out which speaker is speaking, we used the MFCC coefficients. We first compared the mean values of the coefficients across speakers, but the mean varies very little between speakers, so we moved to higher-order moments when building the database.

MINIMUM DISTANCE CLASSIFIER
Euclidean distance is used for the minimum distance classifier method.

$$\lVert x - y \rVert = \sqrt{\sum_{i=1}^{k} (x_i - y_i)^2}$$

where x is the trained sample, y is the testing sample, and k = number of features.

The above distance formula is used to find the minimum distance between the trained samples and the testing sample.
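The moment features and the minimum distance classifier can be sketched in Python as follows. The central moment and Euclidean distance follow the formulas given in the text. The function names and the dictionary-based database are assumptions for this example, and the moments are left unnormalised because the paper does not state which normalisation its skewness and kurtosis values use.

```python
import math

def central_moment(values, order):
    """Sample estimate of E[(X - mean)^order]."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** order for v in values) / len(values)

def moment_features(values):
    """Feature vector: mean followed by central moments of order 2 to 6."""
    mean = sum(values) / len(values)
    return [mean] + [central_moment(values, n) for n in range(2, 7)]

def euclidean_distance(x, y):
    """sqrt(sum_i (x_i - y_i)^2) over the k features."""
    if len(x) != len(y):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

def nearest_speaker(database, test_features):
    """Return (speaker, distance) for the trained sample closest to the test sample.

    `database` maps a speaker name to its stored feature vector (hypothetical layout).
    """
    best = min(database,
               key=lambda name: euclidean_distance(database[name], test_features))
    return best, euclidean_distance(database[best], test_features)
```

A test utterance would be reduced to its feature vector with `moment_features` and then passed to `nearest_speaker` together with the stored vectors of the trained samples.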

RESULT AND DISCUSSION


Table 2. Identified speakers

SPEAKERS   S1         S2         S3         S4         S5
S1         631.855    1.49E+10   1.93E+10   2.74E+09   1.16E+09
S2         1.49E+10   7929.11    4.40E+09   1.22E+10   1.37E+10
S3         1.93E+10   4.40E+09   21868.7    1.66E+10   1.81E+10
S4         2.74E+09   1.22E+10   1.66E+10   1828.6     1.58E+09
S5         1.16E+09   1.37E+10   1.81E+10   1.58E+09   370.264
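The ACCEPT/REJECT decision used in the results can be sketched in Python with the values of Table 2 and the threshold of 8000 chosen in the paper. Reading each row as one test sample compared against all trained speakers, using a strict "less than" comparison, and the names `DISTANCES` and `decide` are assumptions made for this example.

```python
THRESHOLD = 8000.0                       # threshold value chosen in the paper
SPEAKERS = ["S1", "S2", "S3", "S4", "S5"]

# Values copied from Table 2; each row is read as one test sample (assumption).
DISTANCES = {
    "S1": [631.855, 1.49e10, 1.93e10, 2.74e09, 1.16e09],
    "S2": [1.49e10, 7929.11, 4.40e09, 1.22e10, 1.37e10],
    "S3": [1.93e10, 4.40e09, 21868.7, 1.66e10, 1.81e10],
    "S4": [2.74e09, 1.22e10, 1.66e10, 1828.6, 1.58e09],
    "S5": [1.16e09, 1.37e10, 1.81e10, 1.58e09, 370.264],
}

def decide(test_speaker):
    """Return ("ACCEPT", speaker_id) if the smallest distance is below the threshold."""
    row = DISTANCES[test_speaker]
    best_index = min(range(len(row)), key=row.__getitem__)
    if row[best_index] < THRESHOLD:
        return "ACCEPT", SPEAKERS[best_index]
    return "REJECT", None

for speaker in SPEAKERS:
    print(speaker, decide(speaker))
```

Under these assumptions, four of the five rows fall below the threshold and are accepted with the matching speaker ID, while the S3 row, whose smallest value is 21868.7, is rejected.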

Table 2 shows that the matching speaker gives the minimum value. For example, S4 is taken as the testing speech sample and its features are compared with those of all speakers in the database. Table 2 also shows how speakers are identified using a threshold value. We have chosen a threshold value of 8000. If the difference between the speakers is below the threshold, the system gives ACCEPT and also identifies the speaker ID; if the difference is above the threshold, it gives REJECT. The result is displayed on the front panel.

In the above table, the yellow shaded boxes indicate correctly identified speakers, and the red shaded box indicates an incorrectly identified speaker because its value is greater than the threshold value.

CONCLUSION
This work describes a speaker recognition system as part of a biometric security system. Speaker identification using the MFCC method was implemented on the LabVIEW 2014 platform. The features were extracted and stored in a database to be compared with a testing speech sample. In the testing session, Euclidean distance is used to find the minimum distance between the trained and testing signals. The experiments were conducted on the database stored in .wav files, and the system was observed to be accurate up to 85%.