Unit 4 NLP KCS-072
AKTU - UNIT 4
SPEECH SOUNDS
In Natural Language Processing (NLP), understanding how speech sounds are produced and classified is
essential for tasks like speech recognition, synthesis, and phonetic analysis.
Speech sounds are produced when air from the lungs passes through the vocal cords, which can either
vibrate or remain open. This airflow then interacts with the mouth, teeth, tongue, and lips to form different
sounds.
The basic steps involved in the production of speech sounds are:
Initiation: Air is pushed from the lungs through the trachea.
Phonation: The vocal cords either vibrate or remain open to control airflow, creating sound.
Articulation: The sound is shaped by the movement of the tongue, lips, and other parts of the vocal
tract.
APPLICATIONS OF SPEECH SOUNDS IN NLP
Speech Recognition:
Identifying and converting speech sounds (phonemes) into text.
Speech Synthesis:
Generating speech from text by accurately pronouncing phonemes.
Phonetic Transcription:
Writing down the speech sounds of a word using symbols like the International
Phonetic Alphabet (IPA).
CLASSIFICATION OF SPEECH SOUNDS
A. Consonants
Consonants are speech sounds where airflow is partially or fully obstructed in the vocal tract. They are classified based on the following
features:
Place of articulation: The location in the vocal tract where the sound is produced (e.g., lips, teeth, alveolar ridge).
Example: /p/ (bilabial), /t/ (alveolar).
Manner of articulation: The way in which the airflow is restricted or modified.
Example: /p/ (plosive), /s/ (fricative).
Voicing: Whether the vocal cords vibrate (voiced) or not (voiceless).
Example: /b/ (voiced), /p/ (voiceless).
B. Vowels
Vowels are produced without significant obstruction in the vocal tract. They are classified based on:
Height: How high the tongue is in the mouth (high, mid, low).
Example: /i/ (high), /a/ (low).
Backness: How far back the tongue is in the mouth (front, central, back).
Example: /i/ (front), /u/ (back).
Roundness: Whether the lips are rounded or unrounded.
Example: /u/ (rounded), /i/ (unrounded).
C. Semi-Vowels (Glides)
These are sounds that are produced similarly to vowels but function as consonants in certain contexts. They include
sounds like /j/ (as in "yes") and /w/ (as in "wet").
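These articulatory features lend themselves to a simple lookup-table representation, which is how many phonetic resources store them. Below is a minimal Python sketch of the classification above; the phoneme inventory, feature names, and the describe helper are illustrative choices rather than a standard API, and /a/ is treated as central only for the example:

```python
# Minimal phoneme feature table following the classification above.
# The inventory is illustrative, not a complete IPA chart.
CONSONANTS = {
    "p": {"place": "bilabial", "manner": "plosive",   "voiced": False},
    "b": {"place": "bilabial", "manner": "plosive",   "voiced": True},
    "t": {"place": "alveolar", "manner": "plosive",   "voiced": False},
    "s": {"place": "alveolar", "manner": "fricative", "voiced": False},
}

VOWELS = {
    "i": {"height": "high", "backness": "front",   "rounded": False},
    "a": {"height": "low",  "backness": "central", "rounded": False},  # backness assumed
    "u": {"height": "high", "backness": "back",    "rounded": True},
}

def describe(phoneme: str) -> str:
    """Return a one-line articulatory description of a phoneme."""
    if phoneme in CONSONANTS:
        f = CONSONANTS[phoneme]
        voicing = "voiced" if f["voiced"] else "voiceless"
        return f"/{phoneme}/ is a {voicing} {f['place']} {f['manner']}"
    if phoneme in VOWELS:
        f = VOWELS[phoneme]
        rounding = "rounded" if f["rounded"] else "unrounded"
        return f"/{phoneme}/ is a {f['height']} {f['backness']} {rounding} vowel"
    return f"/{phoneme}/ is not in this toy inventory"

print(describe("p"))  # /p/ is a voiceless bilabial plosive
print(describe("u"))  # /u/ is a high back rounded vowel
```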
ARTICULATORY PHONETICS vs ACOUSTIC PHONETICS
Articulatory Phonetics:
Focuses on how speech sounds are produced by the movement of speech organs.
Air is pushed from the lungs to create sound.
Vocal cords vibrate to produce voiced sounds or stay open for voiceless sounds.
Active articulators include the tongue, lips, and teeth.
Passive articulators are fixed parts like the teeth or palate.
Place of articulation refers to where the airflow is blocked (e.g., lips for /p/).
Manner of articulation involves how airflow is manipulated (e.g., plosives or fricatives).
Voicing indicates whether the vocal cords vibrate (voiced) or not (voiceless).
Acoustic Phonetics:
Studies the physical properties of sound waves, such as frequency and amplitude.
Examines how sound travels through the air.
Frequency affects pitch: higher frequency means higher pitch.
Amplitude affects loudness: greater amplitude means louder sound.
Formants define vowel sounds and their resonance.
A spectrogram visually shows frequency distribution over time.
A waveform represents sound pressure variation over time.
Acoustic features are used in speech recognition systems.
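The frequency/pitch and amplitude/loudness relationships above can be made concrete with a short numpy sketch; the sampling rate and tone frequencies here are illustrative values:

```python
import numpy as np

fs = 16000                          # sampling rate in Hz (assumed for the example)
t = np.arange(0, 0.5, 1 / fs)       # 0.5 s of time samples

# Frequency controls pitch: 440 Hz sounds an octave above 220 Hz.
high_pitch = 0.5 * np.sin(2 * np.pi * 440 * t)
low_pitch  = 0.5 * np.sin(2 * np.pi * 220 * t)

# Amplitude controls loudness: same pitch, quieter vs. louder.
quiet = 0.1 * np.sin(2 * np.pi * 440 * t)
loud  = 0.9 * np.sin(2 * np.pi * 440 * t)

# RMS amplitude is one simple proxy for perceived loudness.
for name, sig in [("quiet", quiet), ("loud", loud)]:
    print(name, "RMS =", round(float(np.sqrt(np.mean(sig ** 2))), 3))
```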
Short-Time Fourier Transform (STFT)
The Short-Time Fourier Transform (STFT) is a mathematical technique used to analyze
non-stationary signals, such as speech, by dividing the signal into small, overlapping
segments (called windows). The Fourier Transform is then applied to each segment,
allowing the representation of both time and frequency components simultaneously.
This results in a spectrogram—a visual representation of how the signal's frequency
content changes over time.
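In its standard discrete form (textbook notation, not taken from these notes), the STFT slides a window w over the signal x and applies an N-point DFT to each segment:

```latex
% x[n]: signal, w[n]: analysis window of length N,
% H: hop size, m: frame index, k: frequency bin
X[m, k] = \sum_{n=0}^{N-1} x[n + mH] \, w[n] \, e^{-j 2 \pi k n / N}
```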
STFT is important in speech processing because it enables time-localized analysis of speech, produces the spectrograms used in recognition, and supports both recognition accuracy and noise reduction.
Significance of STFT
Time-Frequency View: STFT lets us see both time and frequency of speech, which helps in analyzing
how speech sounds change over time.
Captures Speech Changes: It helps to capture dynamic speech changes, such as when different
sounds (vowels or consonants) are produced in speech.
Visual Representation: The STFT creates a spectrogram, a visual map that shows which frequencies
are present at each moment of speech, useful for recognizing speech patterns.
Improves Accuracy: By breaking speech into smaller segments, it enables accurate analysis of each
sound, improving tasks like speech recognition.
Noise Reduction: STFT is useful for isolating speech from noise in a recording, enhancing audio clarity
for better processing.
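A minimal numpy sketch of this windowed-FFT computation follows; the frame length, hop size, Hann window, and toy two-tone signal are all illustrative choices (libraries such as scipy.signal provide production implementations of the same idea):

```python
import numpy as np

def stft(x, frame_len=512, hop=128):
    """Compute a magnitude spectrogram by windowed FFTs.

    Returns an array of shape (num_frames, frame_len // 2 + 1):
    rows are time frames, columns are frequency bins.
    """
    window = np.hanning(frame_len)          # taper each segment to reduce leakage
    num_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([
        x[i * hop : i * hop + frame_len] * window
        for i in range(num_frames)
    ])
    return np.abs(np.fft.rfft(frames, axis=1))  # magnitude spectrum per frame

# Toy signal: a 300 Hz tone followed by a 1200 Hz tone, sampled at 8 kHz.
fs = 8000
t = np.arange(0, 0.5, 1 / fs)
x = np.concatenate([np.sin(2 * np.pi * 300 * t), np.sin(2 * np.pi * 1200 * t)])

spec = stft(x)
print(spec.shape)                       # (num_frames, num_bins)
print(spec.argmax(axis=1)[[0, -1]])     # dominant bin shifts between the two halves
```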
Digital Signal Processing (DSP)
Digital Signal Processing (DSP) involves manipulating signals that have been converted into a digital format.
Signals, such as audio, video, or sensor data, are first sampled and quantized to produce discrete values. DSP
techniques are then applied to process these digital signals for various purposes like filtering, analysis,
enhancement, compression, or transformation.
Sampling: The continuous (analog) signal is converted into discrete values by measuring it at regular
intervals determined by the sampling rate.
Quantization: After sampling, the continuous values are approximated to finite levels (quantized) to
convert the signal into a digital format.
Processing: Various operations are performed on the digital signal, such as filtering (removing noise),
transforming (Fourier Transform, for example), and extracting features.
Reconstruction: After processing, the digital signal can be converted back to an analog signal (using a
Digital-to-Analog Converter, DAC) for playback or further use.
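The sampling, quantization, and reconstruction steps can be sketched in a few lines of numpy; the 8 kHz rate, 8-bit depth, and [-1, 1) amplitude range below are assumptions made for the example:

```python
import numpy as np

fs = 8000                              # sampling rate: 8000 samples per second
bits = 8                               # quantizer resolution (assumed)
levels = 2 ** bits                     # 256 discrete amplitude levels

# Sampling: evaluate the "analog" signal at regular intervals 1/fs.
t = np.arange(0, 0.01, 1 / fs)         # 10 ms of samples
analog = 0.8 * np.sin(2 * np.pi * 440 * t)

# Quantization: map each continuous value in [-1, 1) to the nearest of the
# finite levels, producing integer codes suitable for digital storage.
codes = np.round((analog + 1.0) / 2.0 * (levels - 1)).astype(np.uint8)

# Reconstruction: map integer codes back to approximate amplitudes
# (a real system would then pass this through a DAC and smoothing filter).
restored = codes.astype(float) / (levels - 1) * 2.0 - 1.0

print("max quantization error:", float(np.max(np.abs(restored - analog))))
```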
Filter-Bank Method in Digital Signal Processing (DSP)
The Filter-Bank Method in Digital Signal Processing (DSP) involves breaking a signal into multiple frequency
components by passing it through a series of filters. Each filter is designed to capture a specific frequency range of the
signal. This method allows for detailed analysis of the signal by focusing on different frequency bands. It mimics the
way the human auditory system processes sound, breaking down complex signals into simpler frequency components.
The Filter-Bank method is particularly useful in speech recognition, where it helps extract features from speech signals,
and in audio compression, where it helps focus on the most important frequency components, reducing redundancy
and making the signal more efficient for storage or transmission.
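As a rough sketch of the idea, the toy Python filter bank below splits a signal into equal-width frequency bands using ideal FFT-domain masks and reports the energy in each band. Real speech front ends typically use mel-spaced, overlapping triangular filters; the equal-width bands and the filterbank_energies helper here are illustrative assumptions:

```python
import numpy as np

def filterbank_energies(x, fs, num_bands=8):
    """Split a signal into equal-width frequency bands and return the
    energy in each band (ideal bandpass filters applied in the FFT domain)."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1 / fs)
    edges = np.linspace(0, fs / 2, num_bands + 1)    # band boundaries in Hz
    energies = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (freqs >= lo) & (freqs < hi)          # one ideal bandpass filter
        energies.append(float(np.sum(np.abs(spectrum[mask]) ** 2)))
    return energies

# Toy input: 700 Hz + 3300 Hz tones at 8 kHz; the energy should
# concentrate in the two bands containing those frequencies.
fs = 8000
t = np.arange(0, 0.1, 1 / fs)
x = np.sin(2 * np.pi * 700 * t) + np.sin(2 * np.pi * 3300 * t)

for i, e in enumerate(filterbank_energies(x, fs)):
    print(f"band {i}: {e:.1f}")
```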