0% found this document useful (0 votes)
29 views16 pages

HMM Overview

The document discusses hidden Markov models and their use in speech recognition. It provides an overview of hidden Markov models, including their components and examples of how they can be used for filtering, prediction, and smoothing. It also discusses how hidden Markov models are applied to speech recognition, including modeling phones, pronunciation, and language.

Uploaded by

sfaisalaliuit
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
29 views16 pages

HMM Overview

The document discusses hidden Markov models and their use in speech recognition. It provides an overview of hidden Markov models, including their components and examples of how they can be used for filtering, prediction, and smoothing. It also discusses how hidden Markov models are applied to speech recognition, including modeling phones, pronunciation, and language.

Uploaded by

sfaisalaliuit
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
You are on page 1/ 16

Hidden Markov Models

Dave DeBarr
[email protected]
Overview
• General Characteristics

• Simple Example

• Speech Recognition
Andrei Markov
• Russian statistician (1856 – 1922)
• Studied temporal probability models
• Markov assumption
– Statet depends only on a bounded subset of State0:t-1
• First-order Markov process
– P(Statet | State0:t-1) = P(Statet | Statet-1)
• Second-order Markov process
– P(Statet | State0:t-1) = P(Statet | Statet-2:t-1)
Hidden Markov Model (HMM)
• Evidence can be observed, but the state is
hidden
• Three components
– Priors (initial state probabilities)
– State transition model
– Evidence observation model
• Changes are assumed to be caused by a
stationary process
– The transition and observation models do not change
Simple HMM
• Security guard resides in underground facility
(with no way to see if it is raining)
• Wants to determine the probability of rain given
whether the director brings an umbrella
• P(Rain0 = t) = 0.50
What can you do with an HMM?
• Filtering
– P(Statet | Evidence1:t)
• Prediction
– P(Statet+k | Evidence1:t)
• Smoothing
– P(Statek | Evidence1:t)
• Most likely explanation
– argmaxState1:t P(State1:t | Evidence1:t)
Filtering
(the forward algorithm)
P(Rain1 = t)
= ΣRain0 P(Rain1 = t | Rain0) P(Rain0)
=0.70 * 0.50 + 0.30 * 0.50 = 0.50

P(Rain1 = t | Umbrella1 = t)
= α P(Umbrella1 = t | Rain1 = t) P(Rain1 = t)
= α * 0.90 * 0.50 = α *0.45 ≈ 0.818

P(Rain2 = t | Umbrella1 = t)
= ΣRain1 P(Rain2 = t | Rain1) P(Rain1 | Umbrella1 = t)
= 0.70 * 0.818 + 0.30 * 0.182 ≈ 0.627

P(Rain2 = t | Umbrella1 = t, Umbrella2 = t)


= α P(Umbrella2 = t | Rain2 = t) P(Rain2 = t | Umbrella1 = t)
= α * 0.90 * 0.627 ≈ α * 0.564 ≈ 0.883
Smoothing
(the forward-backward algorithm)
P(Umbrella2 = t | Rain1 = t)
= ΣRain2 P(Umbrella2 = t | Rain2) P(* | Rain2) P(Rain2 | Rain1 = t)
= 0.9 * 1.0 * 0.7 + 0.2 * 1.0 * 0.3 = 0.69
P(Rain1 = t | Umbrella1 = t, Umbrella2 = t)
= α * 0.818 * 0.69 ≈ α * 0.56 ≈ 0.883
Most Likely Explanation
(the Viterbi algorithm)
P(Rain1 = t, Rain2 = t | Umbrella1 = t, Umbrella2 = t)
= P(Umbrella1 = t | Rain1 = t)
* P(Rain2 = t | Rain1 = t)
* P (Umbrella2 = t | Rain2 = t)
= 0.818 * 0.70 * 0.90 ≈ 0.515
Speech Recognition
(signal preprocessing)
Speech Recognition
(models)
• P(Words | Signal) = α P(Signal | Words) P(Words)

• Decomposes into an acoustic model and a language


model
– Ceiling or Sealing
– High ceiling or High sealing

• A state in a continuous speech HMM may be labeled


with a phone, a phone state, and a word
Speech Recognition
(phones)
• Human languages use a limited repertoire
of sounds
Speech Recognition
(phone model)
• Acoustic signal for [t]
– Silent beginning
– Small explosion in the middle
– (Usually) Hissing at the end
Speech Recognition
(pronounciation model)
• Coarticulation and dialect variations
Speech Recognition
(language model)

• Can be as simple as bigrams

P(Wordi | Word1:i-1) = P(Wordi | Wordi-1)


References
• Artificial Intelligence: A Modern Approach
– Second Edition (2003)
– Stuart Russell & Peter Norvig
• Hidden Markov Model Toolkit (HTK)
– http://htk.eng.cam.ac.uk/
– Nice tutorial (from data prep to evaluation)

You might also like