Speech Processing -Anu
• Despite all these problems, adults recognize spoken words correctly.
• Most current models map the speech signal onto sub-lexical units, then onto words, and finally onto semantics.
• Recognition of a unit occurs when its activation exceeds either a threshold or some activation state relative to all other units at its level.
• The simplest way to study spoken word recognition is to measure ‘recognisability’, i.e., identification of words presented in noise or as truncated or filtered speech stimuli.
• However, these tasks fail to provide a reliable measure of reaction time because of their variability.
Speech periods (time cycles that define pitch or rhythm) are harder to
detect when masked by noise.
This creates challenges for tone analysis and pitch tracking.
The following methods have been used to deal with additive noise:
•Mimic how humans process sounds using our ears and brain.
•Identify speech features (pitch, formants, etc.) while ignoring irrelevant noise.
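The pitch-tracking problem above can be illustrated with a minimal autocorrelation pitch estimator; the signal and parameters below are synthetic and purely illustrative:

```python
import math

def autocorrelation_pitch(signal, sample_rate, fmin=50, fmax=500):
    """Estimate F0 by finding the lag (speech period, in samples)
    that maximises the frame's autocorrelation."""
    n = len(signal)
    lag_min = int(sample_rate / fmax)  # shortest period considered
    lag_max = int(sample_rate / fmin)  # longest period considered
    best_lag, best_corr = lag_min, float("-inf")
    for lag in range(lag_min, min(lag_max, n - 1)):
        corr = sum(signal[i] * signal[i + lag] for i in range(n - lag))
        if corr > best_corr:
            best_corr, best_lag = corr, lag
    return sample_rate / best_lag

# A clean, synthetic 200 Hz 'voiced' frame sampled at 8 kHz:
sr = 8000
frame = [math.sin(2 * math.pi * 200 * t / sr) for t in range(400)]
print(autocorrelation_pitch(frame, sr))  # 200.0
```

With additive noise the autocorrelation peak flattens out, which is exactly why the noise-handling methods below are needed.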
Noise Reduction and Suppression:
•Algorithms that estimate and subtract noise from the speech signal.
•Example: Noise gates or spectral subtraction.
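Spectral subtraction, named above, can be sketched as follows; for illustration the noise spectrum is estimated from the noise itself, whereas real systems estimate it from speech-free frames (NumPy is assumed):

```python
import numpy as np

def spectral_subtraction(noisy, noise_frame, floor=0.01):
    """Subtract an estimated noise magnitude spectrum from the noisy
    spectrum, keep the noisy phase, and resynthesise the waveform."""
    spec = np.fft.rfft(noisy)
    noise_mag = np.abs(np.fft.rfft(noise_frame))
    # Never let magnitudes go negative: keep a small spectral floor.
    clean_mag = np.maximum(np.abs(spec) - noise_mag, floor * np.abs(spec))
    return np.fft.irfft(clean_mag * np.exp(1j * np.angle(spec)), n=len(noisy))

# Toy example: a 100 Hz tone buried in white noise.
rng = np.random.default_rng(0)
t = np.arange(1024) / 8000
tone = np.sin(2 * np.pi * 100 * t)
noise = 0.5 * rng.standard_normal(1024)
denoised = spectral_subtraction(tone + noise, noise)
```

The spectral floor is the standard guard against "musical noise": bins where subtraction would go negative are clamped to a small fraction of the original magnitude instead.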
Noise Masking:
•Add specific types of noise (like white noise) to “cover” the unpleasant or interfering noise.
•Used when noise cannot be removed entirely.
Adaptive Models:
•Filters that update their noise estimate continuously as conditions change (e.g., adaptive noise cancellation).
Participants progressively recognized cap, cat, cab more accurately as the vowel
duration increased.
At about halfway through the vowel, participants could correctly identify cat, cap,
and cab more often than by random chance.
For cash, participants needed the final consonant (/ʃ/) to correctly identify it.
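The progressive narrowing seen with cap, cat, cab, and cash can be sketched as a cohort filter over a toy lexicon (spelling stands in for phonemes here):

```python
def cohort(prefix, lexicon):
    """Return every word still consistent with the input heard so far."""
    return sorted(w for w in lexicon if w.startswith(prefix))

lexicon = {"cap", "cat", "cab", "cash", "dog"}
print(cohort("ca", lexicon))   # ['cab', 'cap', 'cash', 'cat']
print(cohort("cat", lexicon))  # ['cat']
```

Only when the input uniquely distinguishes a candidate (as the final /ʃ/ does for cash) does the cohort shrink to one word.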
Study 2: Grosjean (1980)
Spoken Word Recognition Processes and Gating Paradigm
Method:
•Words of varying lengths and frequencies were presented to participants in three contexts:
• In isolation: No extra context, just the word.
• In short context: Minimal surrounding linguistic context.
• In long context: A sentence or phrase providing substantial contextual
information.
•The presentation of each word was incremental (word duration increased gradually).
•After each increment:
• Participants wrote down their guess of the word.
• Indicated their confidence level in the guess.
Findings:
Context Helps:
•Words presented in longer contexts were identified after less acoustic input than words in short contexts or in isolation.
3. Lexical Decision
•A lexical decision task measures how participants process and recognize words in real time.
•Task: Participants decide whether a given stimulus is a real word (e.g., "umbrella") or a non-word (e.g., "umbrellir").
•Purpose: To study lexical access—how quickly and efficiently the brain retrieves information about words
from the mental lexicon (our internal "dictionary").
1.Online Processing:
•Word Length:
• Longer words generally take more time to process
•Word Frequency:
• High-frequency words (e.g., "car") are recognized faster than low-frequency words (e.g., "vial").
•Non-Words: Non-words that closely resemble real words (e.g., "umbrellir") take longer
to reject than completely nonsensical ones (e.g., "flobber").
4.Lexical Access:
Latency (response time) indicates how quickly the brain retrieves a word from the mental lexicon.
Faster RTs suggest easier or more automatic access.
4. Word spotting
•Word spotting is a task in speech processing where participants or systems identify specific
target words embedded within a continuous stream of speech.
•Unlike full sentence recognition, word spotting focuses only on detecting whether a word exists
in the input.
Why Use Word Spotting?
To study how listeners or machines identify key words in noisy or complex speech environments.
Example: A sentence like “I ate an apple pie” contains the target word “apple.”
Acoustic Salience:
•Words with distinct acoustic features (e.g., stress, intonation) are easier to spot.
•Example: “APPLE” in a loud, clear tone is easier to detect than “apple” in a monotone speech.
•Frequency: High-frequency words (e.g., “dog”) are detected more easily than low-frequency words (e.g.,
“lichen”)
•Length: Shorter words (e.g., “cat”) are harder to spot than longer words due to potential overlaps with
parts of other words
Speech Rate:
•Faster speech reduces word spotting accuracy.
•Slower speech gives more time for processing and increases accuracy.
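The basic word-spotting task (finding a target sequence inside a continuous stream with no boundary cues) can be sketched as a simple scan; letters stand in for phonemes:

```python
def spot_word(target, stream):
    """Return every index at which the target sequence occurs
    inside the continuous stream."""
    return [i for i in range(len(stream) - len(target) + 1)
            if stream[i:i + len(target)] == target]

print(spot_word(list("apple"), list("fapplesauce")))  # [1]
print(spot_word(list("dog"), list("fapplesauce")))    # []
```

Shorter targets match spuriously more often (e.g., cat occurs inside concatenate), which is one reason shorter words are harder to spot.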
McQueen and Cutler (1998): Word Spotting in Contexts
Study Design:
1. Participants were given nonsense speech stimuli containing real words randomly embedded.
2. Words were presented in different contexts:
1. Syllabic context: e.g., "vuffapple" (the word is preceded by a syllable, providing a likely word boundary).
2. Consonantal context: e.g., "fapple" (the word is preceded by a lone consonant, leaving no clear word boundary).
Findings:
1. Syllabic Context is Better: Words were easier to spot in longer syllabic contexts (e.g.,
"vuffapple") than shorter consonantal ones (e.g., "fapple").
2. Phonotactic Probability:
1. Detection improves when the structure of the nonsense speech (e.g., its
phonotactics) predicts where a word boundary should occur.
2. Example: "venlip" makes "lip" easier to spot than "veglip," where phonotactic rules
do not suggest a clear boundary.
Phonotactic Probabilities
Definition: Phonotactic rules determine the likelihood of certain sound sequences in a language.
• E.g., In English, "lip" in "venlip" is segmented easily because English phonotactics favors a
syllable break before "lip."
• Conversely, "lip" in "veglip" is harder to segment due to unnatural syllable boundaries.
Impact: When nonsense stimuli align with natural language phonotactics, the embedded word is
recognized more easily.
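One simple way to operationalise phonotactic probability is to count sound-pair (bigram) frequencies over a lexicon. The toy lexicon below is illustrative, so the scores reflect it rather than real English phonotactics:

```python
from collections import Counter

def bigram_probs(lexicon):
    """Estimate phonotactic (bigram) probabilities from a toy lexicon,
    padding with '#' to capture word edges."""
    bigrams = Counter()
    for word in lexicon:
        padded = "#" + word + "#"
        bigrams.update(zip(padded, padded[1:]))
    total = sum(bigrams.values())
    return {bg: n / total for bg, n in bigrams.items()}

def sequence_score(seq, probs):
    """Product of bigram probabilities; a low score flags a sequence
    that the phonotactics disfavours, hinting at a word boundary."""
    score = 1.0
    for bg in zip(seq, seq[1:]):
        score *= probs.get(bg, 1e-6)  # unseen bigram: tiny floor value
    return score

probs = bigram_probs(["lip", "lid", "venlip", "net", "pen"])
# 'nl' is attested in this toy lexicon while 'gl' is not, so:
print(sequence_score("venlip", probs) > sequence_score("veglip", probs))  # True
```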
Similarity Neighbourhoods
A word’s similarity neighbourhood is the set of words that sound like it.
Sparse Neighbourhoods:
• Recognized faster.
• Recognized with higher accuracy.
Dense Neighbourhoods:
• Recognition is slowed by competition among the many similar-sounding neighbours.
5. Phoneme-Triggered Lexical Decision
A phoneme-triggered lexical decision task is a variant of the lexical decision task designed to investigate how phoneme-level and word-level information interact during sentence processing.
The task focuses on lexical access: how quickly and efficiently participants recognize real words beginning with a specified phoneme.
Setup:
• Participants are presented with a set of sentences.
• Before hearing each sentence, they are given a target phoneme to listen for (e.g., /k/).
Task:
• Participants must:
• Identify real words beginning with the specified phoneme as they occur in the sentence.
• Ignore nonsense words (even if they contain the target phoneme).
• Example:
• Target phoneme: /k/.
• Sentence: "Bobby drove the car into the lake."
• Participant's Response: Press the button on hearing "car."
Manipulation
• The speed of lexical access is varied by altering the semantic predictability of the
target word.
• Semantically related context: The verb or preceding words strongly suggest the
target word (e.g., "drove the car").
• Semantically unrelated context: No strong association with the target word (e.g.,
"saw the car").
Findings (Blank, 1980)
Objective: The study explored how people perceive, process, and repeat words and non-words containing the fricatives /s/ or /ʃ/, and the factors influencing naming times and error rates in an auditory task.
Stimuli Used
Fricative: Whether the sound was /s/ (e.g., mess) or /ʃ/ (e.g., mesh).
Lexicality:
Whether the item was a real word (e.g., mess) or a non-word (e.g., ness).
Location:
Initial fricative: At the start of the word (e.g., sack).
Final fricative: At the end of the word (e.g., mess).
Changeability:
This describes whether changing the fricative identity changes the item’s lexical status:
Example: Mess → Mesh (both real words).
Example: Ness → Nesh (both non-words).
Experiment Design
•Task:
• Participants listened to all versions of the stimuli via headphones.
• They repeated what they heard into a microphone.
Results
When presented with a mismatched version of a stimulus, subjects perceived the correct form of the word, indicating that fricatives are important for word recognition.
7. Continuous Speech (Shadowing)
by Marslen-Wilson, 1985
What is Shadowing?
Definition:
Shadowing is a task where a subject listens to spoken language and repeats it back
immediately, word-for-word, with minimal delay.
Purpose:
The experiment is designed to study speech perception and how the brain
processes and repeats language in real time.
Chistovich's 1960 Findings
Two Types of Shadowers
1.Close Shadowers:
1. Delays: Very short, about 150–200 milliseconds (msec).
2. Characteristics: Speech is slurred and difficult to analyze for accuracy.
Demonstrates rapid and efficient speech perception, where the listener almost immediately
anticipates what they hear.
2.Distant Shadowers:
1. Delays: Longer, between 500–1500 msec.
2. Characteristics: Speech is clear and easy to understand.
Conclusion:
Participants:
•65 participants, including men and women
Key Observations:
1.Syntactic Prose:
1. Contains normal syntax (grammatical sentence structure), but is semantically
meaningless.
Example: The blue ideas sleep furiously.
3.Jabberwocky:
1. Maintains basic syntax, but the words are replaced with nonsense words by modifying
sounds.
2. Inspired by Lewis Carroll’s Jabberwocky poem.
Example: Twas brillig and the slithy toves.
Second Series of Experiments
Purpose: To determine if close and distant shadowers process syntactic and semantic information during
shadowing.
Findings:
•Both close and distant shadowers actively analyze the syntax and semantics of the material while
shadowing.
•Evidence:
• Spontaneous Errors: Their errors were constrained by the syntactic and semantic structure of
the prose, meaning their mistakes weren’t random but followed logical rules.
• Sensitivity to Disruptions: Both groups showed reduced performance when the syntactic or
semantic structure of the material was disrupted.
Third Series of Experiments
Findings:
Close Shadowers:
• Use on-line (real-time) speech analysis to drive their articulation.
• Begin repeating speech before they are fully aware of the material's meaning, relying on rapid
processing.
• Advantage: Close shadowing provides a more direct reflection of language comprehension,
with minimal interference from later (post-perceptual) processes.
Distant Shadowers:
• Use slower, more deliberate output strategies, requiring greater conscious awareness of the
material.
Vitevitch and Luce’s Shadowing Task
Studies (1998, 1999, 2005)
Research Focus:
They examined how phonotactic probabilities and neighborhood density influence the speed
of shadowing.
1.Phonotactic Probabilities:
1. Likelihood of a sound sequence occurring in a language (e.g., str in "street" is high
probability, whereas fsr is low probability).
2.Neighborhood Density:
1. Refers to how many words sound similar to a given word.
2. Example: Cat has a dense neighborhood (bat, mat, sat), whereas quirk has a sparse
neighborhood.
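Neighbourhood density is usually counted as the number of words differing from the target by one phoneme (a single substitution, addition, or deletion). A minimal sketch over a toy lexicon, again with letters standing in for phonemes:

```python
def is_neighbor(a, b):
    """True if b differs from a by one substitution, addition,
    or deletion of a single segment."""
    if a == b:
        return False
    if len(a) == len(b):
        return sum(x != y for x, y in zip(a, b)) == 1
    if abs(len(a) - len(b)) == 1:
        short, long = sorted((a, b), key=len)
        return any(long[:i] + long[i + 1:] == short for i in range(len(long)))
    return False

def neighborhood(word, lexicon):
    return sorted(w for w in lexicon if is_neighbor(word, w))

lexicon = {"cat", "bat", "mat", "sat", "cut", "cart", "quirk"}
print(neighborhood("cat", lexicon))    # ['bat', 'cart', 'cut', 'mat', 'sat']
print(neighborhood("quirk", lexicon))  # []
```

Against this toy lexicon, cat has a dense neighbourhood and quirk a sparse one, mirroring the example above.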
Interpretation
Non-words:
• Non-words with high phonotactic probabilities (common sound
sequences) and dense neighborhoods were repeated faster than those
with low probabilities and sparse neighborhoods.
Words:
• The opposite was true—words with low phonotactic probabilities and
sparse neighborhoods were repeated faster than those with high
probabilities and dense neighborhoods.
8. Tokens embedded in Words and
Non-Words
Study by Zhang & Samuel (2015): Tokens Embedded in Words and Non-Words
Objective:
•Investigated how listeners process English words containing shorter words embedded within them (e.g.,
ham in hamster).
•Used auditory priming experiments to assess when embedded words become activated under varying
listening conditions.
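The embedding relation studied here (e.g., ham and am inside hamster) can be enumerated with a simple scan over a toy lexicon; real studies work over phonemic transcriptions:

```python
def embedded_words(carrier, lexicon, min_len=2):
    """List (word, position) for every shorter lexicon word
    contained in the carrier."""
    found = []
    for i in range(len(carrier)):
        for j in range(i + min_len, len(carrier) + 1):
            sub = carrier[i:j]
            if sub != carrier and sub in lexicon:
                found.append((sub, i))
    return found

lexicon = {"ham", "hamster", "am", "ha"}
print(embedded_words("hamster", lexicon))  # [('ha', 0), ('ham', 0), ('am', 1)]
```

Position matters for the priming results: word-initial embeddings like ham (position 0) behave differently from word-internal ones like am.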
Experiment 1: Optimal Listening Conditions
Findings:
• Isolated embedded words primed targets (ham → pig) in all conditions.
• In carrier words: Priming occurred only if the embedded word was at the beginning or
comprised a large proportion of the carrier word.
Experiment 2: Duration Change
Findings:
• Significant priming for isolated embedded words, even under duration changes.
• No priming when carrier words were compressed or expanded.
Experiment 3: Segment Loss
•Method: Replaced a segment of carrier words with noise (e.g., h_noise_mster).
•Findings: Priming was eliminated, indicating embedded word activation relies on intact speech signals.
Findings
Priming for embedded words persisted in isolation (ham), but not when embedded in carrier words (hamster).
Overall Findings:
1.Activation Factors:
1. Embedded words are activated if they are at the beginning of the carrier word.
2. Activation is stronger when embedded words constitute a large proportion of the carrier
word.
2.Listening Conditions:
1. Embedded word activation occurs only under optimal conditions (e.g., clear speech,
minimal distortion).
2. Under suboptimal conditions (e.g., noise, duration changes, cognitive load), activation is
impaired, especially in carrier words.
Study by Vroomen & de Gelder (1997): Embedded
Monosyllables
Objective:
•Explored cross-modal priming in Dutch to study when monosyllables embedded within other words or
non-words are activated.
Key Findings:
Context
1.Two-Syllable Words:
1. Example: framboos (strawberry) contains the embedded word boos (angry).
2. Finding: Embedded words (boos) produced significant priming for related words, showing
activation in two-syllable contexts.
2.Monosyllabic Words:
1. Example: swine contains wine.
2. Finding: No priming was found for embedded words (wine) in monosyllabic carriers.
Position Effects:
1. Initial Position:
1. Example: vel (skin) in velk (non-word) or velg (word).
2. Finding: Priming occurred when the carrier was a non-word (velk) but not when it was a
word (velg).
3. Longer words inhibit activation of shorter, embedded words due to lexical
competition.
2. Final Position:
1. Example: wine in swine or twine.
2. Finding: No evidence of embedded word activation.
Conclusion:
1.Lexical Competition:
Embedded word activation is influenced by lexical competition, where longer words inhibit
shorter embedded words in word contexts.
2.Syllable Onset:
Activation is stronger when the embedded word has a matching syllable onset with the
lexical representation.
9. Rhyme Monitoring (Marslen-
Wilson, 1980)
The subjects are presented auditorily with sentences, and a cue word rhyming with the target word is presented.
•In a variant of the task, instead of a rhyming cue, the cue word is semantically related to the target word.
• Example:
• Target word: lead
• Cue word: metal
•Subjects listen to a sentence and press a switch when they hear the target word.
•Nature of Sentences:
•Sentences can be:
• Meaningful: Contextually coherent.
• Nonsense: Lacking semantic coherence
Word Monitoring Paradigm:
2. Independent Variables:
1. Nature of the target word (e.g., semantically related or unrelated).
2. Position of the target word in the sentence.
3. Context of the sentence (e.g., meaningful or nonsense).
3. Dependent Variables:
1. Response Latency: Time taken to press the switch.
2. Error Rate: Missed or incorrect responses.
3. Brain Imaging Data: Neural activity during the task
Findings
1.Participants' Task:
1. Participants listen to a sentence played to them.
2. Before the sentence finishes, they are shown a visual stimulus
(either a word or a picture) on a screen.
3. The visual stimulus can either be:
1. Related (or identical) to a word they heard in the sentence
earlier (e.g., the word "dog" after hearing "animal").
2. Unrelated to the sentence they heard.
2.Response:
1. As soon as they see the word/picture, they are instructed to press a button as quickly as
possible.
1. For words: They decide whether the word is a valid word or a non-word (lexical
decision task).
2. For pictures: They classify the picture, such as determining if it depicts an animate
or inanimate object (e.g., animacy task).
3.Priming Effect:
1. When the visual stimulus is related (or identical) to a word heard earlier in the sentence, reaction times (RTs) are faster than when it is unrelated.
Summary and Implications
Spoken word recognition involves the activation of multiple word candidates on the basis
of the initial speech input—the “cohort”—and selection among these competitors.
Zhuang et al. (2011) examined the potential interaction of bottom-up and top-down processes in an fMRI study by presenting participants with words and pseudowords for lexical decision.
•In words with high competition cohorts, high imageability words generated
stronger activity than low imageability words, indicating a facilitatory role of
imageability in a highly competitive cohort context.
•These results support the behavioral data in showing that selection processes
do not rely solely on bottom-up acoustic-phonetic cues but rather that the
semantic properties of candidate words facilitate discrimination between
competitors.
• They found greater activity in the left inferior frontal
gyrus (BA 45, 47) and the right inferior frontal
gyrus (BA 47) with increased cohort competition
• An imageability effect in the left posterior middle
temporal gyrus/angular gyrus (BA 39)
• A significant interaction between imageability and
cohort competition in the left posterior superior
temporal gyrus/ middle temporal gyrus (BA 21, 22).
PREVIOUS YEAR QUESTIONS
Briefly describe the pros and cons of any one method used in SWR
research (5) 2021
1. Zhuang, J., Randall, B., Stamatakis, E. A., Marslen-Wilson, W. D., & Tyler, L. K. (2011). The interaction of
lexical semantics and cohort competition in spoken word recognition: an fMRI study. Journal of Cognitive
Neuroscience, 23(12), 3778-3790.
2. Kilborn, K., & Moss, H. (1996). Word monitoring. Language and Cognitive Processes, 11(6), 689-694. DOI: 10.1080/016909696387105
Thank you