Intelligibility (communication)

From Wikipedia, the free encyclopedia
Revision as of 18:03, 18 July 2015

In speech communication, intelligibility is a measure of how comprehensible speech is in given conditions. Intelligibility is affected by the quality of the speech signal, the type and level of background noise, reverberation, and, for speech over communication devices, the properties of the communication system. The concept of speech intelligibility is relevant to several fields, including phonetics, human factors, acoustical engineering, and audiometry.

Noise levels and reverberation

Intelligibility is negatively impacted by background noise and reverberation. The relationship between sound and noise levels is generally described in terms of a signal-to-noise ratio. With a masking noise level between 35 and 100 dB, the threshold for 100% intelligibility is usually a signal-to-noise ratio of 12 dB.[1] The speech signal ranges from about 200-8000 Hz, while human hearing ranges from about 20-20,000 Hz, so the effects of masking depend on the frequency range of the masking noise. Additionally, different speech sounds make use of different parts of the speech frequency spectrum, so a continuous background noise such as white or pink noise will have a different effect on intelligibility than a variable or modulated background noise such as competing speech, multi-talker or "cocktail party" babble, or industrial machinery.
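As a rough illustration of the signal-to-noise ratio described above, the following Python sketch computes an SNR in decibels from amplitude samples and compares it with the 12 dB figure quoted in the text. The function names and the sample values are hypothetical, and the sketch assumes that speech and noise are available as separate sample sequences, which is rarely true of real recordings.

```python
import math

def rms(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))

def snr_db(signal, noise):
    """Signal-to-noise ratio in decibels, computed from amplitude samples."""
    return 20.0 * math.log10(rms(signal) / rms(noise))

# Threshold quoted in the text for masking noise between 35 and 100 dB.
FULL_INTELLIGIBILITY_SNR_DB = 12.0

def meets_threshold(signal, noise, threshold_db=FULL_INTELLIGIBILITY_SNR_DB):
    """True if the SNR reaches the quoted threshold (illustrative only)."""
    return snr_db(signal, noise) >= threshold_db

# Hypothetical toy data: speech amplitude is four times the noise amplitude.
speech = [0.4, -0.4, 0.4, -0.4]
noise = [0.1, -0.1, 0.1, -0.1]
print(round(snr_db(speech, noise), 2))
print(meets_threshold(speech, noise))
```

An amplitude ratio of 4 corresponds to about 12 dB, so this toy example sits just above the quoted threshold. The sketch ignores the frequency content of the masker, which the paragraph above notes also matters.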

Reverberation also affects the speech signal by blurring speech sounds over time. This has the effect of enhancing vowels with steady states, while masking stops, glides and vowel transitions, and prosodic cues such as pitch and duration.[2]
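The blurring described above can be sketched by convolving a signal with a decaying impulse response. This is a deliberately simplified model, not a method from the cited source: the exponential tail, its length and decay factor, and the toy "dry" signal (a steady segment, a silent gap standing in for a stop closure, and a short burst) are all assumptions chosen for illustration.

```python
def convolve(signal, impulse_response):
    """Direct (full) convolution of two sample sequences."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out

def decaying_tail(length, decay):
    """Hypothetical room response: an exponentially decaying tail."""
    return [decay ** n for n in range(length)]

# Toy "dry" signal: steady vowel-like segment, silent gap, short burst.
dry = [1.0] * 5 + [0.0] * 3 + [1.0]
wet = convolve(dry, decaying_tail(length=6, decay=0.5))

# Compare the gap (samples 5 to 7) before and after reverberation.
print(dry[5:8])
print([round(x, 3) for x in wet[5:8]])
```

In the dry signal the gap is silent, while in the reverberant signal it carries decaying energy from the preceding steady segment. This is the sense in which brief events such as stop closures are masked while steady-state vowels are reinforced.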

Intelligibility standards

Measure            Quantity measured                          Good values
STI[3]             Intelligibility (internationally known)    > 0.6
CIS                Intelligibility (internationally known)    > 0.78
%Alcons            Articulation loss (popular in the USA)     < 10%
C50                Clarity index (widespread in Germany)      > 3 dB
RASTI (obsolete)   Intelligibility (internationally known)    > 0.6
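The "good values" in the table can be encoded as a simple lookup, as in the Python sketch below. The dictionary and function names are hypothetical. The conversion `CIS = 1 + log10(STI)` is not stated in this article. It is included as an assumption because it is the commonly quoted relationship between the two scales and is consistent with the table, mapping an STI of 0.6 to a CIS of about 0.78.

```python
import math

# "Good" thresholds copied from the table: (comparison, limit).
GOOD_VALUES = {
    "STI": (">", 0.6),
    "CIS": (">", 0.78),
    "%Alcons": ("<", 10.0),  # percent
    "C50": (">", 3.0),       # dB
    "RASTI": (">", 0.6),
}

def is_good(measure, value):
    """Check a measured value against the table's threshold."""
    op, limit = GOOD_VALUES[measure]
    return value > limit if op == ">" else value < limit

def cis_from_sti(sti):
    """Assumed conversion CIS = 1 + log10(STI); not given in the source."""
    return 1.0 + math.log10(sti)

print(is_good("STI", 0.65))
print(is_good("%Alcons", 12.0))
print(round(cis_from_sti(0.6), 2))
```

Note that %Alcons is the only measure here for which lower values are better, which is why the comparison direction is stored alongside each limit.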

Word articulation remains high even when only 1–2% of the wave is unaffected by distortion.[4]

Intelligibility with different types of speech

Lombard speech

Speakers involuntarily modify their speech when talking in noise, a process called the Lombard effect. Such speech is more intelligible than normal speech. It is not only louder: its fundamental frequency is raised and its vowels are lengthened. Speakers also tend to make more noticeable facial movements.[5][6]

Screaming

Shouted speech is less intelligible than Lombard speech because increased vocal energy produces decreased phonetic information.[7]

Clear speech

Clear speech is used when talking to a person with a hearing impairment. It is characterized by a slower speaking rate, more and longer pauses, elevated speech intensity, increased word duration, "targeted" vowel formants, increased consonant intensity compared to adjacent vowels, and a number of phonological changes (including fewer reduced vowels and more released stop bursts).[8][9]

Infant-directed speech

Infant-directed speech, or baby talk, uses a simpler syntax and a smaller, easier-to-understand vocabulary than speech directed to adults.[10] Compared to adult-directed speech, it has a higher fundamental frequency, an exaggerated pitch range, and a slower rate.[11]

Citation speech

Citation speech occurs when people engage self-consciously in spoken language research. It has a slower tempo and fewer connected speech processes (e.g., shortening of nuclear vowels, devoicing of word-final consonants) than normal speech.[12]

Hyperspace speech

Hyperspace speech, also known as the hyperspace effect, occurs when people are misled about the presence of environmental noise. Speakers modify the first and second formants (F1 and F2) of their vowel targets to ease the difficulty they believe the listener has in recovering information from the acoustic signal.[12]

Notes

  1. ^ Robinson, G. S., and Casali, J. G. (2003). Speech communication and signal detection in noise. In E. H. Berger, L. H. Royster, J. D. Royster, D. P. Driscoll, and M. Layne (Eds.), The noise manual (5th ed.) (pp. 567-600). Fairfax, VA: American Industrial Hygiene Association.
  2. ^ Garcia Lecumberri, M. L., Cooke, M., and Cutler, A. (2010). Non-native speech perception in adverse conditions: A review. Speech Communication, 52, 864-886.
  3. ^ Speech Intelligibility Measurement Methods
  4. ^ Moore, C. J. (1997). An introduction to the psychology of hearing (4th ed.). London: Academic Press. ISBN 978-0-12-505628-1
  5. ^ Template:Cite PMID
  6. ^ Template:Cite PMID PDF
  7. ^ doi:10.1121/1.1908510
  8. ^ Template:Cite PMID
  9. ^ Template:Cite PMID
  10. ^ Snow, C. E., and Ferguson, C. A. (1977). Talking to Children: Language Input and Acquisition. Cambridge University Press. ISBN 978-0-521-29513-0
  11. ^ Template:Cite PMID
  12. ^ a b Johnson K, Flemming E, Wright R. (1993). "The hyperspace effect: Phonetic targets are hyperarticulated". Language. 69 (3): 505–28. doi:10.2307/416697. JSTOR 416697.