An Acoustical Society of America publication

Psychoacoustics of Tinnitus: Lost in Translation
Christopher Spankovich, Sarah Faucette, Celia Escabi, and Edward Lobarinas

Departments
Obituaries: James David Miller | 1930–2020
Editor
Arthur N. Popper | [email protected]

Associate Editor
Micheal L. Dent | [email protected]

Book Review Editor
Philip L. Marston | [email protected]

AT Publications Staff
Kat Setzer, Editorial Associate | [email protected]
Helen A. Popper, AT Copyeditor | [email protected]
Liz Bury, Senior Managing Editor | [email protected]

ASA Editor In Chief
James F. Lynch
Allan D. Pierce, Emeritus

Acoustical Society of America
Diane Kewley-Port, President
Stan E. Dosso, Vice President
Maureen Stone, President-Elect
Joseph R. Gladden, Vice President-Elect
Judy R. Dubno, Treasurer
Christopher J. Struck, Standards Director
Susan E. Fox, Executive Director

The Acoustical Society of America was founded in 1929 "to generate, disseminate, and promote the knowledge and practical applications of acoustics." Information about the Society can be found on the website: www.acousticalsociety.org. Membership includes a variety of benefits, a list of which can be found at the website: www.acousticalsociety.org/asa-membership.

Acoustics Today (ISSN 1557-0215, coden ATCODK), Spring 2021, volume 17, issue 1, is published quarterly by the Acoustical Society of America, Suite 300, 1305 Walt Whitman Rd., Melville, NY 11747-4300. Periodicals Postage rates are paid at Huntington Station, NY, and additional mailing offices. POSTMASTER: Send address changes to Acoustics Today, Acoustical Society of America, Suite 300, 1305 Walt Whitman Rd., Melville, NY 11747-4300.

Copyright 2021, Acoustical Society of America. All rights reserved. Single copies of individual articles may be made for private use or research. For more information on obtaining permission to reproduce content from this publication, please see www.acousticstoday.org.
Publications Office
P.O. Box 809, Mashpee, MA 02649
(508) 293-1794

Follow us on Twitter @acousticsorg
From the Editor
Arthur N. Popper

Our goal for Acoustics Today (AT) is that each article be interesting to, and readable by, every member of the Acoustical Society of America (ASA). Thus, I encourage everyone to take a look at each article and each "Sound Perspectives" essay in this issue. I trust most people will find something of interest and/or value in each.

The first article by Grant Eastland discusses computational methods in acoustics. Grant provides an insightful introduction to the topic and explains complex issues in ways that will help many readers appreciate that the techniques discussed could apply to their own research.

We then have a very substantial switch in topics to an article on ultrasonic hearing in non-flying terrestrial mammals. The article, written by three students, M. Charlotte Kruger, Carina Sabourin, and Alexandra Levine, and their mentor, Stephen Lomber, points out that ultrasonic hearing is actually quite common for many mammals and that such sounds are used for communication. It is also interesting to note that this article may have more student authors than any other article in the history of AT. I point this out to encourage future authors to consider engaging students in articles they write for the magazine.

Our third article is by Linda Polka and Yufang Ruan. Linda and Yufang write about "baby talk." But this is not what you would immediately think of, baby language. Instead, the authors delve into a fascinating topic that a large number of ASA members are familiar with: how adults talk to babies.

The fourth article also addresses an issue that should be familiar to many (especially older) ASA members, tinnitus. Christopher Spankovich, Sarah Faucette, Celia Escabi, and Edward Lobarinas discuss this very common affliction of the auditory system, explain some of its etiology, and describe how tinnitus is studied using animal models.

The fifth article, by Johan Sundberg, Björn Lindblom, and Anna-Maria Hefele, has another first for AT. Anna-Maria is not only an author but is also the subject of much of the work described, and the amazing sound files are of her special singing. Although the article focuses, to a degree, on the fascinating topic of how one singer can produce two voices at the same time, it also is a wonderful introduction to the singing voice in general.

The final article is by Lora Van Uffelen. Lora talks about global positioning systems (GPSs) and how positioning is done over land and in the water. Considering that most every reader of AT carries a device using GPS with them most of the time, this article provides insights into how such systems work.

This issue also has three "Sound Perspectives" essays. "Ask an Acoustician" is by Zoi-Heleni Michalopoulou. Eliza (as she is known to friends and colleagues) shares insights into her wonderful career that spans a number of ASA technical committees, including Acoustical Oceanography, Signal Processing in Acoustics, and Underwater Acoustics.

The second essay is by Tyrone Porter, chair of the Committee to Improve Racial Diversity and Inclusivity (CIRDI). Tyrone introduced this committee in the December 2020 issue of AT (available at bit.ly/348Gbyk), and he will continue to report on this very important work in subsequent issues. In this issue, he tells us about one of the first CIRDI initiatives, working toward getting more people of color to enter the field of acoustics. As part of this article, Tyrone shares a personal story about how he became an acoustician and uses this to make the point that young people need great opportunities and great mentors to bring them into our field.

The final essay is part of what I hope will be a series over the next few years about how acoustics research is funded. These are in recognition of the fact that a significant number of ASA members pursue funding from various sources for their work, including agencies of the US government. These agencies often have compelling missions that connect to the diverse work of many of our ASA members. Thus, over the next year or two, we will invite senior leaders of these agencies to submit essays with insights about their work and passions and, where possible, information about funding opportunities. The goal is not only to share information about interesting funding organizations but perhaps also to help members identify new sources of support for their own work.
Sound in the World
Throughout human history, people and cultures have created sound for more than simple communication. For example, early humans likely made music using primitive flutes (Atema, 2014) and considered sound integral in the design of cities (e.g., Kolar, 2018). Furthermore, the Mayans designed structures at the ruins at Chichen Itza in Mexico that used sound for worship (Declercq et al., 2004). Specifically, clapping in front of the stairs of the El Castillo pyramid creates a sound resembling a highly revered bird by way of a series of reflections up the stairs (available at bit.ly/3jPfOTk).

In addition to an interest in making sound, sound and vibration have also been thoroughly investigated by either empirical methods or philosophical arguments since as far back as Pythagoras (550 BCE), who applied his discoveries in mathematics to the harmonic ratios in music. He discovered that stringed instruments could be tuned, using small integer ratios of string length, so that they would consistently produce layered consonant musical intervals.

The interest and desire to study our acoustic environment continues to this day, but the methods we use have changed dramatically and continue to change as new technologies emerge. Beginning in the seventeenth century with Robert Boyle, empirical investigation showed that sound is a vibration of conceptualized fluid particles transmitting energy from one place to another. Theoretical and empirical investigations are essential but more often require additional help to solve the problems at hand. Indeed, applying sophisticated computational methods, the basis of this article, provides a valuable tool in understanding and analyzing acoustic phenomena, often requiring solving the acoustic wave equation. There is also potential for advancement in new areas of research not contained in the traditional areas by employing computational acoustics. This is already seen from the great developments and advancements in all areas of acoustics over several decades, where the complexity has required extensive use of numerical methods, optimization, computational modeling, and simulation.

Like the relationships of computational physics to mathematics and computer science, the relationship between acoustics, mathematics, and computer science defines computational acoustics, as described by the Venn-type diagram shown in Figure 1.

Figure 1. Venn diagram showing the concept relationship of computational acoustics, indicating how it connects traditional acoustics with mathematics and computer science.

The Wave Equation Explained
The wave equation enables the expression of motion in a wave, and it shows itself in every area of physics, including acoustics, electromagnetism, quantum mechanics, and optics, to name a few. The equation provides the ways to investigate interactions that previously were unapproachable due to the complex nature of acoustics.
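For reference, a standard textbook form of the linear acoustic wave equation for the pressure field p(x, t) in a medium with sound speed c is (this is the conventional form, not necessarily the exact notation used in the original article):

\[
\nabla^{2} p \;-\; \frac{1}{c^{2}}\,\frac{\partial^{2} p}{\partial t^{2}} \;=\; 0 .
\]

The numerical methods described below approximate solutions of this equation when the geometry or boundary conditions rule out an analytic solution.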
Computational acoustics, which is a combination of mathematical modeling and numerical solution algorithms, has recently emerged as a subdiscipline of acoustics. The use of approximation techniques to calculate acoustic fields with computer-based models and simulations allows previously unapproachable problems to be solved.

The increasing computational nature of acoustics, especially in all the traditional areas, has provided a cross-disciplinary opportunity. The purpose of this paper is to give an overview of the various techniques used in computational acoustics over several of the traditional areas. I am more familiar with applications in underwater acoustics and physical acoustics, but many of the same techniques used in those areas can be applied in other areas (see Table 1 for articles in Acoustics Today that discuss the use of similar techniques).

In addition, applications of machine learning (ML) that are being used in artificial intelligence research and areas of data science are also being exploited to advance research into areas including acoustic oceanography, engineering acoustics, and signal processing. This is by no means an exhaustive list, but it offers an introduction to the areas and applications of computational acoustics and the methods found therein.

Table 1. Some relevant articles published in Acoustics Today. These papers have either a computational focus or a computational relationship.

Ahrens et al., 2014: Sound field synthesis
Bruce, 2017: Speech intelligibility, signal processing
Bunting et al., 2020: Computational acoustics
Burnett, 2015: Computer simulation of scattering
Candy, 2008: Signal processing, model-based machine learning
Duda et al., 2019: Ocean acoustics
Greenberg, 2018: Deep learning, languages
Hambric and Fahnline, 2007: Structural acoustics, modeling methods
Hawley et al., 2020: Musical acoustics
Puria, 2020: Bioacoustics, hearing
Stone and Shadle, 2016: Speech production, modeling, computational fluid dynamics
Treeby, 2019: Biomedical acoustics
Vorländer, 2020: Virtual reality and music
Wage, 2018: Array signal processing and localization
Wilson et al., 2015: Atmospheric acoustic propagation
Zurk, 2018: Underwater acoustic sensing

Modern Computational Methods
The numerical methods of computational acoustics are focused on taking the continuous equations and differential equations from calculus and turning them into linear algebraic equations, which are amenable to solution on digital computers. In the case of a concert hall with complex geometries that are not open to an analytic solution, computational acoustics would enable an acoustics engineer to compute a numerical solution to the wave equation to aid the engineering design process, as discussed recently by Savioja and Xiang (2020).

Two of the more popular methods are the finite-difference method (FDM) and the finite-element method (FEM). The FDM is a class of numerical techniques related to a general class of numerical methods known as Galerkin methods (Jensen et al., 2011; Wang et al., 2019) that treat derivatives as algebraic differences; the continuous function in question, such as the sound field, is calculated at various points of space (Botteldooren, 1994).
For example, Figure 2 shows how to break up the space with a grid where the sound field is calculated at individual points in space. Each point is calculated through iteration via a computational algorithm. The calculations are often simple enough that they could be performed with pencil and paper or a basic calculator. However, if the procedure needs to be applied to many points, there may need to be thousands to millions of computations, thereby requiring a digital computer.
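The grid-and-iterate procedure just described can be made concrete with a short sketch. The following is a minimal, illustrative FDM solver for the one-dimensional wave equation; the domain size, time step, pulse shape, and boundary handling are assumptions for the example, not details from the article.

```python
# Minimal, illustrative FDM solver for the 1-D acoustic wave equation.
import numpy as np

c = 343.0           # speed of sound in air (m/s)
L = 1.0             # domain length (m)
nx = 201            # number of grid points
dx = L / (nx - 1)   # grid spacing
dt = 0.9 * dx / c   # time step satisfying the CFL stability condition
nt = 500            # number of time steps

x = np.linspace(0.0, L, nx)
p_prev = np.exp(-((x - 0.5 * L) / 0.05) ** 2)   # Gaussian pressure pulse
p = p_prev.copy()
p_next = np.zeros_like(p)
courant2 = (c * dt / dx) ** 2

for _ in range(nt):
    # Leapfrog update: central differences in both space and time.
    # The endpoints are held at zero (pressure-release boundaries).
    p_next[1:-1] = (2.0 * p[1:-1] - p_prev[1:-1]
                    + courant2 * (p[2:] - 2.0 * p[1:-1] + p[:-2]))
    p_prev, p = p, p_next.copy()

print("peak pressure after propagation:", p.max())
```

Each pass through the loop updates every interior grid point from its neighbors, which is exactly the point-by-point iteration the text describes; the only reason a computer is needed is the sheer number of repetitions.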
In contrast to the FDM, the FEM is another numerical technique used for calculating sound fields based on dividing up a space or structure into individual elements, each of which is assumed to be constant. The space/structure is broken up into a mesh of such elements.

For real-life problems, the FDM and FEM are not exclusive, and they are often applied at the same time on modern high-performance computing platforms. The FDM is simple in its application but requires some initial knowledge of conditions. The FEM is more adaptable and accurate but often requires more input data to apply.

One example of this kind of computation comes from matching models to measured backscattering. Beginning with an initial guess of the speeds, the backscatter form function is determined. Backscattering data from the target are then matched to the form function by relating the error in the null locations and separations. Based on the selection of arbitrary nulls in the data using any nonlinear least squares method (e.g., Levenberg-Marquardt), an improved estimate of the speeds is obtained.
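As a hedged sketch of that matching step, the code below uses a generic nonlinear least-squares call (Levenberg-Marquardt) to adjust candidate wave speeds so that predicted null frequencies line up with measured ones. The forward model and all numbers are stand-ins for illustration; a real implementation would evaluate the actual backscatter form function and locate its minima.

```python
# Illustrative null-matching via nonlinear least squares (Levenberg-Marquardt).
import numpy as np
from scipy.optimize import least_squares

measured_nulls = np.array([100.0, 250.0, 390.0])   # kHz, hypothetical data

def predicted_nulls(speeds):
    """Stand-in forward model mapping candidate (longitudinal, shear)
    wave speeds to predicted null frequencies."""
    c_long, c_shear = speeds
    return np.array([c_long / 60.0,
                     c_long / 60.0 + c_shear / 21.0,
                     c_shear / 8.0])

def residuals(speeds):
    # Error between predicted and measured null locations
    return predicted_nulls(speeds) - measured_nulls

initial_guess = np.array([5900.0, 3200.0])   # m/s, steel-like starting point
fit = least_squares(residuals, initial_guess, method="lm")
print("estimated speeds (m/s):", fit.x)
```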
Acoustic simulation methods have also been integrated into virtual reality systems (Vorländer, 2013, 2020). These types of systems can have real-time performance due to the advances in technology and have become paramount in the entertainment industry.

Additionally, virtual and augmented realities have been employed in training and as a diagnostic tool. In the past, there used to be latency or slowing down of simulations due to the huge amounts of data being generated. However, this is not as significant a problem anymore given advances in computer technology. As a result, sound synthesis and production of indoor/outdoor surroundings can be combined with three-dimensional stereoscopic display systems through data fusion (e.g., Vorländer, 2020). The research and design applications have led to improved realism for video games and similar systems. The user experience is enhanced by adding accurately synthesized sound and allowing the listener to move unrestrictedly, e.g., turn the head, to perceive a more natural situation.
Moreover, improved synthesis algorithms (e.g., Gao et al., 2020) can be used to provide more realistic conditions for psychoacoustic tests. Sound synthesis algorithms based on deterministic-stochastic signal decomposition have been applied to synthesize pitch and time scale modifications of the stochastic or random component of internal combustion engine noise (Jagla et al., 2012). The method uses a pitch-synchronous overlap-and-add algorithm, used in speech synthesis, that exploits the use of recorded engine noise data and the fact that the method does not require specific knowledge of the engine frequency. The data-based method used for speech synthesis, noise analysis, and synthesis of engine noise just mentioned is similar to what is used in ML. Applications of ML seem to have no limits in the data-driven world of today.
ML methods are based on statistics and are excellent at detecting patterns in large datasets. Applications in acoustics are fertile ground for research into ML for things such as voice recognition, source identification, and bioacoustics (e.g., Bianco et al., 2019). With technologies like Alexa or Google Home, voice recognition investigations are needed to allow the technology to work with people having different accents or pronunciations or speaking different languages. The algorithms must utilize huge datasets of recorded voices to teach the computer system to "learn" based on input. Models are developed of voices pronouncing certain common words used for searching. Variations are compared statistically to the model, where the model can be improved based on additional inputs of data. The computer algorithm from the system using it essentially "learns" and incorporates that knowledge into its dataset. Although much of the research into ML techniques is done in areas of computer science, the applications of the methods in acoustics have driven some of the more recent advances. A major method of ML, called deep learning, based on artificial neural networks that work through several layers, trains systems to do everything from synthesizing music to performing better than the human ear for recognition (Hawley et al., 2020).
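As a toy illustration of this pattern-detection idea, the sketch below trains an off-the-shelf statistical classifier to separate two hypothetical sound sources using simple spectral features. Everything here (feature choice, class centers, model) is fabricated for the example, not drawn from the article.

```python
# Toy example of statistical pattern detection for source classification.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

def synth_features(center_hz, n):
    """Fabricate (peak frequency, bandwidth) feature pairs for one source."""
    peaks = rng.normal(center_hz, 50.0, n)
    widths = rng.normal(100.0, 20.0, n)
    return np.column_stack([peaks, widths])

# Two hypothetical sources, e.g., two machines or two species' calls
X = np.vstack([synth_features(500.0, 200), synth_features(900.0, 200)])
y = np.array([0] * 200 + [1] * 200)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = RandomForestClassifier(random_state=0).fit(X_train, y_train)
print("held-out accuracy:", clf.score(X_test, y_test))
```

Real acoustic ML systems differ mainly in scale: features come from spectrograms of recorded sound, and the simple classifier is often replaced by a deep neural network.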
Summary and Conclusions
The large variety of methods and applications outlined here is hardly an exhaustive depiction of computational acoustics. Due to limitations in my knowledge and in the space and time to do so, only a brief introduction to the field could be given. However, I hope I was able to make the case for the need for the field of computational acoustics and the variety of areas of application. The uses of computational methods have driven discovery and improved understanding in a variety of areas of acoustics, including sound synthesis, voice recognition, modeling of acoustic propagation, and source identification. Several techniques have been used to aid in the design of new automotive technologies by modeling the mechanical interactions of structures with different moving parts and the fluids involved.

Several of these methods are not only being used in engineering acoustics, but they are also being employed for space design for concert halls and classrooms. This type of modeling has improved noise suppression in a variety of mechanical systems. Computational techniques are being used in modeling and simulation in signal processing to utilize ML methods in the investigation of acoustic source identification and classification. The methods are being applied to areas of animal bioacoustics to aid in species identification for population monitoring, avoiding direct interaction with the animals. The methods and applications of computational acoustics are only going to grow over the years to come and have become a fruitful and rewarding area of research.

Disclaimer
The opinions and assertions contained herein are my private opinions and are not to be construed as official or reflecting the views of the United States Department of Defense, specifically, the US Navy or any of its component commands.
What is the first thought that comes to your mind when you read the word "ultrasound"? Most readers of Acoustics Today might associate ultrasound with pregnancy or perhaps specialized detection technology on ships and airplanes. Some might also think about echolocating animals. But what about terrestrial mammals? The ones that walk the earth among us? Although the use of ultrasound in echolocating mammals (e.g., bats, dolphins, and whales) is well-known, our understanding of ultrasonic perception in nonflying terrestrial mammals is limited. Here we discuss the frequencies perceived and the biological importance of ultrasound for four land-dwelling mammals as well as what is currently known about the various areas in the brain that allow these animals to process ultrasound.

What We Know About Ultrasound
Ultrasonic sounds differ from "regular" sounds because their frequencies are too high for humans to detect. The upper hearing limit for humans is considered to be 20 kHz, and sounds with a frequency above 20 kHz are considered ultrasonic. This is the agreed-on definition, yet the distinction is subjectively based on the range that we, as humans, can hear and has no biological basis per se.

Despite not being able to hear ultrasound, humans often capitalize on its presence. The most familiar use would be clinical applications of ultrasound (e.g., Ketterling and Silverman, 2017). These include pregnancy scans, observation of pathology progression, and treatments such as the elimination of kidney stones (Simon et al., 2017). In industrial environments, ultrasound is used as a nondestructive test to measure the thickness and quality of objects. Even though ultrasound can be useful for humans in a variety of settings, public exposure to airborne ultrasound is suggested to also cause adverse effects, such as nausea, dizziness, and failure to concentrate (Leighton et al., 2020). However, this is not the case for many animals. Long before humans started utilizing ultrasonic frequencies, animals have been using ultrasound for various beneficial reasons.

Signals containing ultrasound play a pivotal role in the lives of many species. Well-known uses include prey detection, finding mates, and communicating with conspecifics. High frequencies have very short wavelengths and therefore attenuate more rapidly when traveling through air compared with lower frequencies. Therefore, ultrasonic production and hearing create a private communication channel that subverts detection by prey as well as by predators that are unable to hear the higher frequencies (Ramsier et al., 2012). Examples of animals that can hear ultrasound include cats, dogs, bats, mice, and rats (Figure 1). Through technological advances, we have been able to detect, observe, study, and utilize these signals found outside our perceptual capabilities (Arch and Narins, 2008). By investigating different animals that can hear ultrasound, we better our understanding of the physiological and anatomical mechanisms behind their ability to perceive these high-frequency sounds.
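A quick back-of-envelope calculation makes the short-wavelength point above concrete (this is an illustration, not from the article; it assumes sound in air with a speed of roughly 343 m/s):

```python
# Wavelength = sound speed / frequency, assuming air (c ~ 343 m/s)
c = 343.0
for f_khz in (1, 20, 40, 80):
    print(f"{f_khz:>2} kHz -> wavelength {100 * c / (f_khz * 1000):.2f} cm")
```

A 1-kHz tone has a wavelength of about 34 cm, whereas a 40-kHz ultrasonic call is under 1 cm, which is part of why such calls fade quickly with distance in air.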
The ultrasonic courtship song produced by male mice to attract females shows the sex-specific relevance of ultrasound production and hearing.

Similar to mice, adult rats have two main purposes for emitting ultrasonic vocalizations as a form of communication: alarm calls at 22 kHz to warn conspecifics of danger and calls at 50 kHz for social cooperation and affiliative behavior (Wright et al., 2010). Rats generally emit vocalizations with frequencies that fall within their hearing range (between 250 Hz and 80 kHz). For example, infant rats can emit vocalizations between 40 and 65 kHz when they are separated from their nest, and adult rats can emit ultrasonic calls to solicit sexual behavior from the opposite sex (Portfors, 2007). On hearing the 50 kHz vocalizations from male rats, females display a series of attracting behaviors, increasing the likelihood of the male approaching and copulating (Portfors, 2007). Rodents therefore rely on ultrasound for their survival, whether it is for communicating with conspecifics, attracting mates, or evading predators.

Carnivores
Unlike rodents, there are only limited data available on the evolution and biological importance of ultrasonic hearing in carnivores. Carnivores, aside from carnivorous rodents like the northern grasshopper mouse (Onychomys leucogaster), are seldom known to produce or use ultrasonic frequencies for communication (Brown et al., 1978; Farley et al., 1987). Even so, many carnivores can perceive sounds with ultrasonic frequencies. It is thought that perhaps, at one point in history, the common ancestor of carnivores used ultrasound for prey detection (Heffner and Heffner, 1985; Kelly et al., 1986). However, as discussed in Rodents, prey (such as mice or rats) primarily communicate at frequencies above the hearing range of carnivores (Kelly and Masterton, 1977).

Phillips and colleagues (1988) determined that ferrets (Mustela putorius furo) can detect sounds from 40 Hz to approximately 40 kHz. Ferrets provide a useful model for investigating the development, organization, and plasticity of the auditory cortex because the onset of hearing in ferrets occurs late compared with other mammals (Moore, 1982). Before their ear canals open, newborn ferrets, known as kits, produce high-frequency vocalizations often above 16 kHz. Lactating female ferrets respond to these kit vocalizations (Shimbo, 1992), similar to the rodent behavior described in Rodents. Overall, ferrets provide useful models for investigating different aspects of hearing and hearing loss, given that their hearing range largely overlaps that of humans (Fritz et al., 2007).

Another common carnivore model used for auditory research is the domestic cat (Felis catus). The sensitive hearing range of cats is commonly believed to be between 5 and 32 kHz, although there are notable discrepancies in the literature regarding their hearing range limits (Figure 3). The literature agrees that cats can hear ultrasonic frequencies, but the full extent of their perception remains unclear. The lower limit of hearing is generally reported as approximately 125 Hz, but the upper limit is not well defined.

Most sources report the upper limit as the maximum frequency tested. As such, the upper hearing limit of cats is not commonly described as greater than 60 kHz (Figure 3), and, in some cases, the reported upper limit corresponds to the highest frequency of sound tested in the respective study. This is true for both electrical stimulation experiments, where electrical impulses are applied to neurons in the auditory pathway, and behavioral experiments. One exception is a study by Heffner and Heffner (1985), who tested frequencies up to 92 kHz and reported the upper hearing limit as 85 kHz. Therefore, it is possible that the upper hearing limit of cats exceeds 60 kHz and that there could be neurons present in the cortex specialized for these ultrasonic frequencies.
Cortical Representation of Ultrasonic Frequencies
Mice, rats, ferrets, and cats are commonly used as animal models for acoustic research. The biological importance of ultrasound to these mammals is further reflected by the allotment of cortical space for ultrasonic sound perception in their respective auditory cortices. As such, it is crucial to validate as well as expand our current understanding of their hearing abilities, especially the neural correlates underlying the perception of ultrasonic frequencies.

Mice
In the mouse brain (Figure 4A), five auditory cortical fields can be delineated in both hemispheres: primary auditory field (A1), anterior auditory field (AAF), secondary auditory field (A2), dorsoposterior field (DPF), and ultrasonic field (UF) (Stiebler et al., 1997). The A1 and AAF regions are both tonotopically organized but with reverse gradients. The properties of the neurons within these two fields are similar. For example, the frequency ranges for neurons found in both the A1 and AAF are between 2 and 45 kHz.

The mouse was the first animal where a specialized cortical region for processing ultrasonic frequencies was identified (Hofstetter and Ehret, 1992). Frequencies between 40 and 70 kHz are represented in the UF, with approximately 50% of neurons responding to frequencies between 50 and 60 kHz. However, unlike the A1 and AAF, the UF is not tonotopically organized (Stiebler et al., 1997), and it is still not clear whether the UF should be considered a part of the primary auditory fields alongside the A1 and AAF.

Tsukano and colleagues (2015) showed that the dorsomedial field (DM), previously thought to be part of the dorsal A1, is a separate area specialized for ultrasonic perception. This region contains neurons highly responsive to vocalizations with frequencies above 40 kHz, demonstrating how certain neurons in the mouse cortex respond best to frequencies of behaviorally relevant sound features. This type of cortical organization can also be seen in other rodents that rely on ultrasound for survival.

Rats
The central auditory system of rats is comparable to that of mice in both anatomical and functional organization. Five distinct cortical fields have been identified in the rat brain, and high-frequency neurons can be found in the following regions: A1, AAF, posterior auditory field (PAF), ventral auditory field (VAF), and suprarhinal auditory field (SRAF).
References
Kaas, J. H. (2011). The evolution of auditory cortex: The core areas. In J. Winer and C. Schreiner (Eds.), Auditory Cortex. Springer US, New York, NY, pp. 407-427. https://doi.org/10.1007/978-1-4419-0074-6_19.
Kalatsky, V. A., Polley, D. B., Merzenich, M. M., Schreiner, C. E., and Stryker, M. P. (2005). Fine functional organization of auditory cortex revealed by Fourier optical imaging. Proceedings of the National Academy of Sciences of the United States of America 102(37), 13325-13330. https://doi.org/10.1073/PNAS.0505592102.
Kelly, J. B., and Masterton, B. (1977). Auditory sensitivity of the albino rat. Journal of Comparative and Physiological Psychology 91, 930-936.
Kelly, J. B., Kavanagh, G. L., and Dalton, J. C. H. (1986). Hearing in the ferret (Mustela putorius): Thresholds for pure tone detection. Hearing Research 24, 269-275. https://doi.org/10.1016/0378-5955(86)90025-0.
Ketterling, J. A., and Silverman, R. H. (2017). Clinical and preclinical applications of high-frequency ultrasound. Acoustics Today 13(1), 41-51.
Koay, G., Heffner, R. S., Bitter, K. S., and Heffner, H. E. (2003). Hearing in American leaf-nosed bats. II: Carollia perspicillata. Hearing Research 178(1-2), 27-34. https://doi.org/10.1016/S0378-5955(03)00025-X.
Leighton, T. G., Lineton, B., Dolder, C., and Fletcher, M. D. (2020). Public exposure to airborne ultrasound and very high frequency sound. Acoustics Today 16(3), 17-25.
Masterton, B., and Heffner, H. (1980). Hearing in Glires: Domestic rabbit, cotton rat, feral house mouse, and kangaroo rat. The Journal of the Acoustical Society of America 68(6), 1584-1599. https://doi.org/10.1121/1.385213.
Matsumoto, Y. K., and Okanoya, K. (2016). Phase-specific vocalizations of male mice at the initial encounter during the courtship sequence. PLoS ONE 11, e0147102. https://doi.org/10.1371/journal.pone.0147102.
McGill, T. E. (1959). Auditory sensitivity and the magnitude of cochlear potentials. The Annals of Otology, Rhinology and Laryngology 68, 193-207.
Moerel, M., De Martino, F., and Formisano, E. (2014). An anatomical and functional topography of human auditory cortical areas. Frontiers in Neuroscience 8, 225. https://doi.org/10.3389/fnins.2014.00225.
Moore, D. R. (1982). Late onset of hearing in the ferret. Brain Research 253(1-2), 309-311. https://doi.org/10.1016/0006-8993(82)90698-9.
Neff, W. D., and Hind, J. E. (1955). Auditory thresholds of the cat. Journal of the Acoustical Society of America 27, 480-483. https://doi.org/10.1121/1.1907941.
Phillips, D. P., and Irvine, D. R. F. (1982). Properties of single neurons in the anterior auditory field (AAF) of cat cerebral cortex. Brain Research 24, 237-244. https://doi.org/10.1016/0006-8993(82)90581-9.
Phillips, D. P., Judge, P. W., and Kelly, J. B. (1988). Primary auditory cortex in the ferret (Mustela putorius): Neural response properties and topographic organization. Brain Research 443, 281-294. https://doi.org/10.1016/0006-8993(88)91622-8.
Pienkowski, M., and Eggermont, J. J. (2010). Intermittent exposure with moderate-level sound impairs central auditory function of mature animals without concomitant hearing loss. Hearing Research 261(1-2), 30-35. https://doi.org/10.1016/j.heares.2009.12.025.
Polley, D. B., Read, H. L., Storace, D. A., and Merzenich, M. M. (2007). Multiparametric auditory receptive field organization across five cortical fields in the albino rat. Journal of Neurophysiology 97, 3621-3638. https://doi.org/10.1152/jn.01298.2006.
Portfors, C. V. (2007). Types and functions of ultrasonic vocalizations in laboratory rats and mice. Journal of the American Association for Laboratory Animal Science 46, 28-34.
Rajan, R., Irvine, D. R. F., and Cassell, J. F. (1991). Normative N1 audiogram data for the barbiturate-anaesthetised domestic cat. Hearing Research 53, 153-158. https://doi.org/10.1016/0378-5955(91)90222-U.
Ramsier, M. A., Cunningham, A. J., Moritz, G. L., Finneran, J. J., Williams, C. V., Ong, P. S., Gursky-Doyen, S. L., and Dominy, N. J. (2012). Primate communication in the pure ultrasound. Biology Letters 8, 508-511. https://doi.org/10.1098/rsbl.2011.1149.
Reale, R. A., and Imig, T. J. (1980). Tonotopic organization in auditory cortex of the cat. Journal of Comparative Neurology 192, 265-291. https://doi.org/10.1002/cne.901920207.
Rutkowski, R. G., Miasnikov, A. A., and Weinberger, N. M. (2003). Characterisation of multiple physiological fields within the anatomical core of rat auditory cortex. Hearing Research 181, 116-130. https://doi.org/10.1016/S0378-5955(03)00182-5.
Shimbo, F. M. (1992). A Tao Full of Detours: The Behavior of the Domestic Ferret. Ministry of Publications, Elon College, NC.
Simon, J. C., Maxwell, A. D., and Bailey, M. R. (2017). Some work on the diagnosis and management of kidney stones with ultrasound. Acoustics Today 13(4), 52-59.
Sivian, L. J., and White, S. D. (1933). On minimum audible sound fields. Journal of the Acoustical Society of America 4, 288-321. https://doi.org/10.1121/1.1915608.
Sokolovski, A. (1973). Normal threshold of hearing for cat for free-field listening. Archiv für klinische und experimentelle Ohren-, Nasen- und Kehlkopfheilkunde 203(3), 232-240. https://doi.org/10.1007/BF00344934.
Stiebler, I., Neulist, R., Fichtel, I., and Ehret, G. (1997). The auditory cortex of the house mouse: Left-right differences, tonotopic organization and quantitative analysis of frequency representation. Journal of Comparative Physiology A: Sensory, Neural, and Behavioral Physiology 181, 559-571. https://doi.org/10.1007/s003590050140.
Trahiotis, C., and Elliot, D. M. (1970). Behavioral investigation of some possible effects of sectioning the crossed olivocochlear bundle. Journal of the Acoustical Society of America 47, 592-596. https://doi.org/10.1121/1.1911934.
Tsukano, H., Horie, M., Bo, T., Uchimura, A., Hishida, R., Kudoh, M., Takahashi, K., Takebayashi, H., and Shibuki, K. (2015). Delineation of a frequency-organized region isolated from the mouse primary auditory cortex. Journal of Neurophysiology 113, 2900-2920. https://doi.org/10.1152/jn.00932.2014.
Wright, J. M., Gourdon, J. C., and Clarke, P. B. S. (2010). Identification of multiple call categories within the rich repertoire of adult rat 50-kHz ultrasonic vocalizations: Effects of amphetamine and social context. Psychopharmacology 211, 1-13. https://doi.org/10.1007/s00213-010-1859-y.

About the Authors

M. Charlotte Kruger
[email protected]
Department of Physiology
McGill University
Montréal, Québec H3G 1Y6, Canada

M. Charlotte Kruger graduated from the University of Western Ontario (London, ON, Canada) with a BSc Honors Specialization in Biology in 2019. She is a current graduate student in the Cerebral Systems Laboratory at McGill University (Montréal, QC, Canada). She is investigating the ultrasonic hearing abilities of the cat and the location where ultrasonic frequencies might be encoded in the cat brain, to better understand the role of ultrasonic hearing in auditory neuroscience.
Figure 1. Publications (blue) and citations (orange) of papers on infant-directed speech (IDS) from 1990 to 2020. From the Web of Science.
In studies in which infants are presented a choice to listen to samples of IDS and adult-directed speech (ADS), infants (even newborns) repeatedly show a clear and strong preference to listen to IDS, with few studies deviating from this pattern. A meta-analysis found that the average listening time difference between IDS and ADS (or "the effect size of IDS preference" in statistical terms) was significant and large (Dunst et al., 2012).

Infant preference for IDS, being recognized as one of the most robust behaviors measured in infancy, was selected as the target behavior in a large-scale study designed to understand how subject variables and testing methodologies affect the measurement of infant behavior. This study, conducted by the ManyBabies Consortium, involved 67 laboratories across North America, Europe, Australia, and Asia. The findings provided further conclusive evidence of infants' preference for IDS over ADS (ManyBabies Consortium, 2020). There is no doubt that infants are attracted to IDS.

Acoustic Properties of Infant-Directed Speech
What is it about IDS that babies like? Studies show that when caregivers talk to their infant, they modify their speech on multiple levels. This includes basic speech patterns that play a broad role in communication and can be observed across different languages (conveying emotion and talker information and basic units such as vowels, consonants, and word forms) as well as acoustic cues that mark specific lexical, grammatical, and pragmatic features that are important in a specific language. Our focus here is on basic acoustic speech patterns that have a broad impact and are more likely to be universal across languages.

To understand the acoustic properties of IDS, it is useful to know that the acoustic speech signal has two independent components, referred to as the source and the filter. The vocal source component is determined by how fast the vocal folds vibrate, which determines the voice pitch or fundamental frequency (see article on singing by Sundberg et al. on pages 43-51). The voice pitch of an infant or child is much higher than that of an adult because their short, light vocal folds vibrate faster compared with the longer and thicker vocal folds of an adult. Talkers also vary their voice pitch by adjusting the tension of the vocal folds.

The vocal filter component refers to the effects of the length and shape of the vocal tract, the term used to refer to the tube formed by the vocal folds on one end and the mouth at the other end. Movements of the tongue, jaw, and lips vary the length and shape of the vocal tract, which determines the resonances of the vocal tract.

The acoustic patterns formed by the vocal resonances created when we speak are referred to as formants and are numbered in ascending frequency value (the lowest is the 1st formant [F1], next is the 2nd formant [F2], etc.). The formants are essentially narrow frequency regions where acoustic energy is increased because these frequencies vibrate most easily within the associated vocal tract space. The first three formants contain critical acoustic information for speech communication.

The vocal resonances and associated formant frequencies are higher for the short vocal tract of an infant or child compared with the longer vocal tract of an adult. Talkers modify the resonance of the vocal tract to create different vowel sounds by moving their articulators to create different vocal tract shapes, such as by adjusting the degree and location of constrictions along the vocal tract.

An extensive body of research has concentrated on describing the acoustic structure of IDS. This work has considered each component, typically by comparing samples of IDS with comparable samples of ADS (Soderstrom, 2007).

Voice Pitch and Rhythmic Properties of Infant-Directed Speech
The distinct vocal source properties of IDS are well-established (see Multimedia 1-5 at acousticstoday.org/polkamedia for audio examples in English and in Turkish; see video example at bit.ly/3m3ecHh). Overall, higher voice pitch, a wider voice pitch range, and greater pitch variability have been found in IDS (compared with ADS) in a variety of languages, including both nontonal languages (Fernald et al., 1989) and tonal languages (Liu et al., 2009). Several studies have shown that high voice pitch is the primary acoustic determinant of the infants' preference for IDS (Fernald and Kuhl, 1987; Leibold and Werner, 2007). Research focused on the speech movements that occur during IDS has observed, as expected, that adults produce faster vocal fold vibrations and also raise their larynx when they talk to young infants (Ogle and Maidment, 1993; Kalashnikova et al., 2017). Larynx raising naturally occurs when vocal fold tension increases (which raises voice pitch) and can also shorten the overall vocal tract length.
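To make the source component concrete, here is a minimal, illustrative sketch of estimating the fundamental frequency (f0) of a voiced frame by autocorrelation. The synthetic signal and the search range are assumptions for the example, not measurements from the article.

```python
# Illustrative f0 estimation by autocorrelation on a synthetic voiced frame.
import numpy as np

fs = 16000                        # sampling rate (Hz)
t = np.arange(0, 0.1, 1 / fs)     # 100 ms frame
f0_true = 220.0                   # adult-female-like fundamental (Hz)
# Sum of decaying harmonics as a crude stand-in for voiced speech
frame = sum(np.sin(2 * np.pi * k * f0_true * t) / k for k in range(1, 6))

ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
min_lag = int(fs / 500)           # search for f0 between 80 and 500 Hz
max_lag = int(fs / 80)
peak_lag = min_lag + int(np.argmax(ac[min_lag:max_lag]))
print(f"estimated f0: {fs / peak_lag:.1f} Hz")
```

The autocorrelation peaks at the period of vocal fold vibration, so the strongest lag in the search range directly gives the voice pitch discussed above.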
It is widely held that the primary goal or intention guiding these characteristic voice pitch properties is conveying emotion to the young infant (Saint-Georges et al., 2013). Understanding the emotional expression in IDS led researchers to explore the pitch contours found in IDS. Fernald and Simon (1984) observed that most utterances in IDS had either rising or falling pitch contours. Stern and colleagues (1982) identified the social and linguistic contexts where these pitch contours were used. For example, a rising contour was frequently used when mothers tried to engage in eye contact with an inattentive baby. Studies also show that creating "happy talk" is the fundamental goal of IDS and that positive affect is what drives infant preference (Singh et al., 2002). Thus, understanding pitch contours in IDS can help us decode the affective function of IDS.

In terms of rhythmic features, IDS universally contains shorter utterances and longer pauses between words; in some languages, including English and Japanese, there is also an enhanced lengthening of words or syllables at the end of a phrase or utterance (Fernald et al., 1989; Martin et al., 2016). This is helpful because natural fluent speech typically lacks pauses between words, something you notice when encountering an entirely foreign language. This also highlights an initial challenge for babies: learning which speech patterns are recurring words, aka word segmentation. Infants begin to acquire word segmentation skills at around 6 months, through experience listening to a specific language and before they attach meaning to each word they hear (Jusczyk, 1999). Overall, the tempo of IDS provides the infant with a speech stream that is easier to track, with clearer cues marking word boundaries and other syntactic units. Consistent with this, the most prominent rhythm in the acoustic speech signal, which matches the timing of stressed syllables, was observed to be stronger in IDS compared with ADS (Leong et al., 2017). This speech rhythm was also prominent (and synchronized) in mother and infant brain patterns when they watched a nursery rhyme video together (Santamaria et al., 2020).

Sensitivity to this speech rhythm emerges in infancy and supports early word learning. The list of positive effects of IDS rhythm on speech processing, which includes supporting better discrimination and tracking of syllable patterns and detection of speech in noise, continues to grow (Soderstrom, 2007).

Vocal Resonance Properties of Infant-Directed Speech
Research on IDS has also considered the other fundamental component of speech, the filter or resonance properties. The focus here has been on vowel sounds. Early research by Kuhl and colleagues (1997) reported that vowels are produced in an exaggerated form in IDS; this hyperarticulation of vowels expands the vowel space, a standard graphic display that captures how vowel articulation and formant patterns are related.

In the classic vowel space (Figure 2), F1 increases as the tongue/jaw height decreases, and F2 increases as the tongue constriction moves to the front of the mouth. Importantly, three vowel sounds found in every spoken language, "ee," "aw," and "oo," form the corners of this F1/F2 vowel space. These corner vowels are associated with gestural extremes that define the full range of movements that we use to create vowel sounds: "ee" has the most high and front constriction of the vocal tract, "oo" has the most high and back constriction of the vocal tract, and "aw" has the most open and unconstricted posture of the vocal tract. All other vowel sounds fall within the limits defined by these corner vowel sounds.

Figure 2. The articulatory/acoustic vowel space corresponding to vowels produced by an adult female in ADS (squares) and in IDS (circles) and by an infant (triangles). F1 and F2, 1st and 2nd formants, respectively.
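Formant values like those plotted in the vowel space are typically measured with linear predictive coding (LPC). The sketch below is an illustrative version of that recipe applied to a synthetic vowel-like frame; the resonance frequencies, bandwidths, and model order are example values, not data from the article.

```python
# Illustrative formant estimation with linear predictive coding (LPC).
import numpy as np
from scipy.signal import lfilter
from scipy.linalg import solve_toeplitz

fs, f0 = 16000, 220.0
n = int(0.05 * fs)
excitation = np.zeros(n)                 # glottal-like impulse train at f0
excitation[::int(fs / f0)] = 1.0

frame = excitation
for freq, bw in [(700.0, 80.0), (1200.0, 90.0)]:   # stand-ins for F1, F2
    r = np.exp(-np.pi * bw / fs)                   # pole radius from bandwidth
    a = [1.0, -2.0 * r * np.cos(2 * np.pi * freq / fs), r ** 2]
    frame = lfilter([1.0], a, frame)               # apply the resonator

# LPC: solve the autocorrelation normal equations for the predictor
order = 8
ac = np.correlate(frame, frame, mode="full")[n - 1:]
coeffs = solve_toeplitz(ac[:order], ac[1:order + 1])
poly = np.concatenate(([1.0], -coeffs))            # A(z) = 1 - sum a_k z^-k

# Strong resonances correspond to poles near the unit circle
roots = [z for z in np.roots(poly) if z.imag > 0 and abs(z) > 0.9]
formants = sorted(np.angle(z) * fs / (2 * np.pi) for z in roots)
print("formant candidates (Hz):", [round(f) for f in formants])
```

The pole angles recover the resonance frequencies built into the frame, which is the same logic used when extracting F1 and F2 from recorded IDS and infant speech.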
These modifications may also serve to strengthen the social bond between adult and infant, presumably to ensure that infant offspring survive and thrive. Kalashnikova et al. (2017) claim that in early infancy, any benefits of IDS related to clarifying speech units are secondary to this basic social/emotional bonding goal, and, as in evolution, these linguistically motivated patterns likely emerge later in development and piggyback onto this social bonding function.

Overall, what is happening to the resonance component of the speech signal when caregivers use IDS is not fully resolved. Caregivers may be modifying their speech to clarify speech units and boost language development, to convey positive emotions, to sound smaller and build social bonds, or some combination of these effects. Although the details remain unclear, understanding how these modifications impact infant development continues to ignite and steer ongoing research.

Infant-Directed Speech and Infant Speech: An Important Connection?
As outlined in Vocal Resonance Properties of Infant-Directed Speech, there are different viewpoints regarding what motivates the use of IDS. One idea to emerge recently is that when mothers use IDS, they are altering their speech to sound smaller and more like an infant. Although this is a new perspective on IDS, the act of unconsciously adapting your speech to mirror or imitate features of your conversational partner is not a new observation. This has been noted and studied extensively in adult speech communication and is often referred to as phonetic convergence. Moreover, in adult-to-adult interaction, speech convergence is typically associated with liking or holding a positive attitude toward your conversational partner (Pardo, 2013).

Other findings point to an important connection between IDS and infant speech. First, there are indeed clear parallels between IDS and infant speech. With respect to vocal source properties, infant speech and IDS have similar voice pitch values, particularly when IDS is produced by a female adult/mother. Figure 3 shows voice pitch values across the life span, including voice pitch values for IDS produced by female adults. Figure 3, pink box, highlights the range in which voice pitch values overlap across infant speech and speech produced by an adult female using IDS.

Although voice pitch values can overlap across IDS and infant speech, the vocal filter properties of infant speech and IDS are more distinct. When an adult female raises her larynx and spreads her lips to shorten her vocal tract length, she will sound like a smaller person. Nevertheless, a mother cannot shorten her vocal tract enough to match the vocal tract length of her infant. Infant speech has much higher vocal resonances, reflected in the formant frequencies uniquely associated with a talker with a very short vocal tract. This results in higher formant frequency values for infant speech compared with adult speech, which are shown by a spectrogram of the vowel "ee" produced by an infant and a female adult (Figure 4). These differences are also apparent in the vowel space shown in Figure 2, where you can see that the corner vowels produced by an infant fall at much higher formant frequency values.

Figure 3. Typical average voice pitch (f0) values for speakers across the life span. Blue lines, observed range of values within each group. Pink box, voice pitch range where infant and adult female IDS values overlap. Data from Masapollo et al., 2016, Table 1.
The emphasis of IDS also shifts across developmental stages. For example, in IDS with young infants (<12 months), communicating emotion is often more prominent than clarifying linguistic structures. In IDS with older children (>12 months), the reverse occurs, such that highlighting linguistic structure is often more prominent than communicating emotion and building social bonds. No doubt, IDS is best understood in the context of infant/caregiver interaction and when the needs of the child and the intentions of the caregiver are identified.

Contingency and Synchrony Are Fundamental
IDS is recognized to be dynamic and actively shaped by both the infant and caregiver. Contingent and synchronized responding between mother and infant is a core feature of IDS. Although an IDS speaking style can be simulated by an adult, IDS production is facilitated by the presence of a baby. The salience of caregiver responsiveness is demonstrated by the finding that adults can readily identify audio recordings of IDS recorded with and without an infant present (Trehub et al., 1997).
Saint-Georges and colleagues (2013) proposed that IDS creates an interactive communication loop connecting the infant and the caregiver in a synergistic way. This idea has motivated researchers to search for physiological markers of enhanced synchrony during IDS. Synchronous activity has been observed in heart rate and respiration measures (McFarland et al., 2019) and gaze patterns (Santamaria et al., 2020) recorded during parent/infant interactions where IDS is commonly used.

The powerful role of dynamic social interaction is also reinforced by research showing that infants can readily learn to discriminate consonants from a foreign language in a live interaction involving IDS but not from audiovisual recordings (Kuhl et al., 2003). It is also intriguing to consider how the musical quality of IDS (which is enhanced in infant-directed singing) shapes this parent-infant synchrony, given that early music exposure affects infant brain development (Zhao and Kuhl, 2020).

The critical role of IDS contingency and synchrony is also supported by evidence that challenges on each side of the interactional loop affect the synergistic connection created via IDS. For example, from the caregiver side, mothers with depression tend to include less affective information and have smaller pitch variations when speaking to their infants (Kaplan et al., 2001). Infants' learning is affected when maternal depression persists over an extended period (Kaplan et al., 2011). However, infants of depressed mothers remain responsive to IDS from nondepressed fathers, and the quality of IDS is soon improved when the mother's depression is lifted (Kaplan et al., 2004). On the infant side, the preference for IDS is absent or reduced among children with autism spectrum disorder, presumably reflecting difficulties in processing the heightened emotional content of IDS (Kuhl et al., 2005).

New Directions
Going forward, research is moving quickly to expand our knowledge of IDS. Although we have learned a great deal about the acoustic properties of IDS, we need to learn more about the speech movements that give rise to IDS signals. This type of work is technically challenging but critical for understanding exactly what caregivers are doing when they adapt their speech for their infant, especially with respect to vocal resonance properties.

Future research will also continue to build a more complete understanding of the social, emotional, cognitive, and linguistic benefits of IDS for the developing child. Research exploring the physiological responses of interacting caregivers and infants will play a central role by helping us identify and understand the contingent and synchronous processes that are mediated by IDS. Each new finding pushes our curiosity to a higher level. We are confident that IDS will hold the interest of infants, caregivers, and scientists for a long time and can help us understand conditions that compromise parent/infant connection and identify new ways to optimize infant development.

Acknowledgments
We thank Claire Ying Ying Liu for preparing Figure 2 and Sandra Trehub and the ManyBabies Consortium for sharing infant-directed speech samples.

References
Benders, T. (2013). Mommy is only happy! Dutch mothers' realisation of speech sounds in infant-directed speech expresses emotion, not didactic intent. Infant Behavior and Development 36, 847-862. https://doi.org/10.1016/j.infbeh.2013.09.001.
Bradlow, A. R., Torretta, G. M., and Pisoni, D. B. (1996). Intelligibility of normal speech I: Global and fine-grained acoustic-phonetic talker characteristics. Speech Communication 20, 255-272. https://doi.org/10.1016/S0167-6393(96)00063-5.
Burnham, D., Kitamura, C. M., and Vollmer-Conna, U. (2002). What's new pussycat? On talking to babies and animals. Science 296, 1435. https://doi.org/10.1126/science.1069587.
Linda Polka
[email protected]
School of Communication Sciences and Disorders
Centre for Research on Brain, Language, and Music
Faculty of Medicine and Health Sciences
McGill University
2001 McGill College Avenue
Montréal, Quebec H3A 1G1, Canada

Linda Polka is professor and graduate program director in the School of Communication Sciences and Disorders, McGill University (Montréal, QC, Canada) and director of the McGill Infant Speech Perception Lab. She is also a fellow of the Acoustical Society of America and was recently (2019) chair of the Technical Committee for Speech Communication. Her work explores how infant speech perception is shaped by universal biases and language experience in both monolingual and bilingual infants.

Yufang Ruan
[email protected]
School of Communication Sciences and Disorders
Centre for Research on Brain, Language, and Music
Faculty of Medicine and Health Sciences
McGill University
2001 McGill College Avenue, Room 845
Montréal, Quebec H3A 1G1, Canada

Yufang Ruan is a doctoral student in the Infant Speech Perception Lab, McGill University (Montréal, QC, Canada). Her research interests concern language development and developmental disorders. She received her MSc from Beijing Normal University (China) and her BA from Dalian University of Technology (China).
Psychoacoustics of Tinnitus:
Lost in Translation
Christopher Spankovich, Sarah Faucette, Celia Escabi, and Edward Lobarinas
Tinnitus: What Is It?
Tinnitus is the perception of sound without an external source, often experienced as a constant or frequent ringing, humming, or buzzing. Tinnitus is reported by more than 50 million people in the United States alone (Shargorodsky et al., 2010); conservatively, 1 in 10 US adults has tinnitus (Bhatt et al., 2016). It is estimated that 20-25% of patients with tinnitus consider the symptoms to be a significant problem (Seidman and Jacobson, 1996).

Various nomenclature has been applied to describe tinnitus, including terms such as subjective or objective tinnitus and the more recently recommended terms primary and secondary tinnitus (Tunkel et al., 2014). Primary tinnitus refers to tinnitus that is idiopathic and may or may not be associated with sensorineural hearing loss (SNHL; hearing loss [HL] related to dysfunction of the inner ear and auditory nerve). Secondary tinnitus refers to tinnitus that is associated with a specific underlying cause other than SNHL or an identifiable organic condition such as pulsatile tinnitus (heartbeat perception in the ear). Our discussion here is focused on primary tinnitus, which is the more common variant.

at more central segments of the pathway. These changes include (1) an increase in spontaneous neural activity of excitatory neurons/neurotransmitters and a reciprocal decrease in the activity of inhibitory neurons/neurotransmitters, resulting in central gain; (2) distortions in frequency representation as input to more central regions is restricted due to peripheral damage; and (3) recruitment of nonauditory pathways/structures, suggesting that a multisensory and distributed brain network is implicated in mediating tinnitus perception and reaction. Simply stated, tinnitus is the attempt of the brain to fill in the reduced peripheral input (Spankovich, 2019).

Perception Versus Reaction to Tinnitus
A critical distinction is the perception of tinnitus versus the reaction to tinnitus. The tinnitus percept, or phantom sound, itself has minimal repercussions for morbidity or mortality. Conversely, the reaction or emotional response to tinnitus can have a substantial effect on a person's functional status (Jastreboff and Hazell, 1993). Almost everyone with tinnitus, whether bothersome or not, would want the percept eliminated if possible (Tyler, 2012).
Measuring Tinnitus in Humans
There is currently no widely accepted or validated method to identify the presence of primary tinnitus and quantify its perceptual characteristics other than what is reported by the patient. An objective measure of primary tinnitus by the clinician, a long-held goal, is complicated by the relationship among tinnitus, HL, and hyperacusis (sound sensitivity related to increased central neural activity compensating for reduced peripheral input) and by a lack of sensitivity and specificity of electrophysiological measures and imaging studies. Developing objective measures of tinnitus has been challenging in studies of both humans and animals.

Then again, perhaps an objective measure to rule in or rule out the presence of tinnitus is not necessary. For example, the gold standard for assessment of hearing sensitivity is the pure-tone audiogram (Figure 1), which indicates the lowest sound level a human or animal can detect at different frequencies. The audiogram is, however, a psychophysical measure that is nonobjective in nature. Of course, a method to measure tinnitus that translates between animal models and humans would be most efficacious to empower development of diagnostic and treatment approaches.

Recommendations for the psychophysical assessment of tinnitus were postulated over 20 years ago by the Ciba Foundation and the National Academy of Sciences. Methods for administering the psychoacoustic battery for tinnitus assessment have been reviewed (Henry, 2016). To date, these procedures have not been standardized; however, generalized clinical methods are briefly described here.

Pitch match (PM) measures the patient's perceived tinnitus pitch (perception of sound frequency) by matching the tinnitus to a specific frequency or range of frequencies. PM is typically measured, although crudely, using an audiometer: two sounds are played, and the person with tinnitus chooses the pitch closer to their tinnitus percept. Most patients with peripheral hearing deficits match their tinnitus pitch at frequencies corresponding to their HL.
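To make the forced-choice idea concrete, here is a minimal sketch of how a two-interval pitch match might be scripted. The bracketing rule, the trial count, and the simulated listener (with an assumed 6,000-Hz tinnitus pitch) are all illustrative choices, not a standardized clinical protocol.

def simulated_listener(f_a, f_b, tinnitus_hz=6000.0):
    # Hypothetical respondent: reports whichever tone sounds closer to the tinnitus.
    return f_a if abs(f_a - tinnitus_hz) <= abs(f_b - tinnitus_hz) else f_b

def pitch_match(choose, lo=500.0, hi=12000.0, trials=12):
    # Two-interval forced choice: present a lower and a higher candidate tone and
    # narrow the bracket around whichever one the listener selects.
    for _ in range(trials):
        mid = (lo * hi) ** 0.5                # geometric midpoint; pitch is log-spaced
        f_low, f_high = (lo * mid) ** 0.5, (mid * hi) ** 0.5
        if choose(f_low, f_high) == f_low:
            hi = mid                          # tinnitus judged closer to the lower tone
        else:
            lo = mid
    return (lo * hi) ** 0.5                  # final pitch-match estimate in Hz

print(f"Estimated tinnitus pitch: {pitch_match(simulated_listener):.0f} Hz")

Run with the simulated listener above, the bracket converges on roughly 6,000 Hz within a dozen trials.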
high doses has been shown to reliably induce tinnitus in humans but is also usually reversible and again limited to high doses; a baby aspirin is unlikely to cause tinnitus.

Given that tinnitus is a phantom auditory perception, how can it be measured in animals? The simple answer is that patients cannot perceive quiet while tinnitus is present, and neither can animals. Across studies, animals are trained to exhibit one set of behaviors (e.g., pressing levers, moving from one side of the chamber to another, climbing a pole) when there is no sound in the environment and another set of behaviors when sound is on in order to obtain food or avoid punishment. Among the animal models (Brozoski and Bauer, 2016), the most common approach is to have animals (usually rodents) detect a gap in a continuous sound. When tinnitus is present, animals make more errors detecting gaps in continuous sound, especially if the frequency of the continuous sound is similar in pitch to their tinnitus.

Several of these animal studies have shown that the pattern of results supports the presence of tinnitus after high doses of sodium salicylate, quinine (an antimalarial drug known to induce tinnitus in humans), and noise exposure. Importantly, the pitch of the tinnitus is consistent with the adjusted frequency range (relative to peripheral HL) reported in humans.

To effectively test animals for the presence of tinnitus, several fundamental features are necessary for rigorous investigation. These include the use of well-established behavioral response paradigms for determining the phantom sound of tinnitus, known and reliable inducers of tinnitus, and/or reliable physiological responses consistent with the presence of tinnitus. Psychophysical assessment of tinnitus is typically categorized either as an interrogative model, which evaluates changes in behavioral outcomes as a function of tinnitus, or as a reflexive model, which assesses changes in automatic, lower-order processing responses consistent with the perception of a phantom sound. Interrogative models require that the animal voluntarily respond to the acoustic environment, indicating the presence of silence or the presence of an auditory stimulus.

Early preclinical behavioral measures of tinnitus used interrogative methods, operant conditioning, and response suppression to detect and characterize the presence of tinnitus (Jastreboff et al., 1988). In the first animal model, rats were conditioned to associate the offset of a continuous sound with a mild but unavoidable foot shock. This resulted in suppressed licking from a waterspout in anticipation of the imminent shock. Following conditioning, rats in the experimental group were given a high dose of sodium salicylate, whereas the control group received a placebo. During this phase, the foot shock was eliminated but the sound conditions remained. Rats in the control group continued to suppress licking when the sound was turned off because the lack of sound was associated with foot shock. In contrast, rats treated with sodium salicylate continued to lick even when the sound was turned off. Simply put, the animals could not tell that the sound was turned off (presumably due to the presence of tinnitus) and continued to lick from the waterspout.
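Licking behavior in paradigms of this kind is commonly summarized with a simple suppression ratio. The sketch below shows that bookkeeping with made-up lick counts (the variable names and counts are illustrative, not Jastreboff's exact scoring): a value near 0.5 means licking continued unchanged when the sound stopped, consistent with a phantom sound filling the silence, whereas a value near 0 means strong suppression.

def suppression_ratio(licks_during_silence, licks_during_sound):
    # Fraction of all licks that occur during the (shock-predicting) silent periods.
    total = licks_during_silence + licks_during_sound
    return licks_during_silence / total if total else 0.5

# Hypothetical counts: a control rat suppresses licking in silence, whereas a
# salicylate-treated rat keeps licking as if it still hears a sound.
print(suppression_ratio(licks_during_silence=4, licks_during_sound=52))    # ~0.07
print(suppression_ratio(licks_during_silence=47, licks_during_sound=55))   # ~0.46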
A number of subsequent animal models have shown results consistent with the presence of tinnitus and consistent with Jastreboff's lick suppression model (Eggermont and Roberts, 2015). Other models have used either avoidable shock or positive reinforcement with food whereby animals have to differentiate between trials with sound and trials with no sound. Although interrogative assessments in animal models are crucial for investigating perceptual correlates of tinnitus, it is important to note the considerable challenges of interrogative models: behavioral conditioning requires lengthy and consistent training schedules (Brozoski and Bauer, 2016), and even then, some animals may not respond as expected due to inability to do the task or lack of motivation.

Given the challenges associated with interrogative models, reflexive models have been widely used for determining the presence of tinnitus. The acoustic startle reflex (ASR) is a large-motor response, akin to a jumping/jolt-like response, that can be readily elicited in rodents using a loud startling acoustic stimulus. The ASR can be easily measured in rodents using pressure-sensitive platforms to record the amplitude and duration of the reflex (Turner et al., 2006). Interestingly, the ASR can be attenuated by presenting an acoustic cue before the startling acoustic stimulus. For example, a 50-ms tone before the loud startling stimulus will result in a reduction in the ASR. Because of the compressed time frame, the changes in the ASR are believed to involve rapid lower level auditory processing before the startle elicitor; in other words, the animal did not
Loss of Tuning
One of the earliest proposed theories of tinnitus initiation was the discordant damage theory. According to this theory (an extension of theories proposed by Tonndorf, 1981a,b), the outer hair cells (OHCs) of the mammalian cochlea are more prone to damage than the inner hair cells (IHCs), resulting in imbalanced activity via the type I and type II afferent fibers that, respectively, carry signals from the ear to the dorsal cochlear nucleus (DCN), the first auditory center in the brain. The alteration of input to the DCN results in loss of inhibition and compensatory mechanisms at more central sites, including bursting neural activity, mapping reorganization, decreased inhibition, and the central gain mentioned in Tinnitus: What Is It?

Kaltenbach and Afman (2000) showed that significant IHC damage can prevent the onset of hyperactivity in the DCN. Tonndorf's (1981) original model suggested a decoupling of stereocilia (the hair-like projections from the cell) between the OHCs and the tectorial membrane (a membrane floating above the hair cells) that leads to loss of energy and increased noise at the level of the hair cell underlying tinnitus generation. Tonndorf's follow-up theory (1987) suggested that tinnitus was equivalent to chronic pain in the somatosensory system and a result of preferential damage to the OHCs, thus establishing an analogy between tinnitus and chronic pain.

In contrast to the discordant damage theory, cochlear insults that commonly lead to chronic tinnitus in humans have been found to produce a long-term decrease in auditory neuronal spontaneous activity (Liberman and Dodds, 1984). Tinnitus is strongly correlated with HL and cochlear damage as a result of ototoxicity or noise exposure. Specifically, IHC/synaptic loss has been speculated to produce tinnitus.

To explore this relationship, a behavioral gap detection task was used to determine the presence of tinnitus in a chinchilla model with selective IHC loss following administration of carboplatin. Carboplatin is an ototoxic anticancer drug known to cause significant IHC loss (>80% loss) while leaving OHCs largely intact (<5% loss) in the chinchilla, an effect unique to the chinchilla model (Lobarinas et al., 2013b). Preliminary data showed overall poorer gap detection performance when tested at lower presentation levels, but the findings were not frequency specific. The absence of frequency-specific deficits suggested that these animals did not perceive tinnitus even with severe IHC loss. Thus, IHC damage alone does not seem sufficient to generate tinnitus, supporting either the discordant dysfunction theory of tinnitus or a combination of OHC and IHC/synapse injury at play.

Changes to psychophysical tuning curves may offer insight into differentiating OHC versus IHC/synaptic contributions to the onset of tinnitus but, in regard to tinnitus effects, are currently limited to humans. A psychophysical tuning curve is a method that can be used to generate data comparable to the physiological frequency threshold curve for a single auditory nerve fiber. A narrowband noise of variable center frequency is used as a masker, and a fixed-frequency, fixed-level pure tone at about 20 dB HL is commonly the target. The masker level that just masks the tone is found for different masker frequencies. With OHC damage, the tuning curve becomes flattened and less sharp due to loss of sensitivity.

For example, Tan et al. (2013) examined psychophysical tuning curves in persons with HL and tinnitus and in persons with HL and no tinnitus. Both groups were compared with a reference group of persons with normal hearing. The normal-hearing group showed expected patterns of low thresholds and sharp tuning curves; these patterns are thought to reflect the nonlinearity of the OHCs. Interestingly, the HL group with tinnitus showed better thresholds, greater residual compression, and better tuning than the no-tinnitus group in the midfrequency range. This was likely reflective of the greater high-frequency HL of the tinnitus group relative to the no-tinnitus group, which had a wider array of patterns. Thus, the finding could simply reflect differences in hearing thresholds; however, after matching participants based on HL, the pattern persisted. Tan et al. suggested that the findings may be explained by the tinnitus group having residual OHC function and a preferential loss of IHCs or afferents.

The difference between the animal model of widespread loss of IHCs with a lack of tinnitus evidence and the psychoacoustic tuning curves in humans implicating IHCs/synapses may also be explained by the discordant damage theory. The carboplatin model creates a pure loss of IHCs/synapses without damage to OHCs. Humans, in contrast, may still have some level of damage to their OHCs not reflected in their tuning curves. In other words, it would be parsimonious to suggest that there is likely a ratio of damage to both hair cell types involved and necessary
Christopher Spankovich
[email protected]
Department of Otolaryngology — Head and Neck Surgery
University of Mississippi Medical Center
Jackson, Mississippi 39216, USA

Christopher Spankovich is an associate professor and vice chair of research in the Department of Otolaryngology — Head and Neck Surgery, University of Mississippi Medical Center (Jackson). He obtained his MPH from Emory University (Atlanta, GA), AuD from Rush University (Chicago, IL), and PhD from Vanderbilt University (Nashville, TN). He is a clinician-scientist with a translational research program focused on the prevention of acquired forms of hearing loss, tinnitus, and sound sensitivity. He continues to practice clinically, with a special interest in tinnitus, sound sensitivity, ototoxicity, hearing conservation, and advanced diagnostics. He serves as an associate editor for the International Journal of Audiology.

Edward Lobarinas
[email protected]
School of Behavioral and Brain Sciences
University of Texas at Dallas
Dallas, Texas 75235, USA

Edward Lobarinas earned a bachelor's degree from Rutgers University (New Brunswick, NJ) and a master's degree and PhD from the State University of New York at Buffalo. He has had faculty appointments in audiology programs at the University at Buffalo, the University of Florida (Gainesville), and the University of Texas at Dallas. He is trained as a clinical audiologist and a basic researcher. His research interests include tinnitus, tinnitus treatments, perceptual changes associated with the selective loss of inner hair cells, and machine learning applications for assistive listening devices. His work has been funded by the National Institutes of Health, private foundations, and industry.
Introduction: Rendering Melodies with Overtones
A single singer but two voices? Experience that situation by visiting world-voice-day.org/EDU/Movies and checking the second movie, titled "Sehnsucht nach dem Frühlinge (Mozart) — Anna-Maria Hefele (AMH)." There, coauthor AMH sings a song by Mozart, first with her singing voice and then with two simultaneous voices, a drone (a low-pitched, continuously sounding tone) plus a whistle-like high-pitched tone that renders the melody. How is this possible? That is the question that we pose here. Let us start by recalling how sounds are created by the instrument AMH is playing, the human voice.

Vocal Sound Generation
Figure 1 shows a frame from the movie mentioned above. It shows a magnetic resonance imaging (MRI) view with the various parts of the voice organ labeled. Voice production is the summed result of three processes: (1) compression of air below the vocal folds; (2) vocal fold vibration, quasi-periodically chopping airflow from the subglottal region; and (3) filtering of the acoustic signal of this pulsatile airflow.

Figure 1. Magnetic resonance (MR) image of Anna-Maria Hefele's (AMH's) head and throat, taken from the video where she performs a Mozart melody in overtone singing technique.

The overpressure of air below the folds throws them apart, thus allowing air to pass through the slit between them. Then, aerodynamic conditions reduce the air pressure along the folds, which, together with the elasticity of their tissue, closes the slit. The same pattern is then repeated, thus generating vocal fold vibration.

The vibration generates a pulsatile airflow, as seen in Figure 2A, producing sound, the voice source. The pitch is determined by the vibration frequency, whereas the waveform is far from sinusoidal. Hence, this airflow signal is composed of a number of harmonic partials. In other words, the frequency of partial number n is n × fo, where fo is the frequency of the lowest partial, the fundamental or vibration frequency. The amplitudes of the partials tend to decrease with their frequency; the amplitude of partial n tends to be something like 12 dB stronger than the amplitude of partial 2 × n. The spectrum envelope of the voice source is rather smooth and has a negative slope, as seen in Figure 2B.
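As a numerical illustration of the source spectrum just described, the short sketch below lists the first ten partials of a voice source; the 100-Hz fundamental and the uniform -12 dB/octave slope are idealized values, not measurements.

import math

fo = 100.0                      # fundamental (vibration) frequency, Hz
slope_db_per_octave = -12.0     # amplitudes fall ~12 dB per doubling of frequency

for n in range(1, 11):
    freq = n * fo                               # partial n lies at n x fo
    level = slope_db_per_octave * math.log2(n)  # level in dB relative to partial 1
    print(f"partial {n:2d}: {freq:6.0f} Hz  {level:6.1f} dB")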
The voice source is injected into the vocal tract (VT), which is a resonator. Hence it possesses resonances at certain frequencies. Partials with frequencies close to a VT resonance frequency are enhanced, and partials further away are attenuated (see Figure 2C). Therefore, the spectrum envelope of the sound radiated from the lip opening (Figure 2D) contains peaks at the VT resonance frequencies and valleys in between them. In this sense, the VT resonances form the spectrum envelope of the sound emitted to the free air. Probably for this reason, VT resonances are frequently referred to as formants.
a strong effect on these frequencies, and protruding the lips makes the VT longer, thus lowering the formant frequencies. The shape of the VT can be varied within wide limits. Moreover, by bulging the tongue more or less and in various directions, the VT can be narrowed or widened in almost any place along its length axis, from the deep pharynx to the hard palate. Also, the jaw and lip openings contribute to determining the VT shape. As a result, the formant frequencies can be varied within quite wide ranges: the first formant between about 150 and 1,000 Hz, the second from about 500 to 3,000 Hz, and the third from about 1,500 to 4,000 Hz.

Figure 2. A and B: typical waveform and spectrum, respectively, of the glottal airflow during phonation. C and D: vocal tract transfer function for the vowel /ae/ and the corresponding radiated spectrum, respectively.

Overtone Singing
What is overtone singing, then? The term covers several different styles. Overtone singing was described as early as the nineteenth century by the famous singing teacher Manuel Garcia (Wendler, 1991) and has attracted the interest of several researchers. Smith and colleagues (1967) described "throat singing," a type of chant performed by Tibetan Lamas (see, e.g., s.si.edu/37QCSOZ). It is produced by males with special types of vocal fold vibrations, referred to as the vocal fry register. Its pitch is stable and very low and is produced by a vocal fold vibration pattern in which every second or third airflow pulse is attenuated. Consequently, the pitch period of this drone is doubled or tripled. A similar type of phonation often occurs in phrase endings in conversational speech but is then typically aperiodic. In throat singing, two of the overtones are quite strong and audible, thereby together giving the impression of a "chord." Throat singing is regarded as sacred in some Asian cultures.

The overtone singing demonstrated by AMH in the above link can be produced by both females and males. However, the fundamental frequency of the drone is not as low as in throat singing. In AMH's case, it is in the range typical of female speaking voices. The melody is played in a much higher pitch range by very strong overtones. Figure 3 shows some examples where overtones number 9, 7, and 4 are the strongest in the spectrum.
Figure 3. Examples of spectra produced in overtone singing by AMH. FE, frequency of the enhanced overtone.
way of selecting and amplifying the harmonics and have refined their VT motor skills to be able, with great precision in time and frequency, to produce harmonics as melodic sequences. Next, we test the hypotheses (1) that enhancing and selecting a single partial is the result of VT shapes producing clustering of formants and (2) that overtone singing is produced with a regular sound source.

Acoustic Theory
Enhancing Single Harmonics
How can formants produce the excessive amplitudes of the single overtones illustrated in Figure 3? Mathematically, the spectral envelope, the function determining the amplitudes of the overtones, is the sum of formant resonance curves (which vary with changes in VT shape) and certain factors such as glottal source and radiation characteristics (which do not depend on articulatory activity). The only input to the calculation is the frequencies and bandwidths of the formants. The latter generally vary with the former in a predictable manner, so formant amplitudes need not be specified. Figure 5 illustrates this predictability.

Figure 5 shows three line spectra with the cardinal shapes of the resonance curves for the first and second formants (henceforth F1 and F2). The amplitudes of the partials and their spectral envelopes were derived in accordance with the standard source-filter model (Fant, 1960). This theory treats the envelope as the sum of formant curves and the constant contributions of source and radiation characteristics, which are not shown in Figure 5. The bandwidths are secondary aspects, being determined mainly by the frequencies of the formants (Fant, 1972).

Figure 5. Schematic illustration of the spectrum effects of moving the frequencies of two formants closer together. Vertical lines, partials of a drone with a fundamental frequency of 100 Hz; blue and red curves, first (F1) and second (F2) formants, respectively. Left: F1 = 600 Hz, F2 = 1,400 Hz. Center: F1 = 600 Hz, F2 = 2,150 Hz. Right: F1 = 600 Hz, F2 = 650 Hz, thus creating a "double formant" that creates a very strong partial (arrow).

In Figure 5, F1 was fixed at 600 Hz while F2 was varied. When F1 and F2 approach almost identical frequencies (Figure 5, right), creating, as it were, a double formant, their individual peaks merge into a single maximum, with a significant increase in the amplitude of the closest partial. In other words, acoustic theory states that formant amplitudes are predictable and thus suggests an answer to the question asked in the first sentence of this section. Enhancing the amplitude of individual overtones is possible: Move two formants close to each other in frequency. Create a double formant!
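A toy calculation makes the double-formant effect concrete. The sketch sums two idealized second-order resonance curves using the formant frequencies of Figure 5; the 100-Hz bandwidths and the simple dB summation are simplifying assumptions, not the full source-filter computation.

import math

def formant_db(f, fc, bw):
    # Level (dB) of an idealized second-order resonance with center fc and bandwidth bw.
    return 20 * math.log10(fc**2 / math.sqrt((fc**2 - f**2)**2 + (bw * f)**2))

def envelope_db(f, formants):
    # Spectrum envelope approximated as the sum (in dB) of the formant curves.
    return sum(formant_db(f, fc, bw) for fc, bw in formants)

apart  = [(600.0, 100.0), (1400.0, 100.0)]   # Figure 5, left: F1 and F2 well separated
merged = [(600.0, 100.0), (650.0, 100.0)]    # Figure 5, right: a "double formant"

# Level of the 6th partial of a 100-Hz drone (600 Hz) in the two configurations:
print(envelope_db(600.0, apart), envelope_db(600.0, merged))   # ~17 dB vs. ~29 dB

In this toy example, merging the two formants boosts the partial nearest the cluster by roughly 12 dB, which is exactly the kind of enhancement the hypothesis requires.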
Measurements and Modeling
Vocal Tract Shapes
As shown above, the formant frequencies are determined by the shape of the VT, so what was the shape of AMH's VT? This has actually been documented in another dynamic MRI video published by the Freiburg Institute of Musician's Medicine, Naturtonreihe in Zungentechnik (see youtu.be/-jKl61Xxkh0). It was taken when AMH performed overtone singing, enhancing, one by one, each overtone of a drone with a fundamental frequency of 270 Hz (pitch about C4), in a rising followed by a descending sequence. Henceforth, the frequencies of the enhanced overtones will be referred to as FE. All overtones, from the 4th, FE = 1,080 Hz, up to the 12th, FE ≈ 3,200 Hz, were enhanced.

The MRI video shows her entire VT in a midsagittal lateral profile. Figure 6 shows tracings of the VT for each of the enhanced overtones in the ascending and the descending series.

Voice Source
Is formant clustering an exhaustive explanation of overtone singing? Fortunately, the transfer function of the VT can be predicted given its formant frequencies. Thus, a vowel spectrum can be analyzed not only with respect to the formant frequencies, which appear as peaks in the spectrum envelope, but also with respect to the voice source. The trick is simple: inverse filtering!
very wide limits. We can vary the shape of the tongue body, the position of the tongue tip, the jaw and lip openings, the larynx height, and the position and status of the gateway to the nose, the velum.

Let us now more closely examine the shape of AMH's VT as documented in the MRI video. It is evident from Figure 6 that AMH produced overtone singing with a lifted tongue tip, so the tongue tip divided the VT into a front cavity and a back cavity. Our first target is the back cavity, posterior to the raised tongue tip.

The formant frequencies associated with a given VT shape can be estimated from the VT contour. Several investigations have examined the relationship between the sagittal distance separating the VT contours and the associated cross-sectional area at the various positions along the VT length axis (see, e.g., Ericsdotter, 2005). Hence it was possible to describe the shape of the back cavity for each FE in terms of an area function that lists the cross-sectional area as a function of the distance to the vocal folds.

The next question concerns the front cavity, anterior to the raised tongue tip. The cavity between the palatal constriction and the lip opening looks like, and can be regarded as, a Helmholtz resonator, a cavity in front of the tongue tip and a neck formed by the rather narrow lip opening (Sundberg and Lindblom, 1990; Granqvist et al., 2003).

The area of the lip opening was measured in a front video recorded when AMH produced the same overtone series as for the MRI video. The length of the lip opening was documented in the MRI video. These measures, plus the frequency of the third formant used for the inverse filtering analysis, allowed us to use the Helmholtz equation for calculating the front cavity volume. The validity of this approximation was corroborated in terms of a strong correlation between the measured length and the volume of the front cavity.
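For reference, the Helmholtz resonance relation used in this calculation can be rearranged for the unknown volume; the numerical example below is illustrative, not one of AMH's measured configurations:

\[ f = \frac{c}{2\pi}\sqrt{\frac{A}{V\,\ell}} \qquad\Longrightarrow\qquad V = \frac{c^{2}A}{4\pi^{2}f^{2}\ell} \]

where A is the lip opening area, ℓ its effective length, V the front cavity volume, and c the speed of sound in the warm, humid air of the VT (roughly 350 m/s). With, say, A = 1 cm², ℓ = 2 cm, and a front cavity resonance at f = 3,000 Hz, this gives V ≈ 1.7 cm³, a plausible front cavity size of a few cubic centimeters.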
The formant frequencies of the entire VT could be calculated by custom-made software, Wormfrek (Liljencrants and Fant, 1975). Figure 8 shows the transfer functions with the formant frequencies for three FE values: 1,096, 2,166, and 3,202 Hz. In Figure 8, the arrows highlight the close proximity of F2 and F3. In Figure 8, bottom, F1, F2, and F3 are plotted as a function of FE. The trend lines show that F2 and F3 have similar slopes and intercepts differing by about 220 Hz.

Figure 8. Top: Wormfrek software displays of the transfer functions for the lowest, a middle, and the highest FE (left, center, and right, respectively). Bottom: associated values of F1, F2, and F3 as a function of FE. Lines and equations refer to trend line approximations.

We note that the F1, F2, and F3 predictions parallel the formant measurements made using inverse filtering (Figure 7). Here, a somewhat wider distance separates F2 from F3 than what was shown in Figure 7. The common
denominator is the consistent identification of the double formant. We feel justified in concluding that our results confirm the double formant phenomenon as a prerequisite for the overtone selection and enhancement in AMH's overtone singing technique.

Conclusions
Central to the present account is the "double formant" hypothesis, which attributes the phenomenon of overtone singing to VT filtering. However, the inverse filtering results also suggest that overtone singing involves a phonation type different from that in conversational voice, making the source spectrum slope less steep and thus boosting the amplitudes of the higher overtones. These findings replicate and extend previous investigations of overtone singing. Bloothooft et al. (1992) undertook an acoustic study of an experienced overtone singer, suggested formant clustering as an explanation, and also noted an extended closed phase of the vocal fold vibrations. Using impedance measurements, Kob (2004) analyzed a form of overtone singing called sygyt and interpreted the overtone boosting as the result of formant clustering.

Parallel vibrations of the ventricular folds have been documented in throat singing (Lindestad et al., 2001). How about this possibility in AMH's overtone singing? Our inverse filtering data clearly rule out the existence of a laryngeal mechanism that selectively amplifies and enhances individual partials.

Overtone singing clearly requires an extremely high degree of articulatory precision; for each FE, two cavities need to be shaped such that they produce resonance frequencies that match each other within a few tens of hertz. How can the underlying motor control be organized? It is probably relevant that some of the articulatory configurations shown in Figure 6 are also used in speech. The lateral profile for FE = 1,096 Hz resembles the articulation of retroflex consonants (Dixit, 1990; Krull and Lindblom, 1996). A narrow pharyngeal constriction is typical of [a]-like vowels and pharyngealized consonants (Ladefoged and Maddieson, 1996). The VT for FE = 3,202 Hz has a "palatalized" tongue shape similar to that used for the vowel [i].

It is also relevant that the articulatory parameters varied systematically with FE. This is illustrated in Figure 9, which shows how AMH varied the lip opening area, length of the palatal constriction, larynx height, front cavity volume, and pharynx area as a function of FE. It is evident that the values of each individual articulatory dimension are aligned along smooth contours running between their values at FE = 1,096 and 3,202 Hz. This lawful patterning suggests that it would be possible to derive VT shapes intermediate between those for FE = 1,096 and 3,202 Hz by interpolation. A rough description would be to say that the VT shapes are located along a trajectory in the articulatory space that runs between a retroflex, pharyngealized [a] and an [i]-like, palatalized tongue profile.
tics of organ pipes as a guest researcher in Gunnar Fant's department at the Royal Institute of Technology (KTH; Stockholm, Sweden). This brought him into productive contact with Björn Lindblom. After finishing his dissertation in 1966, he founded a research group in the area of music acoustics at KTH and was awarded a personal chair in music acoustics there in 1979. An active singer himself, he has made the voice as a musical instrument his main research theme, along with the theory underlying music performance.

Author photo by Linnéa Heinerborg.

Björn Lindblom
[email protected]
Department of Linguistics
Stockholm University
SE-106 91 Stockholm, Sweden

Björn Lindblom became an experimental phonetician in the early 1960s. His publications span a wide range of topics, including the development, production, and perception of speech. His academic experience includes teaching and doing laboratory research at the Royal Institute of Technology (KTH; Stockholm, Sweden), Haskins Laboratories (New Haven, CT), MIT (Cambridge, MA), Stockholm University (SU), and the University of Texas at Austin (UT). He has held endowed chairs at SU and UT. He is a Fellow of the Acoustical Society of America and of the American Association for the Advancement of Science (AAAS) and is an Honorary Life Member of the Linguistic Society of America (LSA). His current project is a book: Reinventing Spoken Language — The Biological Way.

Anna-Maria Hefele
[email protected]
Overtone Academy
Saulengrainerstrasse 1
DE-87742 Dirlewang, Germany

Anna-Maria Hefele has a Master of Arts from Mozarteum Salzburg (Austria) and is a multi-instrumentalist singer and overtone singer, performing worldwide as a soloist with different ensembles, choirs, and orchestras. She frequently performs in contemporary ballet, circus, and dance theater productions. Her YouTube video "Polyphonic Overtone Singing" went viral and has resulted in more than 17 million views so far, followed by regular appearances in various international television shows and radio broadcasts. Headlines like "A Voice as from Another World," "The Lady with the Two Voices," and "Polyphonic Vocalist Does the Impossible" have spread across the world.
operations, freeing them up for other tasks as well as reducing human errors.

Overview and History of the Global Positioning System
The GPS is owned by the US Government and is operated and maintained by the US Air Force out of Schriever Air Force Base in Colorado. The system is young, relatively speaking. The GPS project was started in 1973, and the first NAVigation System with Timing and Ranging (NAVSTAR) satellite was launched in 1978. The 24-satellite system became fully operational in 1995. A photograph of a modern GPS-III satellite and a depiction of the satellite constellation are shown in Figure 1.
The same basic relationship from Eq. 1 that is used to calculate the distance from satellites can be applied to acoustic signals as well. Here, rather than multiplying the time that the GPS signal has traveled by the speed of light, the travel time of the signal is multiplied by the speed of sound in the medium through which it is traveling. The speed of sound in the ocean is roughly 1,500 m/s. This is much slower than the speed of light, and it is also quite variable because the speed of sound in seawater depends on the seawater temperature, salinity, and depth.
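As a rough sketch of that conversion, the snippet below pairs Eq. 1 with a simplified empirical sound speed formula (a fit of the form given by Medwin; the coefficients are quoted for illustration only, and precise work uses full oceanographic equations):

def sound_speed(T, S, z):
    # Approximate speed of sound in seawater (m/s) from temperature T (deg C),
    # salinity S (ppt), and depth z (m); simplified empirical fit (Medwin, 1975).
    return (1449.2 + 4.6 * T - 0.055 * T**2 + 0.00029 * T**3
            + (1.34 - 0.010 * T) * (S - 35.0) + 0.016 * z)

def acoustic_range(travel_time, c=1500.0):
    # Eq. 1 with the speed of sound standing in for the speed of light.
    return c * travel_time

# The same 2-s travel time maps to different ranges as the water properties change:
print(acoustic_range(2.0, sound_speed(T=20.0, S=35.0, z=10.0)))    # warm surface water
print(acoustic_range(2.0, sound_speed(T=4.0, S=34.5, z=1000.0)))   # near the channel axis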
Traditional Underwater Positioning and Local Vehicle Navigation Systems
Underwater vehicles routinely get position and timing from a GPS receiver when they are at the surface, but once they start to descend, this is no longer available. Vehicles navigate underwater using some combination of dead reckoning, vehicle hydrodynamic models, inertial navigation systems (INSs), and local navigation networks (Paull et al., 2014). Positioning in the z direction, the depth in the ocean, is straightforward with a pressure sensor, which can reduce the dimensionality of the problem to horizontal positioning in x and y, or longitude and latitude, respectively.

Dead reckoning estimates the position using a known starting point that is updated with measurements of vehicle speed and heading as time progresses. Larger vehicles, such as submarines, may have an onboard INS that integrates measurements of acceleration to estimate velocity and thereby position. These measurements are, however, subject to large integration drift errors.
Because of the need for more position accuracy than afforded by the submarine systems discussed above, it comes as no surprise that underwater vehicles also use acoustics for localization. A long-baseline (LBL) acoustic-positioning system is composed of a network of acoustic transponders, often fixed on the seafloor with their positions accurately surveyed. The range measurements from multiple transponders are used to determine position. LBL systems typically operate on scales of 100 meters to several kilometers and have accuracies on the order of a meter. Transponder buoys at the surface can also provide positioning accuracy similar to a seafloor LBL network.
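Combining ranges from several surveyed transponders into a position fix is a small least-squares problem. The sketch below is a bare-bones two-dimensional Gauss-Newton solver; the transponder coordinates and ranges are invented, and a real LBL system must also account for sound speed structure, clock offsets, and outliers.

import numpy as np

def trilaterate_2d(anchors, ranges, guess=(250.0, 250.0), iters=10):
    # Iteratively fit a horizontal (x, y) position to ranges measured from
    # transponders at known positions (Gauss-Newton on the range residuals).
    p = np.array(guess, dtype=float)
    anchors = np.asarray(anchors, dtype=float)
    ranges = np.asarray(ranges, dtype=float)
    for _ in range(iters):
        d = np.linalg.norm(anchors - p, axis=1)   # ranges predicted from p
        J = (p - anchors) / d[:, None]            # d(range)/d(position)
        p += np.linalg.lstsq(J, ranges - d, rcond=None)[0]
    return p

# Three seafloor transponders (m) and ranges (m) to a vehicle actually at (120, 80):
print(trilaterate_2d([(0, 0), (500, 0), (0, 500)], [144.2, 388.3, 436.8]))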
Short-baseline (SBL) systems operate on a smaller scale, and the SBL transducers are typically fixed to a surface vessel. Ultrashort-baseline (USBL) systems are typically a small transducer array, also often fixed to a surface vehicle, that uses phase (arrival angle) information of the acoustic signals to determine the vehicle position.

These types of acoustic localization work in a way similar to GPS localization with electromagnetic waves; however, they all operate over relatively small regions. Note that these acoustic-positioning methods have been described in the context of underwater vehicles, but they can be used for other purposes as well, including tracking drifting instrumentation or even animals underwater.

Long-Range Underwater Acoustic Propagation in the SOFAR Channel
Attenuation of acoustic signals in the ocean is highly dependent on frequency. The signals commonly used for LBL, SBL, and USBL localization networks typically have frequencies of tens of kilohertz and upward. These signals may travel for a few kilometers, but lower frequency signals, on the order of hundreds of hertz or lower, are capable of traveling across entire ocean basins underwater. This was demonstrated in 1991 by the Heard Island Feasibility Test, where a signal was transmitted from Heard Island in the Southern Indian Ocean and received at listening stations across the globe, from Bermuda in the Atlantic Ocean to Monterey, CA, in the Eastern Pacific Ocean (Munk et al., 1994).

Refractive effects of the ocean waveguide are usually taken into account when using the acoustic-positioning methods described above because an acoustic arrival often does not take a direct path from the source to the receiver, and often a number of arrivals resulting from multiple propagation paths are received. The refractive effects of the ocean waveguide become even more important as ranges increase. Acoustic arrivals can be spread out over several seconds; however, the arrival time structure can be predicted based on the sound speed profile.

The speed of sound in the ocean increases with increasing hydrostatic pressure (depth in the ocean) and with the higher temperatures that occur near the surface. This leads to
a sound speed minimum referred to as the sound channel axis, which exists at approximately 1,000 m depth, although the depth can vary depending on where you are on the globe (Figure 3).

Figure 3. Left: canonical profile of sound speed as a function of depth in the ocean (solid line). Right: refracted acoustic ray paths from a source at 1,000 m depth to a receiver at 1,000 m depth and at a range of 210 km. The sound channel axis (dashed line) is located at the sound speed minimum at a depth of 1 km. Adapted by Discovery of Sound in the Sea (see dosits.org) from Munk et al., 1995, Figure 1.1, reproduced with permission.

The SOFAR channel, short for SOund Fixing And Ranging, refers to a sound propagation channel (Worzel et al., 1948) that is centered around the sound channel axis. Sound from an acoustic source placed at the sound speed minimum will be refracted by the sound speed profile, preventing low-angle energy from interacting with the lossy seafloor and enabling the sound rays to travel for very long distances, up to thousands of kilometers.

The rays take different paths when traveling over these long ranges, as seen in Figure 3. The arrival time at a receiver is an integrated measurement of travel time along the path of the ray. Rays that are launched at angles near the horizontal stay very close to the sound speed minimum. Rays that are launched at higher angles travel through the upper ocean and deep ocean, and although they take a longer route than the lower angle rays, they travel through regions of the ocean that have a faster sound speed and therefore arrive at a receiver before their counterparts that took the shorter, slower road.

Ocean Acoustic Tomography Measurements
Ocean acoustic tomography takes advantage of the variability in measured travel times for specific rays to invert for ocean temperature. Each ray has traveled a unique path through the ocean and therefore carries with it information on the sound speed along the particular path that it has traveled. On a very basic level, we are looking again at the relationship from Eq. 1, but here distance and travel time are known, and we are inverting for sound speed, which is a proxy for temperature. In ocean acoustic tomography, the variability in these acoustic travel times is measured regularly over a long period of time (acoustic sources and receivers often remain deployed in the ocean for a year at a time) to track how the ocean temperature is changing. This method was described by Worcester et al. (2005) in the very first issue of Acoustics Today and more thoroughly in the book Ocean Acoustic Tomography by Munk et al. (1995).
The variability in these travel times is measured in milliseconds; therefore, as with a GNSS, the acoustic travel time measurements must be extremely precise. Great care is taken to use clocks with low drift rates and to correct for any measured clock drift at the end of an experiment.

The locations of the acoustic sources and receivers also must be accurate because inaccuracies in either position would lead to an inaccurate calculation of distance, which would impact the inversion for sound speed based on the simple relationship of Eq. 1. The sources and receivers used in typical ocean acoustic tomography applications are on subsurface ocean moorings, meaning that there
The SOFAR float signals were originally received by the SOund SUrveillance System (SOSUS), listening stations operated by the US military. This system tracked more than just floats and enemy submarines. It also received acoustic signals from earthquakes, and there is a wonderful 43-day record of the passive tracking of an individual blue whale, nicknamed Ol' Blue, as it took a 3,200-km tour of the North Atlantic Ocean (Nishimura, 1994).

The existing listening system was convenient, but equipping each float with an acoustic source was technologically challenging and expensive. In the 1980s, the concept was flipped so that the float had the hydrophone receiver, and acoustic sources transmitted to the floats from known locations to estimate range to the float. The name was also flipped, and the floats are known as RAFOS, an anadrome for SOFAR (Rossby et al., 1986).

Figure 4, c and d, shows slices of these acoustic predictions at a 2,000 m depth. The broadband signal shown in Figure 4d exhibits sharp peaks in the arrival that can be identified with individual ray paths.

The increased bandwidth is one of the design suggestions for a potential joint navigation/thermometry system addressed in Duda et al. (2006). A system of sources is suggested with center frequencies on the order of 100-200 Hz and a 50-Hz bandwidth.

The acoustic sources used for ocean acoustic tomography applications are broadband sources designed to transmit over ocean basin scales. A 2010-2011 ocean acoustic tomography experiment performed in the Philippine Sea featured six acoustic sources in a pentagon arrangement and provided a rich dataset for evaluating long-range
positioning algorithms. The sources used in this particular experiment had a center frequency of about 250 Hz and a bandwidth of 100 Hz.

The sources were used to localize autonomous underwater vehicles that had access to GPS at the sea surface but only surfaced a few times a day. Hydrophones on the vehicles received acoustic transmissions from the moored sources at ranges up to 700 km, and these signals were used to estimate the position of the vehicle when it was underwater (Van Uffelen et al., 2013). The measured acoustic arrivals were similar to the modeled arrival shown in Figure 4d. The measurements of these peaks collected on the vehicle were matched to predicted ray arrivals to determine range. This method takes advantage of the multipath arrivals in addition to signal travel time. As with other acoustic methods and with the GPS, ranges from multiple sources were combined to obtain estimates of vehicle position. The resulting positions had estimated uncertainties of less than 100 m root mean square (Van Uffelen et al., 2015).

Other long-range acoustic-ranging methods incorporate predictions of acoustic arrivals based on ocean state estimates (Wu et al., 2019). An algorithm introduced by Mikhalevsky et al. (2020) provides a "cold start" capability that does not require an initial estimate of the acoustic arrival and has positioning errors on the order of 60 m. These results were validated using hydrophone data from known positions that received the Philippine Sea source signals. As with the aforementioned method, this algorithm relies on the travel-time resolution afforded by the broadband source signals.

How Feasible Is a Global Navigation Acoustic System?
Because acoustic signals are able to propagate over extremely long ranges underwater, acoustics could provide an underwater analogue to the electromagnetic GNSS signals that are used for positioning in the land, air, and space domains. There are definite differences between using an underwater acoustic positioning system and a GNSS, however. GNSS satellites orbit the earth twice a day and transmit continuously. Acoustic sources do not need to be in orbit, but proper placement of the sources would enable propagation to most regions of the oceans of the world.

The far reach of underwater acoustic propagation is demonstrated by the International Monitoring System (IMS) operated by the Comprehensive Nuclear Test Ban Treaty Organization (CTBTO). The IMS monitors the globe for acoustic signatures of nuclear tests with only six underwater passive acoustic hydrophone monitoring stations worldwide. Figure 5 shows the coverage of these few stations. Signals received on these hydroacoustic stations were used to localize an Argentinian submarine that was lost in 2017 using acoustic recordings of the explosion on IMS listening stations at ranges of 6,000 and 8,000 km from the site (Dall'Osto, 2019).

You may note that Figure 5 does not show much coverage in the Arctic Ocean and that the sound speed structure is quite different at high latitudes because it does not have the warm surface layer that we see in Figure 3; however, long-range propagation has been demonstrated in the Arctic
Ocean as well. In a 2019-2020 experiment, 35-Hz signals were transmitted across the Arctic Ocean over the North Pole (Worcester et al., 2020).

Figure 5. Global coverage of the Comprehensive Nuclear Test Ban Treaty Organization (CTBTO) International Monitoring System (IMS), shown by a 3-dimensional model of low-frequency (<50-Hz) propagation. The property of reciprocity is invoked by placing sources at the locations of the six IMS hydrophone listening stations (red areas) from where the sound radiates. Colors represent transmission loss with a range of 70 dB. Figure created by Kevin Heaney and reproduced from Heaney and Eller, 2019, with permission.

The electromagnetic signals broadcast by GNSS satellites are outside the visible spectrum, so we do not notice the signals that are continuously emitted by the satellites. In addition to the engineering challenges that continuous acoustic transmission would face, the frequency band of long-range propagation is within the hearing range of many animals, and the impacts on the environment, including potentially masking marine mammal vocalizations, would need to be considered. Long-range acoustic transmissions for scientific purposes go through an intense permitting process that takes into account the environment and the impacts on marine animals in the environment.

Each GNSS satellite broadcasts navigation messages that include the date and time as well as the status of the satellite. It broadcasts ephemeris data that provide its specific orbital information for more precise localization of the GPS receiver. Localization using dedicated networks of sources, such as the example in the Philippine Sea, has been discussed; precise source position and timing are as necessary for localization of an acoustic receiver as they are for a GPS receiver. A vision for

Final Thoughts
The GPS satellite constellation was originally designed to meet national defense, homeland security, civil, commercial, and scientific needs in the air, in the sea, and on land. The age of artificial intelligence and big data has made GPS data on land incredibly useful to all of us in our everyday life. Not only can we use information on our own location from our cell phone to find the nearest coffee shop, we can take advantage of the location information on many different devices to look at traffic patterns to gauge the best way to get to that coffee shop. It won't be too long until we will be riding in self-driving cars, automatically taking the best route and precisely positioned relative to each other. All of this happened in just the last few decades because it has been only 25 years since GPS became fully operational.

An underwater analogue to a global navigation satellite system would revolutionize any operations in the underwater domain, including oceanographic science, naval military applications, underwater vehicles, and even scuba diving. Acoustics is the most promising way to approach this on a large scale.

Acknowledgments
I extend my gratitude to Arthur Popper, Kathleen Wage, and Peter Worcester for their helpful suggestions and acknowledge the Office of Naval Research (ONR) for supporting my work related to underwater acoustic positioning.

References
Chamberlain, P. M., Talley, L. D., Mazloff, M. R., Riser, S. C., Speer, K., Gray, A. R., and Schwartzman, A. (2018). Observing the ice-covered Weddell Gyre with profiling floats: Position uncertainties and correlation statistics. Journal of Geophysical Research: Oceans 123, 8383-8410. https://doi.org/10.1029/2017JC012990.
Dall'Osto, D. R. (2019). Taking the pulse of our ocean world. Acoustics Today 15(4), 20-28.
Duda, T., Morozov, A., Howe, B., Brown, M., Speer, K., Lazarevich, P., Worcester, P., and Cornuelle, B. (2006). Evaluation of a long-range joint acoustic navigation/thermometry system. Proceedings of OCEANS 2006, Boston, MA, September 18-21, 2006, pp. 1-6.
Freitag, L., Ball, K., Partan, J., Koski, P., and Singh, S. (2015). Long range acoustic communications and navigation in the Arctic. Proceedings of OCEANS 2015 - MTS/IEEE, Washington, DC, October 19-22, 2015, pp. 1-5.
Guier, W. H., and Weiffenbach, G. C. (1998). Genesis of satellite navigation. Johns Hopkins APL Technical Digest 19(1), 14-17.
Heaney, K. D., and Eller, A. I. (2019). Global soundscapes: Parabolic equation modeling and the CTBTO observing system. The Journal of the Acoustical Society of America 146(4), 2848. https://doi.org/10.1121/1.5136881.
Howe, B. M., Miksis-Olds, J., Rehm, E., Sagen, H., Worcester, P. F., and Haralabus, G. (2019). Observing the oceans acoustically. Frontiers in Marine Science 6, 426. https://doi.org/10.3389/fmars.2019.00426.
Mikhalevsky, P. N., Sperry, B. J., Woolfe, K. F., Dzieciuch, M. A., and Worcester, P. F. (2020). Deep ocean long range underwater navigation. The Journal of the Acoustical Society of America 147(4), 2365-2382. https://doi.org/10.1121/10.0001081.
Muir, T. G., and Bradley, D. L. (2016). Underwater acoustics: A brief historical overview through World War II. Acoustics Today 12(2), 40-48.
Munk, W., Worcester, P., and Wunsch, C. (1995). Ocean Acoustic Tomography. Cambridge University Press, New York, NY.
Munk, W. H., Spindel, R. C., Baggeroer, A., and Birdsall, T. G. (1994). The Heard Island Feasibility Test. The Journal of the Acoustical Society of America 96, 2330-2342.
Nishimura, C. E. (1994). Monitoring whales and earthquakes by using SOSUS. 1994 Naval Research Laboratory Review, pp. 91-101.
Paull, L., Saeedi, S., Seto, M., and Li, H. (2014). AUV navigation and localization: A review. IEEE Journal of Oceanic Engineering 39, 131-149.
Rossby, T., and Webb, D. (1970). Observing abyssal motions by tracking Swallow floats in the SOFAR channel. Deep Sea Research and Oceanographic Abstracts 17(2), 359-365. https://doi.org/10.1016/0011-7471(70)90027-6.
Rossby, T., Dorson, D., and Fontaine, J. (1986). The RAFOS system. Journal of Atmospheric and Oceanic Technology 3, 672-679.
Shetterly, M. L. (2016). Hidden Figures: The American Dream and the Untold Story of the Black Women Mathematicians Who Helped Win the Space Race. William Morrow, New York, NY.
Van Uffelen, L. J., Howe, B. M., Nosal, E. M., Carter, G. S., Worcester, P. F., and Dzieciuch, M. A. (2015). Localization and subsurface position error estimation of gliders using broadband acoustic signals at long range. IEEE Journal of Oceanic Engineering 41(3), 501-508. https://doi.org/10.1109/joe.2015.2479016.
Van Uffelen, L. J., Nosal, E. M., Howe, B. M., Carter, G. S., Worcester, P. F., Dzieciuch, M. A., Heaney, K. D., Campbell, R. L., and Cross, P. S. (2013). Estimating uncertainty in subsurface glider position using transmissions from fixed acoustic tomography sources. The Journal of the Acoustical Society of America 134, 3260-3271. https://doi.org/10.1121/1.4818841.
Worcester, P. F., Dzieciuch, M. A., and Sagen, H. (2020). Ocean acoustics in the rapidly changing Arctic. Acoustics Today 16(1), 55-64.
Worcester, P. F., Munk, W. H., and Spindel, R. C. (2005). Acoustic remote sensing of ocean gyres. Acoustics Today 1(1), 11-17.
Worzel, J. L., Ewing, M., and Pekeris, C. L. (1948). Propagation of Sound in the Ocean. Geological Society of America Memoirs 27, Geological Society of America, New York, NY. https://doi.org/10.1130/MEM27.
Wu, M., Barmin, M. P., Andrew, R. K., Weichman, P. B., White, A. W., Lavely, E. M., Dzieciuch, M. A., Mercer, J. A., Worcester, P. F., and Ritzwoller, M. H. (2019). Deep water acoustic range estimation based on an ocean general circulation model: Application to PhilSea10 data. The Journal of the Acoustical Society of America 146(6), 4754-4773. https://doi.org/10.1121/1.5138606.

About the Author
Lora J. Van Uffelen
[email protected]
Department of Ocean Engineering
University of Rhode Island
Narragansett, Rhode Island 02882, USA

Lora J. Van Uffelen is an assistant professor in the Department of Ocean Engineering, University of Rhode Island (Narragansett), where she teaches undergraduate and graduate courses in underwater acoustics and leads the Ocean Platforms, Experiments, and Research in Acoustics (OPERA) Lab. She earned her PhD in oceanography from the Scripps Institution of Oceanography, University of California, San Diego (La Jolla). Her current research projects focus on long-range underwater acoustic propagation, Arctic acoustics, vehicle and marine mammal localization, and acoustic sensing on underwater vehicles. She has participated in more than 20 research cruises, with over 400 days at sea.
Acoustics Today is pleased to present the names of the recipients of the various awards and prizes given out by the
Acoustical Society of America. After the recipients are approved by the Executive Council of the Society at each
semiannual meeting, their names are published in the next issue of Acoustics Today.
Congratulations to the following recipients of Acoustical Society of America medals, awards, prizes, and fellowships, who will be formally recognized at the Spring 2021 Plenary Session. For more information on the accolades, please see acousticstoday.org/asa-awards, acousticalsociety.org/prizes, and acousticstoday.org/fellowships.
Congratulations also to the following members who were elected Fellows of the Acoustical Society of America in spring 2021.
• Kathryn H. Arehart (University of Colorado at Boulder) for contributions to the understanding of auditory perception, hearing loss, and hearing aids
• Gregory Clement (US Food and Drug Administration, Silver Spring, MD) for contributions to transcranial therapeutic ultrasound
• Ewa Jacewicz (Ohio State University, Columbus) for contributions to the understanding of spectral and temporal dynamics in speech acoustics and perception
• Joan A. Sereno (University of Kansas, Lawrence) for contributions to speech learning, perception, and production across individuals and languages
• Brian D. Simpson (Air Force Research Laboratory, Dayton, OH) for contributions to speech perception, spatial hearing, and the development of auditory displays
• Pamela E. Souza (Northwestern University, Evanston, IL) for advancing understanding of the factors that affect an individual's response to hearing aid signal processing
• Daniel J. Tollin (University of Colorado School of Medicine, Aurora) for multidisciplinary contributions linking acoustics, physiology, and behavior to the understanding of binaural hearing
• Matthew W. Urban (Mayo Clinic, Rochester, MN) for outstanding contributions to the field of ultrasonic assessment of biologic tissue properties
Ask an Acoustician: Zoi-Heleni Michalopoulou

Zoi-Heleni Michalopoulou and Micheal L. Dent
Geoacoustic inversion is one aspect of the inverse problem. My interests extend to inversion for source detection and localization.
As is often the case, I was told that girls are not made for math, and that only made me more determined to pursue a STEM career. I decided to study electrical engineering at the National Technical University of Athens because it was the most prestigious STEM program in my hometown; the electrical circuits of my childhood may have played a role in my decision! I enjoyed my studies and decided to continue for an MS in electrical engineering in the United States. My high-school years at the American College of Greece had prepared me for this wonderful adventure. I was fortunate to attend the MS program at Duke University (Durham, NC), where I met my advisor, Dimitri Alexandrou, who had a passion for anything that had to do with sound and the ocean. He inspired me, and not only did I complete my MS thesis in ocean acoustic signal processing, but I decided to move forward for a doctorate in the same area.

Research became a passion, which led to an academic career. I have been at NJIT ever since I graduated from Duke, starting as a research assistant professor/postdoctoral fellow in the Department of Mathematical Sciences. The environment was (and still is) full of energy and a great fit. Soon after I joined, an opening for an assistant professor position came up. It was an easy decision for me to apply and accept the offer that followed. I have been enjoying a fruitful career there ever since. At Acoustical Society conferences, I have had the fortune to meet colleagues such as Ross Chapman, Alex Tolstoy, Jim Candy, Ed Sullivan, Ellen Livingston, Leon Sibul, Leon Cohen, and many others, who mentored me in my early years; to them I owe much of the satisfaction I have been drawing from my career.

How do you feel when experiments/projects do not work out the way you expected them to?
I sometimes get frustrated, but I try to take it as a learning experience. I look for the reason behind the failure of an idea. That usually leads to a new idea that is an alternative look at the problem I need to solve. And I try to remind myself that progress in research is not a linear process.

Do you feel like you have solved the work-life balance problem? Was it always this way?
Yes, as much as this is possible. I have a supportive family and a flexible working environment. Teaching courses at convenient times and having family members help with child care so that I could attend conferences helped me attain a satisfying combination of career and family life. Having a daughter who appreciated my work and enjoyed telling her friends about her mom searching for submarines was a bonus! The flexibility of my work allowed me to get involved in the community. I served as a volunteer for my daughter's Girl Scout troop and unit, and I also volunteered at her elementary school, mostly helping students with math and science. I managed never to miss my daughter's recitals, choir events, and soccer or volleyball games. I enjoyed travel and still do, attending many conferences, often with my husband and daughter, and I get to visit my family in Greece frequently, where I also enjoy collaborations with colleagues at the National Technical University of Athens.
What is a typical day for you?
I am a morning person, and my day starts very early; I am up at 5 a.m. with a cup of coffee, reading The New York Times on my computer. But, other than that, every day is different. Research, teaching, and administration all compete for time. I try to get a good few hours of uninterrupted research time before delving into class preparation, teaching, and administration. I have frequent meetings with my students that I look forward to because they often lead to fresh ideas and perspectives. I draw a firm line at around 6 p.m. Family and personal time start then, unless deadlines are looming. Relaxed family dinners, classes at the Adult School of my town, and reading occupy my evenings.

What makes you a good acoustician?
I work in an applied mathematics and statistics department in a technological institute, which enables me to have discussions and collaborations with researchers from multiple areas in the mathematical and physical sciences as well as engineering. I develop new ideas and a better understanding of acoustics problems after I become exposed to research advances in different disciplines. And I learn from my students.

How do you handle rejection?
I put aside negative reviews and revisit them a couple of weeks later. I carefully consider critique (sometimes I agree and sometimes not) and try to use it to develop new ideas or better arguments for my existing ones. I keep going.

What are you proudest of in your career?
It has been a privilege to have mentored numerous bright and talented young people, several of them from underrepresented groups in the sciences. I have had the pleasure of guiding several women in research projects, both during their graduate and undergraduate studies. I take great pride in their accomplishments during and after their time at NJIT. I follow their career paths and keep in touch; notes that they send me decorate my office. Similarly, I have found it rewarding to address middle- and high-school students and to inspire them (I hope!) about pursuing careers in STEM. On several occasions, students have approached me afterward, startled and excited about careers in math that they had never imagined.

And, of course, I am exceedingly proud of the bright 23-year-old woman that my husband and I have raised in parallel to our careers, who has often inspired me to work harder so that I could become a better role model for her and her peers.

What is the biggest mistake you've ever made?
Overthinking everything. Writing a paper or research proposal was sometimes a particularly lengthy endeavor. Should I include the last figure? How about adding one more reference? And how about this email I need to send? How will I convey my message? Once I realized it, I stopped it and became more efficient and effective. With the help of a mentor, I realized that the first person I needed to persuade that I truly belonged in a challenging academic environment was myself. Everything followed smoothly after that.

What do you want to accomplish within the next 10 years or before retirement?
I plan to continue with all my activities: research, teaching, and administration. What I would particularly like to accomplish is the mentoring of more undergraduate students in research. There is a spark when undergraduates are exposed to research questions and asked to work alongside graduate students, postdocs, and faculty. Several are inspired to go on to graduate school, and some continue to work in acoustics. Others tell me that their research experience and participation in research teams in their undergraduate years enable them to work more effectively in groups in their jobs in industry. A worthwhile experience all around.

Bibliography
Frederick, C., Villar, S., and Michalopoulou, Z.-H. (2020). Seabed classification using physics-based modeling and machine learning. The Journal of the Acoustical Society of America 148, 859-872.
Lin, T., and Michalopoulou, Z.-H. (2016). A direct method for the estimation of sediment sound speed with a horizontal array in shallow water. IEEE Journal of Oceanic Engineering 42, 208-218.
Michalopoulou, Z.-H., Pole, A., and Abdi, A. (2019). Bayesian coherent and incoherent matched-field localization and detection in the ocean. The Journal of the Acoustical Society of America 146, 4812-4820.
Piccolo, J., Haramuniz, G., and Michalopoulou, Z.-H. (2019). Geoacoustic inversion with generalized additive models. The Journal of the Acoustical Society of America 145, EL463-EL468.
Attracting Students of Color

The demographic survey completed by the Acoustical Society of America (ASA) in 2018 confirmed what many of us suspected: the composition of the ASA membership does not reflect the demographics of the US population. This is particularly true with respect to Black representation because less than 2% of the membership that responded to the survey identified as Black.

The ASA Committee for Improving Racial Diversity and Inclusivity (CIRDI) that I chair was formed in the summer of 2020 (Porter, 2020, acousticstoday.org/porter-16-4) and charged with developing initiatives and activities to address this glaring problem within the Society and, most importantly, within academic programs and professions related to acoustics. One of the first questions CIRDI discussed was, "Why are there so few persons of color, particularly Blacks, in acoustics or acoustics-related fields?" Through our conversations, we recognized that there are few opportunities for Black students, especially undergraduate students, to be exposed to acoustics in a structured format. It is more likely that a Black student will discover acoustics and careers in the field through their own efforts rather than through a structured program (Scott, 2020, acousticstoday.org/ScottNewNormal). I share my own experience as an example of what the ASA must address to diversify the field and its membership.

I have been interested in physics and engineering since high school but was completely unaware of acoustics. Most of my high-school science classes focused on fundamentals (i.e., the biology of life across scales, Newton's Laws), and I was only introduced to sound waves in my physics class. However, the introduction was very superficial, and the teacher never discussed careers in acoustics or acoustics-related fields.

On completing high school, I enrolled at Prairie View A&M University (PVAMU), which is a Historically Black College/University (HBCU) outside Houston, TX, and majored in electrical engineering. Similar to high school, the college physics course touched on acoustics and sound waves but with very little depth.

I was finally introduced to the fascinating world of acoustics during a summer research experience at Duke University (Durham, NC). The program was funded by the National Science Foundation, and I requested a research project in biomedical engineering to learn more about the field. Interestingly, my summer project focused on building and characterizing the performance of small transformers that would be installed in measurement devices for cardiac electrophysiology studies. I was, and remain to this day, an innately curious person, and so I would walk the hallways in the Pratt School of Engineering at Duke and read the research posters.

I discovered that the Duke Biomedical Engineering Program had a very strong diagnostic ultrasound group and found the research to be accessible for an electrical engineering student. On returning to PVAMU, I spent the year researching biomedical engineering graduate programs as well as companies that produced diagnostic ultrasound systems.

The following summer, I secured an internship in the Ultrasound Division of General Electric Medical Systems, which is now GE Healthcare. I had a very supportive supervisor and an extremely positive experience, which solidified my decision to pursue a career in biomedical ultrasound. My supervisor informed me of universities that had strong research programs in biomedical ultrasound, including the University of Washington (UW; Seattle). I was fortunate to be admitted to the bioengineering program at the UW, and I joined the research group led by Larry Crum. Larry recommended early in my graduate career that I join the ASA, and he served as a guide at its meetings. Larry also recommended that I attend programs that would provide additional instruction in acoustics while also expanding my network, such as the Physical Acoustics Summer School. I completed
my doctoral studies in 2003 and have been an active member of the ASA for more than 20 years.

It is important to note that my training in acoustics occurred predominantly after completing my undergraduate degree. However, if I had not taken it on myself to learn about biomedical ultrasound as an undergraduate student, I never would have specialized in the area as a graduate student, and I never would have joined the ASA.

Based on my experience, the CIRDI acknowledged that the ASA needs to create more opportunities for students of color to get introduced to acoustics and acoustics-related professions. The committee proposed that the ASA establish and manage a summer research and internship program in acoustics and acoustics-related fields for undergraduate students of color. In addition to funding the students, the ASA will provide a short course in acoustics in preparation for the summer experience.

Furthermore, ASA members will host virtual gatherings for the students to foster a community and discuss the academic and professional pathways available in acoustics and acoustics-related fields. The American Institute of Physics (AIP) awarded the ASA seed funding from its Diversity Action Fund to support launching the program in 2021. For many students, the summer program may be their first substantive experience with concepts, technologies, or processes involving acoustics. A positive experience, both technically and culturally, may serve as a first step toward pursuing a career in a field related to acoustics and becoming a member of the ASA. We are seeking mentors committed to diversifying their profession to host these aspiring young scholars. We plan to foster community among the mentors as well by hosting workshops and virtual gatherings to discuss and share best practices for mentoring students from underrepresented groups. Although not required, mentors and/or companies willing to fund a student will enable the ASA to include more students in the program. More information about the program and expectations for mentors can be found on the ASA Diversity Initiatives page (available at acousticalsociety.org/diversity-initiatives). If you are interested, please contact Tyrone Porter ([email protected]).

The CIRDI also proposed initiatives to increase interaction and communication with students and professionals of color, such as working with faculty and administrators at minority-serving institutions to raise awareness of acoustics-related professions; organizing workshops and forums on topics related to diversity, equity, and inclusion; and creating a webpage to highlight persons of color in acoustics and acoustics-related fields. Please visit the ASA Diversity Initiatives page for more information and for opportunities to volunteer and implement the various initiatives.

References
Porter, T. (2020). Addressing the lack of Black members in the ASA. Acoustics Today 16(4), 75-76.
Scott, E. K. E. (2020). The need for a new normal. Acoustics Today 16(4), 77-79.

Contact Information

Tyrone Porter [email protected]
Department of Biomedical Engineering
The University of Texas at Austin
107 W. Dean Keeton Street
Austin, Texas 78715, USA
Global hearing health care is another NIDCD priority and one that also embraces multidisciplinary approaches. I cochair The Lancet Commission on Hearing Loss (available at globalhearinglosscommission.com), which pursues innovative ideas that challenge the accepted thinking on identification and treatment of hearing loss worldwide. The commission seeks to develop creative approaches focused on policy solutions and the use of new technologies and programs to enable those with hearing loss worldwide to be fully integrated into society. We will share our findings in spring 2022. I encourage you to learn more about the NIDCD's commitment to global health (available at bit.ly/3kMEhZL).

National Academies of Sciences, Engineering, and Medicine (2016). Hearing Health Care for Adults: Priorities for Improving Access and Affordability. The National Academies Press, Washington, DC. Available at https://bit.ly/3pGJbeu.

Contact Information

Debara L. Tucci [email protected]
Office of the Director
National Institute on Deafness and Other Communication Disorders (NIDCD)
Building 31, Room 3C02
31 Center Drive, MSC 2320
Bethesda, Maryland 20814, USA