Ear Does Fourier Analysis
The ear is a kind of spectrum analyzer. That is, the cochlea of the inner ear physically splits sound
into its (quasi) sinusoidal components. This is accomplished by the basilar membrane in the
inner ear: a sound wave injected at the oval window (which is connected via the bones of the
middle ear to the ear drum) travels along the basilar membrane inside the coiled cochlea. The
membrane starts out thick and stiff, and gradually becomes thinner and more compliant toward
its apex (the helicotrema). A stiff membrane has a high resonance frequency while a thin,
compliant membrane has a low resonance frequency (assuming comparable mass per unit length,
or at least less of a difference in mass than in compliance). Thus, as the sound wave travels, each
frequency in the sound resonates at a particular place along the basilar membrane. The highest
audible frequencies resonate right at the entrance, while the lowest frequencies travel the farthest
and resonate near the helicotrema. The membrane resonance effectively "shorts out" the signal
energy at the resonant frequency, and it travels no further. Along the basilar membrane there are
hair cells which "feel" the resonant vibration and transmit an increased firing rate along the
auditory nerve to the brain. Thus, the ear is very literally a Fourier analyzer for sound, albeit
nonlinear and using "analysis" parameters that are difficult to match exactly. Nevertheless, by
looking at spectra (which display the amount of each sinusoidal frequency present in a sound),
we are looking at a representation much more like what the brain receives when we hear.
(https://www.dsprelated.com/freebooks/mdft/Sinusoids.html)
intmath.com
1. Digital Audio
Pulse code modulation (PCM) is the most common type of digital audio recording, used to
make compact discs and WAV files.
In PCM recording hardware, a microphone converts sound waves into a varying voltage. Then an
analog-to-digital converter samples the voltage at regular intervals of time. For example, in a
compact disc audio recording, there are 44100 samples taken every second.
The data that results from a PCM recording is a function of time. How does this work?
Imagine that you were very small and could fit inside your friend's ear, right next to the ear
drum. Suppose also that you could see things in very slow motion and that you could record the
position of the ear drum once every 1/44100 of a second. Your eyes are so good that you can
notice 65536 distinct positions of the ear drum's surface as it moves back and forth in response
to incoming sound waves (65536 = 2^16, the number of values a 16-bit sample can take).
If your friend is listening to the sound of a flute, and you write down the positions of the ear
drum that you notice, then you would have a digital PCM recording - a series of numbers.
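The recording process described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `pcm_record`, the test tone of 441 Hz, and the choice to scale samples into the range -32767..32767 are not from the original text; only the 44100 samples/second rate and the 65536 levels are.

```python
import math

SAMPLE_RATE = 44100   # samples per second, as on a compact disc
LEVELS = 65536        # distinct positions we can tell apart (2**16)

def pcm_record(signal, num_samples, sample_rate=SAMPLE_RATE):
    """Sample signal(t), assumed to lie in [-1, 1], and round each
    sample to one of the 65536 integer levels (hypothetical helper)."""
    half = LEVELS // 2
    samples = []
    for n in range(num_samples):
        t = n / sample_rate                      # time of the n-th sample
        code = round(signal(t) * (half - 1))     # scale to -32767..32767
        code = max(-half, min(half - 1, code))   # clip to the 16-bit range
        samples.append(code)
    return samples

def tone(t):
    """A pure 441 Hz sine wave (assumed stand-in for the flute)."""
    return math.sin(2 * math.pi * 441.0 * t)

# One period of a 441 Hz tone is exactly 100 samples at 44100 samples/second.
samples = pcm_record(tone, 100)
print(samples[:5])
```

The list `samples` is the "series of numbers" from the analogy: one integer per 1/44100 of a second.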
If you could later make your own ear drum move back and forth in accordance with the
thousands of numbers you had written down, you would hear the flute exactly as it originally
sounded. We have gone from a sound wave to a series of numbers.
To find out which frequencies make up the sound represented by this series of numbers, we need
to apply the Fourier Transform.
One analogy for the type of thing a Fourier Transform does is a prism which splits white light
into a spectrum of colors.
White light consists of all visible frequencies (red, orange, yellow, green, blue, indigo and violet)
mixed together (much like the samples on a CD have sounds of all frequencies mixed
together) and the prism breaks them apart so we can see the separate frequencies (much like a
Fourier Transform separates a sound into its component frequencies).
White light is split into individual frequencies by a prism
Cochlea
In our inner ears, the cochlea enables us to hear subtle differences in the sounds coming to our
ears. The cochlea consists of a spiral of tissue filled with liquid and thousands of tiny hairs which
gradually get smaller from the outside of the spiral to the inside. Each hair is connected to a
nerve which feeds into the auditory nerve bundle going to the brain. The longer hairs resonate
with lower frequency sounds, and the shorter hairs with higher frequencies. Thus the cochlea
serves to transform the air pressure signal experienced by the ear drum into frequency
information which can be interpreted by the brain as tonality and texture.
The Fourier Transform is a mathematical technique for doing a similar thing - resolving any
time-domain function into a frequency spectrum. The Fast Fourier Transform is a method for
doing this process very efficiently.
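As a rough illustration of resolving a time-domain signal into a frequency spectrum, here is a minimal sketch using a direct (slow) discrete Fourier transform in plain Python. It is not the Fast Fourier Transform itself, only the same calculation done the long way; the function name `dft`, the 64-sample length, and the two-tone test signal are assumptions chosen for the example.

```python
import cmath
import math

def dft(x):
    """Direct discrete Fourier transform (O(N^2)); an FFT computes the
    same values far more efficiently."""
    N = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
        for k in range(N)
    ]

N = 64
# Test signal: 4 cycles at amplitude 1.0 plus 10 cycles at amplitude 0.5
# over the 64 samples (assumed values for illustration).
signal = [
    math.sin(2 * math.pi * 4 * n / N) + 0.5 * math.sin(2 * math.pi * 10 * n / N)
    for n in range(N)
]

spectrum = dft(signal)
# Scale so that a sine wave of amplitude A shows up with magnitude A.
magnitudes = [abs(c) / (N / 2) for c in spectrum[: N // 2]]
peaks = [k for k, m in enumerate(magnitudes) if m > 0.1]
print(peaks)  # [4, 10]
```

The two tones that were mixed together in `signal` reappear as two separate peaks in `magnitudes`, much as the prism separates the colors in white light.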
3. The Fourier Transform
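As we saw earlier in this chapter, the Fourier Transform is based on the discovery that it is
possible to take any periodic function of time f(t) and resolve it into an equivalent infinite
summation of sine waves and cosine waves with frequencies that start at 0 and increase in
integer multiples of a base frequency f_0 = 1/T, where T is the period of f(t):

$$f(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty} \left[ a_n \cos(2\pi n f_0 t) + b_n \sin(2\pi n f_0 t) \right]$$
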
The job of a Fourier Transform is to figure out all the a_n and b_n values to produce a Fourier
Series, given the base frequency and the function f(t).
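A minimal sketch of that job, assuming the common convention in which the constant term of the series is a_0/2 and the samples cover exactly one period of f(t). The helper name `fourier_coefficients` and the test signal are hypothetical; the sums below are a plain numerical approximation of the coefficient integrals, not a Fast Fourier Transform.

```python
import math

def fourier_coefficients(samples, max_n):
    """Estimate a_n and b_n for n = 0..max_n from equally spaced samples
    spanning exactly one period (hypothetical helper)."""
    N = len(samples)
    a, b = [], []
    for n in range(max_n + 1):
        a.append(2 / N * sum(x * math.cos(2 * math.pi * n * k / N)
                             for k, x in enumerate(samples)))
        b.append(2 / N * sum(x * math.sin(2 * math.pi * n * k / N)
                             for k, x in enumerate(samples)))
    return a, b

N = 64
# f(t) = 1 + 2 cos(2 pi f0 t) + 3 sin(2 pi 2 f0 t), sampled over one period.
samples = [
    1 + 2 * math.cos(2 * math.pi * k / N) + 3 * math.sin(2 * math.pi * 2 * k / N)
    for k in range(N)
]

a, b = fourier_coefficients(samples, 3)
print([round(v, 6) for v in a])
print([round(v, 6) for v in b])
```

With this convention the constant offset of 1 shows up as a_0 = 2 (so a_0/2 = 1), the cosine term as a_1 = 2, and the sine term as b_2 = 3; the remaining coefficients are zero up to rounding error.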
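In our CD example, which has a sampling rate of 44100 samples/second, if the length of our
recording is 1024 samples, then the amount of time represented by the recording is

$$T = \frac{1024}{44100} \approx 0.02322 \text{ seconds},$$

so the base frequency is 1/T = 44100/1024 ≈ 43.07 Hz.
If you process these 1024 samples with the FFT (Fast Fourier Transform), the output will be the
sine and cosine coefficients a_n and b_n for the frequencies 43.07 Hz, 2 × 43.07 Hz,
3 × 43.07 Hz, and so on, up to half the sampling rate (22050 Hz).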
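The arithmetic for this 1024-sample example can be checked with a short sketch. The variable names are illustrative, and stopping the frequency list at half the sampling rate (the Nyquist limit) is an assumption about which outputs are of interest rather than something stated in the original.

```python
SAMPLE_RATE = 44100    # samples per second
NUM_SAMPLES = 1024     # length of the recording in samples

# Time spanned by the recording, in seconds.
duration = NUM_SAMPLES / SAMPLE_RATE

# Base frequency: one full cycle over the whole recording, in Hz.
base_frequency = 1 / duration

# Integer multiples of the base frequency, from 0 Hz up to half the
# sampling rate (assumed cut-off at the Nyquist limit).
frequencies = [n * SAMPLE_RATE / NUM_SAMPLES for n in range(NUM_SAMPLES // 2 + 1)]

print(round(duration, 5))            # 0.02322
print(round(base_frequency, 2))      # 43.07
print(frequencies[1], frequencies[-1])
```

Each entry of `frequencies` is one of the frequencies for which the FFT of these 1024 samples yields a pair of coefficients.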
Example
Let's say that we use the FFT to analyze a series of sound samples read from a CD.