
Cardinal Vowel Theory

The document summarizes a quantitative model of vowel production developed by Lindblom and Sundberg. The model aims to combine the concept of a reference system like cardinal vowels with an objective numerical method of vowel specification. The model represents vowel articulation using parameters like jaw opening, tongue shape, lip rounding, and larynx height. Vowels can be generated by choosing parameter values, which allows computation of the vocal tract shape, area function, and resulting formant frequencies. The goal is to define possible vowel articulations in a more intuitive way compared to previous quantitative frameworks.

Uploaded by

Ahmed Shojib
Copyright
© Attribution Non-Commercial (BY-NC)

Dept. for Speech, Music and Hearing

Quarterly Progress and Status Report

A quantitative theory of cardinal vowels and the teaching of pronunciation


Lindblom, B. and Sundberg, J.

journal: STL-QPSR
volume: 10
number: 2-3
year: 1969
pages: 019-025

http://www.speech.kth.se/qpsr

STL-QPSR 2-3/1969

B. A QUANTITATIVE THEORY OF CARDINAL VOWELS AND THE TEACHING OF PRONUNCIATION*

B. Lindblom and J. Sundberg


Abstract

The problem of devising a reference system for specifying the phonetic value of vowels is discussed. The classical theory of Cardinal Vowels is examined as well as previous quantitative frameworks for describing vowel pronunciation. An attempt is made to construct a model of vowel production that combines the idea of a reference system with an objective numerical method of specification. A set of vowels is generated that represent the most extreme vowels in terms of the total acoustic vowel space characteristic of the model. These sounds are selected so as to be also approximately equidistant acoustically. These model-based "cardinal vowels" are compared with a set of true cardinal vowels pronounced by Daniel Jones. The two sets display many qualitative similarities both acoustically and auditorily. The implications of a vowel reference system that could indeed be produced "from written descriptions" are touched upon, in particular with regard to the teaching of pronunciation to hard of hearing as well as normal students. An analysis-by-synthesis application is suggested in which the model is used to supplement visual spectral displays of vowel sounds with articulatory interpretations. Some of the problems that would have to be overcome in such a method of "automatic articulation instruction" are mentioned, for instance, those of normalization and compensatory articulation.

The Cardinal Vowel System

In the teaching of pronunciation one principle is to describe the unknown sounds to be learned in relation to sounds that are known to the student. Phoneticians have pointed out that the sounds of a given language cannot be used for this purpose since there are large variations among speakers of the same language owing to dialectal, sociological, and other factors (1). Instead of teaching, for instance, vowel pronunciation in terms of the vowel sounds of a particular language, a reference system that is independent of any given language has been devised. A famous example of such a system is the Cardinal Vowels. These sounds are said to be "a set of fixed vowel-sounds having known acoustic qualities and known tongue and lip positions" (2). There are a primary set and a secondary set of cardinal vowels, each set comprising eight vowels.

* This paper was presented at the Second International Congress of Applied Linguistics, Cambridge, England, 8-12 Sept. 1969.

It is customary to refer to a given cardinal vowel by its number and to describe its articulation in terms of three dimensions: tongue height, front-back position of the tongue and degree of rounding. In Table I-B-1 below we show the articulatory specifications and symbols used to describe the primary and secondary sets.

TABLE I-B-1

PRIMARY CARDINAL VOWELS
              front   back
close           i       u
half-close      e       o
half-open       ɛ       ɔ
open            a       ɑ

SECONDARY CARDINAL VOWELS
              front   back
close           y       ɯ
half-close      ø       ɤ
half-open       œ       ʌ
open            ɶ       ɒ

The following features are said to characterize these sounds (3).

(1) They are independent of the vowels of any language.
(2) They are fixed reference points of "exactly determined and invariable quality".
(3) They are peripheral vowels. Thus in principle it should be possible to describe an arbitrary vowel quality of any language by interpolating between the reference points.
(4) They are auditorily equidistant.
(5) Moreover, "the values of cardinal vowels cannot be learnt from written descriptions; they should be learnt by oral instruction from a teacher who knows them".

The last property indicates that the system is not an objective and quantitative one but relies heavily on the motor skills and perceptual acuity of the student. It is passed on by oral tradition.

Quantitative Frameworks of Vowel Specification

From the early fifties onwards acoustic phonetics has made rapid progress. Among the achievements in this field are the schemes devised by Fant (4) and Stevens and House (5) to study the relation between vowel articulations and their acoustic results. We are referring to the three-parameter models of these investigators. The following three parameters are controlled in these models:

(1) length and opening area of lip section,
(2) position of maximal tongue constriction,
(3) the magnitude of this constriction.

On the basis of these three numbers the cross-sectional areas along the vocal tract are derived with the aid of rules that differ somewhat between the two versions of the three-parameter model. Given the distribution of cross-sectional areas along the tract, or the area function, the acoustic determinants of vowel quality, the formant frequencies, are computed. According to this type of model possible vowel articulation is defined as any permissible combination of parameter values, the parameters being dimensions of the area function. Although these frameworks of vowel specification are objective and quantitative, which the cardinal vowel system is not, they sometimes specify "possible vowel articulation" in too generous a fashion and in a manner which is not always easy to interpret in intuitively meaningful articulatory terms such as open-close, front-back, etc.

A Model of Vowel Production

In the present paper we shall report on an attempt to construct a model that combines the idea of an articulatory and perceptual reference system inherent in the cardinal vowel theory with an objective numerical method of specification. Our aim has been to build into the model all that we know at present about the natural degrees of freedom of the vocal tract. In so doing we hope that we might arrive at an improved definition of the notion of "possible vowel articulation".

A. Articulatory Properties

Our model is controlled by means of the following independent components:

the mandible, whose movements we restrict to a single fixed path.

the tongue, whose shape can be varied continuously by linear interpolation between three basic configurations of tongue contours corresponding to [i], [a] and [u], respectively. A certain mixture of palatalization, velarization and pharyngealization ("[i]-ness", "[u]-ness" and "[a]-ness", respectively) corresponds to certain numerical values of these parameters.


This choice can be justified partly on the basis of data obtained from lateral X-ray profiles of Swedish vowels. It turns out that three main families of tongue contours are obtained provided that these contours are plotted with the lower jaw as reference (Fig. I-B-1). An explanation of this rather restricted set of contours is readily apparent when we think of the arrangement of the major extrinsic muscles of the tongue. We find the genioglossus, styloglossus and the hyoglossus muscles, which seem mechanically capable of participating in the contraction patterns underlying the production of [i], [u] and [a], respectively (7).

labio-muscular activity (rounding-spreading), which is independent of jaw position.

larynx height.

All of these parameters lend themselves naturally to an interpretation in terms of "muscle lengths". To compute a sound wave from such specifications the procedure is the following:
1. Choose parametric values (jaw opening, tongue shape, rounding-spreading, larynx height).

2. Compute the associated articulatory profile, that is, the contours of the vocal tract and its length in a lateral projection.
3. Translate the result of 2 into an area function (the variation of cross-sectional area along the tract).
4. Compute the formant frequencies corresponding to this area function.
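Step 4 can be illustrated with a minimal sketch. It is not the authors' program, which the paper does not describe. It assumes a lossless tube that is closed at the glottis and ideally open at the lips, cut into sections of equal length, and a speed of sound of 35 000 cm/s. The function names `k22` and `formants` are invented for this illustration.

```python
import math

SPEED_OF_SOUND = 35000.0  # cm/s, assumed value


def k22(freq_hz, areas_cm2, section_len_cm):
    """K22 element of the chain matrix of a lossless sectioned tube.

    areas_cm2 lists the section areas from glottis to lips. With zero
    pressure at the lips, U_glottis = K22 * U_lips, so the resonances
    (formants) are the frequencies at which K22 vanishes.
    """
    k = 2.0 * math.pi * freq_hz / SPEED_OF_SOUND
    cos_kl = math.cos(k * section_len_cm)
    sin_kl = math.sin(k * section_len_cm)
    # Running product stored as [[a, j*b], [j*c, d]] with real a, b, c, d.
    # The factor rho*c is omitted because it cancels in a and d.
    a, b, c, d = 1.0, 0.0, 0.0, 1.0
    for area in areas_cm2:
        a2, b2, c2, d2 = cos_kl, sin_kl / area, sin_kl * area, cos_kl
        a, b, c, d = (a * a2 - b * c2, a * b2 + b * d2,
                      c * a2 + d * c2, d * d2 - c * b2)
    return d


def formants(areas_cm2, section_len_cm, count=3, step_hz=5.0, max_hz=6000.0):
    """Find the first `count` zeros of K22 by a grid scan plus bisection."""
    found = []
    lo = step_hz
    lo_pos = k22(lo, areas_cm2, section_len_cm) > 0.0
    while len(found) < count and lo < max_hz:
        hi = lo + step_hz
        hi_pos = k22(hi, areas_cm2, section_len_cm) > 0.0
        if hi_pos != lo_pos:  # sign change: a zero lies between lo and hi
            left, right = lo, hi
            for _ in range(50):
                mid = 0.5 * (left + right)
                if (k22(mid, areas_cm2, section_len_cm) > 0.0) == lo_pos:
                    left = mid
                else:
                    right = mid
            found.append(0.5 * (left + right))
        lo, lo_pos = hi, hi_pos
    return found


uniform = formants([5.0] * 10, 1.75)                  # uniform tube, 17.5 cm
open_like = formants([1.0] * 5 + [7.0] * 5, 1.75)     # narrow back, wide front
```

For the uniform tube the sketch returns values close to the quarter-wavelength resonances c/4L, 3c/4L and 5c/4L, that is, about 500, 1500 and 2500 Hz. Narrowing the back half and widening the front half raises the first formant, as expected for an open, [a]-like configuration.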

In Fig. I-B-2 the three basic tongue shapes are shown. A coordinate system anchored on the mandible is also depicted. This system is used to compute interpolated tongue shapes. At the top left we see the parameters of width and height of lip separation with the aid of which the opening area is derived. Below, a lateral profile tracing is shown with another coordinate system. This system is used to compute the area function.
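The interpolation of tongue shapes can be sketched as follows, under stated assumptions. Each basic contour is taken to be a list of radii sampled at the same angles of the mandible-anchored polar grid, and the mixture is taken to be a weighted average whose weights are normalised to sum to one. Neither detail is given in the paper, and the radii below are invented placeholders, not measured data.

```python
def interpolate_tongue(basic_contours, weights):
    """Blend basic tongue contours given as radii at shared polar angles.

    basic_contours maps a vowel label to a list of radii (cm), one per
    angle of the polar grid. weights maps the same labels to mixing
    coefficients, normalised here to sum to 1 (an assumption).
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    n_angles = len(next(iter(basic_contours.values())))
    blended = [0.0] * n_angles
    for label, radii in basic_contours.items():
        if len(radii) != n_angles:
            raise ValueError("all contours must share the same angles")
        share = weights.get(label, 0.0) / total
        for idx, radius in enumerate(radii):
            blended[idx] += share * radius
    return blended


# Placeholder radii, invented for illustration only.
BASIC = {
    "i": [2.0, 3.0, 4.0, 3.5],
    "a": [3.0, 2.5, 2.0, 1.5],
    "u": [2.5, 3.5, 3.0, 2.0],
}

# Equal parts "[i]-ness" and "[a]-ness", no "[u]-ness".
halfway = interpolate_tongue(BASIC, {"i": 0.5, "a": 0.5})
```

With these placeholder values `halfway` lies midway between the [i] and [a] contours at every angle.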

B. Acoustic Properties

Now assume that, like the child learning to talk, we combine different mandible positions, tongue shapes, lip states and larynx heights in all possible ways and listen to the acoustic result in each individual case. Whatever we do with our articulatory components it is clear that the human speech organs are constrained in such a way as to permit only certain vowel qualities, or combinations of formant frequency values.


Fig. I-B-1. Midsagittal tongue contours for Swedish vowels in relation to outline of mandible. Top left: [i, e, ...]. Top right: [u]. Below: [a, o, ...].

Fig. I-B-2. In the upper left-hand part the parameters determining the mouth opening area A are shown: h = vertical separation between lips; w = distance between mouth corners; p is a number that specifies the curvature of the lip contours. These contours when projected on a frontal plane are assumed to be given by

In the upper right part the basic tongue shapes of the model are shown. A polar coordinate system defined in relation to the mandible is also indicated. With the aid of this coordinate system interpolated tongue shapes associated with [i, u] and [a] were computed. In the lower part of the figure a lateral X-ray tracing can be seen. Superimposed on the profile is a coordinate system defined in relation to fixed structures such as the maxilla. This system was used in the determination of area functions.

Certain other F3-F2-F1 combinations characterize vowels that we could produce only with the aid of a terminal analogue synthesizer, not with our mouths. From the point of view of the human speech mechanism such combinations would be impossible. The space that characterizes the acoustic possibilities of our model is shown in Fig. I-B-3. In this figure we have separated the rounded and spread subspaces. The lower fields refer to all possible combinations of F2 and F1 values. The top areas pertain to the corresponding F3 values. When we explore the contours of these spaces auditorily we find the qualities indicated by the vowel symbols. These points have been selected at approximately equidistant F1 steps.

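One way to pick points at approximately equidistant F1 steps along a contour of the vowel space is sketched below. The paper does not say how the selection was made, so the nearest-point rule is an assumption, and the contour values are invented placeholders rather than data read from Fig. I-B-3.

```python
def pick_equidistant_f1(contour, n_points):
    """Pick contour points whose F1 is nearest to equally spaced targets.

    contour is a list of (f1_hz, f2_hz) pairs sampled along one edge of
    the vowel space; n_points is the number of reference vowels wanted.
    """
    if n_points < 2:
        raise ValueError("need at least two points")
    f1_values = [point[0] for point in contour]
    lowest, highest = min(f1_values), max(f1_values)
    step = (highest - lowest) / (n_points - 1)
    picked = []
    for i in range(n_points):
        target = lowest + i * step
        picked.append(min(contour, key=lambda point: abs(point[0] - target)))
    return picked


# Placeholder contour, invented for illustration only.
EDGE = [(250, 2200), (300, 2150), (400, 2000),
        (450, 1900), (550, 1700), (700, 1400)]
reference_points = pick_equidistant_f1(EDGE, 4)
```

With this placeholder contour the F1 targets are 250, 400, 550 and 700 Hz, and the four sampled points at exactly those F1 values are returned.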
True and Model-Based Cardinal Vowels

Clearly the first four features mentioned above as characterizing cardinal vowels apply also to the vowels generated by the model. They are independent of any language, they are peripheral and fixed reference points and they are acoustically (if not auditorily) equidistant. Consequently it would be of some interest to compare a set of model-based vowels with a set of true cardinal vowels. Fig. I-B-4 demonstrates the results of an acoustic comparison of this type. The first three formant frequencies were measured in a set of cardinal vowels as spoken by Daniel Jones (1). The left plot in this figure shows the extreme vowels generated with the model (also shown in Fig. I-B-3). It is seen that the true cardinal vowels span a somewhat larger range. This difference is probably due to the fact that, among other factors, Daniel Jones has a shorter overall tract length than that of our model and that he also deliberately shortened his tract by elevating his larynx to an extreme position when producing [i] and probably [a]. Qualitative similarities do exist between the sets, however. Table I-B-2 contains a specification of the articulatory parameters that underlie the model-generated sounds.

Table I-B-2. The articulatory dimensions of peripheral vowel types

                Tongue shape
JAW       Palatal   Palato-pharyngeal   Pharyngeal   Velar
Close        i                                         u
             e                              o
             ɛ
Open                        a               ɑ
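The effect of overall tract length on the formant range, invoked above to explain why the true cardinal vowels span a larger range, can be illustrated with the uniform-tube approximation, in which the resonances lie at odd multiples of c/4L. This is a textbook idealisation and not the authors' model. The two lengths are placeholders, not measurements of Daniel Jones or of the model, and the speed of sound is an assumed value.

```python
def uniform_tube_formants(length_cm, count=3, speed_cm_s=35000.0):
    """Resonances of a uniform tube closed at one end and open at the other."""
    return [(2 * n - 1) * speed_cm_s / (4.0 * length_cm)
            for n in range(1, count + 1)]


long_tract = uniform_tube_formants(17.5)   # placeholder length
short_tract = uniform_tube_formants(15.0)  # placeholder length
scale = [s / l for s, l in zip(short_tract, long_tract)]
```

In this idealisation every formant of the shorter tract is higher by the same factor, the ratio of the two lengths (17.5/15, about 1.17). Real talkers differ non-uniformly, which is the normalization problem raised later in the paper.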

Fig. I-B-3. The maximal rounded and spread vowel spaces that the model is capable of generating. (Panels: ROUNDED VOWEL SPACE and SPREAD VOWEL SPACE; abscissa: first formant frequency in kHz; one point is labelled "Rounded and larynx depressed [u]".)

Implications of a Quantitative Theory of Cardinal Vowels for the Teaching of Pronunciation

In theory, a physiological model of vowel production should be a useful tool in the teaching of pronunciation to second language learners and hard of hearing children. In practice, such an application would require solutions to some technical problems which, however, by no means appear insurmountable. Imagine that the formant pattern of a vowel in a given language or dialect could be measured automatically and represented as a dot on an oscilloscope screen. F2 and F3, or some measure combining these two frequencies, could be plotted along the ordinate and F1 along the abscissa. Technologically this is not wishful thinking. Attempts have already been made along these lines (6). Suppose that we first plot a target vowel as indicated in Fig. I-B-5. Next we ask our subject, who might for instance be a deaf child, to produce a vowel as close to the target as possible. In all probability our pupil will miss the target, perhaps in the manner indicated in Fig. I-B-5. We might at this stage have the child try to improve his pronunciation by a trial-and-error procedure. An even better method would be to supplement the visual display with an indication of how the child should change his articulation in order to reach the target. Such information could be given in terms of trajectories depicting the formant frequency shift associated with "isolingual" and "isomandibular" conditions. A set of such curves is given in Fig. I-B-5.
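The idea of turning a measured formant pattern into an articulatory instruction can be sketched in a few lines. Everything here is hypothetical: `toy_model` is an invented linear stand-in for a real articulatory model, the two-parameter grid replaces the full parameter set, and the wording of the advice is made up. Only the overall scheme, analysis-by-synthesis followed by a comparison of an "isolingual" (tongue fixed, jaw varied) and an "isomandibular" (jaw fixed, tongue varied) trajectory, follows the text.

```python
def toy_model(jaw, tongue):
    """Invented linear stand-in for an articulatory model (assumption).

    jaw: 0 = close, 1 = open; tongue: 0 = back, 1 = front.
    Returns (F1, F2) in Hz.
    """
    return 250.0 + 500.0 * jaw, 800.0 + 1500.0 * tongue - 300.0 * jaw


def distance(p, q):
    """Euclidean distance between two (F1, F2) points."""
    return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5


GRID = [i / 20.0 for i in range(21)]  # parameter values 0.0, 0.05, ..., 1.0


def analyse(measured, model=toy_model):
    """Analysis-by-synthesis: grid setting whose output best matches `measured`."""
    return min(((j, t) for j in GRID for t in GRID),
               key=lambda jt: distance(model(*jt), measured))


def advise(measured, target, model=toy_model):
    """Suggest the single articulatory change that gets nearest the target."""
    jaw, tongue = analyse(measured, model)
    current_error = distance(model(jaw, tongue), target)
    # Isolingual trajectory: tongue fixed, jaw varied.
    best_jaw = min(GRID, key=lambda j: distance(model(j, tongue), target))
    jaw_error = distance(model(best_jaw, tongue), target)
    # Isomandibular trajectory: jaw fixed, tongue varied.
    best_tongue = min(GRID, key=lambda t: distance(model(jaw, t), target))
    tongue_error = distance(model(jaw, best_tongue), target)
    if min(jaw_error, tongue_error) >= current_error:
        return "hold position"
    if jaw_error < tongue_error:
        return "lower jaw" if best_jaw > jaw else "raise jaw"
    return "move tongue forward" if best_tongue > tongue else "move tongue back"


pupil = toy_model(0.5, 0.2)   # pupil's vowel: tongue too far back
goal = toy_model(0.5, 0.8)    # target vowel
hint = advise(pupil, goal)
```

In this toy case the pupil's tongue is too far back for the target, so the sketch suggests moving the tongue forward. A real system would also need the normalization and compensatory-articulation safeguards that the paper goes on to discuss.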

This instrument would serve as a sort of "automatic articulation instructor". This goal might also sound utopian but it should be possible given a computer and an acceptable theory of vowel production.

The present model of vowel production is obviously unsatisfactory as such a theory. There are a number of modifications that are clearly needed to improve it**. There is for instance the problem of normalizing F-pattern data for talkers whose vocal tract lengths differ, often non-uniformly, in size. There is also the problem of compensatory articulations that our model at present often incorrectly allows in the case of non-peripheral vowels. Although it might be possible to produce an [ø] with spread lips by compensating elsewhere along the tract, it is extremely rare to find a Swedish native talker who consistently spreads his [ø] sounds. Among other things there seems to be a principle of disambiguation that is yet to be discovered in future research.

** e.g., independent control of tongue blade in retroflexion and coronal consonant articulation; independent control of pharynx width in connection with the tense-lax distinction on which the vowel harmony of the Akan languages seems to be based; hyperpalatalization, velarization and pharyngealization to produce compensatory articulations and different types of dorsal constrictions larger than those for vowels; control of the form of the cross-sectional area (laterals, fricatives etc.).

Fig. I-B-5. Stylized visual display of spectral properties of vowels intended to be used in improving the pronunciation of vowels by hard of hearing subjects. The target vowel and the pupil's vowel are represented by dots. The trajectories indicate the formant frequency shifts that would result if the pupil changed his articulation as indicated, that is, by lowering his jaw and by moving his tongue forward. These trajectories are assumed to be based on an analysis-by-synthesis of the pupil's vowel using a quantitative model of vowel production of the type described in the paper. (Display labels: MOVE TONGUE FORWARD!; PUPIL'S VOWEL; abscissa: first formant frequency.)

References:

(1) Cardinal Vowels (spoken by D. Jones), Linguaphone Institute.

(2) D. Jones: An Outline of English Phonetics (Cambridge 1956), 8th edition.

(3) D. Abercrombie: Elements of General Phonetics (Edinburgh 1967).

(4) G. Fant: Acoustic Theory of Speech Production ('s-Gravenhage 1960).

(5) K. N. Stevens and A. S. House: "Development of a quantitative description of vowel articulation", J. Acoust. Soc. Am. 27 (1955), pp. 484-493.

(6) J. M. Pickett and A. Constam: "A visual speech trainer with simplified indication of vowel spectrum", Am. Ann. of the Deaf 113 (1968), pp. 253-258.

I. B. Thomas and R. C. Snell: "Articulation training through the use of a real-time visual display of speech parameters", ms submitted for publication in Am. Ann. of the Deaf.

A. J. Goldberg: "Visual perception of speech stimuli", Quarterly Progress Report No. 92 (MIT, RLE, Cambridge, Mass.), Jan. 15, 1969, pp. 335-338.

(7) P. Ladefoged: "Physiological characterization of speech", Working Papers in Phonetics (UCLA), June 1964, pp. 2-9.

(8) G. Fant: "A note on vocal tract size factors and non-uniform F-pattern scalings", STL-QPSR 4/1966 (KTH, Stockholm), pp. 22-30.

Acknowledgments

This work was supported by the National Institutes of Health Research Grant No. NB 04003-07 and the Tri-Centennial Fund of the Bank of Sweden Contract No. 67/48.
