Control Methods Used in a Study of the Vowels

Bell Telephone Laboratories, Inc., Murray Hill, New Jersey
(Received December 3, 1951)

Relationships between a listener's identification of a spoken vowel and its properties as revealed from acoustic measurement of its sound wave have been a subject of study by many investigators. Both the utterance and the identification of a vowel depend upon the language and dialectal backgrounds and the vocal and auditory characteristics of the individuals concerned. The purpose of this paper is to discuss some of the control methods that have been used in the evaluation of these effects in a vowel study program at Bell Telephone Laboratories. The plan of the study, calibration of recording and measuring equipment, and methods for checking the performance of both speakers and listeners are described. The methods are illustrated from results of tests involving some 76 speakers and 70 listeners.

[...] is to be found in the processes of speech production because of their complexity and because they depend upon the past experience of the individual. As in much of human behavior there is a self-correcting, or servomechanism type of feedback involved as the speaker hears his own voice and adjusts his articulatory mechanisms.¹

In the elementary case of a word containing a consonant-vowel-consonant phoneme²,³ structure, a speaker's pronunciation of the vowel within the word will be influenced by his particular dialectal background; and his pronunciation of the vowel may differ both in phonetic quality and in measurable characteristics from that produced in the word by speakers with other backgrounds. A listener, likewise, is influenced in his identification of a sound by his past experience.

Variations are observed when a given individual makes repeated utterances of the same phoneme. A very significant property of these variations is that they are not random in a statistical sense, but show trends and sudden breaks or shifts in level, and other types of nonrandom fluctuations.⁴ Variations likewise appear in the successive identifications by a listener of the same utterance. It is probable that the identification of repeated sounds is also nonrandom but there is little direct evidence in this work to support such a conclusion.

A study of sustained vowels was undertaken to investigate in a general way the relation between the vowel phoneme intended by a speaker and that identified by a listener, and to relate these in turn to acoustical measurements of the formant or energy concentration positions in the speech waves.

In the plan of the study certain methods and techniques were employed which aided greatly in the collection of significant data. These methods included randomization of test material and repetitions to obtain sequences of observations for the purpose of checking the measurement procedures and the speaker and listener consistency. The acoustic measurements were made with the sound spectrograph; to minimize measurement errors, a method was used for rapid calibration of the recording and analyzing apparatus by means of a complex test tone. Statistical techniques were applied to the results of measurements, both of the calibrating signals and of the vowel sounds.

These methods of measurement and analysis have been found to be precise enough to resolve the effects of different dialectal backgrounds and of the nonrandom trends in speakers' utterances. Some aspects of the vowel study will be presented in the following paragraphs to illustrate the usefulness of the methods employed.

EXPERIMENTAL PROCEDURES

The plan of the study is illustrated in Fig. 1. A list of words (List 1) was presented to the speaker and his utterances of the words were recorded with a magnetic tape recorder. The list contained ten monosyllabic words each beginning with [h] and ending with [d] and differing only in the vowel. The words used were heed, hid, head, had, hod, hawed, hood, who'd, hud, and heard. The order of the words was randomized in each list, and each speaker was asked to pronounce two different lists. The purpose of randomizing the words in the list was to avoid practice effects which would be associated with an unvarying order.

[FIG. 1. Block diagram of the plan of the study; legible labels: SPEAKER, TAPE.]

If a given List 1, recorded by a speaker, were played back to a listener and the listener were asked to write down what he heard on a second list (List 2), a comparison of List 1 and List 2 would reveal occasional differences, or disagreements, between speaker and listener. Instead of being played back to a listener, List 1 might be played into an acoustic measuring device and the outputs classified according to the measured properties of the sounds into a List 3. The three lists will differ in some words depending upon the characteristics of the speaker, the listener, and the measuring device.

FIG. 2. Broadband spectrograms of the word list by a female speaker.

A total of 76 speakers, including 33 men, 28 women and 15 children, each recorded two lists of 10 words, making a total of 1520 recorded words. Two of the speakers were born outside the United States and a few others spoke a foreign language before learning English. Most of the women and children grew up in the Middle Atlantic speech area.⁵ The male speakers represented a much broader regional sampling of the United States; the majority of them spoke General American.⁵
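The randomization of the word lists described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the paper does not say how the orders were drawn, so the use of a seeded pseudo-random generator and the helper names `make_lists` and `build_corpus` are hypothetical. The word list and the counts (76 speakers, two lists each, ten words per list) are taken from the text.

```python
import random

# The ten monosyllabic words named in the text.
WORDS = ["heed", "hid", "head", "had", "hod", "hawed", "hood", "who'd", "hud", "heard"]


def make_lists(rng, n_lists=2):
    """Return n_lists different random orderings of WORDS for one speaker (hypothetical helper)."""
    lists = []
    while len(lists) < n_lists:
        order = rng.sample(WORDS, k=len(WORDS))  # one random permutation of the words
        if order not in lists:                   # the speaker's two lists must differ
            lists.append(order)
    return lists


def build_corpus(n_speakers=76, seed=0):
    """Assign two randomized lists to each speaker; the seed is an assumption for repeatability."""
    rng = random.Random(seed)
    return {speaker: make_lists(rng) for speaker in range(n_speakers)}


corpus = build_corpus()
total_words = sum(len(word_list) for lists in corpus.values() for word_list in lists)
print(total_words)  # 76 speakers x 2 lists x 10 words
```

Drawing a fresh permutation for every list means that no word position is tied to a fixed vowel, which is the stated purpose of the randomization.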
The words were randomized and were presented to a group of 70 listeners in a series of eight sessions. The listening group contained only men and women, and represented much the same dialectal distribution as did the group of speakers, with the exception that a few observers were included who had spoken a foreign language throughout their youth. Thirty-two of the 76 speakers were also among the 70 observers.

The 1520 words were also analyzed by means of the sound spectrograph.⁶,⁷

Representative spectrograms and sections of these words by a male speaker are shown in Fig. 3 of the paper by R. K. Potter and J. C. Steinberg;⁴ a similar list by a female speaker is shown here as Fig. 2.⁸ In the [...] we see the initial [h] followed by the vowel, and then by the final [d]. There is generally a part of the vowel following [...] of the [h] and preceding the influence of the [d] during which a practically steady state is reached. In this interval, a section is made, as shown to the right of the spectrograms. The sections, portraying frequency on a horizontal scale, and amplitude of the voiced harmonics on the vertical side, have been measured with calibrated Plexiglass templates to provide data about the fundamental and formant frequencies and relative formant amplitudes of each of the 1520 recorded sounds.

The 1520 recorded words were presented to the group of 70 adult observers over a high quality loud speaker system in Arnold Auditorium at the Murray Hill Laboratories. The general purpose of these tests was to obtain an aural classification of each vowel to supplement the speaker's classification. In presenting the words to the observers, the procedure was to reproduce at each of seven sessions, 200 words recorded by 10 speakers. At the eighth session, there remained five men's and one child's recordings to be presented; to these were added three women's and one child's recordings which had been given in previous sessions, making again a total of 200 words. The sound level at the observers' positions was approximately 70 db re 0.0002 dyne/cm², and varied over a range of about 3 db at the different positions.

In selecting the speakers for each of the first seven sessions, 4 men, 4 women, and 2 children were chosen at random from the respective groups of 33, 28, and 15. The order of occurrence of the 200 words spoken by the 10 speakers for each session was randomized for presentation to the observers.

Each observer was given a pad containing 200 lines having the 10 words on each line. He was asked to draw a line through the one word in each line that he heard. The observers' seating positions in the auditorium were chosen by a randomizing procedure, and each observer took the same position for each of the eight sessions.

The randomizing of the speakers in the listening sessions was designed to facilitate checks of learning effects from one session to another. The randomizing of words in each group of 200 was designed to minimize successful guessing and the learning of a particular speaker's dialect. The seating positions of the listeners were randomized so that it would be possible to determine whether position in the auditorium had an effect on the identification of the sounds.

The total of 1520 sounds heard by the observers consisted of the 10 vowels, each presented 152 times. The ease with which the observers classified the various vowels varied greatly. Of the 152 [i] sounds, for instance, 143 were unanimously classified by all observers as [i]. Of the 152 sounds which the speakers intended for [ɑ], on the other hand, only 9 were unanimously classified as [ɑ] by the whole jury.

FIG. 3. Vowel loop with numbers of sounds unanimously classified by listeners; each sound was presented 152 times.

⁵ C. K. Thomas, Phonetics of American English, The Ronald Press Company (New York, 1947).
⁶ Koenig, Dunn, and Lacy, J. Acoust. Soc. Am. 17, 19 (1946).
⁷ L. G. Kersta, J. Acoust. Soc. Am. 20, 796 (1948).
⁸ Key words for the vowel symbols are as follows: [i] heed, [ɪ] hid, [ɛ] head, [æ] had, [ɑ] father, [ɔ] ball, [ʊ] hood, [u] who'd, [ʌ] hud, [ɝ] heard.
⁹ R. K. Potter and G. E. Peterson, J. Acoust. Soc. Am. 20, 528 (1948).
¹⁰ S. S. Stevens and J. Volkman, Am. J. Psychol. 329 (July, 1940).

These data are summarized in Fig. 3. This figure shows the positions of the 10 vowels in a vowel loop in which the frequency of the first formant is plotted against the frequency of the second formant⁹ on mel scales;¹⁰ in this plot the origin is at the upper right. The numbers beside each of the phonetic symbols are the numbers of sounds, out of 152, which were unanimously classified as that particular vowel by the jury. It is of interest in passing that in no case did the jury agree unanimously that a sound was something other than what the speaker intended. Figure 3 shows that

[i], [ɝ], [æ], and [u] are generally quite well understood.

To obtain the locations of the small areas shown in Fig. 3, the vowels were repeated by a single speaker on twelve different days. A line enclosing all twelve points was drawn for each vowel; the differences in the shapes of these areas probably have little significance.

When the vowels are plotted in the manner shown in Fig. 3, they appear in essentially the same positions as those shown in the tongue hump position diagrams which phoneticians have employed for many years.¹¹ The terms "high, front, low back" refer to the tongue positions in the mouth. The [i], for instance, is made with the tongue hump high and forward, the [u] with the hump high and back, and the [ɑ] and [æ] with the tongue hump low.

It is of interest that when observers disagreed with speakers on the classification of a vowel, the two classifications were nearly always in adjacent positions of the vowel loop of Fig. 3. This is illustrated by the data shown on Table I. This table shows how the observers classified the vowels, as compared with the vowels intended by the speakers. For instance, on all the 152 sounds intended as [i] by the speakers, there were 10,267 total votes by all observers that they were [i], 4 votes for [ɪ], 6 votes for [ɛ], and 3 votes for [ɔ]. Of the 152 [ɑ] sounds, there was a large fraction of the sounds on which some of the observers voted for [ɔ]. [ɪ] was taken for [ɛ] a sizable percentage of the time, and [ɛ] was called either [ɪ] or [æ] (adjacent sounds on the vowel loop shown in the preceding Fig. 3) quite a large number of times. [ɑ] and [ɔ], and [ʌ] and [...] were also confused to a certain extent. Here again, as in Fig. 3, the [i], [ɝ], [æ], and [u] show high intelligibility scores.

It is of considerable interest that the substitutions shown conform to present dialectal trends in American speech rather well,¹² and in part, to the prevailing vowel shifts observable over long periods of time in most languages.¹³ The common tendency is continually to shift toward higher vowels in speech, which correspond to smaller mouth openings.

The listener, on the other hand, would tend to make the opposite substitution. This effect is most simply described in terms of the front vowels. If a speaker produces [ɪ] for [ɛ], for example [mɪn] for [mɛn] as currently heard in some American dialects; then such an individual when serving as a listener will be inclined to write men when he hears [mɪn]. Thus it is that in the substitutions shown in Table I, [ɪ] most frequently became [ɛ], and [ɛ] most frequently became [æ]. The explanation of the high intelligibility of [æ] is probably based on this same pattern. It will be noted along the vowel loop that a wide gap appears between [æ] and [ɑ]. The [a] of the Romance [...] in this region. Since that vowel was present in neither the lists nor the dialects of most of the speakers and observers the [æ] was usually correctly identified.

The [i] and the [u] are the terminal or end positions in the mouth and on the vowel loop toward which the vowels are normally directed in the prevailing process of pronunciation change. In the formation of [i] the tongue is humped higher and farther forward than for any other vowel; in [u] the tongue hump takes the highest posterior position in the mouth and the lips are more rounded than for any other vowel. The vowels [u] and [i] are thus much more difficult to displace, and a greater stability in the organic formation of these sounds would probably be expected, which in turn should mean that these sounds are recognized more consistently by a listener.

The high intelligibility of [ɝ] probably results from the retroflexion which is present to a marked degree only in the formation of this vowel; that is, in addition to the regular humping of the tongue, the edges of the tongue are turned up against the gum ridge or the hard palate. In the acoustical pattern the third formant is markedly lower than for any other vowel. Thus in both physiological and acoustical phonetics the [ɝ] occupies a singular position among the American vowels.

The very low scores on [ɑ] and [ɔ] in Fig. 3 undoubtedly result primarily from the fact that some members of the speaking group and many members of the listening group speak one of the forms of American dialects in which [ɑ] and [ɔ] are not differentiated.

When the individuals' votes on the sounds are analyzed, marked differences are seen in the way they classified the sounds. Not only did the total numbers of agreements with the speakers vary, but the proportions of agreements for the various vowels was significantly different. Figure 4 will be used to illustrate this point. If we plot total numbers of disagreements for all tests, rather than agreements, the result is shown by the upper chart. This shows that [ɪ], [ɛ], [ɑ], [ɔ], and [...] had the most disagreements. An "average" observer would be expected to have a distribution of disagreements similar in proportions to this graph. The middle graph illustrates the distribution of disagreements given by observer number 06. His chief difficulty was in distinguishing between [ɑ] and [ɔ]. This type of distribution is characteristic of several observers. Observer 013, whose distribution of disagreements is plotted on the bottom graph, shows a tendency to confuse [...] and [...] more than the average.

¹¹ D. Jones, An Outline of English Phonetics (W. Heffer and Sons, Ltd., Cambridge, England, 1947).
¹² G. W. Gray and C. M. Wise, The Bases of Speech (Harper Brothers, New York, 1946), pp. 217-302.
¹³ L. Bloomfield, Language (Henry Holt and Company, New York, 1933), pp. 369-391.

The distributions of disagreements of all 70 observers differ from each other, depending on their language experience, but the differences are generally less extreme than the two examples shown on Fig. 4. Thirty-two of the 70 observers were also speakers. In cases where an observer such as 06 was also a speaker, the remainder of the jury generally had more disagreements

with his [ɑ] and [ɔ] sounds than with the other sounds he spoke. Thus it appears that if a speaker does not differentiate clearly between a pair of sounds in speaking them, he is unlikely to classify them properly when he hears others speak them. His language experience, as would be expected, influences both his speaking and his hearing of sounds.
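The per-observer distributions of disagreements discussed above amount to a simple tally of votes that differ from the speaker's intended vowel. The sketch below shows one way such a tally might be computed. The function names and the vote data are hypothetical and invented purely for illustration; they are not data from the study, and the paper does not describe its tabulation procedure.

```python
from collections import Counter

VOWELS = ["i", "ɪ", "ɛ", "æ", "ɑ", "ɔ", "ʊ", "u", "ʌ", "ɝ"]


def disagreement_profile(votes):
    """Count, per intended vowel, the votes that differ from the speaker's intention.

    votes: iterable of (intended, heard) pairs for a single observer.
    """
    counts = Counter(intended for intended, heard in votes if intended != heard)
    return {vowel: counts.get(vowel, 0) for vowel in VOWELS}


def most_confused_pairs(votes, top=1):
    """Return the most frequent unordered confusions, e.g. (('ɑ', 'ɔ'), 3)."""
    pairs = Counter(frozenset((intended, heard)) for intended, heard in votes if intended != heard)
    return [(tuple(sorted(pair)), n) for pair, n in pairs.most_common(top)]


# Hypothetical votes, invented for illustration only (not data from the study).
example_votes = [("ɑ", "ɔ"), ("ɔ", "ɑ"), ("ɑ", "ɔ"), ("i", "i"), ("ɛ", "ɪ"), ("u", "u")]

profile = disagreement_profile(example_votes)
print(profile["ɑ"], profile["ɔ"], profile["i"])
print(most_confused_pairs(example_votes))
```

Comparing one observer's profile with the profile summed over the whole jury would show whether that observer departs from the "average" distribution in the way described for observers 06 and 013.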
Since the listening group was not given a series of training sessions for these tests, learning would be expected in the results of the tests.¹⁴ Several pieces of evidence indicate a certain amount of practice effect, but the data are not such as to provide anything more than a very approximate measure of its magnitude.
For one check on practice effect, a ninth test was given the jury, in which all the words having more than 10 disagreements in any of the preceding eight tests were repeated. There was a total of about 175 such words; to these were added 25 words which had no disagreements, picked at random from the first eight tests. On the ninth test, 67 words had more disagreements, 109 had less disagreements, and 24 had the same number of disagreements as in the preceding tests. The probability of getting this result had there been no practice or other effect, but only a random variation of observers' votes, would be about 0.01. When these data are broken down into three groups for the men, women and children speakers, the largest differences in numbers of disagreements for the original and repeated tests was on the children's words, indicating a larger practice or learning effect on their sounds. The indicated learning effect on men's and women's speech was nearly the same. When the data are classified according to the vowel sound, the learning effect indicated by the [...] was least on [i], [...], and [u], and greatest [...]

Another indication that there was a practice effect lies in the sequence of total numbers of disagreements by tests. From the second to the seventh test, the total number of disagreements by all observers diminished consistently from test to test, and the first test had considerably more disagreements than the eighth, thus strongly indicating a downward trend. With the speakers randomized in their order of appearance in the eight tests, each test would be expected to have approximately the same number of disagreements. The probability of getting the sequence of numbers of total disagreements which was obtained would be somewhat less than 0.05 if there were no learning trend or other nonrandom effect.

It was also found that the listening position had an effect upon the scores obtained. The observers were arranged in 9 rows in the auditorium, and the listeners in the back 4 rows had a significantly greater number of disagreements with the speakers than did the listeners in the first 5 rows. The effect of a listener's position within an auditorium upon intelligibility has been observed previously and is reported in the literature.¹⁵

FIG. 4. Observer disagreements in listening tests.

¹⁴ H. Fletcher and R. H. Galt, J. Acoust. Soc. Am. 22, 93 (1950).
¹⁵ V. O. Knudsen and C. M. Harris, Acoustical Designing in Architecture (John Wiley and Sons, New York, 1950), pp. 180-181.

Calibrations of Equipment

A rapid calibrating technique was developed for checking the over-all performance of the recording and analyzing systems. This depended on the use of a test tone which had an envelope spectrum that was essentially flat with frequency over the voice band. The circuit used to generate this test tone is shown schematically in Fig. 5. It consists essentially of an overloading amplifier and pulse sharpening circuit. The wave shapes which may be observed at several different points in the test tone generator are indicated in Fig. 5. The test tone generator may be driven by an input sine wave signal of any frequency between 50 and 2000 cycles. Figure 6(a) shows a section of the test tone with a 100 cycle repetition frequency, which had been recorded on magnetic tape in place of the word lists by the speaker, and then played back into the sound spectrograph. The departure from uniform frequency response of the over-all systems is indicated by the shape of the envelope enclosing the peaks of the 100

cycle harmonics. With the 100 cycles from the Laboratories standard frequency oscillator as the drive signal, the frequency calibration of the systems may be checked very readily by comparison of the harmonic spacing on the section with the template scale. The amplitude scale in 6(a) is obtained by inserting a pure tone at the spectrograph in 5 db increments. The frequency scale for spectrograms may also be calibrated as shown in Fig. 6(b). The horizontal lines here are representations of the harmonics of the test tone when the test tone generator is driven by a 500 cycle standard frequency. These lines further afford a means of checking the amount of speed irregularity or wow in the over-all mechanical system. A calibration of the time scale may be obtained by using the test tone generator with 100 cycle drive and making a broad band spectrogram as shown in Fig. 6(c). The spacings between vertical striations in this case correspond to one-hundredth of a second intervals.

FIG. 5. Schematic of calibrating tone generator.

In the process of recording some of the word lists, it was arranged to substitute the calibrating test tone circuit for the microphone circuit, and record a few seconds of test tone between the lists of words. When the word lists were analyzed with the spectrograph, the accompanying test tone sections provided a means of checking the over-all frequency response of the recorder and analyzer, and the frequency scale of the sectioner.

The effect of speed variations in either the recorder or the sound spectrograph is to change the frequency scale. A series of measurements with the 100 cycle test tone showed that the tape recorder ran approximately one percent slower when playing back than it did on recording.

The speed variations on the sound spectrograph were measured with the test tone applied directly, and the maximum short time variations were found to be ±0.3 percent. Such direct calibrations of the frequency scale of the spectrograph, during a period of four weeks when most of the spectrographic analysis was done, showed maximum deviations of ±30 cycles at the 31st harmonic of the 100 cycle test tone. During that period a control chart¹⁶ of the measurements of the 3100 cycle component of the test tone showed a downward trend of about 10 cycles, which was attributed to changes in the electronic circuit components of the spectrograph. As a result of these calibration tests, it was concluded that the frequency scale of the sound spectrograph could be relied upon as being accurate within ±1 percent.

¹⁶ "A.S.T.M. manual on presentation of data," Am. Soc. Testing Materials (Philadelphia, 1945), Appendix B.

Formant Measurements

Measurements of both the frequency and the amplitude of the formants were made for the 20 words recorded by each of the 76 speakers. The frequency position of each formant was obtained by estimating a weighted average of the frequencies of the principal components in the formant. (See reference 4 for a discussion of this procedure.) When the principal components in the formant were symmetrically distributed about a dominant component, such as the second formant of [ʌ] hud in Fig. 2, there is little ambiguity

in choosing the formant frequency. When the distribution is asymmetrical, however, as in the first formant of [ɝ] heard in Fig. 2, the difference between estimated formant frequency and that assigned by the ear may be appreciable.

One of the greatest difficulties in estimating formant frequencies was encountered in those cases where the fundamental frequency was high so that the formant was poorly defined. These factors may account for some, but certainly not all, of the differences discussed later by ear and by measured values of formant frequencies.

Amplitudes were obtained by assigning a value in [...] to the formant peak. In the case of the amplitude measurements it was then necessary to apply a correction for the over-all frequency response of the [...]

The procedure of making duplicate recordings [...] analyses of the ten words for each of the speakers [...] of the data.

FIG. 6. Spectrograms and section of calibrating tone. [Legible panel title: band spectrogram of 500-cycle calibrating tone.]

FIG. 7. Accuracy-precision chart of first formant frequencies of [i] as spoken by 28 women. [Abscissa: F₁ of first calling in cycles per second; legible annotation: R of differences = 17.2.]

One method by which the duplicate measured values were used is illustrated by Fig. 7. This is a plot of the values for the first formant frequency F₁ of [i] as in heed, as spoken by the 28 female subjects. Each point represents, for a single [...] the value of F₁ measured for the heed in the first list, versus the value of F₁ for the heed in the second list. If the F₁ for the second list or calling was greater than that for the first calling, the point lies above a 45-degree line; if it is less, the point lies below the 45-degree line. The average difference R between the paired values of F₁ for first and second [...] The estimated standard deviation σ derived from the differences between pairs of F₁ values was 15.3 cycles. The dotted lines in Fig. 7 are spaced ±3σ cycles from the 45-degree line through the origin. In case a point falls outside the dotted lines, it is generally because of an erroneous [...]

Each of the three formant frequencies for each of the [...] points for each formant, or a total of 2280 points plotted on 90 accuracy-precision charts like Fig. 7. Of these 2280 points, 118 fell outside the ±3σ limits. On checking back over the measurements, it was found that 88 of the points were incorrect because of gross measurement errors, typographical errors in transcribing the data, or because the section had been made during the influence period of the consonants instead of in the

steady state period of the vowel. When corrected, these 88 points were within the ±3σ limits. Of the remaining 30 points which were still outside the limits, 20 were the result of the individuals' having produced pairs of sounds which were unlike phonetically, as shown by the results of the listening tests.

The duplicate measurements may also be used to show that the difference between successive utterances of the same sound by the same individual is much less significant statistically than the difference between utterances of the same sound by different individuals. An analysis of variance of the data in Fig. 7 shows that the differences between callings of pairs are not significant. However, the value for the variance ratio when comparing speakers is much larger than that corresponding to a 0.1 percent probability. In other words, if the measurements shown in Fig. 7 for all callings by all speakers were assumed to constitute a body of statistically random data, the probability of having a variance ratio as high as that found when comparing speakers would be less than one in a thousand. Therefore it is assumed that the data are not statistically random, but that there are statistically significant differences between speakers. Since the measurements for pairs of callings were so nearly alike, as contrasted with the measurements on the same sound for different speakers, this indicated that the precision of measurements with the sound spectrograph was sufficient to resolve satisfactorily the differences between the various individuals' pronunciations of the same sounds.

In Fig. 3, as discussed previously, are plotted areas in the plane of the second formant F₂ versus the first formant F₁. These areas enclose points for several repetitions of the sustained vowels by one of the writers. It is clear that here the vowels may be separated readily, simply by plotting F₂ against F₁; that is, on the F₁-F₂ plane, points for each vowel lie in isolated areas, with no overlapping of adjacent areas, even though there exists the variation of the measured values which we have discussed above.

The variation of the measured data for a group of speakers is much larger than the variation encountered in repetitions with the same speaker, however, as may be shown by the data for F₁ and F₂ for the 76 speakers. In Fig. 8 are plotted the points for the second calling by each speaker, with the points identified according to the speaker's word list. The closed loops for each vowel have been drawn arbitrarily to enclose most of the points; the more extreme and isolated points were disregarded so that in general these loops include about 90 percent of the values. The frequency scales on this and Fig. 9 are spaced according to the approximation to an aural scale described by Koenig, which is linear to 1000 cps and logarithmic above.¹⁷

Considerable overlapping of areas is indicated, particularly between [ɝ] and [ɛ], [ɝ] and [ʊ], [ʊ] and [u], and [ɑ] and [ɔ]. In the case of the [ɝ] sound, it may be easily distinguished from all the others if the third formant frequency is used, as the position of the third formant is very close in frequency to that of the second.

FIG. 8. Frequency of second formant versus frequency of first formant for ten vowels by 76 speakers. [Abscissa: frequency of F₁ in cycles per second.]

The data of Fig. 8 show that the distribution of points in the F₁-F₂ plane is continuous in going from sound to sound; these distributions doubtless represent

TABLE I. Classifications of vowels by speakers and by listeners. (Rows: vowels intended by speakers. Columns: vowels as classified by listeners.)

i ɪ ɛ æ ɑ ɔ ʊ u ʌ ɝ
10267 4 6 ... 3 ...........
. 6 954• 694 "5 1 1 .... 56
257 9014 949 1 3 ... i.i "5 51
ß 1 300 9919 2 2 15 39
Vowelsintendedby speakers ß 1 19 8936 1013 '• "' 228 7
... 1 2 590 9534 71 5 62 14
... 1 1 16 51 9924 96 171 19
1 2 78 10196 2
'"1 1 "• 540 '1•'7 103 ... •7• 21
... 23 6 2 3 ...... 2 10243
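The vote counts of Table I lend themselves to a simple per-vowel agreement score: the share of all votes for sounds intended as a given vowel that fall on that same vowel. The sketch below is an illustration under stated assumptions, not the authors' procedure. It uses only the rows whose entries are fully legible in this copy, it assumes that the rows follow the same vowel order as the column headings, and it takes the largest entry of each row to be the diagonal (agreeing) cell. The names `ROWS` and `agreement` are hypothetical.

```python
# Nonzero vote counts from the fully legible rows of Table I. Rows containing
# garbled cells are left out rather than guessed. Row labels assume that the
# rows follow the same vowel order as the column headings.
ROWS = {
    "i": [10267, 4, 6, 3],
    "æ": [1, 300, 9919, 2, 2, 15, 39],
    "ɔ": [1, 2, 590, 9534, 71, 5, 62, 14],
    "ʊ": [1, 1, 16, 51, 9924, 96, 171, 19],
    "u": [1, 2, 78, 10196, 2],
    "ɝ": [23, 6, 2, 3, 2, 10243],
}


def agreement(row):
    """Fraction of a row's votes on the intended vowel (assumed to be the largest entry)."""
    return max(row) / sum(row)


scores = {vowel: agreement(row) for vowel, row in ROWS.items()}
for vowel, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"[{vowel}] {100 * score:.1f} percent of votes agree with the speaker")
```

Because the score depends only on the row total and the diagonal entry, it does not require knowing which column each off-diagonal count belongs to.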

¹⁷ W. Koenig, Bell Labs. Record 27, (August, 1949), pp. 299-301.

TABLE II. Averages of fundamental and formant frequencies and formant amplitudes of vowels by 76 speakers.

                                     i     ɪ     ɛ     æ     ɑ     ɔ     ʊ     u     ʌ     ɝ

Fundamental frequencies (cps)
                            M      136   135   130   127   124   129   137   141   130   ...
                            W      235   232   223   210   212   216   232   231   221   218
                            Ch     272   269   260   251   256   263   276   274   261   261

Formant frequencies (cps)
                      F1    M      270   390   530   660   730   570   440   300   640   490
                            W      310   430   610   860   850   590   470   370   760   500
                            Ch     370   530   690  1010  1030   680   560   430   850   560

                      F2    M     2290  1990  1840  1720  1090   840  1020   870  1190  1350
                            W     2790  2480  2330  2050  1220   920  1160   950  1400  1640
                            Ch    3200  2730  2610  2320  1370  1060  1410  1170  1590  1820

                      F3    M     3010  2550  2480  2410  2440  2410  2240  2240  2390  1690
                            W     3310  3070  2990  2850  2810  2710  2680  2670  2780  1960
                            Ch    3730  3600  3570  3320  3170  3180  3310  3260  3360  2160

Formant amplitudes (db)
                            L1      -4    -3    -2    -1    -1     0    -1    -3    -1    -5
                            L2     -24   -23   -17   -12    -5    -7   -12   -19   -10   -15
                            L3     -28   -27   -24   -22   -28   -34   -34   -43   -27   -20
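The averages of Table II allow a quick check of how far the children's first formants lie above the men's, expressed in octaves. The sketch below uses only the F1 rows for men and children as printed in the table. The vowel labels assume the column order of the table heading, and the helper name `octaves` is hypothetical.

```python
import math

# Column order assumed from the heading of Table II.
VOWELS = ["i", "ɪ", "ɛ", "æ", "ɑ", "ɔ", "ʊ", "u", "ʌ", "ɝ"]

# Average first formant frequencies (cps) from Table II.
F1_MEN = [270, 390, 530, 660, 730, 570, 440, 300, 640, 490]
F1_CHILDREN = [370, 530, 690, 1010, 1030, 680, 560, 430, 850, 560]


def octaves(f_high, f_low):
    """Interval between two frequencies in octaves (1 octave = a factor of 2)."""
    return math.log2(f_high / f_low)


intervals = {v: octaves(c, m) for v, c, m in zip(VOWELS, F1_CHILDREN, F1_MEN)}
mean_interval = sum(intervals.values()) / len(intervals)

for vowel, interval in intervals.items():
    print(f"[{vowel}] children's F1 is {interval:.2f} octave above the men's")
print(f"mean interval: {mean_interval:.2f} octave")
```

The interval differs from vowel to vowel, so a single figure for the whole set should be read as a rough summary.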

large differences in the way individuals speak the sounds. The values for F₃ and the relative amplitudes of the formants also have correspondingly large variations between individuals. Part of the variations are because of the differences between classes of speakers, that is, men, women and children. In general, the children's formants are highest in frequency, the women's intermediate, and the men's formants are lowest in frequency.

These differences may be observed in the averaged formant frequencies given on Table II. The first formants for the children are seen to be about half an octave higher than those of the men, and the second and third formants are also appreciably higher. The measurements of amplitudes of the formants did not show decided differences between classes of speakers, and so have been averaged all together. The formant amplitudes are all referred to the amplitude of the first formant in [ɔ], when the total phonetic powers of the vowels are corrected so as to be related to each other by the ratios of powers given by Fletcher.¹⁸

Various methods of correlating the results of the listening tests with the formant measurements have been studied. In terms of the first two formants the nature of the relationship is illustrated in Fig. 9. In this figure measurements for all vowels of both callings are plotted in which all members of the listening group agreed with the speaker. Since the values for the men and the children generally lie at the two ends of the distributions for each vowel, the confusion between vowels is well illustrated by their data; thus the measurements for the women speakers have been omitted.

The lines on Fig. 9 are the same as the boundaries drawn in Fig. 8. As indicated previously, some vowels received 100 percent agreement much more frequently than others.

The plot has also been simplified by the omission of [ɝ]. The [ɝ] produces extensive overlap in the [ʊ] region in a graph involving only the first two formants. As explained previously, however, the [ɝ] may be isolated from the other vowels readily by means of the third formant.

When only vowels which received 100 percent recognition are plotted, the scatter and overlap are somewhat reduced over that for all callings. The scatter is greater, however, than might be expected.

FIG. 9. Frequency of second formant versus frequency of first formant for vowels spoken by men and children, which were classified unanimously by all listeners.

¹⁸ H. Fletcher, Speech and Hearing (D. Van Nostrand Company, Inc., New York, 1929), p. 74.

If the first and second formant parameters measured from these words well defined their phonetic values; and if the listening tests were an exact means of classifying the words, then the points for each vowel of

Fig. 9 should be well separated. Words judged intermediate in phonetic position should fall at intermediate positions in such a plot. In other words, the distributions of measured formant values in these plots do not correspond closely to the distributions of phonetic values.

It is the present belief that the complex acoustical patterns represented by the words are not adequately represented by a single section, but require a more complex portrayal. The initial and final influences often shown in the bar movements of the spectrograms are of importance here.19 The evaluation of these changing bar patterns of normal conversational speech is, of course, a problem of major importance in the study of the fundamental information-bearing elements of speech.

A further study of the vowel formants is now nearing completion. This study employs sustained vowels, without influences, obtained and measured under controlled conditions. The general objectives are to determine further the most fundamental means of evaluating the formants, and to obtain the relations among the various formants for each of the vowels as produced by different speakers. When this information has been obtained it is anticipated that it will serve as a basis for determining methods of evaluating and relating the changing formants within words as produced by various speakers.

19 Potter, Kopp, and Green, Visible Speech (D. Van Nostrand Company, Inc., New York, 1947).

SUMMARY

The results of our work to date on the development of methods for making acoustic and aural measurements on vowel sounds may be summarized as follows.

1. Calibration and measurement techniques have been developed with the sound spectrograph which make possible its use in a detailed study of the variations that appear in a broad sample of speech.
2. Repeated utterances, repeated measurements at various stages in the vowel study, and randomization in test procedures have made possible the application of powerful statistical methods in the analysis of the data.
3. The data, when so analyzed, reveal that both the production and the identification of vowel sounds by an individual depend on his previous language experience.
4. It is also found that the production of vowel sounds by an individual is not a random process, i.e., the values of the acoustic measurements of the sounds are not distributed in random order. This is probably true of many other processes involving individuals' subjective responses.
5. Finally, the data show that certain of the vowels are generally better understood than others, possibly because they represent "limit" positions of the articulatory mechanisms.

ACKNOWLEDGMENTS

The work which we have discussed has involved the contributions of a number of people. We should like to acknowledge the guidance of Mr. R. K. Potter and Mr. J. C. Steinberg in the plan of the experiment, and the contribution of Dr. W. A. Shewhart, who has assisted in the design and interpretation of the study with respect to the application of statistical methods. We are indebted to Miss M. C. Packer for assistance in statistical analyses of the data. We wish to acknowledge also the assistance given by Mr. Anthony Prestigiacomo, Mr. George Blake, and Miss E. T. Leddy in the recording and analysis of the sounds and in the preparation of the data.

