Journal of Voice
Vol. 4, No. 3, pp. 231-237
© 1990 Raven Press, Ltd., New York
Formant Tuning in a Professional Baritone
*†D. G. Miller and †H. K. Schutte
*School of Music, Voice Department, Syracuse University, Syracuse, NY, U.S.A.; †Voice Research Laboratory,
ENT-Clinic, University Hospital Groningen, Groningen, The Netherlands
Summary: Formant tuning, or using vowel modification to approximate one or
both of the two lowest resonances of the vocal tract to harmonics of the glottal
source, is a technique advocated by certain pedagogies of singing. In this
experiment, two sung phonations of a professional baritone are examined for
evidence of the tuning by simultaneous recording of audio, electroglottographic (EGG), and subglottal and supraglottal pressures by means of wideband miniature pressure transducers on a catheter passed through the glottis.
The considerable resonance-enhancing effects of formant tuning appear to be
intentionally exploited by the singer in response to the demands of the musical
phrase. Key Words: Formant--Singing--Subglottal pressure--Supraglottal
pressure--Pedagogy of singing--Fundamental frequency--Spectrogram.
Address correspondence and reprint requests to D. G. Miller at Voice Research Lab., PO Box 30.001, NL 9700 RB Groningen, The Netherlands.
This article was presented at the XVIth Symposium: Care of the Professional Voice, New York, June 1987.
One of the important contributions of voice science to the theory of the singing voice has been to
shed light on the process of tuning the vocal tract,
an intuitive act that has been practiced at least as
long as skillful singers have striven to be heard
against the background of other musical instruments (1). Varying the resonances of the vocal tract
to produce the vowels of speech requires little conscious attention; in singing, however, the requirement of sustaining syllables while maintaining an
aesthetically acceptable sound forces singers into
producing vowels with care. The further requirement of producing a sound of adequate size in a
range well above that of speech makes attention to
tuning the resonances of the vocal tract often essential for professional survival.
By modulating the position of the articulators
(primarily tongue, lips, and jaw), the resonances of
the vocal tract (called formants and designated F1,
F2, F3, etc. in the order of their resonant frequencies) are varied to produce the several vowels of a
given language (2). By far the most important resonances in determining the vowel quality are F1 and
F2, which in singing have ranges of ~250-900 and
~800-2,200 Hz, respectively. Formant tuning, as
we shall use the notion here, is bringing one of the
first two formants close in frequency to one of the
harmonics of the voice source, which occur at multiples of the fundamental frequency (F0). A well-tuned formant can have considerable amplifying
power (3), and harmonics falling very close to F1 or
F2 will generally have the highest sound pressure
level (SPL) in the spectrum of a sung tone. As F0
rises, the increasing distance between harmonics
makes formant tuning more critical to the resultant
intensity of the singing voice.
The most dramatic resonance-enhancing effects
of formant tuning generally occur in the range of F5
to B5-flat (ca. 700-900 Hz), where F1 coincides with
F0 on the open (high-F1) vowels. Sundberg (4) has
shown how this important effect is extended in female voices by adjusting F1 to match F0 in other
(low-F1) vowels through varying the jaw opening.
Because of a longer closed phase in the glottal cycle, however, "chest" register production (including male "legitimate head voice") cannot take full
advantage of the resonance-enhancing effect of the
coincidence of F0 and F1,¹ and the strongest formant tuning effects in this register are often produced with the second or higher harmonics.
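The remark that rising F0 spreads the harmonics apart, and so makes tuning more critical, can be illustrated with a short calculation. The sketch below is only an illustration: the function name and the sample F0 values of 110 and 700 Hz are chosen here for contrast and are not taken from the experiment, while the 250-900 Hz band is the F1 range quoted above.

```python
def harmonics_in_band(f0_hz, low_hz, high_hz):
    """Return the harmonics n * f0 (n = 1, 2, ...) that fall inside [low_hz, high_hz]."""
    highest_n = int(high_hz // f0_hz)
    candidates = [n * f0_hz for n in range(1, highest_n + 1)]
    return [f for f in candidates if f >= low_hz]


F1_BAND_HZ = (250.0, 900.0)  # approximate F1 range in singing, as quoted in the text

# Illustrative F0 values: a low male note, the two F0 values reported for
# this singer (230 and 350 Hz), and a high soprano note.
for f0 in (110.0, 230.0, 350.0, 700.0):
    inside = harmonics_in_band(f0, *F1_BAND_HZ)
    # A formant can lie at most f0 / 2 away from its nearest harmonic.
    print(f"F0 = {f0:5.0f} Hz: {len(inside)} harmonic(s) in the F1 band, "
          f"worst-case offset {f0 / 2:.0f} Hz")
```

At low F0 some harmonic always lies near F1 whatever the vowel; at high F0 only one or two harmonics fall in the band, so the vowel must be modified to bring the formant to a harmonic.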
Conscious application of formant tuning to the
practice of training singers has been slow in developing. As early as 1969, Oncley (7) explained the
"lifts" commonly perceived in singing voices as
shifts in the tuning of formants to lower harmonics
as these rise with ascending F0. Coffin (8) has built
a system of training the singing voice around the
notion of tuning the vocal tract. In the cases of both
of these authors, however, the theory was developed on the basis of limited knowledge of what actually happens acoustically within the vocal tract in
singing and, therefore, is in need of correction. The
present experiment introduces one way to examine
more closely the phenomenon of vocal tract tuning.
METHOD
Although the higher-frequency components of a
sung sound are an important determinant of perceived quality and of "carrying power" (1), the
sound pressure level (SPL) is largely a matter of the
lower frequency components, chiefly F1 and, to a
lesser extent, F2 (4). The dominance of the lower
formants seen in the radiated sound is considerably
stronger within the vocal tract, since the low-frequency components, much like those in brass instruments, are subject to greater attenuation as they
pass from the vocal tract into the outside air (9).
This leads us to expect that tuning of the lower
formants will be more readily apparent within the
vocal tract than to an outside microphone.
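The size of this effect can be gauged with a rough sketch. It rests on an assumption that the paper does not state: a common textbook idealization in which radiation from the mouth favors higher frequencies by about 6 dB per octave relative to the pressure inside the tract. The function name is hypothetical, and the frequencies of 460 Hz (F1) and 1,450 Hz (F2) are the values reported later in the Discussion.

```python
import math


def radiation_tilt_db(freq_hz, ref_hz):
    """Relative level gain on radiation for freq_hz versus ref_hz.

    ASSUMPTION: an idealized +6 dB/octave radiation characteristic,
    i.e. gain proportional to frequency. Real radiation is more complex.
    """
    return 20.0 * math.log10(freq_hz / ref_hz)


f1_hz = 460.0    # F1 tuned to the second harmonic of B3-flat (from the Discussion)
f2_hz = 1450.0   # F2 tuned to the fourth harmonic of F4 (from the Discussion)

advantage = radiation_tilt_db(f2_hz, f1_hz)
print(f"Assumed radiation advantage of {f2_hz:.0f} Hz over {f1_hz:.0f} Hz: "
      f"{advantage:.1f} dB")
```

Under this idealization a component near F2 gains roughly 10 dB on radiation relative to one near F1, which is the sense in which the lower formants dominate more strongly inside the tract than at the microphone.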
¹ In "falsetto" production (where the closed phase of the glottis is short or absent), if the first formant is tuned to F0 (the first
harmonic), it is possible to achieve high (>100 cm water pressure) peak-to-peak supraglottal pressures. The (sinusoid) F1
component is timed so that it is driven by the time-varying glottal
flow in both rising and falling phases, a distinct advantage over
"chest" register production, whose long closed phase of the
glottis prevents the glottal flow from driving the supraglottal
pressure wave in its rising phase. In the case where F1 is tuned
to the second harmonic, making two complete F1 cycles per
glottal cycle, however, the first of each pair of F1 cycles will be
completed in the long (>50%) closed phase of the "chest" register, whereas the second will be driven by the flow in both rising
and falling phases. An instance of tuning to the second harmonic
can be seen in the B3-flat of Fig. 2. The argument is presented in
greater detail in Miller (5). Some nuances of the "falsetto" case
are examined in Schutte and Miller (6).
Since 1984, the authors have been measuring the
subglottal and supraglottal pressures of voices performing a variety of singing tasks. The authors' procedure (10) consists briefly of recording these pressures with miniature wide-band pressure transducers on a small-diameter catheter introduced through
the nares and passing through the posterior commissure of the glottis. The electroglottographic
(EGG) signal (measuring vocal fold contact area)
and the radiated sound are also recorded simultaneously. The supraglottal pressure signal is measured within 3 cm of the glottis,² which functions as
a pressure antinode (a point of maximum AC pressure modulations) in standing waves of the vocal
tract. Thus, tuning F1 or F2 to match one of the
harmonics of the glottal source would result in a
strong component at that frequency in the recorded
supraglottal pressure waveform.
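How much a transducer placed about 3 cm above the antinode underestimates the standing-wave pressure can be sketched with a simple model. Everything in the model is an assumption made for illustration: the vocal tract is treated as a uniform lossless tube closed at the glottis, so that pressure amplitude falls off as the cosine of distance from the antinode, and the speed of sound is taken as 350 m/s. The formant frequencies (460 Hz for F1, 1,450 Hz for F2) are those reported in the Discussion.

```python
import math

SPEED_OF_SOUND_M_S = 350.0  # ASSUMED value for warm, humid air in the vocal tract


def relative_antinode_pressure(freq_hz, distance_m, speed_m_s=SPEED_OF_SOUND_M_S):
    """Standing-wave pressure amplitude at distance_m from a pressure antinode,
    relative to the amplitude at the antinode itself.

    ASSUMPTION: uniform lossless tube, so amplitude varies as |cos(2*pi*f*x/c)|.
    """
    return abs(math.cos(2.0 * math.pi * freq_hz * distance_m / speed_m_s))


distance_m = 0.03  # the authors' estimate of 3 cm above the glottis
for label, freq_hz in (("F1", 460.0), ("F2", 1450.0)):
    loss = 1.0 - relative_antinode_pressure(freq_hz, distance_m)
    print(f"{label} at {freq_hz:.0f} Hz: about {100.0 * loss:.0f}% below antinode pressure")
```

With these assumptions the loss is a few percent at the F1 frequency and just under 30% at the F2 frequency, in line with the authors' footnoted estimate that the attenuation is negligible for F1 but could reach 30% for F2.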
The subject whose singing pressures are recorded
here (not one of the authors) is one of the leading
opera singers in the Netherlands. A man in his middle years with a serious interest in vocal pedagogy,
he is hardly a naive subject, but at the time of the
experiment his thinking about singing did not include any notion of formant tuning as described
above. Neither was he coached by the experimenters, but simply directed to perform some typical
vocalize patterns on vowels and CV nonsense syllables, once the catheter was in place and the topical anesthetic needed during the catheter insertion
had worn off.
RESULTS
In presenting two phonations for closer scrutiny
from the many recorded by this subject, it is not the
authors' intention to describe precise relationships
among quantitative data nor to make any assertion
about the statistical frequency of the use of formant
tuning among singers; it is rather to call attention to
some acoustic mechanisms that appear to be exploited by this skillful and experienced singer. Both
phonations cover the range of frequencies between
230 and 380 Hz (roughly B3-flat to F4), a critical
range where the baritone normally makes some sort
of adjustment in order to arrive at the male "legitimate head voice" (11) by the time he reaches the
high F4. The phonations will be examined first in an
overview and then the pressure patterns within the
glottal cycle will be looked at.
² The 3 cm is an estimate, based on the fact that the 6 cm
length of catheter between the transducers is partly occupied by
structures of the glottis. In the case of F1, the attenuation of
antinodal pressure due to the varying position of the transducer
can be neglected; on the other hand, it could be as high as 30%
in the case of F2.
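For readers who want to relate the pitch names to frequencies, the sketch below converts note names to nominal values. It assumes twelve-tone equal temperament with A4 = 440 Hz, which is a convention adopted here and not something the experiment specifies; the function and table names are hypothetical.

```python
# Semitone offsets of the natural notes above C within one octave.
NOTE_OFFSETS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def pitch_hz(letter, octave, flat=False, a4_hz=440.0):
    """Nominal frequency of a note. ASSUMES equal temperament and A4 = 440 Hz."""
    midi_number = 12 * (octave + 1) + NOTE_OFFSETS[letter] - (1 if flat else 0)
    return a4_hz * 2.0 ** ((midi_number - 69) / 12.0)


# The five pitches of the descending scale, plus the arpeggio notes.
for name, args in (("B3-flat", ("B", 3, True)), ("C4", ("C", 4)),
                   ("D4", ("D", 4)), ("E4-flat", ("E", 4, True)),
                   ("F4", ("F", 4))):
    print(f"{name}: {pitch_hz(*args):.1f} Hz")
```

The nominal values run from about 233 Hz (B3-flat) to about 349 Hz (F4). The 230-380 Hz span reported in the text is wider at the top, presumably because it reflects the frequencies actually sung, including the vibrato excursions described for the high note.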
The first phonation is an arpeggio pattern (do-mi-sol-mi-do) on the syllables /bibibi/. The overview (Fig. 1) reveals an interesting relationship between the subglottal pressure and the sound pressure level. It is usually assumed that these two
variables are closely correlated (3), at least for a
constant vowel. In this case, the subglottal pressure
on the F4 is about half again as much as that on the
B3-flat, whereas the D4 has an intermediate value;
nonetheless, the sound pressure level on the high
note is slightly lower than that on the low note, and
the intermediate D4's are at least six decibels below
the other notes in SPL.
A clue to the explanation of this unexpected result can be found in the variations of supraglottal
pressure within the glottal cycle, shown in Fig. 2.
Although the overview gives a good indication of
peak-to-peak (AC) variations in the pressure signals, as well as the progress of average pressures,
the details of the glottal cycle are revealed in Fig. 2,
with its greatly expanded time scale. Here a few
cycles of each of the pitches of the phonation have
been juxtaposed to facilitate comparison among
them. A strong tuning effect is evident in the B3-flat,
where F1 has been adjusted to a relatively high frequency so that it coincides with the second harmonic (see also Fig. 3). Because of the dominance
of low formants within the vocal tract, this well-tuned formant produces a high peak-to-peak variation in the supraglottal pressure, which in turn gives
this pitch the highest SPL of the three in the phonation, in spite of its relatively low subglottal pressure.
FIG. 1. Signals of complete phonation of baritone singing arpeggio (B3-flat, D4, F4, D4, B3-flat).
(Trace labels recoverable from the original figure: audio; supraglottal pressure, +10 to -10 cm H2O; subglottal pressure, 0 to 25 cm H2O; EGG; pitch, B-flat, D, F, D, B-flat on /bi-bi-bi/.)
The other striking instance of formant tuning in
this phonation occurs on the high note, which has a
center frequency of 350 Hz. This time, however, it
is not the first, but the second formant that dominates. Evidence for this is seen in the supraglottal
pressure but is even clearer in the audio signal,
which indicates an F2 standing wave with a frequency four times that of F0 (see also Fig. 3). The
amplitude of the peak-to-peak supraglottal pressure
modulations is notably smaller than on the B3-flat,
but the higher frequency of the tuned second formant results in a sound pressure level which is similar.
In contrast to these two strongly resonated tones,
the sound pressure level of the first D4 is ~6 dB
lower, a reduction of one-half in sound pressure, even though the subglottal pressure is relatively high. This is due largely
to the relative lack of formant tuning: F0 has moved
too high for matching the second harmonic to F1,
and there is no evidence of a matching of F2 with
higher harmonics in the first sample shown. We
have included a second example of the D4, however, occurring after the high note, where F2
matches the fifth harmonic at a certain point in the
vibrato cycle, raising the SPL ~4 dB above its minimum. This effect is stronger following the tuning of
F2 on the high note than it is preceding it.
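The decibel figures in this passage can be checked with the standard relation between a sound pressure ratio and a level difference, 20 log10 of the ratio. The short sketch below applies it to the values quoted in the text; the function names are illustrative only.

```python
import math


def pressure_ratio_to_db(ratio):
    """Level difference in dB for a given ratio of sound pressure amplitudes."""
    return 20.0 * math.log10(ratio)


def db_to_pressure_ratio(level_db):
    """Ratio of sound pressure amplitudes for a given level difference in dB."""
    return 10.0 ** (level_db / 20.0)


# Halving the sound pressure lowers the level by about 6 dB.
print(f"ratio 0.5 -> {pressure_ratio_to_db(0.5):.2f} dB")

# A 4 dB rise corresponds to roughly 1.6 times the sound pressure.
print(f"+4 dB -> ratio {db_to_pressure_ratio(4.0):.2f}")
```

The same relation shows that a partial standing 10-15 dB above its neighbors has roughly three to six times their sound pressure amplitude.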
The information presented in Fig. 3 confirms
these observations. It consists of spectra (produced
by a Princeton Applied Research FFT Real Time
Spectrum Analyzer model 4512) of the audio signal
of the five tones of the vocalization, as well as the
supraglottal signal of the B3-flat and F4, where formant tuning is most evident. On these two pitches,
the partial that is amplified by a coinciding formant is 10-15 dB stronger than the remaining partials, at least in the microphone signal. In the supraglottal spectra, the preponderance of intensity of
the lower partials, much of which is not passed at
the lips into the radiated spectrum, gives a different
perspective, and F2 tuning has a less evident effect.
Although it is pitched a fraction of a semitone
higher, our second example (Fig. 4) covers the same
range (from B3-flat to F4), with the same target
vowel (/i/). The pitch sequence is that of a descending scale passage, however, and there is no interruption of the vowel by consonants. Except for an
intermittent high-intensity sound on the first pitch,
the overview shows a smooth scale, lacking the
contrasts and discontinuities of the first example.
FIG. 2. Details taken from the signals of Fig. 1 on expanded time scale. (The signals for audio and SPL lag ~1.5 ms behind the other signals.)
(Trace labels recoverable from the original figure: audio; supraglottal pressure, +10 to -10 cm H2O; subglottal pressure, 0 to 25 cm H2O; transglottal pressure; EGG; SPL, 10 dB scale; pitch.)
Figure 5 gives some details of this overview on an
expanded time scale. The high note is shown at a
point in the vibrato cycle where F2 is sharply tuned
to the fourth harmonic, as well as a point 60 ms later
where the F2-tuning is weak. The weak tuning,
which is evident in the audio, supraglottal pressure,
and SPL signals, coincides with a drop in F0 from
380 to 340 Hz in the course of the vibrato cycle.
There are also two examples taken from the second
note (E4-flat), illustrating stronger and weaker F2-tuning within the vibrato cycle. In the better tuned
part, F2 is matched with the fifth harmonic, but the
effect is weaker than in the examples where the
tuning occurs at harmonics lower in the series.
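The reason a vibrato can switch the tuning on and off becomes clear once the F0 excursion is scaled up to the harmonic in question. The sketch below uses the 340-380 Hz excursion reported for the high note; the function names are illustrative, and the closing remark about where F2 lies is an inference from the text, not a measured value.

```python
import math


def interval_cents(f_high_hz, f_low_hz):
    """Size of the interval between two frequencies, in cents (100 cents = 1 semitone)."""
    return 1200.0 * math.log2(f_high_hz / f_low_hz)


def harmonic_sweep_hz(harmonic_number, f0_low_hz, f0_high_hz):
    """Frequency range swept by the n-th harmonic while F0 moves between the two values."""
    return harmonic_number * (f0_high_hz - f0_low_hz)


f0_low_hz, f0_high_hz = 340.0, 380.0  # vibrato extremes reported for the high note

print(f"Vibrato extent: {interval_cents(f0_high_hz, f0_low_hz):.0f} cents")
print(f"Fourth harmonic moves from {4 * f0_low_hz:.0f} to {4 * f0_high_hz:.0f} Hz, "
      f"a sweep of {harmonic_sweep_hz(4, f0_low_hz, f0_high_hz):.0f} Hz")
```

A 40 Hz change in F0 moves the fourth harmonic by 160 Hz. If F2 sits near the upper end of that sweep, as the sharp tuning at the high-F0 point suggests, the harmonic is well off the formant peak at the low-F0 point, which is consistent with the weak tuning seen there.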
Our last detail comes from the low note (B3-flat),
which we include for comparison with the similar
note in the first phonation. Visual inspection of the
supraglottal pressure signal reveals that the first formant is located somewhere between the first and
second harmonics, where it has remained throughout this phonation, thus not producing any sharp
tuning effects. The B3-flat note in the other phonation (Figs. 1-3) presents quite a different picture:
with F1 modified upwards in order to match the
second harmonic, the tuning effect produces the
highest SPL of any of the tones displayed here.
Figure 6 gives spectra of the audio signals of the
five descending scale tones. With the possible exception of the high note, with its intermittent F2-tuning, the strongest partial in all cases is F0, indicating that the singer has made an adjustment of
F1 downwards as F0 moves stepwise from 360 to
240 Hz. His success in keeping the scale smooth
can be seen in the nearly uniform supraglottal peak-to-peak amplitude in Fig. 4.
The intermittent F2-tuning within the vibrato cycle cannot be shown well in the spectra because our
spectrum analyzer has a minimum time window of
0.1 s, a time interval that would necessarily include
both strong and weak phases of vibrato.
DISCUSSION
It is not necessary to look far for explanations of
this apparently inconsistent behavior on the part of
the singer. In the first phrase, the singer's customary attention to high notes and the natural tendency
to group notes rhythmically lead us to expect that
the F4 and B3-flats will be produced with more care
than the D4's. Furthermore, the disjunction of the
notes reduces the need to match them carefully in
quality. The orchestral sounds that accompany
much of professional singing create the need for
SPL-enhancing effects; particularly on prominent
notes, the singer would tend to adjust resonances to
match one or another harmonic of the glottal
source. In our first example (Figs. 1-3), the vowel
modification is more than moderate: the /i/ sounds
almost like /e/. This presumably is because the
singer found it necessary to raise the F1 of /i/, which
in speech is normally <300 Hz, to 460 Hz in order
to match the second harmonic of the B3-flat (F0 =
230 Hz). After passing through the transitional D4,
the next important (resonance-demanding) note is
the F4. Although early pedagogies involving formant tuning would suggest the matching of F1 to the
fundamental frequency F0 (350 Hz), the long closed
phase of the glottal cycle in the male "legitimate
head voice" greatly reduces the potential of this
tuning effect (5), and our singer operates with F2
instead. Again, he uses a vigorous vowel modification to lower F2 to 1,450 Hz, where it will match the
fourth harmonic, at least in the high-F0 phase of the
vibrato cycle. The next D4 retains some of the tuning of F2 (now matching the fifth harmonic), and the
final B3-flat returns abruptly to the match of F1 with
the second harmonic.
FIG. 3. Spectra of 0.1-s samples of audio and supraglottal signals of the pitches in Fig. 1, in consecutive order from top to bottom.
(Labels recoverable from the original figure: vowel /i/; columns for supraglottal and audio spectra; vertical scale in dB, 10 dB divisions; horizontal scale 0-5 kHz.)
FIG. 4. Signals of complete phonation of baritone singing descending scale passage (F4, E4-flat, D4, C4, B3-flat).
(Trace labels recoverable from the original figure: audio; supraglottal pressure, +10 to -10 cm H2O; subglottal pressure, 0 to 25 cm H2O; transglottal pressure; EGG; level in dB, 10 dB scale; pitch, F, E-flat, D, C, B-flat on /bi/.)
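The harmonic numbers named for the first phonation follow directly from dividing each formant frequency by F0. The sketch below makes that bookkeeping explicit. The function name is hypothetical. The values of 230 Hz, 350 Hz, 460 Hz, and 1,450 Hz are those given in the text; the D4 frequency of 290 Hz is an assumption (about a major third above the 230 Hz B3-flat), since the paper does not state it.

```python
import math


def nearest_harmonic(formant_hz, f0_hz):
    """Return (n, mistuning_cents): the harmonic number closest to the formant
    and how far that harmonic lies from the formant (negative = below it)."""
    n = max(1, round(formant_hz / f0_hz))
    mistuning_cents = 1200.0 * math.log2(n * f0_hz / formant_hz)
    return n, mistuning_cents


cases = (
    ("B3-flat, F1", 460.0, 230.0),
    ("F4, F2", 1450.0, 350.0),
    ("D4, F2", 1450.0, 290.0),   # D4 frequency is an ASSUMED value
)
for label, formant_hz, f0_hz in cases:
    n, cents = nearest_harmonic(formant_hz, f0_hz)
    print(f"{label}: harmonic {n}, {cents:+.0f} cents from the formant")

# F0 at which the fourth harmonic lands exactly on a 1,450 Hz formant.
print(f"Exact match for harmonic 4 at F0 = {1450.0 / 4:.1f} Hz")
```

At the nominal 350 Hz the fourth harmonic lies about 60 cents below a 1,450 Hz formant, and the match becomes exact at 362.5 Hz, which agrees with the statement that F2 matches the fourth harmonic in the high-F0 phase of the vibrato cycle.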
The musical shape of the second phonation (Figs.
4-6) presents different demands, and the singer responds accordingly. The fact that it is a scale passage without intervening consonants suggests a
more carefully matched set of tones, and maintaining a given "placement" on a descending passage is
something that an accomplished singer is likely to
have practiced a good deal. The slightly lower subglottal pressures also suggest that maximum intensity of acoustic output was not the singer's primary
concern. Listening confirms the absence of the
more extreme vowel modification used in the first
phonation. The uniform supraglottal pressure peak-to-peak amplitudes in the overview, as well as the
shape of that wave in the time-expanded detail, imply that the first formant is kept close to F0, but
that sharp tuning effects are avoided.
Each of the approaches illustrated in these two
phonations has its advantages: In the first, the tuning of the formants to match one of the low harmonics produces notes of considerable intensity; in the
second, the uniform quality of the passage is preserved by avoiding sudden effects of formant tuning. A further advantage of the second approach is
that the vowel is not distorted by extreme vowel
modification. Both approaches can be justified by
the musical contexts in which they appear.
The striking degree of vowel modification employed in producing the more pronounced effects of
what has been called "formant tuning" has led to
the conclusion that the singer modifies his vowels
with a purpose. The most plausible purpose, and
one which is indeed accomplished, is to increase
intensity of the sounds where the modification is
most marked. This does not imply, however, that
the singer has to conceive of the effects in the terms
presented here; indeed, they are more likely to be
thought of in terms of imagery commonly used by
singers, such as "placement."
FIG. 5. Details taken from the signals of Fig. 4 on expanded time scale. (The signals for audio and SPL lag ~1.5 ms behind the other signals.)
(Trace labels recoverable from the original figure: audio; supraglottal pressure, +10 to -10 cm H2O; subglottal pressure, 0 to 25 cm H2O; transglottal pressure; SPL, 10 dB scale; pitch, F, F, E-flat, E-flat, B-flat.)
FIG. 6. Spectra of 0.1-s samples of audio signals of the pitches in Fig. 4.
(Labels recoverable from the original figure: vowel /i/; audio; horizontal scale 0-5 kHz.)
CONCLUSIONS
Two sung phrases of a professional baritone have
been examined for the presence (or absence) of effects caused by the matching of the two lowest resonances of the vocal tract with harmonics of the
glottal source. The signal from a miniature wideband pressure transducer placed just above the glottis provided important information on this tuning.
The effects of this tuning on the singing voice are
considerable, particularly on the sound pressure
level, but also on vowel definition, since the tuning
is implemented by vowel modification. The singer
appears to apply it intentionally, although he is unaware of the acoustic phenomena as they are described.
The extent to which these phenomena are characteristic of this singer or of singers in general is a
matter for further investigation. The importance of
the effects for the singing voice and the insights that
the method provides suggest that application of the
method to other phonations and other singers is
warranted, particularly in light of the fact that the
matching of resonances of the vocal tract with harmonics of the glottal source has been advanced as a
basic principle in recognized pedagogies of singing.
Acknowledgment: The authors wish to express their
gratitude to the following for supporting this research:
ENT Clinic, University Hospital, Groningen (head: Prof.
Dr. P. E. Hoeksema); Speech Laboratory, Department of
Electrical and Computer Engineering, Syracuse University (head: Prof. M. Rothenberg); M. Rothenberg, for
help in analyzing the data. The research reported herein
was (partially) supported by the Foundation for Linguistic Research, which is funded by the Netherlands Organization for Scientific Research (NWO).
REFERENCES
1. Sundberg J. Acoustics of the singing voice. Sci Am 1977;236:82-91.
2. Borden GJ, Harris KS. Speech science primer. Physiology, acoustics, and perception of speech. Baltimore: The Williams & Wilkins Company, 1980:100.
3. Sundberg J. The science of the singing voice. Dekalb, Illinois: Northern Illinois University Press, 1987:41,126.
4. Sundberg J. Observations on a professional soprano singer. Speech Transmission Laboratory Quarterly Progress and Status Report 1973;14:14-24.
5. Miller DG. Some observations on the soprano voice. NATS Journal 1987;May/June:12-15.
6. Schutte HK, Miller DG. The effect of F0/F1 coincidence in soprano high notes on pressure at the glottis. Journal of Phonetics 1986;14:385-92.
7. Oncley P. Dual concept of singing registers. In: Large J, ed. Vocal registers in singing. The Hague: Mouton, 1973:35-44.
8. Coffin B. Overtones of bel canto. The phonetic basis of artistic singing. Metuchen, New Jersey: Scarecrow Press, 1980.
9. Roederer J. Introduction to the physics and psychophysics of music, 2nd ed. New York: Springer-Verlag, 1975:128.
10. Miller DG, Schutte HK. Characteristic patterns of sub- and supra-glottal pressure variations within the glottal cycle. In: Lawrence VL, ed. Transcripts of the XIIIth Symposium: Care of the Professional Voice. New York: The Voice Foundation, 1984;1:70-5.
11. Miller R. English, French, German and Italian techniques of singing: A study in national tonal preferences and how they relate to functional efficiency. Metuchen, New Jersey: Scarecrow Press, 1977:118.