Formant tuning in a professional baritone*

1990, Journal of Voice

Journal of Voice, Vol. 4, No. 3, pp. 231-237
© 1990 Raven Press, Ltd., New York

*D. G. Miller and †H. K. Schutte

*School of Music, Voice Department, Syracuse University, Syracuse, NY, U.S.A.; †Voice Research Laboratory, ENT-Clinic, University Hospital Groningen, Groningen, The Netherlands

Address correspondence and reprint requests to D. G. Miller at Voice Research Lab., PO Box 30.001, NL 9700 RB Groningen, The Netherlands.
This article was presented at the XVIth Symposium: Care of the Professional Voice, New York, June 1987.

Summary: Formant tuning, the use of vowel modification to bring one or both of the two lowest resonances of the vocal tract close to harmonics of the glottal source, is a technique advocated by certain pedagogies of singing. In this experiment, two sung phonations of a professional baritone are examined for evidence of the tuning by simultaneous recording of audio, electroglottographic (EGG), and subglottal and supraglottal pressures by means of wideband miniature pressure transducers on a catheter passed through the glottis. The considerable resonance-enhancing effects of formant tuning appear to be intentionally exploited by the singer in response to the demands of the musical phrase.

Key Words: Formant, Singing, Subglottal pressure, Supraglottal pressure, Pedagogy of singing, Fundamental frequency, Spectrogram.

One of the important contributions of voice science to the theory of the singing voice has been to shed light on the process of tuning the vocal tract, an intuitive act that has been practiced at least as long as skillful singers have striven to be heard against the background of other musical instruments (1). Varying the resonances of the vocal tract to produce the vowels of speech requires little conscious attention; in singing, however, the requirement of sustaining syllables while maintaining an aesthetically acceptable sound forces singers into producing vowels with care. The further requirement of producing a sound of adequate size in a range well above that of speech often makes attention to tuning the resonances of the vocal tract essential for professional survival. By modulating the position of the articulators (primarily tongue, lips, and jaw), the resonances of the vocal tract (called formants and designated F1, F2, F3, etc. in the order of their resonant frequencies) are varied to produce the several vowels of a given language (2). By far the most important resonances in determining the vowel quality are F1 and F2, which in singing have ranges of approximately 250-900 and 800-2,200 Hz, respectively.

Formant tuning, as we shall use the notion here, is bringing one of the first two formants close in frequency to one of the harmonics of the voice source, which occur at multiples of the fundamental frequency (F0). A well-tuned formant can have considerable amplifying power (3), and harmonics falling very close to F1 or F2 will generally have the highest sound pressure level (SPL) in the spectrum of a sung tone. As F0 rises, the increasing distance between harmonics makes formant tuning more critical to the resultant intensity of the singing voice.

The most dramatic resonance-enhancing effects of formant tuning generally occur in the range of F5 to B5-flat (ca. 700-900 Hz), where F1 coincides with F0 on the open (high-F1) vowels. Sundberg (4) has shown how this important effect is extended in female voices by adjusting F1 to match F0 in other (low-F1) vowels through varying the jaw opening. Because of a longer closed phase in the glottal cycle, however, "chest" register production (including male "legitimate head voice") cannot take full advantage of the resonance-enhancing effect of the coincidence of F0 and F1 (footnote 1), and the strongest formant tuning effects in this register are often produced with the second or higher harmonics.

Footnote 1. In "falsetto" production (where the closed phase of the glottis is short or absent), if the first formant is tuned to F0 (the first harmonic), it is possible to achieve high (>100 cm water pressure) peak-to-peak supraglottal pressures. The (sinusoid) F1 component is timed so that it is driven by the time-varying glottal flow in both rising and falling phases, a distinct advantage over "chest" register production, whose long closed phase of the glottis prevents the glottal flow from driving the supraglottal pressure wave in its rising phase. In the case where F1 is tuned to the second harmonic, making two complete F1 cycles per glottal cycle, however, the first of each pair of F1 cycles will be completed in the long (>50%) closed phase of the "chest" register, whereas the second will be driven by the flow in both rising and falling phases. An instance of tuning to the second harmonic can be seen in the B3-flat of Fig. 2. The argument is presented in greater detail in Miller (5). Some nuances of the "falsetto" case are examined in Schutte and Miller (6).

Conscious application of formant tuning to the practice of training singers has been slow in developing. As early as 1969, Oncley (7) explained the "lifts" commonly perceived in singing voices as shifts in the tuning of formants to lower harmonics as these rise with ascending F0. Coffin (8) has built a system of training the singing voice around the notion of tuning the vocal tract. In the cases of both of these authors, however, the theory was developed on the basis of limited knowledge of what actually happens acoustically within the vocal tract in singing and, therefore, is in need of correction. The present experiment introduces one way to examine more closely the phenomenon of vocal tract tuning.

METHOD

Although the higher-frequency components of a sung sound are an important determinant of perceived quality and of "carrying power" (1), the sound pressure level (SPL) is largely a matter of the lower-frequency components, chiefly F1 and, to a lesser extent, F2 (4). The dominance of the lower formants seen in the radiated sound is considerably stronger within the vocal tract, since the low-frequency components, much like those in brass instruments, are subject to greater attenuation as they pass from the vocal tract into the outside air (9). This leads us to expect that tuning of the lower formants will be more readily apparent within the vocal tract than to an outside microphone.

Since 1984, the authors have been measuring the subglottal and supraglottal pressures of voices performing a variety of singing tasks. The authors' procedure (10) consists briefly of recording these pressures with miniature wide-band pressure transducers on a small-diameter catheter introduced through the nares and passing through the posterior commissure of the glottis. The electroglottographic (EGG) signal (measuring vocal fold contact area) and the radiated sound are also recorded simultaneously. The supraglottal pressure signal is measured within 3 cm of the glottis (footnote 2), which functions as a pressure antinode (a point of maximum AC pressure modulations) in standing waves of the vocal tract. Thus, tuning F1 or F2 to match one of the harmonics of the glottal source would result in a strong component at that frequency in the recorded supraglottal pressure waveform.

The subject whose singing pressures are recorded here (not one of the authors) is one of the leading opera singers in the Netherlands.
A man in his middle years with a serious interest in vocal pedagogy, he is hardly a naive subject, but at the time of the experiment his thinking about singing did not include any notion of formant tuning as described above. Neither was he coached by the experimenters, but simply directed to perform some typical vocalize patterns on vowels and CV nonsense syllables, once the catheter was in place and the topical anesthetic needed during the catheter insertion had worn off.

Footnote 2. The 3 cm is an estimate, based on the fact that the 6 cm length of catheter between the transducers is partly occupied by structures of the glottis. In the case of F1, the attenuation of antinodal pressure due to the varying position of the transducer can be neglected; on the other hand, it could be as high as 30% in the case of F2.

RESULTS

In presenting two phonations for closer scrutiny from the many recorded by this subject, it is not the authors' intention to describe precise relationships among quantitative data nor to make any assertion about the statistical frequency of the use of formant tuning among singers; it is rather to call attention to some acoustic mechanisms that appear to be exploited by this skillful and experienced singer. Both phonations cover the range of frequencies between 230 and 380 Hz (roughly B3-flat to F4), a critical range where the baritone normally makes some sort of adjustment in order to arrive at the male "legitimate head voice" (11) by the time he reaches the high F4. The phonations will be examined first in an overview, and then the pressure patterns within the glottal cycle will be looked at.

The first phonation is an arpeggio pattern (do-mi-sol-mi-do) on the syllables /bibibi/. The overview (Fig. 1) reveals an interesting relationship between the subglottal pressure and the sound pressure level.
It is usually assumed that these two variables are closely correlated (3), at least for a constant vowel. In this case, the subglottal pressure on the F4 is about half again as much as that on the B3-flat, whereas the D4 has an intermediate value; nonetheless, the sound pressure level on the high note is slightly lower than that on the low note, and the intermediate D4's are at least six decibels below the other notes in SPL. A clue to the explanation of this unexpected result can be found in the variations of supraglottal pressure within the glottal cycle, shown in Fig. 2.

[FIG. 1. Signals of complete phonation of baritone singing arpeggio (B3-flat, D4, F4, D4, B3-flat). Traces: audio, supraglottal pressure (cm H2O), subglottal pressure, EGG, pitch.]

Although the overview gives a good indication of peak-to-peak (AC) variations in the pressure signals, as well as the progress of average pressures, the details of the glottal cycle are revealed in Fig. 2, with its greatly expanded time scale. Here a few cycles of each of the pitches of the phonation have been juxtaposed to facilitate comparison among them. A strong tuning effect is evident in the B3-flat, where F1 has been adjusted to a relatively high frequency so that it coincides with the second harmonic (see also Fig. 3). Because of the dominance of low formants within the vocal tract, this well-tuned formant produces a high peak-to-peak variation in the supraglottal pressure, which in turn gives this pitch the highest SPL of the three in the phonation, in spite of its relatively low subglottal pressure.

The other striking instance of formant tuning in this phonation occurs on the high note, which has a center frequency of 350 Hz. This time, however, it is not the first, but the second formant that dominates.
Evidence for this is seen in the supraglottal pressure but is even clearer in the audio signal, which indicates an F2 standing wave with a frequency four times that of F0 (see also Fig. 3). The amplitude of the peak-to-peak supraglottal pressure modulations is notably smaller than on the B3-flat, but the higher frequency of the tuned second formant results in a sound pressure level which is similar.

In contrast to these two strongly resonated tones, the sound pressure level of the first D4 is about 6 dB lower, a reduction of one-half in sound pressure, even though the subglottal pressure is relatively high. This is due largely to the relative lack of formant tuning: F0 has moved too high for matching the second harmonic to F1, and there is no evidence of a matching of F2 with higher harmonics in the first sample shown. We have included a second example of the D4, however, occurring after the high note, where F2 matches the fifth harmonic at a certain point in the vibrato cycle, raising the SPL about 4 dB above its minimum. This effect is stronger following the tuning of F2 on the high note than it is preceding it.

The information presented in Fig. 3 confirms these observations. It consists of spectra (produced by a Princeton Applied Research FFT Real Time Spectrum Analyzer model 4512) of the audio signal of the five tones of the vocalization, as well as the supraglottal signal of the B3-flat and F4, where formant tuning is most evident. On these two pitches, the partial that is amplified by a coinciding formant is 10-15 dB stronger than the remaining partials, at least in the microphone signal. In the supraglottal spectra, the preponderance of intensity of the lower partials, much of which is not passed at the lips into the radiated spectrum, gives a different perspective, and F2 tuning has a less evident effect.

Although it is pitched a fraction of a semitone higher, our second example (Fig. 4) covers the same range (from B3-flat to F4), with the same target vowel (/i/).
The pitch sequence is that of a descending scale passage, however, and there is no interruption of the vowel by consonants. Except for an intermittent high-intensity sound on the first pitch, the overview shows a smooth scale, lacking the contrasts and discontinuities of the first example.

[FIG. 2. Details taken from the signals of Fig. 1 on expanded time scale. (The signals for audio and SPL lag ~1.5 ms behind the other signals.) Traces: audio, supraglottal pressure (cm H2O), subglottal pressure, transglottal pressure, EGG, SPL (10 dB scale), pitch.]

Figure 5 gives some details of this overview on an expanded time scale. The high note is shown at a point in the vibrato cycle where F2 is sharply tuned to the fourth harmonic, as well as a point 60 ms later where the F2-tuning is weak. The weak tuning, which is evident in the audio, supraglottal pressure, and SPL signals, coincides with a drop in F0 from 380 to 340 Hz in the course of the vibrato cycle. There are also two examples taken from the second note (E4-flat), illustrating stronger and weaker F2-tuning within the vibrato cycle. In the better tuned part, F2 is matched with the fifth harmonic, but the effect is weaker than in the examples where the tuning occurs at harmonics lower in the series.

Our last detail comes from the low note (B3-flat), which we include for comparison with the similar note in the first phonation. Visual inspection of the supraglottal pressure signal reveals that the first formant is located somewhere between the first and second harmonics, where it has remained throughout this phonation, thus not producing any sharp tuning effects. The B3-flat note in the other phonation (Figs. 1-3) presents quite a different picture: with F1 modified upwards in order to match the second harmonic, the tuning effect produces the highest SPL of any of the tones displayed here.

Figure 6 gives spectra of the audio signals of the five descending scale tones. With the possible exception of the high note, with its intermittent F2-tuning, the strongest partial in all cases is the F0, indicating that the singer has made an adjustment of F1 downwards as F0 moves stepwise from 360 to 240 Hz. His success in keeping the scale smooth can be seen in the nearly uniform supraglottal peak-to-peak amplitude in Fig. 4. The intermittent F2-tuning within the vibrato cycle cannot be shown well in the spectra because our spectrum analyzer has a minimum time window of 0.1 s, a time interval that would necessarily include both strong and weak phases of vibrato.

DISCUSSION

It is not necessary to look far for explanations of this apparently inconsistent behavior on the part of the singer. In the first phrase, the singer's customary attention to high notes and the natural tendency to group notes rhythmically lead us to expect that the F4 and B3-flats will be produced with more care than the D4's. Furthermore, the disjunction of the notes reduces the need to match them carefully in quality. The orchestral sounds that accompany much of professional singing create the need for SPL-enhancing effects; particularly on prominent notes, the singer would tend to adjust resonances to match one or another harmonic of the glottal source.

In our first example (Figs. 1-3), the vowel modification is more than moderate: the /i/ sounds almost like /e/. This presumably is because the singer found it necessary to raise the F1 of /i/, which in speech is normally <300 Hz, to 460 Hz in order to match the second harmonic of the B3-flat (F0 = 230 Hz).
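The arithmetic behind this match can be sketched in a few lines of Python. This is only an illustration of the harmonic relationship described above, not the authors' analysis procedure: the function name `nearest_harmonic` is hypothetical, and the value of 300 Hz for the unmodified F1 of /i/ is an assumed stand-in for the "<300 Hz" speech value. The last loop simply lists where the fourth harmonic falls at the two extremes of the vibrato swing (380 and 340 Hz) reported in the Results.

```python
def nearest_harmonic(f0_hz, formant_hz):
    """Return (harmonic number, harmonic frequency in Hz,
    formant minus harmonic in Hz) for the harmonic of f0_hz
    that lies closest to formant_hz."""
    # Harmonics occur at integer multiples of F0; harmonic 1 is F0 itself.
    n = max(1, round(formant_hz / f0_hz))
    harmonic_hz = n * f0_hz
    return n, harmonic_hz, formant_hz - harmonic_hz


# B3-flat from the first phonation: F0 = 230 Hz.
# Assumed unmodified F1 of /i/ (300 Hz is an illustrative stand-in for "<300 Hz").
print(nearest_harmonic(230.0, 300.0))  # (1, 230.0, 70.0)
# F1 raised to 460 Hz, as reported in the text: exact match with harmonic 2.
print(nearest_harmonic(230.0, 460.0))  # (2, 460.0, 0.0)

# Fourth harmonic at the extremes of the vibrato swing on the high note.
for f0 in (380.0, 340.0):
    print(f0, 4 * f0)  # 380.0 1520.0, then 340.0 1360.0
```

With these assumed values, the unmodified F1 lies 70 Hz above the fundamental and 160 Hz below the second harmonic, so neither is well matched; after the modification the offset to the second harmonic is zero. The 160 Hz excursion of the fourth harmonic over one vibrato cycle is consistent with the intermittent F2-tuning seen on the high note.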
After passing through the transitional D4, the next important (resonance-demanding) note is the F4. Although early pedagogies involving formant tuning would suggest the matching of F1 to the fundamental frequency F0 (350 Hz), the long closed phase of the glottal cycle in the male "legitimate head voice" greatly reduces the potential of this tuning effect (5), and our singer operates with F2 instead. Again, he uses a vigorous vowel modification to lower F2 to 1,450 Hz, where it will match the fourth harmonic, at least in the high-F0 phase of the vibrato cycle. The next D4 retains some of the tuning of F2 (now matching the fifth harmonic), and the final B3-flat returns abruptly to the match of F1 with the second harmonic.

[FIG. 3. Spectra of 0.1-s samples of audio and supraglottal signals of the pitches in Fig. 1, in consecutive order from top to bottom. Vowel /i/; panels: audio, supraglottal; axes: level (dB) against frequency (0-5 kHz).]

[FIG. 4. Signals of complete phonation of baritone singing descending scale passage (F4, E4-flat, D4, C4, B3-flat). Traces: audio, supraglottal pressure (cm H2O), subglottal pressure, transglottal pressure, EGG, SPL (dB), pitch.]

The musical shape of the second phonation (Figs. 4-6) presents different demands, and the singer responds accordingly. The fact that it is a scale passage without intervening consonants suggests a more carefully matched set of tones, and maintaining a given "placement" on a descending passage is something that an accomplished singer is likely to have practiced a good deal.
The slightly lower subglottal pressures also suggest that maximum intensity of acoustic output was not the singer's primary concern. Listening confirms the absence of the more extreme vowel modification used in the first phonation. The uniform supraglottal pressure peak-to-peak amplitudes in the overview, as well as the shape of that wave in the time-expanded detail, imply that the first formant is kept close to F0, but that sharp tuning effects are avoided.

Each of the approaches illustrated in these two phonations has its advantages: In the first, the tuning of the formants to match one of the low harmonics produces notes of considerable intensity; in the second, the uniform quality of the passage is preserved by avoiding sudden effects of formant tuning. A further advantage of the second approach is that the vowel is not distorted by extreme vowel modification. Both approaches can be justified by the musical contexts in which they appear.

The striking degree of vowel modification employed in producing the more pronounced effects of what has been called "formant tuning" has led to the conclusion that the singer modifies his vowels with a purpose. The most plausible purpose, and one which is indeed accomplished, is to increase intensity of the sounds where the modification is most marked. This does not imply, however, that the singer has to conceive of the effects in the terms presented here; indeed, they are more likely to be thought of in terms of imagery commonly used by singers, such as "placement."

[FIG. 5. Details taken from the signals of Fig. 4 on expanded time scale. (The signals for audio and SPL lag ~1.5 ms behind the other signals.) Traces: audio, supraglottal pressure (cm H2O), subglottal pressure, transglottal pressure, EGG, SPL (10 dB scale), pitch.]

[FIG. 6. Spectra of 0.1-s samples of audio signals of the pitches in Fig. 4. Vowel /i/; panel: audio; frequency axis 0-5 kHz.]

CONCLUSIONS

Two sung phrases of a professional baritone have been examined for the presence (or absence) of effects caused by the matching of the two lowest resonances of the vocal tract with harmonics of the glottal source. The signal from a miniature wideband pressure transducer placed just above the glottis provided important information on this tuning. The effects of this tuning on the singing voice are considerable, particularly on the sound pressure level, but also on vowel definition, since the tuning is implemented by vowel modification. The singer appears to apply it intentionally, although he is unaware of the acoustic phenomena as they are described. The extent to which these phenomena are characteristic of this singer or of singers in general is a matter for further investigation. The importance of the effects for the singing voice and the insights that the method provides suggest that application of the method to other phonations and other singers is warranted, particularly in light of the fact that the matching of resonances of the vocal tract with harmonics of the glottal source has been advanced as a basic principle in recognized pedagogies of singing.

Acknowledgment: The authors wish to express their gratitude to the following for supporting this research: ENT Clinic, University Hospital, Groningen (head: Prof. Dr. P. E. Hoeksema); Speech Laboratory, Department of Electrical and Computer Engineering, Syracuse University (head: Prof. M. Rothenberg); M. Rothenberg, for help in analyzing the data.
The research reported herein was (partially) supported by the Foundation for Linguistic Research, which is funded by the Netherlands Organization for Scientific Research (NWO).

REFERENCES

1. Sundberg J. Acoustics of the singing voice. Sci Am 1977;236:82-91.
2. Borden GJ, Harris KS. Speech science primer. Physiology, acoustics, and perception of speech. Baltimore: The Williams & Wilkins Company, 1980:100.
3. Sundberg J. The science of the singing voice. Dekalb, Illinois: Northern Illinois University Press, 1987:41,126.
4. Sundberg J. Observations on a professional soprano singer. Speech Transmission Laboratory Quarterly Progress and Status Report 1973;14:14-24.
5. Miller DG. Some observations on the soprano voice. NATS Journal 1987;May/June:12-15.
6. Schutte HK, Miller DG. The effect of F0/F1 coincidence in soprano high notes on pressure at the glottis. Journal of Phonetics 1986;14:385-92.
7. Oncley P. Dual concept of singing registers. In: Large J, ed. Vocal registers in singing. The Hague: Mouton, 1973:35-44.
8. Coffin B. Overtones of bel canto. The phonetic basis of artistic singing. Metuchen, New Jersey: Scarecrow Press, 1980.
9. Roederer J. Introduction to the physics and psychophysics of music, 2nd ed. New York: Springer-Verlag, 1975:128.
10. Miller DG, Schutte HK. Characteristic patterns of sub- and supra-glottal pressure variations within the glottal cycle. In: Lawrence VL, ed. Transcripts of the XIIIth Symposium: Care of the Professional Voice. New York: The Voice Foundation, 1984;1:70-5.
11. Miller R. English, French, German and Italian techniques of singing: A study in national tonal preferences and how they relate to functional efficiency. Metuchen, New Jersey: Scarecrow Press, 1977:118.