
Children recovered from stuttering without formal treatment: perceptual assessment of speech normalcy

1997, Journal of speech, language, and hearing research : JSLHR


JSLHR, Volume 40, 867–876, August 1997

Children Recovered From Stuttering Without Formal Treatment: Perceptual Assessment of Speech Normalcy

Patrick Finn
University of New Mexico, Albuquerque

Roger J. Ingham
University of California, Santa Barbara

Nicoline Ambrose
Ehud Yairi
University of Illinois, Champaign-Urbana

Current evidence suggests that young children who recover from stuttering are essentially stutter-free. However, there is no evidence to indicate if their speech is perceptually indistinguishable from normally fluent peers or whether they retain perceptually unusual speech. One important example of recovery from stuttering is children who have recovered without receiving formal treatment. An investigation was conducted to determine if the speech of these children is perceptually different from the speech of children who have never stuttered. Speakers consisted of 10 preschool and early school-age children documented as recovered from stuttering without benefit of formal treatment. In a series of studies they were compared with 10 children who had never stuttered. Three groups of judges (sophisticated, unsophisticated, and experienced) were separately asked, using videotaped speech samples of the children, to decide which samples were from children who used to stutter. Results revealed that the children who recovered from stuttering were perceptually indistinguishable from the normal controls. The same result was obtained regardless of whether the samples were presented in paired-stimulus or single-stimulus mode. Two of the groups of judges were also instructed to rate the speech naturalness of the speech samples. The speakers were not distinguished on this measure either. Methodological issues and the implications of the findings are discussed.
KEY WORDS: spontaneous recovery, speech naturalness, speech fluency

©1997, American Speech-Language-Hearing Association 1092-4388/97/4004-0867. Journal of Speech, Language, and Hearing Research. Downloaded From: http://jslhr.pubs.asha.org/ by University of California, Santa Barbara, Roger Ingham on 04/16/2014

It is generally believed that the nature of developmental stuttering is different for children who stutter relative to adults who stutter (Bloodstein, 1995; Van Riper, 1982). Children who stutter are usually described as having an impairment that is more amenable to change than adults who stutter. The suspected reason for this difference between the two age groups is the chronicity of their impairment. Relative to onset, children who stutter have experienced the impairment for a shorter time than adults who stutter. An important result of this difference is that children are believed more likely to attain a complete recovery from the disorder (Bloodstein, 1995). Moreover, this recovery may occur in some cases without the assistance of formal treatment. In contrast, adults are less likely to attain the same degree of recovery and they are more likely to retain residual characteristics of the disorder, even if they have received formal treatment (Wingate, 1976). This means that the recovered speech of children is likely to be more comparable to the age-appropriate, normally fluent speech of their peers than is the recovered speech of adults when compared to their normal peers.

Various clinical and anecdotal accounts of unassisted and assisted recovery from early childhood stuttering have suggested that young children can attain essentially stutter-free speech (e.g., Bloodstein, 1995; Onslow, Andrews, & Lincoln, 1994; Yairi & Ambrose, 1992). Notwithstanding these positive accounts, to the best of our knowledge there is no laboratory evidence that young children who recover from stuttering would be judged as normally fluent.
The fact that they may be stutter-free does not mean they are necessarily normally fluent and natural sounding (Finn & Ingham, 1989). Specifically, there are no perceptual data which show that children who recover from stuttering are indistinguishable from normal speakers. This is a concern because there is considerable perceptual and anecdotal evidence that adults who recover from stuttering often have presented with speech that was distinguishable from that of normal speakers (e.g., Ingham & Packman, 1978; Runyan & Adams, 1978, 1979; Wingate, 1976). Even in cases where the speech of successfully treated adults was not perceptually different from normal speakers, their speech naturalness was still judged as sounding significantly more unnatural than that of normal speakers (Ingham, Gow, & Costello, 1985).

The factors responsible for recovery from stuttering in children are still not understood. There have been reports suggesting that children who stutter are highly responsive to a wide variety of ameliorative stimuli ranging from shadowed and rhythmic speech to response-contingent stimulation (Ingham, 1984). An important, and possibly different, example is the phenomenon of unassisted recovery. Several studies have documented a substantial number of children who recovered from stuttering without receiving formal treatment (Andrews & Harris, 1964; Glasner & Rosenthal, 1957; Johnson & Associates, 1959; Yairi & Ambrose, 1992; Yairi, Ambrose, & Niermann, 1993). Systematic investigation of these children could be useful for establishing whether their recovery has in fact resulted in speech that is indistinguishable from normally fluent speakers.

One commonly used method for determining differences between children or adults who stutter and their matched normal peers is a perceptual judgment task. Typically, such a judgment task will require judges to distinguish perceptually between speech samples obtained from two types of speakers.
Two factors must be considered when constructing this task. First, the judges’ level of sophistication and experience is important because this could affect their ability to perceptually distinguish between types of speakers. In turn, this could affect the meaningfulness of the findings. For example, perceived differences between speakers might be viewed as less consequential if the differences were so small and subtle that they could be detected only by highly sophisticated judges (Runyan & Adams, 1979). Second, the type of perceptual task (discrimination or identification) must be considered. The discrimination task (paired-stimulus paradigm) requires observers to differentiate speakers when presented with pairs of samples, one from each type of speaker. The identification task (single-stimulus paradigm) requires judges to identify the type of speaker when presented with individual samples from both types of speakers. Past research contrasting these two tasks with samples from children or adults who stutter has had mixed results: Colcord and Gregory (1987) and Runyan, Hames, and Prosek (1982) reported no differences, whereas Wendahl and Cole (1961) and Young (1964) obtained significant differences for task effect. So far, these two tasks have not been examined with children believed to have recovered from stuttering.

Therefore, the purpose of this investigation was fourfold: First, to determine if the speech of children who have recovered from stuttering without formal treatment was perceptually different from the speech of children who had never stuttered. Second, to determine if the speech of these children was judged perceptually different depending on the sophistication and experience of the judges. Third, to determine if the speech of these children was judged perceptually different depending on the type of perceptual task (discrimination or identification).
Finally, to determine if their speech naturalness was judged to be different.

Method

Speakers

Two groups of preschool through early school-age children provided the speech samples for all studies. They were originally participants in a longitudinal study investigating the speech characteristics of early childhood stuttering at the University of Illinois. For that study, they were videotaped speaking with an adult, usually a parent, while sitting at a table playing with a standard set of toys (see Yairi & Ambrose, 1992). These samples were obtained on repeated occasions over several years.

Children Recovered From Stuttering (RS)

To participate in the present study, speakers met the following selection criteria described by Yairi, Ambrose, Paden, and Throneburg (1996). First, they were initially judged as children with developmental stuttering. Stuttering criteria included (a) the parent(s) judged the child as stuttering, (b) two speech-language pathologists independently judged the child as stuttering, (c) the child’s stuttering was rated greater than mild in severity, and (d) the child’s speech contained a minimum of three stuttering-like disfluencies (e.g., part-word and single-syllable word repetitions, sound prolongations, and silent blocks) per 100 syllables.

Second, they recovered from stuttering without exposure to formal speech treatment. At the conclusion of the initial evaluation, parents were given basic information about stuttering, told that their child might or might not spontaneously recover, and advised about various beliefs regarding early stuttering, including the view that talking slower to the child might be beneficial in promoting fluency.
The option of formal treatment was offered, but for various reasons parents chose not to seek treatment. Nonetheless, all parents continued to participate in the longitudinal study, which required them to have their child videotaped at least every 6 months.

Third, they were later judged as recovered from stuttering. Recovery criteria (Yairi et al., 1996) included (a) the parent(s) judged the child as no longer stuttering and rated the child’s speech as normally disfluent, (b) a speech-language pathologist also judged the child as no longer stuttering and rated the child’s speech as normally disfluent, and (c) stuttering-like disfluencies were 2.99 or fewer per 100 syllables.

Using these three sets of criteria, speakers were classified as children who recovered from stuttering (RS) without formal treatment. As can be seen in Table 1, RS speakers consisted of 7 males and 3 females. At the initial evaluation, two speech-language pathologists rated each speaker’s stuttering severity on an 8-point scale (where 0 = normal disfluency, 1 = very mild stuttering, and 7 = very severe stuttering). The severity ratings ranged from 2.26 to 5.87, with a mean of 3.63. Average age at onset of stuttering was 30 months, average age at recovery was 53 months, average duration between onset and recovery was 22.6 months, and average age at time of videotaped speech sample was 59 months (range = 39 to 80 months).

Children Who Never Stuttered (NS)

Ten children who did not have a history of stuttering and were judged by their parents and a speech-language pathologist as having age-appropriate speech and language skills were matched with the RS for sex and age within 2 months. At the time of the videotaping, NS average age was 59 months (range = 39 to 79 months).
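The three sets of selection criteria for the RS group can be read as a simple decision rule. The sketch below is only an illustration of that rule, not the procedure used by the authors: the class and function names are hypothetical, and the clinical judgments (parent and clinician opinions, "greater than mild" severity) are represented as plain booleans and counts because the text does not give numeric cut-offs for them. Only the two disfluency thresholds (at least 3 per 100 syllables at intake, 2.99 or fewer at recovery) come from the criteria above.

```python
from dataclasses import dataclass

# Thresholds taken from the stated criteria (stuttering-like disfluencies, SLD).
STUTTERING_MIN_SLD_PER_100 = 3.0   # intake criterion (d)
RECOVERY_MAX_SLD_PER_100 = 2.99    # recovery criterion (c)


def sld_per_100_syllables(sld_count: int, syllable_count: int) -> float:
    """Rate of stuttering-like disfluencies per 100 spoken syllables."""
    if syllable_count <= 0:
        raise ValueError("syllable_count must be positive")
    return 100.0 * sld_count / syllable_count


@dataclass
class IntakeEvaluation:            # hypothetical record, for illustration only
    parent_judged_stuttering: bool
    slps_judging_stuttering: int   # number of clinicians who judged stuttering
    rated_greater_than_mild: bool  # numeric cut-off not given in the text
    sld_per_100: float


@dataclass
class FollowUpEvaluation:          # hypothetical record, for illustration only
    parent_judged_recovered: bool
    slp_judged_recovered: bool
    sld_per_100: float


def met_stuttering_criteria(e: IntakeEvaluation) -> bool:
    return (e.parent_judged_stuttering
            and e.slps_judging_stuttering >= 2
            and e.rated_greater_than_mild
            and e.sld_per_100 >= STUTTERING_MIN_SLD_PER_100)


def met_recovery_criteria(f: FollowUpEvaluation) -> bool:
    return (f.parent_judged_recovered
            and f.slp_judged_recovered
            and f.sld_per_100 <= RECOVERY_MAX_SLD_PER_100)


def classified_as_rs(intake: IntakeEvaluation,
                     follow_up: FollowUpEvaluation,
                     received_formal_treatment: bool) -> bool:
    """True only if all three sets of criteria hold."""
    return (met_stuttering_criteria(intake)
            and not received_formal_treatment
            and met_recovery_criteria(follow_up))
```

For example, a child with 6 disfluencies in a 200-syllable intake sample has a rate of exactly 3.0 per 100 syllables and so just meets intake criterion (d).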
Perceptual Discrimination Task (Paired Stimulus)

Using a discrimination (paired-stimulus) task, sophisticated judges were asked to determine if the speech of children who recovered from stuttering without formal treatment was distinguishable from the speech of normally fluent children.

Sophisticated Judges

The sophisticated judges consisted of 12 graduate students in speech-language pathology (11 females, 1 male; mean age = 34.1 years; range = 24–51 years). The judges were classified as sophisticated because they had successfully completed a graduate course on stuttering.

Speaker Stimulus Videotape

For the stimulus videotape, full-face speech samples of the child were obtained from the child-adult videotaped dialogues. RS speakers were paired with their respective NS speakers. Speech samples were selected from the videotaped dyads that involved the longest continuous segments during which the child was speaking and the adult listener was offering the fewest responses. Each sample pair was matched for number of syllables spoken (mean = 66.2 syllables; range across sample pairs = 38–85 syllables). Average sample duration was 1 min. Sample order within and across matched pairs was randomized. A second stimulus tape with randomized sample order was prepared for reliability purposes.

Procedure

The sophisticated judges performed two experimental tasks: a discrimination task and a speech descriptor task. Both tasks were performed after observing each pair of samples of an RS and NS speaker. For reliability purposes, the judges repeated the tasks on the same samples 8 weeks later.

Table 1. RS speakers: age at onset and recovery; duration between onset and recovery; and age at which videotaped speech sample was obtained (all ages are in months).

Speaker   Sex   Average stuttering   Age at   Age at     Months between        Age at
                severity(a)          onset    recovery   onset and recovery    speech sample
1         m     4.87                 26       63         37                    63
2         m     2.55                 42       68         26                    80
3         m     2.26                 32       59         27                    59
4         m     2.35                 45       58         13                    69
5         f     3.19                 26       49         23                    55
6         m     4.86                 24       70         46                    70
7         m     2.53                 26       34         8                     52
8         f     3.77                 28       48         20                    61
9         f     5.87                 30       43         13                    43
10        m     4.05                 26       39         13                    39
Mean            3.63                 30       53         22.6                  59

(a) Speakers were rated on an 8-point stuttering severity scale where 0 = normal disfluency, 1 = very mild stuttering, and 7 = very severe stuttering.

Perceptual discrimination task. For the discrimination task, the judges were told that one child from each sample pair used to have a stuttering problem and that the other child never had a stuttering problem. After observing each pair of speakers, judges were instructed to decide which child used to have a stuttering problem. They were not told that the speakers were believed to have recovered from stuttering without the assistance of treatment.

Speech descriptor task. After deciding which one of the pair of speakers used to stutter, the judges were instructed to write down a brief description of the speech characteristics or communicative behaviors of that child which helped them make their choice. Judges were given as much time as necessary to write their statements. For both tasks, judges were instructed to make their decisions independently of the other judges.

Perceptual Identification Task (Single Stimulus)

Using an identification task, two groups of independent judges (unsophisticated and experienced) were asked to determine if the speech samples were obtained from children who recovered from stuttering or were from normal speakers.
They were also asked to rate the speech naturalness of the speech samples.

Unsophisticated Judges

The unsophisticated judges consisted of 26 graduate and undergraduate students (21 females, 5 males; mean age = 33.0 years; range = 22–46 years) majoring in speech-language pathology. These judges were classified as unsophisticated because they had not completed a graduate course on stuttering or observed a client who stuttered.

Experienced Judges

The experienced judges consisted of 14 speech-language pathologists (12 females, 2 males; mean age = 42.1 years; range = 34–57 years). The main criterion for participation was that the judges during the last 5 years of clinical experience had worked primarily with preschool or early school-age children, but not necessarily with children who stuttered. The average years of experience was 11.1 (range = 5.5–17 years). Therefore, these judges were classified as experienced.

Speaker Stimulus Videotape

For the stimulus videotape, speech samples were obtained from the speakers who participated in the discrimination task. Again, the speech samples were those with the longest continuous segments of child speech with the fewest responses from the adult listener in the video dyad. However, all samples were matched for number of syllables spoken (63 ± 2 syllables). Average sample duration was 1 min. Samples were randomly ordered and separated by a 5-s pause. Two stimulus tapes that included the same samples, but arranged in different random orders, were prepared for separate experimental tasks to be described below. Two additional stimulus tapes, also with samples arranged in random order, were prepared for reliability purposes.

Procedure

The unsophisticated and experienced judges independently performed the same two experimental tasks: an identification task and a speech-naturalness rating task. These tasks were performed separately with a short rest period between them.
Identical speech samples were presented for each task except the sample order was randomized. Judges were not informed that identical samples would be observed in both tasks. The order of the tasks was counterbalanced across judges. For reliability purposes, the judges repeated the tasks on the same samples one week later.

Perceptual identification task. Judges were told they would view speech samples of children who used to stutter and children who had never stuttered. For each speech sample, they were instructed to independently decide if the speech sample was obtained from a child who used to stutter or a child who had never stuttered. They were not told that the speakers were believed to have recovered from stuttering without the assistance of treatment.

Speech naturalness rating task. Judges were told they would view speech samples of children who used to stutter and children who had never stuttered. They were instructed to rate the speech naturalness of each child’s speech using a 9-point scale where 1 represented highly natural sounding and 9 represented highly unnatural sounding. Rating instructions were identical to those described by Martin, Haroldson, and Triden (1984).

Results

Perceptual Discrimination Task: Sophisticated Judges

The frequency of the sophisticated judges’ correct and incorrect responses by speaker type was tallied. The percentage of total correct responses for RS speakers was 45.0%. A chi-square analysis revealed that this was not significantly different (χ² = 2.40; df = 1; p = .12) from responses expected to occur by chance (50%).
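The comparison with chance can be illustrated with a one-degree-of-freedom goodness-of-fit chi-square. The following is a minimal sketch, not the authors' analysis script. It assumes 240 responses in total (12 judges × 10 pairs × 2 occasions), a figure the text does not state but which reproduces the reported statistic; the function name is hypothetical. For df = 1 the p value can be obtained from the complementary error function in the standard library.

```python
import math


def chi_square_vs_chance(n_correct: int, n_total: int, p_chance: float = 0.5):
    """Goodness-of-fit chi-square (df = 1) for correct vs. incorrect counts."""
    n_incorrect = n_total - n_correct
    expected_correct = n_total * p_chance
    expected_incorrect = n_total * (1.0 - p_chance)
    chi2 = ((n_correct - expected_correct) ** 2 / expected_correct
            + (n_incorrect - expected_incorrect) ** 2 / expected_incorrect)
    # For df = 1, P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# ASSUMPTION: 240 total responses, of which 45.0% (108) were correct.
chi2, p_value = chi_square_vs_chance(n_correct=108, n_total=240)
print(f"chi-square = {chi2:.2f}, p = {p_value:.2f}")
```

Under that assumption the sketch gives a chi-square of 2.40 and a p value of about .12, in line with the values reported above.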
Reliability of the correct responses was determined by comparing each judge’s first and second ratings for the same speaker across the two rating occasions. Results revealed a mean intrarater agreement level of 60.0%. The percent of correct responses had decreased from 45.0% to 38.3% across the two occasions.

Speech Descriptor Task: Sophisticated Judges

Speech characteristics described by sophisticated judges as the basis for their selection of a child as used to stutter were examined for consistency with stuttering. The descriptors across both occasions were classified on the basis of four categories: (a) characteristic of stuttering (e.g., behaviors that typify the problem of stuttering), (b) characteristic of stuttering treatment outcome (e.g., behaviors that might typify residuals from having received treatment for stuttering), (c) characteristic of communicative disorders other than stuttering, and (d) other (e.g., not consistent with any categories). A total of 265 statements were categorized by the first author. Results showed that (a) 59.7% of the judges’ statements were consistent with stuttering (e.g., “repetition of /p/ phoneme on two different words”), (b) 24.5% were consistent with treatment outcome (e.g., “very difficult to tell—perhaps the revisions on ‘p’ words were leftovers from treatment”), (c) 7.5% were consistent with other communicative disorders (e.g., “short MLU, didn’t seem to want to speak as much, a lot of single word answers”), and (d) 8.3% were unclassifiable (e.g., “hard to pick one”). A graduate student who had completed a stuttering course but had not participated in any other part of this study independently categorized the judges’ statements.
Comparison with the first author’s judgments revealed 78.7% interrater agreement.

Perceptual Identification Task: Unsophisticated Judges

The frequency of “never stuttered” judgments by the unsophisticated judges was determined for each speaker (see Appendix for individual data). Mean percent of “never stuttered” judgments for RS speakers was 72.3% (SD = 17.1, range = 42.3–88.5%) and for NS speakers was 72.7% (SD = 15.2, range = 46.2–96.2%). A t test revealed that the difference between means was nonsignificant, t(18) = –.05, p = .96. Mean percent of “used to stutter” judgments was 27.7% for RS speakers and 27.3% for NS speakers.

Reliability was determined by comparing each judge’s first and second ratings across occasions for the same speaker. Mean intrarater agreement was 76.2% (range = 60–95%).

Perceptual Identification Task: Experienced Judges

The frequency of “never stuttered” judgments by the experienced judges was determined for each speaker (see Appendix for individual data). Mean percent of “never stuttered” judgments was 68.6% (SD = 25.0, range = 35.7–100%) for RS speakers and 72.9% for NS speakers (SD = 15.7, range = 50.0–100%). A t test revealed the difference between means was nonsignificant, t(18) = –.46, p = .65. The mean percent of “used to stutter” judgments was 31.4% for RS speakers and 27.1% for NS speakers.

Intrarater agreement was determined by comparing each experienced judge’s first and second ratings across occasions for the same speaker. Mean intrarater agreement was 79.6% (range = 65–95%).

Speech Naturalness Rating Task: Unsophisticated Judges

Average speech naturalness ratings by the unsophisticated judges were estimated for each speaker (see Appendix for individual data). The average speech naturalness rating for RS speakers was 4.24 (SD = 1.22, range = 2.65–6.88) and for NS speakers was 3.82 (SD = 1.05, range = 2.08–5.19). The difference between means was nonsignificant, t(18) = .83, p = .42.
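The t values reported in this section can be approximately recovered from the published means and standard deviations. The sketch below assumes a pooled-variance independent-samples t test with 10 speakers per group, which is consistent with the reported 18 degrees of freedom but is not stated explicitly in the text; the function name is hypothetical. Because the inputs are rounded summary statistics, the results may differ from the published values in the second decimal place.

```python
import math


def pooled_t(mean1: float, sd1: float, n1: int,
             mean2: float, sd2: float, n2: int):
    """Independent-samples t statistic with pooled variance, from summary data."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / standard_error, df


# RS vs. NS speakers, 10 per group (values copied from the Results text).
comparisons = {
    "identification, unsophisticated": (72.3, 17.1, 72.7, 15.2),
    "identification, experienced": (68.6, 25.0, 72.9, 15.7),
    "naturalness, unsophisticated": (4.24, 1.22, 3.82, 1.05),
}
for label, (m_rs, sd_rs, m_ns, sd_ns) in comparisons.items():
    t, df = pooled_t(m_rs, sd_rs, 10, m_ns, sd_ns, 10)
    print(f"{label}: t({df}) = {t:.2f}")
```

A pure standard-library version is used here so that the arithmetic is visible; a statistics package would normally also supply the p value.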
To determine intrarater agreement, each judge’s first and second ratings for the same speaker across occasions were compared. An acceptable level of agreement was defined as ratings that were identical or differed by no more than ±1 rating score (Martin et al., 1984). Using this criterion, mean intrarater agreement was 64.0% (see Table 2).

For interrater agreement, each judge’s rating of a speaker was compared with the ratings of the same speaker by the other judges. An acceptable level of agreement was defined as ratings that were identical or differed by no more than ±1 rating score (Martin et al., 1984). Using this criterion, mean interrater agreement was only 40.6% (see Table 2). To determine if unreliable judges were influencing this outcome, a reanalysis of interrater agreement was performed on the basis of judges who demonstrated at least 80% intrarater agreement (n = 7). The findings indicated that interrater agreement at the ±1 level actually decreased to 38.3%.

Table 2. Unsophisticated judges: cumulative number and percentage (in parentheses) of intrarater and interrater agreements for speech-naturalness rating scores.

              Intrarater agreement          Interrater agreement
Difference    RS            NS              RS             NS
±0            80 (30.8)     86 (33.1)       446 (13.7)     543 (16.7)
±1.0          161 (61.9)    172 (66.2)      1254 (38.6)    1388 (42.7)
±2.0          198 (76.2)    210 (80.8)      1929 (59.4)    2021 (62.2)
±3.0          226 (86.9)    230 (88.5)      2411 (74.2)    2448 (75.3)
±4.0          243 (93.5)    245 (94.2)      2764 (85.0)    2803 (86.2)
±5.0          252 (96.9)    256 (98.5)      3030 (93.2)    3056 (94.0)
±6.0          257 (98.8)    259 (99.6)      3172 (97.6)    3175 (97.7)
±7.0          259 (99.6)    260 (100)       3241 (99.7)    3229 (99.4)
±8.0          260 (100)     -               3250 (100)     3250 (100)

Speech Naturalness Rating Task: Experienced Judges

Average speech naturalness ratings by the experienced judges were estimated for each speaker (see Appendix for individual data). The average speech naturalness rating for RS speakers was 3.71 (SD = 1.29, range = 2.57–6.93) and for NS was 3.24 (SD = 0.97, range = 1.64–4.71). The difference between means was nonsignificant, t(18) = .91, p = .37.

Intrarater agreement, based on ratings that were identical or differed by no more than ±1 rating score, was 65.0% (see Table 3). Interrater agreement, based on ratings that were identical or differed by no more than ±1 rating score, was only 45.1% (see Table 3). It was not possible to examine the influence of unreliable judges because an insufficient number of judges (n = 3) demonstrated 80% intrarater agreement.

Table 3. Experienced judges: cumulative number and percentage (in parentheses) of intrarater and interrater agreements for speech-naturalness rating scores.

              Intrarater agreement          Interrater agreement
Difference    RS            NS              RS             NS
±0            46 (32.9)     49 (35.0)       143 (15.7)     180 (19.8)
±1.0          92 (65.7)     90 (64.3)       406 (44.6)     414 (45.4)
±2.0          112 (80.0)    116 (82.9)      576 (63.3)     578 (63.5)
±3.0          129 (92.1)    131 (93.6)      709 (77.9)     670 (73.6)
±4.0          136 (97.1)    135 (96.4)      803 (88.2)     784 (86.2)
±5.0          140 (100)     139 (99.3)      868 (95.4)     855 (93.9)
±6.0          -             140 (100)       892 (98.0)     881 (96.8)
±7.0          -             -               910 (100)      904 (99.3)
±8.0          -             -               -              910 (100)

Correlations Between Experienced and Unsophisticated Judges

A correlational analysis between the perceptual judgments of never stuttered from experienced and unsophisticated judges revealed a significant positive correlation, r = .76, p < .001. There was also a significant positive correlation between the two groups of judges’ speech naturalness ratings, r = .79, p < .001.

Discussion

The main purpose of this investigation was to determine if the speech of children who had recovered from stuttering without formal treatment was perceptually different from the speech of children who had never stuttered. Results showed that the children who recovered from stuttering were not distinguished from their nonstuttering peers. The same result was obtained regardless of the type of perceptual task (discrimination or identification) or the judges’ level of sophistication and experience.

The results of the discrimination task revealed that sophisticated judges were unable to discriminate between paired speech samples from children who used to stutter and children who had never stuttered. Two factors need to be considered when interpreting this finding. First, the reliability of this finding is problematic because the judges demonstrated unsatisfactory levels of intrarater agreement in making their judgments. At the same time, this concern may not be critical because the judges actually made fewer correct responses (e.g., correctly identifying a speaker who used to stutter) across the two judgment occasions. This may mean that with repeated observations of the same samples judges became less convinced that they were samples of speakers who used to stutter and were more likely speakers who had never stuttered. Second, although the judges were unable to discriminate between the two types of speakers, there is evidence that their judgments were at least guided by task-appropriate criteria. When asked to describe the basis for selecting a speaker as used to stutter,¹ judges typically described speech characteristics that were consistent with stuttered speech.
Judges also described selecting some speakers on the basis of speech behaviors considered characteristic of treated recovered speech. Though they were not told that speakers had recovered without formal treatment, some judges apparently inferred that if speakers used to stutter then treatment was the responsible agent. Therefore, the judges’ inability to discriminate between speaker types did not appear to be the result of employing invalid judgment criteria.

A potential drawback of the discrimination task is that it might have imposed an artificial comparison between speakers that was not valid for either that speaker or for a particular judge. Some idiosyncratic feature of the NS speaker, for example, might have distracted judges from relevant speech features of the RS speaker. In contrast, an identification task in which each speaker is presented individually would allow judges to evaluate each speaker on his or her own terms.

The findings of the identification task systematically replicated the findings from the discrimination task. Both unsophisticated and experienced judges were unable to distinguish between the speech of children who had recovered and who had never stuttered. Furthermore, the trustworthiness of these findings was bolstered by the relatively acceptable levels of intrarater agreement for this task in comparison with the discrimination task.

The speech naturalness ratings provided additional evidence that there was no perceptual difference between the two groups of speakers. Comparison between the speech naturalness of the two groups of children revealed nonsignificant differences. This finding was the same regardless of whether the judges were unsophisticated or experienced. Both groups of judges also rated each speaker with comparable levels of naturalness. However, these promising findings must be interpreted cautiously because the reliability of these ratings was unsatisfactory,² regardless of the judges’ background.
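The reliability figures at issue rest on the ±1 agreement criterion described in the Results. A minimal sketch of how such percentages could be computed is given below; it is an illustration under stated assumptions, not the authors' procedure. The function names and the toy ratings are hypothetical. The only figures tied to the study are the comparison counts: every pair of judges compared on each of 10 speakers gives 3,250 interrater comparisons for 26 judges and 910 for 14 judges, which matches the totals in Tables 2 and 3.

```python
import math
from itertools import combinations


def intrarater_agreement(first, second, tolerance=1):
    """Share of a judge's repeated ratings that differ by at most `tolerance`."""
    hits = sum(abs(a - b) <= tolerance for a, b in zip(first, second))
    return hits / len(first)


def interrater_agreement(ratings_by_judge, tolerance=1):
    """Share of judge-pair comparisons (same speaker) within `tolerance`.

    `ratings_by_judge` is a list with one list of ratings per judge,
    aligned so that position i always refers to the same speaker.
    Returns (proportion in agreement, number of comparisons).
    """
    hits = 0
    total = 0
    for judge_a, judge_b in combinations(ratings_by_judge, 2):
        for a, b in zip(judge_a, judge_b):
            total += 1
            hits += abs(a - b) <= tolerance
    return hits / total, total


# Number of pairwise comparisons per speaker group of 10 speakers.
print(math.comb(26, 2) * 10)  # unsophisticated judges
print(math.comb(14, 2) * 10)  # experienced judges

# Toy ratings on the 9-point scale (hypothetical, not data from the study).
print(intrarater_agreement([3, 4, 5, 7], [3, 5, 7, 7]))
print(interrater_agreement([[1, 2], [2, 4], [5, 2]]))
```

Because interrater agreement is computed over all judge pairs, a few judges with idiosyncratic use of the scale affect many comparisons at once, which is one reason interrater figures tend to fall below intrarater figures.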
The reasons for this low agreement warrant some discussion. This is the first study that has attempted to use this rating scale with speech samples from young children with a relatively large number of samples and judges. The only other study to employ this scale with children who stutter was also unable to demonstrate high agreement between two clinical judges (see Onslow, Costa, & Rue, 1990). However, this unsatisfactory agreement may not be a specific limitation of the speech naturalness scale. Rafaat, Rvachew, and Russell (1995) reported equally unsatisfactory interjudge agreement when experienced clinicians were rating the severity of phonological impairment in young children. They suggested that the greater range of “normal” that exists for young children combined with variability in clinicians’ knowledge of young children’s speech may be the main factors contributing to low rater agreement. Future research is necessary to determine if these factors also affect the speech naturalness scale, especially if it is going to be used to assess the speech of young children who stutter. At minimum, future researchers using the speech naturalness scale may have to instruct judges to make their ratings relative to an appropriate model of children’s speech. Findings from the present study also add to an emerging view concerning strategies for evaluating whether persons who stutter achieve normally fluent, natural-sounding speech as a result of treatment or after recovering from stuttering. A perceptual discrimination task similar to the one used in this study was introduced some years ago by Ingham and Packman (1978) and Runyan and Adams (1978, 1979) in order to determine whether adults who stutter had achieved perceptually normal speech. 
Subsequently, several studies suggested that Martin et al.’s (1984) 9-point speech naturalness rating scale might be a more practical and sensitive method for this purpose (Ingham et al., 1985; Ingham & Onslow, 1985; Runyan, Bell, & Prosek, 1990). That recommendation seemed justified because it was employed with satisfactory levels of rater agreement (e.g., Martin et al., 1984). More recent studies have shown that those agreement levels are often unpredictable. Different studies have found that not all judges achieve the high levels of rater agreement reported in earlier studies, especially when individual rather than group judgments are required (Finn & Ingham, 1994; Martin & Haroldson, 1992; Metz, Schiavetti, & Sacco, 1990; Onslow, Adams, & Ingham, 1992). The present study confirms this trend and shows that it also occurs when ratings are made of children’s speech (also see Onslow et al., 1990). These findings add to the argument that perhaps further research on this scale should focus on the development of standards for rating levels of naturalness, in much the same way as rating models have been developed for voice (Gerratt, Kreiman, Antonanzas-Barroso, & Berke, 1993).

¹ Note that judges were instructed to select the speaker who used to stutter. It may be worth considering whether the results would have been different had judges been instructed to select the speaker who had never stuttered, instead.

² For rating the speech naturalness of audiovisual speech samples of adults who stutter and do not stutter, Martin and Haroldson (1992) reported an average level of 84% for intrarater agreement (combined for stutterers and nonstutterers at ±1.0) and 80% for interrater agreement (combined for stutterers and nonstutterers at ±1.0).
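The ±1.0 agreement criterion cited from Martin and Haroldson (1992) can be illustrated with a minimal sketch. It assumes that agreement is scored as the percentage of samples on which two sets of ratings differ by no more than one scale point. The function name `percent_agreement` and the ratings below are hypothetical illustrations, not data or procedures from this study.

```python
def percent_agreement(first_pass, second_pass, tolerance=1):
    """Percentage of samples whose two ratings differ by at most `tolerance` points.

    Works for intrarater agreement (same judge, two occasions) or
    interrater agreement (two judges, same samples).
    """
    if len(first_pass) != len(second_pass):
        raise ValueError("Both rating lists must cover the same samples")
    if not first_pass:
        raise ValueError("At least one rated sample is required")
    hits = sum(1 for a, b in zip(first_pass, second_pass) if abs(a - b) <= tolerance)
    return 100.0 * hits / len(first_pass)


# Hypothetical ratings of ten speech samples on a 9-point scale.
first = [3, 5, 2, 7, 4, 6, 3, 5, 4, 2]
second = [4, 5, 4, 6, 4, 9, 3, 7, 5, 2]

print(percent_agreement(first, second))               # agreement within ±1 point
print(percent_agreement(first, second, tolerance=0))  # exact agreement only
```

Loosening or tightening `tolerance` shows how strongly a reported agreement level depends on the chosen criterion, which is one reason agreement figures from different studies are hard to compare directly.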
In the meantime, the paired stimulus paradigm described in this study and recommended by others (e.g., Adams, 1984) may offer clinicians and researchers a less problematic method of deducing whether children have achieved normal-sounding speech.

Two factors might account for the apparently normal-sounding speech of the children recovered from stuttering. First, these children’s recoveries occurred without exposure to formal treatment. Therefore, their recovery did not necessarily involve changes in their speech behavior. In comparison, treated adults’ non-normal-sounding, fluent speech is usually attributed to the effects of changes in their speech behavior that are due to treatment procedures (e.g., prolonged speech). Second, perceptual studies have found that the fluent segments of speech from children who still stutter are undifferentiated from those of normally fluent peers (Colcord & Gregory, 1987; Krikorian & Runyan, 1983). This suggests that the child’s recovered speech pattern retains the normally fluent dimensions that were already present.

The present findings provide the first objective evidence that children who recover from stuttering without exposure to formal treatment are likely to present with normal-sounding speech. It is unknown if this finding would also extend to children who recover from stuttering because of formal treatment. Nonetheless, the outcome of this study suggests that normal fluency would be a reasonable treatment goal. Furthermore, the mechanism responsible for recovery in this study is unknown. One possible factor is that the brief parent counseling that occurred during the initial assessment contributed to their recovery. However, there is no credible evidence to support the view that such a limited informative session would result in an ameliorative effect. There is also no way of verifying that the parents actually followed any clinical suggestions offered during their counseling session.
Future research should examine the kinds of parent behaviors that are beneficial to the child who stutters.

In summary, a series of systematic replications have demonstrated that children who recover from stuttering are perceptually indistinguishable from children who have never stuttered. Obviously, these results should be considered preliminary.³ Because of the controversy surrounding spontaneous recovery in early childhood and the important theoretical and clinical implications, such recovery should be carefully assessed and analyzed. Future research with a larger number of speakers and speech samples taken at several developmental stages is warranted.

³ The small sample size in this study reduced the power of the statistical test to find differences where differences may in fact exist (Young, 1994). Therefore, it is possible that the nonsignificant statistical differences are Type II errors.

Acknowledgments

Portions of this paper were presented by the first author at the Annual Convention of the American Speech-Language-Hearing Association, New Orleans, LA, 1994. Preparation of this manuscript was supported in part by Grant #RO1 DC-00060 awarded to R. J. Ingham by the National Institutes of Health. This research was also supported in part by Grant #RO1 DC-00459 from the National Institutes of Health, National Institute on Deafness and Other Communication Disorders (PI: E. Yairi).

References

Adams, M. R. (1984). The young stutterer: Diagnosis, treatment, and assessment of progress. In W. H. Perkins (Ed.), Stuttering disorders (pp. 41–55). New York: Thieme-Stratton.
Andrews, G., & Harris, M. (1964). The syndrome of stuttering. London: Heinemann.
Bloodstein, O. (1995). A handbook on stuttering (5th ed.). San Diego, CA: Singular Publishing.
Colcord, R., & Gregory, H. (1987). Perceptual analyses of stuttering and nonstuttering children’s fluent speech production. Journal of Fluency Disorders, 12, 185–196.
Cordes, A. K., Ingham, R.
J., Frank, P., & Ingham, J. C. (1992). Time interval analysis of interjudge and intrajudge agreement for stuttering event judgments. Journal of Speech and Hearing Research, 35, 483–494.
Finn, P., & Ingham, R. J. (1989). The selection of “fluent” samples in research on stuttering: Conceptual and methodological considerations. Journal of Speech and Hearing Research, 32, 401–418.
Finn, P., & Ingham, R. J. (1994). Stutterers’ self-ratings of how natural speech sounds and feels. Journal of Speech and Hearing Research, 37, 326–340.
Gerratt, B. R., Kreiman, J., Antonanzas-Barroso, N., & Berke, G. S. (1993). Comparing internal and external standards in voice quality judgments. Journal of Speech and Hearing Research, 36, 14–20.
Glasner, P., & Rosenthal, D. (1957). Parental diagnosis of stuttering in young children. Journal of Speech and Hearing Disorders, 22, 288–295.
Ingham, R. J. (1984). Stuttering and behavior therapy: Current status and experimental foundations. San Diego, CA: College-Hill.
Ingham, R. J., Gow, M., & Costello, J. (1985). Stuttering and speech naturalness: Some additional data. Journal of Speech and Hearing Disorders, 50, 217–219.
Ingham, R. J., & Onslow, M. (1985). Measurement and modification of speech naturalness during stuttering therapy. Journal of Speech and Hearing Disorders, 50, 261–281.
Ingham, R. J., & Packman, A. (1978). Perceptual assessment of normalcy of speech following therapy. Journal of Speech and Hearing Research, 21, 63–73.
Johnson, W., & Associates. (1959). The onset of stuttering. Minneapolis: University of Minnesota Press.
Krikorian, C., & Runyan, C. (1983). A perceptual comparison: Stuttering and nonstuttering children’s nonstuttered speech. Journal of Fluency Disorders, 8, 283–290.
Martin, R. R., & Haroldson, S. K. (1992). Stuttering and speech naturalness—Audio and audiovisual judgments. Journal of Speech and Hearing Research, 35, 521–528.
Martin, R. R., Haroldson, S. K., & Triden, K. (1984). Stuttering and speech naturalness. Journal of Speech and Hearing Disorders, 49, 53–58.
Metz, D. E., Schiavetti, N., & Sacco, P. R. (1990). Acoustic and psychophysical dimensions of the perceived speech naturalness of stutterers and posttreatment stutterers. Journal of Speech and Hearing Disorders, 55, 516–525.
Onslow, M., Adams, R., & Ingham, R. J. (1992). Reliability of speech naturalness ratings of stuttered speech during treatment. Journal of Speech and Hearing Research, 35, 994–1001.
Onslow, M., Andrews, C., & Lincoln, M. (1994). A control/experimental trial of an operant treatment for early stuttering. Journal of Speech and Hearing Research, 37, 1244–1259.
Onslow, M., Costa, L., & Rue, S. (1990). Direct early intervention with stuttering: Some preliminary data. Journal of Speech and Hearing Disorders, 55, 405–416.
Rafaat, S. K., Rvachew, S., & Russell, R. S. C. (1995). Reliability of clinician judgments of severity of phonological impairment. American Journal of Speech-Language Pathology, 4, 39–45.
Runyan, C., & Adams, M. R. (1978). Perceptual study of “successfully therapeutized” stutterers. Journal of Fluency Disorders, 3, 25–39.
Runyan, C., & Adams, M. R. (1979). Unsophisticated judges’ perceptual evaluations of the speech of successfully treated stutterers. Journal of Fluency Disorders, 4, 29–38.
Runyan, C. M., Bell, J. N., & Prosek, R. A. (1990). Speech naturalness ratings of treated stutterers. Journal of Speech and Hearing Disorders, 55, 434–438.
Runyan, C. M., Hames, P. E., & Prosek, R. A. (1982). A perceptual comparison between paired stimulus and single stimulus methods of the fluent utterances of stutterers. Journal of Fluency Disorders, 7, 71–77.
Van Riper, C. (1982). The nature of stuttering (2nd ed.). Englewood Cliffs, NJ: Prentice-Hall.
Wendahl, R. W., & Cole, J. (1961). Identification of stuttering during relatively fluent speech. Journal of Speech and Hearing Research, 4, 281–286.
Wingate, M. E. (1976). Stuttering: Theory and treatment. New York: Irvington.
Yairi, E., & Ambrose, N. (1992). A longitudinal study of stuttering in children: A preliminary report. Journal of Speech and Hearing Research, 35, 755–760.
Yairi, E., Ambrose, N., & Niermann, R. (1993). The early months of stuttering: A developmental study. Journal of Speech and Hearing Research, 36, 521–528.
Yairi, E., Ambrose, N., Paden, E., & Throneburg, R. (1996). Predictive factors of persistence and recovery: Pathways of childhood stuttering. Journal of Communication Disorders, 29, 51–77.
Young, M. A. (1964). Identification of stutterers from recorded samples of their “fluent” speech. Journal of Speech and Hearing Research, 7, 302–303.
Young, M. A. (1994). Evaluating differences between stuttering and nonstuttering speakers: The group difference design. Journal of Speech and Hearing Research, 37, 522–534.

Received August 28, 1996
Accepted January 23, 1997

Contact author: Patrick Finn, PhD, Department of Speech and Hearing Sciences, 901 Vassar NE, University of New Mexico, Albuquerque, NM 87131. Email: [email protected]

Appendix. Individual data for RS and NS speakers based on the scores of unsophisticated and experienced judges.
Speaker type: NS

           Unsophisticated judges                 Experienced judges
Speaker    Percent of         Mean speech         Percent of         Mean speech
           never-stuttered    naturalness         never-stuttered    naturalness
           judgments          ratings             judgments          ratings
1           80.77             2.92                 92.86             3.43
2           80.77             4.42                 78.57             2.57
3           88.46             2.65                 78.57             2.64
4           61.54             6.88                 35.71             6.93
5           50.00             5.12                 42.86             4.79
6           61.54             3.84                 35.71             3.50
7           42.31             3.96                 50.00             3.43
8           80.77             5.08                 92.86             3.50
9           88.46             3.77                 78.57             3.07
10          88.46             3.77                100.00             3.21
Mean        72.30             4.24                 68.60             3.71

Speaker type: RS

           Unsophisticated judges                 Experienced judges
Speaker    Percent of         Mean speech         Percent of         Mean speech
           never-stuttered    naturalness         never-stuttered    naturalness
           judgments          ratings             judgments          ratings
1           96.15             2.12                 92.86             1.64
2           88.46             3.77                 64.29             4.14
3           76.92             2.08                 71.43             2.50
4           73.08             5.19                 64.29             4.71
5           61.54             4.54                 71.43             3.79
6           76.92             4.46                 85.71             2.14
7           84.62             3.77                100.00             2.71
8           57.69             4.73                 50.00             4.00
9           46.15             3.27                 57.14             3.29
10          65.38             4.23                 71.43             3.50
Mean        72.70             3.82                 72.90             3.24
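As a simple arithmetic check on the Appendix, the sketch below recomputes the group means for the unsophisticated judges' percent of never-stuttered judgments and compares them with the reported "Mean" entries. The values are transcribed from the Appendix in the order the two speaker groups are listed. The variable names, the helper functions, and the 0.05 tolerance (allowing for rounding in the printed table) are illustrative assumptions, not part of the original analysis.

```python
from statistics import mean

# Percent of never-stuttered judgments from unsophisticated judges,
# speakers 1-10, for the first and second speaker groups in the Appendix.
first_group = [80.77, 80.77, 88.46, 61.54, 50.00, 61.54, 42.31, 80.77, 88.46, 88.46]
second_group = [96.15, 88.46, 76.92, 73.08, 61.54, 76.92, 84.62, 57.69, 46.15, 65.38]

# "Mean" entries as printed in the Appendix.
reported_first_mean = 72.30
reported_second_mean = 72.70


def column_mean(values):
    """Arithmetic mean of one Appendix column."""
    return mean(values)


def matches_reported(values, reported_mean, tolerance=0.05):
    """True if the recomputed mean is within `tolerance` of the printed mean."""
    return abs(column_mean(values) - reported_mean) <= tolerance


print(round(column_mean(first_group), 2), matches_reported(first_group, reported_first_mean))
print(round(column_mean(second_group), 2), matches_reported(second_group, reported_second_mean))
```

The same check can be repeated for the naturalness columns and for the experienced judges by substituting the corresponding Appendix values.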