Contact author: Katherine C. Hustad, 475 Waisman Center, 1500 Highland Avenue, Madison, WI 53705. E-mail: kchustad{at}wisc.edu.
Abstract
Method: Speech samples were collected from 12 speakers with dysarthria secondary to cerebral palsy. For each speaker, 12 different listeners completed 2 tasks (for a total of 144 listeners): One task involved making orthographic transcriptions, and 1 task involved answering comprehension questions. Transcriptions were scored for the number of words transcribed correctly. Responses to comprehension questions were scored on a 3-point scale according to their accuracy.
Results: Across all speakers, the Pearson product–moment correlation between comprehension and intelligibility scores was nonsignificant when the effects of severity were factored out and residual scores were examined. Within severity groups, the same relationship was significant only for the mild group. Within individual speaker groups, the relationship was nonsignificant for all but 2 speakers with dysarthria. Percentage correct scores for listener comprehension were descriptively higher than percentage correct intelligibility scores for all groups.
Conclusion: Findings suggest that transcription intelligibility scores do not accurately reflect listener comprehension scores. Measures of both intelligibility and listener comprehension may provide a more complete description of the information-bearing capability of dysarthric speech than either measure alone.
KEY WORDS: speech intelligibility, speech perception, dysarthria, cerebral palsy, comprehensibility
The characterization of dysarthric speech is a topic of both clinical and theoretical importance. Although the dysarthrias are a heterogeneous group of speech disorders, one common characteristic is reduced intelligibility (Duffy, 2005; Yorkston, Beukelman, Strand, & Bell, 1999). Reduced intelligibility can have a critical impact on communication abilities and may limit vocational, educational, and social participation. As a result, quality of life may be greatly diminished.
Intelligibility refers to how well a speaker's acoustic signal can be accurately recovered by a listener. Although this definition seems simple, there are many speaker-related and listener-related variables that can impact how well a speech signal is deciphered. For example, research has shown that message predictability (Garcia & Cannito, 1996), message length (Yorkston & Beukelman, 1981), contextual cues (Hunter, Pring, & Martin, 1991), visual-facial information (Hustad & Cahill, 2003), and listener experience (Tjaden & Liss, 1995) each have the potential to affect intelligibility in significant ways. Thus, intelligibility is more complex than its definition may suggest.
There are several ways that intelligibility can be measured. One method is orthographic transcription of standard speech samples by naive listeners (see Garcia & Cannito, 1996; Giolas & Epstein, 1963; Tikofsky & Tikofsky, 1964; Yorkston & Beukelman, 1981). In this paradigm, listeners hear a speech sample (usually sentence length) and then write down what they think the speaker said. Constituent transcribed words are scored as either correct or incorrect on the basis of whether they match the intended words of the speaker. Percentage intelligibility scores are calculated by dividing the number of words identified correctly by the number of words possible and multiplying by 100. This method is widely used in clinical applications, and tools such as the Sentence Intelligibility Test (SIT; Yorkston, Beukelman, & Tice, 1996) are available for clinical use. Transcription intelligibility scores provide important information about the integrity of the acoustic signal relative to "normal" (Hustad & Beukelman, 2002) and are often used to describe severity of the dysarthria (Weismer & Martin, 1992). Transcription intelligibility scores can also be used as a basis of comparison to document progress in treatment (Yorkston et al., 1999). There are, however, several limitations to the information that can be obtained from transcription intelligibility scores. For example, the underlying basis for the intelligibility deficit cannot be determined from an intelligibility score (Kent, Weismer, Kent, & Rosenbek, 1989; Weismer & Martin, 1992). In addition, when individual listener-transcribed words are scored binomially (correct vs. incorrect) and each word is weighted equally, it is difficult to infer the extent to which listeners interpreted the meaning of the message. In large part, this is because the kinds of words (i.e., content bearing vs. non–content bearing) transcribed correctly cannot be determined from the intelligibility score alone (Hustad, 2006).
One complementary measure to transcription intelligibility is assessment of listener comprehension, sometimes called comprehensibility in the broader psychology and communication arts literatures. For clarity, it is important to note that in the dysarthria literature, Yorkston, Strand, and Kennedy (1996) have used the term comprehensibility to refer to "contextual intelligibility," or intelligibility when contextual information is present in different forms, such as semantic cues, syntactic cues, orthographic cues, and gestures. In their measurement of comprehensibility, Yorkston, Strand, and Kennedy employed orthographic transcription and percentage correct scores. Thus, comprehensibility as defined by Yorkston, Strand, and Kennedy is a type of intelligibility, with the addition of contextual information.
In contrast, measures of listener comprehension evaluate listeners' ability to interpret the meaning of messages produced by speakers with dysarthria without regard for accuracy of phonetic and lexical parsing (Hustad & Beukelman, 2002). Listener comprehension can be evaluated by examining listeners' ability to answer questions about the content of a message or narrative (Hustad & Beukelman, 2002) or by examining listeners' ability to summarize the content of a narrative passage (Higginbotham, Drazek, Kowarsky, Scally, & Segal, 1994) produced by a speaker.
Theories of discourse psychology provide a basis for conceptualizing different levels of processing that map onto the constructs of intelligibility and listener comprehension of dysarthric speech. Several competing theories exist in which underlying cognitive processes, mechanisms, and architectures differ. However, it is generally accepted that there are multiple levels of representation involved in language processing (Altmann, 2001; Foltz, 2003; Graesser, Millis, & Zwaan, 1997; Singer, 2000; Zwaan & Singer, 2003). Van Dijk and Kintsch (1983) proposed that three levels of discourse representation are involved in the comprehension process: the surface code; the textbase, or propositional content; and the situation model. The first level of representation, surface code, refers to "precise word strings" (Singer, 2000, p. 370), or the exact syntax and morphology of the original message. The intermediate level of representation, textbase, or propositional content, refers to the meanings or propositions that are extracted from the surface code: essentially the semantics of the message (Graesser et al., 1997). The highest level of representation, the situation model, reflects the integration of propositions with the world knowledge and the goals of the receiver (Butcher & Kintsch, 2003; Kintsch, 1992; Singer, 2000). In the dysarthria literature, the majority of studies have focused on intelligibility, which can be considered a form of surface code because the focus of measurement is phonetic and lexical identification accuracy. There have been few studies in which propositional content or higher level situation models (comprehension) of dysarthric speech have been examined.
Results of comprehension studies have been equivocal, likely because of methodological differences between studies. For example, Beukelman and Yorkston (1979) examined the relationship between "information transfer" and intelligibility for 9 speakers with dysarthria of varying severity. In this study, listeners completed an intelligibility task in which they transcribed a paragraph (in 5- to 10-word segments) produced by a speaker with dysarthria. Listeners also completed an information-transfer task in which they listened to a different paragraph produced by a speaker with dysarthria and then answered 10 comprehension questions about the content of the paragraph. Percentage correct scores were obtained for each task and for each speaker. Results of the Beukelman and Yorkston study showed a strong and significant relationship (r = .95) between intelligibility scores and information-transfer (comprehension) scores across all speakers. However, Weismer and Martin (1992) identified a key problem related to the confounding effects of severity in Beukelman and Yorkston's study. That is, the correlation between intelligibility and comprehension scores across speakers of varying severity would be strong simply because both measures are correlated with severity. Thus, severity acted as a third variable, masking the true relationship that may or may not exist between intelligibility and comprehension scores. Because of this "third variable" effect, the only conclusion that can be drawn from the Beukelman and Yorkston study is that both intelligibility and listener comprehension increase as severity decreases. Weismer and Martin suggested that proper examination of the relationship between severity-related variables would require that severity be controlled or blocked so that the relationship could be examined within severity groups rather than across severity groups.
Hustad and Beukelman (2002) examined the relationship between listener comprehension and intelligibility for 4 speakers with severe dysarthria. They also examined the impact of supplemental contextual cues on the relationship between intelligibility and comprehension. Listeners completed two tasks, one in which they transcribed speech samples produced by a speaker with dysarthria and one in which they answered comprehension questions about a different set of speech samples produced by the same speaker with dysarthria. Results showed a weak and nonsignificant relationship between intelligibility and comprehension when no cues were provided. This finding provides evidence for the notion that the two measures tap into different phenomena and that listener performance on one measure does not necessarily reflect performance on the other. Thus, listener-comprehension measures and speech-intelligibility measures appear to provide different yet complementary information about the dysarthric speech signal. Because similar studies have not been conducted for groups of speakers from different severity groups, generalization of conclusions regarding the relationship between intelligibility and comprehension is difficult.
The purpose of the present study was to evaluate the relationship between comprehension and intelligibility for listeners of speakers with dysarthria from four different severity groups. The study was designed to be conceptually similar to that of Beukelman and Yorkston (1979); however, using both experimental and statistical procedures, the effects of severity were controlled (statistically partialled out) and systematically examined. Specifically, this study addressed the following questions: (a) Across all speakers, when severity effects are partialled out, what is the relationship between intelligibility and comprehension? (b) Within each speaker severity group, what is the relationship between intelligibility and comprehension? Is this relationship different among the severity groups? (c) Within individual speakers, what is the relationship between comprehension and intelligibility? What is the extent of individual differences among speakers?
Method
The present study was part of a larger project examining measurement of intelligibility. The first article (Hustad, 2006) examined the impact of different scoring methods on intelligibility findings. It also examined linguistic class errors made by everyday listeners when orthographically transcribing dysarthric speech. The present study used the same transcription intelligibility data obtained from the same listeners and speakers as those reported in the Hustad (2006) article. However, data were subjected to different analyses, as described below, in the present study. In addition, data from a separate comprehension task, collected in parallel with intelligibility data, are reported in this article.
Participants
Speakers with dysarthria. Twelve speakers with dysarthria secondary to cerebral palsy contributed speech samples for this study. Speakers were selected to represent a range of severity levels. Three speakers were assigned to each of the four severity groups exclusively on the basis of scores from the SIT (Yorkston, Beukelman, & Tice, 1996). For the purposes of this project, severity groups were operationally defined as follows: those with scores between 76% and 95% were in the mild group; between 46% and 75% were in the moderate group; between 25% and 45% were in the severe group; and between 5% and 24% were in the profound group. Demographic information for the speakers, including age, gender, dysarthria diagnosis, severity, prominent perceptual features, intelligibility score, and speech rate, is provided in Table 1. Details regarding inclusion criteria can be found in Hustad (2006).
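The operational severity bands above can be expressed as a small lookup. The following is a minimal sketch, not the study's procedure: the function name `severity_group` is hypothetical, and treating the stated bounds as inclusive whole-number percentages is an assumption.

```python
def severity_group(sit_percent):
    """Map a SIT intelligibility score (percent) to a severity label.

    The bands are the operational definitions stated in the text; treating
    both bounds as inclusive is an assumption of this sketch.
    """
    bands = [
        (76, 95, "mild"),
        (46, 75, "moderate"),
        (25, 45, "severe"),
        (5, 24, "profound"),
    ]
    for low, high, label in bands:
        if low <= sit_percent <= high:
            return label
    # Scores outside 5-95%, or fractional scores that fall between two
    # bands, are not covered by the stated definitions.
    raise ValueError(f"SIT score {sit_percent} is outside the defined bands")
```

For example, `severity_group(60)` returns `"moderate"`, and a score of 96 raises an error because the text defines no group for it.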
Materials
Speech stimuli. Speakers with dysarthria produced three narrative passages, each consisting of 10 related sentences. The passages employed in this study have been used in several other projects focused on intelligibility of dysarthric speech (see Hustad, Auker, Natale, & Carlson, 2003; Hustad & Beukelman, 2001, 2002; Hustad, Jones, & Dailey, 2003). The interested reader is referred to Hustad and Beukelman (2002) for specific details regarding characteristics of the passages. In summary, passages were developed to represent common situations (e.g., sporting event, natural disaster, purchasing a vehicle). Passages followed standard American English conventions for content, form, and use of the language. Each of the narrative passages contained a total of 65 words. The 10 constituent sentences systematically ranged in length from five to eight words.
Comprehension questions. For each narrative passage, 10 comprehension questions were developed. Five questions were designed to be inferential in nature, and 5 were designed to be factual in nature. Inferential questions targeted information that was not overtly specified within the narratives but could be inferred from the content of the narrative. Factual questions targeted information that was directly stated within the narrative. Because each of the narratives was different, it was not possible to use the same comprehension questions for each. However, 4 questions were appropriate for all three narratives. These questions pertained to the topic of the story, the ending of the story, an alternative ending for the story, and what might happen following the event(s) described in the story. The 6 remaining questions were unique to the individual narrative passages. As appropriate for each narrative, these questions focused on information such as the time and place of the events in the story, the reason for the events of the story, and problems or obstacles encountered in the story. It is important to note that these questions differed from the ones used by Hustad and Beukelman (2002). The original questions developed by Hustad and Beukelman (2002) were designed to query sentence-level information so that each question could be answered solely on the basis of its referent sentence. In the present study, questions were designed to query narrative-level information that was not tied exclusively to one sentence.
In the present study, all comprehension questions for each of the three paragraphs were pilot tested to meet two criteria. The first criterion required that questions not be "guessable" to judges who had not been exposed to the target narrative. This was established by having a pool of 10 independent judges who had no knowledge of the referent narratives provide answers to the comprehension questions. Questions that were answered correctly by more than 1 of 10 judges were discarded. Questions that were answered incorrectly by at least 9 of 10 judges were deemed "unguessable" and were maintained in the pool of questions.
The second criterion for comprehension questions required that they be "answerable" to judges who had been exposed to the target narrative. This was established by having a separate pool of 10 independent judges provide answers to the comprehension questions following presentation of the referent narratives. Questions that were answered correctly by at least 9 of 10 judges were maintained in the pool of questions, and those that were answered incorrectly by more than 1 of 10 judges were discarded. Several iterations and modifications were required before a series of 10 questions for each narrative, each of which targeted a different response, met the two criteria with 10 different judges. The questions that were ultimately selected following all pilot testing were not guessed correctly by any of the 10 naive independent judges and were answered correctly by all of the judges who had been exposed to the referent narrative. See Table 2 for a sample narrative, sample comprehension questions, and one sample listener's responses to the comprehension questions.
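The two pilot-testing criteria amount to a simple retention rule for each candidate question. The sketch below is illustrative only: the function name `keep_question` and its parameters are hypothetical, and it assumes exactly 10 judges in each pool, as described above.

```python
def keep_question(naive_correct, exposed_correct):
    """Return True if a candidate question passes both pilot criteria.

    naive_correct:   number of the 10 judges who had NOT heard the
                     narrative and still answered correctly
    exposed_correct: number of the 10 judges who HAD heard the
                     narrative and answered correctly
    """
    # Criterion 1: at least 9 of 10 naive judges answered incorrectly.
    unguessable = naive_correct <= 1
    # Criterion 2: at least 9 of 10 exposed judges answered correctly.
    answerable = exposed_correct >= 9
    return unguessable and answerable
```

The questions ultimately selected in the study were stricter than this threshold: they correspond to `naive_correct == 0` and `exposed_correct == 10`.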
Preparing speech samples for playback to listeners. Recorded samples were transferred onto computer via a digital sound card, maintaining the sampling rate and quantization of the original recordings. For each speaker, recordings of each stimulus sentence were separated into individual sound files. Stimulus files for each sentence were normalized using Sound Forge (1998) Version 4.5 so that the peak amplitude of each sentence was constant across all files.
Experimental task. Listeners completed the experiment independently in a soundproof booth. Each listener was seated approximately 2 ft from a high-quality external speaker, with a desktop computer located directly in front of him or her. The presentation level of speech stimuli was calibrated to a peak sound-pressure level of 70 dB. Calibration of presentation level was checked periodically to ensure consistency among listeners.
All experimental tasks were presented via computer using Microsoft PowerPoint. Tasks were self-paced, so that listeners were able to advance through the experiment at a comfortable rate, taking as much time to generate responses as necessary. For all tasks, listeners initiated presentation of the speech sample by clicking the mouse. They were able to hear each sample only one time. For the comprehension task, the entire narrative was presented continuously, with a brief pause between each sentence. Following presentation of the narrative, 10 randomly ordered comprehension questions were presented one at a time, each on a separate screen. Listeners typed their responses to comprehension questions into the computer on the same screen upon which the individual questions were presented. They were unable to refer to their responses for previous questions or to view forthcoming questions.
For the intelligibility task, individual sentences forming the narrative were presented individually. Following presentation of one sentence, listeners made orthographic transcriptions of what they heard by typing on the computer. When they were finished typing their response, they proceeded to the next sentence until they had entered transcriptions for all 10 sentences.
Prior to beginning the experimental tasks, listeners were instructed that they would be listening to one person who has a speech disability and that they would complete two types of tasks: comprehension and intelligibility. In the comprehension task, listeners were told that they would hear a story composed of 10 sentences and then would answer questions about what they heard after the story was presented in its entirety. In the intelligibility task, listeners were told that they would hear a different story composed of 10 sentences. After each sentence, they would be asked to type exactly what they thought the speaker said. The listeners were instructed to type all words that they could, guessing at words if necessary. They were told to skip any words for which they were unable to venture a guess. Listeners were instructed to follow all directions and answer all questions presented on the computer screen. They were also informed that the person speaking would be difficult to understand and that if they were uncertain they should take their best guess.
Randomization and counterbalancing. In each of the two experimental tasks (comprehension and intelligibility) completed by listeners, different narrative passages were employed. To prevent an order effect, presentation of tasks was counterbalanced within each of the 12 speaker groups so that half of the listeners completed the comprehension task first and half of the listeners completed the intelligibility task first. In addition, for the comprehension task, all comprehension questions were presented in random order for each listener, so that no two listeners received comprehension questions in the same order. Finally, speech stimuli associated with the comprehension task and the intelligibility task were also counterbalanced so that narratives were represented the same number of times in each condition.
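One way to satisfy the counterbalancing constraints described above for a single speaker's 12 listeners is to cross the two task orders with the six ordered pairs of distinct narratives. This is a hypothetical scheme offered as a sketch; the text does not state that the study used this exact crossing, and the narrative labels `A`, `B`, `C` and the function `build_schedule` are placeholders.

```python
import random
from itertools import permutations, product

NARRATIVES = ("A", "B", "C")  # placeholder labels for the three passages
TASK_ORDERS = ("comprehension", "intelligibility")  # which task comes first


def build_schedule(seed=0):
    """Return 12 listener assignments for one speaker group.

    2 task orders x 6 ordered narrative pairs = 12 cells, so each task is
    first for half of the listeners and each narrative is used 4 times in
    each task. Full crossing is an assumption of this sketch.
    """
    cells = [
        {
            "first_task": first_task,
            "comprehension_narrative": comp,
            "intelligibility_narrative": intel,
        }
        for first_task, (comp, intel) in product(
            TASK_ORDERS, permutations(NARRATIVES, 2)
        )
    ]
    # Shuffle so that cell order does not track listener arrival order.
    random.Random(seed).shuffle(cells)
    return cells
```

Because each listener receives an ordered pair of distinct narratives, no listener hears the same passage in both tasks, which matches the requirement that different passages be used for the two tasks.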
Dependent Variables
Two dependent variables were of interest for this study. The first variable was intelligibility of individual words comprising the narrative produced by the speakers with dysarthria. These data were used to characterize how well listeners processed the surface code of the narratives produced by the speakers. The second variable was experimenter ratings of the accuracy of listener responses to comprehension questions pertaining to the narrative as a whole. These data were used to characterize how well listeners comprehended the situation and events described in the narrative.
Scoring. Orthographic transcriptions generated by listeners were scored on a sentence-by-sentence basis. Within each transcribed sentence, individual transcribed words were compared with the words produced by the speaker to determine whether there was an exact phonemic match, without regard for word order. Misspellings and homonyms were accepted as correct. Each correct word earned 1 point. Across each of the 10 sentences, 65 points were possible, 1 for each target word. The total number of words identified correctly was divided by the total number of words possible and multiplied by 100 to yield the percentage of words identified correctly for each listener.
Comprehension of the narratives was determined by scoring listener responses to comprehension questions. The author scored all comprehension questions using a 3-point scale (0, 1, 2). Responses to comprehension questions that were judged to be incorrect earned 0 points. Responses that were judged to be general and nonspecific, yet not incorrect, earned 1 point. Responses that were judged to be specific and correct earned 2 points. Across each of the 10 questions per narrative, a total of 20 points was possible. Comprehension scores were converted to a percentage for each listener by dividing the number of points earned by the number of points possible and multiplying by 100.
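The two scoring rules above can be sketched in a few lines. This is an approximation under stated assumptions, not the study's scoring procedure: a human scorer judged phonemic matches and accepted misspellings and homonyms, whereas this sketch uses exact lowercase string matches on punctuation-free text, and it credits a repeated target word at most as many times as it occurs in the target. All function names are hypothetical.

```python
from collections import Counter


def words_correct(target_sentence, transcribed_sentence):
    """Count transcribed words that match target words, ignoring order."""
    target = Counter(target_sentence.lower().split())
    typed = Counter(transcribed_sentence.lower().split())
    # Multiset intersection: each target word can be credited only as
    # many times as it appears in the target sentence.
    return sum((target & typed).values())


def percent_intelligibility(target_sentences, transcribed_sentences):
    """Words identified correctly / words possible x 100, over all sentences."""
    correct = sum(
        words_correct(t, r)
        for t, r in zip(target_sentences, transcribed_sentences)
    )
    possible = sum(len(t.split()) for t in target_sentences)
    return 100.0 * correct / possible


def percent_comprehension(ratings):
    """Points earned / points possible x 100 for 0-1-2 question ratings."""
    if any(r not in (0, 1, 2) for r in ratings):
        raise ValueError("each rating must be 0, 1, or 2")
    return 100.0 * sum(ratings) / (2 * len(ratings))
```

With the study's materials, `possible` would be 65 words per narrative and the comprehension denominator would be 20 points (10 questions at 2 points each).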
Reliability. Both intra- and interjudge reliability were determined for the percentage of words transcribed correctly (intelligibility) and for ratings of listener responses to comprehension questions (comprehension). Intrajudge reliability was obtained by having the original judge (the author) rescore data from 3 randomly selected listeners for each of the 12 speaker groups (25% of the sample). Interscorer reliability was obtained by having a second judge (research assistant), who was not involved in the initial scoring, score data from 3 randomly selected listeners for each of the 12 speaker groups (25% of the sample). Unit-by-unit agreement (Hegde, 1994) was obtained by dividing the number of agreements by the number of agreements plus disagreements, multiplied by 100. Cohen's kappa (Siegel & Castellan, 1988), a measure that accounts for the proportion of inter- or intrajudge agreement expected on the basis of chance alone, was also calculated. Kappa-type statistics that are .81 and above have been regarded as "almost perfect" (Landis & Koch, 1977).
For intelligibility, both intrajudge and interjudge unit-by-unit agreement were 100%. Cohen's kappa was 1.0 for inter- and intrajudge reliability. This is not surprising given that the task of determining whether a word is correct is a relatively simple one.
For comprehension, intrajudge point-by-point agreement was 92%, and Cohen's kappa was .82. Interjudge point-by-point agreement was 90%, and Cohen's kappa was .81. These findings document strong intra- and interjudge reliability in the scoring of dependent measures.
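The two reliability statistics described above can be computed as follows. This is a minimal sketch using the standard definitions of percentage agreement and Cohen's kappa; the function names are hypothetical, and the inputs are assumed to be two equal-length lists of scores (for example, 0/1/2 comprehension ratings) assigned to the same units by two scoring passes.

```python
from collections import Counter


def percent_agreement(scores_a, scores_b):
    """Agreements / (agreements + disagreements) x 100."""
    agreements = sum(1 for a, b in zip(scores_a, scores_b) if a == b)
    return 100.0 * agreements / len(scores_a)


def cohens_kappa(scores_a, scores_b):
    """Agreement corrected for the agreement expected by chance alone."""
    n = len(scores_a)
    p_observed = sum(1 for a, b in zip(scores_a, scores_b) if a == b) / n
    count_a, count_b = Counter(scores_a), Counter(scores_b)
    # Chance agreement: product of the two judges' marginal proportions,
    # summed over score categories.
    p_chance = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if p_chance == 1.0:
        # Both judges used a single identical category throughout.
        return 1.0
    return (p_observed - p_chance) / (1.0 - p_chance)
```

Kappa is lower than raw agreement whenever some agreement would be expected by chance, which is why the comprehension kappas reported above (.82 and .81) sit below the corresponding agreement percentages (92% and 90%).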
Experimental Design and Statistical Procedures
A 2 × 4 split-plot design (Kirk, 1995) was employed for this study. The within-subjects variable was measure, and its two categories were intelligibility and comprehension. The between-subjects variable was severity, and its four groups were mild, moderate, severe, and profound. Three speakers comprised each severity group, and 12 listeners heard each speaker, for a total of 36 different listeners per severity group.
The research questions of interest pertained to relationships among measures at different levels (across all speakers, within severity groups, and within individual speakers); therefore, three sets of analyses were completed using Pearson product–moment correlation coefficients. The amount of variance accounted for by the relationship between the two variables (r²) was of primary interest. Any correlation coefficient that had a probability of .05 or less was considered statistically significant.
Because the two measures, intelligibility and comprehension, addressed different constructs and had different measurement scales, statistical comparisons of mean differences were not made. However, descriptive results are presented and discussed.
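The correlational analyses described above can be illustrated with a short sketch. It computes a Pearson correlation and shows one simple way to remove a severity effect before correlating: subtracting each severity group's mean from its members' scores. That residualization method is an assumption of this sketch, since the exact procedure used to obtain residual scores is not specified here. The function names and the four data points are made up for illustration and are not study data.

```python
import math
from collections import defaultdict


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def residualize(scores, groups):
    """Subtract each group's mean from its scores (assumed method)."""
    totals = defaultdict(lambda: [0.0, 0])
    for score, group in zip(scores, groups):
        totals[group][0] += score
        totals[group][1] += 1
    return [s - totals[g][0] / totals[g][1] for s, g in zip(scores, groups)]


# Made-up scores for four hypothetical listeners, two per severity group.
severity = ["severe", "severe", "mild", "mild"]
intelligibility = [10, 20, 50, 70]
comprehension = [30, 20, 80, 60]

raw_r = pearson_r(intelligibility, comprehension)
residual_r = pearson_r(
    residualize(intelligibility, severity),
    residualize(comprehension, severity),
)
r_squared = residual_r ** 2  # proportion of variance accounted for
```

In this toy data the raw correlation is strongly positive (about .78) only because both measures rise from the severe group to the mild group; once group means are removed, the within-group relationship is negative. This is the "third variable" pattern that motivates controlling for severity.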
Results
Descriptive statistics summarizing mean intelligibility and mean comprehension scores suggest that comprehension scores were consistently higher than intelligibility scores. This was the case across all speakers and listeners, within each of the severity groups, and within each individual speaker. It is noteworthy, however, that variances at all levels (overall, within severity groups, and within speakers) were quite large for both intelligibility and comprehension data, suggesting marked variability in performance among listeners. Severity group means for each measure are shown in Figure 1, and individual speaker means are shown in Figure 2.
Discussion
The purpose of the present study was to evaluate the relationship between comprehension and intelligibility for listeners of speakers with dysarthria within four different severity groups. This study examined the relationship between intelligibility and comprehension scores across all participants when severity effects were statistically controlled, the relationship between intelligibility and comprehension within each severity group, and the relationship between intelligibility and comprehension for individual speakers with dysarthria. Findings are discussed in detail below.
Relationships Between Intelligibility and Comprehension
Across all speakers and their listeners, results of this study showed that there was no significant relationship between intelligibility scores and comprehension scores when severity effects were removed. This finding suggests that intelligibility scores and comprehension scores reflect different underlying phenomena that do not seem to overlap in meaningful ways when the effects of severity are controlled. From a clinical perspective, this finding has important implications. Perhaps most noteworthy is that generalization of an intelligibility score to conclusions about listeners' ability to comprehend a speaker would likely be inaccurate. Indeed, examination of descriptive data (see Figures 1 and 2) suggests that comprehension scores tended to be higher than intelligibility scores, particularly for listeners of speakers in the moderate and severe groups.
One reason for the descriptive differences and poor relationship between intelligibility scores and comprehension scores in the present study may relate to listeners' goals and their subsequent approach to the two tasks. In the comprehension task, it is likely that listeners were focused on constructing a coherent global picture, actively drawing upon their world knowledge as the narrative unfolded so that they could answer questions about what they heard. In the transcription task it is likely that listeners were focused on lexical delimitation, perhaps with less regard for meaning than in the comprehension task.
Another difference between the measures that may explain the poor relationship and descriptive differences between intelligibility and comprehension scores relates to short-term memory limitations of listeners. In the intelligibility task, listeners transcribed sentences one at a time. Listeners were unable to enter their transcription for individual sentences until production of that sentence was complete. Speech rate was markedly reduced relative to normal for most of the speakers, with some sentences lasting as long as 7 or 8 s. It is possible that listeners performed worse on intelligibility tasks because their short-term memory was taxed due to reduced rate of speech along with increased processing demands imposed by the dysarthric speech signal. Because successful decoding of and short-term memory for exact word strings was relatively less important for the comprehension task, the same phenomenon likely had little or no influence on comprehension scores.
Although overall findings of this study showed that there was no relationship between comprehension and intelligibility scores, within speaker severity groups and within individual speaker groups there were some exceptions that warrant discussion. For example, the relationship between intelligibility and comprehension was not significant for speakers or their listeners within the moderate, severe, and profound severity groups. However, for speakers and their listeners within the mild severity group, the relationship between intelligibility and comprehension was significant, although weak (12% of the variance was accounted for by the relationship between the two variables). One explanation for this finding is that the speech signal was good enough to allow listeners to understand most of the words produced by speakers; consequently, listeners were able to comprehend the narratives with a high degree of accuracy, and the relationship between these measures was significant. That the relationship between comprehension and intelligibility scores was not stronger for these speakers and their listeners is somewhat surprising. In fact, examination of data for individual speakers and their listeners shows that the correlation between intelligibility and comprehension was significant for only 1 of the 3 speakers with mild dysarthria. Oddly, this speaker had the lowest intelligibility and comprehension scores within the mild group.
Only 1 other speaker showed a significant relationship between intelligibility and comprehension scores. This speaker was in the moderate group and did not have any apparent characteristics that distinguished him from the other speakers. Thus, the relationship between intelligibility and comprehension for this individual is, again, difficult to explain.
Results of the present study were consistent with those of Hustad and Beukelman (2002) for speakers with severe dysarthria. The replication of findings is particularly noteworthy because there were some important methodological differences between the two studies. In the present study, listeners heard stimuli produced by speakers with dysarthria only one time. In the Hustad and Beukelman study (2002), listeners heard the narrative two times. The effect of hearing target narratives twice may have served to increase intelligibility scores because listeners had the advantage of processing the entire narrative at multiple levels of representation prior to transcribing what they heard.
Another methodological difference involved the scoring rubric employed for comprehension questions. In the present study, a 3-point scoring system was adopted in which responses that were partially correct were given partial credit. In the Hustad and Beukelman (2002) study, a binomial (2-point) scoring system was employed, such that responses to comprehension questions were considered either correct or incorrect. Awarding partial credit may have inflated comprehension scores in the present study to some extent relative to binomial scoring. It is also worth noting that use of a 3-point scale to score comprehension responses is probably more ecologically valid than a binomial scale, as comprehension is not necessarily an all-or-none phenomenon.
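The effect of the two rubrics on a percentage score can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the point values (2, 1, 0), the treatment of partially correct answers as incorrect under binomial scoring, the function names, and the ratings themselves, none of which come from either study.

```python
from typing import Sequence

# Hypothetical rating labels; the article does not list the rubric's point values.
FULL, PARTIAL, WRONG = "full", "partial", "wrong"


def percent_three_point(ratings: Sequence[str]) -> float:
    """Percent of available points, assuming 2 = correct, 1 = partial, 0 = incorrect."""
    points = {FULL: 2, PARTIAL: 1, WRONG: 0}
    earned = sum(points[r] for r in ratings)
    return 100.0 * earned / (2 * len(ratings))


def percent_binomial(ratings: Sequence[str]) -> float:
    """Percent correct, assuming partially correct answers count as incorrect."""
    correct = sum(1 for r in ratings if r == FULL)
    return 100.0 * correct / len(ratings)


# Made-up ratings for one listener's answers to five comprehension questions.
ratings = [FULL, PARTIAL, WRONG, FULL, PARTIAL]

three_point_score = percent_three_point(ratings)  # 6 of 10 points
binomial_score = percent_binomial(ratings)        # 2 of 5 answers

print(f"3-point: {three_point_score:.1f}%  binomial: {binomial_score:.1f}%")
```

Under these assumptions the same set of answers yields a higher percentage with the 3-point rubric whenever at least one answer is partially correct, which is the direction of inflation described above.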
Limitations
There were several important limitations to this study that reduce its external validity. First, the study was experimental in nature, and consequently many variables were carefully controlled. For example, speakers were similar in that they all had cerebral palsy. Research suggests that listeners may perform differently on perceptual tasks such as those used in the present study when presented with dysarthria of different etiologies (Klasner & Yorkston, 2005; Liss, Spitzer, Caviness, Adler, & Edwards, 2000). All speakers produced narratives in a sentence-by-sentence fashion following a model. Thus, there were no language formulation requirements of the kind that typically co-occur with the task of speaking to communicate something. In addition, all sentences comprising the narratives were grammatically correct and complete, which is not always the case when speakers spontaneously generate narratives.
Listeners were relatively homogeneous in this study. As a group, they were young and educated, with normal hearing and minimal experience communicating with speakers who had dysarthria. Speakers and their listeners were not engaged in a real communication task in this study. Listeners simply heard speakers producing the target narratives and then answered questions and transcribed what they heard in two discrete tasks. This situation is unlike real communication, when a speaker is talking to a partner and the partner must respond in some way to the speaker.
Another potential limitation relates to the comprehension task. To measure comprehension, listeners answered questions regarding the content of the narrative they heard. This task required listeners to give deliberate thought to aspects of the narrative that they may not otherwise have considered. In addition, the post-perceptual time spent reflecting on comprehension questions, brief as it may have been, was not likely consistent with what listeners actually do when trying to comprehend spoken messages. Thus, results from the present study may be different from those obtained using different methodologies to measure comprehension. Development of clinically usable measurement tools to characterize the information-bearing capability of dysarthric speech should be considered.
Theoretical and Clinical Implications
The process by which listeners derive the intended meaning from a speech signal is complex and multifaceted. Although there is considerable debate in the literature, discourse psychologists agree that input is represented at several levels that interact in various ways between initial perception and comprehension of a message. Results of the present study demonstrated that representations of dysarthric speech at one level of processing (surface code) are not closely related to representations at a higher level of processing (propositional content, situation model; following van Dijk & Kintsch, 1983). Thus, intelligibility measures do not seem to be a good indicator of how well listeners are able to comprehend the intended meaning of a speaker's message. These findings suggest that a comprehensive theory of the impact of dysarthria on communication abilities must move beyond measures of speech intelligibility to address listener processing at multiple levels of representation.
There are important clinical implications for the mismatch between intelligibility and listener comprehension. One implication relates to communicative functioning, a construct that in fact may be quite different from intelligibility. A speaker may sound very impaired and listeners may have difficulty transcribing sentences produced by that speaker, but in situations in which contextual cues and world knowledge are available, the information-bearing capability (i.e., listener's ability to comprehend the message) of that same speech signal may be adequate for the exchange of meaning. Optimal characterization of dysarthric speech should incorporate multiple indices of speech that are specific to the purpose of the measurement. For example, if a clinician wishes to describe the integrity of the speech signal from an acoustic or surface-level perspective, traditional intelligibility measures may be appropriate. However, if a clinician wishes to describe the information-bearing capability of that same speech signal, higher level measures that allow for evaluation of listener comprehension should be employed. Ultimately, the simultaneous use of multiple measures to describe dysarthric speech will permit the development of appropriate compensatory interventions that target the transfer of meaning.
| Acknowledgments |
|---|
Received March 29, 2006
Revision received October 15, 2006
Accepted August 29, 2007
| References |
|---|
Altmann, G. T. (2001). The language machine: Psycholinguistics in review. British Journal of Psychology, 92, 129–170.
Beukelman, D. R., & Yorkston, K. (1979). The relationship between information transfer and speech intelligibility of dysarthric speakers. Journal of Communication Disorders, 12, 189–196.
Butcher, K. R., & Kintsch, W. (2003). Text comprehension and discourse processing. In F. Healy & R. W. Proctor (Eds.), Handbook of psychology: Experimental psychology (Vol. 4, pp. 575–595). New York: Wiley.
Duffy, J. (2005). Motor speech disorders: Substrates, differential diagnosis, and management (2nd ed.). St. Louis, MO: Elsevier Mosby.
Foltz, P. (2003). Quantitative cognitive models of text and discourse processing. In A. C. Graesser & M. A. Gernsbacher (Eds.), Handbook of discourse processes (pp. 487–523). Mahwah, NJ: Erlbaum.
Garcia, J., & Cannito, M. (1996). Influence of verbal and nonverbal contexts on the sentence intelligibility of a speaker with dysarthria. Journal of Speech and Hearing Research, 39, 750–760.
Giolas, T., & Epstein, A. (1963). Comparative intelligibility of word lists and continuous discourse. Journal of Speech and Hearing Research, 6, 349–358.
Graesser, A. C., Millis, K. K., & Zwaan, R. A. (1997). Discourse comprehension. Annual Review of Psychology, 48, 163–189.
Hegde, M. N. (1994). Clinical research in communicative disorders: Principles and strategies (2nd ed.). Austin, TX: Pro-Ed.
Higginbotham, D. J., Drazek, A. L., Kowarsky, K., Scally, C., & Segal, E. (1994). Discourse comprehension of synthetic speech delivered at normal and slow presentation rates. Augmentative and Alternative Communication, 10(3), 191–202.
Hunter, L., Pring, T., & Martin, S. (1991). The use of strategies to increase speech intelligibility in cerebral palsy: An experimental evaluation. British Journal of Disorders of Communication, 26, 163–174.
Hustad, K. C. (2006). A closer look at transcription intelligibility for speakers with dysarthria: Evaluation of scoring paradigms and linguistic errors made by listeners. American Journal of Speech-Language Pathology, 15, 268–277.
Hustad, K., Auker, J., Natale, N., & Carlson, R. (2003). Improving intelligibility of speakers with profound dysarthria and cerebral palsy. Augmentative and Alternative Communication, 19, 187–198.
Hustad, K. C., & Beukelman, D. R. (2001). Effects of linguistic cues and stimulus cohesion on intelligibility of severely dysarthric speech. Journal of Speech, Language, and Hearing Research, 44, 497–510.
Hustad, K. C., & Beukelman, D. R. (2002). Listener comprehension of severely dysarthric speech: Effects of linguistic cues and stimulus cohesion. Journal of Speech, Language, and Hearing Research, 45, 545–558.
Hustad, K. C., & Cahill, M. A. (2003). Effects of presentation mode and repeated familiarization on intelligibility of dysarthric speech. American Journal of Speech-Language Pathology, 12, 198–208.
Hustad, K. C., Jones, T., & Dailey, S. (2003). Implementing speech supplementation strategies: Effects on intelligibility and speech rate of individuals with chronic severe dysarthria. Journal of Speech, Language, and Hearing Research, 46, 462–474.
Kent, R., Weismer, G., Kent, J., & Rosenbek, J. (1989). Toward phonetic intelligibility testing in dysarthria. Journal of Speech and Hearing Disorders, 54, 482–499.
Kintsch, W. (1992). A cognitive architecture for comprehension. In H. J. Pick & P. W. van den Broek (Eds.), Cognition: Conceptual and methodological issues (pp. 143–163). Washington, DC: American Psychological Association.
Kirk, R. (1995). Experimental design: Procedures for the behavioral sciences (3rd ed.). Pacific Grove, CA: Brooks/Cole.
Klasner, E. R., & Yorkston, K. (2005). Speech intelligibility in ALS and HD dysarthria: The everyday listener's perspective. Journal of Medical Speech-Language Pathology, 13, 127–139.
Landis, J. R., & Koch, G. G. (1977). The measurement of observer agreement for categorical data. Biometrics, 33, 159–174.
Liss, J. M., Spitzer, S. M., Caviness, J. N., Adler, C., & Edwards, B. W. (2000). Lexical boundary error analysis in hypokinetic and ataxic dysarthria. Journal of the Acoustical Society of America, 107, 3415–3424.
Siegel, S., & Castellan, N. J. (1988). Non-parametric statistics for the behavioral sciences (2nd ed.). New York: McGraw-Hill.
Singer, M. (2000). Psycholinguistics: Discourse comprehension. In A. E. Kazdin (Ed.), Encyclopedia of psychology (Vol. 6, pp. 269–372). Washington, DC: American Psychological Association.
Sound Forge (Version 4.5) [Computer software]. (1998). Madison, WI: Sonic Foundry.
Tikofsky, R. S., & Tikofsky, R. P. (1964). Intelligibility as a measure of dysarthric speech. Journal of Speech and Hearing Research, 7, 325–333.
Tjaden, K. K., & Liss, J. M. (1995). The role of listener familiarity in the perception of dysarthric speech. Clinical Linguistics & Phonetics, 9(2), 139–154.
van Dijk, T., & Kintsch, W. (1983). Strategies of discourse comprehension. New York: Academic Press.
Weismer, G., & Martin, R. (1992). Acoustic and perceptual approaches to the study of intelligibility. In R. Kent (Ed.), Intelligibility in speech disorders (pp. 67–118). Philadelphia: John Benjamins.
Yorkston, K. M., & Beukelman, D. R. (1981). Assessment of intelligibility of dysarthric speech. Tigard, OR: C.C. Publications.
Yorkston, K. M., Beukelman, D. R., Strand, E. A., & Bell, K. R. (1999). Management of motor speech disorders in children and adults (2nd ed.). Austin, TX: Pro-Ed.
Yorkston, K., Beukelman, D., & Tice, R. (1996). Sentence Intelligibility Test for Macintosh. Lincoln, NE: Communication Disorders Software.
Yorkston, K., Strand, E., & Kennedy, M. (1996). Comprehensibility of dysarthric speech: Implications for assessment and treatment planning. American Journal of Speech-Language Pathology, 5(1), 55–66.
Zwaan, R. A., & Singer, M. (2003). Text comprehension. In A. C. Graesser & M. A. Gernsbacher (Eds.), Handbook of discourse processes (pp. 83–121). Mahwah, NJ: Erlbaum.