Meeting of the Association for Computational Linguistics, 2008
Cognitive theories of dialogue hold that entrainment, the automatic alignment between dialogue partners at many levels of linguistic representation, is key to facilitating both production and comprehension in dialogue. In this paper we examine novel types of entrainment in two corpora: Switchboard and the Columbia Games corpus. We examine entrainment in use of high-frequency words (the most common…
Intonational contours are overloaded, conveying different meanings in different contexts. In this paper we examine two potential uses of the downstepped contours in Standard American English, in the Boston Directions Corpus of read and spontaneous speech. We investigate speakers' use of these contours in conveying discourse topic structure and in signaling given vs. new information and discuss the possible relationship…
We present a corpus study of local discourse relations based on the Penn Discourse Tree Bank, a large manually annotated corpus of explicitly or implicitly realized relations. We show that while there is a large degree of ambiguity in temporal explicit discourse connectives, overall connectives are mostly unambiguous and allow high-accuracy prediction of discourse relation type. We achieve 93.09% accuracy in classifying the explicit relations and 74.74% accuracy overall. In addition, we show that some pairs of relations occur together in text more often than expected by chance. This finding suggests that global sequence classification of the relations in text can lead to better results, especially for implicit relations.
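The idea that most explicit connectives are nearly unambiguous can be pictured with a most-frequent-sense baseline: for each connective, predict whichever relation sense it carried most often in training data. This is a minimal sketch, not the paper's actual method; the training pairs and sense labels below are illustrative toy data, not drawn from the Penn Discourse Tree Bank.

```python
from collections import Counter, defaultdict

# Toy (connective, relation sense) training pairs; illustrative only.
train = [
    ("because", "Contingency.Cause"),
    ("because", "Contingency.Cause"),
    ("but", "Comparison.Contrast"),
    ("since", "Contingency.Cause"),
    ("since", "Temporal.Asynchronous"),
    ("since", "Contingency.Cause"),
]

# Count how often each connective signals each sense.
counts = defaultdict(Counter)
for conn, sense in train:
    counts[conn][sense] += 1

# Keep only the most frequent sense per connective.
most_frequent = {c: senses.most_common(1)[0][0] for c, senses in counts.items()}

def predict(connective):
    """Most-frequent-sense baseline for an explicit connective."""
    return most_frequent.get(connective)

print(predict("since"))  # → Contingency.Cause
```

Ambiguous connectives such as "since" (causal vs. temporal) are exactly where this baseline fails, which is why the temporal class is singled out in the abstract.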
Different summarization requirements could make the writing of a good summary more difficult, or easier. Summary length and the characteristics of the input are such constraints influencing the quality of a potential summary. In this paper we report the results of a quantitative analysis of data from large-scale evaluations of multi-document summarization, empirically confirming this hypothesis. We further show that features measuring the cohesiveness of the input are highly correlated with eventual summary quality and that it is possible to use these as features to predict the difficulty of new, unseen summarization inputs.
To date, few attempts have been made to develop and validate methods for automatic evaluation of linguistic quality in text summarization. We present the first systematic assessment of several diverse classes of metrics designed to capture various aspects of well-written text. We train and test linguistic quality models on consecutive years of NIST evaluation data in order to show the generality of the results. For grammaticality, the best results come from a set of syntactic features. Focus, coherence and referential clarity are best evaluated by a class of features measuring local coherence on the basis of cosine similarity between sentences, coreference information, and summarization-specific features. Our best results are 90% accuracy for pairwise comparisons of competing systems over a test set of several inputs and 70% for ranking summaries of a specific input.
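The cosine-similarity component of these local-coherence features can be sketched as follows: represent each sentence as a bag-of-words vector and average the cosine similarity of adjacent sentence pairs, so that summaries whose neighboring sentences share vocabulary score higher. This is a simplified illustration under bag-of-words assumptions, not the paper's full feature set; the function names and example sentences are invented for the sketch.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counter vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def local_coherence(sentences):
    """Average cosine similarity over adjacent sentence pairs."""
    vecs = [Counter(s.lower().split()) for s in sentences]
    sims = [cosine(vecs[i], vecs[i + 1]) for i in range(len(vecs) - 1)]
    return sum(sims) / len(sims) if sims else 0.0

summary = [
    "the storm hit the coast on monday",
    "the storm caused severe flooding along the coast",
    "officials ordered evacuations in flooded coast areas",
]
print(round(local_coherence(summary), 3))
```

A summary whose adjacent sentences share no vocabulary at all scores 0.0 on this measure, while repeated topical words push the score toward 1.0, which is the intuition behind using it as a focus/coherence proxy.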
We report results on predicting the sense of implicit discourse relations between adjacent sentences in text. Our investigation concentrates on the association between discourse relations and properties of the referring expressions that appear in the related sentences. The properties of interest include coreference information, grammatical role, information status and syntactic form of referring expressions. Predicting the sense of implicit discourse relations based on these features is considerably better than a random baseline, and several of the most discriminative features conform with linguistic intuitions. However, these features do not perform as well as lexical features traditionally used for sense prediction.