DOI: 10.3115/1220175.1220308

Are these documents written from different perspectives?: a test of different perspectives based on statistical distribution divergence

Published: 17 July 2006

Abstract

In this paper we investigate how to automatically determine whether two document collections are written from different perspectives. By perspective we mean a point of view, for example, that of Democrats or Republicans. We propose a test of different perspectives based on the distribution divergence between the statistical models of two collections. Experimental results show that the test can successfully distinguish document collections of different perspectives from other types of collections.
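
To make the idea of distribution divergence between collection models concrete, here is a minimal sketch. It rests on assumptions not stated in the abstract: each collection is modeled as an additively smoothed unigram distribution, and the divergence is the Kullback-Leibler divergence. The function names (`unigram_model`, `kl_divergence`, `collection_divergence`) and the smoothing parameter `alpha` are hypothetical, and the paper's actual models and test procedure may differ.

```python
import math
from collections import Counter


def unigram_model(docs, vocab, alpha=1.0):
    """Additively smoothed unigram distribution over a shared vocabulary.

    Assumption: whitespace tokenization and add-alpha smoothing, chosen
    only for illustration.
    """
    counts = Counter(tok for doc in docs for tok in doc.lower().split())
    total = sum(counts[w] for w in vocab) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) for distributions on the same keys."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)


def collection_divergence(docs_a, docs_b, alpha=1.0):
    """Divergence between the unigram models of two document collections."""
    # Build one vocabulary from both collections so that smoothing gives
    # every word a nonzero probability under both models.
    vocab = sorted({tok for doc in docs_a + docs_b for tok in doc.lower().split()})
    p = unigram_model(docs_a, vocab, alpha)
    q = unigram_model(docs_b, vocab, alpha)
    return kl_divergence(p, q)
```

In this sketch, a collection compared with itself yields a divergence of zero, and collections with different word usage yield a positive value. Deciding whether a given value indicates different perspectives, rather than, say, different topics, would require comparing it against divergences measured on other kinds of collection pairs.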

Published In

ACL-44: Proceedings of the 21st International Conference on Computational Linguistics and the 44th Annual Meeting of the Association for Computational Linguistics
July 2006
1214 pages

Publisher

Association for Computational Linguistics, United States

Acceptance Rates

Overall acceptance rate: 85 of 443 submissions (19%)
