DOI: 10.3115/1219840.1219888

Word sense disambiguation vs. statistical machine translation

Published: 25 June 2005

Abstract

We directly investigate a subject of much recent debate: do word sense disambiguation models help statistical machine translation quality? We present empirical results casting doubt on this common, but unproved, assumption. Using a state-of-the-art Chinese word sense disambiguation model to choose translation candidates for a typical IBM statistical MT system, we find that word sense disambiguation does not yield significantly better translation quality than the statistical machine translation system alone. Error analysis suggests several key factors behind this surprising finding, including inherent limitations of current statistical MT architectures.

Published In

ACL '05: Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
June 2005
657 pages
General Chair: Kevin Knight

Publisher

Association for Computational Linguistics

United States

Acceptance Rates

ACL '05 Paper Acceptance Rate 77 of 423 submissions, 18%;
Overall Acceptance Rate 85 of 443 submissions, 19%
