DOI: 10.1145/2070481.2070514
poster

An active learning scenario for interactive machine translation

Published: 14 November 2011

Abstract

This paper provides the first experimental study of an active learning (AL) scenario for interactive machine translation (IMT). Unlike other IMT implementations, in which user feedback is used only to improve the system's predictions, our IMT implementation also uses that feedback to update the statistical models involved in the translation process. We introduce a sentence sampling strategy that selects the sentences worth translating interactively, and a retraining method that updates the statistical models with the user-validated translations. Both the sampling strategy and the retraining process are designed to work in real time to meet the severe time constraints inherent to the IMT framework. Experiments in a simulated setting showed that the use of AL dramatically reduces the user effort required to obtain translations of a given quality.
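The loop described in the abstract (sample sentences, translate them with the user, retrain on the validated result) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not specify the sampling criterion or the models, so `ToyModel`, its n-gram novelty score, the `active_imt` function, and the `per_block` budget are hypothetical stand-ins, and the "user" is simulated by a plain callable, loosely mirroring the simulated setting mentioned above.

```python
from typing import Callable, List, Set, Tuple


def ngrams(tokens: List[str], n: int) -> Set[Tuple[str, ...]]:
    """Return the set of n-grams of order n in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


class ToyModel:
    """Hypothetical stand-in for the statistical models: it only remembers
    which source n-grams and which validated sentence pairs it has seen."""

    def __init__(self, max_order: int = 2) -> None:
        self.max_order = max_order
        self.seen: Set[Tuple[str, ...]] = set()
        self.memory = {}

    def _grams(self, source: str) -> Set[Tuple[str, ...]]:
        tokens = source.split()
        grams: Set[Tuple[str, ...]] = set()
        for n in range(1, self.max_order + 1):
            grams |= ngrams(tokens, n)
        return grams

    def novelty(self, source: str) -> float:
        """Fraction of the sentence's n-grams never seen before.
        This is an assumed informativeness score, not the paper's."""
        grams = self._grams(source)
        return len(grams - self.seen) / len(grams) if grams else 0.0

    def translate(self, source: str) -> str:
        """Return a remembered translation, or echo the source unchanged."""
        return self.memory.get(source, source)

    def update(self, source: str, target: str) -> None:
        """Incrementally absorb one user-validated sentence pair."""
        self.seen |= self._grams(source)
        self.memory[source] = target


def active_imt(
    blocks: List[List[str]],
    model: ToyModel,
    user: Callable[[str], str],
    per_block: int = 1,
) -> List[Tuple[str, str, bool]]:
    """Process blocks of source sentences. In each block, the per_block
    highest-scoring sentences go to the user and the model is updated with
    the validated pair; the remaining sentences are translated automatically."""
    results = []
    for block in blocks:
        scores = [model.novelty(s) for s in block]
        # Stable sort: ties keep their original order within the block.
        ranked = sorted(range(len(block)), key=lambda i: -scores[i])
        chosen = set(ranked[:per_block])
        for i, source in enumerate(block):
            if i in chosen:
                target = user(source)          # simulated interactive session
                model.update(source, target)   # retrain on validated output
                results.append((source, target, True))
            else:
                results.append((source, model.translate(source), False))
    return results
```

In this toy setting a sentence the user has already validated stops looking novel, so later blocks spend the supervision budget on unseen material instead. A real system would replace the lookup table with incrementally updated translation models.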

Published In

ICMI '11: Proceedings of the 13th international conference on multimodal interfaces
November 2011
432 pages
ISBN:9781450306416
DOI:10.1145/2070481
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. active learning
  2. interactive machine translation

Qualifiers

  • Poster

Conference

ICMI'11
Acceptance Rates

Overall Acceptance Rate 453 of 1,080 submissions, 42%
