DOI: 10.3115/974557.974607

Semi-automatic acquisition of domain-specific translation lexicons

Published: 31 March 1997

Abstract

We investigate the utility of an algorithm for translation lexicon acquisition (SABLE), used previously on a very large corpus to acquire general translation lexicons, when that algorithm is applied to a much smaller corpus to produce candidates for domain-specific translation lexicons.



Published In

ANLC '97: Proceedings of the fifth conference on Applied natural language processing
March 1997
417 pages

Publisher

Association for Computational Linguistics

United States

