Active learning for part-of-speech tagging: accelerating corpus annotation

Authors: Peter McClanahan, Robbie Haertel, Deryle Lonsdale

LAW '07: Proceedings of the Linguistic Annotation Workshop

Pages 101 - 108

Published: 28 June 2007

Abstract

In the construction of a part-of-speech annotated corpus, we are constrained by a fixed budget. A fully annotated corpus is required, but we can afford to label only a subset. We train a Maximum Entropy Markov Model tagger from a labeled subset and automatically tag the remainder. This paper addresses the question of where to focus our manual tagging efforts in order to deliver an annotation of highest quality. In this context, we find that active learning is always helpful. We focus on Query by Uncertainty (QBU) and Query by Committee (QBC) and report on experiments with several baselines and new variations of QBC and QBU, inspired by weaknesses particular to their use in this application. Experiments on English prose and poetry test these approaches and evaluate their robustness. The results allow us to make recommendations for both types of text and raise questions that will lead to further inquiry.
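
The two selection strategies named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of mean per-token entropy as the QBU score, and the use of mean per-token vote entropy as the QBC score are all assumptions made for illustration. The abstract does not specify how the tagger's scores are computed, so the per-token tag distributions and committee taggings are taken here as given inputs.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def qbu_score(tag_dists: List[Dict[str, float]]) -> float:
    """Query by Uncertainty score for one sentence.

    Assumed scoring: the mean entropy of the tagger's predicted tag
    distribution at each token (tag_dists[i] maps tag -> probability).
    """
    if not tag_dists:
        return 0.0
    return sum(entropy(list(d.values())) for d in tag_dists) / len(tag_dists)


def qbc_score(committee_tags: List[List[str]]) -> float:
    """Query by Committee score for one sentence.

    Assumed scoring: mean per-token vote entropy, where
    committee_tags[m][i] is the tag that committee member m gives token i.
    """
    if not committee_tags or not committee_tags[0]:
        return 0.0
    n_members = len(committee_tags)
    n_tokens = len(committee_tags[0])
    total = 0.0
    for i in range(n_tokens):
        votes = Counter(member[i] for member in committee_tags)
        total += entropy([count / n_members for count in votes.values()])
    return total / n_tokens


def select_batch(scores: List[float], k: int) -> List[int]:
    """Indices of the k highest-scoring sentences, highest score first."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]
```

In an annotation loop, the unlabeled sentences would be scored with one of these functions, the top-ranked batch sent to the human annotator, and the tagger retrained on the enlarged labeled set before the next round. Averaging over tokens is only one possible normalization; summing instead would favor longer sentences.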

Published In

LAW '07: Proceedings of the Linguistic Annotation Workshop

June 2007, 210 pages

Program Chairs: Branimir Boguraev (IBM T. J. Watson Research Center), Nancy Ide (Vassar College), Adam Meyers (New York University), Shigeko Nariyama (University of Melbourne), Manfred Stede (University of Potsdam), Janyce Wiebe (University of Pittsburgh), Graham Wilcock (University of Helsinki)

Publisher: Association for Computational Linguistics, United States
