DOI: 10.3115/1218955.1218996

Long-distance dependency resolution in automatically acquired wide-coverage PCFG-based LFG approximations

Published: 21 July 2004

Abstract

This paper shows how finite approximations of long-distance dependency (LDD) resolution can be obtained automatically for wide-coverage, robust, probabilistic Lexical-Functional Grammar (LFG) resources acquired from treebanks. We extract LFG subcategorisation frames and paths linking LDD reentrancies from f-structures generated automatically for the Penn-II treebank trees, and use them in an LDD resolution algorithm to parse new text. Unlike (Collins, 1999; Johnson, 2000), our approach resolves LDDs at f-structure (attribute-value structure representations of basic predicate-argument or dependency structure), without empty productions, traces and coindexation in CFG parse trees. Currently, our best automatically induced grammars achieve an f-score of 80.97% for f-structures when parsing section 23 of the WSJ part of the Penn-II treebank and evaluating against the DCU 105, and 80.24% when evaluating against the PARC 700 Dependency Bank (King et al., 2003), performing at the same or a slightly better level than state-of-the-art hand-crafted grammars (Kaplan et al., 2004).
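
As a rough illustration of what resolving an LDD at f-structure (rather than through traces in the tree) could look like, the following Python sketch links a displaced FOCUS to a gap by combining a finite set of candidate paths with subcategorisation frames. The abstract only states that frames and paths are extracted and then used for resolution; everything else here is an assumption made for illustration: the nested-dict f-structure encoding, the names `follow`, `resolve_ldd` and `GOVERNABLE`, the ranking by the product of path and frame probability, and all probability values, which are invented toy numbers.

```python
from typing import Dict, Optional, Tuple

Path = Tuple[str, ...]
Frame = Tuple[str, ...]

# Grammatical functions a predicate can subcategorise for (illustrative subset).
GOVERNABLE = {"SUBJ", "OBJ", "OBJ2", "OBL", "COMP", "XCOMP"}


def follow(fs: dict, path: Path) -> Optional[Tuple[dict, str]]:
    """Walk every attribute of `path` except the last; return (host, gap attribute)."""
    node = fs
    for attr in path[:-1]:
        node = node.get(attr)
        if not isinstance(node, dict):
            return None  # this path does not exist in the f-structure
    return node, path[-1]


def resolve_ldd(
    fs: dict,
    trigger: str,
    path_probs: Dict[Path, float],
    frame_probs: Dict[str, Dict[Frame, float]],
) -> Optional[Tuple[float, Path]]:
    """Link the value of `trigger` (e.g. FOCUS) to the best-scoring gap, in place."""
    filler = fs.get(trigger)
    if filler is None:
        return None
    best = None
    for path, p_path in path_probs.items():
        hit = follow(fs, path)
        if hit is None:
            continue
        host, gap = hit
        if gap in host:
            continue  # that grammatical function is already filled locally
        local = {a for a in host if a in GOVERNABLE}
        for frame, p_frame in frame_probs.get(host.get("PRED"), {}).items():
            # The frame must license exactly the local arguments plus the gap.
            if set(frame) == local | {gap}:
                score = p_path * p_frame
                if best is None or score > best[0]:
                    best = (score, path, host, gap)
    if best is None:
        return None
    score, path, host, gap = best
    host[gap] = filler  # reentrancy: both attributes share one sub-f-structure
    return score, path


# Toy f-structure for "Which book did Mary say John bought?" (hypothetical data).
fs = {
    "PRED": "say",
    "FOCUS": {"PRED": "book"},
    "SUBJ": {"PRED": "Mary"},
    "COMP": {"PRED": "buy", "SUBJ": {"PRED": "John"}},
}
path_probs = {("OBJ",): 0.3, ("COMP", "OBJ"): 0.2, ("COMP", "SUBJ"): 0.1}
frame_probs = {
    "say": {("SUBJ", "COMP"): 0.6, ("SUBJ", "OBJ"): 0.2},
    "buy": {("SUBJ", "OBJ"): 0.7, ("SUBJ",): 0.1},
}
result = resolve_ldd(fs, "FOCUS", path_probs, frame_probs)
```

In this toy run the path COMP OBJ is selected: the matrix verb has no frame that would license an extra OBJ next to its SUBJ and COMP, and the embedded SUBJ is already filled. No empty node is added to any tree; the resolution is expressed only as a shared value inside the attribute-value structure.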

References

[1] S. Abney. 1997. Stochastic attribute-value grammars. Computational Linguistics, 23(4):597–618.
[2] M. Burke, A. Cahill, R. O'Donovan, J. van Genabith, and A. Way. 2004. The Evaluation of an Automatic Annotation Algorithm against the PARC 700 Dependency Bank. In Proceedings of the Ninth International Conference on LFG, Christchurch, New Zealand (to appear).
[3] A. Cahill, M. McCarthy, J. van Genabith, and A. Way. 2002. Parsing with PCFGs and Automatic F-Structure Annotation. In Miriam Butt and Tracy Holloway King, editors, Proceedings of the Seventh International Conference on LFG, pages 76–95. CSLI Publications, Stanford, CA.
[4] E. Charniak. 1996. Tree-Bank Grammars. In AAAI/IAAI, Vol. 2, pages 1031–1036.
[5] E. Charniak. 1999. A Maximum-Entropy-Inspired Parser. Technical Report CS-99-12, Brown University, Providence, RI.
[6] M. Collins. 1999. Head-Driven Statistical Models for Natural Language Parsing. Ph.D. thesis, University of Pennsylvania, Philadelphia, PA.
[7] M. Dalrymple. 2001. Lexical-Functional Grammar. San Diego, CA; London: Academic Press.
[8] J. Hockenmaier. 2003. Parsing with Generative models of Predicate-Argument Structure. In Proceedings of the 41st Annual Conference of the Association for Computational Linguistics, pages 359–366, Sapporo, Japan.
[9] M. Johnson. 1999. PCFG models of linguistic tree representations. Computational Linguistics, 24(4):613–632.
[10] M. Johnson. 2002. A simple pattern-matching algorithm for recovering empty nodes and their antecedents. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, pages 136–143, Philadelphia, PA.
[11] R. Kaplan and J. Bresnan. 1982. Lexical Functional Grammar, a Formal System for Grammatical Representation. In The Mental Representation of Grammatical Relations, pages 173–281. MIT Press, Cambridge, MA.
[12] R. Kaplan, S. Riezler, T. H. King, J. T. Maxwell, A. Vasserman, and R. Crouch. 2004. Speed and accuracy in shallow and deep stochastic parsing. In Proceedings of the Human Language Technology Conference and the 4th Annual Meeting of the North American Chapter of the Association for Computational Linguistics, pages 97–104, Boston, MA.
[13] T. H. King, R. Crouch, S. Riezler, M. Dalrymple, and R. Kaplan. 2003. The PARC700 dependency bank. In Proceedings of the EACL03: 4th International Workshop on Linguistically Interpreted Corpora (LINC-03), pages 1–8, Budapest.
[14] D. Klein and C. Manning. 2003. Accurate Unlexicalized Parsing. In Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics (ACL '03), pages 423–430, Sapporo, Japan.
[15] C. Macleod, A. Meyers, and R. Grishman. 1994. The COMLEX Syntax Project: The First Year. In Proceedings of the ARPA Workshop on Human Language Technology, pages 669–703, Princeton, NJ.
[16] D. Magerman. 1994. Natural Language Parsing as Statistical Pattern Recognition. Ph.D. thesis, Stanford University, CA.
[17] M. Marcus, G. Kim, M. A. Marcinkiewicz, R. MacIntyre, A. Bies, M. Ferguson, K. Katz, and B. Schasberger. 1994. The Penn Treebank: Annotating Predicate Argument Structure. In Proceedings of the ARPA Workshop on Human Language Technology, pages 110–115, Princeton, NJ.
[18] Y. Miyao, T. Ninomiya, and J. Tsujii. 2003. Probabilistic modeling of argument structures including non-local dependencies. In Proceedings of the Conference on Recent Advances in Natural Language Processing (RANLP), pages 285–291, Borovets, Bulgaria.
[19] R. O'Donovan, M. Burke, A. Cahill, J. van Genabith, and A. Way. 2004. Large-Scale Induction and Evaluation of Lexical Resources from the Penn-II Treebank. In Proceedings of the 42nd Annual Conference of the Association for Computational Linguistics (ACL-04), Barcelona.
[20] S. Riezler, T. H. King, R. Kaplan, R. Crouch, J. T. Maxwell III, and M. Johnson. 2002. Parsing the Wall Street Journal using a Lexical-Functional Grammar and Discriminative Estimation Techniques. In Proceedings of the 40th Annual Conference of the Association for Computational Linguistics (ACL-02), pages 271–278, Philadelphia, PA.
[21] Y. Tateisi, K. Torisawa, Y. Miyao, and J. Tsujii. 1998. Translating the XTAG English Grammar to HPSG. In 4th International Workshop on Tree Adjoining Grammars and Related Frameworks, Philadelphia, PA, pages 172–175.
[22] J. van Genabith and R. Crouch. 1996. Direct and Underspecified Interpretations of LFG f-Structures. In Proceedings of the 16th International Conference on Computational Linguistics (COLING), pages 262–267, Copenhagen.

Published In

ACL '04: Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics
July 2004
729 pages

Publisher

Association for Computational Linguistics

United States

Qualifiers

  • Article

Acceptance Rates

Overall Acceptance Rate 85 of 443 submissions, 19%
