Improving Semantic Parsing with Enriched Synchronous Context-Free Grammars in Statistical Machine Translation

Published: 03 November 2016

Abstract

Semantic parsing maps a natural language sentence into a structured meaning representation. Previous studies show that semantic parsing with synchronous context-free grammars (SCFGs) performs favorably against most alternatives. Motivated by the observation that the performance of SCFG-based semantic parsing is closely tied to its translation rules, this article explores how to obtain translation rules of higher quality and broader coverage in three ways. First, we examine how word alignment for semantic parsing differs from word alignment for statistical machine translation (SMT), to better adapt SMT word alignment to semantic parsing. Second, instead of using an uninformed nonterminal in SCFGs, we introduce structure-informed and syntax-informed nonterminals, which better guide parsing toward well-formed structures. Third, we address the translation of unknown words via synthetic translation rules. In addition, we use a filtering approach that improves performance by predicting the answer type. Evaluation on the standard GeoQuery benchmark dataset shows that our approach greatly outperforms the state of the art across various languages, including English, Chinese, Thai, German, and Greek.
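
To make the idea of labeled nonterminals concrete, the following is a minimal sketch of a synchronous derivation, not the article's implementation. Each rule pairs a natural language side with a meaning representation side, and linked slots such as `[FORM,1]` are expanded on both sides at once. The labels `QUERY`, `FORM`, and `STATE`, the three toy rules, and the helper `derive` are hypothetical and chosen only for illustration; the meaning representation imitates GeoQuery-style functional queries.

```python
import re
from dataclasses import dataclass
from typing import List, Tuple

# A slot looks like "[LABEL,index]" and must appear on both sides of a rule.
SLOT = re.compile(r"^\[(\w+),(\d+)\]$")


@dataclass(frozen=True)
class Rule:
    """One synchronous rule: lhs -> <src tokens, tgt tokens> (hypothetical format)."""
    lhs: str
    src: Tuple[str, ...]
    tgt: Tuple[str, ...]


def derive(rule: Rule, children: list) -> Tuple[List[str], List[str]]:
    """Expand a rule on both sides; children[i] fills the slot with index i + 1.

    Each child is a pair (child_rule, grandchildren). A child whose left-hand
    side does not match the slot label is rejected, which is how labeled
    nonterminals rule out ill-formed structures in this toy setting.
    """
    # Expand every child once so both sides reuse the same sub-derivation.
    expanded = []
    for child_rule, grandchildren in children:
        expanded.append((child_rule.lhs, derive(child_rule, grandchildren)))

    def realize(tokens: Tuple[str, ...], side: int) -> List[str]:
        out: List[str] = []
        for tok in tokens:
            match = SLOT.match(tok)
            if match is None:
                out.append(tok)  # terminal symbol, copied as is
                continue
            label, index = match.group(1), int(match.group(2))
            child_label, child_sides = expanded[index - 1]
            if child_label != label:
                raise ValueError(f"slot {tok} cannot take a {child_label}")
            out.extend(child_sides[side])
        return out

    return realize(rule.src, 0), realize(rule.tgt, 1)


# Toy rules (assumptions, not taken from the article).
QUERY_RULE = Rule("QUERY", ("what", "is", "the", "[FORM,1]"),
                  ("answer", "(", "[FORM,1]", ")"))
FORM_RULE = Rule("FORM", ("capital", "of", "[STATE,1]"),
                 ("capital", "(", "loc_2", "(", "[STATE,1]", ")", ")"))
STATE_RULE = Rule("STATE", ("texas",), ("stateid", "(", "'texas'", ")"))

if __name__ == "__main__":
    src, tgt = derive(QUERY_RULE, [(FORM_RULE, [(STATE_RULE, [])])])
    print(" ".join(src))
    print("".join(tgt))
```

With a single uninformed nonterminal, every slot would accept every sub-derivation; here, plugging `STATE_RULE` directly into the `[FORM,1]` slot raises a `ValueError` instead of producing a malformed query.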

    Published In

    ACM Transactions on Asian and Low-Resource Language Information Processing  Volume 16, Issue 1
    TALLIP Notes and Regular Papers
    March 2017
    133 pages
    ISSN:2375-4699
    EISSN:2375-4702
    DOI:10.1145/2961867
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 03 November 2016
    Accepted: 01 June 2016
    Revised: 01 May 2016
    Received: 01 February 2016
    Published in TALLIP Volume 16, Issue 1

    Author Tags

    1. Semantic parsing
    2. enriched synchronous context-free grammars
    3. statistical machine translation
    4. word alignment

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Funding Sources

    • SUTD
    • National Natural Science Foundation of China

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)7
    • Downloads (Last 6 weeks)2
    Reflects downloads up to 06 Nov 2024

    Other Metrics

    Citations

    Cited By

    View all

    View Options

    Get Access

    Login options

    Full Access

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media