DOI: 10.1145/2983323.2983860
research-article

Regularizing Structured Classifier with Conditional Probabilistic Constraints for Semi-supervised Learning

Published: 24 October 2016

Abstract

Constraints have been shown to be an effective way to incorporate unlabeled data into semi-supervised structured classification. We observe that constraints are often conditional and probabilistic; moreover, the condition of a constraint can depend either on observations alone (which we call an x-type constraint) or even on hidden variables (which we call a y-type constraint). We aim to design a constraint formulation that can flexibly model the constraint probability for both x-type and y-type constraints, and then use it to regularize general structured classifiers for semi-supervised learning. Surprisingly, no existing model offers such a constraint formulation. In this paper, we therefore propose a new conditional probabilistic formulation for modeling both x-type and y-type constraints. We also recognize that y-type constraints complicate inference, and propose a systematic selective evaluation approach to realize the constraints efficiently. Finally, we evaluate our model in three applications (named entity recognition, part-of-speech tagging, and entity information extraction) on nine data sets in total. We show that our model is generally more accurate and efficient than the state-of-the-art baselines. Our code and data are available at https://bitbucket.org/vwz/cikm2016-cpf/.
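
To make the distinction between the two constraint types concrete, the following is a minimal Python sketch. It is not the formulation proposed in the paper: the helper `constraint_probability`, the predicates, the label set, and the toy sentences are all hypothetical and chosen only for illustration. The sketch estimates the empirical probability that a constraint's consequence holds given that its condition holds. An x-type condition reads only the observed tokens, whereas a y-type condition reads a label, which would be hidden on unlabeled data; this is presumably the source of the inference complication mentioned above.

```python
from typing import Callable, Sequence, Tuple

Tokens = Sequence[str]
Labels = Sequence[str]
# A predicate inspects a sentence (tokens and labels) at position i.
Predicate = Callable[[Tokens, Labels, int], bool]


def constraint_probability(
    data: Sequence[Tuple[Tokens, Labels]],
    condition: Predicate,
    consequence: Predicate,
) -> float:
    """Empirical P(consequence | condition) over all positions in `data`."""
    fired = 0      # positions where the condition holds
    satisfied = 0  # positions where condition and consequence both hold
    for tokens, labels in data:
        for i in range(len(tokens)):
            if condition(tokens, labels, i):
                fired += 1
                if consequence(tokens, labels, i):
                    satisfied += 1
    return satisfied / fired if fired else 0.0


# x-type condition (hypothetical): depends only on the observations.
def is_capitalized(tokens: Tokens, labels: Labels, i: int) -> bool:
    return tokens[i][:1].isupper()


# y-type condition (hypothetical): depends on a hidden variable,
# here the label of the previous token.
def prev_is_person(tokens: Tokens, labels: Labels, i: int) -> bool:
    return i > 0 and labels[i - 1] == "PER"


def is_entity(tokens: Tokens, labels: Labels, i: int) -> bool:
    return labels[i] != "O"


def is_person(tokens: Tokens, labels: Labels, i: int) -> bool:
    return labels[i] == "PER"


# Toy labeled data, invented for this example.
toy = [
    (["Alice", "Smith", "visited", "Paris"], ["PER", "PER", "O", "LOC"]),
    (["The", "weather", "was", "fine"], ["O", "O", "O", "O"]),
]

# "A capitalized token is an entity": holds for 3 of 4 capitalized tokens.
p_x = constraint_probability(toy, is_capitalized, is_entity)
# "A token following a PER token is PER": holds for 1 of 2 such tokens.
p_y = constraint_probability(toy, prev_is_person, is_person)
print(p_x, p_y)
```

Neither toy constraint holds with probability one, which illustrates why a constraint may be better treated as probabilistic than as a hard rule.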



      Published In

      CIKM '16: Proceedings of the 25th ACM International Conference on Information and Knowledge Management
      October 2016
      2566 pages
      ISBN:9781450340731
      DOI:10.1145/2983323

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. conditional probabilistic constraint
      2. structured classifier

      Qualifiers

      • Research-article

      Funding Sources

      • Singapore's Agency for Science, Technology and Research (A*STAR)
      • National Natural Science Foundation of China

      Conference

      CIKM'16: ACM Conference on Information and Knowledge Management
      October 24 - 28, 2016
      Indianapolis, Indiana, USA

      Acceptance Rates

      CIKM '16 Paper Acceptance Rate 160 of 701 submissions, 23%;
      Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
