research-article

Regularizing Structured Classifier with Conditional Probabilistic Constraints for Semi-supervised Learning

Authors:

Vincent W. Zheng,

Kevin Chen-Chuan Chang

CIKM '16: Proceedings of the 25th ACM International on Conference on Information and Knowledge Management

Pages 1029 - 1038

https://doi.org/10.1145/2983323.2983860

Published: 24 October 2016

Abstract

Constraints have been shown to be an effective way to incorporate unlabeled data for semi-supervised structured classification. We recognize that constraints are often conditional and probabilistic; moreover, the condition of a constraint can depend either on observations alone (which we call an x-type constraint) or also on hidden variables (which we call a y-type constraint). We wish to design a constraint formulation that can flexibly model the constraint probability for both x-type and y-type constraints, and then use it to regularize general structured classifiers for semi-supervision. Surprisingly, none of the existing models has such a constraint formulation. Thus, in this paper, we propose a new conditional probabilistic formulation for modeling both x-type and y-type constraints. We also recognize the inference complication that y-type constraints introduce, and propose a systematic selective evaluation approach to realize the constraints efficiently. Finally, we evaluate our model in three applications (named entity recognition, part-of-speech tagging, and entity information extraction) on nine data sets in total. We show that our model is generally more accurate and efficient than the state-of-the-art baselines. Our code and data are available at https://bitbucket.org/vwz/cikm2016-cpf/.
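
To make the distinction between the two constraint types concrete, here is a minimal Python sketch. It is not the formulation proposed in the paper: the `Constraint` class, the `satisfaction_rate` helper, the example rules, and the probabilities 0.6 and 0.7 are all hypothetical and chosen only for illustration. The sketch treats a constraint as "when the condition holds at a position, the label there equals a target with some probability", and measures how often that holds on one labeled toy sequence.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Constraint:
    """Hypothetical container, not the paper's data structure."""
    name: str
    # condition(tokens, labels, i) -> True when the constraint applies at position i
    condition: Callable[[Sequence[str], Sequence[str], int], bool]
    target: str   # label expected when the condition holds
    prob: float   # assumed probability that the target holds given the condition


def satisfaction_rate(c: Constraint,
                      tokens: Sequence[str],
                      labels: Sequence[str]) -> Optional[float]:
    """Fraction of positions where the condition fires and the label is the target.

    Returns None when the condition never fires on this sequence.
    """
    fired = [i for i in range(len(tokens)) if c.condition(tokens, labels, i)]
    if not fired:
        return None
    return sum(labels[i] == c.target for i in fired) / len(fired)


# x-type: the condition looks only at the observations (tokens).
x_type = Constraint(
    name="capitalized token -> PER",
    condition=lambda x, y, i: x[i][0].isupper(),
    target="PER",
    prob=0.6,  # illustrative value only
)

# y-type: the condition looks at a hidden variable (the previous label).
y_type = Constraint(
    name="previous label is PER -> PER",
    condition=lambda x, y, i: i > 0 and y[i - 1] == "PER",
    target="PER",
    prob=0.7,  # illustrative value only
)

tokens = ["Kevin", "Chang", "visited", "Singapore"]
labels = ["PER", "PER", "O", "LOC"]

x_rate = satisfaction_rate(x_type, tokens, labels)
y_rate = satisfaction_rate(y_type, tokens, labels)
```

On unlabeled data the `labels` argument is unknown, so the x-type condition can still be checked directly, while the y-type condition can only be evaluated under a model's distribution over labels. That difference is one way to read the inference complication the abstract attributes to y-type constraints.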

Cited By

Lu Y, Chen P, Pian Y, Zheng V (2022). CMKT: Concept Map Driven Knowledge Tracing. IEEE Transactions on Learning Technologies 15:4 (467-480). DOI: 10.1109/TLT.2022.3196355. Online publication date: 1-Aug-2022
https://dl.acm.org/doi/10.1109/TLT.2022.3196355
Zheng V (2018). Engineering graph features via network functional blocks. Proceedings of the 27th International Joint Conference on Artificial Intelligence (5749-5753). DOI: 10.5555/3304652.3304838. Online publication date: 13-Jul-2018
https://dl.acm.org/doi/10.5555/3304652.3304838
Chen P, Lu Y, Zheng V, Pian Y (2018). Prerequisite-Driven Deep Knowledge Tracing. 2018 IEEE International Conference on Data Mining (ICDM) (39-48). DOI: 10.1109/ICDM.2018.00019. Online publication date: Nov-2018
https://doi.org/10.1109/ICDM.2018.00019

Index Terms

Regularizing Structured Classifier with Conditional Probabilistic Constraints for Semi-supervised Learning
1. Computing methodologies
  1. Machine learning
    1. Learning settings
      1. Semi-supervised learning settings
    2. Machine learning algorithms
      1. Regularization

Published In

CIKM '16: Proceedings of the 25th ACM International on Conference on Information and Knowledge Management

October 2016

2566 pages

ISBN:9781450340731

DOI:10.1145/2983323

General Chairs:
Snehasis Mukhopadhyay (Indiana University Purdue University Indianapolis, USA)
ChengXiang Zhai (University of Illinois at Urbana-Champaign, USA)

Program Chairs:
Elisa Bertino (Purdue University)
Fabio Crestani (University of Lugano)
Javed Mostafa (University of North Carolina)
Jie Tang (Tsinghua University)
Luo Si (Alibaba Group Inc & Purdue University)
Xiaofang Zhou (University of Queensland)
Yi Chang (Yahoo Research)
Yunyao Li (IBM Research - Almaden)
Parikshit Sondhi (WalmartLabs)

Copyright © 2016 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

Singapore's Agency for Science, Technology and Research (A*STAR)
National Nature Science Foundation of China

Conference

CIKM'16: ACM Conference on Information and Knowledge Management

October 24 - 28, 2016

Indianapolis, Indiana, USA

Acceptance Rates

CIKM '16 Paper Acceptance Rate 160 of 701 submissions, 23%;

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
