Research article · Open access
DOI: 10.1145/3534678.3539247

Sparse Conditional Hidden Markov Model for Weakly Supervised Named Entity Recognition

Published: 14 August 2022

Abstract

Weakly supervised named entity recognition methods train label models to aggregate the token annotations of multiple noisy labeling functions (LFs) without seeing any manually annotated labels. To work well, the label model needs to contextually identify and emphasize well-performing LFs while down-weighting the under-performers. However, evaluating the LFs is challenging due to the lack of ground truth. To address this issue, we propose the sparse conditional hidden Markov model (Sparse-CHMM). Instead of predicting the entire emission matrix as other HMM-based methods do, Sparse-CHMM focuses on estimating its diagonal elements, which are treated as the reliability scores of the LFs. These sparse scores are then expanded to the full-fledged emission matrix with pre-defined expansion functions. We also augment the emission with weighted XOR scores, which track the probabilities of an LF observing incorrect entities. Sparse-CHMM is optimized through unsupervised learning with a three-stage training pipeline that reduces training difficulty and prevents the model from falling into local optima. Compared with the baselines in the Wrench benchmark, Sparse-CHMM achieves an average F1 score improvement of 3.01 on five comprehensive datasets. Experiments show that each component of Sparse-CHMM is effective and that the estimated LF reliabilities correlate strongly with the true LF F1 scores.
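The core idea described above — estimating only the diagonal of each LF's emission matrix as a reliability score and expanding it to a full matrix — can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's actual expansion functions or XOR augmentation: the function name `expand_emission` and the uniform spreading of off-diagonal mass are assumptions for demonstration only.

```python
import numpy as np

def expand_emission(reliabilities: np.ndarray, n_labels: int) -> np.ndarray:
    """Expand per-LF diagonal reliability scores into full emission matrices.

    Each LF k gets an (n_labels x n_labels) emission matrix whose diagonal
    holds its reliability score r_k (probability of emitting the true label)
    and whose remaining mass (1 - r_k) is, in this simplified sketch, spread
    uniformly over the other labels so every row is a valid distribution.
    """
    n_lfs = reliabilities.shape[0]
    emissions = np.empty((n_lfs, n_labels, n_labels))
    for k, r in enumerate(reliabilities):
        off_diagonal = (1.0 - r) / (n_labels - 1)
        emissions[k] = np.full((n_labels, n_labels), off_diagonal)
        np.fill_diagonal(emissions[k], r)  # reliability on the diagonal
    return emissions

# Example: two LFs (one reliable, one noisy) over three entity labels.
E = expand_emission(np.array([0.9, 0.6]), n_labels=3)
assert np.allclose(E.sum(axis=2), 1.0)  # each row sums to one
```

Because only one scalar per LF (per label) is estimated rather than a dense matrix, the label model has far fewer free parameters to learn without ground truth, which is the motivation the abstract gives for the sparse formulation.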

Supplemental Material

MP4 File
Presentation video for KDD'22 Sparse Conditional Hidden Markov Model for Weakly Supervised Named Entity Recognition


    Published In

    KDD '22: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
    August 2022
    5033 pages
    ISBN:9781450393850
    DOI:10.1145/3534678
    This work is licensed under a Creative Commons Attribution International 4.0 License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. hidden markov model
    2. information extraction
    3. named entity recognition
    4. weak supervision


    Acceptance Rates

    Overall acceptance rate: 1,133 of 8,635 submissions (13%)

