DOI: 10.1145/3583780.3615026
research-article

Mulco: Recognizing Chinese Nested Named Entities through Multiple Scopes

Published: 21 October 2023

Abstract

Nested Named Entity Recognition (NNER), a subarea of Named Entity Recognition, has long challenged researchers. In NNER, one entity may be part of a larger entity, and such nesting can occur at multiple levels. These nested structures prevent traditional sequence labeling methods from properly recognizing all entities. While recent research has focused on designing better recognition methods for NNER in various languages, Chinese Nested Named Entity Recognition (CNNER) is still underdeveloped, largely due to a lack of freely available CNNER benchmarks. To support CNNER research, in this paper we introduce ChiNesE, a CNNER dataset comprising 20,000 sentences drawn from online passages in multiple domains. It contains 117,284 entities in 10 categories, of which 43.8% are nested named entities. Based on ChiNesE, we propose Mulco, a novel method that recognizes named entities in nested structures through multiple scopes. Each scope uses a scope-based sequence labeling method that recognizes a named entity by predicting its anchor and its length. Experimental results show that Mulco outperforms state-of-the-art baseline methods with different recognition schemes on ChiNesE and the ACE 2005 Chinese corpus.
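
The abstract does not spell out how anchor and length predictions become entity spans, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that the anchor is an entity's first character, that each scope emits one length label and one type label per character, and that a length of 0 means no entity is anchored there. The function names, the two-scope setup, the example sentence, and the `LOC`/`ORG` type labels are all hypothetical.

```python
from typing import List, Set, Tuple

# (start, end_exclusive, entity_type); the type names used below are hypothetical.
Span = Tuple[int, int, str]
# One scope's output: a length label and a type label for every character.
ScopeLabels = Tuple[List[int], List[str]]


def decode_scope(lengths: List[int], types: List[str]) -> Set[Span]:
    """Turn one scope's per-character labels into entity spans.

    Assumption: lengths[i] > 0 means an entity is anchored at character i
    and covers lengths[i] characters; 0 means no entity starts there.
    """
    spans: Set[Span] = set()
    for anchor, (length, etype) in enumerate(zip(lengths, types)):
        # Ignore predictions that would run past the end of the sentence.
        if length > 0 and anchor + length <= len(lengths):
            spans.add((anchor, anchor + length, etype))
    return spans


def decode_all(scopes: List[ScopeLabels]) -> Set[Span]:
    """Union the spans from every scope, so nested entities can coexist."""
    result: Set[Span] = set()
    for lengths, types in scopes:
        result |= decode_scope(lengths, types)
    return result


sentence = "北京大学教授"  # six characters

# Hypothetical scope outputs: one scope finds the inner entity,
# another finds the outer entity anchored at the same character.
inner_scope: ScopeLabels = ([2, 0, 0, 0, 0, 0], ["LOC", "O", "O", "O", "O", "O"])
outer_scope: ScopeLabels = ([4, 0, 0, 0, 0, 0], ["ORG", "O", "O", "O", "O", "O"])

entities = decode_all([inner_scope, outer_scope])
for start, end, etype in sorted(entities):
    print(sentence[start:end], etype)
```

A single flat tag sequence can assign only one label per character, so it cannot mark the first character as the start of both entities. Giving each scope its own label sequence is one way to sidestep that limit.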

      Published In

      CIKM '23: Proceedings of the 32nd ACM International Conference on Information and Knowledge Management
      October 2023
      5508 pages
      ISBN: 9798400701245
      DOI: 10.1145/3583780
      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. Chinese nested named entity recognition
      2. datasets
      3. nested named entity recognition
      4. sequence labeling

      Acceptance Rates

      Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
