Citation Intent Classification and Its Supporting Evidence Extraction for Citation Graph Construction

Authors: Hen-Hsen Huang, Hsin-Hsi Chen

CIKM '23: Proceedings of the 32nd ACM International Conference on Information and Knowledge Management

Pages 2472 - 2481

https://doi.org/10.1145/3583780.3614808

Published: 21 October 2023

Abstract

With the significant growth of scientific publications in recent years, an efficient way to extract scholarly knowledge and organize the relationships among the literature is needed. Previous work constructed scientific knowledge graphs from authors, papers, citations, and scientific entities. To help researchers grasp the research context comprehensively, this paper instead constructs a fine-grained citation graph in which the links between citing and cited papers are labeled with citation intents and their supporting evidence. We propose a model that uses a Transformer encoder to encode long papers. To capture the coreference relations among words and sentences in a paper, a coreference graph is created using a Gated Graph Convolution Network (GGCN). We further propose a graph modification mechanism that dynamically updates the coreference links. Experimental results show that our model achieves promising results in identifying multiple citation intents in sentences.
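
The abstract does not spell out the GGCN layer, so the following is only a minimal sketch of what one gated graph convolution step over a coreference graph might look like. It is not the authors' implementation. The function names, the row normalization with self-loops, the gating formula, and the toy weights are all assumptions made for illustration, and the graph modification mechanism is not modeled here.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def normalize_adjacency(adj):
    """Add self-loops, then divide each row by its sum (assumed normalization)."""
    n = len(adj)
    with_loops = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    return [[v / sum(row) for v in row] for row in with_loops]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gated_graph_conv(h, adj, w_msg, w_gate):
    """One illustrative gated graph convolution step.

    h:      node features, n x d (e.g. encoder outputs for words or sentences)
    adj:    0/1 coreference adjacency matrix, n x n
    w_msg:  hypothetical d x d weights for the candidate update
    w_gate: hypothetical d x d weights for the gate
    """
    a_hat = normalize_adjacency(adj)
    agg = matmul(a_hat, h)        # average over each node and its coreferent nodes
    msg = matmul(agg, w_msg)      # candidate update (pre-activation)
    gate = matmul(agg, w_gate)    # gate pre-activations
    out = []
    for h_row, m_row, g_row in zip(h, msg, gate):
        # The gate interpolates between the old feature and the new message.
        out.append([
            (1.0 - sigmoid(g)) * x + sigmoid(g) * math.tanh(m)
            for x, m, g in zip(h_row, m_row, g_row)
        ])
    return out

# Toy usage: nodes 0 and 1 are linked by coreference, node 2 is isolated.
h = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
adj = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
updated = gated_graph_conv(h, adj, identity, identity)
```

In this toy setup the two linked nodes pull information from each other, while the isolated node is updated only from its own features. A learned model would replace the identity matrices with trained weights.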

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Knowledge representation and reasoning
      1. Ontology engineering
2. Information systems
  1. Information retrieval
    1. Retrieval tasks and goals
      1. Information extraction

Published In

CIKM '23: Proceedings of the 32nd ACM International Conference on Information and Knowledge Management

October 2023, 5508 pages

ISBN: 9798400701245

DOI: 10.1145/3583780

General Chairs: Ingo Frommholz (University of Wolverhampton, UK), Frank Hopfgartner (University of Koblenz, Germany), Mark Lee (University of Birmingham, UK), Michael Oakes (University of Birmingham, UK)

Program Chairs: Mounia Lalmas (Spotify, UK), Min Zhang (Tsinghua University, China), Rodrygo Santos (Federal University of Minas Gerais, Brazil)

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

Ministry of Science and Technology, Taiwan

Conference

CIKM '23

CIKM '23: The 32nd ACM International Conference on Information and Knowledge Management

October 21 - 25, 2023

Birmingham, United Kingdom

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
