DOI: 10.1145/3583780.3614808
research-article

Citation Intent Classification and Its Supporting Evidence Extraction for Citation Graph Construction

Published: 21 October 2023

Abstract

With the significant growth of scientific publications in recent years, an efficient way to extract scholarly knowledge and organize the relationships among the literature is needed. Previous work constructed scientific knowledge graphs from authors, papers, citations, and scientific entities. To help researchers grasp the research context comprehensively, this paper instead constructs a fine-grained citation graph in which the links between citing and cited papers are labeled with citation intents and their supporting evidence. We propose a model that uses a Transformer encoder to encode long papers. To capture the coreference relations among words and sentences in a paper, a coreference graph is created using a Gated Graph Convolution Network (GGCN). We further propose a graph modification mechanism that dynamically updates the coreference links. Experimental results show that our model achieves promising results in identifying multiple citation intents in sentences.
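To make the idea of gated propagation over a coreference graph concrete, here is a minimal sketch of one simplified gated neighbor-averaging step in plain Python. It is not the paper's GGCN: the function name `gated_graph_step`, the scalar `gate_weight` and `gate_bias` parameters, and the toy edge list are all assumptions introduced for illustration, and a trained model would use learned weight matrices in place of these scalars.

```python
import math


def sigmoid(x):
    """Logistic function used to squash gate values into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_graph_step(features, edges, gate_weight=1.0, gate_bias=0.0):
    """Run one simplified gated propagation step on an undirected graph.

    features: list of equal-length float vectors, one per node.
    edges: list of (i, j) node-index pairs (hypothetical coreference links).
    gate_weight, gate_bias: illustrative scalars standing in for learned
        gate parameters (an assumption, not taken from the paper).
    """
    n = len(features)
    dim = len(features[0])

    # Build an adjacency list from the undirected edge list.
    neighbors = [[] for _ in range(n)]
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)

    updated = []
    for i in range(n):
        if not neighbors[i]:
            # A node with no links keeps its representation unchanged.
            updated.append(list(features[i]))
            continue
        # Message: mean of the neighbors' feature vectors.
        msg = [
            sum(features[j][d] for j in neighbors[i]) / len(neighbors[i])
            for d in range(dim)
        ]
        # Elementwise gate decides how much of the message replaces
        # the node's current value in each dimension.
        new_vec = []
        for d in range(dim):
            g = sigmoid(gate_weight * msg[d] + gate_bias)
            new_vec.append(g * msg[d] + (1.0 - g) * features[i][d])
        updated.append(new_vec)
    return updated
```

With `gate_weight=0.0` and `gate_bias=0.0` every gate equals 0.5, so each linked node moves halfway toward the mean of its neighbors while unlinked nodes stay as they are. This makes the effect of adding or removing a link easy to inspect by hand.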


Published In

CIKM '23: Proceedings of the 32nd ACM International Conference on Information and Knowledge Management
October 2023
5508 pages
ISBN:9798400701245
DOI:10.1145/3583780

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. citation graph construction
  2. citation intent
  3. intent evidence

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
