Enhancing Keyphrase Extraction from Academic Articles with their Reference Information

Zhang, Chengzhi; Zhao, Lei; Zhao, Mengyuan; Zhang, Yingyi

arXiv:2111.14106 (cs)

[Submitted on 28 Nov 2021 (v1), last revised 30 Nov 2021 (this version, v2)]

Abstract: With the development of Internet technology, information overload is becoming increasingly severe, and users spend considerable time finding the information they need. Keyphrases, which concisely summarize a document's content, help users locate and understand documents quickly. For academic resources, most existing studies extract keyphrases from the title and abstract of a paper. We find that the titles listed in a paper's references also contain author-assigned keyphrases. This article therefore adds reference information to the source text and applies two typical unsupervised extraction methods (TF*IDF and TextRank), two representative traditional supervised learning algorithms (Naïve Bayes and Conditional Random Field), and a supervised deep learning model (BiLSTM-CRF) to analyze how reference information affects keyphrase extraction. The aim is to improve the quality of keyphrase recognition by expanding the source text. The experimental results show that reference information can improve the precision, recall, and F1 of automatic keyphrase extraction to a certain extent. This indicates that reference information is useful for keyphrase extraction from academic papers and offers a new direction for future research on automatic keyphrase extraction.
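The idea of expanding the source text can be illustrated with a minimal TF*IDF sketch. This is not the authors' implementation: the tokenizer, the smoothed IDF formula, the helper names (tokenize, tfidf_scores, top_terms), and the toy texts are all assumptions made for illustration, and the sketch ranks single words rather than full keyphrases.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep alphabetic runs only (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_scores(target_tokens, background_docs):
    """Score each term of the target by term frequency times a smoothed IDF.

    The IDF smoothing, log((1 + N) / (1 + df)) + 1, is an assumption of this
    sketch; the abstract does not say which TF*IDF variant was used.
    """
    n_docs = len(background_docs)
    doc_freq = Counter()
    for doc in background_docs:
        doc_freq.update(set(doc))
    term_freq = Counter(target_tokens)
    total = len(target_tokens)
    return {
        term: (count / total) * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
        for term, count in term_freq.items()
    }


def top_terms(scores, k):
    """Return the k highest-scoring terms, breaking ties alphabetically."""
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]


# Hypothetical toy data, invented for this sketch (not taken from the paper).
title_abstract = "a study of text mining methods for digital libraries"
reference_titles = [
    "keyphrase extraction with conditional random fields",
    "unsupervised keyphrase extraction using graph ranking",
]
background = [
    tokenize("text mining for social media"),
    tokenize("digital libraries and metadata"),
    tokenize("graph ranking of web pages"),
]

# Baseline: title and abstract only.
base_scores = tfidf_scores(tokenize(title_abstract), background)

# Expanded source text: title and abstract plus the reference titles.
expanded_text = " ".join([title_abstract] + reference_titles)
expanded_scores = tfidf_scores(tokenize(expanded_text), background)

print(top_terms(base_scores, 2))
print(top_terms(expanded_scores, 2))
```

In this toy case the word "keyphrase" never appears in the title or abstract, so the baseline cannot rank it at all; once the reference titles are appended it becomes a candidate. A realistic pipeline would also need stop-word filtering and multi-word candidate generation, which are omitted here.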

Subjects:	Information Retrieval (cs.IR); Computation and Language (cs.CL); Digital Libraries (cs.DL)
Cite as:	arXiv:2111.14106 [cs.IR]
	(or arXiv:2111.14106v2 [cs.IR] for this version)
	https://doi.org/10.48550/arXiv.2111.14106

Submission history

From: Chengzhi Zhang
[v1] Sun, 28 Nov 2021 11:14:16 UTC (589 KB)
[v2] Tue, 30 Nov 2021 04:43:39 UTC (587 KB)
