An Attention-based Deep Relevance Model for Few-shot Document Filtering

Published: 06 October 2020

Abstract

With the large quantity of textual information produced on the Internet, it is critical to filter out irrelevant information and organize the rest into categories of interest (e.g., an emerging event). However, supervised document filtering methods rely heavily on a large number of labeled documents for model training. Manually identifying enough positive examples for each category is expensive and time-consuming. It is also unrealistic to cover all the categories of an evolving text source that spans diverse kinds of events, user opinions, and daily life activities. In this article, we propose a novel attention-based deep relevance model for few-shot document filtering (named ADRM), inspired by the relevance feedback methodology proposed for ad hoc retrieval. ADRM calculates the relevance score between a document and a category from a set of seed words and a few seed documents relevant to the category. It constructs a category-specific conceptual representation of the document based on the corresponding seed words and seed documents. Specifically, to filter out irrelevant, noisy information in the seed documents, ADRM employs two types of attention mechanisms (namely, whole-match attention and max-match attention) and generates category-specific representations for them. ADRM then extracts relevance signals by modeling the hidden feature interactions in the word embedding space, using a gated convolutional process, a self-attention layer, and a relevance aggregation layer. Extensive experiments on three real-world datasets show that ADRM consistently outperforms the existing technical alternatives, including conventional classification and retrieval baselines and state-of-the-art deep relevance ranking models for few-shot document filtering. An ablation study demonstrates that each component of ADRM contributes to filtering performance, and further analysis shows that ADRM is robust under varying parameter settings.
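
The abstract names a gated convolutional process but does not give its formulation. As a rough illustration only, the sketch below implements the standard gated linear unit form of a 1D gated convolution, in which each window of word embeddings yields a linear response multiplied by a sigmoid gate. The function name `gated_conv1d`, the flattened filter layout, and the toy values are all assumptions made for this example; they are not taken from the article, and ADRM's actual layer may differ.

```python
import math
from typing import List

Vector = List[float]


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def dot(u: Vector, v: Vector) -> float:
    return sum(a * b for a, b in zip(u, v))


def gated_conv1d(
    seq: List[Vector],        # n word embeddings, each of dimension d
    w_filters: List[Vector],  # m linear filters, each flattened to k * d weights
    v_filters: List[Vector],  # m gate filters, same layout as w_filters
    b: Vector,                # m biases for the linear path
    c: Vector,                # m biases for the gate path
) -> List[Vector]:
    """Gated linear unit convolution (hypothetical layout, illustration only).

    For every window of k consecutive embeddings, feature j is
    (window . w_j + b_j) * sigmoid(window . v_j + c_j).
    Returns n - k + 1 rows with m features each.
    """
    d = len(seq[0])
    k = len(w_filters[0]) // d  # window width implied by the filter size
    out = []
    for i in range(len(seq) - k + 1):
        # Flatten the k embeddings of this window into one vector.
        window = [x for vec in seq[i:i + k] for x in vec]
        row = [
            (dot(window, w) + bj) * sigmoid(dot(window, v) + cj)
            for w, v, bj, cj in zip(w_filters, v_filters, b, c)
        ]
        out.append(row)
    return out


# Toy input: three 2-dimensional "embeddings", one filter of width k = 2.
toy_seq = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_out = gated_conv1d(
    toy_seq,
    w_filters=[[1.0, 1.0, 1.0, 1.0]],
    v_filters=[[0.0, 0.0, 0.0, 0.0]],  # zero gate weights: gate is always 0.5
    b=[0.0],
    c=[0.0],
)
print(toy_out)  # prints [[1.0], [1.5]]
```

With the gate weights set to zero, every gate equals 0.5, so the output is simply half of each window's linear response. In a trained model the gate would instead learn which windows to pass through and which to suppress.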

      Published In

      ACM Transactions on Information Systems  Volume 39, Issue 1
      January 2021
      329 pages
      ISSN:1046-8188
      EISSN:1558-2868
      DOI:10.1145/3423044
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 06 October 2020
      Accepted: 01 August 2020
      Revised: 01 July 2020
      Received: 01 January 2020
      Published in TOIS Volume 39, Issue 1

      Author Tags

      1. Few-shot learning
      2. deep learning
      3. document filtering

      Qualifiers

      • Research-article
      • Research
      • Refereed

      Funding Sources

      • National Natural Science Foundation of China
      • Advance Research Projects of Civil Aerospace Technology, Intelligent Distribution Technology of Domestic Satellite Information
      • CETC Key Laboratory of Aerospace Information Applications
