DOI: 10.1145/564376.564393
Article

Novelty and redundancy detection in adaptive filtering

Published: 11 August 2002

Abstract

This paper addresses the problem of extending an adaptive information filtering system to make decisions about the novelty and redundancy of relevant documents. It argues that relevance and redundancy should each be modelled explicitly and separately. Five redundancy measures are proposed and evaluated in experiments with and without redundancy thresholds. The experimental results demonstrate that the cosine similarity metric and a redundancy measure based on a mixture of language models are both effective for identifying redundant documents.
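
To make the idea of a thresholded redundancy measure concrete, here is a minimal sketch of the cosine similarity case. It is an illustration under stated assumptions, not the paper's implementation: the whitespace tokenizer, the raw term counts, the choice to score a document by its most similar previously delivered document, and the threshold value of 0.8 are all assumptions made for this example. The function names are hypothetical. The input stream is assumed to contain only documents already judged relevant, which keeps the relevance and redundancy decisions separate.

```python
import math
from collections import Counter


def term_vector(text):
    """Map lowercased whitespace tokens to raw term counts (an assumption)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def redundancy_score(doc, delivered):
    """Highest similarity between doc and any previously delivered document."""
    vec = term_vector(doc)
    return max((cosine(vec, term_vector(d)) for d in delivered), default=0.0)


def filter_stream(relevant_docs, threshold=0.8):
    """Deliver each relevant document unless its score exceeds the threshold.

    The threshold of 0.8 is a hypothetical value chosen for illustration.
    """
    delivered = []
    for doc in relevant_docs:
        if redundancy_score(doc, delivered) <= threshold:
            delivered.append(doc)
    return delivered
```

In this sketch a document that repeats an earlier delivered document scores close to 1 and is suppressed, while a document with no shared terms scores 0 and is delivered. In practice the threshold would be tuned on data rather than fixed in advance.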

      Published In

      SIGIR '02: Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval
      August 2002
      478 pages
      ISBN:1581135610
      DOI:10.1145/564376
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Qualifiers

      • Article

      Conference

      SIGIR02
      Acceptance Rates

      SIGIR '02 Paper Acceptance Rate 44 of 219 submissions, 20%;
      Overall Acceptance Rate 792 of 3,983 submissions, 20%
