
Relevance search and anomaly detection in bipartite graphs

Published: 01 December 2005

Abstract

Many real applications can be modeled using bipartite graphs, such as users vs. files in a P2P system, traders vs. stocks in a financial trading system, and conferences vs. authors in a scientific publication network. We introduce two operations on bipartite graphs: 1) identifying similar nodes (relevance search), and 2) finding nodes that connect irrelevant nodes (anomaly detection). We propose algorithms that compute the relevance score of each node using random walk with restarts and graph partitioning, and algorithms that identify anomalies using those relevance scores. We evaluate the quality of relevance search against the semantics of the datasets, and we measure the performance of the anomaly detection algorithm on manually injected anomalies. Experiments on several real datasets confirm both the effectiveness and the efficiency of the methods.
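The two operations named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the restart probability of 0.15, the power-iteration solver, and the toy user/file graph are all assumptions made for illustration, and the graph-partitioning speedup mentioned in the abstract is omitted. Relevance is taken to be the steady-state probability of a random walk that restarts at the query node, and a node's "normality" is taken to be the mean pairwise relevance among its neighbors, so a low value suggests the node connects otherwise irrelevant nodes.

```python
from itertools import combinations


def build_adjacency(edges):
    """Undirected adjacency sets from (left, right) pairs.

    Assumption: node labels are unique across the two sides of the graph.
    """
    adj = {}
    for left, right in edges:
        adj.setdefault(left, set()).add(right)
        adj.setdefault(right, set()).add(left)
    return adj


def rwr_scores(adj, source, restart=0.15, iters=200, tol=1e-12):
    """Relevance of every node to `source` via random walk with restart.

    Power iteration: at each step the walker jumps back to `source` with
    probability `restart`, otherwise it moves to a uniformly chosen neighbor.
    """
    scores = {node: 0.0 for node in adj}
    scores[source] = 1.0
    for _ in range(iters):
        nxt = {node: 0.0 for node in adj}
        for node, mass in scores.items():
            share = (1.0 - restart) * mass / len(adj[node])
            for nbr in adj[node]:
                nxt[nbr] += share
        nxt[source] += restart
        delta = sum(abs(nxt[n] - scores[n]) for n in adj)
        scores = nxt
        if delta < tol:
            break
    return scores


def normality(adj, node, restart=0.15):
    """Mean pairwise relevance among the neighbors of `node`.

    Returns None when the node has fewer than two neighbors (an assumption:
    there is no pair to compare in that case).
    """
    nbrs = sorted(adj[node])
    if len(nbrs) < 2:
        return None
    pair_scores = [rwr_scores(adj, a, restart)[b] for a, b in combinations(nbrs, 2)]
    return sum(pair_scores) / len(pair_scores)


# Hypothetical toy data: two user communities plus one file ("fx")
# that is accessed by users from both communities.
edges = [
    ("u1", "f1"), ("u1", "f2"), ("u2", "f1"), ("u2", "f2"),
    ("u3", "f3"), ("u3", "f4"), ("u4", "f3"), ("u4", "f4"),
    ("u1", "fx"), ("u3", "fx"),
]
adj = build_adjacency(edges)
relevance_to_u1 = rwr_scores(adj, "u1")
print(sorted(relevance_to_u1, key=relevance_to_u1.get, reverse=True))
print(normality(adj, "f1"), normality(adj, "fx"))
```

In this toy graph, `fx` links users from two otherwise separate groups, so its normality comes out lower than that of `f1`, whose users share the same files. Recomputing the walk for every neighbor pair is quadratic in the node's degree; a practical version would cache one score vector per neighbor.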

Published In

ACM SIGKDD Explorations Newsletter  Volume 7, Issue 2
December 2005
152 pages
ISSN:1931-0145
EISSN:1931-0153
DOI:10.1145/1117454

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Article
