DOI: 10.1145/3308558.3313403
research-article

No Place to Hide: Catching Fraudulent Entities in Tensors

Published: 13 May 2019

Abstract

Many approaches detect dense blocks in a tensor of multimodal data to stop fraudulent entities (e.g., accounts, links) from retweet boosting, hashtag hijacking, link advertising, and similar abuse. However, no existing method is effective at finding a dense block that has high density on only a subset of the tensor's dimensions. In this paper, we recast dense-block detection as dense-subgraph mining by modeling a tensor as a weighted graph without losing any density information. On this weighted graph, which we call the information sharing graph (ISG), we propose D-Spot, an algorithm for finding multiple densest subgraphs that is up to 11x faster than the state-of-the-art algorithm and can be computed in parallel. In an N-dimensional tensor, the density of the entity group found by ISG+D-Spot is at least 1/2 of the optimum, compared with the 1/N guarantee of competing methods. Experiments on nine datasets show that ISG+D-Spot is the new state-of-the-art dense-block detection method in terms of accuracy, specifically for fraud detection.
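
To make the idea of dense-subgraph mining on a weighted graph concrete, here is a minimal sketch of the classic greedy peeling heuristic for the densest subgraph (Charikar, 2000). It is not D-Spot and it does not construct an ISG. The function name `greedy_densest_subgraph`, the density measure (total edge weight divided by node count), and the toy edge weights are all assumptions made for illustration. For this density measure, greedy peeling is known to return a subgraph whose density is at least half the optimum.

```python
import heapq
import itertools
from collections import defaultdict


def greedy_densest_subgraph(edges):
    """Greedy peeling on a weighted undirected graph (illustrative only).

    edges: iterable of (u, v, weight) with u != v and weight > 0.
    Density of a node set S is (sum of edge weights inside S) / |S|.
    Returns (best_nodes, best_density).
    """
    adj = defaultdict(dict)
    for u, v, w in edges:
        adj[u][v] = adj[u].get(v, 0.0) + w
        adj[v][u] = adj[v].get(u, 0.0) + w

    # Weighted degree of each node within the current (shrinking) graph.
    degree = {u: sum(nbrs.values()) for u, nbrs in adj.items()}
    total = sum(degree.values()) / 2.0  # total edge weight still present
    alive = set(adj)

    counter = itertools.count()  # tie-breaker so nodes are never compared
    heap = [(d, next(counter), u) for u, d in degree.items()]
    heapq.heapify(heap)

    best_density = total / len(alive)
    removed = []
    best_prefix = 0  # how many removals give the best density seen so far

    while len(alive) > 1:
        d, _, u = heapq.heappop(heap)
        if u not in alive or d != degree[u]:
            continue  # stale heap entry, skip it
        alive.remove(u)
        removed.append(u)
        total -= degree[u]
        for v, w in adj[u].items():
            if v in alive:
                degree[v] -= w
                heapq.heappush(heap, (degree[v], next(counter), v))
        density = total / len(alive)
        if density > best_density:
            best_density, best_prefix = density, len(removed)

    best_nodes = set(adj) - set(removed[:best_prefix])
    return best_nodes, best_density


# Toy graph (hypothetical weights): four tightly linked accounts a-d
# plus a sparse chain of ordinary accounts x, y, z.
toy_edges = [(u, v, 3.0) for u, v in itertools.combinations("abcd", 2)]
toy_edges += [("a", "x", 1.0), ("x", "y", 1.0), ("y", "z", 1.0)]
nodes, density = greedy_densest_subgraph(toy_edges)
print(sorted(nodes), density)
```

On this toy input the peeling removes the sparse chain first and keeps the four heavily connected nodes. A method that finds multiple dense subgraphs, as the abstract describes for D-Spot, would need to go beyond this single-subgraph sketch.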


Published In

WWW '19: The World Wide Web Conference
May 2019
3620 pages
ISBN:9781450366748
DOI:10.1145/3308558
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

In-Cooperation

  • IW3C2: International World Wide Web Conference Committee

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Dense-block Detection
  2. Fraud Detection
  3. Graph Algorithms

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

WWW '19: The Web Conference
May 13-17, 2019
San Francisco, CA, USA

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%
