No Place to Hide: Catching Fraudulent Entities in Tensors

Authors:

Xu Wei

WWW '19: The World Wide Web Conference

Pages 83 - 93

https://doi.org/10.1145/3308558.3313403

Published: 13 May 2019

Abstract

Many approaches detect dense blocks in tensors of multimodal data to stop fraudulent entities (e.g., accounts, links) from retweet boosting, hashtag hijacking, link advertising, and similar abuse. However, no existing method effectively finds a dense block that has high density on only a subset of the tensor's dimensions. In this paper, we equate dense-block detection with dense-subgraph mining by modeling a tensor as a weighted graph without losing any density information. On this weighted graph, which we call the information sharing graph (ISG), we propose D-Spot, an algorithm for finding multiple densest subgraphs that is up to 11x faster than the state-of-the-art algorithm and can be computed in parallel. In an N-dimensional tensor, the entity group found by ISG+D-Spot has density at least 1/2 of the optimum, compared with the 1/N guarantee of competing methods. Experiments on nine datasets show that ISG+D-Spot is the new state-of-the-art dense-block detection method in terms of accuracy, specifically for fraud detection.
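As a rough illustration of the graph view described in the abstract, the sketch below links entities that share attribute values and then searches the resulting weighted graph for a dense subgraph. It is not the paper's ISG construction or the D-Spot algorithm: the edge weights are plain co-occurrence counts (an assumption of this sketch), and the search is the classic greedy peeling heuristic for densest subgraphs, which is known to return a subgraph whose density (total edge weight divided by node count) is at least 1/2 of the optimum. The function names `build_sharing_graph` and `greedy_densest_subgraph` and the sample records are hypothetical.

```python
import heapq
from collections import defaultdict
from itertools import combinations


def build_sharing_graph(records):
    """Toy stand-in for a sharing graph: link two users per shared value.

    `records` is an iterable of (user, attribute_value) pairs. The edge weight
    is a plain co-occurrence count, which is an assumption of this sketch.
    """
    users_by_value = defaultdict(set)
    for user, value in records:
        users_by_value[value].add(user)

    graph = defaultdict(lambda: defaultdict(float))
    for users in users_by_value.values():
        for u, v in combinations(sorted(users), 2):
            graph[u][v] += 1.0
            graph[v][u] += 1.0
    return graph


def greedy_densest_subgraph(graph):
    """Greedy peeling: repeatedly drop the node of minimum weighted degree.

    Density is (total edge weight) / (number of nodes). Returns the densest
    node set seen during peeling together with its density.
    """
    degree = {u: sum(nbrs.values()) for u, nbrs in graph.items()}
    alive = set(degree)
    total_weight = sum(degree.values()) / 2.0
    heap = [(d, u) for u, d in degree.items()]
    heapq.heapify(heap)

    best_size = len(alive)
    best_density = total_weight / len(alive) if alive else 0.0
    removal_order = []
    while alive:
        d, u = heapq.heappop(heap)
        if u not in alive or d != degree[u]:
            continue  # stale heap entry left over from an earlier update
        alive.remove(u)
        removal_order.append(u)
        total_weight -= degree[u]  # weight of edges from u to surviving nodes
        for v, w in graph[u].items():
            if v in alive:
                degree[v] -= w
                heapq.heappush(heap, (degree[v], v))
        if alive and total_weight / len(alive) > best_density:
            best_density = total_weight / len(alive)
            best_size = len(alive)

    # The best node set is whatever was still alive at that point,
    # i.e. the last `best_size` nodes to be removed.
    best_nodes = set(removal_order[len(removal_order) - best_size:])
    return best_nodes, best_density
```

With four hypothetical accounts that all share the same three attribute values and a few lightly connected ordinary accounts, the peeling step isolates the four tightly linked accounts. Greedy peeling was chosen here only because it is short and carries a well-known 1/2 guarantee; it finds a single subgraph, whereas the abstract describes D-Spot as finding multiple densest subgraphs.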



Published In

WWW '19: The World Wide Web Conference

May 2019

3620 pages

ISBN: 9781450366748

DOI: 10.1145/3308558

Editors: Ling Liu (Georgia Tech, USA), Ryen White (Microsoft Research, USA)

Copyright © 2019 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

In-Cooperation

IW3C2: International World Wide Web Conference Committee

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

Research-article
Research
Refereed limited

Conference

WWW '19: The Web Conference

May 13 - 17, 2019

San Francisco, CA, USA

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

Cited By

Xiao F, Cai S, Chen G, Jagadish H, Ooi B, Zhang M, Baeza-Yates R, Bonchi F. (2024). VecAug: Unveiling Camouflaged Frauds with Cohort Augmentation for Enhanced Detection. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 6025-6036. Online publication date: 25-Aug-2024.
https://dl.acm.org/doi/10.1145/3637528.3671527
Xiao F, Wu Y, Zhang M, Chen G, Ooi B. (2023). MINT: Detecting Fraudulent Behaviors from Time-Series Relational Data. Proceedings of the VLDB Endowment 16:12, 3610-3623. Online publication date: 1-Aug-2023.
https://dl.acm.org/doi/10.14778/3611540.3611551
Feng W, Liu S, Cheng X. (2023). Hierarchical Dense Pattern Detection in Tensors. ACM Transactions on Knowledge Discovery from Data 17:6, 1-29. Online publication date: 28-Feb-2023.
https://dl.acm.org/doi/10.1145/3577022
Ji Y, Zhang Z, Tang X, Shen J, Zhang X, Yang G, Zhang A, Rangwala H. (2022). Detecting Cash-out Users via Dense Subgraphs. Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 687-697. Online publication date: 14-Aug-2022.
https://dl.acm.org/doi/10.1145/3534678.3539252
Wang C, Chai S, Zhu H, Jiang C. (2022). CAeSaR: An Online Payment Anti-Fraud Integration System With Decision Explainability. IEEE Transactions on Dependable and Secure Computing, 1-14. Online publication date: 2022.
https://doi.org/10.1109/TDSC.2022.3186733
Wu J, He J, Zhu F, Chin Ooi B, Miao C, Wang H, Skrypnyk I, Hsu W, Chawla S. (2021). Indirect Invisible Poisoning Attacks on Domain Adaptation. Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, 1852-1862. Online publication date: 14-Aug-2021.
https://dl.acm.org/doi/10.1145/3447548.3467214
Ban Y, He J. (2021). Local Clustering in Contextual Multi-Armed Bandits. Proceedings of the Web Conference 2021, 2335-2346. Online publication date: 19-Apr-2021.
https://dl.acm.org/doi/10.1145/3442381.3450058
Jing B, Park C, Tong H. (2021). HDMI: High-order Deep Multiplex Infomax. Proceedings of the Web Conference 2021, 2414-2424. Online publication date: 19-Apr-2021.
https://dl.acm.org/doi/10.1145/3442381.3449971
Chen D, Du Y, Xu S, Sun Y, Huang H, Gao G. (2021). Online Anomalous Taxi Trajectory Detection Based on Multidimensional Criteria. 2021 International Joint Conference on Neural Networks (IJCNN), 1-8. Online publication date: 18-Jul-2021.
https://doi.org/10.1109/IJCNN52387.2021.9533443
Rozenshtein P, Preti G, Gionis A, Velegrakis Y. (2021). Mining Dense Subgraphs with Similar Edges. Machine Learning and Knowledge Discovery in Databases, 20-36. Online publication date: 25-Feb-2021.
https://doi.org/10.1007/978-3-030-67664-3_2
