research-article

CallMine: Fraud Detection and Visualization of Million-Scale Call Graphs

Authors: Mirela Cazzolato, Saranya Vijayakumar, Meng-Chieh Lee, Catalina Vajiac, Agma J.M. Traina, Christos Faloutsos

CIKM '23: Proceedings of the 32nd ACM International Conference on Information and Knowledge Management

Pages 4509 - 4515

https://doi.org/10.1145/3583780.3614662

Published: 21 October 2023

Abstract

Given a million-scale who-calls-whom dataset with imperfect labels, how can we detect existing and new fraud patterns? We propose CallMine, with carefully designed features and visualizations. CallMine has the following properties: (a) Scalable: linear in the input size, handling about 35 million records in around one hour on a stock laptop; (b) Effective: allowing natural interaction with human analysts; (c) Flexible: applicable in both supervised and unsupervised settings; (d) Automatic: requiring no user-defined parameters.

On a real-world, multi-million-scale dataset, CallMine detected fraudsters 7,000x faster than human experts: in a matter of hours, whereas the experts took over 10 months.
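
The figures quoted in the abstract can be sanity-checked with simple arithmetic. The sketch below is not from the paper. It assumes that "10 months" means wall-clock time with 30-day months and 24-hour days, and that "around one hour" is exactly 3,600 seconds. The function names are hypothetical. Under those assumptions, a 7,000x speedup over 10 months comes out at roughly one hour, which is at least consistent with the reported runtime on 35 million records.

```python
# Back-of-the-envelope check of the figures quoted in the abstract.
# Assumptions (NOT from the paper): 30-day months, 24-hour days,
# and "around one hour" taken as exactly 3,600 seconds.

HOURS_PER_MONTH = 30 * 24  # assumed length of a calendar month in hours


def implied_hours(human_months: float, speedup: float) -> float:
    """Hours implied by dividing the human effort by the claimed speedup."""
    return human_months * HOURS_PER_MONTH / speedup


def records_per_second(records: int, seconds: float) -> float:
    """Average throughput implied by a record count and a runtime."""
    return records / seconds


callmine_hours = implied_hours(human_months=10, speedup=7_000)
throughput = records_per_second(35_000_000, 3_600)

print(f"implied CallMine time: {callmine_hours:.2f} hours")
print(f"implied throughput: {throughput:,.0f} records/second")
```

Because the abstract says the experts took "over" 10 months and the run took "around" one hour, these values are order-of-magnitude estimates only.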

CIKM-ARP Categories: Application; Analytics and machine learning; Data presentation.

Index Terms

1. Applied computing
2. Computing methodologies

Published In

CIKM '23: Proceedings of the 32nd ACM International Conference on Information and Knowledge Management

October 2023

5508 pages

ISBN: 9798400701245

DOI: 10.1145/3583780

General Chairs: Ingo Frommholz (University of Wolverhampton, UK), Frank Hopfgartner (University of Koblenz, Germany), Mark Lee (University of Birmingham, UK), Michael Oakes (University of Birmingham, UK)

Program Chairs: Mounia Lalmas (Spotify, UK), Min Zhang (Tsinghua University, China), Rodrygo Santos (Federal University of Minas Gerais, Brazil)

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

Air Force Research Laboratory (AFRL), the Office of Naval Research (ONR) and the Army Research Office (ARO)
Portuguese Foundation for Science and Technology - FCT under CMU Portugal
Sao Paulo Research Foundation - FAPESP
Coordenacao de Aperfeicoamento de Pessoal de Nivel Superior - Brasil (CAPES)
AIDA project - Adaptive, Intelligent and Distributed Assurance Platform
National Science Foundation Graduate Research
Pennsylvania Infrastructure Technology Alliance - PITA award
European Regional Development Fund - ERDF through the Operational Program for Competitiveness and Internationalisation - COMPETE 2020
National Council for Scientific and Technological Development (CNPq)

Conference

CIKM '23: The 32nd ACM International Conference on Information and Knowledge Management

October 21 - 25, 2023

Birmingham, United Kingdom

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
