DOI: 10.1145/3313831.3376140
research-article

FDHelper: Assist Unsupervised Fraud Detection Experts with Interactive Feature Selection and Evaluation

Published: 23 April 2020

Abstract

Online fraud is the well-known dark side of the modern Internet. Unsupervised fraud detection algorithms are widely used to address this problem. However, selecting features, adjusting hyperparameters, evaluating the algorithms, and eliminating false positives all require human expert involvement. In this work, we design and implement an end-to-end interactive visualization system, FDHelper, based on a deep understanding of the mechanisms of the black market and of fraud detection algorithms. We identify a workflow based on the experience of both fraud detection algorithm experts and domain experts. Using a multi-granularity, three-layer visualization map that embeds an entropy-based distance metric, ColDis, analysts can interactively select different feature sets, refine fraud detection algorithms, tune parameters, and evaluate the detection results in near real time. We demonstrate the effectiveness and significance of FDHelper through two case studies with state-of-the-art fraud detection algorithms, interviews with domain experts and algorithm experts, and a user study with eight first-time end users.

Supplementary Material

MP4 File (paper013pv.mp4)
Preview video
MP4 File (pn1122vf.mp4)
Supplemental video

Published In

CHI '20: Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems
April 2020
10688 pages
ISBN: 9781450367080
DOI: 10.1145/3313831

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. fraud detection
  2. human computer interaction
  3. visualization

Qualifiers

  • Research-article

Conference

CHI '20
Acceptance Rates

Overall Acceptance Rate 6,199 of 26,314 submissions, 24%
