FDHelper: Assist Unsupervised Fraud Detection Experts with Interactive Feature Selection and Evaluation

Authors: Zhongping Zhang, Wei Xu

CHI '20: Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems

Pages 1 - 12

https://doi.org/10.1145/3313831.3376140

Published: 23 April 2020

Abstract

Online fraud is the well-known dark side of the modern Internet. Unsupervised fraud detection algorithms are widely used to address this problem. However, selecting features, adjusting hyperparameters, evaluating the algorithms, and eliminating false positives all require human expert involvement. In this work, we design and implement an end-to-end interactive visualization system, FDHelper, based on a deep understanding of the mechanisms of the black market and of fraud detection algorithms. We identify a workflow based on experience from both fraud detection algorithm experts and domain experts. Using a multi-granularity, three-layer visualization map that embeds an entropy-based distance metric, ColDis, analysts can interactively select different feature sets, refine fraud detection algorithms, tune parameters, and evaluate the detection results in near real-time. We demonstrate the effectiveness and significance of FDHelper through two case studies with state-of-the-art fraud detection algorithms, interviews with domain experts and algorithm experts, and a user study with eight first-time end users.
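
The abstract names ColDis only as an entropy-based distance metric and does not define it. As a rough illustration of what an entropy-based distance between two feature columns can look like, the sketch below computes the normalized variation of information between two categorical columns. This is a stand-in technique chosen for illustration, not the ColDis definition from the paper, and the function names `entropy` and `column_distance` are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy, in bits, of a sequence of hashable values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def column_distance(col_a, col_b):
    """Normalized variation of information between two equal-length columns.

    ASSUMPTION: this is a generic entropy-based distance used only for
    illustration; it is not claimed to be the paper's ColDis metric.
    Returns a value in [0, 1].
    """
    if len(col_a) != len(col_b):
        raise ValueError("columns must have the same number of rows")
    h_a = entropy(col_a)
    h_b = entropy(col_b)
    # Joint entropy: treat each row's pair of values as one symbol.
    h_ab = entropy(list(zip(col_a, col_b)))
    if h_ab == 0:
        # Both columns are constant (or empty), so they carry no information.
        return 0.0
    # VI(A, B) = 2 * H(A, B) - H(A) - H(B), normalized by H(A, B).
    return (2 * h_ab - h_a - h_b) / h_ab


if __name__ == "__main__":
    device = ["a", "a", "b", "b"]
    region = ["x", "x", "y", "y"]   # determined by `device`
    hour = ["1", "2", "1", "2"]     # independent of `device` in this sample
    print(column_distance(device, region))
    print(column_distance(device, hour))
```

Under this stand-in metric, a distance near 0 means the two columns largely determine each other (one is redundant as a feature), while a distance near 1 means they share almost no information in the sample.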

Supplementary Material

paper013pv.mp4: Preview video (MP4, 2.36 MB)

pn1122vf.mp4: Supplemental video (MP4, 22.38 MB)

Index Terms

FDHelper: Assist Unsupervised Fraud Detection Experts with Interactive Feature Selection and Evaluation
1. Human-centered computing
  1. Human computer interaction (HCI)
    1. Interactive systems and tools
      1. User interface toolkits
  2. Visualization
2. Information systems
  1. Information systems applications
    1. Data mining

Published In

CHI '20: Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems

April 2020

10688 pages

ISBN:9781450367080

DOI:10.1145/3313831

General Chairs:
Regina Bernhaupt (Eindhoven University of Technology, Netherlands)
Florian 'Floyd' Mueller (Monash University, Australia)
David Verweij (Newcastle University, UK)
Josh Andres (RMIT, Australia)

Program Chairs:
Joanna McGrenere (University of British Columbia, Canada)
Andy Cockburn (University of Canterbury, New Zealand)
Ignacio Avellino (University of Maryland Baltimore County, USA)
Alix Goguey (Grenoble Alpes University, France)
Pernille Bjørn (University of Copenhagen, Denmark)
Shengdong (Shen) Zhao (National University of Singapore, Singapore)
Briane Paul Samson (Future University Hakodate, Japan & De La Salle University, Philippines)
Rafal Kocielnik (University of Washington, USA)

Copyright © 2020 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGCHI: ACM Special Interest Group on Computer-Human Interaction

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

National Natural Science Foundation of China
Tsinghua Initiative Research Program Grant
Fundamental Research Funds for the Central Universities
gift funds from Huawei, Ant Financial and Nanjing Turing AI Institute

Conference

CHI '20: CHI Conference on Human Factors in Computing Systems

April 25 - 30, 2020

Honolulu, HI, USA

Sponsor: SIGCHI

Acceptance Rates

Overall Acceptance Rate 6,199 of 26,314 submissions, 24%

Article Metrics

Total Citations: 4
Total Downloads: 865
Downloads (Last 12 months): 66
Downloads (Last 6 weeks): 10

Reflects downloads up to 14 Sep 2024

Cited By

Wani A, Joshi I, Singh P. (2024). Navigating the Job-Seeking Journey: Challenges and Opportunities for Digital Employment Support in Kashmir. Proceedings of the ACM on Human-Computer Interaction 8:CSCW1 (1-28). Online publication date: 26-Apr-2024. https://dl.acm.org/doi/10.1145/3637375

Li M, Liu Y, Zheng Q, Li X, Qin W. (2023). A Classification Method for Imbalanced Data Based on Ant Lion Optimizer. Data Mining and Big Data (367-382). Online publication date: 20-Jan-2023. https://doi.org/10.1007/978-981-19-9297-1_26

Vorobyev I, Krivitskaya A. (2022). Reducing false positives in bank anti-fraud systems based on rule induction in distributed tree-based models. Computers and Security 120:C. Online publication date: 1-Sep-2022. https://dl.acm.org/doi/10.1016/j.cose.2022.102786

Sperrle F, El‐Assady M, Guo G, Borgo R, Chau D, Endert A, Keim D. (2021). A Survey of Human‐Centered Evaluations in Human‐Centered Machine Learning. Computer Graphics Forum 40:3 (543-568). Online publication date: 29-Jun-2021. https://doi.org/10.1111/cgf.14329
