DOI: 10.1145/3289600.3291010
research-article

CoReRank: Ranking to Detect Users Involved in Blackmarket-Based Collusive Retweeting Activities

Published: 30 January 2019

Abstract

Twitter's popularity has fostered the emergence of various illegal user activities. One such activity is artificially bolstering the visibility of tweets by gaining a large number of retweets within a short time span. Gaining visibility naturally is time-consuming, so users who want quick visibility for their tweets look for shortcuts. One such shortcut is to approach blackmarket services and gain retweets for their own tweets by retweeting other customers' tweets. These users thus intrinsically become part of a collusive ecosystem controlled by these services. In this paper, we propose CoReRank, an unsupervised framework that simultaneously detects collusive users (who are involved in producing artificial retweets) and suspicious tweets (which are submitted to the blackmarket services). CoReRank leverages the retweeting (or quoting) patterns of users and measures two scores: the 'credibility' of a user and the 'merit' of a tweet. We propose a set of axioms to derive the interdependency between these two scores, and update them recursively. The formulation is further extended to handle the cold start problem. CoReRank is guaranteed to converge in a finite number of iterations and has linear time complexity. We also propose a semi-supervised version of CoReRank (called CoReRank+) that leverages a partial ground-truth labeling of users and tweets. Extensive experiments on a novel dataset that we collected and annotated show the superiority of CoReRank over six baselines. In detecting collusive (genuine) users, CoReRank beats the best unsupervised baseline by 269% (20%) in relative average precision and by 300% (22.22%) in relative average recall. CoReRank+ beats the best supervised baseline by 33.18% in AUC. CoReRank also detects suspicious tweets with an average precision of 0.85 and an average recall of 0.60.
To our knowledge, CoReRank is the first unsupervised method to detect collusive users and suspicious tweets simultaneously with theoretical guarantees.
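
The idea of two interdependent scores updated recursively can be illustrated with a toy sketch. The abstract does not state the axioms or the update formulas, so everything below is an assumption made for illustration only: the function name `mutual_scores`, the plain averaging rule, the neutral starting value of 0.5, and the use of a few fixed tweet scores (loosely in the spirit of the partial labeling mentioned for CoReRank+). It is not the authors' formulation and carries none of its guarantees.

```python
from collections import defaultdict


def mutual_scores(retweets, known_merit, n_iter=50, tol=1e-9):
    """Toy mutual-reinforcement update (hypothetical, not the paper's rule).

    retweets: iterable of (user, tweet) pairs.
    known_merit: dict mapping some tweets to a fixed merit in [0, 1].
    Returns (credibility, merit) dictionaries with values in [0, 1].
    """
    tweets_of = defaultdict(set)  # user -> tweets the user retweeted
    users_of = defaultdict(set)   # tweet -> users who retweeted it
    for user, tweet in retweets:
        tweets_of[user].add(tweet)
        users_of[tweet].add(user)

    # Assumed neutral starting point for every unlabeled score.
    credibility = {u: 0.5 for u in tweets_of}
    merit = {t: known_merit.get(t, 0.5) for t in users_of}

    for _ in range(n_iter):
        # A user's credibility is the mean merit of the tweets they retweeted.
        new_cred = {
            u: sum(merit[t] for t in ts) / len(ts)
            for u, ts in tweets_of.items()
        }
        # A tweet's merit is the mean credibility of its retweeters,
        # unless its merit is fixed by a known label.
        new_merit = {
            t: known_merit[t] if t in known_merit
            else sum(new_cred[u] for u in us) / len(us)
            for t, us in users_of.items()
        }
        # Largest change in any score during this sweep.
        delta = max(
            max(abs(new_cred[u] - credibility[u]) for u in new_cred),
            max(abs(new_merit[t] - merit[t]) for t in new_merit),
        )
        credibility, merit = new_cred, new_merit
        if delta < tol:
            break
    return credibility, merit


# Made-up example: t1 is labeled suspicious (0.0), t4 is labeled genuine (1.0).
example_retweets = [
    ("alice", "t1"), ("alice", "t2"),
    ("bob", "t2"), ("bob", "t3"),
    ("carol", "t4"),
]
cred, merit = mutual_scores(example_retweets, {"t1": 0.0, "t4": 1.0})
```

In this made-up example, the user who retweeted the tweet labeled suspicious ends up with lower credibility than the user who only retweeted the tweet labeled genuine. Each sweep touches every (user, tweet) pair once, so the cost per iteration is linear in the number of retweet edges.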



      Published In

      WSDM '19: Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining
      January 2019
      874 pages
      ISBN:9781450359405
      DOI:10.1145/3289600
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.


      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Author Tags

      1. blackmarket
      2. collusion
      3. online social networks
      4. retweets
      5. twitter

      Qualifiers

      • Research-article

      Conference

      WSDM '19

      Acceptance Rates

      WSDM '19 Paper Acceptance Rate 84 of 511 submissions, 16%;
      Overall Acceptance Rate 498 of 2,863 submissions, 17%


      Article Metrics

      • Downloads (last 12 months): 15
      • Downloads (last 6 weeks): 1
      Reflects downloads up to 12 Jan 2025
