DOI: 10.1145/3340531.3411934
research-article

Unsupervised Cyberbullying Detection via Time-Informed Gaussian Mixture Model

Published: 19 October 2020

Abstract

Social media is a vital means for information-sharing due to its easy access, low cost, and fast dissemination characteristics. However, increases in social media usage have corresponded with a rise in the prevalence of cyberbullying. Most existing cyberbullying detection methods are supervised and, thus, have two key drawbacks: (1) The data labeling process is often time-consuming and labor-intensive; (2) Current labeling guidelines may not be generalized to future instances because of different language usage and evolving social networks. To address these limitations, this work introduces a principled approach for unsupervised cyberbullying detection. The proposed model consists of two main components: (1) A representation learning network that encodes the social media session by exploiting multi-modal features, e.g., text, network, and time. (2) A multi-task learning network that simultaneously fits the comment inter-arrival times and estimates the bullying likelihood based on a Gaussian Mixture Model. The proposed model jointly optimizes the parameters of both components to overcome the shortcomings of decoupled training. Our core contribution is an unsupervised cyberbullying detection model that not only experimentally outperforms the state-of-the-art unsupervised models, but also achieves competitive performance compared to supervised models.
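
As a rough illustration of the two signals the abstract names (comment inter-arrival times and a Gaussian-mixture likelihood), the Python sketch below computes the gaps between comment timestamps and scores a value by its negative log-likelihood under a mixture fitted with EM. It is a simplified stand-in, not the authors' model: the paper learns multi-modal session representations with a neural network and optimizes them jointly with the mixture, whereas this toy fits a two-component mixture to a single scalar feature. The function names (`inter_arrival_times`, `fit_gmm_1d`, `energy`) and the sample data are hypothetical.

```python
import math


def inter_arrival_times(timestamps):
    """Gaps between consecutive comment timestamps, after sorting."""
    ordered = sorted(timestamps)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(values, n_iter=50):
    """Fit a two-component 1-D Gaussian mixture with plain EM.

    Assumption: a scalar feature stands in for the learned session
    representation. Returns (weights, means, variances).
    """
    lo, hi = min(values), max(values)
    means = [lo, hi]
    spread = max((hi - lo) ** 2 / 4.0, 1e-6)
    variances = [spread, spread]
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each value.
        resp = []
        for x in values:
            dens = [w * gaussian_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean, and variance per component.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue
            weights[k] = nk / len(values)
            means[k] = sum(r[k] * x for r, x in zip(resp, values)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, values)) / nk
            variances[k] = max(var, 1e-6)  # floor avoids a collapsed component
    return weights, means, variances


def energy(x, weights, means, variances):
    """Negative log-likelihood under the mixture; higher means less typical."""
    density = sum(w * gaussian_pdf(x, m, v)
                  for w, m, v in zip(weights, means, variances))
    return -math.log(max(density, 1e-300))


# Hypothetical usage: timestamps in seconds, then a toy scalar feature.
gaps = inter_arrival_times([0, 30, 10, 95])
feature = [0.0, 0.1, -0.1, 0.2, -0.2, 10.0, 10.1, 9.9, 10.2, 9.8]
params = fit_gmm_1d(feature)
print(gaps, energy(5.0, *params))
```

In this toy, a value far from both fitted components receives a higher energy than a value near one of them. Which scores indicate bullying, and how a decision threshold is chosen, are defined in the paper and are not reproduced here.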

Supplementary Material

MP4 File (3340531.3411934.mp4)
This is the pre-recorded video for the paper Unsupervised Cyberbullying Detection via Time-Informed Gaussian Mixture Model. It covers the studied problem, the major challenges, the key (and simple) idea of the proposed solution, and some empirical evaluation results. For details, please refer to the paper. Enjoy it and have fun!



Published In

CIKM '20: Proceedings of the 29th ACM International Conference on Information & Knowledge Management
October 2020
3619 pages
ISBN:9781450368599
DOI:10.1145/3340531
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Gaussian mixture model
  2. cyberbullying detection
  3. representation learning
  4. social media

Qualifiers

  • Research-article

Funding Sources

  • National Science Foundation

Conference

CIKM '20

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

Article Metrics

  • Downloads (Last 12 months): 21
  • Downloads (Last 6 weeks): 1
Reflects downloads up to 13 Jan 2025
