research-article

Generalizing Hate Speech Detection Using Multi-Task Learning: A Case Study of Political Public Figures

Published: 01 January 2025

Abstract

Automatic identification of hateful and abusive content is vital for combating the spread of harmful online content and its damaging effects. Most existing work evaluates models by measuring the generalization error on train–test splits of hate speech datasets. These datasets often differ in their definitions and labeling criteria, which leads to poor performance when models predict across new domains and datasets. This work proposes a new Multi-task Learning (MTL) pipeline that trains simultaneously on multiple hate speech datasets to construct a more encompassing classification model. Using a dataset-level leave-one-out evaluation (designating one dataset for testing and jointly training on all the others), we test the MTL detection on new, previously unseen datasets. Our results consistently outperform a large sample of existing work: we show strong results when examining the generalization error in train–test splits and substantial improvements when predicting on previously unseen datasets. Furthermore, we assemble a novel dataset, dubbed PubFigs, focusing on the problematic speech of American public political figures. We crowdsource labels for more than 20,000 tweets using Amazon MTurk and machine-label problematic speech in all 305,235 tweets in PubFigs. We find that abusive and hateful tweeting mainly originates from right-leaning figures and relates to six topics, including Islam, women, ethnicity, and immigrants. We show that MTL builds embeddings that can simultaneously separate abusive from hate speech and identify its topics.
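
The dataset-level leave-one-out protocol described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `leave_one_dataset_out`, the toy dataset names, and the binary labels are all assumptions made for illustration, and no model is trained here. The sketch only shows how each dataset in turn is held out for testing while the remaining datasets are pooled for joint training.

```python
from typing import Dict, Iterator, List, Tuple

Example = Tuple[str, int]  # (text, label); the label scheme is a placeholder


def leave_one_dataset_out(
    datasets: Dict[str, List[Example]],
) -> Iterator[Tuple[str, List[Tuple[str, Example]], List[Example]]]:
    """Yield (held_out_name, train, test) for every dataset in turn.

    Each training item keeps the name of its source dataset, so that a
    multi-task model could route it to a dataset-specific output head.
    """
    for held_out in datasets:
        # Pool every example from all datasets except the held-out one.
        train = [
            (name, example)
            for name, examples in datasets.items()
            if name != held_out
            for example in examples
        ]
        # The held-out dataset is used only for testing.
        test = list(datasets[held_out])
        yield held_out, train, test


# Toy corpora with hypothetical names and contents (assumption, not PubFigs).
toy = {
    "dataset_a": [("text a1", 1), ("text a2", 0)],
    "dataset_b": [("text b1", 0)],
    "dataset_c": [("text c1", 1), ("text c2", 1), ("text c3", 0)],
}

splits = list(leave_one_dataset_out(toy))
for held_out, train, test in splits:
    print(held_out, len(train), len(test))
```

Keeping the source dataset name beside each training example is one possible design choice; it reflects the idea that each dataset is treated as its own task during joint training, while the held-out dataset is never seen before testing.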

Highlights

Using multi-task learning to improve the generalization of hate speech detection to new datasets.
New Twitter hate speech dataset of American political public figures.
Case study on the targets and topics of hate and abuse in the online speech of public political figures.

Published In

Computer Speech and Language, Volume 89, Issue C
Jan 2025
618 pages

Publisher

Academic Press Ltd.

United Kingdom

Author Tags

  1. Hate speech
  2. Abusive speech
  3. Multi-task learning
  4. Public political figures
  5. Transfer learning

Qualifiers

  • Research-article
