DOI: 10.1145/3442381.3450137

“Short is the Road that Leads from Fear to Hate”: Fear Speech in Indian WhatsApp Groups

Published: 03 June 2021

Abstract

WhatsApp is the most popular messaging app in the world. Owing to this popularity, WhatsApp has become a powerful and cheap tool for political campaigning; it was widely used during the 2019 Indian general election to reach voters at scale. Alongside the campaigning, there have been reports that WhatsApp has also become a breeding ground for harmful speech against various protected groups and religious minorities. Many such messages attempt to instil fear among the population about a specific (minority) community. According to research on inter-group conflict, such ‘fear speech’ messages can have a lasting impact and may lead to real offline violence. In this paper, we perform the first large-scale study of fear speech across thousands of public WhatsApp groups discussing politics in India. We curate a new dataset and use it to characterize fear speech, observing that users who write fear speech messages invoke various events and symbols to instil fear in the reader about a target community. We build models to classify fear speech and find that current state-of-the-art NLP models do not perform well on this task. Fear speech messages tend to spread faster and, because they are rarely overtly toxic, can go undetected by classifiers built for traditional toxic speech. Finally, using a novel methodology that targets users with Facebook ads, we conduct a survey among members of these WhatsApp groups to understand the kinds of users who consume and share fear speech. We believe this work opens up new research questions that are quite different from tackling hate speech, which the research community has traditionally focused on. We have made our code and dataset public for other researchers.
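The abstract reports that state-of-the-art NLP models struggle to classify fear speech. For readers who want a concrete starting point, below is a minimal sketch of a transformer fine-tuning baseline of the kind such experiments typically evaluate; the model choice (multilingual BERT), the label convention, and the hyperparameters are illustrative assumptions, not the authors' exact pipeline.

```python
# Minimal sketch of a binary fear-speech classifier baseline (assumption:
# fine-tuning multilingual BERT with Hugging Face Transformers; label 1 =
# fear speech, 0 = not). Data loading and hyperparameters are illustrative.
import torch
from torch.utils.data import Dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

MODEL_NAME = "bert-base-multilingual-cased"  # messages are largely Hindi / code-mixed

class MessageDataset(Dataset):
    """Wraps (text, label) pairs into tokenized tensors for the Trainer."""
    def __init__(self, texts, labels, tokenizer, max_len=256):
        self.enc = tokenizer(texts, truncation=True, padding="max_length",
                             max_length=max_len, return_tensors="pt")
        self.labels = torch.tensor(labels)

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, i):
        item = {k: v[i] for k, v in self.enc.items()}
        item["labels"] = self.labels[i]
        return item

def train_baseline(train_texts, train_labels, val_texts, val_labels):
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME,
                                                               num_labels=2)
    args = TrainingArguments(output_dir="fear-speech-baseline",
                             num_train_epochs=3,
                             per_device_train_batch_size=16,
                             learning_rate=2e-5)
    trainer = Trainer(model=model, args=args,
                      train_dataset=MessageDataset(train_texts, train_labels, tokenizer),
                      eval_dataset=MessageDataset(val_texts, val_labels, tokenizer))
    trainer.train()
    print(trainer.evaluate())  # loss on the held-out split
    return trainer
```

One could evaluate such a baseline with macro-F1 on a held-out split of annotated WhatsApp messages; the paper's point is that even strong transformer baselines leave substantial headroom on this task.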


Published In

WWW '21: Proceedings of the Web Conference 2021
April 2021
4054 pages
ISBN:9781450383127
DOI:10.1145/3442381


Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. Islamophobia
  2. WhatsApp
  3. classification
  4. fear speech
  5. hate speech
  6. survey

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

WWW '21: The Web Conference 2021
April 19-23, 2021
Ljubljana, Slovenia

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%
