Crowd Detection in Mass Gatherings Based on Social Media Data: A Case Study of the 2014 Shanghai New Year’s Eve Stampede
Abstract
1. Introduction
2. Methodology
2.1. Study Case Description
2.2. Sina Weibo Microblog Check-In Data
2.3. Proposed Methods
2.3.1. Spatial Autocorrelation
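As an illustration of the idea, the following is a minimal sketch of global Moran's I, one widely used spatial autocorrelation statistic. Whether the study uses exactly this statistic is an assumption; the 3 x 3 grid, the rook-contiguity weights, the check-in counts, and the function names `rook_weights` and `morans_i` are all hypothetical and chosen only for the example.

```python
def rook_weights(rows, cols):
    """Binary rook-contiguity weights for a rows x cols grid (hypothetical layout).

    Cell (r, c) gets index r * cols + c; two cells are neighbors when they
    share an edge.
    """
    n = rows * cols
    w = [[0.0] * n for _ in range(n)]
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    w[i][rr * cols + cc] = 1.0
    return w


def morans_i(values, weights):
    """Global Moran's I: (n / S0) * sum_ij w_ij d_i d_j / sum_i d_i^2."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # total weight S0
    den = sum(d * d for d in dev)
    if s0 == 0 or den == 0:
        raise ValueError("Moran's I is undefined for zero weights or constant values")
    num = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    return (n / s0) * (num / den)


# Made-up check-in counts per grid cell, with high counts clustered top-left.
counts = [9, 8, 1,
          8, 7, 1,
          1, 1, 0]
w = rook_weights(3, 3)
clustered_i = morans_i(counts, w)
print(clustered_i)  # positive value: similar counts sit next to each other
```

A value near zero would suggest no spatial pattern, while a negative value would indicate that neighboring cells tend to have dissimilar counts, as in a checkerboard.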
2.3.2. Topic Modeling and Sentiment Analysis in Chinese
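Algorithm 1. The Generation Process of the LDA Model
for all topics $k \in [1, K]$:
    choose a word distribution $\varphi_k \sim \mathrm{Dir}(\beta)$
for all the documents $m \in [1, M]$:
    choose a topic distribution $\theta_m \sim \mathrm{Dir}(\alpha)$
    for word $n \in [1, N_m]$:
        choose the topic of the word: $z_{m,n} \sim \mathrm{Mult}(\theta_m)$
        choose the word: $w_{m,n} \sim \mathrm{Mult}(\varphi_{z_{m,n}})$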
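The generation process in Algorithm 1 can be sketched in a few lines of Python. This is an illustrative simulation only, not the authors' implementation: the names `sample_dirichlet`, `sample_categorical`, and `generate_corpus`, the symmetric Dirichlet priors `alpha` and `beta` (both 0.5 here), and the toy vocabulary are assumptions. Here `phi` holds one word distribution per topic and `theta` is the topic distribution of a single document.

```python
import random


def sample_dirichlet(concentration, rng):
    """Draw from a Dirichlet distribution by normalizing Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in concentration]
    total = sum(draws)
    return [d / total for d in draws]


def sample_categorical(probs, rng):
    """Draw one index according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def generate_corpus(num_topics, vocab, doc_lengths, alpha=0.5, beta=0.5, seed=0):
    """Simulate the LDA generative process for a toy corpus."""
    rng = random.Random(seed)
    vocab_size = len(vocab)
    # For all topics: choose a word distribution.
    phi = [sample_dirichlet([beta] * vocab_size, rng) for _ in range(num_topics)]
    docs = []
    # For all documents: choose a topic distribution.
    for length in doc_lengths:
        theta = sample_dirichlet([alpha] * num_topics, rng)
        words = []
        for _ in range(length):
            z = sample_categorical(theta, rng)      # choose the topic of the word
            w = sample_categorical(phi[z], rng)     # choose the word from that topic
            words.append(vocab[w])
        docs.append(words)
    return docs


# Hypothetical vocabulary; real input would be segmented Chinese microblog text.
toy_vocab = ["bund", "crowd", "happy", "stampede", "family"]
for doc in generate_corpus(num_topics=2, vocab=toy_vocab, doc_lengths=[4, 6]):
    print(doc)
```

Fitting an LDA model runs this process in reverse: given only the observed words, inference estimates the hidden topic distributions that most plausibly generated them.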
3. Results and Discussion
3.1. Results of Spatial Autocorrelation
3.2. Results of Topic Modeling
- Topic 1:
- There was a stampede in Shanghai Bund, I was fortunate that I missed it yesterday.
- Topic 2:
- Hello, Shanghai Bund. There are many people walking here.
- Topic 3:
- We are having fun among many people.
- Topic 4:
- Good morning everybody. Wish you good luck. I feel blessed that I stayed with family yesterday.
- Topic 5:
- I still feel happy despite the cold wave.
- Topic 6:
- Happy days are coming with a very nice breakfast.
- Topic 7:
- I will start working hard from the early morning.
- Topic 8:
- Sharing songs for the dead. May the dead rest in peace.
- Topic 9:
- Many people died tonight. We should stay calm in mass gatherings.
- Topic 10:
- Hoping the dead from the stampede rest in peace.
3.3. Results of Sentiment Analysis
4. Discussion
- (1) First, social media data are not limited by sparse sensor coverage. For example, this study uses only Weibo check-in data. The timestamps, geographic locations, and semantic information in the social media stream capture the entire event timeline. Thus, processes such as event feedback do not require additional sensors or costs.
- (2) Second, social media data are available in near-real time. The Weibo check-in data used in this paper can reveal how the crowd is aggregating in near-real time.
- (3) Finally, social media data can be used in multidimensional analyses. For example, Weibo check-in data provide spatial, temporal, and semantic features that together allow a comprehensive analysis of crowd change. Notably, the analysis of Weibo posts reflects users' psychology and activities before and after the stampede. Such analyses are difficult to perform with video-based crowd detection [58] or mobile phone-based crowd detection [59]. Social media data can therefore provide a new perspective for crowd gathering detection.
- (1) Improved data-filtering methods and more powerful NLP models are needed to improve the accuracy of the results.
- (2) This research considers only Weibo posts and check-ins and does not include Weibo comments. In the future, Weibo comment data and the relationship chain information contained in comments can be added to the analysis. For example, the impact range of the stampede event can be evaluated in combination with the comments. In cases of disaster recovery, this approach can enable a more in-depth analysis of the event process.
- (3) The current framework does not take user preferences into account, which would require efficiently extracting relevant user information and classifying microblog users. In future work, the features of different users can be extracted with models and used either to assign different weights to Weibo posts or as direct input parameters to an NLP model, thereby eliminating the bias associated with user preferences in social media.
5. Conclusions
Author Contributions
Funding
Conflicts of Interest
References
- Young, S.D. Behavioral insights on big data: Using social media for predicting biomedical outcomes. Trends Microbiol. 2014, 22, 601–602.
- Xiang, Z.; Gretzel, U. Role of social media in online travel information search. Tour. Manag. 2010, 31, 179–188.
- Asur, S.; Huberman, B.A. Predicting the future with social media. In Proceedings of the IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology, Toronto, ON, Canada, 31 August–3 September 2010.
- Brynielsson, J.; Johansson, F.; Jonsson, C.; Westling, A. Emotion classification of social media posts for estimating people’s reactions to communicated alert messages during crises. Secur. Inform. 2014, 3, 7.
- He, S.; Zheng, X.; Zeng, D.; Luo, C.; Zhang, Z. Exploring entrainment patterns of human emotion in social media. PLoS ONE 2016, 11, e0150630.
- Ruths, D.; Pfeffer, J. Social media for large studies of behavior. Science 2014, 346, 1063–1064.
- Tyshchuk, Y.; Wallace, W.A. Modeling human behavior on social media in response to significant events. IEEE Trans. Comput. Soc. Syst. 2018, 5, 444–457.
- Tyshchuk, Y. Modeling Human Behavior in the Context of Social Media during Extreme Events Caused by Natural Hazards. PhD Thesis, Faculty of Rensselaer Polytechnic Institute, New York, NY, USA, 2015.
- Arbon, P. The development of conceptual models for mass-gathering health. Prehosp. Disaster Med. 2004, 19, 208–212.
- Johansson, A.; Batty, M.; Hayashi, K.; Al Bar, O.; Marcozzi, D.; Memish, Z.A. Crowd and environmental management during mass gatherings. Lancet Infect. Dis. 2012, 12, 150–156.
- Zhou, J.; Pei, H.; Wu, H. Early warning of human crowds based on query data from Baidu maps: Analysis based on Shanghai stampede. In Big Data Support of Urban Planning and Management; Springer: Cham, Switzerland, 2018; pp. 19–41.
- Pretorius, M.; Gwynne, S.; Galea, E.R. Large crowd modelling: An analysis of the Duisburg Love Parade disaster. Fire Mater. 2015, 39, 301–322.
- Berlonghi, A.E. Understanding and planning for different spectator crowds. Saf. Sci. 1995, 18, 239–247.
- De Almeida, M.M.; Von Schreeb, J. Human stampedes: An updated review of current literature. Prehosp. Disaster Med. 2018, 34, 82–88.
- Cheng, Z.; Lu, J.; Zhao, Y. Pedestrian evacuation risk assessment of subway station under large-scale sport activity. Int. J. Environ. Res. Public Health 2020, 17, 3844.
- Song, X.; Zhang, H.; Akerkar, R.A.; Huang, H.; Guo, S.; Zhong, L.; Ji, Y.; Opdahl, A.L.; Purohit, H.; Skupin, A.; et al. Big data and emergency management: Concepts, methodologies, and applications. IEEE Trans. Big Data 2020.
- Xia, T.; Song, X.; Zhang, H.; Song, X.; Kanasugi, H.; Shibasaki, R. Measuring spatio-temporal accessibility to emergency medical services through big GPS data. Health Place 2019, 56, 53–62.
- Dai, D.; Wang, R. Space-time surveillance of negative emotions after consecutive terrorist attacks in London. Int. J. Environ. Res. Public Health 2020, 17, 4000.
- Yin, Z.; Cao, L.; Han, J.; Luo, J.; Huang, T. Diversified trajectory pattern ranking in geo-tagged social media. In Proceedings of the 11th SIAM International Conference on Data Mining, Mesa, AZ, USA, 28–30 April 2011.
- Fujisaka, T.; Lee, R.; Sumiya, K. Discovery of user behavior patterns from geo-tagged micro-blogs. In Proceedings of the 4th International Conference on Ubiquitous Information Management and Communication, Suwon, Korea, 14–15 January 2010.
- Hasan, S.; Zhan, X.; Ukkusuri, S.V. Understanding urban human activity and mobility patterns using large-scale location-based data from online social media. In Proceedings of the 19th ACM SIGKDD International Workshop on Urban Computing, Chicago, IL, USA, 11 August 2013.
- Ceron, A.; Curini, L.; Iacus, S.M.; Porro, G. Every tweet counts? How sentiment analysis of social media can improve our knowledge of citizens’ political preferences with an application to Italy and France. New Media Soc. 2013, 16, 340–358.
- Yu, Y.; Duan, W.; Cao, Q. The impact of social and conventional media on firm equity value: A sentiment analysis approach. Decis. Support Syst. 2013, 55, 919–926.
- Xia, R.; Jiang, J.; He, H. Distantly supervised lifelong learning for large-scale social media sentiment analysis. IEEE Trans. Affect. Comput. 2017, 8, 480–491.
- Yang, Y.; Su, Y. Public voice via social media: Role in cooperative governance during public health emergency. Int. J. Environ. Res. Public Health 2020, 17, 6840.
- Sakaki, T.; Okazaki, M.; Matsuo, Y. Tweet analysis for real-time event detection and earthquake reporting system development. IEEE Trans. Knowl. Data Eng. 2012, 25, 919–931.
- Crooks, A.; Croitoru, A.; Stefanidis, A.; Radzikowski, J. #Earthquake: Twitter as a distributed sensor system. Trans. GIS 2012, 17, 124–147.
- De Longueville, B.; Smith, R.S.; Luraschi, G. Omg, from here, I can see the flames!: A use case of mining location based social networks to acquire spatio-temporal data on forest fires. In Proceedings of the 2009 International Workshop on Location Based Social Networks, Seattle, WA, USA, 3 November 2009.
- Thomopoulos, S.C.A.; Kyriazanos, D.M.; Astyakopoulos, A.; Dimitros, K.; Margonis, C.; Thanos, G.K.; Skroumpelou, K. OCULUS fire: A command and control system for fire management with crowd sourcing and social media interconnectivity. In Proceedings of the SPIE Defense + Security, Baltimore, MD, USA, 17–21 April 2016; Available online: https://www.spiedigitallibrary.org/conference-proceedings-of-spie/9842/98420U/OCULUS-fire--a-command-and-control-system-for-fire/10.1117/12.2223996.full?SSO=1 (accessed on 1 November 2020).
- Cheong, F.; Cheong, C. Social media data mining: A social network analysis of tweets during the 2010–2011 Australian floods. In Proceedings of the Pacific Asia Conference on Information Systems (PACIS), Brisbane, Australia, 7–11 July 2011; Volume 11, p. 46.
- Rosser, J.F.; Leibovici, D.; Jackson, M.J. Rapid flood inundation mapping using social media, remote sensing and topographic data. Nat. Hazards 2017, 87, 103–120.
- Xu, Z.; Zhang, H.; Liu, Y.; Mei, L. Crowd sensing of urban emergency events based on social media big data. In Proceedings of the IEEE 13th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom), Beijing, China, 24–26 September 2014.
- Ngo, M.Q.; Haghighi, P.D.; Burstein, F. A crowd monitoring framework using emotion analysis of social media for emergency management in mass gatherings. In Proceedings of the 26th Australasian Conference on Information Systems, Adelaide, Australia, 30 November–4 December 2015.
- Martínez-Castaño, R.; Pichel, J.C.; Losada, D.E. A big data platform for real time analysis of signs of depression in social media. Int. J. Environ. Res. Public Health 2020, 17, 4752.
- Zhou, M.; Wang, M.; Zhang, J. How are risks generated, developed and amplified? Case study of the stampede incident at Shanghai Bund on 31 December 2014. Int. J. Disaster Risk Reduct. 2017, 24, 209–215.
- Shan, S.; Zhao, F.; Wei, Y.; Liu, M. Disaster management 2.0: A real-time disaster damage assessment model based on mobile social media data—A case study of Weibo (Chinese Twitter). Saf. Sci. 2019, 115, 393–413.
- Xiao, Y.; Li, B.; Gong, Z. Real-time identification of urban rainstorm waterlogging disasters based on Weibo big data. Nat. Hazards 2018, 94, 833–842.
- Bai, H.; Yu, G. A Weibo-based approach to disaster informatics: Incidents monitor in post-disaster situation via Weibo text negative sentiment analysis. Nat. Hazards 2016, 83, 1177–1196.
- Fu, K.-W.; Chan, C.-H.; Chau, M. Assessing censorship on microblogs in China: Discriminatory keyword analysis and the real-name registration policy. IEEE Internet Comput. 2013, 17, 42–50.
- Todd, A.W.; Campbell, A.L.; Meyer, G.G.; Horner, R.H. The effects of a targeted intervention to reduce problem behaviors. J. Posit. Behav. Interv. 2008, 10, 46–55.
- Liu, Y.; Sui, Z.; Kang, C.; Gao, Y. Uncovering patterns of inter-urban trip and spatial interaction from social media check-in data. PLoS ONE 2014, 9, e86026.
- Liu, T.; Yang, L.; Liu, S.; Ge, S. Inferring and analysis of social networks using RFID check-in data in China. PLoS ONE 2017, 12, e0178492.
- Zhen, F.; Cao, Y.; Qin, X.; Wang, B. Delineation of an urban agglomeration boundary based on Sina Weibo microblog ‘check-in’ data: A case study of the Yangtze River Delta. Cities 2017, 60, 180–191.
- Getis, A. Reflections on spatial autocorrelation. Reg. Sci. Urban Econ. 2007, 37, 491–496.
- Schmal, C.; Myung, J.; Herzel, H.; Bordyugov, G.V. Moran’s I quantifies spatio-temporal pattern formation in neural imaging data. Bioinformatics 2017, 33, 3072–3079.
- Blei, D.M.; Ng, A.Y.; Jordan, M.I. Latent Dirichlet allocation. J. Mach. Learn. Res. 2003, 3, 993–1022.
- Cvitanic, T.; Lee, B.; Song, H.I.; Fu, K.; Rosen, D. LDA v. LSA: A comparison of two computational text analysis tools for the functional categorization of patents. In Proceedings of the 24th International Conference on Case-Based Reasoning, Atlanta, GA, USA, 31 October–2 November 2016.
- Williams, T.; Betak, J. A comparison of LSA and LDA for the analysis of railroad accident text. Procedia Comput. Sci. 2018, 130, 98–102.
- Wu, X.; Fang, L.; Wang, P.; Yu, N. Performance of using LDA for Chinese news text classification. In Proceedings of the IEEE 28th Canadian Conference on Electrical and Computer Engineering (CCECE), Halifax, NS, Canada, 3–6 May 2015.
- Wang, J.; Peng, Y.; Wang, Z.; Yang, C.; Xu, J. Topic mining of Chinese scientific literature research about “The belt and road initiative” based on LDA model from the Sub Disciplinary Perspective. In Data Mining and Big Data, Proceedings of the 4th International Conference on Data Mining and Big Data, Chiang Mai, Thailand, 26–30 July 2019; Springer: Berlin, Germany, 2019.
- Song, Y.; Pan, S.; Liu, S.; Zhou, M.X.; Qian, W. Topic and keyword re-ranking for LDA-based topic modeling. In Proceedings of the 18th ACM Conference on Information & Knowledge Management, Hong Kong, China, 2–6 November 2009.
- Liu, K.; Gao, S.; Lu, F. Identifying spatial interaction patterns of vehicle movements on urban road networks by topic modelling. Comput. Environ. Urban Syst. 2019, 74, 50–61.
- Hoffman, M.D.; Blei, D.M.; Bach, F. Online learning for Latent Dirichlet Allocation. In Proceedings of the 24th International Conference on Neural Information Processing Systems, Vancouver, BC, Canada, 6 December 2010.
- Pang, B.; Lee, L. Opinion mining and sentiment analysis. Found. Trends Inf. Retr. 2008, 2, 1–135.
- Nasukawa, T.; Yi, J. Sentiment analysis: Capturing favorability using natural language processing. In Proceedings of the 2nd International Conference on Knowledge Capture, Sanibel Island, FL, USA, 23–25 October 2003.
- Agarwal, A.; Xie, B.; Vovsha, I.; Rambow, O.; Passonneau, R. Sentiment analysis of Twitter data. In Proceedings of the Workshop on Languages in Social Media, Stroudsburg, PA, USA, 23 June 2011.
- Gu, Y.; Qian, Z.; Chen, F. From Twitter to detector: Real-time traffic incident detection using social media data. Transp. Res. Part C Emerg. Technol. 2016, 67, 321–342.
- Xie, S.; Zhang, X.; Cai, J. Video crowd detection and abnormal behavior model detection based on machine learning method. Neural Comput. Appl. 2018, 31, 175–184.
- Yuan, Y. Crowd monitoring using mobile phones. In Proceedings of the 6th International Conference on Intelligent Human-machine Systems & Cybernetics, Hangzhou, China, 26–27 August 2014.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Share and Cite
Duan, J.; Zhai, W.; Cheng, C. Crowd Detection in Mass Gatherings Based on Social Media Data: A Case Study of the 2014 Shanghai New Year’s Eve Stampede. Int. J. Environ. Res. Public Health 2020, 17, 8640. https://doi.org/10.3390/ijerph17228640