DOI: 10.1145/3372278.3390725
short-paper

A Crowd Analysis Framework for Detecting Violence Scenes

Published: 08 June 2020

Abstract

This work examines violence detection in video scenes of crowds and proposes a crowd violence detection framework based on a 3D convolutional deep learning architecture, the 3D-ResNet model with 50 layers. The proposed framework is evaluated on the Violent Flows dataset against several state-of-the-art approaches and achieves higher accuracy in almost all cases, while also performing violence detection in (near) real time.
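To make the "3D convolutional" part concrete, the sketch below shows how a 3D convolution changes the size of a video clip along the frame, height, and width axes. It uses only the standard output-size formula for convolutions. The clip size (16 frames of 112x112 pixels) and the stem settings (a 7x7x7 convolution followed by a 3x3x3 max pooling) are assumptions chosen for illustration; they are typical of 3D-ResNet implementations but are not stated in the abstract, and the function name `conv3d_output_shape` is hypothetical.

```python
from typing import Tuple

Shape3D = Tuple[int, int, int]  # (frames, height, width)


def conv3d_output_shape(
    in_shape: Shape3D,
    kernel: Shape3D,
    stride: Shape3D,
    padding: Shape3D,
) -> Shape3D:
    """Output size of a 3D convolution or pooling layer, per axis.

    Applies floor((size + 2 * padding - kernel) / stride) + 1 to each of
    the three axes (dilation is assumed to be 1).
    """
    frames, height, width = (
        (size + 2 * p - k) // s + 1
        for size, k, s, p in zip(in_shape, kernel, stride, padding)
    )
    return (frames, height, width)


# Assumed input clip: 16 frames of 112x112 pixels (illustrative only).
clip = (16, 112, 112)

# Assumed stem convolution: 7x7x7 kernel, stride 1 in time and 2 in space.
after_conv = conv3d_output_shape(clip, (7, 7, 7), (1, 2, 2), (3, 3, 3))

# Assumed max pooling: 3x3x3 kernel, stride 2 on every axis.
after_pool = conv3d_output_shape(after_conv, (3, 3, 3), (2, 2, 2), (1, 1, 1))

print(after_conv)  # (16, 56, 56)
print(after_pool)  # (8, 28, 28)
```

Unlike a 2D convolution applied frame by frame, the kernel here also spans several consecutive frames, so the temporal axis is filtered and downsampled together with the spatial ones. That is what lets such a network pick up motion patterns directly from the clip.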

Published In

ICMR '20: Proceedings of the 2020 International Conference on Multimedia Retrieval
June 2020
605 pages
ISBN:9781450370875
DOI:10.1145/3372278
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. 3D-convolutional neural networks
  2. crowd analysis
  3. deep learning
  4. violence detection
  5. violent flows

Qualifiers

  • Short-paper

Funding Sources

  • European Commission

Conference

ICMR '20

Acceptance Rates

Overall Acceptance Rate 254 of 830 submissions, 31%
