DOI: 10.1145/3664647.3680672
research-article
Open access

CRASH: Crash Recognition and Anticipation System Harnessing with Context-Aware and Temporal Focus Attentions

Published: 28 October 2024

Abstract

Accurately and promptly predicting accidents among surrounding traffic agents from camera footage is crucial for the safety of autonomous vehicles (AVs). This task presents substantial challenges stemming from the unpredictable nature of traffic accidents, their long-tail distribution, the intricacies of traffic scene dynamics, and the inherently constrained field of vision of onboard cameras. To address these challenges, this study introduces a novel accident anticipation framework for AVs, termed CRASH. It integrates five components: an object detector, a feature extractor, an object-aware module, a context-aware module, and multi-layer fusion. Specifically, we develop the object-aware module to prioritize high-risk objects in complex and ambiguous environments by calculating the spatial-temporal relationships between traffic agents. In parallel, the context-aware module is devised to extend global visual information from the temporal to the frequency domain using the Fast Fourier Transform (FFT) and to capture fine-grained visual features of potential objects and broader context cues within traffic scenes. To capture a wider range of visual cues, we further propose a multi-layer fusion that dynamically computes the temporal dependencies between different scenes and iteratively updates the correlations between different visual features for accurate and timely accident prediction. Evaluated on three real-world datasets, the Dashcam Accident Dataset (DAD), the Car Crash Dataset (CCD), and the AnAn Accident Detection (A3D) dataset, our model surpasses existing top baselines in critical evaluation metrics such as Average Precision (AP) and mean Time-To-Accident (mTTA). Importantly, its robustness and adaptability are particularly evident in challenging driving scenarios with missing or limited training data, demonstrating significant potential for application in real-world autonomous driving systems.
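
The abstract does not specify how the context-aware module applies the FFT, so the following is only a minimal sketch of the general idea of moving per-frame features from the temporal to the frequency domain, reweighting them there, and transforming back. It is not the authors' implementation. The function name `frequency_filter`, the feature shape (T frames by D channels), and the `weights` array standing in for a learnable filter are all assumptions made for illustration.

```python
import numpy as np


def frequency_filter(features, weights):
    """Reweight per-frame features in the frequency domain along time.

    features: real array of shape (T, D), one D-dimensional vector per frame
              (hypothetical layout, not taken from the paper).
    weights:  complex array of shape (T // 2 + 1, D), standing in for a
              learnable per-frequency, per-channel filter (assumption).
    """
    num_frames = features.shape[0]
    # Real FFT over the temporal axis: (T, D) -> (T // 2 + 1, D).
    spectrum = np.fft.rfft(features, axis=0)
    # Element-wise reweighting of each frequency component.
    filtered = spectrum * weights
    # Inverse real FFT back to the temporal domain: (T, D).
    return np.fft.irfft(filtered, n=num_frames, axis=0)


rng = np.random.default_rng(0)
T, D = 50, 8
feats = rng.standard_normal((T, D))

# An all-ones filter leaves the features unchanged.
identity = np.ones((T // 2 + 1, D), dtype=complex)
unchanged = frequency_filter(feats, identity)

# Keeping only the zero-frequency component yields the temporal mean
# of each channel, repeated for every frame.
dc_only = np.zeros((T // 2 + 1, D), dtype=complex)
dc_only[0] = 1.0
mean_only = frequency_filter(feats, dc_only)
```

Because every frequency coefficient depends on all frames, a single element-wise multiplication in the frequency domain mixes information across the whole clip, which is one reason frequency-domain filtering is used to capture global context.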

    Published In

    MM '24: Proceedings of the 32nd ACM International Conference on Multimedia
    October 2024
    11719 pages
    ISBN:9798400706868
    DOI:10.1145/3664647
    This work is licensed under a Creative Commons Attribution International 4.0 License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. autonomous driving
    2. dynamic visual fusion
    3. fast fourier transform
    4. spatial-temporal analysis
    5. traffic accident anticipation

    Qualifiers

    • Research-article

    Funding Sources

    • Science and Technology Development Fund of Macau SAR
    • University of Macau
    • Shenzhen-Hong Kong-Macau Science and Technology Program Category C

    Conference

    MM '24
    MM '24: The 32nd ACM International Conference on Multimedia
    October 28 - November 1, 2024
    Melbourne VIC, Australia

    Acceptance Rates

    MM '24 Paper Acceptance Rate 1,150 of 4,385 submissions, 26%;
    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

    Bibliometrics

    Article Metrics

    • Total Citations: 0
    • Total Downloads: 116
    • Downloads (Last 12 months): 116
    • Downloads (Last 6 weeks): 75
    Reflects downloads up to 03 Jan 2025
