DOI: 10.1145/3582099.3582144
Research Article

Multi-Modal Depression Detection Based on High-Order Emotional Features

Published: 20 April 2023

Abstract

Diagnosis has long been a difficulty in the treatment of depression. Most current research on automatic depression detection feeds low-order features such as video, audio, and text directly into the model, and the lack of guidance from high-order features is a potential weakness. This paper proposes a multi-modal depression detection method based on high-order emotional features. A two-stage network is designed to perform emotion recognition and depression detection jointly: the recognized emotions are fed as high-order semantic features into an improved TBJE-E multi-modal network, where a co-attention module uses them to guide the learning of the other modalities before the final prediction is produced. Experiments on the DAIC-WOZ dataset show that the added emotional features effectively complement the high-order semantics. Compared with the original TBJE model, the F1 score of the TBJE-E model with emotional features improves by a relative 6.3%, reaching state-of-the-art performance on the depression detection task. The experimental results also indicate that the current risk of this technology being used to infer an individual's inner psychological state without their knowledge is very low, while the technology retains application value in professional settings such as criminal investigation and psychological diagnosis and treatment.
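Since the abstract only outlines the architecture, the following is a minimal PyTorch-style sketch of the two-stage idea it describes: a first-stage emotion head whose output is re-injected as a high-order semantic feature that guides the other modalities through co-attention before the depression prediction. All module names, dimensions, and the exact fusion scheme here are illustrative assumptions, not the authors' TBJE-E implementation.

```python
# Minimal sketch of the two-stage design described in the abstract.
# All names and dimensions are hypothetical; this is not the authors' code.
import torch
import torch.nn as nn


class CoAttention(nn.Module):
    """Lets the emotion embedding attend over another modality's sequence."""

    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, emotion_emb: torch.Tensor, modality_seq: torch.Tensor) -> torch.Tensor:
        # emotion_emb: (batch, 1, dim) query; modality_seq: (batch, seq_len, dim) keys/values.
        guided, _ = self.attn(emotion_emb, modality_seq, modality_seq)
        return guided.squeeze(1)


class TwoStageDepressionDetector(nn.Module):
    def __init__(self, dim: int = 128, n_emotions: int = 7):
        super().__init__()
        # Stage 1: emotion recognition head producing the high-order emotional feature.
        self.emotion_head = nn.Linear(dim, n_emotions)
        self.emotion_proj = nn.Linear(n_emotions, dim)
        # Stage 2: co-attention guidance of the audio/text encodings, then a binary output.
        self.audio_coattn = CoAttention(dim)
        self.text_coattn = CoAttention(dim)
        self.classifier = nn.Linear(2 * dim, 2)

    def forward(self, audio_seq: torch.Tensor, text_seq: torch.Tensor):
        # Stage 1: pool the modalities and predict an emotion distribution.
        pooled = torch.cat([audio_seq, text_seq], dim=1).mean(dim=1)
        emotion_logits = self.emotion_head(pooled)
        # Re-embed the emotion distribution as a high-order semantic feature.
        emotion_emb = self.emotion_proj(emotion_logits.softmax(dim=-1)).unsqueeze(1)
        # Stage 2: the emotional feature guides each modality via co-attention.
        audio_guided = self.audio_coattn(emotion_emb, audio_seq)
        text_guided = self.text_coattn(emotion_emb, text_seq)
        depression_logits = self.classifier(torch.cat([audio_guided, text_guided], dim=-1))
        return emotion_logits, depression_logits


# Example: a batch of 2 interviews with pre-encoded audio and text features.
audio = torch.randn(2, 50, 128)
text = torch.randn(2, 30, 128)
emo, dep = TwoStageDepressionDetector()(audio, text)
print(emo.shape, dep.shape)  # torch.Size([2, 7]) torch.Size([2, 2])
```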



      Information & Contributors

      Information

      Published In

AICCC '22: Proceedings of the 2022 5th Artificial Intelligence and Cloud Computing Conference
December 2022, 302 pages
ISBN: 9781450398749
DOI: 10.1145/3582099

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 20 April 2023


      Author Tags

      1. affective computing
      2. attention
      3. deep learning
      4. depression detection
      5. multi-modal

      Qualifiers

      • Research-article
      • Research
      • Refereed limited

      Conference

      AICCC 2022

