
AVEC 2017: Real-life Depression, and Affect Recognition Workshop and Challenge

Published: 23 October 2017

Abstract

The Audio/Visual Emotion Challenge and Workshop (AVEC 2017) "Real-life Depression, and Affect" will be the seventh competition event aimed at the comparison of multimedia processing and machine learning methods for automatic audiovisual depression and emotion analysis, with all participants competing under strictly the same conditions. The goal of the Challenge is to provide a common benchmark test set for multimodal information processing and to bring together the depression and emotion recognition communities, as well as the audiovisual processing communities, to compare the relative merits of the various approaches to depression and emotion recognition from real-life data. This paper presents the novelties introduced this year, the challenge guidelines, the data used, and the performance of the baseline system on the two proposed tasks: dimensional emotion recognition (time- and value-continuous) and dimensional depression estimation (value-continuous).



Published In

AVEC '17: Proceedings of the 7th Annual Workshop on Audio/Visual Emotion Challenge
October 2017
78 pages
ISBN:9781450355025
DOI:10.1145/3133944
© 2017 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the United States Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. affective computing
  2. automatic emotion/depression recognition
  3. social signal processing

Qualifiers

  • Research-article

Funding Sources

  • Horizon 2020 Programme
  • European Union's 7th Framework Programme
  • Research Innovative Action

Conference

MM '17: ACM Multimedia Conference (sponsor)
October 23, 2017
Mountain View, California, USA

Acceptance Rates

  • AVEC '17 paper acceptance rate: 8 of 17 submissions (47%)
  • Overall acceptance rate: 52 of 98 submissions (53%)
