DOI: 10.1145/3382507.3418832
short-paper

OpenSense: A Platform for Multimodal Data Acquisition and Behavior Perception

Published: 22 October 2020

Abstract

Automatic multimodal acquisition and understanding of social signals is an essential building block for natural and effective human-machine collaboration and communication. This paper introduces OpenSense, a platform for real-time multimodal acquisition and recognition of social signals. OpenSense enables precisely synchronized and coordinated acquisition and processing of human behavioral signals. Powered by Microsoft's Platform for Situated Intelligence, OpenSense supports a range of sensor devices and machine learning tools and encourages developers to add new components to the system through straightforward mechanisms for component integration. The platform also offers an intuitive graphical user interface for building application pipelines from existing components. OpenSense is freely available for academic research.

Supplementary Material

MP4 File (3382507.3418832.mp4)
Presentation video for the paper "OpenSense: A Platform for Multimodal Data Acquisition and Behavior Perception"



Published In

ICMI '20: Proceedings of the 2020 International Conference on Multimodal Interaction
October 2020
920 pages
ISBN:9781450375818
DOI:10.1145/3382507
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. behavior
  2. multimodal
  3. open source
  4. perception
  5. platform

Qualifiers

  • Short-paper

Conference

ICMI '20: International Conference on Multimodal Interaction
October 25 - 29, 2020
Virtual Event, Netherlands

Acceptance Rates

Overall Acceptance Rate 453 of 1,080 submissions, 42%

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 251
  • Downloads (Last 6 weeks): 19
Reflects downloads up to 03 Jan 2025
