- abstract, October 2018
Group Interaction Frontiers in Technology
ICMI '18: Proceedings of the 20th ACM International Conference on Multimodal Interaction, Pages 660–662. https://doi.org/10.1145/3242969.3272960
Analysis of group interaction and team dynamics is an important topic in a wide variety of fields, owing to the amount of time that individuals typically spend in small groups for both professional and personal purposes, and given how crucial group ...
- short-paper, October 2018
Group-Level Emotion Recognition Using Hybrid Deep Models Based on Faces, Scenes, Skeletons and Visual Attentions
ICMI '18: Proceedings of the 20th ACM International Conference on Multimodal Interaction, Pages 635–639. https://doi.org/10.1145/3242969.3264990
This paper presents a hybrid deep learning network submitted to the 6th Emotion Recognition in the Wild (EmotiW 2018) Grand Challenge [9], in the category of group-level emotion recognition. Advanced deep learning models trained individually on faces, ...
- research-article, October 2018
Group-Level Emotion Recognition using Deep Models with A Four-stream Hybrid Network
ICMI '18: Proceedings of the 20th ACM International Conference on Multimodal Interaction, Pages 623–629. https://doi.org/10.1145/3242969.3264987
Group-level Emotion Recognition (GER) in the wild is a challenging task that has attracted considerable attention. Most recent works utilize two channels of information, a channel involving only faces and a channel containing the whole image, to solve this problem. ...
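The two-channel pattern described in this abstract is straightforward to prototype. Below is a minimal sketch, not the authors' four-stream network, of the face-plus-whole-image idea: two pretrained backbones whose features are concatenated for a group-level prediction. The ResNet-18 backbones and the three emotion classes are assumptions for illustration.

```python
# Minimal two-stream group-level emotion recognition sketch (illustrative only).
import torch
import torch.nn as nn
from torchvision import models


class TwoStreamGER(nn.Module):
    def __init__(self, num_classes: int = 3):  # e.g. negative/neutral/positive (assumed)
        super().__init__()
        # Separate ImageNet-pretrained backbones for faces and full scenes.
        self.face_stream = models.resnet18(weights="IMAGENET1K_V1")
        self.scene_stream = models.resnet18(weights="IMAGENET1K_V1")
        feat_dim = self.face_stream.fc.in_features  # 512 for ResNet-18
        self.face_stream.fc = nn.Identity()   # keep the 512-d features
        self.scene_stream.fc = nn.Identity()
        self.classifier = nn.Linear(2 * feat_dim, num_classes)

    def forward(self, faces: torch.Tensor, scene: torch.Tensor) -> torch.Tensor:
        # faces: a representative face crop per image; scene: the whole image.
        fused = torch.cat([self.face_stream(faces), self.scene_stream(scene)], dim=1)
        return self.classifier(fused)


model = TwoStreamGER()
logits = model(torch.randn(4, 3, 224, 224), torch.randn(4, 3, 224, 224))
print(logits.shape)  # torch.Size([4, 3])
```

Extending the same pattern to additional streams (skeletons, visual attention, as in the papers above) amounts to adding more branches to the concatenation.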
- research-article, October 2018
An Ensemble Model Using Face and Body Tracking for Engagement Detection
ICMI '18: Proceedings of the 20th ACM International Conference on Multimodal Interaction, Pages 616–622. https://doi.org/10.1145/3242969.3264986
Precise detection and localization of learners' engagement levels are useful for monitoring their learning quality. In the EmotiW Challenge's engagement detection task, we proposed a series of novel improvements, including (a) a cluster-based framework ...
- research-article, October 2018
Predicting Engagement Intensity in the Wild Using Temporal Convolutional Network
ICMI '18: Proceedings of the 20th ACM International Conference on Multimodal Interaction, Pages 604–610. https://doi.org/10.1145/3242969.3264984
Engagement is the holy grail of learning, whether in a classroom setting or on an online learning platform. Studies have shown that a student's engagement while learning can benefit students as well as the teacher if the engagement level of the ...
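For readers unfamiliar with temporal convolutional networks, a minimal sketch follows. It assumes precomputed per-frame features, since the abstract above does not specify the paper's features or depth: stacked dilated 1-D convolutions are pooled into a single engagement-intensity score.

```python
# Minimal TCN regressor for engagement intensity (illustrative configuration).
import torch
import torch.nn as nn


class TinyTCN(nn.Module):
    def __init__(self, in_dim: int = 64, hidden: int = 32):
        super().__init__()
        layers = []
        for i, d in enumerate([1, 2, 4]):  # exponentially dilated convolutions
            layers += [
                nn.Conv1d(in_dim if i == 0 else hidden, hidden,
                          kernel_size=3, dilation=d, padding=d),
                nn.ReLU(),
            ]
        self.tcn = nn.Sequential(*layers)
        self.head = nn.Linear(hidden, 1)  # scalar engagement intensity

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, features) -> Conv1d expects (batch, features, time)
        h = self.tcn(x.transpose(1, 2))
        return self.head(h.mean(dim=2)).squeeze(1)  # pool over time


model = TinyTCN()
print(model(torch.randn(2, 100, 64)).shape)  # torch.Size([2])
```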
- short-paper, October 2018
An Occam's Razor View on Learning Audiovisual Emotion Recognition with Small Training Sets
ICMI '18: Proceedings of the 20th ACM International Conference on Multimodal Interaction, Pages 589–593. https://doi.org/10.1145/3242969.3264980
This paper presents a lightweight and accurate deep neural model for audiovisual emotion recognition. To design this model, the authors followed a philosophy of simplicity, drastically limiting the number of parameters to learn from the target datasets, ...
- research-article, October 2018
Large Vocabulary Continuous Audio-Visual Speech Recognition
ICMI '18: Proceedings of the 20th ACM International Conference on Multimodal Interaction, Pages 538–541. https://doi.org/10.1145/3242969.3264976
We like to converse with other people using both sounds and visuals, as our perception of speech is bimodal. Since both modalities essentially echo the same speech structure, we manage to integrate them and often understand the message better than with the ...
- research-article, October 2018
Responding with Sentiment Appropriate for the User's Current Sentiment in Dialog as Inferred from Prosody and Gaze Patterns
ICMI '18: Proceedings of the 20th ACM International Conference on Multimodal Interaction, Pages 529–533. https://doi.org/10.1145/3242969.3264974
Multi-modal sentiment detection from natural video/audio streams has recently received much attention. I propose to use this multi-modal information to develop a technique, Sentiment Coloring, that utilizes the detected sentiments to generate effective ...
- research-article, October 2018
Data Driven Non-Verbal Behavior Generation for Humanoid Robots
ICMI '18: Proceedings of the 20th ACM International Conference on Multimodal Interaction, Pages 520–523. https://doi.org/10.1145/3242969.3264970
Social robots need non-verbal behavior to make an interaction pleasant and efficient. Most models for generating non-verbal behavior are rule-based and hence can produce only a limited set of motions tuned to a particular scenario. In contrast, ...
- research-article, October 2018
Multimodal Teaching and Learning Analytics for Classroom and Online Educational Settings
ICMI '18: Proceedings of the 20th ACM International Conference on Multimodal Interaction, Pages 542–545. https://doi.org/10.1145/3242969.3264969
Automatic analysis of teacher-student interactions is an interesting research problem in social computing. Such interactions happen in both online and classroom settings. While teaching effectiveness is the goal in both settings, the mechanism to ...
- research-article, October 2018
Interpretable Multimodal Deception Detection in Videos
ICMI '18: Proceedings of the 20th ACM International Conference on Multimodal Interaction, Pages 511–515. https://doi.org/10.1145/3242969.3264967
Deception detection can play a crucial role in various real-world applications such as video ads, airport screenings, courtroom trials, and job interviews. Hence, there is immense demand for deception detection in videos. Videos contain ...
- research-article, October 2018
SAAMEAT: Active Feature Transformation and Selection Methods for the Recognition of User Eating Conditions
ICMI '18: Proceedings of the 20th ACM International Conference on Multimodal Interaction, Pages 564–568. https://doi.org/10.1145/3242969.3243685
Automatic recognition of the eating conditions of humans could be a useful technology in health monitoring. Audio-visual information can be used to automate this process, and feature engineering approaches can reduce the dimensionality of audio-visual ...
- research-article, October 2018
Deep End-to-End Representation Learning for Food Type Recognition from Speech
ICMI '18: Proceedings of the 20th ACM International Conference on Multimodal Interaction, Pages 574–578. https://doi.org/10.1145/3242969.3243683
The use of Convolutional Neural Networks (CNNs) pre-trained for a particular task as feature extractors for an alternate task is a standard practice in many image classification paradigms. However, to date there have been comparatively few works ...
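The pretrained-CNN-as-feature-extractor practice this abstract references can be shown in a few lines. The sketch below assumes the speech has already been rendered as spectrogram images; the ImageNet-pretrained ResNet-18 is an illustrative stand-in, not the model used in the paper.

```python
# Reusing a CNN pre-trained on one task (ImageNet) as a fixed feature
# extractor for another (food-type recognition from spectrograms).
import torch
from torchvision import models

# Load a pretrained backbone and drop its classification head.
backbone = models.resnet18(weights="IMAGENET1K_V1")
backbone.fc = torch.nn.Identity()
backbone.eval()

# Freeze the weights: only the learned representation is wanted.
for p in backbone.parameters():
    p.requires_grad = False

with torch.no_grad():
    spectrogram_batch = torch.randn(8, 3, 224, 224)  # stand-in for spectrogram images
    features = backbone(spectrogram_batch)           # (8, 512) feature vectors

# These 512-d vectors can feed any lightweight classifier
# (e.g. an SVM or a small MLP) trained on the target labels.
print(features.shape)
```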
- research-article, October 2018
Functional-Based Acoustic Group Feature Selection for Automatic Recognition of Eating Condition
ICMI '18: Proceedings of the 20th ACM International Conference on Multimodal Interaction, Pages 579–583. https://doi.org/10.1145/3242969.3243682
This paper presents the novel Functional-based acoustic Group Feature Selection (FGFS) method for automatic eating condition recognition addressed in the ICMI 2018 Eating Analysis and Tracking Challenge's Food-type Sub-Challenge. The Food-type Sub-...
- research-article, October 2018
Predicting Group Performance in Task-Based Interaction
ICMI '18: Proceedings of the 20th ACM International Conference on Multimodal Interaction, Pages 14–20. https://doi.org/10.1145/3242969.3243027
We address the problem of automatically predicting group performance on a task, using multimodal features derived from the group conversation. These include acoustic features extracted from the speech signal, and linguistic features derived from the ...
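The setup sketched in this abstract, feature-level fusion of acoustic and linguistic descriptors followed by a predictor, can be illustrated briefly. The feature dimensions, the random data, and the random-forest model below are assumptions; the paper's actual features and learner may differ.

```python
# Early (feature-level) fusion of per-group acoustic and linguistic features
# for predicting a group performance score (illustrative sketch).
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_groups = 40
acoustic = rng.normal(size=(n_groups, 20))    # e.g. prosodic statistics per group
linguistic = rng.normal(size=(n_groups, 15))  # e.g. lexical/dialogue-act counts
X = np.hstack([acoustic, linguistic])         # concatenate the two modalities
y = rng.normal(size=n_groups)                 # group performance score (synthetic)

model = RandomForestRegressor(n_estimators=200, random_state=0)
print(cross_val_score(model, X, y, cv=5, scoring="r2"))
```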
- short-paper, October 2018
Multimodal Local-Global Ranking Fusion for Emotion Recognition
ICMI '18: Proceedings of the 20th ACM International Conference on Multimodal Interaction, Pages 472–476. https://doi.org/10.1145/3242969.3243019
Emotion recognition is a core research area at the intersection of artificial intelligence and human communication analysis. It is a significant technical challenge since humans display their emotions through complex idiosyncratic combinations of the ...
- short-paper, October 2018
End-to-end Learning for 3D Facial Animation from Speech
ICMI '18: Proceedings of the 20th ACM International Conference on Multimodal Interaction, Pages 361–365. https://doi.org/10.1145/3242969.3243017
We present a deep learning framework for real-time, speech-driven 3D facial animation. Our deep neural network directly maps an input sequence of speech spectrograms to a series of micro facial action unit intensities to drive a 3D ...
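A minimal sketch of the speech-to-action-unit mapping described above follows; the GRU architecture, 40 mel bins, and 17 action units are illustrative assumptions rather than the paper's configuration.

```python
# Mapping a spectrogram sequence to per-frame facial action unit (AU)
# intensities (illustrative sketch, not the paper's exact design).
import torch
import torch.nn as nn


class Speech2AU(nn.Module):
    def __init__(self, n_mels: int = 40, n_aus: int = 17, hidden: int = 128):
        super().__init__()
        self.rnn = nn.GRU(n_mels, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_aus)

    def forward(self, spec: torch.Tensor) -> torch.Tensor:
        # spec: (batch, time, n_mels) -> AU intensities in [0, 1] per frame
        h, _ = self.rnn(spec)
        return torch.sigmoid(self.head(h))


model = Speech2AU()
aus = model(torch.randn(2, 200, 40))
print(aus.shape)  # torch.Size([2, 200, 17]); each frame drives the 3D rig
```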
- short-paper, October 2018
Attention-based Audio-Visual Fusion for Robust Automatic Speech Recognition
ICMI '18: Proceedings of the 20th ACM International Conference on Multimodal Interaction, Pages 111–115. https://doi.org/10.1145/3242969.3243014
Automatic speech recognition can potentially benefit from lip motion patterns, which complement acoustic speech and improve overall recognition performance, particularly in noise. In this paper we propose an audio-visual fusion strategy that goes ...
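The general idea of attention-based modality fusion, learning per-utterance weights over the audio and visual feature vectors, can be sketched as follows. This illustrates the concept only, not the paper's exact strategy, and the feature dimensions are assumptions.

```python
# Attention-weighted audio-visual fusion: the model learns how much to trust
# each modality per utterance (useful when audio is noisy).
import torch
import torch.nn as nn


class AttentionFusion(nn.Module):
    def __init__(self, dim: int = 128):
        super().__init__()
        self.score = nn.Linear(dim, 1)  # shared scorer over modality features

    def forward(self, audio: torch.Tensor, visual: torch.Tensor) -> torch.Tensor:
        # audio, visual: (batch, dim) utterance-level features
        stacked = torch.stack([audio, visual], dim=1)        # (batch, 2, dim)
        weights = torch.softmax(self.score(stacked), dim=1)  # (batch, 2, 1)
        return (weights * stacked).sum(dim=1)                # fused (batch, dim)


fusion = AttentionFusion()
fused = fusion(torch.randn(4, 128), torch.randn(4, 128))
print(fused.shape)  # torch.Size([4, 128]); feed this to the ASR decoder
```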
- short-paper, October 2018
Population-specific Detection of Couples' Interpersonal Conflict using Multi-task Learning
ICMI '18: Proceedings of the 20th ACM International Conference on Multimodal Interaction, Pages 229–233. https://doi.org/10.1145/3242969.3243007
The inherent diversity of human behavior limits the capabilities of general large-scale machine learning systems, which usually require ample amounts of data to provide robust descriptors of the outcomes of interest. Motivated by this challenge, ...
- short-paper, October 2018
Multimodal Continuous Turn-Taking Prediction Using Multiscale RNNs
ICMI '18: Proceedings of the 20th ACM International Conference on Multimodal Interaction, Pages 186–190. https://doi.org/10.1145/3242969.3242997
In human conversational interactions, turn-taking exchanges can be coordinated using cues from multiple modalities. To design spoken dialog systems that can conduct fluid interactions, it is desirable to incorporate cues from separate modalities into ...
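The multiscale idea, separate recurrent networks reading the same features at different frame rates, can be sketched as below. The dimensions, downsampling factor, and fusion scheme are assumptions for illustration, not the paper's configuration.

```python
# Two GRUs over the same multimodal features at different timescales, fused
# into a per-frame turn-taking probability (illustrative sketch).
import torch
import torch.nn as nn


class MultiscaleTurnTaking(nn.Module):
    def __init__(self, in_dim: int = 32, hidden: int = 64, stride: int = 5):
        super().__init__()
        self.stride = stride
        self.fast = nn.GRU(in_dim, hidden, batch_first=True)  # full frame rate
        self.slow = nn.GRU(in_dim, hidden, batch_first=True)  # coarser rate
        self.head = nn.Linear(2 * hidden, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, features) multimodal frame features
        fast_out, _ = self.fast(x)
        slow_out, _ = self.slow(x[:, ::self.stride])  # downsample in time
        slow_up = slow_out.repeat_interleave(self.stride, dim=1)[:, :x.size(1)]
        logits = self.head(torch.cat([fast_out, slow_up], dim=2))
        return torch.sigmoid(logits.squeeze(2))  # per-frame P(turn shift)


model = MultiscaleTurnTaking()
print(model(torch.randn(2, 100, 32)).shape)  # torch.Size([2, 100])
```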