Decoding Contact: Automatic Estimation of Contact Signatures in Parent-Infant Free Play Interactions

Authors:

Metehan Doyran, Albert Ali Salah, Ronald Poppe

ICMI '24: Proceedings of the 26th International Conference on Multimodal Interaction

Pages 38 - 46

https://doi.org/10.1145/3678957.3685719

Published: 04 November 2024

Abstract

In parent-child interactions (PCIs), physical contact between the two actors is frequent. Quantifying this contact provides valuable input for assessing the nature of the interaction or the relationship between parent and child. Here, we explore the application of vision-based techniques to automatically detect contact signatures in each frame of video recordings of playful parent-infant interactions. We employ two separate models: (i) a multimodal convolutional neural network (CNN) that integrates 2D pose and body part information, and (ii) a unimodal graph convolutional neural network (GCN) that uses only 2D pose. We showcase the potential and limitations of automatic contact signature estimation through quantitative and qualitative assessments on a parent-infant free play interaction dataset consisting of 100 parent-child dyadic interactions, covering 20 hours. Additionally, our systematic experiments provide insights into various design choices. By releasing our annotations and code, we aim to enable further research in automatic contact signature estimation during free play interactions between parents and infants.

Supplemental Material

Appendix (PDF file, 596.16 KB)

Cited By

Doyran M, Salah A, Poppe R. (2024). Human Contact Annotator: Annotating Physical Contact in Dyadic Interactions. Companion Proceedings of the 26th International Conference on Multimodal Interaction, 97-99. DOI: 10.1145/3686215.3689346. Online publication date: 4-Nov-2024
https://dl.acm.org/doi/10.1145/3686215.3689346

Published In

ICMI '24: Proceedings of the 26th International Conference on Multimodal Interaction

November 2024

725 pages

ISBN: 9798400704628

DOI: 10.1145/3678957

Copyright © 2024 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

ICMI '24

ICMI '24: INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION

November 4 - 8, 2024

San Jose, Costa Rica

Acceptance Rates

Overall Acceptance Rate 453 of 1,080 submissions, 42%
