DOI: 10.1145/1027933.1027976
Article

MacVisSTA: a system for multimodal analysis

Published: 13 October 2004

Abstract

The study of embodied communication requires access to multiple data sources, such as multistream video and audio, and to various derived data and metadata such as gesture, head, posture, facial expression, and gaze information. The common element that runs through these data is the co-temporality of the multiple modes of behavior. In this paper, we present the multimedia Visualization for Situated Temporal Analysis (MacVisSTA) system for the analysis of multimodal human communication through video, audio, speech transcriptions, and gesture and head orientation data. The system uses a multiple linked representation strategy in which different representations are linked by the current time focus. In this framework, the multiple display components associated with the disparate data types are kept in synchrony, each component serving both as a controller of the system and as a display. Hence the user is able to analyze and manipulate the data from different analytical viewpoints (e.g., through the time-synchronized speech transcription or through motion segments of interest). MacVisSTA supports analysis of the synchronized data at varying timescales. It provides an annotation interface that permits users to code the data into 'music-score' objects, and to make and organize multimedia observations about the data. Hence MacVisSTA integrates flexible visualization with annotation within a single framework. An XML database manager has been created for storage and search of annotation data. We compare the system with other existing annotation tools with respect to functionality and interface design. The software runs on Macintosh OS X computer systems.
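The "multiple linked representation" strategy described in the abstract amounts to an observer pattern: a single shared time focus notifies every registered display component when it moves, and any component may itself move the focus, so each one acts as both controller and display. The following is a minimal reader's sketch of that idea in Python; the names (TimeFocus, View, seek, on_time_changed) are illustrative assumptions, not identifiers from MacVisSTA itself, which was a Macintosh OS X application.

    # Reader's sketch of the linked-representation pattern; all names are
    # hypothetical and do not come from MacVisSTA's actual source.

    class TimeFocus:
        """Shared current-time focus that keeps linked views in synchrony."""

        def __init__(self) -> None:
            self._time = 0.0
            self._views = []

        def register(self, view) -> None:
            self._views.append(view)

        def set_time(self, t: float, source=None) -> None:
            self._time = t
            # Notify every linked view except the one that initiated the change.
            for view in self._views:
                if view is not source:
                    view.on_time_changed(t)


    class View:
        """A display component (e.g. video, transcript, or motion-trace panel)."""

        def __init__(self, name: str, focus: TimeFocus) -> None:
            self.name = name
            self.focus = focus
            focus.register(self)

        def seek(self, t: float) -> None:
            # Acting as a controller: move the shared focus.
            self.focus.set_time(t, source=self)

        def on_time_changed(self, t: float) -> None:
            # Acting as a display: redraw at the new time.
            print(f"{self.name}: now showing t={t:.2f}s")


    focus = TimeFocus()
    video = View("video", focus)
    transcript = View("transcript", focus)
    # Clicking a word in the transcript repositions the video (and any other view).
    transcript.seek(12.5)

Excluding the initiating component from the notification loop avoids feedback cycles when, say, scrubbing the video also repositions the transcript; the same notification hook is a natural place to drive annotation displays at varying timescales.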



Published In

ICMI '04: Proceedings of the 6th international conference on Multimodal interfaces
October 2004
368 pages
ISBN: 1581139950
DOI: 10.1145/1027933

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. embodied communication
  2. flexible visualization and annotation
  3. gesture
  4. multimodal interaction
  5. multiple linked representation

Acceptance Rates

Overall acceptance rate: 453 of 1,080 submissions, 42%
