DOI: 10.1145/1877826.1877831
research-article

RANSAC-based training data selection for emotion recognition from spontaneous speech

Published: 29 October 2010

Abstract

Training datasets containing spontaneous emotional expressions are often imperfect due to the ambiguities and difficulties of labeling such data by human observers. In this paper, we present a Random Sample Consensus (RANSAC) based training approach for the problem of emotion recognition from spontaneous speech recordings. Our motivation is to insert a data cleaning process into the training phase of the Hidden Markov Models (HMMs) in order to remove suspicious instances of labels that may exist in the training dataset. Our experiments using HMMs with various numbers of states and Gaussian mixtures per state indicate that using RANSAC in the training phase improves the unweighted recall rates on the test set by up to 2.84%. This improvement in the accuracy of the classifier is shown to be statistically significant using McNemar's test.
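The abstract does not spell out the selection procedure, so the following is only a minimal sketch of the general RANSAC idea applied to training data selection, under stated assumptions. A nearest-class-mean classifier stands in for the HMMs, and the function names (`fit_class_means`, `ransac_select`, `unweighted_recall`), the parameters `n_iter` and `per_class`, and the toy data are all hypothetical. Each trial fits a model on a small random subset, and the training instances whose labels agree with that model form the consensus set; the largest consensus set is kept as the cleaned training data. Unweighted recall is computed as the mean of the per-class recalls.

```python
import random
from collections import defaultdict

def fit_class_means(samples):
    """Fit a nearest-class-mean model (an assumed stand-in for the HMMs)."""
    sums, counts = {}, defaultdict(int)
    for x, y in samples:
        if y not in sums:
            sums[y] = [0.0] * len(x)
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}

def predict(model, x):
    """Return the label whose class mean is closest to x (squared Euclidean)."""
    return min(model, key=lambda y: sum((a - b) ** 2 for a, b in zip(model[y], x)))

def ransac_select(train, n_iter=50, per_class=3, seed=0):
    """Return the largest consensus set found over n_iter random trials."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for sample in train:
        by_class[sample[1]].append(sample)
    best = []
    for _ in range(n_iter):
        # Draw a small random subset from every class and fit a trial model.
        subset = []
        for items in by_class.values():
            subset += rng.sample(items, min(per_class, len(items)))
        model = fit_class_means(subset)
        # Consensus set: training instances whose label the trial model agrees with.
        consensus = [(x, y) for x, y in train if predict(model, x) == y]
        if len(consensus) > len(best):
            best = consensus
    return best

def unweighted_recall(model, test):
    """Mean of per-class recalls, so every class counts equally."""
    hits, totals = defaultdict(int), defaultdict(int)
    for x, y in test:
        totals[y] += 1
        hits[y] += predict(model, x) == y
    return sum(hits[y] / totals[y] for y in totals) / len(totals)

# Toy 1-D data (hypothetical): two well-separated classes plus two suspicious labels.
train = [([0.1 * i], "neg") for i in range(10)]
train += [([4 + 0.1 * i], "pos") for i in range(10)]
train += [([4.5], "neg"), ([0.3], "pos")]  # likely mislabeled instances

selected = ransac_select(train)
final_model = fit_class_means(selected)
test = [([0.2], "neg"), ([0.5], "neg"), ([4.1], "pos"), ([1.0], "pos")]
uar = unweighted_recall(final_model, test)
```

In this toy setting the two suspicious instances never agree with any trial model, so they are excluded from the set used to fit the final model. With real speech features and HMMs, the subset size, the number of trials, and the agreement criterion would all need to be tuned.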



    Published In

    AFFINE '10: Proceedings of the 3rd international workshop on Affective interaction in natural environments
    October 2010
    106 pages
    ISBN:9781450301701
    DOI:10.1145/1877826

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. RANSAC
    2. affect recognition
    3. data cleaning
    4. data pruning
    5. emotional speech classification

    Qualifiers

    • Research-article

    Conference

    MM '10
    MM '10: ACM Multimedia Conference
    October 29, 2010
    Firenze, Italy
