DOI: 10.1145/3532213.3532223
research-article

Binaural Sound Source Localization based on Neural Networks in Mismatched HRTF Condition

Published: 13 July 2022

Abstract

Binaural sound source localization has wide applications, such as virtual sound localization in VR and speech enhancement, and has drawn the attention of many researchers. However, the mismatched HRTF condition is a severe problem that most previous research has ignored. In this paper, an experiment is first conducted to demonstrate the negative influence of HRTF individualization on binaural localization performance. To address this problem, an improved localization method is proposed. Both a DNN and a CNN are used in this method, and their performance is compared. A further experiment is then conducted to verify the effectiveness of the method, providing ideas for subsequent iterations.


Cited By

  • (2024) Auditory Cortex-Inspired Spectral Attention Modulation for Binaural Sound Localization in HRTF Mismatch. ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 8656-8660. DOI: 10.1109/ICASSP48485.2024.10446210. Online publication date: 14-Apr-2024.

    Published In

    ICCAI '22: Proceedings of the 8th International Conference on Computing and Artificial Intelligence
    March 2022
    809 pages
    ISBN: 9781450396110
    DOI: 10.1145/3532213
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Binaural localization
    2. HRTF
    3. Neural network

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    ICCAI '22
