DOI: 10.1145/2502081.2502106

Non-reference audio quality assessment for online live music recordings

Published: 21 October 2013

Abstract

Immensely popular video sharing websites such as YouTube have become the most important sources of music information for Internet users and the most prominent platform for sharing live music. The audio quality of this vast number of live music recordings, however, varies significantly due to factors such as environmental noise, location, and recording device. Yet most video search engines do not take audio quality into consideration when retrieving and ranking results. Given that most users prefer live music videos with better audio quality, we propose the first automatic, non-reference audio quality assessment framework for online live music video search. We first construct two annotated datasets of live music recordings. The first dataset contains 500 human-annotated pieces, and the second contains 2,400 synthetic pieces systematically generated by adding noise effects to clean recordings. We then formulate the assessment task as a ranking problem and solve it with a learning-based scheme. To validate the effectiveness of our framework, we perform both objective and subjective evaluations. Results show that our framework significantly improves the ranking performance of live music recording retrieval and can prove useful for various real-world music applications.
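
As a rough illustration of two ideas in the abstract (generating synthetic pieces by adding noise to clean recordings, and treating quality assessment as a ranking problem), the following Python sketch may help. It is not the authors' pipeline: the choice of white Gaussian noise, the use of signal-to-noise ratio (SNR) as a stand-in quality label, the pairwise-preference setup, and every function name here are assumptions made for illustration only.

```python
import math
import random


def add_noise_at_snr(clean, snr_db, seed=0):
    """Return a copy of `clean` with white Gaussian noise mixed in at `snr_db` dB.

    Assumption: white noise at a fixed SNR stands in for the paper's
    unspecified "noise effects".
    """
    rng = random.Random(seed)
    signal_power = sum(x * x for x in clean) / len(clean)
    noise = [rng.gauss(0.0, 1.0) for _ in clean]
    noise_power = sum(n * n for n in noise) / len(noise)
    # Scale the noise so that signal_power / noise_power == 10 ** (snr_db / 10).
    target_noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    scale = math.sqrt(target_noise_power / noise_power)
    return [x + scale * n for x, n in zip(clean, noise)]


def measured_snr_db(clean, noisy):
    """Compute the SNR in dB of `noisy` relative to the known `clean` signal."""
    signal_power = sum(x * x for x in clean) / len(clean)
    noise_power = sum((y - x) ** 2 for x, y in zip(clean, noisy)) / len(clean)
    return 10.0 * math.log10(signal_power / noise_power)


def preference_pairs(versions):
    """Turn (piece_id, quality_label) items into (better_id, worse_id) pairs.

    Pairwise preferences are one common input format for learning-to-rank;
    ties produce no pair.
    """
    pairs = []
    for i, (id_a, label_a) in enumerate(versions):
        for id_b, label_b in versions[i + 1:]:
            if label_a > label_b:
                pairs.append((id_a, id_b))
            elif label_b > label_a:
                pairs.append((id_b, id_a))
    return pairs


# A one-second 440 Hz tone at 8 kHz stands in for a clean recording.
clean = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(8000)]

# Degrade it at three hypothetical SNR levels and label each version.
versions = []
degraded = {}
for snr_db in (30, 20, 10):
    piece_id = f"tone_snr{snr_db}"
    degraded[piece_id] = add_noise_at_snr(clean, snr_db, seed=snr_db)
    versions.append((piece_id, snr_db))

pairs = preference_pairs(versions)
```

In a non-reference setting the ranker would see only features extracted from the degraded audio; the clean signal is used here solely to construct the training labels.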

Published In

MM '13: Proceedings of the 21st ACM international conference on Multimedia
October 2013
1166 pages
ISBN:9781450324045
DOI:10.1145/2502081

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. audio quality assessment
  2. learning-to-rank
  3. live music videos
  4. music information retrieval

Qualifiers

  • Research-article

Conference

MM '13
MM '13: ACM Multimedia Conference
October 21 - 25, 2013
Barcelona, Spain

Acceptance Rates

MM '13 Paper Acceptance Rate 47 of 235 submissions, 20%;
Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
