DOI: 10.1145/2671188.2749322

Discriminative Latent Feature Space Learning for Cross-Modal Retrieval

Published: 22 June 2015

Abstract

Cross-modal retrieval has drawn much attention in recent years due to its wide range of applications. Most existing methods focus only on relevance, overlooking the heterogeneity and discriminability of features from different modalities, and how to capture and correlate these heterogeneous features remains a key challenge in this field. We therefore propose a general model that jointly learns a discriminative latent feature space for effective cross-modal retrieval. Concretely, a class-specific dictionary is learned for each modality, and all resulting sparse codes are simultaneously mapped into a common feature space that describes and associates the cross-modal data. Moreover, label information is leveraged to separate different classes within each modality and to draw together same-class samples across modalities. Cross-modal retrieval is finally performed over the learned common feature space. Experimental results confirm that our method outperforms several competing methods on two public datasets.
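The pipeline the abstract describes can be sketched in a few lines. This is a minimal illustration, not the authors' actual objective: scikit-learn's `DictionaryLearning` stands in for the class-specific dictionary learning, and a ridge regression onto one-hot labels stands in for the label-driven projection into the common space; the toy data, dimensions, and variable names are all assumptions.

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

# Toy paired data: 40 samples observed in two modalities
# (e.g. 20-d image features and 10-d text features).
X_img = rng.standard_normal((40, 20))
X_txt = rng.standard_normal((40, 10))
y = rng.integers(0, 2, size=40)   # class labels shared across modalities

# Step 1: learn a dictionary per modality and sparse-code each sample.
dl_img = DictionaryLearning(n_components=8, alpha=1.0, random_state=0)
dl_txt = DictionaryLearning(n_components=8, alpha=1.0, random_state=0)
A_img = dl_img.fit_transform(X_img)   # sparse codes, shape (40, 8)
A_txt = dl_txt.fit_transform(X_txt)

# Step 2: map both sets of sparse codes into one shared space by
# regressing them onto a common label-derived target (a crude stand-in
# for the discriminative term that merges same-class inter-modality data).
Y = np.eye(2)[y]                      # one-hot targets, shape (40, 2)
P_img = Ridge(alpha=0.1).fit(A_img, Y)
P_txt = Ridge(alpha=0.1).fit(A_txt, Y)
Z_img = P_img.predict(A_img)          # common-space embeddings
Z_txt = P_txt.predict(A_txt)

# Step 3: cross-modal retrieval = nearest neighbours in the common space.
query = Z_img[0]                                  # an image query
dists = np.linalg.norm(Z_txt - query, axis=1)
ranked = np.argsort(dists)                        # text items, best first
```

In the paper the dictionaries, sparse codes, and projections are optimized jointly under one objective; the sketch above decouples them into sequential steps purely for readability.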



Published In

ICMR '15: Proceedings of the 5th ACM on International Conference on Multimedia Retrieval
June 2015, 700 pages
ISBN: 9781450332743
DOI: 10.1145/2671188

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. cross-modal retrieval
    2. discriminative dictionary learning
    3. latent space

    Qualifiers

    • Short-paper

    Funding Sources

    • National High Technology Research and Development Program of China
    • Fundamental Research Funds for the Central Universities
    • Program for New Century Excellent Talents in University
    • Key Science and Technology Program of Shaanxi Province China

    Conference

    ICMR '15

    Acceptance Rates

ICMR '15 paper acceptance rate: 48 of 127 submissions, 38%
Overall acceptance rate: 254 of 830 submissions, 31%
