DOI: 10.1145/2623330.2623678

Distance metric learning using dropout: a structured regularization approach

Published: 24 August 2014

Abstract

Distance metric learning (DML) aims to learn a distance metric that serves a given task better than the Euclidean distance. It has been successfully applied to various tasks, e.g., classification, clustering, and information retrieval. Many DML algorithms suffer from over-fitting because of the large number of parameters that must be determined. In this paper, we exploit the dropout technique, which has been applied successfully in deep learning to alleviate over-fitting, for DML. Unlike previous studies that apply dropout only to the training data, we apply dropout to both the learned metric and the training data. We show that applying dropout to DML is essentially equivalent to matrix-norm-based regularization. Compared with the standard regularization scheme in DML, dropout has the advantage of simulating structured regularizers, which have consistently shown better performance than non-structured regularizers. We verify, both empirically and theoretically, that dropout is effective in regularizing the learned metric to avoid over-fitting. Last, we examine the idea of wrapping the dropout technique into state-of-the-art DML methods and observe that dropout can significantly improve their performance.
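To make the idea concrete, the following is a minimal sketch (not the authors' implementation) of how dropout might be applied to both the training data and a learned Mahalanobis metric during stochastic gradient descent. The pairwise hinge loss, the dropout probabilities p_data and p_metric, and all function names are illustrative assumptions rather than details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)


def mahalanobis_sq(M, x, y):
    """Squared Mahalanobis distance (x - y)^T M (x - y)."""
    d = x - y
    return d @ M @ d


def dropout_mask(shape, p, rng):
    """Bernoulli keep-mask scaled by 1 / (1 - p) so expectations are preserved."""
    return (rng.random(shape) >= p) / (1.0 - p)


def sgd_step(M, x, y, label, lr=0.05, p_data=0.2, p_metric=0.2, rng=rng):
    """One pairwise hinge-loss step with dropout on the inputs and on the metric.

    label = +1 for a similar pair, -1 for a dissimilar pair. Similar pairs are
    penalized when their (dropped-out) distance exceeds 1, dissimilar pairs
    when it falls below 1. This loss is an illustrative choice, not the
    paper's exact objective.
    """
    # Dropout on the training pair: zero out random features.
    x = x * dropout_mask(x.shape, p_data, rng)
    y = y * dropout_mask(y.shape, p_data, rng)

    # Dropout on the learned metric: mask entries symmetrically so M stays symmetric.
    mask = dropout_mask(M.shape, p_metric, rng)
    mask = np.triu(mask) + np.triu(mask, 1).T
    M_drop = M * mask

    # Hinge loss on the squared distance with a unit threshold.
    if label * (mahalanobis_sq(M_drop, x, y) - 1.0) > 0:
        d = x - y
        grad = label * np.outer(d, d) * mask  # gradient flows only through kept entries
        M = M - lr * grad
        # Project back onto the PSD cone so M remains a valid metric.
        w, V = np.linalg.eigh((M + M.T) / 2)
        M = (V * np.clip(w, 0.0, None)) @ V.T
    return M


# Toy usage: learn a metric for 5-dimensional points from random similar/dissimilar pairs.
dim = 5
M = np.eye(dim)
for _ in range(200):
    x, y = rng.normal(size=dim), rng.normal(size=dim)
    label = 1 if rng.random() < 0.5 else -1
    M = sgd_step(M, x, y, label)
print(np.round(M, 2))
```

Averaged over many dropout masks, the masked-metric term behaves like an entry-wise penalty on M, which is the regularization view described in the abstract; the specific penalty induced depends on how the dropout is applied.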

Supplementary Material

MP4 File (p323-sidebyside.mp4)

      Published In

      KDD '14: Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining
      August 2014
      2028 pages
      ISBN:9781450329569
      DOI:10.1145/2623330

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. distance metric learning
      2. dropout

      Qualifiers

      • Research-article

      Conference

      KDD '14

      Acceptance Rates

      KDD '14 Paper Acceptance Rate 151 of 1,036 submissions, 15%;
      Overall Acceptance Rate 1,133 of 8,635 submissions, 13%
