poster

Serendipitous learning: learning beyond the predefined label space

Published: 21 August 2011

Abstract

Most traditional supervised learning methods learn a model from labeled examples and use it to classify unlabeled examples into the same predefined label space. In many real-world applications, however, the label spaces of the labeled (training) and unlabeled (testing) examples can differ. To address this problem, this paper proposes the novel notion of Serendipitous Learning (SL), defined for learning scenarios in which the label space can be enlarged during the testing phase. In particular, a large margin approach is proposed to solve SL. The basic idea is to leverage the knowledge in the labeled examples to help identify novel/unknown classes; the large margin formulation incorporates both the classification loss on examples from the known categories and the clustering loss on examples from the unknown categories. An efficient optimization algorithm based on CCCP and the bundle method is proposed to solve the resulting optimization problem. Moreover, an efficient online learning method, shown to enjoy a guaranteed regret bound, is proposed to handle large-scale data in the online setting. An extensive set of experiments on two synthetic datasets and two real-world datasets demonstrates the advantages of the proposed method over several baseline algorithms. One limitation of the proposed method is that the number of unknown classes must be given in advance; it may be possible to remove this constraint by modeling it non-parametrically. We also plan to run experiments on more real-world applications in the future.
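The combined objective described above (classification loss on known categories plus a clustering loss on unknown categories) can be sketched as follows. This is a minimal illustrative sketch only, assuming a linear model, a Crammer-Singer-style multiclass hinge loss on labeled examples, and a best-versus-second-best margin loss on unlabeled examples; the function name, loss forms, and parameters are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def serendipitous_loss(W, X_lab, y_lab, X_unl, n_known, n_unknown, lam=0.1):
    """Illustrative joint large-margin objective in the spirit of SL.

    W      : (n_known + n_unknown, d) weights, one row per class/cluster.
    X_lab  : labeled examples with labels y_lab in {0, ..., n_known - 1}.
    X_unl  : unlabeled examples that may belong to unknown classes.
    """
    # Multiclass hinge loss on labeled examples, restricted to known classes.
    scores_lab = X_lab @ W[:n_known].T                     # (n_lab, n_known)
    correct = scores_lab[np.arange(len(y_lab)), y_lab]
    margins = np.maximum(0.0, 1.0 + scores_lab - correct[:, None])
    margins[np.arange(len(y_lab)), y_lab] = 0.0
    clf_loss = margins.max(axis=1).sum()

    # Clustering-style loss on unlabeled examples over all classes/clusters:
    # encourage a margin between the best and second-best scores, so each
    # example is pushed firmly toward one known class or unknown cluster.
    scores_unl = X_unl @ W.T                               # (n_unl, total)
    top2 = np.sort(scores_unl, axis=1)[:, -2:]             # [second-best, best]
    clu_loss = np.maximum(0.0, 1.0 + top2[:, 0] - top2[:, 1]).sum()

    reg = 0.5 * lam * np.sum(W * W)
    return reg + clf_loss + clu_loss
```

Because both loss terms are pointwise maxima of linear functions of W, the objective is a difference-of-convex-style piecewise function amenable to CCCP/bundle-type solvers, and a subgradient of it could drive an online update of the kind the abstract mentions.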


    Published In

    KDD '11: Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining
    August 2011
    1446 pages
    ISBN: 9781450308137
    DOI: 10.1145/2020408

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. label space
    2. maximum margin classification
    3. serendipitous learning


    Acceptance Rates

    Overall Acceptance Rate 1,133 of 8,635 submissions, 13%
