skip to main content
10.1145/2623330.2623611acmconferencesArticle/Chapter ViewAbstractPublication PageskddConference Proceedingsconference-collections
research-article

Effective global approaches for mutual information based feature selection

Published: 24 August 2014 Publication History

Abstract

Most current mutual information (MI) based feature selection techniques are greedy in nature thus are prone to sub-optimal decisions. Potential performance improvements could be gained by systematically posing MI-based feature selection as a global optimization problem. A rare attempt at providing a global solution for the MI-based feature selection is the recently proposed Quadratic Programming Feature Selection (QPFS) approach. We point out that the QPFS formulation faces several non-trivial issues, in particular, how to properly treat feature `self-redundancy' while ensuring the convexity of the objective function. In this paper, we take a systematic approach to the problem of global MI-based feature selection. We show how the resulting NP-hard global optimization problem could be efficiently approximately solved via spectral relaxation and semi-definite programming techniques. We experimentally demonstrate the efficiency and effectiveness of these novel feature selection frameworks.

References

[1]
K. Bache and M. Lichman. UCI machine learning repository, 2013.
[2]
R. Battiti. Using mutual information for selecting features in supervised neural net learning. Neural Networks, IEEE Transactions on, 5(4):537--550, 1994.
[3]
G. Brown, A. Pocock, M.-J. Zhao, and M. Luján. Conditional likelihood maximisation: A unifying framework for information theoretic feature selection. J. Mach. Learn. Res., 13:27--66, Mar. 2012.
[4]
C.-C. Chang and C.-J. Lin. LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology, 2:27:1--27:27, 2011. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
[5]
A. W. Chaovalitwongse, I. P. Androulakis, and P. M. Pardalos. Quadratic integer programming: complexity and equivalent forms quadratic integer programming: Complexity and equivalent forms. In C. A. Floudas and P. M. Pardalos, editors, Encyclopedia of Optimization, pages 3153--3159. 2009.
[6]
H. Cheng, Z. Qin, W. Qian, and W. Liu. Conditional mutual information based feature selection. In Knowledge Acquisition and Modeling, pages 103--107, 2008.
[7]
C. Ding and H. Peng. Minimum redundancy feature selection from microarray gene expression data. In Bioinformatics Conference, 2003, pages 523--528, 2003.
[8]
U. M. Fayyad and K. B. Irani. Multi-interval discretization of continuous-valued attributes for classification learning. In IJCAI, pages 1022--1029, 1993.
[9]
C. Fowlkes, S. Belongie, and J. Malik. Efficient spatiotemporal grouping using the nystrom method. In CVPR 2001, volume 1, pages I--231--I--238 vol.1, 2001.
[10]
J. Gallier. Geometric Methods and Applications: For Computer Science and Engineering. Texts in Applied Mathematics. Springer-Verlag GmbH, 2001.
[11]
M. X. Goemans and D. P. Williamson. Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming. J. ACM, 42(6):1115--1145, Nov. 1995.
[12]
M. Grant and S. Boyd. CVX: Matlab software for disciplined convex programming, version 2.0 beta. http://cvxr.com/cvx, Sept. 2013.
[13]
B. Guo and M. Nixon. Gait feature subset selection by mutual information. IEEE Trans. Syst., Man, Cybern. A, Syst., Humans, 39(1):36--46, 2009.
[14]
G. Herman, B. Zhang, Y. Wang, G. Ye, and F. Chen. Mutual information-based method for selecting informative feature sets. Pattern Recognition, 46(12):3315 -- 3327, 2013.
[15]
M. Li, J. T. Kwok, and B.-L. Lu. Making large-scale nystrom approximation possible. In J. Furnkranz and T. Joachims, editors, ICML, pages 631--638. Omnipress, 2010.
[16]
D. Lin and X. Tang. Conditional infomax learning: an integrated framework for feature extraction and fusion. In ECCV'06.
[17]
C.-L. Liu, F. Yin, D.-H. Wang, and Q.-F. Wang. Casia online and offline chinese handwriting databases. In Document Analysis and Recognition (ICDAR), 2011 International Conference on, pages 37--41, Sept 2011.
[18]
Z.-Q. Luo, W.-K. Ma, A.-C. So, Y. Ye, and S. Zhang. Semidefinite relaxation of quadratic optimization problems. Signal Processing Magazine, IEEE, 27(3):20--34, 2010.
[19]
T. Mensink, J. Verbeek, F. Perronnin, and G. Csurka. Distance-based image classification: Generalizing to new classes at near-zero cost. IEEE TPAMI, 35(11):2624--2637, Nov 2013.
[20]
H. Peng, F. Long, and C. Ding. Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy. IEEE Trans. Pattern Anal. Mach. Intell., 27(8):1226--1238, 2005.
[21]
I. Rodriguez-Lujan, R. Huerta, C. Elkan, and C. S. Cruz. Quadratic programming feature selection. J. Mach. Learn. Res., 11:1491--1516, 2010.
[22]
J. M. Sotoca and F. Pla. Supervised feature selection by clustering using conditional mutual information-based distances. Pattern Recognition, 43(6):2068--2081, June 2010.
[23]
K. C. Toh, M. Todd, and R. H. Tutuncu. Sdpt3 -- a matlab software package for semidefinite programming. Optimization methods and software, 11:545--581, 1999.
[24]
N. Vinh and J. Bailey. Comments on supervised feature selection by clustering using conditional mutual information-based distances. Pattern Recognition, 46(4):1220 -- 1225, 2013.
[25]
D. H. Wolpert. The lack of a priori distinctions between learning algorithms. Neural computation, 8(7):1341--1390, 1996.
[26]
M. Yamashita, K. Fujisawa, M. Fukuda, K. Nakata, and M. Nakata. Algorithm 925: Parallel solver for semidefinite programming problem having sparse schur complement matrix. ACM Trans. Math. Softw., 39(1):6:1--6:22, Nov. 2012.
[27]
Y. Zhang, S. Burer, and W. N. Street. Ensemble pruning via semi-definite programming. J. Mach. Learn. Res., 7:1315--1338, Dec. 2006.

Cited By

View all

Index Terms

  1. Effective global approaches for mutual information based feature selection

    Recommendations

    Comments

    Information & Contributors

    Information

    Published In

    cover image ACM Conferences
    KDD '14: Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining
    August 2014
    2028 pages
    ISBN:9781450329569
    DOI:10.1145/2623330
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Sponsors

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 24 August 2014

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. feature selection
    2. global optimization
    3. mutual information
    4. semi-definite programming
    5. spectral relaxation

    Qualifiers

    • Research-article

    Funding Sources

    Conference

    KDD '14
    Sponsor:

    Acceptance Rates

    KDD '14 Paper Acceptance Rate 151 of 1,036 submissions, 15%;
    Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)68
    • Downloads (Last 6 weeks)8
    Reflects downloads up to 06 Nov 2024

    Other Metrics

    Citations

    Cited By

    View all

    View Options

    Get Access

    Login options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media