DOI: 10.5555/791227.792170
Article

Advanced Local Feature Selection in Medical Diagnostics

Published: 23 June 2000

Abstract

Current electronic data repositories contain enormous amounts of data, especially in medical domains, where data are often heterogeneous across the feature space: different features have different importance in different sub-areas of the space. In this paper, we propose a technique that searches for a strategic splitting of the feature space, identifying the best subset of features for each instance. The technique is based on the wrapper approach, in which a classification algorithm serves as the evaluation function for comparing feature subsets. We apply a recently developed technique for dynamic integration of classifiers and use decision trees. For each test instance, we consider only those feature combinations that include features present on the path the instance takes through the decision tree. We evaluate the technique on medical datasets from the UCI machine learning repository. The experiments show that local feature selection is often advantageous compared with feature selection over the whole space.
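The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the hand-built tree `TREE`, the helper names, and the use of leave-one-out 1-nearest-neighbour accuracy as the wrapper's evaluation function are all assumptions made for illustration (the paper itself relies on dynamic integration of classifiers). The sketch also takes one possible reading of the candidate set, namely all non-empty subsets of the features tested on the instance's path.

```python
from itertools import combinations

# Hypothetical hand-built decision tree (an assumption, not from the paper):
# internal nodes test one feature against a threshold, leaves carry a label.
TREE = {
    "feature": 0, "threshold": 5.0,
    "left": {"feature": 1, "threshold": 2.0,
             "left": {"label": "A"}, "right": {"label": "B"}},
    "right": {"feature": 2, "threshold": 1.0,
              "left": {"label": "B"}, "right": {"label": "A"}},
}

def path_features(tree, x):
    """Return the feature indices tested on the path x takes to a leaf."""
    feats = []
    node = tree
    while "label" not in node:
        f = node["feature"]
        if f not in feats:
            feats.append(f)
        node = node["left"] if x[f] <= node["threshold"] else node["right"]
    return feats

def candidate_subsets(feats):
    """All non-empty subsets of the path features, smallest first."""
    return [c for r in range(1, len(feats) + 1) for c in combinations(feats, r)]

def sq_dist(a, b, subset):
    """Squared Euclidean distance restricted to the chosen features."""
    return sum((a[f] - b[f]) ** 2 for f in subset)

def loo_accuracy(X, y, subset):
    """Wrapper evaluation (stand-in): leave-one-out accuracy of 1-NN."""
    correct = 0
    for i in range(len(X)):
        others = (k for k in range(len(X)) if k != i)
        nearest = min(others, key=lambda k: sq_dist(X[i], X[k], subset))
        correct += y[nearest] == y[i]
    return correct / len(X)

def select_local_subset(tree, X, y, x):
    """Pick the best-scoring subset among those built from x's path features."""
    feats = path_features(tree, x)
    return max(candidate_subsets(feats), key=lambda s: loo_accuracy(X, y, s))

# Toy data (made up): feature 1 separates the classes, feature 0 misleads.
X = [[1.0, 0.0, 0.0], [9.0, 0.1, 0.0], [1.2, 3.0, 0.0], [9.2, 3.1, 0.0]]
y = ["A", "A", "B", "B"]
best = select_local_subset(TREE, X, y, [1.1, 2.9, 0.0])
```

Restricting the candidates to path features keeps the wrapper search small: an instance whose path tests only two features yields three candidate subsets, regardless of how many features the dataset has.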



Published In

CBMS '00: Proceedings of the 13th IEEE Symposium on Computer-Based Medical Systems (CBMS'00)
June 2000
ISBN: 0769504841

Publisher

IEEE Computer Society

United States

Author Tags

  1. data mining
  2. decision trees
  3. dynamic integration of classifiers
  4. feature selection
  5. knowledge discovery
  6. medical databases
