research-article

Towards efficient and effective discovery of Markov blankets for feature selection

Authors:

Xindong Wu

Volume 509, Issue C

Pages 227 - 242

https://doi.org/10.1016/j.ins.2019.09.010

Published: 01 January 2020

Abstract

The Markov blanket (MB), a key concept in Bayesian networks (BNs), is essential for large-scale BN structure learning and optimal feature selection. Many MB discovery algorithms, each either efficient or effective, have been proposed for high-dimensional data. In this paper, we propose a new algorithm for Efficient and Effective MB discovery, called EEMB. Specifically, given a target feature, EEMB discovers the PC (i.e., parents and children) and the spouses of the target simultaneously, and it can distinguish PC from spouses during MB discovery. We compare EEMB with state-of-the-art MB discovery algorithms on a series of benchmark BNs and real-world datasets. The experiments demonstrate that EEMB is competitive with the fastest MB discovery algorithm in terms of computational efficiency and achieves almost the same MB discovery accuracy as the most accurate of the compared algorithms.
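
To make the PC/spouse terminology concrete, the following minimal Python sketch reads the Markov blanket of a target off a small DAG whose structure is already known. It is not EEMB: the abstract does not describe how EEMB learns these sets from data, so this only illustrates the definition. The graph, the node names and the helper `markov_blanket` are hypothetical and chosen purely for illustration.

```python
from typing import Dict, Set, Tuple

# Hypothetical toy DAG (an assumption for illustration):
# each key maps a node to the set of its parents.
PARENTS: Dict[str, Set[str]] = {
    "A": set(),
    "B": set(),
    "T": {"A"},        # A -> T
    "C": {"T", "B"},   # T -> C <- B, so B is a spouse of T
    "D": {"C"},        # C -> D; D is a descendant of T but not in its MB
}


def markov_blanket(parents: Dict[str, Set[str]], target: str) -> Tuple[Set[str], Set[str]]:
    """Return (PC, spouses) of `target`, read directly off a known DAG."""
    pa = set(parents[target])
    children = {node for node, ps in parents.items() if target in ps}
    pc = pa | children
    # Spouses are the other parents of the target's children.
    # Nodes already in PC (and the target itself) are excluded so the
    # two sets stay disjoint.
    spouses: Set[str] = set()
    for child in children:
        spouses |= parents[child]
    spouses -= pc | {target}
    return pc, spouses


pc, spouses = markov_blanket(PARENTS, "T")
mb = pc | spouses  # the Markov blanket is the union of PC and spouses
print(sorted(pc), sorted(spouses), sorted(mb))
```

In this toy graph the MB of `T` contains its parent `A`, its child `C` and the spouse `B`, while the grandchild `D` is left out. MB discovery algorithms aim to recover these sets from data alone, without access to the graph.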

Published In

Information Sciences: an International Journal Volume 509, Issue C

Jan 2020

530 pages

ISSN:0020-0255

Elsevier Inc.

Publisher

Elsevier Science Inc.

United States
