Ensemble mutation slime mould algorithm with restart mechanism for feature selection

Published: 25 January 2022

Abstract

Existing data acquisition technologies require further improvement to meet the growing need for big, accurate, and high‐quality data collection, and most collected data contain redundant information such as noise. To improve classification accuracy, dimensionality reduction, also known as feature selection, is a necessary step in data processing. In this paper, the slime mould algorithm (SMA) is improved by introducing a composite mutation strategy (CMS) and a restart strategy (RS); the resulting algorithm is named CMSRSSMA. The CMS is used to increase population diversity, and the RS helps the search escape local optima. The effectiveness of CMSRSSMA is verified on the CEC2017 benchmark functions. A CMSRSSMA‐SVM model is then proposed to perform feature selection and SVM parameter optimization simultaneously, and its performance is evaluated on 14 data sets from the UCI data repository. Experimental results show that the proposed method outperforms competing algorithms in terms of classification accuracy, number of selected features, and fitness value on most of the selected data sets.
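The abstract's two ingredients, a composite mutation strategy for population diversity and a restart strategy against local optima, can be illustrated with a minimal sketch. This is not the authors' CMSRSSMA: the toy objective (`sphere`), the two DE‐style mutation operators, and every parameter here (`stall_limit`, the scale factor `f`, the bounds) are illustrative assumptions; the paper's actual update rules follow the slime mould algorithm.

```python
import random

def sphere(x):
    """Toy objective: sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)

def optimize(dim=5, pop_size=20, iters=200, stall_limit=15, seed=1):
    """Toy population optimizer combining a composite mutation strategy
    (randomly choosing between two DE-style operators) with a restart
    strategy that reinitializes long-stagnant individuals."""
    rng = random.Random(seed)
    lo, hi = -5.0, 5.0
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    fit = [sphere(p) for p in pop]
    stall = [0] * pop_size              # iterations without improvement
    best_x = min(pop, key=sphere)
    best_f = sphere(best_x)

    for _ in range(iters):
        for i in range(pop_size):
            a, b = rng.sample(range(pop_size), 2)
            f = 0.5
            # Composite mutation: pick one of two operators at random,
            # so the search does not rely on a single behavior.
            base = best_x if rng.random() < 0.5 else pop[i]
            trial = [min(hi, max(lo, base[d] + f * (pop[a][d] - pop[b][d])))
                     for d in range(dim)]
            tf = sphere(trial)
            if tf < fit[i]:             # greedy one-to-one selection
                pop[i], fit[i], stall[i] = trial, tf, 0
                if tf < best_f:
                    best_x, best_f = trial, tf
            else:
                stall[i] += 1
            # Restart: a stagnant individual is resampled uniformly,
            # trading its current position for renewed exploration.
            if stall[i] >= stall_limit:
                pop[i] = [rng.uniform(lo, hi) for _ in range(dim)]
                fit[i], stall[i] = sphere(pop[i]), 0
    return best_f
```

Note that the best solution found so far is tracked separately, so a restart can discard a stagnant individual without losing the algorithm's overall result.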



Published In

International Journal of Intelligent Systems, Volume 37, Issue 3
March 2022
871 pages
ISSN:0884-8173
DOI:10.1002/int.v37.3

Publisher

John Wiley and Sons Ltd.

United Kingdom

Author Tags

  1. composite mutation strategy
  2. feature selection
  3. restart strategy
  4. slime mould algorithm
  5. support vector machine

Qualifiers

  • Research-article
