DOI:10.1145/3674658.3674667
research-article

Prediction of Linear B-cell Epitopes Based on Multi-Cluster Feature Selection

Published: 18 November 2024

Abstract

The identification of B-cell epitopes is crucial for the design and development of immunodiagnostic kits and vaccines. Existing epitope prediction models require feature preprocessing. This requirement is “arbitrary” and may reduce the recognition accuracy of the epitope prediction model. In this work, we equip model training with a suitable preprocessing strategy, “Multi-Cluster Feature Selection (MCFS)”, to eliminate the above requirement. The new learning network structure, called BCELM, can improve linear B-cell epitope classification. On the Lbtope_Fixed_non_redundant (LF) dataset, we demonstrate that MCFS boosts the accuracy of a variety of machine learning architectures despite their different designs. With MCFS’s ability to measure correlations, BCELM achieves outstanding classification results using the physico-chemical properties and position entropy of amino acids. On independent test antigen proteins, our method is more accurate and effective than the comparison model.
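
The abstract names position entropy as an input feature but does not define it. As a minimal sketch, assuming it means the Shannon entropy of the residue distribution at each position over a set of equal-length peptides, it could be computed as below. The function name `position_entropy` and the example peptides are hypothetical and are not taken from the paper; the authors' actual definition may differ.

```python
import math
from collections import Counter


def position_entropy(peptides):
    """Return the Shannon entropy (in bits) of each column of equal-length peptides.

    Assumption: "position entropy" is read here as per-position Shannon entropy;
    this is an illustrative definition, not necessarily the one used in the paper.
    """
    if not peptides:
        raise ValueError("need at least one peptide")
    length = len(peptides[0])
    if any(len(p) != length for p in peptides):
        raise ValueError("all peptides must have the same length")

    entropies = []
    # zip(*peptides) yields one tuple of residues per sequence position.
    for column in zip(*peptides):
        counts = Counter(column)
        total = len(column)
        # H = sum over residues of p * log2(1 / p)
        h = sum((c / total) * math.log2(total / c) for c in counts.values())
        entropies.append(h)
    return entropies


# Hypothetical toy peptides (four sequences, four positions each).
peptides = ["ACDE", "ACDF", "AGDG", "ATDH"]
entropy = position_entropy(peptides)
```

A fully conserved position scores 0 bits, and a position where all four toy peptides differ scores 2 bits, so higher values mark more variable positions.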

Information

Published In

ICBBT '24: Proceedings of the 2024 16th International Conference on Bioinformatics and Biomedical Technology
May 2024
279 pages
ISBN:9798400717666
DOI:10.1145/3674658
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. B-cell
  2. epitope prediction
  3. feature preprocessing
  4. feature selection
  5. position entropy

Qualifiers

  • Research-article

Funding Sources

  • Hebei Natural Science Foundation, Science & Technology program of Hebei Academy of Sciences

Conference

ICBBT 2024
