Intelligent consensus modeling for proline cis-trans isomerization prediction

Published: 01 January 2014

Abstract

Proline cis-trans isomerization (CTI) plays a key role in the rate-determining steps of protein folding. Accurate prediction of proline CTI is important for understanding protein folding, splicing, cell signaling, and transmembrane active transport in both humans and animals. Our goal is to develop a state-of-the-art proline CTI predictor based on biophysically motivated intelligent consensus modeling that uses sequence information only (i.e., position-specific scores generated by PSI-BLAST). Current computational proline CTI predictors reach Q2 accuracies of about 70-73 percent and a Matthews correlation coefficient (MCC) of about 0.40, using sequence-based evolutionary information together with predicted protein secondary structure information. Our approach, which combines a novel decision tree-based consensus model with a randomized meta-learning technique, achieves 86.58 percent Q2 accuracy and 0.74 MCC on the same proline CTI data set, a better result than that of any existing computational proline CTI predictor reported in the literature.
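The two measures quoted in the abstract can be illustrated with a minimal sketch. It assumes the standard definitions: Q2 is the fraction of residues assigned to the correct one of two classes, and the Matthews correlation coefficient is computed from the binary confusion matrix. The function names, the "cis"/"trans" labels, and the toy data are illustrative assumptions and are not taken from the paper.

```python
import math


def confusion_counts(y_true, y_pred, positive="cis"):
    """Count TP, FP, TN, FN, treating `positive` as the positive class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def q2_accuracy(tp, fp, tn, fn):
    """Two-state accuracy: correctly classified residues over all residues."""
    total = tp + fp + tn + fn
    return (tp + tn) / total if total else 0.0


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; 0.0 when the denominator vanishes."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy labels (hypothetical, for illustration only).
y_true = ["cis", "cis", "trans", "trans", "trans", "cis"]
y_pred = ["cis", "trans", "trans", "trans", "cis", "cis"]

counts = confusion_counts(y_true, y_pred)
print("Q2  =", round(q2_accuracy(*counts), 4))
print("MCC =", round(mcc(*counts), 4))
```

Because cis prolines are typically far rarer than trans prolines, accuracy alone can look high for a predictor that mostly outputs "trans"; the MCC penalizes that behavior, which is presumably why both measures are reported.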

Published In

IEEE/ACM Transactions on Computational Biology and Bioinformatics, Volume 11, Issue 1
January/February 2014
265 pages

Publisher

IEEE Computer Society Press

Washington, DC, United States

Publication History

Published: 01 January 2014
Accepted: 14 October 2013
Received: 09 October 2013
Published in TCBB Volume 11, Issue 1

Author Tags

  1. ensemble methods
  2. intelligent systems
  3. machine-learning
  4. proline cis-trans isomerization

Qualifiers

  • Article
