SPINE X : improving protein secondary structure prediction by multistep learning coupled with prediction of solvent accessible surface area and backbone torsion angles

Copyright © 2011 Wiley Periodicals, Inc.

Bibliographische Detailangaben
Veröffentlicht in:Journal of computational chemistry. - 1984. - 33(2012), 3 vom: 30. Jan., Seite 259-67
1. Verfasser: Faraggi, Eshel (VerfasserIn)
Weitere Verfasser: Zhang, Tuo, Yang, Yuedong, Kurgan, Lukasz, Zhou, Yaoqi
Format: Online-Aufsatz
Sprache:English
Veröffentlicht: 2012
Zugriff auf das übergeordnete Werk:Journal of computational chemistry
Schlagworte:Journal Article Research Support, N.I.H., Extramural Validation Study Proteins Solvents
Beschreibung
Zusammenfassung:Copyright © 2011 Wiley Periodicals, Inc.
Accurate prediction of protein secondary structure is essential for accurate sequence alignment, three-dimensional structure modeling, and function prediction. The accuracy of ab initio secondary structure prediction from sequence, however, has only increased from around 77 to 80% over the past decade. Here, we developed a multistep neural-network algorithm by coupling secondary structure prediction with prediction of solvent accessibility and backbone torsion angles in an iterative manner. Our method called SPINE X was applied to a dataset of 2640 proteins (25% sequence identity cutoff) previously built for the first version of SPINE and achieved a 82.0% accuracy based on 10-fold cross validation (Q(3)). Surpassing 81% accuracy by SPINE X is further confirmed by employing an independently built test dataset of 1833 protein chains, a recently built dataset of 1975 proteins and 117 CASP 9 targets (critical assessment of structure prediction techniques) with an accuracy of 81.3%, 82.3% and 81.8%, respectively. The prediction accuracy is further improved to 83.8% for the dataset of 2640 proteins if the DSSP assignment used above is replaced by a more consistent consensus secondary structure assignment method. Comparison to the popular PSIPRED and CASP-winning structure-prediction techniques is made. SPINE X predicts number of helices and sheets correctly for 21.0% of 1833 proteins, compared to 17.6% by PSIPRED. It further shows that SPINE X consistently makes more accurate prediction in helical residues (6%) without over prediction while PSIPRED makes more accurate prediction in coil residues (3-5%) and over predicts them by 7%. SPINE X Server and its training/test datasets are available at http://sparks.informatics.iupui.edu/
Beschreibung:Date Completed 03.04.2012
Date Revised 12.05.2024
published: Print-Electronic
Citation Status MEDLINE
ISSN:1096-987X
DOI:10.1002/jcc.21968