SPINE X : improving protein secondary structure prediction by multistep learning coupled with prediction of solvent accessible surface area and backbone torsion angles
Copyright © 2011 Wiley Periodicals, Inc.
Published in: | Journal of computational chemistry. - 1984. - 33(2012), 3 of 30 Jan., pages 259-67 |
---|---|
First author: | |
Other authors: | , , , |
Format: | Online article |
Language: | English |
Published: | 2012 |
Access to the parent work: | Journal of computational chemistry |
Subject headings: | Journal Article; Research Support, N.I.H., Extramural; Validation Study; Proteins; Solvents |
Abstract: | Accurate prediction of protein secondary structure is essential for accurate sequence alignment, three-dimensional structure modeling, and function prediction. The accuracy of ab initio secondary structure prediction from sequence, however, has only increased from around 77% to 80% over the past decade. Here, we developed a multistep neural-network algorithm by coupling secondary structure prediction with prediction of solvent accessibility and backbone torsion angles in an iterative manner. Our method, called SPINE X, was applied to a dataset of 2640 proteins (25% sequence identity cutoff) previously built for the first version of SPINE and achieved an 82.0% three-state accuracy (Q3) based on 10-fold cross validation. That SPINE X surpasses 81% accuracy is further confirmed on an independently built test dataset of 1833 protein chains, a recently built dataset of 1975 proteins, and 117 CASP 9 (Critical Assessment of Structure Prediction techniques) targets, with accuracies of 81.3%, 82.3%, and 81.8%, respectively. The prediction accuracy is further improved to 83.8% for the dataset of 2640 proteins if the DSSP assignment used above is replaced by a more consistent consensus secondary structure assignment method. Comparison to the popular PSIPRED and CASP-winning structure-prediction techniques is made. SPINE X predicts the number of helices and sheets correctly for 21.0% of the 1833 proteins, compared to 17.6% by PSIPRED. The comparison further shows that SPINE X consistently makes more accurate predictions for helical residues (by 6%) without overprediction, while PSIPRED makes more accurate predictions for coil residues (by 3-5%) and overpredicts them by 7%. The SPINE X server and its training/test datasets are available at http://sparks.informatics.iupui.edu/ |
---|---|
Description: | Date Completed 03.04.2012; Date Revised 12.05.2024; published: Print-Electronic; Citation Status MEDLINE |
ISSN: | 1096-987X |
DOI: | 10.1002/jcc.21968 |
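Note on the accuracy measure: the abstract reports three-state accuracy (Q3), conventionally the percentage of residues whose predicted state (helix H, strand E, or coil C) matches the reference assignment. The following is a minimal sketch of that calculation, not the SPINE X implementation. The function name `q3_accuracy` and the two example strings are hypothetical and chosen only for illustration.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Return the percentage of positions where the three-state labels agree.

    Both arguments are strings over the alphabet H (helix), E (strand),
    C (coil), one character per residue. This is an illustrative helper,
    not code from the SPINE X server.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    # Count residues whose predicted state equals the reference state.
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


# Hypothetical 17-residue example (made-up labels, not real protein data).
observed = "CCHHHHHHCCEEEEECC"
predicted = "CCHHHHHCCCEEEECCC"
q3 = q3_accuracy(predicted, observed)
print(f"Q3 = {q3:.1f}%")
```

For a whole dataset, Q3 is usually computed over all residues pooled together or averaged per chain; which convention the record's figures use is not stated here.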