Prediction of protein structural class using novel evolutionary collocation-based sequence representation

(c) 2008 Wiley Periodicals, Inc. J Comput Chem, 2008.

Bibliographic Details
Published in: Journal of computational chemistry. - 1984. - 29(2008), no. 10, 30 July, pages 1596-604
Main author: Chen, Ke (Author)
Other authors: Kurgan, Lukasz A; Ruan, Jishou
Format: Online article
Language: English
Published: 2008
Access to parent work: Journal of computational chemistry
Subjects: Journal Article; Research Support, Non-U.S. Gov't; Dipeptides; Proteins
LEADER 01000naa a22002652 4500
001 NLM177795026
003 DE-627
005 20231223151011.0
007 cr uuu---uuuuu
008 231223s2008 xx |||||o 00| ||eng c
024 7 |a 10.1002/jcc.20918  |2 doi 
028 5 2 |a pubmed24n0593.xml 
035 |a (DE-627)NLM177795026 
035 |a (NLM)18293306 
040 |a DE-627  |b ger  |c DE-627  |e rakwb 
041 |a eng 
100 1 |a Chen, Ke  |e verfasserin  |4 aut 
245 1 0 |a Prediction of protein structural class using novel evolutionary collocation-based sequence representation 
264 1 |c 2008 
336 |a Text  |b txt  |2 rdacontent 
337 |a Computermedien  |b c  |2 rdamedia 
338 |a Online-Ressource  |b cr  |2 rdacarrier 
500 |a Date Completed 07.08.2008 
500 |a Date Revised 02.06.2008 
500 |a published: Print 
500 |a Citation Status MEDLINE 
520 |a (c) 2008 Wiley Periodicals, Inc. J Comput Chem, 2008. 
520 |a Knowledge of structural classes is useful for understanding folding patterns in proteins. Although existing structural class prediction methods have applied virtually all state-of-the-art classifiers, many of them use a relatively simple protein sequence representation that often includes amino acid (AA) composition. To address this, we propose a novel sequence representation that incorporates evolutionary information encoded using PSI-BLAST profile-based collocation of AA pairs. We used six benchmark datasets and five representative classifiers to quantify and compare the quality of structural class prediction with the proposed representation. The best classifier, a support vector machine, achieved 61-96% accuracy on the six datasets. These predictions were comprehensively compared with a wide range of recently proposed methods for prediction of structural classes. The comparison shows the superiority of the proposed representation, which yields error rate reductions of between 14% and 26% relative to the best-performing previously published classifiers on the considered datasets. The study also shows that, for the three benchmark datasets that include sequences characterized by low identity (i.e., 25%, 30%, and 40%), the prediction accuracies are 20-35% lower than for the other three datasets, which include sequences with a higher degree of similarity. In conclusion, the proposed representation is shown to substantially improve the accuracy of structural class prediction. A web server that implements the presented prediction method is freely available at http://biomine.ece.ualberta.ca/Structural_Class/SCEC.html 
650 4 |a Journal Article 
650 4 |a Research Support, Non-U.S. Gov't 
650 7 |a Dipeptides  |2 NLM 
650 7 |a Proteins  |2 NLM 
700 1 |a Kurgan, Lukasz A  |e verfasserin  |4 aut 
700 1 |a Ruan, Jishou  |e verfasserin  |4 aut 
773 0 8 |i Enthalten in  |t Journal of computational chemistry  |d 1984  |g 29(2008), 10 vom: 30. Juli, Seite 1596-604  |w (DE-627)NLM098138448  |x 1096-987X  |7 nnns 
773 1 8 |g volume:29  |g year:2008  |g number:10  |g day:30  |g month:07  |g pages:1596-604 
856 4 0 |u http://dx.doi.org/10.1002/jcc.20918  |3 Volltext 
912 |a GBV_USEFLAG_A 
912 |a SYSFLAG_A 
912 |a GBV_NLM 
912 |a GBV_ILN_350 
951 |a AR 
952 |d 29  |j 2008  |e 10  |b 30  |c 07  |h 1596-604
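
The abstract describes a sequence representation built from PSI-BLAST profile-based collocation of AA pairs but does not give the formula. The following is only a minimal sketch of one plausible reading, not the authors' implementation: for every ordered AA pair and every gap size, it averages the product of the two profile weights over all position pairs separated by that gap. The names `collocation_features`, `one_hot_profile`, and `max_gap`, as well as the averaging scheme, are assumptions made for illustration. A one-hot encoding of the raw sequence stands in for a real PSI-BLAST profile.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def one_hot_profile(sequence):
    """Stand-in for a PSI-BLAST profile: weight 1.0 for the observed residue."""
    return [{aa: 1.0} for aa in sequence]


def collocation_features(profile, max_gap=1):
    """Average profile-weighted co-occurrence of each ordered AA pair.

    profile: list with one dict per sequence position, mapping an amino
             acid letter to a weight (hypothetical format).
    max_gap: largest number of residues allowed between the pair members
             (0 means adjacent residues).
    Returns a dict keyed by (first_aa, second_aa, gap).
    """
    features = {}
    length = len(profile)
    for gap in range(max_gap + 1):
        span = gap + 1            # index distance between the two residues
        n_pairs = length - span   # number of position pairs at this distance
        for a, b in product(AMINO_ACIDS, repeat=2):
            if n_pairs <= 0:
                features[(a, b, gap)] = 0.0
                continue
            total = sum(
                profile[i].get(a, 0.0) * profile[i + span].get(b, 0.0)
                for i in range(n_pairs)
            )
            features[(a, b, gap)] = total / n_pairs
    return features


if __name__ == "__main__":
    feats = collocation_features(one_hot_profile("ACAC"), max_gap=1)
    print(len(feats))              # 20 * 20 pairs for each of 2 gap sizes
    print(feats[("A", "C", 0)])    # adjacent A-C pairs, averaged over 3 positions
```

With a real profile, each position would carry weights for all 20 amino acids, so the resulting fixed-length vector (400 values per gap size under this sketch) could be passed to a classifier such as a support vector machine.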