Identification of small molecule aggregators from large compound libraries by support vector machines

(c) 2009 Wiley Periodicals, Inc.

Bibliographic details
Published in: Journal of computational chemistry. - 1984. - 31(2010), 4, 1 March, pages 752-63
Main author: Rao, Hanbing (author)
Other authors: Li, Zerong; Li, Xiangyuan; Ma, Xiaohua; Ung, Choongyong; Li, Hu; Liu, Xianghui; Chen, Yuzong
Format: Online article
Language: English
Published: 2010
Access to the parent work: Journal of computational chemistry
Subjects: Journal Article; Small Molecule Libraries
LEADER 01000caa a22002652c 4500
001 NLM189689196
003 DE-627
005 20250210134035.0
007 cr uuu---uuuuu
008 231223s2010 xx |||||o 00| ||eng c
024 7 |a 10.1002/jcc.21347  |2 doi 
028 5 2 |a pubmed25n0632.xml 
035 |a (DE-627)NLM189689196 
035 |a (NLM)19569201 
040 |a DE-627  |b ger  |c DE-627  |e rakwb 
041 |a eng 
100 1 |a Rao, Hanbing  |e verfasserin  |4 aut 
245 1 0 |a Identification of small molecule aggregators from large compound libraries by support vector machines 
264 1 |c 2010 
336 |a Text  |b txt  |2 rdacontent 
337 |a Computermedien  |b c  |2 rdamedia 
338 |a Online-Ressource  |b cr  |2 rdacarrier 
500 |a Date Completed 16.04.2010 
500 |a Date Revised 03.02.2010 
500 |a published: Print 
500 |a Citation Status MEDLINE 
520 |a (c) 2009 Wiley Periodicals, Inc. 
520 |a Small molecule aggregators non-specifically inhibit multiple unrelated proteins, rendering them therapeutically useless. They frequently appear as false hits and thus need to be eliminated in high-throughput screening campaigns. Computational methods have been explored for identifying aggregators, which have not been tested in screening large compound libraries. We used 1319 aggregators and 128,325 non-aggregators to develop a support vector machines (SVM) aggregator identification model, which was tested by four methods. The first is five fold cross-validation, which showed comparable aggregator and significantly improved non-aggregator identification rates against earlier studies. The second is the independent test of 17 aggregators discovered independently from the training aggregators, 71% of which were correctly identified. The third is retrospective screening of 13M PUBCHEM and 168K MDDR compounds, which predicted 97.9% and 98.7% of the PUBCHEM and MDDR compounds as non-aggregators. The fourth is retrospective screening of 5527 MDDR compounds similar to the known aggregators, 1.14% of which were predicted as aggregators. SVM showed slightly better overall performance against two other machine learning methods based on five fold cross-validation studies of the same settings. Molecular features of aggregation, extracted by a feature selection method, are consistent with published profiles. SVM showed substantial capability in identifying aggregators from large libraries at low false-hit rates 
650 4 |a Journal Article 
650 7 |a Small Molecule Libraries  |2 NLM 
700 1 |a Li, Zerong  |e verfasserin  |4 aut 
700 1 |a Li, Xiangyuan  |e verfasserin  |4 aut 
700 1 |a Ma, Xiaohua  |e verfasserin  |4 aut 
700 1 |a Ung, Choongyong  |e verfasserin  |4 aut 
700 1 |a Li, Hu  |e verfasserin  |4 aut 
700 1 |a Liu, Xianghui  |e verfasserin  |4 aut 
700 1 |a Chen, Yuzong  |e verfasserin  |4 aut 
773 0 8 |i Enthalten in  |t Journal of computational chemistry  |d 1984  |g 31(2010), 4 vom: 01. März, Seite 752-63  |w (DE-627)NLM098138448  |x 1096-987X  |7 nnas 
773 1 8 |g volume:31  |g year:2010  |g number:4  |g day:01  |g month:03  |g pages:752-63 
856 4 0 |u http://dx.doi.org/10.1002/jcc.21347  |3 Volltext 
912 |a GBV_USEFLAG_A 
912 |a SYSFLAG_A 
912 |a GBV_NLM 
912 |a GBV_ILN_350 
951 |a AR 
952 |d 31  |j 2010  |e 4  |b 01  |c 03  |h 752-63