Using the gini coefficient to measure the chemical diversity of small-molecule libraries

© 2016 Wiley Periodicals, Inc.

Bibliographic Details
Published in: Journal of computational chemistry. - 1984. - 37(2016), no. 22 (15 Aug.), pages 2091-7
First Author: Weidlich, Iwona E (Author)
Other Authors: Filippov, Igor V
Format: Online Article
Language: English
Published: 2016
Access to the parent work: Journal of computational chemistry
Subjects: Journal Article; Diversity Genie; chemical databases; cheminformatics; molecular diversity
LEADER 01000naa a22002652 4500
001 NLM261854305
003 DE-627
005 20231224200452.0
007 cr uuu---uuuuu
008 231224s2016 xx |||||o 00| ||eng c
024 7 |a 10.1002/jcc.24423  |2 doi 
028 5 2 |a pubmed24n0872.xml 
035 |a (DE-627)NLM261854305 
035 |a (NLM)27353971 
040 |a DE-627  |b ger  |c DE-627  |e rakwb 
041 |a eng 
100 1 |a Weidlich, Iwona E  |e verfasserin  |4 aut 
245 1 0 |a Using the gini coefficient to measure the chemical diversity of small-molecule libraries 
264 1 |c 2016 
336 |a Text  |b txt  |2 rdacontent 
337 |a Computermedien  |b c  |2 rdamedia 
338 |a Online-Ressource  |b cr  |2 rdacarrier 
500 |a Date Completed 19.07.2018 
500 |a Date Revised 19.07.2018 
500 |a published: Print-Electronic 
500 |a Citation Status PubMed-not-MEDLINE 
520 |a © 2016 Wiley Periodicals, Inc. 
520 |a Modern databases of small organic molecules contain tens of millions of structures. The size of theoretically available chemistry is even larger. However, despite the large amount of chemical information, the "big data" moment for chemistry has not yet provided the corresponding payoff of cheaper computer-predicted medicine or robust machine-learning models for the determination of efficacy and toxicity. Here, we present a study of the diversity of chemical datasets using a measure that is commonly used in socioeconomic studies. We demonstrate the use of this diversity measure on several datasets that were constructed to contain various congeneric subsets of molecules as well as randomly selected molecules. We also apply our method to a number of well-known databases that are frequently used for structure-activity relationship modeling. Our results show the poor diversity of the common sources of potential lead compounds compared to actual known drugs. 
650 4 |a Journal Article 
650 4 |a Diversity Genie 
650 4 |a chemical databases 
650 4 |a cheminformatics 
650 4 |a molecular diversity 
700 1 |a Filippov, Igor V  |e verfasserin  |4 aut 
773 0 8 |i Enthalten in  |t Journal of computational chemistry  |d 1984  |g 37(2016), 22 vom: 15. Aug., Seite 2091-7  |w (DE-627)NLM098138448  |x 1096-987X  |7 nnns 
773 1 8 |g volume:37  |g year:2016  |g number:22  |g day:15  |g month:08  |g pages:2091-7 
856 4 0 |u http://dx.doi.org/10.1002/jcc.24423  |3 Volltext 
912 |a GBV_USEFLAG_A 
912 |a SYSFLAG_A 
912 |a GBV_NLM 
912 |a GBV_ILN_350 
951 |a AR 
952 |d 37  |j 2016  |e 22  |b 15  |c 08  |h 2091-7
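
Note on the measure named in the title: the abstract refers to the Gini coefficient, a statistic commonly used in socioeconomic studies, but this record does not describe how the authors apply it to molecules. As a rough illustration of the statistic itself only, the Python sketch below computes the Gini coefficient of a list of non-negative numbers. The function name `gini` and the idea of passing it something like cluster sizes are assumptions made here for illustration; they are not taken from the paper.

```python
def gini(values):
    """Gini coefficient of a sequence of non-negative numbers.

    Illustrative sketch only: this is the textbook statistic, not
    necessarily the procedure used in the catalogued article.
    """
    xs = sorted(float(v) for v in values)
    n = len(xs)
    if n == 0:
        raise ValueError("gini() needs at least one value")
    if xs[0] < 0:
        raise ValueError("gini() is defined here for non-negative values only")

    total = sum(xs)
    if total == 0:
        # All values are zero, so the distribution is perfectly equal.
        return 0.0

    # Rank-weighted sum over the ascending-sorted values (ranks start at 1).
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


if __name__ == "__main__":
    # Hypothetical counts, e.g. how many molecules fall into each of four groups.
    print(gini([1, 1, 1, 1]))
    print(gini([0, 0, 0, 10]))
```

A value of 0 means all entries are equal, and values closer to 1 mean that a few entries dominate. With the hypothetical inputs above, `gini([1, 1, 1, 1])` is 0.0 and `gini([0, 0, 0, 10])` is 0.75.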