Analysis of biomedical text for chemical names : a comparison of three methods
Published in: | Proceedings. AMIA Symposium. - 1998. - (1999) from: 23., pages 176-80 |
---|---|
First author: | |
Other authors: | , , , , |
Format: | Article |
Language: | English |
Published: | 1999 |
Access to the parent work: | Proceedings. AMIA Symposium |
Subjects: | Comparative Study; Journal Article; Inorganic Chemicals; Organic Chemicals |
Summary: | At the National Library of Medicine (NLM), a variety of biomedical vocabularies are found in data pertinent to its mission. In addition to standard medical terminology, there are specialized vocabularies including that of chemical nomenclature. Normal language tools including the lexically based ones used by the Unified Medical Language System (UMLS) to manipulate and normalize text do not work well on chemical nomenclature. In order to improve NLM's capabilities in chemical text processing, two approaches to the problem of recognizing chemical nomenclature were explored. The first approach was a lexical one and consisted of analyzing text for the presence of a fixed set of chemical segments. The approach was extended with general chemical patterns and also with terms from NLM's indexing vocabulary, MeSH, and the NLM SPECIALIST lexicon. The second approach applied Bayesian classification to n-grams of text via two different methods. The single lexical method and two statistical methods were tested against data from the 1999 UMLS Metathesaurus. One of the statistical methods had an overall classification accuracy of 97%. |
---|---|
Description: | Date Completed 01.02.2000; Date Revised 21.10.2016; published: Print; Citation Status MEDLINE |
ISSN: | 1531-605X |
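
The summary names two families of techniques: a lexical check for a fixed set of chemical segments, and Bayesian classification over n-grams. The following is a minimal Python sketch of both ideas, not the method used in the paper. The segment list, the choice of character trigrams, the add-one smoothing, and all names (`CHEMICAL_SEGMENTS`, `has_chemical_segment`, `char_ngrams`, `NgramNaiveBayes`) are assumptions made for illustration; the record does not specify any of them.

```python
import math
from collections import Counter

# Hypothetical segment list; the paper's actual set is not given in this record.
CHEMICAL_SEGMENTS = ("chlor", "methyl", "ethyl", "oxy", "amino", "phen", "fluor")


def has_chemical_segment(term, segments=CHEMICAL_SEGMENTS):
    """Lexical check: True if any listed segment occurs in the term."""
    lowered = term.lower()
    return any(seg in lowered for seg in segments)


def char_ngrams(term, n=3):
    """Return the overlapping character n-grams of the lowercased term,
    padded with '^' and '$' to mark its start and end."""
    padded = "^" + term.lower() + "$"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class NgramNaiveBayes:
    """Multinomial naive Bayes over character n-grams with add-one smoothing
    (an assumed, simplified stand-in for the statistical methods)."""

    def __init__(self, n=3):
        self.n = n
        self.counts = {}        # label -> Counter of n-gram frequencies
        self.totals = {}        # label -> total number of n-grams seen
        self.docs = Counter()   # label -> number of training terms
        self.vocab = set()      # all n-grams seen during training

    def fit(self, terms, labels):
        for term, label in zip(terms, labels):
            grams = char_ngrams(term, self.n)
            self.counts.setdefault(label, Counter()).update(grams)
            self.docs[label] += 1
            self.vocab.update(grams)
        self.totals = {lab: sum(c.values()) for lab, c in self.counts.items()}
        return self

    def predict(self, term):
        n_docs = sum(self.docs.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, grams_seen in self.counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.docs[label] / n_docs)
            for gram in char_ngrams(term, self.n):
                score += math.log(
                    (grams_seen[gram] + 1) / (self.totals[label] + vocab_size)
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

In use, such a classifier would be fitted on labeled terms (for example, chemical versus non-chemical strings) and then asked to label unseen terms; the accuracy figure quoted in the summary refers to the paper's own methods and data, not to this sketch.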