ConceptVector: Text Visual Analytics via Interactive Lexicon Building Using Word Embedding


Bibliographic Details
Published in: IEEE transactions on visualization and computer graphics. - 1996. - 24(2018), 1, dated 07 Jan., pages 361-370
Main author: Park, Deokgun (Author)
Other authors: Kim, Seungyeon; Lee, Jurim; Choo, Jaegul; Diakopoulos, Nicholas; Elmqvist, Niklas
Format: Online article
Language: English
Published: 2018
Access to the parent work: IEEE transactions on visualization and computer graphics
Subjects: Journal Article; Research Support, N.I.H., Extramural; Research Support, Non-U.S. Gov't
Description
Summary: Central to many text analysis methods is the notion of a concept: a set of semantically related keywords characterizing a specific object, phenomenon, or theme. Advances in word embedding allow building a concept from a small set of seed terms. However, naive application of such techniques may result in false positive errors because of the polysemy of natural language. To mitigate this problem, we present a visual analytics system called ConceptVector that guides a user in building such concepts and then using them to analyze documents. Document-analysis case studies with real-world datasets demonstrate the fine-grained analysis provided by ConceptVector. To support the elaborate modeling of concepts, we introduce a bipolar concept model and support for specifying irrelevant words. We validate the interactive lexicon building interface by a user study and expert reviews. Quantitative evaluation shows that the bipolar lexicon generated with our methods is comparable to human-generated ones.
Description: Date Completed 08.04.2019
Date Revised 08.04.2019
published: Print-Electronic
Citation Status MEDLINE
ISSN: 1941-0506
DOI: 10.1109/TVCG.2017.2744478
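
Illustrative note: the summary describes building a concept from seed terms via word embeddings, with a bipolar model and user-specified irrelevant words. The sketch below shows one simple way such an idea could work. It is not the scoring method of ConceptVector, which the summary does not specify. The toy 2-D vectors, the function names (`cosine`, `mean_similarity`, `bipolar_scores`), and the scoring rule (mean cosine similarity to positive seeds minus mean cosine similarity to negative seeds) are all assumptions made for illustration.

```python
import math

# Hypothetical toy 2-D embeddings; a real system would load pretrained vectors.
EMBEDDINGS = {
    "happy": (0.9, 0.1),
    "joyful": (0.85, 0.2),
    "glad": (0.8, 0.15),
    "sad": (-0.9, 0.1),
    "gloomy": (-0.85, 0.2),
    "blue": (-0.5, 0.6),   # polysemous: a mood or a colour
    "table": (0.0, 1.0),
}


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def mean_similarity(word, seeds):
    """Average cosine similarity of `word` to each seed; 0.0 if no seeds."""
    if not seeds:
        return 0.0
    total = sum(cosine(EMBEDDINGS[word], EMBEDDINGS[s]) for s in seeds)
    return total / len(seeds)


def bipolar_scores(positive, negative, irrelevant=()):
    """Score every non-seed word between the two poles (assumed rule).

    Words listed in `irrelevant` are excluded outright, mimicking a user
    who rejects a false positive caused by polysemy.
    """
    skip = set(positive) | set(negative) | set(irrelevant)
    scores = {
        w: mean_similarity(w, positive) - mean_similarity(w, negative)
        for w in EMBEDDINGS
        if w not in skip
    }
    # Highest (most positive-pole) scores first.
    return dict(sorted(scores.items(), key=lambda kv: kv[1], reverse=True))


scores = bipolar_scores(["happy"], ["sad"], irrelevant=["blue"])
for word, score in scores.items():
    print(f"{word}: {score:+.3f}")
```

In this toy setup, words near the positive seed receive positive scores, words near the negative seed receive negative scores, and the excluded word never appears in the result, regardless of its similarity to either pole.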