HINTs : Sensemaking on large collections of documents with Hypergraph visualization and INTelligent agents


Bibliographic Details
Published in:	IEEE transactions on visualization and computer graphics. - 1996. - PP(2024), dated 12 Sept.
Main Author:	Lee, Sam Yu-Te (Author)
Other Authors:	Ma, Kwan-Liu
Format:	Online Article
Language:	English
Published:	2024
Access to parent work:	IEEE transactions on visualization and computer graphics
Subjects:	Journal Article

Description
Summary:	Sensemaking on a large collection of documents (corpus) is a challenging task often found in fields such as market research, legal studies, intelligence analysis, political science, or computational linguistics. Previous works approach this problem from topic- and entity-based perspectives, but the capability of the underlying NLP model limits their effectiveness. Recent advances in prompting with LLMs present opportunities to enhance such approaches with higher accuracy and customizability. However, poorly designed prompts and visualizations could mislead users into falsely interpreting the visualizations and hinder the system's trustworthiness. In this paper, we address this issue by taking into account the user analysis tasks and visualization goals in the prompt-based data extraction stage, thereby extending the concept of Model Alignment. We present HINTs, a VA system for supporting sensemaking on large collections of documents, combining previous entity-based and topic-based approaches. The visualization pipeline of HINTs consists of three stages. First, entities and topics are extracted from the corpus with prompts. Then, the result is modeled as a hypergraph and hierarchically clustered. Finally, an enhanced space-filling curve layout is applied to visualize the hypergraph for interactive exploration. The system further integrates an LLM-based intelligent chatbot agent in the interface to facilitate the sensemaking of documents of interest. To demonstrate the generalizability and effectiveness of the HINTs system, we present two case studies on different domains and a comparative user study. We report our insights on the behavior patterns and challenges when intelligent agents are used to facilitate sensemaking. We find that while intelligent agents can address many challenges in sensemaking, the visual hints that visualizations provide are still necessary. We discuss limitations and future work for combining interactive visualization and LLMs more profoundly to better support corpus analysis.
Description:	Date Revised 16.09.2024; published: Print-Electronic; Citation Status: Publisher
ISSN:	1941-0506
DOI:	10.1109/TVCG.2024.3459961
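
Illustrative note: the summary above describes modeling extracted entities as a hypergraph and placing the result along a space-filling curve. The sketch below is only a rough illustration of those two ideas and is not taken from the paper. It assumes (hypothetically) that entities are nodes and documents are hyperedges, that a cluster-derived document ordering is already available, and it swaps in a plain Hilbert curve for the "enhanced" layout, whose details this record does not give. All function names are made up for the example.

```python
from collections import defaultdict


def build_hypergraph(doc_entities):
    """Map each entity (node) to the set of documents (hyperedges) mentioning it.

    Assumption: documents act as hyperedges; the record does not confirm this.
    """
    incidence = defaultdict(set)
    for doc_id, entities in doc_entities.items():
        for entity in entities:
            incidence[entity].add(doc_id)
    return dict(incidence)


def hilbert_d2xy(order, d):
    """Convert index d along a Hilbert curve to (x, y) on a 2**order grid."""
    x = y = 0
    t = d
    s = 1
    n = 1 << order
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            # Rotate/flip the quadrant so consecutive indices stay adjacent.
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def layout(ordered_ids):
    """Place items on the smallest Hilbert grid that holds them, in given order.

    Assumption: ordered_ids already reflects a clustering, so that similar
    documents are neighbors in the list and therefore end up close in 2D.
    """
    order = 0
    while (1 << (2 * order)) < len(ordered_ids):
        order += 1
    return {item: hilbert_d2xy(order, i) for i, item in enumerate(ordered_ids)}


if __name__ == "__main__":
    # Hypothetical toy corpus: document id -> extracted entities.
    docs = {"d1": ["Acme", "Berlin"], "d2": ["Acme"], "d3": ["Berlin", "Chen"]}
    print(build_hypergraph(docs))
    print(layout(["d1", "d2", "d3"]))
```

The point of a space-filling curve here is locality: items that are adjacent in the one-dimensional ordering remain adjacent on the grid, so an ordering derived from hierarchical clustering would keep clusters visually contiguous.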