Perspectives - Interactive Document Clustering in the Discourse Analysis Tool Suite
Tim Fischer, Chris Biemann

TL;DR
Perspectives is an interactive tool that enables Digital Humanities scholars to explore, organize, and refine large document collections through flexible clustering, human-in-the-loop adjustments, and tailored analytical lenses.
Contribution
The paper introduces Perspectives, an interactive clustering extension of the Discourse Analysis Tool Suite that offers human-in-the-loop refinement and customizable analytical lenses for Digital Humanities research.
Findings
Effective exploration of large document collections demonstrated
Interactive refinement improves clustering relevance
Supports uncovering topics and sentiments in data
Abstract
This paper introduces Perspectives, an interactive extension of the Discourse Analysis Tool Suite designed to empower Digital Humanities (DH) scholars to explore and organize large, unstructured document collections. Perspectives implements a flexible, aspect-focused document clustering pipeline with human-in-the-loop refinement capabilities. We showcase how this process can be initially steered by defining analytical lenses through document rewriting prompts and instruction-based embeddings, and further aligned with user intent through tools for refining clusters and mechanisms for fine-tuning the embedding model. The demonstration highlights a typical workflow, illustrating how DH researchers can leverage Perspectives's interactive document map to uncover topics, sentiments, or other relevant categories, thereby gaining insights and preparing their data for subsequent in-depth…
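The idea of an analytical lens in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a simple word filter (`rewrite_for_lens`, a hypothetical helper) in place of an LLM rewriting prompt, bag-of-words counts in place of instruction-based embeddings, and a small k-means routine as the clustering step, which the abstract does not name. It only shows how the same documents can fall into different clusters depending on the lens.

```python
from collections import Counter


def rewrite_for_lens(doc, lens_vocab):
    """Hypothetical stand-in for a rewriting prompt: keep only lens-relevant words."""
    return " ".join(w for w in doc.lower().split() if w in lens_vocab)


def embed(text, vocab):
    """Toy embedding (assumption): one word count per vocabulary entry."""
    counts = Counter(text.split())
    return [float(counts[w]) for w in vocab]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iters=10):
    """Minimal k-means with deterministic init (first k distinct vectors)."""
    centroids = []
    for v in vectors:
        if v not in centroids:
            centroids.append(list(v))
        if len(centroids) == k:
            break
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assign each vector to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda c: sq_dist(v, centroids[c]))
            for v in vectors
        ]
        # Move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def cluster_with_lens(docs, lens_vocab, k):
    """Rewrite each document through the lens, embed it, then cluster."""
    vocab = sorted(lens_vocab)
    vectors = [embed(rewrite_for_lens(d, lens_vocab), vocab) for d in docs]
    return kmeans(vectors, k)


# Illustrative data and lenses (made up for this sketch).
DOCS = ["great movie", "terrible movie", "great phone", "terrible phone"]
TOPIC_LENS = {"movie", "phone"}
SENTIMENT_LENS = {"great", "terrible"}

topic_labels = cluster_with_lens(DOCS, TOPIC_LENS, k=2)
sentiment_labels = cluster_with_lens(DOCS, SENTIMENT_LENS, k=2)
print(topic_labels)      # documents grouped by topic
print(sentiment_labels)  # same documents grouped by sentiment
```

In a real setting, the word filter would be replaced by a prompt-driven rewrite and the count vectors by a learned embedding model. The human-in-the-loop refinement and embedding fine-tuning described in the abstract are not modeled here.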
Taxonomy
Topics: Digital Humanities and Scholarship · Computational and Text Analysis Methods · Language and cultural evolution
