ClioQuery: Interactive Query-Oriented Text Analytics for Comprehensive Investigation of Historical News Archives
Abram Handler, Narges Mahyar, Brendan O'Connor

TL;DR
ClioQuery is a novel text analytics system designed to assist historians in comprehensively investigating historical news archives by focusing on query words and employing NLP-based text simplification techniques.
Contribution
The paper introduces ClioQuery, a system that uniquely organizes text analysis around query words and integrates NLP simplification with traditional visualization tools for historical research.
Findings
ClioQuery helps historians analyze query words more effectively.
The system improves crowdworkers' ability to find and recall historical information.
User studies confirm the usefulness of text simplification in historical research.
Abstract
Historians and archivists often find and analyze the occurrences of query words in newspaper archives, to help answer fundamental questions about society. But much work in text analytics focuses on helping people investigate other textual units, such as events, clusters, ranked documents, entity relationships, or thematic hierarchies. Informed by a study into the needs of historians and archivists, we thus propose ClioQuery, a text analytics system uniquely organized around the analysis of query words in context. ClioQuery applies text simplification techniques from natural language processing to help historians quickly and comprehensively gather and analyze all occurrences of a query word across an archive. It also pairs these new NLP methods with more traditional features like linked views and in-text highlighting to help engender trust in summarization techniques. We evaluate…
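The idea of gathering every occurrence of a query word and showing it in context with in-text highlighting can be illustrated with a minimal keyword-in-context sketch. This is not ClioQuery's implementation, and it does not attempt the NLP text simplification the abstract describes. The function name `query_in_context`, the `window` parameter, and the bracket-style highlighting are all assumptions made for illustration.

```python
import re
from typing import List, Tuple


def query_in_context(docs: List[str], query: str, window: int = 40) -> List[Tuple[int, str]]:
    """Return (document index, snippet) for every occurrence of `query`.

    Hypothetical helper: matching is whole-word and case-insensitive, and
    each snippet keeps up to `window` characters on either side of the match,
    with the match itself wrapped in square brackets as a stand-in for
    in-text highlighting.
    """
    pattern = re.compile(r"\b" + re.escape(query) + r"\b", re.IGNORECASE)
    hits = []
    for doc_index, text in enumerate(docs):
        # finditer yields every non-overlapping match, so no occurrence is skipped.
        for match in pattern.finditer(text):
            left = text[max(0, match.start() - window):match.start()]
            right = text[match.end():match.end() + window]
            hits.append((doc_index, left + "[" + match.group(0) + "]" + right))
    return hits
```

Returning every match, rather than a ranked top-k list, mirrors the abstract's emphasis on comprehensive review of all occurrences across an archive.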
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Data Quality and Management
