Scholastic: Graphical Human-AI Collaboration for Inductive and Interpretive Text Analysis

Matt-Heun Hong; Lauren A. Marsh; Jessica L. Feuston; Janet Ruppert; Jed R. Brubaker; Danielle Albers Szafir

arXiv:2208.06133·cs.HC·August 15, 2022

TL;DR

Scholastic is a human-centered, interactive system that combines machine learning and visualization to support interpretive text analysis while addressing scholars' concerns that algorithms could disrupt or drive their work.

Contribution

The paper presents a machine-in-the-loop clustering approach, paired with interactive visualizations, to support inductive and interpretive research on large text corpora.

Findings

01

Supports iterative coding and refinement by scholars

02

Enhances document sampling through interactive visualizations

03

Facilitates meaningful theme discovery in large datasets

Abstract

Interpretive scholars generate knowledge from text corpora by manually sampling documents, applying codes, and refining and collating codes into categories until meaningful themes emerge. Given a large corpus, machine learning could help scale this data sampling and analysis, but prior research shows that experts are generally concerned about algorithms potentially disrupting or driving interpretive scholarship. We take a human-centered design approach to addressing concerns around machine-assisted interpretive research to build Scholastic, which incorporates a machine-in-the-loop clustering algorithm to scaffold interpretive text analysis. As a scholar applies codes to documents and refines them, the resulting coding schema serves as structured metadata which constrains hierarchical document and word clusters inferred from the corpus. Interactive visualizations of these clusters can…
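
To make the idea of codes constraining clusters concrete, here is a minimal sketch. It is not the algorithm Scholastic uses; the abstract does not specify one. It assumes single-linkage agglomerative clustering over Jaccard distance on token sets, with one illustrative rule: clusters carrying different scholar-applied codes are never merged. The function names, the `threshold` parameter, and the sample documents are all hypothetical.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard distance between two token sets."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0

def constrained_clusters(docs, codes, threshold=0.6):
    """Single-linkage agglomerative clustering with code constraints.

    docs:  dict mapping doc id -> text
    codes: dict mapping doc id -> code label (uncoded docs are absent)
    Assumption for illustration: two clusters may merge only if they
    do not carry conflicting codes.
    """
    tokens = {d: set(t.lower().split()) for d, t in docs.items()}
    clusters = [{"docs": {d}, "code": codes.get(d)} for d in docs]
    while True:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            ci, cj = clusters[i], clusters[j]
            # Constraint: clusters with different codes never merge.
            if ci["code"] and cj["code"] and ci["code"] != cj["code"]:
                continue
            # Single linkage: distance between the closest pair of members.
            dist = min(jaccard(tokens[a], tokens[b])
                       for a in ci["docs"] for b in cj["docs"])
            if dist <= threshold and (best is None or dist < best[0]):
                best = (dist, i, j)
        if best is None:
            return clusters
        _, i, j = best
        merged = {"docs": clusters[i]["docs"] | clusters[j]["docs"],
                  "code": clusters[i]["code"] or clusters[j]["code"]}
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)

# Hypothetical sample data: d1 and d3 are textually close but coded differently.
docs = {
    "d1": "grief loss family support",
    "d2": "grief loss family memory",
    "d3": "grief loss family support online",
    "d4": "pizza recipe dough",
}
codes = {"d1": "grief", "d3": "community"}
result = constrained_clusters(docs, codes)
```

In this toy run the uncoded document `d2` joins the cluster of `d1` and inherits its code, while `d3` stays separate from `d1` despite high lexical overlap, because the scholar coded the two differently.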
