In your own words: computationally identifying interpretable themes in free-text survey data

Jenny S Wang; Aliya Saperstein; Emma Pierson

arXiv:2603.26930·cs.CY·April 7, 2026

TL;DR

This paper introduces In Your Own Words, a computational framework for exploratory analysis of free-text survey responses. It identifies structured, interpretable themes in the responses, which makes them amenable to systematic analysis.

Contribution

The paper presents a method that produces more coherent and interpretable themes from free-text data than past computational methods. The themes can inform survey design and help characterize heterogeneity in responses.

Findings

01

The themes are more coherent and interpretable than those produced by past computational methods.

02

The themes surface salient constructs, such as belonging and identity fluidity, that can suggest structured questions for future surveys.

03

The themes reveal heterogeneity within categories and discordance between self and perceived identities.

Abstract

Free-text survey responses can provide nuance often missed by structured questions, but remain difficult to statistically analyze. To address this, we introduce In Your Own Words, a computational framework for exploratory analyses of free-text survey data that identifies structured, interpretable themes in free-text responses, facilitating systematic analysis. To illustrate the benefits of this approach, we apply it to a new dataset of free-text descriptions of race, gender, and sexual orientation from 1,004 U.S. participants. The themes our approach produces on this dataset are more coherent and interpretable than those produced by past computational methods. The themes have three practical applications in survey research. First, they can suggest structured questions to add to future surveys by surfacing salient constructs - such as belonging and identity fluidity - that existing…
