How can LLMs Support Policy Researchers? Evaluating an LLM-Assisted Workflow for Large-Scale Unstructured Data

Yuhan Liu; Shuyao Zhou; Jakob Kaiser; Ella Colby; Jennifer Okwara; Maggie Wang; Varun Nagaraj Rao; Andrés Monroy-Hernández

arXiv:2604.04479·cs.HC·April 8, 2026

TL;DR

This paper evaluates an LLM-assisted thematic analysis workflow for policy research, demonstrating its potential to analyze large-scale unstructured data efficiently and comparing results with authoritative reports.

Contribution

It introduces and tests a scalable LLM-based workflow for thematic analysis of policy-related unstructured data, highlighting its practical utility and limitations.

Findings

01 The workflow enables rapid analysis of millions of Reddit posts.

02 It produces themes that align with authoritative policy reports.

03 Researchers view the tool as a quick, rough input for policy analysis.

Abstract

Policy researchers need scalable ways to surface public views, yet they often rely on interviews, listening sessions, and surveys (analyzed thematically) that are slow, expensive, and limited in scale and diversity. LLMs offer new possibilities for thematic analysis of unstructured text, yet we know little about how LLM-assisted workflows perform for policy research. Building on a workflow for LLM-assisted thematic analysis of online forums, we conduct a study with 11 policy researchers, who use an early prototype and see it as a quick, rough-and-ready input to their research. We then extend and scale the workflow to analyze millions of Reddit posts and 1,058 chatbot-led interview transcripts on a policy-relevant topic, treating these sources as rich and scalable data for policy discourse. We compare the synthesized themes to those from authoritative policy reports, identify points of…
