A Computational Method for Measuring "Open Codes" in Qualitative Analysis
John Chen, Alexandros Lotsos, Sihan Cheng, Caiyi Wang, Lexie Zhao, Yanjia Zhang, Jessica Hullman, Bruce Sherin, Uri Wilensky, Michael Horn

TL;DR
This paper introduces a computational approach using LLMs to measure and evaluate inductive coding in qualitative analysis, addressing challenges in assessing exploratory human and AI-generated codes.
Contribution
It proposes a novel LLM-enriched merging algorithm and four metrics to assess coding contributions, robustness, and issues in qualitative datasets.
Findings
The merging algorithm significantly influences metric outcomes.
Metrics are stable across different runs and LLMs.
The metrics can diagnose coding issues like hallucinated codes.
Abstract
Qualitative analysis is critical to understanding human datasets in many social science disciplines. A central method in this process is inductive coding, where researchers identify and interpret codes directly from the datasets themselves. Yet, this exploratory approach poses challenges for meeting methodological expectations (such as "depth" and "variation"), especially as researchers increasingly adopt Generative AI (GAI) for support. Ground-truth-based metrics are insufficient because they contradict the exploratory nature of inductive coding, while manual evaluation can be labor-intensive. This paper presents a theory-informed computational method for measuring inductive coding results from humans and GAI. Our method first merges individual codebooks using an LLM-enriched algorithm. It measures each coder's contribution against the merged result using four novel metrics:…
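The merge-then-measure idea in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the LLM-enriched merging step is replaced here by plain label normalization (lowercasing and whitespace collapsing), and `coverage_share` is a hypothetical stand-in measure, not one of the paper's four metrics. The coder names and code labels are invented for illustration only.

```python
from collections import defaultdict


def normalize(label: str) -> str:
    # Assumption: a crude stand-in for LLM-enriched merging.
    # Labels that differ only in case or spacing are treated as one code.
    return " ".join(label.lower().split())


def merge_codebooks(codebooks: dict[str, list[str]]) -> dict[str, set[str]]:
    # Map each merged code to the set of coders who proposed it.
    merged: dict[str, set[str]] = defaultdict(set)
    for coder, codes in codebooks.items():
        for code in codes:
            merged[normalize(code)].add(coder)
    return dict(merged)


def coverage_share(merged: dict[str, set[str]]) -> dict[str, float]:
    # Hypothetical measure: the fraction of merged codes each coder proposed.
    total = len(merged)
    counts: dict[str, int] = defaultdict(int)
    for coders in merged.values():
        for coder in coders:
            counts[coder] += 1
    return {coder: n / total for coder, n in counts.items()}


# Invented example codebooks from two human coders and one LLM coder.
codebooks = {
    "human_a": ["Asking for help", "Sharing code"],
    "human_b": ["asking for  help", "Frustration"],
    "llm_1": ["Sharing Code", "Frustration", "Off-topic chat"],
}

merged = merge_codebooks(codebooks)
shares = coverage_share(merged)
print(shares)
```

Each coder is scored against the merged codebook rather than against a fixed ground truth, which is the point the abstract makes about exploratory coding. How codes are merged determines what counts as a shared code, so a different merging rule would change every coder's score.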
