Scaling Clinician-Grade Feature Generation from Clinical Notes with Multi-Agent Language Models
Jiayi Wang, Jacqueline Jil Vallon, Nikhil V. Kotha, Neil Panjwani, Xi Ling, Margaret Redfield, Sushmita Vij, Sandy Srinivas, John Leppert, Mark K. Buyyounouski, Mohsen Bayati

TL;DR
This paper introduces SNOW, a multi-agent LLM system that automates clinician-level feature extraction from clinical notes, matching the predictive accuracy of manual expert abstraction while reducing human effort by approximately 48-fold.
Contribution
The study presents a novel multi-agent LLM framework, SNOW, that automates detailed clinical feature generation from notes, matching expert performance and enabling scalable, interpretable outcome prediction.
Findings
SNOW achieves performance comparable to manual feature extraction (AUC-ROC 0.767 vs. 0.762 for manual CFG on 5-year cancer recurrence prediction).
SNOW reduces human effort by approximately 48-fold.
SNOW generalizes well to external cohorts and different conditions.
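The abstract reports the comparison between SNOW and manual CFG as AUC-ROC on 5-year recurrence prediction. As a rough illustration of what that metric measures, the sketch below computes AUC-ROC as the probability that a randomly chosen positive case is scored above a randomly chosen negative one. The function name `auc_roc` and the toy labels and scores are assumptions made up for this example; they are not data or code from the paper.

```python
def auc_roc(labels, scores):
    """AUC-ROC via pairwise comparison: the fraction of (positive, negative)
    pairs in which the positive case gets the higher score; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy, made-up values (NOT from the paper): 1 = recurrence, 0 = no recurrence.
labels = [1, 0, 1, 0, 1, 0]
scores_a = [0.9, 0.2, 0.6, 0.7, 0.8, 0.1]  # hypothetical model A risk scores
scores_b = [0.8, 0.3, 0.7, 0.6, 0.9, 0.2]  # hypothetical model B risk scores

print(f"model A AUC-ROC: {auc_roc(labels, scores_a):.3f}")
print(f"model B AUC-ROC: {auc_roc(labels, scores_b):.3f}")
```

An AUC-ROC of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, so two feature sets whose models score within a few thousandths of each other rank patients almost equally well.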
Abstract
Developing accurate clinical prediction models is often bottlenecked by the difficulty of deriving meaningful structured features from unstructured EHR notes, a process that traditionally requires manual, unscalable clinical abstraction. In this study, we first established a rigorous patient-level Clinician Feature Generation (CFG) protocol, in which domain experts manually reviewed notes to define and extract nuanced features for a cohort of 147 patients with prostate cancer. As a high-fidelity ground truth, this labor-intensive process provided the blueprint for SNOW (Scalable Note-to-Outcome Workflow), a transparent multi-agent large language model (LLM) system designed to autonomously mimic the iterative reasoning and validation workflow of clinical experts. On 5-year cancer recurrence prediction, SNOW (AUC-ROC 0.767) achieved performance comparable to manual CFG (0.762) and…
Taxonomy
Topics: Machine Learning in Healthcare · Topic Modeling · Artificial Intelligence in Healthcare and Education
