CHiLL: Zero-shot Custom Interpretable Feature Extraction from Clinical Notes with Large Language Models

Denis Jered McInerney; Geoffrey Young; Jan-Willem van de Meent; Byron C. Wallace

arXiv:2302.12343 · cs.CL · October 23, 2023 · 1 citation

TL;DR

CHiLL leverages large language models to generate interpretable, clinically meaningful features from health records via natural language prompts, enabling effective and transparent risk prediction models.

Contribution

This work introduces CHiLL, a novel method for zero-shot feature extraction from clinical notes using LLMs, enhancing interpretability and domain alignment in predictive models.

Findings

1. Linear models using CHiLL features perform comparably to linear models using reference features.

2. CHiLL features offer greater interpretability than Bag-of-Words features.

3. The learned feature weights align well with clinical expectations.

Abstract

We propose CHiLL (Crafting High-Level Latents), an approach for natural-language specification of features for linear models. CHiLL prompts LLMs with expert-crafted queries to generate interpretable features from health records. The resulting noisy labels are then used to train a simple linear classifier. Generating features based on queries to an LLM can empower physicians to use their domain expertise to craft features that are clinically meaningful for a downstream task of interest, without having to manually extract these from raw EHR. We are motivated by a real-world risk prediction task, but as a reproducible proxy, we use MIMIC-III and MIMIC-CXR data and standard predictive tasks (e.g., 30-day readmission) to evaluate this approach. We find that linear models using automatically extracted features are comparably performant to models using reference features, and provide greater…
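
The pipeline described in the abstract (expert-written queries, LLM answers used as features, a linear classifier on top) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the queries, the notes, and the labels are invented, and `ask_llm` is a hypothetical offline stand-in that does phrase matching where a real system would call a language model. The logistic regression is a plain gradient-descent version written out so the example has no dependencies.

```python
import math

# Hypothetical expert-written queries (feature name -> yes/no question).
# These are illustrative and are not taken from the paper.
QUERIES = {
    "has_heart_failure": "Does the patient have heart failure?",
    "lives_alone": "Does the patient live alone?",
}

# Toy phrase lookup used only by the offline stand-in below.
_TOY_PHRASES = {
    "has_heart_failure": "heart failure",
    "lives_alone": "lives alone",
}


def ask_llm(note, name, question):
    """Placeholder for a zero-shot LLM call returning P(answer is yes).

    A real system would send `question` and `note` to a model. This stub
    ignores `question` and matches a phrase so the sketch runs offline.
    """
    return 1.0 if _TOY_PHRASES[name] in note.lower() else 0.0


def extract_features(note):
    """One feature per query, in the order the queries are defined."""
    return [ask_llm(note, name, q) for name, q in QUERIES.items()]


def train_logistic(X, y, lr=0.5, epochs=300):
    """Logistic regression fitted with per-example gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            p = 1.0 / (1.0 + math.exp(-z))
            g = p - yi  # gradient of the log loss with respect to z
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
            b -= lr * g
    return w, b


# Synthetic notes and outcome labels, invented for this sketch.
NOTES = [
    "Patient with heart failure, lives alone.",
    "Heart failure exacerbation; lives with spouse.",
    "Routine visit; lives alone.",
    "Routine visit; lives with family.",
]
LABELS = [1, 1, 0, 0]

X = [extract_features(n) for n in NOTES]
w, b = train_logistic(X, LABELS)

# Each weight is tied to a named, human-readable query.
weights = dict(zip(QUERIES, w))
for name, value in weights.items():
    print(f"{name}: {value:+.2f}")
```

Because every coefficient maps back to a natural-language question, a clinician can check whether its sign and size match expectations, which is the kind of inspection the abstract has in mind when it calls the features interpretable.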

Taxonomy

Topics: Machine Learning in Healthcare · Topic Modeling · Natural Language Processing Techniques

Methods: ALIGN