Counterfactual Modeling with Fine-Tuned LLMs for Health Intervention Design and Sensor Data Augmentation

Shovito Barua Soumma; Asiful Arefeen; Stephanie M. Carpenter; Melanie Hingle; Hassan Ghasemzadeh

arXiv:2601.14590 · cs.LG · April 21, 2026

TL;DR

This paper evaluates large language models for generating counterfactual explanations in health data, demonstrating their effectiveness in intervention design and data augmentation to improve model robustness.

Contribution

It introduces a fine-tuned LLM approach for generating high-quality, clinically actionable counterfactuals that enhance health intervention strategies and data efficiency.

Findings

01

Fine-tuned LLMs produce counterfactuals with up to 99% plausibility and validity of up to 0.99.

02

Counterfactual data augmentation restores classifier performance with 20% F1 score recovery.

03

LLMs outperform optimization-based baselines in generating coherent, actionable counterfactuals.

Abstract

Counterfactual explanations (CFEs) provide human-centric interpretability by identifying the minimal, actionable changes required to alter a machine learning model's prediction. The resulting counterfactuals (CFs) can therefore be used as (i) interventions for abnormality prevention and (ii) augmented data for training robust models. We conduct a comprehensive evaluation of CF generation using large language models (LLMs), including GPT-4 (zero-shot and few-shot) and two open-source models, BioMistral-7B and LLaMA-3.1-8B, in both pretrained and fine-tuned configurations. Using the multimodal AI-READI clinical dataset, we assess CFs across three dimensions: intervention quality, feature diversity, and augmentation effectiveness. Fine-tuned LLMs, particularly LLaMA-3.1-8B, produce CFs with high plausibility (up to 99%), strong validity (up to 0.99), and realistic, behaviorally modifiable feature adjustments. When used…
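
To make the abstract's terms concrete, here is a minimal sketch of how validity and "minimal change" might be measured for a set of counterfactuals. It assumes that validity means the fraction of counterfactuals the classifier assigns to the desired label; the paper's exact metric definitions are not given in this excerpt. The feature names, thresholds, and the `toy_predict` classifier are hypothetical and are not taken from the AI-READI dataset.

```python
from typing import Callable, Dict, List, Sequence

Features = Dict[str, float]


def validity(
    predict: Callable[[Features], int],
    counterfactuals: Sequence[Features],
    target_label: int,
) -> float:
    """Fraction of counterfactuals that the classifier assigns to target_label.

    Assumed definition for illustration; the paper may define validity differently.
    """
    if not counterfactuals:
        return 0.0
    hits = sum(1 for cf in counterfactuals if predict(cf) == target_label)
    return hits / len(counterfactuals)


def changed_features(original: Features, cf: Features, tol: float = 1e-9) -> List[str]:
    """Names of features whose value differs between the original and the counterfactual."""
    return [name for name in original if abs(original[name] - cf[name]) > tol]


def toy_predict(x: Features) -> int:
    """Hypothetical classifier: label 1 (abnormal) when both sleep and activity are low."""
    return 1 if x["sleep_hours"] < 6.0 and x["daily_steps"] < 5000.0 else 0


# Hypothetical sample that the toy classifier labels as abnormal (1).
original = {"sleep_hours": 5.0, "daily_steps": 3000.0}

# Three hypothetical counterfactuals proposed for that sample.
cfs = [
    {"sleep_hours": 7.0, "daily_steps": 3000.0},  # changes sleep only
    {"sleep_hours": 5.0, "daily_steps": 6000.0},  # changes steps only
    {"sleep_hours": 5.5, "daily_steps": 3500.0},  # changes both, still abnormal
]

validity_score = validity(toy_predict, cfs, target_label=0)
edits_per_cf = [changed_features(original, cf) for cf in cfs]
```

In this toy setup, the first two counterfactuals flip the prediction by changing a single behaviorally modifiable feature, while the third changes two features without flipping it. A shorter `changed_features` list corresponds to the "minimal" change the abstract describes.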
