A Counterfactual LLM Framework for Detecting Human Biases: A Case Study of Sex/Gender in Emergency Triage

Ariel Guerra-Adames; Marta Avalos-Fernandez; Océane Dorémus; Leo Anthony Celi; Cédric Gil-Jardiné; Emmanuel Lagarde

arXiv:2511.17124 · cs.CY · November 24, 2025

Open Access

TL;DR

This paper introduces a counterfactual LLM-based framework for detecting gender bias in human clinical decision-making. The framework is validated on more than 150,000 emergency triage admissions to the Bordeaux University Hospital and replicated on a subset of MIMIC-IV, and it is designed to be domain-agnostic.

Contribution

It presents a novel, domain-agnostic counterfactual method using LLMs to quantify gender disparities in decision-making processes.

Findings

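01. In the Bordeaux cohort, otherwise identical presentations were approximately 2.1% more likely to receive a lower-severity triage score when presented as female rather than male.

02. The approach detects both explicit and implicit gender cues contributing to disparities.

03. Scaled to national emergency volumes in France, this disparity corresponds to more than 200,000 lower-severity triage assignments per year.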

Abstract

We present a novel, domain-agnostic counterfactual approach that uses Large Language Models (LLMs) to quantify gender disparities in human clinical decision-making. The method trains an LLM to emulate observed decisions, then evaluates counterfactual pairs in which only gender is flipped, estimating directional disparities while holding all other clinical factors constant. We study emergency triage, validating the approach on more than 150,000 admissions to the Bordeaux University Hospital (France) and replicating results on a subset of MIMIC-IV across a different language, population, and healthcare system. In the Bordeaux cohort, otherwise identical presentations were approximately 2.1% more likely to receive a lower-severity triage score when presented as female rather than male; scaled to national emergency volumes in France, this corresponds to more than 200,000 lower-severity…
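
The counterfactual comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The `Presentation` schema, the `flip_sex` helper, the `toy_emulator` stand-in for the trained LLM, and the net-rate definition of directional disparity are all assumptions made for illustration, as is the convention that a larger integer means a more severe triage level.

```python
from typing import Callable, Iterable, NamedTuple


class Presentation(NamedTuple):
    """One triage record: free-text note plus recorded sex (hypothetical schema)."""
    note: str
    sex: str  # "F" or "M"


def flip_sex(p: Presentation) -> Presentation:
    """Build the counterfactual twin: same note, opposite sex marker."""
    return Presentation(p.note, "M" if p.sex == "F" else "F")


def directional_disparity(
    records: Iterable[Presentation],
    predict_severity: Callable[[Presentation], int],
) -> dict:
    """Compare emulator outputs for each record presented as female vs. male.

    Assumes a larger integer means a MORE severe triage level.
    """
    n = lower_as_f = lower_as_m = 0
    for rec in records:
        as_f = rec if rec.sex == "F" else flip_sex(rec)
        as_m = flip_sex(as_f)
        sev_f, sev_m = predict_severity(as_f), predict_severity(as_m)
        n += 1
        lower_as_f += sev_f < sev_m  # female version scored less severe
        lower_as_m += sev_m < sev_f  # male version scored less severe
    if n == 0:
        raise ValueError("no records")
    return {
        "n": n,
        "lower_as_female": lower_as_f / n,
        "lower_as_male": lower_as_m / n,
        # Assumed summary: net share of pairs pushed lower by the female label.
        "net_toward_female_lower": (lower_as_f - lower_as_m) / n,
    }


def toy_emulator(p: Presentation) -> int:
    """Stand-in for the trained LLM; deliberately biased for illustration only."""
    base = 3 if "chest pain" in p.note else 2
    # Toy bias: down-rank chest pain by one level when presented as female.
    return base - 1 if (p.sex == "F" and "chest pain" in p.note) else base


records = [
    Presentation("chest pain, sweating", "F"),
    Presentation("chest pain on exertion", "M"),
    Presentation("ankle sprain", "F"),
    Presentation("mild fever", "M"),
]
result = directional_disparity(records, toy_emulator)
print(result)
```

In this sketch only a structured sex field is flipped. Gendered wording inside a free-text note would need its own rewriting step, which is omitted here.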

Taxonomy

Topics: Topic Modeling · Artificial Intelligence in Healthcare and Education · Machine Learning in Healthcare