Improving Social Determinants of Health Documentation in French EHRs Using Large Language Models
Adrien Bazoge, Pacôme Constant dit Beaufils, Mohammed Hmitouch, Romain Bourcier, Emmanuel Morin, Richard Dufour, Béatrice Daille, Pierre-Antoine Gourraud, Matilde Karakachoff

TL;DR
This study demonstrates that large language models can effectively extract social determinants of health from French clinical notes, significantly improving data completeness compared to traditional structured EHR coding.
Contribution
The paper introduces an LLM-based approach for extracting 13 SDoH categories from French clinical notes, evaluates it on four datasets, and publicly releases two of those datasets as open resources.
Findings
High accuracy for well-documented SDoH categories (F1 > 0.80)
Model identified 95.8% of patients with at least one SDoH
Performance limited by annotation inconsistencies and language-specific issues
Abstract
Social determinants of health (SDoH) significantly influence health outcomes, shaping disease progression, treatment adherence, and health disparities. However, their documentation in structured electronic health records (EHRs) is often incomplete or missing. This study presents an approach based on large language models (LLMs) for extracting 13 SDoH categories from French clinical notes. We trained Flan-T5-Large on annotated social history sections from clinical notes at Nantes University Hospital, France. We evaluated the model at two levels: (i) identification of SDoH categories and associated values, and (ii) extraction of detailed SDoH with associated temporal and quantitative information. The model performance was assessed across four datasets, including two that we publicly release as open resources. The model achieved strong performance for identifying well-documented categories…
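The first evaluation level described in the abstract (identifying SDoH categories and their associated values) is typically scored with a per-category F1. The following is a minimal sketch of such a score, not the authors' evaluation code: the function name `per_category_f1`, the representation of each note as a set of `(category, value)` pairs, and the example labels are all assumptions made for illustration.

```python
from collections import defaultdict


def per_category_f1(gold, pred):
    """Compute an F1 score per category.

    gold, pred: lists with one entry per note; each entry is a set of
    (category, value) pairs. This data layout is hypothetical.
    """
    tp = defaultdict(int)
    fp = defaultdict(int)
    fn = defaultdict(int)
    for g, p in zip(gold, pred):
        for cat, _ in g & p:   # pair predicted and present in gold
            tp[cat] += 1
        for cat, _ in p - g:   # pair predicted but absent from gold
            fp[cat] += 1
        for cat, _ in g - p:   # gold pair the model missed
            fn[cat] += 1
    scores = {}
    for cat in set(tp) | set(fp) | set(fn):
        denom = 2 * tp[cat] + fp[cat] + fn[cat]
        scores[cat] = 2 * tp[cat] / denom if denom else 0.0
    return scores


# Hypothetical labels, for illustration only.
gold = [{("tobacco", "current")},
        {("tobacco", "none"), ("alcohol", "current")}]
pred = [{("tobacco", "current")},
        {("tobacco", "current"), ("alcohol", "current")}]
scores = per_category_f1(gold, pred)
```

In this sketch a prediction counts as correct only when both the category and its value match, so a wrong value for a correctly detected category is counted as one false positive and one false negative.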
