SHIELD: A Diverse Clinical Note Dataset and Distilled Small Language Models for Enterprise-Scale De-identification
Jose D. Posada, David Love, Somalee Datta, Priya Desai

TL;DR
This paper introduces SHIELD, a diverse clinical note dataset, and develops distilled small language models for effective, enterprise-scale de-identification of electronic health records, addressing limitations of existing benchmarks and models.
Contribution
The paper presents a new diverse dataset for clinical de-identification and demonstrates how to distill large language models into efficient small models suitable for enterprise deployment.
Findings
The SHIELD dataset contains 1,394 notes with 10,505 PHI spans across 9 categories.
Distilled models achieve 0.88 precision and 0.86 recall on PHI span detection.
Distilled models generalize well across datasets but struggle with institution-specific entities.
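To make the precision and recall figures above concrete, here is a minimal sketch of how span-level scores can be computed. It assumes exact-match scoring, where a predicted span counts as correct only if its note, character offsets, and category all match a gold span. The summary does not state which matching criterion the paper uses, and the `Span` type, the function name, and the example values are hypothetical.

```python
from typing import Iterable, NamedTuple, Tuple


class Span(NamedTuple):
    """A PHI annotation (hypothetical structure, not the SHIELD schema)."""
    note_id: str
    start: int   # character offset where the span begins
    end: int     # character offset where the span ends (exclusive)
    label: str   # PHI category, e.g. "NAME" or "DATE"


def span_precision_recall(
    gold: Iterable[Span], predicted: Iterable[Span]
) -> Tuple[float, float]:
    """Exact-match span precision and recall (an assumed criterion)."""
    gold_set = set(gold)
    pred_set = set(predicted)
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    return precision, recall


# Toy data: 4 gold spans, 3 predictions, 2 of which match exactly.
gold = [
    Span("note1", 0, 8, "NAME"),
    Span("note1", 20, 30, "DATE"),
    Span("note2", 5, 17, "PHONE"),
    Span("note2", 40, 52, "LOCATION"),
]
predicted = [
    Span("note1", 0, 8, "NAME"),
    Span("note1", 20, 30, "DATE"),
    Span("note2", 5, 15, "PHONE"),  # boundary mismatch, counted as an error
]

precision, recall = span_precision_recall(gold, predicted)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Under exact matching, a boundary error such as the `PHONE` span above lowers both scores at once: it is a false positive for precision and leaves a gold span unmatched for recall. Relaxed (overlap-based) matching would score it differently.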
Abstract
De-identification of clinical text remains essential for secondary use of electronic health records (EHRs), yet public benchmarks such as i2b2 2006/2014 are over a decade old and lack the semantic and demographic diversity of modern narratives. While Large Language Models (LLMs) achieve state-of-the-art zero-shot extraction, enterprise deployment is hindered by compute costs and governance restricting Protected Health Information (PHI) from cloud APIs. We introduce SHIELD (Synthetic Human-annotated Identifier-replaced Entries for Learning and De-identification), a diverse dataset of 1,394 notes with 10,505 gold-standard PHI spans across 9 categories, built via set-cover diversity sampling with human-in-the-loop adjudication. We evaluate four LLMs (two proprietary, two open-weight) to establish a performance ceiling, then distill these capabilities into locally deployable Small Language…
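The set-cover diversity sampling mentioned in the abstract can be illustrated with the standard greedy approximation: repeatedly pick the note that covers the most not-yet-covered attributes. This is only a sketch of the general technique, not the authors' procedure. The attributes used here (note type, specialty) and the function name `greedy_set_cover_sample` are assumptions, since the summary does not say which features SHIELD sampled over.

```python
from typing import Dict, List, Set


def greedy_set_cover_sample(
    note_features: Dict[str, Set[str]], budget: int
) -> List[str]:
    """Greedily select up to `budget` notes that cover the most attributes.

    `note_features` maps a note ID to its set of attributes (hypothetical).
    Selection stops early once every attribute is covered.
    """
    uncovered: Set[str] = set().union(*note_features.values())
    remaining = dict(note_features)
    selected: List[str] = []

    while uncovered and remaining and len(selected) < budget:
        # Iterate in sorted order so ties resolve deterministically.
        best = max(sorted(remaining), key=lambda nid: len(remaining[nid] & uncovered))
        gain = remaining[best] & uncovered
        if not gain:
            break  # no remaining note adds a new attribute
        selected.append(best)
        uncovered -= gain
        del remaining[best]

    return selected


# Toy corpus with made-up attributes.
notes = {
    "n1": {"discharge", "cardiology"},
    "n2": {"discharge", "oncology", "pediatric"},
    "n3": {"progress", "cardiology"},
    "n4": {"discharge"},
}

print(greedy_set_cover_sample(notes, budget=3))
```

In this toy corpus, two notes suffice to cover all five attributes, so the sampler stops before exhausting its budget and never selects the redundant `n4`.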
