TL;DR
This paper evaluates methods for generating privacy-preserving synthetic clinical notes using large language models, balancing data utility and privacy risks, and demonstrating the effectiveness of keyword-based approaches in healthcare data sharing.
Contribution
It introduces a keyword-based text generation method with LLMs that maintains data utility while reducing privacy risks in synthetic clinical notes.
Findings
The keyword-based method shows low PHI risk and high utility.
Re-identified data outperforms de-identified data in utility.
One-shot generation shows higher PHI exposure than the keyword-based method.
Abstract
This study examines integrating EHRs and NLP with large language models (LLMs) to improve healthcare data management and patient care. It focuses on using advanced models to create secure, HIPAA-compliant synthetic patient notes for biomedical research. The study used de-identified and re-identified MIMIC III datasets with GPT-3.5, GPT-4, and Mistral 7B to generate synthetic notes. Text generation employed templates and keyword extraction for contextually relevant notes, with one-shot generation for comparison. Privacy assessment checked PHI occurrence, while text utility was tested using an ICD-9 coding task. Text quality was evaluated with ROUGE and cosine similarity metrics to measure semantic similarity with source notes. Analysis of PHI occurrence and text utility via the ICD-9 coding task showed that the keyword-based method had low risk and good performance. One-shot generation…
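The evaluation described in the abstract (PHI occurrence, ROUGE, cosine similarity) can be illustrated with a minimal sketch. This is not the authors' implementation: the tokenizer, the function names, and the use of raw term counts for cosine similarity are all assumptions made for illustration, and the paper may well use embeddings and a dedicated PHI detector instead.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens (a simplifying assumption)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rouge1_recall(source, synthetic):
    """Fraction of source unigrams (with multiplicity) also found in the synthetic note."""
    src, syn = Counter(tokenize(source)), Counter(tokenize(synthetic))
    total = sum(src.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, syn[tok]) for tok, count in src.items())
    return overlap / total


def cosine_similarity(source, synthetic):
    """Cosine similarity between raw term-count vectors of the two notes."""
    a, b = Counter(tokenize(source)), Counter(tokenize(synthetic))
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


def phi_occurrences(phi_terms, synthetic):
    """Return the known PHI strings (hypothetical list) that leak into the synthetic note."""
    lowered = synthetic.lower()
    return [term for term in phi_terms if term.lower() in lowered]
```

Given a source note and a generated note, `rouge1_recall` and `cosine_similarity` give rough lexical-overlap scores, while `phi_occurrences` flags any supplied identifiers (for example a patient name taken from the re-identified source) that reappear verbatim in the synthetic text.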
Taxonomy
Methods: Cosine Annealing · Label Smoothing · Position-Wise Feed-Forward Layer · Absolute Position Encodings · Linear Warmup With Cosine Annealing · Residual Connection · Dropout · Transformer · Adam
