DISCO-TAB: A Hierarchical Reinforcement Learning Framework for Privacy-Preserving Synthesis of Complex Clinical Data

Arshia Ilaty; Hossein Shirazi; Amir Rahmani; Hajar Homayouni

arXiv:2604.01481 · cs.LG · April 3, 2026

TL;DR

DISCO-TAB is a hierarchical reinforcement learning framework for privacy-preserving synthesis of clinical data. It pairs a fine-tuned LLM with a multi-objective discriminator system so that generated records capture the complex dependencies found in EHR data while preserving utility and privacy.

Contribution

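DISCO-TAB introduces a multi-granularity discriminator system, optimized via reinforcement learning, that evaluates synthetic EHR records at the token, sentence, feature, and row levels instead of relying on scalar feedback. The authors report that this improves the realism and utility of the synthetic data over prior methods.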

Findings

01. Achieves up to a 38.2% improvement in clinical classifier utility.

02. Maintains statistical fidelity, with a Jensen-Shannon divergence (JSD) below 0.01.

03. Demonstrates robustness against membership inference attacks.
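
To make the JSD threshold in the second finding concrete, here is a minimal sketch of how the Jensen-Shannon divergence between a real and a synthetic categorical column could be computed. It is an illustration, not the paper's evaluation code: the function names, the toy label lists, and the choice of log base 2 (which bounds JSD in [0, 1]) are all assumptions, and the summary does not state which base or which features the authors used.

```python
import math
from collections import Counter


def category_distribution(values, categories):
    """Empirical probability of each category, in a fixed category order."""
    counts = Counter(values)
    total = len(values)
    return [counts[c] / total for c in categories]


def jensen_shannon_divergence(p, q):
    """JSD between two discrete distributions, in bits (log base 2).

    Assumption: base 2, so the result lies in [0, 1].
    """
    # Mixture distribution M = (P + Q) / 2.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]

    def kl(a, b):
        # Terms with a_i == 0 contribute nothing; b_i > 0 whenever a_i > 0
        # because b is the mixture that includes a.
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Hypothetical toy data: an imbalanced binary label in real vs. synthetic rows.
categories = ["neg", "pos"]
real_labels = ["neg"] * 90 + ["pos"] * 10
synthetic_labels = ["neg"] * 88 + ["pos"] * 12

p_real = category_distribution(real_labels, categories)
p_synth = category_distribution(synthetic_labels, categories)
jsd = jensen_shannon_divergence(p_real, p_synth)
print(f"JSD = {jsd:.6f}")
```

A value near 0 means the two marginal distributions are almost identical, and a value of 1 (in bits) means they do not overlap at all. A per-column check like this only measures marginal fidelity; it says nothing about cross-feature dependencies.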

Abstract

The development of robust clinical decision support systems is frequently impeded by the scarcity of high-fidelity, privacy-preserving biomedical data. While Generative Large Language Models (LLMs) offer a promising avenue for synthetic data generation, they often struggle to capture the complex, non-linear dependencies and severe class imbalances inherent in Electronic Health Records (EHR), leading to statistically plausible but clinically invalid records. To bridge this gap, we introduce DISCO-TAB (DIScriminator-guided COntrol for TABular synthesis), a novel framework that orchestrates a fine-tuned LLM with a multi-objective discriminator system optimized via Reinforcement Learning. Unlike prior methods relying on scalar feedback, DISCO-TAB evaluates synthesis at four granularities (token, sentence, feature, and row) while integrating Automated Constraint Discovery and…
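
The abstract names four evaluation granularities but, as truncated here, does not say how their signals are combined. The sketch below shows one hypothetical way to hold a score per granularity and derive a normalized weighted summary alongside it. Everything in it is an assumption made for illustration: the `GRANULARITIES` tuple ordering, the `combine_feedback` helper, the [0, 1] score range, and the weights. It should not be read as the DISCO-TAB reward.

```python
# The four granularities named in the abstract.
GRANULARITIES = ("token", "sentence", "feature", "row")


def combine_feedback(scores, weights=None):
    """Weighted mean of per-granularity discriminator scores.

    Hypothetical helper: `scores` maps each granularity to a float in [0, 1].
    `weights` defaults to equal weighting; the actual aggregation used by the
    paper is not given in the summary.
    """
    missing = [g for g in GRANULARITIES if g not in scores]
    if missing:
        raise ValueError(f"missing scores for: {missing}")
    if weights is None:
        weights = {g: 1.0 for g in GRANULARITIES}
    total_weight = sum(weights[g] for g in GRANULARITIES)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = sum(weights[g] * scores[g] for g in GRANULARITIES)
    return weighted / total_weight


# Hypothetical scores for one generated record.
record_scores = {"token": 0.8, "sentence": 0.6, "feature": 0.4, "row": 0.2}

# Keep the per-granularity breakdown next to the summary so that a weak
# level (here "row") stays visible instead of being averaged away.
summary = combine_feedback(record_scores)
weakest = min(GRANULARITIES, key=lambda g: record_scores[g])
print(f"summary={summary:.2f}, weakest granularity={weakest}")
```

Keeping the breakdown separate from the summary reflects the contrast the abstract draws with scalar-only feedback: a single number cannot show which level of the record failed.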
