# Evaluating Differentially Private Generation of Domain-Specific Text

**Authors:** Yidan Sun, Viktor Schlegel, Srinivasan Nandakumar, Iqra Zahid, Yuping Wu, Warren Del-Pinto, Goran Nenadic, Siew-Kei Lam, Jie Zhang, Anil A Bharath

arXiv: 2508.20452 · 2025-09-01

## TL;DR

This paper introduces a benchmark for evaluating the quality of domain-specific text generated under differential privacy, revealing current methods' limitations in utility and fidelity, especially under strict privacy constraints.

## Contribution

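The paper provides a unified benchmark for systematically evaluating differentially private text generation across domains. It addresses key challenges in domain-specific benchmarking (choosing representative data, setting realistic privacy budgets, accounting for pre-training, and covering a variety of evaluation metrics) and sets a precedent for evaluating such methods in realistic scenarios.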

## Key findings

- Across five domain-specific datasets, synthetic text shows significant utility and fidelity degradation compared to real data, especially under strict privacy constraints
- Current state-of-the-art privacy-preserving generation methods have notable limitations in realistic scenarios
- The findings point to a need for more advanced privacy-preserving data sharing methods

## Abstract

Generative AI offers transformative potential for high-stakes domains such as healthcare and finance, yet privacy and regulatory barriers hinder the use of real-world data. To address this, differentially private synthetic data generation has emerged as a promising alternative. In this work, we introduce a unified benchmark to systematically evaluate the utility and fidelity of text datasets generated under formal Differential Privacy (DP) guarantees. Our benchmark addresses key challenges in domain-specific benchmarking, including choice of representative data and realistic privacy budgets, accounting for pre-training and a variety of evaluation metrics. We assess state-of-the-art privacy-preserving generation methods across five domain-specific datasets, revealing significant utility and fidelity degradation compared to real data, especially under strict privacy constraints. These findings underscore the limitations of current approaches, outline the need for advanced privacy-preserving data sharing methods and set a precedent regarding their evaluation in realistic scenarios.
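
For readers unfamiliar with the guarantee the abstract refers to, the standard textbook definition of $(\varepsilon, \delta)$-differential privacy is given below. This is a general definition, not quoted from the paper; the symbols $\mathcal{M}$, $D$, $D'$, and $S$ are introduced here for illustration, and the summary does not state which DP variant or adjacency notion the paper uses. A randomized mechanism $\mathcal{M}$ satisfies $(\varepsilon, \delta)$-DP if, for all pairs of datasets $D$ and $D'$ differing in a single record and for all sets of outputs $S$:

$$
\Pr[\mathcal{M}(D) \in S] \le e^{\varepsilon} \, \Pr[\mathcal{M}(D') \in S] + \delta
$$

Under this definition, a smaller privacy budget $\varepsilon$ means the outputs on $D$ and $D'$ must be harder to tell apart, so "strict privacy constraints" is most naturally read as evaluation at small $\varepsilon$.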

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2508.20452/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/2508.20452/full.md

## References

49 references — full list in the complete paper: https://tomesphere.com/paper/2508.20452/full.md

---
Source: https://tomesphere.com/paper/2508.20452