Training-Free Private Synthesis with Validation: A New Frontier for Practical Educational Data Sharing

Hibiki Ito; Chia-Yu Hsu; Hiroaki Ogata

arXiv:2604.01821 · cs.CY · April 3, 2026

TL;DR

This paper introduces a practical, training-free, LLM-based method for differentially private synthetic data generation (DP-SDG) aimed at educational data sharing. The method reduces engineering effort and supports on-demand validation, during which moderate privacy leakage occurs.

Contribution

It proposes a two-stage approach that combines training-free, LLM-based DP-SDG with on-demand validation, reducing the engineering effort needed for educational data sharing.

Findings

01 LLM-based DP-SDG performs comparably to deep learning baselines.

02 The method significantly reduces engineering costs.

03 Moderate privacy leakage occurs during validation.

Abstract

While secondary use of real-world data (RWD) in education offers substantial research opportunities, data sharing is often limited by privacy constraints. Differentially private synthetic data generation (DP-SDG) has emerged as a possible solution. However, educational RWD is fragmented across platforms and institutions and stored in different formats, so DP-SDG must be tailored to each dataset, requiring substantial engineering effort. In addition, such data are often small-sample and high-dimensional, making deep learning (DL)-based methods common but difficult to implement without specialist expertise. In this setting, it is also hard to achieve practically useful downstream utility. As a result, despite its theoretical promise, DP-SDG remains far from a practical solution in education. To address this issue, we propose a more practical two-stage method: (1) training-free, LLM-based…
