SAGE: Sparse Adaptive Guidance for Dependency-Aware Tabular Data Generation

Shuo Yang; Zheyu Zhang; Bardh Prenkaj; Gjergji Kasneci

arXiv:2604.24368·cs.LG·April 28, 2026

TL;DR

SAGE introduces a novel LLM-based framework for generating high-fidelity synthetic tabular data by enforcing sparse, dynamic feature dependencies, improving data utility and reducing policy violations.

Contribution

The paper proposes a method that models feature dependencies sparsely and adaptively, addressing two weaknesses of earlier dense, static approaches: spurious correlations and relationships that ignore feature values.

Findings

01. Boosts F1 scores by 10% over previous methods

02. Reduces policy violations by one point

03. Improves data fidelity and downstream utility

Abstract

Generating high-fidelity synthetic tabular data remains a critical challenge for enhancing data availability in privacy-sensitive and low-resource domains. Recent approaches leverage LLMs by representing table rows as sequences, yet suffer from two fundamental limitations: (1) they model feature dependencies densely, introducing spurious correlations; and (2) they assume static relationships between features, ignoring how these dependencies vary with feature values. To overcome these limitations, we introduce SAGE (Sparse Adaptive Guidance), a novel LLM-based generation framework that enforces sparse and dynamic dependency guidance. SAGE discretizes features into value-aware pseudo-features and constructs a mutual information-based sparse dependency graph. This graph adaptively guides generation through explicit context selection or implicit logit correction, enabling LLMs to focus on…
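To make the idea of a mutual information-based sparse dependency graph concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the equal-width binning, the `top_k` pruning rule, and the function names `discretize`, `mutual_information`, and `sparse_dependency_graph` are all assumptions chosen for illustration. It also treats each discretized column as a single node, whereas the abstract describes value-aware pseudo-features, whose exact construction is not given here.

```python
import math
from collections import Counter


def discretize(values, n_bins=3):
    """Equal-width binning of a numeric column (an assumed scheme, not SAGE's)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # Clamp the maximum value into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete columns."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log(p(x, y) / (p(x) * p(y))), written with raw counts.
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


def sparse_dependency_graph(columns, top_k=1):
    """Keep, for each feature, only its top_k highest-MI partners (assumed rule)."""
    names = list(columns)
    graph = {}
    for a in names:
        scores = [
            (mutual_information(columns[a], columns[b]), b)
            for b in names
            if b != a
        ]
        scores.sort(key=lambda s: s[0], reverse=True)
        graph[a] = [b for _, b in scores[:top_k]]
    return graph


# Toy table (made-up values): age and income move together, noise does not.
cols = {
    "age": discretize([20, 25, 30, 60, 65, 70], n_bins=2),
    "income": discretize([1, 2, 3, 10, 11, 12], n_bins=2),
    "noise": discretize([0, 1, 0, 1, 0, 1], n_bins=2),
}
graph = sparse_dependency_graph(cols, top_k=1)
print(graph)
```

In a pipeline of the kind the abstract describes, a graph like this could decide which already-generated features are shown to the LLM when it produces the next one (explicit context selection). How SAGE actually performs that selection, or the logit correction alternative, is not specified in the text above.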
