Not All Features Deserve Attention: Graph-Guided Dependency Learning for Tabular Data Generation with Language Models
Zheyu Zhang, Shuo Yang, Bardh Prenkaj, Gjergji Kasneci

TL;DR
This paper introduces GraDe, a method that enhances large language models for tabular data generation by explicitly incorporating sparse dependency graphs to focus attention on critical feature interactions, improving performance on complex datasets.
Contribution
The paper proposes GraDe, a novel approach that integrates external dependency graphs into LLMs' attention to better model structured tabular data with sparse dependencies.
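One way to picture what "integrating a dependency graph into attention" could mean is an additive bias on attention scores. The sketch below is illustrative only and is not taken from the paper: the function name `graph_biased_attention`, the fixed `penalty` value, and the use of a static adjacency matrix are all assumptions, whereas GraDe is described as using a learned, dynamic graph module inside an LLM.

```python
import numpy as np

def graph_biased_attention(q, k, v, adjacency, penalty=4.0):
    """Scaled dot-product attention over features, with a soft penalty on
    feature pairs that are NOT linked in the dependency graph.

    q, k, v   : arrays of shape (n_features, d)
    adjacency : (n_features, n_features) 0/1 matrix, hypothetical input
    penalty   : amount subtracted from scores of unlinked pairs (assumed)
    """
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)

    # Every feature may always attend to itself; other pairs need an edge.
    allowed = adjacency.astype(bool) | np.eye(len(adjacency), dtype=bool)
    scores = np.where(allowed, scores, scores - penalty)

    # Numerically stable softmax over each row.
    scores -= scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights
```

A soft penalty, rather than a hard mask, keeps unlinked pairs reachable while still shifting attention mass toward linked ones; whether GraDe uses a soft or hard formulation is not stated in this summary.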
Findings
GraDe outperforms existing LLM-based methods by up to 12% on complex datasets.
GraDe is competitive with state-of-the-art approaches on synthetic data quality.
The method is minimally intrusive and makes tabular data modeling structure-aware.
Abstract
Large Language Models (LLMs) have shown strong potential for tabular data generation by modeling textualized feature-value pairs. However, tabular data inherently exhibits sparse feature-level dependencies, where many feature interactions are structurally insignificant. This creates a fundamental mismatch as LLMs' self-attention mechanism inevitably distributes focus across all pairs, diluting attention on critical relationships, particularly in datasets with complex dependencies or semantically ambiguous features. To address this limitation, we propose GraDe (Graph-Guided Dependency Learning), a novel method that explicitly integrates sparse dependency graphs into LLMs' attention mechanism. GraDe employs a lightweight dynamic graph learning module guided by externally extracted functional dependencies, prioritizing key feature interactions while suppressing irrelevant ones. Our…
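For readers unfamiliar with functional dependencies: column A functionally determines column B if every value of A always co-occurs with the same value of B. The sketch below is a naive, exact, single-column check and is not the extraction procedure used in the paper, which this summary does not specify; the helper names `holds_fd` and `dependency_edges` and the toy rows are hypothetical.

```python
from itertools import permutations

def holds_fd(rows, lhs, rhs):
    """Return True if column `lhs` determines column `rhs` in every row."""
    seen = {}
    for row in rows:
        # Remember the first rhs value seen for each lhs value and
        # fail as soon as a conflicting value appears.
        if seen.setdefault(row[lhs], row[rhs]) != row[rhs]:
            return False
    return True

def dependency_edges(rows, columns):
    """List directed edges (a, b) such that a -> b holds exactly."""
    return [(a, b) for a, b in permutations(columns, 2) if holds_fd(rows, a, b)]

# Toy table (made up for illustration).
rows = [
    {"zip": "10115", "city": "Berlin", "age": 30},
    {"zip": "10115", "city": "Berlin", "age": 41},
    {"zip": "80331", "city": "Munich", "age": 30},
]
edges = dependency_edges(rows, ["zip", "city", "age"])
```

Here only `zip` and `city` determine each other, so the resulting graph is sparse: `age` has no edges. Such an edge list could be turned into an adjacency matrix to guide attention, under the assumptions stated above.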
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Graph Neural Networks
