UtilGen: Utility-Centric Generative Data Augmentation with Dual-Level Task Adaptation
Jiyu Guo, Shuo Yang, Yiming Huang, Yancheng Long, Xiaobo Xia, Xiu Su, Bo Zhao, Zeke Xie, Liqiang Nie

TL;DR
UtilGen introduces a utility-centric data augmentation framework that adaptively generates task-specific synthetic data by leveraging downstream task feedback, significantly improving model performance across multiple benchmarks.
Contribution
Proposes a dual-level optimization strategy for generative data augmentation that maximizes task utility rather than visual quality alone.
Findings
Achieves an average accuracy improvement of 3.87% over SOTA methods.
Produces more impactful, task-relevant synthetic data.
Demonstrates effectiveness across eight benchmark datasets.
Abstract
Data augmentation using generative models has emerged as a powerful paradigm for enhancing performance in computer vision tasks. However, most existing augmentation approaches primarily focus on optimizing intrinsic data attributes -- such as fidelity and diversity -- to generate visually high-quality synthetic data, while often neglecting task-specific requirements. Yet, it is essential for data generators to account for the needs of downstream tasks, as training data requirements can vary significantly across different tasks and network architectures. To address these limitations, we propose UtilGen, a novel utility-centric data augmentation framework that adaptively optimizes the data generation process to produce task-specific, high-utility training data via downstream task feedback. Specifically, we first introduce a weight allocation network to evaluate the task-specific utility…
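To make the idea of per-sample utility concrete, here is a minimal sketch of a utility-weighted training loss. It is not the paper's implementation: the abstract is truncated before it describes how the weight allocation network is built or trained, so the sketch assumes the utility scores are already given as non-negative numbers. The names `utility_weighted_loss` and `utility_scores` are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Cross-entropy of one sample given raw logits and an integer label."""
    return -math.log(softmax(logits)[label])


def utility_weighted_loss(batch_logits, labels, utility_scores):
    """Weighted mean of per-sample cross-entropy.

    utility_scores are assumed (hypothetically) to come from some scoring
    model; here they are plain non-negative floats supplied by the caller.
    Samples with a higher score contribute more to the loss.
    """
    if not (len(batch_logits) == len(labels) == len(utility_scores)):
        raise ValueError("inputs must have the same length")
    total_weight = sum(utility_scores)
    if total_weight <= 0:
        raise ValueError("at least one utility score must be positive")
    losses = [cross_entropy(l, y) for l, y in zip(batch_logits, labels)]
    weighted = sum(w * l for w, l in zip(utility_scores, losses))
    return weighted / total_weight


# Usage: the second synthetic sample gets zero weight, so it is ignored.
loss = utility_weighted_loss(
    batch_logits=[[0.0, 0.0], [5.0, -5.0]],
    labels=[0, 1],
    utility_scores=[1.0, 0.0],
)
```

Normalizing by the sum of the weights keeps the loss on the same scale as an unweighted mean, so a sample with zero utility simply drops out of the batch.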
