DataEvolver: Let Your Data Build and Improve Itself via Goal-Driven Loop Agents
Qisong Zhang (1), Wenzhuo Wu (1), Zhuangzhuang Jia (1), Yunhao Yang (1), Huayu Zhang (2), Xianghao Zang (2), Zhixiang He (2), Zhongjiang He (2), Kongming Liang (1), Zhanyu Ma (1) ((1) School of Artificial Intelligence, Beijing University of Posts and Telecommunications

TL;DR
DataEvolver is a closed-loop system for iterative visual data generation and refinement, enabling goal-driven dataset creation with multiple artifact types and validation mechanisms.
Contribution
It introduces a reusable framework for building visual datasets through explicit goal tracking, review, correction, and acceptance loops, validated on an object-rotation task.
Findings
The model trained on DataEvolver data outperforms the unadapted base and multi-angle LoRA models on SpatialEdit.
Ablation studies show improvements at each stage, from scene-aware generation through feedback correction.
The framework effectively enhances dataset quality through iterative validation and correction.
Abstract
Constructing controllable visual data is a major bottleneck for image editing and multimodal understanding. Useful supervision is rarely produced by a single rendering pass; instead it emerges through iterative generation, inspection, correction, filtering, and export. We present DataEvolver, a closed-loop visual data engine that organizes this process around explicit goals, persistent artifacts, bounded corrective actions, and acceptance decisions. DataEvolver supports multiple artifact types, including RGB images, masks, depth maps, normal maps, meshes, poses, trajectories, and review traces. In the current release, the system operates through two coupled loops: generation-time self-correction within each sample and validation-time self-expansion across dataset rounds. We validate the framework on an image-level object-rotation setting. With a fixed Qwen-Edit LoRA probe, our final…
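The two coupled loops described in the abstract can be pictured with a small sketch. This is only an illustration under stated assumptions, not the DataEvolver implementation: every name here (Sample, generate_with_self_correction, expand_dataset, and the toy generator, reviewer, and corrector) is hypothetical, and a single number stands in for a rendered artifact. The inner function applies a bounded number of corrections to one sample before accepting or rejecting it; the outer function validates the dataset after each round and adds goals where validation finds gaps.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Sample:
    goal: float   # hypothetical target, e.g. a rotation angle in degrees
    value: float  # what the (toy) generator actually produced
    trace: List[str] = field(default_factory=list)  # review trace kept with the sample


def generate_with_self_correction(
    goal: float,
    generate: Callable[[float], float],
    review: Callable[[Sample], bool],
    correct: Callable[[Sample], float],
    max_corrections: int = 3,
) -> Optional[Sample]:
    """Inner loop: generate once, then review and correct a bounded number of times."""
    sample = Sample(goal=goal, value=generate(goal))
    for step in range(max_corrections + 1):
        accepted = review(sample)
        sample.trace.append(f"review {step}: {'accept' if accepted else 'reject'}")
        if accepted:
            return sample
        if step < max_corrections:
            sample.value = correct(sample)
    return None  # correction budget exhausted: the sample is not accepted


def expand_dataset(
    goals: List[float],
    validate: Callable[[List[Sample]], bool],
    propose_goals: Callable[[List[Sample]], List[float]],
    make_sample: Callable[[float], Optional[Sample]],
    max_rounds: int = 3,
) -> List[Sample]:
    """Outer loop: build a round, validate the dataset, then target whatever is missing."""
    dataset: List[Sample] = []
    for _ in range(max_rounds):
        for goal in goals:
            sample = make_sample(goal)
            if sample is not None:
                dataset.append(sample)
        if validate(dataset):
            break
        goals = propose_goals(dataset)
    return dataset


# Toy stand-ins (assumptions): the generator overshoots by 10 degrees, the
# corrector halves the remaining error, the reviewer tolerates 2 degrees.
def toy_generate(goal: float) -> float:
    return goal + 10.0


def toy_review(sample: Sample) -> bool:
    return abs(sample.value - sample.goal) <= 2.0


def toy_correct(sample: Sample) -> float:
    return sample.goal + (sample.value - sample.goal) / 2.0


REQUIRED_ANGLES = {0.0, 90.0, 180.0, 270.0}


def covered(dataset: List[Sample]) -> set:
    return {s.goal for s in dataset}


dataset = expand_dataset(
    goals=[0.0, 90.0],
    validate=lambda ds: REQUIRED_ANGLES <= covered(ds),
    propose_goals=lambda ds: sorted(REQUIRED_ANGLES - covered(ds)),
    make_sample=lambda g: generate_with_self_correction(
        g, toy_generate, toy_review, toy_correct
    ),
)

for s in dataset:
    print(s.goal, s.value, s.trace)
```

In this toy run the first round covers only two of the four required angles, so validation fails and the second round targets the missing ones. Bounding the corrections per sample keeps the inner loop from running indefinitely on a sample that cannot be repaired; such a sample is dropped rather than exported.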
