OmniDiagram: Advancing Unified Diagram Code Generation via Visual Interrogation Reward
Haoyue Yang, Xuanle Zhao, Xuexin Liu, Feibang Jiang, Yao Zhu

TL;DR
OmniDiagram introduces a unified framework for diagram code generation that uses visual feedback via the Viva strategy, enabling high-quality, diverse diagram creation without manual annotations.
Contribution
The paper presents OmniDiagram, a novel unified diagram code generation framework with a visual interrogation reward, and introduces the large-scale M3$^2$Diagram dataset.
Findings
OmniDiagram achieves new state-of-the-art results on diagram code benchmarks.
Viva's visual feedback improves diagram fidelity without manual ground truth.
The dataset contains over 196,000 high-quality diagram instances.
Abstract
The paradigm of programmable diagram generation is evolving rapidly, playing a crucial role in structured visualization. However, most existing studies are confined to a narrow range of task formulations and language support, constraining their applicability to diverse diagram types. In this work, we propose OmniDiagram, a unified framework that incorporates diverse diagram code languages and task definitions. To address the challenge of aligning code logic with visual fidelity in Reinforcement Learning (RL), we introduce a novel visual feedback strategy named Visual Interrogation Verifies All (Viva). Unlike brittle syntax-based rules or pixel-level matching, Viva rewards the visual structure of rendered diagrams through a generative approach. Specifically, Viva actively generates targeted visual inquiries to scrutinize diagram visual fidelity and provides…
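The abstract is cut off before it says how the answers to these visual inquiries become a reward, so the following is only an illustrative sketch, not the authors' method. It assumes each inquiry comes with an expected answer and scores a rendered diagram by the fraction of inquiries answered as expected. The names `Inquiry`, `interrogation_reward`, and `answer_fn` are hypothetical; `answer_fn` stands in for whatever model would inspect the rendered image.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical type: a visual question paired with its expected answer.
Inquiry = Tuple[str, str]


def interrogation_reward(
    inquiries: Sequence[Inquiry],
    answer_fn: Callable[[str], str],
) -> float:
    """Return the fraction of inquiries whose answer matches the expected one.

    `answer_fn` is an assumption of this sketch: a callable that answers a
    question about the rendered diagram. It is not an API from the paper.
    """
    if not inquiries:
        # No questions means no evidence of fidelity; give zero reward.
        return 0.0
    correct = sum(
        1
        for question, expected in inquiries
        if answer_fn(question).strip().lower() == expected.strip().lower()
    )
    return correct / len(inquiries)


# Usage with a stub answerer in place of a real vision model.
stub_answers = {
    "Is there an arrow from A to B?": "Yes",
    "How many nodes are shown?": "4",
}
inquiries = [
    ("Is there an arrow from A to B?", "yes"),
    ("How many nodes are shown?", "3"),
]
reward = interrogation_reward(inquiries, lambda q: stub_answers[q])
```

Exact string matching is the simplest possible comparison and is used here only to keep the sketch self-contained; a real reward would likely need a more tolerant answer check.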
