OmniDiagram: Advancing Unified Diagram Code Generation via Visual Interrogation Reward

Haoyue Yang; Xuanle Zhao; Xuexin Liu; Feibang Jiang; Yao Zhu

arXiv:2604.05514·cs.AI·April 8, 2026

TL;DR

OmniDiagram introduces a unified framework for diagram code generation that uses visual feedback via the Viva strategy, enabling high-quality, diverse diagram creation without manual annotations.

Contribution

The paper presents OmniDiagram, a novel unified diagram code generation framework with a visual interrogation reward, and introduces the large-scale M3$^2$Diagram dataset.

Findings

01 OmniDiagram achieves new state-of-the-art results on diagram code benchmarks.

02 Viva's visual feedback improves diagram fidelity without manual ground truth.

03 The M3$^2$Diagram dataset contains over 196,000 high-quality diagram instances.

Abstract

The paradigm of programmable diagram generation is evolving rapidly, playing a crucial role in structured visualization. However, most existing studies are confined to a narrow range of task formulations and language support, constraining their applicability to diverse diagram types. In this work, we propose OmniDiagram, a unified framework that incorporates diverse diagram code languages and task definitions. To address the challenge of aligning code logic with visual fidelity in Reinforcement Learning (RL), we introduce a novel visual feedback strategy named Visual Interrogation Verifies All (Viva). Unlike brittle syntax-based rules or pixel-level matching, Viva rewards the visual structure of rendered diagrams through a generative approach. Specifically, Viva actively generates targeted visual inquiries to scrutinize diagram visual fidelity and provides…
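
To make the idea of rewarding a rendered diagram through targeted questions more concrete, here is a minimal Python sketch. It is not the paper's implementation: the abstract is truncated and does not say how answers are scored or aggregated. The names `Inquiry`, `interrogation_reward`, and `toy_answer` are hypothetical, and averaging exact-match answers is an assumption made purely for illustration. A dictionary of facts stands in for the rendered image and a lookup function stands in for a vision model.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Inquiry:
    """One targeted question about a rendered diagram (hypothetical type)."""
    question: str
    expected: str  # answer a faithful rendering should yield


def interrogation_reward(
    rendered: object,
    inquiries: Sequence[Inquiry],
    answer_fn: Callable[[object, str], str],
) -> float:
    """Return the fraction of inquiries answered as expected.

    Assumption: averaging binary matches is an illustrative choice,
    not the scoring rule from the paper.
    """
    if not inquiries:
        return 0.0
    hits = sum(
        answer_fn(rendered, q.question).strip().lower() == q.expected.strip().lower()
        for q in inquiries
    )
    return hits / len(inquiries)


# Toy stand-in for a vision model: the "rendering" is a dict of facts.
def toy_answer(rendered: dict, question: str) -> str:
    return str(rendered.get(question, "unknown"))


rendered_diagram = {
    "How many nodes are shown?": "3",
    "Is A connected to B?": "yes",
}
inquiries = [
    Inquiry("How many nodes are shown?", "3"),
    Inquiry("Is A connected to B?", "yes"),
    Inquiry("Is B connected to C?", "yes"),
    Inquiry("What is the title?", "Pipeline"),
]
reward = interrogation_reward(rendered_diagram, inquiries, toy_answer)
print(reward)  # 0.5: two of the four inquiries match
```

In this toy setup the reward depends only on what can be read off the rendering, not on the syntax of the generating code or on pixel-level similarity, which is the contrast the abstract draws. A real system would replace `toy_answer` with a model that inspects the rendered image.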
