IntroSVG: Learning from Rendering Feedback for Text-to-SVG Generation via an Introspective Generator-Critic Framework

Feiyu Wang; Jiayuan Yang; Zhiyuan Zhao; Da Zhang; Bingyu Li; Peng Liu; Junyu Gao

arXiv:2603.09312 · cs.CV · March 11, 2026

Open Access

TL;DR

IntroSVG is a closed-loop framework for text-to-SVG generation in which a single VLM acts as both generator and critic: it drafts an SVG, gives feedback on the rendered output, and refines the result iteratively. The paper reports SVGs that are more complex and more semantically accurate than those of previous methods.

Contribution

The paper presents a novel introspective generator-critic framework that incorporates visual feedback and iterative refinement for enhanced text-to-SVG generation.

Findings

01 Achieves state-of-the-art performance on key metrics

02 Generates more complex and semantically aligned SVGs

03 Enhances robustness through error correction training

Abstract

Scalable Vector Graphics (SVG) are central to digital design due to their inherent scalability and editability. Despite significant advancements in content generation enabled by Visual Language Models (VLMs), existing text-to-SVG generation methods are limited by a core challenge: the autoregressive training process does not incorporate visual perception of the final rendered image, which fundamentally constrains generation quality. To address this limitation, we propose an Introspective SVG Generation Framework (IntroSVG). At its core, the framework instantiates a unified VLM that operates in a closed loop, assuming dual roles of both generator and critic. Specifically, through Supervised Fine-Tuning (SFT), the model learns to draft SVGs and to provide feedback on their rendered outputs; moreover, we systematically convert early-stage failures into high-quality error-correction…
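
The closed loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `introspective_loop`, `Critique`, `toy_generate`, and `toy_critique` are hypothetical, the generator and critic are plain callables standing in for the unified VLM, and an XML well-formedness check stands in for actual rendering. The round limit and acceptance threshold are also assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple
import xml.etree.ElementTree as ET


@dataclass
class Critique:
    score: float    # hypothetical quality score in [0, 1]
    feedback: str   # hypothetical free-text suggestion for the next draft


def is_well_formed_svg(svg: str) -> bool:
    """Stand-in for rendering: the markup must parse and have an <svg> root."""
    try:
        root = ET.fromstring(svg)
    except ET.ParseError:
        return False
    # Tags may carry a namespace, e.g. "{http://www.w3.org/2000/svg}svg".
    return root.tag.split("}")[-1] == "svg"


def introspective_loop(
    prompt: str,
    generate: Callable[[str, Optional[str]], str],
    critique: Callable[[str, str], Critique],
    max_rounds: int = 3,        # assumed round limit
    accept_score: float = 0.8,  # assumed acceptance threshold
) -> Tuple[str, float]:
    """Draft, check, critique, and redraft; return the best (svg, score) seen."""
    feedback: Optional[str] = None
    best_svg, best_score = "", -1.0
    for _ in range(max_rounds):
        svg = generate(prompt, feedback)
        if not is_well_formed_svg(svg):
            # A draft that cannot be parsed becomes feedback for the next round.
            feedback = "The SVG did not parse; emit well-formed XML with an <svg> root."
            continue
        result = critique(prompt, svg)
        if result.score > best_score:
            best_svg, best_score = svg, result.score
        if result.score >= accept_score:
            break
        feedback = result.feedback
    return best_svg, best_score


# Toy stand-ins for the model's two roles (hypothetical, for illustration only).
def toy_generate(prompt: str, feedback: Optional[str]) -> str:
    if feedback is None:
        return "<svg><circle r='5'"  # deliberately malformed first draft
    return ('<svg xmlns="http://www.w3.org/2000/svg">'
            '<circle cx="8" cy="8" r="5" fill="red"/></svg>')


def toy_critique(prompt: str, svg: str) -> Critique:
    matches = "red" in svg
    return Critique(score=1.0 if matches else 0.2, feedback="Use a red fill.")


svg, score = introspective_loop("a red circle", toy_generate, toy_critique)
```

In this toy run the first draft fails the well-formedness check, the failure is passed back as feedback, and the second draft is accepted by the critic. A real system would replace the parse check with an actual rasterizer and both callables with calls to the same model.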

Taxonomy

Topics: Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis · Topic Modeling