Rendering-Aware Reinforcement Learning for Vector Graphics Generation

Juan A. Rodriguez; Haotian Zhang; Abhay Puri; Aarash Feizi; Rishav Pramanik; Pascal Wichmann; Arnab Mondal; Mohammad Reza Samsami; Rabiul Awal; Perouz Taslakian; Spandana Gella; Sai Rajeswar; David Vazquez; Christopher Pal; Marco Pedersoli

arXiv:2505.20793·cs.CV·December 2, 2025

TL;DR

This paper introduces RLRF, a reinforcement learning approach that improves SVG generation by using feedback from rendered outputs, producing SVGs that are more faithful and more efficient.

Contribution

The paper presents an RL method that uses rendering feedback to train vision-language models for more accurate and coherent SVG generation, addressing a limitation of supervised methods.

Findings

01

RLRF outperforms supervised fine-tuning in SVG quality.

02

The method produces more faithful and efficient SVGs.

03

RLRF enhances structural understanding and generalization in SVG generation.

Abstract

Scalable Vector Graphics (SVG) offer a powerful format for representing visual designs as interpretable code. Recent advances in vision-language models (VLMs) have enabled high-quality SVG generation by framing the problem as a code generation task and leveraging large-scale pretraining. VLMs are particularly suitable for this task as they capture both global semantics and fine-grained visual patterns, while transferring knowledge across vision, natural language, and code domains. However, existing VLM approaches often struggle to produce faithful and efficient SVGs because they never observe the rendered images during training. Although differentiable rendering for autoregressive SVG code generation remains unavailable, rendered outputs can still be compared to original inputs, enabling evaluative feedback suitable for reinforcement learning (RL). We introduce RLRF (Reinforcement…
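
To make the idea of evaluative rendering feedback concrete, the following is a minimal Python sketch of a scalar reward that compares a rendered SVG against its target image. The excerpt above does not specify RLRF's actual reward, so everything here is an assumption for illustration: the function names (`pixel_reward`, `length_penalty`, `rendering_reward`), the mean-squared-error comparison, the code-length penalty, and the default weights are hypothetical. Rasterization is assumed to happen elsewhere; both images arrive as grayscale grids with values in [0, 1].

```python
from typing import List

# Hypothetical representation: a grayscale raster with values in [0, 1].
Image = List[List[float]]


def pixel_reward(rendered: Image, target: Image) -> float:
    """Return 1 - mean squared error, so identical images score 1.0."""
    same_shape = len(rendered) == len(target) and all(
        len(r) == len(t) for r, t in zip(rendered, target)
    )
    if not same_shape:
        raise ValueError("rendered and target must have the same shape")
    n_pixels = sum(len(row) for row in target)
    if n_pixels == 0:
        raise ValueError("images must contain at least one pixel")
    squared_error = sum(
        (p - q) ** 2
        for r_row, t_row in zip(rendered, target)
        for p, q in zip(r_row, t_row)
    )
    return 1.0 - squared_error / n_pixels


def length_penalty(svg_code: str, max_chars: int = 2000) -> float:
    """Return a value in [0, 1] that grows with SVG code length (assumed proxy for efficiency)."""
    return min(len(svg_code) / max_chars, 1.0)


def rendering_reward(
    rendered: Image,
    target: Image,
    svg_code: str,
    length_weight: float = 0.1,
) -> float:
    """Combine visual fidelity with a small penalty for long SVG code."""
    return pixel_reward(rendered, target) - length_weight * length_penalty(svg_code)
```

In an RL loop, a scalar of this kind would be computed for each sampled SVG after rendering and used as the reward signal, which sidesteps the need for a differentiable renderer. The weighting between fidelity and code length is a design choice, not something taken from the paper.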

Code & Models

Models

nllg/detikzify-v2.5-8b
model · 286 downloads · 10 likes

Taxonomy

Topics: 3D Shape Modeling and Analysis · Computer Graphics and Visualization Techniques · Human Motion and Animation