Reliable Reasoning in SVG-LLMs via Multi-Task Multi-Reward Reinforcement Learning

Haomin Wang; Qi Wei; Qianli Ma; Shengyuan Ding; Jinhui Yin; Kai Chen; Hongjie Zhang

arXiv:2603.16189 · cs.CV · March 18, 2026


TL;DR

This paper introduces CTRL-S, a reinforcement learning framework with a chain-of-thought mechanism and multi-reward optimization, significantly improving SVG generation quality and reasoning in vision-language models.

Contribution

The paper presents a novel multi-task, multi-reward reinforcement learning approach with a chain-of-thought mechanism for SVG generation, supported by a new high-quality SVG dataset.

Findings

01 CTRL-S achieves higher task success rates than existing methods.

02 It improves the structure of the generated SVG code and its visual fidelity.

03 It strengthens reasoning and generalization in SVG generation.

Abstract

With the rapid advancement of vision-language models, an increasing number of studies have explored their potential for SVG generation tasks. Although existing approaches improve performance by constructing large-scale SVG datasets and introducing SVG-specific tokens, they still suffer from limited generalization, redundant paths in code outputs, and a lack of explicit reasoning. In this work, we present CTRL-S (Chain-of-Thought Reinforcement Learning for SVG), a unified framework that introduces a chain-of-thought mechanism to explicitly expose the model's reasoning process during SVG generation. To support this structured reasoning, we construct SVG-Sophia, a high-quality dataset containing 145K samples across SVG code refinement, Text-to-SVG, and Image-to-SVG tasks. By training the model to generate group-level structured SVG code, CTRL-S significantly improves structural coherence…
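To make "group-level structured SVG code" concrete, here is a minimal sketch of how one might inspect an SVG's top-level structure. It is purely illustrative: the abstract shown here does not specify how CTRL-S scores structure or which rewards it uses, so the function `group_stats`, the list of shape tags, and the sample SVG are all assumptions made for this example, not the authors' method.

```python
import xml.etree.ElementTree as ET

# SVG elements live in this XML namespace; ElementTree prefixes tags with it.
SVG_NS = "{http://www.w3.org/2000/svg}"

# Hypothetical choice of "shape" elements to count when they sit outside any group.
SHAPE_TAGS = {
    SVG_NS + name
    for name in ("path", "rect", "circle", "ellipse", "line", "polygon", "polyline")
}


def group_stats(svg_text):
    """Return (is_well_formed, n_top_level_groups, n_ungrouped_shapes).

    Illustrative helper only: a structural check like this could feed a
    format-style signal, but the paper's actual rewards are not given here.
    """
    try:
        root = ET.fromstring(svg_text)
    except ET.ParseError:
        return False, 0, 0
    if root.tag != SVG_NS + "svg":
        return False, 0, 0
    n_groups = sum(1 for child in root if child.tag == SVG_NS + "g")
    n_ungrouped = sum(1 for child in root if child.tag in SHAPE_TAGS)
    return True, n_groups, n_ungrouped


# Made-up sample: two semantic groups plus one stray path at the top level.
example = """<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 100 100">
  <g id="sun"><circle cx="50" cy="30" r="12" fill="gold"/></g>
  <g id="ground"><rect x="0" y="70" width="100" height="30" fill="green"/></g>
  <path d="M0 0 L10 10"/>
</svg>"""

print(group_stats(example))
```

In this sketch, an SVG whose shapes are all wrapped in named `<g>` elements would report zero ungrouped shapes, which is one plausible reading of "group-level structure"; other definitions are equally possible.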

Code & Models

Datasets

InternSVG/SVG-Sophia
dataset · 113 dl

Taxonomy

Topics: Multimodal Machine Learning Applications · Natural Language Processing Techniques · Hate Speech and Cyberbullying Detection