VectorEdits: A Dataset and Benchmark for Instruction-Based Editing of Vector Graphics
Josef Kuchař, Marek Kadlčík, Michal Spiegel, Michal Štefánik

TL;DR
This paper presents VectorEdits, a large dataset for instruction-based vector graphic editing, highlighting the challenges current models face in accurately modifying SVG images based on natural language commands.
Contribution
The creation of a comprehensive dataset with over 270,000 SVG-image pairs and instructions, enabling research on instruction-guided vector graphic editing.
Findings
State-of-the-art large language models struggle to produce accurate edits.
Current methods often produce invalid vector graphics.
The dataset facilitates future research in natural language-driven editing.
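The validity finding above can be made concrete with a simple check. The following is a minimal sketch, not the paper's evaluation procedure: it assumes that "valid" means at least "parses as XML with an `svg` root element", which is a weaker condition than rendering correctly. The function name `is_well_formed_svg` is hypothetical.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def is_well_formed_svg(text: str) -> bool:
    """Return True if `text` parses as XML and its root element is <svg>.

    Assumption: this is only a well-formedness check. It does not verify
    that attributes, paths, or references inside the SVG are meaningful.
    """
    try:
        root = ET.fromstring(text)
    except ET.ParseError:
        return False
    # Accept both namespaced and un-namespaced <svg> roots.
    return root.tag in (f"{{{SVG_NS}}}svg", "svg")
```

A check like this could be run on each model output before any similarity-based scoring, so that unparseable outputs are counted separately from inaccurate ones.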
Abstract
We introduce a large-scale dataset for instruction-guided vector image editing, consisting of over 270,000 pairs of SVG images with accompanying natural language edit instructions. Our dataset enables training and evaluation of models that modify vector graphics based on textual commands. We describe the data collection process, including image pairing via CLIP similarity and instruction generation with vision-language models. Initial experiments with state-of-the-art large language models reveal that current methods struggle to produce accurate and valid edits, underscoring the challenge of this task. To foster research in natural language-driven vector graphic generation and editing, we make the resources created in this work publicly available.
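The pairing step mentioned in the abstract might look roughly like the sketch below. It is an illustration under stated assumptions, not the authors' pipeline: the embeddings are assumed to be precomputed CLIP image vectors, and the function `pair_by_similarity` and the threshold `min_sim` are hypothetical names and values chosen for this example.

```python
import math
from typing import Dict, List, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def pair_by_similarity(
    embeddings: Dict[str, List[float]], min_sim: float
) -> List[Tuple[str, str, float]]:
    """Return all unordered image pairs whose similarity is at least `min_sim`.

    Assumption: `embeddings` maps an image id to a precomputed embedding.
    The quadratic loop is fine for a demo but not for a large collection.
    """
    names = sorted(embeddings)
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            sim = cosine(embeddings[first], embeddings[second])
            if sim >= min_sim:
                pairs.append((first, second, sim))
    return pairs


# Toy vectors standing in for real image embeddings.
toy = {"a": [1.0, 0.0], "b": [1.0, 0.1], "c": [0.0, 1.0]}
pairs = pair_by_similarity(toy, min_sim=0.9)
```

With the toy vectors, only `a` and `b` point in nearly the same direction, so they form the single candidate pair; each such pair would then be passed to a vision-language model to generate the edit instruction.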
Taxonomy
Topics: Multimodal Machine Learning Applications · Natural Language Processing Techniques · Handwritten Text Recognition Techniques
