Vector Grimoire: Codebook-based Shape Generation under Raster Image Supervision

Moritz Feuerpfeil; Marco Cipriano; Gerard de Melo

arXiv:2410.05991·cs.CV·October 10, 2024


TL;DR

GRIMOIRE is a novel text-guided SVG generative model that learns to produce vector graphics from raster images using only image supervision, enabling more flexible and scalable vector shape generation.

Contribution

The paper introduces a raster-image-supervised approach to SVG generation that combines a visual shape quantizer with an autoregressive transformer to create vector graphics guided by natural language.

Findings

1. Outperforms previous image-supervised methods in quality
2. Works effectively on MNIST, icon, and font datasets
3. Enables scalable vector graphic generation from raster images

Abstract

Scalable Vector Graphics (SVG) is a popular format on the web and in the design industry. However, despite the great strides made in generative modeling, SVG has remained underexplored due to the discrete and complex nature of such data. We introduce GRIMOIRE, a text-guided SVG generative model that comprises two modules: a Visual Shape Quantizer (VSQ), which learns to map raster images onto a discrete codebook by reconstructing them as vector shapes, and an Auto-Regressive Transformer (ART), which models the joint probability distribution over shape tokens, positions, and textual descriptions, allowing us to generate vector graphics from natural language. Unlike existing models that require direct supervision from SVG data, GRIMOIRE learns shape image patches using only raster image supervision, which opens up vector generative modeling to significantly more data. We demonstrate the…
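
To make the idea of a "discrete codebook" and "shape tokens" more concrete, here is a minimal sketch in Python. It is not the authors' VSQ: the abstract does not say how the quantization is done, so the sketch assumes a generic nearest-neighbor lookup, one common way to build a discrete codebook. The names `quantize` and `to_token_sequence`, the toy two-dimensional embeddings, and the grid positions are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]
Position = Tuple[int, int]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(embedding: Vector, codebook: Sequence[Vector]) -> int:
    """Return the index of the codebook entry closest to `embedding`.

    Assumption: nearest-neighbor lookup stands in for whatever
    quantization scheme the paper actually uses.
    """
    return min(
        range(len(codebook)),
        key=lambda i: squared_distance(embedding, codebook[i]),
    )


def to_token_sequence(
    patches: Sequence[Tuple[Vector, Position]],
    codebook: Sequence[Vector],
) -> List[Tuple[int, Position]]:
    """Turn (patch embedding, position) pairs into (shape token, position) pairs.

    A sequence like this is the kind of discrete input an autoregressive
    model could be trained on, alongside a text description.
    """
    return [(quantize(emb, codebook), pos) for emb, pos in patches]


# Toy data (hypothetical): three codebook entries and two image patches.
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
patches = [((0.9, 0.1), (0, 0)), ((0.1, 0.8), (0, 1))]
tokens = to_token_sequence(patches, codebook)
print(tokens)  # [(1, (0, 0)), (2, (0, 1))]
```

In this toy setup each patch becomes a pair of a codebook index and a grid position, which mirrors the abstract's description of modeling shape tokens together with positions. How the real model encodes patches, decodes codes into vector shapes, and conditions on text is not covered by this sketch.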


Videos

Vector Grimoire: Codebook-based Shape Generation under Raster Image Supervision (SlidesLive)

Taxonomy

Topics: Image Retrieval and Classification Techniques · Advanced Image and Video Retrieval Techniques · Simulation and Modeling Applications

Methods: Dense Connections · Adam · Linear Layer · Residual Connection · Position-Wise Feed-Forward Layer · Attention Is All You Need · Label Smoothing · Dropout · Byte Pair Encoding · Absolute Position Encodings