VGBench: Evaluating Large Language Models on Vector Graphics Understanding and Generation

Bocheng Zou; Mu Cai; Jianrui Zhang; Yong Jae Lee

arXiv:2407.10972·cs.CV·August 30, 2024

TL;DR

VGBench is a comprehensive benchmark designed to evaluate large language models' abilities in understanding and generating vector graphics, covering various formats, question types, and prompting techniques, with insights into their strengths and limitations.

Contribution

This work introduces VGBench, the first extensive benchmark for LLMs on vector graphics, encompassing diverse formats, tasks, and evaluation methods, filling a gap in current research.

Findings

01 LLMs perform well on understanding and generation tasks.

02 Performance drops on low-level formats like SVG.

03 Benchmark and data will be open-sourced for community use.

Abstract

In the realm of vision models, the primary mode of representation is using pixels to rasterize the visual world. Yet this is not always the best or unique way to represent visual content, especially for designers and artists who depict the world using geometry primitives such as polygons. Vector graphics (VG), on the other hand, offer a textual representation of visual content, which can be more concise and powerful for content like cartoons, sketches and scientific figures. Recent studies have shown promising results on processing vector graphics with capable Large Language Models (LLMs). However, such works focus solely on qualitative results, understanding, or a specific type of vector graphics. We propose VGBench, a comprehensive benchmark for LLMs on handling vector graphics through diverse aspects, including (a) both visual understanding and generation, (b) evaluation of various…
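To illustrate the abstract's point that vector graphics are a textual representation, here is a minimal sketch in Python. It is not taken from the VGBench code: the SVG drawing, the helper names `list_primitives` and `build_question_prompt`, and the multiple-choice prompt layout are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical example drawing: a vector graphic is just text.
SVG_SOURCE = (
    '<svg xmlns="http://www.w3.org/2000/svg" width="100" height="100">'
    '<circle cx="50" cy="50" r="40" fill="red"/>'
    '<rect x="10" y="10" width="20" height="20" fill="blue"/>'
    '</svg>'
)


def list_primitives(svg_text):
    """Return (tag, fill) pairs for each top-level primitive in the SVG."""
    root = ET.fromstring(svg_text)
    primitives = []
    for element in root:
        # ElementTree prefixes tags with the namespace, e.g. "{ns}circle".
        tag = element.tag.split("}")[-1]
        primitives.append((tag, element.get("fill")))
    return primitives


def build_question_prompt(svg_text, question, options):
    """Format a multiple-choice question about the SVG as a text prompt.

    The layout is an assumed illustration, not the benchmark's own format.
    """
    lines = ["Here is a vector graphic in SVG format:", svg_text, "", question]
    for label, option in zip("ABCD", options):
        lines.append(f"{label}. {option}")
    lines.append("Answer with a single letter.")
    return "\n".join(lines)


if __name__ == "__main__":
    print(list_primitives(SVG_SOURCE))
    print(build_question_prompt(
        SVG_SOURCE,
        "What color is the circle?",
        ["red", "blue", "green", "black"],
    ))
```

Because the drawing is plain text, the same string can be parsed by a program or passed directly to a language model inside a prompt, with no rasterization step.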

Code & Models

Repositories

vgbench/VGBench (official)

Datasets

vgbench/VGQA (dataset, 22 downloads)

Taxonomy

Topics: Handwritten Text Recognition Techniques · Data Visualization and Analytics · Natural Language Processing Techniques