VGBench: Evaluating Large Language Models on Vector Graphics Understanding and Generation
Bocheng Zou, Mu Cai, Jianrui Zhang, Yong Jae Lee

TL;DR
VGBench is a comprehensive benchmark for evaluating how well large language models understand and generate vector graphics. It covers multiple vector graphics formats, question types, and prompting techniques, and offers insight into the models' strengths and limitations.
Contribution
This work introduces VGBench, the first extensive benchmark for LLMs on vector graphics. It spans diverse formats, tasks, and evaluation methods, addressing the gap left by prior work that focused solely on qualitative results, understanding, or a single type of vector graphics.
Findings
LLMs show strong capability on both vector graphics understanding and generation.
Performance is weaker on low-level formats such as SVG.
The benchmark and data will be open-sourced for community use.
Abstract
In the realm of vision models, the primary mode of representation is using pixels to rasterize the visual world. Yet this is not always the best or unique way to represent visual content, especially for designers and artists who depict the world using geometry primitives such as polygons. Vector graphics (VG), on the other hand, offer a textual representation of visual content, which can be more concise and powerful for content like cartoons, sketches and scientific figures. Recent studies have shown promising results on processing vector graphics with capable Large Language Models (LLMs). However, such works focus solely on qualitative results, understanding, or a specific type of vector graphics. We propose VGBench, a comprehensive benchmark for LLMs on handling vector graphics through diverse aspects, including (a) both visual understanding and generation, (b) evaluation of various…
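To make the idea of a "textual representation of visual content" concrete, here is a minimal sketch of how a small SVG image can be read as plain text and broken into its geometric primitives. The SVG string and the `list_primitives` helper are hypothetical illustrations, not part of VGBench or its evaluation pipeline.

```python
import xml.etree.ElementTree as ET

# Hypothetical example image: a triangle and a circle described as text.
SVG_TEXT = """<svg xmlns="http://www.w3.org/2000/svg" width="100" height="100">
  <polygon points="10,90 50,10 90,90" fill="orange"/>
  <circle cx="50" cy="60" r="12" fill="white"/>
</svg>"""


def list_primitives(svg_text):
    """Return (tag, attributes) pairs for each shape element in an SVG string."""
    root = ET.fromstring(svg_text)
    shapes = []
    for elem in root.iter():
        tag = elem.tag.split("}")[-1]  # drop the XML namespace prefix
        if tag != "svg":  # skip the root container, keep only shapes
            shapes.append((tag, dict(elem.attrib)))
    return shapes


primitives = list_primitives(SVG_TEXT)
for tag, attrs in primitives:
    print(tag, attrs)
```

The same string that a renderer turns into pixels can be passed to an LLM as ordinary text, which is what makes vector graphics a natural input and output format for language models.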
Taxonomy
Topics: Handwritten Text Recognition Techniques · Data Visualization and Analytics · Natural Language Processing Techniques
