GGBench: A Geometric Generative Reasoning Benchmark for Unified Multimodal Models

Jingxuan Wei; Caijun Jia; Xi Bai; Xinglong Xu; Siyuan Li; Linzhuang Sun; Bihui Yu; Conghui He; Lijun Wu; Cheng Tan

arXiv:2511.11134 · cs.AI · January 15, 2026

TL;DR

GGBench is a new benchmark designed to evaluate the geometric generative reasoning abilities of unified multimodal models, emphasizing their capacity for integrated understanding and active construction in visual and language tasks.

Contribution

This paper introduces GGBench, a novel benchmark that specifically measures the geometric reasoning and generative capabilities of multimodal models, filling a critical evaluation gap.

Findings

01. GGBench effectively diagnoses models' reasoning and construction skills.

02. Unified multimodal models show varied performance on geometric tasks.

03. The benchmark sets a new standard for evaluating generative reasoning in AI.

Abstract

The advent of Unified Multimodal Models (UMMs) signals a paradigm shift in artificial intelligence, moving from passive perception to active, cross-modal generation. Despite their unprecedented ability to synthesize information, a critical gap persists in evaluation: existing benchmarks primarily assess discriminative understanding and unconstrained image generation separately, failing to measure the integrated cognitive process of generative reasoning. To bridge this gap, we propose that geometric construction provides an ideal testbed, as it inherently demands a fusion of language comprehension and precise visual generation. We introduce GGBench, a benchmark designed specifically to evaluate geometric generative reasoning. It provides a comprehensive framework for systematically diagnosing a model's ability not only to understand and reason but also to actively construct a solution, thereby…

Code & Models

Datasets

OpenRaiser/GGBench
dataset · 65 downloads

Taxonomy

Topics: Multimodal Machine Learning Applications · Language, Metaphor, and Cognition · Explainable Artificial Intelligence (XAI)