CurveBench: A Benchmark for Exact Topological Reasoning over Nested Jordan Curves

Amirreza Mohseni; Mona Mohammadi; Morteza Saghafian; Naser Talebizadeh Sardari

arXiv:2605.14068·cs.CV·May 19, 2026

1 Repo 3 Models 3 Datasets

TL;DR

CurveBench is a new benchmark dataset for hierarchical topological reasoning over nested Jordan curves, highlighting the difficulty of exact topological understanding in visual models.

Contribution

The paper introduces CurveBench, a dataset of 756 images of pairwise non-intersecting Jordan curves, together with a structured prediction task: recover the rooted containment tree induced by the curves in an image. It also evaluates how current models perform on this task.
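To make the target structure concrete, here is a minimal sketch of how a containment tree could be derived when the curves are already available as polygons. This is an illustration only and not the paper's annotation pipeline: the function names, the parent-array representation, and the convention that `None` stands for the unbounded outer region (the root) are all assumptions. It relies on the curves being pairwise non-intersecting, so one vertex of a curve lies inside another curve exactly when the whole curve does.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Polygon = List[Point]  # hypothetical representation: a closed curve as a vertex list


def polygon_area(poly: Polygon) -> float:
    """Absolute area via the shoelace formula."""
    edges = zip(poly, poly[1:] + poly[:1])
    return abs(0.5 * sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in edges))


def point_in_polygon(p: Point, poly: Polygon) -> bool:
    """Ray casting: count crossings of a ray going right from p."""
    x, y = p
    inside = False
    for (x1, y1), (x2, y2) in zip(poly, poly[1:] + poly[:1]):
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal line through p
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def containment_parents(curves: List[Polygon]) -> List[Optional[int]]:
    """parents[i] is the smallest curve enclosing curve i, or None for the root."""
    areas = [polygon_area(c) for c in curves]
    parents: List[Optional[int]] = []
    for i, curve in enumerate(curves):
        # Assumption: curves are disjoint, so testing one vertex is enough.
        enclosing = [j for j, other in enumerate(curves)
                     if j != i and point_in_polygon(curve[0], other)]
        # Enclosing curves are nested, so the smallest one is the direct parent.
        parents.append(min(enclosing, key=lambda j: areas[j]) if enclosing else None)
    return parents


def square(lo: float, hi: float) -> Polygon:
    return [(lo, lo), (hi, lo), (hi, hi), (lo, hi)]


curves = [square(0, 10), square(2, 8), square(3, 4), square(20, 30)]
parents = containment_parents(curves)
print(parents)
```

In this toy input, three squares are nested inside one another and a fourth sits apart, so the root has two children and one of them heads a chain of depth three. The benchmark task is harder because a model sees only the rendered image, not the vertex lists.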

Findings

01

The strongest evaluated model, Gemini 3.1 Pro, achieves only 71.1% tree-generation accuracy on CurveBench-Easy and 19.1% on CurveBench-Hard.

02

RLVR-style fine-tuning of open-weight vision-language models significantly improves their accuracy, with the fine-tuned models surpassing some large language models.

03

A large gap remains between current models' capabilities and exact topological reasoning.

Abstract

We introduce CurveBench, a benchmark for hierarchical topological reasoning from visual input. CurveBench consists of 756 images of pairwise non-intersecting Jordan curves across easy, polygonal, topographic-inspired, maze-like, and dense counting configurations. Each image is annotated with a rooted tree encoding the containment relations between planar regions. We formulate the task as structured prediction: given an image, a model must recover the full rooted containment tree induced by the curves. Despite the visual simplicity of the task, the strongest evaluated model, Gemini 3.1 Pro, achieves only 71.1% tree-generation accuracy on CurveBench-Easy and 19.1% on CurveBench-Hard. We further demonstrate benchmark utility through RLVR-style fine-tuning of open-weight vision-language models. Our trained Qwen3-VL-8B model improves over…
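The abstract reports tree-generation accuracy but does not spell out the metric in the text available here. One plausible reading, assumed purely for illustration, is an exact match between predicted and annotated trees up to relabeling of nodes and reordering of siblings. The sketch below checks that with a canonical parenthesis encoding of unordered rooted trees; the parent-array format and the function names are hypothetical and not taken from the CurveBench repository.

```python
from typing import Dict, List, Optional


def canonical_form(parents: List[Optional[int]]) -> str:
    """Encode an unordered rooted tree as a canonical parenthesis string.

    parents[i] is the parent of node i, or None if node i hangs off the
    root (assumed here to be the unbounded outer region).
    """
    children: Dict[Optional[int], List[int]] = {}
    for node, parent in enumerate(parents):
        children.setdefault(parent, []).append(node)

    def encode(node: Optional[int]) -> str:
        # Sorting child encodings makes the result independent of sibling
        # order and of node labels.
        return "(" + "".join(sorted(encode(c) for c in children.get(node, []))) + ")"

    return encode(None)


def trees_match(pred: List[Optional[int]], gold: List[Optional[int]]) -> bool:
    """Exact match up to isomorphism of unordered rooted trees (an assumption)."""
    return canonical_form(pred) == canonical_form(gold)


def exact_match_accuracy(preds: List[List[Optional[int]]],
                         golds: List[List[Optional[int]]]) -> float:
    """Fraction of examples whose predicted tree matches the annotation."""
    hits = sum(trees_match(p, g) for p, g in zip(preds, golds))
    return hits / len(golds)


gold = [None, 0, 1, None]        # a chain of three curves plus one separate curve
relabeled = [None, None, 3, 0]   # same shape, different node labels
wrong = [None, 0, 0, None]       # two sibling curves inside one, plus a separate curve

print(trees_match(relabeled, gold), trees_match(wrong, gold))
```

Under this reading a prediction gets no partial credit: a single misplaced curve changes the canonical string and counts as a miss, which would be consistent with accuracy falling sharply on denser configurations.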

Code & Models

Repositories

amir-mohseni/CurveBench (GitHub)

Models

Datasets
