Topo-R1: Detecting Topological Anomalies via Vision-Language Models
Meilong Xu, Qingqiao Hu, Xiaoling Hu, Shahira Abousamra, Xin Yu, Weimin Lyu, Kehan Qi, Dimitris Samaras, Chao Chen

TL;DR
This paper evaluates vision-language models' ability to perceive topological anomalies in tubular structures, finds that current models perform nearly at random, and introduces a new benchmark together with Topo-R1, a method that significantly improves topological understanding.
Contribution
The paper builds the first large-scale benchmark for topological anomaly detection in segmentation masks and proposes Topo-R1, a reinforcement learning approach that enhances topological perception.
Findings
Current VLMs perform nearly at random on topological tasks.
The benchmark enables systematic evaluation of topological perception.
Topo-R1 outperforms general-purpose VLMs and matches supervised methods.
Abstract
Topology is critical in tubular structures such as blood vessels, nerve fibers, and road networks, where connectivity and loop structure govern downstream functional analysis. Vision-Language Models (VLMs) are promising candidates for understanding such structures, given their reasoning and grounding capabilities. To probe their topological perception, we systematically evaluate leading closed- and open-source VLMs on localizing and classifying four canonical topological anomalies (broken/spurious connections, missing/extra branches) in tubular-network segmentation masks. They perform nearly at random, indicating that topology-aware perception is largely absent from current general-purpose VLMs. As no existing resource pairs segmentation masks with localized anomaly annotations, we build an automated, multi-domain data-curation pipeline that synthesizes diverse topological perturbations…
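To make the notion of a topological anomaly concrete, here is a minimal sketch of how a "broken connection" changes the connectivity of a segmentation mask. It is illustrative only and is not the paper's curation pipeline or the Topo-R1 method: the toy mask, the single-pixel perturbation, and the helper name `count_components` are all assumptions introduced for this example. It counts 8-connected foreground components with a plain breadth-first search.

```python
from collections import deque


def count_components(mask):
    """Count 8-connected foreground components in a binary mask (list of lists).

    Hypothetical helper for illustration; not taken from the paper.
    """
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not seen[r][c]:
                # New component found: flood-fill it with BFS.
                count += 1
                seen[r][c] = True
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < rows and 0 <= nx < cols
                                    and mask[ny][nx] and not seen[ny][nx]):
                                seen[ny][nx] = True
                                queue.append((ny, nx))
    return count


# Toy "vessel": one horizontal tube forming a single connected component.
clean = [
    [0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 0],
]

# Assumed "broken connection" perturbation: erase one pixel in the middle.
broken = [row[:] for row in clean]
broken[1][3] = 0

print(count_components(clean), count_components(broken))  # 1 2
```

The two masks differ by a single pixel, so pixel-overlap scores barely change, yet the component count goes from 1 to 2. This is the kind of difference the four anomaly types describe: small in area, but large in its effect on connectivity.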
