FASTopoWM: Fast-Slow Lane Segment Topology Reasoning with Latent World Models

Yiming Yang; Hongbin Lin; Yueru Luo; Suzhong Fu; Chao Zheng; Xinrui Yan; Shuqi Mei; Kun Tang; Shuguang Cui; Zhen Li

arXiv:2507.23325·cs.CV·November 13, 2025

Open Access · 3 Reviews

TL;DR

FASTopoWM is a novel framework that enhances lane topology reasoning in autonomous driving by leveraging fast-slow systems and latent world models, significantly improving detection and perception accuracy over state-of-the-art methods.

Contribution

It introduces a unified fast-slow lane reasoning framework with latent world models, enabling parallel supervision and better temporal perception in autonomous driving.

Findings

01. Outperforms state-of-the-art in lane segment detection (37.4% vs. 33.6% mAP).
02. Improves centerline perception accuracy (46.3% vs. 41.5% OLS).
03. Enhances temporal perception robustness in lane topology reasoning.
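To make the size of the reported improvements concrete, the short Python sketch below computes the absolute gain (in percentage points) and the relative gain for the two metric pairs listed above. Only the four scores come from this page; the dictionary and function names are hypothetical and chosen for illustration.

```python
# Scores copied from the Findings list: (FASTopoWM, compared state of the art), in percent.
REPORTED = {
    "lane segment detection (mAP)": (37.4, 33.6),
    "centerline perception (OLS)": (46.3, 41.5),
}


def gains(ours, baseline):
    """Return (absolute gain in points, relative gain in percent), each rounded to 1 decimal."""
    absolute = ours - baseline
    relative = 100.0 * absolute / baseline
    return round(absolute, 1), round(relative, 1)


for name, (ours, baseline) in REPORTED.items():
    absolute, relative = gains(ours, baseline)
    print(f"{name}: +{absolute} points ({relative}% relative)")
```

Both gains are a few percentage points in absolute terms, which works out to a relative improvement of roughly a tenth over the compared baseline in each case.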

Abstract

Lane segment topology reasoning provides comprehensive bird's-eye view (BEV) road scene understanding, which can serve as a key perception module in planning-oriented end-to-end autonomous driving systems. Existing lane topology reasoning methods often fall short in effectively leveraging temporal information to enhance detection and reasoning performance. Recently, a stream-based temporal propagation method has demonstrated promising results by incorporating temporal cues at both the query and BEV levels. However, it remains limited by over-reliance on historical queries, vulnerability to pose estimation failures, and insufficient temporal propagation. To overcome these limitations, we propose FASTopoWM, a novel fast-slow lane segment topology reasoning framework augmented with latent world models. To reduce the impact of pose estimation failures, this unified framework enables parallel…
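The abstract does not spell out how the fast and slow systems interact, so the following Python sketch shows only one plausible reading, under loudly stated assumptions: a slow path that fuses the current frame with propagated history, and a fast path that uses the current frame alone whenever history is missing or the pose estimate is unusable. The `Frame` type, the `pose_ok` flag, the fixed blending weight, and all function names are hypothetical; the weighted average is a stand-in for learned temporal fusion and is not the paper's architecture.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Frame:
    features: List[float]  # stand-in for current-frame query/BEV features
    pose_ok: bool          # hypothetical flag: is the ego-pose estimate usable?


def fast_path(frame: Frame) -> List[float]:
    """Fast system (assumed): rely on the current frame only."""
    return list(frame.features)


def slow_path(frame: Frame, memory: List[float], weight: float = 0.5) -> List[float]:
    """Slow system (assumed): blend current features with propagated history.

    The weighted average is a placeholder for a learned fusion module.
    """
    return [(1.0 - weight) * cur + weight * past
            for cur, past in zip(frame.features, memory)]


def step(frame: Frame, memory: Optional[List[float]]) -> List[float]:
    """Use the slow path when history and pose are available, else the fast path."""
    if memory is not None and frame.pose_ok:
        return slow_path(frame, memory)
    return fast_path(frame)


def run_stream(frames: List[Frame]) -> List[List[float]]:
    """Process frames in order, carrying each output forward as the next memory."""
    outputs: List[List[float]] = []
    memory: Optional[List[float]] = None
    for frame in frames:
        memory = step(frame, memory)
        outputs.append(memory)
    return outputs
```

With this toy setup, a frame whose pose is flagged as unusable is processed without any historical input, which illustrates why a history-free path limits the damage of a pose estimation failure.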

Peer Reviews

Decision·Submitted to ICLR 2026

Reviewer 01 · Rating 2 · Confidence 4

Strengths

- This work provides a much better way to incorporate historical information for lane topology reasoning than StreamMapNet.
- FASTopoWM achieves SOTA performance on OpenLane-V2 with 37.4 mAP (+3.8 over the previous SOTA baseline, Topo2Seq) while maintaining an acceptable latency (11.4).
- The ablations in Table 3 demonstrate the role of each component of FASTopoWM. The experiments are comprehensive overall.

Weaknesses

- The paper's primary weakness is a disconnect between its terminology and the methods described. Certain concepts, like "world models" and the "fast-slow system," feel overstated.
- For example, the "world model" is a two-layer transformer that predicts features for the next timestep. This is a much simpler implementation than what is typically understood by the term.
- Similarly, the proposed "fast-slow system" primarily relies on the slow system's output. The fast system only acts as a fallba

Reviewer 02 · Rating 2 · Confidence 4

Strengths

- The paper is well written and clearly organized, and figures are effectively created to support the content.
- Experimental results verify the effectiveness of the proposed approach against several baselines.

Weaknesses

**Major Weaknesses**

1. Substantial overlap with TopoStreamer[1] in problem framing and technical pipeline, without the necessary systematic comparison or discussion. TopoStreamer also targets temporal lane segment topology reasoning and essentially adopts the same or highly similar streaming mechanisms and supervisory objectives. Yet the manuscript provides no systematic analysis of differences or pros/cons. This is a serious issue by ICLR standards: when facing the most directly related, rece

Reviewer 03 · Rating 6 · Confidence 4

Strengths

1. The parallel supervision of historical and current features via the latent world model, used to improve feature learning, is novel and interesting.
2. The individual design choices in the framework are reasonable for the stated goal, and corresponding ablations validate their effectiveness.

Weaknesses

1. Experiments supporting three claims in the Introduction are missing: (1) over-reliance on historical queries, (2) vulnerability to pose estimation failures, and (3) weak temporal propagation. I would like to see some experiments, an analysis, or even some cues on how the authors identified these problems.
2. As with point 1, it is still unclear to me why the “slow-fast system” can solve these problems fundamentally. Some evidence or statistics should be given.
3. The definition of the Slow

Taxonomy

Topics: Autonomous Vehicle Technology and Safety · Robotic Path Planning Algorithms · Vehicle License Plate Recognition