RAFT-MSF++: Temporal Geometry-Motion Feature Fusion for Self-Supervised Monocular Scene Flow

Xunpei Sun, Zuoxun Hou, Yi Chang, Gang Chen, and Wei-Shi Zheng

arXiv:2604.19349 · cs.CV · April 22, 2026

TL;DR

RAFT-MSF++ is a self-supervised multi-frame method for monocular scene flow estimation. It recurrently fuses temporal features through a Geometry-Motion Feature (GMF), and it uses relative positional attention and an occlusion regularization module to improve estimates, especially in occluded regions.

Contribution

It introduces a recurrent fusion framework with Geometry-Motion Features and occlusion-aware modules for enhanced temporal reasoning in monocular scene flow estimation.

Findings

1. Achieves 24.14% SF-all on the KITTI Scene Flow benchmark.

2. Improves on the baseline by 30.99%.

3. Demonstrates robustness in occluded regions.
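
For readers unfamiliar with the metric: on KITTI, SF-all is commonly computed as the percentage of evaluated pixels that are outliers in at least one of the two disparity maps or the optical flow, where an outlier has an error above both 3 px and 5% of the ground-truth magnitude (so lower is better). The following is a minimal sketch of that convention in plain Python, not the benchmark's official evaluation code. All function names are hypothetical, and inputs are assumed to be flat per-pixel lists restricted to pixels with valid ground truth. The last helper assumes the reported improvement is a relative reduction in SF-all, which the summary above does not state explicitly.

```python
import math


def is_outlier(err, gt_mag, abs_thresh=3.0, rel_thresh=0.05):
    """A pixel is an outlier if its error exceeds BOTH 3 px and 5% of |gt|."""
    return err > abs_thresh and err > rel_thresh * gt_mag


def disparity_outliers(pred, gt):
    """Per-pixel outlier flags for a disparity map given as flat lists."""
    return [is_outlier(abs(p - g), abs(g)) for p, g in zip(pred, gt)]


def flow_outliers(pred, gt):
    """Per-pixel outlier flags for optical flow given as lists of (u, v)."""
    return [
        is_outlier(math.hypot(pu - gu, pv - gv), math.hypot(gu, gv))
        for (pu, pv), (gu, gv) in zip(pred, gt)
    ]


def sf_all_percent(d1_bad, d2_bad, fl_bad):
    """Scene-flow outlier rate: a pixel counts if ANY component is an outlier."""
    bad = [a or b or c for a, b, c in zip(d1_bad, d2_bad, fl_bad)]
    return 100.0 * sum(bad) / len(bad)


def implied_baseline(ours, relative_improvement_percent):
    """Baseline error implied by `ours`, ASSUMING a relative error reduction."""
    return ours / (1.0 - relative_improvement_percent / 100.0)
```

Under the relative-reduction assumption, `implied_baseline(24.14, 30.99)` gives the SF-all that the baseline would need to have for both reported figures to be consistent; it is a sanity check on the arithmetic, not a number taken from the paper.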

Abstract

Monocular scene flow estimation aims to recover dense 3D motion from image sequences, yet most existing methods are limited to two-frame inputs, restricting temporal modeling and robustness to occlusions. We propose RAFT-MSF++, a self-supervised multi-frame framework that recurrently fuses temporal features to jointly estimate depth and scene flow. Central to our approach is the Geometry-Motion Feature (GMF), which compactly encodes coupled motion and geometry cues and is iteratively updated for effective temporal reasoning. To ensure the robustness of this temporal fusion against occlusions, we incorporate relative positional attention to inject spatial priors and an occlusion regularization module to propagate reliable motion from visible regions. These components enable the GMF to effectively propagate information even in ambiguous areas. Extensive experiments show that RAFT-MSF++…
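
The abstract does not give the formulation of its relative positional attention, so the following is only a generic illustration of the idea of injecting a spatial prior into attention: a bias that depends on the offset between query and key positions is added to the attention logits. This is a minimal single-head sketch over a 1D sequence in plain Python; the function names, the dictionary-based `rel_bias`, and the 1D setting are all assumptions made for illustration (image features would use 2D offsets), and it should not be read as the authors' module.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def relative_position_attention(queries, keys, values, rel_bias):
    """Scaled dot-product attention with an additive relative-position bias.

    queries, keys, values: lists of equal length holding feature vectors.
    rel_bias: hypothetical lookup from offset (j - i) to a scalar bias;
              offsets missing from the dict get a bias of 0.
    """
    scale = 1.0 / math.sqrt(len(queries[0]))
    channels = len(values[0])
    outputs = []
    for i, q in enumerate(queries):
        # Content term (q . k) plus a prior that depends only on the offset.
        logits = [
            scale * sum(a * b for a, b in zip(q, k)) + rel_bias.get(j - i, 0.0)
            for j, k in enumerate(keys)
        ]
        weights = softmax(logits)
        outputs.append(
            [sum(w * v[c] for w, v in zip(weights, values)) for c in range(channels)]
        )
    return outputs
```

With uninformative content features, the bias alone decides where each position looks: a large bias at offset 0 makes every position keep its own value, while an empty `rel_bias` averages over all positions.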

Code & Models

Repositories

sunzunyi/RAFT-MSF-PlusPlus (GitHub)
