Speed3R: Sparse Feed-forward 3D Reconstruction Models

Weining Ren; Xiao Tan; Kai Han

arXiv:2603.08055 · cs.CV · March 10, 2026

TL;DR

Speed3R is a feed-forward 3D reconstruction model that replaces dense attention with sparse attention inspired by keypoint matching, achieving a 12.4x inference speedup on 1000-view sequences with minimal accuracy loss.

Contribution

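Inspired by Structure-from-Motion, where a sparse set of keypoints suffices for robust pose estimation, the paper introduces a dual-branch attention mechanism: a compression branch builds a coarse contextual prior that guides a selection branch, which applies fine-grained attention only to the most informative image tokens. This avoids the quadratic cost of dense attention.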
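To make the complexity argument concrete, the sketch below counts query-key pairs under dense attention and under a hypothetical compress-then-select scheme. The function names and the parameters `block_size` and `top_k` are illustrative assumptions, not values from the paper, and a pair count is only a rough proxy for wall-clock time.

```python
def dense_pairs(num_tokens: int) -> int:
    """Query-key pairs scored by dense self-attention (quadratic in tokens)."""
    return num_tokens * num_tokens


def sparse_pairs(num_tokens: int, block_size: int, top_k: int) -> int:
    """Pairs scored by a hypothetical compress-then-select scheme.

    Assumption: keys are grouped into fixed-size blocks, every query scores
    one summary per block, then attends to the tokens of its top_k blocks.
    """
    num_blocks = -(-num_tokens // block_size)  # ceiling division
    coarse = num_tokens * num_blocks  # each query vs. every block summary
    fine = num_tokens * min(top_k, num_blocks) * block_size  # chosen blocks only
    return coarse + fine


# Illustrative numbers only (not taken from the paper).
n_dense = dense_pairs(10_000)
n_sparse = sparse_pairs(10_000, block_size=100, top_k=4)
print(n_dense, n_sparse, n_dense / n_sparse)  # 100000000 5000000 20.0
```

The ratio here depends entirely on the assumed block size and `top_k`; it should not be read as a reproduction of the reported 12.4x figure.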

Findings

1. 12.4x inference speedup on 1000-view sequences
2. Minimal trade-off in geometric accuracy
3. High-quality reconstructions with reduced computational cost

Abstract

While recent feed-forward 3D reconstruction models accelerate 3D reconstruction by jointly inferring dense geometry and camera poses in a single pass, their reliance on dense attention imposes a quadratic complexity, creating a prohibitive computational bottleneck that severely limits inference speed. To resolve this, we introduce Speed3R, an end-to-end trainable model inspired by the core principle of Structure-from-Motion: that a sparse set of keypoints is sufficient for robust pose estimation. Speed3R features a dual-branch attention mechanism where a compression branch creates a coarse contextual prior to guide a selection branch, which performs fine-grained attention only on the most informative image tokens. This strategy mimics the efficiency of traditional keypoint matching, achieving a remarkable 12.4x inference speedup on 1000-view sequences, while introducing a minimal,…
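The dual-branch idea described in the abstract can be illustrated with a minimal, dependency-free sketch. Everything specific here is an assumption made for illustration: mean pooling as the compression step, fixed-size blocks, a per-query top-k block choice, and the names `dual_branch_attention`, `block_size`, and `top_k`. It is not the authors' implementation.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention for one query over the given keys/values."""
    scale = 1.0 / math.sqrt(len(query))
    weights = softmax([dot(query, k) * scale for k in keys])
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]


def mean_pool(vectors):
    """Average a block of vectors into one summary vector."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def dual_branch_attention(queries, keys, values, block_size, top_k):
    """Hypothetical compress-then-select attention (illustrative only)."""
    blocks = [range(i, min(i + block_size, len(keys)))
              for i in range(0, len(keys), block_size)]
    # Compression branch (assumed): one mean-pooled key per block
    # serves as a coarse prior over where to look.
    block_keys = [mean_pool([keys[j] for j in blk]) for blk in blocks]
    outputs, selections = [], []
    for q in queries:
        coarse = [dot(q, bk) for bk in block_keys]
        # Selection branch (assumed): keep the top_k highest-scoring blocks
        # and run fine-grained attention only over their tokens.
        chosen = sorted(range(len(blocks)), key=lambda b: coarse[b],
                        reverse=True)[:top_k]
        idx = [j for b in sorted(chosen) for j in blocks[b]]
        outputs.append(attend(q, [keys[j] for j in idx],
                              [values[j] for j in idx]))
        selections.append(sorted(chosen))
    return outputs, selections


# Usage on random toy tokens: 32 tokens of dimension 8, 8 blocks of 4.
random.seed(0)
tokens = [[random.gauss(0, 1) for _ in range(8)] for _ in range(32)]
out, sel = dual_branch_attention(tokens, tokens, tokens, block_size=4, top_k=2)
```

With `top_k` equal to the number of blocks, every token is selected and the sketch reduces to ordinary dense attention, which gives a simple sanity check on the selection logic.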

Code & Models

Models

Hugging Face: weining17/Speed3R_Pi3 (model, 149 downloads, 1 like)

Taxonomy

Topics: Advanced Vision and Imaging · Robotics and Sensor-Based Localization · 3D Shape Modeling and Analysis