MERG3R: A Divide-and-Conquer Approach to Large-Scale Neural Visual Geometry

Leo Kaixuan Cheng; Abdus Shaikh; Ruofan Liang; Zhijie Wu; Yushi Guan; Nandita Vijaykumar

arXiv:2603.02351 · cs.CV · March 4, 2026

TL;DR

MERG3R is a divide-and-conquer framework that lets neural visual geometry models perform scalable, memory-efficient 3D reconstruction from large, unordered image collections without retraining.

Contribution

The paper introduces a training-free, model-agnostic method that partitions images into overlapping subsets and merges the resulting local reconstructions, overcoming GPU memory limits in neural geometry models.
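
As a rough illustration of the partitioning idea, the sketch below splits an already ordered image list into fixed-size chunks that share a few images with their neighbours, so that adjacent local reconstructions have common views to align on. It is not the paper's procedure: the function name `overlapping_chunks` and the parameters `chunk_size` and `overlap` are hypothetical, and the reordering step that produces geometrically diverse subsets is not modelled here.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def overlapping_chunks(items: Sequence[T], chunk_size: int, overlap: int) -> List[List[T]]:
    """Split an ordered sequence into chunks sharing `overlap` items with the next chunk.

    Hypothetical helper for illustration only; assumes `items` is already ordered
    so that neighbouring items view overlapping parts of the scene.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    if len(items) == 0:
        return []

    step = chunk_size - overlap  # how far each new chunk advances
    chunks: List[List[T]] = []
    start = 0
    while True:
        chunks.append(list(items[start:start + chunk_size]))
        if start + chunk_size >= len(items):
            break  # this chunk already reaches the end of the sequence
        start += step
    return chunks


if __name__ == "__main__":
    image_ids = list(range(10))
    for chunk in overlapping_chunks(image_ids, chunk_size=4, overlap=1):
        print(chunk)
```

Each chunk can then be passed to the geometry model independently, so peak memory depends on `chunk_size` rather than on the size of the whole collection.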

Findings

01 Improves reconstruction accuracy on large datasets

02 Reduces memory usage significantly

03 Enables scalable 3D reconstruction beyond native memory limits

Abstract

Recent advancements in neural visual geometry, including transformer-based models such as VGGT and Pi3, have achieved impressive accuracy on 3D reconstruction tasks. However, their reliance on full attention makes them fundamentally limited by GPU memory capacity, preventing them from scaling to large, unordered image collections. We introduce MERG3R, a training-free divide-and-conquer framework that enables geometric foundation models to operate far beyond their native memory limits. MERG3R first reorders and partitions unordered images into overlapping, geometrically diverse subsets that can be reconstructed independently. It then merges the resulting local reconstructions through an efficient global alignment and confidence-weighted bundle adjustment procedure, producing a globally consistent 3D model. Our framework is model-agnostic and can be paired with existing neural geometry…
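
To make the merging step more concrete, the sketch below aligns one local reconstruction to another using 3D points observed in both. The abstract does not say which alignment method MERG3R uses, so this substitutes a standard technique, the closed-form similarity (scale, rotation, translation) estimate of Umeyama, with optional per-point weights standing in for confidences. The function name `umeyama_alignment` and its arguments are assumptions for illustration, and the bundle adjustment stage is not shown.

```python
import numpy as np


def umeyama_alignment(src, dst, weights=None):
    """Estimate s, R, t minimizing sum_i w_i * ||dst_i - (s * R @ src_i + t)||^2.

    Illustrative helper (not from the paper): `src` and `dst` are (n, d) arrays of
    corresponding points from two local reconstructions; `weights` are optional
    non-negative per-point confidences.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    n, d = src.shape
    w = np.ones(n) if weights is None else np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalize so the weights sum to one

    # Weighted centroids and centered point sets.
    mu_src = w @ src
    mu_dst = w @ dst
    src_c = src - mu_src
    dst_c = dst - mu_dst

    # Weighted cross-covariance between target and source points (d x d).
    cov = (dst_c * w[:, None]).T @ src_c
    U, S, Vt = np.linalg.svd(cov)

    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    D = np.eye(d)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        D[-1, -1] = -1.0
    R = U @ D @ Vt

    # Scale from the singular values and the weighted source variance.
    var_src = (w * (src_c ** 2).sum(axis=1)).sum()
    s = (S * np.diag(D)).sum() / var_src
    t = mu_dst - s * (R @ mu_src)
    return s, R, t


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pts = rng.normal(size=(50, 3))
    angle = 0.3
    R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                       [np.sin(angle), np.cos(angle), 0.0],
                       [0.0, 0.0, 1.0]])
    moved = 2.5 * pts @ R_true.T + np.array([1.0, -2.0, 0.5])
    s, R, t = umeyama_alignment(pts, moved)
    print(np.allclose(s * pts @ R.T + t, moved))
```

Applying the estimated transform to every point and camera of one chunk places it in the coordinate frame of its neighbour; chaining such transforms over overlapping chunks gives an initial global alignment that a later refinement stage could improve.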

Taxonomy

Topics: 3D Shape Modeling and Analysis · Computer Graphics and Visualization Techniques · Advanced Vision and Imaging