MERG3R: A Divide-and-Conquer Approach to Large-Scale Neural Visual Geometry
Leo Kaixuan Cheng, Abdus Shaikh, Ruofan Liang, Zhijie Wu, Yushi Guan, Nandita Vijaykumar

TL;DR
MERG3R is a training-free divide-and-conquer framework that lets neural visual geometry models reconstruct large, unordered image collections in 3D without exceeding GPU memory and without retraining.
Contribution
It introduces a training-free, model-agnostic method for partitioning and merging images to overcome GPU memory limits in neural geometry models.
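As a rough illustration of the partitioning idea only, the sketch below splits an already-ordered list of image indices into fixed-size chunks that share a few indices with their neighbours. The function name `partition_overlapping` and the parameters `chunk_size` and `overlap` are hypothetical and do not come from the paper. MERG3R's actual reordering and its selection of geometrically diverse subsets are not reproduced here.

```python
def partition_overlapping(n_images, chunk_size, overlap):
    """Split indices 0..n_images-1 into consecutive chunks.

    Neighbouring chunks share `overlap` indices, so each local
    reconstruction has frames in common with the next one.
    This is a plain sliding window, used here as a stand-in for
    the paper's partitioning step (an assumption, not its method).
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("need 0 <= overlap < chunk_size")
    if n_images <= 0:
        return []

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    start = 0
    while True:
        end = min(start + chunk_size, n_images)
        chunks.append(list(range(start, end)))
        if end == n_images:  # the last chunk reached the final image
            break
        start += step
    return chunks


if __name__ == "__main__":
    # Each printed chunk could be reconstructed independently on one GPU.
    for chunk in partition_overlapping(n_images=10, chunk_size=4, overlap=1):
        print(chunk)
```

The shared indices matter because they give the later merging step common frames from which to estimate how one local reconstruction maps onto the next.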
Findings
Improves reconstruction accuracy on large datasets
Reduces memory usage significantly
Enables scalable 3D reconstruction beyond native memory limits
Abstract
Recent advancements in neural visual geometry, including transformer-based models such as VGGT and Pi3, have achieved impressive accuracy on 3D reconstruction tasks. However, their reliance on full attention makes them fundamentally limited by GPU memory capacity, preventing them from scaling to large, unordered image collections. We introduce MERG3R, a training-free divide-and-conquer framework that enables geometric foundation models to operate far beyond their native memory limits. MERG3R first reorders and partitions unordered images into overlapping, geometrically diverse subsets that can be reconstructed independently. It then merges the resulting local reconstructions through an efficient global alignment and confidence-weighted bundle adjustment procedure, producing a globally consistent 3D model. Our framework is model-agnostic and can be paired with existing neural geometry…
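To make the merging step more concrete, here is a minimal sketch of one standard way to bring two local reconstructions into a common frame: a closed-form similarity (scale, rotation, translation) fit between 3D points that both reconstructions share, following Umeyama's least-squares solution. This is a swapped-in textbook technique, not MERG3R's alignment procedure, and it omits the confidence-weighted bundle adjustment entirely. The function name `similarity_align` and the synthetic data are assumptions made for illustration.

```python
import numpy as np


def similarity_align(src, dst):
    """Least-squares similarity transform with dst ~ s * R @ src + t.

    src, dst: (N, 3) arrays of corresponding 3D points, e.g. points
    seen in the overlap between two local reconstructions (assumed).
    Returns scale s, rotation R (3x3), translation t (3,).
    """
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    xs, xd = src - mu_s, dst - mu_d

    # Cross-covariance between the centred point sets.
    cov = xd.T @ xs / len(src)
    U, S, Vt = np.linalg.svd(cov)

    # Flip the last axis if needed so R is a proper rotation (det = +1).
    d = 1.0 if np.linalg.det(U) * np.linalg.det(Vt) > 0 else -1.0
    D = np.diag([1.0, 1.0, d])

    R = U @ D @ Vt
    var_s = (xs ** 2).sum() / len(src)
    s = (S * np.diag(D)).sum() / var_s
    t = mu_d - s * R @ mu_s
    return s, R, t


if __name__ == "__main__":
    # Synthetic check: transform random points, then recover the transform.
    rng = np.random.default_rng(0)
    src = rng.normal(size=(20, 3))
    a = 0.3
    R_true = np.array([[np.cos(a), -np.sin(a), 0.0],
                       [np.sin(a),  np.cos(a), 0.0],
                       [0.0,        0.0,       1.0]])
    dst = 2.5 * src @ R_true.T + np.array([1.0, -2.0, 0.5])
    s, R, t = similarity_align(src, dst)
    print(s, t)
```

A similarity transform, rather than a rigid one, is used in this sketch because independently reconstructed subsets generally do not share a common scale. Chaining such pairwise fits accumulates drift, which is the kind of error a global refinement stage is meant to reduce.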
Taxonomy
Topics: 3D Shape Modeling and Analysis · Computer Graphics and Visualization Techniques · Advanced Vision and Imaging
