Fisheye3R: Adapting Unified 3D Feed-Forward Foundation Models to Fisheye Lenses
Ruxiao Duan, Erin Hong, Dongxu Zhao, Eric Turner, Alex Wong, Yunwen Zhou

TL;DR
Fisheye3R adapts existing multi-view 3D reconstruction foundation models to fisheye images with high radial distortion, using self-supervised and supervised learning schemes that do not require fisheye ground truth.
Contribution
The paper introduces Fisheye3R, a novel adaptation framework that lets foundation models natively process fisheye images without performance loss on perspective images.
Findings
Improves camera pose, depth, and point map estimation on fisheye images.
Supports adaptation with unlabeled perspective images and no fisheye training data.
Demonstrates consistent performance gains across multiple foundation models.
Abstract
Feed-forward foundation models for multi-view 3-dimensional (3D) reconstruction have been trained on large-scale datasets of perspective images; when tested on wide field-of-view images, e.g., from a fisheye camera, their performance degrades. Their error arises from changes in spatial positions of pixels due to a non-linear projection model that maps 3D points onto the 2D image plane. While one may surmise that training on fisheye images would resolve this problem, there are far fewer fisheye images with ground truth than perspective images, which limits generalization. To enable inference on imagery exhibiting high radial distortion, we propose Fisheye3R, a novel adaptation framework that extends these multi-view 3D reconstruction foundation models to natively accommodate fisheye inputs without performance regression on perspective images. To address the scarcity of fisheye images and…
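To make the "changes in spatial positions of pixels" concrete, the sketch below compares where the same 3D point lands under a pinhole (perspective) projection and under a fisheye projection. The abstract does not say which fisheye model the paper uses, so the equidistant model (r = f * theta) is an assumption chosen for illustration, and the function names and the unit focal length are hypothetical.

```python
import math


def project_pinhole(point, f=1.0):
    """Perspective projection: image radius grows as f * tan(theta)."""
    x, y, z = point
    if z <= 0:
        raise ValueError("pinhole projection needs a point in front of the camera")
    return (f * x / z, f * y / z)


def project_equidistant(point, f=1.0):
    """Equidistant fisheye projection (assumed model): radius is f * theta."""
    x, y, z = point
    rho = math.hypot(x, y)          # distance from the optical axis
    if rho == 0.0:
        return (0.0, 0.0)           # point on the optical axis maps to the center
    theta = math.atan2(rho, z)      # angle between the ray and the optical axis
    r = f * theta
    return (r * x / rho, r * y / rho)


def radial_shift(theta_deg, f=1.0):
    """Pinhole radius minus fisheye radius for a ray at the given angle."""
    theta = math.radians(theta_deg)
    point = (math.sin(theta), 0.0, math.cos(theta))
    u_pin, _ = project_pinhole(point, f)
    u_fish, _ = project_equidistant(point, f)
    return u_pin - u_fish


if __name__ == "__main__":
    for angle in (0, 15, 30, 45, 60, 75):
        print(f"theta={angle:2d} deg  shift={radial_shift(angle):.4f}")
```

Under these assumptions the two projections agree at the image center and diverge increasingly toward the periphery, which is where a model trained only on perspective images would see pixels in unfamiliar positions. The equidistant model also stays defined for rays at 90 degrees or beyond, where the pinhole projection is undefined.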
