Geo-ID: Test-Time Geometric Consensus for Cross-View Consistent Intrinsics
Alara Dirik, Stefanos Zafeiriou

TL;DR
Geo-ID is a test-time framework that enhances cross-view consistency in intrinsic image decomposition by leveraging geometric correspondences, enabling coherent editing and relighting without retraining existing models.
Contribution
The paper introduces a model-agnostic test-time method that improves multi-view intrinsic consistency through geometric correspondences, with no retraining and no inverse rendering.
Findings
Cross-view consistency improves significantly as more views are used.
Single-view decomposition quality remains comparable to that of the underlying pretrained predictor.
Enables coherent editing and relighting in neural scene representations.
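To make "cross-view consistency" concrete, here is one simple way such a quantity could be measured. This is a minimal sketch, not the paper's evaluation metric: the function name `cross_view_inconsistency`, the per-view scalar maps, and the track format (lists of `(view_id, pixel_index)` pairs that observe the same surface point) are all assumptions made for illustration.

```python
from statistics import pstdev
from typing import Dict, List, Tuple

# Hypothetical format: one track lists every (view_id, pixel_index)
# observation of the same surface point.
Track = List[Tuple[int, int]]


def cross_view_inconsistency(
    prediction: Dict[int, List[float]],
    tracks: List[Track],
) -> float:
    """Mean per-track standard deviation of a scalar intrinsic channel.

    0.0 means every view agrees on every tracked point; larger values
    mean the per-view predictions disagree more.
    """
    spreads = []
    for track in tracks:
        if len(track) < 2:
            continue  # a point seen in one view carries no consistency signal
        values = [prediction[view][pixel] for view, pixel in track]
        spreads.append(pstdev(values))
    if not spreads:
        raise ValueError("need at least one track observed in two or more views")
    return sum(spreads) / len(spreads)
```

A lower score after test-time refinement than before it would indicate that the views agree more closely on shared surface points.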
Abstract
Intrinsic image decomposition aims to estimate physically based rendering (PBR) parameters such as albedo, roughness, and metallicity from images. While recent methods achieve strong single-view predictions, applying them independently to multiple views of the same scene often yields inconsistent estimates, limiting their use in downstream applications such as editable neural scenes and 3D reconstruction. Video-based models can improve cross-frame consistency but require dense, ordered sequences and substantial compute, limiting their applicability to sparse, unordered image collections. We propose Geo-ID, a novel test-time framework that repurposes pretrained single-view intrinsic predictors to produce cross-view consistent decompositions by coupling independent per-view predictions through sparse geometric correspondences that form uncertainty-aware consensus targets. Geo-ID is…
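The abstract does not spell out how the uncertainty-aware consensus targets are formed. As a rough illustration of the general idea only, the sketch below fuses the per-view predictions along each correspondence track with inverse-variance weighting, a standard way to combine noisy estimates. The function name `consensus_targets`, the per-pixel variance maps, and the track format are assumptions for this example, not details taken from the paper.

```python
from typing import Dict, List, Tuple

# Hypothetical format: one track lists every (view_id, pixel_index)
# observation of the same 3D point, e.g. from sparse feature matching.
Track = List[Tuple[int, int]]


def consensus_targets(
    prediction: Dict[int, List[float]],
    variance: Dict[int, List[float]],
    tracks: List[Track],
    eps: float = 1e-8,
) -> List[Tuple[float, float]]:
    """Fuse per-view predictions along each track.

    Returns one (consensus_value, consensus_variance) pair per track.
    Confident views (low variance) pull the consensus toward themselves.
    """
    targets = []
    for track in tracks:
        # Inverse-variance weights; eps guards against division by zero.
        weights = [1.0 / (variance[view][pixel] + eps) for view, pixel in track]
        total = sum(weights)
        value = sum(
            w * prediction[view][pixel] for w, (view, pixel) in zip(weights, track)
        ) / total
        # Variance of the inverse-variance weighted mean.
        targets.append((value, 1.0 / total))
    return targets
```

In a test-time scheme of this kind, each view's prediction at a tracked pixel could then be pulled toward its consensus value, for example with a loss weighted by the consensus variance; how Geo-ID actually does this is not stated in the text above.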
Taxonomy
Topics: Computer Graphics and Visualization Techniques · Advanced Vision and Imaging · Generative Adversarial Networks and Image Synthesis
