TL;DR
ProDiG is a diffusion-guided framework that progressively refines aerial 3D representations to generate realistic ground-level views and coherent 3D models from aerial-only imagery, despite the large aerial-to-ground viewpoint gap.
Contribution
ProDiG introduces a geometry-aware diffusion framework with adaptive Gaussian modules for accurate aerial-to-ground 3D reconstruction without requiring multi-altitude ground-truth data.
Findings
ProDiG outperforms existing methods in visual quality and geometric consistency.
It effectively handles large viewpoint changes and scale variations.
Experimental results show significant improvements on synthetic and real datasets.
Abstract
Generating ground-level views and coherent 3D site models from aerial-only imagery is challenging due to extreme viewpoint changes, missing intermediate observations, and large scale variations. Existing methods either refine renderings post-hoc, often producing geometrically inconsistent results, or rely on multi-altitude ground-truth, which is rarely available. Gaussian Splatting and diffusion-based refinements improve fidelity under small variations but fail under wide aerial-to-ground gaps. To address these limitations, we introduce ProDiG (Progressive Diffusion-Guided Gaussian Splatting for Aerial to Ground Reconstruction), a diffusion-guided framework that progressively transforms aerial 3D representations toward ground-level fidelity. ProDiG synthesizes intermediate-altitude views and refines the Gaussian representation at each stage using a geometry-aware causal attention module…
