Feed-Forward Gaussian Splatting from Sparse Aerial Views
Dongli Wu, Zhuoxiao Li, Tongyan Hua, Yinrui Ren, Xiaobao Wei, Rongjun Qin, Wufan Zhao

TL;DR
AnyCity is an observation-grounded generative framework that reconstructs large-scale urban scenes from sparse aerial views in a single feed-forward pass, improving coherence and detail.
Contribution
Introduces a method that combines geometry anchoring with diffusion priors for efficient, consistent 3D reconstruction of urban scenes from sparse aerial views.
Findings
Achieves coherent urban novel-view synthesis with inference times on the order of seconds.
Shows consistent improvements over baseline methods on various datasets.
Effectively separates observed geometry from generative priors.
Abstract
Reconstructing large-scale urban scenes from sparse aerial views is a crucial yet challenging task. Due to biased top-down and shallow-oblique camera poses, sparse aerial captures exhibit strong evidence imbalance: roofs and open regions are repeatedly observed, while facades, distant buildings, and occluded structures receive little multi-view support. Existing feed-forward 3D Gaussian Splatting methods directly regress a deterministic representation from sparse inputs, but this often leads to ghosting, melted facades, and stretched textures. Recent pseudo-view and video-based generative reconstruction methods use additional supervision or generative priors. However, they often lack a clear separation between observed geometry and prior-driven content, which can lead to plausible but inconsistent structures. We propose AnyCity, an observation-grounded generative reconstruction…
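The evidence imbalance described in the abstract can be illustrated with a toy calculation. The sketch below is not part of AnyCity. It assumes a hypothetical flight line of five top-down cameras, ignores occlusion entirely, and treats a camera as "supporting" a surface patch when the viewing direction is within 60 degrees of the patch normal. The function name `support_count`, the threshold `min_cos`, and all coordinates are made up for illustration.

```python
import math


def unit(v):
    """Return v scaled to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)


def support_count(normal, point, cameras, min_cos=0.5):
    """Count cameras that view a surface patch at a non-grazing angle.

    A camera counts when the cosine between the patch normal and the
    direction from the patch to the camera is at least min_cos.
    Occlusion is ignored (simplifying assumption).
    """
    normal = unit(normal)
    count = 0
    for cam in cameras:
        to_cam = unit(tuple(c - p for c, p in zip(cam, point)))
        cos_angle = sum(a * b for a, b in zip(normal, to_cam))
        if cos_angle >= min_cos:
            count += 1
    return count


# Hypothetical capture: five cameras at 120 m altitude along one flight line.
cameras = [(x, 0.0, 120.0) for x in (-40.0, -20.0, 0.0, 20.0, 40.0)]

# A roof patch faces up; a facade patch faces sideways (+x).
roof = support_count(normal=(0, 0, 1), point=(0.0, 0.0, 20.0), cameras=cameras)
facade = support_count(normal=(1, 0, 0), point=(5.0, 0.0, 10.0), cameras=cameras)

print("roof views:", roof, "facade views:", facade)
```

Under these assumptions the roof patch is supported by every camera while the facade patch is supported by none, which is the kind of imbalance the abstract attributes to top-down and shallow-oblique capture.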
