Cross-view Localization and Synthesis -- Datasets, Challenges and Opportunities
Ningli Xu, Rongjun Qin

TL;DR
This survey reviews recent progress, datasets, challenges, and techniques in cross-view localization and synthesis, emphasizing their importance for applications like autonomous navigation and urban planning.
Contribution
It provides a comprehensive overview of datasets, challenges, and state-of-the-art methods in cross-view localization and synthesis, highlighting future research directions.
Findings
Large-scale datasets have driven recent progress.
CNNs and ViTs are key for feature embedding in localization.
GANs and diffusion models are used for image synthesis.
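The embedding-based localization mentioned in the findings above is commonly framed as retrieval: a ground-level image and a set of geo-tagged overhead tiles are mapped into a shared feature space, and the location of the most similar tile is taken as the estimate. The following is a minimal sketch of that retrieval step only, not the survey's own method. The function names (`l2_normalize`, `localize_by_retrieval`), the toy three-dimensional embeddings, and the coordinates are all hypothetical stand-ins for the outputs of a trained CNN or ViT encoder.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products equal cosine similarity."""
    return x / np.maximum(np.linalg.norm(x, axis=-1, keepdims=True), eps)

def localize_by_retrieval(ground_emb, overhead_embs, overhead_coords, top_k=1):
    """Return coordinates and cosine similarities of the top_k overhead tiles
    whose embeddings are closest to the ground-level embedding."""
    query = l2_normalize(np.asarray(ground_emb, dtype=float)[None, :])[0]
    refs = l2_normalize(np.asarray(overhead_embs, dtype=float))
    sims = refs @ query                   # cosine similarity per overhead tile
    order = np.argsort(-sims)[:top_k]     # indices sorted by descending similarity
    return np.asarray(overhead_coords)[order], sims[order]

# Toy data (assumed): three overhead tiles with made-up embeddings and lat/lon.
overhead_embs = np.array([[1.0, 0.0, 0.0],
                          [0.0, 1.0, 0.0],
                          [0.0, 0.0, 1.0]])
overhead_coords = np.array([[40.00, -83.00],
                            [40.01, -83.01],
                            [40.02, -83.02]])
ground_emb = np.array([0.1, 0.9, 0.2])    # made-up ground-level embedding

coords, scores = localize_by_retrieval(ground_emb, overhead_embs,
                                       overhead_coords, top_k=2)
print(coords[0])  # coordinates of the best-matching overhead tile
```

In practice the encoders are trained so that matching ground and overhead views land close together in the feature space; the sketch assumes that training has already happened and covers only the nearest-neighbor lookup.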
Abstract
Cross-view localization and synthesis are two fundamental tasks in cross-view visual understanding, which deals with two kinds of imagery: overhead (satellite or aerial) and ground-level. These tasks have gained increasing attention due to their broad applications in autonomous navigation, urban planning, and augmented reality. Cross-view localization aims to estimate the geographic position of ground-level images from overhead imagery, while cross-view synthesis seeks to generate ground-level images from overhead imagery. Both tasks remain challenging due to significant differences in viewing perspective, resolution, and occlusion, which are inherent in cross-view datasets. Recent years have witnessed rapid progress driven by the availability of large-scale datasets and novel approaches. Typically, cross-view…
