Geo2Sound: A Scalable Geo-Aligned Framework for Soundscape Generation from Satellite Imagery

Kunlin Wu, Yanning Wang, Haofeng Tan, Boyi Chen, Teng Fei, Xianping Ma, Yang Yue, Zan Zhou, and Xiaofeng Liu

arXiv:2604.14707 · cs.MM · April 17, 2026

TL;DR

Geo2Sound introduces a scalable framework that generates realistic soundscapes from satellite imagery by combining geospatial attributes, semantic hypotheses, and geo-acoustic alignment, validated on a new large-scale benchmark.

Contribution

The paper defines satellite-to-soundscape generation as a new task, proposes a framework for it, and introduces what the authors describe as the first large-scale benchmark dataset for the task.

Findings

01

Geo2Sound achieves a state-of-the-art Fréchet Audio Distance (FAD; lower is better) of 1.765, a reported 50% improvement over baselines.

02

Human evaluations show a 26.5% improvement in realism and semantic alignment.

03

The framework effectively models geographic and acoustic correlations for soundscape synthesis.
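
For readers unfamiliar with the FAD metric cited in finding 01, the sketch below shows how a Fréchet distance between two sets of audio embeddings is typically computed: a Gaussian is fitted to each set and the distance between the two Gaussians is taken. This is a minimal illustration, not the paper's evaluation code. The function name `frechet_distance` is hypothetical, and the random arrays stand in for embeddings that would normally come from a pretrained audio model; which embedding model the authors used is not stated on this page.

```python
import numpy as np


def frechet_distance(emb_real: np.ndarray, emb_gen: np.ndarray) -> float:
    """Fréchet distance between Gaussians fitted to two embedding sets.

    Each input has shape (num_clips, embedding_dim). Hypothetical helper,
    written for illustration only.
    """
    mu_r, mu_g = emb_real.mean(axis=0), emb_gen.mean(axis=0)
    cov_r = np.cov(emb_real, rowvar=False)
    cov_g = np.cov(emb_gen, rowvar=False)

    # tr((cov_r cov_g)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_r @ cov_g, which are real and non-negative for
    # positive semi-definite covariances (clipped to absorb rounding noise).
    eigvals = np.linalg.eigvals(cov_r @ cov_g)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_r - mu_g
    return float(diff @ diff + np.trace(cov_r) + np.trace(cov_g) - 2.0 * tr_sqrt)


# Stand-in data (assumption): 200 clips with 8-dimensional embeddings.
rng = np.random.default_rng(0)
reference = rng.normal(size=(200, 8))
shifted = reference + 3.0  # same spread, mean moved by 3 in every dimension

same_score = frechet_distance(reference, reference)
shifted_score = frechet_distance(reference, shifted)
print(same_score, shifted_score)
```

Identical embedding sets score approximately zero, and the score grows as the generated distribution drifts from the reference, which is why a lower FAD indicates generated audio that is statistically closer to real recordings.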

Abstract

Recent image-to-audio models have shown impressive performance on object-centric visual scenes. However, their application to satellite imagery remains limited by the complex, wide-area semantic ambiguity of top-down views. While satellite imagery provides a uniquely scalable source for global soundscape generation, matching these views to real acoustic environments with unique spatial structures is inherently difficult. To address this challenge, we introduce Geo2Sound, a novel task and framework for generating geographically realistic soundscapes from satellite imagery. Specifically, Geo2Sound combines structural geospatial attributes modeling, semantic hypothesis expansion, and geo-acoustic alignment in a unified framework. A lightweight classifier summarizes overhead scenes into compact geographic attributes, multiple sound-oriented semantic hypotheses are used to generate diverse…

Code & Models

Repositories

Blanketzzz/Geo2Sound (GitHub)
