AudioGS: Spectrogram-Based Audio Gaussian Splatting for Sound Field Reconstruction

Chunhao Bi; Houqiang Zhong; Zhixin Xu; Li Song; Zhengxue Cheng

arXiv:2604.08967 · cs.SD · April 13, 2026

TL;DR

AudioGS introduces a spectrogram-based, visual-free framework for high-fidelity sound field reconstruction that explicitly encodes sound as Gaussian distributions, outperforming visual-dependent methods.

Contribution

The paper proposes AudioGS, a novel spectrogram-based, visual-free sound field encoding method inspired by 3D Gaussian Splatting, improving spatial audio synthesis accuracy.

Findings

01. AudioGS reduces MAG error by over 14%.

02. AudioGS decreases the DPAM perceptual metric by about 25%.

03. AudioGS outperforms state-of-the-art visual-dependent baselines.

Abstract

Spatial audio is fundamental to immersive virtual experiences, yet synthesizing high-fidelity binaural audio from sparse observations remains a significant challenge. Existing methods typically rely on implicit neural representations conditioned on visual priors, which often struggle to capture fine-grained acoustic structures. Inspired by 3D Gaussian Splatting (3DGS), we introduce AudioGS, a novel visual-free framework that explicitly encodes the sound field as a set of Audio Gaussians based on spectrograms. AudioGS associates each time-frequency bin with an Audio Gaussian equipped with dual Spherical Harmonic (SH) coefficients and a decay coefficient. For a target pose, we render binaural audio by evaluating the SH field to capture directionality, incorporating geometry-guided distance attenuation and phase correction, and reconstructing the waveform. Experiments on the Replay-NVAS…
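
The rendering steps named in the abstract (evaluating the SH field for directionality, distance attenuation, and phase correction) can be sketched for a single time-frequency bin as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the degree-1 real SH basis and its sign convention, the `exp(-decay * d) / d` attenuation law, the free-field phase term, the speed of sound, and the names `real_sh_deg1` and `render_bin` are all hypothetical choices made for the example. The abstract's "dual" SH coefficients are not modeled here; the sketch handles one coefficient set for one ear.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at room temperature


def real_sh_deg1(direction):
    """Real spherical harmonics up to degree 1 for a direction (x, y, z).

    Ordering [Y_0^0, Y_1^-1, Y_1^0, Y_1^1]; the sign convention is an assumption.
    """
    x, y, z = direction / np.linalg.norm(direction)
    c0 = 0.28209479177387814  # 1 / (2 * sqrt(pi))
    c1 = 0.4886025119029199   # sqrt(3 / (4 * pi))
    return np.array([c0, c1 * y, c1 * z, c1 * x])


def render_bin(sh_coeffs, decay, source_pos, listener_pos, freq_hz):
    """Complex value of one time-frequency bin for one ear (hypothetical model)."""
    offset = listener_pos - source_pos
    dist = np.linalg.norm(offset)
    # Directionality: SH field evaluated toward the listener.
    directional_gain = sh_coeffs @ real_sh_deg1(offset)
    # Distance attenuation: learned decay plus spherical spreading (assumed form).
    attenuation = np.exp(-decay * dist) / max(dist, 1e-6)
    # Phase correction: propagation delay dist / c expressed as a phase shift.
    phase = np.exp(-2j * np.pi * freq_hz * dist / SPEED_OF_SOUND)
    return directional_gain * attenuation * phase
```

Applying `render_bin` to every bin of a spectrogram, once per ear, and then inverting the result with an inverse STFT would give a binaural waveform under this simplified model.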
