Physics-Guided Variational Model for Unsupervised Sound Source Tracking
Luan Vinícius Fiorio, Ivana Nikoloska, Bruno Defraene, Alex Young, Johan David, Ronald M. Aarts

TL;DR
This paper presents an unsupervised, physics-guided variational model for sound source tracking that leverages geometric constraints and outperforms traditional methods without needing labeled data.
Contribution
It introduces a novel unsupervised model combining a variational encoder with a physics-based decoder for sound source tracking, capable of handling mismatched geometries and noise.
Findings
Outperforms traditional baselines in accuracy on real-world data
Achieves accuracy and computational complexity comparable to state-of-the-art supervised models
Generalizes well to different array geometries
Abstract
Sound source tracking is commonly performed using classical array-processing algorithms, while machine-learning approaches typically rely on precise source position labels that are expensive or impractical to obtain. This paper introduces a physics-guided variational model capable of fully unsupervised single-source sound source tracking. The method combines a variational encoder with a physics-based decoder that injects geometric constraints into the latent space through analytically derived pairwise time-delay likelihoods. Without requiring ground-truth labels, the model learns to estimate source directions directly from microphone array signals. Experiments on real-world data demonstrate that the proposed approach outperforms traditional baselines and achieves accuracy and computational complexity comparable to state-of-the-art supervised models. We further show that the method…
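To make the idea of a pairwise time-delay likelihood concrete, the following is a minimal sketch and not the authors' decoder. It assumes a 2D far-field (plane-wave) model, a speed of sound of 343 m/s, and an isotropic Gaussian noise model on the delays. The function names `pairwise_tdoas` and `gaussian_tdoa_log_likelihood`, the toy three-microphone array, and the value of `sigma` are all hypothetical choices made for illustration.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def pairwise_tdoas(mic_positions, azimuth_rad):
    """Far-field time differences of arrival for all mic pairs (i < j).

    mic_positions: (M, 2) array of microphone coordinates in metres.
    azimuth_rad:   source direction of arrival in radians.
    """
    direction = np.array([np.cos(azimuth_rad), np.sin(azimuth_rad)])
    # A plane wave arriving from `direction` reaches a mic at position p
    # at time -(p . direction) / c relative to the array origin.
    arrival = -(mic_positions @ direction) / SPEED_OF_SOUND
    i, j = np.triu_indices(len(mic_positions), k=1)
    return arrival[i] - arrival[j]


def gaussian_tdoa_log_likelihood(observed, mic_positions, azimuth_rad, sigma=1e-5):
    """Log-likelihood of observed pairwise delays under an assumed Gaussian model."""
    residual = observed - pairwise_tdoas(mic_positions, azimuth_rad)
    normaliser = residual.size * np.log(sigma * np.sqrt(2.0 * np.pi))
    return float(-0.5 * np.sum((residual / sigma) ** 2) - normaliser)


# Toy array: three non-collinear microphones, 10 cm apart along each axis.
mics = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1]])
observed = pairwise_tdoas(mics, np.deg2rad(40.0))  # noiseless synthetic delays

# Score every integer azimuth and keep the most likely one.
grid = np.deg2rad(np.arange(0, 360))
scores = [gaussian_tdoa_log_likelihood(observed, mics, az) for az in grid]
best_az_deg = float(np.rad2deg(grid[int(np.argmax(scores))]))
```

In this sketch the geometry enters only through `pairwise_tdoas`, so a latent direction estimate can be scored against measured delays without any position labels. The grid search here merely stands in for whatever inference procedure a learned encoder would perform.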
Taxonomy
Topics: Speech and Audio Processing · Music and Audio Processing · Music Technology and Sound Studies
