Hyperbolic Audio Source Separation
Darius Petermann, Gordon Wichern, Aswin Subramanian, Jonathan Le Roux

TL;DR
This paper presents a hyperbolic embedding framework for audio source separation that captures the hierarchical relationship between sound sources and time-frequency features, and evaluates its separation performance and uncertainty estimates on a synthetic dataset of speech and music mixtures.
Contribution
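Introduces a hyperbolic embedding approach to audio source separation that models the hierarchical relationship between sound sources and time-frequency features, estimates masks with hyperbolic softmax layers, and provides an uncertainty estimate based on where each time-frequency bin is embedded.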
Findings
Hyperbolic embeddings perform comparably to a Euclidean baseline in terms of source-to-distortion ratio.
At low embedding dimensions, hyperbolic embeddings perform more strongly than the Euclidean baseline.
Uncertainty estimates enable better trade-offs in source separation.
Abstract
We introduce a framework for audio source separation using embeddings on a hyperbolic manifold that compactly represent the hierarchical relationship between sound sources and time-frequency features. Inspired by recent successes modeling hierarchical relationships in text and images with hyperbolic embeddings, our algorithm obtains a hyperbolic embedding for each time-frequency bin of a mixture signal and estimates masks using hyperbolic softmax layers. On a synthetic dataset containing mixtures of multiple people talking and musical instruments playing, our hyperbolic model performed comparably to a Euclidean baseline in terms of source to distortion ratio, with stronger performance at low embedding dimensions. Furthermore, we find that time-frequency regions containing multiple overlapping sources are embedded towards the center (i.e., the most uncertain region) of the hyperbolic…
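The link the abstract draws between distance from the center and certainty can be illustrated with a minimal sketch. It assumes a Poincaré ball model with curvature -c and uses the standard exponential map at the origin; the function names, array shapes, and random features are hypothetical stand-ins and are not taken from the paper, and the hyperbolic softmax mask layer is not shown.

```python
import numpy as np

def expmap0(v, c=1.0, eps=1e-9):
    """Map Euclidean vectors (..., d) onto the Poincare ball of curvature -c."""
    sqrt_c = np.sqrt(c)
    norm = np.maximum(np.linalg.norm(v, axis=-1, keepdims=True), eps)
    return np.tanh(sqrt_c * norm) * v / (sqrt_c * norm)

def dist_from_origin(x, c=1.0):
    """Hyperbolic distance from the ball's center; small values mean near the center."""
    sqrt_c = np.sqrt(c)
    norm = np.linalg.norm(x, axis=-1)
    # Clip to stay strictly inside the ball so arctanh remains finite.
    norm = np.minimum(norm, (1.0 - 1e-7) / sqrt_c)
    return 2.0 / sqrt_c * np.arctanh(sqrt_c * norm)

# Hypothetical network outputs: one 2-D vector per (frame, frequency bin).
rng = np.random.default_rng(0)
feats = 0.5 * rng.normal(size=(4, 5, 2))

emb = expmap0(feats)                # hyperbolic embedding per time-frequency bin
certainty = dist_from_origin(emb)   # larger distance read here as higher certainty
```

Under this reading, bins whose embeddings lie close to the center receive low certainty values, which matches the abstract's description of the center as the most uncertain region.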
Taxonomy
Topics: Speech and Audio Processing · Music and Audio Processing · Acoustic Wave Phenomena Research
Methods: Softmax
