Variational decomposition autoencoding improves disentanglement of latent representations
Ioannis Ziogas, Aamna Al Shehhi, Ahsan H. Khandoker, Leontios J. Hadjileontiadis

TL;DR
This paper introduces variational decomposition autoencoding (VDA), which extends variational autoencoders (VAEs) with a structural bias toward signal decomposition and improves the disentanglement and interpretability of latent representations of complex time-evolving data.
Contribution
The paper presents DecVAEs, the neural-network instantiation of VDA, which combine signal decomposition, contrastive learning, and variational priors to improve disentanglement in time-frequency data representations.
Findings
DecVAEs outperform state-of-the-art VAE methods in disentanglement quality.
DecVAEs generalize well across different scientific tasks.
Latent encodings from DecVAEs are more interpretable.
Abstract
Understanding the structure of complex, nonstationary, high-dimensional time-evolving signals is a central challenge in scientific data analysis. In many domains, such as speech and biomedical signal processing, the ability to learn disentangled and interpretable representations is critical for uncovering latent generative mechanisms. Traditional approaches to unsupervised representation learning, including variational autoencoders (VAEs), often struggle to capture the temporal and spectral diversity inherent in such data. Here we introduce variational decomposition autoencoding (VDA), a framework that extends VAEs by incorporating a strong structural bias toward signal decomposition. VDA is instantiated through variational decomposition autoencoders (DecVAEs), i.e., encoder-only neural networks that combine a signal decomposition model, a contrastive self-supervised task, and…
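As a rough illustration of two ingredients the abstract names, a variational prior and a contrastive self-supervised task, the sketch below computes a standard Gaussian KL term and a standard InfoNCE loss in plain Python. It is a generic sketch, not the DecVAE objective: the abstract is truncated and does not give the loss, so the function names (`gaussian_kl`, `info_nce`), the cosine similarity, and the `temperature` value are all assumptions chosen for illustration.

```python
import math


def gaussian_kl(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ) for one latent vector.

    Standard closed form used as the prior-matching term in VAEs.
    """
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mu, logvar))


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss for a single anchor (hypothetical helper).

    Uses cosine similarity; the positive pair sits at index 0 of the logits.
    """
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(y * y for y in b))
        return dot / (norm_a * norm_b)

    logits = [cos(anchor, positive) / temperature]
    logits += [cos(anchor, n) / temperature for n in negatives]
    # Numerically stable log-sum-exp over all candidates.
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(l - peak) for l in logits))
    # Negative log-probability of picking the positive among all candidates.
    return log_sum - logits[0]
```

In a typical setup the two terms would be added with a weighting coefficient; how, or whether, DecVAEs weight them is not stated in the text available here.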
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning · Voice and Speech Disorders
