Learning Interpretable Low-dimensional Representation via Physical Symmetry
Xuanjie Liu, Daniel Chin, Yichen Huang, Gus Xia

TL;DR
This paper introduces a physics-inspired approach using symmetry constraints to learn low-dimensional, interpretable representations of time-series data, demonstrated on music and vision tasks without labels.
Contribution
It proposes a novel method leveraging physical symmetry as a self-consistency constraint to learn interpretable, low-dimensional representations in an unsupervised manner.
Findings
The model learns a linear pitch factor from unlabelled monophonic music audio.
Applied to vision, the same method learns a 3D space from videos.
Counterfactual augmentation is introduced to improve sample efficiency.
Abstract
We have recently seen great progress in learning interpretable music representations, ranging from basic factors, such as pitch and timbre, to high-level concepts, such as chord and texture. However, most methods rely heavily on music domain knowledge. It remains an open question what general computational principles give rise to interpretable representations, especially low-dimensional factors that agree with human perception. In this study, we take inspiration from modern physics and use physical symmetry as a self-consistency constraint for the latent space of time-series data. Specifically, we require the prior model that characterises the dynamics of the latent states to be equivariant with respect to certain group transformations. We show that physical symmetry leads the model to learn a linear pitch factor from unlabelled monophonic music audio in a self-supervised fashion. In addition,…
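The equivariance requirement on the prior model can be illustrated with a toy check. The sketch below is not the authors' implementation: it assumes a one-dimensional latent state, takes translation as the group transformation, and uses two hypothetical priors (`linear_prior` and `biased_prior`) invented purely for illustration. It measures how far each prior is from commuting with the transformation.

```python
# Toy illustration only: the latent space, the group action, and both priors
# are assumptions made for this sketch, not taken from the paper.

def translate(latents, offset):
    """Assumed group action: shift every latent state by the same offset."""
    return [z + offset for z in latents]

def linear_prior(history):
    """Hypothetical prior: extrapolate the next state from the last two."""
    return 2 * history[-1] - history[-2]

def biased_prior(history):
    """Hypothetical prior that depends on absolute position in latent space."""
    return 0.9 * history[-1]

def symmetry_gap(prior, history, offset):
    """Distance between 'transform then predict' and 'predict then transform'.

    The gap is zero when the prior is equivariant under the translation.
    """
    transform_then_predict = prior(translate(history, offset))
    predict_then_transform = prior(history) + offset
    return abs(transform_then_predict - predict_then_transform)

history = [0.0, 1.0, 2.0]
gap_linear = symmetry_gap(linear_prior, history, offset=3.0)
gap_biased = symmetry_gap(biased_prior, history, offset=3.0)
print(gap_linear, gap_biased)
```

In this toy setting the extrapolating prior commutes with translation and has zero gap, while the position-dependent prior does not. A gap of this kind could plausibly serve as a penalty term during training, although the paper's actual loss is not given in the text above.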
Taxonomy
Topics: Music and Audio Processing · Neuroscience and Music Perception · Music Technology and Sound Studies
