Bilinear autoencoders find interpretable manifolds
Thomas Dooms, Ward Gauderis, Geraint Wiggins, Jose Oramas

TL;DR
This paper introduces bilinear autoencoders with quadratic latents that better capture complex, interpretable manifolds in neural network representations, improving reconstruction and enabling manifold discovery.
Contribution
It proposes a novel bilinear autoencoder framework with quadratic latents that enhances interpretability and manifold detection in neural representations.
Findings
Quadratic latents detect multi-dimensional geometries in neural network activations.
Autoencoders with different geometric priors recover the same input subspace.
Composite latents systematically reduce reconstruction error in language models.
Abstract
Sparse autoencoders have become a standard tool for uncovering interpretable latent representations in neural networks. Yet salient concepts often span manifolds that current linear methods cannot capture without post hoc analysis. This paper uses quadratic latents to close this gap: we implement these with bilinear autoencoders, which decompose activations into low-rank quadratic forms, compose linearly in weight space, and admit input-independent geometric analysis. This qualitative difference in what concepts quadratic latents can detect challenges the standard linear representation hypothesis. Our experiments and visualisations show that multi-dimensional geometries are highly prevalent and that composite latents capture them well, systematically improving reconstruction error in language models. Furthermore, we show that autoencoders with varying geometric priors recover the same…
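To make "low-rank quadratic forms" that "compose linearly in weight space" concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the dimensions, the random weights, and the names `L`, `R`, `bilinear_latents`, and `quadratic_forms` are assumptions made for illustration. It treats each latent as the product of two linear projections of the input, which equals a quadratic form x^T Q x with a symmetric matrix Q of rank at most 2.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 8, 4  # hypothetical input width and number of latents

# Hypothetical encoder weights: latent i has a left vector L[i] and a right vector R[i].
L = rng.normal(size=(k, d))
R = rng.normal(size=(k, d))


def bilinear_latents(x, L, R):
    """Latent i is (L[i] @ x) * (R[i] @ x), a low-rank quadratic function of x."""
    return (L @ x) * (R @ x)


def quadratic_forms(L, R):
    """Symmetric matrices Q[i] such that x @ Q[i] @ x equals latent i."""
    outer = np.einsum("ki,kj->kij", L, R)  # L[i] R[i]^T for every latent
    return 0.5 * (outer + outer.transpose(0, 2, 1))  # symmetrise, rank <= 2


x = rng.normal(size=d)
z = bilinear_latents(x, L, R)
Q = quadratic_forms(L, R)

# The same latents, computed from the explicit quadratic forms.
z_from_forms = np.einsum("i,kij,j->k", x, Q, x)
forms_match = np.allclose(z, z_from_forms)

# Linear composition in weight space: a weighted sum of latents is itself
# a single quadratic form whose matrix is the weighted sum of the Q[i].
w = rng.normal(size=k)
Q_combined = np.einsum("k,kij->ij", w, Q)
combined_matches = np.isclose(w @ z, x @ Q_combined @ x)
```

Because `Q_combined` depends only on the weights, it can be inspected without any input data, for example through the eigendecomposition of this symmetric matrix (`np.linalg.eigh(Q_combined)`). This is one plausible reading of the "input-independent geometric analysis" mentioned in the abstract, offered here as an assumption rather than the paper's stated procedure.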
