Can sparse autoencoders make sense of gene expression latent variable models?
Viktoria Schuster

TL;DR
This paper investigates the use of sparse autoencoders (SAEs) to interpret latent variables in models of biological data, showing that they can uncover meaningful biological signals, and introduces scFeatureLens, an automated interpretability tool.
Contribution
It demonstrates the effectiveness of SAEs in decomposing complex biological embeddings and introduces scFeatureLens for automated biological concept linking.
Findings
SAEs can effectively extract ground truth variables from simulated data.
SAEs reveal key biological processes in single-cell embeddings.
scFeatureLens enables large-scale biological hypothesis generation.
Abstract
Sparse autoencoders (SAEs) have lately been used to uncover interpretable latent features in large language models. By projecting dense embeddings into a much higher-dimensional and sparse space, learned features become disentangled and easier to interpret. This work explores the potential of SAEs for decomposing embeddings in complex and high-dimensional biological data. Using simulated data, it outlines the efficacy, hyperparameter landscape, and limitations of SAEs when it comes to extracting ground truth generative variables from latent space. The application to embeddings from pretrained single-cell models shows that SAEs can find and steer key biological processes and even uncover subtle biological signals that might otherwise be missed. This work further introduces scFeatureLens, an automated interpretability approach for linking SAE features and biological concepts from gene…
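The core idea in the abstract, projecting a dense embedding into a much wider, sparse feature space and reconstructing it, can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the architecture (one ReLU encoder layer, linear decoder), the L1 sparsity penalty, the plain gradient-descent update, the random stand-in data, and all names (`init_sae`, `sae_forward`, `sae_loss`, `sae_step`, `l1_coeff`) are assumptions chosen for illustration.

```python
import numpy as np

def init_sae(d_in, d_hidden, seed=0):
    """Hypothetical initialiser: small random encoder, decoder starts as its transpose."""
    rng = np.random.default_rng(seed)
    W_enc = rng.normal(0.0, 0.1, size=(d_in, d_hidden))
    return {
        "W_enc": W_enc,
        "b_enc": np.zeros(d_hidden),
        "W_dec": W_enc.T.copy(),
        "b_dec": np.zeros(d_in),
    }

def sae_forward(params, x):
    """Encode dense embeddings x into non-negative features h, then reconstruct."""
    pre = x @ params["W_enc"] + params["b_enc"]
    h = np.maximum(pre, 0.0)                      # ReLU keeps features non-negative
    x_hat = h @ params["W_dec"] + params["b_dec"]
    return h, x_hat

def sae_loss(x, h, x_hat, l1_coeff):
    """Mean squared reconstruction error plus an L1 penalty on the features."""
    recon = np.mean(np.sum((x_hat - x) ** 2, axis=1))
    sparsity = np.mean(np.sum(np.abs(h), axis=1))
    return recon + l1_coeff * sparsity

def sae_step(params, x, l1_coeff=1e-3, lr=1e-2):
    """One full-batch gradient-descent step; returns the loss before the update."""
    n = x.shape[0]
    h, x_hat = sae_forward(params, x)
    d_xhat = 2.0 * (x_hat - x) / n
    d_h = d_xhat @ params["W_dec"].T + l1_coeff * np.sign(h) / n
    d_pre = d_h * (h > 0)                         # gradient flows only through active units
    grads = {
        "W_dec": h.T @ d_xhat,
        "b_dec": d_xhat.sum(axis=0),
        "W_enc": x.T @ d_pre,
        "b_enc": d_pre.sum(axis=0),
    }
    for key in params:
        params[key] -= lr * grads[key]
    return sae_loss(x, h, x_hat, l1_coeff)

# Random data standing in for 16-dimensional model embeddings (assumption).
rng = np.random.default_rng(1)
x = rng.normal(size=(256, 16))
params = init_sae(d_in=16, d_hidden=64)           # 4x wider feature space
losses = [sae_step(params, x) for _ in range(200)]
h, x_hat = sae_forward(params, x)
print("first loss:", losses[0], "last loss:", losses[-1])
print("fraction of active features:", float(np.mean(h > 0)))
```

In practice the sparsity coefficient and the expansion factor trade reconstruction quality against sparsity; the abstract's mention of a hyperparameter landscape refers to choices of this kind, though the specific values above are arbitrary.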
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis
