From Flat to Hierarchical: Extracting Sparse Representations with Matching Pursuit
Valérie Costa, Thomas Fel, Ekdeep Singh Lubana, Bahareh Tolooshams, Demba Ba

TL;DR
This paper introduces MP-SAE, a sparse autoencoder inspired by matching pursuit. It captures hierarchical and nonlinear structure in neural representations that standard sparse autoencoders are not designed to represent.
Contribution
The paper proposes MP-SAE, a novel autoencoder architecture that unrolls matching pursuit steps to better capture hierarchical and nonlinear features in neural representations.
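To make "unrolling matching pursuit steps" concrete, here is a minimal NumPy sketch of classical matching pursuit run for a fixed number of steps. It is an illustration under stated assumptions, not the paper's architecture: the dictionary `D` is random with unit-norm rows rather than learned, atoms are selected by largest absolute correlation, and the names `matching_pursuit_encode`, `n_steps`, and `tol` are hypothetical.

```python
import numpy as np

def matching_pursuit_encode(x, D, n_steps, tol=0.0):
    """Greedy matching pursuit (classical form, assumed here for illustration).

    x: (dim,) input vector.
    D: (n_atoms, dim) dictionary whose rows are unit-norm atoms.
    n_steps: maximum number of unrolled selection steps.
    tol: stop early once the residual norm is at or below this value.
    """
    residual = x.astype(float).copy()
    codes = np.zeros(D.shape[0])
    for _ in range(n_steps):
        if np.linalg.norm(residual) <= tol:
            break  # input already explained well enough; use fewer atoms
        scores = D @ residual                 # correlation of each atom with the residual
        k = int(np.argmax(np.abs(scores)))    # pick the best-matching atom
        codes[k] += scores[k]                 # accumulate its coefficient
        residual = residual - scores[k] * D[k]  # remove its contribution
    return codes, residual

# Toy data: a random dictionary stands in for learned SAE decoder weights.
rng = np.random.default_rng(0)
D = rng.normal(size=(32, 8))
D /= np.linalg.norm(D, axis=1, keepdims=True)
x = rng.normal(size=8)

codes, residual = matching_pursuit_encode(x, D, n_steps=4)
x_hat = codes @ D  # reconstruction from the sparse code
```

Each step explains part of the residual left by the previous steps, so later atoms are chosen conditionally on earlier ones. Because the atoms are unit-norm, the residual after a step is orthogonal to the atom just selected. Raising `tol` lets inputs that are easy to reconstruct stop after fewer steps.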
Findings
MP-SAE captures hierarchical, conditionally orthogonal features.
It recovers meaningful features in vision-language models.
Adaptive sparsity improves inference efficiency.
Abstract
Motivated by the hypothesis that neural network representations encode abstract, interpretable features as linearly accessible, approximately orthogonal directions, sparse autoencoders (SAEs) have become a popular tool in interpretability. However, recent work has demonstrated phenomenology of model representations that lies outside the scope of this hypothesis, showing signatures of hierarchical, nonlinear, and multi-dimensional features. This raises the question: do SAEs represent features that possess structure at odds with their motivating hypothesis? If not, does avoiding this mismatch help identify said features and gain further insights into neural network representations? To answer these questions, we take a construction-based approach and re-contextualize the popular matching pursuits (MP) algorithm from sparse coding to design MP-SAE -- an SAE that unrolls its encoder into a…
Taxonomy
Topics: Video Analysis and Summarization · Data Visualization and Analytics
