SPARC: Concept-Aligned Sparse Autoencoders for Cross-Model and Cross-Modal Interpretability
Ali Nasiri-Sarvi, Hassan Rivaz, Mahdi S. Hosseini

TL;DR
SPARC introduces a unified sparse autoencoder framework that aligns high-level concept representations across diverse AI models and modalities, significantly enhancing interpretability and enabling cross-model applications.
Contribution
The paper proposes an alignment method that combines Global TopK sparsity with a cross-reconstruction loss to learn a single concept space shared across models and modalities.
Findings
Achieves a Jaccard similarity of 0.80 in concept alignment on Open Images
Triples the concept alignment achieved by previous methods, improving interpretability
Enables cross-model and cross-modal retrieval and localization
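For readers unfamiliar with the metric behind the 0.80 figure, Jaccard similarity between two sets is the size of their intersection divided by the size of their union. The sketch below applies it to the sets of active latent indices that two models produce for the same input. The function name, the example indices, and the choice to compare per-sample active sets are illustrative assumptions; the paper's exact evaluation protocol is not described on this page.

```python
def jaccard(active_a, active_b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| between two collections of latent indices."""
    a, b = set(active_a), set(active_b)
    union = a | b
    if not union:
        # Convention chosen for this sketch: two empty sets count as identical.
        return 1.0
    return len(a & b) / len(union)


# Hypothetical active latent indices for one image, as seen by two encoders.
latents_model_a = [3, 17, 42, 99, 128]
latents_model_b = [3, 17, 42, 99, 256]

# Four shared indices out of six distinct indices overall.
score = jaccard(latents_model_a, latents_model_b)
print(round(score, 3))
```

A score of 1.0 would mean both models activate exactly the same latents for the input; a score of 0.0 would mean they share none.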
Abstract
Understanding how different AI models encode the same high-level concepts, such as objects or attributes, remains challenging because each model typically produces its own isolated representation. Existing interpretability methods like Sparse Autoencoders (SAEs) produce latent concepts individually for each model, resulting in incompatible concept spaces and limiting cross-model interpretability. To address this, we introduce SPARC (Sparse Autoencoders for Aligned Representation of Concepts), a new framework that learns a single, unified latent space shared across diverse architectures and modalities (e.g., vision models like DINO, and multimodal models like CLIP). SPARC's alignment is enforced through two key innovations: (1) a Global TopK sparsity mechanism, ensuring all input streams activate identical latent dimensions for a given concept; and (2) a Cross-Reconstruction Loss, which…
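The Global TopK idea described in the abstract can be illustrated with a minimal sketch. It assumes that each stream (for example a DINO encoder and a CLIP image encoder) has already produced pre-activations over the same latent dimensions, and that the shared selection is made by summing pre-activations across streams and keeping the k largest. The summing rule, the function name `global_topk`, and the stream names are assumptions made for illustration, not details confirmed by this page.

```python
def global_topk(stream_preacts, k):
    """Keep the same k latent dimensions in every stream.

    stream_preacts: dict mapping a stream name to a list of floats,
    all lists having the same length (the latent dimension).
    """
    names = list(stream_preacts)
    n_latents = len(stream_preacts[names[0]])

    # Assumption for this sketch: aggregate by summing across streams.
    totals = [sum(stream_preacts[s][i] for s in names) for i in range(n_latents)]

    # Indices of the k largest aggregated pre-activations, shared by all streams.
    keep = set(sorted(range(n_latents), key=lambda i: totals[i], reverse=True)[:k])

    # Zero out every latent outside the shared index set.
    return {
        s: [v if i in keep else 0.0 for i, v in enumerate(stream_preacts[s])]
        for s in names
    }


# Hypothetical pre-activations for one input from two streams.
preacts = {
    "dino": [0.9, 0.1, 0.8, 0.0, 0.7],
    "clip_image": [0.8, 0.3, 0.1, 0.2, 0.6],
}
codes = global_topk(preacts, k=2)
active = {s: [i for i, v in enumerate(c) if v != 0.0] for s, c in codes.items()}
print(active)
```

In this example a per-stream TopK would keep latents 0 and 2 for the first stream but 0 and 4 for the second. The shared selection keeps latents 0 and 4 in both, which is the property the abstract attributes to Global TopK.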
Taxonomy
Topics: Multimodal Machine Learning Applications · Explainable Artificial Intelligence (XAI) · Bayesian Modeling and Causal Inference
