Sparse Crosscoders for diffing MoEs and Dense models

Marmik Chaudhari; Nishkal Hundia; Idhant Gulati

arXiv:2603.05805·cs.LG·March 9, 2026

TL;DR

This paper compares the internal representations of sparse Mixture of Experts (MoE) models and dense models, revealing that MoEs develop more specialized and focused features, while dense models have broader, more general features.

Contribution

The paper introduces a systematic method that uses crosscoders to analyze and compare MoE and dense model internals, highlighting differences in feature organization and specialization.

Findings

01 MoEs learn significantly fewer unique features than dense models.

02 MoE-specific features have higher activation density than shared features, whereas dense-specific features have lower density.

03 Dense models distribute information across more general features.

Abstract

Mixture of Experts (MoE) models achieve parameter-efficient scaling through sparse expert routing, yet their internal representations remain poorly understood compared to dense models. We present a systematic comparison of MoE and dense model internals using crosscoders, a variant of sparse autoencoders that jointly model multiple activation spaces. We train 5-layer dense and MoE models (equal active parameters) on 1B tokens across code, scientific text, and English stories. Using BatchTopK crosscoders with explicitly designated shared features, we achieve $\sim 87\%$ fractional variance explained and uncover concrete differences in feature organization. The MoE learns significantly fewer unique features compared to the dense model. MoE-specific features also exhibit higher activation density than shared features, whereas dense-specific features show lower density. Our analysis reveals that MoEs…
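The three quantities named in the abstract (BatchTopK sparsity, fractional variance explained, and activation density) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names `batch_topk`, `fraction_variance_explained`, and `activation_density` are hypothetical, and the definitions are the commonly used ones, assumed here rather than taken from the paper: BatchTopK keeps the `k * batch_size` largest positive pre-activations across the whole batch, fractional variance explained is one minus the ratio of residual to total variance, and activation density is the fraction of tokens on which a feature is nonzero.

```python
from typing import List

Matrix = List[List[float]]  # rows are tokens, columns are features or dimensions


def batch_topk(pre_acts: Matrix, k: int) -> Matrix:
    """Keep the k * batch_size largest positive values across the whole batch.

    Assumed definition: the budget is shared across the batch, so one token
    may keep more than k features while another keeps fewer. Ties at the
    threshold are all kept in this sketch.
    """
    flat = [v for row in pre_acts for v in row]
    budget = min(k * len(pre_acts), len(flat))
    if budget <= 0:
        return [[0.0 for _ in row] for row in pre_acts]
    threshold = sorted(flat, reverse=True)[budget - 1]
    return [[v if (v >= threshold and v > 0.0) else 0.0 for v in row]
            for row in pre_acts]


def fraction_variance_explained(x: Matrix, x_hat: Matrix) -> float:
    """1 - (squared reconstruction error) / (squared deviation from the mean)."""
    n, d = len(x), len(x[0])
    mean = [sum(row[j] for row in x) / n for j in range(d)]
    residual = sum((a - b) ** 2
                   for row, row_hat in zip(x, x_hat)
                   for a, b in zip(row, row_hat))
    total = sum((a - m) ** 2 for row in x for a, m in zip(row, mean))
    return 1.0 - residual / total


def activation_density(acts: Matrix) -> List[float]:
    """Per-feature fraction of tokens on which the feature is nonzero."""
    n = len(acts)
    return [sum(1 for row in acts if row[j] != 0.0) / n
            for j in range(len(acts[0]))]


# Toy pre-activations: 2 tokens, 3 features (illustrative values only).
pre = [[0.9, 0.7, -0.3],
       [0.2, 0.1, 0.05]]
sparse = batch_topk(pre, k=1)         # both kept values land on the first token
density = activation_density(sparse)  # [0.5, 0.5, 0.0]

# Toy reconstruction check: one coordinate is off by 1.
x = [[1.0, 0.0], [3.0, 2.0]]
x_hat = [[1.0, 0.0], [3.0, 1.0]]
fve = fraction_variance_explained(x, x_hat)  # 0.75
```

In the toy batch, the first token keeps two features and the second keeps none, which is the property that distinguishes a batch-level budget from a per-token TopK. Under these assumed definitions, the reported $\sim 87\%$ would correspond to `fraction_variance_explained` returning roughly 0.87 on the models' activations.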

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Advanced Graph Neural Networks · Topic Modeling