Sparsity as a Key: Unlocking New Insights from Latent Structures for Out-of-Distribution Detection

Ahyoung Oh; Wonseok Shin; Songkuk Kim

arXiv:2604.26409·cs.CV·April 30, 2026

TL;DR

This paper introduces a Sparse Autoencoder (SAE) framework for out-of-distribution (OOD) detection in Vision Transformers. It reveals class-specific activation patterns and a structural invariant, which it uses to improve detection performance.

Contribution

First application of Sparse Autoencoders to ViT [CLS] tokens for OOD detection, uncovering class-specific activation patterns and a divergence-based scoring method.

Findings

01. Achieves strong FPR95 results across benchmarks.

02. Reveals stable class-specific activation patterns in in-distribution data.

03. Disruptions in activation patterns indicate OOD samples.

Abstract

Sparse Autoencoders (SAEs) have demonstrated significant success in interpreting Large Language Models (LLMs) by decomposing dense representations into sparse, semantic components. However, their potential for analyzing Vision Transformers (ViTs) remains largely under-explored. In this work, we present the first application of SAEs to the ViT [CLS] token for out-of-distribution (OOD) detection, addressing the limitation of existing methods that rely on entangled feature representations. We propose a novel framework utilizing a Top-k SAE to disentangle the dense [CLS] features into a structured latent space. Through this analysis, we reveal that in-distribution (ID) data exhibits consistent, class-specific activation patterns, which we formalize as Class Activation Profiles (CAPs). Our study uncovers a key structural invariant: while ID samples preserve a stable pattern within CAPs, OOD…
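
The abstract is cut off before it defines Class Activation Profiles or the scoring rule, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a Top-k encoder of the common form ReLU(Wx + b) with all but the k largest latents zeroed. The helper names `topk_encode`, `class_activation_profile`, and `profile_divergence` are hypothetical, the profile is assumed to be a per-class firing frequency, and the smoothed KL divergence is a stand-in for whatever divergence the paper actually uses.

```python
import math


def topk_encode(x, w_enc, b_enc, k):
    """Encode a dense feature vector with a Top-k sparse encoder.

    Each latent is ReLU(w_row . x + b); only the k largest are kept.
    """
    pre = [
        max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_enc, b_enc)
    ]
    # Indices of the k largest pre-activations; every other latent is zeroed.
    keep = sorted(range(len(pre)), key=lambda j: pre[j], reverse=True)[:k]
    z = [0.0] * len(pre)
    for j in keep:
        z[j] = pre[j]
    return z


def class_activation_profile(latents):
    """ASSUMPTION: profile = fraction of one class's samples firing each latent."""
    n = len(latents)
    width = len(latents[0])
    return [sum(1 for z in latents if z[j] > 0) / n for j in range(width)]


def profile_divergence(z, cap, eps=1e-6):
    """ASSUMPTION: smoothed KL(sample || profile), a stand-in OOD score."""
    p_raw = [v + eps for v in z]
    q_raw = [v + eps for v in cap]
    p_sum, q_sum = sum(p_raw), sum(q_raw)
    return sum(
        (p / p_sum) * math.log((p / p_sum) / (q / q_sum))
        for p, q in zip(p_raw, q_raw)
    )
```

Under these assumptions, a sample whose active latents match the profile of its predicted class receives a low score, while a sample that fires on latents the class rarely uses receives a high one.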
