Qwen-Scope: Turning Sparse Features into Development Tools for Large Language Models
Boyi Deng, Xu Wang, Yaoning Wang, Yu Wan, Yubo Ma, Baosong Yang, Haoran Wei, Jialong Tang, Huan Lin, Ruize Gao, Tianhao Li, Qian Cao, Xuancheng Ren, Xiaodong Deng, An Yang, Fei Huang, Dayiheng Liu, Jingren Zhou

TL;DR
Qwen-Scope introduces a suite of sparse autoencoders for the Qwen large language models, enabling interpretability, control, evaluation, and optimization of model behavior through representation-level interfaces.
Contribution
It provides an open-source set of SAEs integrated with Qwen models, facilitating practical tools for model development, analysis, and safety improvements.
Findings
SAEs enable inference-time steering without weight modification.
SAEs serve as proxies for evaluating model capabilities and redundancy.
SAEs support data workflows and post-training optimization for safety.
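The first finding, steering at inference time without changing weights, can be illustrated with a small NumPy sketch. Everything in it is an assumption made for illustration: the page does not state the SAE architecture, so a TopK activation is used as a stand-in, the weights are random rather than loaded from a released checkpoint, and the names `topk_encode`, `decode`, `steer`, `feature_id`, and `strength` are hypothetical.

```python
import numpy as np

def topk_encode(x, W_enc, b_enc, k):
    """Map an activation x to sparse features, keeping at most k active (k >= 1).

    Assumption: a ReLU followed by TopK selection; the real SAEs may differ.
    """
    pre = np.maximum(W_enc @ x + b_enc, 0.0)
    if k < pre.size:
        # Indices of everything except the k largest entries get zeroed.
        drop = np.argpartition(pre, -k)[:-k]
        pre[drop] = 0.0
    return pre

def decode(f, W_dec, b_dec):
    """Reconstruct an activation from sparse features."""
    return W_dec @ f + b_dec

def steer(x, W_dec, feature_id, strength):
    """Shift activation x along one feature's unit-norm decoder direction.

    Only the activation changes; W_dec and the model weights are untouched.
    """
    direction = W_dec[:, feature_id]
    direction = direction / np.linalg.norm(direction)
    return x + strength * direction

# Toy dimensions and random weights, purely illustrative.
rng = np.random.default_rng(0)
d_model, n_features, k = 8, 32, 4
W_enc = rng.normal(size=(n_features, d_model))
b_enc = np.zeros(n_features)
W_dec = rng.normal(size=(d_model, n_features))
b_dec = np.zeros(d_model)

x = rng.normal(size=d_model)            # stand-in for a residual-stream activation
features = topk_encode(x, W_enc, b_enc, k)
x_hat = decode(features, W_dec, b_dec)  # SAE reconstruction of x
x_steered = steer(x, W_dec, feature_id=3, strength=2.5)
```

In a real setup the steered activation would be written back into the forward pass at the layer the SAE was trained on, for example through a forward hook; that step depends on the model framework and is left out here.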
Abstract
Large language models have achieved remarkable capabilities across diverse tasks, yet their internal decision-making processes remain largely opaque, limiting our ability to inspect, control, and systematically improve them. This opacity motivates a growing body of research in mechanistic interpretability, with sparse autoencoders (SAEs) emerging as one of the most promising tools for decomposing model activations into sparse, interpretable feature representations. We introduce Qwen-Scope, an open-source suite of SAEs built on the Qwen model family, comprising 14 groups of SAEs across 7 model variants from the Qwen3 and Qwen3.5 series, covering both dense and mixture-of-experts architectures. Built on top of these SAEs, we show that SAEs can go beyond post-hoc analysis to serve as practical interfaces for model development along four directions: (i) inference-time steering, where SAE…
Code & Models
- 🤗 Qwen/SAE-Res-Qwen3.5-2B-Base-W32K-L0_100 (model; 52 downloads, 4 likes)
- 🤗 Qwen/SAE-Res-Qwen3.5-27B-W80K-L0_50 (model; 47 downloads, 37 likes)
- 🤗 Qwen/SAE-Res-Qwen3-1.7B-Base-W32K-L0_50 (model; 69 downloads, 4 likes)
- 🤗 Qwen/SAE-Res-Qwen3-1.7B-Base-W32K-L0_100 (model; 29 downloads, 3 likes)
- 🤗 Qwen/SAE-Res-Qwen3-8B-Base-W64K-L0_50 (model; 130 downloads, 5 likes)
- 🤗 Qwen/SAE-Res-Qwen3-8B-Base-W64K-L0_100 (model; 50 downloads, 3 likes)
- 🤗 Qwen/SAE-Res-Qwen3.5-2B-Base-W32K-L0_50 (model; 165 downloads, 9 likes)
- 🤗 Qwen/SAE-Res-Qwen3.5-9B-Base-W64K-L0_50 (model; 41 downloads, 7 likes)
- 🤗 Qwen/SAE-Res-Qwen3.5-9B-Base-W64K-L0_100 (model; 72 downloads, 3 likes)
- 🤗 Qwen/SAE-Res-Qwen3.5-27B-W80K-L0_100 (model; 62 downloads, 13 likes)
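The repository names above follow a visible pattern, which a short helper can split into fields when choosing a checkpoint. This is a sketch under assumptions: the page does not document the naming scheme, so reading `W32K` as a dictionary-width label and `L0_50` as a sparsity setting is a guess based on the names alone, and `parse_sae_repo` is a hypothetical helper, not part of any released tooling.

```python
import re

# Pattern inferred from the listed names only; the scheme is not documented here.
NAME_RE = re.compile(
    r"^(?P<org>[^/]+)/SAE-Res-(?P<base>.+)-W(?P<width_k>\d+)K-L0_(?P<l0>\d+)$"
)

def parse_sae_repo(repo_id):
    """Split a repo id such as 'Qwen/SAE-Res-Qwen3-8B-Base-W64K-L0_50' into fields."""
    match = NAME_RE.match(repo_id)
    if match is None:
        raise ValueError(f"unrecognized SAE repo name: {repo_id!r}")
    return {
        "org": match.group("org"),
        "base_model": match.group("base"),
        # Kept as the literal 'K' label; the exact feature count is not stated.
        "width_k": int(match.group("width_k")),
        "l0": int(match.group("l0")),
    }

info = parse_sae_repo("Qwen/SAE-Res-Qwen3-8B-Base-W64K-L0_50")
```

Here `info["base_model"]` would be `"Qwen3-8B-Base"`, which makes it straightforward to group the listed checkpoints by base model or by sparsity setting.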
