Tree SAE: Learning Hierarchical Feature Structures in Sparse Autoencoders

Tue M. Cao; Hoang X. Nhat; Raed Alharbi; Phi Le Nguyen; My T. Thai

arXiv:2605.07922 · cs.LG · May 12, 2026

TL;DR

Tree SAE is a sparse autoencoder that learns hierarchical feature structures by combining an activation constraint with a new reconstruction constraint, so that parent and child features are linked functionally and not only by co-activation.

Contribution

The paper introduces Tree SAE, a method that learns parent-child relationships directly within the feature set by enforcing both an activation condition and a reconstruction condition.

Findings

01

Tree SAE outperforms existing SAEs in learning hierarchical feature pairs.

02

Tree SAE maintains competitive performance on key benchmarks.

03

Tree SAE demonstrates utility in analyzing hierarchical concepts in language models.

Abstract

Learning hierarchical features in Sparse Autoencoders (SAEs) is essential for capturing the structured nature of real-world data and mitigating issues such as feature absorption and feature splitting. Existing works attempt to identify hierarchical relationships within independent feature sets by relying on activation coverage: the assumption that a child feature should activate only when its parent feature activates. However, we demonstrate that this condition alone is insufficient; it often produces false positives in which the parent and child concepts are semantically unrelated. To address this, we introduce a novel reconstruction condition that enforces a deeper functional link between hierarchical levels. By combining both activation and reconstruction constraints, we propose the Tree SAE, a model designed to learn hierarchical structures directly from within the feature set. Our results…
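
The activation-coverage assumption described in the abstract can be illustrated with a minimal sketch. The function name `activation_coverage`, the activity threshold, and the toy activation values are illustrative assumptions and are not taken from the paper. The reconstruction condition is not specified in the abstract, so it is not sketched here.

```python
from typing import Sequence


def activation_coverage(
    parent: Sequence[float],
    child: Sequence[float],
    threshold: float = 0.0,
) -> float:
    """Fraction of samples with an active child on which the parent is also active.

    A value of 1.0 means the child never fires without the parent.
    The `threshold` for "active" is an assumption of this sketch.
    """
    if len(parent) != len(child):
        raise ValueError("parent and child must cover the same samples")
    child_active = [c > threshold for c in child]
    n_child = sum(child_active)
    if n_child == 0:
        # Coverage is undefined for a child that never fires.
        return float("nan")
    both = sum(1 for p, c in zip(parent, child_active) if c and p > threshold)
    return both / n_child


# Toy activations over six samples (made-up numbers, not from the paper).
animal = [0.9, 0.7, 0.8, 0.0, 0.0, 0.0]  # hypothetical parent concept
dog = [0.5, 0.0, 0.6, 0.0, 0.0, 0.0]     # hypothetical child concept
dense = [0.1, 0.2, 0.1, 0.3, 0.1, 0.2]   # unrelated feature that fires on every sample

cov_related = activation_coverage(animal, dog)    # plausible parent-child pair
cov_unrelated = activation_coverage(dense, dog)   # unrelated pair
cov_reverse = activation_coverage(dog, animal)    # roles swapped
```

In this toy setup the related pair and the pair with the always-active feature both reach full coverage, which is one simple way a false positive of the kind mentioned in the abstract can arise when activation coverage is the only criterion. Swapping the roles of parent and child lowers the coverage, since the broader feature fires on samples where the narrower one does not.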
