Sparse Autoencoders for Sequential Recommendation Models: Interpretation and Flexible Control

Anton Klenitskiy; Konstantin Polev; Daria Denisova; Alexey Vasilev; Dmitry Simakov; Gleb Gusev

arXiv:2507.12202 · cs.IR · February 18, 2026

Open Access

TL;DR

This paper extends sparse autoencoders (SAE) to transformer-based sequential recommendation models, providing a way to interpret their internal representations and to control their recommendations flexibly.

Contribution

It introduces a framework for interpreting and controlling transformer-based sequential recommenders using sparse autoencoders, improving transparency and adaptability.

Findings

01. The directions learned by the SAE without supervision are more interpretable and monosemantic.

02. The approach allows effective and flexible control of model behavior.

03. The SAE approach can be applied successfully to a transformer trained on a sequential recommendation task.

Abstract

Many current state-of-the-art models for sequential recommendations are based on transformer architectures. Interpretation and explanation of such black box models is an important research question, as a better understanding of their internals can help understand, influence, and control their behavior, which is very important in a variety of real-world applications. Recently, sparse autoencoders (SAE) have been shown to be a promising unsupervised approach to extract interpretable features from neural networks. In this work, we extend SAE to sequential recommender systems and propose a framework for interpreting and controlling model representations. We show that this approach can be successfully applied to the transformer trained on a sequential recommendation task: directions learned in such an unsupervised regime turn out to be more interpretable and monosemantic than the original…
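
To make the idea in the abstract concrete, here is a minimal sketch of a sparse autoencoder applied to a single hidden-state vector, written in plain Python. It assumes the generic SAE formulation common in the interpretability literature (ReLU encoder, linear decoder, reconstruction loss plus an L1 sparsity penalty), which may differ from the authors' exact architecture. The class `SparseAutoencoder`, the helper `steer`, and all parameter names are hypothetical. Steering by adding a decoder direction to the hidden state is one common way to use SAE features for control, and is assumed here rather than taken from the paper.

```python
import random


def matvec(matrix, vector):
    """Multiply a row-major matrix by a vector."""
    return [sum(w * a for w, a in zip(row, vector)) for row in matrix]


class SparseAutoencoder:
    """Toy SAE (hypothetical): f = ReLU(W_enc (h - b_dec) + b_enc), h_hat = W_dec^T f + b_dec."""

    def __init__(self, d_model, n_features, seed=0):
        rng = random.Random(seed)
        # Untrained random weights; a real SAE would be fit on many hidden states.
        self.w_enc = [[rng.gauss(0, 0.1) for _ in range(d_model)] for _ in range(n_features)]
        self.b_enc = [0.0] * n_features
        # w_dec[j] is the direction in hidden-state space associated with feature j.
        self.w_dec = [[rng.gauss(0, 0.1) for _ in range(d_model)] for _ in range(n_features)]
        self.b_dec = [0.0] * d_model

    def encode(self, h):
        """Map a hidden state to non-negative feature activations."""
        centered = [a - b for a, b in zip(h, self.b_dec)]
        pre = matvec(self.w_enc, centered)
        return [max(0.0, p + b) for p, b in zip(pre, self.b_enc)]

    def decode(self, f):
        """Reconstruct the hidden state as a weighted sum of decoder directions."""
        out = list(self.b_dec)
        for act, direction in zip(f, self.w_dec):
            for i, d in enumerate(direction):
                out[i] += act * d
        return out

    def loss(self, h, l1_coef=1e-3):
        """Mean squared reconstruction error plus an L1 sparsity penalty."""
        f = self.encode(h)
        h_hat = self.decode(f)
        mse = sum((a - b) ** 2 for a, b in zip(h, h_hat)) / len(h)
        return mse + l1_coef * sum(f)  # activations are >= 0, so sum(f) is the L1 norm


def steer(h, sae, feature_idx, alpha):
    """Shift a hidden state along one feature's decoder direction (assumed control scheme)."""
    return [a + alpha * d for a, d in zip(h, sae.w_dec[feature_idx])]


if __name__ == "__main__":
    sae = SparseAutoencoder(d_model=4, n_features=8)
    hidden = [0.5, -1.0, 0.25, 2.0]  # stand-in for one transformer hidden state
    print("features:", sae.encode(hidden))
    print("loss:", sae.loss(hidden))
    print("steered:", steer(hidden, sae, feature_idx=3, alpha=2.0))
```

In practice the hidden states would come from a layer of the trained recommender, and the steered vector would be passed on through the remaining layers to see how the recommendations change.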

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Recommender Systems and Techniques · Generative Adversarial Networks and Image Synthesis