From Insight to Intervention: Interpretable Neuron Steering for Controlling Popularity Bias in Recommender Systems

Parviz Ahmadov; Masoud Mansoury

arXiv:2601.15122 · cs.IR · January 29, 2026

Open Access

TL;DR

This paper introduces PopSteer, a post-hoc method using a Sparse Autoencoder to interpret and mitigate popularity bias in recommender systems, improving fairness with minimal accuracy loss.

Contribution

The paper presents a novel, interpretable, neuron-level steering approach to mitigating popularity bias in recommender systems, making the mitigation more transparent and controllable.

Findings

1. Significantly improves fairness in recommendations.
2. Maintains recommendation accuracy with minimal impact.
3. Provides interpretable insights into bias mechanisms.

Abstract

Popularity bias is a pervasive challenge in recommender systems, where a few popular items dominate attention while the majority of less popular items remain underexposed. This imbalance can reduce recommendation quality and lead to unfair item exposure. Although existing mitigation methods address this issue to some extent, they often lack transparency in how they operate. In this paper, we propose a post-hoc approach, PopSteer, that leverages a Sparse Autoencoder (SAE) to both interpret and mitigate popularity bias in recommendation models. The SAE is trained to replicate a trained model's behavior while enabling neuron-level interpretability. By introducing synthetic users with strong preferences for either popular or unpopular items, we identify neurons encoding popularity signals through their activation patterns. We then steer recommendations by adjusting the activations of the…
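
The pipeline the abstract describes (encode representations with an SAE, contrast synthetic popular-leaning and unpopular-leaning users, then adjust the neurons that separate them) can be illustrated with a toy sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the random, untrained weights, the standardized mean-difference score, the multiplicative damping rule, and all names (`sae_encode`, `popularity_scores`, `steer`, `top_k`, `alpha`).

```python
import numpy as np

def sae_encode(x, W_enc, b_enc):
    """ReLU encoder of a toy sparse autoencoder (hypothetical layout)."""
    return np.maximum(x @ W_enc + b_enc, 0.0)

def sae_decode(h, W_dec, b_dec):
    """Linear decoder mapping SAE activations back to the model's space."""
    return h @ W_dec + b_dec

def popularity_scores(h_pop, h_unpop, eps=1e-8):
    """Per-neuron standardized mean difference between the two user groups.

    This scoring rule is an assumption; positive values mean the neuron
    fires more strongly for popular-leaning synthetic users.
    """
    diff = h_pop.mean(axis=0) - h_unpop.mean(axis=0)
    pooled = np.sqrt(0.5 * (h_pop.var(axis=0) + h_unpop.var(axis=0))) + eps
    return diff / pooled

def steer(h, scores, top_k=2, alpha=0.5):
    """Scale down the top_k most popularity-aligned neurons by (1 - alpha).

    The damping rule is an assumed stand-in for the paper's adjustment.
    Returns the steered activations and the indices that were changed.
    """
    idx = np.argsort(scores)[-top_k:]
    steered = h.copy()
    steered[:, idx] *= (1.0 - alpha)
    return steered, idx

# Toy setup: 8-dim user representations, 16 SAE neurons, random weights.
rng = np.random.default_rng(0)
d_model, d_sae, n_users = 8, 16, 32
W_enc = rng.normal(size=(d_model, d_sae))
b_enc = np.zeros(d_sae)
W_dec = rng.normal(size=(d_sae, d_model))
b_dec = np.zeros(d_model)

# Synthetic users: shift a shared base distribution along a made-up
# "popularity" direction, in opposite senses for the two groups.
direction = rng.normal(size=d_model)
pop_users = rng.normal(size=(n_users, d_model)) + direction
unpop_users = rng.normal(size=(n_users, d_model)) - direction

h_pop = sae_encode(pop_users, W_enc, b_enc)
h_unpop = sae_encode(unpop_users, W_enc, b_enc)
scores = popularity_scores(h_pop, h_unpop)

# Steer a batch of real-looking users and decode back to model space.
users = rng.normal(size=(4, d_model))
h = sae_encode(users, W_enc, b_enc)
h_steered, changed = steer(h, scores, top_k=2, alpha=0.5)
steered_repr = sae_decode(h_steered, W_dec, b_dec)
```

In this sketch `alpha` plays the role of a steering-strength knob: `alpha=0` leaves activations untouched and `alpha=1` silences the selected neurons entirely. A trained SAE and the paper's actual selection and adjustment rules would replace the random weights and the assumed formulas.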

Taxonomy

Topics: Recommender Systems and Techniques · Explainable Artificial Intelligence (XAI) · Sentiment Analysis and Opinion Mining