Epsilon non-Greedy: A Bandit Approach for Unbiased Recommendation via Uniform Data

S.M.F. Sani; Seyed Abbas Hosseini; Hamid R. Rabiee

arXiv:2310.04855·cs.LG·October 10, 2023

Open Access

TL;DR

This paper introduces a bandit-based framework for unbiased recommendation systems that mitigates self-feedback bias by leveraging uniform data collection and sequential training, improving over existing debiasing methods.

Contribution

It proposes a novel offline sequential training schema and a bandit approach to reduce bias in recommendation systems, addressing the feedback loop issue.

Findings

01 Outperforms state-of-the-art debiasing methods in experiments

02 Effectively explores items the model understands poorly, improving recommendations

03 Simulates real-world continuous training scenarios

Abstract

Recommendation systems often employ continuous training, which leads to a self-feedback loop bias: the system becomes biased toward its own previous recommendations. Recent studies have attempted to mitigate this bias by collecting small amounts of unbiased data. While these studies have successfully developed less biased models, they ignore the crucial fact that the recommendations generated by the model serve as the training data for subsequent training sessions. To address this issue, we propose a framework that learns an unbiased estimator using a small amount of uniformly collected data and focuses on generating improved training data for subsequent training iterations. To accomplish this, we view recommendation as a contextual multi-armed bandit problem and emphasize exploring items that the model has a limited understanding of. We introduce a new offline sequential training…
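
The abstract's idea of treating recommendation as a contextual bandit and spending some exploration on items the model understands poorly can be illustrated with a small sketch. This is not the authors' algorithm, which the truncated abstract does not specify. It assumes a per-item ridge-regression reward model, uses the standard linear-bandit confidence width as a stand-in for "limited understanding", and all names (`make_models`, `item_scores`, `choose_item`, `update`, `epsilon`) are hypothetical.

```python
import numpy as np


def make_models(n_items, dim):
    """One ridge model per item: (inverse design matrix, reward-weighted context sum)."""
    return [(np.eye(dim), np.zeros(dim)) for _ in range(n_items)]


def item_scores(A_inv, b, x):
    """Return (estimated reward, uncertainty) of one item for context x."""
    theta = A_inv @ b                              # ridge estimate of item weights
    mean = float(theta @ x)
    uncertainty = float(np.sqrt(x @ A_inv @ x))    # confidence width for this context
    return mean, uncertainty


def choose_item(models, x, epsilon, rng):
    """With probability epsilon pick the MOST UNCERTAIN item (assumed exploration
    rule, not taken from the paper); otherwise pick the greedy item."""
    scores = [item_scores(A_inv, b, x) for A_inv, b in models]
    means = np.array([m for m, _ in scores])
    widths = np.array([u for _, u in scores])
    if rng.random() < epsilon:
        return int(np.argmax(widths))
    return int(np.argmax(means))


def update(models, item, x, reward):
    """Fold one (context, reward) observation into the chosen item's model."""
    A_inv, b = models[item]
    Ax = A_inv @ x
    # Sherman-Morrison update for the inverse of (A + x x^T)
    A_inv = A_inv - np.outer(Ax, Ax) / (1.0 + x @ Ax)
    models[item] = (A_inv, b + reward * x)


models = make_models(n_items=3, dim=2)
x = np.array([1.0, 0.0])
update(models, 0, x, reward=1.0)   # item 0 has been observed once
rng = np.random.default_rng(0)
greedy_pick = choose_item(models, x, epsilon=0.0, rng=rng)
explore_pick = choose_item(models, x, epsilon=1.0, rng=rng)
```

In this toy run the greedy choice is the item already observed with a positive reward, while the exploratory choice is an item with no observations yet, since its confidence width is larger. Unlike classic epsilon-greedy, the exploration step here targets uncertainty rather than sampling uniformly; that design choice is an assumption made for illustration.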

Taxonomy

Topics: Advanced Bandit Algorithms Research · Smart Grid Energy Management · Recommender Systems and Techniques