Multi-Objective Recommendation via Multivariate Policy Learning

Olivier Jeunen; Jatin Mandav; Ivan Potapov; Nakul Agarwal; Sourabh Vaid; Wenzhe Shi; Aleksei Ustimenko

arXiv:2405.02141·cs.IR·September 17, 2024

TL;DR

This paper introduces a multivariate policy learning framework for multi-objective recommender systems. It treats the scalarisation weights as actions and learns a policy over them to maximise an overall North Star reward, such as long-term user retention or growth.

Contribution

It extends existing policy learning methods to the continuous multivariate action domain. The learnt policy is trained to maximise a pessimistic lower bound on the North Star reward it will yield, with correction techniques to improve the optimisation.
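To make the idea of a pessimistic lower bound concrete, here is a minimal sketch of off-policy evaluation over continuous weight vectors. It rests on assumptions that this summary does not confirm: an isotropic Gaussian logging policy, an importance-weighted estimate of the North Star reward, and a one-sided normal-approximation confidence bound. The function names, the synthetic data, and the value of `z` are all hypothetical, and the paper's correction techniques are not reproduced.

```python
import numpy as np


def gaussian_logpdf(actions, mean, std):
    """Log-density of an isotropic Gaussian, evaluated per row of `actions`."""
    d = actions.shape[1]
    sq_dist = np.sum((actions - mean) ** 2, axis=1)
    return -0.5 * sq_dist / std**2 - d * np.log(std) - 0.5 * d * np.log(2 * np.pi)


def pessimistic_value(actions, rewards, log_mean, log_std, tgt_mean, tgt_std, z=1.645):
    """Return (importance-weighted estimate, lower confidence bound).

    Assumption: the bound is estimate - z * standard error, a plain
    normal approximation rather than the paper's exact construction.
    """
    log_w = (gaussian_logpdf(actions, tgt_mean, tgt_std)
             - gaussian_logpdf(actions, log_mean, log_std))
    weighted = np.exp(log_w) * rewards
    estimate = weighted.mean()
    std_err = weighted.std(ddof=1) / np.sqrt(len(rewards))
    return estimate, estimate - z * std_err


# Synthetic logged data (hypothetical): 3 scalarisation weights per decision.
rng = np.random.default_rng(0)
log_mean = np.array([0.5, 0.3, 0.2])
log_std = 0.1
actions = rng.normal(log_mean, log_std, size=(1000, 3))
best = np.array([0.4, 0.4, 0.2])  # made-up optimum of the reward surface
rewards = 1.0 - np.sum((actions - best) ** 2, axis=1) + rng.normal(0, 0.05, size=1000)

# Evaluating the logging policy itself: all importance weights equal 1.
est_same, lb_same = pessimistic_value(actions, rewards, log_mean, log_std, log_mean, log_std)

# Evaluating a shifted candidate policy: the bound penalises high-variance weights.
est_shift, lb_shift = pessimistic_value(
    actions, rewards, log_mean, log_std, log_mean + 0.05, log_std
)
print(est_same, lb_same, est_shift, lb_shift)
```

A learner built on this sketch would choose the candidate policy with the highest lower bound rather than the highest point estimate, which discourages moving far from the region the logged data covers.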

Findings

01. Effective in simulations, offline, and online experiments
02. Improves balancing multiple objectives in recommender systems
03. Enhances long-term user engagement and fairness

Abstract

Real-world recommender systems often need to balance multiple objectives when deciding which recommendations to present to users. These include behavioural signals (e.g. clicks, shares, dwell time), as well as broader objectives (e.g. diversity, fairness). Scalarisation methods are commonly used to handle this balancing task, where a weighted average of per-objective reward signals determines the final score used for ranking. Naturally, exactly how these weights are computed is key to success for any online platform. We frame this as a decision-making task, where the scalarisation weights are actions taken to maximise an overall North Star reward (e.g. long-term user retention or growth). We extend existing policy learning methods to the continuous multivariate action domain, proposing to maximise a pessimistic lower bound on the North Star reward that the learnt policy will yield.…
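The scalarisation step described in the abstract can be sketched as follows. This is an illustrative example only: the three objectives, the item scores, and the two weight vectors are made up, and the helper names are hypothetical rather than taken from the paper or its repository.

```python
import numpy as np


def scalarised_scores(objective_scores, weights):
    """Weighted combination of per-objective scores.

    objective_scores: array of shape (n_items, n_objectives)
    weights:          array of shape (n_objectives,)
    """
    objective_scores = np.asarray(objective_scores, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return objective_scores @ weights


def rank_items(objective_scores, weights):
    """Item indices sorted by descending scalarised score."""
    scores = scalarised_scores(objective_scores, weights)
    return np.argsort(-scores, kind="stable")


# Hypothetical per-item predictions for (click, share, diversity).
items = np.array([
    [0.9, 0.1, 0.2],
    [0.4, 0.6, 0.5],
    [0.2, 0.2, 0.9],
])

click_heavy = np.array([0.8, 0.1, 0.1])
diversity_heavy = np.array([0.1, 0.1, 0.8])

print(rank_items(items, click_heavy))      # item order 0, 1, 2
print(rank_items(items, diversity_heavy))  # item order 2, 1, 0
```

The same candidate items produce opposite rankings under the two weight vectors, which is why the abstract treats the choice of weights as the decision to be optimised.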

Code & Models

Repositories

olivierjeunen/multivariate-policy-learning-recsys-2024 (Official)

Taxonomy

Topics: Advanced Text Analysis Techniques