Fast Slate Policy Optimization: Going Beyond Plackett-Luce

Otmane Sakhi; David Rohde; Nicolas Chopin

arXiv:2308.01566 · cs.LG · January 1, 2024 · 1 citation

Open Access

TL;DR

This paper introduces a new policy optimization method for large-scale slate decision systems, enabling efficient learning in massive action spaces beyond traditional models like Plackett-Luce.

Contribution

It proposes a novel relaxation of decision functions leading to a scalable and effective policy optimization algorithm for large action spaces.

Findings

1. Outperforms Plackett-Luce in large action space scenarios

2. Scales efficiently to millions of actions

3. Demonstrates improved reward optimization

Abstract

An increasingly important building block of large-scale machine learning systems is returning slates: ordered lists of items given a query. Applications of this technology include search, information retrieval, and recommender systems. When the action space is large, decision systems are restricted to a particular structure to complete online queries quickly. This paper addresses the optimization of these large-scale decision systems given an arbitrary reward function. We cast this learning problem in a policy optimization framework and propose a new class of policies, born from a novel relaxation of decision functions. This results in a simple yet efficient learning algorithm that scales to massive action spaces. We compare our method to the commonly adopted Plackett-Luce policy class and demonstrate the effectiveness of our approach on problems with action space sizes in…
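For readers unfamiliar with the baseline named in the abstract, here is a minimal sketch of a Plackett-Luce slate policy. It illustrates only the standard Plackett-Luce model, not the new policy class the paper proposes. The function names `sample_slate` and `slate_log_prob` and the plain list of per-item scores are assumptions made for this illustration. Sampling uses the Gumbel top-k trick, a well-known way to draw an ordered sample without replacement from this distribution.

```python
import math
import random


def sample_slate(scores, k, rng):
    """Draw an ordered slate of k distinct items from a Plackett-Luce policy.

    Gumbel top-k trick: add independent Gumbel(0, 1) noise to every score
    and keep the k items with the largest perturbed scores, in that order.
    """
    perturbed = []
    for item, score in enumerate(scores):
        # -log(Exponential(1)) is a Gumbel(0, 1) draw; the max() guards log(0).
        e = max(rng.expovariate(1.0), 1e-300)
        perturbed.append((score - math.log(e), item))
    perturbed.sort(reverse=True)
    return [item for _, item in perturbed[:k]]


def slate_log_prob(scores, slate):
    """Log-probability of an ordered slate under Plackett-Luce.

    Each position is a softmax over the items not yet placed, so the
    log-probability is a sum of (score - logsumexp(remaining scores)).
    """
    remaining = set(range(len(scores)))
    logp = 0.0
    for item in slate:
        m = max(scores[j] for j in remaining)  # for numerical stability
        log_z = m + math.log(sum(math.exp(scores[j] - m) for j in remaining))
        logp += scores[item] - log_z
        remaining.remove(item)
    return logp


if __name__ == "__main__":
    rng = random.Random(0)
    scores = [0.5, -1.0, 2.0, 0.0]  # hypothetical per-item scores
    slate = sample_slate(scores, 3, rng)
    print(slate, slate_log_prob(scores, slate))
```

In this sketch, each position of the log-probability renormalizes over all remaining items, so evaluating it costs on the order of the slate length times the number of items. That cost is one reason this policy class becomes expensive to train when the action space is large, which is the regime the abstract targets.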

Taxonomy

Topics: Data Quality and Management · Machine Learning and Algorithms · Machine Learning and Data Classification