Time-Constrained Recommendations: Reinforcement Learning Strategies for E-Commerce
Sayak Chakrabarty, Souradip Pal

TL;DR
This paper explores reinforcement learning approaches to optimize e-commerce recommendations within users' limited time budgets, balancing relevance and evaluation costs to enhance engagement.
Contribution
It introduces a unified MDP formulation for time-constrained slate recommendation, a simulation framework, and empirical evidence of RL effectiveness over traditional methods.
Findings
RL methods outperform bandit-based approaches under tight time constraints.
Simulation framework enables studying policy behavior in re-ranking scenarios.
Unified MDP formulation captures resource-aware recommendation dynamics.
Abstract
Unlike traditional recommendation tasks, finite user time budgets introduce a critical resource constraint, requiring the recommender system to balance item relevance against evaluation cost. For example, in a mobile shopping interface, users interact with recommendations by scrolling, where each scroll triggers a list of items called a slate. Users incur an evaluation cost: the time spent assessing an item's features before deciding whether to click. Highly relevant items with higher evaluation costs may not fit within the user's time budget, which affects engagement. In this position paper, our objective is to evaluate reinforcement learning algorithms that learn patterns in user preferences and time budgets simultaneously, crafting recommendations with higher engagement potential under resource constraints. Our experiments explore the use of reinforcement learning to recommend items for users using…
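To make the trade-off described in the abstract concrete, here is a minimal Python sketch of a user consuming one slate under a time budget. It is not the paper's MDP or simulation framework: the `Item` fields, the rule that the user abandons the slate at the first item that no longer fits the remaining budget, and both ranking helpers are hypothetical assumptions chosen only for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Item:
    relevance: float  # assumed: click probability once the item is evaluated
    cost: float       # assumed: time units needed to evaluate the item


def expected_clicks(slate, budget):
    """Expected clicks when the user scans the slate top to bottom.

    Assumption: every evaluated item consumes its cost whether or not it is
    clicked, and the user stops at the first item that exceeds the budget.
    """
    total = 0.0
    for item in slate:
        if item.cost > budget:
            break  # out of time: the rest of the slate is never seen
        budget -= item.cost
        total += item.relevance
    return total


def simulate_slate(slate, budget, rng):
    """One stochastic pass over the slate; returns (clicks, remaining budget)."""
    clicks = 0
    for item in slate:
        if item.cost > budget:
            break
        budget -= item.cost
        if rng.random() < item.relevance:
            clicks += 1
    return clicks, budget


def rank_by_relevance(items):
    """Hypothetical baseline: most relevant first, ignoring evaluation cost."""
    return sorted(items, key=lambda it: it.relevance, reverse=True)


def rank_by_relevance_per_cost(items):
    """Hypothetical budget-aware heuristic: relevance per unit of time first."""
    return sorted(items, key=lambda it: it.relevance / it.cost, reverse=True)


if __name__ == "__main__":
    catalog = [Item(0.9, 9.0), Item(0.5, 1.0), Item(0.4, 1.0)]
    budget = 3.0
    print(expected_clicks(rank_by_relevance(catalog), budget))
    print(expected_clicks(rank_by_relevance_per_cost(catalog), budget))
    print(simulate_slate(rank_by_relevance_per_cost(catalog), budget, random.Random(0)))
```

In this toy setting, the most relevant item costs more than the whole budget, so a relevance-only ranking places it first and the user never reaches the cheaper items behind it. A policy that accounts for cost avoids that failure, which is the kind of behavior the abstract argues a learned recommender should capture.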
