Budgeted Recommendation with Delayed Feedback

Kweiguu Liu; Setareh Maghsudi

arXiv:2405.11417 · cs.LG · May 21, 2024

Open Access

TL;DR

This paper addresses the challenge of resource-constrained decision-making in contextual bandits with delayed feedback, proposing a new policy to optimize resource use despite delays and limited budgets.

Contribution

The paper introduces DORAL, a policy for constrained contextual bandits with delayed feedback, aimed at improving resource allocation when rewards arrive late and the budget is limited.

Findings

01. DORAL effectively manages delayed feedback in resource-limited settings.

02. The policy improves decision accuracy despite feedback delays.

03. Application to COVID-19 resource distribution demonstrates practical benefits.

Abstract

In a conventional contextual multi-armed bandit problem, the feedback (or reward) is immediately observable after an action. Nevertheless, delayed feedback arises in numerous real-life situations and is particularly crucial in time-sensitive applications. The exploration-exploitation dilemma becomes particularly challenging under such conditions, as it couples with the interplay between delays and limited resources. Besides, a limited budget often aggravates the problem by restricting the exploration potential. A motivating example is the distribution of medical supplies at the early stage of COVID-19. The delayed feedback of testing results, thus insufficient information for learning, degraded the efficiency of resource allocation. Motivated by such applications, we study the effect of delayed feedback on constrained contextual bandits. We develop a decision-making policy,…
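The setting described in the abstract (contexts arrive, acting consumes a limited budget, and rewards are observed only after a delay) can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not DORAL: the epsilon-greedy threshold rule, the fixed delay, the unit cost per action, and all names and parameter values (`simulate`, `true_mean`, `epsilon`, and so on) are hypothetical choices made for illustration only.

```python
import random
from collections import deque


def simulate(horizon=200, budget=50, delay=5,
             true_mean=(0.2, 0.5, 0.8), epsilon=0.1, seed=0):
    """Toy budgeted contextual bandit with a fixed feedback delay.

    Illustrative only: a simple epsilon-greedy rule, NOT the DORAL policy.
    """
    rng = random.Random(seed)
    n_contexts = len(true_mean)
    pulls = [0] * n_contexts   # feedback items observed so far, per context
    wins = [0] * n_contexts    # rewards observed so far, per context
    pending = deque()          # (arrival_round, context, reward), FIFO
    spent = 0
    total_reward = 0

    for t in range(horizon):
        # Deliver feedback whose delay has elapsed by round t.
        while pending and pending[0][0] <= t:
            _, c, r = pending.popleft()
            pulls[c] += 1
            wins[c] += r

        if spent >= budget:
            continue  # budget exhausted: the learner can only skip

        context = rng.randrange(n_contexts)
        # Optimistic estimate (1.0) until some feedback has arrived.
        est = wins[context] / pulls[context] if pulls[context] else 1.0
        act = rng.random() < epsilon or est >= 0.5
        if act:
            spent += 1  # assumed unit cost per action
            reward = 1 if rng.random() < true_mean[context] else 0
            total_reward += reward
            # The learner sees this reward only `delay` rounds later.
            pending.append((t + delay, context, reward))

    return {"spent": spent, "reward": total_reward,
            "observed": sum(pulls), "unobserved": len(pending)}
```

Calling `simulate(delay=0)` and `simulate(delay=50)` with the same seed gives a rough feel for how late feedback lets the optimistic default spend budget on low-reward contexts before any estimate is available.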

Taxonomy

Topics: Consumer Market Behavior and Pricing · Forecasting Techniques and Applications · Advanced Bandit Algorithms Research