Remote Contextual Bandits
Francesco Pase, Deniz Gunduz, Michele Zorzi

TL;DR
This paper studies remote contextual multi-armed bandit (CMAB) problems in which actions must be conveyed to the agents over a rate-limited channel. It analyzes the information-theoretic limits of this setting and bounds the regret achievable when policies are transmitted.
Contribution
It characterizes the fundamental communication rate thresholds for sub-linear regret in remote CMABs and analyzes the impact of policy compression on learning performance.
Findings
Identifies two rate regions: one in which regret is linear and one in which sub-linear regret is achievable.
Provides upper bounds on regret when policies are transmitted reliably without distortion.
Analyzes the trade-off between communication rate and learning performance.
Abstract
We consider a remote contextual multi-armed bandit (CMAB) problem, in which the decision-maker observes the context and the reward, but must communicate the actions to be taken by the agents over a rate-limited communication channel. This can model, for example, a personalized ad placement application, where the content owner observes the individual visitors to its website, and hence has the context information, but must convey the ads that must be shown to each visitor to a separate entity that manages the marketing content. In this remote CMAB (R-CMAB) problem, the constraint on the communication rate between the decision-maker and the agents imposes a trade-off between the number of bits sent per agent and the acquired average reward. We are particularly interested in characterizing the rate required to achieve sub-linear regret. Consequently, this can be considered as a policy…
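The trade-off between bits sent per agent and reward can be made concrete with a toy calculation. The sketch below is illustrative only and is not taken from the paper: it assumes a finite context set, a hypothetical helper named mutual_information_bits, and made-up distributions. It compares the naive cost of naming one of K actions, log2(K) bits, with the mutual information I(S;A) between context and action under a given policy, a standard information-theoretic proxy for how many bits per agent the actions carry about the contexts.

```python
import math


def mutual_information_bits(p_context, policy):
    """Return I(S;A) in bits.

    p_context[s] is the probability of context s (assumed known here).
    policy[s][a] is the probability of choosing action a in context s.
    Both are hypothetical toy inputs, not quantities from the paper.
    """
    n_actions = len(policy[0])
    # Marginal distribution over actions induced by the policy.
    p_action = [
        sum(p_context[s] * policy[s][a] for s in range(len(p_context)))
        for a in range(n_actions)
    ]
    mi = 0.0
    for s, ps in enumerate(p_context):
        for a in range(n_actions):
            joint = ps * policy[s][a]
            if joint > 0:
                mi += joint * math.log2(policy[s][a] / p_action[a])
    return mi


# Two equally likely contexts, four actions (toy numbers).
p_context = [0.5, 0.5]
deterministic = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
uniform = [[0.25] * 4, [0.25] * 4]

naive_bits = math.log2(4)  # cost of naming one of 4 actions outright
mi_deterministic = mutual_information_bits(p_context, deterministic)
mi_uniform = mutual_information_bits(p_context, uniform)
```

In this toy setting the deterministic policy carries 1 bit per agent about the context, half the naive 2 bits, while the context-independent uniform policy carries 0 bits. Whether and how such a quantity governs the regret thresholds is what the paper itself analyzes.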
Taxonomy
Topics: Advanced Bandit Algorithms Research · Cognitive Radio Networks and Spectrum Sensing · Smart Grid Energy Management
