Recurrent Neural-Linear Posterior Sampling for Nonstationary Contextual Bandits
Aditya Ramesh, Paulo Rauber, Michelangelo Conserva, Jürgen Schmidhuber

TL;DR
This paper introduces a recurrent neural network-based method for nonstationary contextual bandits that learns relevant historical context directly from raw interaction data, outperforming traditional handcrafted approaches.
Contribution
It proposes a novel combination of recurrent neural networks with posterior sampling for nonstationary bandits, addressing limitations of handcrafted historical contexts.
Findings
The recurrent approach outperforms feedforward models on diverse nonstationary problems.
The method is more widely applicable than traditional nonstationary bandit algorithms.
The paper provides a new regret bound for linear posterior sampling with measurement error.
Abstract
An agent in a nonstationary contextual bandit problem should balance between exploration and the exploitation of (periodic or structured) patterns present in its previous experiences. Handcrafting an appropriate historical context is an attractive alternative to transform a nonstationary problem into a stationary problem that can be solved efficiently. However, even a carefully designed historical context may introduce spurious relationships or lack a convenient representation of crucial information. In order to address these issues, we propose an approach that learns to represent the relevant context for a decision based solely on the raw history of interactions between the agent and the environment. This approach relies on a combination of features extracted by recurrent neural networks with a contextual linear bandit algorithm based on posterior sampling. Our experiments on a diverse…
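The combination described in the abstract, recurrent features feeding a linear posterior-sampling bandit, can be illustrated with a minimal sketch. This is not the authors' implementation: the class names `RecurrentFeatures` and `LinearPosteriorSampling`, the helper `run_demo`, the toy two-armed environment, and all hyperparameters are hypothetical. In particular, the recurrent network here is an untrained random tanh RNN used as a stand-in, whereas the abstract describes an approach that learns its representation from the raw history.

```python
import numpy as np


class RecurrentFeatures:
    """Untrained random tanh RNN summarising the raw interaction history.

    Assumption: a stand-in for a learned recurrent feature extractor.
    """

    def __init__(self, n_arms, hidden=8, seed=0):
        rng = np.random.default_rng(seed)
        self.n_arms = n_arms
        self.W = rng.normal(scale=0.5, size=(hidden, hidden))
        self.U = rng.normal(scale=0.5, size=(hidden, n_arms + 1))
        self.h = np.zeros(hidden)

    def step(self, arm, reward):
        # Input is a one-hot encoding of the chosen arm plus the observed reward.
        x = np.zeros(self.n_arms + 1)
        x[arm] = 1.0
        x[-1] = reward
        self.h = np.tanh(self.W @ self.h + self.U @ x)

    def features(self):
        # Hidden state with a constant bias feature appended.
        return np.append(self.h, 1.0)


class LinearPosteriorSampling:
    """Per-arm Bayesian linear regression with Thompson sampling."""

    def __init__(self, n_arms, dim, prior_var=1.0, noise_var=1.0, seed=0):
        self.rng = np.random.default_rng(seed)
        self.noise_var = noise_var
        # Posterior precision matrix and precision-weighted mean for each arm.
        self.precision = np.stack([np.eye(dim) / prior_var for _ in range(n_arms)])
        self.b = np.zeros((n_arms, dim))

    def posterior_mean(self, arm):
        return np.linalg.solve(self.precision[arm], self.b[arm])

    def select(self, features):
        scores = []
        for A, b in zip(self.precision, self.b):
            cov = np.linalg.inv(A)
            cov = (cov + cov.T) / 2.0  # guard against numerical asymmetry
            w = self.rng.multivariate_normal(cov @ b, cov)  # sample a weight vector
            scores.append(w @ features)
        return int(np.argmax(scores))

    def update(self, arm, features, reward):
        self.precision[arm] += np.outer(features, features) / self.noise_var
        self.b[arm] += features * reward / self.noise_var


def run_demo(steps=200, period=20, seed=0):
    """Toy two-armed bandit whose better arm swaps every `period` steps."""
    rng = np.random.default_rng(seed)
    encoder = RecurrentFeatures(n_arms=2, seed=seed)
    agent = LinearPosteriorSampling(n_arms=2, dim=encoder.features().size, seed=seed)
    total = 0.0
    for t in range(steps):
        phi = encoder.features()          # context built only from past interactions
        arm = agent.select(phi)
        best = (t // period) % 2
        reward = float(rng.random() < (0.9 if arm == best else 0.1))
        agent.update(arm, phi, reward)
        encoder.step(arm, reward)         # fold the new interaction into the history
        total += reward
    return total
```

Calling `run_demo()` returns the cumulative reward on the toy problem. The design point worth noting is that the linear bandit never sees the time step or a handcrafted history window: its only context is the recurrent state, which is the role the abstract assigns to the learned representation.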
Taxonomy
Topics: Advanced Bandit Algorithms Research · Data Stream Mining Techniques
