A Reduction Algorithm for Markovian Contextual Linear Bandits

Kaan Buyukkalayci; Osama Hanna; Christina Fragouli

arXiv:2603.12530·cs.LG·March 16, 2026

Open Access

TL;DR

This paper extends the reduction approach for linear contextual bandits to Markovian contexts, enabling efficient algorithms with strong regret guarantees even with temporally correlated data.

Contribution

It introduces a reduction method for Markovian contextual linear bandits under geometric ergodicity, including an online learning algorithm for unknown transition distributions.

Findings

01. Achieves regret bounds comparable to standard linear bandit algorithms.

02. Handles temporally correlated contexts via a stationary surrogate action set.

03. Provides a phased algorithm with online transition learning.

Abstract

Recent work shows that when contexts are drawn i.i.d., linear contextual bandits can be reduced to single-context linear bandits. This "contexts are cheap" perspective is highly advantageous, as it allows for sharper finite-time analyses and leverages mature techniques from the linear bandit literature, such as those for misspecification and adversarial corruption. Motivated by applications with temporally correlated availability, we extend this perspective to Markovian contextual linear bandits, where the action set evolves via an exogenous Markov chain. Our main contribution is a reduction that applies under uniform geometric ergodicity. We construct a stationary surrogate action set to solve the problem using a standard linear bandit oracle, employing a delayed-update scheme to control the bias induced by the nonstationary conditional context distributions. We further provide a…
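To make the idea of a stationary surrogate action set concrete, here is a minimal Python sketch under simplifying assumptions that are not taken from the paper: the context chain has finitely many states, its transition matrix `P` is known, and the surrogate action for a parameter `theta` is the stationary average of the greedy action in each state, mirroring the i.i.d. reduction. The function names and the toy numbers are hypothetical. The last helper illustrates why a delay can help: under geometric ergodicity, the distance between the chain's conditional distribution and its stationary distribution shrinks geometrically in the number of steps waited.

```python
import numpy as np


def stationary_distribution(P):
    """Stationary distribution of an ergodic finite-state chain.

    Computed as the left eigenvector of P for the eigenvalue closest to 1.
    """
    eigvals, eigvecs = np.linalg.eig(P.T)
    idx = np.argmin(np.abs(eigvals - 1.0))
    pi = np.real(eigvecs[:, idx])
    return pi / pi.sum()  # normalising also fixes the eigenvector's sign


def surrogate_action(theta, action_sets, pi):
    """Assumed surrogate: g(theta) = sum_s pi[s] * argmax_{a in A_s} <a, theta>."""
    g = np.zeros_like(theta, dtype=float)
    for s, A in enumerate(action_sets):
        best = A[np.argmax(A @ theta)]  # greedy action in context state s
        g += pi[s] * best
    return g


def tv_to_stationary(P, pi, t):
    """Worst-case (over start states) total variation distance after t steps."""
    Pt = np.linalg.matrix_power(P, t)
    return 0.5 * np.abs(Pt - pi).sum(axis=1).max()


# Toy example (hypothetical numbers): two context states, two actions each.
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
action_sets = [
    np.array([[1.0, 0.0], [0.0, 1.0]]),   # actions available in state 0
    np.array([[1.0, 1.0], [-1.0, 0.0]]),  # actions available in state 1
]
theta = np.array([1.0, 0.5])

pi = stationary_distribution(P)
g = surrogate_action(theta, action_sets, pi)
tv_short = tv_to_stationary(P, pi, 1)
tv_long = tv_to_stationary(P, pi, 10)
```

In this sketch, a single-context linear bandit oracle would operate over the set of points `g(theta)` as `theta` ranges over the parameter set, and `tv_long` being much smaller than `tv_short` is the kind of decay that a delayed-update scheme could exploit. How the paper actually builds the surrogate set and chooses the delay may differ from this illustration.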

Taxonomy

Topics: Advanced Bandit Algorithms Research · Age of Information Optimization · Reinforcement Learning in Robotics