Robust Linear Dueling Bandits with Post-serving Context under Unknown Delays and Adversarial Corruptions

Youngmin Oh

arXiv:2605.01752·cs.LG·May 20, 2026

TL;DR

This paper introduces a robust algorithm for linear dueling bandits with post-serving contexts, delayed feedback, and adversarial corruptions. Its regret upper bound nearly matches the lower bounds in the adversarial delay setting.

Contribution

It proposes \term, an algorithm that predicts post-serving contexts from pre-serving information and adaptively mitigates delays and corruptions, with theoretical guarantees on regret.

Findings

01

Achieves a regret upper bound of $O(d(T + C + D))$

02

The algorithm is delay-regime-agnostic and handles unknown stochastic or adversarial delays and corruptions.

03

Lower bounds nearly match upper bounds, confirming near-optimality in adversarial delay settings.

Abstract

We study linear dueling bandits in volatile environments characterized by the simultaneous presence of post-serving contexts, delayed feedback, and adversarial corruption. Feedback is subject to unknown stochastic or adversarial delays and a cumulative corruption budget $C$. To address these challenges, we propose \term, which integrates a learned approximator that predicts post-serving contexts from pre-serving information. It further employs an adaptive weighting strategy that clips feature vectors to mitigate the impact of corrupted and delayed observations simultaneously. Under standard regularity conditions and a parametric post-serving mapping, we rigorously establish that our algorithm is delay-regime-agnostic, achieving a regret upper bound of $O(d(T + C + D))$, where $d$ is the total feature dimension and $D$ …
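
The two mechanisms named in the abstract, predicting the post-serving context and down-weighting observations by clipping, can be illustrated with a minimal Python sketch. It is not the paper's algorithm. It assumes a linear post-serving map `W`, a clipping threshold `alpha`, and the weight rule `min(1, alpha / ||z||_{V^{-1}})`, which is a common choice in corruption-robust linear bandits. None of these choices, nor any of the function names, is confirmed by the text above.

```python
import math

def predict_post(pre, W):
    """Predict a post-serving context as W @ pre (assumed linear map)."""
    return [sum(w * p for w, p in zip(row, pre)) for row in W]

def duel_feature(pre_a, pre_b, W):
    """Feature difference of two arms, each [pre-serving, predicted post-serving]."""
    full_a = pre_a + predict_post(pre_a, W)
    full_b = pre_b + predict_post(pre_b, W)
    return [a - b for a, b in zip(full_a, full_b)]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def clip_weight(z, V, alpha):
    """Hypothetical clipping rule: w = min(1, alpha / ||z||_{V^{-1}})."""
    v_inv_z = solve(V, z)
    norm = math.sqrt(sum(zi * vi for zi, vi in zip(z, v_inv_z)))
    return 1.0 if norm <= alpha else alpha / norm

def weighted_update(V, z, w):
    """Add w * z z^T to the design matrix V in place."""
    for i in range(len(z)):
        for j in range(len(z)):
            V[i][j] += w * z[i] * z[j]

# Usage: two arms with 2-d pre-serving contexts and a 1-d predicted post-serving context.
W = [[2.0, 0.0]]                             # assumed mapping, 1 x 2
z = duel_feature([1.0, 0.0], [0.0, 1.0], W)  # [1.0, -1.0, 2.0]
V = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
w = clip_weight(z, V, alpha=1.0)             # large ||z||_{V^{-1}} gives w < 1
weighted_update(V, z, w)
```

Under this rule, an observation whose feature difference is large relative to the information already collected gets a weight below 1. This bounds how much any single corrupted or late-arriving comparison can move the estimate.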
