Learning to Defer in Non-Stationary Time Series via Switching State-Space Models
Yannis Montreuil, Letian Yu, Axel Carlier, Lai Xing Ng, Wei Tsang Ooi

TL;DR
This paper introduces L2D-SLDS, a one-stage online learning-to-defer framework for non-stationary time series. It decides when to query an expert by weighing immediate cost against latent-state information gain and one-step learner improvement, and it matches or outperforms non-stationary bandit baselines.
Contribution
The paper proposes a factorized switching linear-Gaussian state-space model for online learning-to-defer in non-stationary settings, together with an oracle inequality against a time-varying learn-and-defer comparator.
Findings
L2D-SLDS defers on fewer than 2% of rounds on real datasets.
It outperforms or matches non-stationary bandit baselines.
The model effectively updates beliefs about experts in non-stationary environments.
Abstract
Learning-to-defer (L2D) routes each decision to a system's own predictor or to an external expert. Streaming time-series settings break the offline-L2D assumptions: the data are non-stationary, expert availability shifts over time, and the internal predictor is trained online. We propose L2D-SLDS, a one-stage online L2D framework based on a factorized switching linear-Gaussian state-space model over all potential residuals: a discrete regime, a shared global factor, and per-expert idiosyncratic states. The always-observed internal residual continuously updates beliefs about every unqueried expert through the shared factor, and a learner-aware query score balances immediate cost against latent-state information gain and one-step learner improvement. We prove an oracle inequality against a time-varying learn-and-defer comparator, decomposing regret into a query-bonus budget, an SLDS…
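The shared-factor mechanism described in the abstract can be illustrated with a single Kalman measurement update. This is a minimal sketch under simplifying assumptions, not the authors' implementation: it fixes one regime (no switching), uses a static prior, and the names `kalman_update`, `residual_row`, and `loadings`, as well as all numeric values, are hypothetical. Each residual is modeled as a loading times a shared global factor plus a per-predictor idiosyncratic state, and only the internal predictor's residual is observed.

```python
import numpy as np

def kalman_update(mean, cov, H, R, y):
    """Standard linear-Gaussian measurement update for y = H x + noise."""
    S = H @ cov @ H.T + R                    # innovation covariance
    K = cov @ H.T @ np.linalg.inv(S)         # Kalman gain
    new_mean = mean + K @ (y - H @ mean)
    new_cov = (np.eye(len(mean)) - K @ H) @ cov
    return new_mean, new_cov

# Latent state: [global factor, internal state, expert-1 state, expert-2 state].
# Loadings on the global factor are illustrative values, not from the paper.
loadings = np.array([1.0, 0.8, 0.5])

def residual_row(j):
    """Row mapping the latent state to the residual of predictor j (0 = internal)."""
    row = np.zeros(4)
    row[0] = loadings[j]   # contribution of the shared global factor
    row[1 + j] = 1.0       # contribution of the idiosyncratic state
    return row

mean = np.zeros(4)         # prior mean (assumed)
cov = np.eye(4)            # prior covariance (assumed)

# Only the internal residual is observed this round; no expert is queried.
H_internal = residual_row(0)[None, :]
R = np.array([[0.1]])      # observation noise variance (assumed)
y = np.array([2.0])        # observed internal residual (made-up value)

post_mean, post_cov = kalman_update(mean, cov, H_internal, R, y)

# Predicted residuals of the unqueried experts before and after the update.
expert_before = [residual_row(j) @ mean for j in (1, 2)]
expert_after = [residual_row(j) @ post_mean for j in (1, 2)]
print("before:", expert_before)
print("after: ", expert_after)
```

In this toy setup the experts' idiosyncratic states keep their prior mean, since nothing about them was observed. Their predicted residuals still move, because the update shifts the belief about the shared global factor that every residual loads on.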
Taxonomy
Topics: Advanced Bandit Algorithms Research · Data Stream Mining Techniques · Age of Information Optimization
