Logarithmic Regret and Polynomial Scaling in Online Multi-step-ahead Prediction
Jiachen Qian, Yang Zheng

TL;DR
This paper introduces an online learning algorithm for multi-step-ahead prediction of unknown linear stochastic systems, achieving logarithmic regret and revealing polynomial scaling of the regret constant with the prediction horizon.
Contribution
It provides a novel optimal parameterization of the prediction policy and establishes new regret bounds that do not depend on fixed failure probabilities.
Findings
Achieves logarithmic regret relative to the Kalman filter.
Regret constant grows polynomially with the prediction horizon.
Introduces a new proof technique for almost-sure regret bounds.
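For orientation, the regret in these findings can be written in the form commonly used in this literature. This is a sketch under assumed notation, not the paper's exact definition: $N$ (number of time steps), $h$ (prediction horizon), $\hat{y}_{k+h\mid k}$ (the online prediction), and $\hat{y}^{\mathrm{KF}}_{k+h\mid k}$ (the model-based Kalman prediction) are symbols introduced here for illustration.

```latex
% Assumed notation: cumulative excess squared prediction error over N steps.
R_N \;=\; \sum_{k=1}^{N}
  \Bigl( \bigl\| y_{k+h} - \hat{y}_{k+h\mid k} \bigr\|^2
       - \bigl\| y_{k+h} - \hat{y}^{\mathrm{KF}}_{k+h\mid k} \bigr\|^2 \Bigr),
\qquad
R_N \;=\; O\bigl( \mathrm{poly}(h)\,\log N \bigr).
```

Read this way, "logarithmic regret" refers to the growth in $N$, while "polynomial scaling" refers to how the constant in front of $\log N$ depends on $h$.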
Abstract
This letter studies the problem of online multi-step-ahead prediction for unknown linear stochastic systems. Using conditional distribution theory, we derive an optimal parameterization of the prediction policy as a linear function of future inputs, past inputs, and past outputs. Based on this characterization, we propose an online least-squares algorithm to learn the policy and analyze its regret relative to the optimal model-based predictor. We show that the online algorithm achieves logarithmic regret with respect to the optimal Kalman filter in the multi-step setting. Furthermore, with new proof techniques, we establish an almost-sure regret bound that does not rely on fixed failure probabilities for sufficiently large time horizons. Finally, our analysis also reveals that, while the regret remains logarithmic in the time horizon, its constant factor grows polynomially with the prediction horizon…
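The idea of learning a linear multi-step predictor by online least squares can be illustrated with a minimal sketch. This is not the authors' algorithm or code: the function names (`regressor`, `online_multistep_ls`, `simulate`), the ridge parameter `lam`, the number of past lags `p`, and the scalar test system are all assumptions made here for illustration. At each time `k` the predictor of the output `h` steps ahead is a linear map of the last `p` outputs, the last `p` inputs, and the next `h` inputs, refit from only those data pairs whose targets have already been observed.

```python
import numpy as np

def regressor(y, u, k, p, h):
    """Stack p past outputs, p past inputs, and h future inputs at time k."""
    return np.concatenate([y[k - p + 1:k + 1].ravel(), u[k - p:k + h].ravel()])

def online_multistep_ls(y, u, h, p, lam=1.0):
    """Ridge-regularized online least squares for h-step-ahead prediction.

    y: (T, m) outputs, u: (T, n_u) inputs. Returns y_hat with y_hat[k + h]
    predicted at time k; entries that are never predicted stay NaN.
    """
    T, m = y.shape
    d = p * m + (p + h) * u.shape[1]
    V = lam * np.eye(d)          # regularized Gram matrix of regressors
    S = np.zeros((m, d))         # cross-moment between targets and regressors
    y_hat = np.full((T, m), np.nan)
    for k in range(p, T - h):
        j = k - h                # the pair (z_j, y_k) becomes usable at time k
        if j >= p:
            z_old = regressor(y, u, j, p, h)
            V += np.outer(z_old, z_old)
            S += np.outer(y[k], z_old)
        G = np.linalg.solve(V, S.T).T   # least-squares gain S V^{-1}
        y_hat[k + h] = G @ regressor(y, u, k, p, h)
    return y_hat

def simulate(T, rng):
    """Hypothetical scalar system: x+ = 0.9 x + u + w, y = x + v."""
    u = rng.standard_normal((T, 1))
    y = np.zeros((T, 1))
    x = 0.0
    for k in range(T):
        y[k, 0] = x + 0.1 * rng.standard_normal()
        x = 0.9 * x + u[k, 0] + 0.1 * rng.standard_normal()
    return y, u

rng = np.random.default_rng(0)
y, u = simulate(500, rng)
h, p = 3, 5
y_hat = online_multistep_ls(y, u, h, p)

# Compare against the trivial all-zero predictor on the second half of the run.
tail = slice(250, 500)
mse_learned = float(np.mean((y[tail] - y_hat[tail]) ** 2))
mse_zero = float(np.mean(y[tail] ** 2))
print(f"learned predictor MSE: {mse_learned:.4f}, zero predictor MSE: {mse_zero:.4f}")
```

The sketch refits the gain with a full linear solve at every step for clarity; a rank-one recursive update of the inverse Gram matrix would be the usual choice when the regressor dimension is large. Measuring regret against a Kalman filter would additionally require the model-based predictor, which is omitted here.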
Taxonomy
Topics: Advanced Bandit Algorithms Research · Age of Information Optimization · Stochastic Gradient Optimization Techniques
