Learning with little mixing
Ingvar Ziemann, Stephen Tu

TL;DR
This paper shows that, under a trajectory hypercontractivity condition, the excess risk of the least-squares estimator on dependent time-series data matches the iid rate order-wise after a burn-in time, even when the covariate process exhibits long-range correlations.
Contribution
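The paper introduces the notion of learning with little mixing: a fast-rate excess risk bound under trajectory hypercontractivity in which, after the burn-in time, the effective sample size is not deflated by the mixing time of the process. The bound also covers covariate processes with long-range correlations that need not be geometrically ergodic.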
Findings
Risk matches iid rates after burn-in under hypercontractivity
Applicable to processes with long-range correlations
Nearly minimax optimal bounds for system identification
Abstract
We study square loss in a realizable time-series framework with martingale difference noise. Our main result is a fast rate excess risk bound which shows that whenever a trajectory hypercontractivity condition holds, the risk of the least-squares estimator on dependent data matches the iid rate order-wise after a burn-in time. In comparison, many existing results in learning from dependent data have rates where the effective sample size is deflated by a factor of the mixing-time of the underlying process, even after the burn-in time. Furthermore, our results allow the covariate process to exhibit long range correlations which are substantially weaker than geometric ergodicity. We call this phenomenon learning with little mixing, and present several examples for when it occurs: bounded function classes for which the $L^2$ and $L^{2+\epsilon}$ norms are equivalent, ergodic finite state…
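As a rough numerical illustration of the abstract's claim, the sketch below fits least squares to a single trajectory of a stable linear system and compares the resulting excess risk with that of an iid design of the same size. Everything in it is an assumption made for illustration and is not taken from the paper: the linear Gaussian model, the matrix `A_TRUE`, the helper names, and the choice to measure excess risk under the stationary covariance.

```python
import numpy as np

# Hypothetical stable system matrix (spectral radius 0.9), chosen for illustration.
A_TRUE = np.array([[0.9, 0.1, 0.0],
                   [0.0, 0.9, 0.1],
                   [0.0, 0.0, 0.9]])

def simulate_trajectory(A, T, rng):
    """Roll out x_{t+1} = A x_t + w_t with iid standard Gaussian noise, x_0 = 0."""
    d = A.shape[0]
    X = np.zeros((T + 1, d))
    for t in range(T):
        X[t + 1] = A @ X[t] + rng.standard_normal(d)
    return X[:-1], X[1:]  # covariates x_t and targets x_{t+1}

def least_squares(X, Y):
    """Return A_hat minimizing sum_t ||y_t - A x_t||^2."""
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)  # solves X @ B ~ Y, so B = A_hat^T
    return B.T

def stationary_covariance(A, iters=1000):
    """Fixed-point iteration for Sigma = A Sigma A^T + I (converges for stable A)."""
    d = A.shape[0]
    Sigma = np.eye(d)
    for _ in range(iters):
        Sigma = A @ Sigma @ A.T + np.eye(d)
    return Sigma

def excess_risk(A_hat, A, Sigma):
    """E ||(A_hat - A) x||^2 for x ~ N(0, Sigma), i.e. trace(D Sigma D^T)."""
    D = A_hat - A
    return float(np.trace(D @ Sigma @ D.T))

def compare_risks(T, trials=20, seed=0, A=A_TRUE):
    """Average excess risk of least squares on one trajectory vs. an iid design."""
    rng = np.random.default_rng(seed)
    d = A.shape[0]
    Sigma = stationary_covariance(A)
    L = np.linalg.cholesky(Sigma)
    dep, iid = [], []
    for _ in range(trials):
        # Dependent data: a single trajectory of length T.
        X, Y = simulate_trajectory(A, T, rng)
        dep.append(excess_risk(least_squares(X, Y), A, Sigma))
        # iid baseline: covariates drawn independently from N(0, Sigma).
        X_iid = rng.standard_normal((T, d)) @ L.T
        Y_iid = X_iid @ A.T + rng.standard_normal((T, d))
        iid.append(excess_risk(least_squares(X_iid, Y_iid), A, Sigma))
    return float(np.mean(dep)), float(np.mean(iid))

if __name__ == "__main__":
    for T in (200, 2000):
        dep, iid = compare_risks(T)
        print(f"T={T}: trajectory risk {dep:.5f}, iid risk {iid:.5f}")
```

In this toy setting the two risks should be of the same order once `T` is well past the time the system takes to forget its initial state. The example says nothing about the paper's general hypercontractivity condition or its nonlinear results.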
Taxonomy
Topics: Statistical Methods and Inference · Markov Chains and Monte Carlo Methods · Advanced Bandit Algorithms Research
