Convergence and Loss Bounds for Bayesian Sequence Prediction

Marcus Hutter

arXiv:cs/0301014·cs.LG·November 18, 2016


TL;DR

This paper analyzes Bayesian sequence prediction when the true generating distribution is unknown but belongs to a known class. It establishes convergence of the mixture posterior to the true posterior, including the rate of convergence, and bounds the prediction loss of the mixture-based predictor.

Contribution

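The paper gives a new, elementary derivation of the convergence $ξ_{t} / μ_{t} \to 1$ of the Bayesian mixture posterior to the true posterior, which also yields the rate of convergence, and it bounds the prediction loss of the mixture predictor relative to prediction under the true distribution.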

Findings

01

Convergence of the mixture posterior to the true posterior is established.

02

Prediction loss of the Bayesian mixture predictor is bounded and asymptotically optimal.

03

The convergence rates and loss bounds require no assumptions on the structure of the loss, apart from boundedness, and none on the model class $M$.

Abstract

The probability of observing $x_{t}$ at time $t$, given past observations $x_{1} \ldots x_{t - 1}$, can be computed with Bayes' rule if the true generating distribution $μ$ of the sequences $x_{1} x_{2} x_{3} \ldots$ is known. If $μ$ is unknown, but known to belong to a class $M$, one can base one's prediction on the Bayes mixture $ξ$, defined as a weighted sum of distributions $ν \in M$. Various convergence results of the mixture posterior $ξ_{t}$ to the true posterior $μ_{t}$ are presented. In particular, a new (elementary) derivation of the convergence $ξ_{t} / μ_{t} \to 1$ is provided, which additionally gives the rate of convergence. A general sequence predictor is allowed to choose an action $y_{t}$ based on $x_{1} \ldots x_{t - 1}$ and receives loss $ℓ_{x_{t} y_{t}}$ if $x_{t}$ is the next symbol of the sequence. No assumptions are made on the structure of $ℓ$ (apart from being bounded) or on $M$. The Bayes-optimal…
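The convergence $ξ_{t} / μ_{t} \to 1$ described in the abstract can be illustrated numerically with a minimal sketch. It rests on assumptions that are not from the paper: $M$ is taken to be a small finite class of i.i.d. Bernoulli models, the prior weights are uniform, and the names `bayes_mixture_run`, `thetas`, `weights`, and `true_theta` are hypothetical. The paper's results cover far more general classes; this toy case only shows the mechanics of the mixture and its posterior update.

```python
import random


def bayes_mixture_run(thetas, weights, true_theta, steps, seed=0):
    """Simulate a Bayes mixture over i.i.d. Bernoulli models (illustrative only).

    thetas     -- P(x = 1) under each model nu in the assumed finite class M
    weights    -- prior weights w_nu, assumed positive and summing to 1
    true_theta -- P(x = 1) under the true distribution mu (assumed to lie in M)
    Returns (ratios, posterior): ratios[t] is xi_t / mu_t evaluated at the
    symbol actually observed at step t, and posterior is the final weight vector.
    """
    rng = random.Random(seed)
    posterior = list(weights)
    ratios = []
    for _ in range(steps):
        # Draw the next symbol from the true distribution mu.
        x = 1 if rng.random() < true_theta else 0
        # Probability each model nu assigns to the observed symbol.
        likelihoods = [th if x == 1 else 1.0 - th for th in thetas]
        # Mixture prediction xi_t: posterior-weighted sum over the class.
        xi_t = sum(w * l for w, l in zip(posterior, likelihoods))
        mu_t = true_theta if x == 1 else 1.0 - true_theta
        ratios.append(xi_t / mu_t)
        # Bayes' rule: reweight each model by how well it predicted x.
        posterior = [w * l / xi_t for w, l in zip(posterior, likelihoods)]
    return ratios, posterior


if __name__ == "__main__":
    ratios, posterior = bayes_mixture_run(
        thetas=[0.2, 0.5, 0.8],
        weights=[1 / 3, 1 / 3, 1 / 3],
        true_theta=0.8,
        steps=2000,
    )
    print("first ratio:", ratios[0])
    print("last ratio:", ratios[-1])
    print("final posterior:", posterior)
```

In this setup the posterior weight concentrates on the model matching $μ$, so the ratio of the mixture prediction to the true prediction approaches 1. How fast it does so in general is what the paper's rate results quantify.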
