A new Hedging algorithm and its application to inferring latent random variables

Yoav Freund; Daniel Hsu

arXiv:0806.4802 · cs.GT · July 1, 2008 · 5 cites

TL;DR

This paper introduces an online learning algorithm for cumulative discounted gain that weights experts by regret rather than by exponential weights, and sketches how such an algorithm can be applied to inferring latent random variables.

Contribution

It proposes a new regret-based weighting scheme for online learning and sketches its use as an alternative to Bayesian averaging for latent variable inference.

Findings

01

The algorithm sets each expert's weight from the master algorithm's regret relative to that expert; experts whose discounted cumulative gain is smaller than the master's receive zero weight.

02

A regret-based algorithm can serve as an alternative to Bayesian averaging when inferring latent random variables; the paper sketches this use rather than developing it in full.

03

Because experts that trail the master receive zero weight, the algorithm concentrates its weight on the experts that currently outperform it.

Abstract

We present a new online learning algorithm for cumulative discounted gain. This learning algorithm does not use exponential weights on the experts. Instead, it uses a weighting scheme that depends on the regret of the master algorithm relative to the experts. In particular, experts whose discounted cumulative gain is smaller (worse) than that of the master algorithm receive zero weight. We also sketch how a regret-based algorithm can be used as an alternative to Bayesian averaging in the context of inferring latent random variables.
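
The zero-weight rule described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not give the weighting function, so this sketch assumes the simplest choice that satisfies the stated property, namely weights proportional to the positive part of each expert's discounted regret. The names `regret_based_weights`, `run_hedging`, and the `discount` parameter are hypothetical, as is the uniform fallback used when no expert is ahead of the master.

```python
import numpy as np

def regret_based_weights(regrets):
    """Weights proportional to the positive part of each regret.

    ASSUMPTION: this proportional rule is an illustrative stand-in,
    not the weighting function from the paper.
    """
    regrets = np.asarray(regrets, dtype=float)
    positive = np.maximum(regrets, 0.0)  # experts behind the master get 0
    total = positive.sum()
    if total == 0.0:
        # No expert is ahead of the master: fall back to uniform weights
        # (an assumption made so the weights are always well defined).
        return np.full(len(regrets), 1.0 / len(regrets))
    return positive / total

def run_hedging(gains, discount=0.9):
    """Run the sketch on a (T, N) array: gains[t, i] is expert i's gain at round t.

    Returns the master's per-round gains and the weights after the last round.
    """
    gains = np.asarray(gains, dtype=float)
    num_rounds, num_experts = gains.shape
    regrets = np.zeros(num_experts)  # discounted regret to each expert
    master_gains = np.empty(num_rounds)
    for t in range(num_rounds):
        weights = regret_based_weights(regrets)
        master_gains[t] = weights @ gains[t]  # master's gain is the weighted average
        # Discount the old regret, then add this round's gap to each expert.
        regrets = discount * regrets + (gains[t] - master_gains[t])
    return master_gains, regret_based_weights(regrets)

if __name__ == "__main__":
    # Expert 0 always gains 1, expert 1 always gains 0.
    master, final_weights = run_hedging(np.array([[1.0, 0.0]] * 5))
    print(master, final_weights)
```

In this sketch, an expert's regret turns negative as soon as its discounted cumulative gain falls below the master's, so it drops out of the weighted average entirely rather than keeping a small positive weight, as it would under exponential weighting.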

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Distributed Sensor Networks and Detection Algorithms