Online Learning Using Only Peer Prediction

Yang Liu; David P. Helmbold

arXiv:1910.04382·cs.LG·January 7, 2020

Open Access

TL;DR

This paper studies online learning with expert predictions when no direct loss feedback is available. Experts are instead scored against the peer consensus, and under a sufficient condition called peer calibration, standard online learning algorithms run on those scores have bounded regret.

Contribution

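The paper proposes a peer-prediction-based method in which a peer score function $s()$ stands in for the missing loss feedback. It shows that, when peer calibration holds, standard online learning algorithms fed these scores have bounded regret with respect to the unrevealed ground truth values.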

Findings

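01. Peer calibration is a sufficient condition for bounded regret with respect to the unrevealed ground truth values.

02. Suitable peer score functions $s()$ can be derived under different assumptions and models.

03. The method works without any direct loss feedback.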

Abstract

This paper considers a variant of the classical online learning problem with expert predictions. Our model's differences and challenges are due to lacking any direct feedback on the loss each expert incurs at each time step $t$. We propose an approach that uses peer prediction and identify conditions where it succeeds. Our techniques revolve around a carefully designed peer score function $s()$ that scores experts' predictions based on the peer consensus. We show a sufficient condition, that we call peer calibration, under which standard online learning algorithms using loss feedback computed by the carefully crafted $s()$ have bounded regret with respect to the unrevealed ground truth values. We then demonstrate how suitable $s()$ functions can be derived for different assumptions and models.
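To make the setup in the abstract concrete, the following is a minimal Python sketch of a standard exponential-weights (Hedge) learner whose loss feedback comes only from a peer score, never from the ground truth. The abstract does not specify $s()$, so the score used here (squared distance to the mean prediction of the other experts) is a hypothetical stand-in chosen for illustration; it is not the paper's score function and is not claimed to satisfy peer calibration. The names `peer_score_loss`, `hedge_with_peer_scores`, and the learning rate `eta` are likewise assumptions.

```python
import math


def peer_score_loss(i, round_preds):
    """Hypothetical stand-in for s(): squared distance between expert i's
    prediction and the mean prediction of all OTHER experts.
    This is an illustrative assumption, not the paper's score function."""
    others = [p for k, p in enumerate(round_preds) if k != i]
    consensus = sum(others) / len(others)
    return (round_preds[i] - consensus) ** 2


def hedge_with_peer_scores(predictions, eta=0.5):
    """Exponential-weights update driven only by peer scores.

    predictions[t][i] is expert i's prediction in [0, 1] at round t.
    No ground-truth outcome is ever observed by this function.
    Returns the normalized expert weights after the last round.
    """
    n = len(predictions[0])
    if n < 2:
        raise ValueError("peer scoring needs at least two experts")
    weights = [1.0 / n] * n
    for round_preds in predictions:
        # Compute all peer-based losses for this round first.
        losses = [peer_score_loss(i, round_preds) for i in range(n)]
        # Standard multiplicative update, using peer losses as feedback.
        weights = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
        total = sum(weights)
        weights = [w / total for w in weights]  # renormalize each round
    return weights


if __name__ == "__main__":
    # Two experts agree at 0.9; a third predicts 0.1 in every round.
    preds = [[0.9, 0.9, 0.1] for _ in range(20)]
    print(hedge_with_peer_scores(preds))
```

In this toy run the outlying expert loses weight simply because it disagrees with the others, which also shows why a naive consensus score is not enough on its own: if the majority is wrong, the learner follows it. The abstract's peer calibration condition is what connects peer-based scores to regret against the unrevealed ground truth.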

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Machine Learning and Data Classification