Entropy Regularization for Population Estimation

Ben Chugg; Peter Henderson; Jacob Goldin; Daniel E. Ho

arXiv:2208.11747·cs.LG·August 26, 2022

TL;DR

This paper shows that entropy regularization, already known to improve exploration, also yields nearly unbiased, lower-variance estimates of the population mean reward in the optimize-and-estimate structured bandit setting.

Contribution

It applies entropy regularization to population estimation in structured bandits, obtaining nearly unbiased and lower-variance estimates of the mean reward.

Findings

01

Entropy regularization yields lower-variance reward estimates.

02

The estimates remain nearly unbiased.

03

Entropy and KL divergence regularization give a better trade-off between reward and estimator variance than existing baselines.

Abstract

Entropy regularization is known to improve exploration in sequential decision-making problems. We show that this same mechanism can also lead to nearly unbiased and lower-variance estimates of the mean reward in the optimize-and-estimate structured bandit setting. Mean reward estimation (i.e., population estimation) tasks have recently been shown to be essential for public policy settings where legal constraints often require precise estimates of population metrics. We show that leveraging entropy and KL divergence can yield a better trade-off between reward and estimator variance than existing baselines, all while remaining nearly unbiased. These properties of entropy regularization illustrate an exciting potential for bridging the optimal exploration and estimation literatures.
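The link between entropy regularization and estimation can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a simplified setting in which units are sampled with replacement, the sampling probabilities are known exactly, and the population mean is estimated with a standard Hansen-Hurwitz (inverse-probability-weighted) estimator. The function names and the `temperature` parameter are hypothetical. Maximizing expected predicted reward plus `temperature` times the entropy of the sampling distribution gives a softmax, so every unit keeps a nonzero selection probability, which is what the reweighting needs.

```python
import math
import random


def entropy_regularized_policy(predicted_rewards, temperature):
    """Softmax over predicted rewards.

    This is the maximizer of sum_i p_i * r_i + temperature * H(p) over the
    probability simplex. Assumption: temperature is large enough that no
    probability underflows to zero.
    """
    top = max(predicted_rewards)
    weights = [math.exp((r - top) / temperature) for r in predicted_rewards]
    total = sum(weights)
    return [w / total for w in weights]


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def exact_expected_estimate(true_rewards, probs):
    """Expectation of one Hansen-Hurwitz draw, y_i / (n * p_i), under probs."""
    n = len(true_rewards)
    return sum(p * (y / (n * p)) for y, p in zip(true_rewards, probs))


def sample_and_estimate(true_rewards, probs, budget, rng):
    """Draw `budget` units with replacement.

    Returns (average reward of the sampled units, Hansen-Hurwitz estimate of
    the population mean).
    """
    n = len(true_rewards)
    picks = rng.choices(range(n), weights=probs, k=budget)
    collected = sum(true_rewards[i] for i in picks) / budget
    estimate = sum(true_rewards[i] / (n * probs[i]) for i in picks) / budget
    return collected, estimate


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic population: true rewards plus a noisy prediction of each one.
    true_rewards = [rng.gauss(0.0, 1.0) for _ in range(200)]
    predicted = [y + rng.gauss(0.0, 0.5) for y in true_rewards]
    population_mean = sum(true_rewards) / len(true_rewards)

    for temperature in (0.5, 5.0):
        probs = entropy_regularized_policy(predicted, temperature)
        runs = [sample_and_estimate(true_rewards, probs, 50, rng) for _ in range(2000)]
        estimates = [est for _, est in runs]
        mean_est = sum(estimates) / len(estimates)
        var_est = sum((e - mean_est) ** 2 for e in estimates) / len(estimates)
        mean_reward = sum(rew for rew, _ in runs) / len(runs)
        print(f"temperature={temperature}: entropy={entropy(probs):.2f}, "
              f"avg sampled reward={mean_reward:.3f}, "
              f"estimate bias={mean_est - population_mean:.3f}, "
              f"estimate variance={var_est:.4f}")
```

In this simplified setting the estimator is exactly unbiased for any positive `temperature`. A low temperature concentrates sampling on high-predicted-reward units at the cost of large inverse-probability weights, while a higher temperature moves the policy toward uniform sampling. The paper's "nearly unbiased" claim concerns its own, more involved setting.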

Taxonomy

Topics: Advanced Bandit Algorithms Research · Decision-Making and Behavioral Economics · Smart Grid Energy Management

Methods: Entropy Regularization