Heuristic Pathologies and Further Variance Reduction via Uncertainty Propagation in the AIVAT Family of Techniques
Juho Kim, Tuomas Sandholm

TL;DR
This paper analyzes the AIVAT family of variance reduction techniques, shows how the choice of heuristic value function can make them vulnerable, and proposes propagating the heuristic's uncertainty to improve agent evaluation in multiagent environments.
Contribution
It parameterizes the heuristic value function to expose AIVAT's vulnerabilities and proposes propagating the heuristic's uncertainty to improve the accuracy of AIVAT-based evaluation.
Findings
The heuristic value function can be manipulated to make the sample variance artificially low.
Propagating the heuristic's uncertainty quantifies confidence in the resulting estimate.
Uncertainty propagation reduced the number of samples needed by 43% in poker experiments.
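The first finding can be illustrated with a minimal sketch. It is a toy control-variate estimator, not the authors' AIVAT implementation, and every name in it (`corrected_payoffs`, `tune_on_sample`, `demo`) is hypothetical. The assumptions: a single chance event with `k` equally likely outcomes, a heuristic that is just a lookup table `theta` over those outcomes, and payoffs that are pure noise independent of the chance outcome, so no heuristic can genuinely reduce variance. Running gradient descent on the sample variance of the same small sample still drives that sample variance down, while the variance on fresh samples does not follow.

```python
import random


def sample_variance(values):
    """Unbiased sample variance (divides by n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def corrected_payoffs(theta, payoffs, outcomes, probs):
    """Subtract a zero-mean correction term from each payoff.

    theta[c] is the heuristic value of chance outcome c (an assumed lookup
    table). The baseline is its expectation under the known chance
    distribution, so the correction has mean zero for any FIXED theta.
    """
    baseline = sum(p * t for p, t in zip(probs, theta))
    return [x - (theta[c] - baseline) for x, c in zip(payoffs, outcomes)]


def tune_on_sample(payoffs, outcomes, probs, steps=2000, lr=1.0):
    """Gradient descent on the sample variance of the corrected payoffs."""
    n = len(payoffs)
    theta = [0.0] * len(probs)
    for _ in range(steps):
        y = corrected_payoffs(theta, payoffs, outcomes, probs)
        mean_y = sum(y) / n
        # d(variance)/d(theta[j]) = -2/(n-1) * sum of centered y over the
        # samples whose outcome is j; the probs[j] term cancels because the
        # centered values sum to zero.
        grad = [0.0] * len(probs)
        for v, c in zip(y, outcomes):
            grad[c] -= 2.0 * (v - mean_y) / (n - 1)
        theta = [t - lr * g for t, g in zip(theta, grad)]
    return theta


def demo(seed=0, k=200, n=40, n_fresh=20000):
    rng = random.Random(seed)
    probs = [1.0 / k] * k  # uniform chance distribution (assumption)

    def draw(m):
        # Payoffs are independent of the chance outcome by construction.
        xs = [rng.gauss(0.0, 1.0) for _ in range(m)]
        cs = [rng.randrange(k) for _ in range(m)]
        return xs, cs

    payoffs, outcomes = draw(n)
    theta = tune_on_sample(payoffs, outcomes, probs)

    raw_var = sample_variance(payoffs)
    tuned_var = sample_variance(
        corrected_payoffs(theta, payoffs, outcomes, probs))

    # Evaluate the same tuned heuristic on samples it has never seen.
    fresh_payoffs, fresh_outcomes = draw(n_fresh)
    fresh_var = sample_variance(
        corrected_payoffs(theta, fresh_payoffs, fresh_outcomes, probs))
    return raw_var, tuned_var, fresh_var


if __name__ == "__main__":
    raw_var, tuned_var, fresh_var = demo()
    print("raw sample variance:        ", raw_var)
    print("tuned, same sample:         ", tuned_var)
    print("tuned heuristic, fresh data:", fresh_var)
```

In this sketch the lookup table has far more entries than there are samples, so it can absorb most of the noise in the sample it was tuned on. The low in-sample variance is therefore a poor guide to how reliable the estimate actually is.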
Abstract
How should an agent's performance in a multiagent environment be evaluated when there is a limited sample size or a high cost of running a trial? The AIVAT family of variance reduction techniques was proposed to address this challenge by introducing unbiased low-variance estimators of agents' expected payoffs. An important component of AIVAT is a heuristic value function that discriminates between potentially low- and high-value counterfactual histories. A notable gap in the literature is that there is little to no constraint or guideline on how the heuristic value function should be chosen or how uncertainty in its output should be handled. In our first contribution, we parameterize the heuristic value function to highlight AIVAT's potential vulnerabilities: a) the sample variance can be set pathologically low by directly applying gradient descent on the sample variance, and b) one…
