How to Evaluate Behavioral Models
Greg d'Eon, Sophie Greenwood, Kevin Leyton-Brown, and James R. Wright

TL;DR
This paper provides a principled framework for selecting loss functions in evaluating behavioral models, recommending squared L2 error based on formal axioms and a new family of divergences.
Contribution
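It formalizes axioms that loss functions for behavioral model evaluation should satisfy and constructs a family of losses, diagonal bounded Bregman divergences, that satisfy all of them. Because squared L2 error belongs to this family, the authors recommend it for evaluating behavioral models.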
Findings
Squared L2 error satisfies the proposed axioms.
Many commonly used loss functions are ruled out by the axioms.
The new family of divergences includes squared L2 error as a special case.
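As a brief illustration of the last finding, the sketch below checks numerically that squared L2 error is the Bregman divergence generated by the squared Euclidean norm. It uses the general textbook Bregman construction, not the paper's specific "diagonal bounded" family, whose exact definition is not given in this summary. The function names (bregman, sq_norm, sq_norm_grad) and the example distributions are hypothetical.

```python
def bregman(f, grad_f, p, q):
    """General Bregman divergence D_f(p, q) = f(p) - f(q) - <grad f(q), p - q>."""
    inner = sum(g * (pi - qi) for g, pi, qi in zip(grad_f(q), p, q))
    return f(p) - f(q) - inner


def sq_norm(x):
    """Generator f(x) = ||x||^2 (assumed here for illustration)."""
    return sum(xi * xi for xi in x)


def sq_norm_grad(x):
    """Gradient of the generator: 2x."""
    return [2.0 * xi for xi in x]


def squared_l2(p, q):
    """Squared L2 distance between two vectors."""
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))


# Hypothetical empirical and predicted distributions over three actions.
empirical = [0.6, 0.3, 0.1]
predicted = [0.5, 0.5, 0.0]

d_bregman = bregman(sq_norm, sq_norm_grad, empirical, predicted)
d_l2 = squared_l2(empirical, predicted)
print(d_bregman, d_l2)  # the two values agree up to floating-point error
```

Expanding the definition with f(x) = ||x||^2 gives ||p||^2 - ||q||^2 - 2<q, p - q> = ||p - q||^2, which is what the two printed values reflect.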
Abstract
Researchers building behavioral models, such as behavioral game theorists, use experimental data to evaluate predictive models of human behavior. However, there is little agreement about which loss function should be used in evaluations, with error rate, negative log-likelihood, cross-entropy, Brier score, and squared L2 error all being common choices. We attempt to offer a principled answer to the question of which loss functions should be used for this task, formalizing axioms that we argue loss functions should satisfy. We construct a family of loss functions, which we dub "diagonal bounded Bregman divergences", that satisfy all of these axioms. These rule out many loss functions used in practice, but notably include squared L2 error; we thus recommend its use for evaluating behavioral models.
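To make the disagreement among loss functions concrete, here is a minimal sketch of three of the losses named in the abstract, applied to one predicted distribution and one set of observed action counts. The definitions are common textbook forms and may differ in detail from those used in the paper. The function names and the example data are hypothetical.

```python
import math


def error_rate(pred, counts):
    """Fraction of observations that differ from the model's modal action."""
    n = sum(counts)
    mode = max(range(len(pred)), key=lambda i: pred[i])
    return 1.0 - counts[mode] / n


def neg_log_likelihood(pred, counts):
    """Negative log-likelihood of the observed counts under the prediction."""
    total = 0.0
    for p, c in zip(pred, counts):
        if c == 0:
            continue  # unobserved actions contribute nothing
        if p == 0.0:
            return math.inf  # an observed action was given zero probability
        total -= c * math.log(p)
    return total


def squared_l2(pred, counts):
    """Squared L2 distance between the prediction and the empirical distribution."""
    n = sum(counts)
    return sum((p - c / n) ** 2 for p, c in zip(pred, counts))


# Hypothetical data: 10 observations over three actions.
counts = [6, 3, 1]
pred = [0.5, 0.5, 0.0]

print(error_rate(pred, counts))          # about 0.4
print(neg_log_likelihood(pred, counts))  # inf
print(squared_l2(pred, counts))          # about 0.06
```

In this example a single observation of an action the model assigned zero probability makes the negative log-likelihood infinite, while the squared L2 error stays small and bounded. This is one way the choice of loss can change how a model is judged.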
Taxonomy
Topics: Game Theory and Applications · Opinion Dynamics and Social Influence
