It's MBR All the Way Down: Modern Generation Techniques Through the Lens of Minimum Bayes Risk

Amanda Bertsch; Alex Xie; Graham Neubig; Matthew R. Gormley

arXiv:2310.01387·cs.CL·October 3, 2023

TL;DR

This paper introduces Minimum Bayes Risk (MBR) decoding, covering its theoretical foundations, recent variants, and practical benefits in NLP, and argues for broader adoption because it reliably improves performance.

Contribution

The paper provides a comprehensive introduction to MBR, unifies recent methods under its framework, and offers practical guidelines for applying MBR in NLP tasks.

Findings

01

MBR improves NLP model performance without additional data or training, at the cost of extra computation at inference time.

02

Recent methods can be reformulated as special cases of MBR.

03

Theoretical analysis supports the empirical effectiveness of MBR variants.

Abstract

Minimum Bayes Risk (MBR) decoding is a method for choosing the outputs of a machine learning system based not on the output with the highest probability, but the output with the lowest risk (expected error) among multiple candidates. It is a simple but powerful method: for an additional cost at inference time, MBR provides reliable several-point improvements across metrics for a wide variety of tasks without any additional data or training. Despite this, MBR is not frequently applied in NLP works, and knowledge of the method itself is limited. We first provide an introduction to the method and the recent literature. We show that several recent methods that do not reference MBR can be written as special cases of MBR; this reformulation provides additional theoretical justification for the performance of these methods, explaining some results that were previously only empirical. We…
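The selection rule described in the abstract (pick the candidate with the lowest expected error rather than the highest probability) can be illustrated with a short sketch. This is a minimal example under stated assumptions, not the paper's code or the API of the zurichnlp/mbr repository: the names `unigram_f1` and `mbr_decode` are hypothetical, the utility is a toy unigram-overlap F1, and the expected utility is approximated by averaging over the same set of samples that serves as the candidate pool.

```python
from collections import Counter
from typing import Callable, Sequence


def unigram_f1(hyp: str, ref: str) -> float:
    """Toy utility (an assumption for illustration): unigram-overlap F1."""
    hyp_tokens, ref_tokens = hyp.split(), ref.split()
    if not hyp_tokens or not ref_tokens:
        return float(hyp_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(hyp_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(hyp_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def mbr_decode(
    candidates: Sequence[str],
    utility: Callable[[str, str], float] = unigram_f1,
) -> str:
    """Return the candidate with the highest average utility against all
    samples, i.e. the lowest estimated risk when risk = 1 - utility."""
    if not candidates:
        raise ValueError("need at least one candidate")

    def expected_utility(hyp: str) -> float:
        # Each sample doubles as a pseudo-reference (sample-based estimate).
        return sum(utility(hyp, ref) for ref in candidates) / len(candidates)

    return max(candidates, key=expected_utility)


# Hypothetical model samples for one input.
samples = [
    "the cat sat on a mat",
    "the cat sat on the mat",
    "the cat sat on the rug",
    "dogs bark loudly",
]
print(mbr_decode(samples))  # → the cat sat on the mat
```

The selected output is the one that agrees most with the other samples, not necessarily the one the model scored highest. Swapping in an exact-match utility, `lambda h, r: float(h == r)`, turns the same rule into a majority vote over the samples, which gives a sense of how a voting-style method can be read as a special case of this selection rule.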

Code & Models

Repositories

zurichnlp/mbr
pytorch

Taxonomy

Topics: Machine Learning and Data Classification · Bayesian Modeling and Causal Inference · Machine Learning and Algorithms