Re-evaluating Minimum Bayes Risk Decoding for Automatic Speech Recognition

Yuu Jinnai

arXiv:2510.19471·cs.CL·May 14, 2026

TL;DR

This paper evaluates sample-based Minimum Bayes Risk (MBR) decoding for automatic speech recognition (ASR) and speech translation (ST) using Whisper and its derivative models, and finds that it is more accurate than beam search in most of the evaluated settings.

Contribution

The paper shows that MBR decoding, already known to work well for text-to-text generation, is also effective for speech-to-text tasks such as ASR and ST, based on experiments in English and Japanese.

Findings

01 MBR decoding outperforms beam search in most evaluated settings.

02 MBR decoding shows promise for high-accuracy offline ASR and ST.

03 Code for MBR decoding in ASR is publicly available.

Abstract

Recent work has shown that sample-based Minimum Bayes Risk (MBR) decoding outperforms beam search in text-to-text generation tasks, such as machine translation, text summarization, and image captioning. In contrast, beam search remains the standard practice for speech-to-text tasks such as automatic speech recognition (ASR) and speech translation (ST). Given that MBR decoding is effective in text-to-text generation tasks, it is reasonable to expect it to also be effective for speech-to-text tasks. In this paper, we evaluate MBR decoding for ASR and ST tasks on English and Japanese using Whisper and its derivative models. We observe that MBR decoding achieves higher accuracy than beam search in most of the experimental settings we evaluated. The results show that MBR decoding is a promising method for offline ASR and ST tasks that require high accuracy. The code is available…
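The sample-based MBR selection rule mentioned in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the utility function (negative word-level error rate), the helper names `edit_distance`, `neg_wer`, and `mbr_decode`, and the toy transcripts are all hypothetical. Sampling from a Whisper model is omitted, so the candidate transcripts are passed in as plain strings.

```python
from typing import Callable, List, Sequence


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Word-level Levenshtein distance between two token sequences."""
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def neg_wer(hyp: str, ref: str) -> float:
    """Assumed utility: negative word error rate of hyp against ref."""
    h, r = hyp.split(), ref.split()
    if not r:
        return 0.0 if not h else -1.0
    return -edit_distance(h, r) / len(r)


def mbr_decode(samples: List[str],
               utility: Callable[[str, str], float] = neg_wer) -> str:
    """Return the sample with the highest average utility against all samples.

    The samples serve both as candidate outputs and as pseudo-references,
    which gives a Monte Carlo estimate of each candidate's expected utility.
    """
    if not samples:
        raise ValueError("mbr_decode needs at least one sample")
    best, best_score = samples[0], float("-inf")
    for hyp in samples:
        score = sum(utility(hyp, ref) for ref in samples) / len(samples)
        if score > best_score:
            best, best_score = hyp, score
    return best


if __name__ == "__main__":
    # Toy transcripts standing in for samples drawn from an ASR model.
    candidates = [
        "the cat sat on the mat",
        "the cat sat on a mat",
        "the cat sat on the mat",
        "a cat sits on the mat",
    ]
    print(mbr_decode(candidates))
```

The cost is quadratic in the number of samples because every candidate is scored against every pseudo-reference. That overhead is one reason a method like this suits offline transcription better than latency-sensitive use.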

Code & Models

Repositories

CyberAgentAILab/mbr-for-asr (GitHub)