# Automatic Quality Estimation for ASR System Combination

**Authors:** Shahab Jalalvand, Matteo Negri, Daniele Falavigna, Marco Matassoni, Marco Turchi

arXiv: 1706.07238 · 2017-06-23

## TL;DR

This paper introduces a novel ROVER variant that leverages ASR quality estimation at segment level for better hypothesis ranking, improving word error rates without relying on decoder confidence scores.

## Contribution

The paper proposes a ROVER variant that ranks hypotheses at segment level using quality estimation features. It does not need decoder confidence scores, and it outperforms standard ROVER.

## Key findings

- Absolute WER reductions of 0.5% to 7.3% over standard ROVER across the two evaluation scenarios.
- Effective features compensate for lack of decoder information.
- Approach is competitive with oracles exploiting true hypothesis quality.
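
To make the WER figures above concrete, here is a minimal sketch of how word error rate is commonly computed (word-level edit distance divided by reference length) and how an absolute improvement differs from a relative one. The function name `word_error_rate` and the numbers in the example are illustrative assumptions, not taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the first i-1 reference words
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Toy numbers (hypothetical, not results from the paper).
baseline_wer = 0.30
improved_wer = 0.25
absolute_gain = baseline_wer - improved_wer      # difference in WER points
relative_gain = absolute_gain / baseline_wer     # fraction of the baseline WER
```

The range quoted above is an absolute difference in WER, so it corresponds to `absolute_gain` in this sketch rather than `relative_gain`.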

## Abstract

Recognizer Output Voting Error Reduction (ROVER) has been widely used for system combination in automatic speech recognition (ASR). In order to select the most appropriate words to insert at each position in the output transcriptions, some ROVER extensions rely on critical information such as confidence scores and other ASR decoder features. This information, which is not always available, highly depends on the decoding process and sometimes tends to overestimate the real quality of the recognized words. In this paper we propose a novel variant of ROVER that takes advantage of ASR quality estimation (QE) for ranking the transcriptions at "segment level" instead of: i) relying on confidence scores, or ii) feeding ROVER with randomly ordered hypotheses. We first introduce an effective set of features to compensate for the absence of ASR decoder information. Then, we apply QE techniques to perform accurate hypothesis ranking at segment level before starting the fusion process. The evaluation is carried out on two different tasks, in which we respectively combine hypotheses coming from independent ASR systems and multi-microphone recordings. In both tasks, it is assumed that the ASR decoder information is not available. The proposed approach significantly outperforms standard ROVER and it is competitive with two strong oracles that exploit prior knowledge about the real quality of the hypotheses to be combined. Compared to standard ROVER, the absolute WER improvements in the two evaluation scenarios range from 0.5% to 7.3%.
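
The segment-level ranking step described in the abstract can be illustrated with a small sketch. It assumes a QE model has already produced a predicted WER for each hypothesis of a segment; the function `rank_segment_hypotheses`, the system names, and the scores are hypothetical placeholders, and neither the QE model nor the ROVER fusion itself is implemented here.

```python
from typing import Dict, List


def rank_segment_hypotheses(
    hypotheses: Dict[str, str],
    predicted_wer: Dict[str, float],
) -> List[str]:
    """Order one segment's hypotheses from lowest to highest predicted WER.

    `predicted_wer` stands in for the output of a QE model (an assumption of
    this sketch); lower values mean better estimated quality.
    """
    ordered_systems = sorted(hypotheses, key=lambda name: predicted_wer[name])
    return [hypotheses[name] for name in ordered_systems]


# Hypothetical hypotheses for a single segment from three sources.
segment_hypotheses = {
    "system_a": "the whether is nice today",
    "system_b": "the weather is nice today",
    "system_c": "the weather nice to day",
}

# Hypothetical QE predictions for the same segment (not real model output).
segment_predicted_wer = {
    "system_a": 0.20,
    "system_b": 0.05,
    "system_c": 0.40,
}

ranked = rank_segment_hypotheses(segment_hypotheses, segment_predicted_wer)
```

In this sketch the ranked list, best estimated hypothesis first, is what would be handed to the fusion step in place of a random ordering. Because the ranking is recomputed per segment, a different source can come first in each segment.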

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1706.07238/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/1706.07238/full.md

## References

79 references — full list in the complete paper: https://tomesphere.com/paper/1706.07238/full.md

---
Source: https://tomesphere.com/paper/1706.07238