# Fast and Accurate Capitalization and Punctuation for Automatic Speech Recognition Using Transformer and Chunk Merging

**Authors:** Binh Nguyen, Vu Bao Hung Nguyen, Hien Nguyen, Pham Ngoc Phuong, The-Loc Nguyen, Quoc Truong Do, Luong Chi Mai

arXiv: 1908.02404 · 2019-08-08

## TL;DR

This paper introduces a Transformer-based method with chunk merging that restores punctuation and capitalization in long-speech ASR transcripts, improving both accuracy and decoding speed over existing methods.

## Contribution

The paper presents a unified Transformer model with chunk merging for simultaneous punctuation and capitalization restoration in long speech, enhancing accuracy and decoding speed.
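One way to picture a single model handling both tasks is a joint label per word that says whether to capitalize it and which punctuation mark, if any, follows it. The sketch below is illustrative only: this summary does not state the paper's actual output format, so the `Label` type and the `apply_labels` helper are hypothetical names, and the labels are written by hand rather than predicted by a Transformer.

```python
from typing import List, Tuple

# Hypothetical joint label: (capitalize this word?, punctuation that follows it).
# This scheme is an assumption for illustration, not the paper's stated format.
Label = Tuple[bool, str]


def apply_labels(words: List[str], labels: List[Label]) -> str:
    """Render lowercase, unpunctuated words using one joint label per word."""
    if len(words) != len(labels):
        raise ValueError("need exactly one label per word")
    rendered = []
    for word, (upper, punct) in zip(words, labels):
        token = word.capitalize() if upper else word
        rendered.append(token + punct)
    return " ".join(rendered)


# Hand-written labels standing in for model predictions.
words = ["hello", "john", "how", "are", "you"]
labels = [(True, ","), (True, "."), (True, ""), (False, ""), (False, "?")]
print(apply_labels(words, labels))  # Hello, John. How are you?
```

Because each label carries both decisions, one pass over the transcript restores capitalization and punctuation together instead of chaining two separate models.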

## Key findings

- Restores punctuation and capitalization more accurately than existing methods
- Decodes faster than existing methods, with chunks decoded in parallel
- Results are reported on the British National Corpus

## Abstract

In recent years, studies on automatic speech recognition (ASR) have shown outstanding results that reach human parity on short speech segments. However, there are still difficulties in standardizing the output of ASR, such as capitalization and punctuation restoration for long-speech transcription. These problems make it harder for readers to understand the ASR output semantically and also cause difficulties for natural language processing models such as NER, POS tagging, and semantic parsing. In this paper, we propose a method to restore punctuation and capitalization for long-speech ASR transcription. The method is based on Transformer models and chunk merging, which allows us to (1) build a single model that performs punctuation and capitalization in one go, and (2) perform decoding in parallel while improving the prediction accuracy. Experiments on the British National Corpus showed that the proposed approach outperforms existing methods in both accuracy and decoding speed.
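The chunk-merging idea in the abstract can be sketched as follows: split a long transcript into overlapping chunks, label the chunks independently (so they can be decoded in parallel), then merge the per-chunk outputs. This is a minimal sketch under stated assumptions, not the authors' implementation. The abstract does not give the merging rule, so the rule used here (cut each overlap at its midpoint, so every word is labeled by the chunk in which it has more surrounding context) is an assumption, as are the function names, the default sizes, and the toy tagger that stands in for the Transformer.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, Tuple


def split_into_chunks(
    tokens: Sequence[str], chunk_size: int, overlap: int
) -> List[Tuple[int, Sequence[str]]]:
    """Return (start_index, chunk) pairs; consecutive chunks share `overlap` tokens."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while True:
        chunks.append((start, tokens[start:start + chunk_size]))
        if start + chunk_size >= len(tokens):
            break
        start += step
    return chunks


def merge_chunk_labels(
    chunk_labels: List[List[str]], starts: List[int], total: int, overlap: int
) -> List[str]:
    """Assumed merging rule: cut every overlap region at its midpoint."""
    merged: List[str] = []
    last = len(starts) - 1
    for i, (start, labels) in enumerate(zip(starts, chunk_labels)):
        lo = 0 if i == 0 else start + overlap // 2
        hi = total if i == last else starts[i + 1] + overlap // 2
        merged.extend(labels[lo - start:hi - start])
    return merged


def restore(
    tokens: Sequence[str],
    tag_chunk: Callable[[Sequence[str]], List[str]],
    chunk_size: int = 8,
    overlap: int = 4,
    workers: int = 4,
) -> List[str]:
    """Label chunks independently in a thread pool, then merge the results."""
    chunks = split_into_chunks(tokens, chunk_size, overlap)
    starts = [start for start, _ in chunks]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        chunk_labels = list(pool.map(tag_chunk, [chunk for _, chunk in chunks]))
    return merge_chunk_labels(chunk_labels, starts, len(tokens), overlap)


def toy_tagger(chunk: Sequence[str]) -> List[str]:
    """Stand-in for the model: emits one output per input token."""
    return [token.upper() for token in chunk]


tokens = [f"w{i}" for i in range(23)]
merged = restore(tokens, toy_tagger)
print(len(merged) == len(tokens))
```

The toy tagger only checks that the merged output stays aligned with the input, one output per word. With a real model, the point of the overlap is that words near a chunk boundary are taken from the neighboring chunk where they sit closer to the middle.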

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1908.02404/full.md

## Figures

14 figures with captions in the complete paper: https://tomesphere.com/paper/1908.02404/full.md

## References

17 references — full list in the complete paper: https://tomesphere.com/paper/1908.02404/full.md

---
Source: https://tomesphere.com/paper/1908.02404