Universal-2-TF: Robust All-Neural Text Formatting for ASR

Yash Khare; Taufiquzzaman Peyash; Andrea Vanzo; Takuya Yoshioka

arXiv:2501.05948 · cs.CL · January 13, 2025

Open Access

TL;DR

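This paper presents an all-neural text formatting (TF) model for ASR that handles punctuation restoration, truecasing, and inverse text normalization. The authors report gains in accuracy, computational efficiency, and robustness across diverse linguistic entities and text domains compared with rule-based and hybrid approaches.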

Contribution

It introduces a two-stage architecture for all-neural text formatting in ASR, comprising a multi-objective token classifier and a sequence-to-sequence (seq2seq) model, designed to reduce hallucinations and computational cost.

Findings

01 Superior TF accuracy demonstrated in evaluations

02 Enhanced computational efficiency over existing methods

03 Improved perceptual quality in ASR outputs

Abstract

This paper introduces an all-neural text formatting (TF) model designed for commercial automatic speech recognition (ASR) systems, encompassing punctuation restoration (PR), truecasing, and inverse text normalization (ITN). Unlike traditional rule-based or hybrid approaches, this method leverages a two-stage neural architecture comprising a multi-objective token classifier and a sequence-to-sequence (seq2seq) model. This design minimizes computational costs and reduces hallucinations while ensuring flexibility and robustness across diverse linguistic entities and text domains. Developed as part of the Universal-2 ASR system, the proposed method demonstrates superior performance in TF accuracy, computational efficiency, and perceptual quality, as validated through comprehensive evaluations using both objective and subjective methods. This work underscores the importance of holistic TF…
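
The abstract names the two stages but does not spell out how they interact. As a rough illustration only, the sketch below assumes one plausible reading: a token classifier emits per-token punctuation and casing decisions and flags spans that need inverse text normalization, and only those flagged spans are passed to a seq2seq model. The names TokenTags, format_text, and toy_seq2seq are hypothetical, and the lookup table stands in for a trained model; none of this is taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class TokenTags:
    """Hypothetical per-token outputs of a multi-objective classifier."""
    punct: str = ""           # punctuation to append after the token
    capitalize: bool = False  # truecasing decision for the token
    needs_itn: bool = False   # token belongs to a span sent to stage two


def format_text(
    tokens: List[str],
    tags: List[TokenTags],
    seq2seq: Callable[[List[str]], str],
) -> str:
    """Apply classifier tags, calling seq2seq only on flagged spans."""
    out = []
    i = 0
    while i < len(tokens):
        if tags[i].needs_itn:
            # Collect the maximal run of consecutive flagged tokens.
            j = i
            while j < len(tokens) and tags[j].needs_itn:
                j += 1
            piece = seq2seq(tokens[i:j])
        else:
            j = i + 1
            piece = tokens[i]
        if tags[i].capitalize:
            piece = piece[:1].upper() + piece[1:]
        # Punctuation comes from the last token covered by this piece.
        out.append(piece + tags[j - 1].punct)
        i = j
    return " ".join(out)


def toy_seq2seq(span: List[str]) -> str:
    """Stand-in for a trained seq2seq model (assumption: a lookup table)."""
    lookup = {("twenty", "five", "dollars"): "$25"}
    return lookup.get(tuple(span), " ".join(span))


tokens = "i paid twenty five dollars".split()
tags = [
    TokenTags(capitalize=True),
    TokenTags(),
    TokenTags(needs_itn=True),
    TokenTags(needs_itn=True),
    TokenTags(needs_itn=True, punct="."),
]
print(format_text(tokens, tags, toy_seq2seq))
```

Under this assumed design, the seq2seq model only ever sees short flagged spans, which is one way a system could limit both decoding cost and the room for hallucinated output; whether Universal-2-TF routes text this way is not stated in the abstract.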

Taxonomy

Topics: Handwritten Text Recognition Techniques