Thutmose Tagger: Single-pass neural model for Inverse Text Normalization
Alexandra Antonova, Evelina Bakhturina, Boris Ginsburg

TL;DR
This paper introduces Thutmose Tagger, a single-pass neural token classifier for inverse text normalization that reduces hallucination errors and improves interpretability over traditional sequence-to-sequence models.
Contribution
The paper proposes a token classification approach to ITN, aiming to improve accuracy, interpretability, and ease of deployment over existing neural sequence-to-sequence models.
Findings
Achieves state-of-the-art accuracy on English and Russian datasets.
Less prone to hallucination errors than sequence-to-sequence models.
Simpler and more interpretable model architecture.
Abstract
Inverse text normalization (ITN) is an essential post-processing step in automatic speech recognition (ASR). It converts numbers, dates, abbreviations, and other semiotic classes from the spoken form generated by ASR to their written forms. One can consider ITN as a Machine Translation task and use neural sequence-to-sequence models to solve it. Unfortunately, such neural models are prone to hallucinations that could lead to unacceptable errors. To mitigate this issue, we propose a single-pass token classifier model that regards ITN as a tagging task. The model assigns a replacement fragment to every input token or marks it for deletion or copying without changes. We present a dataset preparation method based on the granular alignment of ITN examples. The proposed model is less prone to hallucination errors. The model is trained on the Google Text Normalization dataset and achieves…
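The tagging formulation in the abstract can be illustrated with a minimal sketch of the decoding step only: given one predicted tag per input token, the written form is assembled by copying, deleting, or substituting each token. The tag names `<SELF>` and `<DELETE>`, the function `apply_tags`, and the space-joined output are assumptions made for illustration. They are not taken from the text above and may differ from the authors' implementation, which also has to predict the tags with a neural classifier.

```python
from typing import List

# Hypothetical tag names, chosen for this sketch only.
COPY_TAG = "<SELF>"      # keep the input token unchanged
DELETE_TAG = "<DELETE>"  # drop the input token


def apply_tags(tokens: List[str], tags: List[str]) -> str:
    """Build the written form from spoken-form tokens and one tag per token.

    Any tag other than COPY_TAG or DELETE_TAG is treated as a replacement
    fragment that is emitted in place of the token.
    """
    if len(tokens) != len(tags):
        raise ValueError("expected exactly one tag per input token")

    output: List[str] = []
    for token, tag in zip(tokens, tags):
        if tag == COPY_TAG:
            output.append(token)
        elif tag == DELETE_TAG:
            continue
        else:
            output.append(tag)
    # Assumption: fragments are joined with single spaces.
    return " ".join(output)


spoken = "it costs twenty five dollars".split()
predicted = [COPY_TAG, COPY_TAG, "$25", DELETE_TAG, DELETE_TAG]
print(apply_tags(spoken, predicted))
```

Because every output fragment is tied to a specific input token, an unexpected output can be traced back to the tag that produced it, which is the interpretability argument made for tagging over free-form sequence generation.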
Taxonomy
Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques · Topic Modeling
