Retrieval-Augmented Self-Taught Reasoning Model with Adaptive Chain-of-Thought for ASR Named Entity Correction

Junjie An; Jingguang Tian; Tianyi Wang; Yu Gao; Xiaofeng Mou; Yi Xu

arXiv:2602.12287·cs.CL·February 16, 2026

Open Access

TL;DR

This paper introduces a retrieval-augmented generation framework for correcting named entity errors in ASR output. It pairs candidate retrieval based on phonetic-level edit distance with a self-taught reasoning model with adaptive chain-of-thought (A-STAR), and reports relative error reductions of 17.96% on AISHELL-1 and 34.42% on the Homophone dataset.

Contribution

The paper proposes a framework that combines retrieval-augmented generation with a reasoning model whose chain-of-thought depth adapts to task difficulty, and applies it to named entity correction in ASR.

Findings

01

Achieves relative error reductions of 17.96% on the AISHELL-1 dataset and 34.42% on the Homophone dataset.

02

Demonstrates the effectiveness of adaptive reasoning depth in improving correction accuracy.

03

Validates the approach's superiority over strong baselines.
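
The findings above are stated as relative error reductions. As a reminder of how such a figure is usually computed, here is a minimal sketch that assumes the common definition (baseline error minus system error, divided by baseline error). The function name and the input values are hypothetical and chosen only for illustration; this page does not give the underlying error rates or name the exact metric.

```python
def relative_error_reduction(baseline_error: float, system_error: float) -> float:
    """Fraction of the baseline error removed by the new system.

    Assumes the common definition (baseline - system) / baseline.
    Both arguments must use the same unit (for example, percent).
    """
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - system_error) / baseline_error


# Illustrative numbers only, NOT taken from the paper:
# a baseline error of 10.0% reduced to 8.0% is a 20% relative reduction.
example = relative_error_reduction(10.0, 8.0)
print(f"{example:.2%}")
```

Note that a relative reduction says nothing about the absolute error level: the same percentage can come from a large or a small baseline error.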

Abstract

End-to-end automatic speech recognition (ASR) systems frequently misrecognize domain-specific phrases like named entities, which can cause catastrophic failures in downstream tasks. A new family of named entity correction methods based on large language models (LLMs) has recently emerged. However, these approaches have yet to fully exploit the sophisticated reasoning capabilities inherent to LLMs. To bridge this gap, we propose a novel retrieval-augmented generation framework for correcting named entity errors in ASR. Our approach consists of two key components: (1) a rephrasing language model (RLM) for named entity recognition, followed by candidate retrieval using a phonetic-level edit distance; and (2) a novel self-taught reasoning model with adaptive chain-of-thought (A-STAR) that dynamically adjusts the depth of its reasoning based on task difficulty. Experiments on the AISHELL-1…
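
The abstract mentions candidate retrieval using a phonetic-level edit distance. The sketch below illustrates that general idea under stated assumptions; it is not the authors' implementation. It assumes each entity is represented as a sequence of toneless pinyin syllables and uses plain Levenshtein distance over those syllables. The names `edit_distance`, `retrieve_candidates`, `ENTITY_PHONES`, the `top_k` and `max_distance` parameters, and the tiny entity list are all hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance computed over phonetic tokens, not characters."""
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def retrieve_candidates(
    query_phones: Sequence[str],
    entity_phones: Dict[str, List[str]],
    top_k: int = 3,
    max_distance: int = 1,
) -> List[Tuple[str, int]]:
    """Return up to top_k entities within max_distance of the query, closest first."""
    scored = [(name, edit_distance(query_phones, phones))
              for name, phones in entity_phones.items()]
    kept = [(name, dist) for name, dist in scored if dist <= max_distance]
    kept.sort(key=lambda item: (item[1], item[0]))  # by distance, then name
    return kept[:top_k]


# Hypothetical entity list with hand-written toneless pinyin (assumption).
ENTITY_PHONES = {
    "李娜": ["li", "na"],
    "李楠": ["li", "nan"],
    "王芳": ["wang", "fang"],
}

# Phonetic form of a span that an ASR system might have misrecognized.
print(retrieve_candidates(["li", "na"], ENTITY_PHONES))
```

Comparing syllables rather than characters lets homophones (different characters, same pronunciation) match at distance zero, which is the case a character-level distance would miss. A real system would likely need a finer-grained phonetic unit and a grapheme-to-phoneme step, neither of which is specified on this page.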

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis