MGSC: A Multi-granularity Consistency Framework for Robust End-to-end ASR

Xuwen Yang

arXiv:2508.15853·cs.CL·August 25, 2025

TL;DR

This paper introduces MGSC, a framework that makes end-to-end ASR more robust by enforcing internal consistency at two levels, sentence semantics and token alignment, reducing the average Character Error Rate by a relative 8.7% on noisy data.

Contribution

It proposes a novel, model-agnostic consistency regularization framework that leverages multi-granularity internal self-consistency to improve ASR robustness against noise.

Findings

01

Reduces average Character Error Rate by a relative 8.7% on noisy data

02

Prevents severe semantic mistakes in noisy environments

03

Uncovers synergy between macro and micro-level consistency

Abstract

End-to-end ASR models, despite their success on benchmarks, often produce catastrophic semantic errors in noisy environments. We attribute this fragility to the prevailing 'direct mapping' objective, which solely penalizes final output errors while leaving the model's internal computational process unconstrained. To address this, we introduce the Multi-Granularity Soft Consistency (MGSC) framework, a model-agnostic, plug-and-play module that enforces internal self-consistency by simultaneously regularizing macro-level sentence semantics and micro-level token alignment. Crucially, our work is the first to uncover a powerful synergy between these two consistency granularities: their joint optimization yields robustness gains that significantly surpass the sum of their individual contributions. On a public dataset, MGSC reduces the average Character Error Rate by a relative 8.7% across…
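The idea of adding macro-level and micro-level consistency terms to the usual output loss can be illustrated with a minimal sketch. It is not the paper's formulation: the abstract does not give the loss definitions, so the cosine distance between two sentence-level embeddings, the row-wise KL divergence between two token-alignment matrices, the weights `w_macro` and `w_micro`, and every function name below are assumptions chosen only for illustration.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity between two sentence-level vectors (assumed macro term)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v + 1e-12)


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def alignment_divergence(align_a, align_b):
    """Mean row-wise KL between two token-alignment matrices (assumed micro term).

    Each row is one output token's distribution over input frames.
    """
    rows = [kl_divergence(p, q) for p, q in zip(align_a, align_b)]
    return sum(rows) / len(rows)


def consistency_regularized_loss(task_loss, sent_a, sent_b, align_a, align_b,
                                 w_macro=0.1, w_micro=0.1):
    """Task loss plus weighted macro and micro consistency penalties.

    The weights are hypothetical hyperparameters, not values from the paper.
    """
    macro = cosine_distance(sent_a, sent_b)
    micro = alignment_divergence(align_a, align_b)
    return task_loss + w_macro * macro + w_micro * micro


# Two internal views that agree add (almost) nothing to the task loss;
# views that disagree are penalized at both granularities.
agree = consistency_regularized_loss(
    2.0, [1.0, 0.0], [1.0, 0.0], [[0.5, 0.5]], [[0.5, 0.5]])
disagree = consistency_regularized_loss(
    2.0, [1.0, 0.0], [0.0, 1.0], [[0.9, 0.1]], [[0.1, 0.9]])
```

In this sketch the two penalties are simply summed, so it cannot by itself show the synergy the abstract reports; it only shows where such regularizers would attach to a standard training objective.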
