StutterFuse: Mitigating Modality Collapse in Stuttering Detection with Jaccard-Weighted Metric Learning and Gated Fusion
Guransh Singh, Md Shah Fahad

TL;DR
StutterFuse is a retrieval-augmented classifier for multi-label stuttering detection. It addresses modality collapse with Jaccard-weighted metric learning and gated fusion, and the authors report state-of-the-art results and zero-shot cross-lingual generalization.
Contribution
The paper presents StutterFuse, the first retrieval-augmented classifier for stuttering detection, incorporating set-based metric learning and gated fusion to improve multi-label classification accuracy.
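To make "set-based" concrete, the sketch below computes the Jaccard similarity between two clips' label sets, the quantity the title's "Jaccard-weighted" refers to. It is only an illustration of the similarity measure: how the paper turns it into a training loss is not described in this summary, and the function name, the label strings, and the convention for two empty sets are assumptions made here.

```python
def jaccard(labels_a, labels_b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| between two label sets."""
    a, b = set(labels_a), set(labels_b)
    union = a | b
    if not union:
        # Assumption: two clips with no disfluency labels count as identical.
        return 1.0
    return len(a & b) / len(union)


# Hypothetical label sets for three clips.
clip_1 = {"block", "prolongation"}
clip_2 = {"block"}
clip_3 = {"repetition"}

partial_overlap = jaccard(clip_1, clip_2)  # one shared label out of two
no_overlap = jaccard(clip_1, clip_3)       # nothing shared
```

Unlike a binary same-class/different-class signal, this score is graded, so a pair of clips that share one of two disfluency types can be treated as partially similar rather than as a plain positive or negative.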
Findings
Achieves a weighted F1-score of 0.65 on the SEP-28k dataset.
Outperforms strong baselines in multi-label stuttering detection.
Demonstrates effective zero-shot cross-lingual generalization.
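For readers unfamiliar with the metric, here is a minimal sketch of a support-weighted F1 score for multi-label predictions. It assumes the standard definition (per-label F1 averaged with weights proportional to each label's number of true instances); the summary does not state the paper's exact evaluation protocol, and the toy arrays are made up for illustration.

```python
def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-label F1 over 0/1 indicator rows."""
    n_labels = len(y_true[0])
    weighted_sum = 0.0
    total_support = 0
    for j in range(n_labels):
        pairs = [(t[j], p[j]) for t, p in zip(y_true, y_pred)]
        tp = sum(1 for t, p in pairs if t == 1 and p == 1)
        fp = sum(1 for t, p in pairs if t == 0 and p == 1)
        fn = sum(1 for t, p in pairs if t == 1 and p == 0)
        support = tp + fn  # number of true instances of label j
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        weighted_sum += support * f1
        total_support += support
    return weighted_sum / total_support if total_support else 0.0


# Toy example: 3 clips, 2 labels (columns), values invented for illustration.
y_true = [[1, 0], [1, 1], [0, 1]]
y_pred = [[1, 0], [0, 1], [0, 1]]
score = weighted_f1(y_true, y_pred)
```

Because each label's F1 is weighted by its support, frequent disfluency types dominate the score, which is worth keeping in mind when rare label combinations are the main concern.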
Abstract
Stuttering detection breaks down when disfluencies overlap. Existing parametric models struggle to distinguish complex, simultaneous disfluencies (e.g., a 'block' with a 'prolongation') due to the scarcity of these specific combinations in training data. While Retrieval-Augmented Generation (RAG) has revolutionized NLP by grounding models in external knowledge, this paradigm remains unexplored in pathological speech processing. To bridge this gap, we introduce StutterFuse, the first Retrieval-Augmented Classifier (RAC) for multi-label stuttering detection. By conditioning a Conformer encoder on a non-parametric memory bank of clinical examples, we allow the model to classify by reference rather than memorization. We further identify and solve "Modality Collapse", an "Echo Chamber" effect where naive retrieval boosts recall but degrades precision. We mitigate this using: (1) SetCon, a…
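The idea of classifying "by reference" can be sketched as a nearest-neighbour lookup over a memory bank of labeled examples. This is a generic illustration, not the paper's architecture: StutterFuse conditions a Conformer encoder on the retrieved examples, whereas the sketch below simply lets the top-k neighbours vote with similarity weights. The embeddings, labels, and function names are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def retrieve_label_scores(query, memory, k=3):
    """Score labels by similarity-weighted votes of the k nearest entries.

    memory is a list of (embedding, label_set) pairs.
    """
    ranked = sorted(memory, key=lambda item: cosine(query, item[0]), reverse=True)[:k]
    scores = {}
    total = 0.0
    for embedding, labels in ranked:
        weight = max(cosine(query, embedding), 0.0)  # ignore negative similarity
        total += weight
        for label in labels:
            scores[label] = scores.get(label, 0.0) + weight
    return {label: s / total for label, s in scores.items()} if total else {}


# Hypothetical 2-D embeddings with their label sets.
memory_bank = [
    ([1.0, 0.0], {"block"}),
    ([0.9, 0.1], {"block", "prolongation"}),
    ([0.0, 1.0], {"repetition"}),
]
scores = retrieve_label_scores([1.0, 0.0], memory_bank, k=2)
```

A scheme like this also hints at the "Echo Chamber" risk the abstract mentions: if the retrieved neighbours are trusted too readily, every label they carry is pushed up, which can raise recall while hurting precision.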
Taxonomy
Topics: Stuttering Research and Treatment · Dysphagia Assessment and Management · Language Development and Disorders
