Auxiliary Sequence Labeling Tasks for Disfluency Detection
Dongyub Lee, Byeongil Ko, Myeong Cheol Shin, Taesun Whang, Daniel Lee, Eun Hwa Kim, EungGyun Kim, and Jaechoon Jo

TL;DR
This paper improves disfluency detection in speech transcripts by training the detection model with two auxiliary sequence labeling tasks, named entity recognition (NER) and part-of-speech (POS) tagging, and reports higher accuracy than previous methods.
Contribution
It proposes using linguistic auxiliary tasks such as NER and POS for disfluency detection, enhancing model performance and understanding of disfluency patterns.
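As a rough illustration of what training with auxiliary sequence labeling tasks can look like, the sketch below combines a per-token disfluency loss with NER and POS losses computed over the same tokens. The weighted-sum formulation, the task weights, and the label-set sizes are all assumptions made for illustration; the summary above does not say how the paper combines its objectives.

```python
import math


def token_cross_entropy(logits, gold):
    """Mean cross-entropy over tokens.

    logits: list of per-token score lists (one score per label).
    gold:   list of gold label indices, one per token.
    """
    total = 0.0
    for scores, label in zip(logits, gold):
        # Numerically stable log-sum-exp for the softmax normalizer.
        m = max(scores)
        log_z = m + math.log(sum(math.exp(s - m) for s in scores))
        total += log_z - scores[label]
    return total / len(gold)


def joint_loss(task_logits, task_gold, weights):
    """Weighted sum of per-task losses (hypothetical combination rule)."""
    return sum(
        weights[task] * token_cross_entropy(task_logits[task], task_gold[task])
        for task in weights
    )


# Toy three-token input with uniform (untrained) scores for every task.
# Label-set sizes (2, 3, 4) are placeholders, not the paper's tag sets.
task_logits = {
    "disfluency": [[0.0, 0.0]] * 3,
    "ner": [[0.0, 0.0, 0.0]] * 3,
    "pos": [[0.0, 0.0, 0.0, 0.0]] * 3,
}
task_gold = {
    "disfluency": [0, 1, 0],
    "ner": [0, 2, 0],
    "pos": [1, 3, 2],
}
# Assumed weights: the main task counts fully, each auxiliary task half.
weights = {"disfluency": 1.0, "ner": 0.5, "pos": 0.5}

loss = joint_loss(task_logits, task_gold, weights)
```

In a setup like this, the auxiliary terms only shape training; at inference time only the disfluency labels would be read out.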
Findings
Auxiliary SL tasks improve disfluency detection accuracy.
The method outperforms the previous state of the art on the English Switchboard dataset.
Certain auxiliary tasks are more influential depending on the baseline model.
Abstract
Detecting disfluencies in spontaneous speech is an important preprocessing step in natural language processing and speech recognition applications. Existing works for disfluency detection have focused on designing a single objective only for disfluency detection, while auxiliary objectives utilizing linguistic information of a word such as named entity or part-of-speech information can be effective. In this paper, we focus on detecting disfluencies on spoken transcripts and propose a method utilizing named entity recognition (NER) and part-of-speech (POS) as auxiliary sequence labeling (SL) tasks for disfluency detection. First, we investigate cases in which utilizing linguistic information of a word can prevent mispredicting important words and can be helpful for the correct detection of disfluencies. Second, we show that training a disfluency detection model with auxiliary SL tasks can…
