TL;DR
This study investigates how disfluencies, and their detection, affect intent detection and slot filling in Vietnamese, a low-resource language. Disfluencies degrade performance on both tasks, and in the disfluent setting the multilingual model XLM-R yields better results than the monolingual PhoBERT.
Contribution
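First empirical analysis of how disfluency detection affects downstream intent detection and slot filling, carried out for Vietnamese. It extends the fluent PhoATIS dataset with manually added, annotated disfluencies and compares multilingual and monolingual pre-trained models.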
Findings
Disfluencies negatively impact intent detection and slot filling.
Multilingual XLM-R outperforms PhoBERT in disfluent contexts.
Model behavior in the disfluent context differs from the fluent context, with the disfluent setting favoring the multilingual model.
Abstract
We present the first empirical study investigating the influence of disfluency detection on downstream tasks of intent detection and slot filling. We perform this study for Vietnamese -- a low-resource language that has no previous study as well as no public dataset available for disfluency detection. First, we extend the fluent Vietnamese intent detection and slot filling dataset PhoATIS by manually adding contextual disfluencies and annotating them. Then, we conduct experiments using strong baselines for disfluency detection and joint intent detection and slot filling, which are based on pre-trained language models. We find that: (i) disfluencies produce negative effects on the performances of the downstream intent detection and slot filling tasks, and (ii) in the disfluency context, the pre-trained multilingual language model XLM-R helps produce better intent detection and slot…
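The pipeline described in the abstract, where disfluency detection runs before joint intent detection and slot filling, can be illustrated with a minimal sketch. It assumes disfluency detection is framed as token-level tagging and that the tagger's output is already available. The label names (`O`, `B-DISFL`, `I-DISFL`), the function `strip_disfluencies`, and the English example utterance are hypothetical and are not taken from the paper or its dataset.

```python
from typing import List

# Hypothetical label scheme: "O" marks a fluent token; "B-DISFL" and
# "I-DISFL" mark the beginning and continuation of a disfluent span.
FLUENT_LABEL = "O"


def strip_disfluencies(tokens: List[str], tags: List[str]) -> List[str]:
    """Return only the tokens tagged as fluent.

    The cleaned sequence is what a downstream intent detection and
    slot filling model would receive in a detect-then-remove pipeline.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")
    return [tok for tok, tag in zip(tokens, tags) if tag == FLUENT_LABEL]


# Illustrative utterance with a self-correction ("to Hanoi uh I mean").
tokens = ["show", "me", "flights", "to", "Hanoi",
          "uh", "I", "mean", "to", "Da_Nang"]
# Tags as a disfluency tagger might predict them (assumed, not real output).
tags = ["O", "O", "O", "B-DISFL", "I-DISFL",
        "I-DISFL", "I-DISFL", "I-DISFL", "O", "O"]

fluent_tokens = strip_disfluencies(tokens, tags)
print(" ".join(fluent_tokens))  # prints: show me flights to Da_Nang
```

In this sketch the corrected destination is kept and the abandoned one is dropped, which is why an undetected disfluency could mislead slot filling.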
Taxonomy
Methods: XLM-R
