An Unsupervised Dialogue Topic Segmentation Model Based on Utterance Rewriting
Xia Hou, Qifeng Li, Tongliang Li

TL;DR
This paper introduces an unsupervised dialogue topic segmentation model that rewrites utterances to resolve co-references and restore omitted content, which improves the accuracy of topic boundary detection in conversations.
Contribution
The study proposes a novel unsupervised dialogue topic segmentation method that combines utterance rewriting with an unsupervised learning algorithm, reporting state-of-the-art results on DialSeg711 and Doc2Dial.
Findings
Improves segmentation accuracy by about 6% on DialSeg711.
Achieves an absolute error score of 11.42% and a WindowDiff (WD) of 12.97% on DialSeg711 (both are error metrics, so lower is better).
Attains an absolute error score of 35.17% and a WD of 38.49% on Doc2Dial.
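For readers unfamiliar with the WD figures above, the following is a minimal sketch of WindowDiff, the metric that WD conventionally abbreviates. It assumes each segmentation is given as a list of 0/1 flags (1 means a topic boundary follows that utterance) and uses the common window-count variant; the function name `window_diff` and the window size `k` are illustrative choices, and the paper's actual evaluation script and window size are not given in this summary.

```python
def window_diff(reference, hypothesis, k):
    """Fraction of sliding windows in which the two segmentations
    disagree on the number of boundaries (lower is better)."""
    if len(reference) != len(hypothesis):
        raise ValueError("segmentations must have the same length")
    if not 0 < k <= len(reference):
        raise ValueError("k must be between 1 and the sequence length")

    n_windows = len(reference) - k + 1
    errors = 0
    for i in range(n_windows):
        ref_count = sum(reference[i:i + k])
        hyp_count = sum(hypothesis[i:i + k])
        # A window counts as one error regardless of how large the mismatch is.
        if ref_count != hyp_count:
            errors += 1
    return errors / n_windows


# Hypothetical 6-utterance dialogue: the predicted boundary is off by one position.
reference = [0, 0, 1, 0, 0, 0]
hypothesis = [0, 0, 0, 1, 0, 0]
score = window_diff(reference, hypothesis, k=2)  # 2 of 5 windows disagree -> 0.4
```

A common convention is to set `k` to about half the average reference segment length, so that near-miss boundaries are penalized less than boundaries that are missing entirely.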
Abstract
Dialogue topic segmentation (DTS) plays a crucial role in various dialogue modeling tasks. State-of-the-art unsupervised DTS methods learn topic-aware discourse representations from conversation data through adjacent discourse matching and pseudo segmentation, mining useful clues from unlabeled conversational relations. However, in multi-turn dialogues, discourses often contain co-references or omissions, so using them directly for representation learning may harm the semantic similarity computation in the adjacent discourse matching task. To fully utilize the useful cues in conversational relations, this study proposes a novel unsupervised dialogue topic segmentation method that combines the Utterance Rewriting (UR) technique with an unsupervised learning algorithm to efficiently utilize the useful cues in unlabeled…
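The effect described in the abstract can be illustrated with a toy sketch. It is not the authors' model: a bag-of-words cosine similarity stands in for the learned topic-aware representations, the rewritten utterance is written by hand rather than produced by any rewriting model, and the dialogue, the `threshold` value, and the function names are all hypothetical.

```python
import math
import re
from collections import Counter


def bow(text):
    """Lowercased bag-of-words vector for one utterance."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between the bag-of-words vectors of two utterances."""
    va, vb = bow(a), bow(b)
    dot = sum(va[t] * vb[t] for t in va)
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0


def boundaries(utterances, threshold):
    """Indices i where a topic boundary is placed between utterance i and i + 1."""
    return [i for i in range(len(utterances) - 1)
            if cosine(utterances[i], utterances[i + 1]) < threshold]


original = [
    "How much is the flight to Paris?",
    "Is it refundable?",  # "it" hides the shared topic
    "What is the weather in Rome tomorrow?",
]
# Hand-written rewrite of the second utterance with the co-reference resolved.
rewritten = [original[0], "Is the flight to Paris refundable?", original[2]]

before = boundaries(original, threshold=0.5)   # [0, 1]: spurious boundary at 0
after = boundaries(rewritten, threshold=0.5)   # [1]: only the real topic shift
```

In this toy case the pronoun leaves almost no lexical overlap between the first two utterances, so a similarity-based segmenter splits one topic in two. After rewriting, the overlap is restored and only the shift to the weather question is marked.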
Taxonomy
Topics: Speech and dialogue systems · Topic Modeling
