Findings on Conversation Disentanglement
Rongxin Zhu, Jey Han Lau, Jianzhong Qi

TL;DR
This paper investigates conversation disentanglement using transformer models, introduces a multi-task learning approach, and explores bipartite graph post-processing to improve thread identification in multi-party conversations.
Contribution
It proposes a multi-task learning model for utterance-to-utterance and thread classification, and demonstrates the potential of bipartite graph post-processing to enhance disentanglement accuracy.
Findings
BERT with handcrafted features is a strong baseline.
Multi-task learning improves thread classification.
Bipartite graph post-processing outperforms greedy methods.
Abstract
Conversation disentanglement, the task of identifying separate threads in conversations, is an important pre-processing step in multi-party conversational NLP applications such as conversational question answering and conversation summarization. Framing it as an utterance-to-utterance classification problem -- i.e., given an utterance of interest (UOI), find which past utterance it replies to -- we explore a number of transformer-based models and find that BERT in combination with handcrafted features remains a strong baseline. We then build a multi-task learning model that jointly learns utterance-to-utterance and utterance-to-thread classification. Observing that the ground truth label (past utterance) is in the top candidates when our model makes an error, we experiment with using bipartite graphs as a post-processing step to learn how to best match a set of UOIs to past utterances.…
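To make the contrast between greedy decoding and matching-based post-processing concrete, here is a minimal sketch in Python. It is not the authors' implementation: the truncated abstract does not specify how the bipartite graph is built or solved, so the scores, the function names `greedy_links` and `matched_links`, and the per-parent `capacity` limits are all hypothetical, chosen only for illustration. Greedy decoding picks the top-scoring past utterance for each UOI independently; the matching variant chooses all links jointly to maximize the total score while respecting the assumed capacities.

```python
from collections import Counter
from itertools import product

# Hypothetical model scores: scores[uoi][parent] is the score that `uoi`
# replies to the past utterance `parent`. Values are invented for illustration.
scores = {
    "u3": {"u1": 0.90, "u2": 0.10},
    "u4": {"u1": 0.50, "u2": 0.45, "u3": 0.05},
}

# Assumption: each past utterance may receive at most this many replies.
# This constraint is illustrative and is not taken from the paper.
capacity = {"u1": 1, "u2": 1, "u3": 1}


def greedy_links(scores):
    """Pick the highest-scoring parent for each UOI independently."""
    return {uoi: max(cands, key=cands.get) for uoi, cands in scores.items()}


def matched_links(scores, capacity):
    """Jointly assign parents to UOIs, maximizing the total score subject to
    per-parent capacities. Brute force, so suitable only for toy inputs."""
    uois = list(scores)
    best, best_total = None, float("-inf")
    # Enumerate every combination of one candidate parent per UOI.
    for choice in product(*(scores[u] for u in uois)):
        used = Counter(choice)
        # Skip assignments that exceed any parent's capacity (default 1).
        if any(count > capacity.get(p, 1) for p, count in used.items()):
            continue
        total = sum(scores[u][p] for u, p in zip(uois, choice))
        if total > best_total:
            best, best_total = dict(zip(uois, choice)), total
    return best


greedy = greedy_links(scores)
matched = matched_links(scores, capacity)
print(greedy)
print(matched)
```

In this toy example greedy decoding links both UOIs to `u1`, while the joint assignment moves `u4` to its second-ranked candidate `u2`. That is the situation the abstract describes, where the correct parent is among the top candidates but not ranked first. A practical version would replace the brute-force search with a polynomial-time assignment or flow solver.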
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Sentiment Analysis and Opinion Mining
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Attention Dropout · Residual Connection · Weight Decay · Dense Connections · WordPiece · Linear Warmup With Linear Decay · Layer Normalization
