Findings on Conversation Disentanglement

Rongxin Zhu; Jey Han Lau; Jianzhong Qi

arXiv:2112.05346 · cs.CL · December 13, 2021


TL;DR

This paper investigates conversation disentanglement using transformer models, introduces a multi-task learning approach, and explores bipartite graph post-processing to improve thread identification in multi-party conversations.

Contribution

It proposes a multi-task learning model for utterance-to-utterance and thread classification, and demonstrates the potential of bipartite graph post-processing to enhance disentanglement accuracy.
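The two classification targets named above are closely related: once every utterance has a reply-to link, the threads follow from those links. A minimal Python sketch of that conversion, assuming (this convention is not stated on this page) that each utterance points to exactly one earlier utterance and that an utterance pointing to itself starts a new thread. The function name `threads_from_reply_links` is hypothetical.

```python
from typing import List


def threads_from_reply_links(parents: List[int]) -> List[int]:
    """Assign a thread id to each utterance from its reply-to link.

    Assumptions (illustrative, not taken from the paper):
      * parents[i] is the index of the utterance that utterance i replies to;
      * parents[i] == i marks the first utterance of a new thread;
      * links only point backwards, so parents[i] <= i.
    """
    thread_ids: List[int] = []
    next_thread = 0
    for i, parent in enumerate(parents):
        if parent < 0 or parent > i:
            raise ValueError(f"utterance {i} has an invalid parent {parent}")
        if parent == i:
            # Self-link: this utterance opens a new thread.
            thread_ids.append(next_thread)
            next_thread += 1
        else:
            # Otherwise it inherits the thread of the utterance it replies to.
            thread_ids.append(thread_ids[parent])
    return thread_ids


# Five utterances: 0 and 2 start threads; 1 and 3 continue the first, 4 the second.
example_parents = [0, 0, 2, 1, 2]
example_threads = threads_from_reply_links(example_parents)
print(example_threads)
```

Because links only point backwards, a single left-to-right pass is enough; no union-find structure is needed under these assumptions.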

Findings

01. BERT with handcrafted features is a strong baseline.

02. Multi-task learning improves thread classification.

03. Bipartite graph post-processing outperforms greedy methods.

Abstract

Conversation disentanglement, the task of identifying separate threads in conversations, is an important pre-processing step in multi-party conversational NLP applications such as conversational question answering and conversation summarization. Framing it as an utterance-to-utterance classification problem -- i.e., given an utterance of interest (UOI), find which past utterance it replies to -- we explore a number of transformer-based models and find that BERT in combination with handcrafted features remains a strong baseline. We then build a multi-task learning model that jointly learns utterance-to-utterance and utterance-to-thread classification. Observing that the ground-truth label (past utterance) is among the top candidates when our model makes an error, we experiment with using bipartite graphs as a post-processing step to learn how to best match a set of UOIs to past utterances.…
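To illustrate why matching a set of UOIs jointly can differ from picking each UOI's top-scoring candidate independently, here is a small Python sketch. It is not the authors' formulation, which this page does not detail. It assumes a hypothetical score matrix `scores[u][c]` (model score that UOI `u` replies to past utterance `c`) and imposes a one-to-one constraint, which is a simplification, since in real conversations one past utterance can receive several replies. The matching is found by exhaustive search, which is only practical for tiny inputs.

```python
from itertools import permutations
from typing import List, Sequence


def greedy_links(scores: Sequence[Sequence[float]]) -> List[int]:
    """Pick the highest-scoring past utterance for each UOI independently."""
    return [max(range(len(row)), key=row.__getitem__) for row in scores]


def one_to_one_matching(scores: Sequence[Sequence[float]]) -> List[int]:
    """Maximum-weight bipartite matching by brute force.

    Each UOI is matched to a distinct past utterance so that the total
    score is maximal. The one-to-one constraint is an illustrative
    assumption, not a claim about the paper's method.
    """
    n_uoi = len(scores)
    n_cand = len(scores[0])
    if n_uoi > n_cand:
        raise ValueError("need at least as many candidates as UOIs")
    best_total = float("-inf")
    best_assignment: List[int] = []
    # Enumerate every way to give each UOI its own candidate.
    for assignment in permutations(range(n_cand), n_uoi):
        total = sum(scores[u][c] for u, c in enumerate(assignment))
        if total > best_total:
            best_total = total
            best_assignment = list(assignment)
    return best_assignment


# Hypothetical scores: both UOIs prefer candidate 0, but UOI 1 only narrowly.
scores = [
    [0.60, 0.30, 0.10],
    [0.55, 0.40, 0.05],
]
greedy_result = greedy_links(scores)
matched_result = one_to_one_matching(scores)
print(greedy_result, matched_result)
```

In this toy case the greedy choice links both UOIs to candidate 0, while the joint matching moves the second UOI to its runner-up candidate. For larger inputs, a polynomial-time assignment solver such as the Hungarian algorithm would replace the exhaustive search.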


Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Sentiment Analysis and Opinion Mining

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Attention Dropout · Residual Connection · Weight Decay · Dense Connections · WordPiece · Linear Warmup With Linear Decay · Layer Normalization