Enhancing Semantic Understanding with Self-supervised Methods for Abstractive Dialogue Summarization
Hyunjae Lee, Jaewoong Yun, Hyunjin Choi, Seongho Joe, Youngjune L. Gwon

TL;DR
This paper introduces self-supervised techniques to improve dialogue summarization by enhancing BERT's contextual understanding, leading to better abstractive summaries on dialogue datasets.
Contribution
It proposes novel self-supervised methods to train dialogue-specific BERT models, addressing the gap in dialogue summarization compared to news article summarization.
Findings
Improved ROUGE scores on the SAMSum dataset
Self-supervised methods enhance BERT's contextualization in dialogues
Sensitivity analysis identifies key hyperparameters
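For readers unfamiliar with the metric behind the first finding, the following is a minimal sketch of ROUGE-1 F1 (unigram overlap between a generated summary and a reference). It assumes lowercase whitespace tokenization with no stemming, so it is a simplification and not the official ROUGE toolkit; the function name `rouge1_f1` and the example sentences are illustrative and not taken from the paper.

```python
from collections import Counter


def rouge1_f1(candidate: str, reference: str) -> float:
    """Simplified ROUGE-1 F1: harmonic mean of unigram precision and recall.

    Assumption: tokens are lowercase whitespace-separated words.
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Clipped overlap: each word counts at most as often as it appears in both.
    overlap = sum((cand_counts & ref_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical summaries, for illustration only.
    generated = "amanda baked cookies"
    reference = "amanda baked cookies for jerry"
    print(round(rouge1_f1(generated, reference), 2))
```

A generated summary that drops words from the reference keeps full precision but loses recall, which is why the F1 value falls below 1 in the usage example.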
Abstract
Contextualized word embeddings can lead to state-of-the-art performance in natural language understanding. Recently, pre-trained deep contextualized text encoders such as BERT have shown their potential in improving natural language tasks, including abstractive summarization. Existing approaches to dialogue summarization focus on incorporating into the summarization task a large language model trained on large-scale corpora of news articles rather than multi-speaker dialogues. In this paper, we introduce self-supervised methods to compensate for these shortcomings when training a dialogue summarization model. Our principle is to detect incoherent information flows using pretext dialogue text to enhance BERT's ability to contextualize the dialogue text representations. We build and fine-tune an abstractive dialogue summarization model on a shared encoder-decoder architecture using the…
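The abstract does not spell out how the pretext dialogue text is built. As a rough illustration of the general idea of detecting incoherent information flow, the sketch below creates training pairs by swapping two utterances in a dialogue and labeling the result as incoherent. The swap-based corruption, the function names `swap_two_utterances` and `make_pretext_example`, the `corrupt_prob` parameter, and the sample dialogue are all assumptions made for this example; they are not confirmed as the authors' method.

```python
import random


def swap_two_utterances(utterances, rng):
    """Return a copy of the dialogue with two differing utterances swapped.

    Also returns the swapped index pair. Assumption: swapping is one possible
    way to break coherence; the paper's actual corruptions may differ.
    """
    pairs = [
        (i, j)
        for i in range(len(utterances))
        for j in range(i + 1, len(utterances))
        if utterances[i] != utterances[j]  # swapping identical lines changes nothing
    ]
    if not pairs:
        raise ValueError("need at least two distinct utterances")
    i, j = rng.choice(pairs)
    corrupted = list(utterances)
    corrupted[i], corrupted[j] = corrupted[j], corrupted[i]
    return corrupted, (i, j)


def make_pretext_example(utterances, rng, corrupt_prob=0.5):
    """Return (dialogue, label): label 1 for a corrupted dialogue, 0 otherwise."""
    if rng.random() < corrupt_prob:
        corrupted, _ = swap_two_utterances(utterances, rng)
        return corrupted, 1
    return list(utterances), 0


if __name__ == "__main__":
    # Hypothetical dialogue, for illustration only.
    dialogue = [
        "Amanda: I baked cookies. Do you want some?",
        "Jerry: Sure!",
        "Amanda: I'll bring you some tomorrow.",
    ]
    rng = random.Random(0)
    example, label = make_pretext_example(dialogue, rng)
    print(label, example)
```

A binary classifier on top of the encoder could then be trained to predict the label, which needs no human annotation because the labels come from the corruption step itself.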
Taxonomy
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · WordPiece · Layer Normalization · Softmax · Linear Warmup With Linear Decay · Adam · Dense Connections · Weight Decay
