Discourse Structure Extraction from Pre-Trained and Fine-Tuned Language Models in Dialogues
Chuyuan Li, Patrick Huber, Wen Xiao, Maxime Amblard, Chloé Braud, Giuseppe Carenini

TL;DR
This paper explores extracting discourse structures from dialogues using attention matrices in pre-trained language models, proposing unsupervised and semi-supervised methods that improve over baseline scores on the STAC corpus.
Contribution
It introduces novel unsupervised and semi-supervised approaches to extract discourse structures from dialogue-focused PLMs, addressing data sparsity issues.
Findings
Semi-supervised method achieves 59.3 F1 on the STAC corpus.
Unsupervised method achieves 57.2 F1 on the STAC corpus.
Restricted to projective trees, scores improve to 63.3 (unsupervised) and 68.1 (semi-supervised).
Abstract
Discourse processing suffers from data sparsity, especially for dialogues. As a result, we explore approaches to build discourse structures for dialogues, based on attention matrices from Pre-trained Language Models (PLMs). We investigate multiple tasks for fine-tuning and show that the dialogue-tailored Sentence Ordering task performs best. To locate and exploit discourse information in PLMs, we propose an unsupervised and a semi-supervised method. Our proposals achieve encouraging results on the STAC corpus, with F1 scores of 57.2 and 59.3 for unsupervised and semi-supervised methods, respectively. When restricted to projective trees, our scores improved to 63.3 and 68.1.
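To make the idea of reading a discourse structure off an attention matrix concrete, here is a minimal sketch. It is not the authors' extraction procedure, which the abstract does not detail. It assumes a hypothetical square matrix `attn` of scores already aggregated to the level of discourse units, where `attn[i][j]` scores unit `i` attaching to unit `j`, and it uses a simple greedy rule (attach each unit to its best-scoring earlier unit) as a stand-in for whatever tree decoder the paper uses. The function names and the toy numbers are illustrative only.

```python
from typing import List, Tuple

Edge = Tuple[int, int]  # (head index, dependent index)


def extract_backward_tree(attn: List[List[float]]) -> List[Edge]:
    """Attach every unit i > 0 to the earlier unit j < i with the highest score.

    Assumption: attn[i][j] is a unit-level score for "i attaches to j".
    Since every head precedes its dependent, the edges form a tree rooted
    at unit 0. Ties go to the earliest candidate head.
    """
    edges: List[Edge] = []
    for i in range(1, len(attn)):
        head = max(range(i), key=lambda j: attn[i][j])
        edges.append((head, i))
    return edges


def unlabeled_f1(pred: List[Edge], gold: List[Edge]) -> float:
    """F1 over unlabeled attachment edges."""
    pred_set, gold_set = set(pred), set(gold)
    true_pos = len(pred_set & gold_set)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(pred_set)
    recall = true_pos / len(gold_set)
    return 2 * precision * recall / (precision + recall)


# Toy 4-unit dialogue with made-up scores (not taken from the paper).
toy_attn = [
    [0.0, 0.0, 0.0, 0.0],
    [0.9, 0.0, 0.0, 0.0],
    [0.2, 0.7, 0.0, 0.0],
    [0.6, 0.1, 0.3, 0.0],
]
toy_gold = [(0, 1), (1, 2), (2, 3)]
toy_pred = extract_backward_tree(toy_attn)
print(toy_pred, unlabeled_f1(toy_pred, toy_gold))
```

In this toy case two of the three predicted edges match the gold edges, so precision and recall are both 2/3. The greedy rule only ever links a unit to an earlier one, which is a simplification; a decoder that enforces projectivity would be a separate, more involved step.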
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems
Methods: STAC
