TL;DR
This paper introduces a structure-aware mutual information loss function called DMI for training dialog representations, which improves performance across multiple conversational understanding tasks compared to traditional language models.
Contribution
The paper proposes a novel discourse mutual information (DMI) loss function that incorporates structural information into dialog representation learning. Models trained with it outperform existing baselines on dialog understanding tasks.
Findings
DMI-based models outperform baselines on nine dialog tasks.
Structural information enhances dialog representations.
DMI captures inherent uncertainty in response prediction.
Abstract
Although many pretrained models exist for text or images, there have been relatively few attempts to train representations specifically for dialog understanding. Prior work has usually relied on representations finetuned from generic text representation models such as BERT or GPT-2. However, such language-modeling pretraining objectives do not take the structural information of conversational text into consideration. Although generative dialog models can also learn structural features, we argue that structure-unaware word-by-word generation is not suitable for effective conversation modeling. We empirically demonstrate that such representations do not perform consistently across various dialog understanding tasks. Hence, we propose DMI (Discourse Mutual Information), a structure-aware, mutual-information-based loss function for training dialog-representation models, that additionally…
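To make the idea of a mutual-information-based loss between a dialog context and its response concrete, here is a minimal sketch. It assumes an InfoNCE-style contrastive estimator, which this summary does not confirm as the paper's choice; the function names, the dot-product scoring, and the `temperature` parameter are all hypothetical illustrations, not the authors' implementation.

```python
import numpy as np

def info_nce_loss(context_vecs, response_vecs, temperature=1.0):
    """InfoNCE-style contrastive loss over a batch of (context, response) pairs.

    Assumption: row i of each array encodes a matching pair, and every other
    row in the batch serves as a negative response for context i.
    """
    c = np.asarray(context_vecs, dtype=float)
    r = np.asarray(response_vecs, dtype=float)
    # (batch, batch) matrix of context-response similarity scores.
    scores = c @ r.T / temperature
    # Subtract the row-wise max for numerical stability before the softmax.
    scores = scores - scores.max(axis=1, keepdims=True)
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    # Negative mean log-probability assigned to the true response.
    return -np.mean(np.diag(log_probs))

def mi_lower_bound(loss, batch_size):
    """InfoNCE bound: I(context; response) >= log(batch_size) - loss."""
    return np.log(batch_size) - loss

# Toy usage with random vectors standing in for encoder outputs.
rng = np.random.default_rng(0)
contexts = rng.normal(size=(8, 16))
responses = contexts + 0.1 * rng.normal(size=(8, 16))  # aligned pairs
loss_aligned = info_nce_loss(contexts, responses)
loss_shuffled = info_nce_loss(contexts, responses[::-1])  # mismatched pairs
```

Minimizing this loss pushes each context toward its own response and away from the other responses in the batch, which is one common way to maximize a lower bound on mutual information between the two.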
Taxonomy
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Adam · Cosine Annealing · Linear Warmup With Linear Decay · Residual Connection · Dense Connections · Layer Normalization
