Robust Dialogue Utterance Rewriting as Sequence Tagging
Jie Hao, Linfeng Song, Liwei Wang, Kun Xu, Zhaopeng Tu, Dong Yu

TL;DR
This paper introduces a sequence-tagging model for dialogue utterance rewriting. Casting the task as tagging shrinks the search space, which makes the model more robust across domains. To improve fluency, the model is additionally trained with reinforcement learning, using BLEU or GPT-2 as the reward signal.
Contribution
The paper proposes a novel sequence-tagging approach combined with reinforcement learning to improve domain robustness and fluency in dialogue utterance rewriting.
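To make the tagging view concrete, here is a minimal sketch of how per-token tags could be turned back into a rewritten utterance. The tag scheme (keep or delete each token, optionally insert a context span in front of it), the names `Tag` and `apply_tags`, and the example dialogue are assumptions made for illustration. They are not taken from the paper's code.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Tag:
    """Hypothetical per-token tag (illustrative, not the paper's exact scheme).

    keep:        whether the utterance token survives in the rewrite
    insert_span: optional (start, end) indices, inclusive, of a context span
                 to copy in front of the token
    """
    keep: bool = True
    insert_span: Optional[Tuple[int, int]] = None


def apply_tags(context: List[str], utterance: List[str], tags: List[Tag]) -> List[str]:
    """Rebuild the rewritten utterance from one tag per utterance token."""
    if len(tags) != len(utterance):
        raise ValueError("need exactly one tag per utterance token")
    output: List[str] = []
    for token, tag in zip(utterance, tags):
        if tag.insert_span is not None:
            start, end = tag.insert_span
            output.extend(context[start:end + 1])  # copy words from the context
        if tag.keep:
            output.append(token)
    return output


# Made-up example: resolve "it" by copying a span from the previous turn.
context = "do you like the new Nolan movie".split()
utterance = "yes I like it".split()
tags = [Tag(), Tag(), Tag(), Tag(keep=False, insert_span=(3, 6))]
rewritten = apply_tags(context, utterance, tags)
print(" ".join(rewritten))
```

Under this scheme every output word comes from either the utterance or the context, so the model chooses among a small set of tags per token instead of generating from a full vocabulary.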
Findings
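Large gains over current state-of-the-art systems in domain-transfer experiments.
Greater robustness than existing models, whose performance drops dramatically when tested on a different domain.
Better output fluency when BLEU or GPT-2 signals are injected under a REINFORCE framework.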
Abstract
The task of dialogue rewriting aims to reconstruct the latest dialogue utterance by copying the missing content from the dialogue context. Until now, the existing models for this task suffer from the robustness issue, i.e., performances drop dramatically when testing on a different domain. We address this robustness issue by proposing a novel sequence-tagging-based model so that the search space is significantly reduced, yet the core of this task is still well covered. As a common issue of most tagging models for text generation, the model's outputs may lack fluency. To alleviate this issue, we inject the loss signal from BLEU or GPT-2 under a REINFORCE framework. Experiments show huge improvements of our model over the current state-of-the-art systems on domain transfer.
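The REINFORCE step mentioned in the abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: `toy_reward` is a stand-in for BLEU (the mean of clipped unigram and bigram precision, with no brevity penalty), and `sample_tags`, `reinforce_loss`, and the `baseline` parameter are hypothetical names. A GPT-2 fluency score could in principle take the place of the reward function.

```python
import math
import random
from collections import Counter
from typing import List, Sequence, Tuple


def ngram_precision(candidate: Sequence[str], reference: Sequence[str], n: int) -> float:
    """Clipped n-gram precision of the candidate against one reference."""
    cand = Counter(tuple(candidate[i:i + n]) for i in range(len(candidate) - n + 1))
    ref = Counter(tuple(reference[i:i + n]) for i in range(len(reference) - n + 1))
    total = sum(cand.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
    return overlap / total


def toy_reward(candidate: Sequence[str], reference: Sequence[str]) -> float:
    """Simplified stand-in for BLEU (assumption): mean of 1-gram and 2-gram precision."""
    return 0.5 * (ngram_precision(candidate, reference, 1)
                  + ngram_precision(candidate, reference, 2))


def sample_tags(tag_probs: Sequence[Sequence[float]],
                rng: random.Random) -> Tuple[List[int], float]:
    """Sample one tag index per token; return the indices and their summed log-probability."""
    indices: List[int] = []
    log_prob = 0.0
    for probs in tag_probs:
        idx = rng.choices(range(len(probs)), weights=probs, k=1)[0]
        indices.append(idx)
        log_prob += math.log(probs[idx])
    return indices, log_prob


def reinforce_loss(log_prob: float, reward: float, baseline: float = 0.0) -> float:
    """REINFORCE surrogate loss: minimizing it raises the probability of
    sampled tag sequences whose reward exceeds the baseline."""
    return -(reward - baseline) * log_prob


reference = "yes I like the new Nolan movie".split()
fluent = "yes I like the new Nolan movie".split()
disfluent = "yes I like new the movie Nolan".split()

# The word order differs, so the bigram term penalizes the disfluent output.
print(toy_reward(fluent, reference), toy_reward(disfluent, reference))

# Two tokens with two possible tags each; the probabilities are made up.
indices, log_prob = sample_tags([[0.9, 0.1], [0.2, 0.8]], random.Random(0))
print(indices, reinforce_loss(log_prob, toy_reward(fluent, reference), baseline=0.5))
```

In an actual training loop the log-probabilities would come from the tagger, the sampled tags would be decoded into an output sentence before scoring, and the loss would be differentiated with an autograd library. Those parts are omitted here.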
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
Methods: Linear Layer · Cosine Annealing · Weight Decay · Dropout · Discriminative Fine-Tuning · Softmax · Dense Connections · Attention Dropout · Linear Warmup With Cosine Annealing · Attention Is All You Need
