TL;DR
This paper introduces a reinforcement learning approach leveraging natural language inference signals to generate dialogue responses that are both natural and consistent with a given persona, improving over existing methods.
Contribution
It proposes a novel NLI-based reinforcement learning framework for persona consistency in dialogue generation, combining attention-based models with adversarial and NLI signals.
Findings
Outperforms strong baselines in persona consistency
Improves naturalness and coherence of generated responses
Demonstrates effectiveness through human and automatic metrics
Abstract
Consistency is one of the major challenges faced by dialogue agents. A human-like dialogue agent should not only respond naturally, but also maintain a consistent persona. In this paper, we exploit natural language inference (NLI) techniques to address the problem of generating persona-consistent dialogues. Unlike existing work that re-ranks retrieved responses with an NLI model, we cast the task as a reinforcement learning problem and use the NLI signals from response-persona pairs as rewards during dialogue generation. Specifically, our generator employs an attention-based encoder-decoder to generate persona-based responses. Our evaluator consists of two components: an adversarially trained naturalness module and an NLI-based consistency module. Moreover, we use another well-performing NLI model in the evaluation of…
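The reward idea in the abstract can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the abstract does not state the reward values, how the two evaluator signals are mixed, or the exact policy-gradient objective. The names `NLI_REWARD`, `consistency_reward`, `combined_reward`, `reinforce_loss`, and the `weight` parameter are hypothetical, and the plain REINFORCE surrogate stands in for whatever update the paper actually uses.

```python
from typing import Sequence

# Hypothetical mapping from NLI labels to scalar rewards (not from the paper).
NLI_REWARD = {"entailment": 1.0, "neutral": 0.0, "contradiction": -1.0}


def consistency_reward(nli_labels: Sequence[str]) -> float:
    """Average the NLI reward over all (response, persona sentence) pairs."""
    if not nli_labels:
        return 0.0
    return sum(NLI_REWARD[label] for label in nli_labels) / len(nli_labels)


def combined_reward(naturalness: float, nli_labels: Sequence[str],
                    weight: float = 0.5) -> float:
    """Mix a naturalness score in [0, 1] (e.g. from an adversarially trained
    discriminator) with the NLI consistency reward. The linear mix and the
    default weight are assumptions."""
    return weight * naturalness + (1.0 - weight) * consistency_reward(nli_labels)


def reinforce_loss(token_log_probs: Sequence[float], reward: float,
                   baseline: float = 0.0) -> float:
    """REINFORCE surrogate loss for one sampled response:
    -(reward - baseline) * sum of token log-probabilities."""
    return -(reward - baseline) * sum(token_log_probs)


# One sampled response scored against two persona sentences.
reward = combined_reward(0.8, ["entailment", "neutral"])
loss = reinforce_loss([-0.5, -1.0], reward)
```

In this toy setup, a response that contradicts the persona receives a lower reward than one that entails it, so minimizing the loss lowers its probability relative to the consistent response.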
