CORAL: Contextual Response Retrievability Loss Function for Training Dialog Generation Models
Bishal Santra, Ravi Ghadia, Manish Gupta, Pawan Goyal

TL;DR
This paper introduces CORAL, a novel loss function for dialog generation that considers context and human preferences, improving training effectiveness and response quality over traditional methods.
Contribution
The paper proposes CORAL, a reinforcement learning-based loss function that incorporates context and human preferences, addressing limitations of cross-entropy loss in dialog generation.
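As a rough illustration of what a reinforcement-learning-style loss that depends on both context and response can look like, the sketch below uses a generic REINFORCE-style surrogate. It is not CORAL's actual formulation: `toy_reward` is a hypothetical word-overlap placeholder standing in for a learned context-response scorer, and `policy_gradient_loss` and `baseline` are names chosen only for this example.

```python
import math

def toy_reward(context, response):
    """Hypothetical stand-in for a learned context-response scorer.

    Returns the fraction of response words that also occur in the context.
    This is NOT the reward used in the paper, only a placeholder.
    """
    context_words = set(context.lower().split())
    response_words = response.lower().split()
    if not response_words:
        return 0.0
    return sum(w in context_words for w in response_words) / len(response_words)

def policy_gradient_loss(context, response, token_log_probs, baseline=0.0):
    """REINFORCE-style surrogate: -(reward - baseline) * sum of token log-probs."""
    reward = toy_reward(context, response)
    return -(reward - baseline) * sum(token_log_probs)

context = "do you like jazz music"

# Assumed per-token log-probabilities from some hypothetical model.
on_topic_loss = policy_gradient_loss(context, "i like jazz", [math.log(0.5)] * 3)
off_topic_loss = policy_gradient_loss(context, "ok sure", [math.log(0.5)] * 2)
```

Unlike cross-entropy, the reward here is computed from the (context, response) pair, so a response sampled from the model can receive credit even when it differs from the ground-truth response.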
Findings
CORAL outperforms state-of-the-art models on benchmark datasets.
Models trained with CORAL generate more relevant and engaging responses.
The mix-policy training algorithm effectively reduces sample complexity.
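The summary does not say how the mix-policy algorithm works. A minimal sketch of one possible reading, assuming "mix-policy" means mixing off-policy samples (ground-truth responses from the dataset) with on-policy samples (responses drawn from the model), follows. The function name, `p_off_policy`, and `fake_sampler` are all hypothetical and not taken from the paper.

```python
import random

def pick_training_response(reference, sample_from_model, p_off_policy, rng):
    """Choose the response to train on for one context.

    With probability p_off_policy, use the dataset reference (off-policy);
    otherwise draw a fresh response from the model (on-policy).
    """
    if rng.random() < p_off_policy:
        return reference, "off-policy"
    return sample_from_model(), "on-policy"

def fake_sampler():
    # Placeholder for decoding a response from the current model.
    return "sampled reply"

rng = random.Random(0)
batch = [
    pick_training_response("reference reply", fake_sampler, 0.5, rng)
    for _ in range(4)
]
```

Under this assumption, reference responses supply a reliable training signal early on, which is one plausible way such a scheme could reduce the number of model samples needed.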
Abstract
In the field of Natural Language Processing, there are many tasks that can be tackled effectively using the cross-entropy (CE) loss function. However, the task of dialog generation poses unique challenges for CE loss. This is because CE loss assumes that, for any given input, the only possible output is the one available as the ground truth in the training dataset. But, in dialog generation, there can be multiple valid responses (for a given context) that not only have different surface forms but can also be semantically different. Furthermore, CE loss computation for the dialog generation task does not take the input context into consideration and, hence, it grades the response irrespective of the context. To grade the generated response for qualities like relevance, engagingness, etc., the loss function should depend on both the context and the generated response. To address these…
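The single-reference limitation described in the abstract can be seen in a toy calculation. The sketch below is a simplified illustration rather than the paper's setup: the per-step distributions are made-up numbers from a hypothetical model answering "How are you?", and they are treated as fixed instead of being conditioned on the reference prefix as they would be under teacher forcing.

```python
import math

def ce_loss(step_distributions, reference_tokens):
    """Mean token-level negative log-likelihood of one reference response."""
    nll = [
        -math.log(dist[token])
        for dist, token in zip(step_distributions, reference_tokens)
    ]
    return sum(nll) / len(nll)

# Made-up next-token probabilities: the model prefers "doing great thanks".
dists = [
    {"i": 0.1, "doing": 0.9},
    {"am": 0.1, "great": 0.9},
    {"fine": 0.1, "thanks": 0.9},
]

reference = ["i", "am", "fine"]               # the only response in the dataset
alternative = ["doing", "great", "thanks"]    # also a valid reply

loss_vs_reference = ce_loss(dists, reference)
loss_if_alternative_were_reference = ce_loss(dists, alternative)
```

The model is penalized heavily against the dataset reference even though its preferred reply is a reasonable answer, and the score comes only from the probabilities assigned to the reference tokens, with no separate judgment of how well a response fits the context.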
Taxonomy
Topics: Topic Modeling · AI in Service Interactions · Multimodal Machine Learning Applications
Methods: Correlation Alignment for Deep Domain Adaptation
