TL;DR
This paper introduces DreamCUB, a model-based reinforcement learning framework that uses a dialogue world model to predict user beliefs (emotion, sentiment, and intention) and improve dialogue quality. The pretrained world model achieves state-of-the-art results on emotion classification and sentiment identification.
Contribution
The paper develops a dialogue world model that predicts user beliefs and future utterances, and integrates it into a model-based reinforcement learning framework for dialogue systems.
Findings
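State-of-the-art emotion classification performance from the pretrained dialogue world model
State-of-the-art sentiment identification performance from the same model
Enhanced dialogue quality through joint training of the policy, critic, and dialogue world model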
Abstract
World models have been widely used in robotics, gaming, and autonomous driving. However, their applications to natural language tasks are relatively limited. In this paper, we construct a dialogue world model that can predict the user's emotion, sentiment, and intention, as well as future utterances. By defining a POMDP, we argue that emotion, sentiment, and intention can be modeled as the user belief and solved by maximizing the information bottleneck. With this user belief modeling, we apply the model-based reinforcement learning framework to the dialogue system and propose a framework called DreamCUB. Experiments show that the pretrained dialogue world model achieves state-of-the-art performance on emotion classification and sentiment identification, while dialogue quality is also enhanced by joint training of the policy, critic, and dialogue world model. Further analysis shows that this…